1. Introduction
With the advantages of high load-carrying capacity, low ground-to-ground pressure, and zero-radius steering, tracked vehicles play an important role in modern military, agriculture, construction, mining, and disaster relief, as well as in response to emergencies [
1]. Owing to their high efficiency, high power density, and flexible layout, electrical-driven tracked unmanned vehicles have become an important development direction for ground combat platforms [
2]. Trajectory tracking, which aims to ensure vehicles track their predefined trajectories, has emerged as a key technology for unmanned vehicles [
3]. Due to harsh driving conditions and complex road conditions, achieving high-precision trajectory tracking remains a significant challenge for tracked unmanned vehicles.
Many control approaches have been presented in trajectory tracking, such as the pure tracking algorithm [
4], PID control [
5], sliding mode control [
6], linear quadratic regulator (LQR) [
7], model predictive control (MPC) [
8], and Deep Reinforcement Learning (DRL)-based methods [
9]. Despite the concise structure of PID controllers, obtaining suitable control parameters and achieving robust performance remain challenging [
10]. Although the optimal parameters can be derived by the LQR, it is difficult to deal with its constraints [
11]. Due to the lack of interpretability, the stability of DRL continues to pose significant challenges [
12]. Conversely, MPC exhibits strong capabilities in constraint handling, state prediction, and objective optimization, making it a promising approach for achieving accurate trajectory tracking control [
13].
There have been many existing works on MPC-based trajectory tracking control. Chu et al. [
14] employed an MPC controller based on a four-wheel kinematic model to achieve trajectory tracking control for passenger vehicles. Zhang et al. [
15] presented a spatial-domain-based kinematic model and a linear MPC controller for path tracking. Although both MPC controllers achieve satisfactory tracking performance using the kinematic model, neglecting the side-slip effect leads to degraded control accuracy, especially in high-speed scenarios. To describe the vehicle dynamics characteristics more accurately, a single-track dynamic model was used in [
16], which exhibited improved tracking performance during large steering maneuvers. In [
17], a robust MPC scheme incorporating the single-track dynamic model was proposed to enhance robustness against parameter uncertainties and varying speeds. In [
18], an event-trigger MPC framework is presented to reduce computational burden and improve real-time performance. Instead of focusing on wheeled vehicles, Chen et al. [
19] proposed a dynamic-model-based MPC scheme for tracked vehicles to achieve trajectory tracking. In [
20], an MPC-based strategy considering slipping and smoothing was proposed for an unmanned underwater tracked bulldozer to realize smooth motion control. Meanwhile, a tube–MPC method was developed in [
21] for a substation inspection robot to enhance the robustness of trajectory tracking.
Despite these advancements, most MPC-based studies still rely on physics-based prediction models such as kinematic or dynamic models. However, these models often lead to performance degradation due to modeling mismatches between the prediction model and actual system behaviors [
22]. In particular, for TUVs, the powertrain structure and steering mechanism differ significantly from wheeled vehicles. Steering is achieved by adjusting the velocity or torque between the two tracks [
23], which results in coupled longitudinal–lateral dynamics and strong nonlinearities [
24]. This makes accurate physics-based modeling challenging and may limit the achievable tracking performance. To address this problem, data-driven approaches, especially neural networks, have shown significant potential for representing complex nonlinear relationships [
25]. For vehicle dynamic modeling tasks, system states are inherently time-dependent. As a special class of neural networks, the Long Short-Term Memory (LSTM) network is well suited for capturing long-term temporal dependencies and mitigating gradient vanishing issues compared to feed-forward neural networks and gated recurrent units (GRUs), making them an attractive alternative to physics-based models [
26].
However, existing trajectory tracking control studies for TUVs still primarily adopt physics-based models [
27,
28]. Although acceptable trajectory tracking performance has been demonstrated, most approaches neglect coupled longitudinal–lateral dynamics, resulting in reduced precision and robustness under complex maneuvers. Moreover, the neural network-based dynamic modeling methods, as well as the integration of such models into an MPC framework for TUV trajectory tracking, have not been thoroughly investigated.
Motivated by the mentioned research gap, a data-driven-model-based predictive control scheme for a dual-motor-driven TUV is proposed in this article. The main contributions are as follows:
- (1)
A vehicle dynamics model based on a Long Short-Term Memory (LSTM) network is established for the dual-motor-driven TUV, incorporating coupled kinematic and dynamic characteristics. In this model, vehicle motion states in a subsequent time step are predicted using history states and control inputs.
- (2)
A novel MPC controller is developed for accurate trajectory tracking. Specially, the proposed LSTM-based vehicle dynamics model is employed to predict vehicle states in the receding horizon. The optimal motor torque is then calculated based on the prediction states and reference trajectory.
In addition, both cosimulations and field experiments are conducted to verify the effectiveness of the proposed scheme.
The remainder of this article is organized as follows: In
Section 2, the LSTM-based vehicle dynamics model is constructed. In
Section 3, the MPC-based trajectory tracking controller integrating the LSTM-based vehicle model is designed. In
Section 4, the results of simulation and field experiment are demonstrated. The conclusions are drawn in
Section 5.
2. Design of LSTM-Based Vehicle Dynamics Model
In this section, a data-driven vehicle dynamics model based on a LSTM network is introduced. Firstly, the model structure is demonstrated. Next, the process of data generation, model training, and validation are presented.
2.1. Model Structure and Input–Output Design
The vehicle states, such as position and velocity, exhibited pronounced time-series characteristics, demonstrating that the variation in vehicle states depends not only on the current control inputs, but also on the historical states and control commands. In this article, the vehicle states include the longitudinal and lateral positions, heading angle, the yaw rate, and longitudinal speed, all expressed in global coordinates. The control inputs are defined as U = [TL, TR]T, where TL and TR denote the motor torques applied to the left and right sides of the TUV, respectively.
Due to the significant advantages of the LSTM network in processing time-series data [
29], a LSTM-based vehicle dynamics model was established in which the vehicle states
Xglobal at the subsequent time step were predicted using the fixed length of both the history states
Xglobal and control inputs
U. In this article, the LSTM network consists of an input layer, an LSTM layer, a fully connected layer (Dense layer), and a dropout layer. The overall model structure is shown in
Figure 1.
Especially for the vehicle dynamics model shown in
Figure 1, the fixed length of the historical states and control inputs of the input layer was set to 20, with a constant sampling interval 0.1 s. Assuming that
t represents the current time slot, a data sequence consisting of vehicle states
Xglobal and control inputs
U at time slots
t − 19,
t − 18, …,
t were fed into the LSTM layer from the input layer. The LSTM layer serves as the core component of the network, capturing long-term temporal dependencies between historical motion states and control inputs. A fully connected layer with ReLU activation maps the LSTM hidden representation to the output space. To reduce overfitting during training, a dropout layer was applied after the fully connected layer. Finally, the output layer produces the predicted vehicle states at the next time step, represented as follows:
where
X and
Y represent the longitudinal and lateral positions,
φ is the heading angle, ω is the yaw rate, and
vx is the longitudinal speed, all expressed in global coordinates.
For the network shown in
Figure 1, the LSTM layer contains 32 hidden units. The number of parameters of the neural network vehicle dynamics model is summarized in
Table 1.
2.2. Training and Validation of Neural Network Prediction Model
Based on the neural network model architecture, a multi-body dynamics model of a TUV in RecurDyn V9R4 was first used to obtain raw data for model training. The raw data was then pre-processed to match the network input dimensions. Subsequently, the constructed dataset was partitioned into training and testing subsets. The neural network model was trained using the training set, and its performance was evaluated on the test set. The prediction accuracy was quantified using the mean square error (MSE) between the predicted and actual values.
2.2.1. Multi-Body Dynamics Modeling of TUV
The multi-body dynamics software RecurDyn enables highly accurate nonlinear dynamic simulations of complex tracked vehicle structures. It allows for direct application of rotational speed or torque to both sides of the wheels while providing vehicle state information, such as position, speed, track speed, acceleration, and yaw rate, for trajectory tracking control.
The parameters of TUV in RecurDyn are shown in
Table 2, while the simulated TUV in RecurDyn is shown in
Figure 2.
2.2.2. Data Acquisition and Preprocessing
For the established multi-body dynamics model in RecurDyn, the initial driving torque is manually set, and a random torque increment is subsequently applied to each side at each time step. The TUV state information in Xglobal is recorded at each step. A 1000 s simulation is performed, generating a raw dataset containing 50,000 data entries.
The main steps for preprocessing raw data are demonstrated as follows:
During data collection and recording, particularly in real vehicle experiments, anomalies may occur. Invalid data, outliers, and missing values must be removed or appropriately processed. Missing data are filled using the nearest-neighbor interpolation method, while outliers are identified and treated based on the 3σ principle. Data points exceeding the proximity threshold are removed. In addition, since vehicle state information is continuous, a filtering method is applied to suppress noise and smooth the data.
Although the network structure applies scaling with the Sigmoid or tanh functions, the value ranges of different features are first normalized to a common scale. This ensures efficient training and preserves the contribution of features with smaller magnitudes, thereby preventing certain features from dominating the model learning process. The Min–Max normalization method is adopted, expressed as
By processing the data through (2), the raw data is mapped to the interval [−1, 1] to achieve normalization.
For the LSTM-based model, the time-series data is divided into multiple fixed-length segments, and each segment is assigned a corresponding target value at the final time step.
The dataset is split into training, validation, and test sets for model training, optimization, and evaluation. It is noted that all training and validation data are collected from the RecurDyn cosimulation platform. The real-vehicle experimental data are not used for training, but only for validation of the control performance in field experiments, as detailed in
Section 4.2. Initially, the original dataset is divided in a 7:3 ratio. Of the remaining 30%, 15% is used for validation and the rest for testing. To enhance model generalization and reduce overfitting, the training set is further divided into smaller batches, and a batch size of 64 is adopted for multiple training iterations.
2.2.3. Training and Validation of the Vehicle Dynamic Model
Using the generated dataset, MSE is chosen as the loss function, which quantifies the difference between predicted and actual values, defined as
where
yi is the actual value,
ŷi is the predicted value and
n is the sample size. The Adam optimizer is adopted with a learning rate of 0.005 and 200 training epochs.
Table 3 presents the prediction results of three models: a physics-based dynamic model in [
24], a full-connected network DNN [
30], and the proposed LSTM. The metric MAE represents mean absolute error. All neural models are trained and evaluated on the same dataset. For the DNN and LSTM, the results are averaged over ten independent runs for reliability. As depicted in
Table 3, the LSTM model achieves the highest accuracy by effectively capturing the complex relationships between vehicle states. Although the DNN also has learning capability, its inability to handle long-term dependencies leads to reduced prediction accuracy. In contrast, the dynamic model exhibits the largest MSE and MAE, revealing its limitations in complex scenarios.
Furthermore, a parameter sensitivity analysis is conducted for the proposed LSTM model, where the candidate numbers of hidden units of LSTM layer are 16, 32, and 64 [
26], and the candidate fixed length of historical states are 10, 20, and 30. Using MSE as the evaluation metric, the prediction results are listed in
Table 4.
As shown in
Table 4, increasing the number of hidden units and the length of historical states generally improves prediction accuracy, as more temporal dependencies are captured. However, excessively large hyperparameters result in slight performance degradation, likely due to overfitting and increased model complexity. Considering the trade-off between accuracy and model size, the optimal configuration of the LSTM network is determined to be 32 hidden units with a history length of 20 steps.
3. NN-MPC Trajectory Tracking Control Synthesis
In this section, a trajectory tracking controller based on the neural network model predictive control (NN-MPC) is designed to obtain the optimal motor torque. The prediction framework using the LSTM-based vehicle dynamics model is first demonstrated. Next, the objective function and solution decision is introduced. The stability analysis of NN-MPC is also discussed. The structural block diagram of the proposed control strategy is shown in
Figure 3.
3.1. Establishment of NN-MPC Prediction Model
Directly embedding the neural network prediction model within the MPC framework causes the iterative updates of the state variables over the prediction horizon. Consequently, the state matrix also updates. This process involves overwriting the original data with new predictions, which complicates the optimization and may degrade the solution quality. Therefore, a carefully designed prediction model is required to ensure stability and efficient computation.
For the prediction equation, the dimensional transformations of the neural network can be formulated as
The final network output is represented as Z
out5*1 =
Xglobal(
t + 1). Only the inputs to the LSTM layer are multidimensional matrices, while all of the inputs and outputs of subsequent layers are one-dimensional vectors. Within the LSTM layer, the sequential data passes through 20 LSTM units sharing the same network parameters, where 20 corresponds to the length of historical states. Each LSTM unit takes the predicted output from the previous time step and the corresponding original cell state sequence as the input, producing a new predicted output along with an updated cell state. This process corresponds to the update of LSTM cell state, which can be mathematically expressed as
where
ht and
Ct denote the hidden state and cell state of the LSTM layer at time step
t [
26],
ht−1 and
Ct−1 represent their corresponding values at the previous time step
t − 1.
It means that the predicted output
ht of the LSTM at time
t can be directly obtained by
ht−1,
Ct−1,
Xglobal(
t),
U(
t), while simultaneously producing
ht and
Ct for the next iteration. Based on this concept, the LSTM layer with 20 history states in (5) is restructured into an LSTM layer, with 19 history states, and a Dense layer, with network parameters kept identical. Equation (5) can be written as
In this way, the original neural network prediction model can be regarded as a combination of a single-layer LSTM network that processes sequential data and a fully connected network that handles one-dimensional input. The resulting simplified single-layer neural network operates independently of the MPC and does not participate in state iterations within the prediction horizon.
The prediction of state variables can be expressed as follows:
where
Fh and
FC represent the process of updating cell states and hidden states within a LSTM cell.
FNN represents the function for computing the cell output. The detailed formulation of
Fh,
FC, and
FNN can be found in [
26].
Based on (7), the prediction equation of hidden state
h, cell state
C, and vehicle motion state
Xglobal at time slot
t +
k is given by
Selecting
ξ(
t) = [
ht−1,
Ct−1,
]
T as the state variables and
U(
t) = [
TR(
t),
TL(
t)]
T as the control input, Equations (8)–(10) can be regarded as the prediction equation of NN-MPC. To improve computation efficiency, an approximate linearization is further performed around the hidden and cell states of the LSTM layer (
ht−1,
Ct−1), current trajectory point X
global, and previous control input
U(
t − 1). At time slot
t, these quantities are all known from the previous iteration and sensor measurements. It is noted that the linearization is not performed at a fixed equilibrium point, but rather at the current trajectory point and corresponding LSTM states. Consequently, the resulting model is a local linear time-varying approximation corresponding to a trajectory-dependent linearization commonly adopted in NMPC design [
31]. The linearized prediction model can be reconstructed as follows:
where
ξ(
t +
k) = [
ht+k−1,
Ct+k−1,
(
t +
k)
, and
NC denotes the length of control horizon in the MPC design. The control input
U is transformed into its control increment form
U, in which
U(
t)=
U(
t) −
U(
t − 1). The matrices in (11) are given in (12) and (13).
Equation (11) can also be given by
where
A,
B, and
H are given as
Although the future state ξ(t + k) and control input U(t + j) appear in (14), they are not known a priori. Instead, the future states are recursively predicted from the current state ξ(t) and the sequence of control increments U. Specifically, U is the decision variable consisting of U(t), U(t + 1), …, U(t + NC − 1) in the NN-MPC. Consequently, (14) can be equivalently transformed into a formulation that depends only on the current state ξ(t) and the decision variable U, which ensures strict causality in the optimization.
As mentioned above, the linearization process is carried out around current trajectory point Xglobal and previous control input U(t − 1), leading to a local linear time-varying approximation. The matrices A and B in (14) are obtained from the Jacobians of the NN-based prediction model with respect to the linearization point. Since the linearization is performed online at each current trajectory point, and the actual vehicle states remain within a small neighborhood of this point during operation, it is reasonable to assume that the pair obtained from this local linearization is locally controllable within a compact neighborhood of the linearization point. Furthermore, the control sequence is computed by solving an optimization problem in the MPC design. Since the cost function is quadratic and the constraints are convex, the MPC optimization problem has a unique feasible solution within the local admissible region, thereby ensuring the feasibility of the resulting control action.
Choosing
η = [
ye,
φe,
ve]
T as the output variable, where
ye, φe, and
ve denote lateral error, heading error and speed tracking error. The value of
η at time slot
t +
k + 1 is given by
where
and
C are represented as
By combining formulas (14) and (16), the prediction model of NN-MPC is obtained as follows:
3.2. Objective Function and Optimal Solution
The output equation of the neural network prediction model at moment
t is expressed as follows:
where the matrices
AF,
BF,
CF,
DF,
SF, and
are given in (20)–(22).
where
NP is the length of the prediction horizon.
The objective function of trajectory tracking control based on the neural network prediction model is expressed as follows:
where the weight matrix for tracking error
Q = diag([
Qye,
Qφe,
Qve]),
Qye,
Qφe,
Qve are the weight matrix for lateral error, yaw error, and speed error.
R is the weight matrix for the control effort, which is characterized by the deviation of the control input.
is the slack variable and
is a positive weight for
.
The constraints are established, and the constraints on the state variables are expressed as follows:
The constraints on the output variables are expressed as follows:
The constraints on the control variables are given by
where
Pm and
ni are motor power and speed.
To enhance the stability of the closed-loop system, the constraint for terminal state is given as [
32]
Based on all the above constraints, Equation (30) is simplified to the standard form of quadratic programming, represented as follows:
The optimal U was obtained, and its first element was added to the previous desired motor torque, yielding U(t) = U(t − 1) + U(t). U(t) serves as the torque command transmitted to the actuator of the TUV.
Since the TUV is an underactuated system, it possesses nontrivial zero dynamics. In the proposed NN-MPC framework, the control inputs are generated by solving an optimization problem that includes the lateral, heading, and velocity tracking errors with corresponding weighting coefficients. These weights enable the controller to dynamically balance multiple tracking objectives and generate feasible control torques that drive the vehicle along the desired trajectory. It should be emphasized that, in real-world trajectory tracking applications, the primary objective is also to achieve satisfactory path-following performance (e.g., acceptable position, heading, and velocity error, rather than strict asymptotic convergence of all states).
In addition, due to the non-holonomic constraint of the TUV, the lateral axis is not directly controllable when the TUV is stationary. In such special conditions, this issue can be handled through the coordination of the overall control architecture. Specifically, when the TUV is stationary, the upper-level trajectory planning module replans the desired trajectory by taking the current vehicle position as the starting point. Once the vehicle starts moving, the NN-MPC controller dynamically adjusts the control torques to minimize the tracking errors. When the vehicle comes to a stop, the tracking process is considered complete if the terminal errors remain within acceptable tolerance bounds rather than absolute zero error, which is a standard practice in trajectory tracking control.
3.3. Stability Analysis
The stability of the proposed MPC controller is demonstrated in this subsection. The stability analysis for the nominal NN-based model is first presented, and then the practical stability under the modeling error caused by the NN-based model is discussed. It is noted that the focus of this article is primarily on practical implementation and control effectiveness, while the theoretical stability analysis is included here to ensure completeness.
3.3.1. Stability Analysis with Nominal NN-Based Model
For the stability of the nominal model, it is assumed that the NN-based model is sufficiently accurate. A candidate Lyapunov function
is formulated as
where
t is time slot,
and
are consistent with those presented in (22). It is evident that
0, while
0 only if
(
t +
i|
t) = 0 and
(
t +
i − 1|
t) = 0 for all
i, which means that
is positive-definite.
For the terminal constraint presented in (28), the initial feasibility is assumed to hold, i.e., there exists a feasible control sequence increment U(0) such that (NP|0) = 0. The assumption is reasonable, since the initial feasibility can be ensured through coordination among different modules of the tracked unmanned vehicle (TUV). For example, the upper-layer trajectory planning module generates the desired trajectory consistent with the initial position and speed of the TUV, ensuring feasible initialization of the control process.
Then, the recursive feasibility is discussed. Without loss of generality, the feasibility of terminal constraint at time slot t + 1 is analyzed, assuming that the feasibility of terminal constraint is satisfied at time slot t.
Due to the feasibility of
, it is assumed that, at time slot
t, a control sequence increment
Up(
t) exists that drives the predicted state to zero within
NP steps. The
Up(
t) is given as
Applying the
Up(
t|
t) as the current control input increment, the next-step state
(
t + 1|
t) is determined by the system model in (19) and
Up(
t|
t). Therefore, there exists a feasible control sequence
Up(
t + 1) satisfying
.
Up(
t + 1) is given by
Therefore, the recursive feasibility is guaranteed if initial feasibility is satisfied, which implies that the terminal constraint remains valid for all subsequent time steps under the optimal control law.
For (30), define the optimal solution at time slot
t as
Then,
is given as
where
is the output state under
.
Since the optimal solution in (33) is also a feasible solution satisfying the terminal constraint, there exists a feasible solution
U (
t + 1) at time slot
t + 1, given as
Based on (32) and (35),
satisfies
where
is a feasible function value with
(
t + 1).
According to (34) and (36), we obtain
Therefore,
is a monotonically non-increasing function, and the stability of the proposed MPC controller is proved. It is noted that the terminal constraint in (28) and the assumption of initial feasibility are theoretical design tools commonly adopted in the MPC literature to establish closed-loop stability [
31]. In practical implementations, this hard terminal constraint can be relaxed by introducing a terminal cost with sufficiently large weight, which effectively converts it into a soft constraint. This approach avoids infeasibility while maintaining stability in practice. Similarly, the initial feasibility can be ensured through coordination among the different modules of the tracked unmanned vehicle (TUV). For example, when the TUV is stationary, this condition can be regarded as an initial state. Although the lateral axis (
y-axis) is not directly controllable for a differential-driven tracked vehicle when the vehicle is stationary, the upper-layer trajectory planning module can then replan a feasible trajectory by taking the current position of the TUV as the starting point, ensuring that the reference trajectory always satisfies the non-holonomic constraints. Consequently, the NN-MPC controller is only applied in regions where the system remains locally controllable.
3.3.2. Practical Stability Under Bounded Model Error
The above proof assumes initial feasibility and neglects model mismatch between the NN-based model and the real system. In practice, the NN-based prediction model inevitably introduces the approximation error. The true system dynamics can be expressed as
where
fNN represents the nominal NN-based model and
ξtr(
t + 1|
t) is the true value of state
ξ(
t + 1|
t).
et is the model error with the assumed upper bound
.
Since the output state
(
t + 1|
t) is derived from the predicted state
ξ(
t + 1|
t), the modeling error between the true state
ξtr(
t + 1|
t) and predicted state
ξ(
t + 1|
t) leads to a deviation between the corresponding output
(
t + 1|
t) and
tr(
t + 1|
t), expressed as
where the upper bound of deviation depends on the
.
C1 is a positive constant determined by the mapping from system state
ξ(
t + 1|
t) to the output
(
t + 1|
t).
Defining
Jtr(
t + 1) and
J (
t + 1) as the true cost function value and nominal function value corresponding to
tr(
t + 1|
t) and
(
t + 1|
t), under the assumption that the cost function defined in (30) is locally Lipschitz, the deviation between
Jtr(
t + 1) and
J (
t + 1) is given by
where
C2 is a positive constant associated with local Lipschitz property of the cost function.
According to the stability analysis of nominal NN-based model and (38)–(40), we obtain
Due to the boundness of control input and output
(
t + 1|
t), there exist positive constants
and
satisfying that
Substituting (42) into (41), we obtain
where
min(
Q) is minimum eigen value of the positive-definite matrix
Q. Based on (43), it is evident that when
exceeds the boundary
min(
Q)
), the actual cost function decreases monotonically, implying that the closed-loop system is ultimately bounded within a neighborhood of radius
O(
) under the local Lipschitz assumption.
5. Conclusions
In this article, a neural network-based MPC approach is proposed for trajectory tracking of a dual-motor-driven TUV. Considering the coupled lateral and longitudinal dynamics characteristics, a data-driven vehicle model based on a LSTM network was developed to predict the vehicle motion states in a subsequent time step through a sequence of history states and control inputs. Subsequently, an NN-MPC trajectory tracking controller was designed, employing the LSTM-based model as the prediction model to calculate the optimal torque. The various simulation results demonstrate that the NN-MPC achieves accurate trajectory tracking performance across different speeds and reference paths. The effectiveness of the proposed scheme is further validated through the field experiment.
For future work, the validation of the proposed control scheme under high-speed scenarios will be further investigated. In addition, coordinated control strategies that simultaneously consider longitudinal speed tracking, lateral path following, and handling stability will be explored to enhance overall motion performance. Furthermore, to address the differences between the simulation model and the real scaled tracked vehicle, the adaptation of the trained LSTM-based vehicle dynamics model to the real TUV will be studied.