Stability Control of Vehicles with Brake Failure Based on the TD3 Adaptive Sliding Mode Control Algorithm

Wang, Ruochen; Wei, Feng; Ding, Renkai; Chen, Zhengrong; Liu, Wei; Sun, Dong

doi:10.3390/wevj17050230


by Ruochen Wang ¹,*, Feng Wei ¹, Renkai Ding ², Zhengrong Chen ¹, Wei Liu ¹ and Dong Sun ¹

¹ Department of Vehicle Engineering, School of Automobile and Traffic Engineering, Jiangsu University, Zhenjiang 212013, China

² Department of Vehicle Engineering, Automotive Engineering Research Institute, Jiangsu University, Zhenjiang 212013, China

* Author to whom correspondence should be addressed.

World Electr. Veh. J. 2026, 17(5), 230; https://doi.org/10.3390/wevj17050230

Submission received: 14 February 2026 / Revised: 7 April 2026 / Accepted: 20 April 2026 / Published: 24 April 2026

(This article belongs to the Special Issue Vehicle System Dynamics and Intelligent Control for Electric Vehicles)


Abstract

To address the issue of vehicle instability and veering during braking when a single wheel fails in an electric vehicle’s electromechanical braking (EMB) system, an integrated application-oriented control framework based on adaptive sliding mode control (ASMC) is proposed. To address the shortcomings of SMC—such as difficulty in suppressing oscillations and the high workload associated with parameter tuning—a novel composite reaching law function was designed, and the TD3 algorithm was employed to optimize the sliding mode control parameters. When a failure in the EMB system is detected, the upper-layer control uses an improved ASMC algorithm to calculate the vehicle’s additional yaw moment. The lower-layer control employs an optimal control algorithm to distribute braking force, taking into account braking intensity, yaw moment, and tire utilization. This approach is integrated with sliding mode steering control to enhance vehicle stability during braking. To meet the driver’s braking requirements, a backpropagation (BP) neural network is first employed to identify braking intent. Based on this, the additional yaw moment is calculated by the upper-layer controller, and the brake force distribution is optimized through the lower-layer controller, thereby improving the vehicle’s stability. Through co-simulation analysis using Simulink-2024a and CarSim-2019.1, the results show that, compared to traditional algorithms, the proposed hierarchical control strategy reduced the maximum sideslip angle by 51.4%, decreased the maximum yaw rate by 47.2%, and reduced the maximum lateral offset by 45.6%. This control strategy enables enhanced stability across various braking intensity conditions.

Keywords:

electromechanical braking; brake system failure; reinforcement learning; adaptive sliding mode control; vehicle stability control; brake force distribution

1. Introduction

In recent years, new energy vehicles have been evolving toward intelligent integration. Within their braking systems, brake-by-wire systems are gradually replacing traditional hydraulic braking systems due to their faster response and lighter weight. Compared with electro-hydraulic braking (EHB) systems, the EMB system has become a key focus for researchers and enterprises worldwide because of its superior response speed and precise control capabilities [1,2].

Compared with the EHB system, the EMB system directly engages the brake caliper with the brake disc via an electric motor, enabling independent control of each wheel. This promotes enhanced handling stability and safety during vehicle operation [3]. However, due to its fully electronic nature, the EMB system lacks the traditional hydraulic backup found in EHB systems. A failure in the EMB system would lead to degraded braking performance [4]. Therefore, enhancing its fault tolerance has become a critical issue that urgently needs to be addressed in the development of this technology.

Extensive research has been conducted by scholars worldwide on the issue of failure in brake-by-wire systems. Reference [5] proposed a fault-tolerant control architecture for brake-by-wire vehicles. It integrated a regenerative braking system, first employing a sliding mode control algorithm to calculate the compensating yaw moment, then deriving the braking torque. This approach reduced the error between the controlled yaw rate and the desired yaw rate to less than 2°/s and minimized the speed control error to under 1 km/h. However, it did not address the issue of braking force reconstruction under high braking intensity conditions. Reference [6] proposed a fault-tolerant control algorithm employing an explicit weighted pseudoinverse distribution and redistribution mechanism. When an actuator failed, this approach required only the redistribution of signals generated by upper-layer controllers, without requiring reconfiguration of the control system. However, this control method showed limitations under heavy braking conditions. Reference [7] introduced a fault factor to quantify the severity of brake failures and employed a sliding mode controller to regulate the vehicle’s yaw moment, thereby reconstructing the braking force. Reference [8] proposed a two-layer controller using yaw rate and sideslip angle as state variables, where the inner layer employed model predictive control (MPC) and the outer layer utilized fuzzy proportional–integral–derivative (PID) control. Finally, the desired stability control was achieved through the distribution of four-wheel braking torque. Reference [9] proposed a nonlinear model predictive observer for vehicle trajectory tracking, which demonstrated improved control accuracy and computational efficiency compared to standard MPC controllers. Reference [10] introduced multiple enhancements to existing adaptive path-tracking controllers, improving tracking performance and disturbance rejection capabilities. 
In Reference [11], through broadcast control, the architecture transitioned from conventional centralized control to autonomous distributed control of braking and driving forces. When single-wheel or dual-wheel failures occurred, the vehicle maintained acceptable stability and performance through the autonomous distributed behavior of the remaining wheels.

In terms of vehicle stability control, Reference [12] proposed a method for achieving stable control by relying on the combined effect of the center yaw moment of the vehicle’s front axle and the center of gravity yaw moment. The upper-layer controller calculates the additional yaw moment required for the vehicle to maintain stability based on the vehicle’s state, while the lower-layer controller allocates the required yaw moment to the optimal driving torque of each wheel. Reference [13] proposed a joint sliding mode control algorithm combined with fuzzy adaptive gain to address the issue that lateral stability control of the vehicle during the steering state is affected by system parameter perturbations and external environmental disturbances. Additionally, a simplified unscented Kalman filter observer was proposed for the dynamic estimation of vehicle state parameters and the road adhesion coefficient, which can be used by the lateral stability controller. In terms of fault-tolerant control for braking failures, Reference [14] proposed an overall control architecture and first analyzed braking failure modes under various conditions. Subsequently, braking reconstruction rules were established to adjust the braking force on the remaining functioning actuators when a braking failure mode occurred. Moreover, the $H_{\infty}$ algorithm was introduced to determine the reconstructed braking force based on the vehicle’s dynamic state. Reference [15] proposed a coordinated control method for fault-tolerant control of the braking actuator. The drive and braking controllers optimally distributed the required longitudinal force and yaw moment, and set constraints aimed at keeping the yaw moment error within the allowable range. The yaw moment deficit left by the drive and braking controllers was effectively compensated by the steering controller’s front and rear wheel feedforward steering angles.

Most of the aforementioned studies focused on single-wheel failure scenarios during straight-line braking, often overlooking the impact of different road surface friction coefficients on braking performance. Furthermore, these studies exhibited limitations such as slow controller response times and the use of overly simplistic objective functions for optimizing brake force distribution, making them less adaptable to complex operating conditions. In response to the nonlinear and complex variable conditions caused by the failure of a single wheel brake, relying solely on theoretical improvements of a single algorithm is often insufficient to fully meet the practical application requirements. Therefore, this paper takes an application-oriented and engineering-integrated research approach, proposing a hierarchical control strategy to systematically mitigate the vehicle instability problem caused by the failure of the EMB system. Specifically, a comprehensive hierarchical fault-tolerant control architecture was constructed, integrating ASMC, TD3 reinforcement learning, and optimal braking force distribution. An improved reaching law is designed, and the ASMC parameters are optimized using the TD3 algorithm. This study provides a practical approach to alleviate the chattering problem of traditional SMC and significantly improves the real-time adaptability under various road friction conditions and driving maneuvers (such as straight and curved roads).

2. Vehicle Dynamics Model

2.1. Seven-Degree-of-Freedom Vehicle Model

Given that the vertical load, lateral force, and longitudinal force on the tires all exhibit dynamic variations during braking, a seven-degree-of-freedom (7-DOF) vehicle model was established [16] incorporating longitudinal, lateral, and yaw motion and the rotation of all four wheels. To facilitate the analysis of the forces acting on the vehicle during a single-wheel brake failure, practical factors such as air resistance, tire deformation, and sudden changes in the road friction coefficient were neglected. The established model enables the analysis of vehicle dynamics under such failure conditions, as illustrated in Figure 1.

The dynamic equations for the longitudinal, lateral, and yaw motions are established from Newton’s laws of motion:

\begin{matrix} m ({\dot{v}}_{x} - ω v_{y}) = (F_{x 1} + F_{x 2}) \cos δ_{f} + \\ (F_{y 1} + F_{y 2}) \sin δ_{f} + F_{x 3} + F_{x 4} \end{matrix}

(1)

\begin{matrix} m ({\dot{v}}_{y} + ω v_{x}) = (F_{y 1} + F_{y 2}) \cos δ_{f} - \\ (F_{x 1} + F_{x 2}) \sin δ_{f} + F_{y 3} + F_{y 4} \end{matrix}

(2)

\begin{matrix} M_{z} = I_{z} \dot{ω} = a (F_{x 1} + F_{x 2}) \sin δ_{f} - \\ a (F_{y 1} + F_{y 2}) \cos δ_{f} - b (F_{y 3} + F_{y 4}) + T_{f} \end{matrix}

(3)

\begin{matrix} T_{f} = \frac{c}{2} [(F_{x 1} - F_{x 2}) \cos δ_{f} + \\ (F_{y 1} - F_{y 2}) \sin δ_{f} + (F_{x 3} - F_{x 4})] \end{matrix}

(4)

In Equations (1) to (4), $m$ is the vehicle mass; $v_{x}$ and $v_{y}$ are the longitudinal and lateral velocities of the vehicle; $ω$ is the yaw rate; $δ_{f}$ is the front-wheel steering angle; $a$ and $b$ are the distances from the vehicle center of mass to the front and rear axles, respectively; $c$ is the vehicle track width; $I_{z}$ is the vehicle moment of inertia about the z-axis; and $T_{f}$ is the yaw moment about the center of mass produced by the left-right difference in tire forces, as defined in Equation (4). The subscripts 1 to 4 denote the left front wheel, right front wheel, left rear wheel, and right rear wheel, respectively. $F_{x}$ is the longitudinal tire force, and $F_{y}$ is the lateral tire force.
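As a minimal sketch, Equations (1) to (4) can be evaluated numerically as shown below. The function name, argument order, and the numerical values in the usage note are illustrative assumptions and are not taken from the paper; the tire forces are treated as given inputs, and the four wheel rotational equations of the 7-DOF model are not included.

```python
import math


def body_accelerations(m, I_z, a, b, c, delta_f, v_x, v_y, omega, F_x, F_y):
    """Evaluate Equations (1)-(4) for the three body motions.

    F_x and F_y are 4-element sequences ordered as in the text:
    left front, right front, left rear, right rear.
    Returns (v_x_dot, v_y_dot, omega_dot).
    """
    cd, sd = math.cos(delta_f), math.sin(delta_f)

    # Right-hand side of Eq. (1): resultant longitudinal force.
    sum_x = (F_x[0] + F_x[1]) * cd + (F_y[0] + F_y[1]) * sd + F_x[2] + F_x[3]
    # Right-hand side of Eq. (2): resultant lateral force.
    sum_y = (F_y[0] + F_y[1]) * cd - (F_x[0] + F_x[1]) * sd + F_y[2] + F_y[3]
    # Eq. (4): moment from the left-right force difference.
    T_f = 0.5 * c * ((F_x[0] - F_x[1]) * cd
                     + (F_y[0] - F_y[1]) * sd
                     + (F_x[2] - F_x[3]))
    # Right-hand side of Eq. (3): resultant yaw moment.
    M_z = (a * (F_x[0] + F_x[1]) * sd
           - a * (F_y[0] + F_y[1]) * cd
           - b * (F_y[2] + F_y[3])
           + T_f)

    # Solve Eqs. (1)-(3) for the state derivatives.
    v_x_dot = sum_x / m + omega * v_y
    v_y_dot = sum_y / m - omega * v_x
    omega_dot = M_z / I_z
    return v_x_dot, v_y_dot, omega_dot
```

For example, with hypothetical values m = 1600 kg, I_z = 2500 kg·m² and c = 1.6 m, equal braking forces on all four wheels give zero yaw acceleration, whereas setting the left front braking force to zero leaves a nonzero yaw moment through the T_f term.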

2.2. Two-Degree-of-Freedom Model

Vehicle stability control relies on precise vehicle dynamics models. This paper employs the two-degree-of-freedom vehicle model shown in Figure 2 as the reference model [17].

\begin{cases} \dot{β} = \frac{K_{f} + K_{r}}{m v_{x}} β + (\frac{a K_{f} - b K_{r}}{m v_{x}^{2}} - 1) ω - \frac{K_{f}}{m v_{x}} δ_{f} \\ \dot{ω} = \frac{a K_{f} - b K_{r}}{I_{z}} β + \frac{a^{2} K_{f} + b^{2} K_{r}}{I_{z} v_{x}} ω - \frac{a K_{f}}{I_{z}} δ_{f} \end{cases}

(5)

In Equation (5), $K_{f}$ and $K_{r}$ represent the lateral stiffness of the front and rear axles, respectively; $β$ is the vehicle’s sideslip angle; and $ω$ is the vehicle’s yaw rate.
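The reference model of Equation (5) can be sketched as a small state-derivative function with a forward-Euler step response. This is an illustrative sketch only: the function names, the integration scheme, and the parameter values are assumptions, and the lateral stiffnesses are assumed to be entered as negative numbers, which is the sign convention under which Equation (5) is stable as written.

```python
def bicycle_rates(beta, omega, delta_f, v_x, m, I_z, a, b, K_f, K_r):
    """Right-hand sides of Equation (5): returns (beta_dot, omega_dot)."""
    beta_dot = ((K_f + K_r) / (m * v_x) * beta
                + ((a * K_f - b * K_r) / (m * v_x ** 2) - 1.0) * omega
                - K_f / (m * v_x) * delta_f)
    omega_dot = ((a * K_f - b * K_r) / I_z * beta
                 + (a ** 2 * K_f + b ** 2 * K_r) / (I_z * v_x) * omega
                 - a * K_f / I_z * delta_f)
    return beta_dot, omega_dot


def step_steer_response(delta_f, v_x, m, I_z, a, b, K_f, K_r,
                        dt=1e-3, t_end=5.0):
    """Integrate Equation (5) with forward Euler from rest under a
    constant steering angle; returns the final (beta, omega)."""
    beta, omega = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        beta_dot, omega_dot = bicycle_rates(
            beta, omega, delta_f, v_x, m, I_z, a, b, K_f, K_r)
        beta += dt * beta_dot
        omega += dt * omega_dot
    return beta, omega
```

With the hypothetical values m = 1500 kg, I_z = 2500 kg·m², a = 1.2 m, b = 1.4 m and K_f = K_r = −80,000 N/rad, calling `step_steer_response(0.02, 20.0, ...)` settles to the steady-state yaw rate of the linear model.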

2.3. Tire Model

The forces and torques generated by tires, such as braking force, lateral force, rolling resistance and self-aligning torque, are key factors influencing vehicle dynamics. To comprehensively describe the longitudinal and lateral forces of a tire, the “Magic Formula” [18] can be used. This formula is less prone to divergence under critical operating conditions such as high slip, varying loads, and high lateral slip, making it suitable for analyzing stability-limited scenarios including wheel lockup, single-wheel failure, and yaw instability [19]:

F_{x i} = D_{x} \sin \{ C_{x} \arctan [B_{x} λ_{i} - E_{x} (B_{x} λ_{i} - \arctan (B_{x} λ_{i}))] \}

(6)

F_{y i} = D_{y} \sin \{ C_{y} \arctan [B_{y} α_{i} - E_{y} (B_{y} α_{i} - \arctan (B_{y} α_{i}))] \}

(7)

In Equations (6) and (7), $B$, $C$, $D$, and $E$ represent tire parameters; $λ_{i}$ represents the tire slip ratio; and $α_{i}$ represents the tire slip angle.

The tire parameters are given by:

\begin{cases} C_{x} = b_{0} \\ D_{x} = b_{1} F_{z}^{2} + b_{2} F_{z} \\ B_{x} = \frac{(b_{3} F_{z}^{2} + b_{4} F_{z})}{C_{x} D_{x} \exp (b_{5} F_{z})} \\ E_{x} = b_{6} F_{z}^{2} + b_{7} F_{z} + b_{8} \end{cases}

(8)

\begin{cases} C_{y} = a_{0} \\ D_{y} = a_{1} F_{z}^{2} + a_{2} F_{z} \\ B_{y} = \frac{a_{3} \sin (2 \arctan (F_{z} / a_{4})) (1 - a_{5} |λ|)}{C_{y} D_{y}} \\ E_{y} = a_{6} F_{z}^{2} + a_{7} \end{cases}

(9)

The fitting parameters of the tire model under pure longitudinal slip and pure sideslip conditions are listed in Table 1.
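The common functional form of Equations (6) and (7) can be sketched as a single helper. The function name and the coefficient values used in the usage note are hypothetical; in the paper, $B$, $C$, $D$, and $E$ follow from the load-dependent expressions above and the fitted values in Table 1, which are not reproduced here.

```python
import math


def magic_formula(x, B, C, D, E):
    """Magic Formula force for a slip quantity x.

    x is the slip ratio for the longitudinal force (Eq. 6) or the
    slip angle for the lateral force (Eq. 7); B, C, D, E are the
    corresponding tire parameters.
    """
    Bx = B * x
    return D * math.sin(C * math.atan(Bx - E * (Bx - math.atan(Bx))))
```

With illustrative coefficients such as B = 10, C = 1.65, D = 4000 and E = 0.5, the curve passes through the origin, is odd in the slip quantity, and its magnitude never exceeds the peak factor D, which is why the form remains well behaved at high slip.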

The vertical load on each wheel is given by:

\begin{cases} F_{z 1} = \frac{m g b}{2 (a + b)} - \frac{m h_{g} a_{x}}{2 (a + b)} - \frac{m h_{g} a_{y} b}{c L} \\ F_{z 2} = \frac{m g b}{2 (a + b)} - \frac{m h_{g} a_{x}}{2 (a + b)} + \frac{m h_{g} a_{y} b}{c L} \\ F_{z 3} = \frac{m g a}{2 (a + b)} + \frac{m h_{g} a_{x}}{2 (a + b)} - \frac{m h_{g} a_{y} b}{c L} \\ F_{z 4} = \frac{m g a}{2 (a + b)} + \frac{m h_{g} a_{x}}{2 (a + b)} + \frac{m h_{g} a_{y} b}{c L} \end{cases}

(10)

In Equation (10), $a_{x}$ denotes the acceleration of the vehicle in the x-direction; $a_{y}$ denotes the acceleration of the vehicle in the y-direction; $h_{g}$ is the height of the center of mass; $L = a + b$ is the wheelbase; and $F_{z i}$ is the vertical load on wheel $i$.

During vehicle braking instability, the maximum braking force each wheel can generate is related to its vertical load. The vertical load on the tires varies dynamically due to changes in longitudinal and lateral acceleration.
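The load transfer described by Equation (10) can be sketched as follows. The function name and the numbers in the usage note are illustrative assumptions. The sketch also assumes that the static rear-axle terms are both $m g a / (2 (a + b))$ and that $L = a + b$, so that the four loads sum to the vehicle weight.

```python
def vertical_loads(m, a, b, c, h_g, a_x, a_y, g=9.81):
    """Vertical loads (F_z1, F_z2, F_z3, F_z4) following Equation (10).

    Wheel order: left front, right front, left rear, right rear.
    a_x is negative during braking.
    """
    L = a + b
    static_front = m * g * b / (2.0 * L)      # per front wheel, at rest
    static_rear = m * g * a / (2.0 * L)       # per rear wheel, at rest
    longitudinal = m * h_g * a_x / (2.0 * L)  # fore-aft load transfer
    lateral = m * h_g * a_y * b / (c * L)     # left-right load transfer

    F_z1 = static_front - longitudinal - lateral
    F_z2 = static_front - longitudinal + lateral
    F_z3 = static_rear + longitudinal - lateral
    F_z4 = static_rear + longitudinal + lateral
    return F_z1, F_z2, F_z3, F_z4
```

For a hypothetical vehicle with m = 1500 kg, a = 1.2 m, b = 1.4 m, c = 1.6 m and h_g = 0.55 m, a deceleration of 5 m/s² (a_x = −5) raises the front-axle load and lowers the rear-axle load by the same amount, which is why a front-wheel failure removes more braking capacity than a rear-wheel failure.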

3. The Influence of Braking Intensity on Vehicle Stability

During vehicle motion, a single-wheel brake failure represents the most likely failure scenario. To investigate how front-wheel and rear-wheel brake failures differ in their effects on vehicle stability, this section analyzes the failures of the left front brake system and the right rear brakes as case studies.

3.1. Analysis of Vehicle Stability During Left Front Brake Failure

3.1.1. Theoretical Analysis of Vehicle Stability in the Event of Left Front Brake Failure

\begin{cases} M_{z x} = (F_{x 3} - F_{x 2} - F_{x 4}) c / 2 \\ M_{z y} = (F_{y 2} + F_{y 1}) a - (F_{y 4} + F_{y 3}) b \\ M_{z} = M_{z x} + M_{z y} \end{cases}

(11)

When the braking system of the left front wheel fails, the braking pressure on that wheel is generally considered to be zero, leaving the other three wheels to primarily provide braking force. As shown in Equation (11), the braking force on the left side of the vehicle is significantly less than that on the right side. Due to this imbalance in braking force between the left and right sides, a clockwise yaw moment $M_{z x}$ is generated, causing the vehicle to veer. Meanwhile, according to the Kamm circle (friction circle) theory, because the braking force on the left front wheel is assumed to be zero, its longitudinal tire utilization decreases, allowing it to generate a greater lateral force. Under extreme conditions, the differing lateral forces exerted by the front and rear axles can also create an additional yaw moment $M_{z y}$, further exacerbating the vehicle’s yaw motion and compromising its lateral stability.

3.1.2. Simulation and Validation of Vehicle Stability with Left Front Brake Failure

To visually demonstrate the effect of braking force on vehicle stability during a single-wheel brake failure, steering control is temporarily excluded. In this section, we analyze the system using a Simulink–CarSim co-simulation, with a B-Class vehicle in CarSim serving as the reference vehicle. The simulation conditions are set as follows: an initial vehicle speed of V = 100 km/h, braking intensities of $Z = 0.2$, $0.5$, and $0.8$, road friction coefficients of 0.5 and 0.8, and scenarios involving brake failure on the left front and right rear wheels.

As shown in Figure 3a, due to the failure of the vehicle’s left front brake, the vehicle exhibits a certain degree of lateral offset and an increase in yaw rate under all three braking conditions, indicating relatively poor lateral stability. Specifically, when $Z = 0.2$, the vehicle requires a lower braking force, resulting in a relatively small loss of braking force; consequently, the lateral offset and yaw rate remain minimal. However, as the braking intensity increases, the braking force deficit at the failed wheel progressively increases, causing the vehicle’s lateral offset and yaw rate to rise sharply.

In particular, at $Z = 0.8$, the maximum lateral offset and yaw rate reached 17.97 m and −64.64°/s, respectively. Under the same longitudinal displacement, compared with Z = 0.2 and Z = 0.5, the lateral offset increased by factors of 18.5 and 1.88, respectively, while the yaw rate increased by factors of 79.8 and 2.52, respectively. Consequently, the vehicle’s lateral stability deteriorated significantly, indicating that the impact of a single-wheel brake failure is more severe during high-intensity braking.

3.2. Analysis of Vehicle Stability in the Event of Right Rear Brake Failure

3.2.1. Theoretical Analysis of Vehicle Stability in the Event of Right Rear Brake Failure

\begin{cases} M_{z x} = (F_{x 1} + F_{x 3} - F_{x 2}) c / 2 \\ M_{z y} = (F_{y 2} + F_{y 1}) a - (F_{y 4} + F_{y 3}) b \\ M_{z} = M_{z x} + M_{z y} \end{cases}

(12)

When the braking system of the right rear wheel fails, the braking pressure on that wheel is generally considered to be zero, leaving the other three wheels to primarily provide braking force. At this point, the braking force on the right side of the vehicle is less than that on the left. Due to this imbalance in braking forces, as shown in Equation (12), a counterclockwise moment $M_{z x}$ is generated, causing the vehicle to yaw. Meanwhile, since the braking force on the right rear wheel is assumed to be zero, its longitudinal tire utilization decreases, thereby increasing the available lateral force it can provide. Under extreme conditions, the lateral forces provided by the front and rear axles differ, which also generates a yaw moment $M_{z y}$, further exacerbating the vehicle’s yaw motion and compromising its lateral stability.

3.2.2. Simulation of Vehicle Stability with Right Rear Brake Failure

As shown in Figure 3b, compared with a failure in the left front brake system, a failure in the right rear brake system results in a smaller loss of overall braking force, because less braking force is allocated to the rear wheels during braking. Consequently, the resulting lateral offset and yaw rate tend to be correspondingly smaller. However, as the braking intensity gradually increases, the loss of braking force at the right rear wheel becomes more significant, and the vehicle’s stability tends to deteriorate noticeably.

4. Brake Stability Control Strategy

4.1. Brake Force Identification Based on BP Neural Networks

Compared with traditional hydraulic braking systems, in which the brake pedal is mechanically coupled to the brake actuators, the EMB system achieves physical decoupling between the brake pedal and the wheel actuators. This system requires real-time recognition of the driver’s braking intent to precisely regulate clamping force at each wheel through electromechanical actuators. Within the vehicle stability control strategy described herein, the braking intensity Z plays a critical role in clamping force distribution and is therefore of particular importance.

There exists a strong nonlinear mapping relationship between braking intent and driver input, making it difficult to establish an accurate mathematical model for direct representation. As a typical type of multilayer feedforward network, BP neural networks have become the most widely applied neural network architecture in this field due to their global approximation capability for arbitrary nonlinear functions and excellent generalization performance [20,21]. They have been successfully applied for recognizing vehicle driving intentions (such as lane changes and braking scenarios). This paper employs a three-layer BP network architecture (consisting of an input layer, a hidden layer, and an output layer), as shown in Figure 4. Vehicle speed, brake pedal displacement, and the pedal’s rate of change are used as input features, with the output layer mapped to the braking intensity Z [22].

Principles of BP Neural Networks:

Definition: The input layer is $a_{1}^{0} = V$, $a_{2}^{0} = S_{p}$, $a_{3}^{0} = V_{p}$, corresponding to the vehicle speed, the brake pedal displacement, and the pedal’s rate of change; the number of neurons in the hidden layer is $M = 5$; and the number of output layer neurons is $L = 1$. Weights and biases: $w_{j k}^{l}$ denotes the weight of the connection from the kth neuron in layer $l - 1$ to the jth neuron in layer $l$. It is also the element in the jth row and kth column of the $l$th-layer weight matrix. Similarly, $b_{j}^{l}$ denotes the bias of the jth neuron in layer $l$, which is also the jth element of the bias vector for layer $l$. Let $Z_{j}^{l}$ denote the linear output of the jth neuron in the lth layer, and let $a_{j}^{l}$ denote the output of the activation function of the jth neuron in the lth layer. In this context, the activation function is denoted by the symbol $σ$, and the activation of the jth neuron in the lth layer is as follows:

a_{j}^{l} = σ (Z_{j}^{l}) = σ (\sum_{k} w_{j k}^{l} a_{k}^{l - 1} + b_{j}^{l})

(13)

Z^{1} = w^{1} \cdot a^{0} + b^{1} = \begin{bmatrix} w_{11}^{1} a_{1}^{0} + w_{12}^{1} a_{2}^{0} + w_{13}^{1} a_{3}^{0} + b_{1}^{1} \\ w_{21}^{1} a_{1}^{0} + w_{22}^{1} a_{2}^{0} + w_{23}^{1} a_{3}^{0} + b_{2}^{1} \\ w_{31}^{1} a_{1}^{0} + w_{32}^{1} a_{2}^{0} + w_{33}^{1} a_{3}^{0} + b_{3}^{1} \\ w_{41}^{1} a_{1}^{0} + w_{42}^{1} a_{2}^{0} + w_{43}^{1} a_{3}^{0} + b_{4}^{1} \\ w_{51}^{1} a_{1}^{0} + w_{52}^{1} a_{2}^{0} + w_{53}^{1} a_{3}^{0} + b_{5}^{1} \end{bmatrix}

(14)

Z^{2} = w_{11}^{2} a_{1}^{1} + w_{12}^{2} a_{2}^{1} + w_{13}^{2} a_{3}^{1} + w_{14}^{2} a_{4}^{1} + w_{15}^{2} a_{5}^{1} + b_{1}^{2}

(15)

The forward propagation process can be expressed as follows:

a^{l} = σ (w^{l} a^{l - 1} + b^{l})

(16)

Loss function: To measure the difference between the predicted value $\hat{Z}$ and the actual value $Z_{\mathrm{true}}$, the commonly used mean squared error is employed.

L = \mathrm{Loss} (\hat{Z}, Z_{\mathrm{true}}) = \frac{1}{2} {(\hat{Z} - Z_{\mathrm{true}})}^{2}

(17)

Backpropagation:

Output error:

δ_{j}^{l} = \frac{\partial L}{\partial a_{j}^{l}} σ^{'} (Z_{j}^{l})

(18)

Hidden layer error:

δ_{j}^{l} = \sum_{k} w_{k j}^{l + 1} δ_{k}^{l + 1} σ^{'} (Z_{j}^{l})

(19)

Rate of change of parameters:

\frac{\partial L}{\partial b_{j}^{l}} = δ_{j}^{l}, \frac{\partial L}{\partial w_{j k}^{l}} = a_{k}^{l - 1} δ_{j}^{l}

(20)

Parameter updates:

b_{j}^{l} \leftarrow b_{j}^{l} - η \frac{\partial L}{\partial b_{j}^{l}}, w_{j k}^{l} \leftarrow w_{j k}^{l} - η \frac{\partial L}{\partial w_{j k}^{l}}

(21)

For detailed parameter settings, see Appendix A, Table A1.

During the braking intensity recognition process, braking intensity simulation parameters exported from CarSim were used to construct the BP neural network model. Specifically, 70% of the data were used to train the neural network, 15% were used as the validation set, and the remaining 15% for testing. When determining the number of hidden layer neurons, both model recognition accuracy and computational speed were taken into account. Based on the evaluation of accuracy and training time across different neuron configurations (as shown in Figure 5), the number of hidden layer neurons was set to 5, with the maximum number of iterations capped at 1000. To prevent overfitting, an early stopping mechanism was employed, which halted the training process when the validation loss ceased to decrease.

To ensure the generalization, approximation capability, and convergence speed of the neural network, the tanh(x) activation function was selected for the hidden layer, and weights were initialized using the Xavier initialization method. Similarly, the Leaky-ReLU activation function was selected for the output layer, and the weights were initialized using the He initialization method.
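The forward pass, loss, and backpropagation steps of Equations (13) to (21), with the activations and initializations described above, can be sketched in plain Python for the 3-5-1 network. This is a minimal illustration rather than the authors’ implementation: the function names, the Leaky-ReLU slope of 0.01, the learning rate, the random seed, and the assumption that the inputs have already been normalized are all hypothetical choices.

```python
import math
import random


def leaky_relu(z, slope=0.01):
    return z if z > 0 else slope * z


def leaky_relu_grad(z, slope=0.01):
    return 1.0 if z > 0 else slope


def init_params(n_in=3, n_hidden=5, seed=0):
    """Xavier-uniform weights for the tanh hidden layer and
    He-normal weights for the Leaky-ReLU output neuron."""
    rng = random.Random(seed)
    limit = math.sqrt(6.0 / (n_in + n_hidden))
    w1 = [[rng.uniform(-limit, limit) for _ in range(n_in)]
          for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    std = math.sqrt(2.0 / n_hidden)
    w2 = [rng.gauss(0.0, std) for _ in range(n_hidden)]
    b2 = 0.0
    return w1, b1, w2, b2


def forward(params, x):
    """Forward pass; returns (Z1, a1, Z2, Z_hat)."""
    w1, b1, w2, b2 = params
    z1 = [sum(w * xi for w, xi in zip(row, x)) + b
          for row, b in zip(w1, b1)]                 # Eq. (14)
    a1 = [math.tanh(z) for z in z1]                  # Eq. (16), hidden layer
    z2 = sum(w * a for w, a in zip(w2, a1)) + b2     # Eq. (15)
    return z1, a1, z2, leaky_relu(z2)


def loss(params, x, z_true):
    return 0.5 * (forward(params, x)[3] - z_true) ** 2   # Eq. (17)


def gradients(params, x, z_true):
    """Backpropagation; returns (dL/dw1, dL/db1, dL/dw2, dL/db2)."""
    w1, b1, w2, b2 = params
    z1, a1, z2, z_hat = forward(params, x)
    delta2 = (z_hat - z_true) * leaky_relu_grad(z2)       # Eq. (18)
    delta1 = [w2[j] * delta2 * (1.0 - a1[j] ** 2)         # Eq. (19),
              for j in range(len(a1))]                    # tanh' = 1 - tanh^2
    g_w2 = [a * delta2 for a in a1]                       # Eq. (20)
    g_w1 = [[xi * d for xi in x] for d in delta1]         # Eq. (20)
    return g_w1, delta1, g_w2, delta2


def sgd_step(params, x, z_true, eta=0.05):
    """One parameter update following Eq. (21)."""
    w1, b1, w2, b2 = params
    g_w1, g_b1, g_w2, g_b2 = gradients(params, x, z_true)
    w1 = [[w - eta * g for w, g in zip(row, g_row)]
          for row, g_row in zip(w1, g_w1)]
    b1 = [b - eta * g for b, g in zip(b1, g_b1)]
    w2 = [w - eta * g for w, g in zip(w2, g_w2)]
    return w1, b1, w2, b2 - eta * g_b2
```

A training loop would call `sgd_step` repeatedly over the training samples and monitor `loss` on the validation set to decide when to stop early.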

Unseen samples were selected to test the predictive accuracy of the BP neural network. During the braking process, the vehicle gradually decelerated from a maximum speed of 100 km/h until coming to a complete stop. To simulate driver braking behavior, a step-input braking maneuver was employed, in which the brake pedal force rapidly increased to 100 N within 1 s and then remained constant. The model recognition accuracy is shown in Figure 6. The overall prediction error is less than 5%, meeting the accuracy requirements.

4.2. Upper-Level Additional Yaw Moment Control

When implementing vehicle stability control, the vehicle’s sideslip angle and yaw rate are used as reference targets. Therefore, a two-degree-of-freedom dynamic model involving lateral and yaw motions is established, with the yaw moment $Δ M$ included, as shown in the following equation:

\begin{cases} \dot{β} = \frac{K_{f} + K_{r}}{m v_{x}} β + (\frac{a K_{f} - b K_{r}}{m v_{x}^{2}} - 1) ω - \frac{K_{f}}{m v_{x}} δ_{f} \\ \dot{ω} = \frac{a K_{f} - b K_{r}}{I_{z}} β + \frac{a^{2} K_{f} + b^{2} K_{r}}{I_{z} v_{x}} ω - \frac{a K_{f}}{I_{z}} δ_{f} + \frac{Δ M}{I_{z}} \end{cases}

(22)

When the vehicle is traveling steadily, i.e., $\dot{β} = 0$ and $\dot{ω} = 0$, the expected sideslip angle $β_{d}$ and expected yaw rate $ω_{d}$ can be obtained from the above equation:

β_{d} = (\frac{b}{L v_{x}} + \frac{m a}{L K_{r}} v_{x}) ω_{d}

(23)

ω_{d} = \frac{v_{x}}{(1 + K v_{x}^{2}) L} δ_{f}

(24)

K = \frac{m}{L^{2}} (\frac{a}{K_{r}} - \frac{b}{K_{f}})

(25)

where $K$ is the stability coefficient.

Taking into account the constraints imposed by the road surface coefficient of friction $μ$, the modified expected sideslip angle and yaw rate are obtained as follows:

β_{\mathrm{ref}} = \min \{|β_{d}|, |μ g (\frac{b}{v_{x}^{2}} + \frac{m a}{K_{f} (a + b)})|\} \operatorname{sgn} (δ_{f})

(26)

ω_{\mathrm{ref}} = \min \{|ω_{d}|, |\frac{μ g}{v_{x}}|\} \operatorname{sgn} (δ_{f})

(27)
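The reference computation of Equations (23) to (27) can be sketched as a single function. The function name and the parameter values in the usage note are illustrative assumptions; the expressions are transcribed as printed, with $L = a + b$ and negative lateral stiffness values assumed.

```python
import math


def reference_state(delta_f, v_x, mu, m, a, b, K_f, K_r, g=9.81):
    """Friction-limited references (beta_ref, omega_ref), Eqs. (23)-(27)."""
    L = a + b
    K = m / L ** 2 * (a / K_r - b / K_f)                    # Eq. (25)
    omega_d = v_x / ((1.0 + K * v_x ** 2) * L) * delta_f    # Eq. (24)
    beta_d = (b / (L * v_x) + m * a / (L * K_r) * v_x) * omega_d  # Eq. (23)

    # sgn(delta_f), with sgn(0) = 0
    if delta_f > 0:
        sign = 1.0
    elif delta_f < 0:
        sign = -1.0
    else:
        sign = 0.0

    beta_limit = abs(mu * g * (b / v_x ** 2 + m * a / (K_f * (a + b))))
    omega_limit = abs(mu * g / v_x)
    beta_ref = min(abs(beta_d), beta_limit) * sign          # Eq. (26)
    omega_ref = min(abs(omega_d), omega_limit) * sign       # Eq. (27)
    return beta_ref, omega_ref
```

For a hypothetical vehicle at 20 m/s, a small steering input on a high-friction road returns the unsaturated value $ω_{d}$, whereas a large steering input on a low-friction road is clipped to $μ g / v_{x}$ with the sign of the steering angle.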

In the calculation of yaw moments in the upper-layer controller, PID control, MPC and SMC are commonly used. PID control relies heavily on empirical tuning, while MPC, although more accurate, involves a significant computational burden, which can affect the system’s real-time performance [23,24,25]. Therefore, SMC was selected as the method for calculating the vehicle’s additional yaw moment.

Based on the nonlinear characteristics of the vehicle dynamics model, SMC is employed to calculate the yaw moment to be applied to the vehicle [26,27,28]. To eliminate the chattering phenomenon during the switching process of SMC, an adaptive gain dynamic weighting reaching law is introduced in this paper [29].

Define the sliding surface as follows:

s = c_{1} e_{β} + c_{2} e_{ω} + \int (k_{β} e_{β} + k_{ω} e_{ω}) d τ

(28)

In Equation (28), $e_{β} = β - β_{d}$, $e_{ω} = ω - ω_{d}$, $c_{1} > 0$, $c_{2} > 0$, $k_{β} > 0$, and $k_{ω} > 0$.

The reaching law is as follows:

\dot{s} = - k (s) \cdot ϕ (s)

(29)

ϕ (s) = [(1 - σ (s)) \tanh (\frac{s}{ε_{1}}) + σ (s) \frac{s}{ε_{2} + |s|}]

(30)

In Equation (29), the adaptive gain term is as follows:

k (s) = k_{0} + k_{1} {(|s| + ε)}^{k |s| - 1} + k_{2} \frac{|s|}{a + |s|}

(31)

Here, $k$, $k_{0}$, $k_{1}$, $k_{2}$, $ε$, $ε_{1}$, $ε_{2}$, and $a$ represent positive constants, where $k_{0}$ is the base gain used to overcome inherent system uncertainties and disturbances, and $k_{1} {(|s| + ε)}^{k |s| - 1}$ is the exponential adaptive term that provides nonlinear gain adjustment. When the system’s state deviates significantly from the sliding surface, the gain increases rapidly to accelerate convergence. As the system approaches the sliding surface, the gain decreases to suppress chattering. $k_{2} (|s| / (a + |s|))$ is the saturation adaptive term, which accelerates convergence when the system is far from the equilibrium point and decelerates convergence near the equilibrium point to suppress chattering.

In Equation (30), $σ (s)$ represents the dynamic weighting factor:

σ (s) = \frac{1}{1 + \exp (- α (|s| - ρ))} \cdot e^{- ξ t}

(32)

where $α$, $ρ$ and $ξ$ are positive constants, and $1 / [1 + \exp (- α (|s| - ρ))]$ represents the error-driving term. When $|s| ≫ ρ$, this term approaches 1, enabling rapid system convergence. When $|s| ≪ ρ$, this term approaches 0, suppressing system chattering. The term $e^{- ξ t}$ denotes the time decay term, which gradually attenuates over time to enable a smoothing mode.

As shown in Figure 7, compared with the traditional exponential reaching law, the adaptive reaching law proposed in this paper exhibits a faster convergence rate and shorter dynamic response time. Furthermore, the chattering phenomenon is significantly suppressed, indicating that this method can effectively improve transient performance and enhance control smoothness.
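The composite reaching law of Equations (29) to (32) can be sketched numerically as shown below. The function names, the forward-Euler integration, and all parameter values are hypothetical choices for illustration; in the paper these parameters are tuned by the TD3 agent, and the sketch does not reproduce Figure 7.

```python
import math


def sigma_weight(s, t, alpha=5.0, rho=0.5, xi=0.5):
    """Dynamic weighting factor, Eq. (32)."""
    return math.exp(-xi * t) / (1.0 + math.exp(-alpha * (abs(s) - rho)))


def phi(s, t, eps1=0.1, eps2=0.05):
    """Smooth switching function, Eq. (30)."""
    w = sigma_weight(s, t)
    return (1.0 - w) * math.tanh(s / eps1) + w * s / (eps2 + abs(s))


def adaptive_gain(s, k=0.5, k0=1.0, k1=1.0, k2=2.0, eps=0.1, a=1.0):
    """Adaptive gain k(s), Eq. (31)."""
    abs_s = abs(s)
    return (k0
            + k1 * (abs_s + eps) ** (k * abs_s - 1.0)
            + k2 * abs_s / (a + abs_s))


def simulate(s0, dt=1e-3, t_end=5.0):
    """Integrate s_dot = -k(s) * phi(s), Eq. (29), with forward Euler."""
    s, t, history = s0, 0.0, [s0]
    while t < t_end:
        s += dt * (-adaptive_gain(s) * phi(s, t))
        t += dt
        history.append(s)
    return history
```

Because both switching terms in `phi` are smooth and share the sign of `s`, the trajectory returned by `simulate(2.0)` approaches zero without the sign flipping that a discontinuous `sgn(s)` term would produce at a finite step size.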

(1): Proof of stability:

Select the Lyapunov function:

V_{1} = \frac{1}{2} s^{2}

(33)

\begin{aligned} {\dot{V}}_{1} &= s \dot{s} \\ &= - k (s) \cdot s \cdot ϕ (s) \end{aligned}

(34)

According to the Lyapunov stability criterion, if ${\dot{V}}_{1} \leq 0$, the system is stable. From Equation (32), we know that $0 < σ (s) < 1$, so $ϕ (s)$ in Equation (30) is a convex combination of $\tanh (s / ε_{1})$ and $s / (ε_{2} + |s|)$, both of which have the same sign as $s$. When $s > 0$, $ϕ (s) > 0$, and when $s < 0$, $ϕ (s) < 0$; therefore, $s \cdot ϕ (s) > 0$ for all $s \neq 0$. Furthermore, from Equation (31), we know that $k (s) > 0$; thus, the following holds true:

\begin{aligned} {\dot{V}}_{1} &= s \dot{s} \\ &= - k (s) \cdot s \cdot ϕ (s) \leq 0 \end{aligned}

(35)

Therefore, based on the Lyapunov stability criterion, the system is stable.

The additional yaw moment $Δ M$ is obtained by combining Equations (22), (28) and (29):

\begin{aligned} Δ M = I_{z} \Big[ & \frac{- k (s) \cdot ϕ (s) - c_{1} {\dot{e}}_{β} - k_{β} e_{β} - k_{ω} e_{ω}}{c_{2}} + {\dot{ω}}_{d} \\ & - \Big( \frac{a K_{f} - b K_{r}}{I_{z}} β + \frac{a^{2} K_{f} + b^{2} K_{r}}{I_{z} v_{x}} ω - \frac{a K_{f}}{I_{z}} δ_{f} \Big) \Big] \end{aligned}

(36)
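The control law of Equation (36) can be sketched as follows. The function names and argument layout are illustrative assumptions. The reaching-law value $- k (s) ϕ (s)$ is passed in as a single number (`reach`), the derivatives of the reference signals default to zero, and disturbances are neglected.

```python
def model_rates(beta, omega, delta_f, v_x, m, I_z, a, b, K_f, K_r):
    """Right-hand sides of Eq. (22) without the additional yaw moment:
    returns (f_beta, f_omega)."""
    f_beta = ((K_f + K_r) / (m * v_x) * beta
              + ((a * K_f - b * K_r) / (m * v_x ** 2) - 1.0) * omega
              - K_f / (m * v_x) * delta_f)
    f_omega = ((a * K_f - b * K_r) / I_z * beta
               + (a ** 2 * K_f + b ** 2 * K_r) / (I_z * v_x) * omega
               - a * K_f / I_z * delta_f)
    return f_beta, f_omega


def additional_yaw_moment(reach, e_beta, e_omega, f_beta, f_omega, I_z,
                          c1, c2, k_beta, k_omega,
                          beta_d_dot=0.0, omega_d_dot=0.0):
    """Additional yaw moment of Eq. (36).

    reach is the reaching-law value -k(s) * phi(s) from Eq. (29).
    """
    e_beta_dot = f_beta - beta_d_dot   # sideslip error rate, no disturbance
    return I_z * ((reach - c1 * e_beta_dot
                   - k_beta * e_beta - k_omega * e_omega) / c2
                  + omega_d_dot - f_omega)
```

Substituting the returned moment back into the yaw equation of Equation (22) and differentiating the sliding surface of Equation (28) recovers $\dot{s} = - k (s) ϕ (s)$, which is a convenient consistency check when implementing the controller.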

Define the system tracking error state vector as $e (t) = {[\begin{matrix} e_{β} (t) & e_{ω} (t) \end{matrix}]}^{T}$, where $e_{β} = β - β_{d}$ and $e_{ω} = ω - ω_{d}$. Combining this with Equation (22), the open-loop error dynamics equation including disturbances can be expressed as follows:

\begin{cases} {\dot{e}}_{β} = f_{β} (β, ω, δ_{f}) - {\dot{β}}_{d} + d_{β} (t) \\ {\dot{e}}_{ω} = f_{ω} (β, ω, δ_{f}) - {\dot{ω}}_{d} + \frac{Δ M}{I_{z}} + d_{ω} (t) \end{cases}

(37)

Taking the derivative of Equation (28) yields the following:

\dot{s} = c_{1} {\dot{e}}_{β} + c_{2} {\dot{e}}_{ω} + k_{β} e_{β} + k_{ω} e_{ω}

(38)

Substitute Equation (36) into Equation (37):

{\dot{e}}_{ω} = f_{ω} - {\dot{ω}}_{d} + [\frac{- k (s) ϕ (s) - c_{1} {\dot{e}}_{β} - k_{β} e_{β} - k_{ω} e_{ω}}{c_{2}} + {\dot{ω}}_{d} - f_{ω}] + d_{ω} (t)

(39)

This can be simplified to the following:

{\dot{e}}_{ω} = - \frac{c_{1}}{c_{2}} {\dot{e}}_{β} - \frac{k_{β}}{c_{2}} e_{β} - \frac{k_{ω}}{c_{2}} e_{ω} - \frac{k (s)}{c_{2}} ϕ (s) + d_{ω} (t)

(40)

This yields the following complete closed-loop error dynamics matrix equation:

[\begin{matrix} {\dot{e}}_{β} \\ {\dot{e}}_{ω} \end{matrix}] = A_{c} [\begin{matrix} e_{β} \\ e_{ω} \end{matrix}] + [\begin{matrix} 0 \\ - \frac{k (s)}{c_{2}} ϕ (s) \end{matrix}] + [\begin{matrix} d_{β} (t) \\ d_{ω} (t) - \frac{c_{1}}{c_{2}} d_{β} (t) \end{matrix}]

(41)

where

A_{c}

=

[\begin{matrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{matrix}]

,

A_{11} = \frac{K_{f} + K_{r}}{m v_{x}}

,

A_{12} = \frac{a K_{f} - b K_{r}}{m v_{x}^{2}} - 1

,

A_{21} = [- \frac{1}{c_{2}} (c_{1} \frac{K_{f} + K_{r}}{m v_{x}} + k_{β})]

, and

A_{22} = [- \frac{1}{c_{2}} (c_{1} (\frac{a K_{f} - b K_{r}}{m v_{x}^{2}} - 1) + k_{ω})]

(2): Proof of Reachability:

In real-world physical systems, two-degree-of-freedom vehicle models must account for external disturbances and parameter uncertainties. Let the derivative of the actual sliding surface be

\dot{s} = - k (s) \cdot ϕ (s) + d (t),

(42)

where d(t) is the total disturbance term, and

|d (t)| \leq D

.

Define a thin sliding mode boundary layer $Ω = \{s \mid |s| \leq Δ\}$, where $Δ > 0$.

When the system’s state lies outside the boundary layer, i.e., $|s| > Δ$, the properties of the $\tanh (x)$ function guarantee that there exists a constant $δ \in (0, 1)$ such that the following inequality holds:

|ϕ (s)| \geq δ, \forall |s| > ∆

(43)

Furthermore, as shown above, s and ϕ(s) always have the same sign; therefore, when |s| > ∆, the following holds:

s \cdot ϕ (s) = |s| |ϕ (s)| \geq δ |s|

(44)

Substitute Equation (42) into Equation (34):

\dot{V_{1}} = s \dot{s} = s [- k (s) \cdot ϕ (s) + d (t)]

(45)

\dot{V_{1}} = - k (s) \cdot s \cdot ϕ (s) + s \cdot d (t)

(46)

Substitute the perturbation upper bound

|d (t)| \leq D

:

\dot{V_{1}} \leq - k (s) \cdot |s| \cdot |ϕ (s)| + |s| \cdot |d (t)|

(47)

\dot{V_{1}} \leq - k (s) \cdot δ \cdot |s| + D \cdot |s| = - (k (s) δ - D) |s|

(48)

From Equation (31), it follows that $k (s) \geq k_{0}$; therefore, the basic gain $k_{0}$ must satisfy the following condition:

k_{0} δ - D \geq η > 0

(49)

where $η$ is a positive constant. Substituting this condition into Equation (48) gives the following:

{\dot{V}}_{1} \leq - η |s|

(50)

Since

|s| = \sqrt{2 V_{1}}

, the above equation can be rewritten as follows:

{\dot{V}}_{1} \leq - η \sqrt{2 V_{1}}

(51)
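Inequality (51) implies that the boundary layer is reached in finite time. As a worked step (a standard comparison argument, assuming condition (49) holds for all $|s| > Δ$), dividing (51) by $2 \sqrt{V_{1}}$ and integrating gives an upper bound on the reaching time $t_{r}$:

```latex
\frac{d}{dt} \sqrt{V_{1}} = \frac{{\dot{V}}_{1}}{2 \sqrt{V_{1}}} \leq - \frac{η}{\sqrt{2}}
\quad \Longrightarrow \quad
|s (t)| = \sqrt{2 V_{1} (t)} \leq |s (0)| - η t
\quad \Longrightarrow \quad
t_{r} \leq \frac{|s (0)| - Δ}{η}
```

The bound applies only while the state is outside the layer; once $|s| \leq Δ$, the argument above no longer guarantees further decrease.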

(3): Proof of System State Boundedness:

\dot{β} = A_{11} β + A_{12} ω + B_{1} δ_{f}

(52)

where

A_{11} = \frac{K_{f} + K_{r}}{m v_{x}}

,

A_{12} = \frac{a K_{f} - b K_{r}}{m v_{x}^{2}} - 1

, and

B_{1} = - \frac{K_{f}}{m v_{x}}

.

Substituting the tracking errors

e_{β} = β - β_{d}

and

e_{ω} = ω - ω_{d}

yields the following equation:

{\dot{e}}_{β} = A_{11} e_{β} + A_{12} e_{ω} + D_{β} (t)

(53)

where

D_{β} (t) = A_{11} β_{d} + A_{12} ω_{d} + B_{1} δ_{f} - {\dot{β}}_{d}

From Equations (26) and (27), it follows that

β_{d}

and

ω_{d}

are strictly constrained by the road surface coefficient of friction

μ

, and the front-wheel steering angle

δ_{f}

is determined by the driver’s physical input; therefore,

D_{β} (t)

must be bounded (i.e., there exists a constant

D_{m} > 0

such that

|D_{β} (t)| \leq D_{m}

).

Compute the time derivative of the designed integral sliding surface (Equation (28)):

\dot{s} = c_{1} {\dot{e}}_{β} + c_{2} {\dot{e}}_{ω} + k_{β} e_{β} + k_{ω} e_{ω}

(54)

Since

Δ M

appears only in

{\dot{e}}_{ω}

as a control input, we rearrange the terms to express the controlled term

{\dot{e}}_{ω}

in terms of the uncontrolled term and other bounded variables:

{\dot{e}}_{ω} = - \frac{c_{1}}{c_{2}} {\dot{e}}_{β} - \frac{k_{β}}{c_{2}} e_{β} - \frac{k_{ω}}{c_{2}} e_{ω} + \frac{1}{c_{2}} \dot{s}

(55)

Substituting Equation (53) into the above equation yields the following:

{\dot{e}}_{ω} = - (\frac{c_{1} A_{11} + k_{β}}{c_{2}}) e_{β} - (\frac{c_{1} A_{12} + k_{ω}}{c_{2}}) e_{ω} - \frac{c_{1}}{c_{2}} D_{β} (t) + \frac{1}{c_{2}} \dot{s}

(56)

Combine Equations (53) and (56) and write them in standard state-space matrix form:

[\begin{matrix} {\dot{e}}_{β} \\ {\dot{e}}_{ω} \end{matrix}] = [\begin{matrix} A_{11} & A_{12} \\ - \frac{c_{1} A_{11} + k_{β}}{c_{2}} & - \frac{c_{1} A_{12} + k_{ω}}{c_{2}} \end{matrix}] [\begin{matrix} e_{β} \\ e_{ω} \end{matrix}] + [\begin{matrix} 1 & 0 \\ - \frac{c_{1}}{c_{2}} & \frac{1}{c_{2}} \end{matrix}] [\begin{matrix} D_{β} (t) \\ \dot{s} (t) \end{matrix}]

(57)

{[\begin{matrix} {\dot{e}}_{β} & {\dot{e}}_{ω} \end{matrix}]}^{T} = A_{err} {[\begin{matrix} e_{β} & e_{ω} \end{matrix}]}^{T} + B_{err} {[\begin{matrix} D_{β} (t) & \dot{s} (t) \end{matrix}]}^{T}

(58)

The above equations constitute a zero-error dynamic model of the system constrained near the sliding surface. During controller parameter design, by appropriately setting the positive constants $c_{1}$, $c_{2}$, $k_{β}$ and $k_{ω}$ that appear in Equation (57), the error system matrix $A_{err}$ is ensured to satisfy the Hurwitz stability criterion.

As shown in Equations (33)–(35), the sliding mode variables remain within a finite boundary layer; therefore, the external excitation vector

{[\begin{matrix} D_{β} (t) & \dot{s} (t) \end{matrix}]}^{T}

is strictly bounded. According to the Input-to-State Stability (ISS) theorem in nonlinear control theory, the states of a strictly Hurwitz-stable linear system are necessarily bounded when driven by a bounded external input.

Therefore, the tracking error vector

{[\begin{matrix} e_{β} & e_{ω} \end{matrix}]}^{T}

is bounded. Meanwhile, Equations (26) and (27) show that the reference trajectory is bounded by physical constraints. According to the principle of linear superposition for bounded signals, the actual physical state variables—

β

and

ω

—in the closed-loop system are both bounded. Thus, while ensuring error convergence, this controller fundamentally guarantees the safety and stability of the vehicle’s chassis dynamics.
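Whether a given parameter set actually makes the error matrix of Equation (57) Hurwitz can be checked numerically. The sketch below takes m, a and b from Table A3 and a speed of 60 km/h; the cornering stiffness values (taken as negative, which is an assumption about the sign convention) and the sliding surface constants are hypothetical placeholders.

```python
import cmath

def error_matrix(m, a, b, vx, Kf, Kr, c1, c2, k_beta, k_omega):
    """A_err of Equation (57), built from the 2-DOF coefficients A11 and A12."""
    A11 = (Kf + Kr) / (m * vx)
    A12 = (a * Kf - b * Kr) / (m * vx ** 2) - 1.0
    return [[A11, A12],
            [-(c1 * A11 + k_beta) / c2, -(c1 * A12 + k_omega) / c2]]

def eigenvalues_2x2(A):
    """Closed-form eigenvalues of a 2x2 matrix from its trace and determinant."""
    trace = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0

def is_hurwitz(A):
    """True if every eigenvalue has a strictly negative real part."""
    return all(lam.real < 0.0 for lam in eigenvalues_2x2(A))

# m, a, b from Table A3; Kf, Kr, c1, c2, k_beta, k_omega are hypothetical.
vehicle = dict(m=1895.0, a=1.15, b=1.52, vx=60.0 / 3.6, Kf=-1.0e5, Kr=-1.0e5)
A_good = error_matrix(**vehicle, c1=1.0, c2=1.0, k_beta=2.0, k_omega=5.0)
A_bad = error_matrix(**vehicle, c1=1.0, c2=1.0, k_beta=50.0, k_omega=1.0)
stable_good = is_hurwitz(A_good)
stable_bad = is_hurwitz(A_bad)
```

For a 2x2 matrix the check reduces to a negative trace and a positive determinant; here the determinant works out to $(A_{12} k_{β} - A_{11} k_{ω}) / c_{2}$, so positive constants alone are not sufficient and the second placeholder set fails the test.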

4.3. Parameter Optimization Based on the TD3 Reinforcement Learning Algorithm

TD3 is a model-free reinforcement learning (RL) method. It learns optimal policies through direct interaction with the environment (input states, output actions/parameters). Consequently, it can learn directly from data how to tune controller parameters for an uncertain or complex system without requiring knowledge of the system’s exact differential equations. This is of significant value for real-world systems where accurate modeling is notoriously challenging.

Addressing the challenges posed by strong nonlinearity, parameter perturbations, and position disturbances in electro-hydraulic servo systems—challenges that traditional control algorithms struggle to resolve—Reference [30] proposed an intelligent composite control strategy that integrates the TD3 deep reinforcement learning algorithm with adaptive fractional-order sliding mode control. A fractional-order sliding surface was designed, along with a reaching law function based on the sliding surface. Adaptive control laws were designed based on Lyapunov stability theory, and the gain parameters for sliding mode switching were optimized online using the TD3 deep reinforcement learning algorithm. Furthermore, in response to the issues of excessive reliance on manual experience, low efficiency, and cumbersome procedures in the selection of PID controller parameters, Reference [31] proposed a PID parameter optimization method based on the TD3 algorithm.

The adaptive variable-gain sliding mode controller involves numerous parameters, and the choice of sliding surface coefficients and controller gains typically relies on prior knowledge of the system’s dynamic model. For complex nonlinear, time-varying, or partially modeled systems, manual or model-based tuning of these parameters is extremely challenging, often requiring a difficult trade-off between robustness and chattering. Relying solely on manual adjustment results in a heavy workload and often leads to suboptimal performance. To address these limitations, this paper incorporates the TD3 algorithm from reinforcement learning to achieve autonomous parameter optimization [32,33].

TD3 (Twin Delayed Deep Deterministic Policy Gradient) is an enhanced actor–critic algorithm developed to address the limitations of DDPG. Its three core mechanisms (twin critics, delayed policy updates, and target policy smoothing) are designed to suppress Q-value overestimation, stabilize the training process, and enhance policy generalization. These mechanisms mitigate common issues encountered in DDPG during continuous control tasks, such as training oscillations, premature convergence to local optima, and potential divergence. The specific implementation details are as follows:

(1): Twin Critics: Mitigating Overestimation of Q-values at the Source

When DDPG uses a single Q-network, environmental noise, sample bias, and function approximation errors cause the Q-values to systematically overestimate the true action values (i.e., they treat certain actions as better than they actually are). This overestimation is propagated through gradients to the policy network, causing the policy to continuously optimize toward “overestimated bad actions,” ultimately converging to a local optimum or even diverging.

The core principle of TD3 is to simultaneously train two independent value networks (Q₁, Q₂), which share the same policy network but do not interfere with one another. Meanwhile, two target value networks (Q₁′, Q₂′) are also maintained. When calculating the target Q-value, the minimum of the outputs from the two target Q-networks is taken:

y_{t} = r_{t} + γ \min_{i = 1, 2} {Q_{i}}^{'} (s_{t + 1}, a_{t + 1} |{θ_{Q i}}^{'})

(59)

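Substituting this target into the Bellman equation gives the loss function of each critic, i.e., the mean squared temporal-difference error:

L (θ_{Q_{i}}) = \frac{1}{M} \sum_{i = 1}^{M} {(y_{t} - Q_{i} (s_{t}, a_{t} |θ_{Q_{i}}))}^{2}

(60)

Here, M represents the number of samples drawn from the experience replay buffer in each mini-batch, and $θ_{Q_{i}}$ represents the parameters of the i-th critic network; the primed parameters ${θ_{Q i}}^{'}$ in Equation (59) belong to the corresponding target critic.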
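The twin-critic target can be sketched in a few lines. The example below assumes scalar Q-values that have already been produced by the two target critics (the numbers are hypothetical), uses the discount factor 0.9 from Table A2, and evaluates the usual mean squared TD error as the critic loss.

```python
def clipped_double_q_target(reward, gamma, q1_next, q2_next):
    """y = r + gamma * min(Q1', Q2'), following Equation (59)."""
    return reward + gamma * min(q1_next, q2_next)

def critic_loss(targets, q_values):
    """Mean squared TD error over a mini-batch of size M."""
    M = len(targets)
    return sum((y - q) ** 2 for y, q in zip(targets, q_values)) / M

# Hypothetical target-critic outputs for a mini-batch of three transitions.
rewards = [-1.0, -0.5, -0.2]
q1_next = [-4.0, -3.0, -2.5]
q2_next = [-3.5, -3.2, -2.0]

targets = [clipped_double_q_target(r, 0.9, q1, q2)
           for r, q1, q2 in zip(rewards, q1_next, q2_next)]
loss_q1 = critic_loss(targets, [-4.5, -3.0, -2.4])
```

Taking the minimum means that an optimistic error in one critic is not propagated into the target unless the other critic shares it, which is the mechanism the text describes for limiting overestimation.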

(2): Delayed policy updates:

In DDPG, the policy network and the Q-network are updated synchronously, so even minor fluctuations in the Q-values are passed directly to the policy network and cause policy oscillations. The TD3 algorithm instead introduces a delayed update strategy: the Q-network is first updated d times so that it settles into a relatively stable state, and only then is the policy network updated. This approach establishes a more stable mapping from the continuous state space to action values. During the parameter iteration process, the algorithm uses backpropagation to optimize the parameters of the actor network, as mathematically expressed below:

\frac{\partial J (θ_{μ})}{\partial θ_{μ}} = \frac{1}{M} \sum_{i = 1}^{M} [\nabla_{a} Q_{i} (s_{t}, a_{t} | θ_{Q_{i}}) |_{a_{t} = μ (s_{t} | θ_{μ})} \nabla_{θ_{μ}} μ (s_{t} | θ_{μ})]

(61)

In the equation,

\partial J (θ_{μ}) / \partial (θ_{μ})

represents the loss gradient;

\nabla

denotes the gradient;

θ_{μ}

denotes the parameters of the actor network; and

μ

denotes the actor network.

At the same time, the target networks are updated as follows:

\begin{cases} {θ_{Q i}}^{'} \leftarrow τ θ_{Q_{i}} + (1 - τ) {θ_{Q i}}^{'} \\ {θ_{μ}}^{'} \leftarrow τ θ_{μ} + (1 - τ) {θ_{μ}}^{'} \end{cases}

(62)

Here,

τ

is the soft update parameter, with a value between 0 and 1.
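The delayed schedule and the soft update of Equation (62) can be sketched as follows. Parameters are represented as plain lists of floats, the delay `policy_delay=2` is a hypothetical choice, and τ = 0.001 is the "target smooth factor" of Table A2, read here as the soft update parameter (an assumption).

```python
def soft_update(target, online, tau):
    """theta' <- tau * theta + (1 - tau) * theta', applied element-wise."""
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online, target)]

def run_schedule(total_steps, policy_delay):
    """Count critic and actor updates under delayed policy updates."""
    critic_updates = actor_updates = 0
    for t in range(1, total_steps + 1):
        critic_updates += 1              # critics are updated at every step
        if t % policy_delay == 0:        # actor and targets every d-th step
            actor_updates += 1
    return critic_updates, actor_updates

theta_target = soft_update(target=[0.0, 1.0], online=[1.0, 3.0], tau=0.001)
counts = run_schedule(total_steps=10, policy_delay=2)
```

With a small τ the target parameters move only a fraction of the way toward the online parameters at each update, so the regression target for the critics changes slowly.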

(3): Target policy smoothing:

In DDPG, the target policy directly outputs deterministic actions, and the Q-values are highly sensitive to even minor changes in these actions. TD3 smooths the target policy by adding normally distributed noise to the target actions, thereby smoothing the value estimates across the action space and improving the policy’s generalization to environmental disturbances and parameter perturbations, as shown below:

\tilde{a} = μ^{'} (s_{t + 1} |{θ_{μ}}^{'}) + ε, \quad ε \sim \mathrm{clip} (N (0, σ), - c, c), \quad c > 0

(63)

Here,

ε

represents random noise following a normal distribution;

σ

represents the variance; and c represents the clipping factor.
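Equation (63) can be sketched with the standard library alone. Two points in the example are assumptions rather than statements from the text: `random.gauss` expects a standard deviation, so `sigma` is treated as one here, and the final clipping of the smoothed action to the action bounds is a common implementation detail that Equation (63) does not show.

```python
import random

def smoothed_target_action(target_action, sigma, c, low, high, rng=random):
    """Add clipped Gaussian noise to the target action, then clip to bounds."""
    noisy = []
    for a, lo, hi in zip(target_action, low, high):
        eps = max(-c, min(c, rng.gauss(0.0, sigma)))   # eps ~ clip(N(0, sigma), -c, c)
        noisy.append(max(lo, min(hi, a + eps)))        # assumed action-bound clipping
    return noisy

rng = random.Random(0)                                 # fixed seed for repeatability
base_action = [0.5, -0.9]                              # hypothetical target-actor output
a_tilde = smoothed_target_action(base_action, sigma=0.2, c=0.5,
                                 low=[-1.0, -1.0], high=[1.0, 1.0], rng=rng)
```

Because the noise is clipped to [-c, c], the smoothed action never moves further than c from the target actor's output, so the target value is averaged only over a small neighborhood of that action.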

For detailed parameter settings, see Appendix A, Table A2; for the pseudocode of the training process, see Appendix B, Algorithm A1. Figure 8 illustrates the structure of the TD3 algorithm. Figure 9 shows the training curve of the TD3 algorithm.

The intelligent agent module encompasses the primary execution process of the decision-making algorithm, responsible for receiving inputs and making decisions. Input state information includes the sideslip angle

β

, yaw rate

ω

, and lateral offset y, as follows:

s_{t} = \{β, ω, y\}

(64)

The action space for the intelligent agent is selected as the sliding surface parameters

C_{1}

,

C_{2}

,

k_{α}

,

k_{β}

and the reaching law parameters

k_{0}

,

k_{1}

,

k_{2}

for the sliding mode controller, expressed as follows:

a_{t} = \{C_{1}, C_{2}, k_{α}, k_{β}, k_{0}, k_{1}, k_{2}\}

(65)

The reward function is crucial for optimizing sliding mode control. To enable the agent to achieve multi-objective cooperative optimization, the following objective rewards are designed:

r = - w_{1} e_{β} - w_{2} e_{ω} - w_{3} e_{y}

(66)

In the equation,

e_{β}

represents the error between the ideal sideslip angle and the actual sideslip angle;

e_{ω}

represents the error between the ideal yaw rate and the actual yaw rate;

e_{y}

represents the error between the ideal lateral offset and the actual lateral offset.

w_{1}

,

w_{2}

, and

w_{3}

are the weighting coefficients for the sideslip angle, yaw rate, and lateral offset errors, respectively.
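The interface between the agent and the controller can be sketched as below. The parameter ranges, the weights, and the assumption that the actor outputs values in [-1, 1] are all hypothetical; the reward also takes absolute values of the errors, which Equation (66) as printed does not show, so that deviations of either sign are penalized.

```python
def scale_action(raw, bounds):
    """Map actor outputs in [-1, 1] to the physical range of each parameter."""
    return {name: lo + 0.5 * (x + 1.0) * (hi - lo)
            for x, (name, (lo, hi)) in zip(raw, bounds.items())}

def reward(e_beta, e_omega, e_y, w=(1.0, 1.0, 1.0)):
    """r = -(w1*|e_beta| + w2*|e_omega| + w3*|e_y|); abs() is an assumption."""
    return -(w[0] * abs(e_beta) + w[1] * abs(e_omega) + w[2] * abs(e_y))

# Hypothetical search ranges for the seven entries of the action in Equation (65).
bounds = {"C1": (0.1, 10.0), "C2": (0.1, 10.0),
          "k_alpha": (0.0, 5.0), "k_beta": (0.0, 5.0),
          "k0": (1.0, 50.0), "k1": (0.0, 20.0), "k2": (0.0, 20.0)}

raw_action = [-1.0, 1.0, 0.0, 0.0, 0.0, -1.0, 1.0]
params = scale_action(raw_action, bounds)
r = reward(e_beta=0.01, e_omega=-0.05, e_y=0.2, w=(10.0, 5.0, 1.0))
```

Bounding each parameter keeps the explored gains inside a range where the controller is expected to remain well behaved; the actual ranges would have to come from the stability conditions derived earlier.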

4.4. Lower-Level Brake Force Distribution Control

To enhance braking stability following a vehicle braking failure, an objective function is established by comprehensively considering the tire adhesion ellipse [34] and the relationship between lateral force and tire utilization. The optimization objectives aim to maximize the wheel braking force and available lateral force while minimizing the yaw moment. Accordingly, the objective function is formulated as follows:

\begin{matrix} \min J = & \min \{[m Z g - (F_{x 1} + F_{x 2}) \cos δ_{f} - F_{x 3} - F_{x 4}] \\ + \frac{c}{2} [(F_{x 1} \cos δ_{f} + F_{x 3}) - (F_{x 2} \cos δ_{f} + F_{x 4})] \\ + \sum_{i = 1, 2, 3, 4} \frac{{F_{x i}}^{2}}{{(μ_{i} F_{z i})}^{2}}\} \end{matrix}

(67)

In the above equation, the first term represents the difference between the target and the actual braking forces; the second term denotes the yaw moment generated by the braking imbalance between the left and right sides of the vehicle; the third term represents the tire utilization rate, accounting for the maximum lateral force. Furthermore, μᵢ denotes the tire–road friction coefficient for each wheel.

The objective function constraints are as follows:

0 \leq F_{x i} \leq \sqrt{{(μ_{i} F_{z i})}^{2} - {(F_{y i})}^{2}}; i = 1, 2, 3, 4

(68)

Δ M = \frac{c}{2} (F_{x 1} - F_{x 2}) + \frac{c}{2} (F_{x 3} - F_{x 4})

(69)

In vehicle dynamics, the contact force between the tires and the ground is constrained by the friction limit. The resultant force, composed of the longitudinal force

F_{x i}

and lateral force

F_{y i}

, is limited by the maximum available friction force, denoted as

μ_{i} F_{z i}

, where μ is the road friction coefficient and

F_{z}

is the vertical load. Therefore, the inequality

F_{x}^{2} + F_{y}^{2} \leq {(μ F_{z})}^{2}

holds. Furthermore, the required yaw moment calculated by the upper-layer controller is distributed among the functional wheels.

Since the objective function and constraints formulated above constitute a quadratic programming problem, the Sequential Quadratic Programming (SQP) algorithm is employed to find the optimal solution.
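Constraints (68) and (69) can be evaluated directly for a candidate force distribution. The sketch below does not implement SQP; it only checks feasibility. The wheel order (1 = left front, 2 = right front, 3 = left rear, 4 = right rear) is inferred from the left/right grouping in Equation (67), and the wheel loads and forces are hypothetical values.

```python
import math

def max_brake_force(mu, Fz, Fy):
    """Upper bound of Equation (68): sqrt((mu*Fz)^2 - Fy^2), floored at zero."""
    return math.sqrt(max((mu * Fz) ** 2 - Fy ** 2, 0.0))

def yaw_moment(Fx, c):
    """Equation (69): left-minus-right braking force times half of c."""
    Fx1, Fx2, Fx3, Fx4 = Fx
    return 0.5 * c * (Fx1 - Fx2) + 0.5 * c * (Fx3 - Fx4)

def is_feasible(Fx, mu, Fz, Fy, c, dM_target, tol=1.0):
    """Check the friction limits and the yaw-moment equality within tol (N*m)."""
    within_limits = all(0.0 <= fx <= max_brake_force(m_i, fz, fy)
                        for fx, m_i, fz, fy in zip(Fx, mu, Fz, Fy))
    return within_limits and abs(yaw_moment(Fx, c) - dM_target) <= tol

# Left front wheel failed (Fx1 = 0); c = 1.56 m from Table A3, the rest assumed.
Fx = (0.0, 1000.0, 2500.0, 1500.0)
mu, Fz, Fy = (0.8,) * 4, (5000.0,) * 4, (0.0,) * 4
limit = max_brake_force(0.8, 5000.0, 2400.0)
ok = is_feasible(Fx, mu, Fz, Fy, c=1.56, dM_target=0.0)
```

In this example the extra braking force on the left rear wheel cancels the yaw moment produced by the remaining front wheel, which is the kind of redistribution the optimizer is expected to find.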

4.5. Lower-Layer Active Steering Control

When a significant braking force is required, vehicle stability cannot be maintained solely through simple adjustment of braking force; active steering control is necessary. This paper employs sliding mode control for steering control [35].

Sliding surface:

s = e_{ω} + \int e_{ω} d τ + k_{4} e_{β}

(70)

where

k_{4}

is the control weight for the sideslip angle.

Take the derivative of the sliding surface:

\begin{matrix} \dot{s} & = {\dot{e}}_{ω} + e_{ω} + k_{4} {\dot{e}}_{β} \\ = {\dot{e}}_{ω} + e_{ω} + k_{4} [\frac{K_{f} + K_{r}}{m v_{x}} β + (\frac{a K_{f} - b K_{r}}{m {v_{x}}^{2}} - 1) ω - \frac{K_{f}}{m v_{x}} δ_{f} - β_{d}] \end{matrix}

(71)

Using the law of exponential convergence:

\dot{s} = - k_{3} s - ε_{3} sgn (s)

(72)

In the formula,

k_{3} > 0

and

ε_{3} > 0

.

Combining Equations (71) and (72), the front wheel steering angle is obtained as follows:

\begin{matrix} δ_{f} = \frac{m v_{x}}{k_{4} \cdot K_{f}} (k_{3} s + ε_{3} sgn (s) + {\dot{e}}_{ω} + e_{ω}) + (1 + \frac{K_{r}}{K_{f}}) β \\ + (\frac{a K_{f} - b K_{r} - m {v_{x}}^{2}}{K_{f} v_{x}}) ω + \frac{m v_{x}}{K_{f}} β_{d} \end{matrix}

(73)

To demonstrate its stability, the Lyapunov function is selected:

V = \frac{1}{2} s^{2}

(74)

\begin{matrix} \dot{V} & = s \dot{s} \\  & = s (- k_{3} s - ε_{3} sgn (s)) \\  & = - k_{3} s^{2} - ε_{3} |s| \end{matrix}

(75)

According to the Lyapunov stability criterion, if

\dot{V} \leq 0

, the system is stable. From Equation (72) and the above analysis, k₃ > 0 and ε₃ > 0, so the following inequality holds true:

\begin{matrix} \dot{V} & = s \dot{s} \\  & = - k_{3} s^{2} - ε_{3} |s| \leq 0 \end{matrix}

(76)

Therefore, based on the Lyapunov stability criterion, the system is stable.
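The exponential reaching law (72) also gives an explicit reaching time. As a worked step (standard for this law, and assuming the law is realized exactly without disturbances), for $s \neq 0$ the magnitude $|s|$ obeys a linear differential equation:

```latex
\frac{d |s|}{dt} = - k_{3} |s| - ε_{3}
\quad \Longrightarrow \quad
|s (t)| = \left( |s (0)| + \frac{ε_{3}}{k_{3}} \right) e^{- k_{3} t} - \frac{ε_{3}}{k_{3}}
\quad \Longrightarrow \quad
t_{r} = \frac{1}{k_{3}} \ln \left( 1 + \frac{k_{3} |s (0)|}{ε_{3}} \right)
```

A larger $ε_{3}$ shortens the reaching time but increases the amplitude of the discontinuous term, which is the usual trade-off between convergence speed and chattering for this law.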

The overall control architecture of this paper is illustrated in Figure 10. The BP neural network identifies the braking intensity Z based on input vehicle speed, brake pedal displacement, and pedal velocity. The additional yaw moment, determined by the sliding mode controller optimized via the TD3 algorithm, is utilized to optimize braking force distribution. This strategy, integrated with active steering control, enables precise regulation of the vehicle’s yaw rate and lateral offset under failure conditions.

5. Results

Based on the analysis in Section 3 regarding the effects of braking intensity and brake system failure locations on vehicle stability, it can be concluded that a failure in the front brake system has a significant impact on vehicle stability during medium-to-high-intensity braking. Furthermore, the EMB system, which features independent control capabilities, exhibits a lower probability of multi-wheel brake failures. Therefore, this paper focuses on a simulation analysis of stability control in the event of a left front brake system failure under medium-to-high braking intensity conditions. The specific settings for the CarSim simulation parameters are shown in Appendix A, Table A3.

To validate the control effectiveness of secondary brake force distribution during a single-wheel brake failure and visually demonstrate the performance of the brake force distribution algorithm, this paper establishes four comparative scenarios:

(1) Normal four-wheel braking (ideal condition).

(2) Left front single-wheel brake failure with the proposed strategy: the upper layer employs TD3 adaptive sliding mode control (TD3-ASMC) to compute the yaw moment, and the lower-layer controller uses an optimization algorithm that takes into account braking force loss, yaw moment, and tire utilization to distribute braking force.

(3) Left front single-wheel brake failure with the baseline strategy: the upper layer employs standard SMC to calculate the yaw moment required for tracking sideslip angle and yaw rate, dynamically adjusting control weights, and the lower layer uses an optimization algorithm that takes tire utilization into account to distribute braking force.

(4) Left front single-wheel brake failure with no control.

5.1. Comparison of Controller Outputs

To visually demonstrate the effectiveness of the proposed TD3-ASMC strategy in suppressing sliding mode oscillations, Figure 11 shows the time response curves of the additional yaw moment

M_{z}

(i.e., the control input signal) directly generated by the upper-layer controllers of the standard SMC and TD3-ASMC under straight-line and cornering conditions. Overall, both controllers respond quickly to yaw moment demands, but they exhibit significant differences in the high-frequency dynamic characteristics of the signals.

Examining the magnified section in Figure 11 reveals that the control output of the traditional SMC exhibits severe high-frequency, high-amplitude sawtooth-like chattering because it uses a discontinuous sign function sgn(s). In contrast, the control signal generated by TD3-ASMC exhibits significantly reduced high-frequency oscillation amplitude and is smoother. This indicates that the dynamic weighting smoothing reaching law introduced in the theoretical section of this paper, along with the reinforcement learning-based adaptive gain adjustment mechanism, has been successfully implemented, effectively softening the abrupt switching process of the control variable.

5.2. Straight-Line Braking Conditions

Figure 12 shows simulation results for a left front wheel brake failure on straight road surfaces with friction coefficients of μ = 0.2 and 0.8. When the braking intensity is Z = 0.5, compared with the traditional SMC algorithm, the maximum sideslip angle of the vehicle controlled by the proposed algorithm decreased from 0.45° to 0.39°. The maximum yaw rate decreased from 4.48°/s to 3.92°/s, exhibiting a faster response. Meanwhile, the vehicle’s lateral offset decreased from 1.2 m to 0.96 m, remaining within the safety range (< 1.5 m).

As shown in Figure 13, when the friction coefficient is μ = 0.8, the maximum sideslip angle decreased from 0.027° to 0.019°, and the maximum yaw rate decreased from 1.06°/s to 1.01°/s, with a reduced fluctuation amplitude. The lateral offset also decreased from 0.169 m to 0.092 m.

Figure 14 and Figure 15 show the results of a braking maneuver with an intensity of Z = 0.8, performed at an initial speed of 60 km/h on a straight road surface with friction coefficients of 0.2 and 0.8, respectively, when the left front wheel fails. When the friction coefficient is 0.2, compared with the traditional algorithm, the maximum sideslip angle of the vehicle controlled by the proposed algorithm decreased from 0.77° to 0.67°, the maximum yaw rate decreased from 6.7°/s to 6.1°/s, and the lateral offset decreased from 1.16 m to 0.67 m. When the road friction coefficient is 0.8, the maximum sideslip angle decreased from 0.042° to 0.027°, the maximum yaw rate decreased from 1.49°/s to 1.37°/s, and the lateral offset decreased from 0.15 m to 0.08 m.

In summary, when the left front wheel brake fails while driving on a straight road surface, the control strategy proposed in this paper outperforms traditional control algorithms in terms of sideslip angle, yaw rate, and lateral offset. It effectively prevents loss of control caused by unbalanced braking forces during brake failure and successfully suppresses vehicle lateral offset under such events.

5.3. Corner Braking Conditions

Figure 16 shows simulation results for left front wheel brake failure on curved road surfaces with friction coefficients of μ = 0.2 and 0.8. When Z = 0.5, compared to the traditional SMC algorithm, the maximum sideslip angle of the vehicle controlled by the proposed algorithm decreased from 3.626° to 2.457°, with significantly improved control response speed. Although the maximum yaw rate increased from 7.872°/s to 8.154°/s, the overall yaw rate converged more quickly and exhibited a marked decrease in steady-state fluctuation. Meanwhile, the lateral offset of the vehicle decreased from 3.33 m to 2.5 m.

As shown in Figure 17, at a road friction coefficient of μ = 0.8, the proposed control algorithm reduced the vehicle’s maximum sideslip angle from 0.976° to 0.474°, the maximum yaw rate from 3.72°/s to 3.04°/s, and the lateral offset from 1.647 m to 1.373 m.

In summary, in the event of a left front brake system failure during cornering, the proposed control strategy outperforms traditional algorithms in terms of sideslip angle, yaw rate, and lateral offset. For the specific improvement details, please refer to Table 2. It effectively avoids the loss of control induced by unbalanced braking forces and successfully suppresses the vehicle’s lateral deviation under these critical conditions.
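The relative improvements follow from the reported peak values by simple arithmetic. The sketch below computes the percentage reduction for three of the value pairs quoted in Sections 5.2 and 5.3; the choice of pairs and the function name are illustrative only.

```python
def relative_reduction(baseline, proposed):
    """Percentage reduction of `proposed` relative to `baseline`."""
    return 100.0 * (baseline - proposed) / baseline

# (standard SMC value, TD3-ASMC value) as quoted in the text
cases = {
    "sideslip angle, cornering, mu = 0.8 [deg]": (0.976, 0.474),
    "lateral offset, straight, Z = 0.5, mu = 0.8 [m]": (0.169, 0.092),
    "yaw rate, cornering, mu = 0.8 [deg/s]": (3.72, 3.04),
}

reductions = {name: round(relative_reduction(*pair), 1)
              for name, pair in cases.items()}
```

The first two pairs give 51.4% and 45.6%, which appear to correspond to the sideslip angle and lateral offset figures quoted in the abstract; the third pair gives 18.3%.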

6. Conclusions

To address single-wheel brake failure on straight roads and curves, a hierarchical control strategy was designed considering factors such as road friction coefficient, braking intensity, and vehicle speed. The upper layer calculates the required yaw moment, while the lower layer optimizes braking force distribution among the functioning wheels. This effectively suppresses lateral offset and, in conjunction with active steering control, ensures that the vehicle can travel stably along the predetermined path. Simulation results indicate that the proposed control strategy significantly improves vehicle stability under extreme conditions, providing a new technical reference for enhancing active vehicle safety.

However, this study presents several limitations that should be acknowledged. First, the current simulation environment is relatively idealized and lacks consideration of real-world external disturbances. Second, although theoretical proofs and comparative simulation results have been provided for the proposed controller, a comprehensive robustness analysis has yet to be conducted. Furthermore, the scope of the selected driving scenarios is limited. The current evaluation solely addresses left front and left rear braking system failures during both straight-line driving and small-curvature cornering under varying speeds and adhesion coefficients. Consequently, vehicle stability under daily maneuvers (e.g., right-angle turns) and extreme conditions (e.g., fishhook test) requires further exploration. Moreover, complex road conditions, including non-uniform adhesion coefficients across all four wheels (e.g., split-friction surfaces) and sudden changes in road friction, should also be considered.

Future research will focus on real-vehicle testing to incorporate actual physical disturbances. We also plan to evaluate the algorithm’s control performance under simultaneous failures of dual- or multi-wheel braking systems at various positions, as well as under demanding driving conditions like right-angle turns and uneven road adhesion. Finally, a detailed robustness analysis will be performed to further demonstrate the reliability of the proposed control strategy.

Author Contributions

Conceptualization, R.W. and F.W.; methodology, F.W.; software, Z.C.; validation, Z.C., D.S. and W.L.; formal analysis, R.W.; investigation, Z.C.; resources, R.W.; data curation, R.D.; writing original draft preparation, F.W.; writing review and editing, R.W. and R.D.; visualization, W.L.; supervision, D.S.; project administration, R.W.; funding acquisition, D.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key Research and Development Program of China, grant number 2023YF2504500.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. BP parameter settings.

Item	Value
Number of input layers	3
Number of output layers	1
Number of hidden layers	5
Hidden layer activation function	tanh(x)
Output layer activation function	Leaky-ReLU
Learning rate	0.001
Optimizer	Adam
Loss function	MSE
Mini-batch size	64
Max episodes	1000
Preprocessing method	Normalization

Table A2. TD3 parameter settings.

Hyperparameters
Item	Value
Critic learning rate	$1 \times 10^{- 3}$
Actor learning rate	$1 \times 10^{- 4}$
Target network update frequency	1
Sample time	0.01
Target smooth factor	$1 \times 10^{- 3}$
Experience buffer size	$1 \times 10^{6}$
Discount factor	0.9
Mini-batch size	64
Max episodes	500

Table A3. Configuration of co-simulation parameters for CarSim and Simulink.

Parameters	Symbol / Unit	Value
Vehicle mass	$m$ / kg	1895
Vehicle moment of inertia	$I_{z}$ / (kg·m²)	2031.4
Track width	$c$ / m	1.56
Distance from the front axle to the center of mass	$a$ / m	1.15
Distance from the rear axle to the center of mass	$b$ / m	1.52
Center of mass height	$h_{g}$ / m	0.72
Front-wheel moment of inertia	$I_{f}$ / (kg·m²)	1.9
Rear-wheel moment of inertia	$I_{r}$ / (kg·m²)	1.9
Unsprung mass of the front axle	$m_{f}$ / kg	150
Unsprung mass of the rear axle	$m_{r}$ / kg	118

Table A4. Terminology, units, and representations in the text.

Symbol	Unit	Physical Meaning
$v_{x}$	m/s	the longitudinal velocity of the vehicle
$v_{y}$	m/s	the lateral velocity of the vehicle
$δ_{f}$	rad	the front wheel steering angle
$a$	m	distance from the front axle to the center of mass
$b$	m	distance from the rear axle to the center of mass
$c$	m	track width
$I_{z}$	kg·m²	the vehicle moment of inertia about the z-axis
$T_{f}$	N·m	the longitudinal component of the vehicle force acting on the center-of-mass
$F_{x 1}, F_{x 2}, F_{x 3}, F_{x 4}$	N	the longitudinal forces acting on each of the four wheels
$F_{y 1}, F_{y 2}, F_{y 3}, F_{y 4}$	N	the lateral forces acting on each of the four wheels
$M_{z}$	N·m	yaw moment of the vehicle
$M_{z x}$	N·m	yaw moment caused by longitudinal forces
$M_{z y}$	N·m	yaw moment caused by lateral forces
$ω$	rad/s	yaw rate
$ω_{d}$	rad/s	desired yaw rate
$K_{f}, K_{r}$	N/rad	the lateral stiffness of the front and rear axles
$β$	rad	the sideslip angle at the vehicle’s center of mass
$β_{d}$	rad	desired sideslip angle at the vehicle’s center of mass
$F_{z 1}, F_{z 2}, F_{z 3}, F_{z 4}$	N	the vertical load on each of the four wheels
$a_{x}$	m/s²	the acceleration of the vehicle in the x direction
$a_{y}$	m/s²	the acceleration of the vehicle in the y direction
$m$	kg	vehicle mass
$L$	m	wheelbase

Appendix B

Algorithm A1. Pseudocode for TD3 algorithm

Pseudocode of the TD3-optimized sliding mode control algorithm (for left front wheel brake failure)

1. Set the initial weights $θ_{Q_{1}}$, $θ_{Q_{2}}$ for the critic networks $Q_{1}$, $Q_{2}$ and $θ_{μ}$ for the actor network $μ$

2. Set the initial parameters for the target critic and target actor networks: ${θ_{Q_{1}}}^{'} \leftarrow θ_{Q_{1}}$, ${θ_{Q_{2}}}^{'} \leftarrow θ_{Q_{2}}$ and ${θ_{μ}}^{'} \leftarrow θ_{μ}$

3. for episode = 1 to number of iterations N do

4. Initialize replay buffer R and get the initial state $s_{t} = \{β, ω, y\}$

5. Initialize the actor network update frequency d and the time length of the driving cycle T

6. for $t = 1 : T$ do

7. Select and execute the action $a_{t} = μ (s_{t} | θ_{μ}) + \mathcal{N}_{t}$, where $\mathcal{N}_{t}$ is exploration noise

8. Obtain reward $r_{t}$ and new state $s_{t + 1}$

9. Store transition $\{s_{t}, a_{t}, r_{t}, s_{t + 1}\}$ in R and sample M data points $\{s_{i}, a_{i}, r_{i}, s_{i + 1}\}$, $i = 1, 2, 3, \dots, M$, from R

10. Calculate the expected return of the action using the critic target networks:

11. $\tilde{a}_{i} = μ^{'} (s_{i + 1} | {θ_{μ}}^{'}) + ε$, $ε \sim \mathrm{clip} (\mathcal{N} (0, σ), - c, c)$, $c > 0$

12. $y_{i} = r_{i} + γ \min_{j = 1, 2} Q_{j}^{'} (s_{i + 1}, \tilde{a}_{i} | {θ_{Q_{j}}}^{'})$

13. Update the critic network parameters based on the following loss function:

14. $L (θ_{Q_{j}}) = \frac{1}{M} \sum_{i = 1}^{M} (y_{i} - Q_{j} (s_{i}, a_{i} | θ_{Q_{j}}))^{2}$, $j = 1, 2$

15. if t mod d = 0 then

16. Update the actor network parameters using the policy gradient:

17. $\frac{\partial J (θ_{μ})}{\partial θ_{μ}} = \frac{1}{M} \sum_{i = 1}^{M} [\nabla_{a} Q_{1} (s_{i}, a | θ_{Q_{1}}) |_{a = μ (s_{i} | θ_{μ})} \nabla_{θ_{μ}} μ (s_{i} | θ_{μ})]$

18. Update the target actor and critic network parameters:

19. ${θ_{Q_{j}}}^{'} = τ θ_{Q_{j}} + (1 - τ) {θ_{Q_{j}}}^{'}$, $j = 1, 2$; ${θ_{μ}}^{'} = τ θ_{μ} + (1 - τ) {θ_{μ}}^{'}$

20. end if

21. end for

22. end for
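The update rules in steps 11 to 19 can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy linear actor, and the toy quadratic critics are all hypothetical stand-ins for the neural networks, and the gradient steps themselves (steps 14 and 17) are not performed, only the quantities they operate on.

```python
import random


def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def smoothed_target_action(target_actor, next_state, sigma, c, rng):
    """Step 11: target-actor output plus Gaussian noise clipped to [-c, c]."""
    eps = clip(rng.gauss(0.0, sigma), -c, c)
    return target_actor(next_state) + eps


def td_target(reward, next_state, next_action, target_q1, target_q2, gamma):
    """Step 12: clipped double-Q target, using the smaller target-critic value."""
    q_min = min(target_q1(next_state, next_action),
                target_q2(next_state, next_action))
    return reward + gamma * q_min


def critic_loss(targets, q_values):
    """Step 14: mean squared error between targets and critic estimates."""
    return sum((y - q) ** 2 for y, q in zip(targets, q_values)) / len(targets)


def soft_update(target_params, online_params, tau):
    """Step 19: theta' <- tau * theta + (1 - tau) * theta'."""
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(online_params, target_params)]


def actor_update_due(t, d):
    """Step 15: the actor and the targets are updated every d-th time step."""
    return t % d == 0


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical stand-ins; the state is (beta, omega, y) as in step 4.
    target_actor = lambda s: -0.5 * s[0] - 0.1 * s[1]
    target_q1 = lambda s, a: -(s[0] ** 2 + s[1] ** 2 + s[2] ** 2) - 0.1 * a ** 2
    target_q2 = lambda s, a: -(s[0] ** 2 + s[1] ** 2 + s[2] ** 2) - 0.2 * a ** 2

    s_next = (0.01, 0.05, 0.2)
    a_tilde = smoothed_target_action(target_actor, s_next, sigma=0.2, c=0.5, rng=rng)
    y = td_target(-0.04, s_next, a_tilde, target_q1, target_q2, gamma=0.99)
    print("smoothed target action:", a_tilde)
    print("TD target:", y)
    print("actor update at t=4 with d=2:", actor_update_due(4, 2))
```

Taking the minimum over the two target critics counteracts value overestimation, and delaying the actor update lets the critics settle before the policy is changed. The values of `sigma`, `c`, `gamma`, and `tau` used here are placeholders rather than the settings used in the paper.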

Figure 1. Seven-DOF vehicle dynamics model.

Figure 2. Two-DOF vehicle reference model.

Figure 3. Vehicle stability simulation. (a) Vehicle stability with left front brake failure. (b) Vehicle stability with right rear brake failure.

Figure 4. BP neural network structure.

Figure 5. The relationship between the training effect of the BP neural network and the number of neurons in the hidden layers.

Figure 6. Model identification accuracy.

Figure 7. Variation curves of the system under two different reaching laws.

Figure 8. TD3 reinforcement learning framework.

Figure 9. TD3 training curve.

Figure 10. Control process.

Figure 11. Comparison of controller outputs. (a) Straight road, $Z = 0.5$, $μ = 0.8$, $V = 100$ km/h. (b) Curved road, $Z = 0.5$, $μ = 0.8$, $V = 100$ km/h.

Figure 12. Simulation results for a vehicle with a braking intensity of $Z = 0.5$, an initial speed of 100 km/h, and a road surface with a coefficient of friction of 0.2. (a) Sideslip angle. (b) Yaw rate. (c) Lateral offset.

Figure 13. Simulation results for a vehicle with a braking intensity of $Z = 0.5$, an initial speed of 100 km/h, and a road surface with a coefficient of friction of 0.8. (a) Sideslip angle. (b) Yaw rate. (c) Lateral offset.

Figure 14. Simulation results for a vehicle with a braking intensity of $Z = 0.8$, an initial speed of 60 km/h, and a road surface with a coefficient of friction of 0.2. (a) Sideslip angle. (b) Yaw rate. (c) Lateral offset.

Figure 15. Simulation results for a vehicle with a braking intensity of $Z = 0.8$, an initial speed of 60 km/h, and a road surface with a coefficient of friction of 0.8. (a) Sideslip angle. (b) Yaw rate. (c) Lateral offset.

Figure 16. Simulation results for a vehicle with a braking intensity of $Z = 0.5$, an initial speed of 60 km/h, and a road surface with a coefficient of friction of 0.2. (a) Sideslip angle. (b) Yaw rate. (c) Lateral offset.

Figure 17. Simulation results for a vehicle with a braking intensity of $Z = 0.5$, an initial speed of 60 km/h, and a road surface with a coefficient of friction of 0.8. (a) Sideslip angle. (b) Yaw rate. (c) Lateral offset.

Table 1. Tire model fitting parameters.

$i$	0	1	2	3	4	5	6	7	8
$a_{i}$	1.65	−21.3	1144	49.6	226	0.069	−0.006	0.056	0.486
$b_{i}$	1.30	−22.1	1011	1078	1.82	0.208	0.000	−0.354	0.707

Table 2. Summary table comparing control effects of different algorithms.

Simulated Operating Conditions	Item	SMC	TD3-ASMC	Improvement
Straight road (μ = 0.2, Z = 0.5, V = 100 km/h)	Sideslip angle/(°)	0.45	0.39	13%
	Yaw rate/(°/s)	4.48	3.92	12.5%
	Lateral offset/m	1.20	0.93	20%
Straight road (μ = 0.8, Z = 0.5, V = 100 km/h)	Sideslip angle/(°)	0.027	0.019	29.6%
	Yaw rate/(°/s)	1.06	1.01	4.71%
	Lateral offset/m	0.169	0.092	45.6%
Straight road (μ = 0.2, Z = 0.8, V = 60 km/h)	Sideslip angle/(°)	0.77	0.66	14.3%
	Yaw rate/(°/s)	6.69	6.13	8.3%
	Lateral offset/m	1.16	0.67	42.2%
Straight road (μ = 0.8, Z = 0.8, V = 60 km/h)	Sideslip angle/(°)	0.042	0.027	35.7%
	Yaw rate/(°/s)	1.49	1.37	8.1%
	Lateral offset/m	0.151	0.086	43%
Curved road (μ = 0.2, Z = 0.5, V = 60 km/h)	Sideslip angle/(°)	3.626	2.457	32.2%
	Yaw rate/(°/s)	7.872	8.154	−3.6%
	Lateral offset/m	3.33	2.50	24.9%
Curved road (μ = 0.8, Z = 0.5, V = 60 km/h)	Sideslip angle/(°)	0.976	0.474	51.4%
	Yaw rate/(°/s)	3.72	3.04	18.3%
	Lateral offset/m	1.647	1.373	16.6%
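
The Improvement column in Table 2 appears to be the relative reduction of the peak value with respect to the SMC baseline. That reading is an assumption, since the formula is not stated alongside the table, and the helper name `improvement_percent` is hypothetical. A short sketch that applies it to three rows taken from Table 2:

```python
def improvement_percent(baseline_peak, controlled_peak):
    """Relative reduction of a peak value versus the baseline, in percent.

    A negative result means the controlled peak is larger than the baseline.
    """
    if baseline_peak == 0:
        raise ValueError("baseline peak must be non-zero")
    return (baseline_peak - controlled_peak) / baseline_peak * 100.0


# (label, SMC peak, TD3-ASMC peak), values copied from Table 2
rows = [
    ("Curved road, mu = 0.8, sideslip angle", 0.976, 0.474),
    ("Straight road, mu = 0.8, Z = 0.5, lateral offset", 0.169, 0.092),
    ("Curved road, mu = 0.2, yaw rate", 7.872, 8.154),
]

for label, smc, td3_asmc in rows:
    print(f"{label}: {improvement_percent(smc, td3_asmc):.1f}%")
# prints 51.4%, 45.6% and -3.6% for the three rows
```

The negative value for the low-friction curved road reflects a slightly higher peak yaw rate under TD3-ASMC than under SMC in that condition.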

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Published by MDPI on behalf of the World Electric Vehicle Association. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
