1. Introduction
Autonomous driving technology, embodied in intelligent vehicles, has become a focal point for both academia and industry because of its potential to enhance driving safety and reduce driver workload. However, practical constraints, including inconsistent communication protocols among automotive manufacturers, incomplete regulatory standards, and ethical controversies, mean that drivers must remain in the loop during the protracted transition towards fully autonomous operation, so that they can monitor the vehicle and take control when required. In this context, human–machine co-piloting has emerged as a pivotal pathway for a seamless transition from manual driving to full autonomy, and its research value is increasingly recognised [1].
The fundamental principle of human–machine co-driving systems is to establish two control entities, the driver and the autonomous driving system, that collaboratively execute driving tasks. The core design elements are driver behaviour modelling, lateral motion control, and dynamic allocation of driving authority. The driver model is of particular significance, as it is the foundation for understanding human–machine interactions and facilitating intelligent authority handover. In terms of the evolution of modelling methods, early research relied primarily on classical control theory, such as the preview (pre-aiming) driver model [
2] and the neuromuscular model [
3], which excelled in their clear physical mechanisms. The advent of artificial intelligence technology has precipitated the widespread adoption of data-driven methodologies, such as neural networks [
4] and hidden Markov models [
5]. These methodologies have been employed with the objective of engineering intelligent models that are capable of capturing the unique characteristics of individual drivers. Recent research trends demonstrate a tendency to integrate multiple model strengths in order to enhance anthropomorphism [
6]. Nevertheless, extant models continue to demonstrate significant deficiencies in their capacity to capture several critical physiological and control mechanisms: Firstly, the models fail to adequately simulate the information transmission and dynamic response processes within the driver’s nervous system. Secondly, they do not explicitly represent the neural delays and muscular lag effects that are inherent to driving operations. Thirdly, they lack systematic correction mechanisms for dynamic errors between desired inputs and actual outputs within control loops. The aforementioned limitations collectively constrain the depth of characterisation and generalization capabilities of driver behaviour models.
At the vehicle lateral control level, model predictive control (MPC) and linear quadratic regulators (LQR) [7] are the most widely used approaches. MPC excels at precise trajectory tracking [8,9] because it handles system constraints explicitly and coordinates multiple objectives through receding-horizon (rolling) optimization, which has led to its widespread adoption in human–machine co-driving applications. Shi et al. [10] proposed an MPC approach centred on driver trust in the machine and achieved control authority allocation through online cooperative computation. Dai et al. [11] developed an MPC framework incorporating a linear parameter-varying model; this anthropomorphic control strategy significantly improved the harmony of input interactions between the driver and the system.
Human–machine co-driving systems can be categorised by interaction method into two types: haptic interaction and angular interaction (see reference [
12] for further details). The primary function of haptic interaction in this context is to guide drivers by means of assistive torque application through the steering motor, a component commonly found in traditional mechanical steering systems. Lin et al. [
13] proposed a methodology for the classification of object roughness, utilising machine vision technology. This approach was employed to enhance the precision of collaborative assistance through the utilisation of haptic feedback. However, studies indicate that when assistive torque fails to align with human neuromuscular behaviour patterns, overall co-driving performance significantly deteriorates [
14], revealing the inherent limitations of tactile interaction. Addressing this challenge, Huang et al. [
15] proposed a reference-trajectory-free human–machine conflict mitigation method. This approach is predicated on the continuous evaluation of discrepancies between driver intent and system objectives. The efficacy of this approach is evidenced by its effective reduction of adversarial behaviour during interaction.
The advent of steer-by-wire systems has effectively severed the mechanical coupling between the steering wheel and the wheels, thereby enabling intelligent systems to directly intervene in control at the fundamental level of steering angle execution [
16]. Compared with haptic interaction, this steering-angle-based collaborative control offers greater design flexibility and avoids the co-driving inefficiencies caused by assist torque that is mismatched with human physiology. The control structures fall into two categories: corrective and parallel. Corrective cooperative control draws inspiration from the safety envelope concept in aviation. The system calculates safety boundaries by evaluating vehicle dynamic stability, tyre force saturation constraints, or steering saturation constraints, and intervenes minimally, only when necessary. Reference [17] employs a dual-loop structure, comprising an inner loop with an active disturbance suppressor and an outer loop with a fuzzy proportional-integral-derivative controller, which is effective in reducing lateral vehicle errors. Reference [
18] proposes a risk assessment model based on a data-driven Gaussian process regression for driving authority, and implements collision-free driving through a multi-objective hierarchical MPC control integrated with human–machine control commands. Corrective control systems prioritise the driver as the ultimate decision-maker, implementing compensatory adjustments; however, they generally lack full autonomous takeover capability. In contrast, the parallel control architecture proposed in [
19] enables complete vehicle takeover when necessary, dynamically adjusting human–machine control weights based on driver fatigue status to deliver personalized assistance. Academic perspectives on weight allocation strategies diverge. Some studies, citing current limitations in system reliability, advocate fixed weighting [20] to keep the driver in the loop. Others emphasise dynamic weight adjustment based on driving risk levels, achieving adaptive allocation through risk-based decision mechanisms [
21]. Further research indicates the need to integrate driver intent recognition and state monitoring [
22] for more refined weight regulation. In addition to the above representative studies, recent work published in the last three years shows that steering authority allocation is increasingly evolving from fixed or heuristic blending rules toward adaptive, personalized, and context-aware frameworks. In particular, recent studies have shown that authority regulation should not only depend on vehicle states or preset weight coefficients, but should also explicitly consider driver take-over feasibility, human–machine conflict, driver acceptance, and environmental risk. For example, driver take-over feasibility has recently been introduced into shared steering authority allocation so that the control weight can be adjusted according to the driver’s state and surrounding traffic risk [
23]. Other recent studies have developed hierarchical MPC-based shared steering architectures that quantify human–machine conflict and perform online authority redistribution to improve both tracking performance and interaction quality [
24]. Moreover, human-centred authority allocation strategies have also been proposed to explicitly account for driver characteristics and acceptance in the authority negotiation process [
25]. These developments further indicate that effective weight regulation should simultaneously consider driver heterogeneity, dynamic risk, and interaction consistency. To further improve the flexibility of steering authority regulation, this study introduces a fuzzy logic-based dynamic weight adjustment mechanism. By combining the potential-field-based driving risk assessment with the quantified human–machine conflict intensity, the proposed mechanism enables adaptive weight adjustment under different driving risk levels and human–machine interaction conditions.
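As an illustration of how a fuzzy mechanism can map driving risk and human–machine conflict to a steering authority weight, the following minimal sketch uses a zero-order Sugeno-style inference with triangular membership functions. The membership shapes, the rule table, and the output values are illustrative assumptions; they are not the rule base or parameters of the proposed method, and both inputs are assumed to be normalised to [0, 1].

```python
def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b.

    Setting a == b or b == c gives a left or right shoulder.
    """
    if x <= a:
        return 1.0 if a == b else 0.0
    if x >= c:
        return 1.0 if b == c else 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets shared by both normalised inputs.
SETS = {
    "low": (0.0, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.0),
}

# Hypothetical rule table: (risk level, conflict level) -> machine weight.
# Assumption: the machine weight grows with risk and yields slightly
# to the driver as conflict grows.
RULES = {
    ("low", "low"): 0.10, ("low", "medium"): 0.10, ("low", "high"): 0.05,
    ("medium", "low"): 0.40, ("medium", "medium"): 0.30, ("medium", "high"): 0.20,
    ("high", "low"): 0.80, ("high", "medium"): 0.70, ("high", "high"): 0.60,
}


def machine_weight(risk, conflict):
    """Weighted-average defuzzification over all rules (product t-norm)."""
    risk = min(max(risk, 0.0), 1.0)
    conflict = min(max(conflict, 0.0), 1.0)
    num = 0.0
    den = 0.0
    for (risk_label, conflict_label), weight in RULES.items():
        strength = tri(risk, *SETS[risk_label]) * tri(conflict, *SETS[conflict_label])
        num += strength * weight
        den += strength
    return num / den


if __name__ == "__main__":
    for risk, conflict in [(0.1, 0.1), (0.5, 0.5), (0.9, 0.2)]:
        print(risk, conflict, round(machine_weight(risk, conflict), 3))
```

Because the three membership functions overlap and sum to one, the output varies continuously with both inputs, which is the property that avoids abrupt jumps in the authority weight.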
Within a parallel control framework, the driver and the system function as two independent agents, whose conflicting objectives naturally form a two-player game. In recent years, the application of game theory to redefine the allocation of responsibilities between humans and machines has emerged as a research focus [
26]. For instance, one study addressed driver behaviour uncertainty by constructing a stochastic game framework, whose validity was verified through hardware-in-the-loop experiments. Guo et al. [27] proposed a cooperative game-based shared steering controller. Using a piecewise affine linearization method, they derived an analytical solution for the optimal cooperative steering strategies of both the human and the machine under low-friction road conditions. Reference [28] established a Stackelberg leader–follower game model in which the driver exerts a dominant influence over the system. The model was designed to achieve adaptive control authority transition and to outperform traditional Nash game strategies in reducing driver operational load. Building upon this, Yan et al. [
29] proposed a Stackelberg game-based active anti-roll decision method for lateral trajectory tracking control in commercial vehicles. By optimising the hierarchical relationship between human and machine control authority, this approach effectively enhances vehicle stability under extreme operating conditions. More recently, game-theoretic methods have continued to attract attention in human–machine shared steering because they provide a more explicit mathematical description of competition, cooperation, and authority negotiation between the driver and the automation system. In contrast to conventional weighted blending rules, recent game-based studies have modelled the driver and the automation as two decision-makers with different utility functions, thereby enabling more interpretable and flexible authority allocation. For instance, bargaining-game-based shared driving strategies have been proposed for steer-by-wire vehicles to achieve dynamic control authority distribution under human–machine interaction [
30]. In addition, recent studies have incorporated individual risk perception into game-theoretic driver steering models, which improves the interpretability of driver steering behaviour and helps explain the differences in decision preferences under shared control [
31]. Some newly published work has further extended game-based authority coordination from purely lateral steering to coupled lateral-longitudinal co-driving scenarios, indicating that game theory is gradually evolving into an important tool for multi-level human–machine authority coordination [
32]. These latest studies further demonstrate the necessity of establishing a unified shared steering framework that can simultaneously account for driver heterogeneity, risk evolution, and dynamic authority redistribution.
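To make the leader–follower structure of a Stackelberg game concrete, the sketch below solves a one-step, scalar steering game in closed form. It is a toy example under stated assumptions, not the MPC-based formulation of this paper: the total steering command is assumed to be the sum of the driver input `u_d` and the machine input `u_m`, each player is assumed to minimise a quadratic cost with its own target (`r_d`, `r_m`) and effort penalty (`q_d`, `q_m`), and all names are hypothetical.

```python
def follower_response(u_d, r_m, q_m):
    """Machine (follower) best response to a given driver input.

    Minimises (u_d + u_m - r_m)^2 + q_m * u_m^2 over u_m.
    """
    return (r_m - u_d) / (1.0 + q_m)


def leader_cost(u_d, r_d, r_m, q_d, q_m):
    """Driver (leader) cost when the machine reacts with its best response."""
    u_m = follower_response(u_d, r_m, q_m)
    return (u_d + u_m - r_d) ** 2 + q_d * u_d ** 2


def stackelberg_solution(r_d, r_m, q_d, q_m):
    """Closed-form Stackelberg equilibrium (requires q_d > 0 or q_m > 0).

    Substituting the follower response gives a total command a * u_d + b,
    so the leader's cost is a scalar quadratic in u_d.
    """
    a = q_m / (1.0 + q_m)
    b = r_m / (1.0 + q_m)
    u_d = a * (r_d - b) / (a * a + q_d)
    u_m = follower_response(u_d, r_m, q_m)
    return u_d, u_m


if __name__ == "__main__":
    u_d, u_m = stackelberg_solution(r_d=1.0, r_m=0.6, q_d=0.5, q_m=0.5)
    print("driver input:", u_d, "machine input:", u_m, "total:", u_d + u_m)
```

The key difference from a Nash solution is visible in `leader_cost`: the driver optimises while anticipating the machine's reaction, rather than treating the machine input as fixed.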
As discussed above, significant progress has been made in modelling, control, and task allocation for human–machine co-driving. However, the core challenge remains to establish a game-theoretic steering control framework that ensures driving safety while dynamically optimising the authority weighting to minimise the driver’s physical and mental load. This paper focuses on human–machine shared steering control, and in particular on dynamic driving authority allocation strategies based on Stackelberg games that jointly optimise safety and driving experience. The primary contributions of this paper are as follows:
- (1)
A human–machine cooperative steering interaction model based on Stackelberg games is constructed, as illustrated in Figure 1. To address the command conflicts that arise from differing objectives between the driver and the co-pilot controller during cooperative steering, this paper introduces Stackelberg (leader–follower) game theory and models the two as independent decision-makers in a hierarchical relationship. Theoretical derivation yields the optimal control strategies for both parties under game equilibrium conditions, establishing a cooperative architecture in which the driver leads and the controller follows.
- (2)
A weight allocation method with a dynamic adjustment strategy that fuses human–vehicle–road information is proposed. The method extends a driving safety assessment model based on road risk fields by integrating the conflict level of human–machine control commands and the driver’s real-time state to form an online weight adjustment mechanism.
- (3)
A lane-change assistance scenario is constructed on the CarSim/Simulink co-simulation platform and a test bench. The results substantiate the efficacy of the strategy: the co-driving controller intervenes promptly in high-risk scenarios to ensure safety, while in low-risk conditions the strategy prioritises the driver’s control authority and reduces the driver’s operational burden.
The remainder of this paper is organised as follows. Section 2 constructs a driver model that incorporates style characteristics, together with a vehicle dynamics model. Section 3 proposes a Stackelberg game interaction model that transforms the human–machine co-driving steering control problem into an MPC optimization problem; theoretical derivation yields the optimal control sequences for both the driver and the ADAS under game equilibrium conditions. Section 4 proposes a method for real-time adjustment of steering control weights that integrates driver state monitoring, risk assessment of real-time driving conditions, identification of human–machine conflict levels, and driver style recognition. Section 5 presents the validation and analysis, using the CarSim/Simulink co-simulation platform and a test bench. Finally, Section 6 concludes the paper and outlines future research directions.
5. Results and Discussion
This section employs a layered validation strategy combining Model-in-the-Loop (MIL) simulation with Driver-in-the-Loop (DIL) bench testing. First, the CarSim/Simulink co-simulation platform is used to quantitatively analyse the control performance of the proposed strategy under standard test conditions across varying risk levels and driver states. Subsequently, a steer-by-wire hardware-in-the-loop test bench is used to incorporate real drivers’ control characteristics and road feedback, further validating the strategy’s real-time capability and robustness.
5.1. Co-Simulation Platform and Scenario Definition
- (1)
Experimental Scenario: As illustrated in Figure 5, the scenario is a two-lane highway, with each lane 3.75 m wide. Two obstacle vehicles are introduced to test the effectiveness of dynamic weight allocation in risk scenarios. The host vehicle follows a leading vehicle at a constant speed in the right lane. Two static obstacles (accident vehicles) are positioned 80 m ahead in the right lane and 130 m ahead in the left lane, respectively. The driver of the host vehicle is initially unaware of the danger because the leading vehicle obstructs the view. At a certain moment, the leading vehicle suddenly changes lanes; the driver then observes the obstruction in the lane and initiates an emergency avoidance manoeuvre. To validate steering interactions, the driver uniformly adopts a lane-change avoidance strategy.
- (2)
Experimental Platform: CarSim and Simulink co-simulation.
- (3)
Experimental Design: In the context of human–machine cooperative driving, the driver model and the autonomous driving system must maintain consistent trajectory expectations. The following two operating conditions are executed: Driver-Only Control (DA) and Human–Machine cooperative Control (HC).
5.2. Validation of the Effectiveness of the Dynamic Weight Allocation Strategy
The study compares DA with HC under two driver states, represented by Driver A and Driver B.
The experimental simulation results are presented in Figure 6 and Figure 7. Figure 6 shows the simulation results with Driver A, while Figure 7 presents the results with Driver B. In each figure, the curves compare the driver-alone condition (DA) with the human–machine cooperative condition (HC) for the corresponding driver.
A comparison of the driver-alone curves in Figure 6 and Figure 7 reveals that Driver A exhibits superior driving performance relative to Driver B. Driver A demonstrates smoother trajectories and control inputs, with yaw rate peaks consistently below 0.2. Driver A therefore completes the driving task successfully, whereas Driver B fails to execute the steering manoeuvres required for obstacle avoidance and, at two longitudinal positions, comes close to a collision with the road boundary.
For Driver A, the trajectory during cooperative driving with the autonomous system largely overlaps with that during solo driving. As demonstrated in
Figure 6b, the autonomous system’s weight generally maintains a value of approximately 0.1, exhibiting only a marginal increase upon entering high-risk areas, while remaining below 0.2. This finding suggests that Driver A consistently maintains a high driving weight and retains significant driving autonomy throughout the cooperative process.
For Driver B, the autonomous driving system’s initial driving weight is comparatively high because of the driver’s suboptimal condition. This benefits the solution of the non-cooperative interaction model by raising the level of vehicle intervention that assists the driver in executing steering manoeuvres. Furthermore, as demonstrated in Figure 7b, the weight adjustment strategy devised in this section prevents extreme fluctuations in human–machine driving weights. During the first lane change, the autonomous driving system’s driving weight exhibited a marginal increase at the longitudinal position
m, given the driver’s state. This was attributable to a substantial environmental risk. During the secondary lane change, a notable escalation in risk was observed as the vehicle attained the longitudinal position
m. However, the weight exhibited only a marginal change, attributable to the global control deviation remaining within a confined range. In circumstances where the driver’s condition is suboptimal, the adoption of substantial weight changes has been shown to effectively mitigate the driver’s mental load and operational burden. As demonstrated in
Figure 7a, the vehicle trajectory is characterised by enhanced smoothness during cooperative driving. As demonstrated in
Figure 7c, the cooperative driving mode effectively mitigates the driver’s abrupt control actions, thereby reducing the level of effort required from the driver.
Taken together, the simulation results show that both the driver’s effort and the vehicle’s yaw rate are reduced during cooperative driving, indicating that the proposed strategy can help reduce driver workload and enhance driving safety. A comparison of cooperative driving with different drivers shows that the strategy provides varying degrees of assistance to drivers in different states while preserving driving freedom. Compared with strategies that adjust driving weights based exclusively on environmental risk, the adaptive weight decision model proposed in this paper produces weights with a smaller range of variation. This helps drivers master the driving characteristics, promotes balanced human–machine interaction, and alleviates driver discomfort.
5.3. Adaptability Analysis of Different Driving Styles
To validate the strategy’s generalization capability across individual differences, three driver models established in Section 2 (Conservative, Balanced, and Aggressive) were configured for lane-change testing. The results of the driving style test are displayed in Figure 8 and Figure 9. In terms of tracking errors and control inputs during lane changes, aggressive drivers tend to select higher lateral velocities to ensure that the vehicle follows the desired trajectory, whereas conservative drivers prefer lower lateral velocities to prioritise lane-change safety. Control inputs were considerably reduced for all three driver types under human–machine co-driving. The lateral control allocation strategy for co-driven lane changes reduces driver workload, produces smoother cornering curves, meets comfort requirements, and notably assists conservative drivers by substantially lowering tracking errors. As demonstrated in Figure 8c,d, the initial weighting for all three driver types is comparatively low during human–machine co-driving. The initial weighting magnitude depends on the driver’s desired lateral acceleration, which is largest for the aggressive driver, giving the order aggressive driver > balanced driver > conservative driver. As shown in Figure 8b, the real-time shared-control weight remains smooth throughout the lane-change process for all three driver types, thereby avoiding the driving discomfort that may be caused by abrupt weight fluctuations. The weight magnitude follows the same order, Aggressive > Balanced > Conservative, indicating that the proposed strategy allocates a larger driving weight to more aggressive drivers and a relatively lower driving weight to more conservative drivers. Because the driver and the control system share a consistent intent for the lane-change operation, human–machine conflict remains consistently low, below 0.2, and the risk associated with vehicle operation remains within a relatively constrained range. The weights assigned to the three driver types consequently vary only within a narrow range, which reflects the stability of the proposed dynamic authority allocation strategy.
In summary, when human and machine driving intentions are aligned, the driving control allocation strategy primarily focuses on reducing the driver’s workload while ensuring driving safety, and human–machine conflict has a negligible impact on the allocation. Among the three styles, conservative drivers are assigned the lowest driving weight, indicating maximum system intervention. These results demonstrate the validity of the strategy across a variety of driving styles.
5.4. Driver-in-the-Loop (DIL) Bench Experiment
The overall scheme of the steer-by-wire hardware-in-the-loop test bench is shown in
Figure 10. The test bench is composed of three primary components: test simulation hardware, test simulation integration software, and a control system prototype. This configuration ensures the real-time operation of the steer-by-wire controller and steering actuators. The test simulation hardware comprises a steer-by-wire test bench, a road condition simulation load system, and a power distribution cabinet. The road condition simulation load system uses servo electric cylinders. The test simulation integration software comprises vehicle dynamics software (CarSim 2021.0), test software (Matlab 2022a, RTSimPlus), and the corresponding host computer systems, which require comprehensive system integration and debugging. The control system prototype consists of control prototype hardware and control algorithm models, and is developed and implemented mainly through rapid prototyping. The control prototype hardware uses dedicated chips and peripheral circuits. The specifications are shown in
Table 5.
The experimental setup is illustrated in Figure 11. With the vehicle travelling at 20 m/s in the longitudinal direction, the driver completes a lane-change manoeuvre by turning the steering wheel. Several drivers performed multiple free lane-change trials in the scenario depicted in the figure, without authority weight allocation. Each test lasted 11 s, and drivers were prompted to initiate the lane change after travelling 50 m. The trajectory, yaw angle, and lane-change duration were extracted for each manoeuvre.
An analysis of the lane-change positions of three drivers shows that, excluding the 50 m travelled in the original lane before the change and the distance travelled in the target lane after completion, the average longitudinal displacement D required for lane changes without system intervention was 80 m, 68 m, and 52 m, respectively. Accordingly, these three drivers were classified as having conservative, balanced, and aggressive driving styles, respectively. The system-expected lane-change trajectories were derived from these D values and used to allocate vehicle control authority during human–machine co-driving. The results of these tests are displayed in
Figure 12 and
Figure 13.
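The style labelling described above can be sketched as a simple ranking by mean longitudinal lane-change displacement D. The text reports only the three D values and the resulting labels, so the rank-based rule below is an assumption (no numerical thresholds are given), the function name is hypothetical, and the mapping of the identifiers D1, D2, and D3 to 80 m, 68 m, and 52 m is inferred from the order in which the drivers are reported.

```python
def label_styles_by_displacement(mean_displacement_m):
    """Label three drivers by rank of mean lane-change displacement D.

    Assumption: a longer displacement indicates a more conservative style.
    """
    labels = ["conservative", "balanced", "aggressive"]
    if len(mean_displacement_m) != len(labels):
        raise ValueError("this sketch expects exactly three drivers")
    # Sort driver identifiers from the largest to the smallest displacement.
    ranked = sorted(mean_displacement_m, key=mean_displacement_m.get, reverse=True)
    return dict(zip(ranked, labels))


if __name__ == "__main__":
    styles = label_styles_by_displacement({"D1": 80.0, "D2": 68.0, "D3": 52.0})
    print(styles)
```

A ranking of this kind only orders the drivers relative to one another; applying it to a larger population would require thresholds or clustering, which the text does not specify.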
As demonstrated in
Figure 12 and
Figure 13, all three driver types exhibit a certain degree of overshoot when executing lane changes without intelligent system assistance. This overshoot corresponds to the maximum lateral deviation. When the intelligent system participates in control authority allocation, it effectively eliminates this overshoot, so that the lateral position increases monotonically during the lane change. As illustrated in Figure 13, under human–machine co-driving conditions, all three driver types apply increased lateral input during the initial phase of the lane change, and their control inputs decrease substantially after entering the target lane. Preliminary analysis suggests that, in the initial phase, drivers lack trust in the control system: the system’s intervention induces discomfort and distrust, prompting them to input larger steering angles to achieve the desired lane-change manoeuvre, which in turn increases the human–machine conflict value. During the mid-to-late stages of the lane change, as drivers gradually adapt to the intelligent system’s intervention, they reduce their own steering wheel input. The driver’s workload is then reduced, and the human–machine conflict value diminishes. The risk value curve indicates that, among the three driver types, conservative drivers selected the largest longitudinal displacement during lane changing and thus maintained the highest risk value. The conservative driver depicted in
Figure 12d exhibits the lowest initial and real-time weights, indicating the highest level of intelligent system intervention.
To quantitatively evaluate the lane-change trajectory, the maximum lateral overshoot $y_{\mathrm{os}}$ is defined as

$$y_{\mathrm{os}} = \max_{t}\,\bigl(y(t) - y_{\mathrm{tar}}\bigr),$$

where $y$ is the actual lateral displacement and $y_{\mathrm{tar}}$ is the target lateral position corresponding to the target lane centre.
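To characterize the vehicle dynamic response and driver workload, the yaw-rate energy $E_{\omega}$ and the driver control effort $E_{u}$ are defined as

$$E_{\omega} = \sum_{k=1}^{N} \omega(k)^{2}\, T_s, \qquad E_{u} = \sum_{k=1}^{N} u_d(k)^{2}\, T_s,$$

where $\omega(k)$ is the yaw rate at the $k$-th sampling instant, $u_d(k)$ is the driver output, $T_s$ is the sampling period, and $N$ is the total number of samples.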
In addition, the peak yaw rate, peak driver output, mean authority weight, mean risk, peak conflict, and cumulative conflict are also extracted to establish a multi-dimensional quantitative evaluation framework for the HIL experiments.
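The three scalar metrics can be computed from logged signals as in the following minimal sketch. It assumes that the overshoot is the largest excursion of the lateral position beyond the target lane centre and that the energy and effort are discrete sums of squared samples multiplied by the sampling period; these forms, the function names, and the sample data are illustrative assumptions and may differ from the exact definitions used for the tables.

```python
def max_lateral_overshoot(y, y_target):
    """Largest excursion of the lateral position beyond the target position."""
    return max(sample - y_target for sample in y)


def yaw_rate_energy(yaw_rate, ts):
    """Discrete energy of the yaw rate: sum of squared samples times ts."""
    return sum(sample * sample for sample in yaw_rate) * ts


def driver_control_effort(driver_output, ts):
    """Discrete driver effort: sum of squared driver outputs times ts."""
    return sum(sample * sample for sample in driver_output) * ts


if __name__ == "__main__":
    ts = 0.01  # hypothetical sampling period in seconds
    y = [0.0, 1.0, 3.75, 3.90, 3.80, 3.75]  # hypothetical lateral positions (m)
    yaw_rate = [0.1, -0.2, 0.1]             # hypothetical yaw-rate samples
    steering = [0.05, 0.10, -0.05]          # hypothetical driver outputs
    print(max_lateral_overshoot(y, y_target=3.75))
    print(yaw_rate_energy(yaw_rate, ts))
    print(driver_control_effort(steering, ts))
```

Peak values such as the peak yaw rate follow the same pattern, taking the maximum absolute sample instead of the squared sum.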
As shown in
Table 6, the proposed shared-control strategy significantly reduces the maximum lateral overshoot for all three driver types. Specifically, the overshoot is reduced from 0.4224 m, 0.0383 m, and 0.00070 m under manual driving to 0.0053 m, 0.0055 m, and 0.00001 m under shared control for D1, D2, and D3, respectively, corresponding to reductions of 98.75%, 85.54%, and 98.58%. This quantitatively confirms that the proposed strategy can effectively suppress excessive lateral deviation after the vehicle enters the target lane.
Table 7 further shows that both the peak yaw rate and the yaw-rate energy are substantially reduced under the shared-control mode. This indicates that the proposed strategy not only improves the final lateral stability, but also makes the lane-change response smoother and less oscillatory throughout the manoeuvre. Meanwhile,
Table 8 shows that the peak driver output and the driver control effort are significantly decreased for all three driver types, which demonstrates that the proposed method can effectively alleviate the steering workload of the driver.
In addition,
Table 9 shows that the authority weight is adaptively adjusted for different driver types, reflecting the personalized assistance capability of the proposed strategy. The risk and conflict data also indicate that D1 exhibits the highest peak risk, cumulative risk, and conflict magnitude, whereas D2 and D3 remain at relatively lower levels. This is consistent with the qualitative observations in
Figure 12 and
Figure 13. Overall, the above quantitative results jointly demonstrate that the proposed shared-control strategy can improve lane-change stability, smooth the vehicle response, reduce driver workload, and maintain effective human–machine coordination under different driving styles.
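The relative overshoot reductions quoted for Table 6 follow from the usual percentage-reduction formula, as the short sketch below illustrates using the rounded overshoot values given in the text. Recomputed from these rounded values, the reductions agree with the reported percentages to within about 0.1 percentage point; differences of that size would be expected if the reported percentages were computed from unrounded data. The function name is hypothetical.

```python
def reduction_percent(manual, shared):
    """Relative reduction from manual driving to shared control, in percent."""
    if manual <= 0.0:
        raise ValueError("manual value must be positive")
    return 100.0 * (manual - shared) / manual


# Rounded maximum lateral overshoot values (m) quoted in the text:
# (manual driving, shared control) for each driver.
OVERSHOOT_M = {
    "D1": (0.4224, 0.0053),
    "D2": (0.0383, 0.0055),
    "D3": (0.00070, 0.00001),
}

if __name__ == "__main__":
    for driver, (manual, shared) in OVERSHOOT_M.items():
        print(f"{driver}: {reduction_percent(manual, shared):.2f}% reduction")
```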
5.5. Discussion on Communication Delays and Real-Time Implementation
It should be noted that the HIL platform adopted in this study is a semi-physical test bench rather than an ideal offline simulation environment. In the experimental setup, the controller, communication interfaces, and vehicle model operate in a real-time closed loop. Therefore, the obtained HIL results inherently include the combined effects of communication delay, computation delay, and execution delay existing in the practical implementation process. In this sense, the effectiveness of the proposed shared-control strategy has been validated under realistic closed-loop operating conditions rather than under an ideal delay-free assumption.
Meanwhile, it should also be acknowledged that the present work does not provide a separate parametric delay-sensitivity analysis by artificially injecting different prescribed delay values into the HIL platform. Therefore, the current experimental results mainly demonstrate the practical feasibility and real-time implementability of the proposed strategy on the semi-physical bench, while a more systematic robustness evaluation under different delay levels remains to be further investigated in future work.
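As a pointer for such a future delay-sensitivity study, one simple way a prescribed delay might be injected into a sampled signal path is a fixed-length first-in, first-out buffer, sketched below. This is a generic illustration only: the class name and interface are hypothetical, the delay is assumed to be a whole number of sampling periods, and nothing here reflects the interfaces of the test bench software.

```python
from collections import deque


class DelayLine:
    """Delay a sampled signal by a fixed number of sampling periods."""

    def __init__(self, delay_steps, initial=0.0):
        if delay_steps < 0:
            raise ValueError("delay_steps must be non-negative")
        # Pre-fill the buffer so the first outputs equal the initial value.
        self._buffer = deque([initial] * delay_steps)

    def step(self, value):
        """Push the newest sample and return the sample from delay_steps ago."""
        self._buffer.append(value)
        return self._buffer.popleft()


if __name__ == "__main__":
    # Hypothetical example: a 20 ms delay at a 10 ms sampling period.
    line = DelayLine(delay_steps=2)
    commands = [1.0, 2.0, 3.0, 4.0]
    print([line.step(command) for command in commands])
```

Sweeping `delay_steps` over a range of values and re-evaluating the overshoot, yaw-rate, and effort metrics would give the kind of parametric robustness picture that the present experiments do not yet provide.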