Hybrid Supervised and Reinforcement Learning for Motion-Sickness-Aware Path Tracking in Autonomous Vehicles
Abstract
1. Introduction
2. Related Works
2.1. Geometric Controllers
2.2. Model-Free Controllers
2.3. Model-Based Controllers
3. Methodology
3.1. Problem Definition
3.1.1. Input Representation
- Two position-error terms denote the directional deviations between the vehicle’s current position and the reference path waypoint, as shown in Figure 1.
- A heading-error term quantifies the angular difference between the vehicle’s orientation and the target trajectory direction, as shown in Figure 1.
- v represents the vehicle’s longitudinal velocity.
- k denotes the curvature of the predefined reference path. Including it gives the path-tracking model explicit information about the path’s geometry, which is intended to improve its adaptability to unstructured road geometries.
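The inputs listed above can be assembled into a single state vector. The following is a minimal sketch, not the authors’ implementation: the names `Pose`, `build_state`, `dx`, and `dy` are hypothetical, and it assumes that the deviations are expressed in the vehicle frame and that the heading error is the waypoint heading minus the vehicle heading. The source specifies neither convention.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose:
    """Hypothetical container for a planar pose in the world frame."""
    x: float    # position (m)
    y: float    # position (m)
    yaw: float  # heading (rad)


def wrap_angle(angle: float) -> float:
    """Wrap an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def build_state(vehicle: Pose, waypoint: Pose, speed: float, curvature: float) -> list[float]:
    """Return [dx, dy, heading_error, v, k] for one control step.

    Assumption: dx and dy are the offset to the waypoint rotated into the
    vehicle frame; the paper may use a different frame or sign convention.
    """
    dx_world = waypoint.x - vehicle.x
    dy_world = waypoint.y - vehicle.y
    cos_yaw, sin_yaw = math.cos(vehicle.yaw), math.sin(vehicle.yaw)
    # Rotate the world-frame offset into the vehicle frame.
    dx = cos_yaw * dx_world + sin_yaw * dy_world
    dy = -sin_yaw * dx_world + cos_yaw * dy_world
    # Assumed sign convention: waypoint heading minus vehicle heading.
    heading_error = wrap_angle(waypoint.yaw - vehicle.yaw)
    return [dx, dy, heading_error, speed, curvature]


# A vehicle facing +y with a waypoint 2 m straight ahead on a straight path.
state = build_state(Pose(0.0, 0.0, math.pi / 2), Pose(0.0, 2.0, math.pi / 2), speed=9.7, curvature=0.0)
```

Wrapping the heading error keeps the input bounded when the vehicle and path headings straddle the ±π discontinuity.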
3.1.2. Output Representation
3.2. Method Framework
3.3. Offline Supervised Learning
3.3.1. Expert Demonstration
3.3.2. Loss Function
3.4. Online RL Framework
3.4.1. Markov Decision Process
3.4.2. Reward Function
3.4.3. RL Optimization Process
3.5. Path-Tracking Algorithm Based on HSRL
Algorithm 1. Path tracking based on HSRL.
4. Experiment
4.1. Experimental Setup
4.1.1. Implementation Details
4.1.2. Training Dynamics Analysis
4.1.3. Baselines and Evaluation Metrics
4.2. Tracking Performance
4.3. Reduction in MS Performance
4.4. High-Speed Performance
4.5. Reward Hyperparameter Sensitivity Analysis
4.6. Ablation Study
- Mean lateral deviation increases from 0.0925 m to 0.1468 m, with a 41.3% increase in standard deviation.
- Mean jerk rises from 0.0202 m/s³ to 0.0315 m/s³, accompanied by a 20.8% increase in standard deviation.
- TMSDV increases by 43.2%, from 475.7 to 681.3.
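The relative changes above can be recomputed from the ablation table. The sketch below assumes the comparison is between the full HSRL row and the first ablated row, whose label is truncated in the source; the helper name `relative_increase` is hypothetical.

```python
def relative_increase(baseline: float, ablated: float) -> float:
    """Percentage change from the baseline value to the ablated value."""
    return 100.0 * (ablated - baseline) / baseline


# (mean, std, max) tuples copied from the ablation table.
full = {"lateral": (0.0925, 0.1185, 0.8659), "jerk": (0.0202, 0.0578, 0.6745), "tmsdv": 475.7}
ablated = {"lateral": (0.1468, 0.1676, 0.9817), "jerk": (0.0315, 0.0687, 0.9719), "tmsdv": 681.3}

lateral_std_pct = relative_increase(full["lateral"][1], ablated["lateral"][1])
jerk_std_pct = relative_increase(full["jerk"][1], ablated["jerk"][1])
tmsdv_pct = relative_increase(full["tmsdv"], ablated["tmsdv"])

print(f"lateral std: +{lateral_std_pct:.1f}%")  # about 41.4%
print(f"jerk std:    +{jerk_std_pct:.1f}%")     # about 18.9%
print(f"TMSDV:       +{tmsdv_pct:.1f}%")        # about 43.2%
```

With the tabulated values, the lateral-deviation and TMSDV figures agree with the text to within rounding, while the jerk standard deviation comes out near 18.9% rather than the quoted 20.8%. The difference may come from rounding in the table or from a different baseline.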
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Parameters | Value |
---|---|
Supervised Learning | |
Learning rate | 1 × |
Policy noise | 0.1 |
Batch size | 64 |
Optimizer | Adam |
Reinforcement Learning | |
Optimizer | Adam |
Policy frequency | 2 |
Policy noise | 0.2 |
Policy learning rate | 3 × |
Discount factor | 0.99 |
tau | 0.005 |
Reward function weights | 1.5, 1.0, 2.5, 0.1, 1.6 |
PID based on WAF-Tune | |
35 km/h (Constant speed)–Lateral (Kp, Ki, Kd) | (0.35, 0.0005, 6.5) |
High speed–Lateral (Kp, Ki, Kd) | (0.2, 0.0000, 7.0) |
High speed–Longitudinal (Kp, Ki, Kd) | (0.3, 0.0500, 0.5) |
MPC | |
Sample time (s) | 0.05 |
Prediction horizon | 20 |
Control horizon | 5 |
Vehicle mass (kg) | 1720 |
Front suspension stiffness (N/m) | 35,000 |
Rear suspension stiffness (N/m) | 30,000 |
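The reinforcement-learning entries in the table above (policy frequency, policy noise, tau, discount factor) use names common in TD3-style actor-critic methods. Assuming that reading, the sketch below shows how three of the tabulated values are typically used. It is an illustration in plain Python with hypothetical function names, not the authors’ training code, and it represents network weights as flat lists of floats for brevity.

```python
def soft_update(target: list[float], online: list[float], tau: float = 0.005) -> list[float]:
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * t for t, w in zip(target, online)]


def should_update_policy(step: int, policy_frequency: int = 2) -> bool:
    """Delayed policy update: refresh the actor every `policy_frequency` critic steps."""
    return step % policy_frequency == 0


def discounted_return(rewards: list[float], gamma: float = 0.99) -> float:
    """Sum of rewards weighted by gamma**t, accumulated from the last step backwards."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# One soft update moves the target weights 0.5% of the way towards the online weights.
new_target = soft_update(target=[0.0, 1.0], online=[1.0, 1.0])
actor_steps = [step for step in range(1, 7) if should_update_policy(step)]
episode_return = discounted_return([1.0, 1.0, 1.0])
```

A small tau keeps the target networks slowly varying, which is the usual motivation for pairing it with delayed actor updates.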
Scenario | Lateral Deviation, m (Mean, Std, Max) | Jerk, m/s³ (Mean, Std, Max) |
---|---|---|
W-shape (training) | (0.0549, 0.0815, 0.4443) | (0.0194, 0.0551, 0.5834) |
S-shape (test) | (0.0469, 0.0809, 0.7908) | (0.0189, 0.0571, 0.6745) |
U-shape (test) | (0.1219, 0.1587, 0.8659) | (0.0210, 0.0559, 0.6009) |
O-shape (test) | (0.1159, 0.1354, 0.7523) | (0.0204, 0.0602, 0.6714) |
Method | S-shape (ME, t, k) | U-shape (ME, t, k) | O-shape (ME, t, k) |
---|---|---|---|
PID | (1.47, 2743, 0.018) | (1.07, 2973, 0.014) | (1.23, 429, 0.027) |
MPC | (0.98, 3208, 0.014) | (0.97, 2970, 0.013) | (0.84, 438, 0.025) |
HSRL | (0.79, 3215, 0.015) | (0.87, 1988, 0.017) | (0.75, 429, 0.026) |
Hyperparameter Combination | Value |
---|---|
A | |
B | |
C | |
D | |
E | |
Method | Lateral Deviation, m (Mean, Std, Max) | Jerk, m/s³ (Mean, Std, Max) | TMSDV |
---|---|---|---|
HSRL | (0.0925, 0.1185, 0.8659) | (0.0202, 0.0578, 0.6745) | 475.7 |
HSRL w/o | (0.1468, 0.1676, 0.9817) | (0.0315, 0.0687, 0.9719) | 681.3 |
HSRL w/o | (0.1381, 0.1538, 0.8732) | (0.0293, 0.0613, 0.89324) | 572.8 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lv, Y.; Chen, Y.; Chen, Z.; Fan, Y.; Tao, Y.; Zhao, R.; Gao, F. Hybrid Supervised and Reinforcement Learning for Motion-Sickness-Aware Path Tracking in Autonomous Vehicles. Sensors 2025, 25, 3695. https://doi.org/10.3390/s25123695