Deep Reinforcement Learning-Based Torque Vectoring Control Considering Economy and Safety
Abstract
1. Introduction
- Unlike reference [18], this paper proposes a TVC method that takes both economy and safety into account. Specifically, the deep-RL-based torque allocation layer adaptively adjusts the torque of each wheel according to the current vehicle state.
- An improved heuristic randomized ensembled double Q-learning (REDQ) algorithm is introduced for EV control, which reduces the training complexity compared with existing RL algorithms that control motor torque directly.
2. The TVC Framework and System Model
2.1. The TVC Framework
2.2. Tire Model Identification
Algorithm 1. FTO algorithm
1. Set the depth N of the Fibonacci tree and the number n of identification parameters;
2. Randomly generate an initial node B_11 and a global random node N_1;
3. Repeat:
4. Generate F_i global trial nodes W_1, …, W_{F_i} from the global random node N_i and the nodes B_ij;
5. Generate F_{i−1} local trial nodes V_1, …, V_{F_{i−1}} from the node B_i1 with the best fitness in the current node set and the remaining nodes;
6. Obtain the next-generation nodes B_{i+1,j};
7. Update the node set: merge the newly generated trial nodes into the current node set S, evaluate the fitness function, sort, and retain the first F_{i+1} nodes;
8. Until F_{i+1} ≥ F_N;
9. Output the optimal node.
2.3. Vehicle Reference Model
2.4. Vehicle 7-DOF Dynamic Model
3. The TVC Algorithm
3.1. Active Safety Control Layer
3.2. Torque Allocation Layer
3.2.1. Average Allocation Method
3.2.2. RL-Based Torque Allocation Algorithm
Algorithm 2. Heuristic REDQ algorithm
1. Initialize an ensemble of N Q-networks with parameters θ_1, …, θ_N;
2. Initialize the target Q-networks with parameters θ'_i ← θ_i, i = 1, …, N;
3. Initialize the replay buffer D;
4. For each step t do:
5. Randomly sample an action a_t from the set of action strategies according to their sampling distribution;
6. Execute the action a_t and observe the next state s_{t+1} and reward r_t;
7. Store the experience tuple (s_t, a_t, r_t, s_{t+1}) in the replay buffer D;
8. for G updates do
9. Sample a mini-batch of experiences from the replay buffer D;
10. Randomly select m indices from the set {1, …, N} as a subset M;
11. Based on (42), compute the Q-value estimates;
12. for i = 1, …, N do
13. Based on (43), update the parameters θ_i using the gradient descent method;
14. Based on (44), update each target Q-network θ'_i;
15. end for
16. end for
17. end for
18. Return the learned Q-network ensemble.
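To make the update loop concrete, below is a minimal PyTorch sketch of one REDQ critic update (steps 8–14), assuming a SAC-style entropy-regularized target as in the original REDQ of Chen et al. [29]. The network sizes, hyperparameters (N, m, γ, ρ, α), and the `policy` interface returning an action together with its log-probability are illustrative assumptions, not the paper's exact implementation of Equations (42)–(44).

```python
import random
import torch
import torch.nn as nn

# Illustrative hyperparameters (assumed, not the paper's values):
# N critics, subset size m, discount, Polyak rate, entropy weight.
N, M_SUB, GAMMA, RHO, ALPHA = 10, 2, 0.99, 0.995, 0.2

def make_q(s_dim, a_dim):
    # Simple MLP critic Q(s, a) -> scalar
    return nn.Sequential(nn.Linear(s_dim + a_dim, 256), nn.ReLU(),
                         nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 1))

def redq_update(qs, q_targets, optims, policy, batch):
    # batch tensors: s (B,s_dim), a (B,a_dim), r (B,1), s2 (B,s_dim), done (B,1)
    s, a, r, s2, done = batch
    with torch.no_grad():
        a2, logp2 = policy(s2)                   # assumed: next action + log-prob
        idx = random.sample(range(N), M_SUB)     # step 10: random subset M
        q_min = torch.min(torch.stack(           # eq. (42): min over the subset
            [q_targets[i](torch.cat([s2, a2], -1)) for i in idx]), dim=0).values
        y = r + GAMMA * (1 - done) * (q_min - ALPHA * logp2)
    for q, opt in zip(qs, optims):               # steps 12-13: update all N critics
        loss = ((q(torch.cat([s, a], -1)) - y) ** 2).mean()   # eq. (43)
        opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():                        # step 14, eq. (44): Polyak averaging
        for q, qt in zip(qs, q_targets):
            for p, pt in zip(q.parameters(), qt.parameters()):
                pt.mul_(RHO).add_((1 - RHO) * p)

# Example construction (illustrative dimensions only):
# qs = [make_q(6, 4) for _ in range(N)]
# q_targets = [make_q(6, 4) for _ in range(N)]  # then copy weights from qs
# optims = [torch.optim.Adam(q.parameters(), lr=3e-4) for q in qs]
```

Updating all N critics against a target built from a small random subset is what lets REDQ run many gradient updates per environment step without the overestimation bias of a single double-Q pair.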
4. Evaluation Indicators and Simulation Results
4.1. Simulation Environment
- RLES. The torque allocation method proposed in this paper. The active safety control layer is a nonlinear MPC controller, and the torque allocation layer is the heuristic REDQ deep RL algorithm, which jointly considers economy and safety.
- MPC-CO. The torque allocation algorithm of reference [26], which also jointly considers economy and safety; its lower-level controller is a quadratic programming algorithm.
- LQR-EQ. The active safety control layer is the LQR controller of reference [31], and the torque allocation layer is the average allocation method of Section 3.2.1. This controller considers vehicle safety only.
- w/o control. No additional vehicle lateral control; steering is left entirely to the driver.
4.2. Performance Indicators
1. Handling stability
2. Driver workload
3. Motor load
4. Additional yaw moment
5. Velocity tracking

A sketch of how such indicators can be computed from logged signals follows below.
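The results tables report scalar indicators ε_s, ε_driver, ε_motor, ε_Mz, and ε_v for these five criteria. Since the paper's defining equations are not reproduced in this extract, the sketch below shows one plausible set of integral-square definitions evaluated from logged time histories; all five expressions are assumptions for illustration only, not the paper's exact formulas.

```python
import numpy as np

# Hedged illustration of the five indicator categories; these
# integral-square definitions are assumed for demonstration.
def indicators(t, beta, d_delta_sw, torques, Mz, v, v_ref):
    """t: time stamps (T,); beta: sideslip angle (T,);
    d_delta_sw: steering-wheel rate (T,); torques: wheel torques (T, 4);
    Mz: additional yaw moment (T,); v, v_ref: actual/reference speed (T,)."""
    dt = np.diff(t, prepend=t[0])
    eps_s = np.sum(beta ** 2 * dt)                    # handling stability
    eps_driver = np.sum(d_delta_sw ** 2 * dt)         # driver workload
    eps_motor = np.sum(np.sum(torques ** 2, 1) * dt)  # motor load
    eps_Mz = np.sum(Mz ** 2 * dt)                     # yaw-moment usage
    eps_v = np.sum((v - v_ref) ** 2 * dt)             # velocity tracking
    return eps_s, eps_driver, eps_motor, eps_Mz, eps_v
```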
4.3. Training Performance
4.4. DLC Maneuver on Slippery Road
4.5. DLC Maneuver on Joint Road
4.6. Step Steering Maneuver
4.7. Driving Cycles
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Wu, J.; Zhang, J.; Nie, B.; Liu, Y.; He, X. Adaptive Control of PMSM Servo System for Steering-by-Wire System With Disturbances Observation. IEEE Trans. Transp. Electrif. 2022, 8, 2015–2028.
2. Wu, J.; Kong, Q.; Yang, K.; Liu, Y.; Cao, D.; Li, Z. Research on the Steering Torque Control for Intelligent Vehicles Co-Driving With the Penalty Factor of Human–Machine Intervention. IEEE Trans. Syst. Man Cybern. Syst. 2023, 53, 59–70.
3. Lei, F.; Bai, Y.; Zhu, W.; Liu, J. A novel approach for electric powertrain optimization considering vehicle power performance, energy consumption and ride comfort. Energy 2019, 167, 1040–1050.
4. Karki, A.; Phuyal, S.; Tuladhar, D.; Basnet, S.; Shrestha, B.P. Status of Pure Electric Vehicle Power Train Technology and Future Prospects. Appl. Syst. Innov. 2020, 3, 35.
5. Dalboni, M.; Tavernini, D.; Montanaro, U.; Soldati, A.; Concari, C.; Dhaens, M.; Sorniotti, A. Nonlinear Model Predictive Control for Integrated Energy-Efficient Torque-Vectoring and Anti-Roll Moment Distribution. IEEE/ASME Trans. Mechatron. 2021, 26, 1212–1224.
6. Chatzikomis, C.; Zanchetta, M.; Gruber, P.; Sorniotti, A.; Modic, B.; Motaln, T.; Blagotinsek, L.; Gotovac, G. An energy-efficient torque-vectoring algorithm for electric vehicles with multiple motors. Mech. Syst. Sig. Process. 2019, 128, 655–673.
7. Xu, W.; Chen, H.; Zhao, H.; Ren, B. Torque optimization control for electric vehicles with four in-wheel motors equipped with regenerative braking system. Mechatronics 2019, 57, 95–108.
8. Hu, X.; Wang, P.; Hu, Y.; Chen, H. A stability-guaranteed and energy-conserving torque distribution strategy for electric vehicles under extreme conditions. Appl. Energy 2020, 259, 114162.
9. Ding, S.H.; Liu, L.; Zheng, W.X. Sliding Mode Direct Yaw-Moment Control Design for In-Wheel Electric Vehicles. IEEE Trans. Ind. Electron. 2017, 64, 6752–6762.
10. Zhao, B.; Xu, N.; Chen, H.; Guo, K.; Huang, Y. Stability control of electric vehicles with in-wheel motors by considering tire slip energy. Mech. Syst. Sig. Process. 2019, 118, 340–359.
11. Zhang, L.; Chen, H.; Huang, Y.; Wang, P.; Guo, K. Human-Centered Torque Vectoring Control for Distributed Drive Electric Vehicle Considering Driving Characteristics. IEEE Trans. Veh. Technol. 2021, 70, 7386–7399.
12. Li, Q.; Zhang, J.; Li, L.; Wang, X.; Zhang, B.; Ping, X. Coordination Control of Maneuverability and Stability for Four-Wheel-Independent-Drive EV Considering Tire Sideslip. IEEE Trans. Transp. Electrif. 2022, 8, 3111–3126.
13. Deng, H.; Zhao, Y.; Nguyen, A.T.; Huang, C. Fault-Tolerant Predictive Control With Deep-Reinforcement-Learning-Based Torque Distribution for Four In-Wheel Motor Drive Electric Vehicles. IEEE/ASME Trans. Mechatron. 2023, early access.
14. Aradi, S. Survey of Deep Reinforcement Learning for Motion Planning of Autonomous Vehicles. IEEE Trans. Intell. Transp. Syst. 2020, 23, 740–759.
15. Zhu, Y.; Wang, Z.; Chen, C.; Dong, D. Rule-Based Reinforcement Learning for Efficient Robot Navigation With Space Reduction. IEEE/ASME Trans. Mechatron. 2022, 27, 846–857.
16. Kiran, B.R.; Sobh, I.; Talpaert, V.; Mannion, P.; Sallab, A.A.A.; Yogamani, S.; Pérez, P. Deep Reinforcement Learning for Autonomous Driving: A Survey. IEEE Trans. Intell. Transp. Syst. 2022, 23, 4909–4926.
17. Wei, H.; Zhang, N.; Liang, J.; Ai, Q.; Zhao, W.; Huang, T.; Zhang, Y. Deep reinforcement learning based direct torque control strategy for distributed drive electric vehicles considering active safety and energy saving performance. Energy 2022, 238, 121725.
18. Peng, H.; Wang, W.; Xiang, C.; Li, L.; Wang, X. Torque Coordinated Control of Four In-Wheel Motor Independent-Drive Vehicles With Consideration of the Safety and Economy. IEEE Trans. Veh. Technol. 2019, 68, 9604–9618.
19. Cabrera, J.A.; Ortiz, A.; Carabias, E.; Simon, A. An Alternative Method to Determine the Magic Tyre Model Parameters Using Genetic Algorithms. Veh. Syst. Dyn. 2004, 41, 109–127.
20. Alagappan, A.; Rao, K.V.N.; Kumar, R.K. A comparison of various algorithms to extract Magic Formula tyre model coefficients for vehicle dynamics simulations. Veh. Syst. Dyn. 2015, 53, 154–178.
21. Hu, C.; Wang, R.R.; Yan, F.J.; Chen, N. Should the Desired Heading in Path Following of Autonomous Vehicles be the Tangent Direction of the Desired Path? IEEE Trans. Intell. Transp. Syst. 2015, 16, 3084–3094.
22. Ji, X.; He, X.; Lv, C.; Liu, Y.; Wu, J. A vehicle stability control strategy with adaptive neural network sliding mode theory based on system uncertainty approximation. Veh. Syst. Dyn. 2018, 56, 923–946.
23. Zhang, H.; Liang, J.; Jiang, H.; Cai, Y.; Xu, X. Stability Research of Distributed Drive Electric Vehicle by Adaptive Direct Yaw Moment Control. IEEE Access 2019, 7, 106225–106237.
24. Houska, B.; Ferreau, H.J.; Diehl, M. An auto-generated real-time iteration algorithm for nonlinear MPC in the microsecond range. Automatica 2011, 47, 2279–2285.
25. Wang, J.; Luo, Z.; Wang, Y.; Yang, B.; Assadian, F. Coordination Control of Differential Drive Assist Steering and Vehicle Stability Control for Four-Wheel-Independent-Drive EV. IEEE Trans. Veh. Technol. 2018, 67, 11453–11467.
26. Deng, H.; Zhao, Y.; Feng, S.; Wang, Q.; Zhang, C.; Lin, F. Torque vectoring algorithm based on mechanical elastic electric wheels with consideration of the stability and economy. Energy 2021, 219, 119643.
27. Wu, X.; Zhou, B.; Wen, G.; Long, L.; Cui, Q. Intervention criterion and control research for active front steering with consideration of road adhesion. Veh. Syst. Dyn. 2018, 56, 553–578.
28. Zhai, L.; Sun, T.M.; Wang, J. Electronic Stability Control Based on Motor Driving and Braking Torque Distribution for a Four In-Wheel Motor Drive Electric Vehicle. IEEE Trans. Veh. Technol. 2016, 65, 4726–4739.
29. Chen, X.; Wang, C.; Zhou, Z.; Ross, K. Randomized Ensembled Double Q-Learning: Learning Fast Without a Model. arXiv 2021, arXiv:2101.05982.
30. Parra, A.; Tavernini, D.; Gruber, P.; Sorniotti, A.; Zubizarreta, A.; Perez, J. On Nonlinear Model Predictive Control for Energy-Efficient Torque-Vectoring. IEEE Trans. Veh. Technol. 2021, 70, 173–188.
31. Mirzaei, M. A new strategy for minimum usage of external yaw moment in vehicle dynamic control system. Transp. Res. Part C Emerg. Technol. 2010, 18, 213–224.
| Algorithm | Relative residual | 10 kN | 15 kN | 20 kN |
|---|---|---|---|---|
| FTO | Longitudinal force | 1.40% | 1.37% | 1.40% |
| GA | Longitudinal force | 4.61% | 2.21% | 1.73% |
| PSO | Longitudinal force | 3.44% | 1.74% | 1.66% |
| FTO | Lateral force | 1.48% | 1.37% | 1.39% |
| GA | Lateral force | 1.78% | 1.63% | 1.48% |
| PSO | Lateral force | 1.65% | 1.69% | 1.56% |
| Controller | ε_s | vs. w/o control | ε_driver | vs. w/o control | ε_motor | vs. w/o control | ε_Mz | ε_v |
|---|---|---|---|---|---|---|---|---|
| RLES | 0.4085 | −96% | 16.28 | −81% | 20,820 | −93% | 735,200 | 1.1 × 10⁻⁵ |
| MPC-CO | 0.5640 | −94% | 19.58 | −77% | 23,430 | −92% | 887,200 | 1.5 × 10⁻⁵ |
| LQR-EQ | 3.6532 | −61% | 32.54 | −62% | 29,680 | −90% | 892,000 | 3.5 × 10⁻⁴ |
| w/o control | 9.260 | – | 85.46 | – | 306,300 | – | 0 | 2.9 × 10² |
| Controller | ε_s | vs. w/o control | ε_driver | vs. w/o control | ε_motor | vs. w/o control | ε_Mz | ε_v |
|---|---|---|---|---|---|---|---|---|
| RLES | 0.5574 | −94% | 16.28 | −81% | 20,820 | −93% | 735,200 | 0.88 |
| MPC-CO | 0.6368 | −93% | 19.58 | −77% | 23,430 | −92% | 887,200 | 1.42 |
| LQR-EQ | 4.2190 | −57% | 32.54 | −62% | 29,680 | −90% | 892,000 | 2.47 |
| w/o control | 9.707 | – | 85.46 | – | 306,300 | – | 0 | 36.78 |
| Controller | ε_s | vs. w/o control | ε_driver | vs. w/o control | ε_motor | ε_Mz | ε_v |
|---|---|---|---|---|---|---|---|
| RLES | 0.0183 | −62% | 42.46 | −1.7% | 9930 | 1 × 10⁷ | 0.01895 |
| MPC-CO | 0.0193 | −60% | 42.46 | −1.7% | 9873 | 1 × 10⁷ | 0.0273 |
| LQR-EQ | 0.0274 | −43% | 43.21 | 0.1% | 4166 | 170,100 | 0.04486 |
| w/o control | 0.0483 | – | 43.18 | – | 132.7 | 0 | 2.202 |