An Energy Management Strategy for FCHEVs Using Deep Reinforcement Learning with Thermal Runaway Fault Diagnosis Considering the Thermal Effects and Durability
Abstract
1. Introduction
- To mitigate thermal runaway under high-power operation, this paper proposes a dual-layer protection framework. The upper layer integrates a temperature fault diagnosis-based penalty into the EMS to proactively suppress heat generation. The lower layer employs a real-time safety-constrained power regulator that dynamically adjusts the output limits of the LIBs and PEMFCs (an illustrative sketch of this lower-layer regulation follows this list). This strategy-to-execution coordination enables proactive thermal management, improving both system safety and operational reliability.
- To overcome Q-value overestimation and policy complexity in DRL-based energy management, this work proposes a DSAC-based EMS for FCHEVs. By embedding power loss, SoH degradation, and temperature fault diagnosis-based constraints into a composite penalty function, the method reduces policy dimensionality and achieves multi-objective optimization of economy, durability, and safety. The DSAC algorithm improves learning efficiency and value-estimation accuracy by learning the return distribution, including its variance, with a single critic network.
- To enhance safety under high-temperature conditions, this paper designs a DSAC-based safety-constrained controller for power action optimization. It continuously monitors source temperatures and activates automatically when thresholds are exceeded. The controller intelligently selects actions that meet power demand while satisfying thermal constraints, ensuring safe and efficient operation through real-time, adaptive decision-making.
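The Python sketch below illustrates the kind of lower-layer, safety-constrained power regulation described above. It is not the authors' implementation: the temperature thresholds, power limits, and linear derating rule are illustrative assumptions.

```python
# Hypothetical sketch of a lower-layer safety-constrained power regulator.
# Thresholds, limits, and the linear derating rule are illustrative assumptions,
# not values taken from the paper.

from dataclasses import dataclass


@dataclass
class ThermalLimits:
    t_warn: float    # temperature at which derating starts (degC)
    t_max: float     # temperature at which output is fully cut (degC)
    p_max: float     # nominal maximum output power (kW)


def derate(power_cmd: float, temp: float, lim: ThermalLimits) -> float:
    """Scale a commanded source power down as temperature approaches t_max."""
    if temp <= lim.t_warn:
        scale = 1.0                       # normal operation, no constraint
    elif temp >= lim.t_max:
        scale = 0.0                       # thermal fault: block further heating
    else:                                 # linear derating between the thresholds
        scale = (lim.t_max - temp) / (lim.t_max - lim.t_warn)
    return max(0.0, min(power_cmd, scale * lim.p_max))


def regulate(p_fc_cmd, p_bat_cmd, t_fc, t_bat, fc_lim, bat_lim, p_demand):
    """Clip the EMS power split to thermal limits, letting the other source
    pick up (part of) the shortfall if it still has thermal headroom."""
    p_fc = derate(p_fc_cmd, t_fc, fc_lim)
    p_bat = derate(p_bat_cmd, t_bat, bat_lim)
    shortfall = p_demand - (p_fc + p_bat)
    if shortfall > 0:                     # redistribute within remaining headroom
        p_bat = min(p_bat + shortfall, derate(bat_lim.p_max, t_bat, bat_lim))
    return p_fc, p_bat
```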
2. Powertrain Model
2.1. FCHEV Powertrain System Overview
2.2. Electric–Thermal PEMFC Model
2.3. Electric–Thermal LIB Model
2.4. SoH Models of PEMFCs and LIBs
3. Design of DSAC-Based EMS with Thermal Flexibility Constraints Considering the Thermal Effects and Durability
3.1. Training of EMS
3.2. Safety Constraint Control System
Algorithm 1: DSAC-based energy management and energy source thermal constraint algorithm
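The body of Algorithm 1 is not reproduced on this page; the Python-style outline below only suggests the control loop its title implies, in which the DSAC policy proposes a power split, the thermal constraint layer (TSCC) corrects it, and the composite reward drives the update. The interfaces (`env`, `agent`, `tscc`) are hypothetical placeholders, not the authors' code.

```python
# Illustrative outline of a DSAC-based EMS training loop with an energy-source
# thermal constraint step (TSCC). All objects passed in are assumed placeholders.

def train_ems(env, agent, tscc, episodes=500):
    for ep in range(episodes):
        state = env.reset()               # e.g. [v, acc, SoC, T_fc, T_bat, ...]
        done = False
        while not done:
            action = agent.act(state)                    # proposed PEMFC/LIB power split
            safe_action = tscc.constrain(action, state)  # thermal safety correction
            next_state, reward, done = env.step(safe_action)
            # composite reward: fuel economy + SoH degradation + temperature fault
            # penalty + SoC regulation (weights as listed in the parameter table)
            agent.buffer.push(state, safe_action, reward, next_state, done)
            agent.update()                # distributional soft actor-critic update
            state = next_state
    return agent
```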
4. Results and Discussion
4.1. Verification Preparation
4.2. Energy Distribution Control Verification
4.3. Temperature Control Verification
5. Conclusions
- Under the same operating conditions, and considering the thermal characteristics and durability of the powertrain system, the DSAC-based EMS reduces energy consumption by 17.69%, 39.91%, and 32.46% compared with the DDPG-, TD3-, and SAC-based EMSs, respectively, and reduces SoH consumption by 26.52%, 29.38%, and 27.69%, respectively. After adding the TSCC, the DDPG-, TD3-, SAC-, and DSAC-based EMSs achieve additional energy savings of 0%, 2.03%, 25.29%, and −1.93% and reductions in SoH consumption of 6.07%, 3.7%, 20.32%, and 1.65%, respectively (these relative savings can be reproduced from the results table, as sketched after this list).
- The proposed TSCC with flexible temperature constraints effectively regulates the temperatures of the LIBs and the PEMFC stack and reduces their peak values. However, without a dedicated thermal management system, the energy source temperatures cannot be maintained within the optimal range more effectively, which limits further reductions in energy consumption and leaves some potential lifespan degradation unmitigated.
- The method has good versatility and can be extended to other types of FCVs to achieve optimal fuel economy, durability, and thermal safety of the powertrain system under unpredictable driving scenarios.
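The relative savings quoted in the first conclusion can be checked directly against the effective fuel consumption values listed in the results table at the end of this page; the short computation below uses only those tabulated numbers.

```python
# Reproduce the relative energy savings quoted in the Conclusions from the
# effective fuel consumption (kg) reported in the results table (WLTC).
fuel_rl   = {"DDPG": 0.503, "TD3": 0.689, "SAC": 0.613, "DSAC": 0.414}
fuel_tscc = {"DDPG": 0.503, "TD3": 0.675, "SAC": 0.458, "DSAC": 0.422}

# DSAC versus the other RL baselines (without TSCC):
for alg in ("DDPG", "TD3", "SAC"):
    saving = (fuel_rl[alg] - fuel_rl["DSAC"]) / fuel_rl[alg]
    print(f"DSAC vs {alg}: {saving:.2%}")    # 17.69%, 39.91%, 32.46%

# Effect of adding the TSCC to each algorithm:
for alg in fuel_rl:
    saving = (fuel_rl[alg] - fuel_tscc[alg]) / fuel_rl[alg]
    print(f"{alg} + TSCC: {saving:.2%}")     # 0.00%, 2.03%, 25.29%, -1.93%
```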
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
| Parameter | Value | Unit |
|---|---|---|
| FCV mass | 2046 | kg |
| Equivalent windward area | 2.62 | m² |
| Wheel radius | 0.35 | m |
| Gear ratio of final drive | 9 | – |
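For orientation, the sketch below shows how these vehicle parameters typically enter the wheel-side power demand through the standard longitudinal road-load equation; the rolling-resistance coefficient, drag coefficient, air density, and drivetrain efficiency are assumed values, not taken from the paper.

```python
# Hypothetical road-load power demand using the vehicle parameters above.
# Rolling-resistance coefficient, drag coefficient, air density, and driveline
# efficiency are illustrative assumptions (not given in the paper).
import math

M = 2046.0        # vehicle mass (kg), from the parameter table
A = 2.62          # equivalent windward area (m^2), from the parameter table
G = 9.81          # gravitational acceleration (m/s^2)
RHO = 1.2         # air density (kg/m^3), assumed
CD = 0.30         # aerodynamic drag coefficient, assumed
F_R = 0.012       # rolling-resistance coefficient, assumed
ETA_T = 0.92      # driveline efficiency, assumed


def power_demand(v: float, a: float, grade: float = 0.0) -> float:
    """Wheel-side power demand (kW) at speed v (m/s) and acceleration a (m/s^2)."""
    slope = math.atan(grade)
    f_roll = M * G * F_R * math.cos(slope)        # rolling resistance (N)
    f_aero = 0.5 * RHO * CD * A * v ** 2          # aerodynamic drag (N)
    f_grade = M * G * math.sin(slope)             # grade resistance (N)
    f_inertia = M * a                             # acceleration resistance (N)
    return (f_roll + f_aero + f_grade + f_inertia) * v / (ETA_T * 1000.0)


print(f"P_dem at 90 km/h cruise: {power_demand(90 / 3.6, 0.0):.1f} kW")
```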
| Parameter | Typical Value/Range | Unit |
|---|---|---|
| Global efficiency of the fuel cell system | 0.45–0.60 | — |
| Lower heating value (LHV) of hydrogen | 33.3 | kWh/kg |
| Fuel cell degradation cost per percentage point of SoH loss | 750 | $ |
| Unit price of hydrogen | 16 | $/kg |
| Unit price of the lithium-ion battery pack | 200 | $/kWh |
| Lithium-ion battery degradation cost per percentage point of SoH loss | 375 | $ |
| Lithium-ion battery temperature penalty weight | −10 | — |
| Fuel cell temperature penalty weight | −10 | — |
| Weighting factor for the SoC penalty term | 10 | — |
| Current operating temperature of the fuel cell | 65–70 (optimal) | °C |
| Current operating temperature of the lithium-ion battery | 29–36 (optimal) | °C |
| Ideal initial state of charge (SoC) of the battery | 0.5–0.6 | — |
| Real-time output power of the fuel cell | varies with demand | kW |
| Real-time output power of the battery | varies with demand | kW |
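As a rough illustration of how the table's economic and penalty parameters can be combined into a composite per-step cost of the kind the reward function penalizes, consider the sketch below; the exact functional form used in the paper may differ, and the per-step SoH, fault-flag, and SoC terms here are placeholders.

```python
# Illustrative per-step composite cost built from the parameter table above.
# The exact reward formulation in the paper may differ; this structure is an
# assumption for illustration only.

PRICE_H2 = 16.0       # $/kg, unit price of hydrogen
COST_FC_SOH = 750.0   # $ per percentage point of fuel cell SoH loss
COST_BAT_SOH = 375.0  # $ per percentage point of battery SoH loss
W_TEMP_FC = -10.0     # fuel cell temperature penalty weight
W_TEMP_BAT = -10.0    # battery temperature penalty weight
W_SOC = 10.0          # SoC penalty weight


def step_cost(m_h2_kg, d_soh_fc_pct, d_soh_bat_pct,
              fc_temp_fault, bat_temp_fault, soc, soc_ref=0.55):
    """Composite economy + durability + safety cost for one control step."""
    economy = PRICE_H2 * m_h2_kg                       # hydrogen expenditure ($)
    durability = COST_FC_SOH * d_soh_fc_pct + COST_BAT_SOH * d_soh_bat_pct
    # temperature fault flags (0/1) are weighted by the table's negative
    # penalty weights, so subtracting them adds a positive cost
    safety = -(W_TEMP_FC * fc_temp_fault + W_TEMP_BAT * bat_temp_fault)
    soc_penalty = W_SOC * (soc - soc_ref) ** 2         # keep SoC near its target
    return economy + durability + safety + soc_penalty


# Example: 0.2 g of H2, small SoH losses, no thermal fault, SoC = 0.52
print(f"cost = {step_cost(2e-4, 1e-5, 2e-5, 0, 0, 0.52):.4f} $")
```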
| Algorithm | Distributional Learning | Maximum Entropy | Double Q-Network | Continuous Action | Suitable for EMS |
|---|---|---|---|---|---|
| DDPG | no | no | no | yes | yes |
| TD3 | no | no | yes | yes | yes |
| SAC | no | yes | yes | yes | yes |
| DSAC | yes | yes | no | yes | yes |
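The distinguishing column above is distributional learning: instead of a scalar Q-value, the DSAC critic outputs a Gaussian over the soft return and is trained to match a distributional target, which is how the single-network variance-aware learning mentioned in the Introduction is obtained. The PyTorch-style fragment below sketches that idea; the layer sizes and the simplified (negative log-likelihood) loss are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch of a distributional critic in the spirit of DSAC: the network
# predicts a Gaussian over the soft return and is trained by maximizing the
# log-likelihood of a (detached) TD target under that Gaussian.
import torch
import torch.nn as nn


class GaussianCritic(nn.Module):
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),             # mean and log-std of the return
        )

    def forward(self, s, a):
        mean, log_std = self.net(torch.cat([s, a], dim=-1)).chunk(2, dim=-1)
        return mean, log_std.clamp(-5.0, 2.0).exp()


def critic_loss(critic, s, a, td_target):
    """Negative log-likelihood of the TD target under N(mean, std)."""
    mean, std = critic(s, a)
    dist = torch.distributions.Normal(mean, std)
    return -dist.log_prob(td_target.detach()).mean()
```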
| Algorithm | Fuel Consumption | Temperature and SoH | Thermal Safety Constraints |
|---|---|---|---|
| DDPG | Considering | Considering | Neglecting |
| TD3 | Considering | Considering | Neglecting |
| SAC | Considering | Considering | Neglecting |
| DSAC | Considering | Considering | Neglecting |
| DDPG_TSCC | Considering | Considering | Considering |
| TD3_TSCC | Considering | Considering | Considering |
| SAC_TSCC | Considering | Considering | Considering |
| DSAC_TSCC | Considering | Considering | Considering |
| Strategy/Driving Cycle | Algorithm | Effective Fuel Consumption (kg) | Energy Loss ($) |
|---|---|---|---|
| RL, WLTC | DDPG | 0.503 | 0.494 |
| RL, WLTC | TD3 | 0.689 | 0.514 |
| RL, WLTC | SAC | 0.613 | 0.502 |
| RL, WLTC | DSAC | 0.414 | 0.363 |
| TSCC-RL, WLTC | DDPG | 0.503 | 0.464 |
| TSCC-RL, WLTC | TD3 | 0.675 | 0.495 |
| TSCC-RL, WLTC | SAC | 0.458 | 0.400 |
| TSCC-RL, WLTC | DSAC | 0.422 | 0.357 |