Energy Management for Microgrids with Hybrid Hydrogen-Battery Storage: A Reinforcement Learning Framework Integrated Multi-Objective Dynamic Regulation
Abstract
1. Introduction
- An HHB-ESS is integrated into the MG architecture to leverage the complementary characteristics of short-term and long-term energy storage technologies.
- Through comprehensive modeling of the MG system, three key performance indicators are formally defined to capture distinct aspects of system performance: EC, SRS, and BLO, thereby establishing a foundation for multi-objective optimization.
- Based on DRAM, a novel DRL algorithm named MOATD3 is proposed to address the multi-objective scheduling problem in the MG. In simulation, the HHB-ESS scheduled by MOATD3 attains the lowest EC, SRS, and BLO among the compared configurations (a battery-only system under MOATD3, and the HHB-ESS under TD3 and DDPG) in both good-weather and bad-weather scenarios, which supports its practical applicability and robustness.
2. Modeling of the MG Elements
2.1. RES Generation Modeling
2.2. Battery Storage Charging and Discharging Power and Aging Modeling
2.3. Hydrogen Energy Storage System Modeling
2.4. Multi-Objective Optimization Modeling
- Obj 1: The system’s optimization cost includes the transaction cost, the degradation cost of the BESS, and the operation and maintenance costs of both the BESS and the HSS. The objective function is formulated as:
- Obj 2: To ensure the stable operation of the MG, it is necessary to minimize the fluctuation of power exchanged with the main grid. This objective is formulated as:
- Obj 3: Battery degradation poses a significant challenge to the long-term performance and reliability of ESS. As degradation progresses, the battery’s usable SoC range shrinks, directly reducing the amount of energy available for flexible dispatch. Moreover, battery aging leads to increased replacement costs and, upon reaching end-of-life, may result in disposal and environmental concerns. Therefore, this paper incorporates BLO as a dedicated optimization objective:
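The three objectives above can be illustrated with a minimal Python sketch. The paper's own objective functions are not reproduced in this section, so every formula below is an assumption chosen for illustration: the sign convention for grid power, the use of mean absolute step-to-step change as the fluctuation measure, and a linear equivalent-full-cycle model for battery life loss. The function names and parameters (`economic_cost`, `exchange_fluctuation`, `battery_life_loss`, `cycle_life`, `dt_h`) are hypothetical.

```python
import numpy as np

def economic_cost(p_grid_kw, buy_price, sell_price, degradation_cost, om_cost, dt_h=1.0):
    """Obj 1 sketch: transaction cost plus degradation and O&M costs.

    Assumed sign convention: p_grid_kw > 0 buys from the main grid, < 0 sells to it.
    """
    p = np.asarray(p_grid_kw, dtype=float)
    bought = np.clip(p, 0.0, None) * np.asarray(buy_price) * dt_h
    sold = np.clip(-p, 0.0, None) * np.asarray(sell_price) * dt_h
    transaction = float(np.sum(bought - sold))
    return transaction + float(np.sum(degradation_cost)) + float(np.sum(om_cost))

def exchange_fluctuation(p_grid_kw):
    """Obj 2 sketch: mean absolute step-to-step change of exchanged power (assumed metric)."""
    p = np.asarray(p_grid_kw, dtype=float)
    return float(np.mean(np.abs(np.diff(p)))) if p.size > 1 else 0.0

def battery_life_loss(p_batt_kw, capacity_kwh, cycle_life, dt_h=1.0):
    """Obj 3 sketch: fraction of life consumed under an assumed linear cycle-counting model."""
    throughput_kwh = float(np.sum(np.abs(p_batt_kw))) * dt_h
    equivalent_full_cycles = throughput_kwh / (2.0 * capacity_kwh)
    return equivalent_full_cycles / cycle_life

# Toy four-step horizon with flat prices, used only to exercise the functions.
p_grid = [10.0, -5.0, 20.0, 20.0]
print(economic_cost(p_grid, buy_price=1.0, sell_price=0.5, degradation_cost=0.0, om_cost=0.0))
print(exchange_fluctuation(p_grid))
print(battery_life_loss([50.0, -50.0], capacity_kwh=200.0, cycle_life=4000))
```

Keeping the three quantities as separate functions mirrors the multi-objective formulation: each can be tracked and weighted independently rather than being merged into a single cost at modeling time.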
3. Proposed MOATD3 Algorithm
3.1. Modeling of MDP
3.2. Scheduling Method of the MOATD3 Algorithm
3.2.1. Theoretical Description
3.2.2. Algorithm Workflow
Algorithm 1: Training process of the proposed MOATD3 method
1: Initialize the actor network and the critic networks with random parameters.
2: Initialize the target networks.
3: Initialize the replay buffer.
4: for episode = 1 to N_episodes do
5:   Observe the initial state from the environment.
6:   for t = 0 to T do
7:     Select an action with exploration noise.
8:     Project the action onto the feasible set defined by (4), (8), (11), (13) and (14).
9:     Execute the action; observe the next state and the sub-rewards.
10:    Compute the dynamic weights.
11:    Compute the total reward by (20).
12:    Store the transition in the replay buffer.
13:    Sample a mini-batch of N transitions from the replay buffer.
14:    Compute the target Q-value with target policy smoothing by (23).
15:    Update the critics by minimizing the MSE loss.
16:    Delayed actor update:
17:    if t mod d = 0 then
18:      Update the actor network.
19:      Update the target networks by (27).
20:    end if
21:  end for
22: end for
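Several steps of Algorithm 1 can be sketched in NumPy. Equations (20), (23), and (27) and the feasibility constraints are not reproduced in this section, so the sketch falls back on the standard TD3 forms (clipped double-Q target with target policy smoothing, Polyak averaging for the targets) and uses a plain box projection as a stand-in for step 8. The weighted sum in `total_reward` is likewise an assumption, and all function names and default hyperparameters are hypothetical.

```python
import numpy as np

def project_action(action, low, high):
    """Step 8 stand-in: box projection; the paper's actual feasible set is not shown here."""
    return np.clip(action, low, high)

def total_reward(sub_rewards, weights):
    """Step 11 stand-in: weighted sum of sub-rewards (the weighting rule is assumed)."""
    return float(np.dot(weights, sub_rewards))

def td3_target(reward, done, next_state, target_actor, target_q1, target_q2,
               low, high, rng, gamma=0.99, sigma=0.2, noise_clip=0.5):
    """Step 14: standard TD3 target with smoothing noise and the minimum of two critics."""
    next_action = target_actor(next_state)
    noise = np.clip(rng.normal(0.0, sigma, size=np.shape(next_action)),
                    -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, low, high)
    q_min = np.minimum(target_q1(next_state, next_action),
                       target_q2(next_state, next_action))
    return reward + gamma * (1.0 - done) * q_min

def soft_update(target_params, online_params, tau=0.005):
    """Step 19: Polyak averaging of the target parameters (standard TD3)."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]

def should_update_actor(t, d=2):
    """Step 17: actor and targets are updated once every d critic updates."""
    return t % d == 0

# Toy usage with constant stand-in networks.
rng = np.random.default_rng(0)
y = td3_target(reward=1.0, done=0.0, next_state=np.zeros(3),
               target_actor=lambda s: np.zeros(2),
               target_q1=lambda s, a: 5.0, target_q2=lambda s, a: 3.0,
               low=-1.0, high=1.0, rng=rng, gamma=0.9)
print(y)
```

Taking the minimum of the two target critics is what limits overestimation in TD3, and delaying the actor update lets the critics settle before the policy moves; both are properties of the base algorithm rather than of the multi-objective extension.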
4. Simulation Results
4.1. Simulation Setup
4.2. The Scheduling Results
4.3. Comparative Experiment Analysis
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Component | Parameter | Value |
---|---|---|
RES | | 5 kW |
RES | | 10 m/s |
RES | | 3 m/s |
RES | | 25 m/s |
RES | | 0.327 W/m² |
RES | | 0.88 |
RES | | −0.0038 |
RES | | 1000 W/m² |
RES | | 25 °C |
Battery (2 units) | | 0.95 |
Battery (2 units) | | 0.95 |
Battery (2 units) | | 200 kWh, 400 kWh |
Battery (2 units) | | 100 kW, 200 kW |
Battery (2 units) | | 0.1 |
Battery (2 units) | | 0.9 |
HSS | | 0.7 |
HSS | | 100 kW |
HSS | LHV | 33.33 kWh/kg |
HSS | | 0.98 |
HSS | | 0.98 |
HSS | | 4.5 kg/h |
HSS | | 6 kg |
HSS | | 0.98 |
HSS | | 4.5 kg/h |
Time Period | Buying Electricity Price (Yuan) | Selling Electricity Price (Yuan) |
---|---|---|
0:00–7:00 | 0.259 | 0.159 |
7:00–11:00 and 16:00–20:00 | 1.035 | 0.603 |
11:00–16:00 and 20:00–24:00 | 0.607 | 0.303 |
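The time-of-use tariff in the table above can be encoded as a small lookup, shown here as a sketch. The function name `tou_prices` is hypothetical, and treating each period as closed on the left and open on the right (so that 7:00 itself falls in the peak band) is an assumption, since the table does not state how boundaries are assigned.

```python
def tou_prices(hour):
    """Return (buying price, selling price) in Yuan for an hour of day in [0, 24).

    Period boundaries are assumed left-closed, right-open.
    """
    if not 0 <= hour < 24:
        raise ValueError("hour must be in [0, 24)")
    if hour < 7:
        return 0.259, 0.159  # 0:00 to 7:00
    if 7 <= hour < 11 or 16 <= hour < 20:
        return 1.035, 0.603  # 7:00 to 11:00 and 16:00 to 20:00
    return 0.607, 0.303      # 11:00 to 16:00 and 20:00 to 24:00

# Price pair for each hour of a day, e.g. as an input series for a scheduler.
daily_prices = [tou_prices(h) for h in range(24)]
print(daily_prices[8], daily_prices[12], daily_prices[3])
```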
Case | ESS | Algorithm |
---|---|---|
Proposed | HHB-ESS | MOATD3 |
1 | BESS | MOATD3 |
2 | HHB-ESS | TD3 |
3 | HHB-ESS | DDPG |
Scenarios | Method | EC (Yuan) | SRS (kW) | BLO (%) |
---|---|---|---|---|
Good weather | HHBESS-MOATD3 | 2100.4 | 48 | 1.6 |
Good weather | BESS-MOATD3 | 2136.6 | 58 | 2.3 |
Good weather | HHBESS-TD3 | 2117.1 | 60 | 2.1 |
Good weather | HHBESS-DDPG | 2118.0 | 66 | 3.0 |
Bad weather | HHBESS-MOATD3 | 2275.3 | 62 | 2.2 |
Bad weather | BESS-MOATD3 | 2425.9 | 89 | 2.7 |
Bad weather | HHBESS-TD3 | 2309.7 | 72 | 2.5 |
Bad weather | HHBESS-DDPG | 2344.5 | 74 | 2.4 |
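To make the comparison easier to read, the relative reductions implied by the table can be computed directly. The sketch below only re-derives percentages from the tabulated values; the dictionary layout and the helper name `reduction_vs` are hypothetical.

```python
# Values copied from the comparison table: (EC in Yuan, SRS in kW, BLO in %).
RESULTS = {
    "Good weather": {
        "HHBESS-MOATD3": (2100.4, 48, 1.6),
        "BESS-MOATD3": (2136.6, 58, 2.3),
        "HHBESS-TD3": (2117.1, 60, 2.1),
        "HHBESS-DDPG": (2118.0, 66, 3.0),
    },
    "Bad weather": {
        "HHBESS-MOATD3": (2275.3, 62, 2.2),
        "BESS-MOATD3": (2425.9, 89, 2.7),
        "HHBESS-TD3": (2309.7, 72, 2.5),
        "HHBESS-DDPG": (2344.5, 74, 2.4),
    },
}
METRICS = ("EC", "SRS", "BLO")

def reduction_vs(scenario, baseline, proposed="HHBESS-MOATD3"):
    """Percentage reduction of each metric for `proposed` relative to `baseline`."""
    base = RESULTS[scenario][baseline]
    prop = RESULTS[scenario][proposed]
    return {m: 100.0 * (b - p) / b for m, b, p in zip(METRICS, base, prop)}

for scenario, methods in RESULTS.items():
    for baseline in methods:
        if baseline != "HHBESS-MOATD3":
            r = reduction_vs(scenario, baseline)
            print(scenario, baseline, {m: round(v, 1) for m, v in r.items()})
```

For instance, in the bad-weather scenario the SRS reduction relative to BESS-MOATD3 is (89 − 62)/89, roughly 30.3%, and in good weather the SRS reduction relative to HHBESS-TD3 is (60 − 48)/60 = 20%.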
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zheng, Y.; Jia, J.; An, D. Energy Management for Microgrids with Hybrid Hydrogen-Battery Storage: A Reinforcement Learning Framework Integrated Multi-Objective Dynamic Regulation. Processes 2025, 13, 2558. https://doi.org/10.3390/pr13082558