Optimal Power Flow for High Spatial and Temporal Resolution Power Systems with High Renewable Energy Penetration Using Multi-Agent Deep Reinforcement Learning
Abstract
1. Introduction
- The heterogeneous MAPPO (H-MAPPO) DRL model is proposed to address the OPF problem in multi-area power systems, significantly reducing training complexity and enhancing decision-making efficiency.
- A GNN layer is incorporated to extract spatiotemporal features of RES fluctuations and power grid topologies, enhancing the convergence and feasibility of OPF solutions.
- The performance of the proposed DRL model is validated using the RTS-GMLC test system, a near-real power system model with high spatial and temporal resolution and substantial RES penetration.
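The H-MAPPO design summarized above follows the centralized-training, decentralized-execution (CTDE) pattern: each area keeps its own actor sized to its local observation and generator count, while a single critic sees the joint state during training. A minimal numpy sketch of that structure (all dimensions, area counts, and layer shapes below are hypothetical illustrations, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    """A randomly initialized linear layer, returned as (weights, bias)."""
    return rng.normal(0.0, 0.1, (in_dim, out_dim)), np.zeros(out_dim)

def forward(layer, x):
    W, b = layer
    return np.tanh(x @ W + b)

# Heterogeneous agents: each area m has its own observation size and its own
# number of dispatchable generators, hence its own actor network.
area_obs_dims = {1: 12, 2: 20, 3: 16}  # hypothetical per-area observation sizes
area_act_dims = {1: 3, 2: 5, 3: 4}     # hypothetical generator counts per area

actors = {m: linear(area_obs_dims[m], area_act_dims[m]) for m in area_obs_dims}

# Centralized critic: during training it evaluates the concatenated global state.
global_dim = sum(area_obs_dims.values())
critic = linear(global_dim, 1)

# Decentralized execution: each actor maps only its local observation to
# active-power adjustments for its own generators.
local_obs = {m: rng.normal(size=d) for m, d in area_obs_dims.items()}
actions = {m: forward(actors[m], local_obs[m]) for m in actors}

# Centralized training step: the critic scores the joint observation.
joint_obs = np.concatenate([local_obs[m] for m in sorted(local_obs)])
value = forward(critic, joint_obs)
print({m: a.shape for m, a in actions.items()}, value.shape)
```

The point of the sketch is only the wiring: per-area actors with heterogeneous input/output sizes, one shared critic over the joint state.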
2. Problem Formulation
Markov Decision Process Formulation
3. Deep Reinforcement Learning Foundations
3.1. Heterogeneous Multi-Agent System
3.2. Multi-Agent Proximal Policy Optimization
3.3. Graph Neural Network
4. Simulation and Results
4.1. RTS-Grid Modernization Laboratory Consortium System
4.2. Experimental Setup
4.3. Results Analysis
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations and Notations
Abbreviation | Definition |
---|---|
OPF | optimal power flow |
RES | renewable energy source |
AI | artificial intelligence |
DRL | deep reinforcement learning |
GNN | graph neural network |
MAPPO | multi-agent proximal policy optimization |
H-MAPPO | heterogeneous multi-agent proximal policy optimization |
MDP | Markov decision process |
CTDE | centralized training and decentralized execution |
GMLC | Grid Modernization Laboratory Consortium |
CSP | concentrating solar power |
Notation | Definition |
---|---|
 | total cost of power generation |
 | number of generators |
 | active power output of generator k at time t |
 | reactive power output of generator k at time t |
 | quadratic cost function |
 | coefficients of the cost function |
 | net injection of the active power of the ith node at time t |
 | net injection of the reactive power of the ith node at time t |
 | sets of neighborhoods of node i and node v |
 | nodal voltage amplitude of the ith node at time t |
 | angle difference between node i and node j at time t |
 | conductance and susceptance of the branch |
 | upper and lower bounds of the active power output of generator k |
 | upper and lower bounds of the reactive power output of generator k |
 | active power on the branch |
 | upper and lower bounds of active power on the branch |
 | upper and lower bounds of the voltage amplitude at node i |
 | set of transmission sections |
 | set of tie lines in the zth transmission section |
 | upper and lower bounds of active power across transmission section z |
 | state space, action space, and state transition probability matrix of the MDP |
 | reward function |
 | local observation of area m at time t |
M | total number of areas in the multi-area power system |
 | vector of active power outputs of generators in area m at time t |
 | number of generators in area m |
 | power loss |
 | local observations of area m at time t |
 | local action of area m at time t |
 | active power output adjustment of generator i in area m at time t |
 | positive constants of the reward function |
 | mismatched power between total generation and total load |
 | loss function of the actor network |
 | importance sampling ratio |
 | clipping function |
 | advantage function |
 | loss function of the critic network |
 | discount factor |
r | return at time t |
 | evaluation of the observation at time t |
 | representation vector of node v at the lth layer of GraphSAGE |
 | learnable weight matrices of GraphSAGE |
 | elementwise mean function |
 | number of samples in the test dataset |
 | number of samples with converged OPF solutions |
 | number of samples with feasible OPF solutions |
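The actor loss, importance sampling ratio, clipping function, advantage, critic loss, and GraphSAGE update listed above correspond, in the standard MAPPO and GraphSAGE formulations, to the following expressions (reconstructed from the usual definitions; the paper's exact indexing and superscripts may differ):

```latex
% PPO clipped surrogate objective for the actor:
L^{\mathrm{actor}}(\theta) = \mathbb{E}_t\!\left[\min\!\left(r_t(\theta)\,\hat{A}_t,\;
  \mathrm{clip}\!\left(r_t(\theta),\,1-\epsilon,\,1+\epsilon\right)\hat{A}_t\right)\right],
\qquad
r_t(\theta) = \frac{\pi_\theta(a_t \mid o_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid o_t)}

% Critic loss: squared error between the predicted value and the discounted return:
L^{\mathrm{critic}}(\phi) = \mathbb{E}_t\!\left[\left(V_\phi(s_t)
  - \sum_{k \ge 0} \gamma^{k}\, r_{t+k}\right)^{2}\right]

% GraphSAGE mean-aggregator update for node v at layer l:
h_v^{(l+1)} = \sigma\!\left(W_1^{(l)} h_v^{(l)}
  + W_2^{(l)} \cdot \mathrm{mean}\!\left(\{\,h_u^{(l)} : u \in \mathcal{N}(v)\,\}\right)\right)
```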
References
- Skolfield, J.K.; Escobedo, A.R. Operations research in optimal power flow: A guide to recent and emerging methodologies and applications. Eur. J. Oper. Res. 2022, 300, 387–404.
- Mitrovic, M.; Lukashevich, A.; Vorobev, P.; Terzija, V.; Budennyy, S.; Maximov, Y.; Deka, D. Data-driven stochastic AC-OPF using Gaussian process regression. Int. J. Electr. Power Energy Syst. 2023, 152, 109249.
- Chen, Y.; Guo, Z.; Li, H.; Yang, Y.; Tadie, A.T.; Wang, G.; Hou, Y. Probabilistic optimal power flow for day-ahead dispatching of power systems with high-proportion renewable power sources. Sustainability 2020, 12, 518.
- Mhanna, S.; Mancarella, P. An exact sequential linear programming algorithm for the optimal power flow problem. IEEE Trans. Power Syst. 2021, 37, 666–679.
- Li, C.; Kies, A.; Zhou, K.; Schlott, M.; El Sayed, O.; Bilousova, M.; Stöcker, H. Optimal Power Flow in a highly renewable power system based on attention neural networks. Appl. Energy 2024, 359, 122779.
- Pan, X.; Chen, M.; Zhao, T.; Low, S.H. DeepOPF: A feasibility-optimized deep neural network approach for AC optimal power flow problems. IEEE Syst. J. 2022, 17, 673–683.
- Yi, Z.; Wang, X.; Yang, C.; Yang, C.; Niu, M.; Yin, W. Real-time sequential security-constrained optimal power flow: A hybrid knowledge-data-driven reinforcement learning approach. IEEE Trans. Power Syst. 2023, 39, 1664–1680.
- Wang, Z.Y.; Chiang, H.D. On the nonconvex feasible region of optimal power flow: Theory, degree, and impacts. Int. J. Electr. Power Energy Syst. 2024, 161, 110167.
- Bescos, B.; Fácil, J.M.; Civera, J.; Neira, J. DynaSLAM: Tracking, mapping, and inpainting in dynamic scenes. IEEE Robot. Autom. Lett. 2018, 3, 4076–4083.
- Chen, X.; Qu, G.; Tang, Y.; Low, S.; Li, N. Reinforcement learning for selective key applications in power systems: Recent advances and future challenges. IEEE Trans. Smart Grid 2022, 13, 2935–2958.
- Yan, Z.; Xu, Y. Real-time optimal power flow: A Lagrangian based deep reinforcement learning approach. IEEE Trans. Power Syst. 2020, 35, 3270–3273.
- Sayed, A.R.; Wang, C.; Anis, H.I.; Bi, T. Feasibility constrained online calculation for real-time optimal power flow: A convex constrained deep reinforcement learning approach. IEEE Trans. Power Syst. 2022, 38, 5215–5227.
- Chen, Y.; Du, Q.; Liu, H.; Cheng, L.; Younis, M.S. Improved proximal policy optimization algorithm for sequential security-constrained optimal power flow based on expert knowledge and safety layer. J. Mod. Power Syst. Clean Energy 2023, 12, 742–753.
- Zhou, Y.; Lee, W.J.; Diao, R.; Shi, D. Deep reinforcement learning based real-time AC optimal power flow considering uncertainties. J. Mod. Power Syst. Clean Energy 2021, 10, 1098–1109.
- Zhou, Y.; Zhang, B.; Xu, C.; Lan, T.; Diao, R.; Shi, D.; Wang, Z.; Lee, W.J. A data-driven method for fast AC optimal power flow solutions via deep reinforcement learning. J. Mod. Power Syst. Clean Energy 2020, 8, 1128–1139.
- Perera, A.; Kamalaruban, P. Applications of reinforcement learning in energy systems. Renew. Sustain. Energy Rev. 2021, 137, 110618.
- Liu, S.; Luo, W.; Zhou, Y.; Chen, K.; Zhang, Q.; Xu, H.; Guo, Q.; Song, M. Transmission interface power flow adjustment: A deep reinforcement learning approach based on multi-task attribution map. IEEE Trans. Power Syst. 2023, 39, 3324–3335.
- Chen, J.; Yu, T.; Pan, Z.; Zhang, M.; Deng, B. A scalable graph reinforcement learning algorithm based stochastic dynamic dispatch of power system under high penetration of renewable energy. Int. J. Electr. Power Energy Syst. 2023, 152, 109212.
- Du, W.; Ding, S. A survey on multi-agent deep reinforcement learning: From the perspective of challenges and applications. Artif. Intell. Rev. 2021, 54, 3215–3238.
- Zhang, Q.; Dehghanpour, K.; Wang, Z.; Qiu, F.; Zhao, D. Multi-agent safe policy learning for power management of networked microgrids. IEEE Trans. Smart Grid 2020, 12, 1048–1062.
- Jendoubi, I.; Bouffard, F. Multi-agent hierarchical reinforcement learning for energy management. Appl. Energy 2023, 332, 120500.
- Hu, D.; Ye, Z.; Gao, Y.; Ye, Z.; Peng, Y.; Yu, N. Multi-agent deep reinforcement learning for voltage control with coordinated active and reactive power optimization. IEEE Trans. Smart Grid 2022, 13, 4873–4886.
- Li, J.; Yu, T.; Zhang, X. Coordinated load frequency control of multi-area integrated energy system using multi-agent deep reinforcement learning. Appl. Energy 2022, 306, 117900.
- Yan, Z.; Xu, Y. A multi-agent deep reinforcement learning method for cooperative load frequency control of a multi-area power system. IEEE Trans. Power Syst. 2020, 35, 4599–4608.
- Gao, F.; Xu, Z.; Yin, L. Bayesian deep neural networks for spatio-temporal probabilistic optimal power flow with multi-source renewable energy. Appl. Energy 2024, 353, 122106.
- Noorizadegan, A.; Cavoretto, R.; Young, D.; Chen, C. Stable weight updating: A key to reliable PDE solutions using deep learning. Eng. Anal. Bound. Elem. 2024, 168, 105933.
- Zhang, R.; Xu, N.; Zhang, K.; Wang, L.; Lu, G. A parametric physics-informed deep learning method for probabilistic design of thermal protection systems. Energies 2023, 16, 3820.
- Huang, B.; Wang, J. Applications of physics-informed neural networks in power systems—a review. IEEE Trans. Power Syst. 2022, 38, 572–588.
- Liao, W.; Bak-Jensen, B.; Pillai, J.R.; Wang, Y.; Wang, Y. A review of graph neural networks and their applications in power systems. J. Mod. Power Syst. Clean Energy 2021, 10, 345–360.
- Sun, P.; Huo, L.; Chen, X.; Liang, S. Rotor angle stability prediction using temporal and topological embedding deep neural network based on grid-informed adjacency matrix. J. Mod. Power Syst. Clean Energy 2023.
- Gao, M.; Yu, J.; Yang, Z.; Zhao, J. Physics embedded graph convolution neural network for power flow calculation considering uncertain injections and topology. IEEE Trans. Neural Netw. Learn. Syst. 2023, 35, 15467–15478.
- Falconer, T.; Mones, L. Leveraging power grid topology in machine learning assisted optimal power flow. IEEE Trans. Power Syst. 2022, 38, 2234–2246.
- Hansen, J.B.; Anfinsen, S.N.; Bianchi, F.M. Power flow balancing with decentralized graph neural networks. IEEE Trans. Power Syst. 2022, 38, 2423–2433.
- Sayed, A.R.; Zhang, X.; Wang, G.; Wang, C.; Qiu, J. Optimal operable power flow: Sample-efficient holomorphic embedding-based reinforcement learning. IEEE Trans. Power Syst. 2023, 39, 1739–1751.
- Shixin, Z.; Feng, P.; Anni, J.; Hao, Z.; Qiuqi, G. The unmanned vehicle on-ramp merging model based on AM-MAPPO algorithm. Sci. Rep. 2024, 14, 19416.
- Peng, R.; Chen, S.; Xue, C. Collaborative content caching algorithm for large-scale ISTNs based on MAPPO. IEEE Wirel. Commun. Lett. 2024, 13, 3069–3073.
- Liu, X.; Yin, Y.; Su, Y.; Ming, R. A multi-UCAV cooperative decision-making method based on an MAPPO algorithm for beyond-visual-range air combat. Aerospace 2022, 9, 563.
- Xu, K.; Hu, W.; Leskovec, J.; Jegelka, S. How powerful are graph neural networks? arXiv 2018, arXiv:1810.00826.
- Hamilton, W.; Ying, Z.; Leskovec, J. Inductive representation learning on large graphs. Adv. Neural Inf. Process. Syst. 2017, 30, 1025–1035.
- Barrows, C.; Bloom, A.; Ehlen, A.; Ikäheimo, J.; Jorgenson, J.; Krishnamurthy, D.; Lau, J.; McBennett, B.; O’Connell, M.; Preston, E.; et al. The IEEE Reliability Test System: A proposed 2019 update. IEEE Trans. Power Syst. 2019, 35, 119–127.
- Karmakar, A.; Cole, W. Nodal Capacity Expansion Modeling with ReEDS: A Case Study of the RTS-GMLC Test System; Technical Report; National Renewable Energy Laboratory (NREL): Golden, CO, USA, 2024.
- Preston, G. Repository for the RTS-GMLC System. 2024. Available online: https://github.com/GridMod/RTS-GMLC (accessed on 7 December 2024).
Parameters | Type or Value |
---|---|
optimizer | Adam |
activation function | tanh |
learning rate | 5e-5 |
GAE lambda | 1.0 |
discount factor | 0.99 |
clipping | 0.3 |
batch size | 2160 |
number of hidden layers in GNN | 5 |
neuron number of each hidden layer | 64 |
length of representation vector | 68 |
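The GNN rows of the table (5 hidden layers, 64 neurons per hidden layer, tanh activation, representation vectors of length 68) can be illustrated with a GraphSAGE-style mean aggregator in numpy. The toy 6-bus ring topology and the interpretation that the final layer outputs the length-68 representation are assumptions for illustration only; the paper uses the RTS-GMLC network:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 6-bus ring topology (symmetric adjacency matrix).
adj = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]:
    adj[i, j] = adj[j, i] = 1.0

feat_dim, hidden, out_dim, n_layers = 4, 64, 68, 5  # sizes from the table above
h = rng.normal(size=(6, feat_dim))  # initial nodal features

dims = [feat_dim] + [hidden] * (n_layers - 1) + [out_dim]
# Each layer owns two learnable matrices: one applied to the node's own
# representation, one to the elementwise mean of its neighbors' representations.
layers = [(rng.normal(0.0, 0.1, (d_in, d_out)), rng.normal(0.0, 0.1, (d_in, d_out)))
          for d_in, d_out in zip(dims[:-1], dims[1:])]

deg = adj.sum(axis=1, keepdims=True)
for W1, W2 in layers:
    neigh_mean = (adj @ h) / deg           # mean over each node's neighborhood
    h = np.tanh(h @ W1 + neigh_mean @ W2)  # tanh activation, as in the table

print(h.shape)  # one length-68 representation vector per node
```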
Metrics | Proposed DRL Model | MATPOWER |
---|---|---|
Convergence rate (%) | 100 | 100 |
Feasibility rate (%) | 100 | 100 |
Computing time (ms) | 6.7 ± 3.6 | 250.0 ± 8.4 |
Metrics | Proposed DRL Model | MATPOWER |
---|---|---|
Convergence rate (%) | 100 | 68.8 |
Feasibility rate (%) | 100 | 68.8 |
Computing time (ms) | 6.8 ± 4.0 | 580.0 ± 12.5 |
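The convergence and feasibility rates in the tables above follow directly from the notation-table quantities (number of test samples, number of converged solutions, number of feasible solutions). A small sketch with hypothetical counts chosen to reproduce the 68.8% entry:

```python
def rates(n_test, n_conv, n_feas):
    """Convergence and feasibility rates in percent, given sample counts."""
    return 100.0 * n_conv / n_test, 100.0 * n_feas / n_test

# Hypothetical counts for illustration (the paper's test-set size is not
# stated in this excerpt).
conv, feas = rates(n_test=1000, n_conv=688, n_feas=688)
print(f"convergence {conv:.1f}%, feasibility {feas:.1f}%")
```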
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhou, L.; Huo, L.; Liu, L.; Xu, H.; Chen, R.; Chen, X. Optimal Power Flow for High Spatial and Temporal Resolution Power Systems with High Renewable Energy Penetration Using Multi-Agent Deep Reinforcement Learning. Energies 2025, 18, 1809. https://doi.org/10.3390/en18071809