This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

Volt/Var Control for Three-Phase Unbalanced Distribution Network Based on Trust Region Safe Reinforcement Learning

1 School of Electrical Engineering, Southeast University, Nanjing 210018, China
2 Electric Power Research Institute, State Grid Jiangxi Electric Power Co., Ltd., Nanchang 330096, China
* Author to whom correspondence should be addressed.
Energies 2026, 19(13), 3071; https://doi.org/10.3390/en19133071 (registering DOI)
Submission received: 1 June 2026 / Revised: 22 June 2026 / Accepted: 23 June 2026 / Published: 29 June 2026
(This article belongs to the Section A1: Smart Grids and Microgrids)

Abstract

The widespread integration of renewable energy has made power flow in distribution systems highly complex and variable. This aggravates operational safety problems in distribution networks and intensifies three-phase unbalance. Traditional volt/var control (VVC) models face significant challenges, including high-dimensional nonlinearity and low computational efficiency. To address these problems, this paper proposes a volt/var control method for three-phase unbalanced distribution networks based on trust region safe reinforcement learning. First, a VVC model is constructed from the three-phase linear power flow matrix and then reformulated as a Markov decision process (MDP) to reduce the computational burden. Second, a trust region construction method based on the clip mechanism is introduced to keep policy gradient updates stable and computationally efficient. Third, a Lagrange multiplier is introduced into the trust region optimization to convert the node voltage safety boundary into a cost function, establishing a safe reinforcement learning model for distribution networks (SDRL) that limits the output of dangerous actions. Finally, case studies verify that the proposed method effectively mitigates three-phase unbalance, improves online decision-making efficiency, and strictly guarantees the safe operation of distribution networks, demonstrating its feasibility and superiority.
Keywords: volt/var control; three-phase unbalance; active distribution network; safe reinforcement learning; CMDP
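
The two mechanisms named in the abstract, a clip-based trust region and a Lagrange multiplier that turns the node voltage safety boundary into a cost, can be illustrated with a minimal sketch. This is not the authors' implementation. The per-unit voltage limits (0.95 and 1.05), the clip coefficient (0.2), the dual learning rate, and all function names below are assumptions chosen for illustration only.

```python
from typing import Sequence

# Assumed per-unit voltage limits; the article's actual bounds are not given here.
V_MIN, V_MAX = 0.95, 1.05


def voltage_cost(voltages_pu: Sequence[float]) -> float:
    """Total voltage-limit violation over all node/phase voltages (0 if all are safe)."""
    return sum(max(0.0, v - V_MAX) + max(0.0, V_MIN - v) for v in voltages_pu)


def clipped_term(ratio: float, advantage: float, eps: float) -> float:
    """Clipped surrogate for one sample: the probability ratio is confined
    to [1 - eps, 1 + eps], which acts as a simple trust region."""
    clipped_ratio = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped_ratio * advantage)


def lagrangian_objective(
    ratios: Sequence[float],
    reward_adv: Sequence[float],
    cost_adv: Sequence[float],
    lam: float,
    eps: float = 0.2,
) -> float:
    """Mean clipped surrogate on the penalised advantage A_r - lam * A_c (to be maximised)."""
    terms = [
        clipped_term(r, a_r - lam * a_c, eps)
        for r, a_r, a_c in zip(ratios, reward_adv, cost_adv)
    ]
    return sum(terms) / len(terms)


def update_multiplier(
    lam: float, mean_episode_cost: float, cost_limit: float, lr: float = 0.05
) -> float:
    """Projected dual ascent: lam rises while the cost exceeds its limit and never drops below 0."""
    return max(0.0, lam + lr * (mean_episode_cost - cost_limit))
```

In such a scheme the multiplier grows whenever the average voltage cost exceeds the allowed limit, so actions that cause violations are penalised more heavily in later policy updates. The article's own formulation may differ in how the cost, the clipping, and the multiplier update are defined.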

Share and Cite

MDPI and ACS Style

Hu, J.; Dou, X.; Xiong, J.; Tao, X. Volt/Var Control for Three-Phase Unbalanced Distribution Network Based on Trust Region Safe Reinforcement Learning. Energies 2026, 19, 3071. https://doi.org/10.3390/en19133071

Chicago/Turabian Style

Hu, Junru, Xiaobo Dou, Junjie Xiong, and Xiang Tao. 2026. "Volt/Var Control for Three-Phase Unbalanced Distribution Network Based on Trust Region Safe Reinforcement Learning" Energies 19, no. 13: 3071. https://doi.org/10.3390/en19133071
