Volt/Var Control for Three-Phase Unbalanced Distribution Network Based on Trust Region Safe Reinforcement Learning

Hu, Junru; Dou, Xiaobo; Xiong, Junjie; Tao, Xiang

doi:10.3390/en19133071

This is an early access version; the complete PDF, HTML, and XML versions will be available soon.

Open Access Article


by Junru Hu ¹, Xiaobo Dou ¹,*, Junjie Xiong ² and Xiang Tao ²

¹ School of Electrical Engineering, Southeast University, Nanjing 210018, China

² Electric Power Research Institute, State Grid Jiangxi Electric Power Co., Ltd., Nanchang 330096, China

* Author to whom correspondence should be addressed.

Energies 2026, 19(13), 3071; https://doi.org/10.3390/en19133071 (registering DOI)

Submission received: 1 June 2026 / Revised: 22 June 2026 / Accepted: 23 June 2026 / Published: 29 June 2026

(This article belongs to the Section A1: Smart Grids and Microgrids)


Abstract

With the widespread integration of renewable energy, power flow in the system has become highly complex and variable. This not only aggravates the operational safety problems of distribution networks but also worsens three-phase unbalance. Traditional volt/var control (VVC) models face significant challenges, such as high-dimensional nonlinearity and low efficiency. To address these problems, this paper proposes a volt/var control method for three-phase unbalanced distribution networks based on trust region safe reinforcement learning. First, a VVC model is constructed based on the three-phase linear power flow matrix and then transformed into a Markov decision process (MDP) to reduce the computational burden. Second, a trust region construction method based on the clip mechanism is introduced to ensure stable policy gradient updates and computational efficiency. Furthermore, a Lagrange multiplier is introduced into the trust region optimization to convert the node voltage safety boundary into a cost function, establishing a safe reinforcement learning model for distribution networks (SDRL) that limits the output of unsafe actions. Finally, case studies verify that the proposed method can effectively mitigate three-phase unbalance, improve online decision-making efficiency, and strictly guarantee the safe operation of distribution networks, demonstrating its feasibility and superiority.

Keywords: volt/var control; three-phase unbalance; active distribution network; safe reinforcement learning; CMDP
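
The two mechanisms named in the abstract, a clip-based trust region and a Lagrange multiplier that turns the voltage safety boundary into a cost, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the clip width eps = 0.2, the voltage band of 0.95 to 1.05 p.u., and the dual step size are all assumptions chosen for illustration, and the exact form of the paper's cost function is not given on this page.

```python
# Illustrative sketch only; all names and numeric defaults are assumptions.

def clipped_surrogate(ratio, advantage, eps=0.2):
    """Clipped policy-gradient term for one sample.

    ratio is pi_new(a|s) / pi_old(a|s); clipping it to [1-eps, 1+eps]
    keeps the update inside an approximate trust region.
    """
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)


def voltage_cost(voltages_pu, v_min=0.95, v_max=1.05):
    """Total voltage-limit violation (p.u.) summed over nodes and phases."""
    return sum(max(0.0, v - v_max) + max(0.0, v_min - v) for v in voltages_pu)


def lagrangian_objective(ratios, reward_advs, cost_advs, lam, eps=0.2):
    """Mean clipped reward surrogate minus lam times the mean cost surrogate."""
    n = len(ratios)
    reward_term = sum(
        clipped_surrogate(r, a, eps) for r, a in zip(ratios, reward_advs)
    ) / n
    cost_term = sum(r * c for r, c in zip(ratios, cost_advs)) / n
    return reward_term - lam * cost_term


def update_multiplier(lam, mean_episode_cost, cost_limit=0.0, lr=0.05):
    """Projected dual ascent: raise lam when cost exceeds the limit, keep lam >= 0."""
    return max(0.0, lam + lr * (mean_episode_cost - cost_limit))


if __name__ == "__main__":
    # Hypothetical per-phase voltages at three nodes (p.u.).
    cost = voltage_cost([1.00, 1.06, 0.93])
    lam = update_multiplier(0.0, cost)
    print(cost, lam)
```

In this sketch the multiplier grows while the policy violates the voltage band, so unsafe actions are penalized more heavily over time, and it decays toward zero once the measured cost falls below the limit.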

