Data-Driven Automatic Generation Control Based on Learning to Coordinate and Teach Reinforcement Mechanism
Abstract
1. Introduction
- (1) A novel multi-agent distributed coordinated control framework based on the LECTR mechanism is proposed. This framework accelerates the training of the agent for a newly added area when the power system is expanded with additional regions.
- (2) During agent training, we employ a double target critic network architecture in place of a single critic network. This structure reduces the variance of the target critic's value estimates, thereby enhancing the learning performance of the multi-agent system and yielding superior control strategies.
- (3) We design a cooperative reward function that accounts for the control error of each region within the interconnected system, effectively guiding the training of the reinforcement learning algorithm.
2. Multi-Area Power System Model
3. Framework for AGC Based on MADRL Algorithm with the LECTR Mechanism
3.1. Action Space
3.2. State Space
3.3. Reward Function
3.4. Overall Training Framework
4. Training for Agent Based on the LECTR Mechanism
4.1. MADRL with a Double Target Critic Network Delay Update Strategy
4.2. The Application of the LECTR Mechanism in Expandable Power System
5. Case Study
5.1. Experimental Environment
5.2. Three-Area Power System Disturbance Simulation
5.2.1. Step Load Disturbance
5.2.2. Random Renewable Energy Disturbance
5.3. Extension of Power System Disturbance Simulation
5.3.1. Comparison of Offline Training Speed
5.3.2. Step Load Disturbance
5.3.3. Random Renewable Energy Disturbance
6. Conclusions
- (1) The LECTR mechanism is introduced to facilitate the joint development of control strategies when the interconnected power system is expanded with new regions. By enabling peer-to-peer action suggestions from agents in the original regions to those in the new region, it improves the learning speed of the new region's agents by 58.73% to 75.01% and reduces the average ACE by 54.93% to 84.80%.
- (2) In a fixed-area power system environment, the proposed method effectively suppresses ACE fluctuations and delivers superior control performance.
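The peer-to-peer action-suggestion idea in conclusion (1) can be sketched as a simple confidence-gated advising rule: an experienced agent from an original area supplies its action whenever the new-area agent is insufficiently confident. The gating rule, threshold, and function names below are hypothetical assumptions for illustration; the paper's LECTR criterion may differ.

```python
def advised_action(student_policy, teacher_policy, state,
                   student_confidence, threshold=0.5):
    # Peer-to-peer action advising (sketch): when the new-area
    # "student" agent's confidence in its own action falls below a
    # threshold, follow the suggestion of an original-area "teacher".
    # (Hypothetical gating rule; LECTR's actual criterion may differ.)
    if student_confidence < threshold:
        return teacher_policy(state)   # adopt the teacher's suggested action
    return student_policy(state)       # act on the student's own policy
```

Early in training the student's confidence is low, so most actions come from the teacher, which is the mechanism by which advising shortens the new agent's learning phase; as training progresses, the student increasingly acts on its own policy.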
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
Area | (s) | (s) | (p.u./Hz) | (p.u./Hz) | (p.u./Hz) | (Hz/p.u.) |
---|---|---|---|---|---|---|
Area 1 | 0.10 | 0.35 | 1.7 | 24 | 0.014 | 0.5 |
Area 2 | 0.09 | 0.32 | 1.5 | 19 | 0.07 | 0.3 |
Area 3 | 0.08 | 0.33 | 2.7 | 22 | 0.03 | 0.2 |
Expanded area 4 | 0.10 | 0.35 | 1.4 | 17 | 0.019 | 0.6 |
Method | Reward | Mean Absolute ACE [p.u.] | Largest Variation in Absolute ACE [p.u.] |
---|---|---|---|
DQN | −1.95 | 0.003084164 | 0.013206667 |
MADDPG | −1.45 | 0.001495223 | 0.007463333 |
Improved MADRL | −0.84 | 0.000595075 | 0.002533333 |
Method | Reward | Mean Absolute ACE [p.u.] | Largest Variation in Absolute ACE [p.u.] | Episode Number of Convergence |
---|---|---|---|---|
Improved MADRL with LECTR | −0.75 | 0.000658 | 0.00216 | 6833 |
Improved MADRL | −0.94 | 0.00146 | 0.00435 | 16558 |
MADDPG | −1.25 | 0.00234 | 0.00669 | 21324 |
DQN | −1.88 | 0.00433 | 0.00982 | 27345 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Yang, F.; Shao, X.; Zhou, B.; Shi, Y.; Shen, Y.; Li, D. Data-Driven Automatic Generation Control Based on Learning to Coordinate and Teach Reinforcement Mechanism. Symmetry 2025, 17, 854. https://doi.org/10.3390/sym17060854