# Deep Reinforcement Learning for Load Frequency Control in Isolated Microgrids: A Knowledge Aggregation Approach with Emphasis on Power Symmetry and Balance


## Abstract


## 1. Introduction

## 2. Islanded Microgrids

#### 2.1. KDD-LFC Model for Islanded Microgrids

#### 2.2. Generation Costs

Here, $P_{Gi}$ is the output of the $i$th unit; $a_{i}$, $b_{i}$, and $c_{i}$ are constants; and $C_{i}$ is the cost of the $i$th unit.

In this expression, $P_{Gi,\text{plan}}$ is the planned output of unit $i$, $\Delta P_{Gi}$ is the regulation output of the $i$th unit, $P_{Gi,\text{actual}}$ is the actual output of unit $i$, and $\alpha_{i}$, $\beta_{i}$, and $\gamma_{i}$ are coefficients.
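The symbol definitions above are consistent with a quadratic cost curve evaluated at the actual output, where the actual output is the planned output plus the regulation output. The following is a minimal Python sketch under that assumption. The quadratic form, the mapping of $\alpha_{i}$, $\beta_{i}$, and $\gamma_{i}$ to the constant, linear, and quadratic terms in $\Delta P_{Gi}$, the function names, and all numeric values are illustrative assumptions and are not taken from the paper's equations.

```python
def quadratic_cost(p, a, b, c):
    """Assumed quadratic cost of a unit producing output p."""
    return a * p**2 + b * p + c


def regulation_cost_coefficients(p_plan, a, b, c):
    """Expand the assumed quadratic cost around the planned output.

    Returns (alpha, beta, gamma) such that
    quadratic_cost(p_plan + dp) == alpha + beta * dp + gamma * dp**2.
    """
    alpha = quadratic_cost(p_plan, a, b, c)  # cost at the planned output
    beta = 2.0 * a * p_plan + b              # slope of the cost at p_plan
    gamma = a                                # curvature is unchanged
    return alpha, beta, gamma


# Hypothetical coefficients and operating point, for illustration only.
a, b, c = 0.002, 1.5, 10.0
p_plan, dp = 100.0, 5.0

alpha, beta, gamma = regulation_cost_coefficients(p_plan, a, b, c)
cost_direct = quadratic_cost(p_plan + dp, a, b, c)
cost_expanded = alpha + beta * dp + gamma * dp**2
```

Under this assumption, the cost can be tracked as a function of the regulation output alone, since the planned output only fixes the three coefficients.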

#### 2.3. Objective Functions and Constraints

Here, $\Delta P_{\text{order-}\Sigma}$ is the total command, $\Delta P_{i}^{\max}$ and $\Delta P_{i}^{\min}$ are the upper and lower limits of the $i$th unit, $\Delta P_{i}^{\text{rate}}$ is the ramp rate of the $i$th unit, and $\Delta P_{i}^{\text{in}}$ is the command of the $i$th unit.
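The quantities defined above suggest three kinds of constraint: the unit commands add up to the total command, each command stays within its limits, and the change in each command per control step stays within the ramp rate. The sketch below checks a candidate allocation against those three conditions. The exact constraint forms, the per-step interpretation of the ramp rate, and the function and argument names are assumptions made for illustration.

```python
def command_is_feasible(dp_in, dp_prev, dp_min, dp_max, dp_rate, dp_total, tol=1e-6):
    """Check unit commands against the assumed balance, limit, and ramp constraints.

    dp_in    : candidate commands for each unit at the current step
    dp_prev  : commands issued at the previous step
    dp_min   : lower limit of each unit
    dp_max   : upper limit of each unit
    dp_rate  : maximum change per step for each unit
    dp_total : total command to be shared among the units
    """
    # Power balance: the unit commands must add up to the total command.
    if abs(sum(dp_in) - dp_total) > tol:
        return False
    for cmd, prev, lo, hi, rate in zip(dp_in, dp_prev, dp_min, dp_max, dp_rate):
        # Capacity limits of the unit.
        if not (lo - tol <= cmd <= hi + tol):
            return False
        # Ramp limit between consecutive control steps.
        if abs(cmd - prev) > rate + tol:
            return False
    return True


# Hypothetical two-unit example.
feasible = command_is_feasible(
    dp_in=[10.0, 20.0], dp_prev=[8.0, 18.0],
    dp_min=[0.0, 0.0], dp_max=[15.0, 25.0],
    dp_rate=[5.0, 5.0], dp_total=30.0,
)
```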

#### 2.4. MDP Modelling of KDD-LFCs

The action-value function is written $Q_{\pi}(s,a)$. Computing it requires the state transition probabilities, but the large number of disturbances in a real system makes these probabilities impossible to obtain. The goal of reinforcement learning is therefore to learn a policy $\pi_{\theta}$ that maximizes the expected return, and the objective function of the agent is defined in terms of the discounted return, as in Equation (2):

Here, trajectories are drawn from the distribution determined by the policy $\pi_{\theta}$, and $\gamma \in [0, 1]$ is the discount rate, which sets the weight of long-run returns.
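The discounted-return objective can be estimated from sampled trajectories without knowing the transition probabilities. The sketch below computes the discounted return of one reward sequence and averages it over several sampled trajectories. The function names and the reward values are hypothetical and serve only to illustrate the calculation.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards weighted by gamma**t, accumulated from the last step backwards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def estimate_objective(trajectories, gamma):
    """Monte Carlo estimate of the expected discounted return over sampled trajectories."""
    returns = [discounted_return(rewards, gamma) for rewards in trajectories]
    return sum(returns) / len(returns)


# Hypothetical reward sequences from two rollouts of the same policy.
sampled = [[1.0, 1.0, 1.0], [0.0, 2.0, 0.0]]
objective = estimate_objective(sampled, gamma=0.5)
```

A discount rate close to 0 makes the agent weigh only immediate rewards, while a value close to 1 gives long-run returns nearly the same weight as immediate ones.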

The policy $\pi_{\theta}$ is obtained, and in this paper the objective of the islanded microgrid LFC problem is defined as in Equation (7):

- (1) Action space
- (2) State space
- (3) Reward function

Here, the term with subscript $P$ is the punishment function, $\Delta f$ is the frequency error, $C_{i}$ is the power generation cost of the $i$th unit, and $\mu_{1}$ and $\mu_{2}$ are the weight coefficients.
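The reward equation itself is not reproduced here, so the following is only one plausible reading of the definitions above: the reward penalizes the weighted frequency error and the weighted total generation cost, and subtracts an additional punishment term. The functional form, the use of the absolute frequency error, the sign convention for the punishment term, and every name and number in the sketch are assumptions.

```python
def lfc_reward(delta_f, unit_costs, mu1, mu2, punishment=0.0):
    """Assumed reward: negative weighted sum of frequency error and generation cost.

    delta_f    : frequency error in Hz
    unit_costs : generation cost of each unit at this step
    mu1, mu2   : weight coefficients for the two terms
    punishment : non-negative extra penalty, e.g. for an excessive frequency error
    """
    return -(mu1 * abs(delta_f) + mu2 * sum(unit_costs)) - punishment


# Hypothetical step: 0.01 Hz error, two units, no extra punishment.
r = lfc_reward(delta_f=0.01, unit_costs=[10.0, 20.0], mu1=100.0, mu2=0.01)
```

With a form like this, the ratio of $\mu_{1}$ to $\mu_{2}$ sets how much generation cost the agent is willing to trade for a smaller frequency error.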

## 3. Knowledge-Aggregation-Based Proximal Policy Optimization Method

#### 3.1. PPO Algorithm in Load Frequency Control

Here, $\pi_{\theta}$ is the stochastic policy, ${\widehat{A}}_{t}$ is the estimate of the advantage function at time step $t$, and ${\widehat{E}}_{t}$ denotes the empirical expectation.
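For orientation, the standard PPO clipped surrogate objective combines the probability ratio between the new and old policies with the advantage estimate. The sketch below evaluates that objective for a small batch in plain Python. It shows the generic PPO form, not necessarily the exact variant used in this paper; the clipping value 0.2 and the sample numbers are assumptions.

```python
import math


def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Average clipped surrogate objective over a batch of samples.

    logp_new   : log-probabilities of the taken actions under the current policy
    logp_old   : log-probabilities of the same actions under the data-collecting policy
    advantages : advantage estimates for each sample
    """
    terms = []
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)  # probability ratio between new and old policy
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        terms.append(min(ratio * adv, clipped_ratio * adv))
    return sum(terms) / len(terms)


# Hypothetical batch: ratios of 1.5 and 0.5 with advantages +1 and -1.
example = ppo_clip_objective(
    logp_new=[math.log(1.5), math.log(0.5)],
    logp_old=[0.0, 0.0],
    advantages=[1.0, -1.0],
)
```

The clipping removes the incentive to move the ratio far from 1 in a single update, which is what keeps each policy step close to the policy that collected the data.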

The critic network estimates the state value $V(s_{t})$ and $A_{t}$: it takes the current state as input and outputs the predicted state value. The critic network updates its parameters by minimizing a loss function, shown in Equation (14):
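Equation (14) is not reproduced here. A common choice for a critic loss in PPO implementations is the mean squared error between the predicted state values and the target returns, and the sketch below assumes that form. The function name and the sample values are hypothetical.

```python
def critic_loss(predicted_values, target_returns):
    """Mean squared error between predicted state values and target returns."""
    errors = [(v - g) ** 2 for v, g in zip(predicted_values, target_returns)]
    return sum(errors) / len(errors)


# Hypothetical batch of two states.
loss = critic_loss(predicted_values=[1.0, 2.0], target_returns=[0.0, 4.0])
```

In practice the target returns would come from the discounted rewards observed along sampled trajectories, and the loss would be minimized by gradient descent on the critic's parameters.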

#### 3.2. Knowledge-Aggregation Methods and the KA-PPO Algorithm

## 4. Case Studies

#### 4.1. Case 1: Step Disturbance and Renewable Disturbance

#### 4.2. Case 2: Step Disturbance and Renewable Disturbance

#### 4.3. Case 3: Large-Scale Renewable Disturbance

## 5. Conclusions

## Author Contributions

## Funding

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest


Control Algorithm | Average Frequency Deviation $\lvert\Delta f\rvert_{avg}$ (Hz) | Power Generation Cost $C^{total}$ (USD) |
---|---|---|
KA-PPO | 0.00431 | 2139.39 |
SAC | 0.00484 | 2140.67 |
TD3 | 0.00493 | 2140.62 |
DDPG | 0.00656 | 2139.94 |
GA-fuzzy-PI | 0.00803 | 2140.94 |
TS-fuzzy-PI | 0.00788 | 2139.45 |
GA-PI | 0.00699 | 2139.78 |

Control Algorithm | Average Frequency Error $\lvert\Delta f\rvert_{avg}$ (Hz) | Generation Cost $C^{total}$ (USD) |
---|---|---|
KA-PPO | 0.0155 | 5668.61 |
SAC | 0.0173 | 5672.65 |
TD3 | 0.0195 | 5672.09 |
DDPG | 0.0242 | 5670.70 |
GA-fuzzy-PI | 0.0273 | 5670.95 |
TS-fuzzy-PI | 0.0268 | 5669.10 |
GA-PI | 0.0280 | 5669.50 |

Control Algorithm | Average Frequency Error $\lvert\Delta f\rvert_{avg}$ (Hz) | Generation Cost $C^{total}$ (USD) |
---|---|---|
KA-PPO | 0.011686 | 8246.813 |
SAC | 0.012364 | 8252.201 |
TD3 | 0.012858 | 8251.709 |
DDPG | 0.014776 | 8249.376 |
GA-fuzzy-PI | 0.016278 | 8251.551 |
TS-fuzzy-PI | 0.016088 | 8247.276 |
GA-PI | 0.015794 | 8248.216 |
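The three result tables above can be compared more easily as relative reductions in average frequency deviation. The sketch below copies the frequency values from the tables, in table order, and reports how much lower the KA-PPO value is than each baseline. The helper name and the choice of relative reduction as the summary metric are illustrative assumptions.

```python
def relative_reduction(baseline, proposed):
    """Fractional reduction of `proposed` relative to `baseline`."""
    return (baseline - proposed) / baseline


# Average frequency deviation (Hz), copied from the three result tables in order.
freq_dev = [
    {"KA-PPO": 0.00431, "SAC": 0.00484, "TD3": 0.00493, "DDPG": 0.00656,
     "GA-fuzzy-PI": 0.00803, "TS-fuzzy-PI": 0.00788, "GA-PI": 0.00699},
    {"KA-PPO": 0.0155, "SAC": 0.0173, "TD3": 0.0195, "DDPG": 0.0242,
     "GA-fuzzy-PI": 0.0273, "TS-fuzzy-PI": 0.0268, "GA-PI": 0.0280},
    {"KA-PPO": 0.011686, "SAC": 0.012364, "TD3": 0.012858, "DDPG": 0.014776,
     "GA-fuzzy-PI": 0.016278, "TS-fuzzy-PI": 0.016088, "GA-PI": 0.015794},
]

for index, table in enumerate(freq_dev, start=1):
    proposed = table["KA-PPO"]
    for name, value in table.items():
        if name != "KA-PPO":
            pct = 100.0 * relative_reduction(value, proposed)
            print(f"Table {index}: KA-PPO is {pct:.1f}% below {name}")
```

In all three tables KA-PPO has the smallest average frequency deviation and the lowest generation cost, although the cost differences between algorithms are small compared with the differences in frequency deviation.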


© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Wu, M.; Ma, D.; Xiong, K.; Yuan, L.
Deep Reinforcement Learning for Load Frequency Control in Isolated Microgrids: A Knowledge Aggregation Approach with Emphasis on Power Symmetry and Balance. *Symmetry* **2024**, *16*, 322.
https://doi.org/10.3390/sym16030322
