The deployment of microgrids could be fostered by control systems that do not require highly complex modelling, calibration, prediction and/or optimisation processes. This paper explores the application of Reinforcement Learning (RL) techniques to the operation of a microgrid. The implemented Deep Q-Network (DQN) can learn an optimal policy for operating the elements of an isolated microgrid, based on the agent-environment interaction that occurs when particular operating actions are applied to the microgrid components. To facilitate the scaling-up of this solution, the algorithm relies exclusively on historical data from past events and therefore does not require forecasts of the demand or the renewable generation. The objective is to minimise the cost of operating the microgrid, including the penalty for non-served power. This paper analyses the effect of different definitions of the system state, obtained by expanding the set of variables that define it. The results are satisfactory, as shown by comparing them with the perfect-information optimal operation computed with a traditional optimisation model and with a Naive model.
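As a rough illustration of the cost-minimisation objective described above, the reward seen by the agent can be sketched as the negative of the operating cost plus a penalty on non-served power, combined with the standard one-step Q-learning target that a DQN regresses towards. This is a minimal sketch under stated assumptions, not the implementation used in the paper: the names `StepOutcome`, `reward` and `td_target`, the penalty value and the discount factor are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class StepOutcome:
    """Result of applying one operating action (hypothetical structure)."""
    operating_cost: float  # cost incurred by the microgrid in this time step
    non_served: float      # demand that could not be supplied in this time step


def reward(outcome: StepOutcome, penalty: float = 10.0) -> float:
    """Negative total cost: operating cost plus penalised non-served power.

    The penalty value is an assumption chosen only for illustration.
    """
    return -(outcome.operating_cost + penalty * outcome.non_served)


def td_target(r: float, next_q_values: Sequence[float],
              gamma: float = 0.95, terminal: bool = False) -> float:
    """Standard one-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if terminal:
        return r
    return r + gamma * max(next_q_values)


# Example: one step with some operating cost and some non-served power.
step = StepOutcome(operating_cost=2.0, non_served=0.5)
r = reward(step)
target = td_target(r, next_q_values=[-1.0, -3.0], gamma=0.5)
```

Because the reward is the negative cost, maximising the expected return is equivalent to minimising the accumulated operating cost, penalty included. Expanding the state definition, as analysed in the paper, would change only the input from which the Q-values are estimated, not the form of this target.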
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.