This article highlights the industry experience of the development and practical implementation of a short-term photovoltaic forecasting system based on machine learning methods for a real industry-scale photovoltaic power plant implemented in a Russian power system using remote data acquisition. One of the goals of the study is to improve photovoltaic power plants generation forecasting accuracy based on open-source meteorological data, which is provided in regular weather forecasts. In order to improve the robustness of the system in terms of the forecasting accuracy, we apply newly derived feature introduction, a factor obtained as a result of feature engineering procedure, characterizing the relationship between photovoltaic power plant energy production and solar irradiation on a horizontal surface, thus taking into account the impacts of atmospheric and electrical nature. The article scrutinizes the application of different machine learning algorithms, including Random Forest regressor, Gradient Boosting Regressor, Linear Regression and Decision Trees regression, to the remotely obtained data. As a result of the application of the aforementioned approaches together with hyperparameters, tuning and pipelining of the algorithms, the optimal structure, parameters and the application sphere of different regressors were identified for various testing samples. The mathematical model developed within the framework of the study gave us the opportunity to provide robust photovoltaic energy forecasting results with mean accuracy over 92% for mostly-sunny sample days and over 83% for mostly cloudy days with different types of precipitation.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited