Simulating Power Generation from Photovoltaics in the Polish Power System Based on Ground Meteorological Measurements—First Tests Based on Transmission System Operator Data

The Polish power system is undergoing a slow transformation from a coal-based system to one dominated by renewables. Although coal will remain a fundamental fuel in the coming years, the recent upsurge in the installed capacity of photovoltaic (PV) systems deserves significant attention. Because the Polish Transmission System Operator recently published hourly PV generation time series, in this article we explore how well they can be modeled based on the meteorological measurements provided by the Institute of Meteorology and Water Management. The hourly time series of PV generation on a country level and irradiation, wind speed, and temperature measurements from 23 meteorological stations covering one month are used as inputs to create an artificial neural network. The analysis indicates that the available measurements combined with artificial neural networks can simulate PV generation on a national level with a mean absolute percentage error of 3.2%.


Introduction
The transformation of the power system is a continuous process, and fully realizing it will require years of ongoing commitment and well-thought-out decisions on country and regional levels. In the past two years, the Polish power system has seen a significant increase in the installed capacity of both residential and commercial photovoltaic (PV) systems. Notable investments in PV capacity include a 600 kWp system located near the Porąbka-Żar pumped-storage hydropower station, a 2.5 MW installation for waterworks in Szczecin, and 739 kW for 35 buildings belonging to a housing cooperative in Wrocław. The growing interest in PV systems can be linked to (a) increasing electricity prices, (b) the decreasing cost of PV systems, and (c) growing awareness of the impact of the energy generation sector on the natural environment. (The impact of PV systems from the perspective of Life Cycle Assessment should not be neglected. Readers are referred to other works strictly dedicated to this topic.) The Polish Transmission System Operator (PSE) has been publishing wind generation data on an aggregated level for quite some time. These time series with an hourly time step are of great importance for visualizing and analyzing the variability [1] of wind generation over different time horizons. They can also be used to create forecasting or simulation models [2][3][4][5]. Recently, the PSE has also made publicly available aggregated generation time series from photovoltaic installations located in Poland. These data create new opportunities for further research, including an analysis of the complementarity between renewable energy sources in Poland [6] or the use of available ground measurements to simulate PV generation on a country level. The second research direction can later be used to simulate and predict how the growing installed capacity of PV systems in Poland will affect power system operations. 
Based on available measurement data, and knowing the transmission system constraints, decisions can be made regarding the optimal distribution of renewable generators in the power system.
Considering the information above, the objective of this study is to conduct first tests of the possibility of using ground measurements from meteorological stations to simulate the power generation from PV systems on a country level. Similar research can be found in the international literature for both PV and wind generation, using different kinds of inputs and simulation tools. For example, for Sweden, reanalysis data sets based on MERRA (Modern Era Retrospective-Analysis for Research and Applications) have been used to effectively model the power output of a national fleet of wind turbines [7]. Recently, Olauson [8] found that ERA5 data sets performed much better than the MERRA reanalysis sets mentioned previously [7]. Similar to the analysis presented in this work, Black et al. [9] used meteorological data and regression techniques to simulate a fleet of PV systems.

Materials and Methods
To simulate the energy generation from PV systems on a country level, a method based on an artificial neural network (ANN) was applied. ANNs are computational systems loosely inspired by biological neural networks. They belong to a group of artificial intelligence information processing paradigms that has gained immense popularity in recent years. Simulation or forecasting models based on ANNs have been successfully applied in various areas of cognitive and application research, including water-demand forecasting [10], lake water-level forecasting [11], renewables integration studies [12], color image identification and reconstruction [13], multi-core optic fibers [14], wind speed prediction [15], and, most importantly from this paper's perspective, direct and global radiation prediction [16] and PV energy yield forecasting [17].
A feedforward neural network (NN) model was developed in Matlab 2019a. The Levenberg-Marquardt method was used to optimize the weights and bias values [18]. The input data were divided into training, validation, and testing subsets in proportions of 70%, 15%, and 15%. Since the length of the available time series was limited to one month, the data were divided chronologically: the first 70% of hours were used to train the neural network, the following 15% to validate/supervise the training process, and the remaining 15% to test the NN performance. The number of neurons in the hidden layer was selected following a brute-force approach, namely NNs with the number of hidden neurons k ranging from 1 to n, where n is the number of input neurons, were tested. In the literature, various approaches to solving this problem can be found [18,19]. However, considering the low computing effort needed to create an ANN, a brute-force approach seems to be a justified choice in this particular case.
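The chronological split and the brute-force search over hidden-layer sizes described above can be sketched as follows. This is a minimal illustration in Python rather than the Matlab code used in the study. The function names, the rounding of the split boundaries, and the `score_fn` callback (which stands in for "train an ANN with k hidden neurons and return its MAPE") are assumptions made for this example.

```python
from typing import Callable, Dict, Sequence, Tuple


def chronological_split(records: Sequence, train_frac: float = 0.70,
                        val_frac: float = 0.15) -> Tuple[list, list, list]:
    """Split time-ordered records into contiguous train/validation/test blocks.

    No shuffling is done: the earliest records go to training and the
    latest to testing. Boundary rounding is an assumption of this sketch.
    """
    n = len(records)
    cut_train = round(train_frac * n)
    cut_val = round((train_frac + val_frac) * n)
    return (list(records[:cut_train]),
            list(records[cut_train:cut_val]),
            list(records[cut_val:]))


def select_hidden_size(n_inputs: int,
                       score_fn: Callable[[int], float]
                       ) -> Tuple[int, Dict[int, float]]:
    """Try every hidden-layer size k = 1..n_inputs and keep the lowest score.

    score_fn(k) is a hypothetical callback that trains a network with k
    hidden neurons and returns an error measure such as MAPE.
    """
    scores = {k: score_fn(k) for k in range(1, n_inputs + 1)}
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Toy usage: 473 hourly records and 69 candidate hidden-layer sizes.
hours = list(range(473))
train, val, test = chronological_split(hours)
print(len(train), len(val), len(test))

# Placeholder score with an arbitrary minimum; NOT a trained model.
best_k, all_scores = select_hidden_size(69, lambda k: (k - 15) ** 2)
print(best_k)
```

Passing the training routine in as a callback keeps the search loop independent of the training algorithm, so the same loop could wrap a Levenberg-Marquardt trainer or any other optimizer.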
Energies 2020, 13, 4255
As inputs, the PV generation time series covering the month of May 2020 (available at https://www.pse.pl/) and time series of wind speed, temperature, and irradiation for 23 meteorological stations located in Poland, obtained from the Institute of Meteorology and Water Management-National Research Institute (IMWM-NRI), were used. The locations of the meteorological stations along with the equipment used are presented in Figure 1 and in Table 1. The data have an hourly time step. The nighttime hours and hours when the generation from PV systems was less than 5 MW were removed from the input data set. Such low values, which occur around sunrise and sunset, were removed because the hourly temporal resolution does not account for the spatial distribution of PV systems in Poland. The final data set consisted of 473 hourly records. The data were normalized to a range of 0-1 before the ANN creation procedure. The performance of the ANN was assessed based on two commonly applied metrics, namely the mean absolute percentage error (MAPE, Equation (1)) and the root-mean-square error (RMSE, Equation (2)).
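$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{o_i - s_i}{o_i} \right| \tag{1}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( o_i - s_i \right)^2} \tag{2}$$

where n is the sample size, o_i is the observed value, and s_i is the simulated value.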
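The preprocessing steps and the two metrics can be illustrated with a short Python sketch. It assumes the standard definitions of min-max normalization, MAPE, and RMSE. The function names and the `(pv_mw, features)` record layout are hypothetical and are not taken from the original Matlab workflow.

```python
import math
from typing import List, Sequence, Tuple


def drop_low_generation(rows: Sequence[Tuple[float, list]],
                        threshold_mw: float = 5.0) -> List[Tuple[float, list]]:
    """Keep only hours whose PV generation is at least threshold_mw.

    Each row is assumed to be (pv_mw, features); nighttime hours with zero
    generation are removed by the same rule.
    """
    return [row for row in rows if row[0] >= threshold_mw]


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Rescale values linearly to the range 0-1."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant series: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def mape(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Mean absolute percentage error in percent (observed must be non-zero)."""
    n = len(observed)
    return 100.0 / n * sum(abs((o - s) / o) for o, s in zip(observed, simulated))


def rmse(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Root-mean-square error in the units of the input series."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


# Toy values only, not measurements from the study.
rows = [(0.0, []), (3.0, []), (100.0, []), (200.0, [])]
kept = drop_low_generation(rows)
print(len(kept))
print(min_max_normalize([5.0, 10.0, 15.0]))
print(mape([100.0, 200.0], [110.0, 190.0]), rmse([100.0, 200.0], [110.0, 190.0]))
```

Removing the low-generation hours before computing MAPE also keeps the metric stable, because the observed value appears in the denominator.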

Results and Discussion
The energy yield from photovoltaic modules is a function of the irradiation falling on the module area, the modules' efficiency, and their temperature. Since neither detailed information about the PV systems' locations nor the specification of the modules used is currently available to the authors, it is justified to use a black-box approach in which the input data are mapped directly to the desired output. Because the temperature of the PV modules is determined by irradiation and ambient air temperature as well as wind speed, which has a cooling effect, these meteorological parameters have been considered as explanatory variables. To simulate hourly power generation from PV systems in Poland, a set of 69 explanatory variables was used: three meteorological parameters from each of 23 stations (Bielsko-Biała, Gdynia, Gorzów Wielkopolski, Jarczew, Jelenia Góra, Kasprowy Wierch, Kłodzko, Koło, Koszalin, Legionowo, Legnica, Łeba, Łódź, Mikołajki, Piła, Radzyń, Sulejów, Suwałki, Toruń, Warszawa-Bielany, Wieluń, Włodawa, and Zakopane; see Figure 1). These explanatory variables exhibited various correlation coefficient values with the response variable. On average, the correlation coefficient was 0.777 for irradiation, 0.439 for air temperature, and 0.299 for wind speed. Figure 2 shows the observed hourly irradiation on 1 May 2020 and the production of energy from PV systems at the national level. During that day, the observed irradiation in individual hours varied significantly among the considered locations (meteorological stations), while the PV systems maintained a relatively smooth energy generation pattern. This phenomenon can be attributed to the spatial smoothing of power generation due to the geographical dispersion of the PV systems [20]. 
This situation is beneficial from the perspective of variable renewable energy sources (VRES) integration to the power system, although constraints such as transmission network capacity may limit the benefits resulting from spatial smoothing.
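Both ideas in this paragraph, the correlation between an explanatory variable and the response and the smoothing effect of aggregating dispersed sites, can be illustrated with a small Python sketch. It assumes the Pearson coefficient (the text does not name the correlation measure) and uses the mean absolute hour-to-hour change as a simple variability indicator. The function names and toy series are hypothetical and do not reproduce the study's data.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def mean_abs_ramp(series: Sequence[float]) -> float:
    """Average absolute change between consecutive time steps."""
    steps = [abs(b - a) for a, b in zip(series, series[1:])]
    return sum(steps) / len(steps)


def average_across_sites(sites: Sequence[Sequence[float]]) -> List[float]:
    """Hour-by-hour mean over several sites (a crude aggregate)."""
    return [sum(values) / len(values) for values in zip(*sites)]


# Two toy sites whose fluctuations are perfectly out of phase.
site_a = [0.0, 4.0, 0.0, 4.0]
site_b = [4.0, 0.0, 4.0, 0.0]
aggregate = average_across_sites([site_a, site_b])

print(mean_abs_ramp(site_a), mean_abs_ramp(aggregate))
print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

The toy sites are an extreme case chosen so that the fluctuations cancel completely. Real sites are only partly uncorrelated, so aggregation reduces variability without removing it.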
The generation from PV systems during the whole period is visualized in Figure 3. Significant variability between individual daily yield sums can be observed. On a side note, according to the PSE data, in May the PV systems generated 223 GWh, whereas wind turbines generated almost 1.072 TWh, covering, respectively, 1.8% and 8.6% of the national demand in this month.
Figure 3. PV generation on a country level during May 2020.
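As a rough consistency check on the generation shares quoted for May, each pair of figures can be converted into an implied monthly national demand. The symbol D is introduced here for illustration only and is not a value reported in the PSE data.

$$D \approx \frac{223\ \text{GWh}}{0.018} \approx 12.4\ \text{TWh}, \qquad D \approx \frac{1072\ \text{GWh}}{0.086} \approx 12.5\ \text{TWh}$$

The two estimates agree to within the rounding of the reported percentages.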
As mentioned in the Materials and Methods section, in total 69 potential configurations of the ANN were tested. Of those, the one with the lowest MAPE was selected for further analysis. The performance of the selected and the remaining ANNs as a function of the number of neurons in the hidden layer is presented in Figure 4. The best performing ANN had 15 neurons in the hidden layer and a MAPE of 3.2%.
The ANN also performed well in terms of RMSE, which was found to be 39.2 MW. In Figure 5, the performance of the ANN is visualized for the testing subset only. This subset provides the final check of whether the neural network is capable of producing good-quality results for input data that remained unknown during the training phase. In Figure 5 it can be noted that the ANN performed very well for the extreme values (very low and high generation), whereas some systematic errors with an unknown source occurred for the mid-range values. On average, the ANN tended to overestimate the generation by 6 MW. The highest overestimation error was 133 MW, whereas the highest underestimation was 150 MW.
Figure 6 visualizes the performance of the neural network over a period of 4.5 days at the end of May 2020. The night hours are excluded from the analysis. As shown in the figure, the values simulated by the ANN followed the real generation of the PV systems well. During the second day, the ANN wrongly simulated a sudden drop in PV generation, increasing the variability of the modeled time series (in terms of ramp rates). During the fourth day, one can observe that in the midday hours the simulated generation was slightly greater than the observed one. In general, the absolute errors did not exceed 150 MW.
Figure 6. Performance of the neural network for the testing subset. Please note that night hours (no irradiation) are excluded.
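The error statistics discussed above (average bias, largest overestimation, and largest underestimation) can be computed from paired observed and simulated values as in the following Python sketch. The sign convention (simulated minus observed, so a positive value means overestimation), the function name, and the toy numbers are assumptions for illustration and are not the study's data.

```python
from typing import Dict, Sequence


def error_summary(observed: Sequence[float],
                  simulated: Sequence[float]) -> Dict[str, float]:
    """Summarize signed simulation errors in the units of the inputs.

    Errors are simulated minus observed: positive means the model
    overestimated, negative means it underestimated.
    """
    errors = [s - o for o, s in zip(observed, simulated)]
    return {
        "mean_bias": sum(errors) / len(errors),
        "max_overestimation": max(max(errors), 0.0),
        "max_underestimation": max(-min(errors), 0.0),
    }


# Toy values in MW, chosen only to make the arithmetic easy to follow.
summary = error_summary([100.0, 200.0, 300.0], [110.0, 190.0, 330.0])
print(summary)
```

Reporting the signed bias alongside MAPE and RMSE is useful because the latter two hide the direction of the error.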

Conclusions
This short communication presents the first results regarding the potential of simulating the generation from PV systems on a national level based on ground measurements provided by the Institute of Meteorology and Water Management. The analysis based on artificial neural networks revealed that ground measurements consisting of irradiation, wind speed, and air temperature can be used to model the power generation from PV systems on a country level. Despite some limitations, such as neglecting the technical specification of the PV systems and the spatial distribution of PV farms across the country, taking input irradiation on a horizontal rather than an inclined surface, and representing the meteorological conditions in Poland based on a sample of 23 locations, it was possible to model the PV generation with a mean absolute percentage error of roughly 3.2%.
Most importantly, the obtained results have some practical implications. In the research and reality of the operation of present energy systems, meteorology starts to play a very important and, in many instances, crucial role in enabling and securing an efficient and reliable operation of the power system. The need for including meteorology-based studies in energy research comes directly from the unprecedented increase in the installed capacity of renewable energy sources, especially the ones in which variability/availability is driven by climate and weather [21]. The Polish power sector is starting a process of transformation. The increasing share of renewables such as wind and, in particular, solar energy (as observed in 2019/2020) is driven by an increasing awareness of the energy sector's impact on the natural environment, the growing cost of electricity generation from conventional fuels, the decreasing cost of renewables, and national/international policy. The power system expansion/development studies [22] call for data with both higher temporal and spatial resolution. This can be provided by either ground measurements, satellite measurements, numerical weather prediction models, or reanalysis data sets. In this paper we have investigated whether the in-house data available from a governmental institution (Institute of Meteorology and Water Management) can be used to simulate aggregated PV generation on a country level. The results obtained proved the high value of the already available data and its promising applications in power system expansion studies as well as research dedicated, for example, to optimal location of PV systems from the perspective of grid topology or the impact of PV systems on the residual load curve.

This study was intended to present the results of the first tests on the data freshly published by the Polish Transmission System Operator (PSE). Therefore, we did not go into detail comparing different statistical or machine learning techniques. Clearly, the data sample is relatively short for now and, at the time of writing, was limited to one month. In the future, we plan to extend this research by (a) investigating in detail the impact of the input set on the model performance, (b) comparing different simulation tools, (c) examining the impact of a longer series of meteorological parameters on the quality of the model, and finally (d) selecting a minimal set of representative meteorological stations sufficient for simulations. The results presented here should be taken with caution since solar irradiation has a high annual variability, and model performance might, therefore, vary depending on the part of the year.

Conflicts of Interest:
The authors declare no conflict of interest.