Estimating Daily Global Solar Radiation with No Meteorological Data in Poland

The aim of the study was to calibrate the coefficients and evaluate the performance of simple day-of-the-year global solar radiation (H) models selected from the literature. Day-of-the-year models enable estimation of global solar radiation when no meteorological data are available. The study used 16-year-long data series of daily H from 15 actinometric stations located in various parts of Poland. The goodness-of-fit of the models to the measured long-term monthly average daily global solar radiation, expressed by the determination coefficient (R²), ranges from 0.94 to 0.97. The best model was selected on the basis of statistical indicators (root mean square error, RMSE; mean absolute bias error, MABE; mean absolute percentage error, MAPE). The averaged values of H computed by the recommended model deviate from the measured values by 4.16% to 8.71%. A locally calibrated day-of-the-year model provides satisfactory accuracy and, where meteorological data are unavailable, can be used to estimate mean monthly daily global solar radiation in Poland and in similar climate conditions.


Introduction
Spatial and temporal variability, as well as measurements and modeling techniques of solar radiation, a primary factor affecting the Earth's climate, are subjects of interest of many studies.
However, in Poland, probably because of limited data accessibility, solar radiation studies are rare and based on individual data series [1][2][3][4][5]. The exception is a paper concerning changes in solar radiation in the years 1961-1995 at six stations [6]. Relationships between solar radiation and meteorological elements for more than one observation series have been analyzed, for example, by [7][8][9]; these studies concerned the calibration of coefficients for an Ångström-type formula describing the linear correlation between sunshine duration and global solar radiation.
The best source of information on global solar radiation is direct measurement; however, because the actinometric network is not sufficiently dense, modeling is common practice. There are different types of models based on meteorological elements, including those which use sunshine duration as an input variable, i.e., the classic Ångström formula [10,11] and its modified versions [12][13][14]; those based on the relationship between solar radiation and air temperature [15][16][17]; and models utilizing cloudiness [18,19], relative air humidity [20], or precipitation [21].
Although the aforementioned models are commonly used, sometimes no meteorological data are available. In this case, the solution can be provided by a model based on relative geometric position between the sun and the earth, where the only variable is the day of the year.
Additionally, such a solution does not need time-consuming input data processing (quality analysis, homogenization, gap filling).
As cloudiness is usually the most variable and influential factor affecting atmospheric transmissivity for shortwave radiation [20], simple day-of-the-year models, based on trigonometric functions or nth-degree polynomials, have mainly been developed for locations with relatively large global solar radiation (H) and low, only slightly variable cloud cover, i.e., in Turkey [22], Greece [23], China [24], Morocco [25], Egypt [26], Algeria [27], and Iran [28].
In the temperate transitional climate typical for Poland, which is affected by considerable and variable cloudiness, these types of models are expected to be less accurate than those based on meteorological elements, especially in day-by-day performance evaluation, since they do not consider the variability of meteorological conditions. However, some authors also apply long-term analysis [28] based on mean monthly daily values, and this type of estimation (helpful for simulation studies of the long-term performance of solar energy utilization techniques [29]) was applied in this study.
Because a simple model that requires as little input data as possible and involves no preliminary data processing is useful in practice for the end user, the authors attempted to calibrate local coefficients and assess the prediction accuracy of selected global solar radiation models in order to answer the question of whether such an approach can be used in Poland.

Study Region and Data Collection
Poland is situated between 49.0°N and 54.5°N in the temperate transitional climate. According to long-term measured data, annual sunshine duration varies from 1377 h to 1700 h, and annual global solar radiation is in the range of 3600-3800 MJ·m⁻². Cloud cover varies from 4.8 to 5.4 octas. The annual average air temperature ranges from 6.5 °C in the northeast part of Poland to 8.5 °C in the southwest [30].
The 16-year-long data series used for this study come from 15 actinometric stations located across the country and concern the years 2000-2015. The stations were selected considering the availability of a representative sample and the geographical location. Their spatial distribution and characteristics are shown in Figure 1 and Table 1, respectively.
The data were sourced from actinometric stations run by the Institute of Meteorology and Water Management-National Research Institute (IMGW-PIB), which is responsible for the operation of meteorological stations, maintenance of regular measurements and observations, and processing, storing, and making meteorological data available [31]. The data for Warszawa and Kołobrzeg stations were obtained from a database of the World Radiation Data Centre (WRDC) [32]. An initial data quality analysis was performed by the IMGW-PIB [33], and the data transferred to the WRDC are also subject to inspection and flagging (quality assurance).
Daily sums of global solar radiation (H, [MJ·m⁻²]) provided the basis for this study. The daily values were additionally inspected for major errors and incidental values, and each value of H was compared with the extraterrestrial radiation (H0). Wherever data were missing for more than 7 days in a month, the month was excluded from the database. The final percentage of missing data is 2.4%. Detailed information on missing data is included in Table 1.
The daily values of H prepared as described above were used to determine the long-term average daily solar radiation values (the average value for each day of the year) used in further analysis, i.e., the calibration of the coefficients of the day-of-the-year models selected from the literature. The selected equations are special cases of the Fourier series with appropriately calculated coefficients, since meteorological parameters with a strong annual cycle may be well represented by the first harmonic of a Fourier expansion [34].
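The screening steps described above can be sketched as follows. The text does not state how H0 was computed, so this sketch assumes the widely used FAO-56 daily formula; the function names `extraterrestrial_radiation`, `is_plausible`, and `month_is_usable` are hypothetical and are not taken from the study.

```python
import math

# Solar constant in MJ m^-2 min^-1 (FAO-56 value; an assumption, not from the paper).
G_SC = 0.0820


def extraterrestrial_radiation(lat_deg: float, n: int) -> float:
    """Daily extraterrestrial radiation H0 [MJ m^-2] for latitude lat_deg on day of year n.

    Valid for latitudes where the sun rises and sets every day (true for Poland).
    """
    phi = math.radians(lat_deg)
    # Inverse relative Earth-Sun distance.
    dr = 1.0 + 0.033 * math.cos(2.0 * math.pi * n / 365.0)
    # Solar declination [rad].
    delta = 0.409 * math.sin(2.0 * math.pi * n / 365.0 - 1.39)
    # Sunset hour angle [rad].
    omega_s = math.acos(-math.tan(phi) * math.tan(delta))
    return (24.0 * 60.0 / math.pi) * G_SC * dr * (
        omega_s * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(omega_s)
    )


def is_plausible(h: float, lat_deg: float, n: int) -> bool:
    """A measured daily sum H should be non-negative and not exceed H0."""
    return 0.0 <= h <= extraterrestrial_radiation(lat_deg, n)


def month_is_usable(daily_values, max_missing: int = 7) -> bool:
    """Keep a month unless more than max_missing daily values are missing (None)."""
    return sum(v is None for v in daily_values) <= max_missing
```

A month with exactly 7 missing days is kept and a month with 8 is dropped, matching the "more than 7 days" rule in the text.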
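To illustrate the Fourier remark, consider a generic single-sinusoid model of the annual cycle. The form below is an illustrative assumption and is not necessarily identical to any of the equations used in this study; a, b, and c denote generic coefficients and n the day of the year.

```latex
\[
H(n) = a + b\,\sin\!\left(\frac{2\pi n}{365} + c\right)
     = a + (b\cos c)\,\sin\!\left(\frac{2\pi n}{365}\right)
         + (b\sin c)\,\cos\!\left(\frac{2\pi n}{365}\right)
\]
```

The right-hand side is the constant term plus the first harmonic of a Fourier series with a period of 365 days, so fitting such a model amounts to estimating the mean level, amplitude, and phase of the annual cycle.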

Calibration of Model Coefficients
The following day-of-the-year models from the literature were selected for the study, given as Equations (1)-(3): a sine wave proposed by Bulut et al. [22], together with a modified form of it, and a cosine wave proposed by Kaplanis and Kaplani [23]. In these equations, a, b, c, d, e, f, and g are equation coefficients and n is the day of the year. The parameters for Equations (1)-(3) were calculated for each of the 15 stations from empirical data, i.e., the 16-year-long observational data series with 5 randomly selected years withheld; the withheld years were then used to test the predictive accuracy of the models. The test series comprised the years 2002, 2010, 2011, 2013, and 2015.
The regression coefficients were determined using the least-squares method in Table Curve 2D software. This provided the equations describing global radiation for each day of the year at each location.
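As a rough illustration of least-squares calibration, the sketch below fits a generic first-harmonic model H(n) = a + b·sin(2πn/365 + c) to day-of-the-year averages. This functional form is an assumption chosen for illustration and may differ from Equations (1)-(3); the study itself used dedicated curve-fitting software, and the helper names here (`solve_3x3`, `fit_first_harmonic`, `predict`, `r_squared`) are hypothetical. The model is rewritten as a + p·sin(θ) + q·cos(θ), which is linear in (a, p, q), so the normal equations can be solved directly.

```python
import math


def solve_3x3(m, v):
    """Solve the 3x3 linear system m x = v by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for k in range(col, 4):
                a[r][k] -= factor * a[col][k]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (a[r][3] - sum(a[r][k] * x[k] for k in range(r + 1, 3))) / a[r][r]
    return x


def fit_first_harmonic(days, values, period=365.0):
    """Least-squares fit of values ~ a + b*sin(2*pi*n/period + c); returns (a, b, c)."""
    rows = [
        (1.0, math.sin(2.0 * math.pi * n / period), math.cos(2.0 * math.pi * n / period))
        for n in days
    ]
    # Normal equations (X^T X) beta = X^T y for beta = (a, p, q).
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rhs = [sum(r[i] * h for r, h in zip(rows, values)) for i in range(3)]
    a, p, q = solve_3x3(normal, rhs)
    # Convert (p, q) = (b*cos c, b*sin c) back to amplitude b and phase c.
    return a, math.hypot(p, q), math.atan2(q, p)


def predict(n, a, b, c, period=365.0):
    """Model value for day of year n."""
    return a + b * math.sin(2.0 * math.pi * n / period + c)


def r_squared(measured, fitted):
    """Determination coefficient R^2 of fitted values against measured values."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - f) ** 2 for m, f in zip(measured, fitted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot
```

In this sketch, calibration would use the day-of-the-year averages from the training years only, with the withheld years reserved for testing; models that are nonlinear in their coefficients would need an iterative solver instead of the normal equations.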
The coefficients for Equations (1)-(3) and the values of the determination coefficient R², describing the goodness of fit of the model for each station, are collected in Table 2.

Model Validation
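The performance of Models (1)-(3) was assessed using the statistical parameters most common in the solar field, described by [35]:

Root mean square error (RMSE):
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(H_{c,i} - H_{m,i}\right)^{2}}$$

Mean absolute bias error (MABE):
$$\mathrm{MABE} = \frac{1}{N}\sum_{i=1}^{N}\left|H_{c,i} - H_{m,i}\right|$$

Mean absolute percentage error (MAPE):
$$\mathrm{MAPE} = \frac{100}{N}\sum_{i=1}^{N}\left|\frac{H_{c,i} - H_{m,i}}{H_{m,i}}\right|$$

where Hm is the measured long-term monthly average daily global solar radiation (from the years selected at random), Hc is the calculated monthly average daily global solar radiation, and N is the number of compared values. Hc was determined for the so-called average day of the month, which simplifies the calculation procedure for the user. The most representative days of the year for the successive months are the following: 17, 47, 75, 105, 135, 162, 198, 228, 258, 288, 318, and 344 [36].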
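The three indicators can be computed as in the minimal sketch below, which assumes their standard definitions (the exact formulation in [35] may differ in notation). The function names and the `AVERAGE_DAYS` constant are hypothetical; the day numbers themselves are those listed in the text.

```python
import math

# Representative day of the year for each month, January to December, as listed in the text.
AVERAGE_DAYS = [17, 47, 75, 105, 135, 162, 198, 228, 258, 288, 318, 344]


def rmse(calculated, measured):
    """Root mean square error, in the units of the inputs."""
    n = len(measured)
    return math.sqrt(sum((c - m) ** 2 for c, m in zip(calculated, measured)) / n)


def mabe(calculated, measured):
    """Mean absolute bias error, in the units of the inputs."""
    n = len(measured)
    return sum(abs(c - m) for c, m in zip(calculated, measured)) / n


def mape(calculated, measured):
    """Mean absolute percentage error, in percent; measured values must be non-zero."""
    n = len(measured)
    return 100.0 * sum(abs((c - m) / m) for c, m in zip(calculated, measured)) / n
```

For a given station, `calculated` would hold the model values evaluated at the twelve days in `AVERAGE_DAYS`, and `measured` the corresponding monthly averages from the test years.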
The annual series of measured and calculated mean monthly daily global solar radiation were plotted using R software [37].

Results and Discussion
Daily values of global solar radiation, averaged over 11 years for each of the 15 stations, were the basis for the calibration of the equation coefficients (Table 2). The goodness-of-fit of Models (1)-(3), expressed by the determination coefficient R², did not vary considerably between the stations and ranged from 0.94 (Lesko) to 0.97 (Kołobrzeg). Although these values are slightly less favorable than in other climate zones [22][23][24][25][26][27][28], the results indicate that the models are well fitted to the averaged global solar radiation data. Furthermore, the precision of the models was assessed by means of statistical indicators: RMSE, MABE, and MAPE. Their values, collected in Table 3, were calculated using a separate, independent dataset. [...] (1) and (2), respectively). The highest RMSE, 1.10 MJ·m⁻², was noted for Model (2) at the Łeba station (1.02 MJ·m⁻² and 1.01 MJ·m⁻² for Models (1) and (3), respectively). The lowest values of RMSE occur most frequently for Equation (3).
MABE ranges from 0.31 MJ·m⁻² to 0.87 MJ·m⁻². The best agreement between the calculated and measured values was observed for Equation (3), and the maximum difference in MABE between the analyzed models does not exceed 0.13 MJ·m⁻². Except for the Wieluń and Włodawa stations, similar values of MABE were obtained for Equations (1) and (3). Figure 2 shows the agreement between the measured and calculated values of long-term monthly average daily solar radiation. The model clearly underestimates H for Lesko, Toruń, and Łeba, while it overestimates H at the Wieluń and Łódź stations (Figure 2). MAPE ranges from 4.16% to 8.71% (Model (1)), from 5.68% to 14.20% (Model (2)), and from 4.02% to 9.15% (Model (3)). The highest values of MAPE were obtained for Equation (2), whereas they were similar for Equations (1) and (3).
Additionally, model coefficients for Poland as a whole were calculated for Equation (1). The results are presented in Tables 2 and 3. RMSE for this equation varies from 0.54 to 1.29 MJ·m⁻², while MABE and MAPE vary from 0.46 to 1.02 MJ·m⁻² and from 5.9% to 11.4%, respectively.

Conclusions
Simple solar radiation models, which use only the day of the year as input, were calibrated and verified with measured data for the solar climate of Poland. As expected, the goodness of fit of the tested models, expressed by the determination coefficient (R²), is slightly worse than in other climatic zones, yet it remains within 0.94-0.97 and can be considered satisfactory.
Regarding long-term monthly average daily solar radiation, the values of the statistical parameters (RMSE, MABE, MAPE) show only slight differences between the tested Models (1)-(3). Larger differences can be observed between the individual stations analyzed.
The best prediction accuracy was obtained for Equations (1) and (3). The lowest values of the statistical parameters were most often observed for Equation (3), but considering that the differences are negligible and Equation (3) is more complex, the authors recommend Formula (1) as a simple tool for the prediction of long-term monthly average daily solar radiation in a temperate climate when meteorological data are not available. The results also indicate that the general equation for Poland as a whole gives satisfactory accuracy and can be applied to estimate mean monthly daily solar radiation.