3.1. Preprocessing
The best set of variables (and their interactions) for constructing the multilinear regression used as preprocessing turns out to be both site-dependent and year-dependent (i.e., each year used for calibration yields different coefficients for the selected variables).
To illustrate this, Table 3 shows the coefficients for the multilinear regression of measured kt. To compare these coefficients between locations and between periods, we show three locations (DAA, GOB and SBO) and different periods (either the entire period available or a single year). Each regression shown in Table 3 is selected among 1050 models generated from the selected variables and their combinations (the sign "-" indicates that the variable in question is not used in the optimal regression). The regressions found for each site (and period) vary from one location to another, each using a different set of variables.
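The selection of one regression among many candidates can be sketched as follows. This is only a minimal illustration under stated assumptions, not the procedure used in the study: the predictor names, the synthetic data, the limit of three terms per model and the use of the Bayesian information criterion (BIC) to rank candidates are all hypothetical, since this section does not state the selection criterion.

```python
from itertools import combinations

import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns coefficients and BIC."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    n, k = A.shape
    rss = float(resid @ resid)
    bic = n * np.log(rss / n) + k * np.log(n)  # assumed ranking criterion
    return coef, bic


def best_subset(terms, y, max_terms=3):
    """Try every subset of candidate terms up to max_terms; keep the lowest BIC."""
    best = None
    for size in range(1, max_terms + 1):
        for subset in combinations(list(terms), size):
            X = np.column_stack([terms[name] for name in subset])
            coef, bic = fit_ols(X, y)
            if best is None or bic < best[0]:
                best = (bic, subset, coef)
    return best


# Synthetic stand-in data (hypothetical names; not values from the study).
rng = np.random.default_rng(0)
n = 500
kt_model = rng.uniform(0.1, 0.8, n)  # modeled clearness index
kc = rng.uniform(0.2, 1.0, n)        # clear-sky index
m = rng.uniform(1.0, 5.0, n)         # relative air mass
kt_measured = 0.05 + 0.9 * kt_model + 0.3 * kt_model * kc + rng.normal(0, 0.01, n)

# Candidate terms: single variables plus pairwise interactions.
candidates = {
    "kt": kt_model, "kc": kc, "m": m,
    "kt*kc": kt_model * kc, "kt*m": kt_model * m, "kc*m": kc * m,
}
bic, selected, coef = best_subset(candidates, kt_measured)
print(selected, coef)
```

With six candidate terms and at most three per model, this sketch evaluates far fewer than the 1050 models mentioned above; the exhaustive search is the same idea on a larger candidate set.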
It is interesting to note how the model predictors' values differ between the years analyzed at the SBO site: using different single years to develop the site-adaptation models leads to slight differences between them, which in several cases require different predictors. Nevertheless, the parameters found in different years are highly correlated: the coefficient of determination (R-squared or R2) between the parameters found in 2007, 2009 and 2010 is >0.99, as is the R2 between the parameters found in these years and those found for the whole period. In contrast, the parameters found in 2008 show lower R2 values with respect to the other individual years (R2 = 0.92) and to the whole period (R2 = 0.96). This suggests that the behavior of the model and its correction is not homogeneous: although the model can be corrected well with most years, specific years may behave differently from the others, although the correlation results indicate that this difference is slight. It is also worth highlighting the stability of the kt predictor value in all cases analyzed at the SBO site: the average of its values for the different years (1.8211) is very close to the value found using the whole period available (1.8366). In addition, its values for the different years are close to each other (their standard deviation is 3.7% of their average). The combinations of kt with Kc and with m are also stable across the cases analyzed.
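The two stability checks described above (R2 between coefficient sets and the standard deviation expressed as a percentage of the mean) can be sketched as follows. The coefficient vectors are placeholders, not Table 3 values, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def r_squared(a, b):
    """Squared Pearson correlation between two equally long coefficient vectors."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov ** 2 / (var_a * var_b)


def relative_spread(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


# Placeholder coefficient vectors (illustrative only, not Table 3 values).
coef_year_a = [1.80, -0.42, 0.15, 0.03]
coef_year_b = [1.86, -0.40, 0.14, 0.05]

print(f"R2 between years: {r_squared(coef_year_a, coef_year_b):.3f}")
print(f"Relative spread: {relative_spread([1.80, 1.86, 1.78]):.1f}%")
```

How predictors that are absent in one year (marked "-" in Table 3) enter such a comparison is not covered by this sketch.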
3.2. Site-Adaptation Performance
To illustrate the improvement of Class A metrics (regarding the dispersion or error of individual points), scatterplots of modeled (left) and adapted (right) DNI at the GOB site are shown in Figure 3, where the slope-one line is shown in purple. The modeled scatterplot shows a marked overestimation with respect to measured data, as reflected in the accumulation of points above the slope-one line (higher data concentration is represented by yellow and red colors). Conversely, the site-adapted scatterplot (Figure 3, right) is distributed symmetrically around the slope-one line, with the highest data concentration (shown in red) located on this line. The adapted values are also appreciably less dispersed with respect to the measured ones than the modeled values are, as reflected by the narrower yellow and red areas.
Table 4 presents the statistical indicators representing the accuracy of the modeled and adapted solar irradiance time series with respect to the measured ones, averaged over all sites analyzed. The results indicate that the modeled GHI time series perform, in general, better than the DNI ones, and that the site-adaptation procedure generally increases the accuracy. It is worth mentioning that the preprocessing (Section 2.2.1) in the site-adaptation procedure reduces both Class A (dispersion) and Class C (distribution similarity) indicators, whereas Class B (overall performance) indicators are not affected by this preprocessing. For example, the preprocessing reduces the relRMSD of DNI (averaged over all sites analyzed) by 4.5 percentage points: without preprocessing in the site-adaptation, the adapted DNI time series shows on average a relRMSD of 34.3% (close to the modeled one, 34.9%), whereas applying the preprocessing reduces this value to 29.8%. Similarly, the preprocessing reduces the KSI of DNI and GHI (averaged over all sites analyzed) by 10.1% and 3.8%, respectively, with respect to the same adaptation without preprocessing.
Figure 4 shows bar graphs of the statistical indicators calculated at the analyzed sites, both for modeled (light red) and site-adapted (purple) GHI, showing the IQR of the latter as error bars. A large variation in the performance of the modeled series is observed, due to the different modeling approaches and the diverse site characteristics. Class A metrics reveal that site-adapted GHI is less dispersed with respect to measured GHI than the modeled one. Modeled GHI relbias values are ~1.8% on average, which the adaptation procedure reduces to ~0.1%. Likewise, a slight reduction is noticed in both relMAD and relRMSD after the site-adaptation procedure: from 11.3% to 8.7% and from 17.9% to 14.6%, respectively. It is worth mentioning the low IQR values found for these parameters. Overall performance statistical indicators (Class B) show high values for modeled GHI and are slightly improved by the site-adaptation procedure. For example, NSE increases (on average) from 0.91 to 0.94, whereas WIA remains similar (and close to 1 in all cases). Here again, IQR values are low. Finally, the distribution similarity indicators (Class C) are markedly improved by the site-adaptation: KSI is reduced on average by a factor of 3.3 (from 76.8% to 23.1%), whereas OVER is markedly reduced, from 17.3% to ~0.7% (on average). The IQRs of both KSI and OVER of site-adapted GHI are low, typically 5% of their corresponding modeled values. Finally, CPI values are reduced by a factor of 2.5, from 32.5% to 13.3% (on average), also with a low IQR (1.2%).
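For reference, the Class A and Class B indicators and the CPI can be computed as in the sketch below. The formulas are the commonly used definitions of these indicators and are assumed to match those given in Section 2; the function names and the sample values are hypothetical.

```python
from math import sqrt


def class_a_b_metrics(measured, modeled):
    """Return relbias, relMAD, relRMSD (all in %), NSE and WIA."""
    n = len(measured)
    mean_meas = sum(measured) / n
    diffs = [mod - meas for meas, mod in zip(measured, modeled)]
    sq_err = sum(d * d for d in diffs)
    relbias = 100.0 * (sum(diffs) / n) / mean_meas
    relmad = 100.0 * (sum(abs(d) for d in diffs) / n) / mean_meas
    relrmsd = 100.0 * sqrt(sq_err / n) / mean_meas
    # Nash-Sutcliffe efficiency: 1 minus error variance over measured variance.
    nse = 1.0 - sq_err / sum((meas - mean_meas) ** 2 for meas in measured)
    # Willmott's index of agreement.
    wia = 1.0 - sq_err / sum(
        (abs(mod - mean_meas) + abs(meas - mean_meas)) ** 2
        for meas, mod in zip(measured, modeled)
    )
    return {"relbias": relbias, "relMAD": relmad, "relRMSD": relrmsd,
            "NSE": nse, "WIA": wia}


def cpi(ksi, over, relrmsd):
    """Combined performance index from KSI, OVER and relRMSD (all in %)."""
    return (ksi + over + 2.0 * relrmsd) / 4.0


# Placeholder irradiance values in W/m2 (not data from the study).
measured = [100.0, 200.0, 300.0, 400.0]
modeled = [110.0, 190.0, 310.0, 390.0]
scores = class_a_b_metrics(measured, modeled)
print(scores)
```

The assumed CPI definition is consistent with the GHI averages quoted above: (76.8 + 17.3 + 2 × 17.9)/4 ≈ 32.5 for the modeled series and (23.1 + 0.7 + 2 × 14.6)/4 ≈ 13.3 for the adapted one.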
Figure 5 shows bar graphs of the statistical indicators calculated at the analyzed sites, both for modeled (blue) and site-adapted (purple) DNI, showing the IQR of the latter as error bars. Class A metrics reveal a high dispersion in some of the modeled DNI series (relbias ranges from −15.1% to 15.1%, found at the PAY and SBO sites, respectively), which markedly decreases after the site-adaptation (0.3% on average). In addition, both relMAD and relRMSD of site-adapted DNI are reduced by about 5 percentage points (from 24.5% to 19.2% and from 34.9% to 29.8% on average, respectively), with low IQR values (below 0.5% on average in both cases). Overall performance statistical indicators (Class B) show a variety of values for modeled DNI, lower than those obtained for GHI. Here again, the site-adaptation procedure improves these statistical indicators: NSE increases from 0.64 to 0.78 (on average), whereas WIA increases from 0.90 to 0.94 (on average). In this regard, the site-adaptation of DNI yields a more substantial improvement than the adaptation of GHI. The frequency distribution similarity is markedly improved by the site-adaptation, as shown by Class C metrics: the mean KSI of modeled DNI is 222.2%, which is reduced on average to 57.6% (a factor of 3.9), whereas OVER is reduced on average by a factor of 9.6 (from 141.0% to 12.1%). The IQRs of both KSI (31%) and OVER of site-adapted DNI are low, typically 9% of their corresponding modeled values.
To illustrate the distribution similarity performance of site-adaptation, modeled and site-adapted GHI (left) and DNI (right) ECDFs at the GOB site are shown in Figure 6. It is worth mentioning that the measured (red, left graph) and raw modeled (blue, left graph) GHI ECDFs are already similar, and that the remaining difference is corrected by the adaptation procedure (the adapted, in purple, and measured GHI ECDFs are indistinguishable). On the other hand, the raw modeled DNI ECDF (blue, right graph) is far from the measured one (red, right graph), especially above 300 W/m², which is markedly corrected by the adaptation procedure (purple, right graph).
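The ECDF comparison behind the Class C indicators can be sketched as follows. This follows a commonly used formulation of KSI and OVER, in which the absolute difference between the two ECDFs is integrated and normalized by a critical area; it is assumed, not confirmed, that the study uses the same formulation. The critical distance 1.63/√N (valid for N ≥ 35), the integration range (minimum to maximum measured value) and the midpoint-rule integration are assumptions of this sketch.

```python
from bisect import bisect_right
from math import sqrt


def ecdf(sample):
    """Return a function giving the empirical CDF of `sample` at x."""
    ordered = sorted(sample)
    n = len(ordered)

    def cdf(x):
        return bisect_right(ordered, x) / n

    return cdf


def ksi_over(measured, modeled, n_grid=1000):
    """KSI and OVER (both in %) from the absolute ECDF difference."""
    f_meas, f_mod = ecdf(measured), ecdf(modeled)
    x_min, x_max = min(measured), max(measured)
    step = (x_max - x_min) / n_grid
    v_crit = 1.63 / sqrt(len(measured))  # critical distance (assumed, N >= 35)
    a_crit = v_crit * (x_max - x_min)    # critical area used for normalization
    ksi_area = over_area = 0.0
    for i in range(n_grid):
        x = x_min + (i + 0.5) * step     # midpoint rule
        d = abs(f_mod(x) - f_meas(x))
        ksi_area += d * step
        over_area += max(d - v_crit, 0.0) * step
    return 100.0 * ksi_area / a_crit, 100.0 * over_area / a_crit


# Placeholder samples in W/m2: the "modeled" series is the measured one shifted
# by a constant, which mimics a systematic overestimation.
measured = [float(v) for v in range(0, 1000, 10)]
modeled = [v + 100.0 for v in measured]
ksi, over = ksi_over(measured, modeled)
print(f"KSI = {ksi:.1f}%, OVER = {over:.1f}%")
```

In this formulation OVER only accumulates where the ECDF difference exceeds the critical distance, which is why it can be small or zero while KSI remains appreciable.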
3.3. Prediction of Site-Adaptation Performance
In solar resource assessment studies, it is desirable to estimate the performance of site-adaptation procedures, since weather is not a linear system and solar irradiance is influenced by irregular oscillations in weather patterns such as ENSO (El Niño-Southern Oscillation) and NAO (North Atlantic Oscillation), as well as by aerosol dynamics, anthropogenic aerosol emissions and volcanic eruptions. In the previous results, aerosols had a higher impact on solar irradiance in 2008 at the GOV site, as can be seen by comparing the parameter for kc with the other parameters and with its value in other years. To guarantee that the validation results are not biased by overfitting to the measured data used to calculate the model parameters, training and validation should be done with independent data. Usually, the measured dataset is divided either in equal percentages or into 80% for training and 20% for validation.
To predict the performance of the site-adaptation procedure in a long time series with only one year of measurements, we have used the first fifteen days of each month as a single dataset for defining the site-adaptation procedure, and we have applied this adaptation to the whole year. By repeating this process for all years available at each location analyzed and comparing with the actual performance of the adaptation in the long time series, we can assess the validity of this prediction.
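The data split described above can be sketched as follows. The record layout (timestamp, measured value, modeled value), the hourly resolution and the function name are hypothetical; only the rule "first fifteen days of each month for defining the adaptation, whole year for applying it" comes from the text.

```python
from datetime import datetime, timedelta


def split_first_half_of_month(records, last_training_day=15):
    """Split (timestamp, measured, modeled) records into the training subset
    (days 1..last_training_day of every month) and the full application set."""
    training = [r for r in records if r[0].day <= last_training_day]
    return training, list(records)


# Hypothetical hourly records for one non-leap year; values are placeholders.
start = datetime(2009, 1, 1)
records = [(start + timedelta(hours=h), 0.0, 0.0) for h in range(365 * 24)]
training, full_year = split_first_half_of_month(records)
print(len(training), len(full_year))
```

Taking the first half of every month keeps all seasons represented in the training subset, in contrast to a split that reserves one contiguous block of the year.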
Figure 7 shows the prediction of the site-adaptation performance of DNI using only 1 year of coincident measured and modeled data with respect to the actual performance (which is calculated using the whole period available). Vertical bars represent the IQR of the statistical indicators at each site, as every year available is analyzed separately.
Table 5 shows the slope of the linear fit between the statistical indicators of all sites analyzed calculated with 1 year (of measured and modeled data) and their actual values (calculated over the whole period available), along with the corresponding R2 values. It is worth highlighting an almost perfect match of R2, NSE and WIA for GHI and DNI, with slopes between 0.986 and 0.999 and R2 = 1.000. Similarly, relMAD and relRMSD show an R2 > 0.999, but with slightly higher slopes (in this case, the prediction underestimates actual values by ~4% or less). KSI is almost exactly predicted for DHI, but it is underestimated for GHI and DNI, albeit with predictable behavior, as reflected in an R2 of 0.951 for GHI and 0.905 for DNI. CPI is also underestimated for GHI, DNI and DHI (with underestimations ranging from 31% to 46%), but again with great predictability (with R2 between 0.991 and 0.991). Finally, neither OVER nor relbias is well predicted for either GHI or DNI (R2 < 0.207), for different reasons: adapted relbias values are usually <1% in absolute value but vary greatly within this range, and OVER predicted with a single year of measurements takes low values (<3%), whereas its actual values reach up to 40%.
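The slope and R2 reported for each indicator can be obtained with an ordinary least-squares line, as in the sketch below. It is assumed here that the single-year (predicted) values are the independent variable and the whole-period (actual) values the dependent one, so that a slope above 1 corresponds to an underestimation by the prediction, and that the fit includes an intercept; neither detail is stated in the text. The per-site values are placeholders, not data from Table 5.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept, plus R2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = sxy ** 2 / (sxx * syy)
    return slope, intercept, r2


# Placeholder per-site values of one indicator (not data from Table 5).
predicted = [10.0, 15.0, 20.0, 30.0]  # computed from a single year
actual = [10.4, 15.6, 20.8, 31.2]     # computed over the whole period
slope, intercept, r2 = linear_fit(predicted, actual)
print(f"slope = {slope:.3f}, R2 = {r2:.3f}")
```

A high R2 with a slope different from 1 means the single-year estimate is biased but correctable, which is the situation described above for KSI and CPI.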