The best set of variables (and their interactions) for constructing the multilinear regression used as preprocessing is both site-dependent and year-dependent (i.e., each calibration year yields different coefficients for the selected variables).
To illustrate this, Table 3 shows the coefficients of the multilinear regression of measured kt. To compare these coefficients between locations and between periods, we show three locations (DAA, GOB and SBO) and different periods (either the entire period available or a single year). Each regression shown in Table 3 is selected among 1050 models generated from the selected variables and their combinations (the sign “-” indicates that the variable in question is not used in the optimal regression). The regressions found vary from one site (and period) to another, each using a different set of variables.
It is interesting to note how the model predictor values differ between the years analyzed at SBO site: using different single years to develop the site-adaptation models leads to slight differences between them, which in several cases require different predictors. Nevertheless, the parameters found in different years are highly correlated: the coefficient of determination (R-squared or R2) between the parameters found in 2007, 2009 and 2010 is >0.99, and so is the R2 between the parameters found in these years and those found for the whole period. In contrast, the parameters found in 2008 show lower R2 values with respect to the other individual years (R2 = 0.92) and to the whole period (R2 = 0.96). This suggests that neither the model nor its correction behaves homogeneously: although the model can be corrected well with most years, there may be specific years whose behavior differs from the others, as the correlation results indicate (although this difference is slight). It is also interesting to highlight the stability of the kt predictor value in all cases analyzed at SBO site: the average of the values found for different years (1.8211) is very close to the value found using the whole period available (1.8366). In addition, the values found in different years are close to each other (their standard deviation is 3.7% of their average). The combinations of kt with Kc and with m are also stable across the cases analyzed.
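The selection of one regression among many candidate models can be sketched as an exhaustive search over predictor subsets. This is a minimal illustration under stated assumptions, not the procedure used for Table 3: the selection criterion (hold-out RMSE), the synthetic data, the candidate predictors and the helper names `fit_ols`, `rmse` and `best_subset` are all hypothetical.

```python
from itertools import combinations

import numpy as np


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def rmse(X, y, coef):
    """Root mean square error of the fitted regression on (X, y)."""
    A = np.column_stack([np.ones(len(y)), X])
    return float(np.sqrt(np.mean((y - A @ coef) ** 2)))


def best_subset(predictors, y, train, max_terms=3):
    """Fit every subset of up to max_terms predictors on the training samples
    and keep the one with the lowest RMSE on the remaining samples."""
    valid = ~train
    best = None
    for k in range(1, max_terms + 1):
        for subset in combinations(predictors, k):
            X = np.column_stack([predictors[name] for name in subset])
            coef = fit_ols(X[train], y[train])
            score = rmse(X[valid], y[valid], coef)
            if best is None or score < best[2]:
                best = (subset, coef, score)
    return best


# Synthetic data (assumption): modeled kt, air mass m, Kc and one interaction term.
rng = np.random.default_rng(0)
n = 400
kt_mod = rng.uniform(0.2, 0.8, n)
m = rng.uniform(1.0, 5.0, n)
Kc = rng.uniform(0.3, 1.0, n)
predictors = {"kt": kt_mod, "m": m, "Kc": Kc, "kt*m": kt_mod * m}
kt_meas = 0.1 + 1.8 * kt_mod - 0.05 * kt_mod * m + rng.normal(0.0, 0.01, n)

train = np.arange(n) % 2 == 0  # alternate samples for fitting and scoring
subset, coef, score = best_subset(predictors, kt_meas, train)
print(subset, np.round(coef, 3), round(score, 4))
```

Scoring on samples not used for fitting keeps the search from always preferring the largest subset, which is what a pure in-sample error would do.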
3.2. Site-Adaptation Performance
To illustrate the improvement of Class A metrics (regarding the dispersion or error of individual points), the scatterplots at GOB site between modeled (left) and adapted (right) DNI are shown in Figure 3, where the slope-one line is shown in purple. The modeled scatterplot shows a marked overestimation with respect to measured data, as reflected in the accumulation of points above the slope-one line (higher data concentration is represented by yellow and red colors). Conversely, the site-adapted scatterplot (Figure 3, right) is placed symmetrically around the slope-one line, with the higher data concentration (shown in red) located on this line. There is also an appreciable reduction in dispersion between adapted and measured values with respect to the modeled ones, as reflected by the narrower yellow and red areas.
illustrates the statistical indicators representing the accuracy of the modeled and adapted solar irradiance time series with respect to the measured ones, averaged over all sites analyzed. The results indicate that the modeled GHI time series perform, in general, better than the DNI ones, and that the site-adaptation procedure generally increases the accuracy. It is worth mentioning that the preprocessing (Section 2.2.1) in the site-adaptation procedure reduces both Class A (dispersion) and Class C (distribution similarity) indicators, whereas Class B (overall performance) indicators are not affected by this preprocessing. For example, the preprocessing reduces the relRMSD of DNI (averaged over all sites analyzed) by 4.5 percentage points: without preprocessing in the site-adaptation, the adapted DNI time series shows on average a relRMSD of 34.3% (close to the modeled one, 34.9%), whereas applying the preprocessing reduces this value to 29.8%. Similarly, the preprocessing reduces the KSI of DNI and GHI (averaged over all sites analyzed) by 10.1% and 3.8%, respectively, with respect to the same adaptation without preprocessing.
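The Class A and Class B indicators quoted in this section can be computed as in the sketch below. The formulas are the commonly used definitions (relative bias, mean absolute deviation and root mean square deviation normalized by the measured mean, Nash-Sutcliffe efficiency, and Willmott's index of agreement) and are assumed here rather than taken from this paper; the function name `indicators` and the sample values are hypothetical.

```python
import numpy as np


def indicators(measured, modeled):
    """Class A (relbias, relMAD, relRMSD, in %) and Class B (NSE, WIA) indicators,
    using commonly used definitions (assumed, not confirmed by the source)."""
    o = np.asarray(measured, dtype=float)
    m = np.asarray(modeled, dtype=float)
    d = m - o
    mean_o = o.mean()
    sq_err = np.sum(d ** 2)
    return {
        "relbias": 100.0 * d.mean() / mean_o,
        "relMAD": 100.0 * np.abs(d).mean() / mean_o,
        "relRMSD": 100.0 * np.sqrt(np.mean(d ** 2)) / mean_o,
        "NSE": 1.0 - sq_err / np.sum((o - mean_o) ** 2),
        "WIA": 1.0 - sq_err / np.sum((np.abs(m - mean_o) + np.abs(o - mean_o)) ** 2),
    }


# Hypothetical irradiance values in W/m2; the modeled series overestimates by 10%.
measured = np.array([100.0, 200.0, 300.0, 400.0])
modeled = 1.1 * measured
stats = indicators(measured, modeled)
perfect = indicators(measured, measured)
print({name: round(value, 3) for name, value in stats.items()})
```

With these definitions, a series that matches the measurements exactly gives zero for the Class A indicators and one for NSE and WIA.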
shows bar graphs of the statistical indicators calculated at the analyzed sites, both for modeled (light red) and site-adapted (purple) GHI, showing the IQR of the latter as error bars. A large variation in the performance of the modeled series is observed, due to the different modeling approaches and the diverse site characteristics. Class A metrics reveal that site-adapted GHI is less dispersed with respect to measured GHI than the modeled one. Modeled GHI relbias values are ~1.8% on average, which is reduced to ~0.1% after applying the adaptation procedure. Likewise, a slight reduction is noticed in both relMAD and relRMSD after the site-adaptation procedure: from 11.3% to 8.7% and from 17.9% to 14.6%, respectively. It is worth mentioning the low IQR values found for these parameters. Overall performance statistical indicators (Class B) show high values for modeled GHI and are slightly improved after the site-adaptation procedure. For example, NSE is increased (on average) from 0.91 to 0.94, whereas WIA remains similar (and close to 1 in all cases). In this case, again, IQR values are low. Finally, the distribution similarity indicators (Class C) are markedly improved after the application of the site-adaptation: KSI is reduced on average by a factor of 3.3 (from 76.8% to 23.1%), whereas OVER is markedly reduced, from 17.3% to ~0.7% (on average). The IQRs of both KSI and OVER of site-adapted GHI are low, typically 5% of their corresponding modeled values. Finally, CPI values are reduced by a factor of 2.5, from 32.5% to 13.3% (on average), also with a low IQR (1.2%).
shows bar graphs of the statistical indicators calculated at the analyzed sites, both for modeled (blue) and site-adapted (purple) DNI, showing the IQR of the latter as error bars. Class A metrics reveal a high dispersion in some of the modeled DNI (relbias ranges from −15.1% to 15.1%, found at PAY and SBO sites, respectively), which markedly decreases after the site-adaptation (0.3% on average). On the other hand, both relMAD and relRMSD of site-adapted DNI are reduced by about 5 percentage points (from 24.5% to 19.2% and from 34.9% to 29.8% on average, respectively), with low IQR values (below 0.5% on average in both cases). Overall performance statistical indicators (Class B) show a variety of values for modeled DNI, lower than those obtained for GHI. In this case, again, the site-adaptation procedure improves these statistical indicators: NSE is increased from 0.64 to 0.78 (on average), whereas WIA is increased from 0.90 to 0.94 (on average). In this regard, the site-adaptation of DNI yields a more substantial improvement than that of GHI. The frequency distribution similarity is markedly improved after the application of the site-adaptation, as deduced from Class C metrics: mean KSI of modeled DNI is 222.2%, being reduced on average to 57.6% (a factor of 3.9), whereas OVER is reduced on average by a factor of 9.6 (from 141.0% to 12.1%). The IQRs of both KSI (31%) and OVER of site-adapted DNI are low, typically 9% of their corresponding modeled values.
To illustrate the distribution similarity performance of site-adaptation, modeled and site-adapted GHI (left) and DNI (right) ECDFs at GOB site are shown in Figure 6. It is worth mentioning that the measured (red, left graph) and raw modeled (blue, left graph) GHI ECDFs are already similar, and that the remaining difference is corrected after the adaptation procedure (the adapted, in purple, and measured GHI ECDFs are indistinguishable). On the other hand, the raw modeled DNI ECDF (blue, right graph) is far from the measured one (red, right graph), especially above 300 W/m², a difference that is markedly corrected after the adaptation procedure (purple, right graph).
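The ECDF comparison above is what the Class C indicators quantify. The sketch below assumes the commonly used formulation in which KSI integrates the absolute difference between the two ECDFs, OVER integrates only the part of that difference exceeding a critical value, and both are expressed as a percentage of a critical area. The critical value 1.63/√N, the use of the combined data range, the grid resolution and the names `ecdf` and `ksi_over` are assumptions, not details confirmed by this paper.

```python
import numpy as np


def ecdf(sample, grid):
    """Empirical CDF of `sample` evaluated at each point of `grid`."""
    s = np.sort(np.asarray(sample, dtype=float))
    return np.searchsorted(s, grid, side="right") / s.size


def ksi_over(measured, modeled, n_bins=100):
    """KSI and OVER in percent of the critical area (assumed formulation)."""
    o = np.asarray(measured, dtype=float)
    m = np.asarray(modeled, dtype=float)
    lo, hi = min(o.min(), m.min()), max(o.max(), m.max())
    grid = np.linspace(lo, hi, n_bins + 1)
    dx = grid[1] - grid[0]
    dn = np.abs(ecdf(m, grid) - ecdf(o, grid))  # distance between the ECDFs
    vc = 1.63 / np.sqrt(o.size)                 # critical value (assumed, large N)
    a_crit = vc * (hi - lo)                     # critical area used to normalize
    excess = np.clip(dn - vc, 0.0, None)        # part of the distance above vc

    def integrate(f):
        # Trapezoidal rule on the uniform grid
        return float(np.sum((f[1:] + f[:-1]) / 2.0) * dx)

    return 100.0 * integrate(dn) / a_crit, 100.0 * integrate(excess) / a_crit


# Synthetic data (assumption): a modeled series shifted by a constant offset.
rng = np.random.default_rng(1)
measured = rng.uniform(0.0, 1000.0, 500)
ksi_same, over_same = ksi_over(measured, measured)
ksi_shift, over_shift = ksi_over(measured, measured + 100.0)
print(round(ksi_shift, 1), round(over_shift, 1))
```

Identical distributions give zero for both indicators, and OVER can never exceed KSI because it integrates only the excess over the critical value.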
3.3. Prediction of Site-Adaptation Performance
In solar resource assessment studies, it is desirable to estimate the performance of site-adaptation procedures, since weather is not a linear system and solar irradiance is influenced by irregular oscillations in weather patterns such as ENSO (El Niño-Southern Oscillation) and NAO (North Atlantic Oscillation), aerosol dynamics, anthropogenic aerosol emissions and volcanic eruptions. The previous results show that aerosols have a higher impact on solar irradiance for year 2008 at GOV site, as seen when comparing the parameter for kc with the other parameters and with its value in other years. To guarantee that the validation results are not biased by overfitting to the measured data used to calculate the model parameters, training and validation should be done with independent data. Usually, the measured dataset is divided either into equal parts or into 80% for training and 20% for validation.
To predict the performance of the site-adaptation procedure in a long time series with only one year of measurements, we have used the first fifteen days of each month as a single dataset for defining the site-adaptation procedure, and we have applied this adaptation to the whole year. By repeating this process for all years available at each location analyzed and comparing with the actual performance of the adaptation in the long time series, we can assess the validity of this prediction.
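The data split described above can be sketched as follows: the first fifteen days of each month define the adaptation, which is then applied to the whole year. The linear correction used here is only a stand-in for the actual site-adaptation procedure, and the synthetic daily data and the name `first_half_month_mask` are hypothetical.

```python
from datetime import datetime, timedelta

import numpy as np


def first_half_month_mask(timestamps):
    """True for samples falling within the first fifteen days of their month."""
    return np.array([t.day <= 15 for t in timestamps])


# Synthetic daily series for one non-leap year (assumption).
start = datetime(2009, 1, 1)
timestamps = [start + timedelta(days=i) for i in range(365)]
rng = np.random.default_rng(2)
measured = rng.uniform(100.0, 900.0, len(timestamps))
modeled = 1.1 * measured + 20.0 + rng.normal(0.0, 10.0, len(timestamps))

train = first_half_month_mask(timestamps)

# Stand-in adaptation (assumption): linear correction fitted on the training days only.
slope, intercept = np.polyfit(modeled[train], measured[train], 1)
adapted = slope * modeled + intercept  # applied to the whole year

bias_modeled = float(np.mean(modeled - measured))
bias_adapted = float(np.mean(adapted - measured))
print(int(train.sum()), round(bias_modeled, 1), round(bias_adapted, 1))
```

Taking the first half of every month keeps all seasons represented in the training set, which a single contiguous block of the same length would not.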
shows the prediction of the site-adaptation performance of DNI using only 1 year of coincident measured and modeled data with respect to the actual performance (which is calculated using the whole period available). Vertical bars represent the IQR of the statistical indicators at each site, as every year available is analyzed separately.
shows the slope of the linear fit between the statistical indicators of all sites analyzed calculated with 1 year (of measured and modeled data) and their actual values (calculated in the whole period available), along with the corresponding R2 values. It is worth highlighting an almost perfect match of R2 for GHI and DNI, with slopes between 0.986 and 0.999 and R2 = 1.000. Similarly, relMAD and relRMSD show an R2 > 0.999, but with slightly higher slopes (in this case, the prediction underestimates actual values by ~4% or less). KSI is almost exactly predicted for DHI, but it is underestimated for GHI and DNI, although with predictable behavior, as reflected in R2 values of 0.951 for GHI and 0.905 for DNI. CPI is also underestimated for GHI, DNI and DHI (ranging from 31% to 46%), but also with great predictability (with R2 between 0.991 and 0.991). Finally, both relbias and OVER are not well predicted for either GHI or DNI (R2 < 0.207), for different reasons: adapted relbias values are usually <1% in absolute value, but with great variability within this range, and OVER predicted with a single year of measurements takes low values (<3%), whereas its actual values are up to 40%.
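The slope and R2 values discussed above come from a linear fit between the indicators predicted with one year of data and their actual values. A minimal sketch, assuming an ordinary least-squares fit with an intercept and the predicted value on the horizontal axis (so that a slope above 1 means the prediction underestimates the actual value); the function name `slope_and_r2` and the sample numbers are hypothetical.

```python
import numpy as np


def slope_and_r2(predicted, actual):
    """Slope of the least-squares line actual = slope * predicted + intercept,
    and the squared Pearson correlation between the two series."""
    p = np.asarray(predicted, dtype=float)
    a = np.asarray(actual, dtype=float)
    slope, _intercept = np.polyfit(p, a, 1)
    r = np.corrcoef(p, a)[0, 1]
    return float(slope), float(r ** 2)


# Hypothetical per-site indicator values where the prediction is 4% too low.
predicted = np.array([10.0, 20.0, 30.0, 40.0])
actual = 1.04 * predicted
slope, r2 = slope_and_r2(predicted, actual)
print(round(slope, 3), round(r2, 3))
```

A high R2 with a slope different from 1 corresponds to the "underestimated but predictable" behavior reported for KSI and CPI: the one-year estimate is biased, but by a nearly constant factor that could be corrected.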