Short-Term Deterministic Solar Irradiance Forecasting Considering a Heuristics-Based, Operational Approach

: Solar energy is an economic and clean power source subject to natural variability, while energy storage might attenuate it, ultimately, effective and operationally feasible forecasting techniques for energy management are needed for better grid integration. This work presents a novel deterministic forecast method considering: irradiance pattern classiﬁcation, Markov chains, fuzzy logic and an operational approach. The method developed was applied in a rolling manner for six years to a target location with no prior data to assess performance and its changes as new local data becomes available. Clearness index, diffuse fraction and irradiance hourly forecasts are analyzed on a yearly basis but also for 20 day types, and compared against smart persistence. Results show the proposed method outperforms smart persistence by ~10% for clearness index and diffuse fraction on the base case, but there are signiﬁcant differences across the 20 day types analyzed, reaching up to +60% for clear days. Forecast lead time has the greatest impact in forecasting performance, which is important for any practical implementation. Seasonality in data gaps or rejected data can have a deﬁnite effect in performance assessment. A novel, comprehensive and detailed analysis framework was shown to present a better assessment of forecasters’ performance.


Introduction
As installed solar energy capacity increases worldwide, its appropriate management becomes a pressing matter for electric system operators.As more efforts are directed towards further renewable energy integration to established grids, the issue of natural variability of renewable energy becomes concerning.Several studies show that renewable energy dumps due to system operational constraints can be significant and have proposed different dispatch schemes to counter this [1,2].This does not solve the issue, but complicates it by introducing a new aspect to energy management: appropriate storage and dispatch of energy; hence, solar power forecasting becomes an invaluable tool.To this end, several methods have been developed to produce forecasts and have been presented in comprehensive literature reviews [3][4][5][6][7][8], with statistical methods being very popular.autoregressive (AR), when only  0, the model is called autoregressive with external inputs (ARX) and when C 0 it is an autoregressive moving-average model with external inputs (B 0, ARMAX model) or without them ( 0, ARMA model) [9].

𝐴 𝑌 ⋯ 𝐴 𝑌 𝐵 𝑈 ⋯ 𝐵 𝑈 𝐶 𝑒 ⋯ 𝐶 𝑒
A procedure presented by Box et al. [10] uses the autocorrelation (ACF) and partial autocorrelation functions (PACF) of the time series, to determine model structure.Since time series can exhibit seasonality, it is often suggested to use the derivative of the time series for the seasonality period, to eliminate this trend and better determine the model's dynamics, calling such models SARMA (seasonal autoregressive moving average).The core of the methodology is that a particular model (AR/ARMA) exhibits a distinct, identifiable pattern in its ACF and PACF.By identifying the lags in which the ACF and PACF peak or decay, the model order (m/n/p parameters in Equation ( 1)) can be determined.Finally, linear or nonlinear estimators are employed to determine the model's coefficients.

Emergence and Challenges of Artificial Intelligence as Forecasting Tool
Recently, applications of artificial intelligence techniques for solar power forecasting have continually increased due to their ability to model highly nonlinear phenomena and pattern identification, provided by analyzing correlation and clustering between input and output variables [11].Likewise, Cai et al. [12] present the contrast against conventional statistical techniques, their shortcomings and the emergence of deep learning techniques as a response.These techniques rely on two main concepts: first, identify and classify patterns or features to be reproduced and second, the actual forecast is based on the training of a structure to generate outputs based on the identified pattern to be reproduced.Examples are presented in Table 1 and current issues for these methods on Table 2.
Table 1.Example artificial intelligence approaches for solar radiation forecasting.

Source Method Dataset
Lan et al. [13] Combination of frequency analysis to identify irradiance patterns, principal component analysis to identify characteristic features and neural networks are used to forecast future features of irradiance that are translated back to an irradiance forecast.
One year of data Theocharides et al. [14] Several neural networks were developed to produce GHI forecasts, which are then clusterized and finally, through statistical processing, the final forecast is produced as a linear combination of the clusters.

days
Du Plessis et al. [15] Several neural networks were developed for meteorological data to forecast a photovoltaic plant output at the subunit level and then scaled up to plant-wide production.
Two years training and 1 year of validation data.Table 2. Current issues in artificial intelligence applications for solar power forecasting.

Source Issue or Concern
Wang et al. [16] and Sethi and Kantardzic [17] Neural networks/deep learning approaches have not seen sufficient adoption, despite growing interest in them, due to their complex black-box nature and lack of explainability and interpretability Wang. et al. [18] These approaches are prone to model overfitting and insufficient generalization ability, being hyperspecific.
Wang. et al. [16] Explainability is of great importance, therefore, proposed a new approach through direct explainable neural networks that can provide further insights in the input-output relationship to assist in result interpretation and model explanation.
Ahmed et al. [19] Appropriate weather classification is important for solar photovoltaic power forecasting assessment, and there are challenges to overcome in these classifications, presenting that most authors employ four or less classes.
Wang et al. [20] Separate forecast models for each weather class should improve forecasting performance; therefore, having a higher number of classes would be beneficial.
Regardless of the success of these techniques, one issue remains regarding the interoperability of the approach, meaning that the resulting structures are created based on the specific characteristics of the data and the classifier itself is specific to each case of study.Therefore, a generalized classification scheme would benefit the development of forecasting methods by allowing interoperability and ease of implementation and comparison across studies.

Current State of the Art in Solar Power Forecasting Performance Assessment
Several well-regarded metrics have been presented in the literature for performance evaluation, however, there is not a defined standard on which metrics to report nor how appropriate they are as meaningful indicators of a forecaster's performance [21].Furthermore, when comparison against a reference baseline method is required, there may not be an overall agreement on the details of the implementation of such method and there is still debate regarding this topic, to which works such as [22] attempt to set the details for a standardized method.Additionally, as solar power forecasting has transitioned from the naïve persistence (irradiance  remains constant through forecast window, Equation ( 2), [23][24][25]) towards the smart persistence as the reference method (clear sky index  remains constant through forecast window, Equation ( 3), [25,26]), the readers must be aware of which reference was used, making analysis and result comparison more difficult.
From this body of knowledge, a baseline can be established regarding the performance of different forecasting methods.Results presented in [6,8] were analyzed considering that: (a) must have been assessed with a minimum of one year of validation data and (b) only hourly forecasts were considered.The normalized root mean square error (nRMSE) was the most commonly reported metric, with typical values in the range of 17 ± 8.25%, although values of 3% and 38% have been reported as well.Figure 1 presents results from selected works [27,28]; however, the reader is referred to [6,8] for a more detailed coverage of performance statistics.
Despite the fact that several well-known metrics and procedures for forecasting performance assessment are presented in the literature, most studies are limited to assessing performance of the entire evaluation period, while some further detail at how performance varies with month or season [16,[29][30][31], but comparatively few studies provide details at the type-of-day [15] level or for specific values or ranges for  [14,32].This analysis is of interest, because as Figure 1 shows, the same method can yield very different results as a function of the variability of the location under analysis.Often, the type-ofday categories are defined subjectively, and [33] showed that  alone is not a sufficient metric to characterize day type, proving that these approaches are not sufficient to accurately characterize performance under specific atmospheric dynamics.Further, studies commonly use training-evaluation-testing datasets that, in sum, encompass all of their available data, but rarely evaluate how data aggregation affects forecasting performance, and the testing datasets are frequently composed of few years or even fractions of a year.
It is therefore necessary to develop an assessment framework that addresses these shortcomings.
Finally, academic work in this area has mostly been oriented towards developing techniques to increase forecast accuracy.However, recent studies have shown that this approach is insufficient, as there are several critical issues pertaining to implementation and performance assessment that have not been standardized or that have not reached scientific consensus.For the operational aspect of forecasting, the method should be able to be integrated seamlessly in the decision-making process regarding energy dispatch.Yang et al. [34] point to key issues related with database requirements, computational time required to obtain useful results, time difference between forecasted energy and the anticipation on which the forecast must be provided (lead time) and data processing to comply with temporal resolution requirements.

Present Work and Scientific Contributions
From these arguments, the authors have decided to develop a novel statistical solar forecasting method and a comprehensive assessment methodology, aimed at addressing the issues detected in the literature review.Namely, the novelty and contributions of this work are summarized as follows: 1.A novel solar radiation forecasting method was developed based on pattern identification and classification, probability and heuristic methodology, considering operational needs of decision-making parties.The heuristic method developed for this application presents an intuitive, explainable, interpretable and effective way to forecast solar irradiance, by relying on concepts of probability, possibility and human reasoning, overcoming the limitation of complex mathematical abstraction and black-box characteristics of advanced state-of-the-art statistical and artificial intelligence methods.2. A generalized explicit irradiance pattern classification scheme was employed for performance assessment and forecasting, by classifying irradiance patterns through an analytical expression that yields similar results to clustering techniques, with the advantage of easy implementation across studies.3. A comprehensive performance assessment framework was developed to analyze not only how forecasting performance changes as a function of forecast horizon and lead time, but to evaluate the effect of data aggregation into the knowledge base has on forecasting skill, how quality control and/or data gaps affect performance assessment and how the forecaster performs under different objectively defined day types.
This work is organized as follows: Section 1 deals with the introduction and a brief state-of-the-art description in the issues relevant and under study in solar radiation forecasting.Section 2 presents the data sources and explains the forecasting methodology and Section 3 deals with performance assessment methodology.Finally, the fourth section presents the results and discussion, and Section 5 presents the conclusions.

Data Sources and Methodology
Data corresponds to seven measurement stations throughout Chile (Arica, La Tirana, Crucero, Diego de Almagro, Carrera Pinto, Santiago and Curicó), from 2012 to 2017 with one-minute temporal resolution, integrated to obtain hourly data.Measurement station location, data availability and quality control scheme are detailed in [33,35].The ESRA clear sky model [36] was used due to its simplicity and accuracy, showing good agreement with ground measurements [35].
While conventional statistical methods have been extensively employed for solar power forecasting, as described in the introductory section, these models are more suited to accurately reproduce the dynamics of a system whose nature/parameters do not change significantly.When this is not the case it is necessary to repeat the fitting procedure to match the new dynamics.This poses not only a complexity to maintain an updated model, but also because very different dynamic patterns can emerge even within the same range of static characteristics (i.e., for the same mean  for a given period, wildly different irradiance patterns can occur), a statement supported by [19,37].
Figure 2 illustrates this conundrum.The top-left subfigure displays the probability a particular one-minute value for  occurs for 20 possible categories with similar hourly  and the top-right subfigure presents the one-minute time series of  values for a representative time series of the corresponding category.By applying the Box-Jenkins methodology to identify which model structure would be the most appropriate, very different ACF and PACF patterns emerge for each time series that represents the different  patterns (shown in the bottom subfigures).Therefore, different structures would need to be developed for each pattern (with additional uncertainty, as different patterns can exist within the same  class) and an additional effort would need to be made to correctly identify transitions across patterns.Ultimately, no one-model-fits-all approach can be employed and similar results are obtained for hourly time series.Based on the previously discussed limitations of conventional statistical techniques, and the more modern artificial intelligence/deep learning approaches, and considering that the short-term variability of the solar resource is inherently stochastic in its origin, a probability-oriented approach, through Markov chains, is suited to represent these phenomena as the probability distribution to transition from a defined origin state to a destination state in the future, or simply, a chain of connected states [6].In this way, the problem is first shifted to define the characteristics of such states, then to quantify their transitions and finally to resolve a point value within the destination state.
Examples of this approach are presented in Bhardwaj et al. [38], where Markov chains were used to model transitions for atmospheric variables (temperature, humidity, sunshine hours, atmospheric pressure and wind speed) and a generalized fuzzy model was developed to take these variables as inputs to produce solar irradiance as the output.Sanjari and Gooi [39] produced probabilistic forecasts for photovoltaic plants as a function of irradiance, temperature and the plant's previous production, using a Markov model.Clusters for different patterns for inputs were constructed based on data similarity using either a shape similarity approach [38] or a pattern discovery method [39].While these approaches proved successful, the method is highly dependent on the characteristics of the data, as the resulting clusters are defined based on mathematical abstractions and no straightforward relations allow for simple understanding of the data characteristics and cluster boundaries, but more importantly, the interoperability or how the resulting construct can be applied to a different context, a shortcoming shared by [13][14][15] and already expressed in the introductory section.
However, in Castillejo-Cuberos and Escobar [33] it was determined that the overall specific static and dynamic characteristics of solar irradiance for a determined time period, or state, are summarized by the clearness index ( ), the diffuse fraction (K) and the variability of the solar resource.By defining a variable termed the solar resource quality score (QS), irradiance patterns can be classified in an explicit scheme comparable to clustering techniques, but providing a straightforward approach and a simple analytical expression on which to perform the classification, easily translatable across studies, overcoming the issue of data specificity previously discussed.The only variables needed are the global horizontal irradiance (GHI), diffuse horizontal irradiance (DHI) and extraterrestrial irradiance ( ).
The QS is a measure of how close a particular state is from the ideal of full atmospheric clearness ( = 1), no attenuation (K = 0) and constant irradiance (variability = 0).To quantify variability, the variability score (VS ) defined by Lave et al. [40] was used, defined as the probability that a particular ramp rate (   ⁄ ) exceeds a given threshold ramp ( 1000 / ) during an evaluation period.The quality score uses the normalized  or  , which is the result of linearly mapping the VS from the range 0.002-0.4054to 0-1 [33].

𝐾
(4) In [33], solar resource patterns were classified in 20 classes (or states) from overcast to cloudless (Figure 3).From the time series of hourly  , K and  values, an hourly time series of these discrete states can be constructed after applying this classification scheme.Then, to model the probability that a particular state could develop into another at a future time, transition matrices are constructed from these data.Since the actual physical phenomenon (irradiance time series) would now be represented by a continuous series of discrete states (irradiance classes) then, for each time instance (hour), the following state would only be a function of the previous one, for which assuming a perfect forecast, it would amount to a one-hour forecast horizon.As the lead time is a construct that results from the operational need to know future states with enough anticipation to prepare and enact appropriate measurements, it is not part of the physical reality of the phenomenon to be modeled and its value is determined by the specific need of the end user of the forecast.
Thus, given these two temporal parameters, there are infinite combinations of forecast horizons and lead times on which the transitions could be modeled.Therefore, striving to reproduce the atmospheric dynamics that govern solar resource patterns in a simple yet general way and based on the consideration that each state only depends on the previous one, the probability of transition is modeled corresponding to a forecast horizon of 1 h and no lead time.
Then, the probability of transition is modeled based on historical values by means of transition matrixes, in which for each hour, the preceding state (i.e., irradiance class) is compared against the current one, and the corresponding transition is added to a totalization matrix.Once this process is completed, the counts are normalized across origin states as to obtain the transition probability distribution from each origin state to each end state.This process is presented in Figure 4 where  , represents the probability to transition from "i" origin state towards "j" destination state.However, this procedure does not consider the relationship of irradiance patterns with solar geometry and seasonality, as the same solar elevation (α) can correspond to very different times of day across stations due to the effect of declination.To capture these seasonality effects (intradaily and intermonthly), transition matrices are developed for nine cases: for α, (in increments of 30°) and hour angle ( ) according to sunrise and sunset angles (  and  , respectively) for the following sets: (i)  0.5  , (ii) 0.5  0.5  0.5  and (iii) 0.5   .
Once the irradiance class origin/destination probability distribution is known, what follows is to determine the corresponding deterministic forecast.As the  and K distributions overlap for more than one irradiance class (Figure 3), the complexity of the reversion process increases.To deal with this complexity, while maintaining human interpretability and experience into the process, a fuzzy logic approach was employed, as in [38].
Fuzzy logic, proposed by Zadeh [41], is a framework of multivariate logic that aims at translating human reasoning of imperfect logic with uncertainty.Instead of dealing with all true or false affirmations of binary logic, fuzzy logic evaluates how "accurate" is a particular affirmation by considering the membership degree of an input to a linguistic affirmation that is translated into a fuzzy rule.In [42][43][44][45] it is established that fuzzy logic is compatible with, although distinct to, probability theory in a framework of possibility, the main difference being that probability is concerned with "what will happen" while possibility refers to "what can happen".This compatibility, the ability to approach the reasoning process from an inherently uncertain point of view, and the fact that a possibility framework is compatible with discrimination against scenarios that are physically impossible (i.e., a  = 1 with K = 1), makes the fuzzy logic approach suitable for this work.
Barragán [46] explains in detail the mathematical foundation of fuzzy logic in a comprehensive manner and the reader is referred to this work should a high level of detail be needed; however, Figure 5 presents a summary of the procedure.First, numeric values for variables are compared against affirmations or rules (membership functions) to determine their degree of membership to a particular affirmation, resulting in a fuzzy number representing this membership.In the second stage, all affirmations are aggregated using fuzzy logic set operators (such as addition, product, intersection or negation) to establish the current state of affairs in the system under analysis.Then, using a series of rules that constitute the knowledge base of the fuzzy logic system and have been defined either by previous experience and/or human expert knowledge, an inference is made regarding the most appropriate or likely outcome.Finally, since all this process is performed using fuzzy numbers, it is necessary to translate the results back to a crisp numerical value, in a process called de-fuzzification.The concept of set membership is taken from fuzzy logic to assess the membership degree of a Kt-K-QS value to a certain irradiance class and Figure 6 presents the workflow for each variable, using QS as example.First, the variable is evaluated using Gaussian membership functions defined for each of the 20 defined classes presented in Figure 3 to determine the most similar class or origin state (Figure 6a).Then, the transition probability distribution corresponding to the origin state (Figure 6c) is extracted from the general transition matrix (Figure 6b).This distribution represents the probability of transition from the current origin state to the destination (or forecasted) state.Equation ( 8) presents the Gaussian membership function in which x refers to the variable to be tested,  and  represent the mean and standard deviation of the set, respectively.

𝐺𝑀𝐹 𝑒𝑥𝑝 * (8)
Since each destination state has a range of values defined by the corresponding mean and standard deviation of its irradiance class (Figure 6d) and there is a definite probability of occurrence for each state, the overall likelihood of a particular value being realized depends upon a combination of these two facts.This relationship can be expressed through the product operator, and therefore the fuzzy "likelihood" variable is defined as the product of the probability of occurrence of a state and the degree of membership for a variable's values for each state (Figure 6e).Since there are still a multitude of destination states in which the forecasted variable could be realized between one or another, the algebraic sum of the fuzzy sets corresponding to each origin state results in the overall likelihood function for that variable (Figure 6f).Equation (7) shows the relationship between these variables; therefore, a Kt-K-QS space exists for a given  (Figure 7a) and the forecast output must belong to it.Since  is used to describe the intrahourly variability, has a wider distribution of values and has the least effect in QS compared to  and K [33], for this work it has been assumed that the intrahourly variability is preserved across hours.With this  value, the Kt-K-QS space is constructed (Figure 7b) and using the previously determined QS likelihood function (Figure 7c) the Kt-K-QS space can be mapped into the QS likelihood function in the Kt-K space (Figure 7d).
However, certain Kt-K combinations violate physical irradiance limits, for which no forecast can be produced.Hence, there exists a binary plausibility space (Figure 7e) representing this (1 for plausible and 0 for implausible values), with  and K limits being derived from the irradiance limits for GHI, DHI and direct normal irradiance (DNI) established by [47,48] calculated at the forecast's corresponding  and  .The last space to consider is constructed from the likelihood functions of  and K (Figure 7f).Finally, the product operator is used to aggregate the QS likelihood space, the combined  -K likelihood and the plausibility space, to obtain the overall combined likelihood in the  -K-QS space, representing the overall likelihood for the combination of variables.To obtain the crisp numerical  , K and QS values, a centroid defuzzification has been chosen, as this criterion considers the information of all the space and not just the maximum value.Once the  and K forecasts have been obtained, the corresponding irradiances are calculated using Equations ( 4), ( 5) and (9).Since the application of interest is power forecasting for photovoltaic plants, the tilted plane irradiances are calculated for fixed (at latitude), one and two-axis tracking surfaces, using the HDKR transposition model [49], with a ground surface albedo of 0.2.

𝐷𝑁𝐼
sin For performance assessment, the proposed method is compared against smart persistence (persistence henceforth) as it is the current reference method.To reduce forecast uncertainty due to the clear sky model, especially in cases of marked changes in Linke turbidity during nonclear conditions (as demonstrated in [35]),  will be used instead of  .As stated in Section 1, persistence forecasting as consensus baseline method has transitioned from naïve persistence (i.e., irradiance remains constant, Equation ( 2)) [23][24][25] towards smart persistence (i.e., clear sky index remains constant, Equation ( 3)) [25,26].However, the estimation of the clear sky irradiance is subject to astronomical uncertainties (solar geometry modeling, Earth-Sun distance modeling and solar constant estimation uncertainty), atmospheric properties uncertainty (such as aerosol optical depth and integrated water vapor column estimation, among others, with specific effects depending on the clear sky model being used) and the intrinsic clear sky modeling error (which is model dependent).Even under perfect circumstances in which these parameters are estimated with low uncertainty on clear days, the clear sky modeling error can amount to ~2-3% for GHI and ~8% for DNI [50] and [51] further elaborates on the sensitivities of clear sky models to uncertainty in their inputs.Finally, in [35] it was shown than under practical applications, it is difficult to maintain accurate and updated estimates of atmospheric properties used for clear sky modeling, that can result in significant estimation errors.
Therefore, in order to reduce forecast uncertainty due to clear sky modeling, the clearness index ( ) will be used instead of the clear sky clearness index ( ) for GHI persistence forecasts, as the former is much less subject to uncertainty than the latter, as it is only affected by the astronomical uncertainties.Likewise, the corresponding DHI forecast stems from this GHI forecast and assuming persistency of the diffuse fraction.For consistent evaluation, the persistence forecasts will be subject to the physical limit constraints previously described.
Data are grouped into two sets: a training set consisting of all but the last year and the evaluation set consisting of the last year.Data for the seven measurement stations are considered, however, to assess the performance of the proposed method, the Santiago station is the one to be analyzed.First, it will be assumed that no previous data for this station exist and then, gradually, data from each additional year are considered for the training set, to simulate data availability once the forecasting scheme is put into operation.

Assessment Methodology
The proposed method's performance is assessed through a set of commonly used error metrics featured in the literature, as reported by [21,34,52], namely, the mean bias error (MBE), the root mean square error (RMSE), the normalized root mean square error (NRMSE), the mean absolute percent error (MAPE), the time series correlation () and the skill score (SS).

𝑀𝐵𝐸 ∑ 𝑒 𝑛
Since module capacities are reported at standard testing conditions (STC, 1000 W/m 2 ≅ 25 °C) and plants vary in capacity, it is reasonable to estimate electric power production as the product of the nominal production capacity by the irradiance on the module's surface and reference the error metric to the nominal plant capacity  , following [32].Therefore, the errors are calculated as follows:  for irradiance, while  for power production at tilted plane cases.Only valid data points that passed quality control are considered for error calculation.

Data Aggregation Effect on Forecasting Performance
The reference case, Santiago 2012 with one-hour forecast horizon and no lead time, proves the proposed forecasting methodology generality by resulting in positive SS, despite not having previous information of this location's dynamics.As additional data become available, performance increases for the first year of new data and continues to improve with each passing year.Forecasting performance for GHI is closely tied to  , while for DHI is more complex, as it compounds the forecasting error of K with  , therefore results are larger than GHI alone.Since DNI is calculated from GHI, the good GHI forecasting performance is carried over to DNI.
As tilted plane irradiance can be considered a weighted sum of the three components, with variable weights according to solar geometry, individual component forecast errors can be somewhat balanced by the others, resulting in a more uniform performance (Figure 8,right).Two-axis tracking forecasting performance is highest among tilted planes, as it depends more upon DNI due to the negligible angle of incidence, whereas fixed plane and one axis tracking have performance close to  /K levels.The method tends to perform worst around sunrise due to changes in atmospheric dynamics occurring during the night, potentially resulting in considerable forecasting errors for the sunrise forecast window.Figure 9 presents the actual and forecasted time series for  , K and irradiances on one completely clear day, one clear with a slightly higher K and variability, and the third one is very variable, having overcast, intermediate and clear periods.For the first day, the forecast was accurate with only DNI overestimation towards sunset due to a combination of overestimation of  and underestimation of K.
This propagated into the tilted irradiance forecast, especially for the two-axis tracking case due to its higher dependence on DNI.The forecast for the second day was accurate as well.
The third day showcases the worst performing situation: when the conditions of the first hour to be forecasted on a day are significantly different than the last forecasted hour of the previous day.The last forecast corresponded to clear sky, which due to unavailability of data in the night hours suggests clear skies during sunrise.During the night the situation changed into an overcast morning resulting in large forecasting error.As daytime data become available, the forecast is rapidly corrected, reducing error.Once the effect of data aggregation on forecasting performance has been studied, the effect of forecast horizon and forecast lead time can be studied.For this analysis, the training data will consider up to Santiago 2016 data and evaluate performance for 2017 at up to 4 hours' horizon with up to 2 h lead time.

Effect of Forecast Horizon and Lead Time in Forecasting Performance
Despite the knowledge base being based on historic hour-to-hour transitions with no lead time, the results show the proposed forecasting methodology can be extended for longer time horizons and lead times.From Figure 10, the proposed method showed resistance to performance degradation as the forecasting horizon and lead time increases, with K being the only exception to this behavior.For no lead time cases, the proposed method produces useful forecasts for up to a two-hour forecast horizon while its overall skill is best for the 3 h forecast horizon with a lead time of one hour.To analyze the detailed performance metrics for each case, Taylor diagrams are chosen.They allow a more detailed, independent analysis for each case, as they summarize several metrics in a single chart based on the relationship presented in Equation ( 18) [53].In this diagram, forecast performance is presented as distance to the observations (reference) set, which is perfectly correlated and has no error with itself ( ,  = 1, RMSE = 0).The farther away from the reference a case is, the performance is worst.For the sake of clarity, only cases of 0-1 h lead times are presented.
Figure 11 presents results for  and K forecasts.Data points for persistence tend to follow the corresponding standard deviation isoline, due to the persistence forecast being a shifted version of the original time series with increasing decimation as forecast horizon and/or lead times extends.The resulting time series tends to keep most of the distribution characteristics of the original; however, correlation lowers and therefore RMSE increases, with the lead time having the most significant effect.This chart evidences the good performance in forecasting skill for the proposed method, as the increase in the RMSE is lower than persistence and shows better correlation with the original time series, as it is not subject to the decimation as is the persistence forecast.From Figure 12, GHI presents excellent correlation, whereas for DHI and DNI it is considerably lower.This can be explained by: (a) for GHI, forecasting error in DNI and DHI can be somewhat compensated through the closure equation, therefore, the relative differences are not large; (b) DHI has compounding effects from the extraterrestrial irradiance,  and K forecasting; (c) DNI is a derived parameter from GHI and DHI, weighted by the nonlinear term of sin  ; (d) since DNI fluctuates according to cloud coverage pattern and optical thickness, strong variations can be present at intra-and interhourly levels, making its forecast difficult.In all cases, the proposed method has a lower increase in RMSE than persistence.For tilted irradiance, the proposed method tends to be closer to the standard deviation isoline of the irradiance reference and consistently shows higher correlations and lower RMSEs than their corresponding persistence forecasts.

Forecast Performance Assessment as a Function of Day Type
The results presented so far corresponded to whole one-year time series of the evaluation dataset, as is often presented in literature.Nevertheless, an important aspect to analyze forecasting performance under different atmospheric conditions.While authors have previously analyzed performance according to subjective discrete day classifications [15] or as a function of specific daily  values [14,32], the classification presented in [33] proves useful to analyze performance characteristics for deterministically defined typical day types with well-defined characteristics, independently of solar geometry and seasonal effects.Only valid data points were considered for error calculation.Figure 8 shows that when invalid days or gaps have seasonal character, this can impact the results.Performance for 2016 suddenly and significantly decreased when compared to the rest while Figure 13 shows the majority of invalid data points for this year occurred around the summer months, when the skies are clearer and the method's performance is best.For   and K forecasts (Figure 14), results show poor performance for overcast conditions (day class ≤ 4), but for variable days onwards (day class ≥ 6 for   and day class ≥ 8 for K) mean performance is consistently and significantly better than persistence, except for the clearest day class.Despite the fact that the method is clearly underperforming in a determined set of conditions, this is comparatively rare when compared to the range of meteorological conditions under analysis, as the day type distribution shows that only ~10% of days are category 4 or lower and mean day class forecasting skills are positive for over 75% of days.By recalling results for 2017 on Figure 8, the forecasting skill for  as 0.12 while for K was 0.08; however, Figure 14 shows mean day class forecasting skills on very different ranges compared to the yearly value, which exceed it for ~60% of days for  and K and reaching extreme values of [−0.6,0.8] and [−36,0.8],respectively.Therefore, the proposed analysis framework shows its value by allowing a deeper look into the forecasting performance for different atmospheric conditions.Figure 15 shows similar trends for GHI and  , with very low skill for DHI, albeit with significant spread for each day class due to the interaction between  and K for DHI forecasting.DNI presents a significant spread at lower day classes, since DNI can be zero at overcast hours and different to zero for others, which can occur at given moments of cloudy or variable days, and this would substantially increase RMSE.However, DNI mean day class forecasting skill is consistently positive for day classes 8 onwards, which represent the vast majority of days.However, given that the proposed method's GHI and DHI performance improves from day classes 7 onwards (≥75% of days), the forecasting skill for DNI also improves to values greater than the average 0.13 for the year.For tilted irradiances results are similar, albeit the proposed method has consistent positive skill for day classes 8 onwards (~70% of days).Due to the compounding effect of horizontally referenced irradiance forecasting and the nonlinearity of the transposition model, two observations can be made: the spread for a given day class can be significant, but the overall patterns for the different tilted planes under analysis are similar.Finally, for a finer presentation of the individual error metrics that increase the level of detail for this analysis, the reader is referred to the Appendix A.

Conclusions
In this work, a solar energy forecasting method was developed that estimates  and K, the corresponding horizontally referenced irradiances on tilted, tracking surfaces.This method is based on the premise that solar irradiance patterns can be classified according to their static and dynamic characteristics through the metric termed quality score (QS) and that for any given state, there is a probability distribution function to transition to another future state.These transitions can be accurately represented by transition matrices according to solar geometry.Since each irradiance class (or state) is defined by  , K and  , this representation proves to be independent of yearly and time-of-day seasonality, while allowing for the flexibility to capture a new location's particular dynamics once the knowledge base is updated with fresh data for that site.Forecasting performance was assessed for 6 years in a rolling manner, for a location on which initially no data were available, to evaluate how performance changes as new local data is added to the knowledge base, simulating its operational deployment.The assessment framework is thorough, exceeding the conventional one-year assessment period of most studies and exploring not only the one-year performance, but also on a typical day case basis.
Given the marked inertias and start/stop times of hydro/thermal generators, it is necessary to provide system operators with forecast information such that it is operationally feasible to consider it in the dispatch problem to improve the solution against a no-forecast case, defining this anticipation as the forecast lead time.If this task cannot be achieved, then regardless of the forecast's quality, it is for all practical purposes useless because it is disconnected from the application it should serve: to positively influence a decision.That is why in this work the effect of the forecast lead time in the forecast performance has been analyzed, using 0-2 h for lead time, showing that this parameter has a significant effect in forecasting performance, giving additional value to the results presented in this manuscript.
The proposed method performed competitively against other methods in the literature and consistently provided for superior performance against the reference smart persistence due to maintaining a higher correlation with the actual data time series and by being more resistant to an increase in NRMSE.The most challenging condition for the proposed method relates with changes in atmospheric conditions between sunset and sunrise that cannot be foreseen since solar data are unavailable, in which the first forecast window for a new day can exhibit significant error as it is based on the last known atmospheric state that differs greatly from the current.Nevertheless, once new data are available for the next forecast window, the method is quick to adapt and even on highly variable days it can offer superior performance compared to persistence.
A significant contribution of this study is the assessment methodology, allowing for a better understanding of the characteristics of forecasting performance in the context of the local characteristics of solar irradiance.Data availability has a definite effect in the conventional one-year assessments when data were unavailable/flawed in seasons in which the method consistently over-or under-performs the yearly mean.Additionally, for certain day types the error would consistently be much lower or higher than persistence and the overall yearly performance is determined by this distribution along with that of day types during the year, which depends upon the location.Finally, this analysis allows for identifying specific conditions where the method performed best or worst and therefore, providing information for future work concerning where improvements can be made to enhance its performance.classes, with RMSE under 25% in the worst conditions, with similar MAPE values, although for the vast majority of days (and the yearly mean) it is close to 10%.Results for tilted irradiance are presented for one-axis tracking only due to similarity to other tracking planes, as to avoid redundancy.

Figure 1 .
Figure 1.Some selected results from the literature showcasing typical performance cases for solar radiation forecasting, based on [27,28].

Figure 2 .
Figure 2. Application of Box-Jenkins methodology to different irradiance patterns from overcast to clear sky.One-minute clear sky irradiance hourly patterns (top left), with most representative time series of each set (top right).Lower subfigures present autocorrelation (bottom left) and partial autocorrelation functions (bottom right) for each representative time series.Color bar in top-left subfigure represents probability of occurrence.

Figure 3 .
Figure 3.Typical one-minute clear sky clearness index time series for a representative hour for each irradiance class, with its corresponding  and K value ranges, based on [33].

Figure 4 .
Figure 4. Workflow to construct transition probability matrixes after irradiance pattern classification has been performed.

Figure 5 .
Figure 5. Summary of fuzzy inference system workflow.

Figure 6 .
Figure 6.Workflow to determine the likelihood function for QS on a test case with origin state:  = 0.667, K = 0.333 and QS = 0.667.Membership function for QS (a), state transition probability matrix (b), future state probability of occurrence (c), QS ranges for each destination state (d), probability-

Figure 7 .
Figure 7. Workflow for the inference and de-fuzzification stages.Case with origin state:  = 0.667, K = 0.333 and QS = 0.667.QS in the Kt-K space at the likely  value (a), QS likelihood function (b), QS likelihood mapped to the Kt-K space (c), plausibility space (d), combined Kt-K likelihood (e), overall likelihood function for a Kt-K-QS triad, with the forecasted value at the centroid (f).

Figure 8 .
Figure 8.Effect of data aggregation to the knowledge base on forecasting skill.

Figure 9 .
Figure 9. Example forecasting time series for the proposed method, showcasing completely clear (first), clear (second) and highly variable (third) days in the Santiago 2017 dataset: clearness index and diffuse fraction (a), horizontally referenced irradiance (b) and tilted plane irradiance (c).

Figure 10 .
Figure 10.Forecasting skill performance for different forecast horizons and lead times (FLT), time units in hours: clearness index and diffuse fraction in the first row, horizontally referenced irradiance middle row and tilted plane irradiance bottom row.

Figure 11 .
Figure 11.Taylor diagrams for Kt (left) and K (right) forecasting.Hours in the chart indicate forecast horizon.Standard deviation and RMSD are in percent.

Figure 12 .
Figure 12.Taylor diagrams for irradiance forecasting.Hours in chart indicate forecast horizon, standard deviation and RMSD are in percent for tilted irradiance and in W/m 2 for horizontally referenced irradiance.Mean GHI = 502.62W/m 2 Mean DNI = 503.74W/m 2 and Mean DHI = 148.35W/m 2 .

Figure 13 .
Figure 13.Valid (yellow) and invalid (blue) days for each available year of data for Santiago.

Figure 14 .
Figure 14.Skill score distributions for  and K as a function of day class.Dashed line represents yearly performance.Cumulative distribution of day types presented in secondary axis.

Figure 15 .
Figure 15.Skill score distributions for irradiance as a function of day class.Dashed line represents yearly performance.Cumulative distribution of day types presented in secondary axis.

Figure A2 .
Figure A2.Error metrics for GHI (top row), DHI (middle row) and DNI (bottom row) as a function of day class.Dashed line represents yearly performance.Cumulative distribution of day types presented in secondary axis.

Figure A3 .
Figure A3.MAPE for irradiance forecasting as a function of day class.Cumulative distribution of day types presented in secondary axis.