Hourly Solar Radiation Forecasting Using a Volterra-Least Squares Support Vector Machine Model Combined with Signal Decomposition

Abstract: Accurate solar forecasting facilitates the integration of solar generation into the grid by reducing the integration and operational costs associated with solar intermittencies. A novel solar radiation forecasting method was proposed in this paper, which uses two kinds of adaptive single decomposition algorithm, namely, empirical mode decomposition (EMD) and local mean decomposition (LMD), to decompose the strong non-stationary solar radiation sequence into a set of simpler components. The least squares support vector machine (LSSVM) and the Volterra model were employed to build forecasting sub-models for high-frequency components and low-frequency components, respectively, and the sub-forecasting results of each component were superimposed to obtain the final forecast results. The historical solar radiation data collected on Golden (CO, USA), in 2014 were used to evaluate the accuracy of the proposed model and its comparison with that of the ARIMA, the persistent model. The comparison demonstrated that the superior performance of the proposed hybrid method.


Introduction
Solar radiation is the most important factor affecting photovoltaic power generation [1].Accurate solar forecasting facilitates the integration of solar generation into the grid by reducing the integration and operational costs associated with solar intermittencies [2].Nowadays, physicallybased forecasting is mainly conducted using numerical weather prediction methods such as the well-known numerical weather prediction (NWP), which provides information up to several days ahead; however, it has obvious biases and random errors [3].It is therefore very important to develop effective solar radiation forecasting methods, especially ones that can use less measurements.Solar radiation sequences can be treated as a time series produced by random processes; therefore, mathematical models can be used to fit the underlying processes and forecast the future values [4,5].Many modeling methods have been used to describe solar radiation sequences, including regression methods such as linear regression [4], autoregressive model (AR) [6], autoregressive moving average (ARMA) [7], multi-dimensional linear prediction filters [8], least absolute shrinkage and selection operator (LASSO) [9], and nonlinear model approximators such as artificial neural network (ANN) [10], adaptive neuro-fuzzy inference system [11], hidden Markov model [12], fuzzy logic [13] and Angstrom-Prescott equations [14].Traditionally, these techniques were used independently by researchers to build forecasting models for solar radiation.However, solar radiation is a strong random process, which makes a single modeling algorithm unable to guarantee the accuracy and robustness of the forecasting model.Therefore, hybrid models were used to maximize the advantages The time series of hourly Global Horizontal Irradiance (GHI, which means hourly average value), collected on Golden, CO, USA (39.7 • N, 105 • W, 1829 m AMSL) in 2014, were used in this study.These GHI data were acquired by using a high-precision CM22 pyranometer instrument on a plane surface, and can be downloaded from the website of National Renewable Energy Laboratory (NREL, URL link: http://www.nrel.gov/midc/srrl_bms/).
Colorado is a mountain state with a complex climate.Table 1 presents the information about temperatures and relative humidity (RH) of the observation site in 2014.It can be seen that the observed environmental volatility is strong.The strong environmental volatility will result in strong nonstationary of solar radiation sequence, and ultimately lead to difficulty in establishing an accurate forecasting model.Therefore, the solar radiation sequence of Colorado in 2014 was used to verify the performance of the proposed method.
The series of hourly Global Horizontal Irradiance (GHI) values used in this study is shown in Figure 1, which contains a total of 8760 sampling data, and each sampling data represents an average value of hourly GHI, and there are 24 sample points per day.From Figure 1, it can be found that the seasonal factor has a great impact on the GHI.In order to weaken the influence of seasonal factor on the forecasting performance of the model, a total of 8760 sampling data were divided into 4 datasets (i.e., 1 January to 31 March, 1 April to 30 June, 1 July to 30 September, 1 October to 31 December), according to their sampling date.
The forecasting model was established for each dataset.Considering the adaptive signal decomposition algorithms used in this study are based on the prior condition of the time-continuity of the original signal, in order to satisfy this condition, the first 70% sampling data of each dataset were used as training set, and the last 30% sampling data were used as testing set.When the number of samples is sufficient, this sample partitioning method can cover all possible weather situations.

Signal Decomposition
The original solar radiation series has strong nonlinear and nonstationary characteristics.Existing modeling methods have limited ability to describe this kind of signal.Hence, the adaptive signal decomposition algorithms were used to decompose original solar radiation into a small number of simpler components (sub-sequences) to improve prediction accuracy.EMD and LMD have been extensively shown as good and popular adaptive methods for processing nonlinear and nonstationary signals.They were widely used in many applications, such as solar radiation forecasting [16], energy load forecasting [22], climate forecasting [23], and time-frequency estimation for electroencephalogram [24].Because the EMD and LMD have different decomposition mechanisms, the sub-sequences obtained by these two methods also have some differences, which are exactly the description of the original signal from different perspectives.Considering the

Signal Decomposition
The original solar radiation series has strong nonlinear and nonstationary characteristics.Existing modeling methods have limited ability to describe this kind of signal.Hence, the adaptive signal decomposition algorithms were used to decompose original solar radiation into a small number of simpler components (sub-sequences) to improve prediction accuracy.EMD and LMD have been extensively shown as good and popular adaptive methods for processing nonlinear and nonstationary signals.They were widely used in many applications, such as solar radiation forecasting [16], energy load forecasting [22], climate forecasting [23], and time-frequency estimation for electroencephalogram [24].Because the EMD and LMD have different decomposition mechanisms, the sub-sequences obtained by these two methods also have some differences, which are exactly the description of the original signal from different perspectives.Considering the complexity of the original solar radiation sequence, the EMD and LMD methods were used for analysis and comparison simultaneously in this study to realize the multi-angle and comprehensive description for original solar radiation sequence.

Empirical Mode Decomposition
The basic idea of EMD is to decompose a complicated signal into a finite and a small number of oscillatory modes based on a local characteristic time scale.Each oscillatory mode is expressed by an intrinsic mode function (IMF), which has to satisfy the following two conditions: (1) In the whole dataset, the number of extrema and the number of zero-crossings must either be equal or differ at most by one; (2) At any point, the mean between the upper and lower envelopes must be zero.Let {I(t), t = 1, 2, • • • , N} be a given solar radiation sequence, the computational steps of the EMD are described as follows: Step 1. Identify all the local extrema of I(t) and connect all the local maxima and minima using a cubic spline to generate an upper envelope u(t) and a lower envelope v(t).
Step 2. Calculate the mean envelop m(t) by averaging the upper envelope u(t) and the lower envelope v(t), and then subtract it from I(t) to obtain a detailed component d(t) = I(t) − m(t).
Step 3. Check whether d(t) is an IMF.If d(t) is an IMF, then set c(t) = c(t) and meanwhile replace I(t) with the residual r(t) = I(t) − d(t).Otherwise, replace I(t) with d(t)and repeat Steps 1-2, until the following termination criterion is satisfied: where j denotes the number of iterative calculations.The threshold value δ was set to 0.25 in this study.
Step 4. Repeat Steps 1-3, until all the IMFs and residual are obtained.Finally, the original time series I(t) can be decomposed as follows: where c i (t) and r(t) represent different IMFs and final residual, respectively.Here, the number of IMFs is adaptively determined by the algorithm.For complete procedures, please refer to the references [25,26].

Local Mean Decomposition
LMD was originally developed for decomposing the modulated signals into a small set of product functions (PFs), each of which is the product of an amplitude envelope signal and a frequency modulated signal.LMD scheme involves progressively separating a frequency modulated signal from an amplitude envelope signal in essence.For a given solar radiation sequence I(t), the procedure of LMD can be simply described as follows: Step 1. Identify the local extrema n i from I(t), and calculate the mean value m i and magnitude a i of the two successive extrema n i and n i+1 using the following description: Step 2. Smooth local means and local magnitudes using moving averaging respectively, to form a smoothly varying continuous local mean function m(t) and the envelope function a(t), and calculate the assessment function s(t) as follows: Energies 2018, 11, 68 5 of 21 Step 3. Replace I(t) with s(t) and repeat Steps 1-2, until the envelope function a(t) equals 1, and finally the product function (PF) is obtained: where j denotes the number of iterative calculation, a j (t) is the envelope function obtained in the j-th iterative process, and s n (t) is the final assessment function.
Step 4. Replace the original signal I(t) with the residual r(t) = I(t) − PF(t), repeat Steps 1-3, until the residual r(t) is a constant or monotonic.Finally, the original time series I(t) can be decomposed as follows: where PF i (t) and r(t) represent different PFs and final residual, respectively.Like the EMD algorithm, the number of PFs is also adaptively determined by the algorithm.Interested readers may refer to [27] for details.

False Nearest Neighbor (FNN) Algorithm
After decomposition, each component obtained by the EMD or the LMD can be treated as a time series produced by the dynamical system.According to state space equation, for a given time series {x(t), t = 1, .., k, . . ., j, . . .N}, produced by the dynamical system, their relationship can be described as the following equation: where F(•) is a nonlinear function reflecting the relationship between 'input' vectors x(t−1)x(t), x(t − t1), . . . . .., x(t − m + 1) and 'output' values x(t + 1), and m is an embedding dimension reflecting time-delay characters of the dynamical system.The objective of time series prediction is to obtain the function F(•) based on the given time series {x(t), t = 1, .., k, . . ., j, . . .N}, using various modeling methods.When the function F(•) (also called forecasting model) is established, the next value x(t + 1) can be predicted using the m-dimension vector {x(t), x(t − 1), . . ., x(t − m + 1)}.However, due to the lack of prior knowledge of the dynamic system, the proper embedding dimension of the dynamic system is unknown.Therefore, determining the embedding dimension (also called input dimension of the forecasting model) must be done before developing the forecasting model.The false nearest neighbor algorithm (FNN) [28] was adopted to confirm the optimal embedding dimension in this study.The FNN algorithm was summarized as follows: Step 1.For a given small embedding dimension m, identify the closest point (in the Euclidean sense) to a given point in the time-delay coordinates.That is, for a given time-delay point X m (k) = [x(k), x(k − 1), . . ., x(k − m + 1)], find the point X m (j) in the data set such that the following distance d is minimized: X m (j) is known as the nearest neighbor to X m (k).
Step 2. Determine if the following expression is true or false: where R is threshold value, and set as 1/15 in this study.If expression ( 9) is true, then the neighbors are true nearest neighbors.If the expression is false, then the neighbors are false nearest neighbors.
Step 3. Continue the algorithm for all points k in the data set.Calculate the percentage of points in the data set which have false nearest neighbors.
Step 4. Continue the algorithm for increasing embedding dimension m until the percentage of false nearest neighbors drops to acceptably small number.In this study, if for a certain m, the percentage of false nearest neighbors is less than 5%, it is accepted as the embedding dimension for the data set.

Prediction Model
After decomposition, selecting an effective modeling method to build the forecasting model for each component is particularly important.The Volterra model and the LSSVM are two powerful and popular modeling tools in nonlinear and nonstationary time series analysis.By integrating linear and nonlinear factors of the nonlinear time series, the Volterra model can automatically track the chaotic trajectory and yield accurate forecasting results for nonlinear time series [29].However, as a polynomial regression model, the Volterra model is difficult to obtain satisfied forecasting results for strong nonlinear time series due to the influence of finite model orders.The support vector machine (SVM) is a machine learning method based on structural risk minimization coupled with kernel mapping, which can achieve the best generalization ability of model by balancing the model complexity and learning ability [30].As an improved version of the SVM, the LSSVM uses equality constraints to replace inequality constraints of the SVM; therefore, it has a faster computing speed while maintaining the generalization ability [31].The LSSVM can obtain better results than that of the Volterra model for strong nonlinear time series forecasting by using a kernel function.Considering each component obtained by the EMD and the LMD has different nonlinear characteristics, high-frequency components have stronger nonlinearity than low-frequency components.LSSVM was thus used to model the high-frequency components while the Volterra was used to model the low-frequency components.Finally, the best forecasting model was achieved by integrating different sub-models.

Least Squares Support Vector Machine
LSSVM is an effective regression method based on the SVM, the quadratic optimization of the LSSVM is solving linear equations by constructing a loss function, which reduces the computational complexity, improves computation speed, and positively influences regression modeling.It has been widely used in time-series prediction [32].
Given X(t) = [x(t), x(t − 1), . . ., x(t − m + 1)] and Y(t) = x(t + 1) are set as the input and output of a nonlinear discrete dynamic system, respectively, the initial space has the following form: where ψ(X(t)) is the nonlinear mapping function, which maps X(t) to the high dimensional feature space.According to the structural risk minimization principle, the optimization problem can be expressed as: where γ is the regularization parameter, and {X i (t), Y(t) i } n i=1 is a given training set, e i (t) is the error variable.When the dimension tends to be infinite, the Lagrange duality problem can be set up to solve the above problem.The solution to this problem is presented by Suykens and Vandewalle [33] as: Energies 2018, 11, 68 where K(X(t), X i (t)) is the kernel function.In this paper, the radial basis function (RBF) was selected as the kernel function: where σ 2 is the kernel function parameter.The regularization parameter γ and the kernel function parameter σ 2 control the model complexity and affect the generalization capability of the LSSVM.In this study, the grid-search algorithm coupled with the 10-fold cross validation was utilized to optimize these two parameters.Relevant reference [34] may help the reader to get a clear idea on how the LSSVM works.

Volterra Model
The Volterra model is a general mathematical model for a nonlinear system that produces a single output from a serial input.The Volterra models can fit any dynamic systems, without necessity to assume the model structure [35,36].Given ] and Y(t) = x(t + 1) are the input and output of a nonlinear dynamic system, respectively.The discrete form of the Volterra model is written as: where h p (m 1 , m 2 , . . ., m p ) is the p order Volterra kernel, which act as the weighting coefficients on the past input to the system.This infinite series expansion is difficult to be realized in practical applications; thus, the form of finite truncated and finite times must be used.The most commonly used form is the following two-order truncation sum: the coefficients of the Volterra model expressed in Equation ( 15) can be solved using the orthogonal decomposition, stepwise multiple regression, and the iterative descent gradient method.In this paper, the least square algorithm was adopted.Complete details of the Volterra may refer to consistent reference [37].

Model Evaluation
In this paper, four statistical parameters, including Root mean square error (RMSE), mean absolute error (MAE), correlation coefficient (R), and forecast skill were used in model evaluation.

Root Mean Square Error
Root mean square error can be obtained using the following equation: where x(t) and x(t) are the forecasted value and true value, respectively, and n is the number of the forecasted points.This parameter provides information on short-term performance of a model.RMSE is always positive, except for 'zero' in the ideal case [38].Small values of RMSE indicate accurate model performance.

Mean Absolute Error
The MAE measures the average magnitude of the errors in a set of forecasts, with regardless of their direction, which represents the accuracy of continuous variables.It means that the MAE is the average over the verification sample of the absolute values of the differences between the forecast and the corresponding observation.All the individual differences are weighted equally on average as the MAE is a linear score, which is calculated as: The ideal value of the MAE is 0 or close to 0 [39].

Correlation Coefficient
The correlation coefficient is used to test the relationship between forecasted and true values [3,40]; this index can be obtained from the following equation: where x(t) is the average of the forecasted values, and x(t) is the average of the true values.
Correlation coefficient measures the degree of linear correlation between two random variables.High absolute values of correlation coefficient indicate strong correlation among variables.The correlation coefficient is stronger when it is close to 1 or −1 and weaker when it is close to 0.

Forecast Skill
To compare the forecast performance to smart persistence forecasts, the forecast skill is introduced [41]: where RMSE p is the RMSE of the persistence model.The forecast skill is used as a comparison with the commonly used persistence forecasting method, which considers the data associated with the closest hours to the forecasting time as the most important data for predicting the variable of interest at a specific hour.Consequently, the ratio RMSE/RMSE p is a measure of improvement over the persistence forecast when the latter is corrected by the clear-sky index (or, in other words, when diurnal variability of the clear sky is taken into consideration).A forecast skill of 1.0 implies an unattainable perfect forecasting, and a forecast skill of 0 indicates an exact performance as the persistence method.
Negative values show lower forecasting accuracies compared to the persistence method.In this paper, the persistence model with clear-sky index was used.The resulting clear-sky persistence model is defined as follows [42]: where Î(t + ∆t) and I(t + ∆t) clr are the persistence forecasting result and clear-sky solar radiation data for the time t + ∆t; k(t + ∆t) is the clear-sky index; I(t) and I(t) clr are the actual solar radiation data and clear-sky radiation data in the time t.Here I(t) clr was calculated based on the Bird model [43].
The input atmospheric parameters of the Bird model are listed in Table 2.

Establishment and Comparison of Models
In this section, several models are respectively analyzed and compared using the measured hourly GHI data from NREL to determine the most accurate solar radiation model.

The Forecasting Results for Hourly GHI Using the LSSVM and the Volterra Models
The forecasting results of the LSSVM, the Volterra model and persistence model for the four datasets based on the original solar radiation data are shown in Table 3.Both the LSSVM and the Volterra model have obtained good correlation coefficients for all the datasets, with values of R over 0.94, which indicates that there is strong correlation between the forecasting values and the measured values.The values of RMSE and forecast skill, obtained by the LSSVM and the Volterra model, were significantly better than the persistence model except for the period of Jul. to Sept. In terms of MAE, the persistence model was better than the LSSVM and the Volterra model for all the datasets, which may be attributed to the LSSVM and the Volterra models are based on the principle of least squares error minimization, while the persistence model is an empirical model.The above results indicate that the LSSVM and the Volterra model have good performance in GHI forecasting.However, the accuracy of models is greatly affected by the nonstationary characteristic of the original GHI sequence.Therefore, in order to improve the accuracy and robustness of the model, it is necessary to decompose the strong original GHI signal into a set of weaken nonstationary single.and 3 show the different IMF and PF components decomposed from the original GHI series using the EMD and the LMD, respectively.As we can see, the original GHI series was decomposed into a set of IMF and PF with different frequency.The frequency of IMF or PF is reduced with the increase of decomposition levels of the original signal.Both high-frequency IMF (PF) and low-frequency IMF (PF) have weaker nonstationarity than the original solar radiation sequence, and they are helpful to develop more accurate forecasting models.After decomposition, the LSSVM and the Volterra models were established for each IMF and PF component.Due to limited space, the forecasting results for each IMF and PF decomposed by two datasets (1 January to 31 March, 1 April to 30 June) are shown in Tables 4 and 5, respectively.The embedding dimensions of m, the values of RMSE and MAE showed the decrease tendency with the increase of decomposition levels, which may be attributed to the fact that the low-frequency components are more stable than the high-frequency components, and the stable time series can more easily obtain high forecasting accuracy.Except for the first components (i.e., IMF1 and PF1), the LSSVM and the Volterra models achieved excellent forecasting results for other components, with correlation coefficients of more than 0.97.The lower correlation coefficients obtained by the first components were due to the fact that the first components have more nonstationary characteristics than other components, which will bring difficulties for developing an accurate forecasting model.After decomposition, the LSSVM and the Volterra models were established for each IMF and PF component.Due to limited space, the forecasting results for each IMF and PF decomposed by two datasets (1 January to 31 March, 1 April to 30 June) are shown in Tables 4 and 5, respectively.The embedding dimensions of m, the values of RMSE and MAE showed the decrease tendency with the increase of decomposition levels, which may be attributed to the fact that the low-frequency components are more stable than the high-frequency components, and the stable time series can more easily obtain high forecasting accuracy.Except for the first components (i.e., IMF1 and PF1), the LSSVM and the Volterra models achieved excellent forecasting results for other components, with correlation coefficients of more than 0.97.The lower correlation coefficients obtained by the first components were due to the fact that the first components have more nonstationary characteristics than other components, which will bring difficulties for developing an accurate forecasting model.The Volterra model is a polynomial regression model.It easily falls into a local minimum of the parameter space and is inapplicable in cases where the underlying system exhibits a high degree of complexity [44].The LSSVM transforms the nonlinear regression problem in a low dimensional space into a linear regression problem in a high dimension space by using kernel mapping technique, so it exhibits unique advantages in strong nonlinear regression [45].The first components (i.e., IMF1 and PF1) obtained in this study have a strong randomness and large volatility; therefore, it can be found that the accuracy of the LSSSVM for the first components were better than that of the Volterra models.
For the remaining components, the R values of the Volterra model are more stable and higher than those of the LSSVM.Moreover, the RMSE and MAE values of the Volterra model for the test data are lower than that of the LSSVM in general.Thus, the Volterra model exhibits more accurate and stable performance than the LSSVM model for low-frequency components, which reflect the overall physical characteristics of the system.As such, the Volterra model can clearly describe the physical meaning of the system and accurately predict the continuous function [46].
Forecasting residuals (i.e., forecasting results minus real value) with the LSSVM and the Volterra models for two datasets (1 January to 31 March, 1 April to 30 June) were shown in Figures 4-7.
Comparison clearly showed that the fitting residuals of the Volterra were smaller than the LSSVM for low-frequency components, while the fitting residuals of the LSSVM were smaller than Volterra for high-frequency components.The Volterra model is a polynomial regression model.It easily falls into a local minimum of the parameter space and is inapplicable in cases where the underlying system exhibits a high degree of complexity [44].The LSSVM transforms the nonlinear regression problem in a low dimensional space into a linear regression problem in a high dimension space by using kernel mapping technique, so it exhibits unique advantages in strong nonlinear regression [45].The first components (i.e., IMF1 and PF1) obtained in this study have a strong randomness and large volatility; therefore, it can be found that the accuracy of the LSSSVM for the first components were better than that of the Volterra models.
For the remaining components, the R values of the Volterra model are more stable and higher than those of the LSSVM.Moreover, the RMSE and MAE values of the Volterra model for the test data are lower than that of the LSSVM in general.Thus, the Volterra model exhibits more accurate and stable performance than the LSSVM model for low-frequency components, which reflect the overall physical characteristics of the system.As such, the Volterra model can clearly describe the physical meaning of the system and accurately predict the continuous function [46].
Forecasting residuals (i.e., forecasting results minus real value) with the LSSVM and the Volterra models for two datasets (1 January to 31 March, 1 April to 30 June) were shown in Figures 4-7.
Comparison clearly showed that the fitting residuals of the Volterra were smaller than the LSSVM for low-frequency components, while the fitting residuals of the LSSVM were smaller than Volterra for high-frequency components.The final forecasting results for original GHI series, obtained by the different decomposition algorithms and modeling methods, were shown in Table 6.All of models based on the decomposition algorithm improved the forecasting results, compared to the LSSVM and the Volterra models without decomposition, meanwhile, the values of RMSE and forecast skill, obtained by these four models (EMD-LSSVM, LMD-LSSVM, EMD-Volterra, and LMD-Volterra), were significantly better than the persistence model for all the datasets.The results fully illustrated that using decomposition algorithm can effectively improve forecasting accuracies of GHI and model robustness.The final forecasting results for original GHI series, obtained by the different decomposition algorithms and modeling methods, were shown in Table 6.All of models based on the decomposition algorithm improved the forecasting results, compared to the LSSVM and the Volterra models without decomposition, meanwhile, the values of RMSE and forecast skill, obtained by these four models (EMD-LSSVM, LMD-LSSVM, EMD-Volterra, and LMD-Volterra), were significantly better than the persistence model for all the datasets.The results fully illustrated that using decomposition algorithm can effectively improve forecasting accuracies of GHI and model robustness.Analysis has shown that LSSVM is suitable for the high-frequency components (i.e., IMF1 and PF1) and the Volterra is suitable for the low-frequency components; the LSSVM and the Volterra models were combined to develop forecasting models (namely EMD-LSSVM-Volterra and LMD-LSSVM-Volterra, respectively).In other words, after signal decomposition by the EMD or LMD, the LSSVM was employed to develop a sub-model for the first components (i.e., IMF1 or PF1), while the Volterra was used to develop the sub-models for the remaining low-frequency components, and the final results were obtained by superposition of each sub-model.As shown in Table 7, the forecasting accuracies of the EMD-LSSVM-Volterra and the LMD-LSSVM-Volterra have various degrees of improvements for all the datasets, compared to the EMD-LSSVM, the LMD-LSSVM, the EMD-Volterra and the LMD-Volterra models.These differences are attributed to the fact that the EMD and the LMD have different explanatory power for the original signals.EMD presents several limitations in terms of end effect, over envelope, under envelope, and mode confusion.LMD, a new processing method for non-stationary signals, also exhibits obvious deficiencies compared with EMD.For example, the signal will be lagging behind or ahead of time when smoothing is performed for a long time, smoothing step cannot be optimally determined in the smoothing process, and no fast algorithm is available [38].The signal in LMD smoothing easily advances or lags and EMD yields serious end effect; hence, we integrated both decomposition algorithms to deal with the original radiation series to improve the prediction accuracy.Therefore, in order to further improve the forecasting accuracy for GHI, a novel modeling method called EMD-LMD-LSSVM-Volterra was proposed in this study.The forecasting result obtained by the EMD-LMD-LSSVM-Volterra model is the average value of the EMD-LSSVM-Volterra model and the LMD-LSSVM-Volterra model.The entire process was shown in Figure 8, and comparison results were presented in Table 7.

The Forecasting Results for GHI Series Using the EMD-LSSVM-Volterra, the LMD-LSSVM-Volterra, and the EMD-LMD-LSSVM-Volterra Models
Analysis has shown that LSSVM is suitable for the high-frequency components (i.e., IMF1 and PF1) and the Volterra is suitable for the low-frequency components; the LSSVM and the Volterra models were combined to develop forecasting models (namely EMD-LSSVM-Volterra and LMD-LSSVM-Volterra, respectively).In other words, after signal decomposition by the EMD or LMD, the LSSVM was employed to develop a sub-model for the first components (i.e., IMF1 or PF1), while the Volterra was used to develop the sub-models for the remaining low-frequency components, and the final results were obtained by superposition of each sub-model.As shown in Table 7, the forecasting accuracies of the EMD-LSSVM-Volterra and the LMD-LSSVM-Volterra have various degrees of improvements for all the datasets, compared to the EMD-LSSVM, the LMD-LSSVM, the EMD-Volterra and the LMD-Volterra models.These differences are attributed to the fact that the EMD and the LMD have different explanatory power for the original signals.EMD presents several limitations in terms of end effect, over envelope, under envelope, and mode confusion.LMD, a new processing method for non-stationary signals, also exhibits obvious deficiencies compared with EMD.For example, the signal will be lagging behind or ahead of time when smoothing is performed for a long time, smoothing step cannot be optimally determined in the smoothing process, and no fast algorithm is available [38].The signal in LMD smoothing easily advances or lags and EMD yields serious end effect; hence, we integrated both decomposition algorithms to deal with the original radiation series to improve the prediction accuracy.Therefore, in order to further improve the forecasting accuracy for GHI, a novel modeling method called EMD-LMD-LSSVM-Volterra was proposed in this study.The forecasting result obtained by the EMD-LMD-LSSVM-Volterra model is the average value of the EMD-LSSVM-Volterra model and the LMD-LSSVM-Volterra model.The entire process was shown in Figure 8, and comparison results were presented in Table 7.The forecast skill values obtained by the EMD-LMD-LSSVM-Volterra models were improved by 11.37-32.48%and 10.94-115.03%, and the RMSE values were decreased by 5.92-13.83%and 9.61-20.79%for all the datasets, compared to the EMD-LSSVM-Volterra and the LMD-LSSVM-Volterra, respectively.Moreover, the values of MAE obtained by the EMD-LMD-LSSVM-Volterra model, which has a significant decrease compared to the EMD-LSSVM-Volterra model and the LMD-LSSVM-Volterra model, were also lower than that of the persistence models presented in Table 3.The correlation coefficient (R) values of the EMD-LMD-LSSVM-Volterra models achieved 0.974-0.986for all the datasets, which indicates that the forecasting value has a strong correlation with the measured value of GHI.The ARIMA is a widely used modeling method for solar radiation forecasting, the results obtained by the ARIMA were also provided in Table 7.It can be easily found that the EMD-LMD-LSSVM-Volterra models were significantly better than the ARIMA models.Overall, the EMD-LMD-LSSVM-Volterra models achieved the best forecasting accuracy in terms of all the statistical parameters (RMSE, MAR, R, and Forecast Skill).
The correlation between the forecasted and measured values of GHI is plotted in Figure 9 to show the effectiveness of the forecasting approaches.The forecasted values of the EMD-LMD-LSSVM-Volterra model were strongly correlated with the measured GHI data.The forecast skill values obtained by the EMD-LMD-LSSVM-Volterra models were improved by 11.37-32.48%and 10.94-115.03%, and the RMSE values were decreased by 5.92-13.83%and 9.61-20.79%for all the datasets, compared to the EMD-LSSVM-Volterra and the LMD-LSSVM-Volterra, respectively.Moreover, the values of MAE obtained by the EMD-LMD-LSSVM-Volterra model, which has a significant decrease compared to the EMD-LSSVM-Volterra model and the LMD-LSSVM-Volterra model, were also lower than that of the persistence models presented in Table 3.The correlation coefficient (R) values of the EMD-LMD-LSSVM-Volterra models achieved 0.974-0.986for all the datasets, which indicates that the forecasting value has a strong correlation with the measured value of GHI.The ARIMA is a widely used modeling method for solar radiation forecasting, the results obtained by the ARIMA were also provided in Table 7.It can be easily found that the EMD-LMD-LSSVM-Volterra models were significantly better than the ARIMA models.Overall, the EMD-LMD-LSSVM-Volterra models achieved the best forecasting accuracy in terms of all the statistical parameters (RMSE, MAR, R, and Forecast Skill).
The correlation between the forecasted and measured values of GHI is plotted in Figure 9 to show the effectiveness of the forecasting approaches.The forecasted values of the EMD-LMD-LSSVM-Volterra model were strongly correlated with the measured GHI data.The persistence model with clear-sky index is another widely used physical model for GHI forecasts.In this model, the clear-sky index has a great effect on the model accuracy.The clear-sky index time series not only contains the solar radiation information, but it also contains the information of other environmental physical quantities.Usually, the clear-sky index in time t + ∆t was calculated based on actual solar radiation data and clear-sky radiation data in the time t (see Equation (20)).This is an approximate method which ignores the variation of the clear-sky index between the time t and t + ∆t; hence, in order to obtain more accurate forecasting results obtained by persistence model, it is necessary to develop a forecasting model for the clear-sky index.In this study, the persistence model combined with the ARIMA, the LSSVM, the Volterra, and the EMD-LMD-LSSVM-Volterra model were also used for solar radiation forecast.Firstly, the clear-sky indexes k(t) are calculated through using measured values of solar radiation I(t) divided by the calculated values of the clear-sky solar radiation I clr (t).Here, I clr (t) were obtained based on the Bird model.Then the ARIMA, the LSSVM, the Volterra and the EMD-LMD-LSSVM-Volterra models were separately built to forecast the clear-sky index k(t + 1) using historical data of k(t).Finally, the forecasted value of Î(t + 1) can be achieved using k(t + 1) multiplied by the I clr (t+1).The final results obtained by these methods were shown in Table 8.
It also can be found that the EMD-LMD-LSSVM-Volterra models combined with persistence model gained the best performance compared with other three methods and had a similar performance with Table 7.In conclusion, the EMD-LM-LSSVM-Volterra models can accurately forecast solar radiation whether it's only using original GHI series or based on the clear sky index series which contains enough information of environmental physical quantity.

Conclusions
A novel hybrid method combining EMD, LMD, LSSVM and Volterra algorithms was proposed as a statistical tool for forecasting hourly solar radiation only dependent on GHI series.The performance of the newly proposed models was checked on the gathered data from Golden, (CO, USA) and compared with the performance of the respective single and simple hybrid models.The main conclusions of this paper are summarized as follows: firstly, the EMD-LMD method effectively decomposes the original GHI series into a series of stable components for different frequencies which accurately reflect the original feature information.Secondly, different LSSVM and Volterra sub-models established according to the law of variation of these components can achieve the best results.Lastly, the EMD-LMD-LSSVM-Volterra model accurately forecasts the solar radiation whether it's only using original GHI series or based on the clear sky index series.
A whole series of simulations validated the effectiveness of the proposed EMD-LMD-LSSVM-Volterra model on forecasting the complex and strongly nonlinear solar radiation time series.In the application stduies against a renewable energy background, we believe that this method can also effectively decompose and forecast other nonstationary time series signals, such as the wind speed series for wind power generation and energy load time series forecasting.In this study, only one-step forecasting (i.e., using current and previous radiation data to forecast solar radiation values in the next hour) was studied; the application effect of the proposed method in the long-term forecast of solar radiation is an issue that needs to be studied in the future.On the other hand, one year of data from one site was used in this study, further research is needed by including solar radiation data form more locations and under different sky conditions over a wider time period.

2 Figure 1 .
Figure 1.The series of hourly GHI from 1 January 2014 to 31 December 2014 collected on Golden, Colorado.

Figure 3 .
Figure 3. Solar radiation sequence decomposition using the LMD.(a) PFs from Jan. to Mar.; (b) PFs from Apr. to Jun.; (c) PFs from Jul. to Sept.; (d) PFs from Oct. to Dec.

2 Figure 4 .
Figure 4. Forecasting residuals of the LSSVM and the Volterra models with different IMFs based on the data of 1 January to 31 March in 2014.

Figure 4 .
Figure 4. Forecasting residuals of the LSSVM and the Volterra models with different IMFs based on the data of 1 January to 31 March in 2014.

Figure 5 .
Figure 5. Forecasting residuals of the LSSVM and the Volterra models with different PFs based on the data of 1 January to 31 March in 2014.

Figure 6 . 2 Figure 5 .
Figure 6.Forecasting residuals of the LSSVM and the Volterra models with different IMFs based on the data of 1 April to 30 June in 2014.

Figure 4 .
Figure 4. Forecasting residuals of the LSSVM and the Volterra models with different IMFs based on the data of 1 January to 31 March in 2014.

Figure 5 .
Figure 5. Forecasting residuals of the LSSVM and the Volterra models with different PFs based on the data of 1 January to 31 March in 2014.

Figure 6 . 2 Figure 6 .
Figure 6.Forecasting residuals of the LSSVM and the Volterra models with different IMFs based on the data of 1 April to 30 June in 2014.

Figure 7 .
Figure 7. Forecasting residuals of the LSSVM and the Volterra models with different PFs based on the data of 1 April to 30 June in 2014.

2 Figure 7 .
Figure 7. Forecasting residuals of the LSSVM and the Volterra models with different PFs based on the data of 1 April to 30 June in 2014.

Table 1 .
Environmental information about temperatures and relative humidity (RH) of the observation site in 2014.

Table 2 .
Input atmospheric parameters values of the Bird clear sky model.

Table 3 .
The performance comparison of the persistence, the LSSVM and the Volterra models for the four datasets.

Table 4 .
The forecasting results of the LSSVM and the Volterra models for each component obtained by the EMD and the LMD based on the data of 1 January to 31March in 2014.

Table 4 .
The forecasting results of the LSSVM and the Volterra models for each component obtained by the EMD and the LMD based on the data of 1 January to 31 March in 2014.

Table 5 .
The forecasting results of the LSSVM and the Volterra models for each component obtained by the EMD and the LMD based on the data of 1 April to 30 June in 2014.

Table 5 .
The forecasting results of the LSSVM and the Volterra models for each component obtained by the EMD and the LMD based on the data of 1 April to 30 June in 2014.

Table 6 .
The forecasting results of the EMD-LSSVM, the LMD-LSSVM, the EMD-Volterra, and the LMD-Volterra for the four datasets in 2014.

Table 6 .
The forecasting results of the EMD-LSSVM, the LMD-LSSVM, the EMD-Volterra, and the LMD-Volterra for the four datasets in 2014.The Forecasting Results for GHI Series Using the EMD-LSSVM-Volterra, the LMD-LSSVM-Volterra, and the EMD-LMD-LSSVM-Volterra Models

Table 8 .
The forecasting accuracy comparisons of the persistence model combined with the ARIMA, the LSSVM, the Volterra, and the EMD-LMD-LSSVM-Volterra model.