Short-Term Wind Speed Forecasting Based on Signal Decomposing Algorithm and Hybrid Linear/Nonlinear Models

Abstract: Accurate wind speed forecasting is a significant factor in grid load management and system operation. The aim of this study is to propose a framework for more precise short-term wind speed forecasting based on empirical mode decomposition (EMD) and hybrid linear/nonlinear models. The original wind speed series is decomposed into a finite number of intrinsic mode functions (IMFs) and residuals by using the EMD. Several popular linear and nonlinear models, including autoregressive integrated moving average (ARIMA), support vector machine (SVM), random forest (RF), artificial neural network with back propagation (BP), extreme learning machines (ELM) and convolutional neural network (CNN), are utilized to study IMFs and residuals, respectively. An ensemble forecast for the original wind speed series is then obtained. Various experiments were conducted on real wind speed series at four wind sites in China. The performance and robustness of various hybrid linear/nonlinear models at two time intervals (10 min and 1 h) are compared comprehensively. It is shown that the EMD based hybrid linear/nonlinear models have better accuracy and more robust performance than the single models with/without EMD. Among the five hybrid models, EMD-ARIMA-RF has the best accuracy on the whole for 10 min data, and the mean absolute percentage error (MAPE) is less than 0.04. However, for the 1 h data, no model can always perform well on the four datasets, and the MAPE is around 0.15.


Introduction
Wind energy has been growing fast in recent years. By the end of 2017, the worldwide total capacity of wind turbines reached 539 GW (52.6 GW added in 2017) [1]. Because of the stochastic variation of wind speed, wind energy behaves in a more unstable and volatile manner than traditional energy sources. Direct integration of unstable wind power will have a serious impact on the whole grid, especially for the areas with high levels of wind power penetration [2-4]. If the wind speed could be predicted accurately, the dispatching plan of the power system could be adjusted to reduce the adverse impact of the wind power on the whole grid. It is also beneficial for the improvement of the power limit of the wind power penetration. Therefore, accurate wind speed forecasting is very important for grid load management and system operation [5-9].
Many efforts have been made to develop accurate wind speed forecasting models. The autoregressive moving average (ARMA) or autoregressive integrated moving average (ARIMA) models are considered to be the most widespread models in wind speed forecasting. Torres et al. [10] first carried out the transformation and standardization of the original wind speed series to allow the use of ARMA models. Shi and Erdem [11] comprehensively evaluated the effectiveness of ARMA-generalized autoregressive conditional heteroscedastic (GARCH) approaches for modeling the mean and volatility of wind speed. Lydia et al. [12] built various ARMA models with and without external variables to predict wind speed at 10-min intervals up to 1 h. Note that the ARIMA models are considered inherently linear models, although curvilinear (or nonlinear) relations could be incorporated in these models. Besides ARIMA models, machine learning (ML) models, such as artificial neural network (ANN) with back propagation (BP) and radial basis function (RBF) [13,14], support vector machine (SVM) [15], extreme learning machines (ELM) [16], and deep learning networks (DLN) [17,18], have shown excellent potential for accurate wind speed prediction. Usually, ML models are considered nonlinear models as they are more flexible than linear models, and thus could better deal with nonlinear relations. Despite showing superiority over linear models, the ML models also have their own shortcomings, e.g., the dilemma of local minima, the over-fitting problem, and poor efficiency with fewer samples. Many studies have shown that single linear or nonlinear models cannot give satisfactory results in all situations due to their drawbacks [19]. Therefore, hybrid models, which synthesize the advantages of single models, are the trend in advanced wind speed forecasting.
Some signal de-noising measures need to be implemented to make the wind speed series less noisy and more stable. Empirical mode decomposition (EMD) [20] is a de-noising method based on local characteristics of the signal. EMD absorbs the multi-resolution advantages of wavelet decomposition, overcomes the difficulty of determining the wavelet base and decomposition scale, and is more suitable for nonlinear and non-stationary signal sequences [21]. Thus, EMD is usually used to decompose the original wind speed series into a small number of intrinsic mode functions (IMFs) and residuals, which are respectively studied by various models to formulate a hybrid forecast for the original wind speed series. The ANN family is the most commonly used prediction tool in EMD based hybrid forecasting models [22-32]. Guo et al. [22] first proposed a modified EMD-ANN model for multi-step wind speed forecasting. In simulations on the monthly and daily wind speed data in Zhangye of China, the proposed model showed the best accuracy compared with the single ANN and unmodified EMD-ANN model. By introducing the latest decomposing algorithm, fast ensemble EMD (FEEMD), multilayer perceptron (MLP) neural networks and adaptive neuro fuzzy inference system (ANFIS) neural networks, two new hybrid models (FEEMD-MLP, FEEMD-ANFIS) were proposed by Liu et al. [23]. After comparisons with the wavelet packet decomposition (WPD) based hybrid models (WPD-MLP and WPD-ANFIS), they found that the FEEMD-MLP hybrid model has the best performance in the three-step predictions [24]. Liu et al. [25] also used the FEEMD for a secondary decomposition, and found that the FEEMD could further improve the forecasting accuracy of WPD based hybrid models. Xiao et al. [26] and Sun and Wang [27], respectively, adopted the bat algorithm and phase space reconstruction to improve the accuracy of BP models, and then developed new forecasting architectures based on FEEMD and modified BP. Santhosh et al. [28] utilized the adaptive wavelet neural network (WNN) to regress each signal decomposed by EMD. The proposed EMD based hybrid approach was subsequently investigated with respect to a wind farm in south India. In recent years, Wang and his collaborators [29-32] developed a series of powerful EMD based hybrid forecasting systems. Several ANN models, including Elman neural network (ENN) [29], WNN [30,32] and generalized regression neural network (GRNN) [31], have been adopted in the systems. Some popular parameter optimization algorithms, such as the multi-objective ant lion optimization algorithm [29], meta-heuristic optimization algorithm [31] and multi-objective sine cosine algorithm [32], were respectively developed to ensure the hybrid forecast models are in the optimal state. The experimental results indicated that the average values of the mean absolute percent errors of the developed model utilizing 10-min, 30-min and 60-min interval data are lower than 8% [29].
ELM is another popular prediction tool in EMD based hybrid forecasting models [33-37]. Liu et al. [33] first introduced the ELM to the hybrid forecasting architecture using EMD and FEEMD as wind signal decomposing algorithms. Their experiments indicated that by using the EMD and FEEMD, all the hybrid algorithms have better performance than the single ELMs. Then, Liu et al. [34] used the outlier correction method to guarantee the robustness of ELM during the forecasting computation. They also developed a multi-decomposing strategy based on the combination of EMD and WPD, and showed that the WPD-EMD-ELM hybrid model has the best predicting performance [35]. Two improved ELM models, namely regularized ELM [36] and composite quantile regression outlier-robust ELM [37], were introduced to model each EMD decomposed sub-series. Besides the ANN and ELM, the SVM has also received the attention of some scholars in EMD based hybrid wind speed forecasting [38,39].
Most of the above studies adopted only a single nonlinear model (ANN [22-32], ELM [33-37] or SVM [38,39]) to study the IMFs decomposed by the EMD. Because the characteristics of IMFs and residuals are not the same, hybrid strategies with more than one type of forecast model might get even better results. More recently, some scholars have made meaningful attempts in this area. Zhang et al. [40] employed a recent type of EMD to divide the original wind speed data into a finite set of IMFs and residuals, and then utilized five neural networks (including BP, RBF, GRNN, WNN and ENN) to forecast each IMF and the residuals. Experimental results of their study showed that the proposed hybrid model can take advantage of individual models and has the best performance. Li et al. [41] developed a novel hybrid forecasting model, which combines EMD and several single models (BP and ENN). A modified SVM was then used to integrate all the results to obtain the final forecasting results. Experimental studies on real 10-min wind speed series showed that the developed hybrid model outperforms other benchmark models. Among the EMD decomposed IMFs and residuals, the low frequency dominated residuals behave more stably and might be more suitable for forecasting with linear models (ARIMA). The IMFs with higher frequency components are better modeled by nonlinear prediction models (such as ANN, ELM, SVM, etc.). Thus, combining both linear and nonlinear models for predicting IMFs and residuals with various frequency band components might be a good choice for precise wind speed forecasting. However, such an idea has not been realized in current studies.
The above literature review indicates that it is rare to see two (or more) prediction tools used in EMD based hybrid wind speed forecasting; this is particularly true for the hybrid linear/nonlinear modeling and prediction of wind speed series. In addition, little effort has been devoted to comprehensively comparing the performance and robustness of both linear and nonlinear models in short-term wind speed forecasting. Therefore, the novelty and contributions of this study can be summarised as follows: A framework for short-term wind speed forecasting is introduced based on EMD and hybrid linear/nonlinear models. The EMD is adopted to decompose the original wind speed series into a finite number of IMFs and residuals, i.e., low-frequency residuals (LFR), medium-frequency IMF (MIMF) and high-frequency IMF (HIMF). A linear model (ARIMA) and several popular nonlinear models (SVM, random forest (RF), BP, ELM and convolutional neural network (CNN)) are, respectively, utilized to study each IMF and the residuals. An ensemble forecast for the original wind speed series is then obtained. Various experiments are conducted on the real wind speed data at four wind sites in China. The performance and robustness of various hybrid linear/nonlinear models at two time intervals (10 min and 1 h) are compared comprehensively. The forecasting model with the best performance is then recommended for real applications.
The remainder of this paper is organized as follows. Section 2 explains the structures and procedures of EMD based hybrid models, Section 3 introduces the measurement of wind speed data at four sites in China, Section 4 presents the evaluation of model parameters and compares the performances of various models, and Section 5 summarises the conclusions of the study.

Methods
The solution process of EMD is explained in this section. After introduction of both linear (ARIMA) and nonlinear (SVM, RF, BP, ELM and CNN) single models, the structures and procedures for EMD based hybrid linear/nonlinear models are proposed in detail. Several metrics for measuring the forecast accuracy are also introduced briefly.

EMD
The original wind speed data is processed by EMD, and a finite number of IMFs with different scales or trends are obtained. Compared with the original data series, each IMF has better stability and regularity. The decomposed IMFs should satisfy two conditions [21]: (a) the number of extrema and the number of zero crossings are equal or differ at most by one; (b) at any point, the mean value of the envelope defined by the local maxima and the envelope defined by the local minima is zero. The computation of the EMD can be given as follows [21]:
(1) For the wind speed series $X(t)$, all of the local maximal and minimal data points are found and located.
(2) Two cubic spline lines are used to connect all of the local maximal and minimal points, respectively. The upper and lower envelopes $X_H(t)$ and $X_L(t)$ are then obtained accordingly.
(3) The mean of the two envelopes is computed, written here as $M(t) = [X_H(t) + X_L(t)]/2$.
(4) The candidate component $C(t) = X(t) - M(t)$ is obtained. If $C(t)$ satisfies conditions (a) and (b), then $C(t)$ is considered an IMF; otherwise, replace $X(t)$ by $C(t)$, and repeat steps (1)-(4) until conditions (a) and (b) are simultaneously satisfied.
(5) The residuals $R(t) = X(t) - C(t)$ are then calculated. Replace $X(t)$ by $R(t)$, and repeat steps (1)-(5) until all the IMFs and residuals are found.
In our study, two IMFs (HIMF and MIMF) and residuals (LFR) are used to represent the original wind speed series, and are respectively studied by both linear and nonlinear models to obtain an ensemble forecast.
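As an illustration of IMF condition (a), the following minimal Python sketch counts the local extrema and zero crossings of a sampled series and checks that the two counts differ by at most one. The function names and the test signals are hypothetical and chosen only for this example; condition (b), which requires the spline envelopes, is not checked here.

```python
import math


def count_local_extrema(x):
    """Count strict interior local maxima and minima of a sampled series."""
    count = 0
    for i in range(1, len(x) - 1):
        is_max = x[i] > x[i - 1] and x[i] > x[i + 1]
        is_min = x[i] < x[i - 1] and x[i] < x[i + 1]
        if is_max or is_min:
            count += 1
    return count


def count_zero_crossings(x):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)


def satisfies_condition_a(x):
    """IMF condition (a): extrema and zero crossings differ by at most one."""
    return abs(count_local_extrema(x) - count_zero_crossings(x)) <= 1


# Synthetic signals (not wind data): a zero-mean 3 Hz tone sampled at 200 Hz,
# and the same tone shifted upwards so that it never crosses zero.
t = [i / 200 for i in range(400)]
tone = [math.sin(2 * math.pi * 3 * ti + 0.3) for ti in t]
offset_tone = [v + 2.0 for v in tone]

print(satisfies_condition_a(tone))         # zero-mean oscillation
print(satisfies_condition_a(offset_tone))  # oscillation riding on an offset
```

The shifted tone fails the check because it has the same extrema but no zero crossings, which is the kind of component the sifting steps are meant to exclude.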

Single Linear Models (ARIMA)
The model is known as ARIMA(p, d, q), where p is the order of the autoregressive part, q is the order of the moving average part, and d is the degree of differencing involved. The Augmented Dickey-Fuller (ADF) test [42] is used to determine whether the wind speed series is stationary or not. If it is non-stationary, first differencing should be applied, i.e., d > 0. The autocorrelation function (ACF) and partial autocorrelation function (PACF) [42] plots are then utilized to determine the order of the ARIMA model. The maximum likelihood method is utilized to estimate the model parameters. In principle, the ARIMA model with estimated parameters should generate the lowest residuals.
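The role of the differencing degree d can be illustrated with a short Python sketch. It only shows how a series is differenced and how forecasts of a once-differenced series are integrated back to the original level; it does not fit an ARIMA model. The function names and the sample values are hypothetical.

```python
def difference(series, d=1):
    """Apply first differencing d times to a series."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def undifference_forecast(last_value, diff_forecasts):
    """Integrate forecasts of a once-differenced series (d = 1) back to levels."""
    levels = []
    current = last_value
    for step in diff_forecasts:
        current += step
        levels.append(current)
    return levels


# Made-up wind speeds in m/s, used only to show the mechanics.
speeds = [5.0, 5.5, 6.5, 6.0, 7.0]
diffs = difference(speeds, d=1)
print(diffs)  # [0.5, 1.0, -0.5, 1.0]

# Suppose a model fitted on `diffs` predicts these two increments (assumed values).
print(undifference_forecast(speeds[-1], [0.5, -0.25]))  # [7.5, 7.25]
```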

Single Nonlinear Models
Five nonlinear models (SVM, RF, BP, ELM and CNN) are used in this study. They are briefly introduced as follows:
(1) The SVM model has recently been used in a range of applications such as regression and time series forecasting. The basic idea of SVM for regression is to use a nonlinear mapping to transform the data into a high-dimensional feature space, and then perform a linear regression in the feature space. Optimal weight and bias values are obtained by solving the quadratic optimization problem [43].
(2) The RF model, which was suggested by Breiman [44], is an ensemble learning method for classification and regression. It operates by constructing a multitude of decision trees at the training stage and outputting the mean prediction of the individual trees. Classification and regression trees (CARTs) [45] in the RF model use binary rules to divide data samples. By averaging many trees, the RF could correct the decision trees' habit of overfitting to their training dataset.
(3) Among many available learning algorithms, BP has been the most popular learning algorithm implemented for ANN models [46]. The time series data is introduced by the input layer, and the forecast value is produced by the output layer. The layer between the input and output layers is called the hidden layer, where data are processed. The procedure of BP is repeated by adjusting the weights of the connections in the network using gradient descent. Ref. [47] presented the detailed algorithm.
(4) The ELM's structure is similar to the single hidden layer feed-forward neural network [48]. The main idea of the ELM model is to randomly set the input weights of the network and then obtain the output weights through the generalized inverse of the hidden-layer output matrix. This concept makes the ELM model operate extremely fast and maintain better accuracy compared with other learning models. The number of hidden nodes, which is the key parameter of the ELM model, should be carefully estimated in order to obtain good results [16].
(5) CNN is a class of deep, feed-forward ANNs, most commonly applied to image classification and then generalized for time series prediction. In order to simplify the preprocessing, CNN utilizes a variation of multilayer perceptrons. Except that the filter weights need to be shared, there are no other connections between the neurons. Thus, CNN could be trained more efficiently and has a reliable ability to extract hidden features [49].
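To apply regression models such as SVM, RF, BP, ELM or CNN to a univariate series, the series is commonly reframed as input/target pairs with a sliding window. The sketch below shows one such reframing in Python. The window length `n_lags` and the function name are assumptions made for illustration, since the input window actually used is not stated in this section.

```python
def make_lagged_samples(series, n_lags):
    """Turn a series into (inputs, targets) pairs for one-step-ahead regression.

    Each input holds the n_lags most recent values and the target is the
    value that immediately follows them.
    """
    inputs, targets = [], []
    for i in range(n_lags, len(series)):
        inputs.append(list(series[i - n_lags:i]))
        targets.append(series[i])
    return inputs, targets


# Toy series, only to show the shape of the result.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X, y = make_lagged_samples(series, n_lags=3)
print(X)  # [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
print(y)  # [4.0, 5.0, 6.0]
```

Any of the five nonlinear models could then be trained on `X` and `y`; the same reframing would be applied separately to each decomposed component.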

Hybrid Linear/Nonlinear Models
The idea of hybrid linear/nonlinear models is shown in Figure 1. The EMD is first adopted to decompose the original wind speed data, with its linear and nonlinear characteristics, into a finite number of IMFs and residuals (HIMF, MIMF and LFR) as

$$y_t = \mathrm{HIMF} + \mathrm{MIMF} + \mathrm{LFR} \tag{1}$$

Usually, the LFR retains more linear characteristics, while the nonlinear characteristics are dominant in HIMF and MIMF. The linear (ARIMA) model is utilized to fit LFR as

$$L_t = \mathrm{linear}(\mathrm{LFR}) \tag{2}$$

By applying the nonlinear models on both HIMF and MIMF, the nonlinear characteristics of wind speed data are extracted as follows

$$N_t = \mathrm{nonlinear}(\mathrm{MIMF}) + \mathrm{nonlinear}(\mathrm{HIMF}) \tag{3}$$

From Equation (1), we can obtain the ensemble prediction results of the hybrid linear/nonlinear model

$$\hat{y}_t = L_t + N_t \tag{4}$$

As five nonlinear models (SVM, RF, BP, ELM and CNN) are adopted, the hybrid linear/nonlinear models are named EMD-ARIMA-SVM, EMD-ARIMA-RF, EMD-ARIMA-BP, EMD-ARIMA-ELM and EMD-ARIMA-CNN. For comparison, the obtained HIMF, MIMF and LFR are also all studied by a single linear or nonlinear model; these models are named EMD-ARIMA, EMD-SVM, EMD-RF, EMD-BP, EMD-ELM and EMD-CNN. Moreover, single linear or nonlinear models are also directly applied to the original wind speed data to show the effect of EMD on the forecasting accuracy.
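The combination in Equations (3) and (4) can be sketched in Python as follows. The two forecasters are deliberately trivial stand-ins (a persistence rule in place of ARIMA and a short moving average in place of a trained nonlinear model), and all names and numbers are hypothetical; only the way the component forecasts are summed reflects the text.

```python
from typing import Callable, Sequence

Forecaster = Callable[[Sequence[float]], float]


def persistence(component: Sequence[float]) -> float:
    """Stand-in for the linear (ARIMA) model: repeat the last value."""
    return component[-1]


def mean_of_last_three(component: Sequence[float]) -> float:
    """Stand-in for a nonlinear model: average of the last three values."""
    tail = component[-3:]
    return sum(tail) / len(tail)


def hybrid_forecast(himf, mimf, lfr,
                    linear_model: Forecaster,
                    nonlinear_model: Forecaster) -> float:
    """One-step hybrid forecast: L_t from the LFR, N_t from MIMF and HIMF."""
    linear_part = linear_model(lfr)                                  # L_t
    nonlinear_part = nonlinear_model(mimf) + nonlinear_model(himf)   # N_t
    return linear_part + nonlinear_part                              # y_hat_t


# Made-up component histories (m/s), not taken from the wind sites.
himf = [0.2, -0.1, 0.3]
mimf = [0.5, 0.4, 0.6]
lfr = [6.0, 6.5, 7.0]

prediction = hybrid_forecast(himf, mimf, lfr,
                             linear_model=persistence,
                             nonlinear_model=mean_of_last_three)
print(round(prediction, 4))
```

Passing the forecasters as arguments keeps the combination step independent of which nonlinear model is chosen, which mirrors how the five hybrid variants differ only in the model applied to HIMF and MIMF.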

Forecasting Performance Metrics
Three metrics, including root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE), are utilized to evaluate the forecasting performance. Their mathematical definitions are given as follows

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n}\left|y_t - \hat{y}_t\right|$$

$$\mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right|$$

where $y_t$ and $\hat{y}_t$ denote the actual and predicted values, and $n$ is the sample number. Using the above three metrics, various models, including the single models (ARIMA, SVM, RF, BP, ELM, CNN), the EMD based single models (EMD-ARIMA, EMD-SVM, EMD-RF, EMD-BP, EMD-ELM, EMD-CNN) and the EMD based hybrid linear/nonlinear models (EMD-ARIMA-SVM, EMD-ARIMA-RF, EMD-ARIMA-BP, EMD-ARIMA-ELM, EMD-ARIMA-CNN), will be evaluated for their forecast performance on different wind speed datasets, respectively.
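A minimal Python sketch of the three metrics is given below, assuming their standard definitions, with MAPE expressed as a fraction rather than a percentage (consistent with values such as 0.04 reported in this paper). The sample values are made up, and the MAPE function assumes that no actual value is zero.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (actual values must be nonzero)."""
    n = len(actual)
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# Made-up wind speeds (m/s) for illustration only.
actual = [4.0, 5.0, 8.0]
predicted = [5.0, 5.0, 6.0]

print(round(rmse(actual, predicted), 4))  # 1.291
print(mae(actual, predicted))             # 1.0
print(round(mape(actual, predicted), 4))  # 0.1667
```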

Data Descriptions
At a wind observation site, the wind speed data is continuously measured using anemometers. Four wind sites in China, listed in Table 1, were selected for this study. The wind speed data with two time intervals (10 min and 1 h) are considered, and shown in Figures 2 and 3, respectively. The length of every wind speed series is 1000 points. The first 800 points of data are used for constructing and training the forecasting models, while the remaining 200 points of data are utilized for tests. Descriptive statistics of wind speed data are also given in Table 2.
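The 800/200 partition described above is a chronological split, which can be sketched in Python as follows. The placeholder series stands in for a measured wind speed record, and the function name is hypothetical.

```python
def chronological_split(series, n_train):
    """Split a time series into a training head and a test tail, keeping order."""
    if not 0 < n_train < len(series):
        raise ValueError("n_train must be between 1 and len(series) - 1")
    return series[:n_train], series[n_train:]


# Placeholder for a 1000-point wind speed series (not real measurements).
series = [float(i) for i in range(1000)]
train, test = chronological_split(series, n_train=800)
print(len(train), len(test))  # 800 200
```

Keeping the split chronological, rather than shuffling, means that every test point lies in the future of all training points, as in an operational forecast.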

ARIMA
In this study, we utilize the "auto.arima" function in R to fit the best ARIMA model to the univariate time series. For instance, ARIMA(1,1,1) is obtained as the best model for the 10 min data of the AnHui (AH) site. The maximum likelihood method is applied to estimate the model parameters, whose values are given in Table 3. In order to verify the obtained model, the parameter significance test is carried out and shown in Figure 4. The fitting residuals form a random series and the p value is found to be significantly greater than 0.05, indicating that the obtained ARIMA model is reasonable. By applying the "auto.arima" function, the single ARIMA model parameters for both 10 min and 1 h data of four wind speed datasets are determined and given in Table 3.

SVM, RF, BP, ELM and CNN
In the SVM model, two parameters, namely the bandwidth $\sigma^2$ and regularization factor $\gamma$, should be estimated for each of the wind speed sites. Following the suggestions of Zhou et al. [15], we consider the values of $\sigma^2$ as 0.1, 0.25, 1, 4, 16, 100, 1000 and $\gamma$ as 0.25, 1, 4, 16, 256. Single SVM models with different combinations of $\sigma^2$ and $\gamma$ are utilized to fit the wind speed time series at the GuangDong (GD) site, and the obtained MAE values are listed in Table 4. When $\sigma^2$ = 0.1 and $\gamma$ = 0.25, the MAE value of the 10 min data is minimum, i.e., 0.444. For the 1 h data, the minimum MAE value (0.754) is obtained at $\sigma^2$ = 0.25, $\gamma$ = 0.25. Thus, for the GD site, the optimal parameters of the single SVM model are: $\sigma^2$ = 0.1, $\gamma$ = 0.25 (10 min data) and $\sigma^2$ = 0.25, $\gamma$ = 0.25 (1 h data). Similarly, the optimal parameters of SVM models for the other three wind speed sites are determined and shown in Table 5.
For the RF model, the number of trees and number of extracted features should be determined [44]. Here, we consider the number of trees to vary from 50 to 700 and the number of extracted features to vary from 1 to 4, and use the RF model to fit each wind speed time series. The calculated MAE values at the HeiLongJiang (HLJ) site are given in Table 6. When the number of trees is equal to 100 and the number of extracted features is equal to 2, the MAE of the 10 min data reaches the minimum, i.e., 0.8992. For the 1 h data, the minimum MAE is obtained for the number of trees equal to 300 and the number of extracted features equal to 3. Thus, the optimal parameters of single RF models for the HLJ site are determined. Following the same process, the best RF models for the other three wind sites are obtained and shown in Table 5.
For the single BP model, there is no reliable way to determine the optimal parameters. In this study, we just try four structures, named s1: 10-10-10-10-10, s2: 10-10-10-10, s3: 10-10-10 and s4: 10-20-50, to see which structure has the lowest MAE value. Table 7 shows the MAE values of the single BP model with different structures for wind speed time series at the AH site. It is shown that for both the 10 min data and the 1 h data, the BP model with structure s4 always has the minimum value of MAE, and thus it is selected as the best single BP model. Similarly, the other three wind sites are studied to obtain the best structure for the single BP model, and the results are given in Table 5.
For the single ELM model, the key parameter is the number of hidden neurons ($N_{neu}$). Here, we set $N_{neu}$ = 20, 40, 60, 80, 100, 200, 500, 1000, and take the wind speed data at the GanSu (GS) site as an example. The calculated MAE results are listed in Table 8. It is shown that for the 10 min data, the minimum MAE is obtained at $N_{neu}$ = 40. For the 1 h data, the minimum MAE is obtained at $N_{neu}$ = 200. Thus, for the wind speed data at the GS site, the best number of hidden neurons should be 40 (10 min data) and 200 (1 h data). In this way, the best single ELM models for the other three wind sites are obtained and shown in Table 5.
For the single CNN model, we just adopt the structure recommended by Liu et al. [50]. In their study, the CNN model consists of three convolutional layers and a fully connected layer. The channels of the convolutional layers are 4, 16 and 32, respectively. In the following study, the CNN model has the same structure.
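The parameter selection used for the SVM, RF and ELM models amounts to an exhaustive search over candidate values that keeps the combination with the lowest MAE. A minimal Python sketch of such a search is given below. The `evaluate` callable is a hypothetical placeholder for "fit the model with these parameters and return its MAE"; the toy scoring function used here is arbitrary and does not reproduce any result from the tables.

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Return the parameter combination with the lowest score and that score."""
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)  # e.g., MAE of a model fitted with params
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Candidate values for the SVM parameters as listed in the text.
svm_grid = {
    "sigma2": [0.1, 0.25, 1, 4, 16, 100, 1000],
    "gamma": [0.25, 1, 4, 16, 256],
}


def toy_evaluate(params):
    """Arbitrary stand-in for a real fit-and-score step (not the paper's MAE)."""
    return abs(params["sigma2"] - 0.25) + abs(params["gamma"] - 1)


best_params, best_score = grid_search(svm_grid, toy_evaluate)
print(best_params, best_score)
```

In practice `toy_evaluate` would be replaced by a function that trains the chosen model on the training portion of a series and returns the MAE, and the same routine could be reused with a grid over the number of trees and features for RF or over $N_{neu}$ for ELM.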

EMD-ARIMA
The EMD is adopted to decompose the original wind speed data into a finite number of IMFs and residuals (HIMF, MIMF and LFR). Figure 5 gives the EMD results of 10 min data at the AH site. Compared with the original data (see Figure 2a), one can find that the LFR retains the trend and low frequency components. The high frequency components are dominant in HIMF and MIMF. Similar to the solution process presented in Section 4.1, the orders and parameters of ARIMA models for HIMF, MIMF and LFR could be determined accordingly. Table 9 gives the results of both 10 min data and 1 h data of four wind sites.

EMD-SVM, EMD-RF, EMD-BP, EMD-ELM, EMD-CNN
In Section 4.1, the single nonlinear models are studied to estimate their parameters and structures. Here, the HIMF, MIMF and LFR obtained by EMD are also analyzed by single nonlinear models following the same solution process as that presented in Section 4.1. The results of both 10 min data and 1 h data of the four wind sites are shown in Table 10.

Hybrid Models with EMD
As shown in the above section, the idea of hybrid linear/nonlinear models is to use the linear (ARIMA) model to analyze the LFR, while the HIMF and MIMF are studied by nonlinear models.

Discussions
From the above analysis, we can see that the proposed hybrid model is mainly composed of existing mature linear and nonlinear prediction algorithms, so it is well suited to practical engineering applications. The proposed hybrid model performs well in short-term wind speed prediction, and the MAPE can reach 0.03 (10 min data). For the 1 h data, the prediction performance decreases significantly, and the MAPE exceeds 0.10. Predictably, in medium and long term wind speed forecasting, the performance of the proposed hybrid model would be even worse. Therefore, future studies will combine numerical weather prediction results to improve the accuracy and performance of hybrid models in medium and long term wind speed forecasting.

Conclusions
A framework for short-term wind speed forecasting is introduced based on EMD and hybrid linear/nonlinear models. The EMD is adopted to decompose the original wind speed series into a finite number of IMFs and residuals, which are studied by a linear model (ARIMA) and several nonlinear models (SVM, RF, BP, ELM and CNN) to obtain the ensemble forecast for the original wind speed series. Forecasting experiments are conducted on real wind speed series at four wind sites in China. The performance and robustness of various hybrid linear/nonlinear models at two time intervals (10 min and 1 h) are compared comprehensively. It is shown that single ARIMA models have better prediction accuracy than the single SVM, RF, BP, ELM and CNN models. The introduction of EMD is beneficial to most single models' prediction accuracy. The EMD based hybrid linear/nonlinear models generally have better accuracy and more robust performance than the single models with/without EMD. Among the five hybrid models, EMD-ARIMA-RF has the best accuracy on the whole for 10 min data. However, for the 1 h data, no model can always perform well on the whole dataset. As existing mature linear and nonlinear forecast models are adopted, the practical utility of the proposed hybrid wind speed forecasting model is greatly enhanced.

Figure 4. Parameter significance test results of single ARIMA(1,1,1) model for 10 min data from the AH site. ACF: Autocorrelation function.

Figure 6. Wind speed prediction comparisons for the 10 min data at the AH site. Root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE).

Figure 7. Wind speed prediction comparisons for the 10 min data at the GD site.

Figure 8. Wind speed prediction comparisons for the 10 min data at the GS site.

Figure 9. Wind speed prediction comparisons for the 10 min data at the HLJ site.

Figure 10. Wind speed prediction comparisons for the 1 h data at the AH site.

Figure 11. Wind speed prediction comparisons for the 1 h data at the GD site.

Figure 12. Wind speed prediction comparisons for the 1 h data at the GS site.

Figure 13. Wind speed prediction comparisons for the 1 h data at the HLJ site.

Table 1. Locations of four observation stations in China.

Table 2. Descriptive statistics of wind speed data at four sites in China. Columns: Max-Min Values (m/s), Mean (m/s), Standard Deviation (m/s), Skewness, Kurtosis.

Table 3. Single ARIMA model parameters of various wind speed datasets.

Table 4. Single SVM model selection for wind speed time series at the GD site, mean absolute error (MAE).

Table 5. Parameter selection results for single nonlinear models.

Table 6. Single RF model selection for wind speed time series at the HLJ site (MAE).

Table 7. Single BP model selection for wind speed time series at the AH site (MAE).

Table 8. Single ELM model selection for wind speed time series at the GS site (MAE).

Table 9. EMD-ARIMA model parameters of various wind speed datasets.

Table 10. Parameter selection results for single nonlinear models with EMD.