Multi-Step Wind Speed Forecasting Based On Ensemble Empirical Mode Decomposition, Long Short Term Memory Network and Error Correction Strategy

Abstract: An accurate multi-step wind speed prediction model is of great significance for the operation and grid integration of wind power plants. This article puts forward a novel short-term multi-step wind speed forecasting method that integrates a data pre-processing measure, a parameter optimization algorithm and an error correction strategy. In the suggested method, EEMD (Ensemble Empirical Mode Decomposition) is applied to extract a series of IMFs (intrinsic mode functions) from the initial wind data sequence; the LSTM (Long Short Term Memory) network serves as the major forecasting method for each IMF; the GRNN (general regression neural network) serves as the secondary forecasting method to forecast the error sequence of each IMF; and BSO (Brain Storm Optimization) is employed to optimize the parameter of the GRNN during the training process. To verify the validity of the suggested EEMD-LSTM-GRNN-BSO model, eight models were applied to three different wind speed sequences. The results reveal that: (1) the EEMD effectively boosts the wind speed prediction capacity and robustness of the LSTM approach; (2) the BSO-based parameter optimization method is effective in finding the optimal parameter for the GRNN and improving the forecasting performance of the EEMD-LSTM-GRNN model; (3) the error correction method based on the optimized GRNN significantly improves the forecasting accuracy of the EEMD-LSTM model; and (4) among all models involved, the proposed EEMD-LSTM-GRNN-BSO model performs best in predicting the short-term wind speed sequences.


Introduction
As the awareness of environmental protection increases, the application and promotion of renewable energy has attracted worldwide attention. As one type of promising renewable energy, wind power is experiencing rapid development [1]. Nevertheless, owing to the instability and stochastic nature of wind power generation, integrating wind power can easily destabilize the power system [2]. Therefore, it is imperative to propose an accurate prediction method for wind speed to reduce the instability risk of the power system and the economic losses of wind power enterprises.
In recent years, many scholars have done extensive research on predicting the wind speed sequence. The traditional prediction measures are universally divided into four kinds: (1) physical method; (2) statistical method; (3) intelligent approach; and (4) hybrid model [3]. The physical method commonly takes advantage of physical data, for example temperature, air density and topographic information [4], which are mainly obtained through numerical weather prediction [5], to get the prediction results. However, the physical methods are not good at short-term wind speed forecasting, and they require plenty of computing time and additional resources [6]. The statistical measures, such as the autoregressive integrated moving average (ARIMA) measure, are built with simple procedures of pattern recognition, parameter estimation and model checking [7]. However, methods of this kind cannot deal with non-linear problems [8]. Owing to their ability to recognize non-linear characteristics, intelligent approaches, for instance artificial neural networks (ANNs) [9][10][11], support vector machine (SVM) [12], the genetic algorithm [13] and the general regression neural network (GRNN) [14], have been utilized to forecast wind speed effectively. Due to this superior ability to recognize non-linear structure, intelligent approaches are better at short-term wind speed forecasting than traditional time-series-based methods. Nevertheless, single intelligent approaches also suffer from certain problems. For example, the genetic algorithm suffers from premature convergence, which limits its ability to find the optimal value. Furthermore, with their capacity to recognize deep characteristics in the data, deep learning approaches, for example the deep convolutional neural network [15] and LSTM [16], have been investigated for wind speed prediction in recent studies.
However, due to the unstable property of wind speed, a single intelligent model may occasionally fall into a local extremum and yield poor forecasting performance. Hence, to fix this problem, hybrid models for wind speed prediction have been put forward. There are four types of hybrid forecasting models [17].
(1) The hybrid methods involving weighting approaches assign a weighting parameter to each single approach based on its forecasting performance and then add the weighted forecasting results together. For example, Shi et al. [18] put forward a weighting-based hybrid approach involving grey relational analysis as well as the distribution characteristics of wind velocity, which integrates the LSSVM (least square support vector machine) and the RBFNN (Radial Basis Function Neural Network). The weighting parameters in the approach are calculated from the data sequences of each month. The results reveal that the suggested combination measure effectively improves very-short-term wind speed forecasting. Xiao et al. [10] utilized the nonnegative constraint theory and hybrid smart approaches to obtain the wind speed prediction, in which the weights of the combined approaches are determined using the chaos particle optimization algorithm as well as the genetic algorithm. The aforementioned hybrid approaches take advantage of the strengths of the single forecasting methods, and thus the forecasting accuracy is significantly improved.
(2) The signal pre-processing measure is executed to obtain a collection of sub-sequences, which are stationary and regular, from the initial non-linear time series. Different decomposition approaches have been used extensively in recent hybrid prediction approaches. For instance, in [19], the raw data series is preprocessed by wavelet transform (WT) before being passed to the SVM forecasting procedure. The outcomes indicate that the suggested approach consisting of WT and SVM is superior to the single SVM approach in prediction accuracy. Fan [20] used a combination measure integrating the empirical mode decomposition (EMD) and SVM, in which the initial wind speed data are processed with EMD to reduce fluctuation. However, the EMD method cannot handle the problem of mode mixing. Therefore, to compensate for this disadvantage of EMD, the ensemble empirical mode decomposition (EEMD) measure is utilized in [21]. Cheng [22] utilized a hybrid model integrating the EEMD approach to forecast wind speed, where the EEMD is applied to extract information from the raw wind speed series. The outcomes reveal that the suggested approach with EEMD shows better prediction performance than the EMD or LSSVM method.
(3) Hybrid models integrating parameter optimization, which apply optimization methods to find the optimal settings of the prediction models during training, have been investigated recently. Chitsaz et al. [23] presented a novel prediction measure based on the Wavelet Neural Network (WNN) with the multi-dimensional Morlet wavelet. A modified clonal selection method is used to find the optimal parameters of the WNN, with the Maximum Correntropy Criterion as the training criterion. The outcomes demonstrate the validity of the suggested method. In [24], a novel short-term hybrid wind speed prediction measure consisting of mutual information, wavelet transform, evolutionary particle swarm optimization (EPSO) and the adaptive neuro-fuzzy inference system (ANFIS) is developed, in which the EPSO is utilized to search for the optimal parameters of ANFIS. The results reveal that the suggested measure has advantages in forecasting accuracy over the comparison models. Yuan et al. [25] presented the gravitational search algorithm (GSA) for finding the best parameters of the LSSVM model. The experimental outcomes show that the suggested LSSVM-GSA combination measure has the highest forecasting accuracy compared with the other models. The Brain Storm Optimization (BSO) approach, which is inspired by the human brainstorming process, is put forward in [26]. The BSO algorithm simulates gathering various experts together to propose potential solutions for the current problem [27]. The individuals are grouped into different teams for collaborative investigation. The various teams are able to explore different areas of the solution space, which raises the probability of finding the best solution and gives BSO an excellent global exploration ability [28]. The significance of BSO has been validated by numerous scholars [29,30].
(4) Unlike the aforementioned hybrid methods, the hybrid models based on the data post-processing technique emphasize using an error correction method to reduce the adverse effect of the forecasting error. For instance, Liang et al. [31] put forward a novel hybrid model in which the raw wind power data sequence is forecasted with the SVM, and the prediction error of the SVM is then forecasted using the SVM together with the ELM. The numerical outcomes demonstrate that the proposed combination measure with error correction effectively improves wind power prediction performance. Jiang et al. [32] put forward a combined structure in which the EEMD is executed to pre-process the wind speed sequences with mean zero, and the chosen sub-layers are forecasted using LSSVM. Then, the LSSVM and the Generalized Auto-Regressive Conditionally Heteroscedastic (GARCH) measure are applied to forecast the error sequences. The outcomes demonstrate that the error correction method contributes to the improvement of forecasting accuracy. Moreover, in [33], the prediction errors of the short-term wind speed series, which are acquired by grey forecasting, are forecasted using the Markov method to correct the wind speed forecast before it is converted into a wind power forecast, and the results show the superiority of the proposed approach in improving forecasting accuracy.
All four types of hybrid forecasting model mentioned above can contribute to the improvement of forecasting performance. In this paper, the signal pre-processing technique, the parameter optimization algorithm and the error correction method are considered. The suggested combination model is built as follows: (1) The EEMD is applied to extract a collection of IMFs (intrinsic mode functions) from the raw wind speed sequence. (2) The LSTM network is used to forecast each IMF. (3) The GRNN is used to predict the error sequence of each IMF. (4) The BSO is executed to optimize the parameter of the GRNN. (5) The ultimate prediction result is obtained by merging the predictions of all IMFs.
The major contribution of this article is to propose a novel multi-step wind speed prediction structure that combines the data pre-processing technique, the parameter optimization algorithm and the error correction method to achieve satisfactory forecasting performance, and to analyze how much each element of the proposed hybrid model contributes to forecasting accuracy. As far as we know, the potential performance of the suggested structure, which integrates three kinds of improvements in one hybrid model, has not been studied in short-term wind speed prediction.
Thus, to investigate the effectiveness of each component, the overall prediction performance and the generalization of the suggested combination measure, eight diverse approaches were applied to forecast two different 5-min wind speed sequences and one 30-min wind speed sequence. Finally, the prediction accuracy of all the approaches involved in this paper was estimated using different evaluation indicators.
The structure of this article is as follows. Section 2 describes the application process of the suggested hybrid measure and the single models required. Section 3 introduces the evaluation criteria for prediction capacity. Section 4 presents two 5-min wind speed forecasting case studies to demonstrate the forecasting capacity of the suggested hybrid measure. Section 5 presents an additional 30-min case to further validate the generalization of the suggested measure. Finally, the conclusions are drawn in Section 6.

The Overall Structure of the Suggested Combination Measure
The structure of the suggested EEMD-LSTM-GRNN-BSO approach is shown in Figure 1. The specific processes are described below:
(1) The EEMD method is executed to extract a collection of IMFs from the wind speed observations. The ratio of the standard deviation of the added noise is set to 0.01 and the ensemble number for the EEMD is set to 100.
The BSO algorithm is executed to search for the optimal smooth factor of the GRNN for further prediction accuracy improvement: the smooth factor is treated as the variable to be optimized, and the mean absolute error (MAE) between the predictions and the observations is taken as the fitness function of BSO. Each candidate value of the smooth factor in the search space is passed to the GRNN to obtain the predictions and the corresponding fitness, until the optimal value is found. Sections 2.4 and 2.5 describe the details of GRNN and BSO, respectively.
(5) The LSTM network and the GRNN model optimized by BSO are combined to construct the proposed hybrid forecasting measure. The suggested combination measure is validated on the test set to obtain the prediction and the error prediction of each IMF. The overall prediction for each IMF is obtained with the equation below:
$$P^{corrected}_{IMF_i} = P_{IMF_i} + P_{ERR_i}$$
where $i$ indexes the IMFs determined by the EEMD method, $P^{corrected}_{IMF_i}$ stands for the corrected prediction of the $i$th IMF, $P_{IMF_i}$ represents the original prediction of the $i$th IMF forecasted by the LSTM network, and $P_{ERR_i}$ stands for the error prediction of the $i$th IMF forecasted by the optimized GRNN. The final predicted wind speed sequence is obtained by summing the corrected predictions of all IMFs.
(6) To test and verify the wind speed prediction performance of the suggested combination EEMD-LSTM-GRNN-BSO approach, seven other prediction methods were used as comparisons.
The comparison models involved in this study are the ARIMA measure, the BP network, the GRNN measure, the LSTM measure, the LSTM-GRNN-BSO measure, the EEMD-LSTM measure, and the EEMD-LSTM-GRNN measure. Comparisons between these models were also used to reveal how much each component contributes to forecasting accuracy.
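The combination in step (5) can be illustrated with a minimal Python sketch. The function names and the toy numbers are hypothetical; the sketch only assumes that each IMF has an LSTM prediction series and a GRNN error prediction series of equal length, and that the error is defined as observation minus prediction.

```python
def correct_imf_forecast(imf_pred, err_pred):
    """Add the predicted error to the LSTM prediction of one IMF."""
    return [p + e for p, e in zip(imf_pred, err_pred)]


def merge_forecasts(imf_preds, err_preds):
    """Sum the corrected predictions of all IMFs into the final forecast."""
    corrected = [correct_imf_forecast(p, e) for p, e in zip(imf_preds, err_preds)]
    return [sum(values) for values in zip(*corrected)]


# Toy example with two IMFs and three forecast points (made-up numbers).
imf_preds = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]
err_preds = [[0.1, -0.1, 0.0], [0.0, 0.2, -0.2]]
final = merge_forecasts(imf_preds, err_preds)
```

Because the correction is applied per IMF before merging, each GRNN only has to model the error of one comparatively regular sub-series.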

Ensemble Empirical Mode Decomposition
As a valid data series processing measure, empirical mode decomposition (EMD) can extract the feature information from the raw data series [34]. With the EMD approach, a collection of intrinsic mode functions (IMFs) is acquired. Following the EMD measure, the ensemble empirical mode decomposition (EEMD) was developed to handle the mode mixing issue, which cannot be solved by the EMD approach. The main process of the EEMD measure [22] is described as follows:
(1) Create a new data series $y(t)$ by adding white noise to the raw data series $x(t)$.
(2) Identify all the local extrema of the data series $y(t)$.
(3) Construct the upper envelope $e_u(t)$ and the lower envelope $e_l(t)$ of $y(t)$.
(4) Generate the average value $m(t)$ of the upper envelope and the lower envelope:
$$m(t) = \frac{e_u(t) + e_l(t)}{2}$$
(5) Calculate the difference between the data series $y(t)$ and $m(t)$ as the first component $h(t)$:
$$h(t) = y(t) - m(t)$$
(6) Iterate the sifting procedure. The iterative process is repeated $k$ times until $h(t)$ is an IMF. After that, the first IMF component $c_1$ and the residue $r_1$ are obtained as follows:
$$c_1(t) = h_k(t), \qquad r_1(t) = y(t) - c_1(t)$$
The residue $r_1$ is considered as a new series, and Steps (2)-(6) are repeated to get all $c_j$ and a final residue $r_n$. Finally, by adding up all the IMFs and the residue obtained, the following is acquired:
$$y(t) = \sum_{j=1}^{n} c_j(t) + r_n(t)$$
EEMD is regarded as a noise-assisted data analysis approach that mixes white noise into the raw series, and it is useful for mitigating the problem caused by mode mixing.
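The noise-assisted ensemble idea can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: `decompose` is a hypothetical stand-in for the sifting procedure in Steps (2)-(6), and the toy decomposer used here (a moving-average split) only keeps the example self-contained. The noise ratio of 0.01 and the ensemble number of 100 follow the settings reported for the suggested model. In standard EEMD, the components obtained from each noisy trial are averaged over the ensemble.

```python
import random
from statistics import pstdev


def moving_average_split(series, window=5):
    """Toy stand-in for EMD sifting: returns [detail, trend] (NOT real IMFs)."""
    half = window // 2
    trend = []
    for k in range(len(series)):
        chunk = series[max(0, k - half):k + half + 1]
        trend.append(sum(chunk) / len(chunk))
    detail = [x - t for x, t in zip(series, trend)]
    return [detail, trend]


def eemd(series, decompose, noise_ratio=0.01, trials=100, seed=0):
    """Average the components obtained from `trials` noise-added copies.

    Assumes `decompose` returns the same number of components every trial,
    which real EMD does not guarantee.
    """
    rng = random.Random(seed)
    noise_std = noise_ratio * pstdev(series)
    sums = None
    for _ in range(trials):
        noisy = [x + rng.gauss(0.0, noise_std) for x in series]
        components = decompose(noisy)
        if sums is None:
            sums = [[0.0] * len(series) for _ in components]
        for acc, comp in zip(sums, components):
            for k, value in enumerate(comp):
                acc[k] += value
    return [[v / trials for v in acc] for acc in sums]


# Synthetic series: a slow ramp plus a fast alternating term (made-up data).
wind = [5.0 + 0.1 * k + (1.0 if k % 2 else -1.0) for k in range(50)]
components = eemd(wind, moving_average_split)
reconstructed = [sum(vals) for vals in zip(*components)]
```

Since the added noise has zero mean, summing the averaged components approximately recovers the original series, with the residual noise shrinking as the ensemble number grows.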

Long Short Term Memory Measure
As an improvement on the Recurrent Neural Network (RNN), the LSTM measure was put forward by Hochreiter and Schmidhuber [35] in 1997. The key parts of the LSTM network are its memory cells, which distinguish it from the traditional RNN. Graves and Schmidhuber [36] explained that three types of multiplicative units exist in the memory cells of the LSTM model: the input gate, the output gate and the forget gate. These gates change the state of the memory cells as follows [37]: (a) when the input gate is activated, the incoming information is accumulated into the cell as new data enter; (b) when the forget gate is activated, the former cell state is discarded; and (c) the output gate decides whether the latest cell output is propagated to the final state.
In terms of short-term wind speed forecasting, $x = (x_1, x_2, \cdots, x_T)$ is the historical wind speed series and $y = (y_1, y_2, \cdots, y_T)$ is the forecast series. The prediction of the wind speed sequence is computed with the gate equations of [38], in which $i_t$ represents the input gate, $f_t$ represents the forget gate, $c_t$ represents the activation vector of every cell, $o_t$ stands for the output gate, $m_t$ stands for the activation vector of every memory block, $W$ represents the weight matrices, $b$ represents the bias vectors and the symbol "$\circ$" stands for the element-wise product.
$\sigma(\cdot)$ stands for the standard logistic function, $\sigma(x) = \frac{1}{1 + e^{-x}}$, while $g(\cdot)$ and $h(\cdot)$ stand for centered logistic functions.
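The gate updates described above can be sketched for a single memory cell in Python. This is a minimal sketch under stated assumptions rather than the exact formulation of [38]: the state is scalar, peephole connections are omitted, the weight names in `w` are hypothetical, and the centered logistic functions are assumed to be scaled to the ranges (-2, 2) for `g` and (-1, 1) for `h`.

```python
import math


def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def g(x):
    # Centered logistic function with range (-2, 2); this scaling is an assumption.
    return 4.0 * sigmoid(x) - 2.0


def h(x):
    # Centered logistic function with range (-1, 1); this scaling is an assumption.
    return 2.0 * sigmoid(x) - 1.0


def lstm_step(x_t, m_prev, c_prev, w):
    """One update of a scalar memory cell; `w` maps hypothetical weight names to floats."""
    i_t = sigmoid(w["ix"] * x_t + w["im"] * m_prev + w["bi"])   # input gate
    f_t = sigmoid(w["fx"] * x_t + w["fm"] * m_prev + w["bf"])   # forget gate
    c_t = f_t * c_prev + i_t * g(w["cx"] * x_t + w["cm"] * m_prev + w["bc"])
    o_t = sigmoid(w["ox"] * x_t + w["om"] * m_prev + w["bo"])   # output gate
    m_t = o_t * h(c_t)                                          # cell output
    return m_t, c_t


names = ["ix", "im", "bi", "fx", "fm", "bf", "cx", "cm", "bc", "ox", "om", "bo"]
zero_weights = {name: 0.0 for name in names}
m_t, c_t = lstm_step(x_t=3.2, m_prev=0.0, c_prev=2.0, w=zero_weights)
```

With all weights at zero, every gate equals 0.5, so half of the previous cell state is kept and nothing new is written, which makes the roles of the forget and input gates easy to see.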

General Regression Neural Network
Specht put forward the GRNN method in 1991 [39]. The GRNN has many advantages, such as strong non-linear mapping ability, a flexible network framework and satisfactory robustness, which make it a good choice for non-linear problems. Although the GRNN has a similar structure to the RBFNN, its approximation ability and learning speed are better. The structure of GRNN includes the input layer, the pattern layer, the summation layer and the output layer. The framework of GRNN is described in Figure 1C. The input for GRNN is $X = [x_1, x_2, \ldots, x_m]^T$ and the output is $Y = [y_1, y_2, \ldots, y_k]^T$. The following descriptions explain the detailed process of GRNN:
(1) Input layer: The number of neurons equals the dimension of the input data of the training set. Every neuron is a simple distribution unit that delivers the input information directly to the pattern layer.
(2) Pattern layer: The number of neurons equals the number of training samples $n$. Every neuron corresponds to a different sample. The transfer function of neuron $i$ in the pattern layer, $p_i$, is calculated as:
$$p_i = \exp\left[-\frac{(X - X_i)^T (X - X_i)}{2\sigma^2}\right], \quad i = 1, 2, \ldots, n$$
where $X$ stands for the input variables of the model and $X_i$ stands for the training sample of neuron $i$. The width of the Gaussian function is controlled by the smooth factor $\sigma$.
(3) Summation layer: The summation is calculated with two kinds of neurons. The first kind sums the outputs of all neurons in the pattern layer to obtain $S_D$. The weight connecting each pattern neuron to this neuron is equal to one, and the transfer formula is:
$$S_D = \sum_{i=1}^{n} p_i$$
The second kind computes a weighted sum $S_{Nj}$ of all the neurons in the pattern layer. The $j$th component of the $i$th output sample $Y_i$, denoted $y_{ij}$, defines the weight connecting the $i$th neuron in the pattern layer and the $j$th numerator neuron in the summation layer. The corresponding transfer formula is:
$$S_{Nj} = \sum_{i=1}^{n} y_{ij}\, p_i, \quad j = 1, 2, \ldots, k$$
(4) Output layer: The number of neurons equals the dimension of the output vector of the samples. The output of neuron $j$ is the $j$th component of the calculated outcome $Y(X)$:
$$y_j = \frac{S_{Nj}}{S_D}, \quad j = 1, 2, \ldots, k$$
When the parameter $\sigma$ takes a large value, $Y(X)$ approaches the average of the dependent variables of all samples. Conversely, when $\sigma$ approaches zero, $Y(X)$ stays close to the training samples: if the predicted point is part of the training set, the prediction is very close to the corresponding dependent variable of the training set, but a point outside the training set may be predicted poorly, which leads to unsatisfactory forecasting performance and generalization. When the smooth factor $\sigma$ takes a proper value, the calculation of $Y(X)$ includes the dependent variables of all training data, and the samples closer to the forecasting point are assigned larger weights. Thus, given the significant influence of the smooth factor $\sigma$ on the forecasting performance of GRNN, the BSO is utilized to search for the optimal value of $\sigma$ during the training process.
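The four layers and the effect of the smooth factor can be illustrated with a minimal Python sketch for a scalar output. The function name and the toy training data are hypothetical and serve only as an illustration.

```python
import math


def grnn_predict(x, train_x, train_y, sigma):
    """Predict a scalar output for input vector `x` with smooth factor `sigma`."""
    weights = []
    for xi in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))  # pattern layer
    s_d = sum(weights)                                   # denominator neuron
    s_n = sum(w * y for w, y in zip(weights, train_y))   # numerator neuron
    return s_n / s_d                                     # output layer


# Made-up training data with one input dimension.
train_x = [[0.0], [1.0], [2.0]]
train_y = [1.0, 3.0, 8.0]
smooth = grnn_predict([1.0], train_x, train_y, sigma=100.0)  # near the mean of train_y
sharp = grnn_predict([1.0], train_x, train_y, sigma=0.1)     # near the stored target 3.0
```

A very large smooth factor flattens all pattern-layer weights, so the output drifts toward the sample mean, while a very small one reproduces the nearest training target. With a tiny smooth factor and a query far from every training point, all weights can underflow to zero, so a practical implementation would need to guard the division.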

Brain Storm Optimization
The BSO [40,41] is a population-based algorithm that mimics brainstorming meetings conducted by people. In BSO, the population can be considered as a set of ideas, and each idea stands for a solution to the problem. In every iteration, the population of ideas (solutions) is renewed. At first, the ideas are allocated to the search space randomly. Every single idea $idea_i$ is renewed by the following steps.

• Firstly, k-means clustering is utilized to identify similar solutions, and the best idea of each cluster is marked as the cluster center.
• Secondly, BSO creates a novel idea $nidea_i$ by setting it equal to one of the options below:
  - A probabilistically chosen cluster center
  - A randomly chosen idea from a probabilistically selected cluster
  - The stochastic integration of two probabilistically chosen cluster centers
  - The stochastic integration of two randomly chosen ideas from two probabilistically chosen clusters
  One of the options is randomly chosen according to several parameters, $p_{one-cluster}$, $p_{one-center}$ and $p_{two-centers}$. Besides, a cluster is probabilistically chosen based on its scale, which reflects the number of ideas in the cluster.
• Thirdly, the created $nidea_i$ is perturbed using a step-size parameter $\xi$ and a Gaussian distribution.
• Finally, $nidea_i$ replaces the current $idea_i$ if its fitness is better. If not, it is abandoned.

The main steps of the BSO algorithm [42] are described in Figure 3, where $n$ represents the population scale, $m$ represents the number of clusters, and $N(0, 1)$ stands for a normal distribution with mean 0 and standard deviation 1. $\xi$ represents a dynamically updated step size and $k$ alters the slope of the logsig function. As a special feature of BSO, letting diverse groups explore a wide area of the solution space helps the algorithm avoid local extrema and increases the probability of finding the optimal value, which makes BSO a good choice for optimizing the smooth factor of the GRNN.
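The update steps above can be sketched in Python for a one-dimensional search variable such as the smooth factor. This is a simplified BSO-style sketch under stated assumptions, not the procedure of Figure 3: the grouping sorts the ideas and cuts them into contiguous groups instead of running k-means, the default probabilities and the slope `k` are illustrative values, and the toy fitness stands in for the MAE of a trained GRNN.

```python
import math
import random


def logsig(x):
    return 1.0 / (1.0 + math.exp(-x))


def bso_minimize(fitness, low, high, n=20, m=4, iters=100, k=20.0,
                 p_one_cluster=0.8, p_one_center=0.4, p_two_centers=0.5, seed=0):
    """Simplified 1-D BSO-style search (requires n >= m); returns (idea, fitness)."""
    rng = random.Random(seed)
    ideas = [rng.uniform(low, high) for _ in range(n)]
    scores = [fitness(x) for x in ideas]
    for t in range(iters):
        # Grouping: sort ideas and cut them into m contiguous groups
        # (a simplification standing in for k-means clustering).
        order = sorted(range(n), key=lambda idx: ideas[idx])
        clusters = [order[j * n // m:(j + 1) * n // m] for j in range(m)]
        centers = [min(c, key=lambda idx: scores[idx]) for c in clusters]
        sizes = [len(c) for c in clusters]

        def pick_cluster():
            # Larger clusters are chosen with higher probability.
            return rng.choices(range(m), weights=sizes)[0]

        for i in range(n):
            if rng.random() < p_one_cluster:
                c = pick_cluster()
                if rng.random() < p_one_center:
                    base = ideas[centers[c]]
                else:
                    base = ideas[rng.choice(clusters[c])]
            else:
                c1, c2 = pick_cluster(), pick_cluster()
                if rng.random() < p_two_centers:
                    a, b = ideas[centers[c1]], ideas[centers[c2]]
                else:
                    a, b = ideas[rng.choice(clusters[c1])], ideas[rng.choice(clusters[c2])]
                w = rng.random()
                base = w * a + (1.0 - w) * b
            # Step size shrinks over the iterations through the logsig schedule.
            xi = logsig((0.5 * iters - t) / k) * rng.random()
            candidate = min(high, max(low, base + xi * rng.gauss(0.0, 1.0)))
            score = fitness(candidate)
            if score < scores[i]:  # keep the new idea only if it is better
                ideas[i], scores[i] = candidate, score
    best = min(range(n), key=lambda idx: scores[idx])
    return ideas[best], scores[best]


# Toy fitness with its minimum at 0.3, standing in for the GRNN validation MAE.
best_sigma, best_mae = bso_minimize(lambda s: abs(s - 0.3), low=0.0, high=1.0)
```

In the suggested model, the fitness call would train or evaluate the GRNN with the candidate smooth factor and return the resulting MAE; the greedy replacement rule guarantees that the best fitness never gets worse from one iteration to the next.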

Evaluation Criteria for Prediction Capacity
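To investigate the forecasting capacity of the suggested combination forecasting model, three widely used evaluation indexes were applied: mean absolute error (MAE), root mean square error (RMSE), and mean absolute percent error (MAPE). The indicators are defined as follows:
$$MAE = \frac{1}{T} \sum_{t=1}^{T} \left| p_{true}(t) - p_{forecast}(t) \right|$$
$$RMSE = \sqrt{\frac{1}{T} \sum_{t=1}^{T} \left( p_{true}(t) - p_{forecast}(t) \right)^2}$$
$$MAPE = \frac{1}{T} \sum_{t=1}^{T} \left| \frac{p_{true}(t) - p_{forecast}(t)}{p_{true}(t)} \right| \times 100\%$$
where $p_{true}(t)$ represents the actual observation at moment $t$, $p_{forecast}(t)$ represents the predicted value for the corresponding moment, and $T$ is the number of predicted points. Moreover, to analyze the gain in forecasting capacity of the suggested measure, the percentage improvements of MAE, MAPE and RMSE, represented by $P_{MAE}$, $P_{MAPE}$ and $P_{RMSE}$, respectively, were also used in this study. These evaluation indexes can be defined as follows:
$$P_{MAE} = \frac{MAE_c - MAE_p}{MAE_c} \times 100\%, \quad P_{MAPE} = \frac{MAPE_c - MAPE_p}{MAPE_c} \times 100\%, \quad P_{RMSE} = \frac{RMSE_c - RMSE_p}{RMSE_c} \times 100\%$$
where the subscript $c$ denotes a comparison model and the subscript $p$ denotes the suggested measure.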
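The indicators can be computed with a few lines of Python. This is a minimal sketch with hypothetical function names and made-up numbers; the percentage improvement is assumed to be the relative reduction of an error indicator with respect to a comparison model.

```python
import math


def mae(actual, forecast):
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mape(actual, forecast):
    # Undefined when an observation is exactly zero.
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def improvement(comparison_error, proposed_error):
    """Percentage reduction of an error indicator relative to a comparison model."""
    return 100.0 * (comparison_error - proposed_error) / comparison_error


# Made-up observations and forecasts in m/s.
actual = [4.0, 5.0, 8.0]
forecast = [5.0, 5.0, 6.0]
```

Here `mae(actual, forecast)` averages the absolute errors 1, 0 and 2, and `improvement(2.0, 1.0)` reports a 50% reduction when the error is halved.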

Experiments
To validate the prediction capacity of the EEMD-LSTM-GRNN-BSO method, the suggested combination approach and seven comparison methods were applied to the two different wind speed datasets. The comparison approaches were the ARIMA method, the BP method, the GRNN method, the LSTM method, the LSTM-GRNN-BSO measure, the EEMD-LSTM measure, and the EEMD-LSTM-GRNN measure. The actual wind speed sequences and the forecasting values of all involved approaches are presented in Figures 6 and 7. The evaluation indicators of the involved wind speed forecasting approaches are given in Tables 1 and 2.

Comparison and Analysis
As shown in the above two tables, the evaluation indicators of the two wind speed forecasting cases demonstrate the same trend. Tables 3 and 4 provide the percentage improvements in the three evaluation indicators of the suggested combination EEMD-LSTM-GRNN-BSO approach on the two datasets in comparison with the other measures. From the results in Tables 1-4 and Figures 6 and 7, several observations can be made. Take Wind Speed Sequence I as an example.

Additional Prediction Case
To further study the generalization capacity of the suggested hybrid measure, the proposed EEMD-LSTM-GRNN-BSO hybrid model was applied to an additional case with a 30-min interval: Wind Speed Sequence III. The actual data of Wind Speed Sequence III, taken from 1 October 2018 to 10 November 2018, are shown in Figure 9. The additional experiment was conducted with the same procedure as the aforementioned experiments, and the forecasting results are shown in Figure 10. Table 6 gives the multi-step evaluation indicators of all the models involved. Table 7 gives the multi-step percentage improvements in the three evaluation indices of the suggested EEMD-LSTM-GRNN-BSO approach compared with the other models on Wind Speed Sequence III. Tables 6 and 7 show that the evaluation indicators on the additional prediction case follow the same basic behavior as in the two aforementioned forecasting cases in Section 4. Again, the suggested combination EEMD-LSTM-GRNN-BSO approach demonstrated the highest forecasting accuracy among all the models. This additional case confirms the generalization and validity of the suggested combination model on wind speed series with a longer time interval.

Finally, 11 IMFs are obtained utilizing EEMD. The process of the EEMD measure is shown in Section 2.2.
(2) The IMFs are classified into two training sets. The input matrices and output matrices are formed with the data in each set based on the procedures described in Figure 2 to train the forecasting models.
(3) The LSTM network is trained with the data in Training Set 1 to predict each IMF; the trained LSTM networks are tested with the data in Training Set 2; the forecasting error series are obtained as the difference between the observations and the predictions of Training Set 2. The procedures of the LSTM models are described in Section 2.3.
(4) The GRNN approach is trained with the error sequence of Training Set 2 to model the prediction errors of the LSTM network.

Figure 2. The structure of the multi-step forecasting strategy.

Figure 3. The main procedure of BSO.

Figures 4 and 5 show two different datasets with a time interval of 5 min, collected from 1 January 2018 to 7 January 2018 and from 1 May 2018 to 7 May 2018 at different wind power plants in Zhangjiakou, Hebei, China. Training Set 1, comprising samples 1 to 800 of each sequence, was used to train the LSTM network; Training Set 2, comprising samples 801 to 1600 of each sequence, was used to create the error series and train the GRNN model, which was optimized by BSO. Samples 1601-2000 of each sequence were used to test and estimate the prediction capacity of the models mentioned in this study.
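The sample ranges above translate directly into index slices. The following Python sketch shows the split together with one possible way of forming input and output matrices for multi-step forecasting; the windowing scheme (a direct strategy with hypothetical parameters `n_lags` and `horizon`) is an assumption and may differ from the strategy of Figure 2.

```python
def split_sequence(series):
    """Split a 2000-sample sequence into Training Set 1, Training Set 2 and the test set."""
    return series[:800], series[800:1600], series[1600:2000]


def make_windows(series, n_lags, horizon):
    """Build rows of `n_lags` past values with the value `horizon` steps ahead as target."""
    inputs, targets = [], []
    for start in range(len(series) - n_lags - horizon + 1):
        inputs.append(series[start:start + n_lags])
        targets.append(series[start + n_lags + horizon - 1])
    return inputs, targets


sequence = [float(k) for k in range(2000)]  # placeholder data, not real wind speeds
train1, train2, test = split_sequence(sequence)
x1, y1 = make_windows(train1, n_lags=6, horizon=3)
```

Keeping the three ranges disjoint matters for the error correction: the GRNN is trained on errors the LSTM makes on data it has not seen, which is closer to the errors it will make on the test set.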

Figure 4. The observations for Wind Speed Sequence I.

Figure 5. The observations for Wind Speed Sequence II.

Figure 6. The comparisons between the observations and the predictions for Wind Speed Sequence I.

Figure 7. The comparisons between the observations and the predictions for Wind Speed Sequence II.

Figure 8. The error statistical estimation for one-step forecast of Wind Speed Sequence I.

Figure 9. The observation values of Wind Speed Sequence III.

Figure 10. The comparisons between the observations and the predictions for Wind Speed Sequence III.

Table 1. The multi-step calculation results for evaluation indicators of involved approaches on Wind Speed Sequence I.

Table 2. The multi-step calculation results for evaluation indicators of involved approaches on Wind Speed Sequence II.

Table 3. The multi-step percentage improvements of the suggested EEMD-LSTM-GRNN-BSO approach in comparison with the other measures on Wind Speed Sequence I.

Table 4. The multi-step percentage improvements of the suggested EEMD-LSTM-GRNN-BSO approach in comparison with the other measures on Wind Speed Sequence II.

Table 5. The accuracy improvements of data preprocessing, error correction and parameter optimization in 1-5-step prediction for Wind Speed Sequence I.

Table 6. The multi-step calculation results for evaluation indicators of involved approaches on Wind Speed Sequence III.

Table 7. The multi-step percentage improvements of the suggested EEMD-LSTM-GRNN-BSO approach in comparison with the other measures on Wind Speed Sequence III.