A Hybrid Method Based on Singular Spectrum Analysis, Firefly Algorithm, and BP Neural Network for Short-Term Wind Speed Forecasting

Abstract: With increasing importance being attached to big data mining, analysis, and forecasting in the field of wind energy, selecting an optimization model that improves the forecasting accuracy of wind speed time series is both an extremely challenging problem and a matter of concern for economic forecasting. Artificial intelligence models are widely used in forecasting and data processing, but an individual back-propagation (BP) artificial neural network cannot always satisfy time series forecasting needs. This study therefore proposes a hybrid forecasting approach that combines data preprocessing, parameter optimization, and a neural network to improve the accuracy of short-term wind speed forecasting. In a case study using data collected from Peng Lai, a city in China, the simulation results indicate that the hybrid method yields better predictions than the individual BP network and thus exhibits stronger forecasting ability.


Introduction
Wind energy has been a fast-growing energy resource because it is renewable, pollution free, and abundant. Currently, with the development of the economy, many countries are facing a severe energy crisis, so the need to explore and use various sources of energy must be emphasized. Wind energy is one of the major renewable energy sources, but the fluctuation of the wind speed makes it a great challenge for the reliability and accuracy of power systems. Prediction has therefore become a central theme in planning in today's competitive environment, and wind speed prediction is necessary to exploit the advantages of wind power [1]. Utilizing appropriate wind speed data, power system operators are able to predict the theoretical power output, which helps in system planning, scheduling, and storage capacity optimization. Thus, wind speed prediction plays an important role in actual decisions. In order to increase the accuracy of wind speed prediction, a hybrid prediction model is proposed in this paper.
... are detected. Note that the FA is population-based. Population-based algorithms have the following advantages when compared to single-point search algorithms [12]:
(a) Building blocks are put together from different solutions through crossover.
(b) Focusing a search again relies on crossover and means that, if both parents share the same value of a variable, then the offspring will also have the same value of this variable.
(c) Low-pass filtering ignores distractions within the landscape.
(d) Hedging guards against bad luck in the initial positions or in the decisions the algorithm makes.
(e) Parameter tuning is the algorithm's opportunity to learn good parameter values in order to balance exploration against exploitation.
Furthermore, for most time series data, the noise components need to be considered. Apart from the time series modeling approach itself, extracting the main features of the time series and removing noise and unpredictable components in a pre-processing stage can remarkably enhance the prediction performance and accuracy [13]. In fact, filtering the noisy and almost unpredictable components of nonlinear and chaotic time series with data processing techniques often leads to a series that is less complex and more predictable [14]. Here, we employ singular spectrum analysis (SSA) for de-noising. SSA works well for linear and nonlinear, stationary and non-stationary time series with different features and structures. It can efficiently identify and extract the trend and noise components of a time series [15] and then reconstruct a new series by eliminating the noise components, which improves prediction performance. Because of this property, SSA is usually employed for time series filtering in the pre-processing stage.
The rest of the paper is structured as follows. The methodologies of the individual models involved in the hybrid model are described in Section 2. Section 3 introduces in detail the hybrid model built from these component models. A case study is then put forward to verify the performance of the hybrid model and its component models, and the data structure is described in Section 4. Section 5 focuses on the simulation analysis, in which the results of the three experiments are displayed and the proposed models are analyzed and compared. Finally, Section 6 presents the paper's conclusions.

Methodology
In this paper, the proposed hybrid model integrates three components: singular spectrum analysis, the firefly algorithm, and the BP neural network.

SSA Algorithm and Methodology
This section introduces the aspects of SSA that are vital for understanding its implementation and the ways it is used for the analysis of real-life data [16]. One of the basic tasks of SSA is to decompose the observed time series into a sum of interpretable components with no a priori information about the time series structure. The formal description of the algorithm follows.
Definition 1. Consider a real-valued time series $X_N = (x_1, \ldots, x_N)$ of length $N$. Let $L$ ($1 < L < N$) be some integer called the window length, and let $K = N - L + 1$.

First Stage: Decomposition
The decomposition process consists of two steps.
1st step: Embedding.
To perform the embedding, we map the original time series into a sequence of lagged vectors of size $L$ by forming $K = N - L + 1$ lagged vectors $X_i = (x_i, \ldots, x_{i+L-1})^T$, $i = 1, \ldots, K$.
Definition 2. The trajectory matrix of the series $X_N$ is:

$$X = [X_1 : \cdots : X_K] = (x_{ij})_{i,j=1}^{L,K} = \begin{pmatrix} x_1 & x_2 & \cdots & x_K \\ x_2 & x_3 & \cdots & x_{K+1} \\ \vdots & \vdots & \ddots & \vdots \\ x_L & x_{L+1} & \cdots & x_N \end{pmatrix} \tag{1}$$

There are two important properties of the trajectory matrix, namely:
(a) Both the rows and columns of $X$ are subseries of the original series.
(b) $X$ has equal elements on its anti-diagonals, and therefore the trajectory matrix is Hankel.
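The embedding step can be illustrated with a short NumPy sketch. It is a minimal illustration under the definitions above, not the authors' implementation; the function name `trajectory_matrix` and the toy series are assumptions introduced here.

```python
import numpy as np

def trajectory_matrix(x, L):
    """Build the L x K trajectory matrix of a series, with K = N - L + 1."""
    x = np.asarray(x, dtype=float)
    N = x.size
    if not 1 < L < N:
        raise ValueError("window length L must satisfy 1 < L < N")
    K = N - L + 1
    # Column i holds the lagged vector (x_i, ..., x_{i+L-1}).
    return np.column_stack([x[i:i + L] for i in range(K)])

series = np.arange(1.0, 8.0)          # toy series 1, 2, ..., 7 (N = 7)
X = trajectory_matrix(series, L=3)    # 3 x 5 matrix
print(X)
```

Each entry satisfies `X[i, j] == series[i + j]` (0-based), which is the Hankel property (b): entries are constant along anti-diagonals.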
2nd step: Decomposition.
Definition 3. Let $\{P_i\}_{i=1}^{L}$ be an orthonormal basis in $\mathbb{R}^L$. Consider the following decomposition of the trajectory matrix:

$$X = \sum_{i=1}^{L} P_i Q_i^T = X_1 + \cdots + X_L, \tag{2}$$

where $X_i = P_i Q_i^T$ and $Q_i = X^T P_i$, and define $\lambda_i = \|X_i\|_F^2 = \|Q_i\|^2$. We consider two choices of the basis $\{P_i\}_{i=1}^{L}$:
(A) Basic: $\{P_i\}_{i=1}^{L}$ are eigenvectors of $XX^T$;
(B) Toeplitz: $\{P_i\}_{i=1}^{L}$ are eigenvectors of the matrix $C$ whose entries are:

$$c_{ij} = \frac{1}{N - |i-j|} \sum_{n=1}^{N-|i-j|} x_n x_{n+|i-j|}, \quad 1 \le i, j \le L.$$

In both cases the eigenvectors are ordered so that the corresponding eigenvalues are in decreasing order.
Let us remark that Case A corresponds to the Singular Value Decomposition (SVD) of $X$, that is, $X = \sum_i \sqrt{\lambda_i} U_i V_i^T$, where $P_i = U_i$ are the left singular vectors of $X$, $Q_i = \sqrt{\lambda_i} V_i$, the $V_i$ are called factor vectors or right singular vectors, and the $\lambda_i$ are the eigenvalues of $XX^T$; therefore, $\lambda_1 \ge \cdots \ge \lambda_L \ge 0$. Note also that Case B is suitable only for the analysis of stationary time series with zero mean [17].
In the SSA literature, (A) is also called the BK version, while (B) is called the VG version. In Case A, the triple $(\sqrt{\lambda_i}, U_i, V_i)$ is called the $i$th eigentriple.

Second Stage: Reconstruction
The reconstruction process can be separated into two steps.
1st step: Eigentriple grouping.
Definition 1. Let $d = \max\{j : \lambda_j \neq 0\}$. Once the expansion (2) is obtained, the grouping procedure partitions the set of indices $\{1, \ldots, d\}$ into $m$ disjoint subsets $I_1, \ldots, I_m$. Define $X_I = \sum_{i \in I} X_i$. The expansion (2) leads to the decomposition:

$$X = X_{I_1} + \cdots + X_{I_m}. \tag{3}$$

The procedure of choosing the sets $I_1, \ldots, I_m$ is called eigentriple grouping. If $m = d$ and $I_j = \{j\}$, $j = 1, \ldots, d$, then the corresponding grouping is called elementary. The choice of several leading eigentriples for Case A corresponds to the approximation of the time series in view of the well-known optimality property of the SVD.
2nd step: Diagonal averaging.
Definition 2. At this step, we transform each matrix $X_{I_j}$ of the grouped decomposition (3) into a new series of length $N$. Let $Y$ be an $L \times K$ matrix with elements $y_{ij}$, $1 \le i \le L$, $1 \le j \le K$, and for simplicity let $L \le K$. By diagonal averaging, we transfer the matrix $Y$ into the series $(\tilde{y}_1, \ldots, \tilde{y}_N)$ using the formula:

$$\tilde{y}_s = \frac{1}{|A_s|} \sum_{(l,k) \in A_s} y_{lk}, \tag{4}$$

where $A_s = \{(l,k) : l + k = s + 1,\ 1 \le l \le L,\ 1 \le k \le K\}$ and $|A_s|$ denotes the number of elements in the set $A_s$. This corresponds to averaging the matrix elements over the "antidiagonals".
Diagonal averaging (4) applied to a resultant matrix $X_{I_k}$ produces a reconstructed series $\tilde{X}^{(k)} = (\tilde{x}_1^{(k)}, \ldots, \tilde{x}_N^{(k)})$. Therefore, the initial series $(x_1, \ldots, x_N)$ is decomposed into a sum of $m$ reconstructed series:

$$x_n = \sum_{k=1}^{m} \tilde{x}_n^{(k)}, \quad n = 1, \ldots, N. \tag{5}$$

The reconstructed series produced by the elementary grouping will be called elementary reconstructed series.
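The two stages can be put together in a compact NumPy sketch of Basic SSA (Case A). It is a hedged illustration rather than the paper's code: the function names, the synthetic noisy sine, the window length `L=40`, and the choice to keep the two leading eigentriples as the de-noised series are all assumptions made for the example.

```python
import numpy as np

def diagonal_average(Y):
    """Average an L x K matrix over its antidiagonals (series of length L + K - 1)."""
    L, K = Y.shape
    flipped = Y[::-1, :]  # antidiagonals of Y become ordinary diagonals
    return np.array([flipped.diagonal(s - (L - 1)).mean()
                     for s in range(L + K - 1)])

def ssa_reconstruct(x, L, groups):
    """Return one reconstructed series per index group (0-based eigentriple indices)."""
    x = np.asarray(x, dtype=float)
    K = x.size - L + 1
    X = np.column_stack([x[i:i + L] for i in range(K)])   # embedding
    U, s, Vt = np.linalg.svd(X, full_matrices=False)      # Case A: SVD of X
    series = []
    for I in groups:
        I = list(I)
        X_I = (U[:, I] * s[I]) @ Vt[I, :]                 # grouped matrix X_I
        series.append(diagonal_average(X_I))
    return np.array(series)

# Assumed toy data: a sine wave with additive Gaussian noise.
rng = np.random.default_rng(0)
t = np.arange(200)
clean = np.sin(2 * np.pi * t / 20)
noisy = clean + 0.3 * rng.standard_normal(t.size)

# Keep the two leading eigentriples as the de-noised series; discard the rest.
denoised = ssa_reconstruct(noisy, L=40, groups=[[0, 1]])[0]
```

With the elementary grouping (`groups=[[i] for i in range(40)]`), the reconstructed series sum back to the input series, which is the identity stated in the final decomposition of the initial series.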

Firefly Algorithm
The firefly algorithm (FA) is a popular optimization algorithm with its basis in biology. In recent years, a growing number of comprehensive studies have appeared, and its treatment has become more thorough.

Biological Foundations
Fireflies are high-population insects, and their spectacular courtship scenes have inspired poets and scientists alike [18]. At present, there exist more than 2000 species around the world. According to several studies, the habitats of fireflies are usually warm, which implies that the swarms are more active during summer nights. Many researchers have studied firefly phenomena in nature, and many papers have been published, for example [19][20][21][22][23]. Fireflies are characterized by their flashing light, produced by a biochemical process called bioluminescence. The flashing light serves as the primary courtship signal for mating and is also a warning signal of potential danger. Moreover, features such as the brightness and frequency of the flashing light give the signal its diversity. Through different flashing lights, the fireflies transfer information to each other and then act according to the message. In the FA, the brightness of the flashing light is the foremost characteristic. The FA focuses on the movement of the fireflies when they receive brightness information, which is the biological foundation of the algorithm.

Structure of the Firefly Algorithm
FA has powerful global exploration and exploitation abilities, and it can substantially improve both the avoidance of local optima and the convergence speed. The algorithm is based on the physical law that light intensity decreases in proportion to the square of the distance, $r^2$. In addition, as the distance from the light source increases, light absorption makes the light weaker and weaker. These phenomena can be associated with the objective function to be optimized. As a result, the base FA can be formulated as illustrated in the following algorithm: Algorithm: Pseudo-Code of the Firefly Algorithm.

Input:
$x^{(0)}$: the data that have been processed by singular spectrum analysis.

END IF
Attractiveness varies with the distance $r$ via $\beta_0 e^{-\gamma r}$.
There are three idealized assumptions in the FA:
(1) The sex of the fireflies is ignored: all fireflies are considered unisexual, so attraction does not differ between sexes.
(2) Brightness is the determining factor for the attraction of one firefly towards another, which means that a less bright firefly moves towards a brighter firefly. Both attractiveness and brightness decrease as the distance increases.
(3) The landscape of the fitness function determines the brightness of a firefly.
Definition 1. The distance between any two fireflies $i$ and $j$, whose positions are $x_i$ and $x_j$, is given by the Cartesian distance:

$$r_{ij} = \|x_i - x_j\| = \sqrt{\sum_{k=1}^{D} (x_{i,k} - x_{j,k})^2},$$

where $D$ is the number of dimensions.
Definition 2. The firefly's attractiveness is given by:

$$\beta(r) = \beta_0 e^{-\gamma r}.$$

At $r = 0$ the attractiveness is $\beta_0$, and $\gamma$ is the light absorption coefficient. The movement of the $i$th firefly towards a more attractive $j$th firefly is calculated as:

$$x_i^{new} = x_i^{old} + \beta_0 e^{-\gamma r_{ij}} \left(x_j - x_i^{old}\right) + \alpha \left(\mathrm{Rand} - \tfrac{1}{2}\right).$$

Here, the second term represents the attraction of one firefly towards another, and the third term introduces randomization into the movement, with $\alpha$ as the randomization parameter. Rand is a random number generator, $x_i^{new}$ is the new position of the $i$th firefly, and $x_i^{old}$ is its old position. The parameter $\gamma$ determines the attractiveness and hence the speed of convergence. In implementation, we can take $\beta_0 = 1$ and $\alpha \in (0, 1)$. In this paper we have set $\gamma = 0.001$; however, the results do not show much variation when the value of $\gamma$ changes.
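The movement rule can be sketched in Python as follows. This is a minimal illustration, not the paper's pseudo-code, which did not survive intact here. It uses the attractiveness $\beta_0 e^{-\gamma r}$ quoted above with $\beta_0 = 1$ and $\gamma = 0.001$; the population size, iteration count, $\alpha = 0.2$, the search bounds, the uniform random step centred on zero, and the sphere test function are all assumptions chosen for the example.

```python
import numpy as np

def firefly_minimize(f, dim, n_fireflies=20, n_iter=100,
                     beta0=1.0, gamma=0.001, alpha=0.2,
                     bounds=(-5.0, 5.0), seed=0):
    """Minimize f with a basic firefly algorithm (lower cost = brighter firefly)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pos = rng.uniform(lo, hi, size=(n_fireflies, dim))   # random initial population
    cost = np.array([f(p) for p in pos])
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:                    # firefly j is brighter than i
                    r = np.linalg.norm(pos[i] - pos[j])  # Cartesian distance
                    beta = beta0 * np.exp(-gamma * r)    # attractiveness at distance r
                    step = alpha * (rng.random(dim) - 0.5)
                    pos[i] = np.clip(pos[i] + beta * (pos[j] - pos[i]) + step, lo, hi)
                    cost[i] = f(pos[i])
    best = int(np.argmin(cost))
    return pos[best].copy(), float(cost[best])

def sphere(x):
    """Assumed test objective with its minimum of 0 at the origin."""
    return float(np.sum(x ** 2))

best_x, best_cost = firefly_minimize(sphere, dim=2)
```

Because the brightest firefly is never moved by this rule, the best cost in the population cannot get worse from one iteration to the next.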

BP Neural Network
The BP neural network is a typical feed-forward network. Its structure comprises an input layer, a hidden layer, and an output layer, as shown in Figure 1. The working process of the BP neural network can be divided into two stages: the learning stage and the working stage. In the learning stage, the input information is passed from the input layer to the hidden layer and then to the output layer. The differences between the results of the output layer and the target outputs given in the sample are then calculated and propagated backward through the network as feedback. Depending on this feedback, the weight values ($w_{ij}$, $w_{jk}$) and the threshold values ($a$, $b$) are modified for the next learning cycle. By this mechanism, the connection weights of the neurons among different layers are adjusted. Once the network performs well, it is applied in the working stage, in which the input information is propagated forward and the output is produced by the trained network [24].

Energies 2016, 9, 757
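A single-hidden-layer BP network of this kind can be sketched with NumPy. The sketch is illustrative only: the tanh hidden activation, the linear output, additive thresholds, the learning rate, the full-batch gradient step, and the synthetic sine series are assumptions, and the 6-13-1 layout simply mirrors the structure reported for the experiments.

```python
import numpy as np

def init_params(n_in, n_hidden, n_out, rng):
    """Random weights w_ij, w_jk and zero thresholds a, b."""
    return {
        "w_ij": rng.normal(0.0, 0.5, (n_in, n_hidden)),
        "a": np.zeros(n_hidden),
        "w_jk": rng.normal(0.0, 0.5, (n_hidden, n_out)),
        "b": np.zeros(n_out),
    }

def forward(p, X):
    """Forward pass: input -> hidden (tanh) -> output (linear)."""
    h = np.tanh(X @ p["w_ij"] + p["a"])
    return h, h @ p["w_jk"] + p["b"]

def train_step(p, X, y, lr=0.05):
    """One learning cycle: forward pass, error back-propagation, parameter update."""
    n = X.shape[0]
    h, out = forward(p, X)
    err = out - y                                      # output-layer error
    d_hidden = (err @ p["w_jk"].T) * (1.0 - h ** 2)    # error sent back to hidden layer
    p["w_jk"] -= lr * h.T @ err / n
    p["b"] -= lr * err.mean(axis=0)
    p["w_ij"] -= lr * X.T @ d_hidden / n
    p["a"] -= lr * d_hidden.mean(axis=0)
    return float(np.mean(err ** 2))                    # MSE before the update

# Assumed toy task: predict the next value of a sine series from 6 lagged values.
series = np.sin(0.2 * np.arange(300))
X = np.array([series[i:i + 6] for i in range(294)])
y = series[6:].reshape(-1, 1)

params = init_params(6, 13, 1, np.random.default_rng(0))
losses = [train_step(params, X, y) for _ in range(500)]
```

The list `losses` records the training error per cycle, so comparing `losses[0]` with `losses[-1]` shows how much the feedback loop has reduced the error.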

Hybrid SSA-FA-BP Model
The hybrid model used in this paper consists of singular spectrum analysis, the firefly algorithm, and the BP neural network. Singular spectrum analysis is used to de-noise the wind speed data in preprocessing. The BP neural network is the main predicting model. The firefly algorithm is used to optimize the parameters $w_{ij}$, $w_{jk}$, $a_j$, and $b_k$ of the BP neural network.
The BP neural network has known weaknesses, such as sensitivity to the initial parameters of the network and a tendency to become trapped in local optima. Therefore, a single BP neural network cannot provide good predictions given the complexity of the short-term wind speed series. Hence, a hybrid approach consisting of SSA, FA and BP is proposed. The structure of the proposed SSA-FA-BP is shown in Figure 2. The hybrid model can be divided into four stages. The general task in each stage is described as follows:
Stage 1 According to the requirements of the BP neural network, the data have to be divided into a training set and a test set. The training set is used to train the original network, which is built for forecasting. The test set is applied to the network to output the forecasting values.
Stage 2 Utilize singular spectrum analysis to divide the original short-term wind speed series into two modes: a de-noised series and a residual series. Discard the residual series to de-noise and smooth the original short-term wind speed data in preparation for forecasting. In this stage, only the wind speed data of the training set are processed by SSA. The decomposed residual series is discarded because the residual is small and can be regarded as an uncorrelated white noise series, which has a negative influence on the prediction; the rest of the decomposed modes are aggregated into the new data series. This process de-noises the original data to improve the prediction accuracy.
Stage 3 Optimize the parameters of the BP with the firefly algorithm. The FA was selected as the optimization instrument to obtain better parameters in the training process of the BP, which can improve the forecasting accuracy. The FA randomly generates the initial population of candidate solutions for the weights of the BP. After that, it calculates the light intensity of all fireflies and finds the most attractive (brightest) firefly within the population. Then, it calculates the attractiveness and distance for each firefly to move all fireflies towards the most attractive firefly in the search space. Next, the best solution among the population is passed to the BP as the initial solution.
Stage 4 Forecast the wind speed at different forecasting horizons (one-step ahead prediction or multi-step ahead prediction). After the initial parameters of the BP have been optimized by the firefly algorithm, the better-trained BP is used to predict the wind speed at different forecasting horizons. The forecasting horizon in this paper ranges from one to six steps, which has been demonstrated to be valid.
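Stage 3 requires a way to treat the BP parameters $w_{ij}$, $a_j$, $w_{jk}$, and $b_k$ as one firefly position and to score that position. The sketch below shows one possible encoding; the flat-vector layout, the tanh hidden activation, the training mean square error as the brightness criterion, and the names `decode` and `fitness` are assumptions for illustration, with the 6-13-1 layout taken from the experiments reported later.

```python
import numpy as np

N_IN, N_HID, N_OUT = 6, 13, 1
# One firefly position holds every weight and threshold of the network.
DIM = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT    # 105 parameters

def decode(theta):
    """Split a flat position vector into (w_ij, a, w_jk, b)."""
    theta = np.asarray(theta, dtype=float)
    if theta.size != DIM:
        raise ValueError(f"expected {DIM} parameters, got {theta.size}")
    i = 0
    w_ij = theta[i:i + N_IN * N_HID].reshape(N_IN, N_HID)
    i += N_IN * N_HID
    a = theta[i:i + N_HID]
    i += N_HID
    w_jk = theta[i:i + N_HID * N_OUT].reshape(N_HID, N_OUT)
    i += N_HID * N_OUT
    b = theta[i:i + N_OUT]
    return w_ij, a, w_jk, b

def fitness(theta, X, y):
    """Training MSE of the network encoded by theta (lower = brighter firefly)."""
    w_ij, a, w_jk, b = decode(theta)
    pred = np.tanh(X @ w_ij + a) @ w_jk + b
    return float(np.mean((pred.ravel() - np.asarray(y, dtype=float)) ** 2))

# Assumed toy check: an all-zero position predicts 0 for every sample.
X_demo = np.ones((4, N_IN))
y_demo = np.array([1.0, 2.0, 3.0, 4.0])
zero_fitness = fitness(np.zeros(DIM), X_demo, y_demo)  # mean of y^2 = 7.5
```

A firefly optimizer would minimize `fitness` over the 105-dimensional space and hand the best position, decoded with `decode`, to BP training as its starting point.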

Brief Description of the Case Study
As a non-polluting and abundant form of energy, wind power is important for the sustainable development of energy, so research on wind speed is necessary. In order to verify the superiority of the hybrid SSA-FA-BP model and the optimizing effect of our algorithms in actual experiments, we use a case study to examine how well the hybrid model forecasts wind speed.

Data Collection
Because wind speed is the object of this study, wind speed data constitute the main data set of the experiment. We therefore choose Peng Lai as the experimental site, where wind speed data are of high quality. Peng Lai (shown in Figure 3), located in Shandong Province in China, is a coastal city. Because of its temperate continental monsoon climate, Peng Lai has rich wind power reserves. In this paper, the wind speed observations were collected from the Peng Lai wind farm to verify the proposed hybrid approach.
The data used are the wind speed data of 2011, and three main units, No. 12, No. 13, and No. 14, are selected for examining the hybrid approach. Table 1 shows the information of the different data sets. Accurate prediction of the wind speed contributes to planning the economic load dispatch and to load increment/decrement decisions. However, in practice, in order to guarantee accuracy, the range of short-term predictions is usually considered to be 10 min to 6 h ahead [25]. The horizon in this paper is one to six steps, which corresponds to 10 min to 6 h ahead across the three experiments with their different time intervals.
Figure 4 displays the basic statistical indexes, which include the maximum, minimum, average and standard deviation of the wind speed data for the three experimental sites. As shown in Figure 4, the largest average and maximum speeds occur at Site B in all four quarters, whereas the minimum speeds occur in different seasons and at different sites.
The wind speed series of Site C present the maximum standard deviation values, which indicates the largest dispersion around the corresponding average value in the different seasons. However, these basic statistical indexes might not be enough for surveying the wind speed patterns. In order to describe the wind speed series more adequately, the observed frequency distribution (Weibull distribution) is shown in Figure 4. The shape parameter and scale parameter of the Weibull function are estimated with the maximum likelihood (ML) method from the daily wind speeds recorded at the three sites [26]. For each case, the shape parameter a and the scale parameter b vary from one site to the other two sites, and from one season to the other three seasons at the same site, which indicates that the wind speed patterns vary significantly.
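For reference, the ML estimates of the Weibull shape parameter a and scale parameter b can be computed as sketched below. This is an assumed textbook procedure (bisection on the ML score equation for the shape, then the closed-form scale), not the authors' code; the function name, the bracketing interval, and the synthetic sample are hypothetical, and strictly positive wind speeds are assumed.

```python
import numpy as np

def weibull_mle(v, lo=0.05, hi=50.0, tol=1e-10):
    """Maximum likelihood estimates (shape a, scale b) for positive data v."""
    v = np.asarray(v, dtype=float)
    if np.any(v <= 0):
        raise ValueError("all wind speeds must be strictly positive")
    log_v = np.log(v)
    mean_log = log_v.mean()
    z = v / v.max()                      # rescale to avoid overflow in z ** a

    def score(a):
        # ML condition for the shape: sum(v^a ln v)/sum(v^a) - 1/a - mean(ln v) = 0
        za = z ** a
        return (za * log_v).sum() / za.sum() - 1.0 / a - mean_log

    # score(a) increases with a, so plain bisection finds its root.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    a = 0.5 * (lo + hi)
    b = v.max() * np.mean(z ** a) ** (1.0 / a)   # scale: (mean of v^a)^(1/a)
    return a, b

# Assumed synthetic sample with true shape 2 and true scale 8 (units of m/s).
rng = np.random.default_rng(1)
sample = 8.0 * rng.weibull(2.0, size=20000)
a_hat, b_hat = weibull_mle(sample)
```

On this synthetic sample the estimates should land close to the true values 2 and 8; on real data, zero readings would need to be handled before calling the function.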

Evaluation Indices for Forecasting Performance
In this paper, in order to inspect the effect of the hybrid model, three main statistical indices are employed to measure the forecasting accuracy. They are the mean absolute percent error (MAPE), mean absolute error (MAE) and mean square error (MSE), for which small values indicate high forecast performance. These indices are defined as follows:

$$\mathrm{MAPE} = \frac{1}{N} \sum_{n=1}^{N} \left| \frac{y_n - \hat{y}_n}{y_n} \right| \times 100\%,$$

$$\mathrm{MAE} = \frac{1}{N} \sum_{n=1}^{N} \left| y_n - \hat{y}_n \right|,$$

$$\mathrm{MSE} = \frac{1}{N} \sum_{n=1}^{N} \left( y_n - \hat{y}_n \right)^2,$$

where $y_n$ is the observed value for time period $n$, $\hat{y}_n$ is the predicted value for the corresponding period, and $N$ is the number of forecast points. The MAE reveals how similar the predicted values are to the observed values, whereas the MSE measures the overall deviation between the predicted values and the observed values. The MAPE is a unit-free measure of accuracy for the predicted wind series and is sensitive to small changes in the data. According to some empirical studies, these three indices are reliable for wind speed forecasting and have been widely used [27]. However, when the result of wind speed forecasting is applied to wind power forecasting, larger errors appear because other factors, such as the turbine efficiency, influence the generation of wind energy [28]. Therefore, the evaluation indices of wind speed forecasting are not directly applied to the management of connected electric power systems.
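The three indices are straightforward to compute; the sketch below uses the standard definitions, with function names and toy values that are assumptions for illustration.

```python
import numpy as np

def mape(y, y_hat):
    """Mean absolute percent error, in percent (observed values must be nonzero)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.mean(np.abs((y - y_hat) / y)) * 100.0)

def mae(y, y_hat):
    """Mean absolute error, in the units of the data."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.mean(np.abs(y - y_hat)))

def mse(y, y_hat):
    """Mean square error, in squared units of the data."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.mean((y - y_hat) ** 2))

observed = [4.0, 5.0, 10.0]    # toy wind speeds (m/s)
predicted = [5.0, 5.0, 8.0]
scores = (mape(observed, predicted), mae(observed, predicted), mse(observed, predicted))
```

For these toy values the absolute errors are 1, 0 and 2, so the MAE is 1.0, the MSE is 5/3, and the MAPE is (25% + 0% + 20%)/3 = 15%. Because the MAPE divides by the observed value, it is undefined for calm periods with zero wind speed.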

Simulation
The simulation includes three experiments, with time intervals of 10, 30 and 60 min. In each experiment, three turbines (Unit 12, Unit 13, and Unit 14) are observed. For each turbine, the observed data are divided into four quarters to facilitate the observation of seasonal differences. The main models involved in this paper are the single BP, FA-BP, SSA-BP, and SSA-FA-BP. These four models produce the forecast series, and in this paper we focus on one-step and multi-step prediction. The results of the 1-step, 2-step, 3-step, and 6-step predictions are displayed for the three experiments, and the main accuracy indexes, MAPE, MSE, and MAE, are reported in all three experiments to show the relative strengths and weaknesses of the different models.

Experiment I: The Forecasting for a Time Interval of 10 Min
The 10 min experiment takes the data of Unit 12 as an example. The original wind series presents high fluctuation and instability, as Figure 5 shows. The SSA algorithm decomposes the original wind speed series and separates the residual series from a meaningful series (the reconstructed series). The residual series mainly contains noisy signals that will disturb the forecasting process. Therefore, before the final prediction, a de-noising process is necessary, which is able to improve the forecasting quality of the short-term wind speed. During this process, the residual series are discarded, and all the other modes are reconstructed into the new series to be used for forecasting.
In this case, the number of neurons was determined by trial and error. To set up a more effective network, many experiments were conducted, and the best trial results were selected. The experimental parameters of the BP are shown in Table 2. Over many trials, the outputs showed good forecasting accuracy when the hidden layer had 13 neurons. Thus, the structure of the BP was set as 6-13-1.
Table 3 shows the evaluation results for the individual forecasting and combination forecasting for the three sites (Unit 12, Unit 13, and Unit 14). It is apparent that the forecast accuracy of the proposed combination approach, SSA-FA-BP, is similar to or exceeds that of the single forecasting model BP and the two-component models FA-BP and SSA-BP at all horizons for the three sites. This result reflects the reliability of the proposed combined method in view of the stochastic nature of wind and its spatial and temporal variations. More detailed analyses follow. First, the one-step forecasting results obtained from the aforesaid models are discussed in detail. Then, the multi-step forecasting results are analyzed in a similar way. Finally, we concentrate on the seasonal influence on the models of concern. When considering the one-step and multi-step forecasting, the accuracy indices refer to the average of the seasonal indices.
For one-step forecasting, we make comparisons between the BP, FA-BP, SSA-BP and hybrid SSA-FA-BP models. The one-step ahead predicted output of the different models is displayed in Table 3. It is also clear from Table 3 and Figure 6 that the three forecasting evaluation indices (MAE, MSE and MAPE) obtained through the proposed hybrid strategy are smaller than those obtained from the other component models mentioned above for all three units. Comparing the predictions shows that the integration of the SSA algorithm and the FA algorithm is an effective method for short-term wind speed prediction and that the hybrid model can provide better predictions based on the properties of the short-term wind series.
To obtain more detailed properties, the models are analyzed further. Taking Unit 12 as an example, the single BP model is first used as a baseline to benchmark the forecasting accuracy, and the other combined models are compared with it. Table 3 shows that the FA-BP, SSA-BP and hybrid SSA-FA-BP models perform better than the single BP model, which reveals that SSA and FA improve the accuracy of forecasting. This is the reason that an increasing number of studies have proposed de-noising algorithms and optimization algorithms to tackle short-term wind speed problems. Comparing the single BP with SSA-FA-BP shows that the proposed model leads to reductions of 4.39% in MAPE, 0.14 in MAE and 0.21 in MSE. Comparing the single BP with FA-BP shows that the firefly algorithm leads to reductions of 1.3% in MAPE, 0.02 in MAE, and 0.04 in MSE. Comparing the single BP with SSA-BP shows that SSA leads to reductions of 4.28% in MAPE, 0.14 in MAE and 0.19 in MSE. In conclusion, the model comparisons show that the proposed SSA-FA-BP hybrid model achieves better forecasting performance than the other models, and that within the hybrid model SSA contributes more than FA to the improvement in forecasting accuracy.

Analysis of Multi-Step Forecasting
The proposed hybrid SSA-FA-BP model also performs well when applied to multi-step forecasting. With rolling prediction, the multi-step forecasting results are obtained for two, three and six steps. In the 10 min interval experiment, these correspond to forecasts 20 min, 30 min, and one hour ahead, as shown in Table 3 and in Figure 6.
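Rolling prediction can be read as a recursive strategy in which each forecast is fed back as an input for the next step. That reading is an assumption, since the exact scheme is not spelled out here; the sketch below illustrates it with a hypothetical one-step predictor standing in for the trained network and a window of six lagged values.

```python
import numpy as np

def rolling_forecast(history, predict_one, n_lags=6, steps=6):
    """Forecast `steps` values ahead by feeding each prediction back as an input."""
    window = [float(v) for v in history[-n_lags:]]
    preds = []
    for _ in range(steps):
        nxt = float(predict_one(np.asarray(window)))
        preds.append(nxt)
        window = window[1:] + [nxt]      # slide the window over the new forecast
    return preds

# Hypothetical one-step model: "last value plus one", standing in for the BP network.
toy_model = lambda w: w[-1] + 1.0
preds = rolling_forecast([1, 2, 3, 4, 5, 6], toy_model)   # [7.0, 8.0, ..., 12.0]
```

With a 10 min sampling interval, `preds[1]`, `preds[2]` and `preds[5]` are the 20 min, 30 min and one hour ahead forecasts. Because later steps depend on earlier forecasts rather than on observations, errors can accumulate as the horizon grows.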
We also find that the hybrid SSA-FA-BP exhibits better performance than the single BP model, which reveals that models that use singular spectrum analysis and the firefly algorithm to support the BP neural network outperform the single BP model, although the former are more complex and less interpretable than the latter. As shown in Table 3, for Unit 12, the MAPE of the BP is 14.50% in two-step forecasting, 16.32% in three-step forecasting and 18.18% in six-step forecasting. According to related research on wind speed forecasting, this is not accurate enough to be used as a reference. For the hybrid model SSA-FA-BP, however, the corresponding MAPE is 8.18% in two-step forecasting, 9.29% in three-step forecasting and 15.33% in six-step forecasting. Compared with the single BP, the hybrid model shows an evident improvement in forecasting accuracy. However, for horizons of more than six steps, the prediction results are not good enough at a time interval of 10 min. In addition, compared with the FA-BP model, the SSA-FA-BP model leads to a 4.6% reduction in the total MAPE for the two-step prediction and a 5% reduction in the total MAPE for the three-step ahead prediction, which demonstrates that the firefly algorithm methods are effective in boosting the multi-step forecasting accuracy of short-term wind speed prediction. Similarly, the comparison between the BP, SSA-BP, and SSA-FA-BP makes the improving effect of SSA and FA apparent. Additionally, in the hybrid model SSA-FA-BP, the singular spectrum analysis contributes more than the firefly algorithm to improving multi-step forecasting. The hybrid models make full use of data preprocessing methods and take good advantage of the optimization algorithm to improve the performance of forecasting. In addition, comparing the results for the different prediction horizons, we can conclude that the forecasting accuracy of each model decreases as the number of horizon steps increases.
Considering the different sites of concern, the experiment also demonstrates that the hybrid model is effective for all the data of Unit 12, Unit 13, and Unit 14, which can prove the universality of the model.According to the diversity of the experimental data, as shown in Table 3, the forecasting accuracy has a subtle distinction.

Analysis of Seasonal Feature
The specific seasonal forecasting results are shown in Table 4. The accuracy indices (MAPE, MAE and MSE) reveal differences in accuracy that depend on the observation period. For example, in the 10 min experiment, the SSA-FA-BP model applied to the data of Unit 13 in the first quarter produces the most accurate one-step ahead forecast, with a MAPE of 4.18%, which is relatively ideal for wind speed forecasting. As shown in Figure 7, in the seasonal comparison for the 10 min forecasting, the accuracy of the first quarter is the best, whereas the forecasts for the second and fourth quarters are not good enough. In particular, for several specific sites and quarters, the SSA-FA-BP even performs worse than SSA-BP or FA-BP, although all of them still outperform the single BP.
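The three accuracy indices can be computed as in the sketch below. It assumes the standard definitions of MAE, MSE and MAPE, since this section does not restate the formulas, and the function names are hypothetical.

```python
from typing import Sequence


def _check(actual: Sequence[float], forecast: Sequence[float]) -> int:
    """Validate the inputs and return the number of compared points."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("series must be non-empty and of equal length")
    return len(actual)


def mae(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute error."""
    n = _check(actual, forecast)
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / n


def mse(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean squared error."""
    n = _check(actual, forecast)
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent."""
    n = _check(actual, forecast)
    if any(a == 0 for a in actual):
        # MAPE is undefined when an observed wind speed is exactly zero.
        raise ValueError("actual values must be non-zero for MAPE")
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n


if __name__ == "__main__":
    observed = [2.0, 4.0]   # illustrative values only
    predicted = [1.0, 5.0]
    print(mae(observed, predicted), mse(observed, predicted), mape(observed, predicted))
```

MAPE is scale-free, which makes it convenient for comparing sites and quarters with different mean wind speeds, but it becomes unstable when observed speeds are close to zero.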

Experiment II: Forecasting for a Time Interval of 30 Min
In the 30 min experiment, the forecasting time interval is longer than in the 10 min experiment. The data collected for this experiment are sampled on the hour and on the half hour. The forecasting results are similar to those of the 10 min experiment.
For example, as Table 5 shows, for Unit 12, the SSA-FA-BP model outperforms the other component models in one-step ahead forecasting, with a MAPE of 12.65% compared with 12.78%, 17.11% and 17.62% for the SSA-BP, FA-BP and single BP models, respectively. In Table 5 and Figure 8, the SSA-BP method performs slightly worse than the best model in one-step ahead and multiple-step ahead forecasting. For the FA-BP and single BP, the forecasting accuracy is clearly worse than that of the other two models. As the forecasting horizon increases, the MAPEs, MAEs and MSEs of each model increase. For Unit 13 and Unit 14, the forecasting accuracy is shown in Table 5. The wind speed forecasts display the same trend as for Unit 12, which lends further weight to the generality of the models. For all horizons, the SSA-FA-BP model obtains the best forecasting performance compared with the other component models. The proposed approach, SSA-FA-BP, has higher reliability and precision at all sites in terms of forecasting performance.
Table 5 shows the evaluation results of the predictions obtained from models with and without SSA and FA for Unit 12, Unit 13 and Unit 14. The model comparisons demonstrate that the preprocessing method SSA is effective in increasing the accuracy of short-term wind speed prediction, and that the optimization method FA is effective in improving the forecasting accuracy of BP.
Considering the seasonal factor, the results in Table 6 and Figure 9 show that the wind speed forecasts for the first quarter are still the most accurate of the whole year in the 30 min experiment. However, for the second and fourth quarters, the results are not satisfactory. For some forecasts, such as the three-step forecasting of Unit 14 in the fourth quarter, the MAPE of SSA-FA-BP is 24.81%, which is larger than that of SSA-BP. As the horizon increases, the number of such abnormal results increases.

Experiment III: The Forecasting for Time Interval of 60 Min
In the 60 min experiment, the forecasting time interval is longer than in the 10 min and 30 min experiments. The data collected for this experiment are sampled on the hour. The forecasting results are shown in Table 7. For example, as Table 7 shows, for Unit 12, the SSA-FA-BP model outperforms the other component models in one-step ahead forecasting, with a MAPE of 20.67% compared with 20.88%, 21.68% and 22.22% for the SSA-BP, FA-BP and single BP models, respectively. In Table 7 and Figure 10, the SSA-BP method performs slightly worse than the best model in one-step ahead and multiple-step ahead forecasting. For the FA-BP and single BP, the forecasting accuracy is clearly worse than that of the other two models. As the forecasting horizon increases, the MAPEs, MAEs and MSEs of each model increase.
For Unit 13 and Unit 14, the forecasting accuracies are shown in Table 7. Although the optimizing effect of SSA and FA is still evident, the forecasting accuracy in the 60 min experiment is generally not ideal. As Figure 10 shows, the superiority of the proposed model decreases, and the differences among the models also diminish. Considering the seasonal factor, the results in Figure 11 and Table 8 show that the wind speed forecasts for the fourth quarter are the most accurate of the whole year in the 60 min experiment. For the other three quarters, the results are not desirable.
According to the results shown in Tables 3, 5 and 7, compared with BP, SSA-BP shows a more obvious improvement than FA-BP. Therefore, we can deduce that, in the combined SSA-FA-BP model, the SSA contributes more than the FA to the improvement of wind speed forecasting accuracy. However, as Figures 6, 8 and 10 show, as the data interval increases, the differences in accuracy between the models decrease, and the optimizing effects of SSA and FA decrease.
Table 9 shows the forecasting results of the three experiments averaged over the three sites. For one-step ahead forecasting, the MAPE for 10 min is 6.76%, which is less than that for 30 min and 60 min. For two-step and other multi-step ahead forecasting, the accuracy also decreases as the time interval increases.
As Tables 3, 5, 7 and 9 show, in every experiment and for every model, the forecasting accuracy decreases markedly as the forecasting horizon increases. For six-step ahead forecasting, most MAPE values exceed 15%, which is not credible enough for wind speed forecasting. This result implies that the proposed models are not applicable when the forecasting horizon exceeds six steps.
Figures 7, 9 and 11 and Tables 4, 6 and 8 show that the wind speed forecasting of the first quarter is the most accurate of the whole year. The forecasts for the second and fourth quarters are relatively poor.

Statistical Testing of the Predictive Accuracy
Statistical testing has been widely used to compare the predictive accuracy of forecasting models. To substantiate the superiority of the proposed model, this paper employs the bias-variance statistical framework and the Diebold-Mariano (DM) test to examine the forecasting results from a statistical point of view.

Bias-Variance Statistics Framework
The bias-variance framework [29] is used to estimate the accuracy and stability of the models, both of which are important in evaluating wind speed forecasting models. The error attributed to bias is taken as the difference between the forecasts of the proposed model and the observed values. The error attributed to variance is taken as the variability of the forecasting results. Let $x_i^f - x_i$ be the difference between the forecasting value $x_i^f$ and the actual value $x_i$, and let $E(x^f) = \frac{1}{n}\sum_{i=1}^{n} x_i^f$ be the expectation of the forecasting data, where $n$ is the number of data for comparison. The bias-variance statistical framework is described as follows:
$$\mathrm{Bias}(x^f) = \frac{1}{n}\sum_{i=1}^{n}\left(x_i^f - x_i\right), \qquad \mathrm{Var}(x^f) = \frac{1}{n}\sum_{i=1}^{n}\left(x_i^f - E(x^f)\right)^2.$$
$\mathrm{Var}(x^f)$ reflects the stability of the forecasting model, and $\mathrm{Bias}(x^f)$ reflects its predictive accuracy.
Table 10 shows the bias and variance obtained with the bias-variance statistical framework. For all three experiments with different time intervals, the bias of the SSA-FA-BP is smaller than that of the other three models, which indicates that the proposed hybrid model is superior in predictive accuracy. However, for the same model, the bias grows as the time interval increases, which reveals the decline in accuracy. The variance results show that the proposed model is generally more stable.
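The two quantities can be computed as in the following sketch. It assumes that the bias is the mean difference between forecasts and observations and that the variance is the population variance of the forecasts around their own mean. The exact estimators used for Table 10 are not recoverable from this section, so the function name and these choices are assumptions.

```python
from typing import Sequence, Tuple


def bias_and_variance(
    forecast: Sequence[float], actual: Sequence[float]
) -> Tuple[float, float]:
    """Return (bias, variance) of a forecast series.

    bias     = mean of (forecast_i - actual_i)
    variance = mean of (forecast_i - mean(forecast))^2
    """
    if len(forecast) != len(actual) or len(forecast) == 0:
        raise ValueError("series must be non-empty and of equal length")
    n = len(forecast)
    mean_forecast = sum(forecast) / n  # E(x^f)
    bias = sum(f - a for f, a in zip(forecast, actual)) / n
    variance = sum((f - mean_forecast) ** 2 for f in forecast) / n
    return bias, variance


if __name__ == "__main__":
    # Illustrative values only, not results from Table 10.
    predicted = [5.2, 5.0, 4.7, 5.5]
    observed = [5.0, 5.1, 4.9, 5.3]
    b, v = bias_and_variance(predicted, observed)
    print(f"bias={b:.4f}, variance={v:.4f}")
```

A bias close to zero indicates that the forecasts are not systematically too high or too low, while a small variance indicates stable forecasts.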

The Diebold-Mariano (DM) Test
The Diebold-Mariano (DM) test [30], a comparative test that focuses on predictive accuracy, can be used to compare the forecasting performance of the proposed hybrid model with that of the comparison models. In empirical applications, it is often the case that two or more time series models are available for forecasting a particular variable of interest.
Definition 2. Let $\delta_n^{(1)}$ and $\delta_n^{(2)}$, $n = 1, \ldots, N$, be the forecast errors of the two competing models on the verifying set, and let $d_n = |\delta_n^{(1)}| - |\delta_n^{(2)}|$ be the loss differential. The DM test statistic evaluates the forecasts in terms of the absolute loss function:
$$DM = \frac{\bar{d}}{\sqrt{S^2/N}}, \qquad \bar{d} = \frac{1}{N}\sum_{n=1}^{N} d_n,$$
where $S^2$ is an estimator of the variance of $d_n$.
The DM test is a hypothesis test, so we construct the null hypothesis and the alternative hypothesis as:
$$H_0: E(d_n) = 0, \qquad H_1: E(d_n) \neq 0.$$
The null hypothesis is that the two forecasts have the same accuracy. The alternative hypothesis is that the two forecasts have different levels of accuracy. Under the null hypothesis, the test statistic DM is asymptotically N(0,1) distributed. The null hypothesis of no difference is rejected if the computed DM statistic falls outside the range $[-z_{\alpha/2}, z_{\alpha/2}]$, that is, if:
$$|DM| > z_{\alpha/2} \quad (17)$$
where $z_{\alpha/2}$ is the upper z-value from the standard normal table corresponding to half of the desired significance level $\alpha$ of the test.
The results of the DM test are shown in Table 10. For Experiment I and Experiment II, the values of the DM statistic between the proposed hybrid model and the other three models are larger than the upper limit at the 1% significance level, which means that the hybrid model displays a distinct superiority over the other three models. For Experiment III, the value of the DM statistic between the hybrid model and BP is larger than the upper limit at the 1% significance level, but for the FA-BP and the SSA-BP, the values of the DM statistic against the hybrid model exceed the upper limit only at the 10% and 15% significance levels, respectively, which indicates a weaker significant difference. Thus, it can be concluded that the hybrid model is significantly superior to the other models in forecasting accuracy, and that this superiority declines as the data time interval increases.
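The test can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the loss is the absolute deviation, the variance of the loss differential is estimated by the plain sample variance (no autocorrelation correction, which is a simplification suited to one-step forecasts), and the function names are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Sequence


def dm_statistic(errors_1: Sequence[float], errors_2: Sequence[float]) -> float:
    """DM statistic under absolute loss: mean(d) / sqrt(S^2 / N)."""
    if len(errors_1) != len(errors_2) or len(errors_1) < 2:
        raise ValueError("need two error series of equal length >= 2")
    # Loss differential d_n = |error of model 1| - |error of model 2|.
    d = [abs(e1) - abs(e2) for e1, e2 in zip(errors_1, errors_2)]
    n = len(d)
    d_bar = sum(d) / n
    s2 = sum((x - d_bar) ** 2 for x in d) / (n - 1)  # sample variance (assumed)
    if s2 == 0.0:
        raise ValueError("loss differential has zero variance")
    return d_bar / math.sqrt(s2 / n)


def rejects_equal_accuracy(dm: float, alpha: float = 0.01) -> bool:
    """Two-sided test: reject H0 when |DM| exceeds z_{alpha/2}."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return abs(dm) > z_crit


if __name__ == "__main__":
    # Illustrative forecast errors only, not values behind Table 10.
    benchmark_errors = [1.0, -2.0, 3.0, -4.0]
    hybrid_errors = [0.5, -1.0, 1.0, -1.0]
    dm = dm_statistic(benchmark_errors, hybrid_errors)
    print(dm, rejects_equal_accuracy(dm, alpha=0.01))
```

With this sign convention, a positive statistic means that the first model has the larger average absolute error, so the second model is the more accurate one.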

Conclusions
As the pressure on energy supply increases, the development and utilization of new renewable energy sources deserve greater attention in order to achieve the aims of sustainability and environmental protection. At present, there is no doubt that wind energy is one of the best forms of renewable energy. However, for wind power generation, the reliability and accuracy of wind speed forecasts are vital, and the complexity and fluctuation of wind speed series make it a great challenge to forecast the wind speed precisely. A large number of studies have been devoted to improving wind speed forecasting performance through parameter optimization and factor analysis, which affect the final estimates significantly. However, in conventional studies, a model constructed without pre-processing the data, which contain considerable irrelevant factors, will inaccurately estimate the fluctuation trend of the wind speed series, which usually contributes to deviations and errors in the predicted results. To achieve the desired forecasts, therefore, it is necessary to identify and eliminate the outliers in the original wind speed data before constructing a forecasting model. In this study, singular spectrum analysis is introduced for de-noising. Moreover, affected by various environmental factors, wind speed data present high fluctuations, autocorrelation and stochastic volatility, making it difficult to forecast the wind speed using a single model. Thus, in this paper, a hybrid model, SSA-FA-BP, is proposed. The SSA is exploited to eliminate the stochastic volatility in the wind speed series. The parameters of the BP are tuned and optimized by the FA, so that the defect caused by the randomness of the BP neural network is partly overcome and the network is less likely to fall into a local optimum. In addition, this study generates wind speed predictions over two different forecasting horizons: one-step ahead prediction and multi-step ahead prediction. The test results obtained for the different forecast horizons suggest that the proposed hybrid wind speed forecasting method, based on the BP model integrated with the FA and preprocessing with the SSA, is able to produce good wind speed predictions. In addition, the SSA contributes more than the FA to the improvement of the forecasting accuracy in the proposed hybrid model.

Figure 1. The three-layer BP neural network structure.

Figure 2. The structure of the proposed SSA-FA-BP: (a) the structure of SSA; (b) the structure of FA-BP; (c) the schematic of the back-propagation neural network; (d) the mechanism of multistep rolling forecasting.
The specific hybrid model can be divided into four stages. The general task in each stage is described as follows: Stage 1: According to the characteristics of the dataset of the BP neural network, the data have to be divided into different parts, including a training set and a test set. The training set is used to train the original network, which is built for forecasting. The test set is applied to the network to

Figure 3. The location of the Peng Lai wind farm, and the data structure.

Figure 5. The results of the wind series of Unit 12 (10 min) processed by SSA.

Table 2. The experiment parameters of BP.
Experimental parameter | Value
Neuron number in the input layer | 6
Neuron number in the hidden layer | 5-15
Neuron number in the output layer | 1

Figure 6. The quarterly forecasting results of the wind series of Unit 12 with one-step (10 min): (a) MSE; (b) MAE; (c) MAPE.

Figure 8. The multiple-step forecasting accuracy indexes of Unit 12 (30 min): (a) MSE; (b) MAE; (c) MAPE.

Figure 9. The quarterly forecasting results of the wind series of Unit 12 with one-step (30 min).

Energies 2016, 9, 757
Figure 10. The multiple-step forecasting accuracy indexes of Unit 12 (60 min): (a) MSE; (b) MAE; (c) MAPE.

Figure 11. The quarterly forecasting results of the wind series of Unit 12 with one-step (60 min).

5.4. Summary: Based on Experiments I-III
Comparing the three experiments above, we obtain the following overall conclusion: The BP network has acceptable accuracy in the forecasting of the wind speed. The combined model possesses a more powerful forecasting ability than the individual model. As every experiment shows, the accuracy of the single BP, FA-BP, SSA-BP and SSA-FA-BP models for wind speed forecasting increases successively, and in nearly all predictions the SSA-FA-BP outperforms the other two component models and the individual model.

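Definition 1. The actual values are $\{y_n\}$, and the two forecasting values are $\{\hat{y}_n^{(1)}\}$ and $\{\hat{y}_n^{(2)}\}$. The forecast errors are $\delta_n^{(i)} = \hat{y}_n^{(i)} - y_n$, $i = 1, 2$. Here we employ the popular loss expression, absolute deviation loss, $L(\delta_n^{(i)}) = |\delta_n^{(i)}|$, where $n$ runs over the sequence of the verifying set.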
/* Initialize population of n fireflies x_i (i = 1, 2, ..., n) randomly */
FOR EACH i: 1 ≤ i ≤ n DO
    Evaluate the corresponding fitness function F_i
END FOR
/* Determine light intensity. */
FOR EACH i: 1 ≤ i ≤ n DO
    Determine light intensity L_i depending on F(x_i).
END FOR
WHILE (g < Maxgeneration) DO
    FOR EACH i = 1:n DO /* all n fireflies */
        FOR EACH j = 1:n DO /* all n fireflies */
            /* Move firefly i towards j in all d-dimensions */
            IF (L_j
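Since the pseudocode above is incomplete, the following sketch shows how a firefly search of this kind is commonly written. It assumes the standard firefly update, in which firefly i moves towards a brighter firefly j with attractiveness beta0 * exp(-gamma * r^2) plus a small random step. The parameter values, the step-size decay and the toy cost function are assumptions for illustration, and the function does not reproduce the configuration used to tune the BP network.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def firefly_minimize(
    cost: Callable[[Sequence[float]], float],
    dim: int,
    n_fireflies: int = 15,
    max_generation: int = 100,
    beta0: float = 1.0,   # attractiveness at distance zero (assumed)
    gamma: float = 1.0,   # light absorption coefficient (assumed)
    alpha: float = 0.2,   # random step scale (assumed)
    bounds: Tuple[float, float] = (-5.0, 5.0),
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Minimise `cost`; a lower cost corresponds to a brighter firefly."""
    rng = random.Random(seed)
    low, high = bounds
    # Initialise the population randomly and evaluate its fitness.
    swarm = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(n_fireflies)]
    costs = [cost(x) for x in swarm]
    best_idx = min(range(n_fireflies), key=costs.__getitem__)
    best_x, best_cost = list(swarm[best_idx]), costs[best_idx]

    for _ in range(max_generation):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if costs[j] < costs[i]:  # firefly j is brighter than firefly i
                    r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    # Move i towards j in every dimension, then clip to bounds.
                    swarm[i] = [
                        min(high, max(low, a + beta * (b - a) + alpha * (rng.random() - 0.5)))
                        for a, b in zip(swarm[i], swarm[j])
                    ]
                    costs[i] = cost(swarm[i])
                    if costs[i] < best_cost:
                        best_x, best_cost = list(swarm[i]), costs[i]
        alpha *= 0.97  # gradually reduce randomness (assumed schedule)
    return best_x, best_cost


def sphere(x: Sequence[float]) -> float:
    """Toy cost function standing in for the network training error."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    position, value = firefly_minimize(sphere, dim=2)
    print(position, value)
```

In a hybrid such as FA-BP, the position vector would typically encode the network parameters being tuned and the cost would be a training or validation error, although that mapping is not shown here.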

Table 1. The data sets of three different experiments.

Table 3. The quarterly average forecast results of the combined model and the results of the other models involving the data of three units (10 min).

Table 4. The quarterly forecasting results of the combined model and the results of the other models involving the data of three units (10 min).

Table 5. The quarterly average forecast results of the combined model and the results of the other models involving the data of three units (30 min).

Table 6. The quarterly forecasting results of the combined model and the results of the other models involving the data of three units (30 min).

Table 7. The quarterly average forecast results of the combined model and the results of the other models involving the data of three units (60 min).

Table 8. The quarterly forecasting results of the combined model and the results of the other models involving the data of three units (60 min).

Table 9. The quarterly average forecast results of the combined model and the results of the other models involving the data of three units (three experiments).

Table 10. Bias-variance and Diebold-Mariano test of three experiments among the four different models for the average value of four quarters and three sites.