A Novel Combined Model Based on an Artificial Intelligence Algorithm—a Case Study on Wind Speed Forecasting in Penglai, China

Wind speed forecasting plays a key role in wind-related engineering studies and is important in the management of wind farms. Current forecasting models based on different optimization algorithms can be adapted to various wind speed time series data. However, these methodologies cannot aggregate different hybrid forecasting methods and take advantage of the component models. To overcome these limitations, we propose a novel combined forecasting model called SSA-PSO-DWCM, i.e., a particle swarm optimization (PSO) determined weight coefficients model. This model consists of three main steps: first, the original wind speed signals are decomposed to discard the noise; second, the parameters of the forecasting method are optimized; and last, the different models are combined in a nonlinear way. The proposed combined model is examined by forecasting the wind speed (at 10-min intervals) of wind turbine 5 located in the Penglai region of China. The simulations reveal that the proposed combined model produces more reliable forecasts than the component forecasting engines and the traditional combined method, which is based on a linear method.


Introduction
Due to increasing energy demands and environmental concerns, wind power has attracted global attention as a source of sustainable energy. China is rich in wind energy resources. According to one estimate of wind energy, at an altitude of 10 m, China has theoretical wind energy reserves of 600-1000 GW on land and offshore (exploitable) reserves of 100-200 GW. At present, the wind power industry is growing rapidly in the country [1]. It is well known that wind energy has three main weaknesses: low density, instability and regional variations. These features make wind speed difficult to predict. Wind speed forecasting can be grouped into three categories: ultra-short-term forecasts, short-term forecasts and mid-and-long-term forecasts [2]. In recent years, much research has been conducted to enhance wind speed forecasting accuracy, and these approaches can be divided into four categories: physical methods, statistical methods, hybrid physical-statistical approaches and artificial intelligence techniques [3]. Among these four categories, artificial intelligence techniques and statistical methods are the main methods studied in this paper.
Neural networks have good generalization ability, particularly in solving nonlinear problems, and they have been extensively used to forecast wind speed. Artificial Neural Networks (ANNs) have three advantages: first, they possess self-learning ability; second, they have associative memory functions; and last, they are able to find optimal solutions. In the last 10 years, with the constant development of artificial neural networks, many researchers have proposed the application of artificial intelligence techniques to wind speed forecasting, including artificial neural networks and other mixed methods. A Wavelet Neural Network (WNN) is a typical and widely used artificial neural network due to its strong advantages in dealing with nonlinear estimation problems [4]. It has performed well in various fields, such as pattern recognition [5], image processing [6], forecast estimation [7], biology [8], medicine [9], economics [10] and others. The WNN method has several advantages, such as high tolerance of data errors and no requirement for information beyond a wind speed history. It can fit samples that are not present in the historical data and can also approximate an optimal nonlinear function with high precision. Because of these advantages, many studies have applied WNNs to forecasting future data.
Decomposition of raw data is an important procedure for data filtering. It can effectively improve model forecasting precision and result in a better wind speed forecast [11]. Decomposition techniques such as Wavelet Decomposition (WD) [12] and Empirical Mode Decomposition (EMD) [13] are often employed to eliminate noise sequences. However, some limitations need to be noted: the WD method is sensitive to the threshold selection, and the EMD method has an inherent disadvantage in the frequent appearance of mode mixing [14]. The de-noising method of singular spectrum analysis (SSA) used in this paper is somewhat different from de-noising techniques such as Fourier decomposition (FD) and wavelet decomposition (WD). It is one of the principal component analysis methods, which combine statistics and probability theory with concepts from dynamical systems and signal processing [15]. The main concept of SSA is that the original time series is decomposed into several components, which represent the trend, oscillatory behaviour (periodic or quasi-periodic components) and noise [16]. One of the strengths of the SSA technique compared with other non-parametric methods is that only two parameters are needed to reconstruct the original time series. SSA is often used to extract signals from one-dimensional short time sequences such as wind speed time series.
Individual artificial intelligence methods cannot always determine the link between each data point and obtain accurate forecasts [17]. To obtain better performance, hybrid forecasts have been presented using many approaches [18]. Hybrid forecasts have demonstrated significant improvement in forecasting results compared with using a single forecasting method [19]. Nevertheless, hybrid forecasting methods are based on just one or two optimization methods to improve individual models. It becomes uncertain whether the strengths of different optimization methods are fully exploited if more optimization methods are included. Thus, to avoid the above disadvantages, combination forecasts have been proposed as a novel method.
The combination forecast proposed by Bates and Granger in 1969 has been considered an efficient and simple way to improve forecasting stability [20]. The study of combination forecasts received significant attention after the 1970s. Many researchers have focused on combining different forecasting methods and on the application of combination forecasting models in their studies [21,22]. This paper studies a combined method that incorporates three hybrid models: SSA-PSO-WNN, SSA-CS-WNN and SSA-GA-WNN. Generally, combined forecasting models are divided into the constant weight combination forecast method and the variable weight combination forecast method [23]. The method in this paper is based on the minimum mean absolute percentage error (MAPE) and belongs to the constant weight combination methods. The first step of the combination model is data filtering of the raw wind speed by SSA. Then, we use the Cuckoo Search (CS), Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) algorithms to optimize the WNN. Finally, the combined model SSA-PSO-DWCM is constructed based on different weighting coefficients, which are calculated by the PSO algorithm. The simulations demonstrate that the forecasting accuracy of the proposed combined model is superior to that of the models used for comparison in this paper. As a forecasting method, SSA-PSO-DWCM can effectively account for the periodicity and nonlinearity in the wind speed series and gives more accurate forecasts.
The primary contributions of this study are as follows:
(1) A model based on the SSA de-noising technique is utilized to decompose the wind speed time series and discard the noise. This procedure, by reducing the irregularity and instability of wind speed sequences, can effectively improve model forecasting precision.
(2) Each algorithm has its own advantages. On the basis of an analysis of the structure and parameters of a WNN, the CS (Cuckoo Search), PSO (Particle Swarm Optimization) and GA (Genetic Algorithm) algorithms can be employed to determine the number of wavelet nodes and related parameters such as initial values. These procedures give the optimized artificial neural network higher stability, convergence speed and prediction accuracy.
(3) A novel combined model, the SSA-PSO-DWCM, is developed for the wind speed forecasting field that, for the first time, combines three hybrid models using an intelligent technique. The combined model integrates the advantages of its component models and breaks through the limitations of traditional non-negative theory.
(4) Considering the randomness of the optimization method and the nonlinearity of the wind series, every experiment was performed 10 times to ensure the reliability of the conclusions.
The structure of this paper is as follows: Section 2 introduces the individual optimization theories (Cuckoo Search, Genetic Algorithm and Particle Swarm Optimization), the Wavelet Neural Network prediction method and the Singular Spectrum Analysis de-noising method. Section 3 proposes the combined approach. In Section 4, several cases are simulated to illustrate the effectiveness of the proposed SSA-PSO-DWCM combined model; this section comprises the experimental design, results and discussion. Finally, Section 5 gives a comprehensive summary of this study.

Forecasting Theory
A combined model takes advantage of its component models and is superior to the individual models, or performs at least as well as the best one, as many simulation results have shown [24]. This work proposes a novel combined method to forecast wind speed which includes three hybrid models: SSA-CS-WNN, SSA-GA-WNN and SSA-PSO-WNN. First, Singular Spectrum Analysis (SSA) is applied to decompose and reconstruct the raw wind sequence. Then, the three hybrid models (SSA-CS-WNN, SSA-GA-WNN and SSA-PSO-WNN) are built to forecast wind speed. Finally, particle swarm optimization (PSO) is employed to determine the weighting coefficients of these three hybrid models, and a final combined model is proposed.

Cuckoo Search (CS) Algorithm
A cuckoo is a charming bird that makes a beautiful sound and has an aggressive reproduction strategy. Numerous studies have shown that many insects and animals exhibit Lévy flight behavior [25]. A moving object takes stochastic steps that alter the behavior of a system; this situation can be described as a Lévy flight, a sketch of which is shown in Figure 1, part (c).
The CS algorithm couples a local random process with a global search process, with the balance between them controlled by a transfer parameter. The primary procedures of the CS are illustrated by the pseudo-code shown in Figure 1, part (c). In our case, the number of neurons was selected by trial and error: many experiments were conducted with different numbers of neurons, and the best trial results were selected. Tables 1-4 show the experimental parameters of all algorithms. The experimental parameters of the CS algorithm in this study are shown in Table 1.

Model    Experimental Parameters                                          Default Value
CS       Scale of the bird's nests                                        20
CS       Probability that the host bird discovers an outside egg          0.25
CS       Accuracy of the iteration termination                            1.0e-5
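The interplay of Lévy flights and nest abandonment described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pseudo-code of Figure 1: the Lévy steps are drawn with Mantegna's algorithm, the termination accuracy is interpreted as a threshold on the best objective value, and the names `levy_step`, `cuckoo_search`, `step_scale` and `max_iter` are hypothetical. Only the defaults of 20 nests, a discovery probability of 0.25 and an accuracy of 1.0e-5 are taken from the parameter table.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step length with Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def cuckoo_search(objective, dim, bounds, n_nests=20, pa=0.25, tol=1.0e-5,
                  max_iter=200, step_scale=0.01, seed=0):
    """Minimize `objective` over a box; returns (best, best_fit, history)."""
    rng = random.Random(seed)
    lo, hi = bounds

    def clip(value):
        return min(hi, max(lo, value))

    nests = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_nests)]
    fitness = [objective(nest) for nest in nests]
    history = [min(fitness)]
    for _ in range(max_iter):
        best = nests[fitness.index(min(fitness))][:]
        # Global search: Levy flights scaled by the distance to the best nest.
        for i in range(n_nests):
            cand = [clip(x + step_scale * levy_step(rng) * (x - b))
                    for x, b in zip(nests[i], best)]
            f = objective(cand)
            if f < fitness[i]:
                nests[i], fitness[i] = cand, f
        # Local random process: with probability pa a nest is "discovered" and
        # a replacement is proposed from the difference of two random nests.
        for i in range(n_nests):
            if rng.random() < pa:
                j, k = rng.randrange(n_nests), rng.randrange(n_nests)
                r = rng.random()
                cand = [clip(x + r * (xj - xk))
                        for x, xj, xk in zip(nests[i], nests[j], nests[k])]
                f = objective(cand)
                if f < fitness[i]:
                    nests[i], fitness[i] = cand, f
        history.append(min(fitness))
        if history[-1] < tol:  # assumed meaning of the termination accuracy
            break
    best_fit = min(fitness)
    return nests[fitness.index(best_fit)], best_fit, history
```

Because a candidate only replaces a nest when it is better, the best objective value never gets worse from one iteration to the next. In the setting of this paper the objective would be the forecasting error of a WNN as a function of its initial parameters; a simple test function such as the sum of squares is enough to try the sketch.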

Genetic Algorithm (GA)
The genetic algorithm was proposed by Professor Holland of the University of Michigan in 1962 [26]. This algorithm operates on a population of potential solutions, applying the principle of survival of the fittest to produce successively better approximations to a solution. Currently, genetic algorithms are used to optimize neural networks to solve some complicated problems [27]. The basic operations of a GA comprise six steps, as described below [28].
Step 1: Generate the initial population in a random way.
Step 2: Compute and save each individual's fitness.
Step 3: Based on different fitness values, the selection procedure chooses an individual for a new group. The probability of being chosen is proportional to the individual fitness value.
Step 4: A crossover operation is carried out by selecting two matching parents in which two random places are selected on each chromosome string and the string segments between these two places are exchanged between the mates.
Step 5: Mutation randomly modifies elements in the chromosomes and is employed with low probability, typically from 0.001 to 0.01.
Step 6: If the above steps have not found optimal solutions, i.e., the minimum objective function value has not been obtained, the procedure goes back to Step 2.
In this paper, the simple genetic algorithm (SGA), which demonstrates the main principles of a GA in a simple way [29], is used to sketch the primary properties of the GA; the pseudo-code is shown in Figure 1, part (a). Table 2 lists the experimental parameters of the GA used in this study.
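The six steps above can be sketched as a short Python routine. This is an illustrative sketch rather than the pseudo-code of Figure 1: it assumes binary chromosomes, a non-negative fitness that is to be maximized, and a fixed generation budget as the stopping rule. The function name `simple_ga` and all default values except the mutation rate range quoted in Step 5 are hypothetical and are not the Table 2 settings.

```python
import random


def simple_ga(fitness, n_bits, pop_size=20, crossover_rate=0.8,
              mutation_rate=0.01, generations=50, seed=0):
    """Maximize a non-negative `fitness` over bit strings of length n_bits."""
    rng = random.Random(seed)
    # Step 1: generate the initial population in a random way.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best, best_fit, history = None, float("-inf"), []
    for _ in range(generations):
        # Step 2: compute and save each individual's fitness.
        fits = [fitness(ind) for ind in pop]
        for ind, f in zip(pop, fits):
            if f > best_fit:
                best, best_fit = ind[:], f
        history.append(best_fit)
        # Step 3: roulette-wheel selection, probability proportional to fitness
        # (a tiny offset keeps the weights valid if every fitness is zero).
        parents = rng.choices(pop, weights=[f + 1e-9 for f in fits], k=pop_size)
        children = []
        for k in range(0, pop_size, 2):
            a, b = parents[k][:], parents[(k + 1) % pop_size][:]
            # Step 4: two-point crossover, swapping the segment between two cuts.
            if rng.random() < crossover_rate:
                i, j = sorted(rng.sample(range(n_bits + 1), 2))
                a[i:j], b[i:j] = b[i:j], a[i:j]
            children.extend([a, b])
        # Step 5: mutation flips each gene with a low probability.
        pop = [[1 - g if rng.random() < mutation_rate else g for g in ind]
               for ind in children[:pop_size]]
        # Step 6: go back to Step 2 until the generation budget is used up.
    return best, best_fit, history
```

For example, `simple_ga(sum, n_bits=16)` searches for the bit string with the most ones. The routine tracks the best individual seen so far, so the recorded best fitness cannot decrease even though the population itself is not elitist.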

Particle Swarm Optimization (PSO) Algorithm
Particle Swarm Optimization (PSO) is an optimization algorithm inspired by the way a flock of birds in flight moves randomly at the local level while remaining globally coordinated [30]. The purpose of the PSO algorithm is to search for the optimal solution of a problem [31]. This paper uses the pseudo-code shown in Figure 1, part (b) to describe the basic steps of the PSO algorithm. The experimental parameters of the PSO algorithm in this study are shown in Table 3.
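As a rough illustration of the basic PSO steps, the sketch below implements the common inertia-weight variant, in which each particle is pulled toward its own best position and toward the best position found by the swarm. Whether this matches the exact variant of Figure 1 is an assumption, and the function name `pso_minimize` and the defaults for `inertia`, `c1`, `c2`, swarm size and iteration count are hypothetical rather than the Table 3 settings.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, n_iter=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box; returns (gbest, gbest_fit, history)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # personal best positions
    pbest_fit = [objective(p) for p in pos]
    g = pbest_fit.index(min(pbest_fit))
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]   # global best so far
    history = [gbest_fit]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + cognitive pull + social pull.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Position update, clipped to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            f = objective(pos[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
        history.append(gbest_fit)
    return gbest, gbest_fit, history
```

The random factors `r1` and `r2` give the local randomness mentioned above, while the shared `gbest` term provides the global coordination.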

Wavelet Neural Network (WNN)
The wavelet neural network (WNN) is based on the structure of the back-propagation (BP) neural network and is a multi-dimensional, feed-forward network. The wavelet neural network method uses the wavelet basis function as the transfer function of the hidden layer nodes. The basic structure of the WNN is a three-layer neural network, which is shown in Figure 1, part (d).
There are m nodes in the input layer, n wavelet bases in the hidden layer and only one output. The WNN not only converges quickly but can also avoid local optima because of its strong learning and generalization capacity [32]. The experimental parameters of the WNN in this study are shown in Table 4.
The structure of the wavelet neural network is described by the following formula:

$$\hat{y} = \sum_{t=1}^{n} w_t \, \psi_t\!\left(\frac{\sum_{k=1}^{m} u_{kt} x_k - b_t}{a_t}\right) \tag{1}$$

In the formula, $\hat{y}$ is the final predicted value and has just one element; $x = (x_1, x_2, \dots, x_m)^T$ represents the initial input vector; $u_{kt}$ is the weight of the connection from the kth neuron of the input layer to the tth neuron of the hidden layer; the product of $w_t$ and $\psi_t$ is the wavelet basis function; $a_t$ is the stretch factor of the wavelet basis function and $b_t$ is the translation factor of the wavelet basis function. In this paper, the Morlet wavelet is adopted as the activation function in the hidden nodes because, in comparison with the Mexican hat wavelet, the orthogonal wavelet and the Gaussian spline wavelet, the Morlet wavelet has the smallest error and the best computational stability [33]. The Morlet wavelet is given by Equation (2).
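A forward pass through such a network can be sketched as follows. The sketch assumes the usual single-output form, in which the output is the sum over hidden nodes of $w_t \, \psi((\sum_k u_{kt} x_k - b_t)/a_t)$, and it assumes the real-valued Morlet form $\psi(z) = \cos(1.75z)\,e^{-z^2/2}$ that is common in WNN implementations; the constant 1.75 and the function names `morlet` and `wnn_forward` are assumptions, not values confirmed by this paper.

```python
import math


def morlet(z):
    """Assumed Morlet mother wavelet: cos(1.75 z) * exp(-z^2 / 2)."""
    return math.cos(1.75 * z) * math.exp(-z * z / 2.0)


def wnn_forward(x, u, a, b, w):
    """Single-output WNN forward pass.

    x: list of m inputs
    u: n rows of m input-to-hidden weights, u[t][k]
    a: n stretch factors (non-zero), b: n translation factors
    w: n hidden-to-output weights
    """
    y_hat = 0.0
    for t in range(len(w)):
        net = sum(u[t][k] * x[k] for k in range(len(x)))  # weighted input sum
        y_hat += w[t] * morlet((net - b[t]) / a[t])       # scaled, shifted wavelet
    return y_hat
```

With six lagged wind speeds as inputs, `x` would have length 6; the optimization algorithms discussed above would then search over the initial values of `u`, `a`, `b` and `w`.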

Singular Spectrum Analysis (SSA)
Singular spectrum analysis (SSA) is based on the dynamic reconstruction of time series. It is a statistical technique associated with the empirical orthogonal function. It is often used for analyzing time series and extracting oscillatory components from the original data. SSA is often used to analyse one-dimensional time series of the form $x_1, x_2, x_3, \dots, x_N$. The trajectory matrix Y is constructed from the primitive sequence X based on a window of length L. The procedure of SSA is described below:
(1) Embedding. Arrange a lag, choose a favorable "window" length L ($2 \le L \le N/2$) and build the trajectory matrix.
(2) Calculate the covariance matrix C of the trajectory matrix, with diagonals corresponding to equal lags. Calculate the eigenvalues $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_L \ge 0$ and the corresponding eigenvectors $E_k$, where $\sqrt{\lambda_1} \ge \sqrt{\lambda_2} \ge \dots \ge \sqrt{\lambda_L} \ge 0$ is called the singular spectrum of the time series and $E_k$ is called the temporal empirical orthogonal function (T-EOF).
(3) Divide the matrices into applicable groups and calculate the sum of each group after the decomposition procedure. The projection $a_{ik}$ of the lagged series Y on $E_k$ is called the time principal component (TPC).
(4) Component reconstruction, which is the most important procedure of SSA. Two parameters, L (the "window" length) and Y (the pattern of grouping the matrices), which are based on the attributes of the primitive sequences and the objective of the final analysis, are vital for the final decomposition result.
The reconstructed component is denoted $X^k_i$. SSA decomposes the original data into m reconstructed series; the first reconstructed series $X^1$ is regarded as the most important one, and the rest are discarded as noise.
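The embedding step and the mapping from a matrix back to a series can be illustrated with a small sketch. It assumes that the trajectory matrix has L rows and N - L + 1 columns of lagged values, and it uses diagonal averaging, a common reconstruction in basic SSA; both choices, along with the function names, are assumptions rather than the exact formulas of this paper. The eigen-decomposition and grouping steps are omitted because they need a linear algebra routine.

```python
def trajectory_matrix(series, window):
    """Embed a series of length N into an L x (N - L + 1) matrix of lagged values."""
    n = len(series)
    if not 2 <= window <= n // 2:
        raise ValueError("window length must satisfy 2 <= L <= N/2")
    k = n - window + 1
    return [[series[i + j] for j in range(k)] for i in range(window)]


def diagonal_average(matrix):
    """Map a matrix back to a series by averaging along its anti-diagonals."""
    rows, cols = len(matrix), len(matrix[0])
    sums = [0.0] * (rows + cols - 1)
    counts = [0] * (rows + cols - 1)
    for i in range(rows):
        for j in range(cols):
            sums[i + j] += matrix[i][j]   # entries with equal i + j share a time index
            counts[i + j] += 1
    return [s / c for s, c in zip(sums, counts)]
```

Applying `diagonal_average` to the full trajectory matrix returns the original series unchanged. In a complete SSA, the same averaging would be applied to the sum of the matrices in the retained group, which yields the de-noised series.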

The Hybrid Models SSA-CS-WNN, SSA-GA-WNN, and SSA-PSO-WNN
It is difficult for a single WNN to achieve desirable wind speed forecasting results, even though the WNN is suitable for handling small samples or high-dimensional complex problems. Moreover, the irregularity and nonlinearity of wind speed data cause further difficulties in the wind speed prediction procedure. To address the shortcoming that an individual model cannot entirely integrate the information contained in real problem records, three optimization methodologies (CS, GA and PSO) are used in this study to assign the number of wavelet nodes and related parameters such as initial values. We use SSA to reconstruct the original series and obtain the de-noised sequences because it has been confirmed to be a promising method for removing the noise from the original wind speed series. The results of the applied models after the SSA de-noising procedure have a higher accuracy than those of the same models without the de-noising procedure.

Combined Model
Recent studies have predominantly focused on short-term wind speed forecasting ranging from minutes to hours because of the importance of these forecasts for power systems. Various attempts have been made to use hybrid methods for short-term wind forecasting. The combined approaches most commonly seen in the literature are data pre-processing-based approaches, parameter-optimization-based approaches and weighting-based approaches [34]. Combination forecasts can be used to enhance the eventual prediction results because they can integrate single forecasting models and make use of the component forecasts. Figure 2 shows the flowchart for the weighting-based combined approaches. The main idea of the optimal mix forecasting method can be expressed as a mathematical programming problem in which $Q(w_1, w_2, \dots, w_m)$ represents the objective function and $w_1, w_2, \dots, w_m$ are the weighting coefficients of the different models.

Traditional Combination Forecasting Theory (Weighting-Based Combined Approaches)
Different individual models have different advantages for data forecasting, and each forecast has some degree of significance. A more scientific approach is to combine these single models using proportional weighting coefficients and then to utilize various methods to provide comprehensive information. The traditional combination forecasting approach attempts to find the best weight for each of the combined models based on minimizing MAPE. In this study,

$$\min J = L^T E L \quad \text{s.t.} \quad R^T L = 1,\; L \ge 0$$

where $L = (l_1, l_2, \dots, l_m)^T$ is the weight vector, $R = (1, 1, \dots, 1)^T$ is a column vector in which all elements are 1, and $E_{ij} = e_i^T e_j$, where $e_i = (e_{i1}, e_{i2}, \dots, e_{iN})^T$. $E = (E_{ij})_{m \times m}$ is the error information matrix; J represents the MAPE; $e_{it} = x_t - \hat{x}_{it}$ is the error of the ith method at time t; and $\hat{x}_{it}$ represents the forecast value of the ith method at time t.
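For intuition, the constrained problem has a closed-form solution when only two models are combined. The sketch below assumes the quadratic objective $L^T E L$ with weights that sum to 1 and are non-negative; it is an illustration of the traditional approach for the two-model case, not the three-model procedure used in this study, and the function name is hypothetical.

```python
def traditional_weights_two_models(actual, forecast_1, forecast_2):
    """Non-negative weights summing to 1 that minimize L^T E L for two models."""
    e1 = [x - f for x, f in zip(actual, forecast_1)]   # errors of model 1
    e2 = [x - f for x, f in zip(actual, forecast_2)]   # errors of model 2
    e11 = sum(a * a for a in e1)
    e22 = sum(b * b for b in e2)
    e12 = sum(a * b for a, b in zip(e1, e2))
    denom = e11 + e22 - 2.0 * e12      # squared norm of (e1 - e2), never negative
    if denom == 0.0:
        return 0.5, 0.5                # identical errors: any split is optimal
    w1 = (e22 - e12) / denom           # unconstrained minimizer with w1 + w2 = 1
    w1 = min(1.0, max(0.0, w1))        # the objective is convex, so clipping is exact
    return w1, 1.0 - w1
```

If one model over-forecasts by the same amount that the other under-forecasts, the routine returns equal weights and the combined errors cancel. The clipping step is where the non-negativity constraint takes effect: it prevents the combination from assigning a negative weight to either model.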

Artificial Intelligence Algorithms
In addition to the above traditional methods, artificial intelligence optimization algorithms have been used in many approaches [35]. To find the optimal forecasts, this study proposes using the particle swarm optimization algorithm to determine the weighting coefficients. Combined forecasting models can also be divided into variable weight combination forecasting methods and invariable weight combination forecasting methods based on whether the weight changes over time. The method in this paper is based on the minimum mean absolute percentage error (MAPE) and belongs to the constant weight combination methods. This section provides a weight determination method that was assessed by experimental simulation rather than by a theoretical proof.
After repeated experiments, it was found that the sum of the weights is not precisely equal to 1 but only approximates that value. In addition, the weights may take negative values. The amended method is expressed in Equation (9), in which the weight vector is not limited to the range [0, 1]. After repeated experiments, we found that a weight vector with values in the range [-1, 1] can generate desirable results.
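The effect of relaxing the weight constraints can be illustrated with the objective that an optimizer such as PSO would minimize. The sketch below is an assumption about how that fitness could be coded, not the formulation of Equation (9): it evaluates the MAPE of a weighted combination and rejects weights outside [-1, 1]. The function names and the toy forecasts are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def combine(forecasts, weights):
    """Weighted sum of the component forecasts at each time step."""
    return [sum(w * f[t] for w, f in zip(weights, forecasts))
            for t in range(len(forecasts[0]))]


def weight_fitness(weights, actual, forecasts, low=-1.0, high=1.0):
    """Fitness for a weight vector: MAPE of the combination, inf if out of range."""
    if any(w < low or w > high for w in weights):
        return float("inf")
    return mape(actual, combine(forecasts, weights))


# Toy data: the third model's error is the sum of the other two models' errors.
actual = [10.0, 12.0, 8.0, 11.0]
forecasts = [
    [11.0, 11.0, 8.5, 10.5],
    [10.5, 12.5, 7.0, 12.0],
    [11.5, 11.5, 7.5, 11.5],
]
print(weight_fitness((1.0, 1.0, -1.0), actual, forecasts))   # negative weight allowed
print(weight_fitness((1 / 3, 1 / 3, 1 / 3), actual, forecasts))
```

In this toy case the weights (1, 1, -1) sum to 1 and cancel the errors exactly, whereas equal non-negative weights leave a residual error. Real forecast errors are not so neatly related, which is why the weights are searched for numerically.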

Experimental Design, Results and Discussion
In this section, several cases are presented to demonstrate the effectiveness of the proposed hybrid approach through comparisons with other models. These studies are presented in four sequential sections: data collection, forecast performance evaluation criteria, simulation forecast procedure, and comparison and discussion.

Data Set
The proposed SSA-PSO-DWCM combined model was tested by forecasting the wind speed (in 10-min increments) of wind turbine 5 located in the Penglai region of China. A simple map of the study area is shown in Figure 3. In this section, several studies are presented to illustrate the effectiveness of the proposed combined approach through comparisons with other models. To examine the stability of the combined method, we present our analysis of four days of data from four quarters. Because the wind speed time series includes some uncertainty and some parameters of the combined method have no defined value, we make the following assumptions:
(1) Due to the highly random nature of wind speed processes, the experimental data have been randomly selected from four quarters, and the experimental results are regarded as general results.
(2) For ease of plotting, T (the period of the time series) is 144.
4 shows the properties of the raw data from Penglai wind farm. In the prediction method, the six previous 10-min data points are used to forecast the next step value, and the latest predicted value is then replaced with the actual value (see Figure 3, part (a)).
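The prediction scheme described above, in which six previous observations predict the next value and each forecast is replaced by the actual observation before the window moves on, can be sketched as follows. The function names and the `predict` callback are hypothetical; any trained model that maps six values to one could be passed in.

```python
def make_windows(series, n_lags=6):
    """Build (inputs, targets) pairs: n_lags consecutive values predict the next one."""
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append(list(series[t - n_lags:t]))
        targets.append(series[t])
    return inputs, targets


def rolling_one_step_forecast(series, predict, start, n_lags=6):
    """One-step-ahead forecasts for t = start, ..., len(series) - 1.

    Every window is taken from the observed series, so a forecast is never
    fed back as an input; the actual value replaces it in the next window.
    """
    if start < n_lags:
        raise ValueError("start must leave room for n_lags earlier observations")
    return [predict(list(series[t - n_lags:t])) for t in range(start, len(series))]
```

As a usage example, a persistence model `lambda window: window[-1]` simply repeats the most recent observation and gives a baseline against which a trained network can be compared.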

Evaluation Indices for Forecasting Performance
Many performance measures have been applied in previous approaches to evaluate the forecast accuracy, but no single measure can be regarded as the common estimation criterion. For this reason, we select several representative indicators to evaluate the quality of these algorithms. In this paper, three evaluation criteria are used: mean absolute error (MAE), Equation (10); mean square error (MSE), Equation (11); and mean absolute percentage error (MAPE), Equation (12).
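The three criteria are defined as follows:

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} \left| y_i - \hat{y}_i \right| \tag{10}$$

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2 \tag{11}$$

$$\mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\% \tag{12}$$

In the above formulas, N is the size of the test data; $\hat{y}_i$ represents the forecast result for time period i, whereas $y_i$ represents the actual wind speed for the same time period. Of these three criteria, MAPE is regarded as the main estimation index in this paper because it is a unit-free measure of accuracy for the predicted wind series and is sensitive to small changes in the data.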
Generally, the forecasting error is closely related to the purpose of the research and the characteristics of the original series. The shorter the output length or the smoother the wind speed series is, the smaller the forecasting errors. Otherwise, the forecasting errors will be larger [36].
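The three evaluation indices can be computed with a few lines of Python. The sketch uses the standard definitions of MAE, MSE and MAPE (the latter expressed in percent) and assumes that no actual wind speed is exactly zero; the function names are illustrative.

```python
def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(y - f) for y, f in zip(actual, forecast)) / len(actual)


def mse(actual, forecast):
    """Mean square error."""
    return sum((y - f) ** 2 for y, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((y - f) / y) for y, f in zip(actual, forecast)) / len(actual)


actual = [2.0, 4.0]
forecast = [1.0, 5.0]
print(mae(actual, forecast), mse(actual, forecast), mape(actual, forecast))
# prints 1.0 1.0 37.5
```

The example shows why MAPE is unit-free: the same absolute error of 1 counts as 50% at a wind speed of 2 but only as 25% at a wind speed of 4.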

Forecasting Procedure
This paper employs 3000 samples ranging from 00:10 on 6 June to 20:00 on 26 June 2011 to simulate the models and regards the raw data of the Penglai region as a random series. Then, the models are employed to forecast the wind speed for four different days drawn from four different quarters. The experimental process consisted of several steps as follows:
Step 1: Execute Wavelet Neural Network (WNN) method forecasts and collect the results (for four quarters of wind turbine 5).
Step 2: Run three hybrid models PSO-WNN, CS-WNN and GA-WNN to forecast wind speed.
Step 3: Combine the three hybrid forecast models by using the traditional combination method.
Step 4: Combine the three hybrid forecast models based on the PSO-determined weighting coefficient method.
Step 5: Use SSA to filter the raw wind speed data to decrease its non-stationarity. Then, use the de-noised data to rerun the models following the above Steps 1-4. The flowchart of the combined method SSA-PSO-DWCM is shown in Figure 4.

Analysis of Forecast Results and Comparisons of Different Models
Considering the randomness of the optimization methods, each program was executed 10 times. The maximum and minimum values of the indexes for each quarter and all experiments are presented in Tables 5 and 6. To facilitate the analysis and discussion of the proposed combined model, 10 other models for short-term wind speed forecasting are employed for comparison and assessment of the prediction performance in this subsection. From the first quarter's simulation results, we can conclude that the single WNN shows the largest fluctuation and the highest MAPE, which ranges from 10.80% to 15.52%. After the three optimization algorithms are applied, the MAPE becomes steadier and decreases to some extent. The MAPE of PSO-WNN ranges from 9.72% to 10.13%, that of CS-WNN ranges from 9.87% to 10.81%, and that of GA-WNN ranges from 10.81% to 16.49%. In the SSA-WNN, SSA-PSO-WNN, SSA-CS-WNN and SSA-GA-WNN models, the MAPE decreased significantly. The three hybrid models' forecast results for four quarters are highlighted in Figure 1, part (e). The final forecasting results illustrate that decomposing the raw wind speed signals by SSA can not only improve the forecasting accuracy but can also lower the fluctuation of the MAPE. The above conclusions can also be drawn from the results for the other quarters in Tables 5 and 6. The evaluation index results for different forecasting methods are compared in Tables 7-10; the first six rows of these four tables present the forecasts without decomposition. MAE, MSE and MAPE are used to monitor the forecasting accuracy. Wind speed in every quarter was forecast using 10 models to compare the forecasting accuracy; comparisons of MAPE for different models are shown in Figure 5, part (a).
From the first six rows of Tables 7-10, we can see that the individual WNN has the lowest accuracy and that better performance is provided by the three hybrid optimization models PSO-WNN, CS-WNN and GA-WNN. However, we find that the forecasting accuracy of the Traditional Combined Method is low compared with the three hybrid optimization models. This situation occurs because the Traditional Combined Method cannot integrate all of the advantages of the hybrid models. In Table 7, the MAPE of the PSO-DWCM model is 9.30%, which is 3.00, 0.42, 0.57 and 1.64 percentage points lower than that of the WNN, PSO-WNN, CS-WNN and GA-WNN models, respectively. These data indicate that the PSO-DWCM is a viable method to exploit the advantages of different models. The other three quarters also support the above conclusions.
The only two parameters in the SSA that we must select are L and Y. The range of L is $2 \le L \le N/2$, and the number of elements in the decomposed series is N = 3150. After repeated experiments, we found that the final results change little for different values of L.
To define the value of Y, we use information from the previous wind speed time series data.First, we divide the original data (3150 elements) into two sets: the first set (containing 3000 points) is used to train the model and the second set (containing 150 points) is used to forecast.Second, the WNN forecasting accuracy is obtained for many experiments by adjusting the Y value in increments of 10 interval.Finally, we obtain the value of Y that provides the best performance.Based on the above simulations, L = 1000 and Y = [1:180] were chosen.The procedure is shown in Figure 6.This paper used the Correlation Coefficient (R) Equation (13) to depict the relationship between the original series and the decomposed series and the Relative Error (RE) Equation ( 14) and the Root Mean Square Error (RMSE) Equation (15)   Rows 7-12 of Tables 7-10 represent the forecasts obtained using the decomposed samples.It clearly shows that the models reconstructed by SSA perform better than the models using the original data.The largest improvement in forecasting accuracy is determined by the de-noising procedure.Finally, the MAPE of the SSA-PSO-DWCM method in the first quarter is 6.52%, which is a decrease of 5.78% compared to the single model WNN.This value illustrates a great reduction in forecasting accuracy.The simulation results for the other three quarters also support the above views.Furthermore, SSA-PSO-DWCM shows stronger forecasting capability compared to the SSA-Traditional combined method, because the novel combination method is more reasonable, more scientific, and more applicable to practical problems than no negative constraint theory combination The correlation coefficient between the decomposed data and the original data is more than 98%, the relative error and the root mean square error are only approximately 0.6% and 0.42% as shown in Table 11.These results illustrate that SSA is an effective method for extracting information.Rows 7-12 of Tables 7-10 represent the 
forecasts obtained using the decomposed samples.It clearly shows that the models reconstructed by SSA perform better than the models using the original data.The largest improvement in forecasting accuracy is determined by the de-noising procedure.Finally, the MAPE of the SSA-PSO-DWCM method in the first quarter is 6.52%, which is a decrease of 5.78% compared to the single model WNN.This value illustrates a great reduction in forecasting accuracy.The simulation results for the other three quarters also support the above views.Furthermore, SSA-PSO-DWCM shows stronger forecasting capability compared to the SSA-Traditional combined method, because the novel combination method is more reasonable, more scientific, and more applicable to practical problems than no negative constraint theory combination models.A comparison of forecasting results between WNN and SSA-PSO-DWCM for four quarters is shown in Figure 5, part (b).
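The SSA de-noising step discussed above (embedding with window length L, decomposition, and reconstruction from a selected group of components) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `ssa_denoise` and its parameters are hypothetical, and the sketch assumes that Y = [1:180] means keeping the leading 180 components.

```python
import numpy as np

def ssa_denoise(series, window, n_components):
    """Rebuild `series` from its leading SSA components (illustrative sketch)."""
    x = np.asarray(series, dtype=float)
    n = x.size
    if not 2 <= window <= n // 2:
        raise ValueError("window must satisfy 2 <= L <= N/2")
    k = n - window + 1
    # Embedding: column j of the trajectory matrix is x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    # Decomposition: singular values are returned in decreasing order.
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    r = min(n_components, s.size)
    # Grouping (assumption): keep only the leading r components.
    approx = (u[:, :r] * s[:r]) @ vt[:r]
    # Diagonal averaging: entry (i, j) contributes to position i + j.
    recon = np.zeros(n)
    counts = np.zeros(n)
    for j in range(k):
        recon[j:j + window] += approx[:, j]
        counts[j:j + window] += 1
    return recon / counts
```

Under these assumptions, the setting reported above would correspond to `ssa_denoise(wind_speed, window=1000, n_components=180)`, where `wind_speed` is a hypothetical array holding the 3150 observations.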

Analysis of Different Weighting Coefficients
In this paper, the traditional method and the PSO optimization method are employed to optimize the weighting coefficients. The weighting coefficients of the different hybrid models were calculated with each determination method, and the results are shown in Table 12. The weighting coefficients determined by the traditional combined method have two characteristics: the sum of the three weights is equal to 1, and each weighting coefficient is larger than 0. In contrast, when the three weighting coefficients are optimized by the artificial intelligence algorithm PSO, their sum is close to 1 and each coefficient ranges from −1 to 1. The results illustrate that the PSO algorithm can effectively amplify the advantages and offset the drawbacks of the different models.
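To make the contrast concrete, the sketch below searches for combination weights in [−1, 1] with a basic particle swarm. It is a minimal illustration under stated assumptions, not the paper's procedure: the objective is taken to be the MAPE of the weighted forecast, the inertia and acceleration constants (0.7, 1.5, 1.5) are conventional illustrative values, and the names `mape` and `pso_weights` are hypothetical. One particle starts at equal weights, so the result is never worse than a simple average on the fitting data.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100.0)

def pso_weights(forecasts, actual, n_particles=30, n_iter=200, seed=0):
    """Search weights in [-1, 1] that minimise the MAPE of forecasts @ w."""
    rng = np.random.default_rng(seed)
    m = forecasts.shape[1]  # number of component models

    def cost(w):
        return mape(actual, forecasts @ w)

    pos = rng.uniform(-1.0, 1.0, size=(n_particles, m))
    pos[0] = 1.0 / m  # assumption: seed one particle with equal weights
    vel = np.zeros_like(pos)
    best_pos = pos.copy()
    best_cost = np.array([cost(w) for w in pos])
    g = best_pos[np.argmin(best_cost)].copy()
    for _ in range(n_iter):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Velocity update: inertia + pull to personal best + pull to global best.
        vel = 0.7 * vel + 1.5 * r1 * (best_pos - pos) + 1.5 * r2 * (g - pos)
        pos = np.clip(pos + vel, -1.0, 1.0)  # keep each weight inside [-1, 1]
        costs = np.array([cost(w) for w in pos])
        improved = costs < best_cost
        best_pos[improved] = pos[improved]
        best_cost[improved] = costs[improved]
        g = best_pos[np.argmin(best_cost)].copy()
    return g, float(best_cost.min())
```

Here `forecasts` is a hypothetical array with one column per hybrid model and `actual` holds the observed wind speeds. Unlike the traditional scheme, nothing forces the weights to be positive or to sum exactly to 1.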

Conclusions
Wind speed forecasting plays an indispensable role in wind-related engineering studies and is important in the management of wind farms. Accurate forecasts have a significant influence on the economy and on energy-saving measures. However, properties such as nonlinearity and non-stationarity pose great challenges for wind speed prediction. Many studies have attempted to understand and successfully implement a forecasting procedure, but many of these approaches are not suitable for application to various wind speed time series. This study provides a comprehensive presentation of the combined theories and then proposes a novel combined forecasting model (SSA-PSO-DWCM) to forecast future wind speed. Data from four quarters were used to validate the stability of the model. The first step of the combined model is SSA filtering of the original wind speed data. Then, the WNN model, improved by the GA, PSO and CS optimization algorithms, is used to forecast the new wind speed series. Finally, the combined model is integrated using weighting coefficients calculated by the PSO algorithm. Based on the evaluation criterion MAPE in all cases of this study, several conclusions are presented as follows: (a) the SSA de-noising procedure produces a remarkable decrease in MAPE; (b) improving the WNN with the PSO, GA and CS algorithms gives better forecasting performance than the individual WNN model; (c) in the different comparisons, the combined model SSA-PSO-DWCM obtains the highest forecasting accuracy and is the least sensitive of the models proposed in this paper. Therefore, the proposed combined model integrates the advantages of different models and is very useful for the wind energy sector, for example in the management of large wind farms, in avoiding power grid collapse and in reducing production costs. In addition, this combined model can be generalized to other areas, such as electric load forecasting, product demand forecasting and traffic flow forecasting. Moreover, as a new type of optimization strategy, the combined method has excellent prospects. A series of further approaches can be proposed to improve its accuracy and stability, for instance, an intersection optimal algorithm.

Figure 1. Comprehensive presentation of three optimization algorithms and the forecasting method: (a) Genetic Algorithm pseudo-code and flowchart; (b) Particle Swarm Optimization pseudo-code and flowchart; (c) Cuckoo Search pseudo-code and flowchart; (d) structure of the WNN; (e) forecasting results of three hybrid models for four quarters.

Here, $Q(w_1, w_2, \ldots, w_m)$ represents the objective function, and $w_1, w_2, \ldots, w_m$ are the weighting coefficients of the different models.

Figure 2. Flowchart for the weighting-based combined approach.

Figure 4. Location of Penglai wind farm in China and statistical properties of the original data.

Figure 3. Flowchart of the combined model SSA-PSO-DWCM: (a) a brief illustration of the prediction method; (b) structure of the WNN and image of the Morlet wavelet function; (c) three hybrid models: SSA-PSO-WNN, SSA-GA-WNN and SSA-CS-WNN.

Figure 6. First quarter forecasting results obtained using SSA.

Table 5. Maximum and minimum index values for the first and second quarters in all cases.

Table 6. Maximum and minimum index values for the third and fourth quarters in all cases.

Table 7. Evaluation indices of different models in the first quarter for wind turbine 5.

Table 8. Evaluation indices of different models in the second quarter for wind turbine 5.

Table 9. Evaluation indices of different models in the third quarter for wind turbine 5.

Table 10. Evaluation indices of different models in the fourth quarter for wind turbine 5.
RE and RMSE measure the deviation between the observed values and the true values. A larger R and smaller RE and RMSE indicate closer agreement between the de-noised data and the original data. Here, $y_t$ and $y$ represent the de-noised data and the original data, respectively; $\mathrm{cov}(y_t, y)$ is the covariance between $y_t$ and $y$; and $\sigma_{y_t}$ and $\sigma_y$ represent the standard deviations of $y_t$ and $y$, respectively.
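The three indices can be computed along the following lines. This is a hedged sketch: Equations (13)-(15) are not reproduced in this section, so the code assumes the standard Pearson form for R, a mean absolute relative deviation for RE, and the usual definition of RMSE. The function names are hypothetical.

```python
import numpy as np

def correlation(denoised, original):
    """R = cov(y_t, y) / (sigma_{y_t} * sigma_y), population form."""
    yt = np.asarray(denoised, dtype=float)
    y = np.asarray(original, dtype=float)
    cov = np.mean((yt - yt.mean()) * (y - y.mean()))
    return float(cov / (yt.std() * y.std()))

def relative_error(denoised, original):
    """Assumed form of RE: mean of |y_t - y| / |y|."""
    yt = np.asarray(denoised, dtype=float)
    y = np.asarray(original, dtype=float)
    return float(np.mean(np.abs(yt - y) / np.abs(y)))

def rmse(denoised, original):
    """Root mean square error between the two series."""
    yt = np.asarray(denoised, dtype=float)
    y = np.asarray(original, dtype=float)
    return float(np.sqrt(np.mean((yt - y) ** 2)))
```

If the paper's Equation (14) defines RE differently (for example, as a ratio of sums), only `relative_error` would need to change.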

Table 11. Correlation index between the de-noised data and the original data.

Table 12. Different weighting coefficients determined by the traditional method and the PSO method.