Wind Energy Potential Assessment and Forecasting Research Based on the Data Pre-Processing Technique and Swarm Intelligent Optimization Algorithms

Accurate assessment and forecasting of wind energy potential are essential to optimal wind farm design, evaluation and scheduling. However, both remain difficult and challenging research topics. Traditional wind energy assessment and forecasting models usually ignore data pre-processing and parameter optimization, which leads to low accuracy. Therefore, this paper aims to assess the potential of wind energy and forecast the wind speed in four locations in China based on a data pre-processing technique and swarm intelligent optimization algorithms. In the assessment stage, the cuckoo search (CS) algorithm, ant colony (AC) algorithm, firefly algorithm (FA) and genetic algorithm (GA) are used to estimate the two unknown parameters of the Weibull distribution. Then, the wind energy potential assessment results obtained by three data pre-processing approaches are compared to identify the best approach, which is used to process the original wind speed time series. In the forecasting stage, taking the pre-processed wind speed time series as the original data, the CS and AC optimization algorithms are adopted to optimize three neural networks, namely, the Elman neural network, back propagation neural network, and wavelet neural network. The comparison results demonstrate that the proposed wind energy assessment and wind speed forecasting techniques produce promising assessments and predictions and perform better than the single assessment and forecasting components.


Introduction
As a clean and renewable resource, wind energy is important in energy supply, and wind turbines convert it to electricity. However, not all locations are suitable for wind turbine installation. As a result, wind energy assessment should be performed in advance. Furthermore, to guarantee the safety of wind energy, the accuracy of wind speed forecasting should be ensured. Wind energy assessment and wind speed forecasting are two challenging research topics at present.
Wind energy assessment plays a significant role in wind turbine installation decisions in many countries worldwide, and the technologies used to assess wind energy potential are varied. Based on different moment constraints, Liu and Chang [1] performed a validity analysis of the maximum entropy distribution for wind energy assessment in Taiwan. A nested ensemble numerical weather prediction approach was proposed by Al-Yahyai et al. [2] to perform a wind energy assessment over Oman. Wu et al. [3] proposed an assessment model based on the Weibull distribution and different particle swarm optimization algorithms as well as differential evolution algorithms to assess the wind energy potential of Inner Mongolia in China. Jung and Kwon [4] introduced artificial neural networks to improve the wind energy potential estimation for four sites surrounding the Saemangeum Seawall. The wind analysis model was adopted by Boudia et al. [5] to assess the wind energy of four locations situated in the Algerian Sahara. Apart from the wind analysis model, Quan and Leephakpreeda [6] also used economic analysis to assess the wind energy potential in Thailand. A GIS-based method was applied by Siyal et al. [7] for wind energy assessment in Sweden.
One of the most vital factors in wind energy assessment is the wind speed, and the quality of the assessment depends directly on the accuracy of the wind speed forecasting. Many techniques have recently been proposed to forecast the wind speed, and they can usually be divided into three categories: short-term wind speed forecasting [8][9][10], medium-term wind speed forecasting [11] and long-term wind speed forecasting. One of the most popular approaches to wind speed forecasting is to construct a hybrid model from several single forecasting approaches. For example, Wang et al. [12] presented a hybrid model with the assistance of the phase space reconstruction algorithm and Markov algorithm. Based on the extreme learning machine, Ljung-Box Q-test and seasonal auto-regressive integrated moving average (ARIMA) models, a hybrid wind speed forecasting model was proposed by Wang et al. [13] to estimate the wind speed of different sites in northwestern China. The ARIMA model was also used by Shukur and Lee [14] to build a hybrid wind speed forecasting model with the Kalman filter and an artificial neural network. Liu et al. [15] demonstrated a hybrid approach using the secondary decomposition model and Elman neural networks. Fei [16] used a hybrid method that consists of the empirical mode decomposition and multiple-kernel relevance vector regression technologies.
In this paper, based on the cuckoo search (CS) algorithm and ant colony (AC) algorithm, two new wind energy assessment models and six wind speed forecasting models are proposed. In the assessment process, the AC and CS algorithms are applied to optimize the two unknown parameters of the Weibull distribution. Then, four assessment error evaluation criteria are adopted to evaluate the effectiveness of the two newly proposed assessment models. In the forecasting process, the CS and AC algorithms are used to optimize three neural networks, namely the Elman, back propagation and wavelet neural networks, and the proposed approaches are validated by three forecasting error evaluation criteria.
The remaining part of this paper is organized as follows: A description of wind energy potential assessment methodologies is given and the results are evaluated in Section 2. Section 3 presents the connection between the energy assessment and forecasting to identify the best data pre-processing approach. The proposed integrated forecasting framework and forecasting results are presented in Section 4, and the last section presents the concluding remarks.

Wind Energy Potential Assessment Methodologies and Results
In this section, related single methodologies as well as the proposed hybrid methods used to assess the wind energy potential are introduced; then, the assessment results are presented to demonstrate the performance of the methods.

Related Methodologies
This subsection focuses on the related single and hybrid methodologies to assess the wind energy potential.

Related Single Methodologies
The main content of two parameter optimization algorithms and the assessment approach will be described in this section.

Parameter Optimization Algorithms

(a) Cuckoo Search Algorithm
The cuckoo search (CS) algorithm [17] is derived from the behavior of cuckoos in searching for nests. To simplify the CS algorithm, three idealized rules are assumed. First, each cuckoo lays only one egg at a time and randomly selects a parasitic nest in which to hatch it. Second, among the randomly selected parasitic nests, the best nest is retained for the next generation. Third, the number of available parasitic nests is fixed, and the host of a parasitic nest discovers the alien egg with probability $p_a \in [0, 1]$. Once an alien egg has been found, the host bird either throws it out or abandons the nest and builds a new one elsewhere. For simplicity, one egg in a nest represents a solution, and new and potentially better solutions replace the bad ones.
On the basis of these three idealized rules, a new solution is generated by Equation (1):

$$x_i^{(t+1)} = x_i^{(t)} + \alpha \times \text{Lévy}(\lambda) \tag{1}$$

where $x_i^{(t)}$ is the position of the $i$th nest at generation $t$, $\alpha$ is the step size, which in most cases is set to $\alpha = 1$, and the symbol "×" denotes entry-wise multiplication. In essence, Equation (1) is a random walk equation: the future position is determined by the current position (the first term in Equation (1)) and the transition probability (the second term in Equation (1)). Lévy in Equation (1) denotes the random search path, whose random step length follows the Lévy distribution in Equation (2):

$$\text{Lévy} \sim u = t^{-\lambda} \tag{2}$$

where $\lambda$ takes values in the interval (1, 3].
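The random walk of Equation (1) can be illustrated with a short sketch. It is not the authors' Algorithm 1: the Lévy step is drawn with Mantegna's algorithm (a common sampler, assumed here, with `beta = lambda - 1`), and the names `levy_step` and `cuckoo_search`, the population size and the default parameter values are all illustrative choices.

```python
import math
import random


def levy_step(beta, rng):
    """Heavy-tailed step via Mantegna's algorithm (assumed sampler; lambda = 1 + beta)."""
    num = math.gamma(1.0 + beta) * math.sin(math.pi * beta / 2.0)
    den = math.gamma((1.0 + beta) / 2.0) * beta * 2.0 ** ((beta - 1.0) / 2.0)
    sigma_u = (num / den) ** (1.0 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = abs(rng.gauss(0.0, 1.0)) or 1e-12  # guard against an exact zero
    return u / v ** (1.0 / beta)


def cuckoo_search(objective, bounds, n_nests=15, p_a=0.25, alpha=1.0,
                  beta=1.5, n_iter=200, seed=0):
    """Minimise `objective` over the box `bounds` with a basic cuckoo search."""
    rng = random.Random(seed)

    def random_nest():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def clip(value, lo, hi):
        return max(lo, min(hi, value))

    nests = [random_nest() for _ in range(n_nests)]
    costs = [objective(x) for x in nests]
    for _ in range(n_iter):
        # Equation (1): new position = current position + alpha * Levy step.
        i = rng.randrange(n_nests)
        candidate = [clip(x + alpha * levy_step(beta, rng), lo, hi)
                     for x, (lo, hi) in zip(nests[i], bounds)]
        cost = objective(candidate)
        j = rng.randrange(n_nests)  # the egg is dropped into a random nest
        if cost < costs[j]:
            nests[j], costs[j] = candidate, cost
        # Each nest except the current best is discovered with probability p_a
        # and rebuilt at a random location.
        best = min(range(n_nests), key=costs.__getitem__)
        for k in range(n_nests):
            if k != best and rng.random() < p_a:
                nests[k] = random_nest()
                costs[k] = objective(nests[k])
    best = min(range(n_nests), key=costs.__getitem__)
    return nests[best], costs[best]
```

Because the best nest is never abandoned and a nest is only overwritten by a better candidate, the best cost found cannot get worse from one generation to the next.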

(b) Ant Colony Algorithm
The ant colony (AC) algorithm was proposed by the Italian scientist M. Dorigo and colleagues in 1991. To facilitate the research, the following assumptions are made [18]: (1) the communication media used by the ants are the pheromone and the environment; (2) the response of an ant to the environment is determined by its internal mode; (3) the individual ants are independent; and (4) the entire ant colony shows a random characteristic.
Through adaptation and collaboration in two stages, the ants move from a disordered state to an ordered one and obtain the optimum path. The key point of path selection is the transition probability, i.e., the probability that the $k$th ant moves from the $i$th city to the $j$th city at time $t$, which is calculated by Equation (3) [19]:

$$p_{ij}^{k}(t) = \begin{cases} \dfrac{[\tau_{ij}(t)]^{\alpha} [\eta_{ij}(t)]^{\beta}}{\sum_{s \in \text{allowed}_k} [\tau_{is}(t)]^{\alpha} [\eta_{is}(t)]^{\beta}}, & j \in \text{allowed}_k \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

where $\tau_{ij}(t)$ and $\eta_{ij}(t)$ represent the intensity of the pheromone trail and the visibility of edge $(i, j)$, respectively; $\text{allowed}_k$ is the set of cities still to be visited by the $k$th ant when it is in the $i$th city; and $\alpha$ and $\beta$ are two coefficients that tune the relative importance of the trail versus the visibility.
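As a small illustration of the transition rule in Equation (3), the sketch below computes the probabilities for one ant standing in city `i`. The function name, the nested-list layout of `tau` and `eta`, and the default exponents are assumptions made for the example, not values taken from the paper.

```python
def transition_probabilities(i, allowed, tau, eta, alpha=1.0, beta=2.0):
    """Probability of moving from city i to each city j in `allowed`.

    tau[i][j] is the pheromone intensity and eta[i][j] the visibility of
    edge (i, j); cities outside `allowed` implicitly get probability 0.
    """
    weights = {j: (tau[i][j] ** alpha) * (eta[i][j] ** beta) for j in allowed}
    total = sum(weights.values())
    if total == 0.0:
        raise ValueError("no admissible move: all edge weights are zero")
    return {j: w / total for j, w in weights.items()}


# Three cities with uniform pheromone; visibility is taken as 1 / distance,
# with city 1 at distance 1 and city 2 at distance 2 from city 0.
tau = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
eta = [[0.0, 1.0, 0.5], [1.0, 0.0, 1.0], [0.5, 1.0, 0.0]]
probs = transition_probabilities(0, allowed=[1, 2], tau=tau, eta=eta)
```

With equal pheromone on both edges, the nearer city is favoured purely through the visibility term; raising `alpha` relative to `beta` would shift the weight toward the pheromone trail instead.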

Assessment Approach
The Weibull distribution is used in this paper to assess the potential wind energy. The probability density function (PDF) of the Weibull distribution is given by Equation (4):

$$f(x; k, c) = \frac{k}{c} \left( \frac{x}{c} \right)^{k-1} \exp\left[ -\left( \frac{x}{c} \right)^{k} \right], \quad x \ge 0 \tag{4}$$

where $x$ is the random variable, which represents the wind speed in this paper, and $k$ and $c$ are the shape and scale parameters, respectively.
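A minimal sketch of the two-parameter Weibull PDF follows, together with a numerical check that it integrates to approximately one. The helper names and the example values `k = 2`, `c = 8` are illustrative assumptions, not fitted values from the paper.

```python
import math


def weibull_pdf(x, k, c):
    """Two-parameter Weibull density with shape k and scale c.

    The density is taken as 0 for non-positive speeds, which is a
    simplification at exactly x = 0.
    """
    if k <= 0 or c <= 0:
        raise ValueError("shape k and scale c must be positive")
    if x <= 0:
        return 0.0
    z = x / c
    return (k / c) * z ** (k - 1.0) * math.exp(-(z ** k))


def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n equal sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# The density should integrate to about 1 over a range that covers
# essentially all of the probability mass.
area = trapezoid(lambda v: weibull_pdf(v, 2.0, 8.0), 0.0, 60.0, 6000)
```

For `k = 1` the expression reduces to the exponential density with mean `c`, which is a convenient sanity check on the implementation.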

Proposed Wind Energy Potential Assessment Model
In this paper, the CS algorithm is used to estimate the unknown parameters $k$ and $c$ of the Weibull distribution; the resulting model is abbreviated as the CS-Weibull model, and its pseudo code is presented in Algorithm 1. Similarly, the AC algorithm is adopted to estimate the two parameters, giving the AC-Weibull model, whose pseudo code is presented in Algorithm 2.
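The kind of fitness function such a parameter search could minimise is sketched below. The paper's text does not state the objective used in Algorithms 1 and 2, so the choice here, the sum of squared errors between a histogram density of the wind speeds and the Weibull PDF at the bin centres, is an assumption, as are the function names and the bin width.

```python
import math
import random


def weibull_pdf(x, k, c):
    """Two-parameter Weibull density; taken as 0 for x <= 0."""
    if x <= 0:
        return 0.0
    z = x / c
    return (k / c) * z ** (k - 1.0) * math.exp(-(z ** k))


def histogram_density(speeds, bin_width):
    """Empirical density of non-negative speeds on equal-width bins from 0."""
    n_bins = int(max(speeds) / bin_width) + 1
    counts = [0] * n_bins
    for v in speeds:
        counts[min(int(v / bin_width), n_bins - 1)] += 1
    n = len(speeds)
    centres = [(b + 0.5) * bin_width for b in range(n_bins)]
    density = [cnt / (n * bin_width) for cnt in counts]
    return centres, density


def weibull_sse(params, centres, density):
    """Assumed fitness: squared error between histogram and Weibull PDF."""
    k, c = params
    if k <= 0 or c <= 0:
        return float("inf")  # reject infeasible candidates
    return sum((d - weibull_pdf(x, k, c)) ** 2
               for x, d in zip(centres, density))


# Synthetic wind speeds with known parameters (scale 8, shape 2).
rng = random.Random(1)
speeds = [rng.weibullvariate(8.0, 2.0) for _ in range(5000)]
centres, density = histogram_density(speeds, bin_width=1.0)
good = weibull_sse((2.0, 8.0), centres, density)
bad = weibull_sse((1.0, 3.0), centres, density)
```

Any of the four optimizers (CS, AC, FA or GA) can then treat `weibull_sse` as a black-box function of the pair `(k, c)`.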

Wind Energy Potential Assessment Case Study
In this paper, wind speed data from 2009 to 2013 are adopted to assess the wind energy in four locations, [125, 40], [122.5, 40], [125, 42.5] and [120, 40], where the first component is the longitude and the second is the latitude. The collected wind speed data are used in two ways:
1. Single-year data application: wind speed data in each single year are analyzed to obtain the yearly assessment results.
2. Whole five-year data application: wind speed data in each season of the five years are analyzed to obtain the seasonal assessment results as well as the whole five-year assessment results.
In addition, beyond the CS-Weibull and AC-Weibull models, an original Weibull model and two other models based on the firefly algorithm (FA) and the genetic algorithm (GA) are introduced to compare the assessment effectiveness. The two additional models are abbreviated as the FA-Weibull and GA-Weibull models, respectively.

Assessment Results in a Single Year
Wind energy assessment is an important indicator of the potential of wind resources and describes the amount of wind energy at various wind speed values in a particular location. In wind energy assessment studies, the common parameter estimation methods include the method of moments, maximum likelihood estimation and least squares estimation, each of which has disadvantages and limitations. The method of moments is simple: knowing the moments of the population is sufficient, and no knowledge of the population distribution is required. However, it can only be used when the population origin moments exist, the moments carry only part of the information in the sample, and the method performs well only when the sample size is large. Maximum likelihood estimation (MLE) estimates the parameters of a statistical model from observations by finding the parameter values that maximize the likelihood of the observations given the parameters. However, MLE requires the sample distribution to be specified, and the likelihood equations are often complicated, so that usually only an approximate solution is obtained by iterative computation; the procedure may also lead to multiple optima or non-optimal solutions. Least squares can be applied to estimate linear and nonlinear relationships, and it does not require probabilistic or statistical information about the observed data. However, least squares has two defects: if the model noise is colored, the least squares estimate is biased, and with increasing data size, "data saturation" appears. Bayesian parameter estimation requires the distribution of the random error to be known, and when the sample size is small, the prior probability has a significant influence on the estimation result (the results of maximum likelihood estimation, the method of moments, least squares estimation and Bayesian parameter estimation are given in Appendix A). In summary, in this paper, the effectiveness of four optimization algorithms (the firefly algorithm, genetic algorithm, ant colony algorithm and cuckoo search algorithm) is evaluated for determining the shape ($k$) and scale ($c$) parameters of the Weibull distribution function used to calculate the wind power density. The comparison of the assessment results shows that the swarm intelligent algorithms achieve an effective assessment performance.
The single-year parameter estimation results of the five models, from 2009 to 2013, are listed in Table 1. With the estimated parameters given in Table 1, the five models are determined, and Figure 1 shows the PDF fitting results for each single year from 2009 to 2013. Based on the PDF fitting results, the following four error evaluation criteria, shown in Equations (5)-(8), are adopted to evaluate the assessment performance:

$$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \tag{5}$$

$$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2} \tag{6}$$

$$\text{SSE} = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 \tag{7}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2} \tag{8}$$

where $y_i$ is the observed value, $\hat{y}_i$ is the forecasted value, $n$ is the number of samples and $\bar{y} = \sum_{i=1}^{n} y_i / n$.
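The four criteria named in the text (MAE, RMSE, SSE and the coefficient of determination) can be computed as in the sketch below. The standard textbook definitions are assumed, and the function names are illustrative.

```python
import math


def mae(y, y_hat):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def sse(y, y_hat):
    """Sum of squared errors."""
    return sum((a - b) ** 2 for a, b in zip(y, y_hat))


def rmse(y, y_hat):
    """Root mean square error."""
    return math.sqrt(sse(y, y_hat) / len(y))


def r_squared(y, y_hat):
    """Coefficient of determination, 1 - SSE / total sum of squares."""
    y_bar = sum(y) / len(y)
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - sse(y, y_hat) / ss_tot


observed = [1.0, 2.0, 3.0]
fitted = [1.0, 2.0, 4.0]
scores = {
    "MAE": mae(observed, fitted),
    "RMSE": rmse(observed, fitted),
    "SSE": sse(observed, fitted),
    "R2": r_squared(observed, fitted),
}
```

Lower MAE, RMSE and SSE indicate a closer fit, whereas a coefficient of determination closer to 1 indicates a better fit, so the two groups of criteria must be read in opposite directions.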
Table 2 provides the single-year assessment performance evaluation results, from 2009 to 2013, of the four optimization algorithms in terms of MAE, RMSE, SSE and $R^2$. Although the descriptive statistics presented in Table 2 provide meaningful statistical analysis, especially regarding the distribution of the wind speed, they cannot be used alone to judge the precision of each optimization algorithm in estimating the parameters of the Weibull distribution. Therefore, the evaluation criteria introduced in Equations (5)-(8) are employed to appraise the performances of the four selected parameter estimation optimization algorithms. Each statistical criterion offers a different useful view for comparing the optimization algorithms, so the combination of all statistical indicators provides an effective way to compare the parameter estimation optimization algorithms for wind power assessment. The accuracy of the assessed wind power density values changes when the parameter estimation optimization algorithm changes. This is apparent for each research site when the four optimization algorithms (CS, GA, FA and AC) are utilized to estimate the parameters of the Weibull distribution, a conclusion drawn from the low error values (MAE, RMSE and SSE) and the high $R^2$ values. On the other hand, the lowest agreement levels are attained when the four algorithms are applied for the $k$ and $c$ parameter calculations. According to the statistical results in Table 2, for the four Chinese wind farm sites, the best results for calculating the wind speed density are achieved when the four optimization algorithms are employed to compute the $k$ and $c$ parameters. For each site, the most precise results are obtained using different optimization algorithms [20].

Seasonal and Whole Five-Year Assessment Results
Considering that wind speed data may differ greatly between years, this section provides seasonal and whole five-year wind energy assessment results obtained by jointly using the wind speed data of the five years from 2009 to 2013. Table 3 lists the seasonal and whole five-year parameter estimation results, and Figure 2 and Table 4 present the PDF fitting and corresponding error results.
The same conclusion can be drawn from these results; i.e., the four proposed models based on the FA, GA, CS and AC algorithms are superior to the original Weibull model. The two-parameter Weibull distribution function has been widely applied to different kinds of wind energy-related investigations due to its simplicity, flexibility and effectiveness. In this paper, the performance of four optimization algorithms (FA, GA, CS and AC) was assessed in optimizing the $k$ and $c$ parameters of the Weibull probability distribution function when calculating the wind power density at four sites in China. The assessments were conducted on both a seasonal and an annual basis to offer a more complete analysis. Both the annual and seasonal results showed that the accuracy of the calculated wind power density values changes when different optimization algorithms are used to determine the $k$ and $c$ parameters of the Weibull distribution. According to the wind energy assessment results from the statistical analysis, the FA, GA, CS and AC algorithms provided a very desirable performance for each site. Another finding concerned the CS and AC algorithms' approach in terms of efficiency. The assessment results show that the most appropriate parameter estimation algorithm was not the same for all examined sites; as a matter of fact, the wind energy properties could be a significant factor in wind energy assessment. Annually and seasonally for Site 1, the CS algorithm was recognized as the more appropriate algorithm, while the FA showed weak performance for wind power assessment. For Site 2, all four optimization algorithms were found to be effective Weibull parameter estimation algorithms for the wind power density in each year and season. For Site 3, the AC showed poor performance for the annual wind power density distribution, and the FA was recognized as the more appropriate method. For Site 4, both the FA and GA performed better for the seasonal wind power density. The suggested parameter estimation methods perform excellently in representing the distribution of the seasonal and annual wind power density as well as in determining different statistical properties of the power density [20].

Connection between Energy Assessment and Forecasting
In recent years, de-noising methods such as ensemble empirical mode decomposition (EEMD), singular spectrum analysis (SSA) and wavelet decomposition (WD) have been widely used to pre-process wind speed time series. Thus far, however, there has been no effective way to choose which de-noising method should be applied to the original wind speed time series. In this section, the wind energy assessment method with the smallest error values is used to choose the best de-noising method for pre-processing the wind speed time series.
Figure 3 presents the PDF fitting results obtained by the three de-noising methods for the four sites, and Table 5 shows the parameter estimation and error results of the de-noised wind speed time series. As seen from Figure 3 and Table 5, the $R^2$ values of the WD de-noising method are the closest to 1 for all sites from Site 1 to Site 4, and the MAE values of the WD de-noising method are the smallest among the three de-noising models. Therefore, in this paper, the WD de-noising method is adopted to pre-process the original wind speed and improve the forecasting accuracy.
Sustainability 2016, 8, 1191


Proposed Integrated Forecasting Framework and Forecasting Results
In this section, three basic neural network forecasting models are first introduced; then, the integrated forecasting framework proposed in this paper is presented. Finally, the forecasting results obtained by the proposed forecasting framework are analyzed.

Basic Neural Network Forecasting Models
Artificial neural networks are widely used in forecasting because they can approximate nonlinear functions with arbitrary accuracy. Three neural network models are introduced in this paper for the wind speed forecasting application.

Back Propagation Neural Network
The back propagation neural network (BPNN) [21] is a multilayer feed-forward neural network. Its two main features are the feed-forward signal and the back-propagated error. In the feed-forward process, the signal is passed layer by layer from the input layer to the hidden layer and then to the output layer, and the state of the neurons in one layer affects only the neurons in the next layer. If the output layer does not produce the expected output, back propagation starts.
Suppose $X_1, X_2, \ldots, X_n$ are the input values of the BPNN, $Y_1, Y_2, \ldots, Y_m$ are the corresponding output values, and $\omega_{ij}$ and $\omega_{jk}$ are the weights. The BPNN can then be viewed as a nonlinear function whose independent and dependent variables are the input and output values, respectively. The BPNN structure in Figure 4 expresses this function mapping from $n$ independent variables to $m$ dependent variables.
The network training is the main task of the BPNN. Through training, the BPNN acquires the capacity for associative memory and forecasting. The training process of the BPNN includes the following steps:
Step 1: Network initialization. Based on the practical problem, determine the number of nodes in the input, hidden and output layers. Then, initialize the connection weights $\omega_{ij}$ and $\omega_{jk}$, the threshold values $\theta_j$ and $\theta_k$ in the hidden and output layers, respectively, the learning rate $\eta$ and the transfer functions.
Step 2: Output calculation of the hidden layer. From the input vector $X = (X_1, X_2, \ldots, X_n)$, the connection weights $\omega_{ij}$ between the input and hidden layers and the threshold value $\theta_j$ in the hidden layer, the output $H_j$ of the hidden layer is calculated by Equation (9), where $l$ is the number of nodes in the hidden layer and $f(\cdot)$ is the transfer function of the hidden layer, which can take a variety of forms; the form adopted in this research is given in Equation (10).
Step 3: Output calculation of the output layer. From the output $H_j$ of the hidden layer, the connection weights $\omega_{jk}$ between the hidden and output layers and the threshold value $\theta_k$ in the output layer, the forecasting output of the BPNN is expressed as Equation (11), where $g(\cdot)$ is the transfer function from the hidden layer to the output layer, defined as Equation (12) in this research.
Step 4: Error calculation. With the predicted output $Y = (Y_1, Y_2, \ldots, Y_m)$ and the desired output $DY = (DY_1, DY_2, \ldots, DY_m)$, the forecasting error of the network is computed by Equation (13), where $P$ is the number of input and output pairs.
Step 5: Weight update. Update the connection weights $\omega_{ij}$ and $\omega_{jk}$ by Equations (14) and (15), where $\eta$ is the learning rate and the remaining terms are given in Equations (16) and (17).
Step 6: Threshold update. Using the forecasting error of the network, the thresholds are updated by Equations (18) and (19).
Step 7: Termination check. Determine whether the termination requirement is met; if so, stop; otherwise, return to Step 2.
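Steps 1 to 7 can be sketched as a small single-hidden-layer network trained by gradient descent. The sketch makes several assumptions that the text above does not fix: a sigmoid hidden transfer function, a linear output layer, thresholds that are subtracted from the weighted sums, a half squared error, and per-sample updates. The class name `TinyBPNN` and all parameter values are illustrative.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyBPNN:
    def __init__(self, n_in, n_hidden, n_out, eta=0.1, seed=0):
        # Step 1: random initial weights and thresholds, fixed learning rate.
        rng = random.Random(seed)
        self.w_ij = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                     for _ in range(n_in)]
        self.w_jk = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)]
                     for _ in range(n_hidden)]
        self.theta_j = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.theta_k = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
        self.eta = eta

    def forward(self, x):
        # Step 2: hidden output; Step 3: linear output layer (assumed).
        h = [sigmoid(sum(x[i] * self.w_ij[i][j] for i in range(len(x)))
                     - self.theta_j[j]) for j in range(len(self.theta_j))]
        y = [sum(h[j] * self.w_jk[j][k] for j in range(len(h)))
             - self.theta_k[k] for k in range(len(self.theta_k))]
        return h, y

    def train_step(self, x, target):
        h, y = self.forward(x)
        # Step 4: output error for this sample.
        e = [t - yk for t, yk in zip(target, y)]
        # Error propagated back through the sigmoid of each hidden node.
        delta = [h[j] * (1.0 - h[j])
                 * sum(self.w_jk[j][k] * e[k] for k in range(len(e)))
                 for j in range(len(h))]
        # Step 5: weight updates.
        for j in range(len(h)):
            for k in range(len(e)):
                self.w_jk[j][k] += self.eta * h[j] * e[k]
        for i in range(len(x)):
            for j in range(len(h)):
                self.w_ij[i][j] += self.eta * delta[j] * x[i]
        # Step 6: threshold updates (opposite sign, since they are subtracted).
        for j in range(len(h)):
            self.theta_j[j] -= self.eta * delta[j]
        for k in range(len(e)):
            self.theta_k[k] -= self.eta * e[k]
        return 0.5 * sum(ek * ek for ek in e)
```

Step 7 corresponds to whatever loop calls `train_step`, for example a fixed number of passes over the training pairs or a stop once the returned error falls below a tolerance.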


Wavelet Neural Network
The wavelet neural network (WNN) [22] is a type of neural network constructed on the basis of the BPNN topology, in which the wavelet basis function serves as the transfer function of the hidden layer nodes. In this type of network, the signal is transferred forward, while the error is propagated backward. Suppose $X_1, X_2, \ldots, X_n$ are the inputs of the network, $Y_1, Y_2, \ldots, Y_m$ are the forecasted outputs, and $\omega_{ij}$ and $\omega_{jk}$ are the weights. The output of the hidden layer can then be represented by Equation (20):

$$h_j = h\left( \frac{\sum_{i=1}^{n} \omega_{ij} X_i - b_j}{a_j} \right), \quad j = 1, 2, \ldots, l \tag{20}$$

where $h_j$ is the output of the $j$th hidden layer node, $\omega_{ij}$ is the connection weight between the input and hidden layers, $h(\cdot)$ is the wavelet function, $b_j$ is the shift factor of the wavelet function, and $a_j$ is the stretch factor of the wavelet function.
The forecasted value of the output layer can be calculated by Equation (21):

$$Y_k = \sum_{j=1}^{l} \omega_{jk} h_j, \quad k = 1, 2, \ldots, m \tag{21}$$

where $\omega_{jk}$ is the weight between the hidden and output layers, $h_j$ is the output of the $j$th hidden layer node, $l$ is the number of nodes in the hidden layer, and $m$ is the number of nodes in the output layer.
The process of the WNN algorithm is as follows:
Step 1: Network initialization. Randomly initialize the stretch factors $a_j$, shift factors $b_j$, network connection weights $\omega_{ij}$ and $\omega_{jk}$, and network learning rate $\eta$.
Step 2: Sample classification. Divide the samples into training and testing samples, which are used to train the network and to test its forecasting accuracy, respectively.
Step 3: Output prediction. Input the training samples into the network and calculate the predicted output of the network as well as the error between the network output and the desired output.
Step 4: Weight correction. Correct the network weights and the parameters of the wavelet function according to the calculated error values so that the predicted values approach the expected values.
Step 5: Termination check. Determine whether the termination condition is satisfied; if not, return to Step 3.
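The forward pass of a WNN (the output prediction in Step 3) can be sketched as follows. The text does not name the mother wavelet, so the Morlet-type function used here is an assumption, as are the function names and the nested-list weight layout.

```python
import math


def morlet(t):
    """Morlet-type mother wavelet (assumed; the paper does not specify one)."""
    return math.cos(1.75 * t) * math.exp(-0.5 * t * t)


def wnn_forward(x, w_ij, w_jk, a, b):
    """Forward pass of a wavelet neural network with one hidden layer.

    w_ij[i][j]: input-to-hidden weights, w_jk[j][k]: hidden-to-output weights,
    a[j]: stretch factors, b[j]: shift factors of the hidden nodes.
    """
    n_hidden = len(a)
    hidden = []
    for j in range(n_hidden):
        s = sum(w_ij[i][j] * x[i] for i in range(len(x)))
        # Shift by b_j, stretch by a_j, then apply the wavelet.
        hidden.append(morlet((s - b[j]) / a[j]))
    n_out = len(w_jk[0])
    output = [sum(w_jk[j][k] * hidden[j] for j in range(n_hidden))
              for k in range(n_out)]
    return hidden, output


# With zero inputs and zero shifts every hidden node evaluates the wavelet
# at 0, so the output is simply the sum of the hidden-to-output weights.
hidden, output = wnn_forward(
    x=[0.0, 0.0],
    w_ij=[[0.3, -0.2], [0.1, 0.4]],
    w_jk=[[0.5], [0.25]],
    a=[1.0, 2.0],
    b=[0.0, 0.0],
)
```

In training (Step 4), the stretch and shift factors are adjusted alongside the weights, which is the main difference from the BPNN update.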

Elman Neural Network
The Elman neural network (ENN) [23] is generally divided into four layers: the input, hidden, context and output layers. The connections between the input, hidden and output layers are similar to those of a feed-forward network. The nodes in the input layer only transmit the signal, while those in the output layer apply a linear weighting. The transfer function of the hidden layer can be either linear or nonlinear. The context layer, also known as the undertake or state layer, remembers the previous output of the hidden layer and returns it to the network input, so it can be considered a single-step delay operator.
Through the delay and storage of the context layer, the output of the hidden layer is fed back to the input of the hidden layer. This self-connection makes the network sensitive to historical data and increases its capacity to handle dynamic information, which serves the purpose of dynamic modeling. In addition, the ENN can approximate any nonlinear map with arbitrary precision without considering the specific form of the external noise acting on the system. Therefore, given the input and output pairs of the system, the system can be modeled.
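The single-step delay performed by the context layer can be made concrete with a minimal sketch of one Elman forward step. The `tanh` hidden transfer function, the class name `ElmanCell` and the weight layout are assumptions chosen for illustration; training is not shown.

```python
import math


class ElmanCell:
    def __init__(self, w_in, w_ctx, w_out):
        # w_in[i][j]: input -> hidden, w_ctx[c][j]: context -> hidden,
        # w_out[j][k]: hidden -> output (linear weighting).
        self.w_in, self.w_ctx, self.w_out = w_in, w_ctx, w_out
        self.context = [0.0] * len(w_ctx)  # previous hidden output

    def step(self, x):
        n_hidden = len(self.context)
        hidden = []
        for j in range(n_hidden):
            z = sum(self.w_in[i][j] * x[i] for i in range(len(x)))
            z += sum(self.w_ctx[c][j] * self.context[c]
                     for c in range(n_hidden))
            hidden.append(math.tanh(z))
        self.context = hidden  # stored for the next time step
        n_out = len(self.w_out[0])
        return [sum(self.w_out[j][k] * hidden[j] for j in range(n_hidden))
                for k in range(n_out)]


# Feeding the same input twice gives different outputs, because the second
# step also sees the hidden state remembered from the first.
cell = ElmanCell(w_in=[[1.0]], w_ctx=[[1.0]], w_out=[[1.0]])
first = cell.step([0.5])
second = cell.step([0.5])
```

This dependence on the stored state is what makes the network sensitive to the history of the wind speed series rather than only to the current input.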

Structure of the Proposed Integrated Forecasting Framework
In this paper, neural network models based on the three artificial neural networks described in Section 4.1 (the ENN, BPNN and WNN) are used to forecast the wind speed; the integrated forecasting framework is shown in Figure 5 and can be decomposed into the following three main procedures. First, the wavelet decomposition (WD) [24] is used to decompose the original wind speed data. As seen from Section 3, the WD method is the best pre-processing method according to the wind energy assessment results, and it is used to pre-process the original wind speed. This operation yields three new models, abbreviated as WD-ENN, WD-BPNN and WD-WNN. Second, the CS and AC algorithms are adopted to optimize the unknown weight and bias matrices between the hidden and output layers of the three neural network models obtained in the first step. This yields three neural networks optimized by the CS algorithm, named WD-CS-ENN, WD-CS-BPNN and WD-CS-WNN, and three neural networks optimized by the AC algorithm, abbreviated as WD-AC-ENN, WD-AC-BPNN and WD-AC-WNN (shown in Figure 4). The related pseudo codes are presented in Algorithms 3 and 4.

Wind Speed Forecasting Case Study
When the original wind speed time series has been processed by the WD method, the pre-processed wind speed time series is used as the input of the optimized BPNN, ENN and WNN models. It is worth noting that the method for dividing the original wind speed time series into training and testing sets is quite important. Moreover, in the network training procedure, the training inputs are de-noised data, while the training output is the original training time series. In the testing step, the inputs are also the de-noised wind speed data, and the output is the original testing output. However, the testing output is assumed to be unknown.
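As a rough illustration of how de-noised inputs might be produced, the sketch below applies a one-level Haar wavelet decomposition and soft-thresholds the detail coefficients. The wavelet family, decomposition level and threshold rule used in the paper are not stated in this section, so the Haar wavelet, the single level and the `thr` parameter are assumptions, as are all function names.

```python
import math


def haar_decompose(signal):
    """One-level Haar transform: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_reconstruct(approx, detail):
    """Invert haar_decompose."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def soft_threshold(coeffs, thr):
    """Shrink each coefficient toward zero by thr (hypothetical threshold rule)."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]


def wd_denoise(signal, thr):
    """De-noise by shrinking the detail coefficients and reconstructing."""
    approx, detail = haar_decompose(signal)
    return haar_reconstruct(approx, soft_threshold(detail, thr))
```

With `thr = 0` the signal is reconstructed unchanged; with a very large threshold every pair of neighbouring samples is replaced by its mean, which is the smoothest result this one-level sketch can produce.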
Figure 6 presents the data division results; in this paper, the training dataset window with length N = 1008 is fixed according to the original time series. For example, suppose a study of the wind speed time series will be forecasted. Apart from the data division, the forecasting horizon is also an important index. In this paper, multi-step-ahead forecasting with h = 1, 2 and 3 is analyzed, where h is the prediction step. Related parameter initialization values in the different neural networks are shown in Table 6. Based on the error evaluation criterion MAE, defined in Equation (5), and the following two forecasting error evaluation criteria, shown in Equations (22) and (23), the forecasting error values obtained by the different neural networks are listed in Table 7.
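The data division and the forecasting horizon can be sketched as follows. The training length N = 1008 and the horizons h = 1, 2, 3 come from the text; the number of lagged inputs `n_lags`, the direct (non-recursive) multi-step strategy and the function names are assumptions, since the section does not specify them.

```python
def split_train_test(series, n_train=1008):
    """Split a series into a fixed-length training window and the remaining test part."""
    return series[:n_train], series[n_train:]


def make_training_pairs(denoised, original, n_lags, horizon):
    """Pair lagged de-noised inputs with the original value `horizon` steps ahead.

    The input at position t is denoised[t - n_lags : t]; the target is the original
    value `horizon` steps after the last input sample.
    """
    if len(denoised) != len(original):
        raise ValueError("series must have equal length")
    inputs, targets = [], []
    for t in range(n_lags, len(original) - horizon + 1):
        inputs.append(denoised[t - n_lags:t])
        targets.append(original[t + horizon - 1])
    return inputs, targets
```

Calling `make_training_pairs` once per horizon (h = 1, 2, 3) gives one dataset per forecasting step, with de-noised values as inputs and original values as outputs, as described above.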
where $y_i$ and $\hat{y}_i$ are the actual and forecasted wind speed values, and n is the number of data samples. Table 7 provides the forecasting error results for three different horizons: one-step-ahead, two-steps-ahead and three-steps-ahead. As seen, under the same horizon conditions, the performances of the nine optimized neural networks are all better than those of the three single neural networks. Additionally, the models optimized by WD and CS or by WD and AC are all superior to those optimized by WD alone. When the models optimized by WD and CS are compared with those optimized by WD and AC, the one-step-ahead results shown in Figure 7 indicate that the error values obtained by the WD and CS models are all smaller than those of the corresponding WD and AC models. For the two-steps-ahead results shown in Figure 8, the BPNN model optimized by WD and CS is worse than that optimized by WD and AC. For the three-steps-ahead results shown in Figure 9, the ENN and BPNN models optimized by WD and CS are both worse than those optimized by WD and AC. In conclusion, the novel optimized models proposed in this paper are all better than the original models.

Conclusions
Effective wind energy potential assessment and forecasting for a particular site plays an indispensable role in the design, evaluation and scheduling of wind farms. In this paper, based on the CS and AC algorithms, two new wind energy assessment models, as well as six wind speed forecasting models, are proposed. First, the CS and AC algorithms are introduced to estimate the two unknown parameters in the Weibull distribution and to improve the assessment accuracy. The results for the four assessment error evaluation criteria demonstrate that the two newly proposed assessment models are effective and meaningful. Then, the best data pre-processing approach is selected according to the wind energy potential evaluation results and is adopted to process the wind speed time series. Finally, the CS and AC algorithms are used to optimize three neural networks, namely the ENN, BPNN and WNN, and the results for the three forecasting error evaluation criteria demonstrate that the six newly proposed forecasting models perform better than the original ones. Therefore, forecasting researchers can greatly benefit from data pre-processing and swarm intelligent optimization techniques, which allow for significant improvements in accuracy.

Fitness function: $f(x) = \frac{k}{c}\left(\frac{x}{c}\right)^{k-1}\exp\left[-\left(\frac{x}{c}\right)^{k}\right]$
Parameters:
NC_max - Maximum iterations: 50
m - Number of ants: 30
Alpha - Parameter of the importance degree of the information elements: 1
Beta - Parameter of the importance degree of the heuristic factor: 5
Rho - Parameter of the importance degree of the heuristic factor: 0.1
Q - Pheromone increasing intensity coefficient: 100
/* Initialize popsize candidates with values between 0 and 1 */
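The Weibull density given as the fitness function above can be evaluated with a few lines of code. The sketch below also includes a squared-error objective between the density and empirical histogram values; that objective and the names `weibull_pdf` and `squared_fit_error` are assumptions for illustration, since the exact criterion minimized by the swarm algorithms is not stated in this section.

```python
import math


def weibull_pdf(x, k, c):
    """Two-parameter Weibull density with shape k > 0 and scale c > 0."""
    if k <= 0 or c <= 0:
        raise ValueError("k and c must be positive")
    if x < 0:
        return 0.0
    if x == 0:
        # Limit of the density at zero depends on the shape parameter.
        return 0.0 if k > 1 else (1.0 / c if k == 1 else math.inf)
    return (k / c) * (x / c) ** (k - 1) * math.exp(-((x / c) ** k))


def squared_fit_error(k, c, speeds, densities):
    """Hypothetical objective: squared gap between the Weibull PDF and observed densities."""
    return sum((weibull_pdf(v, k, c) - d) ** 2 for v, d in zip(speeds, densities))
```

A swarm optimizer would then search over the pair (k, c) for the values that minimize such an objective on the observed wind speed histogram.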

Figure 1. PDF fitting results for each single year from 2009 to 2013.

Figure 2. Seasonal PDF and whole five-year fitting results.

Figure 3. PDF fitting results obtained by three different de-noising methods for the four sites.
WNN)
Parameters:
NC_max - Maximum iterations: 50
m - Number of ants: 30
Alpha - Parameter of the importance degree of the information elements: 1
Beta - Parameter of the importance degree of the heuristic factor: 5
Rho - Parameter of the importance degree of the heuristic factor: 0.1
Q - Pheromone increasing intensity coefficient: 100
/* Initialize popsize candidates with values between 0 and 1 */
FOR EACH i, 1 ≤ i ≤ n DO
    $\alpha_i^{1}$ = rand(m, n)
END FOR
P = {$\alpha_i^{iter}$ : 1 ≤ i ≤ popsize}
iter = 1; Evaluate the corresponding fitness function $F_i$
/* Find the best value repeatedly until the maximum iterations are reached. */
WHILE (iter ≤ $iter_{max}$) DO
    /* Find the best fitness value for each candidate */
    FOR EACH $\alpha_i^{iter}$ ∈ P DO
        Build neural network by using x

Figure 5. The flowchart of the proposed integrated forecasting model.

Figure 7. One-step-ahead forecasting results obtained by different models.

Figure 8. Two-steps-ahead forecasting results obtained by different models.

Figure 9. Three-steps-ahead forecasting results obtained by different models.

Table 1. Parameter estimation results in a single year from 2009 to 2013.

Table 2. Assessment error results in a single year from 2009 to 2013.

Table 3. Seasonal and whole five-year parameter estimation results.

Table 4. Seasonal and whole five-year assessment error results.

Table 5. Assessment results of each de-noised wind speed time series.

Table 6. Related parameter initialization values in the neural networks.

Table 7. Forecasting error values of each model.