A Hybrid Wind Speed Forecasting Method and Wind Energy Resource Analysis Based on a Swarm Intelligence Optimization Algorithm and an Artificial Intelligence Model

Abstract: Wind power has the greatest potential among clean and renewable energy sources. It not only helps to solve the problem of energy shortages but also reduces air pollution. In recent years, wind speed time series analysis has increasingly become a concern of administrators and power grid dispatchers searching for a reasonable way to reduce the operating cost of wind farms. However, analyzing wind speed in detail is difficult, because traditional models sometimes fail to capture data features owing to the randomness and intermittency of wind speed. To analyze wind speed series in detail, this paper develops an effective and practical analysis system that includes a data analysis module, a data preprocessing module, a parameter optimization module, and a wind speed forecasting module. Numerical results show that the system can not only assess the wind energy resources of a wind farm but also capture future changes in wind speed, and can be an effective tool for wind farm management and decision-making.


Introduction
Wind power makes use of air flow through wind turbines to mechanically power generators for electric power. The total global installed capacity reached 487 GW by the end of 2016, and new installed capacity was more than 54.6 GW in 2016, of which 23.4 GW of new installations in China powered this growth in large part [1]. The rapid increase of global installation capacity means that one of the largest onshore wind farms, Gansu Wind Farm [2], which has installed capacity of over 6000 MW, has already started operation. In order to ensure reliability and safe operation, detailed analysis of the situation of large wind farms is an important task for wind farm management and decision-making.
Assessing wind energy resources involves analyzing and evaluating the long-term wind energy reserves of wind farms [3]. Through detailed assessment of the wind speed time series of a wind farm, characteristics of wind energy are calculated, such as average wind speed, average wind power density, effective wind power density, and available hours. The probability statistics method uses a variety of distribution functions (such as the Rayleigh, gamma, logistic, and Weibull distributions [4]) to fit the probability distribution of wind speed and find the optimal probability distribution function. Once the optimal probability distribution function is found, its parameters need to be estimated by a numerical method or an optimization algorithm. For example, Mohammadi et al. [5] used multiple numerical methods to estimate the parameters of the Weibull distribution. Wu et al. [6] compared three probability density functions and selected the best one for the wind energy assessment. To optimize the parameters, three particle swarm optimization algorithms and 18 differential evolution algorithms were used to estimate the parameters of the Weibull distribution.
However, numerical estimation methods such as moment estimation [7], maximum likelihood estimation [8], and least squares estimation [9] have disadvantages and restrictive conditions. The method of moments can only be used when the raw moments of the population exist; the moments carry only part of the information in the sample, so the method performs well only when the sample size is large. Maximum likelihood estimation requires the sample distribution to be known, and the likelihood equations are often complicated, so that only an approximate solution is obtained by iterative computation. The least squares method has two defects: if the model noise is colored, the least squares estimate is biased; and as the data size increases, "data saturation" appears.
Due to the defects of the above methods, in this paper, four optimization algorithms (firefly algorithm [10], genetic algorithm [11], ant colony algorithm [12], and cuckoo search algorithm [13]) are used to optimize and determine the shape (k) and scale (c) parameters of the Weibull distribution function for calculating wind power density.
The wind energy assessment is used to analyze long-term wind energy reserves and calculate long-term electricity production. However, the assessment alone does not give a comprehensive, detailed analysis of a wind farm; it is also important to capture future changes in wind speed. Over the past few decades, many researchers have proposed methods of wind speed forecasting. For example, Cassola et al. [14] improved the forecasting performance of wind speed by using a Kalman filtering technique to correct the numerical weather prediction model output. Liu et al. [15] proposed two hybrid models based on the autoregressive moving average (ARMA) model and nonparametric models, and showed that nonparametric-based hybrid models generally outperformed the other models. Tascikaraoglu et al. [16] proposed a novel wind speed prediction model using wavelet transform and spatiotemporal methods, which improved short-term wind speed forecasting relative to other benchmark models. Xiao et al. [17] developed a hybrid wavelet neural network (WNN) model with an improved cuckoo search algorithm. Existing approaches to wind speed forecasting fall into four classes: physical models, conventional statistical models, spatial correlation models, and artificial intelligence models. However, traditional forecasting methods have some defects. For example, physical methods require a large amount of observed data, which are expensive to obtain, have a limited simulation scale, and consume considerable computing resources [18]. Moreover, these methods are more suitable for weather forecasting than for wind speed forecasting [19]. Statistical models cannot handle the high noise, fluctuation, and irregular nonlinear trends of wind speed series, mainly because they assume a linear form among the time series.
The spatial correlation model [20] makes accurate wind speed forecasting quite difficult to implement because of the vast quantities of information that need to be considered and collected, such as wind speed values at many spatially correlated sites. Artificial intelligence methods also have many disadvantages, for example, a tendency to fall into local optima, overfitting, and a relatively low convergence rate [21].
In view of the disadvantages of wind energy resource assessment and traditional forecasting models, in this paper, a wind energy analysis system that includes an assessment module and a forecasting module is developed. The system can be an effective estimation and forecasting tool for large wind farms.
The major contributions of this paper are as follows: (1) A wind energy analysis system including an assessment of wind energy resources and forecasting of wind speed time series, is developed and applied. The purpose of the wind energy assessment is to analyze the wind energy reserves of the wind farm, and the purpose of wind speed forecasting is to master the future changes of wind speed. (2) To effectively assess wind energy, the performance of different probability density functions (PDFs) is compared so as to select the best one. As one of the most suitable probability distributions, the probability density function is the most applicable approach to describe the distribution of wind speed in this case. (3) When the parameters of the probability density function are determined, in place of the numerical estimation method, which has some defects for estimation parameters, the optimization algorithm can effectively obtain optimal parameters. (4) Because actual wind speed has chaotic noise, which will decrease the accuracy of the forecasting, data preprocessing technology is employed to reduce noise and uncertainty of wind speed series. (5) A wind speed forecasting module, which includes a data preprocessing module, parameter optimization module, and forecasting module, is developed. In order to determine the parameters of the least squares support vector machine (LSSVM), a hybrid optimization algorithm is proposed. The new optimization algorithm uses the steepest descent to improve the convergence rate of the cuckoo search algorithm, which can effectively improve the forecasting accuracy of the LSSVM.

Model Construction
This part mainly introduces the composition of the hybrid algorithm and the detailed steps.
Step one: introduce correlation algorithm for wind speed data analysis.
Step two: actual wind speed data is processed by wavelet packet transform, which is used to establish the LSSVM. Then the modified cuckoo search algorithm is used to optimize the parameters of the LSSVM.

The Correlation Algorithm for Wind Speed Data Analysis
At a wind farm, the air density and natural terrain environment are constant, which has less influence on wind energy resources. Only the wind speed will change with the movement of the atmosphere. Thus, the frequency distribution of wind speed can measure the distribution of wind energy resources at the wind farm. In this paper, four kinds of probability distribution are used to fit the wind density [22]. Table 1 shows the different probability density functions.

Table 1. Probability distributions and their probability density functions.
Probability Distribution | Probability Density Function
Gamma distribution |

In order to determine the parameters of the probability density functions, in this paper, four kinds of optimization algorithm are employed: differential evolution algorithm [23], particle swarm optimization algorithm [24], cuckoo search algorithm [23], and genetic algorithm [22].
The performance of a wind turbine installed at a given site can be examined by the mean power output over a period of time (P_out) and the conversion efficiency or capacity factor [23] of the turbine. The mean power output P_out can be calculated using the following expression based on the Weibull distribution function: where v_c, v_r, and v_f are the cut-in wind speed, rated wind speed, and cut-off wind speed, respectively. It should be mentioned that the above expressions are only valid for the assessment of wind turbines.
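As an illustration of how a Weibull-based mean power output and capacity factor can be evaluated, the sketch below uses a closed form that is common in the wind energy literature, P_out = P_r [ (exp(−(v_c/c)^k) − exp(−(v_r/c)^k)) / ((v_r/c)^k − (v_c/c)^k) − exp(−(v_f/c)^k) ]. This form is an assumption and may differ from the expression used in this paper; the function names and the turbine speeds in the example are hypothetical.

```python
import math


def weibull_pdf(v, k, c):
    """Weibull probability density with shape k and scale c (both > 0)."""
    if v < 0:
        return 0.0
    return (k / c) * (v / c) ** (k - 1) * math.exp(-((v / c) ** k))


def mean_power_output(p_rated, k, c, v_cut_in, v_rated, v_cut_off):
    """Weibull-averaged turbine output (assumed, commonly used closed form)."""
    a = (v_cut_in / c) ** k
    r = (v_rated / c) ** k
    f = (v_cut_off / c) ** k
    return p_rated * ((math.exp(-a) - math.exp(-r)) / (r - a) - math.exp(-f))


def capacity_factor(k, c, v_cut_in, v_rated, v_cut_off):
    """Capacity factor = mean output divided by rated output."""
    return mean_power_output(1.0, k, c, v_cut_in, v_rated, v_cut_off)


# Hypothetical turbine and Weibull parameters, for illustration only.
cf = capacity_factor(k=2.1, c=7.3, v_cut_in=3.0, v_rated=12.0, v_cut_off=25.0)
print(f"capacity factor = {cf:.3f}")
```

The capacity factor is obtained by setting the rated power to 1, so it can be compared directly across sites with different turbines.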

The Method of Data Pre-Processing: Wavelet Packet Transform
In the wavelet packet framework, compression and denoising ideas are exactly the same as those developed in the wavelet framework. The only difference is that wavelet packets [24] offer a more complex and flexible analysis, because in wavelet packet analysis, the details as well as the approximations are split.
A single wavelet packet decomposition gives a lot of bases from which to look for the best representation with respect to a design objective. This can be done by finding the "best tree" based on an entropy criterion.
Denoising and compression are interesting applications of wavelet packet analysis. The wavelet packet denoising or compression procedure involves four steps:
Step 1: Decomposition. For a given wavelet, compute the wavelet packet decomposition of signal x at level N.
Step 2: Computation of the best tree. For a given entropy, compute the optimal wavelet packet tree.
Of course, this step is optional. The graphical tools provide a "Best Tree" button to make this computation quick and easy.
Step 3: Thresholding of wavelet packet coefficients. For each packet (except for the approximation), select a threshold and apply thresholding to coefficients.
The graphical tools automatically provide an initial threshold based on balancing the amount of compression and retained energy. This threshold is a reasonable first approximation for most cases. However, in general, you will have to refine your threshold by trial and error so as to optimize the results to fit your particular analysis and design criteria.
The tools facilitate experimentation with different thresholds and make it easy to alter the trade-off between the amount of compression and retained signal energy.
Step 4: Reconstruction. Compute wavelet packet reconstruction based on the original approximation coefficients at level N and the modified coefficients.
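The four steps can be sketched in a few lines of plain Python. This is a minimal illustration, not the toolbox procedure described above: it assumes the Haar wavelet, skips the optional best-tree step, uses a single soft threshold chosen by hand, and requires the signal length to be a multiple of 2^N. All function names are hypothetical.

```python
import math
import random

SQRT2 = math.sqrt(2.0)


def haar_split(x):
    """One Haar analysis step: return (approximation, detail), each half as long."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_merge(approx, detail):
    """Inverse of haar_split."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def wp_decompose(x, level):
    """Step 1: full packet tree; details are split as well as approximations."""
    nodes = [list(x)]
    for _ in range(level):
        nxt = []
        for node in nodes:
            nxt.extend(haar_split(node))
        nodes = nxt
    return nodes  # 2**level packets; nodes[0] is the approximation


def soft_threshold(coeffs, thr):
    """Step 3: shrink each coefficient toward zero by thr."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]


def wp_reconstruct(nodes):
    """Step 4: merge sibling packets level by level."""
    while len(nodes) > 1:
        nodes = [haar_merge(nodes[i], nodes[i + 1]) for i in range(0, len(nodes), 2)]
    return nodes[0]


def wp_denoise(x, level, thr):
    nodes = wp_decompose(x, level)
    kept = [nodes[0]] + [soft_threshold(n, thr) for n in nodes[1:]]
    return wp_reconstruct(kept)


# Synthetic stand-in for a wind speed series (not data from the paper).
rng = random.Random(0)
clean = [6.0 + math.sin(2 * math.pi * i / 32) for i in range(64)]
noisy = [v + rng.gauss(0.0, 0.2) for v in clean]
denoised = wp_denoise(noisy, level=3, thr=0.3)
print(len(denoised))
```

With a threshold of zero the procedure returns the input unchanged, which is a convenient check that decomposition and reconstruction are consistent.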

LSSVM (Least Squares Support Vector Machine)
The least squares support vector machine, proposed by Suykens et al. [25], is an improved support vector machine and one of the important results of statistical learning theory in recent years. Compared with the traditional SVM, LSSVM has the following advantages: (1) equality constraints replace the inequality constraints of the traditional SVM algorithm; and (2) solving a quadratic program is replaced by solving a set of linear equations directly.
The least squares support vector machine algorithm is described as follows. For a given training set (p_i, q_i), where i = 1, 2, 3, ..., n, p_i ∈ R^n, and q_i ∈ R, LSSVM uses a nonlinear mapping φ(p) to map samples from the original space R^n to the feature space, and uses φ(p_i) to construct the optimal decision function in the high-dimensional feature space:
Therefore, the nonlinear estimation function is transformed into a linear estimation function in the high-dimensional feature space, and ω and b are found by minimizing the structural risk:
In Equation (3), ‖ω‖² controls the complexity of the model; C is the regularization parameter that controls the penalty on the error samples; and R_emp is the error control function, that is, the ε-insensitive loss function. Frequently used loss functions are the linear ε loss function, the quadratic ε loss function, and the Huber loss function. Different forms of support vector machine can be constructed by selecting different loss functions. According to the structural risk minimization principle, a regression problem is expressed as a constrained optimization problem:
where C is a constant and b is the bias term. If the Lagrange method is used to solve this optimization problem, it becomes:
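To make the "linear equations instead of a quadratic program" point concrete, the sketch below trains an LSSVM regressor by solving the standard dual system [[0, 1ᵀ], [1, K + I/C]] [b; α] = [0; q] and predicts with f(p) = Σ α_i K(p_i, p) + b. The RBF kernel, its width sigma, the scaling of C, and all function names are assumptions made for illustration; they are not taken from the paper.

```python
import math


def rbf_kernel(p, q, sigma):
    """Gaussian (RBF) kernel; the kernel choice is an assumption."""
    sq = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.exp(-sq / (2.0 * sigma ** 2))


def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a square system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def lssvm_fit(P, q, C, sigma):
    """Solve the bordered linear system for the bias b and multipliers alpha."""
    n = len(P)
    a = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0]
        for j in range(n):
            k_ij = rbf_kernel(P[i], P[j], sigma)
            row.append(k_ij + (1.0 / C if i == j else 0.0))
        a.append(row)
    sol = solve_linear(a, [0.0] + list(q))
    return sol[0], sol[1:]


def lssvm_predict(P_train, b, alpha, sigma, p):
    return b + sum(a_i * rbf_kernel(p_i, p, sigma) for a_i, p_i in zip(alpha, P_train))


# Toy one-dimensional regression problem (synthetic, not wind data).
P = [[0.3 * i] for i in range(20)]
q = [math.sin(p[0]) for p in P]
b, alpha = lssvm_fit(P, q, C=100.0, sigma=1.0)
print(round(lssvm_predict(P, b, alpha, 1.0, [1.0]), 3))
```

The two hyperparameters C and sigma are the quantities a search algorithm such as cuckoo search would tune in a hybrid model of this kind.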

Modified Cuckoo Search Algorithm
Cuckoo search [11] is a population-based random global search and optimization algorithm, like the genetic algorithm (GA) and particle swarm optimization (PSO). In nature, cuckoos search randomly for nests in which to lay eggs. Yang and Deb assume the following idealized rules:
(1) Each cuckoo lays one egg at a time and dumps it in a randomly chosen nest.
(2) The best nests with high-quality eggs are carried over to the next generation.
(3) The number of available nests is fixed (we assume that the number is n), and the egg laid by a cuckoo is discovered by the host bird with a probability P_a ∈ [0, 1]. In this case, the host bird can discard the egg or simply abandon the nest and build a new nest.
For the sake of simplicity, we can use the following simple statement that each egg in the nest represents a solution, and a cuckoo egg represents a new solution. The purpose is to use new and potentially better solutions to replace a bad solution in the nest. Certainly, this algorithm can be extended to more complex cases in which each nest with a plurality of eggs represents a set of solutions. At present, we adopt the simplest approach that there is only one egg in each nest.
Based on these three idealized rules, the cuckoo search path and position update are given by:

x_i^(t+1) = x_i^(t) + α ⊕ Levy(λ)

where x_i^(t) is the location of the ith nest in the tth generation and α (α > 0) is the step length; in most cases, α = 1.
This formula is in essence a random walk equation. A random walk is a Markov chain whose future position depends on the current position (the first term of the equation) and the transition probability (the second term). ⊕ denotes entrywise multiplication, and Levy(λ) is a random search path whose random step length follows a Lévy distribution.
The main steps of the cuckoo search algorithm can therefore be described as follows:
Step 1: f(x) is the objective function, X = (x_1, ..., x_d)^T. The population is initialized, the initial positions X_i (i = 1, 2, ..., n) of the n nests are randomly generated, and the algorithm parameters are set.
Step 2: Calculate the objective function value for each nest and record the current optimal solution.
Step 3: Retain the optimal nest location of the previous generation and update the remaining nest locations according to the position update formula (Equation (7)).
Step 4: Compare each current nest position with that of the previous generation; if the current position is better, it becomes the best position.
Step 5: A random number R, representing the possibility that the host finds the foreign egg, is compared with P_a; if R > P_a, the position of the nest is changed and a new position is determined.
Step 6: Return to Step 2 if the end conditions are not met.
Step 7: Output the global optimal position.
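The seven steps can be sketched as follows. This is a simplified illustration rather than the authors' implementation: Lévy steps are drawn with Mantegna's algorithm (one common choice), nests abandoned in Step 5 are re-drawn uniformly at random, and the parameter values and function names are assumptions.

```python
import math
import random


def levy_step(beta, rng):
    """Heavy-tailed step via Mantegna's algorithm (an assumed way to draw Levy steps)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = abs(rng.gauss(0.0, 1.0)) or 1e-12  # guard against a zero denominator
    return u / v ** (1 / beta)


def cuckoo_search(f, bounds, n_nests=15, pa=0.25, alpha=0.1, beta=1.5, iters=200, seed=0):
    rng = random.Random(seed)

    def clip(x):
        return [min(max(xi, lo), hi) for xi, (lo, hi) in zip(x, bounds)]

    def random_nest():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def best_index():
        return min(range(n_nests), key=lambda j: fit[j])

    nests = [random_nest() for _ in range(n_nests)]   # Step 1: initialize nests
    fit = [f(x) for x in nests]                       # Step 2: evaluate them
    for _ in range(iters):                            # Step 6: loop until done
        keep = best_index()                           # Step 3: retain the best nest
        for i in range(n_nests):
            if i == keep:
                continue
            cand = clip([xi + alpha * levy_step(beta, rng) for xi in nests[i]])
            cand_f = f(cand)
            if cand_f < fit[i]:                       # Step 4: keep the better position
                nests[i], fit[i] = cand, cand_f
        keep = best_index()
        for i in range(n_nests):                      # Step 5: discovery by the host
            if i != keep and rng.random() > pa:
                nests[i] = random_nest()
                fit[i] = f(nests[i])
    keep = best_index()                               # Step 7: report the best nest
    return nests[keep], fit[keep]


def sphere(x):
    return sum(v * v for v in x)


best, best_f = cuckoo_search(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
print(best_f)
```

Because the best nest is never replaced by a worse one, the best objective value cannot increase from one generation to the next.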
The steepest descent algorithm [26] is one of the oldest optimization algorithms, and is simple and intuitive. At present, more effective hybrid optimization algorithms are based on the steepest descent algorithm.
To overcome the defects of the cuckoo search algorithm, the steepest descent method is applied. Its iteration process can be expressed in the following steps:
Step 1: Select the initial point x_0, preset the stop error ε > 0, and set k := 0.
Step 2: Calculate ∇f(x_k). If ‖∇f(x_k)‖ ≤ ε, stop the iteration and output x_k; otherwise go to Step 3.
Step 3: Take the search direction p_k = −∇f(x_k).
Step 4: Perform a one-dimensional search to obtain t_k such that f(x_k + t_k p_k) = min_{t ≥ 0} f(x_k + t p_k); set x_{k+1} = x_k + t_k p_k and k := k + 1, and return to Step 2.
The cuckoo search algorithm is used to preserve the optimal solution of the hatched birds, and the steepest descent method is used to iterate and modify the location of the nest to obtain the optimal solution matrix.
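A minimal sketch of the steepest descent iteration is given below. Two simplifications are assumed: the gradient is approximated by central differences, and the one-dimensional search is replaced by a backtracking (Armijo) line search instead of an exact minimization. The function names and tolerances are hypothetical.

```python
import math


def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        xp = x[:]
        xm = x[:]
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad


def steepest_descent(f, x0, eps=1e-5, max_iter=1000):
    x = list(x0)                                   # initial point, k = 0
    for _ in range(max_iter):
        g = numerical_gradient(f, x)               # compute the gradient
        g2 = sum(gi * gi for gi in g)
        if math.sqrt(g2) <= eps:                   # stop when the gradient is small
            break
        # Backtracking line search along the direction -g (Armijo condition).
        t, fx = 1.0, f(x)
        while t > 1e-12:
            trial = [xi - t * gi for xi, gi in zip(x, g)]
            if f(trial) <= fx - 1e-4 * t * g2:
                break
            t *= 0.5
        x = [xi - t * gi for xi, gi in zip(x, g)]  # move to the next iterate
    return x


def quadratic(x):
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 3.0) ** 2


print(steepest_descent(quadratic, [0.0, 0.0]))
```

In a hybrid scheme of the kind described above, such a local step could be applied to promising nests to refine them between cuckoo search generations; how the two are interleaved in the paper is not specified here.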

Structure of the Proposed Integrated Forecasting Framework
The wavelet packet transform extracts the multiscale spatial energy feature of the original wind speed. It is noteworthy that the applications of the wavelet packet transform need to select two hyperparameters: wavelet function and decomposition level. At present, no study discusses how to select the value of these two hyperparameters; their values can depend on the type of original wind speed time series to be analyzed. The procedure of the wavelet packet transform is displayed in Figure 1 (first step).
From Figure 1 (second step), the data setting for each model with length N (the number of the training sample) is fixed according to the original time series. For example, suppose the last 144 points of the wind speed time series with length of 1152 (N = 1008) will be forecasted, and the data structure is shown in Figure 1 (second step).
Note that all forecasts in this paper are one-step-ahead (h = 1) for the different wind farm sites.
After the wavelet packet transform, the filtered wind speed time series is input to the optimized least squares support vector machine. The overall flowchart of the proposed hybrid forecasting model is shown in Figure 1 (third step).
It is worth noting that the original wind speed time series is divided into a training sample and a testing sample. In training the least squares support vector machine, the input is the filtered time series and the output is the original wind speed time series of the training sample; in testing, the input is also the filtered time series and the output is the testing sample.
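The data arrangement described above can be sketched as a sliding window in which lagged values of the filtered series form the inputs and the corresponding value of the original series is the target. The number of lagged inputs (6 here) and the function names are assumptions for illustration; the series below is a placeholder, not data from the paper.

```python
def make_supervised(inputs, targets, n_inputs, horizon=1):
    """Pair n_inputs lagged input values with the target `horizon` steps ahead."""
    X, y = [], []
    for t in range(n_inputs, len(targets) - horizon + 1):
        X.append(list(inputs[t - n_inputs:t]))
        y.append(targets[t + horizon - 1])
    return X, y


def split_train_test(X, y, n_test):
    """Hold out the last n_test samples for testing."""
    return X[:-n_test], y[:-n_test], X[-n_test:], y[-n_test:]


# Placeholder series of length 1152; in practice `filtered` would be the
# wavelet-packet-filtered series and `original` the measured wind speed.
original = [float(i) for i in range(1152)]
filtered = original[:]

X, y = make_supervised(filtered, original, n_inputs=6, horizon=1)
X_train, y_train, X_test, y_test = split_train_test(X, y, n_test=144)
print(len(y_train), len(y_test))
```

With this layout the 144 test targets are exactly the last 144 points of the series, and each test input window ends one step before its target.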

Numerical Experimentation
The wind farm is located between 120°43′6′′ and 120°47′17′′ east longitude and between 37°32′50′′ and 37°37′6′′ north latitude; the ground elevation is 140-420 m; and the annual average wind speed is higher than in the surrounding region, so the site holds larger wind energy reserves. In this region, the installed wind power capacity is approximately 67 million kW. The data collected in this paper are from 1 January 2015 to 31

Analysis of Wind Speed
The probability distribution of wind speed can be used to measure the distribution of wind energy resources at a site. Common probability distributions used to fit the distribution of wind speed are the normal, Rayleigh, gamma, and Weibull distributions. In this paper, four kinds of probability distribution are used to assess wind speed density. Figure 2 and Appendix A (Table A1) show that the Weibull distribution fits the distribution of wind speed well. For example, the R-square values of the four probability distributions at Site 1 are 0.918, 0.768, 0.831, and 0.765.
In the assessment of Site 2, the Weibull distribution also fits the wind speed distribution well: the sum of squared errors (SSE) of the Weibull distribution is 0.876 lower than that of the other distributions.
The doughnut in Figure 2 shows the capacity factor of each site, with values of 35.860%, 26.825%, 27.456%, and 28.738%. In order to accurately assess wind speed, four different optimization algorithms are used to optimize the parameters of the Weibull distribution. Appendix A (Table A1) shows that the optimal Weibull distributions are obtained by different optimization algorithms; the optimal Weibull distribution for Site 1 is CS-Weibull, and the other optimal Weibull distributions are marked in bold in Appendix A (Table A1).
In this paper, four data sites have been selected as our case studies. The sites' wind characteristics and Weibull parameters are estimated using the above expressions for all locations considered in the present study and are presented in Table 2. The table shows that the annual mean wind speeds are 6.5214 m/s, 5.8752 m/s, 5.9730 m/s, and 6.1713 m/s for the four sites. Table 2 also gives the Weibull shape parameter for the four sites, which varies between 2.07 and 2.23. It can further be noted that the annual wind speeds carrying maximum energy are 9.89400 m/s, 8
The analysis of the wind energy resources and power generation of a wind turbine at the four wind speed sites shows that the capacity factor of wind turbine generation is 35.860%, 26.825%, 27.456%, and 28.738%. To reduce the operational cost of a wind farm, it is very important to improve the accuracy of short-term wind speed forecasting. This section focuses on establishing the accuracy of the short-term wind speed forecasting model and analyzing the forecasting results.

Definition of the Performance Metrics
To demonstrate the validity of the proposed hybrid model, four error measurements [27], namely mean absolute percentage error (MAPE), mean relative error (MRE), mean absolute error (MAE), and root mean squared error (RMSE), are used to evaluate the accuracy and stability of the forecasting results. MAPE, MRE, and MAE are used to evaluate accuracy, and RMSE is used to measure the stability of the model. In addition, a positive value of P_bias [28] denotes underestimation bias, a negative value denotes overestimation bias, and P_bias = 0 indicates no bias. The metrics are defined in Table 3.

Table 3. Performance metrics.
Metric | Description | Equation
MAE | Mean Absolute Error |
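For reference, the sketch below implements several of these metrics using their usual textbook definitions, which are assumptions here and may differ in detail from the equations in Table 3: MAE, RMSE, MAPE, P_bias as 100·Σ(obs − pred)/Σ obs (so that a positive value means underestimation), and the index of agreement (IA) in Willmott's form. MRE is omitted because its exact definition cannot be inferred from the text. Function names are hypothetical.

```python
import math


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mape(obs, pred):
    """Mean absolute percentage error in percent; observations must be nonzero."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def p_bias(obs, pred):
    """Percent bias; positive means the forecasts underestimate on average."""
    return 100.0 * sum(o - p for o, p in zip(obs, pred)) / sum(obs)


def index_of_agreement(obs, pred):
    """Willmott's index of agreement; 1 indicates a perfect match."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den


# Small made-up example, not data from the paper.
obs = [2.0, 4.0, 6.0, 8.0]
pred = [2.5, 3.5, 6.5, 7.5]
print(mae(obs, pred), rmse(obs, pred), round(mape(obs, pred), 2))
```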

Data Setting for Each Model
The actual wind speed series from the four observation stations and their assessment results are shown in Figure 2. According to the above analysis, the capacity factor is about 35.860%, 26.825%, 27.456%, and 28.738% for each site. In order to improve the utilization of wind energy, it is necessary to accurately forecast wind speed.
In this paper, the wind speed data in the sampling period cover 8 days, comprising 1152 data points divided into a training sample and a testing sample. To ensure that the training set captures more information about the relationship between the input and output variables, the first 1008 wind speed data points serve as the training set and the last 144 data points serve as the testing sample. The training sample is processed by the wavelet packet transform (WPT), and the testing sample is the actual wind speed time series; they are shown in Table 4. To find the best proportion between the training and testing samples, an enumeration method based on numerical simulation is used (the results are shown in Figure 3). The LSSVM parameters of the hybrid model are shown in Table 4.
To determine the relationship between the number of inputs and the number of training samples, many experiments were carried out. In these experiments, we used the enumeration technique to determine the number of inputs and training samples for the different forecasting models. From Figure 3, the following conclusions were obtained:
(1) Figure 3A and Table 5 show that WPT-SDCS-LSSVM outperforms the other four models; its MAPE of 3.45% is lower than that of the other hybrid models. Figure 3B shows that the forecasting curve obtained by WPT-SDCS-LSSVM is very close to the original wind speed time series.
(2) Figure 3C shows that the number of inputs for each forecasting model ranges from 1 to 30, over which the MAPE of WPT-SDCS-LSSVM ranges from 3.45% to 7.5%. As the number of inputs increases, the test error gradually decreases; once the number of inputs passes a certain value, the test error increases. Table 4 shows the best number of inputs for LSSVM. Figure 3C also shows how the test accuracy changes with the training sample size: as the number of training samples increases, the accuracy of WPT-SDCS-LSSVM improves, and its accuracy fluctuates between 5.5% and 7.5%.

Remark 1.
According to the above analysis, it is hard to find a general relationship between the number of input-layer nodes and the number of training samples. Wind speed time series at different sites have different forecasting structures, which suggests that the model parameters corresponding to different time series also differ.

Numerical Simulation for Four Different Sites
In this section, to illustrate the effectiveness of the hybrid model, we compare it with the following models: back propagation neural network (BPNN), wavelet neural network (WNN), CS-LSSVM, and PSO-LSSVM. Tables 5-8 and Figures 4-7 show the forecasting results of the hybrid model at the four sites.

The Forecasting Result in Site 1
From Table 5, it is clear that the WPT-CS-LSSVM model performs much better than the other three models. To explain the results of the proposed method, we use the first site as an example. First, the hybrid LSSVM model has the smallest MAPE, RMSE, MAE, and MRE when compared with the other three models. For example, in Table 9, the MAPE of our hybrid method is 4.50% while that of the single BPNN is 5.65%; thus, the precision is improved by 1.16%. Relative to the remaining models, the precision is improved by 4.91% and 2.56%. This result reflects the combined advantages of the single models. Second, the various indicators of the hybrid LSSVM model are better than those of PSO-LSSVM. The major cause for this, among others, is that the hybrid model consists of WPT and CS. This illustrates that WPT and SDCS are more effective than PSO. Moreover, the run time of SDCS is much shorter than that of PSO at Site 1, as shown in Table 5. On the whole, the hybrid LSSVM model is quite simple and efficient.
As mentioned earlier, Figure 4A shows that the forecasting error of SDCS-LSSVM is approximately 0. Table 5 and Figure 4B show that the MAE, MRE, RMSE, and MAPE values of SDCS-LSSVM are 0.1991, 3.10%, 0.2656, and 3.11%, respectively, which are lower than those of the other models, and the index of agreement (IA) of SDCS-LSSVM is 0.9986, close to 1, which is higher than that of the other models. Figure 4C shows the 95% confidence intervals (CIs) obtained by the proposed hybrid model (SDCS-LSSVM), which indicate that both the upper and lower CI bounds are close to the observed wind speed time series. Table 5 shows that single models cannot be applied to multiple forecasting problems, and model performance under specific conditions should be analyzed and understood and incremental improvements made based on knowledge gained [30][31][32][33][34]. In this paper, our hybrid LSSVM model based on WPT and SDCS performs much better than the single models in terms of its precision and stability, and is an excellent method for 10-min wind speed forecasting in Shandong Province compared with other traditional forecasting models.
The cuckoo search algorithm easily suffers from slow convergence, which affects its parameter optimization capability. Therefore, the steepest descent algorithm is used to iterate and modify the location of the nest to obtain the optimal parameters. The SDCS algorithm combines the advantages of the steepest descent algorithm and the cuckoo search algorithm, which significantly improves the convergence speed of the cuckoo algorithm. The forecasting results can be seen in Figure 5A, and Table 6 shows that the proposed WPT-SDCS-LSSVM model outperforms the other four hybrid models in terms of MAE, MRE, RMSE, MAPE, IA, and P bias from Monday to Sunday. For instance, the MAPE of WPT-SDCS-LSSVM is 5.00%, 3.57%, 3.99%, 3.95%, 4.79%, 3.65%, and 2.75%, corresponding to P bias of 0.489%, −0.133%, 0.108%, 0.012%, 0.036%, 0.047%, and 0.314% from Monday to Sunday, respectively.
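The idea of refining cuckoo search nests with a steepest descent step, as described above, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`sdcs_minimize`, `levy_step`, `numeric_grad`), the default parameters, the simplified abandonment rule, and the toy quadratic objective standing in for the LSSVM validation error over (Gam, Sig2) are all assumptions made for illustration.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one Levy-distributed step length (Mantegna's algorithm)."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def numeric_grad(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad


def clip(x, bounds):
    """Project a point back into the box given by bounds."""
    return [min(max(xi, lo), hi) for xi, (lo, hi) in zip(x, bounds)]


def sdcs_minimize(f, bounds, n_nests=15, n_iter=100, pa=0.25,
                  alpha=0.01, lr=0.1, seed=0):
    """Cuckoo search with a steepest descent refinement of the best nest."""
    rng = random.Random(seed)
    nests = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_nests)]
    fit = [f(n) for n in nests]

    def best_index():
        return min(range(n_nests), key=fit.__getitem__)

    for _ in range(n_iter):
        # 1) Levy flight: each nest proposes a candidate and keeps it if better.
        for i in range(n_nests):
            cand = clip([x + alpha * (hi - lo) * levy_step(rng)
                         for x, (lo, hi) in zip(nests[i], bounds)], bounds)
            fc = f(cand)
            if fc < fit[i]:
                nests[i], fit[i] = cand, fc
        # 2) Abandon non-best nests with probability pa (simplified rule).
        keep = best_index()
        for i in range(n_nests):
            if i != keep and rng.random() < pa:
                nests[i] = [rng.uniform(lo, hi) for lo, hi in bounds]
                fit[i] = f(nests[i])
        # 3) Steepest descent: move the best nest along the negative gradient.
        b = best_index()
        grad = numeric_grad(f, nests[b])
        cand = clip([x - lr * g for x, g in zip(nests[b], grad)], bounds)
        fc = f(cand)
        if fc < fit[b]:
            nests[b], fit[b] = cand, fc

    b = best_index()
    return nests[b], fit[b]


def toy_error(params):
    """Hypothetical stand-in for a validation error as a function of (Gam, Sig2)."""
    gam, sig2 = params
    return (gam - 3.0) ** 2 + (sig2 - 1.5) ** 2


best, err = sdcs_minimize(toy_error, [(0.1, 10.0), (0.1, 10.0)])
print(best, err)
```

In a real setting the objective would be replaced by the cross-validated forecasting error of the LSSVM for a given (Gam, Sig2) pair; the gradient step then speeds up convergence once the Levy flights have located a promising region.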
Figure 5A and Table 6 show that the IA values of 0.9909, 0.9907, 0.9759, 0.9903, 0.9810, 0.9802, and 0.9914 obtained by WPT-SDCS-LSSVM are larger than those of the other models. The forecasting values of all models are shown in Figure 5A. This demonstrates that the proposed hybrid model has high accuracy and low bias. However, the IA values are 0.9986, 0.9995, 0.9977, 0.9986, 0.9964, 0.996, and 0.9991 from Monday to Sunday. It is also shown that WPT-SDCS-LSSVM can accurately forecast future changes in wind speed.
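The error metrics quoted above can be computed as in the following sketch. The definitions used here are assumptions, since this section does not restate them: IA is taken to be Willmott's index of agreement, and P bias is taken to be the summed error (actual minus forecast) as a percentage of the summed actual values. MRE is omitted because its definition is not given in this section. The function name `forecast_metrics` is hypothetical.

```python
import math


def forecast_metrics(actual, forecast):
    """Return MAE, RMSE, MAPE (%), IA, and Pbias (%) for two equal-length series."""
    n = len(actual)
    errors = [a - f for a, f in zip(actual, forecast)]
    mean_actual = sum(actual) / n

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE assumes no actual value is zero.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    # Willmott's index of agreement (assumed definition of IA).
    denom = sum((abs(f - mean_actual) + abs(a - mean_actual)) ** 2
                for a, f in zip(actual, forecast))
    ia = 1.0 - sum(e * e for e in errors) / denom
    # Percent bias with the sign convention actual minus forecast (assumed).
    pbias = 100.0 * sum(errors) / sum(actual)

    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "IA": ia, "Pbias": pbias}


# Illustrative values only, not data from the paper.
metrics = forecast_metrics([4.0, 5.0, 6.0, 5.0], [4.2, 4.9, 6.3, 4.8])
print(metrics)
```

An IA close to 1 together with a P bias close to 0 corresponds to the "high accuracy and low bias" reading given in the text.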
In order to analyze the forecasting results of the five models, we used the testing set from Monday to Sunday. Figure 5A clearly displays the forecasting values and actual values from 00:10 to 24:00, Monday to Sunday, 6-12 March 2017; all models achieved their best forecasting accuracy on Sunday. Figure 5B.1 shows the values of the five forecasting models and the actual wind speed from 00:10 to 24:00, Sunday, 12 March 2017. Figure 5B.2 shows the 95% confidence intervals (CIs) obtained by WPT-SDCS-LSSVM. It can be clearly seen that both the upper and lower CI bounds are very close to the actual wind speed time series of Sunday. Figure 5B.3 plots the forecasting values from 00:10 to 24:00, Sunday, 12 March 2017, for these five forecasting models. It can be seen that the performance of WPT-SDCS-LSSVM is better than that of the two single models and the two hybrid models. In addition, Table 6 lists the metrics of the five forecasting models; when the three forecasting models (LSSVM, BPNN, and WNN) are optimized by the same optimization algorithm (SDCS), the forecasting performance of SDCS-LSSVM is better than that of the other two hybrid forecasting models (SDCS-BPNN and SDCS-WNN). It is obvious that the forecasting values of the proposed WPT-SDCS-LSSVM hybrid model are close to the actual wind speed time series.

The Forecasting Result in Site 3
For the Site 3 forecasting process, Figure 6A clearly shows the forecasting metric values of MAE, MRE, RMSE, and MAPE, for which the proposed hybrid model obtains the minimum values of 0.3055, 4.332%, 0.4154, and 4.478%, respectively, on Monday. At the same time, the proposed model obtains the maximum IA (0.9980) and the minimum P bias (0.302%), which indicates that WPT-SDCS-LSSVM can provide more accurate and stable forecasting values. Figure 6B plots the forecasting performance, forecasting values, and forecasting errors of the five hybrid models at Site 3 over a week, together with the actual wind speed and the 95% confidence intervals. It shows that the forecasting values of WPT-SDCS-LSSVM fall within the 95% confidence interval of the original wind speed, and that the 95% confidence interval obtained by the proposed hybrid model (WPT-SDCS-LSSVM) is narrower than those of the other hybrid models. From Figure 6B, it can also be seen that the forecasting error distribution of WPT-SDCS-LSSVM is lower than that of the other hybrid models.
Through the above forecasting error analysis of Figure 6B, combined with the forecasting metrics in Table 7, we find that the proposed WPT-SDCS-LSSVM integrated model is better than the other models. In a nutshell, the proposed integrated algorithm has a better local and global search capacity for finding the optimal regularization parameter (Gam) and tuning parameter (Sig2) of the nonlinear least squares support vector machine, while the wavelet packet transform, used as a preprocessing technique, can extract tendency and volatility features from the original wind speed time series, improving forecasting accuracy. In addition, the nonlinear least squares support vector machine can effectively reduce forecasting error because the dataset structures are optimized by the longitudinal dataset selection approach.

Remark 2.
By comparing the hybrid model proposed in this paper with four other models (two hybrid models and two single models), we can conclude that the proposed hybrid model provides more accurate forecasting results. A comparison between WPT-SDCS-LSSVM, WPT-CS-LSSVM, and WPT-PSO-LSSVM suggests that the optimization ability of SDCS is stronger than that of the single cuckoo search algorithm and the particle swarm optimization algorithm, while the forecasting performance of the single forecasting models is worse than that of the hybrid models.
(1) Table 9 shows respectively. The WPT-CS-LSSVM model is second in accuracy, next to the proposed hybrid model, but WPT-WNN shows the worst forecasting performance among these models.
(2) Figure 7A shows the forecasting values and actual values for Site 4 over a week.

Remark 3.
The level of improvement by the combined forecasting model decreased as the number of forecasting steps increased. In spite of this, the forecasting performance of the combined model showed great improvement compared with the other models. The combined model proposed in this paper obtains satisfactory results in terms of wind speed forecasting.

Discussion of the Forecasting Accuracy for Each Model
Finally, the above criteria are some of the most common standards; however, these criteria alone cannot appropriately judge the effectiveness of a forecasting method. The reason is that different index series have different dimensions, so these criteria cannot be compared directly. Even for similar sequences that have the same dimensions, the index values in the same period differ, so the above standards cannot show whether the forecasting methods are equally effective. Therefore, this paper uses forecasting effectiveness (FE) to supplement the above criteria.
We presume that the observation sequence is $\{x_t,\ t = 1, 2, \dots, N\}$ and that $\hat{x}_t$ is the estimated value. The following concepts can be obtained: In this formula, $e_t$ is the relative forecasting error at time $t$, $t = 1, 2, \dots, N$. Obviously, $0 \le |e_t| \le 1$.
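The relative error referred to above can be written out explicitly. The following is a sketch that assumes the clipped relative error commonly used with forecasting effectiveness, where $\hat{x}_t$ denotes the estimated value; this form is consistent with the bound $0 \le |e_t| \le 1$ and with the case $A_t = 0$ in Definition 2, but the piecewise expression itself is an assumption rather than a formula taken from this section.

```latex
e_t =
\begin{cases}
-1, & \dfrac{x_t - \hat{x}_t}{x_t} < -1, \\[6pt]
\dfrac{x_t - \hat{x}_t}{x_t}, & -1 \le \dfrac{x_t - \hat{x}_t}{x_t} \le 1, \\[6pt]
1, & \dfrac{x_t - \hat{x}_t}{x_t} > 1.
\end{cases}
```

For example, with $x_t = 5.0$ and $\hat{x}_t = 4.8$, this gives $e_t = 0.04$.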

Definition 2.
Let $A_t = 1 - |e_t|$ be the forecasting accuracy at time $t$, $t = 1, 2, \dots, N$. Obviously, $0 \le A_t \le 1$; if $(x_t - \hat{x}_t)/x_t > 1$, then $A_t = 0$, which shows that the forecasting method is invalid at time $t$.

Definition 3.
Let $m_k = \sum_{t=1}^{N} Q_t A_t^k$ be the forecasting effectiveness element, where $k$ is a positive integer and $\{Q_t,\ t = 1, 2, \dots, N\}$ is the discrete probability distribution of the forecasting method at time $t$, with $\sum_{t=1}^{N} Q_t = 1$ and $Q_t > 0$.

Definition 4.
Let $H$ be a continuous function of $k$ variables; then $H(m_1, m_2, \dots, m_k)$ is the $k$-order forecasting effectiveness. In particular, we use second-order forecasting effectiveness in this paper. The following simulation results show that the second-order forecasting effectiveness of the proposed hybrid model is the largest. From Table 10, we can clearly see that the second-order forecasting effectiveness offered by the proposed hybrid model outperforms that of the other four models for wind speed forecasting at different sites. For example, for Site 1, the second-order values of WPT-SDCS-LSSVM from Monday to Sunday are 0.93728, 0.99108, 0.98144, 0.98225, 0.97296, 0.98165, and 0.99103, respectively, which are larger than those of the other models.
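The definitions above can be turned into a short computation. The sketch below rests on several assumptions that this section does not state: the relative error is clipped to [-1, 1], the weights are uniform (Q_t = 1/N), and the second-order function is taken as H(m1, m2) = m1 * (1 - sqrt(m2 - m1^2)), a form often used for forecasting effectiveness. The function name `forecasting_effectiveness` is hypothetical.

```python
import math


def forecasting_effectiveness(actual, forecast, weights=None):
    """Return (first-order, second-order) forecasting effectiveness."""
    n = len(actual)
    if weights is None:
        weights = [1.0 / n] * n  # assumed uniform distribution Q_t = 1/N

    accuracy = []
    for x, x_hat in zip(actual, forecast):
        e = (x - x_hat) / x            # relative forecasting error
        e = max(-1.0, min(1.0, e))     # clip so that |e_t| <= 1 (assumed)
        accuracy.append(1.0 - abs(e))  # A_t = 1 - |e_t|

    m1 = sum(q * a for q, a in zip(weights, accuracy))
    m2 = sum(q * a * a for q, a in zip(weights, accuracy))
    spread = math.sqrt(max(m2 - m1 * m1, 0.0))  # guard against rounding below zero
    return m1, m1 * (1.0 - spread)


# Illustrative values only, not data from the paper.
first, second = forecasting_effectiveness([10.0, 10.0], [10.0, 8.0])
print(first, second)
```

Under this assumed form, the second-order value rewards both a high mean accuracy (m1) and a small spread of the accuracy series, which matches the emphasis on accuracy and stability in the discussion.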

Remark 4.
The results indicate that the proposed hybrid model is more valid than and significantly superior to the other models. Accordingly, the proposed model can satisfactorily approximate the observed actual wind speed time series.

Conclusions
With the increasing utilization of renewable energy in recent years, wind energy has become the largest new energy reserve, and the development and integration of wind energy resources into the power system are growing rapidly. It is noteworthy that both accuracy and stability should be regarded as equally vital in the forecasting field; thus, developing techniques to simultaneously achieve satisfactory accuracy and stability is imperative. However, because of the unique features of wind speed (uncertainty and intermittency), it is difficult to obtain satisfactory forecasting results using single models. To overcome this challenge, this research proposes a hybrid model based on an LSSVM model that employs a modified cuckoo search algorithm to optimize the parameters of forecasting models for 10-min wind speed forecasting. Wavelet packet transform (WPT) in each WPT-ANN model is applied to denoise the wind speed series to improve the forecasting accuracy. Overall, for forecasting accuracy, the average MAPE values of WPT-BPNN, WPT-WNN, WPT-PSO-LSSVM, WPT-CS-LSSVM, and the hybrid model are 2.7565%, 3.0197%, 3.6897%, 3.0336%, and 1.5944%, respectively. Therefore, the proposed hybrid model can obtain satisfactory forecasting results for wind speed forecasting compared with the other four models. In addition, the fluctuations of MAPE values at each forecasting point are the smallest for the proposed hybrid model, which indicates that the model can improve the accuracy of wind speed forecasting. Moreover, the discussion of the hypothesis test and forecasting effectiveness is used to demonstrate the superiority of the hybrid model over the other models in this paper. As demonstrated by an instance based on a wind farm, the improvements in forecasting accuracy are significant for integrating electricity generated by wind energy into the grid with minimum risk and maximum benefit.
The proposed hybrid model, which provides high forecasting accuracy, can be employed for wind farm dispatch and could improve the utilization of renewable energy.