Artificial Combined Model Based on Hybrid Nonlinear Neural Network Models and Statistics Linear Models: Research and Application for Wind Speed Forecasting

Abstract: The use of wind power is rapidly increasing as an important part of power systems, but because of the intermittent and random nature of wind speed, system operators and researchers urgently need to find more reliable methods to forecast wind speed. Research shows that wind speed time series exhibit both linear and nonlinear features. Hence, this paper develops a combined forecasting model whose weights are optimized by an improved cuckoo search algorithm and which draws on several single models (linear models, hybrid nonlinear neural networks, and a fuzzy forecasting model), so that it captures more of the trend changes in wind speed time series while also improving forecasting accuracy. Furthermore, the effectiveness of the proposed model is verified with wind speed data from four wind farm sites, and the results are more reliable and accurate than those of the comparison models.


Introduction
In the past few decades, driven by policy concerns over environmental and energy security issues, the development of renewable energy sources (RESs) has received much attention, and these sources now play an indispensable role in the global power sector [1].
Wind energy is a very rich resource on the earth. According to a report of the World Meteorological Organization (WMO), global wind energy reserves total about 2.74 × 10^9 MW, of which approximately 2 × 10^7 MW is available for development and utilization; this is greater than hydrogen energy, and it can be developed and utilized all around the world. It is estimated that the amount of wind energy is 10 times larger than hydrogen energy, and Earth's daily wind power is equivalent to the current world energy consumption. Almost all of the world's energy is produced by coal combustion, and only one-third is provided by wind energy [2].
Therefore, accurate prediction of wind speeds over several periods, from a few minutes (short-term) to a few hours (medium-term) or days (long-term), is a strategic issue that needs to be addressed. Forecasting short-term wind speed is key to improving the reliability of wind power generation systems and integrating wind energy into the power grid [3][4][5][6][7]. In order to reduce the operational cost of a wind farm, it is very important to improve short-term wind speed forecasting accuracy.
However, wind power generation has some drawbacks. One of the main problems is that wind is an intermittent energy source, which means that energy production varies greatly with factors such as wind speed, air density, and wind turbine characteristics. Another problem is that wind is usually a nondispatchable source of energy, so it is difficult to manage wind energy production according to demand. Usually, intermittency can be considered a problem related to dispatchability [8]. According to the theory of wind energy, wind power is proportional to the cube of wind speed, so seemingly trivial changes in wind speed can lead to significant changes in total wind power.
Under these conditions, wind speed forecasting is a significant foundation and premise. More accurate forecasting of wind speed can: (1) reduce the rotation of wind farm equipment and operating costs, (2) improve the wind power penetration limit, (3) help operators adjust scheduling plans in a timely manner, (4) reduce the impact of wind on the grid, and (5) effectively reduce or avoid the negative impact of wind farms on the power system, improving the competitiveness of wind power in the electricity market.
Developing applications and methods for wind speed forecasting is a difficult task. To solve wind speed forecasting problems, most studies rely on terrain features, atmospheric pressure, ambient temperature, and other meteorological information to obtain satisfactory results. However, wind speed is generally considered to be one of the most difficult weather parameters to forecast because of its chaotic and random fluctuations [9][10][11]. In order to achieve accurate wind speed forecasting, many methods and models have been proposed, which can be grouped into four categories [11]: (a) statistical models, (b) physical models, (c) artificial intelligence models, and (d) spatial correlation models.
Statistical models used to forecast wind speed usually include fuzzy methods [12] and autoregressive (AR), moving average (MA), autoregressive moving average (ARMA) [13], and autoregressive integrated moving average (ARIMA) [14] models. These models require that the time series data be stationary, or stationary after differencing, and they can capture only linear relationships, not nonlinear ones. Therefore, statistical models are not suitable for nonlinear data.
Physical methods forecast wind speed by numerical simulation, and the quality of the simulation has a great influence on prediction accuracy. Physical methods are based on large samples and place high requirements on objective conditions of the forecast area, such as pressure, temperature, terrain, and obstacles, and systematic errors often arise from the physical parameters and the processing of grids. One way to reduce these errors is to improve the grid precision of the numerical prediction method, since calculation accuracy is limited by grid resolution. However, a higher grid resolution not only takes a lot of computing time but also goes beyond the assumptions of a mesoscale model, and an improper boundary layer parameterization scheme may lead to larger calculation errors. The model itself also has some inherent defects; for example, the near-ground wind speed increases too rapidly with height, so the wind speed forecast is not accurate [15][16][17].
Artificial neural networks [18][19][20][21][22][23][24][25][26][27][28] have strong learning and mapping ability and can fit arbitrarily complex nonlinear relationships, which makes them very suitable for short-term wind speed forecasting, and research on neural networks is now quite active worldwide. Commonly, researchers forecast wind speed using back-propagation neural networks (BPNNs). Usually wind direction, temperature, pressure, and other meteorological factors related to wind speed are used as the input of the neural network, and the wind speed value is the output of the sample. Training the network establishes a nonlinear model of wind speed that can forecast future wind speed changes. However, there is usually no fixed theory for designing the network structure, which must be guided by experience. In addition, selecting appropriate training samples is also a difficult problem. During training, the neural network can easily fall into a local minimum, or overfitting to the training samples can reduce the generalization ability of the network and degrade the forecasts. Nevertheless, artificial neural networks have powerful nonlinear mapping capabilities, and the modeling process is more concise and direct than that of other models.
Unlike other models, the spatial correlation model considers not only the wind speed of the given wind farm but also the wind speed at several adjacent locations. The spatial correlation method uses the wind speed data of several adjacent sites and the spatial correlation between these sites to forecast wind speed. Because it requires wind speed data from multiple sites, the amount of data is very large and the requirements for real-time data acquisition and transmission are high; in addition, current research on this method is not yet mature.
In practice, a variety of forecasting models can be chosen to forecast a given variable. Different forecasting models provide different information, and their forecasting accuracy often differs. If forecasting models with larger errors are simply discarded, some useful forecasting information is lost, which is a waste that should be avoided. Although existing time series forecasting models have reached a high level of descriptive and predictive accuracy, any model is a simplified abstraction of the actual object and is therefore incomplete, with unavoidable limitations. A more scientific approach is to build a combined forecasting model on the basis of single models and an optimization algorithm, because combined predictions can more fully reflect the internal regularity and future trends of a dynamic phenomenon.
A combined forecasting model makes comprehensive use of several forecasting models, which are assembled in an appropriate form to forecast the variable. Combined forecasting models can avoid the loss of information incurred in fitting a single model, reduce randomness, and improve forecasting precision [29]. The main purpose of the method is to make comprehensive use of the information provided by the various models in order to improve forecasting accuracy as much as possible and to reflect the varied regularities of the system more completely.
A new combined model that integrates nonpositive constraint theory [30], two hybrid neural networks, a single nonlinear model, two linear models, and a modified cuckoo search algorithm with steepest descent (which we call SDCS) is proposed in this paper, and 10-min wind speed time series from four wind farm sites are used to examine the proposed model. Our experiments and analyses show that the combined model performs better than the two hybrid models and the three single models. Hence, the proposed combined wind speed forecasting model can help improve wind energy utilization, for example by avoiding wasted wind energy resources, saving on economic dispatching, reducing production costs, and improving the operational safety of wind turbines. The model also has reference value for practical decision-making at wind farms. Given its accuracy, the combined forecasting model is promising for future applications and can be utilized in other forecasting fields.

The major contributions and innovations of this paper are as follows:
(1) Based on decomposing the time series into the sum of interpretable components such as trend, periodic components, and noise, with no a priori assumptions about the parametric form of these components, the preprocessing technique reduces the uncertainty and irregularity of wind speed data and effectively improves the performance of wind speed forecasting.
(2) A novel weight-determination method based on a modified cuckoo search algorithm with steepest descent (SDCS) is proposed to optimize the weights and improve performance. Owing to the excellent performance of SDCS, the combined model effectively utilizes the advantages of its component models and overcomes the low precision and poor stability of traditional models.
(3) Linear and nonlinear tests are applied to the wind speed series. The results show that the wind speed data cannot be regarded as purely linear or purely nonlinear, so both linear and nonlinear models are included in the proposed model.
(4) The effectiveness of the proposed model is verified with wind speed data from four wind farm sites, and the results are more reliable and accurate than those of the comparison models. The proposed combined model includes a hybrid BPNN, which is optimized by a modified ant colony optimization algorithm; a hybrid extreme learning machine neural network (ELMNN), which is optimized by differential evolution (DE); two traditional linear statistical models (ARIMA and Holt-Winters); and a single fuzzy neural network model (adaptive network-based fuzzy inference system). Together, these components yield high accuracy and strong stability.
(5) This novel combined model provides powerful technical support for smart grid scheduling and management. The results of our experiments on wind speed data from four sites show that accurate forecasting can be achieved, which enhances the security and controllability of the power grid and supports reasonable dispatch.
The remainder of the paper is arranged as follows. Section 2 introduces singular spectrum analysis, the back-propagation neural network, the extreme learning machine neural network, the adaptive neuro-fuzzy inference system, the two linear models (autoregressive integrated moving average and Holt-Winters), the heuristic algorithms, and the optimization procedure. Section 3 presents the proposed integrated framework. Section 4 discusses the forecasting performance metrics, the forecasting results of the individual models and of the proposed combined model, and the comparisons among them. Finally, Section 5 concludes the study.

Methods
This section presents the data denoising method, the statistical models, the artificial neural networks, and the optimization algorithms.

Singular Spectrum Analysis
Singular spectrum analysis (SSA) can be used as a denoising technique and can be applied to arbitrary time series, especially nonstationary time series. The basic aim of SSA is to decompose the time series into the sum of interpretable components such as trends, periodic components, and noise with no a priori assumptions about the parametric form of these components [31].
Consider a real-valued time series $X = (x_1, \ldots, x_N)$ of length $N$. Let $L$ ($1 < L < N$) be some integer called the window length and $K = N - L + 1$.
1st step: Embedding. Form the trajectory matrix of the series $X$, which is the $L \times K$ matrix

$$\mathbf{X} = [X_1 : \ldots : X_K] = (x_{ij})_{i,j=1}^{L,K}, \qquad x_{ij} = x_{i+j-1},$$

where $X_i = (x_i, \ldots, x_{i+L-1})^T$ ($1 \le i \le K$) are lagged vectors of size $L$. The matrix $\mathbf{X}$ is a Hankel matrix, which means that $\mathbf{X}$ has equal elements $x_{ij}$ on the anti-diagonals $i + j = \text{const}$.
2nd step: Singular value decomposition. Perform the singular value decomposition (SVD) of the trajectory matrix $\mathbf{X}$. Set $S = \mathbf{X}\mathbf{X}^T$ and denote by $\lambda_1, \ldots, \lambda_L$ the eigenvalues of $S$ taken in decreasing order of magnitude ($\lambda_1 \ge \ldots \ge \lambda_L \ge 0$) and by $U_1, \ldots, U_L$ the orthonormal system of eigenvectors of $S$ corresponding to these eigenvalues. Set $d = \operatorname{rank} \mathbf{X} = \max\{i : \lambda_i > 0\}$ (note that $d = L$ for a typical real-life series) and $V_i = \mathbf{X}^T U_i / \sqrt{\lambda_i}$ ($i = 1, \ldots, d$). In this notation, the SVD of the trajectory matrix can be written as

$$\mathbf{X} = \mathbf{X}_1 + \ldots + \mathbf{X}_d,$$

where the matrices $\mathbf{X}_i = \sqrt{\lambda_i}\, U_i V_i^T$ have rank 1; these are called elementary matrices.
The collection $(\sqrt{\lambda_i}, U_i, V_i)$ is called the $i$th eigentriple (abbreviated as ET) of the SVD. The vectors $U_i$ are the left singular vectors of the matrix $\mathbf{X}$, and the numbers $\sqrt{\lambda_i}$ are the singular values, which provide the singular spectrum of $\mathbf{X}$; this gives SSA its name. The vectors $\sqrt{\lambda_i}\, V_i = \mathbf{X}^T U_i$ are called vectors of principal components (PCs).
3rd step: Eigentriple grouping. Partition the set of indices $\{1, \ldots, d\}$ into $m$ disjoint subsets $I_1, \ldots, I_m$. Let $I = \{i_1, \ldots, i_p\}$. Then the resultant matrix $\mathbf{X}_I$ corresponding to the group $I$ is defined as $\mathbf{X}_I = \mathbf{X}_{i_1} + \ldots + \mathbf{X}_{i_p}$. The resultant matrices are computed for the groups $I_1, \ldots, I_m$, and the grouped SVD expansion of $\mathbf{X}$ can now be written as

$$\mathbf{X} = \mathbf{X}_{I_1} + \ldots + \mathbf{X}_{I_m}.$$

4th step: Diagonal averaging. Each matrix $\mathbf{X}_{I_j}$ of the grouped decomposition is hankelized, and the resulting Hankel matrix is then transformed into a new series of length $N$ using the one-to-one correspondence between Hankel matrices and time series. Diagonal averaging applied to a resultant matrix $\mathbf{X}_{I_k}$ produces a reconstructed series $\tilde{X}^{(k)} = (\tilde{x}_1^{(k)}, \ldots, \tilde{x}_N^{(k)})$. In this way, the initial series $x_1, \ldots, x_N$ is decomposed into a sum of $m$ reconstructed subseries:

$$x_n = \sum_{k=1}^{m} \tilde{x}_n^{(k)}, \qquad n = 1, \ldots, N.$$

This decomposition is the main result of the SSA algorithm. The decomposition is meaningful if each reconstructed subseries can be classified as part of either the trend, some periodic component, or noise.
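The four SSA steps can be illustrated with a minimal NumPy sketch. It is only an illustration under stated assumptions, not the authors' implementation: the function names `ssa_decompose` and `ssa_denoise` are hypothetical, every eigentriple is treated as its own group, and denoising is assumed to mean keeping the `r` leading components.

```python
import numpy as np

def ssa_decompose(x, L):
    """Return the elementary reconstructed series of x (rows sum back to x)."""
    x = np.asarray(x, dtype=float)
    N = x.size
    K = N - L + 1
    # Step 1 (embedding): L x K trajectory matrix, column j is the lagged vector X_j.
    X = np.column_stack([x[j:j + L] for j in range(K)])
    # Step 2 (SVD): X = sum_i s_i * U_i * V_i^T, singular values in decreasing order.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    comps = np.zeros((s.size, N))
    for i in range(s.size):
        Xi = s[i] * np.outer(U[:, i], Vt[i])  # rank-1 elementary matrix
        # Step 4 (diagonal averaging): average Xi over each anti-diagonal.
        flipped = Xi[::-1]
        for n in range(N):
            comps[i, n] = flipped.diagonal(n - (L - 1)).mean()
    return comps

def ssa_denoise(x, L, r):
    """Step 3 (grouping), assumed here: keep the r leading eigentriples as signal."""
    return ssa_decompose(x, L)[:r].sum(axis=0)
```

For example, `ssa_denoise(series, L=12, r=2)` would return a smoothed version of `series`; the window length and the number of retained components are tuning choices that this section does not fix.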

Forecasting Models and Methods
In this paper, the proposed combined model, based on the nonpositive constraint theory, integrates two single statistical forecasting models, two optimized artificial neural network (ANN) forecasting models, and the adaptive network-based fuzzy inference system (ANFIS), which is based on the Takagi-Sugeno fuzzy inference system.

Back-Propagation Neural Network (BPNN) Model
BPNN is a type of multilayer feed-forward neural network with a wide variety of applications. It is based on a gradient descent method that minimizes the sum of the squared errors between the actual and desired output values. The transfer function of the neurons is of the sigmoid type, so the output lies between 0 and 1 and provides a continuous nonlinear mapping from input to output [32].
The topology of the BPNN is described as follows. The input and output data are first normalized using their extreme values, where $X_{\min}$ and $X_{\max}$ are the minimum and maximum values of the input or output vectors and $X_i$ denotes the real value of each vector.
Step 1. Calculate the outputs of all hidden layer nodes.

$$net_j = \sum_i w_{ji} x_i + b_j, \qquad y_j = f(net_j), \qquad j = 1, \ldots, 2n,$$

where $net_j$ is the activation value of node $j$, $w_{ji}$ represents the connection weight from input node $i$ to hidden node $j$, $b_j$ represents the bias of neuron $j$, $y_j$ represents the output of hidden layer node $j$, and $f$ is the activation function of a node, which is usually a sigmoid function.
Step 2. Calculate the output data of the neural network.

$$O_1 = f_0\Big(\sum_j w_{0j}\, y_j + b_0\Big),$$

where $w_{0j}$ represents the connection weight from hidden node $j$ to the output node, $b_0$ represents the bias of the output neuron, $O_1$ represents the output data of the network, and $f_0$ is the activation function of the output layer node.
Step 3. Minimize the global error via the training algorithm.

$$E = \sum_{k=1}^{m} (Z_k - O_k)^2,$$

where $Z$ represents the real data vector of the output and $m$ represents the number of outputs.
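As a rough illustration of Steps 1 to 3, the sketch below runs the forward pass of a one-hidden-layer network with a single output and applies plain gradient descent to one training sample. Several details are assumptions rather than statements from the paper: min-max scaling to [0, 1] is assumed for the normalization, the sigmoid is used in both layers, the error is taken as half the squared error, and all function names are hypothetical.

```python
import numpy as np

def minmax_scale(v, v_min, v_max):
    # Assumed normalization: map [v_min, v_max] linearly onto [0, 1].
    return (v - v_min) / (v_max - v_min)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W, b, w0, b0):
    y = sigmoid(W @ x + b)    # Step 1: hidden outputs y_j = f(net_j)
    o = sigmoid(w0 @ y + b0)  # Step 2: network output O_1
    return y, o

def train_one_sample(x, z, W, b, w0, b0, lr=0.5, n_steps=200):
    """Step 3: gradient descent on E = 0.5 * (z - o)^2 for a single sample."""
    losses = []
    for _ in range(n_steps):
        y, o = forward(x, W, b, w0, b0)
        losses.append(0.5 * (z - o) ** 2)
        delta_o = (o - z) * o * (1 - o)       # error signal at the output node
        delta_h = delta_o * w0 * y * (1 - y)  # error back-propagated to hidden nodes
        w0 = w0 - lr * delta_o * y
        b0 = b0 - lr * delta_o
        W = W - lr * np.outer(delta_h, x)
        b = b - lr * delta_h
    return W, b, w0, b0, losses
```

In the hybrid model described later, the initial weights and biases are chosen by a heuristic optimizer rather than at random; this sketch covers only the gradient-descent part.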

Extreme Learning Machine Neural Network Model
The ELMNN is a type of single hidden-layer, feed-forward neural network (SLFN) in which the hidden layer parameters do not need to be tuned [33].
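For a given dataset $\{(x_j, t_j)\}_{j=1}^{N}$, let $g(x)$ be the activation function of a network with $L$ hidden layer nodes. The computational steps of the standard ELMNN are as follows:
(1) Randomly assign the input weights $w_i$ and the hidden layer biases $b_i$, $i = 1, \ldots, L$.
(2) The output of the feed-forward neural network with activation function $g(x)$ is expressed as

$$\sum_{i=1}^{L} \beta_i\, g(w_i \cdot x_j + b_i) = t_j, \qquad j = 1, \ldots, N,$$

where the hidden layer output matrix $H$ is

$$H = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_L \cdot x_1 + b_L) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_L \cdot x_N + b_L) \end{bmatrix}_{N \times L}.$$

Thus the output expression can be simplified as $H\beta = T$, where $\beta = [\beta_1^T, \ldots, \beta_L^T]^T$ and $T = [t_1^T, \ldots, t_N^T]^T$.
(3) The output weight matrix $\beta$ can be obtained by

$$\beta = H^{+} T,$$

where $H^{+}$ represents the Moore-Penrose generalized inverse of the hidden layer output matrix.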
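The three ELMNN steps translate almost directly into NumPy. The following is a minimal sketch, assuming a sigmoid activation and standard normal random weights (neither is specified in this section); `elm_fit` and `elm_predict` are hypothetical names.

```python
import numpy as np

def hidden_output(X, W, b):
    # H[j, i] = g(w_i . x_j + b_i), with g assumed to be the sigmoid function.
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def elm_fit(X, T, n_hidden, rng):
    """X: (N, n_features) inputs, T: (N,) or (N, n_outputs) targets."""
    W = rng.normal(size=(X.shape[1], n_hidden))  # (1) random input weights
    b = rng.normal(size=n_hidden)                # (1) random hidden biases
    H = hidden_output(X, W, b)                   # (2) hidden layer output matrix
    beta = np.linalg.pinv(H) @ T                 # (3) beta = H^+ T
    return W, b, beta

def elm_predict(X, W, b, beta):
    return hidden_output(X, W, b) @ beta
```

Because only `beta` is solved for, training reduces to one pseudo-inverse. In the DE-ELMNN hybrid, the random draw of `W` and `b` would be replaced by values selected by differential evolution.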

Fuzzy Inference Systems Model: Adaptive Neuro-Fuzzy Training of Sugeno-Type FIS
An adaptive neuro-fuzzy inference system or adaptive network-based fuzzy inference system (ANFIS) is a kind of artificial neural network based on the Takagi-Sugeno fuzzy inference system. The technique was developed in the early 1990s. Since it integrates neural networks and fuzzy logic principles, it can potentially capture the benefits of both in a single framework. Its inference system corresponds to a set of fuzzy if-then rules that have learning capability to approximate nonlinear functions, as shown in Figure 1 [34].

Statistical Linear Models: Autoregressive Integrated Moving Average and Holt-Winters
The autoregressive integrated moving average (ARIMA) model is one of the most popular forecasting models [35]. The ARIMA model can be expressed as follows:

$$y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \ldots + \phi_p y_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \ldots - \theta_q \varepsilon_{t-q},$$

where $y_i$ ($i = 1, 2, \ldots, t$) is the actual value, $\varepsilon_i$ ($i = 1, 2, \ldots, t$) is the random error at time $i$, $\phi_i$ and $\theta_i$ represent the coefficients, and $p$ and $q$ are integers that are often referred to as the orders of the autoregressive and moving average polynomials, respectively [36].
The output of the Holt-Winters (HW) method is written as $F_{t+m}$, an estimate of the value of $x$ at time $t + m$, $m > 0$, based on the raw data up to time $t$. Suppose we have a sequence of observations $\{x_t\}$, beginning at time $t = 0$, with a cycle of seasonal change of length $L$. In the forecast formula and the recursive updating equations, $\alpha$ is the data smoothing factor, $\beta$ is the trend smoothing factor, and $\gamma$ is the seasonal change smoothing factor; these three parameters lie between 0 and 1. $\{s_t\}$ and $\{b_t\}$ represent the smoothed value of the constant part for time $t$ and the sequence of best estimates of the linear trend that are superimposed on the seasonal changes, respectively. $\{c_t\}$ is the sequence of seasonal correction factors.
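The Holt-Winters updating equations themselves are not reproduced in this section. The sketch below therefore assumes the common multiplicative-seasonality form, which matches the description of $\{c_t\}$ as correction factors: $s_t = \alpha x_t / c_{t-L} + (1-\alpha)(s_{t-1} + b_{t-1})$, $b_t = \beta (s_t - s_{t-1}) + (1-\beta) b_{t-1}$, $c_t = \gamma x_t / s_t + (1-\gamma) c_{t-L}$, and $F_{t+m} = (s_t + m b_t)\, c_{t-L+1+(m-1) \bmod L}$. The initialization from the first two seasons and the function name are also assumptions.

```python
import numpy as np

def holt_winters_multiplicative(x, L, alpha, beta, gamma, m):
    """Forecast m steps ahead; needs at least two full seasons of positive data."""
    x = np.asarray(x, dtype=float)
    # Assumed initialization: level and trend from the first two seasonal means.
    s = x[:L].mean()
    b = (x[L:2 * L].mean() - x[:L].mean()) / L
    c = list(x[:L] / s)  # seasonal correction factors c_0 .. c_{L-1}
    for t in range(L, x.size):
        s_prev = s
        s = alpha * x[t] / c[t - L] + (1 - alpha) * (s_prev + b)  # level
        b = beta * (s - s_prev) + (1 - beta) * b                  # trend
        c.append(gamma * x[t] / s + (1 - gamma) * c[t - L])       # seasonality
    t = x.size - 1
    # F_{t+h} = (s_t + h * b_t) * c_{t - L + 1 + (h - 1) mod L}
    return np.array([(s + h * b) * c[t - L + 1 + (h - 1) % L]
                     for h in range(1, m + 1)])
```

The multiplicative form divides by the seasonal factors and the level, so it suits strictly positive series; an additive variant would be the safer choice for wind speed records that contain calm (zero) periods.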

Combined Model
The combined forecasting theory [29] states that, when a forecasting problem can be solved by M types of forecasting models, the weight coefficients should be properly selected and the weighted results of the individual forecasting methods should then be added up. In this sense, combination is regarded as an improvement on single models and as an effective and simple way to improve forecasting stability [37].
This paper proposes a new combined model based on weight-coefficient optimization. In order to improve the forecasting accuracy of the combined model, we modified the cuckoo search method with steepest descent to address the slow convergence speed and the imprecise convergence during the later period of CS optimization. Experiment V shows that the optimizing performance of SDCS is better than that of CS. Thus, SDCS was employed to optimize the weight coefficients of the combined model. The proposed combined model, based on nonpositive constraint theory (NPCT), consolidates several models, including two hybrid neural networks, a fuzzy model, and two linear models, in order to take advantage of each model.
Definition 1. The traditional combined forecasting method attempts to find the best weights of the combined model by minimizing the SSE:

$$\min J = L^T E L = \sum_{i=1}^{m} \sum_{j=1}^{m} l_i l_j E_{ij}, \qquad \text{s.t.}\ R^T L = 1,\ L \ge 0,$$

where $L = (l_1, l_2, \ldots, l_m)^T$ is the weight vector, $R = (1, 1, \ldots, 1)^T$ is a column vector whose elements are all 1, $E_{ij} = e_i^T e_j$ with $e_i = (e_{i1}, e_{i2}, \ldots, e_{in})^T$ the error vector of the $i$th model, and $E = (E_{ij})_{m \times m}$ is called the error information matrix.
Definition 2. An improvement of the traditional combined method, which removes the nonnegativity constraint on the weights, is given as follows:

$$\min J = L^T E L = \sum_{i=1}^{m} \sum_{j=1}^{m} l_i l_j E_{ij}, \qquad \text{s.t.}\ R^T L = 1.$$

The weights are no longer limited to the range [0, 1]. The experimental results show that when the weights take values in the range [-2, 2], the combined model can obtain desirable results; this method is regarded as the nonpositive constraint theory (NPCT) [30]. This section provides a weight-determination method that was assessed by experimental simulation rather than by theoretical proof.
The branch models of the combined model proposed in this paper are adaptive particle swarm optimization ant colony optimization (APSOACO)-BPNN, DE-ELMNN, ANFIS, ARIMA, and HW, as shown in Section 3.
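To make the weight problem of Definition 2 concrete, the sketch below solves it in closed form with a Lagrange multiplier, which gives $L = E^{-1}R / (R^T E^{-1} R)$ when only the constraint $R^T L = 1$ is imposed. This is a substitute for illustration: the paper optimizes the weights with SDCS and restricts them to [-2, 2], whereas this sketch ignores the box constraint and assumes that $E$ is invertible. The function name is hypothetical.

```python
import numpy as np

def combination_weights(errors):
    """errors: (m, n) array whose row i holds the n forecast errors e_i of model i."""
    E = errors @ errors.T       # error information matrix, E_ij = e_i^T e_j
    R = np.ones(E.shape[0])
    w = np.linalg.solve(E, R)   # proportional to E^{-1} R
    return w / (R @ w)          # rescale so that the weights sum to 1

def combined_sse(weights, errors):
    # J = L^T E L, the sum of squared errors of the weighted combination.
    return float(weights @ (errors @ errors.T) @ weights)
```

Weights obtained this way can be negative or exceed 1, which is exactly what dropping the nonnegativity constraint allows; a bounded search such as SDCS would additionally keep them inside the experimentally chosen range.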

Heuristic Algorithm
Many parameters, weights, thresholds, and initial values are required for neural networks, and different parameter values have a great impact on the output results. Many methods inspired by the laws and rules of nature have been devised to solve practical problems; these are often called heuristic algorithms. Single heuristic algorithms have limited problem-solving ability, and hybrid algorithms that combine the characteristics of different meta-heuristic algorithms have gradually become a research hotspot.

Differential Evolution Algorithm
Similar to other evolutionary algorithms, the differential evolution (DE) algorithm is a stochastic model that simulates biological evolution. It optimizes a problem by iteratively trying to improve candidate solutions. Compared with other evolutionary algorithms, DE retains the population-based global search strategy and coding but uses a simple difference-based mutation operation and a competition strategy, which reduces the complexity of the genetic operations. At the same time, DE has a specific memory capability that allows it to dynamically track the current search state and adjust its search strategy, and it has strong global convergence ability and robustness [38].
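The difference-based mutation and the competition (selection) step can be seen in a compact sketch of the classic DE/rand/1/bin scheme. The choice of this particular variant, the parameter defaults (`F`, `CR`, population size), and the function name are assumptions made for illustration; the section does not state which DE variant the authors use.

```python
import numpy as np

def differential_evolution(f, bounds, pop_size=20, F=0.5, CR=0.9, n_gen=200, seed=0):
    """Minimize f over box bounds given as a list of (low, high) pairs."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    dim = lo.size
    pop = rng.uniform(lo, hi, size=(pop_size, dim))
    fit = np.array([f(p) for p in pop])
    for _ in range(n_gen):
        for i in range(pop_size):
            # Mutation: combine three distinct individuals other than i.
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, 3, replace=False)]
            mutant = np.clip(a + F * (b - c), lo, hi)
            # Binomial crossover: take at least one coordinate from the mutant.
            mask = rng.random(dim) < CR
            mask[rng.integers(dim)] = True
            trial = np.where(mask, mutant, pop[i])
            # Selection: the trial replaces its parent only if it is no worse.
            f_trial = f(trial)
            if f_trial <= fit[i]:
                pop[i], fit[i] = trial, f_trial
    best = int(np.argmin(fit))
    return pop[best], fit[best]
```

For the DE-ELMNN hybrid, `f` would map a flattened vector of ELMNN input weights and biases to a validation error, although the exact encoding is not given in this section.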

Hybrid Adaptive Particle Swarm Optimization based on Ant Colony Optimization
Adaptive particle swarm optimization (APSO) is used in this paper to improve the accuracy of the search [39]. As the value of the particle fitness function changes, the inertia weight is automatically adjusted to better guide the direction of the particle search. Although APSO does not converge fast, it does not easily fall into local extreme points. Ant colony optimization (ACO), inspired by the behavior of ant colonies searching for food [40,41], is used for continuous search space optimization problems and has been tested on benchmark functions as well [42].
APSOACO was developed on the basis of APSO. It combines the foraging behavior of ACO with the velocity update of APSO. A sigmoid function is used to convert distance and velocity into heuristic values. The advantages of the hybrid algorithm are as follows: (1) it avoids convergence to a local optimum, (2) it provides a better solution within fewer iterations, i.e., fast convergence, and (3) it has low computational complexity. The basic steps of APSOACO are shown in Algorithm 1 and Figure 2.

The cuckoo search (CS) algorithm was derived from the behavior of cuckoos laying their eggs in other birds' nests so that those birds hatch the eggs for them [43]. However, once the host birds discover the cuckoo eggs, they throw the eggs away or abandon their nests and build new nests elsewhere. In CS, every nest stands for a solution. The CS algorithm is constructed on three assumptions: (a) each cuckoo lays only one egg at a time, in a randomly selected nest; (b) the best nests are carried over to succeeding generations; and (c) the number of available host nests is constant, and the probability that a host bird discovers an egg laid by a cuckoo is p, which ranges from 0 to 1.
Similar to other meta-heuristic algorithms, the original CS algorithm is simple and efficient; however, it has disadvantages, such as insufficient search vigor and slow search speed during the latter part of the search.Therefore, this paper proposes an improved CS algorithm, which we call the SDCS model, based on the steepest descent (SD) method [44].
As one of the oldest optimization algorithms, the steepest descent method is simple and intuitive, and many effective optimization algorithms have been built on it. To overcome the slow convergence rate of CS, the steepest descent method is used to modify the cuckoo search method. In the cuckoo search, new eggs (candidate solutions) are generated by a random walk of the form

$$x^{(t+1)} = x^{(t)} + \alpha \oplus \mathrm{Levy}(\lambda),$$

where $\alpha > 0$ is the step size related to the scale of the problem of interest, the product $\oplus$ means entry-wise multiplication, $\mathrm{Levy}(\lambda)$ is a Levy flight, which is the most powerful feature of the cuckoo search, and $t$ is the iteration number. The modified process can be expressed by the following steps, where $\nabla$ is the gradient operator and $f$ is the objective function:
Step 1. Select the initial point $x_0$ and give the end error $\varepsilon > 0$. Set $k = 0$.
Step 2. Compute $\nabla f(x_k)$. If $\lVert \nabla f(x_k) \rVert \le \varepsilon$, stop iterations and output $x_k$. Otherwise go to Step 3.
Step 3. Take $p_k = -\nabla f(x_k)$.
Step 4. Conduct a one-dimensional search. Solve for $t_k$ such that $f(x_k + t_k p_k) = \min_{t \ge 0} f(x_k + t p_k)$, set $x_{k+1} = x_k + t_k p_k$ and $k = k + 1$, and go to Step 2.
Owing to its simplicity and flexibility, steepest descent can be used to improve the step size and the step-length distribution function of the cuckoo search algorithm. The final optimal solution is obtained by repeatedly modifying the step size and the step-length distribution function.
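How the Levy-flight search and the steepest descent steps fit together is not fully recoverable from the text, so the following is only one plausible sketch. It assumes that each generation performs a Levy-flight move (steps drawn with Mantegna's algorithm), abandons a fraction `pa` of nests, and then refines the current best nest with a few steepest descent iterations that use a numerical gradient and a backtracking line search. All names and defaults are hypothetical.

```python
import math
import numpy as np

def levy_steps(rng, shape, lam=1.5):
    # Mantegna's algorithm for Levy-distributed step lengths.
    sigma = (math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
             / (math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    return rng.normal(0, sigma, shape) / np.abs(rng.normal(0, 1, shape)) ** (1 / lam)

def num_grad(f, x, h=1e-6):
    # Central-difference approximation of the gradient of f at x.
    return np.array([(f(x + h * e) - f(x - h * e)) / (2 * h) for e in np.eye(x.size)])

def steepest_descent(f, x, eps=1e-8, max_iter=5):
    for _ in range(max_iter):
        g = num_grad(f, x)
        if np.linalg.norm(g) <= eps:  # Step 2: stopping test
            break
        p, t = -g, 1.0                # Step 3: descent direction
        # Step 4: backtracking line search (an inexact one-dimensional search).
        while f(x + t * p) > f(x) - 1e-4 * t * (g @ g) and t > 1e-12:
            t *= 0.5
        x = x + t * p
    return x

def sdcs(f, lo, hi, n_nests=15, pa=0.25, alpha=0.1, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(lo, dtype=float), np.asarray(hi, dtype=float)
    nests = rng.uniform(lo, hi, (n_nests, lo.size))
    fit = np.array([f(x) for x in nests])
    for _ in range(n_iter):
        # Levy-flight move; a new egg is kept only if it improves its nest.
        new = np.clip(nests + alpha * levy_steps(rng, nests.shape), lo, hi)
        new_fit = np.array([f(x) for x in new])
        better = new_fit < fit
        nests[better], fit[better] = new[better], new_fit[better]
        # Host birds discover and abandon a fraction pa of nests (never the best).
        abandon = rng.random(n_nests) < pa
        abandon[np.argmin(fit)] = False
        if abandon.any():
            nests[abandon] = rng.uniform(lo, hi, (int(abandon.sum()), lo.size))
            fit[abandon] = [f(x) for x in nests[abandon]]
        # Assumed hybrid step: refine the best nest by steepest descent.
        b = int(np.argmin(fit))
        refined = np.clip(steepest_descent(f, nests[b].copy()), lo, hi)
        if f(refined) < fit[b]:
            nests[b], fit[b] = refined, f(refined)
    b = int(np.argmin(fit))
    return nests[b], fit[b]
```

For the combined model, `f` would be the SSE of the weighted forecasts and the bounds would be the experimentally chosen range [-2, 2] per weight, with the sum-to-one constraint handled, for instance, by normalizing the weights inside `f`.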

Hybrid Models
Section 2.2 introduced the different forecasting models and Section 2.3 introduced some heuristic algorithms. In this section, APSO, ACO, and APSOACO are used to optimize the weights and biases of the nonlinear back-propagation neural network (Figure 2), the differential evolution (DE) algorithm is used to optimize the weights and biases of the nonlinear ELMNN (Figure 2), and a modified cuckoo search algorithm based on steepest descent (SDCS) is used to optimize the weight vector of the combined forecasting model, so that each model attains its minimum error (Figure 3).

Framework of the Proposed Forecasting Models
In this paper, five forecasting techniques (APSOACO-BPNN, DE-ELMNN, ANFIS, ARIMA, and HW) were used because of their ability to handle nonlinear and linear problems. When the data are linear, the linear models ARIMA and HW have good forecasting ability. When the data are nonlinear, as with wind speed data, which are often random and intermittent, the performance of these linear models is not very good. Therefore, this study presents a combined linear and nonlinear model to solve the forecasting task. The structure of the proposed combined model, which has four phases, is shown in Figure 3.
Phase I: Utilize SSA to preprocess the original short-term wind speed series. In this phase, the trajectory matrix is constructed from the observed wind speed time series and is then decomposed and reconstructed, so that the different components of the original time series signal are extracted. This reveals the structure of the time series and can further improve the prediction precision.

Phase II: Construct the test wind speed datasets and forecast the wind speed time series with the different forecasting models. Note that the original wind speed time series is divided into a training set and a test set. When training the network, the input is the filtered data and the output is the original time series of the training set. In the test step, the input is again the filtered data and the output is the original data.
Phase III: Use the test data to choose the branch models of the combined model. Each model performs differently on the test data. We also used optimization algorithms to optimize BPNN and ELMNN, and we chose the best hybrid models to build the combined model.
Phase IV: Combine the single forecasting models according to the combined model theory, that is, combine the independent forecasts generated by the aforementioned forecasting engines. NPCT is used to predict the distribution of future wind speed, with the independent forecasts generated by the forecasting engines as input for the different forecasting horizons. Thereafter, the combined model can be used to forecast the wind speed time series.
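The preprocessing in Phase I can be illustrated with a minimal sketch of basic SSA (embedding, singular value decomposition, grouping, diagonal averaging). The function name `ssa_denoise`, the window length, and the number of retained components are assumptions made for illustration; the paper's actual SSA settings are not given in this section.

```python
import numpy as np

def ssa_denoise(series, window, n_components):
    """Reconstruct a series from its leading SSA components (illustrative sketch)."""
    x = np.asarray(series, dtype=float)
    n = x.size
    k = n - window + 1
    # Embedding: column j of the trajectory matrix is x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    # Decomposition: singular value decomposition of the trajectory matrix.
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    # Grouping: keep only the leading components (assumed to carry the signal).
    r = min(n_components, s.size)
    approx = (u[:, :r] * s[:r]) @ vt[:r, :]
    # Diagonal averaging: entries with the same i + j map to the same time index.
    recon = np.zeros(n)
    counts = np.zeros(n)
    for i in range(window):
        recon[i:i + k] += approx[i, :]
        counts[i:i + k] += 1
    return recon / counts
```

Keeping every component returns the original series, so the amount of filtering is controlled entirely by `n_components`; in practice this choice would need to be tuned for each site.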

Experimental Study
To verify the effectiveness of our proposed combined model, 10-min wind speed series were used to test the performance of the models. The data include samples from every wind speed observation site at two wind farms in China in 2015 and are used to verify the effectiveness and reliability of the combined model. In our experiments, the size of the training datasets is limited by the moving window method and is set to 2016 samples for Site 1 in March, Site 2 in June, Site 3 in September, and Site 4 in November.
There are five experiments in this paper. Experiment I studies the features of the wind speed time series. In wind speed forecasting, choosing which forecasting models to employ is a challenging task, so Experiment I uses linear and nonlinear functions to test the features of the wind speed time series.
Experiment II compares the developed hybrid BPNN with two other hybrid forecasting models, APSO-BPNN and ACO-BPNN. We also compared single optimization and hybrid optimization of the weights and thresholds of the nonlinear BPNN. Experiment II also shows how differential evolution optimizes the weight matrices of the hidden and output layers and the bias matrix between the two layers, in order to compare the single ELMNN with DE-ELMNN.
Experiment III shows the performance of each forecasting model at each time point (hours). We compared two single linear forecasting models, a single fuzzy forecasting model, and two hybrid nonlinear models using the forecasting results of Monday from Site 1.
Experiment IV presents the forecasting results of the branch models and the combined model at four sites to test the performance of the models at different sites.
Experiment V tests SDCS against CS and reveals that SDCS is better than CS.

Metric of Forecasting Model and Experiment Setup
Accuracy of results is a significant criterion in this paper. Four metrics, average error (AE), mean absolute error (MAE), mean square error (MSE), and mean absolute percentage error (MAPE), as shown in Table 1, are used to judge the performance of the forecasting models.
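As an illustration, the four metrics can be computed as in the sketch below. Table 1 is not reproduced in this section, so the commonly used definitions are assumed here; in particular, the sign convention of AE (actual minus forecast) and the function name `forecast_metrics` are assumptions.

```python
import numpy as np

def forecast_metrics(actual, forecast):
    """Return AE, MAE, MSE and MAPE (in percent) under the usual definitions."""
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    err = a - f  # assumed sign convention: actual minus forecast
    return {
        "AE": err.mean(),                        # average error (can cancel out)
        "MAE": np.abs(err).mean(),               # mean absolute error
        "MSE": (err ** 2).mean(),                # mean square error
        "MAPE": 100.0 * np.abs(err / a).mean(),  # mean absolute percentage error
    }

# Small usage example with made-up wind speeds in m/s.
metrics = forecast_metrics([10.0, 8.0, 5.0], [9.0, 8.0, 6.0])
```

Because positive and negative errors cancel in AE, it indicates forecast bias rather than accuracy, which is why it is reported alongside MAE, MSE, and MAPE.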
According to the hypothesis tests in Table 2, the wind speed data show both linear and nonlinear features. Therefore, including both linear and nonlinear models in our proposed forecasting model is correct and necessary. Note: In order to verify the linear or nonlinear character of wind speed, three functions were structured: (1) linear

Experiment II: Forecasting Results of Hybrid Models and Single Models
In Section 2.4, we proposed four hybrid models: ACO-BPNN, APSO-BPNN, APSOACO-BPNN, and DE-ELMNN. In this section, the performance of the three hybrid BPNNs is compared to verify that the hybrid BPNN used in the combined model obtains better results than the other hybrid BPNNs. ELMNN and DE-ELMNN are also compared in order to select the better model.

Comparison of Two ELMNNs: ELMNN and DE-ELMNN
ELMNN is a novel, fast-learning neural network based on a modification of the traditional single-hidden-layer feed-forward network. Its obvious advantage is that the weights and thresholds between the input layer and the hidden layer are assigned randomly and do not need to be adjusted during learning, so the training process is extraordinarily fast. The output weights are then calculated through an inverse operation on the hidden-layer output matrix, which is determined by the random parameters, according to Equation (10). However, because the input weights and hidden biases are initialized with random numbers, the learning process of ELMNN has poor stability, as with other neural networks. Section 2.3 of this paper introduces how to use the DE algorithm to search for the weights and threshold values of ELMNN. In this experiment, the performance of the hybrid model SSA-DE-ELMNN and the single model SSA-ELMNN was tested.
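The basic ELM training step described above can be sketched as follows. This is a minimal illustration that assumes a sigmoid activation and uniform random initialization in [-1, 1]; the function names are hypothetical, Equation (10) is not reproduced here, and the DE search over input weights and biases used in DE-ELMNN is not included.

```python
import numpy as np

def elm_fit(x, y, n_hidden, rng):
    """Train a basic ELM: random hidden layer, least-squares output weights."""
    # Input weights and hidden biases are drawn at random and never updated.
    w = rng.uniform(-1.0, 1.0, size=(x.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    h = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # hidden-layer output matrix
    # Output weights via the Moore-Penrose pseudo-inverse of h.
    beta = np.linalg.pinv(h) @ y
    return w, b, beta

def elm_predict(x, w, b, beta):
    """Evaluate the trained ELM on new inputs."""
    h = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return h @ beta
```

Because `w` and `b` depend only on the random generator, two runs with different seeds generally give different forecasts, which is the instability that motivates searching these parameters with DE.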
Table 3 shows the forecasting results for Sites 1-4 over a week; it is clear that the DE-ELMNN model performs much better than the single ELMNN. To explain the results of the proposed method, we use the first site as an example. First, the hybrid DE-ELMNN model has smaller AE, MAE, MSE, and MAPE values than the single ELMNN. As Table 3 shows, for the Monday results at site 1, the MAPE of our hybrid DE-ELMNN is 5.85% while that of the single ELMNN is 6.30%; thus, the precision is improved by 0.45%. The largest improvement, 0.98%, occurs on Wednesday at site 1, and the smallest MAPE improvement of DE-ELMNN in Table 3 is 0.01%. Second, under these conditions, the results reveal that DE-ELMNN is more effective than ELMNN. Remark: Table 3 and Figure 4 indicate that SSA-DE-ELMNN performs better than the single SSA-ELMNN. In brief, SSA, which denoises the wind speed time series as a preprocessing method, and the DE algorithm, which has a better ability to find optimized parameters for ELMNN, together improve forecasting accuracy. Furthermore, because the vertical dataset selection method gives the dataset an optimal structure, DE-ELMNN can reach the minimum error effectively.

Comparison of Three Optimization BPNNs: ACO-BPNN, APSO-BPNN, and APSOACO-BPNN
The traditional BPNN has been widely applied to wind speed forecasting, but it has defects: it easily falls into local minima and has low forecasting accuracy. Section 2.3 therefore introduces the modified ant colony optimization and the use of APSOACO to optimize the BPNN as a strong predictor that improves the accuracy of the forecasting results. This experiment tested the three optimized nonlinear BPNNs in order to choose the best-performing hybrid nonlinear BPNN as a component of the combined forecasting model. APSO and ACO have many advantages. ACO has strong global search ability, is robust, and is easy to combine with other algorithms; APSO converges quickly, has few parameters, and is simple and easy to operate. However, ACO tends to converge slowly, and APSO easily falls into local extreme points. All of these defects affect the parameters that the algorithms optimize. Taking into account the advantages and disadvantages of ACO and APSO, the APSOACO algorithm is proposed to improve the capacity of parameter optimization.
As the results in Table 4 show, the hybrid SSA-APSOACO-BPNN model is clearly superior to SSA-APSO-BPNN and SSA-ACO-BPNN according to the values of MAE, MSE, and MAPE. The MAPE of SSA-APSOACO-BPNN is 6.08%, 3.67%, 4.62%, 4.90%, 5.55%, 4.21%, and 3.21% from Monday to Sunday at site 1, respectively. SSA-APSOACO-BPNN not only obtained high accuracy and stability, but also showed that the wind speed time series is well suited to being forecast by this model. Judging by the MAE, MSE, AE, and MAPE values in the experiments, the proposed SSA-APSOACO-BPNN achieves the best performance most of the time. The exceptions are that SSA-APSO-BPNN achieved better performance on Sunday at site 3 and SSA-ACO-BPNN obtained better results on Tuesday at site 2 and Saturday at site 3.
Figure 5A shows that the MAE, MSE, and MAPE values of APSOACO-BPNN are 0.4116, 0.2663, and 3.57%, respectively, on Tuesday at site 2, which are lower than those of APSO-BPNN and ACO-BPNN. For the other days at site 2, APSOACO-BPNN also clearly performs better than the other hybrid BPNNs. Figure 5C shows the 95% confidence intervals (CIs) obtained by the three hybrid BPNNs; the upper and lower CI bounds are close among the three hybrid models, but for APSOACO-BPNN more points fall within the confidence interval.
(1) Part A of Figure 6 shows the MAPE values of the five branch models; ARIMA performs worst, and the MAPE values of the other four models are approximately equal.
(2) From Figure 6B, the MSE and MAE of the five models are not high, except at 18:00 and from 21:00 to 24:00. This indicates that the single models perform about the same before 21:00.
(3) From Figure 6B, the forecasting results of ARIMA are closer to the actual values than those of the other models at some hourly points, but not at all of them.
(4) Figure 6C shows the 95% confidence intervals (CIs) obtained by the five branch models; the upper and lower CI bounds are close among the three nonlinear models, whereas the linear models have larger CIs, which indicates that the linear models reached worse results than the nonlinear models.

Remark:
The preceding experiment reveals that no model reaches the best results at every time point; each model has advantages and disadvantages. A combined model can bring the forecasting models together to overcome this dilemma. It has been regarded as an improvement on single models and as an effective and simple way to improve forecasting stability. Understanding and predicting wind speed is very important for wind farm calculations and for estimating the generation capacity and structural load of wind turbines [45]. This study provides a scientific basis for wind turbine design, wind farm siting, and load reduction strategies.

Experiment IV: The Performance of Hybrid Models, Single Models and Combined Models at Four Sites
As the results in Tables 6 and 7 show, ANFIS is a special method in the development of neuro-fuzzy networks and gives good results in modeling nonlinear functions. ANFIS uses a hybrid algorithm combining a feed-forward network and least squares to adjust the premise parameters and conclusion parameters, and it can automatically generate if-then rules that perform well on nonlinear tasks. ARIMA and HW, which are applied to linear and nonstationary time series, are used in this experiment to judge whether these models fit the wind speed time series or can forecast its future points.
Figure 7 shows the following: (1) From Part A, our proposed combined model achieved the most accurate results among the compared models on these days, as shown by its lower MAE values. The figure also shows that SSA-DE-ELMNN performed better than the other single models, because the learning speed of ELMNN is extraordinarily fast. ELMNN had better generalization performance among feed-forward neural networks and tended to reach solutions directly without such trivial issues [33].
Tables 6 and 7 further show that wind speed forecasting performance is improved by our proposed combined model with SDCS. One week of wind speed data at Sites 1-4 was used as experimental data. The forecasts of our proposed combined model are clearly more accurate than those of the single hybrid models at each site. This result reflects the reliability and stability of our proposed combined model, given the random nature of the wind and its spatiotemporal variation. For Sites 1-4, SSA-DE-ELMNN is always superior to any other single hybrid forecasting model. For example, on Monday at site 3, SSA-DE-ELMNN outperformed the other single hybrid models with a lower MAPE (4.48%) than the SSA-APSOACO-BPNN, SSA-ARIMA, SSA-ANFIS, and SSA-HW models, whose values are 4.84%, 7.96%, 6.12%, and 5.30%, respectively. In addition, our proposed combined model improves the MAPE by at least 0.11% (Thursday at site 2), and it also has higher accuracy and stability in wind speed forecasting.
Remark: Our proposed combined model is more effective than the hybrid linear or nonlinear single models in forecasting wind speed at all sites. The minimum values of the performance metrics were obtained by the combined model.
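The idea of combining branch forecasts with a weight vector can be sketched as follows. In the paper the weights are optimized with SDCS; this sketch swaps in ordinary least squares as a simple stand-in and assumes unconstrained weights, so it illustrates the combination step only, not the authors' optimizer. The function names are hypothetical.

```python
import numpy as np

def combine_forecasts(forecasts, weights):
    """Weighted sum of branch forecasts; forecasts has shape (n_samples, n_models)."""
    return np.asarray(forecasts, dtype=float) @ np.asarray(weights, dtype=float)

def fit_weights_least_squares(forecasts, actual):
    """Stand-in for SDCS: weights minimizing the squared error on the fitting data."""
    f = np.asarray(forecasts, dtype=float)
    a = np.asarray(actual, dtype=float)
    weights, *_ = np.linalg.lstsq(f, a, rcond=None)
    return weights
```

On the data used to fit the weights, the least-squares combination cannot have a larger squared error than any single branch model, because giving one model a weight of 1 and the others 0 is itself a feasible weight vector. Performance on unseen data is not guaranteed in the same way.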

Experiment V: Test of SDCS
The ability of CS and SDCS to find the global optimal solutions of multiple test functions is evaluated in this section. The test functions and the parameters [46] of the algorithms are listed in Tables 8 and 9, respectively.
(1) When the dimension is 10, the max, min, and average values of the iterations are 298, 144, and 12 for CS, whereas for SDCS they are only 12, 8, and 7.8.
(2) When the dimension is 20, the max, min, and average values of the iterations of CS are 389, 314, and 347, whereas for SDCS they are only 29, 18, and 22.8.
(3) When the dimension is 50, the max, min, and average values of the iterations of CS are 597, 558, and 576.4, whereas for SDCS they are only 183, 147, and 166, respectively.
For the Rosenbrock function, CS cannot converge when the dimension of the variables is 2, whereas the max, min, and average values of the iterations are 185, 99, and 173 for SDCS. For the Rastrigin function, both algorithms converge successfully when the dimensions are 10 and 20; when the dimension is 50, SDCS achieves optimized results but CS does not. The performance of SDCS is better than that of CS.
For the Rosenbrock function, both algorithms can successfully achieve optimized results, and the performance of SDCS is better than that of CS. See Table 10. Remark: Overall, SDCS needs fewer iterations than CS, and SDCS reaches the goal even in cases where CS cannot. From this experiment, the performance of SDCS is better than that of CS.
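For reference, the two benchmark functions named above can be written as in the sketch below, together with a single generic steepest-descent step. The standard textbook definitions are assumed, since Table 8 is not reproduced here; how SDCS embeds the descent step inside the cuckoo search is not described in this section, so `steepest_descent_step` and its step size are illustrative only.

```python
import numpy as np

def rastrigin(x):
    """Rastrigin function; global minimum 0 at x = 0."""
    x = np.asarray(x, dtype=float)
    return 10.0 * x.size + np.sum(x ** 2 - 10.0 * np.cos(2.0 * np.pi * x))

def rastrigin_gradient(x):
    """Analytical gradient of the Rastrigin function."""
    x = np.asarray(x, dtype=float)
    return 2.0 * x + 20.0 * np.pi * np.sin(2.0 * np.pi * x)

def rosenbrock(x):
    """Rosenbrock function; global minimum 0 at x = (1, ..., 1)."""
    x = np.asarray(x, dtype=float)
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2)

def steepest_descent_step(x, gradient, step_size):
    """Move one step against the gradient (generic illustration)."""
    x = np.asarray(x, dtype=float)
    return x - step_size * gradient(x)
```

The Rastrigin function has many local minima, which makes it a test of global search ability, whereas the Rosenbrock function has a narrow curved valley that mainly tests convergence speed.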

Summary
From the above experiments, we found the following facts:
(1) Experiment I and Table 2 indicate, by hypothesis test, that the wind speed data show both linear and nonlinear features and cannot be considered purely linear or purely nonlinear. Therefore, including both linear and nonlinear models in our proposed forecasting model is correct and necessary.
(2) The results from Experiment II, Figure 5, and Table 4 reveal that SSA-APSOACO-BPNN performed better than SSA-APSO-BPNN and SSA-ACO-BPNN in accuracy.
(3) Because the DE algorithm has a better capacity to search for optimized weights and threshold values for ELMNN, SSA-DE-ELMNN is more stable and accurate than the single model SSA-ELMNN, as Experiment II, Figure 4, and Table 3 show.
(4) As shown in Figure 6 and Table 5 from Experiment III, different branch models obtain the best results at different time points. This characteristic suits the combined model theory, which can avoid the loss of information incurred by fitting a single model, reduce randomness, and improve forecasting precision.
(5) In Experiment IV, our proposed combined model provided more accurate results than the single models. From the results shown in Figure 7 and Tables 6 and 7, the minimum values of the performance metrics were obtained by the combined model.
(6) Experiment V shows that the optimizing performance of SDCS is better than that of CS, which means SDCS can optimize the weight coefficients of the combined model more effectively.
In addition, the single models were also improved in this paper, and these modified single models provided more accurate results than the original single models, as shown in Experiment II. As shown in the above experiments, the proposed combined model achieved higher accuracy and possesses a more powerful forecasting ability than the benchmark models. Improved wind speed forecasting would be extremely significant to the energy grid and wind farms. With the integration of large-scale wind power into the power grid, the safety and stability of the grid face severe challenges. Accurate forecasting of wind power generation (wind speed) is an effective way to enhance the security and controllability of the power grid and to realize reasonable dispatch. Wind speed forecasting and wind power generation forecasting have received attention around the world.

Conclusions
Wind speed forecasting is of great significance to the operation of wind farms in terms of economy and safety. Accurate and reliable forecasting results have a significant impact on wind farms, which in turn influence the economy. In this study, the preprocessing technique reduces the uncertainty and irregularity of wind speed data and effectively improves the performance of wind speed forecasting. Furthermore, to improve the capacity of our proposed combined forecasting model, we improved the cuckoo search algorithm and developed a new algorithm named SDCS. Our proposed combined model, which includes SSA-APSOACO-BPNN, SSA-DE-ELMNN, SSA-ARIMA, SSA-ANFIS, and SSA-HW, is more effective than the hybrid nonlinear or linear single models in forecasting wind speed for the above reasons. It improves the MAPE by values ranging from 0.11% (Site 2, compared with SSA-APSOACO-BPNN) to 9.61% (Site 3, compared with SSA-ANFIS), and it also has higher accuracy and stability in wind speed forecasting. As the results show, our proposed model is more accurate than the compared models. According to our research, the combined model can therefore be used in wind farms to save operating costs and wind power. By improving forecasting accuracy and stability, the combined model can also support wind speed prediction and power dispatch, bringing benefits such as avoiding grid collapse and reducing the cost of economic dispatch.

Figure 1. The flow of the adaptive network-based fuzzy inference system (ANFIS).
$x^{(0)}_{q+2}, \ldots, x^{(0)}_{q+d}$: a sequence of verifying data.
Output: $\alpha_{best}$, the value of $\alpha$ with the best fitness value in the particle search space.
Parameters:
APSO: Particles = 30; $c_1 = c_2 = 2$; $w_0 = 1$; Maximum iteration = 300; Stopping criterion = maximum iteration.
ACO: NC_max (maximum iterations): 50; m (number of ants): 30; Alpha (parameter for the degree of importance of information elements): 1; Beta (parameter for the degree of importance of the heuristic factor): 5; Rho (parameter for the degree of importance of the heuristic factor): 0.1; Q (pheromone increasing intensity coefficient): 100.
1: /* Initialize popsize candidates with values between 0 and 1 */
2:

Figure 2. The flow of three different hybrid back-propagation neural networks (BPNN) and a hybrid extreme learning machine (ELM) neural network.

Figure 3. The flow of the proposed combined model.
Figure 4 also shows the 95% confidence intervals (CIs) obtained by DE-ELMNN and ELMNN. The upper and lower CI bounds of the two models are close, but for DE-ELMNN more points fall within the confidence interval.

Figure 4. The results of two different ELMNN forecasting models in site 1.

Figure 5. The results of three different BPNN forecasting models in site 1.

Figure 6. Typical forecasting performance of branch models of Monday in Site 3.

(2) On these seven days (Figure 7), the results of SSA-ANFIS are close to those of SSA-DE-ELMNN, which achieved the most accurate results compared with the other models.
(3) The single linear models: although ARIMA and HW can achieve relatively high prediction precision, their prediction performance is worse than that of the modified nonlinear and single nonlinear models.
(4) Compared with the modified nonlinear models and the individual nonlinear models, the combined model shows a significant improvement in forecasting accuracy in terms of MAPE.
(5) As Part C shows, the errors of the combined model are very small, and the combined model also achieves a small CI.

Figure 7. Typical forecasting performance of models in Site 4.

Table 2. (A) Testing wind speed data by fitting linear functions or nonlinear functions. (B) The explanations of the test parameters.
Error Degrees of Freedom: n − p, where n is the number of observations and p is the number of coefficients in the model, including the intercept.
R-squared and Adjusted R-squared: Coefficient of determination and adjusted coefficient of determination, respectively.
F-statistic vs. Constant Model: Test statistic for the F-test on the regression model. It tests for a significant regression relationship between the response variable and the predictor variables.
p-Value: p-value for the F statistic of the hypothesis test of whether the corresponding coefficient is equal to zero.

Table 3. The forecasting results of the two different ELM neural networks in Sites 1-4. SSA: singular spectrum analysis; DE: differential evolution.

Table 3 and Figure 4A show that the MAPE, MAE, and MSE values of DE-ELMNN are 3.58%, 0.3832, and 0.2523, respectively, on Tuesday at site 1, which are lower than those of ELMNN. For the other days at site 1, DE-ELMNN also clearly performs better than ELMNN. From the forecasting errors shown in Figure 4B, the error of DE-ELMNN is closer to 0 than that of ELMNN, especially on Wednesday.

Table 4. The results of the three different BPNN forecasting models in Sites 1-4.

Table 5. The forecasting results of every hour of the four branch models in Site 3.

Table 6. Performance evaluation of different models for forecasts using the 10-min wind speed data in a week in Sites 1 and 2.

Table 7. Performance evaluation of different models for forecasts using the 10-min wind speed data in a week in Sites 3 and 4.

Table 9. Experimental parameters of modified model SDCS.

Table 10. The experimental parameters of SDCS.