Short-Term Wind Electric Power Forecasting Using a Novel Multi-Stage Intelligent Algorithm

As the most efficient renewable energy source for generating electricity in a modern electricity network, wind power has the potential to realize sustainable energy supply. However, owing to its random and intermittent nature, a high penetration of wind power into a power network demands accurate and effective wind energy prediction models. This study proposes a multi-stage intelligent algorithm for wind electric power prediction, which combines the Beveridge–Nelson (B-N) decomposition approach, the Least Square Support Vector Machine (LSSVM), and a newly proposed intelligent optimization approach called the Grasshopper Optimization Algorithm (GOA). For data preprocessing, the B-N decomposition approach was employed to decompose the hourly wind electric power data into a deterministic trend, a cyclic term, and a random component. Then, the LSSVM optimized by the GOA (denoted GOA-LSSVM) was applied to forecast the future 168 h of the deterministic trend, the cyclic term, and the stochastic component, respectively. Finally, the future hourly wind electric power values are obtained by multiplying the forecasted values of these three components. A comparison of the forecasting performance of the proposed method with the LSSVM, the LSSVM optimized by the Fruit-fly Optimization Algorithm (FOA-LSSVM), and the LSSVM optimized by Particle Swarm Optimization (PSO-LSSVM) verifies that the established multi-stage approach is superior to the other models and can effectively increase the precision of wind electric power prediction.


Introduction
With the growing severity of global warming, smog, and other environmental problems, conventional thermal power generation, which is unsustainable and emits large amounts of pollutants, is increasingly being displaced by renewable power generation. Considered the most efficient renewable energy source for generating electricity in modern power systems, wind power has experienced a rapid expansion throughout the world [1][2][3]. The global installed wind capacity reached approximately 539 GW at the end of 2017, of which 53 GW was newly installed in 2017 [4]. Specifically, China had installed 188 GW, which accounts for 34.9% of global installed wind capacity. According to the report 'China Energy Outlook 2030' issued by the China Energy Research Association, China will strive to achieve 250 GW of installed wind capacity by 2020, accounting for 12.5% of total installed capacity, and 450 GW by 2030 with 900 billion kWh of on-grid electricity [5].
Despite the environmental benefits of wind energy [6], its stochastic and intermittent nature [7] poses many challenges for power system operation and large-scale penetration.

Overview of Current Wind Speed and Power Forecasting
Research in the field of wind power prediction has made great contributions to the stable operation of a power network. Many models have been developed to accurately forecast future wind power, and tools are classified into two main-stream approaches, which are the physical approach and the statistical method. Additionally, an integration approach is also utilized to combine the merits of both methods.

Physical Forecasting Approach
The physical prediction approach usually employs concrete physical characteristics to model the on-site conditions at a wind farm location [19]. It refines numerical weather prediction (NWP) data to the on-site conditions through a downscaling approach based on the physics of the lower atmospheric boundary layer. The downscaling method requires a detailed description of the wind farm and its surroundings. The refined wind speed data are then fed into the wind turbine power curve to compute the wind power output. If on-line data can be accessed, model output statistics can decrease the prediction error [10]. Unlike the statistical method, the physical approach does not require historical data. However, the difficulty of obtaining the physical data is its primary disadvantage.
Risoe National Laboratory in Denmark developed the Prediktor, which considers local conditions by using the NWP prediction from a High-Resolution Limited Area Model [20,21]. The University of Oldenburg in Germany developed the Previento, which has a similar physical approach but employs a different NWP forecast from the Lokalmodell of the German Weather Service [22]. AWS True Wind Inc. in the USA developed the eWind, which has a similar physical approach to the Prediktor but employs a high-resolution boundary layer model as a numerical weather model to consider the local conditions [23].
The physical methods are developed on the basis of models employing essential physical regulations for the conservation of momentum, mass, and energy in air flows. These models deal with computational fluid dynamics for imitating the atmosphere. Although many computational fluid dynamics models can be accessed, they are all based on the same fundamental physical theories. The differences between them include how the grids are structured and scaled and how the numerical computations are performed [10].

Statistical Forecasting Approach
The alternative mainstream approach utilized to predict wind power is the statistical forecasting approach on the basis of the historical meteorological values and wind power data. This kind of method usually elaborates the relationship between wind power or speed prediction and impact factors, such as NWPs and on-site evaluated data. Representative statistical methods contain traditional statistical methods and artificial intelligence techniques.
For the traditional statistical methods, time-series approaches are always utilized to forecast wind electric power or wind speed. The Auto-Regressive Moving Average (ARMA) approach [24] and the Auto-Regressive Integrated Moving Average (ARIMA) approach [25,26] are widely used to forecast wind power. The Autoregressive Conditional Heteroscedastic (ARCH) approach, integrated with the ARIMA approach and taking the heteroscedasticity effect into consideration, is utilized to predict sub-wind speed series calculated by wavelet decomposition. Forecasting results, which are the sum of the prediction values, illustrate that it can improve prediction precision [27]. Conventional statistical approaches are developed on the basis of classical linear statistical models. However, wind power is usually a nonlinear function considering its input characteristics [3]. Therefore, artificial intelligence technologies have been developed to enhance the prediction precision of wind power.
Artificial intelligence techniques have been widely utilized in many research studies to forecast wind power output [28]. The artificial neural network (ANN) method with Gaussian process simulation has been put forward for short-run wind power prediction [29]. A Back Propagation (BP) neural network based on Particle Swarm Optimization (PSO) has been used for input parameter selection to decrease errors in wind speed prediction [30]. A reduced support vector machine (RSVM) optimized by PSO has been utilized to conduct wind speed forecasting with feature selection [31]. Additionally, recurrent neural networks [32], Multi-Layer Perceptron (MLP) neural networks [17], and RBF neural networks [33] have been put forward for the prediction of wind power. Although neural networks and support vector machines can mimic nonlinear input functions, the learning capacity of a single method with conventional learning mechanisms is constrained. To handle this issue, combinations of neural networks, support vector machines, and other methods with optimization algorithms and data processing mechanisms have been put forward.

Combination Approach
The combination approach aims at integrating different approaches and maintaining the advantages of each approach. Combined wind forecasting methods usually include weighting-based combined methods [34][35][36][37], combined methods containing data pre-processing techniques [38][39][40], combined methods consisting of parameter optimization techniques [41][42][43][44], and combined methods comprising error processing techniques [45]. The study [46] proposed two univariate statistical prediction models on the basis of LSSVM, which are the univariate LSSVM approach and the integrated model combining the LSSVM model with ARIMA. An integrated model integrating input chosen by Wavelet Transform (WT) and Support Vector Machine (SVM) has been proposed and its superiority has been verified in terms of wind speed forecasting [47].
Considering the high forecasting accuracy of the combination approach, this paper proposes a new multi-stage intelligent algorithm for wind electric power prediction. During the data preprocessing step, the B-N decomposition approach is used to decompose the nonlinear and non-stationary wind electric power data series into a deterministic component, a cyclic term, and a stochastic shock component. In previous research, the B-N decomposition method has been used to discuss the influence of a financial crisis on GDP and electricity consumption [48,49]. In this research, we explore the possibility of applying the B-N decomposition method to wind power forecasting. Since the three decomposed sequences obtained from the B-N decomposition method reflect the past and future tendency of the data, they are utilized to predict future wind electric power. In the prediction process, a new nature-inspired optimization algorithm, named the Grasshopper Optimization Algorithm (GOA) and established by Seyedali Mirjalili in 2017 [50], is utilized to optimally select the two parameters of the LSSVM approach to enhance the forecasting precision of wind electric power. A comparison of the proposed method with a single LSSVM model, FOA-LSSVM, and PSO-LSSVM verifies that GOA-LSSVM outperforms the other approaches. Therefore, the proposed multi-stage intelligent algorithm for wind electric power forecasting has high prediction accuracy.

B-N Decomposition Approach
The B-N decomposition approach was put forward to decompose a data sequence that is stationary after first-order differencing. It showed that such a sequence can be resolved into a permanent component and a transitory term. The permanent component is made up of the deterministic component and the random term. The transitory term, named the cyclic trend, is a stationary process with a zero mean [51][52][53][54].
According to the B-N decomposition model and its calculation requirements, the data sequence must be tested for first-order difference stationarity before it is decomposed. If the sequence is verified to be stationary after a first-order difference, the B-N decomposition proceeds as below [49,51,52].
Let ln P denote the natural logarithm of the hourly wind electric power, assumed to be first-order difference stationary. To calculate the deterministic trend in accordance with the Wold theorem, ∆ln P_t is written in the form of Equation (1), where ∆ln P_t = ln P_t − ln P_{t−1}, P_t is the wind electric power at time t, µ is the long-run mean of ∆ln P_t, ε_t ~ i.i.d. N(0, σ²) (i.i.d. denotes independently and identically distributed), t is the time point, and λ_i is a coefficient. The expected value of Equation (1) is given by Equation (2), where E denotes the expectation operator.
Since ln P_t is the natural logarithm of the wind electric power, its first-order difference ∆ln P_t indicates the growth rate of the wind electric power. In accordance with Equation (2), the mean growth rate is the long-run growth rate of the wind electric power. Based on the B-N decomposition logic, the deterministic component DT_t is given by Equation (3) [51], where DT_t is the deterministic component at time t and ln P_0 is the initial value of the natural logarithm of the wind electric power. Moreover, ln P_t is identified as the prediction value in terms of current information. Morley [55] noted that such a data sequence is forecasted much more precisely by fitting a stationary univariate autoregressive AR(1) model to its first differences, Equation (4), where ϕ is the AR(1) coefficient and |ϕ| < 1. Given the Wold form of the AR(1) model, the minimum mean squared error (MMSE) j-period-ahead prediction of the first difference ∆ln P_t is given by Equation (5). Consistent with the B-N definition [51], T_t, the total trend component of the data sequence, is the MMSE forecast of the long-run level of the sequence, Equation (6). Hence, combining Equations (5) and (6) and summing the infinite geometric series, T_t is obtained in closed form as Equation (7). Meanwhile, the cyclic component of ln P_t is calculated by Equation (8), where C_t is the cyclic component at time t.
The stochastic shock component is computed from Equations (3), (7), and (8) as Equation (9), where ST_t is the stochastic component at time t.
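The displayed Equations (1) to (9) are not reproduced in this text. The block below is a hedged reconstruction based on the definitions given above and on the standard AR(1) form of the B-N decomposition; the exact notation of the original equations may differ, and the normalization \lambda_0 = 1 is an assumption.

```latex
\begin{align}
\Delta \ln P_t &= \mu + \sum_{i=0}^{\infty} \lambda_i \varepsilon_{t-i}, \qquad \lambda_0 = 1 \tag{1}\\
E(\Delta \ln P_t) &= \mu \tag{2}\\
DT_t &= \ln P_0 + \mu t \tag{3}\\
\Delta \ln P_t - \mu &= \phi\,(\Delta \ln P_{t-1} - \mu) + \varepsilon_t \tag{4}\\
E_t\!\left[\Delta \ln P_{t+j} - \mu\right] &= \phi^{\,j}\,(\Delta \ln P_t - \mu) \tag{5}\\
T_t &= \lim_{J\to\infty} E_t\!\left[\ln P_{t+J} - J\mu\right]
     = \ln P_t + \lim_{J\to\infty} \sum_{j=1}^{J} E_t\!\left[\Delta \ln P_{t+j} - \mu\right] \tag{6}\\
T_t &= \ln P_t + \frac{\phi}{1-\phi}\,(\Delta \ln P_t - \mu) \tag{7}\\
C_t &= \ln P_t - T_t = -\frac{\phi}{1-\phi}\,(\Delta \ln P_t - \mu) \tag{8}\\
ST_t &= T_t - DT_t \tag{9}
\end{align}
```

Under this reconstruction, ln P_t = DT_t + C_t + ST_t holds by construction, so the exponentiated components multiply back to P_t, which matches the multiplicative recombination used later in the paper.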

The LSSVM Approach
LSSVM is an enhanced form of SVM that avoids the convex quadratic programming problem associated with SVM [56][57][58]. LSSVM replaces the inequality constraints with equality constraints and treats the regression problem as a set of linear equations, which leads to faster training and higher precision and stability [59][60][61]. The fundamental methodology of the LSSVM approach is elaborated as follows [56,60].
Consider a set of samples {x_i, y_i}, i = 1, ..., m, where x_i ∈ R^n is the input variable and y_i ∈ R is the output for sample i. By means of a nonlinear function ϕ, the sample data are mapped to a high-dimensional feature space in which they are approximated by the linear form of Equation (10), where w is the weight vector and b is the bias.
In the primal space, LSSVM with equality constraints is formulated as Equation (11), where C is the regularization parameter and ξ_i is the slack variable. The Lagrangian function L of Equation (12) can then be adopted [62], where a_i denotes the Lagrange multiplier. The Karush-Kuhn-Tucker (KKT) conditions for optimality are given by Equation (13). Eliminating the variables w and ξ_i converts the optimization problem into the linear system of Equation (14), where Q = [1, ..., 1]^T, A = [a_1, a_2, ..., a_m]^T, and Y = [y_1, y_2, ..., y_m]^T. Consistent with Mercer's condition, the kernel function is written as Equation (15). Then, the LSSVM for regression is given by Equation (16). As the RBF has fewer parameters to be set, it is chosen as K(x, x_i) in this study, Equation (17). Consequently, two parameters, the regularization parameter 'c' and the kernel width 'σ', must be identified by a complicated search process [63,64]. The recently proposed GOA is utilized to identify the best values of these parameters.
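The LSSVM regression described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the standard LSSVM linear system [[0, Qᵀ], [Q, K + I/c]]·[b; A] = [0; Y] and the RBF kernel K(x, x_i) = exp(−‖x − x_i‖² / (2σ²)); the function names `rbf_kernel`, `lssvm_fit`, and `lssvm_predict` are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """RBF kernel matrix between the rows of A and the rows of B (assumed form)."""
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, y, c, sigma):
    """Solve the (assumed) LSSVM linear system for the bias b and multipliers a."""
    m = X.shape[0]
    M = np.zeros((m + 1, m + 1))
    M[0, 1:] = 1.0                                   # Q^T
    M[1:, 0] = 1.0                                   # Q
    M[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(m) / c
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(M, rhs)
    return sol[0], sol[1:]                           # b, a

def lssvm_predict(X_train, b, a, sigma, X_new):
    """Regression output: sum_i a_i K(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ a + b
```

Because training reduces to one linear solve, each candidate pair (σ, c) proposed by an optimizer can be evaluated cheaply, which is what makes a population-based parameter search practical.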

Grasshopper Optimization Algorithm (GOA)
Inspired by different movement characteristics in larval and adult phases of grasshoppers, a novel optimization approach has been put forward by researcher Seyedali Mirjalili, namely the grasshopper optimization algorithm (GOA) [50]. In the nymph phase, the primary characteristic of a grasshopper swarm is slow movement and small steps, and they jump and move like rolling cylinders. Along their movement path, almost all vegetation can be eaten. Then, they step into the adult phase, in which long range and abrupt movement is the necessary characteristic of the grasshopper swarm [65,66]. The searching procedure of the GOA is divided into two types: exploration and exploitation. In exploitation, the search agents shift locally, while they tend to move abruptly in exploration [67]. Since a grasshopper performs these two functions naturally, it is of great significance to mathematically imitate this behavior and use the new nature-inspired optimization method to obtain optimal results.
According to the behaviors of grasshoppers, the GOA is composed of five phases listed as below.
The primary parameters of the GOA comprise the number of grasshoppers SearchAgents_no; the number of variables dim; the maximum number of iterations Max_iteration; the bottom limit lb = [lb 1 , lb 2 , . . . , lb n ], and the upper limit ub = [ub 1 , ub 2 , . . . , ub n ] of variables.
The original population of grasshoppers in the GOA is set in a matrix as below: where G indicates the position matrix of the grasshoppers, g ij implies the j-th parameter's (dimension or variable) value of the i-th grasshopper, and i = 1, 2, . . . , n, j = 1, 2, . . . , d, and g ij are computed through using the stochastic distribution listed in Equation (19).
where G(i,j) denotes the data of the i-th row and j-th column in the matrix, ub(i) and lb(i) respectively illustrate the upper limit and bottom limit of the i-th grasshopper, and rand() denotes the random value from a uniform distribution in the scope of [0, 1].
For the purpose of assessing every grasshopper, the fitness function f [*] is required to be decided in an optimization procedure, and the matrix OG is applied to save the fitness data for grasshoppers described as below.
where OG represents the matrix for storing the fitness values of grasshoppers and n implies the amount of grasshoppers.
The moving behavior of grasshoppers is imitated by the mathematical model of Equation (21) [68], where X_i is the position of the i-th grasshopper, S_i is the social interaction, G_i is the gravity force on the i-th grasshopper, and A_i is the wind advection. To provide random behavior, the terms of the equation can be weighted randomly. The social interaction S_i is given by Equation (22), where d_ij is the distance between the i-th and the j-th grasshopper, computed as d_ij = |x_j − x_i|, s is a function that defines the strength of the social force, given in Equation (23), and d̂_ij = (x_j − x_i)/d_ij is a unit vector from the i-th grasshopper to the j-th grasshopper.
The function s that defines the social force is computed by Equation (23), where f is the attraction strength and l is the attractive length scale. Figure 2 in Reference [50] illustrates how the function s affects the social interaction of grasshoppers for distances ranging from 0 to 15. Repulsion happens in the interval [0, 2.079]. If a grasshopper is 2.079 units away from another one, there is neither repulsion nor attraction, and this distance is named the comfort zone or comfortable distance. The attraction strength rises from 2.079 units of distance to nearly 4 and then gradually decays to 0. Altering the values of l and f in Equation (23) leads to different social behaviors in artificial grasshoppers. When the function s is re-drawn with l and f varied independently, the figure shows that these two parameters change the attraction zone, comfort zone, and repulsion zone significantly.
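The comfort distance of 2.079 quoted above can be checked numerically. The sketch below assumes the form s(r) = f·e^(−r/l) − e^(−r) given later in the text, together with the values f = 0.5 and l = 1.5; these two values are not stated in this paper and are an assumption chosen because they reproduce the quoted distance. The function names are hypothetical.

```python
import math

def social_force(r, f=0.5, l=1.5):
    """s(r) = f * exp(-r / l) - exp(-r); negative means repulsion, positive attraction."""
    return f * math.exp(-r / l) - math.exp(-r)

def comfort_distance(f=0.5, l=1.5):
    """Distance where s(r) = 0: ln f - r/l = -r, so r = -ln f / (1 - 1/l).
    Meaningful for 0 < f < 1 and l > 1 (assumed parameter range)."""
    return -math.log(f) / (1.0 - 1.0 / l)

print(round(comfort_distance(), 3))   # comfort distance for the assumed f and l
print(social_force(1.0) < 0)          # inside the comfort distance: repulsion
print(social_force(3.0) > 0)          # beyond the comfort distance: attraction
```

With these assumed values the zero crossing is 3·ln 2, which agrees with the 2.079 units reported in the text.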
Although the function s can divide the space between two grasshoppers into an attraction zone, a comfort zone, and a repulsion zone, it returns values close to zero for distances longer than 10. Therefore, this function cannot apply strong forces between grasshoppers that are far apart. To deal with this problem, the distances between grasshoppers are mapped into the interval [1,4].
The G_i component in Equation (21) is calculated by Equation (24), where g is the gravitational constant and ê_g is a unit vector pointing towards the center of the earth.
The A_i term in Equation (21) is computed by Equation (25), where u is a constant drift and ê_w is a unit vector in the wind direction. Since larval grasshoppers do not have wings, their movements depend on the wind direction to a great extent. Substituting Equations (22), (24) and (25) into Equation (21), X_i is described by Equation (26), where s(r) = f e^(−r/l) − e^(−r) and N is the number of grasshoppers. However, under this mathematical model the grasshoppers reach the comfort zone quickly and the swarm does not converge to a particular point. Therefore, a revised model is put forward as Equation (27), where ub_d and lb_d are the upper and lower limits in the d-th dimension, s(r) = f e^(−r/l) − e^(−r), T_d is the value of the d-th dimension in the target, and c is a decreasing coefficient that shrinks the repulsion zone, comfort zone, and attraction zone. It should be pointed out that S is similar to the S_i component in Equation (21), that gravity is not considered (there is no G_i term), and that the wind direction (A_i term) is assumed to be towards the target (T_d).
In Equation (27), the first term (the sum) takes the positions of the other grasshoppers into account. The second term (T_d) mimics their tendency to move towards the food. The inner c reduces the attraction or repulsion strength between grasshoppers in proportion to the iteration number, while the outer c shrinks the search coverage around the target as the iteration count increases. To balance exploration and exploitation, the parameter c should be decreased in proportion to the iteration number. This procedure encourages exploitation as the iterations proceed. The coefficient c is computed by Equation (28), where cmax is the maximum value, cmin is the minimum value, l is the current iteration, and L is the maximum number of iterations. In this method, 1 and 0.00001 are employed for cmax and cmin.
Step 5: Optimal selection. In this mathematical model, grasshoppers are required to move towards a target gradually over a number of iterations. However, in a real search space, the global optimum is exactly unknown. Hence, it is necessary to search for a target for grasshoppers in each step of a generation. In the GOA, the fittest grasshopper (the one with the best objective value) is assumed to be the target, which can help the GOA to store the most promising target in each generation and require grasshoppers to move towards the target.
During the iterations, the positions of search agents are updated based on Equation (27). Additionally, the best target position obtained to date is updated in every generation. Positions are updated iteratively until the termination criterion is satisfied. The location and fitness value of the target are finally returned as the best approximation of the global optimum.
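The five phases above can be sketched as a compact minimizer. This is a simplified illustration, not the authors' code. Its assumptions: the social-force parameters f = 0.5 and l = 1.5, a linearly decreasing c from cmax to cmin, Euclidean distances between grasshoppers, and a mapping of distances into [2, 4) via 2 + (d mod 2), which is one simple way to keep them inside the [1,4] interval mentioned in the text. The function name `goa_minimize` and its arguments are hypothetical.

```python
import numpy as np

def s_func(r, f=0.5, l=1.5):
    """Social force s(r) = f * exp(-r / l) - exp(-r) (f and l are assumed values)."""
    return f * np.exp(-r / l) - np.exp(-r)

def goa_minimize(fitness, lb, ub, n_agents=20, max_iter=100,
                 c_max=1.0, c_min=1e-5, seed=0):
    rng = np.random.default_rng(seed)
    lb = np.asarray(lb, dtype=float)
    ub = np.asarray(ub, dtype=float)
    dim = lb.size
    # Initialization: uniform random positions inside [lb, ub]
    X = lb + rng.random((n_agents, dim)) * (ub - lb)
    # Fitness evaluation of the initial population
    fit = np.array([fitness(x) for x in X])
    best = int(np.argmin(fit))
    target, target_fit = X[best].copy(), float(fit[best])
    for it in range(1, max_iter + 1):
        # Decreasing coefficient c (linear schedule, assumed form)
        c = c_max - it * (c_max - c_min) / max_iter
        X_new = np.empty_like(X)
        for i in range(n_agents):
            social = np.zeros(dim)
            for j in range(n_agents):
                if i == j:
                    continue
                diff = X[j] - X[i]
                dist = np.linalg.norm(diff)
                unit = diff / (dist + 1e-12)       # unit vector from i to j
                r = 2.0 + dist % 2.0               # assumed distance mapping
                social += c * (ub - lb) / 2.0 * s_func(r) * unit
            # Position update: outer c times the social term plus the target
            X_new[i] = np.clip(c * social + target, lb, ub)
        X = X_new
        fit = np.array([fitness(x) for x in X])
        best = int(np.argmin(fit))
        # Optimal selection: keep the fittest grasshopper found so far
        if fit[best] < target_fit:
            target, target_fit = X[best].copy(), float(fit[best])
    return target, target_fit
```

For parameter tuning, `fitness` would wrap a model fit and return its error, with `lb` and `ub` bounding the two LSSVM parameters; the stored target can only improve from one generation to the next, so the returned value is never worse than the best initial agent.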

The Proposed Multi-Stage Intelligent Algorithm for Predicting Wind Electric Power
A novel multi-stage intelligent algorithm is put forward to improve wind electric power prediction accuracy. Since hourly wind electric power data are non-stationary, exhibit a periodic tendency, and are disturbed by random factors, the B-N decomposition method is utilized to decompose the original hourly wind electric power data into a deterministic trend, a cyclic component, and a stochastic shock component. Then, the GOA-LSSVM approach is employed to predict the three decomposed data sequences, from which the final predicted wind electric power is calculated. For LSSVM, the parameters 'σ' and 'c' must be determined before the three decomposed data sequences are predicted. To enhance the prediction precision, the GOA is applied to identify their optimum values.
The details of the procedures of the proposed multi-stage intelligent algorithm for wind electric power prediction are summarized as follows.
In accordance with the basic rules of the B-N decomposition approach, we need to test whether the logarithmic sequence of the original data is first-order difference stationary. The Augmented Dickey-Fuller (ADF) test is utilized to carry out the stationarity examination.
After confirming that the logarithmic sequence of the original data is first-order difference stationary, the initial data are decomposed into a deterministic trend, a cyclic component, and a random shock component according to Equations (3), (8) and (9).
Through employing the GOA to identify the optimum values of the two parameters in LSSVM, the fitness function f [*] is required to be decided. The Root Mean Square Error (RMSE) expressed as Equation (29) is applied to compose the fitness function.
where x(k) is the actual wind electric power at time k and x̂(k) is the predicted wind electric power at time k.
The GOA optimizes the two parameters in LSSVM by generating a set of random solutions. Search agents update their locations according to Equation (27). The position of the best target found to date is updated in every generation. The distances between grasshoppers need to be normalized to the interval [1,4] in every iteration. The position of each grasshopper encodes the two parameters of LSSVM, namely g_i,1 = σ and g_i,2 = c. Supposing that the actual data sequence {x^(0)(1), x^(0)(2), ..., x^(0)(n)} is employed in the first generation, the fitted sequence {x̂^(0)(1), x̂^(0)(2), ..., x̂^(0)(n)} is computed by the LSSVM. Then, the fitness function, which minimizes the RMSE of the prediction value, is given by Equation (30). Positions and fitness values are saved. Then, the remaining 99 iterations are performed in accordance with the optimization procedure of the GOA.
Through the whole optimization procedure, various values of RMSEs are computed with varying parameter values, and the minimum RMSEs can be discovered when the optimization procedure is finished. The optimum values of 'σ' and 'c' can be gained through using the GOA in terms of Equation (30). Then, the optimized LSSVM can be established, and the future data points of wind electric power can be predicted.
The procedure of the proposed multi-stage intelligent algorithm for wind electric power prediction is illustrated in Figure 1.

B-N Decomposition Results
In this research, hourly wind electric power data of one province in Northwest China will be fed into the proposed multi-stage intelligent algorithm for future wind electric power forecasting. According to the available data, 216 h (9 days) data will be treated as the training sample, and 168 h (7 days) data will be managed to be the testing sample to measure the performance of the established model.
First, the B-N decomposition model is applied to decompose the logarithmic sequence of the initial 384 h (384 = 216 + 168) of wind electric power data. Before decomposition, a unit root test based on the ADF test is carried out to examine whether the logarithmic series is first-order difference stationary. As listed in Table 1, the wind electric power sequence is stationary after a first-order difference, which satisfies the requirement of the B-N decomposition approach that the data sequence is integrated of order one. Therefore, Equations (3), (8) and (9) can be employed to decompose the logarithmic sequence of the original 384 h wind electric power data, yielding the deterministic component, cyclic term, and stochastic component in logarithmic form. Since these three terms are in logarithmic form, they are transformed back to levels by exponentiation before the GOA-LSSVM model is applied. The decomposition results of these three terms in levels are displayed in Figures 2 and 3. The deterministic trend means that the time sequence increases steadily over time. The periodic term indicates the periodic variation of the time series. The stochastic component reflects the impact of unexpected and unobservable random factors on the time series.
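The decomposition step can be illustrated with a short Python sketch. It is a hedged illustration, not the authors' code: it assumes the AR(1) form of the B-N decomposition, in which DT_t = ln P_0 + µt, T_t = ln P_t + ϕ/(1−ϕ)·(∆ln P_t − µ), C_t = ln P_t − T_t, and ST_t = T_t − DT_t, and it estimates ϕ by ordinary least squares on the demeaned first differences. The function name `bn_decompose` is hypothetical, and the unit root test is assumed to have been done beforehand.

```python
import numpy as np

def bn_decompose(power):
    """Return the level-form (exponentiated) deterministic, cyclic, and
    stochastic components for t = 1..n-1, plus the estimated AR(1) coefficient."""
    ln_p = np.log(np.asarray(power, dtype=float))
    d = np.diff(ln_p)                     # first differences of ln P
    mu = d.mean()                         # long-run mean growth rate
    z = d - mu
    # OLS estimate of phi in z_t = phi * z_{t-1} + e_t (assumed estimator)
    phi = float(np.dot(z[1:], z[:-1]) / np.dot(z[:-1], z[:-1]))
    t = np.arange(1, ln_p.size)
    dt = ln_p[0] + mu * t                 # deterministic trend
    trend = ln_p[1:] + phi / (1.0 - phi) * z
    cycle = ln_p[1:] - trend              # cyclic component
    shock = trend - dt                    # stochastic shock component
    # Back to levels, so the three components multiply to the original power
    return np.exp(dt), np.exp(cycle), np.exp(shock), phi
```

Because the three logarithmic components sum to ln P_t, the product of the three returned arrays reproduces the original series from its second observation onward.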

GOA-LSSVM Prediction Results
The GOA-LSSVM approach is employed to predict the three components separately, and the final wind electric power value is obtained by multiplying the predicted deterministic trend, cyclic component, and random component. Based on the data available, each component is forecasted from two inputs: its value at the previous hour and its value at the same hour of the previous day. Thus, the deterministic trend at the previous hour and at the same hour yesterday are fed into GOA-LSSVM to forecast the deterministic trend; the periodic component at the previous hour and at the same hour yesterday are the inputs for forecasting the periodic component; and the stochastic component at the previous hour and at the same hour yesterday are the inputs for forecasting the stochastic component.
The forecasting procedure of GOA-LSSVM starts with normalizing the sample data to the range [0, 1] according to Equation (31), where x_min and x_max denote the minimum and maximum of every input sequence.
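The input construction and normalization described above can be sketched as follows. This is a minimal illustration under assumptions: the normalization is taken to be the usual min-max form (x − x_min)/(x_max − x_min), hourly data are assumed so that "the same time yesterday" is a lag of 24 samples, and the function names are hypothetical.

```python
import numpy as np

def min_max_scale(x):
    """Scale a sequence to [0, 1]; also return x_min and x_max for later inversion."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min), x_min, x_max

def make_lagged_inputs(series, day=24):
    """Build inputs [value at t-1, value at t-day] and targets [value at t]."""
    s = np.asarray(series, dtype=float)
    X = np.column_stack((s[day - 1:-1], s[:-day]))
    y = s[day:]
    return X, y
```

Each of the three decomposed components would be passed through these two helpers separately before model fitting, and predictions would be mapped back with x_min and x_max.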
In the GOA-LSSVM approach, the parameters 'c' and 'σ' of the LSSVM approach will be identified by the GOA method. Through the optimization process, the optimal values of σ 2 and c of LSSVM for determinacy component prediction, cyclic term prediction, and random component prediction are determined by Equation (30), and the optimum values of them are illustrated in Table 2.
In the testing phase, the optimum values of σ² and c are applied to calculate the forecasts of the deterministic trend, the periodic component, and the stochastic term, respectively. Multiplying these three terms gives the final wind electric power values. Three error criteria are applied to assess the prediction performance: the Root Mean Square Error (RMSE, computed in accordance with Equation (29)), the normalized RMSE calculated based on [69,70], and the Mean Absolute Percentage Error (MAPE), where x(k) is the actual value at time k and x̂(k) is the predicted value at time k.
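The recombination and two of the error criteria can be sketched in Python. The formulas are assumptions based on the usual definitions, RMSE = sqrt(mean((x − x̂)²)) and MAPE = mean(|x − x̂| / x) × 100%; the normalized RMSE is omitted because its definition is given only by reference to [69,70]. The function names are hypothetical.

```python
import numpy as np

def combine_components(dt_hat, cycle_hat, shock_hat):
    """Final forecast: product of the three predicted level-form components."""
    return np.asarray(dt_hat) * np.asarray(cycle_hat) * np.asarray(shock_hat)

def rmse(actual, predicted):
    """Root mean square error (assumed standard definition)."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((a - p) ** 2)))

def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((a - p) / a)) * 100.0)
```

Daily figures such as those reported per day would follow from applying these functions to each 24 h block of the 168 h test window.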
Table 3 shows the daily RMSE, normalized-RMSE, and MAPE values for GOA-LSSVM, each computed as the average of the hourly values for that day. The MAPE is highest on the first day (9.54%) and lowest on the third day (1.52%); on the other days it ranges from 3% to 4.6%. The average MAPE is 4.28%, which is much lower than the prediction errors of 8% to 22% reported in the current literature [18]. All of the normalized-RMSE values are less than 10%, which implies that the established model has excellent performance in wind electric power prediction [70]. These results indicate that the GOA-LSSVM can effectively improve the forecasting accuracy.

Forecasting Performance Assessment
To assess the prediction performance, three benchmark models were chosen for comparison: LSSVM, FOA-LSSVM, and PSO-LSSVM. These three models use the same input variables and output variable as GOA-LSSVM; they differ only in the parameter selection mechanism.
For the LSSVM model, the deterministic trend, periodic component, and stochastic term are each forecasted using, as input variables, the component's value at the last moment and its value at the same time yesterday, for each component from 1 to 216 h. The values of the parameters σ² and c are identified following [71] and are listed in Table 4. The hourly wind electric power for the future 168 h can then be forecasted.
For the PSO-LSSVM approach, the parameters σ² and c of the LSSVM for the deterministic trend prediction, the cyclic component prediction, and the random term prediction are iteratively identified by PSO. Before optimization, the initial parameters of PSO are set as follows: maxgen = 300, swarm size = 30, particle size = 2, the minimum of the particle = [0.01, 1000], the maximum of the particle = [0.1, 1000], the minimum of velocity = −500, the maximum of velocity = 500, and learning factors c1 = 1.5 and c2 = 1.7. The optimum σ² and c selected by PSO for forecasting these three terms are given in Table 4.
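To make the role of these settings concrete, a minimal PSO sketch is given below. It is an illustration under stated assumptions, not the implementation used in this study. The defaults for swarm size, maxgen, c1, c2, and the velocity limit are taken from the settings above. The inertia weight (0.6), the search bounds in the usage example, and the toy quadratic objective are assumptions. In the setting described here, the objective would instead be the LSSVM forecasting error for a candidate (σ², c) pair.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso_minimize(
    objective: Callable[[Sequence[float]], float],
    lower: Sequence[float],
    upper: Sequence[float],
    swarm_size: int = 30,
    max_gen: int = 300,
    c1: float = 1.5,
    c2: float = 1.7,
    inertia: float = 0.6,  # assumption: the inertia weight is not stated in the text
    v_max: float = 500.0,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Minimize `objective` over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(lower)
    pos = [[rng.uniform(lower[d], upper[d]) for d in range(dim)]
           for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(swarm_size), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(max_gen):
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity uses only the particle's own state, its personal best,
                # and the global best.
                v = (inertia * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))
                pos[i][d] = max(lower[d], min(upper[d], pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


# Usage with a toy objective standing in for the LSSVM forecasting error.
best_params, best_error = pso_minimize(
    lambda p: (p[0] - 3.0) ** 2 + (p[1] + 1.0) ** 2,
    lower=[-10.0, -10.0],
    upper=[10.0, 10.0],
    v_max=5.0,
)
```

A fixed seed is used so that repeated runs of the sketch give the same result; in practice several independent runs would normally be compared.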
Using the optimum σ² and c for the deterministic trend prediction, the cyclic component prediction, and the random component prediction in each model, the wind electric power values for the future 168 h are calculated by multiplying the predicted values of the three components. To assess the prediction performance of the various approaches, the daily MAPE, normalized-RMSE, and RMSE values, each computed as the average of the hourly values, are given in Table 5. The LSSVM model has the poorest performance among the models: its average MAPE is 15.61%, the only one larger than 10%. The GOA-LSSVM model is superior to the other models with a 4.28% average MAPE; except for the first day, its daily MAPE values are all smaller than 4.6%, and the MAPE on the third day is only 1.52%. The PSO-LSSVM model ranks second with an 8.00% average MAPE; however, its MAPE values on the last two days both exceed 10%, which is much higher than those of GOA-LSSVM. The FOA-LSSVM approach ranks third with a 9.89% average MAPE. For the RMSE, the average values of the LSSVM model and the FOA-LSSVM model are larger than 10,000 kW, which indicates a large gap between the forecasted data and the actual wind electric power data. The smallest average RMSE is 3876.71 kW, which belongs to the GOA-LSSVM model, followed by 8046.46 kW for the PSO-LSSVM model. For the normalized RMSE, the average values of GOA-LSSVM and PSO-LSSVM are both less than 10%, which demonstrates that both models have excellent performance in wind electric power prediction; however, the PSO-LSSVM values on Day 1, Day 6, and Day 7 are all larger than 10%, much greater than those of GOA-LSSVM.
Most of the normalized-RMSE values of LSSVM and FOA-LSSVM are larger than 10%, which confirms that the established approach effectively enhances the prediction accuracy. Compared with PSO-LSSVM and FOA-LSSVM, GOA-LSSVM has a better forecasting performance. This is because, in the GOA, the next location of a grasshopper is determined from its current location, the global best, and the locations of all other grasshoppers according to Equation (27), whereas PSO, the most well-regarded swarm intelligence optimization algorithm in the current literature, updates the position of a particle using its present location, its personal best, and the global best. In FOA-LSSVM, the fruit flies renew their positions according to the location of the fruit fly with the best smell. Hence, in PSO and the FOA the other search agents make no contribution to the update of a search agent's position, while in the GOA the status of all grasshoppers is involved in the position update of each search agent [50]. Thus, the GOA performs better than PSO and the FOA on this optimization problem. It can therefore be concluded that the proposed multi-stage intelligent algorithm, which combines the B-N decomposition method with the LSSVM model optimized by the newly proposed GOA, is effective and practical for hourly wind electric power forecasting.
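The difference in update rules can be illustrated with a sketch of one GOA position update. Equation (27) is not reproduced in this section, so the code follows the commonly published form of the GOA update and may differ in detail from it. The social-force constants f = 0.5 and l = 1.5, the clamping to the search bounds, and the function names are assumptions.

```python
import math
from typing import List, Sequence


def social_force(r: float, f: float = 0.5, l: float = 1.5) -> float:
    """Attraction/repulsion strength between two grasshoppers at distance r."""
    return f * math.exp(-r / l) - math.exp(-r)


def goa_step(
    positions: Sequence[Sequence[float]],
    target: Sequence[float],
    c: float,
    lower: Sequence[float],
    upper: Sequence[float],
) -> List[List[float]]:
    """One GOA update: every grasshopper is moved using all other grasshoppers.

    `target` is the best position found so far and `c` is the shrinking
    coefficient, which is typically decreased over the iterations.
    """
    n, dim = len(positions), len(target)
    new_positions: List[List[float]] = []
    for i in range(n):
        social = [0.0] * dim
        for j in range(n):
            if j == i:
                continue
            dist = math.dist(positions[i], positions[j])
            if dist == 0.0:
                continue  # coincident grasshoppers exert no directional force
            for d in range(dim):
                gap = positions[j][d] - positions[i][d]
                social[d] += (c * (upper[d] - lower[d]) / 2.0
                              * social_force(abs(gap)) * gap / dist)
        new_positions.append([
            max(lower[d], min(upper[d], c * social[d] + target[d]))
            for d in range(dim)
        ])
    return new_positions
```

In contrast to the PSO velocity update, the inner loop over `j` makes the new position of grasshopper `i` depend on the position of every other grasshopper, which is the property emphasized in the comparison above.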

Conclusions
Wind power is of great significance among low-carbon, low-emission renewable energy technologies, which are promising for realizing a sustainable energy supply. Given the stochastic and intermittent characteristics of wind energy, accurate prediction of wind power contributes greatly to the large-scale penetration of wind power into a power network. Therefore, a multi-stage intelligent algorithm for wind electric power prediction is put forward in this research to increase the prediction precision. The algorithm combines the B-N decomposition approach, the LSSVM model, and a newly proposed intelligent optimization algorithm, the GOA. In the data preprocessing stage, considering the stochastic tendency of wind electric power, the B-N decomposition approach is used to decompose the initial wind electric power data into a deterministic trend, a cyclic component, and a random component. Then, the LSSVM, whose parameters c and σ are optimally selected by the GOA, is used to predict the future values of these three terms. The final future wind electric power data are calculated by multiplying the predicted values of the deterministic trend, cyclic component, and random component. To evaluate the forecasting performance, the LSSVM approach, the FOA-LSSVM approach, and the PSO-LSSVM approach are selected as benchmark methods. The comparative analysis shows that the proposed multi-stage intelligent algorithm performs the best, with a 4.28% forecasting error, which is much lower than the prediction errors of 8% to 22% reported in the current literature [18].
The proposed multi-stage intelligent algorithm consists of complicated computation steps, including data decomposition, parameter optimization, and future data-point prediction, and this process may have a high computational cost. Nevertheless, it can improve forecasting accuracy significantly and can be of great help for electric power system management and power dispatching. Therefore, the proposed multi-stage intelligent algorithm is an effective and attractive algorithm for wind electric power prediction. In future research, the proposed model can be combined with the Numerical Weather Prediction method, which uses physical data, to further increase the forecasting precision of wind power, and it can also be applied to other prediction studies, such as short-term power load prediction and photovoltaic power generation prediction. Additionally, considering the intense demand for quantitative information on wind power uncertainty, research on wind power prediction will gradually turn from point prediction to probabilistic prediction [72][73][74]. Therefore, in our future research, we will focus on exploring appropriate probabilistic forecasting approaches to enhance the accuracy of wind electric power prediction.