A Novel Deep Learning Approach for Wind Power Forecasting Based on WD-LSTM Model

Abstract: Wind power generation is one of the renewable energy generation methods that currently maintains good momentum of development. However, its intense intermittency and uncertainty pose great challenges to wind power integration and to the stable operation of power grids. To achieve accurate prediction of wind power generation in China, a hybrid prediction model combining Wavelet Decomposition (WD) and a Long Short-Term Memory (LSTM) neural network is constructed. First, the non-stationary time series are decomposed into multidimensional components by WD, which effectively reduces the volatility of the original time series and makes them more stable and predictable. Then, the components obtained by WD are used as input variables of the LSTM to predict national wind power generation. Forty data points were used, 80% as training samples and 20% as testing samples. The experimental results show that the MAPE of WD-LSTM is 5.831, better than that of the other models in predicting wind power generation in China. In addition, the WD-LSTM model was used to predict wind power generation in China under different development trends over the next two years.


Introduction
Environmental pollution and severe energy shortages are among the most pressing problems in the world today. With increasing environmental pollution and the depletion of fossil energy, there is strong demand for renewable energy generation [1]. Wind power generation is one of the main renewable energy generation methods and shows continuous growth. The Global Wind Energy Council (GWEC) emphasized in its 14th Global Wind Power Development Report that the value wind energy brings to power systems and markets will contribute to wind power integration and to the balance between supply and demand. Wind power generation can not only relieve the pressure of the energy crisis but, as a clean energy source, can also greatly reduce environmental pollution [2]. Wind power generation prediction is an effective measure to improve the grid's capacity to accommodate wind power and to ensure stable operation of the power grid. The precision of a wind power generation prediction model directly affects power quality, grid stability and the balance between load and generation, which is of great practical significance for the secure, stable and efficient operation of the grid [3]. Wind power generation is affected by wind speed fluctuation on three time scales: ultra-short-term fluctuation (a few minutes) influences the control of wind turbines to a certain extent, medium-term fluctuation (from a few hours to a few days) affects wind power grid connection and power grid dispatch, and long-term fluctuation (weeks or months) is related to maintenance plans for wind farms and power

Data Preprocessing Models
Previous research on wind power prediction mainly took wind speed, wind direction and humidity as input variables and preprocessed wind power signals by Empirical Mode Decomposition (EMD) [17], Ensemble Empirical Mode Decomposition (EEMD) [18], Complete Ensemble Empirical Mode Decomposition (CEEMD), Variational Mode Decomposition (VMD) [19], etc., which reflect the characteristics of wind power signals more clearly. EMD was proposed by Norden E. Huang et al. to decompose signals into characteristic modes. Its advantage is that it does not use any predefined function as a basis but adaptively generates intrinsic modes from the analyzed signal. With a high signal-to-noise ratio and good time-frequency localization, it can be used to analyze nonlinear and non-stationary signal sequences. Jyotirmayee Naik et al. used EMD as a data preprocessing method in short-term wind speed and wind power prediction: the original nonlinear, non-stationary wind speed and wind power time series were decomposed by EMD, and the accuracy of the proposed EMD-KRR and EMD-RVFL prediction models was confirmed in experiments [20]. However, the Intrinsic Mode Functions (IMFs) produced by EMD suffer from mode mixing, which EEMD addresses effectively through Noise-Assisted Signal Processing (NASP). Used as a preprocessing method for wind power time series, EEMD improves the performance and prediction accuracy of hybrid prediction models and shows good results in wind power signal processing [21]. CEEMD further improves on EEMD by addressing the residual noise that EEMD leaves in wind signal processing. To reduce the non-stationarity of the wind power time series, Wang et al. used CEEMD to decompose the wind power signal; the decomposed time series, used as input variables of the prediction model, effectively improve the accuracy of short-term wind power prediction [22]. VMD is a completely non-recursive signal decomposition method based on the frequency domain, which to some extent overcomes many shortcomings of EMD. Li et al. used VMD to decompose wind power data into long-term modes, wave modes and random modes, which helps the prediction model capture the characteristics of the three constituent modes [23]. As wind power prediction has placed higher demands on the stability of sample data, data preprocessing methods have been progressively improved.

Prediction Models
More accurate prediction models are obtained through comparison, selection and improvement of models. Modeling methods mainly include Autoregressive (AR) models [24,25], time series models [26,27], Support Vector Machines (SVM) [28,29], Artificial Neural Networks (ANN) [30,31], etc. The initial application of these models to wind power prediction improved accuracy to a certain extent. However, they do not fully consider the long-term correlation between input samples, so their ability to improve prediction accuracy is limited. Li et al. combined SVM with an improved dragonfly algorithm in a hybrid model for short-term wind power forecasting and found the proposed model suitable for this task [32]. The SVM method can theoretically find a globally optimal prediction; however, its computational cost increases sharply when the data volume is large. Under these circumstances, the Recurrent Neural Network (RNN) was introduced to improve the accuracy of wind power forecasts. An RNN is a deep learning network with recurrent links in its structure, so it can take into account the relationship between earlier and later samples, which makes it especially suitable for processing time series signals. Various improved methods have been studied to address the exploding and vanishing gradient problems of RNNs. The emergence of the LSTM neural network effectively solved these problems and achieved considerable results in wind power prediction. At present, it is difficult for a single model to achieve good prediction performance, whereas fusion methods combining multiple models can improve prediction accuracy more easily [33,34]. Erick Lopez et al. deeply integrated Long Short-Term Memory (LSTM) with the Echo State Network (ESN), proposing an architecture similar to ESN; LSTM-ESN is superior to the WPPT model in all global indicators [35]. In another study, wind power is predicted by the LSTM neural network while a Gaussian Mixture Model (GMM) is used to analyze the error distribution characteristics of short-term wind power prediction; both methods perform well [36]. On this basis, some scholars have made simple improvements to the structure of LSTM, reducing the influence of random components on prediction, effectively avoiding overfitting and making it more suitable for prediction [37]. Jyotirmayee Naik et al. used VMD to decompose the original nonlinear and non-stationary data and combined VMD with 10 Multi-Kernel Regularized Pseudo Inverse Neural Network (MKPPINN), which showed the superiority of this model in wind power prediction [38]. Yu et al. proposed the Long Short-Term Memory and Enhanced Forget-Gate network model (LSTM-EFG) for wind power prediction. Based on correlation, the characteristic data of units within a certain distance are filtered, and the prediction is optimized by cluster analysis [39]. Lin et al. integrated Isolation Forest (IF) with deep learning and proposed a novel approach to power prediction using high-frequency SCADA data. Compared with the conventional predictive model used for outlier detection, the proposed deep learning prediction model shows superiority in wind power prediction [40].

Methodology
The Wavelet Decomposition and Long Short-Term Memory neural network (WD-LSTM) is an intelligent network that combines the advantages of WD and the LSTM neural network. To better represent the data characteristics of the input indicators and to facilitate prediction by the LSTM neural network, this paper adopts a loosely coupled combination of WD and LSTM, in which WD serves as the preprocessing method for the LSTM prediction model. Using the multi-resolution analysis capability of WD, the original data are decomposed into time series with different frequency components, which provide the input vectors for the LSTM neural network. $S_{A1}$ is the approximation coefficient, while $S_{D1}$, $S_{D2}$ and $S_{D3}$ are the detail coefficients [41]. After the original data are decomposed by WD, the LSTM neural network makes the prediction and the prediction results are obtained.
The WD-LSTM prediction model can not only use WD to analyze the subtle features of the original data but also exploit the self-learning and fault-tolerance capabilities of the neural network, which improves both the accuracy of wind power generation prediction and the learning efficiency of the network. The steps by which the WD-LSTM neural network predicts wind power are shown in Figure 1.

Wavelet Decomposition
WD is an effective method to deal with non-stationary sequences. The multi-scale decomposition capability of WD can decompose the original time series into different frequency sequences according to different scales. WD is used to perform multi-scale analysis of various frequency components in the original signal, and noise frequency is screened out to obtain high-quality signals that can represent data characteristics, so as to improve the prediction accuracy of the model.
In the continuous wavelet transform, suppose $\phi(t) \in L^2(\mathbb{R})$ and let $\phi^*(\omega)$ denote the Fourier transform of $\phi(t)$. If $\phi^*(\omega)$ satisfies the admissibility condition of Equation (1), then $\phi(t)$ can be regarded as the mother wavelet function. A family of wavelets $\phi_{a,b}(t)$ is then obtained from $\phi(t)$ by scaling and translation (Equation (2)), where $a$ is the scaling variable and $b$ is the translation variable. For a square-integrable function $f(t) \in L^2(\mathbb{R})$, the continuous wavelet transform is given by Equation (3), in which $a$, $b$ and $t$ are continuous variables.
In practical applications, continuous wavelets are usually sampled into discrete wavelets to facilitate calculation and analysis. Wavelet discretization acts on the scaling variable $a$ and the translation variable $b$, which gives the discrete wavelet function of Equation (4). In 1989, Mallat proposed wavelet multi-resolution analysis, which performs a $J$-scale decomposition of the original sequence $s(t)$. In the first step, the original signal is decomposed into a low-frequency component $a_1$ and a high-frequency component $d_1$. In the second step, the high-frequency part is retained and the low-frequency component $a_1$ is further decomposed into a low-frequency component $a_2$ and a high-frequency component $d_2$. The low-frequency component obtained at each step is decomposed in turn, finally yielding the low-frequency component $a_J$ and the high-frequency component $d_J$ at scale $J$. The original sequence can then be expressed as

$$s(t) = a_J(t) + \sum_{r=1}^{J} d_r(t) \tag{5}$$

where $J$ is the decomposition scale, $a_J(t)$ is the component approximating the original time series (low-frequency component) and $d_r(t)$ $(r = 1, \ldots, J)$ is the detail signal component (high-frequency component).
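For reference, Equations (1)-(4) are commonly written as below. This is the standard textbook form, given as an assumption about the expressions intended here; the constants $a_0$ and $b_0$ (the discretization steps), the integer indices $j$ and $k$, and the overline for complex conjugation are notation introduced for this sketch.

```latex
\begin{align}
C_\phi &= \int_{-\infty}^{\infty} \frac{|\phi^*(\omega)|^2}{|\omega|}\, d\omega < \infty \tag{1} \\
\phi_{a,b}(t) &= \frac{1}{\sqrt{|a|}}\, \phi\!\left(\frac{t-b}{a}\right), \quad a, b \in \mathbb{R},\ a \neq 0 \tag{2} \\
W_f(a,b) &= \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} f(t)\, \overline{\phi\!\left(\frac{t-b}{a}\right)}\, dt \tag{3} \\
\phi_{j,k}(t) &= a_0^{-j/2}\, \phi\!\left(a_0^{-j} t - k b_0\right), \quad j, k \in \mathbb{Z} \tag{4}
\end{align}
```

With the usual dyadic choice $a_0 = 2$, $b_0 = 1$, each increase of $j$ halves the time resolution, which matches the step-by-step halving of the low-frequency component in the Mallat scheme.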
An important step in WD is choosing the wavelet function and the decomposition scale. If the number of decomposition levels is too small, the approximation signal usually still contains random interference and cannot effectively reflect the trend of the original sequence. If the number of levels is too large, errors accumulate and training takes longer. In this paper, the Daubechies (DB) wavelet is used to decompose the original data, taking J = 3.
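The three-level decomposition can be illustrated with a minimal pure-Python sketch. It assumes the Haar wavelet (the simplest member of the Daubechies family, db1), because the text does not state which Daubechies order was used; the function names and the sample series are hypothetical. Each coefficient vector is mapped back to full length so that the components add up to the original sequence, as in the additive decomposition described above.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(x):
    """One analysis level: half-length approximation and detail coefficients."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """One synthesis level: rebuild the double-length sequence."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out

def to_full_length(coeffs, level, is_detail):
    """Reconstruct one coefficient vector alone, with all other bands set to zero."""
    zeros = [0.0] * len(coeffs)
    out = haar_inverse_step(zeros, coeffs) if is_detail else haar_inverse_step(coeffs, zeros)
    for _ in range(level - 1):
        out = haar_inverse_step(out, [0.0] * len(out))
    return out

def wavelet_components(s, J=3):
    """Return full-length components d1..dJ and aJ whose sum equals s."""
    if len(s) % (2 ** J) != 0:
        raise ValueError("sequence length must be divisible by 2**J")
    components = {}
    approx = list(s)
    for r in range(1, J + 1):
        approx, detail = haar_step(approx)
        components[f"d{r}"] = to_full_length(detail, r, is_detail=True)
    components[f"a{J}"] = to_full_length(approx, J, is_detail=False)
    return components

# Made-up normalized values, for illustration only.
series = [0.10, 0.14, 0.21, 0.18, 0.30, 0.35, 0.33, 0.47]
parts = wavelet_components(series, J=3)
rebuilt = [sum(vals) for vals in zip(*parts.values())]
```

With the Haar wavelet and eight samples, the level-3 approximation reduces to the mean of the series, while the detail components carry the fluctuations at successively finer scales. A higher-order Daubechies wavelet would give smoother components but needs boundary handling that this sketch omits.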

Basic Principles of LSTM
Traditional neural network models lack memory of historical information, whereas a recurrent neural network (RNN) can apply the output of previous time steps to the current task. However, a conventional RNN suffers from vanishing or exploding gradients; in other words, when the time interval is large, what was learned earlier is lost. To address these shortcomings, Hochreiter and Schmidhuber proposed the Long Short-Term Memory (LSTM) neural network in 1997. LSTM is a type of RNN that can learn long-term dependencies. It not only retains historical information but also overcomes the long-term dependency problem: it can selectively forget invalid information and update effective information, thus alleviating the vanishing gradient problem to some extent. As shown in Figure 2, the LSTM network is composed of an input layer, an output layer and several recurrent hidden layers between them. A recurrent hidden layer is composed of multiple memory blocks, each of which contains one or more self-connected memory cells with three gates controlling the information flow: the input gate, the forget gate and the output gate. The state of the LSTM cell is calculated according to Equations (6)-(10). In Equations (6)-(8), $i_t$, $f_t$ and $o_t$ are the input gate, forget gate and output gate, respectively. In Equation (10), $c_t$ is the new candidate for the cell state. The LSTM cell carries the state information, updating the old cell state $c_{t-1}$ to the new cell state. $W_i$, $W_f$, $W_o$ and $W_c$ are the weights of the input gate, forget gate, output gate and current cell state, respectively, and $b_i$, $b_f$, $b_o$ and $b_c$ are the corresponding biases.
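For reference, the gate and state updates described above are commonly written as follows. This is the standard LSTM formulation, offered as an assumption about the form behind Equations (6)-(10). The symbols $h_{t-1}$ (previous hidden state), $x_t$ (current input), $\sigma$ (logistic sigmoid) and $\odot$ (element-wise product) are notation introduced for this sketch, and the candidate state is written $\tilde{c}_t$ to distinguish it from the updated state $c_t$. The equations are left unnumbered because the original numbering cannot be confirmed.

```latex
\begin{align*}
i_t &= \sigma\left(W_i [h_{t-1}, x_t] + b_i\right) \\
f_t &= \sigma\left(W_f [h_{t-1}, x_t] + b_f\right) \\
o_t &= \sigma\left(W_o [h_{t-1}, x_t] + b_o\right) \\
\tilde{c}_t &= \tanh\left(W_c [h_{t-1}, x_t] + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{align*}
```

The additive form of the cell-state update is what lets gradients flow across many time steps: when $f_t$ is close to 1, $c_{t-1}$ passes through nearly unchanged.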
Energies 2020, 13, x FOR PEER REVIEW

Data Description and Preprocessing
This paper selects four macroeconomic indicators, namely Gross Domestic Product (GDP), Consumer Price Index (CPI), Industrial Added Value (IAV) and Total Imports and Exports (TIE), as well as two related power generation indicators, namely National Total Power Generation (NTPG) and Hydropower Generation (HG), as input variables. To evaluate the wind power generation prediction model reliably, the macroeconomic indicators and related power generation indicators are taken from the National Bureau of Statistics of China. Quarterly data from the third quarter of 2009 to the second quarter of 2019 are selected, for a total of 40 data points. The original data samples are divided into two datasets: 80% of the original data (32 data points) are used as training samples and the remaining 20% (8 data points) are used as test samples to evaluate the predictive performance of the model.
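The 80/20 split can be sketched as follows. The sketch assumes the split is chronological (the earliest 32 quarters for training, the latest 8 for testing), which is the usual choice for time series but is not stated explicitly in the text; the function name and quarter labels are illustrative.

```python
def chronological_split(samples, train_fraction=0.8):
    """Split an ordered sequence into an earlier training part and a later test part."""
    n_train = int(round(len(samples) * train_fraction))
    return samples[:n_train], samples[n_train:]

# Quarter labels from 2009Q3 to 2019Q2 (40 quarters).
all_quarters = [f"{year}Q{q}" for year in range(2009, 2020) for q in range(1, 5)]
quarters = all_quarters[2:42]

train, test = chronological_split(quarters)
```

Under this assumption the test set covers the eight most recent quarters, so the model is always evaluated on data that follow the training period.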
Since the macroeconomic indicators and related power generation indicators have different dimensions and units, the original time series must be standardized to eliminate the effect of scale between indicators. According to Equation (11), each group of data is normalized into the interval 0-1 to make the indicators comparable, reduce the influence of outliers and noise, and speed up model training:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{11}$$

where $x$ is an original value of an indicator, $x_{\min}$ and $x_{\max}$ are the minimum and maximum of that indicator, and $x'$ is the normalized value.
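A minimal sketch of this min-max normalization, applied to one indicator at a time, is given below. The function names and the sample values are hypothetical; the inverse function is included because predictions made on the normalized scale have to be mapped back to physical units before errors are reported on the raw test set.

```python
def min_max_normalize(values):
    """Scale a list of numbers into [0, 1]; also return the bounds used."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("constant series cannot be min-max normalized")
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled, lo, hi

def denormalize(scaled, lo, hi):
    """Map normalized values back to the original scale."""
    return [v * (hi - lo) + lo for v in scaled]

# Made-up indicator values, for illustration only.
raw = [320.0, 355.0, 410.0, 398.0, 460.0]
scaled, lo, hi = min_max_normalize(raw)
restored = denormalize(scaled, lo, hi)
```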
The MATLAB wavelet toolbox is used to decompose the normalized time series $s(t)$:

$$s(t) = a_J(t) + \sum_{r=1}^{J} d_r(t) \tag{12}$$

where $J$ is the decomposition scale, $a_J(t)$ is the low-frequency component close to the original sequence, $d_r(t)$ is the detail signal component (high-frequency component) of the $r$-th decomposition and $t$ is the discrete time.

Model Parameters
In this paper, the time series of the macroeconomic indicators and related power generation indicators after WD are taken as the input variables of the LSTM neural network, and national wind power generation is taken as the output variable. The LSTM neural network has four parameters that affect the prediction accuracy of the model, including the time step of each layer, the number of hidden units in each layer and the number of training iterations. During training, one parameter is varied at a time while the others are held fixed, so as to find the best prediction model. The parameter settings of the model are shown in Table 1. The LSTM model is a deep learning neural network with three layers: an input layer, a hidden layer and an output layer. The input consists of six input variables: Gross Domestic Product (GDP), Consumer Price Index (CPI), Industrial Added Value (IAV), Total Imports and Exports (TIE), National Total Power Generation (NTPG) and Hydropower Generation (HG). The hidden layer consists of two LSTM units with a time step of 2, and each LSTM unit contains 64 cells. The output layer contains one output variable, wind power generation. The structure of the LSTM model is shown in Figure 3.
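To make the role of the time step concrete, the following sketch arranges the quarterly inputs into windows of two consecutive time steps, the shape an LSTM layer typically expects (samples, time steps, features). It assumes that each window of two quarters is paired with the wind power value of the following quarter; this alignment, the function name and the synthetic data are illustrative assumptions, not details taken from the paper.

```python
def make_windows(features, target, time_steps=2):
    """Build (window, next-value) pairs from aligned feature rows and targets."""
    windows, labels = [], []
    for end in range(time_steps, len(features)):
        windows.append(features[end - time_steps:end])  # time_steps rows of inputs
        labels.append(target[end])                      # value to predict
    return windows, labels

# Synthetic stand-in for 40 quarters of six normalized indicators.
features = [[float(t + k) for k in range(6)] for t in range(40)]
target = [float(t) for t in range(40)]

X, y = make_windows(features, target, time_steps=2)
```

With 40 quarters and a time step of 2, this yields 38 samples, each a 2 x 6 block of inputs. If the wavelet components of each indicator are fed in separately, the feature dimension grows accordingly.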

Figure 3. Structure of the LSTM model: input layer, hidden layer and output layer.

Performance Indicators
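To further verify the effectiveness and performance of the proposed wind power prediction method, three error criteria are introduced to evaluate the model: the mean absolute error (MAE), the root mean square error (RMSE) and the mean absolute percentage error (MAPE), given in Equations (13)-(15):

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_{\mathrm{real},i} - y_{\mathrm{pred},i} \right| \tag{13}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_{\mathrm{real},i} - y_{\mathrm{pred},i} \right)^2} \tag{14}$$

$$\mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{y_{\mathrm{real},i} - y_{\mathrm{pred},i}}{y_{\mathrm{real},i}} \right| \tag{15}$$

where $y_{\mathrm{real},i}$ is the actual value, $y_{\mathrm{pred},i}$ is the predicted value and $n$ is the number of samples evaluated. These three criteria are used to evaluate the performance of each method.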
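The three criteria can be computed with a few lines of Python. The sketch below assumes MAPE is expressed in percent, which is consistent with the magnitudes reported in this paper; the sample values are made up for illustration.

```python
import math

def mae(y_real, y_pred):
    """Mean absolute error."""
    return sum(abs(r - p) for r, p in zip(y_real, y_pred)) / len(y_real)

def rmse(y_real, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(y_real, y_pred)) / len(y_real))

def mape(y_real, y_pred):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((r - p) / r) for r, p in zip(y_real, y_pred)) / len(y_real)

# Made-up actual and predicted values.
y_real = [100.0, 200.0, 400.0]
y_pred = [110.0, 190.0, 400.0]

scores = {
    "MAE": mae(y_real, y_pred),
    "RMSE": rmse(y_real, y_pred),
    "MAPE": mape(y_real, y_pred),
}
```

MAE and RMSE are in the units of the target, so they depend on its scale, whereas MAPE is scale-free, which is why it is the headline figure when comparing models.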

Model Accuracy
To evaluate the performance of the WD-LSTM model in wind power prediction more effectively, other models are first selected for comparison: physical or statistical models commonly used for time series prediction, and models commonly used in machine learning and deep learning. The comparison includes Bayesian Model Averaging and Ensemble Learning (BMA-EL), MRMLE-AMS and SVR-IDA, and the results are given in Table 2. As shown in Table 2, among the four models, the MAPE of the WD-LSTM model is the lowest, at 5.831. The SVR-IDA model comes second with a MAPE of 15.679, above 10. The errors of BMA-EL and MRMLE-AMS are relatively high, exceeding 20. In this experiment, the machine learning and deep learning prediction models are generally more accurate than the physical or statistical prediction models. A possible reason is that machine learning and deep learning models can learn the correlation between input and output variables more fully. In particular, the deep learning model can learn the variation trend of time series data more fully, and hence shows higher prediction accuracy.
Based on the previous comparisons, the prediction model proposed in this paper, which combines wavelet decomposition with a long short-term memory neural network, shows high accuracy when predicting wind power generation in China. To evaluate the performance of WD-LSTM further, traditional machine learning and deep learning prediction methods are used as comparative experiments. Each model is trained on the same input time series, and the errors are compared and analyzed. Support Vector Regression (SVR), Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), Wavelet Decomposition and Support Vector Regression (WD-SVR) and Wavelet Decomposition and Gated Recurrent Unit (WD-GRU) are used for time series prediction as comparative tests. The proportion of training set to test set is the same as for the WD-LSTM model, and five comparative experiments are conducted. To evaluate the six prediction models objectively, the prediction errors of each model are calculated according to the above formulas. The MAE, MAPE and RMSE on the raw test set are shown in Table 3. [...] 15.048 and 13.715, respectively. The error of the WD-LSTM model is the smallest, and its MAPE of 5.831 is significantly lower than that of the other five models. The data in Table 3 show that WD-LSTM predicts wind power generation with high accuracy and is more effective than the traditional models and single models.
Table 3 also shows the computing time of WD-LSTM and the five comparison models. Among the machine learning models, prediction with the SVR model took 0.05 min while WD-SVR took 12.05 min. Among the deep learning models, GRU and LSTM took the same time, 32 min, while WD-GRU and WD-LSTM took the same time, 44 min. In general, deep learning models take longer than machine learning models to predict time series. However, among the deep learning models, since the data samples are relatively small, there is no significant difference in prediction time.
Figure 4 shows the prediction results of the WD-LSTM neural network and the five comparison models, which directly reflect how well the predicted values of the six models fit the real values. Figure 5 shows the predicted and original values for WD-LSTM. As shown in Figure 4, the prediction curve of Support Vector Regression (SVR) is relatively flat, and the model has difficulty following dynamic changes in the data; when the data fluctuate strongly, it produces large errors. The Gated Recurrent Unit (GRU) is a simplified variant of the Long Short-Term Memory network (LSTM) with a reset gate and an update gate. Its forecast reflects the fluctuation of wind power generation, but in a single quarter the predicted trend is opposite to the real value, leading to a higher error. The results show that Gross Domestic Product (GDP), Consumer Price Index (CPI), Industrial Added Value (IAV), Total Imports and Exports (TIE), National Total Power Generation (NTPG) and Hydropower Generation (HG) can be used as input data for wind power generation prediction. WD-LSTM predicts the fluctuation of wind power generation accurately, with a lower error than the other models.
On the basis of the above research, the paper further studies the influence of different types of input indicators on the accuracy of wind power generation prediction. The six input indicators are divided into two categories: (1) macroeconomic indicators, including GDP, CPI, IAV and TIE; and (2) power generation indicators, including NTPG and HG. Each category is taken in turn as the input variables, and the WD-LSTM model is used to predict wind power generation with the model parameters kept unchanged. With the macroeconomic indicators as input variables, the MAPE is 19.732. With the related power generation indicators as input variables, the MAPE is 16.298. Both values are well above the 5.831 obtained with all six indicators, so the wind power forecast achieves its best accuracy when all six indicators are used as input variables.

Sensitivity Analysis
Sensitivity analysis is a common method for studying the effect of parameter changes on system behavior. The sensitivity of a variable to a test parameter can be calculated as follows:

$$S_t = \left| \frac{(Y'_t - Y_t)/Y_t}{(X'_t - X_t)/X_t} \right|$$

where $S_t$ is the sensitivity of the variable to the test parameter at time $t$, with the third quarter of 2009 set as $t = 1$; $Y_t$ and $Y'_t$ are the values of the output variable before and after the change at time $t$; and $X_t$ and $X'_t$ are the values of the input variable before and after the change at time $t$. The maximum sensitivity of wind power generation over the sample period is:

$$S_{\max} = \max_t S_t$$

The sensitivity of wind power generation to the six main input variables in the proposed WD-LSTM model is analyzed by changing each input variable by −5%, −3%, −1%, 1%, 3% and 5%. The maximum sensitivity of wind power generation from the third quarter of 2009 to the second quarter of 2019 with respect to changes in the six input variables is shown in Table 4. The maximum sensitivity with respect to each of the six input variables over this period is less than 0.10, which means that wind power generation in the proposed WD-LSTM model is not sensitive to small changes in the inputs. Therefore, the proposed model is stable, and small changes in the input variables do not cause abnormal fluctuations in the output variable.
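The perturbation procedure can be sketched as follows. The sensitivity is taken as the ratio of the relative change in the output to the relative change in the input; `toy_model` is a hypothetical stand-in for the trained WD-LSTM model, used only so that the example runs, and the baseline input value is made up.

```python
def sensitivity(y_before, y_after, x_before, x_after):
    """Absolute ratio of relative output change to relative input change."""
    rel_y = (y_after - y_before) / y_before
    rel_x = (x_after - x_before) / x_before
    return abs(rel_y / rel_x)

def toy_model(x):
    """Hypothetical stand-in for the trained predictor (not the paper's model)."""
    return 50.0 + 0.04 * x

x0 = 100.0  # made-up baseline value of one input variable
perturbations = [-0.05, -0.03, -0.01, 0.01, 0.03, 0.05]

scores = [
    sensitivity(toy_model(x0), toy_model(x0 * (1 + p)), x0, x0 * (1 + p))
    for p in perturbations
]
s_max = max(scores)  # maximum sensitivity over all perturbations
```

In the actual analysis this loop would be repeated for each of the six input variables and each quarter, and the largest value per variable would be reported.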

Scenarios Setting
Different forecasting scenarios are set in this paper, each matched with different input data, to explore the trend of national wind power generation under different development situations and to reduce forecasting uncertainty. Taking into account historical growth rates and national economic, energy and social macro-development plans, so as to make more realistic predictions and analyze the future trend of each characteristic value, this paper sets the following three scenarios to predict wind power generation in China under different development trends in the next two years. Scenario 1 is a low-growth scenario, which assumes that the country's recent development trend is sustained and uses the minimum (non-negative) growth rate of the macroeconomic indicators and related power generation indicators, calculated from historical data, to predict national wind power generation. Scenario 2 is the base scenario, in which each indicator is assumed to follow its actual development trend from the fourth quarter of 2019 to the fourth quarter of 2022, using the average year-on-year growth rate of the data from the third quarter of 2009 to the second quarter of 2019. Scenario 3 is a high-growth scenario, which maintains a high growth rate according to the historical development trend: the average year-on-year growth rate of each quarter is calculated from the data from the third quarter of 2009 to the second quarter of 2019 and then multiplied by 1.2. The specific growth rates under each scenario are shown in Table 5.
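The three growth rates can be derived from a historical quarterly series roughly as sketched below. For brevity the sketch averages the year-on-year growth over all quarters rather than separately for each quarter of the year, which is a simplifying assumption; the function names and the sample series are hypothetical.

```python
def year_on_year_growth(series, lag=4):
    """Growth of each quarter relative to the same quarter one year earlier."""
    return [(series[i] - series[i - lag]) / series[i - lag]
            for i in range(lag, len(series))]

def scenario_rates(series, high_factor=1.2):
    """Low, base and high growth rates derived from historical growth."""
    growth = year_on_year_growth(series)
    non_negative = [g for g in growth if g >= 0]
    low = min(non_negative) if non_negative else 0.0  # Scenario 1: minimum non-negative rate
    base = sum(growth) / len(growth)                  # Scenario 2: average rate
    high = base * high_factor                         # Scenario 3: 1.2 times the average
    return {"low": low, "base": base, "high": high}

# Made-up quarterly indicator values covering two years.
history = [100.0, 110.0, 120.0, 130.0, 105.0, 121.0, 126.0, 143.0]
rates = scenario_rates(history)
```

Each future quarter of an indicator can then be projected by applying the chosen rate to the value of the same quarter one year earlier, and the projected indicators are fed to the trained model.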

Future Prediction Results
Testing showed that, among the single models and the hybrid model compared, WD-LSTM achieved relatively high accuracy in predicting national wind power generation, so this model was used to predict China's wind power generation in the next two years. Overall, the country's wind power generation will continue to increase. Under all three scenarios, national wind power generation will decline slightly in the first quarter of 2020, and the growth rate will peak in the fourth quarter of 2021. In 2017, the State Grid stated at a press conference that, by 2020, the problem of new energy consumption would be completely solved and the curtailment rates of wind and solar power would be kept within 5%. According to the 13th Five-Year Plan for Wind Power Development, by 2020, non-fossil energy will account for 15% of primary energy consumption, and the country's annual wind power generation will need to reach 420 billion kWh, about 6% of total power generation. The sustained growth of wind power generation in China may be affected by the following factors: (1) The sustained and steady development of China's economy. At the present stage, China's economic development model is changing from high-speed development to high-quality development. Steady, high-quality economic development lays a solid foundation for the wind power industry in China and thus supports the sustainable growth of the country's wind power generation. (2) Environmental protection brings development opportunities for renewable energy such as wind energy. From an overall perspective, the development of renewable energy is a common goal of mankind and an important support for the global response to future climate, environmental and economic changes.
As wind power develops, it can become more affordable than traditional coal power, and grid parity for wind power will open up new market space, which is also an important reason for the continued growth of wind power generation across the country.
Wind power in China is developing rapidly. Wind curtailment and power limiting have become a focus of public attention and a major problem that urgently needs to be solved in power grid planning and dispatching. In 2017, the National Energy Administration released its Guidance on the Implementation of the 13th Five-Year Plan for the Development of Renewable Energy, which also proposed consumption and utilization targets to effectively solve the problem of wind power curtailment by 2020. From 2011 to 2016, the wind curtailment rate first decreased and then increased, reaching its highest value of 17.1% in 2016. According to the Clean Energy Consumption Action Plan (2018-2020) jointly issued by the National Development and Reform Commission and the National Energy Administration, the wind curtailment rate will be kept at a reasonable level (aiming at around 5%) by 2020.
Based on the predictions above, total national wind power output will range between 283.1 and 300.4 billion kWh in 2020. Consequently, at a curtailment rate of about 5%, curtailed wind energy will be between 14.1 and 15.0 billion kWh in 2020.
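The curtailment range follows from simple arithmetic, sketched below under the assumption that curtailed energy is estimated as the target curtailment rate multiplied by the predicted output. The function name is hypothetical, and the results agree with the reported range only approximately, since the rounding convention behind the reported figures is not stated.

```python
def curtailed_energy(total_output, curtailment_rate=0.05):
    """Estimate curtailed energy as rate x predicted output (assumed relation)."""
    return total_output * curtailment_rate


# Predicted 2020 output range in billion kWh, taken from the text.
low_total, high_total = 283.1, 300.4

low_cut = curtailed_energy(low_total)
high_cut = curtailed_energy(high_total)
print(f"curtailed energy: {low_cut:.3f} to {high_cut:.3f} billion kWh")
```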

Conclusions
As a renewable energy source, wind power plays a crucial role in China's electricity production. Accurate prediction of wind power generation therefore helps to optimize power grid dispatching, reduce the reserve capacity of the system and reduce the operating cost of the power system. In this paper, a hybrid WD-LSTM model for predicting national wind power generation is constructed by combining wavelet decomposition with a long short-term memory neural network, based on six index factors: gross domestic product (GDP), consumer price index (CPI), industrial added value (IAV), total imports and exports (TIE), total power generation (TPG) and hydropower generation (HG). The following conclusions can be drawn from the experiments:
(1) Wind power generation is related to GDP, CPI, IAV, TIE, TPG and HG. These six input indexes can, to a certain extent, predict the wind power generation of the country.
(2) Wavelet decomposition splits the time series of macroeconomic indicators and related power generation indicators into low-frequency and high-frequency components, which increases the dimension of the input data of the prediction model. Using these components of different frequencies as input variables effectively improves the accuracy of the prediction model.
(3) The WD-LSTM hybrid prediction model is selected to predict wind power generation in China. The experimental results show that the MAPE of the hybrid prediction model is 5.831. Compared with other machine learning models and single prediction models, the hybrid model predicts national wind power generation more accurately.
(4) The prediction of national wind power generation in this paper still needs to be improved and deepened. Because some index data are difficult to obtain and some data are inconsistent in scale, the selection of input indices is limited. The limitations of the samples themselves introduce a certain range of error into data processing and prediction. Therefore, other possible influencing factors could be considered as input variables.
(5) The next step of the study will consider whether time series with different scales can be used as input indexes of the same model. Information Gain (IG) will also be used to sort and filter input indicators by correlation before making predictions with the WD-LSTM model. The application of the proposed model to primary energy consumption or renewable energy consumption will also be considered.