Prediction of Wind Turbine-Grid Interaction Based on a Principal Component Analysis-Long Short Term Memory Model

The interaction between the grid and wind farms has a significant impact on the power grid, so predicting this interaction is of great importance. In this paper, a wind turbine-grid interaction prediction model based on a long short term memory (LSTM) network under the TensorFlow framework is presented. First, the multivariate time series is screened by principal component analysis (PCA) to reduce the data dimensionality. Second, the LSTM network is used to model the nonlinear relationship between the selected sequence of wind turbine-grid interactions and the actual output sequence of the wind farm. Comparison with a single LSTM model, an Autoregressive Integrated Moving Average (ARIMA) model and a Back Propagation Neural Network (BPNN) model shows that the proposed model has higher accuracy and applicability: the Mean Absolute Percentage Error (MAPE) of the PCA-LSTM, LSTM, ARIMA and BPNN models is 0.617%, 0.703%, 1.397% and 3.127%, respectively. Finally, the Prony algorithm is used to analyze the predicted data of the wind turbine-grid interactions. Based on the actual data, the oscillation frequencies of the data predicted by the PCA-LSTM model are found to be basically the same as those of the actual data, which verifies the feasibility of the proposed model for analyzing the interaction between the grid and wind turbines.


Introduction
During the operation of wind turbines, the output power changes constantly because of the randomness and intermittency of the wind resource. This brings unpredictable influences to the operating state of the power system and may lead to system oscillation. It is therefore very important to find a wind power prediction method that can relieve the peak load regulation and frequency modulation pressure of the power system and predict possible system oscillations with a certain accuracy [1]. The real-time operation data of wind turbines record their actual operating status and inevitably contain information on the interaction between wind turbines and power grids. It is therefore necessary to analyze these data in depth and to apply big data analysis to extract valuable information.
At present, three kinds of forecasting methods are commonly used: physical methods, statistical methods, and combinations of the two [2]. The physical method describes the physical process of converting wind into electricity and simulates all the steps involved. Based on wind turbine background data, such as wind turbine position and fan parameters, it builds a model, estimates the wind speed at the hub height of each wind turbine, and finally obtains the output power through the wind power curve [3]. This method involves a large number of meteorological theories and geomorphological parameters and is very difficult to solve. The statistical method aims to establish a nonlinear relationship between wind power and the input variables directly by analyzing the statistical laws of the time series, and includes sequential extrapolation and artificial intelligence prediction methods. Sequential extrapolation includes the time series method, the regression analysis method, the Kalman filtering method [4], etc. Artificial intelligence methods include artificial neural networks (ANN), support vector machines (SVM), deep learning [5] and so on. A method using a Least Squares Support Vector Machine (LSSVM) to predict wind speed and indirectly predict wind power output is proposed in [6]. In reference [7], an artificial neural network for wind power prediction is constructed based on Numerical Weather Prediction (NWP) data.
However, a wind power data series is a time series with dynamic characteristics: the output of the system is related not only to the input at the current time but also to past inputs. Recurrent neural networks (RNN) [8,9] can use historical information as well as current input information, so RNN has great advantages in processing time series information. As a special RNN model, the LSTM network effectively avoids the vanishing and exploding gradient problems of conventional RNN training thanks to its special structural design [10]. LSTM has many nonlinear transformation layers and can be used in complex situations. With enough training data, an LSTM model can explore the information contained in massive data.
Since the large-scale integration of wind power, the interaction between wind turbines and power grids [11,12] has become a topic of widespread concern. Much research has been carried out on the integration of wind power with the grid. Reference [13] investigates a renewable power system by jointly optimizing the expansion of renewable generation facilities and the transmission grid, and shows that transmission can reduce the cost of electricity when wind capacities and solar photovoltaics are installed separately. Reference [14] presents a two-layer nested model considering the uncertainty in forecasting photovoltaic power. Reference [15] proposes a Mixed-Integer Nonlinear Programming (MINLP) model for a grid-connected solar-wind-pumped-hydroelectricity (PV-WT-PSH) system, which combines mixed integer modeling with an ANN model to predict the energy flow between a local balancing area using PV-WT-PSH and the national power system.
At present, the complicated oscillation phenomena caused by wind power integration include sub-synchronous interaction (SSI) and low frequency oscillation [16][17][18]. SSI mostly appears as an exchange of energy between the generator and the alternating current system at a frequency lower than the rated frequency of the system. The frequency of low frequency oscillation is usually between 0.1 and 2.5 Hz, and the oscillation results from the negative damping effect produced by the rapid excitation of the generator. According to the internal mechanism, SSI can be divided into subsynchronous control interaction (SSCI) [19] and subsynchronous torque interaction (SSTI) [20]. SSCI is associated with the series capacitance of the control device and power electronic equipment, and may also occur in the case of low series compensation. SSTI [21] is related to the mechanical power on the generator shaft system. Depending on the formation mechanism, this kind of oscillation problem can be subdivided at the SSTI level into subsynchronous oscillation (SSO) [22] and subsynchronous resonance (SSR). SSR [23,24] results from the resonance introduced by series compensation capacitance in the power grid, and SSO results from positive feedback produced by defects of the control system itself.
The main contributions of this paper are as follows: (1) The principal component analysis of wind turbine-grid interaction is studied, and simulations prove the rationality of the selected component in the prediction of interaction between wind turbine and grid; (2) A prediction model of wind turbine-grid interaction based on PCA-LSTM is proposed.
The first part of the article puts forward the factors related to wind turbine-grid interaction and introduces PCA. In the second part, the prediction model of wind turbine-grid interaction is proposed, and the principle of the LSTM network and the design scheme of the prediction model are introduced. The third part introduces the data flow diagram of the model in TensorFlow. The fourth part is the experimental verification and result analysis, which verifies the accuracy of the proposed model. Figure 1 shows the flowchart of the methodology used in this paper.
Energies 2018, 11, 3221
Figure 1. Flowchart of the methodology used in this paper.

Analysis Objects of Wind Turbine Grid Interaction
In this paper, wind output power, phase voltage and phase current are selected as the analysis objects of wind turbine-grid interaction. First, prediction models must be built and trained to predict power, voltage and current, respectively. Too few predictors lead to missing information and make a comprehensive analysis of the data impossible. Too many predictors, however, increase the amount of calculation and reduce the generalization ability, so the input features must be selected before prediction.

Voltage/Current
The voltage stability of wind turbines is usually affected by a combination of factors, including the scale of the wind turbines, the type and size of disturbances, the type of generators and the operation mode of the wind turbines. The harmonics of the stator current are affected by the stator and rotor voltages. In addition, the harmonics of the stator current may also come from the wind generator itself, disturbances of the surrounding environment, etc. Therefore, PCA is used to select the input quantities that are related to the voltage and current.

Power
A wind turbine works by converting the kinetic energy of the wind first into rotational kinetic energy and then into electrical energy, which can be supplied via the grid. The rotational kinetic power produced in a wind turbine is given by:
$$P_w = \frac{1}{2} C_p \rho S v^3 \tag{1}$$
In Equation (1), $P_w$ is the output power (kW), $C_p$ is the power coefficient, $\rho$ is the air density (kg/m$^3$), $S$ is the blade swept area (m$^2$), and $v$ is the wind speed (m/s). The air density at the wind turbine is given by:
$$\rho = 3.48\,\frac{p}{T}\left(1 - 0.378\,\frac{\varphi P_b}{p}\right) \tag{2}$$
In Equation (2), $p$ is the atmospheric pressure, $P_b$ is the saturated vapor pressure, $T$ is the thermodynamic temperature and $\varphi$ is the relative air humidity; the constant 3.48 applies when the pressures are expressed in kPa and the temperature in K.
According to Equations (1) and (2), for a given wind turbine the power coefficient and the blade swept area are constant, so the output power of the wind turbine is closely related to four factors: wind speed, temperature, humidity and pressure. Wind speed is the most important of them, since the power depends on its cube. Some of these four factors are related to each other and some are independent. Because there is a certain correlation, the information contained in the various variables can be synthesized with fewer factors. PCA is a dimensionality reduction method of this kind.
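The dependence of the output power on the four meteorological factors can be illustrated with a short sketch. It assumes the usual moist-air form of Equation (2), $\rho = 3.48\,(p/T)(1 - 0.378\,\varphi P_b/p)$ with pressures in kPa, and the function names and numerical values are hypothetical, chosen only for illustration.

```python
def air_density(p_kpa, t_kelvin, rel_humidity, p_sat_kpa):
    """Moist-air density in kg/m^3 (assumed form of Equation (2), pressures in kPa)."""
    return 3.48 * p_kpa / t_kelvin * (1.0 - 0.378 * rel_humidity * p_sat_kpa / p_kpa)


def wind_power(c_p, rho, area, v):
    """Rotational kinetic power from Equation (1); watts for SI inputs (divide by 1000 for kW)."""
    return 0.5 * c_p * rho * area * v ** 3


# Illustrative values only, not taken from the wind farm data.
rho_dry = air_density(p_kpa=101.325, t_kelvin=288.15, rel_humidity=0.0, p_sat_kpa=1.7)
p_10 = wind_power(c_p=0.4, rho=1.2, area=100.0, v=10.0)
p_20 = wind_power(c_p=0.4, rho=1.2, area=100.0, v=20.0)
ratio = p_20 / p_10  # doubling the wind speed multiplies the power by 2**3
```

The ratio makes the cubic dependence on wind speed explicit, which is why wind speed dominates the other three factors.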

Principle of Principal Component Analysis
The idea of PCA [25] is to construct new variables as linear combinations of the original variables, such that the new variables are uncorrelated with each other and reflect as much of the information in the original variables as possible. Mapping n-dimensional features to k dimensions (k < n) yields completely new orthogonal features, which are called the principal components. The principal components are k reconstructed features, rather than what remains after simply removing n − k of the n original features. Each new feature has its own meaning. Data information is mainly reflected in the variance: features with large variance carry the main information contained in the original variables, which is usually measured by the cumulative variance contribution rate. Generally, the dimension whose cumulative contribution rate is about 75~95% is selected.
Consider a sample set $X = \{x_1, x_2, \ldots, x_m\}$ and assume that it is centered, that is, $\sum_i x_i = 0$. Let the new coordinate system after the projection transformation be $\{w_1, w_2, \ldots, w_d\}$, where the $w_i$ are orthonormal basis vectors with $\|w_i\|_2 = 1$. The projection of the sample point $x_i$ onto the hyperplane in the new space is $W^T x_i$. For the projections of all the sample points to be separated as much as possible, the variance of the projected sample points should be maximized. This variance is $\sum_i W^T x_i x_i^T W$, so the optimization objective can be expressed as:
$$\max_W \; \operatorname{tr}\left(W^T X X^T W\right) \quad \text{s.t.} \quad W^T W = I \tag{3}$$
Applying the Lagrange multiplier method gives:
$$X X^T w_i = \lambda_i w_i \tag{4}$$
Therefore, it is only necessary to perform an eigenvalue decomposition of the covariance matrix $X X^T$ and sort the obtained eigenvalues: $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_m$. The number of principal components selected depends on the cumulative variance contribution rate. Usually, when the cumulative variance contribution rate exceeds a threshold in the range 75~95%, the first $p$ principal components contain most of the information provided by the $m$ original variables, and $p$ is taken as the number of principal components. The variance contribution rate and the cumulative variance contribution rate are, respectively:
$$\eta_i = \frac{\lambda_i}{\sum_{k=1}^{m} \lambda_k} \tag{5}$$
$$\eta_{\Sigma}(p) = \frac{\sum_{k=1}^{p} \lambda_k}{\sum_{k=1}^{m} \lambda_k} \tag{6}$$
The solution of PCA is to form $W = \{w_1, w_2, \ldots, w_p\}$ from the eigenvectors corresponding to the $p$ largest eigenvalues.
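The eigenvalue-decomposition procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `pca_fit`, the 85% threshold and the synthetic data are assumptions made for the example.

```python
import numpy as np


def pca_fit(X, threshold=0.85):
    """PCA by eigendecomposition of the covariance matrix.

    X has shape (n_samples, n_features). Returns the projection matrix W
    (n_features, p), the variance contribution rates and their cumulative sum.
    """
    Xc = X - X.mean(axis=0)                    # center the samples
    cov = np.cov(Xc, rowvar=False)             # covariance matrix of the features
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    contribution = eigvals / eigvals.sum()     # variance contribution rate
    cumulative = np.cumsum(contribution)       # cumulative contribution rate
    # smallest p whose cumulative contribution reaches the threshold
    p = min(int(np.searchsorted(cumulative, threshold)) + 1, len(eigvals))
    return eigvecs[:, :p], contribution, cumulative


# Synthetic data (hypothetical): two strongly correlated columns plus small noise.
rng = np.random.default_rng(0)
t = rng.normal(size=500)
X = np.column_stack([t, 2.0 * t + 0.01 * rng.normal(size=500), 0.01 * rng.normal(size=500)])
W, contribution, cumulative = pca_fit(X)
Z = (X - X.mean(axis=0)) @ W                   # projected data, shape (500, p)
```

With strongly correlated inputs a single component is retained, which mirrors the situation in which one principal component carries most of the variance.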

Long-Term and Short-Term Memory Network Structure
LSTM can be used as a complex nonlinear unit to construct a larger deep neural network, which can reflect the long-term memory effect. The LSTM network includes an input layer, an output layer, and multiple hidden layers. The hidden layer is composed of memory tuples, and its basic structure is shown in Figure 2.
The key to the LSTM network is the cell state, which runs directly along the whole chain like a conveyor belt. In LSTM, cell state information is added or deleted through the gate structure, which selectively determines whether information passes through. A gate consists of a sigmoid layer and a pointwise multiplication operation. The output of the gate structure lies between 0 and 1 and defines the degree to which information passes through.
The LSTM tuple includes three gates, namely an input gate, a forget gate and an output gate. The three gates control the flow of information between the tuple and the network. In the following formulas, $i_t$, $o_t$ and $f_t$ represent the state values of the input gate, output gate and forget gate, respectively.
(1) The forget gate decides which information to forget from the old cell state $C_{t-1}$. Its inputs are the input of the current layer $x_t$ and the output of the previous layer $h_{t-1}$, and its output is:
$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \tag{7}$$
(2) Generating the information to be updated and storing it in the cell takes two steps: (a) the input gate, passing through the sigmoid layer, decides which information to update, and a tanh layer produces the new candidate information $\tilde{C}_t$; (b) the old cell state is multiplied by $f_t$ to forget unnecessary information, and the new candidate information is added to obtain $C_t$:
$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \tag{8}$$
$$\tilde{C}_t = \tanh\left(W_c \cdot [h_{t-1}, x_t] + b_c\right) \tag{9}$$
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t \tag{10}$$
(3) The output information is determined by the output gate. First, the initial output is obtained through the sigmoid layer; the cell state value is then scaled to [−1, 1] with the tanh layer, and the output $h_t$ is obtained:
$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \tag{11}$$
$$h_t = o_t * \tanh(C_t) \tag{12}$$
In Equations (7) to (11), $W_i$, $W_f$, $W_o$ and $W_c$ are the weight matrices of the input gate, forget gate, output gate and tuple input, respectively, $b_i$, $b_f$, $b_o$ and $b_c$ are the corresponding bias terms, and $\sigma$ is the sigmoid function.
The LSTM model has the same structure as the RNN model. It can be seen as multiple replications of the same neural network, in which each network module passes its message to the next one. After unfolding the loop, the structure is shown in Figure 3.
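To make the gate computations concrete, the following NumPy sketch performs the forward pass of one LSTM tuple in the standard formulation (forget gate, input gate, candidate state, cell state update, output gate, output). The dictionary layout of the weights, the sizes and the random initialization are assumptions for illustration, not the configuration used in the paper.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One forward step of an LSTM tuple.

    W[k] has shape (hidden, hidden + input) and b[k] has shape (hidden,)
    for k in "f", "i", "c", "o" (a hypothetical layout for this sketch).
    """
    z = np.concatenate([h_prev, x_t])          # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])         # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])         # input gate
    c_hat = np.tanh(W["c"] @ z + b["c"])       # candidate information
    c_t = f_t * c_prev + i_t * c_hat           # new cell state
    o_t = sigmoid(W["o"] @ z + b["o"])         # output gate
    h_t = o_t * np.tanh(c_t)                   # tuple output
    return h_t, c_t


# Unfold the loop over a short random sequence (illustrative sizes only).
rng = np.random.default_rng(0)
n_in, n_hid = 2, 4
W = {k: rng.normal(scale=0.1, size=(n_hid, n_hid + n_in)) for k in "fico"}
b = {k: np.zeros(n_hid) for k in "fico"}
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x_t in rng.normal(size=(5, n_in)):
    h, c = lstm_step(x_t, h, c, W, b)
```

Because the output gate lies in (0, 1) and the tanh of the cell state in (−1, 1), every component of the output stays strictly inside (−1, 1).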
The observation objects of wind turbine-grid interaction and the wind speed data are the inputs to the LSTM model, and the expression of the prediction model can be derived from the network structure of Figure 3:
$$h(t+1) = f\left(h(t), \ldots, h(t-n), x(t+1), \ldots, x(t-n)\right) \tag{13}$$
In Equation (13), $h(t), \ldots, h(t-n)$ are the historical data and $x(t+1), \ldots, x(t-n)$ are the input parameter selected by PCA, which in this case is the wind speed.
The topological structure of the LSTM model selected in this paper is shown in Figure 4. After the principal component analysis of the original data, the analysis objects of wind turbine-grid interaction and the selected principal component are chosen as inputs of the prediction model. The model has two hidden layers, and the output layer gives the prediction of wind power, voltage and current in wind turbine-grid interaction.

Data Normalization
In multi-variable time series prediction, the variables have different dimensions and numerical ranges. Considering the input and output range of the nonlinear activation functions in the model, and in order to treat the influence of the various variables on wind power, voltage and current equally, the raw data must be normalized to [0, 1]. Normalization is carried out by MinMaxScaler, as shown in Equation (14):
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{14}$$
The predicted wind power, current and voltage data are subjected to inverse normalization so that they regain physical significance, as shown in Equation (15):
$$x = x'\left(x_{\max} - x_{\min}\right) + x_{\min} \tag{15}$$
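The normalization and its inverse can be written directly in NumPy. The sketch below mirrors the min-max formula applied column by column and is not scikit-learn's `MinMaxScaler` itself; the helper names and the sample values are hypothetical, and it assumes no column is constant (otherwise the denominator is zero).

```python
import numpy as np


def minmax_fit(x):
    """Column-wise minimum and maximum of the training data."""
    return x.min(axis=0), x.max(axis=0)


def minmax_transform(x, x_min, x_max):
    """Scale each column to [0, 1]."""
    return (x - x_min) / (x_max - x_min)


def minmax_inverse(x_scaled, x_min, x_max):
    """Map scaled values back to their physical range."""
    return x_scaled * (x_max - x_min) + x_min


# Two variables with very different numerical ranges (illustrative values).
data = np.array([[10.0, 200.0], [20.0, 400.0], [30.0, 300.0]])
x_min, x_max = minmax_fit(data)
scaled = minmax_transform(data, x_min, x_max)
restored = minmax_inverse(scaled, x_min, x_max)
```

The minimum and maximum should be computed on the training data only and reused for the inverse transform of the predictions, so that both steps use the same scale.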

Model Parameter Selection
Establishing the LSTM prediction model requires five hyperparameters: the input dimension, the number of input time steps, the number of hidden layers, the dimension of each hidden layer and the output dimension.
In an actual neural network, the number of hidden layers and neurons directly affects the accuracy of network training and prediction, so they should be selected carefully. The network starts from a complex structure with many hidden layers and several hundred neurons in each layer, which leads to overfitting. The number of layers is then reduced and some of the neurons are dropped until the generalization ability of the network is good enough. After many experiments, the following hyperparameters were found to give the best prediction results for our model: an input dimension of 2 with 5 time steps, 2 hidden layers with 50 neurons in the first and 100 neurons in the second, and 1 neuron in the output layer to predict the output. The Adam algorithm, a stochastic gradient descent method, is used as the optimizer of the neural network.
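The input shape of 5 time steps by 2 features implies that the series must be cut into overlapping windows before training. The sketch below shows one way to do this with NumPy; the function name `make_windows`, the convention that the target is in column 0, and the synthetic series are assumptions for illustration. For simplicity it uses only past values of both columns, whereas Equation (13) also allows the wind speed at time t + 1 as an input.

```python
import numpy as np


def make_windows(series, timesteps=5):
    """Cut a (n_samples, n_features) series into supervised windows.

    Returns X with shape (n_windows, timesteps, n_features) and y with shape
    (n_windows,), where y is the next value of column 0 after each window.
    """
    X, y = [], []
    for start in range(len(series) - timesteps):
        X.append(series[start:start + timesteps])   # the last `timesteps` rows
        y.append(series[start + timesteps, 0])      # value to predict
    return np.array(X), np.array(y)


# Hypothetical two-column series: target variable and selected input (e.g. wind speed).
series = np.column_stack([np.arange(20.0), 0.5 * np.arange(20.0)])
X, y = make_windows(series, timesteps=5)
```

Arrays of this shape are what a two-layer recurrent model with 50 and 100 units and a single output neuron would consume, one window per training sample.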

Evaluation of Forecast Results
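The mean absolute percentage error (MAPE) and the root mean square error (RMSE) are used to evaluate the prediction results. The error functions are shown in Equations (16) and (17), respectively:
$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{P_N(i) - \hat{P}_N(i)}{P_N(i)}\right| \times 100\% \tag{16}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(P_N(i) - \hat{P}_N(i)\right)^2} \tag{17}$$
In Equations (16) and (17), $P_N(i)$ and $\hat{P}_N(i)$ ($i = 1, 2, 3, \ldots, n$) are the actual value and the predicted value of the $i$-th data point, and $n$ is the length of the data used for verification.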
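Both error measures are straightforward to compute. The following sketch uses the standard definitions of MAPE and RMSE; the sample values are made up for illustration and are not results from the paper.

```python
import numpy as np


def mape(actual, predicted):
    """Mean absolute percentage error in percent (actual values must be non-zero)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean(np.abs((actual - predicted) / actual)) * 100.0


def rmse(actual, predicted):
    """Root mean square error, in the units of the data."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.sqrt(np.mean((actual - predicted) ** 2))


actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
mape_value = mape(actual, predicted)   # relative errors 10%, 5%, 0%
rmse_value = rmse(actual, predicted)
```

MAPE is scale-free, which makes it suitable for comparing power, current and voltage predictions, while RMSE keeps the physical units of each quantity.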

TensorFlow Framework
TensorFlow [26] is Google's open source deep learning framework, which supports a wide range of models and various types of learning algorithms. It can be used to build deep learning models and to build analysis models flexibly as needed. TensorFlow uses a data flow diagram for numerical calculation. The nodes in the data flow diagram represent numerical operations, and the edges between nodes represent the tensors passed between them, where a tensor is represented by an n-dimensional array. "Flow" refers to computation based on the data flow diagram, and "tensor flow" is the calculation process from one end of the graph to the other.

Construction of the TensorFlow Data Flow Diagram of the Model
A data flow diagram is an abstract description of a computation. At the beginning of the calculation, the data flow graph is started in a session, which distributes the operations in the graph to the computing devices and provides the methods for executing the operations. These methods calculate and return tensors according to the calculation relationship of each edge. The data flow diagram of the LSTM model constructed in this paper is shown in Figure 5, where the nodes are numerical operations and the edges are tensors represented by n-dimensional arrays. The data flow diagram of the hidden layer is shown in Figure 6.

Data Preprocessing
The data used in this paper were collected from an actual wind farm. The sampling started at 13:33 on 6 August 2013 and ended at 14:03 on 6 August 2013. Because the object of study is the interaction between the grid and wind farms, a very high sampling frequency of 4 kHz was used; that is, the data time interval is 1/4000 s, so there are 7,200,000 data items in total. The original data include factors such as fan speed, wind speed, wind direction, pressure, temperature, humidity and so on. Directly ignoring any one factor may introduce errors into the prediction. In order to reduce the dimension of the input variables and minimize the errors, PCA is used to determine the minimum number of variables required and to analyze the multivariate prediction factors.
First, the data are normalized to unify the dimensions of each parameter. Principal component extraction is then performed: the covariance matrix of the normalized training data is calculated, its eigenvalues and contribution rates are calculated, and the principal components are extracted according to the cumulative contribution rate. The calculation results are shown in Table 1, which gives the eigenvalues, variance contribution rates and cumulative contribution rates of the principal components. Figure 7 is a line chart of the variance relative to the number of components.
As can be seen from Table 1, the contribution rate of the first component $Z_1$ is 89.273%, indicating that it contains basically all the information of the original data, so $Z_1$ can be taken as the principal component according to the principal component criterion. Another method of selecting principal components is to check the line chart of variance with respect to the number of components and select the point where the graph becomes close to horizontal. In Figure 7, the graph is close to horizontal after the first principal component and the contribution rates of the other components are very low, so the principal component is determined to be $Z_1$. There are 10 input parameters before PCA, and only one principal component is used as a parameter after PCA.
As can be seen from the component score coefficient matrix in Table 2, the first principal component $Z_1$ is mainly associated with the original variable $X_8$, with a correlation coefficient of 0.965; $X_8$ corresponds to the wind speed. The result obtained from PCA is therefore consistent with the conclusion drawn from Equations (1) and (2) that wind speed is the most important influencing factor. Data preprocessing based on PCA can improve the calculation efficiency of the prediction model while maintaining accuracy.

Experimental Results
After implementing PCA, the selected parameters are used as inputs to the model. Because the sampling frequency of the data is 4 kHz, an averaging method is adopted to reduce the impact of individual data disturbances. The data used in the prediction have one point per second; that is, the average of every 4000 samples is taken as the value at the current time, and the same averaging is applied to the output active power and the wind speed. The waveforms of the output active power, phase current and phase voltage are shown in Figure 8.
From the prediction results in Figure 9a-c, the wind power, phase current and phase voltage predictions based on the PCA-LSTM model have high accuracy and low prediction error. For wind power (Figure 9a), the MAPE is 0.617% and the RMSE is 2167.839; for phase current (Figure 9b), the MAPE is 3.287% and the RMSE is 75.177; for phase voltage (Figure 9c), the MAPE is 2.383% and the RMSE is 35.912. By predicting the output of the wind turbine, the peak load regulation and frequency modulation pressure of the power system can be relieved, mechanical failures can be found in time, corresponding measures can be taken as soon as possible, and the possibility of serious problems in the operation of the wind turbine can be reduced.
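The one-point-per-second averaging of the 4 kHz samples described above can be sketched as a block mean. The function name `block_average` and the synthetic ramp signal are assumptions for illustration; the sample count check uses the 30-minute recording window and the 4 kHz sampling frequency stated in the text.

```python
import numpy as np


def block_average(samples, block=4000):
    """Average consecutive blocks of `block` samples; a trailing partial block is dropped."""
    samples = np.asarray(samples, dtype=float)
    n_blocks = len(samples) // block
    return samples[:n_blocks * block].reshape(n_blocks, block).mean(axis=1)


fs = 4000                      # sampling frequency in Hz
duration_s = 30 * 60           # 13:33 to 14:03 is 30 minutes
n_raw = fs * duration_s        # number of raw data items
n_averaged = duration_s        # one averaged point per second

raw = np.arange(3 * fs, dtype=float)     # 3 s of a synthetic ramp signal
per_second = block_average(raw, block=fs)
```

Averaging reduces 7,200,000 raw samples to 1800 points, which suppresses individual disturbances at the cost of discarding dynamics faster than 1 Hz.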
Figure 10a,c,e are the comparison of the prediction results of active power, phase current and phase voltage between PCA-LSTM model and single LSTM model proposed in this paper.Due to the large Y axis value, the comparison effect is not obvious enough, so Figure 10b,d,f are the typical fragments extracted from Figure 10a,c,e, which show the comparison of the prediction results of the two models.As can be seen from Figure 10, the prediction results of LSTM and PCA-LSTM methods are close to the actual wind power, phase current and phase voltage curves, respectively, and the prediction accuracy of PCA-LSTM is higher than that of a single LSTM model, so the role of PCA in this prediction is very important.As can be seen from Table 3, the RMSE of the PCA-LSTM model proposed in this paper is 5.533%, 6.887% and 5.098% lower than LSTM model, respectively.As can be seen from Figure 10, the prediction results of LSTM and PCA-LSTM methods are close to the actual wind power, phase current and phase voltage curves, respectively, and the prediction accuracy of PCA-LSTM is higher than that of a single LSTM model, so the role of PCA in this prediction is very important.As can be seen from Table 3, the RMSE of the PCA-LSTM model proposed in this paper is 5.533%, 6.887% and 5.098% lower than LSTM model, respectively.By comparing the prediction results of single LSTM model and PCA-LSTM model, it shows that the higher the correlation degree with the target variables, the higher the prediction performance of LSTM model will be.On the contrary, variables with low correlation degree will not only affect the calculation speed, but may also reduce the prediction performance.This result shows that data preprocessing based on PCA increases the accuracy by 12.233% compared with the model using all variables as input parameters.Moreover, the input variables of PCA-LSTM model are much less than those of single LSTM model, which has the advantage of high computational efficiency in the case of 
large amount of data.
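For reference, the two error measures used in this comparison can be computed as in the sketch below. The helper `relative_reduction` assumes that the reported RMSE improvements are relative reductions with respect to the single LSTM model, which the text does not state explicitly; all numerical values are hypothetical.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the units of the data."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline` (assumed definition)."""
    return 100.0 * (baseline - improved) / baseline


# Hypothetical values, not taken from the paper
actual = [100.0, 200.0, 400.0]
pred_a = [110.0, 190.0, 400.0]   # stands in for the better model
pred_b = [120.0, 180.0, 400.0]   # stands in for the baseline model

mape_a = mape(actual, pred_a)
gain = relative_reduction(rmse(actual, pred_b), rmse(actual, pred_a))
```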

Comparison with Other Models
In this paper, a ReLU function is used as the activation function of the LSTM network. To test the performance of the proposed network, we compare it with classic time series prediction models, namely the BPNN model and the ARIMA model, taking the output power as the comparison object. BPNN is a multi-layer feed-forward network trained by back propagation, whose basic idea is the gradient descent method. By analyzing the autocorrelation function and partial autocorrelation function of the residual, the optimal ARIMA model is determined to be ARIMA (1,1,1). The prediction results are shown in Figure 11, and the mean absolute percentage error and root mean square error are shown in Table 3.
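As an illustration of the quantities inspected when choosing the ARIMA order, the sketch below applies first-order differencing (the d = 1 step of ARIMA (1,1,1)) and computes the sample autocorrelation function. The partial autocorrelation function and the model fitting itself are omitted, and the series is hypothetical, so this is only an outline of the diagnostic step, not the authors' procedure.

```python
def difference(series):
    """First-order differencing, the d = 1 step of ARIMA(p, 1, q)."""
    return [b - a for a, b in zip(series, series[1:])]


def autocorrelation(series, max_lag):
    """Sample autocorrelation r_k for lags k = 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t + k] for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


# Hypothetical series: a linear trend plus an alternating component
y = [t + (1.0 if t % 2 == 0 else -1.0) for t in range(20)]
acf_diff = autocorrelation(difference(y), max_lag=3)
```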

As can be seen from Figure 11, the predicted value obtained by the PCA-LSTM method proposed in this paper is closest to the actual value, and its prediction accuracy is higher than that of the BPNN and ARIMA models. As can be seen from Table 4, the prediction error of the PCA-LSTM model is the lowest among the three models, and its MAPE is 2.510 and 0.780 percentage points lower than that of the BPNN model and the ARIMA model, respectively.

The experiments described in Sections 5.2 and 5.3 confirm that the prediction based on the PCA-LSTM model has high accuracy, so it is reasonable to use the predicted values of the observed wind turbine-grid interaction quantities as the basis for judging the operation state of the system.
The prediction data and the actual data within a certain time period are selected, and the Prony algorithm is used to analyze the oscillation modes. The analysis results are shown in Table 5. Because the oscillation frequency of the turbine-grid interaction lies between 0 and 100 Hz, oscillation frequencies higher than 100 Hz are eliminated. From these data, it can be concluded that subsynchronous control interaction (SSCI), subsynchronous oscillation (SSO) and subsynchronous resonance (SSR) exist during actual system operation, and that the frequency values of the predicted data output by the LSTM model are similar to the actual values, with SSO and SSR as the main oscillation components.

Based on the analysis of the actual operation data of wind turbines, several oscillation modes, such as low-frequency oscillation, SSCI, SSO and SSR, exist in actual system operation, but due to various factors the frequency values differ slightly from the theoretically calculated characteristic frequencies. The output current, voltage and power of wind turbines mainly include frequency values of 0.8, 8, 12, 25, 45, 50 and 90 Hz. In Figure 12a-c, the X axis is the frequency component obtained from the PCA-LSTM model and the Y axis is the frequency component obtained from the actual values. Figure 13 is a hexagonal bin plot drawn from these three charts, which depicts the relationship between the predicted power and the actual power more visually. The darker the hexagon, the more often a given frequency component appears, which shows that the 12, 25 and 50 Hz components appear more often. Figure 13 shows that the frequency values of the predicted data output by the PCA-LSTM model are basically the same as the actual frequency values.

Tables 6 and 7 list, respectively, the oscillation modes corresponding to the predicted and to the actual phase current, phase voltage and active power of the wind turbine. According to Tables 6 and 7, subsynchronous oscillation and subsynchronous resonance of wind turbines contain many components and are more likely to be excited. Low-frequency oscillation mainly exists in the phase current and phase voltage and is less likely to be excited. The experiment on the measured data above verifies the feasibility and high accuracy of analyzing the interaction between the grid and the wind turbine from the phase current, phase voltage and active power predicted by the PCA-LSTM model. Based on these predicted values, possible interactions between the grid and the wind turbine can be controlled in time by analyzing the operating state of the system, which is of great significance to the safe operation of the grid.
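To make the role of the Prony algorithm concrete, the sketch below estimates the frequency and damping of a single oscillation mode with an order-2 Prony fit. It is a simplified illustration under stated assumptions: the analysis in this paper extracts several modes at once, whereas this version handles one noise-free mode, omits the amplitude and phase step, and uses a hypothetical test signal and sampling rate.

```python
import cmath
import math


def prony_single_mode(x, fs):
    """Estimate frequency (Hz) and damping factor (1/s) of one damped sinusoid.

    Order-2 Prony: fit x[n] = a1*x[n-1] + a2*x[n-2] by least squares, then
    read the mode from a root of z^2 - a1*z - a2 = 0.
    """
    # Accumulate the normal equations of the two-parameter least-squares fit
    s11 = s12 = s22 = r1 = r2 = 0.0
    for n in range(2, len(x)):
        u, v, y = x[n - 1], x[n - 2], x[n]
        s11 += u * u
        s12 += u * v
        s22 += v * v
        r1 += u * y
        r2 += v * y
    det = s11 * s22 - s12 * s12
    a1 = (r1 * s22 - r2 * s12) / det
    a2 = (r2 * s11 - r1 * s12) / det

    # One root of the characteristic polynomial (its conjugate is the other)
    z = (a1 + cmath.sqrt(a1 * a1 + 4.0 * a2)) / 2.0

    freq_hz = abs(cmath.phase(z)) * fs / (2.0 * math.pi)
    damping = math.log(abs(z)) * fs   # negative for a decaying mode
    return freq_hz, damping


# Hypothetical signal: a 12 Hz oscillation decaying at 0.5 1/s, sampled at 1 kHz
fs = 1000.0
signal = [
    math.exp(-0.5 * n / fs) * math.cos(2.0 * math.pi * 12.0 * n / fs + 0.3)
    for n in range(500)
]
freq, damp = prony_single_mode(signal, fs)
```

Applying the same estimate to the predicted and to the actual series and comparing the resulting frequencies mirrors, in miniature, the comparison reported in Table 5.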

Conclusions
In this paper, a prediction model of wind turbine-grid interaction based on an LSTM network is proposed under TensorFlow. PCA is used to select appropriate model input variables, which reduces the data dimension. For the oscillation mode analysis, the predicted data of the interaction between wind turbine and grid are analyzed by the Prony algorithm. By analyzing the measured data of a wind turbine, the following conclusions are obtained:

(1) PCA can reduce the dimensions of input variables, reflect the main factors affecting wind power prediction, and improve the operation speed while preserving the prediction accuracy. Compared with the single LSTM model, the prediction accuracy of PCA-LSTM is clearly improved. For wind power, phase current and phase voltage prediction, the RMSE of the PCA-LSTM model is 5.533%, 6.887% and 5.098% lower than that of the LSTM model, respectively.

(2) An LSTM network can effectively analyze massive amounts of data. Compared with traditional time series prediction methods, the deep learning method has strong learning and generalization ability, and its performance increases with data size. Compared with other prediction methods, this method has higher accuracy and applicability. Its MAPE is 2.510 and 0.780 percentage points lower than that of the BPNN model and the ARIMA model, respectively.

(3) Based on the actual data and the predicted data of the model, the oscillation modes of the interaction between the wind turbine and the power grid are analyzed by the Prony algorithm. The oscillation frequencies of the predicted data from the PCA-LSTM model proposed in this paper are basically the same as those of the actual data. The oscillation frequencies also show that wind turbines have more harmonic components at frequencies such as 12, 25 and 50 Hz, that is, there are more subsynchronous oscillations and subsynchronous resonances, which are more likely to be excited. This verifies the feasibility of the proposed method for analyzing the interaction between wind turbines and the power grid.

Wind Turbine Grid Interaction

Long-Term and Short-Term Memory Network Structure
tanh is an activation function that can map a real-number input into [−1, 1]. Each of the input gate, forget gate, output gate and tuple input has a weight matrix that connects to $h_{t-1}$, and $b_i$, $b_f$, $b_o$, $b_C$ respectively represent the bias vectors of the input gate, forget gate, output gate and tuple input. σ represents the sigmoid activation function.
Energies 2018, 11, x FOR PEER REVIEW 5 of 20
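The gate computations described above can be illustrated with a scalar version of one LSTM time step. This is a sketch of the standard LSTM cell, not the TensorFlow model built in this paper: the dictionaries `W` (input weights), `U` (weights on $h_{t-1}$) and `b` (biases) are hypothetical names, all parameter values are made up, and scalars stand in for the weight matrices and vectors.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One time step of a scalar LSTM cell.

    W[g], U[g] and b[g] are the input weight, the weight on h_prev and the
    bias of gate g, with g in {"i", "f", "o", "C"} (hypothetical naming).
    """
    pre = {g: W[g] * x_t + U[g] * h_prev + b[g] for g in ("i", "f", "o", "C")}
    i_t = sigmoid(pre["i"])            # input gate
    f_t = sigmoid(pre["f"])            # forget gate
    o_t = sigmoid(pre["o"])            # output gate
    c_tilde = math.tanh(pre["C"])      # tuple (candidate) input in [-1, 1]
    c_t = f_t * c_prev + i_t * c_tilde # new cell state
    h_t = o_t * math.tanh(c_t)         # new hidden state
    return h_t, c_t


# Hypothetical parameters, chosen only to exercise the function
W = {"i": 0.5, "f": 0.5, "o": 0.5, "C": 1.0}
U = {"i": 0.1, "f": 0.1, "o": 0.1, "C": 0.2}
b = {"i": 0.0, "f": 1.0, "o": 0.0, "C": 0.0}

h, c = 0.0, 0.0
for x in [1.0, -0.5, 0.25]:
    h, c = lstm_step(x, h, c, W, U, b)
```

Because the output gate lies in (0, 1) and tanh is bounded, the hidden state `h` always stays strictly between −1 and 1.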

Figure 3. Network structure unfolded in time.

Construction of TensorFlow Flow Diagram of the Model
A data flow diagram is an abstract description of computation. At the beginning of the calculation, the data flow graph is started in a session, which distributes the operations in the graph to the computing devices and provides the methods for executing them. These methods calculate and return tensors according to the relationships defined by the edges. The data flow diagram of the LSTM model constructed in this paper is shown in Figure 5, where the nodes are numerical operations and the edges are tensors represented by n-dimensional arrays. The data flow diagram of the hidden layer is shown in Figure 6. Tensors are n-dimensional arrays, flow refers to computation based on a data flow diagram, and TensorFlow is the calculation process in which tensors flow from one end of the graph to the other.
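The idea of building a graph first and evaluating it later can be shown with a toy data flow graph in plain Python. This is deliberately not TensorFlow's API: the `Node` class and the `placeholder` and `run` functions are hypothetical, scalars stand in for n-dimensional arrays, and `run` only plays the role that a session plays in the description above.

```python
class Node:
    """A node in a toy data flow graph: an operation plus its input nodes."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op
        self.inputs = tuple(inputs)
        self.name = name


def placeholder(name):
    """A node whose value is supplied when the graph is run."""
    return Node(op=None, name=name)


def run(node, feed):
    """Evaluate `node` by pulling values along the edges from its inputs."""
    if node.op is None:
        return feed[node.name]
    args = [run(inp, feed) for inp in node.inputs]
    return node.op(*args)


# Build the graph y = w * x + b; nothing is computed at this point
x = placeholder("x")
w = placeholder("w")
b = placeholder("b")
product = Node(lambda p, q: p * q, [w, x])
y = Node(lambda p, q: p + q, [product, b])

# Running the graph with concrete inputs triggers the computation
result = run(y, {"x": 3.0, "w": 2.0, "b": 1.0})
```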

Figure 6. Data flow diagram of hidden layer in LSTM model.

Figure 7. Scatter of variance relative to the number of components.

Figure 9. Comparison of actual output and predicted output. (a) Active power; (b) Phase current; (c) Phase voltage.

Figure 10. Forecast results and partial results of PCA-LSTM and LSTM. (a) Active power; (b) Partial active power; (c) Phase current; (d) Partial phase current; (e) Phase voltage; (f) Partial phase voltage.


Figure 12. Analysis results of the actual and forecast values of phase current, phase voltage and power based on the Prony algorithm. (a) Phase voltage; (b) Phase current; (c) Active power.
Figure 2. Cell structure of LSTM.

Table 2. Score of Component Coefficient Matrix.

Table 3. Error analysis of forecasting result.

Table 4. Error analysis of forecasting result.

Table 5. The analysis results of the actual value and forecast value based on Prony algorithm.

Table 6. Analysis of Oscillation Mode of Wind Turbine on Forecasted Phase Current, Phase Voltage and Power.

Table 7. Analysis of Oscillation Mode of Wind Turbine on Actual Phase Current, Phase Voltage and Power.