Improved Combined Inertial Control of Wind Turbine Based on CAE and DNN for Temporary Frequency Support

Abstract: With the continuous and large-scale development of renewable energy, there is a prominent decrease in the level of inertia in new power systems. This decrease leads to the weakening of the system’s capability to provide inertia support and frequency regulation during disturbance events. The wind turbines (WT), as the main representatives of renewable energy generation, should be more efficiently involved in the power system frequency regulation dynamics. However, optimal frequency regulation is difficult to achieve through the combined inertial control strategy of wind turbines because it greatly depends on control parameters and fluctuates in different scenarios. To cope with disturbance efficiently and quickly in different scenarios and obtain the optimal frequency regulation results, this paper presents an improved combined inertial intelligent control strategy of WT based on contractive autoencoder (CAE) and deep neural network (DNN). This method obtains the optimal parameters for combined inertial control using the particle swarm optimization (PSO) algorithm, then effectively extracts features from actual data using CAE followed by building a network model to predict the optimal combined inertial control parameters online. To verify and test the proposed method, it is applied in the IEEE 9-bus test system. The simulation results show that the method can obtain optimal control parameters with a faster computational time, good prediction accuracy, and generalization capability.


Introduction
Wind power has been widely adopted because it is environmentally friendly, technologically mature, and low in cost [1]. With the development of wind power generation technology, the penetration rate of wind power in various countries has been rising. According to relevant data, by the end of 2020, China's installed wind power capacity accounted for 12.79%, showing a rapid growth rate [2]. The WT is generally connected to the grid through a converter, which decouples the turbine speed from the grid frequency [3]. Additionally, in the maximum power point tracking (MPPT) control mode, the WT cannot provide power support when the power grid frequency changes [4]. Therefore, with the increasing penetration of wind power, the problems of reduced system inertia and insufficient primary frequency regulation capability are becoming more and more obvious, which adversely affects the grid frequency [5].
For this reason, the WT should provide an inertia response and primary frequency regulation capability similar to those of synchronous generators in order to achieve grid-friendly integration of wind power. To address this issue, the control methods currently used mainly include virtual inertia control and droop control. Virtual inertia control uses the rate of change of the system frequency as its input and is mostly used to support transient frequencies. The concept of virtual inertia control was introduced in [6]; it makes full use of the high controllability and fast response of the WT output power, enabling the WT to emulate the inertia characteristics of synchronous generators and release rotor kinetic energy to provide power support for the grid. Droop control gives the WT frequency droop characteristics similar to those of conventional units, so that it can cope with system frequency fluctuations, continuously support the system frequency, and provide strong support at the lowest point of the system frequency. In [7], the effect of droop control on system frequency was analyzed and the range of control parameters for different wind power penetration rates was obtained. The combined inertial control is a combination of droop control and virtual inertia control. In [8][9][10][11], the improvement effect of combined inertial control on system frequency is investigated, and it is pointed out that combined inertial control plays an important role in reducing the frequency variation rate and the maximum frequency deviation.
In order to further optimize the effect of wind power participation in frequency regulation, many scholars have conducted further research on optimal control methods. In [12], a control strategy with variable droop parameters is presented to improve the adaptive capability of the control strategy, but the solution process is tedious, and the combination of virtual inertia control and droop control is not considered comprehensively. A combined control strategy that considers both virtual inertia control and droop control is mentioned in [13], but without specific analysis. In [14], PSO was used to optimize the virtual inertia and droop control parameters, and wind speed adaptation analysis was performed in three scenarios, which showed that the proposed method could significantly improve the effect of frequency regulation. However, the method did not significantly reduce the frequency regulation time, and the scenarios were too few to be generally representative.
Based on the aforementioned background, this paper proposes an improved combined inertial intelligent control of the wind turbine for temporary frequency support based on CAE and DNN. First, PSO is used to obtain the optimal combined inertial control parameters and generate data sets. Then, CAE is introduced to extract features from the data sets. After the number of hidden layers and the number of neurons are selected, the DNN model is built through unsupervised greedy layer-wise pre-training and supervised fine-tuning to predict the optimal combined inertial control parameters online, offering faster computational speed, good prediction accuracy, and generalization capability.

System Frequency Response Model
The most commonly used method for frequency dynamic analysis of the power system after a disturbance is the system frequency response (SFR) model [15]. This model assumes that all generators share a common center-of-inertia frequency. It aggregates the prime mover and governor system models of all generators in the system into the prime mover and governor system model of one equivalent generator, concentrates the mechanical power of all generators onto one generator rotor, and aggregates all loads in the system into one centralized load. The model structure is shown in Figure 1.
Figure 1 depicts the variables used in the SFR model, where $H_s$ denotes the inertia constant of the power system, $K_m$ stands for the mechanical power gain factor, $D_s$ denotes the damping coefficient of the system, $T_R$ represents the reheat time constant, $R$ stands for the governor regulation coefficient, $F_H$ is the fraction of total power generated by the high-pressure turbine, $P_G$ represents the output power of the synchronous generator, $P_w$ denotes the output power of the WT, $P_L$ represents the load power, $\Delta P$ denotes the input power variation, and $\Delta f$ represents the output frequency response.
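To make the structure of the SFR model more concrete, the following is a minimal simulation sketch, not the model used in this paper. It assumes the classical reheat-turbine form of the SFR model, in which the governor and turbine are represented by $(K_m/R)(1 + F_H T_R s)/(1 + T_R s)$ and the rotor by $1/(2 H_s s + D_s)$. The WT power $P_w$ is left out, all parameter values are illustrative assumptions, and the function name `simulate_sfr` is hypothetical.

```python
def simulate_sfr(delta_p_load, H=4.0, D=1.0, R=0.05, K_m=0.95, F_H=0.3,
                 T_R=8.0, t_end=60.0, dt=0.01):
    """Forward-Euler simulation of an aggregated SFR model (all values in p.u.).

    delta_p_load is a step increase in load applied at t = 0.
    Returns (times, delta_f), where delta_f is the frequency deviation trace.
    Parameter defaults are assumed, illustrative values.
    """
    delta_f = 0.0  # frequency deviation
    z = 0.0        # lag state of the reheater, 1 / (1 + T_R s)
    times, trace = [], []
    steps = int(round(t_end / dt))
    for n in range(steps + 1):
        times.append(n * dt)
        trace.append(delta_f)
        u = -delta_f  # governor input: negative frequency deviation
        # (1 + F_H*T_R*s) / (1 + T_R*s) = F_H + (1 - F_H) / (1 + T_R*s)
        delta_p_m = (K_m / R) * (F_H * u + (1.0 - F_H) * z)
        # Swing equation: 2H d(delta_f)/dt = dPm - dPL - D*delta_f
        d_delta_f = (delta_p_m - delta_p_load - D * delta_f) / (2.0 * H)
        d_z = (u - z) / T_R
        delta_f += dt * d_delta_f
        z += dt * d_z
    return times, trace


times, delta_f = simulate_sfr(delta_p_load=0.1)
nadir = min(delta_f)         # lowest frequency deviation
steady_state = delta_f[-1]   # settles near -dPL / (D + K_m / R)
```

With these assumed values the response is underdamped, so the nadir lies below the steady-state deviation, which is the region where additional WT support is most useful.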

Principle of Combined Inertial Control
The energy source of the WT is the wind energy captured by the rotating blades. According to aerodynamics, the mechanical power of the WT can be expressed as follows:
$$P_m = \frac{1}{2} \rho \pi R^2 v^3 C_p(\lambda, \beta) \quad (1)$$
where $C_p(\lambda, \beta)$ is the wind energy utilization coefficient; $\lambda$ is the blade tip speed ratio; $\beta$ is the pitch angle; $\rho$ is the air density; $R$ is the wind wheel radius; $v$ is the wind speed. The wind energy utilization coefficient reflects the ability of the WT to capture wind energy, and a commonly used functional expression is given in [16], in which the blade tip speed ratio is
$$\lambda = \frac{\omega_w R}{v}$$
where $\omega_w$ is the WT speed. In order to capture the maximum amount of wind energy, the WT generally operates in the maximum power point tracking mode under normal conditions. When the electromagnetic power and mechanical power reach equilibrium, the WT is in a stable operating condition, and the active power produced can be represented by the following equation:
$$P_{opt} = k \, \omega_{w(opt)}^3 \quad (3)$$
where $C_{p(opt)}$ and $\omega_{w(opt)}$ denote the optimal wind energy utilization coefficient and the corresponding optimal WT speed, respectively; $k$ is the maximum power tracking coefficient of the WT, with $k = \rho \pi R^5 C_{p(opt)} / (2 \lambda_{opt}^3)$, where $\lambda_{opt}$ is the optimal blade tip speed ratio.
As the penetration of wind power in the power system continues to increase, the proportion of traditional thermal power plants decreases accordingly, and the inertia level of the power system continues to decrease. In order to provide sufficient inertia support to the power system, the WT also needs to have the corresponding frequency regulation capability. To enable the WT to provide an inertial response, additional control loops with frequency differentiation and frequency deviation feedback are introduced in the converter active control link, so that the WT can provide power support through rotor kinetic energy control during a power system frequency deviation.
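As a quick numerical check of the MPPT relation, the sketch below evaluates the standard aerodynamic power expression $P_m = \frac{1}{2}\rho\pi R^2 v^3 C_p$ at the optimal operating point and compares it with $k\,\omega_{w(opt)}^3$, using $k = \rho\pi R^5 C_{p(opt)}/(2\lambda_{opt}^3)$ from the text. The numerical values of the air density, rotor radius, $C_{p(opt)}$, and $\lambda_{opt}$ are assumptions chosen only for illustration, and the function names are hypothetical.

```python
import math


def mechanical_power(rho, radius, wind_speed, c_p):
    """Aerodynamic power captured by the rotor: 0.5 * rho * pi * R^2 * v^3 * Cp."""
    return 0.5 * rho * math.pi * radius ** 2 * wind_speed ** 3 * c_p


def mppt_coefficient(rho, radius, c_p_opt, lambda_opt):
    """Maximum power tracking coefficient k = rho * pi * R^5 * Cp_opt / (2 * lambda_opt^3)."""
    return rho * math.pi * radius ** 5 * c_p_opt / (2.0 * lambda_opt ** 3)


# Assumed, illustrative turbine data (not taken from the paper).
rho, radius = 1.225, 40.0        # air density (kg/m^3), rotor radius (m)
c_p_opt, lambda_opt = 0.48, 8.1  # optimal power coefficient and tip speed ratio
wind_speed = 10.0                # wind speed (m/s)

# Optimal rotor speed follows from lambda = omega * R / v.
omega_opt = lambda_opt * wind_speed / radius
k = mppt_coefficient(rho, radius, c_p_opt, lambda_opt)

p_from_k = k * omega_opt ** 3
p_from_wind = mechanical_power(rho, radius, wind_speed, c_p_opt)
```

Both values agree up to rounding, which confirms that the cubic law in the rotor speed is simply the aerodynamic power expression rewritten at the optimal tip speed ratio.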
Rotor kinetic energy control generally consists of virtual inertia control, droop control, combined inertial control, and virtual synchronous generator techniques. In this paper, a combined inertial control method consisting of virtual inertia control and droop control is used.
Virtual inertia control enables the WT to adjust its active power reference according to the rate of change of frequency when the power system frequency changes. By simulating the inertia characteristics of a synchronous machine, the WT releases or absorbs rotor kinetic energy and thereby participates in power system frequency regulation. The control implementation is shown in Figure 2.
In Figure 2, $\Delta P$ is the additional active reference signal obtained by the virtual inertia control. It is proportional to the rate of change of the power system frequency and can be expressed as follows:
$$\Delta P = -K_f \frac{df}{dt} \quad (4)$$
where $K_f$ is the scale coefficient. Virtual inertia control provides a fast active output response when the power system is disturbed. In the initial phase of the power system frequency response, the frequency deviation is small, while the absolute value of the rate of change of frequency is large. Therefore, the WT with virtual inertia control can provide faster and stronger power support during the initial phase. However, at the end of the frequency regulation phase, as the absolute value of the rate of change of frequency becomes smaller, the output power support is further reduced, which hinders the restoration of the power system frequency. Therefore, the virtual inertia control method is mostly used to support the transient frequency and cannot continuously participate in frequency regulation, and it is necessary to avoid a secondary frequency drop in the speed recovery phase.
Droop control simulates the power-frequency static characteristic curve of the primary frequency regulation of a synchronous generator, and is also referred to as proportional control or ramp control. It responds directly to the power system frequency deviation and can provide continuous support for the frequency deviation. The input signal is the power system frequency deviation, and the output signal is the increment of electrical power output from the WT, which varies linearly with the deviation of the power system frequency, as shown in the following equation:
$$\Delta P = -\frac{1}{R} \Delta f \quad (5)$$
where $R$ is the droop coefficient. The specific control implementation of droop control is shown in Figure 3.
In contrast to virtual inertia control, droop control employs the frequency deviation signal as its input, which provides greater power assistance in the vicinity of the frequency nadir. However, it has a slower response time compared to virtual inertia control and provides inadequate power support during the initial stages of frequency regulation.
Virtual inertia control suppresses a high rate of change of frequency in the power system, while droop control mainly serves to reduce the power system frequency deviation, but both have their own drawbacks when acting individually. When the system frequency changes, if only the virtual inertia control is used, the output power involved in frequency regulation fluctuates greatly and the support stops after the system frequency drops to its lowest point, so it cannot continuously participate in frequency regulation. If droop control is used alone, the frequency response is slow, no support is provided at the beginning of the frequency change, and the effect of frequency regulation is poor when the frequency change is small.
In order to better realize the participation of the WT in primary frequency regulation and ensure system frequency security, existing research introduces a combined inertial control strategy consisting of both virtual inertia control and droop control. Introducing both the frequency deviation and the rate of change of frequency in the WT active control link, so that the rotor kinetic energy actively responds to the system frequency change, can further improve the frequency regulation capability of the WT. Its additional active reference value is expressed as follows:
$$\Delta P = -K_{df} \frac{df}{dt} - K_{pf} \Delta f \quad (6)$$
where $K_{df}$ is the virtual inertia control parameter and $K_{pf}$ is the droop control parameter. The values of $K_{df}$ and $K_{pf}$ can be adjusted to change $\Delta P$, which in turn affects the WT inertia support capacity and the speed recovery time after frequency regulation. The basic control block diagram of the combined inertial control is shown in Figure 4. The high-pass filter allows only the transient component of the frequency to pass, and the low-pass filter is used to avoid interference from noise during the measurement.
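The complementary roles of the two terms can be illustrated with a short sketch. It assumes the usual sign convention in which a falling frequency (negative deviation and negative rate of change) yields a positive additional power reference, that is, $\Delta P = -K_{df}\,df/dt - K_{pf}\,\Delta f$. The high-pass and low-pass filters of Figure 4 are omitted, and the gains and operating points below are hypothetical values chosen only to show the trend.

```python
def combined_inertial_delta_p(delta_f, dfdt, k_df, k_pf):
    """Additional active power reference of the combined inertial control.

    delta_f: frequency deviation (p.u.), negative when frequency is low
    dfdt:    rate of change of frequency (p.u./s)
    Returns (total, inertia_term, droop_term).
    """
    inertia_term = -k_df * dfdt   # virtual inertia contribution
    droop_term = -k_pf * delta_f  # droop contribution
    return inertia_term + droop_term, inertia_term, droop_term


# Hypothetical gains, for illustration only.
K_DF, K_PF = 10.0, 20.0

# Shortly after a load step: small deviation, large rate of change.
early = combined_inertial_delta_p(delta_f=-0.0005, dfdt=-0.01, k_df=K_DF, k_pf=K_PF)

# At the frequency nadir: large deviation, zero rate of change.
nadir = combined_inertial_delta_p(delta_f=-0.008, dfdt=0.0, k_df=K_DF, k_pf=K_PF)
```

In the early operating point the inertia term dominates the support, while at the nadir the inertia term vanishes and the droop term supplies all of it, which is the behavior the combined strategy is meant to exploit.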
Combined inertial control combines virtual inertia control and droop control, which can provide strong power support in both the initial phase of the system frequency response and the range near the frequency nadir. However, the setting of the control parameters in the application is complicated and subject to external influence. Therefore, the control parameters need to be tuned systematically so that the combination of the two controls achieves its full enhancement effect.

[Figure 4. Basic control block diagram of the combined inertial control, with high-pass filter and low-pass filter blocks.]

Combined Inertial Control Based on CAE and DNN
When a load disturbance occurs, the optimal combined inertial control parameters must be calculated online, because three variables are uncertain: the wind speed, the proportion of wind power, and the magnitude of the load disturbance. If the traditional time-domain simulation method is used to manually adjust the combined inertial control parameters based on experience, it takes a long time and may fail to find the optimal control parameters. Additionally, as these three variables change constantly, the corresponding optimal combined inertial control parameters often differ, which greatly increases the difficulty of frequency regulation. In order to reduce the time required for the online calculation while maintaining the accuracy of the control parameters, a combined inertial intelligent control for wind power frequency regulation based on CAE and DNN is presented. By learning the optimal control parameters for varying wind speed, wind power proportion, and load disturbance scenarios, the method allows the power system to swiftly and precisely determine the corresponding combined inertial control parameters when a load disturbance occurs.

Particle Swarm Optimization
Particle swarm optimization (PSO) was presented in 1995 and originated from the study of the foraging behavior of bird flocks. Its design is mainly inspired by the regularity of bird flock activities, from which a simplified model was established using swarm intelligence. PSO leverages the sharing of information between individuals in the group, so that the movement of the whole group evolves from disorder to order in the problem-solving space and progressively approaches an optimal solution. PSO has a distinct biosocial background that combines cognitive behavior and social behavior: in the process of seeking consistent cognition, individuals tend to remember their own beliefs while considering the beliefs of their peers, and when individuals perceive that the beliefs of their peers are better, they make adaptive adjustments [17].
In PSO, a candidate solution of the optimization problem is represented by the position of a particle, and the quality of each particle is determined by the fitness value of the objective function. Each particle also has a velocity that determines the direction and magnitude of its flight. Suppose that in a $d$-dimensional target search space, there are $m$ particles forming a population. The position of particle $i$ at the $t$th iteration is expressed as $X_i(t) = (x_{i1}(t), x_{i2}(t), \cdots, x_{id}(t))$, and its velocity as $V_i(t) = (v_{i1}(t), v_{i2}(t), \cdots, v_{id}(t))$. When initiating PSO, the positions and velocities of the $m$ particles are randomized, and an optimal solution is then reached through subsequent iterations. During each iteration, each particle updates its position and velocity by tracking two extremums: one is the optimal solution that the particle itself has found so far, called the individual extremum, expressed as $P_i(t) = (p_{i1}(t), p_{i2}(t), \cdots, p_{id}(t))$; the other is the optimal solution found so far by the entire swarm, called the global extremum, expressed as $P_g(t) = (p_{g1}(t), p_{g2}(t), \cdots, p_{gd}(t))$.
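Specifically, at the $(t + 1)$th iteration, particle $i$ updates its velocity and position according to the following rules:
$$v_{ik}(t + 1) = \omega v_{ik}(t) + c_1 \, rand() \, (p_{ik}(t) - x_{ik}(t)) + c_2 \, rand() \, (p_{gk}(t) - x_{ik}(t)) \quad (7)$$
$$x_{ik}(t + 1) = x_{ik}(t) + v_{ik}(t + 1) \quad (8)$$
where $\omega$ is the inertia weight; $c_1$ and $c_2$ represent the individual learning factor and the social learning factor, respectively; $rand()$ is a random number in $(0, 1)$; $i = 1, 2, \cdots, m$; $k = 1, 2, \cdots, d$.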
To further improve the efficiency of the PSO, this paper proposes an advanced version that dynamically adjusts the inertia weight. By utilizing an exponential function to nonlinearly adjust the inertia weight and incorporating a random adjustment number that follows a beta distribution, the algorithm achieves dynamic adjustment and exhibits excellent global convergence capability.
The specific improvement can be described as follows: an exponential function is used to control the change of the inertia weight $\omega$, where $e^{-t/t_{max}}$ decreases nonlinearly as the number of iterations increases. betarnd is a random number generator in MATLAB that generates random numbers following the beta distribution. Using this generator in the later stages of the iterations can increase the global search ability of the algorithm and reduce the likelihood of the algorithm getting trapped in local optima [18]. The improved expression for the inertia weight is: where $\sigma$ is the inertia adjustment factor, which is set to 0.1. $B(p, q)$ is the beta function, with parameters $p > 0$, $q > 0$. The first and second terms on the right side are adjusted by an exponential function to decrease nonlinearly as the number of iterations increases, which reduces the initial dominance of the inertia weight. The third term uses the beta distribution to adjust the overall distribution of $\omega$, and an inertia adjustment factor is added to control the deviation of the inertia weight, making the adjustment of $\omega$ more reasonable. On one hand, the improvement makes the inertia weight change nonlinearly and satisfies the condition of a nonlinearly decreasing inertia weight throughout the entire search process.
On the other hand, the introduction of the random adjustment strategy based on the beta distribution generates a random quantity to adjust $\omega$, which improves the global search ability in the early stage of the algorithm and the search accuracy in the later stage. The algorithm flow is as follows:
(1) Randomly initialize the position and velocity of the particle swarm.
(2) Calculate the fitness value for each particle.
(3) For each particle, compare its fitness value with the individual extremum and update the current individual extremum if it is better.
(4) For each particle, compare its fitness value with the global extremum and update the current global extremum if it is better.
(5) Update the position and flight velocity of each particle according to Formulas (7)-(10).
(6) If the pre-set stopping criterion (usually the maximum number of iterations) is not reached, return to step (2); otherwise, stop the calculation.
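The algorithm flow above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the implementation used in this paper. In particular, the inertia weight form $\omega = \omega_{min} + (\omega_{max} - \omega_{min}) e^{-t/t_{max}} + \sigma \cdot betarnd(p, q)$ is an assumed expression that merely matches the description in the text (exponential decay plus a beta-distributed random term scaled by $\sigma = 0.1$). The values of $\omega_{min}$, $\omega_{max}$, $p$, $q$, $c_1$, $c_2$, the swarm size, and the quadratic test objective are all hypothetical. Python's `random.betavariate` plays the role of MATLAB's betarnd.

```python
import math
import random


def inertia_weight(t, t_max, rng, w_min=0.4, w_max=0.9, sigma=0.1, p=1.0, q=3.0):
    """Assumed form: exponential decay plus a small beta-distributed random term."""
    return w_min + (w_max - w_min) * math.exp(-t / t_max) + sigma * rng.betavariate(p, q)


def pso(objective, bounds, n_particles=20, t_max=100, c1=1.5, c2=1.5, seed=0):
    """Minimize objective over box bounds; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Step (1): random initial positions and velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.1 * rng.uniform(-(hi - lo), hi - lo) for lo, hi in bounds]
           for _ in range(n_particles)]
    # Steps (2)-(4) for the initial swarm: fitness, individual and global extremums.
    p_best = [x[:] for x in pos]
    p_best_val = [objective(x) for x in pos]
    g_idx = min(range(n_particles), key=p_best_val.__getitem__)
    g_best, g_best_val = p_best[g_idx][:], p_best_val[g_idx]

    for t in range(1, t_max + 1):  # Step (6): stop after t_max iterations.
        w = inertia_weight(t, t_max, rng)
        for i in range(n_particles):
            # Step (5): velocity and position update, Formulas (7) and (8).
            for k in range(dim):
                vel[i][k] = (w * vel[i][k]
                             + c1 * rng.random() * (p_best[i][k] - pos[i][k])
                             + c2 * rng.random() * (g_best[k] - pos[i][k]))
                lo, hi = bounds[k]
                pos[i][k] = min(max(pos[i][k] + vel[i][k], lo), hi)  # keep in bounds
            # Steps (2)-(4): re-evaluate fitness and update the extremums.
            val = objective(pos[i])
            if val < p_best_val[i]:
                p_best[i], p_best_val[i] = pos[i][:], val
                if val < g_best_val:
                    g_best, g_best_val = pos[i][:], val
    return g_best, g_best_val


def toy_objective(params):
    """Hypothetical stand-in for the frequency regulation objective."""
    k_df, k_pf = params
    return (k_df - 12.0) ** 2 + (k_pf - 25.0) ** 2


best_params, best_value = pso(toy_objective, bounds=[(0.0, 50.0), (0.0, 50.0)])
```

In the setting of this paper the objective would be evaluated by a time-domain simulation of the frequency response for a given pair ($K_{df}$, $K_{pf}$); the quadratic function here only stands in for that evaluation.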

Contractive Autoencoder
From the whole combined inertial control frequency regulation procedure, it can be seen that the effect of frequency regulation depends mainly on the droop control parameter ($K_{pf}$) and the virtual inertia control parameter ($K_{df}$). In the same system, there always exists an optimal combination of control parameters when the wind speed, wind power proportion, and load disturbance are determined. The optimal combinations of droop and virtual inertia control parameters obtained for different values of these three variables make up multidimensional data sets. The effectiveness of the prediction part depends to some extent on the selection, pre-processing, and feature learning of the input data. In order to make the subsequent prediction part more effective, a deep feature learning method called the contractive autoencoder (CAE) is used in this paper, which not only reconstructs the input signal well, but is also invariant to perturbations of the input data to a certain degree.
The autoencoder (AE) network structure is shown in Figure 5 [19]. It consists of an encoder and a decoder: the input vector is mapped into a feature vector in the hidden layer by the encoder, and then the feature vector is reconstructed into the original input vector by the decoder [20].
The input sample set denoted by $X$ comprises $N$ groups of samples $x_1, x_2, \ldots, x_n$. The set of feature vectors in the hidden layer is represented by $H$, comprising $M$ groups of samples $h_1, h_2, \ldots, h_m$. The relationship between $X$ and $H$ can be described as follows:
$$H = s_f(WX + B) \quad (11)$$
where the weight matrix between the input layer and the hidden layer is denoted by $W$, while $B$ represents the bias matrix between these two layers; additionally, $s_f$ denotes the activation function used by the encoder's neurons, which is typically the sigmoid function:
$$s_f(z) = \frac{1}{1 + e^{-z}} \quad (12)$$
where $z$ represents the input vector.
The decoder operation is the inverse of the encoder operation: it reconstructs the original input from the feature vectors of the hidden layer. $Y$ represents the set of output vectors, which consists of $N$ groups with a dimension of $n$. Consequently, the decoding relationship between $H$ and $Y$ can be expressed as follows:
$$Y = s_g(W'H + B') \quad (13)$$
where the weight matrix between the hidden layer and the output layer is denoted by $W'$, while $B'$ represents the bias matrix between these two layers; $s_g$ represents the activation function of the decoder's neurons. AE accomplishes feature learning by minimizing the difference between the reconstructed output data and the original input data:
$$J_{AE} = \sum_{x \in X} L(x, y) \quad (14)$$
where $L(x, y)$ is the reconstruction error between an input sample $x$ and its reconstruction $y$. The gradient descent algorithm is applied to iteratively update the weights and biases of the network in order to minimize the reconstruction error:
$$W \leftarrow W - l \frac{\partial J_{AE}}{\partial W}, \qquad B \leftarrow B - l \frac{\partial J_{AE}}{\partial B} \quad (15)$$
where $l$ is the learning rate. However, the learning process of AE may only preserve the information of the original input data, which does not guarantee an effective representation of the feature information.
Appl. Sci. 2023, 13, 6984 9 of 16
CAE is based on the original autoencoder by adding a contraction regularization term to the loss function, forcing the encoder to learn a feature extraction function with stronger contraction, thus increasing the robustness when small perturbations occur around the training data sets.
Assume that the contraction regularization factor is $\lambda$ and that the Jacobian matrix of the hidden layer output with respect to the input samples is $J_f(x)$. The loss function of CAE is [21]:
$$J_{CAE} = \sum_{x \in X} \left( L(x, y) + \lambda \left\| J_f(x) \right\|_F^2 \right) \quad (16)$$
where $L(x, y)$ is the reconstruction error between an input sample $x$ and its reconstruction $y$, and $\left\| J_f(x) \right\|_F^2$ is the square of the Frobenius norm of the Jacobian matrix, which is expressed as follows:
$$\left\| J_f(x) \right\|_F^2 = \sum_{i,j} \left( \frac{\partial h_j(x)}{\partial x_i} \right)^2 \quad (17)$$
In terms of the loss function, CAE balances the reconstruction error with the contraction regularization term to extract the abstract features of the sample. The contraction regularization term makes the gradients of the functions learned by CAE smaller with respect to the input, while the reconstruction error forces CAE to retain the complete information. Under the combined effect of both, the gradients of the feature extraction functions with respect to the inputs are mostly small, and only a few are large. This way, when the input has small perturbations, the smaller gradients weaken these perturbations, thus improving the robustness of CAE to small input perturbations. Pre-training of CAE is a process of continuously learning data features, performing feature extraction, and also providing reasonable initial parameters for the deep neural network.
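The contractive loss can be made concrete with a small pure-Python sketch. It assumes a sigmoid encoder $h = s_f(Wx + b)$, a sigmoid decoder, and a squared-error reconstruction term, which are common choices rather than settings confirmed by this paper. For a sigmoid encoder the squared Frobenius norm of the Jacobian has the closed form $\sum_j (h_j(1 - h_j))^2 \sum_i W_{ji}^2$, and the sketch checks this against a finite-difference estimate. All weights, the sample, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(x, weights, biases):
    """Sigmoid layer: one output per weight row."""
    return [sigmoid(sum(w * x_i for w, x_i in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def contractive_penalty(x, W, b):
    """Closed-form ||J_f(x)||_F^2 for a sigmoid encoder."""
    h = layer(x, W, b)
    return sum((h_j * (1.0 - h_j)) ** 2 * sum(w * w for w in row)
               for h_j, row in zip(h, W))


def numeric_penalty(x, W, b, eps=1e-6):
    """Finite-difference estimate of the same quantity, used only as a check."""
    total = 0.0
    for i in range(len(x)):
        x_plus, x_minus = x[:], x[:]
        x_plus[i] += eps
        x_minus[i] -= eps
        h_plus, h_minus = layer(x_plus, W, b), layer(x_minus, W, b)
        total += sum(((hp - hm) / (2.0 * eps)) ** 2 for hp, hm in zip(h_plus, h_minus))
    return total


def cae_loss(x, W, b, W_dec, b_dec, lam):
    """Reconstruction error plus lam times the contraction regularization term."""
    h = layer(x, W, b)
    y = layer(h, W_dec, b_dec)
    reconstruction = sum((x_i - y_i) ** 2 for x_i, y_i in zip(x, y))
    return reconstruction + lam * contractive_penalty(x, W, b)


# Hypothetical 3-input, 2-feature example (normalized wind speed,
# wind power proportion, load disturbance).
W = [[0.5, -0.3, 0.8], [0.1, 0.4, -0.6]]
b = [0.0, 0.1]
W_dec = [[0.2, -0.1], [0.4, 0.3], [-0.5, 0.6]]
b_dec = [0.0, 0.0, 0.0]
x = [0.6, 0.2, 0.1]

penalty = contractive_penalty(x, W, b)
loss = cae_loss(x, W, b, W_dec, b_dec, lam=0.1)
```

Setting `lam` to zero recovers the plain autoencoder loss, so the regularization factor directly controls how strongly the encoder is pushed toward small input gradients.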

Deep Neural Network
Deep neural network (DNN) is a neural network with many hidden layers, also known as deep feedforward network (DFN) or multi-layer perceptron (MLP).
The neural network layers inside DNN can be divided into input, hidden, and output layers according to their position. Generally, the input layer is located at the beginning, the output layer is located at the end, and all intervening layers are referred to as hidden layers. In this network architecture, adjacent layers are fully connected [22]. The basic structure is shown in Figure 6.
DNN is a multilayer neural network that employs unsupervised learning for setting the initial weights of each layer, which serves to improve feature representation by mapping input features to a separate feature space through layer-wise feature mapping. With multiple nonlinear transformations, DNN is capable of effectively fitting complex functions. The core concept of DNN as a neural network can be summarized by the following three points: (1) Each layer of the network is pre-trained using unsupervised learning. (2) The output features of the previous layer are used as input for the next layer during unsupervised layer-wise training. (3) Supervised learning is employed to fine-tune all layers. Compared to traditional neural networks such as the BP neural network, DNN achieves faster training speeds and reduces the risk of overfitting by utilizing pre-training mechanisms.

Application and Control Procedure
Combined inertial intelligent control of wind power frequency regulation based on a deep neural network is divided into three parts: acquiring data, extracting features, and predicting parameters. Feature extraction and learning are applied to the original combined inertial control to improve and optimize the control strategy. PSO is first applied in simulation to obtain the optimal combined inertial control parameters, which are combined with the scene data to generate the data sets. Then, CAE is used to extract data features from the data sets. Finally, DNN is used to learn the features so that the optimal combined inertial control parameters are obtained quickly when making decisions online. The procedure comprises two components: offline training and online decision-making.
Offline training:
Step 1: Define the objective function and constraint conditions for PSO, and conduct time-domain simulation. Apply the PSO algorithm to obtain the optimal combination of combined inertial control parameters, and match it with the wind speed, wind power proportion, and load disturbance of the corresponding scenario to create the data sets.
Step 2: Pre-process the obtained data sets: CAE is used for feature training to extract the noise present in the original data and to improve the robustness of the DNN prediction model.
Step 3: Normalize the obtained data and split it into training and validation sets. Allocate 90% of the original data for the training set and the remaining 10% for the validation set.
Step 4: Conduct unsupervised greedy layer-wise pre-training. Utilize feature vectors obtained from training data through CAE training as input data, and apply the dropout technique to randomly deactivate some neurons, thereby reducing the connectivity among particular nodes, enhancing the generalization ability, and preventing overfitting. This process is iterated in succession and trained layer-wise to acquire the weight matrix W and bias vector B for the initialization of neural network.
Step 5: Perform supervised fine-tuning to refine the initial network parameters. Specifically, in this study, the input weight matrix, hidden layer feature vector, and output weight matrix of the network are optimized by using the Adamax optimization algorithm.
Step 6: Use the mean square error (MSE) as the evaluation metric of the model. If the MSE decreases, return to Step 5. Training of the network parameters ends when the MSE no longer decreases, and the trained model is saved.
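Steps 3 and 6 of the offline procedure can be sketched as follows. This is an illustrative outline under stated assumptions rather than the code used in this paper: min-max scaling is assumed as the normalization, the split is a random shuffle, and the MSE is assumed to be measured on the validation set. The function names and the callbacks `train_one_epoch` and `validation_mse` are hypothetical placeholders for the fine-tuning step and the evaluation step.

```python
import numpy as np

def min_max_normalize(data):
    """Scale each column to [0, 1]; return the scaled data and the scaling terms."""
    lo, hi = data.min(axis=0), data.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)      # avoid division by zero
    return (data - lo) / span, lo, span

def split_train_val(data, train_fraction=0.9, seed=0):
    """Shuffle the rows and split them 90% / 10% by default."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(data))
    n_train = int(round(train_fraction * len(data)))
    return data[idx[:n_train]], data[idx[n_train:]]

def train_until_mse_stalls(train_one_epoch, validation_mse, max_epochs=1000):
    """Repeat fine-tuning (Step 5) while the MSE keeps decreasing (Step 6).

    Returns the best MSE seen and the number of epochs run. A practical
    implementation would also checkpoint the model at the best MSE.
    """
    best = validation_mse()
    for epoch in range(1, max_epochs + 1):
        train_one_epoch()
        mse = validation_mse()
        if mse >= best:                         # no further decrease: stop
            return best, epoch
        best = mse
    return best, max_epochs
```

With 4200 samples, `split_train_val` yields 3780 training rows and 420 validation rows.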
Online decision-making: After the offline training is completed, when a load disturbance event occurs, the scene data can be matched quickly and the trained network model used to obtain the optimal combined inertial control parameters, achieving the best combined inertial control and rapid stabilization of the frequency drop. The specific steps of online decision-making are as follows.
Step 1: Collect the necessary system data for the specific scenario, including the above three variables, and create online data sets.
Step 2: Normalize the acquired online data sets.
Step 3: Feed the normalized data into the DNN model that was saved after offline training. Through this step, the optimal combined inertial control parameter combination will be swiftly and precisely acquired, forming an entire control strategy.
Step 4: The entire frequency regulation strategy is input into the power grid system for online decision-making, achieving efficient and steady control of power system frequency.
The specific procedure is shown in Figure 7.
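The online steps above amount to a normalize, predict, and rescale pipeline. The sketch below shows one possible arrangement under explicit assumptions: `model` is any callable that maps a batch of normalized scenario vectors to normalized outputs, the network outputs are assumed to be min-max scaled in the same way as the inputs, and `predict_control_parameters` together with its scaling arguments are hypothetical names that do not appear in the paper.

```python
import numpy as np

def predict_control_parameters(model, scenario, x_min, x_span, y_min, y_span):
    """Return (K_pf, K_df) for one scenario.

    scenario: (wind speed, wind power proportion, load disturbance).
    x_min, x_span, y_min, y_span: scaling terms saved during offline training
    (an assumption of this sketch).
    """
    x = (np.asarray(scenario, dtype=float) - x_min) / x_span   # Step 2: normalize
    y_norm = model(x[np.newaxis, :])[0]                        # Step 3: DNN prediction
    k_pf, k_df = y_norm * y_span + y_min                       # undo output scaling
    return float(k_pf), float(k_df)
```

Because only one forward pass is required, the cost of this step is independent of the time-domain simulation used offline.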

Case Study and Analysis
The IEEE 9-bus system is used as the test system, with the WT model connected to line L3 of the system, as depicted in Figure 8. All studies in this paper were obtained by simulating the operation of the WT below rated power under different conditions. To ensure the method's applicability in diverse scenarios, the impact of different wind speeds, wind power proportions, and load disturbances on the power system's frequency response is considered. Wind speed is varied from 4 m/s to 10 m/s in increments of 1 m/s, for a total of seven cases. Wind power proportion ranges from 5% to 60% in increments of 5%, for a total of 12 cases. Load disturbance ranges from 1.005 to 1.25 in increments of 0.005, for a total of 50 cases. Combining the three variables gives 7 × 12 × 50 = 4200 cases in total. Furthermore, the performance of the proposed method is compared with that of other frequency regulation control techniques in typical scenarios.
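The scenario grid can be enumerated directly to confirm the case count. The snippet below is a small sketch; it assumes a load disturbance step of 0.005, which is the spacing that yields 50 values between 1.005 and 1.25, and the variable names are illustrative.

```python
from itertools import product

wind_speeds = [4 + i for i in range(7)]                          # 4, 5, ..., 10 m/s
wind_shares = [5 * (i + 1) for i in range(12)]                   # 5, 10, ..., 60 %
load_levels = [round(1.005 + 0.005 * i, 3) for i in range(50)]   # 1.005, ..., 1.25

# Every combination of the three scenario variables.
scenarios = list(product(wind_speeds, wind_shares, load_levels))
print(len(scenarios))
```

Each tuple in `scenarios` corresponds to one operating condition for which an optimal parameter pair has to be found.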

Figure 8. WT in IEEE 9-bus system.

Data Construction
Considering the impact of the above three variables on the frequency regulation performance of wind power, PSO is utilized in time-domain simulations to obtain the optimal control parameters for the combined inertial control under different scenarios. This approach aims to raise the minimum system frequency, reduce the absolute value of the maximum frequency deviation, and mitigate the secondary frequency drop to achieve the best frequency regulation performance. The resulting optimal parameters are then combined with scenario data to generate comprehensive data sets. As a first step, the algorithm's objective function and constraint are established. The minimum system frequency $f(t)_{\min}$ is designated as the optimization objective. The constraint is the system frequency constraint $f(t)_{\min} \geq 49.8$ Hz and $f(t)_{\max} \leq 50.2$ Hz, that is, a frequency deviation of at most 0.2 Hz in absolute value. The optimal control parameter combination (droop control parameter $K_{pf}$ and virtual inertia control parameter $K_{df}$) is obtained through time-domain simulations by using PSO, and matched with these three variables in each scene to produce the corresponding data sets.
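The optimization described above can be outlined with a basic global-best PSO. The following is a sketch under stated assumptions, not the configuration used in this paper: `toy_frequency_response` is a made-up stand-in for the time-domain simulation (it is not a power-system model), the frequency constraint is handled with a penalty term, and the parameter bounds, swarm size, and PSO coefficients are illustrative values.

```python
import numpy as np

def toy_frequency_response(k_pf, k_df):
    """Hypothetical surrogate for the simulation; returns (f_min, f_max) in Hz.

    The nadir is highest at k_pf = 20, k_df = 10 purely by construction.
    """
    f_min = 49.9 - 1e-4 * ((k_pf - 20.0) ** 2 + (k_df - 10.0) ** 2)
    return f_min, 50.05

def fitness(params, simulate):
    """Frequency nadir to be maximized, penalized outside 49.8 to 50.2 Hz."""
    f_min, f_max = simulate(*params)
    penalty = 100.0 * max(0.0, 49.8 - f_min) + 100.0 * max(0.0, f_max - 50.2)
    return f_min - penalty

def pso(simulate, bounds, n_particles=20, n_iter=60, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO over box bounds; returns (best parameters, best fitness)."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    pos = rng.uniform(lo, hi, size=(n_particles, len(lo)))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p, simulate) for p in pos])
    gbest = pbest[np.argmax(pbest_fit)].copy()
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Inertia term plus attraction to personal and global best positions.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        fit = np.array([fitness(p, simulate) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        gbest = pbest[np.argmax(pbest_fit)].copy()
    return gbest, float(pbest_fit.max())

# Illustrative bounds for (K_pf, K_df); these ranges are assumptions.
best_params, best_fit = pso(toy_frequency_response, bounds=[(0.0, 50.0), (0.0, 30.0)])
```

In the actual workflow each fitness evaluation is one time-domain simulation, which is why running PSO for every scenario is expensive and motivates the learned predictor.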

Feature Learning and DNN Parameters Setting
Based on the data sets obtained from the time-domain simulation using PSO, CAE feature training is performed to extract the noise from the original data and improve the robustness of the DNN prediction model. Once the data is normalized, it is divided into training and validation sets: 90% of the original data is assigned to the training set, while the remaining 10% is used for validation.
The number of hidden layers and the number of neurons per hidden layer in DNN affect the accuracy of the model, as well as the time required for offline training and for online decision-making on the optimal combined inertial control parameters. In this paper, the network is sized incrementally, layer by layer. The optimal number of neurons in each hidden layer is determined experimentally and kept constant, and then the number of hidden layers is gradually increased until the MSE of the predicted optimal combined inertial control parameters no longer decreases, at which point the search is stopped.
Following this strategy, the optimal configuration of the deep neural network for wind power frequency regulation combined inertial control in this test system is determined to be three hidden layers, with 6, 12, and 6 neurons in the respective layers.
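To make the selected architecture concrete, the sketch below builds a fully connected network with hidden layers of 6, 12, and 6 neurons and runs a forward pass. Several details are assumptions made for illustration only: three inputs (wind speed, wind power proportion, load disturbance), two outputs ($K_{pf}$ and $K_{df}$), tanh hidden activations, a linear output layer, and random initial weights in place of the CAE-based pre-training.

```python
import numpy as np

# Inputs, three hidden layers (6, 12, 6), outputs; input and output sizes are assumed.
LAYER_SIZES = [3, 6, 12, 6, 2]

def init_params(layer_sizes, seed=0):
    """Random weights and zero biases for each pair of adjacent layers."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(params, x):
    """Forward pass: tanh on hidden layers, linear output layer."""
    a = np.asarray(x, dtype=float)
    for W, b in params[:-1]:
        a = np.tanh(a @ W + b)
    W, b = params[-1]
    return a @ W + b

def count_parameters(params):
    """Total number of trainable weights and biases."""
    return sum(W.size + b.size for W, b in params)

params = init_params(LAYER_SIZES)
outputs = forward(params, np.zeros((5, 3)))   # batch of 5 normalized scenarios
```

Under these assumptions the network has only 200 trainable parameters, which is consistent with the very short online evaluation time reported for the method.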

Analysis and Comparison of Results
Using the number of hidden layers and neurons determined in Section 5.2, the trained DNN is obtained through deep learning on the training set. Table 1 compares the time required for online decision-making on the test samples and the resulting MSE for the proposed method and for time-domain simulation. To acquire the optimal combined inertial control parameter combination under 4200 conditions, the time-domain simulation would require at least 4200 simulation calculations, each taking about 120 s. This would result in a total simulation time of approximately 140 h. In contrast, using the presented method to regulate the frequency under 4200 conditions results in an MSE of 0.00117 and a total time of only 0.31 s, significantly reducing the time required and improving efficiency. Figure 9 displays the predicted results of the optimal combined inertial control parameter combination.
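The timing comparison can be checked with a few lines of arithmetic. The snippet uses only the figures quoted above; the variable names are illustrative, and the resulting ratio covers online decision time only, not the offline cost of generating data and training the network.

```python
n_cases = 4200
seconds_per_simulation = 120      # approximate time of one time-domain simulation
dnn_total_seconds = 0.31          # reported total online time for all cases

simulation_total_seconds = n_cases * seconds_per_simulation
simulation_total_hours = simulation_total_seconds / 3600
speedup = simulation_total_seconds / dnn_total_seconds

print(simulation_total_hours)     # total simulation time in hours
print(round(speedup))             # ratio of simulation time to DNN time
```

The total of 504,000 s corresponds to the 140 h stated in the text, and the ratio to 0.31 s is on the order of one million.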
Figure 9. Combined inertial control optimal parameter prediction results. (a) $K_{pf}$; (b) $K_{df}$.
As depicted in Figure 9, as the load disturbance increases with wind speed and wind power proportion held constant, the optimal droop control parameter $K_{pf}$ gradually increases and the optimal virtual inertia control parameter $K_{df}$ gradually decreases. With fixed wind speed and load disturbance in the scene, both $K_{pf}$ and $K_{df}$ decrease as the wind power proportion increases. With increasing wind speed, when the wind power proportion and load disturbance in the scene are fixed, both $K_{pf}$ and $K_{df}$ decrease relatively slowly. In summary, a larger load disturbance leads to a larger power deficit in the system, which lowers the minimum frequency of the system. In this case, a larger $K_{pf}$ is needed to reduce the maximum frequency deviation. However, if $K_{pf}$ is too large, it will lead to a more serious secondary frequency drop. For this reason, there is a corresponding reduction in $K_{df}$. The higher the wind power proportion, the lower the inertia in the system, and the greater the unbalanced power in the system when the WT exits frequency regulation. If the conventional thermal power generator cannot provide enough frequency regulation power, the secondary frequency drop is more serious. Reducing the combined inertial control parameters reduces the rotor kinetic energy released when the WT is involved in frequency regulation, thus mitigating the secondary frequency drop. An increase in wind speed leads to higher real-time power generation by the WT, but during the short-term overshoot phase, using the same combination of parameters for combined inertial control results in higher energy consumption and a lower secondary frequency drop point. The optimal combined inertial control parameters should therefore decrease slowly in order to achieve improved frequency regulation.
Furthermore, the prediction models exhibit high accuracy, exceptional generalization capability, and strong adaptability across different scenes.
Online decision-making is carried out using the methods described above in a system scenario where the wind speed is 6 m/s, the wind power proportion is 30%, and the load disturbance is 0.05 p.u., starting at 40 s. In this scenario, the performance of combined inertial control with BP training parameters is compared with frequency control without wind power participation. The results of frequency control are presented in Figure 10a.
The results depicted in Figure 10a demonstrate that although the combined inertial control with BP training parameters can achieve a frequency control effect similar to that of the combined inertial control with optimal parameters, the latter responds more quickly than the former. It is evident from the results that obtaining the optimal control parameters is crucial for achieving optimal frequency regulation using combined inertial control. The rapid and accurate determination of these parameters is necessary to fully utilize the frequency regulation capability of the combined inertial control and achieve optimal control effects under its control strategy.
Figure 10b shows that as the amount of load disturbance increases, the nadir of the system frequency keeps decreasing. Although the method presented in this paper has tried to minimize the maximum deviation of frequency as much as possible, the test has shown that it is not enough to rely on the WT primary frequency regulation alone. It is still necessary to cooperate with other types of turbines as much as possible to achieve the optimal control of frequency.

Discussion and Prospects
To comply with the new primary frequency security standards for the power system, this paper proposes a combined inertial intelligent control strategy of wind turbines based on CAE and DNN. This method can effectively extract the key features of the optimal frequency regulation parameters for fast online decision-making on those parameters. It offers fast computation, good prediction accuracy, and generalization capability. This method can provide a theoretical reference for new energy to participate in grid frequency control.
The effect of combined inertial control mainly depends on the control parameters, including $K_{pf}$ and $K_{df}$. The above case study shows that different values of the control parameters lead to different frequency control effects. The optimal parameters vary with different factors and scenes, and obtaining them requires a great amount of time and calculation. The method presented in this paper offers significant advantages over traditional time-domain simulations in terms of both online decision-making time for frequency control and accuracy in achieving optimal combined inertial control for frequency regulation.
In practical production, different application scenarios need to be faced, and these scenarios can be roughly represented by wind speed, wind power proportion, and the load disturbance level, which the wind turbine needs to cope with in case of emergencies. The method proposed in this paper can effectively take into account the impact of varying wind speed, wind power proportion, and load disturbance on the frequency response characteristics of the power system, thereby meeting the requirements of different scenarios and demonstrating its high versatility. The simulation results validate the efficacy of the proposed method across a wide range of scenarios.
Shallow BP neural networks primarily rely on manual feature selection and design to accomplish feature learning. For simple classification tasks, shallow BP neural networks can obtain effective features through preprocessing and feature selection of data. However, for more complex tasks, manual feature design becomes very difficult, leading to the traditional shallow BP neural networks being unable to fully utilize their maximum feature learning capabilities. The presented method adds CAE-based feature learning before network training, which can learn effective data features, explore the implicit information of data, facilitate better fitting of DNN, and improve the speed of obtaining network parameters. According to the comparison frequency curves of different parameters, the combined inertial control by the presented method in this paper achieves the best frequency regulation effect.
Traditional time-domain simulation is a time-based simulation method, which has the advantages of high accuracy, high reliability, and high realism. However, in situations where real-time response is required, such as the power system applications discussed in this paper, it is too slow to meet real-time requirements. This paper provides a new fast technical method for the online frequency control of wind power after load disturbance events. Compared with other combined inertial control methods, this method is more time-saving and generalizes better to the different and complex operating conditions of new power systems.
This paper proposes an improved combined inertial intelligent control of wind turbines for temporary frequency support based on CAE and DNN. There are many other methods of WT rotor kinetic energy control, such as stepwise inertial control (SIC) [23][24][25]. In further research, other frequency control methods will be studied to make frequency control more efficient. In addition, there are many other network structures in the field of deep learning, such as the recurrent neural network (RNN). In subsequent research, other deep learning algorithms can be tried to improve the accuracy of the algorithm. Moreover, some factors are not considered in the current version of this method, such as turbulent wind speed and random load disturbance. This may limit the feasibility and adaptability of the method. In subsequent research, these factors can be taken into account to cover a wider range of application scenarios.