1. Introduction
Joint control strategies for wind storage systems play a crucial role in enhancing the competitiveness and regulation capability of wind power in high-penetration markets [1,2]. By regulating the active output of wind power to track the dispatch schedule value, energy storage systems give wind farms dispatchability comparable to that of conventional units [3,4,5]. Given the cost and capacity constraints of energy storage, it is essential to optimize the wind storage control strategy to improve output accuracy and system economic efficiency.
Much research has been conducted on joint control strategies for wind storage systems, which fall into two main categories. The first is the direct control strategy, in which real-time section control is carried out according to the deviation between the current actual output of the wind farm and the planned value; it mainly includes mode decomposition [6], proportional–integral–derivative (PID) control [7], fuzzy control [8], Fourier transform [9], and so on. The direct control strategy responds quickly to errors and is highly practical in engineering. However, when wind power fluctuates strongly, it easily leads to over-charging and over-discharging of the energy storage system and to frequent regulation, which degrades the final control effect and economy. The second category is the process optimization control strategy. Targeting the wind power output error in the control period, this kind of strategy establishes a mathematical model with the number of energy storage commands and the throughput as the objective function and solves it with an optimizer. It continuously corrects the energy storage output during the control period through rolling calculations [10,11], which effectively reduces the number of energy storage actions and gives high control accuracy. However, the solver may fail to converge in complex environments, which impairs the control effect. Falling back to proportional–integral (PI) control when the solver does not converge improves control stability to a certain extent. Therefore, the controlled data produced by these two methods can be used as samples from which deep learning methods learn the strengths of both, enhancing the stability and overall performance of control.
Firstly, wind power is highly stochastic and volatile [3]. Deep learning handles uncertainty well and has performed well in research on power system state estimation [12], transient stability [13], fault detection [14], and automatic generation control (AGC) [15]. In wind power generation research, deep reinforcement learning extracts uncertain relationships between inputs and outputs from massive, high-dimensional wind farm data, which significantly reduces the omission of crucial information and improves the coordination and economy of wind storage system scheduling and grid connection [16,17]. Deep learning can also cope well with wind power uncertainty [18]. These studies show that deep learning is applicable and advantageous for research on combined wind storage control strategies.
Secondly, wind power is time-series data with temporal dependence. Among the many neural network algorithms, both the long short-term memory network (LSTM) and the gated recurrent unit (GRU) handle time-series data well and are widely used in wind power prediction [19], wind speed prediction [20], and short-term load prediction [21]. In wind power prediction, the two perform equivalently, but the GRU has a more concise structure than the LSTM network and trains faster [21]. In addition, the bidirectional gated recurrent unit (BiGRU) processes time-series data in both directions and fully extracts the connections between earlier and later points in the series, which has been shown to improve model accuracy compared with GRU [22,23]. Therefore, this paper uses BiGRU to extract the uncertain relationship between wind power and energy storage action and to realize the prediction and classification of energy storage actions.
The attention mechanism in neural networks focuses on crucial information, reduces attention to other information, and can even filter out irrelevant information, thus easing information overload and improving the efficiency and accuracy of task processing [24,25]. In power systems, it is mainly used to guide the model to focus on essential features and thereby improve its performance [26]. The temporal pattern attention mechanism (TPA), which is more sensitive to temporal features [27], can improve model prediction accuracy when used in wind power prediction [28]. Therefore, this paper uses TPA to extract important temporal information from the BiGRU hidden layer to improve model performance.
Motivated by the above, this paper proposes a joint control strategy for the wind storage system based on TPA-BiGRU. The contributions are summarized as follows.
The wind storage system is controlled by adopting the advanced rolling optimization control strategy (AROCS) and PI control strategy, and a dataset with excellent wind storage control characteristics is constructed.
An evaluation standard for the output deviation of wind power active power is proposed. This standard takes the assessment requirements of China Southern Power Grid for wind power integration as an example, which can be consistent with the grid-connected requirements of conventional units.
A wind storage joint control strategy based on the TPA-BiGRU algorithm is proposed, which solves the problem of non-convergence of mathematical modeling methods. It can obtain the storage control quantity in real-time and dynamically, which improves the stability and accuracy of the wind storage system under the premise of ensuring economic benefits.
The rest of this paper is organized as follows:
Section 2 describes the structure of the proposed model and the evaluation criteria.
Section 3 presents the basic theory and design of the model.
Section 4 shows the model’s training process. In Section 5 and Section 6, the effectiveness of the control strategy proposed in this paper is verified by experimental comparison, and conclusions are drawn.
2. Structure and Control Criteria
The structure of the joint control strategy of wind storage systems is shown in Figure 1. First, the regulation power is calculated from the actual output value of the wind turbines, the planned output value, and the energy storage state of charge (SOC). Then, a dead zone is set to judge whether the energy storage acts. The dead zone is a predefined threshold range (e.g., ±5% of the nominal power) within which the energy storage system remains inactive; the storage system is activated only when the control deviation (e.g., power imbalance) exceeds the thresholds. The exact threshold can be adjusted according to grid requirements or optimization objectives to balance response frequency against equipment lifespan. When energy storage output is judged to be needed, the system outputs the regulation power of the energy storage.
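The dead-zone judgment described above can be sketched in a few lines. This is only an illustration under stated assumptions: the dead zone is taken as a symmetric band of ±5% of nominal power (the example value quoted above), positive regulation power is taken to mean discharging, and the storage is assumed to compensate the full deviation without SOC or power limits. The function name `storage_action` and its parameters are hypothetical and do not come from the paper.

```python
def storage_action(p_actual_mw, p_plan_mw, p_nominal_mw, dead_zone=0.05):
    """Return (state, regulation_mw) for one sampling point.

    state: 1 = discharge, -1 = charge, 0 = inaction.
    Assumption: the dead zone is +/- dead_zone * nominal power.
    """
    deviation = p_plan_mw - p_actual_mw      # > 0 means wind output is below plan
    threshold = dead_zone * p_nominal_mw
    if abs(deviation) <= threshold:
        return 0, 0.0                        # inside the dead zone: storage stays idle
    state = 1 if deviation > 0 else -1
    # Assumption: full compensation of the deviation, ignoring SOC and power limits.
    return state, deviation
```

A wider dead zone reduces the number of storage actions at the cost of larger tolerated deviations, which is the trade-off between response frequency and equipment lifespan mentioned above.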
To ensure the safe and efficient integration of large-scale wind power into grid operation, it is stipulated that the active power regulation capacity of grid-connected wind power should meet the requirements of the grid management of new energy field stations. Moreover, to improve the competitiveness of wind power, it is necessary to raise the assessment requirements of the wind power generation program to the same level as conventional units. Taking China Southern Power Grid as an example, it is stipulated that the assessment is carried out every 15 min, and the power deviation rate of conventional grid-connected units should not exceed ±2.5% [29]. The specific requirements are shown in Table 1.
Table 1 shows that the China Southern Power Grid has made high demands on the regulation accuracy and response speed of wind farms’ active output in addition to the need for wind farms to have substantial voltage and frequency adaptability. Accordingly, this paper proposes an evaluation criterion for wind power active power output deviation as follows:
The average value of the control deviation per minute in each assessment period is used as the evaluation index. It is defined from the starting capacity of the wind farm, the planned and actual values of the wind farm at each sampling point, and the number of sampling points.
The standard is one assessment point per minute and 1440 assessment points throughout the day. The qualified standard is .
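One plausible reading of this index can be sketched as follows. The sketch assumes that the deviation at each sampling point is taken in absolute value, normalized by the starting capacity, and averaged over the sampling points of one assessment point; the exact form and the qualifying threshold used in the paper are not reproduced here. The function name `mean_deviation_rate` is hypothetical.

```python
def mean_deviation_rate(planned_mw, actual_mw, capacity_mw):
    """Average control deviation over one assessment window, in percent.

    Assumption: absolute deviations are averaged and divided by the
    starting capacity of the wind farm.
    """
    if len(planned_mw) != len(actual_mw) or not planned_mw:
        raise ValueError("need two equal-length, non-empty sequences")
    total = sum(abs(p - a) for p, a in zip(planned_mw, actual_mw))
    return 100.0 * total / (len(planned_mw) * capacity_mw)
```

With a 2 s sampling period, one per-minute assessment point would average 30 sampling points.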
3. Designing Control Strategy and Model
3.1. Joint Control Strategy of Wind and Storage Based on TPA-BiGRU
The control process of the wind storage joint control strategy based on advanced rolling optimization control and PI control is divided into two parts. The first part is to determine the type of energy storage action. Both control strategies input the relevant data of wind farms and energy storage and judge the operation of energy storage by calculation. Only when the current control deviation is outside the dead zone does the energy storage need to be charged and discharged. Otherwise, energy storage will not act. The second part calculates the specific value of the charge and discharge of the energy storage. If it has been judged that the energy storage output needs to be adjusted, the adjustment power of the energy storage is determined by using the data of the first part.
In this paper, the proposed wind storage joint control strategy uses three trained TPA-BiGRU models to realize the judgment and calculation functions in the above links to obtain a new wind storage joint control strategy. The function of the first link is to distinguish the action state of energy storage, that is, charging, discharging, and inaction, which can be realized by using a three-classification model. The function of the second link is to calculate the specific value of the charge and discharge power of the energy storage, which can be realized by using two regression calculation models.
Firstly, the energy storage action state is judged by the classification network. A control calculation is performed on each sampling point, and the relevant data at the judgment time are composed of the original data according to the input requirements and standardized. The average and variance of the variables are calculated when training the network. The processed data are input into the classification network to obtain the energy storage control state after the one-hot encoding with a length of 3. After decoding, three control states of energy storage are finally obtained, namely, −1 (charging), 0 (inaction), and 1 (discharging). Then, the regression network calculates the regulated power value under the energy storage’s charge and discharge state. When the energy storage state is in action, the input samples are fed again into the charging or discharging power regression network for calculation, and the network output is anti-standardized to obtain the energy storage charging or discharging power value.
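The two-stage calculation above (classify the action, then regress its power) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `classifier` and the entries of `regressors` are hypothetical callables standing in for the three trained TPA-BiGRU networks, and the class order (charge, inaction, discharge) is assumed to match the one-hot coding used for the labels.

```python
def standardize(window, mean, std):
    """Z-score each feature column using statistics saved at training time."""
    return [[(v - m) / s for v, m, s in zip(row, mean, std)] for row in window]

def control_step(window, classifier, regressors, mean, std, out_stats):
    """One control calculation: classify the action, then regress its power.

    classifier(x) -> three class scores ordered (charge, inaction, discharge)
    regressors    -> {-1: charging model, 1: discharging model}
    out_stats     -> {-1: (mean, std), 1: (mean, std)} of the regression labels
    All callables are hypothetical stand-ins for the trained networks.
    """
    x = standardize(window, mean, std)
    scores = classifier(x)
    state = (-1, 0, 1)[scores.index(max(scores))]   # decode the length-3 output
    if state == 0:
        return 0, 0.0                               # inaction: no regression needed
    mu, sigma = out_stats[state]
    power = regressors[state](x) * sigma + mu       # undo the label standardization
    return state, power
```

Keeping the regression networks separate for charging and discharging means each only has to fit one sign of the power range, at the cost of a wrong classification routing the sample to the wrong regressor.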
3.2. Joint Control Model of Wind and Storage Based on TPA-BiGRU
Firstly, the proposed model uses the deep BiGRU network to process wind power and energy storage data. Then, the temporal relationship of each feature quantity is extracted and processed for long-term memory. Moreover, it utilizes the TPA mechanism to strengthen the model memory function, highlighting the importance of the local information and ensuring the model’s accuracy and stability. The control model proposed in this paper is shown in Figure 2.
3.2.1. Bidirectional Gated Recurrent Unit
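GRU replaces the forget gate and input gate of LSTM with a single update gate, so the GRU model has fewer parameters and a simpler structure, while the performance of the two is comparable. The specific calculations are as follows:
$$r_t = \sigma\left(W_r \left[h_{t-1}, x_t\right]\right)$$
$$z_t = \sigma\left(W_z \left[h_{t-1}, x_t\right]\right)$$
$$\tilde{h}_t = \tanh\left(W_{\tilde{h}} \left[r_t \odot h_{t-1}, x_t\right]\right)$$
$$h_t = \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \tilde{h}_t$$
where $\sigma$ represents the sigmoid activation function; $x_t$ is the current wind farm data; $h_{t-1}$ is the hidden-layer output at the previous time step; $r_t$ and $z_t$ are the reset gate and the update gate, respectively; $\tilde{h}_t$ is the candidate hidden state; $W_r$, $W_z$, and $W_{\tilde{h}}$ are the network parameter matrices; $[\,\cdot\,,\cdot\,]$ denotes vector splicing; and $\odot$ denotes element-wise multiplication.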
To better obtain the mapping relationship between the input data and the energy storage regulation, this paper chooses the BiGRU network to capture the relationship between the two in a bidirectional time series. It can be seen from Figure 2 that the output of the BiGRU model contains both historical and future information of the input data, which can avoid the lack of information and improve the prediction accuracy when dealing with rolling optimized data for wind power and energy storage.
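A minimal NumPy sketch of a GRU step and a bidirectional pass may help make the data flow concrete. It assumes one common GRU formulation without bias terms and random, untrained weights; the function names, the argument layout, and the choice to concatenate the forward and backward hidden states are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, W_r, W_z, W_h):
    """One GRU update; biases are omitted for brevity (assumption)."""
    hx = np.concatenate([h_prev, x_t])                 # splice [h_{t-1}, x_t]
    r = sigmoid(W_r @ hx)                              # reset gate
    z = sigmoid(W_z @ hx)                              # update gate
    h_cand = np.tanh(W_h @ np.concatenate([r * h_prev, x_t]))  # candidate state
    return (1.0 - z) * h_prev + z * h_cand

def bigru(xs, params_fwd, params_bwd, hidden):
    """Run a forward and a backward GRU over xs (T x features) and
    concatenate their hidden states per time step -> (T, 2 * hidden)."""
    def scan(seq, params):
        h, out = np.zeros(hidden), []
        for x_t in seq:
            h = gru_step(x_t, h, *params)
            out.append(h)
        return np.stack(out)
    fwd = scan(xs, params_fwd)
    bwd = scan(xs[::-1], params_bwd)[::-1]             # re-align to forward time order
    return np.concatenate([fwd, bwd], axis=1)
```

Each row of the result combines a state that has seen the window up to that step with one that has seen it from that step onward, which is the sense in which the BiGRU output carries both historical and future information within the window.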
3.2.2. Temporal Pattern Attention Mechanism
This paper introduces TPA to strengthen the model’s memory of long-term time series information while reinforcing the key features of local short-term information, highlighting the key factors affecting the energy storage output and improving the model’s prediction effect. The designed TPA follows [26] and operates on the hidden state matrix of the original wind storage time series, which is produced by the BiGRU and contains the information of multiple moments within the time window. Convolution kernels of length T, generally taken as the window length, are applied to this matrix, and the resulting convolution values, indexed by row and column, form the temporal pattern matrix. The hidden state information extracted by the neural network from the input wind storage data feature matrix at the current moment is combined with the temporal pattern matrix through weight parameter matrices to give the attention weight vector, which characterizes the importance of the hidden state information at each moment in the state matrix. Weighting with this vector yields the feature vector that characterizes the temporal relationship, and splicing this feature vector with the current-moment state information gives the final output of TPA.
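The attention computation can be sketched in NumPy as follows. The sketch follows the commonly used formulation of temporal pattern attention (row-wise convolution over the window, a bilinear score against the current hidden state, sigmoid weights, and a weighted sum), with the final step written as a projection of the spliced vector. All array names, shapes, and the random weights in the usage below are assumptions for illustration; the paper's own equations may differ in detail.

```python
import numpy as np

def tpa(H, h_t, C, W_a, W_h):
    """Temporal pattern attention over a window of hidden states.

    H   : (m, w)     hidden states, one row per hidden unit, one column per step
    h_t : (m,)       hidden state at the current moment
    C   : (k, w)     k convolution kernels spanning the whole window (T = w)
    W_a : (k, m)     scoring weights
    W_h : (n, m + k) projection applied to the spliced vector [h_t, v_t]
    """
    HC = H @ C.T                              # (m, k) temporal pattern matrix
    scores = HC @ (W_a @ h_t)                 # (m,) one score per row of HC
    alpha = 1.0 / (1.0 + np.exp(-scores))     # sigmoid weights (not a softmax)
    v_t = HC.T @ alpha                        # (k,) weighted sum of the rows of HC
    return W_h @ np.concatenate([h_t, v_t]), alpha
```

Because the weights come from a sigmoid rather than a softmax, several rows can receive high weight at once, so more than one temporal pattern can contribute to the output.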
4. Training Process of the Model
The model proposed in this paper mainly comprises a deep BiGRU neural network and TPA. As a neural network with supervised learning, BiGRU first needs to obtain the energy storage action data after optimal control and PI control as the dataset label. Since the optimal control has superior performance, it is chosen for most of the examination periods, and the PI control is used when the optimization algorithm does not converge. The control data obtained implies the excellent characteristics of the two typical control strategies.
4.1. Selection of Input and Output Variables for the Network
When using optimal control and PI control to generate datasets, four inputs are used as follows: , , , and energy storage output value in the past period. The optimal control takes the minimum penalty power as the primary goal and the minimum battery throughput as the secondary goal, which the solver solves to obtain the energy storage output value for the period. The first three inputs belong to the characteristic variables reacting to the power state of the wind farm at the sampling moment, and the fourth input belongs to the characteristic variables reacting to the power status of the energy storage in the past period. When selecting the input variables of the neural network, to reflect the actual operating state of the wind storage combined system at the sampling time as much as possible, this paper selects the input variables of the dataset as follows: , , , and the in the past period.
The input variables of the three TPA-BiGRU network models are the same, but the selected output variables are different due to the different functions of different networks. The output variables of the classification network are discrete variables of the energy storage action state, which are divided into three categories: −1, 0, and 1 for charging, inactive, and discharging, respectively. The charging and discharging regression networks select the corresponding energy storage charging and discharging power value as the output variable.
4.2. Data Preprocessing
For continuous data, namely the four types of input variables and the energy storage adjustment power used as the output variable of the regression networks, the Z-score standardization method is applied. The conversion formula is as follows:
$$x^{*} = \frac{x - \mu}{\sigma}$$
where $x^{*}$ is the standardized value; $x$ is the value to be standardized; and $\mu$ and $\sigma$ are the mean and standard deviation of the characteristic variable.
The discrete data are processed by one-hot encoding. The energy storage charging, inaction, and discharge state values are −1, 0, and 1, respectively, corresponding to 100, 010, and 001 after coding.
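The two preprocessing steps can be sketched with the Python standard library. The sketch assumes that the mean and standard deviation are fitted on the training data and reused at inference, and that the population standard deviation is used; the helper names are hypothetical.

```python
from statistics import fmean, pstdev

def fit_zscore(values):
    """Return (mean, std) computed on the training data only."""
    return fmean(values), pstdev(values)     # population std (assumption)

def zscore(x, mean, std):
    return (x - mean) / std

def inverse_zscore(z, mean, std):
    """Anti-standardization, used on the regression network output."""
    return z * std + mean

# Coding stated in the text: charging -> 100, inaction -> 010, discharging -> 001.
ONE_HOT = {-1: (1, 0, 0), 0: (0, 1, 0), 1: (0, 0, 1)}

def decode(scores):
    """Map a length-3 score vector back to -1 / 0 / 1 via its largest entry."""
    return (-1, 0, 1)[max(range(3), key=lambda i: scores[i])]
```

Decoding by the largest entry also works on Softmax outputs, which are not exact one-hot vectors.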
The TPA-BiGRU network model training requires three-dimensional (3D) supervised learning data, so after the data preprocessing, it is necessary to use a sliding window of sequence length multiplied by the size of the features to frame two-dimensional data in the time series data and superimpose it to obtain 3D data. According to the training effect, the sliding window size taken in this paper is 151 × 4.
The training dataset for the classification network is a 3D dataset labeled with the preprocessed energy storage states. In contrast, the charging and discharging regression network training dataset is a 3D dataset labeled with the corresponding charging and discharging power values.
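The sliding-window framing can be sketched as follows. The sketch assumes a stride of one sampling point and that each window is labeled with the energy storage action at its last time step; neither detail is stated explicitly above, and the function name is hypothetical.

```python
def sliding_windows(series, labels, length=151):
    """Frame a (T x features) series into overlapping windows.

    Returns (X, y): X[i] holds rows i .. i+length-1 of the series, and y[i]
    is the label at the last row of that window (labeling is an assumption).
    """
    X, y = [], []
    for end in range(length, len(series) + 1):
        X.append(series[end - length:end])
        y.append(labels[end - 1])
    return X, y
```

A series of T rows with four features yields T − 150 windows, giving the n × 151 × 4 layout used for training.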
4.3. Training Process
The structure of the deep TPA-BiGRU network model proposed in this paper is mainly divided into the input, hidden, TPA, and output layers. The data input size of the input layer is 151 × 4, and the hidden part has four layers, each containing a BiGRU layer, a dropout layer, and an activation function. For the classification network, the output layer outputs a sequence of length 3, and the activation function is Softmax. For the regression network, the output is the energy storage action value, of length 1, so its output layer can be a fully connected layer with a single neuron. The specific hyperparameter settings are shown in Table 2.
The hyperparameters listed in Table 2 are selected through empirical validation and domain-specific considerations. For instance, the dropout rate is tuned based on task complexity: a higher rate (0.5) is applied to classification to counteract overfitting in multi-class scenarios, while a lower rate (0.3) is used for regression to preserve network capacity. The number of neurons in the BiGRU layers is sized to balance computational efficiency and feature representation needs (256 for classification vs. 128 for regression). The sliding window size balances computational efficiency and feature richness; a sensitivity analysis of the performance metrics for different window sizes shows that 151 × 4 gives the lowest RMSE. The number of BiGRU layers is investigated with an ablation study to quantify the influence of depth, and the 4-layer configuration strikes a balance between accuracy and GPU memory utilization. All the choices are validated via cross-validation on our dataset, with ablation studies confirming their necessity.
The data processing and training processes for the three TPA-BiGRU network models are as follows:
Data preprocessing: Standardize and encode the input and output of the three networks, respectively.
Data sampling: The time series data after preprocessing are sampled by sliding sampling with a size of 151 × 4 window and stored in the form of n × 151 × 4.
Division of training and test sets: The sampled dataset is divided into training and test sets in the ratio of 7:3. The data from the training set is fed into the TPA-BiGRU model, and the predicted values are obtained after neural network black-box computation.
Parameter update: The training set loss is calculated according to the predicted value and the training set label, and the parameters in the recurrent neural network are updated after a single back-propagation computation.
Performance evaluation: The test set loss is obtained by feeding the test set data into the trained TPA-BiGRU model and is compared with the training set loss. If overfitting or underfitting occurs, the network structure or hyperparameters need to be adjusted.
After iterative training, three TPA-BiGRU models are constructed, laying the foundation for the simulation of the wind storage joint control strategy based on TPA-BiGRU. The training results are shown in Table 3.
During the simulation, the data are read in 151 × 4 windows at successive moments, and the raw data are preprocessed with Z-score standardization and one-hot encoding. They are then input into the BiGRU network, which extracts bidirectional timing features and memorizes the timing relationship between the input variables. The feature matrix from the last layer of the BiGRU network is input into the TPA network to strengthen the model memory function and, at the same time, to highlight the importance of local information to the energy storage output at the current moment. The classification network determines how the energy storage acts at that moment, with output 1 indicating discharge, output −1 indicating charge, and output 0 indicating inaction. When the output of the classification network is 1 (−1), the standardized data are input into the discharging (charging) regression model to calculate the specific energy storage discharge value (charging value).
5. Case Study
The computational experiments utilized TensorFlow v2.18.0 (Google LLC, Mountain View, CA, USA) under Python 3.8 on hardware comprising an Intel Core i7-10700F CPU (Intel Corporation, Santa Clara, CA, USA) and AMD Radeon R5 430 GPU (Advanced Micro Devices, Inc., Santa Clara, CA, USA). The current implementation uses a minimalist hardware setup and can support the real-time control of a 100-MW wind farm cluster with latency well below the operational threshold. Higher-end hardware will allow for larger batch processing.
The data in this paper come from the actual historical operation data of a 100-MW wind farm cluster, covering 33 days in the first half of 2018, with a total of 3168 assessment periods, a sampling period of 2 s, and 43,200 sampling points per day. Of these, 2208 assessment periods over 23 days are used as the training set, and 960 assessment periods over 10 days are used as the validation set. The data used for AROCS and the energy storage parameters follow [11], and the coefficients of PI control are set dynamically according to the control deviation.
5.1. Comparative Control Strategies
To compare the effectiveness of the joint control strategy for the wind storage system based on TPA-BiGRU proposed in this paper, five control strategies are used for simulation and analysis. The experimental data are derived from 31 days of data for the first half of 2019 for the wind farms mentioned above.
TPA-BiGRU: Temporal pattern attention mechanism combined with the bidirectional gated recurrent unit. The new decision model proposed in this paper.
TPA-BiLSTM: The GRU in TPA-BiGRU is replaced with LSTM, and the bidirectional structure and attention mechanism are retained to verify the computational efficiency advantage of GRU in the joint control of the wind storage system by comparison.
BiGRU: The temporal pattern attention (TPA) module in TPA-BiGRU is removed, and only the bidirectional GRU is retained for quantifying the contribution of the attention mechanism to multi-timescale feature extraction.
AROCS: Advanced rolling optimization control strategy within a model predictive control (MPC) framework, the core of which is to dynamically adjust the power allocation of the wind storage system through rolling horizon optimization. AROCS represents optimization-model-based control in the comparative experiments.
PI control: Traditional proportional–integral feedback control. The real-time deviation of the wind storage system is used as the input, and the PI controller outputs the regulation command that adjusts the battery storage output, thus reducing the output deviation of the wind storage system. PI control represents classical feedback control in the comparative experiments.
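For reference, the PI baseline in the list above can be sketched as a discrete controller. This is a simplified illustration with fixed gains, whereas the paper sets the PI coefficients dynamically according to the control deviation; the class name, the clamping-based anti-windup, and all parameter values used with it are assumptions.

```python
class PIController:
    """Discrete PI controller producing the storage regulation command."""

    def __init__(self, kp, ki, dt, p_max):
        self.kp, self.ki, self.dt, self.p_max = kp, ki, dt, p_max
        self.integral = 0.0

    def step(self, p_plan, p_wind):
        error = p_plan - p_wind                       # real-time deviation (MW)
        candidate = self.integral + error * self.dt
        command = self.kp * error + self.ki * candidate
        limited = max(-self.p_max, min(self.p_max, command))
        if limited == command:                        # simple anti-windup:
            self.integral = candidate                 # integrate only when unsaturated
        return limited                                # > 0 discharge, < 0 charge
```

Because the command reacts to every deviation sample, a controller of this kind tends to act frequently when wind power fluctuates, which is the behavior the learned strategy is compared against.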
5.2. Analysis of Control Effects
The evaluation indexes are RMSE, ABS_MAX, KD, KDB, KS, KSH, K1%, TD, TDA, and TDB. The specific meaning of each index is given in the Abbreviations.
Table 4 and Table 5 show the wind power and energy storage output results and the comparison of energy storage output under the five control strategies.
Table 4 shows that the five control strategies effectively reduce the assessment power of the wind farm and improve the tracking planning ability of the wind storage combined system. However, the wind storage joint control strategy based on TPA-BiGRU performs the best, and it can control the control deviation of the wind storage combined system in a small range. The average value of RMSE is only 0.79%, with high control accuracy. The assessment rate (KSH) is only 10.15%, and the assessed electric quantity (KD) of the wind farm is reduced from the original 3466.96 MWh to 31.76 MWh.
Table 5 shows that the wind storage joint control strategy based on TPA-BiGRU has the least average regulated electricity quantity (TDA), which is only 146.48 MWh, and all the storage regulations are better than the other networks. It can be seen from the two tables that the wind storage joint control strategy based on TPA-BiGRU has better control accuracy and stability and is more suitable for the operating conditions of wind farms than the other four control strategies.
Table 6 shows the assessment results of each control strategy. It can be seen from the table that the K1 assessment index of the wind storage joint control strategy based on TPA-BiGRU has the largest number of 100% qualified days, which exceeds the other four control strategies. It shows that the wind storage joint control strategy based on TPA-BiGRU has a better and more stable control effect than the other four control strategies.
To verify the robustness and reliability of the results of the study, this paper uses the following methodology to validate the experimental results:
Statistical significance tests: A paired t-test (α = 0.05) is performed on the prediction error of TPA-BiGRU versus the baseline model (BiGRU) over 100 trials, and the results confirmed a significant difference in the RMSE distributions (p < 0.01), validating the reliability of the proposed model.
K-fold cross-validation: Applying 5-fold cross-validation to assess the stability of the proposed model, the RMSE variance across folds is only 1.8%, ensuring the general applicability of the proposed model in different operational scenarios.
Ablation Studies: Removal of the key model component temporal pattern attention mechanism increases RMSE by 14.8%, confirming its key role in capturing temporal dependencies.
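The paired comparison in the list above can be illustrated with a short sketch that computes the paired t statistic from two sets of per-trial errors. It uses only the standard library and returns the statistic and degrees of freedom; obtaining a p-value additionally requires the Student t distribution, which is not included here. The function name and the sample numbers in the usage are hypothetical and are not results from the paper.

```python
from math import sqrt
from statistics import fmean, stdev

def paired_t_statistic(errors_a, errors_b):
    """t statistic and degrees of freedom for paired samples."""
    if len(errors_a) != len(errors_b) or len(errors_a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    t = fmean(diffs) / (stdev(diffs) / sqrt(n))   # mean difference / standard error
    return t, n - 1

# Hypothetical per-trial RMSE values for two models, for illustration only.
t, dof = paired_t_statistic([1.0, 2.0, 3.0, 4.0], [0.5, 1.0, 2.5, 3.0])
```

Pairing the trials removes the variation shared by both models on the same data, so the test is sensitive to the difference between them rather than to trial-to-trial variation.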
Taking a particular day to analyze the control effect, the RMSE between the actual output of the wind farm and the planned value on that day is 8.61%, and the whole day’s output is 662.17 MWh.
Figure 3 shows the wind power output and control deviation curves under the five control strategies. As Figure 3 and Table 7 show, all five control strategies can effectively track the planned power output. However, the wind storage joint control strategy based on TPA-BiGRU has the best control effect and the highest control accuracy, and the proportion of the assessed electric quantity (KDB) is only 0.33%. This shows that the wind storage joint control strategy based on TPA-BiGRU adapts well to the uncertainty of the wind farm output and can track and regulate energy storage well even when the wind power output changes suddenly.
Table 8 shows the regulation of the energy storage systems with five control strategies. The wind storage joint control strategy based on TPA-BiGRU has the best energy storage regulation performance. In the case that the system can effectively track the planned output, the control strategy proposed in this paper has the least total regulated electricity quantity (TD), the least proportion of the regulated electricity quantity (TDB), and the least number of energy storage operations. It can effectively avoid equipment aging and loss accelerated by frequent actions, and too much regulation power leads to the overuse of energy storage equipment, which affects the equipment’s life.
5.3. Economic Analysis
At present, the cost of energy storage is still high, so it is necessary to evaluate the income level of the wind storage combined system. The economic evaluation of the energy storage system used in this paper follows [11].
Based on the simulation results of 31 days, the annual utilization hours of the wind farm are assumed to be 2300 h. The energy storage battery life and the number of replacements are evaluated using the rainflow counting method [30]. The energy storage system life calculation shows that, during the 20-year life cycle, the PI control strategy needs to replace the equipment twice, while the other four control strategies need to replace the equipment once. The economics of the specific energy storage systems are shown in Table 9.
As shown in Table 9, the wind storage joint control strategy based on TPA-BiGRU can extend the lifetime of the storage system by reducing the battery throughput as much as possible while lowering the assessed power during the whole operation cycle. This control strategy achieves a yield of 25.49% over the 20-year life cycle, which is the highest economic benefit among the five control strategies.
In summary, the deep learning control strategy proposed in this paper can obtain the wind storage control results in real-time and quickly and can take into account the instantaneity, stability, and accuracy under various operating conditions while ensuring the economy of the wind storage combined system.