# A Novel Hybrid Deep Learning Model for Forecasting Ultra-Short-Term Time Series Wind Speeds for Wind Turbines


## Abstract


## 1. Introduction

- (1) A UTSWS forecasting model based on VMD-AOA-GRU is proposed.
- (2) VMD is employed to extract high-frequency wind speed features from time series wind speeds.
- (3) The hyperparameters of the GRU model are optimized using the AOA to construct a hybrid AOA-GRU model.
- (4) The proposed model outperforms other models for the four wind speed datasets.

## 2. Principles of Time Series Wind Speed Forecasting

Let the time series wind speed sequence be $A_k = \{a_1, a_2, a_3, \ldots, a_k\}$, where $a_k$ represents the wind speed at the $k$th time step. Then, the wind speed $a_{k+1}$ at the $(k+1)$th time step can be calculated using Equation (1):

$$a_{k+1} = f(a_1, a_2, a_3, \ldots, a_k) \quad (1)$$

where $f(\cdot)$ denotes the forecasting model. Similarly, the wind speed $a_{k+2}$ at time $k+2$ can be calculated using Equation (2):

$$a_{k+2} = f(a_2, a_3, \ldots, a_k, a_{k+1}) \quad (2)$$

That is, the wind speed sequence $\{a_1, a_2, a_3, \ldots, a_k\}$ is shifted one step to the left, and the forecast value $a_{k+1}$ at time $k+1$ is appended to the wind speed sequence as an input to the forecasting model. Similarly, the predicted value $a_{k+n}$ at time $k+n$ can be calculated using Equation (3):

$$a_{k+n} = f(a_n, a_{n+1}, \ldots, a_{k+n-1}) \quad (3)$$
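The shift-and-append procedure described by Equations (1)–(3) can be sketched in Python. The one-step model here is a hypothetical stand-in for any trained forecaster:

```python
import numpy as np

def recursive_forecast(model, history, n_steps):
    """Roll a one-step-ahead model forward n_steps times.

    `model` maps a fixed-length window to the next value; after each
    forecast, the window is shifted one step to the left and the new
    forecast is appended, exactly as in Equations (1)-(3).
    """
    window = list(history)
    preds = []
    for _ in range(n_steps):
        a_next = float(model(np.asarray(window)))  # forecast a_{k+1}
        preds.append(a_next)
        window = window[1:] + [a_next]             # shift left, append forecast
    return preds

# toy "model": linear extrapolation from the last two points
toy = lambda w: 2 * w[-1] - w[-2]
print(recursive_forecast(toy, [1.0, 2.0, 3.0], 3))  # → [4.0, 5.0, 6.0]
```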

## 3. Variational Mode Decomposition

The VMD algorithm decomposes the original signal $f(t)$ into $K$ modal sub-signals $u_k$ with specific center frequencies and finite bandwidths, such that the sum of all of the sub-signals equals the original signal. The variational constraint equation shown in Equation (4) can then be obtained:

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) \otimes u_k(t) \right] e^{-j \omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_k u_k = f \quad (4)$$

where $\{u_k, \omega_k\}$ represents the $k$th decomposed mode component with its corresponding center frequency, $\partial_t$ is the partial derivative with respect to time $t$, $\delta(t)$ represents the Dirac function, and $\otimes$ represents the convolution operator.

To solve the constrained problem, a quadratic penalty factor $\alpha$ and a Lagrange multiplier $\lambda(t)$ are introduced to construct the augmented Lagrangian function $L(\{u_k\}, \{\omega_k\}, \lambda)$, thereby converting the constrained problem into an unconstrained one. The constructed Lagrangian function is expressed in Equation (5):

$$L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_k \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) \otimes u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_k u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_k u_k(t) \right\rangle \quad (5)$$

The saddle point of $L(\{u_k\}, \{\omega_k\}, \lambda)$ is solved using the alternating direction method of multipliers (ADMM); the specific steps are as follows:

- (1) Initialize $\{u_k^1\}$, $\{\omega_k^1\}$, $\lambda^1$, and the iteration number $n = 0$.
- (2) Update $u_k^{n+1}$, $\omega_k^{n+1}$, and $\lambda^{n+1}$ based on Equations (6)–(8), respectively:

$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha(\omega - \omega_k)^2} \quad (6)$$

$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \left| \hat{u}_k^{n+1}(\omega) \right|^2 \mathrm{d}\omega}{\int_0^{\infty} \left| \hat{u}_k^{n+1}(\omega) \right|^2 \mathrm{d}\omega} \quad (7)$$

$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^n(\omega) + \tau \left( \hat{f}(\omega) - \sum_k \hat{u}_k^{n+1}(\omega) \right) \quad (8)$$

- (3) Repeat step (2) until the convergence criterion in Equation (9) is satisfied:

$$\sum_k \frac{\left\| \hat{u}_k^{n+1} - \hat{u}_k^n \right\|_2^2}{\left\| \hat{u}_k^n \right\|_2^2} < \varepsilon \quad (9)$$
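The ADMM iteration above can be sketched in a few lines of NumPy. This is a simplified illustration, not the authors' implementation: it keeps conjugate symmetry by filtering on $|\omega|$ instead of working on the analytic signal as the canonical algorithm does, and the center-frequency initialization is an arbitrary choice:

```python
import numpy as np

def vmd(signal, K=2, alpha=2000.0, tau=0.1, n_iter=300):
    """Minimal VMD sketch following the ADMM updates of Eqs. (6)-(8).

    Real-signal simplification: the Wiener filter uses |freq| so the
    spectrum stays conjugate-symmetric and the modes come out real.
    """
    T = len(signal)
    f_hat = np.fft.fft(signal)
    freqs = np.fft.fftfreq(T)
    u_hat = np.zeros((K, T), dtype=complex)   # mode spectra
    omega = np.linspace(0.05, 0.45, K)        # initial center frequencies
    lam = np.zeros(T, dtype=complex)          # Lagrange multiplier spectrum
    for _ in range(n_iter):
        for k in range(K):
            # Eq. (6): Wiener-filter update of mode k on the residual
            residual = f_hat - u_hat.sum(axis=0) + u_hat[k]
            u_hat[k] = (residual + lam / 2) / (1 + 2 * alpha * (np.abs(freqs) - omega[k]) ** 2)
            # Eq. (7): center frequency = centroid of the positive half-spectrum
            power = np.abs(u_hat[k, :T // 2]) ** 2
            omega[k] = np.sum(freqs[:T // 2] * power) / (power.sum() + 1e-12)
        # Eq. (8): dual ascent enforcing exact reconstruction
        lam = lam + tau * (f_hat - u_hat.sum(axis=0))
    return np.real(np.fft.ifft(u_hat, axis=1)), omega

# two superposed tones: the modes should lock onto 0.05 and 0.25
t = np.arange(200)
sig = np.cos(2 * np.pi * 0.05 * t) + 0.5 * np.cos(2 * np.pi * 0.25 * t)
modes, omega = vmd(sig, K=2)
```

Summing the returned modes approximately reconstructs the input signal, which is the constraint the multiplier $\lambda$ enforces.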

## 4. Arithmetic Optimization Algorithm (AOA)

- (1) Mathematical Optimizer Acceleration Function

Assume that the position of the $i$th candidate solution in the $Z$-dimensional solution space is $X_i(x_{i1}, x_{i2}, \ldots, x_{iZ})$, where $i = 1, 2, \ldots, N$. The solution set can then be represented by Equation (10):

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1Z} \\ x_{21} & x_{22} & \cdots & x_{2Z} \\ \vdots & \vdots & \ddots & \vdots \\ x_{N1} & x_{N2} & \cdots & x_{NZ} \end{bmatrix} \quad (10)$$

The mathematical optimizer acceleration (MOA) function selects the search phase: when $r_1 \geq \mathrm{MOA}$, the AOA performs the global exploration stage, and when $r_1 < \mathrm{MOA}$, the AOA performs the local exploitation stage. Here, $r_1$ is a random number in the range [0, 1]. The MOA is calculated using Equation (11):

$$\mathrm{MOA}(t) = \mathrm{Min} + t \times \frac{\mathrm{Max} - \mathrm{Min}}{T} \quad (11)$$

where $t$ is the current iteration, $T$ is the maximum number of iterations, and Min and Max are the minimum and maximum values of the acceleration function.

- (2) Global Exploration Stage

In the exploration stage, when $r_2 \geq 0.5$, the multiplication search strategy is executed, and when $r_2 < 0.5$, the division search strategy is executed. The formulas for the multiplication and division search strategies are given in Equation (12):

$$x_{i,j}(t+1) = \begin{cases} \mathrm{best}(x_j) \div (\mathrm{MOP} + \varepsilon) \times \left[ (UB_j - LB_j) \times \mu + LB_j \right], & r_2 < 0.5 \\ \mathrm{best}(x_j) \times \mathrm{MOP} \times \left[ (UB_j - LB_j) \times \mu + LB_j \right], & r_2 \geq 0.5 \end{cases} \quad (12)$$

where $\mathrm{best}(x_j)$ is the $j$th dimension of the best solution obtained so far, $\varepsilon$ is a small positive number, $UB_j$ and $LB_j$ are the upper and lower bounds of the $j$th dimension, $r_2$ is a random number between [0, 1], and $\mu$ is the control parameter for adjusting the search process, with a typical value of 0.499. MOP is the mathematical optimizer probability, which is calculated as shown in Equation (13):

$$\mathrm{MOP}(t) = 1 - \frac{t^{1/\alpha}}{T^{1/\alpha}} \quad (13)$$

where $\alpha$ is a sensitivity parameter that defines the exploitation accuracy over the iterations.

- (3) Local Exploitation Stage

In the exploitation stage, the subtraction and addition search strategies are executed according to Equation (14):

$$x_{i,j}(t+1) = \begin{cases} \mathrm{best}(x_j) - \mathrm{MOP} \times \left[ (UB_j - LB_j) \times \mu + LB_j \right], & r_3 < 0.5 \\ \mathrm{best}(x_j) + \mathrm{MOP} \times \left[ (UB_j - LB_j) \times \mu + LB_j \right], & r_3 \geq 0.5 \end{cases} \quad (14)$$

where $r_3$ is a random variable with a value in the range of [0, 1]. When $r_3 < 0.5$, the subtraction operation is performed; when $r_3 \geq 0.5$, the addition operation is executed.
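Under the update rules above, a minimal AOA sketch can be written as follows (assuming the commonly reported defaults μ = 0.499 and α = 5, and an MOA rising from 0.2 to 0.9; the exact settings used in any particular study may differ):

```python
import numpy as np

def aoa(fitness, lb, ub, n_agents=30, n_iter=200, mu=0.499, alpha=5.0, seed=0):
    """Minimal Arithmetic Optimization Algorithm sketch (minimization).

    Every agent moves relative to the best solution found so far, using
    division/multiplication for exploration (Eq. (12)) and
    subtraction/addition for exploitation (Eq. (14)).
    """
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    X = rng.uniform(lb, ub, size=(n_agents, lb.size))
    fit = np.array([fitness(x) for x in X])
    best, best_fit = X[fit.argmin()].copy(), fit.min()
    eps = np.finfo(float).eps
    scale = (ub - lb) * mu + lb                  # shared term in Eqs. (12) and (14)
    for t in range(1, n_iter + 1):
        moa = 0.2 + t * (0.9 - 0.2) / n_iter     # Eq. (11)
        mop = 1 - (t / n_iter) ** (1 / alpha)    # Eq. (13)
        for i in range(n_agents):
            r1, r2, r3 = rng.random(3)
            if r1 >= moa:                        # global exploration, Eq. (12)
                X[i] = best / (mop + eps) * scale if r2 < 0.5 else best * mop * scale
            else:                                # local exploitation, Eq. (14)
                X[i] = best - mop * scale if r3 < 0.5 else best + mop * scale
        X = np.clip(X, lb, ub)
        fit = np.array([fitness(x) for x in X])
        if fit.min() < best_fit:
            best_fit, best = fit.min(), X[fit.argmin()].copy()
    return best, best_fit

# usage: minimize the 3-D sphere function
best, val = aoa(lambda x: float(np.sum(x ** 2)), lb=[-10] * 3, ub=[10] * 3)
```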

## 5. AOA-GRU Hybrid Model

#### 5.1. GRU Algorithm Principles

- (1) Reset Gate

The reset gate is calculated using Equation (15):

$$r_t = \sigma\left( W_r x_t + U_r S_{t-1} + B_r \right) \quad (15)$$

where $x_t$ represents the input vector at time step $t$, $S_{t-1}$ represents the hidden state at time step $t-1$, $W_r$ and $U_r$ are the weight matrices of the reset gate, $B_r$ is the bias matrix of the reset gate, $\sigma(\cdot)$ is the sigmoid activation function, and $r_t$ is the output of the reset gate at time step $t$.

- (2) Update Gate

The update gate is calculated using Equation (16):

$$z_t = \sigma\left( W_z x_t + U_z S_{t-1} + B_z \right) \quad (16)$$

where $W_z$ and $U_z$ are the weight matrices of the update gate, $B_z$ is the bias matrix of the update gate, and $z_t$ is the output of the update gate at time step $t$.

- (3) Candidate Hidden State of the GRU Model

The candidate hidden state is calculated using Equation (17):

$$\tilde{S}_t = \tanh\left( W_s x_t + U_s (r_t \odot S_{t-1}) + B_s \right) \quad (17)$$

where $\tilde{S}_t$ represents the candidate hidden state, $W_s$ and $U_s$ are its weight matrices, $B_s$ is its bias matrix, and $\odot$ denotes element-wise multiplication.

- (4) Hidden State of the GRU Model

The hidden state is calculated using Equation (18):

$$S_t = (1 - z_t) \odot S_{t-1} + z_t \odot \tilde{S}_t \quad (18)$$

where $S_t$ represents the hidden state of the GRU model.

- (5) GRU Model Output

The output of the GRU model is calculated using Equation (19):

$$y_t = \sigma\left( {W}_{y}^{T} S_t + B_y \right) \quad (19)$$

where $y_t$ represents the output vector of the GRU model, ${W}_{y}^{T}$ is the weight matrix, and $B_y$ is the bias matrix.
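A single forward step through the gate equations above can be written directly in NumPy. The parameter names mirror the text; the dimensions and random initialization are purely illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, s_prev, p):
    """One GRU step following Section 5.1.

    p holds the weight matrices W_*, U_* and bias vectors B_*.
    """
    r_t = sigmoid(p["W_r"] @ x_t + p["U_r"] @ s_prev + p["B_r"])              # reset gate
    z_t = sigmoid(p["W_z"] @ x_t + p["U_z"] @ s_prev + p["B_z"])              # update gate
    s_tilde = np.tanh(p["W_s"] @ x_t + p["U_s"] @ (r_t * s_prev) + p["B_s"])  # candidate state
    s_t = (1 - z_t) * s_prev + z_t * s_tilde                                  # hidden state
    return s_t

# tiny usage example: 1-D input, 4-unit hidden state
rng = np.random.default_rng(0)
p = {k: rng.standard_normal((4, 1)) * 0.1 for k in ("W_r", "W_z", "W_s")}
p |= {k: rng.standard_normal((4, 4)) * 0.1 for k in ("U_r", "U_z", "U_s")}
p |= {k: np.zeros(4) for k in ("B_r", "B_z", "B_s")}
s = np.zeros(4)
for x in [0.3, 0.5, 0.2]:            # a short wind-speed window
    s = gru_step(np.array([x]), s, p)
print(s.shape)
```

Because the hidden state is a convex combination of the previous state and a tanh-bounded candidate, every component of `s` stays inside (−1, 1).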

#### 5.2. Hyperparameters Affecting the Forecasting Performance of GRU Models

- (1) Impact of the number of hidden layers on the model forecasting performance
- (2) Influence of the number of hidden layer neurons on the model forecasting performance
- (3) Impact of the number of training epochs on the model forecasting performance
- (4) Impact of the learning rate and learning rate decay period on the model forecasting performance

#### 5.3. AOA Optimized Hyperparameters of the GRU Model

The AOA optimizes the hyperparameters of the GRU model through the following steps:

- (1) Define the position of each candidate solution as $X_i(x_{i1}, x_{i2}, x_{i3}, x_{i4}, x_{i5})$, comprising the training data time series length $x_{i1}$, the number of hidden layer neurons $x_{i2}$, the number of training iterations $x_{i3}$, the initial learning rate $x_{i4}$, and the learning rate decay period $x_{i5}$. Determine the range and number of candidate solutions.
- (2) Based on the coordinates $X_i(x_{i1}, x_{i2}, x_{i3}, x_{i4}, x_{i5})$ of each candidate solution, construct the respective GRU forecasting models and forecast the wind speed sequence.
- (3) Evaluate the fitness of each candidate solution from its forecasting error and record the current best solution $X^*(x_1, x_2, x_3, x_4, x_5)$, representing the optimal model hyperparameters.
- (4) Update the coordinates $X_i(x_{i1}, x_{i2}, x_{i3}, x_{i4}, x_{i5})$ using the AOA search strategies and rebuild each GRU model based on the updated coordinates to forecast the wind speed.
- (5) Repeat steps (2)–(4) until the termination condition is met, and output the coordinates $X^*(x_1, x_2, x_3, x_4, x_5)$ of the candidate solution with the best fitness value, i.e., the optimal training data length and model hyperparameters.
- (6) Construct the final GRU forecasting model using the optimal solution $X^*(x_1, x_2, x_3, x_4, x_5)$ to forecast the wind speed.
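Step (1) amounts to encoding each candidate solution as a 5-dimensional vector. A possible decoding scheme is sketched below; the bounds are illustrative placeholders, not the paper's settings, and `train_and_eval` is a hypothetical stand-in for building and scoring a GRU:

```python
import numpy as np

# Illustrative search ranges for the five GRU hyperparameters (x_i1 ... x_i5)
BOUNDS = {
    "seq_len":   (5, 60),      # x_i1: training data time series length
    "n_neurons": (10, 50),     # x_i2: number of hidden layer neurons
    "n_epochs":  (30, 100),    # x_i3: number of training iterations
    "lr":        (0.01, 0.1),  # x_i4: initial learning rate
    "lr_decay":  (5, 30),      # x_i5: learning rate decay period
}

def decode(x):
    """Map a raw 5-D AOA position to usable GRU hyperparameters:
    clip each coordinate to its bounds and round the integer-valued ones."""
    out = {}
    for xi, (name, (lo, hi)) in zip(x, BOUNDS.items()):
        v = float(np.clip(xi, lo, hi))
        out[name] = v if name == "lr" else int(round(v))
    return out

def fitness(x, train_and_eval):
    """Fitness of a candidate = validation error of the GRU it encodes."""
    return train_and_eval(**decode(x))

print(decode([21.3, 49.7, 91.2, 0.0478, 14.6]))
```

This decoded dictionary is what step (2) would pass to the GRU constructor, and `fitness` is the objective the AOA minimizes in steps (3)–(5).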

## 6. Construction and Verification of the VMD-AOA-GRU Model

#### 6.1. Data Sources and Sample Set Partition

- (1) Data Sources
- (2) Sample Set Partitioning

#### 6.2. VMD-AOA-GRU Model Construction

- (1) Utilize VMD to decompose the training and testing datasets and obtain K modal components of different frequencies.
- (2) Input each modal component derived from the decomposed training dataset into the AOA-GRU model separately and train the AOA-GRU model.
- (3) Input each modal component derived from the decomposed testing dataset into the trained AOA-GRU model to achieve ultra-short-term forecasting for each modal component.
- (4) Reconstruct the UTSWS based on the ultra-short-term forecasting results of each modal component of the testing dataset.
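The four steps can be sketched end to end with stand-ins for VMD and the trained AOA-GRU models; the spectral band split and persistence forecaster below are purely illustrative placeholders:

```python
import numpy as np

def forecast_pipeline(series, decompose, make_forecaster, K=4):
    """Sketch of the Section 6.2 workflow: (1) decompose the series into
    K modes, (2) fit one forecaster per mode, (3) forecast each mode,
    (4) sum the per-mode forecasts to reconstruct the wind speed."""
    modes = decompose(series, K)                    # (1)
    preds = [make_forecaster(m)(m) for m in modes]  # (2) + (3)
    return sum(preds)                               # (4)

# stand-in decomposition: partition the rFFT spectrum into K bands
def band_split(x, K):
    X = np.fft.rfft(x)
    n = len(X)
    bands = []
    for k in range(K):
        Y = np.zeros_like(X)
        Y[k * n // K:(k + 1) * n // K] = X[k * n // K:(k + 1) * n // K]
        bands.append(np.fft.irfft(Y, len(x)))
    return bands

# stand-in per-mode model: persistence forecast (last observed value)
persistence = lambda train_mode: (lambda window: window[-1])

t = np.arange(256)
wind = 8 + np.sin(2 * np.pi * t / 32) + 0.3 * np.sin(2 * np.pi * t / 5)
print(forecast_pipeline(wind, band_split, persistence, K=4))
```

Because the bands partition the spectrum exactly, the reconstructed persistence forecast equals the persistence forecast of the raw series, a quick sanity check that step (4) recovers the original scale.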

#### 6.3. Verification of the VMD-AOA-GRU Model

#### 6.3.1. Determination of the Number of VMD Modal Components

#### 6.3.2. Training of VMD-AOA-GRU Model

- (a) The wind speed data features contained in the four modal components decomposed from the same training dataset differ. Therefore, after using the four modal components to train the AOA-GRU model, the hyperparameter values of the GRU model optimized by the AOA are different.
- (b) Owing to the large distance between the 58th and 123rd wind turbines, there are certain differences in the time series wind speed data for these two turbines during the same time period, and the decomposed modal components also differ. After using these modal components to train the AOA-GRU model, the hyperparameter values of the GRU model optimized by the AOA are different.
- (c) The time series wind speed data from the same wind turbine differ significantly during different time periods, and the decomposed modal components also exhibit significant differences. After training the AOA-GRU model using these modal components, the hyperparameter values of the GRU model optimized by the AOA are different.

#### 6.3.3. Forecasting Analysis of the VMD-AOA-GRU Model

- (1) Under different forecasting time steps, all four forecasting models can accurately reflect the trend of actual wind speed changes, confirming that the GRU model and its hybrid variants perform well in time series wind speed forecasting.
- (2) The wind speed inflection points in Figure 10 reveal that the forecasting results of the GRU and AOA-GRU models lag behind those of the VMD-GRU and VMD-AOA-GRU models. This is because the VMD algorithm effectively extracts high-frequency component features (corresponding to the rapidly changing part of the wind speed) from the wind speed sequence; with these high-frequency features included in the model input, the VMD-GRU and VMD-AOA-GRU models can accurately forecast sudden changes in the actual wind speed.

- (1) The VMD-GRU model outperforms the GRU model in forecasting accuracy, and the VMD-AOA-GRU model outperforms the AOA-GRU model. This demonstrates that the VMD algorithm effectively captures the high-frequency components of the wind speed data, thereby enhancing the accuracy of the forecasting models.
- (2) The AOA-GRU model outperforms the GRU model, and the VMD-AOA-GRU model outperforms the VMD-GRU model. This indicates that using the AOA to optimize the hyperparameters of the GRU model effectively improves its forecasting accuracy.
- (3) For all of the forecasting models, the forecasting error increases as the forecasting time step increases.
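The MAE, RMSE, and MAPE values reported in the tables follow the standard definitions, which can be computed as:

```python
import numpy as np

def mae(y, yhat):
    """Mean absolute error."""
    return np.mean(np.abs(y - yhat))

def rmse(y, yhat):
    """Root mean squared error."""
    return np.sqrt(np.mean((y - yhat) ** 2))

def mape(y, yhat):
    """Mean absolute percentage error (undefined when y contains zeros)."""
    return np.mean(np.abs((y - yhat) / y))

# toy wind speeds (m/s) vs. forecasts
y    = np.array([6.2, 5.8, 7.1, 6.5])
yhat = np.array([6.0, 6.0, 6.8, 6.6])
print(mae(y, yhat), rmse(y, yhat), mape(y, yhat))
```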

## 7. Comparison of Different Forecasting Models

- (1) All of the machine learning models accurately capture the trends of the actual wind speed, demonstrating that using machine learning models for ultra-short-term wind speed forecasting is feasible.
- (2) At the wind speed inflection points, the VMD-AOA-GRU, VMD-LSTM, VMD-GRU, VMD-PSO-BP, VMD-PSO-ELM, and VMD-PSO-LSSVM models accurately forecast the positions of the inflection points (with the highest accuracy in single-step forecasting). In contrast, the other models exhibit a lag in forecasting the inflection-point positions relative to the actual wind speed. This further validates that the VMD algorithm can accurately extract the high-frequency components of the time series wind speed, thereby enhancing the accuracy of wind speed forecasting.
- (3) Among the forecasting models, the forecasts of the VMD-AOA-GRU model are closest to the distribution characteristics of the actual time series wind speed, demonstrating that its forecasting performance is superior to that of the other models.

- (1) The forecasting accuracy of the VMD hybrid models is higher than that of their counterparts without VMD, indicating that deep mining of the high-frequency features in the time series wind speed through VMD effectively improves the forecasting accuracy of the forecasting model.
- (2) The forecasting accuracy of the LSTM and GRU models is lower than that of some machine learning models, indicating that although the LSTM and GRU models have the theoretical potential to achieve high forecasting accuracy by mining temporal correlations in the data, their accuracy suffers when the hyperparameters are set improperly.
- (3) The forecasting accuracy of the VMD-AOA-GRU model is higher than that of all of the other models, demonstrating that optimizing the hyperparameters of the GRU model with the AOA effectively enhances its forecasting accuracy.
- (4) As the forecasting time step increases, the forecasting accuracy of all models gradually decreases, consistent with the error accumulation inherent in multi-step forecasting.

## 8. Conclusions

- (1) The forecasting accuracies of the VMD hybrid models (VMD-AOA-GRU, VMD-LSTM, VMD-PSO-BP, VMD-PSO-ELM, and VMD-PSO-LSSVM) are higher than those of their counterparts without VMD (GRU, LSTM, PSO-BP, PSO-ELM, and PSO-LSSVM), indicating that VMD can deeply explore the high-frequency components in time series wind speed, particularly the high-frequency features at inflection points, effectively improving the accuracy of the forecasted time series wind speed.
- (2) Although the LSTM and GRU deep learning models can capture the temporal correlations in time series wind speeds, their forecasting accuracy may be lower than that of some commonly used machine learning models (PSO-BP, PSO-ELM, and PSO-LSSVM) when their hyperparameters are set improperly. This indicates that a reasonable hyperparameter setting significantly affects the forecasting accuracy of deep learning models.
- (3) The forecasting accuracy of the GRU model can be effectively improved by using the AOA to optimize its hyperparameters. The calculation results show that the forecasting accuracy of the VMD-AOA-GRU model constructed in this study is higher than that of the other models.
- (4) As the forecasting time step increases, the forecasting accuracy of the model gradually decreases.

## Author Contributions

## Funding

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Nomenclature

| Abbreviation | Definition |
|---|---|
| AOA | arithmetic optimization algorithm |
| AR | autoregressive model |
| ARIMA | autoregressive integrated moving average model |
| ARMA | autoregressive moving average model |
| BiLSTM | bidirectional long short-term memory |
| BP | backpropagation neural network |
| CNN | convolutional neural network |
| CNN-BiLSTM | CNN and BiLSTM hybrid model |
| DASTGN | dynamic adaptive spatiotemporal graph neural network |
| ELM | extreme learning machine |
| GRU | gated recurrent unit |
| ICEEMDAN | improved complete ensemble empirical mode decomposition with adaptive noise |
| LSSVM | least squares support vector machine |
| LSTM | long short-term memory network |
| PSO | particle swarm optimization |
| PSO-BP | PSO and BP hybrid model |
| PSO-ELM | PSO and ELM hybrid model |
| PSO-LSSVM | PSO and LSSVM hybrid model |
| UTSWS | ultra-short-term time series wind speeds |
| VMD | variational mode decomposition |
| VMD-AOA-GRU | VMD, AOA, and GRU hybrid model |
| VMD-GRU | VMD and GRU hybrid model |
| VMD-LSTM | VMD and LSTM hybrid model |
| VMD-PSO-BP | VMD, PSO, and BP hybrid model |
| VMD-PSO-ELM | VMD, PSO, and ELM hybrid model |
| VMD-PSO-LSSVM | VMD, PSO, and LSSVM hybrid model |


**Figure 4.** Relationship among the number of hidden layer neurons, forecasting error, and training time.

**Figure 6.** Relationship among the learning rate, learning rate decay period, forecasting error, and training time. (**a**) Impact of the learning rate on the model forecasting performance. (**b**) Impact of the learning rate decay period on the model forecasting performance.

| Hyperparameter | Search Scope | Optimal Parameter Value |
|---|---|---|
| Number of Hidden Layers | [1, 2, 3, 4] | 2 |
| Number of Hidden Layer Neurons | [10, 20, 30, 40, 50, 60] | 20 |
| Number of Training Epochs | [30, 40, 50, 60, 70, 80] | 70 |
| Initial Learning Rate | [0.02, 0.04, 0.06, 0.08, 0.1] | 0.06 |
| Learning Rate Decay Period | [10, 20, 30, 40, 50, 60] | 30 |

| Number of Modes (K) | Central Frequencies | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| 2 | 0.1491 Hz | 0.0004 Hz | | | | | | |
| 3 | 0.3401 Hz | 0.0776 Hz | 0.0003 Hz | | | | | |
| 4 | 0.4367 Hz | 0.1792 Hz | 0.0551 Hz | 0.0003 Hz | | | | |
| 5 | 0.4494 Hz | 0.2573 Hz | 0.1002 Hz | 0.0156 Hz | 0.0002 Hz | | | |
| 6 | 0.3768 Hz | 0.2106 Hz | 0.1094 Hz | 0.0529 Hz | 0.0098 Hz | 0.0001 Hz | | |
| 7 | 0.3875 Hz | 0.2525 Hz | 0.1540 Hz | 0.1007 Hz | 0.0511 Hz | 0.0092 Hz | 0.0001 Hz | |
| 8 | 0.4255 Hz | 0.3154 Hz | 0.2231 Hz | 0.1543 Hz | 0.1033 Hz | 0.0514 Hz | 0.0088 Hz | 0.0001 Hz |

| Number of Modal Components (K) | C_{12} | C_{23} | C_{34} | C_{45} | C_{56} | C_{67} | C_{78} |
|---|---|---|---|---|---|---|---|
| 2 | 0.015163 | | | | | | |
| 3 | 0.025336 | 0.04361 | | | | | |
| 4 | 0.022502 | 0.061709 | 0.053189 | | | | |
| 5 | 0.035944 | 0.048752 | 0.069011 | 0.157038 | | | |
| 6 | 0.040904 | 0.064714 | 0.109889 | 0.079833 | 0.144653 | | |
| 7 | 0.061023 | 0.072363 | 0.117903 | 0.132261 | 0.074168 | 0.14186 | |
| 8 | 0.068861 | 0.085539 | 0.118399 | 0.116491 | 0.113842 | 0.066931 | 0.144964 |

| Sequence | Training Dataset 1 | | | | Training Dataset 2 | | | |
|---|---|---|---|---|---|---|---|---|
| Modal Components | 1 | 2 | 3 | 4 | 1 | 2 | 3 | 4 |
| Temporal Length of Training Data | 21 | 17 | 9 | 47 | 14 | 11 | 10 | 20 |
| Number of Neurons in Hidden Layers | 28 | 50 | 21 | 50 | 50 | 30 | 50 | 44 |
| Number of Training Epochs | 91 | 81 | 75 | 100 | 100 | 34 | 71 | 89 |
| Learning Rate | 0.0478 | 0.0355 | 0.079 | 0.0572 | 0.0766 | 0.0581 | 0.0269 | 0.0574 |
| Learning Rate Decay Period | 15 | 30 | 9 | 16 | 10 | 25 | 29 | 30 |
| MAE | 3.46% | 4.7% | 4.32% | 3.96% | 7.54% | 4.4% | 3.23% | 2.74% |
| RMSE | 4.98% | 6.59% | 5.36% | 4.82% | 9.71% | 5.81% | 3.93% | 3.35% |
| MAPE | 0.53% | 0.71% | 0.85% | 0.01% | 0.62% | 0.30% | 0.21% | 0.01% |
| Training Time | 187.4 s | 190.1 s | 153.9 s | 232.4 s | 233.9 s | 88.6 s | 172.1 s | 201.9 s |

| Sequence | Training Dataset 3 | | | | Training Dataset 4 | | | |
|---|---|---|---|---|---|---|---|---|
| Modal Components | 1 | 2 | 3 | 4 | 1 | 2 | 3 | 4 |
| Temporal Length of Training Data | 9 | 10 | 9 | 56 | 9 | 16 | 11 | 20 |
| Number of Neurons in Hidden Layers | 39 | 44 | 16 | 24 | 30 | 33 | 34 | 48 |
| Number of Training Epochs | 67 | 83 | 62 | 96 | 58 | 96 | 35 | 80 |
| Learning Rate | 0.1 | 0.0391 | 0.1 | 0.0589 | 0.087 | 0.0544 | 0.0409 | 0.0564 |
| Learning Rate Decay Period | 30 | 30 | 30 | 30 | 26 | 25 | 30 | 17 |
| MAE | 4.51% | 4.08% | 2.12% | 2.58% | 3.66% | 3.4% | 3.14% | 2.74% |
| RMSE | 6.52% | 5.53% | 2.92% | 3.13% | 4.59% | 4.41% | 4.13% | 3.47% |
| MAPE | 1.60% | 0.32% | 0.12% | 0.01% | 2.51% | 0.32% | 0.15% | 0.01% |
| Training Time | 157.2 s | 189.7 s | 127.3 s | 187.2 s | 133 s | 201.1 s | 92.1 s | 186.6 s |

| Testing Dataset | Model | MAE (1-Step) | RMSE (1-Step) | MAPE (1-Step) | MAE (2-Step) | RMSE (2-Step) | MAPE (2-Step) | MAE (3-Step) | RMSE (3-Step) | MAPE (3-Step) |
|---|---|---|---|---|---|---|---|---|---|---|
| Testing Dataset 1 | GRU | 0.5499 | 0.7527 | 0.0734 | 0.7272 | 0.9096 | 0.0964 | 0.7524 | 0.9413 | 0.0994 |
| | VMD-GRU | 0.4159 | 0.5091 | 0.0576 | 0.5143 | 0.6375 | 0.0710 | 0.6018 | 0.7176 | 0.0846 |
| | AOA-GRU | 0.5178 | 0.7378 | 0.0696 | 0.6743 | 0.8652 | 0.0912 | 0.7420 | 0.9236 | 0.1007 |
| | VMD-AOA-GRU | 0.2280 | 0.2990 | 0.0292 | 0.2536 | 0.3382 | 0.0323 | 0.2704 | 0.3585 | 0.0349 |
| Testing Dataset 2 | GRU | 0.7005 | 0.9290 | 0.1615 | 0.8503 | 1.0866 | 0.2025 | 0.9865 | 1.2427 | 0.2413 |
| | VMD-GRU | 0.3967 | 0.5232 | 0.0927 | 0.5400 | 0.6781 | 0.1274 | 0.5542 | 0.6989 | 0.1354 |
| | AOA-GRU | 0.5259 | 0.7174 | 0.1227 | 0.7793 | 1.0050 | 0.1807 | 0.9585 | 1.2020 | 0.2235 |
| | VMD-AOA-GRU | 0.2463 | 0.3001 | 0.0615 | 0.2727 | 0.3286 | 0.0641 | 0.3411 | 0.4422 | 0.0843 |
| Testing Dataset 3 | GRU | 0.5426 | 0.7111 | 0.0728 | 0.7227 | 0.8794 | 0.0982 | 0.7178 | 0.8949 | 0.0977 |
| | VMD-GRU | 0.3390 | 0.4373 | 0.0483 | 0.3697 | 0.4769 | 0.0508 | 0.4685 | 0.6097 | 0.0660 |
| | AOA-GRU | 0.4937 | 0.6592 | 0.0659 | 0.6291 | 0.8042 | 0.0838 | 0.6862 | 0.8508 | 0.0918 |
| | VMD-AOA-GRU | 0.1988 | 0.2576 | 0.0263 | 0.2027 | 0.2729 | 0.0269 | 0.2363 | 0.2923 | 0.0301 |
| Testing Dataset 4 | GRU | 0.6143 | 0.7849 | 0.2029 | 0.6785 | 0.8854 | 0.2218 | 0.8397 | 1.0956 | 0.2713 |
| | VMD-GRU | 0.3970 | 0.4965 | 0.1346 | 0.4283 | 0.5355 | 0.1489 | 0.5287 | 0.6590 | 0.1756 |
| | AOA-GRU | 0.4425 | 0.5597 | 0.1387 | 0.6373 | 0.8183 | 0.2145 | 0.7011 | 0.9456 | 0.2094 |
| | VMD-AOA-GRU | 0.2170 | 0.2779 | 0.0701 | 0.2608 | 0.3351 | 0.0855 | 0.3102 | 0.3908 | 0.1054 |

| Testing Dataset | Model | MAE (1-Step) | RMSE (1-Step) | MAPE (1-Step) | MAE (2-Step) | RMSE (2-Step) | MAPE (2-Step) | MAE (3-Step) | RMSE (3-Step) | MAPE (3-Step) |
|---|---|---|---|---|---|---|---|---|---|---|
| Testing Dataset 1 | LSTM | 0.7122 | 0.8927 | 0.0948 | 0.7605 | 0.9483 | 0.1004 | 0.8092 | 1.0211 | 0.1049 |
| | GRU | 0.5499 | 0.7527 | 0.0734 | 0.7272 | 0.9096 | 0.0964 | 0.7524 | 0.9413 | 0.0994 |
| | PSO-BP | 0.5722 | 0.8016 | 0.0760 | 0.7276 | 0.9305 | 0.0971 | 0.7565 | 1.0682 | 0.1002 |
| | PSO-ELM | 0.2937 | 0.3873 | 0.0399 | 0.4859 | 0.6034 | 0.0654 | 0.5159 | 0.6478 | 0.0698 |
| | PSO-LSSVM | 0.5241 | 0.7260 | 0.0716 | 0.6662 | 0.8312 | 0.0919 | 0.7286 | 0.8855 | 0.0995 |
| | VMD-LSTM | 0.4140 | 0.5131 | 0.0593 | 0.4148 | 0.5472 | 0.0545 | 0.4637 | 0.6218 | 0.0596 |
| | VMD-GRU | 0.4159 | 0.5091 | 0.0576 | 0.5143 | 0.6375 | 0.0710 | 0.6018 | 0.7176 | 0.0846 |
| | VMD-PSO-BP | 0.2306 | 0.2987 | 0.0292 | 0.2896 | 0.3994 | 0.0367 | 0.3467 | 0.4775 | 0.0444 |
| | VMD-PSO-ELM | 0.2371 | 0.3172 | 0.0302 | 0.4253 | 0.5602 | 0.0560 | 0.4832 | 0.6103 | 0.0638 |
| | VMD-PSO-LSSVM | 0.2295 | 0.3045 | 0.0293 | 0.2888 | 0.3955 | 0.0367 | 0.3162 | 0.4365 | 0.0406 |
| | VMD-AOA-GRU | 0.2280 | 0.2990 | 0.0292 | 0.2536 | 0.3382 | 0.0323 | 0.2704 | 0.3585 | 0.0349 |
| Testing Dataset 2 | LSTM | 0.7528 | 1.0541 | 0.1643 | 0.8739 | 1.1703 | 0.1995 | 1.0092 | 1.3030 | 0.2428 |
| | GRU | 0.7005 | 0.9290 | 0.1615 | 0.8503 | 1.0866 | 0.2025 | 0.9865 | 1.2427 | 0.2413 |
| | PSO-BP | 0.5447 | 0.7329 | 0.1265 | 0.7984 | 1.0600 | 0.1791 | 0.9295 | 1.2103 | 0.2099 |
| | PSO-ELM | 0.4124 | 0.5301 | 0.1029 | 0.5670 | 0.7277 | 0.1419 | 0.6451 | 0.8573 | 0.1566 |
| | PSO-LSSVM | 0.6130 | 0.8520 | 0.1380 | 0.7448 | 0.9983 | 0.1696 | 0.9032 | 1.2067 | 0.2007 |
| | VMD-LSTM | 0.4073 | 0.5381 | 0.0953 | 0.4659 | 0.6060 | 0.1026 | 0.5551 | 0.7177 | 0.1266 |
| | VMD-GRU | 0.3967 | 0.5232 | 0.0927 | 0.5400 | 0.6781 | 0.1274 | 0.5542 | 0.6989 | 0.1354 |
| | VMD-PSO-BP | 0.2496 | 0.3046 | 0.0632 | 0.3523 | 0.3535 | 0.0649 | 0.4273 | 0.5705 | 0.1002 |
| | VMD-PSO-ELM | 0.2564 | 0.3070 | 0.0643 | 0.2618 | 0.4604 | 0.0820 | 0.4124 | 0.5346 | 0.0971 |
| | VMD-PSO-LSSVM | 0.2474 | 0.3066 | 0.0622 | 0.2848 | 0.3730 | 0.0677 | 0.4589 | 0.6011 | 0.1071 |
| | VMD-AOA-GRU | 0.2463 | 0.3001 | 0.0615 | 0.2727 | 0.3286 | 0.0641 | 0.3411 | 0.4422 | 0.0843 |
| Testing Dataset 3 | LSTM | 0.6011 | 0.7619 | 0.0798 | 0.7163 | 0.8916 | 0.0954 | 0.8098 | 0.9865 | 0.1089 |
| | GRU | 0.5426 | 0.7111 | 0.0728 | 0.7227 | 0.8794 | 0.0982 | 0.7178 | 0.8949 | 0.0977 |
| | PSO-BP | 0.4963 | 0.6558 | 0.0648 | 0.6969 | 0.8854 | 0.0921 | 0.7414 | 0.9364 | 0.0981 |
| | PSO-ELM | 0.3469 | 0.4394 | 0.0462 | 0.4447 | 0.5677 | 0.0597 | 0.5260 | 0.6446 | 0.0713 |
| | PSO-LSSVM | 0.6398 | 0.7872 | 0.0854 | 0.7180 | 0.8766 | 0.0957 | 0.7375 | 0.8947 | 0.0981 |
| | VMD-LSTM | 0.2927 | 0.3846 | 0.0399 | 0.4904 | 0.5983 | 0.0691 | 0.4918 | 0.6181 | 0.0696 |
| | VMD-GRU | 0.3390 | 0.4373 | 0.0483 | 0.3697 | 0.4769 | 0.0508 | 0.4685 | 0.6097 | 0.0660 |
| | VMD-PSO-BP | 0.2078 | 0.2755 | 0.0277 | 0.2502 | 0.2790 | 0.0337 | 0.2645 | 0.3690 | 0.0350 |
| | VMD-PSO-ELM | 0.2006 | 0.2716 | 0.0265 | 0.2085 | 0.3570 | 0.0277 | 0.2253 | 0.3981 | 0.0405 |
| | VMD-PSO-LSSVM | 0.2031 | 0.2690 | 0.0269 | 0.2040 | 0.2835 | 0.0270 | 0.2978 | 0.3371 | 0.0311 |
| | VMD-AOA-GRU | 0.1988 | 0.2576 | 0.0263 | 0.2027 | 0.2729 | 0.0269 | 0.2363 | 0.2923 | 0.0301 |
| Testing Dataset 4 | LSTM | 0.7196 | 0.9250 | 0.2394 | 0.8377 | 1.0775 | 0.2821 | 0.8782 | 1.1983 | 0.2640 |
| | GRU | 0.6143 | 0.7849 | 0.2029 | 0.6785 | 0.8854 | 0.2218 | 0.8397 | 1.0956 | 0.2713 |
| | PSO-BP | 0.4324 | 0.5544 | 0.1331 | 0.6957 | 0.9454 | 0.2054 | 0.7131 | 0.9583 | 0.2135 |
| | PSO-ELM | 0.3143 | 0.4076 | 0.1008 | 0.4435 | 0.5777 | 0.1375 | 0.5032 | 0.6291 | 0.1586 |
| | PSO-LSSVM | 0.4349 | 0.5586 | 0.1345 | 0.6184 | 0.8034 | 0.1848 | 0.7263 | 0.9627 | 0.2155 |
| | VMD-LSTM | 0.4238 | 0.5176 | 0.1448 | 0.4586 | 0.5561 | 0.1585 | 0.5844 | 0.7049 | 0.2096 |
| | VMD-GRU | 0.3970 | 0.4965 | 0.1346 | 0.4283 | 0.5355 | 0.1489 | 0.5287 | 0.6590 | 0.1756 |
| | VMD-PSO-BP | 0.2177 | 0.2902 | 0.0730 | 0.3526 | 0.3358 | 0.1138 | 0.3906 | 0.5082 | 0.1267 |
| | VMD-PSO-ELM | 0.2239 | 0.2860 | 0.0728 | 0.2620 | 0.4415 | 0.0872 | 0.4021 | 0.5043 | 0.1239 |
| | VMD-PSO-LSSVM | 0.2250 | 0.2785 | 0.0704 | 0.2778 | 0.3551 | 0.0904 | 0.3554 | 0.4476 | 0.1145 |
| | VMD-AOA-GRU | 0.2170 | 0.2779 | 0.0701 | 0.2608 | 0.3351 | 0.0855 | 0.3102 | 0.3908 | 0.1054 |


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Yang, J.; Pang, F.; Xiang, H.; Li, D.; Gu, B.
A Novel Hybrid Deep Learning Model for Forecasting Ultra-Short-Term Time Series Wind Speeds for Wind Turbines. *Processes* **2023**, *11*, 3247.
https://doi.org/10.3390/pr11113247
