# Wind Power Short-Term Time-Series Prediction Using an Ensemble of Neural Networks


## Abstract


## 1. Introduction

## 2. Dataset Characteristics

The input vectors **x** were transformed into a two-dimensional system using the t-distributed Stochastic Neighbor Embedding (tSNE) transformation, a nonlinear method **y** = **f**(**x**) = [y_{1}, y_{2}]^{T} of dimensionality reduction [23,24,25]. The method maps high-dimensional data into a two- or three-dimensional space. Similar objects are modeled by neighboring points, and less similar objects are modeled by more distant points with high probability [25,26,27].

The similarity of datapoint x_{j} to datapoint x_{i} is the conditional probability, p_{j|i}, that x_{i} would pick x_{j} as its neighbor if neighbors were picked in proportion to their probability density under a Gaussian centered at x_{i}:

p_{j|i} = exp(−‖x_{i} − x_{j}‖² / 2σ_{i}²) / Σ_{k≠i} exp(−‖x_{i} − x_{k}‖² / 2σ_{i}²). (1)
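The conditional probability p_{j|i} above can be sketched in a few lines of NumPy. This is only an illustration: for simplicity it assumes one global Gaussian width σ, whereas tSNE tunes a per-point σ_{i} through the perplexity parameter.

```python
import numpy as np

def conditional_probabilities(X, sigma=1.0):
    """p[i, j] = probability that point i picks point j as its neighbor,
    proportional to a Gaussian density centered at x_i (Equation (1))."""
    sq_d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    P = np.exp(-sq_d / (2.0 * sigma ** 2))
    np.fill_diagonal(P, 0.0)                 # a point never picks itself
    return P / P.sum(axis=1, keepdims=True)  # normalize each row

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 24))                # e.g., 50 daily 24-h power vectors (toy data)
P = conditional_probabilities(X)
```

Each row of `P` sums to one, so row i is a proper probability distribution over the possible neighbors of point i.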

## 3. The Neural Prediction System

#### 3.1. Individual Neural Predictors Forming an Ensemble

Each hidden neuron of the MLP processes the input vector **x** and is defined by the sigmoidal activation y_{i} = 1/(1 + exp(−**w**_{i}^{T}**x**)), where **w**_{i} is the weight vector of the ith neuron. The other type of activation, used in radial basis function (RBF) networks, is based on the Gaussian function φ(**x**) = exp(−‖**x** − **c**‖²/(2σ²)), where **c** represents the center and σ is the coefficient characterizing the width of the function. The networks applying such a form of activation also use different learning algorithms. In the case of a regression task, the output neurons of both (MLP and RBF) networks are usually linear [19].
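The two activation types can be written down directly in NumPy; this is a sketch of the single-neuron functions only, not the networks' training procedures.

```python
import numpy as np

def sigmoid_neuron(x, w):
    """Sigmoidal MLP neuron: y = 1 / (1 + exp(-w . x))."""
    return 1.0 / (1.0 + np.exp(-np.dot(w, x)))

def gaussian_rbf(x, c, sigma):
    """Gaussian RBF neuron centered at c with width coefficient sigma."""
    return np.exp(-np.sum((x - c) ** 2) / (2.0 * sigma ** 2))

x = np.array([0.5, -1.0, 2.0])
y_mlp = sigmoid_neuron(x, np.array([0.1, 0.2, 0.3]))
y_rbf = gaussian_rbf(x, c=np.zeros(3), sigma=1.5)
```

The sigmoid maps any weighted sum into (0, 1), while the Gaussian peaks at 1 when **x** coincides with the center **c** and decays with distance.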

The output of a recurrent network depends not only on the actual input **x**(t) but also on its previous state **x**(t − 1). The most efficient implementation of the recurrent neural network is now long short-term memory (LSTM) [22], which has a very good reputation in prediction tasks. Therefore, such networks are good candidates for time-series prediction of the future outcome. In the LSTM cell, **x**_{t} is the vector of input signals at time point t, c_{t−1} represents the memory signal (cell state) of the cell at time point t − 1, and h_{t−1} is the output signal of the cell at the preceding time point t − 1. The cell generates the memory signal c_{t} and the output signal h_{t} (at time point t). Both signals (c_{t} and h_{t}) leave the cell and are fed back to the cells in the hidden layer at time t + 1.
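A single forward step of such a memory cell can be sketched as follows. The parameter matrices here are random toy values (real networks learn `W`, `U`, and `b`), and the stacked-gate layout is one common convention, not necessarily the exact formulation of [22].

```python
import numpy as np

def sigm(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM cell update: (x_t, h_{t-1}, c_{t-1}) -> (h_t, c_t).
    W, U, b stack the parameters of the input (i), forget (f),
    output (o) gates and the candidate update (g)."""
    z = W @ x_t + U @ h_prev + b
    H = h_prev.size
    i, f, o = sigm(z[:H]), sigm(z[H:2 * H]), sigm(z[2 * H:3 * H])
    g = np.tanh(z[3 * H:])
    c_t = f * c_prev + i * g      # new cell state (memory signal)
    h_t = o * np.tanh(c_t)        # new output, fed back at time t + 1
    return h_t, c_t

rng = np.random.default_rng(1)
n_in, H = 24, 4
W = rng.normal(scale=0.1, size=(4 * H, n_in))
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)
h_t, c_t = lstm_step(rng.normal(size=n_in), np.zeros(H), np.zeros(H), W, U, b)
```

Because h_t passes through a tanh scaled by a sigmoid gate, its components always stay inside (−1, 1).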

#### 3.2. Ensemble of Predictors

## 4. Numerical Results of Experiments

- Prediction of the 24-h power pattern generated by the wind for the next day (24 h ahead), based on the information from the previous day;
- Prediction of 1-h-ahead hourly power generation, assuming its knowledge from the previous hour.
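The first task above amounts to supervised learning on (previous day, next day) pairs of 24-h vectors. A minimal sketch of building such pairs from an hourly series (toy data, not the actual PSE dataset):

```python
import numpy as np

def daily_pairs(p):
    """Build (previous-day, next-day) 24-h vector pairs from an hourly
    power series p whose length is a multiple of 24."""
    days = p.reshape(-1, 24)      # one row per day
    return days[:-1], days[1:]    # inputs x(d), targets x(d + 1)

p = np.arange(24 * 10, dtype=float)  # 10 days of hourly values (toy data)
X, Y = daily_pairs(p)
```

Each input row is the 24-h pattern of day d, and the corresponding target row is the 24-h pattern of day d + 1.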

The MLP structure can be denoted 24-n_{h}-24. In forecasting the 24-h vector of the next day's power, we used the 24-h vector of power generated in the previous day and n_{h} hidden neurons of sigmoidal nonlinearity. The RBF network applying the Gaussian activation function has a similar architecture of n_{h} hidden neurons and can also be presented as 24-n_{h}-24 (the input is formed by the elements of the 24-h power pattern generated in the previous day; the output represents the corresponding vector for the day under prediction). In both cases, the value of n_{h} was adjusted in the initial stage of the experiments, depending on the type of task.

The input attributes are the vectors **x** representing the 24-h patterns of the previous daily load. During learning, these samples are associated with the target vector of the 24-h load pattern of the next day. The network is trained using pairs of vectors: the input **x**(d) and the known output **x**(d + 1). In the testing mode, the 24-h pattern for the next day is predicted by the learned network from the supplied, already-known pattern of the previous day. The structure of the LSTM network used in the experiments is 24-n_{h}-24, where n_{h} represents the number of LSTM cells. This number is also chosen in the introductory experiment phase. The results of the individual units are fused in an ensemble using three approaches to their integration:

- Simple averaging of the results of individual units;
- Weighted averaging using the application of the MLP combiner;
- Application of PCA in the fusing phase.
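The first integration approach, simple averaging, is the most direct; a sketch with made-up forecast vectors standing in for the outputs of the three units:

```python
import numpy as np

def average_ensemble(*forecasts):
    """Simple (unweighted) averaging of the forecasts of individual units."""
    return np.mean(np.stack(forecasts), axis=0)

# Toy forecasts of two hourly power values from the three predictors.
y_mlp  = np.array([100.0, 110.0])
y_rbf  = np.array([104.0, 112.0])
y_lstm = np.array([ 96.0, 108.0])
y_ens = average_ensemble(y_mlp, y_rbf, y_lstm)
```

The other two approaches replace this fixed equal weighting with weights learned by an MLP combiner, either on the raw concatenated forecasts or on their PCA transformation.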

#### 4.1. Prediction of the 24-h Power Pattern for the Next Day

The networks were tried with different numbers n_{h} of hidden neurons. The experiments have shown the best results for n_{h} = 12 in MLP, n_{h} = 16 in RBF, and n_{h} = 18 in LSTM networks.

In the PCA-based integration, the N-dimensional input vector **x** (N = 72) is transformed into the K-dimensional output vector **z**, defined as **z** = **W**^{T}**x**, where **W** is composed of the eigenvectors of the correlation matrix **R**_{x} associated with the K largest eigenvalues. The vector **z** is composed of K principal components, starting from the most important, z_{1}, and ending with the least important component, z_{K}. The cut information can be associated with the noise of the data. The vector **z** following from the PCA transformation is treated as the input to the MLP combiner with a structure like that presented in Figure 4. The introductory experiments have shown that the best results are obtained when the dimension of vector **z** equals 35. The final forecast for the 24 h of the next day is performed by the MLP combiner of structure 35-12-24 (35 elements following from the PCA transformation, 12 hidden neurons, and 24 output signals representing the final forecast). PCA is explained well in [30,31]. The calculations were performed on a laptop running 64-bit Windows 10 with an Intel Core i7-6700HQ 2.60 GHz processor, 16 GB RAM, and a 1 TB HDD. All experiments were conducted on the MATLAB R2023a computation platform [24]. The detailed structures of the three members of the ensemble were adjusted in introductory experiments, accepting the configurations that provided the best results on the validation data (around 20% of the learning set).
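The PCA projection **z** = **W**^{T}**x** can be sketched with a plain eigen-decomposition. This is a generic implementation, not the authors' MATLAB code: the sample covariance matrix stands in for **R**_{x}, and random data stand in for the 72-element concatenated forecasts.

```python
import numpy as np

def pca_project(X, K):
    """Project the rows of X onto the K leading principal components.
    X: samples x 72 matrix of concatenated MLP/RBF/LSTM forecasts."""
    Xc = X - X.mean(axis=0)                     # center the data
    R = np.cov(Xc, rowvar=False)                # covariance matrix (stand-in for R_x)
    evals, evecs = np.linalg.eigh(R)            # eigenvalues in ascending order
    W = evecs[:, np.argsort(evals)[::-1][:K]]   # K eigenvectors of largest eigenvalues
    return Xc @ W                               # z = W^T x for each sample

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 72))                  # toy stand-in for the 72-element inputs
Z = pca_project(X, K=35)
```

The resulting 35-component vectors would then feed the 35-12-24 MLP combiner; the discarded trailing components carry the least variance and can be associated with noise.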

#### 4.2. Prediction of 1-h-Ahead Hourly Power

The input vector **x** used for the prediction of the power value of the hth hour, p(h), is also composed of 24 elements, representing the powers of the preceding 24 h: **x**(h) = [p(h − 1) p(h − 2) … p(h − 24)]. Similar to the first task, the learning data were composed of the first three years, leaving the last (fourth) year only for testing.
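Building the 24-lag input vectors **x**(h) from an hourly series can be sketched as follows (a toy series in place of the real power data):

```python
import numpy as np

def hourly_lag_features(p, n_lags=24):
    """For each hour h, build x(h) = [p(h-1), ..., p(h-24)] and target p(h)."""
    X = np.stack([p[n_lags - k : len(p) - k] for k in range(1, n_lags + 1)], axis=1)
    y = p[n_lags:]
    return X, y

p = np.arange(30, dtype=float)   # toy hourly series: p(h) = h
X, y = hourly_lag_features(p)
```

Row h of `X` holds the 24 most recent past values in reverse chronological order, and `y[h]` is the value to be predicted one hour ahead.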

The best results were obtained for n_{h} = 8 in MLP, n_{h} = 10 in RBF, and n_{h} = 30 in LSTM networks.

## 5. Conclusions

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest

## Abbreviations

MLP | Multilayer perceptron |
RBF | Radial basis function |
PCA | Principal component analysis |
tSNE | t-distributed Stochastic Neighbor Embedding |
LSTM | Long short-term memory |
MAE | Mean absolute error |
MAPE | Mean absolute percentage error |
SGD | Stochastic gradient descent algorithm |
ADAM | Adaptive moment estimation algorithm |
NRMSE | Normalized root-mean-square error |

## References

- Rosa, J.; Pestana, R.; Leandro, C.; Geraldes, C.; Esteves, J.; Carvalho, D. Wind power forecasting with machine learning: Single and combined methods. In Proceedings of the 20th International Conference on Renewable Energies and Power Quality (ICREPQ’22), Vigo, Spain, 27–30 July 2022; Volume 20.
- Karthikeswaren, R.; Kanishka, K.; Gaurav, D.; Ankur, A. A survey on classical and deep learning based intermittent time series forecasting methods. In Proceedings of the 2021 International Joint Conference on Neural Networks, Shenzhen, China, 18–22 July 2021.
- Ferrero Bermejo, J.; Gómez Fernández, J.F.; Olivencia Polo, F.; Crespo Márquez, A. A review of the use of artificial neural network models for energy and reliability prediction. A study of the solar PV, hydraulic and wind energy sources. Appl. Sci. **2019**, 9, 1844.
- Donadio, L.; Fang, J.; Porté-Agel, F. Numerical weather prediction and artificial neural network coupling for wind energy forecast. Energies **2021**, 14, 338.
- El Aissaoui, H.; El Ougli, A.; Tidhaf, B. Neural networks and fuzzy logic based maximum power point tracking control for wind energy conversion system. Adv. Sci. Technol. Eng. Syst. J. **2021**, 6, 586–592.
- Alanis, A.Y.; Sanchez, O.D.; Alvare, J.G. Time series forecasting for wind energy systems based on high order neural networks. Mathematics **2021**, 9, 1075.
- Yan, C.; Pan, Y.; Archer, C.L. A general method to estimate wind farm power using artificial neural networks. Wind Energy **2019**, 22, 1421–1432.
- Altintas, A.; Davidson, L.; Carlson, O. Forecasting of wind power by using a hybrid machine learning method for the Nord-Pool intraday electricity market. Wind Energy Sci. **2023**, preprint.
- Hu, T.; Liu, K.; Ma, H. Short-term spatial-temporal wind power forecast through alternate feature extraction. In Proceedings of the International Joint Conference on Neural Networks, Virtual, 18–22 July 2021; pp. 1–8.
- Oh, J.R.; Park, J.J.; Ok, C.S.; Ha, C.H.; Jun, H.B. A Study on the Wind Power Forecasting Model Using Transfer Learning Approach. Electronics **2022**, 11, 4125.
- Lee, H.; Kim, K.; Jeong, H.; Lee, H.; Kim, H.; Park, J. A Study on Wind Power Forecasting Using LSTM Method. Trans. Korean Inst. Electr. Eng. **2020**, 69, 1157–1164.
- Yang, M.; Shi, C.; Liu, H. Day-ahead wind power forecasting based on the clustering of equivalent power curves. Energy **2021**, 218, 119515.
- Hanifi, S.; Liu, X.; Lin, Z.; Lotfian, S. A critical review of wind power forecasting methods—Past, present and future. Energies **2020**, 13, 3764.
- Maldonado-Correa, J.; Solano, J.; Rojas-Moncayo, M. Wind power forecasting: A systematic literature review. Wind Eng. **2021**, 45, 413–426.
- Zhang, X.; Kuenzel, S.; Colombo, N.; Watkins, C. Hybrid Short-term Load Forecasting Method Based on Empirical Wavelet Transform and Bidirectional Long Short-term Memory Neural Networks. J. Mod. Power Syst. Clean Energy **2022**, 10, 1216–1228.
- Zhang, W.; Lin, Z.; Liu, X. Short-term offshore wind power forecasting—A hybrid model based on Discrete Wavelet Transform (DWT), Seasonal Autoregressive Integrated Moving Average (SARIMA), and deep-learning-based Long Short-Term Memory (LSTM). Renew. Energy **2022**, 185, 611–628.
- Osowski, S.; Szmurlo, R.; Siwek, K.; Ciechulski, T. Neural approaches to short-time load forecasting in power systems—A comparative study. Energies **2022**, 15, 3265.
- Haykin, S. Neural Networks and Learning Machines; Pearson: Santa Monica, CA, USA, 2016.
- Greff, K.; Srivastava, R.K.; Koutník, J.; Steunebrink, B.R.; Schmidhuber, J. LSTM: A search space odyssey. IEEE Trans. Neural Netw. Learn. Syst. **2017**, 28, 2222–2232.
- Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. **1997**, 9, 1735–1780.
- Schmidhuber, J. Deep learning in neural networks: An overview. Neural Netw. **2015**, 61, 85–117.
- Wind-Generated Power Dataset in Poland. Available online: https://www.pse.pl/dane-systemowe/funkcjonowanie-kse/raporty-dobowe-z-pracy-kse/generacja-zrodel-wiatrowych (accessed on 28 August 2023).
- Tan, P.N.; Steinbach, M.; Kumar, V. Introduction to Data Mining; Pearson Education Inc.: Boston, MA, USA, 2021.
- Matlab User Handbook; MathWorks: Natick, MA, USA, 2023.
- van der Maaten, L.; Hinton, G. Visualizing Data using t-SNE. J. Mach. Learn. Res. **2008**, 9, 2579–2605. Available online: https://jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf (accessed on 6 October 2023).
- Birjandtalab, J.; Baran Pouyan, M.; Nourani, M. Nonlinear Dimension Reduction for EEG-Based Epileptic Seizure Detection. In Proceedings of the 2016 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI), Las Vegas, NV, USA, 24–27 February 2016.
- Hinton, G.; Roweis, S. Stochastic Neighbor Embedding. Neural Inf. Process. Syst. **2002**. Available online: https://cs.nyu.edu/~roweis/papers/sne_final.pdf (accessed on 20 October 2023).
- Hochreiter, S.; Bengio, Y.; Frasconi, P.; Schmidhuber, J. Gradient Flow in Recurrent Nets: The Difficulty of Learning Long-Term Dependencies. In A Field Guide to Dynamical Recurrent Networks; Wiley: Hoboken, NJ, USA, 2001.
- Ciechulski, T.; Osowski, S. High Precision LSTM Model for Short-Time Load Forecasting in Power Systems. Energies **2021**, 14, 2983.
- Jolliffe, I.T.; Cadima, J. Principal component analysis: A review and recent developments. Philos. Trans. R. Soc. A **2016**, 374, 20150202.
- Forkman, J.; Josse, J.; Piepho, H.P. Hypothesis Tests for Principal Component Analysis When Variables are Standardized. J. Agric. Biol. Environ. Stat. **2019**, 24, 289–308.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Neural Inf. Process. Syst. (NIPS) **2017**, 1–11.
- Pandey, A.; Wang, D.L. TCNN: Temporal convolutional neural network for real-time speech enhancement in the time domain. In Proceedings of the ICASSP 2019—2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019; pp. 6875–6879.

**Figure 2.** The distribution of 24-h vectors of data samples in two-dimensional space after the tSNE transformation.

**Figure 4.** The general structure of the single LSTM memory cell: c_{t} is the cell state at time t; c_{t−1} is the cell state at time t − 1; h_{t} is the hidden state at time t; h_{t−1} is the hidden state at time t − 1; sigm is the sigmoid activation function; tgh is the hyperbolic tangent activation function [29].

**Figure 5.** The diagram of the proposed weighted-average integration system using the MLP combiner. The input signals are represented by the concatenated vectors **y**_{MLP}, **y**_{RBF}, and **y**_{LSTM}, and the output nodes present the predicted vector of power in the succeeding hours of the day: P(1), P(2), …, P(24).

**Figure 6.** The predicted and true hourly samples of the generated wind power: the upper plot shows the superimposed values, and the bottom plot shows the hourly distribution of the prediction error. The results correspond to 10 days.

**Table 1.** The average value and standard deviation of the generated wind power in succeeding years.

Year | Average Power (MW) | Standard Deviation of Power (MW) |
---|---|---|
2018 | 1408.7 | 1152.9 |
2019 | 1662.3 | 1226.2 |
2020 | 1731.7 | 1296.6 |
2021 | 1739.8 | 1393.2 |

**Table 2.** The MAE, RMSE, and MAPE of the individual units in the prediction of the 24-h power pattern of the next day for the testing data.

Predictor | MAE (Mean ± Std.) (MW) | RMSE (Mean ± Std.) (MW) | MAPE (Mean ± Std.) (%) |
---|---|---|---|
MLP | 673.12 ± 9.90 | 932.43 ± 5.08 | 38.43 ± 0.12 |
RBF | 681.47 ± 11.67 | 938.93 ± 8.83 | 38.82 ± 0.30 |
LSTM | 708.55 ± 9.25 | 937.53 ± 5.24 | 38.00 ± 0.17 |

**Table 3.** The MAE, RMSE, and MAPE of an ensemble in the prediction of the 24-h power pattern of the next day for the testing data.

Integration method | MAE (Mean ± Std.) (MW) | RMSE (Mean ± Std.) (MW) | MAPE (Mean ± Std.) (%) |
---|---|---|---|
Simple averaging | 665.68 ± 6.27 | 913.62 ± 4.59 | 37.60 ± 0.19 |
MLP combiner (n_{h} = 16) | 598.45 ± 4.87 | 835.08 ± 6.23 | 34.75 ± 0.27 |
PCA integration (K = 35) | 611.38 ± 3.38 | 856.14 ± 6.17 | 35.60 ± 0.28 |

**Table 4.** The comparison of the proposed approach and the naïve methods of forecasting in terms of RELMAE, RELRMSE, and RELMAPE for 24-h-ahead forecasting.

Relative quality measure | RELMAE | RELRMSE | RELMAPE |
---|---|---|---|
Value | 0.57 | 0.58 | 0.65 |
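The relative measures above are presumably the ratios of the ensemble's errors to those of the naïve forecast, so values below 1 indicate an improvement over the naïve method. A sketch under that assumption, with made-up numbers:

```python
import numpy as np

def mae(y, yhat):  return np.mean(np.abs(y - yhat))
def rmse(y, yhat): return np.sqrt(np.mean((y - yhat) ** 2))
def mape(y, yhat): return 100.0 * np.mean(np.abs((y - yhat) / y))

def rel(measure, y, yhat_model, yhat_naive):
    """Relative measure: model error divided by naive-forecast error
    (an assumed definition; values < 1 mean the model beats the naive one)."""
    return measure(y, yhat_model) / measure(y, yhat_naive)

y          = np.array([100.0, 200.0, 150.0])   # true power values (toy)
yhat_model = np.array([110.0, 190.0, 155.0])   # ensemble forecast (toy)
yhat_naive = np.array([120.0, 180.0, 160.0])   # naive forecast (toy)
rel_mae = rel(mae, y, yhat_model, yhat_naive)
```

Here the model's absolute errors are half those of the naïve forecast, so `rel_mae` comes out to 0.5.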

**Table 5.** The MAE, RMSE, and MAPE of the individual units and an ensemble in the prediction of the 1-h-ahead power value for the testing data.

Predictor/ensemble | MAE (Mean ± Std.) (MW) | RMSE (Mean ± Std.) (MW) | MAPE (Mean ± Std.) (%) |
---|---|---|---|
MLP | 130.10 ± 0.97 | 182.67 ± 1.36 | 8.03 ± 0.06 |
RBF | 130.79 ± 2.22 | 182.59 ± 1.29 | 8.02 ± 0.06 |
LSTM | 114.49 ± 9.51 | 159.54 ± 12.74 | 6.90 ± 0.51 |
Simple averaging | 119.78 ± 1.84 | 168.32 ± 2.96 | 7.39 ± 0.13 |
MLP combiner (n_{h} = 2) | 107.71 ± 5.57 | 152.42 ± 7.70 | 6.69 ± 0.33 |
PCA integration (K = 2) | 107.70 ± 5.39 | 152.41 ± 7.61 | 6.69 ± 0.33 |

**Table 6.** The comparison of the proposed approach and the naïve methods of forecasting in terms of RELMAE, RELRMSE, and RELMAPE for 1-h-ahead forecasting.

Relative quality measure | RELMAE | RELRMSE | RELMAPE |
---|---|---|---|
Value | 0.83 | 0.84 | 0.84 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Ciechulski, T.; Osowski, S.
Wind Power Short-Term Time-Series Prediction Using an Ensemble of Neural Networks. *Energies* **2024**, *17*, 264.
https://doi.org/10.3390/en17010264
