Article

Hybrid and Ensemble Methods of Two Days Ahead Forecasts of Electric Energy Production in a Small Wind Turbine

1 Institute of Electrical Power Engineering, Warsaw University of Technology, Koszykowa 75 Street, 00-661 Warszawa, Poland
2 Globema Sp. z o. o., Wita Stwosza 22 Street, 02-661 Warsaw, Poland
* Author to whom correspondence should be addressed.
Energies 2021, 14(5), 1225; https://doi.org/10.3390/en14051225
Submission received: 22 January 2021 / Revised: 14 February 2021 / Accepted: 19 February 2021 / Published: 24 February 2021
(This article belongs to the Collection Feature Papers in Energy, Environment and Well-Being)

Abstract

The ability to forecast electricity generation for a small wind turbine is important both on a larger scale, where there are many such turbines (because they create problems for networks managed by distribution system operators), and for prosumers, to allow planning of current energy consumption. It is also important for owners of small energy systems in order to optimize the use of various energy sources and facilitate energy storage. The research presented here addresses an original, rarely predicted 48 h forecasting horizon for small wind turbines. This topic has been rather underrepresented in research, especially in comparison with forecasts for large wind farms. Wind speed forecasts with a 48 h horizon are also rarely used as input data. We have analyzed the available data to identify potentially useful explanatory variables for forecasting models. Eight sets with increasing data amounts were created to analyze the influence of the types and amounts of data on forecast quality. Hybrid, ensemble and single methods are used for predictions, including machine learning (ML) solutions like long short-term memory (LSTM), multi-layer perceptron (MLP), support vector regression (SVR) and K-nearest neighbours regression (KNNR). Original hybrid methods developed for this specific implementation, and ensemble methods based on those hybrid methods, decreased the errors of energy generation forecasts for small wind turbines in comparison with single methods. The "artificial neural network (ANN) type MLP as an integrator of ensemble based on hybrid methods" ensemble forecasting method incorporates an original combination of predictors. Predictions by this method have the lowest mean absolute error (MAE). In addition, this paper presents an original ensemble forecasting method, called "averaging ensemble based on hybrid methods without extreme forecasts". Predictions by this method have the lowest root mean square error (RMSE) among all tested methods. LSTM, a deep neural network, is the best single method, MLP is the second best one, while SVR, KNNR and, especially, linear regression (LR) perform less well. We prove that lagged values of the forecasted time series slightly increase the accuracy of predictions. The same applies to seasonal and daily variability markers. Our studies have also demonstrated that using the full set of available input data and the best proposed hybrid and ensemble methods yields the lowest error. The proposed hybrid and ensemble methods are also applicable to other short-term generation forecasting in renewable energy sources (RES), e.g., in photovoltaic (PV) systems or hydropower.

1. Introduction

Renewable energy sources have become a very important element of the energy mix in many countries. The majority of green energy is produced in large hydropower stations, wind farms and solar farms. However, a growing amount of energy is produced every year by various types of prosumer sources. A prosumer is most often perceived as a user of a photovoltaic system. Nevertheless, some prosumers use small wind turbines to produce electricity. This may be particularly appropriate where insolation is low or wind conditions are favourable. Of course, small wind turbines are not as convenient as PV systems for prosumers. Small wind turbines need more free space around them and can be problematic due to noise and vibration, which can adversely affect people and cause structural damage. That is why wind turbines are rarely installed on buildings.
Forecasting electricity generation for a small wind turbine is important both on a larger scale, where there are many such turbines (because they create problems for networks owned by distribution system operators), and for prosumers, who need it to plan their current energy consumption. It is also important for owners of small energy systems, as it facilitates the use of various energy carriers and optimizes energy storage.
The problem of forecasting electricity generation for small wind turbines seems to be as complex as forecasting for large wind farms. The following reasons can be mentioned:
  • Small wind turbines have small rotor inertia, therefore, any change of wind speed instantly affects energy generation;
  • Towers of small turbines are low and forecasting of wind speed at low altitudes is burdened with large errors;
  • Operation of small turbines can be affected by surrounding obstacles and rough terrain, and energy generation varies due to vegetation or lying snow.
The research presented in this article concerns small prosumer wind turbines. No actual wind speed data were collected, which made data analysis difficult.

1.1. Related Works

In recent decades, renewable energy sources (RES) have become an important way to address environmental concerns. RES shares have been rising at different rates across countries, e.g., for the European Union (EU) the share has tripled to 18% since 1990, while for Poland it has increased six-fold [1]. Wind energy is and will remain an important part of this equation. Since wind sources, with their intermittency, can destabilize the power system energy balance, researchers have proposed different solutions to ease the situation. One aspect is determining the maximum wind power penetration level in view of frequency response adequacy [2]. Another is providing good-quality generation predictions, which allow wind power capacity to be properly planned for and increased beyond that threshold [3].
Recent papers addressing wind power forecasts can be broadly classified into five categories: papers focused on how to increase NWP accuracy [4,5,6,7,8], good-practice prediction guidelines [9,10,11], comparisons of accuracy across prediction models [12,13,14,15], hybrid and ensemble methods [16,17,18,19,20,21,22,23,24,25,26,27], and conventional methods improved by, among other things, preprocessing [28,29,30,31,32,33,34,35]. At this point, a clear distinction should be made between hybrid, ensemble and improved models. Methods classified as hybrid consist mainly of a set of forecasting engines connected together; tools that improve the input data are not considered part of the hybrid. Improved models, on the other hand, either preprocess input data for a traditional non-hybrid model or slightly alter or extend the model itself. Unlike both approaches, ensemble methods use different models working in parallel.
Papers addressing the topic of wind forecasts improvement [4,5,6] have attempted to incorporate Doppler light detection and ranging (lidar) or sodar systems readings into verification or enhancement of wind forecasts. The presented improvement proved to be the most promising for strong, volatile winds. Other papers [7,8] propose different approaches, like Gaussian process regression [8] and combination of artificial neural networks, ensemble learning, and feature selection techniques [7].
Research in [9,10,11] focused on dealing adequately with common problems associated with forecasting and machine learning. Tawn et al., 2020 [9] presented an approach for dealing with missing data, both for operational work and for the training of models. Messner et al., 2020 [10] demonstrate forecasting verification methods better fitted to the perspective of the forecast user, while Sewdien et al., 2020 [11] assess the influence of different parameters on artificial neural network prediction accuracy.
Comparative studies [12,13,14,15] provide analyses of the forecasting quality of different single models. Mishra et al., 2020 [12] contrast deep neural networks, namely DFFNN, DCNN, RNN, AM and LSTM, both with and without preprocessing by FFT and DWT. Shetty et al., 2020 [13] confront ANN and the surface response method (RSM) with more fundamental models like cubic spline interpolation, least squares, power curve and power equation. The authors of [14] have chosen more complex methods, comparing LR with six machine learning (ML) models: multi-layer perceptron (MLP), Bayesian neural network (BNN), random forests (RF), gradient boosting trees (GBT), KNNR and SVR, while [15] compares MLP, SVR and ANFIS models. Overall winners of comparisons were AM in [12], ANN in [13] and ANFIS in [15]. In [14], the best model heavily depends on the performance indicator and no statement about any model superiority was made.
Hybridization [16,17,18,19,20,21] and parallelization [22,23,24,25,26,27] of prediction models use data refining and error compensation, respectively, as an approach to maximize prediction accuracy. The most common bases for hybrid models in the recent literature are ANNs [17,18,19,21], due to their generalization ability, while the most common hybrid add-ons are single optimization methods [16,18,20,21]. With varying implementation, error reduction can be achieved in different ways. For that purpose, Zhen et al., 2020 [17] propose a Bidirectional Long Short Term Memory–Convolutional Neural Network (BiLSTM-CNN) model for better extraction of spatio-temporal features from data and Zhou et al., 2020 [19] describe a long short-term memory–seasonal autoregressive integrated moving average (LSTM-SARIMA) model for capturing seasonality.
Ensemble models proposed in literature vary by the methods used and how output is aggregated. Studies pertaining to both switchable [22] and output-aggregated models [23,24,25,26,27] have been described in recent research papers. As for methods, researchers frequently use variations of regression trees [22,23,24,25]. Another approach is proposed in [26]. The solution suggested there is based on parallelization of stacked autoencoders for advanced feature extraction.
Papers concerning improved models are more data-focused. Durán-Rosal et al., 2018 [28] propose an algorithm for optimal reduction of the size of time series, and the authors of [29] investigate the influence of reduced numerical weather prediction (NWP) data on forecast quality. Research by Yu et al., 2020 [30] and Fan et al., 2020 [31] focuses on assimilating spatial data with graph networks, while Li et al., 2020 [32] present an adaptive time resolution method to deal with the error hidden in the data due to error averaging. Some of the papers focus on enhancing the performance of long short-term memory (LSTM) networks. Shahid et al., 2020 [33] propose preprocessing by wavelets, an approach popular in recent years, while Zhang et al., 2020 [34] modify LSTM by constructing an error-following forget gate. Other contributions focus largely on improving existing methods, e.g., kernel density estimation (KDE) [35].

1.2. Objective and Contribution

The main objectives of this paper can be summarized as follows:
  • Conduct statistical analysis of the time series of energy generated by a small wind turbine and of potential explanatory variables. Select eight input data sets to verify how the types and amounts of input data impact forecast quality;
  • Verify the accuracy of forecasts conducted by single methods, hybrid methods and ensemble methods (18 methods in total);
  • Develop and verify an original ensemble method: an averaging ensemble based on hybrid methods without extreme forecasts. Conduct an original selection of combinations of predictors for ensemble methods;
  • Indicate the most efficient forecasting methods with no historical wind speed measurements available, but with wind speed forecasts available for up to 48 h ahead.
Below are listed selected contributions of this paper:
  • This research applies to forecasting of small turbine generation using wind speeds predicted for up to 48 h. This problem has been understudied so far, especially in comparison with forecasting for large wind farms;
  • Development of an ensemble method called "artificial neural network (ANN) type MLP as an integrator of ensemble based on hybrid methods", which incorporates an original combination of predictors arrived at by testing different combinations. Predictions by this method have yielded the lowest mean absolute error (MAE) among the tested methods;
  • Development of an original method called “Averaging ensemble based on hybrid methods without extreme forecasts”. Predictions by this method have yielded the lowest root mean square error (RMSE) among the tested methods;
  • Completion of an extensive scenario analysis taking into account different degrees of data availability and model complexities. In total, more than 100 models with different parameters/hyperparameters were tested to choose an optimal model for this complex predictive problem.
The research carried out and the methods developed here contribute significantly to the topic of small wind turbine generation prediction, especially for a 48 h time horizon with no historical wind speed measurements but with wind speed forecasts up to 48 h ahead available instead. In practice, it is impossible to obtain acceptable generation forecasts for a 48 h horizon without using wind speed forecasts as input data. Nevertheless, using other data as additional explanatory variables has further reduced the error.
The remainder of this paper is organized as follows: Section 2.1 presents statistical analysis of the time series of small wind turbine hourly generation and wind speed forecasts. The process of input data selection for particular forecasting methods is described in Section 2.2, and Section 3 specifies the forecasting methods used in the paper. Evaluation criteria used for the assessment of forecasting quality are presented in Section 4, followed by a many-sided analysis of the obtained results and their discussion in Section 5. Finally, the main conclusions of our studies are summarized in Section 6, and references are listed at the end of this paper.

2. Data

2.1. Statistical Analysis of Data

Data acquired from a small prosumer turbine located in the south of Poland were used for statistical analyses (exact location is confidential). The location is characterized by rather low wind speeds. Rated power of the turbine was 5 kW, maximum power 7.5 kW, and tower height 13 m. The turbine’s cut-in speed was 2.5 m/s, and the cut-out speed was 25 m/s.
The collected data consisted of hourly wind turbine power output and wind speed forecasts with a horizon from 1 to 48 h. Measurements of actual wind speeds were not available. The data covered almost 2 years (22.5 months), from January of one year to 14 November of the following year, with exact date and time range being confidential. The time series consisted of 16,392 consecutive hours.

2.1.1. Statistical Analysis of Time Series of Hourly Wind Turbine Generations

Table 1 shows descriptive statistics for time series of hourly electric energy generated by the wind turbine considered here.
The analysis of electric energy generation percentiles shows that 0 values made up more than 30% of the time series samples. Usually, energy generation was within the 0–1 kWh range (47.35% of the samples). Values closest to maximum power ranged from 6 to 7 kWh and made up slightly more than 0.1% of the samples. The percentage distribution of electric energy generation is shown in Figure 1.
Autocorrelation function (ACF) analysis of the hourly generation time series shows slight daily periodicity, with autocorrelation increasing for lags equal to multiples of 24 h. Autocorrelation rapidly decreases over the consecutive hours of the first day. Details are presented in Figure 2. For the time series analyzed here, all autocorrelation coefficients are statistically significant (5% significance level) up to 7 days back (168 prior observations).
To check for daily variability of energy, hourly averages of generation were calculated for each hour of the day, based on the available 2-year dataset. Calculation results are shown in Figure 3. The analysis has demonstrated high daily variability of electric energy production. The ratio of generation for the period between 11:00 and 12:00 (maximum production) and generation for the period between 1:00 and 2:00 (minimum production) was 2.06.
To check for annual variability of energy, monthly averages of generation were calculated for each month of the year, based on the available 2-year dataset. Calculation results are shown in Figure 4.
The analysis demonstrated strong seasonality of energy generation, with the highest production in winter months and lowest in summer ones. For both of the analyzed years, peak production was in March, which is an unusual, probably accidental phenomenon.

2.1.2. Statistical Analysis of Wind Speed Forecasts with Horizon from 1 to 48 h

Two days ahead wind speed forecasts were generated once per day, at 00:00 UTC. NWP unified model (UM) with 4 km × 4 km grid was used as the data source.
Wind speed forecasts for the period from m-1 to m refer to mean forecasts for hours m-1 and m. Descriptive statistics for the time series of forecasts with horizons from 1 to 24 h and forecasts with horizons from 25 to 48 h are shown in Table 2. For both horizons, means and variances were at the same level. For the 1–24 h horizon, mean, median, minimum and maximum wind speed forecasts were slightly higher.

2.2. Selection of Input Data for Particular Forecasting Methods

For input data selection, the Pearson linear correlation coefficient (R) was calculated between energy generation and potential explanatory variables. The analysis omitted test-range data, and the results from Section 2.1 were used for preliminary selection. Lagged inputs were selected by relying on the ACF results and choosing 00:00 UTC of the following days as the starting point for 48 h generation forecasts, with horizons from 1 to 48 h. Table 3 describes the results of the R calculations and the codes of input variables selected for use in explanatory data sets. All R values were statistically significant (5% significance level). R values of wind speed forecasts for the second day ahead were lower than for one day ahead, which suggests that wind speed forecasting accuracy decreases with increasing horizon. Wind speed was the most important input variable, with the highest R value. Input variables coded EN_L (different lag variants) are given in decomposed form. Separate results are given for horizons 1–24 h and 25–48 h for better visualization.
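As an illustration, the sketch below computes R and its significance for candidate explanatory variables; the variable names and placeholder arrays are assumptions, not the paper's data.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
y = rng.random(8760)                                   # placeholder hourly generation
candidates = {"wind_speed_fcst": rng.random(8760),     # placeholder explanatory variables
              "EN_L24": rng.random(8760)}

# Keep a candidate when its correlation with generation is significant at the
# 5% level, mirroring the selection criterion described above.
for name, x in candidates.items():
    r, p = pearsonr(x, y)
    print(f"{name}: R = {r:.3f}, significant at 5% = {p < 0.05}")
```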
Figure 5 illustrates the relationship between per-unit values of hourly electric energy generation and wind speed forecasts. Data were normalized to the [0, 1] range and sorted in ascending order by energy production. Figure 5 demonstrates that for some production data with zero value, NWP-UM wind speed forecasts had non-zero values, with some wind speeds above the cut-in speed of the analyzed wind turbine.
A dispersion diagram of the relationship between wind speed forecasts and actual energy generation [p.u.] is presented in Figure 6. The diagram demonstrates a non-linear relationship between the wind speed vector's module and the production of electric energy. Data concentration is low, hence a power curve typical of a single wind turbine is not clearly visible. This is mostly due to the differences between forecast and actual wind speeds. Extreme outliers were treated as unreliable data and were removed.
Eight different sets of input data with different information potential were proposed for the forecasts in order to analyze differences in forecast quality between the sets. Table 4 presents the selected sets of input data for the forecasting methods described in Section 3. Global sensitivity analysis in the MLP model was used to eliminate unnecessary input data in a given set (from set 4 to set 8).

3. Forecasting Methods

This section describes the methods employed in this paper. Forecasts are made using single methods based only on the forecasted time series, other single methods that also use additional exogenous variables (including machine learning methods), and the most advanced (complex) hybrid and ensemble methods. A brief description of the proposed methods is given below. The naive model and the naive smoothing model serve as benchmarks for the quality of the other, more advanced forecasting methods.
Naive model (method code NAIVE). The naive model, which is the simplest to implement, assumes that forecast generation values are identical to the actual energy generation values for the last known period lagged by a multiple of 24 h. An unquestionable advantage of such a model is its simplicity and ability to take daily and seasonal variabilities of energy generation into account. Forecasts are calculated by Formula (1):
$\hat{y}_t = y_{t-24n}$ (1)
where $\hat{y}_t$ is the forecast electric energy generated by the wind turbine for hour t, $y_{t-24n}$ is the energy generation for period t − 24·n (i.e., lagged by 24·n hours from forecast period t), n = 1 for forecasting horizons from 1 to 24 h and n = 2 for horizons from 25 to 48 h.
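A minimal sketch of Formula (1), assuming the hourly generation history is held in a NumPy array whose last element is the last known hour (the forecast origin):

```python
import numpy as np

def naive_forecast(history: np.ndarray, horizon: int) -> float:
    """NAIVE model, Formula (1): y_hat(t) = y(t - 24*n) for horizons 1..48 h."""
    n = 1 if horizon <= 24 else 2                   # n = 1 for day 1, n = 2 for day 2
    return float(history[horizon - 24 * n - 1])     # element 24*n - horizon hours back
```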
Naive smoothing model (method code SMOOTHING). The naive model with smoothing assumes that the forecast energy generation is the same as the "smoothened" generation for the last known period lagged by a multiple of 24 h. The "smoothened" production from the last known period is calculated as a weighted average, with a weight of 0.5 for production from t−24·n and weights of 0.25 for productions from t−24·n−1 and t−24·n+1, as given by Formula (2):
$\hat{y}_t = 0.5\,y_{t-24n} + 0.25\,y_{t-24n-1} + 0.25\,y_{t-24n+1}$ (2)
where $\hat{y}_t$ is the energy generation forecast for hour t, n = 1 for horizons from 1 to 24 h and n = 2 for horizons from 25 to 48 h.
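A sketch of Formula (2) under the same array layout as above; falling back to the plain naive value when the t−24·n+1 observation is not yet available is an added assumption:

```python
import numpy as np

def smoothed_naive_forecast(history: np.ndarray, horizon: int) -> float:
    """SMOOTHING model, Formula (2): 0.25/0.5/0.25 weighted average around y(t - 24*n)."""
    n = 1 if horizon <= 24 else 2
    c = len(history) + horizon - 24 * n - 1        # index of y(t - 24*n)
    if c + 1 >= len(history):                      # y(t - 24*n + 1) not observed yet
        return float(history[c])                   # fall back to the plain naive value
    return float(0.5 * history[c] + 0.25 * history[c - 1] + 0.25 * history[c + 1])
```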
Physical model (method code PHM). In this forecasting model, generated hourly energy is a function of the forecast wind speed. The function is a 4th-order polynomial, described by Formula (3). Catalogue data of the wind turbine power curve were used to develop the function equation, with data points corresponding to powers for 3–25 m/s wind speeds, in 1 m/s steps. Second-, third- and fourth-order polynomials were tested, and the 4th-order function was chosen due to the highest coefficient of determination R2, equal to 0.9964.
$\hat{y}_t(v_t) = 0.0005\,v_t^4 - 0.0285\,v_t^3 + 0.4992\,v_t^2 - 2.5708\,v_t + 4.3363$ (3)
where $\hat{y}_t(v_t)$ is the energy generation forecast for period t, and $v_t$ is the wind speed forecast for t.
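Formula (3) can be transcribed directly; clipping to zero outside the cut-in/cut-out range and at negative values is an added assumption rather than something stated in the paper:

```python
import numpy as np

def phm_forecast(v: np.ndarray) -> np.ndarray:
    """Physical model (PHM), Formula (3): hourly energy from forecast wind speed v [m/s]."""
    v = np.asarray(v, dtype=float)
    y = 0.0005 * v**4 - 0.0285 * v**3 + 0.4992 * v**2 - 2.5708 * v + 4.3363
    y = np.where((v < 2.5) | (v >= 25.0), 0.0, y)   # outside the operating range (assumed)
    return np.clip(y, 0.0, None)                    # no negative generation (assumed)
```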
Multiple Linear regression model (method code LR). It is a linear model that assumes a linear relationship between the input variables and the single output variable [36]. The input data are selected lags of the forecasted output variable and other input explanatory variables correlated to the output variable. The model is fitted using the least squares approach.
K-Nearest Neighbours Regression (method code KNNR). This algorithm is a non-parametric method used for classification and regression [37]. The input consists of the k closest training examples in the feature space. In KNN regression, the output is the property value for the object, calculated as the average of the values of its k nearest neighbours. The main hyperparameter to tune is the number of nearest neighbours; the distance metric is the second hyperparameter.
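A scikit-learn sketch of the KNNR setup, tuning k and the distance metric over ranges similar to those in Table A1; the placeholder arrays stand in for the real input sets:

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X_train, y_train = rng.random((500, 6)), rng.random(500)   # placeholders for the input set

knnr = GridSearchCV(
    KNeighborsRegressor(),
    param_grid={"n_neighbors": list(range(1, 201, 5)),      # k (Table A1 searches 1-200)
                "metric": ["euclidean", "manhattan", "minkowski"]},
    scoring="neg_mean_absolute_error",
)
knnr.fit(X_train, y_train)
print(knnr.best_params_)
```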
Support Vector Regression (method code SVR). SVM for regression with the Gaussian kernel (non-linear regression) transforms the classification task into a regression task by defining an ε-wide tolerance region around the target [38]. The learning task reduces to a quadratic optimization problem and depends on a few hyperparameters: the regularization constant C, the width parameter σ of the Gaussian kernel and the tolerance ε.
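A scikit-learn sketch of Gaussian-kernel SVR with a coarse search over C and ε (Table A1 uses finer steps); note that scikit-learn parameterizes the kernel width as gamma rather than σ, and the conversion used here is an assumption:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X_train, y_train = rng.random((500, 6)), rng.random(500)      # placeholders for the input set

svr = GridSearchCV(
    SVR(kernel="rbf", gamma=1.0 / (2 * 0.333**2)),             # sigma = 0.333 -> gamma (assumed mapping)
    param_grid={"C": list(range(1, 21, 2)),                    # Table A1: 1-20, step 1
                "epsilon": list(np.arange(0.05, 2.05, 0.25))}, # Table A1: 0.05-2, step 0.05
    scoring="neg_mean_absolute_error",
)
svr.fit(X_train, y_train)
print(svr.best_params_)
```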
Artificial neural network, type MLP. This is a class of feedforward ANNs. MLP is a popular and effective non-linear or linear (depending on the type of activation function in the hidden layer(s) and output layer) global approximator [29,36,39]. It consists of a single input layer, typically one or two hidden layers and one output layer, and uses the backpropagation algorithm for supervised learning. The main hyperparameter to tune is the number of neurons in the hidden layer(s). Two MLP models differing by the optimization algorithm are used for the forecasts (a minimal sketch follows the list below).
  • The BFGS method utilized for solving unconstrained non-linear optimization problems was chosen as a learning algorithm of a neural network (method code MLPBFGS).
  • The particle swarm optimization (PSO) method was used to determine MLP weights (method code MLPPSO). This hybrid combination of PSO and MLP has been implemented in an original computer program by one of the authors. The following tuning hyperparameters were investigated in our research: number of neurons in hidden layers (5, 8, 10, 20), number of particles in the swarm (20, 50, 100, 150), number of algorithm iterations (100, 1000), coefficients in the formula for particle movements (0.7, 1.0, 1.4), number of iterations between subsequent disturbances in the swarm (50, 100, 400), and neighborhood width (2, 5, 10). The optimization concept is presented in Figure 7 and Figure 8.
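A rough scikit-learn analogue of the MLPBFGS variant (scikit-learn exposes an L-BFGS solver rather than plain BFGS); the MLPPSO variant relies on the authors' own PSO trainer and has no off-the-shelf equivalent, so it is not sketched here:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_train, y_train = rng.random((500, 10)), rng.random(500)   # placeholders for the input set

mlp = MLPRegressor(hidden_layer_sizes=(13,),   # Table A1 selects 13 hidden neurons for PHM+MLPBFGS
                   activation="tanh", solver="lbfgs",
                   max_iter=2000, random_state=0)
mlp.fit(X_train, y_train)
```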
Deep neural network type LSTM (method code LSTM). This is a type of recurrent ANN. Internal modules different from those of traditional RNNs allow LSTM to avoid problems with long-term dependencies [3,24] and with gradient explosion and vanishing [8,10]. Typically, an LSTM network consists of one input layer, one or two LSTM layers, and a dense output layer. Hidden LSTM layers consist of neurons in which an input gate, a forget gate and an output gate are responsible for selective control of information [24,25]; detailed information about the algorithms incorporated in LSTM networks is presented in those papers. Between layers, dropout layers can be used to prevent model overfitting [3,8]. The principle of this mechanism is to retain a given node with a probability drawn from a Bernoulli distribution and drop it from the network with the complementary probability [40]. The main settable hyperparameters of the LSTM model are the number of hidden layers, the number of neurons in them and the activation function of each layer, but also the batch size, the number of training epochs and the dropout degree. Besides those, various model optimizers like AdaGrad, RMSProp or ADAM can be used [3,10,24]. For optimal LSTM model selection, ReLU/ELU/PReLU/LeakyReLU/sigmoid/tanh activation functions were used for hidden layers, and sigmoid/linear/tanh/ReLU were used for the output layer. The tested networks had one or two hidden layers with different combinations of neurons in them. Data shuffling and patience mechanisms were used. Networks were trained for 2000 epochs, with patience from 100 to 500, by the ADAM optimizer with lr = 0.001 and decay = 1 × 10−5. To decrease computation time, a batch size of 128 was used.
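A Keras sketch of an LSTM configuration in the spirit of the setup above (two hidden LSTM layers of 5 neurons with ReLU, tanh output, ADAM with lr = 0.001, early stopping, batch size 128); the input shape, dropout rate and placeholder data are assumptions, and the paper's decay = 1e-5 setting is omitted because its argument name depends on the Keras version:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, callbacks

rng = np.random.default_rng(0)
X_train = rng.random((500, 24, 3)).astype("float32")   # placeholders: (samples, time steps, features)
y_train = rng.random(500).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(24, 3)),
    layers.LSTM(5, activation="relu", return_sequences=True),
    layers.Dropout(0.2),                                # dropout degree is an assumption
    layers.LSTM(5, activation="relu"),
    layers.Dense(1, activation="tanh"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss="mae")
model.fit(X_train, y_train, epochs=2000, batch_size=128, shuffle=True,
          validation_split=0.15,
          callbacks=[callbacks.EarlyStopping(patience=100, restore_best_weights=True)])
```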
Hybrid methods. Hybrid methods consist of the physical model (PHM) and a single method connected in series. In these methods, information flows between two models. The first model receives forecast wind speeds (set 5) as input data. The second model receives not only the output of the first model (a generation forecast), but also endogenous and exogenous input data (set 7). Both sets together form set 8. The concept of hybrid methods is illustrated in Figure 9 and used for the following pairs of single methods connected in series (a minimal sketch of the two-stage structure follows the list):
  • Physical model and multiple linear regression model (method code PHM+LR),
  • Physical model and K-nearest neighbours regression (method code PHM+KNNR),
  • Physical model and support vector regression (method code PHM+SVR),
  • Physical model and deep neural network type LSTM (method code PHM+LSTM),
  • Physical model and artificial neural network type MLP (method codes: PHM+MLPBFGS and PHM+MLPPSO).
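A sketch of the two-stage hybrid structure, with an MLP standing in for the second-stage model (i.e., PHM+MLP); the placeholder arrays and variable names are assumptions:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
v_fcst = rng.uniform(0.0, 12.0, 500)            # placeholder wind speed forecasts (set 5)
X_other = rng.random((500, 8))                  # placeholder endogenous/exogenous inputs (set 7)
y = rng.random(500)                             # placeholder hourly generation

# Stage 1: physical model (Formula (3)) turns wind speed forecasts into a first estimate.
y_phm = 0.0005*v_fcst**4 - 0.0285*v_fcst**3 + 0.4992*v_fcst**2 - 2.5708*v_fcst + 4.3363
y_phm = np.where((v_fcst < 2.5) | (v_fcst >= 25.0), 0.0, np.clip(y_phm, 0.0, None))

# Stage 2: the PHM output becomes an extra input column for the second model (set 7 + PHM = set 8).
X_hybrid = np.column_stack([X_other, y_phm])
stage2 = MLPRegressor(hidden_layer_sizes=(13,), activation="tanh",
                      solver="lbfgs", max_iter=2000, random_state=0)
stage2.fit(X_hybrid, y)
```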
Ensemble methods based on hybrid methods. This category of methods uses more than one individual predictor and is supported by a simple or more complex system integrating the individual forecasts. The general scheme of an ensemble of predictors based on hybrid methods is presented in Figure 10. The simplest integration system is weighted averaging of the individual predictors, and the most advanced integration system is an ANN.
Four types of integration system were used for forecasts using “ensemble methods based on hybrid methods”:
(1) Averaging ensemble based on hybrid methods (method code AVE_INT). This integrator combines the results of selected predictors into the final verdict of the ensemble. The final forecast is defined as the average of the results produced by all s hybrid predictors organized in the ensemble [38]. The final prediction result is calculated by Formula (4). Averaging exploits the stochastic distribution of prediction errors and is an established way of reducing the variance of forecast errors. Two main strategies of predictor choice were tested. An important condition for including a predictor in the ensemble is mutually independent operation and a similar level of prediction error [38]. Hybrid predictors are included in the ensemble based on the smallest MAE errors on the validation subset, and only predictors of different types are included.
$\hat{y}_i = \frac{1}{s}\sum_{j=1}^{s} \hat{y}_{ij}$ (4)
where i is the prediction point, $\hat{y}_i$ is the final predicted value, $\hat{y}_{ij}$ is the value predicted by hybrid predictor number j, and s is the number of hybrid predictors in the ensemble.
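Formula (4) in code, assuming the individual hybrid forecasts are stacked row-wise in a NumPy array:

```python
import numpy as np

def averaging_ensemble(preds: np.ndarray) -> np.ndarray:
    """AVE_INT, Formula (4): plain average over the s hybrid predictors.
    `preds` has shape (s, n_hours), one row per hybrid predictor."""
    return preds.mean(axis=0)
```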
(2) An ensemble based on hybrid methods with weight optimization for each predictor (method code W_OPT_INT). Each hybrid predictor in the ensemble has an individual weight. The optimal weights are calculated with the social cognitive optimization (SCO) method on the validation data (forecasts from each predictor in the ensemble are the input data), minimizing the MAE of the final forecasts (the output data). SCO is a population-based metaheuristic optimization algorithm developed in 2002 [41] and based on social cognitive theory. Its key mechanism is the individual learning of a set of agents, each with its own memory, combined with social learning from the knowledge points in a shared social library. The final prediction result is calculated by Formula (5). For weight optimization by SCO, no limits were set on the weight ranges or their sum.
$\hat{y}_i = w_1\,\hat{y}_{i1} + w_2\,\hat{y}_{i2} + \ldots + w_s\,\hat{y}_{is}$ (5)
where i is the prediction point, $\hat{y}_i$ is the final predicted value, $\hat{y}_{is}$ is the value predicted by hybrid predictor number s, $w_s$ is the weight associated with the forecast from hybrid predictor number s, and s is the number of hybrid predictors in the ensemble.
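A sketch of the W_OPT_INT objective: weights chosen to minimize validation MAE, with no bounds on the weights or their sum. The paper optimizes with SCO; here SciPy's Nelder-Mead search stands in purely to illustrate the objective, and the placeholder arrays are assumptions:

```python
import numpy as np
from scipy.optimize import minimize

def fit_ensemble_weights(val_preds: np.ndarray, y_val: np.ndarray) -> np.ndarray:
    """Weights for Formula (5), minimizing MAE on the validation set.
    `val_preds` has shape (s, n_val_hours)."""
    s = val_preds.shape[0]
    objective = lambda w: np.mean(np.abs(y_val - w @ val_preds))   # validation MAE
    return minimize(objective, x0=np.full(s, 1.0 / s), method="Nelder-Mead").x

rng = np.random.default_rng(0)
val_preds, y_val = rng.random((4, 200)), rng.random(200)   # placeholders
w = fit_ensemble_weights(val_preds, y_val)                 # final forecast: w @ test_preds
```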
(3) Averaging ensemble based on hybrid methods without extreme forecasts (method code MIN&MAX_SKIP). This method removes the minimum and maximum forecasts from the set of hybrid predictor outputs before each calculation of a single final forecast, which is the average of the forecasts from the remaining hybrid predictors. For a 48 h horizon, the removal is done 48 times, separately for each forecast. Such a procedure should, in theory, decrease prediction errors (MAE and RMSE) and increase the value of R. An important condition for including a predictor in the ensemble is mutually independent operation and a similar level of prediction error. The final prediction result is calculated by Formula (6).
$\hat{y}_i = \frac{1}{s-2}\left(\sum_{k=1}^{s}\hat{y}_{ik} - \min_k\{\hat{y}_{ik}\} - \max_k\{\hat{y}_{ik}\}\right)$ (6)
where i is the prediction point, $\hat{y}_i$ is the final predicted value, $\hat{y}_{ik}$ is the value predicted by hybrid predictor number k, and s is the number of hybrid predictors in the primary ensemble before the elimination of the predictors yielding extreme forecasts from the set of results.
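Formula (6) in code, again assuming row-stacked hybrid forecasts; at least three predictors are required:

```python
import numpy as np

def trimmed_average_ensemble(preds: np.ndarray) -> np.ndarray:
    """MIN&MAX_SKIP, Formula (6): for every forecast hour, drop the lowest and the
    highest of the s hybrid forecasts and average the remaining s - 2.
    `preds` has shape (s, n_hours) with s >= 3."""
    s = preds.shape[0]
    return (preds.sum(axis=0) - preds.min(axis=0) - preds.max(axis=0)) / (s - 2)
```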
(4) ANN type MLP as an integrator of an ensemble based on hybrid methods (method code MLP_INT). This integrator combines the results of selected predictors into the final verdict of the ensemble using the MLP model. Finally, four hybrid predictors are chosen for the ensemble based on the smallest MAE errors on the validation subset, and only predictors of differing types are used. The MLP integrator uses forecasts from the individual hybrid predictors as input data, and the actual value of electric energy production from the wind turbine is the output. The training dataset is used for training the MLP integrator and the validation dataset for MAE control and hyperparameter tuning. Finally, the evaluation criteria are checked on the test data set. The general structure of MLP as an integrator of ensembles is presented in Figure 11.
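An MLP_INT sketch following the hyperparameters reported in Table A1 (one hidden layer of 4 tanh neurons, linear output, (L-)BFGS training); the placeholder arrays stand in for the hybrid forecasts and observed generation:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
train_preds, y_train = rng.random((500, 3)), rng.random(500)   # rows = hours, columns = hybrid forecasts
test_preds = rng.random((48, 3))                               # forecasts for the 48 h horizon

integrator = MLPRegressor(hidden_layer_sizes=(4,), activation="tanh",
                          solver="lbfgs", max_iter=2000, random_state=0)
integrator.fit(train_preds, y_train)     # learn how to combine the hybrid forecasts
y_hat = integrator.predict(test_preds)   # final ensemble forecast
```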
Different sets of explanatory variables with different information potential were used for forecasts by single methods. The least information was contained in the sets of variables that use only selected lagged variables of the forecasted energy generation time series (sets 1, 2 and 3). The largest explanatory data set (set 8) was used for predictions by the hybrid and ensemble methods. Table 5 shows the input data sets tested for each method. One reason for organizing the data into such sets was to verify the influence of the type and number of variables on forecast accuracy.

4. Evaluation Criteria

For the performance tests of the methods, six evaluation criteria (measures of errors) are used, including RMSE, MAE, Pearson linear correlation coefficient (R), mean bias error (MBE), 75th percentile of the absolute errors (AE) and 99th percentile of the AE.
Mean absolute error is calculated by Formula (7):
$MAE = \frac{1}{n}\sum_{i=1}^{n}\left| y_i - \hat{y}_i \right|$ (7)
In the process of forecasting of wind turbine electric energy production, changes of RMSE and MAE follow the same trend, and the smaller the two errors, the more accurate the prediction results. MAE is related to the first order of error moment while RMSE is related to the second order.
Root mean square error, which is sensitive to large errors, is calculated by Formula (8):
$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left( y_i - \hat{y}_i \right)^2}$ (8)
where $\hat{y}_i$ is the predicted value, $y_i$ is the actual value, and n is the number of prediction points.
Pearson linear correlation coefficient of the observed and predicted data is calculated by Formula (9):
$R = \frac{C_{y\hat{y}}}{std(y)\, std(\hat{y})}$ (9)
where $C_{y\hat{y}}$ is the covariance between the actually observed and predicted data and $std$ denotes the standard deviation of the appropriate variable.
The larger the value of R (which ranges from −1 to 1), the more accurate the prediction results.
Mean bias error (MBE) captures the average bias in the prediction and is calculated by Formula (10):
$MBE = \frac{1}{n}\sum_{i=1}^{n}\left( y_i - \hat{y}_i \right)$ (10)
The value of single i-th Absolute Error (AE) needed for calculation of percentiles of AE errors is calculated by Formula (11):
$AE_i = \left| y_i - \hat{y}_i \right|$ (11)
The 75th percentile (PCTL75AE) is the value of AE error below which there are 75% of all AE errors, and it indicates very well the density of AE errors.
Similarly, the 99th percentile (PCTL99AE) is the value of AE error below which there are 99% of all AE errors, and it indicates very well the level of the biggest AE errors.
Measures of errors (MAE, RMSE) are the basic measures used to evaluate the accuracy of the proposed models, while the other measures (R, MBE, PCTL75AE and PCTL99AE) are auxiliary.
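The six criteria are direct transcriptions of Formulas (7)-(11) plus R; a compact helper, assuming the actual and predicted values are given as NumPy arrays:

```python
import numpy as np

def evaluate(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """Evaluation criteria used in the paper: MAE, RMSE, R, MBE and AE percentiles."""
    ae = np.abs(y_true - y_pred)                                  # Formula (11)
    return {
        "MAE": float(ae.mean()),                                  # Formula (7)
        "RMSE": float(np.sqrt(np.mean((y_true - y_pred) ** 2))),  # Formula (8)
        "R": float(np.corrcoef(y_true, y_pred)[0, 1]),            # Formula (9)
        "MBE": float(np.mean(y_true - y_pred)),                   # Formula (10)
        "PCTL75AE": float(np.percentile(ae, 75)),
        "PCTL99AE": float(np.percentile(ae, 99)),
    }
```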

5. Results and Discussion

Predictions were conducted sequentially, from single methods with a limited number of input variables to hybrid and ensemble methods using all selected input variables (sets 7 and 8). Such a procedure allows us to observe differences in the quality of results depending on the complexity of particular methods and the range of input variables used.
Data available from the 2-year period were divided into training, validation and test sets. Eighty-five percent of the first year’s data were used as the training data set and the remaining 15% constituted the validation set.
Data for both data sets were chosen at random from the first year’s data set. The second year’s data set constituted the test set used for one-time final evaluation of the quality of specific prediction methods on data containing all seasons.
Table 6 provides the classification of forecasts by the range of input variables used, with class number increasing with the range of data used. It should be noted that no wind speed measurements were acquired, so they could not be used in the forecasting process, which is not unusual for small wind turbines. On the other hand, wind speed forecasts can be purchased, and they are the most important explanatory variable.
Forecasts of Class 1 have practical use where the wind turbine has been installed relatively recently and wind speed forecasts are unavailable. In particular, the NAIVE and SMOOTHING models do not require energy production time series reaching back more than 48 h. Class 1 forecasts are predictions based only on the time series of the forecasted energy production process, hence their accuracy is low. Class 2 forecasts can be used if seasonal and daily variability markers have been calculated (if the time series covers at least 1 year) and wind speed forecasts are not used due to data acquisition cost. Class 3 tests use wind speed predictions and check whether lagged energy production can be included in the inputs without decreasing forecasting accuracy; these tests verify whether collecting energy production time series is reasonable. Class 4 represents forecasts using all available input data, while Class 5 additionally uses the electric energy production forecast as the output of the physical model and an input to another single method, which makes it a hybrid structure composed of two single methods connected in series.
The performance indicators on the test subset are presented separately for each class in Table 7, Table 8, Table 9, Table 10, Table 11 and Table 12. Results are sorted in descending order by MAE values. Best result for each quality measure is bold-faced. A detailed description of the results of hyperparameters tuning for tested hybrid and ensemble methods (Table 11 and Table 12) is provided in Table A1 of Appendix A.
Based on the results from Table 7, it is possible to draw the following partial conclusions which concern proposed single methods with input data class 1:
  • Regarding MAE, qualitative differences between the best two methods (LSTM and MLPPSO) are very small;
  • The two clearly worst methods in terms of MAE are NAIVE (the reference method) and SMOOTHING;
  • Regarding performance measures for the MLP model, PSO is clearly superior to BFGS as a weight optimization method;
  • Values of R are very small and similar for all seven methods, which clearly indicates that single methods with class 1 input data are of little value for the forecasting of wind turbine generation.
Based on the results from Table 8, the following partial conclusions can be drawn concerning the proposed single methods with input data class 2:
  • Regarding MAE, qualitative differences between the three best methods (LSTM, LR and MLPPSO) are very small; linear method LR has surprisingly ranked second-best,
  • The two clearly worst methods in terms of MAE are SVR and MLPBFGS,
  • Regarding performance measures, for the MLP model, PSO is clearly superior to BFGS as a weight optimization method,
  • Values of R are very small and similar for all five methods, which clearly indicates that single methods with class 2 input data are of little value for the forecasting of wind turbine generation.
Based on the results from Table 9, the following partial conclusions can be drawn, regarding the proposed single methods with input data class 3:
  • MAE, RMSE and R have demonstrated clear improvement as compared to the results of Class 2. The reason is that Class 3 uses wind speed forecasts, the most important explanatory variable, as an input;
  • Regarding MAE, qualitative differences between the best three methods (LSTM, MLPPSO and MLPBFGS) are very small,
  • In terms of MAE value, PHM is clearly the worst method. This is due to using only wind speed forecasts as input data (set 5),
  • The rank of linear method LR by MAE has notably decreased. It is the second worst method in the ranking, so it can be concluded that it is of little value as compared to non-linear methods when wind speed forecasts are included in input data.
Based on the results from Table 10, the following partial conclusions can be drawn regarding the proposed single methods with input data class 4:
  • Performance measures have slightly improved as compared to the results of Class 3. This is due to additional use of selected, previously observed values of forecasted time series;
  • Taking into account MAE value, qualitative differences among the three best methods (LSTM, MLPPSO and MLPBFGS) are very small;
  • MLPBFGS method deserves attention as its RMSE is not only the lowest, but also visibly less than for the LSTM method with the lowest MAE;
  • The two clearly worst methods in terms of MAE value are SVR and LR. In particular, linear method LR seems to be of little value.
Based on the results from Table 11, the following partial conclusions can be drawn regarding the proposed hybrid methods with input data class 5:
  • Performance measures slightly improved in comparison with results of Class 4. This is due to using a hybrid method—the energy generation forecast from PHM model is additional input data for another, more advanced model;
  • Taking into account MAE value, qualitative differences among the two best methods (PHM+LSTM and PHM+MLPPSO) are very small,
  • The PHM+MLPBFGS method deserves attention as its RMSE is the lowest,
  • The two clearly worst methods in terms of MAE value are PHM+SVR and PHM+LR. In particular, linear method LR seems to be of little value.
Based on the results from Table 12, the following partial conclusions can be drawn regarding the proposed ensemble methods based on hybrid methods with input data class 5:
  • Performance measures have slightly improved as compared to the results of hybrid methods with input data of Class 5. This is due to the use of different integration systems to achieve the final forecast with the use of selected hybrid methods;
  • Taking into account MAE value, qualitative differences among ensemble methods based on hybrid methods are quite small except the last method MIN&MAX_SKIP [PHM+MLPPSO, PHM+LR, PHM+LSTM, PHM+SVR, PHM+KNNR] using 5 hybrid methods;
  • Special attention should be paid to MIN&MAX_SKIP [PHM+MLPPSO, PHM+KNNR, PHM+LSTM, PHM+SVR], which uses 4 instead of 5 hybrid methods (PHM+LR removed from the ensemble). Its MAE is similar to that of the best method in the ranking, MLP_INT [PHM+LSTM, PHM+MLPPSO, PHM+KNNR]. At the same time, it features the lowest RMSE, PCTL75AE and PCTL99AE and the highest value of R.
Table 13 and Figure 12 provide a collective set of results from Table 7, Table 8, Table 9, Table 10, Table 11 and Table 12, made by choosing best MAE result from each table. The same, but with RMSE, applies to Table 14 and Figure 13. Moreover, Table 13 and Table 14 contain percentage differences between various methods in relation to the method with the best value of the given error measure. A percentage value with the positive sign means a difference in favor of the given method.
Figure 14 provides a forecast from 1 to 48 h made using the method MLP_INT [PHM+LSTM, PHM+MLPPSO, PHM+KNNR] for two consecutive days of an autumn month (October). The forecast curve smoothing effect is clearly visible. Figure 15 demonstrates a forecast from 1 to 48 h made using the same method for two consecutive days of a spring month (April). For the spring forecast, the curve smoothing effect is slightly less visible and the prediction is less accurate, particularly for the second day, which is normal, because the accuracy of the wind speed forecasts used as input data decreases with increasing time horizon.

6. Conclusions

The results of the study can be summarized as follows:
(1)
Original hybrid methods and ensemble methods based on hybrid methods, developed for researching specific implementations, have reduced errors of energy generation forecasts for a small wind turbine as compared to single methods.
(2)
The best integration system for ensemble methods based on hybrid methods for the accuracy measures RMSE, R, PCTL75AE and PCTL99AE is a new, original integrator developed for these predictions, called "averaging ensemble based on hybrid methods without extreme forecasts" (method code MIN&MAX_SKIP), with 3 hybrid methods in the ensemble. This method is notable for its simplicity, especially in contrast with the MLP integrator, which requires the tuning of parameters and hyperparameters.
(3)
The best integration system in ensemble methods based on hybrid methods for accuracy measure MAE is the MLP integrator.
(4)
“An ensemble based on hybrid methods with weight optimization for each predictor” performs better than the method with equal weights for each predictor.
(5)
Our research has demonstrated the merits of using ensemble methods based on hybrid methods instead of hybrid methods. Especially, high accuracy gain was achieved as compared to single methods.
(6)
Deep neural network LSTM is the best single method, MLP is the second best, while using SVR, KNNR and especially LR is less favorable.
(7)
An increase in valuable information in input data (from class 1 to class 5) decreases prediction errors. In particular, wind speed forecasts are the most important input data. Using lagged values of forecast time series proved to slightly increase prediction accuracy. The same applies to using seasonal and daily variability markers.
(8)
If lagged values of actual wind velocities can be used as additional input data, the quality of forecasts should slightly improve.
(9)
More research is needed to verify, among other things, the following:
  • Is prediction accuracy affected by using forecasts from more than one source?
  • Does using a greater amount of input data, especially wind speed forecasts from periods directly neighboring the forecast period, affect prediction accuracy?
  • Will the proposed, original method of “averaging ensemble based on hybrid methods without extreme forecasts” be equally good for different RES predictions from different locations and 1 to 72 h ahead horizons?

Author Contributions

Conceptualization, P.P., D.B.; methodology, P.P., D.B.; investigation, P.P., D.B., and M.K.; supervision, S.R.; validation, S.R., T.G.; writing, P.P., D.B., M.K., and S.R.; visualization M.K.; project administration, P.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by The National Centre for Research and Development (Poland), Grant No. ID POIR.01.01.01-00-130/16 (to P.P., D.B., M.K.).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

Interdisciplinary Centre for Mathematical and Computational Modelling of the Warsaw University (ICM UW) provided the data (meteorological forecast) for the scientific research.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ACF  Autocorrelation function
AE  Absolute Error
AM  Attention Mechanism
ANFIS  Adaptive Neuro-Fuzzy Inference System
ANN  Artificial Neural Network
ARIMA-FFNN  Autoregressive Integrated Moving Average–Feedforward Neural Network
BFGS  Broyden–Fletcher–Goldfarb–Shanno algorithm
BiLSTM-CNN  Bidirectional Long Short Term Memory–Convolutional Neural Network
BNN  Bayesian Neural Network
DCNN  Deep Convolutional Neural Network
DFFNN  Deep Feed Forward Neural Network
DWT  Discrete Wavelet Transform
FFT  Fast Fourier Transform
GBT  Gradient Boosting Trees
KDE  Kernel Density Estimation
KNNR  K-Nearest Neighbours Regression
LR  Linear Regression
LSTM  Long Short-Term Memory
LSTM-SARIMA  Long Short-Term Memory–Seasonal Autoregressive Integrated Moving Average
MAE  Mean Absolute Error
MBE  Mean Bias Error
ML  Machine Learning
MLP  Multi-Layer Perceptron
NWP  Numerical Weather Prediction
PCTL75AE  The 75th percentile of absolute errors
PCTL99AE  The 99th percentile of absolute errors
p.u.  Per unit
PV  Photovoltaic System
R  Pearson linear correlation coefficient
R2  Determination coefficient
RES  Renewable Energy Sources
RF  Random Forests
RMSE  Root Mean Square Error
RNN  Recurrent Neural Network
RSM  Response Surface Method
SVR  Support Vector Regression
UM  Unified Model

Appendix A

Table A1. The results of hyperparameters tuning for hybrid methods and ensemble methods based on hybrid methods.
Method Code | Description of Method, the Name and the Range of Values of Hyperparameters Tuning and Selected Values
PHM+LSTM | The number of hidden layers: 1–2, selected: 2; the number of neurons in hidden layer: 3–15, selected: 5–5; the activation function in hidden layer: ReLU/ELU/PReLU/LeakyReLU/sigmoid/tanh, selected: ReLU; the activation function in output layer: ReLU/sigmoid/tanh/linear, selected: tanh; learning algorithm: ADAM, lr = 0.001, decay = 1 × 10−5; epochs: 2000; patience: 100–500, selected: 100; batch size: 128; shuffle: True.
PHM+SVR | Regression SVM: Type-1, Type-2, selected: Type-1; kernel type: Gaussian (RBF); the width parameter σ: 0.333; the regularization constant C, range: 1–20 (step 1), selected: 3; the tolerance ε, range: 0.05–2 (step 0.05), selected: 0.1.
PHM+KNNR | Distance metrics: Euclidean, Manhattan, Minkowski, selected: Euclidean; the number of nearest neighbors k, range: 1–200, selected: 42.
PHM+MLPPSO | The number of neurons in hidden layers: 5–20, selected: 2 hidden layers, 10–8; learning algorithm: PSO; the activation function in hidden layer: linear, hyperbolic tangent, selected: hyperbolic tangent; the activation function in output layer: hyperbolic tangent; the number of particles in swarm (20, 50, 100, 150), selected: 50; number of algorithm iterations (100, 1000), selected: 50; values of coefficients in formula for particle movements (0.7, 1.0, 1.4), selected: 1.4 for the first coefficient and 0.7 for the rest; number of iterations between subsequent disturbances in swarm (50, 100, 400), selected: 400; neighborhood width (2, 5, 10), selected: 5.
PHM+MLPBFGS | The number of neurons in hidden layer: 5–20, selected: 13; learning algorithm: BFGS; the activation function in hidden layer: linear, hyperbolic tangent, selected: hyperbolic tangent; the activation function in output layer: linear.
MLP_INT [PHM+LSTM, PHM+MLPPSO, PHM+KNNR] | The number of neurons in hidden layer: 3–10, selected: 4; learning algorithm: BFGS; the activation function in hidden layer: linear, hyperbolic tangent, selected: hyperbolic tangent; the activation function in output layer: linear.

References

  1. Brodny, J.; Tutak, M.; Saki, S.A. Forecasting the Structure of Energy Production from Renewable Energy Sources and Biofuels in Poland. Energies 2020, 13, 2539. [Google Scholar] [CrossRef]
  2. Masood, N.A.; Yan, R.; Saha, T.K. A new tool to estimate maximum wind power penetration level: In perspective of frequency response adequacy. Appl. Energy 2015, 154, 209–220. [Google Scholar] [CrossRef]
  3. Wood, D. Grand Challenges in Wind Energy Research. Front. Energy Res. 2020, 8. [Google Scholar] [CrossRef]
  4. Hämäläinen, K.; Saltikoff, E.; Hyvärinen, O. Assessment of Probabilistic Wind Forecasts at 100 m Above Ground Level Using Doppler Lidar and Weather Radar Wind Profiles. Mon. Weather Rev. 2020, 8. [Google Scholar] [CrossRef]
  5. Wilczak, J.M.; Olson, J.B.; Djalalova, I.; Bianco, L.; Berg, L.K.; Shaw, W.J.; Coulter, R.L.; Eckman, R.M.; Freedman, J.; Finley, C.; et al. Data assimilation impact of in situ and remote sensing meteorological observations on wind power forecasts during the first Wind Forecast Improvement Project (WFIP). Wind Energy 2019, 22, 932–944. [Google Scholar] [CrossRef] [Green Version]
  6. Theuer, F.; van Dooren, M.F.; von Bremen, L.; Kühn, M. Minute-scale power forecast of offshore wind turbines using long-range single-Doppler lidar measurements. Wind Energ. Sci. 2020, 5, 1449–1468. [Google Scholar] [CrossRef]
  7. Papazek, P.; Schicker, I.; Plant, C.; Kann, A.; Wang, Y. Feature selection, ensemble learning, and artificial neural networks for short-range wind speed forecasts. Meteorol. Z. 2020, 29, 307–322. [Google Scholar] [CrossRef]
  8. Cai, H.; Jia, X.; Feng, J.; Li, W.; Hsu, Y.M.; Lee, J. Gaussian Process Regression for numerical wind speed prediction enhancement. Renew. Energy 2020, 146, 2112–2123. [Google Scholar] [CrossRef]
Figure 1. Percentage of time-series observations in particular generation ranges.
Figure 2. Autocorrelation function (ACF) of the wind turbine energy generation time series.
Figure 3. Daily variability of wind turbine energy generation.
Figure 4. Monthly electric energy production—the mean over a 2-year period.
Figure 5. Correlation between hourly energy generation and wind speed forecasts.
Figure 6. Relationship between wind speed forecasts [p.u.] and actual electricity production.
Figure 7. General concept of using particle swarm optimization (PSO) algorithm for multi-layer perceptron (MLP) training.
Figure 8. General algorithm of particle swarm optimization artificial neural network (PSO-ANN).
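The PSO-ANN training loop outlined in Figures 7 and 8 can be illustrated with a compact sketch. This is a simplified stand-in, not the authors' implementation: particle positions encode the weights of a one-hidden-layer MLP, and standard PSO velocity updates move the swarm to minimize the training MSE. The swarm size, inertia weight and acceleration coefficients below are assumptions.

```python
# Simplified PSO-ANN sketch (assumed hyperparameters, not the paper's code):
# each particle is a flattened weight vector of a one-hidden-layer MLP.
import numpy as np

def mlp_forward(w, X, n_hidden):
    n_in = X.shape[1]
    w1 = w[: n_in * n_hidden].reshape(n_in, n_hidden)        # input-to-hidden weights
    b1 = w[n_in * n_hidden : n_in * n_hidden + n_hidden]     # hidden biases
    w2 = w[n_in * n_hidden + n_hidden : -1]                  # hidden-to-output weights
    b2 = w[-1]                                               # output bias
    return np.tanh(X @ w1 + b1) @ w2 + b2

def mse(w, X, y, n_hidden):
    return np.mean((mlp_forward(w, X, n_hidden) - y) ** 2)

def pso_train(X, y, n_hidden=6, n_particles=30, iters=200,
              w_inertia=0.72, c1=1.49, c2=1.49, seed=0):
    rng = np.random.default_rng(seed)
    dim = X.shape[1] * n_hidden + n_hidden + n_hidden + 1
    pos = rng.uniform(-1.0, 1.0, (n_particles, dim))
    vel = np.zeros((n_particles, dim))
    pbest = pos.copy()
    pbest_err = np.array([mse(p, X, y, n_hidden) for p in pos])
    gbest = pbest[pbest_err.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        vel = w_inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        err = np.array([mse(p, X, y, n_hidden) for p in pos])
        improved = err < pbest_err
        pbest[improved], pbest_err[improved] = pos[improved], err[improved]
        gbest = pbest[pbest_err.argmin()].copy()
    return gbest

# Usage sketch: weights = pso_train(X_train, y_train); y_hat = mlp_forward(weights, X_test, 6)
```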
Figure 9. Diagram of series information processing in hybrid method.
Figure 10. The general scheme of ensemble of predictors based on hybrid methods.
Figure 11. General structure of the MLP used as an integrator of the ensemble of predictors based on hybrid methods.
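Figure 11 can be read as a stacking scheme: a small MLP learns to map the forecasts of the component hybrid predictors (PHM+LSTM, PHM+MLPPSO, PHM+KNNR) to the final forecast. The sketch below is a hedged illustration of that idea using scikit-learn; the hidden-layer size, the placeholder training data and the non-negativity clipping are assumptions, not details taken from the paper.

```python
# Minimal MLP-integrator sketch (assumed structure, placeholder data).
import numpy as np
from sklearn.neural_network import MLPRegressor

# Component hybrid forecasts for the training period, shape (n_samples, 3),
# and the corresponding measured generation in kWh - placeholders here.
component_forecasts = np.random.rand(1000, 3)
y_measured = component_forecasts.mean(axis=1) + 0.05 * np.random.randn(1000)

integrator = MLPRegressor(hidden_layer_sizes=(8,), activation="tanh",
                          max_iter=2000, random_state=0)
integrator.fit(component_forecasts, y_measured)

# Combine new 48-hour forecasts from the three hybrid predictors.
new_component_forecasts = np.random.rand(48, 3)
final_forecast = np.clip(integrator.predict(new_component_forecasts), 0.0, None)
```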
Figure 12. Collective set of best results with respect to MAE for each class of method.
Figure 13. Collective set of best results with respect to RMSE for each class of method.
Figure 14. Forecast of electric energy generation from 1 to 48 h ahead made by the MLP_INT [PHM+LSTM, PHM+MLPPSO, PHM+KNNR] method for two consecutive days of an autumn month (October).
Figure 15. Forecast of electric energy generation from 1 to 48 h ahead made by the MLP_INT [PHM+LSTM, PHM+MLPPSO, PHM+KNNR] method for two consecutive days of a spring month (April).
Table 1. Descriptive statistics for hourly electric energy generation.

| Descriptive Statistic | Value |
| --- | --- |
| Mean | 0.650 [kWh] |
| Standard deviation | 0.899 [kWh] |
| Minimum | 0.000 [kWh] |
| Maximum | 6.684 [kWh] |
| Coefficient of variation | 138.13% |
| 25th percentile (lower quartile) | 0.000 [kWh] |
| 50th percentile (median) | 0.364 [kWh] |
| 75th percentile (upper quartile) | 0.835 [kWh] |
| 90th percentile | 1.676 [kWh] |
| Variance | 0.808 |
| Skewness | 2.578 |
| Kurtosis | 8.208 |
Table 2. Descriptive statistics of wind speed forecasts with horizons from 1 to 24 h and from 25 to 48 h.

| Descriptive Statistic | Wind Speed Forecasts for up to One Day Ahead (d + 1) | Wind Speed Forecasts for the Second Day Ahead (d + 2) |
| --- | --- | --- |
| Mean | 3.592 [m/s] | 3.626 [m/s] |
| Standard deviation | 1.767 [m/s] | 1.750 [m/s] |
| Minimum | 0.088 [m/s] | 0.228 [m/s] |
| Maximum | 11.059 [m/s] | 13.463 [m/s] |
| Coefficient of variation | 49.18% | 48.26% |
| 25th percentile (lower quartile) | 2.290 [m/s] | 2.377 [m/s] |
| 50th percentile (median) | 3.284 [m/s] | 3.341 [m/s] |
| 75th percentile (upper quartile) | 4.672 [m/s] | 4.641 [m/s] |
| 90th percentile | 6.089 [m/s] | 6.023 [m/s] |
| Variance | 3.121 | 3.063 |
| Skewness | 0.718 | 0.800 |
| Kurtosis | 0.277 | 0.697 |
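The descriptive statistics reported in Tables 1 and 2 can be reproduced for any hourly series with a few lines of pandas. The sketch below is illustrative only; the file name and the column name "energy_kwh" are hypothetical.

```python
# Illustrative sketch (not the authors' code): descriptive statistics of an
# hourly time series, matching the quantities listed in Tables 1-2.
import pandas as pd

series = pd.read_csv("turbine_hourly.csv")["energy_kwh"]  # hypothetical file and column

stats = {
    "mean": series.mean(),
    "standard deviation": series.std(),
    "minimum": series.min(),
    "maximum": series.max(),
    "coefficient of variation [%]": 100.0 * series.std() / series.mean(),
    "25th percentile": series.quantile(0.25),
    "median": series.quantile(0.50),
    "75th percentile": series.quantile(0.75),
    "90th percentile": series.quantile(0.90),
    "variance": series.var(),
    "skewness": series.skew(),   # sample (bias-corrected) estimator
    "kurtosis": series.kurt(),   # excess kurtosis (normal distribution -> 0)
}
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```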
Table 3. Pearson linear correlation coefficients between selected input variables and energy generation.

| Description of Input Data | Code of Input Data | R |
| --- | --- | --- |
| Indicator of annual seasonality (mean daily energy generation for a given month) | MONTH_I | 0.263 |
| Indicator of daily variability of electric energy production (mean hourly energy generation for a given hour) | DAY_I | 0.195 |
| Wind speed forecasts, horizons from 1 to 24 h | V_F | 0.756 |
| Wind speed forecasts, horizons from 25 to 48 h | V_F | 0.717 |
| Electric energy production forecast by the physical method (third-degree polynomial function) | POLY_F | 0.690 |
| Hourly energy generation lagged by 24 h, for 1–24 h horizons | EN_L | 0.343 |
| Hourly energy generation lagged by 48 h, for 25–48 h horizons | EN_L | 0.166 |
| Hourly energy generation lagged by 25 h, for 1–24 h horizons | EN_L-1 | 0.326 |
| Hourly energy generation lagged by 49 h, for 25–48 h horizons | EN_L-1 | 0.161 |
| Hourly energy generation lagged by 26 h, for 1–24 h horizons | EN_L-2 | 0.304 |
| Hourly energy generation lagged by 50 h, for 25–48 h horizons | EN_L-2 | 0.157 |
| Hourly energy generation lagged by 48 h, for 1–24 h horizons | EN_L-24 | 0.166 |
| Hourly energy generation lagged by 72 h, for 25–48 h horizons | EN_L-24 | 0.117 |
| Hourly energy generation lagged by 72 h, for 1–24 h horizons | EN_L-48 | 0.117 |
| Hourly energy generation lagged by 96 h, for 25–48 h horizons | EN_L-48 | 0.161 |
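The Pearson coefficients in Table 3 correspond to correlating the target series with each candidate explanatory variable, with the energy series shifted by the appropriate lag for each horizon. A minimal sketch follows, assuming a hypothetical data layout and column names.

```python
# Illustrative sketch (assumed data layout): Pearson correlation between the
# hourly generation and candidate inputs such as lagged generation and the
# wind speed forecast. File and column names are hypothetical.
import pandas as pd

df = pd.read_csv("turbine_dataset.csv")      # hypothetical file
df["EN_L"] = df["energy_kwh"].shift(24)      # generation lagged by 24 h (1-24 h horizon)
df["EN_L-1"] = df["energy_kwh"].shift(25)    # generation lagged by 25 h

candidates = ["MONTH_I", "DAY_I", "V_F", "POLY_F", "EN_L", "EN_L-1"]
r = df[candidates + ["energy_kwh"]].dropna().corr(method="pearson")["energy_kwh"]
print(r.drop("energy_kwh").sort_values(ascending=False))
```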
Table 4. Sets of input data selected for forecasting methods.

| Name of Set | Description of Set | Codes of Input Data |
| --- | --- | --- |
| Set 1 | Time lag of forecasted time series | EN_L |
| Set 2 | Selected time lags of forecasted time series | EN_L, EN_L-1, EN_L-2 |
| Set 3 | Selected time lags of forecasted time series | EN_L, EN_L-1, EN_L-2, EN_L-24, EN_L-48 |
| Set 4 | Time lag of forecasted time series; indicator of annual seasonality; indicator of daily variability of electric energy production | EN_L, MONTH_I, DAY_I |
| Set 5 | Wind speed forecast | V_F |
| Set 6 | Indicator of annual seasonality; indicator of daily variability of electric energy production; wind speed forecast | MONTH_I, DAY_I, V_F |
| Set 7 | Time lag of forecasted time series; indicator of annual seasonality; indicator of daily variability of electric energy production; wind speed forecast | EN_L, MONTH_I, DAY_I, V_F |
| Set 8 | Time lag of forecasted time series; indicator of annual seasonality; indicator of daily variability of electric energy production; wind speed forecast; electric energy production forecast by physical model | EN_L, MONTH_I, DAY_I, V_F, POLY_F |
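The MONTH_I and DAY_I indicators listed in Table 4 can be built as conditional means over the training period: the mean daily generation of each calendar month and the mean hourly generation of each hour of the day. The sketch below illustrates one way to assemble Set 8; the column names, the timestamp split and the presence of ready-made V_F and POLY_F columns are assumptions, not details from the paper.

```python
# Hedged sketch of building the Set 8 feature matrix (assumed data layout).
import pandas as pd

df = pd.read_csv("turbine_dataset.csv", parse_dates=["timestamp"])  # hypothetical file
train = df[df["timestamp"] < "2020-01-01"]                          # hypothetical split

# MONTH_I: mean daily generation per calendar month (hourly mean * 24, assumed).
month_i = train.groupby(train["timestamp"].dt.month)["energy_kwh"].mean() * 24
# DAY_I: mean hourly generation per hour of day.
day_i = train.groupby(train["timestamp"].dt.hour)["energy_kwh"].mean()

df["MONTH_I"] = df["timestamp"].dt.month.map(month_i)
df["DAY_I"] = df["timestamp"].dt.hour.map(day_i)
df["EN_L"] = df["energy_kwh"].shift(24)   # lag used for the 1-24 h horizon

set_8 = df[["EN_L", "MONTH_I", "DAY_I", "V_F", "POLY_F"]].dropna()  # V_F, POLY_F assumed present
```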
Table 5. Tested input data sets for each method.

| Method Code | Type of Method | Set 1 | Set 2 | Set 3 | Set 4 | Set 5 | Set 6 | Set 7 | Set 8 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NAIVE | Linear | + * | | | | | | | |
| SMOOTHING | Linear | | + | | | | | | |
| PHM | Non-linear | | | | | + | | | |
| LR | Linear | | | + | + | | + | + | |
| SVR | Non-linear | | | + | + | | + | + | |
| MLPBFGS | Non-linear | | | + | + | | + | + | |
| MLPPSO | Non-linear | | | + | + | | + | + | |
| LSTM | Non-linear | | | + | + | | + | + | |
| PHM+LR | Non-linear | | | | | | | | + |
| PHM+SVR | Non-linear | | | | | | | | + |
| PHM+MLPBFGS | Non-linear | | | | | | | | + |
| PHM+KNNR | Non-linear | | | | | | | | + |
| PHM+MLPPSO | Non-linear | | | | | | | | + |
| PHM+LSTM | Non-linear | | | | | | | | + |
| AVE_INT [PHM+LSTM, PHM+SVR, PHM+MLPPSO, PHM+KNNR] | Non-linear | | | | | | | | + |
| AVE_INT [PHM+LSTM, PHM+KNNR, PHM+MLPPSO] | Non-linear | | | | | | | | + |
| W_OPT_INT [PHM+LSTM, PHM+KNNR, PHM+MLPPSO] | Non-linear | | | | | | | | + |
| W_OPT_INT [PHM+LSTM, PHM+SVR, PHM+MLPPSO, PHM+KNNR] | Non-linear | | | | | | | | + |
| MIN&MAX_SKIP [PHM+MLPPSO, PHM+KNNR, PHM+LSTM, PHM+SVR] | Non-linear | | | | | | | | + |
| MIN&MAX_SKIP [PHM+MLPPSO, PHM+LR, PHM+LSTM, PHM+SVR, PHM+KNNR] | Non-linear | | | | | | | | + |
| MLP_INT [PHM+LSTM, PHM+MLPPSO, PHM+KNNR] | Non-linear | | | | | | | | + |

Note: * "+" marks a tested input data set.
Table 6. Tested input data sets for each forecast class.

| Class No. | Description of Input Data | Input Data Sets (Depending on Method) |
| --- | --- | --- |
| Class 1 | Only selected previously observed values of the forecasted time series | Set 1, Set 2, Set 3 |
| Class 2 | Selected previously observed values of the forecasted time series and indicators of annual seasonality and daily variability of electric energy production | Set 4 |
| Class 3 | Wind speed forecast and indicators of annual seasonality and daily variability of electric energy production (without forecasted time series data) | Set 5, Set 6 |
| Class 4 | Selected previously observed values of the forecasted time series, indicators of annual seasonality and daily variability of electric energy production, and wind speed forecast | Set 7 |
| Class 5 | Selected previously observed values of the forecasted time series, indicators of annual seasonality and daily variability of electric energy production, wind speed forecast, and electric energy production forecast by physical model | Set 8 |
Table 7. Performance measures for single methods with input data class 1.

| Method Code | Input Data Set | MAE [kWh] | RMSE [kWh] | MBE [kWh] | PCTL75AE [kWh] | PCTL99AE [kWh] | R |
| --- | --- | --- | --- | --- | --- | --- | --- |
| LSTM | Set 3 | 0.4841 | 0.8029 | 0.1142 | 0.3632 | 2.7985 | 0.2416 |
| MLPPSO | Set 3 | 0.4842 | 0.8139 | 0.2052 | 0.0954 | 0.5252 | 0.2455 |
| SVR | Set 3 | 0.4975 | 0.8173 | 0.1643 | 0.1151 | 0.8584 | 0.1587 |
| LR | Set 3 | 0.5183 | 0.8556 | 0.3166 | 0.2923 | 1.0914 | 0.2301 |
| MLPBFGS | Set 3 | 0.5304 | 0.9105 | 0.3410 | 0.4291 | 2.9634 | 0.2335 |
| SMOOTHING | Set 2 | 0.6244 | 0.9807 | 0.0323 | 0.4064 | 3.3131 | 0.2260 |
| NAIVE | Set 1 | 0.6336 | 1.0026 | 0.0322 | 0.3997 | 3.6138 | 0.2193 |
| Average performance measures for single methods with input data class 1 | | 0.5389 | 0.8834 | 0.1723 | 0.3002 | 2.1663 | 0.2221 |

Notes: the best result for each performance measure is printed in bold blue; the worst result is printed in red.
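The performance measures used in Tables 7–12 (MAE, RMSE, MBE, the 75th and 99th percentiles of the absolute error, and the Pearson coefficient R between forecasts and measurements) follow their usual definitions. The sketch below computes them for a pair of series; the sign convention of MBE (forecast minus measurement) is an assumption.

```python
# Sketch of the performance measures reported in Tables 7-12 (usual definitions assumed).
import numpy as np

def performance_measures(y_true, y_pred):
    e = np.asarray(y_pred) - np.asarray(y_true)   # assumed sign: forecast - measurement
    ae = np.abs(e)
    return {
        "MAE [kWh]": ae.mean(),
        "RMSE [kWh]": np.sqrt((e ** 2).mean()),
        "MBE [kWh]": e.mean(),
        "PCTL75AE [kWh]": np.percentile(ae, 75),
        "PCTL99AE [kWh]": np.percentile(ae, 99),
        "R": np.corrcoef(y_true, y_pred)[0, 1],
    }
```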
Table 8. Performance measures for single methods with input data class 2.

| Method Code | Input Data Set | MAE [kWh] | RMSE [kWh] | MBE [kWh] | PCTL75AE [kWh] | PCTL99AE [kWh] | R |
| --- | --- | --- | --- | --- | --- | --- | --- |
| LSTM | Set 4 | 0.4788 | 0.7911 | 0.1184 | 0.3491 | 2.6987 | 0.3323 |
| LR | Set 4 | 0.4802 | 0.7927 | 0.1823 | 0.1408 | 0.7133 | 0.3175 |
| MLPPSO | Set 4 | 0.4823 | 0.7983 | 0.2224 | 0.1815 | 0.5256 | 0.3417 |
| SVR | Set 4 | 0.5050 | 0.8280 | 0.2832 | 0.2109 | 0.9387 | 0.2776 |
| MLPBFGS | Set 4 | 0.5204 | 0.8912 | 0.0322 | 0.4114 | 2.9430 | 0.3505 |
| Average performance measures for single methods with input data class 2 | | 0.4934 | 0.8203 | 0.1677 | 0.2587 | 1.5639 | 0.3239 |

Notes: the best result for each performance measure is printed in bold blue; the worst result is printed in red.
Table 9. Performance measures for single methods with input data class 3.

| Method Code | Input Data Set | MAE [kWh] | RMSE [kWh] | MBE [kWh] | PCTL75AE [kWh] | PCTL99AE [kWh] | R |
| --- | --- | --- | --- | --- | --- | --- | --- |
| LSTM | Set 6 | 0.2860 | 0.4812 | 0.1189 | 0.3520 | 2.7899 | 0.8176 |
| MLPPSO | Set 6 | 0.2902 | 0.4711 | 0.0891 | 0.3693 | 2.7694 | 0.8208 |
| MLPBFGS | Set 6 | 0.2961 | 0.4688 | 0.0328 | 0.4041 | 2.8710 | 0.8177 |
| SVR | Set 6 | 0.3299 | 0.5052 | 0.1448 | 0.2796 | 2.5782 | 0.8064 |
| LR | Set 6 | 0.3562 | 0.5981 | 0.1054 | 0.4612 | 1.2190 | 0.7068 |
| PHM | Set 5 | 0.4389 | 0.6155 | −0.0662 | 0.4691 | 2.7801 | 0.6540 |
| Average performance measures for single methods with input data class 3 | | 0.3329 | 0.5233 | 0.0708 | 0.3892 | 2.5013 | 0.7706 |

Notes: the best result for each performance measure is printed in bold blue; the worst result is printed in red.
Table 10. Performance measures for single methods with input data class 4.

| Method Code | Input Data Set | MAE [kWh] | RMSE [kWh] | MBE [kWh] | PCTL75AE [kWh] | PCTL99AE [kWh] | R |
| --- | --- | --- | --- | --- | --- | --- | --- |
| LSTM | Set 7 | 0.2854 | 0.4805 | 0.1155 | 0.3514 | 2.7834 | 0.8188 |
| MLPPSO | Set 7 | 0.2909 | 0.4691 | 0.0942 | 0.3717 | 2.6131 | 0.8270 |
| MLPBFGS | Set 7 | 0.2929 | 0.4594 | 0.0293 | 0.4140 | 2.8527 | 0.8253 |
| SVR | Set 7 | 0.3412 | 0.5177 | 0.1474 | 0.2988 | 2.5490 | 0.7937 |
| LR | Set 7 | 0.3531 | 0.6055 | 0.1068 | 0.4512 | 1.2243 | 0.7145 |
| Average performance measures for single methods with input data class 4 | | 0.3127 | 0.5064 | 0.0986 | 0.3774 | 2.4045 | 0.7958 |

Notes: the best result for each performance measure is printed in bold blue; the worst result is printed in red.
Table 11. Performance measures for hybrid methods with input data class 5.

| Method Code | Input Data Set | MAE [kWh] | RMSE [kWh] | MBE [kWh] | PCTL75AE [kWh] | PCTL99AE [kWh] | R |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PHM+LSTM | Set 8 | 0.2823 | 0.4702 | 0.0830 | 0.3661 | 2.8905 | 0.8202 |
| PHM+MLPPSO | Set 8 | 0.2868 | 0.4727 | 0.0943 | 0.3668 | 2.8300 | 0.8209 |
| PHM+MLPBFGS | Set 8 | 0.2914 | 0.4595 | 0.0329 | 0.4062 | 2.8762 | 0.8260 |
| PHM+KNNR | Set 8 | 0.2996 | 0.4746 | 0.0456 | 0.3763 | 2.6014 | 0.8148 |
| PHM+SVR | Set 8 | 0.3239 | 0.4979 | 0.1294 | 0.2989 | 2.5510 | 0.8088 |
| PHM+LR | Set 8 | 0.3380 | 0.5457 | 0.0984 | 0.3899 | 1.9396 | 0.7677 |
| Average performance measures for hybrid methods with input data class 5 | | 0.3037 | 0.4868 | 0.0806 | 0.3674 | 2.6148 | 0.8097 |

Notes: the best result for each performance measure is printed in bold blue; the worst result is printed in red.
Table 12. Performance measures for ensemble methods based on hybrid methods with input data class 5.

| Method Code | Input Data Set | MAE [kWh] | RMSE [kWh] | MBE [kWh] | PCTL75AE [kWh] | PCTL99AE [kWh] | R |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MLP_INT [PHM+LSTM, PHM+MLPPSO, PHM+KNNR] | Set 8 | 0.2816 | 0.4679 | 0.0855 | 0.3650 | 2.7511 | 0.8206 |
| W_OPT_INT [PHM+LSTM, PHM+KNNR, PHM+MLPPSO] | Set 8 | 0.2819 | 0.4681 | 0.0973 | 0.3560 | 2.7173 | 0.8205 |
| AVE_INT [PHM+LSTM, PHM+KNNR, PHM+MLPPSO] | Set 8 | 0.2825 | 0.4701 | 0.0972 | 0.3616 | 2.7871 | 0.8199 |
| MIN&MAX_SKIP [PHM+MLPPSO, PHM+KNNR, PHM+LSTM, PHM+SVR] | Set 8 | 0.2858 | 0.4575 | 0.0844 | 0.3487 | 2.6534 | 0.8259 |
| W_OPT_INT [PHM+LSTM, PHM+SVR, PHM+MLPPSO, PHM+KNNR] | Set 8 | 0.2860 | 0.4696 | 0.0911 | 0.3563 | 2.8441 | 0.8228 |
| AVE_INT [PHM+LSTM, PHM+SVR, PHM+MLPPSO, PHM+KNNR] | Set 8 | 0.2861 | 0.4708 | 0.0929 | 0.3572 | 2.8479 | 0.8221 |
| MIN&MAX_SKIP [PHM+MLPPSO, PHM+LR, PHM+LSTM, PHM+SVR, PHM+KNNR] | Set 8 | 0.2909 | 0.4677 | 0.0752 | 0.3488 | 2.6536 | 0.8258 |
| Average performance measures for ensemble methods based on hybrid methods with input data class 5 | | 0.2850 | 0.4688 | 0.0891 | 0.3562 | 2.7506 | 0.8224 |

Notes: the best result for each performance measure is printed in bold blue; the worst result is printed in red.
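The integration rules in Table 12 are suggested by the method names: AVE_INT averages the component hybrid forecasts, W_OPT_INT applies a weighted average with weights optimized beforehand, and MIN&MAX_SKIP discards the lowest and highest forecast for every hour before averaging the rest. The sketch below encodes this reading; it is our interpretation of the method names, not the authors' code.

```python
# Hedged sketch of the ensemble integration rules (interpretation of the names in Table 12).
import numpy as np

def ave_int(forecasts):
    # forecasts: array of shape (n_hours, n_predictors) with component hybrid forecasts
    return forecasts.mean(axis=1)

def w_opt_int(forecasts, weights):
    # weights assumed to be found beforehand, e.g., by minimizing MAE on a validation period
    w = np.asarray(weights) / np.sum(weights)
    return forecasts @ w

def min_max_skip(forecasts):
    # drop the per-hour minimum and maximum forecast, average the remaining ones
    sorted_f = np.sort(forecasts, axis=1)
    return sorted_f[:, 1:-1].mean(axis=1)
```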
Table 13. Collective set of best results with respect to MAE for each class of method.

| Class of Method | Method Code | MAE [kWh] | Difference [%] |
| --- | --- | --- | --- |
| The ensemble method based on hybrid methods with input data class 5 | MLP_INT [PHM+LSTM, PHM+MLPPSO, PHM+KNNR] | 0.2816 | - |
| The hybrid method with input data class 5 | PHM+LSTM | 0.2823 | 0.25 |
| The single method with input data class 4 | LSTM | 0.2854 | 1.35 |
| The single method with input data class 3 | LSTM | 0.2860 | 1.56 |
| The single method with input data class 2 | LSTM | 0.4788 | 70.03 |
| The single method with input data class 1 | LSTM | 0.4841 | 71.91 |
Table 14. Collective set of best results with respect to RMSE for each class of method.

| Class of Method | Method Code | RMSE [kWh] | Difference [%] |
| --- | --- | --- | --- |
| The ensemble method based on hybrid methods with input data class 5 | MIN&MAX_SKIP [PHM+MLPPSO, PHM+KNNR, PHM+LSTM, PHM+SVR] | 0.4575 | - |
| The single method with input data class 4 | MLPBFGS | 0.4594 | 0.42 |
| The hybrid method with input data class 5 | PHM+MLPBFGS | 0.4595 | 0.44 |
| The single method with input data class 3 | MLPBFGS | 0.4688 | 2.47 |
| The single method with input data class 2 | LSTM | 0.7911 | 72.92 |
| The single method with input data class 1 | LSTM | 0.8029 | 75.50 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
