A Novel and Alternative Approach for Direct and Indirect Wind-Power Prediction Methods

Wind energy is a variable energy source with a growing presence in many electrical networks across the world. Wind-speed prediction has become an important tool for many agents involved in energy markets. In this paper, an approach to this problem is proposed by means of a novel method that outperforms the results obtained by current direct and indirect wind-power prediction procedures. The first difference is that it is not strictly a direct or indirect method in the conventional sense, because it uses information from both wind-speed and wind-power data series to obtain a wind-power series. The second difference is that it smooths down the wind-power series obtained in the first stage and uses the resulting series for predicting new wind-power values. The smoothing process is based on the label sequence generation process of the pattern sequence forecasting (PSF) algorithm and on a Naïve Bayesian matching process. The result is a less chaotic way to predict wind power than those offered by other existing methods. The method has been assessed in multiple simulations, for which three different error measures have been used.


Introduction
Renewable energy sources, such as solar and wind, are gaining more importance and attention because of the depletion of conventional energy sources, such as fossil fuels, and the pollution generated by the combustion of such fuels. Wind power is a clean and sustainable source of energy, and it does not lead to any environmental hazards. Hence, energy generation with wind power has become the main goal of many countries. However, effective power generation with wind energy is quite an uncertain process because of the chaotic and intermittent nature of wind-power availability. This uncertainty in wind power can imperil power availability, quality, and stability. Eventually, this can lead to a huge loss in the energy market. Hence, precise prediction of wind power is a critical task with deep impact and large benefits for humanity.
There are various approaches to forecasting wind power, and these can be classified broadly into three categories: (1) model-driven approaches, (2) data-driven approaches, and (3) hybrid approaches [1]. Model-driven approaches require abundant meteorological knowledge and information on various physical factors affecting wind power [2]. In data-driven approaches, on the other hand, statistical models are used for forecasting. With the advancement in the artificial-intelligence and data-science fields, more accurate prediction results can be achieved with this approach [3]. Historical data are the only requirement for such models. Many research articles describe the performance of distinct data-driven models, such as the basic persistence model [4], and complex models, including support vector machines (SVM) [5,6], neural networks (NN) [7,8], and autoregressive integrated moving average (ARIMA) [9]. However, due to the highly stochastic and intermittent nature of wind-power time series, it is difficult to predict them with high accuracy.
Wind-power prediction studies are broadly classified into direct and indirect approaches. In direct approaches, wind-power data are directly predicted by various methods. The advantage of this kind of approach is that there is no need to study the relations between wind-power and wind-speed parameters. However, the prediction accuracy of a direct approach is not always good enough, since wind-power data usually show high levels of randomness and a chaotic nature. Such wind-power data are very difficult to process efficiently with prediction methods.
To overcome this difficulty, another part of the available studies focused on indirect prediction approaches. In this kind of approach, wind-speed data are first forecasted, and the predicted data are then converted into wind-power data by means of various techniques. However, in practice, transforming wind-speed data into wind-power data introduces further prediction errors because of inaccuracies in the nonlinear power-curve analysis. Generally, wind power and wind speed are related in terms of cubic or higher-order powers. Hence, a small change in wind speed leads to a larger and significant deviation in wind power. The success of an indirect approach lies in how it evaluates the nonlinear dependence between wind-power and wind-speed data. Such error evaluations lead to a rise in learning accuracy and comprehensibility. Instead of manufacturer power curves, statistical techniques seem to be a better option to describe the nonlinear relationship between wind power and wind speed. Higher-order polynomial equations, exponential, fitted-power, regression, logistic, and many other models are used to estimate wind power from explanatory wind-speed datasets.
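The sensitivity mentioned above can be illustrated with a small worked example. The sketch below assumes an idealized cubic law, P = c·v³, which is only a rough approximation of a real turbine between cut-in and rated speed; the function name `relative_power_error` is illustrative and does not come from the paper.

```python
def relative_power_error(relative_speed_error: float, exponent: float = 3.0) -> float:
    """Relative change in power caused by a relative change in wind speed.

    Assumes the idealized law P = c * v**exponent; the constant c cancels out.
    """
    return (1.0 + relative_speed_error) ** exponent - 1.0


if __name__ == "__main__":
    # A few illustrative speed errors and the power errors they imply.
    for err in (0.01, 0.05, 0.10):
        print(f"speed error {err:.0%} -> power error {relative_power_error(err):.1%}")
```

Under this assumption, a 10% error in the forecasted wind speed already becomes an error of roughly 33% in the derived wind power, which is why errors of indirect approaches grow in the conversion step.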
The literature on short-term wind-power prediction contains a large number of articles focused on direct wind-power as well as wind-speed predictions [10-12]. However, very few articles have compared the performance of direct and indirect approaches. Most of them found that the best prediction accuracy comes with direct approaches [10,11], whereas Reference [12] concluded that an indirect approach performed better than the alternative.
In this paper, a novel approach is presented in order to eliminate the drawbacks of both direct and indirect prediction methods used in wind-power predictions. The proposed method cannot be classified into either of the aforementioned groups because it uses combined information from wind-speed and wind-power series. In this sense, it is an alternative method and behaves as a direct-indirect hybrid that does not directly or indirectly predict power. It starts by smoothing down a wind-power time series while keeping the respective wind-speed data as a reference. The smoothing process is based on the label sequence generation process of the pattern sequence forecasting (PSF) algorithm and on a Naïve Bayesian matching process, and follows the procedure described next. Wind-speed and wind-power data are converted into sequences of labels. Then, these labels are mapped and their best combination is estimated. Keeping these combinations as a reference, the wind-power labels are smoothed down and further predicted with the steps involved in the PSF method. An important consequence of this procedure is a reduction in the degree of chaos contained in the resulting predicted series.
Multiple simulations have been carried out with the aim of collecting a contingent of results. Three different error measures have been used in order to quantify how much the proposed method outperforms existing ones.
The rest of the paper is organized as follows: Section 2 describes the steps involved in the PSF algorithm. Section 3 introduces the proposed methodology and the description of the prediction methodology for wind-power forecasting. Section 4 shows the results obtained by the proposed approach in predicting wind power, including their quality measurements. Comparisons between the proposed method and other techniques are also provided. Finally, Section 5 summarizes the conclusions achieved with regard to wind-power predictions.

Conventional PSF Methodology
The PSF algorithm is one of the most popular univariate time-series prediction methodologies, proposed in Reference [13] and further analyzed in Reference [14]. The basic principle behind predictions with the PSF algorithm is an optimum search of pattern sequences present in a time series. This methodology consists of several processes that operate in two steps. During the first step, data are clustered, and during the second, the forecasting process is carried out based on the previously clustered data, as shown in Figure 1. The novelty of the PSF algorithm is the utilization of labels for the respective pattern sequences present in a time series, instead of the original time-series data.
The clustering step consists of various tasks, including data normalization, the selection of an optimum number of clusters, and the application of k-means clustering. The ultimate aim of this step is to discover clusters of time-series data and label them accordingly. The step starts with a normalization process, in which the time series is normalized with Equation (1) in order to remove the redundancies present in it:
$$X_j \leftarrow \frac{X_j}{\frac{1}{N}\sum_{i=1}^{N} X_i} \qquad (1)$$
where $X_j$ is the $j$th value of each cycle in the input time series, and $N$ is its size in time units. Secondly, the normalized series is assigned labels according to the different patterns present in it with the help of clustering methods. In PSF, the k-means clustering method is used because of its popularity, simplicity, and fast computation. However, it requires prior knowledge of the number of centers so that the series can be partitioned into that number of clusters. Reference [13] utilized the Silhouette index [15] to decide the number of clusters in the PSF methodology, whereas Reference [14] suggested the 'best among three' policy, in which three different indices (the Silhouette index [15], Dunn index [16], and Davies-Bouldin index [17]) are used. In this policy, the number of clusters is finalized with the use of multiple statistical tests to ensure efficiency in the clustering process. Further, References [18-20] used a single index (the Silhouette index [15]) to reduce the computational complexity of the clustering process. Then, with respect to the cluster centers (K) generated with the k-means clustering method, the values in the original time series are transformed into a label series. This label series is further used for the prediction procedure, which consists of window-size selection, pattern sequence matching, and an estimation process.
Consider that $x(t)$ is the vector of time-series data of length $N$, such that $x(t) = [x_1(t), x_2(t), \ldots, x_N(t)]$. After clustering and labeling, the vector is converted into $y(t) = [L_1, L_2, \ldots, L_N]$, where $L_i$ are labels representing the cluster centers to which the data in vector $x(t)$ belong. Then, during the prediction process, the last $W$ labels are searched for in vector $y(t)$. If this sequence of the last $W$ labels is not found in $y(t)$, then the search is repeated for the last $W - 1$ labels. In PSF, the length $W$ of this label sequence is denoted as the window size. Therefore, the window size can vary from $W$ to 1, although this is not usual. In the window-size selection process, a sequence of labels of length $W$ is picked from the end of the label series, and this sequence is searched for in the label series. The selection of the optimum window size ($W$) is one of the most challenging processes in prediction with PSF, since it determines the prediction error. Mathematically, the optimum window size is the one that minimizes Equation (2):
$$\sum_{t \in TS} \left\| \hat{X}(t) - X(t) \right\| \qquad (2)$$
where $\hat{X}(t)$ is the predicted value at time $t$, $X(t)$ is the measured value at the same time instant, and $TS$ represents the time series under study. In practice, the optimum window size is estimated by means of error validation. While searching for a sequence of size $W$ in the label series, if this sequence is not found, then $W$ is reduced by one unit. This process continues until the window sequence repeats itself in the label series at least once, which guarantees that at least one sequence appears more than once in the label series. Once the optimum window size is obtained, the pattern sequence available in the window is searched for in $y(t)$, and the label present just after each discovered sequence is noted in a new vector $ES$. Finally, the future time-series value is predicted by averaging the values in vector $ES$, as in Equation (3):
$$\hat{X}(t) = \frac{1}{\mathrm{size}(ES)} \sum_{j=1}^{\mathrm{size}(ES)} ES(j) \qquad (3)$$
where size(ES) is the length of vector ES. Finally, the predicted labels are replaced with the appropriate value in the range of the original measured time series by a denormalization process. In order to predict future values for multiple time indices, the current predicted value is appended to the original time series, and this procedure continues until the desired number of prediction values is obtained. The usability and superior performance of the PSF method for distinct univariate time-series prediction applications are discussed in References [20-24].
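The window search and averaging described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the reference PSF implementation: the function name `psf_predict_next` is hypothetical, the vector ES is assumed to store the normalized value that followed each match (the text describes ES in terms of labels), and only a single one-step-ahead prediction is produced.

```python
from statistics import mean


def psf_predict_next(values, labels, window):
    """Predict the next value of a series from its label sequence, PSF-style.

    values : normalized series values, one per label
    labels : cluster label of each value (same length as values)
    window : initial window size W; reduced by one whenever no match is found
    """
    n = len(labels)
    for w in range(min(window, n - 1), 0, -1):
        pattern = labels[n - w:]
        # ES collects the value that followed each earlier occurrence of the
        # pattern; the final occurrence itself has no successor yet.
        es = [values[i + w]
              for i in range(n - w)
              if labels[i:i + w] == pattern]
        if es:
            return mean(es)
    raise ValueError("no repeated pattern found, even for a window of size 1")
```

For multi-step forecasts, the predicted value would be appended to the series, the series relabeled, and the function called again, as the text describes.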

Proposed Methodology
The conventional PSF algorithm has gained popularity because of its superior and promising prediction performance for univariate time series. PSF has also shown its capability in wind-power and wind-speed predictions in [25]. The methodology proposed in this paper is focused on predicting wind-power data samples framed in a time series with the assistance of the corresponding wind-speed data. The prediction concept is based on the PSF algorithm. This novel methodology is proposed as an alternative to direct and indirect wind-power prediction approaches. In this methodology, the wind-power time series is predicted with modifications to conventional PSF and with dataset smoothing. In contrast to state-of-the-art methods and approaches, the significant difference in the proposed approach is the utilization of both wind-power and wind-speed datasets to achieve better accuracy in wind-power predictions.
Usually, researchers have used indirect wind-power prediction approaches because of the highly chaotic nature of wind-power time series. In comparison to wind-speed time series, the respective wind-power time series are more chaotic and intermittent. Hence, it is difficult to predict them accurately. On the other hand, indirect approaches are associated with additional errors accumulated by the curve fitting of power curves. The proposed approach attempts to reduce the prediction errors associated with both direct and indirect approaches. Firstly, this approach smooths down the wind-power time series with the help of the wind-speed time series by using the same label sequence technique as the one used in the conventional PSF algorithm. Secondly, it predicts the future values of the wind-power time series with PSF principles.
Given wind-speed and wind-power values recorded in the past at a specific interval (5, 15, 30, or 60 min) up to day $d - 1$, the prediction of future values of wind power is expected at the next few intervals (of the same resolution) for day $d$. Consider that $TS_P$ and $TS_S$ are the time series composed of $n$ samples of wind power and wind speed, respectively, as follows:
$$TS_P = [P_1, P_2, \ldots, P_n] \qquad (4)$$
$$TS_S = [S_1, S_2, \ldots, S_n] \qquad (5)$$
Similar to the procedure followed in PSF, $TS_P$ and $TS_S$ are converted into the label sequences $LS_P$ and $LS_S$, respectively.
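The conversion of a time series into a label sequence can be sketched as below. The sketch makes several simplifying assumptions that are not taken from the paper: scalar samples are clustered instead of daily cycles, a small deterministic one-dimensional k-means replaces a library implementation, the number of clusters `k` is supplied by the caller rather than chosen by the Silhouette index, and the names `normalize` and `label_series` are illustrative.

```python
def normalize(series):
    """Divide every sample by the series mean (mean-based normalization)."""
    mean_value = sum(series) / len(series)
    return [x / mean_value for x in series]


def label_series(values, k, max_iter=100):
    """Cluster scalar samples with a small 1-D k-means and return labels 1..k.

    Centers are initialized evenly between the minimum and maximum value so
    that the result is deterministic.
    """
    if k < 2:
        return [1] * len(values)
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * c / (k - 1) for c in range(k)]
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return [lab + 1 for lab in labels]
```

Applying `label_series(normalize(series), k)` separately to the wind-speed and wind-power series yields two aligned label sequences that play the role of the speed and power label sequences.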
Let $L_i \in \{1, \ldots, K\}$ be the label of day $i$ obtained in the labeling step of the PSF method, where $K$ is the number of clusters. $LS_P$ and $LS_S$ are the label sequences of $W$ consecutive days, as follows:
$$LS_P = [L_{P,t-W+1}, \ldots, L_{P,t-1}, L_{P,t}] \qquad (6)$$
$$LS_S = [L_{S,t-W+1}, \ldots, L_{S,t-1}, L_{S,t}] \qquad (7)$$
The next step is to map the $LS_P$ sequence to the $LS_S$ sequence. This mapping is done with a decision matrix ($M$) that uses the Naïve Bayesian method. The purpose of this matrix is to pair each label in $LS_S$ with all corresponding labels from $LS_P$, together with the probability of occurrence of each pair. The decision matrix ($M$) is formulated with four parameters: the labels from $LS_S$ at $t$ and $t - 1$, the label from $LS_P$ at $t$, and the probability of occurrence of the respective combination, where $t$ is the index into the label sequences ($LS_P$ and $LS_S$):
$$M = [LS_S(t-1),\ LS_S(t),\ LS_P(t),\ PO] \qquad (8)$$
where PO stands for probability of occurrence.
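One way to build such a decision matrix from two aligned label sequences is sketched below. The probability of occurrence is taken here as the relative frequency of each label triple over the whole sequence; this is an assumption of the sketch (a probability conditioned on the speed-label pair would serve the same purpose), and the function name `decision_matrix` is illustrative.

```python
from collections import Counter


def decision_matrix(speed_labels, power_labels):
    """Return rows (S[t-1], S[t], P[t], PO) for two aligned label sequences.

    PO is the relative frequency of the triple among all positions t >= 1.
    """
    if len(speed_labels) != len(power_labels):
        raise ValueError("label sequences must have the same length")
    triples = Counter(
        (speed_labels[t - 1], speed_labels[t], power_labels[t])
        for t in range(1, len(speed_labels))
    )
    total = sum(triples.values())
    return [(s_prev, s_cur, p_cur, count / total)
            for (s_prev, s_cur, p_cur), count in sorted(triples.items())]
```

Each returned row corresponds to one row of a table with three label columns and one probability column.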
Table 1 shows a sample decision matrix, where the first three columns are the combinations of labels of $LS_S(t-1)$, $LS_S(t)$, and $LS_P(t)$, and the fourth one is the probability of occurrence of each combination. It often happens that a label in $LS_S$ has multiple alternative labels in $LS_P$, with different probabilities of occurrence. In such cases, the Naïve Bayesian method is used to map the most suitable pairs in $LS_P$ and $LS_S$. This mapping of labels generates a look-up table (LUT), as shown in Table 2, which is then used to smooth down the $TS_P$ sequence, as indicated in Equation (9):
$$LUT = NB(M) \qquad (9)$$
where $NB$ is the Naïve Bayesian function.
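The reduction of a decision matrix to a look-up table can be sketched as follows. As a simplifying assumption, the Naïve Bayesian selection is approximated by keeping, for each pair of speed labels, the power label with the highest probability of occurrence; a full Naïve Bayes classifier would instead combine class priors with per-feature likelihoods. The function name `build_lookup_table` is illustrative.

```python
def build_lookup_table(matrix):
    """Map each (S[t-1], S[t]) pair to its most probable power label.

    matrix: iterable of (s_prev, s_cur, p_cur, probability) rows.
    Ties are resolved in favor of the row that appears first.
    """
    best = {}
    for s_prev, s_cur, p_cur, prob in matrix:
        key = (s_prev, s_cur)
        if key not in best or prob > best[key][1]:
            best[key] = (p_cur, prob)
    return {key: label for key, (label, _) in best.items()}
```

With this representation, a label combination belongs to the look-up table exactly when the power label equals the entry stored for its pair of speed labels.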
The next process is the smoothing of the $TS_P$ series, which is performed with the help of the above-mentioned look-up table. Firstly, all labels in $LS_S$ are compared with the respective labels in $LS_P$. The ideal cases are those in which the label combination appears in the look-up table, and the wind-power label is then kept unchanged, as shown in Equation (10):
$$LS'_P(t) = L_{P,t}, \quad [L_{S,t}, L_{S,t-1}, L_{P,t}] \in LUT \qquad (10)$$
For mismatched cases, the labels in $LS_P$ are replaced with the labels corresponding to the respective $LS_S$ labels in the look-up table, as shown in Equation (11):
$$LS'_P(t) = L_{P,LUT,t} \qquad (11)$$
where $[L_{S,t}, L_{S,t-1}, L_{P,t}] \notin LUT$, $L_{S,t}$ and $L_{P,t}$ are the labels in $LS_S$ and $LS_P$, respectively, and $L_{P,LUT,t}$ is the replacement for $L_{P,t}$ taken from the look-up table in nonideal cases.
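The smoothing rule described above can be sketched as a single pass over the two label sequences. The look-up table is assumed to be a dictionary from a pair of speed labels at t − 1 and t to the preferred power label; leaving the first label and any unseen pair unchanged is a choice of this sketch, and the function name `smooth_power_labels` is illustrative.

```python
def smooth_power_labels(speed_labels, power_labels, lut):
    """Replace power labels that disagree with the look-up table.

    lut maps (S[t-1], S[t]) to the preferred power label. The first label has
    no predecessor, and pairs absent from lut are left unchanged; both choices
    are assumptions of this sketch.
    """
    smoothed = list(power_labels)
    for t in range(1, len(power_labels)):
        key = (speed_labels[t - 1], speed_labels[t])
        if key in lut and power_labels[t] != lut[key]:
            smoothed[t] = lut[key]  # mismatched case: take the table's label
        # otherwise the combination is an ideal case and the label is kept
    return smoothed
```

The input sequence is not modified, so the original and smoothed label sequences can be compared afterwards.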

Eventually, this removes the labels in $LS_P$ responsible for making the wind-power time series more chaotic and intermittent, and generates a smoother sequence of wind-power labels ($LS'_P$). This new sequence ($LS'_P$) possesses a positive but much smaller Maximum Lyapunov Exponent (MLE) than $LS_P$, as shown in Section 4.3. The correlation coefficient between $LS'_P$ and $LS_S$ is also larger than the one between $LS_P$ and $LS_S$. This assures that the $LS'_P$ sequence is smoother and more favorable for the prediction of future values than $LS_P$. The procedure of the proposed methodology is illustrated in graphical form and as a block diagram in Figures 2 and 3, respectively. It is also expressed in terms of pseudocode in Figure 4.
Furthermore, the prediction process after smoothing $LS_P$ is adopted from the conventional PSF algorithm. It starts with the selection of the optimum window size ($W$). As in the conventional PSF algorithm, the last $W$ labels in $LS'_P$ are searched for in the whole $LS'_P$ series. The mean of the label that immediately follows each repetition of this window sequence is taken as the future value of $LS'_P$, and it is then replaced with a value within the range of $TS_P$ by the denormalization process.

Description of Experimental Data
The proposed methodology can be better understood if it is accompanied by a numerical example. This section aims to show that the proposed method can outperform the results obtained by using only the PSF algorithm without the smoothing process. In this study, the performance of the proposed prediction approach was evaluated using wind-power and wind-speed datasets collected from the website of the National Renewable Energy Laboratory (NREL), USA [26]. The wind data were measured in 2012 at a time interval of 5 min. With the same resolution of 5 min, one week of wind-speed and wind-power data was extracted from each of the four seasons (winter, spring, summer, and autumn). Both wind power and wind speed were measured at the same time interval at the same location. The basic statistical parameters of these datasets are summarized in Table 3. The mean, median, minimum, and maximum values of all datasets are shown, which express the variation and deviation in wind data with respect to the change in seasonal conditions.

Observations
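The proposed methodology has been tested with three error performance measures: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE), as given in Equations (12)-(14):
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(X_i - \hat{X}_i\right)^2} \qquad (12)$$
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|X_i - \hat{X}_i\right| \qquad (13)$$
$$\mathrm{MAPE} = \frac{100}{N}\sum_{i=1}^{N}\left|\frac{X_i - \hat{X}_i}{X_i}\right| \qquad (14)$$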
where $X_i$ and $\hat{X}_i$ are the measured and predicted data at time instant $i$, respectively, and $N$ is the number of data points used for prediction evaluation. The RMSE and MAE values indicate the sample standard deviation and variation between measured and predicted data, respectively, whereas MAPE values show accurate sensitivity measurements for minute changes in the predicted data.
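The three error measures can be computed as in the sketch below, which follows their standard definitions. The function names are illustrative, MAPE is expressed in percent, and rejecting zero measured values in MAPE is a choice of this sketch (wind-power series can contain zeros, for which the percentage error is undefined).

```python
import math


def rmse(measured, predicted):
    """Root Mean Square Error between two equally long sequences."""
    return math.sqrt(sum((x - p) ** 2 for x, p in zip(measured, predicted)) / len(measured))


def mae(measured, predicted):
    """Mean Absolute Error between two equally long sequences."""
    return sum(abs(x - p) for x, p in zip(measured, predicted)) / len(measured)


def mape(measured, predicted):
    """Mean Absolute Percentage Error, in percent.

    Raises ValueError if a measured value is zero, because the percentage
    error is undefined there.
    """
    if any(x == 0 for x in measured):
        raise ValueError("MAPE is undefined when a measured value is zero")
    return 100.0 * sum(abs((x - p) / x) for x, p in zip(measured, predicted)) / len(measured)
```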
Further, the prediction accuracy of the proposed method is compared with seven distinct state-of-the-art methods used for short-term wind-power prediction applications with similar time horizons. The performance of the proposed method is compared with ARIMA [11,27], Persistence Model (PM) [28,29], Nonlinear AutoRegressive eXogenous model (NARX) [30], SVM [31,32], Multilayer Perceptron neural network (MLP) [33], Extreme Learning Machine neural network (ELM) [34], and PSF [25] models for each week's dataset from all four seasons, as well as for the one-year dataset. All comparisons are performed for 5, 15, 30, and 60 min ahead predictions.
Since the proposed method is presented as an alternative to direct and indirect prediction approaches, it is compared with both. In the direct approach, wind-power datasets are directly predicted with all methods under study, whereas in the indirect approach, wind-speed datasets are predicted with the prediction methods and then transformed into wind-power data with the use of power curves. In this study, four different power-curve fitting techniques are used: the fourth-order polynomial, exponential, fitted-power, and regression models. The corresponding seasonwise equations are given in Appendix A. These equations are derived by fitting the power curves of the datasets of each season, as illustrated in Figure 5. Further, in Appendix B, Table A5a shows the prediction results of state-of-the-art methods with the direct prediction approach, and those of the indirect approaches are tabulated in Tables A5b and A6a-c for the fourth-order polynomial, exponential, fitted-power, and regression models, respectively. On the same comparison platform, the prediction results of the proposed approach are shown in Table 4. Even on a first inspection of these tables, the lower RMSE, MAE, and MAPE values of the proposed approach indicate its better prediction accuracy and usability. A more detailed comparative analysis of the case study is discussed below.

Discussion
Tables A5 and A6 provide a comparison between distinct prediction models in terms of three statistical measures for different datasets at different prediction horizons. By simply observing these tables, it can be stated that none of the methods shows superior performance in all cases. Hence, it is extremely difficult to make a generalized statement about which model provides the best prediction for any wind-power time series. Furthermore, it can be observed that the performance of the methods varies with changes in the prediction horizon. In other words, the method that performs best for a very short-term prediction horizon is not necessarily the best one for a short-term horizon. It is even difficult to state in general whether the direct or the indirect approach is superior.
In order to address this ambiguity, the results in Tables A5 and A6 were further analyzed in a different format, as shown in Tables 5 and 6. Table 5 indicates the performance of all methods excluding the proposed method, collectively for all datasets (one-week data for all four seasons). Each value in this table represents the percentage of cases in which the respective method outperformed all other methods in the comparison. The overall comparison shows that ARIMA, SVM, and PSF performed best most often: these methods outperformed the others in 16.25%, 22.50%, and 26.25% of cases, respectively. However, when the comparison is done on the basis of prediction horizons, the performance of the prediction methods varied significantly. In this study, for a 5 min ahead prediction horizon, PSF showed the best performance in 45% of cases, whereas no method was so dominant in the 15, 30, and 60 min ahead prediction horizons, where most of the methods achieved similar and mixed performance. It is important to note that the performance of the ELM models was good in most cases, but ELM accounts for only 7.5% of the best-performing cases in Table 5. This apparently misleading figure arises because the prediction errors associated with ELM were very close to, but still larger than, those of the best-performing methods. In contrast, the PM method showed the worst prediction accuracy in almost all cases. Interestingly, the best-performance percentages in Table 5 changed significantly with the inclusion of the proposed method, because the errors corresponding to the proposed method were lower than those of the contemporary methods. The prediction errors for all seasons with the proposed method are tabulated in Table 4. The proposed method showed the best performance in almost all cases. This quantified comparison shows the superiority of the proposed method for wind-power predictions.
Additionally, this case study examined and compared the performance of direct and indirect prediction approaches with the proposed approach, as shown in Table 6. This table presents the percentage of cases in which the corresponding technique (direct or indirect) performed best among the techniques, with all prediction methods and the datasets from all seasons. These techniques are compared for different prediction horizons (5, 15, 30, and 60 min). In this study, the direct prediction approach outperformed all indirect techniques for all four prediction horizons, and therefore performed best overall for all seasons. Among the indirect techniques, the regression model showed better prediction accuracy in more cases than the other techniques for all prediction horizons.
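The "percentage of best performance" statistic used in this discussion can be sketched as a simple tally. The data layout (one dictionary of errors per case, where a case is one dataset, horizon, and error measure) and the function name `best_method_share` are assumptions made for illustration, as is the rule that ties are credited to every tied method.

```python
from collections import Counter


def best_method_share(errors_by_case):
    """Percentage of cases in which each method attains the lowest error.

    errors_by_case: list of dicts mapping method name -> error for one case.
    Ties are credited to every tied method.
    """
    wins = Counter()
    for case in errors_by_case:
        lowest = min(case.values())
        for method, err in case.items():
            if err == lowest:
                wins[method] += 1
    methods = sorted({name for case in errors_by_case for name in case})
    return {name: 100.0 * wins[name] / len(errors_by_case) for name in methods}
```

Because only the winner of each case is counted, a method that is consistently a close second receives a low percentage, which is the effect noted for ELM above.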
So far, the comparative study has shown the superior performance of the proposed methodology for week-sized datasets collected from the different seasons of a year. However, it is also interesting to observe its performance on a whole one-year dataset, and to know the effects of seasonal variations on prediction accuracy. Figure 6a,b illustrates the wind-speed and wind-power time series (initial 5000 samples) of the whole one-year dataset, respectively. The power curve between these time series is also shown in Figure 6c. As discussed in Section 3, the proposed methodology smooths down the wind-power time series, as shown in Figure 6d. The changes in the amplitudes of the smoother time series ($TS'_P$, as shown in Figure 6d) at various samples are clearly visible when compared to the measured wind-power time series ($TS_P$). These changes in amplitude remove chaotic components, so that the maximum Lyapunov exponent, which was 0.9898 for $TS_P$, is reduced to 0.9221 for $TS'_P$. It was also observed that $TS'_P$ was more correlated with the $TS_S$ time series (correlation coefficient of 0.981) than $TS_P$ was (correlation coefficient of 0.9421). This makes the time series more favorable for prediction with PSF methodologies. Further, Figure 7 compares the initial 100 samples of the observed and predicted values of the validating time series. The comparison of prediction error values for the whole one-year dataset for distinct time horizons for the proposed and other contemporary methods is shown in Table 7. Similar to the earlier comparisons for datasets from different seasons, Figure 7 and Table 7 reflect the superior prediction performance of the proposed methodology.

Conclusions
In this paper, a wind-power forecasting algorithm has been proposed, which can be considered an alternative method to direct and indirect approaches. While a direct approach directly predicts power, and an indirect approach does so with the help of power curves after previous predictions of wind speed, the proposed method combines both wind-speed and wind-power data, smooths down the resulting wind-power series, and uses it for predicting wind power in a clearly less chaotic way than existing methods do.
Multiple simulations were carried out with the aim of collecting a contingent of results. Three different error measures were used in order to quantify how much the proposed method can be said to outperform existing ones. Our conclusions are outlined in the next few paragraphs.
Direct prediction approaches show more accuracy in forecasts in comparison to indirect approaches in terms of all three error measures. The crucial reason behind these observations is that power curves are only based on the average deterministic relationships between wind-speed and wind-power datasets. However, such relationships are actually stochastic in nature. Power-curve variability is a significant factor that reduces wind-power prediction accuracy. In contrast, in the proposed method, all time instances in a wind-power time series are handled and modified individually on a case-by-case basis. This smooths down the time series and removes stochastic patterns in it to some extent.
As shown in Table 6 and discussed in the corresponding section, among the contemporary methods, ARIMA, SVM, and PSF showed the best performance for both direct and indirect approaches of wind-power predictions. However, Table 5 shows how much the proposed methodology outperforms ARIMA, SVM, PSF, and other methods for all seasons. It shows, on average, 22.79%, 24.65%, and 17.26% improvement of the proposed method compared to ARIMA, SVM, and PSF, respectively, collectively for all seasons and time horizons. Similar improvement is observed for the whole one-year data.
There is scope for future developments. For instance, in this paper, the method used only values at time instants t and t − 1. A possibility is to use more time instants, such as t − 2, t − 3, ..., t − n. In a way, this presents certain similarities with Markov processes, where Markov chain matrices of several orders could be established, depending on whether data of one or more previous states are taken into account when the probability of a state must be calculated.
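A possible shape of this extension is sketched below: the look-up table is keyed by the speed labels at t − n, ..., t instead of only t − 1 and t. This is a speculative illustration of the idea, not something evaluated in the paper; the function name `build_lut_order_n`, the `order` parameter, and the use of the most frequent power label per context are all assumptions.

```python
from collections import Counter, defaultdict


def build_lut_order_n(speed_labels, power_labels, order=1):
    """Map the speed labels at t-order, ..., t to the most frequent power label at t.

    order=1 corresponds to using the time instants t and t-1 only.
    """
    counts = defaultdict(Counter)
    for t in range(order, len(speed_labels)):
        context = tuple(speed_labels[t - order:t + 1])
        counts[context][power_labels[t]] += 1
    return {context: c.most_common(1)[0][0] for context, c in counts.items()}
```

Longer contexts describe the recent wind-speed history more precisely, but each context is then observed less often, so the label frequencies are estimated from fewer samples.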

Figure 2. Steps involved in the proposed methodology.

Figure 3. Block diagram of the proposed methodology.

Figure 4. Pseudocode for the proposed methodology.

Figure 6. Illustrations of a whole one-year dataset used in the study: initial 5000 samples of (a) wind-speed and (b) wind-power time series; (c) power curve; (d) smoother wind-power time series with the proposed method.

Figure 7. Comparison of observed and predicted values of a whole one-year dataset (initial 100 samples).

Pseudocode of the proposed methodology (Figure 4):
Variables: label sequences of power LS_{P,W} and speed LS_{S,W}, length of window W, test set T, decision matrix M, and look-up table LUT
Output: forecasts TS_P(t) for all time intervals of T
for each j ∈ ES_t
    TS_P(t) ← TS_P(t) + TS_P(j + 1)
TS_P(t) ← TS_P(t) / size(ES_t)
D ← D ∪ TS_P(t)
[L_1, L_2, ..., L_{t−1}, L_t] ← clustering(D, K)
t ← t + 1
return TS_P(t) for all time intervals of T

Table 3. Statistical characteristics of datasets.

Table 4. Performance of proposed methodology for wind power predictions.

Table 5. Percentages of best performance of state-of-the-art methods for different prediction horizons.

Table 6. Percentages of best performance of direct and indirect prediction approaches for different prediction horizons. Last four rows are curve-fitting techniques used for indirect approaches.

Table 7. Comparison of proposed methodology with contemporary methods for a whole one-year dataset.

Table A5. Comparison of wind-power prediction results with (a) direct prediction approach and indirect prediction approach with curve-fitting techniques: (b) fourth-order polynomial model.

Table A6. Comparison of wind-power prediction results with (a) exponential model; (b) fitted-power model; and (c) regression model.