## 1. Introduction

Currently, because of increasing environmental pollution and the energy crisis, wind energy is very important for the energy industry. The use of wind energy in electricity production is widespread, and new units with a nominal capacity of thousands of megawatts are being installed each year [1]. In 2017, according to the World Wind Energy Association, the total installed wind power capacity (WPC) worldwide increased to 539 GW, with 52.6 GW newly installed [2], while the global growth rate was 10.8%. In the same year in China, the newly installed WPC was 15 GW, and the total capacity reached 163.67 GW, a 21.3% increase. Both the wind power capacity and the growth rate of China were larger than those of other countries in 2017. Wind energy is important to the economic and environmental operation of electric power systems because it is clean and renewable, which makes it an attractive subject for researchers [3]. Nevertheless, wind power is inherently random, unstable, and intermittent. If electricity produced by unstable wind power is injected into the power grid, especially in large quantities, it threatens the grid's safety. This problem can be addressed by accurately predicting wind power [4]. Precise wind energy prediction helps operators of the power grid control system know, in a timely manner, how much electric power is produced by wind energy, and lets them employ a sensible dispatching plan for other forms of energy to supply an appropriate amount of electricity. Accurate prediction of wind power therefore plays a vital role in the safe and economical operation of the power grid; it can also guide the normal operation of wind turbines and extend the equipment's service life, while reducing dependence on expensive conventional energy sources [5].

In recent years, many approaches have been developed in the literature for wind speed and wind power prediction. These approaches fall into three categories: the physical approach, the statistical approach, and the soft computing approach. The principle of the physical approach is to find the relationships among wind speed, temperature, pressure, and moisture and to build thermodynamic formulas [6]. This kind of model is good for long-term wind speed prediction. However, detecting and collecting this information needs many sensors, which can be very expensive. What is more, solving this kind of model requires complex calculations, and the models require a huge number of physical specifications [7]. These disadvantages limit the application of the physical model. In addition, physical models are selected for modeling long time horizons, while statistical models are more suitable for short time horizons [8]. The statistical approach tries to find inherent relationships within the actual data. Autoregressive models, such as autoregressive moving average (ARMA) and autoregressive integrated moving average (ARIMA), are commonly utilized for short-term wind speed prediction [9,10]. In recent years, some new and improved statistical models were proposed for wind prediction [11,12,13]. Kavasseri and Seetharaman [14] applied a fractional-ARIMA model to wind speed prediction over one- and two-day-ahead horizons at four potential wind generation sites in North Dakota, United States of America (USA). The simulated results showed that fractional-ARIMA outperformed ARIMA when the wind speed series showed long-memory characteristics. Erdem and Shi [15] proposed four ARMA-based approaches for the prediction of hourly wind speed obtained from two wind observation sites in North Dakota, USA, and obtained satisfactory simulation results. Some authors also used space–time statistical models for wind energy prediction and found them better than simple statistical time-series models. However, such models provide less accurate prediction results because they cannot adequately address the nonlinearity of the data [16]. In addition, statistical models assume that any phenomenon can be expressed as a linear combination of its own past values, given that the studied stochastic process is stationary. However, it has been documented that wind speed time series show heteroscedastic, non-stationary, and highly nonlinear behavior. Soft computing methods were therefore adopted in this study because of their excellent nonlinear processing capacity, which is very important for high-precision wind energy predictions [17,18,19].

In soft computing (SC) approaches, models learn automatically from previous data to recognize future trends. The most popular SC-based models are neural network (NN), neuro-fuzzy system (NF), support vector regression (SVR), least square support vector regression (LSSVR), and M5 regression tree (M5RT) models. Wind power production is mainly affected by wind speed fluctuations [20,21,22]. Thus, SC-based models overcome the shortcomings of statistical models in handling the nonlinearity of the data (e.g., wind speed) [23,24]. NF models were successfully utilized for modeling wind energy in the past few decades [25,26,27,28,29,30,31,32]. Liu et al. [26] predicted wind energy using NF and compared the results with a radial basis function neural network (RBFNN), a backpropagation neural network (BPNN), and LSSVR. They first predicted wind energy using BPNN, RBFNN, and LSSVR separately. Then, they used the predictions of these models as inputs to the NF model and found that NF provided more accurate predictions than the individual models. Saleh et al. [27] used NF to predict wind energy, using fuzzy cluster means to select the optimal fuzzy rules. They found that the proposed NF model provided good accuracy in wind energy prediction. Giorgi et al. [28] used NF, NN, and ARMA models to predict wind power. Their results showed superior accuracy of the NF model compared to ARMA and NN. Mohandes et al. [29] estimated the wind speed at different heights using the NF model. The results demonstrated that the NF model could be applied successfully to estimate wind speeds at greater heights, using the wind speeds at lower heights as inputs. Johnson et al. [31] applied the NF model to predict five-minute-ahead wind power. The results were compared with the persistence method, and NF provided better accuracy.

LSSVR was also extensively applied to many wind energy problems in recent years [33,34,35,36,37,38,39,40]. Zhang et al. [33] applied the LSSVR model to wind energy prediction, compared it with the RBF model, and found that LSSVR provided better results. Wang et al. [34] used a combination of ARIMA, extreme learning machine, SVR, and LSSVR models for wind speed prediction. Liu and Li [36] predicted short-term wind speed and wind power using LSSVR with wavelet transform (WT). The results were compared with a recursive least square (RLS) regression model, and LSSVR-WT gave better results than the RLS-WT model. Zhou et al. [38] studied the fine-tuning of SVR model parameters for one-step-ahead wind speed prediction. The simulated results showed that the fine-tuned SVR models outperformed the persistence model. Guo et al. [39] used the LSSVR model for wind speed prediction in the Hexi corridor of China. They compared the results of LSSVR with two statistical models, ARIMA and seasonal ARIMA (SARIMA), and also built hybrids of LSSVR with these models. The results indicated that LSSVR alone provided better accuracy than the others. Yuan et al. [40] applied the LSSVR model with a gravitational search algorithm to the prediction of wind power. They compared the optimized LSSVR with SVR and NN, and LSSVR's performance was superior to that of the other models. M5RT is not as popular as NF and LSSVR in the field of wind energy, and there are limited applications in the literature related to wind prediction. To the best of our knowledge, applications of M5 regression trees in wind energy modeling were only reported by Kusiak et al. [41,42].

In this research, the applicability of the LSSVR, M5RT, NF-SC, and NF-GP methods was investigated for predicting hourly wind speed (WS) and wind power (WP) time series using a cross-validation method. The cross-validation method and M5RT were used successfully in recent years for modeling hydrological time series [43,44], which motivated the authors to apply these methods to wind time series and check their performance. It is worth noting that no published study in the literature predicts wind speed and wind power by comparing the LSSVR, M5RT, NF-SC, and NF-GP models while also using the cross-validation method. The paper is organized as follows: in Section 2, the basic structures of the LSSVR, M5RT, NF-SC, and NF-GP models are briefly explained. In Section 3, the data used in the analysis are described. In Section 4, two neuro-fuzzy and two heuristic regression models are applied to the prediction of hourly wind speed and wind power, and the performance of the four models is analyzed with respect to three statistical indexes. Section 5 contains the concluding remarks. The models were applied using MATLAB software in the present study [45].

## 4. Results and Discussion

In the first part of the research, hourly wind speed was predicted using previous values. Then, the accuracy of LSSVR, M5RT, NF-GP, and NF-SC was tested for hourly wind power prediction. Root-mean-square errors (RMSE), mean absolute errors (MAE), and coefficients of determination (R$^{2}$) were used for evaluating the applied models. RMSE is one of the most commonly used statistics for measuring prediction error. MAE is another statistical index, measuring the absolute error between observed and predicted values. R$^{2}$ represents the degree of linear relationship between the predicted and observed data. These three indices are commonly utilized for evaluating model prediction performance in the field of wind energy [79,80,81,82,83]. Their equations are as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(W_{O,i} - W_{f,i}\right)^{2}}$$

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|W_{O,i} - W_{f,i}\right|$$

$$R^{2} = \left[\frac{\sum_{i=1}^{N}\left(W_{O,i} - \overline{W_{O}}\right)\left(W_{f,i} - \overline{W_{f}}\right)}{\sqrt{\sum_{i=1}^{N}\left(W_{O,i} - \overline{W_{O}}\right)^{2}\,\sum_{i=1}^{N}\left(W_{f,i} - \overline{W_{f}}\right)^{2}}}\right]^{2}$$

where $N$ is the total number of observations, $W_{O,i}$ is the $i$-th observed wind speed/wind power, $W_{f,i}$ is the corresponding predicted wind speed/wind power, $\overline{W_{O}}$ is the average of the observed wind speed/wind power, and $\overline{W_{f}}$ is the average of the predicted wind speed/wind power.
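As an illustration of how these three indices can be computed, the following minimal Python sketch uses the standard definitions of RMSE and MAE and takes R$^{2}$ as the squared linear correlation between observed and predicted values, which is the form suggested by the symbol definitions above. The function names and the sample values are assumptions made for this example and are not taken from the study's MATLAB implementation.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between two equally long sequences."""
    n = len(observed)
    return math.sqrt(sum((o - f) ** 2 for o, f in zip(observed, predicted)) / n)

def mae(observed, predicted):
    """Mean absolute error between two equally long sequences."""
    n = len(observed)
    return sum(abs(o - f) for o, f in zip(observed, predicted)) / n

def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_f = sum(predicted) / n
    cov = sum((o - mean_o) * (f - mean_f) for o, f in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_f = sum((f - mean_f) ** 2 for f in predicted)
    return cov ** 2 / (var_o * var_f)

# Hypothetical wind speeds in m/s, for illustration only.
w_obs = [5.0, 6.0, 7.0, 8.0]
w_pred = [5.5, 5.5, 7.5, 7.5]
print(rmse(w_obs, w_pred), mae(w_obs, w_pred), r_squared(w_obs, w_pred))
```

Note that R$^{2}$ in this form measures only linear association, so a model with a constant offset can still score close to 1; this is why it is reported together with RMSE and MAE.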

Before the models are applied, the number of inputs used to predict the wind speed/wind power should be decided. For this purpose, correlation analysis (CA) was applied to the wind speed and wind power time series to observe the effect of antecedent wind speed and wind power values. Correlation analysis was successfully used in previous studies to determine the inputs of data-driven models [84,85,86,87]. Sudheer et al. [84] used correlation analysis to determine the optimal inputs for an artificial neural network (ANN) in modeling the complex rainfall–runoff phenomenon. Kisi [85] determined the optimal inputs of an ANN in modeling a nonlinear discharge–sediment relationship. Li and Shi [86] applied correlation analysis to determine the optimal inputs of an ANN in wind speed forecasting. Zemzami and Benaabidate [87] applied correlation analysis to decide the inputs of data-driven models in the prediction of daily streamflows. On the basis of the correlation analysis employed in the current study, four input combinations of previous values were selected for each variable as follows: (i) WS$_{t-1}$; (ii) WS$_{t-1}$, WS$_{t-2}$; (iii) WS$_{t-1}$, WS$_{t-2}$, WS$_{t-3}$; and (iv) WS$_{t-1}$, WS$_{t-2}$, WS$_{t-3}$, WS$_{t-4}$ for wind speed, and (i) WP$_{t-1}$; (ii) WP$_{t-1}$, WP$_{t-2}$; (iii) WP$_{t-1}$, WP$_{t-2}$, WP$_{t-3}$; and (iv) WP$_{t-1}$, WP$_{t-2}$, WP$_{t-3}$, WP$_{t-4}$ for wind power (see Table 2).
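The input selection described above can be sketched in Python as follows. The sample autocorrelation shown here is one common form of correlation analysis and is an assumption, since the exact statistic used in the study is not given in this section; the helper names and the short series are likewise hypothetical.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom

def make_lagged_samples(series, n_lags):
    """Build inputs [x(t-1), ..., x(t-n_lags)] and targets x(t)."""
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append([series[t - k] for k in range(1, n_lags + 1)])
        targets.append(series[t])
    return inputs, targets

# Hypothetical hourly wind speeds (m/s), for illustration only.
ws = [4.0, 5.0, 6.0, 5.0, 4.0, 5.0, 6.0, 5.0]

# Input combinations (i) to (iv) use one to four previous values.
combos = {
    name: make_lagged_samples(ws, n_lags)
    for name, n_lags in [("i", 1), ("ii", 2), ("iii", 3), ("iv", 4)]
}
for lag in range(1, 5):
    print(lag, autocorrelation(ws, lag))
```

Each entry of `combos` holds the input matrix and target vector for one input combination, so the same model can be trained on all four and compared.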

#### 4.1. Hourly Wind Speed Prediction Using NF-SC, NF-GP, LSSVR, and M5RT Methods

The test results of the two NF methods are given in Table 2. It can be seen from the table that the NF-SC and NF-GP models give different prediction results for different inputs and datasets. The average statistics show that both methods provided the worst accuracy with the third input combination. Input combinations (ii) and (iv) had better accuracy than input combinations (i) and (iii) for all datasets. Input combination (ii) gave slightly better results than input combination (iv) for the NF-GP method. For the NF-SC method, the performance of input combination (iv) was superior to the other combinations. Both methods had the worst accuracy for the M2 dataset. The reason may be that the maximum and minimum wind speed values of the testing dataset (WS$_{max}$ = 23.13 m/s and WS$_{min}$ = 3.71 m/s) were higher and lower, respectively, than the corresponding values of the training dataset (Table 1). This suggests that the trained NF-GP and NF-SC methods may have difficulties in extrapolating to lower and higher values in the M2 case. The NF-GP and NF-SC methods gave good results for the M4 dataset for all input combinations. Table 2 shows that the NF-GP method performed slightly better than the NF-SC method with respect to the average performance criteria. The reason may be that NF-GP includes many more fuzzy rules (or consequent parameters) than the NF-SC model, which may give this method more flexibility in predicting wind speed.

The test statistics of the optimal LSSVR and M5RT models are summarized in Table 3. Here, input combinations (iii) and (iv) performed worse than the other combinations. Input combination (ii) gave slightly better results than input combination (i) for the LSSVR method. For the M5RT method, input combination (i) outperformed the other combinations. Similar to the NF-GP and NF-SC methods, the LSSVR and M5RT methods had the worst accuracy for the M2 dataset because of the extrapolation difficulties mentioned before. The best models of the LSSVR and M5RT methods were obtained for the M4 dataset using input combinations (ii) and (i), respectively. As observed from Table 3, LSSVR's performance was superior to that of M5RT in one-hour-ahead wind speed prediction. The main reason might be the nonlinear structure of LSSVR compared to M5RT, which uses linear equations for simulation. Various control parameters were considered for each LSSVR model, and the optimal values that provided the minimum RMSE in the test period were selected for each dataset. The optimal parameters of LSSVR are reported in Table 4. Here, M1 denotes model 1, whereas (100, 12) refers to the regularization constant and the RBF kernel's width, respectively. The variation in RMSE with respect to the control parameters of LSSVR is illustrated in Figure 4 for the M4 dataset.

According to the comparison of the NF-GP, NF-SC, LSSVR, and M5RT methods (Table 2 and Table 3), the LSSVR method outperformed the other models in predicting the wind speed of the Sotavento Galicia wind farm. There was only a slight difference between the LSSVR and NF-GP methods. The M5RT method gave inferior results compared to the other methods. The linear structure of this method might be the reason, because wind speed fluctuations are highly nonlinear. The average errors of the NF-GP, NF-SC, LSSVR, and M5RT methods for each input combination are illustrated in Figure 5a,b. As observed from the figure, the average RMSE and MAE values of the LSSVR method were smaller than those of the other models for all input combinations. Relative to NF-GP, NF-SC, and M5RT, LSSVR decreased the overall average RMSE by 1.68%, 2.94%, and 11.71%, respectively.
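Percentage reductions of this kind are presumably relative differences between average RMSE values. The following short Python sketch shows that arithmetic under this assumption; the formula is an interpretation rather than the study's stated definition, and the RMSE values are hypothetical and are not taken from the tables.

```python
def mean(values):
    return sum(values) / len(values)

def relative_reduction(reference_rmse, other_rmse):
    """Percentage by which reference_rmse is lower than other_rmse."""
    return 100.0 * (other_rmse - reference_rmse) / other_rmse

# Hypothetical per-dataset RMSE values (M1 to M4), for illustration only.
rmse_reference = [1.4, 1.8, 1.5, 1.3]  # e.g., the better model
rmse_other = [1.6, 2.0, 1.7, 1.5]      # e.g., the model it is compared with

reduction = relative_reduction(mean(rmse_reference), mean(rmse_other))
print(round(reduction, 2))
```

Because the reduction is expressed relative to the other model's error, the same absolute improvement gives a larger percentage when the baseline error is small.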

Figure 6a–d shows the observed and predicted hourly wind speeds using all methods for the M4 dataset with their best input combinations. It is apparent from the figure that the NF-GP, NF-SC, and LSSVR methods provided higher R$^{2}$ values for the M4 dataset. The figure also shows that the NF-GP model gave a slightly higher value of R$^{2}$ than the LSSVR model. From the fitted line equations, however, it is apparent that the LSSVR model was closer to the ideal line than NF-GP (see the slope and bias coefficients in Figure 6). In fact, both models (LSSVR and NF-GP) had almost the same accuracy in wind speed forecasting.
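The slope and bias coefficients of a fitted line in an observed-versus-predicted scatterplot can be obtained by ordinary least squares, as in the minimal Python sketch below. The assumption here is that the fitted line regresses predicted values on observed values, so that the ideal line has slope 1 and bias 0; the function name and the sample values are hypothetical.

```python
def fit_line(observed, predicted):
    """Least-squares line predicted = slope * observed + bias."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_f = sum(predicted) / n
    cov = sum((o - mean_o) * (f - mean_f) for o, f in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    slope = cov / var_o
    bias = mean_f - slope * mean_o
    return slope, bias

# Hypothetical wind speeds in m/s, for illustration only.
w_obs = [5.0, 6.0, 7.0, 8.0]
w_pred = [5.5, 5.5, 7.5, 7.5]
slope, bias = fit_line(w_obs, w_pred)
print(slope, bias)
```

A slope below 1 combined with a positive bias, as in this toy example, indicates that low values are overestimated and high values are underestimated, which R$^{2}$ alone does not reveal.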

The best (NF-GP) and worst (M5RT) models were also tested in wind speed prediction for multiple horizons (from one to five hours ahead) using the best dataset (M4). The new model results are compared in Table 5. As expected, the models' accuracies deteriorated as the forecast horizon increased. From the table, it is clear that the NF-GP model's performance was superior to that of the M5RT model in wind speed prediction for all considered horizons. Increasing the input lag beyond two (combination (ii)) generally did not increase model accuracy. These results agree with previous studies [88,89,90,91,92]. This indicates the necessity of examining different input lags to obtain the most effective one in WS forecasting.
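One way to set up such multiple-horizon experiments is the direct strategy, in which a separate model is trained for each horizon on the same lagged inputs but with a target shifted further ahead. Whether the study used a direct or a recursive strategy is not stated in this section, so the Python sketch below is only an assumption; the function name and the series are hypothetical.

```python
def make_horizon_samples(series, n_lags, horizon):
    """Inputs [x(t-1), ..., x(t-n_lags)] paired with the target x(t+horizon-1)."""
    inputs, targets = [], []
    for t in range(n_lags, len(series) - horizon + 1):
        inputs.append([series[t - k] for k in range(1, n_lags + 1)])
        targets.append(series[t + horizon - 1])
    return inputs, targets

# Hypothetical hourly series; values equal their time index for easy checking.
series = [float(t) for t in range(10)]

# One training set per horizon, from one to five hours ahead, with two lags.
datasets = {h: make_horizon_samples(series, n_lags=2, horizon=h) for h in range(1, 6)}
for h, (inputs, targets) in datasets.items():
    print(h, len(targets), inputs[0], targets[0])
```

With `horizon=1` this reduces to the one-hour-ahead setting, and the number of usable samples shrinks by one for each additional hour of lead time.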

#### 4.2. Hourly Wind Power Prediction Using NF-SC, NF-GP, LSSVR, and M5RT Methods

In this section, the accuracy of the four methods was examined in one-hour-ahead wind power prediction using previous values. As in the previous application, the cross-validation method was utilized here. The best control parameters of the LSSVR models are reported in Table 6. The RMSE, MAE, and R$^{2}$ statistics of the applied methods are reported in Table 7 and Table 8. As seen from the tables, all methods again performed the worst for the M2 dataset, probably because of the extrapolation difficulties (WP$_{max}$ = 15.85 MW), while they performed very well for the M4 dataset. It is also clear from Table 7 and Table 8 that LSSVR, NF-GP, and NF-SC showed similar accuracy for different input combinations. However, the M5RT method gave worse results than the other methods for all datasets, probably because of its linear structure.
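The cross-validation setup and the extrapolation check discussed above can be sketched in Python as follows. The sketch assumes that M1 to M4 correspond to four contiguous blocks of the series, each used once as the test set; this is an interpretation, because the data partition is described in a section not reproduced here. The function names and the values are hypothetical.

```python
def contiguous_folds(series, n_folds=4):
    """Split a series into contiguous blocks; each block is the test set once."""
    fold_size = len(series) // n_folds
    splits = {}
    for k in range(n_folds):
        start = k * fold_size
        stop = start + fold_size if k < n_folds - 1 else len(series)
        # Training data are everything outside the test block.
        splits[f"M{k + 1}"] = (series[:start] + series[stop:], series[start:stop])
    return splits

def outside_training_range(train, test):
    """Return (test max exceeds train max, test min is below train min)."""
    return max(test) > max(train), min(test) < min(train)

# Hypothetical hourly wind power values (MW), for illustration only.
wind = [5.0, 6.0, 2.0, 9.0, 4.0, 5.0, 6.0, 5.0]
splits = contiguous_folds(wind)
flags = {name: outside_training_range(train, test)
         for name, (train, test) in splits.items()}
print(flags)
```

A fold flagged `(True, True)` is one where the model must extrapolate beyond both ends of the range seen in training, which is the situation reported for the M2 dataset. When lagged inputs are built from the training part, samples that straddle the removed test block should be discarded.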

Figure 7a,b shows the average error statistics of all the applied methods for different input combinations. As seen from the figure, NF-GP performed better than the other methods for all input combinations in terms of RMSE, MAE, and R$^{2}$. Input combination (i) gave the best results for the NF-GP and M5RT models, whereas input combination (ii) provided the best accuracy for the LSSVR and NF-SC models. Input combination (iii) gave the worst results for the NF-GP and NF-SC models, whereas input combination (iv) performed the worst for the LSSVR and M5RT models. The figure also shows that both NF methods performed slightly better than the LSSVR method for all input combinations. Relative to the NF-SC, LSSVR, and M5RT methods, NF-GP decreased the overall average RMSE by 1.30%, 4.52%, and 15.60%, respectively.

The observed and predicted hourly wind powers using all the methods are shown in Figure 8a–d for the M4 dataset. As is apparent from the figure, the NF-GP and NF-SC models were in good agreement with the observed wind power data. The NF-GP and NF-SC methods provided higher R$^{2}$ values for each dataset than the other methods. The figure also shows that the LSSVR method gave slightly higher values of R$^{2}$ than the NF-GP method. The slope and bias coefficients for the NF-GP model were closer to 1 and 0, respectively, than those of the LSSVR, NF-SC, and M5RT models. The scatterplots clearly show that M5RT had more scattered predictions than LSSVR, NF-GP, and NF-SC.

Table 9 compares the best (NF-GP) and worst (M5RT) models in wind power prediction for multiple horizons (from one to five hours ahead) using the best dataset (M4). A decrease in model accuracy with increasing forecast horizon can also be clearly observed here. As seen from the test results, the NF-GP model outperformed the M5RT model for all horizons and input combinations. Increasing the input lag beyond one (combination (i)) generally did not improve the model accuracy. It is evident from the existing literature that increasing the input lag does not guarantee better forecast performance [93,94]. Sometimes, a high number of inputs has a negative impact on variance and causes a more complex model, leading to poor forecasting performance. Therefore, several values of input lag should be searched in the case of WS or WP forecasting using data-driven methods.

## 5. Conclusions

In this study, hourly wind speed and wind power time-series data were used to examine the prediction capability of the NF-GP, NF-SC, LSSVR, and M5RT methods. Three statistical indices (RMSE, MAE, and R$^{2}$) were used for evaluating the performance of these methods. The four heuristic soft computing techniques were first employed in one-hour-ahead wind speed prediction using previous values. The cross-validation method was employed to better evaluate the applied methods. The comparison showed that LSSVR and NF-GP had almost the same accuracy, and they performed better than the other soft computing models. Relative to NF-GP, NF-SC, and M5RT, LSSVR decreased the overall average RMSE by 1.68%, 2.94%, and 11.71%, respectively. The capability of the four methods was also examined in the prediction of wind power using previous values. Relative to NF-SC, LSSVR, and M5RT, NF-GP decreased the overall average RMSE by 1.30%, 4.52%, and 15.60%, respectively. Here, too, LSSVR and NF-GP had almost the same accuracy and performed better than the other methods. The overall results also indicated that the M5RT method gave the worst results in both applications, and that hourly WS and WP could be successfully predicted using the NF-GP and LSSVR methods.

NF-GP and M5RT were also compared in forecasting WS and WP for multiple horizons (from one to five hours ahead). The results indicated the superior accuracy of the first model compared to the latter one. Only one or two input lags were found to be enough for multiple-hours-ahead WS and WP forecasting.

This study examined the ability of two different neuro-fuzzy methods, as well as the LSSVR and M5RT methods, to predict hourly wind speed and wind power. The main limitation of this study was the use of limited data from one site. It is known that the effect of inter-annual variability on one-hour-ahead WS or WP prediction is relatively small. Nevertheless, more training data from different years would be needed to address this effect, which is a limitation of the models presented in the current study. The NF-GP, NF-SC, LSSVR, and M5RT methods can be compared with each other using much more hourly data from other climatic regions. The accuracy of the four methods may also be compared using evolutionary algorithms in the calibration of their control parameters.