A Data-Driven Short-Term Forecasting Model for Offshore Wind Speed Prediction Based on Computational Intelligence

Abstract: Wind speed forecasting is an important element for the further development of offshore wind turbines. Due to its importance, many researchers have proposed different models for wind speed forecasting that differ in terms of the time horizon of the forecast, the types and number of inputs, complexity, structure, and other aspects. Wind speed series present high nonlinearity and volatility, and thus an effective model should successfully deal with those features. One approach to dealing with nonlinearity and volatility is to utilize a time series processing technique such as the wavelet transform. In the present paper, an ensemble data-driven short-term wind speed forecasting model is developed, tested and applied. The term “ensemble” refers to the combination of two different predictors that run in parallel; the prediction is obtained from the predictor that leads to the lowest error. The proposed model utilizes the wavelet transform and is compared with other models that have been presented in the related literature, whose accuracy it outperforms. The proposed forecasting model can be used effectively for 1 min and 10 min ahead wind speed predictions.


Motivation and State-of-the-Art
The rapid deployment of wind turbines across the globe brings a set of challenges for power system operation and planning. This is due to the intermittent nature of wind potential. By the end of 2018, the total installed offshore wind capacity in the European Union reached 19 GW. Total investments in offshore wind in 2018 were more than 10.3 billion € (WindEurope [1], 2018). This includes investments in construction of projects, transmission assets, and refinancing. More than 91% of all offshore wind installations were located in shallow or intermediate water depths, with a mean water depth equal to 27.1 m. The capacities span from a few to hundreds of megawatts and the installations differ in terms of hub site, distance to shore, water depth and other characteristics (Snyder et al. [2], 2009). The reduction of installation and maintenance costs, together with the reliable assessment of the energy production of offshore wind parks, will mark the next phase of their deployment.
Wind energy disturbs the balance between the generation and demand sides. A wind speed prediction system is a potential solution to this situation. Apart from the balance, accurate predictions can lead to lower costs during the installation in the offshore field of wind [...] validate the proposed model for very short-term and short-term predictions by using the data obtained from the National Renewable Energy Laboratory of USA. The model is compared with a feed-forward neural network and a radial basis network. The prediction is made using only past wind speed values. In (Zhou et al. [17], 2011), the authors present a least-squares support vector machine for one-step-ahead wind speed forecasting. Three kernels, namely linear, Gaussian, and polynomial kernels, are implemented. The support vector machine parameters considered include the training sample size, order, regularization parameter, and kernel parameters. The support vector machine versions are compared with a persistence model and provide better forecasts. The Adaptive Neuro-Fuzzy Inference System (ANFIS) is utilized in (Fazelpour et al. [18], 2016), and is compared with a feed-forward neural network and a radial basis network in hour-ahead forecasting at a location in Tehran, Iran. No exogenous parameters are used. ANFIS results in better forecasts. In (Fortuna et al. [19], 2016), a clustering tool is used to form wind speed classes. Then, two models, namely the Hidden Markov Model and the Nonlinear Autoregressive model, are compared for predicting the class of each new wind speed data entry. In general, wind speed series present volatility and stochasticity. Depending on the data set, an analysis of the wind speed characteristics can take place. For instance, in (Fortuna et al. [20], 2014), the authors provide a fractal analysis of wind speed observations. Exploitable information can be derived from such an analysis for further modeling.

Contribution of the Present Paper
A variety of forecasting techniques have been proposed so far by different research groups. In the present paper, a relatively simple yet efficient model for short-term wind speed forecasting based on real measured wind speed data is developed, applied, and proposed. The data set used contains inconsistencies in the time sequence of the wind speed series due to missing data. Various experiments are carried out with different input combinations. Also, the Discrete Wavelet Transform (DWT) is utilized in order to decompose the initial series into a set of wavelet components for strengthening the forecasting credibility [21,22]. The model is composed of an ANFIS and a Feed-Forward Neural Network (FFNN) [23,24]. In the majority of the studies in the literature, the prediction is accomplished using only past values. In order to fully examine the level of influence of external variables such as temperature and wind direction on the prediction accuracy, various input combinations of wind speed, wind direction, and air temperature are examined in the present paper. Overall, the purpose of the paper is to test the performance of a proposed hybrid computational intelligence model in the wind speed forecasting problem under the limitation of using incomplete data for the training and validation of the model.
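To make the decomposition step concrete, the following is a minimal sketch of a single-level DWT. The paper does not state the wavelet family or the number of decomposition levels here, so the Haar wavelet, the single level, the function names `haar_dwt` and `haar_idwt`, and the sample values are all assumptions made purely for illustration.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("this sketch expects an even-length signal")
    scale = math.sqrt(2.0)
    # Scaled pairwise sums capture the smooth trend of the series.
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    # Scaled pairwise differences capture the fast fluctuations.
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the original series from both components."""
    scale = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / scale)
        signal.append((a - d) / scale)
    return signal


# Hypothetical wind speed samples in m/s (not taken from the paper's data set).
wind = [5.1, 5.3, 6.0, 5.8, 4.9, 5.0, 7.2, 6.9]
approx, detail = haar_dwt(wind)
reconstructed = haar_idwt(approx, detail)
```

In such a scheme each component is smoother or more regular than the raw series, and the inverse transform recovers the original samples, so no information is lost by working in the wavelet domain.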

Description
In this section, an efficient forecasting model is developed and proposed. The model consists of an FFNN trained by the Levenberg-Marquardt algorithm and an ANFIS [25]. In recent years, neural network-based forecasting systems have been favored over traditional time series models. Numerous applications in load and price forecasting studies have brought forth the advantages of neural networks. Recently, neural networks have also been used in wind power predictions. For a full mathematical description of the FFNN the reader is referred to (Graupe [24], 2007). A general illustration of an FFNN is shown in Figure 1. Another common forecasting system is ANFIS (Jang [23], 1993). ANFIS is based on a fuzzy rule-based inference mechanism. It is composed of five layers and each layer contains several nodes.
The nodes are described by a node function. Let $O_i^j$ be the output of the i-th node in layer j. In the 1st layer, every node i is an adaptive node with node function:

$$O_i^1 = \mu_{A_i}(x), \quad i = 1, 2 \tag{1}$$

or

$$O_i^1 = \mu_{B_{i-2}}(y), \quad i = 3, 4 \tag{2}$$

where x or y is the input of the i-th node and $A_i$ or $B_{i-2}$ is a linguistic label associated with the node. Hence, $O_i^1$ is the membership grade of a fuzzy set $A_1$, $A_2$, $B_1$ or $B_2$ and it specifies the degree to which the input x or y satisfies the quantifier A or B. Any continuous and piecewise differentiable function can be used as node function in the 1st layer. In the 2nd layer, each node Π multiplies the inputs and sends the product to the output:

$$O_i^2 = w_i = \mu_{A_i}(x)\,\mu_{B_i}(y), \quad i = 1, 2 \tag{3}$$

In the 3rd layer, each node N computes the ratio

$$O_i^3 = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2 \tag{4}$$

In the 4th layer, each node computes the contribution of the i-th rule to the overall output:

$$O_i^4 = \bar{w}_i f_i = \bar{w}_i \left( a_i x + b_i y + c_i \right) \tag{5}$$

where $\bar{w}_i$ is the output of the 3rd layer and $a_i$, $b_i$, $c_i$ are a set of parameters. Finally, in the 5th layer, the node Σ computes the final output as the summation of all inputs:

$$O^5 = \sum_i \bar{w}_i f_i \tag{6}$$

ANFIS topology is displayed in Figure 2.
The proposed forecasting model combines the independent forecasts of FFNN and ANFIS. A schematic representation of the hybrid model is shown in Figure 3. The models are trained separately. The training set is used to define the optimal model parameters. For the FFNN, the parameters that need to be defined are the number of hidden layers, the number of neurons in the hidden layer, and the type of activation function in the hidden and output layers. For ANFIS, the parameters that need to be defined are the type of inference mechanism, the training epochs, the number of fuzzy rules, the type of membership function, and the values of $a_i$, $b_i$, $c_i$.
Real monitored environmental data measured with a monitoring system are used in the present paper for developing the forecasting model. The monitoring system is placed in the coastal area of Neos Marmaras, Greece. Details about the monitoring system (e.g., sensors used and verification) can be found in (Michailides et al. [26], 2013). The training and test sets cover the periods 01/04/2013-10/08/2013 and 01/09/2013-24/12/2013, respectively. The test set is used for the comparison of the models. No filling of incomplete or missing data took place. Also, no other preprocessing of the data took place. The aim is to build a model applied to raw data obtained from a real measurement system.
The topology of the proposed model using the wavelet components is shown in Figure 4.
Therefore, we examined ten different cases referring to the two prediction horizons.
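The five-layer ANFIS computation described in this section can be traced with a small forward pass. The following is a minimal sketch, assuming two inputs, two rules, Gaussian membership functions, and first-order (linear) rule consequents. The function names and every parameter value are hypothetical and chosen only so the arithmetic is easy to follow; they are not the parameters of the trained model.

```python
import math


def gaussian_mf(value, center, sigma):
    """Gaussian membership grade of `value` for a fuzzy set (center, sigma)."""
    return math.exp(-0.5 * ((value - center) / sigma) ** 2)


def anfis_forward(x, y, sets_a, sets_b, consequents):
    """Forward pass of a two-input Sugeno-type ANFIS with one rule per (A_i, B_i) pair."""
    # Layer 1: membership grades of x in A_i and of y in B_i.
    mu_a = [gaussian_mf(x, center, sigma) for center, sigma in sets_a]
    mu_b = [gaussian_mf(y, center, sigma) for center, sigma in sets_b]
    # Layer 2: firing strength of each rule (product of its membership grades).
    w = [ma * mb for ma, mb in zip(mu_a, mu_b)]
    # Layer 3: normalized firing strengths.
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: contribution of each rule, w_bar_i * (a_i * x + b_i * y + c_i).
    contributions = [wb * (a * x + b * y + c)
                     for wb, (a, b, c) in zip(w_bar, consequents)]
    # Layer 5: overall output is the sum of the rule contributions.
    return sum(contributions), w_bar


# Hypothetical parameters: (center, sigma) per fuzzy set, (a, b, c) per rule.
sets_a = [(0.0, 1.0), (2.0, 1.0)]
sets_b = [(0.0, 1.0), (2.0, 1.0)]
consequents = [(1.0, 1.0, 0.0), (0.0, 0.0, 4.0)]

output, w_bar = anfis_forward(1.0, 1.0, sets_a, sets_b, consequents)
```

With the inputs placed midway between the two set centers, both rules fire equally, so the output is the plain average of the two rule consequents. During training, the membership parameters and the consequent parameters are the quantities being tuned.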

Performance Assessment
The performance assessment includes a set of mathematical criteria that measure the prediction errors. To fully examine the performance of the proposed model, we used a set of different criteria. Let $p_m^a$ and $p_m^f$ be the actual and predicted wind speed values at the m-th time step of the test set, m = 1, 2, ..., M, respectively. The indicators considered for the assessment are the Absolute Error (AE), the Mean Absolute Error (MAE), the Root Mean Squared Error (RMSE), and the Mean Absolute Range Normalized Error (MARNE) as defined in Equation (10). The AE is defined as

$$\mathrm{AE}_m = \left| p_m^a - p_m^f \right| \tag{7}$$

The MAE corresponds to the average of all AEs:

$$\mathrm{MAE} = \frac{1}{M} \sum_{m=1}^{M} \left| p_m^a - p_m^f \right| \tag{8}$$

The RMSE is expressed as

$$\mathrm{RMSE} = \sqrt{\frac{1}{M} \sum_{m=1}^{M} \left( p_m^a - p_m^f \right)^2} \tag{9}$$

The MARNE is the mean absolute difference between the actual and forecast wind speed, normalized to the maximum wind speed and expressed as a percentage:

$$\mathrm{MARNE} = \frac{100}{M} \sum_{m=1}^{M} \frac{\left| p_m^a - p_m^f \right|}{\max_m p_m^a} \tag{10}$$

As benchmarks for the proposed model test, the individual applications of FFNN and ANFIS are used.
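The four indicators can be computed directly from paired actual and forecast values. The sketch below assumes that the MAE is the mean of the AEs and that the MARNE is the MAE divided by the maximum actual wind speed, expressed in percent. The function names and the sample values are hypothetical and serve only to show the arithmetic.

```python
import math


def absolute_errors(actual, predicted):
    """AE for every time step."""
    return [abs(a - p) for a, p in zip(actual, predicted)]


def mae(actual, predicted):
    """Mean of the absolute errors."""
    errors = absolute_errors(actual, predicted)
    return sum(errors) / len(errors)


def rmse(actual, predicted):
    """Square root of the mean squared error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def marne(actual, predicted):
    """MAE normalized by the maximum actual value, in percent (assumed definition)."""
    return 100.0 * mae(actual, predicted) / max(actual)


# Hypothetical values in m/s, not taken from the paper's test set.
actual = [4.0, 6.0, 8.0, 10.0]
predicted = [5.0, 6.0, 7.0, 12.0]

ae_values = absolute_errors(actual, predicted)   # [1.0, 0.0, 1.0, 2.0]
mae_value = mae(actual, predicted)               # 1.0
rmse_value = rmse(actual, predicted)             # sqrt(1.5)
marne_value = marne(actual, predicted)           # 10.0 (percent)
```

Because the RMSE squares each error before averaging, the single 2 m/s miss raises it above the MAE, which is why the two indicators can rank models differently.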

Wind Speed Forecasting
Computational intelligence-based systems have in recent years been favoured over traditional time series models for predicting various variables, such as electric load. However, a careful selection of inputs and a proper training phase are essential for the model's successful implementation and utilization. The selection of the types of inputs is crucial to the forecasting success. In the present study, various input combinations are examined. The objective is to test computational intelligence-based models for the case of incomplete data. Three models are compared: an FFNN, an ANFIS, and the proposed FFNN-ANFIS. After the types of inputs are decided, i.e., Case#1-Case#5, the next step is to define the number of inputs. This number refers to the historical values of the used parameters: wind speed, temperature and wind direction. With the application of the Sample Autocorrelation Function (SAF), the historical values are evaluated based on their correlation with the present value. Figure 5 displays the SAF of the minute-ahead wind speed set. Only the first 20 lags are displayed. The correlation decreases progressively as the lagged value becomes more distant in time. The same conclusions are drawn from the 10 min ahead set. The first five lagged values are selected as inputs for the models. Also, the corresponding lagged values of temperature and wind direction are selected.
Employing the minute-ahead set and the data set of Case#1, a series of experiments was conducted to define the optimal FFNN and ANFIS structures. The optimal FFNN structure has one hidden layer. The tangent sigmoid function is used for both the hidden and output layers. The number of training epochs is set equal to 100. The optimal number of neurons in the hidden layer is also defined by a series of simulations, and it differs among the various cases. Thus, a series of FFNN executions took place to find the number of hidden-layer neurons that minimizes the RMSE indicator. Concerning the optimal ANFIS topology, the Sugeno inference method is selected together with Gaussian membership functions.
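The lag selection described above can be sketched in two steps: compute the sample autocorrelation for a range of lags, then build input vectors from the most recent values. This is a minimal illustration that treats the series as evenly spaced and therefore ignores the gaps in the time sequence mentioned in the paper. The function names and the toy series are hypothetical.

```python
def sample_autocorrelation(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (lag 0 is always 1)."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((value - mean) ** 2 for value in series)
    acf = []
    for lag in range(max_lag + 1):
        numerator = sum((series[t] - mean) * (series[t + lag] - mean)
                        for t in range(n - lag))
        acf.append(numerator / denominator)
    return acf


def lagged_inputs(series, n_lags):
    """Pair each value with the n_lags values that immediately precede it."""
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append(series[t - n_lags:t])
        targets.append(series[t])
    return inputs, targets


# Toy series used only to show the mechanics.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
acf = sample_autocorrelation(series, max_lag=3)

# Five past values per sample, mirroring the five lags selected in the text.
inputs, targets = lagged_inputs(series, n_lags=5)
```

The same windowing would be applied to temperature and wind direction when those variables are part of the input case. With real data containing gaps, windows that span a missing interval would need separate handling, which this sketch does not attempt.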
The scores of the forecasting models on the assessment indicators are presented in Tables 1 and 2. Table 1 refers to the 1-min-ahead horizon and Table 2 to the 10-min-ahead horizon. The rows of the tables correspond to the different cases. According to the results of Table 1, the proposed model outperforms the FFNN and ANFIS in all test cases, highlighting the significance of using combined forecasts. The improvement in prediction accuracy obtained with the proposed model is more evident in the data sets of Case#3, Case#4, and Case#5. The FFNN leads to better results than ANFIS in Case#2, Case#3, and Case#5 when using the MAE indicator. Also, the FFNN leads to better results in Case#1 according to the RMSE and MARNE measures. However, it scores MARNE = 4.2353% in Case#2, a value that is higher than the respective value of ANFIS. Since the FFNN appears more robust in most experiments, it can be suggested over the ANFIS for the minute-ahead wind speed prediction problem. Among the types of data inputs, Case#5 leads to considerably lower errors, indicating the benefit of transforming the volatile wind data into the wavelet domain. This is evident in all measures and especially in MARNE. Considering the MAE indicator, Case#2 leads to better predictions when the forecast is made with the FFNN or the hybrid model. On the contrary, the data of Case#1, i.e., using only wind speed values, are more suitable for ANFIS. Using wind speed, temperature, and wind direction as inputs, the prediction is less credible. This again holds for the FFNN and the proposed model. ANFIS scores MAE = 0.4443 with the data of Case#3. The aforementioned conclusions are identical when the evaluation is held with the RMSE or MARNE. Therefore, combining wind speed and direction data strengthens the forecasting procedure. The use of temperature is not recommended for the test set under study.
According to the above analysis, the combination of the FFNN and ANFIS fed with the wind speed data transformed into the wavelet domain is the recommended model for minute-ahead forecasts under the limitation of many incomplete data entries.
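The combination of the two predictors can be illustrated with a simple selection rule. The paper states that the prediction is taken from the predictor with the lowest error but does not detail how that error is measured at forecast time, so the sketch below makes an explicit assumption: the predictor with the lower RMSE on a held-out window is selected. The function names and all numbers are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def select_predictor(actual, forecasts_by_model):
    """Return the model name with the lowest RMSE, plus all scores.

    Assumption: selection is based on a held-out window; the paper's
    exact selection mechanism is not specified in this section.
    """
    scores = {name: rmse(actual, forecast)
              for name, forecast in forecasts_by_model.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical held-out wind speeds (m/s) and forecasts from the two models.
held_out_actual = [5.0, 5.4, 6.1, 5.9]
held_out_forecasts = {
    "FFNN": [5.1, 5.3, 6.0, 6.0],
    "ANFIS": [4.7, 5.8, 6.5, 5.5],
}

best_name, scores = select_predictor(held_out_actual, held_out_forecasts)
```

Other selection rules are possible, for example choosing per time step based on the most recent observed error; which one applies here would have to be confirmed against the full paper.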
According to the findings presented in Table 2, the 10-min-ahead wind speed prediction problem is evidently a more difficult task. A possible reason for this is the decrease in correlation between current and past values relative to the minute time frame measurements. The 10-min data are less correlated because the one-minute data are averaged in order to transform them into 10-min intervals. The proposed FFNN-ANFIS model is more accurate than the rest in all types of data sets. Again, the 10-min-ahead problem benefits from the implementation of the wavelet transform. The data of Case#5 provide more robust predictions independently of the model used. A further comparison of the models is made via the AE distribution over time. The MAE indicator takes a single value for a specific prediction, for example for a given number of neurons in the hidden layer. It is essential to also examine the error distribution over the period of interest. Using the AE indicator, the analysis can be scaled to minutes. This strengthens the conclusions drawn from the comparison of the models. Figure 6 presents the AE distribution per case. The figure refers to the 1-min-ahead prediction horizon, with the forecasting performed by the proposed FFNN-ANFIS model. The discrete peaks correspond to large error values, which can be considered as indicators of the model's poor performance for the specific minute. The data of Case#1-Case#4 lead to some high peaks in the AE curve. These occur mainly on December days. The lowest errors are mostly concentrated on September days. As the time horizon progresses, the peaks become more frequent. Hence, extreme weather conditions worsen the credibility of the predictions. Some late autumn and winter wind speeds are difficult to predict effectively at the coastal site under study.
The mean values of AE are 0.3673, 0.3664, 0.4123, 0.4210, and 0.1021 for Case#1, Case#2, Case#3, Case#4, and Case#5, respectively. Similar conclusions can be drawn for the 10-min-ahead horizon. The corresponding results are graphically presented in Figure 7. In the 10-min-ahead problem, the implementation of the wavelet transform is more advantageous compared to the minute-ahead case. Case#5 increases the accuracy by a large margin.
In order to examine the relationship between the accuracy and the direction of the wind, we measured the AE per direction degree. Figure 8 shows the comparison between Case#2 and Case#5 for the 1-min-ahead forecasts. Case#2 refers to the combination of wind speed and direction. The predictions refer to the proposed model. Case#5 involves only the transformed wind speed data. It is plotted here for the sake of comparison. The lowest AE of Case#2 is $5.97 \times 10^{-6}$ and occurred for 312.48°. The next lowest AE values occur at 200.73°, 104.56°, 151.90°, and 281.14°. The largest values of AE occur at 162.47°, 42.17°, 55.08°, and 222.70°. According to these findings, a preliminary conclusion is that there is no strong correlation between the direction and the forecasting error. For example, it cannot be strongly supported that winds normal to the monitoring system (e.g., 90°) are less predictable compared to other directions. This statement is also supported by the data of Case#5. The lowest errors refer to the 134.67°, 122.06°, 73.07°, 317.51°, and 46.48° directions, while the highest ones occur for 27.01°, 35.56°, 41.31°, 57.65°, and 31.57°. In this data set, the least accurate predictions occur for wind directions below 60°.
As illustrative examples of the proposed model's behavior, Figures 9 and 10 present the actual and forecasted wind speed curves of the test set for the 1-min- and 10-min-ahead horizons, respectively. The forecasted wind speed sequences in the two figures reproduce the actual data to a large extent, which is another indicator of the robustness of the model.
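The direction analysis above looks at the AE for individual direction values. One way to summarize such an analysis is to average the AE within direction sectors, sketched below. The 30° sector width, the function name, and the sample values are assumptions for illustration only; the paper reports errors per direction degree rather than per sector.

```python
from collections import defaultdict


def mean_ae_by_sector(directions_deg, abs_errors, sector_width=30.0):
    """Average absolute error per wind direction sector.

    Returns a dict mapping the lower edge of each sector (degrees)
    to the mean AE of the samples that fall inside it.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for direction, error in zip(directions_deg, abs_errors):
        # Wrap into [0, 360) and find the index of the containing sector.
        sector = int((direction % 360.0) // sector_width)
        sums[sector] += error
        counts[sector] += 1
    return {sector * sector_width: sums[sector] / counts[sector]
            for sector in sorted(sums)}


# Hypothetical directions (degrees) and absolute errors (m/s).
directions = [10.0, 20.0, 95.0, 100.0, 350.0]
errors = [0.4, 0.6, 0.1, 0.3, 0.2]

sector_means = mean_ae_by_sector(directions, errors)
```

Comparing sector means side by side makes it easier to check whether errors cluster in a direction range, such as the sub-60° range noted for Case#5, or are spread without a pattern.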

Comparison with Other Forecasting Models
To fully evaluate the proposed model, a comparison is made with the following models: Group Method of Data Handling Neural Network (GMDHNN) [27], General Regression Neural Network (GRNN) [28], Regression Trees (RTs) [29], Relevance Vector Machine (RVM) [30], and Support Vector Regression (SVR) [31]. For the 1-min-ahead predictions, Table 3 presents the scores of GMDHNN, GRNN and RTs on the error metrics, and Table 4 presents those of RVM and SVR. Correspondingly, for the 10-min-ahead predictions, Table 5 presents the scores of GMDHNN, GRNN and RTs on the error metrics MAE, RMSE, and MARNE, and Table 6 presents those of RVM and SVR. Among these models, SVR and GMDHNN display results comparable with those of the hybrid model. The hybrid model outperforms all the other models. GRNN and RTs result in high errors and are thus not recommended for the problem under study.

Discussion and Concluding Remarks
Offshore wind turbine installations are continually attracting research interest since they are considered an efficient mechanism for covering the electrical needs of various isolated loads. The present study focuses on the development of an effective method for very short-term wind speed forecasting under the limitation of wind speed series that are not consistent in time, i.e., there are interruptions in the date sequence. Real measured data are used for the training of the developed method. An ensemble data-driven short-term wind speed forecasting model is developed, tested, and applied. The term "ensemble" refers to the combination of two different predictors that run in parallel; the prediction is obtained from the predictor that leads to the lowest error. The proposed model utilizes the wavelet transform and is compared with other models that have been presented in the related literature. The main conclusions of the present study are:
• The proposed forecasting model can be used effectively for 1 min and 10 min ahead wind speed predictions.
• The exogenous variables (i.e., wind direction and air temperature) decrease the prediction accuracy. The best results are obtained using the DWT.
• The highest errors are met on winter days and especially in instances with high wind speed.
• There is no correlation between the forecasting error and the wind direction.
• The hybrid model (combination of FFNN and ANFIS) leads to better forecasts in all examined data set cases.
• The proposed model outperforms the accuracy of other forecasting models that have been presented in the related literature.
The research of the present paper will be further expanded by examining the use of the forecasts in the Wind Farm Layout Optimization (WFLO) problem, incorporating wake effects with the use of specific mathematical or numerical models. Forecasted wind speed time series can serve as inputs to the problem. By estimating future wind speed values, the WFLO can be recast as a scenario-based problem where different wind speed forecasts lead to different WFLO solutions, making it possible to assess the level of influence of future wind speed variations on the outputs of the optimization problem.