An Ensemble Forecasting Model of Wind Power Outputs Based on Improved Statistical Approaches

Abstract: The number of wind-generating resources has increased considerably, owing to concerns over the environmental impact of fossil-fuel combustion. Therefore, wind power forecasting is becoming an important issue for large-scale wind power grid integration. Ensemble forecasting, which combines several forecasting techniques, is considered a viable alternative to conventional single-model-based forecasting for improving forecasting accuracy. In this work, we propose a day-ahead ensemble forecasting model of wind power based on statistical methods. The ensemble model combines three single forecasting approaches: autoregressive integrated moving average with exogenous variable (ARIMAX), support vector regression (SVR), and a Monte Carlo simulation-based power curve model. To apply the methodology, we conducted forecasting using the historical data of wind farms located on Jeju Island, Korea. The results of the single models and the ensemble model were compared to demonstrate the validity of the proposed method.


Introduction
With limits to fossil fuel reserves and the emerging importance of environmental protection around the world, renewable energy sources have received considerable attention. Among them, wind power is considered a front-runner for increasing the global installed capacity of renewable power facilities. In 2018, global renewable energy capacity grew to approximately 2378 GW; wind power accounted for 28% of the additional renewable capacity [1]. Global installed wind power capacity reached 579 GW, and, according to estimates, wind power will supply over 20% of total global power by 2030 [2]. As wind power generation increases within power systems, the penetration of wind power presents many challenges to system operators. High penetration of wind power affects real-time system operations, the quality of power, and the reliability of power systems [3,4]. Because wind power output is variable and intermittent, depending on weather conditions, wind power forecasting is essential for supplying wind power continuously and reliably. Accurate wind power forecasting improves energy conversion efficiency and reduces the risk of overload, thereby enabling reliable system operation [5].
A number of methods have been successfully applied to forecast wind power. Wind power forecasting models fall into three main categories: physical models, statistical models, and combinations of both [6,7]. Physical models rely on physical considerations of the lower atmosphere or on numerical weather prediction (NWP), using inputs such as temperature, pressure, and obstacles [8]. Generally, when wind speed is obtained from a local weather service, it is adjusted to the onsite conditions at the wind farm and then converted to power output through the power curve [9]. Shokrzadeh et al. proposed estimating the wind turbine power curve through polynomial regression, a statistical technique [10].
Statistical models use relationships in historical data to perform short-term forecasts. Compared to other models, they are easier and cheaper to develop, but the forecast error increases with the forecast horizon. Typical statistical models include the autoregressive (AR), autoregressive moving average (ARMA), and autoregressive integrated moving average (ARIMA) models, which are used for short forecast horizons [11]. Wang et al. proposed a Bayesian-based adaptive multi-kernel regression model [12]. Shi et al. compared ARIMA, an artificial neural network (ANN), and a support vector machine (SVM) [13]. Time-series forecasting models, including ARIMA, explicitly represent the relationship between inputs and outputs but are limited to linear components. Artificial intelligence (AI) and machine learning (ML) approaches are suitable for modeling nonlinear components but are computationally intensive, and their outputs are difficult to interpret.
Statistical models can also be used in combination with NWP models. On their own, they can be highly accurate in very-short-term forecasting (3-4 h). However, because their errors increase over time, they are used in combination with physical models as a practical alternative for improving forecast accuracy [14]. These forecasting methods are summarized in Table 1 [3]. Wind power forecasting is classified as very-short-term, short-term, medium-term, or long-term according to the timescale [15,16]. Table 2 shows the time horizons and the scope of application for each of the four categories.
This work proposes short-term wind power forecasting through a combination of several statistical techniques. The proposed models are the autoregressive integrated moving average with exogenous variables (ARIMAX) model, the support vector regression (SVR) model, and the Monte Carlo simulation (MCS) power curve model. The forecasting uses wind power output data and wind-speed data. The wind-speed data, obtained from the local NWP model, are adjusted to the wind farm location through spatial modeling. The forecasting results are then combined using a weighting algorithm. The remainder of the paper is organized as follows. Section 2 briefly explains the structure of the ensemble model and discusses the proposed methodology; Section 3 presents a case study validating the proposed model. We applied the proposed method to a wind farm on Jeju Island and analyzed the results of a month-long forecast for a comprehensive study. In the final section, we present our conclusions and plans for future work.

The Ensemble-Based Forecasting Method
In this section, we describe the proposed forecasting method. The forecasting models are broadly divided into three categories: the time-series analysis-based ARIMAX model, the machine-learning-based SVR model, and the probability-based MCS power curve model. In addition, a spatial model was employed to improve the accuracy of the wind-speed forecast data that were used as input data. Figure 1 shows the procedure of the proposed method.

Spatial Model
The purpose of spatial modeling is to improve the accuracy of the wind-speed data used as input to the forecasting model. The wind-speed forecast data were obtained from points near the wind farms in the local NWP model, and their quality significantly impacts the forecast accuracy. In this study, the spatial limitations of the existing data were supplemented using a spatial modeling technique known as Kriging. Kriging is a representative geostatistical technique that uses the spatial correlation, based on the distance between data points, to estimate the characteristic value at a point of interest. Wind exhibits spatial correlation because similar values occur at nearby locations at the same time [17,18]. Kriging exploits the similarity between adjacent points in a given space and is, hence, a suitable technique for interpolating weather variable data [19]. Figure 2 illustrates the concept of the Kriging technique.
Different variants of the Kriging technique can be applied, depending on the weighting method; Ordinary Kriging, the most representative variant, was employed in this work. The governing equation of Ordinary Kriging is given as follows [20]:
$$z^{*} = \sum_{i=1}^{n} w_i z_i \qquad (1)$$
where $z_i$ denotes the characteristic value at point $i$ among $n$ points, and the weight $w_i$ is assigned to the spatial data for deriving the characteristic value $z^{*}$ at the point of interest. The weights were calculated based on a variogram and covariance representing the data's spatial correlation, and they sum to 1 ($\sum_{i=1}^{n} w_i = 1$) to avoid bias.
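As an illustration of how Ordinary Kriging weights can be obtained, the following is a minimal sketch, not the implementation used in this study. It assumes an exponential variogram with made-up parameters (`sill`, `range_`, `nugget`), hypothetical NWP point coordinates, and hypothetical wind speeds; in practice the variogram would be fitted to the data. The weights come from the standard kriging system, in which a Lagrange multiplier enforces that the weights sum to 1.

```python
import numpy as np

def exponential_variogram(h, sill=1.0, range_=10.0, nugget=0.0):
    """Assumed variogram model gamma(h); the parameter values are illustrative only."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + sill * (1.0 - np.exp(-h / range_))
    return np.where(h > 0, gamma, 0.0)  # gamma(0) = 0 by definition

def ordinary_kriging(coords, values, target, variogram=exponential_variogram):
    """Return the estimate z* and the weights w at `target` from n known points."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)

    # Distances between data points, and from each data point to the target.
    d_data = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    d_target = np.linalg.norm(coords - target, axis=-1)

    # Kriging system [[Gamma, 1], [1^T, 0]] [w; mu] = [gamma_0; 1].
    a = np.ones((n + 1, n + 1))
    a[:n, :n] = variogram(d_data)
    a[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = variogram(d_target)

    solution = np.linalg.solve(a, b)
    weights = solution[:n]  # the last entry is the Lagrange multiplier mu
    return float(weights @ values), weights

# Hypothetical NWP grid points (km) and forecast wind speeds (m/s).
nwp_points = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (5.0, 5.0)]
nwp_speeds = [6.2, 7.1, 5.8, 8.0]
z_star, w = ordinary_kriging(nwp_points, nwp_speeds, target=(2.0, 1.5))
print(f"estimated wind speed: {z_star:.2f} m/s, weights sum to {w.sum():.3f}")
```

With a zero nugget, the estimate at one of the known points reproduces the value at that point, which is a quick sanity check for this kind of interpolator.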

The ARIMAX Model
ARIMAX is the general ARIMA model with the addition of an exogenous variable (X) [21-23]. This model uses the historical data of wind power output as the main variable and the wind speed estimated by spatial modeling as the exogenous variable. The general ARIMAX model is defined in Equation (3).
$$y_t = \sum_{i=1}^{p} \alpha_i y_{t-i} + \sum_{j=1}^{q} \beta_j \varepsilon_{t-j} + \rho F_t + \varepsilon_t \qquad (3)$$
where $y_t$ is the wind power output at time $t$; $p$ is the maximum number of AR time lags; $y_{t-i}$ is the output lagged by $i$ time steps; $\alpha_i$ is the coefficient of $y_{t-i}$; $q$ is the maximum number of moving average (MA) time lags; $\beta_j$ is the coefficient of $\varepsilon_{t-j}$; $\varepsilon$ is white noise; $\rho$ is the coefficient of $F_t$; and $F_t$ is the wind speed at time $t$. The orders $p$ and $q$ of the AR and MA terms are selected to minimize the Akaike Information Criterion (AIC) [24]. Choosing the orders that minimize this criterion is important for model identification because it reduces the mean square error, which is an estimate of the variance of the white noise process, while respecting the principle of parsimony.
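The AIC-based order selection can be sketched as follows. To keep the example dependency-light, it deliberately simplifies the model: the MA terms and differencing are dropped, leaving an ARX model (AR lags plus the exogenous wind speed) that can be fitted by ordinary least squares. A full ARIMAX fit would need a dedicated time-series library. The data are synthetic, and the function names `fit_arx` and `select_order` are hypothetical.

```python
import numpy as np

def fit_arx(y, f, p, start):
    """Least-squares fit of y_t = sum_i alpha_i * y_{t-i} + rho * F_t + e_t for t >= start."""
    rows = [[y[t - i] for i in range(1, p + 1)] + [f[t]]
            for t in range(start, len(y))]
    x = np.array(rows)
    target = np.asarray(y[start:], dtype=float)
    coef, *_ = np.linalg.lstsq(x, target, rcond=None)
    rss = float(np.sum((target - x @ coef) ** 2))
    n, k = len(target), x.shape[1]
    aic = n * np.log(rss / n) + 2 * k  # Gaussian AIC up to an additive constant
    return coef, aic

def select_order(y, f, max_p=6):
    """Pick the AR order with the lowest AIC; all fits share the same sample."""
    fits = {p: fit_arx(y, f, p, start=max_p) for p in range(1, max_p + 1)}
    best_p = min(fits, key=lambda p: fits[p][1])
    return best_p, fits[best_p][0]

# Synthetic series: AR(2) power output driven by an exogenous wind speed.
rng = np.random.default_rng(0)
wind = rng.uniform(3.0, 12.0, size=500)
power = np.zeros(500)
for t in range(2, 500):
    power[t] = (0.5 * power[t - 1] + 0.2 * power[t - 2]
                + 0.8 * wind[t] + rng.normal(0.0, 0.1))

best_p, coef = select_order(power, wind)
print("selected AR order:", best_p)
print("estimated rho (wind-speed coefficient):", round(float(coef[-1]), 3))
```

All candidate orders are fitted on the same observations (`start=max_p`) so that their AIC values are comparable; the `2 * k` term is what penalizes extra lags and reflects the parsimony principle.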

The SVR Model
SVM is one of the most popular approaches in the field of machine learning and is used for data classification, data mining, and statistical analysis [25]. SVM determines where new data belong by solving a binary classification problem that finds the separating hyperplane with the maximum margin. SVR derives a regression function by applying SVM. A general regression function can be expressed as $y_i = f(x_i) + b$, and SVR maps the inputs to a high-dimensional feature space to solve nonlinear regression problems. The training data are $(x_1, y_1), \ldots, (x_n, y_n)$, where $x$ is wind speed and $y$ is wind power output; $y_i$ is the target value, and $f(x_i) + b$ gives the wind power output predicted by the SVR. The SVR optimization problem is formally defined in Equation (4).
$$\min_{w,\,b,\,\xi,\,\xi^{*}} \ \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \left(\xi_i + \xi_i^{*}\right) \quad \text{s.t.} \quad y_i - f(x_i) - b \le \varepsilon + \xi_i,\quad f(x_i) + b - y_i \le \varepsilon + \xi_i^{*},\quad \xi_i,\ \xi_i^{*} \ge 0 \qquad (4)$$
where $x_i$ is the input vector and $y_i$ is the output value, $w$ is the weight vector of $f$ in the feature space, $\varepsilon$ is the magnitude of errors that can be neglected, $C$ is a factor for the tradeoff between overfitting and underfitting, and $\xi_i$ and $\xi_i^{*}$ are slack variables [26]. SVR has excellent generalization capability with high accuracy, and its computational complexity does not depend on the dimensionality of the input space [27].
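The role of the neglected error magnitude and the slack variable can be illustrated with the epsilon-insensitive loss, which is the penalty that SVR places on each training point. This is only a sketch of the loss, not a full SVR solver, and the measured and predicted values below are invented for the example.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Errors smaller than epsilon cost nothing; larger errors cost the excess."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]

# Hypothetical measured and predicted wind power outputs (MW).
measured = [10.0, 12.0, 15.0]
predicted = [10.5, 14.0, 14.8]
losses = epsilon_insensitive_loss(measured, predicted, epsilon=1.0)
print(losses)  # [0.0, 1.0, 0.0]
```

Only the second point deviates by more than 1 MW, so it is the only one that needs a nonzero slack; the factor C then sets how strongly such excess errors are traded off against the flatness of the regression function.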

The MCS-Based Power Curve Model
The easiest way to predict wind power output using wind speed data is to convert wind speed to power through the manufacturer's power curve [28]. However, the actual relationship between wind speed and the power generated by wind farms is complicated by turbine aging and control factors, limiting the use of the manufacturer's deterministic power curves. In this paper, MCS was applied to probabilistically model the relationship between wind speed and wind power output. MCS is a technique used to perform decision-making under uncertain circumstances and to model the probability of different outcomes that are not easily predictable due to random variables [29].
Historical data of wind speed and power output over the past year were used for power curve modeling. After a database (DB) of generation output according to wind speed was built, wind speed was divided into 0.5 m/s intervals and the output data were allocated to the corresponding interval. Distribution fitting was then performed for each interval using a logistic distribution based on the assigned wind power output. The power curve was modeled by drawing 10,000 random samples through the MCS for a 90% confidence interval of the distribution. Figure 3 shows the power curve estimated by MCS, which reflects the relationship between actual wind speed and power output.
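The procedure described above can be sketched as follows, under several assumptions that are not stated in the text: the logistic distribution is fitted by the method of moments, the 90% confidence interval is taken as the central 5th to 95th percentile range of the draws, and the curve value for each interval is the mean of the draws inside that range. The function name `mcs_power_curve` and the data are hypothetical.

```python
import numpy as np

def mcs_power_curve(speed, power, bin_width=0.5, n_draws=10_000, ci=0.90, seed=0):
    """Return {bin centre (m/s): (curve value, lower bound, upper bound)}."""
    rng = np.random.default_rng(seed)
    speed = np.asarray(speed, dtype=float)
    power = np.asarray(power, dtype=float)
    bins = np.floor(speed / bin_width).astype(int)  # 0.5 m/s wind-speed intervals

    curve = {}
    for b in np.unique(bins):
        sample = power[bins == b]
        if len(sample) < 2:
            continue  # too few points to fit a distribution
        # Method-of-moments logistic fit (assumption): var = (scale * pi)^2 / 3.
        loc = sample.mean()
        scale = sample.std(ddof=1) * np.sqrt(3.0) / np.pi
        draws = rng.logistic(loc, scale, size=n_draws)  # Monte Carlo sampling
        lo, hi = np.quantile(draws, [(1 - ci) / 2, (1 + ci) / 2])
        kept = draws[(draws >= lo) & (draws <= hi)]
        curve[float((b + 0.5) * bin_width)] = (float(kept.mean()), float(lo), float(hi))
    return curve

# Synthetic one-year-like sample: cubic-shaped output of a 30 MW farm plus noise.
rng = np.random.default_rng(2)
speed = rng.uniform(3.0, 12.0, size=2000)
power = 30.0 * ((speed - 3.0) / 9.0) ** 3 + rng.normal(0.0, 0.5, size=2000)
curve = mcs_power_curve(speed, power)
for centre in sorted(curve)[:3]:
    print(centre, [round(v, 2) for v in curve[centre]])
```

A forecast is then obtained by looking up the interval that contains the forecast wind speed and reading off the corresponding curve value.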

Forecast Combination
In this paper, we applied the constrained least squares (CLS) regression method to combine the forecast results of the three forecasting models described above. The CLS approach minimizes the sum of squared errors in Equation (6) by training on a portion of the forecast results [30]. A regression model was used to impose restrictions on the weights of the individual models [31].
$$\min_{w,\,\alpha} \sum_{t} \left(M_t - y_{c,t}\right)^{2} \quad \text{s.t.} \quad w_i \ge 0 \ \ \forall i, \quad \sum_{i=1}^{n} w_i = 1 \qquad (6)$$
$$y_c = \alpha + \sum_{i=1}^{n} w_i y_i \qquad (7)$$
where $y_i$ is the forecast obtained from model $i$ of $n$ forecasting models, $w_i$ is its weight, and $M_t$ is the measured output at time $t$. $\alpha$ denotes the constant term that is included if the individual forecasts are biased, and $y_c$ represents the combined forecast.
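A minimal sketch of the constrained combination is given below. The solver is an assumption: the weights are found by projected gradient descent onto the probability simplex (non-negative weights that sum to 1), with the constant term obtained in closed form from the means. The text does not state which solver was used, and the forecasts here are synthetic.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection onto {w : w_i >= 0, sum_i w_i = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def fit_cls_weights(forecasts, actual, n_iter=5000):
    """Minimise sum_t (actual_t - alpha - forecasts_t . w)^2 over the simplex."""
    x = np.asarray(forecasts, dtype=float)  # shape (T, n_models)
    y = np.asarray(actual, dtype=float)     # shape (T,)
    xc, yc = x - x.mean(axis=0), y - y.mean()  # centring removes alpha
    step = 1.0 / np.linalg.eigvalsh(xc.T @ xc)[-1]  # 1 / Lipschitz constant
    w = np.full(x.shape[1], 1.0 / x.shape[1])
    for _ in range(n_iter):
        w = project_to_simplex(w - step * xc.T @ (xc @ w - yc))
    alpha = y.mean() - x.mean(axis=0) @ w
    return w, float(alpha)

def rmse(a, b):
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))

# Synthetic 28-day hourly training window with three imperfect forecasts (MW).
rng = np.random.default_rng(1)
actual = rng.uniform(0.0, 30.0, size=28 * 24)
forecasts = np.column_stack([
    actual + rng.normal(0.0, 3.0, actual.size),
    actual + rng.normal(1.0, 2.0, actual.size),  # biased but less noisy model
    actual + rng.normal(0.0, 4.0, actual.size),
])
w, alpha = fit_cls_weights(forecasts, actual)
combined = alpha + forecasts @ w
print("weights:", np.round(w, 3), "alpha:", round(alpha, 3))
print("combined RMSE:", round(rmse(actual, combined), 3))
```

Because putting all weight on a single model with a zero constant term is a feasible solution, the fitted combination cannot do worse than the best single model on the training window; performance on a later evaluation period is not guaranteed in the same way.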

Forecasting Performance Evaluation
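To assess the model quantitatively, two error indexes, the normalized mean absolute error (NMAE) and the root mean square error (RMSE), were used as metrics of forecasting accuracy. Equations (8) and (9) represent the two metrics.
$$\mathrm{NMAE} = \frac{1}{n} \sum_{t=1}^{n} \frac{\left| M_t - F_t \right|}{P} \times 100\ (\%) \qquad (8)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} \left( M_t - F_t \right)^{2}} \qquad (9)$$
where $n$ is the number of forecasting periods, $M_t$ is the measured value at time $t$, $F_t$ is the forecast value at the same time, and $P$ is the installed capacity of the wind farm.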
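The two metrics can be computed as in the following sketch, which assumes the usual definitions: NMAE as the mean absolute error divided by the installed capacity and expressed in percent, and RMSE in the units of the data. The four-hour sample is invented for the worked example.

```python
import math

def nmae(measured, forecast, capacity):
    """Normalized mean absolute error, in percent of installed capacity."""
    n = len(measured)
    return 100.0 * sum(abs(m - f) for m, f in zip(measured, forecast)) / (n * capacity)

def rmse(measured, forecast):
    """Root mean square error, in the same units as the inputs."""
    n = len(measured)
    return math.sqrt(sum((m - f) ** 2 for m, f in zip(measured, forecast)) / n)

# Hypothetical four hours of measured and forecast output for a 30 MW wind farm.
measured = [10.0, 20.0, 15.0, 5.0]
forecast = [12.0, 18.0, 15.0, 9.0]
print(round(nmae(measured, forecast, capacity=30.0), 2))  # 6.67 (%)
print(round(rmse(measured, forecast), 2))                 # 2.45 (MW)
```

Here the absolute errors are 2, 2, 0, and 4 MW, so NMAE = 100 × 8 / (4 × 30) ≈ 6.67% and RMSE = √(24 / 4) ≈ 2.45 MW; RMSE weights the single 4 MW miss more heavily than NMAE does.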

Wind Power Forecasting Case Study
This section details a case study conducted for wind farm A, a 30 MW wind farm located on Jeju Island, to evaluate wind power output forecasting performance. The NWP model estimates the wind speed for 24 h at 1 h intervals, and the measured data were hourly supervisory control and data acquisition (SCADA) data. Lagged electrical power outputs (in MW) from the SCADA system serve as the historical values for statistical learning. To compare the performance of the single models and the ensemble model, three single models were used to perform day-ahead forecasting in July and August 2018. The forecast results were generated for 24 h at 1 h intervals. Based on the forecast results in July, we combined the forecast results of the three models through training to produce an ensemble result for August. Figure 4 shows the training and evaluation periods for the forecast, using data from the past 28 days for the day-ahead forecast.

Spatial Modeling Results for Wind Speed Correction
To perform the forecast, wind speed forecast values were required for the forecast period. In this paper, the forecast values at points near the wind farm were obtained from the local NWP model and corrected through spatial modeling. NWP data from 20 points were used, and the wind speed at the wind farm was estimated using Ordinary Kriging, a spatial modeling technique. Figure 5 shows the wind speed estimates for the forecast period. The red dashed line represents the estimated wind speed, with an overall RMSE of 1.76 m/s. These estimates were used as the input data for the wind power output forecast models.

Wind Power Output Forecasting Results Using Ensemble Model
We performed the wind power output forecasting in July and August using single models. Figure 6 shows the forecast values of the single models for these two months. The accuracy of the forecast output values for each model is shown in Table 3. In both months, the accuracy of the MCS-based power curve model is high, but because this model is significantly affected by wind speed prediction results, an ensemble approach is needed to prevent bias of the results. In order to perform the ensemble forecasting for August by combining the single forecasting results, the weights were calculated through CLS regression based on the previous 28 days' forecasting results. The weighting results are shown in Figure 7.

The accuracy of the ensemble forecast is shown in Table 4. Accuracy is improved through the ensemble approach. The ensemble approach does not always improve accuracy, but it can compensate for overshoots occurring at turning points and prevent bias in the results. Thus, the ensemble approach is a viable alternative for improving the forecasting model, in that its results are often more accurate than single-model forecasting results.

Conclusions
Intermittent power fluctuation depending on the wind climate is the biggest challenge in integrating wind into the grid. Therefore, a wind power forecasting technique is essential for reliable grid operation and integration. In this paper, we proposed a short-term wind power forecasting model that combined three statistical methods. Wind speed prediction data were obtained from the local NWP model and corrected for the wind speed in a wind farm through the Kriging technique, a spatial modeling technique that uses the spatial correlation for wind speeds to increase the accuracy of NWP models.
In order to verify the proposed method, a case study was performed using empirical data from a wind farm on Jeju Island. Two months' worth of day-ahead predictions using three single models showed that the MCS-based power curve model performed best. However, this method requires a great deal of historical data for power curve modeling, and it is difficult to obtain relatively accurate results when the NWP model's prediction accuracy is low. By combining single forecasting results based on the ensemble technique, we found that the prediction accuracy was improved. While the ensemble model did not have good performance over all time periods, combining multiple single models prevented bias in the forecasting results. These practical forecasts allow the grid operator to know the expected wind power output at a specific time, enabling stable grid operation.
In the future, we will apply Random Forest (RF) to the proposed model in order to reduce the forecasting error, and we will perform forecasts for various regions. Further, wind direction will be incorporated into the spatial modeling once wind direction data become available from the Korean NWP system.