Improved Spatio-Temporal Linear Models for Very Short-Term Wind Speed Forecasting

This paper investigates spatio-temporal (multi-channel) linear models, which use both past measurements at the target location and wind speed measurements from neighbouring locations, for very short-term wind speed forecasting. Multi-channel autoregressive moving average (MARMA) models are formulated in matrix form, and efficient techniques for estimating the linear prediction coefficients are applied and revised. It is shown in detail how to apply these MARMA models to spatially distributed wind speed measurements. The proposed MARMA models are tested using real wind speed measurements collected from five stations around the Canakkale region of Turkey. The test results show considerable improvements over the well-known persistence, autoregressive (AR) and multi-channel/vector autoregressive (VAR) models. It is also shown that the model produces a forecast in milliseconds, which makes it suitable for immediate short-term forecasting.


Introduction
Electricity consumption in developing countries increases annually [1,2]. However, authorities aim to reduce greenhouse gas emissions and electricity consumption by increasing the share of renewable energy and improving energy efficiency, respectively [3]. Since wind energy is sustainable, emission-free and cost-effective, it is an attractive candidate for achieving these ambitious aims. To use this energy source reliably in the economically optimal operation of future power systems, it is critically important to forecast wind power generation accurately [4][5][6]. Since wind power is a function of the cube of wind speed, accurate wind power output prediction depends on accurate wind speed prediction [7].
The wind speed prediction problem is widely investigated in the literature and various methods have been presented [5,7-13]. The available methods are generally separated into physical and statistical methods. For very short-term wind speed forecasting, however, physical model-based methods such as numerical weather prediction (NWP) have high computational complexity and lower accuracy [7,8]. Therefore, hybrids of physical (NWP) and statistical methods have been proposed in the literature, as in [8,9]. Computationally efficient yet accurate and reliable statistical methods for very short-term wind speed forecasting are required, especially for the electricity market-wind forecasting control [14]. The statistical methods can be classified into point and probabilistic forecasting approaches [8]. In the point forecasting approach, the future wind speed is given as a single value. In the probabilistic forecasting approach, the future wind speed is modelled as a random variable and its probability density function (pdf) is given as the result.
Recently, spatial correlation models, also known as "spatio-temporal" methods, have appeared as a new trend in short-term wind speed forecasting [5]. These methods use measurements from the neighbourhood of the target location (wind farm) for more accurate wind speed forecasting with a modest processing overhead [15][16][17][18][19][20][21]. Since wind is a horizontal movement of air in the atmosphere, its spatial correlation carries important information for such spatial models. However, the spatial correlation of low-level wind depends directly on the complexity of the terrain. In [15], a space-time forecasting model is proposed which promises more accurate results than conventional time series models. However, this model, called calibrated probabilistic forecasting, is designed only for the selected region. This region-specific forecasting model is improved in [16] so that it does not require any prior geographic information about the target region. In [17], graph-learning based spatio-temporal analysis techniques are used to characterize probabilistic models for short-term forecasting. In [5], a methodology is proposed for optimum probabilistic forecasting using geographically dispersed information. In [19], a multichannel adaptive filtering technique is applied to short-term prediction, which promises lower complexity, improved robustness and the ability to track seasonal variations. Most of the above methods are based on the statistical analysis and interpretation of location-specific multi-channel data collected over years.
On the other hand, conventional linear time series models are easy to implement and require no preliminary analysis for model development. Hence these models are widely preferred for short-term wind speed forecasting [12,22-24]. However, multi-channel (spatio-temporal) linear methods, which use measurements from the neighbourhood of the target location, have not been addressed sufficiently for short-term wind speed forecasting. The vector autoregressive (VAR) method is applied to geographically dispersed (multi-channel) wind speed data in [25]. There are also some hybrid artificial neural network (ANN) based methods [26][27][28].
Multi-channel autoregressive moving average (ARMA) models are commonly used for blind identification of single-input multi-output (SIMO) systems in communications, source localization and medical imaging [29][30][31]. These multi-channel blind linear system models can also be applied to the multi-channel wind speed prediction problem to obtain more accurate results at the target location [32,33].
In this paper, multi-channel linear prediction models for short-term wind speed forecasting, which use neighbouring wind speed measurements around the target location as sketched in Figure 1, are investigated and reviewed. These multi-channel linear prediction models are also called multi-channel ARMA or MARMA models. The problem formulation, compact matrix forms and efficient multi-channel coefficient estimation approaches are presented and tested using hourly averaged real wind speed/direction values. These values were collected from five synchronized measurement stations of the Turkish State Meteorological Service. The stations are located around the Canakkale Canal in Turkey, namely Bozcaada (BOZ), Ipsala (IPS), Gonen (GON), Bandirma (BAN) and Sile (SIL). The root mean square error (RMSE) and mean absolute error (MAE) are used as the performance measures of the prediction models. It is shown that the prediction performance of the MARMA model is better than that of the univariate AR and multivariate vector AR (VAR) models. It is also observed that the performance of the MARMA model relative to the other methods improves as the forecast lead time increases.
The paper is structured as follows: (1) Multi-channel linear prediction models and their compact matrix forms for short-term wind speed forecasting are presented and reviewed. (2) Computationally efficient and accurate linear solution techniques, with a new linear channel selection approach for the multichannel coefficients, are proposed and discussed. (3) The MARMA forecasting models are tested using real wind speed data collected from five different locations in the Canakkale region of Turkey, and the RMSE and MAE performances are compared for various cases. The sections are organized as follows. In Section 2, the problem formulation of the multi-channel linear prediction models and the coefficient estimation techniques are presented. In Section 3, the selected region where the real multi-channel wind data were collected is introduced, and the prediction performances of the models in Section 2 are tested and compared with those of other methods. Section 4 concludes the paper.

Multi-Channel Wind Data
We consider M spatially distinct (geographically separated) measurement stations with known positions, as shown in Figure 1. At the m-th station (channel), discrete measurements are assumed to be collected as in Equation (1), where $y_m[n]$ is the averaged wind speed value at discrete time index n and ∆t is the averaging duration, which can be chosen as a minute or an hour. The problem is to forecast the short-term wind speed value at the m-th station using the M spatially distributed (multi-channel) averaged wind data measurements, as in Figure 1. Since the wind can blow from any direction, the wind measurement stations should surround the target location for the best result.

Multi-Channel Linear Prediction Models
In this part, the multi-channel ARMA model, which is used for blind identification of SIMO systems in [31], is modified and implemented as a multichannel wind speed prediction model. In [25], the AR model is applied to multi-channel real wind speed data, where it is called the vector autoregressive (VAR) model. The VAR predictor's ∆-hour-ahead output for the m-th channel (target location) is given by Equation (2), where M is the number of channels, P is the number of coefficients and $w_m[n]$ is the additive noise (model error) term of each channel, assumed to be a temporally and spatially white random process with variance $\sigma_w^2$.
Two different multi-channel ARMA models are proposed for short-term wind speed prediction. The first model, called MARMA-1, defines the output at the m-th location for a forecast lead time of ∆ hours by Equation (3), where $s[n]$ is the common input signal, a white noise random process with constant power spectrum and variance $\sigma_s^2$ that is statistically independent of the additive channel noises $w_m[n]$. The second model, called MARMA-2, defines the output at the m-th channel for a forecast lead time of ∆ hours by Equation (4); it differs from MARMA-1 in using multi-channel, spatially and temporally white noise inputs $s_k[n]$, k = 1, ..., M.
The M-channel wind data of the above linear prediction models can be put in the matrix form of Equation (5), where $\mathbf{y}[n] = [y_1[n], \ldots, y_M[n]]^T$ is an M × 1 vector (also known as a snapshot) that contains the wind values of the M channels, measured at different locations at the same time. $\mathbf{x}[n]$ is the input data vector of the multi-channel linear prediction models; it is defined for MARMA-1 in Equation (6) and for MARMA-2 in Equation (7). Similarly, the multi-channel white noise process in Equation (7) is defined as $\mathbf{s}[n] = [s_1[n], \ldots, s_M[n]]^T$, and $\mathbf{w}[n] = [w_1[n], \ldots, w_M[n]]^T$ is the M × 1 additive channel noise vector.
Finally, the multi-channel prediction filter coefficient matrix A for MARMA-1 is defined in Equation (8); it is an M × (MP + Q) matrix that contains all the unknown coefficients of Equation (3). Similarly, for MARMA-2 the matrix A is defined in Equation (9); in this case, A is an M × M(P + Q) matrix that contains all the unknown coefficients of Equation (4).
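For reference, one way to write the models and their matrix form that is consistent with the definitions above is sketched below. The coefficient symbols $a_{m,k}[p]$, $b_m[q]$ and $b_{m,k}[q]$ and the lag ranges starting at zero are assumptions made here for illustration; only the signal definitions and the matrix sizes M × (MP + Q) and M × M(P + Q) are taken from the text.

```latex
% VAR (assumed indexing)
y_m[n+\Delta] = \sum_{k=1}^{M}\sum_{p=0}^{P-1} a_{m,k}[p]\, y_k[n-p] + w_m[n]

% MARMA-1: one common white noise input s[n]
y_m[n+\Delta] = \sum_{k=1}^{M}\sum_{p=0}^{P-1} a_{m,k}[p]\, y_k[n-p]
              + \sum_{q=0}^{Q-1} b_m[q]\, s[n-q] + w_m[n]

% MARMA-2: one white noise input s_k[n] per channel
y_m[n+\Delta] = \sum_{k=1}^{M}\sum_{p=0}^{P-1} a_{m,k}[p]\, y_k[n-p]
              + \sum_{k=1}^{M}\sum_{q=0}^{Q-1} b_{m,k}[q]\, s_k[n-q] + w_m[n]

% Matrix form
\mathbf{y}[n+\Delta] = \mathbf{A}\,\mathbf{x}[n] + \mathbf{w}[n]

% Input vectors: (MP+Q) x 1 for MARMA-1 and M(P+Q) x 1 for MARMA-2
\mathbf{x}_{1}[n] = \left[\mathbf{y}^T[n], \ldots, \mathbf{y}^T[n-P+1],\; s[n], \ldots, s[n-Q+1]\right]^T
\mathbf{x}_{2}[n] = \left[\mathbf{y}^T[n], \ldots, \mathbf{y}^T[n-P+1],\; \mathbf{s}^T[n], \ldots, \mathbf{s}^T[n-Q+1]\right]^T
```

Under this convention the m-th row of A collects all the coefficients that act on the target channel m, which is why the two matrices have MP + Q and M(P + Q) columns, respectively.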
The linear prediction model coefficients in Equations (8) and (9) must be solved for efficiently. The matrix form of the multi-channel linear prediction models given in Equation (5) is similar to the well-known array signal model of array theory [34]. Array signal processing deals with space-time signals collected by an array of sensors. It is possible to solve for these coefficients using the subspace methods in [30]. Another computationally efficient way of solving for these coefficients is given in [31].
In the next section, a computationally efficient and accurate linear solution technique, with a new linear channel selection approach for multichannel coefficient estimation, is presented.

Multi-Channel Linear Prediction Coefficient Estimation
To obtain more accurate multi-channel linear prediction coefficients, N snapshot measurements are collected and the data model in Equation (5) is extended as in Equation (10), which can be rewritten as Equation (11). Here Y[n + ∆] is the extended multi-channel linear prediction output vector of size MN × 1, X[n] is the extended prediction input vector, W[n] is the extended model error vector of size MN × 1 and Ā is the extended coefficient matrix. To solve for the MARMA-1 and MARMA-2 coefficients in Equations (8) and (9), respectively, a selection matrix can be applied for the specified m-th target location as in Equation (12). The selection matrix $S_m$ for the m-th location is built from row vectors $e_k$, where each $e_k$ is a 1 × MN row vector whose only non-zero entry is a single 1.
Multiplying the N-snapshot multi-channel data in Equation (10) by the m-th selection matrix $S_m$, as in Equation (12), gives a linear set of equations for the m-th location. In this set, the measurement data matrix is obtained from Equation (11) and consists of the previous multichannel wind data and the white noise signal, and $\bar{a}_m$ is the m-th row of the matrix A in Equation (8) or (9), that is, the prediction coefficients of MARMA-1 or MARMA-2 for the m-th target location. This is the well-known linear model of classical estimation theory [35], and linear least squares (LS) techniques can be applied to find the optimum prediction coefficients. In this case, the cost function to be minimized is the squared norm of the model error, where $(\cdot)^T$ denotes the transpose operation, and the optimum LS solution for the unknown prediction coefficients is obtained by applying the pseudoinverse of the measurement data matrix to the output data of the m-th location. Computationally efficient ways of computing this matrix pseudoinverse solution are given in [36,37].
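As a minimal sketch of the least squares step described above, the following Python code builds the regressor rows from past snapshots and solves for the coefficients of one target channel with `numpy.linalg.lstsq`. The function names, the lag convention (lags 0 to P-1 and 0 to Q-1) and the use of externally generated white noise inputs `S` are assumptions made for illustration; this is not the authors' implementation.

```python
import numpy as np

def build_regressors(Y, S, P, Q, delta):
    """Stack lagged snapshots into a design matrix (assumed lag convention).

    Y: (T, M) array of wind speeds, one column per channel.
    S: (T, C) array of white noise inputs (C = M for a MARMA-2 style model,
       C = 1 for a MARMA-1 style model); ignored when Q = 0.
    Returns X with rows x[n]^T and targets with rows y[n + delta]^T.
    """
    T = Y.shape[0]
    start = max(P, Q) - 1      # first n for which all lags are available
    stop = T - delta           # exclusive: last n with a known target
    rows = []
    for n in range(start, stop):
        past_y = [Y[n - p] for p in range(P)]   # y[n], ..., y[n-P+1]
        past_s = [S[n - q] for q in range(Q)]   # s[n], ..., s[n-Q+1]
        rows.append(np.concatenate(past_y + past_s))
    X = np.asarray(rows)
    targets = Y[start + delta: stop + delta]
    return X, targets

def fit_target_channel(X, targets, m):
    """Least squares coefficients for target channel m (0-based index)."""
    a_m, *_ = np.linalg.lstsq(X, targets[:, m], rcond=None)
    return a_m

def predict(a_m, x_n):
    """Delta-step-ahead forecast from one regressor row x[n]."""
    return float(x_n @ a_m)
```

With Q = 0 the same code reduces to a VAR-style fit, and passing a single-column `Y` gives a single-channel AR fit, so the three model families can share one estimation routine in this sketch.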

Data Set
The accuracy of the proposed multi-channel linear prediction models is tested with hourly averaged wind speed and direction data collected from five stations around the Canakkale region of Turkey. The data are available in [38]. Three years of hourly averaged wind speed and direction values, from 2008 to 2010, are used. The five stations (Bozcaada, Ipsala, Gonen, Bandirma and Sile) belong to the Turkish State Meteorological Service and their locations are shown in Figure 2. All wind measurements are taken at a height of 10 m above ground. The region is known to have one of the highest wind energy potentials in Turkey. The stations were selected arbitrarily from the available measurement locations in the region. The topographic map of the region is shown in Figure 3. The topography is indicated by colour: green indicates low altitude and white indicates high altitude. As shown in the figure, the measurement stations are not close to each other or to the canal. BOZ is located at the highest point of an island. IPS is located in a valley. BAN is close to GON but is separated from the canal. SIL is approximately 250 km away from GON and is completely separated from the canal and the other stations. Figure 4 shows the auto- and cross-correlation coefficients of the stations with the target location GON for different time delays. All the correlations decline with time delay, except for maxima at diurnal periods (multiples of 24 h). It can be seen from Figure 4 that the cross-correlation coefficients of BOZ and IPS are higher than those of the other two stations (BAN and SIL) for short time delays, 1 ≤ ∆ ≤ 4.
The SIL station has the lowest correlation values, as expected. Since the stations were selected arbitrarily from the available stations, their spatial dependencies are limited, as shown. It is therefore not possible to apply a region-specific space-time method such as [15]. Figure 5 shows the frequency of the wind directions at the measurement stations as polar histograms. These polar plots show that the prevailing wind directions at the stations are similar and run along the Canakkale Canal, from north-east (NE) to south-west (SW) and vice versa, due to the large-scale circulation in the region. Some basic statistics (annual maximum, mean and variance values) of the multi-channel data set are summarized in Table 1.
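Correlation coefficients of the kind plotted in Figure 4 can be estimated, for example, with a lagged Pearson correlation. The sketch below is one such estimator in plain Python; the function name and the sample-based normalization are assumptions, since the text does not state which estimator was used.

```python
import math

def lagged_correlation(target, other, lag):
    """Pearson correlation between target[n + lag] and other[n], lag >= 0.

    Passing the same series twice gives the autocorrelation coefficient.
    A constant segment (zero variance) raises ZeroDivisionError.
    """
    if len(target) != len(other):
        raise ValueError("series must have the same length")
    if lag < 0 or lag > len(target) - 2:
        raise ValueError("lag must leave at least two overlapping samples")
    a = target[lag:]                  # later values at the target station
    b = other[:len(other) - lag]      # earlier values at the other station
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)
```

Evaluating this function for lags of 1 to 4 h between each neighbouring station and the target would give the short-delay values that the discussion above compares.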

Test Results
In this section, the real wind speed forecasting performances of the proposed multi-channel models are compared with those of the persistence, AR and VAR models. To compare the forecasting models, the RMSE and MAE are calculated as

$$\mathrm{RMSE}_m(\Delta) = \sqrt{\frac{1}{K}\sum_{n=1}^{K}\left(\hat{y}_m[n+\Delta] - y_m[n+\Delta]\right)^2}, \qquad \mathrm{MAE}_m(\Delta) = \frac{1}{K}\sum_{n=1}^{K}\left|\hat{y}_m[n+\Delta] - y_m[n+\Delta]\right| \qquad (18)$$

where $\hat{y}_m$ is the predicted value, $y_m$ is the actual value, ∆ is the forecast lead time in hours, m is the index of the target location and K is the total number of predictions used to calculate the RMSE and MAE in Equation (18). In this study, K is selected to cover the whole data set for the years 2009 and 2010: a total of K = 17280 prediction values are used in the RMSE and MAE calculations of Equation (18) that follow. The persistence forecasting method in [25,39] is used as a benchmark for all the results. In persistence forecasting, the ∆-hour-ahead future value is taken to be the current value. The prevailing wind directions are from NE to SW and vice versa, as shown in Figure 5, and the GON station lies in the middle of the prevailing wind directions relative to the other stations. Therefore, in the following case study the third station (m = 3), which is also surrounded by the other stations, is selected as the target station. However, it is also possible to select any of the other stations as the target station.
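The two error measures and the persistence benchmark described above can be sketched in a few lines of plain Python. The function names and the representation of forecasts as (forecast, actual) pairs are assumptions made for illustration.

```python
import math

def persistence_forecast(series, delta):
    """Return (forecast, actual) pairs for a lead time of delta samples.

    The persistence forecast for time n + delta is the value observed at n.
    """
    return [(series[n], series[n + delta]) for n in range(len(series) - delta)]

def rmse(pairs):
    """Root mean square error over (forecast, actual) pairs."""
    return math.sqrt(sum((f - a) ** 2 for f, a in pairs) / len(pairs))

def mae(pairs):
    """Mean absolute error over (forecast, actual) pairs."""
    return sum(abs(f - a) for f, a in pairs) / len(pairs)
```

Because both measures take the same list of pairs, the forecasts of any other model can be scored with the same two functions and compared directly against the benchmark.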

Model Order
The linear prediction model orders P and Q in Equations (2), (3) and (4) can be selected using the information criterion in [40] or the minimum description length in [41]. Figures 6 and 7 show, respectively, the RMSE and MAE performances of the AR, VAR and MARMA models as a function of the model order P for the third station, Gonen (m = 3), and a forecast lead time of ∆ = 2 h. It is observed that the AR model has its minimum error for P = 2, while the VAR and MARMA-1 models give their minimum error for P = 1. On the other hand, the RMSE and MAE values of MARMA-2 decrease as the filter order increases. Therefore, the MARMA-2 model with P = 4 gives the best performance among the models. A similar analysis is repeated for the Q parameter of the MARMA models. Figures 8 and 9 show the RMSE and MAE performances of the MARMA-1 and MARMA-2 models, respectively, as a function of Q. It is observed that increasing the model order Q slightly degrades the performance. For the best performance, Q is set to 1 for the MARMA models.
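The empirical comparison of model orders described above can be sketched as a simple grid search. The example below scores a single-channel AR predictor by hold-out RMSE for each candidate P; it is a plain grid search rather than the information criterion of [40] or the minimum description length of [41], and the function names and the train/hold-out split are assumptions for illustration.

```python
import numpy as np

def ar_design(y, P, delta):
    """Rows [y[n], ..., y[n-P+1]] paired with targets y[n + delta]."""
    n_idx = np.arange(P - 1, len(y) - delta)
    X = np.stack([y[n_idx - p] for p in range(P)], axis=1)
    return X, y[n_idx + delta]

def select_order(y, orders, delta, n_train):
    """Return (best_P, {P: hold-out RMSE}) for an LS-fitted AR predictor.

    The first n_train rows are used for fitting, the remaining rows for
    scoring. The row offset differs slightly between orders because
    higher orders need more past samples.
    """
    scores = {}
    for P in orders:
        X, t = ar_design(y, P, delta)
        a, *_ = np.linalg.lstsq(X[:n_train], t[:n_train], rcond=None)
        err = X[n_train:] @ a - t[n_train:]
        scores[P] = float(np.sqrt(np.mean(err ** 2)))
    return min(scores, key=scores.get), scores
```

The same loop could be run over Q, or over multi-channel regressors, by swapping the design-matrix function while keeping the scoring unchanged.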

Number of Samples
To solve for the multichannel linear prediction filter coefficients, the number of previous samples, N, in Equation (10) is another critical parameter. As shown in Figure 10, increasing N beyond a certain value does not improve the forecasting performance of the AR and VAR models. On the other hand, the forecasting performance of MARMA-2 is better than that of the other models when a relatively large number of previous samples is used. MARMA-2 uses a different random noise process for each channel, and when a large number of previous samples is used this model gives statistically efficient results. To make a fair comparison in the following section, N is set to 1000 h for all models.

Number of Channels
In this part, the effect of the number of channels, M, in Equations (2), (3) and (4) is investigated. Table 2 shows the RMSE performances of important channel selections. For M = 4, excluding BOZ or IPS from the data set increases the RMSE, which indicates the significance of these measurements for GON. However, excluding SIL, which has the lowest correlation with GON, leaves the RMSE almost unchanged, which agrees with the correlation values in Figure 4. Figure 11 shows the RMSE performance with respect to the number of channels, M. As can be seen, the forecasting error increases steadily as the number of channels used decreases.

Forecasting Results
Table 3 shows the RMSE and MAE for the target station (m = 3) as a function of the forecast lead time (∆). The multi-channel MARMA-2 model has better RMSE and MAE performance than the persistence, AR and VAR models. Table 3 also shows that the MARMA models perform much better than the others as the lead time increases. Table 4 shows the percentage improvements of the AR, VAR and MARMA-2 methods over the persistence method. It is observed that the proposed MARMA-2 method has the best performance, with approximately 2.6% more improvement on average than the multichannel VAR method. Table 4 also shows that the multichannel (spatio-temporal) models (VAR and MARMA), which use the neighbouring measurements, give significant improvements over the purely temporal AR model. The average execution times of the methods are given in Table 5 for a single ∆-hour-ahead forecast. The desktop computer used has an Intel Core(TM) i7-3770K CPU @ 3.50 GHz and 16 GB of RAM. Since all the single- and multi-channel models are linear and use efficient linear least squares techniques, the observed execution times in Table 5 are less than one second on an ordinary desktop computer. It is therefore possible to forecast very short-term (in seconds) wind speed values with the proposed spatio-temporal linear MARMA model.
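The text does not spell out how the percentage improvement over persistence is computed. A minimal sketch, assuming the common definition as the relative error reduction with respect to the reference model, is given below; the function name is hypothetical.

```python
def improvement_over_reference(ref_error, model_error):
    """Percentage error reduction of a model relative to a reference.

    Assumed definition: 100 * (ref_error - model_error) / ref_error.
    Positive values mean the model has a lower error than the reference.
    """
    if ref_error <= 0:
        raise ValueError("reference error must be positive")
    return 100.0 * (ref_error - model_error) / ref_error
```

For example, a model with an MAE of 1.5 m/s against a persistence MAE of 2.0 m/s would score a 25% improvement under this definition.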

Conclusions
In this study, spatio-temporal (multi-channel) linear models, which use neighbouring measurements around the target location, are investigated for the short-term wind speed forecasting problem. The problem formulation of the multi-channel ARMA models (called MARMA) is presented and efficient multi-channel prediction coefficient estimation techniques are revised. The proposed MARMA models and solution techniques are tested using hourly averaged real wind values from five stations around the Canakkale region of Turkey. The forecasting RMSE and MAE of the MARMA-2 model are compared with those of the persistence, AR and multi-channel AR (VAR) methods. Considerable improvements are observed over the well-known temporal persistence (24.01% improvement) and AR (7.41% improvement) methods. The proposed MARMA-2 model gives 2.6% better results than the spatio-temporal VAR method. It is shown that, unlike the other models, the performance of MARMA-2 improves continuously as the number of previous samples (N) and the filter order are increased. It should also be noted that, since the proposed MARMA model moves over the data set using the N most recent available samples, it can adapt to seasonal variations. It is also shown that the proposed multi-channel linear model can predict the ∆-hour-ahead wind speed value in milliseconds on an ordinary desktop computer, which is suitable for very short-term (in seconds) wind speed forecasting.

Figure 1. Multi-channel wind data measurement stations around the target location.

Figure 3. The topographic map of Turkey; a portion is zoomed for visual purposes. ("Turkey topo" by Captain Blood, licensed under CC BY-SA 3.0)

Figure 5. The frequency of the wind directions at the measurement stations for three years of hourly averaged data, where north is zero degrees.

Figure 10. RMSE performances of the AR, VAR and MARMA models with respect to the number of previous samples N when m = 3 (GONEN) and ∆ = 2 h.

Table 1. Some basic statistics of the multi-channel data set used.

Table 3. RMSE and MAE performances of the persistence, AR, VAR and MARMA models according to the forecast lead time (∆).

Table 4. The percentage improvements in MAE of the models with respect to the persistence model.

Table 5. The execution times of the used and the proposed models in milliseconds (ms).