Dynamic Hybrid Model for Short-term Electricity Price Forecasting

Accurate forecasting tools are essential in the operation of electric power systems, especially in deregulated electricity markets. Electricity price forecasting is necessary for all market participants to optimize their portfolios. In this paper we propose a hybrid approach for short-term hourly electricity price forecasting. The approach combines statistical techniques for data pre-processing with a multi-layer perceptron (MLP) neural network for electricity price forecasting and price spike detection. Based on statistical analysis, days are arranged into several categories. Similar days are identified by the correlation significance of the historical data. Factors affecting electricity price forecasting, including historical price factors, load factors and wind production factors, are discussed. A consumption and wind index (CWI) is defined for price spike detection and forecasting. Using the proposed approach we created several forecasting models of differing complexity. The method is validated using European Energy Exchange (EEX) electricity price records. Finally, results are discussed with respect to price volatility, with emphasis on price forecasting accuracy.


Introduction
During the last twenty years, the traditional vertically integrated electric utility structure has been deregulated and replaced by a competitive market. The deregulated power market is an auction market with market clearing prices. Companies that trade in the electricity market today make extensive use of price prediction techniques to stay competitive. By forecasting electricity prices, producers and traders can develop bidding strategies that maximize profits and minimize risks, and can allocate purchases between long-term bilateral contracts and spot prices.
Electricity today is not storable in economically significant quantities, and as a result electricity prices are volatile. Aside from volatility (elaborated in detail in Section 4), liquidity is another major market parameter. Market liquidity is an asset's ability to be sold without causing a significant change in the price and with minimum loss of value [1]. The essential characteristic of a liquid market is that willing buyers and sellers are available at all times. In a non-liquid market, the accuracy of a price forecasting procedure can vary significantly depending on the position of the dominant player. Without knowing its position or the parameters affecting it, it is very challenging to forecast electricity prices in such a market [2].
The European Energy Exchange (EEX) is the most important energy exchange in central Europe, providing a spot market for power derivatives and emission trading in Germany, France, Austria and Switzerland. EEX is a highly liquid market affected by domestic and regional power system factors. Because of the importance and regional influence of the EEX price, it is important to find a suitable model for electricity spot market price forecasting [3]. The proposed methodology is tested on EEX price history data.
Many attempts have been made to predict electricity prices, ranging from traditional time series approaches to artificial intelligence techniques such as fuzzy systems and artificial neural networks (ANN) [4]. Auto regressive integrated moving average (ARIMA) [5], dynamic regression and transfer function [6,7] and generalized auto regressive conditional heteroscedasticity (GARCH) [8,9] models are the most widely used time series algorithms. Time series techniques perform well; however, because they rely on linear modelling, most of them have difficulty predicting the strongly nonlinear behavior and rapid changes of the price signal [10][11][12]. ANNs have been used extensively by many researchers on similar problems. In problems with adequate data for ANN training and a straightforward selection of input-output samples, ANNs are a powerful and flexible forecasting tool and provide more accurate results than time series models. For example, in [13] the ANN approach for weekly price forecasts outperformed both the ARIMA time series technique and the naive procedure in all of the observed weeks.
The accuracy of a forecasting method can be evaluated by the mean absolute percentage error (MAPE). Although the concept of MAPE is simple and intuitive, it has two major drawbacks in practical applications. First, if the realized series contains zero values (which can happen, as EEX allows prices in the range from -500 € to 3,500 €), a division by zero occurs. Second, while MAPE is zero for a perfect fit, it has no upper bound. When an average MAPE is calculated for multiple time series, a few values with a very high MAPE may therefore distort a comparison between the average MAPEs of time series fitted with different methods [14].
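The two drawbacks of MAPE can be illustrated with a minimal Python sketch. The function name `mape` and the price values are hypothetical and chosen only for illustration; they are not EEX data.

```python
def mape(actual, forecast):
    """Mean absolute percentage error in percent.

    Raises ZeroDivisionError when a realized value is zero,
    because the percentage error is then undefined.
    """
    if any(a == 0 for a in actual):
        raise ZeroDivisionError("MAPE is undefined when a realized price is zero")
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical hourly prices in EUR/MWh (illustrative values only).
normal_actual = [40.0, 50.0, 60.0, 50.0]
normal_forecast = [42.0, 48.0, 63.0, 50.0]

# Same absolute forecast errors, but one realized price is close to zero.
near_zero_actual = [40.0, 0.5, 60.0, 50.0]
near_zero_forecast = [42.0, -1.5, 63.0, 50.0]

mape_normal = mape(normal_actual, normal_forecast)
mape_near_zero = mape(near_zero_actual, near_zero_forecast)
```

Both forecasts miss by the same number of euros in every hour, yet the second MAPE is far larger because a single realized price lies close to zero. This is the distortion effect described above.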
Our goal is to reuse existing price forecasting methods to create a hybrid price forecasting model that combines the advantages of existing techniques in order to capture the specific characteristics of electricity price movements.
Initially, the data is processed and filtered using statistical methods, resulting in a data model of similar days. This model is then improved by using a multi-layer perceptron (MLP) neural network and price spike detection and forecasting. We validated our approach using EEX electricity price history data and evaluated our results by applying several measures of accuracy (MAPE, MAE and RMSE). In addition, we address the problem of price volatility by showing how it affects the accuracy of each forecasting method.
The rest of the paper is organized as follows. Section 2 describes the price forecasting framework. Section 3 is devoted to the proposed forecasting methodology, while Section 4 describes price volatility. Section 5 provides our simulation results. Finally, Section 6 concludes the paper.

Price Forecasting Framework
Most of the existing studies of electricity price forecasting use only historical price and consumption data to forecast electricity prices over various time spans [3][4][5][6]8,10,13]. We found it important to include wind power production history data, as wind is the resource with the most volatile production ratio. During the last decade, wind power generation has experienced a powerful breakthrough due to incentives, preferential prices, Kyoto Protocol commitments and technical achievements, as have other renewable energy resources. In Europe, Nordic consumers benefit financially from the presence of Danish wind on the power market [15]. In recent years, Germany has also had a high penetration of wind power (Figure 1) [16]. However, the high volatility of wind power can cause prices on the power exchange to drop below zero, further increasing price volatility. For example, the minimum price of -500.02 € occurred on 10 April 2009 [17,18].
The proposed hybrid price forecasting model has both linear (similar day, SN) forecasting capabilities in Case 1 and nonlinear (ANN) forecasting capabilities in Case 2. In addition, in Case 2, if the possibility of a price spike (PS) is detected, the price is forecasted as a price spike. The time horizon is day-ahead and the price is forecasted for every hour.
The price forecasting framework and methodology are presented in Figure 2. As model input, historical data for prices, consumption and wind production from the Point Carbon web service [19] were used. Day-ahead forecasted consumption and wind production data were taken as additional input for increased performance.

I. Day type.
II. Relevant history horizon for similar days, $H(T)$.
III. Hourly correlation time horizon, $H(T)_h$.
Day type is divided into three categories: working days (Monday-Friday), Saturdays, and Sundays/holidays. Days before or after holidays also show distinct behavior, but because their influence on forecasting performance is negligible, they are not treated as a separate category.
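The three day-type categories can be expressed as a short classification routine. The sketch below is illustrative only: the function name `day_type`, the category labels and the holiday set are assumptions, and a real holiday calendar would have to be supplied for the market in question.

```python
from datetime import date


def day_type(day, holidays=frozenset()):
    """Classify a calendar day into one of the three day-type categories.

    `holidays` is a hypothetical set of dates treated like Sundays.
    """
    if day in holidays or day.weekday() == 6:  # weekday() == 6 is Sunday
        return "sunday_or_holiday"
    if day.weekday() == 5:  # weekday() == 5 is Saturday
        return "saturday"
    return "working_day"


# Illustrative holiday set: 25 December 2010 fell on a Saturday.
example_holidays = {date(2010, 12, 25)}

monday = day_type(date(2010, 11, 22))
saturday = day_type(date(2010, 11, 20))
holiday_on_saturday = day_type(date(2010, 12, 25), example_holidays)
```

Checking the holiday set before the weekday means that a holiday falling on a Saturday is grouped with Sundays, in line with the Sundays/holidays category.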
To define the relevant history horizon for similar days, $H(T)$, we analysed historical price data on an annual basis and selected 0.85 as the threshold for a significant hourly price correlation coefficient. Our analysis showed that this correlation coefficient corresponds, on average, to a time period of D = 28 days (Figure 3).
Prior to performing the forecast, an hourly horizon is defined by applying Equations (1) and (2). The hourly horizons $k^+$ and $k^-$ ("significant neighbors", shown in green in Figure 4) differ for every forecasted price $p(T)_h$, but in general the two nearest neighbors always have the highest correlation. In some cases the maximum correlation coefficient between neighboring hours is 0.80. We therefore use this value as the minimum correlation coefficient for defining the hourly horizon. A detailed view of the hourly price correlations is shown in Figure 4, where correlations higher than 0.80 are marked in green.
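The selection of significant neighboring hours can be sketched as follows. This is a minimal illustration rather than the procedure of Equations (1) and (2), which are not reproduced here: it assumes that the horizon is grown outward from hour h over contiguous hours for as long as the Pearson correlation with hour h stays at or above 0.80. The names `pearson` and `hourly_horizon` and the toy price series are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def hourly_horizon(prices_by_hour, h, threshold=0.80):
    """Return (k_minus, k_plus) for hour h.

    prices_by_hour[i] holds the prices of hour i over the history horizon.
    Assumption: neighbors are added contiguously while their correlation
    with hour h is at least `threshold`.
    """
    k_minus = 0
    while (h - k_minus - 1 >= 0 and
           pearson(prices_by_hour[h], prices_by_hour[h - k_minus - 1]) >= threshold):
        k_minus += 1
    k_plus = 0
    while (h + k_plus + 1 < len(prices_by_hour) and
           pearson(prices_by_hour[h], prices_by_hour[h + k_plus + 1]) >= threshold):
        k_plus += 1
    return k_minus, k_plus


# Toy data: five hours observed over five days (illustrative values only).
toy_prices = [
    [20.0, 25.0, 20.0, 25.0, 20.0],  # hour 0: uncorrelated with hour 2
    [28.0, 30.0, 32.0, 34.0, 36.0],  # hour 1: moves with hour 2
    [30.0, 32.0, 34.0, 36.0, 38.0],  # hour 2: forecasted hour
    [33.0, 36.0, 39.0, 42.0, 45.0],  # hour 3: moves with hour 2
    [50.0, 40.0, 50.0, 40.0, 50.0],  # hour 4: uncorrelated with hour 2
]

horizon_for_hour_2 = hourly_horizon(toy_prices, 2)
```

For the toy data, hour 2 obtains one significant neighbor on each side, since hours 0 and 4 fall below the threshold.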

Price Forecasting Methodology
Our approach, a hybrid forecasting model based on similar-day analysis and improved by a neural network and by price spike detection and forecasting, is shown in Figure 2. The data set is filtered using the data mining technique described in Section 2. Each hour is observed separately and then forecasted as a price spike or a normal price using a neural network. We define the hourly price $p(T)_h$ for day $T$ as the sum of a linear (similar days) component and non-linear (neural network and price spike) components. The hybrid forecasting model thus consists of the similar days methodology, the neural network, and price spike detection and forecasting, which are described in turn below.

Similar Days Methodology
Electricity price forecasting methods based on the similar days methodology were presented by Paras Mandal et al. in numerous works, such as [20][21][22]. Similar price days are selected based on a Euclidean norm. A Euclidean norm with weighted factors is used to evaluate the similarity between the forecasted day and candidate similar days [1]. Because of the high correlation between consumption and price in the observed market, these two parameters are chosen for day-ahead price forecasting [1]. In the norm, $L_t$ is the load for the forecasted day; $P_t$ is the price for the forecasted day; $L_t^p$ is the load on a similar day in the past; $P_t^p$ is the price on a similar day in the past; $\Delta L_t$ is the load deviation between the forecasted day and the similar day; $\Delta P_t$ is the price deviation at time $t$; and $\omega$ is the weighted factor, determined using the least squares method based on a regression model constructed from historical load and price data.
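A weighted Euclidean norm of this kind can be sketched in a few lines of Python. The exact formula is not reproduced in this text, so the sketch assumes the common form $\sqrt{\sum_t (\omega_L \Delta L_t^2 + \omega_P \Delta P_t^2)}$ with $\Delta L_t = L_t - L_t^p$ and $\Delta P_t = P_t - P_t^p$. The function names, the unit weights and the sample values are hypothetical; in the paper the weights come from a least squares regression.

```python
from math import sqrt


def weighted_distance(load, price, past_load, past_price, w_load, w_price):
    """Weighted Euclidean norm between a reference day and a past day.

    Assumed form: sqrt(sum_t(w_load * dL_t**2 + w_price * dP_t**2)).
    """
    return sqrt(sum(
        w_load * (l - lp) ** 2 + w_price * (p - pp) ** 2
        for l, p, lp, pp in zip(load, price, past_load, past_price)
    ))


def select_similar_days(load, price, history, n_similar, w_load=1.0, w_price=1.0):
    """Return the labels of the n_similar past days with the smallest norm.

    `history` maps a day label to a (past_load, past_price) pair.
    """
    ranked = sorted(
        history,
        key=lambda d: weighted_distance(load, price, *history[d], w_load, w_price),
    )
    return ranked[:n_similar]


# Illustrative two-hour profiles (load in GW, price in EUR/MWh).
reference_load, reference_price = [50.0, 60.0], [40.0, 45.0]
past_days = {
    "day_a": ([51.0, 61.0], [41.0, 44.0]),
    "day_b": ([70.0, 80.0], [60.0, 70.0]),
    "day_c": ([50.0, 57.0], [40.0, 45.0]),
}

similar = select_similar_days(reference_load, reference_price, past_days, n_similar=2)
```

With unit weights, load and price deviations are compared in raw units, which is rarely appropriate in practice. Estimating the weights by regression, as the text describes, is what puts the two quantities on a comparable footing.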
Input data for the proposed method include consumption and wind forecast data taken from the Point Carbon web service [23] (the underlying techniques used for the consumption and wind forecasts are outside the scope of this paper). Similar days with realized prices are identified by mining the consumption and wind production data. The result of this process is the linear price component from Equation (2), in which $N$ is the number of similar days and $\delta$ is the average hourly difference between forecasted hours.
The linear price component can be regarded as independent, because it represents a starting point for the forecasting model that can be improved by the neural network and price spike components.

Neural Network Architecture
Neural networks are widely applied to problems that are generally difficult to solve by humans or by conventional computational algorithms. In power systems, ANNs have been used for problems such as load forecasting, unit commitment, power system topology recognition, safety analysis and price forecasting [24]. For the hourly neural network component used in our approach, a multi-layer feed-forward neural network is proposed, as shown in Figure 5. This neural network is used to forecast the hourly deviation from the similar days function with regard to the forecasted day. The neural network is composed of one input layer, one hidden layer and one output layer. Figure 6 shows how the relevant data set for forecasting the price deviation in hour h is defined. The input variables of the ANN are the load forecast, the wind forecast, and the hourly price, wind and load deviations from their average values on similar days. The deviation is defined in Equation (11) as the deviation from the average value over the similar days. The neural network is trained on 70% randomly chosen cases from the data set and tested on the remaining 30%.
Because there is no efficient way of storing electric energy, all electricity produced has to be consumed immediately. Imbalances between consumption and production lead to electricity price jumps (i.e., price spikes), which can be several times higher or lower than the regular price. Possible causes of imbalance are weather conditions, which have an effect on consumption, production from renewable energy sources, unexpected outages and reductions of cross-border capacities. Some of these events, such as outages or capacity reductions, happen rarely and cannot be predicted with confidence.
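Returning to the neural network inputs, the deviation from the similar-day average and the random 70/30 split can be sketched as below. This covers the data preparation only, not the network itself. The function names, the fixed random seed and the sample values are assumptions made for illustration.

```python
import random


def deviation_from_similar_days(value, similar_day_values):
    """Deviation of a value from its average over the similar days."""
    return value - sum(similar_day_values) / len(similar_day_values)


def train_test_split(cases, train_fraction=0.7, seed=0):
    """Randomly split cases into a training part and a testing part.

    The seed is fixed only to make the illustration reproducible.
    """
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Illustrative values: forecasted load of 45 GW against three similar days.
load_deviation = deviation_from_similar_days(45.0, [40.0, 42.0, 44.0])

# Ten hypothetical cases, identified here only by an index.
train_cases, test_cases = train_test_split(range(10))
```

Each case would in practice carry the full input vector (load forecast, wind forecast and the three deviations) together with the realized price deviation as the target.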
In the proposed forecasting model we categorize prices as either regular (normal) prices or price spikes. It is important to determine how consumption and wind production affect price spikes. We created an index that unifies consumption and wind production changes into a single signal, which detects the possibility of price spikes. The consumption and wind index (CWI) proposed in our approach consists of two components: CC refers to the relative degree of forecasted consumption with respect to the initial consumption on similar days, and CW refers to the forecasted wind production with respect to the initial wind production on similar days. The symbols used are: $C_h$, consumption in hour h; $C_{sd}$, average consumption on similar days; $W_h$, wind production in hour h; $W_{sd}$, average wind production on similar days; $N$, number of similar days. Applying the CWI to the whole data set indicates that high price deviations occur for lower values of the index (Figure 7). If the error of forecasted wind production and consumption is lower than 1.0, a price spike may happen in the observed hour.
After calculating the CWI for a possible price spike, the hourly price spike component $S_h$ (Equation (3)) can be calculated. Since price spikes happen rarely, there was not enough data to use for the neural network calculation. Therefore $S_h$ was calculated using a linear approximation with two variables: wind and consumption.
The importance of short-term price forecasting on the one hand, and its complexity on the other, have led researchers to propose various methods. Among these, three approaches are widely used: time series models, artificial neural networks (ANN) and hybrid methods.

Price Volatility
Volatility refers to unpredictable fluctuations of a process observed over time. In finance, volatility is a measure of the variation of the price of a financial instrument over time [20]. Past volatility is derived from a time series of past market prices. It is a criterion for studying the risk associated with holding assets when there is uncertainty in the future value of the assets.
In [25] past volatility is calculated as the standard deviation of the arithmetic or logarithmic return over a time window T. If $p_t$ is the spot price of a commodity at time t, the arithmetic return over time period h is defined as:
$$r_{t,h} = \frac{p_t - p_{t-h}}{p_{t-h}}$$
The logarithmic return over time period h is defined as:
$$r_{t,h} = \ln\left(\frac{p_t}{p_{t-h}}\right) = \ln(p_t) - \ln(p_{t-h}) \qquad (20)$$
When returns are small, the arithmetic and logarithmic returns are approximately equal:
$$\frac{p_t - p_{t-h}}{p_{t-h}} \approx \ln\left(\frac{p_t}{p_{t-h}}\right)$$

Given the return values, the estimated value of past volatility can be calculated as:
$$\sigma_{h,T} = \sqrt{\frac{1}{N_0 - 1}\sum_{t=1}^{N_0}\left(r_{t,h} - \bar{r}_{h,T}\right)^2}$$
where: $\sigma_{h,T}$ is the estimated value of past volatility; $N_0$ is the number of $r_{t,h}$ observations; $\bar{r}_{h,T}$ is the average return over the time window T.
For this study, volatility is calculated as the standard deviation of the arithmetic return over a time window T, because EEX prices can be negative. When prices are negative, the logarithmic return cannot be calculated as in Equation (20).
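The choice of arithmetic returns can be illustrated with a short Python sketch. It assumes the sample standard deviation (denominator $N_0 - 1$), which is what `statistics.stdev` computes. The function names and the price values are hypothetical.

```python
from math import log
from statistics import stdev


def arithmetic_returns(prices, h=1):
    """Arithmetic return (p_t - p_{t-h}) / p_{t-h} for every available t."""
    return [(prices[t] - prices[t - h]) / prices[t - h]
            for t in range(h, len(prices))]


def log_returns(prices, h=1):
    """Logarithmic return ln(p_t / p_{t-h}); fails for non-positive ratios."""
    return [log(prices[t] / prices[t - h]) for t in range(h, len(prices))]


def past_volatility(prices, h=1):
    """Sample standard deviation of the arithmetic returns."""
    return stdev(arithmetic_returns(prices, h))


# Illustrative positive prices in EUR/MWh.
positive_prices = [40.0, 44.0, 39.6, 41.58]
volatility = past_volatility(positive_prices)

# A price series that turns negative, as can happen on EEX.
negative_step = [20.0, -5.0]
arithmetic_on_negative = arithmetic_returns(negative_step)

try:
    log_returns(negative_step)
    log_return_failed = False
except ValueError:
    # math.log raises ValueError for a negative argument.
    log_return_failed = True
```

The arithmetic return stays defined when the price turns negative, whereas the logarithm does not. Note that the arithmetic return is still undefined when the earlier price $p_{t-h}$ is exactly zero.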
It is interesting to observe the volatility fluctuation on EEX over the last eight years in relation to traded volume. Figure 8 shows much higher volatility values occurring between 2005 and 2013. This period coincides with the growth of wind power production (Figure 1), so it can be concluded that for some time market prices were under the influence of renewable energy production. It can also be concluded that price volatility depends on the traded energy volume: from 2010, with an increase in traded volume, price volatility is lower than in previous years.

Case Study
We tested our approach for electricity price forecasting on EEX price history data from the period 20 November 2010 to 20 July 2011. Price spikes introduced outliers into the results, which was the reason for including two additional error measures: MAE and RMSE. For example, in December the price distribution was scattered and there was a large number of price spikes, which increased MAPE.
Simulation results show that the similar days method with neural network and price spike detection gives the best results. Price volatility is a dominant factor affecting the accuracy of the forecast model. In the case of low price volatility, such as March 2011, simple models such as similar days gave adequate results, with a forecasting error similar to that of advanced models with neural network and price spike detection. In the case of high volatility, such as December 2010, advanced models gave better results. It is therefore very important not only to evaluate the forecasting model against the forecast error but also to analyze the complexity of the result distribution, in this case defined by price volatility.
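The two additional error measures can be sketched in Python using their standard definitions: the mean absolute error, and the root of the mean squared error. The function names and the price values are hypothetical and serve only to show how a single price spike affects each measure.

```python
from math import sqrt


def mae(actual, forecast):
    """Mean absolute error, in the same units as the prices."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean square error; penalizes large errors more than MAE does."""
    return sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


# Illustrative hourly prices in EUR/MWh.
forecast = [42.0, 48.0, 63.0, 50.0]
actual_regular = [40.0, 50.0, 60.0, 50.0]
actual_with_spike = [40.0, 50.0, 60.0, 90.0]  # last hour is a missed spike

mae_regular = mae(actual_regular, forecast)
rmse_regular = rmse(actual_regular, forecast)
mae_spike = mae(actual_with_spike, forecast)
rmse_spike = rmse(actual_with_spike, forecast)
```

In this toy case the missed spike raises MAE from 1.75 to 11.75 EUR/MWh, while RMSE rises from about 2.06 to about 20.1 EUR/MWh. Comparing the two measures therefore indicates how much of the error comes from a few large misses.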

Conclusions
The main difference between the proposed model and other existing methods for short-term electricity price forecasting lies in its data utilization approach. In the proposed approach, the data is pre-processed by statistical methods prior to each analysis. Forecasting results were presented for three combinations of the similar days method (SD), neural network forecasting (NN) and price spikes (PS): SD, SD + NN and SD + NN + PS, each proving more accurate than the previous one. Price volatility has a significant influence on the forecasting results. Simple forecasting models produced satisfactory results in cases of low price volatility. Our method proved robust enough, even in cases of high price volatility.

Figure 1. Wind power capacity and generation in Germany.

Figure 2. Flow chart of the hybrid electricity price forecasting model.

Figure 3. Daily price correlation for D nearest neighbors.

Figure 4. Correlation matrix of EEX hourly data averages for the period from 20 November 2010 to 20 July 2011.
Similar hour before ($k^-$) and after ($k^+$) the forecasted hour h.

Figure 5. Proposed ANN model for hourly neural network component forecasting.

Figure 6. Defining the data set for the forecasted hour h.

Figure 7. Correlation of CWI and standard deviation for observed hours.

Figure 8. Volatility fluctuations and traded volume show that volatility jumps were more severe at lower levels of EEX spot market liquidity.

Table 1. Performances of the proposed hybrid model for hourly electricity price forecasting.
Data was processed in Microsoft Excel with Palisade Decision Tool and Visual Basic. Three cases were observed:
I Forecasting with similar days (SD);
II Forecasting with similar days and neural network (SD + NN);
III Forecasting with similar days, neural network and price spikes detection (SD + NN + PS).
Our results are presented in Table 1. Model performance was evaluated with MAPE, where $y_t$ is the realized value at time t and $f_t$ is the forecasted value at time t in the time period T:
$$\mathrm{MAPE} = \frac{100}{T}\sum_{t=1}^{T}\left|\frac{y_t - f_t}{y_t}\right|$$