Energies 2019, 12(22), 4262; https://doi.org/10.3390/en12224262

Article
Forecasting the Price Distribution of Continuous Intraday Electricity Trading
Energy Information Networks & Systems, TU Darmstadt, 64283 Darmstadt, Germany
*
Author to whom correspondence should be addressed.
Received: 30 August 2019 / Accepted: 5 November 2019 / Published: 8 November 2019

Abstract

The forecasting literature on intraday electricity markets is scarce and restricted to the analysis of volume-weighted average prices. These only admit a highly aggregated representation of the market. Instead, we propose to forecast the entire volume-weighted price distribution. We approximate this distribution in a non-parametric way using a dense grid of quantiles. We conduct a forecasting study on data from the German intraday market and aim to forecast the quantiles for the last three hours before delivery. We compare the performance of several linear regression models and an ensemble of neural networks to several well-designed naive benchmarks. The forecasts improve only marginally over the naive benchmarks for the central quantiles of the distribution, which is in line with the latest empirical results in the literature. However, we are able to significantly outperform all benchmarks for the tails of the price distribution.
Keywords:
electricity price forecasting; intraday markets; lasso regression; neural networks

1. Introduction

Continuous intraday electricity trading offers market participants the possibility to balance short-term deviations from their planned generation and load schedules. This is especially valuable for agents with a high share of generation from non-dispatchable renewable energy sources like wind and solar. Conventionally, deviations from the day-ahead schedules are compensated through balancing energy which is contracted and centrally dispatched by the transmission system operator. The possibility to trade electricity on short notice can partly explain the counterintuitive situation that the demand for balancing energy in Germany has declined substantially in recent years, while at the same time the share of generation from renewable sources has increased [1,2]. Thus, intraday markets can be an effective tool to support the transition to a flexible and renewable energy system and have seen steadily growing volumes in recent years [3].
In this work we will focus on the German continuous intraday market. On this market it is possible to trade hourly and quarter-hourly contracts for the delivery of electricity till 30 min before delivery. Inside of the four control zones it is possible to trade until five minutes before the delivery starts. Contrary to the day-ahead auction, the intraday market is operated as a continuous pay-as-bid market, i.e., market participants can submit bids and asks for price-volume combinations which are immediately executed if two offers in the order book can be matched. This results in a potentially large set of prices for the same product. Therefore, price indexes that reflect the volume-weighted average price are a main indicator of market outcomes. The most important one is the ID3 price index which is the volume-weighted average price of all trades in the time interval from three hours before delivery till 30 min before delivery [4]. For a detailed description of the German power markets see [5].
In contrast to the well-researched day-ahead markets [6], the literature regarding intraday electricity price forecasting is scarce. Andrade et al. [7] and Monteiro et al. [8] conducted a forecasting study for the Iberian intraday electricity market. However, this market does not resemble the design of the German market. Most importantly, the Iberian intraday market is operated as six separate intraday auctions under a uniform pricing regime. More recently, Maciejowska et al. [9] presented a model that is able to predict the price spread between the German day-ahead auction prices and the corresponding volume-weighted average intraday prices. Finally, Uniejewski et al. [10] and Narajewski & Ziel [11] are, to the best of our knowledge, the only two papers that aim to directly forecast ID3 prices. The authors of [11] present evidence that the information available to the market at forecasting time, i.e., three hours before delivery, is already efficiently incorporated by the market participants and therefore the best forecast for the ID3 price is the volume-weighted average price of the most recent 15 min of trading.
Since the intraday market is operated under a pay-as-bid regime, it is possible for market participants to sell or buy contracts at prices that substantially differ from the ID3 price. Furthermore, very different sets of trades can result in the same weighted average price. Therefore, we aim to forecast the entire volume-weighted price distribution instead of only volume-weighted average prices. Note that this distribution is not equivalent to a predictive distribution of the ID3 price, but will often be much wider. Such a forecast is important to enable bidding strategies that use prices away from the ID3 price, e.g., to benefit from especially high or low price offers. To approach our task, we construct empirical cumulative distributions for each trading product in discrete time intervals and describe these distributions using a dense grid of quantiles. This results in a set of multivariate time series of quantile values which non-parametrically approximate the targeted price distributions.
We conduct a forecasting study for the German intraday market in which we aim to forecast the quantiles of the price distribution in the time from three hours to 30 min before delivery. We test several linear regression models as well as a neural network model that accounts for the unique structure of the data. We compare the forecasts from these models to several carefully designed naive benchmark models. Our empirical findings support the evidence in [11], as we are only able to outperform the naive benchmarks by a small margin for the central quantiles. However, the performance of our forecasts for the tails of the distributions improves significantly over the benchmark models.
The remainder of the paper is structured as follows. In Section 2 we describe the data set and preliminary data transformation we apply to obtain the quantiles of the price distributions. We present the linear regression and neural network models as well as a set of naive benchmark models in Section 3 and describe our forecasting strategy along with the employed error measures in Section 4. The empirical findings are discussed in Section 5. We conclude in Section 6.

2. Data Set & Data Transformation

The German intraday market has undergone two relevant regulatory changes in the last several years. First, since July 2017 it is possible to trade until 5 min before delivery inside of the control zones. Second, in October 2018 the Austrian control zone was split from the former German-Austrian market. For our analysis we consider data on the intraday transactions from 1 July 2017 to 31 March 2019 for the German-Austrian market and German market respectively, i.e., we start our analysis after the introduction of the 5 min delivery horizon while the market split occurred during the time frame of our analysis. We assume that the control zone split did not have a substantial impact on prices and liquidity since the German control zones are large compared to the Austrian control zone and cross border trading is still possible. The continuous intraday trading data we use is commercially available from Epex Spot [12]. We will only consider hourly products in our analysis as their traded volume is more than five times larger than the volume of the quarter hour products. We additionally consider corresponding exogenous hourly data regarding day-ahead auction prices as well as forecasted renewable generation and load which is available from ENTSO-E [13].

2.1. Constructing Price Distributions from Intraday Trading Data

The trading data contains all executed trades for hour products to be delivered between 1 July 2017 and 31 March 2019. For this period we define a corresponding set of dates $\mathcal{D}$. Each entry in the raw data set corresponds to a single trade $i$ and is comprised of the identifier for the hour $h_i \in \{0, \dots, 23\}$ and the day $d_i \in \mathcal{D}$ of delivery, a timestamp of the trading time that indicates the time left until delivery $t_i \in \mathbb{R}^+$, the trade volume $v_i \in \{0.1, 0.2, 0.3, \dots\}$ in MWh, and the price $p_i \in \{-9999.90, -9999.89, \dots, 9999.90\}$ in EUR/MWh [14].
Forecasting the data on the level of single trades is likely to be difficult and not necessarily needed for decision making. Hence, one has to aggregate the trading data into a more suitable form that is still able to represent the market’s development in sufficient detail. To this end, the literature has so far focused on analyzing and forecasting volume-weighted average prices [9,10,11] which only provide a highly aggregated representation of the market.
We instead propose to work with the entire volume-weighted price distribution of the trades. Therefore, we compute the volume-weighted empirical cumulative distribution function (VWECDF) over the price p,
$$F_{d,h}^{t_1 \to t_2}(p) = \frac{1}{V_{d,h}^{t_1 \to t_2}} \sum_i v_i \, \mathbb{1}_{\{d_i = d,\; h_i = h,\; t_2 < t_i \le t_1,\; p_i \le p\}}. \tag{1}$$
$F_{d,h}^{t_1 \to t_2}(p)$ denotes the VWECDF for the hour product $h$ on day $d$ between the time $t_1$ and $t_2$ before delivery with $t_2 < t_1$. The total volume traded for the product in the given interval
$$V_{d,h}^{t_1 \to t_2} = \sum_i v_i \, \mathbb{1}_{\{d_i = d,\; h_i = h,\; t_2 < t_i \le t_1\}}$$
is used to normalize the sum in the VWECDF. The VWECDF in Equation (1) for a given product is computationally represented by an ordered set $\{(q_j, r_j)\}_{j=1}^{J}$ of the $J$ trades observed in the time interval $t_1 \to t_2$, where $r_j$ is the empirical quantile and $q_j$ is the corresponding quantile value which is given by the price of trade $j$. This allows us to estimate quantile values for a specific quantile $\tau$ using linear interpolation if necessary
$$q_\tau = \begin{cases} q_j & \text{if } \tau = r_j \\ q_j + \dfrac{q_{j+1} - q_j}{r_{j+1} - r_j} (\tau - r_j) & \text{if } r_j < \tau < r_{j+1}. \end{cases}$$
Using a dense grid of quantiles $\tau \in \{0, 0.01, \dots, 0.99, 1\}$ we then obtain a vector of quantile values
$$\mathbf{q}_{d,h}^{t_1 \to t_2} = [q_{d,h,0}^{t_1 \to t_2}, q_{d,h,0.01}^{t_1 \to t_2}, \dots, q_{d,h,1}^{t_1 \to t_2}]^T$$
which non-parametrically describes the price distribution of the given product $(d, h)$ in the time interval defined by $t_1$ and $t_2$. For this empirical distribution, the values for $q_{d,h,0}^{t_1 \to t_2}$ and $q_{d,h,1}^{t_1 \to t_2}$ correspond to the cheapest and most expensive trades observed.
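The transformation from raw trades to a quantile vector can be illustrated with a minimal sketch. The function names `vwecdf_points` and `quantile_value` are hypothetical, trades are assumed to be `(price, volume)` pairs that already belong to one product and one time interval, and the handling of quantiles below the first empirical level (returning the cheapest price) is an assumption of this sketch rather than something stated in the text.

```python
from bisect import bisect_left


def vwecdf_points(trades):
    """Return the ordered pairs (q_j, r_j): trade price and cumulative volume share."""
    total_volume = sum(volume for _, volume in trades)
    points, cumulative = [], 0.0
    for price, volume in sorted(trades):
        cumulative += volume
        points.append((price, cumulative / total_volume))
    return points


def quantile_value(points, tau):
    """Linearly interpolate the quantile value q_tau between neighbouring pairs."""
    levels = [r for _, r in points]
    if tau <= levels[0]:
        # Assumption: below the first empirical level we return the cheapest price.
        return points[0][0]
    j = bisect_left(levels, tau)
    if j >= len(points):
        return points[-1][0]
    (q_lo, r_lo), (q_hi, r_hi) = points[j - 1], points[j]
    if tau == r_hi:
        return q_hi
    return q_lo + (q_hi - q_lo) / (r_hi - r_lo) * (tau - r_lo)


# Three hypothetical trades: (price in EUR/MWh, volume in MWh).
trades = [(40.0, 1.0), (30.0, 1.0), (50.0, 2.0)]
points = vwecdf_points(trades)
# Dense grid tau = 0, 0.01, ..., 1 gives a vector of 101 quantile values.
quantile_vector = [quantile_value(points, k / 100) for k in range(101)]
```

The first and last entries of `quantile_vector` are the cheapest and most expensive trade prices, in line with the description above.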
Figure 1a shows the result of the applied transformation in a fan chart for a single product in 15 min time intervals from 5 h before delivery till the time of delivery. We can observe that the variance increases with the time of delivery approaching. This is characteristic for the intraday market. Figure 1b shows the inverse of the VWECDF and quantile values for selected quantiles for the entire time horizon of 5 h.
Let us note again that this is not a distribution which describes the uncertainty over the volume-weighted average price but a distribution that describes how the traded volume is distributed over the possible prices. To illustrate this point consider the following hypothetical situation. Suppose we would observe the trades for a certain product before issuing a forecast for the observed time frame. Then, we could compute the volume-weighted average price and issue a perfect probabilistic forecast for this average price, a distribution where the entire probability mass is centered at the true, known value. However, this forecast would not inform a trader about the variety of prices that are traded for this product. In contrast, our approach would still forecast a non-trivial distribution that would inform an agent about the dispersion of the traded prices, e.g., we could exactly forecast the marginal value of the cheapest and most expensive 10% of the trades. This is a much richer representation of the market behavior and reflects that different market participants might value electrical energy very differently. Considering a price taker perspective, an agent could then take advantage of the estimated dispersion of prices.

2.2. Exogenous Data

Along with the trading data we also include exogenous fundamental data. For each day and hour we consider the load forecast $Load_{d,h}$, the forecasted in-feed from wind and solar power $RES_{d,h}$, and the day-ahead auction price $DA_{d,h}$ which is already known before the continuous trading starts. We combine all exogenous variables in the vector $\mathbf{x}_{d,h} = [Load_{d,h}, RES_{d,h}, DA_{d,h}]^T$. Additionally we consider the 24-dimensional one-hot encoded column vector $\mathbf{s}_{d,h}$ which contains a dummy variable for each hour of the day.
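As a small illustration of these two inputs, the following sketch builds the exogenous vector and the one-hot hour vector as plain Python lists. The function names and the numeric values are hypothetical and only serve as an example.

```python
def exogenous_vector(load_forecast, res_forecast, day_ahead_price):
    """Stack the three exogenous values for one product into a single vector."""
    return [load_forecast, res_forecast, day_ahead_price]


def hour_dummies(hour):
    """Return the 24-dimensional one-hot encoding of the delivery hour (0..23)."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    return [1.0 if h == hour else 0.0 for h in range(24)]


# Hypothetical values for a product delivered in hour 17.
x = exogenous_vector(load_forecast=61500.0, res_forecast=23000.0, day_ahead_price=42.3)
s = hour_dummies(17)
```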

3. Predicting the Quantiles of the Price Distribution

The most important price index for the intraday market is the ID3 price. It is the volume-weighted average price of the trades in the time interval from three hours before delivery till 30 min before delivery for a given product [4]. We therefore also focus on this time horizon and aim to forecast $\mathbf{q}_{d,h}^{3 \to 0.5}$. As explanatory variables we use the time series of the observed quantile values from four to three hours before delivery in 15 min time intervals denoted by $\mathbf{Q}_{d,h}^{4 \to 3} = [\mathbf{q}_{d,h}^{4 \to 3.75}, \mathbf{q}_{d,h}^{3.75 \to 3.5}, \mathbf{q}_{d,h}^{3.5 \to 3.25}, \mathbf{q}_{d,h}^{3.25 \to 3}]$, i.e., $\mathbf{Q}_{d,h}^{4 \to 3}$ is a matrix of dimension $N_\tau \times 4$ where $N_\tau$ is the number of quantiles. We also consider the corresponding time series from the two neighboring products $\mathbf{Q}_{d,h-1}^{3 \to 2}$ and $\mathbf{Q}_{d,h+1}^{5 \to 4}$ which are also of dimension $N_\tau \times 4$. Furthermore, $\mathbf{Q}_{d,h,\tau}^{t_1 \to t_2}$ denotes the row of $\mathbf{Q}_{d,h}^{t_1 \to t_2}$ that belongs to quantile $\tau$, i.e., the four consecutive values of that quantile. For ease of notation we will write $(d, h+k)$ to denote the product that has to be delivered $k$ hours before/after the product $(d, h)$ instead of using the correct notation $(d + \lfloor (h+k)/24 \rfloor, (h+k) \bmod 24)$. Finally, we also use the exogenous variables for all three considered products $\mathbf{x}_{d,h-1}, \mathbf{x}_{d,h}, \mathbf{x}_{d,h+1}$ and the vector of hour dummy variables $\mathbf{s}_{d,h}$.
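The wrap-around of the product index across midnight can be sketched as follows. The helper name `shift_product` is hypothetical; it relies on Python's `divmod`, which floors towards negative infinity and therefore also handles negative offsets.

```python
from datetime import date, timedelta


def shift_product(day, hour, k):
    """Return (day, hour) of the product delivered k hours after (day, hour).

    A negative k selects an earlier product.
    """
    day_offset, new_hour = divmod(hour + k, 24)
    return day + timedelta(days=day_offset), new_hour


# Neighbours of the product delivered at hour 23 on 1 April 2018.
previous_product = shift_product(date(2018, 4, 1), 23, -1)
next_product = shift_product(date(2018, 4, 1), 23, 1)
```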

3.1. Linear Regression Models

In this section we present a set of linear regression models which use different subsets of the available regressors. This allows us to stepwise infer the contribution of each factor to the forecasting performance. To obtain a forecast for the vector of quantile values $\hat{\mathbf{q}}_{d,h}^{3 \to 0.5}$ we have to fit a separate model for each quantile $\tau$. At test time we concatenate the predictions from the $N_\tau$ individual models and sort the resulting vector to ensure monotonically increasing quantile values.
The first model, which we call AR1, uses only the time series information of the same product for the same quantile $\mathbf{Q}_{d,h,\tau}^{4 \to 3}$ and the vector of dummy variables $\mathbf{s}_{d,h}$. It is given by
$$\hat{q}_{d,h,\tau}^{3 \to 0.5} = \mathbf{w}_1 \mathbf{Q}_{d,h,\tau}^{4 \to 3} + \mathbf{w}_2 \mathbf{s}_{d,h},$$
where $\mathbf{w}_i$ are row vectors of model parameters. The model ARX1 given by
$$\hat{q}_{d,h,\tau}^{3 \to 0.5} = \mathbf{w}_1 \mathbf{Q}_{d,h,\tau}^{4 \to 3} + \mathbf{w}_2 \mathbf{x}_{d,h} + \mathbf{w}_3 \mathbf{s}_{d,h}$$
additionally uses the exogenous variables for the same product. The model AR2 given by
$$\hat{q}_{d,h,\tau}^{3 \to 0.5} = \mathbf{w}_1 \mathbf{Q}_{d,h,\tau}^{4 \to 3} + \mathbf{w}_2 \mathbf{Q}_{d,h-1,\tau}^{3 \to 2} + \mathbf{w}_3 \mathbf{Q}_{d,h+1,\tau}^{5 \to 4} + \mathbf{w}_4 \mathbf{s}_{d,h}$$
also utilizes the time series information from the neighboring products for the same quantile but ignores the exogenous variables. The model ARX2 given by
$$\hat{q}_{d,h,\tau}^{3 \to 0.5} = \mathbf{w}_1 \mathbf{Q}_{d,h,\tau}^{4 \to 3} + \mathbf{w}_2 \mathbf{Q}_{d,h-1,\tau}^{3 \to 2} + \mathbf{w}_3 \mathbf{Q}_{d,h+1,\tau}^{5 \to 4} + \mathbf{w}_4 \mathbf{x}_{d,h} + \mathbf{w}_5 \mathbf{x}_{d,h-1} + \mathbf{w}_6 \mathbf{x}_{d,h+1} + \mathbf{w}_7 \mathbf{s}_{d,h}$$
additionally includes the exogenous regressors for all three products. Finally the ARXfull model
$$\hat{q}_{d,h,\tau}^{3 \to 0.5} = \sum_{\tau'} \mathbf{w}_{1,\tau'} \mathbf{Q}_{d,h,\tau'}^{4 \to 3} + \sum_{\tau'} \mathbf{w}_{2,\tau'} \mathbf{Q}_{d,h-1,\tau'}^{3 \to 2} + \sum_{\tau'} \mathbf{w}_{3,\tau'} \mathbf{Q}_{d,h+1,\tau'}^{5 \to 4} + \mathbf{w}_4 \mathbf{x}_{d,h} + \mathbf{w}_5 \mathbf{x}_{d,h-1} + \mathbf{w}_6 \mathbf{x}_{d,h+1} + \mathbf{w}_7 \mathbf{s}_{d,h}$$
utilizes all available inputs, with the sums running over all quantiles $\tau'$ of the grid. Hence, this model has 1245 parameters ($3 \cdot 4 \cdot 101 = 1212$ weights for the quantile time series, 9 for the exogenous variables, and 24 for the hour dummies). This will likely result in overfitting for the used training set size of 6 months. Furthermore, many regressors might not carry useful information for the quantile value to forecast. We therefore apply Lasso regularization to automatically select an optimal subset of regressors [15], i.e., the model parameters are estimated using an extended loss function that penalizes the $L_1$ norm of the model weights. Let $\mathbf{z}_{d,h,\tau}$ be a vector of standardized regressors, $\mathbf{w}$ the model weights, and $q_{d,h,\tau}^{3 \to 0.5}$ the true quantile values, then the Lasso estimator for the optimal weight vector $\mathbf{w}^*$ is given by
$$\mathbf{w}^* = \operatorname*{argmin}_{\mathbf{w}} \sum_d \sum_h \left( q_{d,h,\tau}^{3 \to 0.5} - \mathbf{w} \mathbf{z}_{d,h,\tau} \right)^2 + \lambda_\tau \sum_j |w_j|,$$
where $\lambda_\tau$ is the hyperparameter that controls the degree of regularization. Setting $\lambda_\tau = 0$ leads to standard ordinary least squares estimation.
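To make the Lasso objective above concrete, here is a minimal coordinate-descent sketch in plain Python. It is an illustration under stated assumptions and not the solver used in the study, which the text does not name. The function names, the fixed number of sweeps, and the toy data are hypothetical. For the objective "sum of squared residuals plus $\lambda \sum_j |w_j|$", the coordinate update is a soft-thresholding step with threshold $\lambda / 2$.

```python
def soft_threshold(rho, threshold):
    """Shrink rho towards zero by threshold; values inside the band become zero."""
    if rho > threshold:
        return rho - threshold
    if rho < -threshold:
        return rho + threshold
    return 0.0


def lasso_coordinate_descent(Z, y, lam, n_sweeps=200):
    """Minimise sum_n (y_n - w . z_n)^2 + lam * sum_j |w_j| by cyclic coordinate descent.

    Z is a list of regressor rows (assumed standardized), y the list of targets.
    """
    n, p = len(Z), len(Z[0])
    w = [0.0] * p
    column_sq = [sum(Z[i][j] ** 2 for i in range(n)) for j in range(p)]
    residual = list(y)  # equals y - Z w for the initial w = 0
    for _ in range(n_sweeps):
        for j in range(p):
            if column_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(Z[i][j] * (residual[i] + w[j] * Z[i][j]) for i in range(n))
            new_w = soft_threshold(rho, lam / 2.0) / column_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * Z[i][j]
                w[j] = new_w
    return w


# Toy data with two orthogonal regressors; the second one carries little signal.
Z = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
w_ols = lasso_coordinate_descent(Z, y, lam=0.0)
w_lasso = lasso_coordinate_descent(Z, y, lam=1.0)
```

With `lam=0.0` the sketch reproduces the least-squares solution, while `lam=1.0` shrinks the first weight and sets the weak second weight exactly to zero, which is the variable-selection effect described above.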

3.2. Neural Network Model

The modeling approaches described above result in one model per quantile and can only model linear relationships. Therefore, we also test a multi output neural network model (NN) which uses an architecture that accounts for the structure of the inputs and limits the number of parameters in the hidden layers, see Figure 2 for a visualization. The model outputs a prediction for the vector of quantile values as a function of all available regressors
$$\hat{\mathbf{q}}_{d,h}^{3 \to 0.5} = f(\mathbf{Q}_{d,h-1}^{3 \to 2}, \mathbf{Q}_{d,h}^{4 \to 3}, \mathbf{Q}_{d,h+1}^{5 \to 4}, \mathbf{x}_{d,h-1}, \mathbf{x}_{d,h}, \mathbf{x}_{d,h+1}, \mathbf{s}_{d,h}).$$
The proposed neural network has two hidden layers. The first hidden layer is a locally connected layer and operates only on the time series data $\mathbf{Q}_{d,h-1}^{3 \to 2}, \mathbf{Q}_{d,h}^{4 \to 3}, \mathbf{Q}_{d,h+1}^{5 \to 4}$. In this locally connected layer a distinct vector of weights $\mathbf{w}_\tau^{(1)} = [w_{0,\tau}^{(1)}, \mathbf{w}_{1,\tau}^{(1)}, \mathbf{w}_{2,\tau}^{(1)}, \mathbf{w}_{3,\tau}^{(1)}]$ is learned for each quantile. Each local model outputs a scalar value
$$h_\tau^{(1)} = g\left(w_{0,\tau}^{(1)} + \mathbf{w}_{1,\tau}^{(1)} \mathbf{Q}_{d,h-1,\tau}^{3 \to 2} + \mathbf{w}_{2,\tau}^{(1)} \mathbf{Q}_{d,h,\tau}^{4 \to 3} + \mathbf{w}_{3,\tau}^{(1)} \mathbf{Q}_{d,h+1,\tau}^{5 \to 4}\right),$$
where $\mathbf{Q}_{d,h,\tau}^{t_1 \to t_2}$ denotes the row of $\mathbf{Q}_{d,h}^{t_1 \to t_2}$ that belongs to quantile $\tau$. The function
$$g(z) = \begin{cases} z & \text{if } z \ge 0 \\ e^z - 1 & \text{if } z < 0 \end{cases}$$
is the ELU activation function [16]. The layer outputs a vector $\mathbf{h}^{(1)} = [h_0^{(1)}, h_{0.01}^{(1)}, \dots, h_{1.0}^{(1)}]^T$ of dimension $N_\tau \times 1$. The vector $\mathbf{h}^{(1)}$ is then concatenated with the vectors $\mathbf{x}_{d,h-1}, \mathbf{x}_{d,h}, \mathbf{x}_{d,h+1}, \mathbf{s}_{d,h}$ and is passed through a fully connected layer
$$\mathbf{h}^{(2)} = g\left(\mathbf{w}_0^{(2)} + \mathbf{W}^{(2)} [\mathbf{h}^{(1)}, \mathbf{x}_{d,h-1}, \mathbf{x}_{d,h}, \mathbf{x}_{d,h+1}, \mathbf{s}_{d,h}]^T\right)$$
with $N_\tau$ neurons, i.e., the weight matrix $\mathbf{W}^{(2)}$ has dimension $N_\tau \times (N_\tau + 9 + 24)$ and $\mathbf{w}_0^{(2)}$ is a vector of constants with dimension $N_\tau \times 1$. The last layer
$$\hat{\mathbf{q}}_{d,h}^{3 \to 0.5} = \mathbf{w}_0^{(3)} + \mathbf{W}^{(3)} \mathbf{h}^{(2)}$$
outputs the model’s prediction using the $N_\tau \times N_\tau$ weight matrix $\mathbf{W}^{(3)}$ and the $N_\tau \times 1$ vector of constants $\mathbf{w}_0^{(3)}$.
We train the model by minimizing the L2 norm of the difference between the predicted and true vector of quantile values given by
$$L = \frac{1}{24 D} \sum_d \sum_h \left\lVert \mathbf{q}_{d,h}^{3 \to 0.5} - \hat{\mathbf{q}}_{d,h}^{3 \to 0.5} \right\rVert_2,$$
where $D$ is the number of days in the training set. We train the model for 50 epochs with a batch size of 32 using the Adam optimizer [17] at standard settings in Keras 2.2.4 [18]. At test time we sort the predictions of the model to ensure monotonically increasing quantile values.
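The layer structure described above can be traced with a dependency-free forward pass. This is only a sketch of the inference step, not the Keras model used in the study: the parameter layout in the `params` dictionary, the helper names, and the all-zero initialisation are hypothetical, and training is not shown.

```python
import math


def elu(z):
    """ELU activation: identity for z >= 0, exp(z) - 1 otherwise."""
    return z if z >= 0 else math.exp(z) - 1.0


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def forward(q_prev, q_same, q_next, extra, params):
    """Forward pass for one product.

    q_prev, q_same, q_next: N_tau rows with 4 quantile values each.
    extra: the 9 exogenous values followed by the 24 hour dummies.
    """
    # Locally connected layer: one small linear model per quantile.
    h1 = []
    for tau, (bias, w_prev, w_same, w_next) in enumerate(params["local"]):
        z = (bias + dot(w_prev, q_prev[tau]) + dot(w_same, q_same[tau])
             + dot(w_next, q_next[tau]))
        h1.append(elu(z))
    # Fully connected hidden layer on the concatenated inputs.
    inputs = h1 + list(extra)
    h2 = [elu(b + dot(row, inputs)) for b, row in zip(params["b2"], params["W2"])]
    # Linear output layer, then sorting to enforce non-decreasing quantile values.
    out = [b + dot(row, h2) for b, row in zip(params["b3"], params["W3"])]
    return sorted(out)


def zero_params(n_tau, n_extra=33):
    """Hypothetical parameter container with all weights set to zero."""
    return {
        "local": [(0.0, [0.0] * 4, [0.0] * 4, [0.0] * 4) for _ in range(n_tau)],
        "b2": [0.0] * n_tau,
        "W2": [[0.0] * (n_tau + n_extra) for _ in range(n_tau)],
        "b3": [0.0] * n_tau,
        "W3": [[0.0] * n_tau for _ in range(n_tau)],
    }
```

Because each quantile in the first layer only sees its own three time series, the number of weights in that layer grows linearly in $N_\tau$, which is the parameter-saving idea behind the locally connected design.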
For both the linear regression models and the neural network model we chose to use one model for all hours of the day. Fitting a separate model for each hour would result in much smaller training sets. However, if the market behaves fundamentally differently for different hour products, it might be insufficient to account for these differences by simply introducing dummy variables. We also did not transform the data to stabilize the variance, e.g., by applying the asinh transformation which has been shown to work well for electricity price forecasting tasks [19]. Studying the effectiveness of different modeling strategies, variance stabilizing transformations, or robust loss functions like the absolute loss or the Huber loss [20] for intraday forecasting is outside the scope of this paper but is an interesting avenue for further research.

3.3. Naive Benchmark Models

Narajewski & Ziel [11] showed empirically that a strong benchmark for short-term forecasts of the ID3 price is the volume-weighted average price of the last 15 min before forecasting. Based on their findings we test five naive benchmark models of similar type. Let us note the authors of [11] use information up to 3.25 h before delivery while we use information up to 3.0 h before delivery for both the naive and statistical models.
The Naive1 model uses the quantile values of the full trading period till 3 h before delivery and is given by
$$\hat{\mathbf{q}}_{d,h}^{3 \to 0.5} = \mathbf{q}_{d,h}^{32 \to 3}.$$
The Naive2 model uses the quantile values of the last 15 min before forecasting, i.e.,
$$\hat{\mathbf{q}}_{d,h}^{3 \to 0.5} = \mathbf{q}_{d,h}^{3.25 \to 3}.$$
This type of naive model performed best in [11] which suggests that the latest market results already reflect the information available at forecasting time. Hence, we expect that this model’s forecasts will perform best at least for the central quantiles that are closely related to the ID3.
As the dispersion of the traded prices till three hours before delivery is usually significantly lower than in the last three hours, we consider three more models that scale the variance of the distribution but are centered at the value for the 0.5 quantile $q_{d,h,0.5}^{3.25 \to 3}$. This is motivated by the expectation that the distribution right before we issue the forecast is a good estimator for the median but not for the variance of the target distribution.
The Naive3 model shifts the distribution of the last finished product by centering it at $q_{d,h,0.5}^{3.25 \to 3}$
$$\hat{\mathbf{q}}_{d,h}^{3 \to 0.5} = q_{d,h,0.5}^{3.25 \to 3} + \left(\mathbf{q}_{d,h-3}^{3 \to 0.5} - q_{d,h-3,0.5}^{3 \to 0.5}\right).$$
The Naive4 model shifts the distribution of the same hour from the day before in a similar way and is given by
$$\hat{\mathbf{q}}_{d,h}^{3 \to 0.5} = q_{d,h,0.5}^{3.25 \to 3} + \left(\mathbf{q}_{d-1,h}^{3 \to 0.5} - q_{d-1,h,0.5}^{3 \to 0.5}\right).$$
Finally, the Naive5 model shifts the average distribution of the hour product in the entire training set and is defined as
$$\hat{\mathbf{q}}_{d,h}^{3 \to 0.5} = q_{d,h,0.5}^{3.25 \to 3} + \left(\bar{\mathbf{q}}_{h}^{3 \to 0.5} - \bar{q}_{h,0.5}^{3 \to 0.5}\right),$$
where $\bar{\mathbf{q}}_{h}^{3 \to 0.5}$ denotes the vector of average quantile values for the hour $h$ in the training set.
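The three shifted benchmarks share one operation: a reference quantile vector is re-centered at the most recent median. A minimal sketch of that operation follows. The function name `recenter` is hypothetical, and the sketch assumes a symmetric quantile grid of odd length so that the middle entry is the 0.5 quantile (true for the grid 0, 0.01, ..., 1).

```python
def recenter(latest_median, reference):
    """Shift a reference quantile vector so that its median equals latest_median.

    reference can be the vector of the last finished product (Naive3), of the
    same hour on the previous day (Naive4), or the training-set average for
    the hour (Naive5).
    """
    reference_median = reference[len(reference) // 2]
    return [latest_median + (q - reference_median) for q in reference]


# Hypothetical five-point quantile vector and a latest median of 33 EUR/MWh.
reference = [10.0, 20.0, 30.0, 40.0, 50.0]
forecast = recenter(33.0, reference)
```

The forecast keeps the spread of the reference distribution and only moves its location, which matches the motivation that the latest trades inform the median but not the variance.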

4. Forecasting Study

4.1. Forecasting Strategy

For the empirical forecasting study we consider the entire data set from 1 July 2017 till 31 March 2019 with the initial training, validation, and test split shown in Figure 3. We use the first six months of data from 1 July 2017 till 31 December 2017 as initial training set to forecast the quantiles for all hours of the following day. We then shift the training set by one day, refit all models, and again forecast the following day. We use the first three months of 2018 as a validation set to fix the values for $\lambda_\tau$ considering values on an exponential grid given by $\{\lambda_i = 2^i \mid i \in \{-15, -14, \dots, 0\}\}$. The value of $\lambda_\tau$ for each $\tau$ is determined by the lowest mean absolute error. The 12 months between April 2018 and March 2019 form the test set. In cases where there was no trading between two time steps for a product, we reuse the quantile values from the preceding 15 min time interval. If there was no trading in any preceding periods, we set all quantile values to the day-ahead auction price. To account for the numerical instability of the neural network’s predictions resulting from the random weight initialization and the non-convex loss function, we train an ensemble of 5 models and average their predictions.
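The rolling scheme and the gap-filling rule can be sketched with the standard library. All names below are hypothetical. The regularization grid assumes that the exponents run from -15 to 0, and the window length is assumed to stay fixed in calendar days while the window moves forward.

```python
from datetime import date, timedelta

# Exponential grid for the regularization strength (assumed exponents -15..0).
lambda_grid = [2.0 ** i for i in range(-15, 1)]


def rolling_windows(first_train_day, first_forecast_day, last_forecast_day):
    """Yield (train_start, train_end, forecast_day), shifted by one day per step."""
    window_length = first_forecast_day - first_train_day
    day = first_forecast_day
    while day <= last_forecast_day:
        yield day - window_length, day - timedelta(days=1), day
        day += timedelta(days=1)


def fill_missing(intervals, day_ahead_price, n_quantiles=101):
    """Replace intervals without trades (None) in a chronological list.

    An empty interval reuses the preceding quantile vector; if nothing has
    been traded yet, all quantile values are set to the day-ahead price.
    """
    filled, last = [], None
    for quantiles in intervals:
        if quantiles is None:
            quantiles = last if last is not None else [day_ahead_price] * n_quantiles
        filled.append(quantiles)
        last = quantiles
    return filled


windows = list(rolling_windows(date(2017, 7, 1), date(2018, 1, 1), date(2018, 1, 3)))
```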

4.2. Evaluation

We use two measures to evaluate the accuracy of the predictions for the entire distribution, the Wasserstein distance (WD) and integrated quadratic distance (QD). These distances provide an intuitive way to measure the difference between two empirical distributions in a non-parametric way. For two univariate distributions $P$ and $S$ with cumulative distribution functions (CDF) $F$ and $G$ the WD is defined as $WD(P, S) = \int_{-\infty}^{+\infty} |F(x) - G(x)| \, dx$ and the QD is defined as $QD(P, S) = \int_{-\infty}^{+\infty} (F(x) - G(x))^2 \, dx$. Hence, we compute the errors by
$$w_{d,h} = \int_{-\infty}^{+\infty} \left| F_{d,h}^{3 \to 0.5}(p) - \hat{F}_{d,h}^{3 \to 0.5}(p) \right| dp$$
and
$$e_{d,h} = \int_{-\infty}^{+\infty} \left( F_{d,h}^{3 \to 0.5}(p) - \hat{F}_{d,h}^{3 \to 0.5}(p) \right)^2 dp,$$
respectively, where $\hat{F}_{d,h}^{3 \to 0.5}(p)$ is described by the predicted vector of quantile values $\hat{\mathbf{q}}_{d,h}^{3 \to 0.5}$ and $F_{d,h}^{3 \to 0.5}(p)$ is the true VWECDF.
To measure the overall forecast accuracy we compute the mean Wasserstein distance (MWD)
$$MWD = \frac{1}{24 D} \sum_d \sum_h w_{d,h}$$
and mean integrated quadratic distance (MQD)
$$MQD = \frac{1}{24 D} \sum_d \sum_h e_{d,h},$$
where $D$ is the number of days in the test set.
To investigate the difference in forecasting accuracy for different quantiles we compute the mean absolute error (MAE) and root mean squared error (RMSE) values for each quantile separately
$$MAE_\tau = \frac{1}{24 D} \sum_d \sum_h \left| q_{d,h,\tau}^{3 \to 0.5} - \hat{q}_{d,h,\tau}^{3 \to 0.5} \right|,$$
$$RMSE_\tau = \sqrt{\frac{1}{24 D} \sum_d \sum_h \left( q_{d,h,\tau}^{3 \to 0.5} - \hat{q}_{d,h,\tau}^{3 \to 0.5} \right)^2}.$$
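Both distances are integrals over the difference of two CDFs and can be evaluated exactly when the CDFs are step functions. The sketch below makes a simplifying assumption that is not taken from the text: each distribution is represented by an equally weighted sample (for example a vector of quantile values on a uniform grid), so the CDF jumps by the same amount at every sample point. The function name `cdf_distances` is hypothetical.

```python
def cdf_distances(sample_a, sample_b):
    """Return (WD, QD) between the step CDFs of two equally weighted samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    grid = sorted(set(a) | set(b))
    wd = qd = 0.0
    ia = ib = 0
    for left, right in zip(grid, grid[1:]):
        # Advance the counters so that ia / len(a) is the CDF value on [left, right).
        while ia < len(a) and a[ia] <= left:
            ia += 1
        while ib < len(b) and b[ib] <= left:
            ib += 1
        diff = ia / len(a) - ib / len(b)
        width = right - left
        wd += abs(diff) * width
        qd += diff * diff * width
    return wd, qd


# Two hypothetical two-point distributions, one shifted by 1 EUR/MWh.
wd, qd = cdf_distances([1.0, 3.0], [2.0, 4.0])
```

Outside the range covered by the two samples both CDFs are equal (0 or 1), so restricting the integration to the joint grid loses nothing.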
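The per-quantile error measures reduce to column-wise averages over all forecasted products. A minimal sketch, with a hypothetical function name and toy numbers, could look as follows; `actual` and `predicted` are assumed to hold one quantile vector per product (day and hour).

```python
import math


def per_quantile_errors(actual, predicted):
    """Return (mae, rmse), each a list with one value per quantile."""
    n_products = len(actual)
    n_quantiles = len(actual[0])
    mae, rmse = [], []
    for t in range(n_quantiles):
        errors = [a[t] - p[t] for a, p in zip(actual, predicted)]
        mae.append(sum(abs(e) for e in errors) / n_products)
        rmse.append(math.sqrt(sum(e * e for e in errors) / n_products))
    return mae, rmse


# Two products with two quantiles each (toy values).
mae, rmse = per_quantile_errors([[1.0, 2.0], [3.0, 4.0]], [[1.0, 4.0], [3.0, 0.0]])
```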
Since the values of the error measures alone do not allow for a statistically sound conclusion on the outperformance of forecast A by forecast B, we employ the Diebold-Mariano (DM) test [21] in the modified version proposed by Harvey et al. [22] as implemented in the R forecast package [23]. The DM test examines the statistical significance of the difference of the residual time series of two models. We compute the multivariate version of the test as proposed in [24], i.e., we obtain one error for each day by computing a norm for the vector of residuals for the day d.
We consider two variants of the test as we expect a difference in forecasting ability for different quantiles. In the first variant we compute the L1 norm of the WDs and the L2 norm of the QDs over one day
$$\lVert \omega_d \rVert_1 = \sum_h |w_{d,h}|,$$
$$\lVert \epsilon_d \rVert_2 = \left( \sum_h e_{d,h} \right)^{1/2}.$$
Then the loss differential for two forecasts $A$ and $B$ is given by $\Delta_d(A, B) = \lVert \omega_d(A) \rVert_1 - \lVert \omega_d(B) \rVert_1$ and $\Delta_d(A, B) = \lVert \epsilon_d(A) \rVert_2 - \lVert \epsilon_d(B) \rVert_2$, respectively. For both error measures and all model combinations we conduct a pair of two one-sided DM tests and report the p-values for the hypotheses $H_1: E(\Delta_d(A, B)) \le 0$ and $H_1: E(\Delta_d(A, B)) \ge 0$, respectively.
In the second variant we compare the $L_1$ and $L_2$ norms of the errors for only a single quantile $\tau$, i.e., we compute
$$\lVert \gamma_{d,\tau} \rVert_i = \left( \sum_h \left| q_{d,h,\tau}^{3 \to 0.5} - \hat{q}_{d,h,\tau}^{3 \to 0.5} \right|^i \right)^{1/i}$$
with $i \in \{1, 2\}$ and obtain one loss differential per quantile $\Delta_{d,\tau}(A, B) = \lVert \gamma_{d,\tau}(A) \rVert_i - \lVert \gamma_{d,\tau}(B) \rVert_i$. We again conduct a pair of two one-sided DM tests for both measures for the loss differential of the quantile forecasts, i.e., we report the p-values for the hypotheses $H_1: E(\Delta_{d,\tau}(A, B)) \le 0$ and $H_1: E(\Delta_{d,\tau}(A, B)) \ge 0$ for all models and all quantiles.
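The mechanics of a one-sided test on daily loss differentials can be sketched as below. This is a simplified stand-in and not the procedure used in the study: it computes the basic DM statistic for one-step-ahead forecasts with a normal approximation, whereas the text relies on the Harvey et al. modification as implemented in the R forecast package. The function names and the toy losses are hypothetical.

```python
import math
from statistics import NormalDist, mean


def daily_norm(errors_by_hour, order):
    """Norm of order 1 or 2 over the hourly errors of one day."""
    return sum(abs(e) ** order for e in errors_by_hour) ** (1.0 / order)


def dm_test(loss_a, loss_b):
    """Basic one-sided DM test on daily losses of forecasts A and B.

    Returns (statistic, p_value). A small p-value is evidence that A has the
    lower expected loss. Assumes forecast horizon 1 and a normal approximation.
    """
    diffs = [a - b for a, b in zip(loss_a, loss_b)]
    n = len(diffs)
    d_bar = mean(diffs)
    variance = sum((d - d_bar) ** 2 for d in diffs) / n
    statistic = d_bar / math.sqrt(variance / n)
    return statistic, NormalDist().cdf(statistic)


# Toy daily losses: forecast A is consistently more accurate than forecast B.
loss_a = [1.0, 1.2, 0.9, 1.1]
loss_b = [2.0, 2.1, 2.2, 1.9]
stat_ab, p_ab = dm_test(loss_a, loss_b)
stat_ba, p_ba = dm_test(loss_b, loss_a)
```

Running the test in both directions mirrors the pair of one-sided tests described above: here `p_ab` is close to zero while `p_ba` is close to one.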

5. Results

We present the MWD and MQD values in Table 1 and the corresponding DM test p-values in Figure 4a,b. All statistical models except AR1 show lower MWD and MQD values than the best benchmark model Naive5. The differences in accuracy between the forecasts of AR1, ARX1, and Naive5 are not significant. The forecasts of the ARXfull model give the best results in terms of both measures. The improvements of the ARXfull forecasts in terms of MWD and MQD compared to the best benchmark model Naive5 are only about 2% and 3.5%. In terms of MWD, the improvement in accuracy of the ARXfull forecasts is significant compared to all models. Considering MQD, the improvement in accuracy of the ARXfull forecasts is significant compared to all models except the NN. There is no statistically significant difference in the accuracy of the forecasts from the AR2, ARX2, and NN models. Furthermore, we cannot report a significant difference in accuracy between the forecasts from AR1 and ARX1 or between the forecasts from AR2 and ARX2.
These observations lead to several conclusions. Incorporating exogenous variables does not improve the forecasting performance while considering time series information from the neighboring products leads to a significant improvement. Furthermore, the inclusion of information from other quantiles in combination with automated variable selection using Lasso also improves the forecasting performance significantly.
Table 2 and Table 3 present the values for $MAE_\tau$ and $RMSE_\tau$ for selected quantiles. This allows a more detailed analysis regarding the forecasting accuracy for different regions of the distribution. The forecasts from the ARXfull model show the lowest $MAE_\tau$ values for all quantiles except Q90 and Q100. For these quantiles the NN forecasts are more accurate. The NN forecasts show larger errors for the central quantiles than the predictions of the much simpler AR2 and ARX2 models. However, the accuracy of the NN forecasts is better for the more extreme quantiles. This could be explained by non-linear effects that cannot be modeled by the linear regression approach. The findings are similar for the RMSE. The ARXfull model shows the best performance for the central quantiles while the NN model shows slightly better performance for the tails. In general, errors are larger for the tails of the distribution across all models, especially for the minimum and maximum values.
Figure 5a,b show the relative improvement in $MAE_\tau$ and $RMSE_\tau$ for the statistical models compared to the forecasts of the best performing benchmark model Naive5. In terms of $MAE_\tau$ only the ARXfull forecasts show a small relative improvement for the central quantiles of roughly 0.4%. The relative gains in accuracy are larger for the tails for all models, e.g., the ARXfull forecasts show a relative improvement of 3% and 2.4% for Q10 and Q90, respectively.
Figure 6a,b show the p-values of the DM test for the loss differential per quantile for the ARXfull forecasts against all other models’ forecasts for the L1 and L2 norm, respectively. As can be seen from Figure 6a, we cannot conclude a significantly improved forecasting accuracy in comparison to the forecasts of Naive2 to Naive5 in terms of MAE for the central quantiles. However, the accuracy is significantly better for the tails of the distribution. Considering the RMSE, the improvement over the benchmarks is significant for all quantiles. These results suggest that it is possible to forecast the short-term volatility of the intraday market which is reflected in the tails of the volume-weighted price distribution. At the same time, we cannot report a definite improvement over the naive models for the central quantiles considering the inconsistent results for MAE and RMSE.

6. Conclusions

We analyzed the German continuous intraday electricity market, focusing on hour products and the last three hours before delivery. We proposed to non-parametrically approximate the empirical volume-weighted price distribution by a dense grid of discrete quantiles. This admits a much richer representation of the market behavior than analyzing only volume-weighted average prices. To forecast the quantile values of this distribution, we constructed a set of simple linear regression models that use different subsets of the available inputs. Furthermore, we used two more advanced models that utilize all available regressors: a Lasso regularized linear regression model and an ensemble of multi-output neural networks. We found that including exogenous variables did not improve the accuracy, while considering time series information from neighboring products and quantiles did. We compared the forecasts of the proposed models with several simple but well-designed benchmarks. The best performing model turned out to be the Lasso regularized linear regression model. We also studied the forecasting accuracy for different quantiles of the price distribution. Compared to the naive benchmarks, the gains in forecasting performance were small and not significant for the central quantiles of the target distribution. However, the gains in accuracy for the tails of the distribution were larger and significant. Hence, we find evidence that the German intraday market works efficiently, while also showing that it is possible to forecast the variance of short-term intraday prices.
There are several avenues for future work. It would be interesting to see whether similar findings hold for quarter-hour products, which we excluded in this work. It is also worth investigating whether information from quarter-hour products could help to improve the forecast accuracy for the hour products and vice versa. We chose to model the price distribution in a non-parametric way, which allows a larger degree of flexibility. However, modeling the price distribution in a parametric way is straightforward and worth exploring. Furthermore, we focused solely on prices, while an estimate of the expected traded volume as a measure of short-term market liquidity would also be of interest in practice. Finally, future work should explore how to exploit forecasts for the distribution of prices and volumes for short-term trading and risk management.

Author Contributions

Conceptualization, T.J. and F.S.; methodology, T.J.; software, T.J.; validation, T.J.; formal analysis, T.J.; investigation, T.J.; data curation, T.J.; writing—original draft preparation, T.J.; writing—review and editing, T.J. and F.S.; visualization, T.J.; supervision, F.S.

Funding

This research was partially funded by the TU Darmstadt Pioneer Fund.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
CDF: cumulative density function
DM: Diebold-Mariano
MAE: mean absolute error
MQD: mean integrated quadratic distance
MWD: mean Wasserstein distance
QD: integrated quadratic distance
RMSE: root mean squared error
VWECDF: volume-weighted empirical cumulative density function
WD: Wasserstein distance

References

  1. Ocker, F.; Ehrhart, K.M. The “German Paradox” in the balancing power markets. Renew. Sustain. Energy Rev. 2017, 67, 892–898.
  2. Koch, C.; Hirth, L. Short-term electricity trading for system balancing: An empirical analysis of the role of intraday trading in balancing Germany’s electricity system. Renew. Sustain. Energy Rev. 2019, 113, 109275.
  3. EPEXSpot. Traded Volumes Soar to an All Time High in 2018. 2018. Available online: https://www.epexspot.com/en/press-media/press/details/press/Traded_volumes_soar_to_an_all-time_high_in_2018 (accessed on 4 November 2019).
  4. EPEXSpot. Description of Epex Spot Market Indices (May 2019). 2019. Available online: https://www.epexspot.com/document/39669/EPEX%20SPOT%20Indices (accessed on 4 November 2019).
  5. Viehmann, J. State of the German Short-Term Power Market. Z. Für Energiewirtschaft 2017, 41, 87–103.
  6. Weron, R. Electricity price forecasting: A review of the state-of-the-art with a look into the future. Int. J. Forecast. 2014, 30, 1030–1081.
  7. Andrade, J.; Filipe, J.; Reis, M.; Bessa, R. Probabilistic price forecasting for day-ahead and intraday markets: Beyond the statistical model. Sustainability 2017, 9, 1990.
  8. Monteiro, C.; Ramirez-Rosado, I.; Fernandez-Jimenez, L.; Conde, P. Short-term price forecasting models based on artificial neural networks for intraday sessions in the Iberian electricity market. Energies 2016, 9, 721.
  9. Maciejowska, K.; Nitka, W.; Weron, T. Day-Ahead vs. Intraday—Forecasting the Price Spread to Maximize Economic Benefits. Energies 2019, 12, 631.
  10. Uniejewski, B.; Marcjasz, G.; Weron, R. Understanding intraday electricity markets: Variable selection and very short-term price forecasting using LASSO. Int. J. Forecast. 2019, 35, 1533–1547.
  11. Narajewski, M.; Ziel, F. Econometric modelling and forecasting of intraday electricity prices. arXiv 2018, arXiv:1812.09081.
  12. EPEXSpot. Available online: www.epexspot.com (accessed on 4 November 2019).
  13. ENTSO-E. Transparency Platform. Available online: transparency.entsoe.eu (accessed on 4 November 2019).
  14. EPEXSpot. Epex Spot Market Rules. 2019. Available online: http://www.epexspot.com/de/extras/download-center (accessed on 4 November 2019).
  15. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B (Methodol.) 1996, 58, 267–288.
  16. Clevert, D.A.; Unterthiner, T.; Hochreiter, S. Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs). arXiv 2015, arXiv:1511.07289.
  17. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980.
  18. Chollet, F. Keras. 2015. Available online: https://keras.io (accessed on 4 November 2019).
  19. Uniejewski, B.; Weron, R.; Ziel, F. Variance Stabilizing Transformations for Electricity Spot Price Forecasting. IEEE Trans. Power Syst. 2018, 33, 2219–2229.
  20. Huber, P.J. Robust Estimation of a Location Parameter. Ann. Math. Statist. 1964, 35, 73–101.
  21. Diebold, F.; Mariano, R. Comparing Predictive Accuracy. J. Bus. Econ. Stat. 1995, 13, 253–263.
  22. Harvey, D.; Leybourne, S.; Newbold, P. Testing the equality of prediction mean squared errors. Int. J. Forecast. 1997, 13, 281–291.
  23. Hyndman, R.J.; Khandakar, Y. Automatic time series forecasting: The forecast package for R. J. Stat. Softw. 2008, 26, 1–22.
  24. Ziel, F.; Weron, R. Day-ahead electricity price forecasting with high-dimensional structures: Univariate vs. multivariate modeling frameworks. Energy Econ. 2018, 70, 396–420.
Figure 1. (a) The figure shows all trades that were executed from 5 to 0 h before delivery for the hour product h = 21 on d = 10.8.2018 in gray circles. Circles are scaled according to trade volume. The shaded red regions show values for selected quantiles in the 15 min time intervals. (b) The red function is the inverse of the volume-weighted empirical cumulative distribution function $F_{d,h}^{5\,0}(p)$ for the last 5 h of trading. The black markers depict selected quantiles, e.g., the black square represents $q_{d,h,0.5}^{5\,0}$.
Figure 2. Visualization of the neural network model. The first layer is a locally connected layer that operates only on the time series data and learns a distinct set of weights per quantile. The layer’s output is concatenated with the vector of exogenous variables x and passed through a fully connected layer.
Figure 3. Selected quantiles of the intraday price distribution from 3 h to 0.5 h before delivery for the entire data set. The initial training set contains six months of data from June 2017 to December 2017. The first three months of 2018 are used to fix the hyperparameters for the elastic net models. The test set contains 12 months of data from April 2018 to March 2019. We refit all models each day using a rolling window scheme.
Figure 4. p-values of the multivariate DM test for (a) the daily L1 norm of the WD and (b) the daily L2 norm of the QD. p-values close to zero (dark green) indicate that the forecasts from the model on the x-axis are significantly more accurate than the forecasts from the model on the y-axis.
Figure 5. Relative difference in (a) MAE and (b) RMSE values per quantile of the statistical models compared to Naive5. Values smaller than zero indicate smaller errors than Naive5.
Figure 6. The figure shows the p-values for the multivariate DM tests per quantile for (a) the daily L1 norm and (b) the daily L2 norm for the forecasts of the ARXfull model compared against the forecasts of all other models. Dark green cells indicate that the forecasts of the ARXfull model are significantly more accurate than the forecasts from the model on the y-axis for the quantile given on the x-axis.
Table 1. MWD and MQD values for the test set in EUR/MWh.
| | Naive1 | Naive2 | Naive3 | Naive4 | Naive5 | AR1 | ARX1 | AR2 | ARX2 | ARXfull | NN |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MWD | 4.147 | 3.912 | 4.008 | 4.091 | 3.803 | 3.789 | 3.801 | 3.751 | 3.747 | 3.721 | 3.763 |
| MQD | 1.938 | 2.041 | 1.535 | 1.585 | 1.414 | 1.418 | 1.409 | 1.395 | 1.375 | 1.362 | 1.371 |
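Table 2. Test set mean MAE values for selected quantiles in EUR/MWh.
| | Naive1 | Naive2 | Naive3 | Naive4 | Naive5 | AR1 | ARX1 | AR2 | ARX2 | ARXfull | NN |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Q0 | 7.52 | 8.96 | 8.23 | 8.45 | 7.11 | 6.99 | 6.96 | 6.93 | 6.91 | 6.79 | 6.88 |
| Q10 | 4.62 | 4.61 | 4.65 | 4.70 | 4.23 | 4.21 | 4.21 | 4.15 | 4.14 | 4.11 | 4.18 |
| Q20 | 4.10 | 3.85 | 3.97 | 4.00 | 3.71 | 3.71 | 3.72 | 3.65 | 3.65 | 3.62 | 3.69 |
| Q30 | 3.80 | 3.43 | 3.52 | 3.53 | 3.38 | 3.40 | 3.40 | 3.35 | 3.34 | 3.33 | 3.39 |
| Q40 | 3.70 | 3.26 | 3.30 | 3.31 | 3.24 | 3.28 | 3.29 | 3.24 | 3.23 | 3.21 | 3.28 |
| Q50 | 3.68 | 3.21 | 3.21 | 3.21 | 3.21 | 3.25 | 3.25 | 3.21 | 3.21 | 3.19 | 3.25 |
| Q60 | 3.69 | 3.22 | 3.27 | 3.30 | 3.24 | 3.26 | 3.27 | 3.23 | 3.22 | 3.21 | 3.25 |
| Q70 | 3.75 | 3.34 | 3.45 | 3.55 | 3.37 | 3.35 | 3.37 | 3.33 | 3.33 | 3.31 | 3.34 |
| Q80 | 3.99 | 3.69 | 3.88 | 4.02 | 3.70 | 3.65 | 3.67 | 3.63 | 3.62 | 3.61 | 3.62 |
| Q90 | 4.50 | 4.43 | 4.60 | 4.82 | 4.31 | 4.24 | 4.27 | 4.24 | 4.24 | 4.21 | 4.18 |
| Q100 | 7.38 | 8.49 | 8.18 | 8.66 | 7.36 | 7.16 | 7.16 | 7.15 | 7.13 | 7.00 | 6.88 |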
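Table 3. Test set mean RMSE values for selected quantiles in EUR/MWh.
| | Naive1 | Naive2 | Naive3 | Naive4 | Naive5 | AR1 | ARX1 | AR2 | ARX2 | ARXfull | NN |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Q0 | 14.19 | 15.64 | 17.01 | 17.10 | 13.07 | 12.95 | 12.93 | 12.92 | 12.90 | 12.81 | 12.80 |
| Q10 | 7.10 | 7.20 | 6.88 | 7.05 | 6.22 | 6.23 | 6.20 | 6.14 | 6.10 | 6.04 | 6.04 |
| Q20 | 6.30 | 6.05 | 5.95 | 6.03 | 5.54 | 5.57 | 5.56 | 5.48 | 5.46 | 5.42 | 5.44 |
| Q30 | 5.88 | 5.36 | 5.36 | 5.41 | 5.17 | 5.17 | 5.17 | 5.11 | 5.10 | 5.08 | 5.11 |
| Q40 | 5.77 | 5.10 | 5.09 | 5.12 | 5.03 | 5.06 | 5.06 | 4.99 | 4.99 | 4.97 | 5.00 |
| Q50 | 5.79 | 5.04 | 5.04 | 5.04 | 5.04 | 5.06 | 5.06 | 5.00 | 5.00 | 4.98 | 5.02 |
| Q60 | 5.88 | 5.15 | 5.19 | 5.21 | 5.13 | 5.15 | 5.15 | 5.09 | 5.08 | 5.07 | 5.10 |
| Q70 | 6.19 | 5.60 | 5.82 | 5.88 | 5.51 | 5.50 | 5.50 | 5.46 | 5.45 | 5.43 | 5.44 |
| Q80 | 6.88 | 6.56 | 6.91 | 7.09 | 6.27 | 6.23 | 6.23 | 6.20 | 6.18 | 6.16 | 6.15 |
| Q90 | 8.53 | 8.60 | 9.21 | 9.58 | 7.98 | 7.93 | 7.92 | 7.91 | 7.87 | 7.84 | 7.79 |
| Q100 | 19.73 | 20.41 | 24.68 | 25.38 | 18.81 | 18.72 | 18.65 | 18.71 | 18.61 | 18.56 | 18.41 |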

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).