Probabilistic forecasting of German electricity imbalance prices

The exponential growth of renewable energy capacity has brought much uncertainty to electricity prices and to electricity generation. To address this challenge, the energy exchanges have been developing further trading possibilities, especially the intraday and balancing markets. For an energy trader participating in both markets, the forecasting of imbalance prices is of particular interest. Therefore, in this manuscript we conduct very short-term probabilistic forecasting of imbalance prices, contributing to the scarce literature on this novel subject. The forecasting is performed 30 minutes before the delivery, so that the trader might still choose the trading place. The distribution of the imbalance prices is modelled and forecasted using methods well-known in the electricity price forecasting literature: lasso with bootstrap, gamlss, and probabilistic neural networks. The methods are compared with a naive benchmark in a rolling window study. The results provide evidence of the efficiency between the intraday and balancing markets, as the sophisticated methods do not substantially outperform the intraday continuous price index. On the other hand, they significantly improve the empirical coverage. The analysis was conducted on the German market; however, it could be easily applied to any other market of similar structure.

Since the liberalization of electricity markets, the market design has undergone constant development. Currently, it consists of three parts: the futures, spot and balancing markets. The futures market allows the market participants to trade electricity over a longer horizon. The spot market consists of the day-ahead and intraday parts, and it is the main electricity market. Here, the market players can trade from one day to a few minutes prior to the physical delivery. The balancing market, however, is of no less importance as it preserves the system stability. The futures and spot markets are often run by big energy exchanges, e.g. the European Energy Exchange (EEX) or Nord Pool, whereas the balancing market is still run locally by the Transmission System Operators (TSOs). Thus, the design of the former is rather unified, while the design of the latter may differ depending on the control zone. In particular, we can distinguish the single and two price imbalance settlement methods. In the study, we consider the German market data, and therefore we focus on the single price design.
Large deviations from the nominal electric grid frequency may lead to disconnections or even blackouts. Thus, the need for electricity balancing is beyond dispute, and it only gains in importance with the growth of renewable energy capacity, even though the introduction of intraday continuous trading and quarter-hourly products has reduced the need for short-term balancing reserves [1,2]. The German balancing market comprises the capacity and energy markets [3]. The capacity market takes place on the day before the physical delivery period, and the traders declare there their balancing capacity for a given price. Then, the balancing service providers (BSPs) that offer the cheapest capacity are accepted and may participate in the balancing energy market. A detailed description of the market is presented in Section 2.
This paper raises the novel issue of very short-term probabilistic forecasting of German electricity imbalance prices. We apply methods well-known in electricity price forecasting (EPF) in order to model and predict the distribution of imbalance prices 30 minutes before the delivery. The motivation for such a setting is the possibility to trade the energy either in the intraday continuous market in the respective control zones (after gate closure, until 5 minutes before the delivery) or in the balancing market. Having precise probabilistic imbalance price forecasts and access to the intraday continuous limit order book, the market participant may choose between these two to maximize their profit. The utilized modelling methods are: lasso with bootstrapped in-sample errors, gamlss with lasso-based variable selection, and probabilistic neural networks. For gamlss and neural networks we assume two distributions: normal and Student's t. The models are compared against a naive benchmark, the EPEX ID 1 price, in a rolling window study, which is in line with the existing EPF literature. The models are presented in detail in Section 3 and the application study in Section 4.
The electricity balancing markets have already drawn the researchers' attention. The balancing market design was studied by van der Veen et al. [4], van der Veen and Hakvoort [5], and Poplavskaya et al. [6]. The authors additionally analyse the impact of the imbalance pricing mechanism on market behaviour, and they conclude that although the system imbalance is similar for different mechanisms, the mechanism that minimizes the imbalance costs for the market is the single price settlement. The literature on modelling and forecasting in electricity balancing markets can be split into imbalance forecasting [7-10], imbalance price forecasting [11-13] and the application in trading [8-10, 14, 15]. The scarce electricity imbalance price forecasting literature focuses on point forecasting [11,12], interval forecasting [11] and probabilistic forecasting [13]. The work of Dumas et al. [13] is naturally the closest one to our study. The authors utilize a two-step approach: they first calculate the probabilities for the net imbalance and then, based on that, make predictions regarding the imbalance prices. In contrast, we forecast the imbalance prices directly and do not make any prior assumptions.
The research on EPF is much wider than the one particularly focused on balancing markets. Weron [16] provides a review of point forecasting methods and Nowotarski and Weron [17] present an overview of probabilistic forecasting methods in electricity markets.
2. The imbalance market is inevitable for any market player, and thus this paper may contribute also to electricity trading literature.
3. Various probabilistic models are compared in an exhaustive forecasting study.
4. We contribute to the scarce electricity balancing literature by drawing researchers' attention to the German electricity balancing market.
5. The paper provides evidence of the efficiency between the intraday and balancing markets.
Let us additionally note that the importance of this research is emphasized by the need of including the imbalance market in the electricity trading strategies [35].
The remainder of this manuscript has the following structure. Section 2 describes the electricity balancing market in Germany, the calculation of the imbalance price and the data utilized in the study. The models and estimation methods are discussed in Section 3. Section 4 presents the application study, including the description of the setting, and the empirical results. Finally, Section 5 closes the paper with a conclusion.

Electricity balancing market
This section familiarizes the reader with the German balancing market and describes the calculation of the imbalance price. Additionally, we present the data used for the purpose of this study.

Balancing market in Germany
The balancing market is a crucial part of every electricity market. In Germany, it has been adjusted many times in recent years, unlike the spot market, which is already well developed and undergoes only minor changes. The current timeline of the electricity spot and balancing markets in Germany can be seen in Figure 1. The spot market is presented in the top part and it consists of the Day-Ahead Auction (DA), Intraday Auction (IA) and Intraday Continuous (IC). The DA takes place on the day before the delivery at 12:00 and it is the main part of the market, where the majority of power volume is traded. The IA takes place 3 hours after the DA, at 15:00, and here the market participants can trade quarter-hourly contracts, whereas in the DA one may trade only hourly contracts. This part of the market mainly serves the purpose of balancing the ramping effects of demand and power generation [36,37]; however, Narajewski and Ziel [35] show that a trader could make significant gains by incorporating this market in their trading strategy. The IC is the last part of the spot market, and it starts on the day before the delivery at 15:00 for hourly products and at 16:00 for quarter-hourly products. Here, the market players can trade power continuously until 30 minutes before the delivery in the whole of Germany and until 5 minutes before the delivery in the respective TSO control zones. Also, starting at 22:00 on the previous day until 1 hour before the delivery, the market participants can trade cross-border using the XBID system [38]. The purpose of the IC market is to enable the traders to react to changing generation or consumption forecasts and adjust their positions. Even though the trading window is very long, most of the power volume traded in the IC is traded in the last couple of hours before the delivery [39,40]. Therefore, the most important IC price indicators are the volume-weighted average prices $\text{ID}_1$ and $\text{ID}_3$ [24,25,28], which measure the price level in the last 1 and 3 hours before the delivery, respectively.
The balancing market, and especially the imbalance price, is of particular interest to the spot market participants. As many of the producers and consumers face high uncertainty due to the stochastic nature of weather conditions and people's behaviour, it is basically impossible for them to balance their generation or consumption perfectly. Thus, any deviations from the scheduled generation or consumption are handled by the TSOs during the delivery. The costs of balancing the energy are then divided between the market players, often called balance responsible parties (BRPs), who contributed to the imbalance. On the other hand, the BRPs that deviated from their schedule, but whose deviation reduced the overall system imbalance, are rewarded for this imbalance reduction. Let us note that even though we call the final energy balancing a market, and it is inevitable for any market participant, it is not really a market in which the BRPs can make bids. Instead, they need to accept the imbalance price, which is derived from the total balancing costs and the total system imbalance.
The bottom part of Figure 1 presents the balancing market routine. To avoid big deviations from the nominal frequency in the electricity grid, the TSOs have three types of balancing reserves at their disposal: FCR, aFRR and mFRR. The Frequency Containment Reserve (FCR), also referred to as primary reserve, is fully activated after 30 seconds and is the first response to any occurring imbalance. If the imbalance persists, the Automatic Frequency Restoration Reserve (aFRR), also referred to as secondary reserve, is activated, and in case of longer and deeper imbalances, the Manual Frequency Restoration Reserve (mFRR), also referred to as tertiary reserve, is activated. The full activation time of aFRR and mFRR is 5 and 15 minutes, respectively. The balancing market is divided into capacity and energy markets. In the capacity market, the BSPs offer their readiness to deliver or receive the unscheduled electricity, and in the energy market, they define the costs for a given amount of balancing energy. Let us note that the balancing services are offered in 4-hour positive or negative blocks and that the FCR does not participate in the energy market due to negligible volumes.
The capacity auctions take place on the day before the delivery at 08:00 (FCR), 09:00 (aFRR) and 10:00 (mFRR). Based on the demand from TSOs, the cheapest offers are accepted. The winning BSPs are remunerated with pay-as-cleared (FCR) and pay-as-bid (aFRR and mFRR) mechanisms. Then, until one hour before the 4-hour delivery block, the BSPs can make bids in the energy market. The offers are sorted, creating a merit order list, and in case of imbalance they are activated with a pay-as-bid remuneration mechanism. The costs of balancing energy are carried over to BRPs, whereas the costs of balancing capacity are carried over to end consumers.

Imbalance price
As mentioned, in the German electricity market (but also in many other European markets) the imbalance price is settled using a single price mechanism. The German TSOs have established a Grid Control Cooperation (GCC) and thus the price is unified for all German control zones. The basic formula is as follows:
$$\text{imbalance price} = \frac{\sum \text{balancing costs} - \sum \text{balancing revenues}}{\text{net GCC balance}}. \tag{1}$$
Let us note that the price is in EUR/MWh, and it is calculated separately for each quarter-hour. The balancing costs and revenues of the GCC are derived based on the activated energy from aFRR and mFRR suppliers. Since the numerator and denominator of equation (1) can be both negative and positive, the same applies to the imbalance price. The BRPs that contribute to the imbalance, i.e. are short/long in case of system under/oversupply, pay the price to the TSOs. However, the BRPs that reduce the system imbalance by being short/long in case of system over/undersupply are paid the price by the TSOs.
The price given in equation (1) is not the final imbalance price. Before it reaches its ultimate value, it undergoes multiple modifications. In the following, we list the modifications; however, we do not go into detail, as the formulas are cumbersome and not very illuminating.
1. Price cap in the case of a small GCC balance.
2. Additional price cap in the case of a small GCC balance.
3. Price comparison with the intraday market and setting a minimum price distance to it in such direction that it is less profitable to contribute to the imbalance.
4. Surcharge/discount on the imbalance price in the event of GCC reaching 80% of the positive/negative balancing capacity.
The details of the current and past imbalance price calculation method are available on the regelleistung.net website [41]. The first two modifications are meant to avoid extreme imbalance prices in the case of a small net GCC balance. The third one compares the imbalance price with the Intraday Price Index and sets a minimum distance of 25%, but at least 10 EUR/MWh, between them. This modification pushes the price in such a direction that the BRPs contributing to the imbalance get a worse price than they would have got in the intraday market. The Intraday Price Index is a volume-weighted average price calculated from all the transactions in the intraday continuous market on the hourly and quarter-hourly products on the particular day. The fourth modification is an additional penalty on the BRPs that contribute to the system imbalance in case it reaches very high values.
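To make the third modification more concrete, the following is a minimal sketch under stated assumptions. The function name `apply_min_distance`, the reading of "25%" as 25% of the absolute index level, and the direction rule (raise the price when the system is short, lower it when the system is long) are our illustrative interpretation of the description above, not the official rule published on regelleistung.net.

```python
def apply_min_distance(price, intraday_index, system_short,
                       rel_distance=0.25, abs_distance=10.0):
    """Illustrative only: keep the imbalance price away from the intraday index.

    Assumption: the required gap is 25% of the absolute index level,
    but never less than 10 EUR/MWh.
    """
    gap = max(rel_distance * abs(intraday_index), abs_distance)
    if system_short:
        # Undersupply: a short BRP should pay at least index + gap.
        return max(price, intraday_index + gap)
    # Oversupply: a long BRP should receive at most index - gap.
    return min(price, intraday_index - gap)


# Hypothetical numbers: index at 50 EUR/MWh, system short, raw price 55.
adjusted = apply_min_distance(55.0, 50.0, system_short=True)
```

In this hypothetical case the gap is 12.5 EUR/MWh, so the raw price of 55 is lifted to 62.5; a raw price already further away from the index is left unchanged.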
All the measures make it very unprofitable to contribute to the imbalance, but on the other hand very lucrative for the BRPs to reduce it. We denote the adjusted imbalance price as $IP^{d,qh}$ and refer to it as the imbalance price.
In Figure 3, the range of prices was limited for better clarity of the histograms. The fitted densities indicate that the data is heavy-tailed, as the three-parameter Student's t distribution $t(\mu, \sigma, \tau)$ seems to fit the data much better than the normal distribution $N(\mu, \sigma^2)$. These two distributions are later utilized in the application study. Both belong to the location-scale family, with $\mu$, $\sigma$ and $\tau$ being the location, scale, and tail-weight (degrees of freedom) parameters, respectively.

Data
The data utilized in the study are collected from four different sources: the spot market data (DA, IA and IC transactions and prices) from the EEX transparency, the day-ahead forecast data (load and renewable generation) from the ENTSO-E transparency, the balancing market data (imbalance price, imbalance volume, aFRR and mFRR capacity and energy market data) from regelleistung.net, and the fuel and emission allowance prices from the ICE. The complete dataset contains observations between 12.07.2018 and 31.12.2021, as the aFRR and mFRR data is not available for the preceding period. We cleaned the data of missing values using the R package tsrobprep [42].
The mixed-pricing method for the aFRR and mFRR tendering (in which both capacity and energy prices were used to select the cheapest BSPs, see Figure 4) was abolished following a decision by the Düsseldorf Higher Regional Court due to an appeal, and the capacity pricing (as described in Section 2.1) was immediately re-introduced.

Models and estimation
This section describes the input features and the models that use them to forecast the imbalance price $IP^{d,qh}$. In the EPF literature, it is typical to use autoregressive effects of the modelled prices; here, however, we cannot do so, as the German imbalance prices are published once a month. For the price calculation in the IC market, we use the ${}_{x}\text{ID}_{y}$ definition of Narajewski and Ziel [25]. Let us recall that ${}_{x}\text{ID}_{y}$ is a volume-weighted average price of all transactions in the IC market that take place in the $[x + y, x)$ time interval prior to the delivery.
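The volume-weighted average over a window before delivery can be sketched as follows. This is a minimal illustration: the function name `id_price`, the tuple layout of a trade, the interpretation of $x$ and $y$ in hours, and the example values are assumptions made here, not taken from the data or code of the study.

```python
from datetime import datetime, timedelta


def id_price(trades, delivery_start, x_hours, y_hours):
    """Volume-weighted average price of trades executed in the window
    [delivery - (x + y), delivery - x); returns None if no trade qualifies.

    trades: iterable of (executed_at, price_eur_mwh, volume_mw) tuples
    (hypothetical layout).
    """
    window_start = delivery_start - timedelta(hours=x_hours + y_hours)
    window_end = delivery_start - timedelta(hours=x_hours)
    turnover = 0.0
    volume = 0.0
    for executed_at, price, qty in trades:
        # Left-closed, right-open window, matching [x + y, x).
        if window_start <= executed_at < window_end:
            turnover += price * qty
            volume += qty
    return turnover / volume if volume > 0 else None


# Hypothetical trades for a product delivered at 12:00.
delivery = datetime(2021, 1, 1, 12, 0)
trades = [
    (datetime(2021, 1, 1, 10, 0), 0.0, 5.0),     # too early, ignored
    (datetime(2021, 1, 1, 10, 45), 40.0, 10.0),
    (datetime(2021, 1, 1, 11, 15), 50.0, 30.0),
    (datetime(2021, 1, 1, 11, 40), 100.0, 5.0),  # too late, ignored
]
vwap = id_price(trades, delivery, x_hours=0.5, y_hours=1.0)
```

With $x = 0.5$ and $y = 1$, only the trades between 10:30 and 11:30 enter the average, so later transactions that would not be known 30 minutes before delivery are excluded.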

Input features
The following features are considered in the exercise of modelling the imbalance price $IP^{d,qh}$ for day $d$ and quarter-hour $qh$ with $qh = 1, \ldots, 96$. Whenever mentioning the corresponding product, we mean the same delivery hour, e.g. for $qh = 6$ the corresponding hourly delivery time is 01:00 and the quarter-hourly one is 01:15. Note that we utilize only the information available until 30 minutes before the delivery.
• aFRR prices: $\text{aFRR}^{d,qh}_{i,j,k}$ for $i \in \{\text{POS}, \text{NEG}\}$ indicating the positive or negative balancing side, $j \in \{\text{CAP}, \text{EN}\}$ indicating the capacity or energy price, and $k \in \{\min, \text{avg}, \max\}$ indicating the minimum, average or maximum price (6 · 2 regressors).
In total, we consider 948 regressors for the modelling exercise. Let us shortly motivate the choice of these particular variables. Previous studies [25,28] have shown that the past prices can bring a lot of information regarding the future intraday price level and distribution. We expect similar behaviour in the imbalance price development, and thus we consider the price data, especially the most recent intraday prices and price differences. Similarly, the DA forecasts of fundamental variables might help in explaining the expected volatility. Naturally, the most recent intraday forecasts would be much more informative, but unfortunately this data is not publicly available and very expensive to obtain. The most recent observed imbalance values might indicate the expected imbalance in the considered quarter-hour. The aFRR and mFRR prices are natural regressors for the imbalance prices, as they directly contribute to their values. The fuel and EUA prices should explain the general price trend, and finally the weekday dummies and cubic B-splines account for weekly and annual seasonality, respectively.

Naive
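Following the research on intraday markets [25, 27-29], where the authors find the most recent intraday price to be a very good and simple model, we construct the naive model in a similar manner. That is to say, we assume the expected imbalance price to be equal to the observed quarter-hourly $\text{ID}_1$ price:
$$\widehat{IP}^{d,qh} = \text{ID}_1^{d,qh}. \tag{2}$$
To obtain a distribution of imbalance prices, we additionally use the bootstrap method [44], which was successfully applied in previous EPF studies [17,28,35,45]. The in-sample bootstrapped errors are added to the forecasted expected price to derive the distribution forecast:
$$\widehat{IP}^{D+1,qh}_{m} = \widehat{IP}^{D+1,qh} + \hat{\varepsilon}^{D+1,qh}_{m} \quad \text{for } m = 1, \ldots, M, \tag{3}$$
with $\hat{\varepsilon}^{D+1,qh}_{m}$ denoting the $m$-th bootstrapped in-sample error and $M$ the number of bootstrap samples.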
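The naive point forecast combined with bootstrapped in-sample errors can be sketched as below. The function names, the fixed seed, and the simple empirical-quantile rule are illustrative assumptions; the study's own implementation is not shown in the text.

```python
import math
import random


def bootstrap_forecast(in_sample_actual, in_sample_pred, point_forecast,
                       m=10000, seed=0):
    """Add in-sample residuals, drawn with replacement, to a point forecast."""
    residuals = [a - p for a, p in zip(in_sample_actual, in_sample_pred)]
    rng = random.Random(seed)  # fixed seed only for reproducibility here
    return [point_forecast + rng.choice(residuals) for _ in range(m)]


def empirical_quantile(sample, tau):
    """Simple order-statistic quantile (one of several possible conventions)."""
    ordered = sorted(sample)
    idx = min(len(ordered) - 1, max(0, math.ceil(tau * len(ordered)) - 1))
    return ordered[idx]


# Hypothetical in-sample imbalance prices and the matching ID1 prices.
actual = [10.0, 20.0, 30.0]
naive_pred = [12.0, 18.0, 30.0]
draws = bootstrap_forecast(actual, naive_pred, point_forecast=50.0, m=1000)
median = empirical_quantile(draws, 0.5)
```

The resulting sample represents the forecast distribution; quantile forecasts for any probability level are read off it with `empirical_quantile`.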

Lasso with bootstrap
The lasso regression of Tibshirani [46] is a very simple and powerful tool for linear model estimation, and thus it has gained high popularity and reputation in the EPF literature [18, 19, 23-25, 29]. It serves both model estimation and variable selection, and therefore for the model we use all the regressors described in Section 3.1, and we denote this vector as $X^{d,qh}$. The formula for the model is
$$IP^{d,qh} = X^{d,qh}\beta^{qh} + \varepsilon^{d,qh} \tag{4}$$
and the lasso estimator is given by
$$\hat{\beta}^{qh} = \arg\min_{\beta} \left\{ \sum_{d=1}^{D} \left( IP^{d,qh} - X^{d,qh}\beta \right)^2 + \lambda \lVert \beta \rVert_1 \right\}, \tag{5}$$
where $\lambda$ is a tuning parameter. The lasso estimator expects scaled inputs, and in addition to that we apply to the inputs the variance stabilizing asinh transformation as suggested by Uniejewski et al. [47], with the inverse proposed by Narajewski and Ziel [25]. The $\lambda$ parameter is tuned based on the Bayesian information criterion (BIC) for $\lambda \in \Lambda = \{\lambda_i = 2^i \mid i \in G\}$, where $G$ is an equidistant grid from $-15$ to $1$ of length 50, similarly as in the paper of Narajewski and Ziel [25]. Let us note that, as for the naive model, the lasso model estimates the expected imbalance price, and to obtain a distribution forecast we need the bootstrap procedure described in equation (3).
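The tuning grid and BIC-based selection can be illustrated with a small pure-Python coordinate-descent lasso. This is a sketch under assumptions: inputs are taken to be already centred and scaled, there is no intercept, the objective uses the $\frac{1}{2n}$ scaling common in software (so the numerical value of $\lambda$ need not match other conventions), and the BIC form $n \log(\text{RSS}/n) + k \log n$ is one standard choice. None of the function names come from the study.

```python
import math


def lambda_grid(lo=-15.0, hi=1.0, length=50):
    """lambda_i = 2**i for i on an equidistant grid from lo to hi."""
    step = (hi - lo) / (length - 1)
    return [2.0 ** (lo + i * step) for i in range(length)]


def soft_threshold(z, lam):
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n) * RSS + lam * ||beta||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals for beta = 0
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue
            # Covariance of column j with the partial residual.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_ss[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def bic(X, y, beta):
    n = len(y)
    rss = sum((y[i] - sum(x * b for x, b in zip(X[i], beta))) ** 2
              for i in range(n))
    k = sum(1 for b in beta if b != 0.0)  # number of active coefficients
    return n * math.log(max(rss, 1e-12) / n) + k * math.log(n)


def select_lambda(X, y):
    """Pick the grid value with the smallest BIC."""
    return min(lambda_grid(), key=lambda lam: bic(X, y, lasso_cd(X, y, lam)))
```

In practice a vectorised solver would be used instead of nested Python loops; the sketch only shows how the grid, the shrinkage step and the information criterion fit together.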

Gamlss with lasso
The gamlss framework of Rigby and Stasinopoulos [48] extends the generalized additive models by allowing one to build explicit additive models not only for the location, but also for the scale and shape parameters of a given distribution. Its potential was already noticed in the EPF literature [28,33]; however, it has not yet gained such popularity as the lasso estimation. For the input vector $X^{d,qh}$ we have the following model
$$IP^{d,qh} \sim F\left(x; \Theta^{d,qh}\right), \qquad \theta_i^{d,qh} = g_i\left(X^{d,qh}\beta_i^{qh}\right), \tag{6}$$
with $g_i$ being the link function, $\theta_i^{d,qh} \in \Theta^{d,qh}$ and the distribution given by the cumulative distribution function $F(x; \Theta^{d,qh})$. In the study, we consider the normal and t distributions.
The link function for the location parameter is the identity function $g_1(x) = x$, and for the scale and tail-weight the softplus function $g_2(x) = \log(\exp(x) + 1)$. The link functions are shown in Figure 5.
Figure 5: Link functions used in the estimation of distribution parameters.
The model in equation (6) is actually a glmlss one, as we consider only linear effects of the inputs. Moreover, the size of $X^{d,qh}$ could make the optimizing algorithm converge very slowly, especially for the 3-parametric distribution. Therefore, we additionally use the lasso regularization (5) as described e.g. by Ziel [49]; however, we do not directly use the gamlss [50] and gamlss.lasso [51] R packages, as their deterministic algorithm has issues with convergence due to the very heavy tails of our data. Instead, we utilize the TensorFlow [52] and Keras [53] framework by building a simple neural network with a single linear hidden layer and the given probability distribution as output. For each of the distribution parameters we use a different regularization parameter $\lambda_i \in (10^{-5}, 10)$. We also allow for no regularization of each of the distribution parameters. The model is estimated by maximizing the log-likelihood using the Adam algorithm. The learning rate is assumed to be in the interval $(10^{-5}, 10^{-1})$, and we tune all the parameters using the Optuna [54] package in Python with the number of iterations arbitrarily set to 500. Depending on the distribution, we have 5 or 7 hyperparameters to tune. Let us note that the input vector $X^{d,qh}$ is standardized prior to the modelling.
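How linear predictors are mapped to distribution parameters and scored by the negative log-likelihood can be sketched in plain Python. This is only an illustration of the idea, not the TensorFlow/Keras model used in the study; the function names, the intercept-first coefficient layout, and the use of `nu` for the tail-weight parameter are assumptions made here.

```python
import math


def softplus(x):
    """Numerically stable log(exp(x) + 1); always positive."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def linear(x, beta):
    """beta[0] is the intercept, beta[1:] the slopes (hypothetical layout)."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


def t_parameters(x, beta_mu, beta_sigma, beta_nu):
    """Identity link for the location, softplus for scale and tail-weight."""
    return (linear(x, beta_mu),
            softplus(linear(x, beta_sigma)),
            softplus(linear(x, beta_nu)))


def normal_nll(y, mu, sigma):
    z = (y - mu) / sigma
    return 0.5 * math.log(2.0 * math.pi) + math.log(sigma) + 0.5 * z * z


def student_t_nll(y, mu, sigma, nu):
    """Negative log-density of the location-scale Student's t distribution."""
    z = (y - mu) / sigma
    log_norm = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
                - 0.5 * math.log(nu * math.pi) - math.log(sigma))
    return -(log_norm - (nu + 1.0) / 2.0 * math.log1p(z * z / nu))


# Hypothetical single observation with two standardized inputs.
mu, sigma, nu = t_parameters([0.5, -1.0],
                             beta_mu=[40.0, 5.0, 2.0],
                             beta_sigma=[1.0, 0.3, 0.0],
                             beta_nu=[2.0, 0.0, 0.1])
loss = student_t_nll(35.0, mu, sigma, nu)
```

Summing such losses over the training days, adding the $L_1$ penalties, and minimizing with a gradient-based optimizer gives the estimation problem described above; the softplus keeps the scale and tail-weight positive for any value of the linear predictor.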

Probabilistic neural networks
The probabilistic neural network model is simply a multilayer perceptron (MLP) that models distribution parameters instead of price values, as shown in Figure 6. Let us note that if we remove the hidden layers, we get the gamlss model described in the previous section.
The approach of probabilistic MLP in EPF was first introduced by Marcjasz et al. [45] for the day-ahead prices. For mathematical details see the aforementioned manuscript. The considered model assumes 2 or 3 hidden layers and outputs a normal or t distribution. For the distribution parameters, we use the same link functions as in Figure 5. We regularize the model through input feature selection, $L_1$ regularization of the hidden layers and their weights, and a dropout layer. We tune them together with the number of hidden layers, their activation functions, the number of neurons and the learning rate. In the following, we present a list of all hyperparameters considered in the tuning.
• Dropout layer: whether to use the dropout layer after the input layer, and if so, at what rate. The rate parameter is drawn from the $(0, 1)$ interval (up to 2 hyperparameters).
• Number of neurons in the hidden layers. The values are drawn from the $[24, 1024]$ interval (1 hyperparameter per layer).
• $L_1$ regularization: whether to use the $L_1$ regularization on the hidden layers and their weights, and if so, at what rate. The rate is drawn from the $(10^{-5}, 10)$ interval (up to 4 hyperparameters per layer).
In total, we have up to 42 hyperparameters to tune. The selected input features are normalized prior to the model estimation. As for the gamlss model, we use the TensorFlow [52] and Keras [53] framework for model estimation, and Optuna [54] for hyperparameter tuning with the number of iterations arbitrarily set to 1000. The model additionally contains some elements which are not subject to the tuning exercise. These are the size of the learning and validation sets, the optimizing algorithm, the number of epochs fixed to 1500, and the batch size fixed to 32. We estimate the model by maximizing the log-likelihood, and we use the early stopping callback with a patience of 50 epochs.
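The patience rule behind early stopping can be illustrated without any deep learning library. The sketch below is a generic version of the idea (stop once the validation loss has not improved for `patience` consecutive epochs); it is not a reproduction of the Keras callback internals, and the function name and toy losses are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=50):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses."""
    best = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs in a row: stop training.
            return epoch, best_epoch
    # Training ran through all available epochs.
    return len(val_losses) - 1, best_epoch


# Toy validation losses with a short patience for readability.
stop, best = early_stopping_epoch([5.0, 4.0, 3.0, 3.5, 3.6, 3.7], patience=2)
```

With a cap of 1500 epochs and a patience of 50, training ends either at the cap or 50 epochs after the last improvement of the validation loss, whichever comes first.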

Application study
The forecasting study utilizes a rolling window scheme with $D = 730$ days in-sample and $N = 539$ days out-of-sample. In the case of the gamlss and probabilistic MLP models, the in-sample period is split into 547 days used for training and 183 for validation. The hyperparameter tuning is performed once, using the initial in-sample data, as shown in Figures 2 and 4. We aim for very short-term forecasting utilizing the information available up to 30 minutes before the delivery. The naive and lasso models forecast the imbalance price distribution through $M = 10000$ bootstrap samples, whereas the gamlss and probabilistic MLP models forecast the assumed distribution directly.
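The index bookkeeping of such a rolling window can be sketched as follows. The generator name and the zero-based day indices are illustrative assumptions; the sketch also assumes that the window moves forward by one day for each forecast, which is the usual meaning of a rolling window scheme.

```python
def rolling_windows(n_days, in_sample=730, train=547):
    """Yield (train_days, validation_days, target_day) for each forecast day.

    Days are zero-based indices; the window is shifted by one day each step.
    """
    for first in range(n_days - in_sample):
        train_days = range(first, first + train)
        valid_days = range(first + train, first + in_sample)
        target_day = first + in_sample
        yield train_days, valid_days, target_day


# 730 in-sample days followed by 539 out-of-sample days.
windows = list(rolling_windows(730 + 539))
first_train, first_valid, first_target = windows[0]
```

Each window contains 547 training and 183 validation days, and the day directly after the window is the forecast target.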

Evaluation
Following the conclusions of Gneiting and Raftery [55], our main evaluation measure is the continuous ranked probability score (CRPS), as it is a strictly proper scoring rule for marginal distribution forecasts. Additionally, we calculate the values of the RMSE, MAE and empirical coverage as supplementary measures. For statistically significant conclusions, we conduct the Diebold and Mariano [56] test using the respective CRPS losses. In this subsection, we provide details regarding the calculation of the mentioned measures.
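The CRPS is approximated using the pinball loss for a dense equidistant grid of probabilities $r$ between 0 and 1 of size $R$, see e.g. [17]:
$$\text{CRPS}^{d,qh} = \frac{1}{R} \sum_{\tau \in r} \text{PB}^{d,qh}_{\tau}. \tag{7}$$
In our study, we consider $r = \{0.01, 0.02, \ldots, 0.99\}$ of size $R = 99$. $\text{PB}^{d,qh}_{\tau}$ is the pinball loss with respect to probability $\tau$. Its formula is given by
$$\text{PB}^{d,qh}_{\tau} = \begin{cases} \tau \left( IP^{d,qh} - \widehat{Q}^{d,qh}_{\tau} \right) & \text{if } IP^{d,qh} \ge \widehat{Q}^{d,qh}_{\tau}, \\ (1 - \tau) \left( \widehat{Q}^{d,qh}_{\tau} - IP^{d,qh} \right) & \text{otherwise}, \end{cases} \tag{8}$$
where $\widehat{Q}^{d,qh}_{\tau}$ is a forecast of the $\tau$-th quantile of the $IP^{d,qh}$ price. To calculate the overall CRPS value we use a simple average
$$\text{CRPS} = \frac{1}{N\,|QH|} \sum_{d=1}^{N} \sum_{qh \in QH} \text{CRPS}^{d,qh}, \tag{9}$$
where the sums run over the $N$ out-of-sample days and the considered quarter-hours $QH$. The formulas for the supplementary measures are given by
$$\text{RMSE} = \sqrt{ \frac{1}{N\,|QH|} \sum_{d=1}^{N} \sum_{qh \in QH} \left( IP^{d,qh} - \hat{\mu}^{d,qh} \right)^2 }, \tag{10}$$
$$\text{MAE} = \frac{1}{N\,|QH|} \sum_{d=1}^{N} \sum_{qh \in QH} \left| IP^{d,qh} - \widehat{Q}^{d,qh}_{0.5} \right|, \tag{11}$$
$$\tau\text{-cov} = \frac{1}{N\,|QH|} \sum_{d=1}^{N} \sum_{qh \in QH} \mathbb{1}\left\{ \widehat{Q}^{d,qh}_{(1-\tau)/2} \le IP^{d,qh} \le \widehat{Q}^{d,qh}_{(1+\tau)/2} \right\}, \tag{12}$$
where $\tau \in \{0.5, 0.9, 0.98\}$ and $\hat{\mu}^{d,qh}$ is a forecast of the expected $IP^{d,qh}$ price.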
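The pinball loss, the grid-based CRPS approximation and the empirical coverage can be sketched in a few lines. The function names and toy numbers are illustrative; the CRPS here is the plain average of pinball losses over the grid $0.01, \ldots, 0.99$, which is an assumption about the normalisation.

```python
GRID = [i / 100 for i in range(1, 100)]  # 0.01, 0.02, ..., 0.99


def pinball(y, q, tau):
    """Pinball loss of quantile forecast q at level tau for observation y."""
    return tau * (y - q) if y >= q else (1.0 - tau) * (q - y)


def crps_from_quantiles(y, quantiles):
    """Average pinball loss; quantiles[i] is the forecast for level GRID[i]."""
    return sum(pinball(y, q, t) for q, t in zip(quantiles, GRID)) / len(GRID)


def coverage(actuals, lowers, uppers):
    """Share of observations inside their forecast intervals."""
    hits = sum(1 for y, lo, up in zip(actuals, lowers, uppers) if lo <= y <= up)
    return hits / len(actuals)


# Toy example: a degenerate forecast placing all quantiles at 40 EUR/MWh.
flat = [40.0] * len(GRID)
score_hit = crps_from_quantiles(40.0, flat)
score_miss = crps_from_quantiles(50.0, flat)
cov = coverage([1.0, 5.0, 9.0], [0.0, 6.0, 8.0], [2.0, 7.0, 10.0])
```

For a nominal 90% interval, `lowers` and `uppers` would hold the 0.05 and 0.95 quantile forecasts, and a well-calibrated model should produce a coverage close to 0.9.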
The DM test measures the statistical significance of the difference between the accuracy of the forecasts of model A and model B, and it is commonly used in the EPF literature [19,24,25,28].

Results
Table 1 presents the results of the forecasting study. We see that the lowest error values are obtained for the naive model, which forecasts the imbalance price simply with the quarter-hourly $\text{ID}_1$ price. However, its empirical coverage is very poor. The second-lowest errors are produced by the gamlss model that assumes the t-distribution, and this model provides the best values in terms of 90% and 98% empirical coverage. The generalization from gamlss to the probabilistic neural network model does not bring any improvement for the t-distribution. Based on the performance of the two mentioned models, we decided to try a simple forecast combination by averaging the forecasts of the two models. This brings a small improvement in the CRPS and in the coverage, compared to the naive model. Let us also mention the very high errors of the models that assume the normal distribution. This is in line with the previous studies [28] on intraday price development, and it was expected based on Figures 2 and 3. Interestingly, the probNN.N model provides a very accurate 50% coverage, but less accurate 90% and 98% coverages. Finally, the lasso model performs slightly worse than the naive in all terms, which indicates that linear terms alone do not bring any improvement.
Figure 7 shows the pinball score values over quantiles $\tau \in r$ and the ratio to the naive model. For better clarity, we removed the models assuming the normal distribution from the right plot. We see that the models generally have more issues with forecasting the right tail of the distribution. Interestingly, the lasso model forecasts the quantiles up to around 0.3 slightly better than the naive; however, it loses considerably in the higher quantiles. Also, the combination of naive and gamlss.t is slightly better than the naive in both tails, but not as good in the central part of the distribution. This shows that a forecast combination, as e.g. in Berrisch and Ziel [57], could likely improve the overall score.
Figure 8 presents the CRPS values over the considered quarter-hours. The naive model seems to be the best across all quarter-hours except for two at hour 6. There, the gamlss.t is slightly better than the naive; however, the difference is not large and probably not significant. Again, some additional improvement comes as a result of combining the naive with the gamlss.t model.

Figure 9: p-values of the DM test for the CRPS loss. The closer they are to zero (→ dark green), the more significant the difference is between the forecasts of the X-axis model (better) and the forecasts of the Y-axis model (worse).

Conclusion
The paper raised the novel issue of probabilistic imbalance electricity price forecasting in the German market. The participation in the balancing is mandatory for every market player, and therefore this subject is crucial for them. The analysis assumed a setting of very short-term forecasting, 30 minutes before the physical delivery. We considered various state-of-the-art methods for probabilistic EPF; however, none of them could provide better forecasts in terms of CRPS, MAE and RMSE than the naive $\text{ID}_1$ price. On the other hand, the gamlss and probabilistic neural network models provide forecasts with far higher empirical coverage than the naive. This is evidence that the results might be improved, e.g. using intraday power generation forecasts or forecast combination methods, e.g. [57].
The obtained results are an argument towards the market efficiency between the intraday and balancing markets. This extends the conclusions of the intraday market being close to market efficiency [25,29]. Therefore, given the difficulty in forecasting the imbalance prices and the potential size of forecasting errors, the BRPs should minimize their imbalance

Figure 1: The daily routine of the German electricity spot (top) and balancing (bottom) markets.d, h correspond to the day and hour of the delivery, respectively.

Figure 2 presents the time series of three electricity prices: the DA price, the quarter-hourly $\text{ID}_1$ price, and the imbalance price. The plots show clearly that the imbalance price is much more volatile than the prices in the quarter-hourly IC market or in the DA market. Moreover, in the considered time-frame the imbalance price exhibited many positive and negative extreme spikes, with a minimum of around −6500 EUR/MWh and a maximum of around 24500 EUR/MWh (for better clarity of Figure 2 we do not show the extremes in the plot). Such values are impossible to reach in the DA (a minimum of −500 and a maximum of 3000) and IC (a minimum of −9999 and a maximum of 9999) markets. Therefore, the participation in the balancing market comes with a high risk for a BRP. This is confirmed by Figure 3, which shows histograms of imbalance prices for selected hours.

Figure 2: Time series plots of various electricity price data in EUR/MWh.

Figure 4 presents time series plots of selected external regressors and complements Figure 2. In both figures we marked the initial in-sample, the hyperparameter tuning, and the out-of-sample periods. Let us note the structural break in the aFRR positive and negative average energy prices between October 2018 and July 2019. During this period, the mixed-pricing method was used in the tendering of aFRR and mFRR, i.e. both capacity and energy prices were used to select the cheapest BSPs. However, this method was subsequently abolished.

Figure 3: Histograms of imbalance prices with fitted densities for selected hours.

Figure 4: Time series plots of selected external regressors. POS and NEG stand for positive and negative, respectively.
In the bootstrap of equation (3), $\hat{\varepsilon}^{D+1,qh}_{m}$ are in-sample residuals drawn with replacement for day $D + 1$, i.e. we sample from the set of $\hat{\varepsilon}^{d,qh} = IP^{d,qh} - \widehat{IP}^{d,qh}$ for $d = 1, \ldots, D$.

Figure 6: Exemplary network structure of the probabilistic MLP.
For the DM test, denote by $L^{d}_{Z} = (L^{d,qh}_{Z})_{qh \in QH}$ the vector of out-of-sample losses for day $d$ of model $Z$. Formally, we choose $L^{d,qh}_{Z} = \text{CRPS}^{d,qh}$. The multivariate loss differential series
$$\Delta^{d}_{A,B} = \lVert L^{d}_{A} \rVert_1 - \lVert L^{d}_{B} \rVert_1 \tag{13}$$
defines the difference of losses in the $\lVert \cdot \rVert_1$ norm. For each pair of models, we compute the p-value of two one-sided DM tests. The first one is with the null hypothesis $H_0: E(\Delta^{d}_{A,B}) \le 0$, that is to say that the forecasts of model A outperform those of model B. The second test is with the reverse null hypothesis $H_0: E(\Delta^{d}_{A,B}) \ge 0$ and it complements the former one.
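A one-sided test on the daily loss differential can be sketched as below. This is a simplified illustration under stated assumptions: the loss differentials are treated as serially uncorrelated (no long-run variance correction) and the test statistic is compared with a standard normal distribution. The function name and toy losses are hypothetical.

```python
import math


def dm_pvalue(losses_a, losses_b):
    """p-value for H0: E(delta) <= 0, with delta_d = ||L_A^d||_1 - ||L_B^d||_1.

    losses_a, losses_b: lists over days of lists over quarter-hours.
    A small p-value indicates that model B is significantly better than A.
    """
    deltas = [sum(abs(v) for v in la) - sum(abs(v) for v in lb)
              for la, lb in zip(losses_a, losses_b)]
    n = len(deltas)
    mean = sum(deltas) / n
    var = sum((d - mean) ** 2 for d in deltas) / (n - 1)
    stat = mean / math.sqrt(var / n)
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(stat / math.sqrt(2.0))


# Toy CRPS losses for four days and two quarter-hours per day.
model_a = [[2.0, 2.0], [3.0, 2.5], [2.2, 2.9], [3.1, 2.0]]
model_b = [[1.0, 1.0], [1.2, 1.1], [0.9, 1.3], [1.0, 1.4]]
p_b_better = dm_pvalue(model_a, model_b)
p_a_better = dm_pvalue(model_b, model_a)
```

Calling the function with the arguments swapped gives the complementary test, so the two p-values sum to one under this normal approximation.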

Figure 7: Pinball score (left) and its ratio to the naive (right) over quantiles $\tau \in r$. The right graph shows selected models for better clarity.

Figure 8: CRPS (left) and its ratio to the naive (right) over quarter-hours $qh \in QH$. The right graph shows selected models for better clarity.
Due to the high complexity of the models and the need for comprehensive and computationally heavy hyperparameter tuning, we consider only selected quarter-hours in the study. That is to say, we use all quarter-hours of the representative hours 0, 6, 12 and 18, i.e. $qh \in QH = \{1, 2, 3, 4, 25, 26, 27, 28, 49, 50, 51, 52, 73, 74, 75, 76\}$. Such an approach was already used in the literature. As described in Section 3, for each $qh$ we estimate separate models, including a separate hyperparameter tuning. Thus, we reduce the number of them from 96 to 16 without loss of generality.

Table 1: Error measures of the considered models. Colour indicates the performance column-wise (the greener, the better). The best values in each column are depicted in bold.
Finally, Figure 9 provides p-values of the DM test obtained using the CRPS loss. This figure only confirms the conclusions that we made based on Table 1. Namely, the forecasts of the naive model are significantly the best among the considered models and those of the gamlss.t model the second-best. Moreover, the combination of naive and gamlss.t is not significantly different from the naive itself.