X-model: further development and possible modifications

Despite its critical importance, the famous X-model elaborated by Ziel and Steinert (2016) has neither been widely studied nor further developed. And yet, the possibilities to improve the model are as numerous as the fields it can be applied to. The present paper takes advantage of a technique proposed by Coulon et al. (2014) to enhance the X-model. Instead of using the wholesale supply and demand curves as inputs for the model, we rely on transformed versions of these curves with a perfectly inelastic demand. As a result, the computational requirements of our X-model decrease and its forecasting power increases substantially. Moreover, our X-model becomes more robust to outliers present in the initial auction curves data.


Introduction
As has been emphasized repeatedly (see, for example, [Boyle, 2004], [Weron, 2007] or [Spiecker and Weber, 2014]), the shift to cleaner power is accelerating. Due to their apparent advantages, renewable resources are becoming increasingly competitive and are exerting a profound influence on contemporary energy systems. Because green energies are intermittent and weather-dependent, the importance of forecasting has grown over the past several years. More accurate predictions lead to substantial cost reductions and help to maintain the stability of the whole power system.
The immense complexity of energy markets has provided forecasters with a vast variety of datasets and variables to study. Consequently, a large number of forecasting models have emerged over time (see e.g. [Bunn, 2000] or [Weron, 2014]). One of those models is the so-called X-model developed by [Ziel and Steinert, 2016]. Despite following a truly unconventional approach, this model has proven to be a very powerful tool for conducting price and volume forecasts in energy markets. The X-model is therefore examined thoroughly in the present study.
In fact, the core of the X-model is constructed around a relatively simple idea. When attempting to make a price or volume prediction in an electricity market, scientists typically consider price and volume time series. Instead, however, scientists may focus on making a forecast for the entire wholesale supply and demand curves. Since these curves are used to settle equilibrium prices, the intersection of the prognosticated curves will constitute the price or volume forecast.
To explain the functioning of the model, let us first consider the demand curve. To obtain a prediction for the entire demand curve at time period $t + 1$, we first need to select several points on this curve. These points correspond to certain prices and were referred to as the price classes in the original paper by [Ziel and Steinert, 2016]. Then, we construct a time series-based model for volume forecasting for each of the selected price classes. Combining the obtained forecasts (or, loosely speaking, drawing "a line" over the predicted points) yields a prediction for the entire demand curve at time period $t + 1$. We then proceed similarly to obtain a forecast for the entire supply curve. Afterwards, as mentioned in the previous paragraph, we simply search for the intersection between the two predicted auction curves and take this intersection as our equilibrium price or volume forecast.
Hence, as its underlying intuition suggests, the X-model is particularly well suited to forecasting price spikes. Since the model combines the best properties of both time series and structural analyses, it is capable of capturing the bidding behavior of market participants more precisely. This ability, in turn, results in more accurate forecasts of extreme price events.
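The procedure described above can be illustrated with a minimal Python sketch. It is not the implementation used in the paper: the function name `intersect_curves`, the shared price grid and all numbers are hypothetical, and the per-class volume forecasts are simply given rather than produced by a time series model.

```python
import numpy as np

def intersect_curves(prices, supply_increments, demand_increments):
    """Return (price, volume) at the first price class where supply covers demand.

    prices: ascending price classes, shared by both curves for brevity
    (an assumption; supply and demand may use different classes).
    supply_increments[i]: forecast volume offered at prices[i].
    demand_increments[i]: forecast volume bid at prices[i].
    """
    prices = np.asarray(prices, dtype=float)
    # Supply is cumulated from the lowest price upwards.
    supply = np.cumsum(np.asarray(supply_increments, dtype=float))
    # Demand is cumulated from the highest price downwards.
    demand = np.cumsum(np.asarray(demand_increments, dtype=float)[::-1])[::-1]
    covered = supply >= demand
    if not covered.any():
        raise ValueError("the forecasted curves do not intersect")
    idx = int(np.argmax(covered))  # first index where supply >= demand
    return float(prices[idx]), float(min(supply[idx], demand[idx]))

# Hypothetical day-ahead volume forecasts for six price classes.
prices = [-500, 0, 20, 40, 60, 3000]
supply_forecast = [5000, 3000, 4000, 3000, 2000, 1000]
demand_forecast = [500, 500, 1000, 2000, 3000, 9000]

price, volume = intersect_curves(prices, supply_forecast, demand_forecast)
print(price, volume)
```

The sketch treats both curves as step functions and reports the smaller of the two cumulative volumes at the crossing price class; a finer treatment of the crossing is left out for brevity.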
Nevertheless, despite being a model of exceptional importance, it has not been widely developed further. This paper aims to fill this gap and proposes presumably the first improvement of the original X-model. The kernel of the present study is based upon the paper written by [Coulon et al., 2014] who showed that the initial wholesale auction curves can be transformed into their analogues with a perfectly inelastic demand curve. We will show that forecasting accuracy of the X-model improves and its computational burden lessens if the curves are transformed prior to making a prognostication.
Fundamentally, the inelastic demand curve is the main reason behind the superior performance of the modified X-model. First, predicting only one point instead of the entire demand curve requires substantially less time. Therefore, the modified X-model delivers final results faster than the original X-model. Second, the modified X-model is much less sensitive to the outliers present in the original auction curves data. Generally speaking, the initial wholesale demand curve has a very sophisticated composition and shape. Predicting this curve correctly is thus a task of great complexity. Forecasting the inelastic demand is, in turn, much simpler. As a result, the modified X-model is not only quicker, but also more accurate.
This paper is organized as follows. The next section comments briefly on the used data set and the corresponding manipulations with the data. Section 3 elaborates on the transformation of the auction curves and explains why this transformation leads to a significant improvement of the X-model. Section 4 provides the fundamentals of the underlying time series process. Section 5 is a discussion of the obtained results. Section 6 concludes the paper.

Data set
The present study was conducted on auction curves data from the German EPEX SPOT SE. Additional data sets were obtained from ENTSO-E. These data sets comprise wind and solar power forecasts and the total generation forecast. The in-sample period used in the current paper is from 2016-01-01 to 2017-01-01. The out-of-sample period is the year 2017. Following the regulation of the EPEX, the maximal bid price in the market is equal to $P_{\max} = 3000$ and the lowest bid price amounts to $P_{\min} = -500$. The data was clock-change adjusted. Missing hours were calculated using the two values before and after them, whereas the double hour in October was replaced by the average of its two values.
There were two big clusters of outliers present in the original data. The first cluster was detected at the price $P_{\min}$ in the supply curve, the second one at the price $P_{\max}$ in the demand curve. The fact that these clusters are outliers becomes apparent given that observations at points distant from $P_{\min}$ and $P_{\max}$ did not exhibit any peculiarities. Moreover, from an economic standpoint, it is possible that market participants tried to bid unrealistic volumes at the very extremes of the auction curves in the hope of obtaining very profitable deals.
Please note that these outliers do impede the functioning of the model because they affect the composition of the forecasted auction curves. Technically speaking, however, we could have taken e.g. the points $P_1^S = -495$ and $P_1^D = 2995$ to construct the X-model. Had we done so, the outliers would barely have influenced the model, since prices are almost never realized at the extremes of the auction curves. From this perspective, we would have suffered only a marginal loss of information by taking $P_1^S = -495$ instead of $P_{\min}$ and $P_1^D = 2995$ instead of $P_{\max}$. However, we do not want to tolerate this loss and stick to the officially established price bounds. Therefore, the outliers have to be processed.
To clean these outliers, a suggestion from [Weron, 2007] was followed and a typical expert-type regression model was constructed. The model is similar to [Weron and Misiorek, 2008] or [Ziel, 2016], with lags 1, 2 and 7 and dummies for Monday, Saturday and Sunday. Moreover, the corresponding value at the point $P_1^S = P_{\min} + 5$ or $P_1^D = P_{\max} - 5$, for the supply and demand curves respectively, was used as an additional regressor in the model. Since the values at points $P_1^S$ and $P_1^D$ of the same curves were taken into account, the method, though simple, yields credible and precise results.
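The clock-change adjustment can be sketched as follows. This is only one possible reading of the description above: the function name `clock_change_adjust` is hypothetical, the position of the shifted hour is an assumption, and the missing hour is filled with the plain average of its two neighbouring hours.

```python
def clock_change_adjust(day_values, shift_hour=2):
    """Return 24 hourly values for a day with 23, 24 or 25 observations.

    shift_hour is the (assumed) zero-based index of the hour affected
    by the clock change.
    """
    values = list(day_values)
    if len(values) == 24:
        return values
    if len(values) == 23:
        # Spring: one hour is missing; fill it with the mean of the
        # observations directly before and after the gap.
        filled = (values[shift_hour - 1] + values[shift_hour]) / 2
        return values[:shift_hour] + [filled] + values[shift_hour:]
    if len(values) == 25:
        # Autumn: the double hour is replaced by the average of its two values.
        merged = (values[shift_hour] + values[shift_hour + 1]) / 2
        return values[:shift_hour] + [merged] + values[shift_hour + 2:]
    raise ValueError("expected 23, 24 or 25 hourly values")

spring_day = clock_change_adjust([float(h) for h in range(23)])
autumn_day = clock_change_adjust([float(h) for h in range(25)])
```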
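A regression of this type could be set up as in the sketch below. It is an illustration under several assumptions rather than the model actually estimated: the function names are hypothetical, the fit is plain ordinary least squares via NumPy, the series are daily values for a single hour, and the text does not specify how the fitted values replace the flagged observations.

```python
import numpy as np

def expert_design(y, neighbour, weekday):
    """Build the design matrix of an expert-type regression.

    y: daily volumes at the extreme price (P_min or P_max) for one hour.
    neighbour: daily volumes at the neighbouring point (P_min + 5 or P_max - 5).
    weekday: day-of-week codes, 0 = Monday ... 6 = Sunday.
    Rows start at day 7 so that all lags are available.
    """
    y = np.asarray(y, dtype=float)
    neighbour = np.asarray(neighbour, dtype=float)
    weekday = np.asarray(weekday)
    t = np.arange(7, len(y))
    columns = [
        np.ones(len(t)),                   # intercept
        y[t - 1], y[t - 2], y[t - 7],      # lags 1, 2 and 7
        neighbour[t],                      # additional regressor
        (weekday[t] == 0).astype(float),   # Monday dummy
        (weekday[t] == 5).astype(float),   # Saturday dummy
        (weekday[t] == 6).astype(float),   # Sunday dummy
    ]
    return np.column_stack(columns), y[t]

def fit_expert_model(y, neighbour, weekday):
    """Fit by least squares and return (coefficients, fitted values)."""
    X, target = expert_design(y, neighbour, weekday)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return beta, X @ beta

# Synthetic data in which the extreme-price volume is an exact linear
# function of the neighbouring volume, so the fit should be exact.
rng = np.random.default_rng(0)
n_days = 60
neighbour = rng.uniform(1000, 2000, size=n_days)
y = 3.0 + 2.0 * neighbour
weekday = np.arange(n_days) % 7
beta, fitted = fit_expert_model(y, neighbour, weekday)
```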

Transformation of the auction curves
As was mentioned earlier, it is possible to transform the actually observed auction curves into their analogues with a perfectly inelastic demand curve. The basics of this transformation are provided in [Coulon et al., 2014], whereas possible applications of this theory can be found in [Kulakov and Ziel, 2019b] and [Kulakov and Ziel, 2019a]. A graphical representation of the transformation can be found in Figure 1. It is of crucial importance to note that the price remains the same after the curves have been transformed. The volumes, however, increase. Using the X-model for price forecasting on the transformed curves is thus possible without any further modifications. However, the resulting volume forecast will not correspond to the wholesale market volumes.
There are three major benefits from transforming the curves prior to applying the X-model. First, recall that the demand curve after the transformation is represented by only one point. Therefore, it is no longer necessary to predict the whole demand curve, but only to make a forecast for this single point. This saves a substantial amount of computational time. Second, the model becomes more robust to outliers present in the auction curves data, which greatly increases its accuracy. This feature of the modified X-model will be discussed in more detail in what follows. Third, the saved computational time can be used to select a greater number of price classes and thus forecast the supply curve more precisely. Hence, transforming the auction curves before using the X-model will not only deliver the results faster, but will also yield results of higher quality.
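The two properties stated above (unchanged price, larger volume) can be checked on a toy example. The sketch below is an assumption-laden illustration and not the formulas of [Coulon et al., 2014]: it reads demand as the cumulative volume bid at or above a price, takes the total demand volume as the inelastic demand, and books the demand that is not served at a given price as additional supply. All function names and bid data are hypothetical.

```python
def supply_at(price, offers):
    """Cumulative volume offered at or below `price`."""
    return sum(v for p, v in offers if p <= price)

def demand_at(price, bids):
    """Cumulative volume bid at or above `price`."""
    return sum(v for p, v in bids if p >= price)

def clearing_price(grid, supply_fn, demand_fn):
    """Lowest grid price at which supply covers demand."""
    for p in sorted(grid):
        if supply_fn(p) >= demand_fn(p):
            return p
    raise ValueError("no intersection on the price grid")

def to_inelastic(supply_fn, demand_fn, lowest_price):
    """Return (transformed supply function, inelastic demand volume)."""
    total_demand = demand_fn(lowest_price)

    def transformed_supply(price):
        # Demand that is priced out at `price` is treated as extra supply.
        return supply_fn(price) + total_demand - demand_fn(price)

    return transformed_supply, total_demand

# Hypothetical (price, volume) bids.
offers = [(-500, 3000), (10, 4000), (30, 5000), (80, 3000)]
bids = [(3000, 8000), (50, 2000), (25, 2000), (5, 1000)]
grid = sorted({p for p, _ in offers + bids})

def supply(p):
    return supply_at(p, offers)

def demand(p):
    return demand_at(p, bids)

price_before = clearing_price(grid, supply, demand)
volume_before = min(supply(price_before), demand(price_before))

new_supply, inelastic_demand = to_inelastic(supply, demand, min(grid))
price_after = clearing_price(grid, new_supply, lambda p: inelastic_demand)
```

In this example the clearing price is identical before and after the transformation, while the volume at the intersection rises from the wholesale volume to the inelastic demand.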

Model description

Transformation of the auction curves
The first step in describing the model is to comment on the way the auction curves can be transformed. As has been mentioned earlier, the formulas were taken from [Coulon et al., 2014]. Please note that we consider both auction curves as functions of the price, which simplifies the notation significantly. The expression for the inelastic demand curve can be represented as where $\mathrm{WSDem}$ denotes the demand curve in a wholesale market and $P_{\max}$ is the maximal price at which market participants can bid under the regulation of a power exchange. The equation for the inverse supply curve reads

Defining the price classes
Having transformed the curves, it is now possible to apply the X-model and carry out price and volume forecasts. Please note that the formulas below are almost identical to those in the original paper. However, the applied transformation allows us to focus only on the supply curve. Therefore, it was possible to omit a great number of indices in the formulas. Hence, the less sophisticated appearance of the mathematical part of the present paper constitutes another neat simplification of the original X-model.
To construct the X-model, the first step is to determine price classes on the transformed supply curve. The price classes, as has already been mentioned, are points on the supply curve which correspond to certain prices and to which volume forecasting models will be applied. To proceed, we first define a grid of prices with positive bid volumes as where $P$ denotes a grid of all possible prices and $V$ stands for volumes on curve $\mathrm{Sup}^{-1}_t$. Therefore, the transformed supply curve in our case can be written as Having determined all possible prices present in the in-sample period, we can proceed with determining the price classes. First of all, we have to compute average volumes over $T$ in-sample observations at prices $P$. Hence, it holds that Given the above expression, writing an equation for the average curve $\mathrm{Sup}^{-1}_t$ over the in-sample period yields Finally, we apply an equidistant volume grid with a step of $V^* = 500$ MW to curve $\mathrm{Sup}(iV^*)$. This allows us to define the price classes we are looking for. Hence, it can be said that
Of course, using an equidistant volume grid may appear too simple for determining the price classes. However, despite its simplicity, the method proves to be relatively powerful. As Figure 2 shows, the price classes are much denser in the flatter segments of the transformed supply curve and much sparser in the steeper segments.
Recall now that the equilibrium price is typically realized in the flatter segments of the curve. Therefore, using the equidistant volume grid allows us to focus on those sections of the supply curve in which the price can typically be observed. Furthermore, please note that there is no mathematical justification for choosing the size of $V^*$. However, the selected value of $V^*$ yields a number of price classes that is (a) sufficient for approximating the supply curve relatively precisely and (b) not so large as to be computationally inefficient.
Thus, there are $M_S = 19$ price classes in our case, as compared to $M_S = 16$ in the original paper. The defined price classes $C$ are represented in Table 1 below. Please note that the volume sizes in each of those price classes $C$ can be written as which means that the volume size in a price class incorporates all volumes present in between this price class and the previous one. Recall now that the demand curve in our model is perfectly inelastic. Therefore, there are no price classes for the inelastic demand curve. The demand volume is thus given by
Please note that formulas 8 and 9 bear a critical implication for the comparison of the original and the modified X-models. Imagine that we have conducted forecasts for each of the $M_S$ price classes and now want to combine the obtained predictions into a single curve. According to formula 8, each following point on the forecasted supply curve is an increment over the preceding point. In other words, to construct the predicted supply curve, we have to start with the first price class. Then, to the forecasted volume in the first price class, we add the forecasted volumes in the following price classes one by one. Therefore, whenever an outlier occurs in e.g. the 5th price class, this outlier affects not only the 5th price class itself, but also all subsequent price classes.
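The selection of price classes on an equidistant volume grid can be sketched as follows. The sketch reflects one reading of the procedure, namely that each price class is the lowest price at which the average cumulative supply reaches a multiple of $V^*$; the function name `price_classes` and the toy data are hypothetical.

```python
import numpy as np

def price_classes(prices, volumes, step=500.0):
    """Select price classes on an equidistant volume grid.

    prices: ascending price grid of length K.
    volumes: array of shape (T, K) with the bid volume per period and price.
    step: volume step V* (500 MW in the text).
    """
    prices = np.asarray(prices, dtype=float)
    mean_volumes = np.asarray(volumes, dtype=float).mean(axis=0)
    mean_curve = np.cumsum(mean_volumes)  # average cumulative supply curve
    targets = np.arange(step, mean_curve[-1] + 1e-9, step)
    # First price whose average cumulative volume reaches each grid level.
    idx = np.searchsorted(mean_curve, targets, side="left")
    idx = np.minimum(idx, len(prices) - 1)
    return np.unique(prices[idx])

# Toy in-sample data with T = 2 periods and K = 8 prices.
prices = [-500, 0, 10, 20, 30, 40, 100, 3000]
volumes = [
    [1000, 200, 300, 800, 900, 300, 100, 50],
    [1000, 400, 300, 600, 700, 500, 100, 50],
]
classes = price_classes(prices, volumes, step=500.0)
```

Grid levels that fall on the same price collapse into a single class, so prices carrying a lot of volume are always selected, while sparsely used prices are skipped.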
The demand curve in the modified X-model, however, is represented by only one point. Therefore, the modified X-model becomes more robust towards outliers present in the initial auction curves data. This holds since the cumulative effect of outliers does not affect the demand forecast in the modified X-model. Hence, accuracy of the modified X-model is higher compared to that of the original X-model.
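The cumulative effect described above can be made concrete with a few lines of Python. The numbers are purely illustrative: eight price classes with identical true volume increments and a single forecast error in the 5th class.

```python
import numpy as np

true_increments = np.full(8, 500.0)        # true volume per price class
forecast_increments = true_increments.copy()
forecast_increments[4] += 300.0            # outlier in the 5th price class

# The curve is the running sum of the increments, so the error made in
# the 5th class is carried into every later point of the curve.
curve_error = np.cumsum(forecast_increments) - np.cumsum(true_increments)
affected_points = int(np.count_nonzero(curve_error))
print(curve_error)
```

A curve that consists of a single point, such as the inelastic demand, has no later points to which an error could be passed on.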

Time series model
Please note that the applied time series model is very similar to the one in the original paper by [Ziel and Steinert, 2016] and is thus similar to [Weron and Misiorek, 2008] or [Ziel, 2016]. The model is a simple ARX-type process with 4 external regressors. The external regressors are the wholesale market price and the forecasts for electricity generation, wind power supply and solar power supply. Please note that in our case the equilibrium volume coincides with the value of the inelastic demand function. Therefore, to account for the equilibrium volume separately, we consider the difference between the equilibrium volume in the setting of the transformed curves and the initial wholesale equilibrium volume, i.e. $X_{\text{volume}}$. The time series for the modified X-model with an inelastic demand curve is then which in this case includes $M + 5 = 25$ variables. However, the forecast is conducted only for the first $M_S + M_D = 20$ parameters since the remaining ones are only auxiliary.
To capture the seasonal structure, a weekday dummy is introduced with the following formula where $W(d)$ is a function which yields a number corresponding to the day of the week $d$ and $k$ is a day index with e.g. $k = 1$ for Monday. Since we estimate the time series model by a BIC-based lasso (for more see [Tibshirani, 1996] and [Schwarz et al., 1978]), the underlying data should be standardized. Therefore, we have to subtract the means from the original process, i.e. $Y_{m,d,h} = X_{m,d,h} - \mu_{m,h}$ where $\mu_{m,h} = E(X_{m,d,h})$. The model under consideration can then be written as follows where $\phi_{m,h,l,j,k}$ and $\varphi_{m,h,k}$ are coefficients, $I_{m,h}(l, j)$ are sets of lags and $\varepsilon_{m,d,h}$ is an error term. As in the original paper, the latter term is assumed to be i.i.d. with constant variance $\sigma^2_{m,h}$. Please note that $I_{m,h}(l, j)$ is defined as where the choice of lags and the corresponding motivation are elaborated at length in the original paper by [Ziel and Steinert, 2016].
Then, to estimate the β-coefficients, we use the R package glmnet (for more see e.g. [Friedman et al., 2010]). The multivariate ordinary least squares estimator for our model can thus be defined as where $X_{m,d,h} = (X_{m,d,h,1}, \ldots, X_{m,d,h,p_{m,h}})'$ is a $p_{m,h}$-dimensional vector of regressors, $\beta_{m,h}$ is the corresponding vector of coefficients, and a tilde denotes a standardized version of a variable with its variance scaled to one. Standardization is necessary for the lasso estimator to function correctly. The corresponding mathematical representation of the scaled and estimated β-coefficients can be written as follows where $\lambda_{m,h}$ denotes a penalization parameter. Moreover, please note that the non-standardized versions of the coefficients can be obtained easily by rescaling. The volume forecast for the next day is thus given by Then, we need to add the sample means to the obtained values of $Y_{1,n+1,h}, \ldots, Y_{M,n+1,h}$ to compute the final day-ahead volume forecast $X_{1,n+1,h}, \ldots, X_{M,n+1,h}$. Please note, however, that simply adding the mean values to the above defined process is not sufficient to calculate a precise forecast for the next day. We thus follow the procedure used in the original paper and run a residual-based bootstrap simulation with $B = 10000$ bootstrap samples. Hence, we sample from the residual vector $\varepsilon_{d,h} = (\varepsilon_{1,d,h}, \ldots, \varepsilon_{M,d,h})'$ only over the days $d$. We then use the mean of the simulated results to finalize the computation of our point forecasts.
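The idea of a BIC-based lasso on standardized data can be illustrated with the following self-contained Python sketch. It is a stand-in, not the glmnet routine: the coordinate-descent solver, the penalty grid, the BIC variant $n \log(\mathrm{RSS}/n) + \mathrm{df} \cdot \log n$ and all function names are assumptions made for this illustration.

```python
import numpy as np

def standardize(X):
    """Center the columns and scale them to unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def soft_threshold(z, gamma):
    """Shrink z towards zero by gamma."""
    return np.sign(z) * max(abs(z) - gamma, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1.

    Assumes standardized columns, so that X_j'X_j / n = 1.
    """
    n, p = X.shape
    beta = np.zeros(p)
    residual = y.astype(float).copy()
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the partial residual.
            rho = X[:, j] @ (residual + X[:, j] * beta[j]) / n
            new_beta = soft_threshold(rho, lam)
            residual += X[:, j] * (beta[j] - new_beta)
            beta[j] = new_beta
    return beta

def lasso_bic(X, y, lambdas):
    """Return the penalty and coefficients with the smallest BIC."""
    n = len(y)
    best = None
    for lam in lambdas:
        beta = lasso_cd(X, y, lam)
        rss = float(np.sum((y - X @ beta) ** 2))
        bic = n * np.log(rss / n) + np.count_nonzero(beta) * np.log(n)
        if best is None or bic < best[0]:
            best = (bic, lam, beta)
    return best[1], best[2]

# Synthetic example: only regressors 0 and 2 carry signal.
rng = np.random.default_rng(0)
X = standardize(rng.normal(size=(200, 6)))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(size=200)
y = y - y.mean()                       # subtract the mean, as in the text
lam, beta = lasso_bic(X, y, np.logspace(-3, 0, 20))
```

The unit-variance scaling matters here because the same penalty is applied to every coefficient; without it, regressors measured on a larger scale would be penalized less.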
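The residual-based bootstrap can be sketched as below. The sketch assumes that whole in-sample days are resampled, so that the dependence across hours and price classes within a day is preserved, and that the simulated volumes are simply averaged; any further processing of the simulated volumes before averaging is omitted. The function name and array layout are hypothetical.

```python
import numpy as np

def bootstrap_forecast(point_forecast, residuals, n_boot=10000, seed=0):
    """Residual-based bootstrap over days.

    point_forecast: array (H, M) with the model forecast for the next day
    (hours x price classes), means already added back.
    residuals: array (D, H, M) of in-sample residuals.
    Returns the bootstrap mean (H, M) and all simulated paths (n_boot, H, M).
    """
    rng = np.random.default_rng(seed)
    # Draw day indices with replacement; each draw keeps a full residual day.
    days = rng.integers(0, residuals.shape[0], size=n_boot)
    simulated = point_forecast[None, :, :] + residuals[days]
    return simulated.mean(axis=0), simulated

# Small synthetic example with 50 in-sample days, 24 hours and 3 classes.
rng = np.random.default_rng(1)
residuals = rng.normal(size=(50, 24, 3))
residuals -= residuals.mean(axis=0)
point = np.full((24, 3), 1000.0)
mean_forecast, paths = bootstrap_forecast(point, residuals, n_boot=2000)
```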

Supply curve reconstruction
The model described in the previous section is then applied to each of the price classes. As a result, we have day-ahead forecasts for $M_S = 19$ points which lie on the forecasted supply curve and a forecast for the inelastic demand. Therefore, what remains to be done is to connect the forecasted $M_S$ price classes with each other, i.e. draw a curve out of the prognosticated points. We, however, want to retain the structure of the transformed supply curves and thus want to replicate this structure as precisely as possible. Hence, we do not simply draw a line over the predicted points, but instead use a more sophisticated technique. This technique was called curve reconstruction in the original paper. We thus rely on this technique without further modifying it.
To proceed further, we consider the following formula where $R(P) = 1$ if a price occurs at least two times a day and $R(P) = 0$ otherwise. Equation 16 thus allows us to neglect prices which are not important and hence models the actual composition of the supply curve more accurately. Please note that reconstructing the demand curve is not necessary since this curve is perfectly inelastic. Figure 3 provides a graphical representation of the prognosticated curves after the curve reconstruction was carried out.

The obtained results
To test the model, a rolling window study was conducted. The size of the window was equal to one day, whereas the out-of-sample period was equal to the year 2017. The comparison between the modified and the original X-models with the naive benchmark is provided in Table 2 below. Please note that the definitions of the MAE and RMSE values are analogous to those in the original paper or in e.g. [Uniejewski et al., 2017].

                 MAE     RMSE    Average execution time (min)
Naive            9.97    11.90   -
X-model orig.    6.21    7.54    4.34
X-model inel.    5.12    6.45    1.40

Besides the lower MAE and RMSE values, the conducted DM-test also supports the superiority of the modified X-model, with the corresponding p-value being equal to $2 \times 10^{-9}$. Therefore, following Table 2 and the previous discussion, the modified X-model outperforms the original one in two major aspects.
The first aspect is the execution speed. The modified X-model requires on average 1.40 minutes to deliver the results, whereas the original model needs on average 4.34 minutes. Please note that the execution speed may vary depending on the specification of the lasso model and its parameters. Yet, the obtained results demonstrate that the modified version of the X-model is roughly three times faster. The improvement occurs because the number of variables in the modified version of the X-model is almost halved.
The second aspect is the quality of results. As has been mentioned earlier, the modified X-model is more robust to outliers present in the initial auction curves data because the cumulative effect of outliers is absent from the demand side of the modified version of the X-model. This means that the demand curve is approximated more accurately in the modified version of the X-model. In turn, this leads to a significant improvement in accuracy. Figure 4 shows the forecast for the supply and demand curves (depicted in blue and red, respectively) delivered by the modified X-model against the true data (depicted in yellow). Moreover, an example of the equilibrium price and volume forecasts can be seen in Figure 5 below. As can be seen, the modified X-model proves its suitability for both volume and price forecasting. Moreover, as the above discussion demonstrates, the modified X-model is superior to the original one in both quality and speed.
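The error measures and the test used in the comparison can be sketched in Python as follows. The MAE and RMSE follow their standard definitions. The test shown is a simplified one-sided Diebold-Mariano-type statistic on absolute errors that assumes a serially uncorrelated loss differential; the exact variant used in the study is not specified in the text, and the function names and synthetic data are hypothetical.

```python
import math
import numpy as np

def mae(actual, forecast):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(actual) - np.asarray(forecast))))

def rmse(actual, forecast):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((np.asarray(actual) - np.asarray(forecast)) ** 2)))

def dm_test(actual, forecast_a, forecast_b):
    """One-sided DM-type test; H1: forecast_b has the lower expected loss.

    Uses absolute errors as the loss and no autocorrelation correction
    (both are simplifying assumptions).
    """
    actual = np.asarray(actual, dtype=float)
    d = np.abs(actual - np.asarray(forecast_a)) - np.abs(actual - np.asarray(forecast_b))
    stat = d.mean() / math.sqrt(d.var(ddof=1) / len(d))
    p_value = 0.5 * math.erfc(stat / math.sqrt(2))  # 1 - Phi(stat)
    return float(stat), p_value

# Small worked example for the error measures.
actual = [10.0, 20.0, 30.0, 40.0]
forecast = [12.0, 18.0, 33.0, 40.0]
example_mae = mae(actual, forecast)
example_rmse = rmse(actual, forecast)

# Synthetic comparison: forecast_b has clearly smaller errors than forecast_a.
rng = np.random.default_rng(0)
truth = np.zeros(500)
forecast_a = rng.normal(scale=2.0, size=500)
forecast_b = rng.normal(scale=1.0, size=500)
stat, p_value = dm_test(truth, forecast_a, forecast_b)
```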

Conclusion
The core idea of the present paper was to provide an improvement to the famous X-model derived by [Ziel and Steinert, 2016]. Since the X-model has not been widely studied as yet, the present paper is presumably the first one which develops the X-model further. The key component of the improvement came from the transformation of the auction curves into their analogues with an inelastic demand curve. The fundamentals behind the transformation were taken from [Coulon et al., 2014]. We showed that using this method prior to applying the X-model leads to a significant improvement of the final results.
More specifically, the modified X-model was shown to work faster. The boost in the execution speed came from the fact that the demand curve after the transformation was represented by only one point instead of several price classes. Therefore, predicting the demand curve in the modified X-model is much less computationally expensive.
Moreover, the modified X-model was shown to be more robust towards outliers present in the initial auction curves data. Due to the specifics of the model, these outliers may influence the compositions of the forecasted curves significantly. This influence, in turn, may deteriorate the quality of price and volume forecasts. Since it is much simpler to predict the demand curve in the modified version of the X-model, outliers exert a weaker influence upon the model's precision. Therefore, the modified X-model yields more accurate forecasts.