Beating the Naïve—Combining LASSO with Naïve Intraday Electricity Price Forecasts

Abstract: In the last three decades the vast majority of electricity price forecasting (EPF) research has concerned day-ahead markets. However, the rapid expansion of renewable generation (mostly wind and solar) has shifted the focus to intraday markets, which can be used to balance the deviations between positions taken in the day-ahead market and the actual demand and renewable generation. A recent EPF study claims that the German intraday, continuous-time market for hourly products is weak-form efficient, that is, that the best predictor for the so-called ID3-Price index is the most recent transaction price. Here, we undermine this claim and show that we can beat the naïve forecast by combining it with a prediction of a parameter-rich model estimated using the least absolute shrinkage and selection operator (LASSO). We further argue that, if augmented with timely predictions of fundamental variables for the coming hours, the LASSO-estimated model itself can significantly outperform the naïve forecast.


Introduction
After performing a comprehensive empirical study on intraday electricity price forecasting and considering models with tens of thousands of regressors, Narajewski and Ziel [1] conclude that the German continuous-time market for hourly products is weak-form efficient, that is, that the best predictor is the most recent transaction price. Their result is surprising and at the same time disappointing from a research perspective. Here, we undermine their claim and show that it is possible to build models that significantly outperform the naïve benchmark. Consequently, we invalidate the conjecture that the German intraday market for hourly products is weak-form efficient.
This paper belongs to a new strand of literature on forecasting prices in intraday electricity markets. To date, the workhorse of power trading in Europe has been the uniform price auction, and a vast majority of research and applications have concerned day-ahead (DA) electricity prices [2]. However, the rapid expansion and integration of renewable energy sources (most notably wind and solar), active demand side management (smart meters, smart appliances, etc.) and the introduction of the XBID pan-European trading platform have shifted the focus to intraday markets [3][4][5]. One of the more liquid, and hence more studied, marketplaces is the German intraday market for quarter-hourly and hourly products [6][7][8][9][10][11][12]. In this continuous-time market, the majority of trading takes place in the last couple of hours before gate closure [13] and on the hourly products [1]; the latter are traded from 15:00 on day d − 1 until 5 min before the delivery starts on day d, or 30 min before if the trade is made between the delivery zones. The leading reference price is the so-called ID3-Price index (or simply ID3), which is also an underlying instrument of exchange-traded derivative products (see www.eex.com). The index is computed as the volume-weighted average price of all trades on the quarter-hourly and hourly products in the three-hour window directly preceding the delivery (see www.epexspot.com).
In this article, we focus on predicting the ID3-Price index a few hours ahead and develop regression-type models that outperform the naïve benchmark. To this end, we consider a large set of past ID3 values, past DA prices and forward-looking fundamental variables, and utilize the least absolute shrinkage and selection operator (LASSO) [14] to eliminate regressors with low explanatory power, as well as apply forecast averaging [15]. By comparing the performance of different model structures, we draw important conclusions regarding variable selection and provide recommendations for very short-term electricity price forecasting.
The remainder of the paper is structured as follows. In Section 2, we introduce the dataset and discuss the use of variance stabilizing transformations (VSTs). Next, in Section 3 we describe the naïve approach proposed in Narajewski and Ziel [1] and introduce the model structures used in our study. In Section 4, we compare the predictive performance in terms of two commonly used error measures and the Giacomini and White [16] test for conditional predictive ability. Finally, in Section 5, we wrap up the results and conclude.

The ID3-Price Index and DA Prices
The ID3-Price index takes into account only the most recent trades, that is, transactions that took place no earlier than 3 h before delivery. EPEX SPOT SE publishes the index; however, the currently covered period is too short for a comprehensive evaluation of the forecasts. Therefore, following Narajewski and Ziel [1] and Uniejewski et al. [11], we use an ID3-like time series reconstructed from the individual transactions and denote it by $\mathrm{ID3}^{d,h}$, where d is the day and h is the hour of delivery, see the top panel in Figure 1. In addition to past ID3 values, we also use prices from the German day-ahead (DA) market, see the middle panel in Figure 1; to better see the differences between the two price series, both are plotted during a sample 4-week period in the bottom panel. Recall that the DA prices are set around noon on day d − 1 for all 24 h of day d; we denote them by $\mathrm{DA}^{d,h}$.
Both time series are of hourly resolution and span 1216 days ranging from 1 January 2015 to 30 April 2018. We consider a rolling estimation scheme with a 364-day window. Initially, we fit our models to data from hour 1 on 1 January 2015 to hour 20 on 30 December 2015 and compute the price forecasts for the first hour of 31 December 2015; note that there is a 4 h lag between the last known ID3-Price index and the predicted hour. Next, the window is rolled forward by one hour and the predictions for hour 2 of 31 December 2015 are generated. This procedure is repeated until forecasts for hour 24 of 30 April 2018 are made, that is, the last hour in the 852-day-long out-of-sample test period.
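The rolling scheme described above can be sketched in a few lines of Python. This is only an illustration, not the code used in the study: the 1-based hour numbering, the constant names and the helper `to_day_hour` are assumptions introduced here, and the window boundaries are simply read off the first window given in the text (hour 1 of 1 January 2015 to hour 20 of 30 December 2015, target hour 1 of 31 December 2015).

```python
# Hours are numbered 1, 2, ... starting at hour 1 of 1 January 2015
# (an indexing convention assumed for this sketch only).
HOURS_PER_DAY = 24
TOTAL_DAYS = 1216      # 1 January 2015 to 30 April 2018
WINDOW_DAYS = 364      # length of the rolling calibration window
GAP_HOURS = 4          # hours between the last known ID3 and the target hour


def rolling_windows():
    """Yield (first_hour, last_known_hour, target_hour) for every forecast."""
    first_target = WINDOW_DAYS * HOURS_PER_DAY + 1   # hour 1 of 31 Dec 2015
    last_target = TOTAL_DAYS * HOURS_PER_DAY         # hour 24 of 30 Apr 2018
    for target in range(first_target, last_target + 1):
        last_known = target - GAP_HOURS - 1          # e.g. hour 20 of 30 Dec 2015
        first_hour = target - WINDOW_DAYS * HOURS_PER_DAY
        yield first_hour, last_known, target


def to_day_hour(index):
    """Convert a 1-based hour index to a 1-based (day, hour) pair."""
    day, hour = divmod(index - 1, HOURS_PER_DAY)
    return day + 1, hour + 1


windows = list(rolling_windows())
print(len(windows) // HOURS_PER_DAY)    # number of out-of-sample days
print(to_day_hour(windows[0][1]))       # last known hour of the first window
```

Counting the generated windows gives 852 × 24 forecasts, which agrees with the length of the out-of-sample test period stated in the text.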

Exogenous Variables
The set of exogenous variables considered in this study includes three pairs of time series that describe the demand-supply relationship in Germany: the day-ahead forecasts $\hat{X}^{d,h}_i$ and the actual values $X^{d,h}_i$ of the system-wide load, wind power generation (WPG) and photovoltaic generation (PVG). The forecasts are plotted in Figure 2; the corresponding actual values $X^{d,h}_i$ of the fundamental variables are indistinguishable from them at this resolution. Naturally, the latter are known ex-post, hence only their lagged values can be used for forecasting. As discussed in Section 3, we utilize them by constructing a series of forecast errors, that is, the differences between $\hat{X}^{d,h}_i$ and $X^{d,h}_i$, for the time moments for which the actual values are available; we assume that $X^{d,h}_i$ is known immediately after its hourly period ends, that is, at (d, h + 1). Although this is an assumption, advances in on-line data collection significantly reduce the latency from the data source to the data provider, to the extent that in the near future this may become reality.

Figure 2. Three forward-looking fundamental time series: system-wide load forecasts (top), wind generation forecasts (middle) and solar generation forecasts (bottom) for the period from 1 January 2015 to 30 April 2018. All three are published on day d − 1 and concern the 24 h of day d. As in Figure 1, the vertical dashed lines mark the beginning of the 852-day-long out-of-sample test period.

As Goodarzi et al. [3] argue, wind and photovoltaic generation forecasting errors increase the absolute levels of system imbalance in Germany and these in turn influence electricity prices. Hence, we additionally use a set of balancing volumes $B^{d,h-5}_i$ for the three (i = 1, 2, 3) quarter-hourly periods directly preceding the time at which the forecast is made, that is, the period spans the first 45 min of the hour preceding the moment of computing the forecast. As in Narajewski and Ziel [1], $B^{d,h}_i$ is defined as the sum of imbalances of all German Transmission System Operators for day d and hour h; these data are published every quarter-hour, 15 min after the end of the delivery.

Variance Stabilizing Transformation
Following the recommendations put forward by Uniejewski et al. [17], we use the so-called variance stabilizing transformation (VST) to reduce the impact of extreme observations present in demand, generation and particularly in electricity price data. Before applying the VST, each variable is standardized by subtracting the sample median and dividing by the sample median absolute deviation (MAD) or by the sample standard deviation if MAD = 0, corrected by the 75th percentile of the standard normal distribution $z_{0.75}$:
$$\xi = \frac{\psi - \operatorname{median}(\boldsymbol{\psi})}{\operatorname{MAD}(\boldsymbol{\psi})}\, z_{0.75}, \tag{1}$$
where $\boldsymbol{\psi}$ is the in-sample vector of a given variable, $\psi$ is a single element of $\boldsymbol{\psi}$ and $\xi$ its standardized value. However, unlike earlier studies, we apply the standardization to each variable separately due to a large number of zero-valued observations in the PVG series. Then, we use a well-performing VST, the area hyperbolic sine (asinh), on $\xi$:
$$\phi = \operatorname{asinh}(\xi) = \log\left(\xi + \sqrt{\xi^{2} + 1}\right), \tag{2}$$
where $\phi$ is the VST-transformed value of $\psi$. The back-transformation is trickier. Uniejewski et al. [17] simply set:
$$\hat{\psi} = \sinh(\hat{\phi})\, \frac{\operatorname{MAD}(\boldsymbol{\psi})}{z_{0.75}} + \operatorname{median}(\boldsymbol{\psi}). \tag{3}$$
However, Narajewski and Ziel [1] argue that the latter is not correct since in most cases $\mathbb{E}\sinh(X) \neq \sinh(\mathbb{E}X)$. As a remedy, they propose using the following, mathematically correct back-transformation:
$$\hat{\psi} = \frac{1}{D}\sum_{i=1}^{D} \sinh(\hat{\phi} + \varepsilon_i)\, \frac{\operatorname{MAD}(\boldsymbol{\psi})}{z_{0.75}} + \operatorname{median}(\boldsymbol{\psi}), \tag{4}$$
where $\varepsilon_i$ are in-sample residuals of the model and D is the size of the calibration window. In this study we compare model performance for both back-transformations to assess the loss in predictive power across models of different complexity when using the more popular [4,9,11,17,18], simpler and faster to compute, but generally incorrect transformation (3), instead of (4).
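The transformation and the two back-transformations can be illustrated with a short, standard-library-only Python sketch. It assumes, as the text describes, that the scale is MAD divided by $z_{0.75}$, that back-transformation (3) is the plain inverse of the asinh VST and that (4) averages the inverse over the in-sample residuals. The function names, the choice to leave the standard-deviation fallback uncorrected and the toy price sample are assumptions made for this example only.

```python
import math
from statistics import NormalDist, median, stdev

Z_075 = NormalDist().inv_cdf(0.75)   # 75th percentile of N(0, 1), roughly 0.6745


def fit_scaler(sample):
    """Estimate (location, scale) on the in-sample vector of one variable."""
    loc = median(sample)
    mad = median(abs(x - loc) for x in sample)
    # Fallback to the sample standard deviation when MAD = 0 (assumed to be
    # used without the z_0.75 correction; this fails if all values are equal).
    scale = mad / Z_075 if mad > 0 else stdev(sample)
    return loc, scale


def vst(x, loc, scale):
    """Standardize and apply the area hyperbolic sine."""
    return math.asinh((x - loc) / scale)


def back_simple(phi_hat, loc, scale):
    """Back-transformation (3): plain inverse of the VST."""
    return math.sinh(phi_hat) * scale + loc


def back_corrected(phi_hat, residuals, loc, scale):
    """Back-transformation (4): inverse averaged over in-sample residuals."""
    mean_sinh = sum(math.sinh(phi_hat + e) for e in residuals) / len(residuals)
    return mean_sinh * scale + loc


# Toy in-sample prices (EUR/MWh) with one negative and one positive spike.
prices = [30.0, 32.0, 28.0, 35.0, 31.0, -50.0, 120.0]
loc, scale = fit_scaler(prices)
phi = vst(120.0, loc, scale)
print(back_simple(phi, loc, scale))
print(back_corrected(phi, [-0.5, 0.5], loc, scale))
```

For a forecast above the median, symmetric residuals make (4) return a larger value than (3), because sinh is convex on the positive half-line; this is the effect of $\mathbb{E}\sinh(X) \neq \sinh(\mathbb{E}X)$ mentioned in the text.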

The Naïve Benchmark
Recall that Narajewski and Ziel [1] conclude their empirical study of intraday hourly products by stating that the market is weak-form efficient, that is, that the best predictor is the most recent transaction price. Since we want to challenge this conjecture, as our benchmark we define: where ${}_{x}\mathrm{ID}^{d,h}_{y}$ denotes the volume-weighted price of transactions that took place in the intraday (ID) market in a y-hour window that ended x hours before delivery on day d and hour h, see Equation (2) in [1]. Using this notation the ID3-Price index can be defined as $\mathrm{ID3}^{d,h} \equiv {}_{0}\mathrm{ID}^{d,h}_{3}$, that is, the volume-weighted price of transactions that took place in the last three hours of trading (excluding the last 5 or 30 min, see Section 2).
Note that our naïve benchmark is not identical to the one used in [1], that is, $\text{Naive.MR1} \equiv {}_{3.25}\mathrm{ID}^{d,h}_{0.25}$. Instead of assuming that the trader makes the decision and places orders in a 15-min window ending 3 h before delivery, we allow for a one-hour window for making the trading decisions (between 4 and 3 h before delivery). This is illustrated in Figure 3, where the red step function represents the time the forecasts are made (4 h before delivery) and the black step function the time the delivery starts.

Figure 3. The black step function represents the moment the delivery starts (every hour of Friday, 27 April 2018), the circles refer to actual trades, with circle size indicating the traded volume (from 0.1 to 300 MWh) and color the price (see the colorbar on the right), and the red step function represents the moment the forecasts are made. For instance, at 12:00 on 27 April 2018 when forecasting the price for 16:00 (→), the most recent ID3 value is for 12:00 (∗). The grey-shaded area indicates the data used for computing the seven partial ID3 indices utilized when forecasting the price for hour 16, see Section 3.2.3 for details.

LASSO-Estimated Models
An advantage of using automated variable selection is an almost unlimited number of initially considered explanatory variables [19]. In this study, we define a baseline model with 76 potential regressors and its three extensions; the largest one takes into account 200+ explanatory variables. All considered models are estimated in a multivariate modeling framework in the sense of Ziel and Weron [20], that is, an explicit 'day × hour' matrix-like structure is used for the 24-dimensional price vectors. However, unlike when forecasting in day-ahead auction markets, where the prices are set once a day, in a continuous-time intraday market we are able to use information updated in the course of the day, for example, more recent weather forecasts.

The Baseline Model
The baseline model is a slightly modified LASSO-estimated model of Uniejewski et al. [11]. The only difference is the omission of some of the less important variables. Namely, we exclude the information about inputs distant in time and only use the latest information about past ID3 and DA prices. As a result, we obtain a model with 76 potential regressors: the 21 last known ID3-Price index values from the intraday market (that is, nearly the whole day), 24 DA prices for the target day and seven dummy variables (to account for the weekly seasonality). Given the 4-hour forecast-to-delivery lag and the time the DA prices are published, we can additionally include the 24 next-day DA prices when forecasting hours 16 to 24: where $\varepsilon^{d,h}$ is the noise term. To simplify the notation when referring to an hourly product with delivery i hours after (i > 0) or before (i < 0) the product with delivery on day d and hour h (more precisely: with delivery between hour h − 1 and h) we define: For instance, for h = 2 and 24). Note that the price for each hour is predicted 4 h in advance, hence the first sum in the above formula starts with i = 4, and uses the most recent information available, see Figure 3.
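To make the count of 76 potential regressors concrete, the following Python sketch enumerates labels for the baseline inputs. It is an illustration under stated assumptions: the label strings and the function name are hypothetical, and the ID3 lags are taken to run from i = 4 to i = 24, which is inferred from the text (21 values, with the first lag equal to 4) rather than read from the model equation.

```python
def baseline_regressors(h):
    """Labels of the potential regressors when forecasting hour h (1..24)."""
    # 21 most recent ID3 values: products delivered 4 to 24 hours earlier.
    id3_lags = [f"ID3(d,h-{i})" for i in range(4, 25)]
    # 24 day-ahead prices for the target day d.
    da_today = [f"DA(d,{k})" for k in range(1, 25)]
    # Next-day DA prices are published around noon, so with a 4-hour lag
    # they are only available when forecasting hours 16 to 24.
    da_next = [f"DA(d+1,{k})" for k in range(1, 25)] if h >= 16 else []
    # Seven weekday dummies capture the weekly seasonality.
    dummies = [f"D{k}" for k in range(1, 8)]
    return id3_lags + da_today + da_next + dummies


print(len(baseline_regressors(10)))   # morning hour: no next-day DA prices
print(len(baseline_regressors(20)))   # evening hour: all inputs available
```

The maximum of 21 + 24 + 24 + 7 = 76 labels is reached only for hours 16 to 24; for earlier hours the next-day DA prices are not yet known.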
Later in the text we denote model (6) by baseline.

The Model with Exogenous Variables
The first extension of model (6) is motivated by the results of Uniejewski et al. [21], who showed that fundamental variables play an important role when forecasting DA prices. On the other hand, Monteiro et al. [22] and Andrade et al. [23] argued that fundamentals (historical and predicted demand, generation and weather) did not have much explanatory power when forecasting Spanish intraday prices, since DA prices already included this information. To check whether fundamentals can help in forecasting the ID3-Price index in the German intraday market, we extend the baseline model to include load, wind power generation (WPG) and photovoltaic generation (PVG) forecasts and the corresponding errors, as well as the balancing volumes (see Section 2.2 for details): Later in the text we denote this model by w/exogenous.

The Model with Partial ID Prices
The second extension of model (6) is motivated by the results of Narajewski and Ziel [1]. The authors emphasize that the most important information for forecasting ID3 can be derived from recent transaction data for a given product. Hence, we extend the baseline model to include 8 additional predictors. Firstly, we add the naïve benchmark (5) as one of the explanatory variables. Secondly, we add variables that link the intraday to day-ahead markets and reflect changes in the expectations about price levels over time. More precisely, we construct artificial series that utilize the information from recent transaction data on the neighboring products. For i = −4, ..., 2, we define seven partial ID indexes:
$$\mathrm{pID}^{d,h}_{i} = \frac{\sum_{\tau} V^{d,h+i}_{\tau} P^{d,h+i}_{\tau}}{\sum_{\tau} V^{d,h+i}_{\tau}}, \tag{9}$$
where the sums run over all transactions made between 5 and 4 h before the delivery of product (d, h), and $V^{d,h}_{\tau}$ and $P^{d,h}_{\tau}$ are respectively the volume and price of a transaction made at time τ on a product with delivery on day d and hour h. Hence, $\mathrm{pID}^{d,h}_{i}$ is a volume-weighted price of all transactions on product (d, h + i) in the last hour before the forecast is computed, that is, between 5 and 4 h before the delivery. For example, to compute $\mathrm{pID}^{d,16}_{i}$, we use seven hourly windows corresponding to i = −4, ..., 2, see the grey-shaded rectangle spanning 7 hourly products in Figure 3. Note that using the ${}_{x}\mathrm{ID}^{d,h}_{y}$ notation we can write:
$$\mathrm{pID}^{d,h}_{i} \equiv {}_{4+i}\mathrm{ID}^{d,h+i}_{1}. \tag{10}$$
Finally, we can define the model with partial ID prices (later in the text we denote it by w/partial ID); in addition to the baseline regressors and the naïve benchmark, it includes the differences between the DA and partial ID prices.
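The windowed, volume-weighted prices used for ID3 and for the partial ID indexes can be sketched as follows. The trade representation (hours to delivery, volume, price), the half-open window boundaries, the function names and the toy order book are assumptions made for this example; the relation between pID and the window notation follows the verbal definition in the text (a one-hour window ending 4 h before the delivery of product (d, h)).

```python
def id_window(trades, x, y):
    """Volume-weighted price in a y-hour window ending x hours before delivery.

    trades: iterable of (hours_to_delivery, volume_mwh, price_eur) tuples for
    a single product. Returns None if no trade falls into the window.
    """
    # A trade made t hours before delivery is inside the window if x < t <= x + y
    # (which end of the window is closed is an assumption of this sketch).
    selected = [(v, p) for t, v, p in trades if x < t <= x + y]
    total_volume = sum(v for v, _ in selected)
    if total_volume == 0:
        return None
    return sum(v * p for v, p in selected) / total_volume


def partial_id(book, h, i):
    """pID for target hour h and neighboring product h + i (same day only)."""
    # The window ends 4 h before delivery of product h, that is,
    # 4 + i hours before delivery of product h + i.
    return id_window(book.get(h + i, []), 4 + i, 1)


# Toy order book keyed by delivery hour (hypothetical numbers).
book = {
    16: [(4.5, 10.0, 40.0), (4.2, 30.0, 44.0), (2.0, 5.0, 50.0)],
    14: [(2.5, 20.0, 38.0), (1.0, 10.0, 60.0)],
}
print(partial_id(book, 16, 0))      # trades on product 16, 5 to 4 h before delivery
print(partial_id(book, 16, -2))     # trades on product 14 in the same clock hour
print(id_window(book[16], 0, 3))    # ID3-like index for product 16
```

Keying the book by delivery hour keeps the example short; a real implementation would also have to handle products of the previous and the next day.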

The Full Model
Now, we are ready to write the full model (denoted later in the text by full), which includes all elements of models (8) and (11). We end up with a maximum of 222 potential regressors, depending on whether we already know the day-ahead prices for day d + 1: The final modification of the benchmark model is obtained by fixing $\beta_{222} \equiv 1$, as considered in [1]. Later in the text we denote such a model by full-diff, because it corresponds to setting the dependent variable to the difference between ID3 and the naïve benchmark, instead of the ID3-Price index itself.

LASSO Estimation
In order to explain the estimation scheme, let us use a more compact form of the regression model:
$$\mathrm{ID3}^{d,h} = \sum_{i=1}^{n} \beta_i V^{d,h}_i + \varepsilon^{d,h}, \tag{13}$$
where $V^{d,h}_i$'s are the predictors, $\beta_i$'s are the corresponding coefficients and n is the number of predictors. The least absolute shrinkage and selection operator (LASSO) shrinks the coefficients of the less important explanatory variables towards zero and hence performs variable selection [14,24]. The LASSO can be treated as a generalization of linear regression, where instead of minimizing only the residual sum of squares (RSS), the sum of RSS and a linear penalty function of the β's is minimized:
$$\hat{\boldsymbol{\beta}} = \arg\min_{\boldsymbol{\beta}} \left\{ \mathrm{RSS} + \lambda \sum_{i=1}^{n} |\beta_i| \right\}, \tag{14}$$
where λ ≥ 0 is a tuning (or regularization) parameter. Note that setting λ to zero yields the standard least squares estimator, for λ → ∞ all $\beta_i$'s tend to zero, while 0 < λ < ∞ admits a balance between minimizing the RSS and shrinking the coefficients.
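The effect of the penalty can be illustrated with a minimal coordinate-descent solver for the objective "RSS plus λ times the sum of absolute coefficients". This is a textbook algorithm written in plain Python for clarity, not the estimation code used in the study; the function names, the fixed number of sweeps and the toy design matrix are assumptions of this sketch, and practical work would rely on an optimized library solver.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_cd(X, y, lam, n_sweeps=200):
    """Minimize sum((y - X beta)^2) + lam * sum(|beta|) by coordinate descent."""
    n_obs, n_reg = len(X), len(X[0])
    beta = [0.0] * n_reg
    col_sq = [sum(X[r][j] ** 2 for r in range(n_obs)) for j in range(n_reg)]
    for _ in range(n_sweeps):
        for j in range(n_reg):
            if col_sq[j] == 0.0:
                continue  # an all-zero regressor keeps a zero coefficient
            # Correlation of regressor j with the residual that ignores j.
            rho = 0.0
            for r in range(n_obs):
                fit_without_j = sum(
                    X[r][k] * beta[k] for k in range(n_reg) if k != j
                )
                rho += X[r][j] * (y[r] - fit_without_j)
            # The factor 1/2 comes from differentiating the squared residuals.
            beta[j] = soft_threshold(rho, lam / 2.0) / col_sq[j]
    return beta


# Toy example: one strong and one weak regressor with orthogonal columns.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
for lam in (0.0, 1.0, 100.0):
    print(lam, lasso_cd(X, y, lam))
```

With λ = 0 the sketch returns the least squares solution, a moderate λ removes the weak regressor while shrinking the strong one, and a very large λ sets both coefficients to zero, which mirrors the three cases discussed above.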
Selecting a 'good' value for λ is critical. It is, however, a complex problem [9,12,19]. Because of a relatively short dataset, we are not able to reselect λ based on model performance in a validation period. Instead, we have decided to use cross-validation. It can be effectively applied to select the tuning parameter ex-ante, unfortunately at a cost of increased computational complexity. The procedure is discussed in more detail in Section 5.2.

Forecast Averaging
Combining forecasts in order to obtain more precise and robust predictions is a technique known both in the electricity price forecasting literature [15] and in forecasting in general [25]. Here, we use an ensemble of two predictions, namely a simple arithmetic average of the forecast of a LASSO-estimated model (labeled Z) and the naïve forecast:
$$\widehat{\mathrm{ID3}}^{d,h}_{\mathrm{ens}(Z)} = \frac{1}{2}\left(\widehat{\mathrm{ID3}}^{d,h}_{Z} + \widehat{\mathrm{ID3}}^{d,h}_{\text{naïve}}\right). \tag{15}$$
The motivation for using the arithmetic mean is twofold. Firstly, it is the simplest averaging scheme, requiring no additional calibration. Secondly, it is hard to beat by 'more sophisticated' approaches [26].

Forecast Evaluation
The forecasting accuracy is assessed in terms of two error measures: the mean absolute error (MAE) and the root mean squared error (RMSE). The scores are reported for the full out-of-sample test period of D = 852 days, that is, 31 December 2015 to 30 April 2018, see Figure 1, jointly for all hours of the day:
$$\mathrm{MAE}_Z = \frac{1}{24D}\sum_{d=1}^{D}\sum_{h=1}^{24} \left|E^{d,h}_Z\right|, \qquad \mathrm{RMSE}_Z = \sqrt{\frac{1}{24D}\sum_{d=1}^{D}\sum_{h=1}^{24} \left(E^{d,h}_Z\right)^2},$$
where $E^{d,h}_Z = \mathrm{ID3}^{d,h} - \widehat{\mathrm{ID3}}^{d,h}_Z$ is the prediction error for model Z, for day d and hour h. Recall that the RMSE is the optimal measure for least squares problems, whereas the MAE is more robust to outliers [24]. The resulting aggregate MAE and RMSE scores can be used for a direct comparison of the forecasts, but do not allow us to draw statistically significant conclusions. Therefore, we use the Giacomini and White [16] test for conditional predictive ability (CPA), which can be treated as a generalization of the more popular Diebold-Mariano test for unconditional predictive ability [2].
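The two aggregate scores are straightforward to compute once the errors are arranged as a 'day × hour' table. The sketch below uses plain Python lists; the function names and the tiny two-hour example are assumptions made for illustration (the study uses 24 hours per day and 852 days).

```python
import math


def mae(errors):
    """Mean absolute error over all days and hours of a day-by-hour table."""
    flat = [abs(e) for day in errors for e in day]
    return sum(flat) / len(flat)


def rmse(errors):
    """Root mean squared error over all days and hours."""
    flat = [e * e for day in errors for e in day]
    return math.sqrt(sum(flat) / len(flat))


# Toy errors for two 'days' with two 'hours' each (EUR/MWh).
errors = [[3.0, -4.0], [0.0, 0.0]]
print(mae(errors), rmse(errors))
```

In the toy table the single large error of −4 weighs more heavily in the RMSE than in the MAE, which is why the text describes the MAE as the more outlier-robust of the two measures.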
First, for each pair of models, say $Z_1$ and $Z_2$, following Uniejewski et al. [17] and Ziel and Weron [20], we compute the so-called multivariate loss differential series:
$$\Delta^{d}_{Z_1,Z_2} = \left\|E^{d}_{Z_1}\right\|_p - \left\|E^{d}_{Z_2}\right\|_p,$$
where $\|E^{d}_{Z}\|_p = \left(\sum_{h=1}^{24} |E^{d,h}_{Z}|^p\right)^{1/p}$ is the p-th norm of the 24-dimensional vector of out-of-sample errors for model Z. Then, we calculate the p-values of the CPA test with null $H_0: \alpha = 0$ in the regression:  [20]. Due to the strong intraday seasonality we cannot use the standard approach, where forecasts for all hours are treated as one univariate time series and tested jointly. On the other hand, reporting test results for 24 hourly time series independently would require much more space.
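The daily loss differential that enters the CPA test can be sketched as below. Only the construction of the series is shown; the test regression itself is not reproduced here. The function names and the two-hour toy vectors are assumptions of this example, and p = 1 and p = 2 are used to mirror the linear and quadratic errors reported in the results.

```python
def p_norm(errors_day, p):
    """p-th norm of the vector of hourly errors for one day."""
    return sum(abs(e) ** p for e in errors_day) ** (1.0 / p)


def loss_differential(errors_a, errors_b, p):
    """Daily series of ||E_a||_p - ||E_b||_p for two competing models."""
    return [p_norm(a, p) - p_norm(b, p) for a, b in zip(errors_a, errors_b)]


# Toy example: one 'day' with two 'hours' per model.
errors_a = [[3.0, 4.0]]
errors_b = [[1.0, 1.0]]
print(loss_differential(errors_a, errors_b, 1))   # linear errors
print(loss_differential(errors_a, errors_b, 2))   # quadratic errors
```

A positive value means that the first model had the larger daily error norm, so a series that is positive on average favors the second model.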

MAE and RMSE Errors
In Table 1 we report the MAE and RMSE metrics for all considered models and their ensembles with the naïve benchmark, as defined in Equation (15). In Figure 4 we additionally visualize the set of results corresponding to back-transformation (4), reflecting the upper part of Table 1. Several important conclusions can be drawn:

• In terms of the MAE, three models outperform the naïve benchmark even without averaging forecasts. However, only the full-diff approach manages to beat the benchmark in terms of the RMSE, see the values emphasized in bold in Table 1 in columns labeled 'model'.
• All baseline model extensions yield lower errors than the baseline model itself, both in terms of the MAE and RMSE.
• The full model outperforms the model with partial ID prices, which suggests that using the exogenous variables discussed in Section 2.2 improves forecast accuracy.
• On average, back-transformation (4) proposed by Narajewski and Ziel [1] (the upper part of Table 1) performs slightly better than the originally introduced one (the lower part of Table 1). For this reason, in what follows we only discuss the results of back-transformation (4).
• Interestingly, for the full-diff model we observe that back-transformation (3) performs better than the mathematically correct VST defined in Equation (4). The difference vanishes when the forecasts are averaged, which is probably caused by the fact that the correction improves performance mainly in the tails, and in the full-diff model the less heavy-tailed price differences are predicted.
• Apart from the full-diff model, every other model performs better when its forecasts are averaged using Equation (15). Compare the columns labeled 'model' and 'ens(model)' in Table 1.
• The improvements from averaging forecasts are much higher (ca. 12-14%) for models that do not use the naïve benchmark as a regressor. However, what is surprising, the gains are noticeable (ca. 2-4%) even for models which include this explanatory variable. Apparently, the LASSO scheme does not put enough weight on this variable. Setting $\beta_{222} = 1$ in the full-diff model helps, but does not solve the problem completely. We return to this issue in Section 4.4.

Table 1. MAE and RMSE errors for all 852 days of the out-of-sample test period, see Figure 1. The upper part of the table reports the results obtained for models which use back-transformation (4), while the lower part reports those for models which use back-transformation (3). Columns labeled 'model' refer to the models themselves, while those labeled 'ens(model)' to ensembles with the naïve benchmark, as defined in Equation (15). Errors smaller than those of the naïve benchmark are emphasized in bold.

Figure 4. Visualization of the results from the upper part of Table 1, that is, for the naïve benchmark and models that utilize back-transformation (4). The black dashed lines correspond to the benchmark, the solid bars represent the individual models and the dotted bars the corresponding ensembles.

Conditional Predictive Ability
We perform the Giacomini and White [16] test of conditional predictive ability (CPA) to check whether the differences in forecasting accuracy are statistically significant. We conduct the test only for the naïve benchmark and models that utilize back-transformation (4). The p-values of the pairwise comparisons are visualized in Figure 5. We can see that:

• The naïve forecasts can be significantly outperformed by predictions of models that include partial ID information and exogenous variables (full and full-diff models) without averaging, and by most of the models after ensembling.
• Forecasts of the baseline model are significantly outperformed by those of any other LASSO-estimated model.
• For all considered models, ensembling significantly improves the accuracy in terms of the linear errors.
• Forecasts of the ens(full) model significantly outperform those of any other model, both in terms of the linear and quadratic errors.

Figure 5. Results of the conditional predictive ability (CPA) test of Giacomini and White [16] for the linear (left) and quadratic (right) errors. We use a heat map to indicate the range of the p-values: the closer they are to zero (→ dark green), the more significant is the difference between the forecasts of a model on the X-axis (better) and the forecasts of a model on the Y-axis (worse).

Why Does Ensembling Improve the Results?
As the above reported results indicate, the ensemble is in most cases able to outperform both individual forecasts. However, the simple averaging scheme proposed in Equation (15) might not be optimal for this task. Hence, in this section we consider a more general formula:
$$\widehat{\mathrm{ID3}}^{d,h}_{\mathrm{ens}_w(Z)} = w\, \widehat{\mathrm{ID3}}^{d,h}_{\text{naïve}} + (1 - w)\, \widehat{\mathrm{ID3}}^{d,h}_{Z}, \tag{18}$$
where w is the weight assigned to the naïve forecast. In Figure 6 we depict the MAE of ensemble (18) as a function of w for the full model with back-transformation (4). The MAE curve is convex with a minimum at ca. w = 45%. However, the value for w = 50%, that is, the simple mean used in the study, is very close to the optimum. The reason behind this shape is the characteristic of LASSO forecasts estimated on long calibration windows. Specifically, the model is trained to generalize well, and such a behavior is reinforced by the fact that there are only a few spikes in the calibration window. As such, the model is able to better predict prices at the typically observed levels at the cost of underestimating spikes, especially negative ones, see Table 2. Note that an ensemble model is either the best performing one, or its performance is very close to the better of the full model and the naïve benchmark.
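The weighted ensemble and the weight sweep behind an MAE-versus-w curve can be sketched as follows. The function names, the grid of 101 weights and the toy numbers are assumptions made for this illustration; note also that picking w on the evaluated sample, as done here, is an ex-post exercise and not a forecasting procedure.

```python
def ensemble(model_fc, naive_fc, w=0.5):
    """Weighted average; w is the weight assigned to the naive forecast."""
    return [w * n + (1.0 - w) * m for m, n in zip(model_fc, naive_fc)]


def mae(actual, forecast):
    """Mean absolute error of a forecast series."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def best_weight(actual, model_fc, naive_fc, steps=100):
    """Grid search for the weight w in [0, 1] with the lowest MAE."""
    grid = [k / steps for k in range(steps + 1)]
    return min(grid, key=lambda w: mae(actual, ensemble(model_fc, naive_fc, w)))


# Toy data: the model overshoots and the naive forecast undershoots by 2.
actual = [10.0, 20.0, 30.0, 40.0]
model_fc = [12.0, 22.0, 32.0, 42.0]
naive_fc = [8.0, 18.0, 28.0, 38.0]
print(best_weight(actual, model_fc, naive_fc))
print(mae(actual, ensemble(model_fc, naive_fc)))
```

In this constructed example the errors of the two components cancel exactly at equal weights; real forecasts only cancel partially, which is why the curve reported in the text has a minimum near, but not exactly at, w = 50%.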
This behavior can also be observed in Figure 7, which illustrates differences in mean absolute errors, $\mathrm{MAE}_{\text{naïve}} - \mathrm{MAE}_{\text{model}}$ for model = ens(full) or full, across a range of price regimes. Overall 40 price regimes are considered (percentiles 0 to 2.5, percentiles 2.5 to 5, ..., and percentiles 97.5 to 100), with each point placed in the middle of the corresponding 2.5-percentile interval. Note that for all except the very extreme intervals the ens(full) and full models consistently outperform the naïve benchmark. On the other hand, in the tails of the price distribution the benchmark excels, likely due to the ability to quickly adapt to unexpected market situations. Therefore the ensemble (regardless of the weights) balances the generalization of the LASSO forecasts with the ability of the naïve benchmark to quickly adapt to non-recurring phenomena, with both ensemble components playing an important role in achieving this effect.

Figure 7. Differences in mean absolute errors, $\mathrm{MAE}_{\text{naïve}} - \mathrm{MAE}_{\text{model}}$, for model = ens(full) or full with back-transformation (4). The x-axis represents percentiles of the marginal distribution of prices observed in the out-of-sample test period, with each point placed in the middle of the corresponding 2.5-percentile interval. The first point of the blue curve, that is, for observations between percentiles 0 and 2.5, is out of bounds since $\mathrm{MAE}_{\text{naïve}} - \mathrm{MAE}_{\text{full}} = -1.7$ for the extremely low prices.
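The regime-wise comparison can be reproduced in outline by sorting the observations by the realized price and splitting them into 40 equally sized groups. The sketch below is an assumption-laden illustration: the function name, the rank-based definition of the percentile bins and the synthetic data are introduced here and are not taken from the study.

```python
def regime_mae_gain(actual, naive_fc, model_fc, n_bins=40):
    """MAE_naive - MAE_model within each price-percentile bin.

    Observations are ranked by the realized price and split into n_bins
    groups of (almost) equal size; positive values favor the model.
    """
    order = sorted(range(len(actual)), key=lambda k: actual[k])
    gains = []
    for b in range(n_bins):
        lo = b * len(order) // n_bins
        hi = (b + 1) * len(order) // n_bins
        idx = order[lo:hi]
        if not idx:
            gains.append(None)  # too few observations for this bin
            continue
        mae_naive = sum(abs(actual[k] - naive_fc[k]) for k in idx) / len(idx)
        mae_model = sum(abs(actual[k] - model_fc[k]) for k in idx) / len(idx)
        gains.append(mae_naive - mae_model)
    return gains


# Synthetic data: the model is exact except for the two lowest prices,
# where it misses by 5; the naive forecast is always off by 1.
actual = [float(k) for k in range(80)]
naive_fc = [a + 1.0 for a in actual]
model_fc = [a + 5.0 if a < 2.0 else a for a in actual]
gains = regime_mae_gain(actual, naive_fc, model_fc)
print(gains[0], gains[1])
```

The synthetic example imitates the pattern described above: the model wins in all regular regimes and loses in the lowest-price bin.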

Discussion and Conclusions
The motivation for this study was a claim made by Narajewski and Ziel [1] that the German intraday, continuous-time market for hourly products was weak-form efficient, that is, that the best predictor for the ID3-Price index was the most recent transaction price. Performing a comprehensive forecasting exercise involving parameter-rich regression-type models with four types of fundamental variables as inputs, we have been able to challenge their claim and show that we can significantly outperform the naïve forecast by combining it with a prediction of a LASSO-estimated model. To keep the empirical part of the paper concise, we have omitted some of the considerations. Let us now briefly discuss them.

The Moment of Forecasting the ID3-Price Index
After consulting with practitioners, we have decided to focus on a forecasting scheme used by Uniejewski et al. [11], where the predictions are made four hours before delivery. This means that a trader has an hour to make the decisions and build a long or short position before the ID3 transaction window opens three hours before delivery. However, to check whether the $\text{Naive.MR1} \equiv {}_{3.25}\mathrm{ID}^{d,h}_{0.25}$ benchmark of Narajewski and Ziel [1] can also be outperformed, we have recalculated our models in their setting. Naturally, the Naive.MR1 is harder to beat than our naïve model, because it uses more recent transaction data. Yet, the relative performance vs. the benchmark was qualitatively the same as reported in Section 4.

Selecting the LASSO Regularization Parameter
For the choice of the regularization parameter, we have resorted to using an automated cross-validation (CV) technique. More precisely, the applied CV procedure consisted of three folds with a dense logarithmic grid of 50 λ values spanning six orders of magnitude. Two thirds of the calibration sample were used for training the models estimated with different λ's, the remaining one third for testing them. This resulted in a significantly increased computational burden, due to the need to test multiple models for multiple λ's, but also allowed for an ex-ante choice of the regularization parameter. We have also performed a limited numerical experiment to compare with the results obtained for the best ex-post selected λ. As it turned out, the difference in the MAE and RMSE errors was less than 0.5%.
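The structure of such a cross-validation can be sketched as follows. Several details are assumptions of this illustration and are not stated in the text: the grid endpoints (here 10^-4 to 10^2), the use of contiguous folds, and all function names. The `toy_error` function is only a stand-in for "fit the LASSO on the training part and return the error on the test part".

```python
import math


def log_grid(low_exp=-4.0, n_values=50, decades=6.0):
    """Logarithmic grid of n_values lambdas spanning the given number of decades."""
    step = decades / (n_values - 1)
    return [10.0 ** (low_exp + k * step) for k in range(n_values)]


def three_folds(n_obs, n_folds=3):
    """Split indices 0..n_obs-1 into contiguous (train, test) folds."""
    bounds = [k * n_obs // n_folds for k in range(n_folds + 1)]
    folds = []
    for k in range(n_folds):
        test = list(range(bounds[k], bounds[k + 1]))
        train = [i for i in range(n_obs) if i < bounds[k] or i >= bounds[k + 1]]
        folds.append((train, test))
    return folds


def select_lambda(n_obs, grid, fold_error):
    """Return the lambda with the lowest error averaged over the folds."""
    folds = three_folds(n_obs)

    def cv_score(lam):
        return sum(fold_error(tr, te, lam) for tr, te in folds) / len(folds)

    return min(grid, key=cv_score)


def toy_error(train_idx, test_idx, lam):
    """Stand-in for a real fit-and-score step: lowest error near lambda = 1."""
    return math.log10(lam) ** 2


grid = log_grid()
best = select_lambda(364, grid, toy_error)
print(len(grid), best)
```

Each λ on the grid requires one model fit per fold, so the sketch makes the source of the increased computational burden explicit: 50 × 3 fits for every calibration window.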

The Impact of Intraday Updates of the Fundamentals
We have also tried to assess the impact of using more recent forecasts of the system-wide load, wind power generation, photovoltaic generation and balancing volumes. We have measured the predictive performance of our models under the assumption that we know future values of the exogenous variables until the end of the target day. With such 'perfect forecasts' we have been able to additionally reduce the forecasting error by more than 2%. This result emphasizes how important the availability of more frequently updated forecasts of the exogenous variables is in short-term forecasting.

Model Size
As mentioned above, the LASSO procedure allows for an efficient estimation of parameter-rich models. However, the quality of the obtained estimates can differ for different sizes of the regression model. Having only ca. 360 observations in the calibration window, we may obtain worse forecasts if we consider dozens or hundreds of redundant variables in the model. The full model defined by Equation (12) includes only ca. 200 potential predictors. Interestingly, it outperforms by ca. 0.6% a richer model with more than 800 variables (the same information sources, but more past observations). Therefore, we advise using expert knowledge and/or back-testing to eliminate non-informative predictors before running the LASSO.

Directions for Future Research
Given that the literature on forecasting prices in European intraday power markets is still very scarce, our study is a step towards understanding the impact of using recent transaction data and exogenous variables on the predictive performance. Our study can be further expanded in several directions. In particular, we report the results for only one VST (for more suggestions see Reference [17]) and without decomposing the data into a long-term seasonal component and the remaining stochastic part (for the importance of doing this see, for example, References [27,28]). Furthermore, we have focused on point forecasting, ignoring the full predictive distribution [8,29] or, what may be even more important in continuous-time intraday markets, the trajectories [13,30]. We have restricted ourselves to using regression-based models; however, machine learning techniques could be used in this context as well [12,22,23,31], naturally at the cost of an increased computational burden. Finally, recall from Section 4.4 that the ensemble we use balances the generalization of the LASSO forecasts with the ability of the naïve benchmark to quickly adapt to non-recurring phenomena. A potentially viable alternative would be to use the approach introduced by Hubicka et al. [32], which averages forecasts of a given model across calibration windows of different length.