Forecasting in Blockchain-based Local Energy Markets

Increasingly volatile and distributed energy production challenges traditional mechanisms to manage grid loads and price energy. Local energy markets (LEMs) may be a response to those challenges, as they can balance energy production and consumption locally and may lower energy costs for consumers. Blockchain-based LEMs provide a decentralized market to local energy consumers and prosumers. They implement a market mechanism in the form of a smart contract without the need for a central authority coordinating the market. Recently proposed blockchain-based LEMs use auction designs to match future demand and supply. Thus, such blockchain-based LEMs rely on accurate short-term forecasts of individual households’ energy consumption and production. Often, such accurate forecasts are simply assumed to be given. The present research tests this assumption, first, by evaluating the forecast accuracy achievable with state-of-the-art energy forecasting techniques for individual households and, second, by assessing the effect of prediction errors on market outcomes in three different supply scenarios. The evaluation shows that, although a LASSO regression model is capable of achieving reasonably low forecasting errors, the costly settlement of prediction errors can offset and even surpass the savings brought to consumers by a blockchain-based LEM. This shows that, due to prediction errors, participation in LEMs may be uneconomical for consumers, and thus prediction errors have to be taken into consideration in the pricing mechanisms of blockchain-based LEMs.

(1) There are several other benchmark models commonly used in energy load forecasting. Most

LSTM RNN is an advanced architecture of RNN that is particularly well suited to learn long sequences or time series due to its ability to retain information over many time steps [28]. LSTM units [31] extend RNN units by an additional state. This state can retain information for as long as needed. At which step this additional state is updated, and at which step the information it retains is used in the transformation of the input, is controlled by three so-called gates [32]. These three gates have the form of a simple RNN cell. Formally, by slightly adapting the notation of [33] (who use $h_{t-1}$ instead of $s_{t-1}$, whereas the notation used here, $s_{t-1}$, accounts for the modern LSTM architecture with peephole connections), the gates can be written as

$$i_t = \sigma(W_{ix} x_t + W_{is} s_{t-1} + b_i),$$

$$f_t = \sigma(W_{fx} x_t + W_{fs} s_{t-1} + b_f),$$

$$o_t = \sigma(W_{ox} x_t + W_{os} s_{t-1} + b_o),$$

where $\sigma$ is the sigmoid activation function $\sigma(z) = \frac{1}{1+e^{-z}}$, $W$ denotes the weight matrices that are intuitively labelled ($ix$ for the weight matrix of gate $i_t$ multiplied with the input $x_t$, etc.), and $b$ denotes the bias vectors. Again following the notation of [33], the full algorithm of an LSTM unit is given by the three gates specified above, the input node,

$$g_t = \tanh(W_{gx} x_t + W_{gs} s_{t-1} + b_g),$$

the internal state of the LSTM unit at time step $t$,

$$s_t = g_t \odot i_t + s_{t-1} \odot f_t,$$

where $\odot$ is pointwise multiplication, and the output at time step $t$,

$$h_t = \tanh(s_t) \odot o_t.$$

LSTM RNNs are capable of learning highly complex, non-linear relationships in time series [...]

[...] the hyperparameters of all layers should be tuned simultaneously. However, due to computational constraints, that was not possible here and, thus, the described second-best option was chosen. As the hyperparameter values specified in Table 1 [...] Tesla P100 GPUs.
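The gate logic described above can be sketched in a few lines of plain Python. This is a minimal illustration and not the implementation used in the study: it assumes tanh as the activation of the input node and of the output, feeds $s_{t-1}$ into the gates as in the notation used here, and the function names and the `params` layout are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def affine(W_x, x, W_s, s, b):
    """Return W_x x + W_s s + b for list-of-lists matrices and list vectors."""
    return [
        sum(w * v for w, v in zip(row_x, x))
        + sum(w * v for w, v in zip(row_s, s))
        + b_j
        for row_x, row_s, b_j in zip(W_x, W_s, b)
    ]


def lstm_step(x_t, s_prev, params):
    """One forward step of a single LSTM layer.

    params maps each of "i", "f", "o", "g" to a tuple (W_x, W_s, b)
    (hypothetical layout chosen for this sketch).
    """
    pre = {k: affine(W_x, x_t, W_s, s_prev, b) for k, (W_x, W_s, b) in params.items()}
    i_t = [sigmoid(z) for z in pre["i"]]    # input gate
    f_t = [sigmoid(z) for z in pre["f"]]    # forget gate
    o_t = [sigmoid(z) for z in pre["o"]]    # output gate
    g_t = [math.tanh(z) for z in pre["g"]]  # input node
    # Internal state: gated new input plus gated previous state.
    s_t = [g * i + s * f for g, i, s, f in zip(g_t, i_t, s_prev, f_t)]
    # Output: squashed internal state, scaled by the output gate.
    h_t = [math.tanh(s) * o for s, o in zip(s_t, o_t)]
    return h_t, s_t
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the input node to 0, so the internal state is simply halved at each step; this makes the role of the forget gate easy to see.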

5: Generate target values $y$ by aggregating data to 15-min intervals.
6: Transform time series in data set $\Psi_i$ and add calendar features.
7: Set up training and validation data generators according to parameter tuple $\langle b, d \rangle$.
8: Split data set $\Psi_i$ into training data set $\Psi_{i,tr}$ and testing data set $\Psi_{i,ts}$.
12: Train LSTM RNN $\zeta_i$ with data batches $\phi_{train} \subseteq \Psi_{i,tr}$ supplied by training data generator.
13: Evaluate performance with mean absolute error $\Lambda_k$ on cross-validation data batches $\phi_{val} \subseteq \Psi_{i,tr}$ supplied by validation data generator.
14: until $\Lambda_{k-1} - \Lambda_k < 0.001$ for the last 3 epochs.
16: Set up testing data generator according to tuple $\langle b, d \rangle$.
17: Generate predictions $\hat{y}_i$ with batches $\phi_{ts} \subseteq \Psi_{i,ts}$ fed by testing data generator into LSTM RNN $\zeta_i$.
18: Calculate error measures $\Theta_i$ to assess performance of $X_i$.

19: Write prediction vector $\hat{y}_i$ into column $i$ of matrix $P$.

Formally, the LASSO estimator can be written as

$$\hat{\beta}^{lasso} = \arg\min_{\beta} \left\{ \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1 \right\},$$

where $X$ is a matrix with row $t$ being $[1 \; x_t^T]$ (the length of $x_t^T$ is the number of lag-orders $n$ included), and $\lambda$ is a parameter that controls the level of sparsity in the model, i.e., which lag-orders are included to predict $y_{t+1}$. This model specification selects the best recurrent pattern in the energy [...]

[...] The response vector consists of single consumption values in 15-minute aggregation.
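To illustrate how the penalty parameter λ switches lag-orders on and off, the following sketch fits a LASSO by cyclic coordinate descent with soft-thresholding. It is an illustration under stated assumptions, not the estimation routine used in the study: the loss is taken as the unscaled residual sum of squares (other scalings only rescale λ), the intercept is left unpenalised, and `soft_threshold` and `lasso_cd` are hypothetical names.

```python
def soft_threshold(rho, tau):
    """Shrink rho towards zero by tau; values inside [-tau, tau] become 0."""
    if rho > tau:
        return rho - tau
    if rho < -tau:
        return rho + tau
    return 0.0


def lasso_cd(X, y, lam, n_iter=500):
    """Minimise sum_t (y_t - b0 - x_t' beta)^2 + lam * sum_j |beta_j|.

    X is a list of rows (lagged values), y the list of targets.
    Returns the intercept b0 and the coefficient list beta.
    """
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    col_sq = [sum(X[t][j] ** 2 for t in range(n)) for j in range(p)]
    resid = list(y)  # residuals y - b0 - X beta for the current coefficients
    for _ in range(n_iter):
        # Unpenalised intercept: absorb the mean of the residuals.
        shift = sum(resid) / n
        b0 += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant-zero column carries no information
            # Correlation of column j with the partial residual (beta_j removed).
            rho = sum(X[t][j] * (resid[t] + X[t][j] * beta[j]) for t in range(n))
            new = soft_threshold(rho, lam / 2.0) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid = [resid[t] - X[t][j] * delta for t in range(n)]
                beta[j] = new
    return b0, beta
```

With `lam=0` the routine converges to the ordinary least-squares fit, while a sufficiently large `lam` sets every lag coefficient to exactly zero, so that only the intercept (the mean of the targets) remains.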

The detailed description of the model estimation and prediction is presented in Procedure 2. As

After generating the predictor matrix for the model estimation, the optimal λ is found in a K-fold

4: Generate target values $y$ by aggregating data to 15-min intervals.
5: Split data set $\Psi_i$ into training data set $\Psi_{i,tr}$ and testing data set $\Psi_{i,ts}$.
6: Generate predictor matrix $M_{tr}$ by slicing time series $\Psi_{i,tr}$ with sliding window.
9: Split predictor matrix $M_{tr}$ into $K$ folds.
10: for $k$ in $K$ do
11:     Select fold $k$ as CV testing set and folds $j \neq k$ as CV training set.
12:     for each $l_s$ in $\{l_s\}_{s=1}^{L}$ do
13:         Compute vector $\beta_{k,l_s}$ on CV training set.
14:         Compute mean absolute error $\Lambda_{k,l_s}$ on CV testing set.
17: For each $\beta_{k,l_s}$ calculate average mean absolute error $\bar{\Lambda}_s$ across the $K$ folds.
18: Select cross-validated $\lambda$-value $l_s^{CV}$ with the highest regularization (min. no. of non-zero $\beta$-coefficients) within one SD of the minimum $\bar{\Lambda}_s$.
19: Compute $\beta_{l_s^{CV}}$ on complete predictor matrix $M_{tr}$.
20: Generate predictor matrix $M_{ts}$ by slicing time series $\Psi_{i,ts}$ with sliding window.
21: Generate predictions $\hat{y}_i$ from predictor matrix $M_{ts}$ and coefficients $\beta_{l_s^{CV}}$.
22: Calculate error measures $\Theta_i$ to assess performance.
23: Write prediction vector $\hat{y}_i$ into column $i$ of matrix $P$.
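Two steps of the procedure above lend themselves to a short sketch: building the predictor matrix with a sliding window, and selecting the most regularised λ whose cross-validated error lies within one standard deviation of the minimum. The code below is a minimal illustration; the function names are hypothetical, and it assumes that "one SD" refers to the sample standard deviation of the fold errors at the error-minimising λ.

```python
import math


def sliding_window(series, n_lags):
    """Slice a series into rows of n_lags consecutive values plus the next value.

    Returns (M, y): row t of M holds series[t:t + n_lags], and y[t] is the
    value that immediately follows that window.
    """
    M = [series[t:t + n_lags] for t in range(len(series) - n_lags)]
    y = series[n_lags:]
    return M, y


def one_sd_lambda(lambdas, cv_mae):
    """Pick the largest lambda within one SD of the minimum mean CV error.

    cv_mae[k][s] is the mean absolute error of fold k for lambdas[s].
    """
    K = len(cv_mae)
    means = [sum(fold[s] for fold in cv_mae) / K for s in range(len(lambdas))]
    s_min = min(range(len(lambdas)), key=means.__getitem__)
    # Sample standard deviation of the fold errors at the best lambda.
    sd = math.sqrt(sum((fold[s_min] - means[s_min]) ** 2 for fold in cv_mae) / (K - 1))
    eligible = [lam for lam, m in zip(lambdas, means) if m <= means[s_min] + sd]
    return max(eligible)  # a larger lambda means stronger regularisation
```

Choosing the largest eligible λ rather than the error-minimising one trades a small amount of cross-validated accuracy for a sparser model with fewer lag-orders.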

Error measures

Forecast inaccuracy is measured by a variety of norms. The $L_1$-type MAE is defined as the average of the absolute differences between the predicted and true values [37]:

$$\mathrm{MAE} = \frac{1}{N} \sum_{t=1}^{N} \lvert \hat{x}_t - x_t \rvert,$$

where $N$ is the length of the forecasted time series, $\hat{x}_t$ the forecasted value and $x_t$ the observed [...]

Electronic copy available at: https://ssrn.com/abstract=3658115

Absolute error measures are not scale independent, which makes them unsuitable for comparing the prediction accuracy of a forecasting model across different time series. Therefore, they are complemented with the percentage error measures MAPE and NRMSE normalized by the true value:

$$\mathrm{MAPE} = \frac{100}{N} \sum_{t=1}^{N} \left\lvert \frac{\hat{x}_t - x_t}{x_t} \right\rvert$$

and

$$\mathrm{NRMSE} = 100 \sqrt{\frac{1}{N} \sum_{t=1}^{N} \left( \frac{\hat{x}_t - x_t}{x_t} \right)^2}.$$

However, as [39] point out, using $x_t$ as denominator may be problematic as the fraction [...] is normalized with the in-sample mean absolute error of the persistence model forecast:

$$\mathrm{MASE} = \frac{\mathrm{MAE}}{\mathrm{MAE}_{\text{in-sample, persistence}}}.$$

In summary, in the present research, the forecasting performance of the LSTM RNN and the LASSO were evaluated using MAE, RMSE, MAPE, NRMSE, and MASE.

[...] would not pay more than the energy utility's price per kWh. However, this assumes that the agents do not consider any non-price related preferences, such as strongly preferring local renewable energy [6]. Third, for each trading slot (i.e., every 15-minute interval), the bids and asks are ordered in price-time precedence. If the total supply is lower than the total demand, the lowest bid price that can still be served determines the equilibrium price. If the total supply is higher than the total demand, the overall lowest bid price determines the equilibrium price. In the case of over- or undersupply, the residual amounts are traded at the feed-in (12.31 EURct/kWh) or the regular [...]

[...] household consumption values with very rare, extreme spikes. Four more consumers were excluded due to conspicuous regularity in daily or weekly consumption patterns. Lastly, one consumer was excluded not due to peculiarities in the consumption patterns but due to missing data. As the inclusion of this shorter time series would have led to difficulties in the forecasting algorithms, this data set was excluded as well.
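The five error measures used in the evaluation (MAE, RMSE, MAPE, NRMSE, and MASE) can be sketched as plain Python functions. This is an illustration under assumptions rather than the evaluation code of the study: the percentage measures are scaled by 100, NRMSE normalises each error by the observed value, and MASE divides the out-of-sample MAE by the in-sample MAE of a one-step persistence forecast on a training series. All function names are hypothetical.

```python
import math


def mae(pred, obs):
    """Mean absolute error."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


def rmse(pred, obs):
    """Root mean squared error."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def mape(pred, obs):
    """Mean absolute percentage error (undefined if any observation is 0)."""
    return 100.0 * sum(abs((p - o) / o) for p, o in zip(pred, obs)) / len(obs)


def nrmse(pred, obs):
    """RMSE of the errors normalised by the observed values, in percent."""
    return 100.0 * math.sqrt(sum(((p - o) / o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def mase(pred, obs, train):
    """MAE scaled by the in-sample MAE of the persistence forecast on `train`."""
    naive_mae = sum(abs(b - a) for a, b in zip(train, train[1:])) / (len(train) - 1)
    return mae(pred, obs) / naive_mae
```

For example, `mase(pred, obs, train)` below 1 indicates that the forecast beats the persistence benchmark on average, which is what makes the measure comparable across households with very different consumption levels.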

Notably, the sum of underestimation errors is higher across the data sets than the sum of overestimation [...]

The average performance of the three prediction models across all 88 data sets is shown in Table 3.

As can be seen, LASSO [...]

Interestingly, there are some consumer data sets which exhibit consumption patterns that are apparently much harder to predict than those of the other data sets. This is exemplified by the heatmap displayed in Figure 2. It confirms that there is quite some variation among the same prediction methods across different households. Therefore, one may conclude that there is no "golden industry standard" approach for households' very short-term energy consumption forecasting. Nevertheless, it is obvious that the LASSO model performed best overall. Hence, the predictions on the last quarter of the data produced by the fitted LASSO model for each consumer data set will be used for the evaluation of the following market simulation.

[Figure 2: heatmap by prediction model (naive, LASSO, LSTM) and consumer ID (c001 to c100).]

[...] in supply has to be compensated by energy purchases from the grid. This means, the more severe the undersupply, the more energy has to be purchased from the grid, and the more the LEM price surpasses the equilibrium price. In summary, one can conclude that the market outcomes are the more favourable to consumers, the more locally produced energy is offered by prosumers. Assuming [...]

[...] This is due to the above-mentioned need to settle prediction errors at unfavourable terms.
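The relation between undersupply, grid purchases, and the LEM price can be made concrete with a small sketch of one trading slot. It follows the pricing rule stated in the market description (the lowest bid that can still be served sets the equilibrium price under undersupply, the overall lowest bid under oversupply), but it is a simplification under assumptions: ask prices are ignored, a partially served bid counts as served, the LEM price is taken as the quantity-weighted average of the equilibrium price and the grid price, and the grid price of 28 ct/kWh in the usage note is purely hypothetical. The function name `clear_slot` is likewise an assumption.

```python
def clear_slot(bids, supply_kwh, grid_price):
    """Clear one 15-minute slot.

    bids: list of (price_ct_per_kwh, quantity_kwh) in order of arrival.
    Returns (equilibrium_price, lem_price); the equilibrium price is None
    when there is no local supply at all.
    """
    demand = sum(qty for _, qty in bids)
    if demand <= 0:
        raise ValueError("at least one bid with positive quantity is required")
    # Highest price first; sorted() is stable, so earlier bids win ties.
    ordered = sorted(bids, key=lambda bid: bid[0], reverse=True)
    if supply_kwh >= demand:
        eq_price = ordered[-1][0]  # oversupply: overall lowest bid price
        local = demand
    else:
        eq_price = None
        remaining = supply_kwh
        for price, qty in ordered:
            if remaining <= 0:
                break
            eq_price = price  # lowest bid that can still be (partly) served
            remaining -= qty
        local = supply_kwh
    residual = demand - local  # shortfall that must be bought from the grid
    if local > 0:
        lem_price = (eq_price * local + grid_price * residual) / demand
    else:
        lem_price = grid_price
    return eq_price, lem_price
```

For instance, `clear_slot([(20.0, 1.0), (25.0, 2.0), (22.0, 1.0)], 2.5, 28.0)` yields an equilibrium price of 22 ct/kWh, and because 1.5 of the 4 kWh demanded come from the grid, the resulting LEM price lies above it.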

The percentage loss induced by prediction errors is shown in Table 5. Depending on the supply scenario, it ranges between about 4.8 % and 13.75 %. These numbers have to be judged relative to the savings that are brought to consumers by the participation in an LEM. It turns out that, in the balanced supply scenario, the savings due to the LEM are almost completely offset by the loss due to prediction errors. Because consumers profit more from an LEM the lower the equilibrium prices are, this is not the case in the oversupply scenario. Here, the savings are substantial and amount to about 130 %, which is almost ten times more than the percentage loss due to the prediction errors. However, the problem of the settlement structure for prediction errors becomes very apparent in the undersupply scenario. Here, the savings due to an LEM are more than offset by the loss due to prediction errors.
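How a percentage loss of this kind can arise is illustrated by the following sketch. It rests on an assumed settlement rule that is not taken from the text: the consumer buys the predicted quantity at the LEM price, any shortfall is bought at the grid price, and any surplus is sold at the feed-in tariff of 12.31 EURct/kWh mentioned in the market description. The function names and the prices used in the usage note are hypothetical.

```python
FEED_IN_CT = 12.31  # feed-in tariff in EURct/kWh, as stated in the market description


def cost_with_errors(pred, actual, lem_price, grid_price, feed_in=FEED_IN_CT):
    """Total cost when predicted quantities are bought on the LEM and
    deviations are settled afterwards (assumed settlement rule)."""
    total = 0.0
    for p, a in zip(pred, actual):
        total += p * lem_price              # quantity committed in the auction
        if a > p:
            total += (a - p) * grid_price   # underestimation: buy the shortfall
        else:
            total -= (p - a) * feed_in      # overestimation: sell the surplus
    return total


def percentage_loss(pred, actual, lem_price, grid_price, feed_in=FEED_IN_CT):
    """Extra cost relative to a perfect forecast, in percent."""
    perfect = sum(actual) * lem_price
    with_errors = cost_with_errors(pred, actual, lem_price, grid_price, feed_in)
    return 100.0 * (with_errors - perfect) / perfect
```

With a LEM price of 20 ct/kWh and a grid price of 28 ct/kWh (both hypothetical), `percentage_loss([1.0, 1.0], [1.5, 0.5], 20.0, 28.0)` evaluates to roughly 19.6: although the two errors cancel in quantity, each is settled at a worse price than the LEM price, so the consumer loses on both.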

The second option addresses the demand and supply structure in the blockchain-based LEM. As was shown in Section 4.2, the cost induced by prediction errors and their settlement is more than [...] unclear on what basis the restriction to participate in the market should be grounded.

The third option to mitigate the problem is the market mechanism and the prediction error [...] more radical approach might be to change the market mechanism of closed double auctions altogether and use an exposed market instead. Hereby, the energy consumption and production is settled in an [...] LEMs.

In the performance assessment of currently used forecasting techniques, the LASSO model yielded the best results with an average MAPE across all consumer data sets of 17 %. It was subsequently used to make predictions for the market simulation. The evaluation of the market mechanism and prediction error settlement structure revealed that, in a balanced supply and demand scenario, the costs of prediction errors almost completely offset the savings brought by the participation in the LEM. In an undersupply scenario, the cost due to prediction errors even surpassed the savings and made market participation uneconomical. The most promising approach to mitigate this problem seemed to be an adjustment of the market design, which can be two-fold: either shorter trading periods could be introduced, which would reduce the forecasting horizon and therefore prediction errors, or the auction mechanism could be altered to not use predicted consumption values to settle transactions.

For the present research, data from a higher number of smart meters and more context information about the data would have been desirable. Also, the large-scale differences in the production capacities of the prosumers contained in the data further complicated the analysis of the market simulation. Additionally, it should be noted that the market simulation did not account for taxes or fees, especially grid utilization fees, which can be a substantial share of the total electricity cost of households. The simulation also did not take into account compensation for blockchain miners that reimburses them for the computational cost they bear.

Evidently, future research concerned with blockchain-based LEMs should take into account the potential cost of prediction errors. Furthermore, to our knowledge, no simulation of a blockchain-based LEM with actual consumption and production data has been conducted so far. Doing so on a private blockchain with the market mechanism coded in a smart contract should be the next step for the assessment of potential technological and conceptual weaknesses.

In conclusion, previous research has shown that blockchain technology and smart contracts combined with renewable energy production can play an important role in tackling the challenges of climate change. The present research, however, emphasizes that advancement on this front cannot be made without a holistic approach that takes all components of blockchain-based LEMs into account. Simply assuming that reasonably accurate energy forecasts for individual households will be available