Event Effects Estimation on Electricity Demand Forecasting

Abstract: We consider the problem of short-term electricity demand forecasting in a small-scale area. Electric power usage depends heavily on irregular daily events. Event information must be incorporated into the forecasting model to obtain high forecast accuracy. The electricity fluctuation due to daily events is considered to be a basis function of time period in a regression model. We present several basis functions that extract the characteristics of the event effect. When the basis function cannot be specified, we employ the fused lasso for automatic construction of the basis function. With the fused lasso, some coefficients of neighboring time periods take exactly the same values, leading to stable basis function estimation and enhancement of interpretation. Our proposed method is applied to the electricity demand data of a research facility in Japan. The results show that our proposed model yields better forecast accuracy than a model that omits event information; our proposed method resulted in roughly 12% and 20% improvements in mean absolute percentage error and root mean squared error, respectively.


Introduction
After Japan fully liberalized its retail power operation in April 2016, the number of power producers and suppliers (PPSs), that is, electricity utilities other than the ten dominating general electricity utilities, has increased significantly. The PPSs' share of electric power sales in Japan is shown in Figure 1 [1]. Their total share has significantly increased from April 2016 to June 2020 in all areas. In particular, the share approaches 25% in the Tokyo area in 2020.
Nowadays, PPSs procure more than 80% of their total supply power from the Japan Electric Power Exchange (JEPX [2]), as shown in Figure 2. Although their backup power remains low, their electric power procurement has significantly increased in the last two years, especially from September 2018 to December 2018. The reduction in procurement cost in the JEPX market has been crucial for PPSs.
In JEPX, the day-ahead (or spot) and intraday markets are widely used for electricity exchange. In the day-ahead market, contracts are made for the delivery of electricity on the following day. In the intraday market, power is delivered one hour after the order is closed. In both markets, transactions are typically made in 30-min intervals. Therefore, suppliers must provide 48 forecast values per day. Participating in JEPX involves the risk of an increase in procurement cost. For example, JEPX imposes an imbalance fee based on the difference between the actual and forecasted power demand. Another risk is the fluctuation in electricity prices. In the day-ahead market, bidding is done by a single price auction system; the price is determined by the intersection of the demand and supply curves. Therefore, the electricity price is highly affected by factors tied to the Japanese economy, such as oil prices. Thus, an accurate forecasting method is essential for PPSs to reduce their procurement cost [3]. A review of the existing forecasting techniques is presented in Section 2.
PPSs have to forecast their electricity demands for ten supply areas separately, as shown in Figure 1. The PPSs have a different number of customers in each area. Some areas may have a large number of customers. In large-scale areas, the forecast accuracy would be high even if one or a few customers change their electricity usage pattern due to some daily event because the event effect would be small compared to the total electricity demand [4]. Thus, the forecast is typically based on past demand and weather information, for example, temperature [5] and humidity [6]. However, many PPSs could have a small number of customers in one of their supply areas. Indeed, according to the data available at the Energy Information Center in Japan [7], 218 PPSs traded less than 1000 MWh of electricity in April 2020. For small-scale areas, the electricity fluctuation due to a specific customer's daily event can have a considerable effect on the total electricity demand. A forecast that does not incorporate event information can lead to poor forecast accuracy. Therefore, it is crucial to incorporate event information into forecasting models for small-scale areas. Figure 3 shows the daily demand of a research facility in 2018 (this facility is in the Kyushu region; we cannot provide further details about the facility for confidentiality reasons). The average temperature near the facility is also provided. Some of the demand data are not depicted owing to missing values or outliers, resulting in 353 complete daily demands in 2018. The average temperature has considerable influence on the demand. Indeed, the demand becomes high when the temperature is either high or low owing to air conditioner usage. In Japan, air conditioners are mainly used as a heating system in the winter season (see, e.g., [8]). We consider the relationship between temperature and demand in detail in Section 4.2. We also observe that daily demand has a weekly seasonality.
However, daily demand cannot be approximated from only the effects mentioned above. Indeed, some daily demands are exceptionally large apart from weekly seasonality. An irregular event causes the instability of demand; this facility irregularly consumes a large amount of electricity due to some experiments.
To illustrate the event effects in more detail, we depict the half-hourly electricity demand of the research facility in November 2018 in Figure 4.  Electricity demand shows weekly seasonality: demand is high on weekdays and low on weekends (4, 11, 18, and 25 November are Sundays). The routine human activity may cause daily variation during weekdays. However, we observe a significant difference in demand pattern for the same day of the week. For example, the maximum demand on 28 November was much larger than that on 21 November. An irregular event caused this large electricity demand on 28 November. We aim to construct a forecasting model that incorporates the effect of such irregular daily events.
As mentioned earlier, a large number of methodologies are available for electricity demand forecasting (please refer to Section 2 for details). Most methods do not consider event information. To incorporate the event effects into a forecasting method, we typically transform the event information into a dummy variable (i.e., a variable that takes 1 when an event occurs, and 0 otherwise) and add it to the explanatory variables. This approach reduces to standard regression analysis or machine learning techniques. Indeed, Grolinger et al. [9,10] used a dummy variable to show improvement in forecast accuracy.
However, the dummy variables approach has limitations in that it cannot incorporate detailed information on the daily event. For example, assume that the irregular event is factory operation. The electricity usage pattern of the factory can manage load shifting (see, e.g., in [11]); therefore, we can "roughly" find the electricity fluctuation due to factory operation on each time period in advance. Such detailed event information is expressed as a function of the time period. The dummy variables approach cannot directly incorporate the time period function because the dummy variables take only either 0 or 1 for each time period. Thus, the dummy variables approach disregards useful information on the daily event. To incorporate time-varying electricity fluctuation information, we must view the event effect as a time period function.
To incorporate a wide variety of event effects that depend on the time period, we employ basis expansion [12,13]; the event effect is expressed as a basis function depending on the time period. The basis function approach generalizes the dummy variable; indeed, the indicator function corresponds to the dummy variable. Moreover, the basis function expresses the event effect much more flexibly than the dummy variable. For example, Hirose [14] recently proposed a statistical model based on the varying-coefficient model with basis expansion, employing the B-spline function as the basis function to capture the COVID-19 effect. This study adopts the forecasting model used in [14] because the numerical results there show that the forecasting technique outperformed several existing machine learning techniques.
The B-spline function is known as a smooth function. However, most event effects are expressed as a non-smooth function due to a sudden change in electricity usage. Examples include the illumination in a stadium, factory operations, experiments, and supercomputers. In this study, we use event-specific basis functions. For example, when an event is in some specific time period, the basis function takes 1 for that time period, and 0 otherwise. A more flexible basis function can be constructed according to the event characteristics. The details are presented in Section 3.2.1.
A basis function can be constructed only when detailed information on the event is provided. However, in many cases, we do not have enough event information to specify the basis function. Indeed, PPSs enter into many contracts, and it would be difficult to obtain all information on each customer's events in detail. For the above-mentioned demand data in a research facility depicted in Figures 3 and 4, we do not have detailed information on the events (i.e., experiments). Therefore, we find it important to extend our proposed method even when the basis function cannot be specified; in other words, we need a procedure for automatic construction of the basis function.
To automatically generate a non-smooth basis function, we employ the fused lasso [15]. With the fused lasso, we can estimate some of the parameters in the neighbouring time periods as exactly the same values, greatly enhancing the interpretation of the basis function. The fused lasso includes a regularization parameter that controls the degree of simplicity of the basis function. For example, when the regularization parameter is sufficiently large, the basis function becomes an indicator function (i.e., dummy variables), and as it becomes small, the basis function becomes complex. We select the regularization parameter so as to minimize the forecast error. The fused lasso is typically used in life science studies (e.g., gene expression data analysis) and image processing [16], but to our knowledge, application of the fused lasso to electricity demand forecasting has not yet been examined.
The remainder of the paper is organized as follows. Section 2 briefly reviews the existing forecasting techniques. Section 3 presents our proposed method based on basis expansion both with and without basis function specification. In particular, Section 3.2 describes the main idea of our proposed method. We apply our proposed method to the electricity demand data from a research facility in Section 4. The conclusions and future direction of the study are presented in Section 5.

Literature Review
Many forecasting methods are proposed in the literature. The forecasting techniques are classified mainly into two approaches, statistical and machine learning approaches. The former builds a statistical model based on data characteristics. Several statistical models are interpretable, making it easy to develop strategies for energy-saving intervention (see, e.g., in [17,18]) and demand response (see, e.g., in [19]). Furthermore, most of the statistical methods are based on probabilistic forecasting, and therefore, it is easy to construct the prediction interval. The review of probabilistic forecasting is presented in [20]. Another approach to investigate uncertainty is to employ the fuzzy time series approach [21,22].
The traditional statistical approach includes linear regression [23–25] and time series analysis, such as autoregressive integrated moving average (ARIMA) and Kalman filter [26–29]. Nonlinear statistical modeling with smoothing spline may capture nonlinear seasonal patterns [30,31]. Several studies recently applied functional data analysis, where the daily electricity demand curves are expressed as functions [32,33]. For example, Cabrera and Schulz [32] used functional principal component analysis to decrease the number of parameters. Shah and Lisi [34] and Shah et al. [35] compared the performance of various statistical approaches, including smoothing spline, time series analysis, and functional data analysis. The lasso [36] is also a promising technique for electricity demand forecasting [37,38]; it can simultaneously perform variable selection and model estimation.
Many researchers have developed machine learning techniques. These techniques generally have high accuracy because they capture the complex nonlinear structure. In particular, numerous data resources are available owing to the spread of smart meters in recent years. State-of-the-art machine learning techniques lead to high forecast accuracy on large sample sizes, especially for short-term forecasting [39].

Statistical Model
We briefly describe a nonlinear regression model based on the varying-coefficient model given in [14]. Let $y_{ij}$ be the electricity demand at the $j$th time period on the $i$th day ($i = 1, \dots, n$; $j = 1, \dots, J$).
Typically, the number of time periods is $J = 48$; one day is divided into 30-min intervals [2]. We employ the following statistical model based on Hirose [14],

$$y_{ij} = \mu_{ij} + b^{w}_{ij} + b^{e}_{ij} + \varepsilon_{ij}, \tag{1}$$

where $\mu_{ij}$, $b^{w}_{ij}$, and $b^{e}_{ij}$ are the deterministic terms and $\varepsilon_{ij}$ is the stochastic error term with mean $E[\varepsilon_{ij}] = 0$ and variance $V[\varepsilon_{ij}] = \sigma_j^2$. The first term on the right-hand side of (1), $\mu_{ij}$, represents the effect of past demand. The second and third terms correspond to the effect of external factors; $b^{w}_{ij}$ and $b^{e}_{ij}$ are the effects of weather and irregular events, respectively ("w" and "e" in the superscript of $b_{ij}$ are abbreviations for weather and event, respectively).
The term $\mu_{ij}$ expresses the effect of the electricity demand of the past $T$ days as follows:

$$\mu_{ij} = \sum_{t=1}^{T} \alpha_{jt} \left\{ y_{(i-t)j} - b^{w}_{(i-t)j} - b^{e}_{(i-t)j} \right\},$$

where $\alpha_{jt}$ are the parameters given beforehand. In this study, we assume the AR(1) structure on $\alpha_{jt}$, which gives the weighted mean of past demand on the previous days at time period $j$. Note that the effect of past temperature and event, $b^{w}_{(i-t)j} + b^{e}_{(i-t)j}$, is removed from past demand $y_{(i-t)j}$ when calculating $\mu_{ij}$, because our model decomposes routine demand and external effects.
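As a minimal sketch of the past-demand term described above, the following Python function computes a weighted mean of past demand after removing the weather and event effects. The geometric weights proportional to `rho**t` (normalized to sum to one) are only an assumption used to illustrate an AR(1)-type weighting; the function name, the value of `rho`, and the numbers are hypothetical and not taken from the study.

```python
def past_demand_effect(past_demand, past_effects, rho=0.5):
    """Weighted mean of past routine demand for one time period j.

    past_demand[t-1]  : y_{(i-t)j} for t = 1..T (most recent day first)
    past_effects[t-1] : b^w_{(i-t)j} + b^e_{(i-t)j} for the same days
    rho               : hypothetical decay rate; alpha_t is proportional
                        to rho**t and normalized to sum to one
    """
    T = len(past_demand)
    raw = [rho ** t for t in range(1, T + 1)]
    total = sum(raw)
    alpha = [w / total for w in raw]
    # Remove external (weather + event) effects to obtain routine demand.
    routine = [y - b for y, b in zip(past_demand, past_effects)]
    return sum(a * r for a, r in zip(alpha, routine))


# Illustrative numbers only: the most recent day contains a 200 kW external effect.
mu = past_demand_effect([700.0, 520.0, 510.0, 530.0], [200.0, 20.0, 10.0, 30.0])
print(mu)
```

In this toy input every day has the same routine demand once the external effects are removed, so the weighted mean equals that common value regardless of the weights.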
The second term of the deterministic terms, $b^{w}_{ij}$, describes the daily weather effects, such as maximum temperature and average humidity. Let $s_i$ be a vector of weather on the $i$th day ($i = 1, \dots, n$). Weather effects are usually expressed as a nonlinear function [71] and depend on the time period. To incorporate these characteristics, Hirose [14] employed the varying-coefficient model, where regression coefficients are expressed as a smooth function with respect to $j$. Thus, the weather effect is expressed as a linear combination of two basis functions as follows,

$$b^{w}_{ij} = \sum_{q=1}^{Q} \sum_{m=1}^{M} \gamma^{w}_{qm} h_q(j) g_m(s_i).$$

Here, $h_q(j)$ and $g_m(s_i)$ are basis functions for regression coefficients and daily weather, respectively. $M$ and $Q$ are the numbers of basis functions. Note that $h_q(j)$ and $g_m(s_i)$ are given beforehand, and the parameters related to $b^{w}_{ij}$ are $\gamma^{w}_{qm}$. In [14], $b^{w}_{ij}$ is assumed to be smooth with respect to both $j$ and $s_i$; therefore, $h_q(j)$ and $g_m(s_i)$ are defined as B-spline functions.

Construction of Event Effect and Parameter Estimation
The main topic of this research is the construction of the event effect, $b^{e}_{ij}$. We use the basis expansion as in $b^{w}_{ij}$, but the basis function is assumed to be a non-smooth function. We present the construction of $b^{e}_{ij}$ in two cases: the basis function is specified, and the basis function is not specified. We also present the parameter estimation procedures for these cases.

Case 1: Basis Function Is Specified
As described previously, $b^{w}_{ij}$ is assumed to be a smooth function of $j$. However, in many cases, the event effect cannot be expressed as a smooth function. Therefore, we need to construct basis functions specific to the event effect. The event effect is expressed as

$$b^{e}_{ij} = \gamma^{e} r_i(j) e_i,$$

where $r_i(j)$ is a basis function depending on $j$, $\gamma^{e}$ is a parameter, and $e_i$ is the indicator function of the event on the $i$th day. Figure 5 presents three examples of non-smooth basis functions $r_i(j)$. The basis function in Figure 5a may be useful for an event that uniformly increases electricity usage. We apply basis function (a) to real data in Section 4. Figure 5b corresponds to an event where the electricity usage increases in a specific time period. Figure 5c shows a more complex basis function than those in Figure 5a,b. Basis function (c) assumes two major facilities: one used during the day, and the other used only in the early morning. A basis function that is more complex than those in Figure 5 can also be constructed according to the event characteristics.

Let $\gamma^{w}$ be a $QM$-dimensional vector whose elements consist of $\gamma^{w}_{qm}$ ($q = 1, \dots, Q$; $m = 1, \dots, M$). Let $\gamma = ((\gamma^{w})^{T}, \gamma^{e})^{T}$. Through some calculations, problem (1) is reduced to a linear regression problem [14]:

$$y = X\gamma + \varepsilon,$$

where $y$ and $X$ denote the response vector and the design matrix constructed from the data and the basis functions, respectively, and $\varepsilon$ is the error vector. We may employ ordinary least squares estimation to obtain the regression coefficient vector $\gamma$. However, as the estimate of $\gamma^{w}_{qm}$ often turns out to be negative, it is difficult to interpret the statistical model [72]. Note that interpreting the estimated model would help develop strategies for energy-saving intervention and demand response. To obtain an interpretable statistical model, following the work in [14], we employ a non-negative least squares (NNLS) estimation [73] with ridge penalty:

$$\hat{\gamma} = \mathop{\mathrm{argmin}}_{\gamma \ge 0} \left\{ \| y - X\gamma \|_2^2 + \sqrt{n}\,\lambda_1 \| \gamma \|_2^2 \right\},$$

where $\lambda_1 \ge 0$ is a regularization parameter for the ridge penalty. Here, "$\sqrt{n}$" in the second term is for consistency in ridge regression (see, e.g., [74]), but this is not essential. We then obtain the electricity demand forecast value $\hat{y}_{ij}$ as follows: $\hat{y}_{ij} = \hat{\mu}_{ij} + \hat{b}^{w}_{ij} + \hat{b}^{e}_{ij}$.
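To make the role of a specified basis function concrete, the sketch below encodes two non-smooth basis functions in the spirit of Figure 5a,b and evaluates the event effect as the product of a coefficient, the basis function, and the event indicator. The window boundaries (09:00 to 18:00), the coefficient value, and all function names are hypothetical choices for illustration; the basis functions are taken to be the same on every day.

```python
J = 48  # half-hourly time periods per day, indexed 0..47


def basis_uniform(j):
    """Figure 5a style: the event raises demand equally in every period."""
    return 1.0


def basis_window(j, start=18, end=36):
    """Figure 5b style: effect only in periods start..end-1.

    With 30-min periods, start=18 and end=36 correspond to a
    hypothetical 09:00-18:00 window.
    """
    return 1.0 if start <= j < end else 0.0


def event_effect(gamma_e, basis, event_flag):
    """Event effect for all periods of one day: gamma_e * r(j) * e_i."""
    return [gamma_e * basis(j) * event_flag for j in range(J)]


effect = event_effect(150.0, basis_window, event_flag=1)    # event day
no_event = event_effect(150.0, basis_window, event_flag=0)  # ordinary day
print(effect[17], effect[18], effect[35], effect[36])
```

With `basis_uniform`, the same call reproduces the dummy-variable approach, which is why the basis function formulation can be viewed as a generalization of it.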

Case 2: Basis Function Cannot Be Specified
In many situations, it would be difficult to specify the basis functions for an event. In this case, we need to estimate a basis function that is similar to those in Figure 5. To achieve this, first, we assume that the parameters related to the event effect are different at each time period; that is,

$$b^{e}_{ij} = \gamma^{e}_{j} e_i,$$

where $\gamma^{e}_{j}$ ($j = 1, \dots, J$) are the parameters; $J$ different parameters are assumed, one for each time period.
With this approach, the number of parameters related to the event effect turns out to be too large, resulting in unstable model estimation. Moreover, it would not be easy to interpret the event effect.
A regularization approach is needed to handle these issues.
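Let $\gamma^{e} = (\gamma^{e}_{1}, \dots, \gamma^{e}_{J})^{T}$ and $\gamma = ((\gamma^{w})^{T}, (\gamma^{e})^{T})^{T}$. To construct a basis function that is stable and easy to interpret, we employ the fused lasso [15]

$$\min_{\gamma^{w} \ge 0,\ \gamma^{e}} \ \| y - X\gamma \|_2^2 + \sqrt{n}\,\lambda_1 \| \gamma^{w} \|_2^2 + \sqrt{n}\,\lambda_2 \left( \sum_{j=2}^{J} | \gamma^{e}_{j} - \gamma^{e}_{j-1} | + | \gamma^{e}_{1} - \gamma^{e}_{J} | \right), \tag{2}$$

where $y$ and $X$ are the response vector and the design matrix of the linear regression problem derived from (1), and $\lambda_2 \ge 0$ is a regularization parameter for the fused lasso penalty. With large $\lambda_2$, we have $\gamma^{e}_{j} = \gamma^{e}_{j-1}$ for some $j$, and therefore interpretability of the event effect is greatly enhanced. An ordinary fused lasso is expressed as $\sqrt{n}\,\lambda_2 \sum_{j=2}^{J} | \gamma^{e}_{j} - \gamma^{e}_{j-1} |$; we add the term $\sqrt{n}\,\lambda_2 | \gamma^{e}_{1} - \gamma^{e}_{J} |$ to the fused lasso penalty. This is because the event effect is expected to change smoothly across the boundary between the last and first time periods. For example, when $J = 48$, the event effects at 23:30-00:00 and 00:00-00:30 should be similar. The value of $\lambda_2$ controls the degree of smoothness; the larger the value is, the smoother the basis function. Note that for a sufficiently large $\lambda_2$, we have $\gamma^{e}_{1} = \cdots = \gamma^{e}_{J}$, implying that the corresponding basis function is exactly the same as that in Figure 5a.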
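The cyclic fused lasso penalty described above can be checked numerically with a short sketch. The code below evaluates the penalty directly from neighbouring differences, including the wrap-around difference between the first and last periods, and also builds a difference matrix whose product with the coefficient vector has the same L1 norm. The function names and the toy coefficient vectors are illustrative assumptions, and the sketch assumes at least two time periods.

```python
def cyclic_fused_penalty(gamma_e, lam2, n):
    """sqrt(n) * lam2 * (sum of |neighbouring differences|, cyclic)."""
    J = len(gamma_e)
    # For j = 0, gamma_e[j - 1] is gamma_e[J - 1]: this is the added
    # wrap-around term linking the first and last time periods.
    total = sum(abs(gamma_e[j] - gamma_e[j - 1]) for j in range(J))
    return (n ** 0.5) * lam2 * total


def cyclic_difference_matrix(J):
    """J x J matrix D such that (D gamma)_j = gamma_j - gamma_{j-1} (cyclic)."""
    D = [[0.0] * J for _ in range(J)]
    for j in range(J):
        D[j][j] = 1.0
        D[j][j - 1] = -1.0  # column J-1 when j = 0
    return D


def l1_norm_of_product(D, gamma_e):
    """L1 norm of the matrix-vector product D gamma."""
    return sum(abs(sum(d * g for d, g in zip(row, gamma_e))) for row in D)


step = [0.0, 0.0, 5.0, 5.0, 0.0, 0.0]  # piecewise-constant effect, two jumps
flat = [3.0] * 6                       # constant effect, as in Figure 5a
penalty_step = cyclic_fused_penalty(step, lam2=1.0, n=4)
penalty_flat = cyclic_fused_penalty(flat, lam2=1.0, n=4)
print(penalty_step, penalty_flat)
```

A constant coefficient vector incurs no penalty, which is consistent with the remark that a sufficiently large $\lambda_2$ drives the estimate toward the uniform basis function.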
To solve (2), we provide the following algorithm.
1. Set initial values of the parameters.
2. Repeat the following procedures until convergence.
(a) For fixed $\gamma^{e}$, minimize (2) with respect to $\gamma^{w}$. The problem is exactly the same as the NNLS, and then a numerical algorithm for the NNLS, such as in [72], can be directly used.
(b) For fixed $\gamma^{w}$, minimize (2) with respect to $\gamma^{e}$. This problem reduces to the generalized lasso problem with the penalty term expressed as $\sqrt{n}\,\lambda_2 \| D\gamma^{e} \|_1$, where $D$ is the matrix whose rows form the differences $\gamma^{e}_{j} - \gamma^{e}_{j-1}$ ($j = 2, \dots, J$) and $\gamma^{e}_{1} - \gamma^{e}_{J}$. The generalized lasso is solved using the path algorithm presented in [16]. We remark that the generalized lasso covers a wide variety of lasso-type penalties, including the lasso, fused lasso, and trend filtering [75].
The objective function (2) is bounded below (it is always nonnegative), and the above algorithm does not increase the objective function at any iteration. Therefore, the sequence of objective function values obtained using the above algorithm is guaranteed to converge.
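The convergence argument can be illustrated with a toy alternating minimization. The sketch below is not the algorithm for (2) itself; it is a two-variable stand-in, chosen as an assumption for illustration, in which one block has a ridge term with a non-negativity constraint (mimicking the NNLS step) and the other has an absolute-value penalty (mimicking the lasso-type step). Both block updates have closed forms, and the recorded objective values never increase.

```python
def objective(a, b, lam=1.0):
    """Toy objective: squared error + ridge on a + L1 penalty on b."""
    return (a + b - 3.0) ** 2 + 0.5 * a ** 2 + lam * abs(b)


def update_a(b):
    """Minimize over a >= 0 with b fixed (clipped closed-form solution)."""
    return max(0.0, 2.0 * (3.0 - b) / 3.0)


def update_b(a, lam=1.0):
    """Minimize over b with a fixed: soft-thresholding of c = 3 - a."""
    c = 3.0 - a
    shrunk = max(abs(c) - lam / 2.0, 0.0)
    return shrunk if c >= 0 else -shrunk


def alternate(num_iter=60):
    """Alternate the two block updates and record the objective values."""
    a, b = 0.0, 0.0
    history = [objective(a, b)]
    for _ in range(num_iter):
        a = update_a(b)
        history.append(objective(a, b))
        b = update_b(a)
        history.append(objective(a, b))
    return a, b, history


a_hat, b_hat, history = alternate()
print(a_hat, b_hat, history[0], history[-1])
```

Because each block update solves its subproblem exactly, the objective cannot increase, and since it is bounded below by zero the recorded values settle to a limit.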

Real Data Analysis
We present the analysis of electricity demand data in a research facility as shown in the Introduction. The data are collected at 30-min intervals from 1 January 2016, to 31 December 2018. We consider the day-ahead electricity demand forecast in 2018; the 2016 and 2017 data are used only for training.

Irregular Events
As described in the Introduction, the electricity demand of the research facility is highly affected by events. We investigate the importance of event effects by observing the demand in November 2018. Note that we choose November only for illustration; in November, the research facility frequently conducted experiments and the temperature effect was small. In this case, it is easy to understand how an event affects the demand. We observe the event effects during other months, and find a similar tendency to November.
The actual event dates are observed only partially for confidentiality reasons. However, the electricity demand on event days is observed. Therefore, the event dates are subjectively determined as follows: an event is regarded as having occurred when the total electricity demand of the buildings exceeds 750 kW, and as not having occurred otherwise. Although this rule for determining event occurrence is subjective, a similar categorization is adopted in [76] to distinguish between usual and unusual days for peak demand forecasting. Figure 6 shows the daily electricity demand due to an event. We observe that the threshold of 750 kW is reasonable for judging whether an event occurs. November has 13 event dates. Figure 8 shows the demand in 30-min intervals for the 13 event days. The event effect seems to be unstable, making the construction of a basis function difficult. Nevertheless, we observe that events mainly occur during working time, and not when the facility is closed. Thus, the fused lasso in (2) is expected to provide higher forecast accuracy than the simple basis function in Figure 5a.
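The threshold rule for labelling event days can be sketched as follows. The text does not state whether the 750 kW threshold is applied to the daily peak or to another daily summary; the sketch assumes, purely for illustration, that a day is flagged when any half-hourly value of total demand exceeds the threshold. The function name and the toy profiles are hypothetical.

```python
EVENT_THRESHOLD_KW = 750.0  # threshold stated in the text


def label_event_days(daily_profiles, threshold=EVENT_THRESHOLD_KW):
    """Return the event indicator e_i (1 or 0) for each day.

    daily_profiles : list of per-day lists of half-hourly demand in kW
    Assumption: a day is an event day if its peak demand exceeds the threshold.
    """
    return [1 if max(profile) > threshold else 0 for profile in daily_profiles]


# Toy profiles with three periods per day instead of 48, for brevity.
profiles = [
    [400.0, 620.0, 500.0],  # ordinary weekday
    [410.0, 910.0, 530.0],  # day with a large experiment
    [300.0, 320.0, 310.0],  # weekend
]
event_flags = label_event_days(profiles)
print(event_flags)
```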

Model Setting
We need to specify the input and output data to construct a forecast model. In this study, the input data are past demand, daily weather, and event information. For past demand, we use all the past demand data up to the day prior to the forecast day. For example, for the electricity demand forecast on 11 November 2018, we use the past demand in 30 min intervals from 1 January 2016 to 10 November 2018. In this case, the ratio of the sample sizes of the training and test datasets is about 0.95:0.05.
For daily weather information, we use the average temperature. Other daily temperature information, such as maximum temperature, may be used. Nevertheless, the forecast accuracy based on average temperature is generally slightly better than that based on maximum temperature for this dataset. The average temperature near the facility is obtained from the Japan Meteorological Agency [77] website. Figure 9 shows the demand and temperature at some specific time periods on weekdays in 2018. Basis expansion is employed for curve fitting using five B-spline functions. For easy interpretation, we exclude the demand related to events. The facility uses more electricity on high- and low-temperature days than on moderate-temperature days due to air conditioner usage. The curves slightly differ among time periods, suggesting that our proposed approach based on the varying-coefficient model with basis expansion properly captures the temperature effect.
We also observe that the curve shapes differ between days of the week. Therefore, we construct the statistical models by day of the week separately; thus, we construct seven statistical models. To forecast the demand, we select the model that matches the day of the week. All national holidays are regarded as Sunday.
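The selection of one of the seven day-of-week models, with national holidays treated as Sundays, can be sketched as below. The holiday set is a small illustrative sample rather than a complete calendar, and the function name is hypothetical.

```python
from datetime import date

SUNDAY = 6  # datetime.date.weekday(): Monday is 0 and Sunday is 6


def model_index(day, holidays):
    """Index (0..6) of the day-of-week model used to forecast `day`.

    National holidays are mapped to the Sunday model.
    """
    return SUNDAY if day in holidays else day.weekday()


# Illustrative holiday set: 23 November 2018 (Labour Thanksgiving Day, a Friday).
holidays_2018 = {date(2018, 11, 23)}

print(model_index(date(2018, 11, 21), holidays_2018))  # a Wednesday
print(model_index(date(2018, 11, 23), holidays_2018))  # holiday, Sunday model
print(model_index(date(2018, 11, 25), holidays_2018))  # a Sunday
```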
We compare the performance of our proposed and existing methods. For all forecasting methods, we consider cases with and without event information. All of the forecasting techniques include tuning parameters that influence forecast accuracy. We specify a list of candidates for tuning parameters and select a set of tuning parameters so as to minimize the past three months' forecast error. We set $T = 4$; that is, we use the demand in the past four days to forecast the demand. As indicators for forecast accuracy evaluation, we use the root mean squared error (RMSE) and mean absolute percentage error (MAPE):

$$\mathrm{RMSE} = \sqrt{ \frac{1}{nJ} \sum_{i=1}^{n} \sum_{j=1}^{J} ( y_{ij} - \hat{y}_{ij} )^2 }, \qquad \mathrm{MAPE} = \frac{100}{nJ} \sum_{i=1}^{n} \sum_{j=1}^{J} \frac{ | y_{ij} - \hat{y}_{ij} | }{ y_{ij} },$$

where $\hat{y}_{ij}$ is the forecast value of $y_{ij}$ ($i = 1, \dots, n$; $j = 1, \dots, J$). We present the forecasting techniques in detail below.
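The two accuracy indicators can be computed with a few lines of code. The sketch below uses the standard definitions of RMSE and MAPE (the latter in percent) on flattened lists of actual and forecast values; the function names and the three data points are illustrative and are not results from the study.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error over paired observations."""
    squared = [(y - f) ** 2 for y, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, forecast):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    ratios = [abs(y - f) / abs(y) for y, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


actual = [100.0, 200.0, 400.0]    # illustrative demand in kW
forecast = [110.0, 180.0, 400.0]  # illustrative forecasts in kW
print(rmse(actual, forecast), mape(actual, forecast))
```

RMSE weights large absolute errors more heavily, whereas MAPE measures errors relative to the demand level, so the two indicators can rank methods differently.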

Proposed Method
As Figure 8 shows, the event effect impact highly depends on the time period. It is not easy to create a basis function as in Figure 5c. Thus, we apply the fused lasso in (2) to generate a basis function. For comparison, we use the simple basis function in Figure 5a. When event information is not included, our proposed method reduces to Hirose's [14] method.
Our proposed model is similar to that of Fan and Hyndman [78] in the sense that they applied smooth spline functions and forecasted half-hourly demand. However, they did not employ the varying-coefficient model; the regression coefficients associated with the weather effect were different in each time period. With this approach, the number of parameters becomes large and the regression coefficients cannot be smooth across the time period, resulting in an unstable estimation. Another difference between our proposed method and that of Fan and Hyndman [78] is the transformation of demand. Fan and Hyndman [78] showed that the logarithm transformation resulted in the best fit among the various transformations presented by Box and Cox [79]. Thus, it is worth trying the logarithm transformation in our proposed model. Accordingly, we employ nine estimation approaches. Following the work in [14], we use the cubic B-spline for the temperature effect and the cyclic B-spline for the varying-coefficient model. When we do not employ the logarithm transformation, the candidates of the regularization parameter for ridge regression, $\lambda_1$, are 10 grid points from $1$ to $10^{-3}$ on a log scale. Similarly, for $\lambda_2$, we specify 10 candidates from $10^{3}$ to $1$ on a log scale. When the logarithm transformation is employed, these candidates are divided by 100; this is because average demand divided by the logarithm of average demand is approximately 100. We also specify candidates for $M$ and $Q$ as $M = 5, 10$ and $Q = 5, 10$. We use the R package genlasso to implement the generalized lasso in VCM-FL. For NNLS implementation, we use the nnls package in R.
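The candidate grids for the regularization parameters can be generated as follows. This is a sketch that assumes "on a log scale" means equally spaced exponents between the two endpoints; the helper name `log_grid` is hypothetical. Under this assumption the grid for $\lambda_2$ contains values close to 46.4 and 464.

```python
import math


def log_grid(start, stop, num):
    """`num` values from `start` to `stop`, equally spaced on a log10 scale."""
    log_start, log_stop = math.log10(start), math.log10(stop)
    step = (log_stop - log_start) / (num - 1)
    return [10.0 ** (log_start + k * step) for k in range(num)]


lambda1_grid = log_grid(1.0, 1e-3, 10)  # ridge penalty candidates
lambda2_grid = log_grid(1e3, 1.0, 10)   # fused lasso penalty candidates

# Candidates used with the logarithm transformation: divided by 100.
lambda2_grid_log = [v / 100.0 for v in lambda2_grid]

print([round(v, 1) for v in lambda2_grid])
```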

Existing Methods
To investigate the performance of our proposed method, we employ several existing machine learning techniques. We employ the SVR, lasso, Random Forest (RF), XGBoost (XGB), LightGBM (LGBM), and ARIMA as existing methods. These are frequently used in electricity forecasting. For details, refer to the literature review section, Section 2. We use R packages, randomForest, ksvm, glmnet, xgboost, and lgbm to implement the above algorithms.
For a fair comparison, the inputs/outputs used in the machine learning techniques are exactly the same as those used in our proposed method. For SVR, RF, XGBoost, LightGBM, and lasso, the forecast is based on the demand for the past $T$ days with the same time period, average temperature, and event; that is, $y_{ij} \approx f(y_{i-1,j}, y_{i-2,j}, \dots, y_{i-T+1,j}, s_i, e_i)$. For the ARIMA model, we apply the ARIMA($T$, $d$, $q$) expressed as

$$y_{ij} - y_{i-d,j} = c_j + \varepsilon_{ij} + \sum_{t=1}^{T} \phi_{tj} y_{i-t,j} + \sum_{t=1}^{q} \theta_{tj} \varepsilon_{i-t,j} + \gamma^{w}_{j} s_i + \gamma^{e}_{j} e_i,$$

where $c_j$, $\phi_{tj}$, $\theta_{tj}$, $\gamma^{w}_{j}$, and $\gamma^{e}_{j}$ are model parameters and $\varepsilon_{ij}$ ($i = 1, \dots, n$) are white noise. The above model implies that the model fitting is done separately for each time period. The detailed settings for the existing techniques, including the tuning parameter candidates and input normalization, are presented in the Supplementary Material.
In addition to the above methods, we use a hybrid model. We select the top three models that minimize the RMSE and define the forecast value by the sample mean of the three methods' forecast values. Our empirical observations are as follows.

• The MAPE is generally large for all methods because this facility does not cover large-scale areas and hence has unstable electricity usage. For large-scale areas, without event information, the VCM in [14] and most machine learning techniques can result in less than 4% of MAPE, as shown in [14].
• For almost all methods, event information improves the accuracy.
• When event information is included, the VCM-FL-Log yields the best performance in both RMSE and MAPE. Basis expansion successfully captures the effects of both daily temperature and events. In addition, the logarithm transformation successfully improves accuracy when the event information is incorporated into the model.
• VCM-FL performs much better than VCM-S, implying that the event effect highly depends on the time period.
• The varying coefficient model performs slightly better than the separate regression.
• Interestingly, VCM-S provides a larger MAPE than VCM does, although the latter does not incorporate event information. A simple basis function can result in poor forecast accuracy when the event effect highly depends on the time period.
• For existing methods, SVR, RF, XGBoost, and LightGBM perform better than the lasso and ARIMA. This is probably because the lasso and ARIMA can capture only the linear relationship between temperature and demand, whereas SVR, RF, XGBoost, and LightGBM can capture the nonlinear structure.
• XGBoost and LightGBM yield similar accuracy because both are based on the gradient boosting decision tree.
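The hybrid model described in this section, which averages the forecasts of the three models with the smallest RMSE, can be sketched as follows. The model names, RMSE values, and forecasts are placeholders for illustration and are not results from the study.

```python
def hybrid_forecast(forecasts_by_model, rmse_by_model, top_k=3):
    """Average the forecasts of the `top_k` models with the smallest RMSE.

    forecasts_by_model : dict mapping model name to a list of forecasts
    rmse_by_model      : dict mapping model name to its validation RMSE
    """
    best = sorted(rmse_by_model, key=rmse_by_model.get)[:top_k]
    horizon = len(forecasts_by_model[best[0]])
    averaged = [
        sum(forecasts_by_model[name][t] for name in best) / len(best)
        for t in range(horizon)
    ]
    return best, averaged


# Placeholder inputs: four hypothetical models and two forecast periods.
rmse_by_model = {"model_a": 10.0, "model_b": 12.0, "model_c": 11.0, "model_d": 20.0}
forecasts_by_model = {
    "model_a": [100.0, 200.0],
    "model_b": [110.0, 210.0],
    "model_c": [90.0, 190.0],
    "model_d": [0.0, 0.0],
}
selected, combined = hybrid_forecast(forecasts_by_model, rmse_by_model)
print(selected, combined)
```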

Interpretation of Forecast Results
In Section 4.3.1, we showed that the basis function constructed using event information greatly improves accuracy. In this section, we investigate how well the fused lasso improves accuracy by interpreting the forecast results. For illustration, we focus on the demand in November 2018 shown in Figure 4. A similar tendency is observed for other months in 2018. Figure 10 shows the forecast values estimated by VCM-FL and those estimated by VCM. Actual electricity demand is also depicted.
The result shows that the fused lasso adequately captures the large electricity demand due to the facility's experiments. For example, on 7 November 2018, the maximum difference of two forecast values was roughly 315 kW, resulting in a 12% improvement in MAPE. Although VCM-FL generally produces larger or almost the same forecast values compared to VCM, a reversal occurs on 21 November. Interestingly, no event occurred on 21 November, and VCM-FL yielded better accuracy than VCM. We use the demand of the past four days on Wednesdays to forecast the demand, and the event occurred on the latest past two days out of the T = 4 days (i.e., 7 and 14 November). With VCM, the electricity fluctuations caused by the events on 7 and 14 November resulted in unnecessary increase in forecast values.
The forecasts $\hat{y}_{ij}$, along with $\hat{\mu}_{ij}$ and $\hat{b}_{ij}$, of VCM-FL and those of VCM are shown in Figure 11. First, we find that the cyclic pattern of $\hat{\mu}_{ij}$ has the same tendency over time for VCM-FL and VCM; the working time takes a larger value of $\hat{\mu}_{ij}$ than the closed time. However, the curve shapes of $\hat{b}_{ij}$ are completely different between the two methods. The VCM-FL can approximately capture the effect of the irregular experiments carried out in the research facility buildings. Meanwhile, the VCM cannot capture this because the event information is not included in the model. For interpretation, we show how the fused lasso's regularization parameter affects the basis function estimation. We estimate the basis function using the data of Thursdays from 1 January 2016 to 31 October 2018. Figure 12 shows the basis functions made with the fused lasso for various regularization parameter values $\lambda_2$. As the value of $\lambda_2$ increases, the basis function for neighboring time periods tends to take exactly equal values, thanks to the penalty $\sqrt{n}\,\lambda_2 \| D\gamma^{e} \|_1$ in (2). In November 2018, we select $\lambda_2 = 1$, where the basis function becomes complex in the daytime but takes a constant value at midnight. For some other months, we select a relatively large value of $\lambda_2$, such as $\lambda_2 = 46.4$. In this case, the basis function is constant during the afternoon as well as at midnight. When $\lambda_2 = 464$, the basis function does not depend on the time period, resulting in exactly the same basis function as in Figure 5a.

Conclusions
We constructed a statistical model for forecasting the electricity demand affected by external factors, such as average temperature and irregular events. The event effect was expressed as a basis function, and several basis functions representing the event effect were presented. When the basis function could not be specified, we employed the fused lasso to automatically produce a basis function whose neighbouring time periods were likely to take exactly equal values. We applied our proposed method to the demand data of a research facility that consumed a large amount of electricity due to irregular events. The results showed that incorporating event information improved the forecast accuracy significantly in both MAPE and RMSE.
Our future research plan is to improve accuracy by incorporating variables other than daily temperature. For example, Xie et al. [80] discussed the usage of relative humidity; their empirical results showed that relative humidity has a role in improving accuracy. However, the improvement was minor when the relative humidity and temperature interaction was not included; indeed, Xie et al. [80] recommended the inclusion of eight variables related to relative humidity in the linear regression model. For our model, however, we need basis expansion, and the incorporation of eight variables would cause a significant increase in number of parameters, resulting in poor forecast accuracy. A more elaborate consideration is needed to incorporate information other than temperature into our proposed model, but this is beyond the scope of this research. We would like to take this up as a future research topic.
Another important research direction is to add other variables from a smart building, such as the number of people in the building, ventilation, electric lighting, solar shading, and network data. In particular, it is crucial to forecast solar power generation; solar gains are a decisive factor in the indoor environment. Energy consumption is calculated from the difference between actual demand and solar gains. In this case, the forecasting model would become highly complicated. It would thus be interesting to investigate how well our proposed approach based on the fused lasso performs when a complex statistical model is employed.
In this study, we evaluated the performance of our proposed procedure through the average demand deviation method. We also consider peak demand forecasting as an essential topic. For example, if a peak demand forecast with high accuracy could be achieved, it would enable us to conduct peak cut through the demand response [81], that is, managing the event. To forecast the peak demand, we need a technique specifically designed for peak demand forecasting, such as the transformation technique [82] and extreme value theory [83,84]. Another interesting future research topic is peak demand forecasting that takes irregular event information into account.

Patents
This research is related to Japanese patent application No. 2020-074743.