Stability of Multiple Seasonal Holt-Winters Models Applied to Hourly Electricity Demand in Spain

Electricity management and production depend heavily on demand forecasts made. Any mismatch between the energy demanded with respect to that produced supposes enormous losses for the consumer. Transmission System Operators use time series-based tools to forecast accurately the future demand and set the production program. One of the most effective and highly used methods are Holt-Winters. Recently, the incorporation of the multiple seasonal Holt-Winters methods has improved the accuracy of the predictions. These forecasts, depend greatly on the parameters with which the model is constructed. The forecasters need to deal with these parameters values when operating the model. In this article, the parameters space of the multiple seasonal Holt-Winters models applied to electricity demand in Spain is analysed and discussed. The parameters stability analysis leads to forecasters better understanding the behaviour of the predictions and managing their exploitation efficiently. The analysis addresses different time windows, depending on the period of the year as well as different training set sizes. The results show the influence of the calendar effect on these parameters and if it is necessary or not to update them in order to obtain a good accuracy over time.


Introduction
The programming of the production of electrical energy is a complex task carried out by the Transmission System Operators (TSO). Their main objective is the supply of electricity, assuring the distribution. In addition, the energy production at the lowest possible cost to the consumer, without incurring losses is a key point [1]. Any mismatch between the programmed and the really demanded energy produces huge losses. Hobbs [2] determined that the losses produced by a 1% gap between planning and reality can cost up to millions of dollars. Hong et al. [3] determined that for a maximum peak demand central of 1 GW, a 1% error in the prediction can involve costs of $600,000 per year. The TSOs use for their predictions time series forecasting [4][5][6]. One of the most common used technique is the exponential smoothing methods, and especially the Holt-Winters models, due to their easy to understand and implement features [7,8]. Exponential smoothing methods use the data observed in the past to make predictions, assigning an exponentially decreasing weight to the older information against the newer. The way to assign the relevance of newer data against the older is performed by the weight assigned to smoothing parameters. The smoothing parameters are bounded in the range [0,1]. The closer to 0 the more important the older data is, and the contrary when closer to 1. The Holt-Winters model is described in Equations (1)-(4) X t+h = (S t + hT t )I t−s+h (4) where S t , and T t are the equations for the level and additive trend, with smoothing parameters α and γ. I t are the seasonal indices of length s, with smoothing parameters δ. X t is the observed data. Finally, X t+h is the equation to forecast h time instants ahead. It collects the information contained in the model and makes predictions. The introduction of double and triple seasonal models (HWT), described in [9,10], improved the accuracy of these methods, and their use is being generalized. Additionally, these methods behave very accurate for such type of series, with a strong seasonal effect [11,12]. The generalization to include up to n seasonal patterns is known as Multiple Seasonal Holt-Winters (nHWT), developed in [13]. The model is defined as in Equations (5)-(8), t−s i i = 1, . . . , n s (7) where I (i) t are the seasonal indices of length s i , with smoothing parameters δ (i) . There are as many equations as seasonal patterns allocated in the time series, n s . The ϕ AR parameter is entered to correct the model including effectively the first order autocorrelation error (ε t ), known as AR(1) adjustment.
Depending on the way the equations are defined, additive or multiplicative methods can be used. The method of combining the smoothing equations will determine the Holt-Winters model to be used. The combination provides 30 different models, according to the trend and seasonality combination, as well as AR(1) adjustment, as described in [13]. The nomenclature for these models is as follows: Three letters to describe the combination, where the first one describes the trend method, the second one the seasonality method and the last one whether the adjustment was applied. This combination is shown in Table 1. Furthermore, the model is described by adding as subscript the different seasonal patterns considered, using the length of the cycle. As an example, the AMC 24,168 one is a model with additive trend and multiplicative seasonal, the model is adjusted using first order autocorrelation error and it has two seasonalities, one daily (subscript 24) and one weekly (subscript 168).
The smoothing parameters are obtained by adjusting the model to data observed in the past, and try to reproduce the same pattern for the near future. The adjustment is done by solving a non-linear problem, in which the 1-hour-ahead forecasting error is minimised. This topic is better developed in Section 3. The smoothing parameters obtained are assumed to be optimal, and the model is then exploited. The optimality of these parameters is locally obtained, valid only for the dataset and the model selected. Moreover, the same parameter can highly vary from one method to another using the same dataset. Thus, it is a big deal for forecasters to understand the behaviour of the model considering the locally optimised parameters. The future predictions accuracy will be closely related to the smoothing parameters. The optimisation procedure is using an optimisation algorithm, that do not take care concerning the reality of the series. The forecaster needs to make use of the own experience to validate and approve the obtained parameters. These actions are crucial for the entire process. Some factors have an influence on the parameter values. The method to obtain initial values for seeding the model impact on the forecasting accuracy [14] although after an optimisation of the parameters, accuracy differences are also minimised [15]. Climate conditions are not a big deal in terms of short-term forecasting accuracy [12]. However, it seems reasonable the parameters ought to be influenced. If only two seasonal patterns are considered, seasons modify the trend, whereas if three seasonalities are considered, the intra-year seasonal parameter should be influenced. The calendar also has a huge influence on the predictions and parameters, that must deal with some irregularities of the series [16][17][18]. Therefore, forecasters should always pay close attention to the parameters values. The forecasting procedure performed by TSOs must ensure continuously accurate forecasts for the electricity system. These predictions are used by the TSO for the operational planning and unit assignment, while they are also used by the market for the spot price settlement [19,20]. The Spanish electricity market (OMIE) operates similarly [21,22], where the unique Spanish TSO, Red Eléctrica de España (REE), supplies forecasts of demand for the next week every Wednesday, for the next 24 h every day, and a revision of the daily demand every six hour [23]. The market uses this information for the bidding process, as well as REE itself to carry out operational planning. Thus, there is an enormous responsibility in such eagerly awaited forecasts. In fact, REE uses a complex algorithm [24] to provide forecasts.
With such high frequency forecasting, there is a need for the model to be updated continuously. The forecaster must deal with the varying smoothing parameters and take decisions on the engaged forecasts. In the Holt-Winters literature there is a strong discussion about whether the parameters should be continuously readjusted as the series progresses, or on the contrary, they must be immobile and try to exploit them for as long as possible. The values of the parameters are also discussed, since high values of the parameters indicate a great damping to adapt to the new observed values, compared to low values, where the initial values are given greater weight. Before this doubt, the following question is added: how do the parameters respond to a specific time series, such as the hourly electricity demand in Spain? Make it sense the obtained values? What is their stability? All these questions have not been answered in the previous literature.
This article describes how we have performed a stability analysis on the parameters of the nHWT models applied to the electricity demand series in Spain. The main objective was to understand the behaviour of smoothing parameters against several different datasets with changes in climate conditions, time series components such as trend or seasonal, calendar effects among others. This analysis will help the forecasters to face the optimised model before making forecasts, considering the values of the parameters. It is also important to check whether it is necessary or not to update the smoothing parameters of the Holt-Winters models in order to obtain a good accuracy over time. On the other hand, the results of this study will show the influence of the calendar effect on these parameters, and therefore the need to develop new Holt-Winters models that take it into account [16].
The article is organized as follows: Section 2 reviews the existing literature related to the smoothing parameters of Holt-Winters models. Section 3 presents the methodology followed in order to predict the Spanish electricity demand time series and to analyze the variability of the parameters; in Section 4 the results obtained for a given double and triple seasonal model are shown and in Section 5 the results are discussed. Finally, the conclusions reached in this article are shown.

Related Work
Short and medium term electricity demand time series forecasting were extensively studied in the literature applying both statistical methods and machine learning techniques. Recently, the machine learning methods have focused on big data [25], especially deep learning [26,27], and ensemble methodologies [28,29].
Within classical methods, the accuracy of the forecasts in the Holt-Winters models has been widely studied, especially how to select the best method and adjust the parameters [30]. As the Holt-Winters models are recursive, an initialization value is needed to feed the model. Thus, the initialization methods for Holt-Winters were also another intense field of research especially with the emergence of double and triple seasonal Holt-Winters models [31,32]. In fact, new methods to initialize the level, trend and seasonality in multiple seasonal Holt-Winters models were recently developed in [14]. These seed values have influence on the smoothing parameters' values as well as the forecasting accuracy [33]. However, when the series adjustment is made, the initial values lose influence [15]. After fitting the Holt-Winters model, the obtained smoothing parameters are assumed to be optimal for the data set. Thus, the study of their values or their stability has not been worked in depth in the literature. This is due, in large part, to the fact that the models do not allow a theoretical mathematical analysis.
Archibald [34] used 406 monthly series from the M competitions. He analyzed the models with additive seasonality, and demonstrated that the values of the parameters within the range [0,1] are not always invertible. Thus, it is necessary to use only a set range within the region of invertibility.
Lawton [35] used state spaces to analyze Holt-Winters models, and although his work focused on the normalization of the seasonal component, he also analyzed the stability of the parameters. From previous work on filters [36,37], it determined that Holt-Winters models are not asymptotically stable. The values of the eigenvectors of the stability matrices depend on the α, γ and δ parameters.
Some authors [38] worked on obtaining the limits of the smoothing parameters. They analyzed state space parameters, and stated the parameters ought to meet 0 < α < γ. Finally, the authors in [39] established a series of criteria in the parameters so that the models can be "predictable", a term that mints whether a series will be able to make forecasts with constant mean and variance. Osman et al. [40] obtained the values of the main vectors and confirmed that the models proposed in [39] produce stable predictions, but if regressors are used, they must be invariant over time.
Another interesting study related to this paper is the study of the minimum size required for the sample. Hyndman et al. [41] analyzed this situation and concluded that there is no clear answer, but that there should be many more observations than parameters. In this regard, García-Díaz and Trull [13] verified that for a double seasonal model, a data set length of one year produces more stable parameters than 8-10 weeks.

Materials and Methods
The time series used in this study is the hourly electricity demand in Spain, within the years from 2008 to 2017. It was provided by REE through its website www.ree.es. This time series shows clearly several seasonalities: the intraday which is repeated every 24 h, and the intra-weekly, which is repeated every 168 h. A third seasonality has also been included, which relates the data of the series with the seasonal periods of the year. Various strategies were proposed to deal with the 3rd seasonality, since its length depends largely on the concept to be covered. Taylor uses the calendar year, so that the years comprise 52 weeks, although to adjust the process in leap years, it uses 53 weeks [10]. Most authors have chosen to use the solar year, which includes 365.25 days, and seasonally adjusted [42]. In our case, it was considered to be 365.25 days, or 8766 h. This series was used in former works [43,44] and especially using the nHWT methods [13,14,16]. This experience showed the authors that many factors could have an impact on the model parameters, as well as its forecast accuracy, key point for the proper exploitation of the nHWT models. The influence of the initial values was analysed in [14]. However, there is still a lack of analysis regarding the meaning of the parameter space and their relation with forecasts.
The method described in this paper analyses the smoothing parameters of the nHWT forecasting procedure when working with the hourly electricity demand in Spain. It performs an analysis of the parameter values, by checking their behaviour depending on the method chosen from Table 1. That implies to analyse the stability against some influential factors derived from our experience such as seasons, years, calendar, and inner variability.
To match the previously mentioned issues, raw data must be used, avoiding any smoothing or modification of the series. Double seasonal nHWT used a dataset size of 8-10 weeks, it was chosen as this seems to be long enough for this purpose. Alternatively, an analysis of the dataset size is also performed. Triple seasonal nHWT used a dataset of three years.
Randomness in the series selection must be assured. To observe the maximum possible variability, randomly selected data sets were used, but always complying with the following premises: • At least 3 sets of data for each season of the year.
• At least there must be sets in different years, to check the repeatability of the process in similar situations in a matter of period of the year, including holidays nearby. • The data has not been filtered or the special days were altered.
As a result, for the seasonal double analysis, 250 random samples of 10 weeks were extracted, of which, the first 8 weeks were used for adjustment, while the other two weeks were used for validation. Because the climatic effect present in the series must be assessed, the samples were selected so that they belong to different climatic periods and several years. Figure 1 shows the working scheme for this analysis.
The parameters optimization was done using a minimization algorithm, Nelder-Mead's simplex [45], where the minimization function was the Root Mean of the Squared Error (RMSE), described in Equation (9).
where h is the number of samples to be predicted. Once the model was fit and the parameters obtained, 24-hours ahead forecast during two weeks were performed, and compared against the real data. The forecast accuracy was measured by using the Mean Average Percentage Error (MAPE), described in Equation (10).
The use of RMSE for the model optimisation is preferred [46]. On the contrary, the most commonly used indicator to compare forecast accuracy is MAPE [47], in special when comparing demand forecasts. The triple seasonal analysis was performed similarly, but it was needed a higher dataset to fit the model and obtain the parameters. Thus, 80 random samples from three full years and two weeks were sampled. The first set of three years was used to fit the model, whereas the two week set was used for validation purposes. In this case, only the AMC 24,168,8766 model was used for the analysis.

Results
The analysis procedure was organized inductively. First, the results of the forecasts and how they and the smoothing parameters evolve over time were observed for both double and triple seasonal Holt-Winters models, and then, the influence of the size of the adjusting set was analyzed. Table 2 shows the results of the forecasts made by double seasonal models. In particular, the average of the MAPE when predicting the 250 random samples of two weeks. Only models with the correction of the first order autocorrelation error setting are shown, since they offer better results. Removing the year 2009, the year in which the crisis was emphasized in the Spanish industrial sector until that uncertain moment, the rest of the years there is a clear stability in the forecasts.

Double Seasonal Models
Holt-Winters models are very robust to small variations. The forecast accuracy values are around 2.6% in terms of MAPE for 24 hours-ahead forecasts. In the triple seasonal case these values are around 7.9% of MAPE. The NMC 24,168 is selected as it offers the best forecast accuracy. As the MAPE provided is an average each year, a split including the months is graphed in Figure 2 as weather conditions and months may influence on the results. Clearly the winter and autumn months have higher levels of MAPE, this time the winter months when worse forecasts occur.  Table 3 shows the distribution of the parameter values according to the year and seasonal period for the model N MC 24,168 . These values were obtained using the sets for adjustment composed of the 250 random samples of 8 weeks. The total row is the mean of the parameters if they are calculated without dividing into seasons. It can be seen that there was a shift in the parameter values from autumn to winter. The values for the daily seasonality are in the order of 0.2 to 0.3 with the exception of winter, where it increases to the value of 0.4. For the intra-weekly seasonality it has a value of 0.2 and it becomes 0.5 for the winter season.   Figure 3 shows a representation of the parameters over time. It can be observed how the values follow a pattern according to the period of the year. The level and AR(1) adjustment values are slightly different from those of the previous analysis. The level values are higher while the AR(1) adjustment is lower. This is because when the trend is removed, possible long-term variations are supplied by the level. The MAPE in winter season worsens while the smoothing parameters increase their variability in autumn. The randomness of the method chosen to analyse variability makes many forecasts done in winter to use a model optimised with data from autumn season. The special events during autumn impact greatly on the winter forecasts. alpha delta (24) delta (168) phi AR 1.0 This analysis was completed with a study on the influence of the size of the training dataset on prediction accuracy. The date of 11 July 2016 was set for the 24-hours-ahead forecasts during two weeks, and the size of the training dataset increased from 8 weeks to several years. In this way, it is intended to observe the behaviour of the model with respect to the sample size, following the indications of [41]. Figure 4 presents the MAPE of the 24-hour forecasts according to the size of the sample for the N MC 24,168 and AMC 24,168 models in order to compare their behaviour. It can be observed that the AMC 24,168 model presents higher variability and the model starts to stabilize around the average value 1.7% when a sufficiently large set is used, in particular from 20,000 h. However, the N MC 24,168 model is much more stable, and with relatively small training sets, in particular, with a time series composed of 5000 h, it maintains accuracy values around 1.6%. The AMC 24,168 model is more unstable due to the trend equation (Equation (6)) as the electricity demand time series has no clear trend as shown in Figure 1. Using a larger dataset helps the N MC 24,168 model to stabilize while AMC 24,168 model depends clearly on the season the dataset starts. Although N MC 24,168 model shows better stability than AMC 24,168 , the smoothing parameter associated with the trend does not tend to 0 in order to converge to the same model. This is due to the parameters are local optimal, but not global optimal. Thus, the need for an analysis of these parameters and their stability is supported.   Size of the adjusting set (hours) α δ (24) φ AR δ (168) Figure 5. Evolution of parameters versus size of data set.

Triple Seasonal Models
In this section, an analysis of the parameters of the triple seasonal Holt-Winters models is carried out. Data sets of size 3 years (53 × 24 × 7 × 3 h) were used to adjust the models. As in the case of double seasonal models, 24-hour forecasts were made for two weeks. Although all triple seasonal methods depending on the seasonality and trend were analyzed, there are no major differences between the models, and the AMC 24,168,8766 model turns out to be the easiest to study, as it has 6 parameters, including level, trend, seasonality and fit.
The results obtained are shown in Table 4, where the values of the parameters are organized according to the season and year. In particular, the mean and standard deviation of the smoothing parameters obtained using the 80 sets for adjustment composed of 8 random weeks are shown. It can be noticed that the standard deviation for Autumn of the year 2016 is not defined because it was only a random set of 8 weeks into the Autumn. Contrary to what happens in the double seasonal models, and as seen in the previous section, when using sets of size greater than 20,000 h, the values of the parameters are stabilized.  The value of the δ (24) parameter stabilizes around 0.3 for any period and with a minimum variability. The value of the δ (168) parameter stabilizes around 0.2 and the new parameter δ (8766) around 0.08 and 0.1. These values follow the pattern shown in Figure 5, although there is a transfer between the δ (168) parameter and the δ (8766) . The introduction of the third seasonality has forced the model to update the intra-weekly seasonality with more recent data over time. On the other hand, values of α stabilize around practically 0, and values of ϕ AR around 0.95. The existence of so much data makes the AR(1) adjustment practically responsible for adapting the level and makes the level equation redundant. This would imply the elimination of a parameter to be optimized.
In addition, an analysis of the size of the data set and its consequences on the forecasts was carried out in the same way as in the previous case. It was used on the same day, 11 July 2016, where the size of the observed data set was gradually increased and 24-hour forecasts were made over two weeks. The results are shown in Figure 6. An asymptotic evolution towards a MAPE of 2% can be seen. Although the MAPE is triggered and reduced again periodically. The shark fin form responds to an adjusting dataset beginning in the autumn, and finishing in the end of autumn-remember that in autumn the calendar effect was much more important than at other times. One of the main characteristics observed in the triple seasonal models is the variability in predictability, mainly produced by the time of year with more holidays, i.e., autumn. Size of the adjusting set (hours) Figure 6. MAPE of the forecasts for 24-hours ahead according to the size of the dataset used to obtain the AMC 24,168,8766 model. The selection of 11 July 2016 for the forecasts responds to the necessity to avoid many special events nearby. The nearest one is the 1st May, and the series has enough time to react. As the model has low values for alpha, gamma and δ 8766 , near to zero, only the intraday and intraweek seasonal components can deal and react against the irregularities of the series, smoothing the model to newer values. The closer the start of the autumn period is, the less reaction time the model has to smooth the irregularities. These irregularities affect forecasts more intensely. When forecasting in other dates, the same analysis provides a similar fin-shaped graph. The only difference is the minimum and maximum of the MAPE, that depends on the date chosen.

Discussion of the Results
The objective of this section is the empirical analysis of multiple-seasonal Holt-Winters models applied to hourly electricity demand in Spain.
The literature found that analyses the parameters of the models tries to give a solution to the theoretical stability analysis, with the determination of the concept of predictability. However, its application to Holt-Winters models is not direct. It is necessary to carry out an empirical analysis of the models and their forecasts, and from there to draw conclusions about predictability.
A framework was established consisting of the usual process of adjustment and forecasting using a Spanish hourly electricity demand data set provided by REE. The forecasts obtained are then analyzed.
The double seasonal models with an 8 week adjustment period are shown to be robust with respect to predictions. Two different periods, with different characteristics, were used and 24-hour prediction MAPEs of around 2% to 2.6% were obtained. In the models with the best behaviour, the parameters were analyzed, and a direct relationship was found between variability and high values of the smoothing parameters associated with seasonality, and in the periods when a greater number of holidays occur.
The size of the observed data set influences the stability. The MAPE of the AMC 24,168 model has a variability of 0.2%, which from about 20,000 h is reduced to 0.1%. The N MC 24,168 model is much more stable-the series did not show a trend either-achieving this stability in sets longer than 5000 h. As the number of observations increases, the smoothing parameters show a stabilization.
In the case of triple seasonal models, a large number of observations are necessary to adjust the model. Therefore, parameter values are stabilized from the beginning. It can be seen how the predictions get worse when the set of values starts at significant dates in the autumn, a season with many public holidays. Forecasters need to use a dataset avoiding to start in autumn. If no other solution is possible, it is interesting to reduce the dataset size as it is large enough, but not including autumn.
In short, it can be seen that the parameters need a large data set to stabilize, and that the triple seasonal models are not able to improve the double seasonal forecasts due to the calendar effect. As a consequence, it is necessary to develop models that are able to reduce this variability by including the calendar effect in the model.

Conclusions
This article analyses the stability of the smoothing parameters in the multiple seasonal Holt-Winters models. This is crucial for the proper appliance of these models to provide accurate and trusted forecasts. Although most of the time, an automatic forecasting algorithm can provide good results, forecasters need to understand the behaviour of these parameters before submitting the forecasts. We use the time series of the hourly Spanish electricity demand.
In this work, we analyze the behaviour of the smoothing parameters of the multiple seasonal Holt-Winters models applied to the series of hourly electricity demand in Spain. There are many variables that affect the parameters within the same time series, including the computation of the initial values, in addition to other factors, such as the calendar or climate conditions. This variability of the parameters is subsequently reflected in the accuracy of the forecasts since they depend greatly on the calculation of the parameters.
The variation of the parameters is analyzed when different seasonal and trend methods are used, as well as different climatic situations of the series. Additionally, it is analyzed how the size of the data set used in order to adjust the model influences on the parameters, and, thus, in the forecasts. It was observed that the seasonal parameters are strongly dependent on the period of the year, making autumn a really difficult period to deal with. This effect is more pronounced in the intra-weekly seasonality than the daily one. However, with the increase in the size of the fitting set, these parameter values stabilise around a value, which depends on the time series. When the size of the set is bigger than 5000 hours, the values are stable. Triple seasonal models have much more stable parameters, although a much larger set of data is necessary to adjust the model. It was found that the main source of variability of the parameters is the calendar effect, which strongly influences the accuracy of the forecasts, not being so much the climatic effect on the series. The accuracy of the forecasts is also stabilised with a larger set of data, it is not necessary to have more than 5000 observations. It is also observed how double seasonal models provide better predictions than triple seasonal ones. The results of the current analysis are limited to being used with Spanish electricity demand. Of course, similar results are expected when faced with load forecasting in other countries or systems. Future works will be addressed toward the development of new models able to include calendar effect in the own model as well as models for the prediction of holidays through the inclusion of discrete seasonalities.