Article

Short-Term Electricity Demand Forecasting Using Components Estimation Technique

1 Department of Statistics, Quaid-i-Azam University, Islamabad 45320, Pakistan
2 Department of Statistical Sciences, University of Padua, 35121 Padova, Italy
3 College of Life Science, Linyi University, Linyi 276000, China
* Authors to whom correspondence should be addressed.
† These authors contributed equally to this work.
Energies 2019, 12(13), 2532; https://doi.org/10.3390/en12132532
Submission received: 27 May 2019 / Revised: 27 June 2019 / Accepted: 28 June 2019 / Published: 1 July 2019
(This article belongs to the Special Issue Short-Term Load Forecasting 2019)

Abstract:
Currently, in most countries, the electricity sector is liberalized, and electricity is traded in deregulated electricity markets. In these markets, electricity demand is determined the day before physical delivery through concurrent (semi-)hourly auctions. Hence, accurate forecasts are essential for efficient and effective management of power systems. Electricity demand and prices, however, exhibit specific features, including non-constant mean and variance, calendar effects, multiple periodicities, high volatility, and jumps, which complicate the forecasting problem. In this work, we compare different modeling techniques able to capture the specific dynamics of the demand time series. To this end, the electricity demand time series is divided into two major components: deterministic and stochastic. Both components are estimated using different regression and time series methods with parametric and nonparametric estimation techniques. Specifically, we use linear regression-based models, local polynomial regression models with different kernel functions (tri-cubic, Gaussian, and Epanechnikov), spline function-based models (smoothing splines and regression splines), and traditional time series models (autoregressive moving average, nonparametric autoregressive, and vector autoregressive). Within the deterministic part, special attention is paid to the estimation of the yearly cycle, which has previously been ignored by many authors. This work considers electricity demand data from the Nordic electricity market for the period 1 January 2013–31 December 2016. To assess the one-day-ahead out-of-sample forecasting accuracy, the Mean Absolute Percentage Error (MAPE), Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE) are calculated. The results suggest that the proposed component-wise estimation method is highly effective at forecasting electricity demand, and that vector autoregressive modeling combined with spline function-based regression gives superior performance compared with the rest.

1. Introduction

Liberalization of the energy sector, changes in climate policies, and the growth of renewable energy resources have completely changed the structure of the previously strictly controlled energy sector. Today, most energy markets have been liberalized and privatized with the aim of providing reliable and inexpensive electricity through competitive power trading. Within the energy sector, the liberalization of the electricity market has also introduced new challenges. In particular, electricity demand and price forecasting have become extremely important for producers, energy suppliers, system operators, and other market participants. In many electricity markets, electricity demand is fixed a day before physical delivery by concurrent (semi-)hourly auctions. Further, electricity cannot be stored efficiently, and end-user demand must be satisfied instantaneously; thus, accurate electricity demand forecasts are crucial for effective power system management [1,2].
Electricity demand forecasting can be broadly divided into three time horizons: (a) short-term, (b) medium-term, and (c) long-term load forecasting. Long-Term Load Forecast (LTLF) covers horizons from a few months to several years ahead and is generally used for planning and investment profitability analysis, determining upcoming sites, or acquiring fuel sources for production plants [3]. Medium-Term Load Forecast (MTLF) normally considers horizons from a few days to a few months ahead and is usually preferred for risk management, balance sheet calculations, and derivatives pricing [4]. Finally, Short-Term Load Forecast (STLF) generally covers horizons from a few minutes up to a few days ahead. In practice, most attention in electricity load forecasting has been paid to STLF, since it is an essential tool for daily market operations [5].
However, electricity demand forecasting is a difficult task due to the features that demand time series exhibit. These features include non-constant mean and variance, calendar effects, multiple periodicities, high volatility, jumps, etc. For example, the yearly, weekly, and daily periodicities can be seen in Figure 1. The weekly pattern shows comparatively low variation in the data, the load curves differ across days of the week, and the demand varies throughout the day. Demand is higher on weekdays than on weekends. Moreover, electricity demand is also affected by calendar effects (bank/bridging holidays) and by the seasons. In general, demand is considerably lower during bank holidays and bridging holidays (a working day between two non-working days). From the figure, high volatility in electricity demand can also be observed in almost all load periods. In addition, various environmental, geographical, and meteorological factors have a direct effect on electricity demand. Further, electricity is a secondary source of energy, obtained by converting primary energy sources such as fossil fuels, natural gas, solar, and wind power [6], and the cost associated with each source is different. Thus, a consistent electricity supply mechanism is necessary for different levels of demand, with short periods of high demand and rather longer periods of moderate demand.
To account for the different features of the demand series, researchers have suggested, over the last two decades, different methods and models to forecast electricity demand [7,8,9,10,11]. For example, the work in [12] proposed a semi-parametric component-based model consisting of a non-parametric (smoothing spline) component and a parametric (autoregressive moving average) component. Exponential smoothing techniques are also widely used in forecasting electricity demand [13,14]. Multiple-equation time series models, e.g., the Double Seasonal Autoregressive Integrated Moving Average (DSARIMA) model, the Double Holt–Winters (D-HW) model, and Multiple Equations Time Series (MET) approaches, are also used for short-term load forecasting [15,16]. Regression methods are easy to implement and have been widely used for electricity demand forecasting in the past. For example, the work in [17] used parametric regression models to forecast electricity demand for the Turkish electricity market. Some authors included exogenous variables in time series models to improve the forecasting performance [18,19,20]. Several researchers compared classical time series models and computational intelligence models [21,22,23]. For example, the work in [24] compared the Seasonal Autoregressive Integrated Moving Average (SARIMA) and Adaptive Network-based Fuzzy Inference System (ANFIS) models. For short-term load forecasting, the work in [25] introduced a new hybrid model that combines SARIMA and the Back Propagation Neural Network (BPNN) model. Some authors suggested the use of functional data analysis to predict electricity demand [26,27,28]. The main idea behind this approach is to consider the daily demand profile as a single functional object, so that functional methods can be applied to the electricity load series. Other approaches used for demand forecasting can be found in [29,30,31,32,33]. Apart from the forecasting models, Distributed Energy Resources (DERs), which are directly connected to a local distribution system and can be used for electricity production or as controllable loads, are also discussed in the literature [34,35]. DERs include solar panels, combined heat and power plants, electricity storage, small natural gas-fueled generators, and electric water heaters.
The main objective of this work is to compare different modeling techniques for electricity demand forecasting. Particular attention is paid to the yearly cycle, which in many cases has been ignored. Some authors suggest estimating the effect of the long-term trend and the yearly cycle jointly as a single component [36,37]. In practice, however, the yearly component shows regular cycles, while the long-term component highlights the trend structure of the data; thus, these two components should be modeled separately [26]. Further, in our case, pilot analyses suggested that modeling these two components separately can significantly improve the forecasting results. Thus, the main contribution of this paper is a thorough investigation of the impact of yearly component estimation on one-day-ahead out-of-sample electricity demand forecasting. Within the framework of the components estimation method, we compare models in terms of forecasting ability, considering both univariate and multivariate, as well as parametric and non-parametric, models. Moreover, for the considered models, the significance of the differences in prediction accuracy is also assessed.
The rest of the article is organized as follows: Section 2 contains a description of the proposed modeling framework and of the considered models. Section 3 provides an application of the proposed modeling framework. Section 4 contains a summary and conclusions.

2. Component-Wise Estimation: General Modeling Framework

The main objective of this study is to forecast one-day-ahead electricity demand using different forecasting models and methods. To this end, let $\log(D_{t,j})$ be the series of the log demand for the $t$th day and the $j$th hour. Following [28,33], the dynamics of the log demand, $\log(D_{t,j})$, can be modeled as:

$\log(D_{t,j}) = F_{t,j} + R_{t,j}$

That is, $\log(D_{t,j})$ is divided into two major components: a deterministic component $F_{t,j}$ and a stochastic component $R_{t,j}$. The deterministic component, $F_{t,j}$, comprises the long-run trend, the annual, seasonal, and weekly cycles, and calendar effects, and is modeled as:

$F_{t,j} = l_{t,j} + a_{t,j} + s_{t,j} + w_{t,j} + b_{t,j}$

where $l_{t,j}$ represents the long-run (trend) component, $a_{t,j}$ the annual cycle, $s_{t,j}$ the seasonal cycle, $w_{t,j}$ the weekly cycle, and $b_{t,j}$ the bank holidays. On the other hand, $R_{t,j}$ is a (residual) stochastic component that describes the short-run dependence of the demand series. Concerning the estimation of the deterministic component, apart from the yearly component $a_{t,j}$, the remaining components are estimated using parametric regression. For the estimation of $a_{t,j}$, six different methods are used: a sinusoidal function-based regression technique, three local polynomial regression models, and two spline function-based models. All the components in Equation (2) are estimated using the backfitting algorithm. For the stochastic component $R_{t,j}$, four different methods, namely the Autoregressive model (AR), the Non-Parametric Autoregressive model (NPAR), the Autoregressive Moving Average model (ARMA), and the Vector Autoregressive model (VAR), are used. Combining the models for the deterministic and stochastic components leads to ($6 \times 4 =$) 24 different combinations. Note that in the case of univariate models, each load period is modeled separately to account for the intra-daily periodicity [38].

2.1. Modeling the Deterministic Component

This section explains the estimation of the deterministic component. The long-run (trend) component $l_{t,j}$, which is a function of time $t$, is estimated using Ordinary Least Squares (OLS). Dummy variables are used for the seasonal periodicities, the weekly periodicities, and the bank holidays, i.e., $s_t = \sum_{i=1}^{4} \alpha_i I_{i,t}$, with $I_{i,t} = 1$ if $t$ refers to the $i$th season of the year and zero otherwise; $w_t = \sum_{i=1}^{7} \beta_i I_{i,t}$, with $I_{i,t} = 1$ if $t$ refers to the $i$th day of the week and zero otherwise; and $b_t = \sum_{i=1}^{2} \gamma_i I_{i,t}$, with $I_{i,t} = 1$ if $t$ refers to a bank holiday and zero otherwise. The coefficients $\alpha_i$, $\beta_i$, and $\gamma_i$ are estimated by OLS. On the other hand, the annual component $a_{t,j}$, which is a function of the series $(1, 2, 3, \dots, 365, 1, 2, 3, \dots, 365, \dots)$, is estimated by six different methods: Sinusoidal function-based Regression (SR), local polynomial regression models with three different kernels, namely tri-cubic (L1), Gaussian (L2), and Epanechnikov (L3), Regression Splines (RS), and Smoothing Splines (SS).
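As an illustration of this step, the following sketch builds a design matrix with a linear trend and seasonal, weekday, and bank-holiday dummies and fits it by OLS. The helper name, the `holidays` list, and the use of pandas/statsmodels are illustrative assumptions rather than the authors' actual code.

```python
# Minimal sketch (assumed tooling: numpy, pandas, statsmodels) of the OLS fit
# for the trend, seasonal, weekly, and bank-holiday parts of F_{t,j}.
import numpy as np
import pandas as pd
import statsmodels.api as sm

def deterministic_design(index: pd.DatetimeIndex, holidays):
    t = np.arange(len(index), dtype=float)                      # long-run trend l_t
    season = pd.get_dummies(index.quarter, drop_first=True)     # seasonal dummies s_t
    weekday = pd.get_dummies(index.dayofweek, drop_first=True)  # weekly dummies w_t
    holiday = index.normalize().isin(pd.to_datetime(holidays))  # bank-holiday dummy b_t
    X = np.column_stack([t, season.to_numpy(dtype=float),
                         weekday.to_numpy(dtype=float),
                         holiday.astype(float)])
    return sm.add_constant(X)   # drop_first plus intercept avoids the dummy trap

# Example usage for hour j = 21 (log_d21 is an assumed pandas Series):
# fit = sm.OLS(log_d21.to_numpy(), deterministic_design(log_d21.index, holidays)).fit()
```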

2.1.1. Sinusoidal Function Regression Model

Sinusoidal function-based Regression (SR) is widely used in the literature to capture the periodicity of a periodic component [39,40,41,42,43,44]. In this method, we assume that the annual cycle can be estimated using $q$ sine and cosine functions:

$a_t = \sum_{i=1}^{q} \left[ \alpha_{1,i} \sin(i\, w\, a_t) + \alpha_{2,i} \cos(i\, w\, a_t) \right],$

where $w = 2\pi/365.25$. The unknown parameters $\alpha_{1,i}$ and $\alpha_{2,i}$ ($i = 1, \dots, q$) are estimated by OLS.
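A compact illustration of this harmonic regression is given below, under the assumption that it is estimated by OLS on sine/cosine regressors built from the day-of-year driver; the variable names and the value of q are illustrative.

```python
# Sketch of the sinusoidal (Fourier) regression for the annual cycle.
# `driver` is the repeating day-of-year series (1, 2, ..., 365, 1, ...),
# `y` the series whose annual cycle is being estimated.
import numpy as np
import statsmodels.api as sm

def fit_sinusoidal(y, driver, q=2):
    w = 2 * np.pi / 365.25
    cols = []
    for i in range(1, q + 1):               # q sine/cosine pairs
        cols.append(np.sin(i * w * driver))
        cols.append(np.cos(i * w * driver))
    X = sm.add_constant(np.column_stack(cols))
    return sm.OLS(y, X).fit()               # alpha_{1,i}, alpha_{2,i} by OLS

# annual_hat = fit_sinusoidal(y, driver).fittedvalues
```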

2.1.2. Local Polynomial Regression

Local polynomial regression is a flexible non-parametric technique that approximates $a_t$ at a point $a_0$ by a low-order polynomial (say, of degree $q$), fit using only points in some neighborhood of $a_0$:

$\hat{a}_t = \sum_{j=1}^{q} \hat{\alpha}_j (a_t - a_0)^j.$

The parameters $\hat{\alpha}_j$ are estimated by Weighted Least Squares (WLS) by minimizing:

$\sum_{t=1}^{N} (a_t - \hat{a}_t)^2\, K_{\delta(a)}(a_t - a_0),$
where $K_{\delta(a)}(a_t - a_0)$ is a weighting kernel function that depends on the smoothing parameter $\delta$, also known as the bandwidth. The bandwidth controls the size of the neighborhood around $a_0$ [45] and, thus, the locality of the approximation. In this work, the value of the bandwidth is selected by cross-validation. Three different weighting kernel functions are used, namely the tri-cubic (L1), Gaussian (L2), and Epanechnikov (L3) kernels. It is worth mentioning that these types of local kernel-based regression techniques have been used extensively in the literature [31,39,40,44,46].
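The following sketch makes the kernel-weighted least squares explicit for the three kernels; the function names, the fixed bandwidth, and the default degree are illustrative assumptions (in this work the bandwidth is chosen by cross-validation).

```python
# Local polynomial regression via kernel-weighted least squares.
import numpy as np

def kernel(u, kind="gaussian"):
    if kind == "tricube":                                    # L1
        return np.where(np.abs(u) <= 1, (1 - np.abs(u)**3)**3, 0.0)
    if kind == "gaussian":                                   # L2
        return np.exp(-0.5 * u**2)
    if kind == "epanechnikov":                               # L3
        return np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0)
    raise ValueError(f"unknown kernel: {kind}")

def local_poly_fit(a, y, a0, delta, degree=1, kind="gaussian"):
    """Fit a degree-`degree` polynomial around a0 by WLS and return y-hat at a0."""
    w = kernel((a - a0) / delta, kind)
    X = np.vander(a - a0, degree + 1, increasing=True)       # 1, (a-a0), (a-a0)^2, ...
    W = np.diag(w)
    beta = np.linalg.pinv(X.T @ W @ X) @ (X.T @ W @ y)
    return beta[0]                                           # fitted value at a0

# annual_hat = np.array([local_poly_fit(a, y, a0, delta=30.0) for a0 in a])
```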

2.1.3. Regression Spline Models

Regression Splines (RS) are a popular non-parametric regression technique, which approximates $a_t$ by means of piecewise polynomials of order $p$, estimated in the subintervals delimited by a sequence of $m$ points called knots. Any spline function $Z(a)$ of order $p$ can be described as a linear combination of basis functions $Z_j(a)$:

$Z(a) = \sum_{j=1}^{m+p+1} \alpha_j Z_j(a).$

The unknown parameters $\alpha_j$ are estimated by OLS. The most important choice is the number of knots and their location, because they define the smoothness of the approximation; again, we choose them by cross-validation. In the literature, many authors have considered this approach for long-run component prediction [11,12,26]. The annual cycle component for regression splines is estimated as

$\hat{a}_t = \hat{Z}(a).$
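A short sketch of this step is given below, assuming a cubic B-spline basis built with patsy and an OLS fit with statsmodels; the `df` value stands in for the knot choice that is made here by cross-validation.

```python
# Regression-spline estimate of the annual component: B-spline basis plus OLS.
import numpy as np
import statsmodels.api as sm
from patsy import dmatrix

def fit_regression_spline(a, y, df=8):
    basis = dmatrix("bs(x, df=df, degree=3, include_intercept=False) - 1",
                    {"x": a, "df": df}, return_type="dataframe")
    fit = sm.OLS(y, sm.add_constant(basis)).fit()   # alpha_j estimated by OLS
    return np.asarray(fit.fittedvalues)

# annual_hat = fit_regression_spline(a, y, df=8)
```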

2.1.4. Smoothing Splines

To avoid having to fix the number of knots, spline functions can alternatively be estimated by penalized least squares. The expression to minimize becomes:

$\sum_{t=1}^{N} \left( a_t - Z(a_t) \right)^2 + \lambda \int \left( Z''(a) \right)^2 \, da,$

where $Z''(a)$ is the second derivative of $Z(a)$. The first term accounts for the goodness of fit, while the second penalizes the roughness of the function through the smoothing parameter $\lambda$. The selection of the smoothing parameter is an important task, which in this work is done by cross-validation. Smoothing Splines (SS) have been previously used by several authors to estimate the long-run dynamics of the series, e.g., [11,47,48].
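A minimal sketch of a penalized (smoothing) spline fit with the smoothing level chosen by cross-validation is shown below. It relies on scipy's UnivariateSpline, whose smoothing factor plays the role of the penalty weight, and assumes the driver is sorted so that it is increasing; both are illustrative choices rather than the authors' implementation.

```python
# Smoothing-spline fit with the smoothing factor selected by K-fold CV.
import numpy as np
from scipy.interpolate import UnivariateSpline
from sklearn.model_selection import KFold

def fit_smoothing_spline(a, y, s_grid, n_splits=5):
    order = np.argsort(a)                       # UnivariateSpline needs increasing x
    a, y = a[order], y[order]
    best_s, best_err = None, np.inf
    for s in s_grid:                            # candidate smoothing parameters
        errs = []
        for tr, te in KFold(n_splits, shuffle=True, random_state=0).split(a):
            tr = np.sort(tr)                    # keep the training x increasing
            spl = UnivariateSpline(a[tr], y[tr], k=3, s=s)
            errs.append(np.mean((y[te] - spl(a[te]))**2))
        if np.mean(errs) < best_err:
            best_s, best_err = s, np.mean(errs)
    return UnivariateSpline(a, y, k=3, s=best_s)
```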
To assess the performance of the six models defined above for the estimation of the annual component $a_{t,j}$, the observed log demand and the estimated annual components are depicted in Figure 2. The figure shows that all six models were capable of capturing the annual seasonality, as the annual cycles are clearly visible.
Finally, it is worth mentioning that the one-day-ahead forecast of the deterministic component is straightforward, as the elements of $F_{t,j}$ are deterministic functions of time or calendar conditions, which are known at any time. Once all these components are estimated, the residual (stochastic) component $R_{t,j}$ is obtained as:

$R_{t,j} = \log(D_{t,j}) - \left( \hat{l}_{t,j} + \hat{a}_{t,j} + \hat{s}_{t,j} + \hat{w}_{t,j} + \hat{b}_{t,j} \right)$

2.2. Modeling the Stochastic Component

Once the stochastic (residual) component is obtained, different types of parametric and non-parametric time series models can be considered. In our case, from the univariate class, we consider parametric AutoRegressive (AR), Non-Parametric AutoRegressive (NPAR), and Autoregressive Moving Average (ARMA). On the other hand, the Vector AutoRegressive (VAR) model is used to compare the performance of the multivariate model with the univariate models.

2.2.1. Autoregressive Model

A linear parametric Autoregressive (AR) model describes the short-run dynamics of $R_{t,j}$ as a linear combination of its past $r$ observations and is given by:

$R_{t,j} = c + \beta_1 R_{t-1,j} + \beta_2 R_{t-2,j} + \cdots + \beta_r R_{t-r,j} + \epsilon_t,$

where $c$ is the intercept, $\beta_i$ ($i = 1, 2, \dots, r$) are the parameters of the AR($r$) model, and $\epsilon_t$ is a white noise process. In our case, the parameters are estimated using maximum likelihood. After some pilot analysis on different load periods, we concluded that lags 1, 2, and 7 were significant in most cases and hence were used to estimate the model.
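One possible implementation of this restricted AR specification is sketched below, assuming statsmodels' state-space SARIMAX (which accepts a list of AR lags and estimates the coefficients by maximum likelihood); the series name is illustrative.

```python
# Restricted AR model for one load period: only lags 1, 2 and 7 enter.
import statsmodels.api as sm

def fit_restricted_ar(R_j):
    model = sm.tsa.statespace.SARIMAX(R_j, order=([1, 2, 7], 0, 0), trend="c")
    return model.fit(disp=False)

# res = fit_restricted_ar(R_j)          # R_j: stochastic component for hour j
# r_hat_next = res.forecast(steps=1)    # one-day-ahead forecast of R_{t+1,j}
```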

2.2.2. Non-Parametric Autoregressive Model

The non-parametric counterpart of the AR model is an additive model (NPAR), in which the relation between $R_{t,j}$ and its lagged values does not have a particular parametric form, thus allowing, potentially, for any type of non-linearity:

$R_{t,j} = g_1(R_{t-1,j}) + g_2(R_{t-2,j}) + \cdots + g_r(R_{t-r,j}) + \epsilon_{t,j},$

where the $g_i$ are smooth functions describing the relation between each past value and $R_{t,j}$. In our case, the functions $g_i$ are represented by cubic regression splines. As in the parametric case, we used lags 1, 2, and 7 to estimate NPAR. To avoid the so-called “curse of dimensionality”, we used the backfitting algorithm to estimate the model [49].
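The backfitting idea can be sketched as follows: each smooth function is re-estimated on the partial residuals of the others until the fit stabilizes. The spline smoother, the number of iterations, and the lag construction below are illustrative assumptions.

```python
# Backfitting sketch for the additive non-parametric AR model (lags 1, 2, 7).
import numpy as np
from scipy.interpolate import UnivariateSpline

def fit_npar(r, lags=(1, 2, 7), n_iter=20):
    p = max(lags)
    Y = r[p:]
    X = np.column_stack([r[p - l: len(r) - l] for l in lags])  # lagged regressors
    f = np.zeros_like(X)                                       # additive terms g_i
    alpha = Y.mean()
    for _ in range(n_iter):                                    # backfitting loop
        for i in range(X.shape[1]):
            partial = Y - alpha - f.sum(axis=1) + f[:, i]      # partial residuals
            order = np.argsort(X[:, i])
            spl = UnivariateSpline(X[order, i], partial[order], k=3)
            f[:, i] = spl(X[:, i]) - spl(X[:, i]).mean()       # center each g_i
    return alpha, f
```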

2.2.3. Autoregressive Moving Average Model

The Autoregressive Moving Average (ARMA) model not only includes the lagged values of the series, but also the past error terms. In our case, the stochastic component $R_{t,j}$ is modeled as a linear combination of the past $r$ observations as well as the lagged error terms. Mathematically,

$R_{t,j} = c + \beta_1 R_{t-1,j} + \beta_2 R_{t-2,j} + \cdots + \beta_r R_{t-r,j} + \epsilon_{t,j} + \phi_1 \epsilon_{t-1,j} + \phi_2 \epsilon_{t-2,j} + \cdots + \phi_s \epsilon_{t-s,j},$

where $c$ is the intercept, $\beta_i$ ($i = 1, 2, \dots, r$) and $\phi_j$ ($j = 1, 2, \dots, s$) are the parameters of the AR and MA components, respectively, and $\epsilon_t \sim N(0, \sigma_\epsilon^2)$. In this case, pilot analyses suggest that lags 1, 2, and 7 are significant for the AR part and only lag 1 for the MA part; thus, a constrained ARMA(7,1) with $\beta_3 = \cdots = \beta_6 = 0$ is fitted to $R_{t,j}$ using maximum likelihood.
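A minimal sketch of this constrained specification, again assuming statsmodels' SARIMAX with lag lists, which keeps the intermediate AR coefficients at zero:

```python
# Constrained ARMA for R_{t,j}: AR lags 1, 2, 7 and MA lag 1, estimated by ML.
import statsmodels.api as sm

def fit_constrained_arma(R_j):
    model = sm.tsa.statespace.SARIMAX(R_j, order=([1, 2, 7], 0, [1]), trend="c")
    return model.fit(disp=False)          # beta_3, ..., beta_6 are fixed at zero
```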

2.2.4. Vector Autoregressive Model

In the Vector Autoregressive (VAR) model, both the response and the predictors are vectors and hence contain information on the whole daily load profile. This allows one to account for possible interdependence among demand levels at different load periods. In our context, the daily stochastic component $R_t$ is modeled as a linear combination of its past $r$ observations, i.e.,

$R_t = G_1 R_{t-1} + G_2 R_{t-2} + \cdots + G_r R_{t-r} + \epsilon_t,$

where $R_t = (R_{t,1}, \dots, R_{t,24})'$, the $G_j$ ($j = 1, 2, \dots, r$) are coefficient matrices, and $\epsilon_t = (\epsilon_{t,1}, \dots, \epsilon_{t,24})'$ is a vector of disturbance terms such that $\epsilon_t \sim N(0, \Sigma_\epsilon)$. Estimation of the parameters is done by maximum likelihood.
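As an illustration, the 24 hourly residual series can be arranged as columns of a day-indexed data frame and modelled jointly. The sketch below uses statsmodels' VAR; the lag order shown is only a placeholder for whatever order is selected in practice.

```python
# Multivariate stage: joint model for the 24 hourly stochastic components.
import pandas as pd
from statsmodels.tsa.api import VAR

def fit_var(R_wide: pd.DataFrame, p=7):
    """R_wide: one row per day, 24 columns R_{t,1}, ..., R_{t,24}."""
    res = VAR(R_wide).fit(p)
    r_hat_next = res.forecast(R_wide.values[-p:], steps=1)   # day-ahead R_{t+1}
    return res, r_hat_next
```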
Finally, once both the deterministic and the stochastic components have been estimated, the final day-ahead electricity demand forecast is obtained as:

$\hat{D}_{t+1,j} = \exp\left( \hat{l}_{t+1,j} + \hat{a}_{t+1,j} + \hat{s}_{t+1,j} + \hat{w}_{t+1,j} + \hat{b}_{t+1,j} + \hat{R}_{t+1,j} \right) = \exp\left( \hat{F}_{t+1,j} + \hat{R}_{t+1,j} \right)$

For the stochastic component $R_{t,j}$ and the final model error $\epsilon_{t,j}$, examples of the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) are plotted in Figure 3 and Figure 4. Note that, in the case of $\epsilon_{t,j}$, both the ACF and PACF refer to the models in which VAR is used as the stochastic model. The reason for plotting the residuals obtained after applying the VAR model to $R_{t,j}$ is the superior forecasting performance of the multivariate model (see Table 1). Overall, the residuals $\epsilon_{t,j}$ of each model have been whitened. In some cases, the residuals still show some significant correlation, but with an absolute value so small that it is useless for prediction.

3. Out-of-Sample Forecasting

This work considers electricity demand data from the Nord Pool electricity market. The data cover the period from 1 January 2013 to 31 December 2016 (35,064 hourly demand levels over 1461 days). A few missing observations in the load series were replaced by averages of the neighboring observations. The whole dataset was divided into two parts: 1 January 2013 to 31 December 2015 (26,280 data points, covering 1095 days) for model estimation and 1 January 2016 to 31 December 2016 (8784 data points, covering 366 days) for one-day-ahead out-of-sample forecasting.
In the first step, the deterministic component was estimated separately for each load period as described in Section 2.1. An example of the estimated deterministic components, as well as of $R_{t,j}$, is plotted in Figure 5. In the figure, along with the log demand (top left), the long-run trend, yearly, seasonal, and weekly components are plotted at the top right, middle left, middle right, and bottom left, respectively. Note that the elements of the deterministic component capture different dynamics of the log demand. An example of the series $R_{t,21}$ is plotted at the bottom right of Figure 5. In the second step, the previously defined models for the stochastic component were applied to the residual series $R_{t,j}$. In both steps, models were estimated and one-day-ahead forecasts were obtained for 366 days using the rolling window technique. Final demand forecasts were obtained using Equation (12).
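The rolling-window evaluation can be summarized by the following sketch; the two fitting helpers stand in for any of the deterministic/stochastic combinations described above, and the window lengths match the data split used here.

```python
# Rolling one-day-ahead evaluation: re-estimate on a fixed-length window of
# days, forecast the next day's 24 values, then roll the window forward.
import numpy as np

def rolling_one_day_ahead(log_d, n_train_days, n_test_days,
                          fit_deterministic, fit_stochastic):
    forecasts = []
    for d in range(n_test_days):
        window = log_d[d * 24:(n_train_days + d) * 24]     # rolling 3-year window
        F_hat_next, R = fit_deterministic(window)          # next-day F and residuals
        R_hat_next = fit_stochastic(R)                     # next-day stochastic part
        forecasts.append(np.exp(F_hat_next + R_hat_next))  # back to the demand scale
    return np.concatenate(forecasts)

# d_hat = rolling_one_day_ahead(log_demand, n_train_days=1095, n_test_days=366,
#                               fit_deterministic=..., fit_stochastic=...)
```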
To evaluate the forecasting performance of the final models obtained from different combinations of deterministic and stochastic components, three accuracy measures, namely Mean Absolute Percentage Error (MAPE), Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE) were computed as:
$\mathrm{MAPE} = \mathrm{mean}\left( \frac{|D_{t,j} - \hat{D}_{t,j}|}{D_{t,j}} \right) \times 100$

$\mathrm{MAE} = \mathrm{mean}\left( |D_{t,j} - \hat{D}_{t,j}| \right)$

$\mathrm{RMSE} = \sqrt{\mathrm{mean}\left( (D_{t,j} - \hat{D}_{t,j})^2 \right)},$

where $D_{t,j}$ and $\hat{D}_{t,j}$ are the observed and the forecasted demand for the $t$th day ($t = 1, 2, \dots, 366$) and the $j$th load period ($j = 1, 2, \dots, 24$).
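These three measures translate directly into code; the array names below are illustrative.

```python
# MAPE, MAE and RMSE for observed demand `d` and forecasts `d_hat` (numpy arrays).
import numpy as np

def accuracy(d, d_hat):
    err = d - d_hat
    mape = np.mean(np.abs(err) / d) * 100
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    return mape, mae, rmse
```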
Within the deterministic component, this work used six different estimation methods for $a_{t,j}$, whereas the estimation of the other elements was the same, giving six different specifications. On the other hand, four different models were used for the stochastic component. Hence, the estimation of the deterministic and stochastic components led us to compare twenty-four different models. For these twenty-four models, the one-day-ahead out-of-sample forecast results are listed in Table 1. From the table, it is evident that the multivariate VAR model combined with any estimation technique for $a_{t,j}$ led to better forecasts than the univariate models. The best forecasting model was obtained by combining VAR and RS, which produced 1.994, 856.082, and 1145.979 for MAPE, MAE, and RMSE, respectively. VAR combined with SS or with L3 produced the second-best results. Within the univariate models, NPAR combined with the spline-based regression models performed better than its two parametric counterparts. Finally, any stochastic model combined with SR or with L1 led to the worst forecasts in its respective class (univariate or multivariate). Considering only MAPE, a graphical representation of the results for the twenty-four combinations is given in Figure 6. From the figure, we can easily see that the multivariate models performed better than the univariate models. To assess the significance of the differences among the accuracy measures listed in Table 1, we performed the Diebold and Mariano (DM) [50] test of equal forecast accuracy. The DM test is a widely used statistical test for comparing forecasts obtained from different models. To describe it, consider two forecasts, $\hat{y}_{1t}$ and $\hat{y}_{2t}$, available for the time series $y_t$ for $t = 1, \dots, T$. The associated forecast errors are $\epsilon_{1t} = y_t - \hat{y}_{1t}$ and $\epsilon_{2t} = y_t - \hat{y}_{2t}$. Let the loss associated with the forecast error $\epsilon_{it}$ ($i = 1, 2$) be $L(\epsilon_{it})$; for example, the time-$t$ absolute loss would be $L(\epsilon_{it}) = |\epsilon_{it}|$. The loss differential between Forecasts 1 and 2 at time $t$ is then $\eta_t = L(\epsilon_{1t}) - L(\epsilon_{2t})$. The null hypothesis of equal forecast accuracy for the two forecasts is $E[\eta_t] = 0$. The DM test requires the loss differential to be covariance stationary, i.e.,
$E[\eta_t] = \mu \ \ \forall t, \qquad \mathrm{cov}(\eta_t, \eta_{t-\tau}) = \gamma(\tau) \ \ \forall t, \qquad \mathrm{var}(\eta_t) = \sigma_\eta, \quad 0 < \sigma_\eta < \infty.$
Under these assumptions, the DM test of equal forecast accuracy is:
$\mathrm{DM} = \frac{\bar{\eta}}{\hat{\sigma}_{\bar{\eta}}} \xrightarrow{d} N(0, 1),$

where $\bar{\eta} = \frac{1}{T} \sum_{t=1}^{T} \eta_t$ is the sample mean loss differential and $\hat{\sigma}_{\bar{\eta}}$ is a consistent estimate of its standard error.
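A simple sketch of this statistic with absolute-error loss is given below; the Newey–West style variance term for multi-step forecasts and the one-sided p-value orientation (column model more accurate than row model, as in Tables 2 and 3) are assumptions made for illustration.

```python
# Diebold-Mariano test with absolute-error loss for two forecast error series.
import numpy as np
from scipy.stats import norm

def dm_test(e1, e2, h=1):
    eta = np.abs(e1) - np.abs(e2)             # loss differential
    T = len(eta)
    eta_c = eta - eta.mean()
    var = np.mean(eta_c ** 2)                 # gamma(0)
    for tau in range(1, h):                   # HAC correction for h-step forecasts
        var += 2 * np.mean(eta_c[tau:] * eta_c[:-tau])
    dm = eta.mean() / np.sqrt(var / T)
    p_value = 1 - norm.cdf(dm)                # H1: forecast 2 more accurate
    return dm, p_value
```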
The results of the DM test are listed in Table 2 and Table 3. The entries of these tables are p-values of the Diebold and Mariano test, where the null hypothesis assumes no difference in accuracy between the predictor in the column and the predictor in the row, against the alternative hypothesis that the predictor in the column is more accurate than the predictor in the row. From Table 2, it is clear that the multivariate VAR models outperform their univariate counterparts. Looking at the results of VAR with the different estimation methods for $a_{t,j}$ in Table 3, it can be seen that, except for SR-VAR and L1-VAR, the remaining four combinations had statistically indistinguishable predictive ability, while each of them was significantly more accurate than SR-VAR and L1-VAR.
The day-specific MAPE, MAE, and RMSE are tabulated in Table 4. From this table, we can see that the day-specific MAPE was relatively higher on Monday and Sunday and smaller on the other days. As the VAR model performed better overall, its day-specific MAPE values were considerably lower than those of the univariate models, except on Wednesday, Thursday, and Friday, where the univariate and multivariate models produced similarly low errors. The same pattern can be seen in the day-specific MAE and RMSE. The day-specific MAPE values are also depicted in Figure 7. The figure clearly indicates that the MAPE was lower in the middle of the week and higher on Monday and Sunday.
To conclude this section, the hourly RMSE and the forecasted demand for the best four combinations, one for each stochastic model, are plotted in Figure 8. From the figure (left), note that the hourly RMSE is considerably lower during low-load periods and higher during peak-load periods. Further, note the superior forecasting performance of the RS-VAR model compared with the competing stochastic models. For these models, the observed and forecasted demand are also plotted in Figure 8 (right). The forecasted demand follows the actual demand very closely, especially when VAR is used as the stochastic model. Thus, we can conclude that the multivariate VAR model outperformed its univariate counterparts.

4. Conclusions

The main aim of this work was to model and forecast electricity demand using the component estimation method. For this purpose, the log demand was divided into two components: deterministic and stochastic. The deterministic component consisted of a long-run trend, multiple periodicities due to annual, seasonal, and weekly regular cycles, and bank holidays. Special attention was paid to the estimation of the yearly seasonality, as it has previously been ignored by many authors. The estimation of the yearly component was based on six different methods, whereas the other elements of the deterministic component were estimated using ordinary least squares. In particular, for the estimation of the annual periodicity, this work used a sinusoidal function-based model (SR), local polynomial regression models with three different kernels, namely tri-cubic (L1), Gaussian (L2), and Epanechnikov (L3), Regression Splines (RS), and Smoothing Splines (SS). For the stochastic component, we used four univariate and multivariate models, namely the Autoregressive model (AR), the Non-Parametric Autoregressive model (NPAR), the Autoregressive Moving Average model (ARMA), and the Vector Autoregressive model (VAR). The estimation of the deterministic and stochastic components led us to compare twenty-four different combinations of these models. To assess the predictive performance of the different models, demand data from the Nord Pool electricity market were used, and one-day-ahead out-of-sample forecasts were obtained for a complete year. The forecasting accuracy of the models was assessed through the MAPE, MAE, and RMSE, and the significance of the differences in predictive performance was evaluated with the Diebold and Mariano test. The results suggest that the component-wise estimation method is highly effective for modeling and forecasting electricity demand. The best results were produced by combining RS and the VAR model, which led to the lowest error values. Further, all combinations based on the multivariate VAR model clearly outperformed the univariate counterparts, suggesting the superiority of multivariate models. Among the VAR-based combinations, however, the differences between most of the models were not statistically significant.

Author Contributions

Conceptualization and Methodology I.S.; Software, S.A.; Validation, I.S., S.A.; Formal Analysis, H.I. and I.S.; Investigation, H.I.; Resources, D.W.; Data Curation, S.A.; Writing—Original Draft Preparation, I.S.; Writing—Review & Editing, I.S., S.A., and D.W.; Visualization, H.I. and S.A.; Supervision, I.S.; Project Administration, I.S.; Funding Acquisition, S.A. and D.W.

Funding

The work of Ismail Shah is partially funded by Quaid-i-Azam University, Islamabad, Pakistan, through the university research fund.

Acknowledgments

The authors would like to thank Francesco Lisi, Department of Statistical Sciences, University of Padua, for his valuable suggestions and comments. We are also grateful to the anonymous referees for their constructive comments, which greatly improved the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Serrallés, R.J. Electric energy restructuring in the European Union: Integration, subsidiarity and the challenge of harmonization. Energy Policy 2006, 34, 2542–2551. [Google Scholar] [CrossRef]
  2. Bunn, D.W. Forecasting loads and prices in competitive power markets. Proc. IEEE 2000, 88, 163–169. [Google Scholar] [CrossRef]
  3. Herbst, A.; Toro, F.; Reitze, F.; Jochem, E. Introduction to energy systems modelling. Swiss J. Econ. Stat. 2012, 148, 111–135. [Google Scholar] [CrossRef]
  4. Gonzalez-Romera, E.; Jaramillo-Moran, M.A.; Carmona-Fernandez, D. Monthly electric energy demand forecasting based on trend extraction. IEEE Trans. Power Syst. 2006, 21, 1946–1953. [Google Scholar] [CrossRef]
  5. Kyriakides, E.; Polycarpou, M. Short term electric load forecasting: A tutorial. In Trends in Neural Computation; Springer: Berlin/Heidelberg, Germany, 2007; pp. 391–418. [Google Scholar]
  6. Pollak, J.; Schubert, S.; Slominski, P. Die Energiepolitik der EU. Facultas WUV/UTB. 2010. Available online: http://www.utb-shop.de/die-energiepolitik-der-eu-2646.html (accessed on 30 April 2019).
  7. Bosco, B.P.; Parisio, L.P.; Pelagatti, M.M. Deregulated wholesale electricity prices in Italy: An empirical analysis. Int. Adv. Econ. Res. 2007, 13, 415–432. [Google Scholar] [CrossRef]
  8. Janczura, J.; Weron, R. An empirical comparison of alternate regime-switching models for electricity spot prices. Energy Econ. 2010, 32, 1059–1073. [Google Scholar] [CrossRef] [Green Version]
  9. Trueck, S.; Weron, R.; Wolff, R. Outlier treatment and robust approaches for modeling electricity spot prices. In Proceedings of the 56th Session of the ISI, Lisbon, Portugal, 22–29 August 2007; Available online: http://mpra.ub.uni-muenchen.de/4711/ (accessed on 28 March 2019).
  10. Sigauke, C.; Chikobvu, D. Prediction of daily peak electricity demand in South Africa using volatility forecasting models. Energy Econ. 2011, 33, 882–888. [Google Scholar] [CrossRef]
  11. Lisi, F.; Nan, F. Component estimation for electricity prices: Procedures and comparisons. Energy Econ. 2014, 44, 143–159. [Google Scholar] [CrossRef]
  12. Liu, J.M.; Chen, R.; Liu, L.M.; Harris, J.L. A semi-parametric time series approach in modeling hourly electricity loads. J. Forecast. 2006, 25, 537–559. [Google Scholar] [CrossRef]
  13. Taylor, J.W.; De Menezes, L.M.; McSharry, P.E. A comparison of univariate methods for forecasting electricity demand up to a day ahead. Int. J. Forecast. 2006, 22, 1–16. [Google Scholar] [CrossRef]
  14. Taylor, J.W.; McSharry, P.E. Short-term load forecasting methods: An evaluation based on european data. IEEE Trans. Power Syst. 2007, 22, 2213–2219. [Google Scholar] [CrossRef]
  15. Clements, A.E.; Hurn, A.; Li, Z. Forecasting day-ahead electricity load using a multiple equation time series approach. Eur. J. Oper. Res. 2016, 251, 522–530. [Google Scholar] [CrossRef] [Green Version]
  16. Ismail, M.A.; Zahran, A.R.; El-Metaal, E.M.A. Forecasting Hourly Electricity Demand in Egypt Using Double Seasonal Autoregressive Integrated Moving Average Model. In Proceedings of the First International Conference on Big Data, Small Data, Linked Data and Open Data, Barcelona, Spain, 19–24 April 2015; pp. 42–45. [Google Scholar]
  17. Yukseltan, E.; Yucekaya, A.; Bilge, A.H. Forecasting electricity demand for Turkey: Modeling periodic variations and demand segregation. Appl. Energy 2017, 193, 287–296. [Google Scholar] [CrossRef]
  18. Hinman, J.; Hickey, E. Modeling and forecasting short-term electricity load using regression analysis. Inst. Regul. Policy Stud. 2009, 51, 1–51. Available online: https://irps.illinoisstate.edu/downloads/research/documents/LoadForecastingHinman-HickeyFall2009.pdf (accessed on 31 May 2018).
  19. Feng, Y.; Ryan, S.M. Day-ahead hourly electricity load modeling by functional regression. Appl. Energy 2016, 170, 455–465. [Google Scholar] [CrossRef] [Green Version]
  20. Weron, R.; Misiorek, A. Forecasting spot electricity prices: A comparison of parametric and semiparametric time series models. Int. J. Forecast. 2008, 24, 744–763. [Google Scholar] [CrossRef] [Green Version]
  21. Kandananond, K. Forecasting electricity demand in Thailand with an artificial neural network approach. Energies 2011, 4, 1246–1257. [Google Scholar] [CrossRef]
  22. Meng, M.; Niu, D.; Sun, W. Forecasting monthly electric energy consumption using feature extraction. Energies 2011, 4, 1495–1507. [Google Scholar] [CrossRef]
  23. Ryu, S.; Noh, J.; Kim, H. Deep neural network based demand side short term load forecasting. Energies 2016, 10, 3. [Google Scholar] [CrossRef]
  24. Debusschere, V.; Bacha, S. One week hourly electricity load forecasting using Neuro-Fuzzy and Seasonal ARIMA models. IFAC Proc. Vol. 2012, 45, 97–102. [Google Scholar]
  25. Yang, Y.; Wu, J.; Chen, Y.; Li, C. A new strategy for short-term load forecasting. Abstr. Appl. Anal. 2013, 2013, 208964. [Google Scholar] [CrossRef]
  26. Lisi, F.; Shah, I. Forecasting Next-Day Electricity Demand and Prices Based on Functional Models. Working Paper. 2019, pp. 1–30. Available online: https://www.researchgate.net/profile/Ismail_Shah2/publications (accessed on 30 March 2019).
  27. Shang, H.L. Functional time series approach for forecasting very short-term electricity demand. J. Appl. Stat. 2013, 40, 152–168. [Google Scholar] [CrossRef]
  28. Shah, I.; Lisi, F. Day-ahead electricity demand forecasting with non-parametric functional models. In Proceedings of the 12th International Conference on the European Energy Market, Lisbon, Portugal, 19–22 May 2015; pp. 1–5. [Google Scholar]
  29. Bisaglia, L.; Bordignon, S.; Marzovilli, M. Modelling and Forecasting Hourly Spot Electricity Prices: Some Preliminary Results; Working Paper Series; University of Padua: Padova, Italy, 2010. [Google Scholar]
  30. Gianfreda, A.; Grossi, L. Forecasting Italian electricity zonal prices with exogenous variables. Energy Econ. 2012, 34, 2228–2239. [Google Scholar] [CrossRef] [Green Version]
  31. Bordignon, S.; Bunn, D.W.; Lisi, F.; Nan, F. Combining day-ahead forecasts for British electricity prices. Energy Econ. 2013, 35, 88–103. [Google Scholar] [CrossRef] [Green Version]
  32. Laouafi, A.; Mordjaoui, M.; Laouafi, F.; Boukelia, T.E. Daily peak electricity demand forecasting based on an adaptive hybrid two-stage methodology. Int. J. Electr. Power Energy Syst. 2016, 77, 136–144. [Google Scholar] [CrossRef]
  33. Lisi, F.; Pelagatti, M.M. Component estimation for electricity market data: Deterministic or stochastic? Energy Econ. 2018, 74, 13–37. [Google Scholar] [CrossRef]
  34. Dominguez-Garcia, A.D.; Cady, S.T.; Hadjicostis, C.N. Decentralized optimal dispatch of distributed energy resources. In Proceedings of the 2012 IEEE 51st IEEE Conference on Decision and Control (CDC), Maui, HI, USA, 10–13 December 2012; pp. 3688–3693. [Google Scholar]
  35. Basak, P.; Chowdhury, S.; nee Dey, S.H.; Chowdhury, S. A literature review on integration of distributed energy resources in the perspective of control, protection and stability of microgrid. Renew. Sustain. Energy Rev. 2012, 16, 5545–5556. [Google Scholar] [CrossRef]
  36. Nowotarski, J.; Weron, R. On the importance of the long-term seasonal component in day-ahead electricity price forecasting. Energy Econ. 2016, 57, 228–235. [Google Scholar] [CrossRef] [Green Version]
  37. Marcjasz, G.; Uniejewski, B.; Weron, R. On the importance of the long-term seasonal component in day-ahead electricity price forecasting with NARX neural networks. Int. J. Forecast. 2018, in press. [Google Scholar] [CrossRef]
  38. Ramanathan, R.; Engle, R.; Granger, C.W.; Vahid-Araghi, F.; Brace, C. Short-run forecasts of electricity loads and peaks. Int. J. Forecast. 1997, 13, 161–174. [Google Scholar] [CrossRef]
  39. Pilipovic, D. Energy Risk: Valuing and Managing Energy Derivatives; McGraw-Hill: New York, NY, USA, 1998; Volume 300. [Google Scholar]
  40. Lucia, J.J.; Schwartz, E.S. Electricity prices and power derivatives: Evidence from the nordic power exchange. Rev. Deriv. Res. 2002, 5, 5–50. [Google Scholar] [CrossRef]
  41. Weron, R.; Bierbrauer, M.; Trück, S. Modeling electricity prices: Jump diffusion and regime switching. Phys. A Stat. Mech. Its Appl. 2004, 336, 39–48. [Google Scholar] [CrossRef]
  42. Kosater, P.; Mosler, K. Can Markov regime-switching models improve power-price forecasts? Evidence from German daily power prices. Appl. Energy 2006, 83, 943–958. [Google Scholar] [CrossRef] [Green Version]
  43. De Jong, C. The nature of power spikes: A regime-switch approach. Stud. Nonlinear Dyn. Econom. 2006, 10. [Google Scholar] [CrossRef]
  44. Escribano, A.; Ignacio Peña, J.; Villaplana, P. Modelling electricity prices: International evidence. Oxf. Bull. Econ. Stat. 2011, 73, 622–650. [Google Scholar] [CrossRef]
  45. Avery, M. Literature review for local polynomial regression. Unpublished manuscript. 2013. [Google Scholar]
  46. Veraart, A.E.; Veraart, L.A. Modelling electricity day-ahead prices by multivariate Lévy semistationary processes. In Quantitative Energy Finance; Springer: New York, NY, USA, 2014; pp. 157–188. [Google Scholar]
  47. Shah, I. Modeling and Forecasting Electricity Market Variables. Ph.D. Thesis, University of Padova, Padua, Italy, 2016. [Google Scholar]
  48. Dordonnat, V.; Koopman, S.J.; Ooms, M. Intra-daily smoothing splines for time-varying regression models of hourly electricity load. J. Energy Mark. 2010, 3, 17. [Google Scholar] [CrossRef]
  49. Hastie, T.; Tibshirani, R. Generalized additive models: Some applications. J. Am. Stat. Assoc. 1987, 82, 371–386. [Google Scholar] [CrossRef]
  50. Diebold, F.; Mariano, R. Comparing predictive accuracy. J. Bus. Econ. Stat. 1995, 13, 253–263. [Google Scholar]
Figure 1. Yearly seasonality for the period 01-01-2012–31-12-2015 (top left), weekly periodicity for the period 01-01-2013–14-01-2013 (top right), box plot of hourly electricity load for the period 01-01-2013–31-12-2016 (bottom right), daily load curves for the period 01-01-2013–31-01-2013, weekdays (solid lines), Saturdays (dashed lines), Sundays (dotted lines), and bank holidays at the bottom (solid) representing 1 January (bottom left).
Figure 2. Observed $\log(D_{t,21})$ with superimposed estimated $a_{t,j}$ using: (first row) Sinusoidal Regression (SR) (left), Local regression (L1) (middle), L2 (right); (second row) L3 (left), Regression Splines (RS) (middle), and Smoothing Splines (SS) (right).
Figure 3. ACF and Partial Autocorrelation Function (PACF) plots for $R_{t,21}$ (first row); ACF and PACF plots for $\epsilon_{t,21}$ obtained with L1-VAR (second row), L2-VAR (third row), and L3-VAR (fourth row).
Figure 4. ACF and PACF plots for $\epsilon_{t,21}$ obtained with SR-VAR (first row), RS-VAR (second row), and SS-VAR (third row).
Figure 5. $\log(D_{t,21})$ (top left), $\hat{l}_{t,21}$ (top right), $\hat{a}_{t,21}$ (middle left), $\hat{s}_{t,21}$ (middle right), $\hat{w}_{t,21}$ (bottom left), and $R_{t,21}$ (bottom right).
Figure 6. One-day-ahead out-of-sample MAPE for electricity demand using SR, L1, L2, L3, RS, SS, AR, NPAR, ARMA, and VAR.
Figure 7. Day-specific MAPEs for all stochastic component models: AR, NPAR, ARMA, and VAR.
Figure 8. (left) Hourly RMSE for: RS-AR (solid), RS-NPAR (dashed), RS-ARMA (dotted), and RS-VAR (dotted-dashed). (Right) Observed demand (solid) and forecasted demand for: RS-AR (dashed), RS-NPAR (dotted), RS-ARMA (dotted-dashed), and RS-VAR (long dash).
Table 1. One-day-ahead out-of-sample forecasting accuracy. The columns correspond to the method used to estimate the yearly component: Sinusoidal Regression (SR), local polynomial regression (L1, L2, L3), Regression Splines (RS), and Smoothing Splines (SS). The rows correspond to the model used for the stochastic component: Autoregressive (AR), Non-Parametric Autoregressive (NPAR), Autoregressive Moving Average (ARMA), and Vector Autoregressive (VAR).
ERRORS | MODELS | SR       | L1       | L2       | L3       | RS       | SS
MAPE   | AR     | 2.503    | 2.466    | 2.413    | 2.412    | 2.411    | 2.412
       | NPAR   | 2.510    | 2.434    | 2.413    | 2.411    | 2.399    | 2.395
       | ARMA   | 2.514    | 2.435    | 2.413    | 2.418    | 2.405    | 2.396
       | VAR    | 2.143    | 2.109    | 1.997    | 1.995    | 1.994    | 1.995
MAE    | AR     | 1081.184 | 1069.125 | 1044.341 | 1044.686 | 1044.177 | 1044.611
       | NPAR   | 1086.336 | 1056.392 | 1045.753 | 1046.899 | 1041.275 | 1039.804
       | ARMA   | 1084.869 | 1055.017 | 1048.107 | 1045.881 | 1042.705 | 1038.553
       | VAR    | 922.405  | 907.187  | 856.497  | 856.135  | 856.082  | 856.088
RMSE   | AR     | 1486.580 | 1493.652 | 1450.551 | 1454.510 | 1450.676 | 1453.358
       | NPAR   | 1476.394 | 1450.813 | 1436.108 | 1439.677 | 1434.670 | 1433.172
       | ARMA   | 1468.908 | 1443.367 | 1431.686 | 1431.296 | 1427.437 | 1422.794
       | VAR    | 1219.608 | 1211.225 | 1146.302 | 1146.002 | 1145.979 | 1146.014
Table 2. p-values for the Diebold and Mariano test. H 0 : the forecasting accuracy for the model in the row and the model in the column is the same; H 1 : the forecasting accuracy of the model in the column is greater than that of the model in the row.
MODELS  | RS-AR | RS-NPAR | RS-ARMA | RS-VAR
RS-AR   | -     | 0.33    | 0.40    | <0.01
RS-NPAR | 0.67  | -       | 0.75    | <0.01
RS-ARMA | 0.60  | 0.25    | -       | <0.01
RS-VAR  | 0.99  | 0.99    | 0.99    | -
Table 3. p-values for the Diebold and Mariano test. H 0 : the forecasting accuracy for the model in the row and the model in the column is the same; H 1 : the forecasting accuracy of the model in the column is greater than that of the model in the row.
Models | SR-VAR | L1-VAR | L2-VAR | L3-VAR | RS-VAR | SS-VAR
SR-VAR | -      | 0.28   | <0.01  | <0.01  | <0.01  | <0.01
L1-VAR | 0.72   | -      | <0.01  | <0.01  | 0.01   | <0.01
L2-VAR | 0.99   | 0.99   | -      | 0.93   | 0.85   | 0.83
L3-VAR | 0.99   | 0.99   | 0.07   | -      | 0.47   | 0.45
RS-VAR | 0.99   | 0.99   | 0.15   | 0.53   | -      | 0.48
SS-VAR | 0.99   | 0.99   | 0.17   | 0.55   | 0.52   | -
Table 4. Electricity demand: hourly day-specific MAPE, MAE, and RMSE.
ERRORS | MODELS | Monday  | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday
MAPE   | AR     | 3.33    | 2.18    | 1.99      | 1.82     | 1.83   | 2.24     | 3.49
       | NPAR   | 3.46    | 2.14    | 1.96      | 1.79     | 1.83   | 2.26     | 3.35
       | ARMA   | 3.44    | 2.12    | 1.96      | 1.81     | 1.82   | 2.24     | 3.42
       | VAR    | 2.33    | 1.71    | 1.94      | 1.72     | 1.73   | 1.93     | 2.60
MAE    | AR     | 1728.10 | 1002.69 | 755.78    | 633.24   | 634.54 | 905.95   | 1649.36
       | NPAR   | 1811.31 | 973.38  | 740.55    | 618.19   | 638.20 | 921.64   | 1585.79
       | ARMA   | 1798.48 | 969.16  | 743.45    | 626.00   | 631.97 | 912.87   | 1615.01
       | VAR    | 1225.76 | 799.34  | 739.52    | 592.69   | 601.85 | 791.00   | 1250.86
RMSE   | AR     | 2194.77 | 1288.65 | 980.01    | 774.78   | 798.23 | 1176.36  | 2142.07
       | NPAR   | 2252.48 | 1253.27 | 950.24    | 764.92   | 795.33 | 1187.31  | 2038.49
       | ARMA   | 2232.41 | 1248.31 | 952.02    | 775.92   | 789.30 | 1181.01  | 2028.24
       | VAR    | 1601.44 | 999.26  | 963.13    | 733.99   | 747.15 | 981.53   | 1619.33
