Article

Probabilistic Hourly Load Forecasting Using Additive Quantile Regression Models

by Caston Sigauke 1,*, Murendeni Maurel Nemukula 2 and Daniel Maposa 2
1 Department of Statistics, University of Venda, Private Bag X5050, Thohoyandou 0950, South Africa
2 Department of Statistics and Operations Research, University of Limpopo, Private Bag X1106, Sovenga 0727, South Africa
* Author to whom correspondence should be addressed.
Energies 2018, 11(9), 2208; https://doi.org/10.3390/en11092208
Submission received: 13 July 2018 / Revised: 3 August 2018 / Accepted: 15 August 2018 / Published: 23 August 2018

Abstract

Short-term hourly load forecasting in South Africa using additive quantile regression (AQR) models is discussed in this study. The modelling approach allows for easy interpretability and accounts for residual autocorrelation in the joint modelling of hourly electricity data. A comparative analysis is done using generalised additive models (GAMs). In both modelling frameworks, variable selection is done using the least absolute shrinkage and selection operator (Lasso) via hierarchical interactions. The four models considered are GAMs and AQR models, each without and with interactions. The AQR model with pairwise interactions was found to be the best fitting model. The forecasts from the four models were then combined using an algorithm based on the pinball loss (convex combination model) and also using quantile regression averaging (QRA). The AQR model with interactions was then compared with the convex combination and QRA models, and the QRA model gave the most accurate forecasts. Unlike the AQR model with interactions, the convex combination and QRA models gave prediction interval coverage probabilities that were valid for the 90%, 95% and 99% prediction intervals. The QRA model had the smallest prediction interval normalised average width and prediction interval normalised average deviation. The modelling framework discussed in this paper has established that going beyond summary performance statistics in forecasting has merit as it gives more insight into the developed forecasting models.

1. Introduction

1.1. Context

The literature discusses several approaches in which hourly or half-hourly electricity demand data are modelled either jointly or separately for each period [1,2], together with the pros and cons of each. Modelling hourly data jointly helps in exploring the correlation structure of the intra-day relationships and can improve the accuracy of forecasts [1]. Wood et al. [2] argue that modelling hourly data individually has practical disadvantages: it fails to capture the correlation between the hourly periods, it complicates interpretation because the model lacks continuity between the hourly periods, and the developed models lack statistical stability. The authors further argue that over-fitting and the burden of model checking are significantly reduced if one model is fitted to the data. However, joint modelling leads to the curse of dimensionality. Proponents of this modelling approach argue that factor analysis can help in identifying a few factors that account for most of the variation in the covariance matrix of the data [3]. Dordonnat et al. [4] develop a regression model which takes the intra-day correlation structure into account to forecast electricity demand.

1.2. Literature Review on Related Problems

It is argued in the literature that electricity demand patterns change throughout the day. Soares and Medeiros [5] argue that modelling of hourly demand data separately avoids the intra-day correlations which are common with time series data. Ramanathan et al. [6] develop flexible multiple regression models for each hour of the day to forecast electricity demand. The authors included a dynamic error structure together with adaptive adjustments which allow for the correction of forecast errors of previous hours. The modelling approach by Ramanathan et al. [6] is extended by Fan and Hyndman [7] who use a semi-parametric additive modelling framework to forecast short-term half-hourly Australian electricity demand. Using regression splines to model temperature and lagged demand effects, Fan and Hyndman [7] model each half-hourly period separately. These authors argue that modelling hourly or half-hourly electricity demand data results in more accurate forecasts.
Work on short-term load forecasting in which hourly data is modelled separately is discussed in the literature. Goude et al. [8] developed generalised additive models for forecasting electricity demand. The authors used hourly load data from 2260 substations across France. Individual models for each of the 24 h of the day were developed. The developed models produced accurate forecasts for the short- and medium-term horizons. Additive quantile regression models for forecasting probabilistic load and electricity prices are developed by Gaillard et al. [9]. The work done by Gaillard et al. [9] is extended by Fasiolo et al. [10], who developed fast calibrated additive quantile regression models. An online load forecasting system for very-short-term load forecasts is proposed by Laouafi et al. [11]. The proposed system is based on a forecast combination methodology which gives accurate forecasts in both normal and anomalous conditions. Zhang et al. [12] developed a hybrid model for short-term load forecasting based on singular spectrum analysis and a support vector machine, which is optimised by a heuristic method they refer to as the Cuckoo search algorithm. The proposed model outperformed the other heuristic models used in the study.
Boroojeni et al. [13] proposed a model which captures the complex seasonalities of electricity demand including the non-seasonal cycles. The developed model was then used for both short-term and medium-term forecasting. A boosted artificial neural network technique was presented in Khwaja et al. [14]. The developed model was compared with other artificial neural networks based models. Results showed that the new proposed model produces the lowest forecast errors. Ekonomou et al. [15] propose a methodology for short-term load forecasting. In their paper, wavelets and neural networks are used. The developed models were then applied to real and simulated data sets. In a study by Pappas et al. [16], autoregressive integrated moving average (ARIMA) models were used in short-term load forecasting. The authors showed in their study that the ARIMA model was appropriate for modelling load data with periodic variations and performed poorly during blackouts or when unexpected peaks in load demand were experienced.
A two-stage approach which is presented as a pattern recognition problem is discussed in Gajowniczek and Zabkowski [17]. The stages involve forecasting and peak detection through the use of machine learning algorithms. It is found that the proposed modelling approach produces accurate forecasts and is capable of detecting about 96.3% of the peak loads. Chapagain and Kittipiyakul [18] present a modelling approach which includes atmospheric covariates in the modelling and forecasting of short-term electricity demand. The atmospheric covariates used are cloud cover, wind speed, rainfall, relative humidity, and solar radiation including snow fall. Empirical results from this study showed a significant improvement in the forecast accuracy compared to models without atmospheric variables. Divina et al. [19] show that the use of a stacking ensemble learning scheme results in combined forecasts which are more accurate compared to the forecasts from individual models. Nagbe et al. [20] developed a functional vector autoregressive state space model for short-term electricity demand. The developed model was tested on real-life data sets and results showed that the modelling approach is adequate in forecasting electricity demand.
Short-term load forecasting using South African data is discussed in the literature. A regression-seasonal autoregressive integrated moving average (RegSARIMA) model for predicting short-term daily peak electricity demand is discussed in Chikobvu and Sigauke [21]. A comparative analysis is done with SARIMA and Holt–Winter’s triple exponential smoothing models. Empirical results from this study show that the RegSARIMA model is capable of capturing important drivers of electricity demand. In another study, an additive regression model for forecasting daily winter peak electricity demand is presented in Sigauke and Chikobvu [22]. The authors show that electricity demand in South Africa is highly sensitive to cold temperatures compared to hot temperatures. A more recent study by Sigauke and Chikobvu [23] compares the performance of time series regression models in forecasting short-term daily peak electricity demand in South Africa. Temperature effects are smoothed using regression splines and linear splines. The model in which regression splines are used produced better forecasting results.
Joint modelling of hourly electricity demand using additive quantile regression with pairwise interactions including an application of quantile regression averaging (QRA) is not discussed in detail in the literature. The current study intends to bridge this gap. The study focuses on an application of additive quantile regression (AQR) models. A comparative analysis is then done with the generalised additive models (GAMs) which are used as benchmark models. In this study, we discuss an application of pairwise hierarchical interactions discussed in Bien et al. [24] and Laurinec [25] who showed that the inclusion of interactions improves forecast accuracy.

1.3. Contributions

Based on the literature discussed in Section 1.2, the contributions of the present study are as follows. First, the study establishes that going beyond summary performance statistics has merit as it gives more insight into the forecasting models. Second, QRA forecasts result in valid prediction interval coverage probabilities and narrow prediction interval widths. Third, the inclusion of hierarchical pairwise interactions and a nonlinear trend variable improves forecast accuracy. Finally, the modelling framework allows for residual autocorrelation in the joint modelling of hourly electricity data.
The models are presented in Section 2 and the case study is described in Section 3. Section 4 presents the empirical results, which are discussed in Section 5. The conclusions are given in Section 6.

2. Theoretical Background

2.1. Quantile Regression

Developed by Koenker and Bassett [26], quantile regression (QR) was introduced as a modelling framework for estimating conditional quantiles of the response variable. If $Y$ denotes a random variable representing the response variable with corresponding covariates $X$, then the conditional quantile $q_{Y|X}(\tau)$, where $\tau \in (0, 1)$, is defined as $q_{Y|X}(\tau) = \inf\{y \in \mathbb{R} : F_{Y|X}(y) \geq \tau\}$, where $F_{Y|X}$ represents the conditional distribution function of $Y$ given $X$. The conditional quantile $q_{Y|X}(\tau)$ is a solution to
$$q_{Y|X}(\tau) = \arg\min_{g} E\left[\rho_{\tau}\left(Y - g(X)\right) \mid X\right], \tag{1}$$
where $\rho_{\tau}$ is the quantile loss, also known as the pinball loss, defined as $\rho_{\tau}(s) = s\left(\tau - I(s < 0)\right)$, and $I(\cdot)$ is an indicator function. Now, let $Y_t = X_t^{T}\beta + \varepsilon_t$ be a linear quantile regression where $Y_t$ denotes hourly electricity demand, $X_t$ the design matrix, $\beta$ a vector of parameters and $\varepsilon_t$ the error term; then, the estimates of $\beta$ are given as
$$\hat{\beta}_{\tau} = \arg\min_{\beta \in \mathbb{R}^{p}} \sum_{t=1}^{n} \rho_{\tau}\left(Y_t - X_t^{T}\beta\right). \tag{2}$$
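To make the pinball loss concrete, the following minimal Python sketch evaluates $\rho_{\tau}$ and checks numerically that, for an intercept-only model, the constant with the smallest total pinball loss is an empirical $\tau$-quantile of the sample. It is an illustration only and not the authors' implementation: the function names and the toy sample are assumptions introduced here.

```python
def pinball(s, tau):
    """Pinball loss rho_tau(s) = s * (tau - I(s < 0)) for a residual s."""
    indicator = 1.0 if s < 0 else 0.0
    return s * (tau - indicator)


def total_pinball(y, c, tau):
    """Total pinball loss when every observation in y is predicted by c."""
    return sum(pinball(v - c, tau) for v in y)


def best_constant(y, tau):
    """Sample value with the smallest total pinball loss (intercept-only fit)."""
    return min(y, key=lambda c: total_pinball(y, c, tau))


# Hypothetical toy sample: the integers 0, 1, ..., 100.
sample = list(range(101))
q90 = best_constant(sample, 0.9)  # 90, the empirical 0.9-quantile of the sample
```

The search is restricted to the sample values for simplicity; a full linear quantile regression would instead minimise the same loss over the coefficient vector $\beta$.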

2.2. Generalised Additive Models

Generalised additive models (GAMs), which were developed by Hastie and Tibshirani [27,28], model the effect of the predictors in regression-based models as a sum of smooth functions. The generalised additive model (GAM) is written as [28,29,30]:
$$g\left(E(y_t)\right) = \beta_{0t} + \sum_{i=1}^{p} s_i(x_{ti}) + \varepsilon_t, \tag{3}$$
where $y_t$ follows some exponential family distribution, $g$ denotes a link function (usually the identity link of the Gaussian family), $s_i$ are smooth functions and $\varepsilon_t$ is the error term. The smooth function, $s$, is written as
$$s(x) = \sum_{j=1}^{k} \beta_j b_j(x), \tag{4}$$
where $\beta_j$ denotes the $j$th parameter, and $b_j(x)$ represents the $j$th basis function, with the dimension of the basis denoted by $k$. There are several smoothing spline bases, ranging from P-splines, thin plate regression splines, B-splines and cubic regression splines to cyclic cubic regression splines. In this study, we use P-splines and adaptive splines. We seek an optimal solution to the optimisation problem given in Equation (5):
$$\min \sum_{t=1}^{n} \left(y_t - \sum_{i=1}^{p} s_i(x_{ti})\right)^{2} + \sum_{i=1}^{p} \lambda_i \int \left(s_i''(x)\right)^{2} dx, \tag{5}$$
where $\lambda_i$ is the $i$th smoothing parameter.

2.3. The Proposed Models

2.3.1. Additive Quantile Regression Model

An additive quantile regression (AQR) model is a hybrid model that combines GAM and QR models. AQR models were first applied to short-term load forecasting by Gaillard et al. [9] and extended by Fasiolo et al. [10]. Let $y_t$ denote hourly electricity demand, where $t = 1, \ldots, n$ and $n$ is the number of observations, and let the number of days be denoted by $n_d$. Then $n = 24 n_d$, where 24 is the number of hours in a day, and the corresponding $p$ covariates are $x_{t1}, x_{t2}, \ldots, x_{tp}$. The AQR model is given in Equation (6) [9,10]:
$$y_{t,\tau} = \sum_{j=1}^{p} s_{j,\tau}(x_{tj}) + \varepsilon_{t,\tau}, \quad \tau \in (0, 1), \tag{6}$$
where $s_{j,\tau}$ are smooth functions and $\varepsilon_{t,\tau}$ is the error term. The smooth function, $s$, is written as
$$s_j(x) = \sum_{k=1}^{q} \beta_{kj} b_{kj}(x_{tj}), \tag{7}$$
where $\beta_{kj}$ denotes the $k$th parameter and $b_{kj}$ the $k$th basis function of the $j$th smooth, with the dimension of the basis denoted by $q$. The parameter estimates of Equation (6) are obtained by minimising the function given in Equation (8):
$$q_{Y|X}(\tau) = \sum_{t=1}^{n} \rho_{\tau}\left(y_{t,\tau} - \sum_{j=1}^{p} s_{j,\tau}(x_{tj})\right), \tag{8}$$
where $\rho_{\tau}$ is the pinball loss function that is defined in Section 2.1. The AQR models with pairwise interactions are given in Equations (9) and (10):
$$y_{t,\tau} = \sum_{j=1}^{p} s_{j,\tau}(x_{tj}) + \sum_{k=1}^{K} \sum_{j=1}^{J} \alpha_{jk}\, s_j(x_{tj})\, s_k(x_{tk}) + \varepsilon_{t,\tau},$$
$$\phi(B)\Phi(B^{s})\varepsilon_{t,\tau} = \theta(B)\Theta(B^{s}) v_{t,\tau}, \tag{9}$$
$$\phi(B)\Phi(B^{s})\left[y_{t,\tau} - \left(\sum_{j=1}^{p} s_{j,\tau}(x_{tj}) + \sum_{k=1}^{K} \sum_{j=1}^{J} \alpha_{jk}\, s_j(x_{tj})\, s_k(x_{tk})\right)\right] = \theta(B)\Theta(B^{s}) v_{t,\tau}. \tag{10}$$
A comparative analysis will be done with the GAM given in Equation (11) and discussed in Sigauke [31]:
$$y_t = \beta_{0t} + \sum_{i=1}^{p} s_i(x_{ti}) + \sum_{k=1}^{K} \sum_{j=1}^{J} \alpha_{jk}\, s_j(x_{tj})\, s_k(x_{tk}) + \varepsilon_t,$$
$$\phi(B)\Phi(B^{s})\varepsilon_t = \theta(B)\Theta(B^{s}) v_t, \tag{11}$$
$$\phi(B)\Phi(B^{s})\left[y_t - \left(\beta_{0t} + \sum_{i=1}^{p} s_i(x_{ti}) + \sum_{k=1}^{K} \sum_{j=1}^{J} \alpha_{jk}\, s_j(x_{tj})\, s_k(x_{tk})\right)\right] = \theta(B)\Theta(B^{s}) v_t, \tag{12}$$
where $y_t$ denotes hourly electricity demand, $s_i$ denotes the smoothing function, $x_{ti}$ represents the covariates, and $\varepsilon_t$ denotes error terms which are assumed to be autocorrelated. Following standard seasonal ARMA notation, $B$ is the backshift operator, $\phi(B)$ and $\Phi(B^{s})$ are the non-seasonal and seasonal autoregressive polynomials, $\theta(B)$ and $\Theta(B^{s})$ are the non-seasonal and seasonal moving average polynomials, and $v_t$ is a white noise process. Selection of variables is done using the least absolute shrinkage and selection operator (Lasso) for the hierarchical interactions method developed by Bien et al. [24] and implemented in the R package “hierNet” [32]. The objective is to include an interaction only when both of its variables are also included in the model. This restriction, known as the strong hierarchy constraint, is discussed in detail in Bien et al. [24] and Lim and Hastie [33].

2.3.2. Forecast Error Measures

There are several error measures for probabilistic forecasting, which include, among others, the continuous ranked probability score, the logarithmic score and the quantile loss, which is also known as the pinball loss. In this paper, we use the pinball loss function, which is relatively easy to compute and interpret [34]. The pinball loss function is given as
$$L(q_{\tau}, y_t) = \begin{cases} \tau\,(y_t - q_{\tau}), & \text{if } y_t > q_{\tau}, \\ (1 - \tau)\,(q_{\tau} - y_t), & \text{if } y_t \leq q_{\tau}, \end{cases} \tag{13}$$
where $q_{\tau}$ is the quantile forecast and $y_t$ is the observed value of hourly electricity demand.

2.3.3. Percentage Improvement

The percentage improvement of the best model, $M_j^{\text{best}}$, $j = 1, \ldots, k$, over each of the other models is computed as follows [35]:
$$\text{Improvement}\ (\%) = \left(1 - \frac{\text{Pinball (best model)}}{\text{Pinball (other model)}}\right) \times 100. \tag{14}$$
Equation (14) is used to compute the percentage improvements of the best model over the other models.
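As a worked illustration of Equation (14), the short Python sketch below computes the percentage improvement from two average pinball losses. The function name and the two loss values are hypothetical and are not results from this study.

```python
def percentage_improvement(pinball_best, pinball_other):
    """Percentage improvement of the best model over another model, Equation (14)."""
    if pinball_other <= 0:
        raise ValueError("the pinball loss of the other model must be positive")
    return (1.0 - pinball_best / pinball_other) * 100.0


# Hypothetical average pinball losses, chosen only for illustration.
example = percentage_improvement(pinball_best=80.0, pinball_other=100.0)  # about 20 (%)
```

A value of 0 means the two models have the same pinball loss, and a negative value means the supposed best model is in fact worse than the other model.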

2.3.4. Prediction Intervals

For each of the models, $M_j$, $j = 1, \ldots, k$, we compute the prediction interval widths (PIWs), which we shall abbreviate as $\text{PIW}_{ij}$, $i = 1, \ldots, n$, $j = 1, \ldots, k$, as follows:
$$\text{PIW}_{ij} = \text{UL}_{ij} - \text{LL}_{ij}, \tag{15}$$
where $\text{UL}_{ij}$ and $\text{LL}_{ij}$ are the upper and lower limits of the prediction interval, respectively. The analysis for determining the model which yields narrower PIWs is done in this study using box and whisker plots, together with the probability density plots. A comparative analysis is done using the prediction intervals based on QRA [36].

2.3.5. Evaluation of Prediction Intervals

A prediction interval with nominal confidence (PINC) of $100(1 - \alpha)\%$ is defined as the probability that the forecast $\hat{y}_{t,\tau}$ lies in the prediction interval $(\text{LL}_{ij}, \text{UL}_{ij})$. PINC is given in Equation (16) [37]:
$$\text{PINC} = P\left(\hat{y}_{t,\tau} \in (\text{LL}_{ij}, \text{UL}_{ij})\right) = 100(1 - \alpha)\%. \tag{16}$$
Various indices are used to evaluate the reliability of prediction intervals (PIs). In this paper, we use the prediction interval coverage probability (PICP), the prediction interval normalised average width (PINAW) and the prediction interval normalised average deviation (PINAD) that are discussed in Sun et al. [37] and Shen et al. [38]. The PICP is given in Equation (17):
$$\text{PICP} = \frac{1}{m} \sum_{i=1}^{m} I_{ij}, \tag{17}$$
where $m$ is the number of forecasts and $I_{ij}$ is a binary variable that is defined as
$$I_{ij} = \begin{cases} 1, & \text{if } y_i \in (\text{LL}_{ij}, \text{UL}_{ij}), \\ 0, & \text{otherwise}. \end{cases} \tag{18}$$
The PICP is valid if it is greater than or equal to the predetermined level of confidence [37,38]. The PINAW is an index that measures the average width of the prediction intervals relative to the range of the observed values and is given as follows [37,38]:
$$\text{PINAW} = \frac{1}{m\left(\max(y_{ij}) - \min(y_{ij})\right)} \sum_{i=1}^{m} \left(\text{UL}_{ij} - \text{LL}_{ij}\right), \quad j = 1, \ldots, k. \tag{19}$$
If the PICP is valid and accurate, then the PINAW is usually small [37,38]. The PINAW can also be used to compare different models and determine the one that has the smallest percentage value. Another index, which is used to assess the deviation of the target value from the prediction interval, is the PINAD, which is given in Equation (20) [37,38]:
$$\text{PINAD} = \frac{1}{m} \sum_{i=1}^{m} \frac{D_{ij}}{\max(y_{ij}) - \min(y_{ij})}, \tag{20}$$
where
$$D_{ij} = \begin{cases} \text{LL}_{ij} - y_{ij}, & \text{if } y_{ij} < \text{LL}_{ij}, \\ 0, & \text{if } \text{LL}_{ij} \leq y_{ij} \leq \text{UL}_{ij}, \\ y_{ij} - \text{UL}_{ij}, & \text{if } y_{ij} > \text{UL}_{ij}. \end{cases} \tag{21}$$
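The three indices can be computed directly from the observed values and the interval limits. The Python sketch below follows Equations (17) to (21) for a single model; the function name and the toy numbers are assumptions made for illustration and are not values from this study.

```python
def pi_metrics(y, lower, upper):
    """Return (PICP, PINAW, PINAD) for observations y and interval limits."""
    m = len(y)
    if m == 0 or len(lower) != m or len(upper) != m:
        raise ValueError("y, lower and upper must be non-empty and equally long")
    y_range = max(y) - min(y)
    if y_range == 0:
        raise ValueError("the observed values must not all be equal")

    covered = 0        # number of observations inside (LL, UL), Equation (18)
    total_width = 0.0  # sum of interval widths, Equation (19)
    total_dev = 0.0    # sum of deviations D from the interval, Equation (21)
    for yi, lo, up in zip(y, lower, upper):
        if lo < yi < up:
            covered += 1
        total_width += up - lo
        if yi < lo:
            total_dev += lo - yi
        elif yi > up:
            total_dev += yi - up

    picp = covered / m
    pinaw = total_width / (m * y_range)
    pinad = total_dev / (m * y_range)
    return picp, pinaw, pinad


# Hypothetical observations and prediction limits.
y_obs = [10.0, 20.0, 30.0, 40.0]
ll = [8.0, 22.0, 25.0, 35.0]
ul = [12.0, 26.0, 35.0, 38.0]
picp, pinaw, pinad = pi_metrics(y_obs, ll, ul)
```

In this toy case two of the four observations fall inside their intervals, so the PICP is 0.5, which would not be valid at any of the usual nominal levels.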

2.3.6. Forecast Error Distribution

For each of the models, $M_j$, $j = 1, \ldots, k$, we extract the residuals $\varepsilon_{tj} = y_{tj} - \hat{y}_{tj}$ and then compute the under and over predictions. Probability density and box plots of forecast errors, including summary statistics, are used in the analysis of over and under predictions.
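A minimal Python sketch of this step is given below. It treats a positive residual as an under prediction and a negative residual as an over prediction; the function name and the toy values are hypothetical.

```python
def count_under_over(actual, forecast):
    """Return the residuals y - y_hat with the under and over prediction counts."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    residuals = [a - f for a, f in zip(actual, forecast)]
    under = sum(1 for r in residuals if r > 0)  # forecast below the actual demand
    over = sum(1 for r in residuals if r < 0)   # forecast above the actual demand
    return residuals, under, over


# Hypothetical hourly demand and forecasts.
residuals, under, over = count_under_over([100, 105, 98, 110], [98, 107, 98, 104])
```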

2.3.7. Forecast Combination

QRA regresses the response variable on the individual forecasts, which are treated as independent variables. Let $y_{t,\tau}$ be hourly electricity demand as discussed in Section 2.3.1 and let there be $M$ methods used to predict the next observations of $y_{t,\tau}$, which shall be denoted by $y_{t+1}, y_{t+2}, \ldots, y_{t+M}$. Using $m = 1, \ldots, M$ methods, the combined forecasts are given by
$$\hat{y}_{t,\tau}^{\text{QRA}} = \beta_0 + \sum_{j=1}^{k} \beta_j \hat{y}_{tj} + \varepsilon_{t,\tau}, \tag{22}$$
where $\hat{y}_{tj}$ represents the forecasts from method $j$, $\hat{y}_{t,\tau}^{\text{QRA}}$ is the combined forecast and $\varepsilon_{t,\tau}$ is the error term. We seek to minimise
$$\arg\min_{\beta} \sum_{t=1}^{n} \rho_{\tau}\left(\hat{y}_{t}^{\text{QRA}} - \beta_0 - \sum_{j=1}^{k} \beta_j \hat{y}_{tj}\right).$$
In matrix form, we have
$$\arg\min_{\beta \in \mathbb{R}^{p}} \sum_{t=1}^{n} \rho_{\tau}\left(\hat{y}_{t}^{\text{QRA}} - x_t^{T}\beta\right),$$
which reduces to
$$\arg\min_{\beta \in \mathbb{R}^{p}} \left[\sum_{t:\, \hat{y}_{t}^{\text{QRA}} > x_t^{T}\beta} \tau \left|\hat{y}_{t}^{\text{QRA}} - x_t^{T}\beta\right| + \sum_{t:\, \hat{y}_{t}^{\text{QRA}} < x_t^{T}\beta} (1 - \tau) \left|\hat{y}_{t}^{\text{QRA}} - x_t^{T}\beta\right|\right].$$
The QRA forecasts will be compared with forecasts based on the weighted average of the forecasts given in Equation (23):
$$\hat{y}_{t,\tau}^{c} = \sum_{m=1}^{M} \omega_{mt}\, \hat{y}_{mt,\tau}, \tag{23}$$
where $\omega_{mt}$ is the weight assigned to forecast $m$.
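The weighted average in Equation (23) can be sketched in a few lines of Python. The weights below are the ones reported in Section 4.2.2 for the models M1 to M4, but two simplifications are assumed here: the weights are held constant over time, and the forecast values are hypothetical. The sketch does not reproduce the algorithm used to estimate the weights.

```python
# Weights reported in Section 4.2.2 for the models M1 to M4.
REPORTED_WEIGHTS = (0.0174, 0.0946, 0.326, 0.562)


def convex_combination(forecasts, weights):
    """Combine per-model forecast series with fixed, non-negative weights."""
    if len(forecasts) != len(weights):
        raise ValueError("one weight is needed per model")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-6:
        raise ValueError("weights must be non-negative and sum to one")
    horizon = len(forecasts[0])
    if any(len(f) != horizon for f in forecasts):
        raise ValueError("all forecast series must have the same length")
    return [
        sum(w * f[t] for w, f in zip(weights, forecasts))
        for t in range(horizon)
    ]


# Hypothetical forecasts (MW) from four models for two consecutive hours.
model_forecasts = [
    [30000.0, 31000.0],
    [30100.0, 30900.0],
    [29900.0, 31100.0],
    [30050.0, 31050.0],
]
combined = convex_combination(model_forecasts, REPORTED_WEIGHTS)
```

Because the weights are non-negative and sum to one, each combined forecast lies between the smallest and largest individual forecast for that hour. QRA, in contrast, estimates unconstrained coefficients and an intercept by minimising the pinball loss.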

3. Description of the Case Study

The modelling framework discussed in Section 2 is applied to a real-life data set. Hourly load data from Eskom, South Africa’s power utility company, are used. The data cover all sectors of the South African economy, i.e., the industrial, commercial, agricultural and residential sectors. Hourly temperature data from the South African Weather Services, recorded at 28 meteorological stations, are also used. Other variables (predictors) used are lagged demand at lags 1, 12 and 24, together with the factor variables hour (hour = 1, hour = 2, ..., hour = 24); month (month = 1 for January, month = 2 for February, ..., month = 12 for December); daytype (daytype = 1 for Monday, daytype = 2 for Tuesday, ..., daytype = 7 for Sunday); and holiday, which takes the value 1 if a day is a holiday and also the value 1 for the day before and the day after a holiday. In addition, a nonlinear trend variable is used.
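The predictors described above could be assembled along the following lines. This Python sketch is only an assumption about how such a table might be built: the function name, the dictionary layout, the labelling of midnight as hour 1 and the toy inputs are all hypothetical, and the temperature and nonlinear trend variables are omitted.

```python
from datetime import date, datetime, timedelta

LAGS = (1, 12, 24)  # lagged demand at lags 1, 12 and 24


def build_features(load, start, holidays=()):
    """Build one row of predictors per hour from an hourly load series.

    load is a list of hourly demand values, start is the datetime of load[0]
    and holidays is an iterable of date objects.
    """
    flagged = set()
    for day in holidays:
        # A holiday, the day before it and the day after it are all flagged.
        flagged.update({day - timedelta(days=1), day, day + timedelta(days=1)})

    rows = []
    for t in range(max(LAGS), len(load)):
        stamp = start + timedelta(hours=t)
        row = {
            "demand": load[t],
            "hour": stamp.hour + 1,         # 1, ..., 24 (assumed labelling)
            "daytype": stamp.isoweekday(),  # 1 = Monday, ..., 7 = Sunday
            "month": stamp.month,           # 1 = January, ..., 12 = December
            "holiday": int(stamp.date() in flagged),
        }
        for lag in LAGS:
            row[f"lag{lag}"] = load[t - lag]
        rows.append(row)
    return rows


# Hypothetical two-day load series starting on 1 January 2010 (a public holiday).
rows = build_features(list(range(48)), datetime(2010, 1, 1, 0), [date(2010, 1, 1)])
```

The first 24 hours are dropped because the lag-24 value is not available for them.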

4. Empirical Results

4.1. Exploratory Data Analysis

The summary statistics of hourly electricity demand for the sampling period January 2010 to December 2012 are given in Table 1. The distribution of hourly load is non-normal: it is skewed to the left and platykurtic, as shown by the skewness value of −0.243 and the kurtosis value of 2.05 given in Table 1.
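For reference, skewness and kurtosis can be computed from the central moments of a sample as in the Python sketch below. The moment-based estimators are an assumption, since the estimators behind Table 1 are not stated, and the toy data are hypothetical. On this scale a normal distribution has a kurtosis of 3, so values below 3 indicate a platykurtic distribution and negative skewness indicates a longer left tail.

```python
def skewness_kurtosis(x):
    """Moment-based skewness m3 / m2**1.5 and kurtosis m4 / m2**2."""
    n = len(x)
    if n == 0:
        raise ValueError("the sample must not be empty")
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    if m2 == 0:
        raise ValueError("the sample must not be constant")
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical symmetric sample: skewness 0 and kurtosis below 3.
skew, kurt = skewness_kurtosis([1, 2, 3, 4, 5])
```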
Figure 1 shows the time series plot of hourly electricity demand together with density, normal quantile to quantile (Q–Q) and box plots that all show departure from normality of the data. The distribution of the sampling data is bimodal.
A plot of hourly electricity demand with a superimposed nonlinear trend is shown in Figure 2. A penalised cubic regression spline, $\pi(t) = \sum_{t=1}^{n} \left(y_t - f(x_t)\right)^{2} + \lambda \int \left(f''(x)\right)^{2} dx$, is used as the nonlinear trend function, where the smoothing parameter $\lambda$ is estimated by the generalised cross-validation (GCV) approach. The fitted values are extracted and used as input values for the nonlinear trend variable in the GAM and AQR models.

4.2. Forecasting Electricity Demand When Covariates Are Known in Advance

4.2.1. Forecasting Results

The data used are hourly electricity demand from 1 January 2010 to 31 December 2012, giving $n = 26{,}281$ observations. The data are split into training data, 1 January 2010 to 2 April 2012, i.e., $n_1 = 19{,}708$, and testing data, from 2 April 2012 to 31 December 2012, i.e., $n_2 = 6573$, which is about 25% of the total number of observations. The smoothed effect of the variable “hour”, which is given in Figure 3, shows that daily peak electricity demand occurs around 7:00 p.m. The period 5:00 p.m. to 9:00 p.m. is then considered as the peak period in which electricity demand is expected to exceed a certain high threshold, which is likely to cause problems for the system operators due to grid instability and severe stress on the system.
The models considered are $M_1$ (GAM) and $M_2$ (GAMI), which are GAM models without and with interactions, respectively, and $M_3$ (AQR) and $M_4$ (AQRI), which are additive quantile regression models without and with interactions, respectively. The four models $M_1$ to $M_4$ are then combined based on the pinball losses, resulting in $M_5$, and also combined using QRA, resulting in $M_6$.

4.2.2. Out of Sample Forecasts

After correcting for residual autocorrelation, we then use the models for out of sample forecasting (testing). A comparative analysis of the models given in Table 2 shows that $M_4$ is the best model out of the four models, $M_1$ to $M_4$, based on the root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE). The forecasts from the four models are then combined based on the pinball losses. The weights assigned to the forecasts from the models $M_1$ to $M_4$ are 0.0174, 0.0946, 0.326 and 0.562, respectively. The model for combining the forecasts based on the pinball losses is $M_5$. Model $M_6$, i.e., the model based on QRA, has the lowest MAE and MAPE values, as shown in Table 2. Model $M_4$ has more under predictions than over predictions, and $M_5$ has more over predictions than under predictions, while, for model $M_6$, the numbers of under and over predictions are almost the same.
Using models $M_4$, $M_5$ and $M_6$, we then compute the average pinball losses. The average losses suffered by the models based on the pinball losses are given in Table 3, with model $M_6$ having the smallest average pinball loss.
In order to test the effectiveness of the forecasting models $M_4$ to $M_6$, we present, in Figure 4, box plots of the pinball losses of the models.

4.2.3. Evaluation of Prediction Intervals

Empirical prediction intervals (PIs) are constructed using the forecasts from the models $M_4$ to $M_6$. The constructed PIs are then used to find the PIWs, PINAWs and PINADs and to count the number of forecasts below and above the PIs from each model. Summary statistics of the PIWs for the models $M_4$ to $M_6$ for a PINC value of 95% are given in Table 4. The distributions of the PIWs for the three models are all leptokurtic since their kurtosis values are greater than 3. They are all skewed to the right since their skewness values are all positive. This shows that heavy-tailed distributions would be appropriate to fit the distributions of the PIWs of the three models. Model $M_5$ has the smallest standard deviation, which indicates narrower PIWs compared to $M_4$ and $M_6$.
Boxplots of the widths of the PIs for the forecasting models $M_4$, $M_5$ and $M_6$ are given in Figure 5. The figure shows that the PIs from model $M_5$ are narrower compared to those from $M_4$ and $M_6$.
Figure 6 shows the density plots of the PIWs of $M_4$, $M_5$ and $M_6$. The distribution of the PIWs of $M_5$ is bimodal and all the densities show that the distributions are skewed to the right.
In order to choose the best model based on the analysis of the PIWs, we need to calculate the PICPs, PINAWs and PINADs, including a count of the number of forecasts below and above the PIs. This is done for PINC values of 90%, 95% and 99%. A comparative evaluation of the models using PI indices for these PINC values is given in Table 5. Models $M_5$ and $M_6$ have valid PICPs for the three PINC values, with $M_6$ having the highest PICP. Model $M_6$ has the smallest PINAD values and the fewest forecasts falling below and above the PIs. Model $M_4$ has the smallest PINAW value for all three of the PINC values. All three of the models could be used in the construction of PIs. Although $M_4$ does not give a valid PICP, its PINAW and PINAD are reasonably small. The performance of model $M_6$ seems to be the best amongst these three models. However, this analysis is not enough and, as a result, we need further analysis using the residuals of the three models.

4.2.4. Residual Analysis

Table 6 gives summary statistics of the residuals from the models $M_4$, $M_5$ and $M_6$. Model $M_6$ has the smallest standard deviation with a median of zero, showing that it is the best model for predicting hourly electricity demand. All three of the models have positive skewness, an indication of a large number of positive errors, which is a reflection of underestimation of predicted hourly electricity demand. Model $M_6$ has the smallest skewness value. A failure to predict high electricity demand is shown by high values of kurtosis [16]. The kurtosis values of all three of the models are greater than 3.
The error distributions of each of the forecasting models $M_4$, $M_5$ and $M_6$ are given in Figure 7, which shows that the number of positive errors dominates negative errors, an indication that the distribution of errors for each of the three models is positively skewed. Model $M_6$ is the best fitting model since its error distribution is the narrowest.
Figure 8 shows the boxplots of the hourly errors from the three models. The range of the errors from $M_6$ is narrower compared to the ones from $M_4$ and $M_5$.

4.2.5. Plots of out of Sample Forecasts

From the analysis of the PIWs and the residual analysis, $M_6$ is the best fitting model and can be used for predicting hourly electricity demand. The plot of actual demand superimposed with forecasted demand from model $M_6$ (2 April to 31 December 2012) given in Figure 9 shows that the forecasts follow hourly electricity demand very well.
The density plots from the $M_6$ (QRA forecasts) and $M_5$ (convex combination forecasts) models superimposed with actual hourly electricity demand are given in Figure 10. In both plots, the fit of the densities is fairly good.
A summary of the accuracy measures for the months April to December 2012 for each of the first 168 forecasts of each month is given in Table A1 in Appendix A while Appendix B shows in Figure A1, Figure A2 and Figure A3 hourly load superimposed with forecasts together with their respective densities.

5. Discussion

The modelling approach discussed in this study allows for easy interpretability and accounts for residual autocorrelation in the joint modelling of hourly electricity data. A comparative analysis was done with generalised additive models (GAMs). In both modelling frameworks, variable selection was done using Lasso via hierarchical interactions. The four models considered were GAMs and AQR models, each with and without interactions. The AQR model with pairwise interactions was found to be the best fitting model. The forecasts from the four models were then combined using an algorithm based on the pinball loss (convex combination model) and also using quantile regression averaging (QRA). The AQR model with interactions was then compared with the convex combination and QRA models, and the QRA model gave the most accurate forecasts. Unlike the AQR model with interactions, the convex combination model and the QRA model gave prediction interval coverage probabilities which were valid for the 90%, 95% and 99% prediction intervals. The QRA model had the smallest prediction interval normalised average width and prediction interval normalised average deviation.

6. Conclusions

This study discussed an application of short-term hourly electricity demand forecasting in South Africa using additive quantile regression (AQR) models without and with pairwise interactions, which satisfy the strong hierarchy constraint in Lasso via hierarchical interactions. This modelling approach allows for a detailed analysis which goes beyond the performance statistics in forecasting. This approach has merit in that it gives more insight into the developed models.

Author Contributions

Conceptualization, C.S.; Methodology, C.S., M.M.N. and D.M.; Software, C.S., M.M.N. and D.M.; Validation, C.S., M.M.N. and D.M.; Formal Analysis, C.S., M.M.N. and D.M.; Investigation, C.S., M.M.N. and D.M.; Data Curation, C.S., M.M.N. and D.M.; Writing—Original Draft Preparation, C.S.; Writing—Review and Editing, M.M.N. and D.M.; Project Administration, C.S.; Funding Acquisition, C.S.

Funding

This research was funded by the National Research Foundation of South Africa, Grant No. 93613.

Acknowledgments

The authors are grateful to Eskom, South Africa’s power utility company for providing the data.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AQR: Additive Quantile Regression
GAM: Generalised Additive Model
MAE: Mean Absolute Error
MAPE: Mean Absolute Percentage Error
PI: Prediction Interval
PICP: Prediction Interval Coverage Probability
PINAD: Prediction Interval Normalised Average Deviation
PINAW: Prediction Interval Normalised Average Width
PINC: Prediction Interval with Nominal Confidence
QR: Quantile Regression
QRA: Quantile Regression Averaging
RMSE: Root Mean Square Error

Appendix A. Summary of the Accuracy Measures for the Months April to December 2012

A summary of the accuracy measures for the months April to December 2012 for each of the first 168 forecasts of each month is given in Table A1. The best forecasts are in October and the worst are in April.
Table A1. Forecast accuracy measures root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE) for the forecasts of April to December 2012.
Month | RMSE (MW) | MAE (MW) | MAPE (%)
April | 945.5214 | 781.6429 | 3.151406
May | 620.7605 | 488.6548 | 1.891559
June | 665.0797 | 537.5238 | 1.898156
July | 392.0611 | 329.3393 | 1.181808
August | 642.6158 | 538.2321 | 1.903814
September | 750.3948 | 618.5476 | 2.264714
October | 345.0181 | 271.0595 | 1.010533
November | 394.3301 | 302.9048 | 1.146244
December | 468.6219 | 369.5595 | 1.395704
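For readers who wish to reproduce measures of this kind, the three accuracy measures in Table A1 can be computed as in the following minimal Python sketch. The function name `accuracy_measures` and the example values are hypothetical. The sketch uses the standard definitions of RMSE, MAE and MAPE and assumes that all observed loads are non-zero.

```python
import numpy as np


def accuracy_measures(y, y_hat):
    """Return RMSE, MAE (same units as y) and MAPE (%) of point forecasts."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    err = y - y_hat
    return {
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAE": float(np.mean(np.abs(err))),
        # MAPE divides by the observed value, so y must not contain zeros.
        "MAPE": float(100.0 * np.mean(np.abs(err / y))),
    }


# Two hypothetical hourly loads and forecasts (MW).
measures = accuracy_measures([100.0, 200.0], [110.0, 180.0])
```

Applying such a function to the first 168 hourly forecasts of each month would give one row of the table per month.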

Appendix B. Hourly Load with Forecasts for the Months April–December 2012

Hourly load superimposed with the first 168 forecasts of each month from April to December 2012, together with the respective densities, is shown in Figure A1, Figure A2 and Figure A3.
Figure A1. Hourly load superimposed with forecasts for the first 168 forecasts of each month of the months April to June 2012 together with their respective densities.
Figure A2. Hourly load superimposed with forecasts for the first 168 forecasts of each month of the months July to September 2012 together with their respective densities.
Figure A3. Hourly load superimposed with forecasts for the first 168 forecasts of each month of the months October to December 2012 together with their respective densities.

References

  1. Maciejowska, K.; Weron, R. Forecasting of daily electricity prices with factor models: Utilizing intra-day and inter-zone relationships. Comput. Stat. 2017, 30, 805–819. [Google Scholar] [CrossRef]
  2. Wood, S.N.; Goude, Y.; Shaw, S. Generalized additive models for large datasets. J. R. Stat. Soc. 2015, 64, 139–155. [Google Scholar] [CrossRef] [Green Version]
  3. Tsay, R.S. Analysis of Financial Time Series, 2nd ed.; Wiley Series in Probability and Statistics; Wiley Online Library: Hoboken, NJ, USA, 2005. [Google Scholar]
  4. Dordonnat, V.; Koopman, S.J.; Ooms, M. Dynamic factors in periodic time-varying regressions with an application to hourly electricity load modelling. Comput. Stat. Data Anal. 2012, 56, 3134–3152. [Google Scholar] [CrossRef]
  5. Soares, L.J.; Medeiros, M.C. Modeling and Forecasting Short-term Electric Load Demand: A Two-Step Methodology. 2016. Available online: https://pdfs.semanticscholar.org/734b/3f6565243912784ad7b1a7421acb7188c9ca.pdf (accessed on 28 December 2016).
  6. Ramanathan, R.; Engle, R.; Granger, C.W.J.; Vahid-Araghi, F.; Brace, C. Short-run forecasts of electricity loads and peaks. Int. J. Forecast. 1997, 13, 161–174. [Google Scholar] [CrossRef]
  7. Fan, S.; Hyndman, R.J. Short-term load forecasting based on a semi-parametric additive model. IEEE Trans. Power Syst. 2012, 27, 134–141. [Google Scholar] [CrossRef]
  8. Goude, Y.; Nedellec, R.; Kong, N. Local short and middle term electricity load forecasting with semi-parametric additive models. IEEE Trans. Smart Grid 2014, 5, 440–446. [Google Scholar] [CrossRef]
  9. Gaillard, P.; Goude, Y.; Nedellec, R. Additive models and robust aggregation for GEFcom2014 probabilistic electric load and electricity price forecasting. Int. J. Forecast. 2016, 32, 1038–1050. [Google Scholar] [CrossRef]
  10. Fasiolo, M.; Goude, Y.; Nedellec, R.; Wood, S.N. Fast Calibrated Additive Quantile Regression. 2017. Available online: https://github.com/mfasiolo/qgam/blob/master/draftqgam.pdf (accessed on 13 March 2017).
  11. Laouafi, A.; Mordjaoui, M.; Haddad, S.; Boukelia, T.E.; Ganouche, A. Online electricity demand forecasting based on effective forecast combination methodology. Electr. Power Syst. Res. 2017, 148, 35–47. [Google Scholar] [CrossRef]
  12. Zhang, X.; Wang, J.; Zhang, K. Short-term electric load forecasting based on singular spectrum analysis and support vector machine optimized by Cuckoo search algorithm. Electr. Power Syst. Res. 2017, 146, 270–285. [Google Scholar] [CrossRef]
  13. Boroojeni, K.G.; Amini, M.H.; Bahrami, S.; Iyengar, S.S.; Sarwat, A.I.; Karabasoglu, O. A novel multi-time-scale modelling for electric power demand forecasting: From short-term to medium-term horizon. Electr. Power Syst. Res. 2017, 142, 58–73. [Google Scholar] [CrossRef]
  14. Khwaja, A.S.; Zhang, X.; Anpalagan, A.; Venkatesh, B. Boosted neural networks for improved short-term electric load forecasting. Electr. Power Syst. Res. 2017, 143, 431–437. [Google Scholar] [CrossRef]
  15. Ekonomou, L.; Christodoulou, C.A.; Mladenov, V. A short-term load forecasting method using artificial neural networks and wavelet analysis. Int. J. Power Syst. 2016, 1, 64–68. [Google Scholar]
  16. Pappas, S.S.; Ekonomou, L.; Moussas, V.C.; Karampelas, P.; Katsikas, S.K. Adaptive load forecasting of the Hellenic electric grid. J. Zhejiang Univ. Sci. A 2008, 9, 1724–1730. [Google Scholar] [CrossRef]
  17. Gajowwniczek, K.; Zabkowski, T. Two-stage electricity demand modeling using machine learning algorithms. Energies 2017, 10, 1547. [Google Scholar] [CrossRef]
  18. Chapgain, K.; Kittipiyakul, S. Performance analysis of short-term electricity demand with atmospheric variables. Energies 2018, 11, 818. [Google Scholar] [CrossRef]
  19. Divina, F.; Gilson, A.; Goméz-Vela, F.; Torres, M.G.; Torres, J.F. Stacking ensemble learning for short-term electricity consumption forecasting. Energies 2018, 11, 949. [Google Scholar] [CrossRef]
  20. Nagbe, K.; Cugliari, J.; Jacques, J. Short-term electricity demand forecasting using a functional state space model. Energies 2018, 11, 1120. [Google Scholar] [CrossRef]
  21. Chikobvu, D.; Sigauke, C. Regression-SARIMA modelling of daily peak electricity demand in South Africa. J. Energy S. Afr. 2012, 23, 23–30. [Google Scholar]
  22. Sigauke, C.; Chikobvu, D. Short-term peak electricity demand in South Africa. Afr. J. Bus. Manag. 2012, 6, 9243–9249. [Google Scholar] [CrossRef]
  23. Sigauke, C.; Chikobvu, D. Peak electricity demand forecasting using time series regression models: An application to South African data. J. Stat. Manag. Syst. 2016, 19, 567–586. [Google Scholar] [CrossRef]
  24. Bien, J.; Taylor, J.; Tibshirani, R. A lasso for hierarchical interactions. Ann. Stat. 2013, 41, 1111–1141. [Google Scholar] [CrossRef] [PubMed]
  25. Laurinec, P. Doing Magic and Analyzing Seasonal Time Series with GAM, (Generalized Additive Model) in R. 2017. Available online: https://petolau.github.io/Analyzing-double-seasonal-time-series-with-GAM-in-R/ (accessed on 23 February 2017).
  26. Koenker, R.; Bassett, G. Regression quantiles. Econ. J. Econ. Soc. 1978, 46, 33–50. [Google Scholar] [CrossRef]
  27. Hastie, T.; Tibshirani, R. Generalized additive models (with discussion). Stat. Sci. 1986, 1, 297–318. [Google Scholar] [CrossRef]
  28. Hastie, T.; Tibshirani, R. Generalized Additive Models; Chapman & Hall: London, UK, 1990. [Google Scholar]
  29. Wood, S.N. Generalized Additive Models: An Introduction with R; Chapman & Hall: London, UK, 2006. [Google Scholar]
  30. Wood, S.N. Generalized Additive Models: An Introduction with R; Chapman & Hall: London, UK, 2017. [Google Scholar]
  31. Sigauke, C. Forecasting medium-term electricity demand in a South African electric power supply system. J. Energy S. Afr. 2017, 28, 54–67. [Google Scholar] [CrossRef]
  32. Bien, J.; Tibshirani, R. R Package “HierNet”, Version 1.6. 2015. Available online: https://cran.r-project.org/web/packages/hierNet/hierNet.pdf (accessed on 22 May 2017).
  33. Lim, M.; Hastie, T. Learning interactions via hierarchical group-lasso regularization. J. Comput. Graph. Stat. 2015, 24, 627–654. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  34. Hong, T.; Pinson, P.; Fan, S.; Zareipour, H.; Troccoli, A.; Hyndman, R.J. Probabilistic energy forecasting: Global Energy Forecasting Competition 2014 and beyond. Int. J. Forecast. 2016, 32, 896–913. [Google Scholar] [CrossRef] [Green Version]
  35. Abuella, M.; Chowdhury, B. Hourly probabilistic forecasting of solar power. In Proceedings of the 49th North American Power Symposium, Morgantown, WV, USA, 17–19 September 2017. [Google Scholar]
  36. Liu, B.; Nowotarski, J.; Hong, T.; Weron, R. Probabilistic load forecasting via quantile regression averaging of sister forecasts. IEEE Trans. Smart Grid 2017, 8, 730–737. [Google Scholar] [CrossRef]
  37. Sun, X.; Wang, Z.; Hu, J. Prediction interval construction for byproduct gas flow forecasting using optimized twin extreme learning machine. Math. Probl. Eng. 2017. [Google Scholar] [CrossRef]
  38. Shen, Y.; Wang, X.; Chen, J. Wind power forecasting using multi-objective evolutionary algorithms for wavelet neural network-optimized prediction intervals. Appl. Sci. 2018, 8, 185. [Google Scholar] [CrossRef]
Figure 1. Hourly electricity demand from 1 January 2010 to 31 December 2012.
Figure 2. Plot of hourly electricity demand from 1 January 2010 to 31 December 2012 superimposed with a nonlinear trend.
Figure 3. Smoothed effects of variable “hour”.
Figure 4. Plot of pinball losses for models M4 (pinballAQRI), M5 (pinballPlaqr) and M6 (pinballQRA) (2 April 2012 to 31 December 2012).
Figure 5. Prediction interval widths for models M4 (PIAQRI), M5 (PIConvex) and M6 (PIQRA).
Figure 6. Density plots of the prediction interval widths for models M4 (PIAQRI), M5 (PIConvex) and M6 (PIQRA).
Figure 7. The error distributions of the forecasts from models M4 (AQRI), M5 (convex) and M6 (QRA).
Figure 8. Box plots of residuals from models M4 (residAQRI), M5 (residConvex) and M6 (residQRA).
Figure 9. Plot of actual demand superimposed with forecasted demand from M6 (2 April to 31 December 2012).
Figure 10. Density plots of actual demand superimposed with density plots from the M6 and M5 models (2 April to 31 December 2012).
Table 1. Summary statistics for hourly electricity demand (MW).
Descriptive Statistics | Mean | Median | Max | Min | St. Dev. | Skewness | Kurtosis
Load | 27,798 | 28,496 | 36,664 | 18,739 | 3337 | −0.2433 | 2.050
Table 2. Model comparisons.
Measure | M1 | M2 | M3 | M4 | M5 | M6
RMSE | 736.2 | 662.4 | 731.5 | 648.8 | 596.1 | 577.7
MAE (MW) | 568.7 | 516.2 | 549.5 | 499.7 | 459.4 | 445.2
MAPE (%) | 2.15 | 1.93 | 2.04 | 1.86 | 1.70 | 1.65
Under predictions | | | | 3319 | 3279 | 3280
Over predictions | | | | 3251 | 3291 | 3286
Table 3. Average pinball losses for M1 to M6 (2 April 2012 to 31 December 2012).
Measure | M1 | M2 | M3 | M4 | M5 | M6
Average pinball loss | 284.363 | 258.087 | 274.768 | 249.842 | 229.723 | 222.584
Table 4. Model comparisons.
Model | Mean | Median | Minimum | Maximum | Standard Deviation | Skewness | Kurtosis | Range
M4 | 2100.9 | 2023 | 287 | 5617 | 686.98 | 0.7256 | 3.7217 | 5330
M5 | 2419.1 | 2435 | 1883 | 3560 | 117.72 | 1.4898 | 12.3368 | 1667
M6 | 2300.0 | 2263 | 795 | 4438 | 418.11 | 0.6776 | 4.0304 | 3643
Table 5. Comparative evaluation of models using prediction interval (PI) indices. Below LL = number of forecasts below the lower prediction limit, Above UL = number of forecasts above the upper prediction limit.
PINC | Model | PICP (%) | PINAW (%) | PINAD (%) | Below LL | Above UL
90% | M4 | 84.41 | 10.63 | 0.2353 | 462 | 563
90% | M5 | 90.46 | 11.73 | 0.1671 | 310 | 317
90% | M6 | 90.80 | 11.07 | 0.1347 | 301 | 304
95% | M4 | 91.19 | 12.52 | 0.1186 | 236 | 343
95% | M5 | 95.16 | 14.41 | 0.0756 | 156 | 162
95% | M6 | 95.31 | 13.70 | 0.0573 | 151 | 157
99% | M4 | 97.35 | 16.43 | 0.03127 | 36 | 138
99% | M5 | 99.1 | 19.87 | 0.0110 | 30 | 31
99% | M6 | 99.22 | 17.75 | 0.005986 | 31 | 20
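To make the prediction interval indices in Table 5 concrete, the following minimal Python sketch computes PICP, PINAW and PINAD together with the counts below the lower limit and above the upper limit. The function name `pi_indices` and the example values are hypothetical. The sketch assumes the commonly used definitions in which PINAW and PINAD are normalised by the range of the observed values and reported as percentages.

```python
import numpy as np


def pi_indices(y, lower, upper):
    """Prediction interval indices for observations y and limits [lower, upper]."""
    y = np.asarray(y, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    y_range = y.max() - y.min()  # normalising constant (assumed definition)
    below = y < lower            # observation falls below the lower limit
    above = y > upper            # observation falls above the upper limit
    # Distance from the violated limit, zero when y lies inside the interval.
    deviation = np.where(below, lower - y, np.where(above, y - upper, 0.0))
    return {
        "PICP": float(100.0 * np.mean(~below & ~above)),
        "PINAW": float(100.0 * np.mean(upper - lower) / y_range),
        "PINAD": float(100.0 * np.mean(deviation) / y_range),
        "below_LL": int(below.sum()),
        "above_UL": int(above.sum()),
    }


# Four hypothetical observations with their prediction limits (MW).
indices = pi_indices(
    y=[10.0, 20.0, 30.0, 40.0],
    lower=[12.0, 15.0, 25.0, 30.0],
    upper=[18.0, 25.0, 35.0, 38.0],
)
```

Under these definitions, a set of intervals is taken to be valid when its PICP is at least the nominal level (PINC), and among valid models narrower intervals (smaller PINAW) with smaller deviations (smaller PINAD) are preferred.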
Table 6. Model comparisons.
Model | Mean | Median | Minimum | Maximum | Standard Deviation | Skewness | Kurtosis
M4 | 44.16 | 7 | −2507 | 3258 | 647.36 | 0.3761 | 3.9266
M5 | 28.55 | −1 | −2520 | 2690 | 595.49 | 0.2442 | 3.7702
M6 | 14.98 | 0 | −2273 | 2860 | 577.56 | 0.1997 | 3.9223

Share and Cite

MDPI and ACS Style

Sigauke, C.; Nemukula, M.M.; Maposa, D. Probabilistic Hourly Load Forecasting Using Additive Quantile Regression Models. Energies 2018, 11, 2208. https://doi.org/10.3390/en11092208
