The probabilistic approach is based on a prediction that takes the form of a probability distribution of future quantities or events. It allows for treating a wider range of problems than the point estimation approach; e.g., long-term forecasting becomes feasible.
3.1. The Problem
We can use a point forecast of the electricity spot price, i.e., the "best estimate" or expected value of the spot price, to define the probabilistic forecasting problem, see [16]. Notice that the actual price at time $t$ can be expressed as:

$$P_t = \hat{P}_t + \varepsilon_t,$$

where $\hat{P}_t$ refers to the point forecast of the spot price at time $t$, made at a previous time, and $\varepsilon_t$ is the corresponding error. In most EPF papers the analysis stops here, because the authors are primarily interested in point forecasts, see [8].
The construction of prediction intervals (PI) is a very natural extension from point to probabilistic forecasts. Several methods can be used for this purpose, and a popular one considers both the point prediction and the corresponding error: at the confidence level $(1-\alpha)$, the center of the PI is set equal to $\hat{P}_t$ and its bounds are determined by the $\frac{\alpha}{2}$-th and $(1-\frac{\alpha}{2})$-th quantiles of the CDF of $\varepsilon_t$. For example, for the 90% PIs, the 5% and 95% quantiles of the error term are required.
A forecaster can further expand such an approach by considering multiple PIs. As a final result, we obtain a series of quantiles at many levels.
An appropriate discretization of the price distribution is also given by a set of 99 percentiles (quantiles at levels $q = 0.01, 0.02, \dots, 0.99$). Generally, a density prediction corresponding to (3) can be referred to as a set of PIs for all confidence levels $\alpha \in (0,1)$. The calculation of a probabilistic prediction requires an estimate of $\hat{P}_t$ and of the distribution of $\varepsilon_t$. Analogously, we can reformulate it by inverting the CDFs of $P_t$ and $\varepsilon_t$, namely:

$$\hat{Q}_{P_t}(q) = \hat{P}_t + \hat{Q}_{\varepsilon_t}(q),$$

where $\hat{Q}_X(q)$ denotes the predicted $q$-th quantile of the distribution of $X$.
The probabilistic forecast can then be defined as a probability distribution w.r.t. future quantities, and such an identification can be carried out using random variables. For our purposes, the latter means working with the distribution of $P_t$ itself, which is the approach implied by the QRA method. It is worth mentioning that we are not considering the Probability Density Function (PDF) of $\varepsilon_t$; moreover, since 24-h prediction distributions have to be created in a single step, their interconnected probability properties have to be taken into account all at once. Following [16], we can either model the correlation between the hourly marginal distributions or simulate the 24-h paths characterizing the day-ahead dynamics, hence using a multivariate model.
3.2. Construction of Probabilistic Forecasts
There are several ways to construct a probabilistic interval. We report four of them, mainly following [
16].
3.2.1. Historical Simulation
We can use a historical simulation approach to compute the PIs. It is a path-dependent approach that consists of evaluating sample quantiles of the empirical distribution of the forecast errors.
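As an illustrative sketch of the historical simulation approach (the error history, forecast value, and confidence level below are all hypothetical toy data):

```python
import numpy as np

def historical_simulation_pi(point_forecast, past_errors, alpha=0.10):
    """PI bounds from the empirical quantiles of past forecast errors."""
    lo, hi = np.quantile(past_errors, [alpha / 2, 1 - alpha / 2])
    return point_forecast + lo, point_forecast + hi

# hypothetical error history around a point forecast of 50
rng = np.random.default_rng(0)
errors = rng.normal(0.0, 2.0, size=500)
lower, upper = historical_simulation_pi(50.0, errors, alpha=0.10)
```

The same function applies to any error sample; no distributional assumption is made beyond the representativeness of the historical errors.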
3.2.2. Distribution-Based Probabilistic Predictions
When the models used for the time series are driven by Gaussian noise, which is the case, e.g., for the AR, ARMA, and ARIMA methods, we can directly use the Normal distribution to model the density forecasts and compute the PIs analytically. This differs from historical simulation in that the standard deviation of the error used in modelling the density, denote it by $\hat{\sigma}_t$, is first calculated; then the lower and upper bounds of the PIs are set to $\hat{P}_t \mp z_{1-\alpha/2}\,\hat{\sigma}_t$, where $z_{1-\alpha/2}$ is the $(1-\alpha/2)$-quantile of the standard Normal distribution.
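A minimal sketch of the distribution-based construction, assuming a Gaussian error model with a hypothetical point forecast and error standard deviation:

```python
from statistics import NormalDist

def gaussian_pi(point_forecast, sigma, alpha=0.10):
    """PI bounds: point forecast -/+ z_{1-alpha/2} * sigma (Normal errors)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.645 for a 90% PI
    return point_forecast - z * sigma, point_forecast + z * sigma

lower, upper = gaussian_pi(50.0, sigma=2.0, alpha=0.10)
```

Unlike historical simulation, only the estimated standard deviation is needed, not the whole error sample.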
3.2.3. Bootstrapped PIs
An alternative method is the bootstrap, which is often used within the NN scenario. When used to produce one step-ahead forecasts, the bootstrap approach works as follows:
- 1.
Estimate the parameters $\hat{\theta}$ characterizing the model, obtaining a fit with a corresponding set of residuals $\{\hat{\varepsilon}_t\}$.
- 2.
Generate a set of simulated data, hence not real-world based, whose distribution is steered by the parameter set $\hat{\theta}$ and by the normalized residuals $\hat{\varepsilon}_t^*$, drawn with replacement from the residuals of step 1. For a general autoregressive model of order $r$, w.r.t. the variables $\hat{\varepsilon}_t^*$ (which constitute the external sources of noise), we recursively define:

$$P_t^* = \hat{\phi}_1 P_{t-1}^* + \dots + \hat{\phi}_r P_{t-r}^* + \hat{\varepsilon}_t^*.$$
- 3.
Re-estimate the model on the simulated data and compute the bootstrap one step-ahead forecast $\hat{P}_{T+1}^*$ for the new time step.
- 4.
Starting again from step 2, repeat steps 2 and 3 $N$ times to obtain the bootstrapped price predictions $\{\hat{P}_{T+1}^{*(i)}\}_{i=1}^{N}$.
- 5.
Calculate sample quantiles of the quantities obtained in step 4 in order to provide the requested PIs.
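The five steps above can be sketched for a hypothetical AR(1) price model; the series, the number of bootstrap replications, and the fitting method (simple least squares) are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np

def bootstrap_pi_ar1(prices, n_boot=500, alpha=0.10, seed=0):
    """Bootstrapped one step-ahead PI for a hypothetical AR(1) model."""
    rng = np.random.default_rng(seed)
    y, x = prices[1:], prices[:-1]
    phi = (x @ y) / (x @ x)              # step 1: fit the AR(1) coefficient
    resid = y - phi * x
    resid = resid - resid.mean()         # centred ("normalized") residuals
    fcst = np.empty(n_boot)
    for i in range(n_boot):              # steps 2-4, repeated N times
        eps = rng.choice(resid, size=len(prices), replace=True)
        sim = np.empty(len(prices))
        sim[0] = prices[0]
        for t in range(1, len(prices)):  # step 2: recursive simulation
            sim[t] = phi * sim[t - 1] + eps[t]
        xs, ys = sim[:-1], sim[1:]
        phi_b = (xs @ ys) / (xs @ xs)    # step 3: re-estimate on simulated data
        fcst[i] = phi_b * prices[-1] + rng.choice(resid)
    # step 5: sample quantiles of the bootstrapped forecasts give the PI
    return np.quantile(fcst, [alpha / 2, 1 - alpha / 2])

# toy AR(1) series as input
rng = np.random.default_rng(1)
series = np.empty(300)
series[0] = 0.0
for t in range(1, 300):
    series[t] = 0.7 * series[t - 1] + rng.normal()
lower, upper = bootstrap_pi_ar1(series)
```

Because each replication re-estimates the model, the resulting interval reflects parameter uncertainty as well as residual variability.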
This type of construction is more accurate, because it takes both historical forecast errors and parameter uncertainty into account. However, it is computationally heavier.
3.2.4. Quantile Regression Averaging
The method proposed by Nowotarski and Weron in [10] is called QRA, and it involves quantile regression on a group of point forecasts from individual forecasting models. It does not need to split the probabilistic forecast into a point forecast and an error-term distribution, because it works directly with the electricity spot price distribution. The quantile regression problem can be expressed as

$$Q_{P_t}(q \mid X_t) = X_t \beta_q,$$

where $Q_{P_t}(q \mid X_t)$ represents the conditional $q$-th quantile of the electricity price distribution, $X_t$ are the explanatory variables, also called regressors, and $\beta_q$ is a vector of parameters for the $q$-th quantile. The parameters can then be estimated by minimizing the loss function for a given $q$-th quantile. There is no restriction on the components of $X_t$; as long as it contains predictions from individual models, the method is considered to be QRA.
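As a rough illustration, the sketch below fits a conditional quantile to a pair of synthetic point forecasts by minimizing the pinball loss with averaged subgradient descent; this optimizer is a simple stand-in for a proper quantile-regression solver, and all forecasts and prices are made-up data:

```python
import numpy as np

def qra_quantile(X, y, q, n_iter=20000):
    """Fit the conditional q-th quantile of y given regressors X by
    averaged subgradient descent on the pinball loss."""
    # standardize regressors and prepend an intercept column
    Z = np.column_stack([np.ones(len(y)), (X - X.mean(0)) / X.std(0)])
    beta = np.zeros(Z.shape[1])
    avg = np.zeros_like(beta)
    for t in range(n_iter):
        d = y - Z @ beta
        g = -Z.T @ np.where(d > 0, q, q - 1) / len(y)   # pinball subgradient
        beta -= g / np.sqrt(t + 1.0)                    # decaying step size
        if t >= n_iter // 2:                            # average late iterates
            avg += beta
    return Z @ (avg / (n_iter - n_iter // 2))

rng = np.random.default_rng(0)
n = 400
f1 = rng.normal(50, 5, n)                 # point forecasts of model 1
f2 = f1 + rng.normal(0, 1, n)             # point forecasts of model 2
prices = f1 + rng.normal(0, 2, n)         # observed spot prices
X = np.column_stack([f1, f2])             # regressors = individual forecasts
q90 = qra_quantile(X, prices, q=0.9)      # conditional 90% quantile forecasts
coverage = np.mean(prices <= q90)         # should be close to 0.9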
3.3. Validity
In the case of a probabilistic forecast, the main problem is that we do not know the actual distribution of the underlying process. It is not possible to compare the predicted distribution with the true distribution of the electricity spot price using only the price values observed in the past. Probabilistic forecasts can be evaluated in several ways, and the chosen method also depends on the final goal that we aim to reach. Of course, we can rely on tests and parameters to check the validity of the model and to have criteria for choosing the optimal one. An evaluation is usually based on reliability, sharpness, and resolution.
Statistical consistency between distributional forecasts and observations is called reliability (also calibration or unbiasedness). For example, if a 90% PI covers 90% of the observed prices, then this PI is considered reliable, well-calibrated, or unbiased. Sharpness refers to how concentrated the predicted distributions are, i.e., how closely they cover the true distribution; it is a property of the forecasts only, in contrast to reliability, which is a joint property of predictions and observations. Finally, resolution represents how strongly the predicted density changes over time; in other words, the ability to provide probabilistic predictions (e.g., of wind power) depending on the forecast conditions, e.g., wind direction.
To formally check whether there is "unconditional coverage" (UC), i.e., whether $\mathbb{P}(I_t = 1) = 1 - \alpha$, where $I_t = 1$ if the observed price lies in the interval and $I_t = 0$ otherwise, we can use the approach of Kupiec (1995), which tests whether the series $I_t$, also known as the indicator of "hits and misses", forms an i.i.d. Bernoulli sequence with mean $1 - \alpha$; i.e., violations are assumed to be independent. Because the Kupiec test is based only on the total number of PI violations and not on their order, Christoffersen (1998) introduced the independence test and the conditional coverage (CC) test.
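The Kupiec likelihood-ratio statistic can be sketched as follows; the hit series below is simulated, and the indicator convention ($1$ = price inside the PI) is an assumption of this sketch:

```python
import numpy as np

def kupiec_uc(ind, p=0.90):
    """Kupiec (1995) unconditional-coverage LR statistic.

    ind: 0/1 series with ind_t = 1 if the observed price fell inside the PI;
    under H0 the coverage equals p and LR_uc is asymptotically chi-square(1)."""
    n, x = len(ind), int(np.sum(ind))
    pi_hat = x / n                       # empirical coverage rate
    if x in (0, n):
        return float("inf")
    return -2.0 * (x * np.log(p / pi_hat)
                   + (n - x) * np.log((1 - p) / (1 - pi_hat)))

rng = np.random.default_rng(0)
good = rng.binomial(1, 0.90, size=1000)    # well-calibrated 90% PI
narrow = rng.binomial(1, 0.75, size=1000)  # PI that is too narrow
lr_good = kupiec_uc(good, p=0.90)
lr_narrow = kupiec_uc(narrow, p=0.90)
# compare with the 5% critical value of chi-square(1), about 3.841
```

A large statistic (above 3.841 at the 5% level) rejects correct unconditional coverage; the test ignores the ordering of violations, which is exactly what the Christoffersen extensions address.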
It is generally more difficult to test the goodness-of-fit of a predictive distribution than to assess the reliability of a PI. A very common method is to use the probability integral transform (PIT):

$$u_t = \hat{F}_t(P_t),$$

where $\hat{F}_t$ denotes the predicted CDF of the spot price at time $t$. If the distribution forecast matches the actual distribution of the spot price process, then $u_t$ is uniformly distributed and independent, which can be verified by a statistical test, see [17].
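A minimal numeric sketch of the PIT check, assuming hypothetical Gaussian predictive CDFs and testing uniformity with a hand-rolled Kolmogorov-Smirnov distance:

```python
import numpy as np
from statistics import NormalDist

def pit_values(obs, pred_means, pred_sds):
    """PIT u_t = F_t(P_t) under hypothetical Gaussian predictive CDFs."""
    return np.array([NormalDist(m, s).cdf(p)
                     for p, m, s in zip(obs, pred_means, pred_sds)])

def ks_uniform_stat(u):
    """Kolmogorov-Smirnov distance between the PIT sample and U(0,1)."""
    u = np.sort(u)
    n = len(u)
    grid = np.arange(1, n + 1) / n
    return max(np.max(grid - u), np.max(u - (grid - 1.0 / n)))

rng = np.random.default_rng(0)
mu, sd = np.full(500, 50.0), np.full(500, 2.0)
prices = rng.normal(mu, sd)                      # prices drawn from the model
stat = ks_uniform_stat(pit_values(prices, mu, sd))
# a mis-specified (too narrow) forecast produces PIT values piled at 0 and 1
stat_bad = ks_uniform_stat(pit_values(prices, mu, np.full(500, 1.0)))
```

A correctly specified forecast yields a small statistic (compare with the approximate 5% critical value $1.36/\sqrt{n}$), while the over-confident forecast is clearly rejected.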
In contrast to reliability, which is a joint property of observations and predictions, sharpness is a property of the predictions only. Sharpness is strongly related to the notion of proper scoring rules; indeed, scoring rules simultaneously evaluate reliability and sharpness [17]. The pinball loss (PL) and the continuous ranked probability score (CRPS) are the two most popular proper scoring rules in energy forecasting: the former for quantile predictions and the latter for distribution predictions. The pinball loss is a particular case of an asymmetric piece-wise linear loss function:

$$\mathrm{PL}\big(\hat{Q}_{P_t}(q), P_t, q\big) = \begin{cases} (1-q)\,\big(\hat{Q}_{P_t}(q) - P_t\big), & P_t < \hat{Q}_{P_t}(q),\\ q\,\big(P_t - \hat{Q}_{P_t}(q)\big), & P_t \geq \hat{Q}_{P_t}(q), \end{cases}$$

so the PL depends on the quantile function and the actually observed price. The PL is a strictly proper score for the $q$-th quantile. To obtain an aggregated score, the PL can be averaged over different quantiles.
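A small numeric sketch of the pinball loss, averaged over the 99 percentile levels; the price sample and the two competing quantile forecasts are synthetic:

```python
import numpy as np
from statistics import NormalDist

def pinball_loss(y, q_pred, q):
    """Average pinball loss of a q-th quantile forecast q_pred."""
    d = y - q_pred
    return np.mean(np.where(d >= 0, q * d, (q - 1) * d))

rng = np.random.default_rng(0)
prices = rng.normal(50, 2, 1000)            # toy observed prices
quantiles = np.arange(0.01, 1.00, 0.01)     # the 99 percentile levels
# quantile forecasts from the true model vs. a flat (point-only) forecast
true_q = np.array([50 + 2 * NormalDist().inv_cdf(q) for q in quantiles])
pl_true = np.mean([pinball_loss(prices, true_q[i], q)
                   for i, q in enumerate(quantiles)])
pl_flat = np.mean([pinball_loss(prices, 50.0, q) for q in quantiles])
```

As expected for a proper score, the forecast based on the true quantiles achieves a lower averaged pinball loss than the flat forecast that ignores the distribution's spread.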
It is also necessary to draw statistically significant conclusions regarding the out-performance of the forecasts of one model by those of another. For this purpose, we use the Diebold-Mariano (DM) test, which is an asymptotic $z$-test of the hypothesis that the mean of the loss differential series

$$\Delta_t = S_t^{A} - S_t^{B}$$

is zero, where $S_t^{Z}$ is the score of the forecasts of model $Z$. In the context of probabilistic or ensemble forecasts, any strictly proper scoring rule can be used, for example, the pinball loss. Given the loss differential series, we calculate the statistic:

$$t_{\mathrm{DM}} = \sqrt{T}\,\frac{\bar{\Delta}}{\hat{\sigma}_{\Delta}},$$

where $\bar{\Delta}$ and $\hat{\sigma}_{\Delta}$ are the sample mean and standard deviation of $\Delta_t$, and $T$ represents the length of the out-of-sample test period. The null hypothesis of equal predictive accuracy, i.e., equal expected loss, corresponds to $\mathbb{E}[\Delta_t] = 0$, in which case, assuming covariance stationarity of $\Delta_t$, the $t_{\mathrm{DM}}$ statistic is asymptotically standard normal, and one- or two-sided asymptotic tail probabilities are easily computed. The DM test compares the forecasts of two models, not the models themselves. In day-ahead power markets, the forecasts for all hours of the next day are made simultaneously with the same information set, implying that forecast errors for a given day usually exhibit high serial correlation. Therefore, it is recommended to run the DM tests separately for every load period considered, for example, at each hour of the day [18]. It is also worth mentioning that probabilistic forecast performance can be improved by exploiting approaches related to regime-switching models, see, e.g., [19,20].
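The DM statistic can be sketched on simulated loss series; the two models' losses below are synthetic, with model A constructed to have smaller errors:

```python
import numpy as np
from statistics import NormalDist

def dm_test(loss_a, loss_b):
    """Diebold-Mariano statistic for a loss differential series and its
    two-sided asymptotic p-value under the standard normal limit."""
    delta = np.asarray(loss_a) - np.asarray(loss_b)
    t_dm = np.sqrt(len(delta)) * delta.mean() / delta.std(ddof=1)
    p = 2 * (1 - NormalDist().cdf(abs(t_dm)))
    return t_dm, p

rng = np.random.default_rng(0)
T = 500
loss_a = np.abs(rng.normal(0, 1.0, T))   # model A: smaller losses
loss_b = np.abs(rng.normal(0, 1.5, T))   # model B: larger losses
t_dm, p_value = dm_test(loss_a, loss_b)  # negative t_dm favours model A
```

In a day-ahead application, this computation would be repeated separately for each hour of the day, as the serial correlation of daily forecast errors suggests.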