1. Introduction
In competitive and liberalized markets with prominent renewable integration, the natural renewable stochasticity is echoed in all market players’ decisions, bringing additional challenges to achieving sustainable, profitable, and reliable operation of the electricity system [1]. Moreover, microgeneration, together with the natural evolution of renewable energy technologies, is leading to a paradigm shift in the electricity sector [2].
This, in addition to other factors, often makes electricity market price (EMP) forecasting tools more vital than demand series forecasting tools [3]. One way to increase the sector’s flexibility is to integrate innovative storage systems, whose main goal is to manage the unpredictable behavior of renewables. However, storage is highly costly, its lifetime is limited, and in most cases only prototype systems are available [4,5,6].
Therefore, EMP forecasting has grown into one of the largest research areas [7]. EMP forecasting is a critical and unavoidable task for all agents participating in electricity market activities, all the more so in the presence of smart grid and smart house concepts [8].
This is especially the case with the advance of smart grids, the aforementioned paradigm shift in the electricity sector, and the necessary mitigation of the human footprint due to climate change and other environmental concerns [9].
The scientific literature categorizes EMP forecasting models [10] into different time horizons: (1) very short term (a few seconds to a few hours), (2) short term (a few hours to a few days), and (3) long term (a few days to a few months) [11].
In the case of hard computing, the best-known models are related to the auto-regressive integrated moving average (ARIMA), with or without data pre-processing [12], where a huge amount of physical data is needed and an exact model of the system is mandatory.
In contrast, soft computing models generally use an auto-learning procedure to recognize patterns present in historical data. A wide range of models exists, mostly associated with the artificial neural network (NN) philosophy, such as [13,14], and hybrid models [15,16,17,18], where the objective is to exploit the best features of the techniques that compose the forecasting model.
Nowadays, the efforts of the scientific community are focused on innovative probabilistic forecasting models, where the hybridization of different methods is common, with the goal of more realistic outputs that also convey the expected spread [19,20,21,22]. To validate the accuracy and applicability of a proposed forecasting model, the use of well-known historical datasets is necessary, not to tune the model, but to prove its advantages over other proposed models.
For example, in Reference [23], a hybrid forecasting model was presented and applied to different time horizons (i.e., day- and week-ahead EMP), considering different sets of historical data from two electricity markets commonly used to validate forecasting models.
In Reference [9], a hybrid forecasting model was presented, combining wavelet transform (WT), hybrid particle swarm optimization (DEEPSO), and adaptive neuro-fuzzy inference system (ANFIS) methods to forecast the EMP series of the Spanish market (2002, 2006) and the Pennsylvania-New Jersey-Maryland (PJM) market (2006), with different forecasting windows (i.e., between 24 h and 168 h ahead) at a 1-h time-step. The model was validated by comparison with previous approaches on the same real historical data.
In Reference [14], a hybrid forecasting approach was presented, combining a pre-processing technique with particle swarm optimization (PSO) and fuzzy neural networks to forecast and classify the EMP of the Spanish electricity market. In the same line of research, in Reference [1], a forecasting model was proposed that combines a support vector network with an ANFIS network to analyze data from the Nordpool power market in Denmark.
In Reference [24], a hybrid approach was proposed for market clearing prices, producing both forecasting intervals and point forecasts through a modified dolphin echolocation optimization algorithm combined with WT and NN. Uncertainties and noise variance were considered. Real EMP data from the Ontario, New England, and Australian electricity markets were used for validation.
Furthermore, in Reference [25], a probabilistic hybrid EMP forecasting model based on an improved clonal selection algorithm, an extreme learning machine for the NN, and WT was proposed. An autocorrelation process was also used to reduce computational costs. Real EMP data from the Ontario and Australian electricity markets were again used for validation.
Considering the widespread state-of-the-art tools and the importance of probabilistic forecasting, in this work, a probabilistic hybrid forecasting model is elaborated and explored for short-term EMP. The proposed model combines WT, as a data pre-processing technique, with DEEPSO, which reduces the overall forecasting error by tuning the ANFIS parameters. Afterwards, the resulting forecast is post-processed using a Monte-Carlo simulation (MCS) model to extract the expected deviation from the forecasted results.
The proposed model is hereafter referred to as the hybrid probabilistic forecasting model (HPFM). The proposed HPFM was used to forecast the Spanish and PJM EMP, without exogenous data, for the next week with a time-step of 1 h. The results were compared with other already published models using the same historical data; no exogenous input data was used, for the sake of a fair comparison.
The manuscript is organized as follows:
Section 2 describes the proposed hybrid model, explaining in detail the underlying concepts of the incorporated algorithms (WT, DEEPSO, ANFIS, and MCS).
Section 3 shows the most commonly used forecasting validation tools and the ones used by this work to validate the proposed model.
Section 4 presents the case studies and obtained results. Finally, in
Section 5, the conclusions are provided.
2. Proposed Hybrid Probabilistic Forecasting Model (HPFM)
The proposed HPFM employs wavelet transforms (WT) to capture features of the random behavior of electricity market prices (EMP). The hybrid particle swarm optimization (DEEPSO), thanks to its differential evolutionary process and hybrid enhancements, improves the ANFIS structure's ability to reduce the forecast error by tuning the ANFIS membership functions, providing a first-stage result. The resulting data is then analyzed through the Monte-Carlo simulation (MCS) model, the goal of which is to determine the range of the forecasted values without increasing the forecast error.
2.1. Wavelet Transforms (WT)
In current forecasting models used for the analysis of time series such as EMP and renewable power, the WT has been widely used because it can detect patterns and trends without losing the original information. Both of these time series exhibit intermittency, volatility, and peak trends that are challenging to forecast. In this sense, the WT can be considered a tool capable of isolating these trends from a non-stationary time series [19]. Moreover, the WT has been successfully used for power quality and transient analysis, modeling of short-term energy system disturbances, and other signal analyses in the continuous or discrete domain [18].
Mathematically, data processing of a time-varying input signal, f(t), at multiple resolutions is done by scaling and translating a mother function, ψ(t), through a convolution integral, allowing the representation of the time series in both the time and frequency domains [19]. The scaling (a) and translating (b) factors are applied to the mother function, ψ(t), such that:

ψ_(a,b)(t) = (1/√a) ψ((t − b)/a) (1)

Using the complex conjugate, ψ*(t), the continuous wavelet transform is obtained as follows [26]:

W(a, b) = ∫ f(t) ψ*_(a,b)(t) dt (2)

By combining Equations (1) and (2), and discretizing, the more computationally efficient discrete WT (DWT) can be obtained and is expressed as:

W(m, n) = 2^(−m/2) Σ_(t=0)^(T−1) f(t) ψ((t − n·2^m)/2^m) (3)
In this form, T is the sampled signal length, and m and n are the DWT counterparts of the scaling factors a and b, such that a = 2^m and b = n·2^m. In the method proposed in this work, the DWT was used for: (1) signal decomposition, and (2) signal reconstruction. This was done using a multiresolution analysis based on a three-level variation of Mallat’s algorithm, which allows a filter-based DWT implementation with an iterative process [27].
27].
In the case of decomposition, the input signal, f(t), is divided into components of lower resolution, making up the WT decomposition tree. At the first level, the signal is passed through a low-pass filter and a high-pass filter, and the filter outputs are then downsampled by 2, resulting in the first-level approximation and detail coefficients (A1 and D1). This is repeated until the desired level is reached (A3 and D3). In a similar manner, the reconstruction process has three iterations of upsampling followed by filtering, resulting in six coefficients: three of approximation and three of detail [26].
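As an illustration of the filter-bank decomposition and reconstruction described above, the following minimal Python sketch performs one level of DWT with the Haar filter pair. The wavelet choice and the sample price series are illustrative assumptions; the paper's implementation uses a three-level Mallat filter bank in MATLAB.

```python
# One level of Haar DWT: low-pass / high-pass filtering plus downsampling
# by 2 (decomposition), and upsampling plus filtering (reconstruction).

def haar_decompose(signal):
    """Split an even-length signal into approximation and detail coefficients."""
    s = 2 ** -0.5
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_reconstruct(approx, detail):
    """Invert the decomposition: each (a, d) pair yields two samples."""
    s = 2 ** -0.5
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out

# Illustrative hourly price samples (EUR/MWh); not data from the paper.
prices = [30.0, 32.0, 31.0, 35.0, 40.0, 38.0, 37.0, 36.0]
a1, d1 = haar_decompose(prices)
rebuilt = haar_reconstruct(a1, d1)
print([round(x, 6) for x in rebuilt])  # matches the original series
```

Iterating `haar_decompose` on the approximation coefficients would yield the second- and third-level coefficients of the three-level tree described above.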
2.2. Hybrid Particle Swarm Optimization (DEEPSO)
Particle swarm optimization (PSO) is a stochastic, bio-inspired, population-based, swarm intelligence method developed in [28]. Being a swarm intelligence method, the algorithm mimics the social behavior and dynamics of swarming phenomena in nature (e.g., bird flocks). In PSO, each particle in the population represents a potential solution to the objective function.
Each particle’s position vector within the multi-dimensional search space corresponds to values of the solution variables (i.e., the design vector). As the swarm “flies” through the search space, the objective function for each particle is calculated from its position vector.
At the end of every iteration, each particle's trajectory, or velocity, is updated depending on: (1) whether its objective function is increasing or decreasing (local knowledge), (2) the trajectories of neighboring particles (global knowledge), and (3) the best solution historically found by the swarm (global knowledge). The updating of individual trajectories happens iteratively based on the combined use of local, global, and historical knowledge, allowing the swarm to collectively “fly” towards better solution points, ideally converging to the global optimum in the search space [29].
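The trajectory-update loop described above can be sketched as follows; the sphere objective, swarm size, and coefficient values are illustrative assumptions, not the settings used in this work.

```python
import random

# Minimal PSO sketch minimizing f(x) = sum(x_i^2).

def pso(f, dim=2, swarm=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    X = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(swarm)]
    V = [[0.0] * dim for _ in range(swarm)]
    pbest = [x[:] for x in X]        # each particle's best-known position
    gbest = min(pbest, key=f)[:]     # best position found by the whole swarm
    for _ in range(iters):
        for i in range(swarm):
            for d in range(dim):
                V[i][d] = (w * V[i][d]                                     # inertia
                           + c1 * rng.random() * (pbest[i][d] - X[i][d])   # local knowledge
                           + c2 * rng.random() * (gbest[d] - X[i][d]))     # global knowledge
                X[i][d] += V[i][d]
            if f(X[i]) < f(pbest[i]):
                pbest[i] = X[i][:]
                if f(X[i]) < f(gbest):
                    gbest = X[i][:]
    return gbest

best = pso(lambda x: sum(v * v for v in x))
print(sum(v * v for v in best))  # close to 0
```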
Evolutionary computing, the most famous branch of which is genetic algorithms (GA), is another class of stochastic, bio-inspired, population-based methods. Unlike PSO, these methods are driven by survival of the fittest rather than social behavior [30]. The population of potential solutions shares global knowledge through the simulation of natural selection (Darwinism). The population “evolves” in every iteration through the crossover of different individuals, chosen based on deterministic (fitness) and random factors. In addition, mutation operators allow random changes to occur to some individuals in every iteration.
While GA is older than PSO, it has recently become less popular in comparison because it is more complex, with many parameters that need to be defined and re-tuned for different problems. PSO has been shown to outperform GA computationally, in addition to being easier to implement due to its algorithmic simplicity and fewer parameters. On the other hand, PSO has a static population and lacks operators such as mutation and crossover (which allow a random element of information exchange between individuals, as opposed to deterministic trajectory updates), which can lead to the algorithm becoming stuck in local minima [31].
Evolutionary PSO (EPSO) is an enhanced version of PSO that aims to incorporate the advantages of evolutionary computing. A control parameter of PSO is the inertia weight, w, of the particles, which damps the trajectory updating. In EPSO, the particle weights are subjected to a mutation operator, and the swarm is altered by evolutionary features in every iteration. This hybrid algorithm enhances the performance of PSO [32].
Differential EPSO (DEEPSO) takes this one step further, with the weight parameters having auto-adaptive properties. While EPSO subjects the weights to a mutation operator, DEEPSO also incorporates a crossover operator (as in GA), generating a new solution for a particle from a combination of those of other particles [18]. As such, for every particle, i, in a swarm of size N, the individual trajectory is updated in each iteration, k, as follows:

X_i^(k+1) = X_i^k + V_i^(k+1)
V_i^(k+1) = w_I* V_i^k + w_M* (X_r^k − X_i^k) + w_C* C (b_G* − X_i^k)

where X_i^(k+1) and X_i^k are the new and previous positions of particle i (in iteration k), respectively. X_r is one of a duo of different particles previously sampled from the swarm. b_G* is the best global position, perturbed through a random Gaussian variable with mean 0 and variance 1. C is a diagonal binary matrix whose entries take the value 1 with probability P and 0 with probability 1 − P. Moreover, w_I*, w_M*, and w_C* are the particle inertia, memory, and cooperation weights, which are mutated in each iteration through the learning mutation parameter, τ.
The differential evolutionary features added to PSO are:
Particle weight mutation: w* = w + τ N(0, 1);
Perturbation of the current best position with a normal distribution: b_G* = b_G (1 + w_G N(0, 1));
A differential set for particle crossover, formed from pairs of previously sampled particle positions.
The difference components guarantee that an appropriate macro-gradient is truly extracted, pointing in a descending direction according to the ordered comparison of solutions. Following this idea, one component is presumed to be the current particle position, X_i, and the other, X_r, is selected from the set of the best ancestors of the swarm.
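A single DEEPSO-style trajectory update, with weight mutation, Gaussian perturbation of the global best, and the diagonal communication matrix C, can be sketched as below. Parameter values and function names are illustrative assumptions, not the paper's implementation.

```python
import random

rng = random.Random(1)

def mutate(w, tau=0.2):
    """EPSO/DEEPSO weight mutation: w* = w + tau * N(0, 1)."""
    return w + tau * rng.gauss(0.0, 1.0)

def deepso_step(x, v, x_r, g_best, weights, p_comm=0.8, tau=0.2, w_g=0.1):
    """One DEEPSO trajectory update for a single particle (illustrative).

    x, v    : current position and velocity of the particle
    x_r     : position sampled from the swarm's best ancestors
              (the differential / crossover component)
    g_best  : best global position, perturbed by a Gaussian term
    weights : (inertia, memory, cooperation) triple, mutated each call
    p_comm  : communication probability for the diagonal binary matrix C
    """
    w_i, w_m, w_c = (mutate(w, tau) for w in weights)
    g_star = [g * (1.0 + w_g * rng.gauss(0.0, 1.0)) for g in g_best]
    new_v, new_x = [], []
    for d in range(len(x)):
        c = 1.0 if rng.random() < p_comm else 0.0   # diagonal entry of C
        vd = w_i * v[d] + w_m * (x_r[d] - x[d]) + w_c * c * (g_star[d] - x[d])
        new_v.append(vd)
        new_x.append(x[d] + vd)
    return new_x, new_v

new_x, new_v = deepso_step(x=[0.0, 0.0], v=[0.0, 0.0],
                           x_r=[1.0, 1.0], g_best=[2.0, 2.0],
                           weights=(0.8, 0.5, 0.5))
```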
2.3. Adaptive Neuro-Fuzzy Inference System (ANFIS)
ANFIS is a hybrid model combining the best features of NN structures and fuzzy inference systems (FIS). The structure of the ANFIS network is composed of five layers, where each layer contains nodes described by a node function. As noted in the state-of-the-art review, ANFIS is capable of processing considerable amounts of data with low computational requirements [15]. In addition, ANFIS incorporates self-learning capabilities, which help it to self-adjust its parameters [18].
The combination of NNs with FIS takes advantage of the best of both: the NN extracts fuzzy rules from the data automatically, while, during the learning process, fuzzy logic is applied through the adjustment of membership functions. As stated before, the ANFIS model is composed of five layers: fuzzification, rule firing strengths, normalization, defuzzification, and output; these are interconnected with the different inputs and result in one output. The forecast follows a Takagi-Sugeno structure [18,27].
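A minimal first-order Takagi-Sugeno inference over one input illustrates the five-layer flow described above. The membership centers, widths, and consequent coefficients are illustrative assumptions, not tuned ANFIS parameters.

```python
import math

def gauss_mf(x, c, sigma):
    """Gaussian membership function used in the fuzzification layer."""
    return math.exp(-((x - c) ** 2) / (2 * sigma ** 2))

def sugeno_forecast(x):
    # Layer 1: fuzzification with two Gaussian memberships
    w_low = gauss_mf(x, c=20.0, sigma=10.0)    # "price is low"
    w_high = gauss_mf(x, c=60.0, sigma=10.0)   # "price is high"
    # Layers 2-3: rule firing strengths and their normalization
    total = w_low + w_high
    n_low, n_high = w_low / total, w_high / total
    # Layer 4: first-order (linear) consequents, f_i = p_i * x + r_i
    f_low = 0.9 * x + 1.0
    f_high = 1.1 * x - 2.0
    # Layer 5: weighted-sum output (defuzzification)
    return n_low * f_low + n_high * f_high

print(sugeno_forecast(40.0))  # → 39.5 (both rules fire equally at x = 40)
```

In ANFIS, a DEEPSO-like tuner would adjust the membership parameters (c, sigma) and the consequent coefficients to minimize the forecast error.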
2.4. Monte-Carlo Simulation (MCS)
MCS is a powerful analysis instrument and has long been employed in engineering problems. MCS is typically used to model an intrinsic variable of a system for which analytical formulae cannot be used due to complexity [33]. An example of an MCS application can be found in [34], where the effects of renewable integration on the probability of interruptions were studied; not only the uncertainty of component failures was analyzed, but also the variability of the renewable integration was considered.
In this work, MCS was employed to analyze the forecast result and produce a range of values within which the forecast is expected to fall. With this, it is possible to have a probabilistic forecast result for further realistic analyses and decision making. MCS with variable control was implemented; the concept is that the analysis is computed so that the obtained values do not diverge from the expected real values [35].
The historical dataset is sampled according to its probabilistic distributions. Afterwards, the forecast results set is computed through the performance function, g(·), containing the input variables. Afterwards, the statistical features of the output variables, Y, are estimated. The basic structure of this model is shown in Figure 1. Supposing that a set of N samples of the random variables is generated, all generated samples constitute a set of inputs, X_i, for the model, g(·), i.e., [35]:

Y_i = g(X_i), i = 1, …, N
After obtaining the output samples, a statistical study can be done to estimate the output features of Y, i.e., the mean, variance, reliability, probability of failure, probability density function (PDF), and cumulative distribution function (CDF). These features are expressed by the following equations:

μ_Y = (1/N) Σ_(i=1)^N Y_i
σ_Y^2 = (1/N) Σ_(i=1)^N (Y_i − μ_Y)^2
The probability of failure, defined for the case g(X) ≤ 0, is:

P_f = ∫ I[g(X) ≤ 0] f_X(X) dX

where f_X(X) is the joint PDF of the input variables. As the integral of the previous equation is simply the average value of I[g(X) ≤ 0], the probability of failure can be rewritten as:

P_f ≈ N_f / N

where N_f is the total number of samples that have a performance function lower than or equal to 0. After this step, the cumulative distribution function is estimated as:

F_Y(y) ≈ (1/N) Σ_(i=1)^N I[Y_i ≤ y]

where the indicator function, I[·], equals 1 when its argument condition holds and 0 otherwise.
Finally, the PDF can be obtained from the numerical differentiation of CDF.
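The estimators above can be sketched as follows; the performance function g and the input distribution are illustrative assumptions, not the forecasting model's actual variables.

```python
import random
import statistics

# Monte-Carlo sketch: sample inputs, push them through a performance
# function g, then estimate the mean, variance, failure probability
# P_f = P(g <= 0) via the indicator function, and one empirical CDF point.

rng = random.Random(42)
N = 100_000

def g(x):
    return 3.0 - x          # "failure" occurs when g(x) <= 0, i.e., x >= 3

samples = [g(rng.gauss(0.0, 1.0)) for _ in range(N)]   # x ~ N(0, 1)

mean = statistics.fmean(samples)                       # estimate of mu_Y
var = statistics.pvariance(samples)                    # estimate of sigma_Y^2
p_fail = sum(1 for s in samples if s <= 0) / N         # N_f / N
cdf_at_3 = sum(1 for s in samples if s <= 3.0) / N     # empirical F_Y(3)

print(round(mean, 2), round(p_fail, 4))
```

A numerical differentiation of the empirical CDF (e.g., a histogram of `samples`) would then yield the PDF, as noted above.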
2.5. Hybrid Probabilistic Forecasting Model (HPFM)
The flowchart of the proposed HPFM is illustrated in Figure 2. In brief, the algorithm can be described by the following steps:
Step 1: Start the HPFM model with historical EMP data, taking the forecasting time-frame as the window frame (168 h for each chosen set);
Step 2: Select the historical data that will be decomposed by the WT;
Step 3: Select the parameters of the DEEPSO (Table 1);
Step 4: Select the set of weeks that will be used in DEEPSO to obtain the features necessary to tune and increase the performance of the ANFIS model;
Step 5: Select the parameters of the ANFIS (Table 1);
Step 6: Select the inputs of each iteration of the ANFIS method;
Step 7: Calculate the forecasting errors with the different error measurement criteria to validate the advances of the proposed HPFM model:
Step 7.1: If the error criterion goal is not achieved, start Step 7 again;
Step 7.2: If the error criterion goal is not found in Step 7, jump to Step 4 to find another set of solutions;
Step 7.3: If the best forecasting results are found, or the iteration limit is reached, save the latest best record and go to Step 8;
Step 8: Use the inverse WT to reintroduce the previously filtered data into the forecasted output;
Step 9: Obtain the analysis result using the MCS; print the forecasting results and finish.
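The steps above can be sketched as a skeletal orchestration; every function body is a placeholder stub (all names and values are hypothetical), intended only to show the control flow between the WT, DEEPSO-tuned ANFIS, and MCS stages.

```python
# Skeletal control flow of the forecasting pipeline; stubs stand in for
# the real WT, DEEPSO, ANFIS, and MCS computations.

def wavelet_decompose(history):                      # Step 2
    return {"approx": history, "details": []}        # placeholder components

def tune_anfis_with_deepso(components, max_rounds=3, error_goal=0.01):
    """Steps 3-7: retry tuning until the error goal or the budget is hit."""
    best_error, best_model = float("inf"), None
    for _ in range(max_rounds):
        model = {"params": "tuned"}                  # placeholder DEEPSO-tuned ANFIS
        error = 0.05                                 # placeholder forecast error
        if error < best_error:
            best_error, best_model = error, model    # Step 7.3: keep best record
        if error <= error_goal:                      # Step 7.1: goal reached
            break
    return best_model, best_error

def forecast_week(history):
    comps = wavelet_decompose(history)               # Steps 1-2
    model, err = tune_anfis_with_deepso(comps)       # Steps 3-7
    point_forecast = history[-168:]                  # Step 8 placeholder (inverse WT)
    return point_forecast, err                       # Step 9 would add the MCS band

forecast, err = forecast_week([30.0] * (168 * 6))    # 6 weeks of hourly history
print(len(forecast), err)
```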
4. Case Studies and Results
The proposed HPFM was tested using 6 weeks of historical data as input to provide forecast results for the week ahead in 1-h increments. Well-known real data from the Spanish EMP for the year 2002 and from the PJM EMP for the winter season of 2006 were used. More details about the datasets can be found in Reference [18]. Moreover, in accordance with published and validated studies, no exogenous data was considered, to achieve a fair and valid comparison between their results and this study. The model was implemented and executed on a common PC (Intel i3-2310 processor, @2.10 GHz, 4 GB RAM) using MATLAB 2016b.
The average computation time to obtain a single forecasting result was around 1 min.
Figure 3 shows the numerical results for the different sample weeks of the year (spring, summer, fall, and winter) for the Spanish EMP, and Figure 4 for the week of 22–28 February for the PJM market.
From the numerical results obtained, it is possible to observe how the HPFM model handled the uncertainty of the markets and seasons under analysis, as well as the precision of the forecasted values throughout the weeks under analysis.
The MCS analysis allows the observation of how the forecast values may fluctuate due to the uncertainty and stochasticity inherent in the algorithms and historical data. In fact, the shaded areas in Figure 3a,b, which are produced by the MCS, show that some days are associated with a higher uncertainty in the forecasting accuracy. Specifically, hours 48 to 72 in Figure 3a show that the proposed model may have more difficulty obtaining optimal results in this period due to the high uncertainty of the forecasting accuracy.
Moreover, in Figure 3b, an even higher range of uncertainty was encountered. This is mainly due to price spike behavior (a phenomenon that is out of the scope of this study). However, despite the presence of high uncertainties and price spikes, the proposed forecasting model provides good results throughout those weeks.
Table 2, Table 3, Table 4 and Table 5 present the comparison of the HPFM with other previous hybrid models with intelligence features from the scientific community, regarding the MAPE criterion and the weekly error variance criterion for the Spanish and PJM markets in each season.
Table 5 outlines the performance of the proposed HPFM considering the mean, variance, and maximum and minimum error for all test cases. Finally,
Figure 5 shows the convergence curve of the proposed HPFM model for the Spanish market summer season week.
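For reference, the two comparison criteria named above can be computed as below; the MAPE follows its standard definition, the weekly error variance is one common formulation, and the sample numbers are illustrative rather than taken from the paper's tables.

```python
# MAPE and error variance over a set of hourly actual/forecast pairs.

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    n = len(actual)
    return 100.0 / n * sum(abs((a - f) / a) for a, f in zip(actual, forecast))

def weekly_error_variance(actual, forecast):
    """Variance of the hourly absolute percentage errors over the week."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    mean_e = sum(errors) / len(errors)
    return sum((e - mean_e) ** 2 for e in errors) / len(errors)

# Illustrative prices (EUR/MWh) for four hours.
actual = [30.0, 32.0, 35.0, 31.0]
forecast = [29.0, 33.0, 34.0, 30.0]
print(round(mape(actual, forecast), 3))
```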
From the error outcomes obtained, the HPFM generally outperformed the hybrid models under comparison in most of the situations analyzed.
Figure 5 and Table 5 show that the proposed forecasting model is stable, and that both the final forecasting error performance measures and the training process losses gradually converged before the maximum number of iterations was reached.
Figure 5 suggests that the RMSE may trend further toward zero if the number of iterations is increased, which means there might be a margin to further improve the performance of the model. However, the computational time will also increase, with no guarantee that the forecasting error will significantly improve. To tackle this, a broader statistical study with non-parametric analysis may allow further improvement of the results [36].