Short-Term Electricity Price Forecasting by Employing Ensemble Empirical Mode Decomposition and Extreme Learning Machine

Day-ahead electricity price forecasting plays a critical role in balancing energy consumption and generation, optimizing the decisions of electricity market participants, formulating energy trading strategies, and supporting dispatch by independent system operators. Although much research on price forecasting has been published in recent years, it remains a difficult task because electricity prices exhibit seasonality, sharp fluctuations, and high volatility. This study presents a three-stage short-term electricity price forecasting model that employs ensemble empirical mode decomposition (EEMD) and the extreme learning machine (ELM). In the proposed model, EEMD decomposes the actual price signal to handle the non-linear and non-stationary components in the electricity price data. Then, day-ahead forecasting is performed using the ELM model. We conduct several experiments on real-time data obtained from three states of the Australian electricity market, i.e., Queensland, New South Wales, and Victoria. We also implement several learning-based approaches as benchmark methods, i.e., recurrent neural network, multi-layer perceptron, support vector regression, and ELM. To assess the performance of the proposed and benchmark approaches, this study uses several performance evaluation metrics as well as the Diebold–Mariano (DM) test. The experimental results show the effectiveness of the developed model (in terms of higher accuracy) over its counterparts.


Introduction
One of the primary objectives of smart grids is to mitigate peaks in electricity demand and to balance electricity demand and supply [1,2]. Energy users can also help reduce total electricity costs and peaks by shifting their load from high-peak to low-peak hours [3,4]. Dynamic pricing is one of the fundamental signals that prompt energy consumers to shift load [5,6]. Thus, electricity prices play a significant role in smart grids in balancing power demand and generation/supply. In competitive and deregulated energy markets, the electricity price is closely tied to load demand and supply; as a result, it has become one of the most relevant metrics in the electricity markets [7]. Research on energy prices is of great importance to the whole community from an economic and political point of view. Moreover, the prediction of electricity prices is crucial for participants in the electricity market to optimize profitability and strengthen risk management [8,9]. An accurate price prediction is important for the electricity market and the entire power system. It is also a critical concern [...]
In [24], the authors investigated the influence of electricity price jumps or outliers on electricity price forecasting. To detect price jumps, the authors used three different approaches: recursive filters, the Tukey criterion, and the fitted boxplots approach.
The authors of [25] use a neural network structure and singular spectrum analysis (SSA) to build a hybrid method that forecasts day-ahead electricity prices based on load and temperature details. A study presented in [26] proposes a probabilistic methodology for forecasting hourly electricity prices that makes use of bootstrapping. Ugurlu et al. [27] develop an electricity price prediction method based on a multi-layer gated recurrent unit (GRU). They use three years of real-time price data from the Turkish day-ahead market to conduct experiments, and the results are compared with state-of-the-art price forecasting algorithms, i.e., Markov, Naive, ARIMA, CNN, ANN, and LSTM. Another study presented in [28] develops a power price prediction model based on heterogeneous ensemble learning and self-adaptive decomposition. This work forecasts electricity prices in the Brazilian market for one, two, and three months ahead. In the preprocessing phase, a metaheuristic-based coyote algorithm is used to tune the hyperparameters of complementary EEMD. Then, three machine learning (ML) approaches, namely the extreme learning machine (ELM), support vector regression (SVR), and gradient boosting machine (GBM), are used for time series forecasting. To forecast electricity prices in the Turkish market, the authors of [29] develop an ARIMA-based algorithm. Because the raw market data contain multiple outliers, constructing an ARIMA model directly from them makes the forecast accuracy unreliable.
Qiao et al. [30] use a wavelet transform combined with long short-term memory (LSTM) and a stacked auto-encoder model to predict electricity prices for the industrial, commercial, and residential sectors. Another study presented in [31] develops a hybrid approach in which a metaheuristic-based cuckoo search is used for feature selection and is combined with SVR and singular spectrum analysis (SSA). For multi-step electricity price prediction, Yang et al. [32] propose a model that combines variational mode decomposition (VMD) with an improved multi-objective (MO) sine cosine algorithm (MO-SCA) and a regularized ELM. The VMD method is used in this approach to obtain data features, such as low and high frequencies; then, based on these features, day-ahead electricity prices are predicted. Building on the LSTM model and the Jaya optimizer, Khalid et al. [33] develop an integrated deep NN architecture for electricity price forecasting. Using real-time data from the Pennsylvania-New Jersey-Maryland and Spanish electricity markets, the authors of [34] propose a composite approach based on VMD and a feature selection method that selects features relevant to the hours of the day. However, the feasibility of using other features applicable to the electricity market is not discussed in that paper. A detailed summary of the literature review is presented in Table 1.
In this paper, a novel three-stage forecasting model is proposed to predict short-term electricity prices from a small number of instances (12 historical instances are used to forecast the price data for a given time window). Before forecasting, we first decompose the original price series into a fixed number of intrinsic mode functions (IMFs) and a residual. Each IMF and the residual are forecast individually using the ELM. The predictions for all components are then combined to obtain the final predicted electricity price. To validate our proposed forecasting model, several experiments were conducted on real-time datasets obtained from three different states in Australia, i.e., New South Wales (NSW), Queensland (QLD), and Victoria (VIC). The experimental results reveal the effectiveness of the developed model over the compared approaches, i.e., recurrent neural network (RNN), multi-layer perceptron (MLP), support vector regression (SVR), and extreme learning machine (ELM).
The remainder of this paper is organized as follows. The next section explains the architecture and working of the proposed hybrid (EEMD-ELM) method. Section 3 presents the experimental setting along with the results of the proposed and compared approaches. Section 4 concludes this paper along with future directions.

Table 1. Summary of the literature review.

| Ref. | Model | Description & Methodology | Study Area (forecast horizon) | Remarks |
|---|---|---|---|---|
| | | The performance of different day-ahead electricity price forecasting algorithms was evaluated using data samples from the Greek and Hungarian power industries, together with the impact of different training sample sizes and of training on an hourly clustered sample on forecasting performance. | Long-term | With Hungarian data, hourly clustered training models perform better, while hourly non-clustered training models are better for Greek data. |
| [21] | DNN | The proposed methodology for day-ahead power price forecasting solves the hyper-parameter selection problem for DL implementations by establishing a robust ex-ante hyper-parameter selection mechanism. | Long-term | The proposed method reduces the noise and outperforms the LASSO-estimated model and a DNN with non-optimized parameters. |
| | | | Short-term | The proposed method outperforms VMD in terms of MAPE and RMSE by 84% and 81%, respectively. |
| [23] | SDR-MASES-SPSD method | The stacked pruning sparse denoising autoencoder (SPSDAE) is used to individually reduce the noise of the data. Then, to detect the features of the input data, a maximum separation subspace (MASES) in sufficient dimension reduction (SDR) is proposed. Finally, a new multimodal combined (MMC) method is introduced to accurately predict the day-ahead electricity price. | Short-term | Simulations confirm that the proposed method achieves a lower error rate than the benchmark methods. |
| [24] | Bayesian models | The Bayesian jump model is used along with the double exponential model and explanatory variables to detect upward jumps, no jumps, or downward jumps in electricity price. | Short-term | Results suggest that electricity jump predictions are useful for price prediction in peak hours. |
| | | Bootstrapping is used to model uncertainty, and a generalized ELM is used for low computational cost and fast daily price prediction. In addition, wavelet preprocessing is used to achieve a better fit of the prediction model to changes in the price time series. Real datasets from the Ontario and Australian electricity markets are used for implementation. | Short-term | Simulations confirm that the proposed model achieves higher prediction accuracy than its counterparts. |
| [27] | GRU | The objective of this study is to evaluate the performance of different neural networks in predicting the price of electricity. | Short-term | Simulation results confirm the effectiveness, in terms of MAE, of the proposed multilayer GRU method. |
| [28] | Complementary EEMD, ELM, Gaussian process, and SVM | The complementary EEMD is used to decompose the current series into a number of subseries. The subseries are predicted using ELM, gradient boosting machine, Gaussian process, and SVM. The results are integrated to output the predicted electricity price. | Short-term | The proposed model outperforms the benchmark algorithms in terms of error reduction. |
| [29] | Seasonal ARIMA and ANN | The current series is decomposed into two components: linear and non-linear. The linear component is forecast using ARIMA, while the non-linear component is forecast using ANN. | Short-term | The proposed model shows a 30% improvement in terms of error reduction in forecasting compared to benchmark models. |
| [30] | Hybrid of wavelet transform, SAE, and LSTM | Wavelet transform is used to decompose the current series. SAE-LSTM is used to forecast each series. Then the predicted series is reconstructed. The proposed hybrid algorithms overcome the shortcomings of wavelet transform and improve the price forecasting for residential, commercial, and industrial users using an optimal and stratified model. | Short-term | In terms of MAPE reduction, the performance of the proposed model is superior compared to other algorithms. |
| [31] | Hybrid of cuckoo search, SVM, and SSA | Electricity price forecasting is performed by analyzing seasonal trends and patterns. Moreover, a hybrid feature selection algorithm is introduced to improve the electricity price forecasting. | Short-term | MAPE and RMSE of the proposed model, along with DM, are significantly lower compared to other benchmark models. |
| [32] | RELM, VMD, and MO-SCA | An adaptive, deterministic, and probabilistic model is used for forecasting. A divide-and-conquer strategy is used to improve price forecasting. VMD is used to decompose the current series into a number of series and each series is forecast individually. | Short-term | MAE, RMSE, MAPE, and TIC of the proposed model are significantly lower compared to benchmark models. |
| [33] | LSTM and Jaya optimization algorithm | The Jaya optimization algorithm is used to tune the hyperparameters of LSTM to accurately forecast the electricity load and price. | Long-term | The proposed model achieves a lower error rate than the benchmark models. |
| [34] | VMD, GRRN, and gravity search optimization | A mixed approach is proposed to predict electricity load and price. A hybrid of a neural network and gravity search optimization is developed for input selection to select important features. | Short-term | RMSE and TIC values of the proposed model are lower than those of its counterparts. |

Methodology
In this section, we first introduce EEMD and ELM models, and then we describe our proposed electricity price forecasting model, namely EEMD-ELM.

Ensemble Empirical Mode Decomposition
Wu and Huang [35] developed ensemble empirical mode decomposition (EEMD), an extension of empirical mode decomposition (EMD) that addresses the mode mixing problem. In general, data are a combination of signal and noise. Let x(t) be the data recorded at time t, s(t) the signal, and n(t) the noise. Then we can express the data as:
$$x(t) = s(t) + n(t) \tag{1}$$
In practice, noise is the undesirable part of the data that interferes with data analysis. To remove the noise from the data, EMD decomposes the data by extracting a number of IMFs and a residual. IMFs are oscillatory functions with varying frequency and amplitude. To decompose a time series into a number of IMFs, first, all the local maxima and minima are identified and connected by cubic splines to form the upper and lower envelopes. Second, the mean of the upper and lower envelopes is determined. Afterward, the mean is subtracted from the actual time series to generate the first IMF. The process is repeated until the final component becomes a monotonic function. In practice, the extraction follows a sifting process based on identifying the local maxima and minima. Mathematically, the result of the extraction process is:
$$x(t) = \sum_{i=1}^{n} \mathrm{IMF}_i(t) + r(t) \tag{2}$$
In Equation (2), r is the residue and n is the number of IMFs extracted (not to be confused with the noise term n(t) in Equation (1)).
Mode mixing occurs due to signal intermittency: a single IMF may then involve different physical processes, which obscures the decomposition of the signal. Wu and Huang addressed mode mixing by adding white noise to the target data. The decomposition is repeated over an ensemble of noise-added copies of the data and the resulting IMFs are averaged, so that the added white noise largely cancels out in the ensemble mean. As a result, the dyadic property of each IMF is preserved. It is worth noting that the effect of the added white noise can be controlled using the well-established statistical rule:
$$\varepsilon_n = \frac{\varepsilon}{\sqrt{N}} \tag{3}$$
In Equation (3), $\varepsilon$ is the amplitude of the added noise, N is the number of ensemble members, and $\varepsilon_n$ is the final standard deviation of the error, i.e., of the difference between the input signal and the sum of the corresponding IMFs.
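The statistical rule for the added white noise, usually written in the EEMD literature as ε_n = ε/√N (residual noise level ε_n, noise amplitude ε, ensemble size N), can be checked numerically. The following is a minimal sketch under stated assumptions, not the authors' implementation: it averages N independent Gaussian noise series and compares the standard deviation of the ensemble mean with the value the rule predicts. The function name `residual_noise_std` and all parameter values are hypothetical.

```python
import math
import random
import statistics

def residual_noise_std(amplitude, ensemble_size, length=2000, seed=0):
    """Standard deviation of the mean of `ensemble_size` white-noise series."""
    rng = random.Random(seed)
    mean_series = [0.0] * length
    for _ in range(ensemble_size):
        for t in range(length):
            # Each ensemble member contributes 1/N of its noise to the mean
            mean_series[t] += rng.gauss(0.0, amplitude) / ensemble_size
    return statistics.pstdev(mean_series)

amplitude = 0.2       # hypothetical noise amplitude (epsilon)
ensemble_size = 100   # hypothetical number of ensemble members (N)

observed = residual_noise_std(amplitude, ensemble_size)
expected = amplitude / math.sqrt(ensemble_size)  # epsilon / sqrt(N)
print(f"observed {observed:.4f}, rule predicts {expected:.4f}")
```

The two values should agree to within sampling error, which illustrates why a larger ensemble leaves less of the added noise in the averaged IMFs.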

Extreme Learning Machine
Huang et al. [36] developed the extreme learning machine (ELM), a neural network with a single hidden layer of N hidden nodes and nonlinear activation functions. It is one of the most popular neural networks due to its fast learning ability and satisfactory results. For any given dataset $\{(x_i, t_i)\}$, i = 1, 2, 3, ..., N, with $x_i \in R^n$ as the input samples and $t_i \in R^m$ as the output samples, ELM can be formulated as follows:
$$f(x) = \sum_{i=1}^{N} \beta_i h_i(x) = h(x)\beta \tag{4}$$
where $\beta = [\beta_1, \beta_2, \ldots, \beta_N]^T$ is the weight vector that connects the hidden layer neurons with the output layer, $h(x) = [h_1(x), h_2(x), \ldots, h_N(x)]$ is the hidden layer output for the input x, with $h_i(x)$ the output of the ith hidden node, and g(x) represents the activation function applied at each hidden node (in our case, the sigmoid). Based on the hidden layer output, the equation can be written as:
$$H\beta = T \tag{5}$$
where
$$H = \begin{bmatrix} h(x_1) \\ \vdots \\ h(x_N) \end{bmatrix} \tag{6}$$
and
$$T = \begin{bmatrix} t_1^T \\ \vdots \\ t_N^T \end{bmatrix} \tag{7}$$
ELM aims to minimize the training error; however, in cases where the dataset is unstable due to random observations or contains outliers, ELM may perform poorly.
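To make the ELM training procedure concrete, the sketch below fits a single-output ELM with sigmoid hidden nodes on a toy regression problem. It is an illustration under stated assumptions rather than the authors' code: the hidden weights and biases are drawn uniformly from [-1, 1], and the output weights are obtained from ridge-regularized normal equations (a small `ridge` term is added for numerical stability, whereas the standard ELM uses the Moore-Penrose pseudo-inverse of the hidden layer output matrix). All function names, the toy data, and the parameter values are hypothetical, and a practical implementation would normally rely on a numerical library.

```python
import math
import random

def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def hidden_output(x, weights, biases):
    """h(x): sigmoid outputs of the randomly initialised hidden nodes."""
    return [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(ws, x)) + b)))
            for ws, b in zip(weights, biases)]

def elm_fit(inputs, targets, n_hidden=10, ridge=1e-6, seed=0):
    """Draw random hidden parameters, then solve for the output weights."""
    rng = random.Random(seed)
    dim = len(inputs[0])
    weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_hidden)]
    biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    H = [hidden_output(x, weights, biases) for x in inputs]
    # Normal equations: (H^T H + ridge * I) beta = H^T T
    gram = [[sum(row[i] * row[j] for row in H) + (ridge if i == j else 0.0)
             for j in range(n_hidden)] for i in range(n_hidden)]
    rhs = [sum(row[i] * t for row, t in zip(H, targets)) for i in range(n_hidden)]
    beta = solve_linear(gram, rhs)
    return weights, biases, beta

def elm_predict(x, model):
    """f(x) = h(x) beta for a single input vector x."""
    weights, biases, beta = model
    return sum(b * h for b, h in zip(beta, hidden_output(x, weights, biases)))

# Toy regression target (hypothetical data, not electricity prices)
xs = [[3.0 * i / 59] for i in range(60)]
ys = [math.sin(x[0]) for x in xs]
model = elm_fit(xs, ys)
train_mse = sum((elm_predict(x, model) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
print(f"training MSE: {train_mse:.2e}")
```

Only the output weights are learned; the hidden layer stays fixed after random initialization, which is what gives ELM its fast training.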

Combined EEMD-ELM Forecasting Model
In this section, the three-stage EEMD-ELM model is introduced in detail. Figure 1 presents the structure of our proposed model. Before inputting the time series into the EEMD-ELM, we first extract all the observations for each day of the week separately. Then, the observations of the same days are appended to each other to form a new time series, as presented in Figure 2. The main reason for extracting and appending similar days is to ensure that the historical observations of the same day of the week are used to forecast the electricity price of a given day. After forming a time series, the number of IMFs is determined. It is important to maintain the same number of inputs for each dataset so that the model is evaluated under the same conditions. When EEMD is applied, each dataset decomposes into 12-16 components. To keep the same number of IMFs for each dataset, we started by training and testing our model with two IMFs and a residue and compared the performance while incrementing the number of IMFs by one. We observed only slight performance improvement when the number of IMFs was increased beyond four. Because each IMF is forecast individually and the time needed to train and forecast each IMF for each state is significant, we chose four as the suitable number of IMFs for each dataset.
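The same-day grouping step described above can be sketched as follows. This is a minimal illustration, assuming the input is a mapping from calendar dates to that day's list of price observations; the function name `build_same_day_series` and the toy data (two observations per day instead of the real half-hourly profile) are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def build_same_day_series(daily_prices):
    """Map weekday name -> prices of all such days, appended in date order."""
    series = defaultdict(list)
    for day in sorted(daily_prices):
        series[WEEKDAYS[day.weekday()]].extend(daily_prices[day])
    return dict(series)

# Hypothetical input: two weeks of data with 2 observations per day
start = date(2018, 1, 1)  # a Monday
daily_prices = {start + timedelta(days=i): [float(i), float(i) + 0.5]
                for i in range(14)}

same_day = build_same_day_series(daily_prices)
print(same_day["Monday"])  # [0.0, 0.5, 7.0, 7.5]
```

Each resulting series contains only observations from one day of the week, so a forecast for a Monday is driven by earlier Mondays only.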
Once the number of IMFs is determined, the electricity price series is input to the three-stage model. In the first stage, the actual price series is decomposed into four intrinsic mode functions (IMFs) and a residual component using the EEMD algorithm. Figure 3 shows the IMFs, the residual component, and the input time series. In the second stage, each IMF and the residual are treated as independent time series. To forecast the electricity price, each component is converted into a supervised learning problem, and ELM is used to train and test the components individually. It is worth mentioning that a single model applied to the raw series may struggle to capture the non-linear and non-stationary trend in the electricity price series, which motivates forecasting the components separately.
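Converting a component into a supervised learning problem is commonly done with a sliding window, where each target value is paired with the values that precede it. The sketch below shows one such conversion; the paper does not spell out its exact windowing, so the function name `to_supervised` and the lag count are assumptions made for illustration.

```python
def to_supervised(series, n_lags):
    """Return (inputs, targets): each target is paired with its n_lags predecessors."""
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append(series[t - n_lags:t])
        targets.append(series[t])
    return inputs, targets

component = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical IMF values
X, y = to_supervised(component, n_lags=2)
print(X)  # [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
print(y)  # [3.0, 4.0, 5.0]
```

The resulting input/target pairs can then be passed to any regressor, such as an ELM, with one model trained per component.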
The key advantage of our model over existing models is that it uses a divide-and-conquer strategy to forecast the electricity price. Splitting the actual series into multiple component series makes each series simpler and easier to forecast. In the third stage, the forecast values of all IMFs and the residue obtained in stage two are added together to output the forecast value of the electricity price.
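The three stages (decompose, forecast each component, sum the component forecasts) can be sketched end to end with deliberately simple stand-ins. In the sketch below, `toy_decompose` is NOT EEMD: it is a hypothetical placeholder that splits the series into a moving-average part and a detail part that add up to the original, and `persistence_forecast` is a placeholder for the per-component ELM. Only the overall structure reflects the description above.

```python
def toy_decompose(series, window=3):
    """Stand-in for EEMD: split into a detail part and a smooth part."""
    smooth = []
    for t in range(len(series)):
        chunk = series[max(0, t - window + 1):t + 1]
        smooth.append(sum(chunk) / len(chunk))
    detail = [x - s for x, s in zip(series, smooth)]
    return [detail, smooth]

def persistence_forecast(component):
    """Placeholder forecaster: repeat the last value (the paper uses an ELM)."""
    return component[-1]

def three_stage_forecast(series):
    components = toy_decompose(series)                         # stage 1: decompose
    forecasts = [persistence_forecast(c) for c in components]  # stage 2: forecast each
    return sum(forecasts)                                      # stage 3: recombine

prices = [50.0, 52.0, 49.0, 55.0, 60.0]  # hypothetical prices
print(three_stage_forecast(prices))
```

Because the components add up to the original series, summing the component forecasts yields a forecast on the original price scale, which is the property the third stage relies on.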

Results and Simulations
In this section, we first present the experimental design. Second, we describe the three datasets used in the experiments, which are drawn from three different markets in Australia, i.e., NSW, QLD, and VIC. Then, the performance evaluation metrics are presented along with the results of the proposed model (EEMD-ELM) and the benchmark approaches, i.e., RNN, MLP, SVR, and ELM.

Experimental Design
Simulations of our proposed model and the benchmark algorithms were run on Google Colab [37]. Three months of data are used to train and test all models (from Monday, 1 January 2018, to Saturday, 31 March 2018). From these three months, we took 12 days of data for each day of the week (e.g., 12 Mondays, 12 Saturdays, etc.). Some days of the week occur more than 12 times in the three months (e.g., Tuesday occurs 13 times); in such cases, we considered only the first 12 occurrences and discarded the rest. To predict the next day's electricity prices (e.g., for Monday), historical observations of the last 11 Mondays are used as training data. The main reason for training the model on observations of the same day of the week is that the electricity generation, consumption, and price pattern differs from day to day, e.g., the price pattern of Sundays is different from that of Mondays. It is worth noting that, unlike other forecasting algorithms, we have not trained our model for a specific day, i.e., the next day does not necessarily have to be a specific Sunday or Monday, etc. Moreover, we predicted the price values for the same day for all three states and ran all models five times in the same test environment to report the average of the performance metric values.
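The day selection described above can be reproduced with a few lines of date arithmetic. The sketch below counts how often each day of the week occurs between 1 January 2018 and 31 March 2018, keeps the first 12 occurrences, and splits them into 11 training days and one target day. Treating the 12th day as the forecast target is an assumption inferred from the text, and the function name `days_by_weekday` is hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def days_by_weekday(first, last):
    """Group every date in [first, last] by its day of the week."""
    groups = defaultdict(list)
    day = first
    while day <= last:
        groups[WEEKDAYS[day.weekday()]].append(day)
        day += timedelta(days=1)
    return groups

groups = days_by_weekday(date(2018, 1, 1), date(2018, 3, 31))
counts = {name: len(days) for name, days in groups.items()}

# Keep only the first 12 occurrences of each weekday
selected = {name: days[:12] for name, days in groups.items()}
# Assumed split: first 11 days for training, the 12th as the forecast target
splits = {name: (days[:11], days[11]) for name, days in selected.items()}

print(counts["Monday"], counts["Sunday"])  # 13 12
print(splits["Monday"][1])                 # 2018-03-19
```

With a 30 min resolution (48 observations per day), each weekday series built this way contains 12 × 48 = 576 observations.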

Dataset Description
To check the effectiveness of our newly developed model, datasets for three Australian states, published by the Australian Energy Market Operator (AEMO), were analyzed. The main reason for selecting different states is to check that the applicability of the proposed model is not affected by attributes such as population, geography, or climate. As the Australian electricity data are recorded under different geographical and climatic conditions, the results suggest that the model is applicable to data sources with variable seasonal or geographical conditions. The datasets are publicly available on the AEMO website https://aemo.com.au/en (accessed on 1 February 2021). The temporal resolution of each dataset is 30 min, which results in 48 observations per day.

Performance Evaluation Metric
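To comprehensively evaluate the performance of the prediction models, we employ four performance evaluation metrics: mean square error (MSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE), as shown in Equations (8)-(11). The lower the values of the metrics, the higher the prediction accuracy of a model.
$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(X_i - \hat{X}_i\right)^2 \tag{8}$$
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|X_i - \hat{X}_i\right| \tag{9}$$
$$\mathrm{MAPE} = \frac{100\%}{N}\sum_{i=1}^{N}\left|\frac{X_i - \hat{X}_i}{X_i}\right| \tag{10}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(X_i - \hat{X}_i\right)^2} \tag{11}$$
In Equations (8)-(11), $X_i$ and $\hat{X}_i$ are the actual and predicted electricity price values, respectively, and N is the total number of instances.
Moreover, the Diebold-Mariano (DM) test [38] is performed on all the datasets to assess whether the differences in forecasting accuracy between the models are statistically significant. The loss function is set to MAPE. For each pair of models, the null hypothesis is that the forecasting ability of the two models is the same; the alternative hypothesis states that the forecasting ability of one model is better than that of the other. Mathematically, DM can be expressed as:
$$\mathrm{DM} = \frac{\bar{d}}{\sqrt{\left(\gamma_0 + 2\sum_{k=1}^{h-1}\gamma_k\right)/N}} \tag{12}$$
where $\bar{d}$ is the mean of the loss differential between the two forecasts, h is the forecast horizon, and $\gamma_k$ denotes the autocovariance of the loss differential at lag k.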
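The four error metrics and the DM statistic can be written compactly in code. The sketch below uses the standard textbook definitions (MSE, MAE, MAPE in percent, RMSE, and the DM statistic as the mean loss differential divided by the square root of its estimated long-run variance over N); it is an illustration, not the authors' evaluation script. The function names, the `horizon` parameter, and all numeric data are hypothetical, and the per-observation loss is taken to be the absolute percentage error because the text sets the loss function to MAPE.

```python
import math

def mse(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    return math.sqrt(mse(actual, predicted))

def dm_statistic(loss_a, loss_b, horizon=1):
    """Diebold-Mariano statistic for two per-observation loss series."""
    d = [a - b for a, b in zip(loss_a, loss_b)]
    n = len(d)
    d_bar = sum(d) / n

    def autocov(k):
        return sum((d[i] - d_bar) * (d[i - k] - d_bar) for i in range(k, n)) / n

    long_run_var = autocov(0) + 2 * sum(autocov(k) for k in range(1, horizon))
    return d_bar / math.sqrt(long_run_var / n)

# Hypothetical prices and two hypothetical forecasts
actual = [50.0, 55.0, 60.0, 58.0]
forecast_a = [51.0, 54.0, 63.0, 57.0]
forecast_b = [48.0, 58.0, 54.0, 61.0]

ape_a = [100.0 * abs((x - f) / x) for x, f in zip(actual, forecast_a)]
ape_b = [100.0 * abs((x - f) / x) for x, f in zip(actual, forecast_b)]

print("MAPE A:", mape(actual, forecast_a), "MAPE B:", mape(actual, forecast_b))
print("DM (A vs B):", dm_statistic(ape_a, ape_b))
```

A negative DM value here means forecast A has the lower average loss; the statistic is compared against standard normal critical values to decide whether the difference is significant.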

Analysis of IMFs
To handle the non-linear and non-stationary components in the electricity price data, the actual price signal is first decomposed using EEMD. Figure 3a-c shows the actual electricity price series, IMFs, and residual of NSW, QLD, and VIC, respectively. As shown in Figure 3, the frequency of the IMFs ranges from high to low. In addition, each IMF shows a unique oscillating mode embedded in the actual price series. In this paper, the actual electricity price signal is decomposed into four IMFs and a residual. The first component, i.e., IMF1, is the most non-stationary and non-linear component, and its prediction accuracy is the worst among all IMFs. As we move from IMF1 to IMF4, the prediction accuracy improves. The last component (i.e., the residual) shows the best prediction results.

Forecasting Result of NSW
In this subsection, the prediction results for NSW are discussed. Overall, the proposed EEMD-ELM has the lowest error values among all the electricity price forecasting models used in this work, as shown in the bar chart in Figure 4. From Figure 4, it can be concluded that EEMD-ELM reduces the MSE by 68.66%, 65.82%, 83.83%, and 85.72% compared to RNN, MLP, SVR, and ELM, respectively. The reported MAE reductions are 43.47%, 53.72%, and 58.18%. In terms of MAPE, the proposed EEMD-ELM achieves reductions of 80.063%, 38.18%, 45.28%, and 53.98%. In terms of RMSE, the proposed scheme improves on RNN, MLP, SVR, and ELM by 81.09%, 41.54%, 59.8%, and 62.22%, respectively. Table 2 shows the DM test results for the state of NSW.
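The percentage figures above can be related to each other with simple arithmetic. The sketch below assumes that each percentage is a relative reduction, (baseline error − proposed error) / baseline error, which the text does not state explicitly. Under that assumption, and because RMSE is the square root of MSE, an MSE reduction of p percent on the same forecasts implies an RMSE reduction of 100 × (1 − √(1 − p/100)) percent. Applied to the NSW MSE reductions quoted for MLP, SVR, and ELM, this gives approximately 41.5%, 59.8%, and 62.2%, in line with the RMSE figures quoted for those models. The function names are hypothetical.

```python
import math

def percent_reduction(baseline_error, proposed_error):
    """Relative error reduction of the proposed model, in percent (assumed definition)."""
    return 100.0 * (baseline_error - proposed_error) / baseline_error

def rmse_reduction_from_mse(mse_reduction_percent):
    """RMSE reduction implied by an MSE reduction on the same forecasts."""
    return 100.0 * (1.0 - math.sqrt(1.0 - mse_reduction_percent / 100.0))

# NSW MSE reductions quoted in the text for three of the benchmarks
for model, mse_red in [("MLP", 65.82), ("SVR", 83.83), ("ELM", 85.72)]:
    print(model, round(rmse_reduction_from_mse(mse_red), 2))
```

The relation is exact only when both metrics are computed on the same set of forecasts; averaging metric values over several runs can introduce small deviations.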
The electricity price values forecast by EEMD-ELM and the benchmark approaches are presented in Figure 5. The price values predicted by RNN are far from the actual electricity price signal, as shown by the red line. MLP also does not perform well in forecasting electricity prices, although its predicted price values are slightly better than those of RNN. SVR's predicted electricity price values are also better than those of RNN; however, in time slots 24-40, the price values predicted by SVR are unacceptable. Similarly, in time slots 32-40, the price values predicted by ELM are unacceptable. From Figure 5, it is clear that only EEMD-ELM can both capture and forecast the irregular trend in the electricity price data. Moreover, the price predicted by EEMD-ELM is almost identical to the actual electricity price in some time windows (e.g., time windows 8 to 28).

Forecasting Result of QLD
The forecasting results for the state of QLD are discussed in this subsection. As with NSW, the proposed EEMD-ELM achieves the lowest error values among all models. Moreover, all the forecasting models perform better in predicting the electricity price for QLD than for NSW. The performance metric values of all models for QLD are shown in Figure 6. According to Figure 6, EEMD-ELM has the lowest MSE value: it reduces the MSE in predicting the electricity price of QLD by 94.47%, 82.36%, 92.12%, and 92% compared to RNN, MLP, SVR, and ELM, respectively. Similarly, EEMD-ELM reduces the MAE by 75.43%, 53.42%, 64.78%, and 66.22%, respectively, and the MAPE by 85.11%, 45.75%, 54.45%, and 59.45%, respectively. Table 3 shows the DM test results for the state of QLD. [...]

Forecasting Result of VIC
From Figures 4, 6 and 8, it is concluded that, with the exception of EEMD-ELM, all schemes perform better in predicting the electricity price of VIC than that of QLD and NSW. As shown in Figure 8, EEMD-ELM outperforms all the other forecasting models, with the lowest values for every performance metric. The MSE achieved by EEMD-ELM in forecasting the electricity price of VIC is the lowest, corresponding to reductions of 76.52%, 72.63%, 77.74%, and 76.82% compared to RNN, MLP, SVR, and ELM, respectively. In terms of MAE, EEMD-ELM outperforms RNN, MLP, SVR, and ELM by 51.55%, 46.18%, 50.12%, and 45.88%, respectively. The MAPE achieved by EEMD-ELM is 82.52%, 46.98%, 51.96%, and 47.92% lower than that of RNN, MLP, SVR, and ELM, respectively. Similarly, the lowest RMSE value is obtained by EEMD-ELM: it is 81.16%, 47.69%, 52.82%, and 51.85% better than that of RNN, MLP, SVR, and ELM, respectively. Table 4 shows the DM results for the state of VIC.
The electricity price values forecast by the combined EEMD-ELM for VIC are shown in Figure 9. As with NSW and QLD, the price values forecast by RNN are far from the actual electricity price. MLP shows a similar trend. In some time windows, SVR's predicted electricity price values are close to the actual price values; however, such time windows are very limited. Similarly, ELM fails to accurately forecast the electricity price of VIC. EEMD-ELM performs better than the other forecasting models. However, in the electricity price prediction for VIC, the difference between the predicted and actual price values is visible in some time windows, e.g., time windows 36-48.

Conclusions and Future Work
This study proposes a novel forecasting model to predict short-term electricity prices based on the EEMD and ELM approaches. To handle the non-linear and non-stationary components in electricity price data, the actual price signal is decomposed using EEMD. In this process, the actual electricity price signals are decomposed into four IMFs and a residual (IMF1 is the most non-linear and non-stationary component; in contrast, IMF4 is the least non-linear and non-stationary component). Then, an ELM model is fitted to predict the price for a given day based on historical observations of the same day of the week. The newly developed model (EEMD-ELM) was implemented along with the benchmark methods, i.e., RNN, MLP, SVR, and ELM, on three real-time datasets obtained from the NSW, QLD, and VIC electricity markets in Australia. The experimental results demonstrate that the proposed model performs better than its counterparts, i.e., the MSE, MAE, MAPE, and RMSE of the proposed model are lower than those of the benchmark approaches.
In the future, we will investigate electricity theft detection in power grids. Smart meter data and power load data will be analyzed with deep learning models to study power loss and abnormal power consumption.

Conflicts of Interest:
The authors declare no conflict of interest.