Energies
  • Article
  • Open Access

Published: 5 March 2019

Short-Term Electric Load and Price Forecasting Using Enhanced Extreme Learning Machine Optimization in Smart Grids

1 Department of Computer Science, COMSATS University Islamabad, Islamabad 44000, Pakistan
2 College of Computer and Information Systems, Al Yamamah University, Riyadh 11512, Saudi Arabia
3 Computer Engineering Department, College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia
* Authors to whom correspondence should be addressed.
This article belongs to the Special Issue Artificial Intelligence for Smart and Sustainable Energy Systems and Applications

Abstract

A Smart Grid (SG) is a modernized grid that provides efficient, reliable and economical energy to consumers. Energy is the most important resource in the world, and as smart devices increase dramatically, efficient energy distribution is required. Forecasting of electricity consumption is considered a major constituent in enhancing the performance of the SG. Various learning algorithms have been proposed to solve the forecasting problem. The sole purpose of this work is to predict price and load efficiently. The first proposed technique is Enhanced Logistic Regression (ELR) and the second is Enhanced Recurrent Extreme Learning Machine (ERELM). ELR is an enhanced form of Logistic Regression (LR), whereas ERELM optimizes weights and biases using a Grey Wolf Optimizer (GWO). Classification and Regression Tree (CART), Relief-F and Recursive Feature Elimination (RFE) are used for feature selection and extraction. Classification is then performed with ELR on the basis of the selected features. Cross validation is done for ERELM using the Monte Carlo and K-Fold methods. The simulations are performed on two different datasets. The first dataset, i.e., the UMass Electric Dataset, is multi-variate, while the second, i.e., the UCI Dataset, is uni-variate. The first proposed model performs better with the UMass Electric Dataset than with the UCI Dataset, whereas the accuracy of the second model is better with UCI than with UMass. The prediction accuracy is analyzed using four performance metrics: Mean Absolute Percentage Error (MAPE), Mean Absolute Error (MAE), Mean Square Error (MSE) and Root Mean Square Error (RMSE). The proposed techniques are then compared with four benchmark schemes to verify their adaptivity. The simulation results show that the proposed techniques outperform the benchmark schemes and efficiently increase the prediction accuracy of load and price; however, the computational time increases in both scenarios. ELR achieves almost 5% better results than a Convolutional Neural Network (CNN) and almost 3% better than LR, while ERELM achieves almost 6% better results than ELM and almost 5% better than RELM. However, the computational time increases by almost 20% with ELR and 50% with ERELM. Scalability of the proposed techniques is also addressed using half-yearly and yearly datasets; ELR gives 5% better results and ERELM gives 6% better results when used on the yearly dataset.

1. Introduction

For electricity generation and distribution, Traditional Grids (TGs) are used. The infrastructure of TGs is becoming obsolete, which results in energy loss and less efficient output. Due to this outdated infrastructure, intensive power losses occur, leading to load shedding, which is one of the major issues of today's world []. TGs use fossil fuels like coal, petrol and diesel for the combustion process of turbines. The extensive use of fossil fuels leads to natural resource depletion and increased pollution. The literature suggests using Renewable Energy Sources (RES) and modifying the existing TGs by incorporating the latest technologies and updated infrastructure to overcome these issues. The new and modified form of the TG is the Smart Grid (SG) []. Information and Communication Technology (ICT) is integrated with the TG to make the SG. It provides bi-directional communication between consumers and the utility. It monitors, protects and optimizes the generation, distribution and consumption of electric energy, and incorporates the latest technical, control and communication technologies in the TG to enable efficient energy transmission. With the ever increasing dilemma of energy shortage and cost inflation, people are attracted towards the SG. It provides consumers with reliable, economical, sustainable, secure and efficient energy, as it uses intelligent methods. In the SG, Demand Side Management (DSM) is used, which encourages consumers to optimize their energy usage efficiently. DSM allows efficient load utilization by shifting the maximum load from on-peak hours to off-peak hours, thus reducing the cost of electricity. The differences between TG and SG are summarized in Table 1 [].
Table 1. Differences between TG and SG.
Data analytics is the phenomenon of dealing with big data obtained from different sources. Big data is the term used for datasets with large volume, velocity, variety and veracity. Their extreme complexity makes data processing difficult, so data analytics techniques are a necessity for processing big data. Data analytics is used in a number of fields, for example, handling the financial details of customers at a bank, managing the flight details of passengers at an airline company, or forecasting the electricity load and price of consumers. In the SG, data analytics is used to minimize the electricity cost, to improve the service quality of energy utilities and to predict future patterns of electricity consumption. Forecasting is performed to schedule load consumption from on-peak hours to off-peak hours for the next day, week or month in order to reduce the electricity cost and enhance user comfort [].
The terms forecasting and prediction are used interchangeably in this article, as are load and consumption. The sole purpose of this work is to increase the accuracy of load and price forecasting. Two techniques are proposed to meet this objective, i.e., ELR and ERELM. Furthermore, two types of datasets are used, i.e., uni-variate and multi-variate. UCI is the uni-variate dataset; a uni-variate dataset contains one variable, i.e., load in this paper. However, real-time data have a number of variables, so a multi-variate dataset is required to handle multiple variables and achieve a better understanding. In this paper, the multi-variate UMass Electric Dataset is used to predict the load and price. Two types of scenarios are considered, i.e., residential load and smart meter load. The proposed techniques outperform existing techniques in forecasting load and price. Consequently, energy prediction assists energy management on both the residential and the utility side. The list of abbreviations used in this paper is given in Table 2, whereas Table 3 shows the complete list of symbols.
Table 2. List of abbreviations.
Table 3. List of symbols.
The rest of the paper is organized as follows: Section 2 deals with related work, and Section 3 contains a detailed description of the techniques used in this paper. Section 4 covers the proposed system models. Results and their discussion are given in Section 5, whereas Section 6 evaluates the proposed models using the performance metrics. Conclusions and future work are discussed in Section 7.

1.1. Motivation

The authors in [] used a Multi Layer Perceptron (MLP) and an Artificial Neural Network (ANN) to solve the load and price forecasting problem. We propose an enhanced technique, based on a modified loss function, to increase the accuracy of load and price forecasting. In Reference [], the authors used ELM and RELM to predict the electricity load. We propose an enhanced technique that optimizes the weights and biases of the network for efficient load forecasting. Furthermore, two scenarios are considered and two different types of datasets are used to predict the load and price efficiently.

1.2. Problem Statement

SG data are increasing dramatically, so an efficient technique is required to predict the load and price of electricity. The authors in [] used a Recurrent Extreme Learning Machine (RELM) to predict the electricity load. However, in RELM, weights and biases are randomly assigned, which leads to drastic variations in the prediction results. An enhanced technique is proposed to solve this issue. In Reference [], the authors used Convolutional Neural Networks (CNNs) for predicting the energy demand. However, a CNN involves tuning a number of layers, which makes it complex in both space and time.
In this paper, two enhanced techniques are proposed to increase the accuracy of electricity load and price forecasting efficiently. Uni-variate and multi-variate datasets are used for both techniques. Furthermore, the analysis of residential and utility data is performed collectively.

1.3. Contributions

The following are the contributions of this paper:
  • Feature engineering is performed using Recursive Feature Elimination (RFE), Classification And Regression Technique (CART) and Relief-F
  • Two new classification techniques are proposed, i.e., Enhanced Logistic Regression (ELR) and Enhanced Recurrent Extreme Learning Machine (ERELM)
  • In ELR, the loss function of Logistic Regression (LR) is modified to increase the prediction accuracy
  • The Grey Wolf Optimizer (GWO) learning algorithm is used with Recurrent Extreme Learning Machine (RELM) to optimize weights and biases in order to improve the forecasting accuracy
  • The proposed techniques predict the electricity load and price efficiently
  • ELR is used to predict the load and price of a smart home, whereas ERELM is used for forecasting the load of smart meters
  • Cross validation is performed using K-Fold and Monte Carlo methods for assigning the fixed optimal values to weights and biases. This further increases the efficiency of GWO
  • The accuracy of the proposed techniques is evaluated using the performance metrics, i.e., Mean Absolute Error (MAE), Mean Square Error (MSE), Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE)

3. Existing and New Techniques

In this section, the existing and the proposed techniques are discussed.

3.1. Classification and Regression Technique (CART)

CART is a type of decision tree algorithm which consists of both classification and regression procedures and is used to predict discrete and continuous variables, respectively. CART uses historical data to build decision trees. These newly built trees are then used for the classification of data. It is a binary recursive process; a binary process has only two output values, i.e., 0 and 1. The algorithm searches over all possible values and variables before performing the split operation []. The CART method has three main parts:
  • Construction of maximum tree,
  • Choice of right tree size,
  • Classification of data using the constructed tree.
The construction of a maximum tree refers to splitting the tree down to the last set of observations. This is the most time-consuming phase of CART. Constructing the maximum tree is a complex task, as the tree can have more than a hundred levels. Therefore, the trees must be optimized before being used for the classification of data. Classification problems are those that involve discrimination between entities, e.g., discriminating among students to decide which of them will be awarded a degree this year. On the other hand, regression uses historical data patterns to predict future values, e.g., load and price prediction for homes. The steps of CART are stated below; a minimal scikit-learn sketch follows the list:
  • Problem definition,
  • Variable selection,
  • Specifying the accuracy criteria,
  • Selecting split size,
  • Determine the threshold to stop splitting,
  • Selection of the best tree.
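As an illustrative sketch only (not the configuration used in the experiments), the regression side of CART can be exercised with scikit-learn's DecisionTreeRegressor; the synthetic data, depth limit and feature count below are assumptions standing in for the load/price features.

```python
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

# Synthetic data standing in for the half-hourly load/price features (illustrative only).
X, y = make_regression(n_samples=500, n_features=10, noise=0.1, random_state=0)

cart = DecisionTreeRegressor(max_depth=5, random_state=0)  # bounded tree size instead of the maximum tree
cart.fit(X, y)

# Relative importance of each feature, usable as a feature-selection score.
print(cart.feature_importances_)
```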

3.2. Recursive Feature Elimination (RFE)

RFE is a feature extraction process. It selects a set of the most important features which are the least redundant. As the name suggests, it is an iterative process which keeps running in a loop until only the best features remain. The selected features are then ranked in the order in which they were removed from the feature set. The computation time depends upon the number of features which need to be eliminated []. The pseudocode of RFE is given in Algorithm 1.
Algorithm 1: Pseudocode of RFE
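Since the original pseudocode is available only as an image, the following is a minimal sketch of recursive feature elimination using scikit-learn's RFE wrapped around a CART estimator; the synthetic data and the choice of 8 retained features (matching Section 5.1.3) are assumptions for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=12, random_state=0)

# Recursively drop the weakest feature until 8 remain (8 is the count reported in Section 5.1.3).
selector = RFE(estimator=DecisionTreeRegressor(random_state=0), n_features_to_select=8)
selector.fit(X, y)

print(selector.support_)   # True for the retained features
print(selector.ranking_)   # 1 = retained; larger values were eliminated earlier
```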

3.3. Relief-F

Relief-F is an extensively used method for feature selection. This method randomly selects an instance R and then finds its nearest hit and miss instances. The nearest hits are the k-nearest neighbors of the selected random instance R that share its class, while the nearest misses belong to a different class. Afterwards, the average of the weights of the nearest hits and misses is calculated to update the feature weights before the next instance is selected. The pseudocode of Relief-F is given in Algorithm 2 [].
Algorithm 2: Pseudocode of Relief-F
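The pseudocode is likewise only available as an image, so the snippet below is a simplified, binary Relief-style weight update sketched from the description above (random instance R, k nearest hits and misses, averaged weight update); the sampling count, k and distance metric are illustrative assumptions.

```python
import numpy as np

def relief_weights(X, y, n_samples=100, k=5, seed=0):
    """Simplified binary Relief-F sketch: features that differ on misses gain weight,
    features that differ on hits lose weight."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_samples):
        i = rng.integers(n)                          # random instance R
        dist = np.abs(X - X[i]).sum(axis=1)          # Manhattan distance to R
        dist[i] = np.inf                             # exclude R itself
        same, diff = np.where(y == y[i])[0], np.where(y != y[i])[0]
        hits = same[np.argsort(dist[same])][:k]      # k nearest hits
        misses = diff[np.argsort(dist[diff])][:k]    # k nearest misses
        w -= np.abs(X[hits] - X[i]).mean(axis=0) / n_samples
        w += np.abs(X[misses] - X[i]).mean(axis=0) / n_samples
    return w                                          # rank features by descending weight
```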

3.4. Convolutional Neural Network (CNN)

CNN is a type of NN. It is built from neurons and works like biological neurons. Each neuron is fed some input, performs a dot product and finally gives an output. A CNN consists of one or more convolutional layers followed by a multilayer NN. The basic type of CNN is a 2D network, mostly used for images; for forecasting data, a 1D CNN can also be used. The layers in a CNN are the convolutional layer, pooling layer, dropout layer and dense layer. It also has an activation function such as Sigmoid, ReLU or Tanh. When new inputs are given to a CNN, it does not know the exact feature mapping; therefore, it creates a convolutional layer and convolves over the input to find the correct feature mapping. The pooling layer in a CNN shrinks large inputs. The most widely used activation function for CNNs is ReLU, whose working is simple: whenever a negative number occurs, it is replaced by 0. Hidden layers are also present in a CNN, and error minimization is performed using these layers.
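The paper's exact CNN configuration is not reproduced here; as a hedged sketch, a small 1D CNN of the kind described above (convolution, pooling, dropout and dense layers with ReLU) can be written with Keras as follows, where the 24-step input window and the layer sizes are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# A small 1D CNN for a 24-step univariate load window (layer sizes are illustrative).
model = models.Sequential([
    layers.Input(shape=(24, 1)),
    layers.Conv1D(16, kernel_size=3, activation="relu"),  # convolutional layer
    layers.MaxPooling1D(pool_size=2),                     # pooling layer shrinks the input
    layers.Dropout(0.2),                                  # dropout layer
    layers.Flatten(),
    layers.Dense(32, activation="relu"),                  # dense layer
    layers.Dense(1),                                      # predicted load value
])
model.compile(optimizer="adam", loss="mse")
model.summary()
```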

3.5. Logistic Regression (LR)

LR is a type of statistical model used for regression. It is used to analyze a given dataset and then perform predictions using the independent variables. The outcome of LR is in binary form. The main aim of LR is to describe a pattern between the independent and dependent variables. There are two main components of LR: the loss function and the sigmoid function. The features should be in binary form to use the LR method; hence, normalization of the data is required before implementing the LR model on the available data. The sigmoid function used in LR is given in the following equation []:
$$\mathrm{sigmoid}(t) = \frac{1}{1 + e^{-t}}.$$
Logistic loss function is given by the following equation, which is taken from []:
$$\mathrm{loss\,function}(t) = -\frac{1}{m}\left( y^{T}\log(h) + (1 - y)^{T}\log(1 - h) \right).$$
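For concreteness, these two functions can be written directly in NumPy as below; this is a sketch of the formulas above rather than the implementation used in the simulations, and the epsilon clipping is an added numerical safeguard.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def logistic_loss(h, y):
    """Cross-entropy loss of predicted probabilities h against binary labels y."""
    m = len(y)
    eps = 1e-12                          # numerical safeguard against log(0)
    h = np.clip(h, eps, 1.0 - eps)
    return -(1.0 / m) * (y @ np.log(h) + (1.0 - y) @ np.log(1.0 - h))
```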

3.6. Enhanced Logistic Regression (ELR)

ELR is proposed in this paper. It is an enhanced form of the LR technique in which a new loss function is used. The loss function is the objective function that needs to be minimized; it measures how well a prediction model performs in predicting the outcome. Minimizing the value of the loss function increases the prediction accuracy. In this paper, the loss function is modified and minimized to enhance the prediction accuracy. The equation for the loss function of ELR is given below:
$$\mathrm{new\,loss\,function}(t) = -\frac{0.1}{m}\left( y^{T}\log(h) + (1 - y)^{T}\log(1 - h) \right).$$
ELR is used to efficiently predict the electricity load and price of a smart home and the load of smart meters. Two different datasets, i.e., the UMass Electric Dataset and the UCI Dataset, are used to test the proposed technique.
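Relative to the LR loss, the ELR loss equation above differs only in the 0.1/m scaling factor; a minimal sketch, reusing the logistic_loss helper defined in Section 3.5, is:

```python
def elr_loss(h, y):
    """ELR loss sketch: the logistic loss above scaled by 0.1, per the modified loss function."""
    return 0.1 * logistic_loss(h, y)
```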

3.7. Grey Wolf Optimizer (GWO)

In this section, the GWO technique is discussed in detail. In the proposed model, the metaheuristic technique GWO is used. It follows the social leadership and hunting mechanism of grey wolves, as shown in Figure 1. The population is divided into groups, i.e., alpha ( α ), beta ( β ), gamma ( γ ) and omega ( ω ). α , β and γ are considered the fittest wolves, which guide the other wolves ( ω ) through the search space. Grey wolves update their locations according to the positions of the three fittest wolves, i.e., α , β and γ [].
Figure 1. Grey wolf social hierarchy.
The general steps that are followed in GWO are:
  • Parameters of grey wolves are initialized such as maximum number of iterations, the population size, upper and lower bounds of search space,
  • Calculate fitness value to initialize the position of each wolf,
  • Select three best wolves, i.e., α , β and γ ,
  • Calculate the positions of the remaining wolves ( ω ),
  • Repeat from step 2 if the current solution is not satisfactory,
  • The fittest solution is taken as α .
The pseudocode of GWO is given in Algorithm 3.
Algorithm 3: Pseudocode of GWO
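Because Algorithm 3 survives only as an image, the following is a minimal NumPy sketch of the GWO loop described above (the three fittest wolves guide the position updates, with the coefficient a decreasing linearly over the iterations); the population size, iteration count and bounds are illustrative assumptions.

```python
import numpy as np

def gwo(fitness, dim, n_wolves=20, n_iter=100, lb=-1.0, ub=1.0, seed=0):
    """Minimize `fitness` over a dim-dimensional box using Grey Wolf Optimization."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, size=(n_wolves, dim))        # step 1: initialize positions
    scores = np.array([fitness(x) for x in X])           # step 2: evaluate fitness
    for t in range(n_iter):
        order = np.argsort(scores)
        alpha, beta, gamma = X[order[:3]]                 # step 3: three fittest wolves
        a = 2.0 * (1.0 - t / n_iter)                      # decreases linearly from 2 to 0
        for i in range(n_wolves):                         # step 4: update the omega wolves
            new_pos = np.zeros(dim)
            for leader in (alpha, beta, gamma):
                r1, r2 = rng.random(dim), rng.random(dim)
                A, C = 2.0 * a * r1 - a, 2.0 * r2
                D = np.abs(C * leader - X[i])             # distance to the leader
                new_pos += (leader - A * D) / 3.0         # average of the three guided moves
            X[i] = np.clip(new_pos, lb, ub)
            scores[i] = fitness(X[i])
    best = int(np.argmin(scores))                         # step 6: fittest solution (alpha)
    return X[best], float(scores[best])
```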

3.8. Recurrent Extreme Learning Machine (RELM)

RELM is a recurrent neural network with a single hidden layer. It is a network with internal feedback that feeds the output or hidden layer values back to the input, as given in the following equation []:
$$y = \sum_{j=1}^{m} \beta_j \, g\!\left(\sum_{i=1}^{n} w_{i,j}\, x_i + \sum_{n=i+1}^{n+r} W_{i,j}\,\delta(t - 1 + n) + b_j\right),$$
where δ represents the delay, t the current iteration and r the total number of context neurons. Context neurons are connected backward from the output to the input; they behave like input neurons and hold delayed values of the output neurons. The learning method used to update the weights and biases of ELM and RELM is similar, as shown in Figure 2.
Figure 2. Functioning of RELM.
Input weights and biases are assigned randomly, and RELM uses whichever random assignment yields the best results. The training dataset is then used to calculate the unknown weights of the hidden layer output, which are computed using the Moore–Penrose generalized inverse technique.
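A minimal sketch of this training step for a plain ELM (context neurons and delayed feedback omitted) is shown below; the hidden layer size and sigmoid activation are assumptions for illustration.

```python
import numpy as np

def train_elm(X, y, n_hidden=20, seed=0):
    """Random input weights/biases; output weights via the Moore-Penrose pseudoinverse."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))   # random input weights
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                  # random biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))                     # sigmoid hidden layer output
    beta = np.linalg.pinv(H) @ y                               # Moore-Penrose solution for output weights
    return W, b, beta

def predict_elm(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta
```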

3.9. Enhanced Recurrent Extreme Learning Machine (ERELM)

ERELM is an enhanced form of RELM, which is itself an enhanced form of ELM, a single hidden layer feedforward neural network (FFNN). In RELM, the input weights and biases are chosen randomly, whereas the output weights are determined analytically using a simplified generalized inverse operation. The issue with ELM is that the classification boundary is not well defined, so some samples are usually misclassified. To overcome this shortcoming, a new technique is proposed.
In the proposed technique, i.e., ERELM, the weights and biases are decided after optimization with the GWO algorithm, which finds the solution that minimizes the RMSE. Cross validation in ERELM is done using the Monte Carlo and K-Fold methods. The Monte Carlo technique is used to model the probabilistic nature of the random variables; it performs risk analysis using a probability distribution, commonly the normal, uniform, triangular or discrete distribution. In the K-Fold cross validation process, the entire dataset is divided into K batches of samples, where K can be any positive integer; each batch formed after splitting the data is termed a fold. The most commonly used variant is 10-Fold cross validation, in which the value of K is 10. The pseudocode of ERELM is given in Algorithm 4.
Algorithm 4: Pseudocode of ERELM
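Since Algorithm 4 is only available as an image, the sketch below shows one plausible way to wire GWO and K-Fold cross validation together as described: a GWO candidate encodes the input weights and biases, and its fitness is the mean validation RMSE of the resulting ELM (context neurons again omitted). The 10 folds, the hidden layer size and the gwo/ELM helpers from the earlier sketches are assumptions.

```python
import numpy as np
from sklearn.model_selection import KFold

def erelm_fitness(candidate, X, y, n_hidden=20, n_splits=10):
    """Mean 10-fold validation RMSE of an ELM whose input weights/biases are the GWO candidate."""
    d = X.shape[1]
    W = candidate[: d * n_hidden].reshape(d, n_hidden)
    b = candidate[d * n_hidden:]
    errors = []
    for tr, va in KFold(n_splits=n_splits, shuffle=True, random_state=0).split(X):
        H_tr = 1.0 / (1.0 + np.exp(-(X[tr] @ W + b)))
        beta = np.linalg.pinv(H_tr) @ y[tr]                 # Moore-Penrose output weights
        H_va = 1.0 / (1.0 + np.exp(-(X[va] @ W + b)))
        errors.append(np.sqrt(np.mean((H_va @ beta - y[va]) ** 2)))
    return float(np.mean(errors))

# Hypothetical usage with the gwo() helper sketched in Section 3.7:
# best, rmse = gwo(lambda c: erelm_fitness(c, X, y), dim=X.shape[1] * 20 + 20)
```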

4. Proposed System Models

Two system models are proposed in this section. The descriptions of these models are given below.

4.1. Proposed System Model 1

The first proposed system model uses the residential load and price data of a smart home (SH) consisting of six rooms and eight heavy appliances. The model consists of four basic steps, i.e., normalization of the data, feature selection using CART and RFE, feature extraction using Relief-F, and finally forecasting of load and price using CNN, LR and ELR. ELR is the proposed technique, which outperforms CNN and LR in terms of prediction accuracy. In this model, short term forecasting is performed to make decisions for efficient load and price scheduling in the near future.
The first proposed model is shown in Figure 3.
Figure 3. Proposed system model 1.

4.2. Proposed System Model 2

The second proposed system model is shown in Figure 4.
Figure 4. Proposed system model 2.
In the second system model, the load of 10 smart meters is taken. Subsequently, a comparison is performed with the multivariate residential data. The first step in this model is the preprocessing of the data; after the data are preprocessed, the best parameters are selected using RELM. The optimization of RELM is performed using GWO, which optimizes the biases and weights to improve the accuracy. Thereafter, the proposed technique ERELM reduces the forecasting error. Cross validation is performed using the Monte Carlo and K-Fold methods.
The simulation results and the assessment of both proposed models are based on four different performance metrics: MAPE, MAE, RMSE and MSE. The results show that the proposed techniques outperform the existing techniques in terms of prediction accuracy.

5. Simulation Results and Discussion

This section presents the simulation results of the proposed models along with their discussion. The simulations are performed in Spyder (Python 3.6) provided by Anaconda (a data science platform by Anaconda, Inc., Austin, TX, USA) on an HP ProBook 450G with a 1 TB hard drive and 8 GB of RAM.

5.1. Simulation Results and Discussion of Proposed System Model 1

The simulation results and discussion of proposed system model 1 is given below.

5.1.1. Data Description

The first dataset is taken from the UMass Electric Dataset []. It is a multivariate dataset used for forecasting purposes. Half-yearly and yearly data for the year 2016 are taken to address scalability. The dataset contains the half-hourly load and price values of a single home. The dataset is divided in a 70:30 ratio, i.e., seventy percent of the data is used for training, whereas the remaining thirty percent is used for testing. Preprocessing of the dataset is done to remove the Not a Number (NaN) and blank values. The UMass dataset is used for both proposed system models; however, it performs much better when used with ELR.
Table 5 shows the features of UMass Electric Dataset excluding the target features. The targeted features are “Load” in case of load prediction and “Price” in case of price prediction. The values are given in standard units, i.e., kW for load and cents/kWh for price. The dataset is normalized in the range [0–1].
Table 5. Features in UMass Electric Dataset.
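A minimal sketch of the preprocessing just described (dropping NaN rows, [0–1] normalization and the 70:30 split) is given below; the file name and column layout are hypothetical, as the dataset files themselves are not part of this paper.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.read_csv("umass_electric_2016.csv")      # hypothetical file name
df = df.dropna()                                  # remove NaN and blank values

scaler = MinMaxScaler(feature_range=(0, 1))       # normalize all features to [0, 1]
values = scaler.fit_transform(df.values)

split = int(0.7 * len(values))                    # 70% training, 30% testing (chronological split)
train, test = values[:split], values[split:]
```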

5.1.2. CART

Table 6 shows the results of CART used to predict load and price using UMass Electric Dataset. CART gives respective values for different features.
Table 6. Results of CART for UMass Electric Dataset.

5.1.3. RFE

After the CART technique, RFE is implemented for feature selection. RFE keeps iterating until the model is left with only the most prominent features. The choice of features depends upon the requirements. The results of RFE are given in Table 7.
Table 7. RFE features for a UMass Electric Dataset.
RFE assigns one of two values to each feature, i.e., True or False. In the proposed model, the number of features selected through RFE is 8 when used for the UMass Electric Dataset. The RFE-selected features are: AC, cellar lights, washer, garage, master bed + kids bed, panels, dining room and microwave. For the UCI Dataset, RFE does not give any output, as it is a uni-variate dataset.

5.1.4. Relief-F

Relief-F is used for feature extraction. The threshold for Relief-F is 10. Table 8 shows the Relief-F features for the UMass Electric Dataset. It does not give any output when used with the UCI Dataset because that dataset is uni-variate.
Table 8. Relief-F features for UMass Electric Dataset.

5.1.5. Load Forecasting

Figure 5a,b show the load prediction comparison of the three techniques for one day using the two hourly datasets. Similarly, Figure 6a,b show the load prediction comparison for one week using the two hourly datasets. From Figure 7a,b, the monthly load prediction comparison can be observed; in this case, to avoid cluttering the graphs, data are taken every four hours. These figures show that the proposed technique ELR outperforms LR and CNN for both datasets, and that the load prediction with ELR is close to the actual data. The prediction results obtained using the UMass Electric Dataset are better than those for the UCI Dataset.
Figure 5. One day load prediction.
Figure 6. One week load prediction.
Figure 7. One month load prediction.

5.1.6. Price Forecasting

Figure 8 shows the price prediction comparison of the three techniques for one day using the UMass Electric Dataset. Similarly, Figure 9 shows the price prediction comparison for one week using the UMass Electric Dataset. From Figure 10, the monthly price prediction comparison can be observed; in this case, data are taken every four hours. These figures show that the proposed technique ELR outperforms LR and CNN for the UMass Electric Dataset in terms of price prediction, and that the price prediction with ELR is close to the actual data.
Figure 8. One day price prediction using UMass.
Figure 9. One week price prediction using UMass.
Figure 10. One month price prediction using UMass.

5.2. Simulation Results and Discussion of Proposed System Model 2

The simulation results and discussion of proposed system model 2 are given in this section.

5.2.1. Data Description

The second dataset is taken from the UCI machine learning repository. It is a uni-variate dataset developed by Artur Trindade []. Consumption data of 370 substations are considered to analyze the load of smart meters. The daily data of the substations with meter IDs 166, 168, 169, 171, 182, 225, 237, 249, 250 and 257 are shown in Figure 11. The periodicity of load consumption can be observed; the pattern of intervals gives the trend of load consumption, which later helps in predicting the future electricity load. The UCI Dataset is used for both proposed system models. In order to analyze scalability, half-yearly and yearly datasets are used. It performs well for smart meters because the only targeted feature is load. The values of load are given in kilowatts (kW). This dataset is also normalized in the range [0–1].
Figure 11. Daily load consumption of 10 different substations.

5.2.2. Results Discussion

Multiple approximation functions are used in order to find the optimal forecasting accuracy. These functions include the hard limit, sine, tanh and sigmoid functions. The numbers of neurons and context neurons are set to 2 and 5, respectively. The MT166 dataset is selected to determine which function produces the optimal results. The dataset is normalized and scaled before use to remove spikes and noise in the data. ELM, RELM and ERELM are tested on all functions one by one, as given in Table 9. The sigmoid approximation function performed better than the other functions, so the simulations for the second proposed model are carried out on both datasets using the sigmoid function.
Table 9. Obtained RMSE using ELM, RELM and ERELM.
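For reference, the four candidate approximation functions can be written compactly as below; the hard-limit threshold at zero is the usual convention and is an assumption here.

```python
import numpy as np

activations = {
    "hardlim": lambda x: (x >= 0).astype(float),    # hard limit at zero (assumed threshold)
    "sine":    np.sin,
    "tanh":    np.tanh,
    "sigmoid": lambda x: 1.0 / (1.0 + np.exp(-x)),
}
```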
In Table 10, simulation results for both datasets are given using six months of data. Cross validation is done using Monte Carlo and K-Fold. The simulations show that the proposed technique outperforms the conventional techniques in terms of forecasting accuracy and gives the minimum RMSE; Monte Carlo gives better results than K-Fold. Similarly, Table 11 addresses the scalability of the proposed system and shows that the prediction accuracy increases with the size of the dataset. Figure 12a,b show the regression line produced by the predicted and actual load using ELM. Similarly, Figure 13a,b show regression plots using RELM, which has a greater RMSE than the proposed technique. Figure 14a,b show the plots produced by ERELM, where the regression line of the actual and predicted electricity load has the minimum RMSE.
Table 10. Obtained RMSE for half-yearly testing data using ELM, RELM and ERELM by Monte Carlo and K-Fold cross validation.
Table 11. Obtained RMSE for yearly testing data using ELM, RELM and ERELM by Monte Carlo and K-Fold cross validation.
Figure 12. Regression line plots using ELM.
Figure 13. Regression line plots using RELM.
Figure 14. Regression line plots using ERELM.
It is clearly visible that the predicted values are very close to the actual electricity load. Table 12 gives the computational time comparison for the execution of the training and testing data of ELM, RELM and ERELM. ERELM has a greater computational time than ELM and RELM due to its metaheuristic behaviour. Thus, there is a tradeoff between accuracy and computational time.
Table 12. Computational time comparison of ERELM, RELM and ELM execution.

6. Performance Metrics

The performance of the proposed system models is evaluated on the basis of four performance metrics: MAE, MSE, RMSE and MAPE. Of these four, MAPE is given as a percentage, whereas the other three are given as absolute values:
$$\mathrm{MAPE} = \frac{1}{T_M}\sum_{t_m=1}^{T_M} \left| A_v - F_v \right| \times 100,$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{T_M}\sum_{t_m=1}^{T_M} \left( A_v - F_v \right)^2},$$
$$\mathrm{MSE} = \frac{1}{T_M}\sum_{t_m=1}^{T_M} \left( A_v - F_v \right)^2,$$
$$\mathrm{MAE} = \frac{1}{N}\sum_{n=1}^{N} \left| F_v - A_v \right|.$$
The accuracy of the model is calculated using the following equation:
$$\mathrm{Accuracy} = 100 - \mathrm{RMSE}.$$
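A direct NumPy transcription of the four metrics and the accuracy formula, following the equations as stated above, is sketched below.

```python
import numpy as np

def mape(actual, forecast):
    return np.mean(np.abs(actual - forecast)) * 100.0   # as stated in the MAPE equation above

def mse(actual, forecast):
    return np.mean((actual - forecast) ** 2)

def rmse(actual, forecast):
    return np.sqrt(mse(actual, forecast))

def mae(actual, forecast):
    return np.mean(np.abs(forecast - actual))

def accuracy(actual, forecast):
    return 100.0 - rmse(actual, forecast)
```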
Table 13, Table 14 and Table 15 show the load performance metrics comparison for half-yearly and yearly data to address the scalability issue. The dataset being used is UMass Electric Dataset. Similarly, Table 16, Table 17 and Table 18 show the price performance metrics comparison for half-yearly and yearly data to address the scalability issue using the UMass Electric Dataset.
Table 13. Load performance metrics comparison for one day using the UMass Electric Dataset.
Table 14. Load performance metrics comparison for one week using the UMass Electric Dataset.
Table 15. Load performance metrics comparison for one month using the UMass Electric Dataset.
Table 16. Price performance metrics comparison for one day using the UMass Electric Dataset.
Table 17. Price performance metrics comparison for one week using the UMass Electric Dataset.
Table 18. Price performance metrics comparison for one month using the UMass Electric Dataset.
Table 19, Table 20 and Table 21 show the load performance metrics comparison for half-yearly and yearly data to address the scalability issue. The dataset being used is UCI Dataset.
Table 19. Load performance metrics comparison for one day using the UCI Dataset.
Table 20. Load performance metrics comparison for one week using the UCI Dataset.
Table 21. Load performance metrics comparison for one month using the UCI Dataset.
Table 22 and Table 23 present the accuracy of the proposed technique ERELM in terms of RMSE, MSE and MAE, using half-yearly and yearly data. The results show that ERELM outperforms the other techniques in all performance metrics.
Table 22. Accuracy of ERELM using RMSE, MSE and MAE for half-yearly data.
Table 23. Accuracy of ERELM using RMSE, MSE and MAE for yearly data.

7. Conclusions and Future Work

In this paper, electricity load and price forecasting are performed using two techniques. The UMass Electric Dataset is used to predict the day ahead, week ahead and month ahead load and price of a SH. Six months of hourly data are considered for day ahead and week ahead prediction, whereas four-hourly data are considered for month ahead prediction. It is a multi-variate dataset. The data are first normalized and split into a training set and a testing set. Feature engineering is then performed using three different techniques: RFE, CART and Relief-F. For efficient load and price prediction, a new technique, i.e., ELR, is proposed. ELR outperformed CNN and LR in terms of prediction accuracy. ELR is also used for the UCI Dataset, a uni-variate dataset containing the data of smart meters of different substations. The results show that the first proposed model works well with the UMass Electric Dataset. The techniques used are then assessed on the basis of four different performance metrics, i.e., MAPE, MAE, MSE and RMSE. The simulation results show that ELR outperformed LR and CNN for both datasets.
For accurate short term load forecasting, a new technique, i.e., ERELM, is proposed. Short term forecasting is performed to ensure efficient load scheduling and price reduction. Parameter optimization of RELM is done using GWO, which optimizes the biases and weights to improve the accuracy. The prediction accuracy is further increased using Monte Carlo and K-Fold cross validation. ERELM is used with both datasets. The results show that ERELM works well for the UCI Dataset, and it is observed that ERELM outperformed ELM and RELM for both datasets. Scalability is also addressed using both proposed techniques; the results show that the prediction accuracy increases with the size of the dataset.
In the future, the proposed methods will be used to perform mid-term and long-term forecasting. The weights and biases of ERELM will be further optimized using better methods. In addition, further work is required to reduce the computational time of ELR and ERELM.

Author Contributions

All authors contributed equally.

Acknowledgments

The authors extend their appreciation to the Deanship of Scientific Research at King Saud University for funding this work through research group NO (RG-1438-034).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Ipakchi, A.; Albuyeh, F. Grid of the future. IEEE Power Energy Mag. 2009, 7, 52–62. [Google Scholar] [CrossRef]
  2. Yoldaş, Y.; Önen, A.; Muyeen, S.M.; Vasilakos, A.V.; Alan, İ. Enhancing smart grid with microgrids: Challenges and opportunities. Renew. Sustain. Energy Rev. 2017, 72, 205–214. [Google Scholar] [CrossRef]
  3. Shaukat, N.; Ali, S.M.; Mehmood, C.A.; Khan, B.; Jawad, M.; Farid, U.; Ullah, Z.; Anwar, S.M.; Majid, M. A survey on consumers empowerment, communication technologies, and renewable generation penetration within Smart Grid. Renew. Sustain. Energy Rev. 2018, 81, 1453–1475. [Google Scholar] [CrossRef]
  4. Zhou, K.; Fu, C.; Yang, S. Big data driven smart energy management: From big data to big insights. Renew. Sustain. Energy Rev. 2016, 56, 215–225. [Google Scholar] [CrossRef]
  5. Nazar, M.S.; Fard, A.E.; Heidari, A.; Shafie-khah, M.; Catalão, J.P.S. Hybrid model using three-stage algorithm for simultaneous load and price forecasting. Electr. Power Syst. Res. 2018, 165, 214–228. [Google Scholar] [CrossRef]
  6. Ertugrul, Ö.F. Forecasting electricity load by a novel recurrent extreme learning machines approach. Int. J. Electr. Power Energy Syst. 2016, 78, 429–435. [Google Scholar] [CrossRef]
  7. Muralitharan, K.; Sakthivel, R.; Vishnuvarthan, R. Neural network based optimization approach for energy demand prediction in smart grid. Neurocomputing 2018, 273, 199–208. [Google Scholar] [CrossRef]
  8. Shailendra, S.; Yassine, A. Big Data Mining of Energy Time Series for Behavioral Analytics and Energy Consumption Forecasting. Energies 2018, 11, 452. [Google Scholar] [CrossRef]
  9. Ahmad, T.; Chen, H. Short and medium-term forecasting of cooling and heating load demand in building environment with data-mining based approaches. Energy Build. 2018, 166, 460–476. [Google Scholar] [CrossRef]
  10. Kunjin, C.; Kunlong, C.; Qin, W.; Ziyu, H.; Jun, H.; He, J. Short-term Load Forecasting with Deep Residual Networks. IEEE Trans. Smart Grid 2018, 99. [Google Scholar] [CrossRef]
  11. Seunghyoung, R.; Noh, J.; Kim, H. Deep neural network based demand side short term load forecasting. Energies 2016, 10, 3. [Google Scholar]
  12. Liu, J.P.; Li, C.L. The short-term power load forecasting based on sperm whale algorithm and wavelet least square support vector machine with DWT-IR for feature selection. Sustainability 2017, 9, 1188. [Google Scholar] [CrossRef]
  13. Ahmad, A.; Javaid, N.; Guizani, M.; Alrajeh, N.; Khan, Z.A. An accurate and fast converging short-term load forecasting model for industrial applications in a smart grid. IEEE Trans. Ind. Inform. 2017, 13, 2587–2596. [Google Scholar] [CrossRef]
  14. Fan, C.; Xiao, F.; Zhao, Y. A short-term building cooling load prediction method using deep learning algorithms. Appl. Energy 2017, 195, 222–233. [Google Scholar] [CrossRef]
  15. Shi, H.; Xu, M.; Li, R. Deep learning for household load forecasting—A novel pooling deep RNN. IEEE Trans. Smart Grid 2018, 9, 5271–5280. [Google Scholar] [CrossRef]
  16. Huang, G.B.; Zhu, Q.Y.; Siew, C.K. Extreme learning machine: theory and applications. Neurocomputing 2006, 70, 489–501. [Google Scholar] [CrossRef]
  17. Huang, G.B.; Zhou, H.; Ding, X.; Zhang, R. Extreme learning machine for regression and multiclass classification. IEEE Trans. Syst. Man Cybern. Part B 2012, 42, 513–529. [Google Scholar] [CrossRef] [PubMed]
  18. Fallah, S.N.; Deo, R.C.; Shojafar, M.; Conti, M.; Shamshirband, S. Computational Intelligence Approaches for Energy Load Forecasting in Smart Energy Management Grids: State of the Art, Future Challenges, and Research Directions. Energies 2018, 11, 596. [Google Scholar] [CrossRef]
  19. Zeng, Y.R.; Zeng, Y.; Choi, B.; Wang, L. Multifactor-influenced energy consumption forecasting using enhanced back-propagation neural network. Energy 2018, 127, 381–396. [Google Scholar] [CrossRef]
  20. Luo, J.; Vong, C.M.; Wong, P.K. Sparse Bayesian extreme learning machine for multi-classification. IEEE Trans. Neural Netw. Learn. Syst. 2014, 25, 836–843. [Google Scholar] [PubMed]
  21. Yu, J.; Wang, S.; Xi, L. Evolving artificial neural networks using an improved PSO and DPSO. Neurocomputing 2008, 71, 1054–1060. [Google Scholar] [CrossRef]
  22. Mirjalili, S.; Mirjalili, S.M.; Lewis, A. Grey wolf optimizer. Adv. Eng. Softw. 2014, 69, 46–61. [Google Scholar] [CrossRef]
  23. Saremi, S.; Mirjalili, S.Z.; Mirjalili, S.M. Evolutionary population dynamics and grey wolf optimizer. Neural Comput. Appl. 2015, 26, 1257–1263. [Google Scholar] [CrossRef]
  24. Lago, J.; De Ridder, F.; De Schutter, B. Forecasting spot electricity prices: deep learning approaches and empirical comparison of traditional algorithms. Appl. Energy 2018, 221, 386–405. [Google Scholar] [CrossRef]
  25. González, J.P.; San Roque, A.M.; Perez, E.A. Forecasting functional time series with a new Hilbertian ARMAX model: Application to electricity price forecasting. IEEE Trans. Power Syst. 2018, 33, 545–556. [Google Scholar] [CrossRef]
  26. Kuo, P.H.; Huang, C.J. An Electricity Price Forecasting Model by Hybrid Structured Deep Neural Networks. Sustainability 2018, 10, 1280. [Google Scholar] [CrossRef]
  27. Wang, K.; Xu, C.; Zhang, Y.; Guo, S.; Zomaya, A. Robust big data analytics for electricity price forecasting in the smart grid. IEEE Trans. Big Data 2017, 5, 34–45. [Google Scholar] [CrossRef]
  28. Lago, J.; De Ridder, F.; Vrancx, P.; De Schutter, B. Forecasting day-ahead electricity prices in Europe: the importance of considering market integration. Appl. Energy 2018, 211, 890–903. [Google Scholar] [CrossRef]
  29. Long, W.; Zhang, Z.; Chen, J. Short-Term Electricity Price Forecasting with Stacked Denoising Autoencoders. IEEE Trans. Power Syst. 2017, 32, 2673–2681. [Google Scholar]
  30. Ghasemi, A.; Shayeghi, H.; Moradzadeh, M.; Nooshyar, M. A novel hybrid algorithm for electricity price and load forecasting in smart grids with demand-side management. Appl. Energy 2016, 177, 40–59. [Google Scholar] [CrossRef]
  31. Huang, G.B.; Chen, L.; Siew, C.K. Universal approximation using incremental constructive feedforward networks with random hidden nodes. IEEE Trans. Neural Netw. 2006, 17, 879–892. [Google Scholar] [CrossRef] [PubMed]
  32. Bartlett, P.L. For valid generalization the size of the weights is more important than the size of the network. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 1997; pp. 134–140. [Google Scholar]
  33. Scardapane, S.; Comminiello, D.; Scarpiniti, M.; Uncini, A. Online sequential extreme learning machine with kernels. IEEE Trans. Neural Netw. Learn. Syst. 2015, 26, 2214–2220. [Google Scholar] [CrossRef] [PubMed]
  34. Loh, W.Y. Classification and Regression Trees. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery; John Wiley and Sons Inc.: Hoboken, NJ, USA, 2011; Volume 1, pp. 14–23. [Google Scholar]
  35. Recursive Feature Elimination. Available online: https://topepo.github.io/caret/recursive-feature-elimination.html (accessed on 10 November 2018).
  36. Durgabai, R.P.L. Feature selection using ReliefF algorithm. Int. J. Adv. Res. Comput. Commun. Eng. 2014, 3, 10, 8215–8218. [Google Scholar]
  37. Logistic Regression. Available online: https://ml-cheatsheet.readthedocs.io/en/latest/logistic-regression.html (accessed on 10 November 2018).
  38. UMass Electric Dataset. Available online: http://traces.cs.umass.edu/index.php/Smart/Smart (accessed on 10 November 2018).
  39. Lichman, M. UCI Machine Learning Repository; University of California: Irvine, CA, USA, 2013. [Google Scholar]
