# Big Data Analytics for Short and Medium-Term Electricity Load Forecasting Using an AI Techniques Ensembler

## Abstract

## 1. Introduction

- A hybrid of feature selection techniques, namely Extreme Gradient Boosting (XGB), Random Forest (RF) and Recursive Feature Elimination (RFE), is applied to clean the large volume of data.
- Two enhanced classifiers, Support Vector Machine with Grey Wolf Optimization (SVM-GWO) and Convolutional Neural Network Gated Recurrent Unit with Earth Worm Optimization (CNN-GRU-EWO), are proposed to forecast the electricity load.
- The Grey Wolf Optimization (GWO) and Earth Worm Optimization (EWO) algorithms are used to tune the parameters of SVM and CNN-GRU, respectively.
- The classifier parameters are tuned to reduce the computational time.
- The enhanced classifiers are used to overcome the overfitting problem.
- The proposed techniques are compared with State-Of-The-Art (SOTA) techniques to demonstrate the better performance of the enhanced techniques.
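
The GWO-based parameter tuning named in the contributions can be illustrated with a minimal sketch of the basic Grey Wolf Optimizer update rule (Mirjalili et al., listed in the references). This is not the authors' implementation: the function name `grey_wolf_optimize`, the pack size, the iteration count, and the toy quadratic objective `toy_validation_error` (standing in for an SVM validation error over two hyperparameters) are all assumptions made for illustration.

```python
import random


def grey_wolf_optimize(objective, bounds, n_wolves=12, n_iters=60, seed=0):
    """Minimise `objective` over a box with the basic Grey Wolf Optimizer.

    `bounds` is a list of (low, high) pairs, one per dimension, and
    `n_wolves` must be at least 3. Returns (best_position, best_score).
    """
    rng = random.Random(seed)
    dim = len(bounds)

    # Random initial pack inside the search box.
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best_pos, best_score = None, float("inf")

    for it in range(n_iters):
        # Rank the pack; the three best wolves (alpha, beta, delta) lead.
        ranked = sorted(wolves, key=objective)
        alpha, beta, delta = (list(w) for w in ranked[:3])
        score = objective(alpha)
        if score < best_score:
            best_pos, best_score = list(alpha), score

        # Exploration factor `a` decays linearly from 2 towards 0.
        a = 2.0 * (1.0 - it / n_iters)

        new_wolves = []
        for wolf in wolves:
            new_pos = []
            for d in range(dim):
                candidates = []
                for leader in (alpha, beta, delta):
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    distance = abs(C * leader[d] - wolf[d])
                    candidates.append(leader[d] - A * distance)
                lo, hi = bounds[d]
                # Average the three leader-guided moves and stay in the box.
                new_pos.append(max(lo, min(hi, sum(candidates) / 3.0)))
            new_wolves.append(new_pos)
        wolves = new_wolves

    # Account for the pack produced by the last update.
    final = min(wolves, key=objective)
    final_score = objective(final)
    if final_score < best_score:
        best_pos, best_score = list(final), final_score
    return best_pos, best_score


def toy_validation_error(params):
    """Hypothetical stand-in for a model's validation error (assumption)."""
    c, gamma = params
    return (c - 10.0) ** 2 + (gamma - 0.1) ** 2


best_params, best_error = grey_wolf_optimize(
    toy_validation_error, bounds=[(0.1, 100.0), (0.001, 1.0)]
)
```

In a real tuning loop the objective would train the model with the candidate parameters and return a validation error, which makes each evaluation far more expensive than this toy function.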

## 2. Related Work

## 3. Proposed System Model

### 3.1. Dataset Description

### 3.2. Feature Engineering

### 3.3. Classification and Forecasting

#### 3.3.1. CNN-GRU-EWO

#### 3.3.2. Gated Recurrent Unit (GRU)

#### 3.3.3. SVM-GWO

## 4. Simulation Results

### 4.1. Average Feature Selection Based on RF and XGB

### 4.2. Classification and Forecasting Using SVM-GWO and CNN-GRU-EWO

## 5. Performance Metrics

## 6. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Abbreviations

Abbreviation | Definition |
---|---|
CMI | Conditional Mutual Information |
NLSSVM | Nonlinear Least Squares Support Vector Machine |
ABC | Artificial Bee Colony |
ARIMA | Autoregressive Integrated Moving Average |
IWNN | Improved Wavelet Neural Network |
ELM | Extreme Learning Machine |
CNN | Convolutional Neural Network |
LSTM | Long Short-Term Memory |
ACF | Autocorrelation Function |
IITK | Indian Institute of Technology Kanpur |
XGB | Extreme Gradient Boosting |
DTC | Decision Tree Classifier |
MAE | Mean Absolute Error |
RMSE | Root Mean Square Error |
MSE | Mean Square Error |
MAPE | Mean Absolute Percentage Error |
PJM | Pennsylvania-New Jersey-Maryland |
CA | Correlation Analysis |

## References

1. Zhu, Z.; Tang, J.; Lambotharan, S.; Chin, W.H.; Fan, Z. An integer linear programming based optimization for home demand-side management in smart grid. In Proceedings of the Innovative Smart Grid Technologies (ISGT), Washington, DC, USA, 16–20 January 2012; pp. 1–5.
2. Samadi, P.; Wong, V.W.S.; Schober, R. Load Scheduling and Power Trading in Systems with High Penetration of Renewable Energy Resources. IEEE Trans. Smart Grid **2016**, 7, 1802–1812.
3. Chen, X.; Zhou, Y.; Duan, W.; Tang, J.; Guo, Y. Design of intelligent Demand Side Management system respond to varieties of factors. In Proceedings of the China International Conference on Electricity Distribution (CICED), Nanjing, China, 13–16 September 2010; pp. 1–5.
4. Hahn, H.; Meyer-Nieberg, S.; Pickl, S. Electric load forecasting methods: Tools for decision making. Eur. J. Oper. Res. **2009**, 199, 902–907.
5. Wang, K.; Yu, J.; Yu, Y.; Qian, Y.; Zeng, D.; Guo, S.; Xiang, Y.; Wu, J. A survey on energy internet: Architecture, approach and emerging technologies. IEEE Syst. J. **2017**, 12, 2403–2416.
6. Jiang, H.; Wang, K.; Wang, Y.; Gao, M.; Zhang, Y. Energy big data: A survey. IEEE Access **2016**, 4, 3844–3861.
7. Fatima, A.; Shabbir, S. Data Analytics for Load and Price Forecasting via Enhanced Support Vector Regression. In Advances in Internet, Data and Web Technologies: The 7th International Conference on Emerging Internet, Data and Web Technologies (EIDWT-2019); Springer: Cham, Switzerland, 2019; Volume 29, p. 259.
8. Naz, A.; Javed, M.U.; Javaid, N.; Saba, T.; Alhussein, M.; Aurangzeb, K. Short-term electric load and price forecasting using enhanced extreme learning machine optimization in smart grids. Energies **2019**, 12, 866.
9. Gao, X.; Li, X.; Zhao, B.; Ji, W.; Jing, X.; He, Y. Short-term electricity load forecasting model based on EMD-GRU with feature selection. Energies **2019**, 12, 1140.
10. Liu, Z.; Sun, X.; Wang, S.; Pan, M.; Zhang, Y.; Ji, Z. Midterm power load forecasting model based on kernel principal component analysis and back propagation neural network with particle swarm optimization. Big Data **2019**, 7, 130–138.
11. Bouktif, S.; Fiaz, A.; Ouni, A.; Serhani, M.A. Multi-Sequence LSTM-RNN Deep Learning and Metaheuristics for Electric Load Forecasting. Energies **2020**, 13, 391.
12. Cai, M.; Pipattanasomporn, M.; Rahman, S. Day-ahead building-level load forecasts using deep learning vs. traditional time-series techniques. Appl. Energy **2019**, 236, 1078–1088.
13. Jindal, A.; Singh, M.; Kumar, N. Consumption-aware data analytical demand response scheme for peak load reduction in smart grid. IEEE Trans. Ind. Electron. **2018**, 65, 8993–9004.
14. Mujeeb, S.; Javaid, N.; Ilahi, M.; Wadud, Z.; Ishmanov, F.; Afzal, M.K. Deep long short-term memory: A new price and load forecasting scheme for big data in smart cities. Sustainability **2019**, 11, 987.
15. Chitsaz, H.; Zamani-Dehkordi, P.; Zareipour, H.; Parikh, P.P. Electricity price forecasting for operational scheduling of behind-the-meter storage systems. IEEE Trans. Smart Grid **2018**, 9, 6612–6622.
16. Ghasemi, A.; Shayeghi, H.; Moradzadeh, M.; Nooshyar, M. A novel hybrid algorithm for electricity price and load forecasting in smart grids with demand-side management. Appl. Energy **2016**, 177, 40–59.
17. Abedinia, O.; Amjady, N.; Zareipour, H. A New Feature Selection Technique for Load and Price Forecast of Electrical Power Systems. IEEE Trans. Power Syst. **2017**, 32, 62–74.
18. Rafiei, M.; Niknam, T.; Khooban, M.H. Probabilistic Forecasting of Hourly Electricity Price by Generalization of ELM for Usage in Improved Wavelet Neural Network. IEEE Trans. Ind. Inform. **2017**, 13, 71–79.
19. Kuo, P.H.; Huang, C.J. An electricity price forecasting model by hybrid structured deep neural networks. Sustainability **2018**, 10, 1280.
20. Wang, J.; Liu, F.; Song, Y.; Zhao, J. A novel model: Dynamic choice artificial neural network (DCANN) for an electricity price forecasting system. Appl. Soft Comput. J. **2016**, 48, 281–297.
21. Shi, H.; Xu, M.; Li, R. Deep Learning for Household Load Forecasting-A Novel Pooling Deep RNN. IEEE Trans. Smart Grid **2018**, 9, 5271–5280.
22. Khan, S.; Javaid, N.; Chand, A.; Rashid, F. Electricity load forecasting for each day of the week using deep CNN. In Proceedings of the Workshops of the International Conference on Advanced Information Networking and Applications, Taipei, Taiwan, 27–29 March 2019.
23. Tian, C.; Ma, J.; Zhang, C.; Zhan, P. A Deep Neural Network Model for Short-Term Load Forecast Based on Long Short-Term Memory Network and Convolutional Neural Network. Energies **2018**, 11, 3493.
24. Ali, U.; Rauf, A.; Iqbal, U.; Shoukat, I.A.; Hassan, A. Big data analytics for a novel electrical load forecasting technique. Int. J. Inf. Technol. Secur. **2019**, 11, 33–40.
25. Ahmad, W.; Ayub, N.; Ali, T.; Irfan, M.; Awais, M.; Shiraz, M.; Glowacz, A. Towards short term electricity load forecasting using improved support vector machine and extreme learning machine. Energies **2020**, 13, 2907.
26. Houimli, R.; Zmami, M.; Ben-Salha, O. Short-term electric load forecasting in Tunisia using artificial neural networks. Energy Syst. **2020**, 11, 357–375.
27. Yang, A.; Li, W.; Yang, X. Short-term electricity load forecasting based on feature selection and Least Squares Support Vector Machines. Knowl. Based Syst. **2019**, 163, 159–173.
28. Zheng, S.; Zhong, Q.; Peng, L.; Chai, X. A simple method of residential electricity load forecasting by improved Bayesian neural networks. Math. Probl. Eng. **2018**, 2018, 4276176.
29. Abbas, F.; Feng, D.; Habib, S.; Rahman, U.; Rasool, A.; Yan, Z. Short term residential load forecasting: An improved optimal nonlinear auto regressive (NARX) method with exponential weight decay function. Electronics **2018**, 7, 432.
30. Heghedus, C.; Chakravorty, A.; Rong, C. Energy Load Forecasting Using Deep Learning. In Proceedings of the 2018 IEEE International Conference on Energy Internet (ICEI), Beijing, China, 21–25 May 2018; pp. 146–151.
31. Zhang, Z.; Ding, S.; Sun, Y. A support vector regression model hybridized with chaotic krill herd algorithm and empirical mode decomposition for regression task. Neurocomputing **2020**, 410, 185–201.
32. Zhang, Z.; Ding, S.; Jia, W. A hybrid optimization algorithm based on cuckoo search and differential evolution for solving constrained engineering problems. Eng. Appl. Artif. Intell. **2019**, 85, 254–268.
33. Rafati, A.; Joorabian, M.; Mashhour, E. An efficient hour-ahead electrical load forecasting method based on innovative features. Energy **2020**, 201, 117511.
34. Askari, M.; Keynia, F. Mid-term electricity load forecasting by a new composite method based on optimal learning MLP algorithm. IET Gener. Transm. Distrib. **2019**, 14, 845–852.
35. Khan, A.R.; Dewangan, C.L.; Srivastava, S.C.; Chakrabarti, S. Short Term Load Forecasting using SVM Models. In Proceedings of the 8th IEEE Power India International Conference PIICON, Kurukshetra, India, 10–12 December 2018; pp. 1–5.
36. Cheng, Y.; Jin, L.; Hou, K. Short-Term Power Load Forecasting based on Improved Online ELM-K. In Proceedings of the 2018 International Conference on Control, Automation and Information Sciences (ICCAIS), Hangzhou, China, 24–27 October 2018; pp. 128–132.
37. Mujeeb, S.; Javaid, N.; Akbar, M.; Khalid, R.; Nazeer, O.; Khan, M. Big data analytics for price and load forecasting in smart grids. In International Conference on Broadband and Wireless Computing, Communication and Applications; Springer: Cham, Switzerland, 2018; pp. 77–87.
38. ISO NE Electricity Market Data. Available online: https://www.iso-ne.com/isoexpress/web/reports/load-and-demand (accessed on 28 April 2020).
39. Xin, M.; Wang, Y. Research on image classification model based on deep convolution neural network. Eurasip J. Image Video Process. **2019**, 2019, 40.
40. Wu, L.; Kong, C.; Hao, X.; Chen, W. A Short-Term Load Forecasting Method Based on GRU-CNN Hybrid Neural Network Model. Math. Probl. Eng. **2020**, 2020, 1428104.
41. Mirjalili, S.; Mirjalili, S.M.; Lewis, A. Grey Wolf Optimizer. Adv. Eng. Softw. **2014**, 69, 46–61.
42. Ahmad, A.; Javaid, N.; Mateen, A.; Awais, M.; Khan, Z.A. Short-term load forecasting in smart grids: An intelligent modular approach. Energies **2019**, 12, 164.
43. Ayub, N.; Javaid, N.; Mujeeb, S.; Zahid, M.; Khan, W.Z.; Khattak, M.U. Electricity Load Forecasting in Smart Grids Using Support Vector Machine. In Proceedings of the International Conference on Advanced Information Networking and Applications, Matsue, Japan, 27–29 March 2019; Springer: Cham, Switzerland, 2019; pp. 1–13.
44. Zhu, Q.; Han, Z.; Başar, T. A differential game approach to distributed demand side management in smart grid. In Proceedings of the 2012 IEEE International Conference on Communications (ICC), Ottawa, ON, Canada, 10–15 June 2012; pp. 3345–3350.

Proposed Techniques | Objective | Dataset | Limitations |
---|---|---|---|
DA [13] | Reduce peak load | PJM | Difficulty in managing big data |
DLSTM [14] | Price and load forecasting | ISO-NE | Cannot meet the requirements of real-time data |
CMI, NLSSVM [15,16] | Forecasting with an important-feature selection method | PJM | Only a small amount of data is considered |
GELM, IWNN [17] | Hourly price forecasting | PJM | Model complexity is considered |
CNN, LSTM [18,19] | Price forecasting | PJM | Redundancy in features is not considered |
DNN [20] | Load forecasting | Irish | The overfitting problem needs to be addressed |
DCNN [21] | One-day load forecasting | Victoria | Limited use of the dataset |
ESVM [22,23] | Short-term load forecasting | ISO-NE | SVM handles big datasets poorly because of the overfitting problem |
ANN [24] | Half-hourly load forecasting | Tanzanian | Accuracy rates are not satisfactory |
MI, NN [25] | Short-term forecasting | PJM | Maximize the penetration of renewable energy |
NARX, ARMAX [26] | Residential short-term load forecasting | IESCO | Model complexity increased |
GRU [27] | Load forecasting | PJM | Redundancy of features is not considered |
SVM, ANN [28] | Short-term forecasting | IITK | A very small dataset is used for the experiment |
ELM-K [29] | Short-term forecasting | Southern China | Only one error metric is used for evaluation |
CNN [20] | Short-term forecasting | ISO-NE | Hyperparameters of the proposed technique are tuned manually |
GRU-CNN [31] | Short-term forecasting | Wuwei, Gansu province | Hyperparameters of the proposed technique are tuned manually |
MI, ANN [32] | Day-ahead load forecasting | DAYTOWN, AKPC | Feature selection needs further improvement |

Target Feature | Features | Short Name | Dimension |
---|---|---|---|
System Load | Day-Ahead Cleared Demand | DA_Demand | TRUE |
System Load | Regulation Market Service clearing price | Reg_Capacity_Price | TRUE |
System Load | Real-Time Demand | RT_Demand | TRUE |
System Load | The dewpoint temperature | Dew_Point | FALSE |
System Load | Day-Ahead Locational Marginal Price | DA_LMP | FALSE |
System Load | The dry-bulb temperature | Dry_Bulb | FALSE |
System Load | Energy Component of Day-Ahead | DA_EC | FALSE |
System Load | Marginal Loss Component of Real-Time | RT_MLC | FALSE |
System Load | Congestion Component of Day-Ahead | DA_CC | FALSE |
System Load | Congestion Component of Real-Time | RT_CC | FALSE |
System Load | Marginal Loss Component of Day-Ahead | DA_MLC | FALSE |
System Load | Energy Component of Real-Time | RT_EC | TRUE |
System Load | Real-Time Locational Marginal Price | RT_LMP | TRUE |
System Load | Regulation Market Capacity clearing | Reg_Service_Price | FALSE |

Techniques | F1-Score | Accuracy | Precision | Recall | MAPE | RMSE | MAE | MSE |
---|---|---|---|---|---|---|---|---|
CNN-GRU-EWO | 95.23 | 96.33 | 94.00 | 94.62 | 6.00 | 7.00 | 10.00 | 13.00 |
LR | 75.88 | 78.35 | 76.56 | 76.98 | 20.00 | 23.00 | 27.00 | 26.00 |
ELM | 75.00 | 78.98 | 76.45 | 22.78 | 13.00 | 12.00 | 15.00 | 18.00 |
SVM | 87.88 | 87.99 | 86.91 | 85.99 | 1.79 | 12.30 | 10.50 | 12.00 |
SVM-GWO | 90.67 | 93.99 | 91.87 | 90.99 | 1.33 | 9.12 | 10.31 | 9.75 |
CNN | 88.66 | 89.00 | 90.00 | 88.76 | 10.00 | 12.00 | 15.00 | 18.00 |
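
The four error metrics reported in the table (MAE, MSE, RMSE and MAPE) follow their standard definitions, which can be sketched as below. The function names and the sample load values are illustrative assumptions, not data from the study, and the MAPE helper assumes that no actual value is zero.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean Square Error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error: the square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical load values, chosen only to exercise the functions.
actual_load = [100.0, 200.0, 400.0]
forecast_load = [110.0, 190.0, 400.0]

print(mae(actual_load, forecast_load))
print(mse(actual_load, forecast_load))
print(rmse(actual_load, forecast_load))
print(mape(actual_load, forecast_load))
```

MAE and RMSE are expressed in the units of the load itself, whereas MAPE is scale-free, which is why it is often preferred when comparing forecasts across datasets of different magnitude.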

Test groups: correlation tests, parametric statistical hypothesis tests, and non-parametric statistical hypothesis tests.

Technique | Quantity | Pearson’s Test | Spearman’s Test | Kendall’s Test | Chi-Squared Test | Student’s Test | Paired Student’s Test | ANOVA Test | Mann-Whitney Test | Wilcoxon Test | Kruskal Test |
---|---|---|---|---|---|---|---|---|---|---|---|
SVM | F-statistic | −0.0404 | −0.0549 | −0.0362 | 157,449.28 | −5.5019 | −5.3941 | 30 | 225,955 | 104,549 | 26.0883 |
SVM | p-value | 0.2753 | 0.1379 | 0.1429 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
SVM-GWO | F-statistic | −0.0376 | −0.0553 | −0.0362 | 164,404.40 | 0.2530 | 0.2484 | 0.0640 | 262,798 | 132,003 | 0.2949 |
SVM-GWO | p-value | 0.3106 | 0.1349 | 0.1436 | 0.0000 | 0.8003 | 0.8039 | 0.8003 | 0.2936 | 0.8054 | 0.5871 |
CNN | F-statistic | 0.9964 | 0.9963 | 0.9499 | 575.09 | 1.1820 | 19.2812 | 1.3971 | 257,449 | 37,953 | 1.4537 |
CNN | p-value | 0.0000 | 0.0000 | 0.0000 | 1.0000 | 0.2374 | 0.0000 | 0.2374 | 0.1140 | 0.0000 | 0.2279 |
CNN-GRU-EWO | F-statistic | 0.7367 | 0.7208 | 0.5321 | 37,815.93 | −0.8087 | −1.4750 | 0.6539 | 267,085 | 131,225 | 0.0001 |
CNN-GRU-EWO | p-value | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.4188 | 0.1406 | 0.4188 | 0.4953 | 0.6555 | 0.9906 |
ELM | F-statistic | 0.9887 | 0.9856 | 0.9143 | 1865.32 | −0.1100 | −1.0303 | 0.0121 | 264,803 | 124,235 | 0.0868 |
ELM | p-value | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.9124 | 0.3032 | 0.9124 | 0.3842 | 0.1538 | 0.7683 |
LG | F-statistic | 0.2411 | 0.2033 | 0.1415 | 89,538.00 | −6.0994 | −6.9077 | 37 | 218,561 | 94,749 | 36.3238 |
LG | p-value | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
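
As a reading aid for the correlation columns, the sketch below computes Pearson's and Spearman's coefficients from their textbook definitions. It assumes the coefficients are taken between two equal-length series, such as actual and forecast load; the source does not state the inputs, and the helper names and sample series are hypothetical. The p-values reported in the table are not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson's linear correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical series: monotonic but non-linear relationship.
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
forecast = [1.0, 4.0, 9.0, 16.0, 25.0]
print(pearson_r(actual, forecast))
print(spearman_rho(actual, forecast))
```

Because Spearman's coefficient depends only on rank order, a monotonic but non-linear relationship gives a Spearman value of 1 while Pearson's value stays below 1.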

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Ayub, N.; Irfan, M.; Awais, M.; Ali, U.; Ali, T.; Hamdi, M.; Alghamdi, A.; Muhammad, F.
Big Data Analytics for Short and Medium-Term Electricity Load Forecasting Using an AI Techniques Ensembler. *Energies* **2020**, *13*, 5193.
https://doi.org/10.3390/en13195193

**Chicago/Turabian Style**

Ayub, Nasir, Muhammad Irfan, Muhammad Awais, Usman Ali, Tariq Ali, Mohammed Hamdi, Abdullah Alghamdi, and Fazal Muhammad.
2020. "Big Data Analytics for Short and Medium-Term Electricity Load Forecasting Using an AI Techniques Ensembler" *Energies* 13, no. 19: 5193.
https://doi.org/10.3390/en13195193