Enhancing Smart Grid Sustainability: Using Advanced Hybrid Machine Learning Techniques While Considering Multiple Influencing Factors for Imputing Missing Electric Load Data
Abstract
1. Introduction
- Meteorological features and both the short-term and long-term characteristics of the load time series are taken into account, which makes the imputation more accurate and efficient.
- Multiple machine learning models (random forest, SW-KNN, and LM-BP) are introduced; each predicts the power grid load from a different perspective and captures complex dependencies among the features in the data.
- A variance–covariance weighting method merges the individual models, which makes the combined result more stable and accurate and provides effective data support for optimizing the scheduling, safe operation, and reasonable pricing of power systems.
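As a rough illustration of the third contribution, the sketch below merges several model outputs with weights derived from the variance–covariance matrix of their validation errors. It assumes the standard minimum-variance combination, $w = \Sigma^{-1}\mathbf{1} / (\mathbf{1}^{\top}\Sigma^{-1}\mathbf{1})$, which may differ from the exact weighting used in this paper; the function names and the toy error series are hypothetical.

```python
from statistics import fmean


def error_covariance(errors):
    """Sample covariance matrix of per-model error series (equal-length lists)."""
    n = len(errors[0])
    means = [fmean(e) for e in errors]
    m = len(errors)
    return [
        [
            sum((errors[i][t] - means[i]) * (errors[j][t] - means[j]) for t in range(n)) / (n - 1)
            for j in range(m)
        ]
        for i in range(m)
    ]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x


def min_variance_weights(errors):
    """Weights proportional to Sigma^{-1} 1, normalised to sum to one (assumed formula)."""
    cov = error_covariance(errors)
    raw = solve(cov, [1.0] * len(cov))
    total = sum(raw)
    return [w / total for w in raw]


def combine(predictions, weights):
    """Weighted sum of the model outputs at each time step."""
    steps = len(predictions[0])
    return [sum(w * p[t] for w, p in zip(weights, predictions)) for t in range(steps)]


# Toy validation errors for two hypothetical models with uncorrelated errors.
errors_a = [1.0, -1.0, 1.0, -1.0]   # smaller error variance
errors_b = [2.0, 2.0, -2.0, -2.0]   # larger error variance
weights = min_variance_weights([errors_a, errors_b])
merged = combine([[10.0, 12.0], [11.0, 13.0]], weights)
```

With uncorrelated errors the weights reduce to inverse-variance weighting, so the lower-variance model receives the larger share. In practice the covariance matrix would be estimated on held-out data where the true load is known.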
2. Theoretical Foundation
2.1. Random Forest Imputation
2.2. SW-KNN Imputation
2.3. LM-BP Imputation
3. Variance–Covariance Hybrid Imputation Model
4. An Example Analysis Based on the Hybrid Model
4.1. Data Source and Processing
4.2. Feature Selection
4.3. Evaluation Metrics
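The metrics reported in the result tables (RMSE, MSE, MAE, MAPE) have standard definitions, sketched below for a pair of true and imputed load series. Two points are assumptions rather than statements from the paper: MAPE is returned as a fraction rather than a percentage, and R² is included on the guess that the near-1 column in the result tables is the coefficient of determination. The function name is hypothetical.

```python
import math


def imputation_metrics(actual, imputed):
    """Return standard error metrics between true values and imputed values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, imputed)]
    sse = sum(e * e for e in errors)
    mse = sse / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE as a fraction; requires every true value to be non-zero.
    mape = sum(abs(e / a) for e, a in zip(errors, actual)) / n
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - sse / ss_tot
    return {"RMSE": math.sqrt(mse), "MSE": mse, "MAE": mae, "MAPE": mape, "R2": r2}


scores = imputation_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

Because RMSE is the square root of MSE, the two columns of any result row can be cross-checked against each other.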
4.4. Experimental Results and Analysis
4.4.1. Comparison between Single Model and Multiple Models
4.4.2. Comparison of Different Variants
4.4.3. Comparison with Other Methods
4.4.4. Investigating the Impact of the Training Scale and Season
4.4.5. Impact of Related Metrics on Imputation
5. Conclusions and Outlook
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
Algorithms | Optimal Parameter Details |
---|---|
RF | Number of random forest subtrees = 50; maximum depth = 14; random state = 30 |
SW-KNN | Optimal k = 7 |
LM-BP | Hidden layer parameters = 10; learning rate = 0.01 |

Date Category | Quantification |
---|---|
Holidays | 1 |
Workdays | 0 |
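The date-category encoding in the table above can be sketched as a small lookup. The holiday calendar, the function name, and the choice to treat weekends as holidays are all assumptions made for illustration; the table itself only specifies the 1/0 coding.

```python
from datetime import date


def date_category(day, holidays, weekend_is_holiday=True):
    """Return 1 for holidays and 0 for workdays, following the table's coding."""
    if day in holidays:
        return 1
    # Assumption: Saturdays and Sundays count as holidays unless disabled.
    if weekend_is_holiday and day.weekday() >= 5:
        return 1
    return 0


# Hypothetical holiday calendar with a single public holiday.
holidays = {date(2024, 1, 1)}
codes = [date_category(d, holidays) for d in (date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 6))]
```

The resulting 0/1 column can then be passed to the imputation models alongside the meteorological features.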

Method | RMSE | MSE | MAE | MAPE | R² |
---|---|---|---|---|---|
Averaging | 0.141 | 0.020 | 0.043 | 0.054 | 0.9875406 |
Stacking | 0.129 | 0.017 | 0.032 | 0.036 | 0.9892530 |
Ours | 0.114 | 0.013 | 0.025 | 0.031 | 0.9910273 |

Method | RMSE | MSE | MAE | MAPE | R² |
---|---|---|---|---|---|
KNN(s) | 0.235 | 0.055 | 0.090 | 0.096 | 0.9727835 |
KNN(h) | 0.208 | 0.043 | 0.047 | 0.058 | 0.9786470 |
NAA(s) | 0.136 | 0.018 | 0.040 | 0.077 | 0.9892253 |
NAA(h) | 0.128 | 0.016 | 0.029 | 0.050 | 0.9913876 |
WKNN(s) | 0.184 | 0.034 | 0.063 | 0.085 | 0.9802168 |
WKNN(h) | 0.140 | 0.019 | 0.042 | 0.066 | 0.9871092 |
SW-KNN(s) | 0.164 | 0.027 | 0.051 | 0.064 | 0.9908321 |
SW-KNN(h) | 0.114 | 0.013 | 0.025 | 0.031 | 0.9910273 |

Season | Training Period | MAPE | RMSE |
---|---|---|---|
Summer (June–August) | 3 days | 0.028 | 0.117 |
Summer (June–August) | 7 days | 0.039 | 0.152 |
Summer (June–August) | 15 days | 0.060 | 0.204 |
Winter (December–February) | 3 days | 0.028 | 0.125 |
Winter (December–February) | 7 days | 0.031 | 0.135 |
Winter (December–February) | 15 days | 0.042 | 0.138 |

| Features | RMSE | MSE | MAE | MAPE | R² |
|---|---|---|---|---|---|
| | 0.134 | 0.018 | 0.024 | 0.031 | 0.9908271 |
| | 0.128 | 0.019 | 0.026 | 0.036 | 0.9900548 |
| | 0.128 | 0.018 | 0.025 | 0.034 | 0.9902170 |
| Date | 0.141 | 0.020 | 0.031 | 0.037 | 0.9825336 |
| Season | 0.145 | 0.021 | 0.029 | 0.040 | 0.9877206 |
| All | 0.114 | 0.013 | 0.022 | 0.031 | 0.9910273 |

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Hou, Z.; Liu, J. Enhancing Smart Grid Sustainability: Using Advanced Hybrid Machine Learning Techniques While Considering Multiple Influencing Factors for Imputing Missing Electric Load Data. Sustainability 2024, 16, 8092. https://doi.org/10.3390/su16188092