Detection and Classification of Abnormal Power Load Data by Combining One-Hot Encoding and GAN–Transformer
Abstract
1. Introduction
- (1) Sudden changes in load data over short periods;
- (2) Significant deviations from normal trends over long periods or even an entire day;
- (3) Distorted load data [4].
- (1) This article first examines the classification, causes, and characteristics of abnormal power load data, and clarifies the technical requirements for detecting and separating such anomalies.
- (2) To speed up model training, better capture the trend characteristics of load data, and improve the accuracy of anomaly detection, the game mechanism between the GAN generator and discriminator is incorporated into the Transformer architecture.
- (3) One-hot encoding is employed to construct classification labels for the anomaly data, leading to the proposal of a GAN–Transformer model based on this encoding scheme.
- (4) To evaluate the anomaly detection capability of the GAN–Transformer model, an abnormal load data detection mechanism is developed. In addition, performance indicators such as precision, recall, F1-score, overall accuracy (OA), average accuracy (AA), and the Kappa coefficient are used jointly to assess the accuracy of anomaly detection and classification (a computation sketch is given after this list).
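The following is a minimal sketch, assuming scikit-learn, of how the indicators named in item (4) can be computed; the label arrays are illustrative placeholders, not data from the paper.

```python
# Minimal sketch: computing precision, recall, F1, OA, AA, and Kappa with scikit-learn.
# y_true and y_pred are hypothetical per-sample anomaly-class labels (0-3).
import numpy as np
from sklearn.metrics import (precision_score, recall_score, f1_score,
                             accuracy_score, cohen_kappa_score, confusion_matrix)

y_true = np.array([0, 1, 2, 3, 0, 1, 2, 3, 0, 1])
y_pred = np.array([0, 1, 2, 3, 0, 1, 3, 3, 0, 2])

precision = precision_score(y_true, y_pred, average="macro", zero_division=0)
recall    = recall_score(y_true, y_pred, average="macro", zero_division=0)
f1        = f1_score(y_true, y_pred, average="macro", zero_division=0)

oa    = accuracy_score(y_true, y_pred)            # Overall Accuracy: fraction of correct predictions
cm    = confusion_matrix(y_true, y_pred)
aa    = np.mean(np.diag(cm) / cm.sum(axis=1))     # Average Accuracy: mean per-class recall
kappa = cohen_kappa_score(y_true, y_pred)         # Kappa: agreement corrected for chance

print(f"P={precision:.4f} R={recall:.4f} F1={f1:.4f} OA={oa:.4f} AA={aa:.4f} Kappa={kappa:.4f}")
```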
2. Electric Power Load Abnormal Data Classification
3. The Principles of One-Hot Encoding and the GAN–Transformer Model
3.1. One-Hot Encoding
3.2. GAN–Transformer Model
3.2.1. Generator
3.2.2. Discriminator
3.2.3. Staged Mapping and Training
3.2.4. Loss Function
4. Implementation of Abnormal Detection and Classification of Power Load Data
4.1. Anomaly Type Annotation Implementation
4.2. Implementation Process
- (1) Data Preprocessing: The collected historical power load data are divided into training and test sets in a 7:3 ratio; missing values are handled and the data are normalized to ensure uniformity (see the sketch after this list).
- (2) Label Processing: To capture the diverse characteristics of abnormal load data, one-hot encoding is applied to label the anomaly types. The labeled data are then prepared as inputs and outputs for model training.
- (3) Model Training: The GAN–Transformer model is trained on a large volume of labeled abnormal load data, with hyperparameters adjusted iteratively. Training runs for up to 500 epochs, with early stopping triggered if the training loss does not improve for 10 consecutive epochs. The weights of the best-performing model are saved for subsequent testing.
- (4) Model Testing and Analysis: The saved model is loaded to predict anomalies in the test set, producing both detection results (whether a point is anomalous) and classification results (which anomaly type it belongs to). Accuracy, precision, recall, and F1-score are then calculated to evaluate performance, followed by a comprehensive analysis of the results.
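The sketch below illustrates step (1). The file path, column name, and the choice of min-max normalization are assumptions for illustration; the paper only states that missing values are handled and the data are normalized before a 7:3 chronological split.

```python
# Illustrative preprocessing sketch for step (1), assuming a single-column load series.
import numpy as np
import pandas as pd

df = pd.read_csv("load_data.csv")                         # hypothetical file with a "load" column
load = df["load"].interpolate(limit_direction="both")     # fill missing values by interpolation

# Min-max normalization to [0, 1] (normalization method assumed)
load = (load - load.min()) / (load.max() - load.min())

# Chronological 7:3 split into training and test sets
split = int(len(load) * 0.7)
train, test = load.values[:split], load.values[split:]
print(train.shape, test.shape)
```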
4.3. Abnormal Load Data Detection Mechanism
4.4. Model Performance Evaluation Indicators
5. Experimental Research and Discussion
5.1. Experimental Preparation
5.2. Detection of Abnormal Load Data in Power Supply and Consumption
5.3. Classification of Abnormal Load Data in Power Supply and Consumption
5.4. Comparative Analysis of Algorithms
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Klein, M.; Thiele, G.; Fono, A.; Khorsandi, N.; Schade, D.; Krüger, J. Process data based Anomaly detection in distributed energy generation using Neural Networks. In Proceedings of the 2020 International Conference on Control, Automation and Diagnosis (ICCAD), Paris, France, 7–9 October 2020; pp. 1–5.
2. Kwasinski, A. Quantitative model and metrics of electrical grids’ resilience evaluated at a power distribution level. Energies 2016, 9, 93.
3. Deng, S.; Chen, F.; Dong, X.; Gao, G.; Wu, X. Short-term load forecasting by using improved GEP and abnormal load recognition. ACM Trans. Internet Technol. (TOIT) 2021, 21, 1–28.
4. Tianhui, Z.; Zhang, Y.; Jianxue, W. Identification method of load outlier based on density-based spatial clustering and outlier boundaries. Autom. Electr. Power Syst. 2021, 45, 97–105.
5. Chandola, V.; Banerjee, A.; Kumar, V. Anomaly detection: A survey. ACM Comput. Surv. (CSUR) 2009, 41, 1–58.
6. Wang, H.; Bah, M.J.; Hammad, M. Progress in outlier detection techniques: A survey. IEEE Access 2019, 7, 107964–108000.
7. Shen, X.; Luo, Z.; Li, Y.; Ouyang, T.; Wu, Y. Chance-Constrained Abnormal Data Cleaning for Robust Classification with Noisy Labels. IEEE Trans. Emerg. Top. Comput. Intell. 2024, 1–8.
8. Saeed, M.S.; Mustafa, M.W.; Sheikh, U.U.; Jumani, T.A.; Mirjat, N.H. Ensemble bagged tree based classification for reducing non-technical losses in multan electric power company of Pakistan. Electronics 2019, 8, 860.
9. He, Y.; Ma, Y.; Huang, K.; Wang, L.; Zhang, J. Abnormal data detection and recovery of sensors network based on spatiotemporal deep learning methodology. Measurement 2024, 228, 114368.
10. Luo, Z.; Fang, C.; Liu, C.; Liu, S. Method for cleaning abnormal data of wind turbine power curve based on density clustering and boundary extraction. IEEE Trans. Sustain. Energy 2021, 13, 1147–1159.
11. Tran-Nam, H.; Nguyen-Trang, T.; Che-Ngoc, H. A new possibilistic-based clustering method for probability density functions and its application to detecting abnormal elements. Sci. Rep. 2024, 14, 17871.
12. Liang, G.; Liao, H.; Huang, Z.; Li, X. Abnormal discharge detection using adaptive neuro-fuzzy inference method with probability density-based feature and modified subtractive clustering. Neurocomputing 2023, 551, 126513.
13. Li, J.; Izakian, H.; Pedrycz, W.; Jamal, I. Clustering-based anomaly detection in multivariate time series data. Appl. Soft Comput. 2021, 100, 106919.
14. Li, C.; Guo, L.; Gao, H.; Li, Y. Similarity-measured isolation forest: Anomaly detection method for machine monitoring data. IEEE Trans. Instrum. Meas. 2021, 70, 1–12.
15. Peng, Y.; Yang, Y.; Xu, Y.; Xue, Y.; Song, R.; Kang, J.; Zhao, H. Electricity theft detection in AMI based on clustering and local outlier factor. IEEE Access 2021, 9, 107250–107259.
16. Liu, X.; Ding, Y.; Tang, H.; Xiao, F. A data mining-based framework for the identification of daily electricity usage patterns and anomaly detection in building electricity consumption data. Energy Build. 2021, 231, 110601.
17. Qian, Y.; Wang, Y.; Shao, J. Enhancing power utilization analysis: Detecting aberrant patterns of electricity consumption. Electr. Eng. 2024, 106, 5639–5654.
18. Lei, X.; Xia, Y.; Wang, A.; Jian, X.; Zhong, H.; Sun, L. Mutual information based anomaly detection of monitoring data with attention mechanism and residual learning. Mech. Syst. Signal Process. 2023, 182, 109607.
19. Yang, N.-C.; Sung, K.-L. Non-intrusive load classification and recognition using soft-voting ensemble learning algorithm with decision tree, K-Nearest neighbor algorithm and multilayer perceptron. IEEE Access 2023, 11, 94506–94520.
20. Cai, Q.; Li, P.; Wang, R. Electricity theft detection based on hybrid random forest and weighted support vector data description. Int. J. Electr. Power Energy Syst. 2023, 153, 109283.
21. Choi, J.; Roshanzadeh, B.; Martínez-Ramón, M.; Bidram, A. An unsupervised cyberattack detection scheme for AC microgrids using Gaussian process regression and one-class support vector machine anomaly detection. IET Renew. Power Gener. 2023, 17, 2113–2123.
22. Shahid, S.M.; Ko, S.; Kwon, S. Real-time abnormality detection and classification in diesel engine operations with convolutional neural network. Expert Syst. Appl. 2022, 192, 116233.
23. Zamanzadeh Darban, Z.; Webb, G.I.; Pan, S.; Aggarwal, C.; Salehi, M. Deep learning for time series anomaly detection: A survey. ACM Comput. Surv. 2024, 57, 1–42.
24. Kumari, S.; Prabha, C.; Karim, A.; Hassan, M.M.; Azam, S. A Comprehensive Investigation of Anomaly Detection Methods in Deep Learning and Machine Learning: 2019–2023. IET Inf. Secur. 2024, 2024, 8821891.
25. Lei, C.; Kai, Q.; Kuangrong, H. Time series anomaly detection method based on integrated LSTM-AE. J. Huazhong Univ. Sci. Technol. (Nat. Sci. Ed.) 2021, 49, 35–40.
26. Su, Y.; Zhao, Y.; Niu, C.; Liu, R.; Sun, W.; Pei, D. Robust anomaly detection for multivariate time series through stochastic recurrent neural network. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 2828–2837.
27. Li, J.; Lv, Y.; Zhou, Z.; Du, Z.; Wei, Q.; Xu, K. Identification and Correction of Abnormal, Incomplete Power Load Data in Electricity Spot Market Databases. Energies 2025, 18, 176.
28. Li, D.; Chen, D.; Jin, B.; Shi, L.; Goh, J.; Ng, S.-K. MAD-GAN: Multivariate anomaly detection for time series data with generative adversarial networks. In Proceedings of the International Conference on Artificial Neural Networks, Munich, Germany, 17–19 September 2019; pp. 703–716.
29. Yang, Z.; Zhang, W.; Cui, W.; Gao, L.; Chen, Y.; Wei, Q.; Liu, L. Abnormal detection for running state of linear motor feeding system based on deep neural networks. Energies 2022, 15, 5671.
30. Feng, C.; Shao, L.; Wang, J.; Zhang, Y.; Wen, F. Short-term Load Forecasting of Distribution Transformer Supply Zones Based on Federated Model-Agnostic Meta Learning. IEEE Trans. Power Syst. 2024, 40, 31–45.
31. Xu, J. Anomaly transformer: Time series anomaly detection with association discrepancy. arXiv 2021, arXiv:2110.02642.
32. Wang, L.; Wang, X.; Zhang, J.; Wang, J.; Yu, H. A self-supervised learning-based approach for detection and classification of dam deformation monitoring abnormal data with imaging time series. Structures 2024, 68, 107148.
33. Shi, J.; Gao, Y.; Gu, D.; Li, Y.; Chen, K. A novel approach to detect electricity theft based on conv-attentional Transformer Neural Network. Int. J. Electr. Power Energy Syst. 2023, 145, 108642.
34. Poslavskaya, E.; Korolev, A. Encoding categorical data: Is there yet anything ‘hotter’ than one-hot encoding? arXiv 2023, arXiv:2312.16930.
Anomaly Classifications | Anomaly Characteristics | Waveform Performance | Causing Factors |
---|---|---|---|
Mutation Anomaly | Variability | Significant surges and drops in magnitude at specific times | Impulse load signal interference |
Spike Anomaly | Fluctuation | A large number of spikes occurring over a short period | Internal channel noise or external interference |
Sustained/Transient Extremum Anomaly | Trend | Zero values appearing over a sustained or transient period, deviating from the normal trajectory | Missing data collection at a certain moment due to equipment maintenance, sudden accidents at power plants, etc. |
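As an illustrative aid (not the paper's data-generation procedure), the sketch below injects the three anomaly families described in the table above into a synthetic daily load profile; the profile shape, sampling resolution, and magnitudes are assumptions.

```python
# Illustrative sketch: injecting mutation, spike, and extremum anomalies into a
# clean load series, e.g. for building labeled examples of each anomaly type.
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(96)                                   # one day at 15-min resolution (assumption)
load = 100 + 30 * np.sin(2 * np.pi * t / 96)        # synthetic "normal" daily profile

mutation = load.copy()
mutation[40] += 80                                  # mutation anomaly: sudden surge at one time step

spikes = load.copy()
idx = rng.choice(t, size=10, replace=False)
spikes[idx] += rng.normal(0, 25, size=10)           # spike anomaly: many spikes over a short period

extremum = load.copy()
extremum[60:70] = 0.0                               # sustained extremum anomaly: zero values over an interval
```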
Codes | Anomaly Types |
---|---|
0001 | Mutation Anomaly |
0010 | Sustained Extremum Anomaly |
0100 | Spike Anomaly |
1000 | Transient Extremum Anomaly |
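A minimal sketch of how the one-hot label vectors in the table above can be constructed; the class names and bit patterns follow the table, while the helper function and variable names are illustrative.

```python
# Build one-hot classification labels for the four anomaly types.
import numpy as np

CODE_MAP = {
    "Mutation Anomaly":           [0, 0, 0, 1],   # 0001
    "Sustained Extremum Anomaly": [0, 0, 1, 0],   # 0010
    "Spike Anomaly":              [0, 1, 0, 0],   # 0100
    "Transient Extremum Anomaly": [1, 0, 0, 0],   # 1000
}

def encode_labels(anomaly_types):
    """Map a list of anomaly-type names to an (N, 4) one-hot label matrix."""
    return np.array([CODE_MAP[name] for name in anomaly_types], dtype=np.float32)

labels = encode_labels(["Spike Anomaly", "Mutation Anomaly"])
print(labels)   # [[0. 1. 0. 0.] [0. 0. 0. 1.]]
```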
Begin | |
---|---|
1: | Split the data into a training set and a test set in a 7:3 ratio; clean and normalize the data |
2: | Apply one-hot encoding to the training labels |
3: | for epoch = 1 to 500 |
4: | Train the model on the training set |
5: | if the training loss does not improve for 10 consecutive epochs |
6: | break (early stopping) |
7: | end if |
8: | end for |
9: | Save the best model weights |
10: | Load the saved model |
11: | Predict anomalies on the test set |
12: | Output detection results and classification results |
13: | Perform quality assessment and analysis |
14: | Calculate metrics and analyze the results |
End | |
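The following is a minimal runnable sketch of steps 3 to 9 of the procedure above (up to 500 epochs, early stopping after 10 epochs without improvement). The `train_one_epoch` function is a hypothetical placeholder returning a dummy loss; it stands in for the model's alternating generator and discriminator updates and is not the authors' code.

```python
# Early-stopping training loop sketch matching the procedure above.
import math
import random

MAX_EPOCHS, PATIENCE = 500, 10

def train_one_epoch(epoch):
    """Placeholder for one epoch of alternating generator/discriminator updates;
    returns a dummy decreasing-then-flat loss so the sketch runs end to end."""
    return max(1.0 / epoch, 0.05) + random.uniform(0.0, 0.01)

best_loss, wait = math.inf, 0
for epoch in range(1, MAX_EPOCHS + 1):
    loss = train_one_epoch(epoch)
    if loss < best_loss:
        best_loss, wait = loss, 0
        # save_best_weights()   # hypothetical checkpointing hook (step 9)
    else:
        wait += 1
        if wait >= PATIENCE:
            break               # early stopping after 10 epochs without improvement

print(f"Stopped at epoch {epoch}, best training loss {best_loss:.4f}")
```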
Hyperparameter Types | Values |
---|---|
Window size | 10 |
Number of layers in Transformer encoders | 1 |
Number of layers in feed-forward unit of encoders | 2 |
Hidden units in encoder layers | 64 |
Dropout in encoders | 0.1 |
Learning rate | 0.001 |
Number of attention heads (num_heads) | 3 |
Optimizer | Adam |
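As a rough sketch of how the hyperparameters in the table above could map onto a standard Transformer encoder block (Keras is assumed here), the snippet below uses one encoder layer, a two-layer feed-forward unit with 64 hidden units, dropout 0.1, 3 attention heads, a window size of 10, and the Adam optimizer with learning rate 0.001. The input projection and per-head key dimension are added assumptions; this is not the authors' implementation.

```python
# Illustrative single-layer Transformer encoder block using the listed hyperparameters.
import tensorflow as tf
from tensorflow.keras import layers

WINDOW, HIDDEN, HEADS, DROPOUT = 10, 64, 3, 0.1

inputs = layers.Input(shape=(WINDOW, 1))
x = layers.Dense(HIDDEN)(inputs)                       # project load values to the hidden size (assumption)

# Encoder layer: multi-head self-attention + residual connection + layer norm
attn = layers.MultiHeadAttention(num_heads=HEADS, key_dim=HIDDEN // HEADS, dropout=DROPOUT)(x, x)
x = layers.LayerNormalization()(x + attn)

# Two-layer feed-forward unit + residual connection + layer norm
ff = layers.Dense(HIDDEN, activation="relu")(x)
ff = layers.Dense(HIDDEN)(ff)
ff = layers.Dropout(DROPOUT)(ff)
x = layers.LayerNormalization()(x + ff)

model = tf.keras.Model(inputs, x)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001))
model.summary()
```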
Models | Precision | Recall | F1-Score | Training Time per Epoch/s |
---|---|---|---|---|
Ours | 0.9492 | 0.9384 | 0.9437 | 2.30 |
LSTM-NDT | 0.8454 | 0.9846 | 0.9097 | 3.11 |
Transformer | 0.7469 | 0.8635 | 0.8010 | 6.00 |
OmniAnomaly | 0.4735 | 0.6634 | 0.5521 | 10.54 |
MAD-GAN | 0.8519 | 0.9599 | 0.9026 | 10.30 |
Models | Precision | Recall | F1-Score | Training Time per Epoch/s |
---|---|---|---|---|
Ours | 0.9762 | 0.9264 | 0.9506 | 2.30 |
LSTM-NDT | 0.5013 | 0.6551 | 0.5680 | 3.11 |
Transformer | 0.7652 | 0.8699 | 0.8141 | 6.00 |
OmniAnomaly | 0.7965 | 0.8523 | 0.8234 | 10.54 |
MAD-GAN | 0.8967 | 0.8677 | 0.8820 | 12.30 |
Models | OA/% | AA/% | Kappa |
---|---|---|---|
Ours | 90.47 | 89.36 | 0.8839 |
LSTM-NDT | 87.20 | 86.50 | 0.7852 |
Transformer | 86.98 | 85.55 | 0.7952 |
OmniAnomaly | 85.00 | 84.70 | 0.6366 |
MAD-GAN | 88.50 | 88.20 | 0.7214 |
Models | OA/% | AA/% | Kappa |
---|---|---|---|
Ours | 86.97 | 84.66 | 0.8469 |
LSTM-NDT | 68.30 | 70.15 | 0.5620 |
Transformer | 81.78 | 84.26 | 0.7525 |
OmniAnomaly | 78.50 | 79.00 | 0.4791 |
MAD-GAN | 84.00 | 83.90 | 0.6124 |