Classification of Electricity Consumption Behavior Based on Improved K-Means and LSTM

Featured Application: The results of this work will be used to develop a classification framework for analyzing the electricity consumption behavior of a large number of scattered users. This work sets labels for existing electricity consumption behaviors so that unknown types of electricity consumption behavior can be classified.

Abstract: Artificial intelligence and data mining methods based on power big data, which can be used to analyze electricity consumption behavior, have been widely applied to provide targeted marketing services for electricity consumers. However, traditional clustering algorithms have difficulty in judging new electricity consumption patterns, and deep neural networks usually need large amounts of labeled data. In practice, there are few comparable electricity consumption features or basic data, and the labeled data cannot meet the actual needs. Therefore, an intelligent classification framework for electricity consumption behavior based on an improved k-means and long short-term memory (LSTM) is proposed, which not only extracts features effectively, but also establishes a mapping relationship between unlabeled electricity consumption behavior characteristics and user types. The features can be labeled to train the deep neural network to judge the electricity consumption behavior of new users. Firstly, nine typical characteristics were selected from aspects including electricity price sensitivity and load fluctuation rate. Secondly, the k value and initial clustering centers of the k-means algorithm were optimized. Thirdly, the users were labeled based on the clustering results, and the labels, together with the features, formed a dataset that was input into LSTM to train the classification model. Finally, the analysis of users in Shenyang, China, showed that the results based on the proposed method were consistent with the actual situation. Moreover, compared to other methods, the efficiency and accuracy were higher.


Introduction
Based on power big data, artificial intelligence or data mining techniques can be reasonably applied to the analysis of user electricity consumption behavior and habits, which can help power grids to understand the characteristics of users' electricity consumption and provide more targeted electricity services and marketing strategies [1][2][3].
However, with regard to dispersed and distinct power big data, how to choose appropriate data analysis methods, how to select effective features, and how to make full use of historical data to achieve detailed classification of electricity consumption behavior are all unsolved problems.
At present, the methods commonly used in electricity behavior analysis are cluster analysis and deep neural networks [4]. A clustering method based on self-organizing maps and k-means was proposed in [5], where the self-organizing map was used to initially select cluster centers, which significantly improved the accuracy of clustering and reduced the convergence time of the algorithm. In [6], principal component analysis was introduced to extract features from the original data, and then these features were used as the input of fuzzy clustering, which effectively improved the efficiency of the clustering. In [7], affinity propagation clustering was used in the analysis of electricity consumption. The discrete characteristics and time domain features obtained by symbolic aggregate approximation were extracted. The load curve was reduced in dimensionality and has been fully described. In [8], based on a correlation analysis, cluster forecasting was carried out for each type of load, which not only considered the electricity consumption characteristics of each type of load but also clustered the loads with similar characteristics, reducing the error of load forecasting. In [9], a method based on a simulated annealing algorithm for the optimization of initial cluster centers was proposed, which improved the performance of the k-means algorithm. A pattern classification of an electricity consumption curve based on auto-encoding neural network and fuzzy c-means clustering was proposed in [10], and the optimized electricity price model was established for different electricity consumption patterns, which could guide users to adjust electricity consumption behavior and electricity purchase strategies. Deep neural networks are generally used in load forecasting and classification. A load forecasting method based on a deep belief network (DBN) was proposed in [11]. 
The potential environmental factors with the strongest correlation with the electricity consumption behavior were selected as inputs to improve the forecasting accuracy. A user classification method based on hybrid long short-term memory (H-LSTM) neural network was proposed in [12]. The H-LSTM neural network analyzed the temporal correlation of feature sequences, and then obtained the classification results.
The above methods focused on the improvement of clustering algorithms or the optimization of features, and the cluster analyses were only performed on the existing data. If new electricity consumption data appear, re-clustering will take a lot of time. Additionally, deep neural networks are often used for load forecasting and require a large amount of labeled data.
The main contributions of this paper are summarized as follows: (1) A framework for intelligent classification of electricity consumption behavior based on improved k-means and LSTM is proposed, which can judge users' electricity consumption behavior accurately. (2) An improved k-means clustering method is introduced to set labels for scattered and irregular original data, and the initial k value and initial clustering centers of the k-means algorithm are optimized intelligently.
The remainder of this paper is organized as follows: Section 2 proposes the classification framework for electricity consumption behavior based on improved k-means and LSTM. Section 3 delivers the defined characteristics and the improved k-means clustering algorithm, as well as the basic network framework of LSTM. Section 4 analyzes the electricity consumption behavior of SY city in LN Province, China, based on the proposed method. Section 5 concludes the paper.

The Framework of Electricity Consumption Behavior Classification Based on Improved K-Means and LSTM
Electricity consumption behavior classification based on improved k-means and LSTM mainly includes selecting electricity characteristics and electricity consumption behavior classification. In order to form an effective feature set to train the classification model, cluster analysis was performed. Based on the clustering results, the labels for each user were set corresponding to the extracted features. Then, the intelligent classification model was achieved. The framework is shown in Figure 1.

The specific steps corresponding to the framework are as follows:
1. Data acquisition: The electricity consumption data of a certain period were acquired by smart meters. Then, the incomplete data in the original data were replaced with values (average or median) close to the center of the sample attribute. In order to avoid the impact of data differentiation, the above preprocessed data were normalized.
2. Feature selection: Multi-dimensional features were calculated based on the preprocessed data, which included electricity consumption, electricity price sensitivity, load fluctuation rate, and power factor.
3. Cluster analysis: Referring to the calculated features, we performed a preliminary clustering of users. In order to improve the performance of clustering, the traditional k-means algorithm was improved.
4. Training classification model: We initialized the LSTM network structure and selected model parameters. Based on the results of clustering, the users' labels were set. Together with the calculated features, the electricity consumption behavior dataset was formed, which was used as an input to train the LSTM classification model, and then the trained model parameters were saved.
5. Evaluation of classification results: We obtained new electricity consumption data from smart meters and calculated features through the same steps as above. Then, we input the new features into the trained LSTM, and output the classification results to judge the users' electricity consumption behavior. Finally, the performance of the model was evaluated, and the classification results of other models were compared.
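The data acquisition step can be illustrated with a minimal sketch. It assumes the readings are arranged as a users-by-attributes array with missing values stored as NaN, and it assumes min-max scaling for the normalization, which the text does not specify. The function name `preprocess` and the sample values are hypothetical.

```python
import numpy as np

def preprocess(load, strategy="median"):
    """Fill missing readings column by column, then scale each column to [0, 1]."""
    load = np.asarray(load, dtype=float)
    # Replace incomplete data with a value close to the center of the attribute.
    if strategy == "median":
        fill = np.nanmedian(load, axis=0)
    else:
        fill = np.nanmean(load, axis=0)
    filled = np.where(np.isnan(load), fill, load)
    # ASSUMPTION: min-max normalization; the paper only says the data were normalized.
    lo = filled.min(axis=0)
    hi = filled.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # constant columns map to 0 instead of NaN
    return (filled - lo) / span

# Hypothetical readings: three users, two attributes, one missing value.
readings = np.array([[1.0, 10.0], [np.nan, 20.0], [3.0, 30.0]])
clean = preprocess(readings)
```

The same function would be reused on new smart-meter data in step 5, so that training and test features pass through identical preprocessing.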

Multi-Dimensional Feature Extraction
There are two common clustering methods for electricity consumption behaviors: direct clustering and indirect clustering. Direct clustering includes k-means, hierarchical clustering, density-based spatial clustering of applications with noise (DBSCAN), and self-organizing maps. Indirect clustering requires feature extraction on the electricity data before clustering. Good feature extraction can greatly improve the effect of clustering [2].
The user's electricity consumption behaviors are related to multiple factors, such as load, price, time, and environment. There are many characteristics used to describe the behavior, including the daily load rate, valley electricity coefficient, peak time power consumption rate, daily minimum, and maximum load [13,14]. According to different electricity consumption characteristics, different electricity consumption behaviors can be derived.
However, few horizontally comparable features and basic data can be obtained, and some defined features can hardly meet the actual needs [15,16]. In this paper, a set of electricity consumption characteristics based on power big data (mainly including active load, reactive load, electricity consumption, and electricity price) is proposed, which takes into account four aspects: electricity consumption, electricity price sensitivity, load fluctuation rate, and power factor.
• Electricity consumption.
The total electricity consumption reflects the user's electricity consumption capacity, which is closely related to the type of the user. It is given by $A = \sum_{i=1}^{24} P(i)\,\Delta T$, where A is the total electricity consumption in one day; P(i) is the active power in hour i; ∆T is the time interval, which is 1 h. The total electricity consumption is related to the composition and load adjustment measures of users.
• Sensitivity of electricity price.
Electricity price sensitivity includes sensitivity of electricity price changes and sensitivity of total electricity price.
(a) Sensitivity of electricity price changes: where SS represents the sensitivity of electricity price change; F(i) represents the electricity price change at each hour. P(i) represents the active power at time i; P(i + 1) represents the active power at time i + 1; T(i) represents the electricity price at time i; T(i + 1) represents the electricity price at time i + 1.
(b) Sensitivity of total electricity price: where $ST_1$, $ST_2$, and $ST_3$ are the sensitivity of total electricity price in the valley, flat, and peak period, respectively; $W_1$, $W_2$, and $W_3$ are the electricity consumption in the valley, flat, and peak period, respectively. $T_1 = 0.4$ is the electricity price in the valley period; $T_2 = 0.8$ is the electricity price in the flat period; and $T_3 = 1.2$ is the electricity price in the peak period.
The above two parameters can reflect the users' electricity demand changes with electricity prices. The higher the electricity price sensitivity, the greater the load. Conversely, the lower the electricity price sensitivity, the smaller the load.
• Load fluctuation rate.
Load fluctuation rate includes the peak-valley difference, the mean square deviation, and the ramps.
(a) Peak-valley difference: $DPN = \max_i P(i) - \min_i P(i)$, where DPN is the peak-valley difference, which is equal to the difference between the maximum active power $\max_i P(i)$ and the minimum active power $\min_i P(i)$ over the day. The index is closely related to the fluctuation of electricity consumption and season. The greater the peak-valley difference, the greater the peak-shaving pressure of the grid, and the greater the peak-shaving capacity required to maintain the safe operation of the grid.
(b) Mean square deviation: where MSE represents the mean square deviation; P(i) is the active power at time i; $\bar{P}$ is the average active power of a day. The mean square deviation can reflect the dispersion degree of the user's active power over 24 h. The larger the mean square deviation, the greater the user's load fluctuation rate. The smaller the mean square deviation, the smaller the user's load fluctuation rate, and the more stable the load.
(c) Ramps: where R represents ramps; P(i) represents the active power at i; P(i + 1) represents the active power at i + 1. In peak or low periods of load, ramping events pose a great threat to the safe operation of a power system.
• Power factor.
The power factor reflects the utilization rate of electrical equipment from a technological perspective and the economic benefits of a grid from a management perspective. The minimum power factor is selected as the characteristic: $MINPF = \min_i \dfrac{P(i)}{\sqrt{P(i)^2 + Q(i)^2}}$, where MINPF is the minimum power factor; P(i) is the active power at time i; Q(i) is the reactive power at time i. The higher the power factor, the higher the equipment utilization rate, and the better the economic efficiency of the grid.
Based on the above indices, each user's electricity consumption behavior is expressed as a 1 × 9 vector $X = [A, SS, ST_1, ST_2, ST_3, DPN, MSE, R, MINPF]$. In order to avoid the influence of larger or smaller values, X is normalized. Using X as a reference, the cluster analysis is performed on the user's electricity consumption behavior.
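The nine-dimensional feature vector can be sketched for a single user-day as follows. Only A, DPN, and MINPF follow directly from the definitions in the text. The forms used for SS, $ST_j$, MSE, and R are assumptions made for illustration, since their equations are not reproduced here, and they are marked as such in the comments. The period layout and the function name `features` are hypothetical as well, because Table 1 is not reproduced.

```python
import numpy as np

def features(P, Q, T, period):
    """Return [A, SS, ST1, ST2, ST3, DPN, MSE, R, MINPF] for one user-day.

    P, Q: hourly active and reactive power; T: hourly electricity price;
    period: hourly labels with 0 = valley, 1 = flat, 2 = peak.
    """
    P, Q, T = (np.asarray(v, dtype=float) for v in (P, Q, T))
    period = np.asarray(period)
    dT = 1.0                                  # time interval of 1 h
    A = np.sum(P * dT)                        # total daily consumption
    dP, dPrice = np.diff(P), np.diff(T)
    moved = dPrice != 0
    # ASSUMPTION: SS sums the load change per unit price change at hours where the price moves.
    SS = np.sum(dP[moved] / dPrice[moved])
    ST = []
    for j in range(3):
        mask = period == j
        W = np.sum(P[mask] * dT)              # consumption within the period
        price = T[mask][0] if mask.any() else 0.0  # price taken as constant within a period
        ST.append(W * price)                  # ASSUMPTION: ST_j = W_j * T_j
    DPN = P.max() - P.min()                   # peak-valley difference
    MSE = np.sqrt(np.mean((P - P.mean()) ** 2))    # ASSUMPTION: population standard deviation
    R = np.sum(np.abs(dP))                    # ASSUMPTION: total absolute hourly ramp
    apparent = np.sqrt(P ** 2 + Q ** 2)
    pf = np.divide(P, apparent, out=np.ones_like(P), where=apparent > 0)
    MINPF = pf.min()                          # minimum power factor
    return np.array([A, SS, *ST, DPN, MSE, R, MINPF])

# Hypothetical day: 8 valley hours, 8 flat hours, 8 peak hours.
period = np.repeat([0, 1, 2], 8)
T = np.repeat([0.4, 0.8, 1.2], 8)
P = np.repeat([2.0, 4.0, 6.0], 8)
Q = 0.75 * P
x = features(P, Q, T, period)
```

Stacking one such vector per user and normalizing the columns would give the input matrix for the cluster analysis.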

Optimization of K Value
Cluster analysis of users' electricity data can divide a large number of scattered users into k typical electricity consumption patterns, which is helpful for further refining electricity consumption characteristics. The k-means clustering algorithm is simple and efficient, with fast convergence and strong scalability, and is often used in the study of electricity consumption behavior [17][18][19].
The k-means algorithm requires the number of clusters k to be specified in advance. However, k is usually chosen by experience, without considering the actual characteristics of the sample, which is subjective. Therefore, a k value selection strategy based on the K-D calculation was proposed.
Firstly, select a point in the feature dataset as the first clustering center. Secondly, calculate the distance D(x) between each point and the selected cluster center. The smaller the D(x), the greater the probability that the point is selected as the new cluster center. Then, repeat the calculation until k cluster centers are selected. Finally, use these k points as the initial cluster centers, and the k value can be obtained from the K-D curve.
where D is the distance between each point and the nearest cluster center; $D_1$ is the average distance within the cluster; and $D_2$ is the average distance between clusters. The smaller the D, the smaller the intra-class distance, or the larger the inter-class distance. The common distance calculation methods of a k-means algorithm are cosine similarity and Euclidean distance. We adopted the latter and took the square root of the minimum Euclidean distance as the objective function.
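A K-D curve can be sketched as follows, under two loudly stated assumptions. First, D is taken as the ratio $D_1 / D_2$ of the average intra-cluster distance to the average distance between cluster centers, which is only one form consistent with the description above. Second, deterministic farthest-point seeding is swapped in for the probabilistic center selection described in the text so that the example is reproducible. The helper names `kmeans`, `kd_score`, and `choose_k` and the synthetic data are hypothetical.

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Plain Lloyd k-means with deterministic farthest-point seeding (swapped-in technique)."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min(np.linalg.norm(X[:, None] - np.array(centers)[None], axis=2), axis=1)
        centers.append(X[np.argmax(d)])       # next seed: point farthest from chosen seeds
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        labels = np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                        for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

def kd_score(X, labels, centers):
    """ASSUMPTION: D = D1 / D2 (average intra-cluster over average inter-center distance)."""
    d1 = np.mean(np.linalg.norm(X - centers[labels], axis=1))
    pair = np.linalg.norm(centers[:, None] - centers[None], axis=2)
    d2 = pair[np.triu_indices(len(centers), 1)].mean()
    return d1 / d2

def choose_k(X, k_values):
    """Evaluate D for every candidate k and return the k with the smallest D."""
    scores = {k: kd_score(X, *kmeans(X, k)) for k in k_values}
    return min(scores, key=scores.get), scores

# Hypothetical data: three well-separated groups of 30 users in a 2-D feature space.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.3, size=(30, 2)) for c in ([0, 0], [10, 0], [0, 10])])
best_k, scores = choose_k(X, range(2, 7))
```

Plotting `scores` against k gives the K-D curve. A ratio of this kind can keep shrinking slightly as k grows, so in practice the curve would be inspected rather than relying on the minimum alone.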

Optimization of Initial Clustering Centers
The initial clustering centers of the traditional k-means algorithm are given randomly, which may cause the algorithm to fall into a local optimum and make the final result unstable. Therefore, an improved particle swarm optimization (PSO) algorithm for optimizing the initial cluster centers is proposed. The PSO algorithm has few adjustable parameters, is simple, and can be used for a wide range of applications. The algorithm has been introduced many times in the literature [20,21].
The improved PSO algorithm proposed in this paper mainly adjusts the inertia weight factor and the learning factors, which prevents the algorithm from falling into a local optimum and failing to converge when searching for the optimal solution.
Supposing the initial clustering centers composed of N particles are searched in the D-dimensional space, the position of the i-th particle is $x_i = (x_{i1}, x_{i2}, \ldots, x_{iD})$ and its velocity is $v_i = (v_{i1}, v_{i2}, \ldots, v_{iD})$. After determining the optimal solution, the particle swarm updates the velocity and position according to (12) and (13) [22]:

$$v_i^{k+1} = w\,v_i^{k} + c_1 r_1 \left(p_i^{k} - x_i^{k}\right) + c_2 r_2 \left(p_g^{k} - x_i^{k}\right) \qquad (12)$$

$$x_i^{k+1} = x_i^{k} + v_i^{k+1} \qquad (13)$$

where k is the number of iterations; w is the inertia weight factor; $c_1$ and $c_2$ are the learning factors, and their values are reported in [1,2]. Appropriate values can accelerate the convergence rate and avoid falling into the local optimum. $r_1$ and $r_2$ are two random numbers in [0, 1]. $p_i^k$ and $p_g^k$ are the local optimal value and global optimal value of the particle swarm, respectively.
In order to increase the position search ability of the particles in the early stage and the convergence speed in the later stage, the inertia weight factor w decreases linearly with the number of iterations:

$$w_i = w_{\max} - \frac{\left(w_{\max} - w_{\min}\right) i}{t_{\max}} \qquad (14)$$

where $w_i$ is the i-th inertia weight value, that is, the value at iteration i; $t_{\max}$ is the maximum number of iterations; $w_{\max}$ is the maximum inertia weight factor; $w_{\min}$ is the minimum inertia weight factor. The learning factors $c_1$ and $c_2$ also change linearly with the iterations. In the initial search, $c_1$ is larger, and $c_2$ is smaller. As the iteration progresses, $c_1$ decreases linearly, and $c_2$ increases linearly, so that each particle moves closer to the global optimum. The update of $c_1$ and $c_2$ is given by (15) and (16):

$$c_1 = c_{1b} + \frac{\left(c_{1f} - c_{1b}\right) i}{t_{\max}} \qquad (15)$$

$$c_2 = c_{2b} + \frac{\left(c_{2f} - c_{2b}\right) i}{t_{\max}} \qquad (16)$$

where $c_{1b}$ and $c_{2b}$ are the initial values of the acceleration constants $c_1$ and $c_2$; $c_{1f}$ and $c_{2f}$ are the final values of $c_1$ and $c_2$ after the maximum iteration; i is the current iteration; $t_{\max}$ is the maximum number of iterations.
The flowchart of the improved k-means clustering algorithm is shown in Figure 2. Based on the above process, the user's electricity consumption behaviors were clustered, and the labels were obtained according to different electricity consumption patterns, which provide the basis of the data for intelligent classification.
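The search for initial cluster centers can be sketched with a small PSO loop in which the inertia weight and the two learning factors change linearly with the iteration count, as described above. This is a sketch under assumptions rather than the authors' implementation: each particle encodes one candidate set of k centers, the fitness is the sum of squared distances to the nearest center, and the default parameter values (`w_max`, `w_min`, `c1b`, `c1f`, `c2b`, `c2f`) are illustrative choices not taken from the paper.

```python
import numpy as np

def sse(centers, X):
    """Sum of squared distances from each sample to its nearest center (assumed fitness)."""
    d = np.linalg.norm(X[:, None] - centers[None], axis=2)
    return np.sum(d.min(axis=1) ** 2)

def pso_centers(X, k, n_particles=20, t_max=60, seed=0,
                w_max=0.9, w_min=0.4, c1b=2.5, c1f=0.5, c2b=0.5, c2f=2.5):
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Each particle is a (k, dim) array of candidate centers drawn from the samples.
    pos = X[rng.integers(0, n, size=(n_particles, k))].astype(float)
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([sse(p, X) for p in pos])
    g = np.argmin(pbest_val)
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    for t in range(t_max):
        frac = t / t_max
        w = w_max - (w_max - w_min) * frac    # inertia weight decreases linearly
        c1 = c1b + (c1f - c1b) * frac         # cognitive factor decreases linearly
        c2 = c2b + (c2f - c2b) * frac         # social factor increases linearly
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.array([sse(p, X) for p in pos])
        better = vals < pbest_val             # update each particle's own best
        pbest[better] = pos[better]
        pbest_val[better] = vals[better]
        g = np.argmin(pbest_val)
        if pbest_val[g] < gbest_val:          # update the swarm's best
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    return gbest, gbest_val

# Hypothetical data: three groups of 30 users in a 2-D feature space.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.3, size=(30, 2)) for c in ([0, 0], [10, 0], [0, 10])])
centers, value = pso_centers(X, k=3)
```

The returned `centers` would then be passed to k-means as its starting point instead of a random initialization.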

LSTM Classification Model
Based on the clustering result and the extracted features, the dataset of users' electricity consumption behavior characteristics was formed. Considering the high dimensionality and timing correlation of features, a long short-term memory (LSTM) neural network was selected as the classification model. After setting up the network architecture and parameters, the neural network was trained to learn the characteristics with the abovementioned dataset. Then, the mapping relationship between the user's category and the electricity characteristics was established, which can automatically judge the new user electricity behavior.
Deep learning has many practical applications in electricity consumption behavior analysis, including grid operation monitoring and load forecasting [23,24]. LSTM is a common classification model and is widely used in speech recognition, text classification, and other fields; it is especially suitable for sequence modeling. Through continuous improvement of LSTM [25][26][27], the neurons in the hidden layer of the recurrent neural network (RNN) were replaced with unique memory neural units, which effectively solves the problems of gradient disappearance and gradient explosion in the RNN.
Each memory unit of LSTM is composed of an input gate, a forget gate, and an output gate. The unit structure is shown in Figure 3. C is used to save the long-term state of the sequence and to pass the information to the next layer. The forget gate updates C and discards the outdated information.

Figure 2. Improved k-means clustering algorithm.

After the data $x_t$ at time t reach the network, they are used as the input together with the output $h_{t-1}$ at the previous time to update $C_{t-1}$ and obtain a new long-term state $C_t$, which is shown in (17):

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \qquad (17)$$

where, in the standard LSTM formulation, $f_t$ and $i_t$ are the forget-gate and input-gate activations, $\tilde{C}_t$ is the candidate state computed from $x_t$ and $h_{t-1}$, and $\odot$ denotes element-wise multiplication. Then, a sigmoid calculation is performed on $x_t$ (together with $h_{t-1}$) to obtain the output-gate activation $o_t$. Combining $o_t$ with the updated long-term state $C_t$ gives the output $h_t$, which is shown in (18):

$$h_t = o_t \odot \tanh(C_t) \qquad (18)$$

LSTM regulates the flow of characteristic sequences and filters information through input gates, forget gates, and output gates, which can better cover seasonal changes or timing fluctuations of users' electricity consumption behavior. Moreover, for a large number of high-dimensional features, the classification performance of LSTM is better than BP, SVM, ELM, and other shallow learning models. The generalization ability is stronger, and the accuracy is higher.
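The gate computations of one memory unit can be sketched in a few lines. The sketch follows the standard LSTM cell, which is assumed here to match the unit in Figure 3. The stacked weight layout, the function name `lstm_step`, and the random weights are illustrative assumptions. The sizes 9 and 25 follow the input dimension and first hidden layer reported in the case study.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One time step of a standard LSTM cell.

    W has shape (4 * hidden, hidden + input) and stacks the forget, input,
    output, and candidate weights; b has shape (4 * hidden,).
    """
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b
    f_t = sigmoid(z[0:hidden])                # forget gate: what to discard from C
    i_t = sigmoid(z[hidden:2 * hidden])       # input gate: what to write into C
    o_t = sigmoid(z[2 * hidden:3 * hidden])   # output gate
    g_t = np.tanh(z[3 * hidden:])             # candidate state
    c_t = f_t * c_prev + i_t * g_t            # updated long-term state
    h_t = o_t * np.tanh(c_t)                  # output of the unit
    return h_t, c_t

# Hypothetical sizes: 9 input features and 25 hidden units.
rng = np.random.default_rng(0)
n_in, n_hidden = 9, 25
W = rng.normal(0.0, 0.1, size=(4 * n_hidden, n_hidden + n_in))
b = np.zeros(4 * n_hidden)
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.random((4, n_in)):             # four time steps of one feature sequence
    h, c = lstm_step(x_t, h, c, W, b)
```

In a classifier, the final `h` would be passed through further layers and a softmax output to produce the class label.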

Case Analysis Results
The electricity consumption data in this case are the active and reactive power of 99,442 users in SY City, LN Province, in February, May, August, and November 2018, measured hourly. February, May, August, and November can represent the electricity consumption characteristics of the four seasons, respectively. The peak-valley-flat time periods and the corresponding electricity prices in SY City are shown in Table 1.

Analysis of Electricity Consumption Behavior
Firstly, the electricity data were divided into 77,000 sets for the preprocessing and multi-dimensional features calculation. Then, the improved k-means method was used for cluster analysis on the calculated features. The minimum D was obtained by testing when K = 5, and the corresponding K-D curve is shown in Figure 4.

According to the clustering results, the users' electricity characteristics are divided into five typical patterns, from which the average value of the electricity load of each type of user can be calculated for detailed analysis. The electricity consumption curves of each type of user in each quarter are shown in Figures 5-8. As shown in Figures 5a-8a, load types 1, 2, 4, and 5 are plotted against the left y-axis, and load type 3 is plotted against the right y-axis.

It can be concluded that (a) the load curves in February and August (spring and autumn), and in May and November (summer and winter) were consistent, respectively. Comparing the load curves of the four months, it was found that the average type 2 load in May and November was higher than that in February and August. This type of load was a cooling load in the summer and a heating load in the winter. The first type 3 and 4 load curves remained unchanged within one year, which are the daily electricity loads, such as resident load, industrial load, and commercial load. (b) The electricity consumption of type 1, 4 and 5 loads in the four months had obvious time-dependent characteristics: the electricity consumption period was concentrated between 5:00 a.m. and 11:00 p.m., which has a strong correlation with working hours.
The type 1 load had the most severe fluctuation and the largest average electricity consumption. It also had a more obvious load spike. The average electricity consumption values of the type 2 and the type 4 loads were relatively small. The analysis results are consistent with the actual situation.
The electricity consumption characteristics of each user in each quarter are shown in Tables 2-5. As can be seen from these tables, except for the minimum power factor, the other characteristic values of the type 1 load of each quarter were the largest. Combined with the actual analysis of the change of the type 2 load, it was the cooling load in May (summer) and the heating load in November (winter), and thus the characteristic value was larger. In February (spring) and August (autumn), the type 2 load was out of service, and the characteristic value was small.
The characteristics of the type 3 and 4 loads were smaller. The characteristics of the type 5 load were closer to the type 4 load in May and November. The difference of characteristic values of the two types was less than 100. However, the characteristics of the type 5 load in February and August increased by more than 10 times compared to May and November.
The change in the characteristic value was consistent with the load fluctuation, which can effectively reflect the users' electricity consumption behavior.
Finally, the silhouette index (SI) was used to compare the traditional k-means algorithm, the particle swarm optimization-based k-means algorithm (PSO-K), and the improved k-means proposed in this paper.
The silhouette index is defined as $SI = \dfrac{d_{out} - d_{in}}{\max(d_{in}, d_{out})}$, where $d_{in}$ represents the average distance between the sample point and all other points in the same cluster; $d_{out}$ represents the average distance between the sample point and all points in the next closest cluster. SI is a key indicator used to describe the difference between the inside and outside of the cluster. Its value range is [−1, 1]. The closer to 1, the better the clustering effect.
The comparison results are shown in Table 6. It can be seen that although the SI of PSO-K is high, the efficiency is low. The SI of the algorithm proposed in this paper was high, the calculation time was short, and the efficiency was improved, which shows the superiority of the improved algorithm.
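The silhouette index used for this comparison can be computed directly from its definition. The sketch below averages the per-sample values over the dataset and assumes Euclidean distance and at least two clusters. The function name `silhouette_index` and the four sample points are hypothetical.

```python
import numpy as np

def silhouette_index(X, labels):
    """Mean silhouette value over all samples (requires at least two clusters)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    dist = np.linalg.norm(X[:, None] - X[None], axis=2)   # pairwise Euclidean distances
    clusters = np.unique(labels)
    scores = np.zeros(len(X))
    for idx in range(len(X)):
        own = labels == labels[idx]
        n_own = own.sum()
        if n_own == 1:
            continue                          # convention: a singleton cluster scores 0
        d_in = dist[idx, own].sum() / (n_own - 1)          # excludes the point itself
        d_out = min(dist[idx, labels == c].mean()
                    for c in clusters if c != labels[idx])  # next closest cluster
        scores[idx] = (d_out - d_in) / max(d_in, d_out)
    return scores.mean()

# Hypothetical example: two tight pairs of points, labeled correctly and incorrectly.
pts = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
si_good = silhouette_index(pts, [0, 0, 1, 1])
si_bad = silhouette_index(pts, [0, 1, 0, 1])
```

A labeling that matches the two groups gives a value close to 1, while the mismatched labeling gives a negative value, which is the behavior the index relies on when ranking the three algorithms.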

Classification of Electricity Consumption Feature
The above characteristics can be normalized, and thus, the labels for each user based on different electricity consumption quarters and user clustering categories can be obtained, which can form the users' electricity consumption behavior dataset. The labels are shown in Table 7. Then, 15,000 new sample data from the measurement data that did not participate in the above cluster analysis were selected. After data preprocessing, multiple feature values were calculated. These formed a test dataset of the classification model. Since the classification performance of LSTM needs to be evaluated, the sampling period and users' type of the test sample in this case were known. The specific settings of the LSTM classification model were the following: the number of input channels was 1; the input dimension was 9; the output dimension was 6; and the numbers of neurons in the first and second hidden layers were 25 and 10, respectively. The neuron excitation function adopted a sigmoid function; the model learning rate was 0.001; and the execution environment was designated as the GPU. The gradient threshold was 1; the number of trainings per batch was set to 300; and the sequence length was specified as the longest.
The accuracy of the classification model under different optimization algorithms and different hidden layer nodes is shown in Figure 9. It can be seen that Adam was superior to other algorithms. When the Adam algorithm was used to train the network, the learning step size of each iteration parameter had a certain range, and the large gradient did not cause an excessive learning step size. Therefore, the parameter update was more stable, and the convergence speed was faster. The specific settings of the LSTM classification model were the following: the number of input channels was 1; the input dimension was 9; the output dimension was 6; and the numbers of neurons in the first and second hidden layers were 25 and 10, respectively. The neuron excitation function adopted a sigmoid function; the model learning rate was 0.001; and the execution environment was designated as the GPU. The gradient threshold was 1; the number of trainings per batch was set to 300; and the sequence length was specified as the longest.
Based on the feature dataset, the users' electricity consumption behavior in SY city was classified. During the training process, the classification accuracy changed when the network selected different initial learning rates and different iteration times, as shown in Figure 10. When the initial learning rate was 0.001 and the number of iterations was 50, the corresponding training dataset accuracy rate curve and loss curve changed, as shown in Figure 11.

Figure 11. Training set accuracy rate curve and loss function curve.
Due to certain differences in the input features, the classification accuracy varied greatly in the early stage of network training. As the iterations continued, the loss curve gradually approached zero, and the model training was completed. The LSTM captures the temporal correlation within the feature sequence and models the time dependence of the input electricity consumption characteristics across different quarters well.
The test data were input into the network, and the classification accuracy was 96.71%. To further evaluate the network performance, precision, recall, and F1-score were also introduced, and the evaluation values of several other classification models, namely SVM, KNN, ELM, and BP, were compared. The comparison results are shown in Table 8. It can be concluded from the above analysis that the combination of LSTM and k-means clustering can not only set cluster labels for complex and irregular original electricity consumption data, but also automatically classify new users' electricity consumption characteristics, which enables analysis of the users' electricity consumption behavior based on the existing massive electricity big data.
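For reference, the evaluation metrics named above can be computed from true and predicted labels as in the following sketch. It uses a toy three-class example with made-up labels rather than the paper's six-class data, and it macro-averages the per-class values; the averaging scheme is an assumption, since the text does not state how the multi-class scores in Table 8 were aggregated. Function names are illustrative.

```python
def per_class_metrics(y_true, y_pred, n_classes):
    """Return a (precision, recall, f1) tuple for each class label 0..n_classes-1."""
    metrics = []
    for k in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == k and p == k)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != k and p == k)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == k and p != k)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        metrics.append((precision, recall, f1))
    return metrics

def macro_average(metrics):
    """Unweighted mean of precision, recall, and F1 over all classes."""
    n = len(metrics)
    return tuple(sum(m[i] for m in metrics) / n for i in range(3))

# Toy labels for illustration only (not data from the study).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
metrics = per_class_metrics(y_true, y_pred, n_classes=3)
macro_precision, macro_recall, macro_f1 = macro_average(metrics)
```

Reporting per-class values alongside the averages helps reveal whether a high overall accuracy hides weak performance on a small user category.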
After preliminary clustering analysis of the electricity consumption data, the scattered electricity consumption characteristics can be divided into typical categories, and labels can be set for each user. The classification model was then trained, and the LSTM classified new electricity consumption behavior, which helps power grids provide users with targeted services.

Conclusions
In this paper, an intelligent analysis of users' electricity consumption behavior based on improved k-means and LSTM is proposed, which can divide scattered and irregular original electricity consumption data according to the effective features. Then, the labels corresponding to the data can be given to form a feature dataset. A deep neural network trained based on the existing data can predict the user's type based on new electricity consumption characteristics so that it can intelligently provide users with targeted electricity consumption strategies or marketing services. This effectively solves the problem of a single analysis method not being able to easily classify and judge the new electricity data.
Nine features were extracted based on power big data, which could comprehensively characterize users' electricity consumption behavior. The improved k-means algorithm greatly increased the efficiency of cluster analysis. The clustering results and features formed an effective dataset.
Compared with the method of directly training a neural network with original data, the calculation time of the proposed algorithm was reduced, and the classification results of new electricity data were more accurate. The analysis results can provide support for grids to formulate targeted marketing services. However, the selected classification model considered only the time dependence of electricity data and the seasonal correlation of features. In the future, the applicability of other deep learning models will be further analyzed.