Residential Electricity Load Forecasting Based on Fuzzy Cluster Analysis and LSSVM with Optimization by the Fireworks Algorithm

Abstract: As the construction of the energy internet progresses, the share of residential electricity consumption in end-use energy consumption is increasing, the peak load on the grid is growing year on year, and seasonal and regional peak supply shortages, driven mainly by residential electricity consumption, have become common problems across the country. Accurate residential load forecasting can provide strong data support for the operation of electricity demand response and for setting response incentives. To improve the accuracy and stability of residential electricity load forecasting, this paper presents a forecasting model based on fuzzy cluster analysis (FC), the least-squares support vector machine (LSSVM), and the fireworks algorithm (FWA). First, to reduce the redundancy of the input data, the dimension of the data features is reduced. Then, FWA is used to optimize the parameters $\gamma$ and $\sigma^2$ of LSSVM, where $\gamma$ is the penalty factor and $\sigma^2$ denotes the kernel width. Finally, the FC-FWA-LSSVM load forecasting method is developed. Relevant data from Beijing, China, are selected for training and testing to demonstrate the effectiveness of the proposed model. The results show that the proposed FC-FWA-LSSVM hybrid model achieves high accuracy in residential power load forecasting and has good stability and versatility.


Introduction
With a rapidly growing economy, the massive consumption of non-renewable energy, the deterioration of people's living environment, and the energy crisis, improving energy utilization and achieving the coordinated development of economy and energy have become a focus of attention for countries around the globe. Reasonable electricity dispatch is one effective means of improving energy utilization. A reliable forecast of residential electricity load can help power suppliers formulate sound demand response strategies, prompt residents to change their established electricity consumption habits, reduce customers' electricity costs, and achieve peak shaving and valley filling.
Over the years, the techniques and methods of load forecasting have developed continuously, and they fall into two main groups: classical forecasting methods and modern intelligent load forecasting methods. Classical load forecasting methods principally include time series methods [1], regression analysis methods [2], and gray theory methods [3]. These methods are simple to calculate and theoretically mature, but each has defects that lead to less than ideal forecasting accuracy. The time series method treats time as the only variable and ignores the influence of external factors, so its predictions produce large errors when the external environment changes significantly [4]. Regression forecasting methods are sometimes speculative in the selection of explanatory variables or in the way those variables are expressed, and some explanatory variables are themselves hard to predict, which limits the use of regression analysis for predicting the load of distributed energy systems [5]. The gray prediction method fits well when the original data form a smooth discrete series, whereas the factors influencing residential electricity load are mostly discrete and subjective, so the prediction accuracy drops considerably when the gray prediction method is applied to such problems [6]. In summary, classical forecasting methods are not well suited to residential electricity load forecasting.
More recently, scholars have gradually applied intelligent algorithms to load prediction. Because intelligent algorithms such as artificial neural networks can simulate the mechanism of the human brain, their self-learning and self-optimizing capabilities can be used to capture the changing pattern of the predicted object and build a suitable model, which improves prediction accuracy [7]. The back-propagation neural network (BPNN) is a typical artificial neural network algorithm; one study [8] took full account of weather factors and established a short-term load forecasting model based on BPNN. Another study [9] proposed a load forecasting model based on improved differential evolution and a wavelet neural network. Artificial neural network models can keep the forecasting error within a small range; however, they converge slowly and tend to become trapped in local optima [10], so some scholars use the support vector machine (SVM) for load forecasting to avoid the problems of neural network structure selection and local optima. The studies in [11] and [12] each constructed load prediction models based on the SVM algorithm and improved the prediction accuracy compared with the BP neural network model. However, the SVM algorithm is often unsuitable when the amount of training data is too large, and it performs poorly on multi-classification problems.
Therefore, some scholars have improved the SVM by proposing the least-squares support vector machine (LSSVM), which uses kernel functions to transform the prediction problem into the solution of a system of equations, thereby replacing inequality constraints with equality constraints. In addition to significantly improving the accuracy of load forecasting, this method also increases the computation speed [13]. Researchers have since applied LSSVM to load forecasting in other power systems. For example, one study [14] proposed an LSSVM-based electricity load forecasting model with a rolling mechanism to forecast the annual electricity consumption in China, and the results showed that this model has better prediction performance than a single prediction model [15]. Another study [16] applied the cuckoo algorithm to optimize the LSSVM and used the optimized model for short-term electricity load forecasting, achieving more satisfactory forecasting results. Given the relatively satisfactory results achieved by LSSVM in these studies, the LSSVM model is used for prediction in this paper. However, although the LSSVM model outperforms the SVM model in load forecasting, it still suffers from the blind selection of the penalty coefficient and kernel parameters, and choosing appropriate values is crucial to the learning and generalization capabilities of LSSVM, so a suitable intelligent algorithm is needed to optimize them. At present, the main intelligent algorithms used include the genetic algorithm [17], the particle swarm algorithm [18], the cuckoo search (CS) algorithm [19], and the bat algorithm [20]. However, genetic algorithms suffer from premature convergence, cumbersome computation, a small processing scale, difficulty in dealing with nonlinear constraints, and poor stability.
The particle swarm algorithm, on the other hand, has low local search accuracy and cannot fully meet the needs of the LSSVM model in parameter optimization. The CS and bat algorithms do not reliably converge to the global optimum and are prone to falling into local optima, which reduces load forecasting accuracy. For this reason, this paper proposes to use the fireworks algorithm to optimize the parameters of LSSVM. The performance of the fireworks algorithm is largely insensitive to small changes in its parameters, and its parameters are easy to select; the algorithm therefore has good global convergence and high computational stability [21].
In addition, the residential electrical system is a complex system with many influencing factors. If all influencing factors are given as input data to the forecasting model, extensive redundant data will be generated during computation, so it is necessary to select the input index data [22]. Fuzzy clustering is a statistical method for categorizing objects in accordance with specific demands and rules, based on the attribute characteristics of the factors [23]. Because the load curves of dates with similar influencing factors are basically the same for daily residential electricity consumption, better prediction results can be achieved by using similar-day load samples for prediction. Thus, this paper uses the fuzzy cluster analysis method to process and analyze the influencing factors.
Therefore, this paper analyzes the influencing factors of residential electricity load and constructs a residential electricity load forecasting model (FC-FWA-LSSVM) based on fuzzy cluster analysis and an LSSVM optimized by the fireworks algorithm. The remaining parts of the paper are arranged as follows: Section 2 describes the algorithms used in this paper, including fuzzy cluster analysis, the LSSVM model, and the fireworks algorithm, and constructs a complete forecasting framework. Section 3 uses a practical case to investigate the accuracy and stability of the proposed model. In Section 4, four typical scenarios are selected to validate the prediction results. Section 5 summarizes the research results of the article.

Contributions
The main contributions of this paper are as follows:
(1) Building on support vector machines, this paper proposes a short-term load prediction method that uses the least-squares support vector machine to reduce the difficulty of prediction while lowering the risk of overfitting and improving the generalization ability of the learner and the prediction accuracy.
(2) This paper proposes a feature extraction method that compresses the data through fuzzy cluster analysis and optimizes the parameters with the fireworks algorithm. Compared with traditional cluster analysis, it reduces the redundancy of the data more effectively, further improves the prediction effect, and reduces the difficulty of prediction.
(3) Based on an empirical analysis of a residential neighborhood in China, this paper validates the effectiveness of the proposed method. Compared with traditional methods, the proposed method reduces RMSE to 2.32%, MAPE to 2.21%, and AAE to 2.10%, which makes it suitable for high-accuracy load prediction with large-scale features.

Fuzzy Clustering Analysis
Fuzzy clustering is a statistical method for categorizing objects in accordance with specific demands by establishing fuzzy similarity relationships based on their characteristics, closeness, and similarity [24]. In this paper, we used the dynamic cluster analysis method based on the fuzzy equivalence matrix.
Let the set of $n$ samples for the forecast date be given by Equation (1), where each sample has $m$ characteristic indicators and can be expressed as Equation (2):

$$X = \{x_1, x_2, \ldots, x_n\} \tag{1}$$

$$x_i = \{x_{i1}, x_{i2}, \ldots, x_{im}\}, \quad i = 1, 2, \ldots, n \tag{2}$$

The concrete steps of the cluster analysis are as follows:
(1) Normalization of data: Each characteristic indicator has a different scale and order of magnitude and needs to be normalized. Equation (3) was used to process the historical data:

$$x'_{ij} = \frac{x_{ij} - x_{j\min}}{x_{j\max} - x_{j\min}} \tag{3}$$

where $x_{ij}$ is the original datum; $x_{j\min}$ and $x_{j\max}$ are the minimum and maximum values in $x_{1j}, x_{2j}, \ldots, x_{nj}$; and $x'_{ij}$ is the datum after normalization.
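The normalization step can be illustrated with a minimal NumPy sketch. It assumes that Equation (3) is ordinary column-wise min-max scaling; the function name `min_max_normalize`, the handling of constant columns, and the sample values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def min_max_normalize(X):
    """Scale each column (characteristic indicator) of X to [0, 1]."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    span = col_max - col_min
    # Assumption: a constant column carries no information and is mapped to 0.
    span[span == 0] = 1.0
    return (X - col_min) / span

# Hypothetical samples: rows are days, columns are indicators
# (maximum temperature, minimum temperature, weather type).
samples = np.array([[30.0, 18.0, 1.0],
                    [25.0, 15.0, 0.5],
                    [35.0, 21.0, 1.0]])
normalized = min_max_normalize(samples)
print(normalized)
```

Each indicator is scaled independently, so indicators with large raw magnitudes (such as load) do not dominate the similarity computation that follows.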
(2) Establishing the fuzzy similarity relationship matrix: To measure the similarity between the samples to be classified, a fuzzy similarity relationship matrix $R = \{r_{ij}\}$ was established. Methods for determining $r_{ij}$ include the similarity coefficient method, the distance method, and the closeness method; the absolute value index method was used in this paper [25]. The formula is

$$r_{ij} = \exp\left(-\sum_{k=1}^{m} \left|x'_{ik} - x'_{jk}\right|\right) \tag{4}$$

where $x'_{ik}$ is the normalized value of the $k$-th indicator of sample $i$. After obtaining the fuzzy similarity relation matrix, the square self-composition method was used to construct the transitive closure $R^*$ of $R$.
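As a rough illustration of this step, the sketch below builds a similarity matrix and its transitive closure. It assumes that the absolute value index method takes the common exponential form $r_{ij} = \exp(-\sum_k |x'_{ik} - x'_{jk}|)$ and that the closure is obtained by repeated max-min self-composition; the function names and sample values are hypothetical.

```python
import numpy as np

def similarity_matrix(X):
    """r_ij = exp(-sum_k |x_ik - x_jk|) for normalized samples X (n x m)."""
    X = np.asarray(X, dtype=float)
    diff = np.abs(X[:, None, :] - X[None, :, :]).sum(axis=2)
    return np.exp(-diff)

def max_min_compose(A, B):
    """Fuzzy max-min composition: (A o B)_ij = max_k min(a_ik, b_kj)."""
    return np.minimum(A[:, :, None], B[None, :, :]).max(axis=1)

def transitive_closure(R, max_iter=64):
    """Square R repeatedly (R -> R o R) until it stops changing."""
    for _ in range(max_iter):
        R2 = max_min_compose(R, R)
        if np.allclose(R2, R):
            return R2
        R = R2
    return R

# Hypothetical normalized samples: two similar days and one dissimilar day.
X = np.array([[0.0, 0.0],
              [0.1, 0.0],
              [1.0, 1.0]])
R = similarity_matrix(X)
R_star = transitive_closure(R)
print(np.round(R_star, 4))
```

The closure raises the similarity between samples 0 and 2 from exp(-2) to exp(-1.9), because sample 1 links them; this is what makes $R^*$ a fuzzy equivalence relation that can be cut at any level.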
(3) Dynamic clustering: A reasonable threshold $\lambda$ has to be chosen to truncate $R^*$. The clustering level $\lambda$ directly affects the clustering results: as $\lambda$ decreases from 1 to 0, the classes gradually merge, from fine to coarse, forming a dynamic clustering diagram. The optimal value can be obtained from the rate of change of $\lambda$ [26]:

$$C_i = \frac{\lambda_{i-1} - \lambda_i}{n_i - n_{i-1}} \tag{5}$$

where $i$ is the index of the clustering step, ordered by $\lambda$ from high to low; $n_i$ and $n_{i-1}$ are the numbers of elements in the $i$-th and $(i-1)$-th clusterings; and $\lambda_i$ and $\lambda_{i-1}$ are the confidence levels of the $i$-th and $(i-1)$-th clusterings. If $C_j = \max_i (C_i)$, the confidence level $\lambda_j$ of the $j$-th clustering is taken as the optimal threshold.
Finally, the Euclidean distance between the forecast day and each category had to be calculated:

$$d_c = \sqrt{\sum_{k=1}^{m} \left(x'_k - v'_{ck}\right)^2} \tag{6}$$

where $x' = (x'_1, \ldots, x'_m)$ is the vector of characteristic indicators for the forecast day, and $v'_c = (v'_{c1}, \ldots, v'_{cm})$ is the vector of characteristic indicators for category $c$. The category with the minimum distance was picked as the category of the forecast date, and the corresponding prediction model was established for forecasting.
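The truncation and category assignment can be sketched as follows. This is a minimal sketch under stated assumptions: the threshold is passed in directly rather than selected by the rate-of-change rule, each category is represented by the mean of its members' indicator vectors, and the matrix, samples, and function names are hypothetical.

```python
import numpy as np

def lambda_cut_labels(R_star, lam):
    """Cluster labels from the lambda-cut of a fuzzy equivalence matrix."""
    n = R_star.shape[0]
    labels = -np.ones(n, dtype=int)
    next_label = 0
    for i in range(n):
        if labels[i] == -1:
            # All samples at least lam-similar to sample i share its class.
            labels[R_star[i] >= lam] = next_label
            next_label += 1
    return labels

def nearest_category(x_new, X, labels):
    """Pick the category whose mean indicator vector is closest to x_new."""
    best_label, best_dist = None, np.inf
    for c in np.unique(labels):
        center = X[labels == c].mean(axis=0)  # assumption: class mean as representative
        dist = np.linalg.norm(x_new - center)
        if dist < best_dist:
            best_label, best_dist = int(c), dist
    return best_label

# Hypothetical fuzzy equivalence matrix for four historical days.
R_star = np.array([[1.0, 0.9, 0.3, 0.3],
                   [0.9, 1.0, 0.3, 0.3],
                   [0.3, 0.3, 1.0, 0.8],
                   [0.3, 0.3, 0.8, 1.0]])
X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]])

fine = lambda_cut_labels(R_star, 0.85)    # three classes
coarse = lambda_cut_labels(R_star, 0.5)   # two classes
forecast_day = np.array([0.8, 0.9])
category = nearest_category(forecast_day, X, coarse)
print(fine, coarse, category)
```

Lowering the threshold from 0.85 to 0.5 merges the last two days into one class, which illustrates how the classification becomes coarser as the threshold decreases.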

Fireworks Optimization Algorithm
The fireworks algorithm (FWA) [27] simulates the whole process of a fireworks explosion. Fireworks explode to produce sparks, and the sparks in turn produce new sparks, forming rich patterns. FWA converts this process into a computational one: each firework is viewed as a feasible solution in the solution space of the optimization problem, and the generation of sparks is understood as a search for the optimal solution. In this search, the factors that influence FWA include the number of sparks, the explosion radius, and the selection of the best set of fireworks and sparks for the next explosion (search process).
FWA has a strong mechanism for self-regulating its local and global search capabilities. In the FWA model, each firework has its own explosion radius and number of sparks. A firework with a poor fitness value has a larger explosion radius, which gives it more exploration capability. A firework with a good fitness value has a smaller explosion radius, which gives it more exploitation capability around its location. In addition, Gaussian mutation sparks are introduced to further increase the diversity of the population.
Therefore, it can be seen that the three most vital elements of FWA are the explosion operator, mutation operator, and selection strategy.
(1) Explosion operator: According to the fitness value of each firework, the number of sparks produced by its explosion and its explosion radius can be calculated. For firework $x_i$ ($i = 1, 2, \ldots, N$), the number of sparks $S_i$ and the explosion radius $A_i$ are as follows:

$$S_i = M \cdot \frac{y_{\max} - f(x_i) + \varepsilon}{\sum_{j=1}^{N} \left(y_{\max} - f(x_j)\right) + \varepsilon} \tag{7}$$

$$A_i = \hat{A} \cdot \frac{f(x_i) - y_{\min} + \varepsilon}{\sum_{j=1}^{N} \left(f(x_j) - y_{\min}\right) + \varepsilon} \tag{8}$$

In Equations (7) and (8), $y_{\max}$ and $y_{\min}$ are the largest and smallest fitness values of the current population, respectively; $f(x_i)$ is the fitness value of firework $x_i$; $M$ is a constant that adjusts the number of explosion sparks; $\hat{A}$ is a constant that adjusts the explosion radius; and $\varepsilon$ is the machine epsilon, used to avoid division by zero.
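The explosion operator can be sketched in a few lines of NumPy. The sketch assumes that Equations (7) and (8) take the standard FWA form for a minimization problem, in which a smaller fitness value is better; the function name is hypothetical, the default constants 150 and 200 are the values reported in the example analysis, and the fitness values are made up for illustration.

```python
import numpy as np

def explosion_parameters(fitness, M=150, A_hat=200, eps=np.finfo(float).eps):
    """Spark counts and explosion radii for a minimization problem."""
    f = np.asarray(fitness, dtype=float)
    y_max, y_min = f.max(), f.min()
    # Better (smaller) fitness -> more sparks.
    sparks = M * (y_max - f + eps) / (np.sum(y_max - f) + eps)
    # Better (smaller) fitness -> smaller radius.
    radius = A_hat * (f - y_min + eps) / (np.sum(f - y_min) + eps)
    return sparks, radius

# Hypothetical fitness values of three fireworks.
sparks, radius = explosion_parameters([1.0, 2.0, 5.0])
print(np.round(sparks, 2), np.round(radius, 2))
```

The best firework receives most of the spark budget and a radius close to zero, while the worst firework receives almost no sparks and the largest radius. Practical implementations usually also round the spark counts and bound them from above and below, which is omitted here.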
(2) Mutation operator: Mutation operators increase the diversity of the spark population. The mutation sparks in FWA are Gaussian mutation sparks, produced from explosion sparks through Gaussian mutation. When firework $x_i$ is selected for Gaussian mutation, the mutation in dimension $k$ is

$$\hat{x}_{ik} = x_{ik} \cdot e \tag{9}$$

where $\hat{x}_{ik}$ denotes the mutation spark in dimension $k$, and $e$ obeys a Gaussian distribution.
In FWA, when explosion sparks or mutation sparks generated by the explosion operator and the mutation operator fall outside the search space, they must be mapped to a new location by the following rule:

$$\hat{x}_{ik} = x_{LB,k} + \left|\hat{x}_{ik}\right| \bmod \left(x_{UB,k} - x_{LB,k}\right) \tag{10}$$

where $x_{UB,k}$ and $x_{LB,k}$ are the upper and lower bounds of the search space in dimension $k$.
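A minimal sketch of the mutation and mapping steps follows. It assumes that the Gaussian factor is drawn from N(1, 1), as in the standard FWA formulation (the text above only states that it is Gaussian), and that the mapping rule is the usual modulo rule; the function names and numbers are hypothetical.

```python
import numpy as np

def gaussian_mutation(x, rng, n_dims=None):
    """Multiply randomly chosen dimensions of x by one Gaussian factor."""
    x = np.array(x, dtype=float)
    k = x.size if n_dims is None else n_dims
    dims = rng.choice(x.size, size=k, replace=False)
    e = rng.normal(loc=1.0, scale=1.0)  # assumption: e ~ N(1, 1)
    x[dims] *= e
    return x

def map_into_bounds(x, lower, upper):
    """Map out-of-range coordinates back with lower + |x| mod (upper - lower)."""
    x = np.array(x, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    out = (x < lower) | (x > upper)
    mapped = lower + np.mod(np.abs(x), upper - lower)
    x[out] = mapped[out]
    return x

rng = np.random.default_rng(0)
mutated = gaussian_mutation([1.0, 2.0], rng)
repaired = map_into_bounds([12.5, -3.0, 4.0], lower=[0.0, 0.0, 0.0],
                           upper=[10.0, 10.0, 10.0])
print(mutated, repaired)
```

Only coordinates that actually leave the search space are remapped; coordinates already inside the bounds are left unchanged.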
(3) Selection strategy: A certain number of individuals are selected from the fireworks, explosion sparks, and mutation sparks as the next generation of fireworks, so that the information in the current population is passed on to the next generation.
When individuals were picked, the candidate with the best fitness value was always kept as a firework of the next generation; $N$ is the size of the population. The remaining $N - 1$ fireworks were selected in a proportional manner. For firework $x_i$, the probability of being selected was computed as follows:

$$p(x_i) = \frac{R(x_i)}{\sum_{x_j \in K} R(x_j)} \tag{11}$$

where $R(x_i)$ is the sum of the distances between $x_i$ and all other individuals in the current candidate set $K$. An individual in a dense region of the candidate set, that is, one with many other candidates around it, has a lower probability of being selected.
According to the foregoing statement, the concrete stages of FWA are as follows [28]:
Step 1. Randomly pick $N$ fireworks in the solution space and initialize their parameters;
Step 2. Calculate the fitness value $f(x_i)$ of every firework, and calculate the explosion radius and number of sparks of every firework. Randomly pick the dimensions of each firework whose coordinates are to be updated;
Step 3. Generate the Gaussian mutation sparks: randomly pick a spark $x_i$, calculate the result $\hat{x}_{ik}$ with the Gaussian mutation formula, and save it to the population of Gaussian mutation sparks;
Step 4. Use the probability selection formula to pick individuals from the fireworks, explosion sparks, and Gaussian mutation sparks as the fireworks for the next iteration;
Step 5. Check the stop condition. If the stop condition is met, exit and output the optimal result; otherwise, return to Step 2 and continue the cycle.
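The five steps can be put together in a compact sketch. This is an illustrative simplification rather than the implementation used in this paper: it assumes a minimization problem, the standard FWA formulas for spark number and radius, a Gaussian factor drawn from N(1, 1), a fixed iteration count as the stop condition, and small hypothetical default parameters. The sphere function is only a stand-in objective.

```python
import numpy as np

def fireworks_minimize(f, lower, upper, n_fireworks=5, M=20, A_hat=2.0,
                       n_gauss=5, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    eps = np.finfo(float).eps

    def map_back(x):
        # Mapping rule for sparks that leave the search space.
        out = (x < lower) | (x > upper)
        return np.where(out, lower + np.mod(np.abs(x), upper - lower), x)

    # Step 1: random initial fireworks inside the search space.
    pop = rng.uniform(lower, upper, size=(n_fireworks, dim))
    best_x, best_f = None, np.inf
    for _ in range(n_iter):
        # Step 2: fitness, spark counts, and explosion radii.
        fit = np.array([f(x) for x in pop])
        s = M * (fit.max() - fit + eps) / (np.sum(fit.max() - fit) + eps)
        a = A_hat * (fit - fit.min() + eps) / (np.sum(fit - fit.min()) + eps)
        candidates = [pop]
        for x, s_i, a_i in zip(pop, s, a):
            n_sparks = max(1, int(round(s_i)))
            # Shift a random subset of dimensions by one displacement per spark.
            mask = rng.random((n_sparks, dim)) < 0.5
            shift = a_i * rng.uniform(-1.0, 1.0, size=(n_sparks, 1))
            candidates.append(map_back(x + mask * shift))
        # Step 3: Gaussian mutation sparks from randomly chosen fireworks.
        idx = rng.integers(0, n_fireworks, size=n_gauss)
        e = rng.normal(1.0, 1.0, size=(n_gauss, 1))
        candidates.append(map_back(pop[idx] * e))
        cand = np.vstack(candidates)
        cand_fit = np.array([f(x) for x in cand])
        # Step 4: keep the best candidate, draw the rest by distance-based probability.
        i_best = int(np.argmin(cand_fit))
        if cand_fit[i_best] < best_f:
            best_x, best_f = cand[i_best].copy(), float(cand_fit[i_best])
        dist = np.linalg.norm(cand[:, None, :] - cand[None, :, :], axis=2).sum(axis=1)
        prob = (dist + eps) / np.sum(dist + eps)
        rest = rng.choice(len(cand), size=n_fireworks - 1, replace=False, p=prob)
        pop = np.vstack([cand[i_best], cand[rest]])
    # Step 5: stop after a fixed number of iterations.
    return best_x, best_f

def sphere(x):
    return float(np.sum(x ** 2))

lower, upper = np.array([-5.0, -5.0]), np.array([5.0, 5.0])
best_x, best_f = fireworks_minimize(sphere, lower, upper)
print(best_x, best_f)
```

Because the best candidate is always carried over, the best fitness found never gets worse from one iteration to the next. To tune LSSVM, the objective `f` would be replaced by a function that maps the pair of LSSVM parameters to a forecasting error.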

LSSVM
LSSVM is an extension of the SVM that replaces the inequality constraints of the support vector machine with equality constraints and transforms the quadratic programming problem in SVM into the solution of a linear system of equations. As a result, the convergence speed of the model is significantly improved [29].
Let the given sample set be $\{(x_i, y_i)\}_{i=1}^{N}$, where $N$ is the total number of samples; then, the regression model of the samples is

$$y(x) = w^{T}\varphi(x) + b \tag{12}$$

where $\varphi(\cdot)$ maps the training samples into a high-dimensional feature space, $w$ is the weight vector, and $b$ is the bias.
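For LSSVM, the optimization problem becomes

$$\min_{w, b, \xi} \; \frac{1}{2} w^{T} w + \frac{1}{2} \gamma \sum_{i=1}^{N} \xi_i^2 \tag{13}$$

$$\text{s.t.} \quad y_i = w^{T}\varphi(x_i) + b + \xi_i, \quad i = 1, 2, \ldots, N \tag{14}$$

where $\gamma$ is the penalty factor, which balances the complexity and the accuracy of the model, and $\xi_i$ is the estimation error. To solve this problem, the Lagrangian function is established:

$$L(w, b, \xi, \alpha) = \frac{1}{2} w^{T} w + \frac{1}{2} \gamma \sum_{i=1}^{N} \xi_i^2 - \sum_{i=1}^{N} \alpha_i \left[ w^{T}\varphi(x_i) + b + \xi_i - y_i \right] \tag{15}$$

where $\alpha_i$ is the Lagrangian multiplier. Taking the derivative with respect to every variable of the function and setting it to 0 gives

$$\frac{\partial L}{\partial w} = 0 \Rightarrow w = \sum_{i=1}^{N} \alpha_i \varphi(x_i), \qquad \frac{\partial L}{\partial b} = 0 \Rightarrow \sum_{i=1}^{N} \alpha_i = 0,$$

$$\frac{\partial L}{\partial \xi_i} = 0 \Rightarrow \alpha_i = \gamma \xi_i, \qquad \frac{\partial L}{\partial \alpha_i} = 0 \Rightarrow w^{T}\varphi(x_i) + b + \xi_i - y_i = 0.$$

Eliminating $w$ and $\xi_i$ yields the linear system

$$\begin{bmatrix} 0 & \mathbf{1}^{T} \\ \mathbf{1} & \Omega + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix}, \qquad \Omega_{ij} = K(x_i, x_j),$$

and the regression function

$$y(x) = \sum_{i=1}^{N} \alpha_i K(x, x_i) + b,$$

where $K(x_i, x_j) = \varphi(x_i)^{T}\varphi(x_j)$ is a kernel function that satisfies Mercer's condition. Because the radial basis function (RBF) kernel has a broad convergence domain and a wide range of applicability, it was chosen in this paper as the kernel function of the least-squares support vector machine:

$$K(x, x_i) = \exp\left(-\frac{\left\| x - x_i \right\|^2}{2\sigma^2}\right) \tag{23}$$

where $\sigma^2$ denotes the kernel width, which reflects the features of the training dataset and affects the generalization ability of the model. From the above analysis, it is clear that the difficulty in building the LSSVM prediction model lies in determining the parameters of the model: the kernel function parameter $\sigma^2$ and the penalty parameter $\gamma$. Selecting appropriate values of $\sigma^2$ and $\gamma$ is crucial to the learning and generalization ability of the model.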
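Training an LSSVM therefore reduces to solving one linear system. The sketch below is a minimal illustration under stated assumptions: the RBF kernel is written with $2\sigma^2$ in the denominator, the function names are hypothetical, and the data and parameter values are synthetic rather than taken from the case study.

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma2))

def lssvm_fit(X, y, gamma, sigma2):
    """Solve the bordered linear system for the bias b and multipliers alpha."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma2) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]

def lssvm_predict(X_train, b, alpha, X_new, sigma2):
    """Evaluate sum_i alpha_i K(x, x_i) + b for each row of X_new."""
    return rbf_kernel(X_new, X_train, sigma2) @ alpha + b

# Synthetic one-dimensional example.
X = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
y = np.sin(2.0 * np.pi * X[:, 0])
gamma, sigma2 = 1000.0, 0.05
b, alpha = lssvm_fit(X, y, gamma, sigma2)
fitted = lssvm_predict(X, b, alpha, X, sigma2)
print(np.max(np.abs(fitted - y)))
```

The training residual at each point equals `alpha[i] / gamma`, so a larger penalty factor forces a closer fit to the training data, while the kernel width controls how smooth the fitted curve is.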

Model Construction
The factors influencing the load forecast include seasonal type, maximum temperature, minimum temperature, weather type, day type, and historical load values. In this paper, we first analyzed the influencing factors of residential electricity load and used fuzzy cluster analysis to extract the dates whose influencing factors are similar to those of the day to be predicted, forming similar-day load samples as the training samples of the prediction model. We then used the fireworks algorithm to optimize LSSVM to obtain the best values of $\gamma$ and $\sigma^2$, and finally obtained and analyzed the prediction results. The proposed combined forecasting framework is shown in Figure 1.
Figure 1. The proposed composite prediction framework.
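The coupling between the optimizer and LSSVM is the fitness function: each candidate pair of penalty factor and kernel width is scored by the forecasting error of the LSSVM trained with it. The sketch below is a hypothetical illustration; it assumes a hold-out validation split and MAPE as the fitness, uses synthetic data, and evaluates a short hand-written candidate list where FWA would supply the candidates.

```python
import numpy as np

def rbf(A, B, sigma2):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma2))

def lssvm_validation_mape(params, X_tr, y_tr, X_val, y_val):
    """Fitness for the optimizer: validation MAPE of an LSSVM with (gamma, sigma2)."""
    gamma, sigma2 = params
    n = len(y_tr)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = A[1:, 0] = 1.0
    A[1:, 1:] = rbf(X_tr, X_tr, sigma2) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y_tr)))
    pred = rbf(X_val, X_tr, sigma2) @ sol[1:] + sol[0]
    return float(np.mean(np.abs((pred - y_val) / y_val)))

# Synthetic stand-in for normalized similar-day samples (strictly positive load).
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(60, 2))
y = 2.0 + np.sin(2.0 * np.pi * X[:, 0]) + 0.5 * X[:, 1]
X_tr, y_tr, X_val, y_val = X[:40], y[:40], X[40:], y[40:]

# Hypothetical candidates; in the full method these come from the fireworks population.
candidates = [(1.0, 1.0), (10.0, 0.5), (100.0, 0.1), (1000.0, 0.05)]
scores = [lssvm_validation_mape(c, X_tr, y_tr, X_val, y_val) for c in candidates]
best_params = candidates[int(np.argmin(scores))]
print(best_params, min(scores))
```

Scoring on data that were not used for fitting keeps the search from simply rewarding a very large penalty factor that memorizes the training samples.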

Example Analysis
In this paper, a residential neighborhood in China was selected as a case study. The load data of the neighborhood from October 2018 to October 2019 were used: the training set covers 1 October 2018 to 30 October 2019, and the test set is 31 October 2019, with data collected every 30 min.

Input Variable Selection and Processing
In this paper, seasonal type, maximum temperature, minimum temperature, weather type, day type, and the load values at the same moment on the 4 days before the forecast day were selected as input variables. The seasonal-type data were divided into 4 categories: 1 for spring, 2 for summer, 3 for autumn, and 4 for winter. The weather-type data were divided into 2 categories: 1 for sunny and cloudy weather, and 0.5 for rain and snow. The day-type data were divided into 2 categories: 1 for weekdays and 0.5 for weekends. The temperature data and load data had to be normalized according to Equation (24):

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{24}$$

where $x$ is the real value, $x_{\min}$ and $x_{\max}$ are the minimum and maximum sample data, and $x'$ is the normalized value.
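The encoding of one input sample can be sketched as follows. The category codes follow the text above; the function names, the handling of public holidays (ignored here), and all numeric values are hypothetical and serve only as an illustration.

```python
from datetime import date

SEASON_CODE = {"spring": 1, "summer": 2, "autumn": 3, "winter": 4}
WEATHER_CODE = {"sunny": 1.0, "cloudy": 1.0, "rain": 0.5, "snow": 0.5}

def day_type_code(day):
    """1 for weekdays, 0.5 for weekends (assumption: holidays are not treated separately)."""
    return 1.0 if day.weekday() < 5 else 0.5

def min_max(value, lo, hi):
    """Min-max normalization of a single value."""
    return (value - lo) / (hi - lo)

def build_features(day, season, weather, t_max, t_min, prev_loads,
                   temp_range, load_range):
    """Input vector: season, temperatures, weather, day type, previous loads."""
    t_lo, t_hi = temp_range
    l_lo, l_hi = load_range
    return [
        SEASON_CODE[season],
        min_max(t_max, t_lo, t_hi),
        min_max(t_min, t_lo, t_hi),
        WEATHER_CODE[weather],
        day_type_code(day),
        *[min_max(v, l_lo, l_hi) for v in prev_loads],
    ]

# Hypothetical values for one half-hour slot of the test day.
features = build_features(date(2019, 10, 31), "autumn", "sunny",
                          t_max=18.0, t_min=6.0,
                          prev_loads=[410.0, 395.0, 402.0, 420.0],
                          temp_range=(-10.0, 40.0), load_range=(300.0, 500.0))
print(features)
```

The categorical indicators keep their coded values, while only temperature and load are rescaled, which mirrors the description above.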
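The following indices were used to evaluate the forecasting results:
(1) Relative error (RE)

$$RE_i = \frac{\hat{y}_i - y_i}{y_i} \times 100\% \tag{25}$$

(2) Root-mean-squared error (RMSE)

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(\frac{\hat{y}_i - y_i}{y_i}\right)^2} \tag{26}$$

(3) Mean absolute percentage error (MAPE)

$$MAPE = \frac{1}{n} \sum_{i=1}^{n} \left|\frac{\hat{y}_i - y_i}{y_i}\right| \times 100\% \tag{27}$$

(4) Average absolute error (AAE)

$$AAE = \frac{\frac{1}{n} \sum_{i=1}^{n} \left|\hat{y}_i - y_i\right|}{\frac{1}{n} \sum_{i=1}^{n} y_i} \tag{28}$$

In Equations (25)-(28), $y_i$ is the actual value of the load, $\hat{y}_i$ is the predicted value of the load, and $n$ is the number of data points. The smaller the values of these indicators, the higher the prediction accuracy [27].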
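The four indices can be computed as in the sketch below. It assumes the relative (percentage) definitions, in which RE is the signed error divided by the actual value, RMSE is the root mean square of RE, MAPE is the mean absolute RE, and AAE is the mean absolute error divided by the mean actual load; the function name and the sample numbers are hypothetical.

```python
import numpy as np

def forecast_errors(y_true, y_pred):
    """Return RE per point and RMSE, MAPE, AAE as fractions (multiply by 100 for %)."""
    y = np.asarray(y_true, dtype=float)
    p = np.asarray(y_pred, dtype=float)
    re = (p - y) / y
    return {
        "RE": re,
        "RMSE": float(np.sqrt(np.mean(re ** 2))),
        "MAPE": float(np.mean(np.abs(re))),
        "AAE": float(np.mean(np.abs(p - y)) / np.mean(y)),
    }

# Hypothetical actual and predicted loads.
errors = forecast_errors([100.0, 200.0], [110.0, 190.0])
print(errors)
```

With these definitions, all three aggregate indices are dimensionless, so they can be compared across days with different load levels.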

Evaluation Indices of Forecasting Results
The FWA parameters were initialized as follows: the maximum number of iterations is 1000, the population size is $N = 40$, the constant that determines the number of sparks is $M = 150$, and the constant that determines the explosion radius is $\hat{A} = 200$. To validate the performance of the proposed prediction method, comparison experiments were conducted on the test sample data using the fireworks-algorithm-optimized LSSVM model (FWA-LSSVM), the standard LSSVM model, and the standard BPNN model. Figure 2 and Table 1 show the load forecasting results of the proposed model and of the FWA-LSSVM, LSSVM, and BPNN models. Figure 3 and Table 2 show the RE of each prediction model.
For the BPNN model, the relative error with the smallest absolute value is −1.55% and the largest relative error is 9.97%; the errors at most time points fall within (−7%, −5%) or (5%, 7%), with large fluctuations. From this perspective, the FC-FWA-LSSVM model has the highest forecasting precision, followed by the FWA-LSSVM model and the LSSVM model, and the BPNN model is the worst. This shows that fuzzy clustering analysis can effectively avoid the blindness of choosing similar days by manual experience. Compared with the traditional LSSVM model, the FWA-optimized LSSVM model improved its prediction accuracy because it found better model parameters.
Figure 4 and Table 3 show the RMSE, MAPE, and AAE of each model for the overall prediction results. The RMSE of the proposed method is 2.32%, while the RMSE values of the FWA-LSSVM, LSSVM, and BPNN models are 4.25%, 6.12%, and 8.26%, respectively. This shows that the proposed model has the smallest errors and the highest overall forecasting precision. The FC-FWA-LSSVM model also has the best results for MAPE (2.21%) and AAE (2.10%).
Compared with the FWA-LSSVM model, the proposed model benefits from fuzzy clustering analysis, which overcomes, to some extent, the negative impact that unconventional load data caused by unexpected changes in the influencing factors have on LSSVM training. Comparing the FWA-LSSVM model with the LSSVM model shows that the generalization capability and forecasting precision of LSSVM can be enhanced by optimizing its parameters. In contrast to the BPNN model, the LSSVM model avoids the slow convergence and the tendency to become stuck in a local optimum. Overall, the FC-FWA-LSSVM model has the best forecasting performance, the FWA-LSSVM model and the LSSVM model are the next best, and the BPNN model has the worst prediction performance.

Scenario Validation
To remove the specificity of the target day and to check the generalization behavior of the model, one day of data from each of the four seasons was picked as a test sample: 17 April as the spring test day, 21 July as the summer test day, 9 October as the autumn test day, and 28 January as the winter test day. Figure 5 shows the curves of the forecast load values against the true load values for spring, summer, autumn, and winter. Comparing the predicted and true load values for the four seasons shows that the overall predicted trend of all four models is close to the true values, which validates the advantages of the model proposed in this paper for residential electricity load forecasting.
From Table 4, it can be concluded that the RMSE, MAPE, and MAE values of the FC-FWA-LSSVM forecasting model proposed in this paper are the lowest for all four seasonal test samples. The RMSE, MAPE, and MAE values for spring load forecasting are 1.3901%, 1.4112%, and 1.4349%, respectively; those for summer load forecasting are 1.3912%, 1.5812%, and 1.4424%; those for autumn load forecasting are 1.5123%, 1.4671%, and 1.4291%; and those for winter load forecasting are 1.3941%, 1.4214%, and 1.4042%. All of these are lower than the values of the other models, which again illustrates the superior overall prediction performance of the proposed model.

Conclusions
In this article, we presented a hybrid load forecasting model that combines fuzzy cluster analysis and LSSVM and is optimized by FWA. First, to forecast residential electricity loads, fuzzy cluster analysis was used to select the input features. In addition, FWA was used to optimize the parameters of LSSVM. Finally, after an optimized input subset and the optimal values of $\gamma$ and $\sigma^2$ were acquired, the proposed model was used for residential electricity load forecasting. Based on these studies, several conclusions can be drawn: (a) by using fuzzy clustering analysis, the influence of uncorrelated factors can be mitigated, which effectively improves forecasting capabilities; (b) the optimization algorithm FWA increases the global search capability of the model, and the LSSVM model optimized by FWA shows good performance; (c) based on the error evaluation criteria, LSSVM achieves better prediction results than SVM, indicating that improving SVM by introducing a least-squares linear system is effective. The model based on FC and LSSVM optimized with FWA proposed in this paper offers a new research direction for load forecasting and is highly feasible. In the load forecasting example, the desired forecasting results were obtained. In future research, the forecasting model can be applied to develop demand response strategies that contribute to peak shaving and valley filling of electricity loads.