A Novel Short-Term Residential Electric Load Forecasting Method Based on Adaptive Load Aggregation and Deep Learning Algorithms

Short-term residential load forecasting is the precondition of the day-ahead and intra-day scheduling strategy of the household microgrid. Existing short-term electric load forecasting methods are mainly used to obtain regional power load for system-level power dispatch. Due to the high volatility, strong randomness, and weak regularity of the residential load of a single household, the mean absolute percentage error (MAPE) of the forecasting results of traditional methods would be too large to be used for home energy management. As the total number of households increases, the aggregated load becomes more stable and its cyclical pattern becomes more distinct. Meanwhile, the maximum daily load does not increase linearly with the number of households in a small area. Therefore, in our proposed short-term residential load forecasting method, an optimal number of households is selected adaptively, and the total aggregated residential load of the selected households is used for load prediction. In addition, the ordering points to identify the clustering structure (OPTICS) algorithm is used to adaptively cluster households with similar power consumption patterns, which offers an alternative way to enhance the periodic regularity of the aggregated load. The aggregated residential load and encoded external factors are then used to predict the load in the next half hour. The long short-term memory (LSTM) deep learning algorithm is used in the prediction because of its inherent ability to maintain historical data regularity in the forecasting process. The experimental data have verified the effectiveness and accuracy of our proposed method.


Introduction
The different kinds of appliances have increased significantly in households, and the residential electrical load has maintained a medium-high growth rate over the years. In the meantime, with the development of renewable energy technologies, rooftop photovoltaic and distributed electric vehicles are also widely involved in home energy management [1][2][3]. Therefore, the household microgrid will be established by household appliances, rooftop photovoltaic, distributed electric vehicles, and battery energy storage devices [4][5][6]. The constructed household microgrid can dispatch the residential electricity flexibly, provide demand-side response capability, and, finally, improve the economic performance of the microgrid operational management.
Short-term residential load forecasting is the precondition of the day-ahead and intraday scheduling strategy of the household microgrid. Accurate short-term load forecasting results can be used to form a more reasonable home energy scheduling plan [7][8][9]. Since dispatching management is only applied in the large-scale regions in the traditional power grid, existing electric load forecasting methods are mainly used to obtain regional power load for generation scheduling, transaction scheduling, and network dispatching [10,11]. These load forecasting methods can be mainly grouped into three categories, including similar day or similar time interval-based forecasting methods, frequency component-based forecasting methods, and meteorological factor-based load forecasting methods.
Similar day or similar time interval-based load forecasting methods use the historical load data at the same time on related and nearby days to obtain the load value at a given time on the prediction day [12]. In these methods, similar day identification and data smoothing algorithms are the most important procedures. Algorithms such as Euclidean distance or density clustering have been proposed and used to find the most suitable similar day or similar time interval [12,13]. In the meantime, several data smoothing algorithms are also applied in the forecasting process to efficiently relate the relevant historical data to the forecast load value on the prediction day, including the least squares regression algorithm [14], support vector machine [15], artificial neural network [16], etc. The principle of these methods is essentially based on the electrical consumption patterns on the day of the week, which are directly determined by human activity. These methods are easy to implement and are widely applied in the field.
Frequency component-based forecasting methods would decompose the electric load series into components with daily periodicity, weekly periodicity, climate-vulnerable low frequency, and randomness-determined high frequency. In addition, corresponding algorithms are optimally designed to forecast these several components separately. In references [17][18][19][20], variational mode decomposition and wavelet transform algorithms are used to decompose time series load data separately into several components with different frequencies. These methods use the power consumption patterns of all components instead of the entire load regularity. The influence of the random and abnormal load disturbance on the forecasting result can be reduced because of the effective decomposition and special signal processing for different components.
In contrast to the first and second kinds of methods, meteorological parameters are directly used as input factors in meteorological factor-based load forecasting methods. In these methods, in addition to the historical time series of electric load, intraday meteorological parameters and accumulative effect factors of historical meteorological parameters are converted to several input variables by numerical value mapping [21][22][23][24]. These variables are combined with the power consumption data to form multi-dimensional input parameters for the subsequent forecasting algorithms, including artificial neural network [21][22][23], support vector machine [24], and variants of these two algorithms. This kind of method emphasizes the influence of the meteorological parameters on the electric load forecasting, so the prediction accuracy can be further improved.
The abovementioned three kinds of short-term electric load forecasting methods have been widely applied in the field for system-level load forecasting over the years. The forecasting error of the power consumption in a whole country, province, or city can be controlled under 0.5%, as stated in some reports [25]. The precondition of all these methods is that the predicted load should have a remarkably regular pattern. Due to the randomness of human activities, the residential electric load of a single-family dwelling or a limited number of multi-family houses fluctuates strongly and follows an irregular trajectory. Therefore, when traditional electric load forecasting methods are applied in home energy management, the mean absolute percentage error (MAPE) would be greater than 40% [26].
Unlike the summed electric load of a city or a region, residential power consumption is volatile and uncertain. External factors, including the routine of life, human occupancy, and household appliances, directly affect the short-term individual power consumption. Reference [27] analyzes the characteristics of residential electric load based on the nature of different appliances and the routine of life. Because the data collection it requires is burdensome and never complete, a physical model is generally inferior to a data-driven electric load forecasting model. With the development of the smart meter and Internet of Things technologies [28], data-driven residential load forecasting algorithms have gained the attention of scholars. References [29,30] summarize the data-driven forecasting methods that are applied in building energy management. In these methods, feature generation from the daily timetable, clustering algorithms, and deep learning networks are used to forecast the building energy consumption. In [31], convolutional neural networks (CNN) and long short-term memory neural networks (LSTM) are combined to forecast the electric load of a four-story building robustly and reliably. To improve the LSTM capability to deal with the varying length of input features, the attention mechanism is integrated into the LSTM algorithm to improve the prediction performance [32]. The attention mechanism is also used to improve the prediction accuracy for a sudden increase in power usage [33]. Compared with the load prediction of a whole building or a whole floor, the electric load prediction at a single-unit level is more difficult because of the greater randomness. In [34], an LSTM algorithm is used to capture power consumption patterns and human behaviors in real time, which improves the adaptivity of the load prediction to the configuration of home appliances.
Furthermore, appliance consumption sequences are integrated into the LSTM algorithm to especially improve prediction accuracy for the volatile problem in [35]. In the meantime, modified LSTM algorithms are also proposed in [36,37] to adaptively assign weights to temporal features and extract spatial characteristics effectively. These methods provide several effective algorithms to forecast the individual electric load, and the prediction process can be adopted for reference by future research. However, the prediction results show that the prediction error for a single unit is still too big to be applied in the field. The mean absolute percentage error (MAPE) nearly reaches 30-40% for different experimental data. Therefore, the prediction load results cannot be directly used for home energy management.
In this paper, a novel short-term residential load forecasting framework is proposed to fill the gap between the electric load forecasting of a single-family unit and that of a whole city. Firstly, a characteristic analysis of residential electric loads is conducted to verify the necessity of load aggregation. Secondly, an optimal aggregated electric load algorithm is proposed and discussed by using typical load clustering algorithms. Thirdly, an LSTM-based residential load forecasting model is proposed and discussed with the input parameters of the adaptive load aggregation and the encoded external factors. The experimental data have verified the effectiveness and accuracy of our proposed method.

Power Consumption Analysis of Household Appliances
The residential electric load consists of different home appliances. These residential electric loads can be grouped into three categories. The first kind of load works at a relatively consistent time every day, including appliances such as rice cookers, kitchen ventilators, and refrigerators. The second kind of load is directly determined by external factors, including heating, air-conditioning, and electric fans. The third kind of load would work every day, but the corresponding operation time would vary and be influenced by the routine life of the host family, including the electric water heater and laundry machine.
Nowadays, the residential electric load is mainly collected by intelligent electric meters and used by the marketing departments of electric utilities. Globally, the sampling intervals of residential load are in the range of 15 min to one hour in the field [38]. In this section, the historical data of residential electric load would be analyzed to obtain the statistical characteristics based on typical quantitative indicators.

Quantitative Indicators of Residential Electric Loads
Based on the collected historical data and the requirement of load prediction, the following five indicators would be used to describe the characteristics of residential electric load.
(1) maximum daily load
The maximum daily load is the maximum value of the selected residential electric load in a whole day and is represented by $P_{\max}$ in this paper.
(2) mean daily load
The mean daily load is the mean value of the selected residential electric load in a whole day and is represented by $P_v$ in this paper. It is calculated by
$$P_v = \frac{1}{N \Delta t} \sum_{i=1}^{N} P_i \, \Delta t$$
where $P_i$ represents the ith sampling data of the selected residential electric load, $\Delta t$ represents the sampling interval of the smart meter, $N$ represents the total sampling number in a whole day, and $N \times \Delta t$ equals 24 h.
(3) mean daily loading rate
The mean daily loading rate, $\gamma$, is the ratio of the mean daily load to the maximum daily load, which is expressed by
$$\gamma = \frac{P_v}{P_{\max}}$$
(4) minimum daily loading rate
The minimum daily loading rate, $\beta$, is the ratio of the minimum daily load, $P_{\min}$, to the maximum daily load, $P_{\max}$, and is calculated by
$$\beta = \frac{P_{\min}}{P_{\max}}$$
(5) daily volatility index
The dispersion degree of sample data is usually obtained by statistical analysis. The coefficient of dispersion can be used to analyze the volatility of the daily electric load. The daily volatility index of residential electric load, $F_L$, is calculated from the coefficient of dispersion of the daily load samples.
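As a rough illustration of how these indicators could be computed from one day of smart meter readings, the following Python sketch may help. The function name `daily_indicators` is hypothetical. The volatility index is computed here as the population standard deviation divided by the mean, which is only one common form of a coefficient of dispersion and is an assumption, not necessarily the definition of $F_L$ used in this paper.

```python
from statistics import mean, pstdev


def daily_indicators(samples_kw):
    """Return the daily indicators for one day of load samples in kW.

    samples_kw holds the N equally spaced meter readings of one day,
    for example N = 48 for a 30 min sampling interval.
    """
    p_max = max(samples_kw)
    p_min = min(samples_kw)
    if p_max <= 0:
        raise ValueError("the maximum daily load must be positive")
    # With a constant sampling interval, the time-weighted mean reduces
    # to the arithmetic mean of the samples.
    p_v = mean(samples_kw)
    gamma = p_v / p_max   # mean daily loading rate
    beta = p_min / p_max  # minimum daily loading rate
    # ASSUMPTION: volatility taken as population std / mean; the paper's
    # exact formula for F_L is not reproduced here.
    f_l = pstdev(samples_kw) / p_v
    return {"P_max": p_max, "P_v": p_v, "gamma": gamma, "beta": beta, "F_L": f_l}


# Hypothetical day: low load for half of the day, 1 kW for the other half.
example_day = [0.2] * 24 + [1.0] * 24
example = daily_indicators(example_day)
```

For the hypothetical day above, the mean daily loading rate is about 0.6 and the minimum daily loading rate is about 0.2.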

Characteristic Analysis of Residential Electric Loads
Among the public datasets on residential electric load, three are widely used in the literature: the data from the Smart Grid Smart City (SGSC) project in Australia [39], the data from the Smart Metering project in Ireland [40], and the data from the Smart-star project in the USA [41]. The SGSC project comprises historical electric load data of 10,000 households recorded from 2012 to 2014 at a sampling interval of 30 min, while the load data of about 4000 households from 2009 to 2010 were recorded at the same sampling interval in the Smart Metering project. Unlike the abovementioned datasets, the electric load data and external factors are recorded in detail for about 400 households from 2013 to 2016 in the Smart-star project, and the sampling interval reaches 1 min for several individuals.
Due to the good readability and complete continuity of the recorded data, the SGSC dataset is used in our research after comparison. Firstly, the characteristics of a single-unit residential electric load are analyzed over a long period of time. Secondly, the load characteristics of different units are analyzed and compared with each other. Thirdly, the difference between a single-unit load and the aggregated load of multiple units is analyzed in detail, and the results are used for the load prediction method proposed in this paper.

Electric Load Characteristics of a Single Unit
As an example, take the residential electric load of the customer numbered as 10006414. The corresponding recorded power consumption data from February 2012 to March 2014 is selected as sample data. The indicators of the selected data are calculated to illustrate the load characteristics.
The maximum daily loads of customer 10006414 for 750 consecutive days are shown in Figure 1. Statistical analysis has been performed to obtain the distribution of the maximum daily loads. The most probable maximum daily load is in the range of (0.5 kW, 1.0 kW), which accounts for 37.5% of the total 750 days. The proportion of maximum daily loads in the range of (1.0 kW, 1.5 kW), 20.0%, is similar to that in the range of (1.5 kW, 2.0 kW), 21.6%. The proportions of maximum daily loads in the range of (2.0 kW, 2.5 kW) and the range of (2.5 kW, 3.0 kW) equal 11.2% and 5.9%, respectively. In addition, the maximum daily load changes with the seasons. For example, as shown in Figure 1, the maximum daily loads from April to October are larger than those from November to March. The other quantitative indicators of the residential electric load of customer 10006414 for these 750 days have also been analyzed and are given as follows.
The mean daily load of customer 10006414 is mainly located in the range of (0.2 kW, 0.4 kW), and the corresponding proportion equals 63.2%. In the meantime, the mean daily load increased significantly in the period from the middle of May to the beginning of August. The proportions of mean daily loads with the range of (0.4 kW, 0.6 kW) and the range of (0.6 kW, 0.8 kW) equal 21.9% and 7.9%, respectively.
The mean loading rate mainly varies in the range of 0.2 to 0.4. It is much less than the loading rate of a provincial region, which usually equals 0.8. These results show that the residential load is more volatile than the regional load. Furthermore, the minimum loading rate is in the range of 0.05 to 0.15. In other words, the peak-valley difference of the residential load is very large, and the minimum loading rate is far below that of industrial loads. The corresponding daily volatility index is in the range of 0.15 to 0.3. The high volatility and large peak-valley difference of the residential load pose an enormous challenge for short-term residential load prediction.

Electric Load Characteristics of Different Units
Ten households are randomly selected from the SGSC set data, and the corresponding load characteristics are analyzed and compared with each other based on the historical data in March 2013.
The maximum daily loads of the selected ten units are shown in Figure 2. The results show that no noteworthy association exists between any two maximum daily load curves. For example, the maximum value of the maximum daily loads of customer 10006414 appears on 13 March and equals 2.306 kW, while the minimum value appears on 1 March and equals only 0.206 kW. The maximum value of the maximum daily loads of customer 10006704 appears on 6 March and equals 7.126 kW, while the minimum value appears on 3 March and equals 2.608 kW. There are obvious differences between the maximum daily loads of customer 10006414 and those of customer 10006704. The mean daily loads and the daily volatility indicators of these ten selected units are also analyzed, and some quantitative indicators are given in Table 1 for an intuitive and clear comparison. The mean daily load analysis results show that some customers consume similar daily amounts of electricity over these days, while others do not. For example, the mean daily loads of customer 10006414 fluctuate within a small range of (0.2 kW, 0.4 kW), and the mean daily loads of customer 10006572 also fluctuate within a small range of (0.4 kW, 0.6 kW). Unlike these two customers, the mean daily load of customer 10006684 varies on a large scale. The values fluctuate sharply from 2 kW to 6.5 kW on 3 March, 9 March, and 15 March, while the values on the other days remain in the range of (1.8 kW, 1.9 kW).
Compared to the irregular patterns of residential loads, the daily volatility indicators of the selected ten customers are within the same range, located from 0.15 to 0.25. However, further analysis indicates that the daily volatility index variation curves of any two households are different. These results are caused by the similar household appliances and different working processes in these selected units.
The electric loads of eighty selected households are used to form the probability distribution of mean daily loading rates, and the distributions on three selected days are shown in Figure 3. For the same eighty households, the probability distribution of mean daily loading rates is different on 1 March, 5 March, and 10 March in 2013. On 1 March, the maximum probability of the mean daily loading rates appears in the range of (0.1, 0.15), while the maximum probability of the rates appears in the range of (0.15, 0.2). In the meantime, the mean daily loading rates of these selected eighty households are distributed over a wide range, mainly within (0.05, 0.4).

Electric Load Characteristics of a Single Unit and Total Loads of Multiple Units
The abovementioned analysis results show that the electric load of a single unit has weak regularity. If the traditional load prediction methods are applied for the load forecasting of a single unit, a big forecasting error would inevitably appear. Based on the field data and historical experience, the daily volatility index of the total electric load of a whole region is very small because the load fluctuations of all units balance themselves out. With the increase in households in the region, the pattern of the total electric load becomes more and more stable, but too many residents in one region would exceed the scale limit of the microgrid, and then reduce the operational flexibility of the microgrid.
Quantitative indicators of the total electric load of the abovementioned selected ten households are analyzed and shown in Figure 4. Furthermore, another ten households are added to the cluster, and the indicators of the total electric load of twenty households are analyzed. The quantitative indicators of these total residential electric loads are also listed in Table 2. There is some regular pattern in the total power consumption of the selected ten households. As shown in Figure 4a, the maximum daily loads of the total load of the ten households appear as local minimums on 6 March, 12 March, 16 March, 23 March, and 30 March, and the corresponding values equal 9.232 kW, 11.252 kW, 12.3256 kW, 11.222 kW, and 11.178 kW, respectively. The maximum daily loads of the total load of the ten households appear as local maximums on 10 March and 20 March, and the corresponding values equal 19.674 kW and 19.866 kW. The maximum daily loads of the total load contain specific periodic components except for some fluctuations between 6 March and 16 March. The maximum daily loads of the total load of the twenty households have more stable regularity. More precisely, the first local minimum of the maximum daily loads of the total load appears on 6 March and equals 12.226 kW. Then, the maximum daily load of the total load increases to the first local maximum value, occurring on 10 March, which equals 23.728 kW. Subsequently, the maximum daily load of the total load decreases to the second local minimum value, occurring on 14 March, which equals 14.462 kW. The maximum daily loads of the total load vary with the same periodic pattern in the following days.
Moreover, the maximum value of the maximum daily loads for the total load of the ten households in March 2013 equals 20 kW, while the maximum value of the maximum daily loads for the total load of the twenty households only equals 25 kW. In the meantime, the maximum value of the maximum daily loads for a single household with number 10006704 among these selected twenty households unexpectedly reaches 7 kW on 6 March. Therefore, the maximum daily load does not increase linearly with the increase in households in a small area during a period of time. The optimal aggregation of residential loads would smooth the electric load within a small enough region to ensure the flexibility of the energy management.
As shown in Figure 4b, the mean daily loads of the total load of the ten households fluctuate within the range of (5 kW, 6.5 kW), and some outlier data exist in the curve. For example, the mean daily load of the total load of the ten households on 24 March equals 7.1 kW. For the twenty households, the mean daily loads of the total load vary within the range of (8 kW, 11 kW), and only one point lies outside this range, which appears on 2 March and equals 12.1 kW. The mean daily load becomes more regular as the total number of houses increases.
The mean daily loading rates of the total load of the selected ten households are analyzed in March 2013, and the results show that these rates are in the range of (0.3, 0.55). The results also show that the mean daily loading rates of the total load of the selected twenty households are in the range of (0.4, 0.65). In contrast with these aggregated loads, the mean daily loading rates of a single household are located in the range of (0.1, 0.3), which is discussed in Section 2.2.2. This comparison demonstrates that the total load tends towards stability as the total number of houses increases. The analysis results of the daily volatility indexes also support this conclusion. Compared with the daily volatility indexes of a single household within the range of (0.15, 0.25), the daily volatility indexes of the total load of the ten selected households are in the range of (0.16, 0.24), and the indexes of the twenty selected households are in the range of (0.15, 0.23). This result shows that the daily volatility indexes reduce slightly when the total number of households increases.
A comparison of the quantitative indicators of a single household given in Table 1 with those of the aggregated load given in Table 2 shows that the residential load of a single household has high volatility, strong randomness, and weak regularity. With the increase in the total number of households, the total aggregated load becomes more and more stable, and the cyclical pattern of the aggregated load becomes more and more distinct. Therefore, in our proposed short-term residential load forecasting method, the optimal number of households would be selected, and the total aggregated residential load of the selected households is used separately for prediction.
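The smoothing effect of aggregation described above can be illustrated with a small Python sketch. The three household profiles below are synthetic and purely hypothetical (they are not taken from the SGSC data); the sketch only shows why the peak of an aggregated load grows more slowly than the sum of the individual peaks when the peaks occur at different times.

```python
def aggregate(profiles):
    """Sum equally sampled household load profiles point by point."""
    return [sum(values) for values in zip(*profiles)]


def loading_rate(profile):
    """Mean daily loading rate: mean load divided by maximum load."""
    return (sum(profile) / len(profile)) / max(profile)


def household(peak_slot, base=0.2, peak=2.0, width=4, slots=48):
    """Hypothetical daily profile in kW: a flat base load with one peak."""
    return [peak if peak_slot <= t < peak_slot + width else base
            for t in range(slots)]


# Three hypothetical households whose peaks fall at different times of day.
homes = [household(14), household(24), household(38)]
total = aggregate(homes)

sum_of_peaks = sum(max(h) for h in homes)  # 6.0 kW
aggregated_peak = max(total)               # about 2.4 kW
single_rate = loading_rate(homes[0])       # about 0.175
aggregated_rate = loading_rate(total)      # about 0.4375
```

Because the individual peaks do not coincide, the aggregated peak stays well below the sum of the individual peaks, and the mean daily loading rate of the aggregated load is higher than that of any single synthetic household.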

Basic Principle of Our Proposed Method
Short-term residential electric load forecasting is used to predict the power consumption in the next hours. In our proposed method, the aggregated load of multiple households is used instead of that of a single household. The total number of households is the minimum number for which the short-term prediction result of the corresponding aggregated load meets the precision requirements. The detailed residential load prediction process of our proposed method is shown in Figure 5. First, the mean daily loading rate, γ, and the minimum daily loading rate, β, are extracted from the raw residential load data. Second, the historical daily load of each household is clustered by the ordering points to identify the clustering structure (OPTICS) algorithm, using indexes γ and β. Third, the residential load is aggregated adaptively according to the classification results of all households. The basic principle of household classification is that the households with the same number of clusters for historical data are classified into one category. Additionally, the households with a number of clusters greater than two are all classified into one category. The aggregated load data and the corresponding time-related features are used as the input parameters of the long short-term memory (LSTM)-based forecasting model. Finally, the total predicted load is obtained from the forecasting results of all selected and aggregated loads.
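To make the model input more concrete, the following Python sketch shows one possible way to build training samples from an aggregated half-hourly load series. The function name `make_samples`, the lookback length, and the sine and cosine encoding of the time of day are assumptions chosen for illustration; the paper's actual encoding of time-related features is not reproduced here. The sketch also assumes that the series starts at midnight.

```python
import math


def make_samples(load, lookback=48, slots_per_day=48):
    """Build (features, target) pairs for one-step-ahead forecasting.

    load is an aggregated load series sampled every 30 min, assumed to
    start at midnight. Each feature vector holds the previous `lookback`
    load values followed by a cyclic encoding of the target time slot.
    """
    samples = []
    for t in range(lookback, len(load)):
        slot = t % slots_per_day
        angle = 2.0 * math.pi * slot / slots_per_day
        # ASSUMPTION: time of day encoded as a point on the unit circle.
        time_features = [math.sin(angle), math.cos(angle)]
        features = list(load[t - lookback:t]) + time_features
        samples.append((features, load[t]))
    return samples


# Hypothetical series of 100 half-hourly values.
demo_load = [float(v) for v in range(100)]
demo_samples = make_samples(demo_load, lookback=4)
```

Each target is the load in the next half hour after its lookback window, so the same pairs could be fed to any sequence model, including an LSTM.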

Optimal Number of Total Aggregated Households
As discussed in Section 2.2, the pattern of the total electric load becomes more and more stable with the increase in the total number of aggregated households. The total aggregated residential loads would smooth the electric load under a small enough region to ensure the flexibility of the energy management. The randomness and fluctuation of the residential load are determined by the human life routine and home appliances; thus, it is related to the city in which these households are located.
In this paper, we use a typical LSTM-based load forecasting method to identify the optimal number of total aggregated households. The relationship between the prediction MAPE and the number of total aggregated households can be obtained through a great deal of load prediction processes under different numbers of randomly selected households. The optimal number is identified as the minimum number of households when the MAPE meets the requirement of the microgrid dispatch.
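The selection procedure described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: a seasonal naive forecaster (repeat the value observed one day earlier) stands in for the LSTM-based model, and the names `mape`, `mape_for_group`, and `smallest_group_size`, as well as the number of random trials, are hypothetical.

```python
import random


def mape(actual, predicted):
    """Mean absolute percentage error in percent (actual values must be non-zero)."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def mape_for_group(profiles, period):
    """MAPE of a stand-in forecaster on the aggregated load of one group."""
    total = [sum(values) for values in zip(*profiles)]
    # ASSUMPTION: seasonal naive forecast replaces the LSTM model here;
    # each value is predicted by the value one period earlier.
    predicted = total[:-period]
    return mape(total[period:], predicted)


def smallest_group_size(households, candidate_sizes, target_mape,
                        period=48, trials=20, seed=0):
    """Return the smallest candidate size whose average MAPE meets the target."""
    rng = random.Random(seed)
    for n in sorted(candidate_sizes):
        scores = [mape_for_group(rng.sample(households, n), period)
                  for _ in range(trials)]
        if sum(scores) / trials <= target_mape:
            return n
    return None  # no candidate size meets the requirement


# Hypothetical, perfectly periodic households (two days of 30 min data).
periodic = [[1.0 + (t % 48) / 48 for t in range(96)] for _ in range(10)]
```

Averaging over several random draws for each candidate size reflects the idea of repeating the prediction for many randomly selected household groups before reading off the smallest acceptable number.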

Adaptive Density-Based Spatial Clustering Algorithm for the Residential Load
The analysis results in Section 2.2 show that obvious differences exist in the power consumption patterns of different households. To enhance the regularity of the aggregated residential load further, the households with similar patterns in the optimally selected households will be clustered as one group. Then, the corresponding load of each group is used separately for prediction. In our proposed method, the OPTICS algorithm will be used to identify households with similar patterns.
Typical clustering algorithms include K-means, density-based spatial clustering of applications with noise (DBSCAN), and OPTICS. The comparison results among these algorithms are given in Table 3.

Table 3. Comparison results among different clustering algorithms.

| Algorithm | Principle | Advantages | Disadvantages |
|---|---|---|---|
| K-means | The sample set is divided into K clusters according to the distance between the samples and core points of clusters. | It has low computational complexity, fast convergence, and strong interpretability. | a. The number of clusters, K, needs to be preset; b. It is difficult to converge when the algorithm is applied to non-convex datasets; c. It is sensitive to noise samples. |
| DBSCAN | It relies on a density-based notion of clusters. | a. It is suitable for discovering clusters of arbitrary shape; b. It is not sensitive to noise samples. | a. The clustering quality is poor when the density of the sample distribution is not uniform; b. Two parameters, the reachable distance threshold and the threshold on the number of samples in a cluster, need to be preset. |
| OPTICS | It is an extended DBSCAN algorithm for an infinite number of distance parameters. | It does not limit us to one global parameter setting as in traditional density-based clustering algorithms. | The time complexity of this algorithm increases slightly. |

Due to the high volatility and strong randomness of residential load, it is not possible to identify the clustering number of the selected households in advance. In the meantime, some daily loads are irregular and should be regarded as outliers. Although there are some extensions of the K-means algorithm that select a proper cluster number or remove the outliers automatically [42,43], the improved K-means algorithms would be too complicated to find a proper cluster number and remove the outliers simultaneously. Therefore, the K-means algorithm is not suitable for our proposed method. Two parameters of the DBSCAN algorithm directly affect the reasonability of the clustering results, namely the distance from a neighborhood point to a defined core point and the minimum number of samples in a cluster, but there is no established rule for selecting these two parameters. Improperly selected parameters would significantly reduce the effectiveness of the DBSCAN algorithm. In contrast, a variable neighborhood radius is used in the OPTICS algorithm to avoid the influence of improper parameters on the clustering result. The samples can be clustered adaptively based on the distribution density. In this paper, the OPTICS algorithm is used to realize load clustering.
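The density-based idea behind OPTICS can be sketched in Python as follows. This is a simplified, quadratic-time illustration of the standard OPTICS ordering and of a threshold-based cluster extraction, not the exact Algorithms 1 and 2 of this paper. It assumes a Euclidean distance between (γ, β) points; the function names, `min_pts`, and the extraction threshold `eps_cut` are hypothetical choices.

```python
import math


def optics_order(points, min_pts, max_eps=math.inf):
    """Return (order, reachability, core distances); requires min_pts >= 2."""
    n = len(points)
    reach = [math.inf] * n
    core = [None] * n
    processed = [False] * n
    order = []

    def dist(a, b):
        return math.dist(points[a], points[b])  # ASSUMPTION: Euclidean distance

    def visit(i, seeds):
        """Mark i as processed and relax the reachability of its neighbours."""
        processed[i] = True
        order.append(i)
        nbrs = [j for j in range(n) if j != i and dist(i, j) <= max_eps]
        if len(nbrs) < min_pts - 1:
            return  # i is not a core object
        # The point itself counts towards min_pts.
        core[i] = sorted(dist(i, j) for j in nbrs)[min_pts - 2]
        for j in nbrs:
            if not processed[j]:
                new_reach = max(core[i], dist(i, j))
                if new_reach < reach[j]:
                    reach[j] = new_reach
                    seeds[j] = new_reach

    for start in range(n):
        if processed[start]:
            continue
        seeds = {}
        visit(start, seeds)
        while seeds:
            q = min(seeds, key=seeds.get)  # smallest reachability first
            del seeds[q]
            visit(q, seeds)
    return order, reach, core


def extract_clusters(order, reach, core, eps_cut):
    """Cut the reachability ordering at eps_cut; label -1 marks outliers."""
    labels = [-1] * len(order)
    cluster = -1
    for i in order:
        if reach[i] > eps_cut:
            if core[i] is not None and core[i] <= eps_cut:
                cluster += 1
                labels[i] = cluster
        else:
            labels[i] = cluster
    return labels


# Hypothetical (gamma, beta) pairs of nine historical days of one household.
days = [(0.20, 0.05), (0.21, 0.05), (0.20, 0.06), (0.22, 0.06),
        (0.40, 0.15), (0.41, 0.15), (0.40, 0.16), (0.42, 0.16),
        (0.80, 0.50)]
order, reach, core = optics_order(days, min_pts=3)
labels = extract_clusters(order, reach, core, eps_cut=0.05)
```

In this hypothetical example, the first four days and the next four days form two clusters, and the last day is treated as an outlier, so the number of clusters for this household would be two.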
The detailed residential load clustering process is given as follows.
Step 1: The quantitative indicators of all residential electric loads in the objective area are analyzed and used for the distance calculation in the clustering algorithm. In our proposed method, the mean daily loading rate, γ, and the minimum daily loading rate, β, are selected as the key parameters for load clustering. These two parameters of the ith historical daily load of the kth household are represented by $\gamma_{k,i}$ and $\beta_{k,i}$, respectively. The distance between the quantitative indexes of the ith day and the jth day of the kth household is represented by $d_{k,i,j}$ and is computed from these two indicators.
Step 2: The historical loads of the kth household in the past D days are clustered by the OPTICS algorithm. The detailed process, which includes Algorithms 1 and 2, is illustrated as follows. These historical loads of the kth household are clustered into $NC_k$ classes. The pth class includes $N_p$ days, which are denoted as $D_{k,p,1}, D_{k,p,2}, \ldots, D_{k,p,N_p}$.
And put the elements into seeds queue P.
5 If P = ∅, then jump to line 1 and move to the next element.
6 If P ≠ ∅, foreach item q ∈ P, mark item q and put it into results queue M.
7 If q ∈ Ω, the unmarked elements belonging to the neighborhood area of q are put into seeds queue P. And calculate the reachable distance of any elements belonging to queue P.
8 If q ∉ Ω, do nothing
9 end
10 end
Step 3: All households in the region are aggregated into several groups based on the clustering results in step 2. The detailed process is given as follows.
Step 3.1: If the number of clusters for the kth household equals that for the jth household, the two households are aggregated into one group. In this way, all households are divided into NH classes, and the number of households in the mth class is denoted by $N_m$.
Step 3.2: If the number of households in the mth class exceeds the threshold $N_{set}$, the mth class is divided again into smaller groups. The number of clusters for any household in the mth class is denoted by U. For any household b, the cluster ids are sorted in descending order of the number of days contained in each cluster. The historical days in the qth cluster of household b are represented by the set N(b,q), whose elements are $D_{b,q,1}, D_{b,q,2}, \ldots, D_{b,q,N_q}$. Household x and household y in the mth class are aggregated into one group if the following criteria are met.
(3) $|N(x,2) \cap N(y,2)| \ge k_4 \times |N(x,2) \cup N(y,2)|$. Generally, $k_4$ is set as 0.15.
According to the above three steps, residential loads in the predicted area can be clustered and aggregated adaptively. The results are used as the input of the LSTM network to realize short-term residential load forecasting.
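Criterion (3) compares the days shared by the second-largest clusters of two households with the days covered by either of them. A minimal sketch of this check is given below; the function name and the example day sets are hypothetical, and only the threshold of 0.15 is taken from the text.

```python
def meets_criterion_3(days_x, days_y, k4=0.15):
    """Check |N(x,2) ∩ N(y,2)| >= k4 * |N(x,2) ∪ N(y,2)| for two sets of day indices."""
    shared = len(days_x & days_y)      # days that fall in both clusters
    combined = len(days_x | days_y)    # days that fall in at least one cluster
    return shared >= k4 * combined

# Hypothetical day indices of the second-largest cluster of three households.
n_x2 = {1, 2, 3, 8, 9, 10, 15}
n_y2 = {3, 9, 16, 17, 22, 23}
n_z2 = {4, 5, 11, 12, 18, 19}

same_group_xy = meets_criterion_3(n_x2, n_y2)   # 2 shared days out of 11
same_group_xz = meets_criterion_3(n_x2, n_z2)   # no shared days
```

In this example, households x and y share 2 of 11 days (about 18%) and pass the check, whereas x and z share no days and fail it.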

LSTM-Based Short-Term Data Prediction for Residential Load
LSTM is a kind of recurrent neural network (RNN). In LSTM, a cell state is added to the hidden layer, and the forget gate is used to update the cell state. The cell state can therefore identify which signals should be discarded and which should be retained in the next step. This characteristic helps to maintain dependencies over long time series while mitigating the vanishing- and exploding-gradient problems. The inherent ability to retain regularities in historical data can improve the prediction accuracy. Therefore, LSTM is selected as the deep learning algorithm to predict the short-term load in our proposed method.
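For reference, the gate mechanism described above can be written in the commonly used textbook form of the LSTM cell. The symbols are assumptions introduced here for illustration and do not come from this paper: $x_t$ is the input at time t, $h_{t-1}$ the previous cell output, $c_t$ the cell state, $\sigma$ the logistic function, and $\odot$ the element-wise product.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{(forget gate)}\\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{(input gate)}\\
\tilde{c}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) && \text{(candidate state)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state update)}\\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{(output gate)}\\
h_t &= o_t \odot \tanh(c_t) && \text{(cell output)}
\end{aligned}
```

The additive update of $c_t$ is what allows information to be retained over many time steps: when $f_t$ is close to 1, the previous cell state passes through almost unchanged.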
The LSTM-based short-term data prediction process for residential load is shown in Figure 6. In our method, a proper look-back time step k is defined to select the length of the historical load time series used as the input data. In the meantime, time-related feature data are also extracted and used to form the input matrix. … day is encoded into t/dx according to the load sampling interval dx; (c) the pattern of historical data is related to the human daily routine, which is usually closely linked to the day of the week. Hence, the day of the week of each historical sample is encoded as an integer from 0 to 6.
The input matrix is preprocessed by min-max normalization to remove the influence of the different dimensions of the data and to improve the convergence rate. The constructed input matrix $X \in \mathbb{R}^{3 \times k}$ is then used as the input data of our proposed LSTM-based forecasting model. The constructed load forecasting LSTM network consists of one input layer, two hidden layers, and one output layer. The input layer contains three cells, and each hidden layer contains twenty memory cells. Each memory cell is a self-recurrent unit whose state is carried across the k look-back time steps. The input vector M of a memory cell consists of the output of the input layer at time t and the output of this memory cell at the previous time step.
The residential load at the same time on similar days, as well as the load in the preceding time window, is relevant to the load at the prediction time on the prediction day. Therefore, the look-back time steps should be set as an integer multiple of the number of samples in a whole day, 24 h/dx.
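The construction of the 3 × k input matrix can be sketched as follows. This is an illustration under stated assumptions, not the preprocessing code of this paper: the load series is synthetic, the starting weekday code is arbitrary, each feature is normalized over the whole series before the window is cut, and the function names are hypothetical.

```python
DX_MINUTES = 30                            # sampling interval dx
SAMPLES_PER_DAY = 24 * 60 // DX_MINUTES    # 24 h / dx = 48
K = 1 * SAMPLES_PER_DAY                    # look-back steps: an integer multiple of 48

def min_max(values):
    """Scale a sequence to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]

def input_matrix(load, start_weekday, end, k=K):
    """Build the 3 x k input for predicting sample `end` from samples end-k .. end-1."""
    n = len(load)
    slot = [i % SAMPLES_PER_DAY for i in range(n)]    # time of day encoded as t/dx
    weekday = [(start_weekday + i // SAMPLES_PER_DAY) % 7 for i in range(n)]  # 0 to 6
    rows = [min_max(load), min_max(slot), min_max(weekday)]
    return [row[end - k:end] for row in rows]

# Synthetic three-day load series (kW), for illustration only.
load = [0.3 + 0.2 * ((i * 7) % 11) / 10 for i in range(3 * SAMPLES_PER_DAY)]
X = input_matrix(load, start_weekday=4, end=len(load))
```

With dx = 30 min, one day contains 48 samples, so k = 48 lets the window reach back to the load at the same time on the previous day.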

Experimental Datasets and Criteria in the Proposed Load Prediction Process
In this paper, the residential load dataset from the SGSC project is used in our short-term residential load prediction. The recorded residential loads of 50, 100, 150, and 200 randomly selected households are used to verify our proposed method. The corresponding period of time is from 1 March 2013 to 31 March 2013. These load data are divided into two subsets, the training set and the test set. The training set consists of the 1152 samples from 1 March to 24 March, which are used to train the constructed LSTM network. The remaining 336 load samples, from 25 March to 31 March, are used to test our proposed short-term load forecasting method.
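The sample counts of the two subsets follow directly from the date ranges and the 30 min sampling interval of the dataset (48 samples per day). A short check, with a hypothetical helper name:

```python
from datetime import date

SAMPLES_PER_DAY = 48  # 30-minute sampling interval

def n_samples(first_day, last_day):
    """Number of load samples between two dates, both inclusive."""
    return ((last_day - first_day).days + 1) * SAMPLES_PER_DAY

train = n_samples(date(2013, 3, 1), date(2013, 3, 24))    # 24 days
test = n_samples(date(2013, 3, 25), date(2013, 3, 31))    # 7 days
```

This gives 24 × 48 = 1152 training samples and 7 × 48 = 336 test samples, matching the figures above.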
In the forecasting method, several parameters, especially the hyperparameters of the LSTM network, are predefined based on our experience in load forecasting.
a) Thirty-one historical days are used in our experimental datasets; hence, the minimum number of points in a neighborhood, MinPts, is set as five in the OPTICS clustering algorithm.
b) The short-term load forecasting result is used for microgrid power dispatch; hence, high computational efficiency is needed in our scenario. The learning rate is initially set as 0.01, with an Adam optimizer, to reduce the LSTM network learning time. In the meantime, the number of iterations is set as 150 to avoid continuous oscillation. To effectively evaluate the load forecasting result, the MAPE of the predicted load is used in the cost function.
c) The sampling time interval of the historical load data is 30 min; hence, there are 48 samples in a whole day. The look-back time step of the LSTM network is set as 48, so that both the load at the same time of the previous day and the load just before the prediction time can be used to predict the load value.
d) The rolling load prediction strategy is adopted for the short-term residential load forecasting in this paper. In our following experiments, the constructed LSTM network outputs one prediction result after each prediction process without loss of generality.
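The MAPE criterion and the rolling prediction strategy can be sketched as follows. Two assumptions are made here: the measured value is appended to the history after each step, and the LSTM network is replaced by a simple placeholder predictor that repeats the load from 48 steps earlier. All function names are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error in percent (actual values must be non-zero)."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

def rolling_forecast(history, actual_day, predict_one, k=48):
    """Predict one step at a time; each measured value then joins the history."""
    series = list(history)
    predictions = []
    for observed in actual_day:
        predictions.append(predict_one(series[-k:]))  # one output per prediction process
        series.append(observed)
    return predictions

def same_slot_yesterday(window):
    """Placeholder predictor (not the LSTM): with k = 48, window[0] is the previous day's value."""
    return window[0]

# Hypothetical, perfectly periodic load: the placeholder is then exact.
pattern = [1.0 + 0.1 * s for s in range(48)]
predictions = rolling_forecast(pattern * 2, pattern, same_slot_yesterday)
daily_mape = mape(pattern, predictions)
```

Any one-step model with the same interface, including a trained LSTM network, could be passed in place of `same_slot_yesterday`.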

Short-Term Residential Load Forecasting Results of a Single Household
The short-term load forecasting tests are carried out for each household by our proposed LSTM-based forecasting method. The MAPE values of 48 forecasting results on 31 March are calculated for each of the 200 selected households. The distribution of the corresponding 200 MAPE values is shown in Figure 7.
As shown in Figure 7, the MAPE values on 31 March for the selected 200 households are grouped into twelve ranges. The vertical axis represents the number of households within the corresponding range. Only one MAPE value is below 10% among these 200 households, while the MAPE values of all other households are greater than 10%. Notably, ten MAPE values are even greater than 200%. The vast majority of individual forecasting errors are greater than the average MAPE of the 200 households, which equals 74.6%. This result shows that the forecasting error of an individual household has high variability, and the short-term load prediction accuracy of a single household cannot meet the requirements of home energy management.

Results of Residential Load Clustering
The minimum daily loading rate and the mean daily loading rate mentioned in Section 2.2 are used as the key parameters of daily load clustering for individual customers. The clustering results of the daily load curves of the selected 200 households are obtained by the OPTICS algorithm. The total number of clusters for each household is then used for the adaptive load aggregation. The clustering results for some selected households in March 2013 are shown in Figure 8 and detailed in Table 4.
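As an illustration of the two clustering indicators, the sketch below assumes the usual definitions, namely mean load divided by peak load for the mean daily loading rate and minimum load divided by peak load for the minimum daily loading rate, together with a Euclidean distance between the indicator pairs of two days. These definitions and the function names are assumptions and may differ from those in Section 2.2.

```python
import math

def loading_rates(day_load):
    """Return (gamma, beta): mean and minimum daily loading rates of one day (assumed definitions)."""
    peak = max(day_load)
    gamma = (sum(day_load) / len(day_load)) / peak   # mean load / peak load
    beta = min(day_load) / peak                      # minimum load / peak load
    return gamma, beta

def indicator_distance(day_i, day_j):
    """Euclidean distance between the (gamma, beta) pairs of two days (assumed metric)."""
    return math.dist(loading_rates(day_i), loading_rates(day_j))

# Hypothetical four-sample days (kW); the second is the first one doubled.
day_a = [1.0, 2.0, 4.0, 1.0]
day_b = [2.0, 4.0, 8.0, 2.0]
flat_day = [4.0, 4.0, 4.0, 4.0]
```

Under these definitions both indicators are ratios, so two days with the same shape but different magnitudes (`day_a` and `day_b`) have zero distance, while a flat day lies far from both.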
As shown in Figure 8, the average series of each cluster is plotted with bold lines. Figure 8a shows the clustering results of the historical load curves of customer number 10006704. The total number of clusters equals 1. This means that the customer has only one pattern of power consumption, and the peak electricity consumption is concentrated in the morning and evening hours. Figure 8b shows that the total number of clusters equals 2 for customer number 10006414. This indicates that there are two main forms of electricity consumption for this household. In one pattern, the peaks are concentrated in the morning and evening hours. In the other pattern, electricity is consumed throughout the whole day. Figure 8c,d represent the scenarios where the total number of clusters is 3 and 4, respectively. Figure 8c reflects the electricity consumption characteristics of residents with little electricity consumption before 10:00 a.m. and with three main electricity consumption behaviors, while Figure 8d indicates that the household has four electricity consumption behaviors. This indicates that the residents in these two scenarios have more regular electricity consumption.
As given in Table 4, when the total number of clusters equals 1, the number of outliers is 21, and when the number of clusters is 2, the number of outliers is 18. The outliers thus account for roughly 2/3 of the total days, which shows the poor regularity of electricity consumption for these households. When the number of clusters equals 3 or 4, the number of outliers is less than 1/3 of the total days, and the days are evenly distributed among the clusters. This indicates that the electricity load has a strong regularity, and also reflects the reasonableness of aggregating households with three or more clusters into one category, because they represent households with regular electricity consumption. The clustering results of fifty randomly selected households show that the users with two clusters reach 50%, while the users with four clusters only account for a small percentage, 2%. This result also verifies that it is reasonable to aggregate households with three or more clusters into one category.
Therefore, in this paper, households with the same number of clusters are aggregated into one category when the number of clusters is below three, and all households with three or more clusters are aggregated into one category. Groups of 50, 100, 150, and 200 customers are randomly selected, clustering analysis is performed on their historical load data, and the final categories of the households are shown in Table 5. It can be seen from Table 5 that category 1 always accounts for about 1/5 of the total number of households, and category 2 always accounts for about 1/2. This indicates that the vast majority of households have certain electricity consumption patterns. Hence, it is necessary to forecast the load of the households with poor and strong electricity consumption patterns separately in our proposed method.
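The category rule can be sketched in a few lines. The sketch reads the rule as "three or more clusters share one category", which is an interpretation suggested by the discussion of households with 3 or 4 clusters; the household ids and cluster counts are hypothetical.

```python
from collections import Counter

def category(n_clusters):
    """One or two clusters keep their own category; three or more share category 3 (assumed reading)."""
    return min(n_clusters, 3)

# Hypothetical cluster counts per household, e.g. as produced by OPTICS.
cluster_counts = {"A": 1, "B": 2, "C": 2, "D": 3, "E": 4, "F": 2}
category_sizes = Counter(category(n) for n in cluster_counts.values())
```

In this toy example, half of the households fall into category 2, and the households with 3 and 4 clusters are merged into category 3.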

Results of Short-Term Residential Load Forecasting
Based on the aggregation results of 50, 100, 150, and 200 households, load forecasting was performed separately for each aggregation category, and the MAPEs of the summed forecasts are given in Table 6. As given in Table 6, the forecasting errors of both the total load prediction and the adaptive aggregated load prediction decrease sharply as the total number of households increases. In the meantime, the MAPE of the adaptive aggregated load prediction is always lower than that of the total load prediction from 50 to 200 households. This result confirms the effectiveness of the proposed method. With our proposed method, the load of 50 households is predicted with a MAPE of 14.3% and that of 100 households with a MAPE of 10.2%. When the number of households reaches 150, the MAPEs of both the traditional method and our proposed method are below 10%, which meets the load forecasting accuracy requirement for dispatching a microgrid. In general, for 150 households, the load forecasting results achieve the accuracy requirement, and the number of households is not too large. Therefore, 150 would be an ideal number of households to construct a microgrid.
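The difference between the two prediction processes lies in where the summation happens: in the adaptive aggregated prediction, each category is forecasted on its own and the forecasts are then summed. A minimal sketch with a toy day of four samples is given below; the category loads are hypothetical, and a repeat-the-last-day placeholder stands in for the LSTM network.

```python
def forecast_sum(category_histories, predict_day):
    """Forecast each category separately, then sum the forecasts slot by slot."""
    forecasts = [predict_day(history) for history in category_histories.values()]
    return [sum(slot_values) for slot_values in zip(*forecasts)]

def repeat_last_day(history, samples_per_day=4):
    """Placeholder predictor (not the LSTM): repeat the most recent day."""
    return history[-samples_per_day:]

# Hypothetical aggregated loads (kW) of the three categories over one toy day.
histories = {
    1: [2.0, 3.0, 5.0, 4.0],
    2: [6.0, 7.0, 9.0, 8.0],
    3: [1.0, 1.0, 2.0, 2.0],
}
total_forecast = forecast_sum(histories, repeat_last_day)  # [9.0, 11.0, 16.0, 14.0]
```

The MAPE is then evaluated on `total_forecast` against the measured total load, so both processes are compared on the same quantity.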
The detailed forecasting results at the 48 points on 31 March for 150 and 200 households are shown in Figure 9. The forecasts of our proposed method fit the measured load significantly better than those of the traditional methods, which verifies the advantage of our proposed method. The prediction accuracy for 200 households is significantly better than that for 150 households, which agrees with the analysis above. In general, the prediction accuracy for 150 households meets the accuracy requirement for microgrid construction.
To further verify the effectiveness of our proposed method, the data from 1 March to 24 March, totaling 1152 samples, are selected to train the constructed LSTM network. Additionally, 336 load data samples from 25 March to 31 March are used to test our proposed short-term load forecasting method. The forecast results for 150 households are shown in Figure 10.
As shown in Figure 10, the short-term load forecasting of our proposed method is better than that of the traditional methods throughout the whole week. The average MAPE of our proposed method during the week is 8.2%, while the average MAPE of the traditional method is 8.9%. The forecasting results in this paper confirm the effectiveness of the proposed method. However, the forecasts at the peaks and valleys of the load are still unsatisfactory. The reason is that various hyperparameters are not optimized in our method. This problem will be addressed in our future work.

Sensitivity of Look-Back Time Steps of the LSTM Network
In our proposed method, the look-back time step, k, is selected as 48 to reveal the relationship between the historical load data and the predicted load. In this section, different look-back time steps are selected to analyze the load forecasting accuracy. The MAPEs of the prediction results on 31 March for 6 and 48 time steps in the LSTM network are given in Table 7. As given in Table 7, the MAPE of the prediction results on 31 March changes with the selected look-back time step. For 50 households, the MAPE is 14.3% when the time step equals 48, while it reaches 18.3% when the time step equals 6. The MAPE for the 48 look-back time steps is thus lower than that for the 6 look-back time steps. Similar conclusions can be drawn for 100, 150, and 200 households. The reason is that the residential load at the prediction time is usually related to the load at the same time on the previous day.

Comparison with Traditional Methods
Two other traditional prediction methods, SVR-based and BPNN-based load forecasting, are compared with our proposed method. The parameter settings for these two traditional load forecasting methods are given in Table 8. The comparison results with the traditional methods for the 150 selected households are given in Table 9. The MAPEs of the short-term load forecasting results on 31 March for the three methods are calculated and given in this table. To verify the advantages of our proposed method clearly, the load forecasting results are obtained based on two load processing methods. In the first prediction process, the total load data are directly used as the input of the artificial intelligence algorithms. In the second prediction process, the aggregated load data in Section 4.3 are used separately as the input of the forecasting model. The MAPE values in Table 9 represent the errors of the final prediction obtained by summing the aggregated load forecasting results.
The comparison results show that our proposed method has the best load forecasting results whether the residential load is aggregated or not. When the total load is forecasted directly, the MAPE of our proposed method equals only 9.1%, which is lower than that of the traditional methods. When the aggregated load is forecasted separately, the MAPE of our proposed method equals 8.3%, while the MAPE of the SVR-based method equals 11.2% and the MAPE of the BPNN-based method equals 10.2%. In all cases, our proposed method achieves the lowest MAPE value.
The load forecasting results of our LSTM-based method and the BPNN-based method vary between runs. The reason is the random initialization of the weights of the trainable layers or parameters in the artificial intelligence models. We therefore use the average of several runs in the comparison. The constructed LSTM and BPNN models are each run fifty times. In each run, the aggregated load data in Section 4.3 are used separately as the input of the forecasting model. The MAPE values represent the errors of the final 48 prediction results obtained by summing the aggregated load forecasting results. The average and variance of the MAPE values for the LSTM-based method are 8.19% and $6.13 \times 10^{-6}$, respectively, while the average and variance of the MAPE values for the BPNN-based method are 10.12% and $2.56 \times 10^{-5}$, respectively. A Student's t-test showed that the difference was statistically significant, with t = −24.22 and p < 0.001. Therefore, the results of our method are better than those of the traditional algorithms.
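The reported t statistic can be reproduced from the summary statistics alone. The sketch below assumes that the variances are expressed with the MAPE as a fraction (8.19% as 0.0819), that they are sample variances, and that both methods were run fifty times; with equal sample sizes, the pooled and unpooled forms of the statistic coincide.

```python
import math

def two_sample_t(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Two-sample t statistic computed from means, sample variances, and sample sizes."""
    return (mean_a - mean_b) / math.sqrt(var_a / n_a + var_b / n_b)

# LSTM-based runs versus BPNN-based runs, MAPE expressed as a fraction (assumed units).
t_stat = two_sample_t(0.0819, 6.13e-6, 50, 0.1012, 2.56e-5, 50)
```

Under these assumptions the statistic is about −24.2, in line with the reported t = −24.22; the small difference is attributable to the rounding of the reported means and variances.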
We have recorded the computational time of our proposed method and the traditional methods to make the comparison more comprehensive. Each run of the residential load forecasting includes the training process and 48 prediction processes. The program runs on an NVIDIA GeForce GTX 1650 GPU (graphics processing unit). The results show that the computational time of our proposed method is around ten minutes, while the computational time of the BPNN-based and SVR-based algorithms is around one second. The training time of the LSTM-based model is much longer than that of the traditional methods due to its more complicated structure. It is worth noting that the computational time of our proposed method is still much shorter than half an hour. Therefore, our proposed method is fast enough to be used for hourly load forecasting in microgrid dispatching.

Conclusions
A novel short-term residential electric load forecasting method based on adaptive load aggregation and deep learning algorithms is proposed and discussed in this paper. An adaptive load aggregation method is proposed based on the number of clusters of the historical load data of each household. Households with the same number of clusters are aggregated into one category when the number of clusters is below three, and all households with three or more clusters are aggregated into one category. The LSTM-based network with proper look-back time steps is used to forecast the total aggregated load of each category. The look-back time steps are set as the ratio of 24 h to the load sampling interval. In this way, both the load at the same time on the previous day and the load before the prediction time are taken into account, because the LSTM network performs well at storing and accessing long-term information. Experiments using the monitored load data from the SGSC project show that 150 households are a proper scale to construct a microgrid, because the corresponding MAPE of the load prediction for 150 households is less than 10% and meets the requirement of microgrid dispatch. Our proposed method can significantly improve the load forecasting accuracy for residential load with high volatility, strong randomness, and weak regularity, which is very important in microgrid planning and operation.