Electric Heating Load Forecasting Method Based on Improved Thermal Comfort Model and LSTM

Abstract: The accuracy of electric heating load forecasting, for what is a new type of load, is closely related to the safe and stable operation of the distribution network, and it also has major implications for the architecture of the distribution network. Firstly, a thermal comfort model of the human body was established to analyze the comfortable temperature of the main population under different temperatures and humidity levels. Secondly, the factors influencing electric heating load were analyzed and, from the perspective of meteorological factors, the difference between human thermal comfort temperature and actual temperature, together with humidity, was selected by gray correlation analysis. Finally, an attention mechanism was used to improve the precision of the combined model, and the predicted electric heating load was obtained. For verification, measured electric heating load data from an area of eastern Inner Mongolia were used. The results showed that, with an input vector built from the most relevant factors such as temperature and human thermal comfort, the LSTM network can predict the electric heating load accurately.


Introduction
Electric heating is a clean, efficient, and flexible form of heating. In recent years, coal-fired heating has been gradually replaced by electric heating in northern China. In order to control urban haze pollution and improve residents' quality of life, the relevant state departments have launched the policies of "electricity instead of coal" and "electricity instead of oil" [1]. These policies promote the gradual replacement of polluting energy by clean energy and greatly reduce pollutant emissions. As residents' requirements for indoor comfort continue to rise, the scale of electric heating in winter is increasing year by year, and electric heating is used more and more frequently. Meanwhile, the daily maximum load in winter is also increasing.
Electric heating equipment can be divided into centralized (direct heating electric boiler, regenerative electric boiler, etc.) and distributed (heating cable, electric heating film, carbon crystal heating, etc.). Because electric heating produces no polluting gas or noise in operation, it is clean and environmentally friendly. The typical characteristics of electric heating are high power, concentrated load, a tendency to produce peak load, and a large peak-valley difference, and thus it has a great impact on distribution lines [2]. Therefore, accurate forecasting of electric heating load has great practical significance.
The influence of meteorological factors on short-term load forecasting cannot be ignored. The relevant literature mainly analyzes the factors such as temperature, humidity,

1. Modeling the thermal comfort of the human body.
2. The difference between the user's thermal comfort temperature and the temperature is introduced, rather than the absolute temperature value, as the input in the network model.
3. On the basis of the LSTM network, we added an attention mechanism and a dropout layer.

Thermal Comfort Model of the Human Body
The use of electric heating devices in heating areas in China (such as eastern Inner Mongolia) has gradually become mature, and its comfort is very important to the user experience. In the use of decentralized electric heating, human thermal comfort will affect the heating time, heating temperature, and other factors, thus affecting the electric heating load data. As the most important driving force of user response, thermal comfort should be considered in load forecasting.
Energies 2021, 14, 4525
Indoor environment quality directly affects people's physical and mental health and their work efficiency. Achieving a comfortable indoor temperature is therefore a basic need for people in a heated area. Thermal comfort indicates that most people are satisfied with the objective thermal environment, both physically and psychologically. It is mainly affected by physical, physiological, and psychological conditions [18]. The physical conditions include the heat transfer performance and shading coefficient of the walls and windows of the building where people live, the internal disturbance of lighting and equipment, the growth rate of indoor microorganisms, and so on, which are not affected by the human body's own activities. Physiological conditions include the change of perspiration rate caused by the roughness or cracking of human skin, the intensity of exercise when carrying out routine activities, and the regulation of local or overall sensation of radiation temperature. Psychological conditions refer to the deviation between the factors in the thermal environment and psychological expectation, which is closely related to subjective feeling.
At present, the thermal comfort of an occupied environment is usually analyzed according to the ISO 7730 thermal comfort model [19] proposed by the International Organization for Standardization. The result is expressed as the predicted mean vote (PMV), and the formula is as follows [20,21]:

$$\mathrm{PMV} = \left[0.303\,e^{-0.036M} + 0.028\right]\Big\{(M - W) - 3.05\times10^{-3}\left[5733 - 6.99(M - W) - P_a\right] - 0.42\left[(M - W) - 58.15\right] - 1.7\times10^{-5}M(5867 - P_a) - 0.0014M(34 - t_a) - 3.96\times10^{-8}f_{cl}\left[(t_{cl} + 273)^4 - (t_r + 273)^4\right] - f_{cl}h_c(t_{cl} - t_a)\Big\} \quad (1)$$

where $M$ is the metabolic rate of the human body, W/m²; $W$ is the mechanical power consumed by the human body, W/m²; $P_a$ is the partial pressure of water vapor in the ambient air around the human body, Pa; $t_a$ is the air temperature around the human body, °C; $t_r$ is the average radiation temperature, °C; $f_{cl}$ is the ratio of the clothed surface area of the human body to the bare surface area; $t_{cl}$ is the temperature of the outer surface of the clothing, °C; and $h_c$ is the convective heat transfer coefficient, W/(m²·K).
The ISO 7730 thermal comfort model is highly accurate in obtaining the user comfort temperature range, but the real-time environmental data required by the model are difficult to obtain. Therefore, the ISO 7730 model can be suitably simplified without materially affecting the accuracy. In [22], the Rohles simplified model was improved, and the results were extended to a wider range of clothing insulations. Only the indoor air temperature and relative humidity in the test environment were used as input parameters, and therefore the thermal comfort parameters can be easily evaluated. The results show that the simplified model agrees closely with the ISO 7730 thermal comfort model while being much easier to apply. The simplified and improved model is as follows:

$$I_{\mathrm{PMV}} = aT_a + bP_v - c \quad (2)$$

where $I_{\mathrm{PMV}}$ is the index value of PMV; $T_a$ is the indoor temperature, °C; $P_v$ is the relative humidity, %; and $a$, $b$, and $c$ are known parameters. When the indoor temperature and relative humidity are on the high side or on the low side, they will interfere with people's core temperature.
At present, heating temperatures are rising steadily, so the temperature of people's thermal comfort zone also rises as a whole, and the ability of people who stay in the thermal comfort zone for a long time to regulate against cold and heat stimuli is weakened. As a result, people become less sensitive to temperature changes and take longer to adjust the temperature. When the indoor temperature is not the expected thermal comfort temperature, people will adjust the temperature setting to achieve the expected value. Therefore, in order to account for users' thermal comfort temperature, we used the difference between the air temperature and the human thermal comfort temperature to improve the input data of the LSTM neural network prediction model.
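The temperature-difference input described above can be illustrated with a minimal sketch. It assumes the simplified index has the linear form $I_{\mathrm{PMV}} = aT_a + bP_v - c$ and that the comfort temperature is the indoor temperature at which the index is zero (thermal neutrality). The coefficient values, the function names, and the sign convention of the difference are hypothetical and chosen only so the example runs; they are not the fitted parameters of this paper.

```python
def pmv_simplified(t_indoor, rel_humidity, a, b, c):
    """Simplified PMV index, assumed linear form: I_PMV = a*T_a + b*P_v - c."""
    return a * t_indoor + b * rel_humidity - c


def comfort_temperature(rel_humidity, a, b, c):
    """Indoor temperature at which the simplified index equals zero (neutral)."""
    return (c - b * rel_humidity) / a


def temperature_difference(t_air, t_comfort):
    """Model input feature: comfort temperature minus actual air temperature.

    The sign convention is an assumption made for this sketch.
    """
    return t_comfort - t_air


# Hypothetical coefficients, NOT the values fitted in the paper.
A, B, C = 0.25, 0.01, 6.3
t_c = comfort_temperature(50.0, A, B, C)       # comfort temperature at 50 % RH
delta_t = temperature_difference(-15.0, t_c)   # outdoor air at -15 degrees C
print(t_c, delta_t)
```

A larger difference means the air is further below the comfort temperature, which is the situation in which users are expected to raise the heating set point.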

Load Characteristics of Electric Heating
Electric heating load differs from general electric load in that it has obvious seasonal climate characteristics. Taking the electric heating data of a certain year in eastern Inner Mongolia as an example, the trend of the annual load curve shows that the electric heating load in the northern region is concentrated in winter (December to March of the next year), of which December and January are the months with the lowest average temperature. The daily load curve shows that electric heating load also depends clearly on the type of day. From Monday to Friday the load of office buildings is higher; on weekends and holidays the office load is lower but the load of commercial and residential electric heating is higher, so the overall day-to-day variation is small. It can be seen from Figure 1 that the typical daily load curve of electric heating in eastern Inner Mongolia shows a morning peak, an afternoon trough, and an evening peak. In terms of electricity consumption, this is mainly due to the start-up of industrial and commercial electric heating in the morning, the general rise of temperature in the afternoon, and the concentrated start-up of residential load in the evening.
The key areas of electric energy substitution in eastern Inner Mongolia are distributed electric heating and centralized electric heating, and electric heating accounts for more than 50% of electric energy substitution in the region. With the increasing application of electric heating and its large-scale connection to the power grid, the impact on the operation of the power system is to further narrow the gap between the winter and summer load.

Correlation Analysis of Electric Heating Load and Influencing Factors
The idea of correlation analysis is to compare the degree of similarity between data series, so as to clarify the degree of association and the regular pattern between the series. It is an effective and practical method from gray system theory for analyzing the degree of correlation among the factors in the system under study [5,23]. To study the variation characteristics of the winter electric heating load in eastern Inner Mongolia, the relationship between the electric heating load and meteorological factors such as temperature difference (the difference between human thermal comfort temperature and actual temperature), relative humidity, wind speed, and snowfall should be analyzed. The calculation steps of the correlation analysis method are as follows:
Step 1: Construct the electric heating load characteristic sequence and the influencing factor sequences. The electric heating load sequence is expressed as $X_0$, and each related influencing factor sequence is expressed as $X_i$; the complete sequences are as follows:

$$X_0 = \{x_0(1), x_0(2), \ldots, x_0(n)\}, \qquad X_i = \{x_i(1), x_i(2), \ldots, x_i(n)\}$$

where $k$ is the serial number, $k = 1, 2, \ldots, n$; $n$ is the number of samples; and $i$ indexes the related factors, $i = 1, 2, \ldots, m$.
Step 2: Obtain the correlation degree.
(a) Make each sequence dimensionless by dividing it by its initial value, as shown in the following formula:

$$X_i' = \frac{X_i}{x_i(1)} = \{x_i'(1), x_i'(2), \ldots, x_i'(n)\}$$

where $i = 1, 2, \ldots, m$ (and likewise for $X_0$), and $X_i'$ is the sequence after initial-value processing.
(b) Determine the difference $\Delta_i$ between the electric heating load sequence and each influencing factor sequence:

$$\Delta_i(k) = \left|x_0'(k) - x_i'(k)\right|$$

Record the minimum value of all sequence differences as $a$ and the maximum value as $b$:

$$a = \min_i \min_k \Delta_i(k), \qquad b = \max_i \max_k \Delta_i(k)$$

(c) Find the correlation coefficient $\gamma_i(k)$ of each sample in the sequence:

$$\gamma_i(k) = \frac{a + \varepsilon b}{\Delta_i(k) + \varepsilon b}$$

where $\gamma_i(k)$ is the correlation coefficient between the $k$-th parameter of the $i$-th subsequence and the $k$-th parameter of the electric heating load sequence, and $\varepsilon$ is the resolution coefficient, usually 0.5.
(d) Calculate the average correlation coefficient as follows:

$$\gamma_i = \frac{1}{n}\sum_{k=1}^{n}\gamma_i(k)$$

where $i = 1, 2, \ldots, m$.
Step 3: Analyze the correlation coefficient.
Obtain the correlation coefficient between the electric heating load data series X 0 and each related factor series X i . The larger the correlation coefficient, the greater the influence of the factor series on the electric heating load data series. Therefore, the correlation coefficient between electric heating load and various factors can be calculated, as shown in Table 1.
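The three steps above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the authors' implementation: it assumes initial-value normalization, takes $a$ and $b$ as the global minimum and maximum of the absolute differences, and uses a resolution coefficient of 0.5. The function name and the sample numbers are hypothetical.

```python
def grey_relational_grades(reference, factors, epsilon=0.5):
    """Average grey correlation coefficient of each factor series vs. the reference."""
    # Step 2(a): initial-value normalisation (divide each series by its first sample).
    x0 = [v / reference[0] for v in reference]
    xs = [[v / series[0] for v in series] for series in factors]
    # Step 2(b): absolute differences and their global minimum a and maximum b.
    deltas = [[abs(p - q) for p, q in zip(x0, x)] for x in xs]
    a = min(min(row) for row in deltas)
    b = max(max(row) for row in deltas)
    if b == 0.0:
        # Every factor is proportional to the reference: perfect correlation.
        return [1.0 for _ in factors]
    # Steps 2(c)-(d): pointwise coefficients, then their mean per factor.
    grades = []
    for row in deltas:
        coeffs = [(a + epsilon * b) / (d + epsilon * b) for d in row]
        grades.append(sum(coeffs) / len(coeffs))
    return grades


# Hypothetical toy series, not the measured data of this study.
load = [10.0, 12.0, 15.0, 11.0]
temp_diff = [30.0, 36.0, 45.0, 33.0]   # exactly proportional to the load
wind_speed = [3.0, 2.0, 4.0, 5.0]
grades = grey_relational_grades(load, [temp_diff, wind_speed])
print(grades)
```

A factor whose normalized shape follows the load closely receives a grade near 1, which is how the ranking of factors would be read.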
It can be seen from the data in Table 1 that temperature difference and humidity are the factors with the greatest influence on electric heating load, while snowfall and wind speed are less influential. This is mainly because temperature difference and humidity affect human thermal comfort to a greater extent. Although snowfall and wind speed also affect people's psychological expectation and perception of temperature and humidity, their influence is small compared with that of temperature difference and humidity. Having identified the most relevant factors, we use temperature difference and humidity data, in addition to the historical electric heating load data, as the main source data for electric heating load prediction.

Long Short-Term Memory Network
Because load data are inherently a time series, the selected forecasting model must be able to express time series characteristics well. In this paper, the long short-term memory network (LSTM) was taken as the main body and improved, in order to study its applicability to short-term forecasting of the electric heating load in eastern Inner Mongolia.
LSTM is a special kind of recurrent neural network (RNN). It can use the information learned at the previous moment when learning at the current moment, and its gated structure mitigates the vanishing and exploding gradients encountered in RNN training. The LSTM algorithm adds a cell state C to the original RNN hidden layer to keep the long-term state, thus solving the long-term dependence problem of RNN. LSTM is therefore better suited to long time series than a plain RNN. Figure 2 is the schematic diagram of the unrolled LSTM structure.
In Figure 2, the input of LSTM consists of three parts: the input value at the current time $x_t$, the output value at the previous time $h_{t-1}$, and the cell state at the previous time $c_{t-1}$. The output of LSTM consists of the output cell state $c_t$ and the output value of the hidden layer $h_t$.
Compared with RNN, LSTM redesigns the internal memory unit while maintaining its basic structure. The architecture diagram of each unit of LSTM is shown in Figure 3.
The key of every LSTM cell is the control of the cell state $c$. There are three control gates acting on the cell state: the forgetting gate $f_t$, the input gate $i_t$, and the output gate $o_t$. Through these gates, information can be filtered or added to produce a new cell state.
According to Figure 3, from left to right, it can be seen that the cell state of the previous time $c_{t-1}$ and the output value of the hidden layer of the previous time $h_{t-1}$ together memorize the historical information of the sequence data.
Step-by-step analysis of the LSTM architecture can be divided into three parts.
The first step is to filter the information selectively. The forgetting gate removes information from the previous cell state according to $h_{t-1}$ and $x_t$, that is, it removes the useless part of the information learned at the previous moment. The forgetting gate is as follows:

$$f_t = \sigma\left(w_f \cdot [h_{t-1}, x_t] + b_f\right)$$

where $\sigma(\cdot)$ is the sigmoid activation function, $w_f$ is the weight of the forgetting gate, and $b_f$ is the bias of the forgetting gate.
The second step is to generate the new information that needs to be updated. This part combines the input gate $i_t$ and the candidate value $\tilde{C}_t$. $h_{t-1}$ and $x_t$ pass through a sigmoid function to determine the data that need to be input into the cell state (i.e., the input gate), and a new candidate state is created through a tanh layer. The formulas are as follows:

$$i_t = \sigma\left(w_i \cdot [h_{t-1}, x_t] + b_i\right)$$

$$\tilde{C}_t = \tanh\left(w_c \cdot [h_{t-1}, x_t] + b_c\right)$$

where $i_t$ is the information to memorize, that is, the input gate; $\tilde{C}_t$ is the candidate value used to update the original cell state; $w_i$ and $w_c$ represent the weights of the input gate and candidate value, respectively; and $b_i$ and $b_c$ represent the biases of the input gate and candidate value, respectively.
The third step is to generate the new cell state $c_t$ and the hidden layer output $h_t$. By multiplying the input gate $i_t$ by the candidate value $\tilde{C}_t$ and adding the product to the previous cell state $c_{t-1}$ scaled by the forgetting gate $f_t$, one obtains the updated cell state value $c_t$, as shown in the following formula:

$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{C}_t$$

The new cell state $c_t$ is processed by a tanh function and then multiplied by the output gate $o_t$ to obtain the output value of the hidden layer $h_t$:

$$o_t = \sigma\left(w_o \cdot [h_{t-1}, x_t] + b_o\right)$$

$$h_t = o_t \odot \tanh(c_t)$$

where $w_o$ is the weight of the output gate, and $b_o$ is the bias of the output gate.
Through this analysis of the LSTM structure, we can see that using LSTM cells in place of the neurons of an RNN to build the load forecasting model addresses the problem of long-term dependence and allows the model to learn the hidden historical operating patterns in power load data.
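To make the three steps concrete, the following sketch performs a single LSTM time step in plain Python. It assumes the standard gate equations (sigmoid gates acting on the concatenation of $h_{t-1}$ and $x_t$, a tanh candidate, and elementwise products); the function names, the `params` dictionary layout, and the all-zero weights are hypothetical choices made only to keep the example small and checkable.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(w, v, b):
    """Compute w @ v + b for a list-of-rows matrix w and vectors v, b."""
    return [sum(wij * vj for wij, vj in zip(row, v)) + bi for row, bi in zip(w, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; params maps 'f', 'i', 'c', 'o' to (weights, bias)."""
    z = list(h_prev) + list(x_t)                      # concatenation [h_{t-1}, x_t]
    w_f, b_f = params["f"]
    w_i, b_i = params["i"]
    w_c, b_c = params["c"]
    w_o, b_o = params["o"]
    f_t = [sigmoid(v) for v in affine(w_f, z, b_f)]          # step 1: forgetting gate
    i_t = [sigmoid(v) for v in affine(w_i, z, b_i)]          # step 2: input gate
    c_tilde = [math.tanh(v) for v in affine(w_c, z, b_c)]    # step 2: candidate value
    o_t = [sigmoid(v) for v in affine(w_o, z, b_o)]          # step 3: output gate
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Toy setup: 3 inputs (load, temperature difference, humidity), 2 hidden units.
hidden, inputs = 2, 3
zero_w = [[0.0] * (hidden + inputs) for _ in range(hidden)]
params = {gate: (zero_w, [0.0] * hidden) for gate in "fico"}
h_t, c_t = lstm_step([0.4, 0.7, 0.5], [0.0, 0.0], [1.0, -2.0], params)
print(h_t, c_t)
```

With all-zero weights every gate equals 0.5 and the candidate is 0, so the new cell state is simply half of the previous one, which makes the role of the forgetting gate easy to see.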

Improved LSTM with Attention Mechanism
At any given moment, the brain focuses on the areas that need attention and reduces or ignores its attention to other areas. This attention allocation mechanism helps people obtain important, detailed information and reduces the influence of irrelevant information.
The attention mechanism borrows the idea of attention resource allocation in the human brain [24]. By assigning different probabilities to generate different attention distribution coefficients, the model can better learn the information in the input sequence, which improves its accuracy.
The attention structure is shown in Figure 4, where $x_t$ ($t \in [1, n]$) is the input to the hidden layer of the LSTM model, $h_t$ ($t \in [1, n]$) is the hidden layer output of the LSTM corresponding to each input, $\alpha_t$ ($t \in [1, n]$) is the probability distribution value that the attention mechanism assigns to the hidden layer output, and $y$ is the LSTM output value with the attention mechanism.
The formulas of the attention weight matrix and feature vector in the attention mechanism are as follows:

$$e_t = u_s \tanh(w_s h_t + b_s) \quad (17)$$

where $e_t$ is the non-normalized weight matrix, and $w_s$, $b_s$, and $u_s$ represent the randomly initialized attention mechanism weight matrix, bias vector, and time series matrix, respectively.
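As an illustration of how the non-normalized scores could be turned into attention weights and a feature vector, the sketch below scores each hidden state with $e_t = u_s \tanh(w_s h_t + b_s)$ and then, as an assumption not confirmed by the text above, normalizes the scores with a softmax and forms the feature vector as the weighted sum of the hidden states. The function name and the toy parameters are hypothetical.

```python
import math


def attention_pool(hidden_states, w_s, b_s, u_s):
    """Return (alphas, v): attention weights and the weighted sum of hidden states."""
    scores = []
    for h_t in hidden_states:
        # e_t = u_s . tanh(w_s h_t + b_s)
        proj = [math.tanh(sum(w * h for w, h in zip(row, h_t)) + b)
                for row, b in zip(w_s, b_s)]
        scores.append(sum(u * p for u, p in zip(u_s, proj)))
    # Softmax normalisation (assumed), shifted by the maximum for stability.
    top = max(scores)
    exps = [math.exp(e - top) for e in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Feature vector: attention-weighted sum of the hidden states (assumed).
    dim = len(hidden_states[0])
    v = [sum(a * h[k] for a, h in zip(alphas, hidden_states)) for k in range(dim)]
    return alphas, v


# Toy hidden states for three time steps with two hidden units each.
H = [[0.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
alphas, v = attention_pool(H, identity, [0.0, 0.0], [1.0, 1.0])
print(alphas, v)
```

In this toy case the middle time step has the largest score and therefore receives the largest weight.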
To sum up, the structure of the improved LSTM electric heating load forecasting model designed in this paper is shown in Figure 5; it is mainly composed of an input layer, LSTM layer, attention layer, dropout layer, and output layer. The function of the dropout layer is to prevent overfitting: a discard rate is set so that a randomly selected fraction of the neurons in the model is "discarded" (does not participate in network training).
Considering the climate characteristics of northern China, according to the results of correlation analysis, we took the historical electric heating load data from January to March $l$, the difference between human thermal comfort temperature and air temperature $\Delta t$, and relative humidity $p_v$ as the original sample set of the prediction model. The sample data were standardized to the range 0-1 to form the input matrix $X_s$ of the model. The data of temperature and relative humidity were from the National Meteorological Data Center.
Feature vectors were extracted from the input data $X_s$ of the input layer, and the neural network unit was controlled by three "gate" structures. The output data of the LSTM layer was the matrix $H = [h_1 \cdots h_i \cdots h_n]$, which represents the output value of electric heating load of this layer. The input of the attention mechanism was the output matrix $H$ of the LSTM layer, and the feature vectors $V$ were obtained by applying different attention weights.
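The construction of the input matrix can be sketched as follows. The example applies 0-1 (min-max) scaling to the three series and cuts them into sliding windows with the next load value as the target. The window length, the next-step target, the function names, and the sample values are assumptions made for illustration; the text above does not specify them.

```python
def min_max_scale(values):
    """Scale a series to the range [0, 1]; a constant series maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def build_samples(load, temp_diff, humidity, window):
    """Stack the scaled series and cut sliding windows with next-step load targets."""
    cols = [min_max_scale(s) for s in (load, temp_diff, humidity)]
    rows = list(zip(*cols))                  # one (l, delta_t, p_v) tuple per time step
    inputs, targets = [], []
    for start in range(len(rows) - window):
        inputs.append(rows[start:start + window])
        targets.append(cols[0][start + window])
    return inputs, targets


# Hypothetical toy series, not the measured data.
load = [100.0, 120.0, 140.0, 160.0, 180.0]
temp_diff = [30.0, 28.0, 35.0, 40.0, 38.0]
humidity = [60.0, 55.0, 50.0, 65.0, 70.0]
X, y = build_samples(load, temp_diff, humidity, window=3)
print(len(X), y)
```

In practice the scaling bounds would be computed on the training portion only, so that no information from the test period leaks into the inputs.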

Data Preprocessing
The data used in this paper are the historical data of 66 days of electric heating load from January to the first week of March in 2018 in an area of eastern Inner Mongolia. At the same time, the thermal comfort of 300 individuals of different ages was investigated, and the model parameters were fitted by Equation (2), and the thermal comfort temperature of the main population was obtained. Among the 300 individuals, there were 150 men and 150 women, mainly young people aged about 20 years old and middle-aged and old people aged about 60 or 70 years old.
A thermal comfort questionnaire survey was conducted on the subjects, and the temperature and relative humidity during the survey were recorded. The model parameters of the same user under different clothing and activity intensity were obtained by fitting (see Table 2).
It can be seen from Table 2 that users adapted differently to temperature under different clothing and activity intensities. In order to make the model more universal, we took the average value of 23.275 °C as the thermal comfort temperature of the human body.

Parameter Setting and Analysis
The input data were divided into a training set and a test set. The first 90% of the input samples were taken as the training set used to fit the model; the last 10% were taken as the test set used to evaluate the accuracy of the final model, that is, the prediction for the forecast day. We set the initial learning rate to 0.05, the learning rate decay to 0.6, and the number of training epochs to 250. In addition, the dropout layer discard rate was set to 0.25.
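A minimal sketch of the chronological 90%/10% split is given below, together with the stated hyperparameters collected in a dictionary. The function name and the dictionary keys are hypothetical, and the sample count assumes one sample per hour over the 66 days of data.

```python
def chronological_split(samples, train_fraction=0.9):
    """Split a time-ordered list without shuffling: earliest part trains, latest part tests."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


# Hyperparameters as stated in the text; the key names are illustrative only.
config = {
    "initial_learning_rate": 0.05,
    "learning_rate_decay": 0.6,
    "epochs": 250,
    "dropout_rate": 0.25,
}

# Assumed sample layout: 66 days of hourly records, indexed in time order.
hours = list(range(66 * 24))
train, test = chronological_split(hours)
print(len(train), len(test))
```

Keeping the split in time order means the model is always evaluated on data later than anything it was trained on, which matches how a forecast is used.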
The number of hidden layers of the LSTM network and the number of LSTM units in each hidden layer affect the accuracy of electric heating load forecasting. Under-learning or over-learning will reduce the accuracy of the model. The enumeration method was used to record the training effect of different numbers of hidden layers and different numbers of neurons in each layer, so as to determine the optimal network structure. Firstly, the number of hidden layers was set to 1, and different numbers of neurons were set one by one to train and record the MAPE; then, we kept the optimal number of neurons in the first layer, set the number of hidden layers to 2, continued to set different numbers of units one by one for training, and so on. In this paper, the maximum number of hidden layers was set to 3, and the performance of each training is shown in Table 3. According to the results in Table 3, when the number of hidden layers was 1 and the number of neurons in the layer was 5, the minimum $e_{\mathrm{MAPE}}$ was 4.2486%. When the number of hidden layers was 2, the number of neurons in the first layer was fixed to 5, and the number of neurons in the second layer was set to 20, the minimum $e_{\mathrm{MAPE}}$ was 5.0683%. When the number of hidden layers was 3, the first two layers were fixed at their optimal numbers; when the number of neurons in the third layer was 10, the minimum $e_{\mathrm{MAPE}}$ was 5.0794%.
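The layer-by-layer enumeration described above can be sketched as a greedy search. The `evaluate` callback stands in for training a network and returning its MAPE; here it is replaced by `toy_mape`, a made-up scoring function that exists only so the example runs and does not reproduce any result from Table 3. All names are hypothetical.

```python
def greedy_layer_search(evaluate, unit_options, max_layers=3):
    """Grow the network one layer at a time, fixing the best unit count of earlier layers.

    evaluate(layers) must return the MAPE for a tuple of per-layer unit counts.
    Returns (best_layers, best_mape, history), where history holds the best
    configuration found at each depth.
    """
    fixed = ()
    best_layers, best_mape = None, float("inf")
    history = []
    for _ in range(max_layers):
        trials = [(fixed + (u,), evaluate(fixed + (u,))) for u in unit_options]
        layers, score = min(trials, key=lambda t: t[1])
        history.append((layers, score))
        if score < best_mape:
            best_layers, best_mape = layers, score
        fixed = layers                       # keep this depth's winner for the next depth
    return best_layers, best_mape, history


def toy_mape(layers):
    """Made-up stand-in for 'train the model and measure MAPE' (illustration only)."""
    first = 0.01 * abs(layers[0] - 5)
    depth_penalty = 0.8 * (len(layers) - 1)
    later = 0.005 * sum(abs(u - 20) for u in layers[1:])
    return 4.0 + first + depth_penalty + later


best_layers, best_mape, history = greedy_layer_search(toy_mape, [5, 10, 15, 20])
print(best_layers, best_mape)
```

The search evaluates only one new layer at a time, so it is much cheaper than trying every combination, at the cost of possibly missing a configuration whose earlier layers are not individually optimal.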

Test Results and Analysis
In order to verify the performance of the thermal comfort model and the improved LSTM neural network method proposed in this paper, we selected the optimal prediction model (one hidden layer with five neurons). The hourly load from January to early March 2018 was used as the dataset to test the prediction performance of the model, which was compared with three other cases. Figure 6 shows the mean absolute percentage error (MAPE) of the prediction results of the proposed method. Figure 7 shows the comparison curve between the actual electric heating load and the load predicted by each method. The curve LSTM-T-A represents the prediction result of the LSTM model with both thermal comfort temperature and attention mechanism added, the curve LSTM-T represents the prediction result with thermal comfort temperature added only, and the curve LSTM-A represents the prediction result with attention mechanism added only. The curve LSTM represents the LSTM prediction results without thermal comfort temperature and attention mechanism. It can be seen from Figure 7 that, compared with the other three methods, the LSTM-T-A curve deviated least in amplitude from the real value, and its curve characteristics were closest to the real value.
The prediction errors of each method are listed in Table 4. In addition to comparing the improved variants of LSTM, the errors of SVM and ANN are also compared. It can be seen from Figure 7 and Table 4 that, for the LSTM model, adding human thermal comfort temperature and the attention mechanism significantly improves the prediction accuracy of electric heating load. The LSTM-T-A prediction curve fitted the real value best, and its selected error index values were the smallest, which showed a better prediction effect.
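For reference, the MAPE used above can be computed as in the following sketch. The root mean square error is included as a second common index, which is an assumption, since the exact set of error indices in Table 4 is not reproduced here. The sample values are hypothetical.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the same unit as the load."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical loads, not measured data.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(mape(actual, predicted), rmse(actual, predicted))
```

MAPE weights each hour by its relative error, so it is convenient for comparing methods across periods with very different load levels.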

Conclusions
Based on the electric heating load in northern China, we analyzed the load characteristics of electric heating in winter and constructed a thermal comfort temperature model of the human body. The main meteorological factors affecting electric heating load were screened out by the gray correlation analysis method. Meanwhile, the difference between the thermal comfort temperature of the main users and the actual temperature was analyzed and taken into account. An attention mechanism and a dropout layer were added to improve the LSTM neural network, and the optimal numbers of hidden layers and hidden neurons were obtained.
Actual electric heating load data were used to verify the model, which was compared with several other models. The results show that:

1. Comprehensive historical data showed that the typical daily load curve of electric heating load fluctuated greatly, and the peak-valley difference was large. Moreover, the electric heating load had a strong time correlation and was closely related to temperature, relative humidity, and thermal comfort temperature.
2. It is necessary to find the optimal number of hidden layers and neurons in order to mine more data information and improve the prediction accuracy of the improved LSTM network.
3. Among the improvements to the LSTM prediction method, the training effect is best when both human thermal comfort temperature and the attention mechanism are considered. When the difference between thermal comfort temperature and air temperature was used in the model input, the prediction was more accurate and performed better than SVM, ANN, and other algorithms, and thus it is a more suitable electric heating load forecasting method.

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: Data are contained within the article. Data sharing is not applicable to this article.