A Deep Learning Approach to Forecasting Monthly Demand for Residential–Sector Electricity

Abstract: Forecasting electricity demand at the regional or national level is a key procedural element of power-system planning. However, achieving such objectives in the residential sector, the primary driver of peak demand, is challenging given this sector's pattern of constantly fluctuating and gradually increasing energy usage. Although deep learning algorithms have recently yielded promising results in various time series analyses, their potential applicability to forecasting monthly residential electricity demand has yet to be fully explored. As such, this study proposed a forecasting model with social and weather-related variables by introducing long short-term memory (LSTM), which has been known to be powerful among deep learning-based approaches for time series forecasting. The validation of the proposed model was performed using a set of data spanning 22 years in South Korea. The resulting forecasting performance was evaluated on the basis of six performance measures. Further, this model's performance was subjected to a comparison with the performance of four benchmark models. The performance of the proposed model was exceptional according to all of the measures employed. This model can facilitate improved decision-making regarding power-system planning by accurately forecasting the electricity demands of the residential sector, thereby contributing to the efficient production and use of resources.


Introduction
Forecasting electricity demand at the regional or national level is vital for efficient power-system planning aimed at ensuring optimal energy management [1] and secure electricity supply [2]. Overestimating electricity demand not only wastes resources but also introduces unnecessary environmental pollution from thermal power plants by devising expensive expansion plans for the relevant power systems [2,3]. On the other hand, underestimating electricity demand can also have disastrous social consequences, such as blackouts caused by insufficient power supply [2]. According to the Annual Energy Outlook 2019 by the U.S. Energy Information Administration (EIA) [4], residential electricity usage accounted for close to 30% of electricity consumption worldwide in 2018 and is fairly volatile relative to the electricity demand in, for example, industrial and commercial sectors. Meanwhile, it is projected that residential electricity consumption will account for more than one-third of electricity consumption worldwide in 2040 and about 36% in 2050; the amount of electricity used in the residential sector is expected to more than double compared to 2018 [4]. In this context, forecasting monthly electricity demand in the residential sector represents an urgent concern for efficient power-system planning, but is also challenging on account of its tendency to follow a complex pattern characterized by fluctuations and increases. Residential demand depends on various non-linear variables, including economic [5,6], demographic [6,7], and weather conditions [6,7]. Weather conditions, such as wind speed and temperature, vary monthly, pursuant to seasonal trends.
A consensus has developed around the accurate forecasting of monthly electricity demand, especially regarding approaches that employ data mining techniques because their performance has been consistently superior to that of traditional statistical methods. Several studies have used data mining techniques to explore the problem of forecasting monthly electricity demand.
Bunnoon et al. [8] employed a multi-regional artificial neural network (ANN) to forecast monthly electricity demand in Thailand for 1997 and 2007; their model achieved a mean absolute percentage error (MAPE) of 1.42, which renders it more reliable than a traditional ANN. Hamzaçebi et al. [9] proposed a model based on grey system theory to forecast monthly electricity demand in Turkey for 1987 and 2014; in terms of MAPE, their model achieved a value of 5.18. Son and Kim [2] used a support vector regression (SVR) model to forecast monthly residential electricity demand in South Korea for 1991 and 2012. They selected socioeconomic and weather variables from previous studies, after which they employed a fuzzy-rough and particle swarm optimization algorithm to identify the relevant variables and handle the problems associated with non-linear time series. Their model achieved a MAPE of 2.13 and a UPA of −5.89. Guo et al. [10] proposed a model to forecast monthly electricity demand in China between 2000 and 2014, based on the self-adaptive screening and seasonal vector error correction model (SAS-SVECM). They obtained a MAPE of 1.66, which signifies a higher level of accuracy than time series models based on X-12-ARIMA (an extension of the autoregressive integrated moving average (ARIMA)), SVR with X-12-ARIMA, ANN, the growth equation method, and support vector machine (SVM) with X-12-ARIMA. These studies show that forecasting monthly electricity demand is feasible and that data mining techniques yield better results than conventional statistical methods; nevertheless, the accuracy of the models employed so far remains unsatisfactory. As electricity demand grows, it will become even more difficult to forecast monthly residential electricity demand. Therefore, the previous methods developed to solve this problem still require refinement in terms of their actual forecasting performance.
In recent years, deep learning-based approaches have been applied to various subjects relevant to a wide range of electricity demand forecasting problems (e.g., [13][14][15][16][17][18]). Various forecasting resolutions have been studied from multiple times a day (e.g., minute-by-minute [18], half-hourly [14,16], hourly [18], and every two hours [16]) to daily [15,17,18], weekly [18], monthly [13], or yearly depending on the purposes of forecasting. There have been various target units, such as a single household consumer [15,18], a business consumer (e.g., a manufacturing company in [14] and small and medium enterprise consumer in [16]), any end-user not distinguished as a household or business consumer [13], and a region or country (e.g., UT Chandigarh, India, in [17]). At the regional or national level, the problem of forecasting electricity demand is then broken down by end-use sectors, such as residential, industrial, commercial, and transportation. The characteristics used for each subject's historical demand series and the variables that affect the series can differ and it is sometimes difficult to consider additional variables depending on the forecasting resolution and target unit. Although deep learning-based approaches showed promise as alternatives to conventional statistical methods in [14,18] and machine learning algorithms in [17], their potential for forecasting monthly residential electricity demand at the regional or national level needs to be explored. To that end, this paper hereby proposed a forecasting model based on LSTM that utilizes a deep learning algorithm with great potential for time series analyses.

Long Short-Term Memory
LSTM is an advanced, deep architecture of the recurrent neural network (RNN) [19]. An RNN can handle time series data because the activation of its recurrent hidden state at each time step depends on the hidden state of the previous time step, whereas a conventional neural network transmits information to the next layer without reference to the previous time step [20,21]. As illustrated in Figure 1, $h_t$, which signifies the hidden state at time $t$, is updated at each time step by the input at time $t$ and the hidden state at time $t-1$. Therefore, an RNN is regarded as an example of deep network architecture, which assigns one layer to each time step. Although an RNN is theoretically suitable for forecasting time-based feature sequences, training on sequences with long time steps is challenging due to the RNN's inherent limitation, i.e., exploding and vanishing gradients [19,22,23]. In this study, a prolonged historical data sequence related to monthly residential electricity demand was considered for reliable time series forecasting, and LSTM, which is more effective than a conventional RNN, was employed to handle the time series with long time steps [19,24].
The main idea of LSTM involves replacing the traditional neuron in an RNN with a memory cell comprising three sigmoid layers, known as the input, output, and forget gates [25], as illustrated in Figure 2. The first step decides whether to keep or discard the information in the state of the cell, a result achieved by the forget gate. In this process, the gate generates a value between 0 and 1 by referring to the input at time $t$ and the hidden state at time $t-1$, where 0 discards the information and 1 keeps it. The generated value is multiplied by the state of the cell at time $t-1$ to perform this process. Next, the new information to be stored in the state of the cell is determined by multiplying the outputs of the following two processes. First, the input gate generates a value that denotes which information in the state of the cell is updated at time $t$; then, the tanh layer returns a new candidate vector that can be used to update the state of the cell. The outputs of these two processes are multiplied, and the result is added to the forget gate's output, i.e., the retained part of the previous state of the cell. Finally, the output gate decides which information to use as the output based on the state of the cell: the memory cell's output, which serves as the new hidden state at time $t$, is generated by multiplying, element-wise, the vector from the output gate and the state of the cell passed through the tanh layer.

In the context of forecasting electricity demand, $x = (x_1, x_2, \ldots, x_T)$ refers to the historical input data and $y = (y_1, y_2, \ldots, y_T)$ refers to the forecasted data. The forecasted electricity demand can be calculated as [26,27]:

$$i_t = \sigma_s(W_{xi} x_t + W_{hi} h_{t-1} + b_i) \tag{1}$$

$$f_t = \sigma_s(W_{xf} x_t + W_{hf} h_{t-1} + b_f) \tag{2}$$

$$o_t = \sigma_s(W_{xo} x_t + W_{ho} h_{t-1} + b_o) \tag{3}$$

$$c_t = f_t \circ c_{t-1} + i_t \circ \sigma_c(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \tag{4}$$

$$h_t = o_t \circ \sigma_h(c_t) \tag{5}$$

where $i$ denotes the input gate, $f$ refers to the forget gate, $o$ signifies the output gate, $c$ represents each cell's activation vectors, $h$ stands for each memory block's activation vectors, $W$ denotes the weight matrix of each gate unit, $b$ represents the bias vectors, and $\circ$ denotes element-wise multiplication.

The input gate determines what information is stored in the memory cell. The forget gate determines what information from the previous state is retained in the memory cell. The output gate then determines which information from the memory cell is output as the new hidden state. The operations of the input, forget, and output gates that constitute the LSTM are represented in Equations (1)-(3), respectively. Through the operation of these gates, the state of the cell and the hidden state are determined; these are shown in Equations (4) and (5), respectively. Through this series of processes, LSTM is able to extract and remember information over a long time.
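To make the interaction of the three gates concrete, the following is a minimal pure-Python sketch of a single LSTM time step for a scalar input and one memory cell. It follows the standard LSTM formulation; the function name `lstm_step`, the parameter layout (one input weight, one recurrent weight, and one bias per gate), and all numeric values are assumptions made for illustration and are not taken from the trained model.

```python
import math


def sigmoid(z):
    """Sigmoid-type activation used by the input, forget, and output gates."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step for a scalar input and a single memory cell.

    params maps each of "i", "f", "o", "c" to a tuple
    (input weight, recurrent weight, bias); this layout is hypothetical.
    """
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x_t + w_h * h_prev + b

    i_t = sigmoid(pre_activation("i"))        # input gate: how much new information to store
    f_t = sigmoid(pre_activation("f"))        # forget gate: how much of the old cell state to keep
    o_t = sigmoid(pre_activation("o"))        # output gate: how much of the cell state to expose
    c_tilde = math.tanh(pre_activation("c"))  # candidate value from the tanh layer

    c_t = f_t * c_prev + i_t * c_tilde        # updated state of the cell
    h_t = o_t * math.tanh(c_t)                # new hidden state (memory cell output)
    return h_t, c_t


# Illustrative weights only; a trained model would learn these values.
params = {name: (0.5, 0.1, 0.0) for name in ("i", "f", "o", "c")}
h, c = 0.0, 0.0
for x in [0.2, 0.4, 0.6]:  # a short, made-up normalized input sequence
    h, c = lstm_step(x, h, c, params)
```

With the forget gate close to 1 and the input gate close to 0, the state of the cell is carried forward almost unchanged, which is the mechanism that lets the network retain information across many time steps.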
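The activation function $\sigma_s(\cdot)$ used for the input, forget, and output gates, which represents a sigmoid-type function, is written as:

$$\sigma_s(z) = \frac{1}{1 + e^{-z}}$$

while each of the activation functions $\sigma_c(\cdot)$ and $\sigma_h(\cdot)$ used for the activation gate is expressed as

$$\sigma_c(z) = \sigma_h(z) = \tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}$$

Experiments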

Dataset
Data were collected from 1991 when the monthly data for the input considered in this study began to be officially published. Data spanning 22 years (1991-2012, a total of 264 observations) were used for the training, test, and validation, among which the period from January 2011 to December 2012 was considered the validation period instead of more recent years. This was because the electricity demand during this period showed complex seasonal and monthly patterns, significant fluctuations, and gradual increases as well as abnormal cases in forecasting electricity demand. While residential electricity demand peaked in August in most years, in 2011 the electricity demand in January was higher than in August. This was presumed to be due to the fact that the electricity demand in January 2011 was affected by a severe cold wave that led to exceptionally low temperatures and the second-coldest winter since 1981. In September 2011, South Korea experienced its first-ever serious demand and supply imbalance and this led to nationwide rolling blackouts. In August, a month before this point, humid weather continued and heavy rainfall events replaced the heat waves that occurred in previous years. Instead, a late heat wave came in early and mid-September. In August 2012, an unexpected capacity shortage due to abnormal electricity demand occurred. This was also presumed to happen due to a severe heat wave with the highest average temperature since 1994. The monthly electricity demand growth rate was 3.29% on average between 2011 and 2012, whereas it increased drastically (by 12.58%) in August 2012 compared to the same time during the previous year. In South Korea, Korea Electric Power Corporation provides the state of the amount of reserved power by categorizing the state as normal or five levels of insufficiency (below $5 \times 10^{3}$ MWh). In August 2012, the amount of reserved power fell below $3 \times 10^{3}$ MWh, resulting in a level 3 insufficiency warning.
This indicated that this validation period constitutes a forecasting challenge. Figure 3 shows a graphical plot of the time series for the electricity demand, presenting complex seasonal and monthly patterns and significant fluctuations. The data indicate a considerable upward pattern lasting until the end of 2012, particularly linked to the economic and demographic growth of the country. Moreover, a cyclic yearly trend that correlated with annual global climatic changes was imposed on seasonal and monthly patterns. Further, the amplitude of the values obtained denoted holidays, workdays, and various other days incorporated into the series.

According to previous studies [28][29][30][31], electricity demand series are affected by several non-linear variables, such as social and weather conditions. In this study, related variables that Son and Kim [2] found to impact the time series for monthly residential electricity demand were employed to reflect such influences. Table 1 summarizes 10 related variables, including 2 social variables and 8 weather-related ones. In the present study, data on monthly residential electricity demand and associated variables were assembled from published statistics by the Korea Energy Statistical Information System, Korea Meteorological Administration, and Korean Statistical Information Service. For weather variables, this study collected the monthly weather data according to the representative regions of South Korea, including the Seoul metropolitan area (including Seoul, Incheon, and Gyeonggi), Gangwon, North Chungcheong, South Chungcheong (including Daejeon and Sejong), North Jeolla, South Jeolla (including Gwangju), North Gyeongsang (including Daegu), South Gyeongsang (including Busan and Ulsan), and the Jeju special self-governing province. Each region represents an area of similar climatic conditions. The data for weather variables, collected from these nine regions, were averaged according to the population ratio of their regions. Notably, because climate affects the electricity demand via people's responses to the weather, the population was used as an indicator [32]. That is to say, people vary their use of electric air conditioning or heating appliances depending on the degree of hotness or coldness of the weather, so a population's density is likely to increase the electricity demand [33]. Similarly, a strong correlation exists between the population and the regional electricity demand [34].

Figures 4 and 5 compare the normalized values of the weather and social-related variables in the surveyed time series data, each with its own characteristics, and the trend of normalized demand values of electricity in the residential sector. Although the normalized values of the weather-related variables (Figure 4) could delineate the fluctuations in the demand for electricity, they could not account for the long-term upward pattern of electricity demand. On the other hand, while the normalized values of social variables (Figure 5) followed the annual upward pattern of the normalized electricity demand, they showed less fluctuation than the weather-related variables, especially the consumer price index.
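The population-ratio averaging of the regional weather data can be illustrated with a short sketch. The function name `population_weighted_average`, the three regions shown, and every number below are hypothetical and chosen only to demonstrate the calculation; they are not values from the dataset.

```python
def population_weighted_average(values, populations):
    """Average regional values using each region's share of the total population."""
    if set(values) != set(populations):
        raise ValueError("values and populations must cover the same regions")
    total = sum(populations.values())
    return sum(values[region] * populations[region] / total for region in values)


# Hypothetical monthly mean temperatures (degrees Celsius) for three regions.
temperature = {"Seoul metropolitan area": 26.0, "Gangwon": 22.0, "Jeju": 28.0}
# Hypothetical populations (millions) used as weights.
population = {"Seoul metropolitan area": 25.0, "Gangwon": 1.5, "Jeju": 0.5}

national_temperature = population_weighted_average(temperature, population)
```

Because the weights are population shares, densely populated regions dominate the national value, which matches the reasoning that weather affects demand through people's responses to it.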

Implementation
The proposed LSTM model was implemented using Keras with a TensorFlow backend [35]. To consider the seasonal and annual rising patterns of residential electricity demand, the input layer's time step was set to 24. The sigmoid-type function was used for the input, forget, and output gates, and the activation function tanh(·) was used for the activation gate. An LSTM model consisting of one input layer, two hidden layers, and one output layer was trained. For the two hidden layers, 125 hidden neurons were used in this study. The architecture was determined experimentally by exploring configurations with 1 to 3 hidden layers and with 25 to 125 neurons per hidden layer, in steps of 25. The proposed LSTM model was trained with the mean-squared error (MSE) loss function and optimized with the adaptive moment estimation (Adam) scheme [36].
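The 24-step input and the architecture search described above can be sketched in plain Python. This is an illustration under stated assumptions: the text does not specify how training samples were constructed, so the sketch assumes that each sample uses 24 consecutive monthly values to predict the following month. The helper `make_windows` and the placeholder series are hypothetical.

```python
from itertools import product


def make_windows(series, time_steps=24):
    """Split a series into (input window, next value) pairs."""
    pairs = []
    for start in range(len(series) - time_steps):
        window = series[start:start + time_steps]
        target = series[start + time_steps]
        pairs.append((window, target))
    return pairs


# Candidate architectures: 1 to 3 hidden layers, 25 to 125 neurons in steps of 25.
layer_options = range(1, 4)
neuron_options = range(25, 126, 25)
search_grid = list(product(layer_options, neuron_options))

# Placeholder series standing in for 264 monthly observations (not real data).
demand = [float(month) for month in range(264)]
windows = make_windows(demand, time_steps=24)
```

Under these assumptions the grid contains 15 candidate configurations, of which the reported one corresponds to the pair `(2, 125)`.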

Benchmark Models
Four forecasting models (SVR, ANN, ARIMA, and multiple linear regression (MLR)) were used for a comparative demonstration of the proposed LSTM model's performance in forecasting monthly residential electricity demand.

Support Vector Regression
SVR is a modified version of the SVM devised by Müller et al. [37] to address the time series forecasting problem. It has gained increasing attention over the past decades, especially in electricity demand forecasting [38]. The main objective of SVR is to find a hyperplane function that can recognize patterns in the given time series data. One advantage of this model is that it minimizes the upper bound of the generalization error rather than the training error [39]. SVR was selected as a benchmark model because previous studies have shown that SVR-based models can yield satisfactory performance across various electricity demand forecasting problems [2,40,41]. In this study, the hyperparameters of SVR, the penalty factor and gamma, were experimentally determined as 100 and 0.001, respectively.

Artificial Neural Networks
An ANN is a mathematical model designed to simulate the processes and functionality of the human brain [42]. Generally, the main objective in this context involves finding a relationship that can automatically map input to output through the training stage, in which the network is iteratively trained to minimize forecasting error [39]. ANN has the advantage of being applicable to multivariate models and capable of describing non-linear relationships between input and output from given historical data [39]. A back-propagation ANN, which combines a back-propagation algorithm with a feedforward multi-layer perceptron, is the most widely used network structure and was adopted in this study; the momentum and learning-rate parameters were experimentally determined as 0.2 and 0.3, respectively.

Autoregressive Integrated Moving Average (ARIMA)
ARIMA, the most well-known statistical method for time series analyses, can model complex patterns and forecast univariate time series data [43]. The ARIMA model has three important factors, denoted as p, d, and q [44,45], which represent the autoregressive, integration, and moving average factors, respectively. A general expression of the ARIMA (p, d, q) model is as follows [44]:

$$\Phi_p(B)\,(1 - B)^{d}\, Y_t = \delta + \theta_q(B)\, e_t$$

where $\Phi_p$ indicates the autoregressive parameter of order p, p indicates the number of autoregressive terms, $B$ represents the backshift operator, d indicates the number of non-seasonal differences, $Y_t$ indicates the actual value in the given time series data, $\theta_q$ is the moving average parameter of order q, q indicates the number of lagged forecasting errors in the forecasting equation, $e_t$ is a random perturbation or white noise, and $\delta$ indicates a constant value.
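The role of the integration factor d can be shown with a short sketch of non-seasonal differencing, in which the series is replaced by the change between consecutive values, d times over. The function name `difference` and the example series are illustrative assumptions; fitting the autoregressive and moving average terms is not shown.

```python
def difference(series, d=1):
    """Difference a series d times (each pass replaces values by consecutive changes)."""
    result = list(series)
    for _ in range(d):
        result = [curr - prev for prev, curr in zip(result, result[1:])]
    return result


# A made-up series with a quadratic trend: one pass leaves a linear trend,
# a second pass leaves a constant series.
series = [1, 4, 9, 16, 25]
first_difference = difference(series, d=1)
second_difference = difference(series, d=2)
```

Each pass shortens the series by one value, so a model with d differences is fitted on a series that is d observations shorter than the original.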

Multiple Linear Regression (MLR)
MLR, one of the most common statistical regression models, was also employed as a benchmark model. The objective of MLR is to model the relationship between a dependent variable and multiple independent variables [44]. The relationship between the vector of regressors and the dependent variable is assumed to be linear in MLR. The linear regression equation is fitted to the observations in the given time series, the x and y values, for forecasting purposes. An expression of the MLR model is as follows [46]:

$$y = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \varepsilon$$

where $\varepsilon$ represents an error or a random disturbance, $\beta_1, \beta_2, \ldots, \beta_n$ are the coefficients, and n refers to the number of independent variables x. The resulting model is used for forecasting the y value with additional observations x.

Performance Measures
In this study, the forecasting results were evaluated using six measures: mean absolute error (MAE), root-mean-squared error (RMSE), MAPE, post-error ratio C, mean bias error (MBE), and UPA. These performance measures evaluate the models in different ways. MAE and RMSE are absolute performance measures that allowed us to compare the actual deviation between the actual and forecasted values. On the other hand, MAPE is a relative measure that is scale-invariant and represents the relative criterion of the forecasting error between the actual and forecasted values [47]. The ratio C represents the ratio between the standard deviation of the forecasting errors and that of the actual values. A smaller value of C indicates that the model has a higher level of accuracy. Whether the MBE is positive or negative shows whether the forecasted values are overestimated or underestimated, and its magnitude shows by how much; an MBE closer to zero indicates a more accurate model. The value of UPA represents the forecasting model's performance in estimating peak demand, but it does not pair the forecasted value with the actual value in terms of space or time. In this study, the performance measures were calculated using the following equations:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}$$

$$\mathrm{MAPE} = \frac{100}{n}\sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$

$$C = \frac{S_2}{S_1}$$

$$\mathrm{MBE} = \frac{1}{n}\sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)$$

$$\mathrm{UPA} = \frac{\hat{y}_{\max} - y_{\max}}{y_{\max}} \times 100$$

where n is the number of validation points, $\hat{y}_i$ refers to the forecasted values, $y_i$ refers to the actual values, $\hat{y}_{\max}$ and $y_{\max}$ are the peak forecasted and actual values, and $S_1$ and $S_2$ denote the standard deviations of the actual values and the forecasting errors, respectively.
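The six measures can be computed with a few lines of standard-library Python. The sketch below rests on several assumptions that the text does not settle: errors are taken as forecast minus actual (so a positive MBE means overestimation), MAPE and UPA are expressed in percent, UPA compares the unpaired peaks of the two series, and the standard deviations in C are population standard deviations. The function name `forecast_measures` and the sample numbers are hypothetical.

```python
import math
import statistics


def forecast_measures(actual, forecast):
    """Return MAE, RMSE, MAPE, C, MBE, and UPA for two equally long series."""
    n = len(actual)
    errors = [f - a for a, f in zip(actual, forecast)]  # forecast minus actual (assumed sign)

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n

    s1 = statistics.pstdev(actual)  # spread of the actual values
    s2 = statistics.pstdev(errors)  # spread of the forecasting errors
    c = s2 / s1                     # post-error ratio

    mbe = sum(errors) / n           # positive: overestimation on average
    upa = 100.0 * (max(forecast) - max(actual)) / max(actual)  # peaks need not coincide in time

    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "C": c, "MBE": mbe, "UPA": upa}


# Made-up values for illustration only.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 380.0]
measures = forecast_measures(actual, forecast)
```

In this toy example the peak is underestimated (380 against 400), so UPA is negative, and the negative MBE shows that the forecasts are too low on average.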

Results and Discussion
The performances of the models for forecasting monthly residential electricity demand were tested and validated using the dataset and the variables described in the Dataset section. The results for the proposed LSTM model were compared with those of the four benchmark models: SVR, ANN, ARIMA, and MLR. Each model was trained on the first 240 months and then used to forecast the remaining 24 months. A comparison of the forecasting performance in terms of MAE, RMSE, MAPE, C, MBE, and UPA is summarized in Table 2. On each of the measures considered in this study, the proposed LSTM model outperformed the four benchmark models. Because MAE and RMSE express the absolute error between the actual and forecasted demand, no absolute threshold exists for judging either measure. Instead, a smaller value of MAE or RMSE indicates that the forecasted demand during the validation period was closer to the actual values, relative to the other forecasting models. The proposed LSTM model had the lowest MAE and RMSE values, which suggests that, within the validation period, it provided the most accurate forecast among the tested models.
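The 240/24 division is a chronological hold-out: the series is cut at a fixed point rather than shuffled, so the model never sees months that come after the ones it is asked to forecast. A minimal sketch of such a split follows. The helper name `chronological_split` is hypothetical, and the 1991 start year is an assumption inferred from the 22-year span and the 2011 to 2012 validation period.

```python
def chronological_split(series, n_train):
    """Split a time-ordered sequence into a training head and a validation tail."""
    if not 0 < n_train < len(series):
        raise ValueError("n_train must leave at least one point on each side")
    # No shuffling: preserving order avoids leaking future months into training.
    return series[:n_train], series[n_train:]


# Assumed calendar: 264 monthly (year, month) stamps starting in January 1991.
months = [(1991 + i // 12, i % 12 + 1) for i in range(264)]
train, valid = chronological_split(months, 240)
```

Under these assumptions, the training block ends in December 2010 and the validation block covers January 2011 through December 2012.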
The proposed LSTM model also showed superior performance for MAPE, achieving the lowest value, which did not exceed 1%. A MAPE value of less than 10% is interpreted as highly accurate forecasting [48] or acceptable forecasting accuracy [49], and a MAPE value of less than 1% is interpreted as near-perfect forecasting [49]. Thus, the LSTM model yielded superior performance in terms of both absolute and relative error. In terms of the ratio C, the proposed LSTM model also performed the best, yielding the lowest value. As this value represents the change rate of the forecasting error, a lower value implies a higher forecasting performance. Moreover, its C value did not exceed 0.35, which indicates near-perfect forecasting [49,50].
For MBE, positive and negative values represent underestimated and overestimated values, respectively, and an absolute value of MBE closer to zero indicates a more accurate forecasting performance. The proposed LSTM model attained a lower absolute value for MBE than SVR, which achieved the next best performance among the benchmark models. Additionally, the ability to forecast the peak demand month is critical for efficient power-system planning as well as for identifying errors in monthly demand forecasts [51]. Therefore, the UPA value, which measures the ability to forecast peak demand, is also meaningful when evaluating a model's forecasting performance. As described in Section 4.1, there were abnormal peak demands in 2011 and 2012. In 2011, the peak demand was recorded in January instead of August. In August 2012, there was a drastic surge in electricity demand compared to August 2011. Within the validation period, the proposed LSTM model gave the best peak-demand forecasting performance, with UPA values of 0.02 and −0.44 for 2011 and 2012, respectively; that is, the model overestimated the residential electricity usage peak by 0.02% in 2011 and underestimated it by 0.44% in 2012. Although the proposed model outperformed the other models with respect to the deviation between the actual and forecasted peak demand in 2012, its peak forecast for 2012 was less accurate than that for 2011.
Figure 6 shows a graphical comparison of the actual residential electricity demand during the validation period and the demand forecasted by the proposed LSTM model. As shown in the graph, the electricity demand used for validation gradually increased and fluctuated strongly, following seasonal trends as well as national economic and demographic growth. Consequently, the validation data had a more complex pattern than the data from the whole observation period considered in this study. Nonetheless, the proposed LSTM model, plotted in blue, forecasted residential electricity demand that was very close to the actual electricity demand. This result means not only that the proposed LSTM model can forecast future patterns of electricity demand but also that the forecasted monthly demand is highly accurate, as indicated by the statistical results in Table 2.
The graphical comparisons of the results are shown in Figures 7 and 8. Figure 7 provides a visual comparison of the accuracy of the electricity demand forecasts between the proposed LSTM model and the four benchmark models. As indicated, the benchmark models clearly deviated from the actual demand, whereas the deviation between the actual and forecasted values of the proposed LSTM model is too small to be discerned visually.
Figure 7. A visual comparison of the forecasting results for the five models.
Figure 8 shows the deviation between the actual and forecasted values in each month to render a sharper and more explicit comparison of the performance in forecasting residential electricity demand. As indicated, the benchmark models clearly deviated from the actual values, indicating underestimation or overestimation, whereas the proposed LSTM model exhibited few deviations. To visually evaluate the proposed model's performance, Figure 8b illustrates the deviations with the vertical axis magnified 10 times over the portion marked in blue in Figure 8a.
As indicated, the proposed model showed a deviation of less than 50 kWh (27.47 kWh), even in August 2012 when the aforementioned abnormal peak demand occurred. On the other hand, as shown in Figure 8c-f, the deviations between the forecasted electricity demand and the actual consumption for the benchmark models were substantial, falling outside the blue zone. These results visually demonstrate that the proposed model forecasted monthly demand for electricity in the residential sector more accurately than the benchmark models. Therefore, the experimental results indicate that an accurate model for forecasting monthly residential electricity demand was achieved using the deep learning technique on the present dataset, which agrees with recent research that also used deep learning techniques to investigate a wide range of electricity demand forecasting problems.

Conclusions
As the residential sector contributes significantly to electricity consumption worldwide and its proportion of consumption is on the rise, developing a forecasting model capable of making accurate predictions is important for efficient power-system planning. In this regard, this study aimed to propose and develop an accurate forecasting model for monthly residential electricity demand. To achieve this objective, LSTM, one of the most powerful deep learning algorithms, was trained and validated on 22 years of data collected from South Korea. The electricity demand series, together with 10 social and weather-related variables, was used as the input. Then, the model's forecasting performance was evaluated and compared to the models employed in previous studies in terms of six performance measures: MAE, RMSE, MAPE, C, MBE, and UPA.
Several notable findings can be drawn from the results of this study. Previous studies demonstrated relatively little understanding of the value of employing deep learning algorithms, especially to forecast monthly residential electricity demand at the regional or national level. By considering social and weather-related variables, the proposed model achieved a highly accurate forecasting performance, with a MAPE value of less than 1. Although the proposed LSTM-based model outperformed the four benchmark models on all of the evaluated performance measures for the present dataset, a comprehensive evaluation using data from across the world is recommended for future research to determine whether these findings and conclusions generalize to other countries and different situations.
The proposed model is expected to contribute to efficient power-system planning by accurately forecasting monthly residential electricity usage, which accounts for a substantial and growing share of total electricity usage. The ability to forecast monthly electricity demand with high accuracy can enable power-system personnel and market players to make sustainable power-system planning decisions that ensure efficient production and utilization of resources. This deep learning-based forecasting model could also be applied to other problems associated with analyzing time series and optimizing energy processes, energy conservation, and the sustainable use of energy resources.
Recommended future research includes studies of different forecast resolutions, such as 10 to 15 min in the future, for smart grid applications, which are currently receiving tremendous attention worldwide. In addition, by using detailed data for each city or groups of residential buildings, the deep learning-based model could be adapted for use in fields other than power-system planning.