Electric Energy Consumption Prediction by Deep Learning with State Explainable Autoencoder

As energy demand grows globally, the energy management system (EMS) is becoming increasingly important. Energy prediction is an essential component in the first step to create a management plan in EMS. Conventional energy prediction models focus on prediction performance, but in order to build an efficient system, it is necessary to predict energy demand according to various conditions. In this paper, we propose a method to predict energy demand in various situations using a deep learning model based on an autoencoder. This model consists of a projector that defines an appropriate state for a given situation and a predictor that forecasts energy demand from the defined state. The proposed model produces consumption predictions for 15, 30, 45, and 60 min with 60-min demand to date. In the experiments with household electric power consumption data for five years, this model not only has a better performance with a mean squared error of 0.384 than the conventional models, but also improves the capacity to explain the results of prediction by visualizing the state with t-SNE algorithm. Despite unsupervised representation learning, we confirm that the proposed model defines the state well and predicts the energy demand accordingly.


Introduction
As industrialization has progressed globally, the demand for energy has become so high that energy is now an important topic in national policy [1]. In addition, energy use is rapidly increasing due to economic growth and human development [2]. These phenomena can be attributed to uncontrolled energy use such as overconsumption, poor infrastructure, and wastage of energy [3]. Among the consumers of various energy sources, Streimikiene estimates that residential energy consumption will account for a large proportion by 2030 [4]. According to Zuo, building energy consumption accounts for 39% of the United States' total energy use [5]. Energy management systems (EMS) such as the smart grid have been proposed to control the soaring demand for energy.
One work cycle of the EMS is the Plan-Do-Check-Act (PDCA) cycle, as depicted in Figure 1 [6]. Formulating an energy plan is the first step. This involves deciding the initial energy baseline, the energy performance indicators, the strategic and operative energy objectives, and the action plans. In the "do" phase, planning and action take place. The plans carried out in the previous phase then have to be checked to ensure that they are effective. In the last phase, the results are reviewed, and a new strategy is established. Among the four stages, the "plan" phase is very important because it is the stage of establishing an energy use strategy and it includes an energy demand forecasting step. Therefore, it is necessary to study the energy prediction model to construct an efficient EMS. Many researchers have conducted studies with various methods to predict energy demand. In the past, machine learning techniques such as the support vector machine (SVM) and linear regression (LR) were widely used. However, as shown in Figure 2a, energy demand values over time are complex and noisy, which limits the performance of these techniques. As depicted in Figure 2b, analyzing the patterns of energy demand with the Fourier transform reveals that it has complex features. For quantitative analysis, a t-test and ANOVA were performed on the dataset used in this paper, as shown in Table 1.
In the statistical analysis using the t-test, two groups (e.g., two different months in monthly demand) are chosen at random and a p-value is computed for them; we then compute the average over all possible samplings. In the case of ANOVA, the p-value is computed from all the groups in each of month, date, and hour. Because this analysis shows that the characteristics of energy demand are complex, we conduct our research with deep learning, which can extract such difficult characteristics and solve the task.
Energies 2019, 12, 739
Studies conducted using deep learning have contributed a lot to improving prediction performance, but they have not led to more utility from an energy management system perspective. If the energy demand forecasting model in the EMS can cope with various situations when predicting the demand, a more efficient system can be built. Therefore, in this paper, we propose a model that predicts future energy demand from the demand observed so far, considers various situations, and predicts energy demand according to those situations. This model consists of a projector that defines the state based on the energy demand to date and a predictor that predicts future energy demand from the defined state. We can adjust the automatically learned intermediate state to predict the energy demand under various situations. The main contributions are summarized as follows.

• We propose a novel predictive model that can be explained by not only predicting future demand for electric power but also defining the current demand pattern as a state.

• Our model predicts a very complex power demand value with stable and high performance compared with previous studies.

• We analyze the state defined in latent space by the proposed model and investigate a model that predicts the power demand by assuming various explanations.
The rest of the paper is organized as follows. Section 2 introduces the previous studies on forecasting energy demand and addresses their limitations. To overcome these shortcomings, we propose our model in Section 3. Section 4 shows the results of energy demand forecasting with the proposed model and also shows the results of forecasting the demand under various situations. In the final section, conclusions and discussion are presented.

Related Works
Several studies have been conducted to predict the energy demand mentioned in Section 1. Table 2 summarizes the previous studies. In the past, statistical techniques were mainly used to predict energy demand. Munz et al. predicted a time series of irregular patterns using k-means clustering [7]. Kandananond used different forecasting methods, namely autoregressive integrated moving average (ARIMA), artificial neural network (ANN), and multiple linear regression (MLR), to predict energy consumption [8]. Cauwer et al. proposed a method to predict energy consumption using a statistical model and its underlying physical principles [9].
However, due to the irregular patterns of energy demand, statistical techniques have limited performance, and many prediction models using machine learning methods have been investigated. Dong et al. predicted the demand of building energy using SVM with consumption and weather information [10]. Gonzalez and Zamarreno forecasted the next temperature from the temperature to date using a feedforward neural network (NN) and predicted the requirement from the difference between them [11]. Ekici and Aksoy predicted building energy needs from the properties of buildings without weather conditions [12]. Li et al. estimated the annual energy demand using SVM with the building's transfer coefficient [13]. However, these studies only constructed models that predict the correct value corresponding to the input, so they lack a basis for explaining the influence of the input features. To solve this problem, Xuemei et al. set the state for forecasting energy consumption through fuzzy c-means clustering and predicted demand with fuzzy SVM [14]. Ma forecasted energy consumption with specific population activities or unexpected events, as well as weather conditions, as inputs of the MLR model [15]. Although the above studies set the state and forecasted future consumption based on it, they lacked a mechanism to identify the state accurately. As mentioned in Section 1, the energy consumption data contain large noise. Deep learning, which has recently emerged as a method for solving complex tasks, is efficient for predicting energy demand because it models the complex characteristics of data well [16]. Ahmad et al. forecasted energy demand by constructing a deep NN and inputting weather information and the building usage rate [17]. Lee et al. estimated environmental consumption by using a temporal model, a recurrent neural network (RNN), with energy consumption data and temporal features [18]. Li et al. proposed a method to predict energy demand with an autoencoder, one of the methods to represent data [19]. However, Li et al.'s model included only fully-connected layers, so temporal features were ignored, and it is hard to control the conditions because the latent space where the features of data are represented is not defined in that model.
Although some of the above studies provided novel research directions, they use other features such as weather and building information in addition to the energy demand value, which makes it costly to construct the model for energy consumption prediction. Besides, they lack the capability to explain the predicted value, because none of them studied a state with which to analyze the results of prediction. However, studies that analyze and explain the predictive results are essential for practical use of the predictive model. In this paper, we propose a model that defines a state based on the current usage pattern and date information and visualizes it, so as to be able to explain the results of prediction. Our model, like other prediction models, takes the energy demand up to now as input and predicts consumption in the future. However, in order to overcome the limitations of the end-to-end system, which cannot analyze the internal prediction process, we add an intermediate step that defines the state of the demand pattern.

Overview
As mentioned in Sections 1 and 2, nonlinear approaches, including those based on fuzzy logic and neural networks, have demonstrated successful performance in many applications [20][21][22][23]. In this paper, we contribute to this field of application by solving the power demand forecasting problem using a deep neural network-based method. Compared to the previous work, the overall architecture of our model consists of a projector f and a predictor g, similar to an autoencoder consisting of an encoder and a decoder, as shown in Figure 3 [24]. There are many ways to deal with time series data; here, f and g are based on long short-term memory (LSTM), one type of RNN, to handle time series data [25][26][27][28]. The predictor uses the output value of each time-step as the input of the next [29]. The projector defines the state by compressing the energy demand and transferring it to the latent space representing the demand information. The predictor predicts future energy demand based on the defined state. This is done by end-to-end learning of the projector and the predictor, and the process of defining the state is trained automatically. In the variational autoencoder (VAE), the state defined on the latent space contains the features of the produced data, and also contains the information of the expected energy consumption, as well as features of the input values [30,31]. Ma and Lee predicted the energy consumption by adding more information about the surrounding environment while learning [15,18]. However, unlike them, after learning to predict the consumption with only the demand to date, our model can predict future consumption by adjusting the state on the latent space according to the condition of the surrounding environment.

Consumption Representation
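Previous studies have constructed a model that maps energy consumption $X$ to a predicted value $Y$ as in Equation (1), while we define a state by adding a latent space $S$ between $X$ and $Y$, as shown in Equation (2).
$$f : X \rightarrow Y \tag{1}$$
$$f : X \rightarrow S, \quad g : S \rightarrow Y \tag{2}$$
We continuously update the state during the time interval $t$. The state $s^e_i$ for step $i$ during the time interval $t$ of input is defined as follows.
$$s^e_i = f(x^e_i, s^e_{i-1}) \tag{3}$$
where $e$ means "embedded state", $x^e_i$ is the $i$th input value of the projector, and $s^e_0 = 0$. $f(\cdot, \cdot)$ is an LSTM for the projector including one memory cell $c_t$ and three gates (input $i_t$, forget $f_t$, and output $o_t$). We calculate each value as follows.
$$i_t = \sigma(x_t U^i + s_{t-1} W^i) \tag{4}$$
$$f_t = \sigma(x_t U^f + s_{t-1} W^f) \tag{5}$$
$$o_t = \sigma(x_t U^o + s_{t-1} W^o) \tag{6}$$
$$\tilde{c}_t = \tanh(x_t U^c + s_{t-1} W^c) \tag{7}$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \tag{8}$$
$$s_t = o_t \odot \tanh(c_t) \tag{9}$$
where $U$ and $W$ are weights of the layer, $\sigma$ is the activation function, and $\tilde{c}$ and $c$ are the intermediary memory cell and the memory cell, respectively. We set $s^e_t$ as the last state of the projector (i.e., $s^e_t = f(x^e_t, s^e_{t-1})$). The process to define the state is learned automatically as the predictor interacts with the projector to predict the demand. The details are introduced in the next section.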
The state of the projector is located on the latent space, where the patterns and features of the input energy consumption are represented. Therefore, by controlling the state transferred to the latent space, it is possible to predict the future consumption as well as to analyze the current consumption situation.
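The projector described in this section can be illustrated with a minimal NumPy sketch. It assumes the common bias-free LSTM form (sigmoid gates, tanh candidate cell) and random, untrained weights; the helper names `init_params`, `lstm_step`, and `project` and the synthetic demand series are hypothetical and only serve to show how the last hidden state becomes the state vector.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(input_dim, state_dim, seed=0):
    """Random, untrained weights U (input) and W (recurrent) for each gate."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in ("i", "f", "o", "c"):
        params["U" + gate] = rng.normal(scale=0.1, size=(input_dim, state_dim))
        params["W" + gate] = rng.normal(scale=0.1, size=(state_dim, state_dim))
    return params

def lstm_step(x, s_prev, c_prev, p):
    """One LSTM update (assumed standard form, biases omitted)."""
    i = sigmoid(x @ p["Ui"] + s_prev @ p["Wi"])          # input gate
    f = sigmoid(x @ p["Uf"] + s_prev @ p["Wf"])          # forget gate
    o = sigmoid(x @ p["Uo"] + s_prev @ p["Wo"])          # output gate
    c_tilde = np.tanh(x @ p["Uc"] + s_prev @ p["Wc"])    # intermediary memory cell
    c = f * c_prev + i * c_tilde                         # memory cell
    s = o * np.tanh(c)                                   # new state
    return s, c

def project(x_seq, p, state_dim):
    """Run the projector over the input window and return the last state."""
    s = np.zeros(state_dim)  # initial state is zero
    c = np.zeros(state_dim)
    for x in x_seq:
        s, c = lstm_step(np.atleast_1d(x), s, c, p)
    return s

params = init_params(input_dim=1, state_dim=64)
demand = np.abs(np.sin(np.linspace(0.0, 6.0, 60)))  # synthetic 60-min window
state = project(demand, params, state_dim=64)
```

The state size of 64 follows the experimental settings reported later; in practice the weights would be learned jointly with the predictor rather than drawn at random.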

Demand Prediction
This section presents how to use the state set by the projector for forecasting future demand. First, as shown in Equations (10) and (11), the predictor predicts a single consumption value immediately after the state is input. The predictor then recursively forecasts the next single demand value from the predicted value.
$$g : (Y, S) \rightarrow Y \tag{10}$$
$$y_i = g(y_{i-1}, s^d_{i-1}) \tag{11}$$
where $d$ means "demand", $y_i$ is both the $i$th output and the $(i+1)$th input of the predictor, and $g(\cdot, \cdot)$ is an RNN for the predictor. It takes its own output from the previous time-step as input and is computed as in Equations (12) and (13).
Since the predictor forecasts the demand based on the first input state, if the user arbitrarily controls that state, the model can predict the demand under various conditions, resulting in a more effective energy management system. For example, adding different conditions, such as weather or economic information, can make the energy demand prediction model work more smoothly.
To train the proposed model, we use the L2 loss function shown in Equation (14), sampling the data with time intervals of t and T in X and Y, respectively. The projector interacts with the predictor and learns to define the state automatically.
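As an illustration of this sampling, the sketch below slices a minute-level series into input windows of length t and target windows of length T and evaluates a squared-error loss on one pair. The stride of one minute, the helper names `make_windows` and `l2_loss`, and the use of the mean of squared differences are assumptions, since the exact sampling scheme and the precise form of Equation (14) are not given here.

```python
def make_windows(series, t, T):
    """Return (inputs, targets): each input holds t consecutive values and
    each target holds the T values that immediately follow it."""
    inputs, targets = [], []
    for start in range(len(series) - t - T + 1):  # assumed stride of 1 minute
        inputs.append(series[start:start + t])
        targets.append(series[start + t:start + t + T])
    return inputs, targets

def l2_loss(y_true, y_pred):
    """Mean of squared differences (assumed form of the L2 loss)."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

# Toy series standing in for per-minute global active power readings.
series = [float(v) for v in range(100)]
X, Y = make_windows(series, t=60, T=15)
```

With t = 60 and T set to 15, 30, 45, or 60, this matches the input and output lengths used in the experiments.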
The algorithms for learning the proposed model and forecasting the future demand are as follows. In Algorithm 1, which takes the dataset (X, Y) as input, the two functions f and g are trained with the energy demand record for t minutes, $x_{1:t}$, and the energy demand to be predicted for T minutes, $y_{1:T}$. In Algorithm 2, we first obtain a state from the energy consumption $x_{1:t}$ and the projector f. Then, based on the computed state $s^e_t$, we predict a total of T minutes in 1-min steps with the predictor g. Here, $\theta_f$ and $\theta_g$ are the parameters of the projector f and the predictor g, respectively.
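The forecasting procedure described for Algorithm 2 can be sketched as a simple loop. This is only an illustration under stated assumptions: `toy_projector` and `toy_predictor` are hypothetical stand-ins for the trained LSTMs f and g, and seeding the predictor with the last observed value is an assumption, since the first predictor input is not specified here.

```python
def forecast(x_history, projector, predictor, horizon, initial_state=0.0):
    """Project the demand history to a state, then predict `horizon` steps
    recursively, feeding each prediction back in as the next input."""
    state = initial_state
    for x in x_history:               # projector: update the state step by step
        state = projector(x, state)
    y = x_history[-1]                 # assumed seed for the first prediction
    predictions = []
    for _ in range(horizon):          # predictor: one value per minute
        y, state = predictor(y, state)
        predictions.append(y)
    return predictions

def toy_projector(x, state):
    """Hypothetical stand-in for f: exponential moving average of demand."""
    return 0.5 * state + 0.5 * x

def toy_predictor(y_prev, state):
    """Hypothetical stand-in for g: blends the last output with the state."""
    return 0.5 * y_prev + 0.5 * state, state

history = [2.0] * 60                  # 60 minutes of constant demand
predicted = forecast(history, toy_projector, toy_predictor, horizon=15)
```

Because the predictor depends only on the state it receives, replacing that state before the second loop is what makes the state transition of the next section possible.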

State Transition
This section describes how the user can arbitrarily adjust the defined state to assume various situations and predict energy demand. Suppose energy demand $x_A$ observed under condition A (e.g., warm weather) comes in as input, and we want to predict energy demand under condition B (e.g., cold weather). First, the average $\bar{s}_A$ of the states of the demand under condition A and the average $\bar{s}_B$ of the states of the demand under condition B are calculated. If $\bar{s}_A$ is subtracted from the state $s_A$, which is defined by inputting $x_A$ to the projector, and $\bar{s}_B$ is added, condition A is removed from the state and condition B is added. We perform the state transition with this process and forecast the energy demand according to various conditions.
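Written out, the transition described above amounts to a shift of the state by the difference of two class means. The symbols $X_A$ and $X_B$ (the sets of demand windows observed under each condition) and $s_{A \rightarrow B}$ (the shifted state) are notation introduced here for illustration only.

```latex
\bar{s}_A = \frac{1}{|X_A|} \sum_{x \in X_A} f(x), \qquad
\bar{s}_B = \frac{1}{|X_B|} \sum_{x \in X_B} f(x), \qquad
s_{A \rightarrow B} = f(x_A) - \bar{s}_A + \bar{s}_B
```

The predictor g is then run from $s_{A \rightarrow B}$ instead of from $f(x_A)$.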

Dataset and Experimental Settings
To verify the proposed model, we use a dataset on household electric power consumption [32]. There are about two million minutes of electric energy demand data from 2006 to 2010, and they are divided into training and test data at a 9:1 ratio. The dataset consists of eight attributes: date, global active power (GAP), global reactive power (GRP), global intensity (GI), voltage, and sub metering 1, 2, and 3 (S1, S2, and S3); the model predicts the GAP. S1 corresponds to the kitchen, containing mainly a microwave, an oven, and a dishwasher. S2 corresponds to the laundry room, containing a refrigerator, a tumble-drier, a light, and a washing-machine. S3 corresponds to an air-conditioner and an electric water-heater. The statistical summary of each feature is described in Table 3. To train the time series model, we use the backpropagation through time (BPTT) algorithm and the Adam optimizer with default hyperparameters in the Keras library for Python [33,34]. All weights are initialized with Glorot initialization [35]. The operating system of the computer used in our experiments was Ubuntu 16.04.2 LTS and the central processing unit was an Intel Xeon E5-2630V3. The random-access memory was Samsung DDR4 16 GB × 4, and the graphic processing unit was a GTX Titan X D5 12 GB. The number of hidden units in the deep learning approaches, including our model (i.e., the size of the state s), was set at 64.

Demand Prediction
To verify the performance of the proposed model, we show the energy demand forecasting results of our model and compare them with other conventional methods. Figure 4 shows the real and predicted energy demand values together. The model predicts energy demand for 15, 30, 45, and 60 min from the actual energy demand for 60 min. Although the model could not predict the energy demand perfectly, we confirm that the energy demand pattern is predicted well. We show the convergence of the learning algorithm experimentally by plotting the change of the loss value as learning progresses in Figure 5. Our model is compared with conventional machine learning methods such as linear regression (LR), decision tree (DT), random forest (RF), and multilayer perceptron (MLP), and with deep learning methods such as LSTM, stacked LSTM, and the autoencoder model proposed by Li et al. Stacked LSTM is a model that includes two LSTM layers, similar to our model, but does not set the state. The model proposed by Li et al. has one hundred hidden units and four hidden layers. The MSE of the experimental results for each model is shown in Figure 6 as a box plot. The comparison shows that the proposed model outperforms the other models. We can confirm that the conventional machine learning methods (LR, DT, RF, and MLP) show a large variation in prediction performance, whereas the deep learning methods (LSTM, stacked LSTM, Li et al.'s model, and ours) train stably. Some of the deep learning methods are worse than the machine learning methods, but our model yields the best performance. To examine the performance of the prediction model, we use three evaluation metrics: the mean squared error (MSE), the mean absolute error (MAE), and the mean relative error (MRE), which can be calculated respectively as follows.

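$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$$
$$\mathrm{MRE} = \frac{1}{n} \sum_{i=1}^{n} \frac{|y_i - \hat{y}_i|}{|y_i|}$$
where $y_i$ is the actual demand, $\hat{y}_i$ is the predicted demand, and $n$ is the number of predicted values.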
We conducted the experiments with 10-fold cross validation and the average values for each metric are shown in Table 4.
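The three evaluation metrics used in this section can be sketched in a few lines of plain Python. The sketch assumes the usual definitions (MRE taken as the mean of absolute errors divided by the absolute actual value) and uses made-up numbers, not values from the experiments.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def mre(y_true, y_pred):
    """Mean relative error; assumes no actual value is zero."""
    return sum(abs(a - b) / abs(a) for a, b in zip(y_true, y_pred)) / len(y_true)

# Illustrative values only.
actual = [1.0, 2.0, 4.0]
predicted = [1.5, 2.0, 3.0]
scores = {
    "MSE": mse(actual, predicted),
    "MAE": mae(actual, predicted),
    "MRE": mre(actual, predicted),
}
```

Under 10-fold cross validation, each metric would be computed once per fold and then averaged.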

State Transition
We empirically verify whether our model can automatically learn the capacity to define the state, as described in Section 3. We extract the output of the projector to get the states and visualize them as shown in Figure 7. We use the t-SNE algorithm to visualize the state [36]. Although the consumption data themselves are not clearly separated by month, we confirm that the states are clustered by month on the latent space even with unsupervised representation learning. Roughly, the distribution of data can be divided into right (January, February, May, October, and November), left (August and September), top (December), center (March), center-right (June), center-left (July), and center-top (April). We annotate the plotted points of each month in the figure and also plot each month separately to show its state, resulting in twelve plots. It can be seen that the defined state is well clustered on a monthly basis, achieving low intra-class variability.
We also confirm empirically that our model not only defines the state well but also has the capacity to adjust the prediction by controlling the state on the latent space. This is effective for an EMS, for example, because the electricity demand prediction can be adapted flexibly to the climate or the economic situation. We conduct the experiment on controlling the condition of the month, and other conditions are left for future study. An example of a state transition that controls a state on the latent space is shown in Figure 8. If we want to forecast the energy demand in October with only the consumption in April, we project the demand in April into the latent space to extract the state. After extracting the state for the electric energy consumption at one point in April, we add the average value of the states for October, $v(x_{OCT})$, subtract the average value of the states for April, $v(x_{APR})$, and feed the result into the predictor to get the predicted values.
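The April-to-October transition can be sketched as follows. The two-dimensional states and their values are made up for illustration (the model's states have 64 dimensions), and `mean_state` and `transition` are hypothetical helper names.

```python
def mean_state(states):
    """Element-wise average of a list of equally sized state vectors."""
    n = len(states)
    return [sum(column) / n for column in zip(*states)]

def transition(state, source_states, target_states):
    """Shift a state from the source condition to the target condition:
    subtract the source mean and add the target mean."""
    source_mean = mean_state(source_states)
    target_mean = mean_state(target_states)
    return [s - a + b for s, a, b in zip(state, source_mean, target_mean)]

# Toy states standing in for projector outputs of April and October windows.
april_states = [[1.0, 0.0], [3.0, 2.0]]
october_states = [[5.0, 4.0], [7.0, 6.0]]

# State of one April window, moved toward the October region of latent space.
shifted = transition(april_states[0], april_states, october_states)
```

The shifted state would then be passed to the predictor in place of the original April state.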
Table 5 shows the average value of the predicted electric power consumption for one hour, in minutes, to determine whether the demand pattern for April is changed to the consumption pattern for October after the state transition. Each column shows the month of the input electric energy demand to date, and each row shows the output month of the state transition used to predict the desired pattern for the specified month. The ground truth (GT) is the average electric energy consumption for each month. Because the predicted consumption after conditioning is similar to the GT, we conclude that the model can adjust the state on the latent space.

Conclusions
We have addressed the importance of energy demand prediction and proposed a model for it. Unlike conventional machine learning or deep learning models, it predicts electric energy consumption by defining a state. The model is divided into two parts (projector and predictor), which interact with each other and learn to set a state automatically without any supervision. We achieve the best forecasting performance compared to the other models, analyze the state, and examine the basis for the predicted consumption value. In addition, the state transition method shows that our model can be more efficient because we can control the predicted electric energy consumption values according to various situations by adjusting conditions. For example, if we add several conditions to the state, such as weather or economic information, we can predict electricity demand accordingly.
In this paper, we have conducted state transition experiments only for the condition of the month. In future work, we will experiment with various conditions such as weather, economy, or other events. In addition, only the energy consumption of one individual household is predicted in this paper, so we plan to collect the demand of several buildings, add building information to the state, and propose a model that can predict the energy consumption of various buildings. Finally, we will construct an efficient energy management system that includes the proposed prediction model.

Figure 1. Plan, do, check, and act (PDCA) cycle for EMS. We focus on the "plan" phase in this paper.

Figure 2. (a) The electric energy demand for each date, and (b) the result of Fourier transform.

Figure 3. The overall scheme of the proposed method.

Algorithm 1. Learning algorithm for the proposed method.

Figure 4. The predicted electric energy consumption and the actual demand. We show the prediction results for (a) 15, (b) 30, (c) 45, and (d) 60 min.

Figure 5. The training and test loss values for each epoch. We show the mean squared error of the model that predicts electric energy demand for (a) 15, (b) 30, (c) 45, and (d) 60 min.

Figure 7. Visualization of states for monthly electricity demands. We describe the cluster of states as red circles on a monthly basis.

Figure 8.

Table 1. Results of statistical analysis of electric energy demand by month, date, and hour.

Table 2. The summary of related works.

Table 3. The summary of the data used in this paper.

Table 4. The numerical results of experiments. Our model outperforms the other models in most metrics.

Table 5. The results of experiments on state transition. Each column shows the month of the input electric energy demand and each row shows an output month after state transition. The ground truth (G.T.) is the average electric energy consumption for each month.