Dual Deep Learning Networks Based Load Forecasting with Partial Real-Time Information and Its Application to System Marginal Price Prediction

Abstract: Load power forecasting is one of the most important tasks in power system operation and maintenance, and improving its accuracy helps power system scheduling. This paper presents how to use partial real-time temperature information in load power forecasting, which is usually done using past load power and temperature data only. Partial real-time temperature information means temperature information for only part of the entire prediction time interval. To this end, a long short-term memory (LSTM) network is trained using past temperature and load power data in order to forecast load power, where the forecasted load power depends implicitly on the temperature prediction. Then, in order to deal with the case where nontrivial temperature prediction errors occur, a multi-layer perceptron (MLP) network is trained using past data describing the relation between temperature variation and load power variation. The temperature is then measured at the beginning of the prediction time interval, and the compensated load forecast is computed by adding the output of the LSTM to that of the MLP, whose input is the temperature prediction error. It is shown that the proposed compensation using real-time temperature information indeed improves the performance of load power forecasting. The improved load forecast is then used to predict the system marginal price (SMP). The proposed method is validated using real temperature and load power data from South Korea.


Introduction
Accurate load power forecasting plays an important role in maintaining the stability and safety of a power system [1][2][3]. Moreover, operational decisions in power systems, such as maintenance scheduling, depend strongly on the behavior of the load pattern itself [3]. In particular, with good forecasting results, the electric power company can supply sufficient power to the loads under unexpectedly large changes in operating conditions, for example, a sudden temperature decrease in winter or increase in summer. If such a situation is not handled well, a partial or total blackout can take place. Load power forecasting is used not only to maintain the safety of power system operation but also to assist the scheduling of storage facilities. It is also used as an additional input to improve system marginal price (SMP) forecasting in order to maximize profit for both the electric power company and the consumer [4]. Load forecasting depends on many factors, especially natural parameters, because the load is weather-sensitive. Hence, accurate prediction of weather variables such as temperature or humidity can improve the performance of load forecasting.
The daily temperature is one of the important natural parameters that affect the amount of daily load demand [2]. Accordingly, it is important to consider temperature as an input parameter for load forecasting in addition to past load demand. Researchers exploited the relation between temperature and load demand using statistical methods until more advanced methods such as artificial intelligence were developed [1][2][3][5][6][7][8]. Load power forecasting that uses historical load demand together with temperature as input parameters was shown to give accurate results [1][2][3].
Another application of load demand forecasting is SMP prediction. Nowadays, power generation is moving from centralized to decentralized operation. This means that many electric power companies generate their own electricity and sell it to their customers at various prices. They submit their bidding prices at competitive rates, which creates strong market competition among the companies. Good SMP prediction can therefore help both electric power companies and consumers maximize their benefits. Since load power demand and SMP are closely related to each other, load power demand needs to be considered in order to determine the SMP at a specific hour. In single-round auctions, there are three main parts in determining the SMP at a specific time [4]. The electric power producers submit their generation bidding prices, and the electricity consumers submit their own bidding prices. A market-clearing tool then clears the market and determines the SMP at that time, which is accepted by both the producing and consuming companies. In view of this procedure, load forecasting is important for both electric power producers and customers.
Owing to its significance, there are many approaches to load forecasting in the literature. Well-known statistical methods used for load forecasting include the simple moving average (SMA), simple exponential smoothing (SES), and the autoregressive integrated moving average (ARIMA), to name a few [9]. Each of these statistical methods has its own characteristics. The SMA computes the average of the last n values. Exponential smoothing smooths the forecast by filtering noise out of the data, which can give a better forecasting result. The ARIMA method forecasts future values from past values of the signal under consideration using its time series properties. Besides statistical methods, artificial intelligence (AI) methods are also used for forecasting. Owing to their universal approximation property, AI-based methods show good performance compared with statistical methods when the signal is generated by a very complicated mechanism. Combining statistical and AI methods in a single process has been proposed in order to take advantage of each method [6]. In [6], a basic support vector machine (SVM) is used to forecast the load for the next 24 h from historical load and temperature, while an artificial neural network (ANN) is used to extract components from the historical valley and peak data of load power and temperature. In [7], a comparative analysis of ARIMA, ANN, and adaptive neuro-fuzzy methods is carried out to determine which has the best load forecasting capability; historical load power data with different sampling times is the only input to each method. In [8], support vector regression (SVR) is used to forecast load power every 5 min from historical load power data and is compared with the back-propagation neural network (BPNN) algorithm. Historical data of both load demand and temperature are utilized for load forecasting using machine learning techniques in [3,5]. The method in [3] uses an algebraic method for load prediction and SVR to compensate the load prediction result. In [2], a not-fully connected ANN is used to reduce the training time compared with a fully connected ANN. In [5], a part of the real past data is extracted to enhance the accuracy of the load prediction.
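To make the two simplest statistical baselines concrete, the following is a minimal sketch of a simple moving average and of simple exponential smoothing in plain Python. The function names, the smoothing factor `alpha`, and the sample load values are illustrative assumptions and are not taken from the cited works.

```python
def simple_moving_average(series, n):
    """Forecast the next value as the mean of the last n observations."""
    if n <= 0 or n > len(series):
        raise ValueError("n must be between 1 and len(series)")
    window = series[-n:]
    return sum(window) / n


def exponential_smoothing(series, alpha):
    """Return the smoothed level after processing the whole series.

    Recursion: level_t = alpha * y_t + (1 - alpha) * level_{t-1},
    initialised with the first observation.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1.0 - alpha) * level
    return level


# Hypothetical hourly load values in MW, for illustration only.
load = [100.0, 110.0, 120.0, 130.0]
sma_forecast = simple_moving_average(load, n=2)
ses_forecast = exponential_smoothing(load, alpha=0.5)
```

Both functions return a one-step-ahead forecast; a smaller `alpha` gives a smoother but slower-reacting estimate.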
There are also many approaches to SMP forecasting in the literature. In [4], a time series analysis based method is proposed to forecast the electricity price, for which dynamic regression and transfer function models are utilized. A linear programming based method is proposed in [10] to model real-time electricity pricing with the aim of optimally reducing the customer's electricity expenditure. In addition to mathematical and statistical methods, machine learning based approaches are also used for electricity price forecasting. For instance, rather than using only historical SMP data as the main predictor, ref. [11] also considers natural gas and oil prices as predictors, because they are strongly related to SMP in a country where the majority of power plants run on oil and gas. The main result in [12] forecasts SMP by considering oil and gas prices as additional predictors in an SVM model. In [13], one-day-ahead SMP forecasting is proposed, and the prediction accuracy is increased, especially during peak time, by utilizing the weekly variation of SMP data. To further enhance forecasting results, deep learning is also utilized for short-term SMP forecasting because of its well-known capability to model nonlinear data [14,15]. In addition, recurrent neural networks (RNNs) and long short-term memory (LSTM) networks are used as forecasting methods because of their outstanding capability to predict time-dependent data compared with a general multi-layer perceptron (MLP) [16].
It is reported that load power is closely related to SMP forecasting [15,17]. This paper intends to utilize load power forecasting to enhance SMP forecasting by using both an LSTM and an MLP. The paper focuses on enhancing short-term load demand forecasting by using partial real-time information on top of historical load demand and temperature data. SMP forecasting, as an application, is also enhanced by using this load forecasting result. Partial real-time information means the temperature information for a part of the whole forecast period. In other words, only partial temperature information is used in the MLP to compensate the load forecast. In addition, this paper uses the relation between ∆P (load power variation) and ∆T (temperature variation) obtained from past data; that is, the load changes induced by temperature changes are investigated from the past load demand and temperature data. Based on these data, an LSTM is first trained using the past load and temperature data to forecast the load power for the next 24 h. Then, an MLP is trained using the relation between past ∆P and ∆T in order to compute ∆P̂ (i.e., the amount of compensation) from the measured ∆T. This measured ∆T, which is called the partial real-time information in this paper, is the difference between the temperature prediction and the temperature measurement over the initial part of the entire prediction time interval. A similar method is applied to SMP prediction in order to enhance its performance. The proposed method is implemented using real temperature, load, and SMP data from South Korea, and the numerical simulations show that the performance of both load and SMP forecasting is indeed improved.

SMP Forecasting
Nowadays the operation of the electric power industry has shifted away from a centralized approach. This means that price competition among electric power producers has become more complicated than ever, and the profit of each company depends on how well it can provide electric power to the consumer with high reliability and at low cost [4]. Therefore, SMP forecasting is a crucial problem for both electric power producers and consumers. Knowing the SMP forecast at a specific time helps generation companies and consumers make their own bidding strategies to maximize profit [4]. Since SMP is highly volatile and depends on many factors such as oil price, load demand, and even natural parameters, forecasting SMP is a challenging problem. In order to improve the forecasting result, other factors need to be utilized in addition to the historical SMP data itself. One of the most important of these factors is load demand: if an accurate load demand forecast is provided as an additional input, the SMP forecasting performance can be improved. In our numerical experiment, two kinds of load power demand data are used as network inputs. The first is the past load demand, and the second is the load demand forecast for the next 24 h. In other words, SMP forecasting is expected to improve if both past and forecasted load power demand are provided as additional inputs to the network.

MLP
The multi-layer perceptron (MLP) is a classical type of neural network consisting of multiple hidden layers, each with a nonlinear activation function [18,19]. The MLP is typically used for simple classification and regression tasks. Its essential role is to find a relationship between input and output data [19]. Namely, when input data and the corresponding desired output (also called the label) are given, the mapping between them is identified using the MLP [16].
Figure 1 shows a representative MLP structure with multiple hidden layers. As the nonlinearity of the MLP increases with the number of layers, its ability to fit more complex data is enhanced.
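The input-to-output mapping computed by an MLP can be illustrated with a short forward pass in plain Python. This is only a sketch: the tanh activation, the linear output layer, and the hand-picked weights are assumptions made for illustration, since the activation and layer sizes are not specified at this point in the text.

```python
import math


def dense(x, weights, biases):
    """Affine layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def mlp_forward(x, layers):
    """Apply tanh after every layer except the last (linear output for regression)."""
    for index, (weights, biases) in enumerate(layers):
        x = dense(x, weights, biases)
        if index < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x


# Hypothetical weights: 1 input -> 2 hidden units -> 1 output.
layers = [
    ([[1.0], [-1.0]], [0.0, 0.0]),
    ([[0.5, -0.5]], [0.1]),
]
y_zero = mlp_forward([0.0], layers)
y_one = mlp_forward([1.0], layers)
```

Training would adjust the weights and biases so that the output matches the labels; only the forward computation is sketched here.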

LSTM Model
SMP, temperature, and load demand data can be categorized as time-dependent data, which means that the value at a certain time depends on its past values. Among deep learning networks, the RNN is well known to be effective for time series prediction. The LSTM is an advanced version of the RNN with an emphasis on long-term memory performance and, because of its popularity, it is chosen in this paper. Figure 2 depicts the basic structure of the LSTM. Inside the LSTM network, there are substructures called gates that optionally let information through [19]. These gates are the forget gate, the input gate, and the output gate. The forget gate decides how much information from the cell state is kept or thrown away. σ(·) ∈ [0, 1] is a sigmoid activation function: a value of 1 represents keeping the information completely, while 0 represents throwing away all information from the cell state. In the functions of Figure 2, x_t and x_{t−1} are the current and previous inputs, and h_t and h_{t−1} are the current and previous hidden states.
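For reference, the gate computations of an LSTM cell are commonly written as below. This is the standard textbook formulation, expressed with the weight, bias, cell-state, and hidden-state symbols used for Figure 2; the concatenation notation [h_{t-1}, x_t], the intermediate names f_t, i_t, o_t, and the candidate state \tilde{C}_t are assumptions, and the form is assumed rather than confirmed to match the figure.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{(forget gate)}\\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{(input gate)}\\
\tilde{C}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_C\right) && \text{(candidate cell state)}\\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{(cell state update)}\\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{(output gate)}\\
h_t &= o_t \odot \tanh\left(C_t\right) && \text{(hidden state)}
\end{aligned}
```

Here \odot denotes element-wise multiplication, so a forget-gate value close to 1 keeps the corresponding entry of C_{t-1} and a value close to 0 discards it.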

Proposed Forecasting Method
In this section, the proposed load forecasting method is presented and its application to SMP prediction is introduced. It is assumed that hourly load power and temperature data sets are available for the past several years.

LSTM Based Load Power Forecasting using Past Load Power and Temperature Data Set
Load power forecasting using a past load power data set can be viewed as time series prediction. It is well known that load power depends heavily on temperature, since customers use heating or air conditioning systems more when the weather is cold or hot. Among many deep learning techniques, the LSTM is widely employed for time series prediction problems. Considering these observations, it is natural to construct the LSTM network shown in Figure 3 for load forecasting. Notice that the temperature is used as an input to the LSTM and is also forecasted for later use in the proposed method. The required input data for LSTM training consists of hourly historical load demand data P(t) and hourly historical temperature data T(t). Namely, those data sets are used as the input to the LSTM network to generate the forecasted load demand P̂ and temperature T̂ over some future time interval. In the approach described in Figure 3, an LSTM is trained using the past load demand and temperature. The trained LSTM generates a load power forecast every day at t_1 = 00:00 for the next 24 h using the temperature and load power data of the last 24 h. In forecasting the load power for the next 24 h, the temperature prediction over the same time interval is used implicitly. In other words, the LSTM computes the temperature prediction T̂ over the time interval using past temperature data, and this temperature prediction is then used to forecast the load demand. Therefore, if there is a nontrivial difference between the actual temperature and the temperature prediction over the interval, the accuracy of the load power forecast can decrease. The next subsection presents how to deal with this situation.
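The daily forecasting scheme described above implies a particular way of slicing the hourly records into training samples. The following is a minimal sketch, assuming that the records start at 00:00 and that each sample pairs one day of (load, temperature) values with the following day; the function name and the synthetic data are hypothetical, and the exact windowing used in the experiments is not stated in the text.

```python
def make_daily_windows(load, temp, hours=24):
    """Pair each `hours`-long block of [load, temp] rows with the next block.

    Returns a list of (past, future) tuples; `past` is the network input
    and `future` is the target for the following day.
    """
    if len(load) != len(temp):
        raise ValueError("load and temp must have the same length")
    samples = []
    # Step by whole days so that every forecast window starts at 00:00.
    for start in range(0, len(load) - 2 * hours + 1, hours):
        past = [[load[t], temp[t]] for t in range(start, start + hours)]
        future = [[load[t], temp[t]]
                  for t in range(start + hours, start + 2 * hours)]
        samples.append((past, future))
    return samples


# Three days of synthetic hourly data, for illustration only.
load = [float(h) for h in range(72)]
temp = [20.0 + 0.1 * h for h in range(72)]
samples = make_daily_windows(load, temp)
```

Each `past` block has shape 24 x 2, which is the usual (time steps, features) layout expected by recurrent layers in common deep learning libraries.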

Compensation of Load Forecasting Error due to Temperature Prediction Error
Figure 4 shows the main idea of the proposed method. Figure 4a describes how the LSTM generates the load and temperature forecasts using the past load and temperature data. In Figure 4b, T_real denotes the real temperature value. If there is a nontrivial difference between T_real(t) and T̂(t), the load forecast has to be compensated. Hence, when the temperature information over the initial part of the prediction time interval is available, the proposed method compensates for the difference using the MLP. In Figure 4b, P̂_comp(t) stands for the compensated load forecast. For the purpose of compensation, this paper looks into the quantitative effect of the temperature variation ∆T on load power in the past data. To be more precise, the quantitative relation between the temperature variation ∆T and the load power variation ∆P is investigated from the past temperature and load demand data. Using this relation as training data, an MLP is developed such that its training input is ∆T, its label is ∆P, and its network output is ∆P̂.
Note that ∆T and ∆P denote the temperature variation and the load demand variation, respectively. They are obtained from the past data set as shown in Table 1. By training an MLP using (∆T, ∆P) as the training data, i.e., the input-label pairs, it is possible to obtain the sensitivity of load demand with respect to temperature variation. In other words, after training, when an arbitrary temperature variation is given as the input to the trained MLP, its output is the corresponding load demand variation. This is shown in the case study section later. Figure 5 depicts the entire structure of the proposed forecasting method. As mentioned, the LSTM network generates both load demand and temperature predictions using past load power and temperature, while the MLP network takes the temperature variation ∆T as its input and outputs the predicted load power variation ∆P̂. In terms of the timeline, at 00:00 the trained LSTM generates the load demand and temperature forecasts for the next 24 h. Then, the temperature prediction error (i.e., T_real(t) − T̂(t), t ∈ [00:00, 08:00]) is measured, the error is used as the input to the trained MLP network, and the compensation is made for [08:00, 24:00]. In other words, the compensated load forecast is given by P̂_comp(t) = P̂(t) + ∆P̂(t).
This procedure is repeated every day.
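The daily compensation step can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the experiments: the trained MLP is replaced by a hypothetical stand-in (`toy_delta_p_model`) with a made-up constant sensitivity, the temperature error is taken as measurement minus prediction, and the stand-in is assumed to return one correction per remaining hour.

```python
def compensate_forecast(p_hat, t_hat, t_measured, delta_p_model, split_hour=8):
    """Return the compensated 24 h forecast P_comp(t) = P_hat(t) + dP_hat(t).

    p_hat, t_hat  : 24 hourly forecasts of load and temperature.
    t_measured    : temperatures measured during hours [0, split_hour).
    delta_p_model : callable (delta_t, n_hours) -> list of n_hours corrections.
    """
    if len(p_hat) != 24 or len(t_hat) != 24:
        raise ValueError("expected 24 hourly forecasts")
    if len(t_measured) != split_hour:
        raise ValueError("expected one measurement per hour before split_hour")
    # Temperature prediction error over the observed part of the day.
    delta_t = [t_measured[h] - t_hat[h] for h in range(split_hour)]
    # Predicted load correction for each remaining hour of the day.
    n_remaining = 24 - split_hour
    delta_p_hat = delta_p_model(delta_t, n_remaining)
    if len(delta_p_hat) != n_remaining:
        raise ValueError("model must return one correction per remaining hour")
    # Hours that have already passed keep the uncompensated forecast.
    corrected = [p + dp for p, dp in zip(p_hat[split_hour:], delta_p_hat)]
    return p_hat[:split_hour] + corrected


def toy_delta_p_model(delta_t, n_hours, mw_per_degree=500.0):
    """Hypothetical stand-in for the trained MLP (made-up sensitivity)."""
    mean_error = sum(delta_t) / len(delta_t)
    return [mw_per_degree * mean_error] * n_hours


# Synthetic forecasts and measurements, for illustration only.
p_hat = [60000.0] * 24
t_hat = [25.0] * 24
t_measured = [27.0] * 8
p_comp = compensate_forecast(p_hat, t_hat, t_measured, toy_delta_p_model)
```

Passing the model as a callable keeps the compensation logic independent of how the MLP is trained or which library hosts it.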

SMP Prediction
Good SMP prediction can help both the customer and the electric power company maximize their profit through appropriate bidding strategies. Naturally, SMP depends heavily on load demand. Hence, the enhanced load power forecast of the previous subsection can be used to improve SMP prediction. Motivated by the method in the previous subsection, SMP prediction is done in three steps. First, similarly to the load power forecast, SMP prediction can be made from past load power and SMP data sets via an LSTM network, since SMP can also be viewed as a time series. Second, by feeding the load power forecast of the LSTM-MLP based method of the previous subsection as an input to the LSTM for the SMP forecast, the performance of the SMP prediction can be improved. Third, again similarly to the previous subsection, the SMP prediction can be enhanced one step further using partial real-time load power information. In other words, an improved SMP prediction can be made using the real-time load power and SMP information over [00:00, 08:00]. For this, it is assumed that the relation between the past load demand variation and SMP variation is identified as described in Table 2, and that an MLP is trained using the relation between ∆P and ∆SMP. Hence, the trained MLP can generate the prediction of the SMP variation (i.e., ∆ŜMP) when the load demand variation (i.e., ∆P) is given. Figure 6 depicts the entire structure of the proposed SMP prediction method.

Case Studies
This section describes how to apply the proposed method. For this purpose, it is presented how to obtain the training and testing data as well as how to implement the proposed method using the LSTM and MLP. All the proposed methods were implemented using the deep learning libraries TensorFlow [20] and Keras [21].

Training Data Description and LSTM and MLP Setting
All data sets used in this paper are taken from a Government of South Korea website, which contains load power data sets for the mainland of South Korea from 2016 to 2018 [22]. To support the load power forecasting, the hourly temperature on the mainland of South Korea is provided in [23], while the SMP data sets for South Korea used in our SMP forecasting experiment are provided in [24]. In the experiment, all training data are normalized using the scikit-learn Python library.
All data are normalized to the interval [−1, 1] for the LSTM network and to the interval [0, 1] for the MLP network. In order to check that the proposed method performs consistently irrespective of the seasonal properties of the training data sets, both summer and winter data are used. Summer data is used first to validate the proposed method, and winter data is then used to confirm its generality. Table 3 summarizes the information about the training data. The network parameters used for training are summarized in Table 5. Note that 20% of the training data set is used for validation, as described in Figure 7. The loss function values on both the validation and training data are monitored to ensure that they keep decreasing gradually, in order to prevent overfitting or underfitting of the network.
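The two normalization ranges and the 20% validation hold-out can be illustrated without any library dependency. The sketch below assumes plain min-max scaling and assumes that the validation samples are taken from the end of the training set without shuffling; the helper names and sample values are hypothetical, and the experiments themselves rely on scikit-learn for the scaling.

```python
def min_max_scale(values, low, high):
    """Linearly map values so that min(values) -> low and max(values) -> high."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        raise ValueError("cannot scale a constant series")
    scale = (high - low) / (v_max - v_min)
    return [low + (v - v_min) * scale for v in values]


def train_validation_split(samples, validation_fraction=0.2):
    """Hold out the last fraction of the samples for validation (no shuffling)."""
    n_val = int(round(len(samples) * validation_fraction))
    cut = len(samples) - n_val
    return samples[:cut], samples[cut:]


# Hypothetical load values in GW, for illustration only.
load = [50.0, 60.0, 70.0, 80.0, 90.0]
lstm_input = min_max_scale(load, -1.0, 1.0)  # range used for the LSTM
mlp_input = min_max_scale(load, 0.0, 1.0)    # range used for the MLP
train, validation = train_validation_split(lstm_input)
```

In practice the minimum and maximum should be computed on the training portion only and then reused for validation and test data, so that no information leaks from the held-out samples.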

Network Training for Load Forecast
With the LSTM and MLP settings described in the previous subsection, network training for the method in Figure 5 is carried out with summer data first. Figure 8a shows the loss function of the LSTM network training using past load power and temperature as the network input. Figure 8b depicts the loss function of the MLP training with ∆T as the network input, ∆P as the label, and ∆P̂ as the network output, which is the compensation. In Figure 8, the red and blue lines stand for the loss function values on the training and testing data, respectively. Since the loss functions decrease and converge to very small values for both training and testing data, it can be concluded that the training was successful. In order to confirm that the method also works for data from another season, exactly the same training is done for winter data. As seen in Figure 9, the training again ends successfully in terms of convergence of the loss functions.

Load Forecast using the Trained Network
After training, the trained network is tested using 24 h of data from 5 September 2018. Figure 10 compares the actual load demand with its forecasts on 5 September 2018. In addition to the true load power data, the other three trajectories in the figure are the load forecast by the trained LSTM when its input is only past load power data, the forecast by the LSTM when its input consists of both load demand and temperature, and the load forecast by the proposed method described in Figure 5. The three trajectories show that the proposed method outperforms the other methods. For a quantitative comparison, Table 6 shows the root mean square error (RMSE) values for the three cases. Notice that the RMSE of the proposed method is the smallest. Moreover, the load forecast should be as close as possible to the peak load demand on the forecasted day. This is the most important aspect of load forecasting because the grid operator can use it to provide sufficient power during the peak time of the forecasted day. In Figure 10, the peak occurs at 17:00. Table 7 compares the load forecast at the peak time for each method. It shows that the proposed compensation method gives the most accurate forecast among the methods. One possibility for improving the result is to consider more real-time information. To this end, in addition to ∆T, the real power consumption ∆P during 00:00-08:00 is used as another piece of real-time information to train the MLP that predicts the load demand variation. Figure 11 shows the resulting compensated load forecast. The method using both ∆T and ∆P for the MLP training shows better results than the others, and Table 8 shows the corresponding RMSE values. To check the consistency of the proposed method, Figure 12 and Table 9 show the results obtained when the network trained in the previous subsection is used to forecast the load power for winter data. Note also that the forecast by the proposed method is used to generate the SMP prediction for the next 24 h in the next subsection.
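The two evaluation quantities used above, the RMSE over the day and the forecast at the peak hour, can be computed as in the following sketch. The function names and the three-point example series are illustrative assumptions and are unrelated to the values reported in the tables.

```python
import math


def rmse(actual, forecast):
    """Root mean square error between two equally long series."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def peak_hour(series):
    """Index of the largest value, i.e., the peak hour for hourly data."""
    return max(range(len(series)), key=series.__getitem__)


# Tiny synthetic example, for illustration only.
actual = [3.0, 5.0, 4.0]
forecast = [2.0, 5.0, 6.0]
error = rmse(actual, forecast)                 # sqrt((1 + 0 + 4) / 3)
peak = peak_hour(actual)
peak_error = abs(actual[peak] - forecast[peak])
```

Comparing `peak_error` across methods mirrors the peak-time comparison, while `error` summarizes the accuracy over the whole day.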

Network Training for SMP Prediction
Now, the training result for the method in Figure 6 is shown. Figure 13a shows the loss function of the LSTM training using past SMP data, past load power, and the next 24 h load prediction as the network input. Figure 13b shows the loss function of the MLP training, with the load power variation as the input, the SMP variation as the label, and the predicted SMP variation as the output of the MLP network. In view of the convergence of the loss functions in Figure 13, the LSTM and MLP training for SMP prediction was also successful.

SMP Prediction using the Trained Network
This section shows that the enhanced performance of the load forecast indeed improves SMP prediction. To this end, the load forecast of the previous subsection is used as another input of the network to predict SMP, as described in Figure 6. Figure 15 shows the actual and predicted SMPs on 5 September 2018. The SMP predictions are made in three different ways, as mentioned in the previous section. First, the SMP prediction is made by the LSTM using past SMP, past load power data, and the uncompensated 24 h load power forecast. Second, the uncompensated load power forecast of the first case is replaced by the proposed compensated forecast. Third, the SMP prediction is made according to the method depicted in Figure 6; namely, the partial real-time information on the load power is also used for compensation as the input to the MLP. As seen in Figure 15, SMP prediction by the proposed method gives a good result in the sense that it leads to the smallest RMSE, as shown in Table 10. Figure 16 and Table 11 show that the proposed method provides good SMP prediction for winter data as well. In view of these results, the proposed method performs well in both load power forecasting and SMP prediction. In the next subsection, as a byproduct of the proposed method, it is shown how the load demand and SMP change quantitatively when the temperature varies, using the trained MLPs.

Incremental Relationship Among Temperature, Load Power, and SMP
For the maintenance of power systems and the electricity market, it is helpful to predict how the load demand and SMP change in response to temperature variation. Figure 17 shows how the two trained MLPs of the previous sections can be used to see the relationship among ∆T, ∆P, and ∆SMP. Note that one of the main motivations of the paper is to unveil the relation between temperature variation and load demand variation quantitatively. Since an MLP is trained using (∆T, ∆P) in Section 3.2, it is possible to compute the load demand variation for a given temperature variation. In order to see this, an arbitrary ∆T trajectory, shown in Figure 18a, is applied to the cascade network in Figure 17. Figure 18b,c shows how the load demand and SMP respond to temperature variations. Although it is easy to reason qualitatively that the load demand and SMP increase when the temperature increases, Figure 18a-c shows that the proposed method can provide such a relationship quantitatively. This finding is expected to have many applications. For example, when an unusual temperature increase is forecasted for the next day during the summer, a power system operator can reallocate all possible generation resources to deal with the expected abrupt large increase in load demand. Figure 18a shows that ∆T is sampled 50 times. The value of ∆T is increased gradually from sampling time t = 1 to t = 32 in order to check the response of ∆P and ∆SMP. From t = 33 to t = 40, ∆T is decreased abruptly. The responses of ∆P and ∆SMP to the changing ∆T can be seen in Figure 18b,c for the summer and winter seasons. When ∆T is increased to between 6 and 8 °C, the load sensitivity in the summer season is 3000 MW < ∆P < 4500 MW, which is consistent with heavier use of air conditioning. For a similar reason, fewer heating systems are used in winter when the temperature increases, so the load sensitivity is −3000 MW < ∆P < −800 MW during the winter season. The response of ∆SMP shows behavior similar to that of ∆P with respect to ∆T.
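The cascade of the two trained MLPs can be sketched as a simple function composition. Everything numeric below is hypothetical: the two linear stand-ins replace the trained networks, their coefficients are made up, and the ∆T trajectory only imitates the described shape (gradual increase over 32 samples, abrupt decrease over 8 samples, 50 samples in total), with the final 10 samples assumed to stay at zero.

```python
def cascade(delta_t_series, temp_to_load, load_to_smp):
    """Feed each dT through the first model, then each dP through the second."""
    delta_p = [temp_to_load(dt) for dt in delta_t_series]
    delta_smp = [load_to_smp(dp) for dp in delta_p]
    return delta_p, delta_smp


def toy_temp_to_load(delta_t):
    """Hypothetical stand-in for the (dT -> dP) MLP; made-up 500 MW per degree."""
    return 500.0 * delta_t


def toy_load_to_smp(delta_p):
    """Hypothetical stand-in for the (dP -> dSMP) MLP; made-up coefficient."""
    return 0.01 * delta_p


# Assumed trajectory: ramp up to 8 degrees, drop back to 0, then hold.
ramp_up = [8.0 * (t + 1) / 32 for t in range(32)]
drop = [8.0 - 8.0 * (t + 1) / 8 for t in range(8)]
hold = [0.0] * 10
delta_t = ramp_up + drop + hold

delta_p, delta_smp = cascade(delta_t, toy_temp_to_load, toy_load_to_smp)
```

With trained networks in place of the stand-ins, the same composition would produce the kind of ∆P and ∆SMP responses discussed for the summer and winter seasons.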

Conclusions
In this paper, it was presented how to use real-time temperature and load demand information in order to compensate the load and SMP forecasts generated by an LSTM using past load demand and SMP data. For this purpose, an LSTM is first trained using the past load and SMP data, and their forecasts are made using the trained LSTM. Then, from the past temperature, load demand, and SMP data, it is found how temperature variations affect load demand variations and how load variations influence SMP variations. This finding is used to train an MLP. Hence, the trained MLP can generate a load demand variation forecast when the temperature variation is given and an SMP variation forecast when the load demand variation is given. Finally, when the real-time temperature and load demand are available, the MLP is used to predict the load demand and SMP variations. The predicted values are used to compensate the load demand and SMP forecasts made by the LSTM. In addition, it is shown how to use the trained MLP to predict load demand and SMP variations for an arbitrary temperature variation.
Future research includes developing a simpler deep learning structure than the two-step structure of the proposed method. In addition, it would be useful to develop a method for using the proposed load forecast in energy storage system (ESS) scheduling for safe grid operation. Reinforcement learning based grid scheduling with the proposed load forecast method is another possible application.

Figure 2. Repeating module in a long short-term memory (LSTM) structure.

C_t and C_{t−1} are the current and previous cell states. W_f, W_i, W_c, and W_o are the weight matrices connecting the input to each gate. b_f, b_i, b_C, and b_o are the biases in each gate's calculation.

(a) Forecast using past load demand and temperature (b) Compensation using partial real-time information

Figure 4. Description of the main idea of the study.

Figure 5. Description of the proposed load power forecasting method.

Figure 6. Description of the proposed system marginal price (SMP) forecasting method.

(a) Loss function of LSTM. (b) Loss function of MLP.

Figure 8. Loss function for load power forecast with summer data.

(a) Loss function of LSTM using winter data. (b) Loss function of MLP using winter data.

Figure 9. Loss function for load power forecast with winter data.

Figure 10. Load power forecast by the proposed method using summer data set.

Figure 11. Comparing ∆P and ∆T for compensation method.

(a) Loss function of the LSTM. (b) Loss function of the MLP.

Figure 13. Loss function for SMP prediction with summer data.

Figure 14 describes the training results for the winter data. Again, it can be seen that the network for the proposed method is well trained.

Figure 14. Loss function for SMP prediction with winter data.

Table 3. The training data (T, P, SMP) for load forecasting and SMP prediction.

Table 4 shows how the testing data is chosen for the trained LSTM and MLP.

Table 4. Testing data for summer and winter months.

Table 5. Parameters of LSTM and MLP.

Table 6. Root mean square error (RMSE) value for summer data.

Table 7. Comparison of load forecasting during peak time.

Table 8. Comparison of RMSE value of ∆P and ∆T as partial real-time information.

Table 9. RMSE value for winter data.

Table 10. RMSE value for summer data.

Table 11. RMSE value for winter data.