Article

Short-Term Electricity Demand Forecasting Using Deep Neural Networks: An Analysis for Thai Data

by Kamal Chapagain 1,*, Samundra Gurung 1, Pisut Kulthanavit 2 and Somsak Kittipiyakul 3,*

1 School of Engineering, Kathmandu University, Dhulikhel P.O. Box 6250, Nepal
2 Faculty of Economics, Thammasat University, Bangkok 10200, Thailand
3 Computer Engineering, College of Innovative Technology and Engineering, Dhurakij Pundit University, Bangkok 10210, Thailand
* Authors to whom correspondence should be addressed.
Appl. Syst. Innov. 2023, 6(6), 100; https://doi.org/10.3390/asi6060100
Submission received: 9 July 2023 / Revised: 21 October 2023 / Accepted: 22 October 2023 / Published: 27 October 2023

Abstract

Electricity demand forecasting plays a significant role in energy markets. Accurate prediction of electricity demand is the key factor in optimizing power generation and consumption, saving energy resources, and determining energy prices. However, integrating energy mix scenarios, including solar and wind power, which are highly nonlinear and seasonal, into an existing grid increases the uncertainty of generation, creating additional challenges for precise forecasting. To tackle such challenges, state-of-the-art methods and algorithms have been implemented in the literature. Artificial Intelligence (AI)-based deep learning models can effectively handle the information of long time-series data. Based on patterns identified in datasets, various scenarios can be developed. In this paper, several models were constructed and tested using deep AI networks in two different scenarios: Scenario 1 used data for weekdays, excluding holidays, while Scenario 2 used the data without exclusion. To find the optimal configuration, the models were trained and tested within a large space of alternative hyperparameters. An Artificial Neural Network (ANN)-based Feedforward Neural Network (FNN) showed the minimum prediction error for Scenario 1, and a Recurrent Neural Network (RNN)-based Gated Recurrent Unit (GRU) showed the minimum prediction error for Scenario 2. From our results, it can be concluded that the weekday dataset in Scenario 1, prepared by excluding weekends and holidays, provides better forecasting accuracy compared to the holistic dataset approach used in Scenario 2. However, Scenario 2 is necessary for predicting the demand on weekends and holidays.

1. Introduction

1.1. Background

An accurate Short-Term Load or Demand Forecast (STLF) system is essential to the establishment of an effective power planning and generation system and to the real-time operation of utilities. By providing accurate prediction of demand, generators can produce optimal power levels and save energy resources while ensuring that utilities have enough time to prepare for scheduling and balancing the electrical grid and related systems.
A balanced grid ensures a consistent supply of electrical power while accounting for demand and market factors, which ultimately lowers costs for consumers, reduces risk, and protects the utility provider [1,2,3,4]. Accurate load forecasting allows for more efficient power markets and a better understanding of the demand profile while taking the power dynamics into account. In electrical engineering jargon, the term load is commonly used to refer to electricity demand [5]. Throughout this paper, the terms load and electricity demand are used interchangeably.

1.2. Challenges

The hourly variation of electricity demand shows that demand-side nonlinear characteristics add to the challenge of balancing the grid. In addition, integration of solar and wind power along with other energy mix scenarios increases the uncertainty of generation. Load profiles are influenced by seasonal and cyclic atmospheric patterns as well as by human behavior and other external factors [6,7]. While researchers have included these variables when developing forecasting models, the predicted value may not exactly match the true value.
Depending on the impact of such influencing parameters, researchers are continuously seeking to improve on existing forecasting models in order to minimize the incidence of under- and over-forecasting [8]. In this context, the major challenge for researchers is to minimize the prediction error by developing sophisticated models and testing them on various datasets, such as industrial loads, residential loads, aggregated loads, etc. [9,10]. This challenge has motivated researchers to develop robust forecasting models that can improve accuracy when predicting electricity demand and reduce the financial costs for utility companies. The three main areas that need to be tackled are forecasting accuracy, sensitivity to parameters, and complexity of training [1].

1.3. Model Categories

In STLF, both univariate and multivariate models have been discussed in the literature [7,11,12]. Univariate models use only historical demand data to make future predictions, while multivariate models consider other variables such as atmospheric variations and calendars along with historical demand data. Taylor et al. [11,12] developed both univariate and multivariate models, and stated that the univariate models had good prediction capability. In univariate time series models, the historical electricity demand data are arranged with correlated past lags to capture the demand patterns even when the data are limited [13]. McCulloch et al. [6] obtained improved accuracy by including the temperature as a variable, recognizing that weather conditions play a crucial role in forecasting performance.
While factors such as humidity, wind, rainfall, and cloud cover have been identified as influential in meteorological analysis [8], temperature is widely recognized as the most crucial weather variable [6]. In fact, the temperature variable alone is capable of explaining more than 70% of the load variance in the GEFCom2012 dataset [14,15]. Therefore, other weather variables can be excluded when the temperature variable is included in the model [6]. There are several reasons for this decision: (i) other weather variables have shown lower impact on electricity demand when the temperature is already included; (ii) the cost of collecting weather data is considerable due to the need to install weather stations; and (iii) collinearity problems can arise when multiple weather-related variables are employed simultaneously [8].

1.4. Model Approaches

STLF models have been developed through two major approaches, namely, statistical approaches [8,9,10,16,17,18,19] and Artificial Intelligence or data-driven approaches [20,21,22,23,24,25].

1.4.1. Statistical Approaches

In statistical approaches, time series analysis techniques, including auto-regression and exponential smoothing, have long been considered as STLF baseline models. This approach is based on well-established methods and has good interpretability. However, selection of appropriate and sufficient lagged inputs requires expertise. This approach may not be able to capture complex patterns present in the data. Simple averaging models, such as ARIMA and triple exponential smoothing, can be effective for long forecasting horizons, as discussed by Taylor et al. [11].

1.4.2. Artificial Intelligence or Data-Driven Approaches

To address the limitations of statistical approaches, data-driven methods have emerged as viable alternatives. Machine Learning (ML) techniques such as support vector regression, decision trees, ensemble learning, and Artificial Neural Networks (ANNs) are commonly used as baseline methods in data-driven approaches. These approaches have the capacity to fully capture complex nonlinear patterns [26]. Hippert et al. highlighted the ability of ANNs, fuzzy systems, SVM, and ensemble learning methods to handle nonlinear data. ANNs, particularly Feedforward Neural Networks (FNNs), offer many advantages in comparison to traditional time series models such as ARIMA when handling nonlinear and non-normally distributed data. However, one of the main drawbacks of ANNs is their assumption of independence between inputs and outputs when dealing with sequential data of electric energy consumption [1]. FNNs are one of the most popular structures among ANNs, and are used to model complex input–output relationships through trial-and-error based searching for the best parameter values. Despite their efficacy, FNNs are prone to overfitting, that is, their learning process may not guarantee that the global optimum is reached due to the possibility of the network becoming trapped in a local optimum. Backpropagation learning algorithms are a commonly used method for FNN learning in numerous applications.
Deep learning, on the other hand, is a specialized subset of machine learning that uses multiple layers to understand complex pattern data. However, the problem of vanishing or exploding gradients due to long-term dependencies is present when using RNNs. To address this issue, concepts such as gated or Long Short-Term Memory (LSTM) cells and Gated Recurrent Units (GRUs) have been introduced. LSTM cells are well suited for tasks that involve capturing long-term dependencies and sequence-dependent behavior, making them suitable for applications in electricity load demand forecasting [20]. Similarly, GRUs were designed to use a single path, removing the output gating and outputting the cell state directly. The cell state can then be used for both state updates and gating function computations in subsequent steps. This means that the reset gate in GRUs can be considered as a shifted output gate of LSTMs, shifting from the output of the current step (in LSTMs) to the input of the next step (in GRUs) [20].
This paper is an extended version of our previously published research work [22], which focused on the development of a regression model to interpret the impact of temperature variation on electricity demand in Thailand. In the present paper, we use the same dataset while reducing the previous four scenarios into two scenarios, as recommended in [22]. As a continuation of our previous work, Chatum et al. [27] tested the same dataset by developing machine learning models through an ensemble learning approach. Therefore, one reason for this extension is to fill the gap with respect to DNN implementation. In this work, we mainly focus on finding the best DNN model on the basis of forecasting accuracy. The major contributions of this paper can be summarized as follows:
  • A comparative study of deep networks, namely FNN and RNN-based LSTM and GRU, is presented on the basis of testing and validation accuracy.
  • Implementation of hyperparameter tuning (number of neurons, layers, dropout, epoch, lookback period, etc.) and a cross-validation strategy to select the best model.
  • Our results include the finding that increasing the number of hidden layers does not ensure improved forecasting accuracy.
  • The dataset we used has been tested in various contexts, such as Bayesian [28], regression [22,28,29], machine learning [30,31,32,33], and ensemble learning [27]; in this paper, we apply it in the context of deep learning networks.
The rest of this paper is organized as follows: related works are discussed in Section 2; our modeling strategy, the theory behind DNNs, and the estimation procedure of the models are presented in Section 3; Section 4 describes the characteristics of electricity demand data and discusses the variables; Section 5 demonstrates the model formulation, extensive experimental setup, and analysis of forecasting accuracy and quality of the model fit based on the Thai dataset; finally, results and comprehensive discussions are presented in Section 6, and Section 7 concludes the paper.

2. Related Works

The universal approximation theorem states that an ANN is capable of accurately approximating any nonlinear function. ANN models have been employed for electricity demand forecasting since the 1990s, and have consistently shown promising results. Computational advancements and new state-of-the-art algorithms in recent years have led to the development of DNNs as the leading method of electricity demand forecasting, which is made possible by increasing the feature abstraction capability of the model. The ability of RNN-based LSTM and GRU networks to handle sequential data and long-term dependencies while extracting the complex patterns in the data has led to their widespread popularity among researchers [20,21].
Hippert et al. [26] mentioned several important criticisms of ANN techniques; despite these limitations, however, ANN models continue to be an important tool in forecasting electricity demand. Deep neural networks possess the capability to acquire nonlinear combinations of features in their deeper layers [34]. These deep learning methodologies, which involve augmenting standard machine learning neural networks with multiple hidden layers, hold great promise as the most effective approach within the field of machine learning. The fundamental structure of FNNs and RNNs remains the same except for feedback between nodes.
FNNs are among the most popular models. Harun et al. [35] implemented an FNN in a comparative study between different data preprocessing schemes, obtaining the best result with a 72 h lag load. This study demonstrated the significance of FNNs in electricity demand forecasting and highlighted the importance of choosing appropriate inputs and preprocessing techniques to improve the model accuracy. A study by Tee et al. [36] proposed a multilinear FNN model with 51 inputs, including load lag, hours, day type dummy entries, and temperature. Their model achieved a Mean Absolute Percentage Error (MAPE) of 0.439% with a maximum MAPE of 7.986%, which was observed during the month of December. Another study by Raza et al. [2] presented a model utilizing an FNN trained using a gradient descent algorithm. The inputs for their model included variables such as the day of the week, working day indicator, hour of the day, dew point, dry bulb temperature, and loads for the current day, the previous day, and the previous week. The forecasting accuracy reported by the authors ranged from 3.81% in the spring to 4.59% during the summer.
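For reference, the MAPE values quoted in this section follow the usual definition: the mean of the absolute forecast error divided by the observed value, expressed as a percentage. The following is a minimal sketch of that definition; the function name `mape` and the demand values are hypothetical and chosen only for illustration.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Each term is the absolute error relative to the observed value.
    ratios = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical half-hourly demand (MW) and forecasts, for illustration only.
actual = [8000.0, 8200.0, 8500.0, 8300.0]
forecast = [7900.0, 8282.0, 8500.0, 8217.0]
print(mape(actual, forecast))
```

Because each error is scaled by the observed demand, MAPE is comparable across datasets with different load levels, which is why it is the common yardstick in the studies cited here.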
Li et al. [37] evaluated the performance of LSTM and FNN models in electricity demand forecasting by comparing their prediction accuracy and robustness. They found that the LSTM model outperformed the FNN model in terms of both accuracy and robustness, demonstrating the superiority of the LSTM model in capturing complex long-term dependencies in electricity demand data. In order to enhance the performance of their LSTM model, the authors proposed the use of multiple parallel LSTMs; this model was able to capture the multi-scale dependencies in the electricity demand data, resulting in even better prediction accuracy.
In addition to LSTM and FNN, hybrid DNN models have been applied in the area of electricity demand forecasting. For example, to enhance the accuracy of short-term electricity demand forecasting, ref. [4] tested five different recurrent neural network architectures (RNNs). Their study showed that the GRU and bidirectional LSTM models outperformed the traditional FNN and RNN models in terms of accuracy, demonstrating the potential of combining multiple machine learning techniques for improved forecasting performance. The same paper suggested the implementation of hyperparameter testing. However, the accuracy depends on the data variation pattern. For example, Selvi et al. [16] achieved a 2.90% MAPE value with an ANN model when using the DSO dataset (Delhi, India) for testing with a 1 h prediction horizon, while Torabi et al. [38] achieved an MAPE of only 1.96% with the same ANN model (see Table 1). This variation in the MAPE results is due to the high dependence on the geographical region from where the demand comes. For example, electricity from industrial regions is much more stable than that from residential, agricultural, or urban areas. In residential areas, demand is highly fluctuating in nature due to the behavior of local residents. Addressing such uncertainty represents a huge challenge for researchers.
In the context of Thailand, several studies have been conducted to predict electricity demand using various methods and techniques. Several authors, including our own research team, have published interesting results using the same EGAT dataset, recorded for the Bangkok metropolitan region from 2009 to 2013 (Table 2). Dilhani et al. [33] used an ANN method to forecast electricity demand based on historical electricity demand and temperature data. However, their results were tested only for one month, while the results in this study are tested for one year. Parkpoom et al. [42] conducted a small study on the effect of temperature on electricity demand using a simple regression model, although its prediction accuracy was poor due to its being a traditional model with few variables. A more advanced model with additional variables was developed and tested in [22]. Similarly, Phyo et al. [30] and Su et al. [31] implemented DNN-based methods to forecast electricity demand for the same dataset. To improve the model’s accuracy, they performed data cleaning and grouped the data into similar days. As measured by the Mean Absolute Percentage Error (MAPE), their results were high compared to similar previous studies.
Weather conditions have a significant impact on short-term electricity demand forecasting, and are commonly incorporated into forecasting models [45]. For short lead times (i.e., of up to six hours), univariate methods that do not include weather variables can achieve competitive performance [11]. Due to the difficulties and high costs involved in accessing weather data, univariate models are often used [8,46].
Regions with similar weather conditions to Thailand can provide useful insights into electricity demand forecasting for Thailand. For example, Ismail [19] investigated the impact of weather variables, holidays, and other factors on daily and monthly demand for Malaysia. Their monthly predictions achieved an MAPE of 1.71%. In another study, the effect of air conditioning in the US was investigated, with a 20% increase in cooling degree days found to increase residential electricity consumption by 1% to 9% during the summer season and 5.4% during peak hours [47].

3. Rationale of Deep Learning Implementation

Deep learning architectures are particularly well-suited for tackling the unique challenges presented by electric load forecasting, including nonlinearity, periodicity, seasonality, and sequential dependencies within consumption data sequences. In contrast to shallow ANN architectures, deep learning models have the ability to automatically learn complex temporal patterns by employing nonlinear transformations and extracting high-level abstractions.

3.1. Feedforward Neural Networks (FNNs)

The basic architecture of an FNN is depicted in Figure 1. In this architecture, the input vector $x_i(t)$ is associated with a weight $w_i(t)$ and bias $b_i(t)$. The activation function $f(\cdot)$ is applied to each neuron, and the network learns by adjusting the weights. The weight updates are accomplished through the backpropagation error $E$, allowing the network to iteratively refine its predictions.
Normally, an FNN is constructed by applying a Back-Propagation (BP) learning algorithm; the BP learns the neurons’ connecting weights, which are adjusted over the input–output dataset, as shown by Equation (1). This helps the FNN model to learn the behavior of the data very quickly.
Suppose that $X_i^{[l]}$ is the input vector, where $l \in \{1, \ldots, L\}$ indexes the layers and input $i$ represents the set of half-hourly demand data, including the calendar and weather variables passing into the layers. The output $Y_i^{[L]} = X_i^{[L]}$ is equivalent to $h^{[L]}\left(W^{[L]} X^{[L-1]} + b^{[L]}\right)$, where $h^{[l]}(\cdot)$ are the activation functions, $b^{[l]} = (b_1^{[l]}, \ldots, b_j^{[l]})^{T}$ are the biases, and $W^{[l]}$ are the weights [48]. These parameters are optimized using the gradient descent algorithm, as follows:
$$p_{[i+1]} = p_{[i]} - \lambda \nabla_p E(p) \tag{1}$$
where $p$ represents the parameters, $\lambda$ represents the learning rate, and $E(p)$ represents the mean square error (MSE), sometimes called the loss function. In Equation (2), $x_i$ is the predicted value, $t_i$ is the target value, and $n$ is the amount of half-hourly demand data. The search for the minimum loss function is commonly performed by computing its gradient $\nabla_p E(p)$, which indicates how the function changes with small changes in the parameters.
$$E(p) = \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( f(x_i, p) - t_i \right)^2 \tag{2}$$
From Equations (1) and (2), the final equations for updating the weights and biases are
$$\begin{aligned} \text{for weight update:} \quad W_{[i+1]}^{[L]} &= W_{[i]}^{[L]} - \lambda \frac{\partial E(p)}{\partial W^{[L]}} \\ \text{for bias update:} \quad b_{[i+1]}^{[L]} &= b_{[i]}^{[L]} - \lambda \frac{\partial E(p)}{\partial b^{[L]}} \end{aligned} \tag{3}$$
The backpropagated errors $\frac{\partial E(p)}{\partial W^{[L]}}$ and $\frac{\partial E(p)}{\partial b^{[L]}}$ for Equation (3) can be obtained using the chain rule:
$$\begin{aligned} \frac{\partial E}{\partial W^{[L]}} &= \frac{\partial E}{\partial X^{[L]}} \odot \frac{\partial X^{[L]}}{\partial A^{[L]}} \cdot \left( \frac{\partial A^{[L]}}{\partial W^{[L]}} \right)^{T} \\ \frac{\partial E}{\partial b^{[L]}} &= \frac{\partial E}{\partial X^{[L]}} \odot \frac{\partial X^{[L]}}{\partial A^{[L]}} \cdot \left( \frac{\partial A^{[L]}}{\partial b^{[L]}} \right)^{T} \end{aligned} \tag{4}$$
where the $(\cdot)$ symbol stands for matrix multiplication and the $(\odot)$ symbol represents the Hadamard or element-wise product. The backpropagated error represented by $\delta^{[L]} = \frac{\partial E}{\partial X^{[L]}} \odot \frac{\partial X^{[L]}}{\partial A^{[L]}}$ is passed to the $L-1$ layer; the updated backpropagated error in layer $L-1$, that is, $\delta^{[L-1]} = \left( \frac{\partial A^{[L]}}{\partial X^{[L-1]}} \right)^{T} \cdot \delta^{[L]} \odot \frac{\partial X^{[L-1]}}{\partial A^{[L-1]}}$, is passed to the $L-2$ layer, and so on. Finally, the general form of the updated error is represented as shown below.
$$\begin{aligned} \frac{\partial E}{\partial W^{[l]}} &= \delta^{[l]} \cdot \left( \frac{\partial A^{[l]}}{\partial W^{[l]}} \right)^{T} \\ \frac{\partial E}{\partial b^{[l]}} &= \delta^{[l]} \cdot \left( \frac{\partial A^{[l]}}{\partial b^{[l]}} \right)^{T} \end{aligned} \tag{5}$$
Now, Equation (3) can be continuously updated using Equation (5); this is known as the backpropagation approach, and has been widely implemented for energy consumption prediction due to its ability to learn both highly nonlinear relationships and shared uncertainties [34].
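The update loop described by Equations (1) to (5) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the configuration used in our experiments: the toy data, the single hidden layer of eight tanh units, the linear output layer, and the learning rate are all hypothetical choices, and the pre-activation is taken to be `A = W @ X + b`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (hypothetical): 200 samples, 3 input features, 1 target.
X0 = rng.normal(size=(3, 200))
t = np.sin(X0[0:1]) + 0.5 * X0[1:2]

# One hidden layer with tanh activation and a linear output layer.
W1, b1 = rng.normal(scale=0.5, size=(8, 3)), np.zeros((8, 1))
W2, b2 = rng.normal(scale=0.5, size=(1, 8)), np.zeros((1, 1))
lam = 0.05  # learning rate (lambda in Equation (1))


def forward(inputs):
    A1 = W1 @ inputs + b1   # pre-activation of the hidden layer
    X1 = np.tanh(A1)        # hidden-layer output
    X2 = W2 @ X1 + b2       # linear output layer, so X2 equals A2
    return X1, X2


losses = []
n = X0.shape[1]
for _ in range(500):
    X1, X2 = forward(X0)
    losses.append(float(np.mean((X2 - t) ** 2)))  # MSE, Equation (2)
    # Output-layer error: dE/dX2 times dX2/dA2, where dX2/dA2 = 1 here.
    delta2 = 2.0 * (X2 - t) / n
    # Hidden-layer error: W2^T . delta2, element-wise times tanh'(A1).
    delta1 = (W2.T @ delta2) * (1.0 - X1 ** 2)
    # Gradient-descent updates of weights and biases, Equation (3).
    W2 -= lam * delta2 @ X1.T
    b2 -= lam * delta2.sum(axis=1, keepdims=True)
    W1 -= lam * delta1 @ X0.T
    b1 -= lam * delta1.sum(axis=1, keepdims=True)

print(losses[0], losses[-1])
```

Note that `delta1` is computed before `W2` is updated, so every gradient in one iteration refers to the same parameter values, as the chain rule in Equation (4) requires.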

3.2. RNN with Long Short-Term Memory (LSTM)

Long-term training of RNNs using backpropagation algorithms often encounters difficulties due to the vanishing gradient problem. Unlike traditional RNN cells, the LSTM architecture includes a special shared parameter vector $h_t$ that is deployed to retain memorized information using a backpropagation through time algorithm. Thus, it can address the vanishing gradient problem by incorporating internal self-loops, enabling the network to maintain memory over a longer term and effectively store information (Figure 2).
Let $X_t = [x_t^1, x_t^2, \ldots, x_t^N]$ be the input data, $h_t = [h_t^1, h_t^2, \ldots, h_t^K]$ be the hidden units, and $C_t = [c_t^1, c_t^2, \ldots, c_t^K]$ be the memory state of the LSTM network at time $t$ (Figure 2). For each timestamp $t$, a forget gate $f_t$, input gate $i_t$, and output gate $O_t$ are involved for the following operations [49]:
1. The forget gate $f_t = \sigma(W_f X_t + U_f h_{t-1} + b_f)$ is controlled based on the input $X_t$ and the previous hidden state $h_{t-1}$ that decides which of the previous information is to be discarded.
2. The input gate $i_t = \sigma(W_i X_t + U_i h_{t-1} + b_i)$ is the degree to which the new content added to the memory cell is modulated, i.e., selectively read into the information that is controlled based on the input. The weight of the input gate is independent from that of the forget gate.
3. The output $O_t = \sigma(W_o X_t + U_o h_{t-1} + b_o)$ modulates the amount of memory content.
Finally, the information of the current LSTM cell $h_t$ is calculated as $h_t = O_t \times \tanh(C_t)$ to pass to the next LSTM cell, where $C_t = f_t \times C_{t-1} + i_t \times \hat{C}_t$ and $\hat{C}_t = \tanh(W_c X_t + U_c h_{t-1} + b_c)$, by updating the previous information $h_{t-1}$, where $W_f$, $W_i$, $W_c$, $W_o$ are the weight matrices corresponding to the input $X_t$, $U_f$, $U_i$, $U_c$, $U_o$ are the recurrent weight matrices associated with the previous hidden state $h_{t-1}$, and $b_f$, $b_i$, $b_c$, and $b_o$ are the bias vectors for the forget gate, input gate, candidate solution, and output gate, respectively.
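The gate operations above translate directly into a single-step function. The NumPy sketch below is illustrative only: the helper names `lstm_step` and `init_params`, the dictionary layout of the parameters, the small random initialisation, and the sizes $N = 3$, $K = 4$ are assumptions made for the example, not details of the models trained in this study.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following the gate equations in the text."""
    W, U, b = params["W"], params["U"], params["b"]
    f_t = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])    # forget gate
    i_t = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])    # input gate
    o_t = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])    # output gate
    c_hat = np.tanh(W["c"] @ x_t + U["c"] @ h_prev + b["c"])  # candidate memory
    c_t = f_t * c_prev + i_t * c_hat                          # new memory state
    h_t = o_t * np.tanh(c_t)                                  # new hidden state
    return h_t, c_t


def init_params(n_in, n_hidden, seed=0):
    """Small random weights; this initialisation scheme is an assumption."""
    rng = np.random.default_rng(seed)
    gates = "fioc"
    return {
        "W": {g: rng.normal(scale=0.1, size=(n_hidden, n_in)) for g in gates},
        "U": {g: rng.normal(scale=0.1, size=(n_hidden, n_hidden)) for g in gates},
        "b": {g: np.zeros(n_hidden) for g in gates},
    }


N, K, T = 3, 4, 5  # input size, hidden size, sequence length (hypothetical)
params = init_params(N, K)
h, c = np.zeros(K), np.zeros(K)
sequence = np.random.default_rng(1).normal(size=(T, N))
for x_t in sequence:
    h, c = lstm_step(x_t, h, c, params)
print(h.shape, c.shape)
```

The memory state `c` is updated additively through the forget and input gates, which is the self-loop that lets information persist over many time steps.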
The RNN with LSTM architecture excels in capturing temporal features, making it well-suited for time series forecasting tasks. However, tuning the hyperparameters of LSTM networks can be a challenging task, involving considerations such as the number of hidden layers, nodes per layer, batch size, number of epochs, learning rate, and optimization of connection weights and biases [49]. Interestingly, during our experiments with the Thai dataset we found that training LSTM models was not overly complex. As a result, we have proposed an RNN cell that effectively manages the computation of input information and memory, leading to favorable convergence during the training process and producing excellent results.

3.3. RNN with Gated Recurrent Unit (GRU)

Similar to an LSTM cell, a GRU uses gating units that modulate the flow of information inside the unit. It has only two gate structures, the reset gate and the update gate. The forget and input gates from the LSTM are combined into a single update gate in the GRU cell that decides which information to forget and which to add.
At time $t$, let $X_t = [x_t^1, x_t^2, \ldots, x_t^N]$ be the input data and let $H_t = [H_t^1, H_t^2, \ldots, H_t^K]$ be the hidden units of the memory state of the GRU network (Figure 3). When the input vector $X_t$ is provided to the cell, it is divided into three branches: (i) one towards the reset gate, (ii) another towards the update gate, and (iii) a final one towards the outputs. The reset gate $r_t = \sigma(W_r X_t + U_r H_{t-1})$ is similar to the forget gate in LSTM. When $r_t$ is close to 0, it allows the previously computed state to be forgotten. The update gate $z_t = \sigma(W_z X_t + U_z H_{t-1})$ decides how much the unit updates its activation $H_t = (1 - z_t) H_{t-1} + z_t \hat{H}_t$, which is the linear interpolation between previous activation $H_{t-1}$ and candidate activation $\hat{H}_t$. The procedure of taking a linear sum between the existing state and the newly computed state is similar to the LSTM unit [50].
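A single GRU step can be sketched in the same way. The reset gate, update gate, and interpolation follow the expressions above; the candidate activation is not written out in the text, so the sketch assumes the commonly used form $\hat{H}_t = \tanh(W_h X_t + U_h (r_t \odot H_{t-1}))$. The function name `gru_step`, the parameter dictionaries, and the sizes are likewise hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gru_step(x_t, h_prev, W, U):
    """One GRU time step (no bias terms, matching the gate equations in the text)."""
    r_t = sigmoid(W["r"] @ x_t + U["r"] @ h_prev)  # reset gate
    z_t = sigmoid(W["z"] @ x_t + U["z"] @ h_prev)  # update gate
    # Candidate activation: assumed standard form, not stated in the text.
    h_hat = np.tanh(W["h"] @ x_t + U["h"] @ (r_t * h_prev))
    # Linear interpolation between the previous and candidate activations.
    return (1.0 - z_t) * h_prev + z_t * h_hat


N, K, T = 3, 4, 6  # input size, hidden size, sequence length (hypothetical)
rng = np.random.default_rng(0)
W = {g: rng.normal(scale=0.1, size=(K, N)) for g in "rzh"}
U = {g: rng.normal(scale=0.1, size=(K, K)) for g in "rzh"}
H = np.zeros(K)
for x_t in rng.normal(size=(T, N)):
    H = gru_step(x_t, H, W, U)
print(H)
```

Compared with the LSTM step, there is no separate memory state: the hidden state itself carries the memory, which is why the GRU needs fewer parameters per unit.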

4. Electricity Demand Profile of the Study Area

The scope of our study encompasses the metropolitan region of Thailand, which includes Bangkok and the surrounding provinces of Pathum Thani, Nonthaburi, Nakhon Pathom, Samut Sakhon, and Samut Prakan. This metropolitan region accounts for approximately 70% of the total electricity consumption in Thailand [42]. Numerous factories, industrial parks, offices, and universities are situated within these provinces, contributing to the overall electricity demand.
In this study, we utilized half-hourly demand data obtained from EGAT spanning from 1 March 2009 to 31 December 2013. Out of the total 84,618 observation samples, only eight half-hour samples were found to be missing on 10 March 2012. These missing values were addressed through a straightforward interpolation method to ensure the completeness of the dataset. The complete half-hour non-holiday and holiday demand profile over a full year for 2012 is presented in Figure 4a. The dataset exhibits various patterns, including trends, seasonal fluctuations, weekly and daily patterns, and holiday effects. These patterns are inherent in electricity demand data, and are commonly observed in many tropical countries [9,46].
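As an illustration of how a short gap such as the eight missing half-hour samples can be filled, the sketch below applies linear interpolation between the nearest valid neighbours. The choice of linear interpolation is an assumption (the text only states that a straightforward interpolation method was used), and the function name `fill_missing_linear` and the demand values are hypothetical.

```python
import numpy as np


def fill_missing_linear(values):
    """Linearly interpolate NaN entries from their nearest valid neighbours."""
    values = np.asarray(values, dtype=float)
    idx = np.arange(values.size)
    missing = np.isnan(values)
    filled = values.copy()
    # np.interp evaluates the piecewise-linear curve through the valid samples.
    filled[missing] = np.interp(idx[missing], idx[~missing], values[~missing])
    return filled


# Hypothetical half-hourly demand (MW) with a gap of eight samples.
demand = np.array([7000.0, 7090.0] + [np.nan] * 8 + [7900.0, 7950.0])
filled = fill_missing_linear(demand)
print(filled)
```

For a gap of only four hours inside a smooth daily profile, a simple scheme of this kind changes the dataset very little, which is presumably why a more elaborate imputation was not needed.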

4.1. Seasonal and Holiday Pattern

The hourly aggregate demand profile is plotted in Figure 4b. To observe the stable variation of data over time, a rolling window of 365 samples was used to compute a moving average. This plot indicates that the overall demand grows with a linear trend and is influenced by seasonality. The pattern of peak demand and off-peak demand describes the significant effect of peak working hours, holidays, or special events.
The massive flood that affected Bangkok and northern and central Thailand from October 2011 to December 2011 caused a long period of significantly reduced demand. During that time, many factories, universities, and offices in Bangkok and surrounding provinces were closed. Figure 5a shows that, compared with the previous year’s demand, the peak demand was reduced by approximately 2000 MW, a drop that can be considered similar to that of the COVID-19 pandemic lockdown [22,51].
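The effect of a 365-sample moving average can be demonstrated on synthetic data. In the sketch below, the series is a hypothetical linear trend plus a seasonal component with a period of 365 samples; it is not the EGAT data, and the function name `moving_average` is an assumption. Averaging over exactly one seasonal period cancels the seasonal term and leaves the trend.

```python
import numpy as np


def moving_average(series, window):
    """Simple moving average over complete windows only."""
    series = np.asarray(series, dtype=float)
    if not 1 <= window <= series.size:
        raise ValueError("window must be between 1 and the series length")
    kernel = np.ones(window) / window
    # mode="valid" keeps only positions where the window fits entirely.
    return np.convolve(series, kernel, mode="valid")


# Synthetic series: linear trend plus a seasonal cycle of 365 samples.
t = np.arange(3 * 365)
series = 0.5 * t + 100.0 * np.sin(2.0 * np.pi * t / 365.0)
smoothed = moving_average(series, window=365)
print(smoothed[:3])
```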
Each year, there is a noticeable decrease in electricity demand during holidays and special events. In particular, extended holidays such as the first week of January (New Year holiday) and second week of April (Songkran holiday) have a significant impact on demand, as is illustrated in Figure 6a,b. During these periods, factories, universities, and other offices remain closed for approximately a week, while many Thai people return to their homes outside the metropolitan region. Additionally, holidays such as Mother’s Day (August 12) and Father’s Day (December 5) contribute to fluctuations in electricity demand, although the effect is not as substantial as that of Songkran and New Year. These variations in electricity demand due to holidays are commonly referred to as holiday effects; they play a crucial role in the modeling process and pose a significant challenge for researchers aiming to achieve high forecasting accuracy.

4.2. Monthly, Weekly, and Daily Patterns

In our study area, residential demand is outweighed by factory, office, and industrial loads (Figure 5b). In contrast, during weekends or holidays factories and all governmental or private offices are closed; with industrial demand reduced, the residential load dominates (Figure 7a). The residential demand depends on people’s behavior, which is volatile and random in nature. This is the main reason behind the lower forecasting accuracy on weekends and holidays. On weekdays, it is normal to see similar daytime and evening electricity demand patterns (Figure 7b). Moreover, the demand during the midnight and early morning hours is almost the same for weekdays and weekends.
The box plot in Figure 8a depicts the level of electricity consumption demand for individual years. Each box represents the variation in demand, with the line in the box showing whether it lies on first quartile ( Q 1 , 0–25% of demand data), third quartile ( Q 3 , 50–75% of demand data), or median range (50% of demand data). In 2011 and 2012, a few outliers indicate that the possibility of very low demand may exist. Figure 8b represents the level of electricity consumption demand for individual months, where December shows very high variation, with relatively lower demand than the rest of the months.

4.3. Temperature

The geography of a country plays a significant role in electricity demand. For example, European countries experience strong heating effects because of their cold climate, while warm countries, such as Thailand, experience strong cooling effects [8]. As a tropical country, Thailand maintains a warm climate even during the winter season and becomes noticeably hot in the summer. Temperatures during the winter season are slightly lower, leading to a slight decrease in demand. Average temperatures range from approximately 25 °C in December to 30 °C in May.
Figure 9a illustrates the impact of temperature on electricity demand during both working days and holidays. It can be seen that there is significant variation in demand when the temperature drops below 30 °C or rises above 35 °C on holidays. In contrast, the demand follows a sharp and linear pattern during working days. Furthermore, as we employ separate models for individual hours, Figure 9b presents the characteristics of electricity demand at two different hours: 2 pm and 11 pm. This figure compares the relationship between temperature and peak electricity demand during these two time periods. Notably, Figure 9b highlights that the demand at 11 pm exhibits a more linear relationship with temperature compared to the demand at 2 pm.

5. Methods

This section describes the proposed methods, as illustrated in Figure 10. The proposed methods can be categorized into three frameworks: data preprocessing, model design and estimation, and comparative study. Because this work is an extended version of [22,27], we have excluded details of data characteristics, the variable identification procedure, and the preprocessing stage to avoid redundant presentation. However, carefully grouped datasets, hereinafter named the scenarios, are considered for further discussion. Each model is trained and tested on each scenario, with the data divided as follows:
  • Scenario 1: demand for working days only; training and validation length 911 days; testing length 239 days.
  • Scenario 2: demand for the full dataset; training length 1365 days; testing length 365 days.
These two scenarios, Scenario 1 and Scenario 2, were found to be superior to other scenarios in [22]. In this study, further analysis of these two scenarios is conducted using DNN techniques to fill an existing research gap.
Figure 10. Proposed forecasting methodology.
The input data vectors for the artificial intelligence models shown in Figure 10 are expanded in Figure 11. Both FNN and DNN models receive past observations of the demand $x_{t-\tau}, \ldots, x_t$, where $\tau$ is the length of the lookback period. During the hyperparameter testing process, the length of the sliding window was varied with $\tau > n$. At any timestep $t$, the possible target was $y_{t+k}$. The weather variables and calendar variables followed the sliding window procedure and used the same length for the lookback period.

5.1. Feature Selection

In Thailand, the EGAT system makes forecasts for the next day at 2 pm, which is 10 h to 34 h ahead. On Fridays, EGAT needs to forecast until Monday because the EGAT office is closed on weekends. For short-term forecasting, Thailand typically uses data up to 106 h ahead, especially during long holidays. This study, however, is limited to using data up to 2 pm to forecast only for the next day.
To simplify the model, the lagged terms and prediction horizon specific to Thailand's practice are illustrated in Figure 12. Many authors, such as [12,52], have proposed and implemented a feature selection strategy in their models. To forecast the demand on a particular day $d$, forecasting starts at 2 pm of the previous day, which is 10 h ahead. The historical demand data from $HH = 0$ to $HH = 28$ of day $d-1$ (the previous day) and from $HH = 29$ to $HH = 47$ of day $d-2$ are taken and represented by the variable load1d_cut2pm. Similarly, the demand lagged by two days is included in our model and represented by the variable load2d_cut2pm, and so on.
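The construction of load1d_cut2pm can be sketched as follows. This is only an illustration under stated assumptions: demand is stored as a list of days, each holding 48 half-hourly values indexed $HH = 0$ to $47$, and the helper name `lagged_cut2pm` and its `lag` argument are hypothetical rather than taken from the study. Treating `lag=2` as the analogue for load2d_cut2pm is likewise an assumption.

```python
HALF_HOURS_PER_DAY = 48
CUTOFF_HH = 28  # 2 pm is half-hour index 28 when HH = 0 is midnight


def lagged_cut2pm(demand_by_day, d, lag=1):
    """Return 48 lagged demand values for target day d (hypothetical helper).

    HH = 0..28 are taken from day d - lag, which is already observed at the
    2 pm cutoff; HH = 29..47 are taken from day d - lag - 1, the most recent
    day for which those half-hours are available.
    """
    if d - lag - 1 < 0 or d - lag >= len(demand_by_day):
        raise IndexError("not enough history for the requested lag")
    recent = demand_by_day[d - lag]
    older = demand_by_day[d - lag - 1]
    if len(recent) != HALF_HOURS_PER_DAY or len(older) != HALF_HOURS_PER_DAY:
        raise ValueError("each day must hold 48 half-hourly values")
    return recent[:CUTOFF_HH + 1] + older[CUTOFF_HH + 1:]
```

With `lag=1` the result mixes the morning of day $d-1$ with the afternoon and evening of day $d-2$, so every feature is known when the forecast is issued at 2 pm.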
Our forecasting models incorporated external variables in the form of a weather variable (specifically, temperature) and its interactions with days and months. Previous studies have suggested the inclusion of dummy variables and their interactions for improved forecasting accuracy. For instance, Ramanathan et al. [17] proposed the use of dummy variables and their interactions, while Cottet [53] applied different day-type dummy variables for each day of the week, considering Sunday as a public holiday. The variables were grouped into four categories: deterministic, temperature, lagged, and interactions, as presented in Table 3.
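A minimal sketch of day-type dummies and their interaction with temperature is given below. The day labels, function names, and the choice to keep all seven dummies are illustrative assumptions; the actual variable set used in the study is the one listed in Table 3.

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def day_dummies(day):
    """One-hot encode a day name, e.g. 'Tue' -> [0, 1, 0, 0, 0, 0, 0]."""
    if day not in DAYS:
        raise ValueError(f"unknown day: {day}")
    return [1 if day == name else 0 for name in DAYS]


def with_temperature_interactions(day, temperature):
    """Return the day dummies followed by dummy-by-temperature interactions."""
    dummies = day_dummies(day)
    interactions = [flag * temperature for flag in dummies]
    return dummies + interactions
```

Each interaction term is non-zero only on its own day, which lets a model learn a different temperature sensitivity for each day type.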

5.2. Experimental Setup

The sequence of electricity demand observations, denoted as $x(t) = \{x_1, x_2, \ldots\}$, captures the demand values at different time steps $t$. The objective is to predict the time series $y(t)$, which represents the predicted electricity demand for a specific time step.
In our supervised learning approach, Deep Learning (DL) models were trained and tested to predict future time steps. A predictor function $h$ was used to estimate the energy consumption value for the next step, $y(t+1)$. The hyperparameters, such as the number of epochs, number of layers, nodes per layer, dropout rate, and recurrent dropout, were optimized using a validation set. Once these had been optimized, the model was retrained on the entire training set and tested on the test data.
Assuming a lookback duration of $\tau$, each $x(t)$ consists of a vector of variables including demand, temperature, and other factors. The sample is defined as $sample(t, \tau) = [x(t-\tau+1), x(t-\tau+2), \ldots, x(t)]$. In our approach, each $sample(t, \tau)$ is treated as a single batch with a batch_size. The target variable represents the demand at a future day, denoted as $target(t) = x_D(t+d)$, where $x_D(t)$ represents the scalar demand at time $t$. The target demand $target(t)$ is influenced by previous variables in $sample(t, \tau)$. The lookback duration $\tau$ is a parameter that requires optimization. It is important to note that the target demand can be a vector, such as $target(t) = [x_D(t+d_1), x_D(t+d_1+1), \ldots, x_D(t+d_2)]$, where $[t+d_1, t+d_2]$ indicates the time interval for prediction based on $sample(t)$. Thus, we are predicting $d_2 - d_1 + 1$ values of the target variable $D$ into the future based on $\tau$ samples from the lookback period. Determining a suitable value for $\tau$ is crucial.
Assuming a batch prediction size of $T = d_2 - d_1 + 1$, where the first prediction starts $d_1$ samples ahead, the last prediction corresponds to $d_2 = d_1 + T - 1$, indicating a $T - 1$ sample offset. The goal is to estimate the function $f$ that maps $sample(t)$ to $target(t)$ given $[x(t-\tau+1), x(t-\tau+2), \ldots, x(t)]$ and $[x_D(t+d_1), x_D(t+d_1+1), \ldots, x_D(t+d_2)]$. It is assumed that this mapping $f$ is independent of time $t$, although a more general setting allows for a function $f(t)$ that depends on time.
For day-ahead prediction, a batch prediction with $T = 48$ is typically implemented. Following the methodology used by EGAT for day-ahead forecasting, we predicted demands for half-hour intervals from $HH = 0$ to $47$ for the next day, considering the latest demand at 2 pm of the current day. In this specific scenario, the delay is set as $d_1 = 48 - 28 = 20$ and the batch_size should include the half-hour delayed data. Thus, the delay parameters are $d_1 = 20$ and $d_2 = d_1 + T - 1 = 20 + 48 - 1 = 67$.
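The delay arithmetic and the sample/target windows described above can be sketched as follows, assuming a flat half-hourly series of scalar demand values in which index 0 is $HH = 0$ of the first day. The function names are hypothetical, and the sketch omits the weather and calendar variables.

```python
HALF_HOURS_PER_DAY = 48
CUTOFF_HH = 28  # latest observed half-hour: 2 pm of the current day


def prediction_delays(horizon=HALF_HOURS_PER_DAY):
    """Return (d1, d2), the offsets of the first and last predicted half-hour."""
    d1 = HALF_HOURS_PER_DAY - CUTOFF_HH
    d2 = d1 + horizon - 1
    return d1, d2


def make_pair(series, t, lookback):
    """Build (sample, target) for time index t.

    sample covers series[t - lookback + 1 .. t] and
    target covers series[t + d1 .. t + d2], both inclusive.
    """
    d1, d2 = prediction_delays()
    if t - lookback + 1 < 0 or t + d2 >= len(series):
        raise IndexError("window falls outside the series")
    sample = series[t - lookback + 1 : t + 1]
    target = series[t + d1 : t + d2 + 1]
    return sample, target
```

If `t` is the 2 pm index of a given day, the target window spans exactly the 48 half-hours of the following day.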
In the case of an FNN, the input is transformed from $[x(t-\tau+1), x(t-\tau+2), \ldots, x(t)]$ to $[x_1(t-\tau+1), x_2(t-\tau+1), \ldots, x_k(t-\tau+1), x_1(t-\tau+2), \ldots, x_k(t-\tau+2), \ldots, x_1(t), \ldots, x_k(t)]$. Here, $x(t)$ is assumed to consist of $k$ variables, denoted as $[x_1(t), x_2(t), \ldots, x_k(t)]$. Thus, the FNN maps from $k\tau$ inputs to one output. In order to account for dummies that have the same value for a given day, we retain only one value per day. Each $x(t)$ in the samples should include the demand at time $t$, the forecasted temperature at time $t + d_2$, and DayOfWeek dummies for the next day (tomorrow, if $t$ is today). The prediction half-hours in this case range from $t + d_1$ to $t + d_2$; thus, if we are interested in considering the days after holidays, we should incorporate dummies for the two days following tomorrow. It is important to note that, because holidays fall on weekdays, there are irregularities in the DayOfWeek dummies for adjacent days in the dataset. For example, the sequence may follow a pattern such as Monday, Tuesday, Thursday, Friday, Tuesday, etc. As a result, the DayOfWeek dummies are random in nature, not periodic.
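The flattening step for the FNN input can be illustrated with a short sketch. It assumes the lookback window is held as a list of $\tau$ rows, each with $k$ variables; the function name is hypothetical and the per-day de-duplication of dummies mentioned above is not shown.

```python
def flatten_window(window):
    """Flatten tau rows of k variables into one list of k * tau inputs.

    The order is time-major: all k variables of the oldest step come first,
    followed by the next step, ending with the variables at time t.
    """
    return [value for step in window for value in step]
```

For example, a window of three steps with three variables each (demand, temperature, one dummy) yields nine inputs in the order $x_1(t-2), x_2(t-2), x_3(t-2), \ldots, x_3(t)$.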
When considering the historical data to predict $target(t)$, particularly when the lookback $\tau$ is large, the LSTM model proves beneficial by mitigating the issue of vanishing or exploding gradients. However, including too much fluctuating information for a large $\tau$ may adversely impact the accuracy of the predictions, as depicted in Figure 13, which illustrates the impact of off-peaks and peaks. During prediction using the pre-trained model (➀), the upper off-peak (ⓐ) is affected. This effect can be seen throughout the test dataset.
In the case of holiday data, it may be necessary to have a large τ spanning one or multiple years. Conversely, for non-holiday data the required lookback period is typically on the order of a few days or weeks, with the exception of the days surrounding Songkran and New Year, which may require a lookback period on the order of years. To address this, an approach of treating the interim days together with the holidays was attempted; however, this resulted in poor performance [22].

5.3. Hyperparameter Tuning

The process of tuning hyperparameters plays a crucial role in shaping the structure of a network, and significantly impacts the performance of deep networks. These hyperparameters represent variables with values that are continuously adjusted to optimize the model’s performance [4]. In our experiment, we conducted the hyperparameter optimization process separately for each deep network. The following were the major hyperparameters:
  • Number of hidden layers
  • Number of network training iterations
  • Mini-batch size, which denotes the number of time series samples processed in each backpropagation step (iteration)
  • Epochs: one epoch denotes one full forward and backward pass through the whole dataset, and the number of epochs denotes the number of passes over the dataset required for optimal training
  • Dropout, which is a technique to reduce overfitting by randomly omitting a fraction of the neurons during training. We applied both forward and recurrent dropout
  • Lookback period, which denotes the number of previous timesteps taken to predict the subsequent timestep. In our tuning, we used a 5–10 day lookback period to predict the subsequent timestep of one day ahead
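An exhaustive search over such hyperparameters can be sketched as below. The candidate values mirror the Scenario 1 rows of Table A1; the `evaluate` callback is a hypothetical stand-in for training a network with the given settings and returning its validation MAE, and the search strategy itself is an assumption rather than a description of the authors' implementation.

```python
from itertools import product

# Candidate values as they appear in the Scenario 1 rows of Table A1.
GRID = {
    "nnodes": [32, 64],
    "nlayers": [1, 2],
    "lookback_days": [5, 10],
    "dropout": [0.0, 0.05, 0.1, 0.15],
}


def grid_search(evaluate, grid=GRID):
    """Return (best_config, best_score) minimizing evaluate(config).

    evaluate is supplied by the caller (hypothetical here); it should train
    a model with the given settings and return its validation MAE.
    """
    names = list(grid)
    best_config, best_score = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        score = evaluate(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

This grid contains 2 × 2 × 2 × 4 = 32 combinations, which matches the number of rows in Table A1.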

5.4. Criticism of ANNs

ANNs are often criticized for their black-box nature, i.e., lack of interpretability, which has been acknowledged in numerous studies [1,3,22,26,54]. The possibility of overfitting is another major drawback of ANNs. Overfitting can occur due to either over-training or over-parameterization. In an overfitted model, the training data may be fitted too well, leading to poor generalization performance on unseen test data [1]. Hong et al. [1] commented critically that good results with ANNs can be obtained by fitting the properties to the targets. For example, the nonlinear autoregressive ANN model implemented in [55] achieved a 30% improvement in accuracy.
However, due to the elastic configuration of ANN structures and their ability to tackle the nonlinearity of periodicity, seasonality, and the sequential dependencies of electric demand data sequences, researchers have applied ANN and similar deep learning architectures in the electricity demand forecasting use case. In many examples [23,36,48], ANNs have already shown outstanding performance on electricity demand prediction in terms of accuracy; nevertheless, they do not provide insight into the relationship between electricity demand and its driving factors.

6. Results and Discussion

In this section, we describe the approach used to empirically determine the hyperparameters for the FNN, RNN-LSTM, and RNN-GRU models for both scenarios. To ensure model performance and optimize the hyperparameters, a separate validation dataset was created by partitioning it from the training dataset. During the training process, simultaneous validation steps were performed using this validation dataset. The training dataset spanned from 2009 to 2011, the validation dataset was taken from 2012, and the testing dataset was taken from 2013 (Figure 14). The model's weights were adjusted based on the calculated loss from the validation dataset. The corresponding results for all parameters are presented in Appendix A.
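The chronological partition described above can be sketched as follows, assuming each observation is available as a (year, value) pair. The function name and record layout are hypothetical.

```python
def split_by_year(records):
    """Partition (year, value) records into train, validation, and test lists.

    Years follow the text: 2009-2011 for training, 2012 for validation,
    and 2013 for testing. Records from other years are ignored.
    """
    splits = {"train": [], "validation": [], "test": []}
    for year, value in records:
        if 2009 <= year <= 2011:
            splits["train"].append(value)
        elif year == 2012:
            splits["validation"].append(value)
        elif year == 2013:
            splits["test"].append(value)
    return splits
```

Splitting by calendar year rather than at random keeps the validation and test periods strictly later than the training period, which avoids leaking future demand into training.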

6.1. Parameter Tuning for Scenario1

In this scenario, we compiled a list of hyperparameter value sets to be tested specifically for the FNN, RNN-LSTM, and RNN-GRU models. These hyperparameter values are presented in Table 4.
The parameters presented in Table 4a were initially implemented to test the functionality and validity of our function. After successfully verifying its performance, the set of parameters resulting in the minimum validation MAE was selected as the optimal or tuned parameter set, shown in Table 4b.

6.2. Parameter Tuning for Scenario2

Similarly, for Scenario 2 we prepared various sets of hyperparameter values to experiment with, focusing on the FNN, RNN-LSTM, and RNN-GRU models. The specific hyperparameter values are detailed in Table 5.
The parameters outlined in Table 5a were first used to assess the functionality and reliability of our function. Once the performance had been confirmed, the parameter set leading to the lowest validation MAE was chosen as the optimized or tuned parameter set, displayed in Table 5b.

6.3. FNN Performance: Scenario1

For Scenario 1, the FNN model outperformed the RNN models. Through our FNN experimental setup, we achieved a minimum validation loss (MAE) of 170.41 MW, the lowest of the three models, compared with 195.31 MW and 200.75 MW for the RNN models. This outcome was obtained by selecting a double-layered network with 32 nodes, a lookback period of 10 days, and training the model for 351 epochs.
Using the optimized parameters mentioned above with a dataset size of 44,256 and a test size of 11,520, we conducted predictions for the electricity demand in 2013. The resulting MAPE was found to be 2.47%. It is important to note that this data setup specifically excluded holidays and weekends, focusing solely on weekdays. The results are presented in Figure 15, where the actual demand is denoted as load and the forecasting results are denoted as load_pred.
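For reference, the two error measures quoted in this section can be computed as in the sketch below. It uses the standard textbook definitions of MAE and MAPE, which are assumed here to match those used in the study; the function names are illustrative.

```python
def mae(actual, predicted):
    """Mean absolute error, in the units of the data (MW for demand)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Every actual value must be non-zero, which holds for system-level demand.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)
```

MAE is reported in MW and therefore depends on the scale of demand, whereas MAPE is scale-free, which is why the validation losses are given in MW and the test results in percent.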

6.4. RNN-LSTM Performance: Scenario1

For Scenario 1, the RNN-LSTM model outperformed the RNN-GRU model but underperformed the FNN model. Through our RNN-LSTM experimental setup, we achieved a minimum validation loss of 195.31 MW, which is a lower MAE than the 200.75 MW of the RNN-GRU model and a higher MAE than the 170.41 MW of the FNN model. This outcome was obtained by selecting a single-layered network with 32 nodes, a lookback period of 5 days, and training the model for 69 epochs.
Using the optimized parameters mentioned earlier with a dataset size of 44,256 and a test size of 11,520, we conducted predictions for the electricity demand in 2013. The resulting Mean Absolute Percentage Error (MAPE) was found to be 2.58% for Scenario 1. It should be noted that this data setup specifically excluded holidays and weekends, focusing solely on weekdays. The prediction results are presented in Figure 16, where the actual demand is denoted as load and the forecasting results are denoted as load_pred.

6.5. RNN-GRU Performance: Scenario1

For Scenario 1, the RNN-GRU model underperformed compared to the FNN and RNN-LSTM models. Through our RNN-GRU experimental setup, we achieved a minimum validation loss of 200.75 MW, which was the highest MAE of the three models, compared with 170.41 MW and 195.31 MW. This outcome was obtained by selecting a single-layered network with 32 nodes, a lookback period of 5 days, and training the model for 51 epochs.
Using the optimized parameters mentioned earlier with a dataset size of 44,256 and a test size of 11,520, we conducted predictions for the electricity demand in 2013. The resulting Mean Absolute Percentage Error (MAPE) was found to be 3.37% for Scenario 1. It should be noted that this data setup specifically excluded holidays and weekends, focusing solely on weekdays. The prediction results are presented in Figure 17, where the actual demand is denoted as load and the forecasting results are denoted as load_pred.
The RNN-based GRU and LSTM models underperformed compared to the FNN in Scenario 1; the corresponding MAPEs are as shown below.
According to Table 6 (and see details in Appendix A, Table A1), the minimum loss of 170.41 MW is achieved when the parameters are set as follows: nnodes = 32, nlayers = 2, lookback = 10 days, and dropout = 0 over the course of 351 epochs. We tested various combinations of the variables by trial and error, finding that excluding the month variable did not lead to a significant improvement in MAPE when using the FNN model. However, incorporating the maximum temperature variable in the LSTM model resulted in an improvement of 1.39 MW in MAE, and the MAPE decreased to 2.54% (Figure 18). This indicates that including the maximum temperature variable enhanced the model's predictive accuracy.
Next, we considered the entire dataset in Scenario 2, including demand values for holidays, weekends, and weekdays. Additional hyperparameter tuning experiments were conducted, with the results shown in Table 5. We achieved the following results in Scenario 2:
  • For the FNN model, the minimum validation loss of 237.82 MW was obtained when nnodes = 64, nlayers = 3, lookback = 8 days, dropout = 0, and epochs = 180.
  • For the GRU model, the minimum validation loss of 234.22 MW occurred when nnodes = 64, nlayers = 2, lookback = 8 days, dropout = 0, and epochs = 99.
  • Similarly, for the LSTM model, the minimum validation loss of 242.12 MW was achieved when nnodes = 64, nlayers = 2, lookback = 8 days, dropout = 0, and epochs = 56.
The MAE values obtained from the optimized parameters indicate that the RNN-GRU model outperformed the other models. The overall results are summarized below.
Table 7 reflects the superior performance of the RNN-GRU model in terms of both MAE and MAPE. We tabulated the prediction results obtained by the FNN, LSTM, and GRU networks trained by using the aggregate data and testing for 2013, with the results shown below.
Table 8 indicates clearly that the RNN-based GRU model has the best accuracy for Scenario 2, i.e., the aggregated dataset. The overall MAPE is increased due to the volatile nature of demand during weekends and holidays, as discussed in Section 4.
The predicted data for the 2013 test dataset when using the best model for Scenario 2 (RNN-based GRU) are plotted in Figure 19. Observing the predicted values, there is under-prediction in July 2013 and over-prediction in April 2013 and December 2013. This kind of under-prediction or over-prediction is normal in deep learning algorithms when implementing a lookback period. Figure 20 compares the predictions using FNN, GRU, and LSTM for the whole dataset for 2013.

7. Conclusions

Accurate load forecasting ensures the reliable operation of power grids, saves costs, and leads to more sustainable scheduling and marketing plans in electricity market operation. However, integration of nonlinear energy generation from solar and wind energy sources adds to the uncertainty and challenges involved in developing accurate forecasting models. In this paper, we introduce a novel approach by constructing two different scenarios based on the same dataset and adopting sophisticated AI-based models. Preprocessed datasets were fed to the models for several experiments, in which the models were trained with 3 years of half-hourly data and validated with 1 year of half-hourly data to obtain the optimized parameters.
For Scenario 1, double hidden layers with 32 nodes and a lookback period of 10 days in 351 epochs resulted in the minimum validation loss. Considering these optimized parameters, day-ahead forecasting was carried out using the test data with one year of half-hourly data. The results demonstrated that the deep networks were highly influenced by the number of hidden layers and number of neurons per layer. In the case of Scenario 1, the FNN model exhibited superior accuracy to the two RNN-based models. The FNN model achieved the lowest MAPE of 2.47% when predicting only the weekday load, while the RNN-based GRU model exhibited the worst MAPE of 3.37%.
Similarly, for Scenario 2, double layers with 64 nodes and a lookback period of 8 days in 99 epochs resulted in the minimum validation loss. The RNN-based GRU model exhibited the best MAPE of 3.44% when testing on the one-year test dataset. A comparative study was conducted to evaluate the forecasting accuracy across different scenarios. When forecasting for non-holiday periods using the dataset from Scenario 1, the lowest MAPE of 2.47% was achieved by the FNN method. In contrast, when predicting the same test dataset (non-holidays) using the dataset from Scenario 2, the RNN-GRU model achieved a slightly higher MAPE of 2.71%, approximately 10% higher in relative terms. As the dataset for Scenario 1 specifically consisted of non-holiday weekdays, it is evident that Scenario 1 provides better accuracy for such days than the dataset from Scenario 2.
Based on these findings, we can implement STLF by categorizing the dataset into two distinct scenarios. Scenario 1 can be used to predict electricity demand for workdays (Monday to Friday), while Scenario 2 can be employed to forecast demand for weekends (Saturday and Sunday) and holidays with improved accuracy.

Author Contributions

Conceptualization and methodology, S.K. and K.C.; software, S.K.; validation, S.K., P.K. and S.G.; formal analysis, S.K.; investigation, P.K.; writing—original draft preparation, K.C.; writing—review and editing, K.C., S.G. and S.K.; visualization, P.K.; supervision, S.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available because the electricity demand dataset for Thailand is not public. Therefore, the authors need to maintain privacy ethics and cannot make it public.

Acknowledgments

We would like to express our sincere appreciation to Chawalit Jeenanunta and to EGAT for providing the necessary data used in this research.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Tuning of Hyperparameters

Hyperparameter selection for Scenario 1 is described in Table A1, where the FNN method outperformed the other methods. Thus, the following parameters were varied for the FNN method to find the best parameter values.
Table A1. Variation of parameters and corresponding results for Scenario 1.
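| nnodes | nlayers | lookback (days) | dropout | FNN MAE (MW) | FNN epochs | GRU MAE (MW) | GRU epochs | LSTM MAE (MW) | LSTM epochs |
|---|---|---|---|---|---|---|---|---|---|
| 32 | 1 | 5 | 0 | 226.09 | 319 | 195.31 | 69 | 234.16 | 39 |
| 32 | 1 | 5 | 0.05 | 179.33 | 362 | 204.72 | 72 | 251.48 | 30 |
| 32 | 1 | 5 | 0.1 | 184.27 | 384 | 217.90 | 50 | 223.01 | 50 |
| 32 | 1 | 5 | 0.15 | 205.24 | 327 | 228.40 | 80 | 223.52 | 81 |
| 32 | 1 | 10 | 0 | 264.73 | 196 | NA | NA | NA | NA |
| 32 | 1 | 10 | 0.05 | 231.74 | 302 | NA | NA | NA | NA |
| 32 | 1 | 10 | 0.1 | 196.73 | 399 | NA | NA | NA | NA |
| 32 | 1 | 10 | 0.15 | 197.42 | 348 | NA | NA | NA | NA |
| 32 | 2 | 5 | 0 | 271.67 | 377 | 226.19 | 72 | 200.75 | 51 |
| 32 | 2 | 5 | 0.05 | 259.13 | 136 | 213.24 | 79 | 240.31 | 89 |
| 32 | 2 | 5 | 0.1 | 272.13 | 89 | 235.56 | 57 | 225.07 | 64 |
| 32 | 2 | 5 | 0.15 | 238.86 | 2 | 233.29 | 71 | 224.44 | 100 |
| 32 | 2 | 10 | 0 | 170.41 | 351 | NA | NA | NA | NA |
| 32 | 2 | 10 | 0.05 | 185.37 | 395 | NA | NA | NA | NA |
| 32 | 2 | 10 | 0.1 | 189.83 | 260 | NA | NA | NA | NA |
| 32 | 2 | 10 | 0.15 | 200.49 | 328 | NA | NA | NA | NA |
| 64 | 1 | 5 | 0 | 230.31 | 153 | 221.11 | 79 | 241.49 | 88 |
| 64 | 1 | 5 | 0.05 | 189.14 | 322 | 205.10 | 72 | 214.19 | 40 |
| 64 | 1 | 5 | 0.1 | 255.40 | 307 | 222.80 | 66 | 253.20 | 51 |
| 64 | 1 | 5 | 0.15 | 245.71 | 53 | 207.48 | 78 | 251.50 | 63 |
| 64 | 1 | 10 | 0 | 193.33 | 391 | NA | NA | NA | NA |
| 64 | 1 | 10 | 0.05 | 198.98 | 142 | NA | NA | NA | NA |
| 64 | 1 | 10 | 0.1 | 210.31 | 351 | NA | NA | NA | NA |
| 64 | 1 | 10 | 0.15 | 191.81 | 399 | NA | NA | NA | NA |
| 64 | 2 | 5 | 0 | 314.57 | 140 | 207.66 | 67 | 219.21 | 100 |
| 64 | 2 | 5 | 0.05 | 314.30 | 141 | 227.20 | 61 | 212.24 | 99 |
| 64 | 2 | 5 | 0.1 | 278.63 | 386 | 240.29 | 71 | 218.14 | 67 |
| 64 | 2 | 5 | 0.15 | 293.13 | 126 | 236.68 | 78 | 227.72 | 77 |
| 64 | 2 | 10 | 0 | 220.30 | 356 | NA | NA | NA | NA |
| 64 | 2 | 10 | 0.05 | 192.39 | 340 | NA | NA | NA | NA |
| 64 | 2 | 10 | 0.1 | 218.01 | 365 | NA | NA | NA | NA |
| 64 | 2 | 10 | 0.15 | 216.56 | 349 | NA | NA | NA | NA |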
In Table A1, the minimum loss of 170.41 MW is obtained from the FNN model when nnodes = 32, nlayers = 2, lookback = 10 days, dropout = 0, and epochs = 351. For the LSTM model, the minimum loss of 200.75 MW results when nnodes = 32, nlayers = 2, lookback = 5 days, dropout = 0, and epochs = 51, and for the GRU model the minimum loss of 195.31 MW is obtained with nnodes = 32, nlayers = 1, lookback = 5 days, dropout = 0, and epochs = 69.
Table A2. Variation of parameters and corresponding results for Scenario 2.
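| nnodes | nlayers | dropout | FNN MAE (MW) | FNN epochs | GRU MAE (MW) | GRU epochs | LSTM MAE (MW) | LSTM epochs |
|---|---|---|---|---|---|---|---|---|
| 32 | 1 | 0 | 323.95 | 83 | 269.42 | 71 | 265.85 | 57 |
| 32 | 1 | 0.1 | 387.77 | 164 | 251.01 | 41 | 276.99 | 34 |
| 32 | 1 | 0.2 | 409.22 | 174 | 281.80 | 76 | 267.53 | 55 |
| 32 | 2 | 0 | 243.30 | 352 | 251.17 | 53 | 305.92 | 97 |
| 32 | 2 | 0.1 | 266.86 | 304 | 278.23 | 67 | 274.52 | 97 |
| 32 | 2 | 0.2 | 276.40 | 349 | 280.47 | 99 | 265.23 | 52 |
| 32 | 3 | 0 | 251.37 | 96 | 284.14 | 97 | 306.19 | 58 |
| 32 | 3 | 0.1 | 261.65 | 374 | 293.56 | 98 | 281.66 | 38 |
| 32 | 3 | 0.2 | 273.96 | 209 | 274.05 | 73 | 275.03 | 99 |
| 64 | 1 | 0 | 339.85 | 59 | 263.20 | 82 | 284.72 | 68 |
| 64 | 1 | 0.1 | 327.44 | 51 | 275.22 | 99 | 319.13 | 19 |
| 64 | 1 | 0.2 | 388.25 | 68 | 290.82 | 97 | 269.69 | 43 |
| 64 | 2 | 0 | 243.72 | 324 | 234.22 | 99 | 242.12 | 56 |
| 64 | 2 | 0.1 | 277.33 | 289 | 281.96 | 77 | 254.63 | 97 |
| 64 | 2 | 0.2 | 311.62 | 369 | 296.06 | 80 | 279.30 | 92 |
| 64 | 3 | 0 | 237.82 | 180 | 266.50 | 95 | 296.79 | 88 |
| 64 | 3 | 0.1 | 279.02 | 288 | 286.08 | 92 | 281.90 | 87 |
| 64 | 3 | 0.2 | 296.57 | 66 | 290.56 | 93 | 260.95 | 100 |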
Similarly, from Table A2, the minimum loss of 234.22 MW is obtained from the GRU model when nnodes = 64, nlayers = 2, lookback = 8 days, dropout = 0, and epochs = 99. For the LSTM model, the minimum loss of 242.12 MW results when nnodes = 64, nlayers = 2, lookback = 8 days, dropout = 0, and epochs = 56, and for the FNN model the minimum loss of 237.82 MW is obtained with nnodes = 64, nlayers = 3, lookback = 8 days, dropout = 0, and epochs = 180.
Figure A1. Validating the results on hyperparameter testing for Scenario 1.
Figure A2. Validating the results on hyperparameter testing for Scenario 2.

References

  1. Hong, T.; Shu, F. Probabilistic electric load forecasting: A tutorial review. Int. J. Forecast. 2016, 3, 914–938. [Google Scholar] [CrossRef]
  2. Raza, M.; Khosravi, A. A review on artificial intelligence based load demand forecasting techniques for smart grid and buildings. Renew. Sustain. Energy Rev. 2015, 50, 1352–1372. [Google Scholar] [CrossRef]
  3. Makridakis, S.; Spiliotis, E.; Assimakopoulus, V. Statistical and machine learning fore- casting methods: Concerns and ways forward. PLoS ONE 2018, 13, e0194889. [Google Scholar] [CrossRef]
  4. Stosov, M.A.; Radivojevic, N.; Ivanova, M. Electricity Consumption Prediction in an Electronic System Using Artificial Neural Networks. Electronics 2022, 11, 3506. [Google Scholar] [CrossRef]
  5. Roman-Portabales, A.; Lopez-Nores, M.; Pazos-Arias, J.J. Systematic Review of Electricity Demand Forecast Using ANN-Based Machine Learning Algorithms. Sensors 2021, 21, 4544. [Google Scholar] [CrossRef]
  6. McCulloch, J.; Ignatieva, K. Forecasting High Frequency Intra-Day Electricity Demand Using Temperature. SSRN Electron. J. 2017. [Google Scholar] [CrossRef]
  7. Hewamalage, H.; Christoph, B.; Kasum, B. Recurrent Neural Networks for Time Series Forecasting: Current Status and Future Directions. Int. J. Forecast. 2019, 17, 388–427. [Google Scholar] [CrossRef]
  8. Chapagain, K.; Kittipiyakul, S. Performance Analysis of Short-Term Electricity Demand with Atmospheric Variables. Energies 2018, 11, 818. [Google Scholar] [CrossRef]
  9. Clements, A.E.; Hurn, A.S.; Li, Z. Forecasting day-ahead electricity load using a multiple equation time series approach. Eur. J. Oper. Res. 2016, 251, 522–530. [Google Scholar] [CrossRef]
  10. Lusis, P.; Khalilpour, K.; Andrew, L.; Liebman, A. Short-term residential load forecasting: Impact of calendar effects and forecast granularity. Appl. Energy 2017, 205, 654–669. [Google Scholar] [CrossRef]
  11. Taylor, J.W.; de Menezes, L.M.; McSharry, P.E. A comparison of univariate methods for forecasting electricity demand up to a day ahead. Int. J. Forecast. 2006, 22, 1–16. [Google Scholar] [CrossRef]
  12. Hong, T.; Wang, P.; Willis, H.L. Short Term Electric Load Forecasting. Int. J. Forecast. 2010, 74, 1–6. [Google Scholar] [CrossRef]
  13. Bandara, K.; Bergmeir, C.; Smyl, S. Forecasting across time series databases using recurrent neural networks on groups of similar series: A clustering approach. Expert Systems with Applications. Expert Syst. Appl. 2019, 140, 112896. [Google Scholar] [CrossRef]
  14. Hong, T.; Wang, P. Fuzzy interaction regression for short term load forecasting. Fuzzy Opt. Decis. Mak. 2014, 13, 91–103. [Google Scholar] [CrossRef]
  15. Dang-Ha, H.; Filippo, M.B.; Roland, O. Local Short Term Electricity Load Forecasting: Automatic Approaches. arXiv 2017, arXiv:1702.08025. [Google Scholar]
  16. Selvi, M.; Mishra, S. Investigation of Weather Influence in Day-Ahead Hourly Electric Load Power Forecasting with New Architecture Realized in Multivariate Linear Regression & Artificial Neural Network Techniques. In Proceedings of the 8th IEEE India International Conference on Power Electronics (IICPE), Jaipur, India, 13–15 December 2018; pp. 13–15. [Google Scholar]
  17. Ramanathan, R.; Engle, R.; Granger, C.W.; Vahid-Araghi, F.; Brace, C. Short-run forecasts of electricity loads and peaks. Int. J. Forecast. 1997, 13, 161–174. [Google Scholar] [CrossRef]
  18. Chapagain, K.; Sato, T.; Kittipiyakul, S. Performance analysis of short-term electricity demand with meteorological parameters. In Proceedings of the 2017 14th Int Conf on Electl Eng/Elx, Computer, Telecom and IT (ECTI-CON), Phuket, Thailand, 27–30 June 2017; pp. 330–333. [Google Scholar]
  19. Ismail, Z.; Jamaluddin, F.; Jamaludin, F. Time Series Regression Model for Forecasting Malaysian Electricity Load Demand. Asian J. Math. Stat. 2008, 1, 139–149. [Google Scholar] [CrossRef]
  20. Bouktif, S.; Fiaz, A.; Ouni, A.; Serhani, M. Multi-Sequence LSTM-RNN Deep Learning and Metaheuristics for Electric Load Forecasting. Energies 2020, 13, 391. [Google Scholar] [CrossRef]
  21. Bouktif, S.; Fiaz, A.; Ouni, A.; Serhani, M. Optimal Deep Learning LSTM Model for Electric Load Forecasting using Feature Selection and Genetic Algorithm: Comparison with Machine Learning Approaches. Energies 2018, 7, 1636. [Google Scholar] [CrossRef]
  22. Chapagain, K.; Kittipiyakul, S.; Kulthanavit, P. Short-Term Electricity Demand Forecasting: Impact Analysis of Temperature for Thailand. Energies 2020, 13, 2498. [Google Scholar] [CrossRef]
  23. Cao, Z.; Han, X.; Lyons, W.; O’Rourke, F. Energy management optimisation using a combined Long Short term Memory recurrent neural network-Particle Swarm Optimization model. J. Clean. Prod. 2022, 326, 129246. [Google Scholar] [CrossRef]
  24. Alya, A.; Ameena, S.A.; Mousa, M.; Rajesh, K.; Ahmed, A.Z.D. Short-term load and price forecasting using artificial neural network with enhanced Markov chain for ISO New England. Energy Rep. 2023, 9, 4799–4815. [Google Scholar] [CrossRef]
  25. Hongli, L.; Luoqi, W.; Ji, L.; Lei, S. Research on Smart Power Sales Strategy Considering Load Forecasting and Optimal Allocation of Energy Storage System in China. Energies 2023, 16, 3341. [Google Scholar] [CrossRef]
  26. Hippert, H.S.; Pedreira, C.E.; Souza, R.C. Neural networks for short-term load forecasting: A review and evaluation. IEEE Trans. Power Syst. 2001, 16, 44–55. [Google Scholar] [CrossRef]
  27. Sankalpa, C.; Kittipiyakul, S.; Laitrakun, S. Short-Term Electricity Load Using Validated Ensemble Learning. Energies 2022, 15, 8567. [Google Scholar] [CrossRef]
  28. Chapagain, K.; Kittipiyakul, S. Short-term Electricity Load Forecasting Model and Bayesian Estimation for Thailand Data. In Proceedings of the 2016 Asia Conference on Power and Electrical Engineering (ACPEE 2016), Bankok, Thailand, 20–22 March 2016; Volume 55, p. 06003. [Google Scholar]
  29. Chapagain, K.; Kittipiyakul, S. Short-term Electricity Load Forecasting for Thailand. In Proceedings of the 2018 15th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON), Chiang Rai, Thailand, 18–21 July 2018; pp. 521–524.
  30. Phyo, P.; Jeenanunta, C. Electricity Load Forecasting using a Deep Neural Network. Eng. Appl. Sci. Res. 2019, 46, 10–17.
  31. Su, W.H.; Jeenanunta, C. Short-term Electricity Load Forecasting in Thailand: An Analysis on Different Input Variables. In IOP Conference Series: Earth and Environmental Science; IOP Publishing: Bristol, UK, 2018.
  32. Darshana, A.K.; Chawalit, J. Hybrid Particle Swarm Optimization with Genetic Algorithm to Train Artificial Neural Networks for Short-term Load Forecasting. Int. J. Swarm Intell. Res. 2019, 10, 1–14.
  33. Dilhani, M.H.M.R.S.; Jeenanunta, C. Daily electric load forecasting: Case of Thailand. In Proceedings of the 2016 7th International Conference on Information and Communication Technology for Embedded Systems (IC-ICTES), Bangkok, Thailand, 20–22 March 2016; pp. 25–29.
  34. Shi, H.; Xu, M.; Li, R. Deep Learning for Household Load Forecasting Novel Pooling Deep RNN. IEEE Trans. Smart Grid 2018, 9, 5271–5280.
  35. Harun, M.H.H.; Othman, M.M.; Musirin, I. Short term Load Forecasting using Artificial Neural Network based Multiple lags and Stationary Timeseries. In Proceedings of the 2010 4th International Power Engineering and Optimization Conference (PEOCO), Shah Alam, Malaysia, 23–24 June 2010; pp. 363–370.
  36. Tee, C.Y.; Cardell, J.B.; Ellis, G.W. Short term Load Forecasting using Artificial Neural Network. In Proceedings of the 41st North American Power Symposium, Starkville, MS, USA, 4–6 October 2009; pp. 1–6.
  37. Li, Y.; Pizer, W.A.; Wu, L. Climate change and residential electricity consumption in the Yangtze River Delta, China. Proc. Natl. Acad. Sci. USA 2019, 116, 472–477.
  38. Torabi, M.; Hashemi, S. A data mining paradigm to forecast weather sensitive short-term energy consumption. In Proceedings of the 2012 16th CSI International Symposium on Artificial Intelligence and Signal Processing (AISP 2012), Shiraz, Iran, 2–3 May 2012.
  39. Pramono, S.H.; Rohmatillah, M.; Maulana, E.; Hasanah, R.N.; Hario, F. Deep learning-based short-term load forecasting for supporting demand response program in hybrid energy system. Energies 2019, 12, 3359.
  40. Kim, T.Y.; Cho, S.B. Predicting residential energy consumption using CNN-LSTM neural networks. Energy 2019, 182, 72–81.
  41. Qi, Z.; Zheng, X.; Chen, Q. A short term load forecasting of integrated energy system based on CNN-LSTM. Web Conf. 2020, 185, 01032.
  42. Parkpoom, S.; Harrison, G.P. Analyzing the Impact of Climate Change on Future Electricity Demand in Thailand. IEEE Trans. Power Syst. 2008, 23, 1441–1448.
  43. Darshana, A.K.; Chawalit, J. Combine Particle Swarm Optimization with Artificial Neural Networks for Short-Term Load Forecasting. Int. Sci. J. Eng. Technol. 2017, 1, 25–30.
  44. Chapagain, K.; Kittipiyakul, S. Short-Term Electricity Demand Forecasting with Seasonal and Interactions of Variables for Thailand. In Proceedings of the 2018 International Electrical Engineering Congress (iEECON), Krabi, Thailand, 7–9 March 2018; pp. 1–4.
  45. Apadula, F.; Bassini, A.; Elli, A.; Scapin, S. Relationships between meteorological variables and monthly electricity demand. Appl. Energy 2012, 98, 346–356.
  46. Soares, L.J.; Souza, L.R. Forecasting electricity demand using generalized long memory. Int. J. Forecast. 2006, 22, 17–28.
  47. Sailor, D.; Pavlova, A. Air conditioning market saturation and long-term response of residential cooling energy demand to climate change. Energy 2003, 28, 941–951.
  48. Machado, E.; Pinto, T.; Guedes, V.; Morais, H. Demand Forecasting Using Feed-Forward Neural Networks. Energies 2021, 14, 7644.
  49. Gourav, K.; Uday, P.S.; Sanjeev, J. An adaptive particle swarm optimization-based hybrid long short-term memory model for stock price time series forecasting. Soft Comput. 2022, 26, 12115–12135.
  50. Hong, T.; Xiangzheng, L.; Liangzhi, L.; Lian, Z.; Yu, Y.; Xiaohui, H. One-shot pruning of gated recurrent unit neural network by sensitivity for time-series prediction. Neurocomputing 2022, 512, 15–24.
  51. Kittiwoot, C.; Vorapat, I.; Anothai, T. Electricity Consumption in Higher Education Buildings in Thailand during the COVID-19 Pandemic. Buildings 2022, 12, 1532.
  52. Wang, Y.; Niu, D.; Ji, L. Short-term power load forecasting based on IVL-BP neural network technology. Syst. Eng. Procedia 2012, 4, 168–174.
  53. Cottet, R.; Smith, M. Bayesian Modeling and Forecasting of Intraday Electricity Load. J. Am. Stat. Assoc. 2003, 98, 839–849.
  54. Shereen, E.; Daniela, T.; Ahmed, R.; Hadi, S.J.; Lars, S.T. Do We Really Need Deep Learning Models for Time Series Forecasting. arXiv 2021, arXiv:2101.02118.
  55. Buitrago, J.; Asfour, S. Short-Term Forecasting of Electric Loads Using Nonlinear Autoregressive Artificial Neural Networks with Exogenous Vector Inputs. Energies 2017, 1, 40.
Figure 1. Architecture of a Feedforward Neural Network.
Figure 2. Architecture of LSTM [49].
Figure 3. Architecture of GRU showing information flow [50].
Figure 4. (a) Complete demand profile for the year 2012 and (b) overall demand profile with trend.
Figure 5. (a) Decrease in load during the 2011 Bangkok flood and (b) sector-wise consumption.
Figure 6. (a) Electricity demand during Songkran period and (b) decrease in demand during New Year holiday.
Figure 7. (a) Weekly demand pattern. (b) Intra-day demand variation.
Figure 8. Variation of demand across years and months.
Figure 9. (a) Effect of temperature during peak hours and (b) effect of temperature at two different hours.
Figure 11. (a) Data input structure of FNN model and (b) data input structure of DNN model.
Figure 12. Demand prediction horizon: practice of EGAT in Thailand.
Figure 13. Visualization of pretrained model, lookback period, and prediction.
Figure 14. Breakdown of dataset into training, validation, and testing sets.
Figure 15. Weekday forecast using FNN.
Figure 16. Weekday forecast using RNN-LSTM.
Figure 17. Weekday forecasting using RNN-GRU.
Figure 18. Weekday forecasting with maximum temperature variables included in the LSTM model.
Figure 19. Predictions for 2013 using the whole dataset for RNN-GRU (Scenario 2).
Figure 20. Prediction of demand using FNN, GRU, and LSTM for the whole dataset for 2013: (a) first three weeks of January 2013, (b) first three weeks of March 2013, (c) last week of March 2013, including Songkran festival, (d) non-holiday period of June 2013, (e) first three weeks of December 2013.
Table 1. Variation of prediction accuracy due to the uncertain nature of demand datasets.
| Model | MAPE | Prediction Horizon | Data Source | Published Year | Reference |
| --- | --- | --- | --- | --- | --- |
| ANN model | 2.90% | 1 h | DSO, Delhi, India | 2018 | Selvi et al. [16] |
| | 1.96% | 1 h | Bandar Abbas, Iran | 2012 | Torabi et al. [38] |
| CNN-LSTM | 2.02% | 1 h | Public dataset, England, USA | 2019 | Pramono et al. [39] |
| | 34.84% | 1 h | UCI ML dataset (households) | 2019 | Kim et al. [40] |
| | 1% | 24 h | Industrial area, China | 2020 | Qi et al. [41] |
Table 2. List of published works based on 2009–2013 EGAT dataset [27].
| Method | Result | MAPE (%) | Reference |
| --- | --- | --- | --- |
| MLR with AR(2) | Bayesian estimation provides consistent and better accuracy compared to OLS estimation | 1% to 5% | [28] |
| PSO with ANN | Implementing PSO on the ANN model outperformed the shallow ANN model | 3.44% | [43] |
| OLS | Interaction of variables improves the prediction accuracy | >4% | [44] |
| OLS and Bayesian estimation | Including a temperature variable in the model can improve the prediction accuracy by up to 20% | 2% to 3% | [29] |
| PSO & GA with ANN | PSO+GA outperformed PSO with ANN | >3% | [32] |
| OLS, GLSAR, FNN | OLS and GLSAR models showed better forecasting accuracy than FNN | 1.74% to 2.95% | [22] |
| Ensemble for regression and ML | Implementing a blocked cross-validation scheme lowers the test MAPE | 2.6% | [27] |
| FNN, RNN-based LSTM & GRU | For weekdays and for aggregate data, GRU shows better accuracy | 2.47% to 3.44% | This study |
Table 3. List of selected input variables.
| Types | Variables | Description |
| --- | --- | --- |
| Deterministic | WD | Week dummy [Mon < Tue < … < Sat < Sun] |
| | MD | Month dummy [Feb < Mar < … < Nov < Dec] |
| | DayAfterHoliday | Binary 0 or 1 |
| | DayAfterLongHoliday | Binary 0 or 1 |
| | DayAfterSongkran | Binary 0 or 1 |
| | DayAfterNewyear | Binary 0 or 1 |
| Temperature | Temp | Forecasted temperature |
| | MaxTemp | Maximum forecasted temperature |
| | Square temperature | Square of the forecasted temperature |
| | MA2pmTemp | Moving average of temperature at 2 pm |
| Lagged | load1d_cut2pm | 1 day ahead until 2 pm and 2 days ahead after 2 pm load |
| | load2d_cut2pm | 2 days ahead until 2 pm and 3 days ahead after 2 pm load |
| | load3d_cut2pmR | 3 days ahead until 2 pm and 4 days ahead after 2 pm load |
| | load4d_cut2pmR | 4 days ahead until 2 pm and 5 days ahead after 2 pm load |
| Interaction | WD:Temp | Interaction: week day dummy to temperature |
| | MD:Temp | Interaction: month dummy to temperature |
| | WD:load1d_cut2pm | Interaction: week day dummy to load1d_cut2pm |
| | WD:load2d_cut2pm | Interaction: week day dummy to load2d_cut2pm |
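The lagged variables in Table 3 switch their lag at 2 pm. A minimal sketch of one way to build such a feature is given below. It assumes half-hourly data (48 periods per day, as in the "Time period" row of Tables 4 and 5), assumes that periods are indexed from midnight so that index 28 is the first half-hour starting at 14:00, and reads the table's description as "same period one day back before 2 pm, two days back from 2 pm onwards". The function name `lagged_load_cut2pm` and the constants are illustrative and are not taken from the paper.

```python
PERIODS_PER_DAY = 48  # half-hourly data, matching "Time period = 48" in Tables 4 and 5
CUT_PERIOD = 28       # assumption: index of the first half-hour starting at 14:00


def lagged_load_cut2pm(load, base_lag_days):
    """Build a cut-at-2-pm lagged series from a flat half-hourly load list.

    For periods before the cut, the value is taken `base_lag_days` days back;
    from the cut onwards it is taken one further day back. Positions without
    enough history are filled with None.
    """
    lagged = []
    for t in range(len(load)):
        period = t % PERIODS_PER_DAY
        lag_days = base_lag_days if period < CUT_PERIOD else base_lag_days + 1
        source = t - lag_days * PERIODS_PER_DAY
        lagged.append(load[source] if source >= 0 else None)
    return lagged


# Illustrative input only: four days of dummy values 0, 1, 2, ...
dummy_load = list(range(PERIODS_PER_DAY * 4))
load1d_cut2pm = lagged_load_cut2pm(dummy_load, base_lag_days=1)
load2d_cut2pm = lagged_load_cut2pm(dummy_load, base_lag_days=2)
```

Under this reading, `base_lag_days=1` would correspond to load1d_cut2pm and `base_lag_days=2` to load2d_cut2pm; the exact boundary handling at 2 pm in the original study may differ.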
Table 4. Experimental hyperparameter testing setup for Scenario 1: (a) hyperparameter variations and (b) tuned parameters for Scenario 1.
(a)
| Parameters | Value |
| --- | --- |
| Number of nodes | [2, 4, 8, 16, 32, 64, 128] |
| Number of hidden layers | [1 to 5] |
| Look back period | [5 days to 10 days] |
| Dropout | [0, 0.05, 0.1, 0.15] |
| Epochs | [up to 1 million] |
(b)
| Parameters | FNN | LSTM | GRU |
| --- | --- | --- | --- |
| Time period | 48 | 48 | 48 |
| Delay | 20 | 20 | 20 |
| Pred_batch_size | 48 | 48 | 48 |
| Number of hidden layers | 2 | 1 | 2 |
| Dropout | 0 | 0 | 0 |
| Number of nodes | 32 | 32 | 32 |
| Epochs | 351 | 69 | 51 |
| Look back period | 10 days | 5 days | 5 days |
| Train_fraction | 1 | 1 | 1 |
| Validation loss (MAE) | 170.41 | 195.31 | 200.75 |
Table 5. Experimental hyperparameter testing setup for Scenario 2: (a) hyperparameter variations and (b) tuned parameters for Scenario 2.
(a)
| Parameters | Value |
| --- | --- |
| Number of nodes | [2, 4, 8, 16, 32, 64, 128] |
| Number of hidden layers | [1 to 5] |
| Look back period | [5 days to 10 days] |
| Dropout | [0, 0.05, 0.1, 0.15] |
| Epochs | [up to 1 million] |
(b)
| Parameters | FNN | LSTM | GRU |
| --- | --- | --- | --- |
| Time period | 48 | 48 | 48 |
| Delay | 20 | 20 | 20 |
| Pred_batch_size | 48 | 48 | 48 |
| Number of hidden layers | 3 | 2 | 2 |
| Dropout | 0 | 0 | 0 |
| Number of nodes | 64 | 64 | 64 |
| Epochs | 180 | 56 | 99 |
| Look back period | 8 days | 8 days | 8 days |
| Train_fraction | 1 | 1 | 1 |
| Validation loss (MAE) | 237.82 | 242.12 | 234.22 |
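To give a sense of the size of the search space listed in part (a) of Tables 4 and 5, the following sketch enumerates every combination of the four structural hyperparameters. It assumes the look-back period was varied in whole-day steps, which the tables do not state, and it does not imply that the study ran an exhaustive grid search. The names `SEARCH_SPACE` and `enumerate_grid` are illustrative.

```python
from itertools import product

# Value ranges copied from part (a) of Tables 4 and 5.
# Assumption: look-back is varied in whole-day steps from 5 to 10 days.
SEARCH_SPACE = {
    "nodes": [2, 4, 8, 16, 32, 64, 128],
    "hidden_layers": [1, 2, 3, 4, 5],
    "look_back_days": [5, 6, 7, 8, 9, 10],
    "dropout": [0, 0.05, 0.1, 0.15],
}


def enumerate_grid(space):
    """Return one dict per combination of the hyperparameter values."""
    keys = list(space)
    return [dict(zip(keys, values)) for values in product(*(space[k] for k in keys))]


grid = enumerate_grid(SEARCH_SPACE)
print(len(grid))  # 7 * 5 * 6 * 4 = 840 candidate configurations
```

Each candidate would still have to be trained, with the number of epochs chosen separately (the tables list an upper bound of 1 million), so the tuned epoch counts in part (b) are not part of this grid.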
Table 6. Implementation of the best parameters for day-ahead forecasting (Scenario 1).
| Model | Nodes per layer | Hidden layers | Look back | Dropout | Epochs | Min MAE (validation) | Test MAE | Test MAPE (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FNN | 32 | 2 | 10 days | 0 | 351 | 170.41 | 165.54 | 2.47 |
| GRU | 32 | 2 | 5 days | 0 | 51 | 200.75 | 192.76 | 3.37 |
| LSTM | 32 | 1 | 5 days | 0 | 69 | 195.31 | 179.83 | 2.58 |
Table 7. Implementation of the best parameters for day-ahead forecasting (Scenario 2).
| Model | Nodes per layer | Hidden layers | Look back | Dropout | Epochs | Min MAE (validation) | Test MAE | Test MAPE (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FNN | 64 | 3 | 8 days | 0 | 180 | 237.82 | 262.8 | 3.54 |
| GRU | 64 | 2 | 8 days | 0 | 99 | 234.22 | 251.3 | 3.44 |
| LSTM | 64 | 2 | 8 days | 0 | 56 | 242.12 | 276.2 | 3.86 |
Table 8. MAPE (%) performance with the best parameters: day-ahead forecasting in Scenario 2.
| Day type | FNN | GRU | LSTM |
| --- | --- | --- | --- |
| Weekdays | 2.97 | 2.71 | 3.76 |
| Weekends | 3.83 | 4.62 | 3.58 |
| Holidays | 9.79 | 6.70 | 6.96 |
| Overall | 3.54 | 3.44 | 3.86 |
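The per-day-type breakdown in Table 8 can be reproduced from any set of forecasts by grouping the absolute percentage errors by day type. The sketch below uses the standard MAPE definition, the mean of |actual − predicted| / |actual| expressed in percent. The function names and the three sample values are made up for illustration and are not data from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def mape_by_daytype(actual, predicted, daytypes):
    """Group observations by day-type label and compute MAPE per group."""
    groups = {}
    for a, p, d in zip(actual, predicted, daytypes):
        acts, preds = groups.setdefault(d, ([], []))
        acts.append(a)
        preds.append(p)
    return {d: mape(acts, preds) for d, (acts, preds) in groups.items()}


# Made-up values for illustration only (not EGAT data).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
daytypes = ["Weekdays", "Weekdays", "Holidays"]

overall = mape(actual, predicted)
per_type = mape_by_daytype(actual, predicted, daytypes)
```

Because the overall MAPE is the average of the group values weighted by the number of observations in each group, it always lies between the smallest and largest group value, which is consistent with the "Overall" row of Table 8 sitting close to the weekday figures.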
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Chapagain, K.; Gurung, S.; Kulthanavit, P.; Kittipiyakul, S. Short-Term Electricity Demand Forecasting Using Deep Neural Networks: An Analysis for Thai Data. Appl. Syst. Innov. 2023, 6, 100. https://doi.org/10.3390/asi6060100
