A Deep Learning Approach for Short-Term Electricity Demand Forecasting: Analysis of Thailand Data

Abstract: Accurate electricity demand forecasting serves as a vital planning tool, enhancing the reliability of management decisions. However, achieving these aims, particularly in managing peak demand, is challenging due to the industry's volatility and the ongoing increase in residential energy use. Our research suggests that deep learning algorithms, such as recurrent neural networks (RNN), long short-term memory (LSTM), and gated recurrent units (GRU), hold promise for the accurate forecasting of electrical energy demand from time series data. This paper presents the construction and testing of three deep learning models across three separate scenarios. Scenario 1 uses data from all days; Scenario 2 considers only weekday data; Scenario 3 uses data from non-working days (Saturdays, Sundays, and holidays). The models were trained and tested across a wide range of alternative hyperparameters to determine the optimal configuration. The proposed models were validated on a dataset of half-hourly electrical energy demand spanning seven years from the Electricity Generating Authority of Thailand (EGAT). In terms of model performance, we found that the RNN-GRU model performed better when the dataset was substantial, as in Scenarios 1 and 2, while the RNN-LSTM model excelled in Scenario 3. Specifically, the RNN-GRU model achieved a mean absolute error (MAE) of 214.79 MW and a mean absolute percentage error (MAPE) of 2.08% for Scenario 1, and an MAE of 181.63 MW and a MAPE of 1.89% for Scenario 2, while the RNN-LSTM model obtained an MAE of 226.76 MW and a MAPE of 2.13% for Scenario 3. Furthermore, with an expanded dataset in Scenario 3, even higher precision can be anticipated.
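The two error metrics reported above can be computed as follows; a minimal Python sketch with illustrative demand values (not the EGAT data):

```python
def mae(actual, forecast):
    """Mean absolute error, in the units of the data (here, MW)."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - f) / a for a, f in zip(actual, forecast)) / len(actual)

# Illustrative half-hourly demand values in MW (not real EGAT data).
actual = [9500.0, 9800.0, 10200.0, 10050.0]
forecast = [9400.0, 9900.0, 10100.0, 10150.0]
print(mae(actual, forecast))             # 100.0
print(round(mape(actual, forecast), 2))  # 1.01
```

MAE keeps the units of the data, so a 214.79 MW MAE is directly interpretable against typical demand levels, while MAPE normalizes each error by the actual demand at that time step.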


Introduction

Background
The electricity consumption within a society serves as a significant indicator of its sustainable development, encompassing both economic and environmental aspects. Consequently, ongoing research aimed at addressing electricity consumption issues is crucial for supporting various domains [1]. Accurate forecasting is a critical need in electricity markets. Additionally, it plays a significant role in ensuring the security of the power system [2]. There are typically three types of electricity load forecasting based on the forecasting period: short-term load forecasting (STLF, covering minutes to one day ahead), medium-term load forecasting (MTLF, spanning weeks to months), and long-term load forecasting (LTLF, projecting years ahead) [3]. Short-term load forecasting is useful for utilities in predicting energy demand during peak hours and preparing for potential supply shortages. Additionally, it assists electricity market participants in managing energy trading activities and mitigating risks arising from sudden fluctuations in demand or supply [4]. It has been observed that 80% of electrical energy demand forecasting concerns STLF, while MTLF and LTLF account for 15% and 5%, respectively. STLF has the highest priority because of its critical role in the daily operational scheduling of power systems, guaranteeing grid stability and maintaining supply and demand in real time. While MTLF and LTLF are critical for strategic planning and infrastructure development, they are less urgent for immediate operational challenges, which justifies their more limited research focus [5]. Thailand's electricity supply comprises three main entities: the Metropolitan Electricity Authority (MEA), the Electricity Generating Authority of Thailand (EGAT), and the Provincial Electricity Authority (PEA), where EGAT manages power generation. Our paper focuses specifically on the metropolitan regions within Thailand [2].

Challenges
Short-term load forecasting plays a vital role in the energy industry, as precise predictions of future electricity demand are essential for maintaining the reliable and efficient operation of power systems [6]. Weather conditions, economic trends, consumer behavior, and unforeseen events such as natural disasters all have an impact on load data, leading to fluctuations that make accurate forecasting difficult, especially over long periods [7]. Deep learning models can independently learn detailed patterns and relationships while adjusting to variations in the data-generating process. However, the training and deployment of these models demand substantial amounts of data and computational resources [8]. Energy systems are constantly evolving, influenced by policy changes, technological advancements, and market fluctuations. Researchers are consistently attempting to enhance existing forecasting models to mitigate the occurrence of both under- and over-forecasting, depending upon the effects of these influential parameters [9]. In this context, researchers face significant challenges in minimizing prediction errors. Their task involves developing sophisticated models and thoroughly testing them across diverse datasets, including commercial, residential, and collective loads [10].

Load Forecasting Models
Conventional statistical models include regression models, such as linear regression, multiple regression, stepwise regression, logistic regression, and polynomial regression models [11], while moving average (MA), autoregressive moving average (ARMA), autoregressive integrated moving average (ARIMA), autoregressive integrated moving average with exogenous inputs (ARIMAX), and seasonal autoregressive integrated moving average (SARIMA) models are popular statistical forecasting models [12]. These models are useful for non-real-time forecasting, but they lack accuracy with respect to real-time load and cannot capture the consumption of non-linear loads [13].
Short-term electrical demand forecasting powered by AI is crucial for optimizing energy generation and distribution, allocating resources effectively, and maintaining power network stability [14]. AI-based models, including traditional machine learning (ML) and deep learning (DL) models, were introduced to handle non-linear relationships in load consumption. Traditional machine learning methods, such as support vector machines (SVM), random forests (RF), and decision trees (DT), can capture complex correlations between the several factors affecting energy consumption. ML models also include artificial neural networks (ANN), particle swarm optimization algorithms (PSO), genetic algorithms (GA), fuzzy logic, and expert systems [15]. Feed-forward neural networks (FNNs) simulate complex relationships between inputs and outputs by adjusting parameters. However, they can suffer from overfitting and may not always find the best solution, becoming stuck in a local optimum. To mitigate this, backpropagation learning is often used in FNN training [8]. To overcome the limitations of ML on complex problems, DL models were introduced, such as the deep neural network (DNN), the recurrent neural network (RNN), long short-term memory (LSTM), the convolutional neural network (CNN), gated recurrent units (GRU), and the deep belief network (DBN), all of which have been used for STLF [16]. A modification of the RNN, the LSTM network was created by Hochreiter and Schmidhuber in 1997 to address the vanishing and exploding gradient difficulties in RNNs [17]; it handles time series problems by retaining input information over long sequences. The GRU was likewise introduced to address the vanishing and exploding gradients that arise in RNNs from long-term dependencies, using gated cells as in the LSTM; it is thus an improvement built on the RNN.
This research addresses short-term electrical demand forecasting using a half-hourly recorded demand dataset from Thailand. There is considerable previous work on demand data from EGAT, applying different models to all or part of a 5-year period (1 March 2009 to 31 December 2013), as summarized in the table in Section 2. We used a dataset from EGAT covering about 7 years (1 January 2012 to 31 December 2018), with the same half-hourly demand pattern. Identifying the optimal DNN model based on forecasting accuracy is the primary goal of this work. This paper makes the following significant contributions:

• Based on testing and validation accuracy, a comparative analysis of deep neural networks (simple RNN, LSTM, and GRU) is discussed.

• Splitting the complete 7-year demand dataset into three distinct scenarios (working days, non-working days, and the entire dataset), based on shared demand characteristics, enhances prediction accuracy compared to prior studies using Thai datasets.

The remainder of this paper unfolds as follows: Section 2 explores related literature; Section 3 describes our modeling approach, including the principles supporting DNNs and the methodology for model estimation; Section 4 states the features of the electricity demand data and examines pertinent variables; Section 5 determines the model framework, details the expansive experimental setup, and closely examines the forecast accuracy and model fit quality using the Thai dataset; Section 6 furnishes the findings along with comprehensive discussions; and Section 7 concludes the paper.

Related Works
The nature of the data, the accessibility of previous observations, and the forecasting goal all affect the choice of a time series model. Building efficient time series models requires careful consideration and analysis of trends and seasonality, feature engineering, and model selection [15]. Kernel-based multitask learning has been shown to be effective in addressing electricity demand forecasting difficulties. Fiot et al. developed kernels that consider various seasonal patterns to specifically address the complicated task of forecasting individual customer demand [18]. Dudek et al. presented the principles of pattern-similarity-based methods for short-term load forecasting and similar-pattern-based local linear regression models using Polish power system data [19]. The proposed stepwise and lasso regression methods were suggested to decrease the number of predictors and showed superior performance compared to other models such as ARIMA, ES, ANN, and the Nadaraya-Watson estimator. Ismail et al. [20] used an MLR model with a day-ahead forecast MAPE of 1.71% to examine the effects of temperature and holiday types. In addition to artificial intelligence, linear regression was contrasted with other conventional and adaptive techniques for selecting weather variables and energy prediction, such as seasonal ARIMA (SARIMA) and regression ARIMA (RegARIMA) [20]. Lusis et al. evaluated calendar effects and forecast granularity of STLF using regression trees, neural networks, and SVR, with root mean square error (RMSE) and normalized RMSE as forecast error metrics [10]. Zhang et al. [21] proposed improved adaptive rules for the genetic algorithm, which demonstrated superior performance compared to the traditional genetic algorithm (GA) in optimizing support vector regression (SVR) and enhancing forecasting accuracy for hourly electricity demand; it also outperformed the extreme learning machine (ELM) model and various artificial neural network models [21]. Certain scholars have improved the performance of machine learning models by enhancing training schemes, iterations, hyperparameter tuning, and other techniques. Khwaja et al. proposed the boosted neural network (BooNN) model, which combines a set of ANNs iteratively, and compared it with a simple ANN and bagged neural networks (BagNN) [22]. Similarly, they proposed jointly bagged-boosted ANNs that train an ensemble of multiple ANN models in parallel, creating individual ANNs and combining them; the results of the proposed model were then compared with those of a single ANN, BooNN, and BagNN [23].
The attraction of RNN-based LSTM and GRU networks among researchers is due to their capacity to manage sequential data and dependencies over time and to discover complicated patterns in the data [24,25]. Recent improvements in computation and algorithms have made DNNs the primary approach for forecasting electricity demand, owing to their ability to enhance the feature-abstraction capability of the model. Chen et al. presented a modified deep residual network with a two-stage ensemble strategy to enhance generalization capability [26]. Compared with other models, deep learning (DL) approaches possess outstanding abilities in managing complex non-linear relationships, model complexity, and computational efficiency. Din et al. investigated the effectiveness of feed-forward DNN and recurrent deep neural network (R-DNN) models in terms of accuracy and computational efficiency, and their results show that the use of transfer-function feature analysis with DNNs enables higher accuracy [27]. Li et al. [28] assessed the efficacy of LSTM and FNN models for electricity demand prediction, scrutinizing their precision and resilience; their work underscores the LSTM's strength in capturing complicated long-term patterns in electricity demand data [28]. Hybrid DNN models have also been used in electricity demand prediction in addition to LSTM and FNN. To improve the reliability of short-term electricity demand forecasting, Stosovic et al. recommended five distinct recurrent neural network architectures. In terms of accuracy, this study revealed that the GRU and bidirectional LSTM models surpassed the conventional FNN and RNN models, indicating the potential of combining several machine-learning techniques for enhanced forecasting performance [29]. Xiao et al. applied a hybrid forecasting model for electrical power prediction that incorporates ANNs, including BPNN, the generalized regression neural network (GRNN), the Elman neural network, and an optimized (GA-BP) neural network, to half-hourly electrical power data from the states of Victoria and New South Wales in Australia [30]. The ensemble approach has lately gained popularity among researchers because of its demonstrated improvements in prediction accuracy. A survey of demand forecasting models for power system management was conducted in [31]. Gao et al. [32] introduced an adaptive deep-learning load forecasting framework, Adaptive-TgDLF, integrating the transformer and domain knowledge; it effectively combines the transformer model with human insights to enhance short-term electrical load forecasting and introduces adaptive learning to tackle data scarcity and changing load patterns, marking a significant advancement in this area of research [32]. Researchers have also explored model predictive control for adjusting power production and distribution strategies in response to predicted demand, thus preventing delays and optimizing resource use. By precisely predicting short-term load demand, utilities can enhance grid stability, reduce costs, and boost efficiency [33,34].
STLF works for Thailand appear to have surfaced only recently. Nevertheless, numerous studies have been conducted in Thailand to forecast power consumption using various approaches and strategies. Several studies [2,35–39] utilizing data sourced from the Bangkok metropolitan region, recorded by the Electricity Generating Authority of Thailand (EGAT) between 2009 and 2013, have been published.
Chapagain and Kittipiyakul implemented calendar elements, including the year, month, day, and hour, as well as seasonal elements such as holidays and special days, which were considered the factors driving demand in [2], and used them together with other crucial variables to construct numerous predictive models. These elements are typically referred to as deterministic variables. Compared to weekend demand, weekday demand exhibits dramatically distinct patterns. Due to industrial-sector activity on weekdays, weekday demand is much stronger and more consistent than weekend demand. Both weekend days and weekday holidays have significantly lower energy demand than regular weekdays. To identify the various day-of-week patterns, dummy variables were created representing each day of the week as a weekday, weekend, or holiday (whether falling on a weekday or weekend) [40]. Various researchers have generated noteworthy findings based on the identical EGAT dataset spanning 2009 to 2013. These findings are summarized in Table 1.
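The day-type tagging described above, which also underlies the split into the three scenarios, can be sketched as a small helper; a minimal Python example, where the holiday set is hypothetical (a real application would load the full Thai public-holiday calendar):

```python
from datetime import date

# Hypothetical holiday set for illustration only; in practice the full
# Thai public-holiday calendar would be used.
HOLIDAYS = {date(2012, 1, 2), date(2012, 4, 13)}

def day_type(day):
    """Classify a calendar day as 'weekday', 'weekend', or 'holiday',
    mirroring the dummy variables described above. Holidays take
    precedence whether they fall on a weekday or a weekend."""
    if day in HOLIDAYS:
        return "holiday"
    if day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return "weekend"
    return "weekday"

# Scenario 2 (working days) keeps 'weekday'; Scenario 3 collects the rest.
print(day_type(date(2012, 4, 13)))  # Songkran -> 'holiday'
print(day_type(date(2012, 4, 14)))  # Saturday -> 'weekend'
print(day_type(date(2012, 4, 16)))  # Monday   -> 'weekday'
```

In practice each half-hourly sample would inherit the tag of its calendar day before the dataset is partitioned.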
Table 1. List of articles that were published using the EGAT dataset [41].

Theoretical Background
This section presents the theoretical background of all terminologies and an explanation of the proposed model. Deep learning techniques, namely RNN, RNN-LSTM, and RNN-GRU, are utilized to forecast future short-term electricity demand. Forecasting electrical load presents distinct challenges, including seasonality, periodicity, non-linearity, and sequential interdependence within consumption data sequences, which deep learning architectures are especially well-suited to address. By using non-linear transformations and extracting high-level abstractions, deep learning models, as opposed to shallow ANN architectures, can automatically learn complicated temporal patterns [43].

Deep Neural Network
The DNN is a neural network with an input layer, an output layer, and many (more than two) hidden layers. It includes an input layer (x_i), an output layer (y_i), and more than two hidden layers (h_i), as shown in Figure 1. It trains as a feed-forward network, generating the corresponding output values through all hidden layers and neuron nodes in forward propagation. The non-linear activation function f takes a weighted sum of the input X values and returns an output Y value, as indicated in Equation (1), where W^l_{jk} is the weight between node j in layer (l − 1) and node k in layer l:

y_k = f(Σ_j W^l_{jk} x_j + b_k)  (1)
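Equation (1) can be made concrete with a minimal pure-Python sketch of a single feed-forward layer; the sigmoid activation and the weight values are illustrative choices, not the paper's trained model:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """One feed-forward layer, as in Equation (1): node k outputs
    f(sum_j W[j][k] * x[j] + b[k]), where weights[j][k] connects node j
    of layer l-1 to node k of layer l, and f is a sigmoid here."""
    return [sigmoid(sum(w_row[k] * x for w_row, x in zip(weights, inputs)) + biases[k])
            for k in range(len(biases))]

# Toy layer: 2 inputs -> 2 nodes, with illustrative (untrained) weights.
x = [1.0, 0.5]
W = [[0.2, -0.4],   # weights leaving input node 0
     [0.6,  0.1]]   # weights leaving input node 1
b = [0.0, 0.0]
h = layer_forward(x, W, b)
```

A full DNN forward pass would simply chain such layers, feeding each layer's output `h` as the next layer's input.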

Recurrent Neural Network
The RNN is defined as an extension of a feed-forward neural network (FNN), with feedback connections that can detect short-term non-linear time relationships. Williams and Zipser created the first RNN in the late 1980s, a time of rapid advancement in neural network research that resulted in multiple essential discoveries [44]. An RNN computes new states by recursively applying transfer functions to previous states and inputs. This process determines how information is passed between neurons and is critical for creating a learning system that can perform well under different conditions [45].
In terms of prediction, an RNN is trained using input data x(t) to generate a desired output y(t). Figure 2 illustrates the unrolled architecture of an RNN across the time dimension t with N layers. Each time step t features neurons in the hidden layer, which can be viewed as a single layer within an FNN. At time step t, the network neurons at the l-th layer update their shared states using the following equations [46]:

h^l(t) = f(α^l(t)), for l = 1, 2, ..., N,
α^l(t) = W_x h^{l−1}(t) + W_h h^l(t − 1) + b, with h^0(t) = x(t),
L = loss(y(t), y_target(t)),

where x(t) denotes the input data at the t-th time step; y(t) represents the corresponding prediction; y_target(t) signifies the true values of the output targets; h^l(t) indicates the shared states of the l-th network layer at time step t; and α^l(t) depicts the input value of the l-th layer at time step t, comprising the t-th time step input x(t) (or the shared state h^{l−1}(t) from the (l − 1)-th layer), the bias b, and the shared states h^l(t − 1) at the current network layer l from the previous time step (t − 1).
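The shared-state update above can be made concrete with a minimal pure-Python sketch of a single-layer RNN step (tanh activation and illustrative, untrained weights; not the paper's model):

```python
import math

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One recurrent time step for a single layer:
    h(t) = tanh(W_x @ x(t) + W_h @ h(t-1) + b)."""
    return [math.tanh(sum(W_x[i][j] * x_t[j] for j in range(len(x_t)))
                      + sum(W_h[i][j] * h_prev[j] for j in range(len(h_prev)))
                      + b[i])
            for i in range(len(b))]

# Unroll over a short sequence; the hidden state carries information
# from one step to the next (illustrative weights, 1 input, 2 hidden units).
W_x = [[0.5], [-0.3]]
W_h = [[0.1, 0.0], [0.0, 0.1]]
b = [0.0, 0.0]
h = [0.0, 0.0]
for x in [[1.0], [0.5], [0.2]]:
    h = rnn_step(x, h, W_x, W_h, b)
```

Training would backpropagate the loss L through this unrolled loop, which is exactly where the vanishing and exploding gradient problems discussed below arise.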



RNN-Based Long Short-Term Memory
Despite extensive research, the non-stationary demand data and the long-term forecasting horizon of the STLF make it difficult to obtain an accurate estimate. The capacity of the LSTM to identify long-term dependencies in a time series is its key benefit. The LSTM representation was chosen as one input because it contains mechanisms that account for the sequential character of time series. Therefore, we utilized the LSTM, a specialized form of RNN architecture, to tackle the short-term load forecasting challenge [47,48].
The memory cell, which acts as an accumulator of the state information, is the primary innovation of the LSTM. The three gate controllers function as follows [49]:

• Forget gate f_t decides which part of the long-term state c_{t−1} should be omitted.
• Input gate i_t controls which part of c̃_t should be added to the long-term state c_t.
• Output gate o_t determines which part of c_t should be read and output to h_t and y_t.

The forget gate is used to determine which information to remove from the cell state, as seen in Figure 3.

Finally, the output is determined in two steps. First, a sigmoid layer is utilized as an output gate to selectively filter the cell state. Subsequently, the cell state is passed through a tanh function and multiplied by the output gate o_t to obtain the desired information.

f_t = σ(W_f · [h_{t−1}, x_t] + b_f)
i_t = σ(W_i · [h_{t−1}, x_t] + b_i)
c̃_t = tanh(W_c · [h_{t−1}, x_t] + b_c)
c_t = f_t ⊙ c_{t−1} + i_t ⊙ c̃_t
o_t = σ(W_o · [h_{t−1}, x_t] + b_o)
h_t = o_t ⊙ tanh(c_t)
W i , W f , W c , and W o represent the appropriate weight matrices.The vectors b i , b f , b c , and b o denote the corresponding bias vectors.The most significant difficulty associated with the LSTM recurrent neural network lies in the process of adjusting its hyperparameters, such as determining the appropriate number of hidden layers, the nodes within each layer, the batch size, the number of epochs, and the learning rate, and optimizing the connection weights, biases of the network, etc. [51].
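These gate equations can be illustrated with a minimal pure-Python sketch of one LSTM step using scalar states; the parameter values are illustrative, not trained values from the paper:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step with scalar state, following the gate equations above.
    p holds the input/recurrent weights (w_*x, w_*h) and biases (b_*)."""
    f = sigmoid(p["wfx"] * x_t + p["wfh"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wix"] * x_t + p["wih"] * h_prev + p["bi"])    # input gate
    c_tilde = math.tanh(p["wcx"] * x_t + p["wch"] * h_prev + p["bc"])
    c = f * c_prev + i * c_tilde                                  # new cell state
    o = sigmoid(p["wox"] * x_t + p["woh"] * h_prev + p["bo"])    # output gate
    h = o * math.tanh(c)                                          # new hidden state
    return h, c

# Illustrative parameters (untrained).
params = {"wfx": 0.1, "wfh": 0.1, "bf": 0.0,
          "wix": 0.2, "wih": 0.1, "bi": 0.0,
          "wcx": 0.5, "wch": 0.1, "bc": 0.0,
          "wox": 0.3, "woh": 0.1, "bo": 0.0}
h, c = lstm_step(1.0, 0.0, 0.0, params)
```

Because the cell state c is updated additively (gated by f_t and i_t) rather than repeatedly squashed, gradients can flow across many time steps, which is the mechanism behind the LSTM's long-term memory.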

RNN-Based Gated Recurrent Unit
The GRU, a variant of the LSTM, was proposed by Cho et al. (2014) [52] to make each recurrent unit adaptively capture dependencies at different time scales. Like the LSTM unit, the GRU has gating units that modulate the flow of information inside the unit, albeit without separate memory cells [52]. Figure 4 shows the details of the GRU.

The diagram illustrates a GRU in which each line represents a complete vector, connecting the output of one node to the input of another. The light pink circles represent mathematical operations, while the yellow boxes denote the neural network layers. The direction of the lines indicates the flow of the contents. The GRU layer is based on the LSTM layer and has similar equations [54]:

z_t = σ(W_z · [h_{t−1}, x_t])
r_t = σ(W_r · [h_{t−1}, x_t])
h̃_t = tanh(W · [r_t ⊙ h_{t−1}, x_t])
h_t = (1 − z_t) ⊙ h_{t−1} + z_t ⊙ h̃_t

The function of the reset gate r_t is to decide how to combine the new input with the previous memory, while the update gate z_t determines the proportion of the memory that should be retained. This gating mechanism is analogous to that used in the LSTM, as it aims to capture long-term dependencies [55].
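The GRU update can be sketched in the same minimal fashion; a pure-Python single-step example with illustrative scalar parameters (not trained values):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x_t, h_prev, p):
    """One GRU step with scalar state, following the equations above."""
    z = sigmoid(p["wzx"] * x_t + p["wzh"] * h_prev)            # update gate
    r = sigmoid(p["wrx"] * x_t + p["wrh"] * h_prev)            # reset gate
    h_tilde = math.tanh(p["whx"] * x_t + p["whh"] * (r * h_prev))
    return (1.0 - z) * h_prev + z * h_tilde                    # new hidden state

# Illustrative parameters (untrained).
params = {"wzx": 0.2, "wzh": 0.1,
          "wrx": 0.3, "wrh": 0.1,
          "whx": 0.5, "whh": 0.2}
h = gru_step(1.0, 0.0, params)
```

Note that the GRU maintains a single state h and uses two gates instead of the LSTM's three, which reduces the number of parameters per unit while keeping the gated, additive state update.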

Electricity Demand Pattern of Dataset
Our study's scope includes the Thailand metropolitan area, which comprises Pathum Thani, Nonthaburi, and Nakhon, as well as Bangkok and the neighboring provinces. These provinces host an enormous number of enterprises, offices, universities, and industrial parks, all of which increase the overall energy demand. For the period of 1 January 2012 to 31 December 2018, we used half-hourly electrical energy demand data from EGAT (as in [56]). Based on meteorological information for the Bangkok metropolitan region, the cooling period typically spans March through September, with the highest temperatures and humidity occurring between April and August. Throughout this timeframe, there is a notable rise in electricity usage, primarily for cooling needs such as air conditioning. There is still a noticeable demand for space heating during the cooler period from December to February. Additionally, there is a consistent need for hot water heating throughout the year in both residential and commercial settings [57]. According to electricity usage data, buildings, including residential, business, and institutional areas, consume roughly 70% of Bangkok's total demand. The remaining 30% of the city's electrical consumption comes from the industrial sector, which includes manufacturing and production [58]. As Figure 5 shows, a rolling window of 365 samples was used for the moving average to track the data's consistent fluctuation over time. This plot shows that seasonality influences total demand growth, which follows a linear trend.
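The rolling moving average used in Figure 5 can be sketched as a trailing-window mean; a minimal pure-Python example on a toy series (the paper uses a 365-sample window on the demand data):

```python
def moving_average(series, window):
    """Trailing moving average over the given rolling window; entries before
    a full window is available are None, as in a typical rolling-mean plot."""
    out = [None] * len(series)
    running = 0.0
    for i, v in enumerate(series):
        running += v
        if i >= window:
            running -= series[i - window]   # drop the value leaving the window
        if i >= window - 1:
            out[i] = running / window
    return out

# Toy series with window 3.
print(moving_average([1.0, 2.0, 3.0, 4.0, 5.0], 3))  # [None, None, 2.0, 3.0, 4.0]
```

Maintaining a running sum keeps the cost linear in the series length, which matters for seven years of half-hourly samples.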


Weekdays, Weekends, and Holidays Patterns
Every year, during holidays and other special occasions, there is a discernible drop in electricity demand. Figure 6a,b shows that extended holidays, such as the Songkran holiday in the second week of April and the New Year's holiday in the first week of January, have a major effect on demand. In the case of the New Year holiday, demand starts to decrease from Christmas (25 December) through the first week of January, as shown in Figure 6b. These effects are important to model and pose a considerable challenge to researchers aiming for high forecasting accuracy. Furthermore, though they have less of an impact than Songkran and New Year's, holidays like Makha Bucha Day (February), Visakha Bucha Day (May), and Father's Day (December) also cause swings in electricity demand.


Monthly, Weekly, and Daily Patterns
The demand value varies each month according to the season and the holidays observed during those months. Demand was at its peak during the summer, which runs from March through September, although it was lower in April due to Songkran, a lengthy holiday that fell during the second week of the month. The months of October through February are somewhat cooler than the summer months; thus, there is less demand during this period. Figure 7a shows the half-hourly demand pattern for the first week of April 2012, which illustrates that Saturday and Sunday (the weekend) have low electricity demand compared to working days (weekdays) because manufacturing plants, offices, and industrial loads dominate residential demand in our research area. On weekends and holidays, residential demand, driven by human behavior, fluctuates unpredictably, affecting forecasting accuracy. Weekdays show consistent midday and evening demand patterns, unlike weekends and holidays. Late-night and early-morning demand remain consistent across weekdays and weekends (Figure 7b).


Temperature vs. Demand
Thailand is a tropical nation with high average temperatures; as such, temperature is the primary factor influencing electricity consumption in Thailand. Figure 8 illustrates the clear correlation between electricity demand and temperature. In summer, higher AC usage boosts electricity demand for cooling, while in winter, lower temperatures lead to a slight decrease in demand. Temperature's impact on electricity demand on holidays and non-holidays at midnight and at the peak hour is depicted in Figure 9b,c, respectively, where orange dots denote holiday demand and blue dots denote non-holiday demand. Figure 9c describes how temperature affects demand for both holidays and non-holidays during the peak hour (2 pm), where demand varied significantly on holidays during the peak hour period. During working days, demand for electricity is sharp and linear, and the influence of temperature on peak demand is negligible. This fact confirms the assertion made in article [59] that temperature has a negligible effect on commercial demand. When the temperature falls below 30 °C or rises above 35 °C, there is significant fluctuation in demand during holidays. Moreover, Figure 9a shows the electrical power consumption at two distinct hours, 2 pm (blue dots) and 11 pm (orange dots), comparing the relationship between peak electricity demand and temperature during these two time periods. In particular, Figure 9a indicates that, compared to the demand at 2 pm, the demand at 11 pm shows a more linear relationship with temperature. The same linear relationship is illustrated for both holidays and non-holidays at 11 pm in Figure 9b.
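The correlation visible in Figure 8 can be quantified with a Pearson correlation coefficient. A minimal sketch on toy numbers (the values below are illustrative, not the EGAT series):

```python
import numpy as np

# Toy half-hourly samples for illustration only; the real analysis
# would use the EGAT temperature and demand series behind Figure 8.
temp_c = np.array([26.5, 28.0, 30.5, 31.0, 33.0, 35.5])
demand_mw = np.array([5600.0, 6100.0, 6800.0, 7000.0, 7600.0, 8300.0])

# Pearson correlation between temperature and demand.
r = np.corrcoef(temp_c, demand_mw)[0, 1]
```

For data this close to linear, `r` is strongly positive, mirroring the near-linear 11 pm relationship described above.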

System Model
This article provides three types of deep learning models with hyperparameter tuning for short-term electrical demand forecasting. The proposed approaches are described in this section and are shown in Figure 10. The suggested methods are organized into four frameworks: data pre-processing, model design with training data, fine tuning and testing, and performance analysis. Data normalization, structural transformation, and outlier removal were part of the preprocessing of the raw data.
Large values can hinder the deep neural network's learning and convergence, so feature scaling was used to scale the data in the range of 0 to 1. All three deep learning models are trained using the training dataset. The model learns the input data and the associated target values to identify patterns and relationships. To reduce the difference between the predicted and the original target values, the model adjusts its internal variables (weights and biases) according to the input data. The validation dataset is used to observe the model's effectiveness during training and to adjust the hyperparameters. It directs the model selection process and helps avoid overfitting. After training is finished, the testing dataset is used to assess how well the model performs on unseen data.
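The 0-to-1 feature scaling described above can be sketched as a simple min-max transform; the demand values below are illustrative, not EGAT data:

```python
import numpy as np

def min_max_scale(x, lo=None, hi=None):
    """Scale values into [0, 1]; also return the (lo, hi) pair
    needed to invert the transform after forecasting."""
    lo = np.min(x) if lo is None else lo
    hi = np.max(x) if hi is None else hi
    return (x - lo) / (hi - lo), (lo, hi)

demand = np.array([5200.0, 6100.0, 7450.0, 6900.0])  # MW, illustrative
scaled, (lo, hi) = min_max_scale(demand)

# Invert the scaling to recover forecasts in MW.
restored = scaled * (hi - lo) + lo
```

Keeping the `(lo, hi)` pair is what allows predictions made in scaled space to be reported back in megawatts.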

Data Preparation
EGAT, the largest company in the Thai electricity market, provided the dataset used in this research. EGAT operates 100% of the transmission system and nearly 50% of the generation. EGAT divides the nation into five regions and maintains separate records of the electric loads for each [41]. The dataset spans seven years (from 1 January 2012 to 31 December 2018) and contains 122,736 instances of half-hourly electricity demand (in MW). The preprocessed data also include deterministic, historical load, and interaction variables. From the entire dataset, 5 years (from 1 January 2012 to 31 December 2016) were selected as the training dataset, 1 year (from 1 January 2017 to 31 December 2017) as the validation dataset, and the remaining 1 year (from 1 January 2018 to 31 December 2018) as the testing dataset. The dataset was split into three groups, known as scenarios, unlike the previous research in [2,41], in which the data were divided into four groups. Splitting the dataset proved effective in simplifying the modeling and yielded a significant accuracy improvement compared with the related work.
Our available dataset comprised half-hourly (HH) 7-year data with 122,736 instances. Hence, the numbers of training, validation, and test instances for each scenario were as follows. Scenario 1 (all-day demand): training = 87,696; validation = 17,520; test = 17,520. Scenario 2 (only demand for weekdays, i.e., demand without Saturday, Sunday, and other holidays): training = 58,848; validation = 11,760; test = 11,808. Scenario 3 (only demand for non-working days, i.e., demand for weekends and holidays): training = 28,848; validation = 5808; test = 5664.
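The chronological split described above can be sketched as follows; the frame construction and column name are assumptions, but the resulting instance counts match those reported for Scenario 1:

```python
import pandas as pd

# Illustrative reconstruction of the 7-year half-hourly index
# (2012-2018); "demand_mw" is a placeholder column name.
idx = pd.date_range("2012-01-01 00:00", "2018-12-31 23:30", freq="30min")
df = pd.DataFrame({"demand_mw": 0.0}, index=idx)

# Chronological (non-shuffled) split, as in the paper.
train = df.loc["2012":"2016"]  # 5 years
val = df.loc["2017"]           # 1 year
test = df.loc["2018"]          # 1 year
```

Because 2012 and 2016 are leap years, the full index holds 2557 days × 48 half-hours = 122,736 rows, of which 87,696 fall in the training years.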

Forecast Horizon
The EGAT system in Thailand forecasts conditions for 2 pm on the next day, along with 10 to 34 h in advance. Since the EGAT office is not open on weekends, EGAT must forecast up to Monday from Friday. Thailand usually uses data up to 106 h ahead for short-term forecasting, particularly during extended holidays. However, the scope of this study is restricted to forecasting for the following day utilizing data up to 2 pm. To simplify the model, Figure 11 displays the forecast horizon and lag terms specific to Thailand's practice. Adding dummy variables and their interactions has been recommended by earlier research to increase forecasting accuracy. Ramanathan et al. [60] suggested the usage of four types of dummy variables, namely deterministic, temperature, load, and past errors, and their interactions. The term "deterministic" describes predictable variables, like the day of the week, the month, and the year [60]. The variables were grouped into two categories, deterministic and lagged, as presented in Table 2.
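Deterministic calendar variables of the kind grouped in Table 2 can be sketched as one-hot dummies; the column names below ("month", "hh", "dow") are hypothetical stand-ins, not the paper's exact variable names:

```python
import pandas as pd

# One week of half-hourly timestamps, for illustration.
idx = pd.date_range("2018-01-01", periods=48 * 7, freq="30min")
feats = pd.DataFrame(index=idx)

# Deterministic (fully predictable) calendar features.
feats["month"] = idx.month
feats["hh"] = idx.hour * 2 + idx.minute // 30  # half-hour slot 0..47
feats["dow"] = idx.dayofweek                   # 0 = Monday

# One-hot encode the day of week into dummy columns dow_0..dow_6.
feats = pd.get_dummies(feats, columns=["dow"], prefix="dow")
```

Lagged-load variables (such as the `load1d_cut2pm` term described under Forecast Horizon) would be appended alongside these deterministic columns.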

Hyperparameter Tuning
When creating deep learning models, hyperparameter tuning is an essential step. Hyperparameters are the configuration parameters with which a model is structured and trained. In contrast to parameters, which are learned from the data during training, hyperparameters are preset before training and direct the learning process. The goal of fine-tuning these hyperparameters is to identify the optimal combination that maximizes the model's performance [61]. We independently carried out the hyperparameter optimization procedure for every deep network in our experiment. These were the primary hyperparameters:
• Dropout: This serves as a regularization technique aimed at mitigating overfitting by randomly selecting cells within a layer and nullifying their output based on a specified probability.

•
Lookback period: The lookback period refers to the number of previous time steps that are taken into consideration by the model when generating predictions. The model becomes more complex when the lookback period is extended, but it also enables the model to capture longer-term dependencies and patterns. If the lookback period is too short, the model might miss important temporal relationships.
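The lookback mechanism can be illustrated with a small windowing helper (a generic sketch, not the authors' code):

```python
import numpy as np

def make_windows(series, lookback, horizon=1):
    """Build (samples, lookback) input windows and horizon-ahead
    targets from a 1-D series, as controlled by the lookback
    hyperparameter."""
    X, y = [], []
    for t in range(lookback, len(series) - horizon + 1):
        X.append(series[t - lookback:t])   # last `lookback` steps
        y.append(series[t + horizon - 1])  # value to predict
    return np.array(X), np.array(y)

X, y = make_windows(np.arange(10.0), lookback=3)
# The first window [0, 1, 2] targets the next value, 3.0.
```

With half-hourly data, a 5-day lookback corresponds to 5 × 48 = 240 steps per window, which is why longer lookbacks enlarge the model's input and its complexity.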

Performance Analysis
An essential step in determining a predictive model's efficacy is performance analysis. In this paper, mean absolute error (MAE) and mean absolute percentage error (MAPE) were used as metrics for performance analysis.
Performance analysis workflow:
• Model Training: Train three deep learning models using the training dataset.

•
Hyperparameter Tuning: Fine-tune different hyperparameters, which are mentioned in Section 5.3, based on performance metrics (MAE, MAPE) calculated on the validation dataset.

•
Model Evaluation: Evaluate the prediction performance of the completed trained model using the testing dataset.

•
Performance Metrics Calculation: ■ Use the trained model to make predictions for energy demand on the testing dataset.

■
Calculate the MAE and MAPE metrics using the actual demand values of the test dataset and the values predicted by the trained model.

■
These metrics give us information about the precision and accuracy of the predictions made by our final model.

Mean Absolute Error (MAE)
One useful key performance indicator (KPI) for evaluating forecast accuracy is the mean absolute error (MAE). To ensure that positive and negative residuals never cancel out, we compute the residual for each data point using only its absolute value. After that, we compute the mean of all these residuals. The MAE, which simply measures the absolute difference between the model's predictions and the actual demand data, is the most appropriate metric. Because the residual's absolute value is used, the under- or over-performance of the model is not displayed. The formal equation of MAE is shown below:

MAE = (1/n) Σ_{t=1}^{n} |y_t − F_t|

where n = total number of data points; y_t = actual electricity demand at a time "t"; F_t = predicted electricity demand at a time "t".

Mean Absolute Percentage Error (MAPE)
In percentage terms, MAPE is the analog of MAE. Like MAE, which represents the mean absolute error generated by the model, MAPE indicates the average deviation between the model's predictions and the corresponding actual values, expressed as a percentage. MAPE is a popular metric for assessing forecast accuracy in time series data and is computed as the average of the absolute percentage errors (APEs) between the predicted and actual values. The formal equation of MAPE is shown below:

MAPE = (100%/n) Σ_{t=1}^{n} |y_t − F_t| / y_t
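Both metrics are straightforward to compute; a minimal sketch on illustrative demand values (not EGAT data):

```python
import numpy as np

def mae(actual, forecast):
    """Mean absolute error, in the units of the data (here MW)."""
    return np.mean(np.abs(actual - forecast))

def mape(actual, forecast):
    """Mean absolute percentage error (%)."""
    return 100.0 * np.mean(np.abs(actual - forecast) / np.abs(actual))

# Illustrative actual and forecast demand, in MW.
y = np.array([10000.0, 10500.0, 9800.0])
f = np.array([10200.0, 10400.0, 9700.0])
```

MAE reports the error in megawatts, which is directly interpretable for grid operators, while MAPE normalizes by the actual demand so that results are comparable across scenarios of different demand levels.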

Results and Analysis
This section outlines the methodology used to empirically determine the hyperparameters for the RNN, RNN-LSTM, and RNN-GRU deep learning models across all three scenarios. We separated the entire dataset into three datasets: training, validation, and testing. This allowed us to fine-tune the model's hyperparameters and increase its efficacy by holding out a distinct validation dataset from the training data. Throughout the training phase, concurrent validation steps were executed using this validation dataset. The research used the hyperparameter values listed in Table 3. To test hyperparameter value sets for the RNN, RNN-LSTM, and RNN-GRU models, we assembled a list for all three scenarios. Initially, we tested the built-in model's validity and functionality using the parameters listed in Table 3. The set of parameters that produced the lowest validation mean absolute error (MAE) was chosen as the optimal or tuned parameter set after its performance was successfully verified. Tables 4, 6, and 8 show the complete sets of parameters for all three scenarios.
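The selection rule described above, picking the parameter set with the lowest validation MAE, can be sketched as a grid search. The candidate values and the toy scoring function below are illustrative stand-ins for actually training and validating each model:

```python
import itertools

# Hypothetical grid in the style of Table 3; the candidate values
# here are illustrative, not the paper's exact list.
grid = {
    "nodes": [64, 128],
    "layers": [1, 2, 3],
    "dropout": [0.15, 0.2],
    "lookback_days": [5, 8],
}

def validation_mae(params):
    # Stand-in for training a model with `params` and scoring it on
    # the validation year; here just a deterministic toy score.
    return abs(params["nodes"] - 64) + params["layers"] + params["dropout"]

# Choose the combination with the lowest validation score.
best = min(
    (dict(zip(grid, combo)) for combo in itertools.product(*grid.values())),
    key=validation_mae,
)
```

In the real study each candidate configuration would be trained to convergence and `validation_mae` would return the MAE on the 2017 validation year.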

Parameters Tuning for Scenario 1
The electricity demand for the year 2018 is forecasted utilizing the refined parameters illustrated in Table 4, with a dataset size of 105,216 and a test size of 17,520. Compared to the RNN-LSTM and RNN models, the RNN-GRU model performed better. We achieved the lowest minimum validation loss (MAE) of 213.36 MW, compared to 221.79 MW and 242.35 MW, using the RNN-GRU simulation setup. In Scenario 1, as stated in Table 4, we were able to attain the following outcomes:

•
The LSTM model with nodes = 128, layers = 1, dropout = 0.15, lookback period = 8 days, and epochs = 36 produced the minimal validation loss of 221.79 MW.
• The GRU model with nodes = 64, layers = 1, dropout = 0.15, lookback period = 5 days, and epochs = 42 obtained the minimal validation loss of 213.36 MW, which was the lowest loss across all three models.
• The RNN model with nodes = 64, layers = 3, dropout = 0.15, lookback period = 8 days, and epochs = 38 achieved the minimal validation loss of 242.35 MW, which was the highest loss across all three models.
When the optimized hyperparameters are used, Table 5 shows that the RNN-GRU model achieved its best performance in terms of both performance metrics (MAE and MAPE), with MAE = 214.79 MW and MAPE = 2.08%.

Parameters Tuning for Scenario 2
The electricity demand for the year 2018 is forecasted utilizing the refined parameters illustrated in Table 6, with a dataset size of 70,608 and a test size of 11,808. Compared to the RNN-LSTM and RNN models, the RNN-GRU model performed better. We achieved the lowest minimum validation loss (MAE) of 179 MW, compared to 189.87 MW and 195.11 MW, using the RNN-GRU simulation setup. In Scenario 2, as stated in Table 6, we were able to attain the following outcomes:
• The LSTM model with nodes = 128, layers = 1, dropout = 0.15, lookback period = 5 days, and epochs = 57 produced the minimal validation loss of 189.87 MW.

•
The GRU model with nodes = 64, layers = 1, dropout = 0.15, lookback period = 8 days, and epochs = 61 obtained the minimal validation loss of 179 MW, which was the lowest loss across all three models.
• The RNN model with nodes = 64, layers = 2, dropout = 0.2, lookback period = 8 days, and epochs = 51 achieved the minimal validation loss of 195.11 MW, which was the highest loss across all three models.
When the optimized hyperparameters are used, Table 7 shows that the RNN-GRU model achieved its best performance in terms of both performance metrics (MAE and MAPE), with MAE = 181.63 MW and MAPE = 1.89%. In Scenario 3, as stated in Table 8, we were able to attain the following outcomes:

Conclusions
Ensuring the equilibrium between electricity production and consumption is crucial for efficient power system operation. Precise load forecasting plays a key role in achieving this balance, as it enables the proactive anticipation of fluctuations in both supply and demand, thereby ensuring grid stability. This paper presented three types of deep learning models: RNN-LSTM, RNN-GRU, and RNN. They were each constructed to represent three different scenarios using the same seven-year half-hourly electrical energy demand data from EGAT for STLF. All three models, across the three scenarios, were validated using one year of data to optimize hyperparameters. These tuned parameters were then utilized for day-ahead forecasting of the demand for the year 2018.
In Scenario 1, the RNN-GRU model exhibited the lowest validation MAE loss. This model utilized a 5-day lookback period with 128 nodes in a single layer. It achieved the lowest test MAE loss of 214.79 MW and a MAPE of 2.08%. For Scenario 2, the RNN-GRU model achieved the lowest validation MAE loss using an 8-day lookback period with 64 nodes in a single layer, resulting in a test MAE loss of 181.63 MW and a MAPE of 1.89%. Meanwhile, in Scenario 3, the RNN-LSTM model obtained the lowest validation MAE with an 8-day lookback period and 128 nodes in a single layer. This model yielded a test MAE loss of 226.76 MW and a MAPE of 2.13%. Among the results obtained from the three models across the three scenarios, Scenario 2 emerges as the most promising, outperforming the other two scenarios. This superiority can be attributed to its exclusive focus on the weekday demand dataset, which exhibits consistent demand patterns. On the other hand, Scenario 1 encompasses the entire dataset, including weekdays, weekends, and holidays, leading to a higher degree of demand pattern non-linearity, which raises prediction errors. Scenario 3, which comprises only weekend and holiday demand data, shows improved results compared to previous research, likely due to the expanded dataset.
These findings suggest the potential for implementing STLF by categorizing datasets into three distinct scenarios. Scenario 2 can effectively forecast electricity demand for weekdays (Monday to Friday), while Scenario 3 proves valuable for forecasting demand during weekends and holidays (Saturday, Sunday, and public holidays), demonstrating enhanced accuracy. We did not utilize additional variables such as solar insolation, air humidity, cloud cover percentage, and daily temperature amplitude (maximum-minimum difference) in our study. However, we may consider incorporating these variables in future research to assess their impact on prediction accuracy.
In Scenario 3 (Figure A1), underestimation was observed in March and May, while overestimation was observed in September and November. From Figures A2 and A3, representing Scenarios 2 and 1 for predictions by all three models, it is apparent that the RNN-GRU model outperformed the other two models in both scenarios. There is underestimation in March and May, while September and November exhibit overestimation in Scenario 2, and we observed underestimation in March and May, with overestimation in July, September, and November, in Scenario 1.

Appendix B. Tuning of Hyperparameters for the Three Different Scenarios
The hyperparameter selection process for Scenario 1 is detailed in Table A1, presenting RNN-GRU's superior performance over the other two models. The hyperparameter selection process for Scenario 2 is detailed in Table A2, again presenting RNN-GRU's superior performance over the other two models. The hyperparameter selection process for Scenario 3 is detailed in Table A3, presenting RNN-LSTM's superior performance over the other two models. For each scenario, a range of parameters was tested for all three models to determine the optimal configuration, and the resulting validation and test MAE values for each parameter set are presented in the corresponding table.

Figure 1 .
Figure 1. Structure of the deep neural network.

Figure 6 .
Figure 6. (a) Demand profile of weekdays, weekends, and holidays for April 2014; (b) demand profile of weekdays, weekends, and holidays for December 2012 to 4 January 2013.

Figure 7 .
Figure 7. (a) Daily demand for one week; (b) average demand profile for weekday, weekend, and holiday patterns over 7 years.
Appl. Sci. 2024, 14, 3971

Figure 8 .
Figure 8. Correlation between temperature and electricity demand for the second week of March 2018.


Figure 9 .
Figure 9. (a) Effect of temperature at two different hours (2 pm and 11 pm); (b) effect of temperature during midnight (11 pm); and (c) effect of temperature during the peak hour (2 pm).
To predict demand on a given day, forecasting begins 10 h in advance, at 2 pm. The variable load1d_cut2pm then represents the historical demand data from HH = 0 to HH = 28 of d−1 (the preceding day) and HH = 29 to HH = 47 of d−2. Likewise, our model incorporates the two-day lagged demand, which is denoted by a parameter called load2d_cut2pm.
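A simplified version of such a lagged-load feature can be sketched with a one-day shift; note this omits the 2 pm cut-off logic described above, under which load1d_cut2pm mixes d−1 and d−2 values:

```python
import pandas as pd

# Three days of half-hourly data, for illustration only.
idx = pd.date_range("2018-01-01", periods=48 * 3, freq="30min")
df = pd.DataFrame(
    {"demand_mw": [float(i) for i in range(len(idx))]}, index=idx
)

# Day-ahead lag: demand at the same half-hour one day (48 slots)
# earlier; the first day has no earlier value and is left as NaN.
df["load_lag_1d"] = df["demand_mw"].shift(48)
```

Implementing the full load1d_cut2pm term would additionally replace the slots after the 2 pm cut-off with the corresponding d−2 values.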

Figure 12 .
Figure 12. Electricity demand (MW) on the y-axis and days on the x-axis using RNN-LSTM.

Figure 13 .
Figure 13. Electricity demand (MW) on the y-axis and days on the x-axis using RNN-GRU.

Figure 14 .
Figure 14. Electricity demand (MW) on the y-axis and days on the x-axis using RNN.

Figure A1 .
Figure A1. Electricity demand (MW) on the y-axis and days on the x-axis, utilizing Scenario 3.

Figure A2 .
Figure A2. Electricity demand (MW) on the y-axis and days on the x-axis, utilizing Scenario 2.

Figure A3 .
Figure A3. Electricity demand (MW) on the y-axis and days on the x-axis, utilizing Scenario 1.
➢ Forget gate f(t) decides which part of the long-term state c(t) should be omitted. ➢ Input gate i(t) controls which part of the candidate state c̃(t) should be added to the long-term state c(t). ➢ Output gate o(t) determines which part of c(t) should be read and output to h(t).
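These gate equations can be sketched as a single LSTM step in NumPy; this is a generic textbook formulation, not the authors' implementation, and the stacked parameter layout is an assumption:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W (4n, nx), U (4n, n), and b (4n,) hold the
    stacked parameters for the forget, input, candidate, and output
    blocks, where n is the hidden size."""
    n = h_prev.size
    z = W @ x + U @ h_prev + b
    f = sigmoid(z[0:n])        # forget gate: what to drop from c
    i = sigmoid(z[n:2 * n])    # input gate: what to add to c
    g = np.tanh(z[2 * n:3 * n])  # candidate state c~
    o = sigmoid(z[3 * n:4 * n])  # output gate: what to expose as h
    c = f * c_prev + i * g     # new long-term state
    h = o * np.tanh(c)         # new short-term state / output
    return h, c

rng = np.random.default_rng(0)
nx, nh = 3, 4
h, c = lstm_step(
    rng.normal(size=nx), np.zeros(nh), np.zeros(nh),
    rng.normal(size=(4 * nh, nx)), rng.normal(size=(4 * nh, nh)),
    np.zeros(4 * nh),
)
```

Because h is the product of a sigmoid output gate and tanh(c), each component of h stays strictly inside (−1, 1).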

•
Number of layers: The neural network's depth, i.e., its number of layers. Deeper networks can capture more complex patterns but also have the potential to overfit. Achieving the proper depth is critical.
• Epochs: The number of times the whole dataset is passed back and forth through the neural network during training. The number of epochs determines how frequently the model views the complete dataset. Too few epochs may result in underfitting, and too many may result in overfitting.
• Number of neurons in each layer: The quantity of neurons in every network layer regulates the model's ability to recognize patterns. Too few neurons can lead to underfitting, while too many can lead to overfitting.
• Batch size: The total number of training instances used in a single iteration, which impacts how quickly and how memory-efficiently training proceeds. Greater batch sizes may result in quicker training, but they also demand more memory.
• Learning rate: The step size taken while minimizing the loss function at each iteration. With a high learning rate, the model may converge quickly but can overshoot the minimum; a low learning rate can cause slow convergence.
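The tension noted above between too few epochs (underfitting) and too many (overfitting) is often managed with early stopping on the validation loss; the rule below is a toy illustration, not the paper's procedure:

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return (stop_epoch, best_epoch): training stops once the
    validation loss has not improved for `patience` epochs."""
    best, best_epoch, wait = float("inf"), 0, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0  # new best
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch  # stop: no recent improvement
    return len(val_losses) - 1, best_epoch

# Toy per-epoch validation losses: improvement, then degradation.
losses = [5.0, 4.0, 3.0, 3.5, 3.6, 3.7, 3.8, 3.9]
stop, best = early_stopping_epoch(losses, patience=5)
```

Under such a rule, the epoch count reported for each tuned model corresponds to the epoch with the best validation MAE rather than a fixed budget.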

Table 3 .
Sets of hyperparameters for all three scenarios.

Table 4 .
Sets of parameters for lowest validation loss of Scenario 1.

Table 5 .
Implementing the optimal parameter settings for day-ahead forecasting using the test dataset of Scenario 1.

Table 6 .
Set of parameters for lowest validation loss of Scenario 2.

Table 7 .
Implementing the optimal parameter settings for day-ahead forecasting using the test dataset of Scenario 2.

Parameters Tuning for Scenario 3
The electricity demand for the year 2018 is forecasted utilizing the refined parameters illustrated in Table 8, with a dataset size of 34,656 and a test size of 5664. Compared to the RNN-GRU and RNN models, the RNN-LSTM model performed better. We achieved the lowest minimum validation loss (MAE) of 222.07 MW, compared to 229.89 MW and 230.57 MW, using the RNN-LSTM simulation setup.

Table 8 .
Set of parameters for lowest validation loss of Scenario 3.

Table A1 .
Parameter variation and corresponding test and validation MAE outcomes for Scenario 1.


Table A2 .
Parameter variation and corresponding test and validation MAE outcomes for Scenario 2.

Table A3 .
Parameter variation and corresponding test and validation MAE outcomes for Scenario 3.