1. Introduction
Very short-term load forecasting (VSTLF) refers to predicting electricity demand over a time horizon ranging from several minutes to several hours ahead. VSTLF provides higher-resolution, real-time load profiles compared to conventional short-term load forecasting (STLF). Accurate VSTLF is essential not only for maintaining power system stability under rapid load fluctuations but also for optimizing generator dispatch and managing energy storage systems. Furthermore, VSTLF plays a pivotal role in electricity market operations. Real-time market activities, including price determination, market clearing, and generation scheduling, heavily depend on accurate STLF and VSTLF. Inaccurate forecasts can lead to price volatility, increased imbalance costs, and inefficient dispatch decisions, adversely affecting both electricity suppliers and consumers. Moreover, demand response programs rely on precise VSTLF to maximize their effectiveness. Enhancing forecasting accuracy enables market participants to optimize bidding strategies, mitigate financial risks, and improve overall market efficiency.
Numerous studies have investigated load forecasting in recent years. Load forecasting methods can generally be categorized into statistical approaches [1,2,3] and machine learning approaches [4,5,6]. Statistical techniques, such as the autoregressive integrated moving average (ARIMA) model [1], exponential smoothing [2], and least absolute shrinkage and selection operator (LASSO) regression [3], have been widely applied. However, their performance is limited when addressing the nonlinear and uncertain nature of load, especially in systems with high integration of variable renewable energy sources (RESs).
To overcome these limitations, various machine learning techniques have been explored, including neural networks [4], gradient boosting algorithms [5], and support vector machines (SVMs) [6]. These approaches have shown superior performance by effectively capturing complex nonlinear relationships between input variables. Among them, neural network-based methods such as deep neural networks (DNNs) [7], convolutional neural networks (CNNs) [8], recurrent neural networks (RNNs) [9], and attention-based models [10] have gained significant attention. Given the sequential nature of load data, RNN-based models, such as long short-term memory (LSTM) and gated recurrent unit (GRU) networks, as well as CNN-based models like temporal convolutional networks (TCNs), have been widely adopted for STLF. For instance, Kwon et al. [11] proposed STLF using LSTM. Lin et al. [10] applied LSTM with attention for STLF, while Kong et al. [12] used LSTM. Cai et al. [13] combined variational mode decomposition (VMD) with GRUs and TCNs to enhance forecasting accuracy. Hua et al. [14] adopted a CNN-GRU hybrid architecture combined with multi-head attention for STLF. He et al. [15] employed a deep deterministic policy gradient (DDPG) algorithm to tune GRU hyperparameters for STLF. Ahmad et al. [16] proposed TFTformer, a transformer-based model that integrates temporal convolution and feature-specific embeddings to enhance short-term load forecasting accuracy across diverse regional datasets.
Although these studies contribute to load forecasting, their direct applicability to VSTLF remains limited. Most of them focus on longer forecasting horizons and do not adequately address the rapid load fluctuations caused by high RES variability. These studies often rely solely on historical load and temperature as input features without employing a clear feature selection process, which limits their ability to capture real-time system dynamics. However, input feature selection is critical in load forecasting due to the strong dependency on past demand and meteorological variables, especially in VSTLF, where timely adaptation to changing conditions is essential. Furthermore, these studies focus on relatively small-scale power systems with peak loads of up to 15 GW, limiting their generalizability to large-scale systems such as the one considered in this study, which has a peak load of approximately 97 GW, including significant behind-the-meter (BTM) loads. In large-scale power systems, spatial heterogeneity in meteorological conditions becomes significant, necessitating the aggregation of data from multiple weather stations across regions to derive representative weather inputs. This introduces additional complexity, limiting the direct scalability of previous approaches to more realistic operational environments.
Several studies have specifically addressed VSTLF. Pati et al. [17] proposed a method based on incomplete fuzzy decision systems and genetic algorithms for small-city VSTLF. Wang et al. [18] utilized a combination of TCN and Light Gradient Boosting Machine (LGBM) to extract spatial–temporal features for load forecasting. Rafati et al. [19] applied artificial neural networks and SVMs in a photovoltaic (PV)-integrated microgrid. Jiang et al. [20] used a deep-autoformer model for household-level VSTLF. Zhang et al. [21] combined improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) and bidirectional LSTM (Bi-LSTM) for small-city VSTLF in China. Cheng et al. [22] developed a multi-head 1D convolutional block attention module for VSTLF in both Chinese and New England cities. However, most of these studies focus on one-step-ahead forecasting, which limits their usefulness for real-time system operations and market planning.
Some studies have explored multi-step VSTLF [23,24,25]. Vontzos et al. [23] compared multilayer perceptron (MLP), LSTM, and Bi-LSTM to forecast six future steps at five-minute intervals for a Greek airport building. Wang et al. [24] forecasted 60 steps ahead at one-minute intervals for a Chinese substation by combining Prophet-based decomposition and VMD for feature extraction, followed by models such as GRU, RNN, LSTM, and TCN. Zhao et al. [25] proposed a diffusion–attention-enhanced temporal (DATE-TM) model that integrates multiple features to perform 24-step-ahead household VSTLF at 1 min intervals. Yang et al. [26] developed hybrid CNN-GRU and TCN-GRU models for 15 min-ahead VSTLF at a 1 min resolution, incorporating SHAP-based interpretability to assess feature importance. While these efforts demonstrate multi-step forecasting, their scope is limited to small-scale areas such as households and buildings, making them less applicable to large-scale power system operations and real-time market applications.
Although the aforementioned studies contribute to the development of multi-step VSTLF methods, they primarily focus on load forecasting without explicitly addressing the distortions introduced by BTM solar generation. In power systems with widespread deployment of distributed PV resources, the observed load often significantly deviates from the actual system demand due to BTM consumption. This discrepancy poses a major challenge for accurate forecasting and system operation. To address this challenge, Tziolis et al. [27] proposed a short-term net load forecasting model for solar-integrated distribution systems using Bayesian neural networks, incorporating solar irradiance as an input feature to implicitly account for PV generation variability. Kerkau et al. [28] developed a day-ahead net load forecasting model for renewable integrated buildings using XGBoost, with solar irradiance as a key input variable to capture the impact of rooftop PV generation. Yuan et al. [29] proposed a net load forecasting model based on LSTM networks for distribution grid planning, utilizing solar irradiance to reflect the variability in distributed PV generation. While these approaches consider the impact of solar generation, they rely solely on external features such as irradiance and PV output to approximate BTM effects. Consequently, they may fail to fully capture the distortions caused by self-consumption of BTM solar generation. In contrast, Bae et al. [30] proposed an XGBoost-based day-ahead forecasting algorithm that reconstructs load profiles by combining measured net load with estimated BTM solar generation, thereby enabling more accurate forecasting in high PV-integration environments. Unlike studies that indirectly account for PV variability through solar irradiance inputs, this method directly reconstructs gross load to mitigate the masking effects of BTM generation. Although the reconstituted load approach enhances day-ahead forecasting accuracy, its applicability to VSTLF remains to be investigated to assess robustness and performance in very short-term scenarios.
To address the limitations of existing VSTLF research, this paper proposes a novel multi-step VSTLF model based on a GRU integrated with an attention mechanism. The model is specifically designed for large-scale systems and forecasts net load up to six hours ahead using nationwide data from South Korea. The reconstituted load method is adapted to account for the effects of BTM solar generation. To enhance input selection, a feature selection technique based on normalized mutual information (NMI) is introduced. Furthermore, a novel input variable termed “load variation” is proposed to explicitly capture the dynamic behavior of real-time load changes.
Comprehensive case studies are conducted to validate the effectiveness of the proposed model, evaluating both the impact of NMI-based input selection and the contribution of the new input feature. The main contributions of this paper are summarized as follows:
Development of a Large-Scale VSTLF Model: A VSTLF model tailored for a national-scale power system (South Korea) is proposed. The model forecasts the net load over a six-hour horizon, making it well-suited for real-time system operations and electricity market planning.
Customized Data Preprocessing for Large Systems: Two key preprocessing steps are introduced. First, reconstituted load is calculated to account for BTM solar generation. Second, a weighted average method is applied to derive representative weather inputs at a national scale.
Input Feature Selection Using NMI: A systematic method based on NMI is proposed to select dominant features, including weather variables and historical load patterns, by evaluating their mutual information (MI) with the target load.
Introduction of the “load variation” Feature: A new input feature is introduced to reflect the real-time rate of load change, enabling the model to better capture dynamic load behavior.
Parallel Input Architecture with GRU and Attention: A GRU-attention model architecture is designed to effectively process multi-dimensional inputs. GRU units learn temporal dependencies, while the attention mechanism identifies the relative importance and inter-dependencies of input features at each time step.
The remainder of this paper is organized as follows. Section 2 describes the characteristics of the South Korean load dataset used in this study. Section 3 presents the proposed algorithm. Section 4 provides results on feature selection, hyperparameter tuning, and comparative performance evaluation. Finally, Section 5 concludes the paper.
2. Load Characteristic Analysis
Load in power systems is shaped by temporal patterns—such as seasonality, day-of-week effects, and holidays—as well as socio-economic factors including economic growth. Accurate forecasting therefore requires rigorous analysis and integration of these inter-dependencies.
In addition, variability introduced by RESs, which is primarily driven by meteorological conditions, must also be considered. Specifically, ambient temperature and humidity exert primary influence on heating and cooling loads; precipitation and sky conditions affect photovoltaic output; and wind speed and direction determine wind power variability.
This study forecasts South Korea’s net load over a six-hour horizon at 15 min intervals. As of 2023, the national grid recorded a peak load of 93,615 MW and an installed capacity of 142,567 MW [31], with industrial consumption accounting for 53% of the total electricity demand [32]. Consequently, weekday demand, when industrial operations are active, is generally higher than weekend demand. Monday morning loads tend to be lower due to reduced activity on the preceding Sunday, while Tuesday through Friday (i.e., “normal weekdays”) exhibit similar net-load profiles when weather effects are excluded.
Figure 1 presents the daily net-load profile and the hour-by-hour distribution of net-load values. As shown in Figure 1a,b, net-load variability is highest between 09:00 and 17:00, driven by a combination of social activity, seasonal influences, and the intermittent nature of solar PV generation.
In 2023, RESs accounted for approximately 9.67% of South Korea’s total power generation [33], with solar PV contributing 55.03% of the total renewable energy output. This substantial share of solar PV generation causes significant distortions in observed load profiles. Additionally, solar PV output is inherently variable, being directly influenced by solar irradiance, which fluctuates due to atmospheric conditions such as cloud cover, humidity, and time of day. As a result, the net load measured at the system level deviates from the actual demand, complicating the load forecasting process.
Figure 2 illustrates the relationship between solar power generation and net load under contrasting weather scenarios—sunny versus cloudy days. In 2023, this variation in solar output led to a net-load difference of 9582 MW between the two conditions.
Hence, accurate VSTLF depends on the integration of periodic load patterns, weather-related variables, and advanced methods that can effectively capture the inherent variability in BTM solar generation.
4. Case Study
To evaluate the performance of the proposed VSTLF algorithm, a case study was conducted on the South Korean power system. This system represents a national-scale grid, with a peak load of 97,115 MW and a total installed generation capacity of 148,709 MW as of 2024. The proposed algorithm is designed to forecast the net load over a 6 h horizon at 15 min intervals.
The model was trained and validated using historical operational data, including net-load measurements and relevant weather variables. A detailed description of the dataset is provided in Section 4.1. Section 4.2 presents the results of input feature selection using NMI and the hyperparameter optimization process. Section 4.3 introduces the comparison algorithms and evaluation metrics. Finally, Section 4.4 provides a comprehensive performance evaluation of the proposed VSTLF model, including a comparative analysis against several benchmark methods.
4.1. Description of Dataset
The dataset used for the case study comprises three main components: 15 min interval net load data, hourly estimated BTM solar generation, and hourly weather observations. The dataset spans the period from 1 January 2021 to 31 December 2023.
The historical 15 min interval net load data were obtained from the Korea Power Exchange (KPX), the Independent System Operator (ISO) of South Korea, and are used to capture historical load patterns. To account for the impact of BTM solar generation—which distorts the net load—hourly estimated BTM solar generation is used for load reconstitution. These hourly values are linearly interpolated to align with the 15 min resolution of the net-load data.
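The interpolation and reconstitution steps described above can be sketched as follows. This is a minimal illustration, assuming that the reconstituted load is the sum of the measured net load and the estimated BTM solar generation; the function names and sample values are hypothetical and are not taken from the KPX data.

```python
def interpolate_hourly_to_15min(hourly):
    """Linearly interpolate hourly values onto a 15 min grid.

    Returns 4 * (len(hourly) - 1) + 1 points; the last hourly value
    is kept as the final point.
    """
    if len(hourly) < 2:
        return list(hourly)
    out = []
    for left, right in zip(hourly, hourly[1:]):
        for k in range(4):  # offsets of 0, 15, 30, and 45 min
            out.append(left + (right - left) * k / 4)
    out.append(hourly[-1])
    return out


def reconstitute_load(net_load, btm_solar):
    """Add estimated BTM solar generation back onto measured net load.

    Assumption: reconstituted load = net load + BTM solar generation.
    """
    if len(net_load) != len(btm_solar):
        raise ValueError("series must share the same 15 min time index")
    return [n + s for n, s in zip(net_load, btm_solar)]


# Hypothetical values in MW, for illustration only.
btm_hourly = [0.0, 400.0, 1200.0]                    # three hourly estimates
btm_15min = interpolate_hourly_to_15min(btm_hourly)  # nine 15 min points
net_15min = [60000.0] * len(btm_15min)
gross_15min = reconstitute_load(net_15min, btm_15min)
```

Interpolating before the addition keeps both series on the same 15 min index, so the reconstituted load inherits the resolution of the net-load measurements.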
Weather data relevant to real-time system operations, including temperature, humidity, wind speed, weather conditions, and precipitation, were obtained from the Korea Meteorological Administration (KMA). To model the weather dependency of the net load at the national level, hourly observations from eight major cities across South Korea are aggregated into representative inputs using a weighted average method [39]. These hourly weather observations are also linearly interpolated to produce 15 min resolution inputs synchronized with the demand data. A detailed summary of the dataset is provided in Table 1.
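The weighted aggregation of city-level observations can be illustrated with a short sketch. The weights and temperatures below are hypothetical placeholders; the actual weighting scheme follows [39] and is not reproduced here.

```python
def weighted_average(values, weights):
    """Return the weighted mean of per-city observations."""
    if len(values) != len(weights):
        raise ValueError("one weight is required per city")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical hourly temperatures (deg C) for eight cities and
# hypothetical weights; neither comes from the paper.
city_temps = [2.0, 4.0, 3.0, 5.0, 1.0, 6.0, 4.0, 3.0]
city_weights = [0.30, 0.15, 0.10, 0.10, 0.10, 0.10, 0.10, 0.05]

national_temp = weighted_average(city_temps, city_weights)
```

Dividing by the sum of the weights means the weights do not need to be normalized in advance.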
All simulations, including model training and forecasting, were conducted in a Python 3.10.15 environment using an Intel Xeon Silver 4215R CPU and an NVIDIA GeForce RTX 3090 GPU.
4.2. Input Feature Selection and Hyperparameter Tuning Results Using Grid Search Algorithm
To analyze the relationship between the reconstituted load and input variables, NMI was calculated using data from 1 January 2021 to 31 December 2022. The NMI values for candidate input features are summarized in Table 2 and Table 3. NMI is computed based on entropy and MI, which quantify the degree of dependency between each input variable and the net load. The NMI values for weather-related variables are presented in Table 2.
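As an illustration of how an entropy-based NMI can be computed, the sketch below discretizes two series into equal-width bins and normalizes MI by the geometric mean of the two entropies. Both the binning scheme and this particular normalization are assumptions made for the example; the exact definition used in this study may differ.

```python
import math
from collections import Counter


def discretize(values, n_bins):
    """Map continuous values to equal-width bin indices 0..n_bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy(labels):
    """Shannon entropy (in nats) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(x, y):
    """Mutual information (in nats) between two discrete label sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )


def normalized_mutual_information(x, y, n_bins=10):
    """NMI = MI / sqrt(H(X) * H(Y)); this normalization is an assumption."""
    xb, yb = discretize(x, n_bins), discretize(y, n_bins)
    hx, hy = entropy(xb), entropy(yb)
    if hx == 0.0 or hy == 0.0:
        return 0.0  # a constant series carries no information
    return mutual_information(xb, yb) / math.sqrt(hx * hy)
```

With this normalization the score lies between 0 (independent series) and 1 (one series fully determines the other after binning), which makes values for different candidate inputs directly comparable.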
As shown in Table 2, among the weather variables, temperature exhibited the highest NMI value of 0.082, which is more than twice that of the next highest variable, wind speed (0.030). This result can be attributed to the fact that heating and cooling demands constitute a significant portion of the total load and are strongly influenced by temperature. In addition to weather variables, the NMI values for historical load and load variation are presented in Table 3.
Compared to the weather variables, historical load exhibited significantly higher NMI values. Among the historical load inputs, the NMI values of the previous day (D-1) and one week earlier (D-7) are relatively high at 0.243 and 0.239, respectively. This can be attributed to the fact that recent load patterns exert the greatest influence on forecasting, and load profiles from the same day of the week typically follow similar patterns. Based on this analysis, suitable NMI thresholds for the weather variables and the historical load variables were empirically determined through comparative experiments over a range of candidate values, with each value evaluated using forecasting accuracy metrics. The selected thresholds are 0.08 for the weather variables and 0.2 for the historical load variables. Accordingly, the input features include temperature among the weather variables, as well as historical load-related features such as the previous-day load (D-1), the week-earlier load (D-7), and load variation.
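The threshold-based selection can be sketched as follows. Only the NMI values quoted in the text are used; the remaining entries of Tables 2 and 3 are omitted, and the feature names and the use of a greater-than-or-equal comparison are assumptions made for illustration.

```python
WEATHER_THRESHOLD = 0.08
LOAD_THRESHOLD = 0.2

# NMI values quoted in the text; other table entries are not reproduced.
weather_nmi = {"temperature": 0.082, "wind_speed": 0.030}
historical_load_nmi = {"load_D-1": 0.243, "load_D-7": 0.239}


def select_features(nmi_by_feature, threshold):
    """Keep the features whose NMI meets or exceeds the threshold."""
    return [name for name, nmi in nmi_by_feature.items() if nmi >= threshold]


selected = (
    select_features(weather_nmi, WEATHER_THRESHOLD)
    + select_features(historical_load_nmi, LOAD_THRESHOLD)
)
```

Under these thresholds, temperature is retained while wind speed is dropped, and both historical load inputs pass.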
Beyond the weather and historical load variables, the load variation exhibited the highest NMI value among all input features, reaching 0.542. The impact of input feature selection and the inclusion of the load variation is further analyzed in Section 4.4.
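For intuition only, a load variation series can be sketched as the first difference between consecutive 15 min load values. This form is an assumption made for the example; the precise definition of the feature is given with the proposed algorithm and may differ.

```python
def load_variation(load):
    """First difference of a load series (assumed form of the feature).

    The output has one element fewer than the input because the first
    observation has no predecessor.
    """
    return [curr - prev for prev, curr in zip(load, load[1:])]


# Hypothetical 15 min load values in MW.
example_variation = load_variation([61000.0, 61350.0, 61200.0])
```

A positive value indicates that load is ramping up and a negative value that it is ramping down, which is the kind of short-term dynamic the feature is intended to expose to the model.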
Hyperparameter tuning is critical because sub-optimal settings can obscure a model's true capability and yield misleading performance estimates. Several advanced parameter optimization approaches have been proposed and have demonstrated good computational efficiency and accuracy [42]. However, the number of candidate parameters in this work is small, so complete coverage of every combination is computationally tractable. A grid search algorithm [43] is therefore employed for hyperparameter tuning.
The grid search algorithm is an exhaustive global search restricted to the user-defined bounds. It evaluates every feasible combination, which guarantees that the best combination within the predefined space is found. The search space is expressed as follows:

$$\Theta = \Theta_1 \times \Theta_2 \times \cdots \times \Theta_D$$

Here, $D$ denotes the number of hyperparameters (i.e., the dimensionality of the search space), $\Theta_i$ is the discrete candidate set for the $i$-th hyperparameter, and $\Theta$ represents the full search space. The optimal hyperparameter vector ($\theta^{*}$) that minimizes the objective $J(\theta)$ and the cardinality of the full search space ($|\Theta|$) are expressed as follows:

$$\theta^{*} = \arg\min_{\theta \in \Theta} J(\theta), \qquad |\Theta| = \prod_{i=1}^{D} |\Theta_i|$$
In this work, $D$ is three: the length of the training data, the number of GRU layers, and the number of features. The candidate sets contain 3, 3, and 5 values, respectively, as shown in Table 4. Therefore, the cardinality is $|\Theta| = 3 \times 3 \times 5 = 45$. Because the search space comprises only 45 combinations, exhaustive evaluation of every combination is tractable and reproducible, which makes grid search preferable to more advanced hyperparameter optimization algorithms such as Bayesian, evolutionary, or other surrogate-based strategies. The resulting best setting is shown in Table 4 and Figure 10. The length of the training data, number of GRU layers, and number of input features were selected as 12 months, 1, and 32, respectively.
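The exhaustive search over 45 combinations can be sketched as below. The candidate values are hypothetical; only the sizes of the three candidate sets (3, 3, and 5) and the selected setting (12 months, 1 layer, 32 features) come from the text. The objective is a contrived placeholder standing in for a validation error and is constructed so that its minimum coincides with the reported setting.

```python
import itertools


def grid_search(search_space, objective):
    """Evaluate every combination and return the one with the lowest score."""
    names = list(search_space)
    best_params, best_score = None, float("inf")
    n_evaluated = 0
    for combo in itertools.product(*(search_space[name] for name in names)):
        params = dict(zip(names, combo))
        score = objective(params)
        n_evaluated += 1
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score, n_evaluated


# Hypothetical candidate sets with sizes 3, 3, and 5.
search_space = {
    "train_months": [6, 12, 24],
    "gru_layers": [1, 2, 3],
    "n_features": [8, 16, 32, 64, 128],
}


def toy_objective(params):
    """Placeholder for a validation error; NOT the objective used in the paper."""
    return (
        abs(params["train_months"] - 12)
        + abs(params["gru_layers"] - 1)
        + abs(params["n_features"] - 32)
    )


best_params, best_score, n_evaluated = grid_search(search_space, toy_objective)
```

In practice, the objective would train the model with the given setting and return a validation error such as MAPE, so the cost of the search is 45 training runs.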
4.3. Comparison Algorithms and Evaluation Metrics
To assess the performance of the proposed VSTLF algorithm, four comparison models are employed. As a baseline, the Kalman Filter-based Real-Time Load Forecasting model (KRLF) [44], which is currently used by KPX for 15 min interval load forecasting, is selected. In addition, three GRU-attention-based models are designed to compare the impact of input feature selection and the inclusion of the load variation. These models, along with the proposed method, are summarized in Table 5.
Model 1 employs neither input feature selection nor the load variation.
Model 2 incorporates only the load variation without input feature selection.
Model 3 adopts only the input feature selection method, excluding the load variation.
The Proposed Model integrates both input feature selection based on NMI and the load variation within the GRU-attention framework.
This comparative setup enables an isolated and combined evaluation of the effects of each component on forecasting performance.
By comparing these algorithms, the performance of the proposed model is validated in three aspects: its accuracy relative to a real-time operational model, the effectiveness of input feature selection, and the contribution of the additional input feature, namely the load variation. The forecasting performance of the proposed VSTLF model and the comparison algorithms is evaluated using three standard metrics: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Mean Absolute Percentage Error (MAPE). These evaluation metrics are defined as follows:

$$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \left| \hat{y}_t - y_t \right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} \left( \hat{y}_t - y_t \right)^2}$$

$$\mathrm{MAPE} = \frac{100}{n} \sum_{t=1}^{n} \left| \frac{\hat{y}_t - y_t}{y_t} \right|$$

where $\hat{y}_t$ and $y_t$ are the forecasted load value and the actual load value at time step $t$, respectively, and $n$ is the total number of samples.
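The three metrics can be computed as in the following sketch, which uses their standard definitions. The function names and sample values are illustrative only.

```python
import math


def mae(actual, forecast):
    """Mean Absolute Error, in the units of the load (e.g., MW)."""
    return sum(abs(f - a) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root Mean Square Error, in the units of the load (e.g., MW)."""
    return math.sqrt(
        sum((f - a) ** 2 for a, f in zip(actual, forecast)) / len(actual)
    )


def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent; actual values must be nonzero."""
    return 100.0 * sum(abs((f - a) / a) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical actual and forecasted loads in MW.
actual_load = [100.0, 200.0, 400.0]
forecast_load = [110.0, 190.0, 400.0]

example_mae = mae(actual_load, forecast_load)
example_rmse = rmse(actual_load, forecast_load)
example_mape = mape(actual_load, forecast_load)
```

MAE and RMSE are scale-dependent, whereas MAPE is relative to the actual load, which is why improvements in MAPE are reported in percentage points and the other two in MW.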
For performance evaluation, forecasting results on normal days from 1 January 2022 to 31 December 2023 are used. To ensure a robust comparison, each algorithm except the KRLF model was trained and tested three times with different random initialization seeds, and the average performance is reported.
4.4. Results of Proposed VSTLF Model and Comparison Models
To assess the performance of the proposed VSTLF model and the comparison models, the annual average values of the evaluation metrics, as well as the metrics at 08:00 AM, a time characterized by high load variability, were analyzed for the years 2022 and 2023. The results are summarized in Table 6. As presented in Table 6, Model 1 showed larger errors than KRLF except for total MAPE in both years.
The performance improvements achieved by Models 2 and 3 highlight the contributions of the additional input feature (load variation) and the input feature selection method, respectively. Compared to the KRLF benchmark, Model 2 exhibited higher accuracy across all evaluation metrics, except for the RMSE at 08:00 AM in 2023. In contrast, Model 3 consistently outperformed KRLF in all evaluation metrics for both 2022 and 2023.
The substantial performance gain of Model 3 over Model 2 underscores the critical importance of selecting dominant input features in VSTLF. Furthermore, a comparison among Models 2, 3, and the proposed VSTLF model reveals that incorporating the load variation, which captures recent load changes, as an additional input feature provides further accuracy improvements beyond those achieved by input feature selection alone.
Among the evaluated models, the proposed VSTLF model demonstrated the highest forecasting accuracy, outperforming the KRLF benchmark. In 2022, it reduced the MAPE by 0.50 percentage points, the MAE by 306.58 MW, and the RMSE by 411.58 MW when aggregated across all forecast horizons. At 08:00 AM specifically, MAPE, MAE, and RMSE were reduced by 0.95 percentage points, 532.55 MW, and 635.40 MW, respectively. Similarly, in 2023, the proposed VSTLF model achieved reductions of 0.56 percentage points in MAPE, 329.23 MW in MAE, and 430.27 MW in RMSE across all forecast horizons. For the 08:00 AM forecast in 2023, these improvements were 0.49 percentage points (MAPE), 205.70 MW (MAE), and 128.81 MW (RMSE). These results highlight the effectiveness of integrating NMI-based feature selection and the additional load variation feature within the VSTLF framework.
Figure 11 presents the MAPE distributions for KRLF, Models 1–3, and the proposed VSTLF. Each box represents the interquartile range (IQR) with the central line indicating the median. The whiskers extend to 1.5 × IQR from the quartiles, and outliers are omitted for clarity. Compared to KRLF and Model 1, Model 2 exhibits a narrower IQR and shorter whiskers, indicating the benefit of incorporating the load variation feature. Model 3 further reduces dispersion by applying feature selection. The proposed VSTLF achieves the lowest median MAPE and the most compact IQR and whiskers, demonstrating both the highest accuracy and the most consistent performance among all models.
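The box-plot quantities described above can be reproduced with a short sketch. The quartile method (inclusive interpolation) and the convention that each whisker ends at the most extreme observation inside the 1.5 × IQR fence are assumptions, and the sample MAPE values are hypothetical.

```python
import statistics


def box_stats(values):
    """Quartiles, IQR, whisker ends, and outlier count for a box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "n_outliers": len(values) - len(inside),
    }


# Hypothetical daily MAPE values (%), including one extreme day.
daily_mape = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 5.0]
stats = box_stats(daily_mape)
```

A narrower IQR and shorter whiskers indicate that forecast errors vary less from day to day, which is the sense in which a model is described as more consistent.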
Figure 12 and Figure 13 illustrate the mean predicted values at each forecast point for the proposed and comparison models, aggregated over all forecast horizons and at 08:00 AM, respectively. As shown in Figure 12 and Figure 13, Models 1–3 outperformed the KRLF benchmark, and the proposed VSTLF yielded the most accurate forecasts overall.
Next, the forecasting accuracy of different algorithms is compared to evaluate the effectiveness of the proposed model. For comparison, two hybrid model structures commonly adopted in recent load forecasting studies are selected as baselines. The first comparison model is based on a CNN-GRU architecture combined with a multi-head attention mechanism (CNN-GRU-MHAT), which captures both spatial and temporal features through CNN-GRU layers while learning the importance of input features via the multi-head attention layers [14]. The second comparison model employs a hybrid CNN-BiLSTM structure, where the CNN extracts spatial features and the BiLSTM captures bidirectional temporal dependencies [26]. For a fair comparison, all models are trained using the same input features and identical hyperparameter settings. The forecasting results for each model are summarized in Table 7. Among the comparison models, the CNN-BiLSTM model demonstrated superior forecasting accuracy compared to the CNN-GRU-MHAT model. However, the proposed model outperformed the CNN-BiLSTM model across both overall time intervals and periods of high load variability during the year 2022. In 2023, the proposed model consistently achieved better forecasting performance than the CNN-BiLSTM model across most evaluation metrics, with the exception of RMSE.
5. Conclusions
In this study, a novel VSTLF model incorporating a GRU and an attention mechanism was proposed for forecasting net load in large-scale power systems. To enhance forecasting accuracy, two key techniques were integrated into the model: input feature selection based on NMI and the inclusion of a novel input feature, the load variation, which captures real-time load dynamics.
Extensive case studies using national-level data from South Korea demonstrated that the proposed GRU-attention model outperformed the comparison GRU-attention architectures, the Kalman filter-based VSTLF model currently used for real-time system operations, and hybrid deep learning models on most evaluation metrics. In particular, the proposed method achieved a Mean Absolute Percentage Error (MAPE) of 0.77%, an improvement of 0.50 percentage points over the Kalman filter-based benchmark model and of 0.27 percentage points over the hybrid deep learning benchmark (CNN-BiLSTM). The simulation results demonstrate the effectiveness of the NMI-based feature selection and the combination of load characteristics for very short-term load forecasting.
These findings highlight the practical relevance of the proposed approach for real-time power system operation and electricity market management. However, limitations remain: the model was trained and evaluated using measured weather data without considering weather forecast uncertainty, and it was not explicitly designed to account for holidays, which can cause atypical load behavior. Future work is needed to address the challenges associated with holidays. Holidays exhibit unique load patterns, which can vary depending on the type of holiday, the day of the week on which it falls, and whether its date is determined by the lunar calendar. Due to the limited availability of historical data for these events, future studies should explore data augmentation techniques and develop methods for effectively handling the days surrounding holidays. In addition, further investigation is required to incorporate forecast uncertainty by utilizing predicted weather variables in the training phase or by developing strategies to appropriately integrate both observed and forecasted weather data. Finally, this work adopted a sequential approach that first selects the input features via NMI and then tunes the hyperparameters using a grid search algorithm. This approach becomes computationally intensive in high-dimensional parameter settings, for example when incorporating economic growth indicators, holiday effects, or learning rate schedules. Future work will therefore investigate advanced optimization techniques such as SGSA, Bayesian optimization, or evolutionary strategies.