1. Introduction
Climate change, driven by global warming, along with the global transition from fossil fuels to renewable energy to achieve carbon neutrality, has introduced substantial changes to power systems. In particular, the growing integration of renewable energy sources and climate-induced variability in system load have increased the magnitude of load fluctuations. These challenges have made accurate load forecasting more difficult. Nevertheless, precise forecasting remains essential to ensure the reliable and economic operation of power systems. This underscores the need for forecasting models that can effectively capture and adapt to such variability.
Recent studies have explored a variety of approaches to improve the accuracy of load forecasting. These studies can be broadly categorized according to input feature selection techniques, forecasting methodologies, and input enhancement strategies. Given that selecting appropriate input features based on the forecasting horizon and the characteristics of the target load is critical, extensive research has focused on identifying the most relevant features to enhance forecasting performance.
Several studies have proposed input feature selection methods that incorporate multiple analytical techniques. Subbiah et al. [
1] introduced a long-term forecasting framework that combines the RReliefF filter, mutual information, and recursive feature elimination. Similarly, Cui et al. [
2] applied XGBoost and Random Forest algorithms to identify relevant features for short-term load forecasting. These studies highlight that combining multiple techniques can provide greater robustness in input feature selection, as relying on a single method may not fully capture the diverse characteristics of load-related data. Therefore, a hybrid approach that integrates multiple selection techniques is essential for improving both the quality and reliability of input selection.
Traditional time-series models, such as ARIMA, exponential smoothing (ES), LASSO, the Kalman filter, and their nonlinear extensions, have long been applied to load forecasting. Taylor et al. [
3] employed double seasonal exponential smoothing for a 15-day load forecasting task in Brazil, Ziel et al. [
4] adopted LASSO estimation for probabilistic short-term forecasting, and Jung et al. [
5] applied the Kalman filter for very short-term load forecasting (VSTLF). While these approaches provide interpretable results, they struggle to capture the nonlinear dynamics of large-scale systems, particularly under increasing load variability driven by weather and renewable integration.
In contrast, machine learning methods, including support vector machines (SVM), gradient boosting, and hybrid approaches, have gained attention for their ability to model nonlinear dependencies. For example, Singh et al. [
6] utilized SVM with wavelet transforms for day-ahead forecasting, while Taieb et al. [
7] applied gradient boosting for short-term forecasting. While a basic SVM is linear, kernelized SVMs can capture nonlinear relationships, extending the modeling capacity of these methods beyond traditional time-series models. Hybrid approaches have also been explored: Dudek et al. [
8] combined exponential smoothing with Long Short-Term Memory (LSTM) for mid-term forecasting, and Velasco et al. [
9] proposed an ARIMA–ANN model where ARIMA captured linear components and ANN corrected residuals, achieving higher accuracy than single models.
Among machine learning techniques, Recurrent Neural Network (RNN)-based methods such as LSTM and Gated Recurrent Units (GRUs) have been extensively utilized in recent studies, as their internal feedback structures enable them to effectively capture temporal dependencies in sequential input data. Kwon et al. [
10] applied LSTM for short-term load forecasting in the South Korean power system. Lin et al. [
11] proposed an LSTM-based short-term load forecasting model with an attention mechanism, while Hua et al. [
12] developed a hybrid model that combines Convolutional Neural Networks (CNNs), GRUs, and an attention mechanism. Numerous recent studies have demonstrated that LSTM and GRU architectures are well suited for load forecasting tasks.
Moreover, various studies have explored input data enhancement techniques to improve the accuracy of load forecasting. While many approaches directly use historical load data, along with related weather variables, an increasing number of studies have focused on utilizing feature extraction methods involving component analysis of the original data to enhance input variables. Wang et al. [
13] proposed a method that applies input sequence enhancement by decomposing load signals using the Prophet algorithm and variational mode decomposition. Li et al. [
14] employed seasonal-trend decomposition using Loess (STL) to separate load data into trend, seasonal, and residual components, which were then used as input sequences. Shao et al. [
15] used ensemble empirical mode decomposition to decompose load data and utilized the resulting components as forecasting model inputs. These decomposition-based sequence enhancement methods have all demonstrated improvements in load forecasting accuracy.
Recent studies on load forecasting have increasingly addressed VSTLF, which targets predictions ranging from a few minutes to several hours ahead. Wang et al. [
13] proposed a VSTLF model based on Temporal Convolutional Networks (TCNs) combined with Light Gradient Boosting Machine (LGBM) to extract spatio-temporal load features. Zhang et al. [
16] developed a VSTLF approach that integrates an improved empirical mode decomposition algorithm with bidirectional LSTM for a small city in China. Tong et al. [
17] combined an attention module with a 1D convolutional block for VSTLF in Chinese and New England cities. Although these studies advanced the development of VSTLF, they primarily addressed one-step-ahead forecasting, which limits their applicability for real-time power system operation and market planning.
Although many recent studies have focused on input feature selection, input sequence enhancement, or RNN-based VSTLF, few have integrated all of these aspects in the context of a large-scale, nationwide power system. In particular, VSTLF for such systems remains challenging because it must capture heterogeneous weather conditions across wide geographic regions, the variability introduced by renewable energy sources, and the distortions in system load caused by Behind-the-Meter (BTM) PV generation. The rapid growth of renewable energy amplifies the uncertainty of BTM PV generation, which induces irregular system load patterns and poses significant challenges for nationwide forecasting.
Building upon previous studies that have improved load forecasting accuracy, this study proposes a VSTLF algorithm for the next six hours in a large-scale, nationwide power system at 15 min intervals. The proposed approach integrates the following: (1) Input feature selection based on multiple correlation analyses, including the Pearson correlation coefficient, Spearman correlation coefficient, and normalized mutual information (NMI); (2) Input sequence enhancement using pseudo-trend components generated by a Kalman filter-based predictor; and (3) An LSTM model tailored to handle multi-dimensional inputs.
The main contributions of this study are described as follows:
- 1.
Input Feature Selection Based on Multiple Correlation Analyses. This study identifies candidate input features by combining multiple correlation analyses: the Pearson correlation coefficient to capture linear relationships, the Spearman correlation coefficient for monotonic and potentially nonlinear associations, and NMI to quantify information shared between variables. These correlation measures are integrated to provide robustness in the candidate selection process. From the set of selected candidates, the input combination that yields the highest forecasting accuracy is selected.
- 2.
Input Sequence Enhancement Using a Kalman Filter-Based Predictor for Pseudo-Trend Generation. To enhance the input sequences for the LSTM-based forecasting model, this study employs a Kalman filter-based predictor to generate pseudo-trend components of the load. These pseudo-trends are used to augment the original historical input sequences, which enables the model to learn a more stable temporal structure and thereby improve forecasting performance.
- 3.
VSTLF for a Large-Scale, Nationwide Power System Considering Regional Characteristics and Renewable Variability. To perform VSTLF on a nationwide scale, this study adopts two key strategies: (1) a reconstituted load approach that accounts for the variability introduced by photovoltaic (PV) generation and (2) the use of representative weather variables that capture spatially heterogeneous conditions across regions of the power system.
The remainder of this paper is organized as follows.
Section 2 provides background information.
Section 3 presents the proposed algorithm.
Section 4 discusses case studies for validation, and
Section 5 concludes the paper.
3. VSTLF for Large Power Systems with Pseudo-Trend Information
The overall flowchart of the proposed VSTLF algorithm is illustrated in
Figure 4.
In the proposed method, the reconstituted load is first computed by combining the system load with PV generation data, thereby properly accounting for the influence of PV generation. Input candidates are then selected through multiple correlation analyses—the Pearson correlation coefficient, Spearman correlation coefficient, and NMI—applied to both weather variables and time-lagged features. To enrich the temporal characteristics of the input sequence, pseudo-trend components are generated using a Kalman filter-based predictor and incorporated into the input. The model forecasts the reconstituted load using combinations of selected input candidates and the pseudo-trend data. The forecasted loads are then converted back to system load, and the input combination that yields the lowest forecasting error is selected. Finally, hyperparameter tuning and model evaluation are performed.
3.1. Selection of Candidate Input Features Through Correlation Analyses
In this study, candidate input features are selected through an integrated correlation analysis framework that prioritizes variables consistently exhibiting strong associations with the target load. The selection process combines the Pearson correlation coefficient to assess linear relationships, the Spearman correlation coefficient to capture monotonic trends, and the NMI to quantify the mutual information shared between variables. The overall process of input feature selection is depicted in
Figure 5.
As shown in
Figure 5, each input variable (e.g., weather variables and time-lagged variables) is evaluated using Pearson, Spearman, and NMI correlation analyses. For each correlation analysis, a variable is marked as 1 when its correlation coefficient is greater than the average coefficient obtained from that specific analysis. Finally, variables that are consistently marked across all three methods are selected as the input candidates. Among these candidates, all possible combinations are evaluated through a brute-force search to identify the optimal input feature set for the forecasting model.
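For illustration, the marking scheme described above can be sketched as follows. The equal-width binning used to discretize continuous variables for NMI and the use of absolute correlation values are implementation assumptions, not details specified in the paper.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr
from sklearn.metrics import normalized_mutual_info_score

def select_candidates(features: dict, target: np.ndarray, n_bins: int = 20) -> list:
    """Mark a variable per analysis when its score exceeds that analysis's
    average coefficient; keep variables marked by all three analyses."""
    def nmi(x, y):
        # NMI expects discrete labels; equal-width binning is one simple choice.
        xb = np.digitize(x, np.histogram_bin_edges(x, bins=n_bins))
        yb = np.digitize(y, np.histogram_bin_edges(y, bins=n_bins))
        return normalized_mutual_info_score(xb, yb)

    scores = {
        name: (abs(pearsonr(x, target)[0]),
               abs(spearmanr(x, target)[0]),
               nmi(x, target))
        for name, x in features.items()
    }
    means = np.mean(list(scores.values()), axis=0)  # average per analysis
    return [name for name, s in scores.items()
            if all(s[i] > means[i] for i in range(3))]
```

The brute-force evaluation of combinations of the surviving candidates then proceeds as described in Section 4.3.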
Since the proposed model is trained to forecast the reconstituted load directly, correlation analysis is conducted between the reconstituted load and both weather and time-lagged variables to identify meaningful input features. The weather-related variables considered in this study include temperature, humidity, wind speed, weather conditions, and precipitation, all of which are derived from weather forecasts. Time-lagged variables consist of historical load values from one day prior (D-1) to one week prior (D-7) to the forecast date. Additionally, calendar variables—such as year, month, day, and day of the week—are incorporated to account for periodicity and are treated as auxiliary features.
3.2. Data Pre-Processing
The load and weather data used as inputs to the forecasting model may contain missing values and differences in scale, which can introduce bias or increase the risk of overfitting. Furthermore, it is necessary to derive representative weather information that accurately reflects the characteristics of the entire large-scale power system and effectively captures the impact of renewable energy generation.
In this study, data pre-processing is carried out in the following steps: time resolution alignment, missing value removal, load reconstitution, representative weather data generation, and data normalization. The raw datasets used in this study were originally recorded at different temporal resolutions: the target variable, i.e., the system load, is recorded at 15 min intervals; weather data are provided hourly; and calendar variables—such as date and day of the week—are categorical.
All datasets are resampled to a unified 15 min resolution to match the forecasting target. Hourly weather data are converted to 15 min intervals using linear interpolation. Additionally, any days containing missing values in the training dataset are entirely excluded from the dataset.
To accurately account for the impact of renewable energy, particularly PV generation, the reconstituted load is calculated by summing PV generation and the system load. Representative weather data for the entire power system are derived by performing weighted averaging of weather observations from n major zones, where the weights reflect the relative significance of each region.
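The resolution alignment, load reconstitution, and representative-weather steps might be sketched as follows. The zone labels, weights, and the use of pandas linear interpolation are illustrative assumptions.

```python
import pandas as pd

def preprocess(system_load: pd.Series,
               pv_gen_hourly: pd.Series,
               zone_weather_hourly: pd.DataFrame,
               zone_weights: pd.Series) -> tuple[pd.Series, pd.Series]:
    """Align resolutions, build the reconstituted load, and derive
    representative weather via a weighted zonal average."""
    # Hourly series -> 15 min via linear interpolation (matches the target).
    pv = pv_gen_hourly.resample("15min").interpolate("linear")
    weather = zone_weather_hourly.resample("15min").interpolate("linear")
    # Reconstituted load = system load + PV generation.
    recon = system_load.add(pv, fill_value=0.0)
    # Weighted average over the n major zones; weights normalized to sum to one.
    w = zone_weights / zone_weights.sum()
    rep_weather = weather.mul(w, axis=1).sum(axis=1)
    return recon, rep_weather
```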
Finally, to ensure consistency in data scales, min–max normalization is applied. This technique is selected from among several normalization methods—including max normalization, min–max normalization, and Z-score normalization—based on prior studies [
10] reporting superior forecasting accuracy when using min–max scaling. Min–max normalization is defined as
x_norm,t = (x_t − x_min) / (x_max − x_min),
where x_max and x_min denote the maximum and minimum values of the data, respectively, and x_t denotes the value at time t.
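The normalization step maps each series onto [0, 1]:

```python
import numpy as np

def minmax_normalize(x: np.ndarray) -> np.ndarray:
    """Scale values to [0, 1] using the series minimum and maximum."""
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min)
```

In practice the minimum and maximum from the training data would be stored so that forecasts can be mapped back to megawatts.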
3.3. VSTLF Model with Pseudo-Trend Input Sequence Enhancement
This study proposes a VSTLF model that utilizes load data, weather information, and a pseudo-trend generated by a Kalman filter-based predictor as inputs to an LSTM network. The model forecasts load every 15 min over a 6 h horizon on regular weekdays, excluding holidays. This forecasting scope is due to the limited availability of holiday load data and their irregular behavior, which will be addressed in future work.
The proposed model is designed to learn temporal patterns by processing daily load sequences and extracting their characteristics through hourly feature representations. Constructing input sequences that effectively capture patterns relevant to the forecasting target is essential for enhancing forecasting accuracy. To achieve this, the input sequence is constructed using selected time-lagged date combinations from a candidate pool.
Additionally, to provide the model with forward-looking information specific to the forecast date, a pseudo-trend is generated using a Kalman filter-based predictor and incorporated into the input sequence. This pseudo-trend is obtained using the method proposed by Jung et al. [
5], which is designed to capture the underlying load dynamics by identifying both recent and historical trend components. The resulting trend sequence represents the temporal direction of the forecast day’s load profile. The sequence-feature structure of the input data is illustrated in
Figure 6.
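As a rough sketch of how a Kalman filter-based predictor can produce a pseudo-trend, the following implements a generic local-linear-trend filter that is run over the recent load history and then extrapolated over the 6 h (24-step) horizon. This is not the exact formulation of Jung et al. [5]; the noise variances q and r are illustrative, not tuned values.

```python
import numpy as np

def kalman_pseudo_trend(history: np.ndarray, horizon: int = 24,
                        q: float = 1e-3, r: float = 1e-1) -> np.ndarray:
    """Filter the load history with a [level, slope] state model, then
    extrapolate the filtered state to form the pseudo-trend sequence."""
    F = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition (level, slope)
    H = np.array([[1.0, 0.0]])               # observe the level only
    Q = q * np.eye(2)                        # process noise covariance
    R = np.array([[r]])                      # measurement noise covariance
    x = np.array([history[0], 0.0])          # initial state estimate
    P = np.eye(2)
    for z in history[1:]:
        # Predict.
        x = F @ x
        P = F @ P @ F.T + Q
        # Update with the new load observation.
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + (K @ (np.atleast_1d(z) - H @ x)).ravel()
        P = (np.eye(2) - K @ H) @ P
    # Extrapolate the filtered state over the forecast horizon.
    return np.array([(np.linalg.matrix_power(F, k) @ x)[0]
                     for k in range(1, horizon + 1)])
```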
The proposed algorithm extends the input structure illustrated in
Figure 6 by incorporating the pseudo-trend generated by the Kalman filter-based predictor. An example of the input data structure including the Kalman filter-based pseudo-trend is presented in
Figure 7.
The pseudo-trend, which captures the recent trend of the target day, is incorporated into the input sequence alongside historical load patterns. The input to the LSTM model consists of three components: (1) the past 6 h of load data prior to the prediction time, (2) time-lagged components selected through the input feature selection process, and (3) the pseudo-trend corresponding to the same 6 h forecast horizon.
When multiple types of data, such as load, temperature, and humidity, are jointly used in the form of sequences and features, the model input becomes multidimensional. To address this, either a merged input structure or a parallel input structure can be applied. However, merged input structures may cause distortion during concatenation because of potential temporal misalignment among heterogeneous variables.
To mitigate this issue, the proposed model preserves the original structure of each input type by feeding load data and each selected weather variable into separate LSTM layers. This architecture enables the model to independently capture the temporal dynamics of each variable, thereby enhancing its ability to learn feature-specific patterns more effectively.
However, since the parallel structure has limitations in learning inter-variable dependencies, the outputs of the individual LSTM layers are concatenated and passed to a fully connected (FC) layer. One-dimensional inputs—such as weather forecasts for the prediction day, the pseudo-trend generated by the Kalman filter-based predictor, and auxiliary features (e.g., calendar variables)—are also fed into the FC layer, along with the LSTM outputs.
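One possible realization of this parallel structure, sketched in PyTorch; the layer sizes, the number of sequential inputs, and the use of the last hidden state are illustrative choices, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class ParallelLSTMForecaster(nn.Module):
    """Each sequential variable (load, each selected weather variable) gets
    its own LSTM; the LSTM outputs are concatenated with one-dimensional
    inputs (day-ahead weather forecast, pseudo-trend, calendar features)
    and passed to fully connected layers producing the 24-step forecast."""
    def __init__(self, n_seq_vars: int = 2, hidden: int = 64,
                 n_flat: int = 40, horizon: int = 24):
        super().__init__()
        self.lstms = nn.ModuleList(
            nn.LSTM(input_size=1, hidden_size=hidden, num_layers=2,
                    batch_first=True) for _ in range(n_seq_vars))
        self.fc = nn.Sequential(
            nn.Linear(n_seq_vars * hidden + n_flat, 128),
            nn.ReLU(),
            nn.Linear(128, horizon))

    def forward(self, sequences: list, flat: torch.Tensor):
        # Each sequence: (batch, seq_len, 1); keep each LSTM's last output.
        feats = [lstm(s)[0][:, -1, :] for lstm, s in zip(self.lstms, sequences)]
        return self.fc(torch.cat(feats + [flat], dim=1))
```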
The model is trained by minimizing the mean squared error (MSE) between the predicted and actual values using the Adam optimizer. The overall architecture of the proposed forecasting model is illustrated in
Figure 8.
4. Case Studies
4.1. Dataset Description and Experimental Setup
This study utilizes data spanning from 1 January 2021 to 31 December 2022. The dataset includes system load measurements recorded at 15 min intervals and PV generation data recorded hourly, both provided by the Korea Power Exchange. Additionally, hourly weather data—including temperature, humidity, wind speed, weather condition, and precipitation—were obtained from the Korea Meteorological Administration for eight major zones across the large-scale, nationwide power system. Calendar-related auxiliary variables, such as day of the week and holidays, were also collected. The types and ranges of all datasets used in this study are summarized in
Table 1.
As shown in
Table 1, the maximum recorded load reaches 94,929 MW, representing total national demand. Industrial consumption accounts for approximately 53% of this total, indicating that overall load patterns are primarily driven by industrial activity. PV generators are installed across most regions—with only limited deployment in the capital area—so their output must be considered in the forecasting model [
23].
All processes—including input selection, model training, forecasting, and performance evaluation—were implemented in Python 3.10.15 on a system equipped with an Intel Xeon Silver 4215R CPU.
4.2. Evaluation Metrics
To evaluate the forecasting performance of the proposed model, three commonly used error metrics are employed: mean absolute percentage error (MAPE), root mean square error (RMSE), and mean absolute error (MAE). The definitions of these metrics are
MAPE = (100/n) Σ_{t=1}^{n} |y_t − ŷ_t| / y_t,
RMSE = √( (1/n) Σ_{t=1}^{n} (y_t − ŷ_t)² ),
MAE = (1/n) Σ_{t=1}^{n} |y_t − ŷ_t|,
where y_t and ŷ_t denote the actual and predicted load at time t, respectively, and n is the number of forecast steps.
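Under these definitions, the three metrics can be written directly as:

```python
import numpy as np

def mape(y: np.ndarray, y_hat: np.ndarray) -> float:
    """Mean absolute percentage error (%)."""
    return 100.0 * np.mean(np.abs((y - y_hat) / y))

def rmse(y: np.ndarray, y_hat: np.ndarray) -> float:
    """Root mean square error, in the units of the load."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mae(y: np.ndarray, y_hat: np.ndarray) -> float:
    """Mean absolute error, in the units of the load."""
    return float(np.mean(np.abs(y - y_hat)))
```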
MAPE measures the average percentage error between predicted and actual values, providing a scale-independent indicator of forecasting accuracy. MAE calculates the mean of the absolute errors and is less sensitive to outliers compared to RMSE. In contrast, RMSE penalizes larger errors more heavily by squaring the residuals, making it more suitable for emphasizing peak deviations in the forecast.
4.3. Results of Input Candidate Selection Based on Correlation Analysis
Input candidate selection was performed by conducting correlation analysis between the reconstituted load—obtained from system load and PV generation data—and the candidate variables. The results of the correlation analysis between the reconstituted load and both weather variables and time-lagged features are presented in
Table 2 and
Table 3.
As shown in
Table 2, temperature and humidity consistently exhibited higher correlation values than the average coefficients in both the Pearson and Spearman analyses. In contrast, the NMI results indicated that temperature, humidity, and wind speed exceeded the average NMI coefficient. Since the proposed input candidates were selected only when a variable exceeded the average value across all three correlation measures, temperature and humidity were identified as the final weather-related input features.
In addition, as shown in
Table 3, time-lagged variables D-1, D-6, and D-7 commonly exhibited higher correlation values across the three correlation analyses. As a result, D-1, D-6, and D-7 were selected as the final time-lagged input features.
Additionally, auxiliary information such as year, month, day, weekday, and time was consistently included to represent the calendar-based periodic characteristics of the target day.
To identify the optimal input combination, a brute-force search [
25] was conducted over all possible combinations of the selected input features. For weather-related inputs, two combinations were evaluated: [temperature] and [temperature, humidity]. Since temperature is widely recognized as the most influential weather variable in load forecasting [
10], combinations that excluded temperature were not considered. For time-lagged inputs, seven combinations were tested: [D-1], [D-1, D-6], [D-1, D-7], [D-1, D-6, D-7], [D-6], [D-6, D-7], and [D-7].
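The enumeration of the 14 configurations can be reproduced with a few lines; the evaluation function that trains a model per configuration and returns its MAPE is omitted here.

```python
from itertools import combinations

# Weather inputs: temperature is always retained.
weather_sets = [("temperature",), ("temperature", "humidity")]

# All non-empty subsets of the selected time-lagged candidates.
lag_pool = ("D-1", "D-6", "D-7")
lag_sets = [c for r in range(1, len(lag_pool) + 1)
            for c in combinations(lag_pool, r)]

# Cartesian product: each pair is one input configuration to train and score.
configs = [(w, lags) for w in weather_sets for lags in lag_sets]
```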
As a result, two valid combinations of weather variables and seven combinations of time-lagged variables yielded a total of 14 input configurations. Each configuration was evaluated using the MAPE to identify the one that achieved the highest forecasting accuracy. The forecasting errors associated with each input combination are summarized in
Table 4.
As shown in
Table 4, among the 14 evaluated input combinations, the configuration using temperature as the weather variable and D-1 and D-7 as time-lagged inputs, i.e., Case 3, yielded the highest forecasting accuracy. A comparison between the use of D-6 and D-7 indicates that incorporating D-7, which represents the same day of the previous week, is the most effective for improving forecasting performance. D-6, on the other hand, exhibited higher correlation with the target load than other time-lagged candidates and yielded better accuracy than using D-1 alone, but its contribution was less effective than that of D-7. These results highlight the importance of carefully selecting input features and lag structures. Given that the input sequence of the designed LSTM model is based on daily load patterns, the incorporation of D-7, which provides a more similar load pattern, yielded the greatest improvement in forecasting accuracy. Based on this result, the final input feature set was determined to include temperature as the weather input and D-1 and D-7 as the time-lagged variables.
4.4. Effect of Pseudo-Trend Sequence Input Enhancement
The Kalman filter-based pseudo-trend is generated at 15 min intervals over a 6 h forecasting horizon. This section evaluates the effectiveness of the pseudo-trend as an additional input sequence by analyzing its correlation with the actual target load. To quantify the relationship between the pseudo-trend and the target load, three correlation metrics are computed: the Pearson correlation coefficient, the Spearman correlation coefficient, and the NMI. The results are summarized in
Table 5, where the generated pseudo-trend sequence is denoted as KF.
As shown in
Table 5, the pseudo-trend exhibits a strong correlation with the target load, with Pearson and Spearman correlation coefficients of 0.937 and 0.935, respectively, and an NMI value of 0.992. When used as a direct forecast of the target load, the pseudo-trend yields an MAPE of 1.724%.
Based on these results, the pseudo-trend is incorporated as an additional component of the input sequence, alongside the historical load data. The input configuration of the forecasting model corresponds to the optimal feature set identified in the previous section, which includes temperature as the weather-related feature and D-1 and D-7 as the time-lagged features. In addition to the selected input configuration, the auxiliary calendar features are included as input in the same way.
The annual forecasting performance of the pseudo-trend-enhanced input sequence model is summarized in
Table 6. For comparison, Case 3 in
Table 4, which excludes the pseudo-trend from the input sequence, is also presented.
As shown in
Table 6, incorporating the pseudo-trend into the input sequence led to reductions across all annual average error metrics. This result confirms that the pseudo-trend makes a meaningful contribution in terms of improving forecasting accuracy. Detailed monthly average forecasting errors (MAPE, %) for the proposed model are provided in
Table 7.
As shown in
Table 7, the application of the pseudo-trend improved forecasting performance in January, March, April, September, October, November, and December. In the remaining months, the improvement was less evident, which can be attributed to discrepancies between the pseudo-trend values generated for the training and test data, as well as the fact that the model had not yet undergone hyperparameter tuning. These preliminary results nonetheless indicate that pseudo-trend augmentation has the potential to enhance forecasting performance, which becomes more evident after model optimization.
4.5. Hyperparameter Tuning Using Grid Search Method
Hyperparameters significantly influence both the training efficiency and forecasting accuracy of LSTM models. The types of tunable hyperparameters vary depending on the architecture of the forecasting model. In this study, the proposed model adopts a relatively simple structure based on parallel LSTM layers, and the tuning process was therefore restricted to the number of layers and the number of neurons per layer. Other parameters were fixed to widely adopted default settings for reproducibility. Specifically, the Adam optimizer was employed, and the batch size was set sufficiently large to process the entire dataset at once. Dropout was not applied in order to preserve the temporal dependency inherent in the LSTM architecture. To mitigate overfitting, the training dataset was randomly split into 80% for training and 20% for validation. Model performance was monitored by comparing training and validation losses, and the training was terminated early if the validation loss did not improve for more than 200 epochs.
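The patience-based early stopping and the random 80/20 split described above can be sketched as follows; the helper names are illustrative, not part of the paper's implementation.

```python
import random

class EarlyStopper:
    """Stop when validation loss has not improved for more than
    `patience` epochs (the paper uses patience = 200)."""
    def __init__(self, patience: int = 200):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Call once per epoch; returns True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs > self.patience

def train_val_split(samples: list, val_frac: float = 0.2, seed: int = 0):
    """Random split of the training dataset into training and validation."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1 - val_frac))
    return [samples[i] for i in idx[:cut]], [samples[i] for i in idx[cut:]]
```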
Given the model’s simplicity and the small number of hyperparameters, a grid search method [
26] is employed to identify the parameter combination that yields the highest forecasting accuracy within a predefined search space. The search ranges and the selected hyperparameter values are summarized in
Table 8, and the MAPE corresponding to each hyperparameter combination is illustrated in
Figure 9.
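The grid search itself reduces to an exhaustive scan; the option values below are illustrative rather than the paper's exact search ranges, and `evaluate` stands in for training a model and returning its validation MAPE.

```python
from itertools import product

def grid_search(evaluate, layer_options=(1, 2, 3), width_options=(32, 64, 128)):
    """Exhaustively evaluate every (layers, neurons-per-layer) pair and
    return the configuration with the lowest error, plus all results."""
    results = {cfg: evaluate(*cfg) for cfg in product(layer_options, width_options)}
    best = min(results, key=results.get)
    return best, results
```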
As shown in
Figure 9, the grid search results indicate that the best forecasting performance was achieved with a configuration of two LSTM layers and 64 features per layer, yielding an MAPE of 0.890%.
Table 9 summarizes the configurations and annual average forecasting errors of the comparison models. The model configurations are defined as follows: (1) KF—the Kalman filter-based predictor, whose pseudo-trend is used directly as the forecast; (2) Case 3—the model trained with the optimal input feature set selected via brute-force search; (3) baseline LSTM—a baseline model in which all input features are concatenated and processed through a single LSTM network; (4) Pseudo-Trend—the model incorporating pseudo-trend enhancement without hyperparameter optimization; and (5) Proposed—the final model incorporating pseudo-trend enhancement with optimized hyperparameters.
The results demonstrate that the proposed model outperforms the baseline configurations in forecasting accuracy, highlighting the effectiveness of both the input selection strategy and the pseudo-trend input enhancement.
As shown in
Table 9, the Kalman filter-based forecast, when used independently as the pseudo-trend input, produces relatively large errors across all evaluation metrics. In contrast, Case 3, which employs the best-performing input variable combination identified through correlation analysis, achieves higher forecasting accuracy. The results of Case 3 demonstrate that incorporating a similar load pattern with optimized input variables improves load forecasting accuracy.
In a similar manner, incorporating the pseudo-trend as an additional input, i.e., Pseudo-Trend, yields further improvements than Case 3. The Pseudo-Trend model incorporates the pseudo-trend as part of the input sequence, providing additional information about the trend of the target load. Since both D-7 and the pseudo-trend exhibit strong correlations with the target load, the model can capture more relevant information, leading to higher load forecasting accuracy.
A comparison between the baseline LSTM and the Pseudo-Trend model shows that the baseline LSTM achieves higher load forecasting accuracy. However, since the Pseudo-Trend model had not yet been optimized at this stage, a more equitable comparison is between the baseline LSTM and the optimized Pseudo-Trend model, i.e., the Proposed model; this comparison highlights the effectiveness of incorporating the pseudo-trend as an additional input sequence. With the inclusion of the pseudo-trend, the number of input sequences increased from three in the Case 3 model to four in the Proposed model, leading to a slight rise in computational cost; however, the forecasting errors were significantly reduced.
These results suggest that, although the Kalman filter-based predictor alone produces high forecasting errors, it captures meaningful trend information that enhances model performance when integrated with historical input sequences. A detailed comparison of monthly MAPE values for the comparison model configurations is provided in
Table 10.
As shown in
Table 10, the proposed algorithm achieved the highest forecasting accuracy in all months, except May and November. The load forecasting accuracy of the proposed model consistently outperformed that of the baseline LSTM across most months; however, in May, the baseline LSTM showed relatively better accuracy, achieving the lowest error of 0.836%. This can be attributed to the relatively stable load patterns observed in May, where the additional pseudo-trend component provided limited new information and, in some cases, introduced minor noise. Nevertheless, the proposed model achieved the best overall performance, on average, with an MAPE of 0.890%, especially in months characterized by greater variability in load and weather conditions. These results suggest that pseudo-trend augmentation is especially beneficial in periods of greater volatility, whereas future work may focus on adaptive schemes to further improve performance in months with relatively stable demand patterns. Compared with the pre-optimized results reported in
Table 7, these results confirm that hyperparameter tuning enabled the model to more effectively leverage the pseudo-trend input, leading to consistent improvements across nearly all months.
5. Conclusions
This study proposed a VSTLF algorithm for a nationwide power system with a 6 h forecasting horizon. The proposed approach demonstrates strong potential for deployment in large-scale power system operations and the efficient operation of an electrical energy market, as it consistently improved the accuracy of load forecasting compared with baseline models.
The model’s effectiveness stems from two key contributions: (1) a correlation-based input selection strategy that identifies the most informative variables and (2) the incorporation of Kalman filter-based pseudo-trend sequences that capture underlying short-term load dynamics. Together, these elements reduced the overall forecasting error to 0.890% MAPE and enhanced accuracy across nearly all months in the evaluation period.
These findings confirm the utility of combining systematic feature selection with pseudo-trend augmentation to develop more reliable VSTLF models. Nonetheless, limitations remain, as the analysis focused only on normal weekdays. Future research should extend the evaluation to holidays and investigate adaptive strategies for pseudo-trend generation and season-specific feature selection to improve performance during periods of relatively stable demand. In addition, exploring advanced deep learning architectures, such as attention mechanisms and hybrid models, may further enhance forecasting accuracy and interpretability. Another promising direction is to account for heterogeneous weather conditions across zones by performing forecasts at the zonal level, then aggregating them, which may capture regional variability more effectively and improve nationwide forecasting performance.