Article

A Short-Term Electricity Load Complementary Forecasting Method Based on Bi-Level Decomposition and Complexity Analysis

College of Electrical Engineering and Control Science, Nanjing Tech University, Nanjing 211816, China
*
Author to whom correspondence should be addressed.
Mathematics 2025, 13(7), 1066; https://doi.org/10.3390/math13071066
Submission received: 3 February 2025 / Revised: 19 March 2025 / Accepted: 24 March 2025 / Published: 25 March 2025
(This article belongs to the Special Issue Artificial Intelligence and Game Theory)

Abstract

With the increasing complexity of the power system and the increasing load volatility, accurate load forecasting plays a vital role in ensuring the safety of power supply, optimizing scheduling decisions and resource allocation. However, the traditional single model has limitations in extracting the multi-frequency features of load data and processing components with varying complexity. Therefore, this paper proposes a complementary forecasting method based on bi-level decomposition and complexity analysis. In the paper, Pyraformer is used as a complementary model for the Single Channel Enhanced Periodicity Decoupling Framework (SCEPDF). Firstly, a Hodrick Prescott Filter (HP Filter) is used to decompose the electricity data, extracting the trend and periodic components. Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) is used to further decompose the periodic components to obtain several IMF components. Secondly, based on the sample entropy, spectral entropy, and Lempel–Ziv complexity, a complexity evaluation index system is constructed to comprehensively analyze the complexity of each IMF component. Then, based on the comprehensive complexity of each IMF component, different components are fed into the complementary model. The predicted values of each component are combined to obtain the final result. Finally, the proposed method is tested on the quarterly electrical load dataset. The effectiveness of the proposed method is verified through comparative and ablation experiments. The experimental results show that the proposed method demonstrates excellent performance in short-term electricity load forecasting tasks.

1. Introduction

Accurate load forecasting is crucial for the planning and operation of smart grids [1]. By accurately predicting power demand, power grid operators can adjust their operation strategies in a timely manner to avoid insufficient or excess power supply and ensure the stability and reliability of the power grid. In addition, accurate short-term load forecasting can ensure the stable operation of the power system [2], reduce operating costs, improve energy efficiency, and further enhance economic benefits.
According to the technical characteristics, the existing load forecasting methods can be categorized into statistical methods, neural network models, and combined forecasting methods [3].
Traditional statistical methods, such as the autoregressive moving average model (ARMA) [4], autoregressive integrated moving average model (ARIMA) [5], and multiple linear regression (MLR) [6], primarily analyze historical data to uncover correlations between load change and time series [7]. However, due to the constraints of the assumptions made by the linear models, their effectiveness is limited when input data are nonlinear or non-stationary [8].
In recent years, neural network models have been widely used in the field of load forecasting due to their advantages in the classification, feature extraction, and processing of nonlinear regression problems [9], such as the artificial neural network (ANN) [10], recurrent neural network (RNN) [11], and convolutional neural network (CNN) [12]. In [13], a convolutional long short-term memory neural network model was proposed (CLSAF). This model effectively improved the accuracy of short-term household power load forecasting by integrating autoregressive feature selection, exogenous feature selection, and ‘default’ state strategy. Ref. [14] proposed an online adaptive recurrent neural network power load forecasting method. This method effectively solved the problem whereby the traditional offline learning model could not adapt to data changes and concept drift. The proposed method’s effectiveness was validated through experiments. Ref. [15] proposed graph environment intelligent technology (GAIN) for heat load forecasting. This model combines customer load classification, a collaborative attention mechanism, and the recursive autoregressive method to effectively capture the time correlation of energy consumption behavior and significantly improve the accuracy of heat load forecasting. However, it is difficult to fully capture the multi-scale features contained in load data by relying solely on neural network methods, as they limit performance in complex load forecasting tasks.
In order to explore the multi-level features hidden in the load data, decomposition-based combined models are becoming the mainstream method of load forecasting [16]. Ref. [17] proposed a combined forecasting method based on variational mode decomposition and long-term and short-term memory network. The load data are decomposed by VMD, and the sequences are extended based on the correlation coefficients. At the same time, it is optimized for different day types and temperature conditions, which effectively overcomes the limitations of traditional methods in dealing with complex load forecasting. In Ref. [18], a multivariate load forecasting model based on time series decomposition and reconstruction is proposed. This method introduced a composite evaluation factor (CEF) to reconstruct modal components by considering their complexity, coupling, and frequency characteristics, significantly improving the model’s forecasting accuracy and stability in complex environments. Ref. [19] introduced a hybrid neural network joint model based on mode decomposition and change-point detection. An improved signal energy evaluation metric was proposed, dynamically adjusting the number of intrinsic mode functions (IMFs) during the VMD process. This method effectively addresses the issue of power load fluctuations in port areas. Ref. [20] proposed a multi-model fusion forecasting method based on complementary ensemble empirical mode decomposition, genetic algorithm-long short-term memory, a radial basis function fusion autoencoder, and a particle swarm optimization support vector machine. This approach effectively addresses the impacts of non-stationarity in load sequences, thereby enhancing the accuracy and reliability of the forecasts. Ref. [21] introduced a forecasting framework based on MAFS + ISTD + PGBM. This method employed MAFS, ISTD, and PGBM for feature selection, time series decomposition, and probabilistic forecasting, respectively. 
Experimental validation demonstrated that this method surpasses the existing models in both training time and forecasting accuracy, offering an economically efficient prediction solution for power grid operators. Ref. [22] introduced a probabilistic load forecasting method based on an enhanced graph convolutional network with multi-scale self-attention, utilizing an improved seasonal-trend decomposition. This approach employs a dual-channel convolutional neural network to process the different temporal components resulting from the decomposition. It also leverages a multi-scale self-attention mechanism within the graph convolutional network to concurrently capture geographic and semantic spatial correlations, effectively addressing deficiencies in extracting temporal and spatial relatedness. Although the above methods all adopt the combination model based on decomposition, the complexity of the IMF components obtained by decomposition is different. Consequently, it is difficult for a single model to fully mine its feature information when forecasting these components.
To address the insufficiency of single models in handling IMF components with varying levels of complexity, this paper analyzes the performance characteristics of different models in dealing with multiple complexity components in depth, and it proposes a complementary forecasting method based on bi-layer decomposition and complexity-driven strategies.
The main contributions of this paper are as follows:
  • A component complexity analysis method based on bi-layer decomposition is proposed. A Hodrick Prescott Filter (HP Filter) is applied to extract the long-term trend and short-term fluctuation characteristics from the electricity load data. Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) is applied to decompose the short-term fluctuation components. A subsequence complexity evaluation index system is constructed based on the sample entropy, spectral entropy, and Lempel–Ziv complexity, providing a comprehensive characterization of the complexity features of each IMF component.
  • An improved model, SCEPDF, based on the multi-periodic decoupling block [23], is proposed. The Cross-Variation Aggregation Block is introduced to improve the PDF. During the fusion of periodic features from the dual-variant modeling block, interaction terms between different periodic features are generated element by element through a feature-crossing mechanism. This approach deeply explores the nonlinear relationships between periodic features and significantly improves the model’s forecasting accuracy under single-channel input.
  • A complementary forecasting model based on SCEPDF and Pyraformer [24] is constructed. SCEPDF is used to accurately extract the periodic and local characteristics of components with medium and low complexity. Pyraformer is utilized to efficiently model the long-range dependencies and global patterns in high complexity components. Experimental results verify that the integration of these two models’ complementary characteristics allows for a more comprehensive capture of the multi-level features in the data, thereby significantly improving the accuracy of short-term electricity load forecasting.

2. Methodology

2.1. Overall Framework

A single model has limitations in mining the multi-band features of data. Moreover, when dealing with different components, decomposition-based combined forecasting methods find it difficult to capture the fast dynamic characteristics of high-frequency components and the long-term trend characteristics of low-frequency components at the same time. These defects limit forecasting performance. To overcome them, this paper proposes a complementary forecasting method based on bi-layer decomposition and complexity-driven strategies. The overall framework of the proposed method consists of four modules, as shown in Figure 1. The contents and functions of the four modules are as follows:
In Module 1, a bi-layer decomposition framework based on the HP filter and CEEMDAN is constructed. The HP filter decomposes the electricity load data into a stable trend component and an unstable periodic component. CEEMDAN then performs a secondary decomposition of the periodic component to obtain several IMF components.
In Module 2, a subsequence complexity evaluation index system based on sample entropy, spectral entropy, and Lempel–Ziv complexity is constructed. The complexity of each IMF component obtained in Module 1 is analyzed comprehensively, and the final comprehensive complexity is obtained by weighting each index.
In Module 3, the complementary characteristics of the improved PDF (SCEPDF) and Pyraformer in dealing with components of different complexity are discussed in depth, and a complementary model based on the two is constructed. SCEPDF focuses on accurately extracting periodic and local features, while Pyraformer can efficiently model long-range dependencies and global patterns. Combining the complementary characteristics of the two captures the complex characteristics of the data more fully.
In Module 4, a complexity-driven complementary forecasting framework is constructed. The trend component obtained after HP filtering is input into the SCEPDF model. The IMF components obtained from the secondary decomposition are routed to the two models of the complementary model according to their complexity, and the predicted values of all components are superimposed to obtain the final result.

2.2. Bi-Layer Decomposition Method

2.2.1. Hodrick Prescott Filter

The Hodrick-Prescott (HP) filter exhibits strong mathematical stability. Its quadratic smoothing term ensures that the trend component is continuously differentiable. This feature enables the effective extraction of long-term trends and short-term fluctuations from load data. Additionally, compared to seasonal decomposition algorithms, the components derived from HP filtering do not contain residual parts, reducing the workload for subsequent secondary modal decomposition. The specific implementation steps of the HP filter are as follows:
Given a time series dataset $y_i \in \{y_1, y_2, \ldots, y_n\}$, the HP filter decomposes the data into a trend component (stationary component) and a periodic component (unstable component). The trend component reflects the long-term trends in the data. The periodic (cyclic) component captures the short-term fluctuations around the trend. With the set of observation indices $\alpha = \{ t \mid t \in \mathbb{N},\ t \le n \}$, the decomposition can be expressed as follows:
$$y_i = t_i + c_i$$
where $t_i \in \{t_1, t_2, \ldots, t_n\}$ and $c_i \in \{c_1, c_2, \ldots, c_n\}$ represent the trend component and the periodic component. The algebraic sum of the trend component and the cyclic component equals the observed values.
The trend component is defined as a smoothed version of the data. The component is obtained by minimizing the following objective function:
$$\min \sum_{t \in \alpha} (y_t - t_t)^2 + \delta \sum_{t=2}^{n-1} \left[ (t_{t+1} - t_t) - (t_t - t_{t-1}) \right]^2$$
where $\delta$ is the smoothing parameter and $t$ is the time index. The first term, $\sum_{t \in \alpha} (y_t - t_t)^2$, measures the deviation between the trend component and the original data. The second term, $\sum_{t=2}^{n-1} \left[ (t_{t+1} - t_t) - (t_t - t_{t-1}) \right]^2$, measures the curvature of the trend component over time, and $\delta$ weighs the two terms against each other. In the experiments of this paper, $\delta$ is set to 1600.
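The minimization above has a closed-form solution, which the following NumPy sketch illustrates. It assumes the standard second-difference form of the HP penalty; the function name `hp_filter` and the dense linear solve are illustrative choices, not the authors' implementation.

```python
import numpy as np

def hp_filter(y, delta=1600.0):
    """Split y into (trend, cycle) by solving the HP-filter normal equations."""
    y = np.asarray(y, dtype=float)
    n = y.size
    if n < 3:
        raise ValueError("need at least 3 observations")
    # D is the (n-2) x n second-difference operator:
    # (D t)[i] = t[i] - 2 * t[i+1] + t[i+2]
    D = np.zeros((n - 2, n))
    rows = np.arange(n - 2)
    D[rows, rows] = 1.0
    D[rows, rows + 1] = -2.0
    D[rows, rows + 2] = 1.0
    # Minimizing ||y - t||^2 + delta * ||D t||^2 gives (I + delta * D^T D) t = y
    trend = np.linalg.solve(np.eye(n) + delta * D.T @ D, y)
    cycle = y - trend  # the periodic component is what the trend leaves behind
    return trend, cycle
```

Because the cycle is defined as the remainder, the two components always sum back to the input series, so nothing is lost before the secondary decomposition.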

2.2.2. CEEMDAN

Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) is an improvement of the CEEMD method, and its flowchart is shown in Figure 2. The noise intensity is determined by using adaptive white noise and calculating the signal-to-noise ratio of each scale, thereby improving the accuracy and stability of the decomposition. The specific implementation steps of CEEMDAN are as follows:
(a) Add $N$ realizations of zero-mean Gaussian white noise, in positive and negative pairs, to the original signal $x(t)$ to obtain $N$ new signal sequences $x_i(t)$, where $i = 1, 2, \ldots, N$, as follows:
$$x_i(t) = x(t) + \varepsilon \delta_i(t)$$
where $\varepsilon$ is the weight of the white noise, and $\delta_i(t)$ is the Gaussian white noise sequence added for the $i$-th time.
(b) The $N$ signals obtained are subjected to modal decomposition, and the average of the resulting modal components is taken as the component of CEEMDAN, as follows:
$$\mathrm{IMF}_1(t) = \frac{1}{N} \sum_{i=1}^{N} \mathrm{IMF}_1^{i}(t)$$
$$r_1(t) = x(t) - \mathrm{IMF}_1(t)$$
where $\mathrm{IMF}_1(t)$ represents the first intrinsic mode component obtained by CEEMDAN decomposition, $\mathrm{IMF}_1^{i}(t)$ is the first mode of the $i$-th noisy signal, and $r_1(t)$ is the first decomposed residual signal.
(c) Repeat the first two steps on the residual signal until $n$ iterations are completed. Finally, the relationship between the original data $x(t)$ and the $n$ intrinsic modal components is as follows:
$$x(t) = \sum_{k=1}^{n} \mathrm{IMF}_k(t) + r_n(t)$$
where $r_n(t)$ is the residual signal after $n$ iterations.

2.3. Complexity Evaluation Index System

2.3.1. Sample Entropy

Sample Entropy measures the complexity of time series by measuring the probability of generating new patterns in the signal. The greater the probability of generating new patterns, the greater the complexity of the sequence. The specific calculation method is as follows:
$$S(n, r, L) = -\ln\left[ \frac{A^{n}(r)}{B^{n}(r)} \right]$$
where $L$ is the length of the sequence, $n$ is the embedding dimension, and $r$ is the similarity tolerance. In this paper, $n$ equals 2 and $r$ is 0.2. $B^{n}(r)$ and $A^{n}(r)$ denote the probabilities that two subsequences of the time series match for $n$ and $n + 1$ points, respectively, under the same similarity tolerance.
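The definition can be made concrete with a short NumPy sketch. Two points are assumptions rather than details from the paper: the tolerance is taken as r times the standard deviation of the series (a common convention), and matches are counted with the Chebyshev distance, excluding self-matches. The function name `sample_entropy` is illustrative.

```python
import numpy as np

def sample_entropy(x, n=2, r=0.2):
    """Sample entropy -ln(A/B) with embedding dimension n and tolerance r * std(x)."""
    x = np.asarray(x, dtype=float)
    L = x.size
    tol = r * np.std(x)  # assumption: r is a fraction of the standard deviation

    def count_matches(m):
        # Use the same L - n starting points for both template lengths
        templates = np.array([x[i:i + m] for i in range(L - n)])
        count = 0
        for i in range(len(templates) - 1):
            # Chebyshev distance from template i to every later template
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(dist <= tol))
        return count

    B = count_matches(n)      # pairs matching for n points
    A = count_matches(n + 1)  # pairs matching for n + 1 points
    if A == 0 or B == 0:
        return float("inf")   # undefined for too-short or too-irregular series
    return float(-np.log(A / B))
```

A regular signal such as a sine wave yields a value near zero, while white noise yields a much larger value, which is the ordering the complexity index relies on.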

2.3.2. Spectral Entropy

Spectral Entropy is an index used to measure the complexity or randomness of signal spectrum distribution. It calculates the entropy of the probability distribution by normalizing the PSD (Power Spectral Density) of the signal into a probability distribution to characterize the uniformity or sparsity of the signal spectrum. The formula is expressed as follows:
$$p[i] = \frac{P[i]}{\sum_{j} P[j]}$$
$$H = -\sum_{i} p[i] \ln\left( p[i] \right)$$
where $P[i]$ is the power spectral density. The window length is 256, with an overlap rate of 50%.
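As a rough illustration of this calculation, the sketch below estimates the PSD by averaging windowed periodograms (a Welch-style estimate) and then computes the entropy of the normalized spectrum. The Hann window, the per-segment mean removal, and the function names are assumptions; the paper specifies only the window length of 256 and the 50% overlap.

```python
import numpy as np

def welch_psd(x, window_length=256, overlap=0.5):
    """Average of Hann-windowed periodograms over overlapping segments."""
    x = np.asarray(x, dtype=float)
    step = int(window_length * (1.0 - overlap))
    window = np.hanning(window_length)
    starts = range(0, x.size - window_length + 1, step)
    segments = [x[s:s + window_length] for s in starts]
    if not segments:
        raise ValueError("signal shorter than one window")
    # Constant scale factors are omitted: they cancel in the normalization below
    spectra = [np.abs(np.fft.rfft((seg - seg.mean()) * window)) ** 2
               for seg in segments]
    return np.mean(spectra, axis=0)

def spectral_entropy(x, window_length=256, overlap=0.5):
    """Shannon entropy (natural log) of the normalized power spectrum."""
    P = welch_psd(x, window_length, overlap)
    p = P / P.sum()
    p = p[p > 0]  # treat 0 * ln(0) as 0
    return float(-np.sum(p * np.log(p)))
```

A pure tone concentrates its power in a few bins and scores low, whereas broadband noise spreads power across all bins and approaches the upper bound ln(number of bins).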

2.3.3. Lempel–Ziv Complexity

Lempel–Ziv Complexity (LZC) is a complexity measure method based on information theory, which is used to quantify the complexity of sequences. It was proposed by Jacob Ziv and Abraham Lempel. The principle is to measure the structure and randomness of a sequence by calculating the number of different sub-patterns of a discrete sequence. The more complex the sequence is, the more different modes it contains, and the higher the LZC value is. The calculation process is as follows:
(a) The initial sequence $X = \{x_1, x_2, \ldots, x_N\}$ is coarse-grained into binary form using its median; that is, the sequence is reconstructed with symbols from $\{0, 1\}$ to obtain the symbol sequence $S = \{s_1, s_2, \ldots, s_N\}$.
(b) Initialize $M_0$ and $N_0$ as empty sequences, let $i = 0$, and set the complexity $C_i = 0$.
(c) Traverse the sequence $S$ in a loop. Let $M_i = M_{i-1} s_i$ and $N_i = N_{i-1} s_i$, and then judge whether $M_{i-1}$ contains $N_i$. If it does, the complexity $C_i$ remains unchanged; if it does not, the complexity is increased by 1.
(d) The complexity $C_N$ is normalized. For the binary sequence $S$, the normalized complexity $C$ is calculated as follows:
$$C = \frac{C_N \log_{l} N}{N}$$
where $N$ is the length of the sequence and $l$ is the number of distinct symbols ($l = 2$ for the binary sequence $S$).
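The steps above can be sketched as follows. This is one common LZ76-style parsing, written under two assumptions: values equal to the median are mapped to 0, and each new phrase is the shortest substring not yet seen in the preceding text. The function names are illustrative and the authors' exact traversal may differ in detail.

```python
import math
import numpy as np

def binarize_by_median(x):
    """Coarse-grain a numeric series into a 0/1 string around its median."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    return "".join("1" if v > med else "0" for v in x)  # assumption: ties map to 0

def lz_phrase_count(s):
    """Number of distinct phrases found by an LZ76-style exhaustive parsing."""
    n = len(s)
    i, count = 0, 0
    while i < n:
        length = 1
        # Grow the candidate while it already occurs in the text before its last symbol
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1   # a new phrase ends here (or the sequence ran out)
        i += length
    return count

def normalized_lzc(x):
    """Phrase count scaled by log2(N) / N for a median-binarized series."""
    s = binarize_by_median(x)
    n = len(s)
    return lz_phrase_count(s) * math.log2(n) / n
```

For the 16-symbol string `0001101001000101`, the parsing yields the six phrases 0, 001, 10, 100, 1000, 101, so the normalized value is 6 × 4 / 16 = 1.5.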

2.4. Complementary Forecasting Method

In this paper, the PDF model is improved by introducing the Cross-Variation Aggregation Block, as shown in Equations (14)–(17). When the Cross-Variation Aggregation Block fuses the periodic features from the Dual Variation Modeling Blocks (DVMBs), interaction terms between different periodic features are generated element by element through the feature crossover mechanism, so as to deeply explore the nonlinear relationships between periodic features. However, as shown in Equations (11)–(13), the advantages of SCEPDF lie mainly in the extraction of local features and periodic patterns; when dealing with high-complexity components, its ability to capture long-term dependencies and global patterns is limited. In contrast, Pyraformer performs cross-scale modeling of complex time series through a pyramidal attention mechanism and a multi-resolution tree structure. As shown in Equations (18) and (19), the pyramidal attention mechanism limits the attention range of each node to its neighborhood, which includes adjacent nodes at the same scale, child nodes, and the parent node. It thereby captures local information and long-range dependencies at different scales and models global patterns efficiently.
Therefore, this paper uses Pyraformer as a complementary model for SCEPDF to better capture the long-range dependence and global patterns in high-complexity IMF components. The specific principles and structures of SCEPDF and Pyraformer are as follows.

2.4.1. SCEPDF

The Single Channel Enhanced Periodicity Decoupling Framework (SCEPDF) uses Multi-Periodic Decoupling Blocks and Dual Variation Modeling Blocks to capture short-term and long-term changes in time series by decoupling one-dimensional time series into two-dimensional short-term and long-term changes. The Cross-Variation Aggregation Block is introduced to capture the complex interrelationships between different periods, which effectively captures the complex periodic changes in long-term sequence forecasting. The overall structure is shown in Figure 3.
(a) Multi-Periodic Decoupling Block
The Multi-Periodic Decoupling Block uses a Periodicity Extractor and a Period-based Reshaper to convert 1D time series into 2D space, and its structure is shown in Figure 4. For a given one-dimensional input $X_I \in \mathbb{R}^{t \times d}$ with dimension $d$, the Periodicity Extractor analyzes the time series in the frequency domain, and the Period-based Reshaper then reshapes the one-dimensional input into two-dimensional tensors. The specific expression is as follows:
$$A = \mathrm{Avg}\left( \mathrm{Amp}\left( \mathrm{FFT}(X_I) \right) \right)$$
$$F_u = \underset{f_* \in \{1, \ldots, [t/2]\}}{\arg\mathrm{top}\text{-}u}(A), \quad F_{k_1} = \arg\mathrm{top}\text{-}k_1(A), \quad \{f_1, \ldots, f_k\} = F_{k_1} \cup \mathrm{top}\text{-}k_2\left( F_u \setminus F_{k_1} \right)$$
$$X_{2D}^{i} = \mathrm{Reshape}_{f_i, p_i}\left( \mathrm{Padding}(X_I) \right), \quad i \in \{1, \ldots, k\}$$
where $\mathrm{FFT}(\cdot)$ and $\mathrm{Amp}(\cdot)$ represent the fast Fourier transform and amplitude extraction, respectively, and $F_u$ and $F_{k_1}$ represent the $u$ and $k_1$ frequencies with the largest amplitude from $A$, respectively.
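A simplified NumPy sketch of the extractor and reshaper follows. It keeps only the k strongest frequencies rather than reproducing the two-stage selection with $F_u$, $F_{k_1}$, and $k_2$, and it assumes zero padding and a period length of ceil(t / f); the function names are hypothetical.

```python
import numpy as np

def top_k_periods(x, k=2):
    """x has shape (t, d). Return the k strongest frequencies and their periods."""
    x = np.asarray(x, dtype=float)
    t = x.shape[0]
    # Avg(Amp(FFT(X_I))): amplitude spectrum averaged over the d channels
    amplitude = np.abs(np.fft.rfft(x, axis=0)).mean(axis=1)
    # Skip index 0 (the mean term) so candidate frequencies are 1..floor(t/2)
    freqs = np.argsort(amplitude[1:])[::-1][:k] + 1
    periods = [int(np.ceil(t / f)) for f in freqs]
    return freqs, periods

def reshape_by_period(series, period):
    """Zero-pad a 1D series to a multiple of `period`, then fold it into 2D."""
    series = np.asarray(series, dtype=float)
    pad = (-series.size) % period
    padded = np.concatenate([series, np.zeros(pad)])
    return padded.reshape(-1, period)  # rows: successive periods, columns: phase
```

In the folded tensor, moving along a row follows the short-term variation within one period, while moving down a column follows the long-term variation across periods at the same phase.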
(b) Dual Variation Modeling Block
As shown in Figure 5, the Dual Variation Modeling Block consists of a long-term variation extractor and a short-term variation extractor. In the long-term variation extractor, each patch $x_q^{i,j} \in \mathbb{R}^{N \times P}$ carrying long-term information is first projected into the latent space by a linear projection and then processed by multiple Transformer encoder layers. The short-term variation extractor, by contrast, contains a sequence of convolution blocks, each consisting of a Conv1d layer and a nonlinear activation function. These blocks are arranged sequentially to gradually expand the receptive field, accommodating periods of various lengths.
(c) Cross-Variation Aggregation Block
Compared with the Variation Aggregation Block, this paper introduces a feature crossover mechanism on top of feature splicing and proposes the Cross-Variation Aggregation Block. By splicing the periodic features $\hat{X}^{i} \in \{\hat{X}^{1}, \hat{X}^{2}, \ldots, \hat{X}^{k}\}$ extracted by the $k$ DVMBs and generating cross features, the complex interrelationships between different cycles are further captured. Specifically, the block multiplies the features of every pair of cycles element by element to generate the interaction feature $f_{i,j}^{cross}$ and splices it into the original feature matrix, thus revealing the potential nonlinear relationships between the periodic features. The combined features are then fed into a linear layer, which compresses the rich feature information to the target window size. Finally, the multivariate forecast $X_O \in \mathbb{R}^{T \times d}$ is obtained by stacking the $d$ univariate forecasts, which is expressed as follows:
$$M_f = \mathrm{concat}\left( \hat{X}^{1}, \hat{X}^{2}, \ldots, \hat{X}^{k} \right)$$
$$f_{i,j}^{cross} = \hat{X}^{i} \odot \hat{X}^{j}$$
$$M_f^{cross} = \mathrm{concat}\left( f_{1,2}^{cross}, f_{1,3}^{cross}, \ldots, f_{i,j}^{cross} \right)$$
$$X_O = \mathrm{Linear}\left( \mathrm{concat}\left( M_f, M_f^{cross} \right) \right)$$
In these formulas, $\odot$ represents element-wise multiplication, $M$ represents the feature matrix, $\mathrm{concat}(\cdot)$ denotes the concatenation operation, and $\mathrm{Linear}(\cdot)$ refers to the linear layer operation.
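The feature-crossing step can be illustrated with a small NumPy sketch. It assumes that the k periodic features have already been brought to a common shape so that element-wise products are defined, and it stands in for the learned linear layer with an explicit weight matrix and bias; the function name and argument layout are hypothetical, not the authors' PyTorch module.

```python
import numpy as np
from itertools import combinations

def cross_variation_aggregate(features, weight, bias):
    """Concatenate k periodic features with all pairwise products, then map linearly.

    features: list of k arrays, each of shape (n_features,)
    weight:   array of shape (T, n_in), n_in = n_features * (k + k * (k - 1) / 2)
    bias:     array of shape (T,)
    """
    m_f = np.concatenate(features)                            # M_f
    crosses = [a * b for a, b in combinations(features, 2)]   # f_cross_{i,j}, i < j
    m_cross = np.concatenate(crosses) if crosses else np.empty(0)  # M_f^cross
    combined = np.concatenate([m_f, m_cross])
    return weight @ combined + bias                           # Linear(concat(...))
```

With k periods the input to the linear layer grows from k to k + k(k - 1)/2 feature groups, so the crossing adds quadratic interaction terms at the cost of a wider projection.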

2.4.2. Pyraformer

Pyraformer effectively captures short-term and long-term dependencies in time series data while maintaining low time complexity and spatial complexity through the pyramidal attention module (PAM) and the coarser-scale construction module (CSCM). The overall structure is shown in Figure 6.
(a) Pyramidal Attention Module
As shown in Figure 7, the time dependence of the observation time series is described in a multi-resolution manner, which is mainly composed of the following two parts: inter-scale connection and intra-scale connection. Specifically, let $n_l^{(s)}$ denote the $l$-th node under scale $s$, where $s = 1, \ldots, S$ denotes the scale from bottom to top. Each node in the graph can focus on a set of neighbor nodes $\mathbb{N}_l^{(s)}$ on three scales as follows: the $A$ adjacent nodes on the same scale, including the node itself (denoted as $\mathbb{A}_l^{(s)}$); its $C$ child nodes in the $C$-ary tree (denoted as $\mathbb{C}_l^{(s)}$); and its parent node in the $C$-ary tree (denoted as $\mathbb{P}_l^{(s)}$), expressed as follows:
$$\begin{aligned}
\mathbb{N}_l^{(s)} &= \mathbb{A}_l^{(s)} \cup \mathbb{C}_l^{(s)} \cup \mathbb{P}_l^{(s)} \\
\mathbb{A}_l^{(s)} &= \left\{ n_j^{(s)} : |j - l| \le \frac{A - 1}{2},\ 1 \le j \le \frac{L}{C^{s-1}} \right\} \\
\mathbb{C}_l^{(s)} &= \left\{ n_j^{(s-1)} : (l - 1)C < j \le lC \right\} \ \text{if } s \ge 2 \text{ else } \varnothing \\
\mathbb{P}_l^{(s)} &= \left\{ n_j^{(s+1)} : j = \left\lceil \frac{l}{C} \right\rceil \right\} \ \text{if } s \le S - 1 \text{ else } \varnothing
\end{aligned}$$
Therefore, the attention at node $n_l^{(s)}$ can be simplified as follows:
$$y_i = \sum_{\ell \in \mathbb{N}_l^{(s)}} \frac{\exp\left( q_i k_\ell^{T} / \sqrt{d_K} \right) v_\ell}{\sum_{\ell \in \mathbb{N}_l^{(s)}} \exp\left( q_i k_\ell^{T} / \sqrt{d_K} \right)}$$
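To make the neighborhood definition concrete, the sketch below enumerates the nodes a given node may attend to, using 1-based indices as in the text. It assumes that L is divisible by C^(S-1) so that every scale has a whole number of nodes; the function name and the (scale, index) tuple representation are illustrative.

```python
import math

def pyramidal_neighbors(s, l, L, C, A, S):
    """Return the set of (scale, index) pairs that node n_l^(s) attends to."""
    length_s = L // C ** (s - 1)      # number of nodes at scale s
    half = (A - 1) // 2               # |j - l| <= (A - 1) / 2 for integer j
    same_scale = {(s, j) for j in range(l - half, l + half + 1)
                  if 1 <= j <= length_s}
    # Children live one scale below; the finest scale (s = 1) has none
    children = ({(s - 1, j) for j in range((l - 1) * C + 1, l * C + 1)}
                if s >= 2 else set())
    # The parent lives one scale above; the top scale (s = S) has none
    parent = {(s + 1, math.ceil(l / C))} if s <= S - 1 else set()
    return same_scale | children | parent
```

Each node attends to at most A + C + 1 others regardless of the sequence length, which is what keeps the attention cost linear in L rather than quadratic.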
(b) Coarser-Scale Construction Module
As shown in Figure 8, the CSCM applies multiple convolution layers with a kernel size of $C$ and a stride of $C$ to the embedded sequence along the time dimension, obtaining a sequence of length $L / C^{s}$ at scale $s$. The sequences obtained at different scales form a $C$-ary tree. To reduce the number of parameters and the amount of computation, the dimension of each node is reduced by a fully connected layer before the sequence is fed into the stacked convolution layers and is restored after all convolutions, which significantly reduces the number of parameters in the module and helps prevent overfitting.

3. Example Analysis

3.1. Data Foundation and Model Input

The electricity load data from a specific user-side park in Nantong, Jiangsu Province, for the first quarter of 2021 serve as the sample data, with a sampling interval of one hour, totaling 2160 data points. To verify the performance of the method presented in this paper in load forecasting tasks, the electric load power is selected as the sole input feature for the forecasting model (excluding meteorological and other features). To ensure effective model learning and to balance optimization and evaluation needs, enhancing the stability and reliability of the forecasts, the dataset is divided into training, validation, and test sets with a 7:1:2 ratio.
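As a small illustration of the 7:1:2 division, the sketch below splits a series in time order. Whether the authors split chronologically is an assumption, although it is the usual choice for forecasting tasks; the function name and the integer-ratio argument are hypothetical.

```python
def chronological_split(series, parts=(7, 1, 2)):
    """Split a sequence into train/validation/test blocks in time order."""
    n = len(series)
    total = sum(parts)
    n_train = n * parts[0] // total   # integer arithmetic avoids rounding surprises
    n_val = n * parts[1] // total
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:]   # the remainder goes to the test set
    return train, val, test
```

For the 2160 hourly points described above, this gives 1512 training, 216 validation, and 432 test points.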
To deeply explore the multi-scale features and multi-band characteristics inherent in electric load data, this paper proposes a bi-level decomposition strategy as follows: First, the original load data are subjected to quadratic smoothing using a HP filter, decomposing it into trend and cyclic components to precisely separate long-term trends and short-term fluctuations. Subsequently, the cyclic component is further decomposed using the CEEMDAN method to obtain multiple intrinsic mode functions (IMFs). CEEMDAN effectively suppresses mode mixing through adaptive noise injection, allowing for the extraction of load fluctuation components at different frequencies. This layered decomposition method not only fully reveals the multi-scale features of the electric load but also provides higher quality data inputs for subsequent forecasting models.
This paper utilizes HP filtering to extract the trend and cyclic components of the data. As shown in Figure 9, the trend component changes smoothly, reflecting the long-term evolutionary trend of the electrical load, whereas the cyclic component exhibits larger fluctuations, effectively capturing the short-term fluctuation characteristics of the electrical load. This decomposition method allows for a more intuitive revelation of the electric load’s variation characteristics across different time scales.
Considering the instability of the periodic component, this paper uses CEEMDAN to perform a secondary decomposition of it. By introducing white noise, CEEMDAN effectively suppresses the mode mixing problem of the traditional EMD method, thus making the decomposition result more stable. It can be seen from Figure 10 that the IMF components obtained after the secondary decomposition are more stable than the original periodic component, and they comprehensively reveal the multi-band characteristics of the electrical load. At the same time, because the magnitude of the residual component is much smaller than that of the IMF components, the residual component is ignored in this paper.

3.2. Evaluation Indicators

In this paper, the mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), and coefficient of determination ($R^2$) are used as the evaluation criteria of the model. The smaller the values of MAE, RMSE, and MAPE and the closer the value of $R^2$ is to 1, the smaller the deviation between the predicted and true values and the higher the forecasting accuracy of the model, as shown in Equations (20)–(23).
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
$$\mathrm{RMSE} = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 }$$
$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \frac{\left| y_i - \hat{y}_i \right|}{y_i} \times 100\%$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}$$
where $n$ represents the number of data points, $y_i$ represents the actual value, $\hat{y}_i$ represents the predicted value, and $\bar{y}$ represents the mean of the actual values.
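The four indicators can be computed directly from the definitions, as in the following NumPy sketch. The function name `evaluate` and the dictionary output are illustrative; the absolute value in the MAPE denominator is an added safeguard that has no effect on positive load values.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return MAE, RMSE, MAPE (in percent), and R^2 for two equal-length series."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    # MAPE is undefined when an actual value is zero
    mape = np.mean(np.abs(err) / np.abs(y_true)) * 100.0
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}
```

For example, actual values of 100, 200, 300, 400 and predictions of 110, 190, 310, 390 give MAE = RMSE = 10, MAPE of about 5.21%, and R² = 0.992.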

3.3. Model Comparison

The model in this paper is built in a Python 3.11 environment with the PyTorch framework, and it runs on hardware equipped with an Intel Core(TM) i5-13500HX CPU (2.50 GHz; Intel, Chandler, AZ, USA) and 8 GB of memory. Given the current lack of comprehensive theoretical guidance for hyperparameter selection, the parameters were initially set based on experience and then fine-tuned according to the model performance in the experiments. After multiple trials, the hyperparameters selected for this paper are as shown in Table 1.
In order to verify the validity of the model used in this method, a comparative experiment was set up for the main model, SCEPDF. The results of the comparative experiment and the performance of each model in the evaluation index are shown in Figure 11 and Figure 12 and Table 2. As shown in the table, the PDF model outperformed other comparative models in the quarterly electric load forecasting task, exhibiting better performance on metrics such as RMSE and MAE compared to models like LightTS, Reformer, and Crossformer. It showed an average improvement of 8.19%, 25.43%, and 2.27% on the RMSE metric, validating the advanced nature of the model used in this paper. Additionally, after introducing the Cross-Variation Aggregation Block to improve the PDF model, the SCEPDF demonstrated superior performance on all error evaluation metrics compared to the PDF, confirming the effectiveness of the improved model presented in this paper.
Additionally, to further validate the advancement and effectiveness of the model used in this paper, quarterly data for both heating and cooling loads were selected for testing, following the same parameter choices. The comparative results of the models are shown in Table A1 and Table A2. According to the tables, the improved PDF model (SCEPDF) achieved the best performance in both heating and cooling load data. In terms of the RMSE metric, it showed an average improvement of 7.326% compared to the original PDF and an average improvement of 37.45% compared to other comparative models. This confirms the generality of the improved model in multi-load forecasting tasks.

3.4. Complexity Analysis

Based on sample entropy, spectral entropy, and the Lempel–Ziv complexity, the complexity evaluation index system is established, and the complexity of the intrinsic mode function after modal decomposition is analyzed. The Min–Max normalization method is used to map each complexity evaluation index to the same data space. The final comprehensive complexity is obtained by the weighted summation of multiple complexity evaluation indexes. The complexity calculation results are shown in Table 3. The specific calculation method is as follows:
$$\gamma = \frac{\mathrm{MinMax}(S) + \mathrm{MinMax}(H) + \mathrm{MinMax}(C)}{3}$$
where S, H, and C represent the sample entropy, spectral entropy, and Lempel–Ziv complexity, respectively.
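
As a minimal sketch of this calculation (not the authors' implementation), the following Python snippet applies Min–Max normalization to each index and averages the three normalized values per component. The function names `min_max` and `comprehensive_complexity` are illustrative assumptions. The inputs are the rounded, already normalized values listed in Table 3, so the last digit of a result can differ from the table (for example, for IMF_5).

```python
def min_max(values):
    """Scale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def comprehensive_complexity(sample_entropy, spectral_entropy, lempel_ziv):
    """Average the three Min-Max-normalized indexes for each component."""
    s = min_max(sample_entropy)
    h = min_max(spectral_entropy)
    c = min_max(lempel_ziv)
    return [(a + b + d) / 3 for a, b, d in zip(s, h, c)]


# Rounded index values for IMF_1..IMF_9, taken from Table 3.
S = [1.00, 0.64, 0.45, 0.35, 0.21, 0.07, 0.02, 0.01, 0.00]
H = [1.00, 0.86, 0.57, 0.55, 0.43, 0.23, 0.12, 0.06, 0.00]
C = [0.77, 1.00, 0.90, 0.82, 0.49, 0.32, 0.36, 0.00, 0.42]

gamma = comprehensive_complexity(S, H, C)
for i, g in enumerate(gamma, start=1):
    print(f"IMF_{i}: {g:.2f}")
```

Because every column in Table 3 already spans 0 to 1, the normalization step leaves these inputs unchanged; with raw entropy values it would perform the actual rescaling.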

3.5. Verification of the Model Complementary Characteristics

To verify the complementary characteristics of the models, SCEPDF and Pyraformer were each used to forecast every IMF component. The forecasting results are shown in Figure 13 and Table 4. The table shows that SCEPDF and Pyraformer complement each other well on IMF components of different complexity: SCEPDF is suited to medium- and low-complexity IMF components, while Pyraformer is better suited to high-complexity IMF components. Moreover, although Pyraformer performs slightly worse than Crossformer on the overall forecasting task (Table 2), Table 4 shows that Pyraformer complements SCEPDF significantly better than Crossformer does.
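
The component-wise comparison can be sketched as follows. This is an illustrative snippet rather than the authors' code: it takes the comprehensive complexity from Table 3 and the RMSE values from Table 4 and reports which of the two models has the lower RMSE on each IMF component. The dictionary layout and the helper name `better_model` are assumptions made for the example.

```python
# component -> (comprehensive complexity from Table 3,
#               SCEPDF RMSE in kW, Pyraformer RMSE in kW from Table 4)
results = {
    "IMF_1": (0.92, 343.19, 324.94),
    "IMF_2": (0.83, 140.67, 132.19),
    "IMF_3": (0.64, 51.09, 49.23),
    "IMF_4": (0.57, 36.51, 19.16),
    "IMF_5": (0.37, 8.92, 5.058),
    "IMF_6": (0.21, 0.9147, 1.7099),
    "IMF_7": (0.17, 0.5549, 3.9253),
    "IMF_8": (0.02, 0.0079, 0.1288),
    "IMF_9": (0.14, 0.143, 4.1561),
}


def better_model(scepdf_rmse, pyraformer_rmse):
    """Return the name of the model with the lower RMSE."""
    return "SCEPDF" if scepdf_rmse < pyraformer_rmse else "Pyraformer"


assignment = {name: better_model(s, p) for name, (_, s, p) in results.items()}

# List components from highest to lowest comprehensive complexity.
for name, (complexity, _, _) in sorted(results.items(), key=lambda kv: -kv[1][0]):
    print(f"{name}  complexity={complexity:.2f}  lower RMSE: {assignment[name]}")
```

With these tabulated values, Pyraformer has the lower RMSE on IMF_1 to IMF_5 and SCEPDF has the lower RMSE on IMF_6 to IMF_9, so the split follows the ordering by comprehensive complexity.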
To further verify the complementary characteristics of SCEPDF and Pyraformer, the quarterly heat load and cooling load data were analyzed with the same processing flow. The complexity analysis results and forecasting indexes of each IMF component are shown in Table A3, Table A4, Table A5, and Table A6. The tables show that, across the different load types, SCEPDF performs well on low-complexity IMF components and effectively extracts local features and periodic patterns from the data. On high-complexity components, Pyraformer shows stronger global modeling ability than SCEPDF and captures long-term trends more accurately. The two models are therefore complementary on high- and low-complexity components. Exploiting this complementarity allows the characteristics of components of different complexity in a time series to be extracted fully, which further improves forecasting accuracy.

3.6. Ablation Experiment

To verify the effectiveness of the bi-level decomposition method and the complementary model proposed in this paper in the short-term electrical load forecasting task, ablation experiments were performed on the decomposition method and the complementary model, respectively. The results of the ablation experiments and the performance of the various forecasting methods on the evaluation indexes are shown in Figure 14, Figure 15, and Table 5. The table shows that the method proposed in this paper performs best in the electric load forecasting task. Both the bi-level decomposition method and the complexity-based complementary forecasting approach effectively enhance the accuracy of electric load predictions. After applying the HP decomposition method, the RMSE and MAE were reduced by 11.96% and 10.28%, respectively. After applying the CEEMDAN decomposition method, the RMSE and MAE were reduced by 28.65% and 28.08%, respectively. After applying the HP-CEEMDAN secondary decomposition method, the RMSE and MAE were reduced by 28.86% and 28.34%, respectively. The results show that the secondary decomposition method performs better in the forecasting task than either single decomposition method, which verifies the effectiveness of the bi-level decomposition method. In addition, introducing the complementary model improved the RMSE and MAE further. Compared with the single forecasting models that use only the secondary decomposition method, the RMSE and MAE were reduced by 14.75% and 14.13% on average, which verifies the effectiveness of the proposed method.
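
The percentage figures in this section express how much lower an error metric is relative to a baseline. A minimal sketch of that arithmetic, using RMSE values from Table 5, is given below. The helper name `relative_reduction` is an assumption for illustration, and small differences from the quoted figures can arise from rounding of the tabulated values.

```python
def relative_reduction(baseline, value):
    """Percentage by which `value` is lower than `baseline`."""
    return (baseline - value) / baseline * 100.0


# RMSE values in kW, taken from Table 5.
rmse = {
    "SCEPDF": 510.86,
    "CEEMDAN-SCEPDF": 364.48,
    "HP-CEEMDAN-SCEPDF": 363.39,
}

for method in ("CEEMDAN-SCEPDF", "HP-CEEMDAN-SCEPDF"):
    gain = relative_reduction(rmse["SCEPDF"], rmse[method])
    print(f"{method}: RMSE {gain:.2f}% lower than SCEPDF")
```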

4. Conclusions

To explore the multi-band characteristics of load data in depth and fully extract the characteristics of components of different complexity, this paper proposes a complementary forecasting method based on bi-level decomposition and a complexity-driven strategy. The effectiveness of the proposed method is verified by case analysis, and the following conclusions are drawn:
(1) The bi-level decomposition method based on HP filtering and CEEMDAN effectively improves the data input of the model and significantly improves the forecasting accuracy. Its effectiveness was verified by the ablation experiments: compared with the single model, the combined model based on bi-level decomposition improved the RMSE, MAE, MAPE, and R² by 28.87%, 28.35%, 28.27%, and 11.80%, respectively.
(2) The improved PDF and Pyraformer exhibit significant complementarity, and their combination achieves better forecasting results. SCEPDF focuses on extracting periodic and local features, making it suitable for handling medium- to low-complexity IMF components. In contrast, Pyraformer excels at capturing long-range dependencies and global patterns, making it appropriate for high-complexity IMF components. By selecting the corresponding model based on component complexity, their collaborative use can further enhance the accuracy of electric load forecasting.
(3) The complementary forecasting method based on bi-level decomposition and complexity-driven approaches fully exploits the characteristics of each component, effectively addressing the impacts of non-stationarity in load sequences, thereby further enhancing prediction accuracy. The effectiveness of this method was verified through ablation experiments, which demonstrated an average improvement of 14.75% in RMSE, 14.13% in MAE, 13.79% in MAPE, and 4.79% in R² compared to models based solely on bi-level decomposition.

Author Contributions

Formal analysis; methodology; writing—original draft; and supervision: X.D. Resources; software; data curation; writing—original draft: Y.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Science and Technology Program of State Grid Corporation of China (SGCC), “Theory and application research of flexible resource scheduling for ramping product”, grant number 5108-202416045A-1-1-ZN.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author due to legal and privacy reasons.

Acknowledgments

Zhaochen Luan (British Columbia Academy, Nanjing Foreign Language School) contributed to the data preprocessing and to the time series and modal decomposition work in this paper. In the case analysis, models such as Transformer and Crossformer were selected for part of the comparison and ablation experiments, which verified the effectiveness of the proposed method.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Comparison results for the heating load model.

| Method | RMSE (kW) | MAE (kW) | MAPE (%) | R² |
|---|---|---|---|---|
| LightTS | 738.63 | 566.49 | 4.9247 | 0.8511 |
| Reformer | 1283.43 | 998.35 | 8.3158 | 0.5503 |
| Crossformer | 673.11 | 518.13 | 4.4478 | 0.8763 |
| Transformer | 966.45 | 711.5 | 5.8472 | 0.745 |
| Informer | 845.59 | 661.09 | 5.8424 | 0.8048 |
| PDF | 564.84 | 425.19 | 3.6386 | 0.9127 |
| SCEPDF | 523.68 | 402.45 | 3.5008 | 0.925 |
| Pyraformer | 601.31 | 441.98 | 3.8006 | 0.9013 |

Table A2. Comparison results for the cooling load model.

| Method | RMSE (kW) | MAE (kW) | MAPE (%) | R² |
|---|---|---|---|---|
| LightTS | 812.82 | 584.97 | 5.3298 | 0.9298 |
| Reformer | 1050.92 | 757.61 | 6.6941 | 0.8826 |
| Crossformer | 575.13 | 428.19 | 4.008 | 0.9649 |
| Transformer | 880.24 | 673.47 | 5.8303 | 0.9177 |
| Informer | 1133.1 | 827.46 | 7.1071 | 0.8636 |
| PDF | 586.83 | 444.93 | 3.9513 | 0.9638 |
| SCEPDF | 543.61 | 409.57 | 3.8961 | 0.9869 |
| Pyraformer | 676.5 | 490.78 | 4.4126 | 0.9514 |

Table A3. Complexity analysis results of the IMF components after the secondary decomposition of the heat load.

| Components | Sample Entropy | Spectral Entropy | Lempel–Ziv Complexity | Comprehensive Complexity |
|---|---|---|---|---|
| IMF_1 | 1.00 | 0.95 | 1.00 | 0.98 |
| IMF_2 | 0.59 | 1.00 | 0.65 | 0.75 |
| IMF_3 | 0.47 | 0.99 | 0.36 | 0.60 |
| IMF_4 | 0.29 | 0.96 | 0.44 | 0.56 |
| IMF_5 | 0.20 | 0.92 | 0.32 | 0.48 |
| IMF_6 | 0.07 | 0.96 | 0.17 | 0.40 |
| IMF_7 | 0.02 | 0.93 | 0.09 | 0.35 |
| IMF_8 | 0.00 | 0.84 | 0.03 | 0.29 |
| IMF_9 | 0.00 | 0.00 | 0.00 | 0.00 |

Table A4. Forecasting results of the IMF components after the secondary decomposition of the heat load.

| Components | SCEPDF RMSE (kW) | SCEPDF MAE (kW) | Pyraformer RMSE (kW) | Pyraformer MAE (kW) |
|---|---|---|---|---|
| IMF_1 | 269.49 | 222.73 | 261.57 | 215.14 |
| IMF_2 | 210.43 | 157.09 | 160.61 | 115.37 |
| IMF_3 | 104.19 | 85.67 | 141.94 | 114.18 |
| IMF_4 | 30.62 | 21.4 | 40.01 | 27.01 |
| IMF_5 | 11.37 | 9.09 | 10.52 | 7.92 |
| IMF_6 | 2.7 | 2.02 | 4.31 | 3.53 |
| IMF_7 | 0.58 | 0.49 | 1.34 | 1.07 |
| IMF_8 | 0.04 | 0.03 | 11.24 | 8.47 |
| IMF_9 | 0.05 | 0.05 | 0.1 | 0.09 |

Table A5. Complexity analysis results of the IMF components after the secondary decomposition of the cooling load.

| Components | Sample Entropy | Spectral Entropy | Lempel–Ziv Complexity | Comprehensive Complexity |
|---|---|---|---|---|
| IMF_1 | 1.00 | 0.93 | 1.00 | 0.98 |
| IMF_2 | 0.40 | 1.00 | 0.38 | 0.59 |
| IMF_3 | 0.18 | 0.96 | 0.40 | 0.51 |
| IMF_4 | 0.28 | 0.93 | 0.39 | 0.53 |
| IMF_5 | 0.09 | 0.87 | 0.22 | 0.39 |
| IMF_6 | 0.02 | 0.88 | 0.14 | 0.35 |
| IMF_7 | 0.01 | 0.85 | 0.06 | 0.31 |
| IMF_8 | 0.00 | 0.00 | 0.00 | 0.00 |

Table A6. Forecasting results of the IMF components after the secondary decomposition of the cooling load.

| Components | SCEPDF RMSE (kW) | SCEPDF MAE (kW) | Pyraformer RMSE (kW) | Pyraformer MAE (kW) |
|---|---|---|---|---|
| IMF_1 | 213.54 | 175.27 | 197.11 | 158.98 |
| IMF_2 | 281.93 | 215.54 | 253.7 | 199.95 |
| IMF_3 | 37.57 | 27.67 | 41.47 | 28.03 |
| IMF_4 | 17.67 | 13.03 | 10.47 | 7.72 |
| IMF_5 | 4.3 | 3.05 | 7.9 | 5.92 |
| IMF_6 | 0.96 | 0.778 | 1.94 | 1.82 |
| IMF_7 | 0.73 | 0.65 | 2.35 | 2.25 |
| IMF_8 | 0.62 | 0.96 | 6.43 | 6.38 |

References

  1. Liao, W.; Wang, S.; Yang, D.; Yang, Z.; Fang, J.; Rehtanz, C.; Porté-Agel, F. TimeGPT in load forecasting: A large time series model perspective. Appl. Energy 2025, 379, 124973. [Google Scholar]
  2. Jalalifar, R.; Delavar, M.R.; Ghaderi, S.F. SAC-ConvLSTM: A novel spatio-temporal deep learning-based approach for a short term power load forecasting. Expert Syst. Appl. 2024, 237 Pt B, 121487. [Google Scholar]
  3. Kong, X.; Li, C.; Wang, C.; Zhang, Y.; Zhang, J. Short-term electrical load forecasting based on error correction using dynamic mode decomposition. Appl. Energy 2020, 261, 114368. [Google Scholar]
  4. Pappas, S.S.; Ekonomou, L.; Karampelas, P.; Karamousantas, D.C.; Katsikas, S.K.; Chatzarakis, G.E.; Skafidas, P.D. Electricity demand load forecasting of the Hellenic power system using an ARMA model. Electr. Power Syst. Res. 2010, 3, 256–264. [Google Scholar] [CrossRef]
  5. Wang, X.; Kang, Y.; Hyndman, R.J.; Li, F. Distributed ARIMA models for ultra-long time series. Int. J. Forecast. 2023, 39, 1163–1184. [Google Scholar]
  6. Mashaly, A.F.; Alazba, A.A. MLP and MLR models for instantaneous thermal efficiency prediction of solar still under hyper-arid environment. Comput. Electron. Agric. 2016, 122, 146–155. [Google Scholar]
  7. Wang, C.; Zhao, H.; Liu, Y.; Fan, G. Minute-level ultra-short-term power load forecasting based on time series data features. Appl. Energy 2024, 372, 123801. [Google Scholar] [CrossRef]
  8. Li, K.; Mu, Y.; Yang, F.; Wang, H.; Yan, Y.; Zhang, C. A novel short-term multi-energy load forecasting method for integrated energy system based on feature separation-fusion technology and improved CNN. Appl. Energy 2023, 351, 121823. [Google Scholar]
  9. Bu, X.; Wu, Q.; Zhou, B.; Li, C. Hybrid short-term load forecasting using CGAN with CNN and semi-supervised regression. Appl. Energy 2023, 338, 120920. [Google Scholar]
  10. Gang, W.; Wang, J. Predictive ANN models of ground heat exchanger for the control of hybrid ground source heat pump systems. Appl. Energy 2013, 112, 1146–1153. [Google Scholar]
  11. Agarwal, H.; Mahajan, G.; Shrotriya, A.; Shekhawat, D. Predictive Data Analysis: Leveraging RNN and LSTM Techniques for Time Series Dataset. Procedia Comput. Sci. 2024, 235, 979–989. [Google Scholar] [CrossRef]
  12. Jonkers, J.; Avendano, D.N.; Van Wallendael, G.; Van Hoecke, S. A novel day-ahead regional and probabilistic wind power forecasting framework using deep CNNs and conformalized regression forests. Appl. Energy 2024, 361, 122900. [Google Scholar] [CrossRef]
  13. Li, L.; Meinrenken, C.J.; Modi, V.; Culligan, P.J. Short-term apartment-level load forecasting using a modified neural network withselected auto-regressive features. Appl. Energy 2021, 287, 116509. [Google Scholar] [CrossRef]
  14. Fekri, M.N.; Patel, H.; Grolinger, K.; Sharma, V. Deep learning for load forecasting with smart meter data: Online Adaptive Recurrent Neural Network. Appl. Energy 2021, 282 Pt A, 116177. [Google Scholar] [CrossRef]
  15. Wang, Z.; Liu, X.; Huang, Y.; Zhang, P.; Fu, Y. A multivariate time series graph neural network for district heat load forecasting. Energy 2023, 278 Pt A, 127911. [Google Scholar] [CrossRef]
  16. Yang, D.; Li, M.; Guo, J.-E.; Du, P. An attention-based multi-input LSTM with sliding window-based two-stage decomposition for wind speed forecasting. Appl. Energy 2024, 375, 124057. [Google Scholar] [CrossRef]
  17. He, F.; Zhou, J.; Feng, Z.-K.; Liu, G.; Yang, Y. A hybrid short-term load forecasting model based on variational mode decomposition and long short-term memory networks considering relevant factors with Bayesian optimization algorithm. Appl. Energy 2019, 237, 103–116. [Google Scholar] [CrossRef]
  18. Li, K.; Duan, P.; Cao, X.; Cheng, Y.; Zhao, B.; Xue, Q.; Feng, M. A multi-energy load forecasting method based on complementary ensemble empirical model decomposition and composite evaluation factor reconstruction. Appl. Energy 2024, 365, 123283. [Google Scholar] [CrossRef]
  19. Ma, K.; Nie, X.; Yang, J.; Zha, L.; Li, G.; Li, H. A power load forecasting method in port based on VMD-ICSS-hybrid neural network. Appl. Energy 2025, 377 Pt B, 124246. [Google Scholar] [CrossRef]
  20. Shi, J.; Teh, J. Load forecasting for regional integrated energy system based on complementary ensemble empirical mode decomposition and multi-model fusion. Appl. Energy 2024, 353 Pt B, 122146. [Google Scholar] [CrossRef]
  21. Saini, P.; Parida, S.K. A novel probabilistic gradient boosting model with multi-approach feature selection and iterative seasonal trend decomposition for short-term load forecasting. Energy 2024, 294, 130975. [Google Scholar]
  22. Qiu, Y.; He, Z.; Zhang, W.; Yin, X.; Ni, C. MSGCN-ISTL: A multi-scaled self-attention-enhanced graph convolutional network with improved STL decomposition for probabilistic load forecasting. Expert Syst. Appl. 2024, 238 Pt A, 121737. [Google Scholar]
  23. Dai, T.; Wu, B.; Liu, P.; Li, N.; Bao, J.; Jiang, Y.; Xia, S.-T. Periodicity Decoupling Framework for Long Term Series Forecasting. In Proceedings of the 2024 12th International Conference on LeFarning Representations, Vienna, Austria, 7–11 May 2024. [Google Scholar]
  24. Liu, S.; Yu, H.; Liao, C.; Li, J.; Lin, W.; Liu, A.X.; Dustdar, S. Pyraformer: Low complexity pyramidal attention for long-range time series modeling and forecasting. In Proceedings of the 2022 10th International Conference on LeFarning Representations, Online, 25–29 April 2022. [Google Scholar]
Figure 1. Overall architecture diagram.
Figure 2. CEEMDAN flowchart.
Figure 3. The architecture of SCEPDF.
Figure 4. Multi-Periodic Decoupling Block.
Figure 5. Dual Variation Modeling Block.
Figure 6. Architecture of Pyraformer.
Figure 7. Pyramid attention mechanism diagram.
Figure 8. Coarser-scale construction module.
Figure 9. HP filter decomposition results.
Figure 10. CEEMDAN decomposition results.
Figure 11. Forecasting effect diagram of each model.
Figure 12. Schematic diagram of the forecasting error evaluation index of each model.
Figure 13. Forecasting effect diagram of each IMF component.
Figure 14. Forecasting effect diagram of each method.
Figure 15. Schematic diagram of each method’s forecasting error evaluation index.
Table 1. Parameter settings.

| Parameter Name | Parameter Value |
|---|---|
| Seq len | 24 |
| Number of attention heads | 8 |
| Epoch | 50 |
| Batch size | 24 |
| Patience | 10 |
| Dropout | 0.05 |
| Optimizer | Adam |
| Learning rate | 0.0001 |
| Activation function | GELU |
| Loss function | MSE |

Table 2. Model comparison results.

| Method | RMSE (kW) | MAE (kW) | MAPE (%) | R² |
|---|---|---|---|---|
| LightTS | 562.52 | 448.78 | 2.6731 | 0.7416 |
| Reformer | 691.99 | 555.15 | 3.2901 | 0.609 |
| Crossformer | 528.36 | 416.76 | 2.4787 | 0.7721 |
| Transformer | 676.74 | 551.64 | 3.2998 | 0.6267 |
| Informer | 605.78 | 476.8 | 2.8394 | 0.7004 |
| PDF | 516.63 | 404.39 | 2.3937 | 0.7821 |
| SCEPDF | 510.86 | 400.6 | 2.3755 | 0.7869 |
| Pyraformer | 548.84 | 425.69 | 2.5374 | 0.7541 |

Table 3. Complexity analysis.

| Components | Sample Entropy | Spectral Entropy | Lempel–Ziv Complexity | Comprehensive Complexity |
|---|---|---|---|---|
| IMF_1 | 1.00 | 1.00 | 0.77 | 0.92 |
| IMF_2 | 0.64 | 0.86 | 1.00 | 0.83 |
| IMF_3 | 0.45 | 0.57 | 0.90 | 0.64 |
| IMF_4 | 0.35 | 0.55 | 0.82 | 0.57 |
| IMF_5 | 0.21 | 0.43 | 0.49 | 0.37 |
| IMF_6 | 0.07 | 0.23 | 0.32 | 0.21 |
| IMF_7 | 0.02 | 0.12 | 0.36 | 0.17 |
| IMF_8 | 0.01 | 0.06 | 0.00 | 0.02 |
| IMF_9 | 0.00 | 0.00 | 0.42 | 0.14 |

Table 4. Comparison results of complementary models.

| Components | SCEPDF RMSE (kW) | SCEPDF MAE (kW) | Pyraformer RMSE (kW) | Pyraformer MAE (kW) | Crossformer RMSE (kW) | Crossformer MAE (kW) |
|---|---|---|---|---|---|---|
| IMF_1 | 343.19 | 276.77 | 324.94 | 258.02 | 335.91 | 272.86 |
| IMF_2 | 140.67 | 105.99 | 132.19 | 99.34 | 168.44 | 124.7 |
| IMF_3 | 51.09 | 39.26 | 49.23 | 39.01 | 54.7 | 35.69 |
| IMF_4 | 36.51 | 25.24 | 19.16 | 15.49 | 20.56 | 15.27 |
| IMF_5 | 8.92 | 6.99 | 5.058 | 3.6289 | 5.76 | 3.71 |
| IMF_6 | 0.9147 | 0.634 | 1.7099 | 1.4319 | 0.4061 | 0.3258 |
| IMF_7 | 0.5549 | 0.3689 | 3.9253 | 2.3734 | 4.63 | 3.67 |
| IMF_8 | 0.0079 | 0.0047 | 0.1288 | 0.1168 | 0.0843 | 0.0698 |
| IMF_9 | 0.143 | 0.0652 | 4.1561 | 4.0526 | 3.2769 | 3.1799 |

Table 5. Ablation experiment.

| Method | RMSE (kW) | MAE (kW) | MAPE (%) | R² |
|---|---|---|---|---|
| SCEPDF | 510.86 | 400.6 | 2.37 | 0.7869 |
| Pyraformer | 548.84 | 425.69 | 2.53 | 0.7541 |
| HP-SCEPDF | 455.46 | 358.95 | 2.13 | 0.8306 |
| CEEMDAN-SCEPDF | 364.48 | 287.72 | 1.71 | 0.8915 |
| HP-CEEMDAN-LightTS | 483.81 | 375.4 | 2.2253 | 0.8089 |
| HP-CEEMDAN-Reformer | 487.69 | 376.05 | 2.2274 | 0.8058 |
| HP-CEEMDAN-Crossformer | 386.16 | 303.09 | 1.8109 | 0.8783 |
| HP-CEEMDAN-Transformer | 370.14 | 296.12 | 1.7586 | 0.8881 |
| HP-CEEMDAN-Informer | 407.65 | 320.99 | 1.9083 | 0.8643 |
| HP-CEEMDAN-SCEPDF | 363.39 | 287.03 | 1.70 | 0.8922 |
| Proposed | 354.74 | 280.34 | 1.67 | 0.8973 |


Share and Cite

MDPI and ACS Style

Dou, X.; He, Y. A Short-Term Electricity Load Complementary Forecasting Method Based on Bi-Level Decomposition and Complexity Analysis. Mathematics 2025, 13, 1066. https://doi.org/10.3390/math13071066
