FTimeDD: A Time–Frequency Collaborative Model for Multi-Energy Load Forecasting

Lin, Zi; Wang, Ziyi; Guo, Tengyue; Xia, Min

doi:10.3390/en19112729

¹ Collaborative Innovation Center on Atmospheric Environment and Equipment Technology, Nanjing University of Information Science and Technology, Nanjing 210044, China

² Department of Computer Science, University of Reading, Whiteknights, Reading RG6 6DH, UK

³ State Key Laboratory of Environment Characteristics and Effects for Near-space, Nanjing University of Information Science and Technology, Nanjing 210044, China

* Author to whom correspondence should be addressed.

Energies 2026, 19(11), 2729; https://doi.org/10.3390/en19112729 (registering DOI)

Submission received: 25 April 2026 / Revised: 29 May 2026 / Accepted: 3 June 2026 / Published: 5 June 2026

(This article belongs to the Special Issue Artificial Intelligence for Energy Forecasting)

Abstract

With the global energy transition, Integrated Energy Systems (IESs) improve efficiency by coordinating multiple energy sources, including electricity, cooling, and heating. Accurate load forecasting is essential for reliable energy system operation. However, multi-energy loads show complex coupling, non-stationarity, and long-term dependencies. These characteristics pose significant challenges to forecasting tasks. Existing methods have improved short-term forecasting accuracy, but still struggle to jointly capture long-term trends and local fluctuations. To address these issues, this paper proposes FTimeDD, a time–frequency collaborative model for multi-energy load forecasting in IESs. It adopts a dual-path decoupling architecture. The time-domain path separates trend and fluctuation components, while the frequency-domain path extracts dominant periodic features. The two paths are then fused to predict electricity, cooling, and heating loads. Experiments on the ASU Integrated Energy System dataset show that FTimeDD performs well across different forecasting horizons. Compared with the strongest baseline for each metric and horizon, FTimeDD reduces MAE, RMSE, and MAPE by 3.85%, 2.48%, and 1.91% on average, respectively. The method improves forecasting accuracy under the adopted experimental setting while maintaining a compact model scale and low computational cost.

Keywords: IES; multi-energy load forecasting; long-term dependencies; dual-path decoupling architecture; time–frequency collaborative modeling

1. Introduction

With the increasing penetration of renewable energy and the growing need for low-carbon operation, modern energy systems are becoming more integrated and flexible. IESs (shown in Figure 1) coordinate electricity, heating, cooling, and other energy carriers to improve energy efficiency and operational resilience [1,2]. In practice, multi-energy loads exhibit significant coupling relationships and complex dynamic patterns, which are influenced by meteorological and temporal factors. Therefore, multi-energy load forecasting in IESs is more difficult than single-energy forecasting. Accurate load forecasting is important for the secure, stable, and coordinated operation of IESs [3].

Early load forecasting was largely based on statistical models. In univariate forecasting tasks, classical models such as ARIMA and SARIMA perform predictions by modeling the statistical characteristics of time series [4]. Meanwhile, machine learning models such as Support Vector Machines and Random Forests incorporate exogenous variables to improve model expressiveness and forecasting accuracy [5,6]. However, these models depend on manual feature engineering and struggle to capture nonlinear coupling relationships among multiple variables. Furthermore, they are prone to error accumulation in multi-step forecasting, limiting their applicability in complex load scenarios.

As deep learning has advanced, neural network methods have been broadly adopted for multi-energy load forecasting [7,8]. Early research focused on univariate forecasting, using models such as RNN and LSTM to capture temporal dependencies [9,10]. Subsequently, the research expanded to single-step and short-term forecasting tasks, improving accuracy through architectural optimization [11,12]. Furthermore, multivariate multi-step forecasting methods were proposed, enabling joint modeling of multi-energy loads by incorporating meteorological and temporal features [13,14,15,16]. Although deep learning improves forecasting performance, it still has limitations in capturing long-term dependencies and complex dynamic patterns [17].

In recent years, Transformer-based models have been widely used in time series forecasting [18,19,20]. Zhou et al. proposed Informer, which reduces computational complexity via the ProbSparse attention mechanism and enables efficient long-sequence modeling [21]. Wu et al. introduced Autoformer, which enhances modeling of trend and periodic components via a series decomposition architecture [22]. Zhou et al. further proposed FEDformer, which incorporates frequency-domain modeling into the Transformer framework to extract critical spectral features [23]. These methods achieve high accuracy in multivariate multi-step forecasting tasks. However, these Transformer-based models rely on complex attention mechanisms, leading to large parameter scales and high computational costs. They are also prone to overfitting in noisy or non-stationary environments.

In contrast, RNN-based models are effective in modeling local temporal dynamics in time series. Cho et al. developed forecasting models based on the GRU structure, improving computational efficiency through simplified gating mechanisms [10]; Zeng et al. proposed SegRNN, which enhances local pattern modeling capabilities via sequence segmentation [24]. In addition, some studies adopt multi-layer architectures or multi-task learning to improve the representational capacity of multivariate sequences [13]. However, although RNN-based methods perform well in short-term forecasting, their recursive structures suffer from information decay in long-sequence modeling, and their sequential computation limits training efficiency.

Moreover, various hybrid models and probabilistic modeling methods have been proposed to further improve forecasting performance. Lai et al. proposed LSTNet, which combines convolutional and recurrent neural networks to capture both local and long-term dependencies [13]. Lim et al. introduced the Temporal Fusion Transformer (TFT), achieving multi-source information fusion via a variable selection mechanism [14]. Salinas et al. developed DeepAR to model time series uncertainty from a probabilistic perspective [25]. Song et al. used a hierarchical multi-task learning model for multi-energy load forecasting [26]. Although these methods excel in multi-source information integration, their structures are complex and sensitive to parameter tuning. Consequently, they remain limited in capturing complex multi-energy coupling relationships.

Overall, existing methods still have three limitations. First, time-decomposition methods mainly focus on trend and seasonal components, but they do not fully exploit frequency-domain periodic information. Second, frequency-enhanced methods can extract periodic patterns, but they often lack explicit modeling of local fluctuations and trend evolution. Third, some high-performing models rely on complex structures, leading to higher computational costs. These limitations leave a clear problem: how to jointly model trends, fluctuations, and periodic patterns in a lightweight framework. To address this issue, this study proposes FTimeDD, a time–frequency collaborative model for multi-energy load forecasting. It integrates time-domain trend–fluctuation disentanglement with frequency-domain periodic enhancement to capture multi-scale dynamics in multi-energy loads. The study’s primary contributions are as follows:

(1): A multi-energy load forecasting model, FTimeDD, is developed to handle multivariate inputs and multi-horizon outputs, enabling unified modeling of multi-energy loads.
(2): A dual-path decoupling mechanism is designed to achieve time–frequency collaboration. The time-domain path captures trend evolution and local fluctuations. The frequency-domain path enhances dominant periodic patterns.
(3): A compact multi-scale fusion module is developed to integrate time- and frequency-domain representations. This module learns cross-load interactions while controlling the parameter scale.
(4): Experiments on the ASU Integrated Energy System dataset verify the effectiveness of FTimeDD. The results show lower forecasting errors than several baseline models across different horizons, with competitive computational efficiency.

2. Problem Definition and Feature Analysis

2.1. Problem Definition

In an IES, multi-energy load forecasting aims to characterize the coupling relationships among multiple loads based on historical observations, and thereby predict future load variations. This task can be formulated as a multivariate multi-step time series forecasting problem [27].

Let the input historical multivariate sequence be $X \in \mathbb{R}^{B \times T_{seq} \times C_{in}}$, where $B$ denotes the batch size, $T_{seq}$ denotes the look-back window length, and $C_{in}$ represents the input feature dimension. The objective is to predict the multi-energy load states $Y \in \mathbb{R}^{B \times T_{pred} \times C_{out}}$ over a forecasting horizon of $T_{pred}$ time steps by learning a mapping function $f$. Here, $C_{out} = 3$ corresponds to the three target loads: electricity, cooling, and heating. The forecasting process is formulated as:

$$\hat{Y} = f(X; \theta). \tag{1}$$

During training, the error between the predicted sequence $\hat{Y}$ and the ground truth $Y$ is minimized with respect to $\theta$, the set of learnable parameters. In this study, Mean Absolute Error (MAE) is adopted as the optimization objective:

$$L_{MAE} = \frac{1}{B \cdot T_{pred} \cdot C_{out}} \sum_{b=1}^{B} \sum_{t=1}^{T_{pred}} \sum_{c=1}^{C_{out}} \left| \hat{Y}_{b,t,c} - Y_{b,t,c} \right|. \tag{2}$$

By minimizing this loss function, the model parameters $\theta$ are iteratively updated to achieve accurate prediction of multi-energy load sequences.
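As a minimal sketch of the MAE objective in Equation (2), the snippet below averages absolute errors over the batch, horizon, and channel axes. The array shapes and values are illustrative assumptions, and NumPy stands in for the PyTorch tensors used in the experiments.

```python
import numpy as np

def mae_loss(y_hat: np.ndarray, y: np.ndarray) -> float:
    """Mean absolute error over batch, horizon, and channel axes (Eq. (2))."""
    if y_hat.shape != y.shape:
        raise ValueError("prediction and target must share the shape (B, T_pred, C_out)")
    return float(np.mean(np.abs(y_hat - y)))

# Illustrative shapes only: B = 2 samples, T_pred = 24 steps, C_out = 3 loads.
B, T_pred, C_out = 2, 24, 3
y_true = np.zeros((B, T_pred, C_out))
y_pred = np.full((B, T_pred, C_out), 0.5)
loss = mae_loss(y_pred, y_true)   # every element is off by 0.5
```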

2.2. Feature Analysis

This study utilizes the public Integrated Energy System dataset from Arizona State University (ASU) in Tempe. The dataset spans from 2021 to 2023 with a 1 h sampling interval. Electricity, cooling, and heating loads are used as target variables. Meteorological and temporal variables, including ambient temperature, sea-level pressure, wet-bulb temperature, day of week, and greenhouse gas emission index, are used as auxiliary variables. These features cover the primary external factors affecting load variations, enabling the model to better describe complicated energy demand patterns while providing a reliable data foundation for modeling long-term dependencies.

To investigate the dynamic relationship between multi-energy loads and exogenous variables, a time series analysis is conducted, as shown in Figure 2.

Overall, all three load types exhibit distinct periodic patterns superimposed with non-stationary fluctuations. Cooling and heating loads show strong temporal consistency with temperature variations, exhibiting synchronization during high- and low-temperature periods. This indicates that temperature and temporal features are key driving factors for these loads. In contrast, electricity load exhibits more complex dynamic characteristics, with significant random fluctuations superimposed on its periodic patterns.

The above analysis suggests that multi-energy load sequences contain long-term trends as well as short-term variations, with varying impacts from external factors across different loads. To further quantify and validate variable correlations, Grey Relational Analysis (GRA) and Spearman rank correlation are employed to evaluate the relationships between multi-energy loads and auxiliary features, as shown in Figure 3a and Figure 3b, respectively.

GRA characterizes the strength of association by measuring the trend consistency across sequences, which makes it suitable for the nonlinear and non-stationary time series in this study. The grey relational coefficient is defined as:

$$\xi_i(k) = \frac{\Delta_{min} + \rho \Delta_{max}}{|x_0(k) - x_i(k)| + \rho \Delta_{max}}, \tag{3}$$

where $x_0$ is the reference sequence, $x_i$ is the comparison sequence, $\Delta_{min}$ and $\Delta_{max}$ are the minimum and maximum of $|x_0(k) - x_i(k)|$ over all comparison sequences and time steps, and $\rho$ is the resolution coefficient (typically 0.5). The grey relational grade is obtained by averaging the coefficients across time steps:

$$\gamma_i = \frac{1}{n} \sum_{k=1}^{n} \xi_i(k). \tag{4}$$

A higher grade represents a stronger correlation.
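Equations (3) and (4) can be sketched in a few lines. This is an illustrative implementation, not the authors' code; it assumes the sequences have already been scaled to comparable ranges, and the function name `grey_relational_grades` is hypothetical.

```python
import numpy as np

def grey_relational_grades(x0, xs, rho=0.5):
    """Return one grey relational grade per comparison sequence.

    x0: reference sequence, shape (n,)
    xs: comparison sequences, shape (m, n)
    """
    x0 = np.asarray(x0, dtype=float)
    xs = np.asarray(xs, dtype=float)
    delta = np.abs(xs - x0)                 # |x0(k) - xi(k)|, shape (m, n)
    d_min, d_max = delta.min(), delta.max() # global extremes over all i and k
    if d_max == 0.0:                        # every sequence equals the reference
        return np.ones(xs.shape[0])
    xi = (d_min + rho * d_max) / (delta + rho * d_max)   # Eq. (3)
    return xi.mean(axis=1)                                # Eq. (4)

x0 = [0.0, 0.5, 1.0]
xs = [[0.0, 0.5, 1.0],    # identical to the reference
      [1.0, 0.5, 0.0]]    # mirrored trend
grades = grey_relational_grades(x0, xs)
```

The identical sequence receives a grade of 1, while the mirrored sequence receives 5/9, which shows how opposing trends lower the grade.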

The upper triangle in Figure 3a displays the relational grades between auxiliary features and target loads. All values are above 0.6, indicating that meteorological and temporal features are closely associated with the loads and that incorporating them can improve the model’s representational ability. Specifically, temperature shows a strong correlation with cooling and heating loads, while the day-of-week feature significantly influences electricity load. These differences reflect that load variations are driven by multiple factors. Thus, multi-dimensional feature fusion is essential to improve model expressiveness [28]. The fitting results in the lower triangle reveal complex nonlinear coupling relationships among variables. This further suggests that traditional linear methods struggle to adequately capture such relationships, necessitating models capable of nonlinear representation. The histograms on the diagonal illustrate the statistical distributions of features, highlighting distinct differences in value ranges and shapes. This indicates that auxiliary features are complementary rather than redundant in describing load variations.

To further support the GRA-based analysis, Spearman rank correlation was used as a complementary method. GRA measures the similarity of variation trends, while Spearman correlation evaluates monotonic relationships among variables. The Spearman correlation heatmap is shown in Figure 3b. Electricity and cooling loads show a strong positive correlation, with a coefficient of 0.82, while heating load is negatively correlated with electricity and cooling loads, with coefficients of −0.75 and −0.81, respectively. Temperature-related variables are positively correlated with electricity and cooling loads but negatively correlated with heating load, which is consistent with the opposite seasonal patterns of cooling and heating demands. In addition, the day-of-week (DOW) and greenhouse gas emission index (GHG) features also show noticeable correlations with the loads, indicating that temporal and exogenous variables contain useful forecasting information. These results are consistent with the GRA analysis and further support the use of meteorological and temporal variables as auxiliary inputs.

Overall, the GRA and Spearman analyses provide consistent evidence for the complex relationships in multi-energy loads. Temperature variations are closely related to cooling and heating loads, while electricity load is also affected by temporal and exogenous factors. This reflects coupling relationships and complex dynamic patterns between different loads. Therefore, it is necessary to extract key information from multi-dimensional features and perform decomposition-based modeling to enhance the representation of complex load patterns.

3. Method

This paper proposes FTimeDD, a time–frequency collaborative modeling framework for multi-energy load forecasting. It captures multi-scale features by jointly modeling sequences from the time and frequency domains. It employs a dual-path parallel architecture, consisting of three core modules: the Frequency Filtering Module (FFM), the Dual-stream Disentanglement Module (DSDM), and the Multi-scale Feature Fusion Module (MSFFM). The framework consists of five stages: input preprocessing, time-domain feature extraction, frequency-domain modeling, dual-path feature fusion, and forecast generation. Specifically, the historical multivariate sequence is first processed by the FFM to reduce noise and alleviate non-stationarity. The denoised sequence is then sent to two parallel paths. The time-domain path extracts trend and fluctuation features, while the frequency-domain path captures dominant periodic information. The MSFFM fuses the two types of features and generates the final forecasts for electricity, cooling, and heating loads, as illustrated in Figure 4.

3.1. Frequency Filtering Module (FFM)

To address high-frequency noise and non-stationary distributions in raw load data, the Frequency Filtering Module (FFM) is employed for frequency-domain preprocessing. The module integrates low-pass filtering (LPF) and complex-valued linear layers, utilizing frequency-domain interpolation to suppress noise and enhance periodic features, providing high-quality inputs for subsequent forecasting.

First, the input data undergoes reversible instance normalization (RIN) [29]. Using the mean $\mu$ and standard deviation $\sigma$ computed over the look-back window, the input sequence is converted into a normalized sequence $Z$ with zero mean and unit variance, improving the modeling of non-stationary sequences:

$$Z = \frac{X - \mu}{\sigma}, \quad \mu = \frac{1}{T_{seq}} \sum_{i=1}^{T_{seq}} x_i, \quad \sigma = \sqrt{\frac{1}{T_{seq}} \sum_{i=1}^{T_{seq}} (x_i - \mu)^2}. \tag{5}$$
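The normalization in Equation (5) and its inverse can be sketched as follows. The small constant `eps` is an assumption added for numerical safety and is not part of Equation (5); learnable affine parameters, which some RIN variants include, are omitted.

```python
import numpy as np

def rin_normalize(x, eps=1e-5):
    """x: (B, T_seq, C_in). Statistics are taken per sample and per channel."""
    mu = x.mean(axis=1, keepdims=True)
    sigma = x.std(axis=1, keepdims=True)     # population std (1/T_seq), as in Eq. (5)
    z = (x - mu) / (sigma + eps)             # eps is an assumed safeguard
    return z, mu, sigma

def rin_denormalize(z, mu, sigma, eps=1e-5):
    """Inverse step: restore the original scale from the stored statistics."""
    return z * (sigma + eps) + mu

rng = np.random.default_rng(0)
x = rng.normal(loc=50.0, scale=10.0, size=(4, 168, 8))   # synthetic data
z, mu, sigma = rin_normalize(x)
x_back = rin_denormalize(z, mu, sigma)
```

Because the statistics are stored per window, the same `mu` and `sigma` can later rescale a forecast back to the original data scale.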

The normalized signal is mapped to the complex frequency domain via real Fast Fourier Transform (rFFT) to obtain complex frequency-domain features $F(Z)$:

$$F(Z) = \mathrm{rFFT}(Z) \in \mathbb{C}^{K \times C_{in}}, \tag{6}$$

where $K = \lfloor T_{seq} / 2 \rfloor + 1$ is the total number of frequency points.

The model performs hard-threshold low-pass filtering (LPF) on the spectrum based on a dominant-frequency cutoff $f_{dom}$, thereby filtering out irregular dynamic interference. In this study, $f_{dom}$ is determined based on the look-back window and the daily periodicity of the load sequences. It is calculated as:

$$f_{dom} = \left( \left\lfloor \frac{T_{seq}}{T_{day}} \right\rfloor + 1 \right) H + \delta_f, \tag{7}$$

where $T_{seq}$ denotes the look-back window length, $T_{day}$ denotes the number of time steps in one daily period, $H$ denotes the retained harmonic order, and $\delta_f$ is an additional low-frequency margin. Since the dataset is sampled hourly, $T_{day} = 24$. With $T_{seq} = 168$, $H = 6$, and $\delta_f = 20$, the cutoff is $f_{dom} = (7 + 1) \cdot 6 + 20 = 68$. This cutoff is used to retain the dominant low-frequency components and daily-period-related harmonic information in the frequency filtering process. The LPF process is formulated as:

$$F_{low} = M \odot F(Z), \tag{8}$$

where $M \in \{0, 1\}^{K \times 1}$ is a mask vector, with $M_k = 1$ if the frequency index $k < f_{dom}$, and 0 otherwise. This process suppresses high-frequency noise and preserves the primary periodic structure, reducing noise interference and computational redundancy.
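The cutoff rule and the hard mask can be checked numerically. The sketch below reads the bracket in Equation (7) as floor division, which is an assumption, and applies the mask to a synthetic signal made of a daily cycle plus a fast component; the helper names are hypothetical.

```python
import numpy as np

def dominant_cutoff(t_seq, t_day=24, harmonics=6, margin=20):
    """Eq. (7), with the bracket read as floor division (an assumption)."""
    return (t_seq // t_day + 1) * harmonics + margin

def low_pass(z, f_dom):
    """z: (B, T_seq, C_in). Zero every rFFT bin whose index is >= f_dom."""
    spec = np.fft.rfft(z, axis=1)                          # (B, K, C_in)
    mask = (np.arange(spec.shape[1]) < f_dom)[None, :, None]
    return spec * mask                                     # Eq. (8)

t_seq = 168
f_dom = dominant_cutoff(t_seq)                             # (7 + 1) * 6 + 20
t = np.arange(t_seq)
daily = np.sin(2 * np.pi * t / 24)                         # 7 cycles -> bin 7
fast = 0.3 * np.sin(2 * np.pi * 80 * t / t_seq)            # bin 80, above the cutoff
z = (daily + fast)[None, :, None]
f_low = low_pass(z, f_dom)
z_low = np.fft.irfft(f_low, n=t_seq, axis=1)               # fast component removed
```

With a 168 h window there are $K = 85$ frequency bins, so the mask keeps bins 0 to 67 and discards the remaining 17.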

The core of the FFM is a complex-valued linear projection layer that performs frequency-domain interpolation and extrapolation. Unlike traditional zero-padding, this layer utilizes a learnable complex weight matrix $W_{complex} \in \mathbb{C}^{d_{out} \times d_{in}}$, which explicitly models amplitude and phase variations. It maps the truncated low-frequency components to a spectral space aligned with the target forecasting horizon, deriving the required spectral features for future sequences.

Here, $d_{in}$ and $d_{out}$ represent the input and output dimensions of the complex-valued linear layer, respectively, and $B_{complex} \in \mathbb{C}^{d_{out} \times C_{in}}$ represents the complex bias term. The relevant parameters and formulae are as follows:

$$d_{in} = f_{dom}, \tag{9}$$

$$d_{out} = \left\lfloor f_{dom} \cdot \frac{T_{seq} + T_{pred}}{T_{seq}} \right\rfloor, \tag{10}$$

$$Y_{freq} = W_{complex} \cdot F_{low} + B_{complex}. \tag{11}$$

Finally, the interpolation-enhanced frequency representation $Y_{freq}$ is converted to the time domain through inverse real FFT (irFFT), and subsequently rescaled to the original data scale using the interpolation rate $\eta = (T_{seq} + T_{pred}) / T_{seq}$ and inverse RIN (IRIN). The stationarized signal $X_{denoised} \in \mathbb{R}^{B \times T_{seq} \times C_{in}}$ produced by the FFM provides a robust representational foundation for subsequent time–frequency dual-stream modeling and improves stability in long-horizon forecasting.
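The projection and inverse transform can be illustrated with a small NumPy sketch. Several details are assumptions rather than statements from the text: the projected spectrum is zero-padded up to the bin count of a series of length $T_{seq} + T_{pred}$, the weights are fixed arrays rather than learned parameters, and the function returns the full-length series because the text does not state how the length-$T_{seq}$ signal is taken from it.

```python
import numpy as np

def freq_upsample(z, f_dom, t_pred, weight, bias):
    """z: (B, T_seq, C_in); weight: (d_out, d_in) complex; bias: (d_out, C_in) complex.
    Returns a real series of length T_seq + T_pred."""
    t_seq = z.shape[1]
    t_out = t_seq + t_pred
    spec = np.fft.rfft(z, axis=1)[:, :f_dom, :]                 # keep d_in = f_dom bins
    y_freq = np.einsum("oi,bic->boc", weight, spec) + bias      # Eq. (11)
    d_out = weight.shape[0]
    full = np.zeros((z.shape[0], t_out // 2 + 1, z.shape[2]), dtype=complex)
    full[:, :d_out, :] = y_freq                                 # assumed zero-padding of high bins
    eta = t_out / t_seq                                         # interpolation rate
    return np.fft.irfft(full, n=t_out, axis=1) * eta            # eta offsets the 1/n factor of irFFT

t_seq, t_pred, f_dom = 168, 24, 68
d_out = f_dom * (t_seq + t_pred) // t_seq                       # Eq. (10) in integer arithmetic
z = np.sin(2 * np.pi * np.arange(t_seq) / 24)[None, :, None]    # 7 daily cycles
weight = np.eye(d_out, f_dom, dtype=complex)                    # identity-like stand-in for W_complex
bias = np.zeros((d_out, 1), dtype=complex)
out = freq_upsample(z, f_dom, t_pred, weight, bias)
```

With the identity-like weight, the unit-amplitude sinusoid keeps its amplitude after rescaling by $\eta$, which is the purpose of the interpolation rate.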

The computational cost of the FFM mainly comes from the FFT and the complex-valued linear projection. The FFT operations have a complexity of $O(T_{seq} \log T_{seq})$. The projection layer is controlled by the retained dominant frequencies $f_{dom}$ and the projected frequency dimension $d_{out}$, with a complexity of $O(f_{dom} d_{out})$. Since $f_{dom}$ is smaller than the full sequence length, the FFM remains compact while enhancing useful periodic information.

3.2. Dual-Stream Decoupling Module

The core architecture employs a dual-path parallel mechanism in both time and frequency domains, enabling multi-dimensional disentanglement and complementary feature extraction. The time-domain path (TDP) explicitly decomposes sequences into trend and seasonal components via moving average decomposition, capturing long-term evolution and short-term fluctuations, respectively. The frequency-domain path (FDP) adaptively weights the spectrum via a learnable filter to enhance dominant periodic components and suppress noise. This time–frequency joint modeling mechanism enhances the model’s ability to capture transient patterns and improves forecasting robustness in multi-energy coupling scenarios.

3.2.1. Time Domain Path

To disentangle long-term trends from short-term variations, the model adopts an additive decomposition strategy in the time domain. A Moving Average (MA) operator performs 1D average pooling on the denoised sequence to extract the trend component $X_{trend}$, and the seasonal component $X_{seasonal}$ is obtained as the residual. $X_{trend}$ reflects long-term evolution patterns, while $X_{seasonal}$ represents short-term fluctuations, which reduces the complexity of subsequent modeling. The decomposition is:

$$X_{trend} = \mathrm{AvgPool1d}(\mathrm{Pad}(X), k, 1) \in \mathbb{R}^{B \times T_{seq} \times C_{in}}, \tag{12}$$

$$X_{seasonal} = X - X_{trend} \in \mathbb{R}^{B \times T_{seq} \times C_{in}}. \tag{13}$$

Here, $\mathrm{Pad}(\cdot)$ denotes the padding operation, and $k$ is the pooling window size, applied with stride 1. The window size $k$ is chosen for the electricity, cooling, and heating loads so as to limit feature blurring across heterogeneous loads that share one temporal resolution.
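A minimal sketch of the decomposition in Equations (12) and (13) follows. The padding scheme is not specified in the text, so replicate padding at both ends is assumed here; $k = 13$ is the window size reported in the experiments.

```python
import numpy as np

def moving_average_decompose(x, k=13):
    """x: (B, T_seq, C_in). Returns (trend, seasonal), both shaped like x."""
    pad = (k - 1) // 2
    front = np.repeat(x[:, :1, :], pad, axis=1)             # assumed replicate padding
    back = np.repeat(x[:, -1:, :], k - 1 - pad, axis=1)
    padded = np.concatenate([front, x, back], axis=1)       # length T_seq + k - 1
    # Stride-1 average pooling computed with cumulative sums.
    zero = np.zeros_like(padded[:, :1, :])
    csum = np.cumsum(np.concatenate([zero, padded], axis=1), axis=1)
    trend = (csum[:, k:, :] - csum[:, :-k, :]) / k          # Eq. (12)
    seasonal = x - trend                                    # Eq. (13)
    return trend, seasonal

x = np.arange(168, dtype=float)[None, :, None]              # a pure linear ramp
trend, seasonal = moving_average_decompose(x)
```

For a pure ramp, the centered window reproduces the input away from the edges, so the seasonal residual is zero except within six steps of each boundary, where padding flattens the trend.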

The disentangled components are fed into two Multi-Layer Perceptrons (MLPs) with identical structure for independent evolution forecasting. Each MLP comprises two fully connected layers with LeakyReLU activation and maps the input sequence onto the $T_{pred}$ future time steps via linear projection. Finally, the predicted trend and fluctuation components are aggregated to obtain the global feature representation $z \in \mathbb{R}^{B \times T_{pred} \times C}$ for the TDP, where $C$ is the hidden dimension. The specific process is as follows:

$$z_{trend} = \mathrm{MLP}_{trend}(X_{trend}), \tag{14}$$

$$z_{seasonal} = \mathrm{MLP}_{seasonal}(X_{seasonal}), \tag{15}$$

$$z = z_{trend} + z_{seasonal}. \tag{16}$$

3.2.2. Frequency-Domain Path

The FDP utilizes spectral modeling to separate amplitude and phase information and to extract coupling features. This module takes the denoised sequence as input and transforms it into the frequency domain via rFFT to obtain complex-valued features $X_{freq} \in \mathbb{C}^{B \times K \times C_{in}}$. To disentangle steady-state and dynamic components, a learnable spectral filter $W_{freq} \in \mathbb{C}^{K \times C_{in}}$ is introduced to adaptively redistribute spectral energy via element-wise multiplication in the frequency domain:

$$Y_{spec} = X_{freq} \odot W_{freq}. \tag{17}$$

The filter assigns higher weights to periodic low-frequency steady-state components to strengthen static coupling patterns and ensure consistency in long-horizon forecasting. Conversely, high-frequency dynamic components caused by noise and transient interference are suppressed by attenuating their weights. This mechanism mitigates overfitting issues commonly observed in traditional time-domain models. Element-wise multiplication in the frequency domain is equivalent to full-length circular convolution in the time domain, and its cost is linear in the number of frequency points. The FFT operations add a complexity of $O(T_{seq} \log T_{seq})$. Consequently, frequency-domain operations can capture global steady-state features efficiently: the overall overhead, $O(T_{seq} \log T_{seq})$, is significantly lower than that of attention mechanisms in Transformer-based models, $O(T_{seq}^2)$. This design helps balance a lightweight structure with long-term modeling capability.
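The equivalence between element-wise spectral multiplication and circular convolution can be verified directly. In this sketch the filter is a random time-domain kernel `h` whose rFFT plays the role of $W_{freq}$; both arrays are synthetic and the kernel is not a trained filter.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 168
x = rng.normal(size=n)
h = rng.normal(size=n)      # hypothetical kernel; rfft(h) stands in for W_freq

# Frequency-domain route: multiply the spectra element-wise, then invert.
y_fft = np.fft.irfft(np.fft.rfft(x) * np.fft.rfft(h), n=n)

# Direct circular convolution: y[t] = sum_s x[s] * h[(t - s) mod n], O(n^2) work.
idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
y_direct = (h[idx] * x[None, :]).sum(axis=1)
```

Both routes give the same result up to floating-point error, while the FFT route needs $O(n \log n)$ operations instead of $O(n^2)$.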

The filtered frequency-domain features $Y_{spec}$ are restored to the time domain via irFFT, yielding refined temporal features $y_{time}$. To further enhance the representational capacity of frequency-domain features and capture nonlinear phase evolution over time, a nonlinear evolution layer (self.model) is introduced. This module is composed of stacked linear layers and ReLU activations, formulated as follows:

$$y_{pred} = \mathrm{ReLU}(\mathrm{Linear}(y_{time})) \in \mathbb{R}^{B \times T_{pred} \times C}, \tag{18}$$

where $y_{pred}$ is the final forecasting output of the FDP. It captures complex nonlinear phase-shift features by learning the phase evolution of frequency features over time. This compensates for the limited expressiveness of purely linear frequency-domain modeling while keeping the added computation small.

3.3. Multi-Scale Feature Fusion Module

This module integrates time-domain and frequency-domain features, combining temporal dependencies with periodic patterns. First, the model concatenates the forecasting features $z \in \mathbb{R}^{B \times T_{pred} \times C}$ from the time domain and $y_{pred} \in \mathbb{R}^{B \times T_{pred} \times C}$ from the frequency domain along the channel dimension, which provides a comprehensive feature space for global coupling learning:

$$X_{fusion} = [z; y_{pred}] \in \mathbb{R}^{B \times T_{pred} \times 2C}. \tag{19}$$

After concatenation, a final linear mapping layer performs feature fusion and produces the forecasting output. This fully connected layer utilizes a linear projection matrix $W_{fusion} \in \mathbb{R}^{2C \times C_{out}}$ and a bias term $b \in \mathbb{R}^{T_{pred} \times C_{out}}$ to enable cross-channel interaction and learn multi-energy flow coupling relationships. Each weight in $W_{fusion}$ represents the influence of one input component on one target load, so the layer learns the interdependencies among the energy variables from data. The formulation is as follows:

$$\hat{Y} = X_{fusion} \cdot W_{fusion} + b \in \mathbb{R}^{B \times T_{pred} \times C_{out}}. \tag{20}$$

The final output consists of three core channels, electricity, cooling, and heating ($C_{out} = 3$); however, information from all channels participates in joint weight updates during training. This enables the model to leverage implicit coupling features from auxiliary variables to refine predicted trajectories of core loads in real time, achieving implicit auxiliary learning. Furthermore, the module maintains low computational complexity ($O(C^2)$), facilitating effective multivariate information interaction within a compact architecture.
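The concatenation and linear fusion in Equations (19) and (20) amount to one matrix product per time step. The sketch below uses random arrays in place of learned features and weights; only the shapes follow the text.

```python
import numpy as np

def fuse(z, y_pred, w_fusion, b):
    """z, y_pred: (B, T_pred, C); w_fusion: (2C, C_out); b: (T_pred, C_out)."""
    x_fusion = np.concatenate([z, y_pred], axis=-1)   # Eq. (19): (B, T_pred, 2C)
    return x_fusion @ w_fusion + b                    # Eq. (20): (B, T_pred, C_out)

rng = np.random.default_rng(0)
B, T_pred, C, C_out = 2, 24, 64, 3
z = rng.normal(size=(B, T_pred, C))           # stand-in for the time-domain features
y_pred = rng.normal(size=(B, T_pred, C))      # stand-in for the frequency-domain features
w_fusion = rng.normal(size=(2 * C, C_out))
b = rng.normal(size=(T_pred, C_out))
y_hat = fuse(z, y_pred, w_fusion, b)
```

The product splits into `z @ w_fusion[:C] + y_pred @ w_fusion[C:]`, so the first $C$ rows of $W_{fusion}$ weight the time-domain features and the last $C$ rows weight the frequency-domain features.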

4. Experiments

This study is conducted using the Integrated Energy System (IES) dataset published by Arizona State University (ASU) Tempe campus. The dataset is chronologically divided into training, validation, and test sets in an 8:1:1 ratio. This split preserves the chronological order of the forecasting task. The present validation is based on the ASU IES dataset, and separate tests on other datasets or specific seasonal subsets are not included. The look-back window is set to 168 h (7 days), and the forecasting horizons are 24, 48, 72, and 96 h. These settings are used to evaluate the model under different prediction lengths.
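The chronological 8:1:1 split and the sliding-window layout can be sketched with index arithmetic. The series length of 26,280 h (three non-leap years of hourly data) is a nominal assumption, and keeping each window inside a single split is also an assumption, since the text does not say how windows near the split boundaries are handled.

```python
def chronological_split(n, parts=(8, 1, 1)):
    """Return (start, end) index pairs for train, validation, and test, in time order."""
    total = sum(parts)
    n_train = n * parts[0] // total
    n_val = n * parts[1] // total
    return (0, n_train), (n_train, n_train + n_val), (n_train + n_val, n)

def window_starts(start, end, t_seq=168, t_pred=24):
    """Start indices of samples whose look-back and horizon both fit in [start, end)."""
    return range(start, end - t_seq - t_pred + 1)

n_hours = 3 * 8760                        # nominal length, assumed for illustration
train, val, test = chronological_split(n_hours)
n_train_windows = len(window_starts(*train))
```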

To avoid information leakage, all preprocessing statistics are estimated from the training set only and then applied to the validation and test sets. Outliers are identified using a box-plot-based quartile method and corrected via exponential smoothing to mitigate the impact of abnormal fluctuations on model training.

$$x_t' = \alpha x_t + (1 - \alpha) x_{t-1}', \tag{21}$$

where $x_t$ is the observed value, $x_t'$ is the smoothed value, and $\alpha$ is the smoothing coefficient.

Missing data are imputed using linear interpolation. Additionally, to eliminate scale differences, all input variables undergo Min–Max normalization, thereby improving training stability and convergence efficiency:

$$x_{norm} = \frac{x - x_{min}}{x_{max} - x_{min}}. \tag{22}$$
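The preprocessing steps can be sketched as below. The 1.5 × IQR fence, the value $\alpha = 0.3$, and the choice to smooth only the flagged points are assumptions made for illustration; the text specifies only a box-plot-based quartile rule, Equation (21), and training-set statistics.

```python
import numpy as np

def iqr_bounds(train, k=1.5):
    """Box-plot fences estimated from the training data only."""
    q1, q3 = np.percentile(train, [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def smooth_outliers(x, low, high, alpha=0.3):
    """Replace flagged points using Eq. (21); other points pass through unchanged."""
    out = np.asarray(x, dtype=float).copy()
    for t in range(1, len(out)):
        if out[t] < low or out[t] > high:
            out[t] = alpha * out[t] + (1 - alpha) * out[t - 1]
    return out

def min_max(x, x_min, x_max):
    """Eq. (22), with x_min and x_max taken from the training set."""
    return (x - x_min) / (x_max - x_min)

train = np.array([10., 11., 12., 11., 10., 12., 11., 10., 12., 11.])
low, high = iqr_bounds(train)
cleaned = smooth_outliers([10., 11., 50., 12.], low, high)   # 50 is flagged
scaled = min_max(cleaned, train.min(), train.max())
```

Because the scaling statistics come from the training set, validation or test values outside the training range map outside [0, 1], which is expected under this protocol.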

During forecasting, exogenous variables are used only within the historical look-back window. The model does not use future meteorological observations or future target load values over the prediction horizon as inputs.

To evaluate performance, several typical time series forecasting models are selected as baselines, including ARIMA [5], LSTM [12], DLinear [24], iTransformer [30], TimesNet [15], TimeXer [31], TimeMixer [32], SegRNN [33], and Amplifier [34]. All these models are retrained using the same training, validation, and test splits. The recommended configurations from official implementations are used as initial settings. For models with tunable settings, basic hyperparameter tuning is conducted on the validation set. The evaluation metrics comprise Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), and Average Correlation Coefficient (ACCR). These metrics assess model performance from the perspectives of error magnitude, relative error, and sequence correlation, respectively.
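The error metrics can be written compactly as follows. MAE, RMSE, and MAPE use their standard definitions; the text does not give a formula for ACCR, so `mean_pearson` (the Pearson coefficient averaged over channels) is only one plausible reading and should be treated as an assumption, as should the `eps` guard in MAPE.

```python
import numpy as np

def mae(y, y_hat):
    return float(np.mean(np.abs(y_hat - y)))

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y_hat - y) ** 2)))

def mape(y, y_hat, eps=1e-8):
    """Percentage error; eps is an assumed guard against division by zero."""
    return float(100.0 * np.mean(np.abs((y_hat - y) / (np.abs(y) + eps))))

def mean_pearson(y, y_hat):
    """Assumed reading of ACCR: Pearson r per channel, averaged over channels."""
    rs = [np.corrcoef(y[..., c].ravel(), y_hat[..., c].ravel())[0, 1]
          for c in range(y.shape[-1])]
    return float(np.mean(rs))

y = np.array([[100., 200.], [200., 400.]])   # toy targets: 2 steps, 2 channels
y_hat = 1.1 * y                              # uniform 10% over-prediction
scores = (mae(y, y_hat), rmse(y, y_hat), mape(y, y_hat), mean_pearson(y, y_hat))
```

On this toy case the relative error is 10% everywhere, so MAPE is 10 while MAE and RMSE depend on the load magnitudes, which illustrates why the metrics are reported together.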

The experiments are implemented using Python 3.10 and the PyTorch 2.5.1 framework. Parameter updates are performed using the Adam optimizer. The key hyperparameters are set as follows: the dominant frequency cutoff $f_{dom} = 68$, the moving average window size $k = 13$, the hidden dimension $C = 64$, the learning rate $4 \times 10^{-4}$, the maximum number of training epochs 100, and the random seed 2026. The same fixed random seed is used for neural-network models in the experiments to maintain a consistent comparison protocol. Early stopping is applied with a patience of 10 epochs based on the validation MAE. A learning-rate decay strategy reduces the learning rate by a factor of 0.5 if the validation loss does not improve for 3 consecutive epochs.
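The interaction between early stopping and learning-rate decay can be traced with a plain-Python sketch. It is a literal reading of the rule stated above, not the PyTorch scheduler, whose patience counting may differ by one epoch; the function name and the loss sequence are hypothetical.

```python
def run_schedule(val_losses, lr=4e-4, factor=0.5, lr_patience=3, stop_patience=10):
    """Return (epoch at which training stops, learning rate at that point)."""
    best = float("inf")
    since_best = 0     # epochs since the best validation loss (early stopping)
    since_decay = 0    # non-improving epochs since the last decay or improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, since_best, since_decay = loss, 0, 0
        else:
            since_best += 1
            since_decay += 1
            if since_decay >= lr_patience:
                lr *= factor
                since_decay = 0
            if since_best >= stop_patience:
                return epoch, lr
    return len(val_losses), lr

# Two improving epochs followed by a plateau (made-up validation MAE values).
losses = [1.0, 0.9] + [0.95] * 10
stopped_at, final_lr = run_schedule(losses)
```

In this trace the learning rate is halved at epochs 5, 8, and 11, and training stops at epoch 12, ten epochs after the best validation loss.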

Runtime was measured under the same hardware environment for all models, and FLOPs were calculated using the same input setting and measurement protocol. The experiments were conducted on a workstation equipped with an Intel i7-13700KF CPU (Intel, USA) and an NVIDIA GeForce RTX 4070 Ti SUPER GPU (NVIDIA, USA).

4.1. Comparative Experiments

4.1.1. Comparison of Fitting Curves

To intuitively evaluate the model’s fitting capability, Figure 5 compares predictions for electricity, cooling, and heating loads with ground truth on the test set.

Overall, predictions (red line) closely match the ground truth (blue line), maintaining consistency in both periodic patterns and local fluctuations. For electricity load with complex fluctuations, the model maintains stable tracking during rapid changes, with no significant deviations. For cooling and heating loads, the model captures peaks and valleys accurately without noticeable lag.

In summary, the proposed method effectively captures dynamic patterns in the load sequence while maintaining the smoothness of the forecast results, demonstrating strong fitting capability and adaptability.

4.1.2. Comparison of Error Metrics

Quantitative analysis evaluates the reduction in MAE and RMSE achieved by FTimeDD relative to representative baselines, as shown in Figure 6. Across the tested forecasting horizons, FTimeDD generally obtains lower errors than the compared baselines. This indicates that the proposed time–frequency collaborative structure is effective for reducing forecasting errors in multi-energy load prediction.

Figure 7 further compares multi-metric performance across electricity, cooling, and heating loads. FTimeDD achieves competitive or near-optimal results in most metrics, particularly for cooling and heating loads. This suggests that the method effectively captures periodic load characteristics. For the electricity load, which fluctuates more irregularly, the model still maintains competitive performance, indicating its ability to capture complex load variations.

Table 1 lists overall model performance and computational cost across different forecasting horizons. As the forecasting horizon increases, the prediction errors generally increase for most models, indicating the higher difficulty of long-horizon multi-energy load forecasting. FTimeDD shows the lowest overall MAE and MAPE at all forecasting horizons. For RMSE, FTimeDD obtains the best results at the 24, 72, and 96 h horizons, while Amplifier shows a slightly lower RMSE at the 48 h horizon. In addition, FTimeDD maintains a compact parameter scale and relatively low FLOPs. These results indicate that FTimeDD achieves a balance between forecasting accuracy and computational cost.
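The comparison against the strongest baseline can be reproduced from the rounded values in Table 1. The sketch below is illustrative and the helper name `avg_reduction` is hypothetical: for each horizon it takes the lowest baseline score, computes the percentage reduction achieved by FTimeDD, and averages over the four horizons. With the rounded table entries, the MAE and RMSE averages agree with the 3.85% and 2.48% stated in the abstract.

```python
# Overall MAE and RMSE at the 24, 48, 72, and 96 h horizons, copied from Table 1.
BASELINE_MAE = {
    "DLinear": [669.0, 874.2, 1028.6, 1145.7],
    "TimeMixer": [671.9, 888.9, 1019.1, 1105.1],
    "iTransformer": [690.4, 926.3, 1095.6, 1178.6],
    "TimeXer": [695.1, 917.0, 1046.1, 1163.1],
    "TimesNet": [768.9, 957.0, 1061.3, 1309.2],
    "SegRNN": [684.0, 892.7, 1042.6, 1179.6],
    "Amplifier": [655.5, 821.9, 1015.7, 1087.8],
    "ARIMA": [1224.57, 1371.56, 1461.75, 1431.06],
    "LSTM": [789.4, 944.6, 1207.6, 1288.34],
}
BASELINE_RMSE = {
    "DLinear": [1123.3, 1477.9, 1717.4, 1884.1],
    "TimeMixer": [1120.0, 1485.6, 1691.1, 1821.3],
    "iTransformer": [1153.6, 1598.9, 1829.7, 1957.7],
    "TimeXer": [1183.4, 1585.8, 1789.7, 1961.8],
    "TimesNet": [1238.5, 1578.4, 1717.4, 2154.1],
    "SegRNN": [1142.1, 1501.0, 1726.0, 1907.1],
    "Amplifier": [1095.8, 1369.1, 1691.4, 1796.7],
    "ARIMA": [1670.4, 1869.62, 1980.11, 2052.09],
    "LSTM": [1290.92, 1568.8, 1965.4, 2138.2],
}
FTIMEDD_MAE = [624.8, 818.5, 957.0, 1038.7]
FTIMEDD_RMSE = [1059.8, 1397.0, 1612.8, 1723.9]


def avg_reduction(baselines, ours):
    """Mean percentage reduction against the best baseline at each horizon."""
    reductions = []
    for i, value in enumerate(ours):
        best = min(scores[i] for scores in baselines.values())
        reductions.append(100.0 * (best - value) / best)
    return sum(reductions) / len(reductions)


mae_gain = avg_reduction(BASELINE_MAE, FTIMEDD_MAE)
rmse_gain = avg_reduction(BASELINE_RMSE, FTIMEDD_RMSE)
print(f"MAE: {mae_gain:.2f}%  RMSE: {rmse_gain:.2f}%")  # MAE: 3.85%  RMSE: 2.48%
```

The RMSE average includes a negative term at the 48 h horizon, where Amplifier is slightly better, so the reported figure is a net average rather than a uniform gain at every horizon.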

Table 2 presents the load-wise forecasting results for electricity, cooling, and heating loads. The results show that the model performance varies across load types. For electricity load, FTimeDD achieves lower MAE and MAPE at the 24, 48, and 72 h horizons, while Amplifier is more competitive at the 96 h horizon. For cooling load, several baselines remain competitive at longer horizons, which suggests that the cooling sequence contains strong seasonal regularity. FTimeDD maintains high ACCR values at the 72 and 96 h horizons, indicating that it can still preserve the temporal correlation of the cooling load. For heating load, FTimeDD achieves the lowest MAE at all horizons and obtains strong RMSE and MAPE results in most cases. Overall, FTimeDD demonstrates relatively balanced forecasting performance across different energy loads, rather than relying on improvements for a specific load type.

4.2. Ablation Study

To evaluate the contribution of each component, multiple ablation settings are constructed, including removing the frequency-domain filtering module (w/o FFM), removing the time-domain path (w/o Time domain Decoupling), removing the frequency-domain path (w/o Frequency domain Decoupling), removing the dual-stream decoupling module (w/o Dual-stream), removing exogenous variables (w/o Exogenous Variables), and removing reversible instance normalization (w/o RIN).

4.2.1. Module Necessity Analysis

Table 3 lists the ablation results for different model variants across horizons. The results show that the Full model achieves the lowest MAE and RMSE at all horizons, outperforming all variants. This indicates that the Full model provides the most stable error control. Each module contributes positively to performance and exhibits effective synergy.

Removing the time-domain path or the dual-stream module leads to noticeable error increases, especially in long-horizon forecasting. This is because the time-domain path extracts trends, while dual-stream decoupling enhances multi-scale feature representation via time–frequency collaborative modeling. Once these modules are removed, capturing long-term dependencies becomes difficult, which degrades model performance. By contrast, removing the frequency-domain path mainly affects short-term forecasting, highlighting its role in capturing periodic features.
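The trend extraction performed by the time-domain path can be pictured with a plain moving-average decomposition. The sketch below is an assumption-laden illustration, not the model's actual layer: it uses the window size k = 13 from the experimental settings, pads the edges by repeating the boundary values (a common choice that this section does not confirm), and the function name `decompose` is hypothetical.

```python
def decompose(series, k=13):
    """Split a series into a moving-average trend and a residual fluctuation.

    ASSUMPTION: k is odd and the edges are padded by repeating the boundary
    values, so the trend has the same length as the input.
    """
    half = (k - 1) // 2
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    trend = [sum(padded[i:i + k]) / k for i in range(len(series))]
    fluctuation = [x - t for x, t in zip(series, trend)]
    return trend, fluctuation


# Toy look-back window of 168 hourly points with a daily sawtooth pattern.
series = [float(i % 24) for i in range(168)]
trend, fluctuation = decompose(series)
print(len(trend), len(fluctuation))
```

The trend and fluctuation always sum back to the input, so the split loses no information; it only routes slow and fast variations to separate branches.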

Further analysis shows that removing RIN leads to a clear degradation in prediction accuracy, especially at the 48 h and 96 h horizons. This indicates that normalization is important for alleviating the non-stationarity of multi-energy load sequences. Additionally, the w/o Exogenous Variables variant performs worse than the full model. This suggests that meteorological and temporal variables provide useful auxiliary information for multi-energy load forecasting.
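The role of RIN can be illustrated with a simplified sketch of reversible instance normalization. It is a hedged illustration only: each look-back window is normalized by its own mean and standard deviation, the forecast is produced in normalized space, and the statistics are then restored. The learnable affine parameters of the original method are omitted, and the function names and `eps` value are assumptions.

```python
import math

def rin_normalize(window, eps=1e-5):
    """Normalize one look-back window by its own mean and standard deviation."""
    mean = sum(window) / len(window)
    var = sum((x - mean) ** 2 for x in window) / len(window)
    std = math.sqrt(var + eps)
    return [(x - mean) / std for x in window], (mean, std)

def rin_denormalize(forecast, stats):
    """Map a forecast made in normalized space back to the original scale."""
    mean, std = stats
    return [y * std + mean for y in forecast]


# Toy window for illustration only.
window = [10.0, 12.0, 14.0, 16.0]
normed, stats = rin_normalize(window)
restored = rin_denormalize(normed, stats)
print(normed, restored)
```

Because every window is shifted and scaled by its own statistics, windows drawn from different seasons look alike to the model, which is one way to read the degradation observed when RIN is removed.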

4.2.2. Analysis of Parameter and Performance Relationship

The relationship between parameter scale and performance is analyzed in Figure 8. Figure 8a compares parameter scales across different forecasting horizons. Removing modules reduces the number of parameters, but every variant, including the full model, stays below 160 K parameters at all horizons (Table 3). This indicates that the parameter scale of each module is well controlled and does not significantly increase model complexity.
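The parameter counts in Table 3 allow a quick check of how much each ablation shrinks the model. The sketch below uses the 24 h rows only; the variable names are illustrative. The differences should be read as the reduction observed for each variant, not necessarily as the standalone size of a module, since removing a path may also change the layers that consume its output.

```python
# Parameter counts at the 24 h horizon, copied from Table 3.
FULL_PARAMS_24H = 117_110
VARIANT_PARAMS_24H = {
    "w/o FFM": 111_708,
    "w/o Time domain Decoupling": 18_102,
    "w/o Frequency domain Decoupling": 104_422,
    "w/o Dual-stream Decoupling": 46_728,
    "w/o Exogenous Variables": 116_121,
    "w/o RIN": 117_110,
}

# Reduction in parameter count relative to the full model.
removed = {name: FULL_PARAMS_24H - p for name, p in VARIANT_PARAMS_24H.items()}
for name, diff in sorted(removed.items(), key=lambda kv: kv[1]):
    share = 100.0 * diff / FULL_PARAMS_24H
    print(f"{name}: {diff:,} fewer parameters ({share:.1f}%)")
```

Removing RIN leaves the count unchanged in Table 3, while removing the time-domain path accounts for the largest reduction.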

Figure 8b shows error growth curves for different structural variants as the forecasting horizon expands. While MAE increases for all variants, the full model maintains the lowest level and flattest growth. This suggests that the main structural modules enhance the system’s robustness, while removing any module leads to faster error accumulation.

Figure 8c shows how parameter count relates to performance across models: models in the top-left region have fewer parameters but higher errors, and models in the top-right region have more parameters and higher errors. FTimeDD (full model) lies in the bottom-left region, which combines a small parameter count with low error. This suggests that the improvements of our model stem from structural optimization rather than parameter scaling alone.

4.2.3. Module Contribution Analysis

Figure 9 illustrates the contributions of the main modules across different forecasting horizons. Overall, the contribution of each module varies with the prediction length. This is mainly because short- and long-horizon forecasting emphasize different temporal patterns.

At the 24 h horizon, removing the frequency-domain decoupling path causes the largest MAE increase, from 624.77 to 695.74. This indicates that short-term forecasting relies more on periodic and high-frequency fluctuation information, which is captured by frequency-domain modeling.

As the horizon becomes longer, trend evolution and accumulated temporal dependency become more important. At the 96 h horizon, removing the time-domain decoupling path increases MAE from 1038.73 to 1162.98, highlighting its role in modeling long-term trends. In contrast, the dual-stream decoupling module maintains a stable contribution in different forecasting horizons, because it can integrate trend-related information from the time domain with periodic information from the frequency domain. Although FFM shows smaller error variations, removing it consistently increases MAE and RMSE. This indicates that LPF-based frequency filtering and denoising provide more stable inputs for subsequent time–frequency modeling.
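The module contributions discussed above can be expressed as relative MAE increases computed from Table 3. The following sketch is illustrative (the helper `relative_increase` is a hypothetical name) and covers the four main modules at the 24 h and 96 h horizons.

```python
# MAE at the 24 h and 96 h horizons, copied from Table 3.
FULL_MAE = {24: 624.77, 96: 1038.73}
ABLATION_MAE = {
    "w/o FFM": {24: 654.31, 96: 1067.252},
    "w/o Time domain Decoupling": {24: 692.37, 96: 1162.98},
    "w/o Frequency domain Decoupling": {24: 695.74, 96: 1117.64},
    "w/o Dual-stream Decoupling": {24: 693.32, 96: 1149.96},
}


def relative_increase(horizon):
    """Percentage MAE increase of each variant over the full model."""
    full = FULL_MAE[horizon]
    return {
        name: 100.0 * (scores[horizon] - full) / full
        for name, scores in ABLATION_MAE.items()
    }


for horizon in (24, 96):
    increases = relative_increase(horizon)
    worst = max(increases, key=increases.get)
    print(f"{horizon} h: largest increase from '{worst}' ({increases[worst]:.2f}%)")
```

On these numbers, removing the frequency-domain path raises MAE by about 11.4% at 24 h, and removing the time-domain path raises it by about 12.0% at 96 h, which matches the shift in emphasis from periodic detail to trend as the horizon grows.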

5. Conclusions

To address the challenges of complex multivariate coupling, strong non-stationarity, and difficulty in capturing long-term dependencies in integrated energy load forecasting, this paper proposes a time–frequency collaborative modeling method, termed FTimeDD. The method constructs a dual-path parallel architecture combining time- and frequency-domain modeling. It performs trend–fluctuation decoupling in the time domain while extracting dominant periodic features in the frequency domain, enabling effective representation of multi-scale dynamic information. This architecture improves the model’s representational capacity while maintaining computational efficiency, providing a lightweight solution for joint multi-energy load forecasting.

Experimental results indicate that the proposed method performs better than several representative baseline methods across different forecasting horizons. It achieves lower overall errors in most cases as the forecasting horizon increases. Ablation results further validate the necessity of each module. The time-domain path plays a key role in capturing long-term trends, the frequency-domain path facilitates extraction of periodic structures, and the dual-stream collaborative mechanism helps the model better represent complicated dynamic variations. In addition, RIN helps alleviate non-stationarity, while exogenous variables provide useful auxiliary information for multi-energy load forecasting.

The present study still has several limitations. The experiments are conducted under a fixed random seed, and multi-seed statistical evaluation is not included. The validation is also based on the ASU IES dataset with a chronological split, while tests on other datasets or specific seasonal periods are left for future work. Moreover, the dominant frequency cutoff f_{dom} is determined from the look-back window and daily-period setting, and the effect of other cutoff values has not been further examined. Future work will extend the evaluation in these respects and further consider uncertainty estimation, peak-demand errors, and dispatch-oriented assessment.

Overall, the proposed model provides an effective and compact forecasting framework for complex multivariate load forecasting tasks. It shows potential to provide more accurate load forecasting information for integrated energy system operation and to support subsequent operation-related analysis.

Author Contributions

Conceptualization, Z.L. and Z.W.; methodology, Z.L., Z.W., M.X. and T.G.; software, Z.L. and Z.W.; validation, T.G.; formal analysis, Z.L. and Z.W.; investigation, Z.W., T.G. and M.X.; resources, M.X.; data curation, T.G.; writing—original draft preparation, Z.L.; writing—review and editing, M.X.; visualization, Z.W.; supervision, M.X.; project administration, M.X.; and funding acquisition, M.X. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the State Grid Corporation of China Project (52100126003Q-065-ZN).

Data Availability Statement

The data are available at https://cm.asu.edu/ (accessed on 1 May 2024).

Conflicts of Interest

The authors declare that this study received funding from the State Grid Corporation of China Project (52100126003Q-065-ZN). The funder was not involved in the study design, collection, analysis, or interpretation of data, the writing of this article, or the decision to submit it for publication.

References

1. Geidl, M.; Andersson, G. Optimal power flow of multiple energy carriers. IEEE Trans. Power Syst. 2007, 22, 145–155.
2. Zhu, H.; Wang, X.; Wen, Y.; Zhu, J.; Li, J.; Luo, Q.; Liao, C. A review of integrated energy system modeling and operation. Appl. Energy 2025, 400, 126572.
3. Wang, Z.; Zhou, S.; Liang, X.; Xia, M.; Liu, J. Frequency-Enhanced Dual-Stream Parallel Network for Multienergy Load Forecasting. IEEE Trans. Ind. Inform. 2026, 1–11.
4. Box, G.E.; Jenkins, G.M. Time Series Analysis: Forecasting and Control; Holden-Day: San Francisco, CA, USA, 1976.
5. Vapnik, V. Statistical Learning Theory; Wiley: New York, NY, USA, 1998.
6. Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
7. Wu, H.; Xu, Z. Multi-energy load forecasting in integrated energy systems: A spatial-temporal adaptive personalized federated learning approach. IEEE Trans. Ind. Inform. 2024, 20, 12262–12274.
8. Liang, M.; Hu, Y.; Weng, H.; Xi, J.; Yin, B. EnergyGPT: Fine-tuning large language model for multi-energy load forecasting. Renew. Energy 2025, 251, 123313.
9. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
10. Cho, K.; Van Merriënboer, B.; Gulçehre, Ç.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder–decoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1724–1734.
11. Marino, D.L.; Amarasinghe, K.; Manic, M. Building energy load forecasting using deep neural networks. In Proceedings of the IECON 2016—42nd Annual Conference of the IEEE Industrial Electronics Society; IEEE: New York, NY, USA, 2016; pp. 7046–7051.
12. Borovykh, A.; Bohte, S.; Oosterlee, C.W. Conditional time series forecasting with convolutional neural networks. arXiv 2017, arXiv:1703.04691.
13. Lai, G.; Chang, W.C.; Yang, Y.; Liu, H. Modeling long-and short-term temporal patterns with deep neural networks. In Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, Ann Arbor, MI, USA, 8–12 July 2018; pp. 95–104.
14. Lim, B.; Arık, S.Ö.; Loeff, N.; Pfister, T. Temporal fusion transformers for interpretable multi-horizon time series forecasting. Int. J. Forecast. 2021, 37, 1748–1764.
15. Wu, H.; Hu, T.; Liu, Y.; Zhou, H.; Wang, J.; Long, M. Timesnet: Temporal 2D-variation modeling for general time series analysis. arXiv 2022, arXiv:2210.02186.
16. Zhang, Y.; Yan, J. Crossformer: Transformer utilizing cross-dimension dependency for multivariate time series forecasting. In Proceedings of the Eleventh International Conference on Learning Representations, Kigali, Rwanda, 1–5 May 2023.
17. Fan, P.; Wang, D.; Wang, W.; Zhang, X.; Sun, Y. A novel multi-energy load forecasting method based on building flexibility feature recognition technology and multi-task learning model integrating LSTM. Energy 2024, 308, 132976.
18. Duan, P.; Zhao, X.; Hu, J.; Li, K.; Xue, Q.; Cao, X.; Wang, Y.; Zhao, B.; Zhang, C.; Yuan, X. Multi-energy load forecasting incorporating AI algorithms: Research status and trends in integrated energy systems. Renew. Sustain. Energy Rev. 2026, 229, 116611.
19. Yuan, S.; Mao, Y.; Tian, C.; Yu, F.; Guo, T.; Xia, M. GSTAformer: Graph-Guided Spatio-Temporal Autoformer for Mid-Term Wind Power Forecasting. Energies 2026, 19, 254.
20. Li, J.; Zhu, C.; Dong, Y.; Xia, M. Fault Prediction Method of Boost Converter Based on Multi-Modal Components and Temporal Convolutional Networks. Energies 2026, 19, 1974.
21. Zhou, H.; Zhang, S.; Peng, J.; Zhang, S.; Li, J.; Xiong, H.; Zhang, W. Informer: Beyond efficient transformer for long sequence time-series forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence; AAAI Press: Palo Alto, CA, USA, 2021; Volume 35, pp. 11106–11115.
22. Wu, H.; Xu, J.; Wang, J.; Long, M. Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting. Adv. Neural Inf. Process. Syst. 2021, 34, 22419–22430.
23. Zhou, T.; Ma, Z.; Wen, Q.; Wang, X.; Sun, L.; Jin, R. Fedformer: Frequency enhanced decomposed transformer for long-term series forecasting. In Proceedings of the International Conference on Machine Learning, PMLR, Baltimore, MD, USA, 17–23 July 2022; pp. 27268–27286.
24. Zeng, A.; Chen, M.; Zhang, L.; Xu, Q. Are transformers effective for time series forecasting? In Proceedings of the AAAI Conference on Artificial Intelligence; AAAI Press: Palo Alto, CA, USA, 2023; Volume 37, pp. 11121–11128.
25. Salinas, D.; Flunkert, V.; Gasthaus, J.; Januschowski, T. DeepAR: Probabilistic forecasting with autoregressive recurrent networks. Int. J. Forecast. 2020, 36, 1181–1191.
26. Song, C.; Yang, H.; Cai, J.; Yang, P.; Bao, H.; Xu, K.; Meng, X.B. Multi-energy load forecasting via hierarchical multi-task learning and spatiotemporal attention. Appl. Energy 2024, 373, 123788.
27. Taieb, S.B.; Bontempi, G.; Atiya, A.F.; Sorjamaa, A. A review and comparison of strategies for multi-step ahead time series forecasting based on the NN5 forecasting competition. Expert Syst. Appl. 2012, 39, 7067–7083.
28. Ni, Y.; Liu, S.; Guo, T.; Xia, M. TiBT-Net: A High-Resolution Remote Sensing Image Change Detection Network Integrating Bi-Temporal Space Enhancement and Token Interaction. Remote Sens. 2026, 18, 805.
29. Kim, T.; Kim, J.; Tae, Y.; Park, C.; Choi, J.H.; Choo, J. Reversible instance normalization for accurate time-series forecasting against distribution shift. In Proceedings of the International Conference on Learning Representations, Vienna, Austria, 3–7 May 2021.
30. Liu, Y.; Hu, T.; Zhang, H.; Wu, H.; Wang, S.; Ma, L.; Long, M. itransformer: Inverted transformers are effective for time series forecasting. arXiv 2023, arXiv:2310.06625.
31. Wang, Y.; Wu, H.; Dong, J.; Qin, G.; Zhang, H.; Liu, Y.; Qiu, Y.; Wang, J.; Long, M. Timexer: Empowering transformers for time series forecasting with exogenous variables. Adv. Neural Inf. Process. Syst. 2024, 37, 469–498.
32. Wang, S.; Wu, H.; Shi, X.; Hu, T.; Luo, H.; Ma, L.; Zhang, J.Y.; Zhou, J. Timemixer: Decomposable multiscale mixing for time series forecasting. arXiv 2024, arXiv:2405.14616.
33. Lin, S.; Lin, W.; Wu, W.; Zhao, F.; Mo, R.; Zhang, H. Segrnn: Segment recurrent neural network for long-term time series forecasting. arXiv 2023, arXiv:2308.11200.
34. Fei, J.; Yi, K.; Fan, W.; Zhang, Q.; Niu, Z. Amplifier: Bringing attention to neglected low-energy components in time series forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence; AAAI Press: Palo Alto, CA, USA, 2025; Volume 39, pp. 11645–11653.

Figure 1. Energy flow of Integrated Energy System (IES).

Figure 2. Time series characteristics of multi-energy loads and exogenous variables.

Figure 3. Correlation analysis between multi-energy loads and auxiliary features. (a) Grey Relation Analysis; (b) Spearman correlation analysis.

Figure 4. Architecture of FTimeDD model.

Figure 5. Forecasting performance across different energy loads.

Figure 6. Error comparison under different forecasting horizons.

Figure 7. Multi-metric performance comparison for multi-energy loads.

Figure 8. Analysis of parameter scale and efficiency.

Figure 9. Module contribution under different forecasting horizons.

Table 1. Overall model performance and computational cost under different horizons.

Model	Horizon (h)	Params	FLOPs	Time (s)	MAE	MAPE	RMSE	ACCR
DLinear	24	8112	101.95 KMac	0.21	669.0	5.84	1123.3	0.947
	48	16,224	199.3 KMac	0.26	874.2	7.43	1477.9	0.917
	72	24,336	296.64 KMac	0.34	1028.6	8.61	1717.4	0.896
	96	32,448	393.98 KMac	0.36	1145.7	9.47	1884.1	0.880
TimeMixer	24	349,669	32.24 MMac	1.45	671.9	5.91	1120.0	0.948
	48	363,925	32.8 MMac	1.56	888.9	7.60	1485.6	0.917
	72	378,181	33.35 MMac	1.63	1019.1	8.83	1691.1	0.896
	96	392,437	33.91 MMac	1.64	1105.1	9.61	1821.3	0.884
iTransformer	24	224,152	2.68 MMac	0.78	690.4	6.15	1153.6	0.946
	48	227,248	2.72 MMac	0.78	926.3	7.94	1598.9	0.913
	72	230,344	2.76 MMac	0.8	1095.6	9.29	1829.7	0.891
	96	233,440	2.79 MMac	0.81	1178.6	9.88	1957.7	0.875
TimeXer	24	382,872	21.48 MMac	2.77	695.1	6.04	1183.4	0.948
	48	407,472	21.78 MMac	2.8	917.0	7.74	1585.8	0.918
	72	432,072	22.07 MMac	2.84	1046.1	9.00	1789.7	0.897
	96	456,672	22.37 MMac	2.89	1163.1	9.81	1961.8	0.880
TimesNet	24	208,611	167.71 MMac	5.17	768.9	6.81	1238.5	0.933
	48	212,667	192.16 MMac	6.02	957.0	8.45	1578.4	0.907
	72	216,723	209.54 MMac	6.67	1061.3	9.07	1717.4	0.897
	96	220,779	230.89 MMac	7.03	1309.2	11.22	2154.1	0.859
SegRNN	24	408,984	38.72 MMac	0.72	684.0	5.94	1142.1	0.945
	48	409,112	43.56 MMac	0.78	892.7	7.59	1501.0	0.913
	72	409,240	48.39 MMac	0.81	1042.6	8.75	1726.0	0.893
	96	409,368	53.22 MMac	0.92	1179.6	9.75	1907.1	0.873
Amplifier	24	274,075	2.37 MMac	0.34	655.5	5.88	1095.8	0.948
	48	287,443	2.53 MMac	0.42	821.9	7.30	1369.1	0.924
	72	300,811	2.69 MMac	0.44	1015.7	8.82	1691.4	0.902
	96	314,179	2.85 MMac	0.52	1087.8	9.35	1796.7	0.888
ARIMA	24	-	-	-	1224.57	8.2	1670.4	0.835
	48	-	-	-	1371.56	9.2	1869.62	0.79
	72	-	-	-	1461.75	9.82	1980.11	0.764
	96	-	-	-	1431.06	10.3	2052.09	0.749
LSTM	24	5,706,528	14.82 MMac	0.89	789.4	6.924	1290.92	0.938
	48	5,854,272	14.96 MMac	1.16	944.6	8.37	1568.8	0.915
	72	6,002,016	15.11 MMac	1.78	1207.6	10.58	1965.4	0.886
	96	6,149,760	15.26 MMac	1.91	1288.34	11.54	2138.2	0.864
FTimeDD	24	117,110	1.42 MMac	1.28	624.8	5.60	1059.8	0.952
	48	131,006	1.59 MMac	1.32	818.5	7.23	1397.0	0.924
	72	144,902	1.77 MMac	1.36	957.0	8.44	1612.8	0.904
	96	158,798	1.94 MMac	1.37	1038.7	9.28	1723.9	0.888

Table 2. Load-wise forecasting performance under different horizons.

Model	Horizon	Electric Load				Cooling Load				Heat Load
Model	Horizon	MAE	MAPE	RMSE	ACCR	MAE	MAPE	RMSE	ACCR	MAE	MAPE	RMSE	ACCR
DLinear	24	852.5	5.69	1258.9	0.909	1069.0	7.55	1478.8	0.984	85.5	4.59	115.7	0.948
	48	1045.7	6.93	1553.0	0.857	1472.7	10.38	2030.1	0.970	104.2	5.57	140.6	0.925
	72	1173.2	7.74	1714.5	0.824	1793.1	12.68	2425.6	0.957	119.6	6.38	159.3	0.906
	96	1265.7	8.33	1813.0	0.801	2040.4	14.46	2707.9	0.948	131.1	6.98	173.4	0.890
TimeMixer	24	842.7	5.54	1247.1	0.911	1086.6	7.53	1481.5	0.984	86.4	4.65	116.2	0.948
	48	1066.3	7.00	1563.2	0.857	1496.4	10.21	2039.2	0.969	103.9	5.59	139.2	0.924
	72	1179.4	7.69	1709.5	0.826	1760.7	12.02	2373.3	0.958	117.2	6.30	156.1	0.905
	96	1250.8	8.07	1790.8	0.809	1939.8	13.09	2591.6	0.951	124.6	6.75	165.6	0.893
iTransformer	24	828.8	5.42	1236.5	0.913	1150.6	7.98	1564.7	0.982	91.8	4.93	122.4	0.942
	48	1063.2	6.87	1575.4	0.858	1605.8	11.34	2272.9	0.962	110.0	5.90	146.4	0.918
	72	1200.2	7.72	1739.4	0.826	1961.4	13.84	2644.1	0.950	125.1	6.80	163.7	0.896
	96	1260.8	8.11	1806.5	0.806	2141.4	15.04	2864.1	0.940	133.5	7.24	177.8	0.878
TimeXer	24	794.5	5.26	1198.3	0.918	1203.3	8.35	1658.9	0.980	87.5	4.72	116.1	0.947
	48	981.4	6.38	1472.5	0.873	1662.5	11.46	2314.2	0.960	107.0	5.76	142.1	0.921
	72	1110.4	7.14	1656.8	0.839	1909.4	13.22	2615.3	0.948	118.6	6.42	156.4	0.905
	96	1192.3	7.63	1753.4	0.817	2166.7	15.14	2905.6	0.936	130.3	7.03	171.7	0.886
TimesNet	24	1022.5	6.78	1433.6	0.881	1185.8	7.93	1590.8	0.982	98.3	5.40	126.7	0.937
	48	1173.6	7.86	1645.7	0.838	1584.0	10.42	2178.2	0.967	113.2	6.25	144.8	0.917
	72	1275.7	8.50	1725.7	0.819	1791.7	11.52	2418.2	0.957	116.5	6.45	149.2	0.914
	96	1445.9	9.27	1974.8	0.764	2344.4	14.54	3160.7	0.934	137.2	7.59	173.9	0.880
SegRNN	24	890.0	5.96	1300.0	0.902	1075.0	7.54	1486.5	0.984	86.9	4.66	117.3	0.948
	48	1101.8	7.34	1610.5	0.845	1470.5	10.15	2036.0	0.969	105.7	5.64	141.7	0.925
	72	1212.8	8.04	1740.6	0.815	1794.2	12.46	2425.2	0.956	120.9	6.43	160.3	0.908
	96	1351.4	8.98	1885.7	0.782	2055.3	14.32	2706.4	0.946	132.1	7.03	173.7	0.891
Amplifier	24	825.0	5.47	1226.1	0.913	1053.2	7.44	1443.9	0.985	88.4	4.73	119.6	0.947
	48	979.4	6.39	1455.8	0.876	1377.8	9.70	1866.4	0.974	108.4	5.80	144.6	0.921
	72	1104.2	7.13	1613.8	0.845	1822.2	12.51	2439.7	0.955	120.7	6.46	160.0	0.906
	96	1180.5	7.62	1687.5	0.829	1949.1	13.32	2608.6	0.949	133.9	7.12	177.4	0.885
ARIMA	24	1162.55	7.62	1603.95	0.852	1285.3	8.94	1737.75	0.811	1225.85	8.04	1669.5	0.842
	48	1303.52	8.61	1795.11	0.804	1440.71	10.02	1944.89	0.768	1370.45	8.97	1869.62	0.798
	72	1388.82	9.18	1900.51	0.781	1535.12	10.65	2059.29	0.739	1461.32	9.63	1980.52	0.772
	96	1359.27	9.64	1970.08	0.763	1502.76	11.18	2133.91	0.728	1431.15	10.08	2052.28	0.756
LSTM	24	947.8	6.159	1370.85	0.894	1327.6	9.05	1762.13	0.977	92.8	5.016	122.89	0.942
	48	1094.01	7	1619.95	0.856	1633.1	10.73	2177	0.966	106.86	5.784	142.11	0.924
	72	1310.49	8.35	1849.5	0.808	2189	14.622	2853.16	0.943	123.48	6.71	164.39	0.907
	96	1440.05	9.1	2070.6	0.767	2292.6	15.04	3065.4	0.934	132.3	7.16	177.17	0.892
FTimeDD	24	763.9	5.03	1188.5	0.919	1025.5	7.20	1394.2	0.985	84.9	4.58	113.6	0.951
	48	972.1	6.33	1481.7	0.871	1380.0	9.79	1908.0	0.973	103.4	5.57	138.7	0.925
	72	1101.6	7.08	1664.6	0.838	1654.7	11.57	2238.0	0.963	114.7	6.17	152.0	0.911
	96	1223.5	7.91	1771.5	0.807	1841.0	12.64	2482.2	0.953	111.7	6.57	160.5	0.899

Table 3. Ablation results of different model variants.

Mode	Horizons	Steps	Params	Time (s)	MAE	MAPE	RMSE	ACCR
FTimeDD (Full model)	168	24	117,110	1.28	624.77	5.6	1059.75	0.952
	168	48	131,006	1.32	818.51	7.23	1397.01	0.924
	168	72	144,902	1.36	956.99	8.44	1612.75	0.904
	168	96	158,798	1.37	1038.73	9.282	1723.87	0.888
w/o FFM	168	24	111,708	0.87	654.31	5.83	1093.2	0.949
	168	48	125,604	0.87	841	7.358	1418.99	0.922
	168	72	139,500	0.89	981.85	8.57	1648.72	0.9004
	168	96	153,396	0.89	1067.252	9.13	1778.15	0.887
w/o Time domain Decoupling	168	24	18,102	0.92	692.37	6.15	1134.32	0.943
	168	48	19,662	0.92	889.48	7.79	1469.63	0.911
	168	72	21,222	0.92	1037.38	8.93	1704.16	0.895
	168	96	22,782	0.92	1162.98	9.85	1897.78	0.873
w/o Frequency domain Decoupling	168	24	104,422	1.05	695.74	6.53	1145.33	0.94
	168	48	116,758	1.06	854.01	7.45	1433.49	0.919
	168	72	129,094	1.08	978.19	8.51	1645.33	0.902
	168	96	141,430	1.10	1117.64	9.51	1839.02	0.883
w/o Dual-stream Decoupling	168	24	46,728	0.48	693.32	5.98	1137.68	0.947
	168	48	52,392	0.56	878.67	7.39	1497.86	0.917
	168	72	58,056	0.59	1055.41	8.63	1738.08	0.894
	168	96	64,428	0.6	1149.96	9.46	1890.12	0.88
w/o Exogenous Variables	168	24	116,121	1.22	637.64	5.692	1077.34	0.951
	168	48	130,017	1.28	843.6	7.241	1445.27	0.924
	168	72	143,913	1.29	1008.5	8.525	1693.11	0.9025
	168	96	157,809	1.32	1089.32	9.41	1809.24	0.887
w/o RIN	168	24	117,110	1.12	683.2	6.16	1135.4	0.948
	168	48	131,006	1.23	942.31	8.76	1527.99	0.9036
	168	72	144,902	1.27	1002.8	8.683	1683.16	0.905
	168	96	158,798	1.33	1195.76	10.08	1922.65	0.874

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
