LLM-Empowered Kolmogorov-Arnold Frequency Learning for Time Series Forecasting in Power Systems
Abstract
1. Introduction
- We introduce a pioneering framework that transforms multivariate time-series data into structured text prompts and leverages pre-trained Large Language Models (LLMs) to extract semantically rich representations. This approach captures complex, contextually nuanced patterns in power system data that are difficult to discern with conventional methods (a minimal prompt-encoding sketch follows this list).
- We propose a novel frequency-domain learning module built on Kolmogorov–Arnold Networks (KANs). By applying the Fast Fourier Transform (FFT) and employing KANs with learnable activation functions, this component captures the multi-scale periodic structures and intricate frequency characteristics inherent in power system time series, overcoming the limitations of fixed-activation MLPs (see the frequency-module sketch after this list).
- We develop an entropy-minimization strategy for cross-modal fusion that bridges the semantic gap between the prompt-based (semantic) and frequency-based representations. This theoretically grounded approach promotes thorough integration of complementary information from both modalities, enhancing the model's ability to exploit diverse data characteristics for accurate forecasting (an illustrative fusion sketch appears after this list).
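For concreteness, the sketch below illustrates how a multivariate window could be serialized into a structured text prompt and encoded with a frozen pre-trained LLM. It is not the authors' released code: the prompt template, the GPT-2 backbone, and the mean-pooling readout are illustrative assumptions standing in for the design detailed in Section 2.1.

```python
import torch
from transformers import AutoTokenizer, AutoModel

def serialize_window(values, var_names, horizon):
    """Render a (num_vars, seq_len) numeric window as a structured text prompt.
    The template is a hypothetical example, not the paper's exact wording."""
    lines = []
    for name, series in zip(var_names, values):
        rendered = ", ".join(f"{v:.3f}" for v in series.tolist())
        lines.append(f"{name}: {rendered}")
    return ("The following are recent power-system measurements.\n"
            + "\n".join(lines)
            + f"\nForecast the next {horizon} steps.")

@torch.no_grad()
def llm_prompt_embedding(prompt, model_name="gpt2"):
    """Encode the prompt with a frozen pre-trained LM (cached in practice)
    and mean-pool the last hidden states into one semantic vector."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name).eval()
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=1024)
    hidden = model(**inputs).last_hidden_state   # (1, num_tokens, d_model)
    return hidden.mean(dim=1)                    # (1, d_model)

# Toy usage: a 2-variable window of length 8, forecasting 96 steps ahead.
window = torch.randn(2, 8)
prompt = serialize_window(window, ["load_MW", "wind_MW"], horizon=96)
z_sem = llm_prompt_embedding(prompt)             # semantic representation for fusion
```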
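The frequency module can likewise be sketched. Kolmogorov–Arnold layers are commonly parameterized by learnable basis expansions on each input-output edge; the Gaussian radial-basis variant below is a simplified stand-in for whatever spline parameterization the paper adopts, and the way the rFFT spectrum is flattened into features is an assumption rather than the exact architecture of Section 2.2.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KANLayer(nn.Module):
    """Simplified Kolmogorov-Arnold layer: every input-output edge carries a
    learnable activation expressed in a Gaussian radial-basis expansion,
    plus a SiLU residual path (a common KAN-style parameterization)."""
    def __init__(self, in_dim, out_dim, num_basis=8, grid_range=3.0):
        super().__init__()
        self.register_buffer("centers", torch.linspace(-grid_range, grid_range, num_basis))
        self.coeffs = nn.Parameter(0.1 * torch.randn(out_dim, in_dim, num_basis))
        self.base = nn.Linear(in_dim, out_dim)

    def forward(self, x):                                        # x: (batch, in_dim)
        phi = torch.exp(-(x.unsqueeze(-1) - self.centers) ** 2)  # (batch, in_dim, num_basis)
        edge = torch.einsum("bik,oik->bo", phi, self.coeffs)     # sum of learnable edge activations
        return edge + self.base(F.silu(x))

class FrequencyKAN(nn.Module):
    """FFT front-end followed by stacked KAN layers acting on the spectrum."""
    def __init__(self, seq_len, hidden_dim, out_dim):
        super().__init__()
        freq_dim = 2 * (seq_len // 2 + 1)                        # real + imaginary parts of rfft
        self.net = nn.Sequential(KANLayer(freq_dim, hidden_dim),
                                 KANLayer(hidden_dim, out_dim))

    def forward(self, x):                                        # x: (batch, seq_len)
        spec = torch.fft.rfft(x, dim=-1)
        feats = torch.cat([spec.real, spec.imag], dim=-1)
        return self.net(feats)                                   # frequency representation for fusion

# Toy usage: 96-step windows mapped to 128-dimensional frequency embeddings.
z_freq = FrequencyKAN(seq_len=96, hidden_dim=256, out_dim=128)(torch.randn(32, 96))
```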
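Finally, one plausible reading of the entropy-oriented fusion is a gating scheme whose weight distribution is penalized by its Shannon entropy, so that training drives the gates toward confident, complementary use of the two modalities. The module and loss below are written under that assumption, and under the further assumption that both embeddings share a common dimension; the actual objective is the one defined in Section 2.3.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EntropyGatedFusion(nn.Module):
    """Fuse semantic (LLM) and frequency (KAN) embeddings with learned gates,
    returning both the fused vector and the mean gate entropy so the latter
    can be minimized alongside the forecasting loss."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, z_sem, z_freq):                       # both (batch, dim)
        stacked = torch.stack([z_sem, z_freq], dim=1)       # (batch, 2, dim)
        logits = self.score(stacked).squeeze(-1)            # (batch, 2)
        gates = F.softmax(logits, dim=-1)                   # per-sample modality weights
        fused = (gates.unsqueeze(-1) * stacked).sum(dim=1)  # (batch, dim)
        entropy = -(gates * torch.log(gates + 1e-8)).sum(dim=-1).mean()
        return fused, entropy

# Loss sketch: forecasting error plus a weighted entropy penalty.
fusion = EntropyGatedFusion(dim=128)
z_sem, z_freq = torch.randn(32, 128), torch.randn(32, 128)
fused, gate_entropy = fusion(z_sem, z_freq)
pred, target = torch.randn(32, 96), torch.randn(32, 96)    # placeholder forecast head and labels
loss = F.mse_loss(pred, target) + 0.1 * gate_entropy       # the 0.1 weight is arbitrary
```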
2. The Proposed Method
2.1. LLM-Based Prompt Representation Learning
2.2. KAN-Based Frequency Representation Learning
2.3. Entropy-Oriented Cross-Modal Fusion
2.4. The Loss Optimization
3. Experimental Evaluation
3.1. Setup
3.2. Comparison Evaluation
3.3. Ablation Analysis
3.4. Parameter Analysis
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Xing, Q.; Huang, X.; Wang, J.; Wang, S. A Novel Multivariate Combined Power Load Forecasting System Based on Feature Selection and Multi-Objective Intelligent Optimization. Expert Syst. Appl. 2024, 244, 122970.
- Ferkous, K.; Guermoui, M.; Menakh, S.; Bellaour, A.; Boulmaiz, T. A Novel Learning Approach for Short-Term Photovoltaic Power Forecasting-A Review and Case Studies. Eng. Appl. Artif. Intell. 2024, 133, 108502.
- Bashir, T.; Wang, H.; Tahir, M.; Zhang, Y. Wind and Solar Power Forecasting Based on Hybrid CNN-ABiLSTM, CNN-Transformer-MLP Models. Renew. Energy 2025, 239, 122055.
- de Azevedo Takara, L.; Teixeira, A.C.; Yazdanpanah, H.; Mariani, V.C.C.; dos Santos Coelho, L. Optimizing Multi-Step Wind Power Forecasting: Integrating Advanced Deep Neural Networks with Stacking-Based Probabilistic Learning. Appl. Energy 2024, 369, 123487.
- Zhao, Y.; Liao, H.; Pan, S.; Zhao, Y. Interpretable Multi-Graph Convolution Network Integrating Spatio-Temporal Attention and Dynamic Combination for Wind Power Forecasting. Expert Syst. Appl. 2024, 255, 124766.
- Gao, J.; Li, P.; Laghari, A.A.; Srivastava, G.; Gadekallu, T.R.; Abbas, S.; Zhang, J. Incomplete Multiview Clustering via Semidiscrete Optimal Transport for Multimedia Data Mining in IoT. ACM Trans. Multimed. Comput. Commun. Appl. 2024, 20, 1–20.
- Gao, J.; Liu, M.; Li, P.; Laghari, A.A.; Javed, A.R.; Victor, N.; Gadekallu, T.R. Deep Incomplete Multiview Clustering via Information Bottleneck for Pattern Mining of Data in Extreme-Environment IoT. IEEE Internet Things J. 2023, 11, 26700–26712.
- Gao, J.; Liu, M.; Li, P.; Zhang, J.; Chen, Z. Deep Multiview Adaptive Clustering with Semantic Invariance. IEEE Trans. Neural Netw. Learn. Syst. 2024, 35, 12965–12978.
- Li, P.; Laghari, A.A.; Rashid, M.; Gao, J.; Gadekallu, T.R.; Javed, A.R.; Yin, S. A Deep Multimodal Adversarial Cycle-Consistent Network for Smart Enterprise System. IEEE Trans. Ind. Inform. 2022, 19, 693–702.
- Gao, J.; Cheng, Y.; Zhang, D.; Chen, Y. Physics-Constrained Wind Power Forecasting Aligned with Probability Distributions for Noise-Resilient Deep Learning. Appl. Energy 2025, 383, 125295.
- Wang, J.; Kou, M.; Li, R.; Qian, Y.; Li, Z. Ultra-Short-Term Wind Power Forecasting Jointly Driven by Anomaly Detection, Clustering and Graph Convolutional Recurrent Neural Networks. Adv. Eng. Inform. 2025, 65, 103137.
- Wang, Y.; Hao, Y.; Zhao, K.; Yao, Y. Stochastic Configuration Networks for Short-Term Power Load Forecasting. Inf. Sci. 2025, 689, 121489.
- Hu, X.; Li, H.; Si, C. Improved Composite Model Using Metaheuristic Optimization Algorithm for Short-Term Power Load Forecasting. Electr. Power Syst. Res. 2025, 241, 111330.
- Yang, Q.; Tian, Z. A Hybrid Load Forecasting System Based on Data Augmentation and Ensemble Learning Under Limited Feature Availability. Expert Syst. Appl. 2025, 261, 125567.
- Gao, J.; Guo, C.; Liu, Y.; Li, P.; Zhang, J.; Liu, M. Dynamic-Static Feature Fusion with Multi-Scale Attention for Continuous Blood Glucose Prediction. In Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, 6–11 April 2025; pp. 1–5.
- Jalalifar, R.; Delavar, M.R.; Ghaderi, S.F. SAC-ConvLSTM: A Novel Spatio-Temporal Deep Learning-Based Approach for a Short Term Power Load Forecasting. Expert Syst. Appl. 2024, 237, 121487.
- Deng, Q.; Wang, C.; Sun, J.; Sun, Y.; Jiang, J.; Lin, H.; Deng, Z. Nonvolatile CMOS Memristor, Reconfigurable Array, and Its Application in Power Load Forecasting. IEEE Trans. Ind. Inform. 2023, 20, 6130–6141.
- Yuan, F.; Che, J. An Ensemble Multi-Step M-RMLSSVR Model Based on VMD and Two-Group Strategy for Day-Ahead Short-Term Load Forecasting. Knowl.-Based Syst. 2022, 252, 109440.
- Zhang, S.; Chen, R.; Cao, J.; Tan, J. A CNN and LSTM-Based Multi-Task Learning Architecture for Short and Medium-Term Electricity Load Forecasting. Electr. Power Syst. Res. 2023, 222, 109507.
- Lv, L.; Wu, Z.; Zhang, J.; Zhang, L.; Tan, Z.; Tian, Z. A VMD and LSTM Based Hybrid Model of Load Forecasting for Power Grid Security. IEEE Trans. Ind. Inform. 2021, 18, 6474–6482.
- Xu, A.; Chen, J.; Li, J.; Chen, Z.; Xu, S.; Nie, Y. Multivariate Rolling Decomposition Hybrid Learning Paradigm for Power Load Forecasting. Renew. Sustain. Energy Rev. 2025, 212, 115375.
- Pentsos, V.; Tragoudas, S.; Wibbenmeyer, J.; Khdeer, N. A Hybrid LSTM-Transformer Model for Power Load Forecasting. IEEE Trans. Smart Grid 2025, 16, 2624–2634.
- Liu, P.; Guo, H.; Dai, T.; Li, N.; Bao, J.; Ren, X.; Jiang, Y.; Xia, S.T. CALF: Aligning LLMs for Time Series Forecasting via Cross-Modal Fine-Tuning. Proc. AAAI Conf. Artif. Intell. 2025, 39, 18915–18923.
- Liu, C.; Xu, Q.; Miao, H.; Yang, S.; Zhang, L.; Long, C.; Li, Z.; Zhao, R. TimeCMA: Towards LLM-Empowered Multivariate Time Series Forecasting via Cross-Modality Alignment. Proc. AAAI Conf. Artif. Intell. 2025, 39, 18780–18788.
- Tan, M.; Merrill, M.; Gupta, V.; Althoff, T.; Hartvigsen, T. Are Language Models Actually Useful for Time Series Forecasting? Adv. Neural Inf. Process. Syst. 2024, 37, 60162–60191.
- Jia, F.; Wang, K.; Zheng, Y.; Cao, D.; Liu, Y. GPT4MTS: Prompt-Based Large Language Model for Multimodal Time-Series Forecasting. Proc. AAAI Conf. Artif. Intell. 2024, 38, 23343–23351.
- Qiu, X.; Wu, X.; Lin, Y.; Guo, C.; Hu, J.; Yang, B. DUET: Dual Clustering Enhanced Multivariate Time Series Forecasting. In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Toronto, ON, Canada, 3–7 August 2025; pp. 1185–1196.
- Murad, M.M.N.; Aktukmak, M.; Yilmaz, Y. WPMixer: Efficient Multi-Resolution Mixing for Long-Term Time Series Forecasting. Proc. AAAI Conf. Artif. Intell. 2025, 39, 19581–19588.
- Zeng, A.; Chen, M.; Zhang, L.; Xu, Q. Are Transformers Effective for Time Series Forecasting? Proc. AAAI Conf. Artif. Intell. 2023, 37, 11121–11128.
- Huang, S.; Zhao, Z.; Li, C.; Bai, L. TimeKAN: KAN-Based Frequency Decomposition Learning Architecture for Long-Term Time Series Forecasting. In Proceedings of the Thirteenth International Conference on Learning Representations, Singapore, 24–28 April 2025; pp. 1–8.
Method | T = 96 RMSE | T = 96 MAE | T = 192 RMSE | T = 192 MAE | T = 336 RMSE | T = 336 MAE | T = 720 RMSE | T = 720 MAE |
---|---|---|---|---|---|---|---|---|
FS-TSF | 0.121 | 0.099 | 0.145 | 0.108 | 0.169 | 0.127 | 0.189 | 0.143 |
Hybrid-net | 0.098 | 0.076 | 0.106 | 0.080 | 0.110 | 0.089 | 0.115 | 0.094 |
IMC-net | 0.135 | 0.102 | 0.178 | 0.146 | 0.204 | 0.174 | 0.229 | 0.176 |
DAE-TSF | 0.108 | 0.082 | 0.146 | 0.115 | 0.156 | 0.124 | 0.159 | 0.128 |
SAC-ConvLSTM | 0.103 | 0.074 | 0.109 | 0.078 | 0.113 | 0.082 | 0.120 | 0.088 |
LT-TSF | 0.089 | 0.063 | 0.096 | 0.068 | 0.110 | 0.080 | 0.117 | 0.086 |
Timecma | 0.107 | 0.077 | 0.121 | 0.089 | 0.134 | 0.101 | 0.150 | 0.119 |
Gpt4mts | 0.091 | 0.063 | 0.096 | 0.069 | 0.117 | 0.083 | 0.121 | 0.099 |
Effformer | 0.092 | 0.064 | 0.101 | 0.073 | 0.112 | 0.086 | 0.150 | 0.116 |
TimeKAN | 0.092 | 0.061 | 0.097 | 0.069 | 0.115 | 0.085 | 0.119 | 0.092 |
Ours | 0.088 | 0.062 | 0.093 | 0.067 | 0.107 | 0.079 | 0.110 | 0.083 |
Method | T = 96 RMSE | T = 96 MAE | T = 192 RMSE | T = 192 MAE | T = 336 RMSE | T = 336 MAE | T = 720 RMSE | T = 720 MAE |
---|---|---|---|---|---|---|---|---|
FS-TSF | 0.090 | 0.074 | 0.093 | 0.082 | 0.095 | 0.099 | 0.097 | 0.102 |
Hybrid-net | 0.092 | 0.076 | 0.095 | 0.081 | 0.097 | 0.090 | 0.105 | 0.097 |
IMC-net | 0.088 | 0.075 | 0.091 | 0.079 | 0.102 | 0.087 | 0.109 | 0.096 |
DAE-TSF | 0.067 | 0.051 | 0.064 | 0.048 | 0.079 | 0.060 | 0.106 | 0.085 |
SAC-ConvLSTM | 0.075 | 0.052 | 0.085 | 0.061 | 0.089 | 0.065 | 0.090 | 0.066 |
LT-TSF | 0.076 | 0.052 | 0.084 | 0.059 | 0.093 | 0.068 | 0.094 | 0.069 |
Timecma | 0.054 | 0.038 | 0.062 | 0.047 | 0.068 | 0.048 | 0.075 | 0.057 |
Gpt4mts | 0.055 | 0.038 | 0.065 | 0.046 | 0.075 | 0.054 | 0.089 | 0.067 |
Effformer | 0.054 | 0.039 | 0.062 | 0.046 | 0.087 | 0.067 | 0.099 | 0.072 |
TimeKAN | 0.068 | 0.045 | 0.077 | 0.052 | 0.083 | 0.058 | 0.088 | 0.062 |
Ours | 0.053 | 0.037 | 0.060 | 0.043 | 0.063 | 0.046 | 0.072 | 0.054 |
Method | T = 96 RMSE | T = 96 MAE | T = 192 RMSE | T = 192 MAE | T = 336 RMSE | T = 336 MAE | T = 720 RMSE | T = 720 MAE |
---|---|---|---|---|---|---|---|---|
FS-TSF | 0.108 | 0.086 | 0.114 | 0.097 | 0.127 | 0.108 | 0.132 | 0.116 |
Hybrid-net | 0.120 | 0.092 | 0.127 | 0.098 | 0.134 | 0.108 | 0.155 | 0.119 |
IMC-net | 0.117 | 0.084 | 0.132 | 0.101 | 0.157 | 0.123 | 0.185 | 0.133 |
DAE-TSF | 0.102 | 0.070 | 0.100 | 0.071 | 0.104 | 0.072 | 0.110 | 0.089 |
SAC-ConvLSTM | 0.096 | 0.071 | 0.101 | 0.073 | 0.110 | 0.085 | 0.122 | 0.096 |
LT-TSF | 0.083 | 0.058 | 0.090 | 0.064 | 0.099 | 0.069 | 0.110 | 0.082 |
Timecma | 0.088 | 0.061 | 0.801 | 0.610 | 0.756 | 0.504 | 0.770 | 0.551 |
Gpt4mts | 0.084 | 0.062 | 0.090 | 0.069 | 0.102 | 0.077 | 0.112 | 0.093 |
Effformer | 0.082 | 0.056 | 0.088 | 0.062 | 0.098 | 0.072 | 0.108 | 0.083 |
TimeKAN | 0.082 | 0.055 | 0.085 | 0.063 | 0.099 | 0.074 | 0.109 | 0.082 |
Ours | 0.079 | 0.054 | 0.080 | 0.060 | 0.095 | 0.066 | 0.106 | 0.077 |
Method | T = 96 RMSE | T = 96 MAE | T = 192 RMSE | T = 192 MAE | T = 336 RMSE | T = 336 MAE | T = 720 RMSE | T = 720 MAE |
---|---|---|---|---|---|---|---|---|
FS-TSF | 0.069 | 0.059 | 0.074 | 0.063 | 0.082 | 0.074 | 0.089 | 0.082 |
Hybrid-net | 0.080 | 0.072 | 0.088 | 0.078 | 0.096 | 0.088 | 0.107 | 0.092 |
IMC-net | 0.058 | 0.039 | 0.065 | 0.043 | 0.070 | 0.046 | 0.081 | 0.054 |
DAE-TSF | 0.043 | 0.031 | 0.048 | 0.037 | 0.059 | 0.044 | 0.084 | 0.060 |
SAC-ConvLSTM | 0.095 | 0.077 | 0.104 | 0.086 | 0.117 | 0.095 | 0.126 | 0.109 |
LT-TSF | 0.083 | 0.065 | 0.101 | 0.079 | 0.110 | 0.084 | 0.121 | 0.082 |
Timecma | 0.047 | 0.045 | 0.054 | 0.040 | 0.756 | 0.043 | 0.087 | 0.055 |
Gpt4mts | 0.064 | 0.042 | 0.073 | 0.056 | 0.088 | 0.056 | 0.096 | 0.073 |
Effformer | 0.044 | 0.033 | 0.052 | 0.036 | 0.070 | 0.047 | 0.085 | 0.056 |
TimeKAN | 0.051 | 0.033 | 0.060 | 0.038 | 0.068 | 0.045 | 0.078 | 0.051 |
Ours | 0.040 | 0.028 | 0.047 | 0.033 | 0.054 | 0.038 | 0.063 | 0.045 |
Method | T = 96 RMSE | T = 96 MAE | T = 192 RMSE | T = 192 MAE | T = 336 RMSE | T = 336 MAE | T = 720 RMSE | T = 720 MAE |
---|---|---|---|---|---|---|---|---|
FS-TSF | 0.091 | 0.067 | 0.103 | 0.074 | 0.123 | 0.077 | 0.126 | 0.079 |
Hybrid-net | 0.083 | 0.056 | 0.094 | 0.070 | 0.112 | 0.079 | 0.120 | 0.086 |
IMC-net | 0.077 | 0.058 | 0.081 | 0.063 | 0.087 | 0.069 | 0.097 | 0.075 |
DAE-TSF | 0.085 | 0.071 | 0.097 | 0.068 | 0.107 | 0.075 | 0.153 | 0.092 |
SAC-ConvLSTM | 0.088 | 0.068 | 0.096 | 0.065 | 0.106 | 0.075 | 0.135 | 0.077 |
LT-TSF | 0.078 | 0.057 | 0.084 | 0.054 | 0.087 | 0.059 | 0.092 | 0.065 |
Timecma | 0.082 | 0.059 | 0.087 | 0.057 | 0.079 | 0.056 | 0.096 | 0.071 |
Gpt4mts | 0.074 | 0.045 | 0.072 | 0.044 | 0.073 | 0.047 | 0.086 | 0.058 |
Effformer | 0.072 | 0.045 | 0.074 | 0.047 | 0.077 | 0.051 | 0.086 | 0.058 |
TimeKAN | 0.075 | 0.048 | 0.079 | 0.052 | 0.080 | 0.055 | 0.093 | 0.063 |
Ours | 0.069 | 0.041 | 0.071 | 0.043 | 0.072 | 0.045 | 0.084 | 0.056 |
Method | T = 96 ETTm1 | T = 96 Electricity | T = 336 ETTm1 | T = 336 Electricity | T = 720 ETTm1 | T = 720 Electricity |
---|---|---|---|---|---|---|
IMC-net | −0.0018 | −0.0048 | 0.0065 | 0.0088 | 0.0139 | 0.0166 |
LT-TSF | 0.0019 | 0.0049 | 0.0067 | 0.0090 | 0.0137 | 0.0165 |
Timecma | 0.0022 | 0.0053 | 0.0072 | 0.0096 | 0.0146 | 0.0168 |
Gpt4mts | 0.0016 | 0.0042 | 0.0063 | 0.0084 | −0.0133 | −0.0154 |
Effformer | −0.0014 | −0.0042 | 0.0064 | 0.0085 | −0.0135 | −0.0155 |
TimeKAN | 0.0017 | 0.0044 | 0.0064 | 0.0086 | 0.0138 | 0.0159 |
Ours | 0.0015 | 0.0038 | 0.0060 | 0.0080 | 0.0130 | 0.0150 |
Variant | T = 96 RMSE | T = 96 MAE | T = 192 RMSE | T = 192 MAE | T = 336 RMSE | T = 336 MAE | T = 720 RMSE | T = 720 MAE |
---|---|---|---|---|---|---|---|---|
Variant_1 | 0.098 | 0.072 | 0.100 | 0.076 | 0.120 | 0.097 | 0.132 | 0.107 |
Variant_2 | 0.089 | 0.063 | 0.094 | 0.069 | 0.110 | 0.083 | 0.113 | 0.086 |
Variant_3 | 0.092 | 0.067 | 0.099 | 0.075 | 0.117 | 0.092 | 0.127 | 0.102 |
Ours | 0.088 | 0.062 | 0.093 | 0.067 | 0.107 | 0.079 | 0.110 | 0.083 |
Configuration | T = 96 RMSE | T = 96 MAE | T = 192 RMSE | T = 192 MAE | T = 336 RMSE | T = 336 MAE | T = 720 RMSE | T = 720 MAE |
---|---|---|---|---|---|---|---|---|
Ours w/o LLM | 0.098 | 0.071 | 0.102 | 0.075 | 0.119 | 0.081 | 0.131 | 0.103 |
Ours w/o KAN | 0.091 | 0.065 | 0.095 | 0.070 | 0.110 | 0.085 | 0.117 | 0.096 |
Ours w/o Entropy | 0.090 | 0.065 | 0.098 | 0.072 | 0.115 | 0.086 | 0.121 | 0.097 |
Ours | 0.088 | 0.062 | 0.093 | 0.067 | 0.107 | 0.079 | 0.110 | 0.083 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).