An Overview of Spatiotemporal Network Forecasting: Current Research Status and Methodological Evolution
Abstract
1. Research Background and Overview
2. Time Series Forecasting Model
2.1. Statistical Methods
2.2. DNN-Based Methods
2.3. RNN-Based Methods
2.4. Transformer-Based Methods
- Key, Query, and Value Vector Generation: The input vector X is linearly projected into three representations: query vector Q, key vector K, and value vector V, computed as Q = XW_Q, K = XW_K, V = XW_V, where W_Q, W_K, and W_V are learnable projection matrices.
- Self-Attention Aggregation: Each query vector Q is compared with each key vector K through inner products to measure similarity. The process includes: (1) computing the scaled inner product between Q and K, normalized by √d_k, where d_k is the key dimension; (2) applying the softmax operation to obtain the attention distribution; (3) taking the weighted sum with the value vector V. The attention function is defined as Attention(Q, K, V) = softmax(QKᵀ/√d_k)V.
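The two steps above can be sketched in NumPy. This is a generic illustration of scaled dot-product self-attention; the toy dimensions and randomly initialized projection matrices W_Q, W_K, W_V are illustrative, not taken from any model surveyed here.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_Q, W_K, W_V):
    """Scaled dot-product self-attention over a sequence X of shape (n, d_model)."""
    Q = X @ W_Q                        # queries, shape (n, d_k)
    K = X @ W_K                        # keys,    shape (n, d_k)
    V = X @ W_V                        # values,  shape (n, d_v)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)    # scaled inner products, shape (n, n)
    A = softmax(scores, axis=-1)       # attention distribution per query
    return A @ V                       # weighted sum of values, shape (n, d_v)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                              # 5 time steps, width 8
W_Q, W_K, W_V = (rng.normal(size=(8, 4)) for _ in range(3))
out = self_attention(X, W_Q, W_K, W_V)
print(out.shape)  # (5, 4)
```

Each output row is a convex combination of the value vectors, with weights given by the softmax-normalized similarities between that row's query and all keys.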
2.4.1. Early Explorations
2.4.2. Efficiency and Long-Horizon Modeling
2.4.3. Robustness and Adaptability
2.4.4. Multi-Modal, Spectral and Exogenous Integration
2.4.5. Causality and Historical Dependency
2.4.6. Summary
2.5. Pre-Trained Foundation Models for Time Series and Spatio-Temporal Forecasting
3. Spatio-Temporal Forecasting Model
3.1. Euclidean-Structured Methods
- (1) Traditional Machine Learning Methods
- (2) Convolutional Neural Networks
- (3) Autoencoder Models
3.2. GNN-Based Methods
- (1) Graph Convolutional Networks
- (2) Spatiotemporal Prediction Models Based on GCN
- (3) Optimized Prediction Models Based on Spatiotemporal Feature Pattern Mining
  - [I] Spatiotemporal Prediction Models Based on Global Information Extraction
  - [II] Spatiotemporal Prediction Models Based on Dynamic Spatiotemporal Feature Extraction
  - [III] Spatiotemporal Prediction Models Based on Neural ODE for Continuous-Time Modeling
3.3. Diffusion-Model-Based Methods
- Designing reverse diffusion paths with strong task adaptability, such as residual graph structure prediction and modulation-attention-based temporal modeling;
- Improving inference efficiency and stability, including sparse decoders and multi-scale reconstruction paths;
- Integrating with Graph Neural Networks or Neural Ordinary Differential Equations to achieve unified frameworks for continuous-time modeling and dynamic graph forecasting [94].
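As background for the directions listed above, the denoising recursion that these spatio-temporal diffusion models build on can be sketched generically. This is a minimal DDPM-style forward/reverse step on a toy node-feature matrix, not the implementation of any specific model cited; the linear noise schedule and dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100
betas = np.linspace(1e-4, 0.02, T)   # illustrative linear noise schedule
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)

def q_sample(x0, t):
    # Forward process: x_t = sqrt(abar_t)*x0 + sqrt(1 - abar_t)*eps.
    eps = rng.normal(size=x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

def p_step(xt, t, eps_hat):
    # One reverse (denoising) step, given a noise estimate eps_hat
    # (in practice predicted by a GNN/Transformer denoiser).
    coef = betas[t] / np.sqrt(1.0 - alpha_bar[t])
    mean = (xt - coef * eps_hat) / np.sqrt(alphas[t])
    if t > 0:
        mean = mean + np.sqrt(betas[t]) * rng.normal(size=xt.shape)
    return mean

x0 = rng.normal(size=(10, 3))        # e.g. 10 graph nodes, 3 features
xt, eps = q_sample(x0, T - 1)        # fully noised sample
x_prev = p_step(xt, T - 1, eps)      # one reverse step using the true noise
print(x_prev.shape)  # (10, 3)
```

The surveyed methods differ mainly in what the denoiser conditions on (graph structure, history windows, dynamics priors) and in how the reverse path is shortened for efficient inference.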
3.4. Causality-Based Methods
3.5. Multimodal Methods
3.6. Transformer-Based Methods
4. Limitations and Future Directions
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Chen, D.; Shao, Q.; Liu, Z.; Yu, W.; Chen, C.P. Ridesourcing behavior analysis and prediction: A network perspective. IEEE Trans. Intell. Transp. Syst. 2020, 23, 1274–1283. [Google Scholar] [CrossRef]
- Lim, B.; Zohren, S. Time-series forecasting with deep learning: A survey. Philos. Trans. R. Soc. A 2021, 379, 20200209. [Google Scholar] [CrossRef]
- Cheng, M.; Liu, Z.; Tao, X.; Liu, Q.; Zhang, J.; Pan, T.; Zhang, S.; He, P.; Zhang, X.; Wang, D.; et al. A comprehensive survey of time series forecasting: Concepts, challenges, and future directions. Authorea 2025, preprint. [Google Scholar]
- Kong, X.; Chen, Z.; Liu, W.; Ning, K.; Zhang, L.; Muhammad Marier, S.; Liu, Y.; Chen, Y.; Xia, F. Deep learning for time series forecasting: A survey. Int. J. Mach. Learn. Cybern. 2025, 16, 5079–5112. [Google Scholar] [CrossRef]
- Su, J.; Jiang, C.; Jin, X.; Qiao, Y.; Xiao, T.; Ma, H.; Wei, R.; Jing, Z.; Xu, J.; Lin, J. Large language models for forecasting and anomaly detection: A systematic literature review. arXiv 2024, arXiv:2402.10350. [Google Scholar] [CrossRef]
- Chen, D.; Chen, J.; Zhang, X.; Jia, Q.; Liu, X.; Sun, Y.; Lv, L.; Yu, W. Critical nodes identification in complex networks: A survey. arXiv 2025, arXiv:2507.06164. [Google Scholar] [CrossRef]
- Chen, D.; Sun, Y.; Shao, G.; Yu, W.; Zhang, H.T.; Lin, W. Coordinating directional switches in pigeon flocks: The role of nonlinear interactions. R. Soc. Open Sci. 2021, 8, 210649. [Google Scholar] [CrossRef] [PubMed]
- Chen, D.; Lu, T.; Liu, X.; Yu, W. Finite-time consensus of multiagent systems with input saturation and disturbance. Int. J. Robust Nonlinear Control 2021, 31, 2097–2109. [Google Scholar] [CrossRef]
- Chen, D.; Yang, Y.; Zhang, Y.; Yu, W. Prediction of COVID-19 spread by sliding mSEIR observer. Sci. China Inf. Sci. 2020, 63, 222203. [Google Scholar] [CrossRef]
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
- Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv 2014, arXiv:1412.3555. [Google Scholar] [CrossRef]
- Fu, R.; Zhang, Z.; Li, L. Using LSTM and GRU Neural Network Methods for Traffic Flow Prediction. In Proceedings of the 31st Youth Academic Annual Conference of Chinese Association of Automation (YAC 2016), Wuhan, China, 26–28 May 2016; pp. 324–328. [Google Scholar]
- Mackenzie, J.; Roddick, J.F.; Zito, R. An Evaluation of HTM and LSTM for Short-Term Arterial Traffic Flow Prediction. IEEE Trans. Intell. Transp. Syst. 2019, 20, 1847–1857. [Google Scholar] [CrossRef]
- Cui, Z.; Ke, R.; Pu, Z.; Wang, Y. Stacked bidirectional and unidirectional LSTM recurrent neural network for forecasting network-wide traffic state with missing values. Transp. Res. Part C Emerg. Technol. 2020, 118, 102674. [Google Scholar] [CrossRef]
- Karevan, Z.; Suykens, J.A. Spatio-temporal stacked LSTM for temperature prediction in weather forecasting. arXiv 2018, arXiv:1811.06341. [Google Scholar] [CrossRef]
- Li, C.; He, Y.; Li, X.; Jing, X. BiGRU Network for Human Activity Recognition in High Resolution Range Profile. In Proceedings of the 2019 International Radar Conference (RADAR), Toulon, France, 23–27 September 2019; pp. 1–5. [Google Scholar]
- Hou, H.; Yu, F.R. Rwkv-ts: Beyond traditional recurrent neural network for time series tasks. arXiv 2024, arXiv:2401.09093. [Google Scholar] [CrossRef]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Curran Associates, Inc.: Nice, France, 2017; Volume 30. [Google Scholar]
- Wu, N.; Green, B.; Ben, X.; O’Banion, S. Deep transformer models for time series forecasting: The influenza prevalence case. arXiv 2020, arXiv:2001.08317. [Google Scholar] [CrossRef]
- Wu, S.; Xiao, X.; Ding, Q.; Zhao, P.; Wei, Y.; Huang, J. Adversarial Sparse Transformer for Time Series Forecasting. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 6–12 December 2020; Curran Associates, Inc.: Nice, France, 2020; Volume 33, pp. 17105–17115. [Google Scholar]
- Zhou, H.; Zhang, S.; Peng, J.; Zhang, S.; Li, J.; Xiong, H.; Zhang, W. Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 2–9 February 2021; Volume 35, pp. 11106–11115. [Google Scholar]
- Kitaev, N.; Kaiser, Ł.; Levskaya, A. Reformer: The efficient transformer. arXiv 2020, arXiv:2001.04451. [Google Scholar] [CrossRef]
- Liu, S.; Yu, H.; Liao, C.; Li, J.; Lin, W.; Liu, A.X.; Dustdar, S. Pyraformer: Low-complexity pyramidal attention for long-range time series modeling and forecasting. In Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS 2022), New Orleans, LA, USA, 28 November–9 December 2022. [Google Scholar]
- Zhou, T.; Ma, Z.; Wen, Q.; Wang, X.; Sun, L.; Jin, R. FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting. In Proceedings of the 39th International Conference on Machine Learning (ICML 2022), Baltimore, MD, USA, 17–23 July 2022; Proceedings of Machine Learning Research. pp. 27268–27286. [Google Scholar]
- Wu, H.; Xu, J.; Wang, J.; Long, M. Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS 2021), Virtual, 6–14 December 2021; Volume 34, pp. 22419–22430. [Google Scholar]
- Zhang, Y.; Yan, J. Crossformer: Transformer Utilizing Cross-Dimension Dependency for Multivariate Time Series Forecasting. In Proceedings of the 11th International Conference on Learning Representations (ICLR 2023), Kigali, Rwanda, 1–5 May 2023. [Google Scholar]
- Liu, M.; Zeng, A.; Xu, Z.; Lai, Q.; Xu, Q. Time Series is a Special Sequence: Forecasting with Sample Convolution and Interaction. arXiv 2021, arXiv:2106.09305. [Google Scholar]
- Liu, Y.; Hu, T.; Zhang, H.; Wu, H.; Wang, S.; Ma, L.; Long, M. itransformer: Inverted transformers are effective for time series forecasting. arXiv 2023, arXiv:2310.06625. [Google Scholar]
- Lee, M.; Yoon, H.; Kang, M. CASA: CNN Autoencoder-based Score Attention for Efficient Multivariate Long-term Time-series Forecasting. arXiv 2025, arXiv:2505.02011. [Google Scholar]
- Liu, Y.; Wu, H.; Wang, J.; Long, M. Non-stationary transformers: Exploring the stationarity in time series forecasting. Adv. Neural Inf. Process. Syst. 2022, 35, 9881–9893. [Google Scholar]
- Jiang, J.; Han, C.; Zhao, W.X.; Wang, J. PDFormer: Propagation Delay-Aware Dynamic Long-Range Transformer for Traffic Flow Prediction. In Proceedings of the 37th AAAI Conference on Artificial Intelligence (AAAI 2023), Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 4365–4373. [Google Scholar]
- Jang, J.; Park, H.; Choi, J.; Kim, T. Towards Robust Real-World Multivariate Time Series Forecasting: A Unified Framework for Dependency, Asynchrony, and Missingness. arXiv 2025, arXiv:2506.08660. [Google Scholar] [CrossRef]
- Villaboni, D.; Castellini, A.; Danesi, I.L.; Farinelli, A. Sentinel: Multi-Patch Transformer with Temporal and Channel Attention for Time Series Forecasting. arXiv 2025, arXiv:2503.17658. [Google Scholar] [CrossRef]
- Yamaguchi, Y.; Suemitsu, I.; Wei, W. Citras: Covariate-informed transformer for time series forecasting. arXiv 2025, arXiv:2503.24007. [Google Scholar] [CrossRef]
- Shu, Y.; Lampos, V. Sonnet: Spectral Operator Neural Network for Multivariable Time Series Forecasting. arXiv 2025, arXiv:2505.15312. [Google Scholar] [CrossRef]
- Guo, S.; Chen, Z.; Ma, Y.; Han, Y.; Wang, Y. SCFormer: Structured Channel-wise Transformer with Cumulative Historical State for Multivariate Time Series Forecasting. arXiv 2025, arXiv:2505.02655. [Google Scholar] [CrossRef]
- Zhang, X.; Qiang, W.; Zhao, S.; Guo, H.; Li, J.; Sun, C.; Zheng, C. CAIFormer: A Causal Informed Transformer for Multivariate Time Series Forecasting. arXiv 2025, arXiv:2505.16308. [Google Scholar] [CrossRef]
- Li, Z.; Xia, L.; Xu, Y.; Huang, C. GPT-ST: Generative pre-training of spatio-temporal graph neural networks. Adv. Neural Inf. Process. Syst. 2023, 36, 70229–70246. [Google Scholar]
- Wang, S.; Cao, J.; Yu, P.S. Deep Learning for Spatio-Temporal Data Mining: A Survey. IEEE Trans. Knowl. Data Eng. 2022, 34, 3681–3700. [Google Scholar] [CrossRef]
- Eddy, S.R. Hidden markov models. Curr. Opin. Struct. Biol. 1996, 6, 361–365. [Google Scholar] [CrossRef]
- Miao, S.; Wang, Z.J.; Liao, R. A CNN Regression Approach for Real-Time 2D/3D Registration. IEEE Trans. Med. Imaging 2016, 35, 1352–1363. [Google Scholar] [CrossRef]
- Liu, F.; Lin, G.; Shen, C. CRF Learning with CNN Features for Image Segmentation. Pattern Recognit. 2015, 48, 2983–2992. [Google Scholar] [CrossRef]
- Gehring, J.; Auli, M.; Grangier, D.; Yarats, D.; Dauphin, Y.N. Convolutional Sequence to Sequence Learning. In Proceedings of the 34th International Conference on Machine Learning (ICML 2017), Sydney, NSW, Australia, 6–11 August 2017; Proceedings of Machine Learning Research. pp. 1243–1252. [Google Scholar]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS 2012), Lake Tahoe, NV, USA, 3–8 December 2012; Volume 25. [Google Scholar]
- Long, J.; Shelhamer, E.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015), Boston, MA, USA, 7–12 June 2015; pp. 3431–3440. [Google Scholar]
- Zhang, J.; Zheng, Y.; Qi, D. Deep Spatio-Temporal Residual Networks for Citywide Crowd Flows Prediction. In Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI 2017), San Francisco, CA, USA, 4–9 February 2017; Volume 31, pp. 1655–1661. [Google Scholar]
- Zhang, J.; Zheng, Y.; Sun, J.; Qi, D. Flow Prediction in Spatio-Temporal Networks Based on Multitask Deep Learning. IEEE Trans. Knowl. Data Eng. 2020, 32, 468–478. [Google Scholar] [CrossRef]
- Li, Y.; Yu, R.; Shahabi, C.; Liu, Y. Diffusion convolutional recurrent neural network: Data-driven traffic forecasting. arXiv 2017, arXiv:1707.01926. [Google Scholar]
- Shi, X.; Chen, Z.; Wang, H.; Yeung, D.Y.; Wong, W.K.; Woo, W.C. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS 2015), Montréal, QC, Canada, 7–12 December 2015; Volume 28. [Google Scholar]
- Ravuri, S.; Lenc, K.; Willson, M.; Kangin, D.; Lam, R.; Mirowski, P.; Fitzsimons, M.; Athanassiadou, M.; Kashem, S.; Madge, S.; et al. Skilful Precipitation Nowcasting Using Deep Generative Models of Radar. Nature 2021, 597, 672–677. [Google Scholar] [CrossRef]
- Essien, A.; Giannetti, C. A Deep Learning Model for Smart Manufacturing Using Convolutional LSTM Neural Network Autoencoders. IEEE Trans. Ind. Informatics 2020, 16, 6069–6078. [Google Scholar] [CrossRef]
- Wei, M.; Yang, J.; Zhao, Z.; Zhang, X.; Li, J.; Deng, Z. DeFedHDP: Fully Decentralized Online Federated Learning for Heart Disease Prediction in Computational Health Systems. IEEE Trans. Comput. Soc. Syst. 2024, 11, 6854–6867. [Google Scholar] [CrossRef]
- Jiang, L.; Ming, X.; Zhang, X. DT-DOFL: Digital-Twin-Empowered Decentralized Online Federated Learning for User-Centered Smart Healthcare Service Systems. IEEE Trans. Comput. Soc. Syst. 2025, 12, 4441–4455. [Google Scholar] [CrossRef]
- Wei, M.; Yu, W.; Chen, D. AccDFL: Accelerated Decentralized Federated Learning for Healthcare IoT Networks. IEEE Internet Things J. 2025, 12, 5329–5345. [Google Scholar] [CrossRef]
- Zhou, J.; Cui, G.; Hu, S.; Zhang, Z.; Yang, C.; Liu, Z.; Wang, L.; Li, C.; Sun, M. Graph Neural Networks: A Review of Methods and Applications. AI Open 2020, 1, 57–81. [Google Scholar] [CrossRef]
- Bruna, J.; Zaremba, W.; Szlam, A.; LeCun, Y. Spectral networks and locally connected networks on graphs. arXiv 2013, arXiv:1312.6203. [Google Scholar]
- Defferrard, M.; Bresson, X.; Vandergheynst, P. Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering. In Proceedings of the Advances in Neural Information Processing Systems (NIPS 2016), Barcelona, Spain, 5–10 December 2016; Volume 29. [Google Scholar]
- Kipf, T.N.; Welling, M. Semi-Supervised Classification with Graph Convolutional Networks. arXiv 2016, arXiv:1609.02907. [Google Scholar]
- Zhao, L.; Song, Y.; Zhang, C.; Liu, Y.; Wang, P.; Lin, T.; Deng, M.; Li, H. T-GCN: A Temporal Graph Convolutional Network for Traffic Prediction. IEEE Trans. Intell. Transp. Syst. 2020, 21, 3848–3858. [Google Scholar] [CrossRef]
- Yu, B.; Yin, H.; Zhu, Z. Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. arXiv 2017, arXiv:1709.04875. [Google Scholar]
- Chai, D.; Wang, L.; Yang, Q. Bike Flow Prediction with Multi-Graph Convolutional Networks. In Proceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (SIGSPATIAL 2018), Seattle, WA, USA, 6–9 November 2018; pp. 397–400. [Google Scholar]
- Zhang, Q.; Chang, J.; Meng, G.; Xiang, S.; Pan, C. Spatio-Temporal Graph Structure Learning for Traffic Forecasting. In Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI 2020), New York, NY, USA, 7–12 February 2020; Volume 34, pp. 1177–1185. [Google Scholar]
- Cini, A.; Zambon, D.; Alippi, C. Sparse graph learning from spatiotemporal time series. J. Mach. Learn. Res. 2023, 24, 1–36. [Google Scholar]
- Cini, A.; Marisca, I.; Zambon, D.; Alippi, C. Taming local effects in graph-based spatiotemporal forecasting. Adv. Neural Inf. Process. Syst. 2023, 36, 55375–55393. [Google Scholar]
- Zhu, J.; Wang, Q.; Tao, C.; Deng, H.; Zhao, L.; Li, H. AST-GCN: Attribute-augmented spatiotemporal graph convolutional network for traffic forecasting. IEEE Access 2021, 9, 35973–35983. [Google Scholar] [CrossRef]
- Guo, S.; Lin, Y.; Feng, N.; Song, C.; Wan, H. Attention-Based Spatial-Temporal Graph Convolutional Networks for Traffic Flow Forecasting. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence (AAAI 2019), Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 922–929. [Google Scholar]
- Bai, L.; Yao, L.; Kanhere, S.; Wang, X.; Sheng, Q. Stg2seq: Spatial-temporal graph to sequence model for multi-step passenger demand forecasting. arXiv 2019, arXiv:1905.10069. [Google Scholar]
- Do, L.N.N.; Vu, H.L.; Vo, B.Q.; Liu, Z.; Phung, D. An Effective Spatial-Temporal Attention Based Neural Network for Traffic Flow Prediction. Transp. Res. Part C Emerg. Technol. 2019, 108, 12–28. [Google Scholar] [CrossRef]
- Lei, K.; Qin, M.; Bai, B.; Zhang, G.; Yang, M. GCN-GAN: A Non-Linear Temporal Link Prediction Model for Weighted Dynamic Networks. In Proceedings of the IEEE Conference on Computer Communications (IEEE INFOCOM 2019), Paris, France, 29 April–2 May 2019; pp. 388–396. [Google Scholar]
- Wang, S.; Zhang, M.; Miao, H.; Peng, Z.; Yu, P.S. Multivariate Correlation-aware Spatio-temporal Graph Convolutional Networks for Multi-scale Traffic Prediction. ACM Trans. Intell. Syst. Technol. 2022, 13, 38. [Google Scholar] [CrossRef]
- Wu, Z.; Pan, S.; Long, G.; Jiang, J.; Zhang, C. Graph wavenet for deep spatial-temporal graph modeling. arXiv 2019, arXiv:1906.00121. [Google Scholar]
- Zheng, C.; Fan, X.; Wang, C.; Qi, J. GMAN: A Graph Multi-Attention Network for Traffic Prediction. In Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI 2020), New York, NY, USA, 7–12 February 2020; Volume 34, pp. 1234–1241. [Google Scholar]
- Wei, M.; Yu, W.; Liu, H.; Xu, Q. Distributed Weakly Convex Optimization Under Random Time-Delay Interference. IEEE Trans. Netw. Sci. Eng. 2024, 11, 212–224. [Google Scholar] [CrossRef]
- Wei, M.; Chen, G.; Guo, Z. A Fixed-Time Optimal Consensus Algorithm over Undirected Networks. In Proceedings of the 2018 Chinese Control and Decision Conference (CCDC 2018), Shenyang, China, 9–11 June 2018; pp. 725–730. [Google Scholar]
- Chen, J.; Shao, Q.; Chen, D.; Yu, W. Decoupling Spatio-Temporal Prediction: When Lightweight Large Models Meet Adaptive Hypergraphs. In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2025), Toronto, ON, Canada, 24–28 August 2025; pp. 167–178. [Google Scholar]
- Jeon, B.K.; Kim, E.J. Solar irradiance prediction using reinforcement learning pre-trained with limited historical data. Energy Rep. 2023, 10, 2513–2524. [Google Scholar] [CrossRef]
- Wang, X.; Ma, Y.; Wang, Y.; Jin, W.; Wang, X.; Tang, J.; Jia, C.; Yu, J. Traffic Flow Prediction via Spatial-Temporal Graph Neural Network. In Proceedings of the Web Conference 2020 (WWW 2020), Taipei, Taiwan, 20–24 April 2020; pp. 1082–1092. [Google Scholar]
- Li, M.; Zhu, Z. Spatial-Temporal Fusion Graph Neural Networks for Traffic Flow Forecasting. In Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI 2021), Virtual, 2–9 February 2021; Volume 35, pp. 4189–4196. [Google Scholar]
- Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903. [Google Scholar]
- Zhang, J.; Shi, X.; Xie, J.; Ma, H.; King, I.; Yeung, D.Y. Gaan: Gated attention networks for learning on large and spatiotemporal graphs. arXiv 2018, arXiv:1803.07294. [Google Scholar] [CrossRef]
- Park, C.; Lee, C.; Bahng, H.; Kim, K.; Jin, S.; Ko, S.; Choo, J. ST-GRAT: A Spatio-Temporal Graph Attention Network for Traffic Forecasting. arXiv 2019, arXiv:1911.13181. [Google Scholar]
- Guo, S.; Lin, Y.; Wan, H.; Li, X.; Cong, G. Learning Dynamics and Heterogeneity of Spatial-Temporal Graph Data for Traffic Forecasting. IEEE Trans. Knowl. Data Eng. 2022, 34, 5415–5428. [Google Scholar] [CrossRef]
- Wang, H.; Chen, J.; Pan, T.; Dong, Z.; Zhang, L.; Jiang, R.; Song, X. Robust Traffic Forecasting against Spatial Shift over Years. arXiv 2024, arXiv:2410.00373. [Google Scholar] [CrossRef]
- Xu, M.; Dai, W.; Liu, C.; Gao, X.; Lin, W.; Qi, G.J.; Xiong, H. Spatial-temporal transformer networks for traffic flow forecasting. arXiv 2020, arXiv:2001.02908. [Google Scholar]
- Wei, M.; Yu, W.; Chen, D.; Kang, M.; Cheng, G. Privacy Distributed Constrained Optimization Over Time-Varying Unbalanced Networks and Its Application in Federated Learning. IEEE/CAA J. Autom. Sin. 2025, 12, 335–346. [Google Scholar] [CrossRef]
- Wei, M.; Yang, Z.; Ji, Q.; Zhao, Z. Privacy-preserving distributed projected one-point bandit online optimization over directed graphs. Asian J. Control 2023, 25, 4705–4720. [Google Scholar] [CrossRef]
- Chen, R.T.Q.; Rubanova, Y.; Bettencourt, J.; Duvenaud, D.K. Neural Ordinary Differential Equations. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS 2018), Montréal, QC, Canada, 3–8 December 2018; Volume 31. [Google Scholar]
- Huang, Z.; Sun, Y.; Wang, W. Coupled Graph ODE for Learning Interacting System Dynamics. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’21), Virtual, Singapore, 14–18 August 2021; pp. 705–715. [Google Scholar]
- Choi, J.; Choi, H.; Hwang, J.; Park, N. Graph Neural Controlled Differential Equations for Traffic Forecasting. In Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI 2022), Virtual, 22 February–1 March 2022; Volume 36, pp. 6367–6374. [Google Scholar]
- Wen, H.; Lin, Y.; Xia, Y.; Wan, H.; Wen, Q.; Zimmermann, R.; Liang, Y. DiffSTG: Probabilistic Spatio-Temporal Graph Forecasting with Denoising Diffusion Models. In Proceedings of the 31st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (SIGSPATIAL 2023), Hamburg, Germany, 13–16 November 2023; pp. 1–12. [Google Scholar]
- Cheng, J.; Li, R.; Wang, H.; Li, Y. Sparse Diffusion Autoencoder for Test-time Adapting Prediction of Complex Systems. arXiv 2025, arXiv:2505.17459. [Google Scholar] [CrossRef]
- Jung, C.; Jang, Y. DiffGSL: A Graph Structure Learning Diffusion Model for Dynamic Spatio-Temporal Forecasting. In Proceedings of the 2024 IEEE International Conference on Big Data (IEEE BigData 2024), Washington, DC, USA, 15–18 December 2024; pp. 5785–5793. [Google Scholar]
- Rühling Cachay, S.; Zhao, B.; Joren, H.; Yu, R. Dyffusion: A dynamics-informed diffusion model for spatiotemporal forecasting. Adv. Neural Inf. Process. Syst. 2023, 36, 45259–45287. [Google Scholar]
- Yang, Y.; Jin, M.; Wen, H.; Zhang, C.; Liang, Y.; Ma, L.; Wang, Y.; Liu, C.; Yang, B.; Xu, Z.; et al. A survey on diffusion models for time series and spatio-temporal data. arXiv 2024, arXiv:2404.18886. [Google Scholar] [CrossRef]
- Xia, Y.; Liang, Y.; Wen, H.; Liu, X.; Wang, K.; Zhou, Z.; Zimmermann, R. Deciphering spatio-temporal graph forecasting: A causal lens and treatment. Adv. Neural Inf. Process. Syst. 2023, 36, 37068–37088. [Google Scholar]
- Chen, D.; Yu, W.; Shao, Q.; Liu, X. Causality Induced Distributed Spatio-Temporal Feature Extraction. In Proceedings of the 2021 8th International Conference on Information, Cybernetics, and Computational Social Systems (ICCSS 2021), Beijing, China, 10–12 December 2021; pp. 68–73. [Google Scholar]
- Einizade, A.; Malliaros, F.D.; Giraldo, J.H. Spatiotemporal Forecasting Meets Efficiency: Causal Graph Process Neural Networks. arXiv 2024, arXiv:2405.18879. [Google Scholar] [CrossRef]
- Malla, S.; Choi, C.; Dariush, B. Social-STAGE: Spatio-Temporal Multi-Modal Future Trajectory Forecast. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA 2021), Xi’an, China, 30 May–5 June 2021; pp. 13938–13944. [Google Scholar]
- Jiang, R.; Wang, Z.; Tao, Y.; Yang, C.; Song, X.; Shibasaki, R.; Chen, S.C.; Shyu, M.L. Learning Social Meta-Knowledge for Nowcasting Human Mobility in Disaster. In Proceedings of the ACM Web Conference 2023 (WWW 2023), Austin, TX, USA, 30 April–4 May 2023; pp. 2655–2665. [Google Scholar]
- Deng, J.; Jiang, R.; Zhang, J.; Song, X. Multi-modality spatio-temporal forecasting via self-supervised learning. arXiv 2024, arXiv:2405.03255. [Google Scholar]
- Zhang, Y.; Liu, L.; Xiong, X.; Li, G.; Wang, G.; Lin, L. Long-term wind power forecasting with hierarchical spatial-temporal transformer. arXiv 2023, arXiv:2305.18724. [Google Scholar]
- Liang, Y.; Xia, Y.; Ke, S.; Wang, Y.; Wen, Q.; Zhang, J.; Zheng, Y.; Zimmermann, R. AirFormer: Predicting Nationwide Air Quality in China with Transformers. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI-23), Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 14329–14337. [Google Scholar]
- Sun, J.; Yeh, C.C.M.; Fan, Y.; Dai, X.; Fan, X.; Jiang, Z.; Saini, U.S.; Lai, V.; Wang, J.; Chen, H.; et al. Towards Efficient Large Scale Spatial-Temporal Time Series Forecasting via Improved Inverted Transformers. arXiv 2025, arXiv:2503.10858. [Google Scholar] [CrossRef]
- Bai, H.Y.; Liu, X. T-graphormer: Using transformers for spatiotemporal forecasting. arXiv 2025, arXiv:2501.13274. [Google Scholar] [CrossRef]
- Zhang, H.; Wu, D.; Zinflou, A.; Dellacherie, S.; Dione, M.M.; Boulet, B. Leveraging Multivariate Long-Term History Representation for Time Series Forecasting. arXiv 2025, arXiv:2505.14737. [Google Scholar] [CrossRef]
- Wu, H.; Zhou, H.; Long, M.; Wang, J. Interpretable weather forecasting for worldwide stations with a unified deep model. Nat. Mach. Intell. 2023, 5, 602–611. [Google Scholar] [CrossRef]



| Category | Representative Methods | Temporal Backbone | Spatial Structure | Training/Inference Complexity |
|---|---|---|---|---|
| Channel-independent (pure time series) | | | | |
| Statistical | HA, ARIMA, VAR, Kalman, Prophet | Linear AR/MA/state-space models | No explicit spatial structure (per series) | Low–Medium |
| DNN-based | BPNN/MLP-style DNNs | Feedforward fully connected layers | No explicit spatial structure (per series) | Medium |
| RNN-based | LSTM, GRU, stacked/bi-directional RNNs | Recurrent neural units (RNN/LSTM/GRU) | No explicit spatial structure (per series) | Medium–High (sequential computation) |
| Transformer-based | Informer, Autoformer, FEDformer, PatchTST, iTransformer | Self-attention with feed-forward layers | No explicit spatial structure (per series) | Medium–High (attention-dependent) |
| Channel-dependent (spatio-temporal) | | | | |
| Euclidean-structured | CNN, ConvLSTM, ST-ResNet, ST-UNet, DGMR | Convolutional and temporal convolutional layers | Regular grids (images, rasters) | Medium |
| GNN-based | T-GCN, DCRNN, STGCN, GraphWaveNet, GMAN | Recurrent, temporal convolutional, and attention layers | Explicit graph structures (static or adaptive) | Medium–High (graph operations) |
| Diffusion-based | DiffSTG, DYffusion, SparseDiff, DiffGSL | Denoising diffusion steps with GNN/Transformer backbones | Graphs or grids, often dynamic | High–Very High (multi-step sampling) |
| Causality-based | Causal-STGNN, CGPN, causal graph processes | GNN/RNN/Transformer backbones with causal priors | Learned causal graphs or sparse structures | Medium (extra structure learning) |
| Multimodal | Social-STAGE, Social Meta-Knowledge Transformer, MoSSL | RNN/Transformer-based multimodal fusion layers | Grids and/or graphs with multimodal inputs | Medium–High (fusion overhead) |
| Transformer-based ST | HSTTN, PDFormer, AirFormer, T-Graphormer | Spatio-temporal self-attention layers | Graphs or grids with learned relations | Medium–High |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Yang, C.; Zhang, W.; Zhou, Y. An Overview of Spatiotemporal Network Forecasting: Current Research Status and Methodological Evolution. Mathematics 2026, 14, 18. https://doi.org/10.3390/math14010018
