ST-MAFNet: Spatio-Temporal Multi-Scale Adaptive Fusion Network for Traffic Forecasting
Abstract
1. Introduction
- (1) Neglecting complementary multi-source spatio-temporal dependencies. Existing models predominantly rely on predefined spatial structures and periodic temporal patterns, while overlooking the complementary dynamics from multiple spatio-temporal sources. Traffic flow exhibits directional propagation characteristics and implicit dependencies between non-adjacent nodes with similar behavioral patterns, which cannot be adequately captured by static topological structures alone. As shown in Figure 1, the flow consistency across spatially distant nodes underscores the necessity of integrating multi-source spatio-temporal information.
- (2) Overlooking anchor-refinement interactions between temporal scales. Traffic flow demonstrates multi-scale temporal dynamics, with short-term variations unfolding within longer-term patterns. Existing approaches rely on stacked dilated convolutions to learn multi-scale dependencies, but fail to capture the hierarchical structure underlying these temporal relationships. As illustrated in Figure 2, coarse-scale temporal patterns (e.g., peak-hour trends) anchor short-term predictions, while short-term fluctuations refine local variations within this framework.
The main contributions of this work are as follows:
- A Cross-Scale Hierarchical Anchoring (CSHA) strategy is proposed for progressively transferring multi-scale temporal patterns to short-term prediction layers, effectively mitigating noise and drift in short-term predictions.
- A Dual Spatial Perception Module (DSPM) is designed to capture spatial dependencies from both node representation and graph structure perspectives, encoding time-varying spatial dependencies across various temporal contexts.
- A Spatio-Temporal Adaptive Fusion Module (STAFM) is introduced to dynamically integrate multi-source characteristics, capturing complex dependencies across spatial and temporal dimensions.
- Extensive experiments on four real-world datasets (PEMS03, PEMS04, PEMS07, and PEMS08) demonstrate that ST-MAFNet is especially effective for short-term forecasting and achieves the best or second-best results on most metrics, including the lowest error on 10 of the 12 dataset-metric pairs in the overall comparison.
2. Related Work
2.1. Time Series Forecasting
2.2. Multi-Scale Temporal Modeling
2.3. Spatio-Temporal Graph Neural Networks
3. Problem Statement
4. Methodology
4.1. Framework Overview
4.2. MST-Encoder
4.3. DSPM
4.4. STAFM
| Algorithm 1 Cross-Scale Hierarchical Anchoring (CSHA) |
| Require: Multi-scale temporal features, temporal embeddings, forward adjacency matrix, backward adjacency matrix |
| Ensure: Final traffic-flow prediction |
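As a rough illustration of the coarse-to-fine idea behind CSHA, the sketch below smooths a series at several scales and lets each coarser estimate anchor the next finer one. It is not the authors' algorithm: the learned layers, temporal embeddings, and adjacency matrices listed above are omitted, and the helper names (`avg_pool`, `upsample`, `anchor_refine`), the window sizes, and the blending weight `alpha` are assumptions made purely for illustration.

```python
from statistics import fmean


def avg_pool(series, window):
    """Coarsen a series by averaging non-overlapping windows."""
    return [fmean(series[i:i + window]) for i in range(0, len(series), window)]


def upsample(series, factor, length):
    """Repeat each coarse value `factor` times and trim to `length`."""
    return [v for v in series for _ in range(factor)][:length]


def anchor_refine(series, scales=(4, 2), alpha=0.5):
    """Coarse-to-fine pass (hypothetical): the coarsest scale is the first
    anchor, and every finer scale is blended with the running anchor."""
    n = len(series)
    # Start from the coarsest scale.
    anchor = upsample(avg_pool(series, scales[0]), scales[0], n)
    # Blend each finer pooled scale into the anchor.
    for window in scales[1:]:
        finer = upsample(avg_pool(series, window), window, n)
        anchor = [alpha * a + (1 - alpha) * f for a, f in zip(anchor, finer)]
    # Final step: the raw series refines the anchor.
    return [alpha * a + (1 - alpha) * x for a, x in zip(anchor, series)]


flow = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 30.0, 10.0]  # one noisy spike
refined = anchor_refine(flow)
```

In this toy setting the isolated spike is pulled toward the coarser trend while the constant segment is left unchanged, which mirrors the intended role of an anchor in limiting short-term noise and drift.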
5. Experiment Implementation
- RQ1: How does the proposed ST-MAFNet perform compared to baselines?
- RQ2: How do different components affect the performance of ST-MAFNet?
- RQ3: How does the efficiency of ST-MAFNet compare to the baselines?
- RQ4: How do hyperparameters affect the performance of ST-MAFNet?
- RQ5: How robust is ST-MAFNet?
5.1. Experimental Settings
5.1.1. Datasets
5.1.2. Implementation Details
5.1.3. Performance Evaluation
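The result tables report MAE, RMSE, and MAPE. A minimal sketch of the standard definitions follows. Skipping near-zero targets in MAPE through the hypothetical `eps` parameter is an assumption, since the masking rule used in the experiments is not stated here.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred, eps=0.0):
    """Mean absolute percentage error, in percent.

    Targets with |t| <= eps are skipped to avoid division by zero
    (an assumption, not a documented choice of the paper).
    """
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if abs(t) > eps]
    return 100.0 * sum(abs((t - p) / t) for t, p in pairs) / len(pairs)


y_true = [100.0, 200.0, 0.0, 50.0]
y_pred = [110.0, 190.0, 5.0, 40.0]
scores = (mae(y_true, y_pred), rmse(y_true, y_pred), mape(y_true, y_pred))
```

Note that `mape` raises `ZeroDivisionError` if every target is masked, so callers would need to guard against all-zero windows.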
5.2. Experimental Results
5.2.1. Performance Comparison (RQ1)
5.2.2. Ablation Study (RQ2)
- w/o : Remove adaptive node embeddings to verify the positive effect of node heterogeneity on spatial dependencies.
- w/o : Remove the DSPM module to examine whether the synergy between node heterogeneity and dynamic spatial dependencies improves prediction performance.
- w/o CSHA: Remove the CSHA strategy to assess whether cross-scale hierarchical anchoring achieves the expected impact.
5.2.3. Efficiency Analysis (RQ3)
5.2.4. Parameter Study (RQ4)
5.2.5. Robustness Testing (RQ5)
5.2.6. Visualization Analysis
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest

| Existing Method | Main Limitation | Corresponding Module | Expected Benefit |
|---|---|---|---|
| Multi-scale temporal models | Extract features at multiple scales but often lack explicit cross-scale anchor-refinement interactions. | CSHA | Uses coarse-scale temporal patterns to stabilize and refine fine-scale predictions. |
| FPN-style hierarchical fusion | Transfers hierarchical features but is not designed for traffic-specific temporal anchoring or multi-source spatial dependencies. | CSHA and STAFM | Enables traffic-oriented coarse-to-fine prediction refinement across temporal scales. |
| STAEformer | Learns adaptive spatio-temporal embeddings but does not explicitly model cross-scale anchoring. | CSHA and DSPM | Combines node heterogeneity with anchor-guided multi-scale prediction. |
| PDFormer | Models propagation delays and long-range dependencies but does not focus on anchor-refinement interactions among temporal scales. | CSHA | Strengthens short-term prediction through cross-scale temporal constraints. |
| STWave | Uses wavelet-based decomposition and graph attention but does not explicitly integrate multi-source spatial views with anchor refinement. | DSPM and STAFM | Integrates adaptive, forward, backward, and temporal features within a unified fusion module. |
| HimNet | Captures heterogeneity through meta-parameters but does not explicitly exploit coarse-to-fine temporal anchoring. | CSHA and DSPM | Jointly models node heterogeneity, dynamic correlations, and hierarchical temporal refinement. |
| Dataset | Nodes | Frames | Degree | Time Range | Data Points |
|---|---|---|---|---|---|
| PEMS03 | 358 | 26,208 | 1.5 | 1 September 2018–30 November 2018 | 9.38 M |
| PEMS04 | 307 | 16,992 | 1.1 | 1 January 2018–28 February 2018 | 5.22 M |
| PEMS07 | 883 | 28,224 | 1.0 | 1 May 2017–6 August 2017 | 24.92 M |
| PEMS08 | 170 | 17,856 | 1.6 | 1 July 2016–31 August 2016 | 3.04 M |
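The columns of the dataset table can be cross-checked with a few lines of arithmetic. The sketch below assumes that "Data Points" means nodes multiplied by frames and that the time ranges are inclusive. The variable and function names are illustrative only.

```python
from datetime import date

# (nodes, frames, first day, last day) copied from the dataset table.
datasets = {
    "PEMS03": (358, 26208, date(2018, 9, 1), date(2018, 11, 30)),
    "PEMS04": (307, 16992, date(2018, 1, 1), date(2018, 2, 28)),
    "PEMS07": (883, 28224, date(2017, 5, 1), date(2017, 8, 6)),
    "PEMS08": (170, 17856, date(2016, 7, 1), date(2016, 8, 31)),
}


def summarize(nodes, frames, start, end):
    """Return data points in millions and frames per calendar day."""
    days = (end - start).days + 1  # inclusive date range (assumption)
    return {
        "data_points_m": round(nodes * frames / 1e6, 2),
        "frames_per_day": frames / days,
    }


summary = {name: summarize(*row) for name, row in datasets.items()}
```

Under these assumptions every dataset works out to 288 frames per day, which corresponds to one frame every 5 minutes, and the computed totals match the "Data Points" column.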
| Item | Setting |
|---|---|
| Input/output length | 12 historical steps/12 future steps |
| Data split | Chronological 6:2:2 split |
| Normalization | Training-set statistics applied to train/validation/test sets |
| Input/target features | Inputs: flow, time-of-day, day-of-week; target: flow only |
| Random seed | 1 |
| Repeated runs | One deterministic run for each setting |
| Model selection | Best validation MAE checkpoint |
| Optimizer | Adam |
| Batch size | 32 |
| Learning-rate schedule | MultiStepLR, milestones [1, 18, 36, 54, 72], gamma 0.5 |
| Maximum epochs | 300 |
| Gradient clipping | Max norm 3.0 |
| Hidden dimension | 64 |
| Adaptive embedding dimension | 64 |
| Adaptive attention heads | 4 |
| Adaptive attention layers | 2 |
| Dropout | 0.2 |
| Number of temporal scales | 4 |
| Search range of S | selected by validation MAE |
| Decoder layers/fusion layers | 2/2 |
| TCN configuration | Cascaded temporal convolution with kernel size |
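Several of the settings above can be made concrete with a small sketch: the chronological 6:2:2 split, the 12-in/12-out sliding windows, and the step-wise learning-rate decay. The code assumes that windows are cut inside each split rather than before splitting, and that the schedule follows the usual multi-step rule (multiply by gamma at every milestone reached). The initial learning rate is not given in the table, so `base_lr` is a hypothetical placeholder, as are all function names.

```python
from bisect import bisect_right


def chronological_split(num_frames, ratios=(0.6, 0.2, 0.2)):
    """Split frame indices into train/validation/test ranges without shuffling.
    The test range takes whatever remains after the first two splits."""
    train_end = int(num_frames * ratios[0])
    val_end = train_end + int(num_frames * ratios[1])
    return range(0, train_end), range(train_end, val_end), range(val_end, num_frames)


def window_starts(segment, in_len=12, out_len=12):
    """Start indices whose input and target windows fit inside one segment."""
    last = segment.stop - (in_len + out_len)
    return range(segment.start, last + 1)


def lr_at_epoch(epoch, base_lr, milestones=(1, 18, 36, 54, 72), gamma=0.5):
    """Learning rate after decaying by `gamma` at every milestone reached."""
    return base_lr * gamma ** bisect_right(milestones, epoch)


train, val, test = chronological_split(17856)  # PEMS08 frame count
num_train_samples = len(window_starts(train))
lrs = [lr_at_epoch(e, base_lr=1e-3) for e in (0, 1, 18, 72, 299)]  # base_lr is assumed
```

With five milestones and gamma 0.5, the rate settles at 1/32 of its initial value from epoch 72 onward, so most of the 300-epoch budget would run at the smallest step size unless the best-validation checkpoint is reached earlier.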
| Method | PEMS03 MAE | PEMS03 RMSE | PEMS03 MAPE | PEMS04 MAE | PEMS04 RMSE | PEMS04 MAPE | PEMS07 MAE | PEMS07 RMSE | PEMS07 MAPE | PEMS08 MAE | PEMS08 RMSE | PEMS08 MAPE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HA | 32.62 | 49.89 | 30.60% | 42.35 | 61.66 | 29.92% | 49.03 | 71.18 | 22.75% | 36.66 | 50.45 | 21.63% |
| VAR | 17.48 | 29.40 | 18.27% | 20.87 | 32.26 | 15.70% | 44.85 | 62.53 | 23.30% | 18.66 | 27.35 | 12.81% |
| LSTM | 17.47 | 28.71 | 16.79% | 23.62 | 37.01 | 16.08% | 25.79 | 40.19 | 11.14% | 18.23 | 28.75 | 11.99% |
| DCRNN | 15.54 | 27.18 | 15.62% | 19.63 | 31.26 | 13.59% | 21.16 | 34.14 | 9.02% | 15.22 | 24.17 | 10.21% |
| GWNet | 14.59 | 25.24 | 15.52% | 18.53 | 29.92 | 12.89% | 20.47 | 33.47 | 8.61% | 14.40 | 23.39 | 9.21% |
| AGCRN | 15.24 | 26.65 | 15.89% | 19.38 | 31.25 | 13.40% | 20.57 | 34.40 | 8.74% | 15.32 | 24.41 | 10.03% |
| MTGNN | 14.85 | 25.23 | 14.55% | 19.17 | 31.70 | 13.37% | 20.89 | 34.06 | 9.00% | 15.18 | 24.24 | 10.20% |
| GTS | 15.41 | 26.15 | 15.39% | 20.96 | 32.95 | 14.66% | 22.15 | 35.10 | 9.38% | 16.49 | 26.08 | 10.54% |
| STNorm | 15.32 | 25.93 | 14.37% | 18.96 | 30.98 | 12.69% | 20.50 | 34.66 | 8.75% | 15.41 | 24.77 | 9.76% |
| STGODE | 16.50 | 27.84 | 16.69% | 20.84 | 32.82 | 13.77% | 22.29 | 37.54 | 10.14% | 16.81 | 25.97 | 10.62% |
| DLinear | 21.36 | 34.48 | 22.03% | 27.93 | 43.84 | 19.14% | 31.71 | 49.37 | 14.62% | 22.42 | 35.41 | 14.68% |
| STID | 15.33 | 27.40 | 16.40% | 18.38 | 29.95 | 12.04% | 19.61 | 32.79 | 8.30% | 14.21 | 23.28 | 9.27% |
| PDFormer | 14.94 | 25.39 | 15.82% | 18.36 | 30.03 | 12.00% | 19.97 | 32.95 | 8.55% | 13.58 | 23.41 | 9.05% |
| STAEformer | 15.35 | 27.55 | 15.18% | 18.22 | 30.18 | 11.98% | 19.14 | 32.60 | 8.01% | 13.46 | 23.25 | 8.88% |
| STWave | 15.18 | 26.87 | 15.81% | 18.53 | 30.29 | 12.50% | 19.65 | 33.13 | 8.56% | 13.96 | 23.93 | 9.04% |
| HimNet | 15.11 | 26.56 | 15.49% | 18.14 | 29.88 | 12.00% | 19.21 | 32.75 | 8.03% | 13.57 | 23.22 | 8.98% |
| ST-MAFNet (ours) | 14.16 | 25.08 | 14.04% | 17.88 | 29.76 | 12.14% | 18.90 | 32.50 | 7.97% | 13.41 | 23.26 | 8.63% |
| Rel. change vs. best baseline | +2.95% | +0.59% | +2.30% | +1.43% | +0.40% | −1.34% | +1.25% | +0.31% | +0.50% | +0.37% | −0.17% | +2.82% |
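The relative-change row can be reproduced from the table itself if it is read as the percentage reduction in error relative to the best baseline in each column, so that a positive value means ST-MAFNet has the lower error. The sketch below checks three columns under that reading. The function name and the small baseline dictionaries are illustrative, with values copied from the rows above.

```python
def relative_change(best_baseline, ours):
    """Percent error reduction relative to the best baseline (positive = ours is lower)."""
    return 100.0 * (best_baseline - ours) / best_baseline


# A few baseline entries per column, including the strongest one in each case.
pems03_mae = {"GWNet": 14.59, "MTGNN": 14.85, "PDFormer": 14.94}
pems04_mape = {"STAEformer": 11.98, "PDFormer": 12.00, "HimNet": 12.00}
pems08_rmse = {"HimNet": 23.22, "STAEformer": 23.25, "STID": 23.28}

changes = [
    round(relative_change(min(pems03_mae.values()), 14.16), 2),
    round(relative_change(min(pems04_mape.values()), 12.14), 2),
    round(relative_change(min(pems08_rmse.values()), 23.26), 2),
]
```

The two negative entries correspond to the columns where a baseline remains ahead: PEMS04 MAPE (STAEformer) and PEMS08 RMSE (HimNet).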
| Dataset | Method | Horizon 15 MAE | Horizon 15 RMSE | Horizon 15 MAPE | Horizon 30 MAE | Horizon 30 RMSE | Horizon 30 MAPE | Horizon 45 MAE | Horizon 45 RMSE | Horizon 45 MAPE | Horizon 60 MAE | Horizon 60 RMSE | Horizon 60 MAPE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PEMS03 | ST-MAFNet (ours) | 13.12 | 23.20 | 13.47% | 14.19 | 25.33 | 14.13% | 15.02 | 26.74 | 14.68% | 15.73 | 27.79 | 15.25% |
| PEMS03 | GWNet | 13.47 | 23.02 | 14.08% | 14.58 | 25.13 | 14.51% | 15.53 | 26.60 | 15.29% | 16.32 | 27.78 | 15.79% |
| PEMS03 | MegaCRN | 13.57 | 24.27 | 14.92% | 14.82 | 26.53 | 15.33% | 15.77 | 27.93 | 16.14% | 16.52 | 28.99 | 16.53% |
| PEMS03 | MTGNN | 13.74 | 23.45 | 14.08% | 14.97 | 26.54 | 14.93% | 15.81 | 27.95 | 14.82% | 16.71 | 29.13 | 15.63% |
| PEMS03 | STNorm | 14.31 | 24.87 | 13.83% | 15.39 | 26.94 | 14.28% | 16.18 | 28.24 | 14.76% | 16.70 | 28.96 | 15.34% |
| PEMS03 | STWave | 14.27 | 24.72 | 13.98% | 15.51 | 27.02 | 14.68% | 16.21 | 28.21 | 16.03% | 16.96 | 29.43 | 15.82% |
| PEMS03 | STID | 13.88 | 24.63 | 14.78% | 15.30 | 27.91 | 16.16% | 16.41 | 31.12 | 17.29% | 17.48 | 33.27 | 18.78% |
| PEMS03 | HimNet | 13.73 | 24.00 | 14.58% | 15.29 | 27.39 | 15.84% | 16.50 | 29.16 | 16.96% | 17.56 | 30.75 | 17.99% |
| PEMS03 | GTS | 13.98 | 23.90 | 14.11% | 15.34 | 26.09 | 15.34% | 16.43 | 27.67 | 16.30% | 17.39 | 29.02 | 17.37% |
| PEMS03 | GMAN | 14.77 | 24.59 | 14.98% | 15.48 | 25.81 | 15.51% | 16.21 | 26.99 | 16.20% | 16.99 | 28.20 | 17.07% |
| PEMS03 | STAEformer | 14.06 | 24.69 | 14.30% | 15.46 | 27.25 | 15.51% | 16.56 | 29.06 | 16.45% | 17.59 | 30.62 | 17.53% |
| PEMS03 | DCRNN | 14.25 | 24.59 | 14.45% | 15.54 | 27.18 | 15.42% | 16.64 | 28.85 | 16.53% | 17.56 | 30.15 | 17.27% |
| PEMS03 | STGCN | 14.99 | 25.85 | 14.48% | 16.07 | 27.96 | 15.14% | 17.03 | 29.56 | 15.92% | 18.13 | 31.14 | 16.98% |
| PEMS03 | AGCRN | 14.95 | 25.64 | 14.39% | 16.27 | 27.84 | 15.49% | 17.27 | 29.50 | 16.52% | 18.38 | 31.11 | 18.02% |
| PEMS03 | StemGNN | 14.61 | 24.68 | 15.49% | 16.38 | 27.71 | 16.54% | 17.76 | 29.83 | 17.72% | 19.04 | 31.68 | 19.07% |
| PEMS03 | LSTM | 16.03 | 26.58 | 15.26% | 19.22 | 31.52 | 18.20% | 22.49 | 36.20 | 21.44% | 26.03 | 41.27 | 25.25% |
| PEMS03 | WaveNet | 16.14 | 26.91 | 15.30% | 19.37 | 31.58 | 18.22% | 22.73 | 36.64 | 21.56% | 26.36 | 41.72 | 25.39% |
| PEMS04 | ST-MAFNet (ours) | 17.13 | 28.43 | 11.61% | 17.94 | 29.90 | 12.15% | 18.72 | 30.96 | 12.26% | 18.99 | 31.54 | 13.00% |
| PEMS04 | STAEformer | 18.14 | 29.99 | 11.92% | 18.15 | 30.05 | 11.93% | 18.72 | 31.20 | 12.26% | 18.15 | 30.06 | 11.94% |
| PEMS04 | HimNet | 18.16 | 29.87 | 12.00% | 18.16 | 29.89 | 12.04% | 18.75 | 31.01 | 12.42% | 18.13 | 29.87 | 12.03% |
| PEMS04 | STID | 18.44 | 30.03 | 12.48% | 18.40 | 29.97 | 12.77% | 18.98 | 30.96 | 12.92% | 18.38 | 29.97 | 12.84% |
| PEMS04 | MegaCRN | 18.71 | 30.38 | 12.73% | 18.67 | 30.41 | 12.78% | 19.50 | 31.77 | 13.25% | 18.66 | 30.44 | 12.73% |
| PEMS04 | STNorm | 18.90 | 31.13 | 12.38% | 18.89 | 31.12 | 12.40% | 19.48 | 32.32 | 12.72% | 18.90 | 31.15 | 12.37% |
| PEMS04 | MTGNN | 18.99 | 31.82 | 12.65% | 18.94 | 31.79 | 12.62% | 19.69 | 33.43 | 13.21% | 18.93 | 31.78 | 12.75% |
| PEMS04 | GWNet | 19.29 | 30.82 | 12.70% | 19.06 | 30.63 | 12.54% | 19.61 | 31.55 | 12.95% | 18.87 | 30.41 | 12.57% |
| PEMS04 | GMAN | 19.57 | 31.03 | 12.95% | 18.98 | 30.73 | 12.90% | 19.26 | 31.25 | 13.06% | 18.97 | 30.56 | 12.98% |
| PEMS04 | STWave | 19.54 | 30.99 | 12.92% | 19.43 | 30.99 | 12.79% | 19.93 | 31.93 | 13.17% | 19.49 | 31.12 | 12.79% |
| PEMS04 | AGCRN | 19.47 | 31.24 | 12.88% | 19.57 | 31.40 | 13.01% | 20.15 | 32.33 | 13.21% | 19.61 | 31.47 | 12.94% |
| PEMS04 | DCRNN | 20.03 | 31.78 | 12.93% | 20.00 | 31.67 | 13.17% | 20.64 | 32.78 | 13.52% | 19.82 | 31.61 | 13.01% |
| PEMS04 | STGCN | 20.17 | 32.02 | 13.39% | 20.15 | 32.02 | 13.37% | 20.96 | 33.30 | 13.79% | 20.14 | 32.00 | 13.34% |
| PEMS04 | GTS | 20.96 | 32.92 | 14.79% | 20.92 | 32.88 | 14.87% | 22.27 | 34.69 | 15.83% | 20.89 | 32.84 | 14.70% |
| PEMS04 | StemGNN | 21.05 | 33.24 | 14.01% | 20.94 | 33.15 | 14.11% | 22.39 | 35.13 | 15.19% | 21.06 | 33.26 | 14.12% |
| PEMS04 | LSTM | 25.75 | 39.86 | 17.54% | 25.79 | 39.93 | 17.16% | 28.95 | 43.83 | 19.72% | 25.74 | 39.85 | 17.43% |
| PEMS04 | WaveNet | 25.77 | 39.91 | 17.27% | 25.77 | 39.87 | 17.45% | 29.01 | 43.86 | 19.88% | 25.75 | 39.90 | 17.33% |
| PEMS08 | ST-MAFNet (ours) | 12.51 | 21.32 | 8.06% | 13.44 | 23.40 | 8.64% | 14.25 | 25.08 | 9.09% | 14.67 | 25.56 | 9.46% |
| PEMS08 | STAEformer | 13.52 | 23.35 | 8.87% | 13.52 | 23.36 | 8.87% | 14.18 | 24.65 | 9.30% | 13.52 | 23.37 | 8.89% |
| PEMS08 | HimNet | 13.54 | 23.15 | 8.99% | 13.53 | 23.16 | 8.96% | 14.23 | 24.49 | 9.37% | 13.53 | 23.17 | 8.98% |
| PEMS08 | STID | 14.27 | 23.69 | 9.27% | 14.30 | 23.68 | 9.29% | 14.93 | 24.83 | 9.75% | 14.26 | 23.64 | 9.28% |
| PEMS08 | GWNet | 14.56 | 23.45 | 9.40% | 14.60 | 23.47 | 9.39% | 15.26 | 24.72 | 9.75% | 14.56 | 23.48 | 9.44% |
| PEMS08 | GMAN | 14.53 | 24.19 | 9.64% | 14.61 | 24.05 | 9.58% | 14.79 | 24.75 | 9.93% | 14.70 | 24.49 | 9.69% |
| PEMS08 | MegaCRN | 14.93 | 24.00 | 10.20% | 15.00 | 24.05 | 9.71% | 15.87 | 25.65 | 10.15% | 15.03 | 24.20 | 9.63% |
| PEMS08 | STNorm | 15.35 | 25.07 | 9.79% | 15.41 | 25.14 | 9.77% | 16.14 | 26.53 | 10.52% | 15.38 | 25.15 | 9.70% |
| PEMS08 | MTGNN | 15.40 | 24.43 | 9.67% | 15.44 | 24.50 | 9.71% | 16.20 | 25.79 | 10.28% | 15.41 | 24.49 | 9.76% |
| PEMS08 | DCRNN | 15.35 | 24.38 | 9.87% | 15.39 | 24.46 | 10.00% | 16.11 | 25.70 | 10.46% | 15.60 | 24.63 | 9.98% |
| PEMS08 | AGCRN | 15.78 | 24.94 | 10.33% | 15.68 | 24.83 | 10.29% | 16.41 | 26.07 | 10.72% | 15.82 | 24.96 | 10.43% |
| PEMS08 | STWave | 16.02 | 25.20 | 11.12% | 16.44 | 25.88 | 10.84% | 16.78 | 26.85 | 10.76% | 16.23 | 25.72 | 10.65% |
| PEMS08 | STGCN | 16.29 | 25.48 | 10.67% | 16.32 | 25.50 | 10.71% | 17.18 | 26.91 | 11.12% | 16.32 | 25.50 | 10.70% |
| PEMS08 | GTS | 16.39 | 25.81 | 10.45% | 16.37 | 25.81 | 10.42% | 17.55 | 27.58 | 11.29% | 16.36 | 25.81 | 10.60% |
| PEMS08 | StemGNN | 16.55 | 26.09 | 11.51% | 16.50 | 26.03 | 11.50% | 17.57 | 27.77 | 11.85% | 16.50 | 26.03 | 11.49% |
| PEMS08 | LSTM | 19.86 | 31.42 | 12.53% | 19.87 | 31.41 | 12.59% | 22.34 | 34.78 | 14.25% | 19.86 | 31.42 | 12.45% |
| PEMS08 | WaveNet | 20.29 | 31.79 | 12.70% | 20.27 | 31.82 | 12.65% | 22.87 | 35.26 | 14.41% | 20.25 | 31.77 | 12.66% |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Guo, F.; Wang, X.; Zou, F.; Zou, L.; Fang, T.; Wu, X.; Jiang, H.; Weng, J. ST-MAFNet: Spatio-Temporal Multi-Scale Adaptive Fusion Network for Traffic Forecasting. AI 2026, 7, 217. https://doi.org/10.3390/ai7060217