Electronics
  • Article
  • Open Access

4 January 2026

TSAformer: A Traffic Flow Prediction Model Based on Cross-Dimensional Dependency Capture

College of Electrical and Control Engineering, North China University of Technology, Beijing 100144, China
* Author to whom correspondence should be addressed.
Electronics 2026, 15(1), 231; https://doi.org/10.3390/electronics15010231
This article belongs to the Special Issue Artificial Intelligence for Traffic Understanding and Control

Abstract

Accurate multivariate traffic flow forecasting is critical for intelligent transportation systems, yet it remains challenging due to the complex interplay of temporal dynamics and spatial interactions. While Transformer-based models have shown promise in capturing long-range temporal dependencies, most existing approaches compress multidimensional observations into flattened sequences, thereby neglecting explicit modeling of cross-dimensional (i.e., spatial or inter-variable) relationships that are essential for capturing traffic propagation, network-wide congestion, and node-specific behaviors. To address this limitation, we propose TSAformer, a novel Transformer architecture that explicitly preserves and jointly models time and dimension as dual structural axes. TSAformer begins with a multimodal input embedding layer that encodes raw traffic values alongside temporal context (time-of-day and day-of-week) and node-specific positional features, ensuring a rich semantic representation. The core of TSAformer is the Two-Stage Attention (TSA) module, which first models intra-dimensional temporal evolution via time-axis self-attention and then captures inter-dimensional spatial interactions through a lightweight routing mechanism, avoiding quadratic complexity while enabling all-to-all cross-node communication. Built upon TSA, a hierarchical encoder–decoder (HED) structure further improves forecasting by modeling traffic patterns across multiple temporal scales, from fine-grained fluctuations to macroscopic trends, and fusing the resulting predictions via cross-scale attention. Extensive experiments on three real-world traffic datasets, covering both urban road networks and highway systems, demonstrate that TSAformer consistently outperforms state-of-the-art baselines across short-term and long-term forecasting horizons. Notably, it achieves top-ranked performance in 36 out of 58 critical evaluation scenarios, including peak-hour and event-driven congestion prediction. By explicitly modeling both temporal and dimensional dependencies without structural compromise, TSAformer provides a scalable, interpretable, and high-performance solution for spatiotemporal traffic forecasting.
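The abstract's description of the Two-Stage Attention module suggests a simple factorization: attend along the time axis within each dimension first, then exchange information across dimensions through a small set of router tokens rather than full all-pairs attention across nodes. The PyTorch sketch below only illustrates that idea under our own assumptions; the class name, tensor layout (batch, dimensions, time, channels), number of router tokens, and use of nn.MultiheadAttention are illustrative choices and are not taken from the paper's implementation.

```python
import torch
import torch.nn as nn


class TwoStageAttentionSketch(nn.Module):
    """Illustrative two-stage attention block (assumed structure, not the paper's code).

    Stage 1: self-attention along the time axis within each dimension (node).
    Stage 2: cross-dimensional mixing through a few learned router tokens,
    which keeps the cost linear in the number of dimensions instead of
    quadratic all-pairs attention across nodes.
    """

    def __init__(self, d_model: int = 64, n_heads: int = 4, n_routers: int = 8):
        super().__init__()
        self.time_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.dim_send = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.dim_recv = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.router = nn.Parameter(torch.randn(n_routers, d_model))
        self.norm_time = nn.LayerNorm(d_model)
        self.norm_dim = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_dims, n_time, d_model), one embedding per node and time step.
        b, d, t, c = x.shape

        # Stage 1: temporal self-attention, applied independently per dimension.
        xt = x.reshape(b * d, t, c)
        xt = self.norm_time(xt + self.time_attn(xt, xt, xt)[0])

        # Stage 2: cross-dimensional attention via router tokens, per time step.
        xd = xt.reshape(b, d, t, c).permute(0, 2, 1, 3).reshape(b * t, d, c)
        routers = self.router.unsqueeze(0).expand(b * t, -1, -1)
        gathered = self.dim_send(routers, xd, xd)[0]   # routers aggregate all nodes
        xd = self.norm_dim(xd + self.dim_recv(xd, gathered, gathered)[0])

        return xd.reshape(b, t, d, c).permute(0, 2, 1, 3)


if __name__ == "__main__":
    # Toy check: 2 samples, 10 nodes (dimensions), 12 time steps, 64 channels.
    block = TwoStageAttentionSketch(d_model=64, n_heads=4, n_routers=8)
    out = block(torch.randn(2, 10, 12, 64))
    print(out.shape)  # torch.Size([2, 10, 12, 64])
```

In this sketch the router stage costs O(n_dims x n_routers) per time step instead of O(n_dims^2), which is the kind of lightweight routing the abstract alludes to; the paper's actual routing mechanism, normalization, and feed-forward details may differ.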
