Author Contributions
Conceptualization, J.H., H.Y. and Y.L.; methodology, J.H. and H.Y.; software, J.H.; validation, Q.C., Y.L. and J.H.; formal analysis, H.Y. and Y.L.; resources, H.Y. and Q.C.; data curation, J.H.; writing—original draft preparation, J.H.; writing—review and editing, Y.L. and H.Y.; visualization, J.H.; supervision, Y.L.; project administration, Q.C.; funding acquisition, H.Y. and Y.L. All authors have read and agreed to the published version of the manuscript.
Figure 1.
The framework of the proposed MGTEFormer. The model integrates a multi-granularity temporal embedding module to process time-series data, a Transformer-based spatio-temporal feature extraction module to capture complex patterns, and a prediction module with fully connected layers for output.
Figure 2.
Traffic flow of PEMS08 at 23 different sensors. The histogram displays the distribution of traffic flow values, with the x-axis representing the traffic flow magnitude, and the y-axis depicting the density of these values. The Kernel Density Estimation (KDE) curve, illustrated in red, provides a smooth approximation of the underlying probability density function, highlighting two prominent peaks. The first peak around 100 units suggests a common traffic flow during off-peak hours, while the second peak near 350 units corresponds to peak traffic periods.
Figure 3.
Traffic flow of PEMS08 at 10 different sensors. The boxplot shows the distribution of traffic flow values for 10 distinct sensors, with the x-axis denoting the sensor and the y-axis the traffic flow value. Each box shows the quartiles and median of the data, while the whiskers extend to the most extreme data points within 1.5 times the interquartile range of the box edges. Red dots mark outliers beyond the whiskers, i.e., unusual traffic flow values that deviate significantly from the typical distribution.
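The 1.5 × IQR rule mentioned in the caption can be illustrated with a minimal sketch. The function name `boxplot_outliers` and the sample values are hypothetical and are not taken from PEMS08; the quartiles use NumPy's default linear interpolation, which may differ slightly from the plotting library used for the figure.

```python
import numpy as np

def boxplot_outliers(values, whisker=1.5):
    """Return the points outside the 1.5 x IQR fences and the fences themselves."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])   # first and third quartiles
    iqr = q3 - q1                              # interquartile range
    lower = q1 - whisker * iqr                 # lower fence
    upper = q3 + whisker * iqr                 # upper fence
    # Whiskers end at the most extreme points inside the fences;
    # anything outside the fences is drawn as an outlier.
    mask = (values < lower) | (values > upper)
    return values[mask], (lower, upper)

# Illustrative flow values for one sensor (made up for this example).
flow = np.array([100, 110, 120, 130, 140, 150, 160, 170, 180, 600])
outliers, fences = boxplot_outliers(flow)
print(outliers, fences)
```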
Figure 4.
Traffic flow data violin plot of PEMS08 at 10 different sensors. The x-axis lists the sensors, and the y-axis shows traffic flow values. Each violin shape reveals the distribution and density of traffic flow, highlighting medians, quartiles, and data spread.
Figure 5.
Traffic flow data distribution of PEMS08 sensor ID# 23 at different time granularities. The x-axis represents time steps, and the y-axis shows traffic flow. Orange tracks the last hour, yellow the previous day, light blue the last week, and dark blue the ground truth. The close alignment of the historical lines with the dark blue ground-truth line suggests a strong dependency across temporal granularities, indicating that the historical data capture the periodicity and trend of the traffic flow to be predicted.
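As an illustration of how the hour, day, and week sequences in this figure could be sliced from a single sensor's series, the following minimal sketch assumes the 5 min sampling interval of PEMS08 and a 12-step window. The function name `multi_granularity_windows`, the window length, and the synthetic series are assumptions made for this example; the exact sampling scheme used by MGTEFormer is not specified here.

```python
import numpy as np

STEPS_PER_HOUR = 12                    # 60 min / 5 min interval
STEPS_PER_DAY = 24 * STEPS_PER_HOUR    # 288 steps
STEPS_PER_WEEK = 7 * STEPS_PER_DAY     # 2016 steps

def multi_granularity_windows(series, t, window=12):
    """Slice hour/day/week history windows aligned with a target starting at index t."""
    if t - STEPS_PER_WEEK < 0 or t + window > len(series):
        raise IndexError("not enough history or future data around t")
    target = series[t : t + window]                                    # ground truth
    hour = series[t - window : t]                                      # steps just before t
    day = series[t - STEPS_PER_DAY : t - STEPS_PER_DAY + window]       # same slot, previous day
    week = series[t - STEPS_PER_WEEK : t - STEPS_PER_WEEK + window]    # same slot, previous week
    return hour, day, week, target

# Synthetic series with a purely daily period (not real traffic data).
n = 3 * STEPS_PER_WEEK
series = np.sin(2 * np.pi * np.arange(n) / STEPS_PER_DAY)
t = STEPS_PER_WEEK + 100
hour, day, week, target = multi_granularity_windows(series, t)
```

For a perfectly periodic series such as this one, the day and week windows coincide with the target, which is the idealized version of the alignment visible in the figure.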
Figure 6.
Forecasting performance comparison at each step on PEMS08. Subfigures (a) MAE, (b) RMSE, and (c) MAPE show the forecasting error metrics over a 12-step horizon. MGTEFormer (blue) consistently outperforms STAEformer (orange), indicating higher accuracy in predicting traffic flow.
Figure 7.
Forecasting performance comparison at timestep 0 on PEMS08. Solid lines depict the actual traffic flow for four sensors, and dashed lines show the model predictions. The predictions for sensor 40 closely match the actual flow, indicating high accuracy. Sensors 190 and 278 exhibit some prediction errors, suggesting room for refinement. Sensor 135 shows minor systematic deviations.
Figure 8.
Plot of prediction vs. measurement of sensor 3 at step 0 in PEMS08 by MGTEFormer. Each dot represents a data sample, with the red line indicating perfect prediction alignment. The close clustering around the line shows a strong correlation, highlighting the model’s accuracy in traffic estimation.
Figure 9.
Boxplots of MGTEFormer prediction errors for 30 sensors in PEMS08. Each box shows the spread and median of the prediction errors. For most nodes the errors are relatively small, with medians near zero (indicated by the orange horizontal line in each box), which suggests that the model generally predicts those nodes accurately. A few nodes exhibit a larger spread and more outliers, indicating less consistent performance.
Figure 10.
Visualization of the traffic flow of the multi-granularity sequence on PEMS08, including hour sample, day sample, week sample, time of day sample, day of week sample, and ground truth. Each heatmap uses color intensity to represent traffic flow magnitude, with darker shades indicating higher traffic volumes.
Table 1.
Dataset statistics.
Dataset | Samples | Sensors | Interval | Missing Ratio | Time Range |
---|---|---|---|---|---|
PEMS04 | 16,992 | 307 | 5 min | 3.182% | 1 January 2018–28 February 2018 |
PEMS08 | 17,856 | 170 | 5 min | 0.696% | 1 July 2016–31 August 2016 |
Table 2.
Model performance of different hyperparameters.
Block Layers | Attention Heads | MAE | RMSE | MAPE | R² |
---|---|---|---|---|---|
3 | 1 | 13.205 | 22.912 | 8.842% | 0.976 |
3 | 2 | 13.294 | 22.957 | 8.888% | 0.975 |
3 | 4 | 13.286 | 23.094 | 8.816% | 0.975 |
3 | 8 | 13.212 | 22.914 | 8.851% | 0.976 |
1 | 4 | 13.625 | 22.808 | 9.492% | 0.976 |
3 | 4 | 13.286 | 23.094 | 8.816% | 0.975 |
5 | 4 | 13.234 | 22.997 | 8.789% | 0.975 |
7 | 4 | 13.256 | 23.003 | 8.832% | 0.975 |
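The four metrics reported in this table are standard regression measures. A minimal sketch of how they are commonly computed is given below; the function name `regression_metrics` and the example values are hypothetical, and the exclusion of near-zero targets from MAPE is an assumption borrowed from common benchmark practice rather than a confirmed detail of the evaluation code behind these results.

```python
import numpy as np

def regression_metrics(y_true, y_pred, eps=1e-8):
    """Compute MAE, RMSE, MAPE (in percent), and R^2 for two equally shaped arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    # Assumption: skip targets that are (near) zero to avoid division by zero.
    nonzero = np.abs(y_true) > eps
    mape = 100.0 * np.mean(np.abs(err[nonzero] / y_true[nonzero]))
    ss_res = np.sum(err ** 2)                        # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}

# Made-up flows, not values from PEMS08.
metrics = regression_metrics([100, 200, 300, 400], [110, 190, 310, 390])
print(metrics)
```

Lower MAE, RMSE, and MAPE indicate better forecasts, whereas R² closer to 1 indicates that more of the variance in the measured flow is explained.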
Table 3.
Model performance on PEMS08 and PEMS04 for traffic flow forecasting.
Dataset | Model | MAE | RMSE | MAPE |
---|---|---|---|---|
PEMS08 | STGCN | 21.82 | 32.96 | 13.22% |
PEMS08 | DCRNN | 17.82 | 28.12 | 11.28% |
PEMS08 | ASTGCN | 18.18 | 28.05 | 11.00% |
PEMS08 | GMAN | 16.49 | 25.91 | 12.00% |
PEMS08 | ASTGNN | 16.02 | 25.51 | 10.09% |
PEMS08 | D2STGNN | 14.39 | 23.82 | 9.30% |
PEMS08 | STWave | 13.83 | 23.54 | 9.26% |
PEMS08 | PDFormer | 13.66 | 23.55 | 9.07% |
PEMS08 | STAEformer | 13.54 | 23.41 | 8.90% |
PEMS08 | MGTEFormer | 13.31 | 22.88 | 8.86% |
PEMS04 | STGCN | 22.38 | 34.54 | 14.92% |
PEMS04 | DCRNN | 20.31 | 32.14 | 13.97% |
PEMS04 | ASTGCN | 21.89 | 34.53 | 15.00% |
PEMS04 | GMAN | 19.47 | 30.73 | 14.07% |
PEMS04 | ASTGNN | 18.77 | 31.04 | 12.44% |
PEMS04 | D2STGNN | 18.58 | 30.19 | 12.43% |
PEMS04 | STWave | 20.21 | 32.70 | 13.40% |
PEMS04 | PDFormer | 18.33 | 29.97 | 12.10% |
PEMS04 | STAEformer | 18.22 | 30.05 | 12.12% |
PEMS04 | MGTEFormer | 18.26 | 30.28 | 12.09% |
Table 4.
Horizon performance on PEMS08.
H | T | Metric | STGCN | DCRNN | ASTGCN | GMAN | PDFormer | STAEformer | MGTEFormer |
---|---|---|---|---|---|---|---|---|---|
3 | 15 min | MAE | 14.10 | 13.66 | 15.24 | 14.22 | 13.08 | 12.67 | 12.40 |
3 | 15 min | RMSE | 22.03 | 21.26 | 24.06 | 22.15 | 21.17 | 21.65 | 21.28 |
3 | 15 min | MAPE | 9.17% | 8.85% | 10.57% | 10.54% | 8.65% | 8.24% | 8.25% |
3 | 15 min | R² | 0.977 | 0.979 | 0.973 | 0.977 | 0.979 | 0.978 | 0.979 |
6 | 30 min | MAE | 15.37 | 14.45 | 15.84 | 14.63 | 13.75 | 13.54 | 13.21 |
6 | 30 min | RMSE | 23.97 | 22.71 | 25.16 | 23.05 | 22.50 | 23.48 | 23.01 |
6 | 30 min | MAPE | 9.85% | 9.29% | 10.63% | 10.73% | 9.11% | 8.77% | 8.78% |
6 | 30 min | R² | 0.973 | 0.976 | 0.971 | 0.975 | 0.976 | 0.974 | 0.975 |
9 | 45 min | MAE | 16.51 | 15.07 | 16.21 | 15.01 | 14.40 | 14.23 | 13.86 |
9 | 45 min | RMSE | 25.67 | 23.81 | 25.90 | 23.79 | 23.62 | 24.81 | 24.18 |
9 | 45 min | MAPE | 10.48% | 9.94% | 10.79% | 10.98% | 9.54% | 9.21% | 9.24% |
9 | 45 min | R² | 0.969 | 0.973 | 0.969 | 0.973 | 0.973 | 0.971 | 0.973 |
12 | 60 min | MAE | 17.56 | 15.60 | 16.63 | 15.39 | 15.15 | 14.94 | 14.54 |
12 | 60 min | RMSE | 27.25 | 24.71 | 26.59 | 24.45 | 24.76 | 25.99 | 25.21 |
12 | 60 min | MAPE | 11.06% | 9.94% | 10.99% | 11.28% | 10.05% | 9.71% | 9.73% |
12 | 60 min | R² | 0.969 | 0.971 | 0.968 | 0.972 | 0.971 | 0.968 | 0.970 |
Table 5.
Computational efficiency comparison.
Model | Average Training Time | Average Inference Time |
---|---|---|
DCRNN | 282.233 s | 38.653 s |
GMAN | 121.428 s | 11.914 s |
ASTGCN | 96.266 s | 20.485 s |
D2STGNN | 100.984 s | 13.841 s |
PDFormer | 49.177 s | 6.178 s |
STAEformer | 97.191 s | 4.357 s |
MGTEFormer | 67.448 s | 9.320 s |
Table 6.
Ablation study on PEMS08 with origin input.
Embedding | MAE | RMSE | MAPE (%) |
---|---|---|---|
origin(5) | 15.96 | 25.97 | 10.52 |
origin(5)⊕adp | 14.65 | 23.89 | 9.53 |
origin(3)⊕day⊕week⊕adp | 15.30 | 24.09 | 9.61 |
origin(3)⊕dow⊕tod⊕adp | 13.44 | 23.27 | 8.92 |
origin(5)⊕dow⊕tod⊕day⊕week⊕adp | 13.31 | 23.12 | 8.85 |
Table 7.
Ablation study on PEMS08 with single x input.
Embedding | MAE | RMSE | MAPE (%) |
---|---|---|---|
x | 17.11 | 27.43 | 10.92 |
x⊕adp | 15.21 | 23.99 | 9.87 |
x⊕day⊕week | 15.71 | 25.60 | 10.14 |
x⊕dow⊕tod | 15.16 | 26.11 | 9.92 |
x⊕day⊕week⊕adp | 15.36 | 24.14 | 9.68 |
x⊕dow⊕tod⊕adp | 13.38 | 23.18 | 8.94 |
x⊕dow⊕tod⊕day⊕week⊕adp | 13.31 | 22.88 | 8.86 |