Deep Learning-Driven Bus Short-Term OD Demand Prediction via a Physics-Guided Adaptive Graph Spatio-Temporal Attention Network
Abstract
1. Introduction
1.1. Literature Review
1.1.1. Statistical-Based Prediction Methods
1.1.2. Traditional Machine Learning-Based Prediction Methods
1.1.3. Deep Learning-Based Prediction Methods
1.2. Contributions
- (i) Unlike the original model, which uses daily, weekly, and real-time input sequences, this study adopts only daily OD data as the input sequence. To fully extract information at a single time scale, the input layer converts the daily data into a continuous sequence through temporal concatenation (see the sketch after this list). This design lets the model focus on daily periodic features while depending only on daily data, making it better suited to practical application scenarios with small-scale bus data.
- (ii) Unlike the original model, this study embeds a multi-head attention module in the encoder to better encode the spatiotemporal features in daily demand data. As a result, the model can process the features extracted by the AGC-LSTM layer in parallel within multiple subspaces.
- (iii) Experiments show that a one-layer bidirectional LSTM in the decoder outperforms a multi-layer structure, reducing the risk of overfitting when training on a small bus dataset.
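The temporal concatenation mentioned in contribution (i) can be illustrated with a short sketch. This is a minimal example, not the authors' code: it assumes the daily data are stored as one 27 × 27 OD matrix per 30-min interval and shows how consecutive days could be concatenated into a single sequence and cut into sliding windows; the array names and the window length are illustrative assumptions.

```python
import numpy as np

# Assumed layout: daily_od[d, t] holds the 27x27 OD matrix of day d, interval t
# (8 days x 32 half-hour intervals per day, as described in Section 3).
num_days, intervals_per_day, num_stops = 8, 32, 27
daily_od = np.random.poisson(2.0, size=(num_days, intervals_per_day, num_stops, num_stops))

# Temporal concatenation: stack the per-day sequences into one continuous series
# of shape (num_days * intervals_per_day, 27, 27).
series = daily_od.reshape(-1, num_stops, num_stops)

# Sliding windows: the previous `window` intervals predict the next one.
window = 4  # illustrative choice, not taken from the paper
X = np.stack([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]
print(X.shape, y.shape)  # (252, 4, 27, 27) (252, 27, 27)
```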
1.3. Outline
2. Methodology
2.1. Model Architecture
- (i) The input layer receives raw bus passenger flow data, including historical OD demands and their temporal information. These data are formatted as tensors and fed into the model to support subsequent feature extraction and processing.
- (ii) In the encoder, the OD demand data are first passed through the AGC-LSTM module, which captures hidden periodic spatiotemporal distribution features. The captured information is then encoded by a multi-head attention mechanism to generate the corresponding representations, which are subsequently passed through a residual network to strengthen the periodic spatiotemporal feature representations.
- (iii) The decoder incorporates a one-layer bidirectional LSTM, which decompresses and reconstructs the enhanced spatiotemporal representations back into a space with the same dimensions as the original OD demand data.
- (iv) Finally, the output layer uses the decoded spatiotemporal features to generate the short-term OD demand predictions. The output is a tensor with the same dimensions as the actual OD data, representing the predicted demand between stops across the time intervals (a minimal code skeleton of this pipeline is sketched below).
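The four layers above can be summarized as an encoder-decoder skeleton. The following PyTorch sketch is a simplified stand-in, not the published PAG-STAN implementation: the adaptive graph convolution inside AGC-LSTM is reduced to a learnable adjacency applied before a standard LSTM, and all layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PAGSTANSketch(nn.Module):
    """Simplified encoder-decoder skeleton: an AGC-LSTM-like encoder with
    multi-head attention and a residual connection, then a one-layer BiLSTM decoder."""
    def __init__(self, num_stops=27, hidden=256, heads=4):
        super().__init__()
        self.in_dim = num_stops * num_stops            # flattened 27x27 OD matrix
        self.adj = nn.Parameter(torch.eye(num_stops))  # adaptive adjacency (learned)
        self.lstm = nn.LSTM(self.in_dim, hidden, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, self.in_dim)

    def forward(self, x):                              # x: (batch, window, 27, 27)
        b, t, n, _ = x.shape
        x = torch.einsum("btij,jk->btik", x, torch.softmax(self.adj, dim=-1))
        h, _ = self.lstm(x.reshape(b, t, -1))          # graph-mixed features -> LSTM
        a, _ = self.attn(h, h, h)                      # multi-head self-attention
        h = h + a                                      # residual connection
        d, _ = self.decoder(h)                         # one-layer BiLSTM decoder
        return self.out(d[:, -1]).reshape(b, n, n)     # next-interval OD matrix

model = PAGSTANSketch()
pred = model(torch.rand(8, 4, 27, 27))   # toy batch: 8 samples, 4-interval window
print(pred.shape)                        # torch.Size([8, 27, 27])
```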
2.2. AGC-LSTM
2.3. Attention Mechanism
2.4. Bidirectional LSTM
3. Dataset
3.1. Data Description
- (i) Passenger flow increases significantly during peak hours. During the morning and evening commuting periods, passenger volumes along the entire route rise sharply, with more boarding activity at all stops. Figure 8 shows the variation in bus OD demand from Bell Tower Square Stop to Wushui Business District—IKEA Stop between 2 and 6 September 2024. The horizontal axis is the time-interval index (each interval is 30 min, with 32 intervals per day), and the vertical axis is the OD demand volume. The figure shows a clear periodic fluctuation in passenger flow: OD demand is higher during the morning and evening peaks and relatively lower during off-peak hours.
- (ii) Passenger flow is influenced by temporal factors. During weekday off-peak periods, such as mid-morning, afternoon, and late evening, passenger flow is relatively low. On weekends and holidays, with fewer commuters, overall passenger volume generally decreases. Taking the trip from Bell Tower Square Stop to Wushui Business District—IKEA Stop as an example, the average daily OD demand is 728 passengers on weekdays and 415 passengers on weekends, a decrease of approximately 43% (a short sketch of this comparison follows the list).
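As a quick illustration of the weekday/weekend comparison above, the following sketch shows how the average daily demand of one OD pair could be split by day of week. The values here are toy placeholders; with the real counts this reproduces the roughly 43% drop ((728 − 415)/728 ≈ 0.43) reported in the text.

```python
import numpy as np
import pandas as pd

# Assumed input: the demand of one OD pair per 30-min interval over the 8 service days
# (2-9 September 2024, 5:00-21:00). Toy values stand in for the real counts.
rng = np.random.default_rng(0)
idx = pd.DatetimeIndex([
    pd.Timestamp("2024-09-02 05:00") + pd.Timedelta(days=d, minutes=30 * t)
    for d in range(8) for t in range(32)
])
s = pd.Series(rng.poisson(20, size=len(idx)), index=idx)

daily = s.resample("D").sum()                 # total OD demand per day
is_weekend = daily.index.dayofweek >= 5       # Saturday/Sunday
weekday_avg, weekend_avg = daily[~is_weekend].mean(), daily[is_weekend].mean()
print(f"weekday {weekday_avg:.0f}, weekend {weekend_avg:.0f}, "
      f"decrease {(weekday_avg - weekend_avg) / weekday_avg:.0%}")
```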
3.2. Experimental Results
3.2.1. Normalization Details
3.2.2. Experimental Environment
3.2.3. Hyperparameter Optimization
3.2.4. Loss Function
3.2.5. Model Training
3.2.6. Prediction Results and Analysis
- (i) During peak hours, passenger flow is mainly commuting traffic, with clear travel purposes, concentrated origins and destinations, large volume, and strong regularity. The AGC-LSTM module can effectively capture the spatiotemporal correlation of OD demand, and the attention mechanism is also more likely to focus on peak-hour passenger flow characteristics, resulting in small prediction errors.
- (ii) During off-peak hours, the gap between predicted and real values is significantly larger than during peak hours. Midday passenger flow consists mainly of non-commuting trips, with small volume, scattered travel purposes, and strong randomness, which weakens the spatiotemporal correlation and leaves the demand without stable patterns. As a result, the model has limited ability to learn such aperiodic characteristics, leading to larger deviations (a sketch for quantifying this peak/off-peak error split follows the list).
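One way to quantify the peak versus off-peak difference described above is to split the test-day errors by time of day. The sketch below is illustrative only: the peak-hour definition (7:00-9:00 and 17:00-19:00) and the prediction arrays are assumptions, not values from the paper.

```python
import numpy as np

# Assumed shapes: 32 half-hour intervals on the test day, 27x27 OD matrices.
intervals = np.arange(32)                       # interval 0 starts at 05:00
start_hour = 5 + intervals * 0.5
peak = ((start_hour >= 7) & (start_hour < 9)) | ((start_hour >= 17) & (start_hour < 19))

y_true = np.random.poisson(3.0, size=(32, 27, 27)).astype(float)   # placeholder data
y_pred = y_true + np.random.normal(0, 1, size=y_true.shape)        # placeholder predictions

mae = np.abs(y_pred - y_true).mean(axis=(1, 2))       # MAE per interval
print(f"peak MAE {mae[peak].mean():.2f}  vs  off-peak MAE {mae[~peak].mean():.2f}")
```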
3.3. Comparison with State-of-the-Art Methods
3.3.1. Evaluation Metrics
3.3.2. Baseline Models
- 2D CNN: The 2D CNN extracts rich feature information through convolution operations. It consists of two 2D convolutional layers with 3 × 3 filters and a fully connected layer with 256 neurons. The learning rate is 0.0005 and the batch size is 16.
- LSTM: The LSTM effectively captures key features in time series. Two stacked fully connected LSTM layers are used to predict future OD demand, with the hidden state dimension set to 256. The learning rate is 0.0005 and the batch size is 16 (a configuration sketch for this baseline follows the list).
- STGCN: STGCN [40] uses graph convolution to capture spatial dependencies and temporal convolution to model temporal features, stacking multiple ST-Conv blocks. In our study, both the graph convolution kernel size and the temporal convolution kernel size are set to 3. The learning rate is 0.001 and the batch size is 32.
- BiLSTM: We use the bidirectional LSTM module from the PAG-STAN decoder as an independent baseline model.
- ConvLSTM: ConvLSTM [41] replaces the fully connected layers in LSTM with convolutional layers, enabling it to capture spatial and temporal features simultaneously. The convolution kernel size is 3 × 3, the learning rate is 0.001, and the batch size is 32. Other settings are consistent with those of LSTM.
- AGC-LSTM: We extract the adaptive graph convolutional LSTM module from the PAG-STAN encoder as an independent baseline model.
- Transformer: The Transformer [28] is an attention-based model that learns attention weights from sequential data using multi-head attention. The number of heads is set to 4 and the feature vector dimension to 256. The learning rate is 0.001 and the batch size is 32.
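As an example of how one of these baselines could be configured, the following sketch sets up the LSTM baseline roughly as described (two stacked LSTM layers, hidden size 256, learning rate 0.0005, batch size 16). It is a plausible reconstruction under stated assumptions, not the authors' code; in particular, the flattening of the 27 × 27 OD matrix, the window length, and the MSE training loss are illustrative choices.

```python
import torch
import torch.nn as nn

class LSTMBaseline(nn.Module):
    """Two stacked LSTM layers (hidden size 256) followed by a linear readout
    that maps the last hidden state back to a flattened 27x27 OD matrix."""
    def __init__(self, num_stops=27, hidden=256):
        super().__init__()
        self.lstm = nn.LSTM(num_stops * num_stops, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, num_stops * num_stops)

    def forward(self, x):                          # x: (batch, window, 27, 27)
        b, t, n, _ = x.shape
        h, _ = self.lstm(x.reshape(b, t, -1))
        return self.head(h[:, -1]).reshape(b, n, n)

model = LSTMBaseline()
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)   # learning rate 0.0005
batch = torch.rand(16, 4, 27, 27)                           # batch size 16, 4-interval window
loss = nn.MSELoss()(model(batch), torch.rand(16, 27, 27))   # assumed MSE training loss
loss.backward()
optimizer.step()
```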
3.3.3. Model Comparisons
- (i) The performance of the different models across the various time intervals is generally consistent. In particular, PAG-STAN performs better than the others during both peak and off-peak periods, indicating its stability.
- (ii) When OD demand reaches peaks or troughs, the performance gap among the models becomes more pronounced. In such cases, the PAG-STAN model shows a superior ability to capture passenger flow fluctuations.
- (iii) When OD demand fluctuates nonlinearly, performance varies across all models. Nonetheless, PAG-STAN still achieves the best predictions in this situation, demonstrating its strong adaptability (a sketch of the evaluation metrics used for these comparisons follows the list).
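The comparisons in Section 3.3 are reported in terms of RMSE, MAE, WMAPE, and R². A minimal sketch of these metrics under their standard definitions is shown below; this is a generic implementation, not the authors' evaluation script.

```python
import numpy as np

def evaluate(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """Standard definitions of RMSE, MAE, WMAPE, and R^2 over flattened arrays."""
    y_true, y_pred = y_true.ravel(), y_pred.ravel()
    err = y_pred - y_true
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    wmape = np.sum(np.abs(err)) / np.sum(np.abs(y_true))           # weighted MAPE
    r2 = 1 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return {"RMSE": rmse, "MAE": mae, "WMAPE": wmape, "R2": r2}

# Toy usage with random data of the test-day shape (32 intervals, 27x27 OD matrices).
rng = np.random.default_rng(1)
y_true = rng.poisson(3.0, size=(32, 27, 27)).astype(float)
print(evaluate(y_true, y_true + rng.normal(0, 1, y_true.shape)))
```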
4. Conclusions
- (i) This paper systematically summarizes the shortcomings of existing research on traffic/passenger demand prediction, including the limitations of traditional statistical and machine learning models, the scarcity of studies specifically addressing bus OD prediction, and the lack of deep learning-based prediction models developed for small-scale bus datasets.
- (ii) To address these issues, this paper proposes an improved PAG-STAN framework. The framework simplifies the input to daily OD data, enabling the model to extract daily periodic features. A multi-head attention module is embedded in the encoder to enhance the model's feature learning and representation capabilities, while a one-layer bidirectional LSTM is adopted in the decoder to reduce the risk of overfitting on a small-scale training set.
- (iii) Experiments on a small-scale Nantong bus OD demand dataset demonstrate that the PAG-STAN model outperforms the baseline models in terms of applicability, stability, and prediction accuracy.
For future work, the following directions are identified:
- (i) Future research can focus on expanding the scale of the dataset by collecting OD passenger flow data from multiple bus routes over longer time spans and validating the model across different urban environments, so as to evaluate its robustness and transferability more comprehensively. Since the experimental data cover a relatively early and short period (2 September to 9 September 2024), more recent datasets should also be acquired in future work to enhance the reliability and timeliness of the model evaluation.
- (ii) Many factors influence bus OD demand, and it is difficult to accurately capture dynamic patterns using only a single type of bus data. Future studies can incorporate multidimensional data such as weather conditions and holiday travel demand to build more robust and reliable prediction models.
- (iii) To improve prediction accuracy and stability during off-peak hours, future work will also focus on enhancing the model's ability to capture irregular and low-demand patterns, for example by refining the model architecture or incorporating additional contextual features.
- (iv) The significance of bus OD demand prediction lies in its potential to help optimize and manage public transportation systems. Future work will explore integrating the proposed model with manual or rule-based scheduling strategies to evaluate its potential for supporting real-world operational decision-making in urban bus systems.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Saif, M.A.; Zefreh, M.M.; Török, Á. Public Transport Accessibility: A Literature Review. Period. Polytech. Transp. Eng. 2019, 47, 36–43.
- Mohammed, M.; Oke, J. Origin–Destination Inference in Public Transportation Systems: A Comprehensive Review. Int. J. Transp. Sci. Technol. 2023, 12, 315–328.
- Lv, Y.; Duan, Y.; Kang, W.; Li, Z.; Wang, F.-Y. Traffic Flow Prediction with Big Data: A Deep Learning Approach. IEEE Trans. Intell. Transp. Syst. 2014, 16, 865–873.
- Fan, Q.; Yu, C.; Zuo, J. Predicting Urban Rail Transit Network Origin-Destination Matrix under Operational Incidents with Deep Counterfactual Inference. Appl. Sci. 2025, 15, 6398.
- Zhang, S.; Zhang, J.; Yang, L.; Chen, F.; Li, S.; Gao, Z. Physics Guided Deep Learning-Based Model for Short-Term Origin-Destination Demand Prediction in Urban Rail Transit Systems Under Pandemic. Engineering 2024, 41, 276–296.
- Smith, B.L.; Demetsky, M.J. Traffic Flow Forecasting: Comparison of Modeling Approaches. J. Transp. Eng. 1997, 123, 261–266.
- Lee, S.; Fambro, D.B. Application of Subset Autoregressive Integrated Moving Average Model for Short-Term Freeway Traffic Volume Forecasting. Transp. Res. Rec. 1999, 1678, 179–188.
- Tan, M.; Wong, S.; Xu, J.; Guan, Z.; Zhang, P. An Aggregation Approach to Short-Term Traffic Flow Prediction. IEEE Trans. Intell. Transp. Syst. 2009, 10, 60–69.
- Williams, B.M.; Hoel, L.A. Modeling and Forecasting Vehicular Traffic Flow as a Seasonal ARIMA Process: Theoretical Basis and Empirical Results. J. Transp. Eng. 2003, 129, 664–672.
- Emami, A.; Sarvi, M.; Bagloee, A.S. Using Kalman Filter Algorithm for Short-Term Traffic Flow Prediction in a Connected Vehicle Environment. J. Mod. Transp. 2019, 27, 222–232.
- Pan, Y.A.; Guo, J.; Chen, Y.; Cheng, Q.; Li, W.; Liu, Y. A Fundamental Diagram Based Hybrid Framework for Traffic Flow Estimation and Prediction by Combining a Markovian Model with Deep Learning. Expert Syst. Appl. 2024, 238, 122219.
- Zhao, Y.; Ren, L.; Ma, Z.; Jiang, X. A Novel Three-Stage Framework for Prioritizing and Selecting Feature Variables for Short-Term Metro Passenger Flow Prediction. In Proceedings of the Transportation Research Board 99th Annual Meeting, Washington, DC, USA, 12–16 January 2020; TRB: Washington, DC, USA, 2020; Volume 2674, pp. 192–205.
- Wang, D.; Zhang, Q.; Wu, S.; Li, X.; Wang, R. Traffic Flow Forecast with Urban Transport Network. In Proceedings of the 2016 IEEE International Conference on Intelligent Transportation Engineering, Singapore, 20–22 August 2016; IEEE: Singapore, 2016; pp. 139–143.
- Kang, L.; Hu, G.; Huang, H.; Lu, W.; Liu, L. Urban Traffic Travel Time Short-Term Prediction Model Based on Spatio-Temporal Feature Extraction. J. Adv. Transp. 2020, 2020, 332.
- Hong, W. Traffic Flow Forecasting by Seasonal SVR with Chaotic Simulated Annealing Algorithm. Neurocomputing 2011, 74, 2096–2107.
- Cai, P.; Wang, Y.; Lu, G.; Chen, P.; Ding, C.; Sun, J. A Spatiotemporal Correlative k-Nearest Neighbor Model for Short-Term Traffic Multistep Forecasting. Transp. Res. Part C 2016, 62, 21–34.
- Chen, X.; Wu, S.; Shi, C.; Huang, Y.; Yang, Y.; Ke, R.; Zhao, J. Sensing Data Supported Traffic Flow Prediction via Denoising Schemes and ANN: A Comparison. IEEE Sens. J. 2020, 20, 14317–14328.
- Raskar, C.; Nema, S. Metaheuristic Enabled Modified Hidden Markov Model for Traffic Flow Prediction. Comput. Netw. 2022, 206, 108780.
- Jin, J.; Wang, Y.H.; Li, M. Prediction of the Metro Section Passenger Flow Based on Time-Space Characteristic. Appl. Mech. Mater. 2013, 397, 1038–1044.
- Zhang, X.; Wang, C.; Chen, J.; Chen, D. A Deep Neural Network Model with GCN and 3D Convolutional Network for Short-Term Metro Passenger Flow Forecasting. IET Intell. Transp. Syst. 2023, 17, 1599–1607.
- Xia, Z.; Zhang, Y.; Yang, J.; Xie, L. Dynamic Spatial–Temporal Graph Convolutional Recurrent Networks for Traffic Flow Forecasting. Expert Syst. Appl. 2024, 240, 122381.
- Zhang, J.; Chen, F.; Guo, Y.; Li, X. Multi-Graph Convolutional Network for Short-Term Passenger Flow Forecasting in Urban Rail Transit. IET Intell. Transp. Syst. 2020, 14, 1210–1217.
- Shanthappa, K.N.; Mulangi, H.R.; Manjunath, M.H. Origin-Destination Demand Prediction of Public Transit Using Graph Convolutional Neural Network. Case Stud. Transp. Policy 2024, 17, 101230.
- Lu, X.; Ma, C.; Qiao, Y. Short-Term Demand Forecasting for Online Car-Hailing Using ConvLSTM Networks. Phys. A 2021, 570, 125838.
- Zhang, Q.; Li, C.; Su, F.; Li, Y. Spatiotemporal Residual Graph Attention Network for Traffic Flow Forecasting. IEEE Internet Things J. 2023, 10, 11518–11532.
- He, Y.; Li, L.; Zhu, X.; Tsui, K.L. Multi-Graph Convolutional-Recurrent Neural Network (MGC-RNN) for Short-Term Forecasting of Transit Passenger Flow. IEEE Trans. Intell. Transp. Syst. 2022, 23, 18155–18174.
- Zhan, S.; Cai, Y.; Xiu, C.; Zuo, D.; Wang, D.; Wong, S.C. Parallel Framework of a Multi-Graph Convolutional Network and Gated Recurrent Unit for Spatial–Temporal Metro Passenger Flow Prediction. Expert Syst. Appl. 2024, 251, 123982.
- Yang, Y.; Zhang, J.; Yang, L.; Yang, Y.; Li, X.; Gao, Z. Short-Term Passenger Flow Prediction for Multi-Traffic Modes: A Transformer and Residual Network Based Multi-Task Learning Method. Inf. Sci. 2023, 642, 119144.
- Lv, S.; Wang, K.; Yang, H.; Wang, P. An Origin–Destination Passenger Flow Prediction System Based on Convolutional Neural Network and Passenger Source-Based Attention Mechanism. Expert Syst. Appl. 2024, 238, 121989.
- Chen, C.; Liu, Y.; Chen, L.; Zhang, C. Bidirectional Spatial-Temporal Adaptive Transformer for Urban Traffic Flow Forecasting. IEEE Trans. Neural Netw. Learn. Syst. 2023, 34, 6913–6925.
- Chu, K.F.; Lam, A.Y.S.; Li, V.O.K. Deep Multi-Scale Convolutional LSTM Network for Travel Demand and Origin-Destination Predictions. IEEE Trans. Intell. Transp. Syst. 2019, 21, 3219–3232.
- Zhao, L.; Song, Y.; Zhang, C.; Liu, Y.; Wang, P.; Lin, T.; Deng, M.; Li, H. T-GCN: A Temporal Graph Convolutional Network for Traffic Prediction. IEEE Trans. Intell. Transp. Syst. 2019, 21, 3848–3858.
- Zheng, H.; Lin, F.; Feng, X.; Chen, Y. A Hybrid Deep Learning Model with Attention-Based Conv-LSTM Networks for Short-Term Traffic Flow Prediction. IEEE Trans. Intell. Transp. Syst. 2020, 22, 6910–6920.
- Noursalehi, P.; Koutsopoulos, H.N.; Zhao, J. Dynamic Origin-Destination Prediction in Urban Rail Systems: A Multi-Resolution Spatio-Temporal Deep Learning Approach. IEEE Trans. Intell. Transp. Syst. 2021, 23, 5106–5115.
- Zhang, J.; Che, H.; Chen, F.; Ma, W.; He, Z. Short-Term Origin-Destination Demand Prediction in Urban Rail Transit Systems: A Channel-Wise Attentive Split-Convolutional Neural Network Method. Transp. Res. Part C 2021, 124, 102928.
- Ke, J.; Qin, X.; Yang, H.; Zheng, Z.; Zhu, Z.; Ye, J. Predicting Origin-Destination Ride-Sourcing Demand with a Spatio-Temporal Encoder-Decoder Residual Multi-Graph Convolutional Network. Transp. Res. Part C 2021, 122, 102858.
- Huang, Z.; Wang, D.; Yin, Y.; Li, X. A Spatiotemporal Bidirectional Attention-Based Ride-Hailing Demand Prediction Model: A Case Study in Beijing During COVID-19. IEEE Trans. Intell. Transp. Syst. 2022, 23, 25115–25126.
- Zhao, J.; Zhang, R.; Sun, Q.; Shi, J.; Zhuo, F.; Li, Q. Adaptive Graph Convolutional Network-Based Short-Term Passenger Flow Prediction for Metro. IEEE Trans. Intell. Transp. Syst. 2024, 28, 806–815.
- Abbasimehr, H.; Shabani, M.; Yousefi, M. An Optimized Model Using LSTM Network for Demand Forecasting. Comput. Ind. Eng. 2020, 143, 106435.
- Yu, B.; Yin, H.; Zhu, Z. Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, Stockholm, Sweden, 13–19 July 2018; pp. 3634–3640.
- Sattarzadeh, A.R.; Kutadinata, R.J.; Pathirana, P.N.; Huynh, V.T. A Novel Hybrid Deep Learning Model with ARIMA Conv-LSTM Networks and Shuffle Attention Layer for Short-Term Traffic Flow Prediction. Transp. A Transp. Sci. 2025, 21, 388–410.
| References | Research Question | Input Data | Model(s) | Attention Mechanism(s) | Objective(s) | Improvement over the Suboptimal Model |
|---|---|---|---|---|---|---|
| Zhao et al. [32] (2019) | Traffic prediction based on urban road networks | Real-world traffic dataset | GCN, GRU | None | To predict traffic flow in urban road networks | The accuracy was improved by 4.40% |
| Zheng et al. [33] (2020) | Short-term traffic flow prediction | Real traffic flow data | Conv-LSTM, BiLSTM | Self-attention | To effectively capture complex nonlinear characteristics of traffic flow | MAE, MAPE, and RMSE were reduced by 3.55%, 8.06%, and 4.15% |
| Zhang et al. [22] (2020) | Short-term metro passenger flow prediction | Smart card data of Beijing Metro | 3D CNN, GCN | None | To capture periodic features of metro passenger flow | RMSE, MAE, and WMAPE were reduced by 9.26%, 5.60%, and 2.80% |
| Ke et al. [36] (2021) | Short-term ride-hailing demand prediction | For-hire-vehicles datasets in Manhattan, New York City | RMGC, LSTM | None | To predict ride-hailing demand of various OD pairs | RMSE, MAE, and MAPE were reduced by 4.24%, 3.90%, and 5.00% |
| Huang et al. [37] (2022) | Short-term ride-hailing demand prediction | Real-world dataset during COVID-19 in Beijing | LSTM | Multi-head spatial attention, bidirectional attention | To predict the demand of urban ride-hailing during the epidemic period | RMSE, MAE, and MAPE were reduced by 4.03%, 2.30%, and 4.82% |
| Zhang et al. [20] (2023) | Short-term metro passenger flow prediction | Smart card data of Beijing metro and Xiamen metro | GCN, 3D CNN, ResNet | Multi-head attention | To predict systematic short-term passenger flow in urban rail transit | RMSE, MAE, and WMAPE were reduced by 10.35%, 7.43%, and 4.30% |
| Yang et al. [28] (2023) | Short-term passenger flow prediction of multiple transportation modes | Original passenger flow data of three transportation modes in selected areas of Beijing | ResNet, Transformer | Multi-head attention | To predict short-term inflow for multiple transportation modes | RMSE, MAE, and WMAPE were reduced by 1.43%, 4.70%, and 4.71% |
| Lv et al. [29] (2024) | Metro OD passenger flow prediction | Geographic information data, operation data, and smart card data of Shenzhen Metro | PSAM-CNN | Channel attention, spatial attention | To accurately predict OD passenger flows causing urban metro oversaturation | RMSE, MAE, and MAPE were reduced by 1.24%, 3.89%, and 8.88% |
| Xia et al. [21] (2024) | Traffic flow prediction | Three real-world traffic datasets | GCN, RNN, LSTM | Multi-head attention | To improve the accuracy of traffic flow prediction | RMSE, MAE, and MAPE were reduced by 7.00%, 15.70%, and 6.20% |
| This study | Bus short-term OD demand prediction | Nantong small-scale bus OD passenger flow dataset | AGC-LSTM, BiLSTM | Multi-head attention | To accurately predict short-term bus OD demand based on a small dataset | RMSE, MAE, and WMAPE were reduced by 6.19%, 6.59%, and 8.20%, and R2 was increased by 1.13% |
| Stop IDs | Stop Names | Stop Spacing (m) | Cumulative Distance (m) |
|---|---|---|---|
| 1 | Huanxi Cultural Square | | 0 |
| 2 | Youyi Bridge West | 1500 | 1500 |
| 3 | Bell Tower Square | 406 | 1906 |
| 4 | Stomatological Hospital North Campus | 607 | 2513 |
| 5 | Duanping Bridge | 366 | 2879 |
| 6 | Nantong No. 1 High School | 427 | 3306 |
| 7 | Chenggang Xincun | 380 | 3686 |
| 8 | Renmin Road & Waihuanxi Road East | 510 | 4196 |
| 9 | Chaan Hall | 424 | 4620 |
| 10 | Jiezhizha Xincun | 589 | 5209 |
| 11 | Chenggang Beicun | 378 | 5587 |
| 12 | Chengzha Bridge South | 833 | 6420 |
| 13 | Wushui Business District—Wanda | 1200 | 7620 |
| 14 | Wushui Business District—IKEA | 408 | 8028 |
| 15 | Huanghai Road & Jianghai Avenue West | 686 | 8714 |
| 16 | Huanghai Road & Yongyang Road Intersection | 330 | 9044 |
| 17 | Shenghe Yongxing Huayuan | 260 | 9304 |
| 18 | Huanghai Road & Yonghe Road West | 382 | 9686 |
| 19 | Financial Technology City—Insurance Industry Park | 523 | 10,209 |
| 20 | Polytechnic College South | 503 | 10,712 |
| 21 | Huanghai Road & Dasheng Road Intersection | 194 | 10,906 |
| 22 | Dasheng Road & Huanghai Road North | 418 | 11,324 |
| 23 | Dasheng Bridge | 677 | 12,001 |
| 24 | Xinhua Jiayuan | 516 | 12,517 |
| 25 | Xinhua Second Community | 155 | 12,672 |
| 26 | Shuhang Bridge | 448 | 13,120 |
| 27 | Tangzha Ancient Town Bus Parking Lot | 248 | 13,368 |
| Description | Nantong Small-Scale Bus OD Dataset |
|---|---|
| Date | 2 September 2024–9 September 2024 |
| Time | 5:00 to 21:00 |
| Direction | Upward |
| Number of days | 8 |
| Number of stops | 27 |
| Matrix dimension | 27 × 27 |
| Time interval | 30 min |
| Number of matrices per day | 32 |
| Train timespan | 2 September 2024–8 September 2024 |
| Validation timespan | 20% of the training set |
| Test timespan | 9 September 2024 |
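Based on the dataset description above, the train/validation/test split could be reproduced roughly as follows. This is a hedged sketch: the exact way the 20% validation portion is drawn from the training data is not specified in the text, so a chronological tail split is assumed here, and the data values are toy placeholders.

```python
import numpy as np

# 8 days x 32 intervals of 27x27 OD matrices (2-9 September 2024), toy values here.
data = np.random.poisson(2.0, size=(8, 32, 27, 27))

train_full = data[:7].reshape(-1, 27, 27)     # 2-8 September 2024
test = data[7].reshape(-1, 27, 27)            # 9 September 2024

n_val = int(0.2 * len(train_full))            # validation = 20% of the training set
train, val = train_full[:-n_val], train_full[-n_val:]
print(len(train), len(val), len(test))        # 180 44 32
```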
| O \ D | 1 | 2 | 3 | 4 | … | 24 | 25 | 26 | 27 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 13 | 10 | 16 | … | 7 | 0 | 1 | 4 |
| 2 | 0 | 0 | 15 | 23 | … | 5 | 2 | 5 | 0 |
| 3 | 0 | 0 | 0 | 9 | … | 1 | 3 | 1 | 4 |
| 4 | 0 | 0 | 0 | 0 | … | 4 | 2 | 3 | 2 |
| … | … | … | … | … | … | … | … | … | … |
| 24 | 0 | 0 | 0 | 0 | … | 0 | 4 | 6 | 1 |
| 25 | 0 | 0 | 0 | 0 | … | 0 | 0 | 2 | 2 |
| 26 | 0 | 0 | 0 | 0 | … | 0 | 0 | 0 | 1 |
| 27 | 0 | 0 | 0 | 0 | … | 0 | 0 | 0 | 0 |
| Software/Library | Version |
|---|---|
| Python | 3.12.7 |
| Pandas | 2.2.3 |
| PyTorch | 2.5.1 |
| Numpy | 2.0.1 |
| Matplotlib | 3.10.0 |
| Parameters | Values |
|---|---|
| Batch size | 32 |
| Learning rate | 0.001 |
| Epochs | 500 |
| Dropout | 0.1 |
| Optimizer | Adam |
| Hidden/feature dimension | 256 |
| Number of attention heads | 4 |
| Model | RMSE | MAE | WMAPE | R2 | Training Time Per Epoch (s) | Inference Time (s) |
|---|---|---|---|---|---|---|
| PAG-STAN | 4.8927 | 2.1878 | 21.39% | 0.9755 | 22.19 | 1.12 |
| Transformer | 5.2153 | 2.3421 | 23.30% | 0.9646 | 16.75 | 0.57 |
| AGC-LSTM | 5.3949 | 2.4208 | 23.69% | 0.9620 | 15.21 | 0.39 |
| ConvLSTM | 5.6688 | 2.5414 | 24.49% | 0.9579 | 16.64 | 0.31 |
| BiLSTM | 5.9104 | 2.8168 | 27.10% | 0.9345 | 15.47 | 0.26 |
| STGCN | 6.2993 | 3.1503 | 31.34% | 0.9178 | 16.60 | 0.48 |
| LSTM | 6.6318 | 3.3058 | 36.50% | 0.9047 | 15.43 | 0.27 |
| 2D CNN | 6.9026 | 3.5423 | 37.45% | 0.8903 | 15.24 | 0.36 |