GMTP: Enhanced Travel Time Prediction with Graph Attention Network and BERT Integration
Abstract
1. Introduction
- Introduced a high-performance spatiotemporal graph attention network: A road segment interaction frequency matrix is constructed and combined with GATv2 [20] to exploit spatial and temporal correlations when modeling complex road network structures. The network is deeply integrated with BERT [23], capturing complex interactions between nodes and strengthening the model’s ability to analyze dynamic spatiotemporal features.
- Proposed a head information sharing spatiotemporal self-attention mechanism (HSSTA): This mechanism learns contextual information in trajectories by extracting temporal traffic characteristics, such as peak hours and weekdays, in the attention layer. A hybrid matrix introduced in the attention heads adaptively adjusts the attention layer parameters, improving both computational efficiency and prediction accuracy.
- Designed an adaptive self-supervised learning task: This task reconstructs trajectory sequences while gradually increasing the masking ratio of trajectory subsequences. Combined with contrastive learning, it reduces interference from other related information, makes true-value prediction harder, and enhances the cross-sequence transferability of the pre-trained model, improving its generalization and robustness.
- Conducted experiments on two real-world trajectory datasets (Porto and Beijing): The results demonstrate that the proposed method outperforms the compared methods in both prediction accuracy and computational efficiency.
2. Related Work
2.1. Spatiotemporal Graph Neural Networks
2.2. Transformer-Based Language Models
2.3. Self-Supervised Learning
3. Methodology
3.1. Road Segment Interaction Pattern to Enhance GATv2
- GATv2:
- GATv2-Enhanced:
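As a rough illustration of the two variants named above, the sketch below computes the attention weights of one node with the standard GATv2 scoring function from Brody et al., e_ij = a^T LeakyReLU(W [h_i || h_j]). For the enhanced variant it adds a bias derived from a road segment interaction frequency matrix. The additive log(1 + frequency) term, the function name `gatv2_attention`, and all numeric values are assumptions made for illustration, not the formulation used by GMTP.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gatv2_attention(h, neighbors, i, W, a, freq=None):
    """Attention weights of node i over its neighbors.

    Standard GATv2 score: e_ij = a^T LeakyReLU(W [h_i || h_j]).
    If freq is given, log(1 + freq[i][j]) is added to e_ij. This bias is
    an illustrative assumption, not the paper's formula.
    """
    scores = []
    for j in neighbors[i]:
        z = matvec(W, h[i] + h[j])  # list concatenation plays the role of [h_i || h_j]
        e = sum(a_k * leaky_relu(z_k) for a_k, z_k in zip(a, z))
        if freq is not None:
            e += math.log1p(freq[i][j])
        scores.append(e)
    # Numerically stable softmax over the neighborhood of node i.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return {j: x / total for j, x in zip(neighbors[i], exps)}


# Toy example: three road segments with 2-dimensional features.
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [1, 2]}
W = [[0.5, -0.2, 0.1, 0.3], [0.2, 0.4, -0.3, 0.1]]
a = [1.0, -1.0]
freq = [[0, 9, 0], [0, 0, 0], [0, 0, 0]]  # segment 0 is often followed by segment 1

plain = gatv2_attention(h, neighbors, 0, W, a)
biased = gatv2_attention(h, neighbors, 0, W, a, freq)
```

In this toy setup the frequency bias shifts attention from segment 2 toward segment 1, which is the kind of effect an interaction-aware attention score is meant to produce.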
3.2. Traffic Congestion-Aware Trajectory Encoder
3.2.1. Trajectory Cycle Time Refinement Module
3.2.2. Adaptive Shared Attention Module
- Enhanced Expressive Power: Each head can adaptively utilize the shared matrix to capture feature information across different dimensions, enabling richer representation within the attention mechanism.
- Reduced Parameter Redundancy: By sharing the projection matrix across heads, the total number of model parameters is significantly reduced, which improves computational efficiency.
- Flexible Head Configuration: This mechanism allows for head dimensions to be adjusted as needed, enabling the attention mechanism to flexibly adapt to varying levels of complexity.
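A minimal sketch of how heads could share projections, assuming a formulation close to collaborative multi-head attention (Cordonnier et al.): all heads use the same query and key projections, and each head reweights the shared dimensions with its own row of a mixing matrix. The names `shared_head_attention` and `mixing`, the scaling by the square root of the shared dimension, and the toy values are assumptions; the exact HSSTA parameterization may differ.

```python
import math


def softmax(row):
    top = max(row)
    exps = [math.exp(x - top) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def shared_head_attention(Q, K, mixing):
    """Return one attention matrix per head.

    Q, K: shared query/key projections, each a list of n vectors of size d.
    mixing: one length-d weight vector per head (the assumed mixing matrix).
    Head h scores a pair (i, j) as sum_d Q[i][d] * mixing[h][d] * K[j][d].
    """
    d = len(Q[0])
    scale = math.sqrt(d)
    heads = []
    for m_h in mixing:
        attn = []
        for q in Q:
            scores = [
                sum(qd * md * kd for qd, md, kd in zip(q, m_h, k)) / scale
                for k in K
            ]
            attn.append(softmax(scores))
        heads.append(attn)
    return heads


# Two positions, shared dimension 4, two heads.
Q = [[1.0, 0.0, 2.0, 0.0], [0.0, 1.0, 0.0, 2.0]]
K = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
mixing = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
heads = shared_head_attention(Q, K, mixing)
```

With mixing rows that select disjoint blocks of dimensions, as in this toy example, each head looks only at its own slice, which resembles ordinary multi-head attention up to the scaling factor. Learned, overlapping rows let heads reuse the same projected dimensions instead of duplicating them.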
3.2.3. Feedforward Network Layer
3.3. Self-Supervised Pre-Training Tasks
3.3.1. Adaptive Masked Trajectory Reconstruction Task
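One way to picture the adaptive masking described in the Introduction is a schedule that raises the masking ratio as pre-training proceeds and hides contiguous subsequences of the trajectory. The sketch below assumes a linear schedule; the start and end ratios, the span length, the `MASK` placeholder, and the function names are hypothetical choices, not values reported for GMTP.

```python
import random

MASK = -1  # hypothetical placeholder ID for a masked road segment


def mask_ratio(epoch, total_epochs, start=0.15, end=0.40):
    """Linearly increase the masking ratio from start to end (assumed values)."""
    if total_epochs <= 1:
        return end
    frac = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return start + frac * (end - start)


def mask_subsequences(path, ratio, span=2, rng=random):
    """Hide contiguous spans until about ratio * len(path) positions are masked.

    Returns the masked copy of the path and the sorted masked positions.
    """
    masked = list(path)
    if not path:
        return masked, []
    target = min(len(path), max(1, round(ratio * len(path))))
    hidden = set()
    while len(hidden) < target:
        start = rng.randrange(len(path))
        for pos in range(start, min(start + span, len(path))):
            if len(hidden) < target:
                hidden.add(pos)
                masked[pos] = MASK
    return masked, sorted(hidden)


# Example: the ratio grows over ten epochs; mask a path at a fixed ratio.
ratios = [mask_ratio(e, 10) for e in range(10)]
path = list(range(100, 110))
masked_path, positions = mask_subsequences(path, 0.4, rng=random.Random(0))
```

The reconstruction objective would then ask the model to predict the original segment IDs at `positions`. Raising the ratio makes that prediction progressively harder.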
3.3.2. Trajectory Contrastive Learning Task
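As a hedged sketch of a contrastive objective of this kind, the code below implements the widely used InfoNCE loss over two augmented views of a batch of trajectories: the embedding of view A of trajectory i should be closer to view B of the same trajectory than to view B of any other trajectory. The use of cosine similarity, the temperature value, and the function names are assumptions; the exact loss used by GMTP is not reproduced here.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(view_a, view_b, temperature=0.05):
    """Mean InfoNCE loss; view_a[i] and view_b[i] embed two views of trajectory i."""
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # log-sum-exp with the maximum subtracted for numerical stability
        top = max(logits)
        log_denom = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings: matching views give a low loss, mismatched views a high one.
emb = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
loss_matched = info_nce(emb, emb)
loss_mismatched = info_nce(emb, swapped)
```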
4. Results
4.1. Dataset Introduction
- The set of vertices, in which each road segment of the network is represented as a vertex.
- The set of edges E and the binary adjacency matrix A, which are defined by connecting each pair of road segments that share a direct link. Each such connection is represented as an edge between the two corresponding vertices.
- A feature matrix that includes four key road characteristics: road type, length, lane count, and maximum speed limit. For each road segment, the in-degree and out-degree are also computed from the adjacency matrix A to complete the road attributes. The constructed directed road network is then used as input for the GMTP model.
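The graph construction listed above can be sketched in a few lines. The example builds a binary adjacency matrix from directed links between road segments, derives the in-degree and out-degree of each segment, and appends them to the four road characteristics. The function name, the numeric encoding of road type, and the toy values are assumptions for illustration only.

```python
def build_road_graph(num_segments, links):
    """Return (A, in_degree, out_degree) for a directed road network.

    links: iterable of (src, dst) pairs, meaning segment dst can be
    entered directly from segment src.
    """
    A = [[0] * num_segments for _ in range(num_segments)]
    for src, dst in links:
        A[src][dst] = 1
    out_degree = [sum(row) for row in A]
    in_degree = [sum(A[i][j] for i in range(num_segments)) for j in range(num_segments)]
    return A, in_degree, out_degree


# Three road segments and four directed links between them.
links = [(0, 1), (0, 2), (1, 2), (2, 0)]
A, in_deg, out_deg = build_road_graph(3, links)

# Hypothetical attributes: [road type code, length (m), lane count, max speed (km/h)]
road_attrs = [[1, 120.0, 2, 50], [2, 80.0, 1, 30], [1, 300.0, 3, 60]]
features = [attrs + [in_deg[i], out_deg[i]] for i, attrs in enumerate(road_attrs)]
```

Each row of `features` then describes one vertex, and `A` encodes which segments are directly linked.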
4.2. Training Details
4.3. Experimental Results and Analysis
4.4. Ablation Study
4.4.1. The Impact of Enhanced GATv2
4.4.2. The Impact of Adaptive HSSTA
4.4.3. The Impact of Self-Supervised Tasks
4.4.4. The Impact of Data Augmentation Strategies
4.4.5. Hyperparameter Sensitivity Analysis
4.4.6. Performance and Computational Cost at Different Embedding Dimensions
5. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Kong, X.; Li, M.; Ma, K.; Tian, K.; Wang, M.; Ning, Z.; Xia, F. Big trajectory data: A survey of applications and services. IEEE Access 2018, 6, 58295–58306.
- Yue, Y.; Zhuang, Y.; Li, Q.; Mao, Q. Mining time-dependent attractive areas and movement patterns from taxi trajectory data. In Proceedings of the 2009 17th International Conference on Geoinformatics, Fairfax, VA, USA, 12–14 August 2009; pp. 1–6.
- Fang, Z.; Pan, L.; Chen, L.; Du, Y.; Gao, Y. MDTP: A multi-source deep traffic prediction framework over spatiotemporal trajectory data. Proc. VLDB Endow. 2021, 14, 1289–1297.
- Wang, J.; Wu, N.; Zhao, W.X.; Peng, F.; Lin, X. Empowering A* search algorithms with neural networks for personalized route recommendation. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 539–547.
- Wang, J.; Jiang, J.; Jiang, W.; Li, C.; Zhao, W.X. LibCity: An open library for traffic prediction. In Proceedings of the 29th International Conference on Advances in Geographic Information Systems, Beijing, China, 2–5 November 2021; pp. 145–148.
- Ji, J.; Wang, J.; Jiang, Z.; Jiang, J.; Zhang, H. STDEN: Towards physics-guided neural networks for traffic flow prediction. In Proceedings of the AAAI Conference on Artificial Intelligence, Philadelphia, PA, USA, 27 February–2 March 2022; pp. 4048–4056.
- Wang, J.; Lin, X.; Zuo, Y.; Wu, J. DGeye: Probabilistic risk perception and prediction for urban dangerous goods management. ACM Trans. Inf. Syst. 2021, 39, 28:1–28:30.
- Li, G.; Hung, C.; Liu, M.; Pan, L.; Peng, W.; Chan, S.G. Spatial temporal similarity for trajectories with location noise and sporadic sampling. In Proceedings of the 37th IEEE International Conference on Data Engineering (ICDE), Chania, Greece, 19–22 April 2021; pp. 1224–1235.
- Amirian, P.; Basiri, A.; Morley, J. Predictive analytics for enhancing travel time estimation in navigation apps of Apple, Google, and Microsoft. In Proceedings of the 9th ACM SIGSPATIAL International Workshop on Computational Transportation Science, Burlingame, CA, USA, 31 October 2016; pp. 31–36.
- Zin, T.T.; Hama, H. A robust road sign recognition using segmentation with morphology and relative color. J. Inst. Image Inf. Telev. Eng. 2005, 59, 1333–1342.
- Stessens, P.; Khan, A.Z.; Huysmans, M.; Canters, F. Analysing urban green space accessibility and quality: A GIS-based model as spatial decision support for urban ecosystem services in Brussels. Ecosyst. Serv. 2017, 28, 328–340.
- Yildirimoglu, M.; Geroliminis, N. Experienced travel time prediction for congested freeways. Transp. Res. Part B Methodol. 2013, 53, 45–63.
- Liu, Y.; Jia, R.; Ye, J.; Qu, X. How machine learning informs ride-hailing services: A survey. Commun. Transp. Res. 2022, 2, 100075.
- Simroth, A.; Zähle, H. Travel time prediction using floating car data applied to logistics planning. IEEE Trans. Intell. Transp. Syst. 2010, 12, 243–253.
- Carrion, C.; Levinson, D. Value of travel time reliability: A review of current evidence. Transp. Res. Part A Policy Pract. 2012, 46, 720–741.
- Wang, H.; Tang, X.; Kuo, Y.H.; Kifer, D.; Li, Z. A simple baseline for travel time estimation using large-scale trip data. ACM Trans. Intell. Syst. Technol. (TIST) 2019, 10, 1–22.
- Wang, Y.; Zheng, Y.; Xue, Y. Travel time estimation of a path using sparse trajectories. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 24–27 August 2014; pp. 25–34.
- Li, Y.; Fu, K.; Wang, Z.; Shahabi, C.; Ye, J.; Liu, Y. Multi-task representation learning for travel time estimation. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; pp. 1695–1704.
- Wang, D.; Zhang, J.; Cao, W.; Li, J.; Zheng, Y. When will you arrive? Estimating travel time based on deep neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32.
- Fang, X.; Huang, J.; Wang, F.; Zeng, L.; Liang, H.; Wang, H. ConSTGAT: Contextual spatial-temporal graph attention network for travel time estimation at Baidu Maps. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual, 6–10 July 2020; pp. 2697–2705.
- Wang, Z.; Fu, K.; Ye, J. Learning to estimate the travel time. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; pp. 858–866.
- Cheng, H.T.; Koc, L.; Harmsen, J.; Shaked, T.; Chandra, T.; Aradhye, H.; Anderson, G.; Corrado, G.; Chai, W.; Ispir, M.; et al. Wide & deep learning for recommender systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems, Boston, MA, USA, 15 September 2016; pp. 7–10.
- Jiang, J.; Pan, D.; Ren, H.; Jiang, X.; Li, C.; Wang, J. Self-supervised trajectory representation learning with temporal regularities and travel semantics. In Proceedings of the 2023 IEEE 39th International Conference on Data Engineering (ICDE), Anaheim, CA, USA, 3–7 April 2023; pp. 843–855.
- Brody, S.; Alon, U.; Yahav, E. How attentive are graph attention networks? arXiv 2021, arXiv:2105.14491.
- Devlin, J. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
- Yu, R.; Li, Y.; Shahabi, C.; Demiryurek, U.; Liu, Y. Deep learning: A generic approach for extreme condition traffic forecasting. In Proceedings of the 2017 SIAM International Conference on Data Mining, Houston, TX, USA, 27–29 April 2017; pp. 777–785.
- Wu, Z.; Pan, S.; Long, G.; Jiang, J.; Zhang, C. Graph WaveNet for deep spatial-temporal graph modeling. arXiv 2019, arXiv:1906.00121.
- Zheng, C.; Fan, X.; Wang, C.; Qi, J. GMAN: A graph multi-attention network for traffic prediction. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; pp. 1234–1241.
- Wang, X.; Ma, Y.; Wang, Y.; Jin, W.; Wang, X.; Tang, J.; Jia, C.; Yu, J. Traffic flow prediction via spatial temporal graph neural network. In Proceedings of the Web Conference 2020, Taipei, Taiwan, 20–24 April 2020; pp. 1082–1092.
- Jin, G.; Wang, M.; Zhang, J.; Sha, H.; Huang, J. STGNN-TTE: Travel time estimation via spatial–temporal graph neural network. Future Gener. Comput. Syst. 2022, 126, 70–81.
- Guo, S.; Lin, Y.; Feng, N.; Song, C.; Wan, H. Attention based spatial-temporal graph convolutional networks for traffic flow forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 922–929.
- Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
- Raffel, C.; Shazeer, N.; Roberts, A.; Lee, K.; Narang, S.; Matena, M.; Zhou, Y.; Li, W.; Liu, P.J. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. 2020, 21, 1–67.
- Lewis, M. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv 2019, arXiv:1910.13461.
- Liu, Y. RoBERTa: A robustly optimized BERT pretraining approach. arXiv 2019, arXiv:1907.11692.
- Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language models are unsupervised multitask learners. OpenAI Blog 2019, 1, 9.
- Touvron, H.; Martin, L.; Stone, K.; Albert, P.; Almahairi, A.; Babaei, Y.; Bashlykov, N.; Batra, S.; Bhargava, P.; Bhosale, S.; et al. Llama 2: Open foundation and fine-tuned chat models. arXiv 2023, arXiv:2307.09288.
- Almazrouei, E.; Alobeidli, H.; Alshamsi, A.; Cappelli, A.; Cojocaru, R.; Debbah, M.; Goffinet, É.; Hesslow, D.; Launay, J.; Malartic, Q.; et al. The Falcon series of open language models. arXiv 2023, arXiv:2311.16867.
- Chen, Y.; Li, X.; Cong, G.; Bao, Z.; Long, C.; Liu, Y.; Chandran, A.K.; Ellison, R. Robust road network representation learning: When traffic patterns meet traveling semantics. In Proceedings of the CIKM ’21: The 30th ACM International Conference on Information and Knowledge Management, Online, 1–5 November 2021; pp. 211–220.
- Hadsell, R.; Chopra, S.; LeCun, Y. Dimensionality reduction by learning an invariant mapping. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR’06, New York, NY, USA, 17–22 June 2006; Volume 2, pp. 1735–1742.
- He, K.; Fan, H.; Wu, Y.; Xie, S.; Girshick, R. Momentum contrast for unsupervised visual representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 9729–9738.
- Gao, T.; Yao, X.; Chen, D. SimCSE: Simple contrastive learning of sentence embeddings. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Punta Cana, Dominican Republic, 7–11 November 2021; pp. 6894–6910.
- Van Den Oord, A.; Li, Y.Z.; Vinyals, O. Representation Learning with Contrastive Predictive Coding [Online]. Available online: https://arxiv.org/abs/1807.03748 (accessed on 22 January 2019).
- Khosla, P.; Teterwak, P.; Wang, C.; Sarna, A.; Tian, Y.; Isola, P.; Maschinot, A.; Liu, C.; Krishnan, D. Supervised contrastive learning. In Proceedings of the Advances in Neural Information Processing Systems, Online, 6–12 December 2020; Volume 33, pp. 18661–18673.
- Hochreiter, S. Long Short-term Memory. In Neural Computation; MIT Press: Cambridge, MA, USA, 1997.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 2017; pp. 5998–6008.
- Bhat, M.; Francis, J.; Oh, J. Trajformer: Trajectory prediction with local self-attentive contexts for autonomous driving. arXiv 2020, arXiv:2011.14910.
- Chen, Z.; Xiao, X.; Gong, Y.J.; Fang, J.; Ma, N.; Chai, H.; Cao, Z. Interpreting trajectories from multiple views: A hierarchical self-attention network for estimating the time of arrival. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 14–18 August 2022; pp. 2771–2779.
- Cordonnier, J.B.; Loukas, A.; Jaggi, M. Multi-head attention: Collaborate instead of concatenate. arXiv 2020, arXiv:2006.16362.
- Khan, S.; Naseer, M.; Hayat, M.; Zamir, S.W.; Khan, F.S.; Shah, M. Transformers in vision: A survey. ACM Comput. Surv. (CSUR) 2022, 54, 1–41.
- Chen, T.; Kornblith, S.; Norouzi, M.; Hinton, G. A simple framework for contrastive learning of visual representations. In Proceedings of the International Conference on Machine Learning, PMLR, Virtual, 13–18 July 2020; pp. 1597–1607.
- OpenStreetMap Contributors. 2017. Available online: https://www.openstreetmap.org (accessed on 20 September 2021).








| Field | Description | 
|---|---|
| id | Unique trajectory ID | 
| path | List of road segment IDs | 
| tlist | List of timestamps (UTC) | 
| usr_id | User ID | 
| traj_id | Original trajectory ID | 
| start_time | Travel start time | 
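
To make the record layout in the table above concrete, the sketch below models one trajectory as a Python dataclass and derives its travel time from the first and last timestamps. The field types, the assumption that `tlist` holds one Unix timestamp in seconds per road segment in `path`, and the helper `travel_time_minutes` are illustrative choices, not part of the dataset specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Trajectory:
    id: int            # unique trajectory ID
    path: List[int]    # road segment IDs in visiting order
    tlist: List[int]   # timestamps (UTC); assumed Unix seconds, one per segment
    usr_id: int        # user ID
    traj_id: int       # original trajectory ID
    start_time: int    # travel start time; assumed Unix seconds

    def __post_init__(self):
        # Assumption: every road segment in the path has exactly one timestamp.
        if len(self.path) != len(self.tlist):
            raise ValueError("path and tlist must have the same length")

    def travel_time_minutes(self) -> float:
        """Elapsed time between the first and last timestamp, in minutes."""
        return (self.tlist[-1] - self.tlist[0]) / 60.0


# Hypothetical record with three road segments.
trip = Trajectory(
    id=1,
    path=[10, 11, 12],
    tlist=[1600000000, 1600000060, 1600000180],
    usr_id=7,
    traj_id=42,
    start_time=1600000000,
)
minutes = trip.travel_time_minutes()
```

A value such as `minutes` is the kind of target a travel time prediction model is trained to estimate from `path` and the departure time.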

| Models | Porto MAE ↓ | Porto MAPE ↓ | Porto RMSE ↓ | Beijing MAE ↓ | Beijing MAPE ↓ | Beijing RMSE ↓ |
|---|---|---|---|---|---|---|
| Traj2vec | 1.55 | 23.70 | 2.35 | 10.13 | 37.95 | 56.83 |
| Transformer | 1.74 | 25.72 | 2.64 | 10.74 | 39.61 | 57.16 |
| BERT | 1.59 | 24.63 | 2.29 | 10.21 | 37.31 | 37.31 |
| PIM | 1.56 | 24.68 | 2.34 | 10.19 | 39.04 | 57.73 |
| START | 1.33 | 20.66 | 2.00 | 9.134 | 30.92 | 35.40 |
| GMTP | 1.26 | 19.01 | 1.99 | 9.010 | 30.61 | 34.22 |

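
The three error metrics reported in the table above have standard definitions, sketched below for reference; lower values are better, as the arrows indicate. The function names and the sample values are assumptions, and MAPE is expressed in percent here, which matches the magnitude of the reported numbers but is not stated explicitly in the table.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent; y_true must be non-zero."""
    return 100.0 * sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical travel times (true vs. predicted), in the same unit.
y_true = [10.0, 20.0]
y_pred = [12.0, 16.0]
scores = {"MAE": mae(y_true, y_pred), "MAPE": mape(y_true, y_pred), "RMSE": rmse(y_true, y_pred)}
```

RMSE penalizes large individual errors more strongly than MAE, so reporting both gives a fuller picture of how prediction errors are distributed.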
| Params () | Time (h) | |||||
|---|---|---|---|---|---|---|
| MHA | HSSTA | MHA | HSSTA | MHA | HSSTA | |
| 256 | 0.71 | 0.75 | 80.1 | 81.9 | 13.0 | 14.3 | 
| 128 | 0.69 | 0.72 | 77.2 | 77.5 | 12.6 | 13.8 | 
| 64 | 0.64 | 0.71 | 74.6 | 74.7 | 11.5 | 12.9 | 
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Liu, T.; Liu, Y. GMTP: Enhanced Travel Time Prediction with Graph Attention Network and BERT Integration. AI 2024, 5, 2926-2944. https://doi.org/10.3390/ai5040141