PathGen-LLM: A Large Language Model for Dynamic Path Generation in Complex Transportation Networks
Abstract
1. Introduction
- Leveraging the structural similarity between path data and natural language, we introduce a “path-to-text” mapping scheme that applies LLMs’ semantic modeling capabilities to spatiotemporal sequence analysis, yielding a path generation framework for large-scale complex networks.
- PathGen-LLM does not require historical data for specific OD pairs to be present in the training set. Instead, it captures universal travel patterns through self-supervised learning from global path corpora that comprehensively cover the network structure, enabling generalization to unseen OD pairs.
- Integrating rotary position embedding (RoPE), grouped-query attention (GQA), and a dynamic constrained-decoding mechanism, PathGen-LLM generates the spatial and temporal components of a path jointly: generated paths comply with network traversal rules and match the statistical characteristics of historical passage times.
- Our study validates the path generation capability of PathGen-LLM in ultra-large-scale scenarios using real-world traffic-network and travel data from Beijing.
2. Related Work
2.1. Traditional Path Generation Models
2.2. Deep Learning-Based Path Generation Models
2.3. Large Language Models in Transportation
3. Problem Formulation
4. Methodology
- Path-to-Text Data Construction: converts path data into natural-language-like sequences using standardized tokens for zones, nodes, and timestamps, enabling LLM processing.
- Graph-Adaptive Transformer Architecture: a decoder-only Transformer integrating RoPE, grouped-query attention, and FlashAttention, optimized for modeling spatiotemporal dependencies in complex networks.
- Two-Stage Training Pipeline: combines self-supervised pretraining with task-specific fine-tuning. Pretraining uses next-token prediction to learn general network representations, while fine-tuning on instruction–data pairs adapts the model to generate paths for specific OD pairs.
- Integrated Inference Framework: a hybrid decoding strategy combining reachability-constrained token generation, cyclic-path validation, and computational caching to ensure validity and efficiency in large-scale networks.
4.1. Path-to-Text Data Construction
4.1.1. Mapping Paths to Natural Language Sequences
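To make the mapping concrete, the sketch below serializes one path (ordered nodes plus timestamps, wrapped by origin/destination zone tokens) into a whitespace-delimited token string. The token format (`<Z_*>`, `<N_*>`, `<T_*>`, 15 min time buckets) is an illustrative assumption, not the paper’s exact scheme.

```python
# Illustrative path-to-text serialization; the token format is assumed,
# not the authors' scheme.
from dataclasses import dataclass

@dataclass
class PathRecord:
    origin_zone: int          # traffic-zone ID of the trip origin
    dest_zone: int            # traffic-zone ID of the trip destination
    nodes: list[int]          # ordered node IDs along the path
    timestamps: list[float]   # Unix time (s) of arrival at each node

def time_bucket(ts: float, bucket_min: int = 15) -> str:
    """Map a timestamp to a coarse time-of-day token, e.g. <T_0830>."""
    minutes = int(ts // 60) % (24 * 60)
    b = (minutes // bucket_min) * bucket_min
    return f"<T_{b // 60:02d}{b % 60:02d}>"

def path_to_text(p: PathRecord) -> str:
    """Serialize one path into a whitespace-delimited token sequence."""
    tokens = ["<BOS>", f"<Z_{p.origin_zone}>", f"<Z_{p.dest_zone}>"]
    for node, ts in zip(p.nodes, p.timestamps):
        tokens += [f"<N_{node}>", time_bucket(ts)]
    tokens.append("<EOS>")
    return " ".join(tokens)
```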
4.1.2. Vocabulary Construction
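Because zones, nodes, and time buckets each receive dedicated tokens, the vocabulary can be enumerated directly from the network rather than learned from text. A minimal sketch under the same assumed token format:

```python
def build_vocab(zone_ids, node_ids, bucket_min=15):
    """Enumerate a closed vocabulary from the network: special tokens,
    one token per zone, one per node, one per time-of-day bucket."""
    specials = ["<PAD>", "<BOS>", "<EOS>", "<UNK>"]
    zones = [f"<Z_{z}>" for z in sorted(zone_ids)]
    nodes = [f"<N_{n}>" for n in sorted(node_ids)]
    buckets = [f"<T_{m // 60:02d}{m % 60:02d}>"
               for m in range(0, 24 * 60, bucket_min)]
    vocab = specials + zones + nodes + buckets
    return {tok: i for i, tok in enumerate(vocab)}
```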
4.2. Model Architecture
4.2.1. Hierarchical Transformer
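RoPE, one of the components listed above, injects position by rotating query/key channel pairs through position-dependent angles, so attention scores depend on relative offsets between path tokens. A NumPy sketch of the standard formulation (not the authors’ implementation):

```python
import numpy as np

def rope(x: np.ndarray, base: float = 10000.0) -> np.ndarray:
    """Apply rotary position embedding to x of shape (seq_len, dim).
    Channel pairs (2i, 2i+1) are rotated by angle pos * base**(-2i/dim).
    Assumes dim is even."""
    seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) * 2.0 / dim)          # (half,)
    angles = np.arange(seq_len)[:, None] * freqs[None, :]   # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```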
4.2.2. Activation Functions and Computational Efficiency Optimization
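Grouped-query attention, cited above as an efficiency component, shares each key/value head across a group of query heads, shrinking the KV cache relative to full multi-head attention. A shape-level NumPy sketch (head counts and the causal mask are illustrative):

```python
import numpy as np

def grouped_query_attention(q, k, v, n_kv_heads):
    """q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d).
    Each KV head serves n_q_heads // n_kv_heads query heads."""
    n_q_heads, seq, d = q.shape
    group = n_q_heads // n_kv_heads
    k = np.repeat(k, group, axis=0)   # broadcast KV heads to query heads
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    # causal mask: each token attends only to its prefix
    mask = np.triu(np.ones((seq, seq), dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```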
4.3. Two-Stage Training Pipeline
4.3.1. Pretraining
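Pretraining is standard next-token prediction over the serialized path corpus: shift each token sequence by one position and minimize cross-entropy. A PyTorch-style sketch, assuming `model(inputs)` returns per-position logits (the interface is illustrative):

```python
import torch
import torch.nn.functional as F

def pretrain_step(model, batch, optimizer, pad_id=0):
    """One self-supervised step: predict token t+1 from tokens <= t."""
    tokens = batch["input_ids"]            # (B, L) serialized path tokens
    inputs, targets = tokens[:, :-1], tokens[:, 1:]
    logits = model(inputs)                 # (B, L-1, vocab), assumed interface
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
        ignore_index=pad_id,               # padding does not contribute
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```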
4.3.2. Fine-Tuning
Listing 1. An example of instruction–response pairs for the OD path generation task.
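The body of Listing 1 is not reproduced here; the pair below is a hypothetical reconstruction of its shape, using the assumed token format from Section 4.1 (field names and tokens are illustrative, not the authors’ exact listing):

```python
# Hypothetical instruction–response pair; format and tokens are assumptions.
example = {
    "instruction": "Generate a path departing at <T_0830> "
                   "from zone <Z_12> to zone <Z_87>.",
    "response": "<BOS> <Z_12> <Z_87> <N_1042> <T_0830> "
                "<N_1107> <T_0845> <N_1203> <T_0900> <EOS>",
}
```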
4.4. Inference Framework
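One way to realize the reachability constraint described above is to mask, at each decoding step, the logits of every node token that is not a network neighbor of the last generated node, and to reject already-visited nodes to prevent cycles. A sketch under the assumed token scheme (helper names are illustrative):

```python
import numpy as np

def constrained_next_node(logits, current_node, visited, adjacency,
                          node_token_id):
    """Pick the next node token under reachability and acyclicity constraints.

    logits: (vocab,) float scores from the model for the next token
    adjacency: dict node -> set of neighbor nodes in the network
    node_token_id: dict node -> vocab index of its <N_*> token
    """
    allowed = [node_token_id[n] for n in adjacency[current_node]
               if n not in visited]        # reachable and cycle-free
    if not allowed:
        return None                        # dead end: caller re-decodes
    masked = np.full_like(logits, -np.inf)
    masked[allowed] = logits[allowed]
    return int(masked.argmax())            # greedy; sampling also works
```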
5. Experiments and Results
5.1. Dataset
5.2. Baselines
5.3. Experiment Settings
5.4. Evaluation Metrics
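The paper’s exact metric definitions (including NTA) are not restated here; a common convention, assumed in the sketch below, scores generated paths by link-set overlap (precision/recall/F1) and by travel-time error (MAE/RMSE):

```python
def path_prf(pred_links, true_links):
    """Link-overlap precision/recall/F1 between a generated and a true path."""
    pred, true = set(pred_links), set(true_links)
    overlap = len(pred & true)
    precision = overlap / len(pred) if pred else 0.0
    recall = overlap / len(true) if true else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def travel_time_errors(pred_times, true_times):
    """MAE and RMSE (seconds) over predicted vs. observed travel times."""
    errs = [p - t for p, t in zip(pred_times, true_times)]
    mae = sum(abs(e) for e in errs) / len(errs)
    rmse = (sum(e * e for e in errs) / len(errs)) ** 0.5
    return mae, rmse
```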
5.5. Model Performance Comparison
5.6. Ablation Studies
5.7. Hyperparameter Sensitivity Analysis
5.8. Inference Efficiency
5.9. Discussion
5.9.1. LLM Scale and Performance Trade-Offs
5.9.2. Model Interpretability and Behavioral Validity
5.9.3. Data Dependency and Generalization Limitations
6. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
| Statistic | BJ-Taxi | BJ-Subway | Porto |
|---|---|---|---|
| Number of Traffic Zones | 952 | 361 | 5165 |
| Number of Links | 87,526 | 1939 | 11,024 |
| Number of Nodes | 38,278 | 522 | 5165 |
| Number of Paths | 386,178 | 910,728 | 481,359 |
| Average Path Length (km) | 8.03 | 6.96 | 4.02 |
| Average Number of Nodes per Path | 29.76 | 8.74 | 40.21 |
| Number of Tokens | 422,632,313 | 280,162,316 | 723,728,671 |
| Category | Parameter | PathGen-LLM-1B | PathGen-LLM-7B |
|---|---|---|---|
| Model Architecture | Layers | 28 | 28 |
| | Hidden size | 1536 | 3584 |
| | Attention heads | 12 | 16 |
| | Number of parameters | 1.3B | 7B |
| Training Parameters | Pretraining learning rate | 1.00 | 1.00 |
| | Fine-tuning learning rate | 1.00 | 1.00 |
| | Pretraining batch size | 64 | 64 |
| | Fine-tuning batch size | 128 | 128 |
| | Pretraining epochs | 1 | 1 |
| | Fine-tuning epochs | 3 | 3 |
| | Optimizer | AdamW | AdamW |
| | Optimizer hyperparameters | β₁, β₂ | β₁, β₂ |
| | Mixed precision training | BF16 | BF16 |
| Dataset | Model | NTA (%) | Precision (%) | Recall (%) | F1 (%) | MAE (s) | RMSE (s) |
|---|---|---|---|---|---|---|---|
| BJ-Taxi | Dijkstra | - | 66.55 | 61.03 | 63.67 | 147.82 | 223.67 |
| | NeuroMLR | - | 76.87 | 74.81 | 75.83 | 127.73 | 195.69 |
| | HOSER | - | 78.15 | 73.76 | 75.89 | 137.69 | 203.56 |
| | PathGen-LLM-1B | 95.93 | 79.11 | 83.86 | 81.42 | 88.57 | 136.83 |
| | PathGen-LLM-7B | 96.72 | 80.92 | 85.32 | 83.06 | 87.31 | 127.97 |
| BJ-Subway | Dijkstra | - | 87.88 | 83.79 | 85.79 | 116.85 | 172.38 |
| | NeuroMLR | - | 89.91 | 87.73 | 88.81 | 106.91 | 157.19 |
| | HOSER | - | 90.87 | 89.37 | 90.11 | 101.92 | 148.57 |
| | PathGen-LLM-1B | 98.66 | 90.79 | 91.52 | 91.15 | 93.96 | 129.68 |
| | PathGen-LLM-7B | 98.91 | 91.63 | 92.22 | 91.92 | 93.55 | 126.85 |
| Porto | Dijkstra | - | 64.34 | 51.98 | 57.50 | 293.63 | 513.61 |
| | NeuroMLR | - | 75.91 | 70.84 | 73.29 | 197.80 | 365.97 |
| | HOSER | - | 79.42 | 77.17 | 78.28 | 209.74 | 358.66 |
| | PathGen-LLM-1B | 94.63 | 77.05 | 80.33 | 78.66 | 143.18 | 207.81 |
| | PathGen-LLM-7B | 95.72 | 79.04 | 80.78 | 79.90 | 141.75 | 197.59 |
| Model | NTA (%) | Precision (%) | Recall (%) | F1 (%) | MAE (s) | RMSE (s) |
|---|---|---|---|---|---|---|
| PathGen-LLM-1B | 95.93 | 79.11 | 83.86 | 81.42 | 88.57 | 137.83 |
| w/o special tokens | 92.46 | 76.70 | 80.16 | 78.39 | 93.03 | 142.39 |
| Alternate temporal encoding | 95.68 | 78.51 | 83.04 | 80.71 | 95.87 | 149.16 |
| w/o pretraining | 86.42 | 71.09 | 72.35 | 71.72 | 137.82 | 191.46 |
| Dataset | Dijkstra | NeuroMLR | HOSER | PathGen-LLM-1B | PathGen-LLM-7B |
|---|---|---|---|---|---|
| BJ-Taxi | 0.11 | 0.92 | 1.76 | 0.98 | 13.83 |
| BJ-Subway | 0.01 | 0.21 | 0.33 | 0.29 | 2.59 |
| Porto | 0.13 | 1.08 | 1.97 | 1.21 | 15.87 |