Atten-LTC-Enhanced MoE Model for Agent Trajectory Prediction in Autonomous Driving
Abstract
1. Introduction
- A spatio-temporal attention-enhanced encoder–decoder with a Liquid Time-Constant (LTC) network is designed. To capture the agent's long-term temporal dependencies, the agent's dynamic behavior is extracted from its historical trajectory, and the spatial interactions of surrounding agents are incorporated to further improve prediction accuracy.
- Improved computational efficiency. Compared with existing state-of-the-art models, our model achieves higher prediction accuracy at a smaller parameter scale. The effectiveness of the Atten-LTC-MoE model is verified by extensive experiments on the Argoverse [12] and Interaction [13] datasets, demonstrating its suitability for real-time deployment in both single-agent and multi-agent trajectory prediction.
- By using vectorized data representation, feature fusion, and a prediction method based on multi-input encoders, the proposed approach achieves better expressiveness and uncertainty control when modeling multimodal trajectory distributions and dynamic interaction relationships in complex urban intersection scenes.
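The LTC dynamics behind the first contribution can be illustrated with a minimal sketch. This is not the authors' implementation; it follows the fused semi-implicit Euler update from the Liquid Time-Constant paper by Hasani et al., with a sigmoid nonlinearity chosen here (an assumption) so the denominator stays positive. All shapes and parameter names (`W`, `U`, `b`, `A`, `tau`) are illustrative.

```python
import numpy as np

def ltc_step(x, I, W, U, b, A, tau, dt=0.1):
    """One fused semi-implicit Euler step of a Liquid Time-Constant cell.

    x: (H,) hidden state, I: (D,) input at this time step.
    The input-dependent gate f modulates both the effective time
    constant and the attractor A, which is what makes the cell "liquid".
    """
    # Sigmoid keeps f in (0, 1), so 1 + dt*(1/tau + f) > 0 (stability
    # assumption for this sketch; the paper uses a bounded nonlinearity).
    f = 1.0 / (1.0 + np.exp(-(W @ x + U @ I + b)))
    # Fused update: x(t+dt) = (x + dt*f*A) / (1 + dt*(1/tau + f))
    return (x + dt * f * A) / (1.0 + dt * (1.0 / tau + f))
```

Rolling this step over the agent's history yields the temporal features that the attention layers then fuse with surrounding-agent interactions.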
2. Related Work
2.1. Agent Trajectory Prediction
2.2. Model-Based LSTM and Attention Mechanism in Trajectory Prediction
2.3. Liquid Time-Constant Networks
2.4. Mixture of Expert Methods
3. Proposed Method
3.1. Overall Framework
3.2. Lane and Agent Vectorization Encoding
3.3. Feature Fusion for Lane and Agent
3.4. Trajectory Prediction and Generation
3.4.1. Atten-LTC-MoE Based Endpoint Prediction
3.4.2. Trajectory Generation
4. Experiments
4.1. Experimental Setup
4.1.1. Dataset Specifications
4.1.2. Evaluation Metrics
4.1.3. Model Configurations and Training
4.2. Ablation Study
4.3. Results and Analysis
4.3.1. Comparison with State-of-the-Art Models
4.3.2. Qualitative Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Bahram, M.; Hubmann, C.; Lawitzky, A.; Aeberhard, M.; Wollherr, D. A combined model-and learning-based framework for interaction-aware maneuver prediction. IEEE Trans. Intell. Transp. Syst. 2016, 17, 1538–1550. [Google Scholar] [CrossRef]
- Li, J.; Yang, F.; Tomizuka, M.; Choi, C. Evolvegraph: Multi-agent trajectory prediction with dynamic relational reasoning. Adv. Neural Inf. Process. Syst. 2020, 33, 19783–19794. [Google Scholar]
- Liu, C.; Lee, S.; Varnhagen, S.; Tseng, H.E. Path planning for autonomous vehicles using model predictive control. In Proceedings of the 2017 IEEE Intelligent Vehicles Symposium (IV), Los Angeles, CA, USA, 11–14 June 2017; pp. 174–179. [Google Scholar]
- Brudigam, T.; Olbrich, M.; Wollherr, D.; Leibold, M. Stochastic Model Predictive Control With a Safety Guarantee for Automated Driving. IEEE Trans. Intell. Veh. 2021, 8, 22–36. [Google Scholar] [CrossRef]
- Xu, S.; Peng, H. Design, analysis, and experiments of preview path tracking control for autonomous vehicles. IEEE Trans. Intell. Transp. Syst. 2019, 21, 48–58. [Google Scholar] [CrossRef]
- Biktairov, Y.; Stebelev, M.; Rudenko, I.; Shliazhko, O.; Yangel, B. Prank: Motion prediction based on ranking. Adv. Neural Inf. Process. Syst. 2020, 33, 2553–2563. [Google Scholar]
- Casas, S.; Luo, W.; Urtasun, R. Intentnet: Learning to predict intention from raw sensor data. In Proceedings of the Conference on Robot Learning, Zürich, Switzerland, 29–31 October 2018; pp. 947–956. [Google Scholar]
- Cui, H.; Radosavljevic, V.; Chou, F.-C.; Lin, T.-H.; Nguyen, T.; Huang, T.-K.; Schneider, J.; Djuric, N. Multimodal trajectory predictions for autonomous driving using deep convolutional networks. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 2090–2096. [Google Scholar]
- Djuric, N.; Radosavljevic, V.; Cui, H.; Nguyen, T.; Chou, F.-C.; Lin, T.-H.; Singh, N.; Schneider, J. Uncertainty-aware short-term motion prediction of traffic actors for autonomous driving. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Snowmass Village, CO, USA, 1–5 March 2020; pp. 2095–2104. [Google Scholar]
- Zeng, W.; Liang, M.; Liao, R.; Urtasun, R. Lanercnn: Distributed representations for graph-centric motion forecasting. In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 27 September–1 October 2021; pp. 532–539. [Google Scholar]
- Lechner, M.; Hasani, R.; Amini, A.; Henzinger, T.A.; Rus, D.; Grosu, R. Neural circuit policies enabling auditable autonomy. Nat. Mach. Intell. 2020, 2, 642–652. [Google Scholar] [CrossRef]
- Chang, M.-F.; Lambert, J.; Sangkloy, P.; Singh, J.; Bak, S.; Hartnett, A.; Wang, D.; Carr, P.; Lucey, S.; Ramanan, D. Argoverse: 3d tracking and forecasting with rich maps. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 8748–8757. [Google Scholar]
- Zhan, W.; Sun, L.; Wang, D.; Shi, H.; Clausse, A.; Naumann, M.; Kummerle, J.; Konigshof, H.; Stiller, C.; de La Fortelle, A. Interaction dataset: An international, adversarial and cooperative motion dataset in interactive driving scenarios with semantic maps. arXiv 2019, arXiv:1910.03088. [Google Scholar] [CrossRef]
- Liu, Y.; Qi, X.; Sisbot, E.A.; Oguchi, K. Multi-agent trajectory prediction with graph attention isomorphism neural network. In Proceedings of the 2022 IEEE Intelligent Vehicles Symposium (IV), Aachen, Germany, 5–9 June 2022; pp. 273–279. [Google Scholar]
- Ivanovic, B.; Pavone, M. The trajectron: Probabilistic multi-agent trajectory modeling with dynamic spatiotemporal graphs. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 2375–2384. [Google Scholar]
- Alahi, A.; Goel, K.; Ramanathan, V.; Robicquet, A.; Fei-Fei, L.; Savarese, S. Social lstm: Human trajectory prediction in crowded spaces. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 961–971. [Google Scholar]
- Song, X.; Chen, K.; Li, X.; Sun, J.; Hou, B.; Cui, Y.; Zhang, B.; Xiong, G.; Wang, Z. Pedestrian trajectory prediction based on deep convolutional LSTM network. IEEE Trans. Intell. Transp. Syst. 2020, 22, 3285–3302. [Google Scholar] [CrossRef]
- Zhang, P.; Ouyang, W.; Zhang, P.; Xue, J.; Zheng, N. Sr-lstm: State refinement for lstm towards pedestrian trajectory prediction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 12085–12094. [Google Scholar]
- Hou, L.; Xin, L.; Li, S.E.; Cheng, B.; Wang, W. Interactive trajectory prediction of surrounding road users for autonomous driving using structural-LSTM network. IEEE Trans. Intell. Transp. Syst. 2019, 21, 4615–4625. [Google Scholar] [CrossRef]
- Sun, H.; Chen, R.; Liu, T.; Wang, H.; Sun, F. LG-LSTM: Modeling LSTM-based interactions for multi-agent trajectory prediction. In Proceedings of the 2022 IEEE International Conference on Multimedia and Expo (ICME), Taipei, Taiwan, 18–22 July 2022; pp. 1–6. [Google Scholar]
- Manh, H.; Alaghband, G. Scene-lstm: A model for human trajectory prediction. arXiv 2018, arXiv:1808.04018. [Google Scholar]
- Jiang, R.; Xu, H.; Gong, G.; Kuang, Y.; Liu, Z. Spatial-temporal attentive LSTM for vehicle-trajectory prediction. ISPRS Int. J. Geo-Inf. 2022, 11, 354. [Google Scholar] [CrossRef]
- Yu, J.; Zhou, M.; Wang, X.; Pu, G.; Cheng, C.; Chen, B. A dynamic and static context-aware attention network for trajectory prediction. ISPRS Int. J. Geo-Inf. 2021, 10, 336. [Google Scholar] [CrossRef]
- Hasani, R.; Lechner, M.; Amini, A.; Rus, D.; Grosu, R. Liquid time-constant networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Online, 2–9 February 2021; pp. 7657–7666. [Google Scholar]
- Kaplan, H.S.; Thula, O.S.; Khoss, N.; Zimmer, M. Nested neuronal dynamics orchestrate a behavioral hierarchy across timescales. Neuron 2020, 105, 562–576.e9. [Google Scholar] [CrossRef]
- Lu, Y.; Wang, W.; Bai, R.; Zhou, S.; Garg, L.; Bashir, A.K.; Jiang, W.; Hu, X. Hyper-relational interaction modeling in multi-modal trajectory prediction for intelligent connected vehicles in smart cites. Inf. Fusion 2025, 114, 102682. [Google Scholar] [CrossRef]
- Wu, W.; Li, Z.; Gu, Y.; Zhao, R.; He, Y.; Zhang, D.J.; Shou, M.Z.; Li, Y.; Gao, T.; Zhang, D. Draganything: Motion control for anything using entity representation. In Proceedings of the European Conference on Computer Vision, Milan, Italy, 29 September–4 October 2024; pp. 331–348. [Google Scholar]
- Vijayabaskaran, S.; Zeng, X.; Ghazinouri, B.; Wiskott, L.; Cheng, B. A taxonomy of spatial navigation in mammals: Insights from computational modeling. Neurosci. Biobehav. Rev. 2025, 176, 106282. [Google Scholar] [CrossRef] [PubMed]
- Riquelme, C.; Puigcerver, J.; Mustafa, B.; Neumann, M.; Jenatton, R.; Susano Pinto, A.; Keysers, D.; Houlsby, N. Scaling vision with sparse mixture of experts. Adv. Neural Inf. Process. Syst. 2021, 34, 8583–8595. [Google Scholar]
- Zhou, Y.; Lei, T.; Liu, H.; Du, N.; Huang, Y.; Zhao, V.; Dai, A.M.; Le, Q.V.; Laudon, J. Mixture-of-experts with expert choice routing. Adv. Neural Inf. Process. Syst. 2022, 35, 7103–7114. [Google Scholar]
- Gao, J.; Sun, C.; Zhao, H.; Shen, Y.; Anguelov, D.; Li, C.; Schmid, C. Vectornet: Encoding hd maps and agent dynamics from vectorized representation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11525–11533. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the NIPS’17: Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30. [Google Scholar]
- Hasani, R.; Lechner, M.; Amini, A.; Liebenwein, L.; Ray, A.; Tschaikowski, M.; Teschl, G.; Rus, D. Closed-form continuous-time neural networks. Nat. Mach. Intell. 2022, 4, 992–1003. [Google Scholar] [CrossRef]
- Aydemir, G.; Akan, A.K.; Güney, F. Adapt: Efficient multi-agent trajectory prediction with adaptation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 8295–8305. [Google Scholar]
- Nayakanti, N.; Al-Rfou, R.; Zhou, A.; Goel, K.; Refaat, K.S.; Sapp, B. Wayformer: Motion forecasting via simple & efficient attention networks. arXiv 2022, arXiv:2207.05844. [Google Scholar] [CrossRef]
- Liang, M.; Yang, B.; Hu, R.; Chen, Y.; Liao, R.; Feng, S.; Urtasun, R. Learning lane graph representations for motion forecasting. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 541–556. [Google Scholar]
- Liu, Y.; Zhang, J.; Fang, L.; Jiang, Q.; Zhou, B. Multimodal motion prediction with stacked transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Online, 19–25 June 2021; pp. 7577–7586. [Google Scholar]
- Gu, J.; Sun, C.; Zhao, H. Densetnt: End-to-end trajectory prediction from dense goal sets. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 15303–15312. [Google Scholar]
- Ye, M.; Cao, T.; Chen, Q. Tpcn: Temporal point cloud networks for motion forecasting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Online, 19–25 June 2021; pp. 11318–11327. [Google Scholar]
- Ngiam, J.; Caine, B.; Vasudevan, V.; Zhang, Z.; Chiang, H.-T.L.; Ling, J.; Roelofs, R.; Bewley, A.; Liu, C.; Venugopal, A. Scene transformer: A unified architecture for predicting multiple agent trajectories. arXiv 2021, arXiv:2106.08417. [Google Scholar]
- Zhou, Z.; Ye, L.; Wang, J.; Wu, K.; Lu, K. Hivt: Hierarchical vector transformer for multi-agent motion prediction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 8823–8833. [Google Scholar]
- Varadarajan, B.; Hefny, A.; Srivastava, A.; Refaat, K.S.; Nayakanti, N.; Cornman, A.; Chen, K.; Douillard, B.; Lam, C.P.; Anguelov, D. Multipath++: Efficient information fusion and trajectory aggregation for behavior prediction. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 7814–7821. [Google Scholar]
- Wang, M.; Zhu, X.; Yu, C.; Li, W.; Ma, Y.; Jin, R.; Ren, X.; Ren, D.; Wang, M.; Yang, W.J. Ganet: Goal area network for motion forecasting. arXiv 2022, arXiv:2209.09723. [Google Scholar]
- Da, F.; Zhang, Y. Path-aware graph attention for hd maps in motion prediction. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 6430–6436. [Google Scholar]
- Gilles, T.; Sabatini, S.; Tsishkou, D.; Stanciulescu, B.; Moutarde, F. Thomas: Trajectory heatmap output with learned multi-agent sampling. arXiv 2021, arXiv:2110.06607. [Google Scholar]
- Ścibior, A.; Lioutas, V.; Reda, D.; Bateni, P.; Wood, F. Imagining the road ahead: Multi-agent trajectory prediction via differentiable simulation. In Proceedings of the 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), Indianapolis, IN, USA, 19–22 September 2021; pp. 720–725. [Google Scholar]
- Gilles, T.; Sabatini, S.; Tsishkou, D.; Stanciulescu, B.; Moutarde, F. Gohome: Graph-oriented heatmap output for future motion estimation. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 9107–9114. [Google Scholar]
| Top-k | minADE6 | minFDE6 | MR6 |
|---|---|---|---|
| 1 | 0.65 | 1.03 | 0.13 |
| 2 | 0.61 | 1.01 | 0.11 |
| 3 | 0.63 | 1.01 | 0.12 |
| 4 | 0.63 | 1.02 | 0.11 |
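The Top-k ablation above varies how many experts the MoE gate activates per sample. A minimal NumPy sketch of standard sparse Top-k gating (not the authors' exact router; `W_g`, `experts`, and the softmax-over-selected-logits renormalization are common conventions assumed here):

```python
import numpy as np

def topk_moe(x, W_g, experts, k=2):
    """Sparse MoE forward pass: route x to the k experts with the
    highest gate scores and blend their outputs."""
    logits = W_g @ x                       # one gating logit per expert
    top = np.argsort(logits)[-k:]          # indices of the k largest logits
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()                           # softmax restricted to the top-k
    # Weighted sum of only the selected experts' outputs
    return sum(wi * experts[i](x) for wi, i in zip(w, top))
```

Only k experts run per sample, which is why the parameter count can grow without a proportional increase in per-sample compute; the table suggests k = 2 as the accuracy/complexity sweet spot.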
| Models | Parameters (M) | MFLOPs | Per-Sample Latency (ms) | FPS |
|---|---|---|---|---|
| DenseTNT [38] | 12.93 | 21.44 | 5.1 | 610 |
| LaneGCN [36] | 10.24 | 18.52 | 3.4 | 735 |
| Wayformer [35] | 15.69 | 27.81 | 5.2 | 481 |
| Our method | 8.77 | 12.36 | 3.1 | 1190 |
| Models | minADE6 | minFDE6 | MR6 |
|---|---|---|---|
| DenseTNT [38] | 0.21 | 0.67 | - |
| SceneTransformer [40] | 0.26 | 0.47 | 0.05 |
| ITRA [46] | 0.17 | 0.49 | - |
| GOHOME [47] | - | 0.45 | 0.07 |
| THOMAS [45] | 0.26 | 0.46 | 0.05 |
| Our method | 0.16 | 0.42 | 0.06 |
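The minADE6/minFDE6/MR6 metrics reported above can be sketched as follows. This follows the common Argoverse convention (an assumption here) of selecting the best of K candidate trajectories by final-point error and using a 2.0 m miss threshold:

```python
import numpy as np

def min_metrics(preds, gt, miss_threshold=2.0):
    """minADE_K, minFDE_K, and miss indicator for one scenario.

    preds: (K, T, 2) candidate trajectories, gt: (T, 2) ground truth.
    """
    d = np.linalg.norm(preds - gt[None], axis=-1)  # (K, T) pointwise errors
    fde = d[:, -1]                                  # final-point error per candidate
    best = fde.argmin()                             # best candidate by endpoint
    min_fde = fde[best]
    min_ade = d[best].mean()                        # ADE of that candidate
    miss = float(min_fde > miss_threshold)          # 1 if even the best misses
    return min_ade, min_fde, miss
```

Benchmark values like those in the table are these quantities averaged (and, for MR, the miss rate) over all test scenarios with K = 6.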
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Jiang, S.; Wang, R.; Ding, R.; Ye, Q.; Liu, W. Atten-LTC-Enhanced MoE Model for Agent Trajectory Prediction in Autonomous Driving. Sensors 2026, 26, 479. https://doi.org/10.3390/s26020479