Vehicle Trajectory Prediction Based on Adaptive Edge Generation
Abstract
1. Introduction
2. Related Work
2.1. Map Information
- (1) Enhanced detail and accuracy: Vectorized maps provide more comprehensive map information, including road curvature and traffic facilities, which are vital for accurate trajectory prediction. By delivering more precise road shapes and features, vectorized maps facilitate a better understanding of road conditions and support more accurate trajectory planning and prediction.
- (2) Superior real-time performance and dynamism: Vectorized maps are particularly advantageous in trajectory prediction scenarios that demand real-time responsiveness and dynamic adaptation. Road conditions may change due to factors such as new traffic signs, road construction, or traffic congestion. Because vectorized maps are easy to update, they can quickly reflect such changes, keeping the map current and thereby enhancing the real-time accuracy of predictions.
- (3) Improved navigation and route planning: Vectorized maps excel in navigation and route planning, providing more precise navigation guidance and route recommendations. With a deeper understanding of the structure and characteristics of road networks, vectorized maps can offer more accurate route planning for navigation systems.
2.2. Trajectory Prediction Based on Graph Neural Networks
2.3. Summary of Trajectory Prediction Problems Based on Graph Neural Networks
- (1) Optimization of graph connectivity: Traditional methods often employ fully connected graphs for vehicle trajectory prediction, ensuring that all nodes are engaged in the graph convolution process. However, this approach does not reflect reality, where many nodes might not significantly influence the target trajectory. For instance, within the Argoverse dataset [29], numerous nodes do not contribute directly to trajectory generation, such as lane line nodes that are distant from the predicted vehicle, which have minimal impact on the prediction outcomes. As illustrated in Figure 1, of the 11 nodes in the graph, only seven are related to trajectories. Utilizing a fully connected graph that includes all 11 nodes undeniably adds to the complexity and computational time of the algorithm. Even among the seven trajectory-related nodes, those that are distant from the predicted vehicle still have limited influence on the prediction results. Thus, more precise screening of nodes that significantly impact prediction outcomes is crucial to optimizing both the performance and efficiency of the algorithm.
- (2) Enhanced focus on edges: Traditionally, research has predominantly focused on nodes, with edges viewed primarily as connectors and tools for representing neighborhood relationships during graph convolution. This perspective overlooks the significant role that edges play within the graph. Edges do more than connect nodes [30]; they also serve as conduits for storing and transferring information, playing a crucial role in facilitating communication between nodes. Therefore, fully leveraging the function of edges in graph convolution is critical for enhancing the performance and accuracy of algorithms and represents a novel research direction in this paper.
- (3) Refined edge construction: In edge construction, the traditional graph attention network (GAT) approach typically calculates edge weights based on the distance between nodes. This method, however, fails to accurately represent the weight relationships between vehicles in specific scenarios. For instance, at intersections, the relevance of one vehicle to another depends not only on their distance but also on their relative angles and other factors. Relying solely on distance for edge weights can reduce the accuracy and robustness of the prediction model. This paper advocates a more nuanced approach that considers multiple factors in edge construction to better capture the complex dynamics of vehicle interactions, particularly in challenging environments such as intersections (an illustrative sketch follows this list).
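To make point (3) concrete, the following minimal sketch (a simple illustration, not the method proposed in this paper) contrasts a purely distance-based edge weight with one that also accounts for the relative bearing between two vehicles; the function names, the decay scale, and the angular factor are illustrative assumptions.

```python
import numpy as np

def distance_weight(p_i, p_j, sigma=10.0):
    """Edge weight that decays with Euclidean distance only."""
    d = np.linalg.norm(np.asarray(p_j) - np.asarray(p_i))
    return np.exp(-d / sigma)

def distance_angle_weight(p_i, heading_i, p_j, sigma=10.0):
    """Illustrative weight that also uses the angle between vehicle i's heading
    and the direction towards vehicle j, so a vehicle ahead of the target
    contributes more than an equally distant vehicle behind it."""
    p_i, p_j, heading_i = map(np.asarray, (p_i, p_j, heading_i))
    offset = p_j - p_i
    d = np.linalg.norm(offset)
    if d < 1e-6 or np.linalg.norm(heading_i) < 1e-6:
        return np.exp(-d / sigma)
    cos_theta = np.dot(heading_i, offset) / (np.linalg.norm(heading_i) * d)
    angle_factor = 0.5 * (1.0 + cos_theta)   # maps [-1, 1] to [0, 1]
    return np.exp(-d / sigma) * angle_factor

# Two vehicles at the same distance but on opposite sides of the target:
ego, ego_heading = (0.0, 0.0), (1.0, 0.0)
ahead, behind = (15.0, 0.0), (-15.0, 0.0)
print(distance_weight(ego, ahead) == distance_weight(ego, behind))      # True: indistinguishable
print(distance_angle_weight(ego, ego_heading, ahead),
      distance_angle_weight(ego, ego_heading, behind))                  # different weights
```

The example only shows that distance alone cannot distinguish the two neighboring vehicles, whereas an angle-dependent term does.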
3. Trajectory Prediction Based on Adaptive Edge Generation
3.1. Overall Workflow
- (1) Vectorization module: This module is responsible for vectorizing vehicle trajectory and map information, transforming these elements into a format suitable for further processing and analysis.
- (2) Encoder module: Here, the adaptive edge generator strategy is introduced. For dynamic nodes, this involves calculating the relative positions between the vehicle (node) to be predicted and other vehicles (nodes), assigning weights to these relative positions through an attention mechanism, and connecting dynamic nodes based on these weight values. For static nodes, edges are formed with a length-thresholding strategy that connects the vehicle only to static elements within a limited distance (a minimal sketch of both strategies follows this list).
- (3) Decoder module: A multilayer perceptron (MLP) decoder is employed to achieve accurate trajectory prediction. This module translates the encoded graph data back into predicted vehicle trajectories, ensuring precision and reliability in the output.
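As noted in item (2) above, the encoder connects dynamic and static nodes with two different strategies. The following is a minimal PyTorch sketch of how such an adaptive edge generator could be organized; the module structure, the scoring MLP, the top-k selection, and the tensor shapes are illustrative assumptions rather than the exact implementation described in Section 3.3.

```python
import torch
import torch.nn as nn

class AdaptiveEdgeGenerator(nn.Module):
    """Illustrative edge generator: attention-style scored edges between dynamic
    (vehicle) nodes, plus length-thresholded edges to static (map) nodes."""

    def __init__(self, hidden_dim=64, top_k=6, length_threshold=20.0):
        super().__init__()
        self.top_k = top_k                        # keep only the strongest dynamic edges
        self.length_threshold = length_threshold  # metres; assumed to play the role of L
        self.rel_pos_mlp = nn.Sequential(nn.Linear(2, hidden_dim), nn.ReLU(),
                                         nn.Linear(hidden_dim, 1))

    def dynamic_edges(self, target_pos, vehicle_pos):
        """target_pos: (2,), vehicle_pos: (N, 2). Returns indices of connected vehicles."""
        rel = vehicle_pos - target_pos                 # relative positions to the target
        scores = self.rel_pos_mlp(rel).squeeze(-1)     # learned score per neighbouring vehicle
        weights = torch.softmax(scores, dim=0)         # attention-style normalization
        k = min(self.top_k, weights.numel())
        return torch.topk(weights, k).indices          # connect only the top-k vehicles

    def static_edges(self, target_pos, map_pos):
        """map_pos: (M, 2). Connect only static nodes within the length threshold."""
        dist = torch.linalg.norm(map_pos - target_pos, dim=-1)
        return torch.nonzero(dist <= self.length_threshold).squeeze(-1)

# Usage with random positions (metres):
gen = AdaptiveEdgeGenerator()
target = torch.zeros(2)
vehicles = torch.randn(10, 2) * 30
lane_points = torch.randn(50, 2) * 40
print(gen.dynamic_edges(target, vehicles))
print(gen.static_edges(target, lane_points))
```

A hard top-k cut-off is only one way to turn attention weights into connections; a learned or scenario-dependent threshold would serve the same purpose of avoiding a fully connected graph.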
- (1) Enhanced model flexibility and adaptability: Dynamic nodes represent the current positions of vehicles, while static nodes correspond to fixed road sections or landmarks. By distinguishing between these node types, the model can more efficiently handle rapidly changing road conditions, enhancing both the flexibility and adaptability of the prediction model.
- (2) Reduced computational complexity: Traditional reliance on fully connected graphs can obscure meaningful inter-node relationships and increase computational demands. By processing dynamic and static nodes separately, this approach reduces the graph’s connection density, thereby decreasing computational complexity and enhancing both the efficiency and scalability of the algorithm.
- (3) Improved accuracy and stability: The relationships between vehicles differ from those between vehicles and road markers, such as lane lines, introducing a level of heterogeneity that complicates graph convolution computations. By implementing different connection strategies and dynamically adjusting based on node type and semantic information, the model can more accurately capture the associations between vehicle trajectories, thus improving the accuracy and stability of predictions.
3.2. Scene Construction
3.3. Adaptive Edge Generator
3.3.1. Dynamic Node Edge Generation Based on Relative Angles
3.3.2. Static Node Edge Generation Based on Length Thresholding
3.4. MLP Decoder and Loss Function
4. Experimental Results and Analysis
4.1. Datasets and Evaluation Indicators
4.2. Experimental Environment and Hyperparameter Settings
4.3. Ablation Experiment
4.4. Decoder Module Comparison
4.5. Baseline Models
4.5.1. VectorNet [13] (Baseline)
4.5.2. Scene Transformer [40]
4.5.3. LaneGCN [23]
4.5.4. DenseTNT [41]
4.6. Comparison with Advanced Methods
4.7. Visualization of Results
5. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Shadrin, S.S.; Ivanova, A.A. Analytical review of standard SAE J3016 “Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles” with latest updates. Avtomobil’. Doroga. Infrastruktura. 2019, 3, 10. [Google Scholar]
- Dagli, I.; Breuel, G.; Schittenhelm, H.; Schanz, A. Cutting-in vehicle recognition for ACC systems-towards feasible situation analysis methodologies. In Proceedings of the IEEE Intelligent Vehicles Symposium, Parma, Italy, 14–17 June 2004; IEEE: New York, NY, USA, 2004; pp. 925–930. [Google Scholar]
- Huang, Y.; Du, J.; Yang, Z.; Zhou, Z.; Zhang, L.; Chen, H. A survey on trajectory-prediction methods for autonomous driving. IEEE Trans. Intell. Veh. 2022, 7, 652–674. [Google Scholar] [CrossRef]
- Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 2020, 33, 1877–1901. [Google Scholar]
- Cui, H.; Radosavljevic, V.; Chou, F.C.; Lin, T.H.; Nguyen, T.; Huang, T.K.; Schneider, J.; Djuric, N. Multimodal trajectory predictions for autonomous driving using deep convolutional networks. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; IEEE: New York, NY, USA, 2019; pp. 2090–2096. [Google Scholar]
- Bao, Z.; Hossain, S.; Lang, H.; Lin, X. High-definition map generation technologies for autonomous driving. arXiv 2022, arXiv:2206.05400. [Google Scholar]
- Casas, S.; Gulino, C.; Liao, R.; Urtasun, R. Spatially-aware graph neural networks for relational behavior forecasting from sensor data. arXiv 2019, arXiv:1910.08233. [Google Scholar]
- Ou, C.; Karray, F. Deep learning-based driving maneuver prediction system. IEEE Trans. Veh. Technol. 2019, 69, 1328–1340. [Google Scholar] [CrossRef]
- Fernández-Llorca, D.; Biparva, M.; Izquierdo-Gonzalo, R.; Tsotsos, J.K. Two-stream networks for lane-change prediction of surrounding vehicles. In Proceedings of the 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), Rhodes, Greece, 20–23 September 2020; IEEE: New York, NY, USA, 2020; pp. 1–6. [Google Scholar]
- Schmidt, J.; Jordan, J.; Gritschneder, F.; Dietmayer, K. Crat-pred: Vehicle trajectory prediction with crystal graph convolutional neural networks and multi-head self-attention. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; IEEE: New York, NY, USA, 2022; pp. 7799–7805. [Google Scholar]
- Hong, J.; Sapp, B.; Philbin, J. Rules of the road: Predicting driving behavior with a convolutional model of semantic interactions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 8454–8462. [Google Scholar]
- Chai, Y.; Sapp, B.; Bansal, M.; Anguelov, D. Multipath: Multiple probabilistic anchor trajectory hypotheses for behavior prediction. arXiv 2019, arXiv:1910.05449. [Google Scholar]
- Gao, J.; Sun, C.; Zhao, H.; Shen, Y.; Anguelov, D.; Li, C.; Schmid, C. Vectornet: Encoding HD maps and agent dynamics from vectorized representation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11525–11533. [Google Scholar]
- Wu, Z.; Pan, S.; Chen, F.; Long, G.; Zhang, C.; Philip, S.Y. A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 4–24. [Google Scholar] [CrossRef]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
- Liu, Q.; Xu, S.; Lu, C.; Yao, H.; Chen, H. Early recognition of driving intention for lane change based on recurrent hidden semi-Markov model. IEEE Trans. Veh. Technol. 2020, 69, 10545–10557. [Google Scholar] [CrossRef]
- Zhao, T.; Xu, Y.; Monfort, M.; Choi, W.; Baker, C.; Zhao, Y.; Wang, Y.; Wu, Y.N. Multi-agent tensor fusion for contextual trajectory prediction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 12126–12134. [Google Scholar]
- Bansal, M.; Krizhevsky, A.; Ogale, A. Chauffeurnet: Learning to drive by imitating the best and synthesizing the worst. arXiv 2018, arXiv:1812.03079. [Google Scholar]
- Lipton, Z.C.; Berkowitz, J.; Elkan, C. A critical review of recurrent neural networks for sequence learning. arXiv 2015, arXiv:1506.00019. [Google Scholar]
- Varadarajan, B.; Hefny, A.; Srivastava, A.; Refaat, K.S.; Nayakanti, N.; Cornman, A.; Chen, K.; Douillard, B.; Lam, C.P.; Anguelov, D.; et al. Multipath++: Efficient information fusion and trajectory aggregation for behavior prediction. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; IEEE: New York, NY, USA, 2022; pp. 7814–7821. [Google Scholar]
- Phan-Minh, T.; Grigore, E.C.; Boulton, F.A.; Beijbom, O.; Wolff, E.M. Covernet: Multimodal behavior prediction using trajectory sets. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 14074–14083. [Google Scholar]
- Tang, C.; Salakhutdinov, R.R. Multiple futures prediction. Adv. Neural Inf. Process. Syst. 2019, 32. [Google Scholar]
- Liang, M.; Yang, B.; Hu, R.; Chen, Y.; Liao, R.; Feng, S.; Urtasun, R. Learning lane graph representations for motion forecasting. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part II 16. Springer: Berlin/Heidelberg, Germany, 2020; pp. 541–556. [Google Scholar]
- Zeng, W.; Liang, M.; Liao, R.; Urtasun, R. Lanercnn: Distributed representations for graph-centric motion forecasting. In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 27 September–1 October 2021; IEEE: New York, NY, USA, 2021; pp. 532–539. [Google Scholar]
- Chua, L.O. CNN: A Paradigm for Complexity; World Scientific: Singapore, 1998; Volume 31. [Google Scholar]
- Niepert, M.; Ahmed, M.; Kutzkov, K. Learning convolutional neural networks for graphs. In Proceedings of the International Conference on Machine Learning, PMLR, New York, NY, USA, 19–24 June 2016; pp. 2014–2023. [Google Scholar]
- Hamilton, W.; Ying, Z.; Leskovec, J. Inductive representation learning on large graphs. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
- Gilmer, J.; Schoenholz, S.S.; Riley, P.F.; Vinyals, O.; Dahl, G.E. Neural message passing for quantum chemistry. In Proceedings of the International Conference on Machine Learning, PMLR, Sydney, Australia, 6–11 August 2017; pp. 1263–1272. [Google Scholar]
- Chang, M.F.; Lambert, J.; Sangkloy, P.; Singh, J.; Bak, S.; Hartnett, A.; Wang, D.; Carr, P.; Lucey, S.; Ramanan, D.; et al. Argoverse: 3d tracking and forecasting with rich maps. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 8748–8757. [Google Scholar]
- Rong, Y.; Huang, W.; Xu, T.; Huang, J. Dropedge: Towards deep graph convolutional networks on node classification. arXiv 2019, arXiv:1907.10903. [Google Scholar]
- Mo, X.; Huang, Z.; Xing, Y.; Lv, C. Multi-agent trajectory prediction with heterogeneous edge-enhanced graph attention network. IEEE Trans. Intell. Transp. Syst. 2022, 23, 9554–9567. [Google Scholar] [CrossRef]
- Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903. [Google Scholar]
- Lee, D.D.; Pham, P.; Largman, Y.; Ng, A. Advances in neural information processing systems 22. Technol. Rep. 2009. [Google Scholar]
- Zhou, Z.; Ye, L.; Wang, J.; Wu, K.; Lu, K. Hivt: Hierarchical vector transformer for multi-agent motion prediction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 8823–8833. [Google Scholar]
- Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv 2014, arXiv:1412.3555. [Google Scholar]
- Tolstikhin, I.O.; Houlsby, N.; Kolesnikov, A.; Beyer, L.; Zhai, X.; Unterthiner, T.; Yung, J.; Steiner, A.; Keysers, D.; Uszkoreit, J.; et al. Mlp-mixer: An all-MLP architecture for vision. Adv. Neural Inf. Process. Syst. 2021, 34, 24261–24272. [Google Scholar]
- Bahdanau, D. Neural machine translation by jointly learning to align and translate. arXiv 2014, arXiv:1409.0473. [Google Scholar]
- Zhao, H.; Gao, J.; Lan, T.; Sun, C.; Sapp, B.; Varadarajan, B.; Shen, Y.; Shen, Y.; Chai, Y.; Schmid, C.; et al. Tnt: Target-driven trajectory prediction. In Proceedings of the Conference on Robot Learning, PMLR, London, UK, 8–11 November 2021; pp. 895–904. [Google Scholar]
- Yao, H.; Zhu, D.L.; Jiang, B.; Yu, P. Negative log likelihood ratio loss for deep neural network classification. In Proceedings of the Future Technologies Conference (FTC) 2019, San Francisco, CA, USA, 25–26 October 2019; Springer: Berlin/Heidelberg, Germany, 2020; Volume 1, pp. 276–282. [Google Scholar]
- Ngiam, J.; Caine, B.; Vasudevan, V.; Zhang, Z.; Chiang, H.T.L.; Ling, J.; Roelofs, R.; Bewley, A.; Liu, C.; Venugopal, A.; et al. Scene transformer: A unified architecture for predicting multiple agent trajectories. arXiv 2021, arXiv:2106.08417. [Google Scholar]
- Gu, J.; Sun, C.; Zhao, H. Densetnt: End-to-end trajectory prediction from dense goal sets. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 15303–15312. [Google Scholar]
| Configuration | Value |
|---|---|
| Operating System | Linux |
| CPU | Intel i5-13600KF |
| Memory | 32 GB |
| GPU | NVIDIA RTX 4070 Super |
| Software Platform | Python 3.10, PyTorch 2.2.0, CUDA 12.1 |
| Parameter | Value |
|---|---|
| Epoch | 64 |
| Batch size | 32 |
| Initial learning rate | |
| Weight decay | |
| Dropout rate | 0.2 |
| L | 20 m |
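For concreteness, the following is a minimal sketch of how the settings listed above might be wired into a PyTorch training loop. The model, dataset, and loss function are assumed to exist elsewhere, the choice of AdamW is an assumption (the optimizer is not stated here), and the learning rate and weight decay are placeholders rather than values taken from the table.

```python
import torch
from torch.utils.data import DataLoader

# Values from the hyperparameter table; LEARNING_RATE and WEIGHT_DECAY are placeholders.
EPOCHS = 64
BATCH_SIZE = 32
DROPOUT_RATE = 0.2          # intended for the dropout layers inside the model (not used in this loop)
LEARNING_RATE = 1e-3        # placeholder, not from the table
WEIGHT_DECAY = 1e-4         # placeholder, not from the table

def train(model, dataset, loss_fn, device="cuda"):
    """Generic training loop; `model`, `dataset`, and `loss_fn` are assumed to be
    defined elsewhere (e.g., the encoder-decoder network of Section 3)."""
    model = model.to(device)
    loader = DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True)
    optimizer = torch.optim.AdamW(model.parameters(),
                                  lr=LEARNING_RATE, weight_decay=WEIGHT_DECAY)
    for epoch in range(EPOCHS):
        for inputs, targets in loader:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()
            loss = loss_fn(model(inputs), targets)
            loss.backward()
            optimizer.step()
```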
| Dynamic Edge Generation Module | Static Edge Generation Module | Map Information | ADE (mean ± std) | FDE (mean ± std) | MR (mean ± std) |
|---|---|---|---|---|---|
|  | √ | √ | 0.902 ± 0.015 | 1.574 ± 0.022 | 0.138 ± 0.005 |
| √ |  | √ | 0.681 ± 0.012 | 1.039 ± 0.018 | 0.102 ± 0.004 |
| √ | √ |  | 0.73 ± 0.010 | 1.16 ± 0.020 | 0.122 ± 0.003 |
| √ | √ | √ | 0.663 ± 0.009 | 0.973 ± 0.013 | 0.094 ± 0.002 |
| Dynamic Edge Generation Module | Floating Point Capacity | Time Required for Individual Scenario Testing | GPU Memory |
|---|---|---|---|
| √ | 0.7 M | 0.04 s | 5.2 GB |
|  | 2.6 M | 0.08 s | 8.4 GB |
| Decoder | ADE | FDE | Param. |
|---|---|---|---|
| MLP Decoder | 0.668 | 0.986 | 2360 K |
| GRU Decoder | 0.663 | 0.973 | 3156 K |
| LSTM Decoder | 0.665 | 0.965 | 4820 K |
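To make the comparison above more tangible, here is a minimal sketch of what an MLP decoder head of this kind could look like; the feature width, number of predicted modes, and prediction horizon are assumed values, not the configuration used in the paper.

```python
import torch
import torch.nn as nn

class MLPDecoder(nn.Module):
    """Maps an encoded agent feature to K candidate future trajectories of
    length T (one (x, y) point per step). All dimensions are illustrative."""

    def __init__(self, feat_dim=128, hidden_dim=256, num_modes=6, horizon=30):
        super().__init__()
        self.num_modes, self.horizon = num_modes, horizon
        self.traj_head = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, num_modes * horizon * 2),
        )
        self.score_head = nn.Linear(feat_dim, num_modes)    # per-mode confidence

    def forward(self, agent_feat):                           # agent_feat: (B, feat_dim)
        traj = self.traj_head(agent_feat).view(-1, self.num_modes, self.horizon, 2)
        scores = torch.softmax(self.score_head(agent_feat), dim=-1)
        return traj, scores                                  # (B, K, T, 2), (B, K)

decoder = MLPDecoder()
trajectories, confidences = decoder(torch.randn(4, 128))     # 4 agents in a batch
```

A GRU or LSTM decoder would replace the two linear layers with a recurrent roll-out over the T prediction steps, which is consistent with the larger parameter counts reported in the table.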
| Model | ADE | FDE | MR | Param. |
|---|---|---|---|---|
| VectorNet (baseline) | 0.9260 | 1.8623 | 0.2736 | 12,651 K |
| Scene Transformer | 0.8026 | 1.2321 | 0.1255 | 15,296 K |
| LaneGCN | 0.8679 | 1.3640 | 0.1634 | 3710 K |
| DenseTNT | 0.8817 | 1.2815 | 0.1258 | 1130 K |
| Ours | 0.6681 | 0.9864 | 0.0952 | 2360 K |