# Vehicle Trajectory Prediction Using Hierarchical Graph Neural Network for Considering Interaction among Multimodal Maneuvers

## Abstract

## 1. Introduction

- A hierarchical GNN is formulated for trajectory prediction to take into account the interaction among multimodal maneuvers.
- An efficient GNN is developed to consider the interaction among trajectories where collision is predicted.

## 2. Related Work

## 3. System Overview

#### 3.1. System Architecture

#### 3.2. Problem Formulation

## 4. Maneuver-Based Multimodal Trajectory Prediction Network

#### 4.1. LSTM Encoder for Extracting Vehicle Features

#### 4.2. GCN Encoder for Extracting Interaction Features

#### 4.3. LSTM Decoder for Generating Predicted Trajectory

#### 4.4. MLP Decoder for Predicting Maneuver Probability

## 5. Interaction-Aware Trajectory Prediction Network

#### 5.1. LSTM Encoder for Extracting Predicted Trajectory Features

#### 5.2. GCN Encoder for Extracting Interaction Features among Multimodal Maneuvers

#### 5.3. LSTM Decoder for Generating Predicted Trajectories

## 6. Loss Function and Implementation Details

## 7. Experimental Evaluation

#### 7.1. Dataset

#### 7.1.1. NGSIM Dataset

#### 7.1.2. Real Driving Dataset

#### 7.2. Evaluation Metrics

#### 7.3. Ablation Study

- Identity matrix: Only the current trajectory is considered, and the interactions among the multimodal maneuvers are not considered.
- Matrix of ones: The interactions among the multimodal maneuvers of all vehicles are considered.
- Collision matrix without probability: Only the interactions among maneuvers for which a collision is expected are considered.
- Collision matrix with probability: Only the interactions among maneuvers for which a collision is expected are considered, and each interaction is weighted by the probability that both maneuvers occur simultaneously.
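The four graph representations above can be sketched as adjacency matrices over maneuver-trajectory nodes. This is a minimal illustration, not the authors' implementation: the function name `build_adjacency`, the boolean `collides` input, the retained self-loops, and the use of a product of maneuver probabilities as the joint probability (which assumes the maneuvers are independent) are all assumptions made for the example.

```python
import numpy as np

def build_adjacency(collides, prob, mode):
    """Build an (N, N) adjacency matrix over N maneuver-trajectory nodes.

    collides: (N, N) boolean array, True where two predicted trajectories
              are expected to collide (hypothetical input; the collision
              check itself is not shown here).
    prob:     (N,) maneuver probabilities, one per node.
    mode:     "identity", "ones", "collision", or "collision_prob".
    """
    mask = np.array(collides, dtype=float)  # copy so the input is not modified
    prob = np.asarray(prob, dtype=float)
    n = prob.shape[0]
    eye = np.eye(n)
    if mode == "identity":
        # Each node only sees its own trajectory.
        return eye
    if mode == "ones":
        # Every maneuver of every vehicle interacts with every other one.
        return np.ones((n, n))
    np.fill_diagonal(mask, 0.0)
    if mode == "collision":
        # Assumption: self-loops are kept and collision edges have weight 1.
        return eye + mask
    if mode == "collision_prob":
        # Assumption: independent maneuvers, so the probability that both
        # occur is approximated by the product of their probabilities.
        return eye + mask * np.outer(prob, prob)
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical scene: nodes 0-1 are two maneuvers of one vehicle,
# nodes 2-3 are two maneuvers of another vehicle.
prob = np.array([0.7, 0.3, 0.6, 0.4])
collides = np.zeros((4, 4), dtype=bool)
collides[1, 2] = collides[2, 1] = True  # maneuver 1 conflicts with maneuver 2

for mode in ("identity", "ones", "collision", "collision_prob"):
    print(mode)
    print(build_adjacency(collides, prob, mode))
```

In this toy scene only one pair of maneuvers is connected in the two collision variants, which illustrates why the collision-based graphs are sparser than the matrix of ones.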

#### 7.4. Quantitative Results

- Constant velocity (CV): This method uses a constant-velocity Kalman filter [5] to predict trajectories.
- Interacting multiple model Kalman filter (IMM-KF): This method uses an IMM Kalman filter proposed in [43]. This method consists of intention-based motion models, and the IMM filter is used to identify which of the motion models is active.
- Vanilla LSTM (V-LSTM): This method is an encoder–decoder structure using a single-layer LSTM [5]. It does not consider interactions because it uses only the information of the target vehicle.
- CS-LSTM: This method is an LSTM encoder–decoder that employs the convolutional social pooling layer proposed in [5] to model the interaction with the surrounding vehicles as a grid. The output is a unimodal trajectory distribution.
- CS-LSTM (M): This variant of the CS-LSTM method proposed in [5] outputs a maneuver-based multimodal trajectory distribution. The trajectory with the highest probability is used for evaluation.
- GRIP: This method uses the graph-based interaction-aware trajectory prediction model proposed in [6]. GRIP consists of several convolutional layers with graph operations to model the interaction among vehicles.
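As a point of reference for the simplest baseline, the sketch below extrapolates a trajectory at constant velocity and scores it with RMSE. It is a deliberately simplified stand-in: the CV baseline listed above uses a constant-velocity Kalman filter, whereas this example estimates velocity from the last two observations without any filtering. The function names, the sampling interval, and the sample points are assumptions made for illustration.

```python
import numpy as np

def constant_velocity_predict(history, dt, horizon_steps):
    """Extrapolate future positions assuming the last observed velocity holds.

    history:       (T, 2) array of past (x, y) positions sampled every dt seconds.
    dt:            sampling interval in seconds.
    horizon_steps: number of future steps to predict.
    Returns a (horizon_steps, 2) array of predicted positions.
    """
    history = np.asarray(history, dtype=float)
    # Finite-difference velocity from the last two samples (no Kalman filtering).
    velocity = (history[-1] - history[-2]) / dt
    steps = np.arange(1, horizon_steps + 1).reshape(-1, 1)
    return history[-1] + steps * dt * velocity

def rmse(pred, truth):
    """Root mean squared Euclidean distance between two (K, 2) trajectories."""
    err = np.linalg.norm(np.asarray(pred) - np.asarray(truth), axis=-1)
    return float(np.sqrt(np.mean(err ** 2)))

# Hypothetical history: a vehicle moving 2 length units per 0.2 s along y.
history = [[0.0, 0.0], [0.0, 2.0]]
pred = constant_velocity_predict(history, dt=0.2, horizon_steps=5)
print(pred)

# Hypothetical ground truth offset by a fixed (0.3, 0.4) displacement.
truth = pred + np.array([0.3, 0.4])
print(rmse(pred, truth))
```

Evaluating `rmse` on only the predictions up to a given horizon gives per-horizon scores of the kind reported for 1 s through 5 s.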

#### 7.5. Qualitative Results

## 8. Conclusions

- The proposed hierarchical graph neural network predicts trajectories in highly interactive situations more accurately than other methods. In the evaluation on highly interactive situations in the NGSIM dataset, the proposed method reduced the RMSE by 44.6% compared with a previous method. The proposed algorithm performed better in an interactive environment because it considers interactions among multimodal maneuvers.
- The proposed graph neural network efficiently considered interactions. In the ablation study using the NGSIM dataset, the proposed graph neural network reduced the error by 16.8% compared to not using it. In addition, the proposed graph representation method, which considers interactions where a collision is expected, showed better performance than other graph representation methods.
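The 16.8% figure can be checked against the ablation table, assuming it is computed from the overall RMSE row (1.90 for Stage 1 alone and 1.58 for Stage 2 with the collision matrix with probability):

$$
\frac{1.90 - 1.58}{1.90} = \frac{0.32}{1.90} \approx 0.168 = 16.8\%
$$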

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## References

1. Nilsson, J.; Silvlin, J.; Brannstrom, M.; Coelingh, E.; Fredriksson, J. If, When, and How to Perform Lane Change Maneuvers on Highways. IEEE Intell. Transp. Syst. Mag. **2016**, 8, 68–78.
2. Ulbrich, S.; Maurer, M. Towards Tactical Lane Change Behavior Planning for Automated Vehicles. In Proceedings of the IEEE Conference on Intelligent Transportation Systems (ITSC), Gran Canaria, Spain, 15–18 September 2015; pp. 989–995.
3. Lim, W.; Lee, S.; Sunwoo, M.; Jo, K. Hybrid Trajectory Planning for Autonomous Driving in On-Road Dynamic Scenarios. IEEE Trans. Intell. Transp. Syst. **2021**, 22, 341–355.
4. Paden, B.; Čáp, M.; Yong, S.Z.; Yershov, D.; Frazzoli, E. A survey of motion planning and control techniques for self-driving urban vehicles. IEEE Trans. Intell. Veh. **2016**, 1, 33–55.
5. Deo, N.; Trivedi, M.M. Convolutional Social Pooling for Vehicle Trajectory Prediction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Salt Lake City, UT, USA, 18–23 June 2018.
6. Li, X.; Ying, X.; Chuah, M.C. GRIP: Graph-based Interaction-aware Trajectory Prediction. In Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference, ITSC 2019, Auckland, New Zealand, 27–30 October 2019; pp. 3960–3966.
7. Li, X.; Ying, X.; Chuah, M.C. GRIP++: Enhanced Graph-based Interaction-aware Trajectory Prediction for Autonomous Driving. arXiv **2019**, arXiv:1907.07792.
8. Li, J.; Ma, H.; Zhang, Z.; Tomizuka, M. Social-WaGDAT: Interaction-aware Trajectory Prediction via Wasserstein Graph Double-Attention Network. arXiv **2020**, arXiv:2002.06241.
9. Deo, N.; Trivedi, M.M. Multi-Modal Trajectory Prediction of Surrounding Vehicles with Maneuver based LSTMs. In Proceedings of the IEEE Intelligent Vehicles Symposium, Changshu, China, 26–30 June 2018; pp. 1179–1184.
10. Lee, D.; Kwon, Y.P.; Mcmains, S.; Hedrick, J.K. Convolution neural network-based lane change intention prediction of surrounding vehicles for ACC. In Proceedings of the IEEE Conference on Intelligent Transportation Systems, ITSC, Yokohama, Japan, 16–19 October 2017; pp. 1–6.
11. Lee, N.; Choi, W.; Vernaza, P.; Choy, C.B.; Torr, P.H.S.; Chandraker, M. DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 336–345.
12. Chou, F.C.; Lin, T.H.; Cui, H.; Radosavljevic, V.; Nguyen, T.; Huang, T.K.; Niedoba, M.; Schneider, J.; Djuric, N. Predicting Motion of Vulnerable Road Users using High-Definition Maps and Efficient ConvNets. arXiv **2019**, arXiv:1906.08469.
13. Djuric, N.; Radosavljevic, V.; Cui, H.; Nguyen, T.; Chou, F.C.; Lin, T.H.; Singh, N.; Schneider, J. Uncertainty-aware Short-term Motion Prediction of Traffic Actors for Autonomous Driving. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Snowmass, CO, USA, 1–5 March 2020; pp. 2095–2104.
14. Monti, A.; Bertugli, A.; Calderara, S.; Cucchiara, R. DAG-Net: Double Attentive Graph Neural Network for Trajectory Forecasting. arXiv **2020**, arXiv:2005.12661.
15. Chandra, R.; Guan, T.; Panuganti, S.; Mittal, T.; Bhattacharya, U.; Bera, A.; Manocha, D. Forecasting Trajectory and Behavior of Road-Agents Using Spectral Clustering in Graph-LSTMs. IEEE Robot. Autom. Lett. **2020**, 5, 4882–4890.
16. Yu, B.; Yin, H.; Zhu, Z. Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting. In Proceedings of the IJCAI International Joint Conference on Artificial Intelligence, Melbourne, Australia, 19–25 August 2017; pp. 3634–3640.
17. Gao, J.; Sun, C.; Zhao, H.; Shen, Y.; Anguelov, D.; Li, C.; Schmid, C. VectorNet: Encoding HD Maps and Agent Dynamics From Vectorized Representation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 14–19 June 2020.
18. Cui, Z.; Henrickson, K.; Ke, R.; Wang, Y. Traffic Graph Convolutional Recurrent Neural Network: A Deep Learning Framework for Network-Scale Traffic Learning and Forecasting. IEEE Trans. Intell. Transp. Syst. **2020**, 21, 4883–4894.
19. Gupta, A.; Johnson, J.; Li, F.; Savarese, S.; Alahi, A. Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018.
20. Rhinehart, N.; McAllister, R.; Kitani, K.; Levine, S. PRECOG: PREdiction Conditioned on Goals in Visual Multi-Agent Settings. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea, 27–28 October 2019.
21. Cui, H.; Radosavljevic, V.; Chou, F.C.; Lin, T.H.; Nguyen, T.; Huang, T.K.; Schneider, J.; Djuric, N. Multimodal Trajectory Predictions for Autonomous Driving using Deep Convolutional Networks. In Proceedings of the International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 2090–2096.
22. Tang, C.; Salakhutdinov, R.R. Multiple Futures Prediction. Adv. Neural Inf. Process. Syst. **2019**, 32, 15424–15434.
23. Hong, J.; Sapp, B.; Philbin, J. Rules of the Road: Predicting Driving Behavior with a Convolutional Model of Semantic Interactions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019.
24. Casas, S.; Luo, W.; Urtasun, R. IntentNet: Learning to Predict Intention from Raw Sensor Data. In Proceedings of the 2nd Annual Conference on Robot Learning (CoRL), Zürich, Switzerland, 29–31 October 2018; Volume 87, pp. 947–956.
25. Chai, Y.; Sapp, B.; Bansal, M.; Anguelov, D. MultiPath: Multiple Probabilistic Anchor Trajectory Hypotheses for Behavior Prediction. arXiv **2019**, arXiv:1910.05449.
26. Phan-Minh, T.; Grigore, E.C.; Boulton, F.A.; Beijbom, O.; Wolff, E.M. CoverNet: Multimodal Behavior Prediction Using Trajectory Sets. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 14–19 June 2020; pp. 14062–14071.
27. Simard, P.Y.; Steinkraus, D.; Platt, J.C. Best practices for convolutional neural networks applied to visual document analysis. In Proceedings of the International Conference on Document Analysis and Recognition, ICDAR, Edinburgh, UK, 3–6 August 2003; pp. 958–963.
28. Scarselli, F.; Gori, M.; Tsoi, A.C.; Hagenbuchner, M.; Monfardini, G. The graph neural network model. IEEE Trans. Neural Netw. **2009**, 20, 61–80.
29. Liu, W.; He, H.; Sun, F. Vehicle state estimation based on Minimum Model Error criterion combining with Extended Kalman Filter. J. Frankl. Inst. **2016**, 353, 834–856.
30. Scharcanski, J.; De Oliveira, A.B.; Cavalcanti, P.G.; Yari, Y. A particle-filtering approach for vehicular tracking adaptive to occlusions. IEEE Trans. Veh. Technol. **2011**, 60, 381–389.
31. Haarnoja, T.; Ajay, A.; Levine, S.; Abbeel, P. Backprop KF: Learning Discriminative Deterministic State Estimators. Adv. Neural Inf. Process. Syst. **2016**, 29, 4376–4384.
32. Kalman, R.E. A new approach to linear filtering and prediction problems. J. Fluids Eng. Trans. ASME **1960**, 82, 35–45.
33. Sutskever, I.; Vinyals, O.; Le, Q.V. Sequence to Sequence Learning with Neural Networks. Adv. Neural Inf. Process. Syst. **2014**, 27, 3104–3112.
34. Altche, F.; De La Fortelle, A. An LSTM network for highway trajectory prediction. In Proceedings of the IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC, Maui, HI, USA, 4–7 November 2018; pp. 353–359.
35. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. **1997**, 9, 1735–1780.
36. Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proceedings of the EMNLP 2014—2014 Conference on Empirical Methods in Natural Language Processing, Doha, Qatar, 25–29 October 2014; pp. 1724–1734.
37. Luo, W.; Yang, B.; Urtasun, R. Fast and Furious Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018.
38. Kipf, T.N.; Welling, M. Semi-Supervised Classification with Graph Convolutional Networks. In Proceedings of the 5th International Conference on Learning Representations, ICLR 2017—Conference Track Proceedings, Toulon, France, 24–26 April 2017.
39. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. Adv. Neural Inf. Process. Syst. **2019**, 32, 8026–8037.
40. Kingma, D.P.; Ba, J.L. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations, ICLR—Conference Track Proceedings, San Diego, CA, USA, 7–9 May 2015.
41. Colyar, J.; Halkias, J. US Highway 101 Dataset. Federal Highway Administration (FHWA), Tech. Rep. FHWA-HRT-07-030. January 2007. Available online: https://www.fhwa.dot.gov/publications/research/operations/07030/ (accessed on 5 August 2021).
42. Halkias, J.; Colyar, J. US Highway I-80 Dataset. Federal Highway Administration (FHWA), Tech. Rep. FHWA-HRT-06-137. December 2006. Available online: https://www.fhwa.dot.gov/publications/research/operations/06137/ (accessed on 5 August 2021).
43. Lefkopoulos, V.; Menner, M.; Domahidi, A.; Zeilinger, M.N. Interaction-Aware Motion Prediction for Autonomous Driving: A Multiple Model Kalman Filtering Scheme. IEEE Robot. Autom. Lett. **2021**, 6, 80–87.
44. Zhan, W.; Sun, L.; Wang, D.; Shi, H.; Clausse, A.; Naumann, M.; Kummerle, J.; Konigshof, H.; Stiller, C.; de La Fortelle, A.; et al. INTERACTION Dataset: An INTERnational, Adversarial and Cooperative moTION Dataset in Interactive Driving Scenarios with Semantic Maps. arXiv **2019**, arXiv:1910.03088.
45. Zhan, W.; Sun, L.; Wang, D.; Jin, Y.; Tomizuka, M. Constructing a Highly Interactive Vehicle Motion Dataset. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, Macau, China, 3–8 November 2019; pp. 6415–6420.

**Figure 1.** Illustration demonstrating the necessity of awareness of interactions among multimodal maneuvers: (**a**) Vehicles with multimodal maneuvers; (**b**) vehicle movement affected by maneuver of surrounding vehicles.

**Figure 3.** Illustration of maneuver-based multimodal trajectory prediction network with four components: (**a**) LSTM encoder, (**b**) GCN encoder, (**c**) LSTM decoder, and (**d**) MLP decoder.

**Figure 5.** Illustration of interaction-aware trajectory prediction network with three components: (**a**) LSTM encoder, (**b**) GCN encoder, and (**c**) LSTM decoder.

**Figure 7.** Experiment dataset for training the proposed method and comparison with other methods: (**a**) Vehicle trajectories were collected for NGSIM using digital video cameras; (**b**) dataset including mild, moderate, and congested traffic conditions on the freeway.

**Figure 8.** Experimental environment for real driving evaluation: (**a**) Test vehicle A1 equipped with one 32-channel LiDAR, two 16-channel LiDARs, and RTK-GNSS/INS; (**b**) test site including a congested freeway.

**Figure 9.** Qualitative evaluation of the proposed hierarchical method using real driving data: (**a**) Trajectory prediction in Stage 1; (**b**) Trajectory prediction in Stage 2.

**Figure 11.** Visualization of the predicted trajectory. (**a**–**d**) The results of the NGSIM dataset, and (**e**,**f**) the results of the real driving dataset.

| Metric | Stage 1 | Stage 2: Identity Matrix | Stage 2: Matrix of Ones | Stage 2: Collision Matrix without Probability | Stage 2: Collision Matrix with Probability |
|---|---|---|---|---|---|
| $RMSE_{1s}$ | 0.62 | 0.61 | 0.60 | 0.61 | 0.61 |
| $RMSE_{2s}$ | 1.24 | 1.22 | 1.13 | 1.14 | 1.14 |
| $RMSE_{3s}$ | 1.88 | 1.87 | 1.63 | 1.62 | 1.62 |
| $RMSE_{4s}$ | 2.56 | 2.53 | 2.43 | 2.16 | 2.10 |
| $RMSE_{5s}$ | 3.33 | 3.29 | 3.02 | 2.79 | 2.66 |
| $RMSE$ | 1.90 | 1.88 | 1.74 | 1.62 | 1.58 |

| Metric | CV [5] | IMM-KF [43] | V-LSTM [5] | CS-LSTM(M) [5] | CS-LSTM [5] | GRIP [6] | GRIP++ [7] | HGNN (Proposed Method) |
|---|---|---|---|---|---|---|---|---|
| $RMSE_{1s}$ | 0.73 | 0.58 | 0.68 | 0.62 | 0.61 | 0.37 | 0.38 | 0.61 |
| $RMSE_{2s}$ | 1.78 | 1.36 | 1.65 | 1.29 | 1.27 | 0.86 | 0.89 | 1.14 |
| $RMSE_{3s}$ | 3.13 | 2.28 | 2.91 | 2.13 | 2.09 | 1.45 | 1.45 | 1.62 |
| $RMSE_{4s}$ | 4.78 | 3.37 | 4.46 | 3.20 | 3.10 | 2.21 | 2.14 | 2.10 |
| $RMSE_{5s}$ | 6.68 | 4.55 | 6.27 | 4.52 | 4.37 | 3.16 | 2.94 | 2.66 |

**Table 3.** Comparison of RMSE with GRIP++ in highly interactive situations ($\Delta TTCP_{min} \le 1\ \mathrm{s}$).

| Metric | GRIP++ [7] | HGNN (Proposed Method) |
|---|---|---|
| $RMSE$ | 3.57 | 1.97 |

| Metric | GRIP++ [7] | HGNN (Proposed Method) |
|---|---|---|
| $RMSE$ | 3.02 | 1.98 |

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Jo, E.; Sunwoo, M.; Lee, M. Vehicle Trajectory Prediction Using Hierarchical Graph Neural Network for Considering Interaction among Multimodal Maneuvers. *Sensors* **2021**, *21*, 5354.
https://doi.org/10.3390/s21165354
