Article

Predicting Intentions of Pedestrians from 2D Skeletal Pose Sequences with a Representation-Focused Multi-Branch Deep Learning Network

1
Institut VEDECOM—Versailles, 78000 Versailles, France
2
Centre de Robotique, MINES ParisTech, Université PSL, 75006 Paris, France
*
Author to whom correspondence should be addressed.
Algorithms 2020, 13(12), 331; https://doi.org/10.3390/a13120331
Received: 30 October 2020 / Revised: 3 December 2020 / Accepted: 7 December 2020 / Published: 10 December 2020
(This article belongs to the Special Issue Algorithms for Human Gesture, Activity and Mobility Analysis)
Understanding the behaviors and intentions of humans remains one of the main challenges for vehicle autonomy. More specifically, inferring the intentions and actions of vulnerable actors, namely pedestrians, in complex situations such as urban traffic scenes is still difficult and is a blocking point on the way to more automated vehicles. Answering the question “Is the pedestrian going to cross?” is a good starting point for advancing toward the fifth level of autonomous driving. In this paper, we address the real-time prediction of pedestrians’ discrete intentions in urban traffic environments by linking the dynamics of a pedestrian’s skeleton to an intention. To this end, we propose SPI-Net (Skeleton-based Pedestrian Intention network): a representation-focused multi-branch network that combines features from 2D pedestrian body poses to predict pedestrians’ discrete intentions. Experimental results show that SPI-Net achieves 94.4% accuracy in pedestrian crossing prediction on the JAAD data set while remaining efficient enough for real-time scenarios: it reaches around one inference every 0.25 ms on one GPU (RTX 2080 Ti) or every 0.67 ms on one CPU (Intel Core i7 8700K).
Keywords: skeleton-based action prediction; pedestrian intention prediction; body action; human activity; action and gesture recognition; mobility analysis

MDPI and ACS Style

Gesnouin, J.; Pechberti, S.; Bresson, G.; Stanciulescu, B.; Moutarde, F. Predicting Intentions of Pedestrians from 2D Skeletal Pose Sequences with a Representation-Focused Multi-Branch Deep Learning Network. Algorithms 2020, 13, 331. https://doi.org/10.3390/a13120331
