Open Access Article

Spatio–Temporal Image Representation of 3D Skeletal Movements for View-Invariant Action Recognition with Deep Convolutional Neural Networks

1 Cerema, Project team STI, 1 avenue du Colonel Roche, F-31400 Toulouse, France
2 Informatics Research Institute of Toulouse (IRIT), Paul Sabatier University, Toulouse 31062, France
3 Aparnix, La Gioconda 4355, 10B, Las Condes, Santiago 7550076, Chile
4 Cortexica Vision Systems Ltd., London SE1 9LQ, UK
5 School of Electronic Engineering and Computer Science, Queen Mary University of London, London E1 4NS, UK
6 Department of Computer Science, University Carlos III of Madrid, 28903 Leganés, Spain
* Author to whom correspondence should be addressed.
This paper is an extended version of our paper: Pham, H.H.; Khoudour, L.; Crouzil, A.; Zegers, P.; Velastin, S.A. "Skeletal Movement to Color Map: A Novel Representation for 3D Action Recognition with Inception Residual Networks", presented at the 25th IEEE International Conference on Image Processing (ICIP). In the evaluation section, we also reproduce results from our paper: Pham, H.H.; Khoudour, L.; Crouzil, A.; Zegers, P.; Velastin, S.A. "Learning to recognise 3D human action from a new skeleton-based representation using deep convolutional neural network", published in IET Computer Vision in 2018, and compare them with the method described in this paper.
Sensors 2019, 19(8), 1932; https://doi.org/10.3390/s19081932
Received: 6 March 2019 / Revised: 10 April 2019 / Accepted: 17 April 2019 / Published: 24 April 2019
(This article belongs to the Special Issue Deep Learning-Based Image Sensors)

Abstract

Designing motion representations for 3D human action recognition from skeleton sequences is an important yet challenging task. An effective representation should be robust to noise, invariant to viewpoint changes, and achieve good performance with low computational demand. Two main challenges in this task are how to efficiently represent spatio–temporal patterns of skeletal movements and how to learn their discriminative features for classification. This paper presents a novel skeleton-based representation and a deep learning framework for 3D action recognition using RGB-D sensors. We propose to build an action map called the SPMF (Skeleton Posture-Motion Feature), a compact image representation built from skeleton poses and their motions. An Adaptive Histogram Equalization (AHE) algorithm is then applied to the SPMF to enhance its local patterns, forming an enhanced action map called the Enhanced-SPMF. For learning and classification, we exploit deep convolutional neural networks based on the DenseNet architecture to learn an end-to-end mapping from the Enhanced-SPMFs of input skeleton sequences to their action labels. The proposed method is evaluated on four challenging benchmark datasets covering individual actions, interactions, multi-view, and large-scale settings. The experimental results demonstrate that the proposed method outperforms previous state-of-the-art approaches on all benchmark tasks, whilst requiring low computational time for training and inference.
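
To make the pipeline concrete, the sketch below walks through the three stages the abstract names: encoding a skeleton sequence as a compact color image, enhancing its local patterns with adaptive histogram equalization, and classifying the result with a DenseNet. It is a minimal illustration, not the paper's implementation: the (T, J, 3) input layout, the raw coordinate-to-RGB normalization, OpenCV's CLAHE as the AHE step, the 224x224 map size, and the 60-class DenseNet-121 head are all assumptions made here; the actual SPMF encodes posture and motion features with a specific color mapping defined in the article.

import numpy as np
import cv2                      # OpenCV: resizing + CLAHE (an AHE variant)
import torch
import torchvision

def sequence_to_action_map(skeleton, out_size=(224, 224)):
    # skeleton: float array of shape (T, J, 3) -- T frames, J joints, (x, y, z).
    # Normalize each coordinate channel over the whole sequence to [0, 255],
    # so rows index time, columns index joints, and channels hold coordinates.
    # (Illustrative stand-in for the paper's SPMF construction.)
    lo = skeleton.min(axis=(0, 1), keepdims=True)
    hi = skeleton.max(axis=(0, 1), keepdims=True)
    img = (skeleton - lo) / (hi - lo + 1e-8)
    img = (img * 255.0).astype(np.uint8)
    return cv2.resize(img, out_size, interpolation=cv2.INTER_LINEAR)

def enhance_action_map(action_map, clip_limit=2.0, grid=(8, 8)):
    # Per-channel CLAHE to boost local contrast (the Enhanced-SPMF idea).
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=grid)
    channels = [clahe.apply(action_map[:, :, c]) for c in range(3)]
    return np.stack(channels, axis=-1)

# Toy end-to-end pass: 40 frames, 25 joints, 60 classes (placeholder sizes).
skeleton = np.random.rand(40, 25, 3).astype(np.float32)
enhanced = enhance_action_map(sequence_to_action_map(skeleton))
model = torchvision.models.densenet121(num_classes=60)   # assumed class count
model.eval()
x = torch.from_numpy(enhanced.transpose(2, 0, 1)).float().unsqueeze(0) / 255.0
logits = model(x)                                        # (1, 60) action scores

In practice the network would be trained on Enhanced-SPMF images generated from a labeled skeleton dataset; the 25-joint, 60-class figures above match NTU RGB+D but serve only as placeholders here.
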
Keywords: 3D human action recognition; skeleton-based representation; SPMF; Enhanced-SPMF; AHE; D-CNNs; DenseNet
This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Cite This Article

MDPI and ACS Style

Pham, H.H.; Salmane, H.; Khoudour, L.; Crouzil, A.; Zegers, P.; Velastin, S.A. Spatio–Temporal Image Representation of 3D Skeletal Movements for View-Invariant Action Recognition with Deep Convolutional Neural Networks. Sensors 2019, 19, 1932.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.
