Deep Full-Body HPE for Activity Recognition from RGB Frames Only
1 University of Tunis El Manar, National Engineering School of Tunis, 1002 Tunis, Tunisia
2 Université de Sousse, Ecole Nationale d’Ingénieurs de Sousse, LATIS - Laboratory of Advanced Technology and Intelligent Systems, 4023 Sousse, Tunisia
* Authors to whom correspondence should be addressed.
Informatics 2021, 8(1), 2; https://doi.org/10.3390/informatics8010002
Received: 30 November 2020 / Revised: 12 January 2021 / Accepted: 13 January 2021 / Published: 18 January 2021
(This article belongs to the Section Machine Learning)
Human Pose Estimation (HPE) is the problem of localizing human joints (also known as keypoints: elbows, wrists, etc.) in images or videos. It can also be defined as the search for a specific pose in the space of all articulated joints. HPE has recently received significant attention from the scientific community, mainly because pose estimation is considered a key step for many computer vision tasks. Although many approaches have reported promising results, the problem remains largely unsolved due to several challenges such as occlusions, small and barely visible joints, and variations in clothing and lighting. In the last few years, the power of deep neural networks has been demonstrated in a wide variety of computer vision problems, and especially in the HPE task. In this context, we present a Deep Full-Body-HPE (DFB-HPE) approach that works from RGB images only. Based on ConvNets, it predicts fifteen human joint positions, which can be further exploited for a large range of applications such as gesture recognition, sports performance analysis, or human-robot interaction. To evaluate the proposed deep pose estimation model, we apply it to recognize the daily activities of a person in an unconstrained environment. To this end, the extracted features, represented by the deep estimated poses, are fed to an SVM classifier. To validate the proposed architecture, our approach is tested on two publicly available benchmarks for pose estimation and activity recognition, namely the J-HMDB and CAD-60 datasets. The obtained results demonstrate the efficiency of the proposed method based on ConvNets and SVM and show how deep pose estimation can improve the recognition accuracy. Compared with state-of-the-art methods, we achieve the best HPE performance, as well as the best activity recognition precision on the CAD-60 dataset.
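As a rough illustration of how fifteen estimated joints could be turned into classifier input, the sketch below flattens one pose into a fixed-length feature vector. It is not the authors' implementation: the 2D (x, y) pixel layout, the division by frame size, and the names `pose_to_feature_vector` and `clip_to_feature_matrix` are all assumptions made for the example. Only the joint count (fifteen) and the use of an SVM on the estimated poses come from the abstract.

```python
from typing import List, Sequence, Tuple

# The abstract states that fifteen joint positions are predicted.
NUM_JOINTS = 15

# Assumption: each joint is a 2D (x, y) position in pixel coordinates.
Joint = Tuple[float, float]


def pose_to_feature_vector(
    joints: Sequence[Joint], width: int, height: int
) -> List[float]:
    """Flatten one pose into [x0/w, y0/h, x1/w, y1/h, ...] (30 values).

    Dividing by the frame size is an illustrative normalization, chosen so
    that poses from frames of different resolutions are comparable.
    """
    if len(joints) != NUM_JOINTS:
        raise ValueError(f"expected {NUM_JOINTS} joints, got {len(joints)}")
    if width <= 0 or height <= 0:
        raise ValueError("frame size must be positive")
    features: List[float] = []
    for x, y in joints:
        features.append(x / width)
        features.append(y / height)
    return features


def clip_to_feature_matrix(
    poses: Sequence[Sequence[Joint]], width: int, height: int
) -> List[List[float]]:
    """Convert a sequence of per-frame poses into one feature row per frame."""
    return [pose_to_feature_vector(pose, width, height) for pose in poses]


if __name__ == "__main__":
    # Hypothetical pose in a hypothetical 320x240 frame, for demonstration only.
    demo_pose = [(16.0 * i, 12.0 * i) for i in range(NUM_JOINTS)]
    row = pose_to_feature_vector(demo_pose, width=320, height=240)
    print(len(row))
```

Each row produced this way has the same length regardless of the frame, which is what a classifier such as an SVM expects; how the per-frame rows are aggregated over a video is left open here because the abstract does not specify it.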
Keywords: human pose estimation; human activity recognition; deep learning; ConvNets; SVM
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
MDPI and ACS Style
Neili Boualia, S.; Essoukri Ben Amara, N. Deep Full-Body HPE for Activity Recognition from RGB Frames Only. Informatics 2021, 8, 2. https://doi.org/10.3390/informatics8010002
AMA Style
Neili Boualia S, Essoukri Ben Amara N. Deep Full-Body HPE for Activity Recognition from RGB Frames Only. Informatics. 2021; 8(1):2. https://doi.org/10.3390/informatics8010002
Chicago/Turabian Style
Neili Boualia, Sameh; Essoukri Ben Amara, Najoua. 2021. "Deep Full-Body HPE for Activity Recognition from RGB Frames Only" Informatics 8, no. 1: 2. https://doi.org/10.3390/informatics8010002