Special Issue "Algorithms for Human Gesture, Activity and Mobility Analysis"

A special issue of Algorithms (ISSN 1999-4893). This special issue belongs to the section "Algorithms for Multidisciplinary Applications".

Deadline for manuscript submissions: closed (20 December 2020).

Special Issue Editors

Prof. Dr. Mounim A. El Yacoubi
Guest Editor
Telecom SudParis, Institut Polytechnique de Paris, 91120 Palaiseau, France
Interests: machine learning; deep learning; pattern recognition; modeling behavioral and physiological human data; human activity and gesture recognition; handwriting and voice analysis; human mobility analysis; biometrics; human–computer interaction; detection and assessment of neurodegenerative diseases from biometric signals
Prof. Dr. Hui Yu
Guest Editor
Prof. Dr. Mehdi Ammi
Guest Editor
Department of Computer Science, University of Paris 8, 93526 Saint-Denis, France
Interests: human activity recognition; modeling physiological functions; emotions recognition; affective and social interaction; human–computer interaction; pervasive and ubiquitous environments; Internet of Things; e-health

Special Issue Information

Dear Colleagues,

Human activity, gesture and mobility analysis (HAGMA) has been a hot research area for the past decade, thanks to several converging factors: the recent ubiquity of Internet of Things (IoT) devices for collecting data, the dramatic increase in computing and storage power, and the advent of advanced machine learning algorithms able to make sense of complex data. IoT and wearable devices have made it possible to collect an unprecedented diversity of data: video (possibly including depth information), audio, gaze, geolocation, inertial, and egocentric vision data, among others. The huge computing resources available today have made it possible to store such data in the form of large datasets that were not available before. These resources were also key in triggering massive research on, and unleashing the power of, advanced machine learning algorithms, in particular deep learning, to design robust solutions in the field. As a result, activity and gesture recognition, as well as mobility analysis, have found broad application in areas such as human–computer/robot interaction, e-health, affective computing, sign language recognition, video surveillance, sports, education and entertainment, and mobility analysis for smart cities.

In spite of the relative success of these applications, several challenges remain before HAGMA solutions can be deployed at large scale, chief among them ensuring robustness with respect to sources of variability (e.g., view angles in vision, audio noise, irrelevant signals resulting from moving cameras or inertial sensors, missing data), and combining different sources of information in a multimodal framework that gives more weight to the most reliable ones. From the machine learning standpoint, research is currently active in areas including, but not limited to, adapting advanced techniques such as CNNs, RNNs (LSTM, GRU, etc.), and transformer models to HAGMA tasks; designing effective transfer learning algorithms from one task to another; devising explainable (e.g., attentional) models that make decisions understandable; and working out robust models that are as immune as possible to adversarial attacks.

This Special Issue aims to gather recent advances in AI (particularly advanced machine learning techniques) for applications of gesture, activity, and mobility analysis and recognition, by bringing together researchers from academia and industry to contribute and discuss the latest research and innovations in this field.

Prof. Dr. Mounim A. El Yacoubi
Prof. Dr. Hui Yu
Prof. Dr. Mehdi Ammi
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • human activity, action, and gesture recognition
  • sign language recognition
  • human–robot interaction
  • affective analysis
  • activities of daily living
  • activity summarization
  • mobility analysis
  • sensors
  • multimodal schemes
  • new datasets
  • artificial intelligence and machine learning
  • supervised, unsupervised, self-supervised, and reinforcement learning
  • deep learning, CNN, RNN-based (LSTM, GRU, etc.), transformer models
  • transfer learning
  • explainable and attentional models
  • adversarial attacks and robust models

Published Papers (4 papers)

Research

Article
Predicting Intentions of Pedestrians from 2D Skeletal Pose Sequences with a Representation-Focused Multi-Branch Deep Learning Network
Algorithms 2020, 13(12), 331; https://doi.org/10.3390/a13120331 - 10 Dec 2020
Cited by 2 | Viewed by 978
Abstract
Understanding the behaviors and intentions of humans is still one of the main challenges for vehicle autonomy. More specifically, inferring the intentions and actions of vulnerable actors, namely pedestrians, in complex situations such as urban traffic scenes remains a difficult task and a blocking point towards more automated vehicles. Answering the question “Is the pedestrian going to cross?” is a good starting point on the way to the fifth level of autonomous driving. In this paper, we address the problem of real-time discrete intention prediction of pedestrians in urban traffic environments by linking the dynamics of a pedestrian’s skeleton to an intention. Hence, we propose SPI-Net (Skeleton-based Pedestrian Intention network): a representation-focused multi-branch network combining features from 2D pedestrian body poses for the prediction of pedestrians’ discrete intentions. Experimental results show that SPI-Net achieves 94.4% accuracy in pedestrian crossing prediction on the JAAD dataset while remaining efficient for real-time scenarios, reaching around one inference every 0.25 ms on one GPU (an RTX 2080 Ti) and every 0.67 ms on one CPU (an Intel Core i7-8700K).
(This article belongs to the Special Issue Algorithms for Human Gesture, Activity and Mobility Analysis)

Article
Generative Model for Skeletal Human Movements Based on Conditional DC-GAN Applied to Pseudo-Images
Algorithms 2020, 13(12), 319; https://doi.org/10.3390/a13120319 - 03 Dec 2020
Viewed by 835
Abstract
Generative models for images, audio, text, and other low-dimensional data have achieved great success in recent years. Generating artificial human movements can also be useful for many applications, including the improvement of data augmentation methods for human gesture recognition. The objective of this research is to develop a generative model for skeletal human movement that allows the action type of the generated motion to be controlled while preserving the authenticity of the result and the natural style variability of gesture execution. We propose to use a conditional Deep Convolutional Generative Adversarial Network (DC-GAN) applied to pseudo-images representing skeletal pose sequences in the tree-structure skeleton image format. We evaluate our approach on the 3D skeletal data provided in the large NTU_RGB+D public dataset. Our generative model can output qualitatively correct skeletal human movements for any of the 60 action classes. We also quantitatively evaluate the performance of our model by computing Fréchet inception distances, which have been shown to correlate strongly with human judgement. To the best of our knowledge, our work is the first successful class-conditioned generative model for human skeletal motions based on a pseudo-image representation of skeletal pose sequences.
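The pseudo-image idea in this abstract can be illustrated with a minimal sketch: each skeleton frame becomes one column of a 2D grid whose rows are joints in a fixed tree-traversal order, so that standard image convolutions mix spatial and temporal structure. The joint names, frame count, and layout below are illustrative assumptions, not the exact TSSI format used in the paper.

```python
# Hypothetical sketch of a pseudo-image layout for skeletal pose sequences.
# Rows = joints in a (assumed) tree-traversal order, columns = frames,
# each cell = an (x, y, z) coordinate triple (the three "color channels").

def skeleton_to_pseudo_image(frames, joint_order):
    """Map a sequence of skeleton frames to a joints x frames grid.

    frames: list of dicts {joint_name: (x, y, z)}, one dict per frame.
    joint_order: traversal order of joint names (the rows of the grid).
    """
    image = []
    for joint in joint_order:                      # one row per joint
        row = [frame[joint] for frame in frames]   # one column per frame
        image.append(row)
    return image

# Toy example: a 3-joint skeleton observed over 2 frames.
frames = [
    {"hip": (0.0, 0.0, 0.0), "spine": (0.0, 0.5, 0.0), "head": (0.0, 1.0, 0.0)},
    {"hip": (0.1, 0.0, 0.0), "spine": (0.1, 0.5, 0.0), "head": (0.1, 1.0, 0.0)},
]
img = skeleton_to_pseudo_image(frames, ["hip", "spine", "head"])
# img has 3 rows (joints) and 2 columns (frames); a conditional DC-GAN
# would then be trained to generate such grids, conditioned on an action class.
```

In the real tree-structure skeleton image format, the traversal revisits joints so that kinematically adjacent joints end up in adjacent rows; this toy ordering omits that detail.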

Article
Experimenting the Automatic Recognition of Non-Conventionalized Units in Sign Language
Algorithms 2020, 13(12), 310; https://doi.org/10.3390/a13120310 - 25 Nov 2020
Viewed by 760
Abstract
Sign Languages (SLs) are visual–gestural languages that have developed naturally in deaf communities. They are based on the use of lexical signs, that is, conventionalized units, as well as highly iconic structures, i.e., structures in which the form of an utterance and the meaning it carries are not independent. Although most research in automatic Sign Language Recognition (SLR) has focused on lexical signs, we wish to broaden this perspective and consider the recognition of non-conventionalized iconic and syntactic elements. We propose the use of corpora made by linguists, such as the finely and consistently annotated dialogue corpus Dicta-Sign-LSF-v2. We then redefine the problem of automatic SLR as the recognition of linguistic descriptors, with carefully thought-out performance metrics. Moreover, we develop a compact and generalizable representation of signers in videos by parallel processing of the hands, face, and upper body, together with an adapted learning architecture based on a Recurrent Convolutional Neural Network (RCNN). Through a study focused on the recognition of four linguistic descriptors, we show the soundness of the proposed approach and pave the way for a wider understanding of Continuous Sign Language Recognition (CSLR).

Article
I3D-Shufflenet Based Human Action Recognition
Algorithms 2020, 13(11), 301; https://doi.org/10.3390/a13110301 - 18 Nov 2020
Cited by 1 | Viewed by 834
Abstract
Optical-flow-based human action recognition is difficult to apply in practice because of its large computational cost. To address this, we propose I3D-shufflenet, a human action recognition model that combines the advantages of the I3D neural network and the lightweight shufflenet model. The 5 × 5 convolution kernel of I3D is replaced by two stacked 3 × 3 convolution kernels, which reduces the amount of computation, and a shuffle layer is adopted to achieve feature exchange across channels. Recognition and classification of human actions are performed with the trained I3D-shufflenet model. The experimental results show that the shuffle layer improves the composition of features in each channel, which promotes the utilization of useful information. Histogram of Oriented Gradients (HOG) spatial–temporal features of the object are extracted for training, which significantly improves the expressiveness of human actions and reduces the cost of feature extraction. I3D-shufflenet is evaluated on the UCF101 dataset and compared with other models; it achieves higher accuracy than the original I3D, reaching 96.4%.
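The kernel factorization this abstract describes (one 5 × 5 convolution replaced by two stacked 3 × 3 convolutions, which together cover the same 5 × 5 receptive field) can be checked with a back-of-the-envelope parameter count. The channel counts below are illustrative assumptions, not taken from the I3D architecture, and biases are ignored.

```python
# Back-of-the-envelope check of the 5x5 -> two 3x3 factorization.
# Assumes the intermediate layer keeps the same channel count (illustrative).

def conv_params(k, c_in, c_out):
    """Number of weights in a k x k convolution, bias terms ignored."""
    return k * k * c_in * c_out

c_in = c_out = 64                                # hypothetical channel counts
single_5x5 = conv_params(5, c_in, c_out)         # 25 * 64 * 64
double_3x3 = 2 * conv_params(3, c_in, c_out)     # 2 * 9 * 64 * 64

print(single_5x5, double_3x3)                    # 102400 73728
savings = 1 - double_3x3 / single_5x5
print(f"{savings:.0%}")                          # 28%
```

The ratio 18/25 is independent of the channel counts, so the factorization saves about 28% of the weights (and multiply-accumulates) whenever the intermediate channel count matches the output, with the added nonlinearity between the two 3 × 3 layers as a side benefit.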
