Special Issue "Statistical Machine Learning for Human Behaviour Analysis"

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Statistical Physics".

Deadline for manuscript submissions: closed (31 August 2019).

Special Issue Editors

Guest Editor
Prof. Dr. Thomas Moeslund
Visual Analysis of People Lab, Aalborg University, Rendsburggade 14, 9000 Aalborg, Denmark
Interests: computer vision; image processing; machine vision; pattern recognition; visual analysis of people's whereabouts; surveillance; traffic monitoring
Guest Editor
Prof. Dr. Sergio Escalera
Computer Vision Center UAB, University of Barcelona, Spain
Phone: +34934020853
Interests: human behaviour analysis; pattern recognition; machine learning
Guest Editor
Prof. Dr. Gholamreza Anbarjafari
Intelligent Computer Vision (iCV) Research Lab, Institute of Technology, University of Tartu, Estonia
Interests: human–robot interaction; emotion recognition; machine learning
Guest Editor
Prof. Dr. Kamal Nasrollahi
Visual Analysis of People (VAP) Laboratory, Aalborg University, Denmark
Interests: action recognition; pattern recognition; machine learning
Guest Editor
Dr. Jun Wan
National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences, China
Interests: action recognition; human behaviour analysis; pattern recognition

Special Issue Information

Dear Colleagues,

Human Behaviour Analysis has introduced a number of challenges in various fields, such as applied information theory, affective computing, robotics, biometrics, and pattern recognition. This Special Issue focuses on novel vision-based approaches, which fall mainly under the broader categories of computer vision and machine learning. Owing to the multidisciplinary nature of the task, theoretical advancements and practical developments in these areas usually benefit from contributions brought by other areas of research in the relevant domains of science and technology.

We solicit submissions on the following topics:

  • Information-theory-based pattern classification
  • Biometric recognition
  • Multimodal human analysis
  • Low-resolution human activity analysis
  • Face analysis
  • Abnormal behaviour analysis
  • Unsupervised human analysis scenarios
  • 3D/4D human pose and shape estimation
  • Human analysis in virtual/augmented reality
  • Affective computing
  • Social signal processing
  • Personality computing
  • Activity recognition
  • Human tracking in the wild
  • Application of information-theoretic concepts for human behaviour analysis

Prof. Dr. Thomas Moeslund
Prof. Dr. Sergio Escalera
Prof. Dr. Gholamreza Anbarjafari
Prof. Dr. Kamal Nasrollahi
Dr. Jun Wan
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • human behaviour analysis
  • machine learning
  • information theory
  • biometrics
  • emotion recognition

Published Papers (13 papers)


Research

Open Access Article
Entropy-Based Clustering Algorithm for Fingerprint Singular Point Detection
Entropy 2019, 21(8), 786; https://doi.org/10.3390/e21080786 - 12 Aug 2019
Abstract
Fingerprints have long been used in automated fingerprint identification or verification systems. Singular points (SPs), namely the core and delta points, are the basic features widely used for fingerprint registration, orientation field estimation, and fingerprint classification. In this study, we propose an adaptive method to detect SPs in a fingerprint image. The algorithm consists of three stages. First, an innovative enhancement method based on singular value decomposition is applied to remove the background of the fingerprint image. Second, a blurring detection and boundary segmentation algorithm based on this enhancement is proposed to detect the region of impression. Finally, an adaptive method based on wavelet extrema and the Henry system is proposed for core point detection. Experiments conducted on the FVC2002 DB1 and DB2 databases show that our method can detect SPs reliably.
(This article belongs to the Special Issue Statistical Machine Learning for Human Behaviour Analysis)
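The SVD-based enhancement is described only at a high level in the abstract. One common way to suppress a smooth background with an SVD is to zero the dominant singular components, which tend to capture low-rank background illumination; the numpy sketch below illustrates that general technique, not the paper's actual algorithm (the function name and toy image are ours):

```python
import numpy as np

def remove_background_svd(img, k=1):
    """Suppress the k largest singular components of a grayscale image.

    The leading singular components of a fingerprint image tend to capture
    the smooth background; zeroing them keeps the finer ridge detail.
    Illustrative sketch only -- the paper's enhancement is more involved.
    """
    U, s, Vt = np.linalg.svd(img.astype(float), full_matrices=False)
    s[:k] = 0.0  # drop the dominant (background) components
    return U @ np.diag(s) @ Vt

# toy example: bright constant background plus a fine oscillating pattern
x = np.linspace(0, 8 * np.pi, 64)
y = np.linspace(0, 4 * np.pi, 64)
img = 100.0 + np.cos(y)[:, None] * np.sin(x)[None, :]
cleaned = remove_background_svd(img)  # background level removed, pattern kept
```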

Open Access Article
A Unified Framework for Head Pose, Age and Gender Classification through End-to-End Face Segmentation
Entropy 2019, 21(7), 647; https://doi.org/10.3390/e21070647 - 30 Jun 2019
Abstract
Accurate face segmentation strongly benefits the human face image analysis problem. In this paper we propose a unified framework for face image analysis through end-to-end semantic face segmentation. The proposed framework contains a set of stacked components for face understanding, which includes head pose estimation, age classification, and gender recognition. A manually labeled face dataset is used for training the Conditional Random Fields (CRFs) based segmentation model. A multi-class face segmentation framework developed through CRFs segments a facial image into six parts. A probabilistic classification strategy is used, and probability maps are generated for each class. The probability maps are used as feature descriptors, and a Random Decision Forest (RDF) classifier is modeled for each task (head pose, age, and gender). We assess the performance of the proposed framework on several datasets and report better results than those previously reported.

Open Access Article
Emotion Recognition from Skeletal Movements
Entropy 2019, 21(7), 646; https://doi.org/10.3390/e21070646 - 29 Jun 2019
Abstract
Automatic emotion recognition has become an important trend in many artificial intelligence (AI) based applications and has been widely explored in recent years. Most research in the area of automated emotion recognition is based on facial expressions or speech signals. Although the influence of the emotional state on body movements is undeniable, this source of expression is still underestimated in automatic analysis. In this paper, we propose a novel method to recognise seven basic emotional states—namely, happy, sad, surprise, fear, anger, disgust and neutral—utilising body movement. We analyse motion capture data under seven basic emotional states recorded by professional actors/actresses using a Microsoft Kinect v2 sensor. We propose a new representation of affective movements, based on sequences of body joints. The proposed algorithm creates a sequential model of affective movement based on low-level features inferred from the spatial location and the orientation of joints within the tracked skeleton. In the experiments, different deep neural networks were employed and compared to recognise the emotional state of the acquired motion sequences. The results show the feasibility of automatic emotion recognition from sequences of body gestures, which can serve as an additional source of information in multimodal emotion recognition.

Open Access Article
Enhanced Approach Using Reduced SBTFD Features and Modified Individual Behavior Estimation for Crowd Condition Prediction
Entropy 2019, 21(5), 487; https://doi.org/10.3390/e21050487 - 13 May 2019
Abstract
Sensor technology provides the real-time monitoring of data in several scenarios that contribute to the improved security of life and property. Crowd condition monitoring is an area that has benefited from this. The basic context-aware framework (BCF) uses activity recognition based on emerging intelligent technology and is among the best that has been proposed for this purpose. However, its accuracy is low, and its false negative rate (FNR) remains high. Thus, an enhanced framework that offers reduced FNR and higher accuracy becomes necessary. This article reports our work on the development of an enhanced context-aware framework (EHCAF) using smartphone participatory sensing for crowd monitoring, dimensionality reduction of statistical-based time-frequency domain (SBTFD) features, and enhanced individual behavior estimation (IBEenhcaf). The experimental results achieved 99.1% accuracy and an FNR of 2.8%, a clear improvement over the BCF's 92.0% accuracy and 31.3% FNR.

Open Access Article
3D CNN-Based Speech Emotion Recognition Using K-Means Clustering and Spectrograms
Entropy 2019, 21(5), 479; https://doi.org/10.3390/e21050479 - 08 May 2019
Cited by 1
Abstract
Detecting human intentions and emotions helps improve human–robot interactions. Emotion recognition has been a challenging research direction in the past decade. This paper proposes an emotion recognition system based on the analysis of speech signals. First, we split each speech signal into overlapping frames of the same length. Next, we extract an 88-dimensional vector of audio features, including Mel Frequency Cepstral Coefficients (MFCC), pitch, and intensity, for each frame. In parallel, the spectrogram of each frame is generated. In the final preprocessing step, by applying k-means clustering to the extracted features of all frames of each audio signal, we select the k most discriminant frames, namely keyframes, to summarize the speech signal. The sequence of spectrograms corresponding to the keyframes is then encapsulated in a 3D tensor. These tensors are used to train and test a 3D Convolutional Neural Network (CNN) using 10-fold cross-validation. The proposed 3D CNN has two convolutional layers and one fully connected layer. Experiments are conducted on the Surrey Audio-Visual Expressed Emotion (SAVEE), Ryerson Multimedia Laboratory (RML), and eNTERFACE'05 databases. The results are superior to the state-of-the-art methods reported in the literature.
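The keyframe-selection step described above can be illustrated with a small numpy sketch. The `select_keyframes` helper and its evenly spaced centroid initialization are our own simplifications, since the abstract does not specify the exact k-means configuration:

```python
import numpy as np

def select_keyframes(features, k, iters=20):
    """Pick k 'keyframes' by k-means over per-frame feature vectors.

    features: (n_frames, n_dims) array, e.g. 88-D MFCC/pitch/intensity vectors.
    Returns the indices of the frames nearest each cluster centroid, in
    temporal order. A plain numpy re-implementation for illustration only.
    """
    # deterministic init: centroids seeded with evenly spaced frames
    centroids = features[np.linspace(0, len(features) - 1, k).astype(int)]
    for _ in range(iters):
        # assign each frame to its nearest centroid
        d = np.linalg.norm(features[:, None] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # skip empty clusters
                centroids[j] = features[labels == j].mean(axis=0)
    d = np.linalg.norm(features[:, None] - centroids[None], axis=2)
    # one representative frame per cluster, restored to temporal order
    return sorted(d.argmin(axis=0).tolist())

# toy example: 30 frames of 8-D features drawn from 3 well-separated groups
rng = np.random.default_rng(1)
feats = np.concatenate([rng.normal(c, 0.1, (10, 8)) for c in (0.0, 5.0, 10.0)])
idx = select_keyframes(feats, k=3)  # one keyframe per group, in order
```

In the paper the spectrograms of the selected keyframes would then be stacked into the 3D tensor fed to the CNN.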

Open Access Article
Learning Using Concave and Convex Kernels: Applications in Predicting Quality of Sleep and Level of Fatigue in Fibromyalgia
Entropy 2019, 21(5), 442; https://doi.org/10.3390/e21050442 - 28 Apr 2019
Cited by 1
Abstract
Fibromyalgia is a medical condition characterized by widespread muscle pain and tenderness and is often accompanied by fatigue and alterations in sleep, mood, and memory. Poor sleep quality and fatigue, as prominent characteristics of fibromyalgia, have a direct impact on patient behavior and quality of life. As such, the detection of extreme cases of sleep quality and fatigue level is a prerequisite for any intervention that can improve sleep quality and reduce fatigue level for people with fibromyalgia and enhance their daytime functionality. In this study, we propose a new supervised machine learning method called Learning Using Concave and Convex Kernels (LUCCK). This method employs similarity functions whose convexity or concavity can be configured to determine a model for each feature separately, and then uses this information to reweight the importance of each feature proportionally during classification. The data used for this study were collected from patients with fibromyalgia and consisted of blood volume pulse (BVP), 3-axis accelerometer, temperature, and electrodermal activity (EDA) readings, recorded by an Empatica E4 wristband over the course of several days, as well as a self-reported survey. Experiments on this dataset demonstrate that the proposed method outperforms conventional machine learning approaches in detecting extreme cases of poor sleep and fatigue in people with fibromyalgia.

Open Access Article
Action Recognition Using Single-Pixel Time-of-Flight Detection
Entropy 2019, 21(4), 414; https://doi.org/10.3390/e21040414 - 18 Apr 2019
Cited by 1
Abstract
Action recognition is a challenging task that plays an important role in many robotic systems, which highly depend on visual input feeds. However, due to privacy concerns, it is important to find a method that can recognise actions without using a visual feed. In this paper, we propose a concept for detecting actions while preserving the test subject's privacy. Our proposed method relies only on recording the temporal evolution of light pulses scattered back from the scene. The data trace recording a single action contains a sequence of one-dimensional arrays of voltage values acquired by a single-pixel detector at a 1 GHz repetition rate. Information about both the distance to the object and its shape is embedded in the traces. We apply machine learning in the form of recurrent neural networks for data analysis and demonstrate successful action recognition. The experimental results show that our proposed method achieves, on average, 96.47% accuracy on the actions walking forward, walking backwards, sitting down, standing up, and waving a hand, using a recurrent neural network.

Open Access Article
Supervisors’ Visual Attention Allocation Modeling Using Hybrid Entropy
Entropy 2019, 21(4), 393; https://doi.org/10.3390/e21040393 - 12 Apr 2019
Abstract
With the improvement in automation technology, humans have now become supervisors of complicated control systems, monitoring informative human–machine interfaces. Analyzing the visual attention allocation behaviors of supervisors is essential for the design and evaluation of such interfaces. Supervisors tend to pay attention to visual sections whose information carries more fuzziness, which gives them higher mental entropy, and to focus on the important information in the interface. In this paper, the fuzziness tendency is described by the probability of correct evaluation of the visual sections using hybrid entropy. The importance tendency is defined by the proposed value priority function, which is based on a definition of the amount of information using the membership degrees of importance. By combining these two cognitive tendencies, the informative top-down visual attention allocation mechanism was revealed, and the supervisors' visual attention allocation model was built. The Building Automation System (BAS) used to monitor the environmental equipment in a subway is a typical informative human–machine interface, and an experiment using a BAS simulator was conducted to verify the model. The results showed that supervisors' attention behavior was in good agreement with the proposed model. The effectiveness of the model and its comparison with current models are also discussed. The proposed attention allocation model is effective and reasonable, and is promising for use in behavior analysis, cognitive optimization, and industrial design.

Open Access Article
Saliency Detection Based on the Combination of High-Level Knowledge and Low-Level Cues in Foggy Images
Entropy 2019, 21(4), 374; https://doi.org/10.3390/e21040374 - 06 Apr 2019
Abstract
A key issue in saliency detection in foggy images for human tracking in the wild is how to effectively define the less obvious salient objects, the leading cause being that contrast and resolution are reduced by light scattering through fog particles. In this paper, to suppress the interference of the fog and acquire the boundaries of salient objects more precisely, we present a novel saliency detection method for human tracking in the wild that combines object contour detection and salient object detection. The proposed model can not only maintain object edges more precisely via contour detection, but also ensure the integrity of salient objects, and finally obtain accurate saliency maps. Firstly, the input image is transformed into HSV color space, and the amplitude spectrum (AS) of each color channel is adjusted to obtain the frequency domain (FD) saliency map. Then, the local-global superpixel contrast is calculated, and the spatial domain (SD) saliency map is obtained. We use the Discrete Stationary Wavelet Transform (DSWT) to fuse the FD and SD cues. Finally, a fully convolutional encoder–decoder model is utilized to refine the contours of the salient objects. Experimental results demonstrate that the presented model can remove the influence of fog efficiently, and that its performance is better than 16 state-of-the-art saliency models.

Open Access Article
Detecting Toe-Off Events Utilizing a Vision-Based Method
Entropy 2019, 21(4), 329; https://doi.org/10.3390/e21040329 - 27 Mar 2019
Abstract
Accurately detecting gait events from video data is a challenging problem. Most current detection methods for gait events are based on wearable sensors, which require high user cooperation and are constrained by power consumption. This study presents a novel algorithm for achieving accurate detection of toe-off events using a single 2D vision camera, without the cooperation of participants. First, a set of novel features, namely consecutive silhouettes difference maps (CSD-maps), is proposed to represent gait patterns. A CSD-map encodes several consecutive pedestrian silhouettes extracted from video frames into a single map, and different numbers of consecutive silhouettes result in different types of CSD-maps, which provide significant features for toe-off event detection. A convolutional neural network is then employed to reduce feature dimensions and classify toe-off events. Experiments on a public database demonstrate that the proposed method achieves good detection accuracy.
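As a rough illustration of the CSD-map idea, consecutive binary silhouettes can be differenced and accumulated into a single motion map. The `csd_map` helper below is a plausible reading of the abstract, not the paper's exact definition:

```python
import numpy as np

def csd_map(silhouettes):
    """Build a consecutive-silhouettes-difference (CSD) map.

    silhouettes: (n, H, W) stack of binary pedestrian silhouettes from
    consecutive frames. Each consecutive pair contributes its absolute
    difference; accumulating the differences encodes motion (e.g. the
    swing of the foot around toe-off) into one map.
    """
    sil = silhouettes.astype(np.int32)
    diffs = np.abs(np.diff(sil, axis=0))  # (n-1, H, W) frame-to-frame changes
    return diffs.sum(axis=0)              # accumulate into a single map

# toy example: a 1-pixel-wide "leg" sweeping right across a 5x5 frame
frames = np.zeros((4, 5, 5), dtype=np.uint8)
for t in range(4):
    frames[t, :, t] = 1   # the silhouette column moves one pixel per frame
cmap = csd_map(frames)    # columns touched by more transitions score higher
```

In the paper, maps built from different numbers of consecutive silhouettes would then be fed to the CNN classifier.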

Open Access Article
Incremental Market Behavior Classification in Presence of Recurring Concepts
Entropy 2019, 21(1), 25; https://doi.org/10.3390/e21010025 - 01 Jan 2019
Abstract
In recent years, the problem of concept drift has gained importance in the financial domain. The succession of manias, panics and crashes has stressed the non-stationary nature of markets and the likelihood of drastic structural or concept changes in them. Traditional systems are unable or slow to adapt to these changes. Ensemble-based systems are widely known for their good results predicting both cyclic and non-stationary data such as stock prices. In this work, we propose RCARF (Recurring Concepts Adaptive Random Forests), an ensemble tree-based online classifier that handles recurring concepts explicitly. The algorithm extends the capabilities of a version of Random Forest for evolving data streams, adding on top a mechanism to store and handle a shared collection of inactive trees, called the concept history, which holds memories of the way market operators reacted in similar circumstances. This works in conjunction with a decision strategy that reacts to drift by replacing active trees with the best available alternative: either a previously stored tree from the concept history or a newly trained background tree. Both mechanisms are designed to provide fast reaction times and are thus applicable to high-frequency data. The experimental validation of the algorithm is based on the prediction of price movement directions one second ahead in the SPDR (Standard & Poor's Depositary Receipts) S&P 500 Exchange-Traded Fund. RCARF is benchmarked against other popular methods from the incremental online machine learning literature and achieves competitive results.
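The store-and-recall mechanism described above can be sketched in a few lines. The classes below use nearest-centroid stubs in place of RCARF's adaptive random forest trees and a label-supplied drift signal in place of its drift detector, so all names and simplifications here are ours:

```python
import numpy as np

class CentroidModel:
    """Nearest-class-centroid stub standing in for an ensemble tree."""

    def fit(self, X, y):
        self.classes = np.unique(y)
        self.centroids = np.array([X[y == c].mean(axis=0) for c in self.classes])
        return self

    def predict(self, X):
        d = np.linalg.norm(X[:, None] - self.centroids[None], axis=2)
        return self.classes[d.argmin(axis=1)]

    def accuracy(self, X, y):
        return float((self.predict(X) == y).mean())

class ConceptHistoryClassifier:
    """On a drift signal, install the best of a freshly trained background
    model or a model recalled from the concept history; the displaced
    active model is archived for reuse if its concept recurs."""

    def __init__(self):
        self.active = None
        self.history = []  # inactive models: the "concept history"

    def fit_initial(self, X, y):
        self.active = CentroidModel().fit(X, y)

    def on_drift(self, X_recent, y_recent):
        background = CentroidModel().fit(X_recent, y_recent)
        candidates = [background] + self.history
        best = max(candidates, key=lambda m: m.accuracy(X_recent, y_recent))
        self.history.append(self.active)   # archive the old concept
        if best in self.history:
            self.history.remove(best)      # a recalled model leaves the history
        self.active = best

    def predict(self, X):
        return self.active.predict(X)

# toy stream: concept A, then a drift that flips the class labels
X = np.concatenate([np.full((20, 2), -1.0), np.full((20, 2), 1.0)])
yA = np.array([0] * 20 + [1] * 20)
yB = 1 - yA
clf = ConceptHistoryClassifier()
clf.fit_initial(X, yA)
clf.on_drift(X, yB)   # drift: the background model wins and is installed
```

If concept A later recurs, the archived model can win the `on_drift` comparison and be reinstalled without retraining, which is the point of the concept history.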

Open Access Article
Multi-Modal Deep Hand Sign Language Recognition in Still Images Using Restricted Boltzmann Machine
Entropy 2018, 20(11), 809; https://doi.org/10.3390/e20110809 - 23 Oct 2018
Abstract
In this paper, a deep learning approach, the Restricted Boltzmann Machine (RBM), is used to perform automatic hand sign language recognition from visual data. We evaluate how the RBM, as a deep generative model, is capable of generating the distribution of the input data for an enhanced recognition of unseen data. Two modalities, RGB and Depth, are considered as model inputs in three forms: original image, cropped image, and noisy cropped image. Five crops of the input image are used, and the hands in these cropped images are detected using a Convolutional Neural Network (CNN). After that, three types of detected hand images are generated for each modality and input to RBMs. The outputs of the RBMs for the two modalities are fused in another RBM in order to recognize the sign label of the input image. The proposed multi-modal model is trained on all and part of the American alphabet and digits of four publicly available datasets. We also evaluate the robustness of the proposal against noise. Experimental results show that the proposed multi-modal model, using crops and the RBM fusing methodology, achieves state-of-the-art results on the Massey University Gesture Dataset 2012, the American Sign Language (ASL) Fingerspelling Dataset from the University of Surrey's Centre for Vision, Speech and Signal Processing, the NYU dataset, and the ASL Fingerspelling A dataset.

Open Access Article
Multi-Objective Evolutionary Rule-Based Classification with Categorical Data
Entropy 2018, 20(9), 684; https://doi.org/10.3390/e20090684 - 07 Sep 2018
Cited by 1
Abstract
The ease of interpretation of a classification model is essential for the task of validating it. Sometimes it is required to clearly explain the classification process behind a model's predictions. Models which are inherently easier to interpret can be effortlessly related to the context of the problem, and their predictions can be, if necessary, ethically and legally evaluated. In this paper, we propose a novel method to generate rule-based classifiers from categorical data that can be readily interpreted. Classifiers are generated using a multi-objective optimization approach focusing on two main objectives: maximizing the performance of the learned classifier and minimizing its number of rules. The multi-objective evolutionary algorithms ENORA and NSGA-II have been adapted to optimize the performance of the classifier based on three different machine learning metrics: accuracy, area under the ROC curve, and root mean square error. We have extensively compared the classifiers generated by our proposed method with those generated by classical methods such as PART, JRip, OneR and ZeroR. The experiments have been conducted in full training mode, in 10-fold cross-validation mode, and in train/test splitting mode. To make the results reproducible, we have used the well-known and publicly available datasets Breast Cancer, Monk's Problem 2, Tic-Tac-Toe-Endgame, Car, kr-vs-kp and Nursery. After performing an exhaustive statistical test on our results, we conclude that the proposed method is able to generate highly accurate and easy-to-interpret classification models.
