Special Issue "Sensors-Based Human Action and Emotion Recognition"

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Intelligent Sensors".

Deadline for manuscript submissions: 15 November 2022 | Viewed by 4906

Special Issue Editors

Prof. Dr. ByoungChul Ko
Guest Editor
Department of Computer Engineering, Keimyung University, Shindang-Dong, Dalseo-Gu, Daegu 704-701, Korea
Interests: fire and smoke detection; advanced driver assistant system; human detection and tracking; analysis of remote sensing images; human action recognition; medical image processing
Prof. Dr. Jong-Ha Lee
Guest Editor
Department of Biomedical Engineering, School of Medicine, Keimyung University, Dalseo-gu, Daegu, Korea
Interests: computer aided diagnostics; artificial intelligence; biomedical optics

Special Issue Information

Dear Colleagues,

Human action and emotion recognition (HAER) technology, which analyzes data collected from various types of sensing devices, including vision and embedded sensors, can be used in virtual reality (VR), augmented reality (AR), video surveillance, sports analysis, human–computer interaction, and healthcare. It has been applied to the development of various context-aware applications in emerging application areas.

In particular, HAER research has recently shifted from traditional machine learning toward deep learning. Achieving high recognition accuracy requires large-scale, well-annotated HAER-related databases, and research on sensor (video)-based HAER using publicly available HAER-related databases is growing in popularity.

The purpose of this Special Issue is to introduce current developments in sensor- and video-based human action and emotion recognition combined with machine learning, including computer vision, pattern recognition, expert systems, deep learning, and so on. You are invited to submit contributions of original research, advancements, developments, and experiments pertaining to machine learning combined with sensors. This Special Issue therefore welcomes newly developed methods and ideas combining data obtained from various sensors in the following fields (but not limited to these fields):

  • HAER based on machine learning;
  • Deep network structure/learning algorithm for HAER;
  • HAER based on machine learning-driven sensor fusion techniques;
  • Face and gaze recognition for HAER;
  • Multi-modal/task learning for HAER decision-making and control;
  • HAER technologies in autonomous vehicles;
  • State of practice, research overview, experience reports, industrial experiments, and case studies in HAER.

Prof. Dr. ByoungChul Ko
Prof. Dr. Jong-Ha Lee
Guest Editors

For more information or advice, please contact the Special Issue Editor, Bell Ding, at <[email protected]>.

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (6 papers)


Research

Article
Towards Building a Visual Behaviour Analysis Pipeline for Suicide Detection and Prevention
Sensors 2022, 22(12), 4488; https://doi.org/10.3390/s22124488 - 14 Jun 2022
Viewed by 467
Abstract
Understanding human behaviours through video analysis has seen significant research progress in recent years with the advancement of deep learning. This topic is of great importance to the next generation of intelligent visual surveillance systems, which are capable of real-time detection and analysis of human behaviours. One important application is to automatically monitor and detect individuals who are in crisis at suicide hotspots to facilitate early intervention and prevention. However, there is still a significant gap between research in human action recognition and visual video processing in general, and their application to monitor hotspots for suicide prevention. While complex backgrounds, non-rigid movements of pedestrians, limitations of surveillance cameras, and multi-task requirements for a surveillance system all pose challenges to the development of such systems, a further challenge is the detection of crisis behaviours before a suicide attempt is made, and there is a paucity of datasets in this area due to privacy and confidentiality issues. Most relevant research only applies to detecting suicides such as hangings or jumps from bridges, providing no potential for early prevention. In this research, these problems are addressed by proposing a new modular design for an intelligent visual processing pipeline that is capable of pedestrian detection, tracking, pose estimation and recognition of both normal actions and high-risk behavioural cues that are important indicators of a suicide attempt. Specifically, based on the key finding that human body gestures can be used for the detection of social signals that potentially precede a suicide attempt, a new 2D skeleton-based action recognition algorithm is proposed. By using a two-branch network that takes advantage of three types of skeleton-based features extracted from a sequence of frames and a stacked LSTM structure, the model predicts the action label at each time step. It achieved good action recognition performance on both the public JHMDB dataset and a smaller private CCTV footage collection. Moreover, a logical layer, which uses knowledge from a human coding study to recognise pre-suicide behaviour indicators, has been built on top of the action recognition module to compensate for the small dataset size. It enables complex behaviour patterns to be recognised even from smaller datasets. The whole pipeline has been tested in a real-world application of suicide prevention using simulated footage from a surveillance system installed at a suicide hotspot, and preliminary results confirm its effectiveness at capturing crisis behaviour indicators for early detection and prevention of suicide. Full article
(This article belongs to the Special Issue Sensors-Based Human Action and Emotion Recognition)
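
As a rough sketch of the per-timestep, skeleton-based recognition idea described in this abstract, the snippet below shows a stacked LSTM classifier over concatenated skeleton features in PyTorch. The feature dimensions, the simple concatenation (in place of the paper's two-branch design), and all hyperparameters are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only: a per-timestep action classifier built from a
# stacked LSTM over skeleton-based features. All dimensions and the fusion
# scheme are assumptions, not the code used in the paper.
import torch
import torch.nn as nn

class SkeletonActionLSTM(nn.Module):
    def __init__(self, joint_dim=50, motion_dim=50, bone_dim=50,
                 hidden_dim=128, num_layers=2, num_classes=12):
        super().__init__()
        # Concatenate three types of per-frame skeleton features
        # (e.g., joint positions, joint motion, bone vectors).
        in_dim = joint_dim + motion_dim + bone_dim
        self.lstm = nn.LSTM(in_dim, hidden_dim, num_layers=num_layers,
                            batch_first=True)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, joints, motion, bones):
        # Each input has shape (batch, time, feature_dim).
        x = torch.cat([joints, motion, bones], dim=-1)
        out, _ = self.lstm(x)          # (batch, time, hidden_dim)
        return self.head(out)          # action logits at every time step

# Usage: predict an action label for each frame of a 30-frame clip.
model = SkeletonActionLSTM()
joints, motion, bones = (torch.randn(4, 30, 50) for _ in range(3))
labels = model(joints, motion, bones).argmax(dim=-1)   # (4, 30)
```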

Article
Facial Expression Recognition from Multi-Perspective Visual Inputs and Soft Voting
Sensors 2022, 22(11), 4206; https://doi.org/10.3390/s22114206 - 31 May 2022
Viewed by 491
Abstract
Automatic identification of human facial expressions has many potential applications in today’s connected world, from mental health monitoring to feedback for onscreen content or shop windows and sign-language prosodic identification. In this work, we use visual information as input, namely, a dataset of face points delivered by a Kinect device. The most recent work on facial expression recognition uses Machine Learning techniques to follow a modular, data-driven path of development instead of relying on human-invented ad hoc rules. In this paper, we present a Machine Learning-based method for automatic facial expression recognition that leverages information fusion architecture techniques from our previous work and soft voting. Our approach shows an average prediction performance clearly above the best state-of-the-art results for the dataset considered. These results provide further evidence of the usefulness of information fusion architectures rather than adopting the default ML approach of feature aggregation. Full article
(This article belongs to the Special Issue Sensors-Based Human Action and Emotion Recognition)
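
Soft voting itself is a standard ensembling rule: the class probabilities of several classifiers are averaged and the highest-scoring class is chosen. The minimal sketch below shows it with scikit-learn on synthetic data; the classifiers and features are placeholders, not the models or Kinect face-point features used in the paper.

```python
# Minimal soft-voting example with scikit-learn; data and base models are
# placeholders, not the paper's fusion architecture.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("svm", SVC(probability=True, random_state=0)),
    ],
    voting="soft",   # average predicted class probabilities, not hard labels
)
ensemble.fit(X, y)
print(ensemble.predict_proba(X[:5]))   # averaged probabilities per class
```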

Article
Facial Expression Recognition Based on Squeeze Vision Transformer
Sensors 2022, 22(10), 3729; https://doi.org/10.3390/s22103729 - 13 May 2022
Viewed by 646
Abstract
In recent image classification approaches, a vision transformer (ViT) has shown excellent performance beyond that of a convolutional neural network. A ViT achieves high classification performance for natural images because it properly preserves the global image features. Conversely, a ViT still has many limitations in facial expression recognition (FER), which requires the detection of subtle changes in expression, because it can lose the local features of the image. Therefore, in this paper, we propose Squeeze ViT, a method for reducing the computational complexity by reducing the number of feature dimensions while increasing the FER performance by concurrently combining global and local features. To measure the FER performance of Squeeze ViT, experiments were conducted on lab-controlled FER datasets and a wild FER dataset. Through comparative experiments with previous state-of-the-art approaches, we proved that the proposed method achieves excellent performance on both types of datasets. Full article
(This article belongs to the Special Issue Sensors-Based Human Action and Emotion Recognition)
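
The abstract's core idea, squeezing the feature dimension while fusing a global and a local representation, can be sketched generically as a small head on top of ViT tokens. This is not the Squeeze ViT architecture itself; the dimensions, the pooling used for local features, and the fusion rule are illustrative assumptions.

```python
# Generic sketch: reduce ("squeeze") the token dimension, then fuse a global
# feature (class token) with a local feature (pooled patch tokens).
# Not the paper's architecture; all choices here are assumptions.
import torch
import torch.nn as nn

class GlobalLocalHead(nn.Module):
    def __init__(self, token_dim=768, squeezed_dim=192, num_classes=7):
        super().__init__()
        self.squeeze = nn.Linear(token_dim, squeezed_dim)      # fewer feature dims
        self.classifier = nn.Linear(2 * squeezed_dim, num_classes)

    def forward(self, tokens):
        # tokens: (batch, 1 + num_patches, token_dim) from a ViT backbone,
        # with the class token first.
        z = self.squeeze(tokens)
        global_feat = z[:, 0]               # class token -> global context
        local_feat = z[:, 1:].mean(dim=1)   # pooled patch tokens -> local cues
        return self.classifier(torch.cat([global_feat, local_feat], dim=-1))

head = GlobalLocalHead()
dummy_tokens = torch.randn(8, 197, 768)     # e.g., 14x14 patches + class token
logits = head(dummy_tokens)                 # (8, 7) expression logits
```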

Article
Improving Wearable-Based Activity Recognition Using Image Representations
Sensors 2022, 22(5), 1840; https://doi.org/10.3390/s22051840 - 25 Feb 2022
Cited by 1 | Viewed by 529
Abstract
Activity recognition based on inertial sensors is an essential task in mobile and ubiquitous computing. To date, the best-performing approaches in this task are based on deep learning models. Although the performance of the approaches has been increasingly improving, a number of issues still remain. Specifically, in this paper we focus on the dependence of today’s state-of-the-art approaches on complex ad hoc deep learning convolutional neural networks (CNNs), recurrent neural networks (RNNs), or a combination of both, which require specialized knowledge and considerable effort for their construction and optimal tuning. To address this issue, in this paper we propose an approach that automatically transforms inertial sensor time-series data into images that represent, in pixel form, patterns found over time, allowing even a simple CNN to outperform complex ad hoc deep learning models that combine RNNs and CNNs for activity recognition. We conducted an extensive evaluation considering seven benchmark datasets that are among the most relevant in activity recognition. Our results demonstrate that our approach is able to outperform the state of the art in all cases, based on image representations that are generated through a process that is easy to implement, modify, and extend further, without the need to develop complex deep learning models. Full article
(This article belongs to the Special Issue Sensors-Based Human Action and Emotion Recognition)
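
The abstract does not spell out the exact image encoding here, so the sketch below uses one common time-series-to-image transform, the Gramian Angular Summation Field (GASF), purely to illustrate how a 1-D inertial signal can become a 2-D input for an ordinary CNN. It should not be read as the authors' specific representation.

```python
# Illustrative time-series-to-image encoding (GASF); the paper's own
# representation may differ.
import numpy as np

def gasf_image(signal: np.ndarray) -> np.ndarray:
    """Encode a 1-D signal as a Gramian Angular Summation Field image."""
    s_min, s_max = signal.min(), signal.max()
    x = 2 * (signal - s_min) / (s_max - s_min + 1e-12) - 1   # rescale to [-1, 1]
    x = np.clip(x, -1.0, 1.0)
    phi = np.arccos(x)                                       # polar encoding
    return np.cos(phi[:, None] + phi[None, :])               # (T, T) "image"

# Usage: a 128-sample accelerometer window becomes a 128x128 image.
window = np.sin(np.linspace(0, 6 * np.pi, 128)) + 0.1 * np.random.randn(128)
image = gasf_image(window)
print(image.shape)   # (128, 128)
```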

Article
A Novel Biosensor and Algorithm to Predict Vitamin D Status by Measuring Skin Impedance
Sensors 2021, 21(23), 8118; https://doi.org/10.3390/s21238118 - 04 Dec 2021
Cited by 1 | Viewed by 820
Abstract
The deficiency and excess of vitamin D cause various diseases, necessitating continuous management, but it is not easy to accurately measure the serum vitamin D level in the body using a non-invasive method. The aim of this study is to investigate the correlation between vitamin D levels, body information obtained by an InBody scan, and blood parameters obtained during health checkups, to determine the optimum measurement frequency for quantifying vitamin D in the skin, and to propose a vitamin D measurement method based on impedance. We assessed body composition, arm impedance, and blood vitamin D concentrations, determined the correlations between these elements using multiple machine learning analyses, and developed an algorithm that predicts the concentration of vitamin D in the body from the impedance value. Body fat percentage obtained from the InBody device and the blood parameters albumin and lactate dehydrogenase correlated with vitamin D level. An impedance measurement frequency of 21.1 Hz reflected the blood vitamin D concentration at optimum levels, and a confidence level of about 75% for vitamin D in the body was confirmed. These data demonstrate that the concentration of vitamin D in the body can be predicted using impedance measurement values. This method can be used for predicting and monitoring vitamin D-related diseases and may be incorporated into wearable health measurement devices. Full article
(This article belongs to the Special Issue Sensors-Based Human Action and Emotion Recognition)
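
The general modelling setup described above, regressing vitamin D level from an impedance value plus body-composition and blood features, might look roughly like the sketch below. The regressor choice and the synthetic data are assumptions for illustration only; they are not the study's algorithm or measurements.

```python
# Hedged sketch: predict vitamin D from impedance and related features.
# Synthetic data and model choice are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.normal(500, 80, n),    # skin impedance at 21.1 Hz (ohms, synthetic)
    rng.normal(25, 6, n),      # body fat percentage (synthetic)
    rng.normal(4.4, 0.3, n),   # serum albumin, g/dL (synthetic)
    rng.normal(180, 30, n),    # lactate dehydrogenase, U/L (synthetic)
])
y = 30 + 0.01 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(0, 3, n)   # synthetic vitamin D

model = RandomForestRegressor(n_estimators=200, random_state=0)
print("cross-validated R^2:", cross_val_score(model, X, y, cv=5, scoring="r2").mean())
```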

Article
Computer-Aided Diagnosis Algorithm for Classification of Malignant Melanoma Using Deep Neural Networks
Sensors 2021, 21(16), 5551; https://doi.org/10.3390/s21165551 - 18 Aug 2021
Cited by 2 | Viewed by 825
Abstract
Malignant melanoma accounts for about 1–3% of all malignancies in the West, especially in the United States. More than 9000 people die from it each year. In general, it is difficult to characterize a skin lesion from a photograph. In this paper, we propose a deep learning-based computer-aided diagnostic algorithm for the classification of malignant melanoma and benign skin tumors from RGB channel skin images. The proposed deep learning model comprises a tumor lesion segmentation model and a malignant melanoma classification model. First, a U-Net was used to segment skin lesions in dermoscopy images. We then implemented a convolutional neural network algorithm that classifies malignant melanoma and benign tumors using the skin lesion images and expert labeling results. The U-Net model achieved a Dice similarity coefficient of 81.1% compared to the expert labeling results. The classification accuracy of malignant melanoma reached 80.06%. As a result, the proposed AI algorithm is expected to be utilized as a computer-aided diagnostic algorithm to aid the early detection of malignant melanoma. Full article
(This article belongs to the Special Issue Sensors-Based Human Action and Emotion Recognition)
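
The segmentation result above is reported with the Dice similarity coefficient, a standard overlap metric for binary masks. A minimal, generic implementation is shown below; the example masks are made up for illustration.

```python
# Dice similarity coefficient for binary segmentation masks.
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice = 2 * |A intersect B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Usage: compare a thresholded U-Net output against an expert-labelled mask.
pred_mask = np.zeros((256, 256), dtype=bool)
pred_mask[60:180, 70:190] = True
expert_mask = np.zeros((256, 256), dtype=bool)
expert_mask[64:184, 64:184] = True
print(f"Dice similarity: {dice_coefficient(pred_mask, expert_mask):.3f}")
```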
