
Sensors-Based Human Action and Emotion Recognition

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Intelligent Sensors".

Deadline for manuscript submissions: closed (30 June 2023) | Viewed by 22500

Special Issue Editors


Prof. Dr. ByoungChul Ko
Guest Editor
Department of Computer Engineering, Keimyung University, Shindang-Dong, Dalseo-Gu, Daegu 704-701, Republic of Korea
Interests: computer vision; pattern recognition; object detection and tracking; deep learning

Prof. Dr. Jong-Ha Lee
Guest Editor
Department of Biomedical Engineering, School of Medicine, Keimyung University, Dalseo-gu, Daegu, Republic of Korea
Interests: computer-aided diagnostics; artificial intelligence; biomedical optics

Special Issue Information

Dear Colleagues,

Human action and emotion recognition (HAER) technology, which analyzes data collected from various types of sensing devices, including vision and embedded sensors, can be used in virtual reality (VR), augmented reality (AR), video surveillance, sports analysis, human–computer interaction, and healthcare. It has been applied to the development of various context-aware applications in emerging application areas.

In particular, HAER research has recently shifted from traditional machine learning toward deep learning. Achieving high accuracy in HAER requires large-scale, well-annotated databases, and research on sensor (video)-based HAER using publicly available databases is growing in popularity.

The purpose of this Special Issue is to introduce current developments in sensor- and video-based human action and emotion recognition combined with machine learning, including computer vision, pattern recognition, expert systems, and deep learning. You are invited to submit contributions of original research, advancements, developments, and experiments pertaining to machine learning combined with sensors. This Special Issue therefore welcomes newly developed methods and ideas combining data obtained from various sensors in the following fields (but not limited to these):

  • HAER based on machine learning;
  • Deep network structure/learning algorithm for HAER;
  • HAER using machine learning-based sensor fusion techniques;
  • Face and gaze recognition for HAER;
  • Multi-modal/task learning for HAER decision-making and control;
  • HAER technologies in autonomous vehicles;
  • State of practice, research overview, experience reports, industrial experiments, and case studies in HAER.

Prof. Dr. ByoungChul Ko
Prof. Dr. Jong-Ha Lee
Guest Editors


Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (10 papers)


Research

25 pages, 1974 KiB  
Article
A Hybrid Multimodal Emotion Recognition Framework for UX Evaluation Using Generalized Mixture Functions
by Muhammad Asif Razzaq, Jamil Hussain, Jaehun Bang, Cam-Hao Hua, Fahad Ahmed Satti, Ubaid Ur Rehman, Hafiz Syed Muhammad Bilal, Seong Tae Kim and Sungyoung Lee
Sensors 2023, 23(9), 4373; https://doi.org/10.3390/s23094373 - 28 Apr 2023
Cited by 4 | Viewed by 2122
Abstract
Multimodal emotion recognition has gained much traction in the fields of affective computing, human–computer interaction (HCI), artificial intelligence (AI), and user experience (UX). There is a growing demand to automate the analysis of user emotion for HCI, AI, and UX evaluation applications that provide affective services. Emotion data are increasingly obtained through video, audio, text, or physiological signals, which has led to processing emotions from multiple modalities, usually combined through ensemble-based systems with static weights. Owing to numerous limitations such as missing modality data, inter-class variations, and intra-class similarities, an effective weighting scheme is required to improve discrimination between modalities. This article takes into account the importance of the differences between multiple modalities and assigns dynamic weights to them by adapting a more efficient combination process based on generalized mixture (GM) functions. We therefore present a hybrid multimodal emotion recognition (H-MMER) framework that uses a multi-view learning approach for unimodal emotion recognition and introduces feature-level and decision-level multimodal fusion using GM functions. In an experimental study, we evaluated the ability of the proposed framework to model a set of four emotional states (Happiness, Neutral, Sadness, and Anger) and found that most of them can be modeled well with significantly high accuracy using GM functions. The experiments show that the proposed framework models emotional states with an average accuracy of 98.19%, a significant performance gain over traditional approaches. The overall evaluation results indicate that we can identify emotional states with high accuracy and increase the robustness of the emotion classification required for UX measurement.
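As a rough illustration of the idea behind GM-function-based decision-level fusion, the sketch below derives input-dependent weights from each modality's confidence and mixes the per-modality class distributions accordingly. The weighting rule, the exponent `lam`, and the toy probabilities are assumptions chosen for illustration, not the exact formulation used in the paper.

```python
import numpy as np

def gm_fusion(probs: np.ndarray, lam: float = 2.0) -> np.ndarray:
    """Decision-level fusion with a simple generalized mixture (GM) function.

    probs: (n_modalities, n_classes) per-modality class probabilities.
    Weights are computed dynamically from each modality's confidence
    (its maximum class probability), so stronger modalities count more.
    """
    conf = probs.max(axis=1) ** lam          # per-modality confidence scores
    weights = conf / conf.sum()              # dynamic, input-dependent weights
    fused = weights @ probs                  # weighted mixture of distributions
    return fused / fused.sum()               # renormalize to a distribution

# Example: video is confident, audio is not; fusion leans toward video.
video = np.array([0.70, 0.10, 0.10, 0.10])   # Happiness, Neutral, Sadness, Anger
audio = np.array([0.30, 0.28, 0.22, 0.20])
print(gm_fusion(np.stack([video, audio])))
```

With a confident video modality and an uncertain audio modality, the fused distribution leans toward the video prediction, which a static equal-weight ensemble cannot reproduce.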

12 pages, 3775 KiB  
Article
The Novel Digital Therapeutics Sensor and Algorithm for Pressure Ulcer Care Based on Tissue Impedance
by Tae-Mi Jung, Dae-Jin Jang and Jong-Ha Lee
Sensors 2023, 23(7), 3620; https://doi.org/10.3390/s23073620 - 30 Mar 2023
Cited by 1 | Viewed by 1452
Abstract
Visual diagnosis and rejuvenation are the methods currently used to diagnose and treat pressure ulcers, respectively. However, the treatment process is difficult. We developed a biophotonic sensor to diagnose pressure ulcers and, subsequently, a pressure ulcer care device (PUCD). We conducted animal and clinical trials to investigate the device’s effectiveness. We confirmed the accuracy of the pressure ulcer diagnosis algorithm to be 91%, and we observed an 85% reduction in immune cells when using the PUCD to treat pressure ulcer-induced mice. Additionally, we compared the treatment group with the pressure ulcer induction group to assess the PUCD’s effectiveness in identifying immune cells through their nuclear shape. These results indicate a positive effect and support the use of the PUCD as a recovery method for pressure ulcer diagnosis and treatment.

27 pages, 22214 KiB  
Article
High-Resolution Tactile-Sensation Diagnostic Imaging System for Thyroid Cancer
by So-Hyun Cho, Su-Min Lee, Na-Young Lee, Byoung Chul Ko, Hojeong Kim, Dae-Jin Jang and Jong-Ha Lee
Sensors 2023, 23(7), 3451; https://doi.org/10.3390/s23073451 - 25 Mar 2023
Cited by 2 | Viewed by 1544
Abstract
In this study, we propose the direct diagnosis of thyroid cancer using a small probe. The probe can easily check existing thyroid tissue for abnormalities without relying on experts, which reduces the cost of examining thyroid tissue and enables an initial self-examination for thyroid cancer with high accuracy. A multi-layer silicon-structured probe module photographs light scattered by elastic changes in thyroid tissue under pressure to obtain a tactile image of the thyroid gland. In thyroid tissue under pressure, light scatters outward differently depending on whether the tissue is malignant or benign. A simple and easy-to-use tactile-sensation imaging system is developed by documenting the characteristics of the tissue, using non-invasive technology to analyze tactile images and judge the properties of abnormal tissues.

18 pages, 10370 KiB  
Article
Data Valuation Algorithm for Inertial Measurement Unit-Based Human Activity Recognition
by Yeon-Wook Kim and Sangmin Lee
Sensors 2023, 23(1), 184; https://doi.org/10.3390/s23010184 - 24 Dec 2022
Cited by 4 | Viewed by 1843
Abstract
This paper proposes a data valuation algorithm for inertial measurement unit-based human activity recognition (IMU-based HAR) data based on meta reinforcement learning. Unlike previous studies that received feature-level input, the algorithm in this study adds a feature extraction structure to the data valuation algorithm, so it can receive raw-level inputs and still achieve excellent performance. As IMU-based HAR data are multivariate time series, the proposed algorithm incorporates an architecture capable of extracting both local and global features by inserting a transformer encoder after the one-dimensional convolutional neural network (1D-CNN) backbone in the data value estimator. In addition, a 1D-CNN-based stacking ensemble structure, which exhibits excellent efficiency and performance on IMU-based HAR data, is used as a predictor to supervise model training. The Berg balance scale (BBS) IMU-based HAR dataset and the public datasets UCI-HAR, WISDM, and PAMAP2 are used for performance evaluation. The valuation performance of the proposed algorithm is excellent on IMU-based HAR data: the rate of discovering corrupted data is higher than 96% on all datasets, and classification performance improves when the discovered low-value data are suppressed.
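The architecture described above, a 1D-CNN backbone for local features followed by a transformer encoder for global context, can be sketched roughly as follows. The layer sizes, channel counts, and pooling choices are assumptions, and the paper's estimator also interacts with the stacking-ensemble predictor during training, which is omitted here.

```python
import torch
import torch.nn as nn

class DataValueEstimator(nn.Module):
    """Sketch of a raw-input data value estimator: a 1D-CNN backbone for
    local features, a transformer encoder for global context, and a
    sigmoid head that scores each window's value in [0, 1]."""

    def __init__(self, in_channels: int = 6, d_model: int = 64):
        super().__init__()
        self.backbone = nn.Sequential(            # local feature extractor
            nn.Conv1d(in_channels, d_model, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(d_model, d_model, kernel_size=5, padding=2),
            nn.ReLU(),
        )
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)  # global context
        self.head = nn.Sequential(nn.Linear(d_model, 1), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time) raw IMU windows, e.g. 3-axis accel + gyro
        h = self.backbone(x).transpose(1, 2)      # -> (batch, time, d_model)
        h = self.encoder(h).mean(dim=1)           # pool over time
        return self.head(h).squeeze(-1)           # per-window value score

est = DataValueEstimator()
scores = est(torch.randn(8, 6, 128))              # 8 windows, 6 channels, 128 steps
print(scores.shape)                               # torch.Size([8])
```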

23 pages, 1470 KiB  
Article
Towards Building a Visual Behaviour Analysis Pipeline for Suicide Detection and Prevention
by Xun Li, Sandersan Onie, Morgan Liang, Mark Larsen and Arcot Sowmya
Sensors 2022, 22(12), 4488; https://doi.org/10.3390/s22124488 - 14 Jun 2022
Cited by 10 | Viewed by 2308
Abstract
Understanding human behaviours through video analysis has seen significant research progress in recent years with the advancement of deep learning. This topic is of great importance to the next generation of intelligent visual surveillance systems, which are capable of real-time detection and analysis of human behaviours. One important application is to automatically monitor and detect individuals who are in crisis at suicide hotspots to facilitate early intervention and prevention. However, there is still a significant gap between research in human action recognition and visual video processing in general, and their application to monitoring hotspots for suicide prevention. While complex backgrounds, non-rigid pedestrian movements, the limitations of surveillance cameras, and the multi-task requirements of a surveillance system all pose challenges to the development of such systems, a further challenge is the detection of crisis behaviours before a suicide attempt is made, and there is a paucity of datasets in this area due to privacy and confidentiality issues. Most relevant research only applies to detecting suicides such as hangings or jumps from bridges, providing no potential for early prevention. In this research, these problems are addressed by proposing a new modular design for an intelligent visual processing pipeline that is capable of pedestrian detection, tracking, pose estimation, and recognition of both normal actions and high-risk behavioural cues that are important indicators of a suicide attempt. Specifically, based on the key finding that human body gestures can be used to detect social signals that potentially precede a suicide attempt, a new 2D skeleton-based action recognition algorithm is proposed. Using a two-branch network that takes advantage of three types of skeleton-based features extracted from a sequence of frames, together with a stacked LSTM structure, the model predicts the action label at each time step. It achieved good action recognition performance on both the public JHMDB dataset and a smaller private CCTV footage collection. Moreover, a logical layer, which uses knowledge from a human coding study to recognise pre-suicide behaviour indicators, has been built on top of the action recognition module to compensate for the small dataset size. It enables complex behaviour patterns to be recognised even from smaller datasets. The whole pipeline has been tested in a real-world application of suicide prevention using simulated footage from a surveillance system installed at a suicide hotspot, and preliminary results confirm its effectiveness at capturing crisis behaviour indicators for early detection and prevention of suicide.
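A minimal sketch of the per-time-step prediction idea, a stacked LSTM over per-frame 2D skeleton features, is shown below. The joint count, feature layout, and single-branch design are simplifications of the paper's two-branch, three-feature-type model.

```python
import torch
import torch.nn as nn

class SkeletonLSTM(nn.Module):
    """Minimal sketch: per-frame 2D-skeleton features -> stacked LSTM ->
    an action label at every time step (per-frame classification)."""

    def __init__(self, n_joints: int = 15, n_actions: int = 12):
        super().__init__()
        # each frame: (x, y) coordinates for every joint, flattened
        self.lstm = nn.LSTM(input_size=n_joints * 2, hidden_size=128,
                            num_layers=2, batch_first=True)   # stacked LSTM
        self.classifier = nn.Linear(128, n_actions)

    def forward(self, skeletons: torch.Tensor) -> torch.Tensor:
        # skeletons: (batch, time, n_joints * 2)
        h, _ = self.lstm(skeletons)
        return self.classifier(h)          # (batch, time, n_actions) logits

model = SkeletonLSTM()
logits = model(torch.randn(4, 60, 30))     # 4 clips, 60 frames, 15 joints
print(logits.argmax(dim=-1).shape)         # per-frame labels: (4, 60)
```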

16 pages, 549 KiB  
Article
Facial Expression Recognition from Multi-Perspective Visual Inputs and Soft Voting
by Antonio A. Aguileta, Ramón F. Brena, Erik Molino-Minero-Re and Carlos E. Galván-Tejada
Sensors 2022, 22(11), 4206; https://doi.org/10.3390/s22114206 - 31 May 2022
Viewed by 1559
Abstract
Automatic identification of human facial expressions has many potential applications in today’s connected world, from mental health monitoring to feedback for onscreen content or shop windows and sign-language prosodic identification. In this work we use visual information as input, namely a dataset of face points delivered by a Kinect device. Most recent work on facial expression recognition uses Machine Learning techniques, favoring a modular, data-driven path of development over human-invented ad hoc rules. In this paper, we present a Machine Learning-based method for automatic facial expression recognition that leverages the information fusion architecture techniques from our previous work together with soft voting. Our approach shows an average prediction performance clearly above the best state-of-the-art results for the dataset considered. These results provide further evidence of the usefulness of information fusion architectures over the default ML approach of feature aggregation.
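Soft voting itself is standard: each classifier outputs class probabilities, the probabilities are averaged, and the class with the highest mean wins. A minimal scikit-learn sketch is shown below, using synthetic stand-in features rather than the Kinect face-point dataset from the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

# Stand-in features; in the paper these would be Kinect face-point features.
X, y = make_classification(n_samples=600, n_features=20, n_classes=3,
                           n_informative=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Soft voting: average the classifiers' predicted probabilities and
# pick the class with the highest mean probability.
ensemble = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("rf", RandomForestClassifier(random_state=0)),
                ("knn", KNeighborsClassifier())],
    voting="soft",
)
ensemble.fit(X_tr, y_tr)
print("soft-voting accuracy:", ensemble.score(X_te, y_te))
```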

13 pages, 1322 KiB  
Article
Facial Expression Recognition Based on Squeeze Vision Transformer
by Sangwon Kim, Jaeyeal Nam and Byoung Chul Ko
Sensors 2022, 22(10), 3729; https://doi.org/10.3390/s22103729 - 13 May 2022
Cited by 14 | Viewed by 3216
Abstract
In recent image classification approaches, the vision transformer (ViT) has shown excellent performance beyond that of convolutional neural networks. A ViT achieves high classification accuracy on natural images because it properly preserves global image features. Conversely, a ViT still has many limitations in facial expression recognition (FER), which requires the detection of subtle changes in expression, because it can lose the local features of the image. Therefore, in this paper, we propose Squeeze ViT, a method that reduces computational complexity by reducing the number of feature dimensions while increasing FER performance by combining global and local features. To measure the FER performance of Squeeze ViT, experiments were conducted on lab-controlled FER datasets and a wild FER dataset. Through comparative experiments with previous state-of-the-art approaches, we show that the proposed method achieves excellent performance on both types of dataset.
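The global-plus-local idea can be illustrated with a small classification head that projects ViT tokens into a lower ("squeezed") dimension and fuses the global [CLS] token with pooled local patch tokens. This is only a sketch of the concept with assumed dimensions, not the published Squeeze ViT architecture.

```python
import torch
import torch.nn as nn

class GlobalLocalHead(nn.Module):
    """Illustrative head only: fuse a ViT's global [CLS] token with pooled
    local patch tokens after a dimension-reducing ("squeeze") projection."""

    def __init__(self, d_token: int = 768, d_squeeze: int = 128, n_classes: int = 7):
        super().__init__()
        self.squeeze = nn.Linear(d_token, d_squeeze)   # reduce feature dims
        self.classifier = nn.Linear(2 * d_squeeze, n_classes)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, 1 + n_patches, d_token) from any ViT encoder
        z = self.squeeze(tokens)
        global_feat = z[:, 0]                # [CLS] token: whole-face context
        local_feat = z[:, 1:].mean(dim=1)    # patch tokens: subtle local changes
        return self.classifier(torch.cat([global_feat, local_feat], dim=-1))

head = GlobalLocalHead()
print(head(torch.randn(2, 197, 768)).shape)  # (2, 7) expression logits
```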

21 pages, 960 KiB  
Article
Improving Wearable-Based Activity Recognition Using Image Representations
by Alejandro Sanchez Guinea, Mehran Sarabchian and Max Mühlhäuser
Sensors 2022, 22(5), 1840; https://doi.org/10.3390/s22051840 - 25 Feb 2022
Cited by 4 | Viewed by 1873
Abstract
Activity recognition based on inertial sensors is an essential task in mobile and ubiquitous computing. To date, the best performing approaches to this task are based on deep learning models. Although their performance keeps improving, a number of issues remain. Specifically, in this paper we focus on the dependence of today’s state-of-the-art approaches on complex, ad hoc deep learning models (convolutional neural networks (CNNs), recurrent neural networks (RNNs), or a combination of both), which require specialized knowledge and considerable effort to construct and tune optimally. To address this issue, we propose an approach that automatically transforms inertial sensor time-series data into images whose pixels represent patterns found over time, allowing even a simple CNN to outperform complex ad hoc deep learning models that combine RNNs and CNNs for activity recognition. We conducted an extensive evaluation on seven benchmark datasets that are among the most relevant in activity recognition. Our results demonstrate that our approach outperforms the state of the art in all cases, based on image representations generated through a process that is easy to implement, modify, and extend further, without the need to develop complex deep learning models.
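The paper's specific encoding is not reproduced here, but a recurrence plot is one simple way to turn an inertial time series into an image with CNN-learnable 2D texture. The sketch below, with an assumed threshold `eps` and a toy signal, illustrates the general time-series-to-image idea.

```python
import numpy as np

def recurrence_image(signal: np.ndarray, eps: float = 0.2) -> np.ndarray:
    """Encode a 1D inertial signal as a binary recurrence-plot image:
    pixel (i, j) is on when samples i and j are within eps of each other,
    turning temporal patterns into 2D texture a plain CNN can learn."""
    s = (signal - signal.min()) / (np.ptp(signal) + 1e-8)   # normalize to [0, 1]
    dist = np.abs(s[:, None] - s[None, :])                  # pairwise distances
    return (dist < eps).astype(np.float32)                  # (T, T) image

t = np.linspace(0, 4 * np.pi, 128)
walk_like = np.sin(t) + 0.1 * np.random.randn(128)          # toy accelerometer axis
img = recurrence_image(walk_like)
print(img.shape)                                             # (128, 128)
```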

12 pages, 1630 KiB  
Article
A Novel Biosensor and Algorithm to Predict Vitamin D Status by Measuring Skin Impedance
by Jin-Chul Heo, Doyoon Kim, Hyunsoo An, Chang-Sik Son, Sangwoo Cho and Jong-Ha Lee
Sensors 2021, 21(23), 8118; https://doi.org/10.3390/s21238118 - 4 Dec 2021
Cited by 2 | Viewed by 2491
Abstract
Deficiency and excess of vitamin D cause various diseases, necessitating continuous management, but it is not easy to accurately measure the body’s serum vitamin D level non-invasively. The aim of this study is to investigate the correlation between vitamin D levels, body information obtained by an InBody scan, and blood parameters obtained during health checkups; to determine the optimum frequency for quantifying vitamin D in the skin; and to propose an impedance-based vitamin D measurement method. We assessed body composition, arm impedance, and blood vitamin D concentrations, determined the correlations between these elements using multiple machine learning analyses, and developed an algorithm that predicts the body’s vitamin D concentration from the impedance values. Body fat percentage obtained from the InBody device and the blood parameters albumin and lactate dehydrogenase correlated with vitamin D level. An impedance measurement frequency of 21.1 Hz best reflected the blood vitamin D concentration, and a confidence level of about 75% for body vitamin D was confirmed. These data demonstrate that the body’s vitamin D concentration can be predicted from impedance measurements. This method can be used to predict and monitor vitamin D-related diseases and may be incorporated into wearable health measurement devices.
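As a hedged sketch of the prediction step, the snippet below fits a regression model on the kinds of features the study reports as informative (impedance at 21.1 Hz, body fat percentage, albumin, lactate dehydrogenase). All numbers are synthetic stand-ins with an invented toy relationship, and the actual model and features in the paper may differ.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-ins for the study's measurements (not real clinical data).
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.normal(500, 80, n),    # skin impedance magnitude at 21.1 Hz (ohms)
    rng.normal(25, 6, n),      # body fat percentage
    rng.normal(4.4, 0.3, n),   # albumin (g/dL)
    rng.normal(180, 30, n),    # lactate dehydrogenase (U/L)
])
# Toy target: an invented linear mix plus noise, standing in for serum
# vitamin D concentration (ng/mL) purely so the example runs end to end.
y = 30 - 0.02 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(0, 3, n)

model = RandomForestRegressor(random_state=0)
print("CV R^2:", cross_val_score(model, X, y, cv=5, scoring="r2").mean())
```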

12 pages, 2892 KiB  
Article
Computer-Aided Diagnosis Algorithm for Classification of Malignant Melanoma Using Deep Neural Networks
by Chan-Il Kim, Seok-Min Hwang, Eun-Bin Park, Chang-Hee Won and Jong-Ha Lee
Sensors 2021, 21(16), 5551; https://doi.org/10.3390/s21165551 - 18 Aug 2021
Cited by 12 | Viewed by 2141
Abstract
Malignant melanoma accounts for about 1–3% of all malignancies in the West, especially in the United States, and more than 9000 people die from it each year. In general, it is difficult to characterize a skin lesion from a photograph. In this paper, we propose a deep learning-based computer-aided diagnostic algorithm for classifying malignant melanoma and benign skin tumors from RGB-channel skin images. The proposed deep learning model comprises a tumor lesion segmentation model and a malignant melanoma classification model. First, U-Net was used to segment skin lesions in dermoscopy images. We then implemented an algorithm that classifies malignant melanoma and benign tumors using the skin lesion images and expert labeling results with convolutional neural networks. The U-Net model achieved a dice similarity coefficient of 81.1% against the expert labeling results, and the classification accuracy for malignant melanoma reached 80.06%. The proposed AI algorithm is therefore expected to serve as a computer-aided diagnostic tool that helps in the early detection of malignant melanoma.
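The two-stage pipeline, segmentation followed by classification, can be sketched as below with a toy U-Net. The layer sizes, the mask-multiplication step, and the tiny classifier are illustrative assumptions, not the paper's exact networks.

```python
import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU())

class TinyUNet(nn.Module):
    """Toy single-skip U-Net standing in for the paper's segmentation stage."""
    def __init__(self):
        super().__init__()
        self.down = block(3, 16)
        self.pool = nn.MaxPool2d(2)
        self.mid = block(16, 32)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.out = nn.Sequential(block(32, 16), nn.Conv2d(16, 1, 1))

    def forward(self, x):
        d = self.down(x)                       # skip-connection features
        m = self.mid(self.pool(d))
        u = self.up(m)
        return torch.sigmoid(self.out(torch.cat([u, d], dim=1)))  # lesion mask

# Stage 2: any small CNN classifying the mask-focused image (binary logit).
classifier = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(16, 1))

image = torch.randn(1, 3, 64, 64)              # RGB dermoscopy image (toy size)
mask = TinyUNet()(image)                       # stage 1: segment the lesion
focused = image * mask                         # keep only the lesion region
print(torch.sigmoid(classifier(focused)))      # stage 2: melanoma probability
```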
