Human Face and Motion Recognition in Video

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: closed (20 March 2023) | Viewed by 16445

Special Issue Editor


Guest Editor
AI Research Laboratory, ETRI, Daejeon 34129, Republic of Korea
Interests: computer vision; machine learning; human motion analysis; biometric systems; HCI; intelligent robots

Special Issue Information

Dear Colleagues,

There has been recent interest in computer vision dealing with the analysis of image sequences involving humans. The most easily observable human features are motion and the face. Human motion includes action, activity, and behavior, and the human face reveals various information, such as identity, gender, age, ethnicity, and emotional state. Potential applications of human motion and face recognition include athletic performance analysis, clinical analysis, human–machine interaction, intelligent robots, video surveillance, and biometrics. A great deal of progress has been made in these applications in the last few years with the rise of deep learning technologies. This Special Issue, “Human Face and Motion Recognition in Video”, provides a platform for researchers and practitioners to present state-of-the-art and innovative algorithms, technologies, datasets, and applications.

Dr. Jang-Hee Yoo
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Human pose estimation with deep learning
  • Deep learning in face and motion analysis
  • Face recognition and biometrics
  • Facial expression and emotion recognition
  • Age and gender estimation in video
  • Human gait and behavior recognition
  • Human recognition at a distance
  • Forensic gait and facial identification
  • Databases for face and motion recognition

Published Papers (6 papers)


Research

17 pages, 14415 KiB  
Article
Micro-Expression Spotting Based on a Short-Duration Prior and Multi-Stage Feature Extraction
by Zhihua Xie and Sijia Cheng
Electronics 2023, 12(2), 434; https://doi.org/10.3390/electronics12020434 - 14 Jan 2023
Viewed by 1127
Abstract
When micro-expressions (MEs) are mixed with normal or macro-expressions, it becomes increasingly challenging to spot them in long videos. Targeting the specific short-duration prior of MEs, an ME spotting network called AEM-Net (adaptive enhanced ME detection network) is proposed. This paper is an extension of a conference paper presented at the Chinese Conference on Biometric Recognition (CCBR). The network improves spotting performance in five aspects. First, a multi-stage channel feature extraction module is constructed to extract features at different depths. Second, a spatial-temporal attention module is leveraged to obtain salient and discriminative ME segments while suppressing the generation of excessively long or short proposals. Third, an ME-NMS (non-maximum suppression) network is developed to reduce redundancy and decision errors. Fourth, a multi-scale feature fusion module is introduced to fuse up-sampled high-level feature maps with fine-grained information, which yields meaningful information on the feature distribution and contributes to a good representation of MEs. Finally, two spotting mechanisms, anchor-based and anchor-free, are integrated to obtain the final spotting results. Extensive experiments were conducted on the prevalent CAS(ME)2 and SAMM-Long ME databases to evaluate spotting performance. The results show that AEM-Net achieves competitive performance, outperforming other state-of-the-art methods.
(This article belongs to the Special Issue Human Face and Motion Recognition in Video)
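As a rough illustration of the ME-NMS idea mentioned in the abstract, the sketch below applies greedy non-maximum suppression to 1D temporal proposals. The interval format, scores, and IoU threshold are assumptions for illustration, not the paper's implementation:

```python
def temporal_nms(proposals, iou_threshold=0.5):
    """Greedy non-maximum suppression over 1D temporal intervals.

    proposals: list of (start_frame, end_frame, score) tuples.
    Returns the retained proposals, highest score first.
    """
    # Sort candidate segments by confidence, best first.
    ordered = sorted(proposals, key=lambda p: p[2], reverse=True)
    kept = []
    for start, end, score in ordered:
        suppress = False
        for k_start, k_end, _ in kept:
            # Temporal intersection-over-union of the two segments.
            inter = max(0.0, min(end, k_end) - max(start, k_start))
            union = (end - start) + (k_end - k_start) - inter
            if union > 0 and inter / union > iou_threshold:
                suppress = True
                break
        if not suppress:
            kept.append((start, end, score))
    return kept

# Example: two heavily overlapping candidates collapse to one.
cands = [(10, 25, 0.9), (12, 26, 0.8), (40, 50, 0.7)]
print(temporal_nms(cands))  # [(10, 25, 0.9), (40, 50, 0.7)]
```

Suppressing highly overlapping candidate segments in this way is what removes the redundant detections the abstract refers to.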

17 pages, 6276 KiB  
Article
Industrial Ergonomics Risk Analysis Based on 3D-Human Pose Estimation
by Prabesh Paudel, Young-Jin Kwon, Do-Hyun Kim and Kyoung-Ho Choi
Electronics 2022, 11(20), 3403; https://doi.org/10.3390/electronics11203403 - 20 Oct 2022
Cited by 6 | Viewed by 2633
Abstract
Ergonomics is important for smooth and sustainable industrial operation. In the manufacturing industry, poor workstation design causes workers to frequently and repeatedly adopt uncomfortable postures and actions (reaching above their shoulders, bending at awkward angles, bending backwards, flexing their elbows/wrists, etc.). Incorrect working postures often lead to specific injuries, which reduce productivity and increase development costs. Therefore, examining workers’ ergonomic postures becomes the basis for recognizing, correcting, and preventing bad postures in the workplace. This paper proposes a new framework for risk analysis of workers’ ergonomic postures through 3D human pose estimation from video/image sequences of their actions. A top-down network calculates the human body joints when bending, and the resulting angles are compared with ground-truth body-bending data collected manually through expert observation. Here, we introduce the body angle reliability decision (BARD) method to calculate the most reliable body-bending angles, ensuring that workers’ working angles conform to ergonomic requirements in the manufacturing industry. The ergonomic scores used in this experiment achieved high accuracy: for good postures with high reliability, the OWAS, REBA, and RULA scores reached 94%, 93%, and 93% accuracy, respectively; for occluded postures, they reached 83%, 82%, and 82% accuracy compared with the experts’ scores. Our research can serve as a reference for future ergonomics-score analysis based on 3D pose estimation of workers’ postures.
(This article belongs to the Special Issue Human Face and Motion Recognition in Video)
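Ergonomic scoring systems such as OWAS, REBA, and RULA rate postures from body-bending angles. A minimal sketch of how such an angle can be derived from estimated 3D joints is shown below; the joint names and coordinates are illustrative assumptions, not the paper's BARD method:

```python
import math

def joint_angle(a, b, c):
    """Angle at joint b (in degrees) formed by 3D points a-b-c."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(ba[i] * bc[i] for i in range(3))
    norm = math.dist(a, b) * math.dist(c, b)
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

# Toy skeleton: hip directly above the knee, shoulder bent forward,
# giving a right angle between the thigh and the trunk at the hip.
knee, hip, shoulder = (0, 0, 0), (0, 1, 0), (1, 1, 0)
print(round(joint_angle(knee, hip, shoulder)))  # 90
```

An ergonomic risk analyzer would threshold such angles (e.g. trunk bend beyond a safe range) before assigning a posture category.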

20 pages, 7182 KiB  
Article
Efficient Generation of Cancelable Face Templates Based on Quantum Image Hilbert Permutation
by Hesham Alhumyani, Ghada M. El-Banby, Hala S. El-Sayed, Fathi E. Abd El-Samie and Osama S. Faragallah
Electronics 2022, 11(7), 1040; https://doi.org/10.3390/electronics11071040 - 26 Mar 2022
Cited by 5 | Viewed by 1806
Abstract
The pivotal need to identify people requires efficient and robust schemes to guarantee high levels of personal information security. This paper introduces an encryption algorithm to generate cancelable face templates based on quantum image Hilbert permutation. The objective is to sufficiently distort the human facial biometrics stored in a database for authentication through encryption. The strength of the proposed Cancelable Biometric (CB) scheme lies in its ability to generate cancelable face templates by scrambling the face biometrics after adding a noise mask with a pre-specified variance and an initial seed. Generating the cancelable templates follows a strategy with three basic steps: Initialization, an Odd module, and an Even module. Notably, the proposed scheme achieves high recognition rates based on the Area under the Receiver Operating Characteristic (AROC) curve, with values up to 99.51%. Furthermore, comparisons with state-of-the-art cancelable face recognition schemes are performed to validate the proposed scheme.
(This article belongs to the Special Issue Human Face and Motion Recognition in Video)
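The core idea of a cancelable template, add a keyed noise mask and then scramble, can be sketched as follows. This sketch substitutes a generic seeded permutation for the paper's quantum image Hilbert permutation and Odd/Even modules, so it shows only the noise-then-scramble principle:

```python
import random

def cancelable_template(features, seed, noise_std=0.1):
    """Distort a biometric feature vector so the stored template is
    revocable: add a seeded noise mask, then apply a seeded permutation.
    A new seed yields a new, unlinkable template from the same face."""
    rng = random.Random(seed)
    noisy = [f + rng.gauss(0.0, noise_std) for f in features]
    order = list(range(len(noisy)))
    rng.shuffle(order)  # key-dependent scrambling of feature positions
    return [noisy[i] for i in order]

face = [0.12, 0.87, 0.45, 0.33, 0.91, 0.05]
t1 = cancelable_template(face, seed=42)
t2 = cancelable_template(face, seed=42)   # same key -> same template
t3 = cancelable_template(face, seed=99)   # revoked key -> new template
print(t1 == t2, t1 == t3)  # True False
```

Matching is then performed in the distorted domain: if a template leaks, the seed is revoked and a fresh, unlinkable template is issued from the same biometric.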

12 pages, 4099 KiB  
Article
Improvement of Identity Recognition with Occlusion Detection-Based Feature Selection
by Jaeyoon Jang, Ho-Sub Yoon and Jaehong Kim
Electronics 2021, 10(2), 167; https://doi.org/10.3390/electronics10020167 - 13 Jan 2021
Cited by 3 | Viewed by 1886
Abstract
Image-based facial identity recognition is now used in many applications because it requires only a camera and no other device. Moreover, as a contactless technology, it is one of the most popular authentication methods. However, conventional recognition systems fail when part of the face information is lost due to the user’s posture or the wearing of masks, as during the recent prevalent disease. On some platforms, although performance is improved through incremental updates, recognition remains inconvenient and inaccurate. In this paper, we propose a method to respond more actively to these situations. First, we determine whether occlusion occurs and, when it does, improve stability by calculating the feature vector using only the significant area. By recycling the existing recognition model at little additional cost, we confirmed a reduction in the recognition performance drop in such situations. Using this technique, we confirmed a performance improvement of about 1–3% when some information is lost. Although the improvement is not dramatic, the method has the major advantage that it can improve recognition performance while utilizing existing systems.
(This article belongs to the Special Issue Human Face and Motion Recognition in Video)
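The benefit of restricting matching to the visible facial area can be illustrated with a masked similarity measure. The feature vectors, the occlusion flags, and the use of cosine similarity here are illustrative assumptions, not the paper's exact pipeline:

```python
import math

def masked_cosine(query, gallery, visible):
    """Cosine similarity computed only over feature dimensions judged
    visible (True) by an occlusion detector; occluded dims are ignored."""
    q = [v for v, ok in zip(query, visible) if ok]
    g = [v for v, ok in zip(gallery, visible) if ok]
    dot = sum(a * b for a, b in zip(q, g))
    nq = math.sqrt(sum(a * a for a in q))
    ng = math.sqrt(sum(b * b for b in g))
    return dot / (nq * ng) if nq and ng else 0.0

enrolled = [0.6, 0.8, 0.1, 0.2]
probe    = [0.6, 0.8, 0.9, 0.9]   # last two dims corrupted by a mask
full     = masked_cosine(probe, enrolled, [True] * 4)
partial  = masked_cosine(probe, enrolled, [True, True, False, False])
print(full < partial)  # True: ignoring occluded dims restores the match
```

Because only the comparison step changes, the enrolled gallery and the trained feature extractor can be reused unchanged, which is the low-cost reuse the abstract emphasizes.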

12 pages, 2893 KiB  
Article
Real-Time Hair Segmentation Using Mobile-Unet
by Ho-Sub Yoon, Seong-Woo Park and Jang-Hee Yoo
Electronics 2021, 10(2), 99; https://doi.org/10.3390/electronics10020099 - 6 Jan 2021
Cited by 12 | Viewed by 5188
Abstract
We describe a real-time hair segmentation method based on a fully convolutional network with a basic encoder–decoder structure. Traditional computer vision techniques for hair segmentation, such as the mean shift and watershed methodologies, suffer from inaccuracy and slow execution due to multi-step, complex image processing, and it is difficult to run them in real time unless an optimization technique is applied to the partitioning. To solve this problem, we exploited Mobile-Unet, which combines the U-Net segmentation model with the optimization techniques of MobileNetV2. In experiments, hair segmentation accuracy was evaluated across genders and races, and the average accuracy was 89.9%. By comparing the accuracy and execution speed of our model with those of models in related studies, we confirmed that the proposed model achieves the same or better performance. The resulting hair segmentation provides hair information (style, color, length) that has a significant impact on human–robot interaction.
(This article belongs to the Special Issue Human Face and Motion Recognition in Video)
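Segmentation accuracy figures like the 89.9% above are computed by comparing predicted masks against ground truth. As a minimal sketch, the metric below is intersection-over-union on binary masks; the toy masks and the choice of IoU (rather than the paper's exact metric) are assumptions:

```python
def mask_iou(pred, truth):
    """Intersection-over-union of two binary hair masks (flat 0/1 lists)."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    return inter / union if union else 1.0

# Toy 3x3 masks flattened row-major; the prediction misses one hair pixel.
truth = [1, 1, 0,
         1, 1, 0,
         0, 0, 0]
pred  = [1, 1, 0,
         1, 0, 0,
         0, 0, 0]
print(mask_iou(pred, truth))  # 3 overlapping pixels / 4 in union = 0.75
```

Averaging such per-image scores within each demographic group is one way to obtain the per-gender and per-race accuracy breakdown the abstract describes.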

14 pages, 1177 KiB  
Article
Study of Process-Focused Assessment Using an Algorithm for Facial Expression Recognition Based on a Deep Neural Network Model
by Ho-Jung Lee and Deokwoo Lee
Electronics 2021, 10(1), 54; https://doi.org/10.3390/electronics10010054 - 31 Dec 2020
Cited by 9 | Viewed by 2584
Abstract
This study proposes an approach for process-focused assessment (PFA) using deep neural networks on sequences of facial images. Recently, process-based assessment has received significant attention compared to result-based assessment in the field of education: continuously evaluating and quantifying student engagement, as well as understanding of and interaction with teachers during study activities, are considered important factors. However, achieving PFA from a technical and systematic perspective requires real-time monitoring of students’ learning, which is time-consuming and demands extremely close attention to each student. This study therefore develops an efficient method for evaluating students’ learning in real time from facial images. We trained a deep neural network model that learns facial expressions and classifies them into three categories: easy, neutral, and difficult. Because demand for online learning is increasing, PFA is needed for efficient, convenient, and reliable assessment. This study chiefly considers sequences of 2D images of students solving exam problems. The experimental results demonstrate that the proposed approach is feasible and can be applied to PFA in classrooms.
(This article belongs to the Special Issue Human Face and Motion Recognition in Video)
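Since the approach classifies each frame into easy/neutral/difficult, the per-frame outputs must be aggregated over a clip to assess the student. The averaging scheme below is one plausible aggregation, an assumption for illustration; the per-frame classifier itself is not shown:

```python
def assess_sequence(frame_probs):
    """Average per-frame class probabilities over a clip and return the
    dominant perceived difficulty. frame_probs: list of [easy, neutral,
    difficult] probability triples, one per frame."""
    n = len(frame_probs)
    mean = [sum(f[c] for f in frame_probs) / n for c in range(3)]
    labels = ["easy", "neutral", "difficult"]
    return labels[mean.index(max(mean))]

# A clip where the student's expression mostly reads as "difficult".
clip = [[0.1, 0.3, 0.6],
        [0.2, 0.2, 0.6],
        [0.1, 0.5, 0.4]]
print(assess_sequence(clip))  # difficult
```

Averaging over the whole sequence, rather than trusting single frames, smooths out transient expressions, which matters when assessment runs continuously during an exam.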
