Sensor Applications on Emotion Recognition

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Intelligent Sensors".

Deadline for manuscript submissions: closed (15 July 2020) | Viewed by 81864

Special Issue Editors


Prof. Dr. Eui Chul Lee
Guest Editor
Department of Intelligent Engineering Informatics for Human, Sangmyung University, Seoul 03016, Republic of Korea
Interests: computer vision; pattern recognition; biometrics

Dr. Gi Pyo Nam
Guest Editor
Center for Artificial Intelligence, Korea Institute of Science and Technology (KIST)
Interests: biometrics; pattern recognition; digital image processing; computer vision

Special Issue Information

Dear Colleagues,

Human emotions have long been regarded as objects of qualitative expression and cognition. However, research on quantitatively measuring emotions, driven by advances in sensor technology and artificial intelligence, has made rapid progress only recently.

Recent studies on “emotion recognition” have actively sought to quantitatively measure human emotion and to use it as a context-awareness parameter for applications that improve the user experience.

Over the several decades that this research has been ongoing, there has been a paradigm shift from systems that attach complicated sensors to the body toward ultralight sensors and remote sensing methods that minimize inconvenience to the user. In addition, emotion can be recognized indirectly by analyzing environmental factors that may affect it, alongside human behavior and physiological signals.

Within this framework, we are pleased to serve as Guest Editors of this Special Issue on “Sensor Applications on Emotion Recognition”.

This Special Issue welcomes research on sensor applications for emotion recognition that take user convenience into account. Of particular interest is work that breaks away from traditional sensor technologies that inconvenience the user and instead addresses both the convenience and performance problems of emotion recognition through sensor-based software solutions.

Prof. Dr. Eui Chul Lee
Dr. Gi Pyo Nam
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • quantitative emotion recognition
  • biometric sensor-based emotion recognition
  • environmental emotion parameter
  • remote emotion sensing
  • image or signal enhancement for emotion recognition
  • artificial intelligence for emotion recognition
  • computer vision for human behavior analysis
  • alternative sensor technology for measuring human body parameters
  • user experience design for sensor-based emotion recognition
  • emotion recognition for applications in other disciplines (e.g., psychiatric diagnosis, lie detection in forensics)

Published Papers (16 papers)


Research


25 pages, 23382 KiB  
Article
Emotion Variation from Controlling Contrast of Visual Contents through EEG-Based Deep Emotion Recognition
by Heekyung Yang, Jongdae Han and Kyungha Min
Sensors 2020, 20(16), 4543; https://doi.org/10.3390/s20164543 - 13 Aug 2020
Cited by 6 | Viewed by 2831
Abstract
Visual contents such as movies and animation evoke various human emotions. We examine an argument that the emotion from the visual contents may vary according to the contrast control of the scenes contained in the contents. We sample three emotions including positive, neutral and negative to prove our argument. We also sample several scenes of these emotions from visual contents and control the contrast of the scenes. We manipulate the contrast of the scenes and measure the change of valence and arousal from human participants who watch the contents using a deep emotion recognition module based on electroencephalography (EEG) signals. As a result, we conclude that the enhancement of contrast induces the increase of valence, while the reduction of contrast induces the decrease. Meanwhile, the contrast control affects arousal on a very minute scale.
(This article belongs to the Special Issue Sensor Applications on Emotion Recognition)
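As an illustration of the contrast manipulation studied above, the sketch below scales pixel values around the frame mean with a factor `alpha` (values above 1 enhance contrast, below 1 reduce it). The function name, factor values, and toy frame are assumptions for illustration; the paper's actual content pipeline and EEG-based measurement are not reproduced here.

```python
import numpy as np

def adjust_contrast(frame: np.ndarray, alpha: float) -> np.ndarray:
    """Scale pixel values around the frame mean (linear contrast stretch).

    alpha > 1 enhances contrast; alpha < 1 reduces it.
    `frame` is assumed to be an 8-bit grayscale or RGB image.
    """
    frame = frame.astype(np.float32)
    mean = frame.mean()
    out = mean + alpha * (frame - mean)
    return np.clip(out, 0, 255).astype(np.uint8)

# Toy usage on a random "scene"
scene = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
enhanced = adjust_contrast(scene, alpha=1.5)   # expected to raise valence per the study
reduced = adjust_contrast(scene, alpha=0.6)    # expected to lower valence per the study
```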

24 pages, 4222 KiB  
Article
Sparse Spatiotemporal Descriptor for Micro-Expression Recognition Using Enhanced Local Cube Binary Pattern
by Shixin Cen, Yang Yu, Gang Yan, Ming Yu and Qing Yang
Sensors 2020, 20(16), 4437; https://doi.org/10.3390/s20164437 - 08 Aug 2020
Cited by 10 | Viewed by 2557
Abstract
As a spontaneous facial expression, a micro-expression can reveal the psychological responses of human beings. Thus, micro-expression recognition can be widely studied and applied for its potentiality in clinical diagnosis, psychological research, and security. However, micro-expression recognition is a formidable challenge due to the short-lived time frame and low-intensity of the facial actions. In this paper, a sparse spatiotemporal descriptor for micro-expression recognition is developed by using the Enhanced Local Cube Binary Pattern (Enhanced LCBP). The proposed Enhanced LCBP is composed of three complementary binary features containing Spatial Difference Local Cube Binary Patterns (Spatial Difference LCBP), Temporal Direction Local Cube Binary Patterns (Temporal Direction LCBP), and Temporal Gradient Local Cube Binary Patterns (Temporal Gradient LCBP). With the application of Enhanced LCBP, it would no longer be a problem to provide binary features with spatiotemporal domain complementarity to capture subtle facial changes. In addition, due to the redundant information existing among the division grids, which affects the ability of descriptors to distinguish micro-expressions, the Multi-Regional Joint Sparse Learning is designed to perform feature selection for the division grids, thus paying more attention to the critical local regions. Finally, the Multi-kernel Support Vector Machine (SVM) is employed to fuse the selected features for the final classification. The proposed method exhibits great advantage and achieves promising results on four spontaneous micro-expression datasets. Through further observation of parameter evaluation and confusion matrix, the sufficiency and effectiveness of the proposed method are proved.
(This article belongs to the Special Issue Sensor Applications on Emotion Recognition)
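The Enhanced LCBP descriptor itself is not reproduced here, but the general idea of a cube-shaped binary pattern over a spatiotemporal volume can be illustrated with a toy version: each interior voxel is compared against its six face-adjacent neighbours (two temporal, four spatial) to form a 6-bit code, and the codes are histogrammed. The 6-neighbour simplification, array shapes, and names are illustrative assumptions, not the paper's descriptor.

```python
import numpy as np

def toy_cube_binary_pattern(volume: np.ndarray) -> np.ndarray:
    """Simplified cube binary pattern for a (T, H, W) grayscale video volume.

    Each interior voxel is compared with its 6 face-adjacent neighbours
    (previous/next frame, up/down, left/right); a neighbour >= centre sets a bit.
    Returns a normalised 64-bin histogram of the resulting codes.
    """
    v = volume.astype(np.float32)
    c = v[1:-1, 1:-1, 1:-1]                       # centre voxels
    neighbours = [
        v[:-2, 1:-1, 1:-1], v[2:, 1:-1, 1:-1],    # temporal: t-1, t+1
        v[1:-1, :-2, 1:-1], v[1:-1, 2:, 1:-1],    # vertical: y-1, y+1
        v[1:-1, 1:-1, :-2], v[1:-1, 1:-1, 2:],    # horizontal: x-1, x+1
    ]
    codes = np.zeros_like(c, dtype=np.int32)
    for bit, n in enumerate(neighbours):
        codes |= (n >= c).astype(np.int32) << bit
    hist, _ = np.histogram(codes, bins=np.arange(65))
    return hist / hist.sum()

clip = np.random.randint(0, 256, (30, 64, 64))    # toy micro-expression clip
descriptor = toy_cube_binary_pattern(clip)
```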

27 pages, 6306 KiB  
Article
Vagal Tone Differences in Empathy Level Elicited by Different Emotions and a Co-Viewer
by Suhhee Yoo and Mincheol Whang
Sensors 2020, 20(11), 3136; https://doi.org/10.3390/s20113136 - 01 Jun 2020
Cited by 7 | Viewed by 2627
Abstract
Empathy can bring different benefits depending on what kind of emotions people empathize with. For example, empathy with negative emotions can raise donations to charity, while empathy with positive emotions can increase participation during remote education. However, few studies have focused on the physiological differences depending on what kind of emotions people empathize with. Furthermore, a co-viewer can influence the elicitation of different levels of empathy, but this has been less discussed. Therefore, this study investigated vagal response differences according to each empathy factor level elicited by different emotions and a co-viewer. Fifty-nine participants were asked to watch 4 videos, to evaluate subjective valence and arousal scores, and to complete an empathy questionnaire covering cognitive, affective, and identification empathy. Half of the participants watched the videos alone and the other half watched the videos with a co-viewer. Valence and arousal scores were categorized into three levels to determine what kind of emotions the participants empathized with. Empathy level (high vs. low) was determined based on the self-report scores. Two-way MANOVA revealed an interaction effect of empathy level and emotions. A high affective empathy level is associated with a higher vagal response regardless of what kind of emotions participants empathized with. However, vagal response differences at the other empathy factor levels showed different patterns depending on what kind of emotions the participant empathized with. A high cognitive empathy level showed lower vagal responses when participants felt negative or positive valence. A high identification level also showed increased cognitive burden when participants empathized with negative and neutral valence. The results implied that emotions and types of empathy should be considered when measuring empathic responses using vagal tone. Two-way MANOVA also revealed empathic response differences between the co-viewer condition and emotion. Participants with a co-viewer showed higher vagal responses and self-reported empathy scores only when they empathized with arousal. This implied that a co-viewer may impact empathic responses only when participants feel higher emotional intensity.
(This article belongs to the Special Issue Sensor Applications on Emotion Recognition)
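A two-way MANOVA of the kind reported above can be run with statsmodels. The sketch below assumes a hypothetical data frame with two vagal-tone indices (RMSSD and HF power) as dependent variables and empathy level and emotion category as factors; the column names, indices, and simulated data are illustrative, not the study's measurements.

```python
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

rng = np.random.default_rng(0)
n = 118  # illustrative sample size
df = pd.DataFrame({
    "rmssd": rng.normal(40, 10, n),          # vagal-tone index 1 (ms)
    "hf_power": rng.normal(600, 150, n),     # vagal-tone index 2 (ms^2)
    "empathy_level": rng.choice(["high", "low"], n),
    "emotion": rng.choice(["negative", "neutral", "positive"], n),
})

# Two dependent variables modelled by two factors and their interaction
maov = MANOVA.from_formula("rmssd + hf_power ~ empathy_level * emotion", data=df)
print(maov.mv_test())   # Wilks' lambda, Pillai's trace, etc. per effect
```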

12 pages, 796 KiB  
Article
Affective Latent Representation of Acoustic and Lexical Features for Emotion Recognition
by Eesung Kim, Hyungchan Song and Jong Won Shin
Sensors 2020, 20(9), 2614; https://doi.org/10.3390/s20092614 - 04 May 2020
Cited by 7 | Viewed by 3349
Abstract
In this paper, we propose a novel emotion recognition method based on the underlying emotional characteristics extracted from a conditional adversarial auto-encoder (CAAE), in which both acoustic and lexical features are used as inputs. The acoustic features are generated by calculating statistical functionals of low-level descriptors and by a deep neural network (DNN). These acoustic features are concatenated with three types of lexical features extracted from the text: a sparse representation, a distributed representation, and affective lexicon-based dimensions. Two-dimensional latent representations similar to vectors in the valence-arousal space are obtained by a CAAE, which can be directly mapped into the emotional classes without the need for a sophisticated classifier. In contrast to a previous attempt that applied a CAAE to acoustic features only, the proposed approach enhances emotion recognition performance because the combined acoustic and lexical features provide sufficient discriminative power. Experimental results on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) corpus showed that our method outperformed the previously reported best results on the same corpus, achieving 76.72% in the unweighted average recall.
(This article belongs to the Special Issue Sensor Applications on Emotion Recognition)
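One detail worth illustrating is the claim that a two-dimensional latent representation resembling the valence-arousal plane can be mapped to emotion classes without a separate classifier. A minimal sketch of such a quadrant mapping is shown below; the labels, zero-centred thresholds, and function name are assumptions for illustration, not the paper's mapping.

```python
def quadrant_emotion(valence: float, arousal: float) -> str:
    """Map a 2D latent point (valence, arousal) to a coarse emotion class.

    Assumes both axes are centred at zero; the four labels are a common
    textbook quadrant assignment, not the categories used in the paper.
    """
    if valence >= 0 and arousal >= 0:
        return "happy/excited"
    if valence < 0 and arousal >= 0:
        return "angry/afraid"
    if valence < 0 and arousal < 0:
        return "sad/bored"
    return "calm/content"

print(quadrant_emotion(0.7, 0.4))    # -> happy/excited
print(quadrant_emotion(-0.5, -0.2))  # -> sad/bored
```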

16 pages, 6966 KiB  
Article
Adaptive 3D Model-Based Facial Expression Synthesis and Pose Frontalization
by Yu-Jin Hong, Sung Eun Choi, Gi Pyo Nam, Heeseung Choi, Junghyun Cho and Ig-Jae Kim
Sensors 2020, 20(9), 2578; https://doi.org/10.3390/s20092578 - 01 May 2020
Cited by 3 | Viewed by 4977
Abstract
Facial expressions are one of the important non-verbal ways used to understand human emotions during communication. Thus, acquiring and reproducing facial expressions is helpful in analyzing human emotional states. However, owing to complex and subtle facial muscle movements, facial expression modeling from images with face poses is difficult to achieve. To handle this issue, we present a method for acquiring facial expressions from a non-frontal single photograph using a 3D-aided approach. In addition, we propose a contour-fitting method that improves the modeling accuracy by automatically rearranging 3D contour landmarks corresponding to fixed 2D image landmarks. The acquired facial expression input can be parametrically manipulated to create various facial expressions through a blendshape or expression transfer based on the FACS (Facial Action Coding System). To achieve a realistic facial expression synthesis, we propose an exemplar-texture wrinkle synthesis method that extracts and synthesizes appropriate expression wrinkles according to the target expression. To do so, we constructed a wrinkle table of various facial expressions from 400 people. As one of the applications, we proved that the expression-pose synthesis method is suitable for expression-invariant face recognition through a quantitative evaluation, and showed the effectiveness based on a qualitative evaluation. We expect our system to be a benefit to various fields such as face recognition, HCI, and data augmentation for deep learning.
(This article belongs to the Special Issue Sensor Applications on Emotion Recognition)
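The blendshape-based manipulation mentioned in the abstract can be summarised in a few lines: a new expression is the neutral mesh plus a weighted sum of expression offsets. The sketch below is a generic linear blendshape with hypothetical meshes and weights; it does not reproduce the paper's FACS-based transfer or wrinkle synthesis.

```python
import numpy as np

def blend_expression(neutral: np.ndarray,
                     expression_shapes: np.ndarray,
                     weights: np.ndarray) -> np.ndarray:
    """Linear blendshape model.

    neutral:            (V, 3) neutral face vertices
    expression_shapes:  (K, V, 3) vertices of K extreme expressions
    weights:            (K,) blend weights, typically in [0, 1]
    """
    offsets = expression_shapes - neutral[None, :, :]      # per-expression deltas
    return neutral + np.tensordot(weights, offsets, axes=1)

# Toy usage: 50% of expression 0 plus 20% of expression 1 on a random 1000-vertex mesh
neutral = np.random.rand(1000, 3)
shapes = np.random.rand(2, 1000, 3)
blended = blend_expression(neutral, shapes, np.array([0.5, 0.2]))
```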

19 pages, 1784 KiB  
Article
An Adaptive Face Tracker with Application in Yawning Detection
by Aasim Khurshid and Jacob Scharcanski
Sensors 2020, 20(5), 1494; https://doi.org/10.3390/s20051494 - 09 Mar 2020
Cited by 2 | Viewed by 3722
Abstract
In this work, we propose an adaptive face tracking scheme that compensates for possible face tracking errors during its operation. The proposed scheme is equipped with a tracking divergence estimate, which allows face tracking errors to be detected early and minimized, so the tracked face is not missed indefinitely. When the estimated face tracking error increases, a resyncing mechanism based on Constrained Local Models (CLM) is activated to reduce the tracking errors by re-estimating the tracked facial features’ locations (e.g., facial landmarks). To improve the CLM feature search mechanism, a Weighted-CLM (W-CLM) is proposed and used in resyncing. The performance of the proposed face tracking method is evaluated in the challenging context of driver monitoring using yawning detection and talking video datasets. Furthermore, an improvement in a yawning detection scheme is proposed. Experiments suggest that our proposed face tracking scheme can obtain better performance than comparable state-of-the-art face tracking methods and can be successfully applied in yawning detection.
(This article belongs to the Special Issue Sensor Applications on Emotion Recognition)

13 pages, 2329 KiB  
Article
Differences in Facial Expressions between Spontaneous and Posed Smiles: Automated Method by Action Units and Three-Dimensional Facial Landmarks
by Seho Park, Kunyoung Lee, Jae-A Lim, Hyunwoong Ko, Taehoon Kim, Jung-In Lee, Hakrim Kim, Seong-Jae Han, Jeong-Shim Kim, Soowon Park, Jun-Young Lee and Eui Chul Lee
Sensors 2020, 20(4), 1199; https://doi.org/10.3390/s20041199 - 21 Feb 2020
Cited by 19 | Viewed by 5132
Abstract
Research on emotion recognition from facial expressions has found evidence of different muscle movements between genuine and posed smiles. To further confirm discrete movement intensities of each facial segment, we explored differences in facial expressions between spontaneous and posed smiles with three-dimensional facial landmarks. Advanced machine analysis was adopted to measure changes in the dynamics of 68 segmented facial regions. A total of 57 normal adults (19 men, 38 women) who displayed adequate posed and spontaneous facial expressions for happiness were included in the analyses. The results indicate that spontaneous smiles have higher intensities in the upper face than in the lower face. On the other hand, posed smiles showed higher intensities in the lower part of the face. Furthermore, the 3D facial landmark technique revealed that the left eyebrow displayed stronger intensity during spontaneous smiles than the right eyebrow. These findings suggest a potential application of landmark-based emotion recognition: spontaneous smiles can be distinguished from posed smiles by measuring relative intensities between the upper and lower face, with a focus on left-sided asymmetry in the upper region.
(This article belongs to the Special Issue Sensor Applications on Emotion Recognition)
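The relative upper-face versus lower-face intensity suggested by these findings can be approximated directly from 68-point landmark tracks. The sketch below assumes the common iBUG 68-point indexing (0-indexed: eyebrows 17-26, eyes 36-47, mouth 48-67) and uses frame-to-frame displacement as a proxy for movement intensity; it is an illustration, not the study's analysis pipeline.

```python
import numpy as np

UPPER_IDX = list(range(17, 27)) + list(range(36, 48))   # eyebrows + eyes (iBUG 68-point, 0-indexed)
LOWER_IDX = list(range(48, 68))                         # mouth

def region_intensity(landmarks: np.ndarray, idx: list) -> float:
    """Mean frame-to-frame displacement of selected 3D landmarks.

    landmarks: (T, 68, 3) array of landmark positions over T frames.
    """
    motion = np.linalg.norm(np.diff(landmarks[:, idx, :], axis=0), axis=-1)
    return float(motion.mean())

def upper_lower_ratio(landmarks: np.ndarray) -> float:
    """> 1 suggests stronger upper-face motion (the spontaneous-smile pattern reported here)."""
    return region_intensity(landmarks, UPPER_IDX) / region_intensity(landmarks, LOWER_IDX)

smile_track = np.random.rand(90, 68, 3)    # toy 3-second landmark track at 30 fps
print(upper_lower_ratio(smile_track))
```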

17 pages, 4127 KiB  
Article
The Design of CNN Architectures for Optimal Six Basic Emotion Classification Using Multiple Physiological Signals
by SeungJun Oh, Jun-Young Lee and Dong Keun Kim
Sensors 2020, 20(3), 866; https://doi.org/10.3390/s20030866 - 06 Feb 2020
Cited by 37 | Viewed by 7006
Abstract
This study aimed to design an optimal emotion recognition method using multiple physiological signal parameters acquired by bio-signal sensors for improving the accuracy of classifying individual emotional responses. Multiple physiological signals such as respiration (RSP) and heart rate variability (HRV) were acquired in an experiment from 53 participants when six basic emotion states were induced. Two RSP parameters were acquired from a chest-band respiration sensor, and five HRV parameters were acquired from a finger-clip blood volume pulse (BVP) sensor. A newly designed deep-learning model based on a convolutional neural network (CNN) was adopted for detecting the identification accuracy of individual emotions. Additionally, the signal combination of the acquired parameters was proposed to obtain high classification accuracy. Furthermore, a dominant factor influencing the accuracy was found by comparing the relativeness of the parameters, providing a basis for supporting the results of emotion classification. The users of this proposed model will soon be able to improve the emotion recognition model further based on CNN using multimodal physiological signals and their sensors.
(This article belongs to the Special Issue Sensor Applications on Emotion Recognition)
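A toy version of a CNN that takes a small vector of physiological parameters (here, two RSP and five HRV values, seven in total) and outputs six emotion classes might look like the PyTorch sketch below. The layer sizes, kernel widths, and single-channel input layout are assumptions for illustration; the paper's architecture and signal-combination scheme are not reproduced.

```python
import torch
import torch.nn as nn

class TinyEmotionCNN(nn.Module):
    """1D CNN over a 7-element physiological feature vector (2 RSP + 5 HRV)."""
    def __init__(self, n_features: int = 7, n_classes: int = 6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * n_features, 64), nn.ReLU(),
            nn.Linear(64, n_classes),       # logits for six basic emotions
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 7) -> (batch, 1, 7) so Conv1d sees one input channel
        return self.classifier(self.features(x.unsqueeze(1)))

model = TinyEmotionCNN()
logits = model(torch.randn(8, 7))   # batch of 8 samples
print(logits.shape)                 # torch.Size([8, 6])
```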

22 pages, 18412 KiB  
Article
Expressure: Detect Expressions Related to Emotional and Cognitive Activities Using Forehead Textile Pressure Mechanomyography
by Bo Zhou, Tandra Ghose and Paul Lukowicz
Sensors 2020, 20(3), 730; https://doi.org/10.3390/s20030730 - 28 Jan 2020
Cited by 16 | Viewed by 5448
Abstract
We investigate how pressure-sensitive smart textiles, in the form of a headband, can detect changes in facial expressions that are indicative of emotions and cognitive activities. Specifically, we present the Expressure system that performs surface pressure mechanomyography on the forehead using an array of textile pressure sensors that is not dependent on specific placement or attachment to the skin. Our approach is evaluated in systematic psychological experiments. First, through a mimicking expression experiment with 20 participants, we demonstrate the system’s ability to detect well-defined facial expressions. We achieved accuracies of 0.824 to classify among three eyebrow movements (0.333 chance-level) and 0.381 among seven full-face expressions (0.143 chance-level). A second experiment was conducted with 20 participants to induce cognitive loads with N-back tasks. Statistical analysis has shown significant correlations between the Expressure features on a fine time granularity and the cognitive activity. The results have also shown significant correlations between the Expressure features and the N-back score. From the 10 most facially expressive participants, our approach can predict whether the N-back score is above or below the average with 0.767 accuracy.
(This article belongs to the Special Issue Sensor Applications on Emotion Recognition)

13 pages, 2967 KiB  
Article
Recognition of Emotion According to the Physical Elements of the Video
by Jing Zhang, Xingyu Wen and Mincheol Whang
Sensors 2020, 20(3), 649; https://doi.org/10.3390/s20030649 - 24 Jan 2020
Cited by 13 | Viewed by 3391
Abstract
The increasing interest in the effects of emotion on cognitive, social, and neural processes creates a constant need for efficient and reliable techniques for emotion elicitation. Emotions are important in many areas, especially in advertising design and video production. The impact of emotions on the audience plays an important role. This paper analyzes the physical elements in a two-dimensional emotion map by extracting the physical elements of a video (color, light intensity, sound, etc.). We used k-nearest neighbors (K-NN), support vector machine (SVM), and multilayer perceptron (MLP) classifiers in the machine learning method to accurately predict the four dimensions that express emotions, as well as summarize the relationship between the two-dimensional emotion space and physical elements when designing and producing video.
(This article belongs to the Special Issue Sensor Applications on Emotion Recognition)
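The classifier comparison described above follows a standard scikit-learn pattern. The sketch below assumes a hypothetical feature matrix of per-video physical elements and four quadrant labels; the random placeholder data and feature count are illustrative, not the paper's dataset.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))        # toy physical elements per video clip
y = rng.integers(0, 4, size=200)     # four valence-arousal quadrants

classifiers = {
    "K-NN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(kernel="rbf", C=1.0),
    "MLP": MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000),
}
for name, clf in classifiers.items():
    pipe = make_pipeline(StandardScaler(), clf)
    scores = cross_val_score(pipe, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```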

11 pages, 3150 KiB  
Article
Recognition of Negative Emotion Using Long Short-Term Memory with Bio-Signal Feature Compression
by JeeEun Lee and Sun K. Yoo
Sensors 2020, 20(2), 573; https://doi.org/10.3390/s20020573 - 20 Jan 2020
Cited by 17 | Viewed by 3478
Abstract
Negative emotion is one reason why stress causes negative feedback. Therefore, many studies are being done to recognize negative emotions. However, emotion is difficult to classify because it is subjective and difficult to quantify. Moreover, emotion changes over time and is affected by mood. Therefore, we measured electrocardiogram (ECG), skin temperature (ST), and galvanic skin response (GSR) to detect objective indicators. We also compressed the features associated with emotion using a stacked auto-encoder (SAE). Finally, the compressed features and time information were used in training through long short-term memory (LSTM). As a result, the proposed LSTM used with the feature compression model showed the highest accuracy (99.4%) for recognizing negative emotions. The results of the suggested model were 11.3% higher than with a neural network (NN) and 5.6% higher than with SAE.
(This article belongs to the Special Issue Sensor Applications on Emotion Recognition)
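The "compress with a stacked auto-encoder, then classify sequences with an LSTM" pipeline can be sketched in PyTorch as follows. The feature dimension (32 raw bio-signal features compressed to 8), the single-layer LSTM, and the class count are illustrative assumptions; the paper's exact feature set and training procedure are not reproduced.

```python
import torch
import torch.nn as nn

class StackedAutoEncoder(nn.Module):
    """Compresses per-frame ECG/ST/GSR features (toy dimension 32) down to 8."""
    def __init__(self, in_dim: int = 32, code_dim: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 16), nn.ReLU(),
                                     nn.Linear(16, code_dim))
        self.decoder = nn.Sequential(nn.Linear(code_dim, 16), nn.ReLU(),
                                     nn.Linear(16, in_dim))

    def forward(self, x):
        code = self.encoder(x)
        return self.decoder(code), code

class EmotionLSTM(nn.Module):
    """Classifies a sequence of compressed codes as negative vs. non-negative."""
    def __init__(self, code_dim: int = 8, hidden: int = 32, n_classes: int = 2):
        super().__init__()
        self.lstm = nn.LSTM(code_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, codes):            # codes: (batch, time, code_dim)
        _, (h_n, _) = self.lstm(codes)
        return self.head(h_n[-1])        # logits from the last hidden state

sae, clf = StackedAutoEncoder(), EmotionLSTM()
frames = torch.randn(4, 60, 32)          # 4 sequences, 60 time steps, 32 features
_, codes = sae(frames)                   # SAE applied frame-wise
print(clf(codes).shape)                  # torch.Size([4, 2])
```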

15 pages, 1422 KiB  
Article
A CNN-Assisted Enhanced Audio Signal Processing for Speech Emotion Recognition
by Mustaqeem and Soonil Kwon
Sensors 2020, 20(1), 183; https://doi.org/10.3390/s20010183 - 28 Dec 2019
Cited by 251 | Viewed by 15347
Abstract
Speech is the most significant mode of communication among human beings and a potential method for human-computer interaction (HCI) by using a microphone sensor. Quantifiable emotion recognition using these sensors from speech signals is an emerging area of research in HCI, which applies to multiple applications such as human-robot interaction, virtual reality, behavior assessment, healthcare, and emergency call centers to determine the speaker’s emotional state from an individual’s speech. In this paper, we present major contributions for (i) increasing the accuracy of speech emotion recognition (SER) compared to the state of the art and (ii) reducing the computational complexity of the presented SER model. We propose an artificial intelligence-assisted deep stride convolutional neural network (DSCNN) architecture using the plain nets strategy to learn salient and discriminative features from spectrograms of speech signals that are enhanced in prior steps to perform better. Local hidden patterns are learned in convolutional layers with special strides to down-sample the feature maps rather than a pooling layer, and global discriminative features are learned in fully connected layers. A SoftMax classifier is used for the classification of emotions in speech. The proposed technique is evaluated on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) and Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) datasets, improving accuracy by 7.85% and 4.5%, respectively, with the model size reduced by 34.5 MB. This proves the effectiveness and significance of the proposed SER technique and reveals its applicability in real-world applications.
(This article belongs to the Special Issue Sensor Applications on Emotion Recognition)
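The key architectural point, strided convolutions used for down-sampling instead of pooling layers, can be illustrated with a compact PyTorch sketch. The input size, channel counts, and number of classes are assumptions for illustration; this is not the paper's DSCNN.

```python
import torch
import torch.nn as nn

class StridedSpectrogramCNN(nn.Module):
    """Intermediate down-sampling uses stride-2 convolutions rather than pooling layers."""
    def __init__(self, n_classes: int = 4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),   # 128 -> 64
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),  # 64 -> 32
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.AdaptiveAvgPool2d(1),          # global average head (not used for intermediate down-sampling)
        )
        self.head = nn.Linear(64, n_classes)  # softmax applied at inference / in the loss

    def forward(self, spec):                  # spec: (batch, 1, freq, time)
        return self.head(self.features(spec).flatten(1))

model = StridedSpectrogramCNN()
spectrograms = torch.randn(8, 1, 128, 128)    # toy log-spectrogram batch
probs = torch.softmax(model(spectrograms), dim=1)
print(probs.shape)                            # torch.Size([8, 4])
```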

21 pages, 2965 KiB  
Article
Distinguishing Emotional Responses to Photographs and Artwork Using a Deep Learning-Based Approach
by Heekyung Yang, Jongdae Han and Kyungha Min
Sensors 2019, 19(24), 5533; https://doi.org/10.3390/s19245533 - 14 Dec 2019
Cited by 13 | Viewed by 4366
Abstract
Visual stimuli from photographs and artworks raise corresponding emotional responses. Proving whether the emotions that arise from photographs and artworks are different or not is a long process. We answer this question by employing electroencephalogram (EEG)-based biosignals and a deep convolutional neural network (CNN)-based emotion recognition model. We employ Russell’s emotion model, which matches emotion keywords such as happy, calm or sad to a coordinate system whose axes are valence and arousal, respectively. We collect photographs and artwork images that match the emotion keywords and build eighteen one-minute video clips for nine emotion keywords for photographs and artwork. We recruited forty subjects and conducted tests on the emotional responses to the video clips. From the t-test on the results, we concluded that valence shows a difference, while arousal does not.
(This article belongs to the Special Issue Sensor Applications on Emotion Recognition)
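The per-axis comparison described above (valence differs, arousal does not) is a standard two-sample t-test. A minimal SciPy sketch with simulated scores is shown below; the numbers are random placeholders, not the study's measurements.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)
# Hypothetical per-subject valence/arousal estimates for the two stimulus types
valence_photo = rng.normal(0.20, 0.15, 40)
valence_art = rng.normal(0.35, 0.15, 40)
arousal_photo = rng.normal(0.50, 0.20, 40)
arousal_art = rng.normal(0.51, 0.20, 40)

for name, a, b in [("valence", valence_photo, valence_art),
                   ("arousal", arousal_photo, arousal_art)]:
    t, p = ttest_ind(a, b)
    print(f"{name}: t = {t:.2f}, p = {p:.3f}")
```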

12 pages, 703 KiB  
Article
A Multi-Column CNN Model for Emotion Recognition from EEG Signals
by Heekyung Yang, Jongdae Han and Kyungha Min
Sensors 2019, 19(21), 4736; https://doi.org/10.3390/s19214736 - 31 Oct 2019
Cited by 116 | Viewed by 7459
Abstract
We present a multi-column CNN-based model for emotion recognition from EEG signals. Recently, deep neural networks have been widely employed for extracting features and recognizing emotions from various biosignals including EEG signals. A decision from a single CNN-based emotion recognizing module shows improved accuracy over conventional handcrafted feature-based modules. To further improve the accuracy of the CNN-based modules, we devise a multi-column structured model, whose decision is produced by a weighted sum of the decisions from individual recognizing modules. We apply the model to EEG signals from the DEAP dataset for comparison and demonstrate the improved accuracy of our model.
(This article belongs to the Special Issue Sensor Applications on Emotion Recognition)
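The decision fusion described above, a weighted sum of the per-column outputs, is easy to sketch. The PyTorch code below assumes each column already produces class probabilities and uses learnable, softmax-normalised weights; the columns themselves and the paper's weight-selection procedure are not reproduced.

```python
import torch
import torch.nn as nn

class WeightedDecisionFusion(nn.Module):
    """Fuses per-column class probabilities with a learnable weight per column."""
    def __init__(self, n_columns: int):
        super().__init__()
        self.raw_weights = nn.Parameter(torch.zeros(n_columns))   # equal weights at init

    def forward(self, column_probs: torch.Tensor) -> torch.Tensor:
        # column_probs: (n_columns, batch, n_classes)
        w = torch.softmax(self.raw_weights, dim=0)                # weights sum to 1
        return torch.einsum("c,cbk->bk", w, column_probs)         # weighted sum over columns

fusion = WeightedDecisionFusion(n_columns=3)
probs = torch.softmax(torch.randn(3, 8, 2), dim=-1)   # 3 columns, batch of 8, 2 classes
fused = fusion(probs)
print(fused.argmax(dim=1))                            # fused decisions for the batch
```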

16 pages, 3984 KiB  
Article
Hands-Free User Interface for AR/VR Devices Exploiting Wearer’s Facial Gestures Using Unsupervised Deep Learning
by Jaekwang Cha, Jinhyuk Kim and Shiho Kim
Sensors 2019, 19(20), 4441; https://doi.org/10.3390/s19204441 - 14 Oct 2019
Cited by 7 | Viewed by 4253
Abstract
Developing a user interface (UI) suitable for headset environments is one of the challenges in the field of augmented reality (AR) technologies. This study proposes a hands-free UI for an AR headset that exploits facial gestures of the wearer to recognize user intentions. The facial gestures of the headset wearer are detected by a custom-designed sensor that detects skin deformation based on infrared diffusion characteristics of human skin. We designed a deep neural network classifier to determine the user’s intended gestures from skin-deformation data, which are exploited as user input commands for the proposed UI system. The proposed classifier is composed of a spatiotemporal autoencoder and deep embedded clustering algorithm, trained in an unsupervised manner. The UI device was embedded in a commercial AR headset, and several experiments were performed on the online sensor data to verify operation of the device. We achieved implementation of a hands-free UI for an AR headset with average accuracy of 95.4% user-command recognition, as determined through tests by participants.
(This article belongs to the Special Issue Sensor Applications on Emotion Recognition)
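Deep embedded clustering assigns each latent point to a cluster with a Student's t-kernel soft assignment and sharpens it with a target distribution. The sketch below implements those two standard formulas in NumPy on a toy latent space; it is a generic DEC building block, not the paper's full spatiotemporal autoencoder pipeline.

```python
import numpy as np

def dec_soft_assignment(z: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """Student's t-kernel soft assignment q_ij (DEC, one degree of freedom).

    z: (N, D) latent codes; centroids: (K, D) cluster centres.
    """
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)   # squared distances (N, K)
    q = 1.0 / (1.0 + d2)
    return q / q.sum(axis=1, keepdims=True)

def dec_target_distribution(q: np.ndarray) -> np.ndarray:
    """Sharpened target distribution p_ij used as the self-training target."""
    weight = q ** 2 / q.sum(axis=0, keepdims=True)                # q^2 / cluster frequency
    return weight / weight.sum(axis=1, keepdims=True)

z = np.random.rand(100, 16)          # toy latent codes from an autoencoder
centroids = np.random.rand(5, 16)    # 5 hypothetical gesture clusters
q = dec_soft_assignment(z, centroids)
p = dec_target_distribution(q)
labels = q.argmax(axis=1)            # unsupervised gesture assignments
```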

Other


12 pages, 4368 KiB  
Letter
Experimental Verification of Objective Visual Fatigue Measurement Based on Accurate Pupil Detection of Infrared Eye Image and Multi-Feature Analysis
by Taehyung Kim and Eui Chul Lee
Sensors 2020, 20(17), 4814; https://doi.org/10.3390/s20174814 - 26 Aug 2020
Cited by 17 | Viewed by 3974
Abstract
As the use of electronic displays increases rapidly, visual fatigue problems are also increasing. The subjective evaluation methods used for visual fatigue measurement have individual difference problems, while objective methods based on bio-signal measurement have problems regarding motion artifacts. Conventional eye image analysis-based visual fatigue measurement methods do not accurately characterize the complex changes in the appearance of the eye. To solve this problem, in this paper, an objective visual fatigue measurement method based on infrared eye image analysis is proposed. For accurate pupil detection, a convolutional neural network-based semantic segmentation method was used. Three features are calculated based on the pupil detection results: (1) pupil accommodation speed, (2) blink frequency, and (3) eye-closed duration. In order to verify the calculated features, differences in fatigue caused by changes in content color components such as gamma, color temperature, and brightness were compared with a reference video. The pupil detection accuracy was confirmed to be 96.63% based on the mean intersection over union. In addition, it was confirmed that all three features showed significant differences from the reference group; thus, it was verified that the proposed analysis method can be used for the objective measurement of visual fatigue.
(This article belongs to the Special Issue Sensor Applications on Emotion Recognition)
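The three features listed above can be illustrated on a per-frame pupil-radius time series: accommodation speed as the mean absolute frame-to-frame radius change, blink frequency as the number of closed-eye episodes per minute, and eye-closed duration as the total time the eye is closed. The threshold, variable names, and simulated signal below are assumptions, not values from the paper.

```python
import numpy as np

def visual_fatigue_features(pupil_radius: np.ndarray, fps: float, closed_thresh: float = 1.0):
    """Toy features from a per-frame pupil radius signal (radius ~ 0 when the eye is closed).

    Returns (accommodation speed in px/s, blink frequency per minute, eye-closed duration in s).
    """
    closed = pupil_radius < closed_thresh
    open_radius = pupil_radius[~closed]
    accommodation_speed = float(np.abs(np.diff(open_radius)).mean() * fps)

    # Count closed episodes as open-to-closed transitions
    onsets = np.flatnonzero(~closed[:-1] & closed[1:]).size
    minutes = len(pupil_radius) / fps / 60.0
    blink_frequency = onsets / minutes

    eye_closed_duration = float(closed.sum() / fps)
    return accommodation_speed, blink_frequency, eye_closed_duration

radius = np.abs(np.random.normal(20, 1.5, 1800))   # toy 60 s recording at 30 fps
radius[300:310] = 0.0                              # one simulated blink
print(visual_fatigue_features(radius, fps=30.0))
```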