Open Access Article
Emotion Recognition from rPPG via Physiologically Inspired Temporal Encoding and Attention-Based Curriculum Learning
by Changmin Lee 1, Hyunwoo Lee 2 and Mincheol Whang 1,*
1 Department of Human-Centered Artificial Intelligence, Sangmyung University, Seoul 03016, Republic of Korea
2 Department of Emotion Engineering, Sangmyung University, Seoul 03016, Republic of Korea
* Author to whom correspondence should be addressed.
Sensors 2025, 25(13), 3995; https://doi.org/10.3390/s25133995
Submission received: 7 May 2025 / Revised: 16 June 2025 / Accepted: 23 June 2025 / Published: 26 June 2025
Highlights
What are the main findings?
- A temporal-only rPPG framework with a multi-scale CNN, sparse α-Entmax attention, and Gated Pooling achieved 66.04% accuracy and a 61.97% weighted F1-score for arousal on MAHNOB-HCI (mixed subjects).
- The model underperformed for valence (62.26% accuracy), highlighting the physiological limits of unimodal time-series signals.
What is the implication of the main finding?
- Temporal rPPG can rival other single-modality methods for arousal when physiologically inspired temporal modeling is applied.
- Addressing valence requires the integration of spatial or multimodal cues, guiding future affective computing designs.
Abstract
Remote photoplethysmography (rPPG) enables non-contact physiological measurement for emotion recognition, yet the temporally sparse nature of emotional cardiovascular responses, intrinsic measurement noise, weak session-level labels, and subtle correlates of valence pose critical challenges. To address these issues, we propose a physiologically inspired deep learning framework comprising a Multi-scale Temporal Dynamics Encoder (MTDE) to capture autonomic nervous system dynamics across multiple timescales, an adaptive sparse α-Entmax attention mechanism to identify salient emotional segments amidst noisy signals, Gated Temporal Pooling for the robust aggregation of emotional features, and a structured three-phase curriculum learning strategy to systematically handle temporal sparsity, weak labels, and noise. Evaluated on the MAHNOB-HCI dataset (27 subjects and 527 sessions with a subject-mixed split), our temporal-only model achieved competitive performance in arousal recognition (66.04% accuracy; 61.97% weighted F1-score), surpassing prior CNN-LSTM baselines. However, lower performance in valence (62.26% accuracy) revealed inherent physiological limitations of unimodal temporal cardiovascular analysis. These findings establish clear benchmarks for temporal-only rPPG emotion recognition and underscore the necessity of incorporating spatial or multimodal information to effectively capture nuanced emotional dimensions such as valence, guiding future research directions in affective computing.
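To make the aggregation stage of the proposed pipeline concrete, the sketch below shows how sparse attention over per-window temporal features can be combined with a gated pooling step. This is a minimal illustrative implementation, not the authors' released code: the class name SparseAttentionGatedPooling, the single-head scoring layer, and the feature dimensions are assumptions, and sparsemax (the α = 2 special case of α-Entmax) stands in for the paper's adaptive α-Entmax attention.

```python
# Minimal sketch (assumption, not the authors' code): sparse attention over
# per-window rPPG features followed by gated temporal pooling. Sparsemax is
# used as the alpha = 2 special case of alpha-Entmax; an adaptive-alpha
# entmax implementation could replace it without changing the structure.
import torch
import torch.nn as nn


def sparsemax(scores: torch.Tensor, dim: int = -1) -> torch.Tensor:
    """Sparsemax (alpha = 2 entmax): projects scores onto the probability
    simplex, producing exactly-zero weights for low-scoring time steps."""
    z, _ = torch.sort(scores, dim=dim, descending=True)
    cum = z.cumsum(dim)
    k = torch.arange(1, scores.size(dim) + 1,
                     device=scores.device, dtype=scores.dtype)
    shape = [1] * scores.dim()
    shape[dim] = -1
    k = k.view(shape)
    support = (1 + k * z) > cum                       # sorted entries kept nonzero
    k_max = support.sum(dim=dim, keepdim=True).clamp(min=1)
    tau = (cum.gather(dim, k_max - 1) - 1) / k_max.to(scores.dtype)
    return torch.clamp(scores - tau, min=0.0)


class SparseAttentionGatedPooling(nn.Module):
    """Aggregates a sequence of temporal features (B, T, D) into one vector:
    sparse attention selects salient segments, a sigmoid gate modulates them."""

    def __init__(self, d_model: int):
        super().__init__()
        self.score = nn.Linear(d_model, 1)        # attention scoring head
        self.gate = nn.Linear(d_model, d_model)   # per-feature gate

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = sparsemax(self.score(x).squeeze(-1), dim=-1)   # (B, T), many zeros
        gated = torch.sigmoid(self.gate(x)) * x               # (B, T, D)
        return torch.einsum("bt,btd->bd", attn, gated)        # weighted sum over time


if __name__ == "__main__":
    feats = torch.randn(4, 120, 64)   # e.g., 120 one-second windows, 64-d encoder output
    pooled = SparseAttentionGatedPooling(64)(feats)
    print(pooled.shape)               # torch.Size([4, 64])
```

Because sparsemax assigns exactly zero weight to most time steps, the pooled vector is driven by a small number of segments, which mirrors the paper's motivation that emotional cardiovascular responses are temporally sparse within a session.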