Search Results (799)

Search Parameters:
Keywords = facial information

19 pages, 1430 KB  
Article
AI-Boosted Affective Real-Time Educational Software Adaptation
by Athanasios Nikolaidis, Athanasios Voulgaridis, Charalambos Strouthopoulos and Vassilios Chatzis
Appl. Sci. 2026, 16(9), 4117; https://doi.org/10.3390/app16094117 - 23 Apr 2026
Abstract
Nowadays, educational software across all learning levels is increasingly enhanced with Artificial Intelligence (AI), primarily through content generation or post-session learning analytics. However, most existing systems remain weakly connected to learners’ real-time affective states and rarely exploit emotional information as a direct control signal for instructional adaptation. In this work, we propose a proof-of-concept closed-loop affect-aware educational adaptation framework that integrates real-time facial emotion recognition into a dynamic learning control system. The proposed approach is built upon a dual-model ensemble architecture, combining a transformer-based model (CAGE) and a CNN-based model (DDAMFN++) trained on large-scale in-the-wild datasets. To bridge heterogeneous emotion representations, we introduce a probabilistic fusion strategy that aligns continuous valence–arousal predictions with discrete emotion classification via a Gaussian Mixture Model (GMM), enabling unified emotion inference in real time. Based on the fused emotional state, a temporal aggregation mechanism is applied to capture sustained affective trends rather than transient expressions. These aggregated signals are then mapped to instructional decisions through an emotion-driven adaptive control policy, which adjusts activity difficulty using an Average Emotion Score (AES). This establishes a fully automated closed-loop adaptation cycle, where detected learner affect directly influences the learning environment without requiring explicit user input or post-session questionnaires. The framework is integrated into an open-source educational platform (eduActiv8) to demonstrate feasibility and system-level behavior. Results from alpha-level validation show that the system can continuously monitor learner affect, generate interpretable emotional analytics, and dynamically adjust task difficulty in real time, while reducing user interaction overhead. 
This study contributes a modular architecture for affect-aware educational systems by combining real-time ensemble emotion recognition, probabilistic fusion of heterogeneous outputs, and closed-loop instructional adaptation. The proposed framework provides a foundation for future research in scalable, emotion-driven intelligent tutoring and adaptive learning environments.
(This article belongs to the Special Issue The Age of Transformers: Emerging Trends and Applications)
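The closed-loop adaptation the abstract describes (temporal aggregation of per-frame affect into an Average Emotion Score that drives difficulty) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the class name, window size, thresholds, and the use of a single valence-like score in [-1, 1] are all assumptions for the sketch.

```python
from collections import deque

class AffectAdapter:
    """Hypothetical sketch of AES-driven difficulty control: keep a sliding
    window of per-frame emotion scores and map the windowed average to a
    difficulty action. Thresholds and window size are illustrative only."""

    def __init__(self, window=30, low=-0.3, high=0.3):
        self.scores = deque(maxlen=window)  # sliding window of recent scores
        self.low, self.high = low, high

    def update(self, score):
        """Add one per-frame emotion score in [-1, 1] (negative = frustrated)."""
        self.scores.append(score)

    def decide(self):
        """Map the windowed average (the 'AES') to a difficulty action."""
        if not self.scores:
            return "keep"
        aes = sum(self.scores) / len(self.scores)
        if aes < self.low:
            return "decrease"  # sustained negative affect -> easier tasks
        if aes > self.high:
            return "increase"  # sustained positive affect -> harder tasks
        return "keep"
```

The sliding window is what makes the policy respond to sustained affective trends rather than transient expressions, matching the temporal aggregation idea in the abstract.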
11 pages, 3891 KB  
Proceeding Paper
Nose Detection Based on Quadratic Curve Fitting with Geometric–Photometric–Structural Scoring
by Yu-Chen Chen, Shao-Chi Kao and Jian-Jiun Ding
Eng. Proc. 2026, 134(1), 71; https://doi.org/10.3390/engproc2026134071 - 22 Apr 2026
Abstract
An edge-based and curve-based rule-driven nose detection framework is designed to improve the reliability of face detection. The designed framework combines quadratic curve fitting with a calibrated scoring mechanism that fuses geometric, photometric, and structural information into a unified model. These stages jointly enforce symmetry consistency, reliable tip position, and clear wing boundaries. Candidate face regions are first refined by skin filtering and ellipse validation, from which a mid-lower facial ROI is framed for nasal candidate extraction. We further incorporate eye/mouth hints (EyeMap/MouthMap) to restrict the region of interest (ROI) to the region below the eyes, above the mouth, and between the two eyes. When a mouth is detected, this ROI refinement supersedes the chrominance-red (Cr) channel trimming; otherwise, we fall back to the Cr channel horizontal projection to detect dominant mouth peaks and trim the lower-lip band, thereby suppressing lip interference. A multi-threshold Canny procedure with histogram projection is employed to collect multiple nose rectangles by selecting various vertical and horizontal peaks under three adaptive threshold scales. Within each rectangle, edge contours are quadratically fitted and categorized into U-shape (nasal base), N-shape (nostril rim), and C-shape (nasal wings), enabling rule-based selection of the base, wings, and nostrils. The fused features are then processed by a calibrated geometric–photometric–structural scoring module that uses YCbCr contrasts and red/black penalties to suppress lip and eye confounders. Experiments with diverse faces and lighting conditions show accurate and stable nose localization, with notably reliable wing fitting and nasal base detection, improving the accuracy of face detection.
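The quadratic-fitting step can be illustrated in a few lines: fit y = ax² + bx + c to an edge contour and use the leading coefficient as a crude opening-direction cue. This is a sketch of the general technique only; the paper's actual U/N/C categorization rules are more involved, and the function names and tolerance here are assumptions.

```python
import numpy as np

def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c to contour points."""
    a, b, c = np.polyfit(xs, ys, 2)
    return a, b, c

def opens_upward(xs, ys, tol=1e-6):
    """Crude shape cue: positive leading coefficient means the fitted
    parabola opens upward (e.g. a U-like contour in image coordinates)."""
    a, _, _ = fit_quadratic(xs, ys)
    return bool(a > tol)
```

In practice the sign and magnitude of `a`, together with the vertex location, would feed the kind of rule-based U/N/C classification the abstract describes.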

24 pages, 2667 KB  
Article
Hybrid Deep Neural Network-Based Modeling of Multimodal Emotion Recognition for Novice Drivers
by Jianzhuo Li, Ye Yu, Zhao Dai and Panyu Dai
Future Internet 2026, 18(4), 221; https://doi.org/10.3390/fi18040221 - 21 Apr 2026
Abstract
Driver emotion recognition is a crucial method for reducing traffic accidents. Most existing research focuses on experienced drivers as the primary research subjects, overlooking novice drivers. However, novice drivers can easily lose control of their emotions due to the high mental load during driving, which can lead to serious traffic accidents. Therefore, to recognize the emotions of novice drivers for timely warnings, we propose an emotion recognition model based on multimodal information. The model consists of a facial feature extraction module, an eye movement feature extraction module, and a classifier. The facial feature extraction module uses ViT-B/16 to extract the facial features of novice drivers. The eye movement feature extraction module, a hybrid network combining Bi-LSTM and Transformer layers, extracts eye movement features. Facial features and eye movement features are fused and fed to the classifier, which outputs one of five emotion categories: surprise, anger, calm, happy, and other. The experimental results demonstrate that our model accurately recognizes the emotions of novice drivers with an accuracy of 98.72%, surpassing that of other models.

26 pages, 3904 KB  
Article
AcneFormer: A Lesion-Aware and Noise-Robust CNN–Transformer for Acne Image Classification
by Yongtao Zhou and Kui Zhao
Sensors 2026, 26(8), 2533; https://doi.org/10.3390/s26082533 - 20 Apr 2026
Abstract
Convolutional neural networks (CNNs) have been widely used for acne image classification due to their effectiveness in capturing the local texture of skin lesions. However, the locality of convolution operations limits their ability to model long-range dependencies. Vision Transformer (ViT) methods address this issue to some extent, but their high computational complexity and reliance on large-scale pre-training present challenges. Although hybrid CNN–Transformer architectures alleviate this trade-off, acne images present task-specific challenges, including indistinct lesion boundaries, subtle inter-class variations, and various facial interference factors. In this paper, we propose AcneFormer, a lesion-aware and noise-robust CNN–Transformer architecture for acne image classification. We introduce three modules designed specifically for acne tasks: a Lesion Cue Enhancement (LCE) module to highlight discriminative multi-scale spatial patterns, a Cross-Layer Feature Transmission (CLFT) module to enhance cross-layer information flow in Transformers, and a Differential Semantic Denoising (DSD) module to suppress irrelevant responses during deep feature interaction. Extensive experiments show that AcneFormer outperforms several strong baselines. Ablation and external lesion-annotated analyses further show a consistent pattern: LCE mainly improves lesion-sensitive localization and class-balanced recognition, CLFT expands valid cross-depth lesion evidence, and DSD suppresses off-lesion semantic responses.

18 pages, 892 KB  
Article
Emotional Recognition Under Multimodal Conflict: A Gaze-Based Response Task
by Alessandro De Santis, Giusi Antonia Toto, Martina Rossi, Laura D’Amico and Pierpaolo Limone
Psychol. Int. 2026, 8(2), 26; https://doi.org/10.3390/psycholint8020026 - 20 Apr 2026
Abstract
Emotional recognition relies on the integration of multiple affective cues. In everyday contexts, however, facial expressions, vocal prosody, and semantic content may convey incongruent emotional information, generating emotional conflict and increasing cognitive demands. The present study examined how multimodal emotional conflict affects emotion recognition during video viewing, focusing on short videos in which a single actor simultaneously conveyed incongruent emotional cues across facial, vocal, and semantic channels. Forty-seven undergraduate students completed a gaze-based response task in which, after each short video, they provided a single judgment of the overall emotion conveyed by the stimulus. The videos depicted either congruent or incongruent combinations of semantic content, facial expressions, and vocal prosody across six basic emotions and a neutral condition. Data were analyzed using repeated-measures ANOVAs and generalized linear mixed-effects models. Accuracy was consistently higher for congruent than incongruent stimuli across all domains, indicating a robust emotional interference effect. Critically, the magnitude of this effect differed by domain. Semantic content showed the largest performance reduction under incongruence, followed by facial expression and vocal prosody. Mixed-effects models confirmed these effects while accounting for participant- and item-level variability and revealed a significant Congruency × Domain interaction. In a gaze-based response task requiring a single overall emotion judgment, emotional conflict disrupted recognition in a domain-specific manner, with semantic information being particularly vulnerable to multimodal interference.
(This article belongs to the Section Cognitive Psychology)

24 pages, 13348 KB  
Article
Morphological Convolutional Neural Network for Efficient Facial Expression Recognition
by Robert, Sarifuddin Madenda, Suryadi Harmanto, Michel Paindavoine and Dina Indarti
J. Imaging 2026, 12(4), 171; https://doi.org/10.3390/jimaging12040171 - 15 Apr 2026
Abstract
This study proposes a morphological convolutional neural network (MCNN) architecture that integrates morphological operations with CNN layers for facial expression recognition (FER). Conventional CNN-based FER models primarily rely on appearance features and may be sensitive to illumination and demographic variations. This work investigates whether morphological structural representations provide complementary information to convolutional features. A multi-source and multi-ethnic FER dataset was constructed by combining CK+, JAFFE, KDEF, TFEID, and a newly collected Indonesian Facial Expression dataset, resulting in 3684 images from 326 subjects across seven expression classes. Subject-independent data splitting with 10-fold cross-validation was applied to ensure reliable evaluation. Experimental results show that the proposed MCNN1 model achieves an average accuracy of 88.16%, while the best MCNN2 variant achieves 88.7%, demonstrating competitive performance compared to MobileNetV2 (88.27%), VGG19 (87.58%), and the morphological baseline MNN (50.73%). The proposed model also demonstrates improved computational efficiency, achieving 21% lower inference latency and 64% lower GPU memory usage compared to baseline models. These results indicate that integrating morphological representations into convolutional architectures provides a modest but consistent improvement in FER performance while enhancing generalization and efficiency under heterogeneous data conditions.
(This article belongs to the Section AI in Imaging)

43 pages, 2512 KB  
Article
Computational Mapping of Hedgehog Pathway Kinase Module Predicts Node-Specific Craniofacial Phenotypes
by Kosi Gramatikoff, Miroslav Stoykov, Karl Hörmann and Mario Milkov
Genes 2026, 17(4), 433; https://doi.org/10.3390/genes17040433 - 8 Apr 2026
Abstract
Background/Objectives: Craniofacial malformations such as orofacial clefts affect ~1 in 700 births; 40–60% lack clear genetic etiology, and many exhibit asymmetry and variable expressivity unexplained by classical Sonic Hedgehog (SHH) morphogen gradient models. We investigated whether integrated molecular modules linking morphogen signaling with metabolic stress responses may better account for craniofacial developmental outcomes. Methods: Sequential UniProt gene set integration identified 186 candidate craniofacial regulators. STRING network analysis revealed modular architecture. Molecular docking profiled 17 compounds against SMO, CK1δ, PINK1, and TIE2 (control). Pathway reconstruction integrated the SHH–CK1δ–HIF1A–HEY1–PINK1 axis with in-silico-predicted CK1δ phosphorylation sites on SMO (S615, T593, S751), HIF1A (Ser247), and GLI1/2/3 transcription factors. A developmental decision tree mapped affinity profiles to node-specific phenotype hypotheses. Results: CK1δ and PINK1 emerged as candidate nodes coupling morphogen signaling with mitochondrial quality control. Cross-docking showed preferential binding to developmental kinases (CK1δ: −8.34 kcal/mol; PINK1: −8.80 kcal/mol) versus TIE2 control (−6.76 kcal/mol; p < 0.001). Pathway reconstruction suggested that CK1δ-mediated Ser247 phosphorylation of HIF1A disrupts ARNT dimerization, redirecting HIF1A toward ARNT-independent HEY1 induction and consequent PINK1 suppression. Based on computed profiles, node-specific associations were proposed as computational hypotheses: SMO perturbation → midline defects; CK1δ → facial asymmetry/clefting; PINK1 → mandibular hypoplasia. Multi-target compounds (e.g., purmorphamine, taladegib) generated composite phenotype predictions consistent with clinical complexity. Conclusions: This strictly in silico study identifies candidate integrated morphogenic modules whose multi-node perturbation may underlie anatomically specific craniofacial malformation patterns. 
Node–phenotype associations are prioritized computational hypotheses requiring experimental validation; if confirmed, the framework could inform developmental toxicity assessment, therapeutic design, and reclassification of idiopathic craniofacial anomalies.

13 pages, 2383 KB  
Article
Novel Quantitative Approach for Age Estimation Using Facial Suture Closure and Modified Scoring Systems
by Siriwat Thunyacharoen, Chirapat Inchai and Pasuk Mahakkanukrauh
Appl. Sci. 2026, 16(7), 3591; https://doi.org/10.3390/app16073591 - 7 Apr 2026
Abstract
Background: While human cranial sutures are well-established indicators for age-at-death estimation in forensic anthropology, facial sutures remain an underutilized resource despite their critical role in facial growth and development. Macroscopic examination of craniofacial suture closure patterns reflects physiological aging processes and can provide valuable information at crime scenes. This study aimed to address the gap of knowledge by quantitatively evaluating the efficacy of facial suture closure patterns for age estimation. Methods: A sample consisting of 296 Thai skulls was analyzed to assess facial suture closure based on anatomical morphology. The sutures were evaluated using various established classification systems to determine the most effective method for predicting age ranges. To ensure consistency and reliability, the evaluations were conducted by three independent raters. Results: The assessment demonstrated good Intraclass Correlation (ICC = 0.755, df = 14, p < 0.05). Among the classification methods tested, the Modified Meindl and Lovejoy Scoring System yielded the highest sensitivity, ranging from 90.9% to 100% in males and 75.4% to 96.1% in females. Specifically, the zygomaticomaxillary suture showed the highest sensitivity in males, whereas the frontonasal and sphenozygomatic sutures were the most sensitive indicators in females. Utilizing the total sum score (TSS), the following sex-specific linear regression formulas for age-at-death were generated: (Males: Age-at-death = 1.7625(TSS) − 17.094. Females: Age-at-death = 1.7325(TSS) − 12.865). Conclusions: Facial sutures exhibit distinct, sex-specific closure patterns that serve as robust and reliable indicators for estimating age, with higher sensitivity generally observed in males. The utility of this novel method is heavily dependent on the scoring system employed, highlighting the critical importance of utilizing modified, sex-specific analyses. 
While these population-specific models tailored to the Thai demographic effectively refine age estimation outcomes, integrating this methodology with broader biological profiling remains essential for high-confidence forensic identification.
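The two regression formulas quoted in the abstract are simple enough to wrap in a small helper. The equations are taken directly from the abstract; the function name and the sex-label convention are assumptions of this sketch.

```python
def estimate_age_at_death(tss, sex):
    """Sex-specific age-at-death estimate (years) from the total sum score
    (TSS) of facial suture closure, using the formulas quoted in the
    abstract: males 1.7625*TSS - 17.094, females 1.7325*TSS - 12.865."""
    if sex == "male":
        return 1.7625 * tss - 17.094
    if sex == "female":
        return 1.7325 * tss - 12.865
    raise ValueError("sex must be 'male' or 'female'")
```

For example, a male skull with TSS = 30 gives an estimate of about 35.8 years. Note the abstract's caveat: these coefficients are population-specific (Thai sample) and should not be applied elsewhere without validation.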

26 pages, 2634 KB  
Article
Minimal Angular Facial Representation for Real-Time Emotion Recognition
by Gerardo Garcia-Gil
Appl. Sci. 2026, 16(7), 3572; https://doi.org/10.3390/app16073572 - 6 Apr 2026
Abstract
Real-time facial emotion recognition remains challenging due to the high dimensionality and computational cost of dense facial representations, which limit their applicability in resource-constrained and real-time scenarios. This study proposes a compact, anatomically informed angular facial representation for efficient, interpretable emotion recognition under real-time constraints. Facial landmarks are first extracted using a standard landmark detection framework, from which a reduced facial mesh of 27 anatomically selected points is defined. Internal geometric angles computed from this mesh are analyzed using temporal variability and redundancy criteria, resulting in a minimal set of eight angular descriptors that capture the most expressive facial dynamics while preserving geometric invariance and computational efficiency. The proposed representation is evaluated using multiple supervised machine learning classifiers under two complementary validation strategies: stratified frame-level cross-validation and strict Leave-One-Subject-Out evaluation. Under mixed-subject stratified validation, the best-performing model (MLP) achieved macro-averaged F1-scores exceeding 0.95 and near-unity ROC–AUC values. However, subject-independent evaluation revealed reduced generalization performance (average accuracy ≈55%), highlighting the influence of inter-subject morphological variability embedded in absolute angular descriptors. These findings indicate that a minimal angular geometric encoding provides strong intra-subject discriminative capability while transparently characterizing its cross-subject generalization limits, offering a practical and interpretable alternative for data- and resource-constrained real-time scenarios.
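An "internal geometric angle" of the kind this representation is built from is just the angle at one mesh vertex formed by two neighbours, which is invariant to translation, rotation, and uniform scaling of the landmarks. The specific 27-point mesh and the 8 selected angles are not reproduced here; this is only a sketch of the underlying computation.

```python
import math

def interior_angle(a, b, c):
    """Angle at vertex b (degrees) between rays b->a and b->c, for 2D
    landmark coordinates given as (x, y) pairs."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    # clamp to [-1, 1] to guard against floating-point drift before acos
    cosang = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cosang))
```

A feature vector would then be a small tuple of such angles evaluated at fixed landmark triples on each frame, which is what keeps the representation compact enough for real-time use.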

19 pages, 3413 KB  
Article
AI-Based Angle Map Analysis of Facial Asymmetry in Peripheral Facial Palsy
by Andreas Heinrich, Gerd Fabian Volk, Christian Dobel and Orlando Guntinas-Lichius
Bioengineering 2026, 13(4), 426; https://doi.org/10.3390/bioengineering13040426 - 6 Apr 2026
Abstract
Peripheral facial palsy (PFP) causes pronounced facial asymmetry and functional impairment, highlighting the need for reliable, objective assessment. This study presents a novel, fully automated, reference-free method for quantifying facial symmetry using artificial intelligence (AI)-based facial landmark detection. A total of 405 datasets from 198 PFP patients were analyzed, each including nine standardized facial expressions covering both resting and dynamic movements. AI detected 478 landmarks per image, from which 225 paired landmarks were used to compute local asymmetry angles. Systematic evaluation identified 91 highly informative landmark pairs, primarily around the eyes, nose and mouth, which simplified the analysis and enhanced discriminatory power, while also enabling region-specific assessment of asymmetry. Statistical evaluation included Kruskal–Wallis H-tests across clinical scores and Spearman correlations, showing moderate to strong associations (0.32–0.73, p < 0.001). The fully automated pipeline produced reproducible results and demonstrated robustness to head rotation. Intuitive full-face angle maps allowed direct assessment of asymmetry without a reference image. This AI-driven approach provides a robust, objective, and visually interpretable framework for clinical monitoring, severity classification, and treatment evaluation in PFP, combining quantitative precision with practical applicability.
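One simple way to define a "local asymmetry angle" for a left/right landmark pair, assuming the facial midline has been normalized to the vertical axis, is the deviation of the segment joining the pair from the horizontal: a perfectly symmetric pair gives 0°. This definition is an assumption for illustration; the paper's exact formulation may differ.

```python
import math

def asymmetry_angle(left, right):
    """Deviation (degrees) of the segment joining a paired left/right
    landmark from the horizontal. Assumes a vertical facial midline, so a
    symmetric pair (equal heights) yields 0 degrees."""
    dx = right[0] - left[0]
    dy = right[1] - left[1]
    return abs(math.degrees(math.atan2(dy, dx)))
```

Evaluating such an angle for every paired landmark and colour-coding the values over the face would give the kind of full-face angle map the abstract describes.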

14 pages, 2720 KB  
Article
Social Attention in Electronic Picture Books with Social Scenes for Children with Autism Spectrum Disorder: Insights from Eye-Tracking Studies
by Lintao Yang, Yan Chen, Meifen Chen, Xiaoqun Wang and Leyuan Liu
Behav. Sci. 2026, 16(4), 536; https://doi.org/10.3390/bs16040536 - 2 Apr 2026
Abstract
Electronic picture book free viewing can promote language comprehension ability and social cognitive abilities in children with autism by providing structured visual information. Understanding autism spectrum disorder (ASD) children’s visual attention patterns during electronic picture book free viewing can inform targeted educational research. The attentional preference of children with ASD toward electronic picture books with social scenes remains under-explored. This study aimed to understand the social attention of children with ASD during free viewing of electronic picture books with social scenes. Eye-tracking technology was used to record the visual behavior of 24 children with ASD viewing electronic picture books independently, and 25 typically developing (TD) children were selected as the control group. The results showed that children with ASD allocated less fixation time to social information in electronic picture books than TD children, with a clear difference in the fixation time spent on facial regions. Children with ASD neither displayed the same attention to happy facial expressions in electronic picture books as TD children nor did they show significant differences in attention to different emotions. These findings contribute to our understanding of visual attention patterns in children with ASD during electronic picture book free viewing and provide empirical evidence for future research on optimizing visual viewing guidance for children with ASD.

21 pages, 13964 KB  
Article
Towards Generalizable Deepfake Detection via Facial Landmark-Guided Convolution and Local Structure Awareness
by Hao Chen, Zhengxu Zhang, Qin Li and Chunhui Feng
Algorithms 2026, 19(4), 270; https://doi.org/10.3390/a19040270 - 1 Apr 2026
Abstract
As deepfakes become increasingly realistic, there is a growing need for robust and highly accurate facial forgery detection algorithms. Existing studies show that global feature modeling approaches (Transformer, VMamba) are effective in capturing long-range dependencies, yet they often lack sufficient sensitivity to localized facial tampering artifacts. Meanwhile, traditional convolutional methods excel at extracting local image features but struggle to incorporate prior knowledge about facial anatomy, resulting in limited representational capability. To address these limitations, this paper proposes LGMamba, a novel detection framework that combines global modeling with facial guidance focused on key facial components and the fine-grained detail regions commonly manipulated in deepfakes. First, we introduce an innovative Landmark-Guided Convolution (LGConv), which adaptively adjusts convolutional sampling positions using facial landmark information. This allows the model to attend to forgery-prone facial regions, such as the eyes and mouth. Second, we design a parallel Facial Structure Awareness Block (FSAB) to operate alongside the VMamba-based visual State-Space Model. Equipped with a multi-stage residual design and a CBAM attention mechanism, FSAB enhances the model’s sensitivity to subtle facial artifacts, enabling joint exploitation of global semantic consistency and fine-grained forgery cues within a unified architecture. The proposed LGMamba achieves superior performance compared to existing mainstream approaches. In cross-dataset evaluations, it attains AUC scores of 92.34% on CD1 and 96.01% on CD2, outperforming all compared methods.

20 pages, 34702 KB  
Article
rePPG: Relighting Photoplethysmography Signal to Video
by Seunghyun Kim, Yeongje Park, Byeongseon An and Eui Chul Lee
Biomimetics 2026, 11(4), 230; https://doi.org/10.3390/biomimetics11040230 - 1 Apr 2026
Abstract
Remote photoplethysmography (rPPG) extracts physiological signals from facial videos by analyzing subtle skin color variations caused by blood flow. While this technology enables contactless health monitoring, it also raises privacy concerns because facial videos reveal both identity and sensitive biometric information. Existing privacy-preserving techniques, such as blurring or pixelation, degrade visual quality and are unsuitable for practical rPPG applications. This paper presents rePPG, a framework that inserts a desired rPPG signal into facial videos while preserving the original facial appearance. The proposed method disentangles facial appearance and physiological features, enabling replacement of the physiological signal without altering facial identity or visual quality. Skin segmentation restricts modifications to skin regions, and a cycle-consistency mechanism ensures that the injected rPPG signal can be reliably recovered from the generated video. Importantly, the extracted rPPG signals are evaluated against the injected target physiological signals rather than the subject’s original physiological state, ensuring that the evaluation measures signal rewriting accuracy. Experiments on the PURE and UBFC datasets show that rePPG successfully embeds target PPG signals, achieving 1.10 BPM MAE and 95.00% PTE6 on PURE while preserving visual quality (PSNR 24.61 dB, SSIM 0.638). Heart rate metrics are computed using a 5-second temporal window to ensure a consistent evaluation protocol.
(This article belongs to the Special Issue Bio-Inspired Signal Processing on Image and Audio Data)

21 pages, 18953 KB  
Article
Evaluating AI-Based Image Inpainting Techniques for Facial Components Restoration Using Semantic Masks
by Hussein Sharadga, Abdullah Hayajneh and Erchin Serpedin
AI 2026, 7(4), 119; https://doi.org/10.3390/ai7040119 - 30 Mar 2026
Abstract
This paper presents a comparative analysis of advanced AI-based techniques for human face inpainting using semantic masks that fully occlude targeted facial components. The primary objective is to evaluate the ability of image inpainting methods to accurately restore semantically meaningful facial features. Our results show that existing inpainting models face significant challenges when semantic masks completely obscure the underlying facial structures. In contrast to random masks, which leave partial visual cues, semantic masks remove all structural information, making reconstruction substantially more difficult. We assess the performance of generative adversarial networks (GANs), transformer-based models, and diffusion models in restoring fully occluded facial components. To address these challenges, we explore three retraining strategies: using semantic masks, using random masks, and a hybrid approach combining both. While the hybrid strategy leverages the complementary strengths of each mask type and improves contextual understanding, fully accurate reconstruction remains challenging. These findings demonstrate that inpainting under fully occluding semantic masks is a critical yet underexplored area, offering opportunities for developing new AI architectures and strategies for advanced facial reconstruction.
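The semantic-vs-random mask distinction the abstract draws can be made concrete with a minimal sketch (not the paper's pipeline; names are hypothetical): a semantic mask zeroes an entire component region, while a random mask scatters holes that leave partial cues.

```python
import numpy as np

def apply_semantic_mask(image, part_mask):
    """Fully occlude a facial component: every pixel of the part
    is removed, leaving no structural cue inside the region."""
    out = image.copy()
    out[part_mask] = 0
    return out

def apply_random_mask(image, ratio=0.5, rng=None):
    """Occlude a random subset of pixels; partial cues survive."""
    rng = np.random.default_rng(rng)
    out = image.copy()
    holes = rng.random(image.shape[:2]) < ratio
    out[holes] = 0
    return out
```

Retraining on both mask types, as in the hybrid strategy above, exposes the model to total occlusion and to scattered partial occlusion.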

16 pages, 3976 KB  
Article
Spiking Feature-Driven Event Simulation with Movement-Aware Polarity Integration
by Jiwoong Oh, Byeongjun Kang, Hyungsik Shin and Dongwoo Kang
Electronics 2026, 15(7), 1420; https://doi.org/10.3390/electronics15071420 - 29 Mar 2026
Abstract
Event-based face detection has attracted significant interest due to the unique advantages of event cameras, including high temporal resolution, high dynamic range, and low power consumption. However, the lack of annotated public datasets remains a major challenge for training effective event-based face detection models. In this paper, we propose a spiking feature-driven synthetic event generation framework that utilizes a spiking neural network (SNN) in conjunction with a pretrained convolutional backbone to generate synthetic event representations from a single RGB image. To incorporate motion-induced ON/OFF polarity information, we introduce a movement-aware polarity integration (MPI) module that assumes four directional facial movements. An event-similarity score is further employed to select representations most consistent with real event data for training. Unlike conventional approaches relying on video-based simulators, our method enables efficient synthetic event dataset construction without requiring video inputs or additional simulation training. Experimental results on the N-Caltech101 dataset demonstrate a face detection accuracy of 99.91%, outperforming existing event-based face detection methods. Full article
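The motion-induced ON/OFF polarity idea can be illustrated with a simple sketch: shift a single image in four directions and fire a positive or negative event wherever the log-intensity change crosses a contrast threshold. This is a generic illustration of directional polarity, not the authors' SNN-based MPI module; names and thresholds are hypothetical.

```python
import numpy as np

def directional_events(image, shift=1, threshold=0.1):
    """For each of four directional shifts of a single image, fire
    ON (+1) / OFF (-1) events where the log-intensity change exceeds
    a contrast threshold, mimicking motion-induced polarity."""
    log_img = np.log1p(image.astype(np.float64))
    events = {}
    for name, (dy, dx) in {"up": (-1, 0), "down": (1, 0),
                           "left": (0, -1), "right": (0, 1)}.items():
        moved = np.roll(log_img, (dy * shift, dx * shift), axis=(0, 1))
        diff = moved - log_img
        polarity = np.zeros_like(diff, dtype=np.int8)
        polarity[diff > threshold] = 1    # brightness increase: ON event
        polarity[diff < -threshold] = -1  # brightness decrease: OFF event
        events[name] = polarity
    return events
```

A selection step analogous to the event-similarity score could then keep only the directional maps that best match real event statistics.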
(This article belongs to the Special Issue Edge-Intelligent Sustainable Cyber-Physical Systems)
