Search Results (1,282)

Search Parameters:
Keywords = facial feature

18 pages, 1589 KiB  
Article
EEG-Based Attention Classification for Enhanced Learning Experience
by Madiha Khalid Syed, Hong Wang, Awais Ahmad Siddiqi, Shahnawaz Qureshi and Mohamed Amin Gouda
Appl. Sci. 2025, 15(15), 8668; https://doi.org/10.3390/app15158668 (registering DOI) - 5 Aug 2025
Abstract
This paper presents a novel EEG-based learning system designed to enhance the efficiency and effectiveness of studying by dynamically adjusting the difficulty level of learning materials based on real-time attention levels. In the training phase, EEG signals corresponding to high and low concentration levels are recorded while participants engage in quizzes to learn and memorize Chinese characters. The attention levels are determined from performance metrics derived from the quiz results. Following extensive preprocessing, the EEG data undergo several feature extraction steps: removal of artifacts due to eye blinks and facial movements, segregation of waves by frequency, similarity indexing with respect to delay, binary thresholding, and principal component analysis (PCA). These extracted features are then fed into a k-NN classifier, which accurately distinguishes between high- and low-attention brain wave patterns, with the labels derived from quiz performance indicating high or low attention. During the implementation phase, the system continuously monitors the user’s EEG signals while studying. When low attention levels are detected, the system increases the repetition frequency and reduces the difficulty of the flashcards to refocus the user’s attention. Conversely, when high concentration levels are identified, the system escalates the difficulty level of the flashcards to maximize the learning challenge. This adaptive approach ensures a more effective learning experience by maintaining optimal cognitive engagement, resulting in improved learning rates, reduced stress, and increased overall learning efficiency. Our results indicate that this EEG-based adaptive learning system holds significant potential for personalized education, fostering better retention and understanding of Chinese characters. Full article
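As an illustration of the classification step, here is a minimal sketch, assuming scikit-learn and synthetic feature vectors rather than the authors' actual EEG pipeline: PCA-reduced features are fed to a k-NN classifier with high/low attention labels derived from quiz performance.

```python
# Minimal sketch: classifying high/low-attention EEG feature vectors with k-NN.
# Artifact removal, band separation, etc. are assumed to have already produced
# one fixed-length vector per trial; the data below are synthetic placeholders.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))        # 200 trials x 64 extracted EEG features (synthetic)
y = rng.integers(0, 2, size=200)      # 1 = high attention, 0 = low attention (from quiz scores)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

pca = PCA(n_components=16).fit(X_train)       # dimensionality reduction step
knn = KNeighborsClassifier(n_neighbors=5)     # k-NN classifier on reduced features
knn.fit(pca.transform(X_train), y_train)

pred = knn.predict(pca.transform(X_test))
print("accuracy:", accuracy_score(y_test, pred))
```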
(This article belongs to the Special Issue EEG Horizons: Exploring Neural Dynamics and Neurocognitive Processes)

26 pages, 2625 KiB  
Article
Evaluating the Efficacy of the More Young HIFU Device for Facial Skin Improvement: A Comparative Study with 7D Ultrasound
by Ihab Adib and Youjun Liu
Appl. Sci. 2025, 15(15), 8485; https://doi.org/10.3390/app15158485 (registering DOI) - 31 Jul 2025
Viewed by 410
Abstract
High-Intensity Focused Ultrasound (HIFU) is a non-invasive technology widely used in aesthetic dermatology for skin tightening and facial rejuvenation. This study aimed to evaluate the safety and efficacy of a modified HIFU device, More Young, compared to the standard 7D HIFU system through a randomized, single-blinded clinical trial. The More Young device features enhanced focal depth precision and energy delivery algorithms, including nine pre-programmed stabilization checkpoints to minimize treatment risks. A total of 100 participants with facial wrinkles and skin laxity were randomly assigned to receive either More Young or 7D HIFU treatment. Skin improvements were assessed at baseline and one to six months post-treatment using the VISIA® Skin Analysis System (7th Generation), focusing on eight key parameters. Patient satisfaction was evaluated through the Global Aesthetic Improvement Scale (GAIS). Data were analyzed using paired and independent t-tests, with effect sizes measured via Cohen’s d. Both groups showed significant post-treatment improvements; however, the More Young group demonstrated superior outcomes in wrinkle reduction, skin tightening, and texture enhancement, along with higher satisfaction and fewer adverse effects. No significant differences were observed in five of the eight skin parameters. Limitations include the absence of a placebo group, limited sample diversity, and short follow-up duration. Further studies are needed to validate long-term outcomes and assess performance across varied demographics and skin types. Full article
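The statistical analysis named in the abstract (paired t-tests with Cohen's d effect sizes) can be sketched as follows, assuming SciPy and synthetic before/after scores in place of the trial's VISIA measurements.

```python
# Minimal sketch of the reported statistics: a paired t-test on baseline vs.
# post-treatment scores and Cohen's d for the effect size. Values are synthetic.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
baseline = rng.normal(50, 10, size=50)            # e.g., a skin-parameter score before treatment
post = baseline - rng.normal(5, 3, size=50)       # improvement after treatment (synthetic)

t_stat, p_value = stats.ttest_rel(baseline, post) # paired t-test

diff = baseline - post
cohens_d = diff.mean() / diff.std(ddof=1)         # Cohen's d for paired samples
print(f"t = {t_stat:.2f}, p = {p_value:.4f}, d = {cohens_d:.2f}")
```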
(This article belongs to the Section Biomedical Engineering)

27 pages, 1869 KiB  
Review
Understanding the Molecular Basis of Miller–Dieker Syndrome
by Gowthami Mahendran and Jessica A. Brown
Int. J. Mol. Sci. 2025, 26(15), 7375; https://doi.org/10.3390/ijms26157375 - 30 Jul 2025
Viewed by 388
Abstract
Miller–Dieker Syndrome (MDS) is a rare neurodevelopmental disorder caused by a heterozygous deletion of approximately 26 genes within the MDS locus of human chromosome 17. MDS, which affects 1 in 100,000 babies, can lead to a range of phenotypes, including lissencephaly, severe neurological defects, distinctive facial abnormalities, cognitive impairments, seizures, growth retardation, and congenital heart and liver abnormalities. One hallmark feature of MDS is an unusually smooth brain surface due to abnormal neuronal migration during early brain development. Several genes located within the MDS locus have been implicated in the pathogenesis of MDS, including PAFAH1B1, YWHAE, CRK, and METTL16. These genes play a role in the molecular and cellular pathways that are vital for neuronal migration, the proper development of the cerebral cortex, and protein translation in MDS. Improved model systems, such as MDS patient-derived organoids, and multi-omics analyses indicate that WNT/β-catenin signaling, calcium signaling, S-adenosyl methionine (SAM) homeostasis, mammalian target of rapamycin (mTOR) signaling, Janus kinase/signal transducer and activator of transcription (JAK/STAT) signaling, and others are dysfunctional in MDS. This review of MDS integrates details at the clinical level alongside newly emerging details at the molecular and cellular levels, which may inform the development of novel therapeutic strategies for MDS. Full article
(This article belongs to the Special Issue Rare Diseases and Neuroscience)

35 pages, 4940 KiB  
Article
A Novel Lightweight Facial Expression Recognition Network Based on Deep Shallow Network Fusion and Attention Mechanism
by Qiaohe Yang, Yueshun He, Hongmao Chen, Youyong Wu and Zhihua Rao
Algorithms 2025, 18(8), 473; https://doi.org/10.3390/a18080473 - 30 Jul 2025
Viewed by 296
Abstract
Facial expression recognition (FER) is a critical research direction in artificial intelligence that is widely used in intelligent interaction, medical diagnosis, security monitoring, and other domains. These applications highlight its considerable practical value and social significance. Facial expression recognition models often need to run efficiently on mobile or edge devices, so research on lightweight facial expression recognition is particularly important. However, the feature extraction and classification methods of most current lightweight convolutional neural network expression recognition algorithms are not specifically optimized for the characteristics of facial expression images and fail to make full use of the feature information these images contain. To address the lack of facial expression recognition models that are both lightweight and effectively optimized for expression-specific feature extraction, this study proposes a novel network design tailored to the characteristics of facial expressions. In this paper, we take the backbone architecture of the MobileNet V2 network as a reference and design LightExNet, a lightweight convolutional neural network based on the fusion of deep and shallow layers, an attention mechanism, and a joint loss function, adapted to the characteristics of facial expression features. In the LightExNet architecture, firstly, deep and shallow features are fused in order to fully extract the shallow features of the original image, reduce the loss of information, alleviate the vanishing-gradient problem as the number of convolutional layers increases, and achieve multi-scale feature fusion. The MobileNet V2 architecture has also been streamlined to seamlessly integrate the deep and shallow networks. Secondly, drawing on the characteristics of facial expression features, a new channel and spatial attention mechanism is proposed to capture and encode as much feature information as possible from the different expression regions, thereby effectively improving the accuracy of expression recognition. Finally, an improved center loss function is superimposed to further improve the accuracy of expression classification, and corresponding measures are taken to significantly reduce the computational cost of the joint loss function. LightExNet is tested on three mainstream facial expression datasets, Fer2013, CK+, and RAF-DB, and the experimental results show that it has 3.27 M parameters and 298.27 M FLOPs, with accuracies of 69.17%, 97.37%, and 85.97% on the three datasets, respectively. The overall performance of LightExNet is better than that of current mainstream lightweight expression recognition algorithms such as MobileNet V2, IE-DBN, Self-Cure Net, Improved MobileViT, MFN, Ada-CM, and Parallel CNN (Convolutional Neural Network). Experimental results confirm that LightExNet effectively improves recognition accuracy and computational efficiency while reducing energy consumption and enhancing deployment flexibility. These advantages underscore its strong potential for real-world applications in lightweight facial expression recognition. Full article
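A minimal sketch of the joint-loss idea, assuming PyTorch; the CenterLoss class, the seven-class setup, and the 0.01 weighting are illustrative placeholders and do not reproduce LightExNet's improved formulation.

```python
# Minimal sketch: cross-entropy plus a center loss on the embedding, combined
# into one joint training objective.
import torch
import torch.nn as nn

class CenterLoss(nn.Module):
    """Pulls each sample's embedding toward a learnable per-class center."""
    def __init__(self, num_classes: int, feat_dim: int):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        return ((features - self.centers[labels]) ** 2).sum(dim=1).mean()

num_classes, feat_dim, batch = 7, 128, 32        # 7 expression classes (assumption)
features = torch.randn(batch, feat_dim)          # embeddings from the backbone
logits = torch.randn(batch, num_classes)         # classifier outputs
labels = torch.randint(0, num_classes, (batch,))

ce = nn.CrossEntropyLoss()(logits, labels)
center = CenterLoss(num_classes, feat_dim)(features, labels)
loss = ce + 0.01 * center                        # the weighting factor is a placeholder
print(loss.item())
```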

17 pages, 597 KiB  
Review
Dry Needling for Tension-Type Headache: A Scoping Review on Intervention Procedures, Muscle Targets, and Outcomes
by Ana Bravo-Vazquez, Ernesto Anarte-Lazo, Cleofas Rodriguez-Blanco and Carlos Bernal-Utrera
J. Clin. Med. 2025, 14(15), 5320; https://doi.org/10.3390/jcm14155320 - 28 Jul 2025
Viewed by 265
Abstract
Background/Objectives: Tension-type headache (TTH) is the most prevalent form of primary headache. The etiology of TTH is not yet fully understood, although it is associated with the presence of myofascial trigger points (MTPs) in cervical and facial muscles. Dry needling (DN) therapy has emerged as an effective and safe non-pharmacological option for pain relief, but there is a lack of systematic reviews focused on its specific characteristics in TTH. The aim of this paper is to examine the characteristics and methodologies of DN in managing TTH. Methods: A scoping review was conducted with inclusion criteria considering studies that evaluated DN interventions in adults with TTH, reporting target muscles, diagnostic criteria, and technical features. The search was performed using PubMed, Embase, Scopus, and the Web of Science, resulting in the selection of seven studies after a rigorous filtering and evaluation process. Results: The included studies, primarily randomized controlled trials, involved a total of 309 participants. The most frequently treated muscles were the temporalis and trapezius. Identification of MTPs was mainly performed through manual palpation, although diagnostic criteria varied. DN interventions differed in technique. All included studies reported favorable outcomes, with improvements in headache symptoms. No serious adverse effects were reported, suggesting that the technique is safe. However, heterogeneity in protocols and diagnostic criteria limits the comparability of results. Conclusions: The evidence supports the use of DN in key muscles such as the temporalis and trapezius for managing TTH, although the diversity in methodologies and diagnostic criteria highlights the need for standardization. The safety profile of the method is favorable, but further research is necessary to define optimal protocols and improve reproducibility. Implementing objective diagnostic criteria and uniform protocols will facilitate advances in clinical practice and future research, ultimately optimizing outcomes for patients with TTH. Full article
(This article belongs to the Section Clinical Neurology)

24 pages, 10460 KiB  
Article
WGGLFA: Wavelet-Guided Global–Local Feature Aggregation Network for Facial Expression Recognition
by Kaile Dong, Xi Li, Cong Zhang, Zhenhua Xiao and Runpu Nie
Biomimetics 2025, 10(8), 495; https://doi.org/10.3390/biomimetics10080495 - 27 Jul 2025
Viewed by 313
Abstract
Facial expression plays an important role in human–computer interaction and affective computing. However, existing expression recognition methods cannot effectively capture the multi-scale structural details contained in facial expressions, leading to a decline in recognition accuracy. Inspired by the multi-scale processing mechanism of the biological visual system, this paper proposes a wavelet-guided global–local feature aggregation network (WGGLFA) for facial expression recognition (FER). Our WGGLFA network consists of three main modules: the scale-aware expansion (SAE) module, which combines dilated convolution and wavelet transform to capture multi-scale contextual features; the structured local feature aggregation (SLFA) module, which extracts structured local features based on facial keypoints; and the expression-guided region refinement (ExGR) module, which enhances features from high-response expression areas to improve the collaborative modeling of local details and key expression regions. All three modules exploit the spatial-frequency locality of the wavelet transform to separate high- and low-frequency features, thereby enhancing fine-grained expression representation under frequency-domain guidance. Experimental results show that our WGGLFA achieves accuracies of 90.32%, 91.24%, and 71.90% on the RAF-DB, FERPlus, and FED-RO datasets, respectively, demonstrating that WGGLFA is effective and offers better robustness and generalization than state-of-the-art (SOTA) expression recognition methods. Full article
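A minimal sketch of the wavelet-based high-/low-frequency separation the modules rely on, assuming PyWavelets and a random array standing in for a face crop.

```python
# Minimal sketch: a single-level 2D discrete wavelet transform splits an image
# into an approximation (low-frequency) band and detail (high-frequency) bands.
import numpy as np
import pywt

face = np.random.rand(112, 112)                     # stand-in for a grayscale face crop
low, (horiz, vert, diag) = pywt.dwt2(face, "haar")  # LL band + (LH, HL, HH) detail bands

print("low-frequency band:", low.shape)             # coarse structure (overall face shape)
print("high-frequency bands:", horiz.shape)         # fine detail (edges, wrinkles)
```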

20 pages, 3386 KiB  
Article
Design of Realistic and Artistically Expressive 3D Facial Models for Film AIGC: A Cross-Modal Framework Integrating Audience Perception Evaluation
by Yihuan Tian, Xinyang Li, Zuling Cheng, Yang Huang and Tao Yu
Sensors 2025, 25(15), 4646; https://doi.org/10.3390/s25154646 - 26 Jul 2025
Viewed by 378
Abstract
The rise of virtual production has created an urgent need for efficient, high-fidelity 3D face generation schemes for cinema and immersive media, but existing methods are often limited by lighting–geometry coupling, multi-view dependency, and insufficient artistic quality. To address these limitations, this study proposes a cross-modal 3D face generation framework based on single-view semantic masks. It utilizes a Swin Transformer for multi-level feature extraction and combines it with NeRF for illumination-decoupled rendering. We use physical rendering equations to explicitly separate surface reflectance from ambient lighting, achieving robust adaptation to complex lighting variations. In addition, to address geometric errors across illumination scenes, we construct a priori geometric constraint networks by mapping 2D facial features into the 3D parameter space as regularization terms with the help of semantic masks. On the CelebAMask-HQ dataset, this method achieves a leading score of SSIM = 0.892 (a 37.6% improvement over the baseline) with FID = 40.6. The generated faces excel in symmetry and detail fidelity, with realism and aesthetic scores of 8/10 and 7/10, respectively, in a perceptual evaluation with 1000 viewers. By combining physical-level illumination decoupling with semantic geometric priors, this paper establishes a quantifiable feedback mechanism between objective metrics and human aesthetic evaluation, providing a new paradigm for the aesthetic quality assessment of AI-generated content. Full article
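A minimal sketch of the SSIM metric used to score the generated faces, assuming scikit-image and random arrays rather than renders from the proposed framework.

```python
# Minimal sketch: structural similarity (SSIM) between a reference image and a
# generated image, computed on placeholder arrays.
import numpy as np
from skimage.metrics import structural_similarity as ssim

rng = np.random.default_rng(2)
reference = rng.random((256, 256))                  # stand-in for a ground-truth face image
generated = np.clip(reference + 0.05 * rng.standard_normal((256, 256)), 0, 1)

score = ssim(reference, generated, data_range=1.0)
print(f"SSIM = {score:.3f}")                        # the paper reports 0.892 on CelebAMask-HQ
```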
(This article belongs to the Special Issue Convolutional Neural Network Technology for 3D Imaging and Sensing)

20 pages, 4162 KiB  
Article
Discovering the Emotions of Frustration and Confidence During the Application of Cognitive Tests in Mexican University Students
by Marco A. Moreno-Armendáriz, Jesús Mercado-Ríos, José E. Valdez-Rodríguez, Rolando Quintero and Victor H. Ponce-Ponce
Big Data Cogn. Comput. 2025, 9(8), 195; https://doi.org/10.3390/bdcc9080195 - 24 Jul 2025
Viewed by 349
Abstract
Emotion detection using computer vision has advanced significantly in recent years, achieving remarkable performance that, in some cases, surpasses that of humans. Convolutional neural networks (CNNs) excel in this task by capturing facial features that allow for effective emotion classification. However, most research focuses on basic emotions, such as happiness, anger, or sadness, neglecting more complex emotions, like frustration. People set expectations or goals; when these are not met, frustration arises, generating reactions such as annoyance, anger, and disappointment, which can harm confidence and motivation. These aspects make frustration especially relevant in mental health and educational contexts, where detecting it could help mitigate its adverse effects. In this research, we developed a CNN-based approach to detect frustration through facial expressions. The scarcity of specific datasets for this task led us to create an experimental protocol to generate our own dataset. This classification task presents a high degree of difficulty due to the variability in facial expressions among different participants when feeling frustrated. Despite this, our new model achieved an F1-score of 0.8080, providing an adequate baseline model. Full article
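A minimal sketch of the reported metric, assuming scikit-learn and toy labels: the F1-score of a binary frustrated / not-frustrated classifier.

```python
# Minimal sketch: F1-score of a binary frustration classifier on toy predictions.
from sklearn.metrics import f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]   # 1 = frustrated, 0 = not frustrated (toy labels)
y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]   # classifier outputs

print("F1 =", f1_score(y_true, y_pred))   # the paper's model reaches an F1 of 0.8080
```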
(This article belongs to the Special Issue Application of Deep Neural Networks)

22 pages, 2952 KiB  
Article
Raw-Data Driven Functional Data Analysis with Multi-Adaptive Functional Neural Networks for Ergonomic Risk Classification Using Facial and Bio-Signal Time-Series Data
by Suyeon Kim, Afrooz Shakeri, Seyed Shayan Darabi, Eunsik Kim and Kyongwon Kim
Sensors 2025, 25(15), 4566; https://doi.org/10.3390/s25154566 - 23 Jul 2025
Viewed by 234
Abstract
Ergonomic risk classification during manual lifting tasks is crucial for the prevention of workplace injuries. This study addresses the challenge of classifying lifting task risk levels (low, medium, and high risk, labeled as 0, 1, and 2) using multi-modal time-series data comprising raw facial landmarks and bio-signals (electrocardiography [ECG] and electrodermal activity [EDA]). Classifying such data presents inherent challenges due to multi-source information, temporal dynamics, and class imbalance. To overcome these challenges, this paper proposes a Multi-Adaptive Functional Neural Network (Multi-AdaFNN), a novel method that integrates functional data analysis with deep learning techniques. The proposed model introduces a novel adaptive basis layer composed of micro-networks tailored to each individual time-series feature, enabling end-to-end learning of discriminative temporal patterns directly from raw data. The Multi-AdaFNN approach was evaluated across five distinct dataset configurations: (1) facial landmarks only, (2) bio-signals only, (3) full fusion of all available features, (4) a reduced-dimensionality set of 12 selected facial landmark trajectories, and (5) the same reduced set combined with bio-signals. Performance was rigorously assessed using 100 independent stratified splits (70% training and 30% testing) and optimized via a weighted cross-entropy loss function to manage class imbalance effectively. The results demonstrated that the integrated approach, fusing facial landmarks and bio-signals, achieved the highest classification accuracy and robustness. Furthermore, the adaptive basis functions revealed specific phases within lifting tasks critical for risk prediction. These findings underscore the efficacy and transparency of the Multi-AdaFNN framework for multi-modal ergonomic risk assessment, highlighting its potential for real-time monitoring and proactive injury prevention in industrial environments. Full article
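Two ingredients named in the abstract, stratified splitting and class-weighted cross-entropy for the three risk levels, can be sketched as follows, assuming scikit-learn and PyTorch with random placeholder data; this does not reproduce the Multi-AdaFNN adaptive basis layer.

```python
# Minimal sketch: a stratified 70/30 split plus an inverse-frequency-weighted
# cross-entropy loss for imbalanced risk labels (0, 1, 2).
import numpy as np
import torch
import torch.nn as nn
from sklearn.model_selection import train_test_split

X = np.random.rand(300, 120)                                   # 300 lifts x 120 time points (synthetic)
y = np.random.choice([0, 1, 2], size=300, p=[0.6, 0.3, 0.1])   # imbalanced risk labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

counts = np.bincount(y_tr, minlength=3)
weights = torch.tensor(counts.sum() / (3 * counts), dtype=torch.float32)  # inverse-frequency weights
criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(len(y_tr), 3)                             # stand-in for model outputs
loss = criterion(logits, torch.tensor(y_tr, dtype=torch.long))
print(loss.item())
```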
(This article belongs to the Special Issue (Bio)sensors for Physiological Monitoring)

18 pages, 3102 KiB  
Article
A Multicomponent Face Verification and Identification System
by Athanasios Douklias, Ioannis Zorzos, Evangelos Maltezos, Vasilis Nousis, Spyridon Nektarios Bolierakis, Lazaros Karagiannidis, Eleftherios Ouzounoglou and Angelos Amditis
Appl. Sci. 2025, 15(15), 8161; https://doi.org/10.3390/app15158161 - 22 Jul 2025
Viewed by 237
Abstract
Face recognition technology is a biometric technology based on the identification or verification of facial features. Automatic face recognition is an active research field in the context of computer vision and artificial intelligence (AI) that is fundamental for a variety of real-time applications. In this research, the design and implementation of a face verification and identification system with a flexible, modular, secure, and scalable architecture are proposed. The proposed system incorporates several components of various types: (i) portable capabilities (a mobile application and mixed reality [MR] glasses), (ii) enhanced monitoring and visualization via a user-friendly Web-based user interface (UI), and (iii) information sharing via middleware with other external systems. The experiments showed that these interconnected and complementary system components deliver robust, real-time results for face identification and verification. Furthermore, to identify a model of high accuracy, robustness, and speed for face identification and verification tasks, a comprehensive evaluation of multiple pre-trained face recognition models (FaceNet, ArcFace, Dlib, and MobileNetV2) was performed on a curated version of the ID vs. Spot dataset. Among the models evaluated, FaceNet emerged as the preferable choice for real-time tasks due to its balance between accuracy and inference speed for both face identification and verification, achieving an AUC of 0.99, Rank-1 accuracy of 91.8%, Rank-5 accuracy of 95.8%, an FNR of 2%, an FAR of 0.1%, an overall accuracy of 98.6%, and an inference speed of 52 ms. Full article
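A minimal sketch of embedding-based face verification of the kind evaluated here: two images are mapped to feature vectors and accepted as the same identity when their cosine similarity exceeds a threshold. The embed function below is a random placeholder for a pre-trained model such as FaceNet, and the 0.6 threshold is an assumption.

```python
# Minimal sketch: face verification by thresholding the cosine similarity of
# two face embeddings (placeholder embedding function, random inputs).
import numpy as np

def embed(image: np.ndarray) -> np.ndarray:
    """Placeholder for a pre-trained face embedding network."""
    rng = np.random.default_rng(abs(hash(image.tobytes())) % (2**32))
    v = rng.normal(size=512)
    return v / np.linalg.norm(v)

def verify(img_a: np.ndarray, img_b: np.ndarray, threshold: float = 0.6) -> bool:
    similarity = float(embed(img_a) @ embed(img_b))   # cosine similarity of unit vectors
    return similarity >= threshold                    # threshold trades FAR against FNR

id_photo = np.random.rand(160, 160, 3)
spot_photo = np.random.rand(160, 160, 3)
print(verify(id_photo, spot_photo))
```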
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)

26 pages, 829 KiB  
Article
Enhanced Face Recognition in Crowded Environments with 2D/3D Features and Parallel Hybrid CNN-RNN Architecture with Stacked Auto-Encoder
by Samir Elloumi, Sahbi Bahroun, Sadok Ben Yahia and Mourad Kaddes
Big Data Cogn. Comput. 2025, 9(8), 191; https://doi.org/10.3390/bdcc9080191 - 22 Jul 2025
Viewed by 385
Abstract
Face recognition (FR) in unconstrained conditions remains an open research topic and an ongoing challenge. Facial images exhibit diverse expressions, occlusions, variations in illumination, and heterogeneous backgrounds. This work aims to produce an accurate and robust system for enhanced security and surveillance. A parallel hybrid deep learning model for feature extraction and classification is proposed. An ensemble of three parallel extraction-layer models learns the most representative features using CNNs and RNNs. 2D LBP and 3D Mesh LBP features are computed on face images and fed as input to two RNNs. A stacked autoencoder (SAE) merges the feature vectors extracted from the three CNN-RNN parallel layers. We tested the designed 2D/3D CNN-RNN framework on four standard datasets and achieved an accuracy of 98.9%. The hybrid deep learning model significantly improves FR compared with similar state-of-the-art methods. The proposed model was also tested on an unconstrained-conditions human crowd dataset, and the results were very promising, with an accuracy of 95%. Furthermore, our model shows an 11.5% improvement over similar hybrid CNN-RNN architectures, proving its robustness in complex environments where the face can undergo different transformations. Full article
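A minimal sketch of the 2D LBP descriptor used as one of the input streams, assuming scikit-image and a random image standing in for a face crop; the histogram is the kind of feature vector passed to an RNN branch.

```python
# Minimal sketch: a uniform local binary pattern (LBP) histogram as a texture
# descriptor of a grayscale face image.
import numpy as np
from skimage.feature import local_binary_pattern

face = (np.random.rand(128, 128) * 255).astype(np.uint8)   # stand-in for a grayscale face image
radius, n_points = 1, 8
lbp = local_binary_pattern(face, n_points, radius, method="uniform")

hist, _ = np.histogram(lbp, bins=n_points + 2, range=(0, n_points + 2), density=True)
print(hist)                                                  # feature vector for the RNN branch
```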

18 pages, 5806 KiB  
Article
Optical Flow Magnification and Cosine Similarity Feature Fusion Network for Micro-Expression Recognition
by Heyou Chang, Jiazheng Yang, Kai Huang, Wei Xu, Jian Zhang and Hao Zheng
Mathematics 2025, 13(15), 2330; https://doi.org/10.3390/math13152330 - 22 Jul 2025
Viewed by 241
Abstract
Recent advances in deep learning have significantly advanced micro-expression recognition, yet most existing methods process the entire facial region holistically, struggling to capture subtle variations in facial action units, which limits recognition performance. To address this challenge, we propose the Optical Flow Magnification and Cosine Similarity Feature Fusion Network (MCNet). MCNet introduces a multi-facial action optical flow estimation module that integrates global motion-amplified optical flow with localized optical flow from the eye and mouth–nose regions, enabling precise capture of facial expression nuances. Additionally, an enhanced MobileNetV3-based feature extraction module, incorporating Kolmogorov–Arnold networks and convolutional attention mechanisms, effectively captures both global and local features from optical flow images. A novel multi-channel feature fusion module leverages cosine similarity between Query and Key token sequences to optimize feature integration. Extensive evaluations on four public datasets—CASME II, SAMM, SMIC-HS, and MMEW—demonstrate MCNet’s superior performance, achieving state-of-the-art results with 92.88% UF1 and 86.30% UAR on the composite dataset, surpassing the best prior method by 1.77% in UF1 and 6.0% in UAR. Full article
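A minimal sketch of fusing two token sequences via cosine similarity between Query and Key projections, in the spirit of the fusion module described; PyTorch is assumed and MCNet's exact formulation is not reproduced.

```python
# Minimal sketch: cosine-similarity weights between two token sequences used to
# fuse a local feature stream into a global one.
import torch
import torch.nn.functional as F

tokens_a = torch.randn(1, 49, 128)         # tokens from the global optical-flow branch
tokens_b = torch.randn(1, 49, 128)         # tokens from a local (eye / mouth-nose) branch

query = F.normalize(tokens_a, dim=-1)
key = F.normalize(tokens_b, dim=-1)
similarity = query @ key.transpose(1, 2)   # cosine similarity matrix between token pairs

weights = similarity.softmax(dim=-1)       # turn similarities into fusion weights
fused = tokens_a + weights @ tokens_b      # inject the most similar local tokens
print(fused.shape)
```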
(This article belongs to the Special Issue Representation Learning for Computer Vision and Pattern Recognition)

38 pages, 2346 KiB  
Review
Review of Masked Face Recognition Based on Deep Learning
by Bilal Saoud, Abdul Hakim H. M. Mohamed, Ibraheem Shayea, Ayman A. El-Saleh and Abdulaziz Alashbi
Technologies 2025, 13(7), 310; https://doi.org/10.3390/technologies13070310 - 21 Jul 2025
Viewed by 1278
Abstract
With the widespread adoption of face masks due to global health crises and heightened security concerns, traditional face recognition systems have struggled to maintain accuracy, prompting significant research into masked face recognition (MFR). Although various models have been proposed, a comprehensive and systematic understanding of recent deep learning (DL)-based approaches remains limited. This paper addresses this research gap by providing an extensive review and comparative analysis of state-of-the-art MFR techniques. We focus on DL-based methods due to their superior performance in real-world scenarios, discussing key architectures, feature extraction strategies, datasets, and evaluation metrics. This paper also introduces a structured methodology for selecting and reviewing relevant works, ensuring transparency and reproducibility. As a contribution, we present a detailed taxonomy of MFR approaches, highlight current challenges, and suggest potential future research directions. This survey serves as a valuable resource for researchers and practitioners seeking to advance the field of robust facial recognition in masked conditions. Full article

18 pages, 2423 KiB  
Article
A New AI Framework to Support Social-Emotional Skills and Emotion Awareness in Children with Autism Spectrum Disorder
by Andrea La Fauci De Leo, Pooneh Bagheri Zadeh, Kiran Voderhobli and Akbar Sheikh Akbari
Computers 2025, 14(7), 292; https://doi.org/10.3390/computers14070292 - 20 Jul 2025
Viewed by 920
Abstract
This research highlights the importance of Emotion Aware Technologies (EAT) and their implementation in serious games to assist children with Autism Spectrum Disorder (ASD) in developing social-emotional skills. As AI gains popularity, such tools can serve as invaluable teaching aids in mobile applications. In this paper, a new AI framework application is discussed that will help children with ASD develop efficient social-emotional skills. It uses the Jetpack Compose framework and the Google Cloud Vision API as its emotion-aware technology. The framework is developed with two main features designed to help children reflect on their emotions, internalise them, and learn how to express them. Each activity is based on similar features from the literature, with enhanced functionality. A diary feature allows children to take pictures of themselves, and the application categorises their facial expressions, saving each picture in the appropriate space. The three-level minigame consists of a series of prompts depicting a specific emotion that children have to match. The results of the framework offer a good starting point for similar applications to be developed further, especially by training custom models to be used with ML Kit. Full article
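Since the framework's emotion-aware component is the Google Cloud Vision API, a minimal sketch of reading its face-emotion likelihoods is shown below; the credentials setup and the image path are assumptions, and this is not the authors' application code.

```python
# Minimal sketch: querying Google Cloud Vision face detection for emotion
# likelihoods on a photo (requires GOOGLE_APPLICATION_CREDENTIALS to be set).
from google.cloud import vision

def detect_emotions(path: str) -> None:
    client = vision.ImageAnnotatorClient()
    with open(path, "rb") as f:
        image = vision.Image(content=f.read())
    response = client.face_detection(image=image)
    for face in response.face_annotations:
        # Likelihoods are enum values ranging from VERY_UNLIKELY to VERY_LIKELY.
        print("joy:", face.joy_likelihood, "sorrow:", face.sorrow_likelihood,
              "anger:", face.anger_likelihood, "surprise:", face.surprise_likelihood)

detect_emotions("selfie.jpg")   # hypothetical diary photo path
```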
(This article belongs to the Special Issue AI in Its Ecosystem)

21 pages, 1115 KiB  
Article
Non-Contact Oxygen Saturation Estimation Using Deep Learning Ensemble Models and Bayesian Optimization
by Andrés Escobedo-Gordillo, Jorge Brieva and Ernesto Moya-Albor
Technologies 2025, 13(7), 309; https://doi.org/10.3390/technologies13070309 - 19 Jul 2025
Viewed by 373
Abstract
Peripheral Oxygen Saturation (SpO2) is an important vital sign to monitor in Intensive Care Units (ICUs), during surgery and convalescence, and as part of remote medical consultations after the COVID-19 pandemic. This has made the development of new SpO2-measurement tools an area of active research and opportunity. In this paper, we present a new combined Deep Learning (DL) strategy to estimate SpO2 without contact, using pre-magnified facial videos to reveal subtle color changes related to blood flow, with no per-subject calibration required. We applied the Eulerian Video Magnification technique using the Hermite Transform (EVM-HT) as a feature detector to feed a Three-Dimensional Convolutional Neural Network (3D-CNN). Additionally, Bayesian optimization of parameters and hyperparameters and an ensemble technique were applied over the magnified dataset. We tested the method on 18 healthy subjects, acquiring facial videos of the subjects together with the automatic detection of the reference from a contact pulse oximeter device. As performance metrics for the SpO2-estimation proposal, we calculated the Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and other parameters from the Bland–Altman (BA) analysis with respect to the reference. A significant improvement was observed by adding the ensemble technique with respect to optimization alone, obtaining a 14.32% reduction in RMSE (from 0.6204 to 0.5315) and a 13.23% reduction in MAE (from 0.4323 to 0.3751). Regarding the Bland–Altman analysis, the upper and lower limits of agreement for the Mean of Differences (MOD) between the estimation and the ground truth were 1.04 and −1.05, with an MOD (bias) of −0.00175; therefore, MOD ± 1.96σ = −0.00175 ± 1.04. Thus, by leveraging Bayesian optimization for hyperparameter tuning and integrating a Bagging Ensemble, we achieved a significant reduction in the training error (bias), better generalization over the test set, and reduced variance in comparison with the baseline model for SpO2 estimation. Full article
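A minimal sketch of the Bland–Altman quantities reported (the bias and the ±1.96σ limits of agreement) between estimated and reference SpO2, assuming NumPy and synthetic values rather than the study's recordings.

```python
# Minimal sketch: Bland-Altman bias and limits of agreement between an
# estimated signal and a reference, on synthetic SpO2 values.
import numpy as np

rng = np.random.default_rng(3)
reference = rng.uniform(95, 99, size=100)                  # pulse-oximeter SpO2 (%)
estimated = reference + rng.normal(0, 0.5, size=100)       # video-based estimate (synthetic)

diff = estimated - reference
bias = diff.mean()                                         # mean of differences (MOD)
sigma = diff.std(ddof=1)
lower, upper = bias - 1.96 * sigma, bias + 1.96 * sigma    # limits of agreement

print(f"bias = {bias:.4f}, LoA = [{lower:.2f}, {upper:.2f}]")
```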
(This article belongs to the Section Assistive Technologies)