Search Results (188)

Search Parameters:
Keywords = gaze estimation

18 pages, 1959 KB  
Article
Predictive and Reactive Control During Interception
by Mario Treviño, Nathaly Martín, Andrea Barrera and Inmaculada Márquez
Brain Sci. 2026, 16(3), 322; https://doi.org/10.3390/brainsci16030322 - 18 Mar 2026
Viewed by 67
Abstract
Background/Objectives: Successful interception of moving targets requires combining predictive control, which anticipates future target states, and reactive control, which compensates for ongoing sensory discrepancies. How these components evolve over time and are distributed across gaze and manual behavior remains unclear. We aimed to explore the time-resolved dynamics of predictive control during continuous interception and to dissociate eye and hand contributions. Methods: Human participants intercepted a moving target in a two-dimensional arena using a joystick while eye movements were recorded. Target speed was systematically varied, and visual information was selectively reduced by occluding either the target or the user-controlled cursor. Predictive control was assessed using two complementary metrics: a geometric strategy index capturing moment-to-moment spatial lead or lag relative to target motion, applied separately to gaze and manual trajectories, and root mean square error (RMSE) computed relative to current and forward-shifted target positions to quantify predictive alignment. Results: Successful interception was characterized by structured, speed-dependent transitions between predictive and reactive control rather than a fixed strategy. Predictive alignment emerged early and was dynamically reweighted as temporal constraints increased. Gaze and manual behavior showed complementary but partially dissociable predictive signatures. Occluding the target decreased predictive alignment, whereas occluding the user-controlled cursor had comparatively minor effects, indicating strong reliance on internal state estimation rather than continuous visual feedback of the effector. Conclusions: Predictive and reactive control are continuously and dynamically reweighted during interception. Their interaction unfolds within single trials and depends on target dynamics and sensory availability. These findings provide quantitative evidence for time-resolved coordination between anticipatory and feedback-driven control mechanisms in goal-directed behavior.
(This article belongs to the Special Issue Predictive Processing in Brain and Behavior)
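The forward-shifted RMSE metric described above lends itself to a compact illustration: compare the manual (or gaze) trajectory against the target's current position and against the target several samples into the future, and see which shift minimizes the error. Below is a minimal sketch in Python; the array layout, the shift grid, and all names are illustrative assumptions, not the authors' code.

```python
import numpy as np

def forward_shifted_rmse(cursor_xy, target_xy, shifts):
    """RMSE between effector and target trajectories at several forward shifts.

    cursor_xy, target_xy: (T, 2) arrays sampled at the same rate.
    shifts: non-negative sample offsets; shift 0 compares against the current
    target position, larger shifts against future target positions.
    """
    errors = {}
    for s in shifts:
        if s == 0:
            diff = cursor_xy - target_xy
        else:
            diff = cursor_xy[:-s] - target_xy[s:]  # effector now vs. target s samples ahead
        errors[s] = float(np.sqrt(np.mean(np.sum(diff**2, axis=1))))
    return errors

# A cursor that leads the target should minimize RMSE at a positive shift.
t = np.linspace(0, 2 * np.pi, 500)
target = np.stack([np.cos(t), np.sin(t)], axis=1)
cursor = np.stack([np.cos(t + 0.1), np.sin(t + 0.1)], axis=1)  # leads by ~8 samples
print(forward_shifted_rmse(cursor, target, shifts=[0, 4, 8, 12]))  # minimum near 8
```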

33 pages, 6958 KB  
Article
Short-Term Performance of Visual Attention Prompt Methods Across Driver Proficiency in a Driving Simulator
by Jinwei Liang and Makio Ishihara
Multimodal Technol. Interact. 2026, 10(3), 28; https://doi.org/10.3390/mti10030028 - 11 Mar 2026
Viewed by 139
Abstract
In complex driving environments, drivers must continuously detect and respond to critical visual information such as traffic signs and pedestrians. However, important targets may sometimes be overlooked due to high cognitive load during driving. Therefore, visual attention prompt methods have been proposed to guide drivers’ gaze toward relevant targets. A visual attention prompt method is a visual cue presented in a key area of a user’s field of view to draw their visual attention. This study evaluates the short-term performance of five visual attention prompt methods (Point, Arrow, Blur, Dusk, and ModAF) in a driving simulator and compares their performance between novice and proficient drivers. Eye-tracking data and multiple analyses are used to examine whether the influence of these methods could be maintained after they are disabled and to clarify drivers’ response patterns across methods in consideration of their driving proficiency. The results indicate that visual attention prompt methods could induce a short-term transfer effect, as drivers still tend to fixate on target traffic signs earlier after the methods are disabled, and the elapsed-time analysis estimates that this effect lasts about 84.35 s. Overall, the Point, Arrow, and Dusk methods show relatively stronger performance, with significant reductions in the elapsed time to fixate on the traffic sign. The clustering analysis further shows that drivers’ response patterns are not uniform, with two clusters for novice drivers and three clusters for proficient drivers. The results suggest that most novice drivers tend to benefit from explicit non-directional visual cues that enhance target salience, such as the Point method, whereas proficient drivers are more likely to benefit from explicit directional visual cues that provide clear directional guidance, such as the Arrow method. These findings suggest that visual attention prompt methods may be useful for developing driver training strategies tailored to different levels of driving proficiency, helping drivers maintain more effective visual attention allocation during driving and potentially contributing to improved driving safety.
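The elapsed-time measure (how long until the driver first fixates the target traffic sign) can be computed directly from gaze samples. A minimal sketch under assumed inputs: screen coordinates, a rectangular AOI, and a simple dwell threshold standing in for a full fixation detector.

```python
import numpy as np

def time_to_first_fixation(gaze_xy, timestamps, aoi, min_dwell=0.1):
    """Elapsed seconds until gaze first dwells inside a rectangular AOI.

    gaze_xy: (T, 2) screen coordinates; timestamps: (T,) seconds;
    aoi: (x_min, y_min, x_max, y_max). Returns None if the AOI is never fixated.
    """
    x_min, y_min, x_max, y_max = aoi
    inside = ((gaze_xy[:, 0] >= x_min) & (gaze_xy[:, 0] <= x_max) &
              (gaze_xy[:, 1] >= y_min) & (gaze_xy[:, 1] <= y_max))
    start = None
    for i, flag in enumerate(inside):
        if flag and start is None:
            start = i                                   # gaze entered the AOI
        elif flag and timestamps[i] - timestamps[start] >= min_dwell:
            return timestamps[start] - timestamps[0]    # dwell long enough: a fixation
        elif not flag:
            start = None                                # gaze left before fixating
    return None
```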

19 pages, 3125 KB  
Article
Landmark-Guided Gaze Estimation via Conditional Keypoint Generation and Cross-Attention Fusion
by Guanghui Xu, Xiaoyang Zhang, Wanli Zhao, Zhongjie Mao, Yue Li, Duantengchuan Li and Liangshan Dong
Information 2026, 17(3), 224; https://doi.org/10.3390/info17030224 - 25 Feb 2026
Viewed by 255
Abstract
In gaze estimation, existing mainstream methods face significant challenges in capturing the fine-grained structures of eye regions, particularly in the absence of explicit geometric prior information, which hampers gaze prediction accuracy. To address this limitation, we propose the landmark-guided gaze estimation network (LGNet), a gaze estimation method guided by keypoints, which effectively incorporates geometric prior information to enhance estimation performance. The proposed method begins by training an eye-keypoint generator on the synthetic UnityEyes dataset using a Conditional Variational Autoencoder (CVAE). Next, we introduce a Symmetric Spatial Feature Fusion module (SSFF), combined with a dual-stream cross-attention mechanism, to achieve semantic alignment between the keypoint features and the facial image features extracted using ResNet50. Furthermore, we propose a Gated Channel Reweighting module (GCR) to suppress redundant information and amplify the critical features, thereby enhancing the model’s overall response. Experimental results demonstrate that LGNet outperforms existing methods on three benchmark datasets. The code for this research has been made publicly available.
(This article belongs to the Section Information Systems)
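The dual-stream cross-attention fusion can be pictured with a stripped-down numerical sketch: keypoint tokens query the facial feature map and vice versa, and the two attended streams are pooled and concatenated. Token counts, dimensions, and the omission of learned query/key/value projections are simplifications for illustration, not LGNet's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def cross_attention(queries, keys_values, d_k):
    """Single-head scaled dot-product cross-attention (projections omitted)."""
    scores = queries @ keys_values.T / np.sqrt(d_k)            # (Nq, Nk)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)              # row-wise softmax
    return weights @ keys_values                               # (Nq, d_k)

d = 64
kp_feats = rng.normal(size=(16, d))   # keypoint tokens (e.g., from the CVAE generator)
img_feats = rng.normal(size=(49, d))  # flattened 7x7 facial feature map (e.g., ResNet50)

# Dual stream: each modality attends to the other, then the streams are fused.
kp_attended = cross_attention(kp_feats, img_feats, d)
img_attended = cross_attention(img_feats, kp_feats, d)
fused = np.concatenate([kp_attended.mean(axis=0), img_attended.mean(axis=0)])
print(fused.shape)  # (128,)
```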

20 pages, 1780 KB  
Article
A Comprehensive Eye-Tracking System Toward Large FOV HMD
by Jiafu Lv, Di Zhang, Ke Han, Qi Wu and Sanxing Cao
Sensors 2026, 26(5), 1402; https://doi.org/10.3390/s26051402 - 24 Feb 2026
Viewed by 362
Abstract
Eye tracking in virtual reality (VR) head-mounted displays poses substantial engineering challenges, particularly under immersive display configurations with large fields of view (FOV), where optical layout, illumination, and image acquisition impose nontrivial system constraints. To address these design constraints, we present an integrated near-eye eye-tracking prototype tailored for immersive VR headsets, combining customized hardware components and a real-time software pipeline. The proposed system integrates optimized near-eye illumination and image acquisition with a pupil detection module and a deep learning-based gaze-vector estimation model, forming a real-time software pipeline for stable end-to-end gaze mapping under fixed calibration conditions. Under identical system settings, calibration procedures, and gaze-point mapping conditions, we evaluate the proposed gaze-vector estimation model through a controlled model-level ablation. The attention-enhanced model achieves an average angular deviation of 1.15°, corresponding to a 61.4% relative reduction compared with a baseline ResNet-152 model without attention. To demonstrate the usability of the system outputs at the application level, we further implement a real-time visualization example that integrates pupil diameter, gaze vectors, and blink events to depict the temporal evolution of eye-movement signals. This work provides a cost-effective and reproducible engineering reference for near-eye eye-movement acquisition and visualization in immersive VR settings and serves as a technical foundation for subsequent interaction design or behavioral analysis studies.
(This article belongs to the Section Optical Sensors)
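The reported angular deviation is the angle between the predicted and ground-truth 3D gaze vectors. A minimal reference implementation; the example vectors are constructed to be 1.15° apart, matching the mean deviation quoted above.

```python
import numpy as np

def angular_error_deg(pred, true):
    """Angle in degrees between predicted and ground-truth gaze vectors."""
    pred = pred / np.linalg.norm(pred, axis=-1, keepdims=True)
    true = true / np.linalg.norm(true, axis=-1, keepdims=True)
    cos = np.clip(np.sum(pred * true, axis=-1), -1.0, 1.0)  # guard against rounding
    return np.degrees(np.arccos(cos))

a = np.array([0.0, 0.0, 1.0])
b = np.array([np.sin(np.radians(1.15)), 0.0, np.cos(np.radians(1.15))])
print(angular_error_deg(a, b))  # ~1.15
```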

21 pages, 1923 KB  
Review
Mapping Eye-Tracking Research in Human–Computer Interaction: A Science-Mapping and Content-Analysis Study
by Adem Korkmaz
J. Eye Mov. Res. 2026, 19(1), 23; https://doi.org/10.3390/jemr19010023 - 12 Feb 2026
Viewed by 666
Abstract
Eye tracking has become a central method in human–computer interaction (HCI), supported by advances in sensing technologies and AI-based gaze analysis. Despite this rapid growth, a comprehensive and up-to-date overview of eye-tracking research across the broader HCI landscape remains lacking. This study combines records from Web of Science (WoS) and Scopus to analyse 1033 publications on eye tracking in HCI published between 2020 and 2025. After merging and deduplicating the datasets, we conducted bibliometric network analyses (keyword co-occurrence, co-citation, co-authorship, and source mapping) using VOSviewer and performed a qualitative content analysis of the 50 most-cited papers. The literature is dominated by journal articles and conference papers produced by small- to medium-sized research teams (mean: 3.9 authors per paper; h-index: 29). Keyword and overlay visualisations reveal four principal research axes: deep-learning-based gaze estimation; XR-related interaction paradigms within HCI; cognitive load and human factors; and usability- and accessibility-oriented interface design. The most-cited studies focus on gaze interaction in immersive environments, deep learning for gaze estimation, multimodal interaction, and physiological approaches to assessing cognitive load. Overall, the findings indicate that eye tracking in HCI is evolving from a measurement-oriented technique into a core enabling technology that supports interaction design, cognitive assessment, accessibility, and ethical considerations such as privacy. This review identifies research gaps and outlines future directions for benchmarking practices, real-world deployments, and privacy-preserving gaze analytics in HCI.
(This article belongs to the Special Issue New Horizons and Recent Advances in Eye-Tracking Technology)
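The merge-and-deduplicate step that precedes the network analyses can be sketched in a few lines. The record field names ('doi', 'title') and the DOI-first matching rule are assumptions about the WoS/Scopus export format, not the study's actual pipeline.

```python
def merge_and_deduplicate(wos_records, scopus_records):
    """Merge two bibliographic exports, dropping duplicates by DOI or, when a
    DOI is missing, by normalized title."""
    seen, merged = set(), []
    for rec in wos_records + scopus_records:
        doi = (rec.get("doi") or "").strip().lower()
        title = "".join(ch for ch in rec.get("title", "").lower() if ch.isalnum())
        key = doi or title  # prefer DOI; fall back to normalized title
        if key and key not in seen:
            seen.add(key)
            merged.append(rec)
    return merged
```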

15 pages, 2307 KB  
Article
An Open-Source Horizontal Strabismus Simulator as an Evaluation Platform for Monocular Gaze Estimation Using Deep Learning Models
by Shumpei Takinami, Yuka Morita, Jun Seita and Tetsuro Oshika
J. Eye Mov. Res. 2026, 19(1), 20; https://doi.org/10.3390/jemr19010020 - 9 Feb 2026
Viewed by 701
Abstract
Strabismus affects 2–4% of the global population, with horizontal cases accounting for more than 90%. Automated screening using monocular gaze estimation technology shows promise for early detection. However, existing models assume normal binocular vision, and their applicability to strabismus remains unvalidated due to the lack of evaluation platforms capable of reproducing disconjugate eye movements with known ground-truth angles. To address this gap, we developed an open-source, low-cost (approximately 200 USD) horizontal strabismus simulator. The simulator features two independently controllable artificial eyeballs mounted on a two-axis gimbal mechanism with servo motors and gyro sensors for real-time angle measurement. Mechanical accuracy achieved a mean absolute error of less than 0.1° across all axes, well below the clinical detection threshold of 1 prism diopter (≈0.57°). An evaluation of three representative AI models (Single Eye, GazeNet, and EyeNet) revealed estimation errors of 6.44–8.75°, substantially exceeding the clinical target of 2.8°. At this error level, small-angle strabismus (<15 prism diopters) would likely be missed, underscoring the need for strabismus-specific model development. Moreover, rapid accuracy degradation was observed beyond ±15° gaze angles. This platform establishes baseline performance metrics and provides a foundation for advancing gaze estimation technology for strabismus screening.
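The prism-diopter figures convert to degrees through the definition of the unit: one prism diopter deflects a ray by 1 cm at a distance of 1 m, so the angle is atan(PD/100).

```python
import math

def prism_diopters_to_degrees(pd):
    """One prism diopter = 1 cm deflection at 1 m, i.e., angle = atan(pd / 100)."""
    return math.degrees(math.atan(pd / 100.0))

print(prism_diopters_to_degrees(1))   # ~0.57 deg: the clinical detection threshold
print(prism_diopters_to_degrees(15))  # ~8.5 deg: the small-angle strabismus bound,
                                      # comparable to the 6.44-8.75 deg model errors
```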

34 pages, 7495 KB  
Article
Advanced Consumer Behaviour Analysis: Integrating Eye Tracking, Machine Learning, and Facial Recognition
by José Augusto Rodrigues, António Vieira de Castro and Martín Llamas-Nistal
J. Eye Mov. Res. 2026, 19(1), 9; https://doi.org/10.3390/jemr19010009 - 19 Jan 2026
Viewed by 762
Abstract
This study presents DeepVisionAnalytics, an integrated framework that combines eye tracking, OpenCV-based computer vision (CV), and machine learning (ML) to support objective analysis of consumer behaviour in visually driven tasks. Unlike conventional self-reported surveys, which are prone to cognitive bias, recall errors, and social desirability effects, the proposed approach relies on direct behavioural measurements of visual attention. The system captures gaze distribution and fixation dynamics during interaction with products or interfaces. It uses AOI-level eye tracking metrics as the sole behavioural signal to infer candidate choice under constrained experimental conditions. In parallel, OpenCV and ML perform facial analysis to estimate demographic attributes (age, gender, and ethnicity). These attributes are collected independently and linked post hoc to gaze-derived outcomes. Demographics are not used as predictive features for choice inference. Instead, they are used as contextual metadata to support stratified, segment-level interpretation. Empirical results show that gaze-based inference closely reproduces observed choice distributions in short-horizon, visually driven tasks. Demographic estimates enable meaningful post hoc segmentation without affecting the decision mechanism. Together, these results show that multimodal integration can move beyond descriptive heatmaps. The platform produces reproducible decision-support artefacts, including AOI rankings, heatmaps, and segment-level summaries, grounded in objective behavioural data. By separating the decision signal (gaze) from contextual descriptors (demographics), this work contributes a reusable end-to-end platform for marketing and UX research. It supports choice inference under constrained conditions and segment-level interpretation without demographic priors in the decision mechanism.
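What "AOI-level eye-tracking metrics as the sole behavioural signal" can look like in practice is a dwell-time rule: the inferred choice is the candidate whose area of interest accumulates the most fixation time. A deliberately simple sketch with an assumed data layout, not the framework's actual inference code.

```python
def infer_choice_from_dwell(fixations, aois):
    """Pick the candidate whose AOI accumulates the most fixation time.

    fixations: iterable of (x, y, duration) tuples;
    aois: {candidate_name: (x0, y0, x1, y1)} rectangles.
    """
    dwell = {name: 0.0 for name in aois}
    for x, y, dur in fixations:
        for name, (x0, y0, x1, y1) in aois.items():
            if x0 <= x <= x1 and y0 <= y <= y1:
                dwell[name] += dur
    return max(dwell, key=dwell.get), dwell
```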

13 pages, 455 KB  
Article
Eye Gaze Detection Using a Hybrid Multimodal Deep Learning Model for Assistive Technology
by Verdzekov Emile Tatinyuy, Noumsi Woguia Auguste Vigny, Mvogo Ngono Joseph, Fono Louis Aimé and Wirba Pountianus Berinyuy
Appl. Sci. 2026, 16(2), 986; https://doi.org/10.3390/app16020986 - 19 Jan 2026
Viewed by 690
Abstract
This paper presents a novel hybrid multimodal deep learning model for robust and real-time eye gaze estimation. Accurate gaze tracking is essential for advancing human–computer interaction (HCI) and assistive technologies, but existing methods often struggle with environmental variations, require extensive calibration, and are computationally intensive. Our proposed model, GazeNet-HM, addresses these limitations by synergistically fusing features from RGB, depth, and infrared (IR) imaging modalities. This multimodal approach allows the model to leverage complementary information: RGB provides rich texture, depth offers invariance to lighting and aids pose estimation, and IR ensures robust pupil detection. Furthermore, we introduce a personalized adaptation module that dynamically fine-tunes the model to individual users with minimal calibration data. To ensure practical deployment, we employ advanced model compression techniques, enabling real-time inference on resource-constrained embedded systems. Extensive evaluations on public datasets (MPIIGaze, EYEDIAP, Gaze360) and our collected M-Gaze dataset demonstrate that GazeNet-HM achieves state-of-the-art performance, reducing the mean angular error by up to 27.1% compared to leading unimodal methods. After model compression, the system achieves a real-time inference speed of 32 FPS on an embedded Jetson Xavier NX platform. Ablation studies confirm the contribution of each modality and component, highlighting the effectiveness of our holistic design.
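Feature-level fusion of the three modalities can be pictured as independent per-modality encoders whose outputs are concatenated before a regression head. The linear encoders, dimensions, and random weights below are placeholders illustrating the data flow, not GazeNet-HM itself.

```python
import numpy as np

rng = np.random.default_rng(1)

def encode(x, w):
    """Stand-in per-modality encoder: linear map followed by ReLU."""
    return np.maximum(x @ w, 0.0)

# RGB contributes texture, depth contributes geometry, IR contributes pupil contrast.
rgb, depth, ir = rng.normal(size=(3, 256))
w_rgb, w_depth, w_ir = (rng.normal(size=(256, 64)) for _ in range(3))
fused = np.concatenate([encode(rgb, w_rgb), encode(depth, w_depth), encode(ir, w_ir)])
w_head = 0.01 * rng.normal(size=(192, 2))
gaze = fused @ w_head  # (yaw, pitch) regression head
print(gaze.shape)      # (2,)
```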

23 pages, 4065 KB  
Article
Robust Camera-Based Eye-Tracking Method Allowing Head Movements and Its Application in User Experience Research
by He Zhang and Lu Yin
J. Eye Mov. Res. 2025, 18(6), 71; https://doi.org/10.3390/jemr18060071 - 1 Dec 2025
Cited by 1 | Viewed by 964
Abstract
Eye-tracking for user experience analysis has traditionally relied on dedicated hardware, which is often costly and imposes restrictive operating conditions. As an alternative, solutions utilizing ordinary webcams have attracted significant interest due to their affordability and ease of use. However, a major limitation persists in these vision-based methods: sensitivity to head movements. Therefore, users are often required to maintain a rigid head position, leading to discomfort and potentially skewed results. To address this challenge, this paper proposes a robust eye-tracking methodology designed to accommodate head motion. Our core technique involves mapping the displacement of the pupil center from a dynamically updated reference point to estimate the gaze point. When head movement is detected, the system recalculates the head-pointing coordinate using estimated head pose and user-to-screen distance. This new head position and the corresponding pupil center are then established as the fresh benchmark for subsequent gaze point estimation, creating a continuous and adaptive correction loop. We conducted accuracy tests with 22 participants. The results demonstrate that our method surpasses the performance of many current methods, achieving mean gaze errors of 1.13 and 1.37 degrees in two testing modes. Further validation in a smooth pursuit task confirmed its efficacy in dynamic scenarios. Finally, we applied the method in a real-world gaming context, successfully extracting fixation counts and gaze heatmaps to analyze visual behavior and UX across different game modes, thereby verifying its practical utility.
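The core correction loop (mapping pupil displacement from a dynamically updated reference and re-anchoring whenever head movement is detected) reduces to a small state-holding class. The linear gain and all names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

class AdaptiveGazeMapper:
    """Gaze point from pupil displacement relative to an adaptive reference."""

    def __init__(self, gain, head_point, pupil_ref):
        self.gain = np.asarray(gain, float)              # screen px per pupil px, (2,)
        self.head_point = np.asarray(head_point, float)  # head-pointing coordinate
        self.pupil_ref = np.asarray(pupil_ref, float)    # benchmark pupil center

    def on_head_move(self, new_head_point, current_pupil):
        """Re-anchor: the recalculated head-pointing coordinate (from head pose
        and user-to-screen distance) and the current pupil become the benchmark."""
        self.head_point = np.asarray(new_head_point, float)
        self.pupil_ref = np.asarray(current_pupil, float)

    def gaze_point(self, pupil):
        return self.head_point + self.gain * (np.asarray(pupil, float) - self.pupil_ref)
```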

16 pages, 2398 KB  
Article
Gaze Point Estimation via Joint Learning of Facial Features and Screen Projection
by Yuying Zhang, Fei Xu and Yi Yang
Appl. Sci. 2025, 15(23), 12475; https://doi.org/10.3390/app152312475 - 25 Nov 2025
Viewed by 624
Abstract
In recent years, gaze estimation has attracted considerable interest in areas including human–computer interaction, virtual reality, and user engagement analysis. Despite significant advances in convolutional neural network (CNN) techniques, directly and effectively predicting the point of gaze (PoG) in unconstrained situations remains a difficult task. This study proposes a gaze point estimation network (L1fcs-Net) that combines facial features with positional features derived from a two-dimensional array obtained by projecting the face relative to the screen. Our approach incorporates a Face-grid branch to enhance the network’s ability to extract features such as the relative position and distance of the face to the screen. Additionally, independent fully connected layers regress x and y coordinates separately, enabling the model to better capture gaze movement characteristics in both horizontal and vertical directions. Furthermore, we employ a multi-loss approach, balancing classification and regression losses to reduce gaze point prediction errors and improve overall gaze performance. To evaluate our model, we conducted experiments on the MPIIFaceGaze dataset, which was collected under unconstrained settings. The proposed model achieves state-of-the-art performance on this dataset with a gaze point prediction error of 2.05 cm, demonstrating its superior capability in gaze estimation.
(This article belongs to the Special Issue AI Technologies for eHealth and mHealth, 2nd Edition)
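The balanced multi-loss objective might look like the following, with the independently regressed x and y coordinates entering the regression term and a coarse screen-grid classification providing the auxiliary term. The L1 choice, the grid discretization, and the weight alpha are assumptions for illustration.

```python
import numpy as np

def multi_loss(pred_xy, true_xy, cell_logits, true_cell, alpha=0.5):
    """Balanced regression + classification loss over gaze points.

    pred_xy, true_xy: (N, 2) screen coordinates; cell_logits: (N, C) scores
    over a coarse screen grid; true_cell: (N,) integer grid-cell labels.
    """
    reg = np.mean(np.abs(pred_xy - true_xy))                     # L1 regression
    z = cell_logits - cell_logits.max(axis=1, keepdims=True)     # stabilize softmax
    logp = z - np.log(np.sum(np.exp(z), axis=1, keepdims=True))
    cls = -np.mean(logp[np.arange(len(true_cell)), true_cell])   # cross-entropy
    return reg + alpha * cls
```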

17 pages, 12830 KB  
Article
Your Eyes Under Pressure: Real-Time Estimation of Cognitive Load with Smooth Pursuit Tracking
by Pierluigi Dell’Acqua, Marco Garofalo, Francesco La Rosa and Massimo Villari
Big Data Cogn. Comput. 2025, 9(11), 288; https://doi.org/10.3390/bdcc9110288 - 13 Nov 2025
Cited by 2 | Viewed by 1728
Abstract
Understanding and accurately estimating cognitive workload is crucial for the development of adaptive, user-centered interactive systems across a variety of domains including augmented reality, automotive driving assistance, and intelligent tutoring systems. Cognitive workload assessment enables dynamic system adaptation to improve user experience and safety. In this work, we introduce a novel framework that leverages smooth pursuit eye movements as a non-invasive and temporally precise indicator of mental effort. A key innovation of our approach is the development of trajectory-independent algorithms that address a significant limitation of existing methods, which generally rely on a predefined or known stimulus trajectory. Our framework provides accurate cognitive load estimation without requiring knowledge of the exact target path, using two solutions based on Kalman-filter and B-spline heuristic classifiers. This enables the application of our methods in more naturalistic and unconstrained environments where stimulus trajectories may be unknown. We evaluated these algorithms against classical supervised machine learning models on a publicly available benchmark dataset featuring diverse pursuit trajectories and varying cognitive workload conditions. The results demonstrate competitive performance along with robustness across different task complexities and trajectory types. Moreover, our framework supports real-time inference, making it viable for continuous cognitive workload monitoring. To further enhance deployment feasibility, we propose a federated learning architecture, allowing privacy-preserving adaptation of models across heterogeneous devices without the need to share raw gaze data. This scalable approach mitigates privacy concerns and facilitates collaborative model improvement in distributed real-world scenarios. Experimental findings confirm that metrics derived from smooth pursuit eye movements reliably reflect fluctuations in cognitive states induced by working memory load tasks, substantiating their use for real-time, continuous workload estimation. By integrating trajectory independence, robust classification techniques, and federated privacy-aware learning, our work advances the state of the art in adaptive human–computer interaction. This framework offers a scientifically grounded, privacy-conscious, and practically deployable solution for cognitive workload estimation that can be adapted to diverse application contexts.
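Of the two trajectory-independent solutions, the Kalman-filter route can be sketched as a constant-velocity filter over the gaze trace whose innovation (residual) magnitudes quantify pursuit smoothness, the kind of signal a workload classifier could then threshold. The 1D simplification and the noise levels are illustrative assumptions.

```python
import numpy as np

def pursuit_innovations(gaze, dt=1 / 120, q=50.0, r=1.0):
    """Constant-velocity Kalman filter over 1D gaze; returns |innovation| per step."""
    F = np.array([[1.0, dt], [0.0, 1.0]])                 # state transition
    H = np.array([[1.0, 0.0]])                            # we observe position only
    Q = q * np.array([[dt**3 / 3, dt**2 / 2], [dt**2 / 2, dt]])
    x, P = np.array([gaze[0], 0.0]), np.eye(2)
    innovations = []
    for z in gaze[1:]:
        x, P = F @ x, F @ P @ F.T + Q                     # predict
        y = z - H @ x                                     # innovation
        S = H @ P @ H.T + r
        K = (P @ H.T) / S                                 # Kalman gain, (2, 1)
        x = x + (K * y).ravel()                           # update state
        P = (np.eye(2) - K @ H) @ P
        innovations.append(float(abs(y[0])))
    return np.array(innovations)
```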

20 pages, 10684 KB  
Article
Electro-Oculography and Proprioceptive Calibration Enable Horizontal and Vertical Gaze Estimation, Even with Eyes Closed
by Xin Wei, Felix Dollack, Kiyoshi Kiyokawa and Monica Perusquía-Hernández
Sensors 2025, 25(21), 6754; https://doi.org/10.3390/s25216754 - 4 Nov 2025
Viewed by 1266
Abstract
Eye movement is an important tool used to investigate cognition. It also serves as input in human–computer interfaces for assistive technology. It can be measured with camera-based eye tracking and electro-oculography (EOG). EOG does not rely on eye visibility and can be measured even when the eyes are closed. We investigated the feasibility of detecting the gaze direction using EOG while having the eyes closed. A total of 15 participants performed a proprioceptive calibration task with open and closed eyes, while their eye movement was recorded with a camera-based eye tracker and with EOG. The calibration was guided by the participants’ hand motions following a pattern of felt dots on cardboard. Our cross-correlation analysis revealed reliable temporal synchronization between gaze-related signals and the instructed trajectory across all conditions. Statistical comparison tests and equivalence tests demonstrated that EOG tracking was statistically equivalent to the camera-based eye tracker gaze direction during the eyes-open condition. The camera-based eye-tracking glasses do not support tracking with closed eyes. Therefore, we evaluated the EOG-based gaze estimates during the eyes-closed trials by comparing them to the instructed trajectory. The results showed that EOG signals, guided by proprioceptive cues, followed the instructed path and achieved a significantly greater accuracy than shuffled control data, which represented a chance-level performance. This demonstrates the advantage of EOG when camera-based eye tracking is infeasible, and it paves the way for the development of eye-movement input interfaces for blind people, research on eye movement direction when the eyes are closed, and the early detection of diseases.
(This article belongs to the Section Biomedical Sensors)
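The cross-correlation analysis, which finds the lag that best aligns a gaze-related signal with the instructed trajectory, is a few lines of NumPy. The sampling rate and the synthetic 40 ms delay in the example are assumptions for illustration.

```python
import numpy as np

def best_lag(signal, reference, fs):
    """Lag (s) maximizing the normalized cross-correlation of two traces."""
    s = (signal - signal.mean()) / signal.std()
    r = (reference - reference.mean()) / reference.std()
    corr = np.correlate(s, r, mode="full")
    lag = np.argmax(corr) - (len(r) - 1)   # positive: signal trails the reference
    return lag / fs, corr.max() / len(r)

fs = 250
t = np.arange(0, 4, 1 / fs)
reference = np.sin(2 * np.pi * 0.5 * t)        # instructed trajectory
signal = np.roll(reference, int(0.04 * fs))    # EOG-like trace, delayed 40 ms
print(best_lag(signal, reference, fs))         # ~(0.04, ~1.0)
```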

18 pages, 6415 KB  
Article
Drowsiness Classification in Young Drivers Based on Facial Near-Infrared Images Using a Convolutional Neural Network: A Pilot Study
by Ayaka Nomura, Atsushi Yoshida, Takumi Torii, Kent Nagumo, Kosuke Oiwa and Akio Nozawa
Sensors 2025, 25(21), 6755; https://doi.org/10.3390/s25216755 - 4 Nov 2025
Viewed by 782
Abstract
Drowsy driving is a major cause of traffic accidents worldwide, and its early detection remains essential for road safety. Conventional driver monitoring systems (DMS) primarily rely on behavioral indicators such as eye closure, gaze, or head pose, which typically appear only after a significant decline in alertness. This study explores the potential of facial near-infrared (NIR) imaging as a hypothetical physiological indicator of drowsiness. Because NIR light penetrates more deeply into biological tissue than visible light, it may capture subtle variations in blood flow and oxygenation near superficial vessels. Based on this hypothesis, we conducted a pilot feasibility study involving young adult participants to investigate whether drowsiness levels could be estimated from single-frame NIR facial images acquired at 940 nm—a wavelength already used in commercial DMS and suitable for both physiological sensitivity and practical feasibility. A convolutional neural network (CNN) was trained to classify multiple levels of drowsiness, and Gradient-weighted Class Activation Mapping (Grad-CAM) was applied to interpret the discriminative regions. The results showed that classification based on 940 nm NIR images is feasible, achieving an optimal accuracy of approximately 90% under the binary classification scheme (Pattern A). Grad-CAM revealed that regions around the nasal dorsum contributed most to the classification, consistent with known physiological signs of drowsiness. These findings support the feasibility of NIR-based drowsiness classification in young drivers and provide a foundation for future studies with larger and more diverse populations.

17 pages, 2654 KB  
Article
Eyeglass-Type Switch: A Wearable Eye-Movement and Blink Switch for ALS Nurse Call
by Ryuto Tamai, Takeshi Saitoh, Kazuyuki Itoh and Haibo Zhang
Electronics 2025, 14(21), 4201; https://doi.org/10.3390/electronics14214201 - 27 Oct 2025
Cited by 1 | Viewed by 1007
Abstract
We present the eyeglass-type switch, an eyeglass-mounted eye/blink switch designed for nurse-call operation by people with severe motor impairments, with a particular focus on amyotrophic lateral sclerosis (ALS). The system targets real-world bedside constraints—low illumination at night, supine posture, and network-independent operation—by combining near-infrared (NIR) LED illumination with an NIR eye camera and executing all processing on a small, GPU-free computer. A two-stage convolutional pipeline estimates eight periocular landmarks and the pupil center; eye-closure is detected either by a binary classifier or by an angle criterion derived from landmarks, which also skips pupil estimation during closure. User intent is determined by crossing a caregiver-tunable “off-area” around neutral gaze, implemented as rectangular or sector shapes. Four output modes—single, continuous, long-press, and hold-to-activate—are supported for both oculomotor and eyelid inputs. Safety is addressed via relay-based electrical isolation from the nurse-call circuit and audio feedback for state indication. The prototype runs at 18 fps on commodity hardware. In feature-point evaluation, mean errors were 2.84 pixels for landmarks and 1.33 pixels for the pupil center. In a bedside task with 12 healthy participants, the system achieved F=0.965 in single mode and F=0.983 in hold-to-activate mode; blink-only input yielded F=0.993. Performance was uniformly high for right/left/up and eye-closure cues, with lower recall for downward gaze due to eyelid occlusion, suggesting camera placement or threshold tuning as remedies. The results indicate that the proposed switch provides reliable, low-burden nurse-call control under nighttime conditions and offers a practical input option for emergency alerts and augmentative and alternative communication (AAC) workflows.
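The off-area trigger logic reduces to a bounds check plus, for the hold-to-activate mode, a dwell counter. The rectangular bounds and hold duration below are illustrative stand-ins for the caregiver-tunable settings; 18 fps matches the prototype's frame rate.

```python
def crossed_off_area(gaze, neutral, half_w=0.15, half_h=0.10):
    """True when gaze leaves a rectangular 'off-area' centred on neutral gaze."""
    return abs(gaze[0] - neutral[0]) > half_w or abs(gaze[1] - neutral[1]) > half_h

def hold_to_activate(gaze_stream, neutral, fps=18, hold_s=1.0):
    """Fire once gaze stays outside the off-area for hold_s consecutive seconds."""
    needed, run = int(fps * hold_s), 0
    for gaze in gaze_stream:
        run = run + 1 if crossed_off_area(gaze, neutral) else 0
        if run >= needed:
            return True
    return False
```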

18 pages, 1175 KB  
Article
NAMI: A Neuro-Adaptive Multimodal Architecture for Wearable Human–Computer Interaction
by Christos Papakostas, Christos Troussas, Akrivi Krouska and Cleo Sgouropoulou
Multimodal Technol. Interact. 2025, 9(10), 108; https://doi.org/10.3390/mti9100108 - 18 Oct 2025
Cited by 1 | Viewed by 1934
Abstract
The increasing ubiquity of wearable computing and multimodal interaction technologies has created unprecedented opportunities for natural and seamless human–computer interaction. However, most existing systems adapt only to external user actions such as speech, gesture, or gaze, without considering internal cognitive or affective states. This limits their ability to provide intelligent and empathetic adaptations. This paper addresses this critical gap by proposing the Neuro-Adaptive Multimodal Architecture (NAMI), a principled, modular, and reproducible framework designed to integrate behavioral and neurophysiological signals in real time. NAMI combines multimodal behavioral inputs with lightweight EEG and peripheral physiological measurements to infer cognitive load and engagement and adapt the interface dynamically to optimize user experience. The architecture is formally specified as a three-layer pipeline encompassing sensing and acquisition, cognitive–affective state estimation, and adaptive interaction control, with clear data flows, mathematical formalization, and real-time performance on wearable platforms. A prototype implementation of NAMI was deployed in an augmented reality Java programming tutor for postgraduate informatics students, where it dynamically adjusted task difficulty, feedback modality, and assistance frequency based on inferred user state. Empirical evaluation with 100 participants demonstrated significant improvements in task performance, reduced subjective workload, and increased engagement and satisfaction, confirming the effectiveness of the neuro-adaptive approach.
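The three-layer pipeline can be skeletonized as sensing inputs feeding a state estimator feeding an adaptation policy. The feature choices, weights, and thresholds below are toy assumptions sketching the data flow, not NAMI's formal specification.

```python
from dataclasses import dataclass

@dataclass
class UserState:
    cognitive_load: float  # 0..1, from EEG and peripheral physiology
    engagement: float      # 0..1, from behavioural and physiological fusion

def estimate_state(eeg_load_index, hrv, gaze_entropy):
    """Layer 2: cognitive-affective state estimation (illustrative weighting)."""
    load = 0.6 * eeg_load_index + 0.4 * (1.0 - hrv)
    engagement = 1.0 - gaze_entropy
    clamp = lambda v: min(max(v, 0.0), 1.0)
    return UserState(clamp(load), clamp(engagement))

def adapt(state, difficulty):
    """Layer 3: adjust task difficulty and assistance from the inferred state."""
    if state.cognitive_load > 0.7:
        return max(difficulty - 1, 0), "offer_hint"
    if state.cognitive_load < 0.3 and state.engagement < 0.4:
        return difficulty + 1, "raise_challenge"
    return difficulty, "no_change"
```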
