Search Results (254)

Search Parameters:
Keywords = gestural interfaces

19 pages, 18266 KB  
Article
GECO: A Real-Time Computer Vision-Assisted Gesture Controller for Advanced IoT Home System
by Murilo C. Lopes, Paula A. Silva, Ludwing Marenco, Evandro C. Vilas Boas, João G. A. de Carvalho, Cristiane A. Ferreira, André L. O. Carvalho, Cristiani V. R. Guimarães, Guilherme P. Aquino and Felipe A. P. de Figueiredo
Sensors 2026, 26(1), 61; https://doi.org/10.3390/s26010061 - 21 Dec 2025
Viewed by 573
Abstract
This paper introduces GECO, a real-time, computer vision-assisted gesture controller for IoT-based smart home systems. The platform uses a markerless MediaPipe interface that combines gesture-driven navigation and command execution, enabling intuitive control of multiple domestic devices. The system integrates binary and analog gestures, such as continuous light dimming based on thumb–index angles, while operating on-device through a private MQTT network. Technical evaluations across multiple Android devices have demonstrated ultra-low latency times (<50 ms), enabling real-time responsiveness. A user experience study with seventeen participants reported high intuitiveness (9.5/10), gesture accuracy (9.2/10), and perceived inclusivity, mainly for individuals with speech impairments and low technological literacy. These results position GECO as a lightweight, accessible, and privacy-preserving interaction framework, advancing the integration of artificial intelligence and IoT within smart home environments. Full article
(This article belongs to the Special Issue AI-Empowered Internet of Things)
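As a rough illustration of the analog dimming gesture described above, the sketch below maps a thumb–index angle computed from MediaPipe hand landmarks to a 0–100 dimming level and publishes it over MQTT with paho-mqtt. The broker address, topic layout, angle range, and the exact landmark-to-angle definition are assumptions for illustration, not details taken from the GECO paper.

```python
# Hypothetical sketch: thumb-index angle -> 0-100 dimming level -> MQTT publish.
import math
import paho.mqtt.client as mqtt

BROKER = "192.168.0.10"           # assumed private-network broker
TOPIC = "home/livingroom/dimmer"  # assumed topic layout

def thumb_index_angle(landmarks):
    """Angle (degrees) between wrist->thumb-tip and wrist->index-tip vectors.
    `landmarks` is a list of 21 (x, y) MediaPipe hand landmarks."""
    wrist, thumb, index = landmarks[0], landmarks[4], landmarks[8]
    v1 = (thumb[0] - wrist[0], thumb[1] - wrist[1])
    v2 = (index[0] - wrist[0], index[1] - wrist[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def publish_dimming(landmarks, client, max_angle=60.0):
    # Clamp the angle into [0, max_angle] and rescale to a 0-100 level.
    level = int(min(thumb_index_angle(landmarks), max_angle) / max_angle * 100)
    client.publish(TOPIC, str(level))
    return level

if __name__ == "__main__":
    client = mqtt.Client()  # paho-mqtt 1.x style; 2.x needs a CallbackAPIVersion
    client.connect(BROKER, 1883)
    # Fake landmarks (21 points) just to exercise the mapping.
    fake = [(0.5, 0.9)] + [(0.5, 0.5)] * 20
    fake[4], fake[8] = (0.3, 0.5), (0.6, 0.4)
    print(publish_dimming(fake, client))
```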

31 pages, 9303 KB  
Article
Automatic Quadrotor Dispatch Missions Based on Air-Writing Gesture Recognition
by Pu-Sheng Tsai, Ter-Feng Wu and Yen-Chun Wang
Processes 2025, 13(12), 3984; https://doi.org/10.3390/pr13123984 - 9 Dec 2025
Viewed by 381
Abstract
This study develops an automatic dispatch system for quadrotor UAVs that integrates air-writing gesture recognition with a graphical user interface (GUI). The DJI RoboMaster quadrotor UAV (DJI, Shenzhen, China) was employed as the experimental platform, combined with an ESP32 microcontroller (Espressif Systems, Shanghai, China) and the RoboMaster SDK (version 3.0). On the Python (version 3.12.7) platform, a GUI was implemented using Tkinter (version 8.6), allowing users to input addresses or landmarks, which were then automatically converted into geographic coordinates and imported into Google Maps for route planning. The generated flight commands were transmitted to the UAV via a UDP socket, enabling remote autonomous flight. For gesture recognition, a Raspberry Pi integrated with the MediaPipe Hands module was used to capture 16 types of air-written flight commands in real time through a camera. The training samples were categorized into one-dimensional coordinates and two-dimensional images. In the one-dimensional case, X/Y axis coordinates were concatenated after data augmentation, interpolation, and normalization. In the two-dimensional case, three types of images were generated, namely font trajectory plots (T-plots), coordinate-axis plots (XY-plots), and composite plots combining the two (XYT-plots). To evaluate classification performance, several machine learning and deep learning architectures were employed, including a multi-layer perceptron (MLP), support vector machine (SVM), one-dimensional convolutional neural network (1D-CNN), and two-dimensional convolutional neural network (2D-CNN). The results demonstrated effective recognition accuracy across different models and sample formats, verifying the feasibility of the proposed air-writing trajectory framework for non-contact gesture-based UAV control. Furthermore, by combining gesture recognition with a GUI-based map planning interface, the system enhances the intuitiveness and convenience of UAV operation. Future extensions, such as incorporating aerial image object recognition, could extend the framework’s applications to scenarios including forest disaster management, vehicle license plate recognition, and air pollution monitoring. Full article
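The one-dimensional sample format described above (interpolation, normalization, then X/Y concatenation) can be sketched roughly as follows; the fixed length of 64 points and the min-max normalization are assumptions rather than the paper's exact preprocessing.

```python
# Resample a variable-length air-written (x, y) fingertip trajectory to a fixed
# length, normalize each axis, and concatenate X and Y into one 1-D vector.
import numpy as np

def trajectory_to_1d(points, target_len=64):
    """points: (N, 2) array of fingertip coordinates from MediaPipe Hands."""
    pts = np.asarray(points, dtype=float)
    # Resample each axis to `target_len` samples by linear interpolation.
    t_old = np.linspace(0.0, 1.0, len(pts))
    t_new = np.linspace(0.0, 1.0, target_len)
    x = np.interp(t_new, t_old, pts[:, 0])
    y = np.interp(t_new, t_old, pts[:, 1])
    # Min-max normalize each axis to [0, 1] (assumed normalization scheme).
    x = (x - x.min()) / (x.max() - x.min() + 1e-8)
    y = (y - y.min()) / (y.max() - y.min() + 1e-8)
    # Concatenate X then Y, as in the paper's 1-D coordinate format.
    return np.concatenate([x, y])

demo = trajectory_to_1d(np.random.rand(37, 2))
print(demo.shape)  # (128,)
```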

22 pages, 1145 KB  
Article
TSMTFN: Two-Stream Temporal Shift Module Network for Efficient Egocentric Gesture Recognition in Virtual Reality
by Muhammad Abrar Hussain, Chanjun Chun and SeongKi Kim
Virtual Worlds 2025, 4(4), 58; https://doi.org/10.3390/virtualworlds4040058 - 4 Dec 2025
Viewed by 323
Abstract
Egocentric hand gesture recognition is vital for natural human–computer interaction in augmented and virtual reality (AR/VR) systems. However, most deep learning models struggle to balance accuracy and efficiency, limiting real-time use on wearable devices. This paper introduces a Two-Stream Temporal Shift Module Transformer Fusion Network (TSMTFN) that achieves high recognition accuracy with low computational cost. The model integrates Temporal Shift Modules (TSMs) for efficient motion modeling and a Transformer-based fusion mechanism for long-range temporal understanding, operating on dual RGB-D streams to capture complementary visual and depth cues. Training stability and generalization are enhanced through full-layer training from epoch 1 and MixUp/CutMix augmentations. Evaluated on the EgoGesture dataset, TSMTFN attained 96.18% top-1 accuracy and 99.61% top-5 accuracy on the independent test set with only 16 GFLOPs and 21.3M parameters, offering a 2.4–4.7× reduction in computation compared to recent state-of-the-art methods. The model runs at 15.10 samples/s, achieving real-time performance. The results demonstrate robust recognition across over 95% of gesture classes and minimal inter-class confusion, establishing TSMTFN as an efficient, accurate, and deployable solution for next-generation wearable AR/VR gesture interfaces. Full article
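The Temporal Shift Module at the core of TSMTFN performs a parameter-free channel shift along the time axis; a minimal PyTorch sketch of that operation is shown below, following the commonly used 1/8-channel split, with an assumed tensor layout.

```python
# Minimal temporal shift: some channels move one frame forward in time, some
# one frame backward, the rest stay put, adding temporal mixing at zero FLOPs.
import torch

def temporal_shift(x, shift_div=8):
    """x: (batch, time, channels, height, width) video feature tensor."""
    b, t, c, h, w = x.shape
    fold = c // shift_div
    out = torch.zeros_like(x)
    out[:, 1:, :fold] = x[:, :-1, :fold]                   # frame t sees t-1
    out[:, :-1, fold:2 * fold] = x[:, 1:, fold:2 * fold]   # frame t sees t+1
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]              # untouched channels
    return out

clip = torch.randn(2, 8, 64, 14, 14)  # toy RGB-stream feature clip
print(temporal_shift(clip).shape)     # torch.Size([2, 8, 64, 14, 14])
```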

16 pages, 5099 KB  
Article
Semi-Interpenetrating Highly Conductive and Transparent Hydrogels for Wearable Sensors and Gesture-Driven Cryptography
by Dan Li, Hong Li, Yilin Wei, Lu Jiang, Hongqing Feng and Qiang Zheng
Micro 2025, 5(4), 53; https://doi.org/10.3390/micro5040053 - 23 Nov 2025
Viewed by 563
Abstract
Developing conductive hydrogels that balance high conductivity, stretchability, transparency, and sensitivity for next-generation wearable sensors remains challenging due to inherent trade-offs. This study introduces a straightforward approach to fabricate a semi-interpenetrating double-network hydrogel comprising polyvinyl alcohol (PVA), polyacrylamide (PAM), and lithium chloride (LiCl) to overcome these limitations. Leveraging hydrogen bonding for energy dissipation and chemical cross-linking for structural integrity, the design achieves robust mechanical properties. The incorporation of 1 mol/L LiCl significantly enhances ionic conductivity, while also providing plasticizing and moisture-retention benefits. The optimized hydrogel exhibits impressive ionic conductivity (0.47 S/m, 113% enhancement), excellent mechanical performance (e.g., 0.177 MPa tensile strength, 730% elongation, 0.68 MJ m⁻³ toughness), high transparency (>85%), and superior strain sensitivity (gauge factors ~1). It also demonstrates rapid response/recovery and robust fatigue resistance. Functioning as a wearable sensor, it reliably monitors diverse human activities and enables novel, secure data handling applications, such as finger-motion-driven Morse code interfaces and gesture-based password systems. This accessible fabrication method yields versatile hydrogels with promising applications in health tracking, interactive devices, and secure communication technologies. Full article
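As a loose sketch of the finger-motion Morse code application mentioned above, the snippet below classifies strain pulses as dots or dashes by duration and decodes letters; the duration thresholds and the pulse segmentation are assumptions, since the abstract only names the application.

```python
# Decode finger-bend pulses into Morse letters by pulse duration (assumed thresholds).
MORSE = {".-": "A", "-...": "B", "-.-.": "C", ".": "E", "...": "S", "---": "O"}

def decode_pulses(pulse_durations_s, gaps_s, dash_threshold=0.3):
    """pulse_durations_s[i] is the i-th bend duration; gaps_s[i] is the pause
    after it (a long pause > 0.7 s ends a letter)."""
    letters, symbol = [], ""
    for dur, gap in zip(pulse_durations_s, gaps_s):
        symbol += "-" if dur >= dash_threshold else "."
        if gap > 0.7:                       # letter boundary
            letters.append(MORSE.get(symbol, "?"))
            symbol = ""
    if symbol:
        letters.append(MORSE.get(symbol, "?"))
    return "".join(letters)

# "SOS": three short, three long, three short bends
print(decode_pulses([0.1] * 3 + [0.5] * 3 + [0.1] * 3,
                    [0.2, 0.2, 0.8, 0.2, 0.2, 0.8, 0.2, 0.2, 0.8]))
```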

13 pages, 245 KB  
Case Report
Noncontact Gesture-Based Switch Improves Communication Speed and Social Function in Advanced Duchenne Muscular Dystrophy: A Case Report
by Daisuke Nishida, Takafumi Kinoshita, Tatsuo Hayakawa, Takashi Nakajima, Yoko Kobayashi, Takatoshi Hara, Ikushi Yoda and Katsuhiro Mizuno
Healthcare 2025, 13(22), 2989; https://doi.org/10.3390/healthcare13222989 - 20 Nov 2025
Viewed by 495
Abstract
Augmentative and alternative communication (AAC) enables digital access for individuals with severe motor impairment. Conventional contact-based switches rely on residual voluntary movement, limiting efficiency. We report the clinical application of a novel, researcher-developed noncontact assistive switch, the Augmentative Alternative Gesture Interface (AAGI), in a 39-year-old male with late-stage Duchenne Muscular Dystrophy (DMD) retaining minimal motion. The AAGI converts subtle, noncontact gestures into digital inputs, enabling efficient computer operations. Before the intervention, the participant used a conventional mechanical switch, achieved 12 characters per minute (CPM) in a 2 min text entry task, and was unable to perform high-speed ICT tasks such as gaming or video editing. After 3 months of AAGI use, the input speed increased to 30 CPM (a 2.5-fold increase), and previously inaccessible tasks became feasible. The System Usability Scale (SUS) improved from 82.5 to 90.0, indicating enhanced usability, while the Short Form 36 (SF-36) Social Functioning (+13) and Mental Health (+4) scores demonstrated meaningful gains. Daily living activities remained stable. This case demonstrates that the AAGI system, developed by our group, can substantially enhance communication efficiency, usability, and social engagement in advanced DMD, highlighting its potential as a practical, patient-centered AAC solution that extends digital accessibility to individuals with severe motor disabilities. Full article
(This article belongs to the Special Issue Applications of Assistive Technologies in Health Care Practices)

3729 KB  
Proceeding Paper
A Smart Glove-Based System for Dynamic Sign Language Translation Using LSTM Networks
by Tabassum Kanwal, Saud Altaf, Rehan Mehmood Yousaf and Kashif Sattar
Eng. Proc. 2025, 118(1), 45; https://doi.org/10.3390/ECSA-12-26530 - 7 Nov 2025
Viewed by 332
Abstract
This research presents a novel, real-time Pakistani Sign Language (PSL) recognition system utilizing a custom-designed sensory glove integrated with advanced machine learning techniques. The system aims to bridge communication gaps for individuals with hearing and speech impairments by translating hand gestures into readable text. At the core of this work is a smart glove engineered with five resistive flex sensors for precise finger flexion detection and a 9-DOF Inertial Measurement Unit (IMU) for capturing hand orientation and movement. The glove is powered by a compact microcontroller, which processes the analog and digital sensor inputs and transmits the data wirelessly to a host computer. A rechargeable 3.7 V Li-Po battery ensures portability, while a dynamic dataset comprising both static alphabet gestures and dynamic PSL phrases was recorded using this setup. The collected data was used to train two models: a Support Vector Machine with feature extraction (SVM-FE) and a Long Short-Term Memory (LSTM) deep learning network. The LSTM model outperformed traditional methods, achieving an accuracy of 98.6% in real-time gesture recognition. The proposed system demonstrates robust performance and offers practical applications in smart home interfaces, virtual and augmented reality, gaming, and assistive technologies. By combining ergonomic hardware with intelligent algorithms, this research takes a significant step toward inclusive communication and more natural human–machine interaction. Full article
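A minimal PyTorch sketch of an LSTM classifier over glove sensor sequences (5 flex channels plus a 9-DOF IMU, i.e., 14 features per time step) is given below; the hidden size, layer count, and number of PSL classes are assumptions, not the paper's configuration.

```python
# Toy LSTM classifier over windowed glove sensor sequences.
import torch
import torch.nn as nn

class GloveLSTM(nn.Module):
    def __init__(self, n_features=14, hidden=64, n_classes=40):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):             # x: (batch, time, features)
        _, (h_n, _) = self.lstm(x)    # h_n: (layers, batch, hidden)
        return self.head(h_n[-1])     # logits from the last layer's final state

model = GloveLSTM()
batch = torch.randn(8, 100, 14)       # 8 gestures, 100 time steps each
print(model(batch).shape)             # torch.Size([8, 40])
```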

1949 KB  
Proceeding Paper
Gesture-Controlled Bionic Hand for Safe Handling of Biomedical Industrial Chemicals
by Sudarsun Gopinath, Glen Nitish, Daniel Ford, Thiyam Deepa Beeta and Shelishiyah Raymond
Eng. Proc. 2025, 118(1), 42; https://doi.org/10.3390/ECSA-12-26577 - 7 Nov 2025
Viewed by 129
Abstract
In the pharmaceutical and biomedical industries, manual handling of dangerous chemicals is a leading cause of hazardous exposure, toxic burns, and chemical contamination. To counteract these risks, we proposed a gesture-controlled bionic hand system that mimics human finger movements for safe, contactless chemical handling. The system uses an ESP32 microcontroller to decode hand gestures detected through computer vision with an integrated camera. A PWM servo driver converts these movements into motor commands, enabling accurate finger motion. Teflon and other corrosion-resistant materials are used to 3D print the bionic hand so that it withstands corrosive conditions. This new, low-cost, non-surgical approach replaces EMG sensors, provides real-time control, and enhances industrial and laboratory process safety. The project marks a significant step in applying robotics and AI to automation and risk reduction in hazardous environments. Full article
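The vision-to-servo step can be illustrated with a small mapping from an estimated finger flexion angle to a servo pulse width for a PWM driver; the 500–2500 microsecond range and 180-degree span below are typical hobby-servo values and are assumptions, not figures from the paper.

```python
# Map a finger flexion angle (from camera landmarks) to a servo pulse width.
def angle_to_pulse_us(angle_deg, min_us=500, max_us=2500, max_deg=180.0):
    angle = max(0.0, min(max_deg, angle_deg))        # clamp to the servo range
    return int(min_us + (angle / max_deg) * (max_us - min_us))

# Five finger angles (thumb..pinky) -> five pulse widths for PWM channels 0-4
fingers = [10, 45, 90, 135, 170]
print([angle_to_pulse_us(a) for a in fingers])
```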

2177 KB  
Proceeding Paper
Hand Gesture to Sound: A Real-Time DSP-Based Audio Modulation System for Assistive Interaction
by Laiba Khan, Hira Mariam, Marium Sajid, Aymen Khan and Zehra Fatima
Eng. Proc. 2025, 118(1), 27; https://doi.org/10.3390/ECSA-12-26516 - 7 Nov 2025
Viewed by 190
Abstract
This paper presents the design, development, and evaluation of an embedded hardware and digital signal processing (DSP)-based real-time gesture-controlled system. The system architecture utilizes an MPU6050 inertial measurement unit (IMU), an Arduino Uno microcontroller, and a Python-based audio interface to recognize and classify directional hand gestures and transform them into auditory commands. Wrist tilts (left, right, forward, and backward) are recognized using a hybrid algorithm that combines thresholding, moving average filtering, and low-pass smoothing to remove sensor noise and transient errors. The hardware setup uses I2C-based sensor acquisition, onboard preprocessing on the Arduino, and serial communication with a host computer running a Python script that triggers audio playback using the playsound library. Four gestures are programmed for basic needs: Hydration Request, Meal Support, Restroom Support, and Emergency Alarm. Experimental evaluation, conducted over more than 50 iterations per gesture in a controlled laboratory setup, resulted in a mean recognition rate of 92%, with system latency of 120–150 milliseconds. The approach requires minimal calibration, is low-cost, and offers low-latency performance comparable to more advanced camera-based or machine learning-based methods, making it suitable for portable assistive devices. Full article
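A simplified version of the thresholding-plus-smoothing pipeline described above is sketched below: a moving average over accelerometer samples followed by fixed tilt thresholds mapped to the four programmed needs. The window size, the 0.5 g thresholds, and the tilt-to-gesture mapping are assumptions for illustration.

```python
# Moving-average smoothing plus threshold-based tilt classification.
from collections import deque

WINDOW = 10          # moving-average window (samples), assumed
THRESH = 0.5         # tilt threshold in g, assumed

class TiltClassifier:
    def __init__(self):
        self.ax = deque(maxlen=WINDOW)
        self.ay = deque(maxlen=WINDOW)

    def update(self, ax_g, ay_g):
        """Feed one accelerometer sample (in g); return a gesture or None."""
        self.ax.append(ax_g)
        self.ay.append(ay_g)
        if len(self.ax) < WINDOW:
            return None                      # not enough samples yet
        mx = sum(self.ax) / WINDOW           # smoothed X tilt
        my = sum(self.ay) / WINDOW           # smoothed Y tilt
        if mx > THRESH:
            return "Hydration Request"       # right tilt (assumed mapping)
        if mx < -THRESH:
            return "Meal Support"            # left tilt
        if my > THRESH:
            return "Restroom Support"        # forward tilt
        if my < -THRESH:
            return "Emergency Alarm"         # backward tilt
        return None

clf = TiltClassifier()
for _ in range(WINDOW):
    gesture = clf.update(0.8, 0.0)
print(gesture)                               # -> "Hydration Request"
```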

21 pages, 1507 KB  
Article
Embodied Co-Creation with Real-Time Generative AI: An Ukiyo-E Interactive Art Installation
by Hisa Nimi, Meizhu Lu and Juan Carlos Chacon
Digital 2025, 5(4), 61; https://doi.org/10.3390/digital5040061 - 7 Nov 2025
Viewed by 2138
Abstract
Generative artificial intelligence (AI) is reshaping creative practices, yet many systems rely on traditional interfaces, limiting intuitive and embodied engagement. This study presents a qualitative observational analysis of participant interactions with a real-time generative AI installation designed to co-create Ukiyo-e-style artwork through embodied inputs. The system dynamically interprets physical presence, object manipulation, body poses, and gestures to influence AI-generated visuals displayed on a large public screen. Drawing on systematic video analysis and detailed interaction logs across 13 sessions, the research identifies core modalities of interaction, patterns of co-creation, and user responses. Tangible objects with salient visual features such as color and pattern emerged as the primary, most intuitive input method, while bodily poses and hand gestures served as compositional modifiers. The system’s immediate feedback loop enabled rapid learning and iterative exploration and enhanced the user’s feeling of control. Users engaged in collaborative discovery, turn-taking, and shared authorship, frequently expressing positive affect. The findings highlight how embodied interaction lowers cognitive barriers, enhances engagement, and supports meaningful human–AI collaboration. This study offers design implications for future creative AI systems, emphasizing accessibility, playful exploration, and cultural resonance, with the potential to democratize artistic expression and foster deeper public engagement with digital cultural heritage. Full article
(This article belongs to the Special Issue Advances in Semantic Multimedia and Personalized Digital Content)

12 pages, 890 KB  
Article
Control Modality and Accuracy on the Trust and Acceptance of Construction Robots
by Daeguk Lee, Donghun Lee, Jae Hyun Jung and Taezoon Park
Appl. Sci. 2025, 15(21), 11827; https://doi.org/10.3390/app152111827 - 6 Nov 2025
Viewed by 515
Abstract
This study investigates how control modalities and recognition accuracy influence construction workers’ trust and acceptance of collaborative robots. Sixty participants evaluated voice and gesture control under varying levels of recognition accuracy while performing tiling together with collaborative robots. Experimental results indicated that recognition accuracy significantly affected perceived enjoyment (PE, p = 0.010), ease of use (PEOU, p = 0.030), and intention to use (ITU, p = 0.022), but not trust, usefulness (PU), or attitude (ATT). Furthermore, the interaction between control modality and accuracy shaped most acceptance factors (PE, p = 0.049; PEOU, p = 0.006; PU, p = 0.006; ATT, p = 0.003, and ITU, p < 0.001) except trust. In general, high recognition accuracy enhanced user experience and adoption intentions. Voice interfaces were favored when recognition accuracy was high, whereas gesture interfaces were more acceptable under low-accuracy conditions. These findings highlight the importance of designing high-accuracy, task-appropriate interfaces to support technology acceptance in construction. The preference for voice interfaces under accurate conditions aligns with the noisy, fast-paced nature of construction sites, where efficiency is paramount. By contrast, gesture interfaces offer resilience when recognition errors occur. The study provides practical guidance for robot developers, interface designers, and construction managers, emphasizing that carefully matching interaction modalities and accuracy levels to on-site demands can improve acceptance and long-term adoption in this traditionally conservative sector. Full article
(This article belongs to the Special Issue Robot Control in Human–Computer Interaction)

22 pages, 4342 KB  
Article
Cloud-Based Personalized sEMG Classification Using Lightweight CNNs for Long-Term Haptic Communication in Deaf-Blind Individuals
by Kaavya Tatavarty, Maxwell Johnson and Boris Rubinsky
Bioengineering 2025, 12(11), 1167; https://doi.org/10.3390/bioengineering12111167 - 27 Oct 2025
Viewed by 858
Abstract
Deaf-blindness, particularly in progressive conditions such as Usher syndrome, presents profound challenges to communication, independence, and access to information. Existing tactile communication technologies for individuals with Usher syndrome are often limited by the need for close physical proximity to trained interpreters, typically requiring hand-to-hand contact. In this study, we introduce a novel, cloud-based, AI-assisted gesture recognition and haptic communication system designed for long-term use by individuals with Usher syndrome, whose auditory and visual abilities deteriorate with age. Central to our approach is a wearable haptic interface that relocates tactile input and output from the hands to an arm-mounted sleeve, thereby preserving manual dexterity and enabling continuous, bidirectional tactile interaction. The system uses surface electromyography (sEMG) to capture user-specific muscle activations in the hand and forearm and employs lightweight, personalized convolutional neural networks (CNNs), hosted on a centralized server, to perform real-time gesture classification. A key innovation of the system is its ability to adapt over time to each user’s evolving physiological condition, including the progressive loss of vision and hearing. Experimental validation using a public dataset, along with real-time testing involving seven participants, demonstrates that personalized models consistently outperform cross-user models in terms of accuracy, adaptability, and usability. This platform offers a scalable, longitudinally adaptable solution for non-visual communication and holds significant promise for advancing assistive technologies for the deaf-blind community. Full article
(This article belongs to the Section Biosignal Processing)
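In the spirit of the lightweight, personalized CNNs described above, the following PyTorch sketch shows a small 1-D CNN over windowed sEMG; the channel count, window length, and class count are assumptions rather than the paper's values.

```python
# Toy lightweight 1-D CNN for windowed sEMG classification.
import torch
import torch.nn as nn

class LightweightEMGNet(nn.Module):
    def __init__(self, n_channels=8, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),          # global pooling keeps it small
        )
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):                     # x: (batch, channels, samples)
        return self.head(self.features(x).squeeze(-1))

model = LightweightEMGNet()
window = torch.randn(4, 8, 400)               # 4 windows of 400 sEMG samples
print(model(window).shape)                    # torch.Size([4, 10])
```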

28 pages, 4508 KB  
Article
Mixed Reality-Based Multi-Scenario Visualization and Control in Automated Terminals: A Middleware and Digital Twin Driven Approach
by Yubo Wang, Enyu Zhang, Ang Yang, Keshuang Du and Jing Gao
Buildings 2025, 15(21), 3879; https://doi.org/10.3390/buildings15213879 - 27 Oct 2025
Viewed by 907
Abstract
This study presents a Digital Twin–Mixed Reality (DT–MR) framework for the immersive and interactive supervision of automated container terminals (ACTs), addressing the fragmented data and limited situational awareness of conventional 2D monitoring systems. The framework employs a middleware-centric architecture that integrates heterogeneous subsystems—covering terminal operation, equipment control, and information management—through standardized industrial communication protocols. It ensures synchronized timestamps and delivers semantically aligned, low-latency data streams to a multi-scale Digital Twin developed in Unity. The twin applies level-of-detail modeling, spatial anchoring, and coordinate alignment (from Industry Foundation Classes (IFCs) to east–north–up (ENU) coordinates and Unity space) for accurate registration with physical assets, while a Microsoft HoloLens 2 device provides an intuitive Mixed Reality interface that combines gaze, gesture, and voice commands with built-in safety interlocks for secure human–machine interaction. Quantitative performance benchmarks—latency ≤100 ms, status refresh ≤1 s, and throughput ≥10,000 events/s—were met through targeted engineering and validated using representative scenarios of quay crane alignment and automated guided vehicle (AGV) rerouting, demonstrating improved anomaly detection, reduced decision latency, and enhanced operational resilience. The proposed DT–MR pipeline establishes a reproducible and extensible foundation for real-time, human-in-the-loop supervision across ports, airports, and other large-scale smart infrastructures. Full article
(This article belongs to the Special Issue Digital Technologies, AI and BIM in Construction)
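One of the alignment steps named above, mapping east–north–up (ENU) offsets into Unity's left-handed frame, can be sketched as a simple axis remap plus an anchor offset; the anchor value below is an assumed site-calibration constant, not a figure from the paper.

```python
# ENU (right-handed: x=east, y=north, z=up) to Unity (x=right/east, y=up,
# z=forward/north) with a per-site anchor offset.
def enu_to_unity(east, north, up, anchor=(0.0, 0.0, 0.0)):
    """Return a Unity (x, y, z) position for an ENU offset in meters."""
    ax, ay, az = anchor                       # Unity position of the ENU origin
    return (ax + east, ay + up, az + north)   # swap axes: up -> y, north -> z

# An AGV 12 m east and 30 m north of the terminal's survey origin:
print(enu_to_unity(12.0, 30.0, 0.0, anchor=(5.0, 0.0, -2.0)))
```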

18 pages, 1175 KB  
Article
NAMI: A Neuro-Adaptive Multimodal Architecture for Wearable Human–Computer Interaction
by Christos Papakostas, Christos Troussas, Akrivi Krouska and Cleo Sgouropoulou
Multimodal Technol. Interact. 2025, 9(10), 108; https://doi.org/10.3390/mti9100108 - 18 Oct 2025
Cited by 1 | Viewed by 1382
Abstract
The increasing ubiquity of wearable computing and multimodal interaction technologies has created unprecedented opportunities for natural and seamless human–computer interaction. However, most existing systems adapt only to external user actions such as speech, gesture, or gaze, without considering internal cognitive or affective states. This limits their ability to provide intelligent and empathetic adaptations. This paper addresses this critical gap by proposing the Neuro-Adaptive Multimodal Architecture (NAMI), a principled, modular, and reproducible framework designed to integrate behavioral and neurophysiological signals in real time. NAMI combines multimodal behavioral inputs with lightweight EEG and peripheral physiological measurements to infer cognitive load and engagement and adapt the interface dynamically to optimize user experience. The architecture is formally specified as a three-layer pipeline encompassing sensing and acquisition, cognitive–affective state estimation, and adaptive interaction control, with clear data flows, mathematical formalization, and real-time performance on wearable platforms. A prototype implementation of NAMI was deployed in an augmented reality Java programming tutor for postgraduate informatics students, where it dynamically adjusted task difficulty, feedback modality, and assistance frequency based on inferred user state. Empirical evaluation with 100 participants demonstrated significant improvements in task performance, reduced subjective workload, and increased engagement and satisfaction, confirming the effectiveness of the neuro-adaptive approach. Full article
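The three-layer pipeline (sensing and acquisition, cognitive–affective state estimation, adaptive interaction control) can be sketched structurally as follows; the features, the toy load estimator, and the adaptation rule are placeholders, not NAMI's actual models.

```python
# Structural sketch of a sense -> estimate -> adapt pipeline.
from dataclasses import dataclass

@dataclass
class SensedFrame:
    eeg_theta_alpha_ratio: float   # assumed lightweight EEG feature
    heart_rate: float              # peripheral physiological signal (bpm)
    gaze_on_task: bool             # behavioral input

def estimate_cognitive_load(frame: SensedFrame) -> float:
    """Toy state estimator: squash a weighted sum into [0, 1]."""
    score = 0.6 * frame.eeg_theta_alpha_ratio + 0.4 * (frame.heart_rate / 120)
    if not frame.gaze_on_task:
        score += 0.2
    return max(0.0, min(1.0, score))

def adapt_interface(load: float) -> dict:
    """Toy adaptation policy over task difficulty and assistance frequency."""
    if load > 0.7:
        return {"difficulty": "easier", "hints": "frequent"}
    if load < 0.3:
        return {"difficulty": "harder", "hints": "rare"}
    return {"difficulty": "unchanged", "hints": "on request"}

frame = SensedFrame(eeg_theta_alpha_ratio=0.9, heart_rate=95, gaze_on_task=False)
print(adapt_interface(estimate_cognitive_load(frame)))
```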

14 pages, 1197 KB  
Article
An Inclusive Offline Learning Platform Integrating Gesture Recognition and Local AI Models
by Marius-Valentin Drăgoi, Ionuț Nisipeanu, Roxana-Adriana Puiu, Florentina-Geanina Tache, Teodora-Mihaela Spiridon-Mocioacă, Alexandru Hank and Cozmin Cristoiu
Biomimetics 2025, 10(10), 693; https://doi.org/10.3390/biomimetics10100693 - 14 Oct 2025
Viewed by 757
Abstract
This paper introduces a gesture-controlled conversational interface driven by a local AI model, aimed at improving accessibility and facilitating hands-free interaction within digital environments. The technology utilizes real-time hand gesture recognition via a typical laptop camera and connects with a local AI engine to produce customized learning materials. Users can peruse educational documents, obtain topic summaries, and generate automated quizzes with intuitive gestures, including lateral finger movements, a two-finger gesture, or an open palm, without the need for conventional input devices. Upon selection of a file, the AI model analyzes its whole content, producing a structured summary and a multiple-choice assessment, both of which are immediately saved for subsequent inspection. A unified set of gestures facilitates seamless navigation within the user interface and the opened documents. The system underwent testing with university students and faculty (n = 31), utilizing assessment measures such as gesture detection accuracy, command-response latency, and user satisfaction. The findings demonstrate that the system offers a seamless, hands-free user experience with significant potential for use in accessibility, human–computer interaction, and intelligent interface design. This work advances the creation of multimodal AI-driven educational aids, providing a pragmatic framework for gesture-based document navigation and intelligent content enhancement. Full article
(This article belongs to the Special Issue Biomimicry for Optimization, Control, and Automation: 3rd Edition)
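A gesture-to-command dispatcher of the kind implied above can be sketched as a simple lookup table from recognized gesture labels to navigation and AI actions; the gesture labels and handlers below are illustrative assumptions, not the platform's actual command set.

```python
# Dispatch recognized gesture labels to document-navigation or AI actions.
def next_page(state): state["page"] += 1
def prev_page(state): state["page"] = max(0, state["page"] - 1)
def summarize(state): state["last_action"] = "summary requested"
def make_quiz(state): state["last_action"] = "quiz requested"

DISPATCH = {
    "swipe_right": next_page,      # lateral finger movement (assumed label)
    "swipe_left": prev_page,
    "two_fingers": summarize,
    "open_palm": make_quiz,
}

def handle_gesture(label, state):
    action = DISPATCH.get(label)
    if action:
        action(state)
    return state

state = {"page": 0, "last_action": None}
for g in ["swipe_right", "swipe_right", "open_palm"]:
    handle_gesture(g, state)
print(state)   # {'page': 2, 'last_action': 'quiz requested'}
```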

19 pages, 20388 KB  
Article
Radar-Based Gesture Recognition Using Adaptive Top-K Selection and Multi-Stream CNNs
by Jiseop Park and Jaejin Jeong
Sensors 2025, 25(20), 6324; https://doi.org/10.3390/s25206324 - 13 Oct 2025
Viewed by 1063
Abstract
With the proliferation of the Internet of Things (IoT), gesture recognition has attracted attention as a core technology in human–computer interaction (HCI). In particular, mmWave frequency-modulated continuous-wave (FMCW) radar has emerged as an alternative to vision-based approaches due to its robustness to illumination changes and advantages in privacy. However, in real-world human–machine interface (HMI) environments, hand gestures are inevitably accompanied by torso- and arm-related reflections, which can also contain gesture-relevant variations. To effectively capture these variations without discarding them, we propose a preprocessing method called Adaptive Top-K Selection, which leverages vector entropy to summarize and preserve informative signals from both hand and body reflections. In addition, we present a Multi-Stream EfficientNetV2 architecture that jointly exploits temporal range and Doppler trajectories, together with radar-specific data augmentation and a training optimization strategy. In experiments on the publicly available FMCW gesture dataset released by the Karlsruhe Institute of Technology, the proposed method achieved an average accuracy of 99.5%. These results show that the proposed approach enables accurate and reliable gesture recognition even in realistic HMI environments with co-existing body reflections. Full article
(This article belongs to the Special Issue Sensor Technologies for Radar Detection)
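An entropy-guided top-K selection over radar range bins can be sketched with NumPy as below; the direction of the entropy criterion (keeping low-entropy, concentrated Doppler spectra) and the value of K are assumptions, and the paper's Adaptive Top-K Selection may differ in detail.

```python
# Keep the K range bins whose Doppler spectra are most concentrated (lowest entropy).
import numpy as np

def topk_range_bins(range_doppler, k=8):
    """range_doppler: (num_range_bins, num_doppler_bins) magnitude map."""
    power = np.abs(range_doppler) ** 2
    p = power / (power.sum(axis=1, keepdims=True) + 1e-12)   # per-bin pmf
    entropy = -(p * np.log(p + 1e-12)).sum(axis=1)           # spectral entropy
    keep = np.sort(np.argsort(entropy)[:k])   # lowest-entropy (most structured) bins
    return keep, range_doppler[keep]

rd_map = np.random.rand(64, 128)
rd_map[20, 60] += 50.0                        # a strong, concentrated reflection
idx, selected = topk_range_bins(rd_map)
print(idx.shape, selected.shape)              # (8,) (8, 128)
```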
