Search Results (188)

Search Parameters:
Keywords = real-time gesture recognition

26 pages, 3647 KB  
Article
Study on Auxiliary Rehabilitation System of Hand Function Based on Machine Learning with Visual Sensors
by Yuqiu Zhang and Guanjun Bao
Sensors 2026, 26(3), 793; https://doi.org/10.3390/s26030793 - 24 Jan 2026
Viewed by 278
Abstract
This study aims to assess hand function recovery in stroke patients during the mid-to-late Brunnstrom stages and to encourage active participation in rehabilitation exercises. To this end, a deep residual network (ResNet) integrated with Focal Loss is employed for gesture recognition, achieving a Macro F1 score of 91.0% and a validation accuracy of 90.9%. Leveraging the millimetre-level precision of Leap Motion 2 hand tracking, a mapping relationship for hand skeletal joint points was established, and a static assessment gesture data set containing 502,401 frames was collected through analysis of the FMA scale. The system implements an immersive augmented reality interaction through the Unity development platform; C# algorithms were designed for real-time motion range quantification. Finally, the paper designs a rehabilitation system framework tailored for home and community environments, including system module workflows, assessment modules, and game logic. Experimental results demonstrate the technical feasibility and high accuracy of the automated system for assessment and rehabilitation training. The system is designed to support stroke patients in home and community settings, with the potential to enhance rehabilitation motivation, interactivity, and self-efficacy. This work presents an integrated research framework encompassing hand modelling and deep learning-based recognition. It offers the possibility of feasible and economical solutions for stroke survivors, laying the foundation for future clinical applications. Full article
(This article belongs to the Section Biomedical Sensors)
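
The combination of a ResNet backbone with Focal Loss described above can be reproduced in outline with standard PyTorch components. The sketch below is an illustration under assumed settings (ResNet-18 backbone, 10 gesture classes, default gamma/alpha), not the authors' implementation.

```python
# Minimal sketch (not the authors' code): focal loss on top of a torchvision
# ResNet, as one plausible way to combine "ResNet + Focal Loss" for gesture
# classification. Class count and gamma/alpha values are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18

class FocalLoss(nn.Module):
    def __init__(self, gamma: float = 2.0, alpha: float = 0.25):
        super().__init__()
        self.gamma = gamma
        self.alpha = alpha

    def forward(self, logits, targets):
        ce = F.cross_entropy(logits, targets, reduction="none")  # per-sample CE
        pt = torch.exp(-ce)                                      # prob. of the true class
        return (self.alpha * (1.0 - pt) ** self.gamma * ce).mean()

num_gestures = 10                          # hypothetical number of assessment gestures
model = resnet18(weights=None)
model.fc = nn.Linear(model.fc.in_features, num_gestures)
criterion = FocalLoss()

x = torch.randn(4, 3, 224, 224)            # dummy batch of hand images
loss = criterion(model(x), torch.randint(0, num_gestures, (4,)))
loss.backward()
```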

32 pages, 4599 KB  
Article
Adaptive Assistive Technologies for Learning Mexican Sign Language: Design of a Mobile Application with Computer Vision and Personalized Educational Interaction
by Carlos Hurtado-Sánchez, Ricardo Rosales Cisneros, José Ricardo Cárdenas-Valdez, Andrés Calvillo-Téllez and Everardo Inzunza-Gonzalez
Future Internet 2026, 18(1), 61; https://doi.org/10.3390/fi18010061 - 21 Jan 2026
Viewed by 118
Abstract
Integrating people with hearing disabilities into schools is one of the biggest problems that Latin American societies face. Mexican Sign Language (MSL) is the main language and culture of the deaf community in Mexico. However, its use in formal education is still limited by structural inequalities, a lack of qualified interpreters, and a lack of technology that can support personalized instruction. This study outlines the conceptualization and development of a mobile application designed as an adaptive assistive technology for learning MSL, utilizing a combination of computer vision techniques, deep learning algorithms, and personalized pedagogical interaction. The proposed system uses convolutional neural networks (CNNs) and pose-estimation models to recognize hand gestures in real time with 95.7% accuracy and gives the learner instant feedback. A dynamic learning engine automatically adjusts the difficulty level based on the learner's performance, helping them acquire signs and phrases over time. The Scrum agile methodology was used during development, with educators, linguists, and members of the deaf community working together on the design. Early tests show favorable sign recognition accuracy and appropriate levels of user engagement and motivation. Beyond its technical contributions, this proposal aims to enhance inclusive digital ecosystems and foster linguistic equity in Mexican education through scalable, mobile, and culturally relevant technologies. Full article
(This article belongs to the Special Issue Machine Learning Techniques for Computer Vision—2nd Edition)
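
The "dynamic learning engine" is described only at a high level; the following toy sketch shows one plausible rule-based reading of it, where a rolling recognition-accuracy window steps the lesson difficulty up or down. Level names, window size, and thresholds are all assumptions.

```python
# Illustrative sketch only: a tiny rule-based "dynamic learning engine" that
# adjusts lesson difficulty from recent recognition accuracy, mirroring the
# adaptive behaviour described in the abstract. Thresholds are assumptions.
from collections import deque

class DifficultyEngine:
    LEVELS = ["letters", "words", "phrases"]

    def __init__(self, window: int = 10):
        self.results = deque(maxlen=window)   # rolling window of pass/fail outcomes
        self.level = 0

    def record(self, recognized_correctly: bool) -> str:
        self.results.append(recognized_correctly)
        accuracy = sum(self.results) / len(self.results)
        if accuracy > 0.85 and self.level < len(self.LEVELS) - 1:
            self.level += 1                   # learner is doing well: step up
        elif accuracy < 0.50 and self.level > 0:
            self.level -= 1                   # learner is struggling: step down
        return self.LEVELS[self.level]

engine = DifficultyEngine()
for ok in [True] * 9:
    current = engine.record(ok)
print(current)   # -> "phrases" after a sustained run of correct recognitions
```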

27 pages, 11232 KB  
Article
Aerokinesis: An IoT-Based Vision-Driven Gesture Control System for Quadcopter Navigation Using Deep Learning and ROS2
by Sergei Kondratev, Yulia Dyrchenkova, Georgiy Nikitin, Leonid Voskov, Vladimir Pikalov and Victor Meshcheryakov
Technologies 2026, 14(1), 69; https://doi.org/10.3390/technologies14010069 - 16 Jan 2026
Viewed by 314
Abstract
This paper presents Aerokinesis, an IoT-based software–hardware system for intuitive gesture-driven control of quadcopter unmanned aerial vehicles (UAVs), developed within the Robot Operating System 2 (ROS2) framework. The proposed system addresses the challenge of providing an accessible human–drone interaction interface for operators in scenarios where traditional remote controllers are impractical or unavailable. The architecture comprises two hierarchical control levels: (1) high-level discrete command control utilizing a fully connected neural network classifier for static gesture recognition, and (2) low-level continuous flight control based on three-dimensional hand keypoint analysis from a depth camera. The gesture classification module achieves an accuracy exceeding 99% using a multi-layer perceptron trained on MediaPipe-extracted hand landmarks. For continuous control, we propose a novel approach that computes Euler angles (roll, pitch, yaw) and throttle from 3D hand pose estimation, enabling intuitive four-degree-of-freedom quadcopter manipulation. A hybrid signal filtering pipeline ensures robust control signal generation while maintaining real-time responsiveness. Comparative user studies demonstrate that gesture-based control reduces task completion time by 52.6% for beginners compared to conventional remote controllers. The results confirm the viability of vision-based gesture interfaces for IoT-enabled UAV applications. Full article
(This article belongs to the Section Information and Communication Technologies)
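
For the low-level continuous control, roll, pitch, and yaw can be derived from a few 3D hand keypoints by building a palm-fixed coordinate frame. The sketch below is an assumed reconstruction of that idea (keypoint choice and angle convention are illustrative, not taken from the paper).

```python
# A sketch under stated assumptions (not the Aerokinesis implementation):
# derive roll/pitch/yaw for continuous control from three 3D hand keypoints
# (wrist, index-finger MCP, pinky MCP) by building an orthonormal palm frame.
import numpy as np

def palm_euler_angles(wrist, index_mcp, pinky_mcp):
    """Return (roll, pitch, yaw) in radians from three 3D points."""
    x_axis = index_mcp - pinky_mcp                 # across the palm
    forward = (index_mcp + pinky_mcp) / 2 - wrist  # wrist -> knuckles
    z_axis = np.cross(x_axis, forward)             # palm normal
    y_axis = np.cross(z_axis, x_axis)
    R = np.column_stack([v / np.linalg.norm(v) for v in (x_axis, y_axis, z_axis)])
    # ZYX (yaw-pitch-roll) extraction from the rotation matrix
    pitch = -np.arcsin(R[2, 0])
    roll = np.arctan2(R[2, 1], R[2, 2])
    yaw = np.arctan2(R[1, 0], R[0, 0])
    return roll, pitch, yaw

wrist = np.array([0.0, 0.0, 0.0])
index_mcp = np.array([0.04, 0.09, 0.0])
pinky_mcp = np.array([-0.04, 0.08, 0.0])
print(palm_euler_angles(wrist, index_mcp, pinky_mcp))
```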

27 pages, 4631 KB  
Article
Multimodal Minimal-Angular-Geometry Representation for Real-Time Dynamic Mexican Sign Language Recognition
by Gerardo Garcia-Gil, Gabriela del Carmen López-Armas and Yahir Emmanuel Ramirez-Pulido
Technologies 2026, 14(1), 48; https://doi.org/10.3390/technologies14010048 - 8 Jan 2026
Viewed by 309
Abstract
Current approaches to dynamic sign language recognition commonly rely on dense landmark representations, which impose high computational cost and hinder real-time deployment on resource-constrained devices. To address this limitation, this work proposes a computationally efficient framework for real-time dynamic Mexican Sign Language (MSL) recognition based on a multimodal minimal angular-geometry representation. Instead of processing complete landmark sets (e.g., MediaPipe Holistic with up to 468 keypoints), the proposed method encodes the relational geometry of the hands, face, and upper body into a compact set of 28 invariant internal angular descriptors. This representation substantially reduces feature dimensionality and computational complexity while preserving linguistically relevant manual and non-manual information required for grammatical and semantic discrimination in MSL. A real-time end-to-end pipeline is developed, comprising multimodal landmark extraction, angular feature computation, and temporal modeling using a Bidirectional Long Short-Term Memory (BiLSTM) network. The system is evaluated on a custom dataset of dynamic MSL gestures acquired under controlled real-time conditions. Experimental results demonstrate that the proposed approach achieves 99% accuracy and 99% macro F1-score, matching state-of-the-art performance while using dramatically fewer features. The compactness, interpretability, and efficiency of the minimal angular descriptor make the proposed system suitable for real-time deployment on low-cost devices, contributing toward more accessible and inclusive sign language recognition technologies. Full article
(This article belongs to the Special Issue Image Analysis and Processing)
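
The core idea, replacing dense landmark sets with a handful of internal joint angles fed to a BiLSTM, can be illustrated as follows. This is a hedged sketch: only the feature count (28) comes from the abstract; sequence length, hidden size, and class count are placeholders.

```python
# Minimal sketch, not the published pipeline: compute an internal joint angle
# from three landmarks and feed a sequence of such angular descriptors to a
# bidirectional LSTM classifier.
import numpy as np
import torch
import torch.nn as nn

def joint_angle(a, b, c):
    """Internal angle at point b formed by segments b->a and b->c (radians)."""
    v1, v2 = a - b, c - b
    cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.arccos(np.clip(cosang, -1.0, 1.0))

class BiLSTMClassifier(nn.Module):
    def __init__(self, n_features=28, n_classes=20, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                 # x: (batch, frames, 28 angles)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])      # classify from the last time step

seq = torch.randn(1, 30, 28)              # 30 frames of 28 angular descriptors
print(BiLSTMClassifier()(seq).shape)      # torch.Size([1, 20])
```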

24 pages, 15172 KB  
Article
Real-Time Hand Gesture Recognition for IoT Devices Using FMCW mmWave Radar and Continuous Wavelet Transform
by Anna Ślesicka and Adam Kawalec
Electronics 2026, 15(2), 250; https://doi.org/10.3390/electronics15020250 - 6 Jan 2026
Viewed by 364
Abstract
This paper presents an intelligent framework for real-time hand gesture recognition using Frequency-Modulated Continuous-Wave (FMCW) mmWave radar and deep learning. Unlike traditional radar-based recognition methods that rely on Discrete Fourier Transform (DFT) signal representations and focus primarily on classifier optimization, the proposed system introduces a novel pre-processing stage based on the Continuous Wavelet Transform (CWT). The CWT enables the extraction of discriminative time–frequency features directly from raw radar signals, improving the interpretability and robustness of the learned representations. A lightweight convolutional neural network architecture is then designed to process the CWT maps for efficient classification on edge IoT devices. Experimental validation with data collected from 20 participants performing five standardized gestures demonstrates that the proposed framework achieves an accuracy of up to 99.87% using the Morlet wavelet, with strong generalization to unseen users (82–84% accuracy). The results confirm that the integration of CWT-based radar signal processing with deep learning forms a computationally efficient and accurate intelligent system for human–computer interaction in real-time IoT environments. Full article
(This article belongs to the Special Issue Convolutional Neural Networks and Vision Applications, 4th Edition)
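
The CWT pre-processing stage maps each raw radar return onto a Morlet time-frequency scalogram before classification. A minimal sketch with PyWavelets is shown below; the sampling rate, scale range, and synthetic test signal are assumptions standing in for real FMCW data.

```python
# Illustrative sketch (not the authors' pipeline): turn a raw 1-D radar
# beat/Doppler signal into a Morlet CWT time-frequency map that a small CNN
# could consume. The synthetic chirp stands in for real FMCW returns.
import numpy as np
import pywt

fs = 2000                                         # assumed sampling rate [Hz]
t = np.arange(0, 1.0, 1 / fs)
signal = np.cos(2 * np.pi * (50 + 150 * t) * t)   # toy micro-Doppler-like sweep

scales = np.arange(1, 128)
coeffs, freqs = pywt.cwt(signal, scales, "morl", sampling_period=1 / fs)

cwt_map = np.abs(coeffs)                          # magnitude scalogram
cwt_map = (cwt_map - cwt_map.min()) / (np.ptp(cwt_map) + 1e-9)   # normalize to [0, 1]
print(cwt_map.shape)                              # (127, 2000): scales x time samples
```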

24 pages, 3319 KB  
Article
NovAc-DL: Novel Activity Recognition Based on Deep Learning in the Real-Time Environment
by Saksham Singla, Sheral Singla, Karan Singla, Priya Kansal, Sachin Kansal, Alka Bishnoi and Jyotindra Narayan
Big Data Cogn. Comput. 2026, 10(1), 11; https://doi.org/10.3390/bdcc10010011 - 29 Dec 2025
Viewed by 415
Abstract
Real-time fine-grained human activity recognition (HAR) remains a challenging problem due to rapid spatial–temporal variations, subtle motion differences, and dynamic environmental conditions. Addressing this difficulty, we propose NovAc-DL, a unified deep learning framework designed to accurately classify short human-like actions, specifically, “pour” and “stir” from sequential video data. The framework integrates adaptive time-distributed convolutional encoding with temporal reasoning modules to enable robust recognition under realistic robotic-interaction conditions. A balanced dataset of 2000 videos was curated and processed through a consistent spatiotemporal pipeline. Three architectures, LRCN, CNN-TD, and ConvLSTM, were systematically evaluated. CNN-TD achieved the best performance, reaching 98.68% accuracy with the lowest test loss (0.0236), outperforming the other models in convergence speed, generalization, and computational efficiency. Grad-CAM visualizations further confirm that NovAc-DL reliably attends to motion-salient regions relevant to pouring and stirring gestures. These results establish NovAc-DL as a high-precision real-time-capable solution for deployment in healthcare monitoring, industrial automation, and collaborative robotics. Full article
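
The best-performing CNN-TD variant applies the same 2D convolutional encoder to every frame and then aggregates over time. The following sketch shows that time-distributed pattern in PyTorch with assumed layer sizes; it is not the NovAc-DL architecture itself.

```python
# A minimal sketch of the "time-distributed CNN" idea (assumed details, not the
# NovAc-DL code): apply one 2-D convolutional encoder to every frame by folding
# time into the batch dimension, then pool over time to classify "pour" vs. "stir".
import torch
import torch.nn as nn

class TimeDistributedCNN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),      # -> (N, 32)
        )
        self.head = nn.Linear(32, n_classes)

    def forward(self, clip):                  # clip: (batch, frames, 3, H, W)
        b, t = clip.shape[:2]
        feats = self.encoder(clip.flatten(0, 1))        # (b*t, 32)
        feats = feats.view(b, t, -1).mean(dim=1)        # average over time
        return self.head(feats)

clip = torch.randn(2, 8, 3, 64, 64)           # 2 clips of 8 RGB frames each
print(TimeDistributedCNN()(clip).shape)       # torch.Size([2, 2])
```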

31 pages, 9303 KB  
Article
Automatic Quadrotor Dispatch Missions Based on Air-Writing Gesture Recognition
by Pu-Sheng Tsai, Ter-Feng Wu and Yen-Chun Wang
Processes 2025, 13(12), 3984; https://doi.org/10.3390/pr13123984 - 9 Dec 2025
Viewed by 543
Abstract
This study develops an automatic dispatch system for quadrotor UAVs that integrates air-writing gesture recognition with a graphical user interface (GUI). The DJI RoboMaster quadrotor UAV (DJI, Shenzhen, China) was employed as the experimental platform, combined with an ESP32 microcontroller (Espressif Systems, Shanghai, China) and the RoboMaster SDK (version 3.0). On the Python (version 3.12.7) platform, a GUI was implemented using Tkinter (version 8.6), allowing users to input addresses or landmarks, which were then automatically converted into geographic coordinates and imported into Google Maps for route planning. The generated flight commands were transmitted to the UAV via a UDP socket, enabling remote autonomous flight. For gesture recognition, a Raspberry Pi integrated with the MediaPipe Hands module was used to capture 16 types of air-written flight commands in real time through a camera. The training samples were categorized into one-dimensional coordinates and two-dimensional images. In the one-dimensional case, X/Y axis coordinates were concatenated after data augmentation, interpolation, and normalization. In the two-dimensional case, three types of images were generated, namely font trajectory plots (T-plots), coordinate-axis plots (XY-plots), and composite plots combining the two (XYT-plots). To evaluate classification performance, several machine learning and deep learning architectures were employed, including a multi-layer perceptron (MLP), support vector machine (SVM), one-dimensional convolutional neural network (1D-CNN), and two-dimensional convolutional neural network (2D-CNN). The results demonstrated effective recognition accuracy across different models and sample formats, verifying the feasibility of the proposed air-writing trajectory framework for non-contact gesture-based UAV control. Furthermore, by combining gesture recognition with a GUI-based map planning interface, the system enhances the intuitiveness and convenience of UAV operation. Future extensions, such as incorporating aerial image object recognition, could extend the framework’s applications to scenarios including forest disaster management, vehicle license plate recognition, and air pollution monitoring. Full article
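
The one-dimensional sample format (interpolated, normalized X/Y coordinates concatenated into a single vector) can be sketched as below. The resampling length and the toy fingertip trajectory are illustrative assumptions.

```python
# Hedged sketch of the 1-D sample format described above: interpolate an
# air-written fingertip trajectory to a fixed length, normalize each axis, and
# concatenate X and Y. The target length of 64 points is an assumption.
import numpy as np

def trajectory_to_feature(xs, ys, n_points=64):
    """Resample a variable-length (x, y) trajectory into one 1-D feature vector."""
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    src = np.linspace(0.0, 1.0, len(xs))
    dst = np.linspace(0.0, 1.0, n_points)
    xi, yi = np.interp(dst, src, xs), np.interp(dst, src, ys)
    # min-max normalize each axis so absolute position and scale do not matter
    norm = lambda v: (v - v.min()) / (np.ptp(v) + 1e-9)
    return np.concatenate([norm(xi), norm(yi)])          # shape (128,)

# toy circular stroke traced by an index fingertip
theta = np.linspace(0, 2 * np.pi, 40)
feature = trajectory_to_feature(200 + 50 * np.cos(theta), 150 + 50 * np.sin(theta))
print(feature.shape)                                      # (128,)
```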

22 pages, 1145 KB  
Article
TSMTFN: Two-Stream Temporal Shift Module Network for Efficient Egocentric Gesture Recognition in Virtual Reality
by Muhammad Abrar Hussain, Chanjun Chun and SeongKi Kim
Virtual Worlds 2025, 4(4), 58; https://doi.org/10.3390/virtualworlds4040058 - 4 Dec 2025
Viewed by 456
Abstract
Egocentric hand gesture recognition is vital for natural human–computer interaction in augmented and virtual reality (AR/VR) systems. However, most deep learning models struggle to balance accuracy and efficiency, limiting real-time use on wearable devices. This paper introduces a Two-Stream Temporal Shift Module Transformer Fusion Network (TSMTFN) that achieves high recognition accuracy with low computational cost. The model integrates Temporal Shift Modules (TSMs) for efficient motion modeling and a Transformer-based fusion mechanism for long-range temporal understanding, operating on dual RGB-D streams to capture complementary visual and depth cues. Training stability and generalization are enhanced through full-layer training from epoch 1 and MixUp/CutMix augmentations. Evaluated on the EgoGesture dataset, TSMTFN attained 96.18% top-1 accuracy and 99.61% top-5 accuracy on the independent test set with only 16 GFLOPs and 21.3M parameters, offering a 2.4–4.7× reduction in computation compared to recent state-of-the-art methods. The model runs at 15.10 samples/s, achieving real-time performance. The results demonstrate robust recognition across over 95% of gesture classes and minimal inter-class confusion, establishing TSMTFN as an efficient, accurate, and deployable solution for next-generation wearable AR/VR gesture interfaces. Full article
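
The Temporal Shift Module at the heart of TSM-style models moves a fraction of feature channels one step along the time axis at essentially zero extra computation. The function below is a generic re-implementation of that operation, not the TSMTFN code.

```python
# A compact sketch of the temporal-shift idea used by TSM-style models (an
# assumed re-implementation): shift 1/8 of the channels one step backward and
# 1/8 one step forward along the time axis before the usual 2-D convolutions.
import torch

def temporal_shift(x: torch.Tensor, shift_div: int = 8) -> torch.Tensor:
    """x: (batch, frames, channels, H, W) -> same shape with shifted channels."""
    b, t, c, h, w = x.shape
    fold = c // shift_div
    out = torch.zeros_like(x)
    out[:, :-1, :fold] = x[:, 1:, :fold]                     # shift backward in time
    out[:, 1:, fold:2 * fold] = x[:, :-1, fold:2 * fold]     # shift forward in time
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]                # remaining channels unchanged
    return out

clip = torch.randn(2, 8, 16, 32, 32)
print(temporal_shift(clip).shape)                            # torch.Size([2, 8, 16, 32, 32])
```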

16 pages, 1701 KB  
Article
Research on YOLOv5s-Based Multimodal Assistive Gesture and Micro-Expression Recognition with Speech Synthesis
by Xiaohua Li and Chaiyan Jettanasen
Computation 2025, 13(12), 277; https://doi.org/10.3390/computation13120277 - 1 Dec 2025
Viewed by 452
Abstract
Effective communication between deaf–mute and visually impaired individuals remains a challenge in the fields of human–computer interaction and accessibility technology. Current solutions mostly rely on single-modal recognition, which often leads to issues such as semantic ambiguity and loss of emotional information. To address these challenges, this study proposes a lightweight multimodal fusion framework that combines gestures and micro-expressions, which are then processed through a recognition network and a speech synthesis module. The core innovations of this research are as follows: (1) a lightweight YOLOv5s improvement structure that integrates residual modules and efficient downsampling modules, which reduces the model complexity and computational overhead while maintaining high accuracy; (2) a multimodal fusion method based on an attention mechanism, which adaptively and efficiently integrates complementary information from gestures and micro-expressions, significantly improving the semantic richness and accuracy of joint recognition; (3) an end-to-end real-time system that outputs the visual recognition results through a high-quality text-to-speech module, completing the closed-loop from “visual signal” to “speech feedback”. We conducted evaluations on the publicly available hand gesture dataset HaGRID and a curated micro-expression image dataset. The results show that, for the joint gesture and micro-expression tasks, our proposed multimodal recognition system achieves a multimodal joint recognition accuracy of 95.3%, representing a 4.5% improvement over the baseline model. The system was evaluated in a locally deployed environment, achieving a real-time processing speed of 22 FPS, with a speech output latency below 0.8 s. The mean opinion score (MOS) reached 4.5, demonstrating the effectiveness of the proposed approach in breaking communication barriers between the hearing-impaired and visually impaired populations. Full article
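
The attention-based fusion of gesture and micro-expression features is described only at the block level; one plausible minimal form is a learned two-way gate over the two feature vectors, as sketched below with assumed feature dimensions and class count.

```python
# Illustrative sketch only (dimensions and gating scheme are assumptions): a
# small attention gate that weights gesture features against micro-expression
# features before a joint classifier, in the spirit of the fusion module above.
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    def __init__(self, dim=256, n_classes=30):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, 2), nn.Softmax(dim=-1))
        self.classifier = nn.Linear(dim, n_classes)

    def forward(self, gesture_feat, expression_feat):
        w = self.gate(torch.cat([gesture_feat, expression_feat], dim=-1))
        fused = w[:, :1] * gesture_feat + w[:, 1:] * expression_feat
        return self.classifier(fused)

g = torch.randn(4, 256)    # features from the gesture branch (e.g., a detector neck)
e = torch.randn(4, 256)    # features from the micro-expression branch
print(AttentionFusion()(g, e).shape)     # torch.Size([4, 30])
```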

21 pages, 4379 KB  
Article
ReHAb Playground: A DL-Based Framework for Game-Based Hand Rehabilitation
by Samuele Rasetto, Giorgia Marullo, Ludovica Adamo, Federico Bordin, Francesca Pavesi, Chiara Innocente, Enrico Vezzetti and Luca Ulrich
Future Internet 2025, 17(11), 522; https://doi.org/10.3390/fi17110522 - 17 Nov 2025
Viewed by 995
Abstract
Hand rehabilitation requires consistent, repetitive exercises that can often reduce patient motivation, especially in home-based therapy. This study introduces ReHAb Playground, a deep learning-based system that merges real-time gesture recognition with 3D hand tracking to create an engaging and adaptable rehabilitation experience built in the Unity Game Engine. The system utilizes a YOLOv10n model for hand gesture classification and MediaPipe Hands for 3D hand landmark extraction. Three mini-games were developed to target specific motor and cognitive functions: Cube Grab, Coin Collection, and Simon Says. Key gameplay parameters, namely repetitions, time limits, and gestures, can be tuned according to therapeutic protocols. Experiments with healthy participants were conducted to establish reference performance ranges based on average completion times and standard deviations. The results showed a consistent decrease in both task completion and gesture times across trials, indicating learning effects and improved control of gesture-based interactions. The most pronounced improvement was observed in the more complex Coin Collection task, confirming the system’s ability to support skill acquisition and engagement in rehabilitation-oriented activities. ReHAb Playground was conceived with modularity and scalability at its core, enabling the seamless integration of additional exercises, gesture libraries, and adaptive difficulty mechanisms. While preliminary, the findings highlight its promise as an accessible, low-cost rehabilitation platform suitable for home use, capable of monitoring motor progress over time and enhancing patient adherence through engaging, game-based interactions. Future developments will focus on clinical validation with patient populations and the implementation of adaptive feedback strategies to further personalize the rehabilitation process. Full article
(This article belongs to the Special Issue Advances in Deep Learning and Next-Generation Internet Technologies)
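
The 3D hand-landmark stream that drives the mini-games comes from MediaPipe Hands, whose Python API returns 21 normalized (x, y, z) landmarks per detected hand. The snippet below shows that extraction step in isolation; how ReHAb Playground forwards the landmarks to Unity is not reproduced here.

```python
# A hedged sketch of the landmark step (not the ReHAb Playground source):
# MediaPipe Hands returns 21 normalized (x, y, z) landmarks per detected hand.
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.5)
capture = cv2.VideoCapture(0)             # default webcam

ok, frame = capture.read()
if ok:
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        lm = results.multi_hand_landmarks[0].landmark
        wrist, index_tip = lm[0], lm[8]   # landmark indices from the MediaPipe spec
        print(f"wrist=({wrist.x:.2f}, {wrist.y:.2f}) "
              f"index_tip=({index_tip.x:.2f}, {index_tip.y:.2f})")

capture.release()
hands.close()
```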

17 pages, 1775 KB  
Article
Simplifying Prediction of Intended Grasp Type: Accelerometry Performs Comparably to Combined EMG-Accelerometry in Individuals With and Without Amputation
by Samira Afshari, Rachel V. Vitali and Deema Totah
Sensors 2025, 25(22), 6984; https://doi.org/10.3390/s25226984 - 15 Nov 2025
Viewed by 535
Abstract
The adoption of active upper-limb prostheses with multiple degrees of freedom is largely lagging due to bulky designs and counterintuitive operation. Accurate gesture prediction with minimal sensors is key to enabling low-profile, user-friendly prosthetic devices. Wearable sensors, such as electromyography (EMG) and accelerometry (ACC) sensors, provide valuable signals for identifying patterns relating muscle activity and arm movement to specific gestures. This study investigates which sensor type (EMG or ACC) has the most valuable information to predict hand grasps and identifies the signal features contributing the most to grasp prediction performance. Using an open-source dataset, we trained two types of subject-specific classifiers (LDA & KNN) to predict 10 grasp types in 13 individuals with and 28 individuals without amputation. With 4-fold cross-validation, average LDA accuracies using ACC-only features (84.7%) were similar to those using combined ACC & EMG (88.3%) and much greater than those using EMG features alone (58.1%). Feature importance analysis showed that participants with amputation reached more than 80% accuracy using only three features, two of which were ACC-derived, while able-bodied participants required nine features, with greater reliance on EMG. These findings suggest that ACC is sufficient for robust grasp classification in individuals with amputation and can support simpler, more accessible prosthetic designs. Future work should focus on incorporating object and grip force detection alongside grasp recognition and testing model performance in real-time prosthetic control settings. Full article
(This article belongs to the Section Wearables)
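
The evaluation protocol, subject-specific LDA with 4-fold cross-validation comparing ACC-only against combined ACC & EMG features, maps directly onto scikit-learn. The sketch below uses synthetic features in place of the open-source dataset, so the printed accuracies are meaningless; only the protocol is illustrated.

```python
# A minimal sketch of the evaluation protocol described above (synthetic data
# in place of the open-source EMG/ACC dataset): LDA with 4-fold cross-validation
# comparing ACC-only features to combined ACC + EMG features.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_acc, n_emg = 200, 12, 16          # assumed trial and feature counts
acc_feats = rng.normal(size=(n_trials, n_acc))
emg_feats = rng.normal(size=(n_trials, n_emg))
labels = rng.integers(0, 10, size=n_trials)   # 10 grasp types

for name, X in [("ACC only", acc_feats),
                ("ACC + EMG", np.hstack([acc_feats, emg_feats]))]:
    scores = cross_val_score(LinearDiscriminantAnalysis(), X, labels, cv=4)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```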

22 pages, 2736 KB  
Article
Radar Foot Gesture Recognition with Hybrid Pruned Lightweight Deep Models
by Eungang Son, Seungeon Song, Bong-Seok Kim, Sangdong Kim and Jonghun Lee
Signals 2025, 6(4), 66; https://doi.org/10.3390/signals6040066 - 13 Nov 2025
Viewed by 683
Abstract
Foot gesture recognition using a continuous-wave (CW) radar requires implementation on edge hardware with strict latency and memory budgets. Existing structured and unstructured pruning pipelines rely on iterative training–pruning–retraining cycles, increasing search costs and making them significantly time-consuming. We propose a NAS-guided bisection hybrid pruning framework on foot gesture recognition from a continuous-wave (CW) radar, which employs a weighted shared supernet encompassing both block and channel options. The method consists of three major steps. In the bisection-guided NAS structured pruning stage, the algorithm identifies the minimum number of retained blocks—or equivalently, the maximum achievable sparsity—that satisfies the target accuracy under specified FLOPs and latency constraints. Next, during the hybrid compression phase, a global L1 percentile-based unstructured pruning and channel repacking are applied to further reduce memory usage. Finally, in the low-cost decision protocol stage, each pruning decision is evaluated using short fine-tuning (1–3 epochs) and partial validation (10–30% of dataset) to avoid repeated full retraining. We further provide a unified theory for hybrid pruning—formulating a resource-aware objective, a logit-perturbation invariance bound for unstructured pruning/INT8/repacking, a Hoeffding-based bisection decision margin, and a compression (code-length) generalization bound—explaining when the compressed models match baseline accuracy while meeting edge budgets. Radar return signals are processed with a short-time Fourier transform (STFT) to generate unique time–frequency spectrograms for each gesture (kick, swing, slide, tap). The proposed pruning method achieves 20–57% reductions in floating-point operations (FLOPs) and approximately 86% reductions in parameters, while preserving equivalent recognition accuracy. Experimental results demonstrate that the pruned model maintains high gesture recognition performance with substantially lower computational cost, making it suitable for real-time deployment on edge devices. Full article
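
The global L1 percentile-based unstructured pruning step has a direct counterpart in PyTorch's pruning utilities. The sketch below applies it to a toy spectrogram CNN at an 86% sparsity target (matching the parameter reduction quoted above); it is not the paper's NAS-guided hybrid framework.

```python
# Hedged sketch (not the paper's hybrid framework): global L1 unstructured
# pruning with PyTorch's built-in utilities on a toy CNN for radar spectrograms.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 4),                         # kick, swing, slide, tap
)

to_prune = [(m, "weight") for m in model if isinstance(m, (nn.Conv2d, nn.Linear))]
prune.global_unstructured(to_prune, pruning_method=prune.L1Unstructured, amount=0.86)

zeros = sum(int((m.weight == 0).sum()) for m, _ in to_prune)
total = sum(m.weight.numel() for m, _ in to_prune)
print(f"global weight sparsity: {zeros / total:.1%}")
```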

15 pages, 2164 KB  
Article
Real-Time Chinese Sign Language Gesture Prediction Based on Surface EMG Sensors and Artificial Neural Network
by Jinrun Cheng, Xing Hu and Kuo Yang
Electronics 2025, 14(22), 4374; https://doi.org/10.3390/electronics14224374 - 9 Nov 2025
Viewed by 606
Abstract
Sign language recognition aims to capture and classify hand and arm motion signals to enable intuitive communication for individuals with hearing and speech impairments. This study proposes a real-time Chinese Sign Language (CSL) recognition framework that integrates a dual-stage segmentation strategy with a lightweight three-layer artificial neural network to achieve early gesture prediction before completion of motion sequences. The system was evaluated on a 21-class CSL dataset containing several highly similar gestures and achieved an accuracy of 91.5%, with low average inference latency per cycle. Furthermore, training set truncation experiments demonstrate that using only the first 50% of each gesture instance preserves model accuracy while reducing training time by half, thereby enhancing real-time efficiency and practical deployability for embedded or assistive applications. Full article
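
The truncation experiment, training on only the first 50% of each gesture instance so the network can predict before the motion completes, is easy to illustrate. The sketch below uses an assumed data layout (one array per sEMG instance) and a standard RMS feature; it is not the authors' pipeline.

```python
# Illustrative sketch of the truncation experiment (assumed data layout, not
# the authors' code): keep only the first 50% of each sEMG gesture instance
# before feature extraction, enabling prediction before the motion completes.
import numpy as np

def truncate_instances(instances, keep_fraction=0.5):
    """instances: list of (n_samples, n_channels) sEMG arrays -> truncated copies."""
    return [x[: max(1, int(len(x) * keep_fraction))] for x in instances]

def rms_features(window):
    """Root-mean-square per channel, a common sEMG feature."""
    return np.sqrt(np.mean(np.square(window), axis=0))

# two toy 8-channel gesture recordings of different lengths
gestures = [np.random.randn(400, 8), np.random.randn(520, 8)]
half = truncate_instances(gestures)
features = np.stack([rms_features(g) for g in half])
print([len(g) for g in half], features.shape)    # [200, 260] (2, 8)
```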

3729 KB  
Proceeding Paper
A Smart Glove-Based System for Dynamic Sign Language Translation Using LSTM Networks
by Tabassum Kanwal, Saud Altaf, Rehan Mehmood Yousaf and Kashif Sattar
Eng. Proc. 2025, 118(1), 45; https://doi.org/10.3390/ECSA-12-26530 - 7 Nov 2025
Viewed by 613
Abstract
This research presents a novel, real-time Pakistani Sign Language (PSL) recognition system utilizing a custom-designed sensory glove integrated with advanced machine learning techniques. The system aims to bridge communication gaps for individuals with hearing and speech impairments by translating hand gestures into readable text. At the core of this work is a smart glove engineered with five resistive flex sensors for precise finger flexion detection and a 9-DOF Inertial Measurement Unit (IMU) for capturing hand orientation and movement. The glove is powered by a compact microcontroller, which processes the analog and digital sensor inputs and transmits the data wirelessly to a host computer. A rechargeable 3.7 V Li-Po battery ensures portability, while a dynamic dataset comprising both static alphabet gestures and dynamic PSL phrases was recorded using this setup. The collected data was used to train two models: a Support Vector Machine with feature extraction (SVM-FE) and a Long Short-Term Memory (LSTM) deep learning network. The LSTM model outperformed traditional methods, achieving an accuracy of 98.6% in real-time gesture recognition. The proposed system demonstrates robust performance and offers practical applications in smart home interfaces, virtual and augmented reality, gaming, and assistive technologies. By combining ergonomic hardware with intelligent algorithms, this research takes a significant step toward inclusive communication and more natural human–machine interaction. Full article
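
A plausible shape for the LSTM model over the glove's 5 flex-sensor and 9-DOF IMU channels is sketched below in PyTorch. The channel count follows the hardware description; window length, network depth, and the 36-class output are assumptions.

```python
# A sketch under stated assumptions (window length, depth, and class count are
# illustrative): an LSTM classifier over windows of the glove's 5 flex-sensor
# + 9-DOF IMU channels, matching the kind of model described above.
import torch
import torch.nn as nn

class GloveLSTM(nn.Module):
    def __init__(self, n_channels=14, n_classes=36, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_channels, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):               # x: (batch, time steps, channels)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])    # classify from the final hidden state

window = torch.randn(8, 100, 14)        # 8 windows of 100 samples x 14 channels
print(GloveLSTM()(window).shape)        # torch.Size([8, 36])
```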

2177 KB  
Proceeding Paper
Hand Gesture to Sound: A Real-Time DSP-Based Audio Modulation System for Assistive Interaction
by Laiba Khan, Hira Mariam, Marium Sajid, Aymen Khan and Zehra Fatima
Eng. Proc. 2025, 118(1), 27; https://doi.org/10.3390/ECSA-12-26516 - 7 Nov 2025
Viewed by 278
Abstract
This paper presents the design, development, and evaluation of an embedded hardware and digital signal processing (DSP)-based real-time gesture-controlled system. The system architecture utilizes an MPU6050 inertial measurement unit (IMU), an Arduino Uno microcontroller, and a Python-based audio interface to recognize and classify directional hand gestures and transform them into auditory commands. Wrist tilts, i.e., left, right, forward, and backward, are recognized using a hybrid algorithm that combines thresholding, moving average filtering, and low-pass smoothing to remove sensor noise and transient errors. The hardware setup uses I2C-based sensor acquisition, onboard preprocessing on the Arduino, and serial communication with a host computer running a Python script to trigger audio playback using the playsound library. Four gestures are programmed for basic needs: Hydration Request, Meal Support, Restroom Support, and Emergency Alarm. Experimental evaluation, conducted over more than 50 iterations per gesture in a controlled laboratory setup, resulted in a mean recognition rate of 92%, with system latency of 120–150 milliseconds. The approach requires little calibration, is low-cost, and offers low-latency performance comparable to more advanced camera-based or machine-learning-based methods, making it suitable for portable assistive devices. Full article
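
The host-side recognition logic, moving-average smoothing of IMU samples followed by per-axis thresholding onto the four assistive commands, can be sketched as below. Thresholds, axis conventions, and window size are assumptions, not the authors' calibrated values.

```python
# Hedged sketch of the host-side classification logic (thresholds, axis
# conventions, and window size are assumptions): smooth MPU6050 accelerometer
# samples with a moving average, then map wrist tilts to the four commands.
from collections import deque

THRESHOLD_G = 0.5                     # assumed tilt threshold in g
COMMANDS = {"left": "Hydration Request", "right": "Meal Support",
            "forward": "Restroom Support", "backward": "Emergency Alarm"}

class TiltClassifier:
    def __init__(self, window=5):
        self.ax = deque(maxlen=window)
        self.ay = deque(maxlen=window)

    def update(self, ax, ay):
        self.ax.append(ax)
        self.ay.append(ay)
        ax_f = sum(self.ax) / len(self.ax)      # moving-average smoothing
        ay_f = sum(self.ay) / len(self.ay)
        if ax_f < -THRESHOLD_G: return COMMANDS["left"]
        if ax_f > THRESHOLD_G:  return COMMANDS["right"]
        if ay_f > THRESHOLD_G:  return COMMANDS["forward"]
        if ay_f < -THRESHOLD_G: return COMMANDS["backward"]
        return None

clf = TiltClassifier()
for sample in [(-0.7, 0.0)] * 5:                # sustained left wrist tilt
    command = clf.update(*sample)
print(command)                                   # Hydration Request
```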