Search Results (430)

Search Parameters:
Keywords = human gesture recognition

20 pages, 4898 KB  
Article
Highly Robust and Multimodal PVA/Aramid Nanofiber/MXene Organogel Sensors for Advanced Human–Machine Interfaces
by Guofan Zeng, Leiting Liao, Zehong Wu, Jinye Chen, Peidi Zhou, Yihan Qiu and Mingcen Weng
Biosensors 2026, 16(4), 229; https://doi.org/10.3390/bios16040229 - 20 Apr 2026
Abstract
Flexible and wearable electronics require soft sensing materials that balance mechanical compliance, stable signal transduction, and durability for human–machine interfaces (HMIs). To address the limitations of single-filler systems, we propose a poly(vinyl alcohol) (PVA)/aramid nanofiber (ANF)/MXene organogel (PAM) as a multifunctional soft platform. This design integrates a physically crosslinked PVA network with ANF for mechanical reinforcement and MXene for electrical functionality. The optimized PAM composite exhibits outstanding mechanical properties, including a fracture stress of 2931 kPa, a fracture strain of 676%, and a fracture toughness of 9.04 MJ m⁻³. Importantly, PAM serves as a single material platform configurable into three sensing modalities. The resistive strain sensor achieves a gauge factor of 3.1 over 10–100% strain and enables reliable recognition of human joint movements and gestures. The capacitive pressure sensor delivers a sensitivity of 0.298 kPa⁻¹ and rapid response/recovery times of 30/10 ms, and is integrated with a wireless module to control a smart car. Furthermore, the PAM-based triboelectric nanogenerator (TENG) delivers excellent electrical outputs (V_oc = 123 V, I_sc = 0.52 μA, Q_sc = 58 nC) and functions as a self-powered smart handwriting pad, achieving a machine-learning-based recognition accuracy of 97.6%. This work demonstrates the immense potential of the PAM organogel for advanced, self-powered HMIs.
(This article belongs to the Special Issue Flexible and Stretchable Biosensors)
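For context on the figures of merit quoted above, the conventional definitions of a resistive gauge factor and a capacitive pressure sensitivity are as follows (standard definitions, not quoted from the paper):

```latex
% Standard figures of merit for strain and pressure sensors (not the paper's
% own formulation): relative resistance change per unit strain, and relative
% capacitance change per unit pressure.
\[
  \mathrm{GF} = \frac{\Delta R / R_0}{\varepsilon},
  \qquad
  S = \frac{\partial (\Delta C / C_0)}{\partial P}
\]
```

Read against these definitions, GF = 3.1 over 10–100% strain and S = 0.298 kPa⁻¹ are the values reported in the abstract.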

28 pages, 3548 KB  
Article
Edge Computing Approach to AI-Based Gesture for Human–Robot Interaction and Control
by Nikola Ivačko, Ivan Ćirić and Miloš Simonović
Computers 2026, 15(4), 241; https://doi.org/10.3390/computers15040241 - 14 Apr 2026
Abstract
This paper presents an edge-deployable vision-based framework for human–robot interaction using an xArm collaborative robot, a single RGB camera mounted on the robot wrist, and lightweight AI-based perception modules. The system enables intuitive, contact-free control by combining hand understanding and object detection within a unified perception–decision–control pipeline. Hand landmarks are extracted using MediaPipe Hands, from which continuous hand trajectories, static gestures, and dynamic gestures are derived. Task objects are detected using a YOLO-based model, and both hand and object observations are mapped into the robot workspace using ArUco-based planar calibration. To ensure stable robot motion, the hand control signal is smoothed using low-pass and Kalman filtering, while dynamic gestures such as waving are recognized using a lightweight LSTM classifier. The complete pipeline runs locally on edge hardware, specifically an NVIDIA Jetson Orin Nano and a Raspberry Pi 5 with a Hailo AI accelerator. Experimental evaluation includes trajectory stability, gesture recognition reliability, and runtime performance on both platforms. Results show that filtering significantly reduces hand-tracking jitter, gesture recognition provides stable command states for control, and both edge devices support real-time operation, with the Jetson achieving consistently lower runtime than the Raspberry Pi. The proposed system demonstrates the feasibility of low-cost edge AI solutions for responsive and practical human–robot interaction in collaborative industrial environments.
(This article belongs to the Special Issue Intelligent Edge: When AI Meets Edge Computing)
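The perception front end described above can be prototyped in a few lines. The sketch below assumes the legacy MediaPipe solutions API and webcam input; the low-pass gain alpha is illustrative, not the tuned low-pass/Kalman combination the authors use.

```python
# Minimal sketch: wrist tracking with exponential low-pass smoothing.
# Assumes the legacy MediaPipe "solutions" API; alpha is an illustrative gain,
# not the paper's tuned low-pass/Kalman filter settings.
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.5)
cap = cv2.VideoCapture(0)
alpha, smoothed = 0.3, None  # low-pass gain in (0, 1]

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if result.multi_hand_landmarks:
        wrist = result.multi_hand_landmarks[0].landmark[0]  # landmark 0 = wrist
        raw = (wrist.x, wrist.y)  # normalized image coordinates in [0, 1]
        smoothed = raw if smoothed is None else tuple(
            alpha * r + (1 - alpha) * s for r, s in zip(raw, smoothed))
        # `smoothed` would feed the ArUco-calibrated workspace mapping.
cap.release()
```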

27 pages, 6782 KB  
Article
Development and Evaluation of a Data Glove-Based System for Assisting Puzzle Solving
by Shashank Srikanth Bharadwaj, Kazuma Sato and Lei Jing
Sensors 2026, 26(8), 2341; https://doi.org/10.3390/s26082341 - 10 Apr 2026
Abstract
Many hands-on tasks remain difficult to fully automate because they require human dexterity and flexible object handling. Data gloves offer a promising interface for sensing hand–object interactions, but most prior systems focus on gesture recognition or object classification rather than closed-loop, step-by-step task guidance. In this work, we develop and evaluate a tactile-sensing operation support system using an e-textile data glove with 88 pressure sensors, a tactile pressure sheet for placement verification, and a GUI that provides step-by-step instructions. As a core component, a CNN classifies the grasped state as bare hand or one of four discs with 93.3% accuracy using 16,175 training samples collected from five participants. In a user study on the Tower of Hanoi task as a controlled proxy for multi-step manipulation, the system reduced mean solving time by 51.5% (from 242.6 s to 117.8 s), reduced the number of disc movements (35.4 to 15, about 20 fewer moves on average), and lowered perceived workload (NASA-TLX) by 53.1% (from 68.5 to 32.1), while achieving a SUS score of 75. These results demonstrate the feasibility of tactile-based step verification and guidance in a controlled multi-step task; broader generalization requires evaluation with larger and more diverse participant groups and tasks.
(This article belongs to the Section Intelligent Sensors)
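As a rough illustration of the classification stage, a small 1D CNN can map one 88-sensor pressure frame to the five grasp classes (bare hand plus four discs). The layer widths and the flat 1D input layout are assumptions for illustration; the paper's actual architecture and sensor layout may differ.

```python
# Illustrative 5-class grasp classifier over an 88-sensor pressure frame.
# Layer widths and the flat 1D input layout are assumptions, not the paper's
# actual network design.
import torch
import torch.nn as nn

class GraspCNN(nn.Module):
    def __init__(self, n_sensors=88, n_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),  # 88 -> 44
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),  # 44 -> 22
        )
        self.head = nn.Linear(32 * (n_sensors // 4), n_classes)

    def forward(self, x):  # x: (batch, 1, 88) pressure readings
        return self.head(self.features(x).flatten(1))

logits = GraspCNN()(torch.randn(4, 1, 88))  # -> (4, 5)
```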

7 pages, 1242 KB  
Proceeding Paper
Real-Time Recognition of Dual-Arm Motion Using Joint Direction Vectors and Temporal Deep Learning
by Yi-Hsiang Tseng, Che-Wei Hsu and Yih-Guang Leu
Eng. Proc. 2025, 120(1), 75; https://doi.org/10.3390/engproc2025120075 - 9 Apr 2026
Abstract
We developed a dual-arm motion recognition system designed for real-time upper-limb movement analysis using video input. The system integrates MediaPipe Hands for skeletal keypoint detection, a feature extraction pipeline that encodes spatial and temporal characteristics from upper-limb joints, and a three-layer long short-term memory network for temporal modeling and classification. Directional vectors from the shoulder to the elbow and wrist are computed to generate a 168-dimensional feature vector per frame. Sequences of 90 frames are used to capture full motion patterns. The system effectively supports multi-class recognition of coordinated dual-arm gestures, offering applications in rehabilitation, gesture control, and human–computer interaction.
(This article belongs to the Proceedings of 8th International Conference on Knowledge Innovation and Invention)
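The stated shapes (168 features per frame, 90-frame windows, three LSTM layers) pin down most of the temporal model; the hidden size and class count below are illustrative assumptions.

```python
# Sketch of the temporal model: a three-layer LSTM over 90-frame windows of
# 168-dim joint-direction features. Hidden size and class count are assumed.
import numpy as np
import torch
import torch.nn as nn

def direction(a, b):
    """Unit vector from joint a to joint b (e.g., shoulder -> elbow)."""
    v = np.asarray(b, float) - np.asarray(a, float)
    return v / (np.linalg.norm(v) + 1e-8)

class DualArmLSTM(nn.Module):
    def __init__(self, feat_dim=168, hidden=128, n_classes=8):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, num_layers=3, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):  # x: (batch, 90, 168)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])  # classify from the final time step

logits = DualArmLSTM()(torch.randn(2, 90, 168))  # -> (2, 8)
```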

16 pages, 1624 KB  
Article
Surface EMG-Based Hand Gesture Recognition Using a Hybrid Multistream Deep Learning Architecture
by Yusuf Çelik and Umit Can
Sensors 2026, 26(7), 2281; https://doi.org/10.3390/s26072281 - 7 Apr 2026
Abstract
Surface electromyography (sEMG) enables non-invasive measurement of muscle activity for applications such as human–machine interaction, rehabilitation, and prosthesis control. However, high noise levels, inter-subject variability, and the complex nature of muscle activation hinder robust gesture classification. This study proposes a multistream hybrid deep-learning architecture for the FORS-EMG dataset to address these challenges. The model integrates Temporal Convolutional Networks (TCN), depthwise separable convolutions, bidirectional Long Short-Term Memory (LSTM)–Gated Recurrent Unit (GRU) layers, and a Transformer encoder to capture complementary temporal and spectral patterns, and an ArcFace-based classifier to enhance class separability. We evaluate the approach under three protocols: subject-wise, random split without augmentation, and random split with augmentation. In the augmented random-split setting, the model attains 96.4% accuracy, surpassing previously reported values. In the subject-wise setting, accuracy is 74%, revealing limited cross-user generalization. The results demonstrate the method’s high performance and highlight the impact of data-partition strategies for real-world sEMG-based gesture recognition.
(This article belongs to the Special Issue Machine Learning in Biomedical Signal Processing)
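The ArcFace classifier mentioned above adds an angular margin to the target-class logit before softmax, pushing classes apart on the hypersphere. A minimal rendering follows; the scale s and margin m are common defaults, not values from the paper.

```python
# Minimal ArcFace head: cosine logits with an additive angular margin on the
# target class. s and m are common defaults, not the paper's settings.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ArcFaceHead(nn.Module):
    def __init__(self, embed_dim, n_classes, s=30.0, m=0.50):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(n_classes, embed_dim))
        self.s, self.m = s, m

    def forward(self, embeddings, labels):
        cosine = F.linear(F.normalize(embeddings), F.normalize(self.weight))
        theta = torch.acos(cosine.clamp(-1 + 1e-7, 1 - 1e-7))
        target = F.one_hot(labels, cosine.size(1)).bool()
        logits = torch.where(target, torch.cos(theta + self.m), cosine)
        return self.s * logits  # feed to cross-entropy

x, y = torch.randn(4, 128), torch.randint(0, 10, (4,))
loss = F.cross_entropy(ArcFaceHead(128, 10)(x, y), y)
```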

16 pages, 5700 KB  
Article
A Deep Learning-Based EIT System for Robust Gesture Recognition Under Confounding Factors
by Hancong Wu, Guanghong Huang, Wentao Wang and Yuan Wen
Biosensors 2026, 16(4), 200; https://doi.org/10.3390/bios16040200 - 1 Apr 2026
Abstract
Gesture recognition with electrical impedance tomography (EIT) holds enormous potential for human–machine interaction because of its low cost, low complexity, and high temporal resolution. Although high-precision EIT-based gesture recognition has been achieved in ideal scenarios, ensuring consistent performance under interference remains challenging. This article presents a novel method to alleviate the effect of confounding factors on EIT gesture recognition. An EIT armband was designed to mitigate the effect of contact impedance variation based on equivalent circuit analysis, and a spatial–temporal fusion network, named the Fold Atrous Spatial Pyramid Pooling-Gated Recurrent Unit (FASPP-GRU), was developed for robust gesture classification. The results showed that the proposed two-layer electrode maintained a stable contact impedance when its contact force with the skin was changed. Although confounding factors caused significant changes in baseline forearm impedance, FASPP-GRU achieved 80% accuracy under limb position changes and dynamic changes in muscle state over time, outperforming conventional classifiers. With an 87 μs inference time, the proposed system shows strong potential for real-time applications.

15 pages, 287 KB  
Proceeding Paper
Computer Vision for Collaborative Robots in Industry 5.0: A Survey of Techniques, Gaps, and Future Directions
by Himani Varolia, César M. A. Vasques and Adélio M. S. Cavadas
Eng. Proc. 2026, 124(1), 99; https://doi.org/10.3390/engproc2026124099 - 24 Mar 2026
Abstract
Collaborative robots are increasingly deployed in human-shared industrial workspaces, where perception is a key enabler for safe interaction, flexible manipulation, and human-aware task execution. In the context of Industry 5.0, computer vision for cobots must meet not only accuracy requirements but also human-centered constraints such as safety, transparency, robustness, and practical deployability. This paper surveys computer-vision approaches used in collaborative robotics and organizes them through a task-driven taxonomy covering detection, segmentation, tracking, pose estimation, action/gesture recognition, and safety monitoring. Beyond a descriptive literature review, the paper provides a task-driven qualitative analytical perspective that relates families of computer vision methods to key industrial constraints, including occlusion, lighting variability, clutter, domain shift, real-time latency, and annotation cost, and summarizes comparative strengths and failure modes using unified criteria. We further discuss challenges related to data availability and evaluation practices, highlighting gaps in reproducibility, standardized metrics, and real-world validation in shared human–robot environments. Finally, we outline implementation and deployment considerations across common software stacks (e.g., Python-based pipelines and MATLAB-based prototyping), emphasizing ROS2 integration, edge inference, and lifecycle maintenance. The survey concludes with research directions toward robust multimodal perception, explainable human-aware vision, and benchmarkable safety-critical perception for next-generation collaborative robotic systems.
(This article belongs to the Proceedings of The 6th International Electronic Conference on Applied Sciences)
23 pages, 5784 KB  
Article
Learning Italian Hand Gesture Culture Through an Automatic Gesture Recognition Approach
by Chiara Innocente, Giorgio Di Pisa, Irene Lionetti, Andrea Mamoli, Manuela Vitulano, Giorgia Marullo, Simone Maffei, Enrico Vezzetti and Luca Ulrich
Future Internet 2026, 18(4), 177; https://doi.org/10.3390/fi18040177 - 24 Mar 2026
Abstract
Italian hand gestures constitute a distinctive and widely recognized form of nonverbal communication, deeply embedded in everyday interaction and cultural identity. Despite their prominence, these gestures are rarely formalized or systematically taught, posing challenges for foreign speakers and visitors seeking to interpret their meaning and pragmatic use. Moreover, their ephemeral and embodied nature complicates traditional preservation and transmission approaches, positioning them within the broader domain of intangible cultural heritage. This paper introduces a machine learning–based framework for recognizing iconic Italian hand gestures, designed to support cultural learning and engagement among foreign speakers and visitors. The approach combines RGB–D sensing with depth-enhanced geometric feature extraction, employing interpretable classification models trained on a purpose-built dataset. The recognition system is integrated into a non-immersive virtual reality application simulating an interactive digital totem conceived for public arrival spaces, providing tutorial content, real-time gesture recognition, and immediate feedback within a playful and accessible learning environment. Three supervised machine learning pipelines were evaluated, and Random Forest achieved the best overall performance. Its integration with an Isolation Forest module was further considered for deployment, achieving a macro-averaged accuracy and F1-score of 0.82 under a 5-fold cross-validation protocol. An experimental user study was conducted with 25 subjects to evaluate the proposed interactive system in terms of usability, user engagement, and learning effectiveness, obtaining favorable results and demonstrating its potential as a practical tool for cultural education and intercultural communication.
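The deployed Random Forest plus Isolation Forest combination maps directly onto scikit-learn; in the sketch below the geometric features are assumed precomputed and the contamination rate is illustrative.

```python
# Sketch: an Isolation Forest gate rejects out-of-vocabulary hand shapes before
# a Random Forest assigns one of the known gesture classes. Features are
# assumed precomputed descriptors; the contamination rate is illustrative.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, IsolationForest

X_train = np.random.rand(500, 30)        # placeholder geometric features
y_train = np.random.randint(0, 10, 500)  # placeholder gesture labels

clf = RandomForestClassifier(n_estimators=200).fit(X_train, y_train)
gate = IsolationForest(contamination=0.05).fit(X_train)

def recognize(x):
    x = x.reshape(1, -1)
    if gate.predict(x)[0] == -1:  # -1 flags an outlier: not a known gesture
        return None               # fall back to "gesture not recognized"
    return clf.predict(x)[0]
```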

36 pages, 14443 KB  
Article
Personalized Wrist–Forearm Static Gesture Recognition Using the Vicara Kai Controller and Convolutional Neural Network
by Jacek Szedel
Sensors 2026, 26(5), 1700; https://doi.org/10.3390/s26051700 - 8 Mar 2026
Abstract
Predefined, user-independent gesture sets do not account for individual differences in movement patterns and physical limitations. This study presents a personalized wrist–forearm static gesture recognition system for human–computer interaction (HCI) using the Vicara Kai™ wearable controller and a convolutional neural network (CNN). Unlike systems based on fixed, predefined gestures, the proposed approach enables users to define and train their own gesture sets. During gesture recording, users may either select a gesture pattern from a predefined prompt set or create their own natural, unprompted gestures. A dedicated software framework was developed for data acquisition, preprocessing, model training, and real-time recognition. The developed system was evaluated by optimizing the parameters of a lightweight CNN and examining the influence of sequentially applied changes to the input and network pipelines, including resizing the input layer, applying data augmentation, experimenting with different dropout ratios, and varying the number of learning samples. The performance of the resulting network setup was assessed using confusion matrices, accuracy, and precision metrics for both original gestures and gestures smoothed using the cubic Bézier function. The resulting validation accuracy ranged from 0.88 to 0.94, with an average test-set accuracy of 0.92 and macro precision of 0.92. The system’s resilience to rapid or casual gestures was also evaluated using the receiver operating characteristic (ROC) method, achieving an Area Under the Curve (AUC) of 0.97. The results demonstrate that the proposed approach achieves high recognition accuracy, indicating its potential for a range of practical applications.
(This article belongs to the Special Issue Sensor Systems for Gesture Recognition (3rd Edition))
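The cubic Bézier smoothing applied to recorded gestures follows the standard Bernstein form; taking the trace endpoints plus two interior samples as control points, as below, is an illustrative choice rather than the paper's fitting procedure.

```python
# Sketch: smoothing a recorded gesture trace with a cubic Bezier curve.
# Control-point selection (endpoints plus two interior samples) is an
# illustrative simplification, not necessarily the paper's procedure.
import numpy as np

def cubic_bezier(p0, p1, p2, p3, n=100):
    t = np.linspace(0.0, 1.0, n)[:, None]
    return ((1 - t) ** 3 * p0 + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2 + t ** 3 * p3)

trace = np.cumsum(np.random.randn(60, 2), axis=0)  # noisy 2D gesture samples
smooth = cubic_bezier(trace[0], trace[20], trace[40], trace[-1])
```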

14 pages, 865 KB  
Essay
Utilizing the Walla Emotion Model to Standardize Terminological Clarity for AI-Driven “Emotion” Recognition
by Peter Walla
Brain Sci. 2026, 16(3), 260; https://doi.org/10.3390/brainsci16030260 - 26 Feb 2026
Abstract
The scientific study of affect has been historically characterized by a profound lack of terminological consensus, leading to a state of conceptual fragmentation that persists in psychology, neuroscience, and many other fields. This ambiguity is not merely an academic concern; it has significant consequences for the development of artificial intelligence (AI) systems designed to recognize and respond to human “emotions”. In fact, it influences the entire field of affective computing. The problem is obvious: without a distinct definition of “emotion”, it is difficult to train an algorithm to recognize it. The Walla Emotion Model, also known as the ESCAPE (Emotions Convey Affective Processing Effects) model, provides a potentially helpful and neurobiologically grounded framework to resolve this impasse and to improve any discourse about it, for businesses and even lawmakers aiming at healthy societies. By establishing clear, non-overlapping definitions for affective processing, feelings, and emotions, this model offers a path toward more precise research and more ethically sound affective computing, including AI-driven “emotion” recognition. It introduces a concept that allows for the detection of incongruences between internal states and external signals, with a very clear terminology supporting understandable communication. This is critical for identifying feigned or socially masked inner affective states, a challenge that traditional “face-reading” AI models frequently fail to address. Even tone of voice, body posture, and gestures can be, and often are, voluntarily modified. Through the separation of subcortical affective processing (evaluation of valence; neural activity) from subjective experience (feeling) and external communication (emotion), the Walla model provides a helpful framework for AI designs meant to infer an internal affective state from signals collected in the wild, bypassing verbal self-report. This paper is purely theoretical; it does not provide any algorithm models or other distinct suggestions for training a software package. Its main purpose is to introduce a new emotion model, and particularly a new terminology, considered helpful for proceeding with this endeavor. It is important to first enable the clearest possible communication about anything related to the term emotion across all disciplines dealing with it. Only then can progress be made.
(This article belongs to the Section Cognitive, Social and Affective Neuroscience)

17 pages, 11472 KB  
Article
Fabrication and Performance Study of 3D-Printed MWCNTs/PDMS Flexible Piezoresistive Pressure Sensors
by Haitao Liu, Chenhui Sun, Xiaoquan Shi, Xubo Fan, Junjun Liu and Yazhou Sun
Appl. Sci. 2026, 16(5), 2204; https://doi.org/10.3390/app16052204 - 25 Feb 2026
Abstract
Piezoresistive pressure sensing has broad application prospects in wearable fields such as human–machine interaction, physiological signal detection, and electronic skin. As a high-performance conductive filler, multi-walled carbon nanotubes (MWCNTs) have demonstrated extensive application potential across various domains. However, polymer composites filled with MWCNTs exhibit complex behavior during the printing process, which increases the difficulty of applying extrusion-based 3D printing technology. To this end, this study systematically investigated the extrusion 3D printing process of MWCNTs/polydimethylsiloxane (PDMS) composites. In this research, MWCNTs/PDMS composites with MWCNTs mass fractions of 1 wt%, 2 wt%, 3 wt%, and 4 wt% were prepared. The printability of the materials at each ratio was systematically explored, and rational printing process parameters were determined. On this basis, the influence of MWCNTs mass fraction on sensor performance was analyzed through tensile testing. Finally, three sets of experiments, including palm gesture recognition and gripping tests, elbow joint motion monitoring, and continuous pressure monitoring, successfully verified the feasibility of the fabricated sensors in human motion monitoring. The results demonstrate that the sensors made of this composite material via extrusion 3D printing possess excellent application potential in the field of flexible wearable electronics.
(This article belongs to the Section Additive Manufacturing Technologies)

17 pages, 4471 KB  
Article
Utilizing Data Quality Indices for Strategic Sensor Channel Selection to Enhance Performance of Hand Gesture Recognition Systems
by Shen Zhang, Hao Zhou, Rayane Tchantchane and Gursel Alici
Sensors 2026, 26(4), 1213; https://doi.org/10.3390/s26041213 - 12 Feb 2026
Abstract
This study proposes a data quality-driven channel selection methodology to improve hand gesture recognition performance in multi-channel wearable Human–Machine Interface (HMI) systems. The methodology centers on (i) calculating five data quality indices for both surface electromyography (sEMG) and pressure-based force myography (pFMG) signals and (ii) establishing a relationship between these data quality indices and the accuracy of gesture recognition for applications typified by prosthetic hand control. Machine learning (ML)-based and correlation-based methods were used to select three optimal channel/pair configurations from an eight-channel/pair system. Evaluations on the UOW and Ninapro DB2 datasets showed that the proposed methods consistently outperformed random channel selection, with the ML-based approach achieving the best results (76.36% for sEMG, 71.59% for pFMG, and 88.2% for fused sEMG-pFMG on the UOW dataset, and 70.28% on Ninapro DB2). Notably, using three pairs of strategically selected sEMG-pFMG channels yielded 88.2% accuracy, comparable to the 88.38% accuracy obtained with a full eight-channel sEMG system on the UOW dataset, highlighting the efficacy of our channel selection methodologies. These results highlight the value of data quality indices for sensor selection and provide a foundation for developing more efficient wearable HMI systems.
(This article belongs to the Section Intelligent Sensors)
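A stripped-down version of the quality-driven selection idea is sketched below: rank channels by a single SNR-style index and keep the top three. The paper computes five indices and uses ML- and correlation-based selectors; this single-index ranking is a simplification.

```python
# Simplified sketch of quality-driven channel selection: rank eight channels by
# one SNR-style index and keep the best three. The paper's five indices and
# ML-/correlation-based selectors are reduced to this single-index ranking.
import numpy as np

def snr_index(active, rest):
    """Crude quality index: active-signal power over resting-noise power (dB)."""
    return 10 * np.log10(np.mean(active ** 2) / (np.mean(rest ** 2) + 1e-12))

emg_active = np.random.randn(8, 5000) * np.linspace(0.5, 2.0, 8)[:, None]
emg_rest = np.random.randn(8, 5000) * 0.3

scores = [snr_index(a, r) for a, r in zip(emg_active, emg_rest)]
best3 = np.argsort(scores)[-3:]  # indices of the three channels to keep
```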

24 pages, 29852 KB  
Article
Dual-Axis Transformer-GNN Framework for Touchless Finger Location Sensing by Using Wi-Fi Channel State Information
by Minseok Koo and Jaesung Park
Electronics 2026, 15(3), 565; https://doi.org/10.3390/electronics15030565 - 28 Jan 2026
Abstract
Camera-, lidar-, and wearable-based gesture recognition technologies face practical limitations such as lighting sensitivity, occlusion, hardware cost, and user inconvenience. Wi-Fi channel state information (CSI) can be used as a contactless alternative to capture the subtle signal variations caused by human motion. However, existing CSI-based methods are highly sensitive to domain shifts and often suffer notable performance degradation when applied to environments different from the training conditions. To address this issue, we propose a domain-robust touchless finger location sensing framework that operates reliably even in a single-link environment composed of commercial Wi-Fi devices. The proposed system applies preprocessing procedures to reduce noise and variability introduced by environmental factors and introduces a multi-domain segment combination strategy to increase domain diversity during training. In addition, the dual-axis Transformer learns temporal and spatial features independently, and the GNN-based integration module incorporates relationships among segments originating from different domains to produce more generalized representations. The proposed model is evaluated using CSI data collected from various users and days; experimental results show that the proposed method achieves an in-domain accuracy of 99.31% and outperforms the best baseline by approximately 4% and 3% in cross-user and cross-day evaluation settings, respectively, even in a single-link setting. Our work demonstrates a viable path to robust, calibration-free finger-level interaction using ubiquitous single-link Wi-Fi in real-world and constrained environments, providing a foundation for more reliable contactless interaction systems.

27 pages, 11232 KB  
Article
Aerokinesis: An IoT-Based Vision-Driven Gesture Control System for Quadcopter Navigation Using Deep Learning and ROS2
by Sergei Kondratev, Yulia Dyrchenkova, Georgiy Nikitin, Leonid Voskov, Vladimir Pikalov and Victor Meshcheryakov
Technologies 2026, 14(1), 69; https://doi.org/10.3390/technologies14010069 - 16 Jan 2026
Abstract
This paper presents Aerokinesis, an IoT-based software–hardware system for intuitive gesture-driven control of quadcopter unmanned aerial vehicles (UAVs), developed within the Robot Operating System 2 (ROS2) framework. The proposed system addresses the challenge of providing an accessible human–drone interaction interface for operators in scenarios where traditional remote controllers are impractical or unavailable. The architecture comprises two hierarchical control levels: (1) high-level discrete command control utilizing a fully connected neural network classifier for static gesture recognition, and (2) low-level continuous flight control based on three-dimensional hand keypoint analysis from a depth camera. The gesture classification module achieves an accuracy exceeding 99% using a multi-layer perceptron trained on MediaPipe-extracted hand landmarks. For continuous control, we propose a novel approach that computes Euler angles (roll, pitch, yaw) and throttle from 3D hand pose estimation, enabling intuitive four-degree-of-freedom quadcopter manipulation. A hybrid signal filtering pipeline ensures robust control signal generation while maintaining real-time responsiveness. Comparative user studies demonstrate that gesture-based control reduces task completion time by 52.6% for beginners compared to conventional remote controllers. The results confirm the viability of vision-based gesture interfaces for IoT-enabled UAV applications.
(This article belongs to the Section Information and Communication Technologies)
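The continuous-control mapping (Euler angles from 3D hand keypoints) admits a compact geometric sketch. Using the wrist plus the index and pinky knuckles to span the palm plane, and the axis conventions below, are assumptions; the paper's exact keypoints and conventions may differ.

```python
# Sketch: roll/pitch/yaw from three 3D hand keypoints. Keypoint choice (wrist,
# index knuckle, pinky knuckle) and axis conventions are assumptions; the paper
# derives four-degree-of-freedom commands (plus throttle) along these lines.
import numpy as np

def hand_euler(wrist, index_mcp, pinky_mcp):
    u = np.asarray(index_mcp, float) - wrist  # across-palm axis
    v = np.asarray(pinky_mcp, float) - wrist
    n = np.cross(u, v)                        # palm normal
    n /= np.linalg.norm(n) + 1e-8
    f = (u + v) / 2.0                         # forward axis toward the fingers
    f /= np.linalg.norm(f) + 1e-8
    roll = np.arctan2(n[0], n[1])             # lateral tilt of the palm normal
    pitch = np.arctan2(-f[1], np.hypot(f[0], f[2]))
    yaw = np.arctan2(f[0], f[2])              # heading in the x-z plane
    return roll, pitch, yaw

print(hand_euler(np.zeros(3), np.array([1.0, 0, 0]), np.array([0.2, 0, 1.0])))
```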

27 pages, 80350 KB  
Article
Pose-Based Static Sign Language Recognition with Deep Learning for Turkish, Arabic, and American Sign Languages
by Rıdvan Yayla, Hakan Üçgün and Mahmud Abbas
Sensors 2026, 26(2), 524; https://doi.org/10.3390/s26020524 - 13 Jan 2026
Cited by 1
Abstract
Advancements in artificial intelligence have significantly enhanced communication for individuals with hearing impairments. This study presents a robust cross-lingual Sign Language Recognition (SLR) framework for Turkish, American English, and Arabic sign languages. The system utilizes the lightweight MediaPipe library for efficient hand landmark extraction, ensuring stable and consistent feature representation across diverse linguistic contexts. Datasets were meticulously constructed from nine public-domain sources (four Arabic, three American, and two Turkish). The final training data comprises curated image datasets, with frames for each language carefully selected from varying angles and distances to ensure high diversity. A comprehensive comparative evaluation was conducted across three state-of-the-art deep learning architectures—ConvNeXt (CNN-based), Swin Transformer (ViT-based), and Vision Mamba (SSM-based)—all applied to identical feature sets. The evaluation demonstrates the superior performance of contemporary vision Transformers and state space models in capturing subtle spatial cues across diverse sign languages. Our approach provides a comparative analysis of model generalization capabilities across three distinct sign languages, offering valuable insights for model selection in pose-based SLR systems.
(This article belongs to the Special Issue Sensor Systems for Gesture Recognition (3rd Edition))
