Search Results (31)

Search Parameters:
Keywords = hand gesture recognition (HGR)

19 pages, 13635 KiB  
Article
IPN HandS: Efficient Annotation Tool and Dataset for Skeleton-Based Hand Gesture Recognition
by Gibran Benitez-Garcia, Jesus Olivares-Mercado, Gabriel Sanchez-Perez and Hiroki Takahashi
Appl. Sci. 2025, 15(11), 6321; https://doi.org/10.3390/app15116321 - 4 Jun 2025
Viewed by 721
Abstract
Hand gesture recognition (HGR) heavily relies on high-quality annotated datasets. However, annotating hand landmarks in video sequences is a time-intensive challenge. In this work, we introduce IPN HandS, an enhanced version of our IPN Hand dataset, which now includes approximately 700,000 hand skeleton annotations and corrected gesture boundaries. To generate these annotations efficiently, we propose a novel annotation tool that combines automatic detection, inter-frame interpolation, copy–paste capabilities, and manual refinement. This tool significantly reduces annotation time from 70 min to just 27 min per video, allowing for the scalable and precise annotation of large datasets. We validate the advantages of the IPN HandS dataset by training a lightweight LSTM-based model using these annotations and comparing its performance against models trained with annotations from the widely used MediaPipe hand pose estimators. Our model achieves an accuracy that is 12% higher than the MediaPipe Hands model and 8% higher than the MediaPipe Holistic model. These results underscore the importance of annotation quality in training generalization and overall recognition performance. Both the IPN HandS dataset and the annotation tool will be released to support reproducible research and future work in HGR and related fields. Full article
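
The abstract above does not detail the classifier, so the following is only a minimal sketch of a lightweight LSTM over per-frame hand-skeleton keypoints, of the kind such a comparison could use; the 21-landmark hand, layer sizes, and class count are illustrative assumptions, written in PyTorch.

```python
# Minimal sketch (not the authors' code): a lightweight LSTM gesture classifier
# that consumes per-frame hand-skeleton keypoints, e.g. 21 (x, y) landmarks.
import torch
import torch.nn as nn

class SkeletonLSTM(nn.Module):
    def __init__(self, num_landmarks=21, hidden=128, num_classes=13):
        super().__init__()
        self.lstm = nn.LSTM(input_size=num_landmarks * 2, hidden_size=hidden,
                            num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x):                      # x: (batch, frames, 21*2)
        _, (h, _) = self.lstm(x)               # h: (layers, batch, hidden)
        return self.head(h[-1])                # logits from the last layer's state

model = SkeletonLSTM()
dummy = torch.randn(4, 60, 42)                 # 4 clips, 60 frames, 21 landmarks * 2
print(model(dummy).shape)                      # torch.Size([4, 13])
```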

47 pages, 2260 KiB  
Review
Hand Gesture Recognition on Edge Devices: Sensor Technologies, Algorithms, and Processing Hardware
by Elfi Fertl, Encarnación Castillo, Georg Stettinger, Manuel P. Cuéllar and Diego P. Morales
Sensors 2025, 25(6), 1687; https://doi.org/10.3390/s25061687 - 8 Mar 2025
Cited by 3 | Viewed by 2193
Abstract
Hand gesture recognition (HGR) is a convenient and natural form of human–computer interaction suitable for a wide range of applications. Much research has already focused on wearable device-based HGR; by contrast, this paper gives an overview of device-free HGR, i.e., systems that do not require the user to wear something like a data glove or hold a device. HGR systems are explored in terms of sensing technology, hardware, and algorithms. The survey demonstrates how timing and power requirements are interconnected with the hardware, pre-processing algorithms, classification methods, and sensing technology, and how these choices permit greater or lesser granularity, accuracy, and number of gestures. The sensor modalities evaluated are Wi-Fi, vision, radar, mobile networks, and ultrasound. The pre-processing techniques explored include stereo vision, multiple-input multiple-output (MIMO) processing, spectrograms, phased arrays, range-Doppler maps, range-angle maps, Doppler-angle maps, and multilateration. Classification approaches with and without machine learning (ML) are studied; among the ML-based ones, the assessed algorithms range from simple tree structures to transformers. All applications are also evaluated with respect to their level of integration: whether they are suitable for edge deployment, their real-time capability, whether continuous learning is implemented, the robustness achieved, whether ML is applied, and the accuracy reached. The survey aims to provide a thorough understanding of the current state of the art in device-free HGR, on edge devices and in general. Finally, on the basis of present-day challenges and opportunities in the field, directions for further research on HGR are outlined, with the goal of promoting efficient and accurate gesture recognition systems. Full article
(This article belongs to the Special Issue Multimodal Sensing Technologies for IoT and AI-Enabled Systems)
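
As a small illustration of one pre-processing representation surveyed above, the snippet below computes a spectrogram from a raw 1-D sensor signal with SciPy; the signal and sampling rate are synthetic placeholders.

```python
# Illustrative only: spectrogram pre-processing of a raw 1-D sensor signal,
# one of the representations surveyed above (synthetic signal, fs is a placeholder).
import numpy as np
from scipy.signal import spectrogram

fs = 16_000                                   # assumed sampling rate in Hz
t = np.arange(0, 1.0, 1 / fs)
x = np.sin(2 * np.pi * 2_000 * t) + 0.1 * np.random.randn(t.size)

f, tt, Sxx = spectrogram(x, fs=fs, nperseg=256, noverlap=128)
print(Sxx.shape)                              # (freq bins, time frames), e.g. (129, 124)
```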

20 pages, 1820 KiB  
Article
Hybrid Solution Through Systematic Electrical Impedance Tomography Data Reduction and CNN Compression for Efficient Hand Gesture Recognition on Resource-Constrained IoT Devices
by Salwa Sahnoun, Mahdi Mnif, Bilel Ghoul, Mohamed Jemal, Ahmed Fakhfakh and Olfa Kanoun
Future Internet 2025, 17(2), 89; https://doi.org/10.3390/fi17020089 - 14 Feb 2025
Cited by 2 | Viewed by 1009
Abstract
The rapid advancement of edge computing and Tiny Machine Learning (TinyML) has created new opportunities for deploying intelligence in resource-constrained environments. With the growing demand for intelligent Internet of Things (IoT) devices that can efficiently process complex data in real time, there is an urgent need for optimisation techniques that overcome the limitations of IoT devices while preserving accurate and efficient computation. This study investigates a novel approach to optimising Convolutional Neural Network (CNN) models for Hand Gesture Recognition (HGR) based on Electrical Impedance Tomography (EIT), which demands complex signal processing, energy efficiency, and real-time operation, by simultaneously reducing input complexity and applying advanced model compression techniques. By systematically halving the input of a 1D CNN from 40 to 20 Boundary Voltages (BVs) and applying an innovative compression method, we achieved remarkable model size reductions of 91.75% and 97.49% for the 40-BV and 20-BV EIT inputs, respectively. The Floating-Point Operations (FLOPs) are also reduced by more than 99% in both cases. These reductions come with minimal loss of accuracy, maintaining 97.22% and 94.44% for the 40-BV and 20-BV inputs, respectively. The most significant result is the compressed 20-BV model: at only 8.73 kB and 94.44% accuracy, it demonstrates the potential of intelligent design strategies for creating ultra-lightweight, high-performance CNN-based solutions for resource-constrained devices with near-full performance, specifically for HGR based on EIT inputs. Full article
(This article belongs to the Special Issue Joint Design and Integration in Smart IoT Systems)
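
The abstract does not specify the network, so the sketch below only illustrates the kind of tiny 1-D CNN that maps a frame of EIT boundary voltages to gesture logits and shows how the 40-BV and 20-BV inputs differ in shape; the layer sizes and class count are assumptions.

```python
# Sketch (assumed architecture, not the paper's): a tiny 1-D CNN that classifies
# a hand gesture from a single EIT frame of boundary voltages (20 or 40 values).
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv1d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool1d(2),
    nn.Conv1d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(16, 10),                         # 10 gesture classes (placeholder)
)

for n_bvs in (40, 20):                         # full vs. halved EIT input
    x = torch.randn(1, 1, n_bvs)               # (batch, channel, boundary voltages)
    print(n_bvs, "BVs ->", model(x).shape)     # logits: torch.Size([1, 10])
print(sum(p.numel() for p in model.parameters()), "parameters before compression")
```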

15 pages, 7134 KiB  
Article
Single-Handed Gesture Recognition with RGB Camera for Drone Motion Control
by Guhnoo Yun, Hwykuen Kwak and Dong Hwan Kim
Appl. Sci. 2024, 14(22), 10230; https://doi.org/10.3390/app142210230 - 7 Nov 2024
Cited by 2 | Viewed by 2519
Abstract
Recent progress in hand gesture recognition has introduced several natural and intuitive approaches to drone control. However, effectively maneuvering drones in complex environments remains challenging. Drone movements are governed by four independent factors: roll, yaw, pitch, and throttle. Each factor includes three distinct behaviors—increase, decrease, and neutral—so a hand gesture vocabulary must be able to express at least 81 combinations (3⁴) for comprehensive drone control in diverse scenarios. In this paper, we introduce a new set of hand gestures for precise drone control, leveraging an RGB camera sensor. These gestures are categorized into motion-based and posture-based types for efficient management. We then develop a lightweight hand gesture recognition algorithm capable of real-time operation even on edge devices, ensuring accurate and timely recognition. Subsequently, we integrate hand gesture recognition into a drone simulator to execute the 81 commands for drone flight. Overall, the proposed hand gestures and recognition system offer natural control for complex drone maneuvers. Full article
(This article belongs to the Section Aerospace Science and Engineering)
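
The 81-command figure above follows from three behaviors on each of four independent factors, i.e. 3⁴ = 81; the snippet below simply enumerates that command space.

```python
# The 81 commands above are simply 3 behaviors on each of 4 independent factors: 3**4 = 81.
from itertools import product

factors = ("roll", "yaw", "pitch", "throttle")
behaviors = ("increase", "decrease", "neutral")

commands = list(product(behaviors, repeat=len(factors)))
print(len(commands))                     # 81
print(dict(zip(factors, commands[0])))   # e.g. all four factors set to "increase"
```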

18 pages, 9066 KiB  
Article
Semi-Supervised FMCW Radar Hand Gesture Recognition via Pseudo-Label Consistency Learning
by Yuhang Shi, Lihong Qiao, Yucheng Shu, Baobin Li, Bin Xiao, Weisheng Li and Xinbo Gao
Remote Sens. 2024, 16(13), 2267; https://doi.org/10.3390/rs16132267 - 21 Jun 2024
Cited by 1 | Viewed by 1633
Abstract
Hand gesture recognition is pivotal in facilitating human–machine interaction within the Internet of Things. Nevertheless, it encounters challenges, including labeling expenses and robustness. To tackle these issues, we propose a semi-supervised learning framework guided by pseudo-label consistency. This framework utilizes a dual-branch structure with a mean-teacher network. Within this setup, a globally and locally guided self-supervised learning encoder acts as a feature extractor in a teacher–student network to efficiently extract features, maximizing data utilization to enhance feature representation. Additionally, we introduce a pseudo-label Consistency-Guided Mean-Teacher model, where simulated noise is incorporated to generate new unlabeled samples for the teacher model before advancing to the subsequent stage. By enforcing consistency constraints between the outputs of the teacher and student models, we alleviate accuracy degradation resulting from individual differences and interference from other body parts, thereby bolstering the network’s robustness. Finally, the teacher model is refined through exponential moving averages to achieve stable weights. We evaluate our semi-supervised method on two publicly available hand gesture datasets and compare it with several state-of-the-art fully supervised algorithms. The results demonstrate the robustness of our method, achieving an accuracy rate exceeding 99% across both datasets. Full article
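
The mean-teacher mechanics described above, an EMA teacher plus a consistency loss on noised unlabeled samples, can be sketched roughly as follows; the backbone, noise level, and decay are placeholders rather than the paper's settings.

```python
# Sketch of the mean-teacher idea described above (not the authors' code):
# the student learns from labeled data plus a consistency loss against the
# teacher's predictions on noised unlabeled samples; the teacher is an EMA copy.
import copy
import torch
import torch.nn.functional as F

def ema_update(teacher, student, decay=0.99):
    with torch.no_grad():
        for t, s in zip(teacher.parameters(), student.parameters()):
            t.mul_(decay).add_(s, alpha=1 - decay)

student = torch.nn.Linear(64, 8)               # placeholder backbone / 8 gestures
teacher = copy.deepcopy(student)

x_unlabeled = torch.randn(16, 64)
noisy = x_unlabeled + 0.1 * torch.randn_like(x_unlabeled)   # simulated noise

pseudo = teacher(noisy).softmax(dim=1).detach()             # teacher pseudo-labels
consistency = F.kl_div(student(x_unlabeled).log_softmax(dim=1), pseudo,
                       reduction="batchmean")
consistency.backward()                          # combined with a supervised loss in practice
ema_update(teacher, student)
```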

30 pages, 5445 KiB  
Article
End-to-End Ultrasonic Hand Gesture Recognition
by Elfi Fertl, Do Dinh Tan Nguyen, Martin Krueger, Georg Stettinger, Rubén Padial-Allué, Encarnación Castillo and Manuel P. Cuéllar
Sensors 2024, 24(9), 2740; https://doi.org/10.3390/s24092740 - 25 Apr 2024
Cited by 1 | Viewed by 2969
Abstract
As the number of electronic gadgets in our daily lives increases and most of them require some kind of human interaction, innovative, convenient input methods are in demand. State-of-the-art (SotA) ultrasound-based hand gesture recognition (HGR) systems have limitations in terms of robustness and accuracy. This research presents a novel machine learning (ML)-based end-to-end solution for hand gesture recognition with low-cost micro-electromechanical system (MEMS) ultrasonic transducers. In contrast to prior methods, our ML model processes the raw echo samples directly instead of using pre-processed data. Consequently, the processing flow presented in this work leaves it to the ML model to extract the important information from the echo data. The success of this approach is demonstrated as follows. Four MEMS ultrasonic transducers are placed in three different geometrical arrangements. For each arrangement, different types of ML models are optimized and benchmarked on datasets acquired with the presented custom hardware (HW): convolutional neural networks (CNNs), gated recurrent units (GRUs), long short-term memory (LSTM), vision transformer (ViT), and cross-attention multi-scale vision transformer (CrossViT). The last three of these models reached more than 88% accuracy. The most important finding of this research is that little pre-processing is necessary to obtain high accuracy in ultrasonic HGR for several arrangements of cost-effective and low-power MEMS ultrasonic transducer arrays; even the computationally intensive Fourier transform can be omitted. The presented approach is further compared to HGR systems using other sensor types such as vision, Wi-Fi, and radar, and to state-of-the-art ultrasound-based HGR systems. Direct processing of the sensor signals by a compact model makes ultrasonic hand gesture recognition a true low-cost and power-efficient input method. Full article
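
As a rough illustration of the end-to-end idea above, the sketch below feeds raw echo samples from four transducers straight into a small GRU classifier with no FFT or other pre-processing; the window length, layer sizes, and class count are assumptions.

```python
# Rough sketch (assumptions, not the paper's model): a GRU classifier fed raw
# ultrasonic echo samples from 4 transducers, with no FFT or other pre-processing.
import torch
import torch.nn as nn

class RawEchoGRU(nn.Module):
    def __init__(self, n_transducers=4, hidden=64, num_classes=6):
        super().__init__()
        self.gru = nn.GRU(input_size=n_transducers, hidden_size=hidden,
                          num_layers=1, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, echoes):                 # echoes: (batch, samples, transducers)
        _, h = self.gru(echoes)
        return self.head(h[-1])

model = RawEchoGRU()
echoes = torch.randn(2, 2048, 4)               # 2 gestures, 2048 raw samples, 4 channels
print(model(echoes).shape)                      # torch.Size([2, 6])
```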

23 pages, 14459 KiB  
Article
Investigating Effective Geometric Transformation for Image Augmentation to Improve Static Hand Gestures with a Pre-Trained Convolutional Neural Network
by Baiti-Ahmad Awaluddin, Chun-Tang Chao and Juing-Shian Chiou
Mathematics 2023, 11(23), 4783; https://doi.org/10.3390/math11234783 - 27 Nov 2023
Cited by 4 | Viewed by 3333
Abstract
Hand gesture recognition (HGR) is a challenging and fascinating research topic in computer vision with numerous daily life applications. In HGR, computers aim to identify and classify hand gestures. Despite previous efforts, the diversity of HGR datasets remains limited by the small number of hand gesture demonstrators, acquisition environments, and hand pose variations. Geometric image augmentations, including scaling, translation, rotation, flipping, and image shearing, are commonly used to address these limitations. However, research has yet to identify which geometric transformations are most effective for augmenting HGR datasets. This study employed three commonly utilized pre-trained models for image classification tasks, namely ResNet50, MobileNetV2, and InceptionV3. The system’s performance was evaluated on five static HGR datasets: DLSI, HG14, ArabicASL, MU HandImages ASL, and Sebastian Marcell. The experimental results demonstrate that many geometric transformations are unnecessary for HGR image augmentation: image shearing and horizontal flipping are the most influential transformations for augmenting the HGR dataset and achieving better classification performance. Moreover, ResNet50 outperforms MobileNetV2 and InceptionV3 for static HGR. Full article
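
A minimal augmentation pipeline built around the two transformations reported as most influential above, shearing and horizontal flipping, might look as follows in torchvision; the shear range and flip probability are illustrative, not the study's settings.

```python
# Sketch of an augmentation pipeline built around the two transformations the study
# found most influential, shearing and horizontal flipping (ranges are illustrative).
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomAffine(degrees=0, shear=15),   # shear up to +/-15 degrees
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
])

# Typically applied on the fly while loading training images, e.g.:
# dataset = torchvision.datasets.ImageFolder("hgr_train/", transform=augment)
```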

26 pages, 3814 KiB  
Article
SDViT: Stacking of Distilled Vision Transformers for Hand Gesture Recognition
by Chun Keat Tan, Kian Ming Lim, Chin Poo Lee, Roy Kwang Yang Chang and Ali Alqahtani
Appl. Sci. 2023, 13(22), 12204; https://doi.org/10.3390/app132212204 - 10 Nov 2023
Cited by 4 | Viewed by 2226
Abstract
Hand gesture recognition (HGR) is a rapidly evolving field with the potential to revolutionize human–computer interactions by enabling machines to interpret and understand human gestures for intuitive communication and control. However, HGR faces challenges such as the high similarity of hand gestures, real-time performance, and model generalization. To address these challenges, this paper proposes the stacking of distilled vision transformers, referred to as SDViT, for hand gesture recognition. First, a pretrained vision transformer (ViT) featuring a self-attention mechanism is introduced to effectively capture intricate connections among image patches, thereby enhancing its capability to handle the high similarity between hand gestures. Knowledge distillation is then proposed to compress the ViT model and improve model generalization. Finally, multiple distilled ViTs are stacked to achieve higher predictive performance and reduce overfitting. The proposed SDViT model achieves promising performance on three benchmark datasets for hand gesture recognition: the American Sign Language (ASL) dataset, the ASL with digits dataset, and the National University of Singapore (NUS) hand gesture dataset. The accuracies achieved on these datasets are 100.00%, 99.60%, and 100.00%, respectively. Full article
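
The distillation step above can be illustrated with a generic soft-label distillation loss, in which the student matches temperature-softened teacher outputs in addition to the ground-truth labels; the temperature and weighting below are placeholders, not necessarily the paper's recipe.

```python
# Generic knowledge-distillation loss (a common formulation, not necessarily the
# paper's exact recipe): the student matches softened teacher outputs plus labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=3.0, alpha=0.5):
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

s = torch.randn(8, 26, requires_grad=True)     # e.g. 26 ASL letter classes
t = torch.randn(8, 26)                          # frozen teacher predictions
y = torch.randint(0, 26, (8,))
distillation_loss(s, t, y).backward()
```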

16 pages, 7548 KiB  
Article
Robust Hand Gesture Recognition Using a Deformable Dual-Stream Fusion Network Based on CNN-TCN for FMCW Radar
by Meiyi Zhu, Chaoyi Zhang, Jianquan Wang, Lei Sun and Meixia Fu
Sensors 2023, 23(20), 8570; https://doi.org/10.3390/s23208570 - 19 Oct 2023
Cited by 5 | Viewed by 3120
Abstract
Hand Gesture Recognition (HGR) using Frequency Modulated Continuous Wave (FMCW) radars is difficult because of the inherent variability and ambiguity caused by individual habits and environmental differences. This paper proposes a deformable dual-stream fusion network based on CNN-TCN (DDF-CT) to solve this problem. First, we extract range, Doppler, and angle information from radar signals with the Fast Fourier Transform to produce range-time (RT) maps and range-angle maps (RAMs). Then, we reduce the noise of the feature maps. Subsequently, a RAM sequence (RAMS) is generated by temporally organizing the RAMs, which captures a target’s range and velocity characteristics at each time point while preserving the temporal feature information. To improve the accuracy and consistency of gesture recognition, DDF-CT incorporates deformable convolution and inter-frame attention mechanisms, which enhance the extraction of spatial features and the learning of temporal relationships. The experimental results show that our method achieves an accuracy of 98.61%, and even when tested in a novel environment, it still achieves an accuracy of 97.22%. Owing to this robust performance, our method significantly outperforms other existing HGR approaches. Full article
(This article belongs to the Special Issue Advances in Doppler and FMCW Radar Sensors)
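
For readers unfamiliar with the RT maps mentioned above: an FFT over the fast-time samples of each chirp yields range profiles, and stacking these over chirps gives a range-time map. The sketch below uses synthetic data, with chirp and sample counts as placeholders; angle processing over multiple receive antennas is omitted.

```python
# Sketch with synthetic data: a range FFT per chirp, stacked over chirps to form a
# range-time map (the RT map above); RA/Doppler processing over antennas is omitted.
import numpy as np

n_chirps, n_samples = 64, 256                  # placeholders for one gesture frame
iq = (np.random.randn(n_chirps, n_samples)
      + 1j * np.random.randn(n_chirps, n_samples))   # stand-in for beat-signal IQ data

range_profiles = np.fft.fft(iq, axis=1)        # FFT over fast time -> range bins
rt_map = np.abs(range_profiles[:, : n_samples // 2])  # keep positive range bins
print(rt_map.shape)                             # (64 chirps, 128 range bins)

# A second FFT across chirps (slow time) would give a range-Doppler map:
rd_map = np.abs(np.fft.fftshift(np.fft.fft(range_profiles, axis=0), axes=0))
```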

18 pages, 3045 KiB  
Article
Real-Time Monocular Skeleton-Based Hand Gesture Recognition Using 3D-Jointsformer
by Enmin Zhong, Carlos R. del-Blanco, Daniel Berjón, Fernando Jaureguizar and Narciso García
Sensors 2023, 23(16), 7066; https://doi.org/10.3390/s23167066 - 10 Aug 2023
Cited by 7 | Viewed by 3886
Abstract
Automatic hand gesture recognition in video sequences has widespread applications, ranging from home automation to sign language interpretation and clinical operations. The primary challenge lies in achieving real-time recognition while managing the temporal dependencies that can impact performance. Existing methods employ 3D convolutional or Transformer-based architectures with hand skeleton estimation, but both have limitations. To address these challenges, a hybrid approach that combines 3D Convolutional Neural Networks (3D-CNNs) and Transformers is proposed. The method uses a 3D-CNN to compute high-level semantic skeleton embeddings, capturing local spatial and temporal characteristics of hand gestures. A Transformer network with a self-attention mechanism is then employed to efficiently capture long-range temporal dependencies in the skeleton sequence. Evaluation on the Briareo and Multimodal Hand Gesture datasets yielded accuracy scores of 95.49% and 97.25%, respectively. Notably, this approach achieves real-time performance on a standard CPU, distinguishing it from methods that require specialized GPUs. In summary, the hybrid 3D-CNN and Transformer approach effectively combines real-time recognition with efficient handling of temporal dependencies, outperforming existing state-of-the-art methods in both accuracy and speed. Full article
(This article belongs to the Special Issue Computer Vision and Smart Sensors for Human-Computer Interaction)
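
A rough sketch of the hybrid pattern described above, a small 3D-CNN for local spatio-temporal skeleton embeddings followed by a Transformer encoder for long-range temporal dependencies, is given below; all shapes and layer sizes are illustrative, not those of 3D-Jointsformer.

```python
# Rough sketch of the hybrid pattern above (sizes are illustrative, not the paper's):
# a small 3D-CNN embeds local spatio-temporal skeleton structure, then a Transformer
# encoder captures long-range temporal dependencies before classification.
import torch
import torch.nn as nn

class Hybrid3DCNNTransformer(nn.Module):
    def __init__(self, num_classes=12, d_model=32):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv3d(3, d_model, kernel_size=(3, 3, 1), padding=(1, 1, 0)),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d((None, 1, 1)),        # pool joints, keep time
        )
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, x):                 # x: (batch, xyz=3, frames, joints, 1)
        z = self.cnn(x).flatten(2)        # (batch, d_model, frames)
        z = self.encoder(z.transpose(1, 2))
        return self.head(z.mean(dim=1))

model = Hybrid3DCNNTransformer()
clip = torch.randn(2, 3, 32, 21, 1)       # 2 clips, 32 frames, 21 hand joints
print(model(clip).shape)                   # torch.Size([2, 12])
```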

20 pages, 3325 KiB  
Article
HGR-ViT: Hand Gesture Recognition with Vision Transformer
by Chun Keat Tan, Kian Ming Lim, Roy Kwang Yang Chang, Chin Poo Lee and Ali Alqahtani
Sensors 2023, 23(12), 5555; https://doi.org/10.3390/s23125555 - 14 Jun 2023
Cited by 16 | Viewed by 5927
Abstract
Hand gesture recognition (HGR) is a crucial area of research that enhances communication by overcoming language barriers and facilitating human–computer interaction. Although previous works in HGR have employed deep neural networks, they fail to encode the orientation and position of the hand in the image. To address this issue, this paper proposes HGR-ViT, a Vision Transformer (ViT) model with an attention mechanism for hand gesture recognition. Given a hand gesture image, it is first split into fixed-size patches, which are linearly projected into patch embeddings. Positional embeddings are added to these patch embeddings to form learnable vectors that capture the positional information of the hand patches. The resulting sequence of vectors is then fed to a standard Transformer encoder to obtain the hand gesture representation. A multilayer perceptron head is added to the output of the encoder to classify the hand gesture into the correct class. The proposed HGR-ViT obtains accuracies of 99.98%, 99.36% and 99.85% on the American Sign Language (ASL) dataset, the ASL with Digits dataset, and the National University of Singapore (NUS) hand gesture dataset, respectively. Full article
(This article belongs to the Section Sensing and Imaging)
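
The pipeline described above, patching, positional embeddings, a Transformer encoder, and an MLP head, can be sketched minimally as follows; the patch size, embedding width, and mean pooling (rather than a class token) are illustrative simplifications, not HGR-ViT itself.

```python
# Minimal sketch of the pipeline described above (illustrative sizes, not HGR-ViT):
# split the image into fixed-size patches, add positional embeddings, run a Transformer
# encoder, and classify with an MLP head on the pooled sequence.
import torch
import torch.nn as nn

class TinyViT(nn.Module):
    def __init__(self, img=224, patch=16, dim=192, num_classes=26):
        super().__init__()
        n_patches = (img // patch) ** 2
        self.to_patches = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.zeros(1, n_patches, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.mlp_head = nn.Linear(dim, num_classes)

    def forward(self, img):
        x = self.to_patches(img).flatten(2).transpose(1, 2)  # (B, patches, dim)
        x = self.encoder(x + self.pos)                       # add positional embedding
        return self.mlp_head(x.mean(dim=1))                  # pooled representation

print(TinyViT()(torch.randn(1, 3, 224, 224)).shape)          # torch.Size([1, 26])
```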

35 pages, 15515 KiB  
Article
Design and Evaluation of an Alternative Control for a Quad-Rotor Drone Using Hand-Gesture Recognition
by Siavash Khaksar, Luke Checker, Bita Borazjan and Iain Murray
Sensors 2023, 23(12), 5462; https://doi.org/10.3390/s23125462 - 9 Jun 2023
Cited by 5 | Viewed by 2086
Abstract
Gesture recognition is a mechanism by which a system recognizes an expressive and purposeful action made by a user’s body. Hand-gesture recognition (HGR) is a staple of the gesture-recognition literature and has been keenly researched over the past 40 years. Over this time, HGR solutions have varied in medium, method, and application. Modern developments in machine perception have seen the rise of single-camera, skeletal-model hand-gesture identification algorithms such as MediaPipe Hands (MPH). This paper evaluates the applicability of these modern HGR algorithms within the context of alternative control. Specifically, this is achieved through the development of an HGR-based alternative-control system capable of controlling a quad-rotor drone. The technical importance of this paper stems from the results produced during the novel and clinically sound evaluation of MPH, alongside the investigatory framework used to develop the final HGR algorithm. The evaluation of MPH highlighted the Z-axis instability of its modelling system, which reduced the landmark accuracy of its output from 86.7% to 41.5%. The selection of an appropriate classifier complemented the computationally lightweight nature of MPH whilst compensating for its instability, achieving a classification accuracy of 96.25% for eight single-hand static gestures. The success of the developed HGR algorithm ensured that the proposed alternative-control system could facilitate intuitive, computationally inexpensive, and repeatable drone control without requiring specialised equipment. Full article
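
The MPH evaluated above is the MediaPipe Hands estimator; the snippet below shows, assuming the standard MediaPipe Python solutions API, how per-image landmarks can be extracted and flattened into a feature vector for a downstream static-gesture classifier (the image path is a placeholder).

```python
# Illustrative landmark extraction with MediaPipe Hands (MPH); "gesture.jpg" is a
# placeholder path. The z values are included here, although the paper above reports
# that the Z axis is unstable, so a classifier might rely mainly on x and y.
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1)
image = cv2.imread("gesture.jpg")                           # placeholder image
results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

if results.multi_hand_landmarks:
    lm = results.multi_hand_landmarks[0].landmark           # 21 hand landmarks
    features = [c for p in lm for c in (p.x, p.y, p.z)]     # 63-dim feature vector
    # 'features' would then be fed to a static-gesture classifier
hands.close()
```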

15 pages, 27462 KiB  
Article
A Collaborative Virtual Walkthrough of Matera’s Sassi Using Photogrammetric Reconstruction and Hand Gesture Navigation
by Nicla Maria Notarangelo, Gilda Manfredi and Gabriele Gilio
J. Imaging 2023, 9(4), 88; https://doi.org/10.3390/jimaging9040088 - 21 Apr 2023
Cited by 10 | Viewed by 2752
Abstract
The COVID-19 pandemic has underscored the need for real-time, collaborative virtual tools to support remote activities across various domains, including education and cultural heritage. Virtual walkthroughs provide a potent means of exploring, learning about, and interacting with historical sites worldwide. Nonetheless, creating realistic and user-friendly applications poses a significant challenge. This study investigates the potential of collaborative virtual walkthroughs as an educational tool for cultural heritage sites, with a focus on the Sassi of Matera, a UNESCO World Heritage Site in Italy. The virtual walkthrough application, developed using RealityCapture and Unreal Engine, leveraged photogrammetric reconstruction and deep learning-based hand gesture recognition to offer an immersive and accessible experience, allowing users to interact with the virtual environment using intuitive gestures. A test with 36 participants resulted in positive feedback regarding the application’s effectiveness, intuitiveness, and user-friendliness. The findings suggest that virtual walkthroughs can provide precise representations of complex historical locations, promoting tangible and intangible aspects of heritage. Future work should focus on expanding the reconstructed site, enhancing the performance, and assessing the impact on learning outcomes. Overall, this study highlights the potential of virtual walkthrough applications as a valuable resource for architecture, cultural heritage, and environmental education. Full article
(This article belongs to the Special Issue The Roles of the Collaborative eXtended Reality in the New Social Era)

18 pages, 3322 KiB  
Article
Recognition of Hand Gestures Based on EMG Signals with Deep and Double-Deep Q-Networks
by Ángel Leonardo Valdivieso Caraguay, Juan Pablo Vásconez, Lorena Isabel Barona López and Marco E. Benalcázar
Sensors 2023, 23(8), 3905; https://doi.org/10.3390/s23083905 - 12 Apr 2023
Cited by 15 | Viewed by 6756
Abstract
In recent years, hand gesture recognition (HGR) technologies that use electromyography (EMG) signals have been of considerable interest for developing human–machine interfaces. Most state-of-the-art HGR approaches are based mainly on supervised machine learning (ML). However, the use of reinforcement learning (RL) techniques to classify EMG signals is still a new and open research topic. RL-based methods have some advantages, such as promising classification performance and online learning from the user’s experience. In this work, we propose a user-specific HGR system based on an RL agent that learns to characterize EMG signals from five different hand gestures using Deep Q-Network (DQN) and Double-Deep Q-Network (Double-DQN) algorithms. Both methods use a feed-forward artificial neural network (ANN) to represent the agent policy. We also performed additional tests by adding a long short-term memory (LSTM) layer to the ANN to analyze and compare its performance. We performed experiments using training, validation, and test sets from our public dataset, EMG-EPN-612. The final accuracy results demonstrate that the best model was the DQN without LSTM, obtaining classification and recognition accuracies of up to 90.37% ± 10.7% and 82.52% ± 10.9%, respectively. The results obtained in this work demonstrate that RL methods such as DQN and Double-DQN can achieve promising results for classification and recognition problems based on EMG signals. Full article
(This article belongs to the Special Issue Electromyography (EMG) Signal Acquisition and Processing)
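
A compact DQN loop of the kind described above, epsilon-greedy action selection plus a TD target from a separate target network, is sketched below; the EMG feature dimension, reward scheme, and hyperparameters are placeholders, not the paper's configuration.

```python
# Compact DQN sketch (placeholders throughout, not the paper's configuration):
# a feed-forward Q-network over an EMG feature window, epsilon-greedy actions,
# and a TD target computed with a periodically synchronized target network.
import copy
import random
import torch
import torch.nn as nn

n_features, n_gestures, gamma = 64, 5, 0.99
q_net = nn.Sequential(nn.Linear(n_features, 128), nn.ReLU(), nn.Linear(128, n_gestures))
target_net = copy.deepcopy(q_net)
optim = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def act(state, eps=0.1):                     # epsilon-greedy gesture "action"
    if random.random() < eps:
        return random.randrange(n_gestures)
    return q_net(state).argmax().item()

# one illustrative update on a fake transition (state, action, reward, next_state)
s, s2 = torch.randn(n_features), torch.randn(n_features)
a, r = act(s), 1.0                           # e.g. reward 1.0 for a correct classification
with torch.no_grad():
    target = r + gamma * target_net(s2).max()
loss = nn.functional.mse_loss(q_net(s)[a], target)
optim.zero_grad(); loss.backward(); optim.step()
target_net.load_state_dict(q_net.state_dict())   # synced only occasionally in practice
```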

13 pages, 1815 KiB  
Article
Gesture Detection and Recognition Based on Object Detection in Complex Background
by Renxiang Chen and Xia Tian
Appl. Sci. 2023, 13(7), 4480; https://doi.org/10.3390/app13074480 - 31 Mar 2023
Cited by 17 | Viewed by 4564
Abstract
To address the problem of low recognition accuracy and slow speed with complex backgrounds in practical human–computer interaction, a hand gesture recognition method based on an improved YOLOv5 is proposed. By replacing the CSP1_x module in the YOLOv5 backbone network with an efficient layer aggregation network, a richer combination of gradient paths can be obtained to improve the network’s learning and expressive capabilities and enhance recognition speed. The CBAM attention mechanism is introduced to filter gesture features in the channel and spatial dimensions, reducing various types of interference in complex-background gesture images and enhancing the network’s robustness against complex backgrounds. Experimental verification was conducted on two complex-background gesture datasets, EgoHands and TinyHGR, yielding mAP0.5:0.95 recognition accuracies of 75.6% and 66.8%, respectively, and a recognition speed of 64 FPS for 640 × 640 input images. The results show that the proposed method can recognize gestures quickly and accurately against complex backgrounds, and has higher recognition accuracy and stronger robustness than YOLOv5l, YOLOv7, and other comparative algorithms. Full article
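
The CBAM block referred to above combines channel attention and spatial attention; a compact, generic implementation is sketched below, with layer sizes that are illustrative and not taken from the paper's YOLOv5 configuration.

```python
# Compact CBAM block (the standard channel-then-spatial attention pattern; layer
# sizes are illustrative and not taken from the paper's YOLOv5 configuration).
import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(channels, channels // reduction), nn.ReLU(),
                                 nn.Linear(channels // reduction, channels))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):                                   # x: (B, C, H, W)
        avg = self.mlp(x.mean(dim=(2, 3)))                  # channel attention
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx)[:, :, None, None]
        s = torch.cat([x.mean(dim=1, keepdim=True),         # spatial attention
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))

feat = torch.randn(1, 64, 80, 80)                           # a backbone feature map
print(CBAM(64)(feat).shape)                                  # torch.Size([1, 64, 80, 80])
```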
