Search Results (117)

Search Parameters:
Keywords = hand–gesture interface

27 pages, 11232 KB  
Article
Aerokinesis: An IoT-Based Vision-Driven Gesture Control System for Quadcopter Navigation Using Deep Learning and ROS2
by Sergei Kondratev, Yulia Dyrchenkova, Georgiy Nikitin, Leonid Voskov, Vladimir Pikalov and Victor Meshcheryakov
Technologies 2026, 14(1), 69; https://doi.org/10.3390/technologies14010069 - 16 Jan 2026
Viewed by 27
Abstract
This paper presents Aerokinesis, an IoT-based software–hardware system for intuitive gesture-driven control of quadcopter unmanned aerial vehicles (UAVs), developed within the Robot Operating System 2 (ROS2) framework. The proposed system addresses the challenge of providing an accessible human–drone interaction interface for operators in scenarios where traditional remote controllers are impractical or unavailable. The architecture comprises two hierarchical control levels: (1) high-level discrete command control utilizing a fully connected neural network classifier for static gesture recognition, and (2) low-level continuous flight control based on three-dimensional hand keypoint analysis from a depth camera. The gesture classification module achieves an accuracy exceeding 99% using a multi-layer perceptron trained on MediaPipe-extracted hand landmarks. For continuous control, we propose a novel approach that computes Euler angles (roll, pitch, yaw) and throttle from 3D hand pose estimation, enabling intuitive four-degree-of-freedom quadcopter manipulation. A hybrid signal filtering pipeline ensures robust control signal generation while maintaining real-time responsiveness. Comparative user studies demonstrate that gesture-based control reduces task completion time by 52.6% for beginners compared to conventional remote controllers. The results confirm the viability of vision-based gesture interfaces for IoT-enabled UAV applications. Full article
(This article belongs to the Section Information and Communication Technologies)
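
The continuous-control idea described in this abstract (deriving roll, pitch, yaw, and throttle from 3D hand keypoints) can be illustrated with a short sketch. The landmark indices, axis conventions, and hover height below are assumptions for illustration, not the authors' exact formulation: the palm plane is estimated from the wrist and the index/pinky knuckles, and throttle is taken from hand height.

```python
import numpy as np

# Hypothetical landmark indices following the MediaPipe Hands ordering
WRIST, INDEX_MCP, MIDDLE_MCP, PINKY_MCP = 0, 5, 9, 17

def hand_to_attitude(landmarks_xyz, hover_height=0.35):
    """Map 21 3D hand keypoints (metres, camera frame) to roll, pitch, yaw, throttle.

    Axis conventions here are illustrative assumptions: x right, y down,
    z away from the camera, as in typical depth-camera frames.
    """
    p = np.asarray(landmarks_xyz, dtype=float)            # shape (21, 3)
    wrist, idx, mid, pinky = p[WRIST], p[INDEX_MCP], p[MIDDLE_MCP], p[PINKY_MCP]

    # Two in-palm vectors span the palm plane; their cross product is the palm normal.
    v_across = pinky - idx                                 # across the knuckles
    v_along = mid - wrist                                  # wrist -> middle knuckle
    normal = np.cross(v_along, v_across)
    normal /= np.linalg.norm(normal) + 1e-9

    # Tilting the palm left/right and forward/back maps to roll and pitch.
    roll = np.arctan2(normal[0], -normal[1])               # rad, about the forward axis
    pitch = np.arctan2(normal[2], -normal[1])              # rad, about the lateral axis

    # Rotating the hand about the vertical axis maps to yaw.
    yaw = np.arctan2(v_along[0], v_along[2])

    # Raising or lowering the hand relative to a neutral height maps to throttle in [-1, 1].
    throttle = np.clip((hover_height - wrist[1]) / hover_height, -1.0, 1.0)
    return roll, pitch, yaw, throttle
```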

31 pages, 9303 KB  
Article
Automatic Quadrotor Dispatch Missions Based on Air-Writing Gesture Recognition
by Pu-Sheng Tsai, Ter-Feng Wu and Yen-Chun Wang
Processes 2025, 13(12), 3984; https://doi.org/10.3390/pr13123984 - 9 Dec 2025
Viewed by 444
Abstract
This study develops an automatic dispatch system for quadrotor UAVs that integrates air-writing gesture recognition with a graphical user interface (GUI). The DJI RoboMaster quadrotor UAV (DJI, Shenzhen, China) was employed as the experimental platform, combined with an ESP32 microcontroller (Espressif Systems, Shanghai, China) and the RoboMaster SDK (version 3.0). On the Python (version 3.12.7) platform, a GUI was implemented using Tkinter (version 8.6), allowing users to input addresses or landmarks, which were then automatically converted into geographic coordinates and imported into Google Maps for route planning. The generated flight commands were transmitted to the UAV via a UDP socket, enabling remote autonomous flight. For gesture recognition, a Raspberry Pi integrated with the MediaPipe Hands module was used to capture 16 types of air-written flight commands in real time through a camera. The training samples were categorized into one-dimensional coordinates and two-dimensional images. In the one-dimensional case, X/Y axis coordinates were concatenated after data augmentation, interpolation, and normalization. In the two-dimensional case, three types of images were generated, namely font trajectory plots (T-plots), coordinate-axis plots (XY-plots), and composite plots combining the two (XYT-plots). To evaluate classification performance, several machine learning and deep learning architectures were employed, including a multi-layer perceptron (MLP), support vector machine (SVM), one-dimensional convolutional neural network (1D-CNN), and two-dimensional convolutional neural network (2D-CNN). The results demonstrated effective recognition accuracy across different models and sample formats, verifying the feasibility of the proposed air-writing trajectory framework for non-contact gesture-based UAV control. Furthermore, by combining gesture recognition with a GUI-based map planning interface, the system enhances the intuitiveness and convenience of UAV operation. Future extensions, such as incorporating aerial image object recognition, could extend the framework’s applications to scenarios including forest disaster management, vehicle license plate recognition, and air pollution monitoring. Full article
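
The one-dimensional sample format described above (X/Y coordinates interpolated, normalized, and concatenated) can be sketched as follows; the fixed length of 64 points and the min-max normalization are assumptions, not the paper's exact parameters.

```python
import numpy as np

def trajectory_to_feature(xs, ys, n_points=64):
    """Resample an air-written fingertip trajectory to a fixed length and
    concatenate the normalized X and Y coordinates into one 1D feature vector."""
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    # Interpolate both axes over a common index grid to get a fixed-length sequence.
    t_old = np.linspace(0.0, 1.0, len(xs))
    t_new = np.linspace(0.0, 1.0, n_points)
    xs_r = np.interp(t_new, t_old, xs)
    ys_r = np.interp(t_new, t_old, ys)

    # Min-max normalize each axis to [0, 1] so absolute scale and position do not matter.
    def norm(a):
        rng = a.max() - a.min()
        return (a - a.min()) / rng if rng > 0 else np.zeros_like(a)

    return np.concatenate([norm(xs_r), norm(ys_r)])        # shape (2 * n_points,)

# Example: a rough circle drawn in the air becomes a 128-dimensional vector
# suitable for an MLP, SVM, or 1D-CNN classifier.
theta = np.linspace(0, 2 * np.pi, 200)
feature = trajectory_to_feature(np.cos(theta), np.sin(theta))
print(feature.shape)  # (128,)
```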

22 pages, 1145 KB  
Article
TSMTFN: Two-Stream Temporal Shift Module Network for Efficient Egocentric Gesture Recognition in Virtual Reality
by Muhammad Abrar Hussain, Chanjun Chun and SeongKi Kim
Virtual Worlds 2025, 4(4), 58; https://doi.org/10.3390/virtualworlds4040058 - 4 Dec 2025
Viewed by 370
Abstract
Egocentric hand gesture recognition is vital for natural human–computer interaction in augmented and virtual reality (AR/VR) systems. However, most deep learning models struggle to balance accuracy and efficiency, limiting real-time use on wearable devices. This paper introduces a Two-Stream Temporal Shift Module Transformer Fusion Network (TSMTFN) that achieves high recognition accuracy with low computational cost. The model integrates Temporal Shift Modules (TSMs) for efficient motion modeling and a Transformer-based fusion mechanism for long-range temporal understanding, operating on dual RGB-D streams to capture complementary visual and depth cues. Training stability and generalization are enhanced through full-layer training from epoch 1 and MixUp/CutMix augmentations. Evaluated on the EgoGesture dataset, TSMTFN attained 96.18% top-1 accuracy and 99.61% top-5 accuracy on the independent test set with only 16 GFLOPs and 21.3M parameters, offering a 2.4–4.7× reduction in computation compared to recent state-of-the-art methods. The model runs at 15.10 samples/s, achieving real-time performance. The results demonstrate robust recognition across over 95% of gesture classes and minimal inter-class confusion, establishing TSMTFN as an efficient, accurate, and deployable solution for next-generation wearable AR/VR gesture interfaces. Full article
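
The Temporal Shift Module at the core of the model exchanges a fraction of channels between neighbouring frames so that a 2D backbone can model motion at near-zero extra cost. Below is a minimal NumPy sketch of the shift operation; the 1/8 shift fraction follows the original TSM formulation and is an assumption about this particular network.

```python
import numpy as np

def temporal_shift(x, shift_div=8):
    """Shift part of the channels along the temporal axis.

    x: array of shape (T, C, H, W) holding one clip's feature maps.
    The first C//shift_div channels are shifted one frame backward in time,
    the next C//shift_div one frame forward; the rest stay in place.
    Out-of-range positions are zero-filled, as in the standard TSM formulation.
    """
    t, c, h, w = x.shape
    fold = c // shift_div
    out = np.zeros_like(x)
    out[:-1, :fold] = x[1:, :fold]                      # future frame -> current frame
    out[1:, fold:2 * fold] = x[:-1, fold:2 * fold]      # past frame -> current frame
    out[:, 2 * fold:] = x[:, 2 * fold:]                 # untouched channels
    return out

# Example: a 16-frame clip with 64 channels of 7x7 feature maps.
clip = np.random.rand(16, 64, 7, 7).astype(np.float32)
print(temporal_shift(clip).shape)  # (16, 64, 7, 7)
```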

3729 KB  
Proceeding Paper
A Smart Glove-Based System for Dynamic Sign Language Translation Using LSTM Networks
by Tabassum Kanwal, Saud Altaf, Rehan Mehmood Yousaf and Kashif Sattar
Eng. Proc. 2025, 118(1), 45; https://doi.org/10.3390/ECSA-12-26530 - 7 Nov 2025
Viewed by 429
Abstract
This research presents a novel, real-time Pakistani Sign Language (PSL) recognition system utilizing a custom-designed sensory glove integrated with advanced machine learning techniques. The system aims to bridge communication gaps for individuals with hearing and speech impairments by translating hand gestures into readable text. At the core of this work is a smart glove engineered with five resistive flex sensors for precise finger flexion detection and a 9-DOF Inertial Measurement Unit (IMU) for capturing hand orientation and movement. The glove is powered by a compact microcontroller, which processes the analog and digital sensor inputs and transmits the data wirelessly to a host computer. A rechargeable 3.7 V Li-Po battery ensures portability, while a dynamic dataset comprising both static alphabet gestures and dynamic PSL phrases was recorded using this setup. The collected data was used to train two models: a Support Vector Machine with feature extraction (SVM-FE) and a Long Short-Term Memory (LSTM) deep learning network. The LSTM model outperformed traditional methods, achieving an accuracy of 98.6% in real-time gesture recognition. The proposed system demonstrates robust performance and offers practical applications in smart home interfaces, virtual and augmented reality, gaming, and assistive technologies. By combining ergonomic hardware with intelligent algorithms, this research takes a significant step toward inclusive communication and more natural human–machine interaction. Full article
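
A minimal Keras sketch of the LSTM branch is given below, assuming windows of 50 time steps with 14 channels (five flex sensors plus a 9-DOF IMU) and a vocabulary of 40 PSL classes; these shapes are illustrative, not the authors' exact configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

TIMESTEPS, CHANNELS, NUM_CLASSES = 50, 14, 40   # assumed window size and class count

def build_psl_lstm():
    """Stacked LSTM classifier for glove sensor sequences (flex + IMU channels)."""
    model = models.Sequential([
        layers.Input(shape=(TIMESTEPS, CHANNELS)),
        layers.LSTM(64, return_sequences=True),   # first layer keeps the full sequence
        layers.Dropout(0.3),
        layers.LSTM(32),                          # second layer summarizes it
        layers.Dense(64, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_psl_lstm()
model.summary()
```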

2177 KB  
Proceeding Paper
Hand Gesture to Sound: A Real-Time DSP-Based Audio Modulation System for Assistive Interaction
by Laiba Khan, Hira Mariam, Marium Sajid, Aymen Khan and Zehra Fatima
Eng. Proc. 2025, 118(1), 27; https://doi.org/10.3390/ECSA-12-26516 - 7 Nov 2025
Viewed by 223
Abstract
This paper presents the design, development, and evaluation of an embedded hardware and digital signal processing (DSP)-based real-time gesture-controlled system. The system architecture utilizes an MPU6050 inertial measurement unit (IMU), Arduino Uno microcontroller, and Python-based audio interface to recognize and classify directional hand gestures and transform them into auditory commands. Wrist tilts, i.e., left, right, forward, and backward, are recognized using a hybrid algorithm that combines thresholding, moving average filtering, and low-pass smoothing to remove sensor noise and transient errors. The hardware setup uses I2C-based sensor acquisition, onboard preprocessing on the Arduino, and serial communication with a host computer running a Python script to trigger audio playback using the playsound library. Four gestures are programmed for basic needs: Hydration Request, Meal Support, Restroom Support, and Emergency Alarm. Experimental evaluation, conducted over more than 50 iterations per gesture in a controlled laboratory setup, resulted in a mean recognition rate of 92%, with system latency of 120–150 milliseconds. The approach requires minimal calibration, is low-cost, and offers low-latency performance comparable to more advanced camera-based or machine learning-based methods, making it suitable for portable assistive devices. Full article
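
The hybrid thresholding pipeline described above can be sketched in a few lines of Python; the window size, tilt threshold, and the mapping from tilt direction to audio file are illustrative assumptions rather than the authors' calibrated values.

```python
from collections import deque

WINDOW = 10                 # moving-average window (samples), assumed
TILT_THRESHOLD = 0.35       # normalized accelerometer threshold, assumed
GESTURE_AUDIO = {           # hypothetical gesture -> audio file mapping
    "left": "hydration_request.mp3",
    "right": "meal_support.mp3",
    "forward": "restroom_support.mp3",
    "backward": "emergency_alarm.mp3",
}

class TiltClassifier:
    """Smooth raw MPU6050 accelerometer readings and detect sustained wrist tilts."""
    def __init__(self):
        self.ax = deque(maxlen=WINDOW)
        self.ay = deque(maxlen=WINDOW)

    def update(self, accel_x, accel_y):
        """Feed one normalized (x, y) sample; return a gesture name or None."""
        self.ax.append(accel_x)
        self.ay.append(accel_y)
        if len(self.ax) < WINDOW:
            return None                       # not enough samples to smooth yet
        mx = sum(self.ax) / WINDOW            # moving-average filtering
        my = sum(self.ay) / WINDOW
        if mx < -TILT_THRESHOLD:
            return "left"
        if mx > TILT_THRESHOLD:
            return "right"
        if my > TILT_THRESHOLD:
            return "forward"
        if my < -TILT_THRESHOLD:
            return "backward"
        return None

# On the host side, a detected gesture would trigger playback, e.g.
#   playsound(GESTURE_AUDIO[gesture])
```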

21 pages, 1507 KB  
Article
Embodied Co-Creation with Real-Time Generative AI: An Ukiyo-E Interactive Art Installation
by Hisa Nimi, Meizhu Lu and Juan Carlos Chacon
Digital 2025, 5(4), 61; https://doi.org/10.3390/digital5040061 - 7 Nov 2025
Viewed by 2361
Abstract
Generative artificial intelligence (AI) is reshaping creative practices, yet many systems rely on traditional interfaces, limiting intuitive and embodied engagement. This study presents a qualitative observational analysis of participant interactions with a real-time generative AI installation designed to co-create Ukiyo-e-style artwork through embodied inputs. The system dynamically interprets physical presence, object manipulation, body poses, and gestures to influence AI-generated visuals displayed on a large public screen. Drawing on systematic video analysis and detailed interaction logs across 13 sessions, the research identifies core modalities of interaction, patterns of co-creation, and user responses. Tangible objects with salient visual features such as color and pattern emerged as the primary, most intuitive input method, while bodily poses and hand gestures served as compositional modifiers. The system’s immediate feedback loop enabled rapid learning and iterative exploration and enhanced the user’s feeling of control. Users engaged in collaborative discovery, turn-taking, and shared authorship, frequently expressing a positive effect. The findings highlight how embodied interaction lowers cognitive barriers, enhances engagement, and supports meaningful human–AI collaboration. This study offers design implications for future creative AI systems, emphasizing accessibility, playful exploration, and cultural resonance, with the potential to democratize artistic expression and foster deeper public engagement with digital cultural heritage. Full article
(This article belongs to the Special Issue Advances in Semantic Multimedia and Personalized Digital Content)

22 pages, 4342 KB  
Article
Cloud-Based Personalized sEMG Classification Using Lightweight CNNs for Long-Term Haptic Communication in Deaf-Blind Individuals
by Kaavya Tatavarty, Maxwell Johnson and Boris Rubinsky
Bioengineering 2025, 12(11), 1167; https://doi.org/10.3390/bioengineering12111167 - 27 Oct 2025
Viewed by 889
Abstract
Deaf-blindness, particularly in progressive conditions such as Usher syndrome, presents profound challenges to communication, independence, and access to information. Existing tactile communication technologies for individuals with Usher syndrome are often limited by the need for close physical proximity to trained interpreters, typically requiring hand-to-hand contact. In this study, we introduce a novel, cloud-based, AI-assisted gesture recognition and haptic communication system designed for long-term use by individuals with Usher syndrome, whose auditory and visual abilities deteriorate with age. Central to our approach is a wearable haptic interface that relocates tactile input and output from the hands to an arm-mounted sleeve, thereby preserving manual dexterity and enabling continuous, bidirectional tactile interaction. The system uses surface electromyography (sEMG) to capture user-specific muscle activations in the hand and forearm and employs lightweight, personalized convolutional neural networks (CNNs), hosted on a centralized server, to perform real-time gesture classification. A key innovation of the system is its ability to adapt over time to each user’s evolving physiological condition, including the progressive loss of vision and hearing. Experimental validation using a public dataset, along with real-time testing involving seven participants, demonstrates that personalized models consistently outperform cross-user models in terms of accuracy, adaptability, and usability. This platform offers a scalable, longitudinally adaptable solution for non-visual communication and holds significant promise for advancing assistive technologies for the deaf-blind community. Full article
(This article belongs to the Section Biosignal Processing)
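
A lightweight, personalized CNN of the kind described above might look like the following Keras sketch, assuming 8 sEMG channels, 200-sample windows, and a per-user gesture vocabulary of 10 classes; all shapes are assumptions for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

WINDOW, CHANNELS, NUM_GESTURES = 200, 8, 10   # assumed sEMG window and class count

def build_personalized_semg_cnn():
    """Small 1D CNN intended to be trained (or fine-tuned) per user on a server."""
    model = models.Sequential([
        layers.Input(shape=(WINDOW, CHANNELS)),
        layers.Conv1D(16, kernel_size=7, activation="relu", padding="same"),
        layers.MaxPooling1D(2),
        layers.Conv1D(32, kernel_size=5, activation="relu", padding="same"),
        layers.MaxPooling1D(2),
        layers.GlobalAveragePooling1D(),        # keeps the parameter count low
        layers.Dense(32, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(NUM_GESTURES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# One such model would be kept per user on the cloud server and periodically
# retrained as the user's sEMG patterns drift over time.
model = build_personalized_semg_cnn()
model.summary()
```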

14 pages, 1197 KB  
Article
An Inclusive Offline Learning Platform Integrating Gesture Recognition and Local AI Models
by Marius-Valentin Drăgoi, Ionuț Nisipeanu, Roxana-Adriana Puiu, Florentina-Geanina Tache, Teodora-Mihaela Spiridon-Mocioacă, Alexandru Hank and Cozmin Cristoiu
Biomimetics 2025, 10(10), 693; https://doi.org/10.3390/biomimetics10100693 - 14 Oct 2025
Viewed by 788
Abstract
This paper introduces a gesture-controlled conversational interface driven by a local AI model, aimed at improving accessibility and facilitating hands-free interaction within digital environments. The technology utilizes real-time hand gesture recognition via a typical laptop camera and connects with a local AI engine to produce customized learning materials. Users can peruse educational documents, obtain topic summaries, and generate automated quizzes with intuitive gestures, including lateral finger movements, a two-finger gesture, or an open palm, without the need for conventional input devices. Upon selection of a file, the AI model analyzes its whole content, producing a structured summary and a multiple-choice assessment, both of which are immediately saved for subsequent inspection. A unified set of gestures facilitates seamless navigating within the user interface and the opened documents. The system underwent testing with university students and faculty (n = 31), utilizing assessment measures such as gesture detection accuracy, command-response latency, and user satisfaction. The findings demonstrate that the system offers a seamless, hands-free user experience with significant potential for usage in accessibility, human–computer interaction, and intelligent interface design. This work advances the creation of multimodal AI-driven educational aids, providing a pragmatic framework for gesture-based document navigation and intelligent content enhancement. Full article
(This article belongs to the Special Issue Biomimicry for Optimization, Control, and Automation: 3rd Edition)
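
The gesture-to-action mapping described above (lateral finger movements for navigation, a two-finger gesture and an open palm for content commands) can be sketched as a small dispatcher; the gesture labels and callbacks below are hypothetical, not the platform's actual API.

```python
# Hypothetical dispatcher mapping recognized gesture labels to platform actions.
def next_page():        print("scrolling to the next page")
def previous_page():    print("scrolling to the previous page")
def summarize_doc():    print("asking the local AI model for a structured summary")
def generate_quiz():    print("asking the local AI model for a multiple-choice quiz")

GESTURE_ACTIONS = {
    "swipe_left": previous_page,     # lateral finger movement
    "swipe_right": next_page,        # lateral finger movement
    "two_fingers": summarize_doc,
    "open_palm": generate_quiz,
}

def dispatch(gesture_label):
    """Run the action bound to a recognized gesture; ignore unknown labels."""
    action = GESTURE_ACTIONS.get(gesture_label)
    if action is not None:
        action()

dispatch("open_palm")   # -> asking the local AI model for a multiple-choice quiz
```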

19 pages, 20388 KB  
Article
Radar-Based Gesture Recognition Using Adaptive Top-K Selection and Multi-Stream CNNs
by Jiseop Park and Jaejin Jeong
Sensors 2025, 25(20), 6324; https://doi.org/10.3390/s25206324 - 13 Oct 2025
Viewed by 1113
Abstract
With the proliferation of the Internet of Things (IoT), gesture recognition has attracted attention as a core technology in human–computer interaction (HCI). In particular, mmWave frequency-modulated continuous-wave (FMCW) radar has emerged as an alternative to vision-based approaches due to its robustness to illumination changes and advantages in privacy. However, in real-world human–machine interface (HMI) environments, hand gestures are inevitably accompanied by torso- and arm-related reflections, which can also contain gesture-relevant variations. To effectively capture these variations without discarding them, we propose a preprocessing method called Adaptive Top-K Selection, which leverages vector entropy to summarize and preserve informative signals from both hand and body reflections. In addition, we present a Multi-Stream EfficientNetV2 architecture that jointly exploits temporal range and Doppler trajectories, together with radar-specific data augmentation and a training optimization strategy. In experiments on the publicly available FMCW gesture dataset released by the Karlsruhe Institute of Technology, the proposed method achieved an average accuracy of 99.5%. These results show that the proposed approach enables accurate and reliable gesture recognition even in realistic HMI environments with co-existing body reflections. Full article
(This article belongs to the Special Issue Sensor Technologies for Radar Detection)
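
One way to read the Adaptive Top-K Selection step is as an entropy ranking over range bins of the range-Doppler map: bins whose Doppler vectors carry more information are kept, the rest discarded. The sketch below is an interpretation under that assumption, not the authors' exact algorithm.

```python
import numpy as np

def adaptive_top_k(range_doppler, k=16, eps=1e-12):
    """Keep the k range bins whose Doppler spectra have the highest entropy.

    range_doppler: non-negative magnitude map, shape (num_range_bins, num_doppler_bins).
    Returns the selected bin indices and the reduced (k, num_doppler_bins) map.
    """
    rd = np.abs(np.asarray(range_doppler, dtype=float))
    # Normalize each range bin's Doppler vector into a probability distribution.
    probs = rd / (rd.sum(axis=1, keepdims=True) + eps)
    # Shannon entropy per range bin: higher entropy ~ richer, more informative spectrum.
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)
    top_k = np.argsort(entropy)[-k:][::-1]
    return top_k, rd[top_k]

# Example with a synthetic 64 x 128 range-Doppler magnitude map.
rd_map = np.random.rand(64, 128)
bins, reduced = adaptive_top_k(rd_map, k=16)
print(bins.shape, reduced.shape)  # (16,) (16, 128)
```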

26 pages, 7995 KB  
Article
Smart Home Control Using Real-Time Hand Gesture Recognition and Artificial Intelligence on Raspberry Pi 5
by Thomas Hobbs and Anwar Ali
Electronics 2025, 14(20), 3976; https://doi.org/10.3390/electronics14203976 - 10 Oct 2025
Viewed by 3322
Abstract
This paper outlines the process of developing a low-cost system for home appliance control via real-time hand gesture classification using Computer Vision and a custom lightweight machine learning model. The system strives to enable those with speech or hearing disabilities to interface with smart home devices in real time using hand gestures, much as voice-activated ‘smart assistants’ currently allow. The system runs on a Raspberry Pi 5 to enable future IoT integration and reduce costs, and uses the official camera module v2 and 7-inch touchscreen. Frame preprocessing uses MediaPipe to assign hand coordinates and NumPy tools to normalise them; a machine learning model then predicts the gesture. The model, a feed-forward network consisting of five fully connected layers, was built using Keras 3 and compiled with TensorFlow Lite. Training data was drawn from the HaGRIDv2 dataset, reduced from its original 23 one- and two-handed gestures to 15 one-handed gestures. Training on this data yielded validation metrics of 0.90 accuracy and 0.31 loss. The system can control both analogue and digital hardware via GPIO pins and, when recognising a gesture, averages 20.4 frames per second with no observable delay. Full article
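
The preprocessing step described above (MediaPipe hand coordinates normalised with NumPy before classification) could look roughly like the sketch below; translating to the wrist and scaling by the largest landmark distance are assumptions about the normalisation scheme, not the paper's stated method.

```python
import numpy as np

def normalise_landmarks(landmarks_xy):
    """Normalise 21 MediaPipe hand landmarks into a translation- and
    scale-invariant feature vector for a feed-forward classifier.

    landmarks_xy: array-like of shape (21, 2) with image-relative x, y coordinates.
    """
    pts = np.asarray(landmarks_xy, dtype=np.float32)
    pts = pts - pts[0]                           # translate so the wrist is the origin
    scale = np.max(np.linalg.norm(pts, axis=1))  # largest distance from the wrist
    if scale > 0:
        pts /= scale                             # remove dependence on hand size / distance
    return pts.flatten()                         # shape (42,), input to the dense network

# A TensorFlow Lite interpreter would then consume this vector, e.g.
#   interpreter.set_tensor(input_index, normalise_landmarks(hand)[None, :])
```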

11 pages, 1005 KB  
Proceeding Paper
Multimodal Fusion for Enhanced Human–Computer Interaction
by Ajay Sharma, Isha Batra, Shamneesh Sharma and Anggy Pradiftha Junfithrana
Eng. Proc. 2025, 107(1), 81; https://doi.org/10.3390/engproc2025107081 - 10 Sep 2025
Cited by 1 | Viewed by 1453
Abstract
Our paper introduces a novel virtual mouse driven by gesture detection, eye-tracking, and voice recognition. This system uses cutting-edge computer vision and machine learning technology to let users command and control the mouse pointer using eye motions, voice commands, or hand gestures. Its main goal is to provide an easy and engaging interface for users who want a more natural, hands-free approach to interacting with their computers, as well as for those with impairments that limit their bodily motions, such as paralysis. The system improves accessibility and usability by combining multiple input modalities, providing a flexible solution for a wide range of users. While the speech recognition function permits hands-free operation via voice instructions, the eye-tracking component detects and responds to the user’s gaze, providing precise cursor control. Gesture recognition complements these features by letting users execute mouse operations with simple hand movements. This technology not only enhances the user experience for people with impairments but also marks a major development in human–computer interaction. It shows how computer vision and machine learning can be used to build more inclusive and flexible user interfaces, improving the accessibility and efficiency of computer use for everyone. Full article

10 pages, 2931 KB  
Proceeding Paper
Dynamic Hand Gesture Recognition Using MediaPipe and Transformer
by Hsin-Hua Li and Chen-Chiung Hsieh
Eng. Proc. 2025, 108(1), 22; https://doi.org/10.3390/engproc2025108022 - 3 Sep 2025
Viewed by 6029
Abstract
We developed a low-cost, high-performance gesture recognition system with a dynamic hand gesture recognition technique based on the Transformer model combined with MediaPipe. The technique accurately extracts hand gesture key points. The system was designed with eight primary gestures: swipe up, swipe down, swipe left, swipe right, thumbs up, OK, click, and enlarge. These gestures serve as alternatives to mouse and keyboard operations, simplifying human–computer interaction interfaces to meet the needs of media system control and presentation switching. The experiment results demonstrated that training deep learning models using the Transformer achieved over 99% accuracy, effectively enhancing recognition performance. Full article
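
A compact Keras sketch of a Transformer encoder over MediaPipe keypoint sequences is shown below, assuming 30-frame clips of 21 two-dimensional landmarks (42 features per frame) and the eight gesture classes named above; the layer sizes are illustrative, not those of the paper, and positional encoding is omitted for brevity.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

FRAMES, FEATURES, NUM_GESTURES = 30, 42, 8   # assumed: 30 frames x (21 landmarks * 2)

def build_gesture_transformer(d_model=64, heads=4):
    """Single-block Transformer encoder classifying dynamic hand-gesture clips."""
    inp = layers.Input(shape=(FRAMES, FEATURES))
    x = layers.Dense(d_model)(inp)                        # project keypoints per frame
    # Self-attention block with residual connections and layer normalization.
    attn = layers.MultiHeadAttention(num_heads=heads, key_dim=d_model // heads)(x, x)
    x = layers.LayerNormalization()(x + attn)
    ff = layers.Dense(d_model * 2, activation="relu")(x)
    ff = layers.Dense(d_model)(ff)
    x = layers.LayerNormalization()(x + ff)
    x = layers.GlobalAveragePooling1D()(x)                # pool over the time axis
    out = layers.Dense(NUM_GESTURES, activation="softmax")(x)
    model = models.Model(inp, out)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_gesture_transformer()
model.summary()
```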

20 pages, 2732 KB  
Article
Redesigning Multimodal Interaction: Adaptive Signal Processing and Cross-Modal Interaction for Hands-Free Computer Interaction
by Bui Hong Quan, Nguyen Dinh Tuan Anh, Hoang Van Phi and Bui Trung Thanh
Sensors 2025, 25(17), 5411; https://doi.org/10.3390/s25175411 - 2 Sep 2025
Viewed by 1340
Abstract
Hands-free computer interaction is a key topic in assistive technology, with camera-based and voice-based systems being the most common methods. Recent camera-based solutions leverage facial expressions or head movements to simulate mouse clicks or key presses, while voice-based systems enable control via speech commands, wake-word detection, and vocal gestures. However, existing systems often suffer from limitations in responsiveness and accuracy, especially under real-world conditions. In this paper, we present 3-Modal Human-Computer Interaction (3M-HCI), a novel interaction system that dynamically integrates facial, vocal, and eye-based inputs through a new signal processing pipeline and a cross-modal coordination mechanism. This approach not only enhances recognition accuracy but also reduces interaction latency. Experimental results demonstrate that 3M-HCI outperforms several recent hands-free interaction solutions in both speed and precision, highlighting its potential as a robust assistive interface. Full article
(This article belongs to the Section Sensing and Imaging)

15 pages, 2127 KB  
Article
Accessible Interface for Museum Geological Exhibitions: PETRA—A Gesture-Controlled Experience of Three-Dimensional Rocks and Minerals
by Andrei Ionuţ Apopei
Minerals 2025, 15(8), 775; https://doi.org/10.3390/min15080775 - 24 Jul 2025
Cited by 1 | Viewed by 1477
Abstract
The increasing integration of 3D technologies and machine learning is fundamentally reshaping mineral sciences and cultural heritage, establishing the foundation for an emerging “Mineralogy 4.0” framework. However, public engagement with digital 3D collections is often limited by complex or costly interfaces, such as VR/AR systems and traditional touchscreen kiosks, creating a clear need for more intuitive, accessible, and more engaging and inclusive solutions. This paper presents PETRA, an open-source, gesture-controlled system for exploring 3D rocks and minerals. Developed in the TouchDesigner environment, PETRA utilizes a standard webcam and the MediaPipe framework to translate natural hand movements into real-time manipulation of digital specimens, requiring no specialized hardware. The system provides a customizable, node-based framework for creating touchless, interactive exhibits. Successfully evaluated during a “Long Night of Museums” public event with 550 visitors, direct qualitative observations confirmed high user engagement, rapid instruction-free learnability across diverse age groups, and robust system stability in a continuous-use setting. As a practical case study, PETRA demonstrates that low-cost, webcam-based gesture control is a viable solution for creating accessible and immersive learning experiences. This work offers a significant contribution to the fields of digital mineralogy, human–machine interaction, and cultural heritage by providing a hygienic, scalable, and socially engaging method for interacting with geological collections. This research confirms that as digital archives grow, the development of human-centered interfaces is paramount in unlocking their full scientific and educational potential. Full article
(This article belongs to the Special Issue 3D Technologies and Machine Learning in Mineral Sciences)

23 pages, 3542 KB  
Article
An Intuitive and Efficient Teleoperation Human–Robot Interface Based on a Wearable Myoelectric Armband
by Long Wang, Zhangyi Chen, Songyuan Han, Yao Luo, Xiaoling Li and Yang Liu
Biomimetics 2025, 10(7), 464; https://doi.org/10.3390/biomimetics10070464 - 15 Jul 2025
Viewed by 994
Abstract
Although artificial intelligence technologies have significantly enhanced autonomous robots’ capabilities in perception, decision-making, and planning, their autonomy may still fail when faced with complex, dynamic, or unpredictable environments. Therefore, it is critical to enable users to take over robot control in real-time and efficiently through teleoperation. The lightweight, wearable myoelectric armband, due to its portability and environmental robustness, provides a natural human–robot gesture interaction interface. However, current myoelectric teleoperation gesture control faces two major challenges: (1) poor intuitiveness due to visual-motor misalignment; and (2) low efficiency from discrete, single-degree-of-freedom control modes. To address these challenges, this study proposes an integrated myoelectric teleoperation interface. The interface integrates the following: (1) a novel hybrid reference frame aimed at effectively mitigating visual-motor misalignment; and (2) a finite state machine (FSM)-based control logic designed to enhance control efficiency and smoothness. Four experimental tasks were designed using different end-effectors (gripper/dexterous hand) and camera viewpoints (front/side view). Compared to benchmark methods, the proposed interface demonstrates significant advantages in task completion time, movement path efficiency, and subjective workload. This work demonstrates the potential of the proposed interface to significantly advance the practical application of wearable myoelectric sensors in human–robot interaction. Full article
(This article belongs to the Special Issue Intelligent Human–Robot Interaction: 4th Edition)
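
The finite state machine control logic mentioned above can be sketched as a small transition table; the state names and the gesture events that trigger transitions are hypothetical, intended only to illustrate how discrete myoelectric gestures could gate continuous teleoperation, not the authors' actual design.

```python
# Hypothetical FSM for myoelectric teleoperation: discrete gestures switch modes,
# and continuous wrist motion is only forwarded to the robot in TRANSLATE or ROTATE.
IDLE, TRANSLATE, ROTATE, GRASP = "IDLE", "TRANSLATE", "ROTATE", "GRASP"

TRANSITIONS = {
    (IDLE, "fist"): TRANSLATE,        # clench to start translating the end-effector
    (TRANSLATE, "wave_out"): ROTATE,  # wave outward to switch to rotation mode
    (ROTATE, "wave_in"): TRANSLATE,   # wave inward to switch back
    (TRANSLATE, "pinch"): GRASP,      # pinch to close the gripper / dexterous hand
    (GRASP, "open_hand"): TRANSLATE,
    (TRANSLATE, "rest"): IDLE,        # relax to stop sending commands
    (ROTATE, "rest"): IDLE,
}

class TeleopFSM:
    """Finite state machine gating which commands reach the robot."""
    def __init__(self):
        self.state = IDLE

    def on_gesture(self, gesture):
        """Apply a classified EMG gesture; unknown pairs leave the state unchanged."""
        self.state = TRANSITIONS.get((self.state, gesture), self.state)
        return self.state

fsm = TeleopFSM()
for g in ["fist", "wave_out", "rest"]:
    print(g, "->", fsm.on_gesture(g))   # IDLE -> TRANSLATE -> ROTATE -> IDLE
```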
