Search Results (485)

Search Parameters:
Keywords = gesture control

21 pages, 891 KB  
Article
Unified Visual Synchrony: A Framework for Face–Gesture Coherence in Multimodal Human–AI Interaction
by Saule Kudubayeva, Yernar Seksenbayev, Aigerim Yerimbetova, Elmira Daiyrbayeva, Bakzhan Sakenov, Duman Telman and Mussa Turdalyuly
Big Data Cogn. Comput. 2026, 10(3), 88; https://doi.org/10.3390/bdcc10030088 - 12 Mar 2026
Abstract
Multimodal human–AI systems generally consider facial expressions and body motions as separate input streams, leading to disjointed interpretations and diminished emotional coherence. To overcome this issue, we offer the Engagement-Safe Expressive Alignment (ESEA) paradigm and the Unified Visual Synchrony (UVS) framework as its computational implementation. UVS models the coherence between facial expressions and gestures, offering an interpretable visual synchrony signal that can function as adaptive feedback in human–AI interactions. The framework’s key component is the Consistency Index for Affective Synchrony (CIAS), which correlates brief visual segments with scalar synchrony scores through a common latent representation. Facial and gestural signals are processed by modality-specific projection networks into a unified latent space, and CIAS is derived from the similarity and short-term temporal consistency of these latent trajectories. The synchrony index is regarded as an estimation of affective visual coherence within the ESEA paradigm. We formalize the UVS/CIAS framework and conduct a comparative experimental evaluation utilizing matched and mismatched face–gesture segments derived from rendered dialog footage. Utilizing ROC analysis, score distribution comparisons, temporal visualizations, and negative control tests, we illustrate that CIAS effectively captures structured face–gesture alignment that surpasses similarity-based baselines, while also delivering a persistent, time-resolved synchronization signal. These findings establish CIAS as a principled and interpretable feedback signal for future affect-aware, engagement-focused multimodal agents. Full article
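
As a rough illustration of the scoring idea sketched in this abstract, the snippet below projects two per-frame feature streams into a shared latent space and combines frame-wise cosine similarity with a short-term consistency term into a scalar index. All dimensions, the random projections, and the 0.7/0.3 weighting are illustrative assumptions, not the authors' CIAS implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-ins for per-frame face and gesture feature sequences
# (T frames, modality-specific feature sizes); real features would come from
# face/gesture encoders.
T, d_face, d_gest, d_lat = 60, 128, 96, 32
face_feats = rng.standard_normal((T, d_face))
gest_feats = rng.standard_normal((T, d_gest))

# Modality-specific linear projections into a shared latent space
# (random here; learned in the paper's framework).
W_face = rng.standard_normal((d_face, d_lat)) / np.sqrt(d_face)
W_gest = rng.standard_normal((d_gest, d_lat)) / np.sqrt(d_gest)
z_face = face_feats @ W_face
z_gest = gest_feats @ W_gest

def cosine(a, b, eps=1e-8):
    return np.sum(a * b, axis=-1) / (
        np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1) + eps)

# Frame-wise cross-modal similarity.
sim = cosine(z_face, z_gest)                              # shape (T,)

# Short-term temporal consistency: how similarly both latent trajectories
# move from frame to frame.
consistency = cosine(np.diff(z_face, axis=0), np.diff(z_gest, axis=0))

# Scalar synchrony index: weighted mix of the two terms (weights assumed).
cias_like = 0.7 * sim[1:].mean() + 0.3 * consistency.mean()
print(f"synchrony index ≈ {cias_like:+.3f}")
```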

24 pages, 3935 KB  
Article
PSO Trajectory Optimization of Robot Arm for Ultrasonic Testing of Complex Curved Surface
by Rao Yao, Yahui Lv, Kai Wang, Yan Gao and Dazhong Wang
Coatings 2026, 16(3), 332; https://doi.org/10.3390/coatings16030332 - 8 Mar 2026
Abstract
In ultrasonic nondestructive testing, maintaining the ultrasonic sensor in normal contact with curved surfaces is pivotal for acquiring valid defect signals. Replacing manual operation with a robotic arm ensures stable signal collection, while stable and fast trajectory planning for complex curved-surface tracking remains a key challenge. This research investigates gesture-driven robotic trajectory planning and impact optimization via the particle swarm optimization (PSO) algorithm in the robot joint space for rapid and smooth movement. Gesture trajectories are acquired via a Leap Motion device, with unified mapping established through spatial transformations among gesture, simulation, and experimental robot spaces. PSO is utilized to optimize trajectories, enhancing accuracy and controllability. Median filtering is applied to trajectory coordinate data to suppress errors from hand tremor and sensor limitations, followed by introducing a surface normal offset to generate pose matrices at each trajectory point. Systematic comparison of interpolation methods (polynomial, cubic spline, circular, cubic B-spline) reveals that cubic B-spline interpolation achieves the shortest execution time under angular acceleration constraints. The results show that PSO optimizes point-to-point trajectories based on 5-5-5 polynomial interpolation, with impact force and execution time as objectives, yielding the optimal trajectory with minimal time under acceleration constraints. This research provides valuable methodological references for robotic manipulator trajectory planning and optimization in complex curved-surface ultrasonic testing. Full article
(This article belongs to the Section Surface Characterization, Deposition and Modification)
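
A minimal sketch of the optimization step described above: a quintic (5th-order) point-to-point joint profile with zero boundary velocity and acceleration, and a small particle swarm searching the segment duration that minimizes time under an angular-acceleration limit. The joint values, limit, PSO hyperparameters, and the simplified cost (time plus a constraint penalty, standing in for the paper's time-plus-impact objective) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

q0, qf = 0.0, 1.2          # start/end joint angle [rad] (illustrative)
a_max  = 2.0               # angular-acceleration limit [rad/s^2] (assumed)

def quintic(s):
    """5th-order profile with zero boundary velocity/acceleration."""
    return 10*s**3 - 15*s**4 + 6*s**5

def peak_acc(T, n=400):
    """Numerically evaluate the peak |acceleration| of the quintic segment."""
    t = np.linspace(0.0, T, n)
    q = q0 + (qf - q0) * quintic(t / T)
    return np.max(np.abs(np.gradient(np.gradient(q, t), t)))

def cost(T):
    """Objective: execution time plus a large penalty if the acceleration
    constraint is violated."""
    return T + 1e3 * max(0.0, peak_acc(T) - a_max)

# --- minimal particle swarm over the segment duration T ---
n_part, iters = 20, 60
x = rng.uniform(0.2, 5.0, n_part)          # positions (candidate durations)
v = np.zeros(n_part)
pbest, pbest_f = x.copy(), np.array([cost(xi) for xi in x])
gbest = pbest[np.argmin(pbest_f)]

for _ in range(iters):
    r1, r2 = rng.random(n_part), rng.random(n_part)
    v = 0.7*v + 1.5*r1*(pbest - x) + 1.5*r2*(gbest - x)
    x = np.clip(x + v, 0.05, 10.0)
    f = np.array([cost(xi) for xi in x])
    better = f < pbest_f
    pbest[better], pbest_f[better] = x[better], f[better]
    gbest = pbest[np.argmin(pbest_f)]

print(f"shortest feasible duration ≈ {gbest:.3f} s, "
      f"peak acc ≈ {peak_acc(gbest):.2f} rad/s^2")
```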

36 pages, 14443 KB  
Article
Personalized Wrist–Forearm Static Gesture Recognition Using the Vicara Kai Controller and Convolutional Neural Network
by Jacek Szedel
Sensors 2026, 26(5), 1700; https://doi.org/10.3390/s26051700 - 8 Mar 2026
Abstract
Predefined, user-independent gesture sets do not account for individual differences in movement patterns and physical limitations. This study presents a personalized wrist–forearm static gesture recognition system for human–computer interaction (HCI) using the Vicara Kai™ wearable controller and a convolutional neural network (CNN). Unlike the system based on fixed, predefined gestures, the proposed approach enables users to define and train their own gesture sets. During gesture recording, users may either select a gesture pattern from a predefined prompt set or create their own natural, unprompted gestures. A dedicated software framework was developed for data acquisition, preprocessing, model training, and real-time recognition. The developed system was evaluated by optimizing the parameters of a lightweight CNN and examining the influence of sequentially applied changes to the input and network pipelines, including resizing the input layer, applying data augmentation, experimenting with different dropout ratios, and varying the number of learning samples. The performance of the resulting network setup was assessed using confusion matrices, accuracy, and precision metrics for both original gestures and gestures smoothed using the cubic Bézier function. The resulting validation accuracy ranged from 0.88 to 0.94, with an average test-set accuracy of 0.92 and macro precision of 0.92. The system’s resilience to rapid or casual gestures was also evaluated using the receiver operating characteristic (ROC) method, achieving an Area Under the Curve (AUC) of 0.97. The results demonstrate that the proposed approach achieves high recognition accuracy, indicating its potential for a range of practical applications. Full article
(This article belongs to the Special Issue Sensor Systems for Gesture Recognition (3rd Edition))
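
For orientation, a minimal PyTorch sketch of a lightweight CNN with a configurable input size and dropout ratio, the kinds of knobs the evaluation above varies; the layer sizes, input encoding, and class count are assumptions, not the paper's network.

```python
import torch
import torch.nn as nn

class LightGestureCNN(nn.Module):
    """Small CNN with configurable input size and dropout ratio
    (architecture details here are illustrative, not the paper's)."""
    def __init__(self, in_size=32, n_classes=8, dropout=0.3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        flat = 32 * (in_size // 4) ** 2
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Dropout(dropout),
            nn.Linear(flat, 64), nn.ReLU(),
            nn.Linear(64, n_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# One dummy batch of image-like encodings of controller readings.
model = LightGestureCNN(in_size=32, n_classes=8, dropout=0.3)
x = torch.randn(4, 1, 32, 32)
print(model(x).shape)   # -> torch.Size([4, 8])
```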

23 pages, 614 KB  
Article
Two-Factor Cancelable Biometric Key Binding via Euclidean Challenge–Response Pair Mechanism
by Michael Logan Garrett, Mahafujul Alam, Michael Partridge and Julie Heynssens
J. Cybersecur. Priv. 2026, 6(2), 42; https://doi.org/10.3390/jcp6020042 - 2 Mar 2026
Abstract
This work proposes a lightweight biometric key-binding scheme that adapts a PUF-style challenge–response mechanism to face geometry: a two-factor password and session nonce generate random challenge points, Gray-coded Euclidean distances to facial landmarks form responses, and a random key is bound by discarding selected positions so only a reduced subset, the nonce, and a key hash are stored. At authentication, a fresh response set is compared to the subset with a Hamming-distance tolerance, and bounded local search corrects residual errors; each successful session rotates the nonce and refreshes the ephemeral key. We frame this as a conceptual exploration of an interpretable, on-device, controlled-capture design niche—a per-session nonce-driven cancelable biometric key-binding mechanism—and we quantify the resulting security–usability trade-offs. Empirically, the scheme works under stable capture conditions with carefully tuned thresholds, and it is naturally suited to tightly controlled deployments (e.g., access kiosks) where it can also incorporate user-driven micro-gestures as an extra behavioral factor. While the construction is fragile under broader variability and leans on the second factor for security, it offers an alternative to existing mechanisms and a clear niche, and we present it as a conceptual exploration showing how CRP mechanisms can inform cancelable biometrics with per-session revocability. Full article
(This article belongs to the Section Security Engineering & Applications)
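
A minimal sketch of the response-generation and matching steps described above: distances from nonce-derived challenge points to landmark coordinates are quantized, Gray-coded, and compared under a Hamming tolerance. The bit width, scale factor, threshold, and landmark values are assumptions, and the key-binding/position-discarding step is omitted.

```python
import secrets
import numpy as np

def gray(n: int) -> int:
    """Binary-reflected Gray code of an integer."""
    return n ^ (n >> 1)

def response_bits(landmarks, challenges, n_bits=8, scale=50.0):
    """Quantize Euclidean distances from each challenge point to each facial
    landmark and Gray-code them (bit width and scale are assumed values)."""
    bits = []
    for c in challenges:
        for p in landmarks:
            d = int(np.clip(np.linalg.norm(p - c) * scale, 0, 2**n_bits - 1))
            bits.extend(int(b) for b in format(gray(d), f"0{n_bits}b"))
    return np.array(bits, dtype=np.uint8)

def hamming(a, b):
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(42)
landmarks_enroll = rng.random((5, 2))                                # enrolled face geometry
landmarks_auth   = landmarks_enroll + rng.normal(0, 0.004, (5, 2))   # noisy re-capture

# Challenge points derived (here: simply drawn) from password + session nonce.
nonce = secrets.token_bytes(16)
challenges = np.random.default_rng(int.from_bytes(nonce, "big") % 2**32).random((4, 2))

r_ref = response_bits(landmarks_enroll, challenges)
r_new = response_bits(landmarks_auth, challenges)
dist = hamming(r_ref, r_new)
print("Hamming distance:", dist, "of", r_ref.size,
      "-> accept" if dist <= 12 else "-> reject")
```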

15 pages, 2008 KB  
Article
Application of the Ultraleap 3Di-Based Gesture-Controlled 3D Imaging Visualization System in Pulmonary Segmentectomy: A Single-Center Prospective Study
by Zhengnan Liu, Bin Wang, Chengrun Li, Ruiji Chen, Jixing Lin and Jie Li
Bioengineering 2026, 13(3), 284; https://doi.org/10.3390/bioengineering13030284 - 28 Feb 2026
Abstract
Objective: Pulmonary segmentectomy serves as a crucial approach for treating early-stage lung cancer. However, this procedure demands precise identification of segmental anatomy, and surgeons often need to repeatedly consult the patient’s 3D imaging data or other medical records during the operation. Traditional contact-based intraoperative imaging assistance devices involve cumbersome operation and pose risks to the sterile environment. This study aims to evaluate the clinical utility of an Ultraleap 3Di-based gesture-controlled 3D imaging visualization system for non-contact interaction during pulmonary segmentectomy in patients with early-stage lung cancer. Methods: This study enrolled 58 patients with early-stage non-small cell lung cancer scheduled for video-assisted thoracoscopic pulmonary segmentectomy from June 2025 to December 2025. Participants were randomly assigned to either the experimental group or the control group. Intraoperatively, the experimental group utilized the Ultraleap 3Di system for non-contact 3D image review, while the control group relied on conventional contact-based devices for image retrieval, which was operated by non-sterile assistants. The compared outcomes included intraoperative image retrieval time, total operative time, intraoperative blood loss, R0 resection rate, postoperative drainage duration, and surgeon satisfaction. Results: The baseline characteristics were comparable between the two groups. The mean age was 53.66 ± 9.12 years in the experimental group and 55.21 ± 8.76 years in the control group (t = −0.66, p > 0.05); the experimental group included 16 males and 13 females, while the control group included 14 males and 15 females (χ2 = 0.276, p > 0.05). Preoperative pulmonary function, as measured by FEV1/FVC ratio, was 74.48 ± 4.75% in the experimental group versus 76.08 ± 4.51% in the control group (t = −1.31, p > 0.05). The image retrieval time in the experimental group was significantly shorter than that in the control group (75.16 ± 19.38 s versus 209.59 ± 28.13 s, t = −21.19, p < 0.001, 95% CI [−147.13, −121.72], Cohen’s d = −5.57). The total operative time was also reduced (88.72 ± 13.82 min versus 96.55 ± 13.90 min, t = −2.15, p = 0.036, 95% CI [−15.12, −0.53], Cohen’s d = −0.57). No significant differences were observed between the two groups in terms of R0 resection rate (both 100%), intraoperative blood loss, or postoperative drainage duration (p > 0.05). The operating surgeons rated the system highly for image clarity, navigation timeliness, and overall utility, while the score for operational convenience was relatively neutral (mean score 3.2). Conclusions: The Ultraleap 3Di-based non-contact visualization system reduces the time required for intraoperative image retrieval and improves overall procedural efficiency in segmentectomy, without compromising surgical safety or oncological radicality. Future efforts should focus on optimizing the intuitiveness of gesture interaction and exploring its integration with augmented reality and artificial intelligence to further advance the system’s intelligence and practical utility. Full article
(This article belongs to the Section Biomedical Engineering and Biomaterials)
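
The headline effect sizes can be reproduced from the summary statistics quoted above; a quick check for the image-retrieval-time comparison, assuming 29 patients per group as implied by the reported sex breakdown:

```python
import math

n1 = n2 = 29                       # 16+13 and 14+15 patients per group
m1, s1 = 75.16, 19.38              # experimental group retrieval time (s)
m2, s2 = 209.59, 28.13             # control group retrieval time (s)

sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))  # pooled SD
t  = (m1 - m2) / (sp * math.sqrt(1 / n1 + 1 / n2))                     # two-sample t
d  = (m1 - m2) / sp                                                    # Cohen's d

print(f"pooled SD = {sp:.2f} s, t = {t:.2f}, Cohen's d = {d:.2f}")
# -> t ≈ -21.19 and d ≈ -5.57, matching the values quoted above
```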

17 pages, 4699 KB  
Article
Interactive Teleoperation of an Articulated Robotic Arm Using Vision-Based Human Hand Tracking
by Marius-Valentin Drăgoi, Aurel-Viorel Frimu, Andrei Postelnicu, Roxana-Adriana Puiu, Gabriel Petrea and Alexandru Hank
Biomimetics 2026, 11(2), 151; https://doi.org/10.3390/biomimetics11020151 - 19 Feb 2026
Abstract
Interactive teleoperation offers an intuitive pathway for human–robot interaction, yet many existing systems rely on dedicated sensors or wearable devices, limiting accessibility and scalability. This paper presents a vision-based teleoperation framework that enables real-time control of an articulated robotic arm (five joints plus a gripper actuator) using human hand tracking from a single, typical laptop camera. Hand pose and gesture information are extracted using a real-time landmark estimation pipeline, and a set of compact kinematic descriptors—palm position, apparent hand scale, wrist rotation, hand pitch, and pinch gesture—are mapped to robotic joint commands through a calibration-based control strategy. Commands are transmitted over a lightweight network interface to an embedded controller that executes synchronized servo actuation. To enhance stability and usability, temporal smoothing and rate-limited updates are employed to mitigate jitter while preserving responsiveness. In a human-in-the-loop evaluation with 42 participants, the system achieved an 88% success rate (37/42), with a completion time of 53.48 ± 18.51 s, a placement error of 6.73 ± 3.11 cm for successful trials (n = 37), and an ease-of-use score of 2.67 ± 1.20 on a 1–5 scale. Results indicate that the proposed approach enables feasible interactive teleoperation without specialized hardware, supporting its potential as a low-cost platform for robotic manipulation, education, and rapid prototyping. Full article
(This article belongs to the Special Issue Recent Advances in Bioinspired Robot and Intelligent Systems)
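
A minimal sketch of the jitter-mitigation step mentioned above: exponential smoothing plus a per-update rate limit applied to one joint command derived from a normalized palm coordinate. The smoothing factor, step limit, and linear mapping are assumed tuning values, not the paper's calibration.

```python
import numpy as np

class SmoothedRateLimitedChannel:
    """Exponential smoothing plus a per-update rate limit for one joint command,
    a generic stand-in for the temporal smoothing and rate-limited updates
    described above (alpha and the step limit are assumed tuning values)."""
    def __init__(self, alpha=0.3, max_step_deg=4.0, initial=90.0):
        self.alpha, self.max_step, self.value = alpha, max_step_deg, initial

    def update(self, raw_deg: float) -> float:
        smoothed = self.alpha * raw_deg + (1 - self.alpha) * self.value
        step = np.clip(smoothed - self.value, -self.max_step, self.max_step)
        self.value += step
        return self.value

def palm_x_to_servo(palm_x_norm: float) -> float:
    """Map a normalized palm x-coordinate (0..1 from the hand tracker)
    to a 0-180 degree base-joint command (linear mapping assumed)."""
    return float(np.clip(palm_x_norm, 0.0, 1.0) * 180.0)

channel = SmoothedRateLimitedChannel()
for palm_x in [0.50, 0.52, 0.90, 0.49, 0.51]:   # jittery tracker readings
    print(f"{channel.update(palm_x_to_servo(palm_x)):6.1f} deg")
```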

19 pages, 1689 KB  
Article
Bio-Adaptive Robot Control: Integrating Biometric Feedback and Gesture-Based Interfaces for Intuitive Human–Robot Interaction (HRI)
by Antonio Di Tecco, Daniele Leonardis, Edoardo Ragusa, Antonio Frisoli and Claudio Loconsole
Robotics 2026, 15(2), 45; https://doi.org/10.3390/robotics15020045 - 17 Feb 2026
Abstract
AI-driven assistance can help the user perform complex teleoperated tasks, introduce autonomous patterns, or adapt the workbench to objects of interest. On the other hand, the level of assistance should be responsive to the user’s response and adapt accordingly to promote a positive and effective experience. Envisaging this final goal, this article investigates whether physiological signals can be used to estimate the user’s performance and response in a teleoperation setup, with and without AI-driven assistance. In more detail, a teleoperated pick-and-place task was performed with or without AI-driven assistance during the grasping phase. A deep-learning algorithm for affordance detection provided assistance, helping participants align the robotic hand with the target object. Physiological and kinematic data were measured and processed by machine learning models to predict the effects of AI assistance on task performance during teleoperation. Results showed that AI-driven assistance, as expected, affected pick-and-place performance. Beyond this, the assistance affected the participant’s fatigue level, which the machine learning models could predict with an average accuracy of 84% based on the physiological response. In addition, the success or failure of the pick-and-place task could be predicted with an average accuracy of 88%. These findings highlight the potential of integrating deep learning with biometric feedback and gesture-based control to create more intuitive and adaptive HRI systems. Full article
(This article belongs to the Special Issue AI for Robotic Exoskeletons and Prostheses, 2nd Edition)

17 pages, 4471 KB  
Article
Utilizing Data Quality Indices for Strategic Sensor Channel Selection to Enhance Performance of Hand Gesture Recognition Systems
by Shen Zhang, Hao Zhou, Rayane Tchantchane and Gursel Alici
Sensors 2026, 26(4), 1213; https://doi.org/10.3390/s26041213 - 12 Feb 2026
Abstract
This study proposes a data quality-driven channel selection methodology to improve hand gesture recognition performance in multi-channel wearable Human–Machine Interface (HMI) systems. The methodology centers around calculating (i) five data quality indices for both surface electromyography (sEMG) and pressure-based force myography (pFMG) signals and (ii) establishing a relationship between these data quality indices and the accuracy of gesture recognition for applications typified by prosthetic hand control. Machine learning (ML)-based and correlation-based methods were used to select three optimal channel/pair configurations from an eight-channel/pair system. Evaluations on the UOW and Ninapro DB2 datasets showed that the proposed methods consistently outperformed random channel selection, with the ML-based approach achieving the best results (76.36% for sEMG, 71.59% for pFMG, and 88.2% for fused sEMG-pFMG on the UOW dataset and 70.28% on Ninapro DB2). Notably, using three pairs of strategically selected sEMG-pFMG channels generated 88.2%, which is comparable to the 88.38% accuracy obtained with a full eight-channel sEMG system on the UOW dataset, highlighting the efficacy of our channel selection methodologies. These results highlight the value of data quality indices for sensor selection and provide a foundation for developing more efficient wearable HMI systems. Full article
(This article belongs to the Section Intelligent Sensors)
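
As a simplified illustration of quality-driven channel selection, the sketch below scores each channel with two stand-in indices (RMS amplitude and an in-band/out-of-band power ratio) and keeps the top three; the paper's five indices, their weighting, and its ML- and correlation-based selection are replaced here by assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative 8-channel sEMG recording: (channels, samples) at 2 kHz.
n_ch, fs, dur = 8, 2000, 2.0
n = int(fs * dur)
emg = rng.normal(0, 1e-3, (n_ch, n))
emg[[1, 4, 6]] += 5e-3 * np.sin(2 * np.pi * 80 * np.arange(n) / fs)  # "active" channels

def quality_indices(x, fs):
    """Two simple per-channel quality indices (stand-ins for the paper's five):
    RMS amplitude and the ratio of 20-450 Hz band power to out-of-band power."""
    rms = np.sqrt(np.mean(x**2, axis=1))
    spec = np.abs(np.fft.rfft(x, axis=1))**2
    freqs = np.fft.rfftfreq(x.shape[1], 1 / fs)
    band = (freqs >= 20) & (freqs <= 450)
    snr = spec[:, band].sum(axis=1) / (spec[:, ~band].sum(axis=1) + 1e-12)
    return rms, snr

rms, snr = quality_indices(emg, fs)
score = 0.5 * (rms / rms.max()) + 0.5 * (snr / snr.max())   # assumed equal weighting
best3 = np.argsort(score)[::-1][:3]
print("selected channels:", sorted(best3.tolist()))
```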

22 pages, 763 KB  
Article
Comparative Evaluation of LSTM and 3D CNN Models in a Hybrid System for IoT-Enabled Sign-to-Text Translation in Deaf Communities
by Samar Mouti, Hani Al Chalabi, Mohammed Abushohada, Samer Rihawi and Sulafa Abdalla
Informatics 2026, 13(2), 27; https://doi.org/10.3390/informatics13020027 - 5 Feb 2026
Abstract
This paper presents a hybrid deep learning framework for real-time sign language recognition (SLR) tailored to Internet of Things (IoT)-enabled environments, enhancing accessibility for Deaf communities. The proposed system integrates a Long Short-Term Memory (LSTM) network for static gesture recognition and a 3D Convolutional Neural Network (3D CNN) for dynamic gesture recognition. Implemented on a Raspberry Pi device using MediaPipe for landmark extraction, the system supports low-latency, on-device inference suitable for resource-constrained edge computing. Experimental results demonstrate that the LSTM model achieves its highest stability and performance for static signs at 1000 training epochs, yielding an average F1-score of 0.938 and an accuracy of 86.67%. In contrast, at 2000 epochs, the model exhibits a catastrophic performance collapse (F1-score of 0.088) due to overfitting and weight instability, highlighting the necessity of careful training regulation. Despite this, the overall system achieves consistently high classification performance under controlled conditions. In contrast, the 3D CNN component maintains robust and consistent performance across all evaluated training phases (500–2000 epochs), achieving up to 99.6% accuracy on dynamic signs. When deployed on a Raspberry Pi platform, the system achieves real-time performance with a frame rate of 12–15 FPS and an average inference latency of approximately 65 ms per frame. The hybrid architecture effectively balances recognition accuracy with computational efficiency by routing static gestures to the LSTM and dynamic gestures to the 3D CNN. This work presents a detailed epoch-wise comparative analysis of model stability and computational feasibility, contributing a practical and scalable IoT-enabled solution for inclusive, real-time sign-to-text communication in intelligent environments. Full article
(This article belongs to the Section Machine Learning)
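
A minimal PyTorch sketch of the routing idea described above: static landmark sequences go to an LSTM head and dynamic clips go to a 3D CNN head. Layer sizes, class counts, and the upstream static/dynamic decision are assumptions, not the deployed models.

```python
import torch
import torch.nn as nn

N_LANDMARKS, N_STATIC, N_DYNAMIC = 63, 26, 10   # assumed feature/class sizes

class StaticLSTM(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(N_LANDMARKS, 64, batch_first=True)
        self.head = nn.Linear(64, N_STATIC)
    def forward(self, x):                 # x: (B, T, 63) landmark sequences
        _, (h, _) = self.lstm(x)
        return self.head(h[-1])

class Dynamic3DCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(32, N_DYNAMIC))
    def forward(self, clip):              # clip: (B, 3, T, H, W) video clips
        return self.net(clip)

def route(sample, is_dynamic, lstm_model, cnn_model):
    """Dispatch a sample to the appropriate branch; the motion test itself is
    assumed to be computed upstream, e.g. from landmark displacement."""
    return cnn_model(sample) if is_dynamic else lstm_model(sample)

lstm_m, cnn_m = StaticLSTM(), Dynamic3DCNN()
print(route(torch.randn(1, 30, N_LANDMARKS), False, lstm_m, cnn_m).shape)  # static
print(route(torch.randn(1, 3, 16, 64, 64), True, lstm_m, cnn_m).shape)     # dynamic
```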

20 pages, 7325 KB  
Article
FingerType: One-Handed Thumb-to-Finger Text Input Using 3D Hand Tracking
by Nuo Jia, Minghui Sun, Yan Li, Yang Tian and Tao Sun
Sensors 2026, 26(3), 897; https://doi.org/10.3390/s26030897 - 29 Jan 2026
Abstract
We present FingerType, a one-handed text input method based on thumb-to-finger gestures. FingerType detects tap events from 3D hand data using a Temporal Convolutional Network (TCN) and decodes the tap sequence into words with an n-gram language model. To inform the design, we examined thumb-to-finger interactions and collected comfort ratings of finger regions. We used these results to design an improved T9-style key layout. Our system runs at 72 frames per second and reaches 94.97% accuracy for tap detection. We conducted a six-block user study with 24 participants and compared FingerType with controller input and touch input. Entry speed increased from 5.88 WPM in the first practice block to 10.63 WPM in the final block. FingerType also supported more eyes-free typing: attention on the display panel within ±15° of head-gaze was 84.41%, higher than touch input (69.47%). Finally, we report error patterns and WPM learning curves, and a model-based analysis suggests improving gesture recognition accuracy could further increase speed and narrow the gap to traditional VR input methods. Full article
(This article belongs to the Special Issue Sensing Technology to Measure Human-Computer Interactions)
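
A toy sketch of T9-style decoding of a thumb-to-finger tap sequence: each finger region acts as a multi-letter key and candidate words are ranked by frequency (a unigram stand-in for the n-gram language model). The key layout and dictionary are illustrative, not FingerType's tuned layout.

```python
# Each finger region acts as one multi-letter key (layout is illustrative).
KEYS = {
    1: "abc", 2: "def", 3: "ghi", 4: "jkl",
    5: "mno", 6: "pqrs", 7: "tuv", 8: "wxyz",
}
# Tiny stand-in for the n-gram language model: unigram counts.
FREQ = {"the": 500, "tie": 40, "vie": 5, "hand": 120, "hang": 60, "good": 90}

def word_matches_taps(word: str, taps: list[int]) -> bool:
    """A word is a candidate when each of its letters lies on the tapped key."""
    return len(word) == len(taps) and all(
        ch in KEYS[t] for ch, t in zip(word, taps))

def decode(taps: list[int]) -> list[str]:
    """Rank dictionary words consistent with the tap sequence by frequency."""
    cands = [w for w in FREQ if word_matches_taps(w, taps)]
    return sorted(cands, key=FREQ.get, reverse=True)

print(decode([7, 3, 2]))      # -> ['the', 'tie', 'vie']
print(decode([3, 1, 5, 2]))   # -> ['hand']
```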

20 pages, 4222 KB  
Article
Development and Usability Evaluation of a Leap Motion-Based Controller-Free VR Training System for Inferior Alveolar Nerve Block
by Jun-Seong Kim, Kun-Woo Kim, Hyo-Joon Kim and Seong-Yong Moon
Appl. Sci. 2026, 16(3), 1325; https://doi.org/10.3390/app16031325 - 28 Jan 2026
Abstract
This study developed a virtual reality (VR) simulator for training the inferior alveolar nerve block (IANB) procedure using Leap Motion-based hand tracking and the Unity engine, and evaluated its interaction performance, task-level outcomes within the simulator, and usability. Built on a 3D anatomical model, the system provides a pre-clinical practice environment for realistic syringe manipulation and visually guided needle insertion, enabling repeated rehearsal of the procedural workflow. Interaction stability was assessed using participant-level gesture recognition rates and input latency. Usability was evaluated via a questionnaire addressing ease of use, cognitive load, and perceived educational usefulness. The results indicated participant-level mean gesture recognition rates of 88.8–90.5% and mean response latencies of approximately 64–66 ms. In usability testing (n = 40), the item related to perceived procedural skill improvement received the highest score (4.25/5.0). Because this study did not include controlled comparisons with conventional training or objective measures of clinical competency transfer, the findings should be interpreted as preliminary evidence of technical feasibility and learner-perceived usefulness within a simulated setting. Controlled comparative studies using objective learning outcomes are warranted. Full article

24 pages, 6118 KB  
Article
Effective Approach for Classifying EMG Signals Through Reconstruction Using Autoencoders
by Natalia Rendón Caballero, Michelle Rojo González, Marcos Aviles, José Manuel Alvarez Alvarado, José Billerman Robles-Ocampo, Perla Yazmin Sevilla-Camacho and Juvenal Rodríguez-Reséndiz
AI 2026, 7(1), 36; https://doi.org/10.3390/ai7010036 - 22 Jan 2026
Abstract
The study of muscle signal classification has been widely explored for the control of myoelectric prostheses. Traditional approaches rely on manually designed features extracted from time- or frequency-domain representations, which may limit the generalization and adaptability of EMG-based systems. In this work, an autoencoder-based framework is proposed for automatic feature extraction, enabling the learning of compact latent representations directly from raw EMG signals and reducing dependence on handcrafted features. A custom instrumentation system with three surface EMG sensors was developed and placed on selected forearm muscles to acquire signals associated with five hand movements from 20 healthy participants aged 18 to 40 years. The signals were segmented into 200 ms windows with 75% overlap. The proposed method employs a recurrent autoencoder with a symmetric encoder–decoder architecture, trained independently for each sensor to achieve accurate signal reconstruction, with a minimum reconstruction loss of 3.3×10⁻⁴ V². The encoder’s latent representations were then used to train a dense neural network for gesture classification. An overall efficiency of 93.84% was achieved, demonstrating that the proposed reconstruction-based approach provides high classification performance and represents a promising solution for future EMG-based assistive and control applications. Full article
(This article belongs to the Special Issue Transforming Biomedical Innovation with Artificial Intelligence)
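
A compact PyTorch sketch of the pipeline described above for a single sensor: a recurrent encoder-decoder is trained on reconstruction, and its latent codes feed a small dense classifier. Window length, latent size, and layer widths are assumptions.

```python
import torch
import torch.nn as nn

WIN, LATENT, N_GESTURES = 200, 16, 5    # samples per window, latent size, classes (assumed)

class RecurrentAE(nn.Module):
    """Symmetric recurrent encoder-decoder for one sEMG channel
    (layer sizes are illustrative)."""
    def __init__(self):
        super().__init__()
        self.enc = nn.LSTM(1, LATENT, batch_first=True)
        self.dec = nn.LSTM(LATENT, LATENT, batch_first=True)
        self.out = nn.Linear(LATENT, 1)

    def forward(self, x):                       # x: (B, WIN, 1)
        _, (h, _) = self.enc(x)                 # h: (1, B, LATENT)
        z = h[-1]                               # latent code per window
        rep = z.unsqueeze(1).repeat(1, x.size(1), 1)
        dec_out, _ = self.dec(rep)
        return self.out(dec_out), z             # reconstruction + latent

classifier = nn.Sequential(nn.Linear(LATENT, 32), nn.ReLU(),
                           nn.Linear(32, N_GESTURES))

ae = RecurrentAE()
x = torch.randn(8, WIN, 1)                      # a batch of 200-sample windows
recon, z = ae(x)
rec_loss = nn.functional.mse_loss(recon, x)     # train the AE on reconstruction first
logits = classifier(z.detach())                 # then train the dense head on latents
print(recon.shape, z.shape, logits.shape)
```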

26 pages, 7469 KB  
Article
Generalized Vision-Based Coordinate Extraction Framework for EDA Layout Reports and PCB Optical Positioning
by Pu-Sheng Tsai, Ter-Feng Wu and Wen-Hai Chen
Processes 2026, 14(2), 342; https://doi.org/10.3390/pr14020342 - 18 Jan 2026
Abstract
Automated optical inspection (AOI) technologies are widely used in PCB and semiconductor manufacturing to improve accuracy and reduce human error during quality inspection. While existing AOI systems can perform defect detection, they often rely on pre-defined camera positions and lack flexibility for interactive inspection, especially when the operator needs to visually verify solder pad conditions or examine specific layout regions. This study focuses on the front-end optical positioning and inspection stage of the AOI workflow, providing an automated mechanism to link digitally generated layout reports from EDA layout tools with real PCB inspection tasks. The proposed system operates on component-placement reports exported by EDA layout environments and uses them to automatically guide the camera to the corresponding PCB coordinates. Since PCB design reports may vary in format and structure across EDA tools, this study proposes a vision-based extraction approach that employs Hough transform-based region detection and a CNN-based digit recognizer to recover component coordinates from visually rendered design data. A dual-axis sliding platform is driven through a hierarchical control architecture, where coarse positioning is performed via TB6600 stepper control and Bluetooth-based communication, while fine alignment is achieved through a non-contact, gesture-based interface designed for clean-room operation. A high-resolution autofocus camera subsequently displays the magnified solder pads on a large screen for operator verification. Experimental results show that the proposed platform provides accurate, repeatable, and intuitive optical positioning, improving inspection efficiency while maintaining operator ergonomics and system modularity. Rather than replacing defect-classification AOI systems, this work complements them by serving as a positioning-assisted inspection module for interactive and semi-automated PCB quality evaluation. Full article
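
A small OpenCV sketch of the kind of vision-based extraction step described above: Hough line detection locates the horizontal rules of a rendered coordinate table, and the bands between them are the rows from which digit patches would be cropped for the CNN digit recognizer (the recognizer itself is not shown). The synthetic page and all thresholds are illustrative.

```python
import numpy as np
import cv2

# Synthetic stand-in for a rendered report page: white background with the
# horizontal rules of a coordinate table (a real pipeline would load a
# screenshot of the EDA layout report instead).
page = np.full((240, 480), 255, dtype=np.uint8)
for y in (60, 100, 140):
    cv2.line(page, (20, y), (460, y), 0, 2)

edges = cv2.Canny(page, 50, 150)
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                        minLineLength=200, maxLineGap=5)

# Keep near-horizontal detections and merge ones belonging to the same rule.
ys = sorted({int(y1) for x1, y1, x2, y2 in lines[:, 0] if abs(int(y2) - int(y1)) <= 2})
rules = [ys[0]]
for y in ys[1:]:
    if y - rules[-1] > 5:
        rules.append(y)

print("detected table rules (y):", rules)
print("candidate row bands for digit cropping:", list(zip(rules[:-1], rules[1:])))
```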

27 pages, 11232 KB  
Article
Aerokinesis: An IoT-Based Vision-Driven Gesture Control System for Quadcopter Navigation Using Deep Learning and ROS2
by Sergei Kondratev, Yulia Dyrchenkova, Georgiy Nikitin, Leonid Voskov, Vladimir Pikalov and Victor Meshcheryakov
Technologies 2026, 14(1), 69; https://doi.org/10.3390/technologies14010069 - 16 Jan 2026
Abstract
This paper presents Aerokinesis, an IoT-based software–hardware system for intuitive gesture-driven control of quadcopter unmanned aerial vehicles (UAVs), developed within the Robot Operating System 2 (ROS2) framework. The proposed system addresses the challenge of providing an accessible human–drone interaction interface for operators in scenarios where traditional remote controllers are impractical or unavailable. The architecture comprises two hierarchical control levels: (1) high-level discrete command control utilizing a fully connected neural network classifier for static gesture recognition, and (2) low-level continuous flight control based on three-dimensional hand keypoint analysis from a depth camera. The gesture classification module achieves an accuracy exceeding 99% using a multi-layer perceptron trained on MediaPipe-extracted hand landmarks. For continuous control, we propose a novel approach that computes Euler angles (roll, pitch, yaw) and throttle from 3D hand pose estimation, enabling intuitive four-degree-of-freedom quadcopter manipulation. A hybrid signal filtering pipeline ensures robust control signal generation while maintaining real-time responsiveness. Comparative user studies demonstrate that gesture-based control reduces task completion time by 52.6% for beginners compared to conventional remote controllers. The results confirm the viability of vision-based gesture interfaces for IoT-enabled UAV applications. Full article
(This article belongs to the Section Information and Communication Technologies)
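
A minimal sketch of deriving roll, pitch, yaw, and throttle from a few 3D hand keypoints, in the spirit of the continuous-control level described above; the keypoint choice, axis conventions, and throttle mapping are assumptions, not the Aerokinesis implementation.

```python
import numpy as np

def hand_pose_to_setpoints(wrist, index_mcp, pinky_mcp, middle_tip):
    """Derive roll/pitch/yaw (deg) and a 0-1 throttle from a few 3D hand
    keypoints. Coordinates: x right, y up, z toward the camera (metres)."""
    across = index_mcp - pinky_mcp              # vector across the knuckles
    forward = middle_tip - wrist                # wrist -> middle fingertip
    roll = np.degrees(np.arctan2(across[1], across[0]))          # tilt of the knuckle line
    pitch = np.degrees(np.arctan2(forward[1],
                                  np.hypot(forward[0], forward[2])))
    yaw = np.degrees(np.arctan2(forward[0], -forward[2]))        # heading in the x-z plane
    throttle = float(np.clip((wrist[1] + 0.2) / 0.4, 0.0, 1.0))  # palm height -> 0..1
    return roll, pitch, yaw, throttle

# Flat hand held level, pointing away from the camera, at mid height.
wrist      = np.array([0.00, 0.00, 0.40])
index_mcp  = np.array([0.04, 0.00, 0.32])
pinky_mcp  = np.array([-0.04, 0.00, 0.33])
middle_tip = np.array([0.00, 0.00, 0.25])
print(hand_pose_to_setpoints(wrist, index_mcp, pinky_mcp, middle_tip))
# -> roll ≈ 0, pitch ≈ 0, yaw ≈ 0 (pointing straight ahead), throttle = 0.5
```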

26 pages, 3626 KB  
Article
A Lightweight Frozen Multi-Convolution Dual-Branch Network for Efficient sEMG-Based Gesture Recognition
by Shengbiao Wu, Zhezhe Lv, Yuehong Li, Chengmin Fang, Tao You and Jiazheng Gui
Sensors 2026, 26(2), 580; https://doi.org/10.3390/s26020580 - 15 Jan 2026
Abstract
Gesture recognition is important for rehabilitation assistance and intelligent prosthetic control. However, surface electromyography (sEMG) signals exhibit strong non-stationarity, and conventional deep-learning models require long training time and high computational cost, limiting their use on resource-constrained devices. This study proposes a Frozen Multi-Convolution Dual-Branch Network (FMC-DBNet) to address these challenges. The model employs randomly initialized and fixed convolutional kernels for training-free multi-scale feature extraction, substantially reducing computational overhead. A dual-branch architecture is adopted to capture complementary temporal and physiological patterns from raw sEMG signals and intrinsic mode functions (IMFs) obtained through variational mode decomposition (VMD). In addition, positive-proportion (PPV) and global-average-pooling (GAP) statistics enhance lightweight multi-resolution representation. Experiments on the Ninapro DB1 dataset show that FMC-DBNet achieves an average accuracy of 96.4% ± 1.9% across 27 subjects and reduces training time by approximately 90% compared with a conventional trainable CNN baseline. These results demonstrate that frozen random-convolution structures provide an efficient and robust alternative to fully trained deep networks, offering a promising solution for low-power and computationally efficient sEMG gesture recognition. Full article
(This article belongs to the Section Electronic Sensors)
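
As a simplified illustration of the frozen-convolution idea described above, the sketch below convolves one sEMG window with fixed random, dilated kernels and summarizes each response with PPV and GAP statistics; kernel count, length, and dilations are illustrative, and the dual-branch/VMD part is omitted.

```python
import numpy as np

rng = np.random.default_rng(7)

def frozen_conv_features(x, n_kernels=50, kernel_len=9):
    """Training-free features from one sEMG window: convolve with fixed random
    kernels at several dilations and summarize each response with PPV
    (proportion of positive values) and GAP (global average)."""
    feats = []
    for _ in range(n_kernels):
        w = rng.standard_normal(kernel_len)
        w -= w.mean()                                  # zero-mean kernel
        dilation = rng.choice([1, 2, 4])
        idx = np.arange(kernel_len) * dilation
        resp = np.array([np.dot(x[i + idx], w)
                         for i in range(len(x) - idx[-1])])
        feats.append(np.mean(resp > 0))                # PPV
        feats.append(resp.mean())                      # GAP
    return np.array(feats)

window = rng.normal(0, 1e-3, 400)                      # one raw sEMG window
features = frozen_conv_features(window)
print(features.shape)                                  # (100,) -> fed to a small classifier
```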
