Search Results (121)

Search Parameters:
Keywords = human motion state recognition

25 pages, 16941 KiB  
Article
KAN-Sense: Keypad Input Recognition via CSI Feature Clustering and KAN-Based Classifier
by Minseok Koo and Jaesung Park
Electronics 2025, 14(15), 2965; https://doi.org/10.3390/electronics14152965 - 24 Jul 2025
Abstract
Wi-Fi sensing leverages variations in CSI (channel state information) to infer human activities in a contactless and low-cost manner, with growing applications in smart homes, healthcare, and security. While deep learning has advanced macro-motion sensing tasks, micro-motion sensing such as keypad stroke recognition remains underexplored due to subtle inter-class CSI variations and significant intra-class variance. These challenges make it difficult for existing deep learning models, which typically rely on fully connected MLPs, to accurately recognize keypad inputs. To address these issues, we propose a novel approach that combines a discriminative feature extractor with a Kolmogorov–Arnold Network (KAN)-based classifier. The combined model is trained to reduce intra-class variability by clustering features around class-specific centers. The KAN classifier learns nonlinear spline functions to efficiently delineate the complex decision boundaries between different keypad inputs with fewer parameters. To validate our method, we collect a CSI dataset with low-cost Wi-Fi devices (ESP8266 and Raspberry Pi 4) in a real-world keypad sensing environment. Experimental results verify the effectiveness and practicality of our method for keypad input sensing applications: it outperforms existing approaches in sensing accuracy while requiring fewer parameters. Full article
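
The "clustering features around class-specific centers" objective resembles a center loss; a minimal PyTorch sketch of that idea, assuming learnable per-class centers (all names are illustrative, not from the paper):

```python
import torch
import torch.nn as nn

class CenterLoss(nn.Module):
    """Pull each feature vector toward a learnable center for its class,
    shrinking intra-class variance (a sketch of the clustering objective)."""
    def __init__(self, num_classes: int, feat_dim: int):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, feats: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Mean squared distance between each feature and its class center.
        return ((feats - self.centers[labels]) ** 2).sum(dim=1).mean()

# Typical use: total = cross_entropy + lambda_c * center_loss(feats, labels)
```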

18 pages, 9571 KiB  
Article
TCN-MAML: A TCN-Based Model with Model-Agnostic Meta-Learning for Cross-Subject Human Activity Recognition
by Chih-Yang Lin, Chia-Yu Lin, Yu-Tso Liu, Yi-Wei Chen, Hui-Fuang Ng and Timothy K. Shih
Sensors 2025, 25(13), 4216; https://doi.org/10.3390/s25134216 - 6 Jul 2025
Viewed by 269
Abstract
Human activity recognition (HAR) using Wi-Fi-based sensing has emerged as a powerful, non-intrusive solution for monitoring human behavior in smart environments. Unlike wearable sensor systems that require user compliance, Wi-Fi channel state information (CSI) enables device-free recognition by capturing variations in signal propagation caused by human motion. This makes Wi-Fi sensing highly attractive for ambient healthcare, security, and elderly care applications. However, real-world deployment faces two major challenges: (1) significant cross-subject signal variability due to physical and behavioral differences among individuals, and (2) limited labeled data, which restricts model generalization. To address these sensor-related challenges, we propose TCN-MAML, a novel framework that integrates temporal convolutional networks (TCN) with model-agnostic meta-learning (MAML) for efficient cross-subject adaptation in data-scarce conditions. We evaluate our approach on a public Wi-Fi CSI dataset using a strict cross-subject protocol, where training and testing subjects do not overlap. The proposed TCN-MAML achieves 99.6% accuracy, demonstrating superior generalization and efficiency over baseline methods. Experimental results confirm the framework’s suitability for low-power, real-time HAR systems embedded in IoT sensor networks. Full article
(This article belongs to the Special Issue Sensors and Sensing Technologies for Object Detection and Recognition)
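
For readers unfamiliar with MAML, a first-order sketch of one meta-update, assuming a PyTorch model and torch.func.functional_call (function and variable names are ours, not the authors'):

```python
import torch

def maml_step(model, loss_fn, support, query, inner_lr=0.01):
    """One (first-order) MAML meta-update sketch: adapt on a subject's
    support set, then score the adapted weights on its query set."""
    (xs, ys), (xq, yq) = support, query
    # Inner loop: take one gradient step on a copy of the parameters.
    fast = {n: p.clone() for n, p in model.named_parameters()}
    inner_loss = loss_fn(torch.func.functional_call(model, fast, (xs,)), ys)
    grads = torch.autograd.grad(inner_loss, list(fast.values()))
    fast = {n: p - inner_lr * g for (n, p), g in zip(fast.items(), grads)}
    # Outer objective: backprop this through the original parameters
    # (pass create_graph=True above for the full second-order variant).
    return loss_fn(torch.func.functional_call(model, fast, (xq,)), yq)
```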

25 pages, 2723 KiB  
Article
A Human-Centric, Uncertainty-Aware Event-Fused AI Network for Robust Face Recognition in Adverse Conditions
by Akmalbek Abdusalomov, Sabina Umirzakova, Elbek Boymatov, Dilnoza Zaripova, Shukhrat Kamalov, Zavqiddin Temirov, Wonjun Jeong, Hyoungsun Choi and Taeg Keun Whangbo
Appl. Sci. 2025, 15(13), 7381; https://doi.org/10.3390/app15137381 - 30 Jun 2025
Cited by 1 | Viewed by 271
Abstract
Face recognition systems often falter when deployed in uncontrolled settings, grappling with low light, unexpected occlusions, motion blur, and the degradation of sensor signals. Most contemporary algorithms chase raw accuracy yet overlook the pragmatic need for uncertainty estimation and multispectral reasoning rolled into a single framework. This study introduces HUE-Net—a Human-centric, Uncertainty-aware, Event-fused Network—designed specifically to thrive under severe environmental stress. HUE-Net marries the visible RGB band with near-infrared (NIR) imagery and high-temporal-event data through an early-fusion pipeline, proven more responsive than serial approaches. A custom hybrid backbone that couples convolutional networks with transformers keeps the model nimble enough for edge devices. Central to the architecture is the perturbed multi-branch variational module, which distills probabilistic identity embeddings while delivering calibrated confidence scores. Complementing this, an Adaptive Spectral Attention mechanism dynamically reweights each stream to amplify the most reliable facial features in real time. Unlike previous efforts that compartmentalize uncertainty handling, spectral blending, or computational thrift, HUE-Net unites all three in a lightweight package. Benchmarks on the IJB-C and N-SpectralFace datasets illustrate that the system not only secures state-of-the-art accuracy but also exhibits unmatched spectral robustness and reliable probability calibration. The results indicate that HUE-Net is well-positioned for forensic missions and humanitarian scenarios where trustworthy identification cannot be deferred. Full article
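
The Adaptive Spectral Attention idea — reweighting the RGB, NIR, and event streams by estimated reliability — might look like the following gating sketch (our guess at the general mechanism, not the paper's code):

```python
import torch
import torch.nn as nn

class SpectralAttention(nn.Module):
    """Softmax gate over per-stream embeddings so the most reliable
    spectral band dominates the fused feature (illustrative only)."""
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, streams):                   # list of (B, dim) tensors
        x = torch.stack(streams, dim=1)           # (B, S, dim)
        w = torch.softmax(self.score(x), dim=1)   # (B, S, 1) stream weights
        return (w * x).sum(dim=1)                 # fused (B, dim)

# fused = SpectralAttention(256)([rgb_feat, nir_feat, event_feat])
```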

16 pages, 6543 KiB  
Article
IoT-Edge Hybrid Architecture with Cross-Modal Transformer and Federated Manifold Learning for Safety-Critical Gesture Control in Adaptive Mobility Platforms
by Xinmin Jin, Jian Teng and Jiaji Chen
Future Internet 2025, 17(7), 271; https://doi.org/10.3390/fi17070271 - 20 Jun 2025
Viewed by 641
Abstract
This research presents an IoT-empowered adaptive mobility framework that integrates high-dimensional gesture recognition with edge-cloud orchestration for safety-critical human–machine interaction. The system architecture establishes a three-tier IoT network: a perception layer with 60 GHz FMCW radar and TOF infrared arrays (12-node mesh topology, 15 cm baseline spacing) for real-time motion tracking; an edge intelligence layer deploying a time-aware neural network via NVIDIA Jetson Nano to achieve up to 99.1% recognition accuracy with latency as low as 48 ms under optimal conditions (typical performance: 97.8% ± 1.4% accuracy, 68.7 ms ± 15.3 ms latency); and a federated cloud layer enabling distributed model synchronization across 32 edge nodes via LoRaWAN-optimized protocols (κ = 0.912 consensus). A reconfigurable chassis with three operational modes (standing, seated, balance) employs IoT-driven kinematic optimization for enhanced adaptability and user safety. Using both radar and infrared sensors together reduces false detections to 0.08% even under high-vibration conditions (80 km/h), while distributed learning across multiple devices maintains consistent accuracy (variance < 5%) in different environments. Experimental results demonstrate 93% reliability improvement over HMM baselines and 3.8% accuracy gain over state-of-the-art LSTM models, while achieving 33% faster inference (48.3 ms vs. 72.1 ms). The system maintains industrial-grade safety certification with energy-efficient computation. Bridging adaptive mechanics with edge intelligence, this research pioneers a sustainable IoT-edge paradigm for smart mobility, harmonizing real-time responsiveness, ecological sustainability, and scalable deployment in complex urban ecosystems. Full article
(This article belongs to the Special Issue Convergence of IoT, Edge and Cloud Systems)
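
The reported drop in false detections from pairing radar with infrared is consistent with a simple agreement rule; a minimal sketch, assuming each modality outputs a detection confidence (thresholds are invented):

```python
def fused_detection(radar_conf: float, ir_conf: float,
                    t_radar: float = 0.6, t_ir: float = 0.6) -> bool:
    """Fire only when both modalities agree, trading a little recall
    for far fewer false positives under vibration or clutter."""
    return radar_conf >= t_radar and ir_conf >= t_ir
```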

19 pages, 30474 KiB  
Article
Multi-Head Attention-Based Framework with Residual Network for Human Action Recognition
by Basheer Al-Tawil, Magnus Jung, Thorsten Hempel and Ayoub Al-Hamadi
Sensors 2025, 25(9), 2930; https://doi.org/10.3390/s25092930 - 6 May 2025
Viewed by 710
Abstract
Human action recognition (HAR) is essential for understanding and classifying human movements. It is widely used in real-life applications such as human–computer interaction and assistive robotics. However, recognizing patterns across different temporal scales remains challenging. Traditional methods struggle with complex timing patterns, intra-class variability, and inter-class similarities, leading to misclassifications. In this paper, we propose a deep learning framework for efficient and robust HAR. It integrates residual networks (ResNet-18) for spatial feature extraction and Bi-LSTM for temporal feature extraction. A multi-head attention mechanism enhances the prioritization of crucial motion details. Additionally, we introduce a motion-based frame selection strategy utilizing optical flow to reduce redundancy and enhance efficiency. This ensures accurate, real-time recognition of both simple and complex actions. We evaluate the framework on the UCF-101 dataset, achieving a 96.60% accuracy, demonstrating competitive performance against state-of-the-art approaches. Moreover, the framework operates at 222 frames per second (FPS), achieving an optimal balance between recognition performance and computational efficiency. The proposed framework was also deployed and tested on a mobile service robot, TIAGo, validating its real-time applicability in real-world scenarios. It effectively models human actions while minimizing frame dependency, making it well-suited for real-time applications. Full article
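
The motion-based frame selection could be approximated with dense optical flow, keeping only frames with the largest mean flow magnitude; a sketch with OpenCV (parameter values are illustrative, not the paper's):

```python
import cv2
import numpy as np

def select_moving_frames(frames, keep_ratio=0.5):
    """Score frames by mean Farneback flow magnitude and keep the
    most dynamic ones, in original temporal order."""
    grays = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in frames]
    scores = [0.0]  # first frame has no predecessor
    for prev, cur in zip(grays, grays[1:]):
        flow = cv2.calcOpticalFlowFarneback(prev, cur, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        scores.append(float(np.linalg.norm(flow, axis=2).mean()))
    keep = max(1, int(len(frames) * keep_ratio))
    idx = np.argsort(scores)[::-1][:keep]
    return [frames[i] for i in sorted(idx)]
```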

18 pages, 3238 KiB  
Article
Multi-Grained Temporal Clip Transformer for Skeleton-Based Human Activity Recognition
by Peiwang Zhu, Chengwu Liang, Yalong Liu and Songqi Jiang
Appl. Sci. 2025, 15(9), 4768; https://doi.org/10.3390/app15094768 - 25 Apr 2025
Cited by 1 | Viewed by 587
Abstract
Skeleton-based human activity recognition is a key research topic in the fields of deep learning and computer vision. However, existing approaches are less effective at capturing short-term sub-action information at different granularity levels and long-term motion correlations, which affect recognition accuracy. To overcome these challenges, an innovative multi-grained temporal clip transformer (MTC-Former) is proposed. Firstly, based on the transformer backbone, a multi-grained temporal clip attention (MTCA) module with multi-branch architecture is proposed to capture the characteristics of short-term sub-action features. Secondly, an innovative multi-scale spatial–temporal feature interaction module is proposed to jointly learn sub-action dependencies and facilitate skeletal motion interactions, where long-range motion patterns are embedded to enhance correlation modeling. Experiments were conducted on three datasets, including NTU RGB+D, NTU RGB+D 120, and InHARD, and achieved state-of-the-art Top-1 recognition accuracy, demonstrating the superiority of the proposed MTC-Former. Full article
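
The multi-grained clip idea can be pictured as viewing one skeleton sequence at several clip lengths at once; a shape-level sketch (clip lengths are arbitrary, not the paper's settings):

```python
import torch

def clip_views(x: torch.Tensor, clip_lens=(4, 8, 16)):
    """Split a (B, T, C) skeleton sequence into non-overlapping clips at
    several granularities; each view would feed one attention branch."""
    views = []
    for L in clip_lens:
        T = x.size(1) - x.size(1) % L            # trim so L divides T
        views.append(x[:, :T].reshape(x.size(0), T // L, L, x.size(2)))
    return views  # one (B, num_clips, L, C) tensor per granularity
```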

22 pages, 8938 KiB  
Article
Enhancing Hand Gesture Image Recognition by Integrating Various Feature Groups
by Ismail Taha Ahmed, Wisam Hazim Gwad, Baraa Tareq Hammad and Entisar Alkayal
Technologies 2025, 13(4), 164; https://doi.org/10.3390/technologies13040164 - 19 Apr 2025
Cited by 1 | Viewed by 1007
Abstract
Human gesture image recognition is the process of identifying, deciphering, and classifying human gestures in images or video frames using computer vision algorithms. These gestures can vary from the simplest hand motions, body positions, and facial emotions to complicated gestures. Two significant problems affecting the performance of human gesture image recognition methods are ambiguity and invariance. Ambiguity occurs when gestures have the same shape but different orientations, while invariance guarantees that gestures are correctly classified even when scale, lighting, or orientation varies. To overcome these issues, hand-crafted features can be combined with deep learning to greatly improve the performance of hand gesture image recognition models. This combination improves the model’s overall accuracy and dependability in identifying a variety of hand movements by enhancing its capacity to record both shape and texture properties. Thus, in this study, we propose a hand gesture recognition method that combines ResNet50 feature extraction with the Tamura texture descriptor and uses the adaptability of GAM to represent intricate interactions between the features. Experiments were carried out on publicly available datasets containing images of American Sign Language (ASL) gestures. As Tamura-ResNet50-OptimizedGAM achieved the highest accuracy rate in the ASL datasets, it is believed to be the best option for human gesture image recognition. According to the experimental results, the accuracy rate was 96%, which is higher than the total accuracy of the state-of-the-art techniques currently in use. Full article
(This article belongs to the Section Information and Communication Technologies)
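
The fusion step amounts to concatenating deep and hand-crafted descriptors before the classifier; a sketch with torchvision (the Tamura vector is assumed to come from a separate routine not shown here):

```python
import numpy as np
import torch
from torchvision import models

resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone = torch.nn.Sequential(*list(resnet.children())[:-1])  # drop fc head
backbone.eval()

def fused_features(img: torch.Tensor, tamura_vec: np.ndarray) -> np.ndarray:
    """Concatenate 2048-D ResNet50 features with a hand-crafted Tamura
    texture vector (computed elsewhere) for the downstream classifier."""
    with torch.no_grad():
        deep = backbone(img.unsqueeze(0)).flatten().numpy()  # (2048,)
    return np.concatenate([deep, tamura_vec])
```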

28 pages, 6367 KiB  
Article
Human Action Recognition from Videos Using Motion History Mapping and Orientation Based Three-Dimensional Convolutional Neural Network Approach
by Ishita Arora and M. Gangadharappa
Modelling 2025, 6(2), 33; https://doi.org/10.3390/modelling6020033 - 18 Apr 2025
Viewed by 1427
Abstract
Human Activity Recognition (HAR) has recently attracted the attention of researchers. Human behavior and human intention are driving the intensification of HAR research rapidly. This paper proposes a novel Motion History Mapping (MHI) and Orientation-based Convolutional Neural Network (CNN) framework for action recognition and classification using Machine Learning. The proposed method extracts oriented rectangular patches over the entire human body to represent the human pose in an action sequence. This distribution is represented by a spatially oriented histogram. The frames were trained with a 3D Convolutional Neural Network model, thus saving time and increasing the Classification Correction Rate (CCR). The K-Nearest Neighbor (KNN) algorithm is used for the classification of human actions. The uniqueness of our model lies in the combination of the Motion History Mapping approach with an Orientation-based 3D CNN, thereby enhancing precision. The proposed method is demonstrated to be effective using four widely used and challenging datasets. A comparison of the proposed method’s performance with current state-of-the-art methods finds that its Classification Correction Rate is higher than that of the existing methods. Our model’s CCRs are 92.91%, 98.88%, 87.97%, and 87.77%, which are remarkably higher than the existing techniques for the KTH, Weizmann, UT-Tower and YouTube datasets, respectively. Thus, our model significantly outperforms the existing models in the literature. Full article
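
A motion history image in its classic form sets just-moved pixels to a maximum timestamp and lets older motion decay, so one grayscale map summarizes a clip. A NumPy sketch over 2-D grayscale frames (threshold and decay values are illustrative):

```python
import numpy as np

def motion_history(frames, tau=30, thresh=25):
    """Build a motion history image from a list of grayscale uint8 frames:
    bright where motion is recent, fading linearly where it is older."""
    mhi = np.zeros(frames[0].shape, dtype=np.float32)
    for prev, cur in zip(frames, frames[1:]):
        moving = np.abs(cur.astype(np.int16) - prev.astype(np.int16)) >= thresh
        mhi = np.where(moving, float(tau), np.maximum(mhi - 1.0, 0.0))
    return (mhi / tau * 255).astype(np.uint8)
```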

10 pages, 1379 KiB  
Proceeding Paper
Recognizing Human Emotions Through Body Posture Dynamics Using Deep Neural Networks
by Arunnehru Jawaharlalnehru, Thalapathiraj Sambandham and Dhanasekar Ravikumar
Eng. Proc. 2025, 87(1), 49; https://doi.org/10.3390/engproc2025087049 - 16 Apr 2025
Viewed by 857
Abstract
Body posture dynamics have garnered significant attention in recent years due to their critical role in understanding the emotional states conveyed through human movements during social interactions. Emotions are typically expressed through facial expressions, voice, gait, posture, and overall body dynamics. Among these, body posture provides subtle yet essential cues about emotional states. However, predicting an individual’s gait and posture dynamics poses challenges, given the complexity of human body movement, which involves numerous degrees of freedom compared to facial expressions. Moreover, unlike static facial expressions, body dynamics are inherently fluid and continuously evolving. This paper presents an effective method for recognizing 17 micro-emotions by analyzing kinematic features from the GEMEP dataset using video-based motion capture. We specifically focus on upper body posture dynamics (skeleton points and angle), capturing movement patterns and their dynamic range over time. Our approach addresses the complexity of recognizing emotions from posture and gait by focusing on key elements of kinematic gesture analysis. The experimental results demonstrate the effectiveness of the proposed model, achieving a high accuracy rate of 91.48% for angle metric + DNN and 93.89% for distance + DNN on the GEMEP dataset using a deep neural network (DNN). These findings highlight the potential for our model to advance posture-based emotion recognition, particularly in applications where human body dynamics distance and angle are key indicators of emotional states. Full article
(This article belongs to the Proceedings of The 5th International Electronic Conference on Applied Sciences)
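
The distance and angle features referenced above reduce to simple geometry over skeleton keypoints; a sketch (keypoint layout assumed, not GEMEP-specific):

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle at keypoint b (degrees) formed by segments b->a and b->c."""
    v1, v2 = a - b, c - b
    cos = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def pairwise_distances(kps):
    """All unique inter-keypoint distances from a (J, 2) array,
    flattened into one feature vector for the DNN."""
    d = np.linalg.norm(kps[:, None] - kps[None, :], axis=-1)
    return d[np.triu_indices(len(kps), k=1)]
```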

12 pages, 8262 KiB  
Article
High-Sensitivity and Wide-Range Flexible Pressure Sensor Based on Gradient-Wrinkle Structures and AgNW-Coated PDMS
by Xiaoran Liu, Xinyi Wang, Tao Xue, Yingying Zhao and Qiang Zou
Micromachines 2025, 16(4), 468; https://doi.org/10.3390/mi16040468 - 15 Apr 2025
Cited by 1 | Viewed by 763
Abstract
Flexible pressure sensors have garnered significant attention due to their wide range of applications in human motion monitoring and smart wearable devices. However, the fabrication of pressure sensors that offer both high sensitivity and a wide detection range remains a challenging task. In this paper, we propose an AgNW-coated PDMS flexible piezoresistive sensor based on a gradient-wrinkle structure. By modifying the microstructure of PDMS, the sensor demonstrates varying sensitivities and pressure responses across different pressure ranges. The wrinkle microstructure contributes to high sensitivity (0.947 kPa⁻¹) at low pressures, while the PDMS film with a gradient contact height ensures a continuous change in the contact area through the gradual activation of the contact wrinkles, resulting in a wide detection range (10–50 kPa). This paper also investigates the contact state of gradient-wrinkle films under different pressures to further elaborate on the sensor’s sensing mechanism. The sensor’s excellent performance in real-time response to touch behavior, joint motion, swallowing behavior recognition, and grasping behavior detection highlights its broad application prospects in human–computer interaction, human motion monitoring, and intelligent robotics. Full article
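
For reference, the sensitivity figure quoted above is conventionally defined for piezoresistive sensors as the slope of relative current change versus applied pressure — a standard definition, not taken from this paper:

```latex
S = \frac{\partial\,(\Delta I / I_0)}{\partial P}
```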

17 pages, 7673 KiB  
Article
Motion Pattern Recognition via CNN-LSTM-Attention Model Using Array-Based Wi-Fi CSI Sensors in GNSS-Denied Areas
by Ming Xia, Shengmao Que, Nanzhu Liu, Qu Wang and Tuan Li
Electronics 2025, 14(8), 1594; https://doi.org/10.3390/electronics14081594 - 15 Apr 2025
Viewed by 883
Abstract
Human activity recognition (HAR) is vital for applications in fields such as smart homes, health monitoring, and navigation, particularly in GNSS-denied environments where satellite signals are obstructed. Wi-Fi channel state information (CSI) has emerged as a key technology for HAR due to its wide coverage, low cost, and non-reliance on wearable devices. However, existing methods face challenges including significant data fluctuations, limited feature extraction capabilities, and difficulties in recognizing complex movements. This study presents a novel solution by integrating a multi-sensor array of Wi-Fi CSI with deep learning techniques to overcome these challenges. We propose a 2 × 2 array of Wi-Fi CSI sensors, which collects synchronized data from all channels within the CSI receivable range, improving data stability and providing reliable positioning in GNSS-denied environments. Using the CNN-LSTM-attention (C-L-A) framework, this method combines short- and long-term motion features, enhancing recognition accuracy. Experimental results show 98.2% accuracy, demonstrating superior recognition performance compared to single Wi-Fi receivers and traditional deep learning models. Our multi-sensor Wi-Fi CSI and deep learning approach significantly improves HAR accuracy, generalization, and adaptability, making it an ideal solution for GNSS-denied environments in applications such as autonomous navigation and smart cities. Full article
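
A generic CNN-LSTM-attention stack for CSI sequences, to make the C-L-A pipeline concrete (layer sizes are ours; the paper's exact architecture may differ):

```python
import torch
import torch.nn as nn

class CLA(nn.Module):
    """Conv1d for short-term features -> LSTM for long-term dynamics
    -> attention pooling over time -> class logits."""
    def __init__(self, in_ch: int, num_classes: int, hidden: int = 128):
        super().__init__()
        self.cnn = nn.Sequential(nn.Conv1d(in_ch, 64, 5, padding=2),
                                 nn.ReLU(), nn.MaxPool1d(2))
        self.lstm = nn.LSTM(64, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, x):                    # x: (B, subcarriers, time)
        h = self.cnn(x).transpose(1, 2)      # (B, T', 64)
        h, _ = self.lstm(h)                  # (B, T', hidden)
        w = torch.softmax(self.attn(h), dim=1)   # temporal attention weights
        return self.fc((w * h).sum(dim=1))   # (B, num_classes)
```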

14 pages, 567 KiB  
Article
Efficient Human Activity Recognition Using Machine Learning and Wearable Sensor Data
by Ziwei Zhong and Bin Liu
Appl. Sci. 2025, 15(8), 4075; https://doi.org/10.3390/app15084075 - 8 Apr 2025
Viewed by 788
Abstract
With the rapid advancement of global development, there is an increasing demand for health monitoring technologies. Human activity recognition and monitoring systems offer a powerful means of identifying daily movement patterns, which helps in understanding human behaviors and provides valuable insights for life management. This paper explores the issue of human motion state recognition using accelerometers and gyroscopes, proposing a human activity recognition system based on a majority decision model that integrates multiple machine learning algorithms. In this study, the majority decision model was compared with an integer programming model, and the accuracy was assessed through a confusion matrix and cross-validation based on a dataset generated from 10 volunteers performing 12 different human activities. The average activity recognition accuracy of the majority decision model can be as high as 91.92%. The results underscore the superior accuracy and efficiency of the majority decision model in human activity state recognition, highlighting its potential for practical applications in health monitoring systems. Full article
(This article belongs to the Special Issue Human Activity Recognition (HAR) in Healthcare, 2nd Edition)
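
Majority voting over several base learners is directly available in scikit-learn; a sketch with illustrative classifiers (the ensemble members here are our choice, not necessarily the paper's):

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Hard voting: each base model casts one vote per window of
# accelerometer/gyroscope features; the plurality label wins.
ensemble = VotingClassifier(
    estimators=[("rf", RandomForestClassifier()),
                ("knn", KNeighborsClassifier()),
                ("svm", SVC())],
    voting="hard",
)
# ensemble.fit(X_train, y_train); ensemble.score(X_test, y_test)
```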

57 pages, 8107 KiB  
Review
Machine Learning for Human Activity Recognition: State-of-the-Art Techniques and Emerging Trends
by Md Amran Hossen and Pg Emeroylariffion Abas
J. Imaging 2025, 11(3), 91; https://doi.org/10.3390/jimaging11030091 - 20 Mar 2025
Cited by 1 | Viewed by 3612
Abstract
Human activity recognition (HAR) has emerged as a transformative field with widespread applications, leveraging diverse sensor modalities to accurately identify and classify human activities. This paper provides a comprehensive review of HAR techniques, focusing on the integration of sensor-based, vision-based, and hybrid methodologies. It explores the strengths and limitations of commonly used modalities, such as RGB images/videos, depth sensors, motion capture systems, wearable devices, and emerging technologies like radar and Wi-Fi channel state information. The review also discusses traditional machine learning approaches, including supervised and unsupervised learning, alongside cutting-edge advancements in deep learning, such as convolutional and recurrent neural networks, attention mechanisms, and reinforcement learning frameworks. Despite significant progress, HAR still faces critical challenges, including handling environmental variability, ensuring model interpretability, and achieving high recognition accuracy in complex, real-world scenarios. Future research directions emphasise the need for improved multimodal sensor fusion, adaptive and personalised models, and the integration of edge computing for real-time analysis. Additionally, addressing ethical considerations, such as privacy and algorithmic fairness, remains a priority as HAR systems become more pervasive. This study highlights the evolving landscape of HAR and outlines strategies for future advancements that can enhance the reliability and applicability of HAR technologies in diverse domains. Full article

24 pages, 3877 KiB  
Article
A Hybrid Approach for Sports Activity Recognition Using Key Body Descriptors and Hybrid Deep Learning Classifier
by Muhammad Tayyab, Sulaiman Abdullah Alateyah, Mohammed Alnusayri, Mohammed Alatiyyah, Dina Abdulaziz AlHammadi, Ahmad Jalal and Hui Liu
Sensors 2025, 25(2), 441; https://doi.org/10.3390/s25020441 - 13 Jan 2025
Cited by 8 | Viewed by 1151
Abstract
This paper presents an approach for event recognition in sequential images using human body part features and their surrounding context. Key body points were approximated to track and monitor their presence in complex scenarios. Various feature descriptors, including MSER (Maximally Stable Extremal Regions), SURF (Speeded-Up Robust Features), distance transform, and DOF (Degrees of Freedom), were applied to skeleton points, while BRIEF (Binary Robust Independent Elementary Features), HOG (Histogram of Oriented Gradients), FAST (Features from Accelerated Segment Test), and Optical Flow were used on silhouettes or full-body points to capture both geometric and motion-based features. Feature fusion was employed to enhance the discriminative power of the extracted data and the physical parameters calculated by different feature extraction techniques. The system utilized a hybrid CNN (Convolutional Neural Network) + RNN (Recurrent Neural Network) classifier for event recognition, with Grey Wolf Optimization (GWO) for feature selection. Experimental results showed significant accuracy, achieving 98.5% on the UCF-101 dataset and 99.2% on the YouTube dataset. Compared to state-of-the-art methods, our approach achieved better performance in event recognition. Full article
(This article belongs to the Section Intelligent Sensors)
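
Several of the descriptors named above are one-liners in common libraries; for instance, HOG on a silhouette crop via scikit-image (parameters are typical defaults, not the paper's settings):

```python
from skimage.feature import hog

def hog_descriptor(silhouette):
    """9-bin HOG over 8x8-pixel cells with 2x2-cell blocks on a
    2-D grayscale silhouette crop; returns a flat feature vector."""
    return hog(silhouette, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), feature_vector=True)
```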

17 pages, 11589 KiB  
Article
Deep Fusion of Skeleton Spatial–Temporal and Dynamic Information for Action Recognition
by Song Gao, Dingzhuo Zhang, Zhaoming Tang and Hongyan Wang
Sensors 2024, 24(23), 7609; https://doi.org/10.3390/s24237609 - 28 Nov 2024
Viewed by 1144
Abstract
Focusing on the issue of the low recognition rates achieved by traditional deep-information-based action recognition algorithms, an action recognition approach was developed based on skeleton spatial–temporal and dynamic features combined with a two-stream convolutional neural network (TS-CNN). Firstly, the skeleton’s three-dimensional coordinate system was transformed to obtain coordinate information related to relative joint positions. Subsequently, this relevant joint information was encoded as a color texture map to construct the spatial–temporal feature descriptor of the skeleton. Furthermore, physical structure constraints of the human body were considered to enhance class differences. Additionally, the speed information for each joint was estimated and encoded as a color texture map to achieve the skeleton motion feature descriptor. The resulting spatial–temporal and dynamic features were further enhanced using motion saliency and morphology operators to improve their expression ability. Finally, these enhanced skeleton spatial–temporal and dynamic features were deeply fused via TS-CNN for implementing action recognition. Numerous results from experiments conducted on the publicly available datasets NTU RGB-D, Northwestern-UCLA, and UTD-MHAD demonstrate that the recognition rates achieved via the developed approach are 86.25%, 87.37%, and 93.75%, respectively, indicating that the approach can effectively improve the accuracy of action recognition in complex environments compared to state-of-the-art algorithms. Full article
(This article belongs to the Section Intelligent Sensors)
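
Encoding joint coordinates as a color texture map typically means normalizing x/y/z to [0, 255] and laying out joints versus frames as image rows and columns; a sketch of that encoding (our reading of the common recipe, not the authors' exact mapping):

```python
import numpy as np

def skeleton_to_texture(seq: np.ndarray) -> np.ndarray:
    """Map a (T, J, 3) joint-coordinate sequence to a (J, T, 3) RGB image:
    rows = joints, columns = frames, channels = normalized x/y/z."""
    lo = seq.min(axis=(0, 1))
    hi = seq.max(axis=(0, 1))
    norm = (seq - lo) / (hi - lo + 1e-8)            # per-axis to [0, 1]
    return (norm.transpose(1, 0, 2) * 255).astype(np.uint8)
```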