Search Results (11)

Search Parameters:
Keywords = silhouette and activities recognition

27 pages, 3417 KiB  
Article
GaitCSF: Multi-Modal Gait Recognition Network Based on Channel Shuffle Regulation and Spatial-Frequency Joint Learning
by Siwei Wei, Xiangyuan Xu, Dewen Liu, Chunzhi Wang, Lingyu Yan and Wangyu Wu
Sensors 2025, 25(12), 3759; https://doi.org/10.3390/s25123759 - 16 Jun 2025
Viewed by 577
Abstract
Gait recognition, as a non-contact biometric technology, offers unique advantages in scenarios requiring long-distance identification without active cooperation from subjects. However, existing gait recognition methods predominantly rely on single-modal data, which demonstrates insufficient feature expression capabilities when confronted with complex factors in real-world environments, including viewpoint variations, clothing differences, occlusion problems, and illumination changes. This paper addresses these challenges by introducing a multi-modal gait recognition network based on channel shuffle regulation and spatial-frequency joint learning, which integrates two complementary modalities (silhouette data and heatmap data) to construct a more comprehensive gait representation. The channel shuffle-based feature selective regulation module achieves cross-channel information interaction and feature enhancement through channel grouping and feature shuffling strategies. This module divides input features along the channel dimension into multiple subspaces, which undergo channel-aware and spatial-aware processing to capture dependency relationships across different dimensions. Subsequently, channel shuffling operations facilitate information exchange between different semantic groups, achieving adaptive enhancement and optimization of features with relatively low parameter overhead. The spatial-frequency joint learning module maps spatiotemporal features to the spectral domain through fast Fourier transform, effectively capturing inherent periodic patterns and long-range dependencies in gait sequences. The global receptive field advantage of frequency domain processing enables the model to transcend local spatiotemporal constraints and capture global motion patterns. Concurrently, the spatial domain processing branch balances the contributions of frequency and spatial domain information through an adaptive weighting mechanism, maintaining computational efficiency while enhancing features. Experimental results demonstrate that the proposed GaitCSF model achieves significant performance improvements on mainstream datasets including GREW, Gait3D, and SUSTech1k, breaking through the performance bottlenecks of traditional methods. The implications of this research are significant for improving the performance and robustness of gait recognition systems when implemented in practical application scenarios. Full article
(This article belongs to the Collection Sensors for Gait, Human Movement Analysis, and Health Monitoring)
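
The channel grouping and shuffling described in this abstract follow the general pattern popularized by ShuffleNet-style architectures; the abstract gives no code, so the sketch below is only a minimal PyTorch illustration of such a shuffle, with tensor shapes and group count chosen arbitrarily rather than taken from GaitCSF.

```python
import torch

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Mix channels across groups so information flows between semantic groups."""
    b, c, h, w = x.shape
    assert c % groups == 0, "channel count must be divisible by the group count"
    # Split channels into groups, swap the group and per-group axes, then flatten:
    # channels from different groups end up interleaved.
    x = x.view(b, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.view(b, c, h, w)

# Example: a dummy silhouette feature map with 64 channels split into 4 groups.
features = torch.randn(2, 64, 32, 22)
shuffled = channel_shuffle(features, groups=4)
print(shuffled.shape)  # torch.Size([2, 64, 32, 22])
```

The shuffle itself adds no parameters, which is consistent with the low parameter overhead the abstract claims.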

16 pages, 2499 KiB  
Review
The Neural Correlates of Body Image Processing in Anorexia Nervosa and Bulimia Nervosa: An Activation Likelihood Estimation Meta-Analysis of fMRI Studies
by Lara Norrlin and Oliver Baumann
Int. J. Environ. Res. Public Health 2025, 22(1), 55; https://doi.org/10.3390/ijerph22010055 - 1 Jan 2025
Viewed by 2357
Abstract
Body image concerns are key prognostic and pathogenic factors of anorexia nervosa (AN) and bulimia nervosa (BN). This study aimed to investigate the neural mechanisms underlying body image perception across its two domains of estimation and satisfaction in anorexia and bulimia patients and healthy controls (HC). Systematic searches were conducted across eight databases (PubMed, Cochrane Library, Ovid, Google Scholar, Sage Journals, Scopus, PsycInfo, and ScienceDirect) from database inception until 23 April 2023. The sample comprised 14 functional magnetic resonance imaging (fMRI) studies and 556 participants, with tasks primarily including image- and silhouette-based body estimation and satisfaction paradigms. ALE meta-analysis was conducted to investigate significant clusters of activation foci across the different studies. Shared activations were observed between HC, AN, and BN patients in cortical regions related to object manipulation and recognition, visuospatial awareness, and memory and negative affect regulation. Differential activation in interoceptive and higher-order cognitive or affective control regions likely holds the key to pathological body distortion. This study outlined commonalities and differences in the correlates driving healthy body mapping and eating disorder pathology. Our findings provide pertinent implications for future research, current clinical interventions, and therapeutic outcomes. Full article
(This article belongs to the Special Issue The Associations between Eating Disorders and Psychological Health)

16 pages, 3231 KiB  
Article
A Low-Resolution Infrared Array for Unobtrusive Human Activity Recognition That Preserves Privacy
by Nishat Tasnim Newaz and Eisuke Hanada
Sensors 2024, 24(3), 926; https://doi.org/10.3390/s24030926 - 31 Jan 2024
Cited by 5 | Viewed by 2396
Abstract
This research uses a low-resolution infrared array sensor to address real-time human activity recognition while prioritizing the preservation of privacy. The proposed system captures thermal pixels that are represented as a human silhouette. With camera and image processing, it is easy to detect human activity, but that reduces privacy. This work proposes a novel human activity recognition system that uses interpolation and mathematical measures that are unobtrusive and do not involve machine learning. The proposed method directly and efficiently recognizes multiple human states in a real-time environment. This work also demonstrates the accuracy of the outcomes for various scenarios using traditional ML approaches. This low-resolution IR array sensor is effective and would be useful for activity recognition in homes and healthcare centers. Full article
(This article belongs to the Special Issue Infrared Sensing and Target Detection)
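
The abstract describes interpolating a low-resolution thermal frame and applying simple mathematical measures rather than machine learning. The sketch below is only an illustration of that style of pipeline: the 8x8 resolution, temperature thresholds, and aspect-ratio rule are assumptions, not the authors' published method.

```python
import numpy as np
from scipy.ndimage import zoom

AMBIENT_C = 24.0       # assumed ambient temperature (deg C)
PRESENCE_DELTA = 2.0   # pixels this much warmer than ambient count as "human"

def recognize_state(frame_8x8: np.ndarray) -> str:
    """Classify one coarse thermal frame into a simple state with hand-made rules."""
    frame = zoom(frame_8x8, 8, order=1)          # bilinear upsampling -> 64x64
    mask = frame > AMBIENT_C + PRESENCE_DELTA    # thermal "silhouette"
    if not mask.any():
        return "empty"
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    height = rows.max() - rows.min() + 1
    width = cols.max() - cols.min() + 1
    # Tall, narrow blobs suggest standing; wide, flat blobs suggest lying down.
    return "standing" if height > 1.3 * width else "lying"

rng = np.random.default_rng(0)
frame = AMBIENT_C + rng.normal(0.0, 0.3, (8, 8))
frame[2:7, 3:5] += 4.0                            # warm, roughly vertical blob
print(recognize_state(frame))                     # expected: standing
```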

27 pages, 3278 KiB  
Article
Deep Learning Approach for Human Action Recognition Using a Time Saliency Map Based on Motion Features Considering Camera Movement and Shot in Video Image Sequences
by Abdorreza Alavigharahbagh, Vahid Hajihashemi, José J. M. Machado and João Manuel R. S. Tavares
Information 2023, 14(11), 616; https://doi.org/10.3390/info14110616 - 15 Nov 2023
Cited by 7 | Viewed by 3483
Abstract
In this article, a hierarchical method for action recognition based on temporal and spatial features is proposed. In current HAR methods, camera movement, sensor movement, sudden scene changes, and scene movement can increase motion feature errors and decrease accuracy. Another important aspect to take into account in a HAR method is the required computational cost. The proposed method addresses these challenges with a preprocessing step that uses optical flow to detect camera movements and shots in the input video image sequences. In the temporal processing block, the optical flow technique is combined with the absolute value of frame differences to obtain a time saliency map. The detection of shots, cancellation of camera movement, and the building of a time saliency map minimise movement detection errors. The time saliency map is then passed to the spatial processing block to segment the moving persons and/or objects in the scene. Because the search region for spatial processing is limited based on the temporal processing results, the computations in the spatial domain are drastically reduced. In the spatial processing block, the scene foreground is extracted in three steps: silhouette extraction, active contour segmentation, and colour segmentation. Key points are selected at the borders of the segmented foreground. The final features are the intensity and angle of the optical flow at the detected key points. Using key point features for action detection reduces the computational cost of the classification step and the required training time. Finally, the features are submitted to a Recurrent Neural Network (RNN) to recognise the involved action. The proposed method was tested using four well-known action datasets, KTH, Weizmann, HMDB51, and UCF101, and its efficiency was evaluated. Since the proposed approach segments salient objects based on motion, edges, and colour features, it can be added as a preprocessing step to most current HAR systems to improve performance. Full article
(This article belongs to the Special Issue Computer Vision for Security Applications)
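
As a hedged illustration of the temporal processing block, the OpenCV sketch below combines dense optical-flow magnitude with the absolute frame difference into a time saliency map; the Farneback parameters and the equal weighting are illustrative choices, not the paper's settings.

```python
import cv2
import numpy as np

def time_saliency_map(prev_gray: np.ndarray, curr_gray: np.ndarray,
                      flow_weight: float = 0.5) -> np.ndarray:
    """Blend normalized optical-flow magnitude with the absolute frame difference."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    magnitude = cv2.magnitude(flow[..., 0], flow[..., 1])
    magnitude = cv2.normalize(magnitude, None, 0, 1, cv2.NORM_MINMAX)
    diff = cv2.absdiff(curr_gray, prev_gray).astype(np.float32) / 255.0
    return flow_weight * magnitude + (1.0 - flow_weight) * diff

# Dummy frames: a bright square shifts a few pixels between consecutive frames.
prev_frame = np.zeros((120, 160), np.uint8)
curr_frame = np.zeros((120, 160), np.uint8)
prev_frame[40:80, 50:90] = 255
curr_frame[40:80, 56:96] = 255
saliency = time_saliency_map(prev_frame, curr_frame)
print(saliency.shape, float(saliency.max()))
```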

24 pages, 5599 KiB  
Article
Comparative Analysis of the Clustering Quality in Self-Organizing Maps for Human Posture Classification
by Lisiane Esther Ekemeyong Awong and Teresa Zielinska
Sensors 2023, 23(18), 7925; https://doi.org/10.3390/s23187925 - 15 Sep 2023
Cited by 14 | Viewed by 4003
Abstract
The objective of this article is to develop a methodology for selecting the appropriate number of clusters to group and identify human postures using neural networks with unsupervised self-organizing maps. Although unsupervised clustering algorithms have proven effective in recognizing human postures, many works are limited to testing which data are correctly or incorrectly recognized. They often neglect the task of selecting the appropriate number of groups (where the number of clusters corresponds to the number of output neurons, i.e., the number of postures) using clustering quality assessments. The use of quality scores to determine the number of clusters frees the expert from making subjective decisions about the number of postures, enabling the use of unsupervised learning. Due to high dimensionality and data variability, expert decisions (referred to as data labeling) can be difficult and time-consuming. In our case, there is no manual labeling step. We introduce a new clustering quality score: the discriminant score (DS). We describe the process of selecting the most suitable number of postures using human activity records captured by RGB-D cameras. Comparative studies on the usefulness of popular clustering quality scores—such as the silhouette coefficient, Dunn index, Calinski–Harabasz index, Davies–Bouldin index, and DS—for posture classification tasks are presented, along with graphical illustrations of the results produced by DS. The findings show that DS offers good quality in posture recognition, effectively following postural transitions and similarities. Full article
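
Several of the quality indices named above are available in scikit-learn (the Dunn index and the paper's own discriminant score are not). The sketch below only illustrates the idea of letting such scores choose the number of clusters, with k-means and synthetic blobs standing in for the paper's self-organizing maps and RGB-D posture features.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (calinski_harabasz_score, davies_bouldin_score,
                             silhouette_score)

# Synthetic stand-in for posture feature vectors.
X, _ = make_blobs(n_samples=600, centers=5, n_features=10, random_state=0)

for k in range(2, 9):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    print(f"k={k}  silhouette={silhouette_score(X, labels):.3f}  "
          f"CH={calinski_harabasz_score(X, labels):.1f}  "
          f"DB={davies_bouldin_score(X, labels):.3f}")
# A reasonable cluster count is where silhouette and CH peak while DB is lowest;
# the paper applies the same selection idea to SOM output neurons with its DS score.
```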

22 pages, 3259 KiB  
Article
Hybrid InceptionV3-SVM-Based Approach for Human Posture Detection in Health Monitoring Systems
by Roseline Oluwaseun Ogundokun, Rytis Maskeliūnas, Sanjay Misra and Robertas Damasevicius
Algorithms 2022, 15(11), 410; https://doi.org/10.3390/a15110410 - 4 Nov 2022
Cited by 21 | Viewed by 4900
Abstract
Posture detection aimed at providing assessments for monitoring the health and welfare of humans has been of great interest to researchers from different disciplines. The use of computer vision systems for posture recognition might result in useful improvements in healthy aging and support for elderly people in their daily activities in the field of health care. Computer vision and pattern recognition communities are particularly interested in automated fall recognition. Human sensing and artificial intelligence have both paid great attention to human posture detection (HPD). The health status of elderly people can be remotely monitored using human posture detection, which can distinguish between positions such as standing, sitting, and walking. The most recent research identified posture using both deep learning (DL) and conventional machine learning (ML) classifiers. However, these techniques do not effectively identify postures, and the models tend to overfit. Therefore, this study suggests a deep convolutional neural network (DCNN) framework to examine and classify human posture in health monitoring systems. It combines a feature selection technique, a DCNN, and a machine learning technique to address the previously mentioned problems. The InceptionV3 DCNN model is hybridized with an SVM ML classifier and its performance is compared. Furthermore, the performance of the proposed system is validated against other transfer learning (TL) techniques such as InceptionV3, DenseNet121, and ResNet50. This study uses least absolute shrinkage and selection operator (LASSO)-based feature selection to enhance the feature vector. The study also used various techniques, such as data augmentation, dropout, and early stopping, to overcome the problem of model overfitting. The performance of this DCNN framework is tested using the benchmark Silhouettes of Human Posture dataset, and a classification accuracy, loss, and AUC of 95.42%, 0.01, and 99.35%, respectively, are attained. Furthermore, the proposed technology offers a highly promising solution for indoor monitoring systems. Full article
(This article belongs to the Special Issue Artificial Intelligence Algorithms for Medicine)
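
The sketch below is a loose illustration of the hybrid pipeline the abstract outlines (InceptionV3 feature extraction, LASSO-based feature selection, SVM classification). The dummy images, LASSO alpha, and SVM settings are assumptions for demonstration only, not the study's configuration.

```python
import numpy as np
import tensorflow as tf
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# InceptionV3 as a frozen feature extractor (downloads ImageNet weights on first use).
backbone = tf.keras.applications.InceptionV3(weights="imagenet", include_top=False,
                                             pooling="avg", input_shape=(299, 299, 3))

def extract_features(images: np.ndarray) -> np.ndarray:
    """images: float array of shape (n, 299, 299, 3) with pixel values in [0, 255]."""
    preprocessed = tf.keras.applications.inception_v3.preprocess_input(images)
    return backbone.predict(preprocessed, verbose=0)

# Dummy posture images and labels standing in for the Silhouettes dataset.
rng = np.random.default_rng(0)
images = rng.uniform(0, 255, (16, 299, 299, 3)).astype("float32")
labels = rng.integers(0, 3, 16)

features = extract_features(images)
classifier = make_pipeline(
    StandardScaler(),
    SelectFromModel(Lasso(alpha=0.001, max_iter=5000)),  # LASSO-based selection
    SVC(kernel="rbf", C=1.0),
)
classifier.fit(features, labels)
print(classifier.predict(features[:4]))
```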

24 pages, 5856 KiB  
Article
A Graph-Based Approach to Recognizing Complex Human Object Interactions in Sequential Data
by Yazeed Yasin Ghadi, Manahil Waheed, Munkhjargal Gochoo, Suliman A. Alsuhibany, Samia Allaoua Chelloug, Ahmad Jalal and Jeongmin Park
Appl. Sci. 2022, 12(10), 5196; https://doi.org/10.3390/app12105196 - 20 May 2022
Cited by 11 | Viewed by 2836
Abstract
The critical task of recognizing human–object interactions (HOI) finds its application in the domains of surveillance, security, healthcare, assisted living, rehabilitation, sports, and online learning. This has led to the development of various HOI recognition systems in the recent past. This study therefore develops a novel graph-based solution. In particular, the proposed system takes sequential data as input and recognizes the HOI being performed in it. First, the system pre-processes the input data by adjusting the contrast and smoothing the incoming image frames. Then, it locates the human and object through image segmentation. Based on this, 12 key body parts are identified from the extracted human silhouette through a graph-based image skeletonization technique called image foresting transform (IFT). Then, three types of features are extracted: full-body features, point-based features, and scene features. The next step involves optimizing the different features using isometric mapping (ISOMAP). Lastly, the optimized feature vector is fed to a graph convolution network (GCN), which performs the HOI classification. The performance of the proposed system was validated using three benchmark datasets, namely, Olympic Sports, MSR Daily Activity 3D, and D3D-HOI. The results showed that this model outperforms the existing state-of-the-art models by achieving a mean accuracy of 94.1% with the Olympic Sports, 93.2% with the MSR Daily Activity 3D, and 89.6% with the D3D-HOI datasets. Full article
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅲ)
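
The ISOMAP feature optimization step mentioned above can be illustrated with scikit-learn; the input dimensionality, neighbourhood size, and target dimension below are assumptions rather than the paper's values.

```python
import numpy as np
from sklearn.manifold import Isomap

# Stand-in for concatenated full-body, point-based, and scene feature vectors.
rng = np.random.default_rng(0)
raw_features = rng.normal(size=(200, 512))   # 200 samples, 512-D each

# Non-linear dimensionality reduction before the GCN classification stage.
embedding = Isomap(n_neighbors=10, n_components=32).fit_transform(raw_features)
print(embedding.shape)   # (200, 32) -> fed to the graph convolution network
```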

21 pages, 1441 KiB  
Article
Smart Home: Deep Learning as a Method for Machine Learning in Recognition of Face, Silhouette and Human Activity in the Service of a Safe Home
by George Vardakis, George Tsamis, Eleftheria Koutsaki, Kondylakis Haridimos and Nikos Papadakis
Electronics 2022, 11(10), 1622; https://doi.org/10.3390/electronics11101622 - 19 May 2022
Cited by 7 | Viewed by 5276
Abstract
Despite the general improvement in living conditions and construction methods, the sense of security in and around buildings is often unsatisfactory for their users, prompting the search for and implementation of increasingly effective protection measures. The insecurity that modern people face every day regarding their home security, especially in urban centers, has led computer science to develop intelligent systems that aim to mitigate the risks and ultimately consolidate the feeling of security. To establish this security, smart applications were created that turn a house into a Smart and Safe Home. We first present and analyze deep learning and emphasize its important contribution to machine learning, both in the development of methods for safety at home and in its contribution to other sciences, especially medicine, where the results are remarkable. We then analyze in detail the backpropagation algorithm in both linear and non-linear neural networks, as well as a simulation of the XOR problem. Machine learning has direct and effective applications, with impressive results, in the recognition of human activity and especially in face recognition, which is the most basic condition for choosing the most appropriate method when designing a smart home. Because of the large amount of data and the computing capacity a system needs to meet the requirements of a safe, smart home, technologies such as fog and cloud computing are used both for face recognition and for the recognition of human silhouettes and figures. These smart applications compose systems built mainly through deep learning methods based on machine learning techniques. Based on the study presented in this work, we believe that, with the use of DL technology, a largely safe house can be achieved today, covering an urgent need given the increase in crime. Full article
(This article belongs to the Special Issue Smart Applications of 5G Network)
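
Since the abstract singles out the backpropagation algorithm and the XOR problem simulation, here is a minimal NumPy sketch of a small sigmoid network trained with plain backpropagation on XOR; the layer sizes, learning rate, and epoch count are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# A 2-4-1 network with sigmoid activations.
W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for epoch in range(10000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: gradients of the squared error through the sigmoids.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out / len(X); b2 -= lr * d_out.mean(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_h / len(X);   b1 -= lr * d_h.mean(axis=0, keepdims=True)

print(np.round(out, 2).ravel())   # should approach [0, 1, 1, 0]
```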

24 pages, 5034 KiB  
Article
Action Classification for Partially Occluded Silhouettes by Means of Shape and Action Descriptors
by Katarzyna Gościewska and Dariusz Frejlichowski
Appl. Sci. 2021, 11(18), 8633; https://doi.org/10.3390/app11188633 - 16 Sep 2021
Cited by 3 | Viewed by 2129
Abstract
This paper presents an action recognition approach based on shape and action descriptors that is aimed at the classification of physical exercises under partial occlusion. Regular physical activity in adults can be seen as a form of non-communicable disease prevention, and may be aided by digital solutions that encourage individuals to increase their activity level. The application scenario includes workouts in front of the camera, where either the lower or upper part of the camera’s field of view is occluded. The proposed approach uses various features extracted from sequences of binary silhouettes, namely centroid trajectory, shape descriptors based on the Minimum Bounding Rectangle, action representation based on the Fourier transform, and leave-one-out cross-validation for classification. Several experiments combining various parameters and shape features are performed. Despite the presence of occlusion, it was possible to obtain about 90% accuracy for several action classes, with the use of elongation values observed over time and centroid trajectory. Full article
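
One of the ingredients described above, the elongation of the Minimum Bounding Rectangle tracked over a silhouette sequence and summarized with the Fourier transform, can be sketched with OpenCV and NumPy as follows; the synthetic silhouettes and the number of retained coefficients are assumptions, not the paper's parameters.

```python
import cv2
import numpy as np

def elongation_series(silhouettes: list) -> np.ndarray:
    """Per-frame elongation (long side / short side) of the minimum bounding rectangle."""
    values = []
    for mask in silhouettes:
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        points = np.vstack([c.reshape(-1, 2) for c in contours])
        (_, _), (w, h), _ = cv2.minAreaRect(points)
        values.append(max(w, h) / max(min(w, h), 1e-6))
    return np.asarray(values)

def action_descriptor(silhouettes: list, n_coeffs: int = 16) -> np.ndarray:
    """Magnitude of the first Fourier coefficients of the elongation signal."""
    spectrum = np.abs(np.fft.fft(elongation_series(silhouettes)))
    return spectrum[:n_coeffs]

# Dummy sequence: a rectangle whose height oscillates, mimicking an exercise cycle.
frames = []
for t in range(32):
    mask = np.zeros((128, 128), np.uint8)
    half_h = 30 + int(15 * np.sin(2 * np.pi * t / 16))
    cv2.rectangle(mask, (54, 64 - half_h), (74, 64 + half_h), 255, -1)
    frames.append(mask)
print(action_descriptor(frames).shape)   # (16,)
```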

12 pages, 384 KiB  
Article
The Analysis of Shape Features for the Purpose of Exercise Types Classification Using Silhouette Sequences
by Katarzyna Gościewska and Dariusz Frejlichowski
Appl. Sci. 2020, 10(19), 6728; https://doi.org/10.3390/app10196728 - 25 Sep 2020
Cited by 7 | Viewed by 2691
Abstract
This paper presents the idea of using simple shape features for action recognition based on binary silhouettes. Shape features are analysed as they change over time within an action sequence. It is shown that basic shape characteristics can discriminate between short, primitive actions performed by a single person. The proposed approach is tested on the Weizmann database using various numbers of classes. Binary foreground masks (silhouettes) are replaced with convex hulls, which highlights some shape characteristics. Centroid locations are combined with some other simple shape descriptors. Each action sequence is represented using a vector of shape features and the Discrete Fourier Transform. Classification is based on a leave-one-sequence-out approach and employs Euclidean distance, correlation coefficient or C1 correlation. A list of processing steps for action recognition is explained and followed by some experiments that yielded accuracy exceeding 90%. The idea behind the presented approach is to develop a solution for action recognition that could be applied in a human activity recognition system associated with the Ambient Assisted Living concept, helping adults increase their activity levels by monitoring them during exercises. Full article
(This article belongs to the Special Issue Advances in Image Processing, Analysis and Recognition Technology)
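
The leave-one-sequence-out classification with Euclidean distance described above can be sketched directly in NumPy; the toy descriptors stand in for the paper's DFT-based shape feature vectors.

```python
import numpy as np

def leave_one_sequence_out(descriptors: np.ndarray, labels: np.ndarray) -> float:
    """1-NN accuracy where each sequence is held out in turn (Euclidean distance)."""
    correct = 0
    for i in range(len(descriptors)):
        distances = np.linalg.norm(descriptors - descriptors[i], axis=1)
        distances[i] = np.inf                      # exclude the held-out sequence
        predicted = labels[np.argmin(distances)]   # nearest remaining sequence
        correct += int(predicted == labels[i])
    return correct / len(descriptors)

# Toy descriptors: two action classes with slightly different spectra.
rng = np.random.default_rng(1)
class_a = rng.normal(0.0, 0.1, (20, 8)) + np.linspace(1, 0, 8)
class_b = rng.normal(0.0, 0.1, (20, 8)) + np.linspace(0, 1, 8)
descriptors = np.vstack([class_a, class_b])
labels = np.array([0] * 20 + [1] * 20)
print(f"accuracy = {leave_one_sequence_out(descriptors, labels):.2f}")
```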

25 pages, 1734 KiB  
Article
A Depth Video Sensor-Based Life-Logging Human Activity Recognition System for Elderly Care in Smart Indoor Environments
by Ahmad Jalal, Shaharyar Kamal and Daijin Kim
Sensors 2014, 14(7), 11735-11759; https://doi.org/10.3390/s140711735 - 2 Jul 2014
Cited by 237 | Viewed by 13930
Abstract
Recent advancements in depth video sensor technologies have made human activity recognition (HAR) realizable for elderly monitoring applications. Although conventional HAR utilizes RGB video sensors, HAR could be greatly improved with depth video sensors, which produce depth or distance information. In this paper, a depth-based life-logging HAR system is designed to recognize the daily activities of elderly people and turn these environments into an intelligent living space. Initially, a depth imaging sensor is used to capture depth silhouettes. Based on these silhouettes, human skeletons with joint information are produced, which are further used for activity recognition and for generating life logs. The life-logging system is divided into two processes. First, the training system includes data collection using a depth camera, feature extraction, and training for each activity via Hidden Markov Models. Second, after training, the recognition engine recognizes the learned activities and produces life logs. The system was evaluated by comparing life-logging features with principal component and independent component features and achieved satisfactory recognition rates relative to the conventional approaches. Experiments conducted on the smart indoor activity datasets and the MSRDailyActivity3D dataset show promising results. The proposed system is directly applicable to any elderly monitoring system, such as monitoring healthcare problems for elderly people, or examining the indoor activities of people at home, office or hospital. Full article
(This article belongs to the Section Physical Sensors)
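
The per-activity Hidden Markov Model idea (train one HMM per activity, then pick the model with the highest likelihood at recognition time) can be sketched with the hmmlearn package; the feature dimensionality, state count, and synthetic sequences below are assumptions, not the paper's skeleton joint features.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

rng = np.random.default_rng(0)

def make_sequences(offset: float, n_seq: int = 10, length: int = 40, dim: int = 12):
    """Stack synthetic per-frame feature sequences for one activity."""
    seqs = [offset + rng.normal(size=(length, dim)) for _ in range(n_seq)]
    return np.vstack(seqs), [length] * n_seq

# Train one HMM per activity on its own sequences.
models = {}
for activity, offset in [("walking", 0.0), ("sitting", 3.0)]:
    X, lengths = make_sequences(offset)
    models[activity] = GaussianHMM(n_components=4, covariance_type="diag",
                                   n_iter=50, random_state=0).fit(X, lengths)

# Recognition: score a new sequence under every model and pick the best.
test_seq = 3.0 + rng.normal(size=(40, 12))          # resembles "sitting"
scores = {name: m.score(test_seq) for name, m in models.items()}
print(max(scores, key=scores.get))                   # expected: sitting
```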