Search Results (30)

Search Parameters:
Keywords = CASIA B dataset

24 pages, 824 KB  
Article
MMF-Gait: A Multi-Model Fusion-Enhanced Gait Recognition Framework Integrating Convolutional and Attention Networks
by Kamrul Hasan, Khandokar Alisha Tuhin, Md Rasul Islam Bapary, Md Shafi Ud Doula, Md Ashraful Alam, Md Atiqur Rahman Ahad and Md. Zasim Uddin
Symmetry 2025, 17(7), 1155; https://doi.org/10.3390/sym17071155 - 19 Jul 2025
Abstract
Gait recognition is a reliable biometric approach that uniquely identifies individuals based on their natural walking patterns. It is widely used because gait is difficult to camouflage and does not require a person’s cooperation. General face-based person recognition systems often fail to determine an offender’s identity when the face is concealed with a helmet or mask to evade identification. In such cases, gait-based recognition is ideal for identifying offenders, and most existing work leverages a deep learning (DL) model. However, a single model often fails to capture a comprehensive selection of refined patterns in the input data in the presence of external factors such as variation in viewing angle, clothing, and carrying conditions. In response, this paper introduces a fusion-based multi-model gait recognition framework that leverages the potential of convolutional neural networks (CNNs) and a vision transformer (ViT) in an ensemble manner to enhance gait recognition performance. Here, CNNs capture spatiotemporal features, while the ViT’s multiple attention layers focus on particular regions of the gait image. The first step in this framework is to obtain the Gait Energy Image (GEI) by averaging a height-normalized gait silhouette sequence over a gait cycle, which accommodates the left–right symmetry of gait. The GEI is then fed through multiple pre-trained models that are fine-tuned to extract deep spatiotemporal features. Three separate fusion strategies are then conducted: the first is decision-level fusion (DLF), which takes each model’s decision and employs majority voting for the final decision; the second is feature-level fusion (FLF), which combines the features from individual models through pointwise addition before performing gait recognition; finally, a hybrid fusion combines DLF and FLF for gait recognition. The performance of the multi-model fusion-based framework was evaluated on three publicly available gait databases: CASIA-B, OU-ISIR D, and the OU-ISIR Large Population dataset. The experimental results demonstrate that the fusion-enhanced framework achieves superior performance.
(This article belongs to the Special Issue Symmetry and Its Applications in Image Processing)
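The GEI step is straightforward to sketch. A minimal illustration in Python, assuming pre-cropped, height-normalized, binarized silhouette frames spanning one gait cycle (the function name and preprocessing are our assumptions, not the paper’s exact pipeline):

```python
import numpy as np

def gait_energy_image(silhouettes):
    """Average height-normalized binary silhouettes over one gait cycle.

    `silhouettes` is a list of (H, W) uint8 masks (0 or 255) from a single
    cycle; the mean image encodes how often each pixel is occupied.
    """
    stack = np.stack([s.astype(np.float32) / 255.0 for s in silhouettes])
    return stack.mean(axis=0)  # (H, W) float image in [0, 1]
```

Because left and right steps contribute symmetrically to the average, the GEI naturally tolerates the left–right symmetry of walking that the abstract mentions.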

25 pages, 3823 KB  
Article
Performance Evaluation of Various Deep Learning Models in Gait Recognition Using the CASIA-B Dataset
by Nakib Aman, Md. Rabiul Islam, Md. Faysal Ahamed and Mominul Ahsan
Technologies 2024, 12(12), 264; https://doi.org/10.3390/technologies12120264 - 17 Dec 2024
Cited by 2
Abstract
Human gait recognition (HGR) has been employed as a biometric technique for security purposes over the last decade. Various factors, including clothing, carried items, and walking surfaces, can influence the performance of gait recognition. Additionally, identifying individuals from different viewpoints presents a significant challenge in HGR. Numerous conventional and deep learning techniques have been introduced in the literature for HGR, but traditional methods are not well suited to handling large datasets. This research explores the effectiveness of four deep learning models for gait identification on the CASIA-B dataset: a convolutional neural network (CNN), a multi-layer perceptron (MLP), self-organizing maps (SOMs), and transfer learning with EfficientNet. The selected deep learning techniques offer robust feature extraction and efficient handling of large datasets, making them well suited to enhancing the accuracy of gait recognition. The collection includes gait sequences from 10 individuals, with a total of 92,596 images that were reduced to 64 × 64 pixels for uniformity. A modified model was developed by integrating sequential convolutional layers for detailed spatial feature extraction, followed by dense layers for classification, optimized through rigorous hyperparameter tuning and regularization techniques, resulting in an accuracy of 97.12% on the test set. This work enhances our understanding of deep learning methods in gait analysis, offering significant insights for choosing optimal models in security and surveillance applications.
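A minimal PyTorch sketch of the kind of model described: sequential convolutional layers for spatial feature extraction followed by dense classification layers with dropout regularization. The layer widths and dropout rate are illustrative assumptions; only the 64 × 64 input size and the 10-subject class count come from the abstract:

```python
import torch.nn as nn

class GaitCNN(nn.Module):
    """Illustrative CNN for 64x64 grayscale gait images."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 64 -> 32
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32 -> 16
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # 16 -> 8
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(128 * 8 * 8, 256), nn.ReLU(),
            nn.Dropout(0.5),  # regularization, as the abstract mentions
            nn.Linear(256, num_classes),
        )
    def forward(self, x):  # x: (N, 1, 64, 64)
        return self.classifier(self.features(x))
```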

15 pages, 1999 KB  
Article
Multi-Biometric Feature Extraction from Multiple Pose Estimation Algorithms for Cross-View Gait Recognition
by Ausrukona Ray, Md. Zasim Uddin, Kamrul Hasan, Zinat Rahman Melody, Prodip Kumar Sarker and Md Atiqur Rahman Ahad
Sensors 2024, 24(23), 7669; https://doi.org/10.3390/s24237669 - 30 Nov 2024
Cited by 5
Abstract
Gait recognition is a behavioral biometric technique that identifies individuals based on their unique walking patterns, enabling long-distance identification. Traditional gait recognition methods rely on appearance-based approaches that utilize background-subtracted silhouette sequences to extract gait features. While effective and easy to compute, these methods are susceptible to variations in clothing, carried objects, and illumination changes, compromising the extraction of discriminative features in real-world applications. In contrast, model-based approaches using skeletal key points offer robustness against these covariates. Advances in human pose estimation (HPE) algorithms using convolutional neural networks (CNNs) have facilitated the extraction of skeletal key points, addressing some challenges of model-based approaches. However, the performance of skeleton-based methods still lags behind that of appearance-based approaches. This paper aims to bridge this performance gap by introducing a multi-biometric framework that extracts features from multiple HPE algorithms for gait recognition, employing feature-level fusion (FLF) and decision-level fusion (DLF) through a single-source multi-sample technique. We utilized state-of-the-art HPE algorithms, OpenPose, AlphaPose, and HRNet, to generate diverse skeleton data samples from a single source video. Subsequently, we employed a residual graph convolutional network (ResGCN) to extract features from the generated skeleton data. In the FLF approach, the features extracted by ResGCN from the skeleton data samples generated by the multiple HPE algorithms are aggregated point-wise for gait recognition, while in the DLF approach, the decisions of ResGCN on each skeleton data sample are integrated using majority voting for the final recognition. Our proposed method demonstrated state-of-the-art skeleton-based cross-view gait recognition performance on the popular CASIA-B dataset.
(This article belongs to the Section Physical Sensors)
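Both fusion rules reduce to a few lines. A sketch assuming each HPE branch yields a ResGCN embedding (for FLF) or a predicted identity label (for DLF); the function names are ours:

```python
import torch
from collections import Counter

def feature_level_fusion(embeddings):
    """FLF: point-wise addition of the per-HPE ResGCN embeddings."""
    return torch.stack(embeddings).sum(dim=0)

def decision_level_fusion(labels):
    """DLF: majority vote over the per-HPE predicted identity labels."""
    return Counter(labels).most_common(1)[0][0]
```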

9 pages, 877 KB  
Proceeding Paper
Gait-Driven Pose Tracking and Movement Captioning Using OpenCV and MediaPipe Machine Learning Framework
by Malathi Janapati, Leela Priya Allamsetty, Tarun Teja Potluri and Kavya Vijay Mogili
Eng. Proc. 2024, 82(1), 4; https://doi.org/10.3390/ecsa-11-20470 - 26 Nov 2024
Cited by 1
Abstract
Pose tracking and captioning are extensively employed for motion capture and activity description in daylight vision scenarios. Activity detection through camera systems presents a complex challenge, necessitating the refinement of numerous algorithms to ensure accurate functionality. Despite their notable capabilities, IP cameras lack integrated models for effective human activity detection. With this motivation, this paper presents a gait-driven OpenCV and MediaPipe machine learning framework for human pose tracking and movement captioning. This is implemented by incorporating the Generative 3D Human Shape (GHUM 3D) model, which can classify human bones, while Python logic classifies the movements as either usual or unusual. The model is fed into a website equipped with camera input, activity detection, and gait posture analysis for pose tracking and movement captioning. The proposed approach comprises four modules, two for pose tracking and the remaining two for generating natural language descriptions of movements. The implementation is carried out on two publicly available datasets, CASIA-A and CASIA-B. The proposed methodology emphasizes the diagnostic ability of video analysis by dividing the video data in the datasets into 15-frame segments for detailed examination, where each segment represents a time frame subject to detailed scrutiny of human movement. Features such as spatial-temporal descriptors, motion characteristics, and key point coordinates are derived from each frame to detect key pose landmarks, focusing on the left shoulder, elbow, and wrist. By calculating the angle between these landmarks, the proposed method classifies activities as “Walking” (angles between −45 and 45 degrees), “Clapping” (angles below −120 or above 120 degrees), and “Running” (angles below −150 or above 150 degrees). Angles outside these ranges are categorized as “Abnormal”, indicating abnormal activities. The experimental results show that the proposed method is robust for individual activity recognition.
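The angle rule can be reproduced directly from 2D keypoints. Note that the stated “Clapping” and “Running” ranges overlap, so the ordering of the checks below (running first) is our assumption:

```python
import numpy as np

def joint_angle(shoulder, elbow, wrist):
    """Signed angle (degrees) at the elbow between the shoulder->elbow
    and wrist->elbow vectors, from 2D keypoint coordinates."""
    v1 = np.asarray(shoulder) - np.asarray(elbow)
    v2 = np.asarray(wrist) - np.asarray(elbow)
    ang = np.degrees(np.arctan2(v2[1], v2[0]) - np.arctan2(v1[1], v1[0]))
    return (ang + 180) % 360 - 180  # wrap into [-180, 180]

def classify(angle):
    """Threshold rules as stated in the abstract; check order is ours."""
    if -45 <= angle <= 45:
        return "Walking"
    if angle < -150 or angle > 150:
        return "Running"
    if angle < -120 or angle > 120:
        return "Clapping"
    return "Abnormal"
```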

16 pages, 1816 KB  
Article
MFCF-Gait: Small Silhouette-Sensitive Gait Recognition Algorithm Based on Multi-Scale Feature Cross-Fusion
by Chenyang Song, Lijun Yun and Ruoyu Li
Sensors 2024, 24(17), 5500; https://doi.org/10.3390/s24175500 - 24 Aug 2024
Cited by 1
Abstract
Gait recognition based on gait silhouette profiles is currently a major approach in the field of gait recognition. In previous studies, models typically used gait silhouette images sized at 64 × 64 pixels as input data. However, in practical applications, silhouette images may be smaller than 64 × 64, leading to a loss of detail that significantly affects model accuracy. To address these challenges, we propose a gait recognition system named Multi-scale Feature Cross-Fusion Gait (MFCF-Gait). At the input stage of the model, we employ super-resolution algorithms to preprocess the data. During this process, we observed that the choice of super-resolution algorithm applied to larger silhouette images also affects training outcomes; improved super-resolution algorithms contribute to better model performance. In terms of model architecture, we introduce a multi-scale feature cross-fusion network. By integrating low-level feature information from higher-resolution images with high-level feature information from lower-resolution images, the model emphasizes smaller-scale details, thereby improving recognition accuracy for smaller silhouette images. The experimental results on the CASIA-B dataset demonstrate significant improvements. On 64 × 64 silhouette images, the accuracies for the NM, BG, and CL conditions reached 96.49%, 91.42%, and 78.24%, respectively. On 32 × 32 silhouette images, the accuracies were 94.23%, 87.68%, and 71.57%, respectively, showing notable enhancements.
(This article belongs to the Special Issue Artificial Intelligence and Sensor-Based Gait Recognition)
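A sketch of the super-resolution preprocessing stage, with plain bicubic interpolation standing in for the learned SR models the paper evaluates:

```python
import cv2

def upscale_silhouette(img, target=64):
    """Upscale a small silhouette (e.g., 32x32) to the 64x64 input size.

    Bicubic interpolation is a placeholder here; the paper employs
    learned super-resolution algorithms for this step.
    """
    return cv2.resize(img, (target, target), interpolation=cv2.INTER_CUBIC)
```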

23 pages, 1980 KB  
Article
GaitSTAR: Spatial–Temporal Attention-Based Feature-Reweighting Architecture for Human Gait Recognition
by Muhammad Bilal, He Jianbiao, Husnain Mushtaq, Muhammad Asim, Gauhar Ali and Mohammed ElAffendi
Mathematics 2024, 12(16), 2458; https://doi.org/10.3390/math12162458 - 8 Aug 2024
Cited by 4
Abstract
Human gait recognition (HGR) leverages unique gait patterns to identify individuals, but the effectiveness of this technique can be hindered by various factors such as carrying conditions, foot shadows, clothing variations, and changes in viewing angle. Traditional silhouette-based systems often neglect the critical role of instantaneous gait motion, which is essential for distinguishing individuals with similar features. We introduce the Enhanced Gait Feature Extraction Framework (GaitSTAR), a novel method that incorporates dynamic feature weighting through the discriminant analysis of temporal and spatial features within a channel-wise architecture. Key innovations in GaitSTAR include dynamic stride flow representation (DSFR) to address silhouette distortion, a transformer-based feature set transformation (FST) for integrating image-level features into set-level features, and dynamic feature reweighting (DFR) for capturing long-range interactions. DFR enhances contextual understanding and improves detection accuracy by computing attention distributions across channel dimensions. Empirical evaluations show that GaitSTAR achieves impressive accuracies of 98.5%, 98.0%, and 92.7% under the NM, BG, and CL conditions, respectively, on the CASIA-B dataset; 67.3% on the CASIA-C dataset; and 54.21% on the Gait3D dataset. Despite its complexity, GaitSTAR demonstrates a favorable balance between accuracy and computational efficiency, making it a powerful tool for biometric identification based on gait patterns.
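Attention across channel dimensions is commonly realized as squeeze-and-excitation-style reweighting; the following generic sketch illustrates the idea, not GaitSTAR’s exact DFR design:

```python
import torch.nn as nn

class ChannelReweight(nn.Module):
    """Compute a weight per channel and rescale the feature map."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )
    def forward(self, x):                  # x: (N, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))    # global average pool -> (N, C)
        return x * w[:, :, None, None]     # channel-wise rescaling
```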

16 pages, 594 KB  
Article
GaitMGL: Multi-Scale Temporal Dimension and Global–Local Feature Fusion for Gait Recognition
by Zhipeng Zhang, Siwei Wei, Liya Xi and Chunzhi Wang
Electronics 2024, 13(2), 257; https://doi.org/10.3390/electronics13020257 - 5 Jan 2024
Cited by 13
Abstract
Gait recognition has received widespread attention due to its non-intrusive recognition mechanism. Most current gait recognition methods are appearance-based, and such methods are easily affected by occlusions in complex environments, which in turn reduces recognition accuracy. With the maturity of pose estimation techniques, model-based gait recognition methods have received more and more attention due to their robustness in complex environments. However, current model-based methods mainly focus on modeling global feature information in the spatial dimension, ignoring the importance of local features and their influence on recognition accuracy. Meanwhile, in the temporal dimension, these methods usually use single-scale temporal information extraction, which does not take into account the inconsistent motion cycles of the limbs when a human body is walking (e.g., arm swing and leg pace), leading to the loss of some limb temporal information. To solve these problems, we propose a gait recognition network based on a Global–Local Graph Convolutional Network, called GaitMGL. Specifically, we introduce a new spatio-temporal feature extraction module, MGL (Multi-scale Temporal and Global–Local Spatial Extraction Module), which consists of a GLGCN (Global–Local Graph Convolutional Network) and an MTCN (Multi-scale Temporal Convolutional Network). The GLGCN models both global and local features and extracts global–local motion information. The MTCN, on the other hand, takes into account the inconsistent motion cycles of the limbs and applies multi-scale temporal convolution to capture the temporal information of limb motion. In short, GaitMGL addresses the loss of local information and the loss of single-scale temporal information in existing model-based gait recognition networks. We evaluated our method on three publicly available datasets, CASIA-B, Gait3D, and GREW, and the experimental results show that our method demonstrates strong performance, achieving an accuracy of 63.12% on GREW and exceeding all existing model-based gait recognition networks.
(This article belongs to the Section Artificial Intelligence)
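Multi-scale temporal convolution can be sketched as parallel 1D convolutions with different kernel sizes, so limbs with different motion cycles are each matched by some receptive field; a generic illustration, not MTCN’s exact design:

```python
import torch
import torch.nn as nn

class MultiScaleTemporalConv(nn.Module):
    """Parallel temporal convolutions at several scales, summed."""
    def __init__(self, channels, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(channels, channels, k, padding=k // 2)
            for k in kernel_sizes  # odd kernels preserve sequence length
        )
    def forward(self, x):  # x: (N, C, T) per-joint feature sequences
        return torch.stack([b(x) for b in self.branches]).sum(dim=0)
```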

16 pages, 2264 KB  
Article
Two-Path Spatial-Temporal Feature Fusion and View Embedding for Gait Recognition
by Diyuan Guan, Chunsheng Hua and Xiaoheng Zhao
Appl. Sci. 2023, 13(23), 12808; https://doi.org/10.3390/app132312808 - 29 Nov 2023
Cited by 3
Abstract
Gait recognition is a distinctive biometric technique that can identify pedestrians by their walking patterns from considerable distances. A critical challenge in gait recognition lies in effectively acquiring discriminative spatial-temporal representations from silhouettes that are invariant to disturbances. In this paper, we present a novel gait recognition network that aggregates features in the spatial-temporal and view domains, consisting of a two-path spatial-temporal feature fusion module and a view embedding module. Specifically, the two-path spatial-temporal feature fusion module first utilizes multi-scale feature extraction (MSFE) to enrich the input features with multiple convolution kernels of various sizes. Then, frame-level spatial feature extraction (FLSFE) and multi-scale temporal feature extraction (MSTFE) are constructed in parallel to capture spatial and temporal gait features of different granularities, and these features are fused to obtain multi-scale spatial-temporal features. FLSFE is designed to extract both global and local gait features by employing a specially designed residual operation. Simultaneously, MSTFE adaptively combines multi-scale temporal features to produce suitable motion representations in the temporal domain. Taking view information into account, we introduce a view embedding module to reduce the impact of differing viewpoints. Through extensive experimentation on the CASIA-B and OU-MVLP datasets, the proposed method has achieved performance superior to that of other state-of-the-art gait recognition approaches.
(This article belongs to the Special Issue Advanced Technologies in Gait Recognition)
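One minimal reading of a view embedding module is a learned per-view vector added to the gait features; the paper’s actual design may differ:

```python
import torch.nn as nn

class ViewEmbedding(nn.Module):
    """Add a learned embedding for the capture viewpoint to the features."""
    def __init__(self, num_views, feat_dim):
        super().__init__()
        self.table = nn.Embedding(num_views, feat_dim)
    def forward(self, feats, view_idx):  # feats: (N, D), view_idx: (N,)
        return feats + self.table(view_idx)
```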

17 pages, 4232 KB  
Article
Cross-View Gait Recognition Method Based on Multi-Teacher Joint Knowledge Distillation
by Ruoyu Li, Lijun Yun, Mingxuan Zhang, Yanchen Yang and Feiyan Cheng
Sensors 2023, 23(22), 9289; https://doi.org/10.3390/s23229289 - 20 Nov 2023
Cited by 2
Abstract
Aiming at challenges such as the high complexity of the network model, the large number of parameters, and the slow speed of training and testing in cross-view gait recognition, this paper proposes a solution: Multi-teacher Joint Knowledge Distillation (MJKD). The algorithm employs multiple complex teacher models to train on gait images from a single view, extracting inter-class relationships that are then weighted and integrated. These relationships guide the training of a lightweight student model, improving its gait feature extraction capability and recognition accuracy. To validate the effectiveness of MJKD, the paper performs experiments on the CASIA-B dataset using the ResNet network as the benchmark. The experimental results show that the student model trained by MJKD achieves 98.24% recognition accuracy while significantly reducing the number of parameters and the computational cost.
(This article belongs to the Section Intelligent Sensors)
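A common way to realize weighted multi-teacher distillation is to match the student to a weighted average of the teachers’ softened outputs. The sketch below uses soft logits as a proxy for MJKD’s weighted inter-class relationships, and assumes the per-teacher weights sum to 1:

```python
import torch.nn.functional as F

def multi_teacher_distill_loss(student_logits, teacher_logits_list,
                               weights, T=4.0):
    """KL divergence from the student to the weighted teacher mixture.

    Illustrative of weighted multi-teacher distillation in general,
    not MJKD's exact formulation.
    """
    target = sum(w * F.softmax(t / T, dim=1)
                 for w, t in zip(weights, teacher_logits_list))
    log_p = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_p, target, reduction="batchmean") * T * T
```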

14 pages, 608 KB  
Article
GaitSG: Gait Recognition with SMPLs in Graph Structure
by Jiayi Yan, Shaohui Wang, Jing Lin, Peihao Li, Ruxin Zhang and Haoqian Wang
Sensors 2023, 23(20), 8627; https://doi.org/10.3390/s23208627 - 22 Oct 2023
Cited by 1
Abstract
Gait recognition aims to identify a person based on their unique walking pattern. Compared with silhouettes and skeletons, skinned multi-person linear (SMPL) models can simultaneously provide human pose and shape information and are robust to viewpoint and clothing variances. However, previous approaches have only considered SMPL parameters as a whole and have yet to explore their potential for gait recognition thoroughly. To address this problem, we concentrate on SMPL representations and propose a novel SMPL-based method named GaitSG for gait recognition, which takes SMPL parameters in a graph structure as input. Specifically, we represent the SMPL model as graph nodes and employ graph convolution techniques to effectively model the human model topology and generate discriminative gait features. Further, we utilize prior knowledge of the human body and design a novel part graph pooling block, PGPB, to encode viewpoint information explicitly. The PGPB also alleviates the physical-distance-unaware limitation of the graph structure. Comprehensive experiments on the public gait recognition datasets Gait3D and CASIA-B demonstrate that GaitSG achieves better performance and faster convergence than existing model-based approaches. Specifically, compared with the baseline SMPLGait (3D only), our model achieves approximately twice the Rank-1 accuracy and requires three times fewer training iterations on Gait3D.
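Graph convolution over SMPL joint nodes can be sketched with a textbook GCN layer: normalize the joint adjacency matrix, aggregate neighbor features, then apply a shared linear map. This stands in for, and is not identical to, GaitSG’s graph blocks:

```python
import torch
import torch.nn as nn

class GraphConv(nn.Module):
    """One graph convolution over J joint nodes with adjacency `adj` (J, J)."""
    def __init__(self, in_dim, out_dim, adj):
        super().__init__()
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        self.register_buffer("a_norm", adj / deg)  # row-normalized adjacency
        self.lin = nn.Linear(in_dim, out_dim)
    def forward(self, x):  # x: (N, J, in_dim) per-node features
        return torch.relu(self.lin(self.a_norm @ x))
```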

11 pages, 2003 KB  
Article
Gait Recognition Based on Gait Optical Flow Network with Inherent Feature Pyramid
by Hongyi Ye, Tanfeng Sun and Ke Xu
Appl. Sci. 2023, 13(19), 10975; https://doi.org/10.3390/app131910975 - 5 Oct 2023
Cited by 6
Abstract
Gait is a behavioral biometric characteristic that can be recognized from a distance and has attracted increasing interest. Many existing silhouette-based methods ignore the instantaneous motion of gait, which is an important factor in distinguishing people with similar shapes. To further emphasize the instantaneous motion factor in human gait, the Gait Optical Flow Image (GOFI) is proposed to add the instantaneous motion direction and intensity to the original gait silhouettes. The GOFI also helps to mitigate both temporal and spatial condition noise. The gait features are then extracted by the Gait Optical Flow Network (GOFN), which contains a Set Transition (ST) architecture to aggregate image-level features into set-level features and an Inherent Feature Pyramid (IFP) to exploit multi-scale partial features. A combined loss function is used to evaluate the similarity between different gaits. Experiments are conducted on two widely used gait datasets, CASIA-B and CASIA-C. The experiments show that the GOFN performs better on both datasets, demonstrating its effectiveness.
(This article belongs to the Special Issue Advanced Technologies in Gait Recognition)
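A GOFI-like representation can be sketched with dense optical flow between consecutive silhouette frames, stacking motion intensity and direction with the silhouette itself. The Farneback parameters below are generic defaults, not the paper’s settings:

```python
import cv2
import numpy as np

def gait_optical_flow_image(prev_sil, curr_sil):
    """Stack a silhouette with per-pixel motion magnitude and angle.

    `prev_sil` and `curr_sil` are consecutive 8-bit grayscale
    silhouette frames; returns an (H, W, 3) float array.
    """
    flow = cv2.calcOpticalFlowFarneback(prev_sil, curr_sil, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    return np.dstack([curr_sil.astype(np.float32), mag, ang])
```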

17 pages, 4685 KB  
Article
Research Method of Discontinuous-Gait Image Recognition Based on Human Skeleton Keypoint Extraction
by Kun Han and Xinyu Li
Sensors 2023, 23(16), 7274; https://doi.org/10.3390/s23167274 - 19 Aug 2023
Cited by 6
Abstract
As a biometric characteristic, gait uses the posture of human walking for identification, which has the advantages of a long recognition distance and no requirement for the cooperation of subjects. This paper proposes a method for recognizing gait images at the frame level, even in cases of discontinuity, based on human keypoint extraction. In order to reduce the network’s dependence on the temporal characteristics of the image sequence during training, a discontinuous frame screening module is added to the front end of the gait feature extraction network to restrict the image information input to the network. For gait feature extraction, a cross-stage partial connection (CSP) structure is added to the spatial–temporal graph convolutional bottleneck structure of the ResGCN network to effectively filter interference information, and XBNBlock is inserted on top of the CSP structure to reduce the estimation shift caused by network deepening and small-batch-size training. The experimental results of our model on the CASIA-B gait dataset achieve an average recognition accuracy of 79.5%. The proposed method also achieves 78.1% accuracy on CASIA-B after training with a limited number of image frames, indicating that the model is robust.

19 pages, 2213 KB  
Article
Omni-Domain Feature Extraction Method for Gait Recognition
by Jiwei Wan, Huimin Zhao, Rui Li, Rongjun Chen and Tuanjie Wei
Mathematics 2023, 11(12), 2612; https://doi.org/10.3390/math11122612 - 7 Jun 2023
Cited by 2
Abstract
Gait is a biological feature with strong spatio-temporal correlation, and the current difficulty of gait recognition lies in the interference of covariates (viewpoint, clothing, etc.) with feature extraction. In order to weaken the influence of extrinsic variable changes, we propose an interval frame sampling method to capture more information about joint dynamic changes, together with an Omni-Domain Feature Extraction Network. The network consists of three main modules: (1) a Temporal-Sensitive Feature Extractor, which injects key gait temporal information into shallow spatial features to improve spatio-temporal correlation; (2) Dynamic Motion Capture, which extracts temporal features of different motions and assigns weights adaptively; and (3) an Omni-Domain Feature Balance Module, which balances fine-grained spatio-temporal features and highlights decisive ones. Extensive experiments were conducted on two commonly used public gait datasets, showing that our method has good performance and generalization ability. On CASIA-B, we achieved an average rank-1 accuracy of 94.2% under the three walking conditions. On OU-MVLP, we achieved a rank-1 accuracy of 90.5%.
(This article belongs to the Special Issue Advances in Computer Vision and Machine Learning)
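Interval frame sampling admits a very small sketch: take frames at a fixed stride across the whole sequence so the sampled set spans more of the joint dynamics than a contiguous clip would. This is a plain reading of the abstract, not necessarily the paper’s exact scheme:

```python
def interval_frame_sampling(frames, num_samples=30):
    """Sample `num_samples` frames at a fixed stride across the sequence."""
    stride = max(1, len(frames) // num_samples)
    return frames[::stride][:num_samples]
```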

23 pages, 8161 KB  
Article
Regional Time-Series Coding Network and Multi-View Image Generation Network for Short-Time Gait Recognition
by Wenhao Sun, Guangda Lu, Zhuangzhuang Zhao, Tinghang Guo, Zhuanping Qin and Yu Han
Entropy 2023, 25(6), 837; https://doi.org/10.3390/e25060837 - 23 May 2023
Cited by 4
Abstract
Gait recognition is one of the important research directions of biometric authentication technology. However, in practical applications, the original gait data are often short, while a long and complete gait video is required for successful recognition. In addition, gait images from different views have a great influence on recognition. To address these problems, we designed a gait data generation network for expanding the cross-view image data required for gait recognition, which provides sufficient input to the feature extraction branch that takes gait silhouettes as its criterion. In addition, we propose a gait motion feature extraction network based on regional time-series coding. By independently encoding the time series of joint motion within different regions of the body and then combining each region’s time-series features through secondary coding, we obtain the unique motion relationships between body regions. Finally, bilinear matrix decomposition pooling is used to fuse the spatial silhouette features and the motion time-series features, achieving complete gait recognition from shorter video input. We use the OUMVLP-Pose and CASIA-B datasets to validate the silhouette image branch and the motion time-series branch, respectively, and employ evaluation metrics such as the IS entropy value and Rank-1 accuracy to demonstrate the effectiveness of the designed network. Finally, we also collect gait-motion data in the real world and test them in the complete two-branch fusion network. The experimental results show that the network we designed can effectively extract the time-series features of human motion and achieve the expansion of multi-view gait data. The real-world tests also prove that our method delivers good results and is feasible for gait recognition with short-time video as the input.
(This article belongs to the Special Issue Deep Learning Models and Applications to Computer Vision)
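Bilinear matrix decomposition pooling is often implemented as low-rank bilinear pooling: project both feature vectors into a shared space, multiply element-wise, and sum-pool over the rank dimension. A common formulation, not necessarily the paper’s exact variant:

```python
import torch.nn as nn

class BilinearFusion(nn.Module):
    """Low-rank bilinear pooling of spatial and temporal feature vectors."""
    def __init__(self, dim_s, dim_t, rank, out_dim):
        super().__init__()
        self.proj_s = nn.Linear(dim_s, rank * out_dim)
        self.proj_t = nn.Linear(dim_t, rank * out_dim)
        self.rank, self.out_dim = rank, out_dim
    def forward(self, fs, ft):                     # fs: (N, dim_s), ft: (N, dim_t)
        joint = self.proj_s(fs) * self.proj_t(ft)  # element-wise interaction
        return joint.view(-1, self.rank, self.out_dim).sum(dim=1)
```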

15 pages, 644 KB  
Article
Gait-CNN-ViT: Multi-Model Gait Recognition with Convolutional Neural Networks and Vision Transformer
by Jashila Nair Mogan, Chin Poo Lee, Kian Ming Lim, Mohammed Ali and Ali Alqahtani
Sensors 2023, 23(8), 3809; https://doi.org/10.3390/s23083809 - 7 Apr 2023
Cited by 30
Abstract
Gait recognition, the task of identifying an individual based on their unique walking style, can be difficult because walking styles are influenced by external factors such as clothing, viewing angle, and carrying conditions. To address these challenges, this paper proposes a multi-model gait recognition system that integrates Convolutional Neural Networks (CNNs) and a Vision Transformer (ViT). The first step in the process is to obtain a gait energy image, which is achieved by averaging the silhouettes over a gait cycle. The gait energy image is then fed into three different models: DenseNet-201, VGG-16, and a Vision Transformer. These models are pre-trained and fine-tuned to encode the salient gait features that are specific to an individual’s walking style. Each model provides prediction scores for the classes based on the encoded features, and these scores are then summed and averaged to produce the final class label. The performance of this multi-model gait recognition system was evaluated on three datasets: CASIA-B, OU-ISIR dataset D, and the OU-ISIR Large Population dataset. The experimental results showed substantial improvement over existing methods on all three datasets. The integration of CNNs and the ViT allows the system to learn both predefined and distinct features, providing a robust solution for gait recognition even under the influence of covariates.
(This article belongs to the Section Intelligent Sensors)
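The score-level fusion described reduces to averaging. A sketch assuming each of the three models outputs a (batch, classes) score tensor:

```python
import torch

def average_score_fusion(score_list):
    """Average per-model class scores and take the argmax as the label."""
    mean_scores = torch.stack(score_list).mean(dim=0)  # (N, num_classes)
    return mean_scores.argmax(dim=1)
```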
