Search Results (41)

Search Parameters:
Keywords = CASIA-B

19 pages, 1786 KB  
Article
Development and Performance Analysis of a Semi-Supervised Gait Recognition Model for Pediatric Abnormalities Using a Hybrid Dataset
by Xiaoneng Song, Kun Qian and Sida Tang
Bioengineering 2026, 13(3), 272; https://doi.org/10.3390/bioengineering13030272 - 26 Feb 2026
Viewed by 335
Abstract
Pediatric gait abnormalities are closely intertwined with musculoskeletal dysfunctions and heightened injury risk, underscoring the urgency of early and accessible screening tools. Here, we develop and validate a video-based semi-supervised Abnormal Gait Recognition Module (AGRM) to address unmet needs in pediatric gait assessment, with a focus on diagnostic performance and clinical interpretability. The AGRM is built on a 3D ResNet backbone, synergistically integrated with a Mean Teacher Module (MTM) to mitigate the scarcity of labeled clinical data, and a Spatial Hierarchical Pooling Module (SHPM) for robust multiscale spatiotemporal feature extraction—two core innovations tailored to gait dynamics. We trained and validated the model on a hybrid dataset combining self-collected pediatric gait videos and the public CASIA-B dataset, evaluating its performance in binary (normal vs. abnormal) and three-class (normal, genu varum, genu valgum) classification tasks using accuracy, macro-precision, macro-recall, and macro-F1 score. Ablation studies quantified the incremental contributions of MTM and SHPM, while Grad-CAM visualization was employed to enhance model interpretability. In the three-class classification task, the AGRM achieved 70.5% accuracy, 72.1% macro-precision, 71.5% macro-recall, and a macro-F1 score of 0.718; in the binary task, it yielded 80.3% precision and 79.2% recall. SHPM significantly augmented spatiotemporal feature aggregation, capturing fine-grained gait dynamics, whereas MTM improved model generalization under constrained labeled data scenarios—findings corroborated by ablation experiments. Grad-CAM visualization confirmed the model’s targeted attention to lower extremity regions, particularly the knee joints, aligning with the pathological loci of gait abnormalities.
Collectively, our AGRM demonstrates robust performance and generalization in identifying pediatric gait abnormalities, while effectively capturing key pathological gait characteristics. This video-based intelligent approach offers a promising tool for early gait screening in both clinical and community settings, addressing barriers to accessible pediatric musculoskeletal assessment. Full article
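The Mean Teacher Module at the heart of the AGRM's semi-supervised training keeps a teacher network whose weights are an exponential moving average (EMA) of the student's. A minimal sketch of that update rule, assuming flat weight lists and a hypothetical smoothing factor `alpha` (not the authors' implementation):

```python
# Mean Teacher: after each optimizer step, the teacher's weights track
# an exponential moving average (EMA) of the student's weights.
def ema_update(teacher_weights, student_weights, alpha=0.99):
    """Return updated teacher weights: alpha * teacher + (1 - alpha) * student."""
    return [alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher_weights, student_weights)]

# Toy example: flat weight vectors stand in for real network parameters.
teacher = [0.0, 0.0]
student = [1.0, 2.0]
teacher = ema_update(teacher, student, alpha=0.5)
print(teacher)  # [0.5, 1.0]
```

In practice the update runs after every training step, with `alpha` close to 1 so the teacher evolves slowly and provides stable targets for the unlabeled data.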
(This article belongs to the Section Biomedical Engineering and Biomaterials)

21 pages, 3872 KB  
Article
IoT-Oriented Security for Small Sensor Systems Using DnCNN Denoising and Multimodal Feature Fusion for Image Forgery Detection
by Nimra Nasir, Syeda Sitara Waseem, Muhammad Bilal and Syed Rizwan Hassan
Sensors 2026, 26(4), 1172; https://doi.org/10.3390/s26041172 - 11 Feb 2026
Viewed by 274
Abstract
With the ongoing growth of CCTV networks, miniature sensors, and IoT devices, the authenticity of captured images has become a major security concern. Advanced editing tools and generative models now make it possible to produce highly sophisticated forgeries that evade both human perception and traditional detection algorithms, especially for sensor-generated content. State-of-the-art detectors typically rely on a single forensic cue, such as local noise statistics or structural disruption patterns, making them susceptible to varied forms of manipulation. To address this, we developed MultiFusion, a new forgery detection framework that combines complementary forensic cues: SRM-based noise residuals, hierarchical texture features from EfficientNet-B0, and global structural relationships from a vision transformer. A dedicated DnCNN denoising preprocessing layer suppresses sensor noise while preserving fine traces of tampering. For interpretability, we combine Grad-CAM maps from the convolutional branch with transformer attention maps to produce unified heatmaps that localize manipulated regions. Experimental validation on the CASIA 2.0 benchmark shows high detection accuracy (96.69%) and good generalization. Through normalized denoising, multimodal feature fusion, and explainable AI, our framework advances CCTV, sensor forensics, and IoT image authentication. Full article
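SRM-based noise residuals, one of MultiFusion's three cues, are classically obtained by convolving the image with fixed high-pass filters so that image content is suppressed and the noise pattern remains. A sketch with a single common SRM-style kernel; the paper's exact filter bank is not specified here, so this shows only the principle:

```python
# One fixed SRM-style high-pass kernel (a common second-order filter).
KERNEL = [[-1,  2, -1],
          [ 2, -4,  2],
          [-1,  2, -1]]

def noise_residual(img):
    """Valid-mode 2D convolution of a grayscale image (list of lists)
    with the high-pass kernel. Flat regions give ~0; splicing or
    resampling leaves characteristic traces in the residual."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h - 2):
        row = []
        for x in range(w - 2):
            s = sum(KERNEL[j][i] * img[y + j][x + i]
                    for j in range(3) for i in range(3))
            row.append(s / 4.0)  # normalize by the kernel's scale factor
        out.append(row)
    return out

flat = [[5] * 4 for _ in range(4)]  # perfectly flat patch
print(noise_residual(flat))        # [[0.0, 0.0], [0.0, 0.0]]
```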

17 pages, 4792 KB  
Article
A Deep Learning-Based Graphical User Interface for Predicting Corneal Ectasia Scores from Raw Optical Coherence Tomography Data
by Maziar Mirsalehi and Achim Langenbucher
Diagnostics 2026, 16(2), 310; https://doi.org/10.3390/diagnostics16020310 - 18 Jan 2026
Viewed by 305
Abstract
Background/Objectives: Keratoconus, a condition in which the cornea becomes thinner and steeper, can cause visual problems, particularly when it is progressive. Early diagnosis is important for preserving visual acuity. Raw data, unlike preprocessed data, are unaffected by software modifications. They retain their native structure across versions, providing consistency for analytical purposes. The objective of this study was to design a deep learning-based graphical user interface for predicting the corneal ectasia score using raw optical coherence tomography data. Methods: The graphical user interface was developed using Tkinter, a Python library for building graphical user interfaces. The user selects raw data from the cornea/anterior segment optical coherence tomography device Casia2, generated in the 3dv format, from the local system. To view the predicted corneal ectasia score, the user must indicate whether the selected 3dv file corresponds to the left or right eye. Extracted optical coherence tomography images are cropped, resized to 224 × 224 pixels and processed by the modified EfficientNet-B0 convolutional neural network to predict the corneal ectasia score. The predicted corneal ectasia score is displayed along with a diagnosis: ‘No detectable ectasia pattern’, ‘Suspected ectasia’ or ‘Clinical ectasia’. Performance metric values were rounded to four decimal places, and the mean absolute error value was rounded to two decimal places. Results: The modified EfficientNet-B0 obtained a mean absolute error of 6.65 when evaluated on the test dataset. For the two-class classification, it achieved an accuracy of 87.96%, a sensitivity of 82.41%, a specificity of 96.69%, a positive predictive value of 97.52% and an F1 score of 89.33%. For the three-class classification, it attained a weighted-average F1 score of 84.95% and an overall accuracy of 84.75%.
Conclusions: The graphical user interface outputs a numerical ectasia score, which conveys more information than a categorical label alone. By using raw data from the Casia2, it enables consistent diagnostics regardless of software updates. The successful use of raw optical coherence tomography data indicates its potential to be used, rather than preprocessed optical coherence tomography data, for diagnosing keratoconus. Full article
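The GUI's three diagnosis strings suggest a simple thresholding of the predicted score. The cutoffs below are purely hypothetical placeholders; the abstract does not state the actual score boundaries:

```python
# Map a numerical ectasia score to one of the GUI's diagnosis strings.
# The cutoff values are HYPOTHETICAL -- the paper does not publish them.
def diagnosis(score, suspect_cutoff=5, clinical_cutoff=30):
    if score < suspect_cutoff:
        return 'No detectable ectasia pattern'
    if score < clinical_cutoff:
        return 'Suspected ectasia'
    return 'Clinical ectasia'

print(diagnosis(2))   # No detectable ectasia pattern
print(diagnosis(10))  # Suspected ectasia
print(diagnosis(50))  # Clinical ectasia
```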
(This article belongs to the Special Issue Diagnosis of Corneal and Retinal Diseases)

15 pages, 979 KB  
Article
Hybrid Skeleton-Based Motion Templates for Cross-View and Appearance-Robust Gait Recognition
by João Ferreira Nunes, Pedro Miguel Moreira and João Manuel R. S. Tavares
J. Imaging 2026, 12(1), 32; https://doi.org/10.3390/jimaging12010032 - 7 Jan 2026
Viewed by 349
Abstract
Gait recognition methods based on silhouette templates, such as the Gait Energy Image (GEI), achieve high accuracy under controlled conditions but often degrade when appearance varies due to viewpoint, clothing, or carried objects. In contrast, skeleton-based approaches provide interpretable motion cues but remain sensitive to pose-estimation noise. This work proposes two compact 2D skeletal descriptors—Gait Skeleton Images (GSIs)—that encode 3D joint trajectories into line-based and joint-based static templates compatible with standard 2D CNN architectures. A unified processing pipeline is introduced, including skeletal topology normalization, rigid view alignment, orthographic projection, and pixel-level rendering. Core design factors are analyzed on the GRIDDS dataset, where depth-based 3D coordinates provide stable ground truth for evaluating structural choices and rendering parameters. An extensive evaluation is then conducted on the widely used CASIA-B dataset, using 3D coordinates estimated via human pose estimation, to assess robustness under viewpoint, clothing, and carrying covariates. Results show that although GEIs achieve the highest same-view accuracy, GSI variants exhibit reduced degradation under appearance changes and demonstrate greater stability under severe cross-view conditions. These findings indicate that compact skeletal templates can complement appearance-based descriptors and may benefit further from continued advances in 3D human pose estimation. Full article
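The pipeline's final steps (orthographic projection and pixel-level rendering) can be sketched for the joint-based GSI variant as follows; the grid size and the assumption of normalized coordinates are illustrative, not the paper's settings:

```python
def render_joints(joints3d, size=8):
    """Orthographically project 3D joints (x, y, z) by dropping z,
    then rasterize them into a size x size binary template.
    Assumes coordinates are already normalized to [0, 1)."""
    img = [[0] * size for _ in range(size)]
    for x, y, _z in joints3d:  # orthographic projection: discard depth
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        img[row][col] = 1
    return img

# Two toy joints; a real template accumulates a whole aligned sequence.
tpl = render_joints([(0.1, 0.1, 0.5), (0.9, 0.9, 0.2)])
print(tpl[0][0], tpl[7][7])  # 1 1
```

The line-based variant would additionally rasterize bone segments between connected joints, which is why skeletal topology normalization precedes rendering in the pipeline.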
(This article belongs to the Section Computer Vision and Pattern Recognition)

18 pages, 10938 KB  
Article
Deep Learning-Based Diagnosis of Corneal Condition by Using Raw Optical Coherence Tomography Data
by Maziar Mirsalehi, Michael Schwemm, Elias Flockerzi, Nóra Szentmáry, Alaa Din Abdin, Berthold Seitz and Achim Langenbucher
Diagnostics 2025, 15(24), 3115; https://doi.org/10.3390/diagnostics15243115 - 8 Dec 2025
Viewed by 591
Abstract
Background/Objectives: Keratoconus (KC) is the most common corneal ectasia. This condition affects quality of vision, especially when it is progressive, and timely, stage-related treatment is mandatory. Therefore, early diagnosis is crucial to preserve visual acuity. Medical data may be used either in their raw state or in a preprocessed form. Software modifications introduced through updates may potentially affect outcomes. Unlike preprocessed data, raw data preserve their original format across software versions and provide a more consistent basis for clinical analysis. The objective of this study was to distinguish between healthy and KC corneas from raw optical coherence tomography data by using a convolutional neural network. Methods: In total, 2737 eye examinations acquired with the Casia2 anterior-segment optical coherence tomography device (Tomey, Nagoya, Japan) were assigned by three experienced ophthalmologists to one of three classes: ‘normal’, ‘ectasia’, or ‘other disease’. Each eye examination consisted of sixteen meridional slice images. The dataset included 744 examinations. DenseNet121, EfficientNet-B0, MobileNetV3-Large and ResNet18 were modified for use as convolutional neural networks for prediction. All reported metric values were rounded to four decimal places. Results: The overall accuracy for the modified DenseNet121, modified EfficientNet-B0, modified MobileNetV3-Large and modified ResNet18 is 91.27%, 91.27%, 92.86% and 89.68%, respectively. For the same four models, respectively, the macro-averaged sensitivity is 91.27%, 91.27%, 92.86% and 89.68%; the macro-averaged specificity is 95.63%, 95.63%, 96.43% and 94.84%; the macro-averaged positive predictive value is 91.58%, 91.65%, 92.91% and 90.24%; and the macro-averaged F1 score is 91.35%, 91.29%, 92.85% and 89.81%.
Conclusions: The successful use of a convolutional neural network with raw optical coherence tomography data demonstrates the potential of raw data to be used instead of preprocessed data for diagnosing KC in ophthalmology. Full article
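The macro-averaged metrics reported above are unweighted means of per-class values computed from the confusion matrix. A minimal sketch for macro-averaged sensitivity, using an illustrative 3×3 matrix:

```python
def macro_sensitivity(cm):
    """cm[i][j] = count of true class i predicted as class j.
    Macro-averaged sensitivity (recall) is the unweighted mean
    of per-class recall, so every class counts equally."""
    recalls = []
    for i, row in enumerate(cm):
        total = sum(row)  # number of true samples of class i
        recalls.append(row[i] / total if total else 0.0)
    return sum(recalls) / len(cm)

cm = [[8, 2, 0],   # class 0: recall 0.8
      [1, 9, 0],   # class 1: recall 0.9
      [0, 0, 10]]  # class 2: recall 1.0
print(round(macro_sensitivity(cm), 4))  # 0.9
```

Macro-averaged specificity, positive predictive value, and F1 follow the same pattern: compute each metric per class, then average without class weighting.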
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)

22 pages, 1770 KB  
Article
Key-Frame-Aware Hierarchical Learning for Robust Gait Recognition
by Ke Wang and Hua Huo
J. Imaging 2025, 11(11), 402; https://doi.org/10.3390/jimaging11110402 - 10 Nov 2025
Viewed by 641
Abstract
Gait recognition in unconstrained environments is severely hampered by variations in view, clothing, and carrying conditions. To address this, we introduce HierarchGait, a key-frame-aware hierarchical learning framework. Our approach uniquely integrates three complementary modules: a TemplateBlock-based Motion Extraction (TBME) for coarse-to-fine anatomical feature learning, a Sequence-Level Spatio-temporal Feature Aggregator (SSFA) to identify and prioritize discriminative key-frames, and a Frame-level Feature Re-segmentation Extractor (FFRE) to capture fine-grained motion details. This synergistic design yields a robust and comprehensive gait representation. We demonstrate the superiority of our method through extensive experiments. On the highly challenging CASIA-B dataset, HierarchGait achieves new state-of-the-art average Rank-1 accuracies of 98.1% under Normal (NM), 95.9% under Bag (BG), and 87.5% under Coat (CL) conditions. Furthermore, on the large-scale OU-MVLP dataset, our model attains a 91.5% average accuracy. These results validate the significant advantage of explicitly modeling anatomical hierarchies and temporal key-moments for robust gait recognition. Full article
(This article belongs to the Section Biometrics, Forensics, and Security)

20 pages, 2508 KB  
Article
An Attention-Enhanced Network for Person Re-Identification via Appearance–Gait Fusion
by Zelong Yu, Yixiang Cai, Hanming Xu, Lei Chen, Mingqian Yang, Huabo Sun and Xiangyu Zhao
Electronics 2025, 14(21), 4142; https://doi.org/10.3390/electronics14214142 - 22 Oct 2025
Cited by 2 | Viewed by 803
Abstract
The objective of person re-identification (Re-ID) is to recognize a given target pedestrian across different cameras. However, perspective variations, resulting from differences in shooting angles, often significantly impact the accuracy of person Re-ID. To address this issue, this paper presents an attention-enhanced person Re-ID algorithm based on appearance–gait information interaction. Specifically, appearance features and gait features are first extracted from RGB images and gait energy images (GEIs), respectively, using two ResNet-50 networks. Then, a multimodal information exchange module based on the attention mechanism is designed to build a bridge for information exchange between the two modalities during the feature extraction process. This module enhances feature extraction through mutual guidance and reinforcement between the two modalities, thereby improving the model’s effectiveness in integrating the two types of modal information. Subsequently, to further balance the signal-to-noise ratio, importance weight estimation is employed to map perspective information into the importance weights of the two features. Finally, based on the autoencoder structure, the two features are weighted and fused under the guidance of importance weights to generate fused features that are robust to perspective changes. The experimental results on the CASIA-B dataset indicate that, under conditions of viewpoint variation, the proposed method achieved an average accuracy of 94.9%, which is 1.1% higher than the next best method, and obtained the smallest variance of 4.199, suggesting that the proposed method is not only more accurate but also more stable. Full article
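The importance-weighted fusion step can be read as mapping per-modality scores to normalized weights and combining the features element-wise. A sketch under that reading; in the paper the weight estimator is learned from perspective information, not a plain softmax over two scalars:

```python
import math

def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_fuse(feat_a, feat_b, score_a, score_b):
    """Fuse an appearance feature and a gait feature element-wise:
    fused = w_a * a + w_b * b, with weights from the importance scores.
    The scores here are hypothetical stand-ins for the learned estimator."""
    w_a, w_b = softmax([score_a, score_b])
    return [w_a * a + w_b * b for a, b in zip(feat_a, feat_b)]

# Equal importance scores give equal weights.
fused = weighted_fuse([1.0, 0.0], [0.0, 1.0], score_a=0.0, score_b=0.0)
print(fused)  # [0.5, 0.5]
```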
(This article belongs to the Special Issue Artificial Intelligence and Microsystems)

24 pages, 824 KB  
Article
MMF-Gait: A Multi-Model Fusion-Enhanced Gait Recognition Framework Integrating Convolutional and Attention Networks
by Kamrul Hasan, Khandokar Alisha Tuhin, Md Rasul Islam Bapary, Md Shafi Ud Doula, Md Ashraful Alam, Md Atiqur Rahman Ahad and Md. Zasim Uddin
Symmetry 2025, 17(7), 1155; https://doi.org/10.3390/sym17071155 - 19 Jul 2025
Cited by 1 | Viewed by 1745
Abstract
Gait recognition is a reliable biometric approach that uniquely identifies individuals based on their natural walking patterns. It is widely used because gait is difficult to camouflage and does not require a person’s cooperation. Face-based person recognition systems often fail to determine an offender’s identity when the face is concealed with a helmet or mask to evade identification. In such cases, gait-based recognition is ideal for identifying offenders, and most existing work leverages a deep learning (DL) model. However, a single model often fails to capture the full range of subtle patterns in the input data under external factors such as variations in viewing angle, clothing, and carrying conditions. In response, this paper introduces a fusion-based multi-model gait recognition framework that leverages the potential of convolutional neural networks (CNNs) and a vision transformer (ViT) in an ensemble manner to enhance gait recognition performance. Here, CNNs capture spatiotemporal features, while the ViT’s multiple attention layers focus on particular regions of the gait image. The first step in this framework is to obtain the Gait Energy Image (GEI) by averaging a height-normalized gait silhouette sequence over a gait cycle, which preserves the left–right symmetry of gait. After that, the GEI is fed through multiple pre-trained models, each fine-tuned to extract deep spatiotemporal features. Three separate fusion strategies are then conducted: the first is decision-level fusion (DLF), which takes each model’s decision and employs majority voting for the final decision; the second is feature-level fusion (FLF), which combines the features from individual models through pointwise addition before performing gait recognition; finally, a hybrid fusion combines DLF and FLF for gait recognition.
The performance of the multi-model fusion-based framework was evaluated on three publicly available gait databases: CASIA-B, OU-ISIR D, and the OU-ISIR Large Population dataset. The experimental results demonstrate that the fusion-enhanced framework achieves superior performance. Full article
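The GEI construction described above is a pixel-wise average of aligned binary silhouettes over one gait cycle; a minimal sketch:

```python
def gait_energy_image(silhouettes):
    """GEI: pixel-wise average of height-normalized, aligned binary
    silhouette frames over one gait cycle. Each frame is a list of rows;
    the result is a grayscale 'energy' map in [0, 1]."""
    n = len(silhouettes)
    h, w = len(silhouettes[0]), len(silhouettes[0][0])
    return [[sum(frame[y][x] for frame in silhouettes) / n
             for x in range(w)]
            for y in range(h)]

# Two toy 2x2 frames; real frames are full normalized silhouettes.
frames = [[[1, 0], [1, 1]],
          [[1, 1], [0, 1]]]
print(gait_energy_image(frames))  # [[1.0, 0.5], [0.5, 1.0]]
```

Stable body regions appear bright (value near 1) while moving limbs leave intermediate values, which is why the GEI captures both body shape and motion in a single template.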
(This article belongs to the Special Issue Symmetry and Its Applications in Image Processing)

15 pages, 2148 KB  
Article
Comparison of a Scheimpflug Camera and Optical Coherence Tomography in Evaluating Keratoconic Eyes Post Keratoplasty
by Anna Maria Gadamer, Piotr Miklaszewski, Dominika Janiszewska-Bil, Anita Lyssek-Boroń, Dariusz Dobrowolski, Edward Wylęgała, Beniamin Oskar Grabarek and Katarzyna Krysik
J. Clin. Med. 2025, 14(1), 238; https://doi.org/10.3390/jcm14010238 - 3 Jan 2025
Cited by 2 | Viewed by 1643
Abstract
Background/Objective: The aim of this retrospective study was to compare corneal parameters and compliance using a Pentacam HR–Scheimpflug (Pentacam HR) and a swept-source OCT Casia (Casia) in keratoconus (KC) patients post penetrating keratoplasty (PKP) and KC patients without PKP, as well as a control group. Pachymetry measurements were also analyzed using a spectral domain OCT Solix (OCT Solix), Pentacam HR, and Casia. Methods: The study included 71 patients (136 keratoconic eyes; group A), 86 eyes with KC post-PKP (group B), 50 eyes with KC without PKP (group C), and 52 control participants (104 eyes). All participants were adults, Polish Caucasian, and met specific inclusion criteria. Patients with ophthalmological or systemic diseases, cognitive impairment, or pregnancy were excluded. Corneal parameters were measured using two devices (Casia and Pentacam HR), while pachymetry was assessed with three devices (Casia, Pentacam HR, and OCT Solix), with the inter-device agreement and group differences analyzed. Results: Significant differences (p < 0.05) were found across all groups. The post-PKP KC eyes showed significant differences in all front parameters and K2 and Astig. back, while the non-PKP KC eyes showed differences in the K1 back (p = 0.025). The controls displayed differences in all parameters except front astigmatism (p = 0.61). The Pentacam HR overestimated the thinnest corneal thickness (TCT) compared to the OCT Casia across groups. The inter-device agreement was excellent for the anterior parameters (ICC > 0.9) but good for the posterior parameters and TCT. Conclusions: This study highlights significant variability in corneal and pachymetry measurements across devices, with OCT Casia providing more consistent and clinically reliable results than Pentacam HR. Clinicians should exercise caution when using these devices interchangeably, particularly for posterior parameters and TCT. Full article
(This article belongs to the Special Issue Clinical Updates in Corneal Transplantation)

25 pages, 3823 KB  
Article
Performance Evaluation of Various Deep Learning Models in Gait Recognition Using the CASIA-B Dataset
by Nakib Aman, Md. Rabiul Islam, Md. Faysal Ahamed and Mominul Ahsan
Technologies 2024, 12(12), 264; https://doi.org/10.3390/technologies12120264 - 17 Dec 2024
Cited by 9 | Viewed by 4889
Abstract
Human gait recognition (HGR) has been employed as a biometric technique for security purposes over the last decade. Various factors, including clothing, carrying items, and walking surfaces, can influence the performance of gait recognition. Additionally, identifying individuals from different viewpoints presents a significant challenge in HGR. Numerous conventional and deep learning techniques have been introduced in the literature for HGR, but traditional methods are not well suited to handling large datasets. This research explores the effectiveness of four deep learning models for gait identification on the CASIA-B dataset: a convolutional neural network (CNN), a multi-layer perceptron (MLP), self-organizing maps (SOMs), and transfer learning with EfficientNet. The selected deep learning techniques offer robust feature extraction and efficient handling of large datasets, making them well suited to enhancing the accuracy of gait recognition. The collection includes gait sequences from 10 individuals, with a total of 92,596 images resized to 64 × 64 pixels for uniformity. A modified model was developed by integrating sequential convolutional layers for detailed spatial feature extraction, followed by dense layers for classification, optimized through rigorous hyperparameter tuning and regularization techniques, resulting in an accuracy of 97.12% on the test set. This work enhances our understanding of deep learning methods in gait analysis, offering significant insights for choosing optimal models in security and surveillance applications. Full article

15 pages, 1999 KB  
Article
Multi-Biometric Feature Extraction from Multiple Pose Estimation Algorithms for Cross-View Gait Recognition
by Ausrukona Ray, Md. Zasim Uddin, Kamrul Hasan, Zinat Rahman Melody, Prodip Kumar Sarker and Md Atiqur Rahman Ahad
Sensors 2024, 24(23), 7669; https://doi.org/10.3390/s24237669 - 30 Nov 2024
Cited by 8 | Viewed by 2446
Abstract
Gait recognition is a behavioral biometric technique that identifies individuals based on their unique walking patterns, enabling long-distance identification. Traditional gait recognition methods rely on appearance-based approaches that utilize background-subtracted silhouette sequences to extract gait features. While effective and easy to compute, these methods are susceptible to variations in clothing, carried objects, and illumination changes, compromising the extraction of discriminative features in real-world applications. In contrast, model-based approaches using skeletal key points offer robustness against these covariates. Advances in human pose estimation (HPE) algorithms using convolutional neural networks (CNNs) have facilitated the extraction of skeletal key points, addressing some challenges of model-based approaches. However, the performance of skeleton-based methods still lags behind that of appearance-based approaches. This paper aims to bridge this performance gap by introducing a multi-biometric framework that extracts features from multiple HPE algorithms for gait recognition, employing feature-level fusion (FLF) and decision-level fusion (DLF) by leveraging a single-source multi-sample technique. We utilized state-of-the-art HPE algorithms, OpenPose, AlphaPose, and HRNet, to generate diverse skeleton data samples from a single source video. Subsequently, we employed a residual graph convolutional network (ResGCN) to extract features from the generated skeleton data. In the FLF approach, the features extracted by ResGCN from the skeleton data samples generated by the multiple HPE algorithms are aggregated point-wise for gait recognition, while in the DLF approach, the decisions of ResGCN on each skeleton data sample are integrated using majority voting for the final recognition. Our proposed method demonstrated state-of-the-art skeleton-based cross-view gait recognition performance on a popular dataset, CASIA-B. Full article
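The two fusion strategies reduce to point-wise addition of feature vectors (FLF) and majority voting over per-branch decisions (DLF). A toy sketch, with lists standing in for ResGCN outputs from the OpenPose, AlphaPose, and HRNet branches:

```python
from collections import Counter

def feature_level_fusion(feature_sets):
    """FLF: point-wise addition of the per-branch feature vectors."""
    return [sum(vals) for vals in zip(*feature_sets)]

def decision_level_fusion(decisions):
    """DLF: majority vote over the per-branch predicted identities."""
    return Counter(decisions).most_common(1)[0][0]

# Hypothetical outputs from three HPE-specific branches.
print(feature_level_fusion([[1, 2], [3, 4], [5, 6]]))   # [9, 12]
print(decision_level_fusion(['id_7', 'id_7', 'id_3']))  # id_7
```

With an odd number of branches, majority voting always produces a winner; FLF instead defers the decision until after the aggregated feature is matched against the gallery.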
(This article belongs to the Section Physical Sensors)

9 pages, 877 KB  
Proceeding Paper
Gait-Driven Pose Tracking and Movement Captioning Using OpenCV and MediaPipe Machine Learning Framework
by Malathi Janapati, Leela Priya Allamsetty, Tarun Teja Potluri and Kavya Vijay Mogili
Eng. Proc. 2024, 82(1), 4; https://doi.org/10.3390/ecsa-11-20470 - 26 Nov 2024
Cited by 2 | Viewed by 3078
Abstract
Pose tracking and captioning are extensively employed for motion capture and activity description in daylight vision scenarios. Activity detection through camera systems presents a complex challenge, necessitating the refinement of numerous algorithms to ensure accurate functionality. Despite their notable capabilities, IP cameras lack integrated models for effective human activity detection. Motivated by this, this paper presents a gait-driven OpenCV and MediaPipe machine learning framework for human pose tracking and movement captioning. It incorporates the Generative 3D Human Shape (GHUM 3D) model, which classifies human bones, while Python-based logic classifies movements as either usual or unusual. The model is integrated into a website equipped with camera input, activity detection, and gait posture analysis for pose tracking and movement captioning. The proposed approach comprises four modules: two for pose tracking and two for generating natural language descriptions of movements. The implementation is carried out on two publicly available datasets, CASIA-A and CASIA-B. The methodology divides the videos in the datasets into 15-frame segments, each examined in detail for human movement. Features such as spatial-temporal descriptors, motion characteristics, or key point coordinates are derived from each frame to detect key pose landmarks, focusing on the left shoulder, elbow, and wrist. By calculating the angle between these landmarks, the proposed method classifies the activities as “Walking” (angle between −45 and 45 degrees), “Clapping” (angles below −120 or above 120 degrees), and “Running” (angles below −150 or above 150 degrees). Angles outside these ranges are categorized as “Abnormal”, indicating abnormal activities.
The experimental results show that the proposed method is robust for individual activity recognition. Full article
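The angle-threshold rule described above can be sketched as follows. The signed-angle computation and the function names are illustrative assumptions, not the authors' exact implementation; note that the stated "Running" range is a subset of the "Clapping" range, so the more extreme range must be checked first:

```python
import math

def joint_angle(a, b, c):
    # Signed angle in degrees at joint b, formed by points a-b-c
    # (e.g. shoulder-elbow-wrist), each given as (x, y) coordinates.
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return math.degrees(math.atan2(cross, dot))

def classify_activity(angle):
    # Thresholds as stated in the abstract; "Running" is tested before
    # "Clapping" because its range is strictly more extreme.
    if angle < -150 or angle > 150:
        return "Running"
    if angle < -120 or angle > 120:
        return "Clapping"
    if -45 <= angle <= 45:
        return "Walking"
    return "Abnormal"
```

In a real pipeline the (x, y) coordinates would come from MediaPipe's pose landmarks for the left shoulder, elbow, and wrist, evaluated once per frame of each 15-frame segment.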
16 pages, 1816 KB  
Article
MFCF-Gait: Small Silhouette-Sensitive Gait Recognition Algorithm Based on Multi-Scale Feature Cross-Fusion
by Chenyang Song, Lijun Yun and Ruoyu Li
Sensors 2024, 24(17), 5500; https://doi.org/10.3390/s24175500 - 24 Aug 2024
Cited by 1 | Viewed by 2705
Abstract
Gait recognition based on gait silhouette profiles is currently a major approach in the field of gait recognition. In previous studies, models typically used gait silhouette images sized at 64 × 64 pixels as input data. However, in practical applications, silhouette images may be smaller than 64 × 64, leading to a loss of detail and a significant drop in model accuracy. To address these challenges, we propose a gait recognition system named Multi-scale Feature Cross-Fusion Gait (MFCF-Gait). At the input stage of the model, we employ super-resolution algorithms to preprocess the data. During this process, we observed that the choice of super-resolution algorithm also affects training outcomes on larger silhouette images: improved super-resolution algorithms contribute to better model performance. In terms of model architecture, we introduce a multi-scale feature cross-fusion network. By integrating low-level feature information from higher-resolution images with high-level feature information from lower-resolution images, the model emphasizes smaller-scale details, thereby improving recognition accuracy for smaller silhouette images. The experimental results on the CASIA-B dataset demonstrate significant improvements. On 64 × 64 silhouette images, the accuracies for the NM, BG, and CL conditions reached 96.49%, 91.42%, and 78.24%, respectively; on 32 × 32 silhouette images, they were 94.23%, 87.68%, and 71.57%, respectively. Full article
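As a rough illustration of the preprocessing step, a small silhouette can be upsampled to the standard 64 × 64 input size before training. Nearest-neighbour interpolation here is only a crude stand-in for the learned super-resolution algorithms the paper actually evaluates, and the function name is an assumption:

```python
import numpy as np

def upsample_silhouette(sil, target=64):
    # Nearest-neighbour upsampling of a square binary silhouette,
    # assuming the target size is an integer multiple of the input size.
    # A learned super-resolution model would replace this step.
    factor = target // sil.shape[0]
    return np.kron(sil, np.ones((factor, factor), dtype=sil.dtype))
```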
(This article belongs to the Special Issue Artificial Intelligence and Sensor-Based Gait Recognition)
23 pages, 1980 KB  
Article
GaitSTAR: Spatial–Temporal Attention-Based Feature-Reweighting Architecture for Human Gait Recognition
by Muhammad Bilal, He Jianbiao, Husnain Mushtaq, Muhammad Asim, Gauhar Ali and Mohammed ElAffendi
Mathematics 2024, 12(16), 2458; https://doi.org/10.3390/math12162458 - 8 Aug 2024
Cited by 5 | Viewed by 2352
Abstract
Human gait recognition (HGR) leverages unique gait patterns to identify individuals, but its effectiveness can be hindered by various factors such as carrying conditions, foot shadows, clothing variations, and changes in viewing angle. Traditional silhouette-based systems often neglect the critical role of instantaneous gait motion, which is essential for distinguishing individuals with similar features. We introduce the “Enhanced Gait Feature Extraction Framework (GaitSTAR)”, a novel method that incorporates dynamic feature weighting through discriminant analysis of temporal and spatial features within a channel-wise architecture. Key innovations in GaitSTAR include a dynamic stride flow representation (DSFR) to address silhouette distortion, a transformer-based feature set transformation (FST) for integrating image-level features into set-level features, and dynamic feature reweighting (DFR) for capturing long-range interactions. DFR enhances contextual understanding and improves detection accuracy by computing attention distributions across the channel dimensions. Empirical evaluations show that GaitSTAR achieves accuracies of 98.5%, 98.0%, and 92.7% under the NM, BG, and CL conditions, respectively, on the CASIA-B dataset; 67.3% on the CASIA-C dataset; and 54.21% on the Gait3D dataset. Despite its complexity, GaitSTAR strikes a favorable balance between accuracy and computational efficiency, making it a powerful tool for biometric identification based on gait patterns. Full article
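A minimal sketch of the channel-wise attention idea behind DFR: each channel of a feature map is summarized by a descriptor, the descriptors are turned into an attention distribution over channels, and the channels are rescaled accordingly. Global average pooling as the descriptor and a plain softmax are illustrative assumptions; the paper's exact formulation may differ:

```python
import numpy as np

def channel_reweight(features):
    # features: (C, H, W) feature map. Global-average-pool each channel
    # to a scalar descriptor, softmax the descriptors into an attention
    # distribution over channels, and rescale every channel by its
    # attention weight (times C, so the mean weight stays 1).
    c = features.shape[0]
    desc = features.reshape(c, -1).mean(axis=1)
    attn = np.exp(desc - desc.max())
    attn /= attn.sum()
    return features * attn[:, None, None] * c
```

With identical channels the attention is uniform and the map is unchanged; channels with larger average activation are amplified relative to the rest.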
18 pages, 10105 KB  
Article
Multi-View Gait Analysis by Temporal Geometric Features of Human Body Parts
by Thanyamon Pattanapisont, Kazunori Kotani, Prarinya Siritanawan, Toshiaki Kondo and Jessada Karnjana
J. Imaging 2024, 10(4), 88; https://doi.org/10.3390/jimaging10040088 - 9 Apr 2024
Cited by 1 | Viewed by 2971
Abstract
A gait is a walking pattern that can help identify a person. Recently, gait analysis has employed vision-based pose estimation for further feature extraction. This research aims to identify a person by analyzing their walking pattern. Moreover, the authors intend to expand gait analysis to other tasks, e.g., clinical, psychological, and emotional analysis. The vision-based human pose estimation method is used in this study to extract joint angles and the rank correlations between them. We deploy multi-view gait databases for the experiments, i.e., CASIA-B and OUMVLP-Pose. The features are separated into three parts, i.e., whole-, upper-, and lower-body features, to study the effect of human body part features on gait analysis. For person identity matching, the minimum Dynamic Time Warping (DTW) distance is determined. Additionally, we apply a majority voting algorithm to integrate the separate matching results from multiple cameras, which improves accuracy by up to approximately 30% compared to matching without majority voting. Full article
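The matching scheme (minimum DTW distance per camera view, then a majority vote across views) can be sketched as follows. The 1-D angle sequences and the dictionary layout are illustrative assumptions, not the paper's exact data format:

```python
from collections import Counter

def dtw_distance(s, t):
    # Classic O(len(s) * len(t)) dynamic-time-warping distance between
    # two 1-D feature sequences (e.g. per-frame joint angles).
    inf = float("inf")
    n, m = len(s), len(t)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - t[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

def identify(probe_views, gallery):
    # probe_views: {view_id: sequence};
    # gallery: {person_id: {view_id: sequence}}.
    # Match the probe per view by minimum DTW distance, then take a
    # majority vote over the per-view matches.
    votes = Counter()
    for view, seq in probe_views.items():
        best = min(gallery, key=lambda pid: dtw_distance(seq, gallery[pid][view]))
        votes[best] += 1
    return votes.most_common(1)[0][0]
```

Voting across views suppresses occasional per-camera mismatches, which is consistent with the roughly 30% accuracy gain the abstract reports over single-view matching.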
(This article belongs to the Special Issue Image Processing and Computer Vision: Algorithms and Applications)