1. Introduction
Lower extremity disorders have been identified as a significant factor contributing to disability and reduced quality of life on a global scale [
1,
2]. Osteoarthritis of the knee, hip, and ankle is a commonly observed disorder affecting the lower limbs [
3,
4]. These conditions, commonly resulting from trauma, degenerative diseases, or bio-mechanical abnormalities, can give rise to discomfort, a restricted range of motion, and diminished functionality [
5,
6]. The prompt and precise classification of these conditions is crucial for effective treatment planning, individualized rehabilitation, and the prevention of further consequences. Clinical trials, subjective patient testimonials, and diagnostic imaging modalities such as X-rays and magnetic resonance imaging (MRI) have conventionally served as the predominant approaches in ascertaining the existence and extent of lower limb issues [
7,
8]. Although these techniques have proven to be valuable, they often require the utilization of specialized equipment, entail significant time investments [
9,
10], and may not fully capture the comprehensive dynamics of joint movement observed during typical activities. In recent years, advancements in technology have enabled the objective and continuous monitoring of the bio-mechanics in the lower extremities using gait data [
11,
12].
Gait analysis is employed in a wide variety of domains, including medical diagnostics [
13,
14,
15], osteopathic medicine [
16,
17], comparative bio-mechanics [
18,
19,
20], and sports-related bio-mechanics [
21,
22,
23]. The application of gait analysis has exhibited considerable promise in the identification and assessment of lower limb disorders [
24]. Gait analysis encompasses the evaluation of an individual’s walking pattern, including diverse elements such as stride length, step width, joint angles, and the temporal coordination of movements. Through the examination of these gait parameters, medical professionals and scholars can detect irregularities or deviations from typical gait patterns, which may serve as indicators of the existence of a lower limb disorder [
25,
26,
27]. The objective of the study presented in this manuscript is to investigate the potential application of PoseNet features in the classification of lower limb disorders. PoseNet is a real-time pose estimation model developed by Google, which employs deep learning techniques to accurately assess human poses from both photos and videos [
28,
29]. Refs. [
30,
31,
32] used PoseNet for in-home rehabilitation, and [
33,
34] used PoseNet for batsman stroke prediction. PoseNet features capture the spatial positions and movements of key body joints. PoseNet was chosen for this study because of its real-time capability, ease of use, and versatility: it works from a single video feed, requires minimal preprocessing, and integrates readily with widely used deep learning frameworks. The aim was to establish a dependable and precise system for classifying disorders of the hip, ankle, and knee by analyzing gait patterns and movements extracted from videos.
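As an illustration of how joint positions can be turned into a gait parameter such as a joint angle, the following minimal sketch computes the angle at the knee from three 2D keypoints. It is not the feature set used in this study; the keypoint names and pixel coordinates in `frame` are hypothetical values chosen for the example.

```python
import math


def joint_angle(a, b, c):
    """Return the angle at point b, in degrees, formed by the points a-b-c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("coincident keypoints: angle is undefined")
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Hypothetical keypoints for one frame: name -> (x, y) in pixels.
frame = {
    "leftHip": (320.0, 240.0),
    "leftKnee": (320.0, 340.0),
    "leftAnkle": (320.0, 440.0),
}

# A fully extended leg (hip, knee, and ankle on one line) gives 180 degrees.
knee_angle = joint_angle(frame["leftHip"], frame["leftKnee"], frame["leftAnkle"])
```

Computed for every frame of a walking video, such an angle forms a time series that reflects knee flexion over the gait cycle.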
PoseNet offers a non-invasive and user-friendly way to extract human pose data from videos, with no need for dedicated apparatus or body-attached markers. This permits a more natural and unimpeded evaluation of gait patterns in real-world settings. Because PoseNet is based on deep learning, it extracts detailed features from videos and yields a comprehensive description of lower limb movement. These features capture subtle variations in an individual’s gait that may indicate particular lower limb disorders, and the approach therefore has the potential to improve diagnostic accuracy. The study makes the following contributions.
The data were collected from a total of 174 real patients and normal individuals, comprising both male and female participants. The data collection process involved capturing videos of the participants using a camera while they walked on a designated walkway at the Tehsil Headquarter (THQ) Hospital in Sadiqabad.
The data were gathered through video recordings, which removed the need for intrusive sensors or apparatus attached to the participants’ bodies. This minimally intrusive collection method allows a more natural and unrestrained evaluation of gait patterns and enhances the ecological validity of the system.
The system employs PoseNet, a deep learning model, to extract features from videos that capture lower limb movement. PoseNet’s ability to estimate the human pose accurately supports a thorough examination of gait patterns.
By applying machine learning (ML) algorithms to the extracted PoseNet features, the system can classify and distinguish disorders of the hip, ankle, and knee. This automation reduces both the subjectivity and the time involved in manual analysis, making diagnosis faster and more efficient.
The aforementioned contributions play a significant role in the advancement of lower limb disorder classification, thereby holding the potential to yield substantial benefits for both clinical practice and research endeavors.
The subsequent sections of the article are structured as follows:
Section 2 provides a comprehensive literature review,
Section 3 outlines the methodology and experimental procedures, and the obtained results are discussed in
Section 4. Finally, the conclusions are presented in
Section 5.
2. Literature Review
In recent years, there has been growing interest in the field of gait analysis and the categorization of joint abnormalities. This attention is driven by the desire to enhance the accuracy of diagnosis and treatment methods. Various studies employ ML and deep learning techniques to automatically categorize joint abnormalities using gait data. Each study in the field of gait analysis is centered around a particular condition or aspect and employs a range of methodologies and evaluation metrics. The research conducted by [
35] centers on the diagnosis of knee osteoarthritis through the automated analysis of walking data obtained from both diagnosed persons and symptom-free controls. Ground reaction force features are extracted using force plates and piezoelectric sensors, and these features are then related to the severity of osteoarthritis using random forest regression models. The accuracy of 72.61% obtained in five-fold cross-validation indicates moderate performance and leaves room for improvement.
In [
36], the researchers use supervised classifiers and an RGB-D camera to diagnose gait problems in osteoarthritis patients. The researchers categorize gait disorders with 97% accuracy using fourteen gait measures, demonstrating its potential for osteoarthritis diagnosis. Another work [
37] proposes a novel method of detecting gait abnormalities using a single 2D video camera. Video analysis with a support vector machine (SVM) classifier determines biomechanical gait parameters with 98.8% accuracy. The research in [
38] presents a cost-effective and user-friendly gait data acquisition and analysis system. This technique quantifies osteoarthritis-related walking irregularities. The hybrid prediction model, combining manual and automated characteristics, achieves 98.77% accuracy. Meanwhile, Ref. [
39] uses deep learning to classify abnormal gait patterns by integrating 3D skeletal data and plantar foot pressure readings. The multimodal hybrid model achieves 97.60% accuracy by utilizing pressure and skeletal data effectively.
The aim of [
40] was to develop an automated framework for knee osteoarthritis (KOA) classification utilizing radiographic imaging and gait analysis, with the Kellgren-Lawrence (KL) grading system. A support vector machine and deep learning features from Inception-ResNet-v2 classified KOA based on gait and radiographic data, showing strong relationships between gait characteristics and radiological severity. The AUC varied from 0.93 to 0.97 for KL grades 0–4. Moreover, Ref. [
41] intended to evaluate gait symmetry in unilateral ankle osteoarthrosis (AOA) patients and identify variables affecting post-surgery asymmetry. They compared 46 gait metrics in 10 healthy people with 10 AOA patients using 3D inertial sensors and pressure insoles. They found significant differences in 23 impacted-side and 20 non-impacted-side variables. In particular, 14 metrics exhibited differences during bilateral AOA patient comparisons, notably in the toe area, and in forefoot mobility during walking.
In [
42], the researchers use ground reaction force (GRF) measurements to automate the diagnosis of functional gait disorders (GDs). They evaluate GRF parameterization methods for GD identification and establish a reference for automatic classification. The study divides 279 GD patients and 161 healthy controls into hip, knee, ankle, and calcaneus impairment groups using GRF data. It tests GRF and PCA-based parameterization approaches. The evaluation of discriminative power uses linear discriminant analysis. The study classifies normal walking patterns and multiclass GD categories. The study in [
43] focuses on categorizing gait disorders, with an emphasis on ground reaction force (GRF) analysis. The study preprocesses GRF signals and extracts and selects features from the GaitRec and Gutenberg databases with data from gait problem patients and healthy participants. The K-nearest neighbor (KNN) model outperforms conventional machine learning approaches in four experimental schemes categorizing gait disorders. The study contrasts vertical and three-dimensional GRF, with the latter performing better. Meanwhile, Ref. [
44] develops an automated, accurate knee osteoarthritis (KOA) diagnosis method. The study uses recurrence quantification analysis (RQA), fuzzy entropy, and statistical analysis to characterize dynamical features of gait kinematic data. Discriminant analysis on these features is used to evaluate shallow classifiers such as SVM, KNN, naive Bayes (NB), decision tree (DT), and Adaboost. The SVM distinguishes KOA patients from healthy controls with reported maximum accuracies of 92.31% and 100%, supporting its efficacy for KOA diagnosis.
Previous studies demonstrate the efficacy of ML and deep learning methods for the automated classification of joint abnormalities from gait data. The range of modalities used, including RGB-D cameras, 2D video, and ground reaction force measurements, illustrates the variety of these methods. Nevertheless, some studies are limited by moderate accuracy, the need for additional evaluation metrics, or comparatively small sample sizes. This manuscript introduces an approach to classifying lower limb disorders that distinguishes ankle, knee, and hip disorders from normal gait. It uses PoseNet, a deep learning-based pose estimation model, to extract features describing subjects’ movements and joint positions from video recordings. The primary objective is to improve the precision and efficiency of diagnosing lower limb disorders from video data. The method holds promise for assisting healthcare practitioners in identifying and classifying distinct lower limb disorders and thus in selecting suitable treatment and rehabilitation approaches.
4. Results and Discussion
The dataset was split into a training set and a testing set using a ratio of 70% and 30%, respectively, in order to compare the performance of several ML models. This partitioning facilitated the evaluation of the model’s performance on data that had not been previously observed. We used various widely used ML algorithms, including Random Forest (RF), Extra Tree Classifier (ETC), K-nearest neighbor (KNN), Adaboost, and Multilayer Perceptron (MLP), as well as deep learning models like Artificial Neural Networks (ANN) and Convolutional Neural Networks (CNN). These ML models are used in different real-time applications related to disease diagnosis [
58], computer vision [
59,
60], agriculture [
61], and education [
62], among others. Grid search was used to tune the hyperparameters of the models: it systematically evaluates every combination in a predefined hyperparameter grid and retains the best-performing one. The hyperparameters used in this study are shown in
Table 1. The numbers 1024, 512, 256, and 128 in
Table 1 give the number of neurons in each hidden layer of the ANN used in this study.
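The 70%/30% split and the exhaustive search over hyperparameter combinations described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the implementation used in the study. The grid values other than the hidden layer sizes (1024, 512, 256, 128), the function names, and `toy_score` are assumptions introduced for the example; in practice the scoring function would train a model and return its validation accuracy. The sample count of 174 is used only because the text reports 174 participants; the unit at which the split was made (participant, video, or frame) is not stated.

```python
import itertools
import random


def split_70_30(n_samples, seed=0):
    """Shuffle the sample indices and split them into 70% / 30% parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = round(0.7 * n_samples)
    return indices[:cut], indices[cut:]


def iter_grid(param_grid):
    """Yield every combination of the values in param_grid as a dict."""
    names = sorted(param_grid)
    for values in itertools.product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))


def grid_search(param_grid, score_fn):
    """Return the hyperparameter combination with the highest score."""
    return max(iter_grid(param_grid), key=score_fn)


# Hypothetical grid; only the layer sizes 1024, 512, 256, 128 come from the text.
param_grid = {
    "hidden_layers": [(1024, 512, 256, 128), (512, 256)],
    "learning_rate": [0.001, 0.01],
}


def toy_score(params):
    # Placeholder for "train a model with params and return validation accuracy".
    return 0.01 * len(params["hidden_layers"]) - params["learning_rate"]


train_idx, test_idx = split_70_30(174)
best_params = grid_search(param_grid, toy_score)
```

The cost of grid search grows with the product of the number of values per hyperparameter, which is why the grid is usually kept small.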
K-fold cross-validation was performed to assess the generalizability of the trained models. K-fold cross-validation is a technique that divides a dataset into K subgroups and trains and evaluates a model K times, once for each subset as a validation set, to analyze and improve the model’s performance and generalization [
63]. In this study, the dataset was divided into five folds, each containing an equal and representative share of the data. Evaluating the models across these partitions indicated how well they generalize to unseen data. After K-fold validation, the models were assessed on the designated testing set. Frames were extracted from the videos in the testing set, and preprocessing and feature engineering were applied to improve the representation of the data. The trained classifiers then generated a prediction for every frame of a video, and the most frequent frame-level prediction was taken as the final prediction for the entire video. Because this aggregation draws on many frames rather than a single one, it captures the overall pattern in the video and makes the video-level prediction more reliable. The classification metrics and the K-fold scores are shown in
Table 2.
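The two steps described above, five-fold partitioning and frame-level majority voting, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code used in the study: the folds here are formed by simple shuffling without stratification, ties in the vote are resolved in favor of the label seen first, and the function names and example labels are hypothetical.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, validation) index lists for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


def majority_vote(frame_predictions):
    """Return the most frequent frame-level label as the video-level label."""
    if not frame_predictions:
        raise ValueError("no frame predictions to aggregate")
    # On a tie, Counter.most_common keeps the label that was encountered first.
    return Counter(frame_predictions).most_common(1)[0][0]


# Hypothetical frame-level predictions for one test video.
frame_predictions = ["knee", "knee", "ankle", "knee", "hip"]
video_label = majority_vote(frame_predictions)
```

With majority voting, a few misclassified frames do not change the video-level label as long as the correct class remains the most frequent one.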
The data presented in
Table 2 indicate that the MLP, ANN, and CNN models achieved the highest accuracy, between 97.88% and 98.84%. The RF, Adaboost, KNN, and ETC models reached accuracy of approximately 94%, with correspondingly high precision, recall, and F1-Score values. The MLP achieved 97.88% accuracy, while the ANN and CNN both achieved 98.84% with near-perfect precision, recall, and F1-Score values. The cross-validation scores of the ANN and CNN differed, however, with the ANN achieving the better validation score. The confusion matrix and the accuracy and loss curves of the ANN are shown in
Figure 8.
The confusion matrix presented in
Figure 8a depicts the classification outcomes of the ANN for the four classes: ankle, hip, knee, and normal. The model performed robustly across all categories. All 105 instances of the ankle class (class 0) were classified correctly. In the hip class (class 1), 65 of 67 instances were classified correctly, with one instance misclassified as knee and one as ankle. In the knee class (class 2), 174 of 178 instances were classified correctly; three were misclassified as ankle and one as hip. All 169 instances of the normal class (class 3) were classified correctly. The six misclassifications therefore all involved instances whose true class was hip or knee. The data presented in
Figure 8b show that the loss decreases over the epochs while the accuracy increases, indicating that the model’s predictions improve as training proceeds. The loss eventually levels off, which suggests that the model has converged. The class-wise classification metrics of the ANN are given in
Table 3.
Table 3 shows that the ANN model performed well across all classes. The precision of 0.96 for the “ankle” class means that 96% of the instances the ANN labeled as “ankle” truly belonged to that class. The recall of 1.00 means that the model detected every actual “ankle” instance, and the F1-Score of 0.98 reflects a good balance between precision and recall. For the “hip” class, the precision of 1.00 means that every instance predicted as “hip” was indeed a hip case, while the recall of 0.97 means that 97% of the actual “hip” instances were detected; the F1-Score was 0.98. For the “knee” class, the precision was 0.99 and the recall was 0.98, and the F1-Score, the harmonic mean of precision and recall, was 0.99. Finally, the “normal” class achieved precision and recall of 0.99 and an F1-Score of 1.00, the best result among the four classes.
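To make the relationship between a confusion matrix and the class-wise metrics concrete, the following sketch computes precision, recall, F1-Score, and overall accuracy from the counts quoted in the text for Figure 8a. It is an illustration only: the matrix layout (rows as true classes, columns as predicted classes) and the function names are assumptions, and values derived this way come solely from the quoted counts and need not match Table 3 exactly.

```python
LABELS = ["ankle", "hip", "knee", "normal"]

# Rows are true classes, columns are predicted classes (assumed layout),
# filled in from the counts quoted in the text for Figure 8a.
CONFUSION = [
    [105, 0, 0, 0],    # ankle: all 105 correct
    [1, 65, 1, 0],     # hip: 65 correct, 1 as ankle, 1 as knee
    [3, 1, 174, 0],    # knee: 174 correct, 3 as ankle, 1 as hip
    [0, 0, 0, 169],    # normal: all 169 correct
]


def per_class_metrics(matrix, labels):
    """Return {label: (precision, recall, f1)} for a square confusion matrix."""
    metrics = {}
    for i, label in enumerate(labels):
        true_positives = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum
        actual = sum(matrix[i])                    # row sum
        precision = true_positives / predicted if predicted else 0.0
        recall = true_positives / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


def accuracy(matrix):
    """Return the fraction of instances on the main diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


metrics = per_class_metrics(CONFUSION, LABELS)
overall_accuracy = accuracy(CONFUSION)
```

With these counts, 513 of 519 instances lie on the diagonal, which corresponds to the reported accuracy of 98.84%.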
4.1. Computational Complexity
This study examined the computation time of the classifiers used to classify lower limb disorders from video data. The timing analysis was conducted with the hyperparameters that yielded the highest accuracy. The experiments were performed on an HP ProBook 450 G4 laptop equipped with 16 GB of RAM and a 7th-generation Intel Core i5 processor. The findings in
Table 4 indicate that the ETC classifier had the shortest computation time, at 102 s, followed by the MLP at 128 s, KNN at 155 s, and RF at 364 s. The Adaboost and ANN classifiers took 462 and 500 s, respectively, whereas the CNN took the longest, at 712 s.
Considering both the performance metrics and the computation time, the MLP stands out as a strong competitor: it achieves 97.88% accuracy in 128 s. The ANN and CNN reach a slightly higher accuracy of 98.84%, but their computation times of 500 and 712 s are considerably longer than that of the MLP.
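One simple way to obtain such timings is to measure the wall-clock time of a call, as in the minimal sketch below. How the times in Table 4 were actually measured is not stated, so the `timed` helper is an assumption introduced for illustration; the dictionary merely restates the reported values so that the classifiers can be ranked.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Computation times in seconds as reported in the text for Table 4.
REPORTED_SECONDS = {
    "ETC": 102,
    "MLP": 128,
    "KNN": 155,
    "RF": 364,
    "Adaboost": 462,
    "ANN": 500,
    "CNN": 712,
}

# Classifiers ordered from fastest to slowest.
ranking = sorted(REPORTED_SECONDS, key=REPORTED_SECONDS.get)

# Example use of the helper on a stand-in workload.
result, elapsed = timed(sum, range(1000))
```

Measured times depend on the hardware and on background load, so comparisons are meaningful only when all models are timed on the same machine under similar conditions.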
4.2. Comparison with Existing Studies
The primary objective of the proposed study was to classify lower limb disorders using PoseNet features extracted from video data. The results are encouraging: the method attained an accuracy of 98.8% and a cross-validation score of 99%, which highlights its potential for precise classification. Compared with previous research on knee osteoarthritis [
35] and gait abnormalities [
36], which reported accuracies of 72.61% and 97%, respectively, the current study achieves higher accuracy. Furthermore, its accuracy is similar to that of studies that employed 2D video camera data [
37] and cost-effective gait analysis systems [
38]. However, the proposed study addresses a wider range of lower limb disorders, encompassing not only gait abnormalities associated with osteoarthritis but also other conditions. The present study also performs comparably to previous work that employed 3D skeletal data and foot pressure measurements [
39]. Moreover, it compares favorably with work that used radiographic imaging and gait analysis data and reported area under the curve (AUC) values ranging from 0.82 to 0.97 [
40]. Overall, the results indicate that PoseNet features extracted from video data are a viable basis for classifying lower limb disorders. In contrast to prior research that depended on expensive and intricate systems, such as 3D skeletal data and foot pressure measurements [
39], the present study relied only on video data and PoseNet features. The accessibility, cost-effectiveness, and user friendliness of video data make it a practical option for widespread use, including in clinical settings. The comparison of the studies is given in
Table 5.
5. Conclusions
This research introduces a novel method of classifying lower limb disorders based on gait analysis and PoseNet features, with an emphasis on the knee, hip, and ankle. PoseNet is used to extract the positions and movements of key body joints from video footage in a non-invasive manner, providing detailed information about the bio-mechanics of the lower limb. After feature extraction and feature engineering, several machine learning and deep learning models were trained and tested on the dataset. The results show that the proposed method classifies lower limb disorders accurately, with accuracies ranging from 93.44% to 98.84%. Its benefits include non-invasiveness, user friendliness, and the ability to capture natural gait patterns. It has the potential to help medical personnel identify and diagnose lower limb disorders, allowing for targeted treatment and rehabilitation. Future research could take several directions. First, increasing the size and diversity of the dataset would improve the models’ applicability. Second, the validity and clinical utility of the technique could be further confirmed by applying it to other illnesses and conditions affecting the lower limbs.