Quinary Classification of Human Gait Phases Using Machine Learning: Investigating the Potential of Different Training Methods and Scaling Techniques

Mekni, Amal; Narayan, Jyotindra; Gritli, Hassène

doi:10.3390/bdcc9040089

by Amal Mekni ¹, Jyotindra Narayan ²,³,* and Hassène Gritli ¹,⁴,*

¹ Laboratory of Robotics Informatics and Complex Systems (RISC Lab, LR16ES07), National Engineering School of Tunis, University of Tunis El Manar, B.P. 37, Le Belvédère, Tunis 1002, Tunisia

² Department of Mechanical Engineering, Indian Institute of Technology Patna, Patna 801106, India

³ Department of Computing, Imperial College London, London SW7 2AZ, UK

⁴ Higher Institute of Information and Communication Technologies, University of Carthage, Technopole of Borj Cédria, Route de Soliman, B.P. 123, Hammam Chatt, Ben Arous 1164, Tunisia

* Authors to whom correspondence should be addressed.

Big Data Cogn. Comput. 2025, 9(4), 89; https://doi.org/10.3390/bdcc9040089

Submission received: 15 March 2025 / Revised: 28 March 2025 / Accepted: 2 April 2025 / Published: 7 April 2025

(This article belongs to the Special Issue Deep Learning-Based Pose Estimation: Applications in Vision, Robotics, and Beyond)

Abstract

Walking is a fundamental human activity, and analyzing its complexities is essential for understanding gait abnormalities and musculoskeletal disorders. This article delves into the classification of gait phases using advanced machine learning techniques, specifically focusing on dividing these phases into five distinct subphases. The study utilizes data from 100 individuals obtained from an open-access platform and employs two distinct training methodologies. The first approach adopts stratified random sampling, where 80% of the data from each subphase are allocated for training and 20% for testing. The second approach involves participant-based splitting, training on data from 80% of the individuals and testing on the remaining 20%. Preprocessing methods such as Min–Max Scaling (MMS), Standard Scaling (SS), and Principal Component Analysis (PCA) were applied to the dataset to ensure optimal performance of the machine learning models. Several algorithms were implemented, including k-Nearest Neighbors (k-NNs), Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), Naive Bayes (Gaussian, Bernoulli, and Multinomial) (NB), Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis (QDA). The models were rigorously evaluated using performance metrics like cross-validation score, Mean Squared Error (MSE), Root Mean Squared Error (RMSE), accuracy, and $R^{2}$ score, offering a comprehensive assessment of their effectiveness in classifying gait phases. In the five subphases analysis, RF again performed strongly with a 94.95% accuracy, an RMSE of 0.4461, and an $R^{2}$ score of 90.09%, demonstrating robust performance across all scaling methods.

Keywords: gait phase classification; machine learning; random forest; principal component analysis; k-Nearest Neighbors; Support Vector Machine; Logistic Regression; Min–Max scaling; Standard Scaling

1. Introduction

Walking is a fundamental aspect of human activity, relying on the seamless coordination of muscles, joints, and nerves. A deeper understanding of this process is crucial for identifying and managing gait abnormalities and musculoskeletal disorders [1,2,3,4,5]. Gait, which refers to the rhythmic movement of the legs during walking, varies among individuals based on age and health status. It is commonly divided into two main phases (see Figure 1): the stance phase (0–60%) and the swing phase (60–100%).

Gait phase classification plays a vital role in clinical diagnostics, rehabilitation, and assistive technologies, as it enables precise monitoring of movement patterns and early detection of abnormalities. While traditional machine learning (ML) and deep learning (DL) approaches have significantly advanced gait analysis, challenges remain in accurately classifying multiple gait phases, optimizing feature selection, and improving model generalizability across diverse datasets. The primary objective of this study is to develop and benchmark ML models for quinary gait phase classification (five-phase classification) to enhance gait recognition accuracy and applicability. Specifically, this study investigates the impact of different preprocessing techniques, feature scaling methods, and training methodologies to optimize classification performance. The findings of this study aim to bridge the gap between research and real-world applications, with potential benefits for gait rehabilitation, intelligent orthotic devices, and fall prevention systems.

Recent advances in ML and DL have shown promising results in gait phase classification, with various studies exploring different approaches to improve classification accuracy. Park et al. [7] investigated binary (stance/swing) and ternary (weight acceptance/single limb support/limb advancement) gait phase classification using four ML algorithms: Decision Tree (DT), k-Nearest Neighbors (k-NNs), Support Vector Machine (SVM), and Neural Network (NN). Their findings demonstrated that SVM achieved the highest classification accuracy (93.44% for binary and 91.72% for ternary classification), with muscle synergy features outperforming traditional EMG features. These results highlight the potential of modular muscle coordination in enhancing gait phase detection for neurological rehabilitation and assistive devices. Zhang et al. [8] proposed a multi-information fusion method integrating surface electromyography (sEMG) signals, knee joint angles, and plantar pressure data to improve classification accuracy. A modified CNN model was used for classification, achieving 98% accuracy and 92% F1-score under five-fold cross-validation. Results indicate multi-information fusion significantly outperforms single-source approaches, proving its effectiveness for real-time exoskeleton control. Similarly, Hwang et al. [9] developed an IMU-based abnormal gait classification system using SVM, Random Forest (RF), and Extreme Gradient Boosting (XGB). Their model classified three gait types (normal, knee impairment, ankle impairment) with 91% accuracy using Recursive Feature Elimination with Cross-Validation (RFECV). Comparatively, a walkway-based system achieved only 77% accuracy, highlighting the superiority of IMU sensors for gait assessment. These findings suggest that IMU-based gait classification can enhance patient monitoring and rehabilitation strategies. 
In the context of intelligent orthotic devices, the authors of [10] evaluated ML classifiers (J-48 DT, RF, Multi-Layer Perceptrons, and SVMs) for classifying gait phases using thigh-mounted inertial sensors. Data from 31 participants were analyzed with 5-fold and 10-fold cross-validation, where J-48 DT achieved 97.5% accuracy with knee angle input and 97.0% accuracy without it, suggesting thigh-mounted IMU sensors alone can provide robust classification for real-time orthosis control.

Furthermore, Jung et al. [11] proposed an ML-based approach using IMU sensors to classify gait phases and estimate joint moments. Their model optimized feature selection, reducing joint angles from six to three, with only a 4.04% accuracy drop, and further reduction to two angles caused a 7.46% decrease. They extended their method using OpenSim for joint-moment regression, correlating gait phases with biomechanical forces. Their findings suggest efficient feature selection and robust gait classification, benefiting prosthetics, exoskeletons, and rehabilitation applications. Lastly, Mekni et al. [12] investigated binary (stance/swing) and ternary (stance-I/stance-II/swing) gait phase classification using six ML algorithms, demonstrating that RF consistently outperformed other classifiers, achieving high accuracy. These studies collectively emphasize the growing impact of ML-driven gait phase classification in biomechanics, clinical diagnostics, rehabilitation, and assistive technologies, paving the way for enhanced real-time control of exoskeletons and intelligent orthotic devices. Further extending this work, Mekni et al. [13] explored multi-class gait phase recognition by dividing the gait cycle into five distinct subphases, showing that RF remained the most effective classifier, while participant-based data partitioning contributed to improved robustness. Additionally, Mekni et al. [14] analyzed the impact of varied training methods and feature selection, comparing classification performance using five and ten lower-body movement features. Their findings emphasized that increasing movement features enhanced classification accuracy across all ML models, with RF reaching the best accuracy, reaffirming its suitability for gait phase recognition. Previous works by [12,13,14] primarily focused on binary- and ternary-class gait classification using traditional ML models. However, several aspects remain unexplored. 
First, extending classification to a higher number of gait phases, such as quinary classification, could enhance the granularity of gait analysis. Second, the impact of different data preprocessing techniques, including various scaling methods and dimensionality reduction approaches, has not been systematically evaluated. Furthermore, their work lacks an in-depth investigation into generalizability across diverse datasets, which is crucial for real-world clinical and rehabilitation applications. Addressing these limitations could lead to more robust and adaptive gait classification frameworks.

Deep learning (DL) techniques have demonstrated superior performance in gait phase classification, significantly advancing clinical diagnostics, rehabilitation, and assistive technologies. Unlike conventional machine learning (CML) approaches that rely on handcrafted features, DL methods automatically learn relevant patterns from raw data, improving classification accuracy. Zhang et al. [8] introduced a multi-information fusion method integrating surface electromyography (sEMG), knee joint angles, and plantar pressure data, where a modified Convolutional Neural Network (CNN) achieved 98% accuracy and a 92% F1-score, proving the efficacy of CNNs for real-time exoskeleton control. Xia et al. [16] combined CNN and BiLSTM to achieve 95.09% accuracy through local and temporal feature extraction, despite increased hardware complexity due to reliance on inertial measurement units and plantar pressure sensors. Similarly, Zheng et al. [17] compared five DL approaches (CNN, Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Bidirectional LSTM (BiLSTM), and Convolutional LSTM (ConvLSTM)) against four CML methods for age-related gait classification based on accelerometer data, demonstrating that DL consistently outperformed CML, with GRU attaining the highest accuracy (89.3%) and all DL models achieving AUC values above 0.94 compared to 0.83 for the best CML model. However, multiple sensor arrays increased hardware complexity and posed challenges for real-time processing. Zheng et al. [18] further validated DL’s capability in age-related gait classification, achieving 81.4% accuracy (AUC = 0.89) with CNN on single-stride data, and 84.5% accuracy (AUC = 0.94) with GRU on multi-stride data, incorporating SHapley Additive exPlanations (SHAPs) to enhance interpretability by identifying influential features such as heel contact and toe-off accelerations. Ma et al. [15] utilized joint angular sensors to achieve real-time robustness with 88.71% accuracy, though lower than DL methods. These studies collectively underscore DL’s effectiveness in gait analysis, particularly hybrid models, while highlighting limitations concerning hardware dependency, computational cost, and interpretability in diverse clinical and real-world applications.

Machine learning models have been widely applied to gait analysis, offering different strengths and limitations depending on their architecture. Traditional ML models, such as SVM, DT, and k-NN, have demonstrated strong interpretability and require relatively small datasets but struggle with complex gait dynamics. In contrast, DL models like CNN and RNN have outperformed traditional classifiers in recognizing abnormal gait patterns but require extensive labeled data and high computational resources [2,19,20]. CNNs, for example, have been particularly effective in feature extraction from time-series gait data [19]. However, recurrent models, such as LSTM networks, offer advantages in capturing temporal dependencies but may suffer from overfitting when applied to small datasets [4]. Given these trade-offs, recent research has focused on optimizing feature selection and training methodologies to maximize accuracy while improving model efficiency. Summarizing the limitations of existing models, three major challenges can be observed. First, generalizability remains a concern, as many models are trained on small, homogeneous datasets, resulting in poor performance on unseen subjects. Second, prior studies have not systematically evaluated the impact of feature selection techniques and preprocessing methods, such as PCA, on classification performance. Lastly, training strategies have largely relied on traditional random sampling without a thorough comparison to participant-based splitting, which can significantly affect model robustness. Addressing these challenges could enhance the reliability and applicability of gait classification models in real-world settings. This study systematically benchmarks multiple ML models for quinary gait phase classification to address these gaps, investigating the effects of different preprocessing techniques and training methodologies. 
In the five-subphase analysis, RF demonstrated strong performance, achieving an accuracy of 94.95%, an RMSE of 0.4461, and an $R^{2}$ score of 90.09% across all scaling methods. Notably, this accuracy surpasses prior studies that focused on binary (stance/swing) or ternary (stance-I/stance-II/swing) gait phase classification, where reported accuracy rates typically ranged between 85% and 93% [12,13]. This improvement highlights the advantages of adopting a more granular quinary classification approach and optimizing ML training methodologies, ultimately enhancing gait phase recognition accuracy and clinical applicability. Building on these advancements, the main contributions of this work are as follows:

(i): Unlike conventional studies that classify gait into two or three phases, this work extends the analysis to a five-phase classification, providing a more detailed understanding of gait mechanics.
(ii): This study evaluates two training methodologies—stratified random sampling and participant-based splitting—to analyze inter-individual variability and improve model robustness. Such a comparison between the outcome effects of two training methodologies has hardly been explored in the literature.
(iii): The research systematically investigates two feature scaling techniques—Min–Max Scaling (MMS) and Standard Scaling (SS)—as well as Principal Component Analysis (PCA), a dimensionality reduction method, to assess their impact on model performance.
(iv): Multiple ML algorithms, including k-NN, LR, DT, RF, SVM, NB, LDA, and QDA, are rigorously tested using metrics like accuracy, MSE, RMSE, and $R^{2}$, benchmarking the most effective models for gait phase classification.
(v): This study’s findings provide practical implications for real-time gait monitoring, rehabilitation, and biomechanical analysis, enabling the development of efficient gait classification models applicable in clinical and robotics applications.
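
The two training methodologies named in contribution (ii) can be illustrated with a minimal sketch. It uses only the Python standard library; the function names, the toy data layout (10 participants, 5 subphases, 2 samples each), and the fixed seed are assumptions made for illustration and are not taken from the study's implementation.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_frac=0.2, seed=0):
    """Return (train_idx, test_idx), holding out test_frac of each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, lab in enumerate(labels):
        by_class[lab].append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_frac)
        test += idx[:n_test]
        train += idx[n_test:]
    return sorted(train), sorted(test)

def participant_split(participants, test_frac=0.2, seed=0):
    """Return (train_idx, test_idx), holding out whole participants."""
    rng = random.Random(seed)
    ids = sorted(set(participants))
    rng.shuffle(ids)
    held_out = set(ids[:round(len(ids) * test_frac)])
    train = [i for i, p in enumerate(participants) if p not in held_out]
    test = [i for i, p in enumerate(participants) if p in held_out]
    return train, test

# Hypothetical toy data: 10 participants x 5 subphases x 2 samples each.
participants = [p for p in range(10) for _ in range(10)]
labels = [s for _ in range(10) for s in range(5) for _ in range(2)]
```

In the first function every subphase contributes the same fraction to the test set, whereas in the second no participant appears on both sides of the split, which is what makes it a test of generalization to unseen individuals.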

Aligned with these objectives, this study emphasizes the real-world applications of gait phase classification, bridging the gap between theoretical research and practical implementations in healthcare, assistive technology, and sports performance. Some of the key applications are outlined below.

Real-time gait monitoring: The proposed machine learning models can be integrated into wearable sensor systems to continuously track gait patterns, allowing for early detection of abnormalities and real-time feedback for users.
Rehabilitation and clinical applications: The classification models can assist clinicians in monitoring patients with neurological disorders, such as Parkinson’s disease or stroke, and evaluating the effectiveness of rehabilitation therapies based on gait phase recognition.
Intelligent prosthetics and robotics: The study provides insights into gait phase transitions, which can be utilized to enhance the control algorithms of robotic exoskeletons and intelligent prosthetic limbs, improving mobility for individuals with motor impairments.
Fall prevention systems: By accurately classifying gait phases, these models can contribute to predictive systems that assess gait stability and issue alerts for individuals at high risk of falls, particularly among elderly or mobility-impaired populations.
Sports and performance analysis: Athletes and sports professionals can use gait analysis models to optimize performance and reduce injury risks through personalized training adjustments based on biomechanical insights.

The structure of this paper is organized as follows: Section 2 describes the methodology, including dataset details, ML algorithms, and hyperparameter selection. Section 3 presents numerical results from the classification experiments. Section 4 evaluates the performance of the models using various metrics. Section 5 provides a discussion, and Section 6 provides a comparative analysis of the results. Finally, Section 7 concludes the study and outlines future research directions.

2. Dataset Involved

We used an existing dataset that includes 100 healthy adults aged between 21 and 79 years, ensuring a balanced representation across different age groups and genders. The dataset provides demographic information such as age, gender, height, and weight, offering insights into variations in gait. It also includes comprehensive biomechanical, kinematic, and kinetic data collected under different walking conditions. Each participant was assigned a unique identifier to preserve the anonymity and integrity of the data. For further details about the dataset, refer to the work of Bahadori et al. [21,22], and access it at https://data.mendeley.com/datasets/wwnvw28n2m/1, accessed on 10 December 2024. Our study focused on analyzing specific lower-body movements, including ten key motions for both limbs such as hip flexion/extension, hip abduction/adduction, hip internal/external rotation, knee flexion/extension, and ankle dorsiflexion/plantarflexion, as shown in Figure 2, Figure 3, Figure 4, Figure 5 and Figure 6. Table 1 presents a summary of the average anthropometric characteristics, including height, weight, and lower limb segment lengths, for both males and females.

To enhance the transparency and reproducibility of this study, we provide additional details on the data collection process and sample selection criteria. The dataset was obtained through controlled laboratory experiments using high-speed motion capture systems synchronized with force-instrumented treadmills. Each participant underwent a standardized gait assessment protocol, ensuring consistency in data acquisition. Strict inclusion criteria were applied, selecting only healthy adults without known musculoskeletal or neurological disorders, while individuals with gait impairments or recent lower-limb injuries were excluded. Participants represented a diverse distribution across age groups and genders to improve generalizability. Data preprocessing involved multiple steps, including outlier removal, missing value imputation, and normalization based on anthropometric parameters to enhance comparability. Additionally, motion cycle segmentation was performed to align gait events across participants, ensuring a robust dataset for biomechanical and ML analyses. These methodological refinements guarantee high-quality data, facilitating accurate modeling and reproducible research outcomes.

Data preprocessing is a critical phase in the analysis pipeline, ensuring the quality and reliability of data for accurate modeling. The raw motion data, collected from multiple participants and encompassing numerous motion cycles per individual, underwent several key steps to enhance its usability. First, data were cleaned to handle missing values, remove outliers, and correct inconsistencies, eliminating noise and inaccuracies that could impact the analysis. Next, the cleaned data were restructured to organize each participant’s 100 motion cycles into a standardized format, simplifying manipulation and feature extraction. Following this, relevant features related to the movement were selected, prioritizing data elements that contribute most significantly to accurate modeling while discarding irrelevant or redundant features. These comprehensive preprocessing steps ensured the dataset was clean, well organized, and optimized for robust analysis and modeling performance [23,24,25].

3. Evaluation Metrics

The assessment criteria are critical in evaluating the efficiency of ML models and helping in the choice of the most suitable model for a specific problem.

3.1. Cross-Validation (CV) Score

The evaluation of ML models’ performance and generalization capability is commonly achieved through k-fold cross-validation. This method divides the dataset into k subsets (folds), with the model being trained on $k - 1$ folds and tested on the remaining fold. The process is repeated k times, ensuring that each fold is used as a test set exactly once. The results from all iterations are averaged to provide an overall measure of the model’s performance. This approach ensures a robust evaluation by utilizing different combinations of training and testing data [26,27,28].
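
The fold rotation described above can be sketched in a few lines. This is an illustrative, unshuffled version in plain Python; the helper names and the user-supplied `score_fn` callback are assumptions, not part of the study's code.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size

def average_fold_score(score_fn, n_samples, k):
    """Average a user-supplied score_fn(train_idx, test_idx) over the k folds."""
    scores = [score_fn(tr, te) for tr, te in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)
```

In practice `score_fn` would fit a model on the training indices and return its accuracy on the test indices; shuffling or stratifying the indices beforehand is a common refinement that is omitted here.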

3.2. Mean Squared Error (MSE)

The MSE, a key metric for evaluating the performance of regression models, quantifies the error between predicted and actual values. It is computed as the average of the squared differences between the actual and predicted values:

MSE = \frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}

(1)

In Equation (1), n denotes the total number of data points, $y_{i}$ represents the actual value of the i-th data point, and ${\hat{y}}_{i}$ is its corresponding predicted value. The summation symbol ∑ indicates the calculation of the squared differences across all data points. This measure provides insight into the model’s accuracy, as discussed in [26,27,28].

Root Mean Squared Error (RMSE)

The RMSE is the square root of the MSE. Because it is expressed in the same units as the target variable, it is easier to interpret than the MSE as a measure of typical prediction error. The RMSE formula is

RMSE = \sqrt{MSE}

(2)

In Equation (2), the MSE is computed as in Equation (1), where n denotes the total number of observations, $y_{i}$ the observed value, and ${\hat{y}}_{i}$ the corresponding predicted value.
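
As a small worked example of Equations (1) and (2), the sketch below assumes the five subphases are encoded as the integers 0 to 4, so that the squared difference between labels is defined. That encoding is an assumption made for illustration; the text does not state how the labels were encoded.

```python
import math

def mse(y_true, y_pred):
    """Equation (1): mean of the squared differences."""
    return sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Equation (2): square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))

# Hypothetical labels: one of five predictions is wrong, off by two phases.
y_true = [0, 1, 2, 3, 4]
y_pred = [0, 1, 2, 3, 2]
error_mse = mse(y_true, y_pred)    # (4 - 2)^2 / 5 = 0.8
error_rmse = rmse(y_true, y_pred)  # sqrt(0.8)
```

With integer-encoded classes, these metrics penalize a confusion between distant phases more heavily than one between neighbouring phases, which accuracy alone does not capture.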

3.3. $R^{2}$ Score

The $R^{2}$ score quantifies the fit between the model and the data, and is defined via the following expression:

R^{2} = 1 - \frac{S S_{res}}{S S_{tot}}

(3)

The sum of squared residuals, $S S_{res}$, measures the variation left unexplained by the model, while $S S_{tot}$ denotes the total variation of the observed values around their mean. An $R^{2}$ score of 1 indicates a perfect fit, while a score of 0 suggests that the model does not explain any variability. Additionally, a negative score signifies that the model performs worse than always predicting the mean [26,27,28].
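
A short sketch of Equation (3) makes the three reference cases concrete. The integer label encoding (0 to 4) and the toy predictions are assumptions for illustration only.

```python
def r2_score(y_true, y_pred):
    """Equation (3): 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred))
    ss_tot = sum((y - mean) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot

y_true = [0, 1, 2, 3, 4]                         # SS_tot = 10
r2_perfect = r2_score(y_true, [0, 1, 2, 3, 4])   # SS_res = 0
r2_mean = r2_score(y_true, [2, 2, 2, 2, 2])      # SS_res = SS_tot
r2_worse = r2_score(y_true, [4, 3, 2, 1, 0])     # SS_res = 40 > SS_tot
```

The three calls correspond to a perfect fit, a constant prediction of the mean, and a model that does worse than the mean, respectively.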

3.4. Accuracy

Accuracy is the proportion of correctly classified data points out of the total number of data points, and it is commonly used in classification tasks. For a binary problem, it is calculated as follows [29]:

Accuracy = \frac{TP + TN}{TP + TN + FP + FN}

(4)

The count of True Positives is represented by TP, the count of False Positives is denoted by FP, TN stands for True Negatives, and FN indicates False Negatives.

3.5. Confusion Matrix

A confusion matrix is a performance evaluation tool used in classification tasks to summarize the predictions of an ML model against the actual values. It tabulates the counts of TP, TN, FP, and FN predictions, which help assess the accuracy and reliability of the model [27,28,30].
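
For a five-class problem the confusion matrix is a 5 × 5 table, and accuracy generalizes from Equation (4) to the sum of the diagonal divided by the total count. The sketch below illustrates this with made-up labels; the integer class encoding is an assumption.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """cm[i][j] counts samples of true class i predicted as class j."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def accuracy(cm):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total

# Hypothetical labels for the five subphases, two samples per class.
y_true = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
y_pred = [0, 0, 1, 2, 2, 2, 3, 3, 4, 3]
cm = confusion_matrix(y_true, y_pred, n_classes=5)
acc = accuracy(cm)
```

Off-diagonal entries such as `cm[1][2]` show which pairs of subphases are confused, which is information the single accuracy figure hides.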

3.6. Scaling Techniques

Before applying ML algorithms, preprocessing the data with scaling techniques is essential. These methods standardize or normalize the features to ensure that every feature has an equal impact on the model, thereby enhancing its overall performance. The following are the two scaling techniques that are frequently utilized: Min–Max Scaling (MMS) and Standard Scaling (SS).

3.6.1. Min–Max Scaling (MMS)

MMS is a normalization technique that adjusts features by rescaling them within a defined range, typically from 0 to 1. This technique is especially valuable when features have varying units or scales, as it standardizes all features to the same range, preventing those with larger scales from disproportionately influencing the model [31,32].

X_{scaled} = \frac{X - X_{min}}{X_{max} - X_{min}}

(5)

In the given Expression (5), the initial feature value is denoted by X, the minimum value of the feature is represented as $X_{min}$, and the maximum value of the feature is denoted by $X_{max}$.
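
Equation (5) can be sketched as follows. The function name and the sample values are hypothetical; the sketch also assumes the minimum and maximum are learned from the training data only and reused for new data, which is common practice but is not stated explicitly in the text.

```python
def min_max_scale(train_col, new_col):
    """Equation (5), using the min and max learned from train_col.

    Assumes train_col is not constant (max > min).
    """
    lo, hi = min(train_col), max(train_col)
    return [(x - lo) / (hi - lo) for x in new_col]

# Hypothetical joint-angle values in degrees.
train_angles = [5.0, 20.0, 65.0]
scaled_train = min_max_scale(train_angles, train_angles)
scaled_new = min_max_scale(train_angles, [35.0])
```

Values from unseen data can fall outside the interval from 0 to 1 if they exceed the range observed during training.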

3.6.2. Standard Scaling (SS)

The SS approach normalizes the features, which are first adjusted by centering them at zero and then scaled to have a standard deviation of one. It works especially well when the dataset is normally distributed, guaranteeing that each feature has an equal impact on the model [32].

X_{scaled} = \frac{X - μ}{σ}

(6)

In Equation (6), X represents the original feature value, the mean of the feature is denoted by $μ$, and the standard deviation of the feature is indicated by $σ$.
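
A minimal sketch of Equation (6) is given below. It assumes the population standard deviation (dividing by n) and hypothetical sample values; whether the study used the population or sample estimate is not stated.

```python
import statistics

def standard_scale(train_col, new_col):
    """Equation (6), with mean and standard deviation learned from train_col."""
    mu = statistics.fmean(train_col)
    sigma = statistics.pstdev(train_col)  # population standard deviation
    return [(x - mu) / sigma for x in new_col]

# Hypothetical feature values with mean 5 and standard deviation 2.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scaled = standard_scale(values, values)
```

After scaling, the training column has zero mean and unit standard deviation, so features measured in different units contribute on a comparable scale.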

3.7. Dimensionality Reduction via Principal Component Analysis (PCA)

Unlike MMS and SS, PCA is not a scaling technique but a dimensionality reduction method. It transforms the original dataset into a new coordinate system where the axes (principal components) are orthogonal and ranked by the variance they capture: the first component captures the maximum possible variance, and each subsequent component captures the maximum remaining variance, so that dimensionality is reduced with minimal information loss. Since PCA is sensitive to feature scale, Standard Scaling (SS) was applied prior to the PCA transformation, ensuring proper feature normalization before dimensionality reduction. By simplifying the dataset while preserving most of its essential information, PCA can improve computational efficiency and model performance, particularly when dealing with numerous correlated features [33].

Z = X W

(7)

The original data matrix is denoted by X, the matrix of eigenvectors corresponding to the principal components is denoted by W, and the transformed data matrix consisting of the principal components is denoted by Z.
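
To make Equation (7) concrete, the sketch below computes the first principal component for the special case of two features, where the leading eigenvector of the covariance matrix has a closed form. The restriction to two features, the function name, and the toy data are assumptions for illustration; a general implementation would use an eigendecomposition routine from a numerical library.

```python
import math

def pca_first_component_2d(X):
    """Return (w, z): unit direction w of maximum variance and scores z = Xc w."""
    n = len(X)
    mx = sum(r[0] for r in X) / n
    my = sum(r[1] for r in X) / n
    Xc = [(r[0] - mx, r[1] - my) for r in X]      # centre each feature
    a = sum(u * u for u, _ in Xc) / (n - 1)       # variance of feature 1
    c = sum(v * v for _, v in Xc) / (n - 1)       # variance of feature 2
    b = sum(u * v for u, v in Xc) / (n - 1)       # covariance
    theta = 0.5 * math.atan2(2 * b, a - c)        # angle of the major axis
    w = (math.cos(theta), math.sin(theta))        # leading eigenvector
    z = [u * w[0] + v * w[1] for u, v in Xc]      # projection, Z = X W
    return w, z

# Two perfectly correlated toy features: all variance lies along (1, 1).
X = [(-2.0, -2.0), (-1.0, -1.0), (0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
w, z = pca_first_component_2d(X)
```

Here a single component reproduces the data without loss, which is the extreme case of the redundancy among correlated features that PCA exploits.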

4. Adopted Machine Learning Techniques

To ensure a robust and comprehensive gait classification, we selected a diverse set of ML algorithms, each with distinct advantages. The k-NNs algorithm was chosen for its simplicity and effectiveness in handling non-linear decision boundaries, making it suitable for gait patterns with subtle variations. LR, as a probabilistic classifier, provides interpretable predictions and works well for linearly separable data. DT and RF were included due to their capability to model complex decision boundaries while handling high-dimensional data efficiently. RF, as an ensemble method, further enhances prediction accuracy and reduces overfitting by aggregating multiple DTs. SVM was employed for its strong theoretical foundation in maximizing class separation through hyperplane optimization, which is particularly effective when gait features exhibit clear distinctions. NB, leveraging the probabilistic framework, offers fast and efficient classification, especially for datasets with independent features. LDA and QDA were chosen for their ability to project gait data into lower-dimensional spaces while maintaining class separability. LDA assumes a shared covariance structure across classes, making it useful when feature distributions follow Gaussian assumptions, whereas QDA allows for individual class covariance matrices, enabling greater flexibility in capturing complex gait variations. Each of these models was selected based on their ability to handle the multi-faceted nature of gait classification, balancing interpretability, computational efficiency, and predictive performance. Their strengths and limitations were carefully considered to ensure a well-rounded analysis of gait patterns in our study.

In classification models, training datasets serve as the foundation for constructing ML models capable of categorizing or classifying new samples [29,34]. Following this approach, we curated training and testing datasets and employed a variety of classification algorithms (k-NN, LR, DT, RF, SVM, NB, LDA, QDA). These carefully selected algorithms played a crucial role in accurately classifying the stages of walking in our study, utilizing insights from the dataset to generate accurate predictions and enhance our comprehension of walking patterns [35].

4.1. k-Nearest Neighbors (k-NNs)

The k-NNs algorithm classifies a sample by identifying the majority class among its k-Nearest Neighbors in the feature space. This non-parametric method relies entirely on the proximity of data points, determined by a chosen distance metric such as Euclidean, Manhattan, or Minkowski distance. The algorithm operates by calculating the distance between the sample and all training points, selecting the k closest points, and assigning the class that occurs most frequently among these neighbors. The prediction is determined by the majority vote of these k-Nearest Neighbors, making k-NN a straightforward and intuitive approach to classification tasks.
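
The procedure just described (compute distances, keep the k closest points, take a majority vote) can be sketched as follows. The Euclidean metric, k = 3, and the toy two-feature data are assumptions chosen for illustration.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points closest to x (Euclidean)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical, already-scaled feature vectors with two classes.
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["stance", "stance", "stance", "swing", "swing", "swing"]
```

Because the vote depends on raw distances, features with larger numeric ranges dominate unless the data are scaled first, which is one reason the choice of scaling technique can affect k-NN noticeably.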

4.2. Logistic Regression (LR)

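In its basic form, LR predicts the probability of a binary outcome using a logistic function; multi-class problems such as the five-phase task considered here are typically handled through a one-vs-rest or multinomial (softmax) extension. The binary formula is given as follows: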

y = \frac{e^{b_{0} + b_{1} x}}{1 + e^{b_{0} + b_{1} x}},

(8)

where $b_{0}$ and $b_{1}$ are the intercept and slope, respectively.
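
Equation (8) can be evaluated directly, as in the sketch below. The `softmax` helper shows one common multi-class generalization; it is included as an assumption about how a five-class problem might be handled, since the text does not specify the formulation used.

```python
import math

def logistic(x, b0, b1):
    """Equation (8): probability of the positive class for input x."""
    z = b0 + b1 * x
    return 1.0 / (1.0 + math.exp(-z))  # algebraically equal to e^z / (1 + e^z)

def softmax(scores):
    """Turn one linear score per class into class probabilities."""
    m = max(scores)                    # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

For example, `logistic(2.0, -2.0, 1.0)` evaluates the function at $b_{0} + b_{1} x = 0$, the point where both outcomes are equally likely.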

4.3. Decision Tree (DT)

The DT algorithm uses a hierarchical structure where nodes represent features and branches represent values. Based on the input’s feature value, the tree splits at each node until a leaf node is reached, assigning the sample to a class. The splitting logic can be represented as follows:

if (Feature \leq Threshold) : Go to left branch; else : Go to right branch .
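
The splitting rule above can be followed from root to leaf with a short traversal routine. The tree below is hand-built for illustration: the feature names, thresholds, and placeholder class labels are hypothetical and are not learned from the dataset.

```python
def predict(node, sample):
    """Walk from the root to a leaf, going left when feature <= threshold."""
    while "label" not in node:
        go_left = sample[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]

# Hypothetical tree with two internal nodes and three leaves.
tree = {
    "feature": "knee_flexion", "threshold": 30.0,
    "left": {"label": "phase A"},
    "right": {
        "feature": "hip_flexion", "threshold": 10.0,
        "left": {"label": "phase B"},
        "right": {"label": "phase C"},
    },
}
```

A training algorithm would choose the features and thresholds automatically, typically by minimizing an impurity measure at each node; only the prediction step is shown here.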

4.4. Random Forest (RF)

RF improves classification accuracy by aggregating predictions from multiple DTs. This ensemble method reduces overfitting and ensures robustness. The final prediction is computed as follows:

Prediction = \frac{1}{N} \sum_{i = 1}^{N} {DecisionTree}_{i} (Input),

(9)

where N is the number of DTs.
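
For a classifier, one way to read Equation (9) is as an average of per-tree class-probability vectors followed by selecting the most probable class. The sketch below assumes that reading; the per-tree outputs are made-up numbers, and majority voting over hard labels is an equally common alternative.

```python
def forest_predict(tree_probas):
    """Average per-tree class probabilities, then return (best_class, averages)."""
    n_trees = len(tree_probas)
    n_classes = len(tree_probas[0])
    avg = [sum(p[c] for p in tree_probas) / n_trees for c in range(n_classes)]
    best = max(range(n_classes), key=avg.__getitem__)
    return best, avg

# Hypothetical outputs of N = 4 trees over five subphases for one sample.
tree_probas = [
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0],
]
label, avg = forest_predict(tree_probas)
```

Although individual trees disagree, the averaged vote is less sensitive to any single tree's error, which is the mechanism behind the reduced overfitting mentioned above.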

4.5. Support Vector Machine (SVM)

SVM constructs a hyperplane that separates the data into distinct classes with maximum margin. The decision boundary is defined by the following equation:

w^{T} x + c = 0,

(10)

where w represents the weight vector, x is the input, and c is the bias term.
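
Equation (10) defines the boundary itself; a binary prediction follows from the sign of $w^{T} x + c$. The sketch below uses hypothetical weights for a linear SVM and does not cover training, kernels, or the way several binary SVMs are combined for five classes, none of which is specified in the text.

```python
def svm_decision(w, c, x):
    """Signed score w^T x + c; zero means x lies on the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + c

def svm_classify(w, c, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if svm_decision(w, c, x) >= 0 else -1

# Hypothetical hyperplane x1 - x2 = 0 in a two-feature space.
w, c = (1.0, -1.0), 0.0
```

The magnitude of the score, divided by the norm of w, gives the distance of the sample from the boundary, which is the quantity the maximum-margin criterion is built on.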

4.6. Naive Bayes (NB)

NB is a probabilistic classifier based on Bayes’ theorem. It assumes independence among features and calculates the posterior probability of a class z as follows:

P (z | x_{1}, \dots, x_{n}) \propto P (z) \prod_{i = 1}^{n} P (x_{i} | z),

(11)

where $P(z)$ is the prior probability and $P(x_{i} | z)$ is the likelihood of feature $x_{i}$ given class z.
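Equation (11) is usually evaluated in log space, where the product of likelihoods becomes a sum. The sketch below assumes the Gaussian variant of NB with per-feature means and variances; the class parameters are hypothetical values chosen for illustration rather than estimates from the gait dataset.

```python
import math

def gaussian_log_pdf(x, mean, var):
    """Log of the univariate normal density."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)

def nb_log_posterior(x, prior, means, variances):
    """Unnormalized log posterior: log P(z) + sum_i log P(x_i | z)."""
    log_p = math.log(prior)
    for xi, m, v in zip(x, means, variances):
        log_p += gaussian_log_pdf(xi, m, v)
    return log_p

# Hypothetical per-class parameters for two features.
classes = {
    0: {"prior": 0.5, "means": (0.0, 0.0), "variances": (1.0, 1.0)},
    1: {"prior": 0.5, "means": (3.0, 3.0), "variances": (1.0, 1.0)},
}

def nb_predict(x):
    """Pick the class with the largest unnormalized log posterior."""
    return max(classes, key=lambda z: nb_log_posterior(x, **classes[z]))

print(nb_predict((0.5, -0.2)), nb_predict((2.8, 3.1)))
```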

4.7. Linear Discriminant Analysis (LDA)

LDA projects input data onto a lower-dimensional space to maximize class separability. It assumes that classes share the same covariance matrix. The formula for the projection is

y = W^{T} x,

(12)

where $W$ is the projection matrix that maximizes the between-class variance relative to the within-class variance.
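For two classes, the projection in Equation (12) reduces to a single direction, Fisher's discriminant, proportional to $S_{w}^{-1} (\mu_{1} - \mu_{0})$, where $S_{w}$ is the pooled within-class scatter matrix. The sketch below computes it for two-dimensional toy data; the points are hypothetical, and the five-class case used in the study would require the general multi-class formulation, which is not shown.

```python
def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]

def scatter(points, mu):
    """2x2 scatter matrix: sum of (x - mu)(x - mu)^T over the points."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mu[0], p[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class0, class1):
    """Direction w proportional to S_w^{-1} (mu1 - mu0) for two classes."""
    mu0, mu1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, mu0), scatter(class1, mu1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu1[0] - mu0[0], mu1[1] - mu0[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

def project(w, x):
    """One-dimensional projection y = w^T x."""
    return w[0] * x[0] + w[1] * x[1]

# Toy data (illustrative only).
class0 = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.5)]
class1 = [(6.0, 5.0), (7.0, 8.0), (8.0, 6.5), (7.0, 6.0)]
w = fisher_direction(class0, class1)
y0 = [project(w, p) for p in class0]
y1 = [project(w, p) for p in class1]
print(w, max(y0), min(y1))
```

On this toy data, every projected point of the first class lies below every projected point of the second class, so a single threshold on y separates them.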

4.8. Quadratic Discriminant Analysis (QDA)

QDA extends LDA by allowing each class to have its own covariance matrix. The decision function for QDA is given by

\delta_{k}(x) = -\frac{1}{2} \log |\Sigma_{k}| - \frac{1}{2} (x - \mu_{k})^{T} \Sigma_{k}^{-1} (x - \mu_{k}) + \log P_{k},

(13)

where $\Sigma_{k}$ is the covariance matrix of class k, $\mu_{k}$ is the mean vector, and $P_{k}$ is the prior probability of class k.
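Equation (13) can be evaluated directly once the class parameters are known; the predicted class is the one with the largest score. The sketch below assumes two-dimensional inputs so that the determinant and inverse of the covariance matrix can be written out by hand. All parameter values are hypothetical.

```python
import math

def qda_score(x, mean, cov, prior):
    """Equation (13) for a 2-D input x and a 2x2 covariance matrix."""
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    inv = [[cov[1][1] / det, -cov[0][1] / det],
           [-cov[1][0] / det, cov[0][0] / det]]
    d = [x[0] - mean[0], x[1] - mean[1]]
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu).
    quad = (d[0] * (inv[0][0] * d[0] + inv[0][1] * d[1])
            + d[1] * (inv[1][0] * d[0] + inv[1][1] * d[1]))
    return -0.5 * math.log(det) - 0.5 * quad + math.log(prior)

# Hypothetical parameters: each class has its own covariance matrix.
params = {
    0: {"mean": (0.0, 0.0), "cov": [[1.0, 0.0], [0.0, 1.0]], "prior": 0.5},
    1: {"mean": (3.0, 3.0), "cov": [[2.0, 0.5], [0.5, 1.0]], "prior": 0.5},
}

def qda_predict(x):
    """Return the class with the largest discriminant score."""
    return max(params, key=lambda k: qda_score(x, **params[k]))

print(qda_predict((0.5, 0.5)), qda_predict((3.0, 2.5)))
```

If all classes shared one covariance matrix, the log-determinant and the quadratic term in x would be common to every class and the decision boundaries would become linear, which recovers LDA.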

5. Results and Discussions: A Comparative Analysis

The following analysis compares the performance of various ML models across different scaling techniques (MMS, SS, and PCA) using both the first and second approaches. The key metrics considered include CV score, MSE, RMSE, accuracy, and the $R^{2}$ score, as listed in Tables 2–7.
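To make the relationship between these metrics concrete, the sketch below computes accuracy, MSE, RMSE, and $R^{2}$ from true and predicted labels. It assumes the standard textbook definitions and that the five subphases are encoded as the integers 0 to 4 (the class indices used in the confusion-matrix discussion); the label vectors are invented for the example.

```python
import math

def evaluate(y_true, y_pred):
    """Accuracy, MSE, RMSE, and R^2 for integer-encoded class labels."""
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot
    return {"accuracy": accuracy, "mse": mse, "rmse": rmse, "r2": r2}

# Invented labels: two errors, one adjacent (1 -> 2) and one distant (4 -> 0).
y_true = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
y_pred = [0, 1, 2, 3, 4, 0, 2, 2, 3, 0]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

Under this encoding, accuracy counts both errors equally, whereas MSE, RMSE, and $R^{2}$ penalize the distant confusion far more than the adjacent one. Two models with similar accuracy can therefore differ noticeably in RMSE and $R^{2}$.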

With MMS in the first approach, RF demonstrated the best overall performance, with the highest accuracy of 94.95%, the lowest RMSE of 0.4461, and the highest $R^{2}$ score of 90.09%, indicating its superior predictive power. k-NNs also performed well with an accuracy of 93.51%, but RF outperformed it in minimizing prediction error. With MMS in the second approach, SVM achieved the best results, with the highest accuracy of 90.04% and the lowest RMSE of 0.8590. LR also performed competitively, with an accuracy of 89.64% and the highest $R^{2}$ score of 62.00%, showing its ability to explain variability effectively.

With SS in the first approach, RF again outperformed the other models, with the highest accuracy of 94.90%, the lowest RMSE of 0.4549, and a strong $R^{2}$ score of 89.70%, highlighting its robust predictive performance. SVM was close behind, with an accuracy of 93.56% and a competitive RMSE. With SS in the second approach, SVM took the lead with the highest accuracy of 90.24% and the lowest RMSE of 0.8465. LR followed closely with an accuracy of 90.04% and the highest $R^{2}$ score of 64.29%, showing its ability to handle variability effectively.

With PCA, RF performed the best in the first approach, with an accuracy of 94.95%, the lowest RMSE of 0.4377, and a solid $R^{2}$ score of 90.46%. In the second approach, SVM achieved the highest accuracy of 89.89% and the lowest RMSE of 0.8685, while LR maintained the highest $R^{2}$ score of 65.79%, demonstrating strong performance in explaining variance.

The confusion matrices obtained with MMS for the different ML classifiers under the two training methods, shown in Figures 7–14, highlight variations in correct classifications across models.

For k-NNs, the first method correctly classifies 97.75% of class 1, with 2.25% misclassified as class 2; the second method correctly classifies 91.25% of class 2, with 6.25% misclassified as class 3. LR correctly classifies 99.29% of class 0 in the first method, with 0.71% misclassified as class 4, and 92.89% of class 3 in the second method, with 4.21% misclassified as class 2. DT correctly classifies 95.25% of class 1 in the first method, with 4.75% misclassified as class 2, and 92.11% of class 3 in the second method, with 5.26% misclassified as class 2. RF correctly classifies 98.25% of class 1 in the first method, with 1.75% misclassified as class 2; in the second method, it achieves a correct classification rate of 92%, with 6.25% misclassified as class 3. SVM correctly classifies 97.62% of class 2 in the first method, with 2.38% misclassified as class 3, and 92.89% of class 3 in the second method, with 4.21% misclassified as class 2. NB correctly classifies 96.58% of class 3 in the first method, with 1.84% misclassified as class 2, and 94.21% of class 3 in the second method, with a misclassification rate of 3.42%. LDA correctly classifies 99.21% of class 3 in the first method, with 0.79% misclassified as class 4, and 95.53% of class 3 in the second method, with 3.16% misclassified as class 0. QDA correctly classifies 96.67% of class 2 in the first method, with 3.33% misclassified as class 3, and 91.50% of class 0 in the second method, with 2.50% misclassified as class 4.
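The percentages quoted above are row-wise rates of a confusion matrix: each entry is divided by the total number of samples of the corresponding true class. The sketch below shows this normalization, assuming rows index the true class and columns the predicted class; the counts are hypothetical and are not taken from Figures 7–14.

```python
def per_class_rates(confusion):
    """Row-normalize a confusion matrix to percentages (rows = true class)."""
    rates = []
    for row in confusion:
        total = sum(row)
        rates.append([100.0 * count / total for count in row])
    return rates

# Hypothetical 3-class counts (illustrative only).
confusion = [
    [90, 10, 0],
    [5, 80, 15],
    [0, 20, 180],
]
rates = per_class_rates(confusion)
for true_class, row in enumerate(rates):
    print(true_class, [round(r, 2) for r in row])
```

The diagonal entry of each row is the correct classification rate of that class, and the off-diagonal entries show which classes it is most often confused with.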

6. Discussion and Comparative Analysis

The comparative analysis highlights the strong performance of RF across all scaling techniques, particularly under the first training method, where it achieved the highest accuracy (up to 94.95%), the lowest RMSE (down to 0.4377), and strong $R^{2}$ scores (up to 90.46%), demonstrating its robustness and adaptability. In comparison, Xia et al. [16] attained a slightly higher accuracy of 95.09% using a CNN-BiLSTM model. This was enabled by combining local feature extraction from CNN with temporal feature modeling from BiLSTM, utilizing inertial measurement units and plantar pressure sensors for data acquisition. However, the complexity and increased hardware requirements of this approach may limit its practicality in certain scenarios. SVM also showed strong performance in this study, particularly under SS and PCA, with high accuracy and low RMSE, and it achieved the highest accuracy under the second training method for all three scaling techniques. Similarly, LR and k-NNs performed well but did not match the predictive accuracy of RF and SVM. In comparison, the study by Ma et al. [15] demonstrated real-time robustness and generalization capabilities using joint angular sensors in the hip and knee joints, achieving an accuracy of 88.71%. However, its accuracy falls short of that of RF and of advanced models like CNN-BiLSTM. In contrast, NB, LDA, and QDA underperformed in this study, with higher error rates and lower $R^{2}$ scores.

These findings emphasize the importance of selecting appropriate preprocessing techniques, training strategies, and ML models. While RF is the most adaptable and robust in the current study, integrating advanced architectures like CNN-BiLSTM could enhance performance if hardware complexity is appropriately managed. Furthermore, the practical implications of these results can be extended to clinical gait analysis, where RF’s strong predictive capabilities could support real-time gait monitoring in rehabilitation settings. The implementation of RF in wearable devices could aid in the early detection of abnormal gait patterns, enabling timely intervention for patients with mobility impairments. Additionally, hybrid models combining traditional ML and DL approaches may offer a balance between computational efficiency and classification accuracy, making them viable solutions for real-world deployment.

Future research could explore domain adaptation techniques to improve model generalization across diverse populations, as well as multi-sensor data fusion strategies that integrate data from inertial sensors, force plates, and depth cameras. By addressing these challenges, gait classification models can become more robust and applicable to broader clinical and biomechanical applications.

7. Conclusions

This study provides a comprehensive analysis of gait phase classification using ML techniques, emphasizing the importance of scaling methods and training strategies. Among the evaluated models, RF consistently outperformed others across all scaling techniques, achieving high accuracy, low error rates, and robust $R^{2}$ scores. SVM also demonstrated strong performance, particularly under Standard Scaling and PCA, highlighting its potential for complex classification tasks. Key insights from this study include the impact of scaling techniques like Min–Max Scaling, Standard Scaling, and PCA on model performance, and the importance of selecting appropriate training strategies to ensure robust and generalizable results. These findings underscore the critical role of preprocessing and algorithm selection in optimizing classification accuracy for clinical and real-world applications.

Beyond theoretical contributions, this study has significant practical advantages, particularly in the fields of biomechanics, rehabilitation, and assistive technologies. The proposed machine learning models can be integrated into real-time gait monitoring systems, enabling continuous tracking of gait patterns and early detection of abnormalities. Additionally, these models offer valuable support for clinicians in assessing rehabilitation progress for patients with neurological disorders such as Parkinson’s disease and stroke. The insights gained from this research also contribute to the development of intelligent prosthetics and robotic exoskeletons, enhancing mobility assistance for individuals with motor impairments. Furthermore, accurate gait phase classification can play a key role in fall prevention systems by assessing gait stability and issuing alerts for individuals at high risk of falls, particularly among elderly or mobility-impaired populations. Finally, in sports science, these models can optimize athletic performance and minimize injury risks through biomechanical gait analysis, demonstrating the broad applicability of this research in both medical and non-medical fields.

Future research could focus on integrating DL architectures, such as CNNs and recurrent neural networks (RNNs), to capture temporal and spatial dependencies in gait data. Additionally, exploring hybrid models and multi-modal datasets could further enhance the accuracy and reliability of gait phase classification. Finally, the implementation of these models in real-time systems for rehabilitation, sports analytics, and patient monitoring represents a promising avenue for translating these insights into practical applications.

Author Contributions

A.M. and J.N.; formal analysis, A.M., J.N. and H.G.; investigation, A.M., J.N. and H.G.; methodology, A.M., J.N. and H.G.; project administration, H.G.; software, A.M., J.N. and H.G.; supervision, J.N. and H.G.; validation, J.N. and H.G.; visualization, J.N. and H.G.; writing—original draft, A.M.; writing—review and editing, A.M., J.N. and H.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Kalita, B.; Narayan, J.; Dwivedy, S.K. Development of active lower limb robotic-based orthosis and exoskeleton devices: A systematic review. Int. J. Soc. Robot. 2021, 13, 775–793.
2. Bauman, V.V.; Brandon, S.C.E. Gait Phase Detection in Walking and Stairs Using Machine Learning. J. Biomech. Eng. 2022, 144, 121007.
3. Bhoir, A.A.; Mishra, T.A.; Narayan, J.; Dwivedy, S.K. Machine Learning Algorithms in Human Gait Analysis. In Encyclopedia of Data Science and Machine Learning; IGI Global: Hershey, PA, USA, 2023; pp. 922–937.
4. Semwal, V.B.; Jain, R.; Maheshwari, P.; Khatwani, S. Gait reference trajectory generation at different walking speeds using LSTM and CNN. Multimed. Tools Appl. 2023, 82, 33401–33419.
5. Narayan, J.; Gritli, H.; Dwivedy, S.K. Lower Limb Joint Torque Estimation via Bayesian Regularized Backpropagation Neural Networks. In Proceedings of the 2024 IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation (IATMSI), Gwalior, India, 14–16 March 2024; Volume 2, pp. 1–6.
6. Narayan, J.; Jhunjhunwala, S.; Mishra, S.; Dwivedy, S.K. 5—A comparative performance analysis of backpropagation training optimizers to estimate clinical gait mechanics. In Predictive Modeling in Biomedical Data Mining and Analysis; Roy, S., Goyal, L.M., Balas, V.E., Agarwal, B., Mittal, M., Eds.; Advanced Studies in Complex Systems: Theory and Applications; Academic Press: San Diego, CA, USA, 2022; pp. 83–104.
7. Park, H.; Han, S.; Sung, J.; Hwang, S.; Youn, I.; Kim, S.J. Classification of Gait Phases Based on a Machine Learning Approach Using Muscle Synergy. Front. Hum. Neurosci. 2023, 17, 1201935.
8. Zhang, Y.; Cao, G.; Ling, Z.; Li, W.; Cheng, H.; He, B.; Cao, S.; Zhu, A. A Multi-Information Fusion Method for Gait Phase Classification in Lower Limb Rehabilitation Exoskeleton. Front. Neurorobot. 2021, 15, 692539.
9. Hwang, S.; Kim, J.; Yang, S.; Moon, H.J.; Cho, K.H.; Youn, I.; Sung, J.K.; Han, S. Machine Learning Based Abnormal Gait Classification with IMU Considering Joint Impairment. Sensors 2024, 24, 5571.
10. Farah, J.D.; Baddour, N.; Lemaire, E.D. Gait phase detection from thigh kinematics using machine learning techniques. In Proceedings of the 2017 IEEE International Symposium on Medical Measurements and Applications (MeMeA), Rochester, MN, USA, 7–10 May 2017; pp. 263–268.
11. Jung, E.; Lin, C.; Contreras, M.; Teodorescu, M. Applied Machine Learning on Phase of Gait Classification and Joint-Moment Regression. Biomechanics 2022, 2, 44–65.
12. Mekni, A.; Narayan, J.; Gritli, H. Binary and Ternary Human Gait Phase Classification Using Machine Learning Algorithms. In Proceedings of the 2024 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT), Sakhir, Bahrain, 17–19 November 2024; pp. 80–87.
13. Mekni, A.; Narayan, J.; Gritli, H. Multi-Class Gait Phase Recognition using Machine Learning Models with Two Training Methods. In Proceedings of the 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT), Kamand, India, 24–28 June 2024; pp. 1–6.
14. Mekni, A.; Narayan, J.; Gritli, H. Leveraging Machine Learning for Gait Phase Classification with Varied Training Methods. In Proceedings of the 2024 IEEE 7th International Conference on Advanced Technologies, Signal and Image Processing (ATSIP), Sousse, Tunisia, 11–13 July 2024; Volume 1, pp. 582–587.
15. Ma, Y.; Wu, X.; Wang, C.; Yi, Z.; Liang, G. Gait Phase Classification and Assist Torque Prediction for a Lower Limb Exoskeleton System Using Kernel Recursive Least-Squares Method. Sensors 2019, 19, 5449.
16. Xia, Y.; Li, J.; Yang, D.; Wei, W. Gait Phase Classification of Lower Limb Exoskeleton Based on a Compound Network Model. Symmetry 2023, 15, 163.
17. Zheng, X.; Wilhelm, E.; Otten, E.; Reneman, M.F.; Lamoth, C.J. Age-related gait patterns classification using deep learning based on time-series data from one accelerometer. Biomed. Signal Process. Control. 2024, 102, 107406.
18. Zheng, X.; Otten, E.; Reneman, M.F.; Lamoth, C.J. Explaining deep learning models for age-related gait classification based on acceleration time series. Comput. Biol. Med. 2025, 184, 109338.
19. Fricke, C.; Alizadeh, J.; Zakhary, N.; Woost, T.B.; Bogdan, M.; Classen, J. Evaluation of three machine learning algorithms for the automatic classification of EMG patterns in gait disorders. Front. Neurol. 2021, 12, 666458.
20. Krutaraniyom, S.; Sengchuai, K.; Booranawong, A.; Jaruenpunyasak, J. Pilot Study on Gait Classification Using Machine Learning. In Proceedings of the 2022 International Electrical Engineering Congress (iEECON), Khon Kaen, Thailand, 9–11 March 2022; pp. 1–4.
21. Bahadori, S.; Williams, J.; Wainwright, T. Lower Limb Kinematic, Kinetic and Spatial-temporal Gait Data for Healthy Adults Using a Self-paced Treadmill. Mendeley Data, V1. 2020. Available online: https://data.mendeley.com/datasets/wwnvw28n2m/1 (accessed on 10 December 2024).
22. Bahadori, S.; Williams, J.M.; Wainwright, T.W. Lower limb kinematic, kinetic and spatial-temporal gait data for healthy adults using a self-paced treadmill. Data Brief 2021, 34, 106613.
23. Fu, Y.; Zhang, R.; Xia, J.; Gough, A.; Clark, S.; Upadhyay, A.; Enemali, G.; Armstrong, I.; Ahmed, I.; Pourkashanian, M.; et al. Hybrid model-driven spectroscopic network for rapid retrieval of turbine exhaust temperature. IEEE Trans. Instrum. Meas. 2023, 72, 1–10.
24. Babu, S.S.; Nutakki, C.; Diwakar, S. Classification of Human Gait: Swing and Stance Phases using Sum-Vector Analysis. Procedia Comput. Sci. 2020, 171, 403–409.
25. Huang, A.; Cao, Z.; Zhao, W.; Zhang, H.; Xu, L. Frequency division multiplexing and main peak scanning WMS method for TDLAS tomography in flame monitoring. IEEE Trans. Instrum. Meas. 2021, 69, 9087–9096.
26. Hossin, M.; Sulaiman, M.N. A Review On Evaluation Metrics For Data Classification Evaluations. Int. J. Data Min. Knowl. Manag. Process. 2015, 5, 1–11.
27. Shaha, R.; Talukder, D.; Iqbal, M.A.; Haque, M.M. TOS: A Relative Metric Approach for Model Selection in Machine Learning Solutions. In Proceedings of the 2021 IEEE International Conference on Robotics, Automation, Artificial-Intelligence and Internet-of-Things (RAAICON), Dhaka, Bangladesh, 3–4 December 2021; pp. 26–31.
28. Abaker, M.; Dafaalla, H.; Eisa, T.A.E.; Abdelgader, H.; Mohammed, A.; Burhanur, M.; Hasabelrsoul, A.; Alfakey, M.I.; Morsi, M.A. Deep Learning- and IoT-Based Framework for Rock-Fall Early Warning. Appl. Sci. 2023, 13, 9978.
29. Tharwat, A. Classification assessment methods. Appl. Comput. Inform. 2021, 17, 168–192.
30. Hicks, S.A.; Strümke, I.; Thambawita, V.; Hammou, M.; Riegler, M.A.; Halvorsen, P.; Parasa, S. On evaluation metrics for medical applications of artificial intelligence. Sci. Rep. 2022, 12, 5979.
31. de Amorim, L.B.; Cavalcanti, G.D.; Cruz, R.M. The choice of scaling technique matters for classification performance. Appl. Soft Comput. 2023, 133, 109924.
32. Ahsan, M.M.; Mahmud, M.A.P.; Saha, P.K.; Gupta, K.D.; Siddique, Z. Effect of Data Scaling Methods on Machine Learning Algorithms and Model Performance. Technologies 2021, 9, 52.
33. Greenacre, M.; Groenen, P.J.F.; Hastie, T.; D’Enza, A.I.; Markos, A.; Tuzhilina, E. Principal component analysis. Nat. Rev. Methods Prim. 2022, 2, 100.
34. Pratap, S.; Narayan, J.; Hatta, Y.; Ito, K.; Hazarika, S.M. From Tactile Signals to Grasp Classification: Exploring Patterns with Machine Learning. In Proceedings of the 2024 IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation (IATMSI), Gwalior, India, 14–16 March 2024; Volume 2, pp. 1–6.
35. Mahesh, B. Machine learning algorithms-a review. Int. J. Sci. Res. 2020, 9, 381–386.

Figure 1. Different gait phases of human walking [1,6].

Figure 2. Left- and right-hip flexion/extension.

Figure 3. Hip abduction/adduction (left and right).

Figure 4. Hip rotation (left and right).

Figure 5. Knee flexion/extension (left and right).

Figure 6. Ankle dorsiflexion/plantarflexion (left and right).

Figure 7. Confusion matrices of the k-NNs algorithm using the first and second training methods.

Figure 8. Confusion matrices of the LR classification algorithm using the first and second training methods.

Figure 9. Confusion matrices of the DT classifier using the first and second training methods.

Figure 10. Confusion matrices obtained using the RF classification algorithm for the first and second training methods.

Figure 11. Confusion matrices obtained using the SVM classifier for the first and second training methods.

Figure 12. Confusion matrices obtained using the NB classifier for the first and second training methods.

Figure 13. Confusion matrices obtained using the LDA classification algorithm for the first and second training methods.

Figure 14. Confusion matrices obtained using the QDA classifier for the first and second training methods.

Table 1. Mean values for various features classified by gender.

Feature	Female (Mean)	Male (Mean)
Height (cm)	166.3	177.9
Weight (kg)	64.1	75.4
Right Leg Length (cm)	86.9	89.2
Left Leg Length (cm)	87.0	89.2
Knee Width (cm)	9.3	9.2
Ankle Width (cm)	6.4	7.0

Table 2. Performance of various ML models with Min–Max Scaling (first approach).

Algorithms	CV Score	MSE	RMSE	Accuracy (%)	$R^{2}$ Score (%)
k-NN	0.8860	0.2282	0.4777	93.51	88.64
LR	0.9080	0.3010	0.5486	91.44	85.02
DT	0.8465	0.2733	0.5227	91.83	86.40
RF	0.8990	0.1990	0.4461	94.95	90.09
SVM	0.9130	0.2312	0.4808	93.22	88.49
NB	0.8679	0.4218	0.6494	87.52	79.01
LDA	0.8875	0.3426	0.5853	89.50	82.95
QDA	0.8969	0.3223	0.5677	91.53	83.96
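As a quick consistency check on Table 2, the RMSE column should equal the square root of the MSE column up to rounding. The sketch below verifies this for the eight rows and identifies the model with the lowest RMSE; the values are copied from the table, and the tolerance of 5e-4 is an assumption chosen to absorb four-decimal rounding.

```python
import math

# (MSE, RMSE) pairs copied from Table 2.
table2 = {
    "k-NN": (0.2282, 0.4777),
    "LR": (0.3010, 0.5486),
    "DT": (0.2733, 0.5227),
    "RF": (0.1990, 0.4461),
    "SVM": (0.2312, 0.4808),
    "NB": (0.4218, 0.6494),
    "LDA": (0.3426, 0.5853),
    "QDA": (0.3223, 0.5677),
}

# Largest gap between sqrt(MSE) and the reported RMSE across all models.
max_gap = max(abs(math.sqrt(mse) - rmse) for mse, rmse in table2.values())
# Model with the lowest reported RMSE.
best = min(table2, key=lambda name: table2[name][1])
print(max_gap, best)
```

The largest gap stays below the rounding tolerance, and the lowest RMSE belongs to RF, in agreement with the discussion of the first approach under MMS.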