Abstract
Differentiating genuine hemiplegic gait (HG) in stroke survivors from hemiplegic-like gait voluntarily imitated by healthy adults (MHG) is essential for reliable assessment and intervention planning. Treadmill-based gait data were obtained from 79 participants: 39 stroke patients (HG) and 40 healthy adults instructed to mimic HG (MHG). Forty-eight spatiotemporal and force-related variables were extracted. Random Forest, support vector machine (SVM), and logistic regression classifiers were trained with (i) the full feature set and (ii) the 10 most important features selected via Random Forest Gini importance. Performance was assessed with 5-fold stratified cross-validation and an 80/20 hold-out test, using accuracy, F1-score, and the area under the receiver operating characteristic curve (AUC). All models achieved high discrimination (AUC > 0.93). The SVM attained perfect discrimination (AUC = 1.000, test set) with the full feature set and maintained excellent discrimination (AUC = 0.983) with only the top 10 features. Temporal asymmetries, delayed vertical ground reaction force peaks, and mediolateral spatial instability ranked highest in importance. Reduced-feature models showed negligible performance loss, highlighting their parsimony and interpretability. Supervised machine learning algorithms can accurately distinguish true hemiplegic gait from mimicked patterns using a compact subset of gait features. The findings support data-driven, time-efficient gait assessments for clinical neurorehabilitation and for validating experimental protocols that rely on gait imitation.
1. Introduction
Stroke refers to neurological symptoms caused by damage to brain tissue due to either the blockage of blood vessels (ischemic stroke) or their rupture (hemorrhagic stroke) []. It is one of the leading causes of long-term disability, with more than 80% of survivors experiencing gait disturbances []. Even with continuous rehabilitation, approximately 18% of stroke patients are unable to walk, 11% require assistance, and only about half can regain independent walking ability []. The most common gait abnormality among stroke survivors is hemiplegic gait, which results from muscular weakness, impaired motor control, proprioceptive deficits, and increased muscle tone on the affected side []. This typically manifests as a flexor synergy in the upper limb and an extensor synergy in the lower limb, leading to joint patterns such as hip extension and internal rotation, knee extension, and ankle plantarflexion and inversion []. To achieve foot clearance during swing, patients often compensate through hip hiking or circumduction []. These compensatory strategies alter gait mechanics, causing increased energy expenditure and fatigue, and consequently elevate the risk of falls []. Therefore, the recovery of gait ability is a critical objective in stroke rehabilitation to promote independent daily functioning []. Recent studies have focused on quantifying gait impairments and identifying key variables for recovery using advanced analytical methods, including gait analysis systems, wearable sensors, and instrumented treadmills []. Given the distinguishable gait characteristics of stroke survivors, researchers have applied machine learning (ML) and deep learning techniques to develop automated algorithms capable of identifying gait deviations, predicting recovery outcomes, and classifying stroke severity []. 
These data-driven approaches have demonstrated promising performance; however, existing models primarily focus on distinguishing normal and pathological gait patterns under natural conditions. In real-world clinical and administrative settings, there are cases where patients may exaggerate or mimic hemiplegic gait, either consciously or due to external incentives such as insurance claims or return-to-work assessments. A previous study documented abnormal behaviors during rehabilitation by a stroke patient involved in a traffic accident [], raising concerns about potential malingering or fraud. Other reports suggest that caregivers may also influence evaluations, as in Munchausen Syndrome by Proxy []. It is particularly challenging to detect mimicked hemiplegic gait visually because it is relatively easy to reproduce spatiotemporal patterns such as foot drop or pelvic tilting. Kinematic imitation involving joint angles can often be controlled voluntarily, making it difficult for clinicians to reliably differentiate between neurologically impaired and cognitively generated gait patterns. Despite growing interest in the objective quantification of gait impairments, there remains a lack of research specifically addressing the differentiation between genuine hemiplegic gait and its mimicked counterpart. Prior machine learning studies have not explored whether such models can detect intention-based imitation, a scenario with meaningful implications in clinical diagnostics, functional evaluations, and medico-legal contexts. Moreover, the interpretability, efficiency, and reproducibility of gait classification models remain underexplored, especially in settings where visual assessment may be subjective or inconsistent. The objective of this study was to determine whether machine learning techniques could accurately classify mimicked hemiplegic gait performed by healthy individuals and actual hemiplegic gait exhibited by stroke patients. 
Using treadmill-based spatiotemporal and force-related gait parameters, we developed and evaluated three supervised classifiers—Random Forest, Support Vector Machine (SVM), and Logistic Regression. We also investigated whether reduced-feature models using the top 10 most important gait parameters could maintain classification performance. Ultimately, this study aims to enhance the objectivity and reliability of clinical gait assessment, particularly in ambiguous or bias-prone scenarios, and to support functional evaluations for rehabilitation planning, insurance verification, and return-to-work decisions.
2. Materials and Methods
2.1. Data Sources and Participant Selection
This study retrospectively analyzed individuals who underwent gait analysis through a treadmill gait analysis system at Wonkwang University Korean Medicine Hospital in Gwangju (WKUGH). As this research only used simple measurement or observation equipment that does not lead to physical changes, it received review approval from the institutional review board of the hospital (WKIRB 2022-07, 29 June 2022). We selected data from subjects who visited WKUGH between November 1 2018 and June 30 2022, and who met the inclusion criteria for both the hemiplegic gait (HG) group and the mimicked hemiplegic gait (MHG) group and did not fall under the exclusion criteria (Table 1).
Table 1.
Inclusion and exclusion criteria for the HG and MHG groups.
A total of 79 subjects were retrospectively analyzed in this study, consisting of 39 in the hemiplegic gait group (19 with left hemiparesis, 20 with right hemiparesis) and 40 in the mimicked hemiplegic gait group (20 mimicking left, 20 mimicking right hemiparesis) (Table 2).
Table 2.
Demographic characteristics of subjects.
2.2. Experiment and Equipment
We conducted a gait analysis using a treadmill equipped with a pressure plate (Figure 1). When the subject starts walking on the treadmill, the pressure exerted on the pressure plate is measured, and spatiotemporal features of gait are collected. The pressure applied to the treadmill (AP2010-2Si, Apsun Inc., Seoul, Republic of Korea) is transmitted to the Zebris FDM software (version 1.18.44, Zebris Medical GmbH, Isny/Allgäu, Germany) on the computer [].
Figure 1.
Gait analysis system using treadmill.
For the HG group, the subjects were asked to walk on the treadmill at their preferred speed for 30 s. For the MHG group, we conducted training on the flexor synergy pattern of the upper limbs and the extensor synergy pattern of the lower limbs, which are characteristic of post-stroke hemiparesis. They were instructed to mimic the characteristics of hemiplegic gait, including hip extension and internal rotation, knee extension, and plantar flexion and inversion of the ankle, while performing hip hike and circumduction during swing phase. Gait analysis was conducted when it was judged that they could sufficiently reproduce the hemiplegic gait after watching videos of actual hemiplegic gait of stroke patients and demonstrations by doctors of Korean Medicine. For the MHG group, the walking speed was limited to 0.7 km/h, which is the average walking speed of the HG group, as they tended to perform normal walking when the walking speed was set high.
2.3. Gait Feature
Spatiotemporal features such as foot rotation angle, step length, stride length, velocity, and cadence were obtained, together with the movement of the center of pressure (CoP) during gait, the gait-cycle phase ratios of each lower limb, and the vertical ground reaction force (vGRF) generated during gait (Figure 2). These methods have been described previously [].
Figure 2.
(a) Length of gait line (green), (b) single-limb support line (blue), and (c) lateral symmetry (red). The lines of single-limb support and lateral symmetry are derived from the butterfly-shaped diagram that illustrates the trajectory of the COP during walking.
The gait cycle is divided into the stance and swing phases, which are separated by toe-off. Initial contact of the opposite limb (the left limb in this example) occurs when the right limb reaches 50% of its cycle. There are two double-limb support phases in which both feet are on the ground: one at the start of the cycle (0–10%) and one midway (50–60%) (Table 3). Details of the gait cycle phases have been described previously [].
Table 3.
Gait features obtained by treadmill.
2.4. Statistical Analysis
All statistical analyses were performed using Python (version 3.10) and open-source scientific libraries. Data preprocessing and management were conducted using Pandas, and group comparisons between the hemiplegic gait (HG) and mimicked hemiplegic gait (MHG) groups were performed using independent-samples t-tests implemented via the SciPy package [].
To facilitate interpretation, since the direction of hemiparesis varies among the subjects, features that can be obtained from one side of the body (foot rotation, step length, stance phase, load response, single-limb support, pre-swing, swing phase, step time, length of gait line, single-limb support line, time maximum force 1) were uniformly converted to (+) for the unaffected side and (−) for the affected side, based on the study by Lee et al. [].
For lateral symmetry, the unaffected direction was converted to (+) and the affected direction to (−). Additionally, to interpret time maximum force 1, the time maximum force ratio was calculated by dividing the unaffected side feature by the affected side feature, referencing Patterson’s study method [].
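The side convention and the time maximum force ratio described above can be illustrated with a short sketch. It is not the processing code used in the study: the field names (for example `step_length_left`), the helper names, and the numeric values are hypothetical and serve only to show how left/right measurements might be relabelled as unaffected (+) and affected (-) before the ratio is formed.

```python
def relabel_by_side(record, affected_side, side_features):
    """Relabel left/right measurements as unaffected (+) and affected (-)."""
    if affected_side not in ("left", "right"):
        raise ValueError("affected_side must be 'left' or 'right'")
    unaffected_side = "right" if affected_side == "left" else "left"
    relabelled = {}
    for name in side_features:
        relabelled[f"{name} (+)"] = record[f"{name}_{unaffected_side}"]
        relabelled[f"{name} (-)"] = record[f"{name}_{affected_side}"]
    return relabelled


def time_max_force_ratio(relabelled):
    """Unaffected-side time to the first force peak divided by the affected-side value."""
    return relabelled["time_max_force1 (+)"] / relabelled["time_max_force1 (-)"]


# Hypothetical measurements for one participant with left hemiparesis.
record = {
    "step_length_left": 30.0,
    "step_length_right": 36.0,
    "time_max_force1_left": 40.0,
    "time_max_force1_right": 25.0,
}
converted = relabel_by_side(record, "left", ["step_length", "time_max_force1"])
ratio = time_max_force_ratio(converted)
print(converted)
print(ratio)
```

Under this convention a participant with right hemiparesis and one with left hemiparesis end up with directly comparable columns, which is the purpose of the conversion.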
To assess between-group differences in gait features while controlling for the effect of age, an Analysis of Covariance (ANCOVA) was conducted for each dependent feature. The primary fixed factor was group type (hemiplegic gait [HG] vs. mimicked hemiplegic gait [MHG]), and age was treated as a covariate. The model also included the interaction term between age and group to test whether the relationship between age and the dependent feature differed across groups.
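The ANCOVA described above can be sketched as a set of nested linear-model comparisons. The following is a minimal illustration rather than the analysis code used in the study. It assumes a binary group code (0 = HG, 1 = MHG), mean-centred age, main effects tested in the model without the interaction term, and synthetic data; the function names are hypothetical.

```python
import numpy as np
from scipy import stats


def _rss(design, y):
    """Residual sum of squares of an ordinary least-squares fit."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid)


def nested_f_test(full, reduced, y):
    """F-test comparing a full design matrix with a nested reduced one."""
    rss_full, rss_reduced = _rss(full, y), _rss(reduced, y)
    df_num = full.shape[1] - reduced.shape[1]
    df_den = len(y) - full.shape[1]
    f_stat = ((rss_reduced - rss_full) / df_num) / (rss_full / df_den)
    return float(f_stat), float(stats.f.sf(f_stat, df_num, df_den))


def ancova(y, group, age):
    """Return (F, p) for the group effect, the age covariate, and their interaction."""
    y = np.asarray(y, dtype=float)
    group = np.asarray(group, dtype=float)
    age_c = np.asarray(age, dtype=float) - np.mean(age)  # centred covariate
    ones = np.ones_like(y)
    main = np.column_stack([ones, group, age_c])
    full = np.column_stack([ones, group, age_c, group * age_c])
    return {
        "interaction": nested_f_test(full, main, y),
        "age": nested_f_test(main, np.column_stack([ones, group]), y),
        "group": nested_f_test(main, np.column_stack([ones, age_c]), y),
    }


# Synthetic stand-in data: 39 HG and 40 MHG participants, feature depends on age.
rng = np.random.default_rng(0)
group = np.array([0] * 39 + [1] * 40)
age = rng.uniform(20, 80, size=79)
feature = 0.5 * age + 2.0 * group + rng.normal(size=79)
result = ancova(feature, group, age)
for effect, (f_stat, p_value) in result.items():
    print(f"{effect}: F = {f_stat:.2f}, p = {p_value:.4f}")
```

A feature whose age term is significant in such a test would be a candidate for exclusion, and a non-significant interaction term would indicate that the age slope is similar in both groups.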
To examine group differences in gait features between the HG and MHG groups, appropriate statistical tests were selected based on normality assessments using the Shapiro–Wilk test. For features in which both groups satisfied the assumption of normality (p > 0.05), independent samples t-tests were employed. For non-normally distributed features, the non-parametric Mann–Whitney U test was applied.
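The test-selection rule described above can be expressed compactly with SciPy. This is a minimal sketch under stated assumptions: the helper name `compare_groups` is hypothetical, the t-test uses SciPy's default equal-variance setting (the text does not specify this), and the input values are made up for illustration.

```python
import numpy as np
from scipy import stats


def compare_groups(hg, mhg, alpha=0.05):
    """Choose a t-test or Mann-Whitney U test from Shapiro-Wilk normality checks."""
    hg = np.asarray(hg, dtype=float)
    mhg = np.asarray(mhg, dtype=float)
    # Both groups must pass the normality check for the parametric test.
    both_normal = (stats.shapiro(hg).pvalue > alpha
                   and stats.shapiro(mhg).pvalue > alpha)
    if both_normal:
        res = stats.ttest_ind(hg, mhg)  # assumes equal variances (SciPy default)
        return "t-test", float(res.statistic), float(res.pvalue)
    res = stats.mannwhitneyu(hg, mhg, alternative="two-sided")
    return "Mann-Whitney U", float(res.statistic), float(res.pvalue)


# Made-up values: the first group contains an extreme outlier, so it fails
# the normality check and the non-parametric test is chosen.
hg = [1, 2, 1, 2, 1, 2, 1, 2, 1, 100]
mhg = [2, 3, 2, 3, 2, 3, 2, 3, 2, 3]
test_name, statistic, p_value = compare_groups(hg, mhg)
print(test_name, statistic, p_value)
```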
2.5. Machine Learning Model
Since there is no established formula for calculating the required sample size in machine learning, we referred to an empirical rule of thumb proposed for multiple regression, which recommends a minimum sample size of N ≥ 50 + 8m, where m is the number of predictors []. For a maximum of 10 predictors, this criterion suggests that at least 130 participants are needed. Our study included only 79 participants and therefore fell short of this recommendation.
A post hoc sensitivity analysis was conducted as follows: for the full model test with m = 10 predictors and N = 79 participants, the numerator and denominator degrees of freedom were set to df1 = m = 10 and df2 = N − m − 1 = 79 − 10 − 1 = 68, respectively. At a significance level of α = 0.05, the critical F-value was F(10, 68) ≈ 1.973. The non-centrality parameter λ required to achieve 80% power was obtained from the noncentral F-distribution as λ ≈ 18.4875. Based on this, the minimum detectable effect size was calculated as f2 = λ/N = 18.4875/79 ≈ 0.23. This result indicates that while the study had adequate power to detect medium-to-large effects, it may have failed to detect smaller true effects. According to Cohen’s conventional thresholds for multiple regression (f2 = 0.02 for small, f2 = 0.15 for medium, and f2 = 0.35 for large effects), the calculated f2 value of approximately 0.2 corresponds to a medium-sized effect. This suggests that although the independent variables included in the model influence the dependent variable, the magnitude of this influence may not be substantial. Nevertheless, even very small effects can have considerable practical or clinical importance if they affect a large population.
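The sensitivity analysis above can be reproduced approximately with SciPy's central and noncentral F-distributions. The sketch below is illustrative only and is not the code used in the study. It assumes the convention λ = f2 × N for relating the non-centrality parameter to Cohen's f2, and the function name `sensitivity_f2` and the root-finding bracket are choices made for this example.

```python
from scipy import stats
from scipy.optimize import brentq


def sensitivity_f2(n, m, alpha=0.05, power=0.80):
    """Minimum detectable f2 for the overall F-test of a model with m predictors."""
    df1 = m
    df2 = n - m - 1
    f_crit = stats.f.ppf(1 - alpha, df1, df2)

    def power_gap(lam):
        # Achieved power under the noncentral F-distribution minus the target power.
        return stats.ncf.sf(f_crit, df1, df2, lam) - power

    # Power increases with lambda, so the root is bracketed between a
    # near-zero value and a generously large one.
    lam = brentq(power_gap, 1e-6, 100.0)
    return float(f_crit), float(lam), float(lam / n)  # assumes lambda = f2 * N


f_crit, lam, f2 = sensitivity_f2(n=79, m=10)
print(f"critical F = {f_crit:.3f}, lambda = {lam:.3f}, f2 = {f2:.3f}")
```

With N = 79 and m = 10 the printed values should land close to those reported in the text; small differences can arise from numerical tolerances or from a different convention for λ.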
A total of 79 observations were included in the dataset, consisting of two groups labeled as hemiplegic gait (HG) and mimicked hemiplegic gait (MHG). Each observation included 26 spatiotemporal and force-related gait features, excluding age-sensitive features based on prior ANCOVA results.
To reduce model complexity and improve generalization given the limited dataset size, feature selection was performed using the Gini impurity-based feature importance scores computed from a Random Forest classifier []. Specifically, a Random Forest model was trained on the entire standardized dataset using all available input features, and the importance of each feature was quantified by averaging the total reduction in Gini impurity that the feature contributed across all decision trees in the ensemble. Features that contributed more to reducing impurity were assigned higher importance scores. The features were then ranked in descending order based on their computed importance values, and the top 10 features were selected. These top-ranked features were consistently used in all subsequent modeling, hyperparameter tuning, and evaluation procedures to ensure comparability and to reduce the risk of overfitting [].
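The ranking step described above can be sketched with scikit-learn, whose `feature_importances_` attribute reports the mean decrease in Gini impurity for a Random Forest classifier. This is a minimal illustration under stated assumptions, not the study's code: the helper name, the number of trees, the random seed, and the synthetic data (with one deliberately informative feature) are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler


def top_k_features(X, y, feature_names, k=10, seed=0):
    """Rank features by Random Forest Gini importance and return the top k."""
    X_std = StandardScaler().fit_transform(X)
    forest = RandomForestClassifier(n_estimators=500, random_state=seed)
    forest.fit(X_std, y)
    importances = forest.feature_importances_
    order = np.argsort(importances)[::-1][:k]  # descending importance
    return [(feature_names[i], float(importances[i])) for i in order]


# Synthetic stand-in for the 79 x 26 feature matrix (39 HG, 40 MHG).
rng = np.random.default_rng(0)
y = np.array([0] * 39 + [1] * 40)
X = rng.normal(size=(79, 26))
X[:, 0] += 3.0 * y  # make the first feature strongly group-dependent
names = [f"feature_{i}" for i in range(26)]

ranking = top_k_features(X, y, names, k=10)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In a cross-validated workflow, the same ranking could alternatively be computed inside each training fold so that held-out samples do not influence which features are selected.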
To classify HG versus MHG by gait features, supervised machine learning models were developed using the top 10 features identified through feature selection. Three classification algorithms were employed: a Random Forest classifier, a Support Vector Machine (SVM) with a radial basis function (RBF) kernel [], and a logistic regression model.
Random Forest is an ensemble method that constructs multiple decision trees and aggregates their outputs to improve prediction robustness. SVM identifies the optimal hyperplane to separate different classes, while Logistic Regression models the probability of class membership using a linear function. These models were chosen for their complementary strengths and their established use in biomedical data analysis.
All models were implemented using scikit-learn (v1.3.0) with default hyperparameters initially, followed by hyperparameter tuning via grid search []. The Random Forest model leveraged ensemble decision trees to enhance robustness and interpretability, while the SVM model offered nonlinear boundary discrimination. Logistic regression was included for its simplicity and high interpretability in clinical contexts. All models were trained and evaluated on the same standardized feature set to ensure comparability across methods.
To mitigate the risk of overfitting due to the limited sample size, models were primarily evaluated using 5-fold stratified cross-validation, ensuring balanced representation of both gait groups within each fold []. In addition to cross-validation, an 80:20 holdout split was performed to further validate model performance on unseen data. Specifically, 80% of the dataset was randomly selected and used as a training set for model fitting and hyperparameter tuning, while the remaining 20% was reserved as an independent test set for performance evaluation []. Model performance was assessed using accuracy, F1-score, and the area under the receiver operating characteristic curve (ROC AUC), which reflect overall classification accuracy, robustness to class imbalance, and discriminative ability, respectively.
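The evaluation scheme described above (5-fold stratified cross-validation plus an 80:20 holdout split, scored with accuracy, F1-score, and ROC AUC) can be sketched with scikit-learn as follows. This is an illustrative outline rather than the study's code. It assumes default hyperparameters (the grid search is omitted), a stratified holdout split, scaling inside each pipeline, and synthetic data; the function names and the seed are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_validate, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_models(seed=0):
    """Three classifiers, each preceded by feature standardization."""
    return {
        "Random Forest": make_pipeline(StandardScaler(), RandomForestClassifier(random_state=seed)),
        "SVM (RBF)": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    }


def continuous_scores(model, X):
    """Scores for ROC AUC: decision function if available, else class-1 probability."""
    if hasattr(model, "decision_function"):
        return model.decision_function(X)
    return model.predict_proba(X)[:, 1]


def evaluate(X, y, seed=0):
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed)
    results = {}
    for name, model in build_models(seed).items():
        # Cross-validation on the full dataset (the estimator is cloned per fold).
        cv_out = cross_validate(model, X, y, cv=cv,
                                scoring=("accuracy", "f1", "roc_auc"))
        # Holdout evaluation: fit on 80%, score on the remaining 20%.
        model.fit(X_tr, y_tr)
        pred = model.predict(X_te)
        results[name] = {
            "cv_accuracy": float(cv_out["test_accuracy"].mean()),
            "cv_f1": float(cv_out["test_f1"].mean()),
            "cv_auc": float(cv_out["test_roc_auc"].mean()),
            "test_accuracy": float(accuracy_score(y_te, pred)),
            "test_f1": float(f1_score(y_te, pred)),
            "test_auc": float(roc_auc_score(y_te, continuous_scores(model, X_te))),
        }
    return results


# Synthetic stand-in for 79 participants and 10 selected features.
rng = np.random.default_rng(0)
y = np.array([0] * 39 + [1] * 40)
X = rng.normal(size=(79, 10))
X[:, :3] += 1.5 * y[:, None]  # three features carry group information
results = evaluate(X, y)
for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Placing the scaler inside each pipeline means that standardization parameters are learned only from the training portion of every fold and of the holdout split.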
3. Results
3.1. Result of Age Effect
To examine the effect of age on gait features and to control for potential confounding, an analysis of covariance (ANCOVA) was performed with age as a covariate. The results indicated that age had a statistically significant effect on multiple features, including single-limb support line (+) (p = 0.003), foot rotation (+) (p = 0.007), velocity (p = 0.037), and step length (+) (p = 0.049) (Figure 3). No significant age-related effects were found in the remaining features. Additionally, no significant age × group interaction was found in any feature, suggesting that the effect of age was consistent across both groups.
Figure 3.
Distributional characteristics of features significantly influenced by age.
3.2. Result of Gait Feature
To examine group differences, independent-samples t-tests were conducted for each variable, and Cohen’s d was calculated as a measure of effect size. Furthermore, to assess the relative importance of the variables, a Random Forest classification model was employed, and the Gini importance for each variable was computed as the average decrease in Gini impurity contributed by that variable across the ensemble. The results revealed that several features showed statistically significant differences between the HG and MHG groups. Specifically, swing phase (−) (p < 0.001, Cohen’s d = −2.103), stance phase (−) (p < 0.001, d = 2.103), and time maximum force1 (−) (p < 0.001, d = 1.556) showed large to very large effect sizes, indicating substantial biomechanical differences. Additionally, time maximum force ratio (p < 0.001, d = −1.196) and step time (−) (p < 0.001, d = −0.615) also demonstrated meaningful group differences (Table 4).
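For reference, Cohen's d for two independent groups can be computed as in the sketch below. It assumes the pooled-standard-deviation form of d, which the text does not specify, and the function name and example values are hypothetical.

```python
import numpy as np


def cohens_d(a, b):
    """Cohen's d for two independent samples using the pooled standard deviation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)
    return float((a.mean() - b.mean()) / np.sqrt(pooled_var))


# Made-up values: group means differ by one pooled standard deviation.
group_a = [1.0, 2.0, 3.0]
group_b = [2.0, 3.0, 4.0]
d_ab = cohens_d(group_a, group_b)
d_ba = cohens_d(group_b, group_a)
print(d_ab, d_ba)
```

Swapping the groups only flips the sign of d. The same mirror relation is expected for stance and swing phase percentages of one limb, because the two sum to 100% of the gait cycle, which is consistent with the equal and opposite values reported for stance phase (−) and swing phase (−).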
Table 4.
Comparison of features.
3.3. Result of Feature Importance Analysis
After excluding the four features influenced by age (velocity, single-limb support line (+), foot rotation (+), and step length (+)), the top 10 features with the highest importance scores were selected, and their relative contributions are visualized in Figure 4.
Figure 4.
Feature importance scores based on Random Forest Gini impurity.
The most influential feature was stance phase (−) (Gini importance = 0.224), followed by single-limb support (+) (0.147), swing phase (−) (0.132), and step time (−) (0.064). Additional key features included time maximum force1 (−) (0.044), double stance phase (0.039), time maximum force ratio (0.034), stance phase (+) (0.026), load response (+) (0.025), and foot rotation (−) (0.025).
These top 10 features were subsequently used for feature-reduced model training, resulting in comparable or improved classification performance compared to models trained on the full feature set.
3.4. Result of Machine Learning Model
To evaluate classification performance, each model was trained and validated using both the full feature set (26 features) and a reduced feature set comprising the top 10 most important features selected via Random Forest. Table 5 summarizes the results of 5-fold stratified cross-validation across all models.
Table 5.
The test results of classification models using top 10 features.
When using the full feature set, the Support Vector Machine (SVM) with a radial basis function kernel demonstrated the highest classification performance, achieving an accuracy of 93.8%, F1-score of 0.941, and a perfect ROC AUC of 1.000. Logistic Regression also performed robustly on the full dataset, with an accuracy of 91.1% and ROC AUC of 0.932, while Random Forest yielded slightly lower accuracy (87.5%) but a comparably high AUC (0.961).
Upon restricting the input to the top 10 features, classification performance improved or remained stable across all models. Random Forest showed improved accuracy (92.3%) and F1-score (0.924), with a slight decrease in AUC to 0.949. The reduced-feature SVM achieved an accuracy of 91.1%, F1-score of 0.913, and maintained a high AUC of 0.961. Logistic Regression, using the same reduced features, also performed consistently well, with an accuracy of 91.1%, F1-score of 0.907, and AUC of 0.941 (Figure 5).
Figure 5.
ROC curve comparison of three classifiers using top 10 and all features.
These results indicate that reducing the feature space to the most informative features not only maintained but, in some cases, enhanced model performance. Furthermore, while SVM demonstrated the highest discriminative capacity, both Random Forest and Logistic Regression achieved comparable results with the added benefit of interpretability and ease of integration in clinical settings.
3.5. Result of Machine Learning Model Using Non-Significant Variables
To examine whether the classification performance was solely driven by variables with significant group differences, we performed an additional analysis using only variables with p > 0.05 between the hemiplegic and mimicked hemiplegic gait groups. The variables included in this analysis were step length (−), step width, step time (+), stride time, velocity, single-limb support line (+), single-limb support line (−), lateral symmetry, and time maximum force1 (+). Three classifiers, namely Random Forest, Support Vector Machine (SVM), and Logistic Regression, were trained on the training set (80%) and evaluated on the independent test set (20%) using accuracy, F1-score, and ROC AUC. The results are presented in Table 6 and Figure 6.
Table 6.
The test results of classification models using non-significant features.
Figure 6.
ROC curve comparison of three classifiers using non-significant (independent) features.
4. Discussion
This study aimed to identify key gait features and develop machine learning models to distinguish true hemiplegic gait (HG) from mimicked hemiplegic gait (MHG) using quantitative gait analysis data.
Gait analysis yields not only spatiotemporal features but also a wide variety of features related to force and to the movement of the center of pressure, which makes analysis difficult. In this study, machine learning was used to compare gait features between the two groups and to classify them, and the following results were obtained.
First, ANCOVA revealed that four gait features (velocity, single-limb support line (+), foot rotation (+), and step length (+)) were significantly influenced by age. As a result, these features were excluded from intergroup statistical comparisons and machine learning modeling. Although each of these features represents clinically and biomechanically meaningful aspects of gait, including overall function, stability, joint control, and step efficiency, their strong association with age-related changes makes them less suitable for isolating pathological gait characteristics.
For example, velocity, a widely used marker of mobility and gait efficiency, is known to decrease with age due to reduced strength, balance, and neuromuscular coordination. Thus, differences in velocity may be attributed more to natural aging than to hemiparetic impairment [,]. Likewise, single-limb support line (+), derived from the trajectory of the center of pressure (COP), reflects the stability and duration of stance [,]. This feature is inherently sensitive to age-related decline in postural control, potentially confounding its interpretation as a group-specific gait deviation [].
Foot rotation (+) indicates the degree of forefoot external rotation during gait. While such rotation is often seen in stroke-related compensatory movements, similar patterns can also emerge due to joint stiffness and muscular adaptations associated with aging, especially at the hip and ankle joints []. Step length (+), closely tied to stride mechanics, also decreases with age, as older adults typically exhibit reduced propulsion and joint range of motion. In this study, the absence of a significant difference in step length after adjusting for age supports this interpretation [].
Importantly, all age-dependent gait variables identified in this study were derived from the unaffected side. While the unaffected side may exhibit more biomechanically stable patterns, it primarily reflects compensatory mechanisms rather than direct neurological impairments [,]. In contrast, the affected side provides richer pathological information, including asymmetry, altered timing, and deficits in force generation—features that are essential for distinguishing true hemiplegic gait from mimicked patterns []. Healthy individuals attempting to imitate hemiplegic gait are generally unable to replicate these subtle dysfunctions, making affected-side features especially valuable for classification. From a machine learning perspective, using predominantly affected-side parameters enhances discriminative power in differentiating neurologically impaired gait from cognitively generated imitation.
Second, a machine learning-based classification model was developed using gait features that were not significantly influenced by age. Feature importance analysis identified the top 10 features that most contributed to model performance. These features primarily comprised temporal gait features, phase-specific time ratios, and force timing characteristics, all of which captured essential distinctions between the HG and MHG groups.
Notably, stance phase (−), single-limb support (+), swing phase (−), and step time (−) ranked among the most important features, and they reflect key spatiotemporal characteristics of hemiplegic gait. A reduced stance phase on the affected side indicates instability and diminished weight-bearing capacity, while increased single-limb support on the unaffected side reflects compensatory load-bearing strategies []. Shortened or delayed swing phase is commonly associated with impaired motor control and muscle weakness, and irregularities in step time reflect disrupted gait rhythm and temporal asymmetry []. These features capture pathological aspects of gait that are difficult for healthy individuals to replicate through mimicked hemiplegic gait, thus providing high discriminative power for classification. Although Rezgui et al. (2013) investigated the imitation of cerebral palsy gait rather than stroke-related hemiplegic gait, their findings support the notion that pathological gait patterns exhibit neuromechanical complexities that are difficult for healthy individuals to replicate accurately, thereby providing a relevant basis for our interpretation []. The consistent emergence of these features as important predictors in machine learning models highlights both their clinical relevance and their robustness in differentiating true hemiplegic gait from cognitively generated imitation. Furthermore, time-based kinetic features such as the time maximum force ratio and load response timing emerged as critical indicators. These features represent subtle shifts in the timing of ground reaction forces and suggest that temporal irregularities in neuromuscular response may serve as objective markers of pathological gait.
Such micro-level deviations are unlikely to be perceived through visual inspection alone, underscoring the potential utility of data-driven metrics in clinical gait assessment.
Interestingly, the selected top features align well with those highlighted in previous clinical gait studies involving stroke patients. This convergence lends further support to their validity and highlights the importance of kinetic over kinematic features in distinguishing pathological gait. While kinematic features such as joint angles or step speed can be consciously manipulated by mimicry, kinetic features typically reflect involuntary motor output, offering greater diagnostic specificity.
In summary, the top 10 features identified in this study appear to capture fundamental biomechanical markers of true hemiplegic gait that are resistant to voluntary imitation. These findings suggest that carefully selected gait features, particularly those capturing temporal dynamics and weight transfer mechanisms, may serve as reliable components in objective gait classification systems. The results further support the use of explainable machine learning approaches to extract clinically relevant insights from quantitative gait analysis.
Third, all three machine learning classification models developed in this study, namely Random Forest, Support Vector Machine (SVM), and Logistic Regression, demonstrated high performance in distinguishing hemiplegic gait (HG) from mimicked hemiplegic gait (MHG), supporting the feasibility of quantitative, data-driven gait classification. Among them, the SVM exhibited the highest discriminative power, achieving a perfect AUC (1.000) when trained on the full feature set and maintaining robust performance even when using only the top 10 features. This suggests that the SVM’s ability to capture nonlinear decision boundaries may be particularly well-suited for modeling the complex patterns observed in hemiplegic gait.
The Random Forest model also achieved high accuracy and provided the added benefit of interpretability through feature importance scores. These importance scores were used to guide dimensionality reduction and contributed to maintaining model performance with fewer features. Logistic Regression, although slightly lower in AUC compared to the other models, offers advantages in transparency and clinical interpretability. In clinical settings where decision-making must be explainable, such simplicity is often valued despite minor performance trade-offs.
Importantly, models trained on only the top 10 most important features yielded comparable performance to models trained on the full feature set. This highlights the feasibility of developing efficient and lightweight gait assessment systems that minimize data collection burden without compromising diagnostic power. Such systems could be particularly advantageous for deployment in real-world clinical environments, including wearable devices or bedside evaluation tools, to support functional assessments for rehabilitation planning, insurance verification, and return-to-work decisions. These models also provide clinically meaningful insights into gait asymmetry and compensatory strategies. For example, the observed alterations in stance and swing phases correspond to well-known biomechanical consequences of muscle weakness and postural instability.
Moreover, it should be acknowledged that the laterality of motor deficits may exert a significant influence on gait characteristics. Patients with left-sided hemiparesis often present with spatial neglect, which can exacerbate asymmetry and impair motor control, whereas right-sided hemiparesis may follow distinct compensatory mechanisms. Although our dataset did not allow for stratified analysis by side of involvement, future studies with larger cohorts should investigate laterality-specific gait alterations and their implications for personalized rehabilitation strategies.
In addition, external validation using larger, multi-center datasets will be crucial to establish the robustness and generalizability of such lightweight models. Future research should also explore the integration of advanced algorithms, including deep learning and ensemble methods, to further enhance predictive performance while maintaining clinical interpretability.
Even when the analysis was restricted to variables showing no significant differences between groups (p > 0.05), Random Forest and SVM maintained high classification performance (ROC AUC = 0.97 and 0.95, respectively; Accuracy = 87.5%), demonstrating that the models learned complex multivariate patterns rather than exploiting obvious contrasts between groups. In contrast, Logistic Regression achieved substantially lower performance (ROC AUC = 0.58; Accuracy = 62.5%). This discrepancy can be attributed to the linear nature of Logistic Regression, which cannot adequately capture nonlinear relationships and higher-order interactions likely present in the selected variables. Moreover, potential multicollinearity among gait features and the limited sample size may have further reduced its performance. These findings collectively highlight the importance of nonlinear classification models in capturing complex gait patterns, particularly when apparent group differences are removed.
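The gap between linear and nonlinear classifiers when no single variable separates the groups can be illustrated with a toy example. The sketch below is purely illustrative and says nothing about the study's actual features: it assumes two synthetic variables whose marginal distributions are identical in both classes, with the label determined only by their interaction (an XOR-like pattern).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Two synthetic features, each uniform on [-1, 1] in BOTH classes, so a
# univariate comparison of either feature shows no group difference.
X = rng.uniform(-1.0, 1.0, size=(400, 2))

# The label depends only on the interaction (sign of the product).
y = (X[:, 0] * X[:, 1] > 0).astype(int)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# A linear decision boundary cannot represent this pattern.
auc_lr = cross_val_score(LogisticRegression(), X, y,
                         cv=cv, scoring="roc_auc").mean()

# An RBF-kernel SVM can model the interaction.
auc_svm = cross_val_score(SVC(kernel="rbf"), X, y,
                          cv=cv, scoring="roc_auc").mean()

print(f"Logistic Regression mean AUC: {auc_lr:.2f}")
print(f"RBF SVM mean AUC:             {auc_svm:.2f}")
```

In this constructed case the linear model stays near chance while the kernel model separates the classes, which is the qualitative behavior described above. Real gait features are unlikely to be this clean, so the size of the gap should not be read as a prediction.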
Several limitations of the present study should be acknowledged.
First, the training methodology for the MHG group poses concerns regarding ecological validity. Although participants were instructed using videos of actual stroke patients and demonstrations by doctors and were asked to reproduce characteristic hemiplegic gait patterns (hip extension and internal rotation, knee extension, ankle plantarflexion and inversion, hip hiking, and circumduction), no objective criteria or agreement scale was applied to confirm accurate imitation. Gait analysis was conducted only when physicians judged the mimicry to be sufficient, and treadmill speed was standardized at 0.7 km/h to minimize variability. Nevertheless, reliance on subjective assessment may have introduced uncontrolled variability, potentially compromising comparability with the HG group and leading to overestimation of the model’s discriminatory capacity.
Second, the relatively small sample size (N = 79) limited the statistical power of the study. While sensitivity analysis confirmed that medium-to-large effects could be reliably detected, smaller effects might have been overlooked. According to Cohen’s conventional thresholds for multiple regression (f² = 0.02 for small, f² = 0.15 for medium, and f² = 0.35 for large effects), our calculated f² = 0.2 corresponds to a medium-sized effect. This suggests that although the independent variables included in the model influence the dependent variable, the magnitude of this influence may not be substantial. Nevertheless, even very small effects can have considerable practical or clinical importance if they affect a large population.
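To make the effect-size reading concrete, the short sketch below applies the thresholds quoted above and converts between Cohen's f² and the proportion of explained variance using the standard relation f² = R² / (1 − R²). The function names are hypothetical helpers introduced for this illustration only.

```python
def f2_from_r2(r2: float) -> float:
    """Cohen's f^2 from the proportion of explained variance R^2."""
    return r2 / (1.0 - r2)


def r2_from_f2(f2: float) -> float:
    """Inverse relation: R^2 implied by a given f^2."""
    return f2 / (1.0 + f2)


def cohen_f2_label(f2: float) -> str:
    """Label an effect using Cohen's conventional cut-offs (0.02, 0.15, 0.35)."""
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "medium"
    if f2 >= 0.02:
        return "small"
    return "negligible"


f2_reported = 0.2  # value quoted in the text
label = cohen_f2_label(f2_reported)
implied_r2 = r2_from_f2(f2_reported)

print(f"f^2 = {f2_reported} is a {label} effect")
print(f"implied R^2 = {implied_r2:.3f}")
```

Under this relation, f² = 0.2 corresponds to roughly one sixth of the variance being explained, which sits between the medium (0.15) and large (0.35) cut-offs.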
Third, the machine learning models were trained and tested on a single-center dataset, raising concerns about external validity. To mitigate this, we applied cross-validation and developed a reduced model using only the top 10 key features identified through feature importance analysis, which showed comparable performance to the full model. However, such approaches cannot fundamentally resolve the limitation. We are therefore preparing additional patient recruitment and multi-center collaborative studies, and future work will focus on external validation with larger datasets to strengthen the generalizability of the model.
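One way to see why internal validation alone leaves considerable uncertainty is to look at the size of the hold-out set. The sketch below assumes that the 80/20 split of 79 participants produced a 16-person test set; this is an assumption, although the reported accuracies of 87.5% and 62.5% are consistent with 14/16 and 10/16 correct. It then computes a 95% Wilson score interval for each accuracy. The helper `wilson_interval` is written for this illustration and is not taken from the study's code.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (default ~95%)."""
    p_hat = successes / n
    denom = 1.0 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


n_total = 79
# Assumed test-set size: 20% of 79, rounded up.
n_test = math.ceil(0.2 * n_total)

# Each single misclassification shifts accuracy by this many percentage points.
step_pct = 100.0 / n_test

# Accuracies consistent with the reported 87.5% and 62.5% (assumed counts).
lo_high, hi_high = wilson_interval(14, n_test)
lo_low, hi_low = wilson_interval(10, n_test)

print(f"assumed test-set size: {n_test}, one error = {step_pct:.2f} points")
print(f"14/{n_test} correct: 95% CI {lo_high:.2f} to {hi_high:.2f}")
print(f"10/{n_test} correct: 95% CI {lo_low:.2f} to {hi_low:.2f}")
```

With so few test cases the intervals are wide, which is why the external, multi-center validation planned above matters more than further internal resampling.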
Fourth, although our models demonstrated high classification accuracy, it must be acknowledged that several gait features exhibited significant group differences between HG and MHG. This may have biased the classification results, favoring automatic discrimination and potentially inflating model performance beyond what might be expected in more ecologically valid conditions. Lastly, while we employed three widely used algorithms (Random Forest, SVM, Logistic Regression), future studies may benefit from exploring deep learning architectures, ensemble approaches, and refined threshold-tuning strategies to further enhance predictive accuracy and clinical applicability.
5. Conclusions
This study demonstrated that machine learning models can effectively distinguish hemiplegic gait from mimicked hemiplegic gait using quantitative gait features. By applying Random Forest, Support Vector Machine (SVM), and Logistic Regression classifiers, high classification performance was achieved even with a limited dataset, with SVM showing the highest AUC values. Feature importance analysis revealed that temporal asymmetries, such as altered stance and swing phases, along with force timing differences and spatial instability, were key discriminators. Notably, models trained using only the top 10 most informative features achieved comparable performance to those using the full feature set, suggesting that lightweight and interpretable models may be sufficient for clinical application. These findings support the potential of machine learning-based gait analysis as an objective tool for identifying pathological gait characteristics and contribute to the development of data-driven neurorehabilitation strategies. Further validation using external datasets and integration of additional physiological signals is recommended to enhance clinical utility and generalizability.
Author Contributions
Conceptualization, Y.-u.L.; methodology, Y.-u.L.; validation, S.L.; formal analysis, Y.-u.L.; investigation, C.-H.K.; data curation, J.-W.S. and S.K.; writing—original draft preparation, Y.-u.L.; writing—review and editing, J.-W.S. and S.K.; visualization, Y.-u.L.; supervision, S.L.; project administration, S.L.; funding acquisition, S.L. and C.-H.K. All authors have read and agreed to the published version of the manuscript.
Funding
This research was supported by a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant numbers: RS-2020-KH088006, RS-2022-KH127675, and RS-2024-00442030).
Institutional Review Board Statement
The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of Wonkwang University Korean Medicine Hospital in Gwangju (IRB No. WKIRB 2022-07, 29 June 2022).
Informed Consent Statement
Informed consent was obtained from all subjects involved in the study.
Data Availability Statement
The data presented in this study are available upon request from the corresponding author. These data are not publicly available because of privacy concerns.
Conflicts of Interest
The authors declare no conflicts of interest.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).