1. Introduction
Neurodegenerative disorders (NDDs) have become increasingly prevalent with population aging and longer life expectancy. Many NDDs are associated with progressive impairments in motor control, and gait and balance disturbances are among the most functionally relevant manifestations because they directly affect mobility, independence, and quality of life [1,2,3,4]. In current clinical practice, mobility assessment still relies heavily on observational examination and clinical rating scales administered by clinicians or physiotherapists. Although such assessments are clinically valuable, they are inherently subjective and can be influenced by inter- and intra-rater variability, for example due to differences in clinical training, observational experience, and scoring habits across raters [5,6]. In addition, many clinical scales rely on discrete ordinal scores, which may limit sensitivity to subtle motor changes. These limitations motivate the development of objective, repeatable, and scalable methods for gait assessment in both clinical and daily-life settings [7].
Parkinson’s disease (PD) is one of the most representative NDDs in which motor impairment and gait dysfunction play central roles. In addition to gait instability and fall risk, PD is frequently accompanied by other motor abnormalities such as bradykinesia, rigidity, postural instability, and movement asymmetry [8]. In some patients, PD is also associated with a forward-flexed trunk related to muscle rigidity, which further alters posture and locomotor behavior [9]. Common PD-related gait abnormalities include freezing of gait (FOG), reduced step length, increased stride-to-stride variability, as well as side-related gait asymmetry, all of which are closely associated with symptom severity and disease progression. In PD, laterality-related abnormalities are often reflected by asymmetric plantar loading and unequal temporal organization between the left and right limbs rather than the fixed unilateral compensatory patterns more typical of hemiplegic or foot-drop gait. PD may also present with postural abnormalities and tremor, further distinguishing its locomotor profile from other pathological gait patterns. In this context, contextualizing PD within the broader field of pathological gait recognition is important for avoiding confusion between PD-specific abnormalities and other pathological gait phenotypes [10]. With the rapid development of wearable sensing technologies, gait can now be quantified more objectively outside conventional laboratory settings. In particular, plantar insole sensors provide direct measurements of foot–ground interaction and preserve rich temporal and loading information during walking, making them a promising modality for PD-related gait assessment and monitoring [11].
In recent years, artificial intelligence (AI), especially machine learning (ML) and deep learning (DL), has been increasingly applied to medical data analysis because these methods can learn discriminative patterns from multidimensional physiological data [12,13]. In gait analysis, such methods provide useful tools for transforming sensor measurements into objective disease-related representations and classification models. Recent studies have shown the potential of deep neural networks for pathological gait recognition by combining heterogeneous gait-related representations, such as skeleton sequences, joint angles, and gait parameters [8], as well as for the optimized recognition of pathological locomotor patterns from inertial data [10]. In addition, wearable biosensor systems embedded in body-worn devices have also been successfully used together with deep learning for activity recognition, further supporting the feasibility of intelligent movement analysis beyond conventional laboratory settings [14]. Nevertheless, interpretability remains essential for clinically meaningful wearable gait assessment.
Beyond conventional descriptive statistics, gait is increasingly understood as a nonlinear and multiscale dynamical process. In mathematics, fractals are commonly associated with exact or statistical self-similarity across scales, and their structural complexity is often characterized by the fractal dimension, which is a scale-dependent quantity that extends the conventional Euclidean notion of dimension and may take non-integer values. In this sense, the fractal dimension describes how structural detail changes with observation scale and provides a mathematical way to characterize irregular geometries beyond traditional integer-dimensional objects [15,16]. Fractal and scale-related analyses have therefore been introduced to capture aspects of gait organization that may not be adequately reflected by low-order temporal- or amplitude-based measures alone [17]. At the same time, clinically meaningful PD gait abnormalities are not limited to complexity changes: they also involve laterality-related manifestations, altered temporal organization, and subject-specific deviation from normal walking patterns [18,19].
However, several limitations remain in current plantar insole sensor signal-based PD gait analysis. First, many existing approaches still rely primarily on conventional time-domain, frequency-domain, or time–frequency descriptors [20,21,22], whereas nonlinear gait complexity may remain insufficiently characterized. Second, although PD gait often exhibits asymmetry, disturbed gait-phase organization, and deviation from normal locomotor patterns, these properties are not always represented in a unified and interpretable manner [23,24]. Third, subject-level screening and monitoring would benefit from a continuous abnormality measure that quantifies how far an individual deviates from a normal baseline while accounting for the feature correlation structure [19]. Therefore, there remains a need for an integrated representation that jointly models gait complexity, biomechanical imbalance, and deviation from normal gait patterns.
To address these issues, we propose FID-Gait, which is a three-domain fusion framework for PD identification using smart-insole data. The framework integrates a fractal domain for nonlinear gait complexity, a plantar-loading–phase imbalance (PLPI) domain for loading asymmetry and temporal disturbance, and a covariance-adjusted deviation (CAD) domain for quantifying deviation from a normal reference distribution. By combining these complementary domains, FID-Gait provides an interpretable and discriminative representation for PD gait characterization and identification.
The main contributions of this work are summarized as follows:
We propose an automated gait-cycle segmentation pipeline for plantar insole sensor signals.
We design three complementary feature domains, namely fractal, PLPI, and CAD, to characterize nonlinear gait complexity, biomechanical imbalance, and deviation from normal gait patterns.
We develop the FID-Gait framework, which achieves strong performance at both the gait-cycle and subject levels.
The remainder of this paper is organized as follows. Section 2 reviews related studies on PD gait analysis and plantar insole sensor signal-based intelligent assessment. Section 3 introduces the dataset, signal preprocessing procedure, gait segmentation method, and multidomain feature construction. Section 4 presents the experimental settings and classification results, including ablation analysis and robustness evaluation. Section 5 discusses the main findings, practical implications, and study limitations. Finally, Section 6 concludes the paper.
3. Materials and Methods
3.1. PD Dataset Description
The proposed method was evaluated using the publicly available Gait in Parkinson’s Disease dataset from PhysioNet [46]. The dataset contains gait recordings from 93 subjects with idiopathic Parkinson’s disease (PD) and 73 Normal control subjects, yielding 166 subjects in total, and the data were collected from three independent studies, namely Ga, Ju, and Si [47,48,49]. For the whole cohort, the subject-level sex composition was Male/Female = 58/35 for the PD group and Male/Female = 40/33 for the Normal control group.
During data acquisition, subjects walked on level ground at their usual self-selected pace for approximately 2 min. Plantar-force measurements were collected using instrumented insoles with 8 force sensors under each foot, resulting in 16 sensor channels in total. The signals were sampled at 100 Hz, corresponding to a sampling interval of 10 ms. In addition to the 16 individual sensor channels, the released records also provide two composite channels, corresponding to the summed force under the left foot and right foot, respectively.
Table 2 summarizes the subject-level and record-level composition used in this paper.
Because this research was conducted using a public dataset, the data-acquisition protocol was predetermined and did not include repeated walking trials separated by predefined rest intervals. It should be noted that gait measurements may be influenced by testing procedures and protocol design, and recent studies have emphasized the importance of protocol standardization in Parkinson’s disease gait research. In addition, repeated trials with appropriate inter-trial rest may facilitate a more thorough evaluation of measurement stability and reduce potential procedural interference. Therefore, in this paper, inter-trial consistency and its related effects could not be specifically assessed [50,51,52,53].
3.2. Statistical Analysis
To compare the PD and Normal groups, a unified statistical analysis workflow was adopted. Unless otherwise specified, all tests were two-sided with statistical significance set at .
After preprocessing, single-feature distributions were compared between the two groups using appropriate parametric or nonparametric tests according to the data characteristics. For significant differences, effect sizes were additionally reported with Cohen’s d used for mean-based comparisons.
To control the family-wise error rate in multiple feature comparisons, the Holm procedure was applied.
For subject-level distribution analyses, statistical inference was based on subject-level summary values. Normality was first assessed using the Shapiro–Wilk test; when the assumption of normality was not satisfied, between-group differences were tested using the Wilcoxon rank-sum test, and rank-biserial correlation was reported as the effect size. For analyses involving multiple comparisons (e.g., plantar-region screening and regional fractal-dimension screening), the Holm correction was applied to control the family-wise error rate, and raw p-values were explicitly distinguished from adjusted p-values in the main text. For single prespecified comparisons (e.g., the CAD group comparison), no additional post hoc multiple-comparison correction was applied.
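As an illustration of the multiple-comparison step, the Holm step-down adjustment can be sketched in a few lines of plain Python. The function name `holm_adjust` and the raw p-values are hypothetical and chosen only for demonstration; they are not results from this study.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Illustrative raw p-values for five hypothetical regional comparisons.
raw = [0.004, 0.030, 0.020, 0.300, 0.012]
adj = holm_adjust(raw)
significant = [p < 0.05 for p in adj]
```

In this toy case only the first and last comparisons stay below 0.05 after adjustment, which is why raw and adjusted p-values need to be reported separately.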
For feature-domain ablation analysis, classifier-level performance drops relative to the full feature set were compared across ablation settings using the Friedman test, which was followed by Wilcoxon signed-rank tests with Holm correction for post hoc pairwise comparisons when applicable.
3.3. Data Processing and Model Settings
To ensure reproducibility, the data preprocessing pipeline, gait-state segmentation procedure, feature-processing strategy, and classifier settings were standardized and are summarized here.
The plantar-force signals were converted into numerical time-series data and smoothed using a centered moving average with a window length of 5 samples (50 ms). Missing values introduced by centered smoothing, as well as non-numeric entries, were imputed by backward fill followed by forward fill. Each channel was normalized within each record by its maximum value. No additional band-pass filtering, resampling, or detrending was applied.
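A minimal sketch of this per-channel preprocessing is given here in plain Python. It assumes that samples without a full centered window are treated as missing before filling; the helper names are hypothetical, and the input values are illustrative only.

```python
import math


def centered_moving_average(x, window=5):
    """Centered moving average; edge samples without a full window become NaN."""
    half = window // 2
    out = []
    for n in range(len(x)):
        if n < half or n + half >= len(x):
            out.append(math.nan)
        else:
            out.append(sum(x[n - half:n + half + 1]) / window)
    return out


def fill_missing(x):
    """Backward fill followed by forward fill of NaN entries."""
    y = list(x)
    for n in range(len(y) - 2, -1, -1):  # backward fill
        if math.isnan(y[n]) and not math.isnan(y[n + 1]):
            y[n] = y[n + 1]
    for n in range(1, len(y)):  # forward fill
        if math.isnan(y[n]) and not math.isnan(y[n - 1]):
            y[n] = y[n - 1]
    return y


def normalize_by_max(x):
    """Scale a channel by its maximum value within the record."""
    peak = max(x)
    return [v / peak for v in x] if peak > 0 else list(x)


def preprocess_channel(x, window=5):
    return normalize_by_max(fill_missing(centered_moving_average(x, window)))


# Illustrative force samples (arbitrary units) at 100 Hz.
raw = [0.0, 0.0, 10.0, 20.0, 30.0, 20.0, 10.0, 0.0, 0.0]
clean = preprocess_channel(raw)
```

With a 5-sample window at 100 Hz, each smoothed value averages 50 ms of signal, and the two samples at each end are recovered by the fill steps.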
Gait-state segmentation was performed using the clustering-based adaptive threshold generation method described below. Briefly, each plantar-region signal was divided into two clusters by K-means clustering, with the number of clusters set to 2, the number of initializations set to 10, and the random seed set to 42. The adaptive threshold for each region was defined as the lower cluster centroid after convergence. Gait states were then identified based on these thresholds and the corresponding state-transition rules from which gait-cycle-related features were extracted.
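The threshold-generation idea can be sketched as a one-dimensional two-cluster partition. This is a simplified stand-in, not the configuration reported above: it uses a deterministic initialization at the signal minimum and maximum instead of repeated seeded initializations, and the function name and sample values are hypothetical.

```python
def two_means_threshold(signal, max_iter=100, tol=1e-9):
    """Split a 1-D signal into two clusters and return the lower centroid."""
    lo, hi = min(signal), max(signal)  # assumed deterministic initialization
    for _ in range(max_iter):
        # Assign every sample to the nearer of the two current centroids.
        low = [v for v in signal if abs(v - lo) <= abs(v - hi)]
        high = [v for v in signal if abs(v - lo) > abs(v - hi)]
        new_lo = sum(low) / len(low)
        new_hi = sum(high) / len(high) if high else hi
        # Stop when neither centroid moves any more.
        if abs(new_lo - lo) < tol and abs(new_hi - hi) < tol:
            break
        lo, hi = new_lo, new_hi
    return min(lo, hi)


# Illustrative normalized zone signal: low values (unloaded) and high values (loaded).
zone_signal = [0.02, 0.05, 0.03, 0.80, 0.90, 0.85, 0.04, 0.01]
threshold = two_means_threshold(zone_signal)
```

Because the threshold is taken from the data of each zone, it adapts to the loading range of that zone rather than relying on one fixed cut-off for all regions.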
No additional feature selection or dimensionality reduction was applied in the main experiments. Except for the ablation analysis, in which predefined feature groups were removed for comparison, all classifiers were trained on the full extracted feature set. Positive and negative infinity values were replaced with missing values and imputed using the median of each feature column. All numerical features were standardized before classification.
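The feature-cleaning step can be illustrated for a single feature column as follows. The sketch assumes population standard deviation for scaling and computes all statistics on the column it is given; whether these statistics were estimated on the training portion only is not stated here, so that choice is left open. The function name and values are hypothetical.

```python
import math
import statistics


def impute_and_standardize(column):
    """Replace non-finite values by the column median, then z-score the column."""
    finite = [v for v in column if math.isfinite(v)]
    med = statistics.median(finite)
    # Infinite (and NaN) entries are treated as missing and set to the median.
    filled = [v if math.isfinite(v) else med for v in column]
    mean = statistics.fmean(filled)
    std = statistics.pstdev(filled)
    if std == 0:
        return [0.0 for _ in filled]
    return [(v - mean) / std for v in filled]


# Illustrative feature column containing positive and negative infinity.
column = [1.0, 2.0, math.inf, 4.0, -math.inf, 3.0]
z = impute_and_standardize(column)
```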
For model development, the dataset was first divided into a training portion (70%) and an independent test set (30%) using stratified sampling. The 70% training portion was then further split into actual training and validation subsets in a stratified 8:2 ratio. This hierarchical partitioning strategy is consistent with common machine-learning evaluation practice, in which an independent hold-out test set is preserved for final assessment on unseen data, whereas the training portion is further divided for model fitting, validation, model selection, and hyperparameter tuning. Similar evaluation designs have also been reported in gait-related studies, including explicit subject-level training/validation/testing splits, and the use of an internal validation subset derived from the training data for model selection and hyperparameter tuning [9,54,55,56]. In addition, the independent test set was not used for model fitting, hyperparameter tuning, or validation-stage model selection.
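The hierarchical partition can be sketched with a small stratified splitter. The example applies it to subject labels (93 PD, 73 Normal) purely for illustration; the unit actually being split, the rounding rule, and the function name are assumptions of this sketch.

```python
import random
from collections import defaultdict


def stratified_split(labels, second_fraction, seed=42):
    """Split indices into two parts while keeping class proportions similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    first, second = [], []
    for lab in sorted(by_class):
        idxs = by_class[lab]
        rng.shuffle(idxs)
        n_second = round(second_fraction * len(idxs))
        second.extend(idxs[:n_second])
        first.extend(idxs[n_second:])
    return sorted(first), sorted(second)


labels = ["PD"] * 93 + ["Normal"] * 73

# Step 1: hold out 30% as the independent test set.
dev_idx, test_idx = stratified_split(labels, 0.30)

# Step 2: split the remaining 70% into training and validation (8:2).
dev_labels = [labels[i] for i in dev_idx]
tr_local, va_local = stratified_split(dev_labels, 0.20)
train_idx = [dev_idx[i] for i in tr_local]
val_idx = [dev_idx[i] for i in va_local]
```

Under these assumptions the three subsets contain 93, 23, and 50 items, and the test indices never enter the second split.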
Nine classifiers were evaluated: Decision Tree, Logistic Regression, K-Nearest Neighbors, Random Forest, Gaussian Naive Bayes, Gradient Boosting, Multilayer Perceptron, Support Vector Machine, and AdaBoost. For classifiers with stochastic components, was set when applicable. To address class imbalance, balanced class weights were used in Decision Tree, Logistic Regression, Random Forest, and Support Vector Machine. Logistic Regression and Multilayer Perceptron were trained with a maximum of 500 iterations. K-Nearest Neighbors used default settings (, Minkowski distance, ), and Random Forest used 100 trees. Gaussian Naive Bayes used the default . The Multilayer Perceptron had one hidden layer with 100 neurons, ReLU activation, and the Adam optimizer. The Support Vector Machine used an RBF kernel with , , and probability estimation enabled.
3.4. A Bodily-Kinesthetic Control Integration (BKCI)-Based FID-Gait Architecture for Smart-Insole Gait Analysis
As shown in Figure 1, the proposed Bodily-Kinesthetic Control Integration (BKCI)-based architecture for smart-insole gait analysis consists of the Body System Module, the Multidomain Feature Generation Module (FID-Gait), and the Multilevel Integration Module. Within this architecture, BKCI provides a conceptual perspective for organizing gait analysis as a closed-loop process linking bodily regulation, plantar insole sensing, multidomain feature construction, and decision making. Accordingly, the Body System Module describes the bodily control basis of gait, the FID-Gait module organizes plantar insole sensor signal information into multidomain gait representations, and the Multilevel Integration Module connects sensing, feature generation, and classification into a unified analytical framework.
3.4.1. Body System Module and Bodily-Kinesthetic Control Integration
The Body System Module focuses on the central nervous system (CNS) and peripheral nervous system (PNS), which form a closed-loop regulatory system through motor output and sensory feedback, including proprioceptive and tactile inputs [57,58]. From a neuromechanical perspective, gait results from interactions among neural control, musculoskeletal dynamics, and sensory feedback rather than isolated limb movements [57,58,59]. Accordingly, plantar insole sensor signals acquired by smart insoles reflect both foot–ground interaction and peripheral manifestations of motor-control states.
Based on this rationale, this paper introduces Bodily-Kinesthetic Control Integration (BKCI), which is derived from Bodily-Kinesthetic Intelligence (BKI) [35,60]. Whereas BKI emphasizes bodily perception, motor control, and environmental interaction, BKCI extends this perspective into a closed-loop framework for smart-insole gait analysis by integrating neural regulation, sensory feedback, plantar-loading variation, and gait-pattern representation.
In this framework, plantar insole sensor signals and derived gait features are interpreted as structured representations of impaired bodily control rather than isolated measurements or purely statistical descriptors. The multidomain features therefore capture complementary aspects of gait abnormality, including altered dynamic regulation, impaired coordination and temporal organization, and deviations from normal gait patterns. Thus, the proposed FID-Gait framework functions as both a classification pipeline and an interpretable representation of PD-related gait abnormality from the perspective of bodily control.
3.4.2. Multidomain Feature Generation Based on FID-Gait
FID-Gait is a multidomain representation framework for plantar insole sensor signal data, consisting of three complementary domains:
- (a) F: Fractal-Dimension Feature Generation (complexity domain). Fractal-dimension features are extracted from step-level plantar insole sensor data sequences to characterize nonlinear complexity and signal roughness. This domain uses time-domain fractal-dimension estimators, namely the Higuchi (HFD), Petrosian (PFD), Katz (KFD), and box-counting (BCFD) fractal dimensions, to describe irregular temporal dynamics.
- (b) I: Plantar-Loading–Phase Imbalance Feature Generation (imbalance domain). This domain includes two types of imbalance features: plantar-loading imbalance features, which describe asymmetrical loading between the left and right feet and across plantar regions, and gait-phase ratio imbalance features, which capture abnormalities in phase organization and temporal structure. The latter are derived from the Gait State Segmentation Module, where adaptive thresholds are generated by clustering and then used for gait-state set construction, providing a consistent basis for phase-ratio imbalance calculation.
- (c) D: Deviation Feature Generation (deviation domain). Using the normal-control distribution as a reference, this domain quantifies overall gait deviation with a covariance-aware metric. Specifically, the squared Mahalanobis distance is used as a continuous score of departure from the normal baseline.
In summary, FID-Gait adopts a fused three-domain representation—complexity, imbalance, and deviation—to characterize PD-related gait differences while maintaining interpretability.
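To make the deviation domain concrete, the following sketch computes the squared Mahalanobis distance $D^2 = (\mathbf{x} - \boldsymbol{\mu})^{\top} \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})$ of a feature vector from a reference sample. The two-feature reference data, the use of the sample covariance, and all function names are assumptions made for illustration; they do not reproduce the CAD features of this study.

```python
def mean_vector(rows):
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(len(rows[0]))]


def covariance_matrix(rows, mu):
    """Sample covariance (denominator n - 1) of the reference rows."""
    n, d = len(rows), len(mu)
    cov = [[0.0] * d for _ in range(d)]
    for r in rows:
        diff = [r[k] - mu[k] for k in range(d)]
        for a in range(d):
            for b in range(d):
                cov[a][b] += diff[a] * diff[b] / (n - 1)
    return cov


def solve(matrix, rhs):
    """Solve matrix @ y = rhs by Gaussian elimination with partial pivoting."""
    d = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, d):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, d + 1):
                aug[r][c] -= factor * aug[col][c]
    y = [0.0] * d
    for r in range(d - 1, -1, -1):
        s = sum(aug[r][c] * y[c] for c in range(r + 1, d))
        y[r] = (aug[r][d] - s) / aug[r][r]
    return y


def squared_mahalanobis(x, mu, cov):
    diff = [x[k] - mu[k] for k in range(len(mu))]
    y = solve(cov, diff)  # y = cov^{-1} @ diff without forming the inverse
    return sum(diff[k] * y[k] for k in range(len(mu)))


# Illustrative reference (normal-control) feature vectors with positive correlation.
reference = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
mu = mean_vector(reference)
cov = covariance_matrix(reference, mu)

score_center = squared_mahalanobis(mu, mu, cov)
score_along = squared_mahalanobis([6.0, 6.0], mu, cov)   # along the correlation
score_across = squared_mahalanobis([6.0, 0.0], mu, cov)  # against the correlation
```

The two test points lie at the same Euclidean distance from the reference mean, yet the point that violates the correlation structure receives the larger score, which is the behavior a covariance-aware deviation measure is meant to provide.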
3.4.3. Multilevel Integration Module
The Multilevel Integration Module (Figure 1) describes how plantar insole sensing, multidomain representation, and classification are connected within a coherent analytical pipeline [61]. In this paper, this module is conceptualized as a three-level integration mechanism that links data acquisition, feature construction, and final decision making within a unified framework.
At the physical measurement level, plantar insole sensor signals are collected through smart insole sensors during walking. These sensors continuously capture the distribution and temporal variation of foot–ground interaction, thereby providing the primary observational basis for subsequent gait analysis.
At the multidomain feature level, the acquired plantar insole sensor signals are transformed into structured gait representations through the construction and fusion of the three feature domains, namely Fractal Dimension, Imbalance, and Deviation, thus converting raw sensor measurements into interpretable descriptors of gait dynamics, coordination, and abnormality.
At the decision-making level, the fused multidomain features are further processed by machine-learning classifiers to generate detection outcomes. Within the BKCI framework, these outputs can also be linked to feedback-related pathways, including human multimodal feedback, so as to support anomaly prompting and potential training or intervention applications.
3.5. Plantar-Loading–Phase Imbalance Feature Generation
3.5.1. Gait State Segmentation for PLPI Construction
To compute interpretable gait-phase ratio imbalance descriptors in the plantar-loading–phase imbalance (PLPI) domain, we introduce a Gait State Segmentation Module within the FID-Gait framework. As illustrated in Figure 2, this module transforms multizonal plantar insole sensor signals into gait-state transition points and gait-state duration descriptors by combining clustering-based threshold generation with rule-based gait state detection. Thus, the module provides a consistent temporal partition of each gait cycle and establishes the basis for subsequent phase-organization and imbalance analysis.
Specifically, the smart insole is divided into five plantar anatomical zones: heel (H), rearfoot (RF), midfoot (MF), forefoot (FF), and toe (T). Let $x_z = \{x_z(t_n)\}_{n=1}^{N}$ denote the plantar insole sensor data sequence of the $z$-th zone for the left or right foot, where $N$ is the total number of sampled time points and $t_n$ is the time moment of the $n$-th sample acquired at sampling frequency $f_s$.
The five zonal plantar insole sensor signals are defined as
$$x_{\mathrm{T}},\quad x_{\mathrm{H}},\quad x_{\mathrm{RF}},\quad x_{\mathrm{MF}},\quad x_{\mathrm{FF}},$$
which correspond to the toe, heel, rearfoot, midfoot, and forefoot regions, respectively.
As shown in Figure 2, each zonal plantar insole sensor data sequence is processed independently using iterative two-cluster partitioning (KMeans-2) to generate an adaptive threshold for that zone. After convergence, the lower cluster centroid is taken as the threshold, yielding the threshold set
$$\Theta = \{\theta_{\mathrm{T}},\ \theta_{\mathrm{H}},\ \theta_{\mathrm{RF}},\ \theta_{\mathrm{MF}},\ \theta_{\mathrm{FF}}\},$$
where $\theta_z$ is the adaptive threshold of zone $z$. The detailed iterative definitions of cluster assignment, cluster-mean updating, and the stopping criterion are provided in Appendix A.
After threshold generation, the five adaptive zonal thresholds are jointly used to identify gait-state transition points, which are defined here as the time instants at which the plantar insole sensor signal of a given zone crosses its corresponding adaptive threshold in an ascending or descending direction. These transition points are threshold-crossing points rather than peak-force points. Based on the temporal relationships among the thresholds $\theta_{\mathrm{T}}$, $\theta_{\mathrm{H}}$, $\theta_{\mathrm{RF}}$, $\theta_{\mathrm{MF}}$, and $\theta_{\mathrm{FF}}$ of the toe, heel, rearfoot, midfoot, and forefoot zones, the gait cycle is segmented into six canonical gait states: Initial Contact (IC), Loading Response (LR), Mid-Stance (MS), Terminal Stance (TS), Pre-Swing (PS), and Swing (SW). These state transitions further define higher-level temporal structures, including the overall gait cycle duration $T_{\mathrm{cycle}}$, stance-state duration $T_{\mathrm{stance}}$, and swing-state duration $T_{\mathrm{swing}}$, as shown in Figure 2.
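The notion of a transition point as a threshold crossing can be sketched for a single zone as follows. The sketch covers only crossing detection; the multizonal state-transition rules are not reproduced. The strict comparison against the threshold, the function name, and the signal values are assumptions of this example.

```python
def threshold_crossings(signal, threshold):
    """Return sample indices of ascending and descending threshold crossings."""
    rising, falling = [], []
    for n in range(1, len(signal)):
        prev_above = signal[n - 1] > threshold
        curr_above = signal[n] > threshold
        if curr_above and not prev_above:
            rising.append(n)   # zone becomes loaded
        elif prev_above and not curr_above:
            falling.append(n)  # zone becomes unloaded
    return rising, falling


FS_HZ = 100  # sampling frequency of the recordings

# Illustrative normalized heel-zone signal covering two contacts.
heel = [0.0, 0.1, 0.6, 0.9, 0.7, 0.2, 0.0, 0.0, 0.5, 0.8, 0.3, 0.0]
rising, falling = threshold_crossings(heel, threshold=0.25)

# Loaded interval of each contact in milliseconds (10 ms per sample at 100 Hz).
contact_ms = [(f - r) * 1000 // FS_HZ for r, f in zip(rising, falling)]
```

Note that the crossing indices mark where the signal passes the threshold, not where it peaks, which matches the distinction drawn above between threshold-crossing points and peak-force points.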
Let $t_{i}^{\mathrm{start}}$ and $t_{i}^{\mathrm{end}}$ denote the start and end moments of gait state $i$. Let $N_c$ denote the number of gait cycles and $N_s$ denote the number of gait states within one normal gait cycle. For the $j$-th gait cycle, where $j = 1, \dots, N_c$, and for the $i$-th gait state, where $i = 1, \dots, N_s$, the duration of the $i$-th gait state in the $j$-th gait cycle is defined as
$$\Delta t_{i,j} = t_{i,j}^{\mathrm{end}} - t_{i,j}^{\mathrm{start}},$$
where $\Delta t_{i,j}$ denotes the time duration of the $i$-th gait state in the $j$-th gait cycle and serves as the temporal basis for subsequent phase-ratio feature construction. Since one normal gait cycle contains six canonical gait states, these durations fully describe the temporal structure of a cycle.
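Using the transition points detected from the five plantar zones, a gait state set is generated for each gait cycle. Accordingly, with $\Delta t_{i,j}$ denoting the duration of the $i$-th gait state in the $j$-th gait cycle, $N_s$ the number of gait states per cycle, and $N_c$ the number of gait cycles, the complete set of gait-state durations can be written as
$$\mathcal{D} = \{\Delta t_{i,j} \mid i = 1, \dots, N_s;\ j = 1, \dots, N_c\}.$$
Then, the total number of gait-state durations is
$$|\mathcal{D}| = N_s \times N_c = 6 N_c.$$
These gait-state durations provide the temporal basis for constructing PLPI descriptors, especially those characterizing phase organization and ratio-based imbalance. The complete multizonal gait-state transition rules are summarized in Appendix A.1.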
Figure 3 illustrates a representative adaptive-threshold-based segmentation result and the corresponding gait-state percentage distribution in the Normal group. The segmented states showed a physiologically reasonable temporal organization with the overall stance-related and swing-related portions remaining close to the canonical Normal-gait pattern of about 60% and 40%, respectively. In classical gait analysis, Initial Contact, Loading Response, Mid-Stance, Terminal Stance, and Pre-Swing approximately occupy 0–2%, 2–12%, 12–31%, 31–50%, and 50–62% of the gait cycle, respectively, which are followed by swing from about 62% to 100% [62,63]. Although the exact percentages of individual sub-phases were not identical to textbook kinematic definitions, the present results still support the robustness and physiological plausibility of the proposed pressure-threshold-based gait-state segmentation.
3.5.2. Plantar-Loading Asymmetry Features
Abnormal plantar-loading coordination during walking is mainly reflected in two aspects: left–right limb imbalance and impaired loading transfer from heel contact to forefoot propulsion within the same foot. These patterns are closely related to gait-event organization, weight shifting, and dynamic locomotor control, and thus they provide biomechanically meaningful information for gait-abnormality assessment [64,65,66,67]. Based on this rationale, plantar-loading asymmetry features were constructed from gait-cycle plantar insole sensor signals to quantify imbalance both within each foot and between the two feet. By jointly characterizing intra-foot anterior–posterior coordination and inter-foot asymmetry, these features provide an interpretable and compact representation of plantar-loading imbalance for subsequent gait assessment and classification.
From both functional and statistically corrected perspectives, the heel and forefoot were selected as the representative plantar regions. As shown in Figure 4a, threshold summaries across the five plantar regions were compared between the Normal and Parkinson groups at the subject level. After Wilcoxon rank-sum testing with Holm correction, only the heel and forefoot remained statistically significant, whereas the rearfoot, midfoot, and toe did not.
For representative forefoot-channel selection, Figure 4b shows that both P6 and P7 exhibited significant between-group differences after Holm adjustment, with rank-biserial correlations of 0.35 and 0.54, respectively. In contrast, the robust-dispersion comparison based on IQR/median showed no marked instability for either channel, suggesting that the group difference was mainly reflected by a shift in distribution location. Considering both disease sensitivity and robustness, P6 was retained as the representative forefoot channel. Accordingly, P1 and P6 were chosen as the representative channels for the heel and forefoot regions, respectively.
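For the $j$-th gait cycle, let $x_{L,\mathrm{H}}^{(j)}$, $x_{L,\mathrm{FF}}^{(j)}$, $x_{R,\mathrm{H}}^{(j)}$, and $x_{R,\mathrm{FF}}^{(j)}$ denote the normalized plantar insole sensor signal time series of the representative heel and forefoot channels of the left and right feet. For each representative channel $c$, the peak loading and mean loading are defined as
$$P_c^{(j)} = \max_{n \in \Omega_j} x_c^{(j)}(t_n), \qquad M_c^{(j)} = \frac{1}{N_j} \sum_{n \in \Omega_j} x_c^{(j)}(t_n),$$
where $\Omega_j$ denotes the set of sampling points in the $j$-th gait cycle and $N_j$ is the corresponding number of samples.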
To quantify the relative loading imbalance between two representative channels, the asymmetry index is defined as
$$\mathrm{AI}(a, b) = \frac{|a - b|}{a + b},$$
where $a$ and $b$ are the loading values of the two compared channels. Since the normalized plantar insole sensor signal values are nonnegative, $\mathrm{AI}(a, b) \in [0, 1]$, and a larger value indicates a greater loading difference between the two compared channels.
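Six representative channel pairs are considered: $(L_{\mathrm{FF}}, L_{\mathrm{H}})$, $(R_{\mathrm{FF}}, R_{\mathrm{H}})$, $(L_{\mathrm{FF}}, R_{\mathrm{H}})$, $(R_{\mathrm{FF}}, L_{\mathrm{H}})$, $(L_{\mathrm{FF}}, R_{\mathrm{FF}})$, and $(L_{\mathrm{H}}, R_{\mathrm{H}})$, where $L$ and $R$ denote the left and right feet and the subscripts H and FF denote the representative heel and forefoot channels. Here, $(L_{\mathrm{FF}}, L_{\mathrm{H}})$ and $(R_{\mathrm{FF}}, R_{\mathrm{H}})$ characterize within-foot forefoot–heel coordination, $(L_{\mathrm{FF}}, R_{\mathrm{H}})$ and $(R_{\mathrm{FF}}, L_{\mathrm{H}})$ characterize diagonal inter-foot loading relationships, and $(L_{\mathrm{FF}}, R_{\mathrm{FF}})$ and $(L_{\mathrm{H}}, R_{\mathrm{H}})$ characterize inter-foot asymmetry between homologous forefoot and heel regions, respectively.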
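Based on these channel pairs, the plantar-loading asymmetry features of the $j$-th gait cycle are defined as
$$a_k^{(j)} = \mathrm{AI}\big(P_{u_k}^{(j)}, P_{v_k}^{(j)}\big), \qquad a_{k+6}^{(j)} = \mathrm{AI}\big(M_{u_k}^{(j)}, M_{v_k}^{(j)}\big), \qquad k = 1, \dots, 6,$$
where $(u_k, v_k)$ is the $k$-th channel pair, $P$ and $M$ denote the peak and mean loading of a channel, and $\mathrm{AI}$ is the asymmetry index. Accordingly, the plantar-loading asymmetry feature vector for the $j$-th gait cycle is written as
$$\mathbf{a}^{(j)} = \big[a_1^{(j)}, a_2^{(j)}, \dots, a_{12}^{(j)}\big].$$
Here, $a_1^{(j)}$–$a_6^{(j)}$ are the peak-loading asymmetry features, whereas $a_7^{(j)}$–$a_{12}^{(j)}$ are the mean-loading asymmetry features. Accordingly, the plantar-loading asymmetry feature set for the $j$-th gait cycle consists of 12 features.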
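A compact sketch of how the 12 asymmetry features could be assembled for one gait cycle is given here. It assumes the normalized absolute difference $|a - b| / (a + b)$ as the asymmetry index, which is one common form consistent with the stated nonnegativity and boundedness; the small constant `eps`, the channel keys, the pair order, and the sample values are all hypothetical.

```python
def asymmetry_index(a, b, eps=1e-12):
    """Assumed form: normalized absolute difference of two nonnegative loadings."""
    return abs(a - b) / (a + b + eps)  # eps avoids division by zero (assumption)


# Six channel pairs: within-foot, diagonal, and homologous comparisons.
PAIRS = [
    ("L_FF", "L_H"), ("R_FF", "R_H"),    # within-foot forefoot-heel
    ("L_FF", "R_H"), ("R_FF", "L_H"),    # diagonal inter-foot
    ("L_FF", "R_FF"), ("L_H", "R_H"),    # homologous inter-foot
]


def loading_asymmetry_features(cycle):
    """Return 6 peak-loading followed by 6 mean-loading asymmetry features."""
    peak = {name: max(x) for name, x in cycle.items()}
    mean = {name: sum(x) / len(x) for name, x in cycle.items()}
    peak_feats = [asymmetry_index(peak[a], peak[b]) for a, b in PAIRS]
    mean_feats = [asymmetry_index(mean[a], mean[b]) for a, b in PAIRS]
    return peak_feats + mean_feats


# Illustrative normalized signals of one gait cycle (left heel loads more than right).
cycle = {
    "L_H": [0.8, 0.4, 0.0, 0.0],
    "L_FF": [0.0, 0.2, 0.6, 0.2],
    "R_H": [0.4, 0.2, 0.0, 0.0],
    "R_FF": [0.0, 0.2, 0.6, 0.2],
}
features = loading_asymmetry_features(cycle)
```

In this toy cycle the two forefoot channels are identical, so their homologous asymmetry is zero, while the heel pair yields an index of about one third.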
From a biomechanical perspective, these features provide a compact and interpretable description of gait imbalance. The within-foot forefoot–heel asymmetry terms reflect abnormalities in loading-transfer coordination from heel contact to forefoot propulsion; the between-foot homologous asymmetry terms characterize lateral loading bias between corresponding plantar regions; and the diagonal asymmetry terms reflect contralateral coordination changes during alternating support [
65,
66,
67].
3.5.3. Gait-Phase Ratio Imbalance Features
Abnormal gait is commonly characterized by imbalanced stance–swing proportions and disrupted temporal relationships among gait sub-phases. As these temporal patterns are closely linked to gait rhythm and phase coordination, they serve as informative indicators of gait abnormalities. Based on this rationale, this paper derives gait-phase ratio imbalance features from gait-state segmentation results to quantify representative temporal imbalance patterns within the gait cycle. These features characterize both the global stance–swing proportion and the local temporal coordination among non-swing sub-phases, providing an interpretable representation of gait-phase dysregulation.
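For the $j$-th gait cycle, the phase durations of the left foot are denoted as $T_{\mathrm{IC}}^{L}$, $T_{\mathrm{LR}}^{L}$, $T_{\mathrm{MS}}^{L}$, $T_{\mathrm{TS}}^{L}$, $T_{\mathrm{PS}}^{L}$, and $T_{\mathrm{SW}}^{L}$, and those of the right foot are denoted as $T_{\mathrm{IC}}^{R}$, $T_{\mathrm{LR}}^{R}$, $T_{\mathrm{MS}}^{R}$, $T_{\mathrm{TS}}^{R}$, $T_{\mathrm{PS}}^{R}$, and $T_{\mathrm{SW}}^{R}$, where the cycle index $j$ is omitted for brevity.
Accordingly, the total stance durations of the left and right feet are defined as
$$T_{\mathrm{stance}}^{S} = T_{\mathrm{IC}}^{S} + T_{\mathrm{LR}}^{S} + T_{\mathrm{MS}}^{S} + T_{\mathrm{TS}}^{S} + T_{\mathrm{PS}}^{S}, \qquad S \in \{L, R\},$$
and the corresponding gait-cycle durations are defined as
$$T_{\mathrm{cycle}}^{S} = T_{\mathrm{stance}}^{S} + T_{\mathrm{SW}}^{S}, \qquad S \in \{L, R\}.$$
First, six global phase-ratio features are constructed:
$$g_1 = \frac{T_{\mathrm{stance}}^{L}}{T_{\mathrm{cycle}}^{L}}, \quad g_2 = \frac{T_{\mathrm{SW}}^{L}}{T_{\mathrm{cycle}}^{L}}, \quad g_3 = \frac{T_{\mathrm{SW}}^{L}}{T_{\mathrm{stance}}^{L}}, \quad g_4 = \frac{T_{\mathrm{stance}}^{R}}{T_{\mathrm{cycle}}^{R}}, \quad g_5 = \frac{T_{\mathrm{SW}}^{R}}{T_{\mathrm{cycle}}^{R}}, \quad g_6 = \frac{T_{\mathrm{SW}}^{R}}{T_{\mathrm{stance}}^{R}}.$$
Here, $g_1$ to $g_3$ denote the stance-phase ratio, swing-phase ratio, and swing-to-stance ratio of the left foot, respectively, while $g_4$ to $g_6$ denote the corresponding features of the right foot.
To further characterize the relative temporal organization among non-swing sub-phases, the phase-ratio feature between any two phases $a$ and $b$ is defined as
$$r(a, b) = \frac{\min(T_a, T_b)}{\max(T_a, T_b)}.$$
For the left foot, $a, b \in \{\mathrm{IC}, \mathrm{LR}, \mathrm{MS}, \mathrm{TS}, \mathrm{PS}\}$ and $a \neq b$. According to the order (IC, LR), (IC, MS), (IC, TS), (IC, PS), (LR, MS), (LR, TS), (LR, PS), (MS, TS), (MS, PS), and (TS, PS), the left-foot sub-phase ratio features are defined as
$$g_{6+k} = r^{L}(a_k, b_k), \qquad k = 1, \dots, 10,$$
where $(a_k, b_k)$ denotes the $k$-th left-foot phase pair in the above order.
Similarly, for the right foot, $a, b \in \{\mathrm{IC}, \mathrm{LR}, \mathrm{MS}, \mathrm{TS}, \mathrm{PS}\}$ and $a \neq b$. According to the same order of phase pairs, the right-foot sub-phase ratio features are defined as
$$g_{16+k} = r^{R}(a_k, b_k), \qquad k = 1, \dots, 10,$$
where $(a_k, b_k)$ denotes the $k$-th right-foot phase pair in the above order.
Therefore, the gait-phase ratio imbalance feature set for the $j$-th gait cycle is defined as
$$\mathbf{g}^{(j)} = \big[g_1, g_2, \dots, g_{26}\big].$$
Among them, $g_1$–$g_6$ are the global temporal ratio features of stance and swing phases for the left and right feet, whereas $g_7$–$g_{26}$ are the relative temporal coordination features among non-swing sub-phases. By definition, all sub-phase ratio features lie within $[0, 1]$, and values closer to 1 indicate more similar phase durations.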
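The construction of the phase-ratio features can be sketched as follows. The example assumes a symmetric min/max ratio for sub-phase pairs, which is consistent with the statement that values closer to 1 indicate more similar durations, and it enumerates the ten non-swing pairs in lexicographic order. The pair order, function name, and durations (in milliseconds) are illustrative assumptions.

```python
from itertools import combinations

NON_SWING = ["IC", "LR", "MS", "TS", "PS"]


def phase_ratio_features(durations):
    """Return (3 global ratios, 10 sub-phase pair ratios) for one foot."""
    stance = sum(durations[p] for p in NON_SWING)
    swing = durations["SW"]
    cycle = stance + swing
    global_feats = [stance / cycle, swing / cycle, swing / stance]
    pair_feats = []
    for a, b in combinations(NON_SWING, 2):
        longer = max(durations[a], durations[b])
        shorter = min(durations[a], durations[b])
        # Assumed symmetric ratio in [0, 1]; 1 means equal durations.
        pair_feats.append(shorter / longer if longer > 0 else 0.0)
    return global_feats, pair_feats


# Illustrative phase durations of one gait cycle, in milliseconds.
left = {"IC": 20, "LR": 100, "MS": 190, "TS": 190, "PS": 120, "SW": 380}
right = {"IC": 20, "LR": 120, "MS": 150, "TS": 210, "PS": 140, "SW": 360}

left_global, left_pairs = phase_ratio_features(left)
right_global, right_pairs = phase_ratio_features(right)

# Global features of both feet first, then the sub-phase ratios of each foot.
features = left_global + right_global + left_pairs + right_pairs
```

Each foot contributes 3 global ratios and 10 pairwise ratios, so the assembled vector has 26 entries.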
3.6. Fractal-Dimension Feature Generation
In the complexity domain of FID-Gait, regional fractal-dimension features were extracted to characterize the nonlinear complexity of plantar insole sensor signals. Higuchi, Petrosian, Katz, and box-counting fractal dimensions were jointly used to capture complementary dynamic properties of insole sensor data sequences from the five plantar subregions of both feet. Compared with whole-foot analysis, this regional strategy reduces the masking of local dynamics caused by signal superposition and better preserves functional variations across the Heel, Rearfoot, Midfoot, Forefoot, and Toe. It thus facilitates the identification of local complexity abnormalities and their spatial distribution during gait. Additionally, inter-limb difference features between homologous subregions were constructed to quantify bilateral complexity asymmetry.
3.6.1. Definition of Regional Plantar Insole Sensor Data Sequences
Based on the five plantar subregions H, , , , and T, for any foot side , let the j-th gait cycle window contain sampling points, and let the sampling instant be written as (). Let represent the preprocessed plantar insole sensor signal value of the i-th original sensor on foot side S at time within the j-th window, where .
The regional insole sensor data sequences of the five plantar subregions are defined as follows: , , , , and .
3.6.2. Computation of Four Types of Fractal Dimensions
For each regional insole sensor data sequence
, the four time-domain fractal dimensions, HFD, PFD, KFD, and BCFD, characterize the complex dynamic features of the regional insole sensor data sequence from the perspectives of multiscale roughness, local oscillatory complexity, the relationship between trajectory length and overall extension, and the geometric covering-scale relationship, respectively. The detailed mathematical definitions of these four estimators are provided in
Appendix A.2.
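For orientation, three of the four estimators can be sketched in their commonly cited textbook forms. This is a minimal sketch, not the implementation used in this work: the definitions in Appendix A.2 take precedence, `kmax=4` is an arbitrary illustrative choice, constant signals are not handled, and the box-counting estimator is omitted because its grid construction depends on details not restated here.

```python
import math


def _slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def katz_fd(x):
    """Katz FD: log10(n) / (log10(n) + log10(d / L)); assumes a non-constant signal."""
    steps = [abs(b - a) for a, b in zip(x, x[1:])]
    total = sum(steps)                      # curve length L
    extent = max(abs(v - x[0]) for v in x)  # farthest excursion d from the first sample
    n = len(steps)
    return math.log10(n) / (math.log10(n) + math.log10(extent / total))


def petrosian_fd(x):
    """Petrosian FD based on the number of sign changes in the first difference."""
    diff = [b - a for a, b in zip(x, x[1:])]
    n_delta = sum(1 for a, b in zip(diff, diff[1:]) if a * b < 0)
    n = len(x)
    return math.log10(n) / (math.log10(n) + math.log10(n / (n + 0.4 * n_delta)))


def higuchi_fd(x, kmax=4):
    """Higuchi FD: slope of log L(k) against log(1/k) for k = 1..kmax."""
    n = len(x)
    log_inv_k, log_len = [], []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):
            n_steps = (n - 1 - m) // k
            if n_steps < 1:
                continue
            dist = sum(abs(x[m + i * k] - x[m + (i - 1) * k])
                       for i in range(1, n_steps + 1))
            lengths.append(dist * (n - 1) / (n_steps * k) / k)
        log_inv_k.append(math.log(1.0 / k))
        log_len.append(math.log(sum(lengths) / len(lengths)))
    return _slope(log_inv_k, log_len)


# A smooth monotone ramp is the simplest sanity check: all three estimators give 1.
ramp = [float(i) for i in range(32)]
print(katz_fd(ramp), petrosian_fd(ramp), higuchi_fd(ramp))
```

Rougher or more oscillatory sequences yield larger values, which is the property exploited when comparing regional sequences between groups.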
Figure 5 presents representative regional insole sensor data sequences and the corresponding fractal-fitting schematics for the five plantar subregions in the Normal and Parkinson groups. The results visually illustrate the differences in fractal patterns and waveform characteristics across subregions.
3.6.3. Construction of Left-Right Difference Features
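To characterize the complexity asymmetry between homologous plantar subregions on the two sides, left–right difference features were constructed for each subregion and each type of fractal dimension. For any subregion $r$ among the five plantar subregions and any fractal-dimension type $q \in \{\mathrm{HFD}, \mathrm{PFD}, \mathrm{KFD}, \mathrm{BCFD}\}$, the left–right difference is defined as
$$\Delta D_{q,r} = D_{q,r}^{L} - D_{q,r}^{R},$$
where $D_{q,r}^{L}$ and $D_{q,r}^{R}$ denote the type-$q$ fractal dimension of subregion $r$ on the left and right foot, respectively. Here, $\Delta D_{q,r} > 0$ indicates that the complexity of the corresponding subregion of the left foot is higher than that of the right foot, whereas $\Delta D_{q,r} < 0$ indicates that the complexity of the corresponding subregion of the right foot is higher.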
3.6.4. Regional Fractal-Dimension Feature Set
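For the $j$-th gait cycle window, four types of fractal dimensions are extracted from the five plantar subregions, and the corresponding left–right difference features are further constructed. Therefore, the regional fractal-dimension feature vector is defined as
$$\mathbf{F}_{j}^{\mathrm{RFD}} = \left[ D_{q,r}^{L},\; D_{q,r}^{R},\; \Delta D_{q,r} \right]_{q,\,r},$$
where $q$ runs over the four fractal-dimension types, $r$ runs over the five plantar subregions, $D_{q,r}^{L}$ and $D_{q,r}^{R}$ are the left-foot and right-foot values, and $\Delta D_{q,r} = D_{q,r}^{L} - D_{q,r}^{R}$.
This feature set contains a total of 60 features, corresponding to five plantar subregions, four types of fractal dimensions, and three components for each fractal-dimension type, namely the left-foot value, the right-foot value, and the left–right difference. Among them, $D_{q,r}^{L}$ and $D_{q,r}^{R}$ describe the local complexity levels of each subregion of the left and right feet, respectively, whereas $\Delta D_{q,r}$ describes the complexity asymmetry between homologous subregions on the two sides.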
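The assembly of the 60-dimensional vector can be sketched as follows. The feature names, the ordering, and the numeric values are hypothetical and chosen only for illustration; the difference is taken as left minus right so that a positive value corresponds to higher left-foot complexity.

```python
REGIONS = ["Heel", "Rearfoot", "Midfoot", "Forefoot", "Toe"]
FD_TYPES = ["HFD", "PFD", "KFD", "BCFD"]


def regional_fd_features(left, right):
    """Build (name, value) pairs for one gait-cycle window.

    left and right map (fd_type, region) to a fractal-dimension value.
    For each pair, the left value, right value, and left-minus-right
    difference are emitted in that order.
    """
    features = []
    for fd in FD_TYPES:
        for region in REGIONS:
            lv, rv = left[(fd, region)], right[(fd, region)]
            features.append((f"{fd}_{region}_L", lv))
            features.append((f"{fd}_{region}_R", rv))
            features.append((f"{fd}_{region}_diff", lv - rv))  # > 0: left more complex
    return features


# Hypothetical values for illustration only.
left = {(fd, rg): 1.30 for fd in FD_TYPES for rg in REGIONS}
right = {(fd, rg): 1.25 for fd in FD_TYPES for rg in REGIONS}
vec = regional_fd_features(left, right)
print(len(vec))  # 4 types x 5 subregions x 3 components
```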
For the distribution analysis in
Figure 6a, each regional fractal-dimension feature was first assessed for normality using the Shapiro–Wilk test, and between-group comparisons were then performed using the appropriate parametric or nonparametric test. All regional fractal-dimension candidates were jointly corrected within the same feature family using the Holm procedure.
Figure 6a summarizes the top 10 regional fractal-dimension features ranked by Holm-adjusted
p-values.
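The Holm step-down adjustment applied within a feature family can be sketched in a few lines. This is a generic illustration of the procedure, not the analysis script used here, and the p-values in the example are hypothetical.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = (m - rank) * p_values[idx]
        # Enforce monotonicity so adjusted values never decrease along the ranking.
        running_max = max(running_max, candidate)
        adjusted[idx] = min(1.0, running_max)
    return adjusted


# Hypothetical raw p-values for three candidate features.
print(holm_adjust([0.01, 0.04, 0.03]))
```

A feature is declared significant when its adjusted value falls below the chosen significance level, so ranking by adjusted p-value preserves the ranking by raw p-value.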
As shown in
Figure 6a, only three regional fractal-dimension features remained statistically significant after Holm correction, namely Midfoot HFD (Bilateral, adjusted
), Heel HFD (
(Left–Right), adjusted
), and Forefoot HFD (Bilateral, adjusted
). The remaining seven features in the top-10 ranking did not remain significant after correction. These results indicate that the most robust between-group differences were concentrated in HFD-derived features, involving both bilateral regional complexity and left–right asymmetry.
3.7. Covariance-Adjusted Deviation Feature Generation
To quantify the overall deviation of a single gait-cycle sample from the Normal gait pattern, this paper defines the covariance-adjusted deviation (CAD) feature based on the squared Mahalanobis distance. This metric measures the standardized displacement of a sample from the center of the Normal reference distribution while accounting for feature-scale differences and inter-feature correlations. Its squared value is directly used as a continuous indicator of deviation intensity.
Let the input feature vector corresponding to the $j$-th gait cycle sample be $\mathbf{x}_j \in \mathbb{R}^{d}$, where $d$ denotes the feature dimension. Specifically, $\mathbf{x}_j$ is formed by combining the plantar-loading asymmetry feature set and the regional fractal-dimension feature set. The covariance-adjusted deviation intensity of the $j$-th sample is then defined as
$$\mathrm{CAD}_j = (\mathbf{x}_j - \boldsymbol{\mu}_N)^{\top}\, \boldsymbol{\Sigma}_N^{-1}\, (\mathbf{x}_j - \boldsymbol{\mu}_N).$$
Here, $\boldsymbol{\mu}_N$ denotes the mean vector of the Normal-control reference distribution, $\boldsymbol{\Sigma}_N$ denotes the regularized covariance matrix of the Normal reference distribution, and $\boldsymbol{\Sigma}_N^{-1}$ denotes its inverse. The detailed definitions of the Normal reference mean vector, covariance matrix, and regularization form are provided in
Appendix A.3.
Accordingly, $\mathrm{CAD}_j$ is constructed as a one-dimensional continuous deviation feature. A larger $\mathrm{CAD}_j$ indicates that the corresponding sample exhibits a stronger overall deviation from the Normal reference distribution in the joint feature space, whereas a smaller $\mathrm{CAD}_j$ indicates that its feature structure is closer to the Normal gait pattern. Because this feature is defined under covariance constraints, it reflects not only the dispersion of individual feature dimensions but also the correlation structure among features, thereby providing a statistically consistent quantitative description of gait abnormality.
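The squared Mahalanobis distance underlying CAD can be sketched in pure Python as follows. The ridge term added to the covariance diagonal is an assumed regularization form (the form actually used is given in Appendix A.3), and the reference samples are hypothetical.

```python
def mean_vector(rows):
    """Column-wise mean of the Normal reference samples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance(rows, mu, ridge=1e-6):
    """Sample covariance plus ridge * I (the ridge form is an assumption)."""
    n, d = len(rows), len(mu)
    cov = [[sum((r[a] - mu[a]) * (r[b] - mu[b]) for r in rows) / (n - 1)
            for b in range(d)] for a in range(d)]
    for a in range(d):
        cov[a][a] += ridge
    return cov


def solve(matrix, rhs):
    """Solve matrix @ y = rhs by Gaussian elimination with partial pivoting."""
    d = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, d):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, d + 1):
                aug[r][c] -= factor * aug[col][c]
    y = [0.0] * d
    for i in reversed(range(d)):
        y[i] = (aug[i][d] - sum(aug[i][c] * y[c] for c in range(i + 1, d))) / aug[i][i]
    return y


def cad(x, mu, cov):
    """Squared Mahalanobis distance (x - mu)^T cov^{-1} (x - mu)."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    return sum(di * yi for di, yi in zip(diff, solve(cov, diff)))


# Hypothetical two-feature Normal reference samples.
normal = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
mu = mean_vector(normal)
cov = covariance(normal, mu)
print(cad([3.0, 3.0], mu, cov), cad([6.0, 0.0], mu, cov))
```

Solving the linear system instead of forming the inverse explicitly gives the same value and is usually the numerically safer choice.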
The CAD distributions in
Figure 6b were compared at the subject level after normality assessment with the Shapiro–Wilk test. Because CAD was analyzed as a single prespecified summary endpoint, this comparison did not involve a multiple-testing family and therefore no additional post hoc multiplicity correction was required. The between-group difference was evaluated using the Wilcoxon rank-sum test. As shown in
Figure 6b, the Parkinson group exhibited significantly higher CAD values than the Normal group (raw
), with a large rank-biserial effect size (
), together with a higher median and a broader distribution. This result indicates a stronger overall deviation from the Normal reference distribution in the fused feature space and further supports the effectiveness of CAD as an interpretable deviation-sensitive feature for characterizing abnormal gait patterns.
Spatial Box-Counting Fractal Dimension Based on CAD
To further characterize the distribution pattern of gait-cycle samples in a fused multidomain feature space, a Gait Feature Space was constructed based on the AS score, the RFD score, and CAD. In this space, each gait-cycle sample is represented as a three-dimensional point, so that gait imbalance, fractal complexity, and deviation from the Normal reference distribution can be jointly described within a unified geometric framework.
The AS score of the
j-th gait cycle sample was defined as
where
is the number of asymmetry-domain features,
is the value of the
k-th asymmetry feature of the
j-th gait cycle sample,
and
are the mean and standard deviation of the corresponding feature in the Normal reference samples, and
is a small constant introduced to avoid division by zero.
Similarly, the RFD score of the
j-th gait cycle sample was defined as
where
is the number of regional fractal-dimension features,
is the value of the
k-th fractal-dimension feature of the
j-th gait cycle sample, and
and
are the corresponding mean and standard deviation in the Normal reference samples.
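Both scores standardize each feature against the Normal reference statistics and then aggregate over the features of one domain. The sketch below assumes that the aggregation is the mean absolute z-score; this aggregation is an assumption made for illustration and may differ from the definition used in this work. The function name and the numeric values are hypothetical.

```python
def deviation_score(x, mu, sigma, eps=1e-8):
    """Mean absolute z-score of one sample against Normal reference statistics.

    The mean-of-|z| aggregation is an assumption; eps guards against a zero
    standard deviation, mirroring the small constant described in the text.
    """
    z = [abs(xk - mk) / (sk + eps) for xk, mk, sk in zip(x, mu, sigma)]
    return sum(z) / len(z)


# Hypothetical two-feature example: both features lie two reference SDs from the mean.
print(deviation_score([3.0, 0.0], [1.0, 2.0], [1.0, 1.0]))
```

The same helper can serve both domains, with the asymmetry features as input for the AS score and the regional fractal-dimension features as input for the RFD score.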
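Based on these definitions, the three coordinates of the $j$-th gait cycle sample in the 3D Gait Feature Space were given by
$$X_j = \mathrm{AS}_j, \qquad Y_j = \mathrm{RFD}_j, \qquad Z_j = \mathrm{CAD}_j.$$
Accordingly, each gait cycle sample was represented as $\mathbf{p}_j = (X_j, Y_j, Z_j)$, and the corresponding set of gait cycle samples for subject $u$ was written as
$$\mathcal{P}_u = \{\mathbf{p}_j\}_{j=1}^{N_u},$$
where $N_u$ denotes the number of valid step-level samples of subject $u$.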
In this representation, the
X-,
Y-, and
Z-axes, respectively, denote sample displacement in the gait-imbalance domain, displacement in the regional fractal-complexity domain, and overall abnormal deviation intensity in the covariance-aware fused feature space, jointly forming a unified 3D Gait Feature Space for step-level gait representation. To characterize subject-specific sample distributions in this space, a three-dimensional box-counting fractal-dimension analysis was performed (
Appendix A.3). A lower spatial fractal dimension indicates a more compact and regular distribution, whereas a higher value indicates a more dispersed and irregular spatial pattern. Therefore, the spatial box-counting fractal dimension complements CAD by providing a geometric measure of multidomain gait abnormality.
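A three-dimensional box-counting estimate over a subject's point set can be sketched as follows. The grid sizes, the min-max scaling to the bounding box, and the least-squares fit are illustrative assumptions; the procedure actually used is specified in the appendix.

```python
import math


def box_counting_dimension(points, grid_sizes=(2, 4, 8, 16)):
    """Slope of log N(k) against log k, where N(k) counts occupied voxels
    when each axis of the bounding box is divided into k cells."""
    dims = range(3)
    lo = [min(p[a] for p in points) for a in dims]
    # A degenerate axis (zero extent) is given unit span to avoid division by zero.
    span = [(max(p[a] for p in points) - lo[a]) or 1.0 for a in dims]
    xs, ys = [], []
    for k in grid_sizes:
        occupied = {
            tuple(min(int((p[a] - lo[a]) / span[a] * k), k - 1) for a in dims)
            for p in points
        }
        xs.append(math.log(k))
        ys.append(math.log(len(occupied)))
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))


# Sanity checks on synthetic point sets: a line has dimension 1, a filled cube 3.
line = [(i / 999, i / 999, i / 999) for i in range(1000)]
cube = [(i / 15, j / 15, k / 15)
        for i in range(16) for j in range(16) for k in range(16)]
print(box_counting_dimension(line), box_counting_dimension(cube))
```

With a finite number of gait cycles per subject, the estimate depends on the chosen grid sizes, so values are best compared between subjects under identical settings.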
As shown in
Figure 7, the Normal subject forms a compact cluster with limited voxel occupancy and a spatial fractal dimension of 0.9529, whereas the Parkinson subject exhibits a more dispersed distribution across the AS, RFD, and CAD axes and a higher fractal dimension of 1.2586, indicating stronger step-level deviations and greater multidomain structural complexity.
Figure 7c further shows that the Parkinson group is shifted toward higher spatial fractal-dimension values, peaking at approximately 1.2380, compared with 0.9615 for the Normal group. Overall, these results indicate that the spatial box-counting fractal dimension complements CAD by capturing both deviation from the Normal baseline and the organizational complexity of gait states in the fused multidomain space.
The distributions of all normalized model-input features are provided in Appendix C (Figure A1).
4. Experiments and Results
To systematically evaluate the effectiveness and reproducibility of the proposed FID-Gait framework for Parkinson’s disease (PD) identification, experiments were conducted using plantar insole sensor data collected by smart insoles. The experimental analysis consisted of two parts: a gait-cycle-level classification experiment to evaluate the discriminative capability of the proposed multidomain fused features for abnormal gait samples and a subject-level classification experiment to further assess the stability and discriminative ability of the model at the individual level.
Prior to classification modeling, the raw plantar insole sensor signals were preprocessed by smoothing, boundary completion, and within-record normalization, which were followed by gait segmentation, cycle alignment, and feature extraction. Based on these procedures, a multidomain fused feature set was constructed for classification, including five feature groups across three domains: the fractal domain (regional fractal-dimension features), the PLPI domain (plantar-loading asymmetry features and gait-phase ratio features), and the deviation domain (covariance-adjusted deviation features and spatial box-counting fractal-dimension features).
To reduce scale discrepancies and distributional imbalance across different feature domains, thereby improving the stability and separability of multidomain fusion modeling while preventing data leakage, feature standardization was further applied to the classification inputs. Specifically, within each experimental split, the mean $\mu_j$ and standard deviation $\sigma_j$ of the $j$-th feature were estimated exclusively from the training set, and each feature value $x_j$ was transformed as
$$z_j = \frac{x_j - \mu_j}{\sigma_j}.$$
The same parameters were then consistently applied to the corresponding validation and test sets.
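The leakage-free standardization can be sketched as a fit step on the training rows followed by a transform step on any split. The use of the population standard deviation and the replacement of a zero standard deviation by 1 are assumptions made for this illustration, and the data are hypothetical.

```python
import math


def fit_standardizer(train_rows):
    """Per-feature mean and standard deviation estimated from training rows only."""
    n = len(train_rows)
    columns = list(zip(*train_rows))
    means = [sum(col) / n for col in columns]
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / n)
            for col, m in zip(columns, means)]
    return means, stds


def transform(rows, means, stds):
    """Apply the training-set parameters; a zero std is replaced by 1 (assumption)."""
    return [[(v - m) / (s or 1.0) for v, m, s in zip(row, means, stds)]
            for row in rows]


# Hypothetical data: parameters come from the training rows and are reused on test rows.
train = [[1.0, 10.0], [3.0, 30.0]]
test = [[5.0, 20.0]]
means, stds = fit_standardizer(train)
print(transform(test, means, stds))
```

Fitting on the training set alone means that test-set statistics never influence the transformation, which is the leakage safeguard described above.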
To assess the discriminative performance of the proposed feature framework, nine classical machine learning classifiers were compared: Decision Tree, Logistic Regression, K-Nearest Neighbors, Random Forest, Gaussian Naive Bayes, Gradient Boosting, Multilayer Perceptron, Support Vector Machine, and AdaBoost. The experiments employed stratified splitting for the training, validation, and test sets, and five-fold stratified cross-validation was further performed to evaluate model robustness and generalization ability. The evaluation metrics included Accuracy, Precision, Recall, F1-score, and AUC. The first four metrics were defined as
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{Precision} = \frac{TP}{TP + FP},$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}, \qquad \mathrm{F1} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}},$$
where $TP$, $TN$, $FP$, and $FN$ denote the numbers of true positives, true negatives, false positives, and false negatives, respectively. In addition, the area under the receiver operating characteristic curve (AUC) was adopted to measure the overall discriminative ability of a classifier across different decision thresholds.
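The threshold-based metrics follow directly from the four confusion-matrix counts, as the short sketch below shows. The counts in the example are hypothetical, and returning 0 when a denominator is empty is a convention assumed here.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, Precision, Recall, and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts with PD as the positive class.
metrics = classification_metrics(tp=8, tn=5, fp=2, fn=1)
print(metrics)
```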
4.1. Comparative Classification Performance
First, the proposed method was evaluated on gait-cycle-level features under a standard 7:3 split setting, with an independent validation set further separated from the training portion. Unlike the aggregated discrimination at the subject level, each sample in this setting corresponds to an aligned gait cycle, so the resulting performance more directly reflects the ability of the model to identify local cycle-level gait patterns.
The classification results of the nine evaluated classifiers under the gait-cycle-level 7:3 split setting are summarized in Table 3. Most classifiers achieved high performance on this task, suggesting that the proposed multidomain fused representation effectively distinguishes PD gait cycles from normal gait cycles.
Among all classifiers, Multilayer Perceptron (MLP) achieved the best overall performance at the gait-cycle level with the highest Accuracy (0.9911) and F1-score (0.9947). Random Forest yielded the highest Recall (0.9986) and AUC (0.9990). KNN also showed highly competitive performance, achieving an Accuracy of 0.9894 and an F1-score of 0.9938. Overall, under the present random gait-cycle-level split setting, the multidomain fused features provided strong discrimination between PD and Normal gait cycles, suggesting that the proposed complexity–imbalance–deviation representation is effective for identifying abnormal cycle-level gait patterns.
Subject-level classification was also performed under the same 7:3 split setting. In this case, each sample corresponds to an individual subject rather than a single aligned gait cycle, and the results therefore reflect the stability and discriminative ability of the model at the individual level more directly. The corresponding results are summarized in Table 4.
Under the 7:3 split setting, Logistic Regression and AdaBoost achieved the highest Accuracy, both reaching 0.9022. Of the two, Logistic Regression yielded the highest AUC (0.9621), whereas AdaBoost achieved the highest F1-score (0.9302). MLP was also competitive, with an Accuracy of 0.8913, the highest Recall of 0.9531, and an AUC of 0.9481. These findings suggest that FID-Gait retains good discriminative performance at the subject level.
Figure 8 presents the visualization results of the top-performing models at the two levels. Figure 8a,b show the ROC curve and the corresponding confusion matrix for MLP, which achieved the highest Accuracy and F1-score in gait-cycle-level classification. At the subject level, AdaBoost was selected for visualization because it achieved the highest F1-score while sharing the highest Accuracy with Logistic Regression. Figure 8c,d present the ROC curve and the corresponding confusion matrix for AdaBoost. These results further demonstrate the effectiveness of the proposed multidomain fused features for PD gait classification at both the gait-cycle and subject levels.
4.2. Independent Validation Performance
To further assess the within-training stability and generalization ability of FID-Gait, an independent validation set was separated from the training portion at a ratio of 8:2 after the standard 7:3 split, and validation was performed at both the gait-cycle and subject levels before testing.
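The two-stage partitioning (7:3 into training and test portions, then 8:2 within the training portion) can be sketched with a small stratified splitter. The label vector, the seeds, and the rounding rule are hypothetical choices for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, held_out_fraction, seed=0):
    """Return (kept, held_out) index lists; each class contributes
    approximately held_out_fraction of its members to the held-out part."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    kept, held_out = [], []
    for members in by_class.values():
        rng.shuffle(members)
        cut = round(len(members) * held_out_fraction)
        held_out.extend(members[:cut])
        kept.extend(members[cut:])
    return sorted(kept), sorted(held_out)


# Hypothetical labels: 60 PD samples (1) and 40 Normal samples (0).
labels = [1] * 60 + [0] * 40
train_val, test = stratified_split(labels, 0.3)                  # 7:3 split
sub_labels = [labels[i] for i in train_val]
train_pos, val_pos = stratified_split(sub_labels, 0.2, seed=1)   # 8:2 split
train = [train_val[i] for i in train_pos]
val = [train_val[i] for i in val_pos]
print(len(train), len(val), len(test))
```

The second call operates on positions within the training portion, so the indices are mapped back to the original sample indices before use.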
At the gait-cycle level, the independent validation results remained competitive, although overall performance was lower than on the 7:3 test split (best Accuracy of 0.9055 versus 0.9911), and the best-performing classifier varied across metrics. Gradient Boosting achieved the highest Accuracy (0.9055) and F1-score (0.9460), whereas MLP achieved the highest AUC (0.9023). AdaBoost also showed competitive performance with an Accuracy of 0.9046 and an F1-score of 0.9454. These findings indicate that the gait-cycle-level features retained discriminative value across different training-data sub-splits despite some sensitivity to data partitioning.
At the subject level, the validation results showed a pattern broadly consistent with that of the 7:3 split experiment, although the top-performing classifier changed under the new sub-split. Random Forest achieved the highest Accuracy (0.8837) and F1-score (0.9091), whereas MLP yielded the highest AUC (0.9154) and maintained strong overall performance. AdaBoost also remained stable, achieving an Accuracy of 0.8605 and an F1-score of 0.8889. Overall, these results suggest that FID-Gait maintains good stability and generalization capability, particularly at the subject level. The representative independent validation results at the gait-cycle and subject levels are summarized in
Table 5.
4.3. Contribution Analysis of Feature Domains
To assess the contribution of different feature domains to classification performance, an ablation study was performed. Starting from the full feature set, three settings were constructed by removing the PLPI, deviation, and fractal domains, respectively, while keeping the same training and testing configuration.
Table 6 presents the results obtained using MLP at the gait-cycle level and AdaBoost at the subject level.
At the gait-cycle level, all ablation settings reduced performance compared with the full feature set. The full feature set achieved the best results with Accuracy, Precision, Recall, F1-score, and AUC scores of 0.9911, 0.9940, 0.9955, 0.9947, and 0.9973, respectively. Removing the fractal domain caused the largest decline, reducing Accuracy, F1-score, and AUC to 0.9799, 0.9882, and 0.9891, respectively. Removing the PLPI domain also decreased performance with Accuracy and F1-score dropping to 0.9893 and 0.9937. In contrast, removing the deviation domain produced only a small decline with Accuracy, F1-score, and AUC results of 0.9909, 0.9946, and 0.9969. These results indicate that the fractal domain contributed the most at the gait-cycle level.
At the subject level, the full feature set again achieved the best overall performance, with Accuracy, F1-score, and AUC values of 0.9022, 0.9302, and 0.9235, respectively. Removing the PLPI domain reduced Accuracy and F1-score to 0.8696 and 0.9048, although AUC increased slightly to 0.9364. Removing the deviation domain caused the largest decline, reducing Accuracy, F1-score, and AUC to 0.8152, 0.8722, and 0.8677, respectively. Removing the fractal domain also lowered Accuracy and F1-score to 0.8696 and 0.9032, with an AUC of 0.9219.
Figure 9 summarizes the contribution of each feature domain by showing the average performance degradation of each ablation setting relative to the full feature set across all classifiers and both evaluation levels. Larger values indicate greater performance loss after removal of the corresponding domain. Among the three domains, removing the fractal domain caused the largest decreases in Accuracy, Recall, and F1-score, indicating its dominant contribution to discriminative performance. Removing the PLPI domain also led to clear reductions, particularly in Recall and F1-score, highlighting its complementary role. In contrast, removing the deviation domain produced relatively smaller decreases across most metrics, although its effect on Precision remained noticeable. These results are consistent with
Table 6 and further support the complementary value of the proposed multidomain feature representation. The corresponding statistical results are provided in Appendix B (Table A2).
4.4. Comparative Evaluation of Fractal-Dimension Components
As shown in Table 6 and Figure 9, removing the fractal domain caused the most pronounced performance degradation at the gait-cycle level (Table 6) and on average across classifiers and evaluation levels (Figure 9). Therefore, a refined ablation experiment was further conducted to compare different fractal-dimension components. Starting from a baseline model without any fractal features, four additional settings were constructed by introducing only one fractal descriptor, namely HFD, PFD, KFD, or BCFD, while keeping the remaining non-fractal features unchanged. The results of all nine classifiers under each setting are listed in Table 7, and the averaged gait-cycle-level and subject-level results across classifiers are summarized in Table 8 and Table 9, respectively.
From the averaged gait cycle-level results, the setting with only HFD achieved the best performance with mean Accuracy, mean F1-score, and mean AUC values reaching 0.9205, 0.9470, and 0.9571, respectively, all of which were higher than those of the baseline model without fractal features (0.9036, 0.9347, and 0.9442). The settings with only KFD and only BCFD also yielded performance improvements to varying degrees, whereas the setting with only PFD showed slightly lower mean Accuracy and mean F1-score values than the baseline despite a marginally higher mean AUC. Overall, HFD was the most effective single fractal component at the gait cycle level.
At the individual classifier level, MLP under the “Base + only HFD” setting achieved the highest gait-cycle-level Accuracy and F1-score, with values of 0.9887 and 0.9933, respectively. Compared with the corresponding baseline MLP model, its Accuracy, F1-score, and AUC all improved. Similar trends were observed for several other classifiers, such as KNN, SVM, and Decision Tree, further indicating that HFD provided the most effective complementary discriminative information among the four fractal-dimension descriptors.
For the subject-level evaluation, only the averaged cross-classifier results are reported, rather than listing the detailed performance of each classifier individually, because the main purpose of this analysis is to provide a complementary validation of the gait cycle-level findings from the perspective of aggregated decision making. As shown in
Table 9, the setting with only PFD achieved the best averaged subject-level performance with Accuracy, Precision, Recall, F1-score, and AUC reaching 0.8563, 0.9010, 0.8924, 0.8962, and 0.9044, respectively, all of which were higher than those of the baseline model. In contrast, although HFD, KFD, and BCFD also showed certain gains, their overall subject-level performance remained inferior to that of PFD. These findings suggest that HFD was more advantageous at the gait cycle level, whereas PFD exhibited better overall adaptability after subject-level aggregation.
4.5. Cross-Validation Performance and Robustness
To further evaluate the robustness of the proposed FID-Gait framework, five-fold stratified cross-validation was conducted at both the gait-cycle level and the subject level for all classifiers. For each evaluation metric, the mean across the five folds and the corresponding 95% confidence interval (CI) were calculated from the fold-wise results using the t-distribution. The results are summarized in Table 10 and Table 11.
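For five folds, the t-based interval has four degrees of freedom. The following minimal sketch uses the tabulated two-sided 95% critical value for that case; the fold scores are hypothetical.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t with 4 degrees of freedom.
T_CRIT_DF4 = 2.776445


def mean_ci_five_folds(scores):
    """Mean and 95% t-interval (lower, upper) from exactly five fold-wise scores."""
    if len(scores) != 5:
        raise ValueError("this sketch hard-codes the critical value for five folds")
    mean = statistics.mean(scores)
    half_width = T_CRIT_DF4 * statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, mean - half_width, mean + half_width


# Hypothetical fold-wise Accuracy values.
mean, lower, upper = mean_ci_five_folds([0.90, 0.92, 0.94, 0.96, 0.98])
print(round(mean, 4), round(lower, 4), round(upper, 4))
```

With only five folds, the interval is noticeably wider than a normal-approximation interval would be, which is why the t-distribution is the appropriate reference.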
As shown in
Table 10, the gait cycle-level cross-validation results were broadly consistent with the test-set evaluation. MLP achieved the highest mean Accuracy (0.9921) and F1-score (0.9953), Random Forest yielded the highest mean Recall (0.9986), and SVM achieved the highest mean Precision (0.9962). Random Forest and SVM both reached the highest AUC (0.9991). KNN and Gradient Boosting also maintained strong performance, whereas Logistic Regression and Gaussian Naive Bayes performed relatively worse, with Gaussian Naive Bayes showing the lowest results across most metrics.
At the subject level, the cross-validation results also supported the robustness of the proposed representation, as shown in Table 11. AdaBoost achieved the highest mean Accuracy (0.9314), mean Precision (0.9523), mean F1-score (0.9515), and mean AUC (0.9807), whereas Random Forest obtained the highest mean Recall (0.9627). Gradient Boosting and MLP also showed competitive performance. In contrast, Gaussian Naive Bayes and KNN showed relatively lower subject-level performance, particularly in terms of mean Accuracy and mean F1-score.
Figure 10 further illustrates the distributions of Accuracy and F1-score for the top five classifiers under five-fold cross-validation at both the subject and gait cycle levels. Overall, gait cycle-level models achieved higher performance and smaller inter-fold variability than subject-level models, indicating better stability across data partitions.
At the subject level, AdaBoost showed the best overall performance, with higher central values and a more favorable distribution for both Accuracy and F1-score, which is consistent with the mean results in
Table 11. Gradient Boosting, Random Forest, and MLP also performed well, whereas Decision Tree showed lower central values and greater dispersion, indicating lower stability across folds.
At the gait cycle level, MLP showed the best performance, with the highest Accuracy and F1-score and relatively small dispersion, indicating both strong discrimination and high stability across folds. SVM, KNN, and Random Forest also showed strong and stable performance, whereas Decision Tree exhibited slightly lower central values and greater variability.
These visual results are consistent with the quantitative findings in
Table 10 and
Table 11, further confirming the robustness of the proposed FID-Gait representation under different cross-validation partitions. In particular, AdaBoost performed best at the subject level, whereas MLP achieved the best performance at the gait cycle level.
4.6. Performance Comparison of Machine Learning and Deep Learning Models Across Different Data Levels
Table 12 presents the classification performance of different models at the raw data, subject, and gait-cycle levels. The evaluation metrics include accuracy and average computation time per subject. The raw-data-level deep learning models used plantar insole sensor time-series signals as input, whereas the subject-level and gait-cycle-level models were built on the proposed three-domain feature representation after preprocessing, gait segmentation, and feature extraction. To improve comparability, the classifier settings and training-related parameters were kept consistent whenever applicable.
The deep learning baselines were implemented with a fixed random seed of 42. At the subject level, gait cycles were grouped by subject and ordered by cycle index to form variable-length sequences, which were zero-padded to a common length. The data were split in a stratified manner into training, validation, and test subsets, and feature standardization was fitted on the training set only. All deep learning models were trained using Adam with a learning rate of , a batch size of 16, and 40 epochs, and the model with the best validation F1-score was retained for final evaluation. The LSTM and BiLSTM models used a single recurrent layer with a hidden size of 64 with the BiLSTM variant using bidirectional recurrence. In the CNN-LSTM and CNN-BiLSTM models, the convolutional front-end consisted of two one-dimensional convolutional layers with kernel size 3, padding 1, and 64 channels, which was followed by the corresponding recurrent layer. The dropout rate was 0.3, and the MLP baseline used 128 hidden units in the first fully connected layer.
At the raw-data level, model accuracy ranged from 0.5700 to 0.7000, with CNN-LSTM and CNN-BiLSTM achieving the highest accuracy of 0.7000, whereas MLP required the shortest computation time (0.2974 s). At the subject level, accuracy increased to 0.8387–0.9022, with AdaBoost achieving the highest accuracy of 0.9022; computation times were similar across models (0.9807–0.9876 s). At the gait-cycle level, all models achieved accuracies above 0.9900. BiLSTM obtained the highest accuracy (0.9937), whereas MLP showed the shortest average computation time per subject (1.3342 s), indicating a favorable balance between accuracy and efficiency.
The computation times reported in the table represent the average per subject. In large-scale or full-dataset scenarios, these differences would accumulate and become more pronounced. Overall, MLP and AdaBoost not only achieved strong classification performance at their respective levels but also showed clear advantages in computational efficiency, indicating greater practical value in real-world applications.
4.7. Subject-Independent Evaluation Using Leave-One-Subject-Out Cross-Validation
To further evaluate the classification performance of the proposed FID-Gait framework under unseen-subject conditions, leave-one-subject-out cross-validation (LOSO) was conducted at the subject level. In each iteration, one subject was used as the test set, and the remaining subjects were used for training. The predictions from all iterations were aggregated to calculate Accuracy, Precision, Recall, F1-score, and AUC under this strict subject-independent setting.
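The LOSO partitioning can be sketched as a generator over unique subject identifiers. The identifiers are hypothetical; the same helper also covers the case where several rows belong to one subject, in which case all of that subject's rows are held out together.

```python
def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one per subject."""
    for subject in sorted(set(subject_ids)):
        test = [i for i, s in enumerate(subject_ids) if s == subject]
        train = [i for i, s in enumerate(subject_ids) if s != subject]
        yield subject, train, test


# Hypothetical subject identifiers, one per row of the feature matrix.
ids = ["s1", "s1", "s2", "s3", "s3", "s3"]
splits = list(loso_splits(ids))
print(len(splits))  # one iteration per unique subject
```

Predictions collected across all iterations are then pooled before the metrics are computed, as described above.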
The subject-level LOSO results are summarized in
Table 13. Overall, all classifiers maintained relatively good performance. Among them, MLP achieved the highest Accuracy (0.8954), Precision (0.9252), F1-score (0.9252), and AUC (0.9268), whereas Random Forest yielded the highest Recall (0.9533). Gradient Boosting, SVM, and AdaBoost also showed competitive performance, with F1-scores of 0.9144, 0.9125, and 0.9032, respectively. By contrast, Gaussian Naive Bayes and Decision Tree showed relatively lower performance.
Overall, although LOSO is a more stringent evaluation protocol, the proposed FID-Gait framework still achieved good classification performance across multiple classifiers, indicating that the constructed multidomain fused features retain good discriminative ability under subject-independent evaluation.
4.8. Accuracy Comparison with Previous Studies Based on Plantar Insole Sensor Signals for PD Classification
To further evaluate the effectiveness of the proposed framework,
Table 14 compares the classification accuracy of the proposed method with representative studies reported in the literature. As shown in the table, previous studies mainly relied on time-domain features, frequency-domain features, combined time- and frequency-domain features, raw time-series signals, or spectrogram image representations derived from insole sensor data. Their reported accuracies ranged from 77.33% to 93.75%, depending on the adopted feature representation and classifier model.
In comparison, the proposed FID-Gait framework achieved superior performance at the gait cycle level, reaching an accuracy of 99.11% with the MLP classifier. In addition, at the subject level, the best-performing classifier, AdaBoost, achieved an accuracy of 90.22%. Unlike methods based only on time-domain or frequency-domain information, the proposed framework integrates three complementary domains, namely the fractal domain, imbalance domain, and deviation domain, thereby providing a more comprehensive representation of PD-related gait characteristics.
Overall, the comparative results suggest that the proposed FID-Gait framework achieves highly competitive performance relative to existing studies on insole sensor data-based PD gait classification, especially at the gait cycle level, while also maintaining strong discriminative ability at the subject level.
5. Discussion
This paper proposed the FID-Gait framework for PD identification based on plantar insole sensor data by integrating fractal-domain, imbalance-domain, and deviation-domain features. The results showed that the proposed multidomain representation achieved strong discriminative performance at both the gait-cycle level and the subject level under the present experimental settings. Under the gait-cycle-level 7:3 split setting, MLP achieved the highest Accuracy (0.9911) and F1-score (0.9947). Under the subject-level 7:3 split setting, AdaBoost and Logistic Regression yielded the highest Accuracy (0.9022), whereas AdaBoost achieved the highest F1-score (0.9302). The five-fold cross-validation results further supported the robustness of the proposed framework within this dataset with MLP performing best at the gait-cycle level and AdaBoost performing best at the subject level. Collectively, these findings suggest that jointly modeling nonlinear gait complexity, biomechanical imbalance, and deviation from a Normal reference distribution can improve the discriminative capability of PD gait analysis.
The ablation results showed that the fractal domain contributed most substantially to the overall performance, particularly at the gait-cycle level, suggesting that fractal features capture aspects of gait complexity that are not adequately represented by conventional statistical descriptors. The imbalance domain provided complementary information related to plantar loading distribution and gait-phase organization, whereas the deviation domain appeared to be especially important at the subject level. In particular, removing the deviation domain reduced subject-level Accuracy from 0.9022 to 0.8152, supporting its relevance for subject-wise global abnormality assessment. These findings further support the complementary roles of the three feature domains in characterizing PD-related gait abnormalities.
A notable result of this paper is that the optimal classifier varied across evaluation protocols. Under the subject-level 7:3 split and five-fold cross-validation settings, AdaBoost showed the most favorable overall performance, whereas under the stricter subject-level LOSO setting, MLP became the best-performing classifier. Because LOSO leaves one entire subject out for testing in each iteration, it provides a more conservative assessment of subject-independent performance than random split or conventional k-fold validation [71]. In this context, our results suggest that AdaBoost was more effective at capturing discriminative structure under dataset-dependent partitioning, whereas MLP showed comparatively stronger performance under the LOSO protocol within this dataset. The representative independent validation results further showed that validation-stage stability and final test performance were not necessarily identical, although MLP, AdaBoost, Gradient Boosting, and Random Forest consistently remained among the strongest models. This observation suggests that the performance advantage of FID-Gait is mainly associated with the multidomain fused representation itself rather than reliance on a single classifier.
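The LOSO protocol described above can be sketched in a few lines. The example below is a generic standard-library illustration, assuming one subject identifier per sample; the function name and the toy identifiers are hypothetical and do not come from the paper's implementation. Libraries such as scikit-learn provide an equivalent splitter (LeaveOneGroupOut).

```python
from collections import defaultdict

def loso_splits(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx), one fold per subject."""
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)
    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        train_idx = sorted(
            i for sid, idxs in by_subject.items() if sid != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx

# Hypothetical toy data: six samples recorded from three subjects.
subject_ids = ["S1", "S1", "S2", "S2", "S2", "S3"]
folds = list(loso_splits(subject_ids))

for held_out, train_idx, test_idx in folds:
    # No sample from the held-out subject is ever seen during training.
    print(held_out, train_idx, test_idx)
# S1 [2, 3, 4, 5] [0, 1]
# S2 [0, 1, 5] [2, 3, 4]
# S3 [0, 1, 2, 3, 4] [5]
```

The number of folds equals the number of subjects, which is why LOSO is more expensive than a single split but gives a stricter estimate of performance on unseen individuals.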
From an application-oriented perspective, the proposed framework showed a more favorable balance between accuracy and efficiency than direct raw-signal modeling under the current offline experimental setting. At the raw-data level, model Accuracy ranged from 0.5700 to 0.7000, whereas after the introduction of the three-domain fused features, the selected classifiers achieved Accuracy values of 0.9022 at the subject level with AdaBoost and 0.9911 at the gait-cycle level with MLP. In addition, AdaBoost at the subject level and MLP at the gait-cycle level showed relatively short average computation times. These findings indicate that interpretable multidomain features can provide an efficient alternative to direct end-to-end raw-signal modeling in the present task setting. However, these results should be interpreted strictly within the scope of offline experiments on a single public dataset. Although the proposed framework showed strong performance under random split, cross-validation, and LOSO evaluation within this dataset, its external generalizability has not yet been established. In particular, robustness across independent cohorts, acquisition protocols, wearable devices, and real-world clinical environments remains to be confirmed in future studies.
5.1. Real-World Applicability and Feasibility of Real-Time Implementation
Although this research was conducted offline, the results provide preliminary evidence for practical applicability. In real-time deployment, total system latency depends not only on classifier inference but also on signal acquisition, preprocessing, gait-cycle segmentation, and feature extraction. Previous studies have shown that wearable real-time gait systems must balance accuracy, latency, energy consumption, and edge-device resource constraints [64, 72].
In this context, FID-Gait shows potential for near-real-time application. The relatively short computation times of AdaBoost and MLP, together with the LOSO performance of MLP on unseen subjects, support its potential for deployment to new users. However, the reported computation times were obtained offline and should not be regarded as end-to-end latency in real wearable systems. In free-living scenarios, factors such as sensor displacement, unstable contact, gait-speed variation, turning, noise, and data loss may affect the stability of gait segmentation and feature extraction. Therefore, the present findings support the feasibility of translating FID-Gait to real-time wearable applications, although its practical performance still requires end-to-end validation on embedded or edge platforms.
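The difference between classifier computation time and end-to-end latency can be made explicit by timing each processing stage separately. The sketch below is only an illustration of that measurement idea: every stage function is a hypothetical placeholder (fixed-length windows stand in for gait-cycle segmentation, and a simple threshold stands in for the classifier), and the synthetic signal is not real insole data.

```python
import time

def time_stages(stages, sample):
    """Run the stages in order and record the wall-clock time of each one."""
    timings = {}
    data = sample
    for name, stage_fn in stages:
        start = time.perf_counter()
        data = stage_fn(data)
        timings[name] = time.perf_counter() - start
    timings["total"] = sum(timings.values())
    return data, timings

# Placeholder stages (hypothetical stand-ins, not the FID-Gait pipeline).
def preprocess(signal):
    offset = sum(signal) / len(signal)
    return [x - offset for x in signal]

def segment(signal, window=50):
    # Fixed-length windows stand in for real gait-cycle detection.
    return [signal[i:i + window] for i in range(0, len(signal) - window + 1, window)]

def extract_features(cycles):
    # Mean and range per window stand in for the three-domain features.
    return [(sum(c) / len(c), max(c) - min(c)) for c in cycles]

def classify(features):
    # A threshold on the range stands in for a trained classifier.
    return [1 if spread > 1.0 else 0 for _, spread in features]

stages = [
    ("preprocessing", preprocess),
    ("segmentation", segment),
    ("feature_extraction", extract_features),
    ("inference", classify),
]

sample = [(i % 50) / 25.0 for i in range(500)]  # synthetic periodic signal
predictions, timings = time_stages(stages, sample)

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.3f} ms")
```

On an embedded or edge platform, signal acquisition and data transfer would add further terms to the total, which is why offline timings of a single stage should not be read as system latency.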
5.2. Limitations
Several limitations should be noted. First, the proposed framework was evaluated mainly under offline conditions; thus, its end-to-end latency, operational stability, and real-time feasibility in wearable applications remain to be verified. In addition, although random split, five-fold cross-validation, and leave-one-subject-out validation were performed, this study used only one public dataset for binary PD-versus-Normal classification. Therefore, the reported performance mainly reflects within-dataset generalization, and external generalizability to other cohorts, sensor setups, and acquisition conditions remains unconfirmed. Validation in larger, multicenter, multi-device, and more heterogeneous clinical cohorts is still required.
Another limitation is related to the acquisition protocol of the public PhysioNet gait dataset. The dataset provides continuous walking recordings rather than repeated trials with standardized rest intervals. Therefore, although a substantial number of gait cycles were extracted, inter-trial consistency and its potential influence could not be specifically evaluated in this paper. This is relevant because gait measurements may be affected by testing procedures and protocol design, and recent studies have emphasized the importance of standardized gait protocols in Parkinson’s disease research [50, 51, 52, 53].
In addition, although the proposed three-domain features are interpretable, they remain handcrafted descriptors and may not fully capture long-range temporal dependencies or more complex sequential dynamics. The deviation domain is also defined relative to a Normal reference distribution, which may be affected by sample composition, acquisition conditions, and device-related variation. Future work should therefore validate the framework in real-world, cross-device, and cross-cohort settings, examine repeated-trial protocols with predefined inter-trial rest intervals, and explore integration with sequence-based deep learning models to better balance interpretability and temporal modeling capacity [64, 71, 72].
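The sensitivity of a reference-based deviation measure to the composition of the reference sample can be illustrated with a simple z-score. This is a generic example and is not necessarily the deviation definition used in FID-Gait; the feature values, cohort names, and function names below are hypothetical.

```python
from statistics import mean, stdev

def reference_stats(control_values):
    """Mean and sample standard deviation of a feature in a control group."""
    return mean(control_values), stdev(control_values)

def z_deviation(value, ref):
    """Signed deviation of one observation from the reference, in SD units."""
    mu, sigma = ref
    return (value - mu) / sigma

# Hypothetical values of one gait feature in two control cohorts that
# differ only by a constant offset (for example, a device-related shift).
ref_a = reference_stats([1.00, 1.05, 0.95, 1.10, 0.90])
ref_b = reference_stats([1.10, 1.15, 1.05, 1.20, 1.00])

observation = 1.30
z_a = z_deviation(observation, ref_a)
z_b = z_deviation(observation, ref_b)

# The same observation looks more abnormal against cohort A than cohort B.
print(round(z_a, 2), round(z_b, 2))  # 3.79 2.53
```

The same observation receives a different deviation score depending on which control cohort defines the reference, which is one reason cross-cohort and cross-device validation matters for reference-based features.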
6. Conclusions
This paper proposed FID-Gait, a three-domain fusion framework for Parkinson’s disease (PD) identification using plantar insole sensor data. By integrating fractal, plantar-loading–phase imbalance, and deviation domains, the framework provides an interpretable representation of gait complexity, biomechanical imbalance, and global deviation from normal gait.
Experimental results showed that FID-Gait achieved strong discriminative performance at both the gait-cycle and subject levels. At the gait-cycle level, MLP achieved the best performance under the 7:3 split setting with an Accuracy of 0.9911 and an F1-score of 0.9947. At the subject level, Logistic Regression and AdaBoost achieved the highest Accuracy of 0.9022, while AdaBoost obtained the best F1-score of 0.9302. Five-fold cross-validation further supported the robustness of the proposed framework, and subject-level LOSO evaluation provided preliminary evidence of subject-independent generalization within this dataset.
Ablation analysis confirmed that all three domains contributed to the final performance, with the fractal domain showing the largest contribution overall and the deviation domain playing an important role in subject-level classification. These findings suggest that PD gait can be more effectively characterized through the integrated modeling of gait complexity, plantar-loading imbalance, and deviation from normal reference patterns.
Overall, FID-Gait achieved strong performance across multiple evaluation protocols while maintaining interpretability and computational efficiency. However, the present findings are based primarily on offline experiments using a single public dataset. Therefore, further validation on independent external datasets and in real-world, cross-device, and cross-cohort settings remains necessary before broader clinical or wearable deployment can be established.