Novel Deep Learning Method in Hip Osteoarthritis Investigation Before and After Total Hip Arthroplasty

Roel Pantonial; Milan Simic

doi:10.3390/app15020872

School of Engineering, RMIT University, Melbourne, VIC 3000, Australia

Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(2), 872; https://doi.org/10.3390/app15020872

This article belongs to the Special Issue Application of Artificial Intelligence in Biomedical Informatics

Abstract

The application of gait analysis on patients with Hip Osteoarthritis (HOA) before and after Total Hip Arthroplasty (THA) surgery can provide accurate diagnostics, reliable treatment decision making, and proper rehabilitation efforts. Acquired kinematic trajectories provide discriminating features that can be used to determine the gait patterns of healthy subjects and the effects of surgical operation. However, there is still a lack of consensus on the best discriminating kinematics to achieve this. Our investigation aims to utilize Deep Learning (DL) methodologies and improve classification results for the kinematic parameters of healthy, HOA, and 6 months post-THA gait cycles. Kinematic angles from the lower limb are used directly as one-dimensional inputs into a DL model. Based on the human gait cycle’s features, a hybrid Long Short-Term Memory–Convolutional Neural Network (HLSTM-CNN) is designed for the classification of healthy/HOA/THA gaits. The results show that the sagittal angles of hip and knee, and front angles of FPA and knee, provide the most discriminating results with accuracy above 94% between healthy and HOA gaits. Interestingly, when the sagittal angles of the hip and knee are used to analyze the THA gaits, subjects common to both test sets show consistent misclassification results. This information offers insight into determining the success or failure of THA.

Keywords:

gait; kinematic parameters; hip osteoarthritis; total hip arthroplasty; long short-term memory; convolutional neural networks

1. Introduction

Hip Osteoarthritis (HOA) is a chronic hip disease that progressively degenerates the cartilage holding the joint together and eventually leads to its dysfunction []. The subsequent pain, experienced commonly, causes lateral trunk bending that overloads other parts of the musculoskeletal system for compensation; thus, treatment is generally required []. Total Hip Arthroplasty (THA) is commonly considered as an important option that alleviates pain and restores functionality in most subjects. However, the literature reveals that post-THA improvements do not completely reduce pain nor return the quality of life to that of healthy individuals []. This is due to muscle weaknesses responsible for gait adaptations after THA [] affecting even the non-operated limb.

The accepted standard in assessing hip functionality before and after THA is the Harris Hip Score (HHS), which is a validated clinical tool. It focuses on pain, function, and range of motion. The HHS is conducted in questionnaire form. The main drawback is its subjectivity [], and this is where our DL model, being an objective methodology, could be helpful as an additional tool. Alternatively, radiograph images that depict hip structure anomalies provide an objective measure, but they are static and do not capture motion features [].

A more suitable alternative measurement is based on gait, which is defined as a manner of walking. Gait is meticulously studied in several areas such as security, sports, and biomedical applications []. The gait features are known to have discriminating abilities. Its non-invasive utilization provides us with an understanding of the wellbeing of athletes as well as biometric authentication for remote and mobile devices [,]. Prominently, the use of gait analysis, in clinical settings, is extensively investigated, focusing on gait abnormalities due to pathological diseases such as multiple sclerosis, Parkinson’s Disease [], and other neuro-muscular diseases.

The analysis of gait is the systematic examination of human walking via the observation of an expert with the help of data acquisition (DAQ) devices to measure kinematics, kinetics, and spatio-temporal parameters (STP) []. Kinematic parameters describe motion without reference to the force, while kinetics is the study of force, moments, and acceleration without orientation.

In the context of HOA research, vision-based clinical gait analysis (CGA) systems are the most popular. They are focused on relevant kinematic angle detection that can classify HOA gaits and predict severity. Constantinou et al. [] devised a case study to characterize hip joint kinematics during gait for mild-to-moderate HOA. It was found that the sagittal and transverse hip angles are significantly different from healthy subjects, and the net hip joint loading is altered in HOA, suggesting a progression of the disease. Similarly, Leigh et al. [] investigated the lower limb’s gait kinematics of HOA subjects and found that the sagittal and transverse angles of the pelvis, as well as the front angle of the hip, are discriminative for HOA subjects.

Similarly, extensive studies were conducted on gait-related differences before and after THA. Dindorf et al. [] utilized the overall gait waveform cycle to detect relevant kinematic parameters to describe asymmetrical gaits in patients after THA. Frontal and sagittal angles of the knee and the sagittal angle of the hip are found to be the most discriminating kinematic parameters with a reported accuracy of 91%. Subsequently, Longworth et al. [] hypothesized that inter-joint coordination information can be used to characterize healthy, HOA, and post-THA gaits through cyclogram analysis. Knee and ankle angles were found to have discriminating patterns for the different classes. Additionally, Fujii et al. [] developed a model to predict pelvic sagittal angle five years after THA utilizing gait information one year after surgery with 91% accuracy.

These discriminating kinematic parameters for HOA and after THA are critical in the selection for classification model inputs among several hundreds of kinematic trajectories produced by CGA. Still, there is a lack of consensus on which of these kinematic outcomes provide the best result in the diagnosis and follow up in patients. Gait modifications could already be subtly manifested before the onset of functional infirmities; thus, discriminating information for HOA is important. The results have potential implications for proper treatment and rehabilitation efforts after THA surgery from different severity levels.

There are several methods to make accurate classifications and predictions utilizing gait patterns. Recently, the development of Deep Learning (DL), a subset of machine learning (ML) methods, has seen increasing advances in the field of computer science. Specifically, convolutional neural networks (CNNs) and long short-term memory (LSTM) have exhibited outstanding information extraction and modeling capabilities in different gait-related applications. The former is known for its excellent feature extraction capabilities, and the latter has strength in detecting time step dependencies from long sequential data.

Quite recently, a CNN model was employed to extract STP in a single gait cycle []. A publicly available eGAIT dataset, based on inertial measurements, was used as input. Two DL models were considered, namely multi-classification and ensemble models. Decent results were achieved, but the authors emphasized the need for a larger dataset to improve DL performance. Additionally, a related study was conducted to predict different STP for knee osteoarthritis and total knee arthroplasty subjects through a CNN model [] with inertial measurements as inputs. The architecture is not particularly deep with only two convolutional layers involved. Conversely, LSTM networks were designed to classify alterations or abnormalities of gait using wearable sensors [,]. Results were achieved with accuracy above 80% in both LSTM studies. It should be noted that hybrid LSTM-CNN (HLSTM-CNN) architectures have not been considered in the analysis of gait, but have shown promising results from other areas such as text classification [], heart beat classification [], and power flow prediction [].

The rest of this paper is organized as follows: work related to ML-based HOA gait classification is given in Section 2, a brief description of the DL model is provided in Section 3, the proposed methodology and dataset description is presented in Section 4, and the results and discussion are presented in Section 5. Finally, conclusions of the research and future directions are summarized in Section 6 and Section 7, respectively.

2. Related Works

Very few scientific studies have been published in the field of HOA gait classification using ML techniques. Laroche et al. [] extensively investigated ML applications utilizing HOA gait information. A Support Vector Machine (SVM) with a linear kernel was applied on kinematic trajectories to differentiate gaits between healthy and HOA patients. The overall reported accuracy is at 88%, with the sagittal hip angle providing the highest discriminating result at 85%, while the rest of the considered kinematic trajectories, from the knees and feet, were registered below 80%. Like other shallow networks, considerable work is completed with feature extraction that includes manual handcrafting. This is considered to be the traditional ML model’s major disadvantage.

A decade after the publication of the first study in this field, Pantonial et al. [] designed a DL architecture through the transfer learning method on kinematic trajectories for the classification of HOA and healthy gaits. Transfer learning is a DL methodology that utilizes a pre-trained network from a different domain that is re-used for faster training but maintains performance. In the presented investigation, a state-of-the-art image-based CNN was used and re-purposed for the gait classification problem. A novel tiling method is also employed to exploit both the operated and non-operated limb in one image. The result showed a 97% G-Mean score using the sagittal hip angle, while the rest of the considered kinematic parameters are upward of 90% in accuracy. An additional step is required to transform the gait cycle into an image via a continuous wavelet transform scalogram. Thus, the method requires more computational resources.

Teufl et al. [] used IMU sensors to acquire gait kinematics and STP for an SVM model with a Gaussian RBF Kernel to discriminate impaired and non-impaired gaits after THA. Healthy and post-THA subjects were used for the classification and, conspicuously, without HOA gait information. It was found by their study that kinematic parameters are more significant with 97% validation accuracy as compared to STP with 87%.

Overall, there is a clear research gap in the investigation of healthy, HOA, and post-THA gaits in one protocol as a multi-classification problem. Table 1 summarizes the related research on HOA gait analysis based on ML techniques. Whilst exploratory CNN work has been recently conducted by Pantonial et al. [], there is a research gap in the comprehensive investigation of HOA and THA research utilizing DL methodologies.

Table 1. Related works.

Generally, the contribution of this paper is as follows:

HLSTM-CNN model is presented and tested for the multi-classification of healthy, HOA, and post-THA gaits.
Investigation of the most relevant kinematic parameters that can best discriminate the multi-class problem.
Performance evaluation and comparison of the proposed HLSTM-CNN model with methodologies published in the literature.
Examination of the multi-classification model and its applicability in determining the success and failure of THA operation.

3. Deep Learning Structure Design

This section provides a description on the proposed HLSTM-CNN architecture for the classification of a gait cycle as healthy, HOA, or post-THA. Given that a hybrid topology is proposed, CNN and LSTM architectures will be discussed separately. Then, these two distinct models are combined, and the overall proposed HLSTM-CNN architecture is discussed.

3.1. Convolutional Neural Network (CNN)

The CNN model is known for its strength in the local feature extraction of a dataset. These models are inspired by biological constructions of the visual cortex, which is an arrangement of simple cells []. Cells are activated based on the subregion of the visual field, which is a concept used by neurons in a convolutional layer. In turn, these neurons are not fully connected; each is connected only within a subregion, in contrast to traditional neural networks. The subregions are also designed to overlap, so spatially correlated outcomes can be produced by the neurons of the CNN. The method is presented in Figure 1, with a filter sliding across the image. A convolution function is applied to a subregion of the image, and the result reflects the local region that was affected by the filter. This method is the main component of the CNN, which is exemplified in a layer.

Figure 1. Image convolution process.

Typically, a CNN model consists of the following layers: (1) convolution layer, (2) max pooling layer, (3) flattening layer, and (4) fully connected layer. The max pooling layer reduces the size of the feature map by reducing the connections between adjacent layers. Afterwards, a flattening layer creates a 1D vector, which is passed to the fully connected layer, consisting of weights and biases. This whole architecture is described in Figure 2.

Figure 2. Typical CNN model.
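The four layer types can be illustrated on a one-dimensional signal, which is closer to the gait sequences used later than the image case in Figure 1. The following is only a minimal sketch in plain Python: the signal, the filter weights, and the function names are hypothetical and do not reproduce the proposed architecture.

```python
def conv1d(signal, kernel):
    """Slide `kernel` over `signal` without padding (cross-correlation,
    which is what most DL libraries call convolution)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def max_pool1d(features, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [
        max(features[i:i + size])
        for i in range(0, len(features) - size + 1, size)
    ]


def fully_connected(inputs, weights, bias):
    """Single output neuron: weighted sum of the flattened inputs plus a bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


signal = [0, 1, 2, 3, 2, 1, 0, 1]    # toy 1D input, not gait data
kernel = [1, 0, -1]                  # hypothetical filter weights
feature_map = conv1d(signal, kernel)     # (1) convolution layer
pooled = max_pool1d(feature_map, 2)      # (2) max pooling layer
flat = list(pooled)                      # (3) flattening (already 1D for one channel)
score = fully_connected(flat, [1.0, 1.0, 1.0], 0.5)  # (4) fully connected layer
```

Each output of `conv1d` depends only on a local window of the input, and neighbouring windows overlap, which mirrors the subregion connectivity described above.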

3.2. Long Short-Term Memory (LSTM)

LSTM is a special type of recurrent neural network (RNN) that resolves the issue of vanishing gradient that affects RNN topologies in general []. These types of networks learn long-term temporal-dependent data that are mostly used in the classification of sequential information. Popular applications include language modeling, sentiment analysis, and speech recognition. Figure 3 shows the LSTM architecture with forget and memory cells added on top of the traditional RNNs. These added cells allow the network to learn long-term sequential data relationships. The memory and forget cells determine the information to be retained and discarded, respectively.

Figure 3. LSTM architecture.

To compute the output of the forget cell $F_{k}$, a sigmoid function is applied to the input at the current time step $x_{k}$ and the previous value of the hidden state $h_{k - 1}$:

F_{k} = σ (W_{f} \cdot [h_{k - 1}, x_{k}] + b_{f})

(1)

where $W_{f}$ and $b_{f}$ represent the forget cell’s weight and bias, respectively, $[h_{k - 1}, x_{k}]$ represents the concatenation of the previous hidden state and the current input, and $σ$ is the sigmoid activation function. Afterwards, the outputs of the forget cell, the input cell $I_{k}$, and the memory cell ${\tilde{M}}_{k}$ are used to update the previous cell state $M_{k - 1}$ to a new cell state $M_{k}$. This can be achieved by computing the following:

I_{k} = σ (W_{I} \cdot [h_{k - 1}, x_{k}] + b_{I})

(2)

{\tilde{M}}_{k} = \tanh (W_{M} \cdot [h_{k - 1}, x_{k}] + b_{M})

(3)

M_{k} = (M_{k - 1} \odot F_{k}) + (I_{k} \odot {\tilde{M}}_{k})

(4)

where $W_{I}$ and $b_{I}$ represent the input cell’s weight and bias, respectively, $W_{M}$ and $b_{M}$ represent the memory cell’s weight and bias, respectively, and $\odot$ denotes element-wise multiplication. The final step is then to compute the output cell $O_{k}$, with weight $W_{O}$ and bias $b_{O}$, and the value of the current hidden state $h_{k}$, as described below.

O_{k} = σ (W_{O} \cdot [h_{k - 1}, x_{k}] + b_{O})

(5)

h_{k} = O_{k} \odot \tanh (M_{k})

(6)

The purpose of the final step is to act as the network’s memory, with information containing previous data to be used for prediction.
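Equations (1) to (6) describe a single time step, which can be sketched in plain Python as follows. This is an illustrative sketch, not the implementation used in the study: the parameter dictionary and its keys are hypothetical, matrices are stored as lists of rows, and each bias is placed inside its activation function, as in the standard LSTM formulation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W . v + b for a matrix W stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def lstm_step(x_k, h_prev, m_prev, p):
    """One LSTM time step; `p` holds hypothetical weights and biases."""
    z = h_prev + x_k  # list concatenation [h_{k-1}, x_k]
    F = [sigmoid(a) for a in affine(p["W_f"], z, p["b_f"])]          # Eq. (1)
    I = [sigmoid(a) for a in affine(p["W_I"], z, p["b_I"])]          # Eq. (2)
    M_tilde = [math.tanh(a) for a in affine(p["W_M"], z, p["b_M"])]  # Eq. (3)
    M = [mp * f + i * mt
         for mp, f, i, mt in zip(m_prev, F, I, M_tilde)]             # Eq. (4)
    O = [sigmoid(a) for a in affine(p["W_O"], z, p["b_O"])]          # Eq. (5)
    h = [o * math.tanh(m) for o, m in zip(O, M)]                     # Eq. (6)
    return h, M


# Toy parameters: one hidden unit, one input feature, all weights zero.
zero = {name: [[0.0, 0.0]] for name in ("W_f", "W_I", "W_M", "W_O")}
zero.update({name: [0.0] for name in ("b_f", "b_I", "b_M", "b_O")})
h_k, M_k = lstm_step([0.3], [0.0], [1.0], zero)
```

With all weights set to zero, every gate outputs 0.5 and the candidate memory is 0, so the new cell state is half of the previous one. A sequence is processed by calling `lstm_step` once per event and feeding the returned hidden and cell states into the next call.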

3.3. Hybrid LSTM-CNN Model

To fully capitalize on the strengths of both LSTM and CNN architectures, a new hybrid model is proposed. Figure 4 shows the overall design of the HLSTM-CNN model. It consists of two bi-directional LSTM (bi-LSTM) and six convolutional layers, together with rectified linear unit (ReLU), batch normalization, and dropout layers specifically placed between weighted layers to avoid overfitting. Traditional activation functions utilize either the sigmoid function or the hyperbolic tangent function, but to conserve computation time, ReLU functions are used as given by Equation (7).

f (x) = \max (0, x)

(7)

Figure 4. Proposed HLSTM-CNN model.

To lessen the effects of generalization errors whilst maintaining accuracy, dropout and batch normalization layers are utilized. Dropout is a strategy wherein sub-networks of the original network are randomly selected at each training iteration by multiplying a random subset of hidden-layer activations by zero, which approximates training an ensemble of possible sub-networks. On the other hand, batch normalization improves stability and speeds up network training []. Essentially, the features are normalized to have zero mean and unit standard deviation across each mini-batch.
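The three auxiliary operations can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions: the learnable scale and shift of batch normalization are omitted, dropout is written in its common "inverted" form (survivors are rescaled during training), and all function names are hypothetical.

```python
import math
import random
import statistics


def relu(x):
    """Equation (7): f(x) = max(0, x)."""
    return max(0.0, x)


def batch_norm(batch, eps=1e-5):
    """Normalise each feature (column) across the mini-batch to zero mean
    and unit variance; the learnable scale and shift are omitted here."""
    columns = list(zip(*batch))
    means = [statistics.fmean(c) for c in columns]
    variances = [statistics.pvariance(c) for c in columns]
    return [
        [(v - m) / math.sqrt(var + eps) for v, m, var in zip(row, means, variances)]
        for row in batch
    ]


def dropout(activations, rate, rng):
    """Inverted dropout: zero each activation with probability `rate`
    and scale the survivors by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


normalised = batch_norm([[1.0, 10.0], [3.0, 30.0]])
dropped = dropout([1.0] * 100, 0.5, random.Random(0))
```

Dropout is active only during training; at inference time the activations are passed through unchanged.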

Following the results from He et al. [] on the ResNet model, skip connections are introduced to avoid vanishing and exploding gradients in very deep networks. This idea has shown improved performance on images, but has not been further exploited on 1D inputs.
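A skip connection adds the input of a block to its output, so the block only has to learn a correction to the identity mapping. The sketch below shows the idea on a 1D sequence; the zero-padded convolution, the kernels, and the function names are assumptions chosen for illustration, not the layers of the proposed model.

```python
def conv1d_same(x, kernel):
    """Zero-padded 1D convolution that preserves the input length
    (assumes an odd kernel length)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(padded[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(len(x))
    ]


def residual_block(x, layer):
    """Return layer(x) + x; the layer must keep the sequence length."""
    fx = layer(x)
    if len(fx) != len(x):
        raise ValueError("skip connection needs matching lengths")
    return [a + b for a, b in zip(fx, x)]


x = [1.0, 2.0, 3.0, 4.0]
# A layer that outputs zeros leaves the input untouched (identity path).
identity_out = residual_block(x, lambda v: conv1d_same(v, [0.0, 0.0, 0.0]))
# A layer that copies its input doubles the signal.
doubled_out = residual_block(x, lambda v: conv1d_same(v, [0.0, 1.0, 0.0]))
```

Because the identity path bypasses the weighted layer, the gradient can flow through the addition even when the layer itself contributes little.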

4. Methodology

The proposed methodology can be divided into four distinct tasks: (1) dataset and kinematic parameter selection, (2) data pre-processing, (3) development of HLSTM-CNN model, and (4) evaluation of design. This is aptly summarized in Figure 5.

Figure 5. Proposed approach.

4.1. Dataset Description and Kinematic Parameter Selection

Several papers have emphasized the necessity of a large and publicly available dataset to improve valuable metrics in gait research [,,]. The motivation is for seamless performance comparison among developed algorithms; thus, benchmarking can easily be conducted and an agreement on significant parameters can be accelerated. With this in mind, a recently published gait analysis dataset [] of 80 healthy and 106 subjects before and 6 months after THA is utilized in this research. HOA identification is based on American College of Rheumatology Criteria and verified with radiological assessment.

Three-dimensional gait trajectories are captured via reflective markers and a vision-based system with eight optoelectronic cameras and floor-based sensors with two force plates utilized for the synchronized measurements. Thus, both kinematic and kinetic parameters are captured in the protocol. Ten trials are performed by each participant on a 6 m walkway. For brevity, demographic information of the volunteers is summarized in Table 2.

Table 2. Dataset demographics summary.

In this study, trajectories from the reflective markers on the lower limbs are considered, as shown in Figure 6a. Different planar views of the computed kinematic trajectories, as shown in Figure 6b, are used, namely sagittal (x-axis), frontal/coronal (y-axis), and transverse (z-axis).

Figure 6. Data description: (a) lower limb markers and (b) body plane and angles.

Explicitly, the relevant joint angles are the ankle, hip, knee, pelvis, and the foot progression angle (FPA). By definition, the FPA is the angle of the foot with respect to the walking direction. Following results from gait analysis on HOA [,,] and after THA operation [,,], the following kinematic angles are used in this study to provide sufficient discrimination related to gait dysfunctionalities:

Pelvis: Transverse and Front Angles
Hip: Sagittal and Front Angles
Knee: Sagittal and Front Angles
Ankle: Sagittal and Front Angles
FPA: Sagittal and Front Angles

4.2. Data Pre-Processing and Representation

Pre-processing is implemented to improve feature extraction and remove unnecessary information. Upon closer scrutiny of the raw data, redundant and non-numeric information is found across all kinematic angles; this is automatically removed from the dataset. Then, outliers are removed using a standard deviation method by taking the mean $μ_{i}$ and standard deviation $σ_{i}$ for each event $i$ of the gait sequence as described below:

μ_{i} = \frac{\sum_{j = 1}^{M} x_{j, i}}{M}, i = 1, \dots, N

(8)

σ_{i} = \sqrt{\frac{\sum_{j = 1}^{M} {(x_{j, i} - μ_{i})}^{2}}{M}}, i = 1, \dots, N

(9)

where $x_{j, i}$ is the value of the $j$-th gait cycle at event $i$, $N$ is the length of the gait cycle, which, in this dataset, involves one hundred points or events, and $M$ is the size of the dataset for healthy, HOA, and 6 months post-THA gaits, respectively. Measured gaits with a value outside $μ_{i} \pm 3 σ_{i}$ in any of the events are automatically removed from the dataset. The resulting aggregated datasets are shown in Figure 7 with black, red, and blue strands as HOA, healthy, and 6 months post-THA gaits, respectively.

Figure 7. Aggregated joint angles of the affected limb.
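The outlier rule of Equations (8) and (9) can be sketched as follows, assuming each gait cycle is a list of N values and the function is applied separately to the healthy, HOA, and post-THA sets. The function name and the toy data are hypothetical.

```python
import statistics


def remove_outlier_cycles(cycles, n_sigma=3.0):
    """Drop every gait cycle that lies more than `n_sigma` standard
    deviations from the per-event mean at any event of the cycle."""
    events = list(zip(*cycles))                      # events[i]: all cycles at event i
    mu = [statistics.fmean(e) for e in events]       # Eq. (8)
    sigma = [statistics.pstdev(e) for e in events]   # Eq. (9), divides by M
    return [
        cycle for cycle in cycles
        if all(abs(x - m) <= n_sigma * s for x, m, s in zip(cycle, mu, sigma))
    ]


# Toy data: 20 identical two-event cycles plus one extreme cycle.
toy_cycles = [[1.0, 2.0] for _ in range(20)] + [[1.0, 50.0]]
cleaned = remove_outlier_cycles(toy_cycles)
```

In this sketch the mean and standard deviation are computed once, before any cycle is removed; whether the original pipeline repeats the procedure after removal is not stated in the text.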

Through visual observations, HOA-afflicted gaits have larger variance compared to healthy gaits, with THA showing some variations as well. Subsequently, data augmentation strategies are considered to improve generalization by increasing the size of the dataset. Essentially, artificial data are generated to be included in the training process. Popular data augmentation strategies from the literature include translation, shifting, rotation, and noise addition. Closer inspection reveals that Gaussian noise addition is the only strategy applicable for gait sequences, which can be explained by Equation (10) given below:

x^{\prime (\epsilon)} = \{x_{1} + \epsilon_{1}, x_{2} + \epsilon_{2}, \dots, x_{N} + \epsilon_{N}\}

(10)

where $\epsilon$ is the Gaussian noise added for every event of the gait cycle. An example is shown in Figure 8, where $\epsilon = 0.5$ is added on a randomly selected kinematic gait cycle from the dataset.

Figure 8. Gaussian noise addition.
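Equation (10) amounts to drawing one independent noise value per event and adding it to the original cycle. The sketch below assumes zero-mean noise and interprets the value 0.5 as the standard deviation, which the text does not state explicitly; the function names and the number of copies are hypothetical.

```python
import random


def add_gaussian_noise(cycle, sigma, rng):
    """Return a noisy copy of one gait cycle, as in Equation (10).
    Assumption: zero-mean noise with standard deviation `sigma`."""
    return [x + rng.gauss(0.0, sigma) for x in cycle]


def augment(cycles, copies, sigma, seed=0):
    """Create `copies` noisy versions of every cycle."""
    rng = random.Random(seed)
    return [add_gaussian_noise(c, sigma, rng) for c in cycles for _ in range(copies)]


original = [float(i) for i in range(100)]   # placeholder 100-event cycle
synthetic = augment([original], copies=2, sigma=0.5)
```

Seeding the generator makes the augmented set reproducible between training runs.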

For clarity, the overall pre-processing flow chart is shown in Figure 9. This is composed of three stages resulting in an input gait dataset that will be used for the DL model.

Figure 9. Data pre-processing flowchart.

In contrast to the method chosen by a recent study [] that transforms gait into images, the proposed method utilizes the 1D gait sequence as input for the DL model. As a consequence, the DL model is designed from scratch and the training procedure is treated carefully given that there is no pre-trained network that can be used as a baseline.

4.3. Deep Learning Design

The focus of our research is the classification of kinematic gait parameters among healthy, HOA, and 6 months post-THA gaits using DL methods. A novel hybrid LSTM-CNN, stacking two bi-LSTM and six convolutional layers, is developed and tested.

The bi-LSTM network is a two-LSTM model with the first taking input in the forward direction, and the second is for the backward direction. Effectively, available information is increased for training and can improve the contextualization of the algorithm. By design, 62% and 12% memory windows are utilized to track sequence dependencies based on the period for stance phase and double limb support, respectively. This ensures that the learning is achieved within the selected length.

The first two convolution layers, in parallel with the skipped connection, have filter sizes of 38 and 12. These are based on the periods of the swing phase and double limb support, respectively. Succeeding convolution layers have filter sizes of five to further improve feature extraction. The rest of the model parameters are summarized in Table 3.

Table 3. Summary of model parameters.

Hyperparameter selection is crucial for the training reliability of the designed model. Given that the model is designed from the ground up, a relatively small initial learning rate is selected, and this is steadily increased. The mini-batch size is set to an arbitrarily small value of 20 for improved generalization and convergence. The maximum epoch is set as relatively large for extensive training. The list of relevant hyperparameters is shown in Table 4.

Table 4. Summary of hyperparameters.

The experimental research in this study was conducted on a DELL G15 5520 laptop with the following specifications:

Memory: 16 GB DDR3 RAM
GPU: NVIDIA GeForce RTX 3050 Ti
CPU: 12th Gen Intel Core i7-12700H, 2300 MHz, 14 Cores, 20 Logical Processors

Software for the model, training, and evaluation was developed in Matlab® 2023b with the Statistics and Machine Learning Toolbox and the Deep Learning Toolbox.

The dataset was divided into training (70%), validation (20%), and test (10%) subsets. Artificial data were added into the training dataset to improve performance. The training subset is used for model training, and after each end of the epoch, the validation subset is used to examine training accuracy. The test subset is only used at the end of training to verify the performance of the final model.
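The 70/20/10 partition can be sketched as a seeded shuffle followed by slicing. The function name and the placeholder sample list are hypothetical, and the sketch splits individual samples; the text does not state whether the original split was made per gait cycle or per subject.

```python
import random


def split_dataset(samples, train=0.7, val=0.2, seed=0):
    """Shuffle once, then slice into training, validation, and test subsets.
    The test share is whatever remains after the first two slices."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train = round(train * len(items))
    n_val = round(val * len(items))
    return (
        items[:n_train],
        items[n_train:n_train + n_val],
        items[n_train + n_val:],
    )


train_set, val_set, test_set = split_dataset(range(100))
```

Artificial (noise-augmented) data would then be appended to `train_set` only, so that the validation and test subsets contain measured gaits exclusively.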

4.4. Model Evaluation

The evaluation of a DL model is an important step in assessing its performance and accuracy. The proposed HLSTM-CNN model was tested using the metric scores from the resulting confusion matrix. The final test accuracy is calculated for each kinematic parameter, which is the overall correct classification of the subjects. Obtained information should be treated carefully as the classification of THA gaits can provide knowledge about the success of the surgical operation and post-surgery physical therapy rehabilitation. Thus, metric scores are computed for each class, and misclassification is analyzed.

Three other metric scores are frequently used in the bio-medical field, namely sensitivity, specificity, and $G_{mean}$. Sensitivity determines the correct prediction of a class, while specificity determines the ability to correctly identify negative results. $G_{mean}$ is, then, the measure to balance the results of the different classes. These metric scores are given by the following:

\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}

(11)

\mathrm{Sensitivity} = \frac{TP}{TP + FN}

(12)

\mathrm{Specificity} = \frac{TN}{TN + FP}

(13)

G_{mean} = \sqrt{\mathrm{Sensitivity} \times \mathrm{Specificity}}

(14)

where $TN$ and $TP$ refer to right classification (true negatives and true positives), and $FP$ and $FN$ to the misclassification of subjects (false positives and false negatives).
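For a multi-class confusion matrix, Equations (11) to (14) can be evaluated per class by treating that class as positive and all others as negative (one-vs-rest). The sketch below assumes rows hold true classes and columns hold predictions; the matrix values are made up for illustration and are not results from this study.

```python
import math


def one_vs_rest_metrics(cm, k):
    """Metrics for class index `k` of a square confusion matrix `cm`,
    where cm[t][p] counts samples of true class t predicted as class p.
    Assumes class k and its complement are both non-empty."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                   # class k predicted as something else
    fp = sum(row[k] for row in cm) - tp    # other classes predicted as k
    tn = total - tp - fn - fp
    sensitivity = tp / (tp + fn)           # Eq. (12)
    specificity = tn / (tn + fp)           # Eq. (13)
    return {
        "accuracy": (tp + tn) / total,     # Eq. (11)
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": math.sqrt(sensitivity * specificity),  # Eq. (14)
    }


# Hypothetical counts; class order: healthy, HOA, THA.
example_cm = [
    [10, 0, 0],
    [1, 8, 1],
    [2, 1, 7],
]
healthy = one_vs_rest_metrics(example_cm, 0)
```

In this toy matrix no healthy gait is misclassified, so the sensitivity for the healthy class is 1.0, while three non-healthy gaits predicted as healthy lower its specificity to 0.85.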

5. Results and Discussion

5.1. Performance Metric Results on Selected Kinematic Parameters for Healthy and HOA Gaits

This subsection presents experimental results. The proposed HLSTM-CNN model is evaluated on the selected kinematic parameters’ test dataset, specifically, to distinguish between healthy and HOA gaits, as described in Table 5. The test dataset is randomly chosen and was never used during the training procedure. The sagittal angles of the knee and the hip provide the best performance results with G-Mean scores of 97.6% and 97.8%, respectively. Apart from their concurrence with other published reports [,] as the most discriminating kinematic parameters, the outstanding performance results highlight DL superiority over traditional ML algorithms. Notably, healthy gaits of these kinematic parameters are not misclassified as HOA gaits, which is reflected in their sensitivity result of 100%. The next two best performing kinematic parameters are the front angles of the knee and FPA with G-Mean scores of 96.5% and 94.9%, respectively. This result is well aligned with those published in the literature.

Table 5. Performance metrics summary for healthy and HOA gait classification.

On the other hand, the worst performing kinematic parameters are the sagittal angles of ankle and FPA and the transverse angle of the pelvis with G-Mean scores below 90%. Particularly, the sagittal angle of the FPA is the only kinematic parameter that registers below 80% accuracy and G-Mean score. Put in context, its accuracy of 75% is already comparable with traditional ML algorithms and agrees with prior studies as the least discriminating kinematic parameter.

5.2. Comparative Analysis

Regarding research on HOA classification through ML techniques, two other related studies’ reports were compared to the results of the proposed architecture: (1) SVM classifier by Laroche et al. [] and (2) image-based CNN classifier via transfer learning method by Pantonial et al. []. While both investigations acquired datasets from the same laboratory, the former utilized a smaller and unpublished dataset, while the latter used the same published dataset. Figure 10 shows the results of the common kinematic parameters from these studies, and it is notable that the proposed HLSTM-CNN is in agreement that the hip sagittal angle is the most discriminating parameter for HOA classification. Both the image-based CNN and the HLSTM-CNN models recorded an outstanding 97% accuracy on this kinematic parameter, while the SVM model only recorded 85% accuracy. The results of the SVM model for the other kinematic parameters, at slightly below 75%, are decent by ML standards.

Figure 10. Comparison from the literature: SVM [], and CNN (transfer learning) [].

With the proposed architecture, for knee sagittal and FPA front angles, there is a considerable uptick in accuracy with improvements of 3.8% and 5.2%, respectively, as compared to the image-based CNN model. Even with raw gaits as inputs to the model, performance is not reduced and is in fact improved. Thus, computationally expensive scalogram image transformation is avoided without noticeable disadvantages.

The ranking of importance for the kinematic parameters is similar to that of the previous studies, with the sagittal angle of the FPA as the least discriminatory in the gait classification problem. The accuracy of HLSTM-CNN stands at 75.8%, which is a relatively low result, but the sagittal angle of the FPA has not been considered in the literature for HOA classification.

Additionally, DL methods, both image-based CNN and HLSTM-CNN, are exceptionally superior to the SVM, giving results with at least a 12% accuracy improvement on the hip and knee sagittal angles. Besides this advantage, the manual handcrafting of data for feature extraction is eliminated in the process; thus, automation for classification is a possibility with less intervention from users.

On the other hand, Teufl et al. [] differentiated gaits between healthy patients and patients 2 weeks after THA operation utilizing IMU sensors. This is an important distinction from this study’s dataset, which uses gaits 6 months after THA. It is shown that the resulting kinematic accuracy is 97%. Thus, there is a discriminating feature between healthy and THA subjects. Misclassification could provide insight into the effects of the surgery itself, which can lead to better rehabilitation.

5.3. Results on Best Performing Kinematic Parameters on the Multi-Classification Problem

This subsection presents the results, based on a confusion matrix, when the proposed HLSTM-CNN model is evaluated on the top four performing kinematic parameters based on their classification G-Mean score:

Hip Sagittal Angle
Knee Sagittal Angle
Knee Front Angle
FPA Front Angle

For clarity, the diagonal and off-diagonal blocks of a confusion matrix indicate the correct and incorrect predictions of the true values, respectively. Figure 11 and Figure 12 show the resulting confusion matrices of the sagittal angles of hip and knee, respectively. Both of these kinematic parameters registered a more than 97% G-Mean score on the HOA classification problem with no misclassification of healthy gaits as HOA, and the opposite was kept at a low value as well. Moreover, the bulk of the misclassification is on the THA gait being misclassified as either healthy or HOA (third row). In addition, both healthy and HOA gaits are misclassified as THA gaits with high incidence as well. Based on these results, it is evident that there is a clear distinction between the features of healthy and HOA gaits for these kinematic parameters. Common features may be present on the THA gait that are seen on either healthy or HOA gaits.

Figure 11. Confusion matrix of the hip sagittal angle.

Figure 12. Confusion matrix of the knee sagittal angle.
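To illustrate how the rows of such a three-class confusion matrix can be read, the sketch below converts each row (true class) into the share of its gaits predicted as each class. The counts are invented placeholders chosen only to mimic the qualitative pattern described above; they are not the values from Figures 11 to 14, and the names `CLASSES` and `row_rates` are hypothetical.

```python
from typing import Dict, List

# Class order used for both rows (true class) and columns (predicted class).
CLASSES = ["healthy", "HOA", "THA"]


def row_rates(matrix: List[List[int]]) -> List[Dict[str, float]]:
    """For each true class, return the fraction of its gaits predicted as each class."""
    rates = []
    for row in matrix:
        total = sum(row)
        rates.append({name: count / total for name, count in zip(CLASSES, row)})
    return rates


# Placeholder counts for illustration only (NOT taken from the paper's figures).
example = [
    [45, 0, 5],    # true healthy
    [1, 44, 5],    # true HOA
    [8, 10, 32],   # true THA (third row)
]

rates = row_rates(example)
tha_row = rates[CLASSES.index("THA")]
print(f"THA predicted as healthy: {tha_row['healthy']:.0%}")
print(f"THA predicted as HOA:     {tha_row['HOA']:.0%}")
print(f"THA predicted as THA:     {tha_row['THA']:.0%}")
```

In this reading, the diagonal entry of each row is the per-class recall, and the off-diagonal entries of the third row are the THA misclassifications that the following subsection examines subject by subject.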

Similar observations can be made on the confusion matrices of the front angles of the knee and FPA, as shown in Figure 13 and Figure 14, respectively. Misclassification between healthy and HOA gaits is kept to minimum values, and the bulk of the misclassification is on THA gait prediction (third row and column). In particular, the front angle of the FPA has a significant misclassification of healthy and HOA gaits as THA, thus pulling its performance down.

Figure 13. Confusion matrix of the knee front angle.

Figure 14. Confusion matrix of the FPA front angle.

Further investigation is needed into the THA misclassifications, as they could still provide information on patient satisfaction with the operation. Explicitly, if a THA kinematic parameter is misclassified as healthy, this could indicate functional improvement and a gait returning to normal, while, if it is predicted as HOA, gait impairments are still present. When more of the discriminating kinematic parameters agree in the DL model’s predictions, higher confidence can be attributed to the model’s results. This premise is explored further in the next subsection. Finally, regardless of classification outcomes, if a patient reports pain relief, improved mobility, and functionality, the THA could still be considered successful. In the future, we will include patient satisfaction as another THA verification parameter. Surgeons already use longitudinal surveys to track patient satisfaction and THA success. Follow-up usually starts just after the surgery, continues at three months and six months, and then extends for years. Our method could be an additional monitoring tool for surgeons and clinicians to track and improve THA success.

5.4. Analysis of Misclassification of THA Gaits

Building upon the results of the previous subsection, the top two most discriminating kinematic parameters, namely the sagittal angles of the hip and knee, are further investigated, particularly for subjects with THA gaits. Each subject’s gait prediction falls into one of the three classes, as shown in Table 6, where subjects are listed by their identification numbers. Subjects whose identification numbers do not appear in both test datasets of the hip and knee sagittal angles are removed for simplicity. Moreover, multiple gait data are available for each subject, but only a single gait that scores above 50% in the softmax layer is retained. Remarkably, there is no cross-prediction between healthy and HOA for the THA gaits. Explicitly, if a gait is misclassified as healthy on the hip sagittal angle, it is predicted as either healthy or THA on the knee sagittal angle, and vice versa. The same holds for HOA gait misclassification.

Table 6. Subjects’ common THA gait prediction.

To better understand the relationship between the misclassification of the knee and hip sagittal angles, a close inspection of the conditional probabilities is crucial. Both healthy and HOA misclassifications are utilized, and Venn diagrams of the subjects are plotted to show the relationship between the two best performing kinematic parameters.

First, healthy classifications are analyzed, and the results are shown in Figure 15. As a nomenclature, $SN$ in the figure signifies the subject’s identification number in the dataset. Interestingly, there are common subjects that were predicted as healthy between the two best performing kinematic parameters. This would increase our confidence in the success of the THA operations in these subjects. Evidently, the hip sagittal angle provides a reliable prediction on the healthy class with $P_{\mathrm{hea}}(\mathrm{knee}_{x} \mid \mathrm{hip}_{x}) = 83.33\%$, while the knee sagittal angle resulted in a conditional probability of $P_{\mathrm{hea}}(\mathrm{hip}_{x} \mid \mathrm{knee}_{x}) = 50\%$. Simply, when a patient’s hip sagittal angle is predicted as healthy, there is a better chance that it has returned to a healthy state.

Figure 15. Healthy prediction of subjects after THA.

Next, HOA predictions are analyzed, and the results are plotted in Figure 16. Similar to the healthy predictions, there is also a significant number of common subjects that are predicted as HOA. When the two best performing kinematic parameters give the same HOA result, this should be analyzed by medical diagnostic experts, as it could mean a failure of the THA operation. Also, the hip sagittal angle provides a slightly better prediction for the HOA class with $P_{\mathrm{hoa}}(\mathrm{knee}_{x} \mid \mathrm{hip}_{x}) = 66.67\%$, while the knee sagittal angle resulted in a conditional probability of $P_{\mathrm{hoa}}(\mathrm{hip}_{x} \mid \mathrm{knee}_{x}) = 57.14\%$.

Figure 16. HOA prediction of subjects after THA.
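The conditional probabilities quoted in this subsection can be checked directly from the subject identification numbers in Table 6. The following is a minimal sketch that treats each column of the table as a set of subjects and takes the conditional probability as the share of one set that also appears in the other; the helper name `conditional` and the dictionary layout are illustrative choices, not the authors' code.

```python
# Subject IDs of THA gaits predicted as healthy or HOA, transcribed from Table 6.
hip = {
    "healthy": {1, 51, 4, 24, 16, 60},
    "HOA": {36, 2, 27, 31, 38, 68},
}
knee = {
    "healthy": {1, 73, 91, 51, 4, 62, 17, 16, 40, 60},
    "HOA": {36, 78, 74, 27, 31, 37, 68},
}


def conditional(target: set, given: set) -> float:
    """P(subject in target | subject in given), expressed as a percentage."""
    return 100.0 * len(target & given) / len(given)


for label in ("healthy", "HOA"):
    common = sorted(hip[label] & knee[label])
    print(f"{label}: common subjects {common}")
    print(f"  P(knee | hip) = {conditional(knee[label], hip[label]):.2f}%")
    print(f"  P(hip | knee) = {conditional(hip[label], knee[label]):.2f}%")

# Cross-prediction check: no subject should be healthy on one angle and HOA on the other.
cross = (hip["healthy"] & knee["HOA"]) | (hip["HOA"] & knee["healthy"])
print("cross-predicted subjects:", sorted(cross))
```

The set intersection plays the role of the overlapping region in the Venn diagrams of Figures 15 and 16, and the empty `cross` set corresponds to the absence of cross-prediction between healthy and HOA noted above.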

6. Conclusions

The determination of the best discriminating kinematic parameters for HOA and THA classification is a crucial step in the accurate diagnosis of the disease. DL is a promising methodology for solving different classification problems, and it has proven to be a reliable approach. In this research, a hybrid of LSTM and CNN is proposed to classify gaits as healthy, gaits with HOA, and gaits 6 months after THA. The LSTM model exploits temporal dependencies of the sequential gait data, while the CNN model is employed to detect local gait features. Specifically, two bi-directional LSTMs and six CNNs are stacked to form the network, with a skipping structure to improve fitting. To evaluate the design and benchmark it against the literature, the discrimination between healthy and HOA gaits is tested. It is shown that the sagittal angles of the hip and knee are the top two most discriminating kinematic parameters with G-Mean scores of 97.8% and 97.6%, respectively. These are followed by the front angles of the knee and FPA with G-Mean scores of 96.5% and 94.9%, respectively. Overall, the performance of the selected kinematic parameters is above 80%, except for the sagittal angle of the FPA, which has a G-Mean score of 78.2%. The proposed model performs significantly better than SVM-based models [], by a margin of more than 10%, and achieves results similar to those of image-based CNN models [].

A different perspective is used in the analysis of THA gaits, with an in-depth investigation conducted on the misclassification results. This approach can provide a glimpse into the success or failure of THA surgery. It was noted that the highest proportion of errors came from the misclassification of THA. For the top four most discriminating parameters, THA gaits predicted as healthy and as HOA resulted in 10–25% and 10–30% misclassifications, respectively. In a broader sense, these percentages signify kinematic outcomes returning to normal with healthy prediction, while functional impairments can still be present with HOA prediction for the true THA class. Hence, further examination was conducted on the misclassified gaits of subjects after THA using the best two kinematics, namely the sagittal angles of the hip and the knee. The subjects were tabulated according to healthy and HOA misclassification, and it was found that there was no cross-misclassification between the top kinematic parameters. This means that there are no common features between healthy and HOA gaits on these parameters, but there are some common features with the THA gait. Indeed, there is a class of THA gait that requires a rehabilitation effort to closely align with healthy features. Additionally, more than 50% of the misclassification results are common to both hip and knee sagittal angles, thus boosting confidence in the success or failure determination of the THA operation. These findings are promising, and our novel DL methodology could be used in medical diagnostic systems. Finally, misclassifications are driving us to keep improving our methodology.

7. Future Works and Recommendations

Based on the results of this investigation, it is recommended to monitor other joint angles on the upper limbs, such as the wrist, elbow, and shoulder. Information from these angles can potentially improve the performance of the gait classification model. Additionally, other available data from the lower limbs, such as kinetics and muscle activity, should be closely examined as they potentially have more differentiating features between healthy and THA gaits. Kinematics of the pelvis and FPA must be further analyzed on THA gaits, as these can potentially boost confidence in the success of the THA operation and guide specific rehabilitation efforts. Since demographic information, such as age and gender, is available in the dataset, it would be interesting to know if there are discriminating features between demographics. Lastly, extensive investigation of DL methodologies must be pursued to improve the identification of HOA and THA gaits from kinematic outcomes through information fusion, with guidance from medical experts.

Author Contributions

Conceptualization, R.P. and M.S.; methodology, R.P. and M.S.; software, R.P.; validation, R.P.; formal analysis, R.P.; investigation, R.P. and M.S.; resources, R.P. and M.S.; data curation, R.P.; writing – original draft preparation, R.P.; writing – review and editing, M.S.; visualization, R.P. and M.S.; supervision, M.S.; project administration, M.S.; funding acquisition, R.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The gait analysis dataset of healthy subjects and subjects before and 6 months after THA was published by Bertaux et al. [] and is publicly available on the following sites: 1. Healthy Subjects: https://doi.org/10.6084/m9.figshare.15022827 (accessed on 10 April 2024); 2. HOA Subjects: https://doi.org/10.6084/m9.figshare.13656233 (accessed on 10 April 2024); 3. Demographics: https://doi.org/10.6084/m9.figshare.13655975 (accessed on 10 April 2024).

Conflicts of Interest

The authors declare no conflict of interest.

References

Ornetti, P.; Maillefert, J.-F.; Laroche, D.; Morisset, C.; Dougados, M.; Gossec, L. Gait analysis as a quantifiable outcome measure in hip or knee osteoarthritis: A systematic review. Jt. Bone Spine 2010, 77, 421–425.
Whittle, M.W. Gait Analysis: An Introduction; Butterworth-Heinemann: Oxford, UK, 2014.
Beaulieu, M.L.; Lamontagne, M.; Beaulé, P.E. Lower limb biomechanics during gait do not return to normal following total hip arthroplasty. Gait Posture 2010, 32, 269–273.
Foucher, K.C.; Hurwitz, D.E.; Wimmer, M.A. Preoperative gait adaptations persist one year after surgery in clinically well-functioning total hip replacement patients. J. Biomech. 2007, 40, 3432–3437.
Lee, S.Y.; Park, S.J.; Gim, J.-A.; Kang, Y.J.; Choi, S.H.; Seo, S.H.; Kim, S.J.; Kim, S.C.; Kim, H.S.; Yoo, J.-I. Correlation between Harris hip score and gait analysis through artificial intelligence pose estimation in patients after total hip arthroplasty. Asian J. Surg. 2023, 46, 5438–5443.
Bedson, J.; Croft, P.R. The discordance between clinical and radiographic knee osteoarthritis: A systematic search and summary of the literature. BMC Musculoskelet. Disord. 2008, 9, 1–11.
Sepas-Moghaddam, A.; Etemad, A. Deep gait recognition: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 264–284.
Gouwanda, D.; Senanayake, S. Emerging trends of body-mounted sensors in sports and human gait analysis. In Proceedings of the 4th Kuala Lumpur International Conference on Biomedical Engineering 2008: BIOMED 2008, Kuala Lumpur, Malaysia, 25–28 June 2008; pp. 715–718.
Marsico, M.D.; Mecca, A. A survey on gait recognition via wearable sensors. ACM Comput. Surv. (CSUR) 2019, 52, 1–39.
Camps, J.; Samà, A.; Martín, M.; Rodriguez-Martin, D.; Pérez-López, C.; Arostegui, J.M.M.; Cabestany, J.; Catala, A.; Alcaine, S.; Mestre, B. Deep learning for freezing of gait detection in Parkinson’s disease patients in their homes using a waist-worn inertial measurement unit. Knowl.-Based Syst. 2018, 139, 119–131.
Constantinou, M.; Loureiro, A.; Carty, C.; Mills, P.; Barrett, R. Hip joint mechanics during walking in individuals with mild-to-moderate hip osteoarthritis. Gait Posture 2017, 53, 162–167.
Leigh, R.J.; Osis, S.T.; Ferber, R. Kinematic gait patterns and their relationship to pain in mild-to-moderate hip osteoarthritis. Clin. Biomech. 2016, 34, 12–17.
Dindorf, C.; Teufl, W.; Taetz, B.; Becker, S.; Bleser, G.; Fröhlich, M. Feature extraction and gait classification in hip replacement patients on the basis of kinematic waveform data. Biomed. Hum. Kinet. 2021, 13, 177–186.
Longworth, J.A.; Chlosta, S.; Foucher, K.C. Inter-joint coordination of kinematics and kinetics before and after total hip arthroplasty compared to asymptomatic subjects. J. Biomech. 2018, 72, 180–186.
Fujii, J.; Aoyama, S.; Tezuka, T.; Kobayashi, N.; Kawakami, E.; Inaba, Y. Prediction of change in pelvic tilt after total hip arthroplasty using machine learning. J. Arthroplast. 2023, 38, 2009–2016.e2003.
Hannink, J.; Kautz, T.; Pasluosta, C.F.; Gaßmann, K.-G.; Klucken, J.; Eskofier, B.M. Sensor-based gait parameter extraction with deep convolutional neural networks. IEEE J. Biomed. Health Inform. 2016, 21, 85–93.
Sharifi Renani, M.; Myers, C.A.; Zandie, R.; Mahoor, M.H.; Davidson, B.S.; Clary, C.W. Deep learning in gait parameter prediction for OA and TKA patients wearing IMU sensors. Sensors 2020, 20, 5553.
Turner, A.; Hayes, S. The classification of minor gait alterations using wearable sensors and deep learning. IEEE Trans. Biomed. Eng. 2019, 66, 3136–3145.
Potluri, S.; Ravuri, S.; Diedrich, C.; Schega, L. Deep learning based gait abnormality detection using wearable sensor system. In Proceedings of the 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Berlin, Germany, 23–27 July 2019; pp. 3613–3619.
Zhang, J.; Li, Y.; Tian, J.; Li, T. LSTM-CNN hybrid model for text classification. In Proceedings of the 2018 IEEE 3rd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chongqing, China, 12–14 October 2018; pp. 1675–1680.
Liu, F.; Zhou, X.; Wang, T.; Cao, J.; Wang, Z.; Wang, H.; Zhang, Y. An attention-based hybrid LSTM-CNN model for arrhythmias classification. In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019; pp. 1–8.
Aksan, F.; Li, Y.; Suresh, V.; Janik, P. CNN-LSTM vs. LSTM-CNN to predict power flow direction: A case study of the high-voltage subnet of northeast Germany. Sensors 2023, 23, 901.
Laroche, D.; Tolambiya, A.; Morisset, C.; Maillefert, J.-F.; French, R.M.; Ornetti, P.; Thomas, E. A classification study of kinematic gait trajectories in hip osteoarthritis. Comput. Biol. Med. 2014, 55, 42–48.
Pantonial, R.; Simic, M. Transfer Learning Method for the Classification of Hip Osteoarthritis using Kinematic Gait Parameters. In Proceedings of the 28th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems (KES2024), Seville, Spain, 11–13 September 2024.
Teufl, W.; Taetz, B.; Miezal, M.; Lorenz, M.; Pietschmann, J.; Jöllenbeck, T.; Fröhlich, M.; Bleser, G. Towards an inertial sensor-based wearable feedback system for patients after total hip arthroplasty: Validity and applicability for gait classification with gait kinematics-based features. Sensors 2019, 19, 5006.
Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90.
Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv 2015, arXiv:1502.03167.
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
Bertaux, A.; Gueugnon, M.; Moissenet, F.; Orliac, B.; Martz, P.; Maillefert, J.-F.; Ornetti, P.; Laroche, D. Gait analysis dataset of healthy volunteers and patients before and 6 months after total hip arthroplasty. Sci. Data 2022, 9, 399.
Dindorf, C.; Teufl, W.; Taetz, B.; Bleser, G.; Fröhlich, M. Interpretability of input representations for gait classification in patients after total hip arthroplasty. Sensors 2020, 20, 4385.

Table 1. Related works.

Reference	Sensor	Gait	Model
Laroche et al. []	Vision	healthy, HOA	SVM with linear kernel
Pantonial et al. []	Vision	healthy, HOA	image-based CNN
Teufl et al. []	IMU	healthy, THA	SVM with Gaussian RBF Kernel

Table 2. Dataset demographics summary.

Demographic Information	HOA Patients	Control Subjects
Gender	51 male and 55 female	35 male and 45 female
Age	66.9 ± 9.4 years	58.7 ± 15.5 years
Height	1.64 ± 0.08 m	1.66 ± 0.08 m
Weight	77.8 ± 17.1 kg	69.3 ± 13.4 kg

Table 3. Summary of model parameters.

Layer Type	Parameter	Value
CNN	Number of Filters	24
CNN	Padding	Same
dropout	Probability	0.2
fully connected layer (1)	Output size	256

Table 4. Summary of hyperparameters.

Hyperparameter	Value
Initial Learning Rate	0.001
Maximum Epoch	20
Mini-Batch Size	20
Validation Frequency	10
Solver	Stochastic Gradient Descent with Momentum
Shuffle	Every Epoch

Table 5. Performance metrics summary for healthy and HOA gait classification.

Kinematic Parameter	Accuracy (%)	Sensitivity (%)	Specificity (%)	G-Mean (%)
front ankle	92.31	93.15	91.80	92.47
front knee	96.34	97.18	95.83	96.51
front FPA	95.03	94.55	95.28	94.91
front hip	92.89	93.15	92.74	92.95
front pelvis	93.75	96	92.48	94.22
sagittal ankle	87.96	85.33	89.66	87.47
sagittal knee	96.76	100	95.16	97.55
sagittal FPA	75.86	82.98	73.72	78.21
sagittal hip	97.22	100	95.73	97.84
transverse pelvis	84.73	83.12	85.71	84.41

Table 6. Subjects’ common THA gait prediction.

Hip Sagittal Angle			Knee Sagittal Angle
HEALTHY	HOA	THA	HEALTHY	HOA	THA
1	36	71	1	36	2
51	2	23	73	78	71
4	27	42	91	74	23
24	31	73	51	27	42
16	38	91	4	31	34
60	68	78	62	37	82
		74	17	68	72
		34	16		58
		82	40		75
		62	60		48
		17			56
		58			67
		75			10
		48			24
		56			38
		67			29
		10			6
		40
		37
		29
		6

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).