Accelerations Recorded by Simple Inertial Measurement Units with Low Sampling Frequency Can Differentiate between Individuals with and without Knee Osteoarthritis: Implications for Remote Health Care

Determining the presence and severity of knee osteoarthritis (OA) is a valuable application of inertial measurement units (IMUs) in the remote monitoring of patients. This study aimed to employ the Fourier representation of IMU signals to differentiate between individuals with and without knee OA. We included 27 patients with unilateral knee osteoarthritis (15 females) and 18 healthy controls (11 females). Gait acceleration signals were recorded during overground walking. We obtained the frequency features of the signals using the Fourier transform. The logistic LASSO regression was employed on the frequency domain features as well as the participant’s age, sex, and BMI to distinguish between the acceleration data from individuals with and without knee OA. The model’s accuracy was estimated by 10-fold cross-validation. The frequency contents of the signals were different between the two groups. The average accuracy of the classification model using the frequency features was 0.91 ± 0.01. The distribution of the selected features in the final model differed between patients with different severity of knee OA. In this study, we demonstrated that using logistic LASSO regression on the Fourier representation of acceleration signals can accurately determine the presence of knee OA.


Introduction
In patients with knee osteoarthritis (OA), gait analysis can support clinical decisions and contribute to the evaluation of interventions by providing relevant information on the course of the disease and the response to treatment [1]. Objective gait analysis has traditionally been limited to sophisticated biomechanical gait laboratories, but inertial measurement units (IMUs) have recently received more attention because of their advantages, especially for use in everyday settings [2]. However, compared with conventional methods, these sensors provide restricted data, which makes comprehensive data analysis important for the practical application of IMUs in clinical settings.
One of the most useful clinical applications of IMUs, especially in telemedicine and remote patient monitoring, is determining the presence or severity of knee OA. Different studies used various data analysis methods for this purpose [3]. While some studies employed more computationally complex approaches to process the IMU-derived data [4][5][6][7], others primarily used raw IMU data to classify gait deviation due to knee OA [8][9][10]. The complex methods yield spatiotemporal parameters and joint kinematics comparable to traditional gait analysis; however, some concerns have been raised about the validity of the

Study Setting and Participants
This cross-sectional observational study was conducted at Aalborg University Hospital, Denmark. Data collection was performed at the hospital's outpatient clinic. The participants included 27 patients with unilateral knee osteoarthritis and 18 volunteers without lower limb complaints. The exclusion criteria were a BMI higher than 35 kg/m², a recent history of surgery in the lower limbs, neurological movement disorders, and inflammatory arthritis. In addition, we excluded the patients with complaints of pain or discomfort in the spine and lower-limb joints other than the affected knee and healthy controls with any pain or discomfort in the spine or lower-limb joints. Orthopedic surgeons with a subspecialty in knee replacement surgery established the diagnosis of knee OA in the patients. The Regional Committee on Health Research Ethics approved the study (journal 2021-000438). All participants were informed about the study and signed informed consent forms.

Data Collection
The participants' basic information (age, sex, and BMI) was registered in a secure REDCap database hosted by North Jutland Region. The participants also filled out the knee injury and osteoarthritis outcome score (KOOS) questionnaire as a subjective measure of the problems regarding knee OA [21]. KOOS scores were analyzed separately in five subscales: pain, symptoms (other than pain), disability regarding the activities of daily living (ADL), disability regarding sport and recreational activities (more demanding than activities of daily living), and quality of life (QoL). In addition, the severity of knee OA in the patients' radiographic images was evaluated according to the Kellgren-Lawrence (KL) classification [22].
The IMUs were SENS Motion sensors (SENS Motion®, Copenhagen, Denmark) containing only a 3D accelerometer sampling at 12.5 Hz and were previously validated [23,24]. We placed the IMUs on the lateral side of the distal thigh, ipsilateral with the affected knee (Figure 1). According to the manufacturer's instructions, the sensors were located approximately 10 cm above the lateral femoral epicondyle, and no calibrations were performed before recording data. The side of the IMU in the control group was randomly chosen. The participants performed two overground walking trials at a self-selected speed with a 5 min interval in a straight corridor.
Sensors 2023, 23, 2734

Acceleration Signal Processing
Three-dimensional linear acceleration signals from the IMUs corresponding to the craniocaudal (CC), anteroposterior (AP), and mediolateral (ML) axes were recorded and processed for further analysis. We randomly selected each participant's first or second gait trial for analysis. Considering the periodic nature of gait kinematic signals, we reconstructed a continuous interpolation of the acceleration signals using the Fourier method. Since an averaged waveform is more reliable and the average variance provides additional information as to the randomness of the variable [25], we extracted a segment from the middle of the walking bout, divided it into ten individual gait cycles using autocorrelation, and calculated the average of these cycles. Subsequently, the fundamental angular stride frequency (ω), the Fourier series representation, and the power of the signals corresponding to the signal frequencies were obtained from the averaged signal. The power of a signal at a particular frequency, P(fi), reveals how much of that frequency, fi, is present in the signal and is calculated by adding the squares of the ith pair of Fourier coefficients. The calculation of the Fourier coefficients and the power of the signal was previously described by Derrick [26].
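The segmentation and averaging step can be illustrated with a minimal sketch. It is not the authors' code: the synthetic signal, the stride length of 13 samples, the lag search window, and all function names are assumptions chosen for illustration; only the 12.5 Hz sampling rate and the use of ten cycles from the middle of the bout come from the text.

```python
import math

FS = 12.5            # Hz, accelerometer sampling rate stated in the text
STRIDE_SAMPLES = 13  # hypothetical stride length in samples (about 0.96 Hz)


def autocorrelation(signal, lag):
    """Normalized autocorrelation of the mean-removed signal at one lag."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    energy = sum(c * c for c in centered)
    overlap = sum(centered[i] * centered[i + lag] for i in range(n - lag))
    return overlap / energy


def estimate_stride_samples(signal, min_lag, max_lag):
    """Return the lag (in samples) with the highest autocorrelation."""
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: autocorrelation(signal, lag))


def average_cycle(signal, stride, n_cycles=10):
    """Average n_cycles consecutive strides taken from the middle of the bout."""
    start = (len(signal) - n_cycles * stride) // 2
    cycles = [signal[start + c * stride:start + (c + 1) * stride]
              for c in range(n_cycles)]
    return [sum(cycle[j] for cycle in cycles) / n_cycles for j in range(stride)]


# Synthetic single-axis "gait" signal: fundamental plus second harmonic, 20 strides.
signal = [math.sin(2 * math.pi * k / STRIDE_SAMPLES)
          + 0.5 * math.sin(4 * math.pi * k / STRIDE_SAMPLES)
          for k in range(20 * STRIDE_SAMPLES)]

stride = estimate_stride_samples(signal, min_lag=7, max_lag=19)
mean_cycle = average_cycle(signal, stride)
stride_frequency_hz = FS / stride
omega = 2 * math.pi * stride_frequency_hz  # fundamental angular stride frequency
```

The lag window (7 to 19 samples) is chosen here so that it contains one stride but not two; on real recordings the window would have to be set from plausible walking cadences.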
The value of the Fourier coefficients and the power at the first six frequencies, P(f1), P(f2), P(f3), …, P(f6), were calculated for the signals corresponding to the CC, AP, and ML axes.
Numerical processing of the acceleration signals was performed in Python [27].
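As a hedged illustration of the power calculation described above, the sketch below computes the cosine and sine coefficients of one averaged cycle by direct summation and adds their squares for each of the first six harmonics. The function names, the 13-sample cycle, and the test waveform are assumptions; the study's own implementation is not shown in the text.

```python
import math


def fourier_coefficients(cycle, n_harmonics):
    """Return [(a_i, b_i)] for harmonics 1..n_harmonics of one sampled period.

    Valid for harmonics strictly below half the number of samples per cycle.
    """
    n = len(cycle)
    coefficients = []
    for i in range(1, n_harmonics + 1):
        a_i = 2.0 / n * sum(x * math.cos(2 * math.pi * i * k / n)
                            for k, x in enumerate(cycle))
        b_i = 2.0 / n * sum(x * math.sin(2 * math.pi * i * k / n)
                            for k, x in enumerate(cycle))
        coefficients.append((a_i, b_i))
    return coefficients


def harmonic_power(coefficients):
    """P(f_i) = a_i^2 + b_i^2 for each pair of Fourier coefficients."""
    return [a * a + b * b for a, b in coefficients]


# Hypothetical averaged cycle of 13 samples with known content:
# offset 1.0, cosine amplitude 2.0 at f1, sine amplitude 0.5 at f3.
N = 13
cycle = [1.0
         + 2.0 * math.cos(2 * math.pi * k / N)
         + 0.5 * math.sin(2 * math.pi * 3 * k / N)
         for k in range(N)]

powers = harmonic_power(fourier_coefficients(cycle, n_harmonics=6))
```

For this test waveform the power is concentrated at the first and third harmonics (4.0 and 0.25), and the remaining four values are zero up to rounding error.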

Logistic LASSO Regression
Logistic LASSO (least absolute shrinkage and selection operator) regression was employed in this study. The LASSO is a regularization method that, in a classification task, selects the features most relevant to the outcome variable, i.e., knee OA. This shrinkage method can actively select from a large and potentially multicollinear set of variables in the regression, resulting in a more relevant and interpretable set of predictors [28]. In addition, LASSO shrinks the regression coefficients to reduce the likelihood of overfitting and, as a regression method, can handle the confounders and the correlation within the gait data.
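For reference, the standard form of the L1-penalized logistic regression objective can be written as follows. The notation is an assumption, since the text does not define symbols: N participants, outcome y_i in {0, 1} for the absence or presence of knee OA, feature vector x_i with p entries, intercept β_0, coefficients β, and penalty parameter λ.

```latex
(\hat{\beta}_0, \hat{\beta}) =
\arg\min_{\beta_0,\,\beta}
\left\{
  -\frac{1}{N} \sum_{i=1}^{N}
  \left[
    y_i \left(\beta_0 + x_i^{\top}\beta\right)
    - \log\left(1 + e^{\beta_0 + x_i^{\top}\beta}\right)
  \right]
  + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
\right\}
```

The first term is the negative binomial log-likelihood and the second is the L1 penalty; increasing λ drives more coefficients β_j exactly to zero, which is what makes the method a feature selector.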
Since we did not perform any matching between the patients and the control group, the potential confounders (ω, age, sex, and BMI) were added to the regression model, in addition to the power of 18 signal frequencies along the CC, AP, and ML axes as the explanatory variables (Table 1). The outcome variable was the participant's group (patients vs. controls) considered as the presence of knee OA. To construct the features from the explanatory variables, the continuous variables (age, BMI, ω, and the power of the signal frequencies) were standardized by removing the mean and scaling to unit variance, and the only categorical variable (sex) was encoded as a factor with unordered levels.
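The feature construction described above can be sketched as follows. The values are invented, the function names are hypothetical, and two choices are assumptions not stated in the text: the sample standard deviation (n − 1 denominator) is used for scaling, and sex is encoded as a single 0/1 indicator.

```python
import statistics


def standardize(values):
    """Remove the mean and scale to unit variance (sample SD, an assumption)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def encode_sex(labels, reference="female"):
    """Encode a two-level factor as an indicator: 0 for the reference level, 1 otherwise."""
    return [0 if label == reference else 1 for label in labels]


# Hypothetical participants (not study data).
age = [62.0, 70.0, 55.0, 48.0]
sex = ["female", "male", "female", "male"]

age_z = standardize(age)
sex_indicator = encode_sex(sex)
```

Standardizing before fitting matters for LASSO because the L1 penalty acts on coefficient magnitudes, which depend on the scale of each feature.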
The model was a logistic LASSO regression model fitted via penalized maximum likelihood. The penalty coefficients (λ) were computed using 10-fold cross-validation with values ranging from 10⁻¹² to 10², based on the best computed binomial deviances. No weight or offset was specified for the observations. Two λ values were computed: λmin, defined as the λ that minimizes the binomial deviance, and a more stringent value, λ1se, defined as the largest λ that is still within one standard error of the minimum binomial deviance. λ1se results in a smaller number of covariates than λmin. We estimated λmin = 0.02 and λ1se = 0.10 (Figure 2).
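The two selection rules can be made concrete with a short sketch. The function name is hypothetical, and the cross-validation curve below is invented purely so that the rules return the two values reported in the text; it is not the study's deviance curve.

```python
def select_lambdas(lambdas, mean_deviance, se_deviance):
    """Return (lambda_min, lambda_1se) from a cross-validated deviance curve."""
    i_min = min(range(len(lambdas)), key=lambda i: mean_deviance[i])
    # One-standard-error rule: accept any deviance up to min + SE at the minimum.
    threshold = mean_deviance[i_min] + se_deviance[i_min]
    lambda_1se = max(lam for lam, dev in zip(lambdas, mean_deviance)
                     if dev <= threshold)
    return lambdas[i_min], lambda_1se


# Invented cross-validation summary (mean binomial deviance and its SE per lambda).
lambdas = [0.005, 0.02, 0.05, 0.10, 0.20]
mean_deviance = [0.70, 0.60, 0.63, 0.66, 0.90]
se_deviance = [0.08, 0.07, 0.07, 0.08, 0.10]

lambda_min, lambda_1se = select_lambdas(lambdas, mean_deviance, se_deviance)
```

Because λ1se is never smaller than λmin, it applies a stronger penalty and therefore retains fewer covariates, in line with the statement above.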
We utilized R Statistical Software [29] and the related packages to fit the logistic LASSO regression model [30,31] and to estimate the confidence intervals for the coefficients in the model [32].
The LASSO coefficients' 95% CI and p-values at a fixed value for the penalty parameter (λ = λ1se) were obtained using the method described by Taylor and Tibshirani [33].
Finally, the model's performance in classifying the gait signals into osteoarthritic and non-osteoarthritic knees (patients vs. controls) was estimated by performing 10-fold cross-validation on the data divided into 75% training and 25% validation sets. Subsequently, the mean and 95% CI were calculated for accuracy, defined as the percentage of correctly classified instances out of all cases, and for Cohen's Kappa, the measure of agreement between the actual and classified labels. We also calculated the accuracy and Kappa for a separate logistic LASSO regression model containing only the potential confounder variables (age, BMI, and ω) to ascertain whether the power of the frequencies classified the gait signals better than the potential confounders.
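The two performance measures can be computed as in the following sketch. The labels are invented (1 for knee OA, 0 for control) and the function names are hypothetical; the formulas are the usual definitions of accuracy and Cohen's Kappa.

```python
def accuracy(actual, predicted):
    """Proportion of correctly classified instances."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def cohens_kappa(actual, predicted):
    """Agreement between actual and classified labels, corrected for chance."""
    n = len(actual)
    observed = accuracy(actual, predicted)
    labels = set(actual) | set(predicted)
    expected = sum((actual.count(label) / n) * (predicted.count(label) / n)
                   for label in labels)
    return (observed - expected) / (1 - expected)


# Invented validation-fold labels (not study data).
actual = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]

acc = accuracy(actual, predicted)
kappa = cohens_kappa(actual, predicted)
```

In this toy case eight of ten labels agree, so the accuracy is 0.8, while Kappa is lower (about 0.58) because part of that agreement would be expected by chance given the class proportions.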

Comparing the Severity of Knee OA
We used KOOS as a subjective measure to estimate the severity of individuals' problems related to knee OA. We divided the participants into three groups using terciles (33rd and 67th percentiles) for each of the KOOS subscales (pain, symptoms, ADL, sport, and QoL): individuals with the highest scores or no/mild knee OA (G0), individuals with scores in the middle range or moderate knee OA (G1), and individuals with the lowest scores or severe knee OA (G2). We also divided the participants into three groups based on the radiographic classification of knee OA. However, since we did not perform a radiographic examination in healthy individuals, the participants in the control group, patients with KL 1 or 2, and patients with KL 3 or 4 formed the G0, G1, and G2 groups for KL classification, respectively. The selected features of the logistic LASSO regression at λ = λ1se were compared between the three groups of participants for each KOOS subscale in addition to KL classification.
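The tercile grouping for a single KOOS subscale can be sketched as below. The scores are invented, the function name is hypothetical, and the handling of scores that fall exactly on a cut point is an assumption, since the text does not specify it.

```python
import statistics


def tercile_groups(scores):
    """Assign G0 (highest third), G1 (middle third), or G2 (lowest third)."""
    low_cut, high_cut = statistics.quantiles(scores, n=3)
    groups = []
    for score in scores:
        if score > high_cut:
            groups.append("G0")  # highest scores: no/mild knee OA
        elif score > low_cut:
            groups.append("G1")  # middle range: moderate knee OA
        else:
            groups.append("G2")  # lowest scores: severe knee OA
    return groups


# Invented KOOS-Pain scores on the 0-100 scale (not study data).
koos_pain = [95, 90, 85, 70, 65, 60, 40, 35, 30]
groups = tercile_groups(koos_pain)
```

Higher KOOS scores indicate fewer problems, which is why the top tercile maps to G0 rather than G2.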

Statistical Analysis
Descriptive statistics were used to describe the participants' characteristics. The numerical variables (age, BMI, pain score, and KOOS) were presented as mean and standard deviation, and categorical variables (sex, KL classification, and the affected side) were shown as counts. The frequency domain features (ω and the power of the frequencies) were described as mean and range. Since the Shapiro-Wilk normality test did not confirm normality, univariate statistical comparisons were conducted using the non-parametric Wilcoxon rank sum test. In addition, the sex distribution of the two groups was compared with a chi-square test. The mean and 95% confidence intervals (CI) were calculated for the differences in the means of the continuous variables (age, BMI, KOOS, and the frequency domain features of the signals). The significance level was considered as α = 0.05. Statistical analyses were conducted in the R Statistical Software.

Results
Table 2 compares the basic characteristics of the patient and control groups. The patients were significantly older and had higher BMI than the control group. In addition, the knee outcome scores were markedly higher in the control group compared with the patient group. Table 3 demonstrates the frequency domain features of the acceleration signals in the patients and controls. Except for the power of the first and fourth frequencies in the ML axis, the differences in the mean values for the calculated features were significant.
Figure 3 shows the shrinkage in the estimate of the coefficients for different values of the penalty parameter (λ) in the logistic LASSO regression model. We could demonstrate that 13 out of 22 coefficients (including age, sex, ω, and certain signal frequency powers) vanished with λ less than 10⁻⁸. Increasing λ beyond λmin excluded four more coefficients, including BMI.
At λ1se, five variables remained in the model, of which the power of the sixth frequency in the ML axis was close to zero and was nullified soon afterward. Overall, four coefficients remained in the model longest and were thus the most determining features in distinguishing between the gait acceleration signals of osteoarthritic and non-osteoarthritic knees. These four features were the power of the second, fifth, and sixth frequencies of the CC axis and the power of the fifth frequency in the AP axis. The 95% CI and p-values for the coefficients at λ1se demonstrated similar results (Table 4).

Table 4. The coefficients of the logistic LASSO regression at λ = λ1se. Columns: Variable; Coefficient [95% CI]; p-Value.

Figure 4 illustrates the distribution of the patients and controls based on the three most determining coefficients in the final model.
Figure 5 demonstrates the distribution of the selected features of the logistic LASSO regression model in three groups of participants with different severity of knee OA based on KOOS and KL classification (G0, G1, and G2). The features differed significantly between the G0 and G1 groups and between the G0 and G2 groups divided by KOOS subscales and KL classification. However, in the comparison between the G1 and G2 groups (participants with moderate and severe knee OA), only the power of the sixth frequency of the CC axis was statistically different between groups demarcated by KOOS-Symptoms.

Discussion
This study aimed to determine the ability of simple and low-sampling frequency IMUs to differentiate between individuals with and without knee OA. Using the signals' frequency contents, we could distinguish between these individuals with high accuracy (0.91 ± 0.01). In addition, we could demonstrate differences in the distribution of the frequency domain features between individuals with different severity of knee OA.
We found the most significant differences in the acceleration frequency contents in the CC axis. The significance of the CC axis acceleration can be justified biomechanically by its direct relationship with stance phase knee joint compression forces, to which patients with knee OA or pre-OA are likely to be sensitive. In another study, Hung et al. observed higher tibial vertical acceleration in medial knee OA patients than age-matched controls, suggesting a more considerable ground impact on the knee joint [34]. They also observed significantly higher vertical acceleration differences between the tibia and femur in the patients compared to the control group, indicating a more significant kinetic moment between the segments, parallel to the clinical observation of the lateral thrust gait [34]. However, lower trunk-foot acceleration attenuation along the CC axis was observed in another study on an elderly female population without knee OA [35]. Levinger et al. also explored the frequency content of tibia acceleration signals [36]. They reported greater components in higher frequencies (>5 Hz) of the CC axis for the knee OA subjects than the healthy group. They attributed this finding to instability and altered attenuation of the impact during walking in patients with knee OA [36].
We also found different values for the selected features between patients with different severity of knee OA. However, the differences between the patients with moderate and severe knee OA were insignificant. We should emphasize that the KOOS performs best in measuring the changes in outcomes of the patients over time rather than comparing different subjects [21]. Radiographic assessment is also an imprecise marker of pain or disability due to knee OA [37]. Nevertheless, inspecting the distribution of the selected features demonstrated that the values in the G1 group (moderate knee OA) were lower than the values in G0 (no/mild knee OA) and higher than in G2 (severe knee OA) groups divided by KOOS-Pain and KOOS-Symptoms. This pattern was also observed to a lesser degree in groups divided by KOOS-ADL and KL classification. Pain and symptom subscales, based on the International Classification of Functioning, Disability, and Health (ICF) framework, represent the body function (the anatomical and physiological level) [38], whereas ADL shows activity (the personal level), and the sport and recreational and QoL subscales demonstrate participation (the level to which the person interacts with society) [38]. The different distribution of the selected features in patients with different knee OA severity raises the prospect of creating a gait deviation score based on frequency-domain features in these patients. Other studies evaluating the discriminating capacity of IMU data in knee OA severity assessed the time domain of the data [9,10] or employed more computationally complex approaches to extract the spatiotemporal parameters [6].
In the gait analysis application of machine learning, several methods have been described for extracting and selecting the features [39]. In this study, we employed logistic LASSO regression for feature selection and discriminating between the subjects with and without knee OA. As a logistic regression, this method can be used as a classification algorithm to distinguish between individuals with and without knee OA. LASSO can also perform feature selection by shrinking the features with little correlation with the response variable (i.e., the presence of knee OA in our study) toward zero. Logistic LASSO regression can handle the within-data correlation in gait data. Furthermore, logistic regression allows adjusting for the confounders [40]. Confounding variables, such as sex, age, and BMI, can affect gait characteristics [41][42][43], and researchers have applied different methods to control for the confounders in gait analysis [44]. We did not match the case and control groups for the confounding variables in this study. However, employing the LASSO regression, we could demonstrate that the confounders (sex, age, and BMI) were less effective than the frequency powers in the signal, and the effect of BMI (as the most influential confounder) was nullified before several frequency powers. Walking speed can also challenge gait analysis in knee OA [45]. In this study, we did not directly measure the gait velocity; however, ω, which correlates with the walking speed, was significantly higher in the controls compared with the patients. Nonetheless, the coefficient corresponding to ω was considerably lower than most of the frequency powers in the signal. The effect of ω was omitted by employing penalty values (λ) much smaller than the value we used in the final model in LASSO regression. 
Additionally, the inferior performance of a LASSO relying only on potential confounders (age, BMI, and ω), compared to the full LASSO model, signified the importance of the power of the frequencies in classifying the gait acceleration signals.
To our knowledge, this was the first study evaluating the frequency domain of IMU data to assess the presence and severity of knee OA and the first study employing a shrinkage technique for feature selection in gait analysis. Using LASSO regression, we identified the association between the frequency contents of the lower limb linear accelerations and ipsilateral knee OA. However, this study has several limitations. Firstly, conventional gait analysis was not performed as a ground truth to distinguish between the patients and controls. Nonetheless, the KOOS demonstrated significantly lower values, indicating walking problems in the patients. The correlation between KOOS and the severity of knee OA has been demonstrated [46]. Likewise, we could not compare the patients and the control group based on the radiographic features of knee OA since the radiographic evaluation, for ethical reasons, was not performed in the control group. Therefore, the diagnosis of knee OA was ruled out in the controls based on the absence of signs and symptoms [47]. An important limitation, not only of this study but of frequency analysis in general, is its non-intuitive nature compared to time-continuous analysis, which complicates the interpretation of the frequency domain features. Relatedly, we must acknowledge that, based on the Nyquist theorem [48], the low-sampling frequency sensors provide us with limited bandwidth for data analysis. There is a trade-off between the sensor's sampling rate and the frequency bandwidth available for analysis. Still, we believe finding differences in a limited bandwidth is impressive. Finally, we recorded gait data using standardized protocols in the hospital rather than at patients' homes; however, we attempted to simplify the data collection protocol as much as possible to be reproducible in real-life environments.
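The bandwidth limitation mentioned above can be made concrete with a small worked example. Only the 12.5 Hz sampling rate comes from the text; the stride frequencies and the function name are hypothetical, and harmonics falling exactly on the Nyquist frequency are treated as not resolvable.

```python
FS = 12.5          # Hz, sampling rate of the accelerometer (from the text)
NYQUIST = FS / 2   # highest frequency representable at this sampling rate


def resolvable_harmonics(stride_hz, nyquist_hz=NYQUIST):
    """Count harmonics of the stride frequency lying strictly below Nyquist."""
    count = int(nyquist_hz // stride_hz)
    if count * stride_hz == nyquist_hz:
        count -= 1  # a harmonic exactly at Nyquist cannot be resolved unambiguously
    return count


# Hypothetical stride frequencies: a slower and a faster walker.
harmonics_slow = resolvable_harmonics(1.0)   # harmonics at 1, 2, ..., 6 Hz
harmonics_fast = resolvable_harmonics(1.25)  # the 5th harmonic falls on 6.25 Hz
```

Under these assumptions, a stride frequency of about 1 Hz leaves room for six harmonics below 6.25 Hz, whereas a faster cadence of 1.25 Hz leaves only four.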
Long-term monitoring of patients requires simple devices with long battery life. The IMU we utilized in this study was a single low-sampling accelerometer affixed with a bandage-like skin adhesive and supplemented with a cloud connection for data transfer. The low-sampling accelerometers, as passive electronic components, have a long battery life of up to three months, which makes these devices an appropriate choice for the remote monitoring of patients. Despite the limitations, the Fourier coefficients of signals recorded by such sensors demonstrated a high discriminative capacity in knee OA. The method suggested in this study facilitates the automatic extraction of valuable parameters from the IMU's raw data so that clinicians can use gait analysis in regular consultations. The IMU-derived parameters can help identify patients with knee OA and evaluate the response to treatments and interventions. In addition, these locomotion parameters provide an opportunity to automatically monitor the recovery process after surgery and intelligently individualize the rehabilitation programs beyond the limits of the clinics in real-life conditions. However, before bringing this approach into patients' homes, the correlation between the changes in the frequency contents of acceleration signals and the disease's severity must be investigated. Likewise, before applying this method to remote monitoring of patients, the responsiveness to changes in the signals' frequency contents needs to be evaluated. This study outlines the vision that IMUs may provide practical information on the diagnosis and assessment of knee OA, consequently reducing the number of referrals to secondary healthcare and decreasing the cost and burden of diagnostic procedures.
For instance, considering the location of the used sensors close to pants pockets, where most people carry their mobile phones, future studies can demonstrate the possibility of using smartphones equipped with inertial sensors instead of additional devices for diagnostic and investigative purposes in patients with knee OA.

Conclusions
The frequency contents of the lower limbs' linear accelerations derived from the Fourier series of the signals can accurately differentiate between individuals with and without knee OA. However, although the distribution of the frequency powers differed between patients with different severity of knee OA, further studies are required to clarify the correlation between the frequencies of IMU signals and the severity of knee OA.