Towards Effective Parkinson’s Monitoring: Movement Disorder Detection and Symptom Identification Using Wearable Inertial Sensors

Khan, Umar; Riaz, Qaiser; Hussain, Mehdi; Zeeshan, Muhammad; Krüger, Björn

doi:10.3390/a18040203

by Umar Khan ¹, Qaiser Riaz ¹,*, Mehdi Hussain ¹, Muhammad Zeeshan ¹ and Björn Krüger ²

¹ School of Electrical Engineering and Computer Science (SEECS), National University of Sciences and Technology (NUST), Islamabad 44000, Pakistan
² Department for Epileptology, University Hospital Bonn, 53127 Bonn, Germany
* Author to whom correspondence should be addressed.

Algorithms 2025, 18(4), 203; https://doi.org/10.3390/a18040203

Submission received: 30 January 2025 / Revised: 11 March 2025 / Accepted: 2 April 2025 / Published: 4 April 2025

(This article belongs to the Special Issue Machine Learning in Medical Signal and Image Processing (3rd Edition))

Abstract

Parkinson’s disease lacks a cure, yet symptomatic relief can be achieved through various treatments. This study addresses anomalous event detection in the activities of daily living of patients with Parkinson’s disease and the identification of associated movement disorders, such as tremors, dyskinesia, and bradykinesia. Utilizing the inertial data acquired from the most affected upper limb of the patients, this study aims to create an optimal pipeline for Parkinson’s patient monitoring. This study proposes a two-stage movement disorder detection and classification pipeline for binary classification (normal or anomalous event) and multi-label classification (tremors, dyskinesia, and bradykinesia), respectively. The proposed pipeline employs and evaluates manual feature crafting for classical machine learning algorithms, as well as an RNN-CNN-inspired deep learning model that does not require manual feature crafting. This study also explores three different window sizes for signal segmentation and two different auto-segment labeling approaches for precise and correct labeling of the continuous signal. The performance of the proposed model is validated on a publicly available inertial dataset. Comparisons with existing works reveal the novelty of our approach, covering multiple anomalies (tremors, dyskinesia, and bradykinesia) and achieving 93.03% recall for movement disorder detection (binary) and 91.54% recall for movement disorder classification (multi-label). We believe that the proposed approach will advance the field towards more effective and comprehensive solutions for Parkinson’s detection and symptom classification.

Keywords:

Parkinson’s monitoring; gait analysis; wearable sensors; inertial sensors; movement disorder classification; movement disorder detection; bradykinesia; dyskinesia; tremors; deep learning; machine learning

1. Introduction

Parkinson’s disease, a neurological disorder stemming from nerve cell degeneration in the brain’s substantia nigra, currently lacks a cure [1]. Diagnosis of Parkinson’s relies not on X-rays or blood tests but on observable symptoms affecting daily activities [2], such as freeze of gait (freezing while walking or performing tasks), tremors, dyskinesia (uncontrolled movements), and bradykinesia (slowed movements). These symptoms pose significant risks, such as bradykinesia while crossing streets or dyskinesia while handling sharp objects like knives, potentially leading to harm [3]. Timely detection of these symptoms can greatly mitigate such risks, allowing for prompt alerts and the implementation of strategies to reduce their impact [4].

Devices such as smartwatches, smart fitness bands, and smartphones are equipped with built-in inertial sensors, including accelerometers and gyroscopes. These sensors can be non-invasively attached to the human body, enabling real-time data analytics and allowing the devices to provide immediate alerts and cues. These devices utilize inertial data, making them versatile in various applications, including terrain classification [5], fall detection [6], and human activity recognition [7]. Previous studies and the widespread availability of inertial measurement units (IMUs) show promise in leveraging these data for Parkinson’s disease (PD) diagnosis and monitoring [8,9,10,11,12]. IMUs typically include accelerometers, gyroscopes, and sometimes magnetometers, all of which are common in modern consumer devices.

Driven by the high performance achieved through the utilization of inertial data in various applications, this study demonstrates the critical role of wearable technology such as smartwatches and fitness bands in anomalous event detection and classification for PD patients. These devices enable continuous monitoring and analysis of movement patterns by capturing real-time inertial data, thus facilitating timely alerts and medical interventions. Our paper introduces several novel contributions to wearable technology for PD management:

An end-to-end pipeline for real-time movement disorder detection, achieving an inference time of 80 ms, and multi-label classification (inference time of 165 ms) of tremors, dyskinesia, and bradykinesia.
The proposed model is lightweight (about 9 MB in size), which makes it well suited for deployment on edge devices such as smartwatches and fitness bands.
This study proposes two auto-labeling techniques (early and late labeling) to identify optimal pre-processing configurations and evaluates the proposed model using three different window sizes (50, 150, 250) for optimal signal segmentation.
Evaluation of two modeling techniques (manual feature crafting for a machine learning pipeline and processing raw inertial data for a deep learning pipeline) to find the optimal approach. A detailed comparison between the two modeling techniques can be found in the Results Section. For movement disorder detection, the deep learning approach achieves a recall of 93.03%. Similarly, for movement disorder classification, the deep learning approach achieves a recall of 91.54%.

2. Literature Review

Many recent studies employ inertial data for detecting and characterizing PD and its symptoms. For instance, one study focused on freeze of gait (FOG), a common PD symptom affecting gait and increasing fall risk, achieving a 91.5% F1-score using time–frequency features and convolutional neural networks [13]. Dvorani et al. [14] targeted bradykinesia by analyzing upper limb inertial signals with a multi-layer perceptron and achieved an accuracy of 85%. Similarly, another study addressed FOG using inertial sensors and a novel evaluation metric called GaitScore [15], achieving 97% sensitivity and 87% specificity.

Wearable inertial sensors offer promise for monitoring axial impairments in PD. Integration with machine learning techniques yielded low root mean square error (RMSE) and posture instability/gait difficulty scores (PIGD scores) [16], and validation studies confirmed their reliability in capturing gait parameters [17]. Investigations into hand tremors and levodopa effects on gait parameters achieved high accuracy using inertial sensors and advanced machine learning methods [18,19]. Studies on stride segmentation using hidden Markov models achieved a 92.1% F1-score on real-time gait data [19], confirming the reliability of wearable sensors for gait analysis under supervised and unsupervised settings [20]. Peres et al. [21] employed neural networks on inertial data in early PD detection, demonstrating the effectiveness of inertial data and machine learning in PD diagnosis and management. Su et al. [22] presented an interpretable CNN-based architecture and used spatio-temporal features for Parkinson’s disease detection, achieving an accuracy of 98%. Uchitomi et al. [23] utilized a deep learning model on a time-series gait dataset acquired from an IMU mounted on a subject’s leg to differentiate between healthy persons and persons with mild Parkinson’s disease. Dimoudis et al. [24] proposed two deep learning architectures (LN-inception and InSEption) to predict FOG episodes, achieving an F1-score of 97%.

Inertial data require pre-processing before being used for machine learning or fed directly into deep learning models, as shown in many studies. Pre-processing steps involve noise removal using techniques like simple moving average [25,26]. This cleaned signal is then decomposed into smaller segments using techniques like peak/valley detection [25], or by dividing the signal into windows of fixed length and stride [27]. Labeling segmented data follows various approaches based on the use case; straightforward cases apply the same label to all segments, e.g., gender identification [25], while cases where the label is not the same for the whole signal, e.g., occurrence of tremor episodes within the gait of a patient suffering from PD, require more sophisticated segment-labeling methods. With non-homogeneous segments, common methods either use the mode as the overall segment label, as in [26], or treat the segment as anomalous even if only a small subset of it is anomalous [27]. Features from time, frequency, or wavelet domains are computed for each segment for machine learning models, improving model accuracy but increasing the processing time. Deep learning models offer improved computation time and accuracy by directly using segmented and labeled raw data for training and inference [28].

Apart from using inertial data, diverse data sources have been used to detect and quantify PD, such as electroencephalography (EEG), electromyography (EMG), speech, and vision. Commonly employed deep learning models include CNNs and RNNs, achieving high accuracies in various tasks. For instance, a combination of CNN+RNN classifies PD patients based on EEG with 99.2% accuracy [29], while EMG readings detect motor fluctuations with 99% accuracy [30]. Speech- and voice-based models also show promising results; however, real-world noise affects reliability [31,32,33,34,35,36]. Vision-based approaches reveal PD-related impairments, though occlusion and illumination changes remain challenging [37,38]. Some other vision-based models analyze subjects’ handwriting, as presented in [39]. Alazeb et al. [40] employed a combination of RGB, inertial, and depth sensor data, and used machine learning to offer therapeutic advice to patients. For vision-based models, overcoming data availability, equipment accessibility, and environmental challenges is crucial for real-world deployment.

While existing studies use wearable inertial sensors for PD detection, they often focus on isolated symptoms, rely on computationally expensive models, or lack standardized data segmentation. Real-time, multi-label classification of movement disorders remains under-explored. Moreover, models in the existing literature are not optimized for edge deployment. Our work addresses these gaps by proposing a lightweight model capable of real-time movement disorder detection and classification with an inference time of 80–165 ms. Additionally, we introduce two auto-labeling techniques to improve data segmentation and compare feature-engineered and deep learning approaches for optimal performance. We believe that these contributions advance real-time multi-label classification of movement disorders while ensuring computational efficiency for deployment on smartwatches and fitness bands.

3. Methodology

This section presents the proposed methodology, beginning with a brief overview of the publicly available dataset used in this study for training, validation, and testing. Subsequently, the proposed pipeline is discussed, which encompasses pre-processing and signal decomposition, feature extraction for classical machine learning algorithms, and the application of deep learning techniques.

3.1. Dataset

This study utilizes a publicly available dataset derived from the levodopa response study [4]. This dataset is specifically tailored to explore anomalies related to motor fluctuations observed in individuals with PD, and it encompasses three distinct anomalies: tremors, dyskinesia, and bradykinesia. The raw data are accessible through an open data repository. Data were collected from 28 subjects diagnosed with PD, with Hoehn and Yahr scores ranging from II to IV, over a span of 4 days. The Hoehn and Yahr scale was employed to gauge the progression of Parkinson’s symptoms and the degree of disability. The subjects were aged from 30 to 80 years, were undergoing L-Dopa treatment, and exhibited at least mild dyskinesia and motor fluctuations. Subjects with significant neurological disorders such as epilepsy, brain tumors, or hydrocephalus were excluded. Data acquisition was conducted using three distinct sensors, a Samsung S2 smartphone, a GeneActiv IMU, and a Pebble smartwatch, positioned at the waist, the most affected upper limb, and the least affected upper limb, respectively. Three-dimensional acceleration data, capturing movement along the x, y, and z axes, were collected at a frequency of 50 Hz. Additionally, the dataset includes information on vector magnitude. The dataset also includes labels, annotated by a clinician, that give symptom severity (for tremors) and presence (for dyskinesia and bradykinesia) for each limb and motor task.

On the first day, participants performed various motor tasks in an on-medication state within a laboratory setting. These tasks included a range of activities, such as walking, finger movements, and fine motor tasks, with repetitions occurring every 30 min over a 3- to 4-hour period. Subsequently, participants were equipped with sensors to record their daily activities over the second and third days, yielding two days of unlabeled data. On the fourth day, participants returned to the laboratory in a non-medicated state to repeat the motor tasks performed on the first day. Following this initial testing phase, participants ingested their scheduled medication dose and performed an additional set of motor tasks. Annotation of all the data was conducted by a trained clinician.

3.2. Pipeline Overview

This section presents a novel two-stage methodology designed for detecting and categorizing anomalous events found in inertial data acquired from wearable sensors worn by patients diagnosed with PD, as shown in Figure 1.

The proposed pipeline processes the captured inertial data on a segment-by-segment basis. First, the pipeline utilizes binary classification to discriminate between normal and anomalous signal segments. To accomplish this, the three anomalies (tremor, dyskinesia, and bradykinesia) from the dataset are consolidated into a unified binary label: a signal segment is treated as anomalous if any anomaly is detected in it and as normal if it is devoid of anomalies. Following this, if a segment is identified as anomalous, it proceeds to the second stage of multi-label classification to categorize the specific types of anomalies present. In contrast to past studies, which dealt with only a single anomaly, like FOG or dyskinesia, our approach addresses multiple classes that are not mutually exclusive. This implies that a single segment may be classified as having both tremor and dyskinesia simultaneously. Such an approach proves valuable for quantifying the severity of symptoms; for instance, an individual exhibiting both tremor and dyskinesia would receive a higher severity score compared to someone experiencing only dyskinesia. Conversely, if a signal segment is categorized as normal during the initial stage, the subsequent stage is bypassed, as no anomalies are detected. This study focuses on the role of data collected from the IMU attached to the most affected limb; thus, all experiments and results are derived from the GeneActiv IMU.
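The control flow of the two stages can be sketched as follows. This is only a minimal illustration of the gating logic described above: `run_two_stage`, `toy_detect`, and `toy_classify` are hypothetical names, and the toy functions stand in for the trained stage 1 and stage 2 models rather than reproducing them.

```python
from typing import Callable, Dict, Sequence

SYMPTOMS = ("tremor", "dyskinesia", "bradykinesia")


def run_two_stage(
    segment: Sequence[float],
    detect: Callable[[Sequence[float]], bool],
    classify: Callable[[Sequence[float]], Dict[str, bool]],
) -> Dict[str, bool]:
    """Return per-symptom flags for one signal segment.

    Stage 1 (detect) decides normal vs. anomalous; stage 2 (classify)
    runs only when stage 1 reports an anomaly.
    """
    if not detect(segment):
        # Normal segment: stage 2 is bypassed and no symptom is reported.
        return {name: False for name in SYMPTOMS}
    return classify(segment)


# Toy stand-ins for the trained models (hypothetical rules, not the paper's).
def toy_detect(segment: Sequence[float]) -> bool:
    return max(segment) - min(segment) > 1.0


def toy_classify(segment: Sequence[float]) -> Dict[str, bool]:
    # Several symptoms may be flagged at once (multi-label output).
    return {"tremor": True, "dyskinesia": True, "bradykinesia": False}
```

Because stage 2 is skipped for normal segments, its cost is only paid on the segments that stage 1 flags.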

3.3. Signal Pre-Processing

The raw input signal is first decomposed into smaller segments. To optimize the window length, this study explored three different window sizes, based on the 50 Hz sampling rate used for data collection: (1) a 1 s window (50 data points), (2) a 3 s window (150 data points), and (3) a 5 s window (250 data points). Consecutive segments overlap by 80%, which corresponds to a stride of 20% of the window length, as shown in Figure 2.

With respect to the desired overlap (O) and selected window size (WS), the stride (S) can be derived from the following equation:

$$S = WS \times (1 - O), \tag{1}$$

whereas the exact window indices (Start, End) for the $n^{\text{th}}$ segment (starting from 0) can be determined by the following equation:

$$W_{n} = [n \times S,\; n \times S + (WS - 1)]. \tag{2}$$
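As a concrete check of Equations (1) and (2), the short Python sketch below enumerates window indices for a given signal length. The function name `window_bounds` is hypothetical, and dropping a trailing partial window is an assumption of this sketch, since the text does not say how the end of the signal is handled.

```python
def window_bounds(n_samples: int, window_size: int, overlap: float):
    """List inclusive (start, end) sample indices of fixed-length windows."""
    # Equation (1); round() guards against float error such as 49.999...
    stride = round(window_size * (1 - overlap))
    if stride < 1:
        raise ValueError("overlap is too large for this window size")
    bounds = []
    n = 0
    # Assumption: a trailing window that does not fit completely is dropped.
    while n * stride + window_size <= n_samples:
        start = n * stride
        bounds.append((start, start + window_size - 1))  # Equation (2)
        n += 1
    return bounds
```

For a 250-sample window with 80% overlap the stride is 50 samples, so a 350-sample signal yields the windows [0, 249], [50, 299], and [100, 349].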

The decomposed signal segments are then labeled. For this purpose, this study has explored two different labeling approaches. The first one (S1 methodology, early labeling) treats a segment as anomalous if it contains any anomalous data points, even a single one. Otherwise (zero anomalous data points), it is treated as a normal segment. In the second labeling approach (S2 methodology, late labeling), the segment label is decided by majority voting. If most of the data points are anomalous, the label will be anomalous. On the other hand, if most of the data points are normal, the label will be normal. In case of a tie, the label will be anomalous. Pseudo-code for signal pre-processing can be found in Algorithm 1. Overall, this study has explored three different window sizes, each with two different labeling approaches. Their performance-wise comparative analysis is outlined in the Results Section.

Algorithm 1 Signal pre-processing.

Input: Raw signal data, sampling rate $f_{s}$, window size $WS$, overlap $O$
Output: Segmented windows with indices and labels
$S \leftarrow WS \times (1 - O)$ ▹ Calculate stride based on overlap
$n \leftarrow 0$ ▹ Initialize segment counter
while not end of signal do
    $Start \leftarrow n \times S$
    $End \leftarrow Start + (WS - 1)$
    Extract segment $W_{n} = [Start, End]$
    $n \leftarrow n + 1$
end while
for each segment $W_{n}$ do
    ▹ S1 - Early labeling
    if any anomalous data point in $W_{n}$ then
        Label $W_{n}$ as anomalous
    else
        Label $W_{n}$ as normal
    end if
    ▹ S2 - Late labeling
    $count_{anomalous} \leftarrow$ Number of anomalous points in $W_{n}$
    $count_{normal} \leftarrow$ Number of normal points in $W_{n}$
    if $count_{anomalous} > count_{normal}$ then
        Label $W_{n}$ as anomalous
    else if $count_{normal} > count_{anomalous}$ then
        Label $W_{n}$ as normal
    else
        Label $W_{n}$ as anomalous (tie case)
    end if
end for
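The two labeling rules in Algorithm 1 can be written compactly in Python. This is a sketch rather than the authors' MATLAB code: the function names are hypothetical, and it assumes that each sample carries a 0/1 flag (1 = anomalous).

```python
from typing import Sequence


def label_early(point_labels: Sequence[int]) -> int:
    """S1: the window is anomalous (1) if any sample in it is anomalous."""
    return int(any(point_labels))


def label_late(point_labels: Sequence[int]) -> int:
    """S2: majority vote over the window; a tie counts as anomalous."""
    anomalous = sum(1 for p in point_labels if p)
    normal = len(point_labels) - anomalous
    # ">=" folds the tie case into the anomalous branch, as in Algorithm 1.
    return int(anomalous >= normal)
```

The two rules differ only on windows in which anomalous samples are present but in the minority: for the flags `[0, 0, 1, 0]`, S1 yields anomalous while S2 yields normal.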

3.4. Feature Crafting and Machine Learning

The machine learning pipeline involves computing features from input segments and using them to train a random forest model. This study proposes a set of 150 features from temporal, frequency, and wavelet domains [25,26]. There are 60 features computed from the time domain, 60 from the frequency domain, and 30 from the wavelet domain, covering statistical, amplitude-related, and spectral measures across all three axes of inertial data, i.e., the x, y, and z axes, as shown in Table 1. Since this study explores 3 different window sizes and 2 different labeling approaches, this gives 6 ($2 \times 3$) different datasets of signal segments. Hence, features are computed for each of these 6 datasets. This study used MATLAB R2024a for both signal segmentation and feature computation.

Manually crafted features are used for model training and evaluation, which is carried out in Jupyter Notebook using Anaconda’s Python 3.10 distribution, with PyCaret (used for model training and testing) and imbalanced-learn (imblearn, used for random undersampling). Segment-wise features are then labeled for the first stage of the pipeline (movement disorder detection) by consolidating the three anomalies into a single binary anomaly label. Afterwards, the data are normalized using z-score normalization, which rescales the values to a mean of 0 and a standard deviation of 1, thus resulting in more accurate and faster convergence. Data exploration revealed a class imbalance in both stages. For anomalous event detection, this study solved this problem using signal segmentation and segment-level labels, as shown at the top of Figure 3. A similar pattern was observed for movement disorder classification in the raw data points, as shown at the bottom of Figure 3. This occurred because the anomalous segment used for classification did not undergo any labeling techniques.
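The two data preparation steps named above, z-score normalization and random undersampling, can be illustrated with a standard-library sketch. It is not the PyCaret or imbalanced-learn code used in the study: `zscore` and `random_undersample` are hypothetical helpers, and the use of the population standard deviation and of a fixed seed are assumptions of this sketch.

```python
import random
import statistics


def zscore(column):
    """Rescale one feature column to zero mean and unit standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)  # assumption: population standard deviation
    if std == 0:
        return [0.0 for _ in column]  # constant feature carries no information
    return [(v - mean) / std for v in column]


def random_undersample(rows, labels, seed=0):
    """Drop majority-class rows at random until all classes are equally sized."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        for row in rng.sample(group, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels
```

Undersampling discards data from the larger class, so it trades some training data for a balanced class distribution.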

The prepared data are then used to train a random forest classifier (number of trees = 100) for binary classification of the input signal segment as anomalous or normal (stage 1). This study uses 10-fold cross-validation for model training and evaluation. The evaluation metrics include accuracy, recall, precision, F1-score, and the confusion matrix.

The models underwent initial training with the full feature set, followed by training exclusively on the highest-performing features, aimed at reducing execution time. The top-performing features for movement disorder detection (AD) and movement disorder classification (AC), using data from the most affected upper limb (GeneActiv Smartwatch), with a window size of 250 and S1 labeling, can be seen in Figure 4. If Stage 1 classifies the input signal segment as anomalous, the segment passes through the second stage, which predicts the type of anomalies (tremor, dyskinesia, bradykinesia) present in the signal. This is a multi-label classification problem and is handled using the label power set approach, where the problem is transformed into a multi-class classification problem by assigning a unique class label to each possible combination (based on existence) of the three anomalies, thus transforming the output array of size $n \times 3$ (tremor, dyskinesia, bradykinesia) into an $n \times 1$ array.
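The label power set transformation can be sketched in a few lines of Python. The class-id numbering below is an arbitrary choice made for illustration (the paper does not state which ids were used), and the helper names are hypothetical.

```python
from itertools import product

SYMPTOMS = ("tremor", "dyskinesia", "bradykinesia")

# Map every presence/absence combination to one class id (2**3 = 8 ids).
# The numbering is an assumption of this sketch.
COMBO_TO_CLASS = {combo: idx for idx, combo in enumerate(product((0, 1), repeat=3))}
CLASS_TO_COMBO = {idx: combo for combo, idx in COMBO_TO_CLASS.items()}


def to_power_set(label_rows):
    """Turn an n x 3 multi-label array into a list of n class ids."""
    return [COMBO_TO_CLASS[tuple(row)] for row in label_rows]


def from_power_set(class_ids):
    """Recover the n x 3 multi-label array from the class ids."""
    return [list(CLASS_TO_COMBO[c]) for c in class_ids]
```

With this numbering, a segment showing tremor and dyskinesia, `[1, 1, 0]`, becomes class 6, and the mapping is invertible, so a multi-class prediction can be converted back into per-symptom flags.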

3.5. Deep Learning

Similar to the above pipeline, a two-stage approach is followed here and includes a binary classifier (movement disorder detection) and a multi-label classifier (movement disorder classification). This study computes the magnitude of the raw 3D accelerations (see Equation (3)) to convert the 3D signal into a 1D signal, which is then passed to the deep learning model for training and inference. Using the 1D magnitude computed from 3D input significantly improves the computational efficiency, making the solution more suitable for real-time applications and helping to reduce the impact of sensor orientation [41]. This deep learning-based pipeline is also trained and evaluated using six different data configurations, which are made using three different window sizes for segmentation, and two different segment labeling approaches.

$$\hat{mag_{a}} = \sqrt{a_{x}^{2} + a_{y}^{2} + a_{z}^{2}} \tag{3}$$
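Equation (3) is applied sample by sample. A minimal sketch, with a hypothetical function name and plain Python lists standing in for the three acceleration channels:

```python
import math


def acceleration_magnitude(ax, ay, az):
    """Collapse tri-axial samples into one magnitude value per sample."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(ax, ay, az)]
```

The samples (3, 4, 0) and (0, 3, 4) both map to a magnitude of 5, which illustrates why the 1D signal is less sensitive to how the sensor axes are oriented.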

Overview of the Algorithm

This study employs our previously developed deep learning algorithm, HARDenseRNN [41], as our deep learning model for training and validation. The HARDenseRNN is a combination of two multi-kernel convolutional neural network (CNN) modules followed by a recurrent neural network (RNN) module (see Figure 5). This model first passes $\hat{mag_{a}}$ as input through the CNN-based network for spatial feature extraction. The spatial features are then concatenated with $\hat{mag_{a}}$ and passed to the RNN-inspired network (bi-directional GRU) for capturing temporal features. The concatenation of the raw input signal with the feature map generated by the CNN ensures that the RNN can take advantage of both the raw signal and the CNN features for temporal feature extraction. These features are then passed through batch normalization and flattened. Afterwards, the flattened output is passed through a dropout layer and then finally through a series of dense (fully connected) layers for making predictions. For the first stage, which involves binary classification, a single output node is used, with a sigmoid activation function and binary cross-entropy loss function.

The second stage, involving multi-label classification, uses three output nodes, with a sigmoid activation function and binary cross-entropy loss function. For the first stage of binary classification, the model had 753,035 trainable parameters and 896 non-trainable parameters. For the second stage of multi-label classification, the model had 754,061 trainable parameters and 896 non-trainable parameters. Deep learning model training was carried out on Google Colab Pro with a GPU (Tesla P100), using TensorFlow 2.4 and Keras 2.4.3 for model construction and training. The model was trained for 100 epochs using a batch size of 128. The number of epochs was selected empirically by monitoring the plot of training and validation loss and accuracy values, as shown in Figure 6. The pipeline also uses early stopping that monitors validation accuracy in ‘max’ mode with a patience of 10 epochs. This study also used scikit-learn for splitting data into training and testing sets and seaborn for visualizations.
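To show why sigmoid outputs with binary cross-entropy suit the multi-label stage, the sketch below treats the three output nodes independently. It uses plain Python rather than TensorFlow, the function names are hypothetical, and the 0.5 decision threshold is an assumption, since the text does not state the threshold used.

```python
import math


def sigmoid(z):
    """Squash one logit into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(targets, probs, eps=1e-7):
    """Mean binary cross-entropy over the output nodes."""
    total = 0.0
    for t, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(targets)


def predict_labels(logits, threshold=0.5):
    """Independent per-node decisions, so several symptoms can be active."""
    # Assumption: 0.5 threshold; order is (tremor, dyskinesia, bradykinesia).
    return [int(sigmoid(z) >= threshold) for z in logits]
```

Unlike a softmax layer, the three sigmoid probabilities are not forced to sum to 1, so logits such as `[2.0, 1.5, -3.0]` yield tremor and dyskinesia together.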

4. Results and Discussion

This section presents the results of the experiment, involving the evaluation of three segment sizes (250, 150, 50), two labeling approaches (S1 labeling and S2 labeling), two stages (movement disorder detection, movement disorder classification), and two models per stage (random forest, HARDenseRNN), encompassing a total of 24 models (3 × 2 × 2 × 2). It is important to note that this study uses recall as the primary metric for model evaluation to ensure that our models accurately identify as many anomalous segments as possible, thus ensuring accurate classification of critically anomalous segments.
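Recall measures the share of truly anomalous segments that the model flags, which is why it penalizes missed anomalies directly. The following sketch computes it alongside accuracy and precision; the function name is hypothetical, and the labels in the usage note are made up for illustration rather than taken from the study.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, and recall with 1 = anomalous and 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Recall drops with every missed anomalous segment (false negative).
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }
```

For example, with true labels `[1, 1, 1, 0, 0, 0, 0, 0]` and predictions `[1, 1, 0, 1, 0, 0, 0, 0]`, accuracy is 0.75 while recall is only 2/3, because one of the three anomalous segments is missed.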

4.1. Movement Disorder Detection

The initial phase of our proposed pipeline emphasizes the application of a binary classifier for categorizing input signal segments as normal or anomalous. The evaluation reveals that HARDenseRNN outperforms random forest in detecting anomalous events in patients with Parkinson’s disease when focusing on recall. The optimal configuration for HARDenseRNN is a segment size of 250 and S1 labeling, achieving the highest recall of 93.03%, along with an accuracy of 88.58% and precision of 86.13%. For S2 labeling, HARDenseRNN’s best recall is 87.92% with a segment size of 150. However, HARDenseRNN shows greater variability in performance depending on the segment size and labeling approach. In contrast, random forest consistently performs well across various configurations, with the highest recall of 89.10% at a segment size of 250 and S2 labeling, followed by 88.65% with the same segment size and S1 labeling. While random forest offers more consistent performance across various pre-processing configurations, the superior recall achieved by HARDenseRNN with larger segment sizes and S1 labeling makes it the most effective approach for this task.

Confusion matrices for movement disorder detection with HARDenseRNN and optimal pre-processing configurations (segment size 250 and S1 labeling) are shown in Figure 7. In addition to achieving a higher recall, HARDenseRNN exhibits faster execution times and a smaller model size (approximately 9 MB only), even when compared with a random forest model trained only on the top 10 features, as shown in Figure 8, making it optimal for real-time applications and deployment on low-end devices. Detailed results of 12 models evaluated for movement disorder detection can be observed in Figure 9.

4.2. Movement Disorder Classification

In the multi-stage movement disorder identification process, where the initial stage identifies a signal as anomalous, the signal segment proceeds to the second stage for multi-label classification, which determines the type or types of anomalies present in the input signal segment. Here, too, HARDenseRNN demonstrates superior recall, particularly with a segment size of 250 and S1 labeling, achieving the highest recall of 91.54%, along with an accuracy of 79.71% and precision of 91.95%. For S2 labeling, HARDenseRNN also performs well, with a recall of 90.54% at a segment size of 250 and 89.82% at a segment size of 150. However, HARDenseRNN exhibits greater variability in its performance, with recall dropping to 79.13% and 80.33% at smaller segment sizes of 50 for S2 and S1 labeling, respectively. Random forest shows consistent (across various pre-processing configurations) but lower recall compared to HARDenseRNN. The highest recall for random forest is 71.99% at a segment size of 150 with S2 labeling, followed closely by 71.23% at a segment size of 250 with S2 labeling. Recall values are generally lower with S1 labeling, with the best performance being 67.72% at a segment size of 250. Despite this, random forest maintains relatively high precision across all configurations, indicating its reliability in classification.

Overall, HARDenseRNN with a segment size of 250 and S1 labeling emerges as the best-performing model and pre-processing configuration, achieving the highest recall of 91.54%. Although random forest is more stable across different configurations, HARDenseRNN’s higher recall makes it the more effective method for movement disorder classification in this context. Detailed results of the 12 models evaluated for movement disorder classification can be seen in Figure 10. For random forest movement disorder classification, this study used a power set approach to perform multi-label classification. HARDenseRNN, on the other hand, performs multi-label classification without the power set approach by using three output nodes, one for each anomaly. The confusion matrix for HARDenseRNN (across each node) using a window size of 250 and S1 labeling is shown in Figure 7.

4.3. Inference Time Evaluation

The HARDenseRNN deep learning model outperforms the random forest-based pipeline in terms of inference time, as seen in Figure 8. Although random forest takes 441 ms to complete a pipeline with all 150 features, HARDenseRNN takes only 165 ms. Even after excluding feature computation time, HARDenseRNN’s inference time remains significantly lower. For movement disorder detection, our model takes 45% less time compared to random forest, completing the task in 83 ms. In movement disorder classification, the time gap is even more substantial, with random forest taking 62% and 44% more time for the full feature set and top 10 features, respectively, compared to the HARDenseRNN deep learning model.

5. Conclusions

In Parkinson’s disease detection research, prior studies focused on a single specific anomaly or early detection using various modalities. Our work stands out by addressing multiple anomalies across body parts, including tremors, dyskinesia, and bradykinesia, achieving 93.03% recall for movement disorder detection (binary) and 91.54% recall for movement disorder classification (multi-label). Utilizing data captured with the on-board IMUs of smartwatches attached to the most affected upper limb, our approach introduces a novel two-stage pipeline for movement disorder detection and classification (severity quantification). Unlike methods relying on vision or speech, our model provides a comprehensive solution. This study presents optimal pre-processing configurations, highlights key features for machine learning, and presents a deep neural network-based pipeline that is suitable for real-world deployment, even on low-end devices, with real-time performance. This marks a significant advancement in Parkinson’s disease research, offering a holistic approach to movement disorder detection and severity quantification. Our current research emphasizes movement disorder detection and classification, excluding the prediction of severity for individual anomalies due to limitations in the dataset.

Future endeavors could involve addressing the skewed class distribution by creating a more balanced dataset and incorporating individual anomaly severity quantification into our pipeline. Our future work aims to go beyond detection and quantification by incorporating anomaly forecasting, making our approach more proactive. Moreover, the dataset used in this study has only 28 subjects, which may limit generalization. This constraint should be considered when interpreting the results. To improve generalization, we plan to work with larger and more diverse datasets. Additionally, we aim to explore lower-limb and waist data to better capture full-body movement, including gait. These expansions will enhance Parkinson’s disease monitoring by providing a more comprehensive understanding and early anticipation of movement anomalies.

Author Contributions

Conceptualization, U.K. and Q.R.; methodology, U.K., Q.R., M.H., M.Z. and B.K.; software, U.K., Q.R. and M.H.; validation, U.K., M.Z. and B.K.; investigation, U.K., Q.R., M.H., M.Z. and B.K.; writing—original draft preparation, U.K., Q.R., M.H., M.Z. and B.K.; writing—review and editing, U.K., M.Z. and B.K.; visualization, U.K., M.H. and M.Z.; supervision, Q.R.; project administration, Q.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

No new data were created in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Pipeline for detecting and classifying anomalies in inertial data (segment-by-segment processing) acquired from the most affected upper limb. The data are pre-processed and used to train four models: two (random forest and HARDenseRNN) for movement disorder detection and two (random forest and HARDenseRNN) for movement disorder classification. A comparative analysis of the two modeling approaches is provided in the Results Section.

Figure 2. Diagram illustrating the signal segmentation process with parameters WS (window size), S (stride), and O (overlap). $S_{1}$ represents the initial segment, and $S_{N}$ denotes the $N$th segment.
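The segmentation parameters named in Figure 2 can be illustrated with a short sliding-window sketch. It assumes the usual relation S = WS − O and drops any trailing partial window; both are assumptions rather than details confirmed by the text. The window size of 250 matches the configuration named in the captions, while the overlap of 125 samples and the function name `segment_signal` are hypothetical.

```python
def segment_signal(samples, window_size, overlap):
    """Split a 1-D sequence into fixed-length, overlapping windows.

    Assumes stride = window_size - overlap and discards a trailing
    window that would be shorter than window_size.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    if not 0 <= overlap < window_size:
        raise ValueError("overlap must satisfy 0 <= overlap < window_size")
    stride = window_size - overlap
    last_start = len(samples) - window_size
    return [samples[start:start + window_size]
            for start in range(0, last_start + 1, stride)]


# Toy signal of 1000 samples; the overlap of 125 is a hypothetical value.
signal = list(range(1000))
segments = segment_signal(signal, window_size=250, overlap=125)

print(len(segments))    # 7 windows, starting at samples 0, 125, ..., 750
print(segments[1][0])   # 125: the second window starts one stride later
```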

Figure 3. Class distribution for movement disorder detection and movement disorder classification. The top half shows the class distribution for movement disorder detection, and the bottom half shows the class distribution for movement disorder classification. In both halves, A represents a segment size of 250 with S1 labeling, and B represents a segment size of 250 with S2 labeling. Classes: T&D, Tremor and Dyskinesia; T&B, Tremor and Bradykinesia; T&D&B, Tremor, Dyskinesia, and Bradykinesia; D&B, Dyskinesia and Bradykinesia.

Figure 4. A comparison of top 10 important features for anomaly detection (AD) and anomaly classification (AC), using the GeneActiv smartwatch, with a window size of 250, across S1 and S2 labeling methodologies.

Figure 5. The architecture of our previously developed HARDenseRNN model, which was used in this study, consists of two multi-kernel CNN modules followed by a 128-unit bi-directional GRU.

Figure 6. Plot depicting accuracy and loss curves for movement disorder detection and classification of HARDenseRNN models utilizing GeneActiv smartwatch data, employing a window size of 250 and S1 labeling approach.

Figure 7. (A): Confusion matrix for movement disorder detection using HARDenseRNN with a segment size of 250 and S1 labeling. (B–D): Confusion matrix for movement disorder classification (multi-label classification) using HARDenseRNN with a segment size of 250 and S1 labeling.

Figure 8. Inference time for various tasks (feature engineering, movement disorder detection, movement disorder classification). (A) Inference time when using all 150 features (random forest-based pipeline), (B) inference time when using top 10 features (random forest-based pipeline), and (C) inference time when using HARDenseRNN. The best inference time of 165 ms is observed for the deep learning pipeline.

Figure 9. Performance metrics (accuracy, precision, and recall) for 12 movement disorder detection models, highlighting the comparative analysis between random forest and HARDenseRNN across different pre-processing configurations, using inertial data acquired from the most affected upper limb via a GeneActiv smartwatch.

Figure 10. Performance metrics (accuracy, precision, and recall) for 12 movement disorder classification models, highlighting the comparative analysis between random forest and HARDenseRNN across different pre-processing configurations, using inertial data acquired from the most affected upper limb via a GeneActiv smartwatch.

Table 1. Manually crafted features from different domains: time (T), frequency (F), and wavelet (W).

Domain	Feature
T	$\mu = \frac{1}{N} \sum_{i=1}^{N} x_{i}$
T	$\mathrm{Median} = \begin{cases} x_{\left(\frac{N+1}{2}\right)}, & \text{if } N \text{ is odd} \\ \frac{x_{\left(\frac{N}{2}\right)} + x_{\left(\frac{N}{2}+1\right)}}{2}, & \text{if } N \text{ is even} \end{cases}$
T	$\sigma^{2} = \frac{1}{N} \sum_{i=1}^{N} (x_{i} - \mu)^{2}$
T	$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_{i} - \mu)^{2}}$
T	$\max(x)$
T	$\min(x)$
T	$\arg\max_{i}(x_{i})$
T	$\arg\min_{i}(x_{i})$
T	$\mathrm{Skewness} = \frac{1}{N} \sum_{i=1}^{N} \left(\frac{x_{i} - \mu}{\sigma}\right)^{3}$
T	$\mathrm{Kurtosis} = \frac{1}{N} \sum_{i=1}^{N} \left(\frac{x_{i} - \mu}{\sigma}\right)^{4} - 3$
T	$\mathrm{Entropy} = -\sum_{i} p_{i} \log p_{i}$
T	$\mathrm{RMS} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} x_{i}^{2}}$
T	$\mathrm{Energy} = \sum_{i=1}^{N} x_{i}^{2}$
T	$\mathrm{Power} = \frac{1}{N} \sum_{i=1}^{N} x_{i}^{2}$
T	$\mathrm{MAD} = \frac{1}{N} \sum_{i=1}^{N} \left| x_{i} - \mu \right|$
T	$\mathrm{IQR} = Q_{3} - Q_{1}$
T	$\mathrm{SMA} = \frac{1}{N} \sum_{i=1}^{N} \left| x_{i} \right|$
T	$\mathrm{ZCR} = \frac{1}{2(N-1)} \sum_{i=1}^{N-1} \left| \operatorname{sgn}(x_{i}) - \operatorname{sgn}(x_{i+1}) \right|$
T	$\mathrm{SSC} = \sum_{i=2}^{N-1} \operatorname{sgn}\left((x_{i} - x_{i-1})(x_{i} - x_{i+1})\right)$
T	$\mathrm{WL} = \sum_{i=1}^{N-1} \left| x_{i+1} - x_{i} \right|$
F	$X_{k} = \sum_{n=0}^{N-1} x_{n} e^{-j \frac{2\pi}{N} k n}$
F	$\mu_{f} = \frac{1}{N} \sum_{i=1}^{N} X_{i}$
F	$\max(X)$
F	$\left| X \right|$
F	$\mathrm{Energy} = \sum_{i=1}^{N} \left| X_{i} \right|^{2}$
F	Power in specific frequency band
W	$\sum_{i=1}^{N} c_{i}^{2}$
W	$\sum_{i=1}^{N} \left| c_{i} \right|$
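
A few of the time-domain formulas in Table 1 translate directly into code. The sketch below is an illustrative pure-Python rendering of RMS, SMA, MAD, ZCR, and WL for a single window; the function names and the toy window are assumptions made for this example and do not reproduce the authors' feature extraction code.

```python
import math


def sgn(v):
    """Sign function: -1, 0, or 1."""
    return (v > 0) - (v < 0)


def rms(x):
    """Root mean square of the window."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def sma(x):
    """Mean of absolute sample values (SMA row of Table 1)."""
    return sum(abs(v) for v in x) / len(x)


def mad(x):
    """Mean absolute deviation from the window mean."""
    mu = sum(x) / len(x)
    return sum(abs(v - mu) for v in x) / len(x)


def zcr(x):
    """Zero-crossing rate, normalized by 2(N - 1) as in Table 1."""
    crossings = sum(abs(sgn(a) - sgn(b)) for a, b in zip(x, x[1:]))
    return crossings / (2 * (len(x) - 1))


def wl(x):
    """Waveform length: summed absolute first differences."""
    return sum(abs(b - a) for a, b in zip(x, x[1:]))


# Toy window (hypothetical values) that alternates sign at every sample.
window = [1.0, -1.0, 2.0, -2.0]
features = {
    "RMS": rms(window),   # sqrt(2.5), about 1.581
    "SMA": sma(window),   # 1.5
    "MAD": mad(window),   # 1.5 (the window mean is 0)
    "ZCR": zcr(window),   # 1.0 (every adjacent pair crosses zero)
    "WL": wl(window),     # 9.0
}
print(features)
```

In a pipeline of the kind described here, such functions would be applied to each axis of each segmented window to build the feature vector passed to the classical classifier.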

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
