Applications of Machine Learning to Diagnosis of Parkinson’s Disease

Background: Accurate diagnosis of Parkinson’s disease (PD) is challenging due to its diverse manifestations. Machine learning (ML) algorithms can improve diagnostic precision, but their generalizability across medical centers in China is underexplored. Objective: To assess the accuracy of an ML algorithm for PD diagnosis, trained and tested on data from different medical centers in China. Methods: A total of 1656 participants were included, with 1028 from Beijing (training set) and 628 from Fuzhou (external validation set). Models were trained using the least absolute shrinkage and selection operator–logistic regression (LASSO-LR), decision tree (DT), random forest (RF), eXtreme gradient boosting (XGboost), support vector machine (SVM), and k-nearest neighbor (KNN) techniques. Hyperparameters were optimized using five-fold cross-validation and grid search techniques. Model performance was evaluated using the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, accuracy, sensitivity (recall), specificity, precision, and F1 score. Variable importance was assessed for all models. Results: SVM demonstrated the best differentiation between healthy controls (HCs) and PD patients (AUC: 0.928, 95% CI: 0.908–0.947; accuracy: 0.844, 95% CI: 0.814–0.871; sensitivity: 0.826, 95% CI: 0.786–0.866; specificity: 0.861, 95% CI: 0.820–0.898; precision: 0.849, 95% CI: 0.807–0.891; F1 score: 0.837, 95% CI: 0.803–0.868) in the validation set. Constipation, olfactory decline, and daytime somnolence significantly influenced predictability. Conclusion: We identified multiple pivotal variables and SVM as a precise and clinician-friendly ML algorithm for prediction of PD in Chinese patients.


Introduction
Parkinson's disease (PD) is the second most prevalent neurodegenerative disorder among the elderly population, primarily characterized by motor symptoms such as bradykinesia, rigidity, resting tremor, and postural instability. Alongside these, a range of nonmotor symptoms (NMS) including constipation, hyposmia, REM sleep behavior disorder (RBD), depression, and cognitive impairment further complicate the clinical picture [1]. As the global population ages, PD prevalence continues to increase, affecting approximately 1.7% of individuals aged 65 years and older and 4.0% of those aged 80 years and older [2]. PD significantly impacts patients' quality of life, social functioning, and family dynamics, imposing a substantial financial burden on individuals and society [3][4][5]. Current diagnostic criteria for PD primarily rely on clinical manifestations, and there is a lack of disease-modifying therapies available. This may be partly due to the loss of more than 50% of dopaminergic cells in the substantia nigra at the time of clinical diagnosis [6]. Early and
Brain Sci. 2023, 13
Extensive research has identified potential risk factors and protective factors for PD. Family history [7], pesticide exposure [8], occupational solvent exposure [8], and prodromal symptoms of PD [9] (including hyposmia, constipation, depression, RBD, global cognitive deficit, daytime somnolence, and orthostatic hypotension) have been linked to an elevated risk of PD. Conversely, smoking [10], tea consumption [11], coffee intake [12], and physical activity [13] have been associated with a reduced risk of PD. Genome-wide association studies (GWAS) have identified numerous single-nucleotide polymorphisms (SNPs) at loci such as SNCA, GBA, LRRK2, PARK16, BST1, and MAPT which can modulate the risk of PD [14,15]. Leveraging these factors, high-risk populations can be identified, and targeted disease prevention measures can be deployed. However, an efficacious classification model that accurately discerns PD patients from healthy controls (HCs) remains elusive.
The advent of artificial intelligence has propelled machine learning (ML) to the forefront as an indispensable tool for disease diagnosis, progression evaluation, and prognosis assessment [16][17][18]. ML algorithms possess the potential to identify intricate data patterns, automate data analysis, and classify patient-specific data, which can be harnessed for precision medicine applications in PD. In recent years, ML has been applied to PD diagnosis using diverse data modalities, including speech and phonation evaluation [19], handwriting patterns [20], gait analysis [21], neuroimaging [22], cerebrospinal fluid (CSF) [23], and genetic and transcriptomic data [24].
Studies evaluating clinical and imaging biomarkers for the diagnosis of PD have been widely reported. For instance, Kang et al. [25] highlighted the prognostic and diagnostic potential of measures such as CSF Aβ1-42, T-tau, P-tau181, and α-synuclein in early-stage PD. Silveira-Moriyama et al. [26], through their research on smell identification tests in Brazil, underscored the importance of olfactory dysfunction in diagnosing PD. Additionally, Shinde et al. [27] employed neuromelanin-sensitive MRI to identify predictive markers for PD, while Armañanzas et al. [28] leveraged machine learning to pinpoint significant nonmotor symptoms associated with PD severity. However, many studies have focused solely on individual biomarkers or imaging indicators, neglecting the potential of integrating diverse data types. Furthermore, the application of advanced machine learning techniques in PD research, despite their proven efficacy in other domains, remains underexplored. These gaps underscore the need for a more comprehensive approach, which our study aims to address by enhancing PD diagnostic accuracy through diverse machine learning algorithms, particularly tailored for the Chinese population.
The crucial questions for ML-based diagnostic models for PD are which variables are pivotal for PD, which ML algorithm is best suited to model construction, and how the models perform in different cohorts. To address these questions, we performed a comprehensive analysis using multi-faceted variables including demographic data, environmental factors, lifestyle habits, NMS, and genetic factors, constructed the models using six algorithms, and validated the models using a dataset from a separate (southern) population. Our research pinpointed several key variables and identified SVM as the most accurate and user-friendly ML algorithm for the prediction of PD in the Chinese population. Consequently, we introduce a refined and pragmatic diagnostic model, providing a novel perspective for both the research and treatment of Parkinson's disease.

Study Design and Data Source
The study was designed for training and validation tests. The study was approved by the Medical Ethics Committee of Xuanwu Hospital of Capital Medical University. Informed consent was obtained from all participants in the study. Two populations of participants were included: a training set to train the algorithms and a validation set to independently evaluate each algorithm's performance. PD patients in the training set were recruited from Xuanwu Hospital of Capital Medical University, while those in the validation set were recruited from Fujian Medical University Union Hospital. Concurrently, healthy controls (HCs) in the training set were selected from among the residents of Beijing, and HCs in the validation set were recruited from Fuzhou communities, matched proportionally to cases by gender and age (within 5-year strata).
Patients were diagnosed by senior neurologists specializing in movement disorders based on the MDS clinical diagnostic criteria for PD released in 2015 [29]. Patients were excluded from the study using the following criteria: (i) Parkinson's plus syndromes, including multiple system atrophy (MSA), progressive supranuclear palsy (PSP), dementia with Lewy bodies (DLB), and corticobasal degeneration (CBD); (ii) Parkinsonian syndromes caused by cerebrovascular diseases, brain trauma, hypoxic diseases, infectious diseases, and metabolic diseases affecting the central nervous system; (iii) malignant tumors or other serious systemic diseases; and (iv) dementia rendering them unable to cooperate with the questionnaire. The control participants who were enrolled fulfilled the following criteria: (i) lack of PD-related motor symptoms (bradykinesia, tremor, postural instability, and rigidity); (ii) no history of dementia, PD, or other neurodegenerative diseases; and (iii) no history of malignant tumors or other serious systemic diseases.
A total of 1882 participants were recruited from December 2016 to September 2021. After excluding 226 participants due to incomplete gene detection or incomplete critical assessments, we had a final count of 1656 participants. The training set comprised 1028 individuals (524 PD patients and 504 HCs), while the external validation set included 628 individuals (305 PD patients and 323 HCs) (Figure 1). All participants were assessed for demographic information, lifestyle behaviors, and environmental exposures. Participants were also assessed for motor symptoms and NMS of PD using diagnostic scales (additional information provided in Supplementary Methods). All assessments were performed via face-to-face interviews by clinical investigators with unified training.

Genotype Analysis and Classification
Genetic factors play a pivotal role in many diseases, including Parkinson's disease. This subsection describes the genotype analysis we conducted. Genomic DNA was extracted from peripheral blood leukocytes using a standard protocol. Polymerase chain reaction (PCR) assays and extension primers for variants were designed employing the MassARRAY Assay Design software version 4.0 (Sequenom, Inc., San Diego, CA, USA) and the Primer 6.0 software (Premier, Vancouver, BC, Canada). The primer sequences for amplifying each SNP are listed in Supplementary Table S1. PCR products were purified and sequenced using an ABI3730xl DNA analyzer (Applied Biosystems, Inc., Waltham, MA, USA). Sequence readings were performed with the Chromas 2.22 software.

Machine Learning Algorithm
Machine learning stands as the cornerstone of our research. In this section, we describe the algorithms we used and the metrics we applied for evaluation. Six supervised ML algorithms were employed: least absolute shrinkage and selection operator-logistic regression (LASSO-LR), decision tree (DT), random forest (RF), eXtreme gradient boosting (XGboost), support vector machine (SVM), and k-nearest neighbor (KNN). In the training set, we applied five-fold cross-validation to limit each model's overfitting and utilized the grid search technique to select the optimal combination of hyperparameters for each ML algorithm. Algorithm performance was evaluated based on the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, accuracy, sensitivity (recall), specificity, precision, and F1 score. The model with the highest accuracy was chosen for evaluation using the validation set. In addition to performance comparisons, we also analyzed variable importance in the models to identify which variables were critical in distinguishing PD patients from HCs in the training set.
While deep learning techniques have shown significant promise in various applications, we opted not to use them in this study due to several considerations. Firstly, deep learning models typically require large datasets for effective training and to mitigate overfitting. Secondly, the intricacies of parameter tuning and model selection can be challenging. Furthermore, the complexity of deep learning models might compromise the interpretability of results, which was a key focus for our study. Given our dataset's size and our emphasis on clear variable interpretation, we chose the machine learning algorithms described above. Detailed information about the ML algorithms used in this study is provided in the Supplementary Materials.
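The combination of five-fold cross-validation and grid search described above can be illustrated with a minimal sketch. The study performed this step in R; the Python code below is only an illustration under stated assumptions: the data are synthetic, the classifier is a bare-bones KNN, the folds are not stratified, mean fold accuracy is assumed as the selection criterion, and all function names (`k_fold_indices`, `knn_predict`, `cv_accuracy`) are hypothetical.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, n_folds, rng):
    """Shuffle the sample indices once and deal them into n_folds near-equal folds."""
    order = list(range(n_samples))
    rng.shuffle(order)
    return [order[i::n_folds] for i in range(n_folds)]


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to the query (squared Euclidean distance)."""
    nearest = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )[:k]
    votes_for_one = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes_for_one > k else 0


def cv_accuracy(x, y, k, folds):
    """Mean accuracy over the folds, each fold being held out exactly once."""
    scores = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in range(len(x)) if i not in held]
        train_x = [x[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(knn_predict(train_x, train_y, x[i], k) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return mean(scores)


rng = random.Random(0)
# Synthetic two-feature data (NOT study data): 60 samples of class 0, then 60 of class 1.
x = [(rng.gauss(label, 1.0), rng.gauss(label, 1.0)) for label in (0, 1) for _ in range(60)]
y = [label for label in (0, 1) for _ in range(60)]

folds = k_fold_indices(len(x), 5, rng)
grid = [1, 3, 5, 7, 9]  # candidate numbers of neighbors (odd values avoid tied votes)
results = {k: cv_accuracy(x, y, k, folds) for k in grid}
best_k = max(results, key=results.get)
print({k: round(acc, 3) for k, acc in results.items()}, "best k:", best_k)
```

Reusing the same folds for every candidate value keeps the comparison between hyperparameter settings fair, since each setting is scored on identical held-out samples.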

Statistical Analysis
This subsection describes the statistical techniques we used to validate and interpret our findings. All statistical analyses were performed using the R software (version 4.1.1; R Foundation for Statistical Computing, Vienna, Austria; http://www.R-project.org/, accessed on 9 October 2023). Categorical variables were expressed as frequencies and percentages, while continuous variables were described as medians and interquartile ranges (IQRs). Mann-Whitney U tests (for continuous variables with skewed distributions) and the chi-square test (for categorical variables) were utilized to compare differences in characteristics between groups. The R package "glmnet" was employed to perform the LASSO regression. The R packages "pROC", "plotROC", and "rmda" were applied to generate the ROC curves. The R package "caret" was utilized to run the ML algorithms. All tests were two-tailed, and p < 0.05 was defined as statistically significant.
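As a brief illustration of what the AUC measures, it can be computed directly as the probability that a randomly chosen PD patient receives a higher model score than a randomly chosen HC, with ties counted as one half. This is the Mann-Whitney U statistic divided by the number of patient-control pairs. The Python sketch below is not the pROC implementation used in the study; the function name `auc_from_scores` and the example scores are hypothetical.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    # `wins` is the Mann-Whitney U statistic of the positive group.
    return wins / (len(pos_scores) * len(neg_scores))


# Made-up predicted probabilities, for illustration only (not study data).
pd_scores = [0.90, 0.80, 0.60, 0.55]
hc_scores = [0.70, 0.50, 0.40, 0.55]
auc = auc_from_scores(pd_scores, hc_scores)
print(f"AUC = {auc:.5f}")
```

The pairwise loop is quadratic in the sample size, which is acceptable for a sketch; rank-based formulas give the same value more efficiently on large datasets.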

Clinical and Demographic Characteristics
Here, we present an overview of the clinical and demographic traits of our dataset. In the training set, the median age for PD patients was 66 years (interquartile range, 61-71.25 years), with 49% being male. The HCs had a median age of 67 years (interquartile range, 63-70 years), with 45% being male. In the external validation set, PD patients had a median age of 69 years (interquartile range, 61-75 years), with 56.7% being male, while HCs had a median age of 68 years (interquartile range, 65-73 years), with 50.2% being male. Detailed demographic and clinical characteristics of the training and validation sets are provided in Table 1.

Comparison of Algorithms
With multiple algorithms being considered, it is essential to understand their comparative strengths and weaknesses. This section provides a side-by-side analysis of the algorithms, highlighting their performance metrics. Table 2 presents the performance of the ML algorithms with the training and validation sets, while Figure 2A,B illustrates the ROC curves of all six models with the training set and validation set.
LASSO-logistic regression: Utilizing 10-fold cross-validation, a LASSO regression model was employed to select predictive variables from among the preliminary factors.
K-nearest neighbor: Based on five-fold cross-validation and grid search (Supplementary Figure S2E), the optimal number of nearest neighbors (k) was found to be nine. In the training set, the tuned model accurately classified 467 out of 524 PD patients and 427 out
Random forest: Utilizing five-fold cross-validation and the grid search method (Supplementary Figure S2B), 500 decision trees and three predictor variables selected at each node (mtry) provided optimal performance. In the training set, the tuned model accurately classified 477 out of 524 PD patients and 459 out of 504 HCs, achieving an AUC of 0.963 (95% CI: 0.952-0.973), an F1 score of 0.912 (95% CI: 0.894-0.930), and an overall accuracy of 0.911 (95% CI: 0.892-0.928). In the validation set, the tuned model accurately classified 264 out of 305 PD patients and 260 out of 323 HCs, yielding an AUC of 0.912 (95% CI: 0.890-0.934), an F1 score of 0.835 (95% CI: 0.802-0.867), and an overall accuracy of 0.834 (95% CI: 0.804-0.863).
Variable importance: The significance of features, as demonstrated by effect sizes, was calculated (Figure 3A-F). Olfactory dysfunction and constipation exhibited the highest frequencies among the top predictors across all six models, while daytime somnolence, global cognitive deficit, and depression also displayed large effect sizes in more than half of the models.
The SVM model, exhibiting the most remarkable discrimination capabilities, outperformed other models by achieving the highest AUC, overall accuracy, and F1 score in the validation set.
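The point metrics reported above follow directly from the classification counts. As a worked check, the sketch below recomputes accuracy and F1 score for the random forest model from the counts given in the text (477/524 PD and 459/504 HCs in training; 264/305 PD and 260/323 HCs in validation). It is written in Python for illustration rather than in the R environment used in the study, the class name `Confusion` is hypothetical, and confidence intervals are not reproduced because the text does not state how they were computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # PD patients classified as PD
    fn: int  # PD patients classified as HC
    tn: int  # HCs classified as HC
    fp: int  # HCs classified as PD

    @classmethod
    def from_counts(cls, pd_correct, pd_total, hc_correct, hc_total):
        """Build the 2x2 table from 'correctly classified out of total' counts."""
        return cls(tp=pd_correct, fn=pd_total - pd_correct,
                   tn=hc_correct, fp=hc_total - hc_correct)

    def metrics(self):
        total = self.tp + self.fn + self.tn + self.fp
        return {
            "accuracy": (self.tp + self.tn) / total,
            "sensitivity": self.tp / (self.tp + self.fn),  # recall
            "specificity": self.tn / (self.tn + self.fp),
            "precision": self.tp / (self.tp + self.fp),
            "f1": 2 * self.tp / (2 * self.tp + self.fp + self.fn),
        }


# Counts for the random forest model as reported in the text.
rf_training = Confusion.from_counts(477, 524, 459, 504)
rf_validation = Confusion.from_counts(264, 305, 260, 323)

for name, cm in (("training", rf_training), ("validation", rf_validation)):
    print(name, {k: round(v, 3) for k, v in cm.metrics().items()})
```

Rounded to three decimals, the recomputed accuracy (0.911 and 0.834) and F1 score (0.912 and 0.835) agree with the values reported for the training and validation sets.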

Discussion
The primary objective of this study was to develop a non-invasive, cost-effective, and accurate classification model to differentiate patients with PD from HCs by employing genetic factors, lifestyle factors, environmental exposures, and NMS. The current study combined analyses of multiple data types and ML algorithms, and the model validation used datasets from distinct geographic centers. The results help to establish a comprehensive approach to the early and precise diagnosis of PD for the Chinese population.
Previous studies have typically relied on individual features for PD prediction [25][26][27][28]. However, models incorporating multiple factors have demonstrated increased accuracy compared to those based on single risk factors. In recent years, the application of ML algorithms has gained increasing attention. Table 3 [38][39][40][41][42] provides a comparison of several models. Prashanth et al. utilized a combination of NMS features, cerebrospinal fluid (CSF), and imaging markers to differentiate early PD subjects from normal individuals using the naïve Bayes, SVM, boosted trees, and RF methods, observing that the SVM classifier delivered the best performance [38]. Govindu et al. employed machine learning techniques in telemedicine, using classifiers such as SVM, RF, KNN, and logistic regression on collected audio data. Among these, the RF classifier stood out, achieving 91.83% accuracy in early PD detection [42]. However, many existing models were prone to overfitting, especially when they were trained on limited datasets. Additionally, the unclear interpretability of some complex models makes it challenging for clinicians to trust and adopt these models in practice. Furthermore, several models were constructed using isolated datasets, and their generalizability across diverse populations and medical centers has yet to be thoroughly evaluated.
Our study addressed some of these gaps by evaluating and comparing multiple ML algorithms for PD detection, emphasizing both performance and interpretability. First, our data originated from different medical centers, which not only adds diversity to the datasets but also enhances the model's generalizability. Second, in contrast to studies that rely on expensive radiomics or invasive cerebrospinal fluid collection, our approach is non-invasive and cost-effective. This holds considerable potential for large-scale screening and clinical applications. Moreover, our model took into account a wider range of factors, including genetic factors, lifestyle factors, environmental exposures, and NMS, contributing to a more accurate diagnosis of PD. Lastly, our data collection process was straightforward and user-friendly, and is therefore easy for clinicians to carry out. In summary, our study offered a cost-effective, non-invasive, comprehensive, and easily accessible approach to the prediction of PD, and it is suitable for large-scale screening and clinical practice.
Currently, researchers are exploring novel biomarkers for diagnosing and predicting PD [43][44][45][46]. Although numerous studies have been conducted, no widely recognized and easy-to-use predictive tool currently exists. In this study, we employed ML algorithms and discovered that genetic factors, lifestyles, environmental exposures, and NMS serve as robust predictors of PD, largely aligning with previous findings [2,47]. Notably, we identified a strong association between PD and NMS. The variable importance analysis of the six ML models demonstrated that NMS were among the top predictors with the highest frequencies.
At present, the diagnosis of PD predominantly relies on clinical motor symptoms; however, NMS could potentially serve as an early warning for PD progression, as they may precede motor symptoms by decades [48]. This period, known as the prodromal stage, is crucial for identifying individuals with early-stage PD, potentially paving the way for disease-modifying treatments that could delay or even prevent the progression to manifest Parkinson's disease if administered early. Among the prodromal PD indicators in our study, olfactory dysfunction showed the strongest association, followed by constipation and daytime somnolence. This relationship aligns with previous findings on PD risk. Olfactory dysfunction, one of the most common and typical NMS associated with PD [49], is found in approximately 50-90% of PD patients [50], often manifesting as one of the disease's earliest symptoms [51]. Most patients develop olfactory dysfunction 4-6 years before the onset of motor impairment [47]. This is supported by Braak et al., who identified the presence of Lewy bodies and Lewy neurites in the olfactory bulb during the premotor stage of PD [52]. Constipation, a marker of early PD, affects approximately half of patients before the emergence of overt motor symptoms [53]. The neuropathology of α-synuclein in the enteric nervous system may precede typical changes in the midbrain and limbic regions [49]. Prior research has shown a higher prevalence of daytime somnolence in PD (35.1%) compared to the general population (9.0-16.8%) [54]. The Honolulu-Asia Aging Study revealed that men reporting subjective daytime somnolence had a 2.8-fold increased relative risk of developing PD [55]. Patients newly diagnosed with PD may already exhibit mild cognitive deficits, with nearly 50% developing PD dementia within 10 years of disease onset [56]. Depression, one of the earliest NMS in PD, has an incidence in PD patients three times higher than in the general population [57]. A large-scale retrospective cohort analysis of 32,415 individuals suggested that most patients developed depression approximately 10 years before motor symptoms [58]. RBD is a frequent and significant prodromal symptom of PD [59]. Approximately 50% of idiopathic RBD patients convert to PD within a decade, and ultimately, over 80% develop some form of neurodegenerative disease [59,60].
NMS bear clinical significance for early diagnosis. For instance, PD patients in the prodromal phase may present with NMS such as sleep disorders, olfactory dysfunction, mood disorders, constipation, and limb and trunk pain, but lack overt motor symptoms. These patients often visit sleep clinics, otorhinolaryngology clinics, psychological clinics, gastroenterology clinics, and orthopedics clinics, leading to delayed diagnosis. Furthermore, patients with limb pain may be mistakenly diagnosed with cervical or lumbar issues, and surgical intervention often fails to alleviate the pain. Employing an ML model can aid non-neurological clinicians in rapidly distinguishing PD patients from HCs, facilitating timely referrals to neurology departments for professional diagnosis and treatment.
There are several limitations to this study. First, as with all retrospective studies, potential selection biases and unknown confounding factors may not be accounted for in the analysis. To address these limitations, future work will include a prospective cohort study with a larger sample size. Second, as only one center was used for external validation, a larger sample size from multiple research centers is necessary to verify and improve the study's results. Third, due to time and resource constraints, we were unable to examine the participants' imaging indicators.

Conclusions
In conclusion, we identified multiple pivotal variables and SVM as a precise and clinician-friendly ML algorithm for the prediction of PD in Chinese patients. We believe that these ML algorithms, with further optimization, can serve as invaluable adjuncts to clinicians in the PD diagnostic process, potentially facilitating earlier and more accurate detection of the disease.

Figure 1. Flow diagram of selection of patients.

Figure 2. Receiver operating characteristic (ROC) curve of LASSO-LR, decision tree, random forest, XGboost, SVM, and KNN for the training set and validation set. (A) Integration of ROC for the PD classification model based on the six models for the training set. (B) Integration of ROC for the PD classification model based on the six models for the validation set.

Figure 3. Variable importance. Histograms (A-F) depict the proportion of factor importance for different predictors in the model. For each model, the relative importance is quantified by assigning a weight between 0 and 1 for each variable.

Table 1. Characteristics of total participants.

Table 2. A comparison of the model performances with the training and validation set.

Table 3. Comparative Analysis of Parkinson's Disease Diagnosis Using Machine Learning Approaches.