Review

AI in Parkinson’s Disease: A Short Review of Machine Learning Approaches for Diagnosis

by Arjita Sharma 1, Abhishek Agarwal 2,*, Michel Kalenga Wa Kalenga 3, Vishal Gupta 4,* and Vishal Srivastava 5

1 Department of Biotechnology, Thapar Institute of Engineering and Technology, Patiala 147004, Punjab, India
2 Mechanical Engineering Department, College of Science and Technology, Royal University of Bhutan, Phuentsholing 21101, Bhutan
3 Department of Metallurgy, School of Mining, Metallurgy and Chemical Engineering, Faculty of Engineering and the Built Environment, University of Johannesburg, P.O. Box 17001, Doornfontein 2028, South Africa
4 Department of Mechanical Engineering, Thapar Institute of Engineering and Technology, Patiala 147004, Punjab, India
5 Department of Electrical and Instrumentation Engineering, Thapar Institute of Engineering and Technology, Patiala 147004, Punjab, India
* Authors to whom correspondence should be addressed.
Processes 2026, 14(2), 199; https://doi.org/10.3390/pr14020199
Submission received: 7 October 2025 / Revised: 18 December 2025 / Accepted: 5 January 2026 / Published: 6 January 2026
(This article belongs to the Section AI-Enabled Process Engineering)

Abstract

Parkinson’s disease is a progressive neurodegenerative disorder affecting patients worldwide, characterized by motor dysfunction together with a range of non-motor symptoms. Early diagnosis and personalized treatment therefore remain the greatest challenges in managing the disease. Artificial intelligence (AI), especially machine learning, has shown considerable potential for addressing these challenges in recent years. This short review summarizes recent innovations in applying Machine Learning (ML) and Deep Learning (DL) to Parkinson’s disease, with a focus on diagnostic tools, progression prediction, and personalized treatment strategies. We discuss several ML and DL approaches, including supervised and unsupervised learning models that have been applied to classify symptoms and identify biomarkers. In addition, the integration of clinical and imaging data into disease models continues to advance, indicating the emerging role of DL in overcoming the limitations of standard methods. Finally, we outline possible future directions for AI in Parkinson’s disease research aimed at enhancing patient care and clinical outcomes.

1. Introduction

Parkinson’s disease (PD) is a common progressive (age-dependent) neurodegenerative condition that affects approximately 2–3% of people older than 65 years and involves the loss of dopaminergic neurons in the substantia nigra pars compacta [1]. This multisystemic disease is characterized by neuronal loss, which causes parkinsonism and non-motor symptoms, like hallucinations, hyposmia, dementia, and sleep disorders [2]. The loss of dopamine in PD, along with low norepinephrine levels [3], causes multiple symptoms like bradykinesia (slow movements) [4], tremors [5], dysarthria (motor speech disorder) [6], rigidity [7], and cognitive impairments [8] in patients, as depicted in Figure 1. The clinical presentation varies substantially across individuals, and symptoms worsen gradually over time, although pharmacological and rehabilitative therapies can alleviate disease burden to some extent.
PD is the second most prevalent neurodegenerative disorder following Alzheimer’s, and its global prevalence has increased significantly in recent decades [9]. Its cases have doubled in 25 years, with 8.5 million people affected globally in 2019, causing significant disability and death [10].
It predominantly affects older individuals but can also impact younger people, with men at higher risk. While its exact cause is unknown, factors like family history, age, and environmental toxins contribute to its development [10]. Risk factors such as traumatic brain injury, excessive dairy consumption, and unhealthy habits linked to cardiovascular disease, as well as associated medical conditions, including type 2 diabetes, certain infections, and autoimmune disorders, can also increase the risk of developing PD [11,12].
A significant loss (60–80%) of dopamine-producing cells in the substantia nigra occurs before PD symptoms appear [3,13]. Early detection helps clinicians in designing a neuroprotective disease-modifying therapeutic program, which can prevent or slow down its development and provide the patients (and their caregivers) with additional years of a higher quality of life [14]. Treatment in the initial stages can significantly improve quality of life, reduce socioeconomic impact, and extend life expectancy for patients, their families, and society [15].
PD diagnosis relies on clinical assessment of bradykinesia and the presence of either resting tremor or rigidity, though tremor can be absent in up to 30% of confirmed cases [16]. Despite recent improvements in diagnostic criteria, accurately identifying PD can be difficult, as symptoms often overlap with those of other neurological conditions, and there is currently no definitive test to confirm the diagnosis, especially in the early stages [17]. Clinical scales such as the Hoehn and Yahr scale (1967), the Unified PD Rating Scale (UPDRS) [18], and its modified version, the Movement Disorder Society-sponsored revision of the UPDRS (MDS-UPDRS) [19], are also used for diagnosis.
Deep learning (DL) is a subfield of Machine Learning (ML), as shown in Figure 2, which utilizes hierarchical artificial neural networks to unlock complex patterns within the data [20]. This capability makes it helpful in fields like natural language processing and computer vision. DL stands out from ML for its brain-like architecture and the ability to automatically extract patterns from data, unlike traditional methods that rely on human-defined features [21].
PD is associated with a slowing of brain activity (cortical activity), characterized by a shift from alpha to slower theta and delta brainwaves, even in early stages, along with a decrease in higher frequency activity [14]. Electroencephalography (EEG) signals effectively detect Parkinson’s disease despite their low amplitude, but their manual analysis is tedious and time-consuming. A visualization of different waves is presented in Figure 3 [22].
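As an illustrative sketch of how this spectral slowing could be quantified, the snippet below computes relative band power with Welch's method on a synthetic signal dominated by a 6 Hz (theta) rhythm. The 250 Hz sampling rate, band limits, and signal itself are illustrative assumptions, not taken from any study reviewed here.

```python
import numpy as np
from scipy.signal import welch

def relative_band_power(eeg, fs, band, total=(1.0, 45.0)):
    """Fraction of broadband PSD power falling within `band` (Hz)."""
    freqs, psd = welch(eeg, fs=fs, nperseg=2 * fs)
    def band_sum(lo, hi):
        return psd[(freqs >= lo) & (freqs < hi)].sum()
    return band_sum(*band) / band_sum(*total)

# Synthetic 10 s recording mimicking the theta shift described above.
fs = 250
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(0)
eeg = np.sin(2 * np.pi * 6 * t) + 0.2 * rng.standard_normal(t.size)

theta = relative_band_power(eeg, fs, (4, 8))    # theta band
alpha = relative_band_power(eeg, fs, (8, 13))   # alpha band
print(theta > alpha)  # the 6 Hz component dominates
```

In practice, such band-power ratios would be one of many features fed to a classifier, computed per channel and per epoch rather than on a single synthetic trace.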
ML models have also been applied to various data modalities for the diagnosis of Parkinson’s disease (PD), including handwritten patterns, movement, neuroimaging, voice, cerebrospinal fluid (CSF), cardiac scintigraphy, serum, and optical coherence tomography (OCT) [23]. Hence, ML offers a wide range of techniques for PD diagnosis, from traditional methods like linear regression, logistic regression, decision trees, and support vector machines to more advanced DL approaches such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) [24]. Among DL architectures, the CNN is the most widely used, particularly for image classification tasks [25].
These algorithms analyze diverse data types to identify patterns associated with the disease, thereby aiming to improve diagnostic accuracy and allow earlier detection.
Despite several recent reviews on AI applications in Parkinson’s disease, most prior work either focuses on a single data modality, provides broad conceptual summaries, or predates the surge of multimodal and deep learning approaches published between 2022 and 2025. The present review differs by offering an updated, modality-organized synthesis of the most recent ML and DL studies, spanning imaging, EEG, voice, gait, handwriting, emotion, and biomarker data. In addition to summarizing findings, this review provides cross-study comparisons, identifies recurring methodological limitations, and highlights trends such as multimodal fusion, explainable AI, and the growing use of large public datasets. By integrating evidence across diverse data sources and emphasizing diagnostic and predictive tasks, this review aims to provide a more comprehensive, clinically relevant perspective on current advances and remaining gaps in AI-based PD research.

Review Methodology

The review was conducted using a structured, narrative approach to identify recent ML and DL studies focused on the diagnosis of PD. A comprehensive literature search was performed across four major databases—PubMed, Scopus, Web of Science, and IEEE Xplore.
The search included combinations of the following keywords and Boolean operators:
  • “Parkinson’s disease”;
  • “Machine learning”;
  • “Deep learning”;
  • “Artificial intelligence”;
  • “Parkinson’s diagnosis”;
  • “Parkinson’s voice-based diagnosis”;
  • “Parkinson’s handwriting”;
  • “Parkinson’s biomarkers”;
  • “Parkinson’s multimodal analysis”;
  • “Parkinson’s early diagnosis”.
The studies were included if they met the following conditions:
  • Published between 2020 and 2025;
  • Focused on ML or DL models applied to PD diagnosis, classification, prediction, or symptom assessment;
  • Reported original research (not reviews or editorials);
  • Used human subject data across any modality (imaging, EEG, voice, gait, handwriting, biomarkers, or multimodal datasets);
  • Published in English.
Studies were excluded if they met the following conditions:
  • Were purely theoretical or did not evaluate a model on real data;
  • Focused solely on treatment response or medication effects without diagnostic relevance;
  • Did not involve ML/DL techniques;
  • Were duplicate reports or incomplete abstracts.
The initial search yielded approximately 60 articles. After title and abstract screening, eligible full texts were reviewed and organized by modality (Section 2.1, Section 2.2, Section 2.3, Section 2.4, Section 2.5, Section 2.6, Section 2.7 and Section 2.8). Reference lists were also screened for additional studies. Key information including data modality, feature extraction, ML/DL methods, dataset characteristics, objectives, and performance metrics was extracted into summary Table 1, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8 and Table 9. Comparative analysis and cross-modality synthesis were conducted to identify overarching trends, methodological limitations, and future research needs. Table 1 summarizes data collection methods, objectives, and limitations in PD research.

2. Literature Review

By leveraging ML techniques, we can identify relevant attributes that are not traditionally used in the medical diagnosis of PD, enabling the diagnosis of PD in its preclinical stages using alternative indicators.
Generally, the diagnosis process involves three phases: data pre-processing, feature extraction, and the application of classification techniques [28], as shown in Figure 4.
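The three phases above can be sketched as a single scikit-learn pipeline. The raw signals, the hand-crafted summary features, and the logistic-regression classifier are all illustrative placeholders rather than any specific published method.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, FunctionTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Toy "raw recordings": each row is a short 1-D signal (synthetic data,
# not real patients); labels 1 = PD, 0 = control, with a built-in shift.
rng = np.random.default_rng(0)
n, length = 120, 256
y = rng.integers(0, 2, size=n)
raw = rng.standard_normal((n, length)) + y[:, None] * 0.5

def extract_features(signals):
    """Hand-crafted summary statistics per recording."""
    return np.column_stack([signals.mean(axis=1),
                            signals.std(axis=1),
                            np.abs(np.diff(signals, axis=1)).mean(axis=1)])

pipe = Pipeline([
    ("features", FunctionTransformer(extract_features)),  # feature extraction
    ("scale", StandardScaler()),                          # pre-processing
    ("clf", LogisticRegression()),                        # classification
])
acc = cross_val_score(pipe, raw, y, cv=5).mean()
print(acc > 0.8)  # the synthetic class shift is easily separable
```

Real studies replace each stage with modality-specific choices (e.g., spectral features for EEG, kinematic features for gait) while keeping this same three-phase structure.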
Section 2.1, Section 2.2, Section 2.3, Section 2.4, Section 2.5, Section 2.6, Section 2.7 and Section 2.8 present recent advancements in 2025 on the application of machine learning (ML) and deep learning (DL) models for the detection of Parkinson’s disease (PD).

2.1. Imaging-Based Approaches (MRI, fMRI, SPECT, Etc.)

MRI scans reveal significant structural and functional brain differences in individuals with Parkinson’s disease (PD), particularly in the frontal and temporal lobes. These regions often show signs of cortical thinning, reduced gray matter volume, and disrupted connectivity, which are linked to the neurodegenerative processes responsible for both motor and cognitive symptoms in PD [72]. Moreover, the early-phase symptoms can be subtle and often overlap with other neurological conditions such as essential tremor, progressive supranuclear palsy (PSP), dementia with Lewy bodies (DLB), multiple system atrophy (MSA), and scans without evidence of dopaminergic deficit (SWEDD), making accurate diagnosis especially challenging even for experienced clinicians [29].
Imaging-based classification and prediction techniques for PD are summarized in Table 2.

2.2. EEG-Based Approaches

Electroencephalography (EEG) is increasingly used to investigate neural alterations in Parkinson’s disease (PD) due to its ability to capture real-time brain activity with high temporal resolution. While traditional EEG analyses focus on oscillatory power, connectivity patterns, or signal complexity, they often fall short in interpreting the transient neural dynamics essential for understanding PD-related functional disruptions [73]. Recent ML/DL-based EEG approaches for PD diagnosis are summarized in Table 3.

2.3. Voice and Speech-Based Analysis

Voice impairment, particularly dysphonia, is a common symptom in individuals with PD, with conditions like dysarthria and hypophonia affecting speech clarity and volume due to central nervous system damage [74]. Since nearly 90% of individuals with PD exhibit some form of speech or voice impairment, voice analysis has become a valuable tool for assessing disease progression, and AI is increasingly being integrated to support clinicians in interpreting these patterns and enhancing patient care.
Voice- and speech-based ML/DL approaches for PD diagnosis are summarized in Table 4.

2.4. Motion and Gait Analysis

Gait analysis is a powerful tool for quantifying movement deficits in PD, as gait abnormalities (along with bradykinesia, tremors, and posture issues) are core motor symptoms of the condition. Since early neuroimaging may not always reveal distinguishing signs, analyzing gait patterns provides valuable diagnostic insights, and with the aid of AI and machine learning, clinicians can better detect early-stage PD and monitor progression through subtle changes in mobility [75]. Motion- and gait-based ML/DL approaches for PD diagnosis are summarized in Table 5.

2.5. Handwriting and Drawing Analysis

Variations in handwriting, such as tremor frequency, amplitude, and direction, are valuable indicators in diagnosing Parkinson’s disease (PD), with writing tasks serving as clinical tools since the 19th century [55]. Quick handwriting assessments allow clinicians to observe fine motor disturbances and track tremor progression or treatment response through dynamic motion analysis. Table 6 summarizes studies applying recent ML/DL techniques for PD diagnosis using handwriting, drawing, and sketch-based features.

2.6. Emotion and Behavioral Data

AI is increasingly being used to detect subtle emotional and behavioral changes in individuals with Parkinson’s disease, such as mood swings, anxiety, and apathy, which are all symptoms that often precede or accompany motor decline. By analyzing speech patterns, facial expressions, and behavioral data from wearable sensors, AI models can identify psychological fluctuations early, enabling timely intervention and personalized care for the patients. Table 7 summarizes studies applying recent ML/DL techniques for PD diagnosis based on emotional, behavioral, and environmental data.

2.7. Multimodal and Fusion-Based Studies

Multimodal and fusion-based studies combine data from diverse sources such as neuroimaging, genetics, voice, gait, and wearable sensors to enhance early detection of prodromal PD. These integrative approaches leverage machine learning and deep learning models to extract complementary information, improving diagnostic accuracy beyond what single-modality systems can achieve [26]. Table 8 summarizes studies applying recent ML/DL techniques for PD diagnosis through multimodal data fusion.

2.8. Genomic and Biological Markers

Genomic and biological markers, such as SNCA, LRRK2, and α-synuclein levels, play a crucial role in identifying individuals at risk for Parkinson’s disease (PD) before motor symptoms appear. Integrating these biomarkers with AI-driven models enables more precise risk prediction and early diagnosis by uncovering hidden patterns across large-scale omics datasets.
AI, particularly ML algorithms, can assist healthcare professionals in developing early diagnostic tools for patients with various neurodegenerative disorders. This is achieved by predicting long-term brain changes and measuring the effectiveness of treatments [27,76]. Thus, the integration of AI into clinical practice holds significant promise for improving patient outcomes through timely intervention and personalized treatment strategies. Table 9 summarizes studies applying recent ML/DL techniques for PD diagnosis based on genomic, genetic, and biomarker data.

3. Challenges and Future Directions

Although the application of machine learning (ML) and deep learning (DL) in Parkinson’s disease (PD) research shows considerable promise, several key challenges must be addressed before these technologies can be effectively implemented in clinical practice.

3.1. Cross-Modal Comparative Discussion

Artificial intelligence (AI) methods have been applied across a wide range of data modalities in Parkinson’s disease (PD) research, each offering unique strengths along with inherent limitations. Understanding these cross-modal complementarities is essential for designing clinically meaningful diagnostic and predictive systems.

3.1.1. Imaging Modalities (MRI, fMRI, SPECT, DTI)

Imaging-based AI approaches benefit from rich structural and functional information, enabling the characterization of cortical thinning, altered connectivity networks, and dopaminergic deficits that are strongly associated with PD pathology [29,72]. Deep learning (DL) architectures, particularly CNNs and hybrid models, have achieved high diagnostic accuracy in identifying subtle anatomical changes. However, imaging data often require expensive scanners, standardized acquisition protocols, and highly trained personnel—factors that limit widespread clinical adoption. Additionally, imaging datasets are typically small, exhibit multicentre variability, and lack pathology-confirmed labels, resulting in limited model generalizability [24,29].

3.1.2. EEG-Based Methods

EEG offers excellent temporal resolution and is cost-effective compared to structural and functional imaging. AI models have successfully identified PD-related alterations in oscillatory power, connectivity, and time–frequency dynamics [35,36]. Nonetheless, EEG signals are highly susceptible to noise, inter-subject variability, and medication effects. The generalizability of EEG-based AI models decreases sharply when evaluated in cross-subject or cross-dataset settings, mainly due to inconsistent electrode settings and small sample sizes [35].

3.1.3. Voice and Speech Analysis

Voice-based AI systems leverage the fact that nearly 90% of PD patients exhibit dysphonia or speech impairment [74]. ML/DL models trained on MFCCs, jitter, shimmer, FrFT-based time–frequency features, and spectrogram representations show promising accuracy across various languages and recording conditions [39,40,41,42,43,44]. The primary limitations include dataset imbalance, overfitting to language-specific properties, and poor generalization across recording devices. Moreover, many publicly available datasets involve sustained vowel phonation, which may not always reflect real-world speech patterns [46].
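As a minimal sketch of two of the perturbation measures mentioned above, the snippet below computes simplified local jitter and shimmer from the cycle peaks of a synthetic sustained phonation. The simplified definitions and the synthetic 150 Hz signal are assumptions for illustration; production systems typically use dedicated phonetic tooling with more robust cycle detection.

```python
import numpy as np
from scipy.signal import find_peaks

def jitter_shimmer(signal, fs):
    """Simplified local jitter (%) and shimmer (%): mean absolute
    cycle-to-cycle variation of period and of peak amplitude,
    each relative to its own mean."""
    peaks, props = find_peaks(signal, height=0)
    periods = np.diff(peaks) / fs
    amps = props["peak_heights"]
    jitter = np.mean(np.abs(np.diff(periods))) / np.mean(periods) * 100
    shimmer = np.mean(np.abs(np.diff(amps))) / np.mean(amps) * 100
    return jitter, shimmer

# Synthetic sustained vowel at ~150 Hz; the perturbed version has a
# blockwise ~5% amplitude modulation standing in for dysphonic voice.
fs = 16000
t = np.arange(0, 1.0, 1 / fs)
rng = np.random.default_rng(1)
clean = np.sin(2 * np.pi * 150 * t)
gain = 1 + 0.05 * rng.standard_normal(t.size // 160).repeat(160)
perturbed = clean * gain[:t.size]

j_clean, s_clean = jitter_shimmer(clean, fs)
j_pert, s_pert = jitter_shimmer(perturbed, fs)
print(s_pert > s_clean)  # amplitude perturbation raises shimmer
```

Features like these, alongside MFCCs and spectrogram representations, form the typical input space for the voice-based classifiers summarized in Table 4.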

3.1.4. Gait and Motion Analysis

Wearable sensors, accelerometers, and video-based pose estimation techniques offer unobtrusive and continuous monitoring of gait abnormalities—one of the hallmark symptoms of PD [49,75]. AI models have demonstrated strong performance in detecting freezing of gait (FoG), stride irregularities, and movement instability [51,52,53]. However, motion datasets are typically small and collected under controlled laboratory environments. Cross-dataset performance declines sharply due to variations in sensor placement, sampling frequencies, and movement protocols. Additionally, gait patterns may vary widely based on disease stage, comorbidities, and age.
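One widely used hand-crafted feature for FoG detection is the freeze index: the ratio of leg-acceleration power in the 3–8 Hz "freeze" band to power in the 0.5–3 Hz locomotor band. The sketch below computes it on synthetic signals; the sampling rate and the two pure-tone signals are illustrative assumptions rather than sensor data from any reviewed study.

```python
import numpy as np
from scipy.signal import welch

def freeze_index(accel, fs):
    """Power in the 3-8 Hz 'freeze' band divided by power in the
    0.5-3 Hz locomotor band of a leg-acceleration signal."""
    freqs, psd = welch(accel, fs=fs, nperseg=min(len(accel), 4 * fs))
    def band(lo, hi):
        return psd[(freqs >= lo) & (freqs < hi)].sum()
    return band(3, 8) / band(0.5, 3)

fs = 100
t = np.arange(0, 4, 1 / fs)
walking = np.sin(2 * np.pi * 1.5 * t)    # ~1.5 Hz stride rhythm
trembling = np.sin(2 * np.pi * 6 * t)    # ~6 Hz freeze-band oscillation

print(freeze_index(trembling, fs) > freeze_index(walking, fs))
```

In deployed systems, this index is computed over sliding windows and thresholded (or fed to a classifier) to flag freezing episodes in real time.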

3.1.5. Handwriting and Drawing-Based Analysis

Handwriting analysis is particularly useful because tremor, bradykinesia, rigidity, and micrographia manifest clearly during writing tasks. DL models applied to spirals, meanders, and pressure-sensitive trajectories can discriminate PD from controls with high accuracy [55,56,57,58,59]. Still, handwriting datasets are limited in size and diversity, and most studies rely on static images rather than dynamic kinematic sequences. This reduces the capacity of models to capture real-time tremor fluctuations and movement smoothness.
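As a toy illustration of a dynamic (rather than static-image) handwriting feature, the sketch below unrolls a drawn spiral into radius versus angle and scores tremor as deviation from the expected linear trend of an Archimedean spiral. The spiral model, the wobble term, and the score itself are hypothetical constructions for illustration, not a published method.

```python
import numpy as np

def radial_tremor_score(x, y):
    """Std of the detrended radius along an unrolled spiral drawing.
    A smooth Archimedean spiral grows linearly in radius with unwound
    angle; tremor shows up as oscillation around that linear trend."""
    r = np.hypot(x, y)
    theta = np.unwrap(np.arctan2(y, x))
    trend = np.polyval(np.polyfit(theta, r, 1), theta)
    return np.std(r - trend)

# Ideal spiral vs. the same spiral with a superimposed radial wobble
# (a hypothetical stand-in for tremor during drawing).
theta = np.linspace(2 * np.pi, 8 * np.pi, 600)
r_smooth = 0.5 * theta
r_tremor = r_smooth + 0.3 * np.sin(12 * theta)

xs, ys = r_smooth * np.cos(theta), r_smooth * np.sin(theta)
xt, yt = r_tremor * np.cos(theta), r_tremor * np.sin(theta)

print(radial_tremor_score(xt, yt) > radial_tremor_score(xs, ys))
```

Tablet-acquired drawings additionally provide pressure, velocity, and in-air time, which kinematic models exploit beyond what static scans allow.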

3.1.6. Genomic, Proteomic, and Biomarker-Based Approaches

Molecular and biospecimen-based AI approaches have become increasingly important for detecting prodromal PD. Models trained on SNPs, CSF α-synuclein, tau, NfL, proteomics, or metabolomics data can reveal early biological signatures before motor symptoms appear [23,24,26,27].
The main challenges include population heterogeneity, batch effects, expensive assays, and missing modalities across cohorts. Biomarker datasets also tend to be small and require careful normalization and harmonization for reproducible AI models.

3.1.7. Synthesis Across Modalities

Across all modalities, a recurring theme emerges: single-source data provides valuable but incomplete information. Imaging captures structural changes, EEG reflects functional dynamics, speech captures articulatory deficits, and gait reveals motor control impairments. Biomarkers provide mechanistic insight.
AI systems that integrate these complementary signals show the highest potential for early-stage and prodromal PD detection, especially through multimodal fusion and hybrid ML–clinical frameworks [26,66].

3.2. Clinical Translation Challenges

Although AI demonstrates strong technical performance, several practical challenges hinder its integration into routine PD diagnosis and monitoring.

3.2.1. Regulatory Requirements

Clinical deployment of AI models must adhere to stringent regulatory frameworks, including standards for software as a medical device (SaMD), algorithm transparency, and post-marketing surveillance. Regulatory bodies increasingly require the following:
  • Explainability and interpretability;
  • Demonstration of algorithm robustness across populations;
  • Monitoring for performance drift in real-world settings.
These requirements pose challenges for deep learning models that operate as “black boxes”.

3.2.2. Dataset Harmonization and Standardization

Across PD research, data collection protocols vary widely by center, device, operator, and demographic characteristics. Differences in MRI scanner field strengths, EEG electrode configurations, voice recording environments, and biospecimen handling result in non-uniform datasets.
Without harmonization, AI models suffer from dataset shift and fail to generalize beyond their training domain. Multi-center, standardized acquisition pipelines are critical for ensuring reproducibility.

3.2.3. Model Interpretability

Clinical practitioners require transparent and interpretable AI decisions. While explainable AI (XAI) techniques such as SHAP, LIME, Grad-CAM, and feature attribution mapping are increasingly used, many PD models still provide limited insight into their internal reasoning.
Lack of interpretability reduces physician trust and slows regulatory approval.
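As a lightweight, model-agnostic illustration of feature attribution (simpler than SHAP or LIME, but in the same spirit of explaining which inputs drive predictions), the sketch below ranks features by permutation importance on a synthetic tabular task. The data and model are illustrative assumptions; with `shuffle=False`, scikit-learn places the informative features in the first columns.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic tabular stand-in for clinical/voice features:
# columns 0-3 are informative by construction, 4-7 are noise.
X, y = make_classification(n_samples=400, n_features=8, n_informative=4,
                           n_redundant=0, shuffle=False, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
result = permutation_importance(model, X_te, y_te, n_repeats=20,
                                random_state=0)

# Attribution should concentrate on the informative columns.
informative = result.importances_mean[:4].sum()
noise = result.importances_mean[4:].sum()
print(informative > noise)
```

The same idea underpins clinical XAI reports: attributions are checked against known pathophysiology, and a model that leans on implausible features is flagged before deployment.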

3.2.4. Integration into Clinical Workflow

Even high-performing AI models may fail in practice if they are not aligned with existing clinical workflows. Challenges include the following:
  • Compatibility with EMR/EHR systems;
  • Time burden of data acquisition;
  • Need for technical support and training;
  • Model updates and maintenance.
Integration requires co-design with neurologists, movement disorder specialists, and hospital IT teams.

3.2.5. Cost and Infrastructure Limitations

Access to advanced imaging hardware, cloud-based storage, GPU-powered computation, and biospecimen testing varies widely across regions. These disparities are particularly evident in low- and middle-income countries. For real-world implementation, AI systems must be designed to operate efficiently on low-resource hardware or edge devices.

3.3. Limited Generalizability and Data Constraints

One of the primary concerns is the limited generalizability of current ML and DL models. Many studies are based on relatively small datasets that are often collected from single institutions or specific patient groups. This restricts the diversity and may not accurately reflect the broader PD population, especially those from minority or underrepresented groups. Such limitations can result in biased model performance and reduced accuracy when applied to real-world, diverse clinical settings. Expanding datasets to include multiple centers and varied demographic profiles is essential to develop more reliable and broadly applicable diagnostic tools [77,78,79].

3.4. Ethical and Legal Issues

The adoption of AI-based approaches in healthcare also raises important ethical and legal considerations. One critical issue is the potential for algorithmic bias, where models trained on unbalanced data could yield unfair or inconsistent outcomes across different patient populations. Additionally, concerns regarding patient privacy and data security are becoming increasingly significant, particularly when handling sensitive health information. Current regulatory and ethical guidelines are still evolving to address these complex challenges. Ensuring fairness, maintaining transparency, and protecting patient rights are fundamental to the responsible development and deployment of AI in clinical environments [80].

3.5. Obstacles to Clinical Adoption

Bringing AI models into routine clinical use faces practical barriers. There is currently no universal standard for how data should be collected, processed, or evaluated in this domain, which makes it difficult to compare results across studies and validate models consistently. Furthermore, many of the advanced models, especially those using deep learning, operate as “black boxes,” providing limited explanations for their decisions. This lack of interpretability can reduce confidence among healthcare professionals and hinder clinical integration. To overcome this, it is necessary to prioritize the development of models that are not only accurate but also explainable and clinically transparent.
The translation of machine learning approaches into routine medical practice depends heavily on economic and infrastructural realities that differ widely across healthcare systems. Beyond algorithmic accuracy, the feasibility of deploying ML tools is shaped by the availability and cost of diagnostic equipment, the affordability of clinical services, and the presence of reliable digital infrastructure to support data processing and storage. In many low- and middle-income settings, limitations in imaging hardware, computational capacity, and standardized data acquisition pipelines pose significant barriers to adopting advanced ML-driven workflows. These practical concerns are consistent with broader discussions in the literature on the challenges of integrating AI technologies into real-world medicine, particularly the disparities in infrastructure, regulatory readiness, and resource availability across regions. At the same time, recent progress in portable imaging devices, cloud-based analytical systems, and cost-efficient diagnostic platforms suggests that ML applications may become increasingly accessible, even in resource-constrained environments.
Acknowledging these factors provides a realistic outlook on the future applicability of ML in clinical practice and highlights the need for solutions designed to be scalable, affordable, and adaptable across varying healthcare ecosystems.

3.6. Future Directions

Based on the above synthesis, several high-priority research directions emerge for advancing AI-based PD diagnosis and monitoring.

3.6.1. Large-Scale, Multimodal Datasets with Harmonized Acquisition Standards

Future research should focus on building longitudinal, multi-center PD datasets that combine imaging, EEG, speech, gait, handwriting, genetic, and clinical data using unified protocols. This will reduce dataset shift and improve the robustness of machine learning models.

3.6.2. Explainable AI (XAI) to Enhance Clinician Trust

AI systems must provide interpretable explanations for their predictions. Examples include the following:
  • Saliency maps for imaging;
  • Frequency band contributions for EEG;
  • Formant and MFCC influence for speech models;
  • Gait cycle markers for motion analysis.
Developing modality-specific XAI frameworks will help clinicians understand and trust model outputs.

3.6.3. Multicenter External Validation and Benchmarking

Future studies must evaluate AI models across the following:
  • Different hospitals;
  • Populations;
  • Recording devices;
  • Geographical regions;
  • Disease stages.
Such external validation is a prerequisite for regulatory approval and clinical adoption.

3.6.4. Early-Stage and Prodromal PD Prediction Models

Most existing studies focus on differentiating established PD from healthy controls. A major future goal is to detect PD before motor symptoms appear, using the following:
  • REM sleep behavior disorder datasets;
  • Genetic risk profiles;
  • Autonomic dysfunction signals;
  • Subtle voice, gait, and handwriting biomarkers;
  • CSF and plasma signatures.
This early detection could dramatically improve neuroprotective treatment strategies.

3.6.5. Hybrid ML–Clinical Scoring Systems

Combining AI predictions with clinical scales such as MDS-UPDRS, MoCA, and H&Y staging can enhance diagnostic precision. Hybrid systems allow the following:
  • Clinician oversight;
  • Improved interpretability;
  • Better patient stratification.

3.6.6. Integration with Wearable and Home-Monitoring Technologies

Wearables, smartphones, and IoT devices allow continuous monitoring of gait, tremor, sleep, and voice. AI-enabled home-monitoring systems can perform the following:
  • Detect early deterioration;
  • Personalize treatment;
  • Reduce hospital visits;
  • Support telemedicine applications.

3.6.7. Generative AI for Data Augmentation and Missing-Modality Compensation

Generative models (GANs, diffusion models, variational autoencoders) can be used to carry out the following:
  • Augment small datasets;
  • Simulate rare gait or handwriting patterns;
  • Reconstruct missing imaging or biospecimen modalities;
  • Harmonize datasets across acquisition settings.
GAN-based augmentation has shown promise in reducing class imbalance and improving generalization in multiple PD modalities.
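As a much simpler stand-in for generative augmentation, the sketch below oversamples a minority class by resampling with small Gaussian jitter. The class sizes, noise level, and feature dimension are illustrative assumptions; a real GAN or diffusion model would learn the perturbation distribution from data rather than fixing it by hand.

```python
import numpy as np

def jitter_augment(X_minority, n_new, sigma=0.05, seed=0):
    """Oversample a minority class by resampling rows and adding small
    Gaussian noise -- a lightweight stand-in for generative augmentation."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(X_minority), size=n_new)
    noise = sigma * rng.standard_normal((n_new, X_minority.shape[1]))
    return X_minority[idx] + noise

# Imbalanced toy cohort: 100 controls vs. 10 PD feature vectors.
rng = np.random.default_rng(1)
X_pd = rng.standard_normal((10, 6)) + 1.0
X_aug = jitter_augment(X_pd, n_new=90)

balanced_pd = np.vstack([X_pd, X_aug])
print(balanced_pd.shape)  # (100, 6)
```

Even this crude scheme illustrates the design question generative methods address more rigorously: new samples must stay on the data manifold, or the classifier learns the augmentation artifact instead of the disease signal.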
By addressing these pressing challenges, the field can move toward more effective, equitable, and clinically accepted AI solutions for Parkinson’s disease management.

4. Conclusions

In conclusion, the application of artificial intelligence, particularly machine learning and deep learning, represents a promising advancement in the early diagnosis and management of Parkinson’s disease. By aiding in the development of predictive tools and personalized treatment strategies, AI holds the potential to transform current clinical practices, offering more precise and effective options for patients. However, realizing this potential in a clinical setting requires addressing significant challenges, such as standardizing data, resolving ethical concerns, and creating uniform protocols. As these issues are tackled, AI is likely to play an increasingly vital role in enhancing patient care, ultimately improving outcomes for those affected by Parkinson’s disease.
Future research should prioritize expanding datasets with diverse patient populations to improve model generalizability and conducting longitudinal studies that track PD progression biomarkers. Advanced explainable AI (XAI) techniques, tailored for multimodal imaging analysis, can also help clarify model decision-making. Additionally, incorporating further data, such as genetic and clinical assessments, may enhance the accuracy and interpretability of early PD detection, ultimately bringing AI closer to practical, impactful clinical applications in PD care.

Author Contributions

Conceptualization, A.S., A.A., M.K.W.K. and V.G.; methodology, A.A. and V.G.; software, A.S., A.A., V.G. and V.S.; validation, V.G. and V.S.; formal analysis, A.S. and V.S.; investigation, A.S., A.A. and M.K.W.K.; resources, A.S. and V.S.; data curation, V.G. and V.S.; writing—original draft preparation, A.S., A.A., M.K.W.K., V.G. and V.S.; writing—review and editing, A.S., A.A., M.K.W.K., V.G. and V.S.; visualization, V.G. and V.S.; supervision, A.A. and V.G.; project administration, A.A., M.K.W.K., V.G. and V.S.; funding acquisition, A.A. and M.K.W.K. All authors have read and agreed to the published version of the manuscript.

Funding

The APC was funded by the University of Johannesburg.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All relevant data and results are presented within this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Polverino, P.; Cocco, A.; Albanese, A. Post-COVID parkinsonism: A scoping review. Park. Relat. Disord. 2024, 123, 106011. [Google Scholar] [CrossRef]
  2. Iranzo, A.; Cochen De Cock, V.; Fantini, M.L.; Pérez-Carbonell, L.; Trotti, L.M. Sleep and sleep disorders in people with Parkinson’s disease. Lancet Neurol. 2024, 23, 925–937. [Google Scholar] [CrossRef] [PubMed]
  3. Kispotta, S.; Das, D.; Prusty, S.K. A recent update on drugs and alternative approaches for parkinsonism. Neuropeptides 2024, 104, 101747. [Google Scholar] [CrossRef] [PubMed]
  4. De Graaf, D.; Araújo, R.; Derksen, M.; Zwinderman, K.; de Vries, N.M.; IntHout, J.; Bloem, B.R. The sound of Parkinson’s disease: A model of audible bradykinesia. Park. Relat. Disord. 2024, 120, 106016. [Google Scholar] [CrossRef] [PubMed]
  5. Hathaliya, J.J.; Modi, H.; Gupta, R.; Tanwar, S.; Sharma, P.; Sharma, R. Parkinson and essential tremor classification to identify the patient’s risk based on tremor severity. Comput. Electr. Eng. 2022, 101, 107946. [Google Scholar] [CrossRef]
  6. Karan, B.; Sahu, S.S.; Orozco-Arroyave, J.R. An investigation about the relationship between dysarthria level of speech and the neurological state of Parkinson’s patients. Biocybern. Biomed. Eng. 2022, 42, 710–726. [Google Scholar] [CrossRef]
  7. Linn-Evans, M.E.; Petrucci, M.N.; Huffmaster, S.L.A.; Chung, J.W.; Tuite, P.J.; Howell, M.J.; MacKinnon, C.D. REM sleep without atonia is associated with increased rigidity in patients with mild to moderate Parkinson’s disease. Clin. Neurophysiol. 2020, 131, 2008–2016. [Google Scholar] [CrossRef]
  8. Yeager, B.E.; Twedt, H.P.; Bruss, J.; Schultz, J.; Narayanan, N.S. Cortical and subcortical functional connectivity and cognitive impairment in Parkinson’s disease. NeuroImage Clin. 2024, 42, 103610. [Google Scholar] [CrossRef]
  9. Mmed, Z.; Zhu, J.; Cui, Y.; Zhang, J.; Yan, R.; Su, D.; Zhao, D.; Feng, T. Temporal trends in the prevalence of Parkinson’s disease from 1980 to 2023: A systematic review and meta-analysis. Lancet Healthy Longev. 2024, 5, e464–e479. [Google Scholar]
  10. World Health Organization. Parkinson Disease. Available online: www.who.int (accessed on 8 October 2025).
  11. Grotewold, N.; Albin, R.L. Update: Protective and risk factors for Parkinson disease. Park. Relat. Disord. 2024, 125, 107026. [Google Scholar] [CrossRef]
  12. Venkatesan, D.; Iyer, M.; Wilson, R.; Vellingiri, B. The association between multiple risk factors, clinical correlations and molecular insights in Parkinson’s disease patients from Tamil Nadu population, India. Neurosci. Lett. 2021, 755, 135903. [Google Scholar] [CrossRef]
  13. Shaban, M. Deep learning for Parkinson’s disease diagnosis: A short survey. Computers 2023, 12, 58. [Google Scholar] [CrossRef]
  14. Gimenez-Aparisi, G.; Guijarro-Estelles, E.; Chornet-Lurbe, A.; Ballesta-Martinez, S.; Pardo-Hernandez, M.; Ye-Lin, Y. Early detection of Parkinson’s disease: Systematic analysis of the influence of the eyes on quantitative biomarkers in resting state electroencephalography. Heliyon 2023, 9, e20625. [Google Scholar] [CrossRef]
  15. Fezeu, F.; Jbara, O.F.; Jbarah, A.; Choucha, A.; De Maria, L.; Ciaglia, E.; Samnick, S. PET imaging for a very early detection of rapid eye movement sleep behaviour disorder and Parkinson’s disease—A model-based cost-effectiveness analysis. Clin. Neurol. Neurosurg. 2024, 243, 108404. [Google Scholar] [CrossRef]
  16. Kobylecki, C. Update on the diagnosis and management of Parkinson’s disease. Clin. Med. 2020, 20, 393–398. [Google Scholar] [CrossRef]
  17. Tolosa, S.; Scholz, W.; Tolosa, E.; Garrido, A.; Scholz, S.W.; Poewe, W. Challenges in the diagnosis of Parkinson’s disease. Lancet Neurol. 2021, 20, 385–397. [Google Scholar] [CrossRef] [PubMed]
  18. Pahuja, G.; Nagabhushan, T.N. A comparative study of existing machine learning approaches for Parkinson’s disease detection. IETE J. Res. 2021, 67, 4–14. [Google Scholar] [CrossRef]
  19. Ramsay, N.; Macleod, A.D.; Alves, G.; Camacho, M.; Forsgren, L.; Lawson, R.A.; Khoo, T.K. Validation of a UPDRS-/MDS-UPDRS-based definition of functional dependency for Parkinson’s disease. Park. Relat. Disord. 2020, 76, 49–53. [Google Scholar] [CrossRef] [PubMed]
  20. Ananth, K.R.; Khan, S.F.; Agarwal, A.; Bhatt, M.W.; Alvi, A.M.; Degadwala, S. IoT and Developed Deep Learning-Based Road Accident Detection System and Societal Knowledge Management. In Next Generation Computing and Information Systems; CRC Press: Boca Raton, FL, USA, 2024; pp. 64–71. [Google Scholar] [CrossRef]
  21. Debus, B.; Parastar, H.; Harrington, P.; Kirsanov, D. Deep learning in analytical chemistry. Trends Anal. Chem. 2021, 148, 116459. [Google Scholar] [CrossRef]
  22. Delis, A.; Tsavdaridis, G.; Tsanakas, P. A Novel Battery-Supplied AFE EEG Circuit Capable of Muscle Movement Artifact Suppression. Processes 2024, 14, 6886. [Google Scholar] [CrossRef]
  23. Sivaranjini, S.; Sujatha, C.M. Deep learning-based diagnosis of Parkinson’s disease using convolutional neural network. Health Inf. Sci. Syst. 2020, 79, 15467–15479. [Google Scholar] [CrossRef]
  24. Dennis, A.G.P.; Strafella, A.P. The role of AI and machine learning in the diagnosis of Parkinson’s disease and atypical parkinsonisms. Park. Relat. Disord. 2024, 128, 107118. [Google Scholar] [CrossRef]
  25. Chen, J.; Park, C. A deep learning paradigm for medical imaging data. Expert Syst. Appl. 2024, 55, 124052. [Google Scholar] [CrossRef]
  26. Serag, I.; Azzam, A.Y.; Hassan, A.K.; Diab, R.A.; Diab, M.; Hefnawy, M.T.; Ali, M.A.; Negida, A. Multimodal Diagnostic Tools and Advanced Data Models for Detection of Prodromal Parkinson’s Disease: A Scoping Review. BMC Med. Imaging 2025, 25, 103. [Google Scholar] [CrossRef]
  27. Ameli, A.; Peña-Castillo, L.; Usefi, H. Assessing the Reproducibility of Machine-Learning-Based Biomarker Discovery in Parkinson’s Disease. Comput. Biol. Med. 2024, 174, 108407. [Google Scholar] [CrossRef]
  28. Rana, A.; Dumka, A.; Singh, R.; Panda, M.K.; Priyadarshi, N.; Twala, B. Imperative role of machine learning algorithm for detection of Parkinson’s disease: Review, challenges and recommendations. Diagnostics 2022, 12, 2003. [Google Scholar] [CrossRef]
  29. Aggarwal, N.; Saini, B.S.; Gupta, S. Role of artificial intelligence techniques and neuroimaging modalities in detection of Parkinson’s disease: A systematic review. Cogn. Comput. 2023, 16, 2078–2115. [Google Scholar] [CrossRef]
  30. Volkmann, H.; Höglinger, G.U.; Grön, G.; Bârlescu, L.A.; DESCRIBE-PSP Study Group; Müller, H.P.; Kassubek, J. MRI classification of progressive supranuclear palsy, Parkinson disease and controls using deep learning and machine learning algorithms for the identification of regions and tracts of interest as potential biomarkers. Comput. Biol. Med. 2025, 185, 109518. [Google Scholar] [CrossRef]
  31. Islam, N.; Turza, M.S.A.; Fahim, S.I.; Rahman, R.M. Advanced Parkinson’s disease detection: A comprehensive artificial intelligence approach utilizing clinical assessment and neuroimaging samples. Int. J. Cogn. Comput. Eng. 2024, 5, 199–220. [Google Scholar] [CrossRef]
  32. Yuan, J.; He, Y. Adoption of deep learning-based magnetic resonance image information diagnosis in brain function network analysis of Parkinson’s disease patients with end-of-dose wearing-off. J. Neurosci. Methods 2024, 409, 110184. [Google Scholar] [CrossRef]
  33. Maged, A.; Zhu, M.; Gao, W.; Hosny, M. Lightweight deep learning model for automated STN localization using MER in Parkinson’s disease. Biomed. Signal Process. Control 2024, 96, 106640. [Google Scholar] [CrossRef]
  34. Pahuja, G.; Prasad, B. Deep learning architectures for Parkinson’s disease detection by using multi-modal features. Comput. Biol. Med. 2022, 146, 105610. [Google Scholar] [CrossRef]
  35. Bera, S.; Geem, Z.W.; Cho, Y.I.; Singh, P.K. A comparative study of machine learning and deep learning models for automatic Parkinson’s disease detection from electroencephalogram signals. Diagnostics 2025, 15, 773. [Google Scholar] [CrossRef] [PubMed]
  36. Zhang, R.; Jia, J.; Zhang, R. EEG analysis of Parkinson’s disease using time–frequency analysis and deep learning. Biomed. Signal Process. Control 2022, 78, 103883. [Google Scholar] [CrossRef]
  37. Ezazi, Y.; Ghaderyan, P. Textural feature of EEG signals as a new biomarker of reward processing in Parkinson’s disease detection. J. Appl. Biomed. 2022, 42, 950–962. [Google Scholar] [CrossRef]
  38. Guo, Y.; Huang, D.; Zhang, W.; Wang, L.; Li, Y.; Olmo, G.; Chan, P. High-accuracy wearable detection of freezing of gait in Parkinson’s disease based on pseudo-multimodal features. Comput. Biol. Med. 2022, 146, 105629. [Google Scholar] [CrossRef] [PubMed]
  39. Yang, Z.; Zhou, H.; Srivastav, S.; Shaffer, J.G.; Abraham, K.E.; Naandam, S.M.; Kakraba, S. Optimizing Parkinson’s disease prediction: A comparative analysis of data aggregation methods using multiple voice recordings via an automated artificial intelligence pipeline. Data 2025, 10, 4. [Google Scholar] [CrossRef]
  40. Islam, R.; Tarique, M. Escalate Prognosis of Parkinson’s Disease Employing Wavelet Features and Artificial Intelligence from Vowel Phonation. BioMedInformatics 2025, 5, 23. [Google Scholar] [CrossRef]
  41. Al-Najjar, H.; Al-Rousan, N.; Al-Najjar, D. Hybrid Grey Wolf and Whale Optimization for Enhanced Parkinson’s Prediction Based on Machine Learning Models Using Biomedical Sound. Inform. Med. Unlocked 2024, 48, 101524. [Google Scholar] [CrossRef]
  42. Singh, N.; Tripathi, P. An Ensemble Technique to Predict Parkinson’s Disease Using Machine Learning Algorithms. Speech Commun. 2024, 159, 103067. [Google Scholar] [CrossRef]
  43. Zhang, T.; Lin, L.; Xue, Z. A Voice Feature Extraction Method Based on Fractional Attribute Topology for Parkinson’s Disease Detection. Expert Syst. Appl. 2023, 219, 119650. [Google Scholar] [CrossRef]
  44. Guatelli, R.; Aubin, V.; Mora, M.; Naranjo-Torres, J.; Mora-Olivari, A. Detection of Parkinson’s Disease Based on Spectrograms of Voice Recordings and Extreme Learning Machine Random Weight Neural Networks. Eng. Appl. Artif. Intell. 2023, 125, 106700. [Google Scholar] [CrossRef]
  45. Hireš, M.; Drotár, P.; Pah, N.D.; Ngo, Q.C.; Kumar, D.K. On the Inter-Dataset Generalization of Machine Learning Approaches to Parkinson’s Disease Detection from Voice. Int. J. Med. Inform. 2023, 179, 105237. [Google Scholar] [CrossRef]
  46. Skibińska, J.; Hosek, J. Computerized Analysis of Hypomimia and Hypokinetic Dysarthria for Improved Diagnosis of Parkinson’s Disease. Heliyon 2023, 9, e21175. [Google Scholar] [CrossRef]
  47. Hawi, S.; Alhozami, J.; AlQahtani, R.; AlSafran, D.; Alqarni, M.; Sahmarany, L.E. Automatic Parkinson’s Disease Detection Based on the Combination of Long-Term Acoustic Features and Mel Frequency Cepstral Coefficients (MFCC). Biomed. Signal Process. Control. 2022, 78, 104013. [Google Scholar] [CrossRef]
  48. Hireš, M.; Gazda, M.; Drotár, P.; Pah, N.D.; Motin, M.A.; Kumar, D.K. Convolutional Neural Network Ensemble for Parkinson’s Disease Detection from Voice Recordings. Comput. Biol. Med. 2022, 141, 105021. [Google Scholar] [CrossRef] [PubMed]
  49. Guo, L.; Chang, R.; Wang, J.; Narayanan, A.; Qian, P.; Leong, M.C.; Kundu, P.P.; Senthilkumar, S.; Garlapati, S.C.; Yong, E.C.K.; et al. Artificial Intelligence-Enhanced 3D Gait Analysis with a Single Consumer-Grade Camera. J. Biomech. 2025, 187, 112738. [Google Scholar] [CrossRef] [PubMed]
  50. Sánchez Fernández, L.P.; Sánchez Pérez, L.A.; Martínez Hernández, J.M. Computer Model for Gait Assessments in Parkinson’s Patients Using a Fuzzy Inference Model and Inertial Sensors. Artif. Intell. Med. 2025, 160, 103059. [Google Scholar] [CrossRef]
  51. Sigcha, L.; Borzì, L.; Olmo, G. Deep Learning Algorithms for Detecting Freezing of Gait in Parkinson’s Disease: A Cross-Dataset Study. Expert Syst. Appl. 2024, 255, 124522. [Google Scholar] [CrossRef]
  52. Yang, J.; Williams, S.; Hogg, D.C.; Alty, J.E.; Relton, S.D. Deep Learning of Parkinson’s Movement from Video, without Human-Defined Measures. J. Neurol. Sci. 2024, 463, 123089. [Google Scholar] [CrossRef]
  53. Borzì, L.; Sigcha, L.; Rodríguez-Martín, D.; Olmo, G. Real-Time Detection of Freezing of Gait in Parkinson’s Disease Using Multi-Head Convolutional Neural Networks and a Single Inertial Sensor. Artif. Intell. Med. 2023, 135, 102459. [Google Scholar] [CrossRef]
  54. Pooja, N.; Veer, K.; Pahuja, S. Gender-Based Assessment of Gait Rhythms during Dual-Task in Parkinson’s Disease and Its Early Detection. Biomed. Signal Process. Control 2022, 72, 103346. [Google Scholar] [CrossRef]
  55. Jiang, X.; Yu, H.; Yang, J.; Liu, X.; Li, Z. A New Network Structure for Parkinson’s Handwriting Image Recognition. Med. Eng. Phys. 2025, 139, 104333. [Google Scholar] [CrossRef]
  56. Shastry, K.A. Deep Learning-Based Diagnostic Model for Parkinson’s Disease Using Handwritten Spiral and Wave Images. Curr. Med. Sci. 2025, 45, 206–230. [Google Scholar] [CrossRef]
  57. Zhang, Y.; Lin, H.; Xie, X.; Peng, P.; Chen, T.; Zhao, Z.; Wen, Y.; Hong, W. A Novel Approach for Handwriting Recognition in Parkinson’s Disease by Combining Flexible Sensing with Deep Learning Technologies. Sens. Actuators A Phys. 2025, 385, 116287. [Google Scholar] [CrossRef]
  58. Pragadeeswaran, S.; Kannimuthu, S. Cosine Deep Convolutional Neural Network for Parkinson’s Disease Detection and Severity Level Classification Using Hand Drawing Spiral Image in IoT Platform. Biomed. Signal Process. Control 2024, 94, 106220. [Google Scholar] [CrossRef]
  59. Varalakshmi, P.; Priya, B.T.; Rithiga, B.A.; Bhuvaneaswari, R.; Sundar, R.S.J. Diagnosis of Parkinson’s Disease from Hand Drawing Utilizing Hybrid Models. Park. Relat. Disord. 2022, 105, 24–31. [Google Scholar] [CrossRef] [PubMed]
  60. Deharab, E.D.; Ghaderyan, P. Graphical Representation and Variability Quantification of Handwriting Signals: New Tools for Parkinson’s Disease Detection. J. Appl. Biomed. Eng. 2022, 42, 158–172. [Google Scholar] [CrossRef]
  61. Pepa, L.; Spalazzi, L.; Ceravolo, M.G.; Capecci, M. Supervised Learning for Automatic Emotion Recognition in Parkinson’s Disease through Smartwatch Signals. Expert Syst. Appl. 2024, 249, 123474. [Google Scholar] [CrossRef]
  62. Gonçalves, H.R.; Santos, C.P. Deep Learning Model for Doors Detection: A Contribution for Context-Awareness Recognition of Patients with Parkinson’s Disease. Expert Syst. Appl. 2023, 212, 118712. [Google Scholar] [CrossRef]
  63. Oliveira, G.C.; Ngo, Q.C.; Passos, L.A.; Papa, J.P.; Jodas, D.S.; Kumar, D. Tabular Data Augmentation for Video-Based Detection of Hypomimia in Parkinson’s Disease. Comput. Methods Programs Biomed. 2023, 240, 107713. [Google Scholar] [CrossRef]
  64. Ghayvat, H.; Awais, M.; Geddam, R.; Quasim, M.T.; Khowaja, S.A.; Dev, K. AiCareGaitRehabilitation: Multi-Modalities Sensor Data Fusion for AI IoT Enabled Real-Time Electrical Stimulation Device for Pre-Fog and Post-Fog in Parkinson’s Disease. Inf. Fusion 2025, 122, 103155. [Google Scholar] [CrossRef]
  65. Vatsavai, D.; Iyer, A.; Nair, A.A. A Quantum Inspired Machine Learning Approach for Multimodal Parkinson’s Disease Screening. Sci. Rep. 2025, 15, 11660. [Google Scholar] [CrossRef]
  66. Meng, J.; Huo, X.; Zhao, H.; Zhang, G.; Zhang, L.; Wang, X.; Zhou, S. Multi-Modal Biological Feature Selection for Parkinson’s Disease Staging Based on Binary PSO with Broad Learning. Biomed. Signal Process. Control 2024, 94, 106234. [Google Scholar] [CrossRef]
  67. Lv, C.; Fan, L.; Li, H.; Ma, J.; Jiang, W.; Ma, X. Leveraging Multimodal Deep Learning Framework and a Comprehensive Audio-Visual Dataset to Advance Parkinson’s Detection. Biomed. Signal Process. Control 2024, 95, 106480. [Google Scholar] [CrossRef]
  68. Loo, R.T.J.; Tsurkalenko, O.; Klucken, J.; Mangone, G.; Khoury, F.; Vidailhet, M.; Zelimkhanov, G. Levodopa-Induced Dyskinesia in Parkinson’s Disease: Insights from Cross-Cohort Prognostic Analysis Using Machine Learning. Park. Relat. Disord. 2024, 126, 107054. [Google Scholar] [CrossRef]
  69. Mahesh, T.R.; Bhardwaj, R.; Khan, S.B.; Alkhaldi, N.A.; Victor, N.; Verma, A. An Artificial Intelligence-Based Decision Support System for Early and Accurate Diagnosis of Parkinson’s Disease. Decis. Anal. J. 2024, 10, 100381. [Google Scholar] [CrossRef]
  70. Junaid, M.; Ali, S.; Eid, F.; El-Sappagh, S.; Abuhmed, T. Explainable Machine Learning Models Based on Multimodal Time-Series Data for the Early Detection of Parkinson’s Disease. Comput. Methods Programs Biomed. 2023, 234, 107495. [Google Scholar] [CrossRef] [PubMed]
  71. Shastry, K.A. An Ensemble Nearest Neighbor Boosting Technique for Prediction of Parkinson’s Disease. Healthc. Anal. 2023, 3, 100181. [Google Scholar] [CrossRef]
  72. Reddy, S.; Giri, D.; Patel, R. Artificial intelligence diagnosis of Parkinson’s disease from MRI scans. Cureus 2024, 16, e58841. [Google Scholar] [CrossRef]
  73. Anyfantakis, G.; Manouvelou, S.; Koutoulidis, V.; Velonakis, G.; Scarmeas, N.; Papageorgiou, S.G. Can progressive supranuclear palsy be accurately identified via MRI with the use of visual rating scales and signs? Biomedicines 2025, 13, 1009. [Google Scholar] [CrossRef] [PubMed]
  74. Bakry, S.A.; Mahmoud, N.M. Automated early prediction of Parkinson’s disease based on artificial intelligent techniques. Arab. J. Sci. Eng. 2025, 1–17. [Google Scholar] [CrossRef]
  75. Jadhwani, P.L.; Harjpal, P. A Review of Artificial Intelligence-Based Gait Evaluation and Rehabilitation in Parkinson’s Disease. Cureus 2023, 15, e47118. [Google Scholar] [CrossRef]
  76. Beheshti, I.; Sone, D.; Yao, Z.; Maikusa, N. Editorial: State-of-the-Art Artificial Intelligence Methods in Neurodegeneration. Front. Neurol. 2023, 13, 1112639. [Google Scholar] [CrossRef]
  77. Gerke, S.; Minssen, T.; Cohen, G. Ethical and Legal Challenges of Artificial Intelligence-Driven Healthcare. In Elsevier eBooks; Academic Press: Cambridge, MA, USA, 2020; pp. 295–336. [Google Scholar] [CrossRef]
  78. Termine, A.; Fabrizio, C.; Strafella, C.; Caputo, V.; Petrosini, L.; Caltagirone, C.; Cascella, R. Multi-Layer Picture of Neurodegenerative Diseases: Lessons from the Use of Big Data through Artificial Intelligence. J. Pers. Med. 2021, 11, 280. [Google Scholar] [CrossRef]
  79. Dabbas, H.A.; Aladwan, I.M.; Agarwal, A.; Ilunga, M.; Badran, O.; Ikhries, I.I. Comparative investigation of mechanical characteristics and microstructure in maraging steel fabricated via DMLS and CNC techniques. Int. J. Comput. Methods Exp. Meas. 2025, 13, 53–60. [Google Scholar] [CrossRef]
  80. Singh, N.; Rana, A.; Singh, R.; Dumka, A.; Priyadarshi, N.; Twala, B. Artificial Intelligence Techniques for Parkinson’s Disease: Recent Advancements, Ethical Concerns, and Future Directions. Neurosci. Biobehav. Rev. 2024, 158, 105568. [Google Scholar]
Figure 1. Cardinal (primary) symptoms of Parkinson’s disease.
Figure 2. AI as a broad field consisting of ML and DL.
Figure 3. EEG signal categorization by frequency sub-bands [22] [copyright permission is not required as per the journal policy: open access journal].
Figure 4. General machine learning and deep learning workflow for Parkinson’s disease diagnosis using multimodal data.
Table 1. Overview of data collection methods, dataset issues, study objectives, and reported limitations in PD research.
Authors | Data Collection and Its Associated Issues | Study Conducted | Outcomes and Limitations
Sivaranjini & Sujatha (2020) [23]
  • Highlighted that Parkinson’s disease datasets collected across multiple modalities (speech, MRI, gait signals, handwriting, and EEG) exhibit significant variability. Identified inconsistencies due to heterogeneous data acquisition devices, patient demographics, disease severity levels, and multi-institutional collection practices. Most publicly available datasets were reported to be small in size and class-imbalanced, negatively impacting model robustness and generalization.
  • Conducted a comprehensive review of machine learning-based Parkinson’s disease diagnostic frameworks utilizing multimodal data sources such as clinical assessments, speech features, neuroimaging, and wearable sensor data.
  • Demonstrated that diagnostic performance varies widely across datasets due to noise, heterogeneity, and non-standardized labeling procedures. Emphasized the need for data harmonization, standardized acquisition protocols, and large-scale collaborative datasets. Limitation: absence of a unified, standardized Parkinson’s disease dataset across centers.
Dennis & Strafella (2024) [24]
  • Emphasized high dimensionality of MRI and multimodal datasets, where voxel-wise and spectroscopic features far exceed sample size. Discussed label reliability issues, especially due to ambiguous early PD diagnosis and inter-rater scoring differences.
  • Comprehensive review of AI methods in PD and atypical parkinsonism, focusing on multimodal datasets (MRI, PET, clinical scores).
  • Reported that high-dimensional domains require feature reduction and robust annotation, otherwise models overfit. Limitations include lack of pathology-confirmed labels and inconsistent dataset annotation across studies.
Serag et al. (2025) [26]
  • Reported that multimodal data (clinical + imaging + biospecimens) suffer from missing modalities, incomplete labels, and cross-site variability. Highlighted difficulty in integrating heterogeneous data for prodromal PD.
  • Scoping review on multimodal diagnostic tools for prodromal PD detection.
  • Found that variability and missing data reduce model robustness. Limitations include lack of standardized multimodal acquisition pipelines and inconsistent labeling between centers.
Ameli et al. (2024) [27]
  • Showed that genomic datasets suffer from population variability, platform-dependent noise, and dimensionality issues (thousands of SNPs with small sample sizes). Demonstrated that dataset differences obstruct reproducibility.
  • ML-based biomarker reproducibility study using multi-cohort genomic datasets (dbGaP).
  • Demonstrated improved reproducibility when datasets are integrated. Limitations include variation in genotyping platforms, demographic imbalance, and lack of harmonized data labeling standards.
Rana et al. (2022) [28]
  • Discussed heterogeneous clinical datasets with inconsistent annotations, variable UPDRS scoring, and uneven distribution across PD stages. Identified dimensionality issues in sensor-based and signal-based features.
  • Review on ML challenges and recommendations for PD detection (speech, clinical, wearable, imaging datasets).
  • Concluded that label noise and dataset heterogeneity limit clinical applicability. Limitations include scarcity of large annotated datasets and absence of unified labeling protocols.
Aggarwal et al. (2023) [29]
  • Highlighted MRI dataset challenges—scanner variability, voxel dimensionality, and inconsistent structural labels. Pointed out lack of pathology-proven ground truth in most PD imaging studies.
  • Systematic review on AI and neuroimaging for PD detection.
  • Found that performance varies due to dimensionality, data imbalance, and label uncertainty. Limitations include small sample sizes and inconsistent imaging parameters across studies.
Table 2. Summary of classification and prediction techniques for Parkinson’s disease diagnosis using imaging data.
Authors | Modality | Classification/Prediction Techniques Used | Feature Extraction Method and Dataset Used | Study Conducted | Study Outcomes and Limitations (If Any)
Volkmann et al. (2025) [30] | MRI (DTI, T1)
  • Random Forest
  • SVM
  • Decision Tree
  • MLP
  • Deep Neural Network (TensorFlow)
  • Permutation Importance
  • DTI tract-wise FA statistics
  • T1-weighted corpus callosum texture metrics (entropy, homogeneity)
  • Dataset A: 74 PSP, 63 controls (3.0 T, multi-center)
  • Dataset B: 66 PSP, 66 PD, 44 controls (1.5 T, single-site)
  • Differential diagnosis—classification of PSP vs. controls and PSP vs. PD; feature-importance analysis to identify key brain regions
  • DL model: 95% accuracy (PSP vs. controls), 86% (PSP vs. PD)
  • Key discriminative regions: midbrain tegmentum, prefrontal WM,
  • DL > ML for PSP vs. PD
  • Limitations: scanner variation (1.5 T vs. 3.0 T), limited PD data at 3.0 T, small cohorts, overfitting risk, and lack of post-mortem confirmation
Islam et al. (2024) [31] | MRI + clinical
  • SVM, AdaBoost, GBDT, KNN, Extra Trees, LR, Decision Tree, RF, MLP
  • CNN, ANN (Transfer Learning)
  • Clinical dataset (2010–2023)
  • MRI dataset (PPMI, 7500 images → 2500 usable)
  • Feature selection: Select K Best, Variance Threshold, Tree-based, Forward Selection
  • Multimodal PD detection using (1) clinical records and (2) MRI-based transfer learning models; comparison of sampling/imbalance strategies
  • Extra Trees + tree-based selection: 98.44% accuracy (clinical data)
  • DenseNet169 + CNN: 85.08% accuracy (MRI)
  • SMOTE-ENN improved recall
  • LIME/SHAP used for explainability
  • Limitation: severe class imbalance, MRI quality reduced dataset from 7500 to 2500
Yuan & He (2024) [32] | fMRI
  • CNN initialized with RBM
  • fMRI from 100 PD patients
  • Brain network analysis via GRETNA
  • Functional network classification—identification of PD subgroups based on functional connectivity patterns
  • CNN accurately classified functional networks
  • Significant differences found in DC values between groups
  • EODWO showed no major effect on global/local efficiency
Maged et al. (2024) [33]
  • Primary modality: Neural electrophysiological signals
  • Clinical context: Deep Brain Stimulation (DBS) surgery
  • Signal source: Microelectrode recordings from the subthalamic nucleus (STN)
  • Custom CNN with residual + attention modules
  • Transfer learning (ResNet, AlexNet)
  • Traditional ML (SVM, KNN, etc.) for comparison
  • MER recordings from 39 DBS trajectories
  • Extracted time-domain (power, spikes) + time–frequency (scalograms, spectrograms) features
  • Signal-based STN localization during DBS using MER-derived deep learning features
  • Accuracy 97.42%, AUC 0.9715
  • Spectrogram features best overall
  • Limitations: small dataset, narrow PD population, requires validation in diverse clinical settings
Pahuja & Prasad (2022) [34] | MRI + SPECT + CSF
  • CNN, SSAE, AE, Softmax classifier
  • Two frameworks: Feature-level (FL) and Modal-level (ML)
  • VBM for MRI
  • SPM8 preprocessing
  • ReliefF feature reduction
  • SPECT SBR values + CSF markers
  • Dataset: PPMI (73 PD, 59 controls)
  • Multimodal PD classification combining MRI, SPECT, and CSF biomarkers using FL and ML fusion frameworks
  • CNN accuracy: 93.33% (FL), 92.38% (ML)
  • Multimodal fusion improved diagnostic reliability
  • Limitations: dataset imbalance, high computational complexity, multimodal integration challenges
Abbreviations: PD: Parkinson’s Disease; ML: Machine Learning; DL: Deep Learning; MRI: Magnetic Resonance Imaging; DTI: Diffusion Tensor Imaging; fMRI: Functional MRI; SPECT: Single Photon Emission Computed Tomography; CSF: Cerebrospinal Fluid; PSP: Progressive Supranuclear Palsy; FA: Fractional Anisotropy; RF: Random Forest; SVM: Support Vector Machine; CNN: Convolutional Neural Network; ANN: Artificial Neural Network; DBS: Deep Brain Stimulation; MER: Microelectrode Recording; STN: Subthalamic Nucleus; PPMI: Parkinson’s Progression Markers Initiative.
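The feature-level (FL) fusion strategy summarized in Table 2 amounts to concatenating per-modality feature vectors before a single classifier sees them. The sketch below is a hypothetical minimal illustration of that idea: it uses a nearest-centroid stand-in classifier rather than the CNN/SSAE models of the cited work, and all function names and toy values are assumptions for illustration only.

```python
import math

def concat_features(*modalities):
    """Feature-level (FL) fusion: concatenate per-modality feature vectors
    (e.g., MRI VBM features + SPECT SBR values + CSF markers) into one
    vector before classification."""
    return [value for vector in modalities for value in vector]

def nearest_centroid_fit(X, y):
    """Compute per-class mean vectors; a tiny stand-in for a learned
    classifier operating on the fused features."""
    centroids = {}
    for label in set(y):
        rows = [x for x, yy in zip(X, y) if yy == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def nearest_centroid_predict(centroids, x):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))

# Hypothetical fused training samples (already concatenated) and labels.
X_train = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
y_train = [0, 0, 1, 1]
model = nearest_centroid_fit(X_train, y_train)
```

Modal-level (ML) fusion, by contrast, would train one such classifier per modality and combine their outputs (e.g., by voting or a second-stage model).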
Table 3. Summary of studies which use recent advances in ML/DL techniques for PD diagnosis using EEG signals and brainwave analysis.
Authors | EEG Features | ML/DL Techniques Used | Feature Extraction Method and Dataset Used | Study Conducted | Study Outcomes and Limitations (If Any)
Bera et al. (2025) [35]
  • PSD (δ–γ bands)
  • Support Vector Machine (SVM)
  • Convolutional Neural Network (CNN)
  • Feature Extraction: Power Spectral Density (PSD) using Welch’s method
  • EEG Frequency Bands: Alpha (α), Beta (β), Gamma (γ), Theta (θ), Delta (δ)
  • Datasets:
    • UC San Diego Resting State EEG dataset
    • IOWA EEG dataset
  • Compared ML and DL methods for PD detection from EEG signals
  • SVM trained on band-specific PSD features
  • CNN used stacked band-wise features in subject-dependent and independent settings
  • CNN achieved 96.7% (UCSD) and 99.3% (IOWA) accuracy, outperforming SVM
  • SVM accuracy: up to 94% in subject-dependent, 68% in subject-independent cases
  • CNN requires large datasets for generalization
  • Subject-independent accuracy remains a challenge due to limited data
Zhang et al. (2022) [36]
  • WPT, TQWT
  • Tunable Q-Factor Wavelet Transform (TQWT) and Wavelet Packet Transform (WPT) mainly along with Deep Residual Shrinkage Network (DRSN) and CNN
  • DRSN combined with time–frequency features used as primary classification method
  • TQWT and WPT used to extract time–frequency features from EEG signals
  • Signals classified into different sleep stages (e.g., REM, N1, N2).
  • Data from 44 clinical sleep EEG datasets at Shaanxi Provincial People’s Hospital, including PD, RBD, PD with RBD, and control groups undergoing overnight EEG monitoring
  • Energy distribution across wavelet sub-bands used to differentiate PD, RBD, and control subjects
  • Study aimed to classify patients into 2-class, 3-class, and 4-class categories based on EEG signals, identifying differences between PD, RBD, PD with RBD, and healthy individuals.
  • Pre-processing involved noise reduction using wavelet-based denoising methods
  • Accuracy: 99.92% for 2-class classification, 97.81% for 3-class, and 92.59% for 4-class using WPT-DRSN
  • Demonstrated high classification accuracy for early detection of PD and differentiation from RBD
  • WPT-DRSN performed better than TQWT-DRSN across multiple tasks
Ezazi & Ghaderyan (2022) [37]
  • FSST, GLCM, LBP
  • Short-Time Fourier Transform-based Synchrosqueezing Transform (FSST), Gray Level Co-occurrence Matrix (GLCM), Local Binary Pattern (LBP)
  • Sparse Non-negative Least-Squares (SNNLS) coding classifier used as classifier
  • Feature extraction used FSST for T-F representation, GLCM-based statistics, and LBP for EEG features
  • Data analysis used FSST for T-F enhancement, GLCM, and LBP for feature extraction, and statistical analysis
  • EEG data from PD patients and healthy controls during a reinforcement-learning task used as dataset
  • EEG signals were preprocessed and textural features were extracted from FSST-enhanced T-F representation
  • Classification performed using SNNLS
  • Highest accuracy of 100% using GLCM-based energy feature
  • Robust against noise, individual differences, and medication states
  • Limitation: variability in EEG data quality due to noise and artifacts
Guo et al. (2022) [38]
  • Wavelet EEG + IMU
  • Wavelet Transformation (EEG), Inertial Features (Acceleration)
  • LSTM used for pseudo-EEG feature extraction
  • SVM (RBF kernel) used as classifier
  • EEG features extracted: wavelet energies (δ, θ, α, β sub-bands) and Relative Wavelet Entropy (RWE)
  • Inertial features are Freezing Index (FI), Sample Entropy (SE), Energy Index (EI), and Standard Deviation (STD)
  • Self-collected Freezing of Gait (FoG) dataset (12 PD patients, 8 valid) was used
  • LSTM-PM model used to represent pseudo-EEG features from accelerations.
  • Compared performance of different feature combinations (ACC, EEG, EEG-ACC, pmEEG, pmEEG-ACC)
  • Data analysis was performed by feature comparison and performance evaluation using metrics (accuracy, sensitivity, F1-score, geometric mean)
  • Subject-dependent: pmEEG-ACC achieved highest accuracy (93.6%), sensitivity (88.5%), and F1-score (89.4%)
  • Cross-subject: pmEEG-ACC comparable to EEG-ACC, with high sensitivity (89.7%) and geometric mean (91.0%)
  • Pseudo-multimodal features improve FoG detection accuracy
Abbreviations: EEG: Electroencephalography; PSD: Power Spectral Density; δ, θ, α, β, γ: Delta, Theta, Alpha, Beta, and Gamma frequency bands; SVM: Support Vector Machine; CNN: Convolutional Neural Network; WPT: Wavelet Packet Transform; TQWT: Tunable Q-Factor Wavelet Transform; DRSN: Deep Residual Shrinkage Network; FSST: Fourier-based Synchrosqueezing Transform; T-F: Time–Frequency; GLCM: Gray Level Co-occurrence Matrix; LBP: Local Binary Pattern; SNNLS: Sparse Non-negative Least Squares; IMU: Inertial Measurement Unit; LSTM: Long Short-Term Memory network; RBF: Radial Basis Function; FoG: Freezing of Gait; RWE: Relative Wavelet Entropy; FI: Freezing Index; SE: Sample Entropy; EI: Energy Index; STD: Standard Deviation; ACC: Accelerometer; pmEEG: Pseudo-multimodal EEG features.
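Several of the EEG studies above reduce the raw signal to band-limited power features (the δ, θ, α, and β sub-bands listed in the abbreviations). A minimal illustrative sketch of such a feature extractor is shown below; it uses a Welch PSD estimate on synthetic data rather than any specific study's wavelet pipeline, and the band edges are conventional values, not taken from a cited paper.

```python
import numpy as np
from scipy.signal import welch

# Conventional EEG band edges in Hz (an assumption, not from a cited study)
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

def relative_band_powers(eeg, fs):
    """Fraction of total spectral power falling in each EEG band."""
    freqs, psd = welch(eeg, fs=fs, nperseg=min(len(eeg), 2 * fs))
    total = psd.sum()
    return {name: psd[(freqs >= lo) & (freqs < hi)].sum() / total
            for name, (lo, hi) in BANDS.items()}

# Synthetic demo: an alpha-dominated (10 Hz) signal plus noise
fs = 256
rng = np.random.default_rng(0)
t = np.arange(0, 4, 1 / fs)
sig = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)
feats = relative_band_powers(sig, fs)
```

Feature vectors of this form (one relative power per band, per channel) are what the tabulated classifiers (SVM, CNN, DRSN) then consume.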
Table 4. Summary of studies which use recent advances in ML/DL techniques for PD diagnosis using voice, speech, and acoustic features.
Authors | ML/DL Techniques Used | Feature Extraction Method and Dataset Used | Study Conducted | Study Outcomes and Limitations (If Any)
Yang et al. (2025) [39]
  • XGBoost
  • LightGBM
  • MLP
  • GBDT
  • SVM
  • Random Forest
  • Logistic Regression
  • AdaBoost
  • Decision Tree
  • KNN
  • Naive Bayes
  • Stacking
  • Dataset: UCI PD Classification Dataset (756 voice recordings; 188 PD, 64 controls)
  • Features: 754 per sample
    • MFCCs
    • TWQT
    • Wavelet-based
    • Vocal Fold
    • Time–Frequency features
  • Developed automated AI pipeline for PD classification from voice data
  • Compared 4 aggregation strategies:
    • Post-mean
    • Post-max
    • Post-min
    • Pre-mean
  • Post-mean + XGBoost performed best:
    • Accuracy: 0.880
    • F1-score: 0.922
    • MCC: 0.672
  • Post-min favored specificity
  • Post-max favored sensitivity
  • Pre-aggregation methods showed no clear benefit
  • Limitations:
    • Inter-recording variance lost in some strategies
    • Dataset imbalance (handled via BOS)
    • Generalization may vary with class distribution
Islam & Tarique (2025) [40]
  • SVM and kNN
  • Feature Extraction: Baseline features, intensities, formant frequencies, bandwidths, vocal fold parameters, MFCCs, Discrete Wavelet Transform (DWT), Tunable Q-Wavelet Transform (TQWT)
  • Dataset: UCI PD speech dataset with 188 PD patients and 64 healthy controls
  • Preprocessing: PCA used for dimensionality reduction of feature vectors
  • Developed two sets of feature vectors:
    • Feature Vector I (without wavelet-based features)
    • Feature Vector II (with wavelet and TQWT features)
  • Evaluated performance of SVM and kNN on both vectors
  • Performance metrics: Accuracy, Precision, Recall, F1 Score, Specificity, AUC, etc.
  • Feature Vector II improved accuracy and specificity in both classifiers
  • kNN outperformed SVM in most performance metrics (Accuracy: 93.52%, AUC: 0.92)
  • Wavelet-based features helped reduce misclassification and improved detection, especially of control samples
  • Limitation: Slight increase in training time and decrease in prediction speed when wavelet features are included.
Al-Najjar et al. (2024) [41]
  • SVM
  • Neural Network
  • Quest
  • CHAID
  • CR-tree
  • Logistic Regression
  • Feature selection using 6 algorithms: Ranker, Greedy, BestFirst, Whale Optimization, Grey Wolf Optimization (GWO), and hybrid GWO-Whale
  • Dataset: UCI PD dataset (195 audio samples; 31 subjects, 23 with PD)
  • Features from biomedical voice signals
  • Developed a hybrid optimization algorithm (Whale + GWO) for feature selection
  • Built multiple ML models using selected features
  • Compared models using various performance metrics (accuracy, precision, recall, F1, FOR)
  • Evaluated six feature selection methods across six classifiers
  • Best accuracy: 95% using CR-tree with proposed hybrid optimizer
  • Neural network + hybrid GWO achieved highest recall and F1 in training (both 1.00)
  • Proposed model reduced feature count to 11 while maintaining high accuracy
  • Most influential features: spread1, spread2, Speed1, DFA, PPE
  • Limitations: small dataset (only 31 subjects), model complexity increased due to hybrid optimizers
Singh & Tripathi (2024) [42]
  • Ensemble voting classifier combining KNN, RF, DT, SVM, Bagging, MLP, Gradient Boosting, XGBoost
  • EFSA for feature selection
  • UCI voice datasets (D1, D2, D3)
  • SMOTE used for data balancing
  • Preprocessing: Handled missing values, normalization, SMOTE
  • Feature selection via EFSA
  • Model training and hyperparameter tuning (GridSearchCV)
  • Cross-validation (5-fold, 10-fold)
  • Ensemble classifier outperformed individual classifiers
  • EFSA improved performance, highest accuracy: Dataset-I (97.6%)
  • Limitations: Small datasets, lack of model interpretability, scope for advanced feature engineering
Zhang et al. (2023) [43]
  • Fractional Fourier Transform (FrFT) for time–frequency representation
  • Fractional Attribute Topology (FrAT) for feature extraction
  • Classifiers: Support Vector Machine (SVM), Random Forest (RF), Logistic Regression (LR), Multilayer Perceptron (MLP)
  • FrFT used to obtain spectrograms at different orders
  • Energy variation information mapped to a formal context, from which the FrAT is generated
  • Connected Component Features of FrAT (CCF-FrAT) represent discrete degree of topology and are fed into classifiers.
  • Datasets used:
    • Database-1: Turkish voice dataset (20 Parkinson’s patients, 20 healthy controls)
    • Database-2: Chinese voice dataset (38 Parkinson’s patients, 40 healthy controls)
    • Database-3 (PC-GITA): Spanish voice dataset (50 Parkinson’s patients, 50 healthy controls)
  • Voice signals were used for FrFT to capture time–frequency characteristics
  • Energy variation in spectrograms used to generate fractional attribute topology (FrAT)
  • Features extracted and fed into machine learning classifiers
  • Data analysis methods included k-fold cross-validation and Leave-One-Speaker-Out (LOSO) validation
  • Achieved high classification accuracies: 99.57% (Database-1), 95.33% (Database-2), 94.13% (Database-3)
  • Proved effective in different language datasets for Parkinson’s disease detection
  • Language differences may lead to varying results due to different vocal habits
  • High dimensionality of feature extraction methods increases complexity
Guatelli et al. (2023) [44]
  • Convolutional Neural Networks (CNN), Extreme Learning Machines (ELM), and Transfer Learning are the algorithms used
  • Features extracted by Spectrograms using Short-Time Fourier Transform (STFT)
  • Dataset used was 55 PD patients, 64 healthy controls (voice recordings of sustained vowel /a/)
  • Multiple CNN architectures (AlexNet, VGG-16, ResNet-50, etc.) compared for accuracy and training time
  • High accuracy (up to 100%) with reduced training time using hybrid CNN-ELM models
  • Limitations were limited sample size and reliance on voice recordings only
Hireš et al. (2023) [45]
  • CNN (Xception architecture)
  • XGBoost for shallow learning
  • Classifiers used were CNN (Xception) and XGBoost
  • Long-term features extracted by formants, shimmers, jitters, and spectral coefficients
  • Short-term features extracted by zero crossing rate, MFCC, spectral entropy, etc.
  • Log-spectrogram conversion performed for CNN
  • 4 datasets focusing on vowel /a/ phonation were used (CzechPD, PC-GITA, ITA, RMIT-PD)
  • Cross-dataset validation for generalization
  • 10-fold cross-validation
  • t-SNE for dataset differentiation
  • Single-dataset performance: 86–96% accuracy
  • Cross-dataset performance: Poor generalization (accuracy drops to 33–72%)
  • However, models overfit to dataset-specific characteristics, limiting real-world generalizability
Skibińska & Hosek (2023) [46]
  • XGBoost algorithm for classification
  • Acoustic analysis (phonation, articulation, prosody) and facial landmark movement
  • 73 PD patients, 46 healthy controls (43 speech exercises, including tongue twisters)
  • Fusion of audio and video modalities
  • Evaluated speech exercises for PD detection with statistical modeling
  • 83% accuracy (audio + video fusion), highest performance with tongue twisters
  • However, limited to specific motor symptoms (hypomimia and dysarthria)
Hawi et al. (2022) [47]
  • Random Forest (RF) model
  • K-nearest neighbours (KNN)
  • Multi-layer perceptron (MLP)
  • SVM with radial basis function (RBF) kernel
  • Classifiers used were Random Forest, SVM-RBF, and KNN
  • Long-term features: Jitter, shimmer, formant frequencies (F1, F2), intensity parameters
  • Short-term features: Mel frequency cepstral coefficients (MFCC)
  • Open-source dataset from the UCI Machine Learning Repository with 756 voice samples (564 from PD patients, 192 from healthy controls) was used
  • Study assessed combination of long-term acoustic features and MFCC for PD detection using Random Forest
  • For data analysis, feature selection performed via backward stepwise selection; 5-fold cross-validation was used to validate models
  • Combined features of MFCC and long-term features achieved accuracy of 88.84%
  • Independent sets (MFCC only, long-term only) showed lower accuracy (~84%)
  • Sensitivity: 98.51%, Specificity: 71.08% for the combined feature set
Hireš et al. (2022) [48]
  • CNN ensemble with multiple fine-tuning (MFT)
  • Classifiers used were ResNet50 and Xception
  • Conversion of voice recordings to spectrograms performed using Short-Time Fourier Transform (STFT)
  • Gaussian blurring applied to spectrograms
  • Dataset used was PC-GITA (100 Spanish-speaking subjects, 50 PD patients, 50 controls)
  • Additional fine-tuning performed with SVD and vowel datasets
  • CNN ensemble was developed to differentiate PD voices from healthy ones using vowel utterances
  • Data analysis methods were 10-fold cross-validation, majority voting ensemble
  • Best accuracy: 99% (vowel /a/); Sensitivity: 86.2%, Specificity: 93.3%, AUC: 89.6%
  • However, medication effects on voice were not examined
Abbreviations: SVM: Support Vector Machine; KNN: k-Nearest Neighbour; RF: Random Forest; DT: Decision Tree; LR: Logistic Regression; MLP: Multilayer Perceptron; GBDT: Gradient Boosting Decision Tree; XGBoost: Extreme Gradient Boosting; LightGBM: Light Gradient Boosting Machine; AdaBoost: Adaptive Boosting; NB: Naive Bayes; ELM: Extreme Learning Machine; MFCC: Mel-Frequency Cepstral Coefficients; DWT: Discrete Wavelet Transform; TQWT: Tunable Q-Wavelet Transform; TWQT: Tunable Wavelet Q-Transform; FrFT: Fractional Fourier Transform; FrAT: Fractional Attribute Topology; CCF-FrAT: Connected Component Features of Fractional Attribute Topology; STFT: Short-Time Fourier Transform; PCA: Principal Component Analysis; SMOTE: Synthetic Minority Over-sampling Technique; EFSA: Enhanced Firefly Search Algorithm; BOS: Borderline Oversampling; GWO: Grey Wolf Optimization; AUC: Area Under the Curve; MCC: Matthews Correlation Coefficient; FOR: False Omission Rate; t-SNE: t-Distributed Stochastic Neighbor Embedding; LOSO: Leave-One-Speaker-Out validation; MFT: Multiple Fine-Tuning; SVD: Singular Value Decomposition.
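A recurring pattern in the voice studies above (e.g., Islam & Tarique) is to compress a high-dimensional acoustic feature vector with PCA before a shallow classifier such as kNN. The sketch below illustrates that pipeline on synthetic data standing in for the UCI speech features; the dimensions, class weights, and hyperparameters are illustrative assumptions, not values from the cited papers.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in for a high-dimensional voice feature matrix (MFCC/TQWT-style
# features); the real studies use the UCI PD speech dataset
# (756 recordings, 188 PD / 64 control subjects, 754 features).
X, y = make_classification(n_samples=756, n_features=200, n_informative=30,
                           weights=[0.25, 0.75], random_state=0)

# Scale, reduce dimensionality with PCA, then classify with kNN
pipe = make_pipeline(StandardScaler(),
                     PCA(n_components=30),
                     KNeighborsClassifier(n_neighbors=5))
scores = cross_val_score(pipe, X, y, cv=5)
print(f"mean CV accuracy: {scores.mean():.3f}")
```

Keeping the scaler and PCA inside the cross-validated pipeline avoids leaking test-fold statistics into the fitted transform, a common pitfall with small clinical datasets.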
Table 5. Summary of studies which use recent advances in ML/DL techniques for PD diagnosis through motion and gait analysis from wearable sensors or video data.
Authors | Sensors | ML/DL Techniques Used | Feature Extraction Method and Dataset Used | Study Conducted | Study Outcomes and Limitations (If Any)
Guo et al. (2025) [49]
  • Video + LiDAR
  • YOLO (for person detection)
  • deepSORT (tracking)
  • Lightweight Pose Network (2D pose)
  • VideoPose3D (3D pose)
  • Joint Coordinate System (biomarker computation)
  • 2D keypoints from videos
  • 3D pose using depth sensor (LiDAR)
  • Dataset: 72 healthy adults (split for training/testing)
  • Open-source MS-COCO for pretraining
  • Developed 3DGait, a markerless AI-based system for 3D gait analysis using a single iPad Pro and depth camera.
  • Validated against OptiTrack MoCap data.
  • Achieved average MAE of 2.3° and PCC of 0.75 for angular biomarkers.
  • Spatiotemporal errors < 15%
  • Clinically acceptable accuracy with minimal setup
  • Limitations: Small test set (8 adults), limited diversity, validated only on healthy individuals. Further clinical validation required
Sánchez Fernández et al. (2025) [50]
  • IMUs
  • Fuzzy Inference System (FIS)
  • Adaptive Neuro-Fuzzy Inference System (ANFIS)
  • Feature Extraction: Dynamic acceleration-based gait features from 4 inertial sensors (2 wrists + 2 ankles)
  • Features: stride amplitude, foot lift height, arm swing, etc.
  • Dataset: 58 Parkinson’s patients + 15 healthy controls
  • 334 gait evaluations across 1 year
  • Developed fuzzy and neuro-fuzzy models to assess gait impairment
  • Used wearable IMUs to automate the MDS-UPDRS gait assessment
  • System provided a quantitative, interpretable score reflecting PD gait severity
  • Model results compared against real-time expert clinician ratings
  • Outputs closely matched clinician assessments
  • Provided explainable, two-decimal resolution for gait impairment scoring
  • Enabled automatic and granular rating of normal, slight, and mild impairment
  • Limitations:
    • Sample limited to Mexican clinical setting
    • Fewer healthy controls
    • Needs broader validation across demographics and PD subtypes
Sigcha et al. (2024) [51]
  • Accelerometer
  • ML: Random Forest (RF)
  • DL: CNN-MLP, ConvMixer, Wide-CNN
  • Features (mean, std, variance, entropy, energy, freezing index, spectral power sum) extracted from 3-axis accelerometer data
  • Raw signals used for DL
  • Datasets: Rempark (21 subjects), Daphnet (10 subjects), Oday (7 subjects)
  • Single tri-axial accelerometer placed on lower back (standardized across datasets)
  • Evaluated FoG detection using ML and DL models
  • Three validation setups: single-dataset, merged-dataset (all-in-one), and cross-dataset
  • Data resampled to 32 Hz, segmented into 2 s overlapping windows
  • Hyperparameter tuning with Hyperband; subject-independent splits
  • Best single-dataset AUC: Wide-CNN (Rempark: 0.938), CNN-MLP (Daphnet: 0.852), Wide-CNN (Oday: 0.786)
  • Best merged-dataset AUC: Wide-CNN (0.912 on test set)
  • DL models consistently outperformed RF
  • Cross-dataset AUCs dropped significantly (Rempark: 0.829, Daphnet: 0.839, Oday: 0.654)
  • SHAP analysis showed feature importance varied across datasets
  • Limitations: Small datasets (7–21 subjects), sensor and protocol variations, and low generalizability across dataset
Yang et al. (2024) [52]
  • Video
  • 3D Convolutional Neural Network (CNN) for video analysis
  • Collected video recordings of finger tapping
  • Cropped and preprocessed videos for input
  • Developed 3D CNN to differentiate PD from healthy controls based on finger tapping
  • Used batch normalization, dropout, and binary cross-entropy for training
  • Evaluated using accuracy, precision, recall, F1-score, and AUROC
  • Feature visualization through class activation maps
  • The model identified novel finger tapping patterns related to PD
  • Promising results for PD differentiation
  • Early detection potential
  • Limitations: Small dataset and need for expanded validation and generalization
Borzì et al. (2023) [53]
  • IMU (waist)
  • Multi-head Convolutional Neural Networks (CNNs)
  • Convolutional layers with varied kernel sizes
  • Automatic feature extraction from inertial data
  • Dropout and L2 regularization to prevent overfitting
  • Grid-search optimized hyperparameters
  • Classifiers used: CNN with ReLU activation, Max-pooling for dimension reduction, Softmax for final FOG classification, and Adam optimizer for training
  • Features extracted automatically by CNN
  • Inertial signals segmented into 2 s windows with overlap
  • Data from 118 PD patients and 21 healthy subjects
  • 3 datasets: REMPARK, 6MWT, ADL
  • 17+ hours of valid data containing 1110 FOG episodes
  • Single inertial sensor on the waist
  • Real-time FOG detection algorithm
  • 60% of data for training, 20% each for validation and testing
  • Predicted FOG episodes up to 3.1 s in advance
  • Sensitivity: 87.7% (test set)
  • Specificity: 88.3% (test set)
  • Prediction: 50% of FOG episodes detected 3.1 s before onset
  • High specificity (100%) on healthy elderly subjects
  • Low computational complexity, robust across datasets
  • However, accuracy was limited for short FOG episodes (<5 s)
Pooja et al. (2022) [54]
  • Vertical ground reaction force (VGRF)
  • Naive Bayes, SVM, and Neural Networks used for classifying PD patients based on gait parameters during single and dual-task activities
  • Among classifiers, Naive Bayes achieved the highest accuracy
  • Neural Network and SVM were also used, but had lower accuracy
  • Gait parameters (stance time, swing time, stride time) and variability analyzed
  • Data collected via VGRF sensors in shoes during walking
  • PhysioNet dataset with 39 participants (25 PD patients, 14 controls) was used
  • Assessed under single-task (walking) and dual-task (listening to words while walking) conditions
  • Study evaluated the impact of dual-tasking on gait in males and females with PD
  • Gait variability, particularly in stance time, swing time, and stride time, was assessed during single and dual tasks
  • Accuracy via Naïve Bayes: 100% for single-task and 97.22% for dual-task
  • Gait speed decreased by 18.8% in single-task and 24.7% in dual-task for PD patients
  • Variability in stance time was key indicator of PD, with dual-tasking increasing its deviation
Abbreviations: IMU: Inertial Measurement Unit; LiDAR: Light Detection and Ranging; VGRF: Vertical Ground Reaction Force; YOLO: You Only Look Once (object detection); deepSORT: Simple Online and Realtime Tracking with Deep Association Metrics; ANFIS: Adaptive Neuro-Fuzzy Inference System; FIS: Fuzzy Inference System; FoG: Freezing of Gait; MDS-UPDRS: Movement Disorder Society–Unified Parkinson’s Disease Rating Scale; MAE: Mean Absolute Error; PCC: Pearson Correlation Coefficient; AUC/AUROC: Area Under the Receiver Operating Characteristic Curve; STFT: Short-Time Fourier Transform; SHAP: SHapley Additive exPlanations; ADL: Activities of Daily Living; 6MWT: Six-Minute Walk Test; REMPARK: Remote Parkinson’s Disease Monitoring Dataset; ReLU: Rectified Linear Unit; L2: L2 regularization.
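The freezing index (FI) appears as a hand-crafted feature in several of the accelerometer studies above (Sigcha et al., Guo et al.). It is commonly defined as the ratio of spectral power in a "freeze" band to power in a locomotion band, computed over short windows. The sketch below follows that common definition on synthetic signals; the band edges and window length are typical values and should be treated as assumptions rather than the exact settings of any cited study.

```python
import numpy as np

def freezing_index(window, fs):
    """Freezing index: spectral power in the 3-8 Hz 'freeze' band divided by
    power in the 0.5-3 Hz locomotion band (band edges are assumptions)."""
    freqs = np.fft.rfftfreq(len(window), d=1 / fs)
    power = np.abs(np.fft.rfft(window - window.mean())) ** 2
    freeze = power[(freqs >= 3) & (freqs < 8)].sum()
    locomotor = power[(freqs >= 0.5) & (freqs < 3)].sum()
    return freeze / (locomotor + 1e-12)  # guard against division by zero

fs = 32                      # resampling rate reported by Sigcha et al.
t = np.arange(0, 2, 1 / fs)  # 2 s analysis window, as in the table
normal_gait = np.sin(2 * np.pi * 1.5 * t)  # ~1.5 Hz stepping rhythm
fog_like = np.sin(2 * np.pi * 6.0 * t)     # ~6 Hz trembling during freezing
```

A high FI in a window flags likely freezing; in the tabulated studies this feature feeds classifiers such as Random Forests alongside entropy and energy statistics.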
Table 6. Summary of studies which use recent advances in ML/DL techniques for PD diagnosis using handwriting, drawing, or sketch-based features.
Authors | Input Type | ML/DL Techniques Used | Feature Extraction Method and Dataset Used | Study Conducted | Study Outcomes and Limitations (If Any)
Jiang et al. (2025) [55]
  • Spiral images
  • Attention Continuous Convolutional Network (ACC–Net)
  • CBAM (Convolutional Block Attention Module)
  • Compared with: AlexNet, VGG16, Attention-AlexNet
  • Dataset: HandPD dataset (hand-drawn spirals and meanders)
  • 92 subjects: 74 PD, 18 healthy controls
  • Features extracted from static handwriting images
  • Data Augmentation: Gaussian noise, salt-and-pepper noise, mirroring, small-angle rotations (×7 dataset size)
  • Designed ACC–Net with continuous convolutions to avoid detail loss
  • Integrated CBAM attention for region-wise feature enhancement
  • Compared ACC–Net with baseline CNNs
  • Evaluated using 3-fold cross-validation
  • Tasks focused on spiral/meander drawing recognition
  • ACC–Net outperformed all baselines (Accuracy: 96.5%, F1-score: 0.976)
  • Best performance achieved with 4 convolutional layers
  • CBAM significantly improved feature localization and classification
  • Limitations:
    • Dataset imbalanced (80% PD)
    • Static-image-based: lacks dynamic handwriting features
    • Real-world validation not yet conducted
Shastry (2025) [56]
  • Spiral/wave images
  • Deep Neural Network (DNN)
  • Convolutional Neural Network (CNN)
  • DenseNet-201
  • Compared with: RF, LR, SVM, kNN, Decision Tree, Gradient Boost, AdaBoost, Naïve Bayes
  • Dataset: 204 handwritten images (spirals and waves)
  • 102 spiral + 102 wave images
  • From PD patients and healthy controls
  • Preprocessing: Histogram equalization
  • Feature Extraction: HOG (Histogram of Oriented Gradients)
  • Augmentation: Rotation, shift, zoom, flip, brightness adjustment
  • Developed a DNN-based model for PD diagnosis from static handwritten images (spirals and waves)
  • Compared performance against 9 traditional ML models and 2 DL models
  • Evaluated on both spiral and wave datasets separately
  • DNN outperformed all other models:
    • Spiral classification: up to 41.24% better accuracy than top ML models
    • Wave classification: up to 40% improvement
  • Limitations:
    • Small sample size
    • Images only; dynamic handwriting not considered
    • No online/real-time testing performed
Zhang et al. (2025) [57]
  • Pressure sensor
  • ResNet-18 CNN
  • Compared with: SqueezeNet, GoogLeNet
  • 16 × 16 flexible pressure-sensitive sensor array
  • Captured handwriting dynamics: force, velocity, pen-up/down, stroke count, deviation
  • 600 handwriting samples from 28 PD patients and 28 healthy controls (collected at Fujian Provincial Hospital)
  • Designed a flexible sensing array for capturing pressure and motion during handwriting
  • Combined sensor data into image maps and trained ResNet-18 to classify PD and its severity
  • Used kinematic and pressure features as visual encodings in CNN input
  • Compared multiple architectures for accuracy
  • ResNet-18 achieved best results (Accuracy up to 96% with force data)
  • Force data significantly improved model performance
  • Clear separation between PD, ET, mild PD, and asymptomatic subjects
  • Limitations:
    • Small participant group
    • Limited symptom variability
    • Needs further clinical validation for generalization
Pragadeeswaran & Kannimuthu (2024) [58]
  • Spiral images
  • DL: Cosine Deep Convolutional Neural Network (CosineDCNN)
  • Hybrid routing via SCGA (Sine Cosine Geese Migration Algorithm)
  • Hand-drawn spiral images from Parkinson’s Drawings dataset
  • Preprocessing: Adaptive Wiener filter
  • Augmentation: Flipping, erasing, brightness, contrast, and resizing
  • Feature extraction: Texton, ORB, LVP, Statistical (mean, variance, skewness, kurtosis, energy, entropy)
  • IoT-based system simulated for real-time PD detection and severity classification
  • Spiral images transmitted via optimal path using SCGA (based on energy, delay, trust, distance)
  • Features extracted and classified using CosineDCNN (DCNN + Cosine similarity + Fractional Calculus for fusion)
  • Severity levels: minimal, mild, moderate, severe
  • Accuracy: 89.98%, TPR: 89.84%, TNR: 89.77%, PPV: 87.41%, NPV: 87.82%
  • Outperformed ResNet, AlexNet, ISFO-DL, S-DCGAN, DLSTM, and RFE-SVM in both detection and severity classification
  • Routing: Delay as low as 0.283 ms, energy up to 0.507 J
  • Limitations: High model complexity, relies on quality of spiral images, does not evaluate real-time adaptability on mobile devices
Varalakshmi et al. (2022) [59]
  • Spiral images
  • CNN, ResNet50, VGG16, VGG19, and AlexNet
  • Hybrid models combining DL feature extraction (e.g., CNN) with ML classifiers (e.g., SVM, Random Forest, KNN)
  • Classifiers used were ResNet50 + SVM hybrid model, CNN + Random Forest and CNN + LSTM
  • Image was augmented (rotated, resized) due to small dataset
  • Histogram equalization performed for preprocessing
  • CNN used for feature extraction, combined with ML models for classification
  • Dataset used was hand-drawing dataset from Kaggle, with 51 PD and 51 healthy spirals, expanded to 3800 images via augmentation
  • Compared performance of ML, DL, and hybrid models on hand-drawing data
  • Metrics used for analysis were accuracy, precision, recall, specificity, and F1-score
  • ResNet50 + SVM achieved 98.45% accuracy, 99% sensitivity, and 98% specificity
  • Hybrid models showed better classification performance
  • However, the dataset was small and focused only on hand-drawing data, lacking broader symptom capture
Deharab & Ghaderyan (2022) [60]
  • Dynamic handwriting
  • Empirical Mode Decomposition (EMD)
  • SVM used as classifier
  • Handwriting signals (x, y coordinates, pressure) extracted
  • Nonlinear features extracted by AASR, ASODP, EMD (IMFs)
  • Graphical Representations by ASR, SODP
  • PaHaW (PD handwriting) database was used as dataset (35 PD patients, 36 controls)
  • Hilbert transform (ASR), CTM, 95% confidence ellipse (SODP) were used for data analysis
  • Handwriting signals decomposed using EMD; IMFs used for AASR, ASODP, and PD detection via SVM.
  • Evaluated writing tasks, time sequences, and IMFs for performance.
  • Tested feature effectiveness with statistical significance.
  • Sensitivity: 81.67% (ASODP), 63.33% (AASR); Specificity: 73.33% (ASODP), 85.71% (AASR) and Accuracy: 76.36% (ASODP), 69.09% (AASR)
  • Effectively distinguishes PD patients from healthy controls
  • Provides a trade-off between detection performance and computational complexity
Abbreviations: ACC–Net: Attention Continuous Convolutional Network; CBAM: Convolutional Block Attention Module; HOG: Histogram of Oriented Gradients; ORB: Oriented FAST and Rotated BRIEF; LVP: Local Vector Pattern; EMD: Empirical Mode Decomposition; IMF: Intrinsic Mode Function; AASR: Average Absolute Symbolic Representation; ASODP: Average Second-Order Difference Plot; ASR: Analytic Signal Representation; SODP: Second-Order Difference Plot; CTM: Central Tendency Measure; IoT: Internet of Things; SCGA: Sine Cosine Geese Migration Algorithm; TPR: True Positive Rate; TNR: True Negative Rate; PPV: Positive Predictive Value; NPV: Negative Predictive Value; ET: Essential Tremor.
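Two of the handwriting studies above (Pragadeeswaran & Kannimuthu; Deharab & Ghaderyan) extract simple statistical descriptors (mean, variance, skewness, kurtosis, energy, entropy) from dynamic handwriting signals such as pen pressure. The sketch below computes that illustrative feature set on synthetic pressure traces; the histogram-based entropy and the synthetic "tremor" signal are assumptions for demonstration, not the exact formulation of either paper.

```python
import numpy as np
from scipy.stats import kurtosis, skew

def statistical_features(signal, n_bins=16):
    """Statistics of a 1-D handwriting time series (e.g., pen pressure)."""
    hist, _ = np.histogram(signal, bins=n_bins)
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins before computing entropy
    return {
        "mean": float(np.mean(signal)),
        "variance": float(np.var(signal)),
        "skewness": float(skew(signal)),
        "kurtosis": float(kurtosis(signal)),
        "energy": float(np.sum(signal ** 2)),
        "entropy": float(-np.sum(p * np.log2(p))),
    }

# Synthetic pen-pressure traces: tremulous writing shows higher variance
rng = np.random.default_rng(0)
t = np.linspace(0, 5, 500)
steady = 0.8 + 0.01 * rng.standard_normal(t.size)
tremor = 0.8 + 0.3 * np.sin(2 * np.pi * 5 * t) + 0.05 * rng.standard_normal(t.size)
```

Per-channel features of this kind (for x, y, and pressure) are then concatenated into the vectors fed to the SVM and CNN classifiers listed in the table.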
Table 7. Summary of studies which use recent advances in ML/DL techniques for PD diagnosis based on emotional, behavioral, or environmental interaction data.
Authors | ML/DL Techniques Used | Feature Extraction Method and Dataset Used | Study Conducted | Study Outcomes and Limitations (If Any)
Pepa et al. (2024) [61]
  • ML: Random Forest (RF), Decision Tree (DT), Support Vector Machine (SVM), Multilayer Perceptron (MLP)
  • Shapley value-based feature selection
  • Autonomic signals (heart rate, skin temperature, electrodermal activity) collected from Microsoft Band 2 smartwatch
  • Dataset: 11 PwPD and 8 healthy controls watching 12 emotion-eliciting video clips
  • Features: 95 derived from signals; grouped as raw (FG-A), difference from baseline (FG-B), and ratio to baseline (FG-C)
  • Feature sets tested: absolute & normalized, with/without baseline
  • Four classification tasks: arousal and valence prediction for PwPD and controls
  • Each algorithm trained on 4 feature sets; hyperparameter tuning via nested CV
  • Labels from Self-Assessment Manikin (SAM) questionnaire (valence/arousal on 1–9 scale)
  • Recursive feature elimination using Shapley values
  • RF performed best in all tasks
  • Valence (PwPD): RF + normalized all features (Set 2) → 100% accuracy
  • Arousal (PwPD): RF + raw all features (Set 1) → 95.7% accuracy
  • Valence (CG): RF + normalized raw features (Set 4) → 97.4% accuracy
  • Arousal (CG): RF + raw all features (Set 1) → 97.4% accuracy
  • Minimal feature sets (6–12 features) found via SHAP
  • Limitations: Small participant pool (19 total), person-dependent validation (same person in train/test), generalization not guaranteed
Gonçalves & Santos (2023) [62]
  • Transfer learning with MobileNet-SSD pre-trained on COCO dataset
  • TensorFlow Object Detection API
  • Converted to TensorFlow Lite (TF-Lite) for RPi deployment
  • MobileNet-SSD with 3 × 3 convolution layers and bottleneck blocks
  • Data augmentation: brightness, contrast, color, resizing
  • Dataset of 912 labelled images of doors (various sizes, angles, colors) from real environments and Google Images
  • 80/20 train-test split
  • Real-time door detection for Parkinson’s context
  • Training on Google Colab using NVIDIA GPU
  • Evaluated using precision, recall, and F1-score with IoU
  • SSD with MobileNet backbone was used as a classifier
  • Precision: 97.2%, Recall: 78.9%, F1-score: 0.869
  • Time-efficient (~2.87 FPS on RPi)
  • Limitation: Focus is on door detection; future work involves integrating sensory cues and patient testing
Oliveira et al. (2023) [63]
  • Conditional Generative Adversarial Networks (CGAN)
  • Test-Time Augmentation (TTA) to improve model performance
  • Comparison with Bayesian networks and other statistical methods
  • Classifiers used were logistic regression with ADASYN for oversampling and SVM, Random Forest, and K-Neighbours for testing
  • Facial action units (AUs) analyzed using OpenFace
  • Variances of AUs used to detect facial expressions (smiling, disgust, surprise)
  • Public dataset of video recordings from PD patients and healthy individuals with 3 facial expressions (smiling, disgusted, surprised) was used
  • 30 PD, 383 non-PD for training; 7 PD, 97 non-PD for testing
  • CGAN generated synthetic data to balance and augment training sets
  • TTA employed to increase sensitivity and reduce false positives
  • Tested on unbalanced data, simulating real-world prevalence of PD (~7%)
  • Accuracy: 83% (CGAN + TTA)
  • Sensitivity: 86%
  • Specificity: 82% (test set)
  • Reduced false positives and improved real-world testing performance
  • However, the study is limited to a single dataset lacking demographic diversity, and no separate validation set was used due to the small dataset size
Abbreviations: PwPD: People with Parkinson’s Disease; CG: Control Group; CV: Cross-Validation; EDA: Electrodermal Activity; HR: Heart Rate; SAM: Self-Assessment Manikin; CGAN: Conditional Generative Adversarial Network; TTA: Test-Time Augmentation; ADASYN: Adaptive Synthetic Sampling; SSD: Single Shot Detector; MobileNet-SSD: MobileNet-based Single Shot Multibox Detector; COCO: Common Objects in Context dataset; IoU: Intersection over Union; TF-Lite: TensorFlow Lite; RPi: Raspberry Pi; FPS: Frames Per Second; AUs: Action Units; GPU: Graphics Processing Unit.
Table 8. Summary of studies which use recent advances in ML/DL techniques for PD diagnosis through fusion of multiple data modalities.
Authors | Modalities | ML/DL Techniques Used | Feature Extraction Method and Dataset Used | Study Conducted | Study Outcomes and Limitations (If Any)
Ghayvat et al. (2025) [64]
  • IMUs/accelerometers
  • Multi-modal sensor data fusion using AI-IoT
  • Integration with Functional Electrical Stimulation (FES) module
  • Multiple modalities (e.g., IMU, EMG, accelerometers) pre- and post-FoG episodes
  • Dataset: Real-time wearable recordings from individuals with PD during gait and FoG events
  • Designed “AiCareGaitRehabilitation”: fuses multi-sensor inputs and applies the fused features to trigger real-time FES for freezing-of-gait prevention/recovery in PD patients
  • Demonstrated effective real-time detection of Freezing of Gait (FoG) and timely FES intervention
  • Limitations: Pilot study; exact performance metrics not reported; sample size/methodology details limited; further validation needed across larger, diverse populations
Vatsavai, Iyer & Nair (2025) [65]
  • Voice, gait, tapping
  • Quantum-inspired Support Vector Machine (qSVM)
  • Classical machine learning models:
    • Linear SVM
    • RBF SVM
    • Random Forest
    • Gradient Boost
    • Logistic Regression
  • Dataset: mPower Public Research Portal
  • Over 150,000 samples from ~6000 participants
  • Multimodal data:
    • Voice
    • Gait
    • Tapping
    • Demographics
  • Features extracted:
    • MFCCs, jitter, shimmer (voice)
    • RMS & standard deviation (gait & tapping)
    • Demographic features (age, gender, etc.)
  • Developed a simulatable qSVM using angle embedding
  • Performed feature selection using Random Forest importance threshold (80th percentile)
  • Compared quantum model with classical models
  • qSVM achieved higher accuracy and better generalization than classical models
  • Limitations:
    • Implemented on classical hardware (not true quantum)
    • Computationally intensive
    • Performance may vary with dataset size and modality availability
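The feature-selection step described above (keeping only features at or above the 80th percentile of Random Forest importance) can be sketched as a simple percentile threshold. The importance scores here are illustrative values, not from the mPower study.

```python
# Hedged sketch: keep only feature indices whose importance score is at
# or above the pct-th percentile of all scores.

def select_top_features(importances, pct=80):
    """Return indices of features with importance >= the pct-th percentile."""
    ranked = sorted(importances)
    # Nearest-rank percentile (a simple convention for illustration).
    k = max(0, int(len(ranked) * pct / 100) - 1)
    threshold = ranked[k]
    return [i for i, v in enumerate(importances) if v >= threshold]

scores = [0.02, 0.15, 0.01, 0.30, 0.05, 0.22, 0.03, 0.08, 0.09, 0.05]
selected = select_top_features(scores)
print(selected)  # indices of the highest-importance features
```

In practice the importances would come from a fitted Random Forest; thresholding them prunes weak features before training the downstream (quantum or classical) classifier.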
Meng et al. (2024) [66]
  • Fusion of neuromuscular, handwriting, and tremor signals
  • Multi-modal BPSO for feature selection
  • Multi-modal broad learning for classification
  • Muscle tension (torque, elastic coefficient, etc.), handwriting trajectory (point, corner features), and tremor signals
  • Dataset from Harbin Medical University (22 healthy controls, 21 PD patients)
  • Used MM-BPSO to select features from muscle tension, handwriting, and tremor signals
  • MM-BL classified PD stages with high accuracy using selected features
  • MM-BL outperformed traditional classifiers (KNN, RF, etc.)
  • Limitations: Small dataset and limited non-linear correlation analysis
  • Needs larger, more diverse datasets for generalization
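Binary Particle Swarm Optimization, the basis of the MM-BPSO step above, searches over binary feature masks: each particle's velocity is mapped through a sigmoid to the probability of setting each bit. The sketch below uses a toy fitness function (rewarding two "informative" features) as a hypothetical stand-in for the study's multi-modal objective.

```python
# Minimal sketch of Binary Particle Swarm Optimization (BPSO) for
# feature selection. Fitness here is a toy stand-in.

import math, random

def bpso(n_features, fitness, n_particles=8, iters=30, seed=1):
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    gbest = max(pbest, key=fitness)[:]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] += 2 * r1 * (pbest[i][d] - pos[i][d]) \
                           + 2 * r2 * (gbest[d] - pos[i][d])
                # Sigmoid transfer: velocity -> probability that the bit is 1.
                pos[i][d] = 1 if rng.random() < 1 / (1 + math.exp(-vel[i][d])) else 0
            if fitness(pos[i]) > fitness(pbest[i]):
                pbest[i] = pos[i][:]
                if fitness(pbest[i]) > fitness(gbest):
                    gbest = pbest[i][:]
    return gbest

# Toy fitness: features 0 and 2 are informative; each selected bit costs 0.1.
toy_fitness = lambda mask: mask[0] + mask[2] - 0.1 * sum(mask)
best = bpso(5, toy_fitness)
print(best)
```

In the multi-modal setting, the fitness would be a classifier's cross-validated accuracy on the features selected by the mask, evaluated jointly across modalities.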
Lv et al. (2024) [67]
  • Audio + video
  • Deep learning framework with multimodal fusion
  • Cross-attention module for audio-visual integration
  • CPD-AVD (Chinese PD Audio-visual Dataset)
  • 220 participants (130 PD patients, 90 healthy)
  • Features from ShuffleNet-V2, Mel spectrograms, and VGGish for audio and visual data
  • Evaluated on CPD-AVD using accuracy, F1 score, and sensitivity
  • Ablation studies highlighted the contribution of visual features
  • High ROC-AUC of 0.96
  • Compared to traditional ML models
  • Achieved 92.68% accuracy
  • Limitations: small dataset, inclusion of only treated PD patients, and binary classification (PD vs. healthy) only
  • Future work suggested larger datasets, real-world testing, stage classification, and 3D facial modeling
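Cross-attention, the fusion mechanism described above, lets one modality's tokens (here, audio as queries) attend over another's (video as keys/values). The toy vectors below are illustrative; they are not features from the CPD-AVD model.

```python
# Toy sketch of scaled dot-product cross-attention between an audio
# feature sequence (queries) and a video feature sequence (keys/values).

import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def cross_attention(queries, keys, values):
    """For each query vector, return an attention-weighted sum of values."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out

audio_q = [[1.0, 0.0], [0.0, 1.0]]   # audio tokens as queries
video_k = [[1.0, 0.0], [0.0, 1.0]]   # video tokens as keys
video_v = [[0.2, 0.8], [0.9, 0.1]]   # video tokens as values
fused = cross_attention(audio_q, video_k, video_v)
print([[round(x, 2) for x in row] for row in fused])
```

Because each output row is a convex combination of the value vectors, the fused representation stays grounded in the video features while being steered by the audio content.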
Loo et al. (2024) [68]
  • Multi-cohort clinical assessment data
  • AdaBoost, CART, CatBoost for classification
  • Nested cross-validation for model performance
  • Combined data from LuxPARK, PPMI, and ICEBERG cohorts
  • Extracted baseline clinical features (age, motor symptoms, etc.)
  • Cross-cohort model predicted LID with significant predictors like age at onset, motor severity, and axial symptoms
  • Evaluated with AUC, C-index, calibration, and decision curve analysis
  • Promising performance for LID prediction
  • Highlights factors beyond levodopa exposure
  • Limitations: Small and homogeneous datasets
  • Future work to explore genetic and environmental factors
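Nested cross-validation, used above to avoid optimistic performance estimates, runs an inner loop for hyperparameter selection inside each outer evaluation fold. The sketch below uses a hypothetical one-feature threshold classifier in place of the study's AdaBoost/CART/CatBoost models.

```python
# Schematic nested cross-validation: the inner loop picks a hyperparameter
# (here, a decision threshold); the outer loop estimates performance.

def kfold_indices(n, k):
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def accuracy(threshold, data):
    return sum((x > threshold) == y for x, y in data) / len(data)

def nested_cv(data, thresholds, outer_k=5, inner_k=3):
    outer_scores = []
    for test_idx in kfold_indices(len(data), outer_k):
        train = [d for i, d in enumerate(data) if i not in test_idx]
        # Inner loop: choose the threshold with the best inner-CV accuracy.
        def inner_score(t):
            return sum(accuracy(t, [train[i] for i in fold])
                       for fold in kfold_indices(len(train), inner_k)) / inner_k
        best_t = max(thresholds, key=inner_score)
        outer_scores.append(accuracy(best_t, [data[i] for i in test_idx]))
    return sum(outer_scores) / outer_k

# Toy data: the label is 1 when the single feature exceeds 0.5.
data = [(i / 20, i / 20 > 0.5) for i in range(20)]
est = nested_cv(data, thresholds=[0.2, 0.5, 0.8])
print(est)
```

Keeping model selection inside the inner loop means the outer test folds never influence hyperparameter choice, which is what makes the outer estimate honest.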
T.R. et al. (2024) [69]
  • Clinical/biomedical tabular data
  • Ensemble method combining XGBoost and Random Forest
  • Kaggle dataset (24 attributes, imbalanced)
  • Feature reduction using correlation analysis from 24 to 11 features
  • Preprocessing: Handled missing data, normalization
  • Data balancing via SMOTE
  • Model training with grid search and cross-validation (10-fold)
  • XGBoost outperformed other classifiers
  • XGBoost-RF achieved 98% accuracy, 97.40% F1-score
  • Key features identified: Vocal frequency, tremor intensity, muscle rigidity
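SMOTE, the balancing step above, creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The sketch below is a simplified illustration, not the imbalanced-learn API.

```python
# Hedged sketch of SMOTE-style oversampling: interpolate between a
# minority sample and one of its k nearest minority neighbours.

import random

def smote(minority, n_new, k=2, seed=0):
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x (excluding x itself).
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation fraction in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic

minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3)]
new_pts = smote(minority, n_new=4)
print(len(new_pts))
```

Each synthetic point lies on a segment between two real minority samples, so the oversampled class stays inside the region the minority data already occupies.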
Junaid et al. (2023) [70]
  • Clinical + motor
  • SVM, Random Forest (RF), Extra Tree Classifier (ETC), LightGBM, Stochastic Gradient Descent (SGD)
  • Feature importance via SHAP, LIME, and SHAPASH for explainability
  • Statistical features from 6 patient visits: mean, min, max, variance, standard deviation
  • Feature selection: Early (modality-specific) and late (after fusion)
  • PPMI dataset used with 5 modalities (Subject Characteristics, Bio-Samples, Medical History, Motor, and Non-Motor functions)
  • 953 patients for 3-class prediction and 1060 patients for 4-class prediction
  • Time-series data fusion of patient visits (6 visits per patient)
  • Data balancing performed by SMOTENN technique
  • Bayesian optimization performed for model tuning
  • 3-class task: LightGBM accuracy 94.89% (fused motor and non-motor data)
  • 4-class task: RF accuracy 94.57%
  • Limitations were class imbalance and limited external validation
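The SHAP explanations referenced above are rooted in Shapley values: each feature's attribution is its average marginal contribution over all feature orderings. The exact computation is tractable only for a few features; the toy model below (with hypothetical "tremor"/"rigidity" effects) is for illustration, and real SHAP libraries approximate this.

```python
# Illustrative exact Shapley values for a toy 3-feature model, the idea
# underlying SHAP-style feature attributions.

from itertools import permutations

def shapley_values(features, value_fn):
    """Average marginal contribution of each feature over all orderings."""
    n = len(features)
    phi = {f: 0.0 for f in features}
    perms = list(permutations(features))
    for order in perms:
        present = set()
        for f in order:
            before = value_fn(present)
            present.add(f)
            phi[f] += value_fn(present) - before
    return {f: v / len(perms) for f, v in phi.items()}

# Toy model output: tremor adds 0.5, rigidity adds 0.3, and the pair
# contributes an extra 0.1 interaction bonus.
def model_value(feats):
    v = 0.5 * ('tremor' in feats) + 0.3 * ('rigidity' in feats)
    if {'tremor', 'rigidity'} <= feats:
        v += 0.1
    return v

phi = shapley_values(['tremor', 'rigidity', 'age'], model_value)
print(phi)
```

The interaction bonus is split evenly between the two interacting features, while the irrelevant feature receives zero attribution, which is the fairness property that makes Shapley-based explanations attractive clinically.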
Shastry (2023) [71]
  • Acoustic speech recordings
  • Ensemble methods: Nearest Neighbor Boosting combines k-NN and Gradient Boosting (GB)
  • Classifiers: k-NN, GB, Decision Tree (DT), Random Forest (RF), AdaBoost (AB), Logistic Regression (LR), Support Vector Machine (SVM), Multilayer Perceptron (MLP), Naïve Bayes (NB), eXtreme Gradient Boosting (XGB), Extra Randomized Trees (ERT), Category Boosting (CB)
  • Feature importance is computed using: Mean Decrease in Impurity (F-MDI), Feature Permutation (F-PER), and Pearson’s Correlation (F-CORR)
  • Min–Max scaling was used for data normalization
  • Dataset Used: PD Speech Dataset with Multiple Types of Sound Recordings (PSV)
  • 1039 samples, 27 input features, and 1 target feature (Parkinson’s disease status)
  • Data preprocessing and classification (Two-phase framework)
  • Data split: 80% training, 20% testing
  • Classifier evaluation in terms of accuracy, precision, recall, F-score, ROC, AUC
  • Ensemble model (NNB) outperforms individual models
  • Data analysis methods used were cross-validation, LOSO validation
  • NNB classifier achieved superior accuracy, recall, precision, and F-score over standalone ML models
  • NNB accuracy improvements ranged from 1.93% to 16.13% compared to other classifiers
  • However, the execution time of NNB was longer than that of the other models due to its ensemble nature
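The general idea behind an ensemble like NNB, which combines a k-NN-style component with a boosting-style component, can be sketched as averaging the two models' probability outputs. Both "models" below are toy one-feature stand-ins, not the paper's implementation.

```python
# Simple sketch of combining a k-NN neighbour vote with a second
# model's score by averaging the two predicted probabilities.

def knn_prob(x, train, k=3):
    """Fraction of the k nearest training points labelled positive."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def boosted_prob(x):
    # Toy stand-in for a gradient-boosted score on one feature.
    return 1.0 if x > 0.5 else 0.0

def ensemble_prob(x, train):
    return 0.5 * (knn_prob(x, train) + boosted_prob(x))

train = [(0.1, 0), (0.2, 0), (0.3, 0), (0.7, 1), (0.8, 1), (0.9, 1)]
p = ensemble_prob(0.75, train)
print(p)
```

Averaging complementary models often improves accuracy but, as the study notes, at the cost of running every component model at prediction time.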
Abbreviations: PwPD: People with Parkinson’s Disease; EMG: Electromyography; FES: Functional Electrical Stimulation; ERT: Extra Randomized Trees; CB: Category Boosting; CART: Classification and Regression Tree; ETC: Extra Tree Classifier; SGD: Stochastic Gradient Descent; MM-BPSO: Multi-Modal Binary Particle Swarm Optimization; MM-BL: Multi-Modal Broad Learning; RMS: Root Mean Square; LIME: Local Interpretable Model-agnostic Explanations; SHAPASH: SHAP-based Explainability framework; C-index: Concordance Index; PPMI: Parkinson’s Progression Markers Initiative; LID: Levodopa-Induced Dyskinesia; SMOTE: Synthetic Minority Over-sampling Technique; SMOTENN: Synthetic Minority Over-sampling Technique with Edited Nearest Neighbours; LOSO: Leave-One-Subject-Out validation.
Table 9. Summary of studies which use recent advances in ML/DL techniques for PD diagnosis based on genomic, genetic, or biomarker data.
Authors/Source | ML/DL Techniques Used | Feature Extraction Method and Dataset Used | Study Conducted | Study Outcomes and Limitations (If Any)
Ameli et al. (2024) [27]
  • Random Forest (primary classifier)
  • Sequential Variable Feature Selection (SVFS) using merged dbGaP SNP datasets (Phs000126, Phs000394, Phs000089, Phs000048)
  • Genomic biomarker-based PD classification; improving reproducibility across cohorts
  • Integrated datasets improved biomarker reproducibility; Approach 0 achieved highest accuracy; 54 SNPs replicated, 4 linked to PD. Limitations: population heterogeneity, genotyping platform differences.
PPMI-based multimodal biospecimen studies (selected from reviews [23,24,26])
  • Random Forest, XGBoost, LightGBM, SVM, deep neural networks
  • Plasma + CSF markers (α-syn, tau, Aβ, NfL), combined with MRI/DAT-SPECT & clinical scores; PPMI repository
  • Early diagnosis, progression prediction, and differential diagnosis using multimodal fusion
  • Multimodal fusion improves diagnostic accuracy & progression forecasting. Limitations: class imbalance, missing modalities, cross-site variability.
CSF α-synuclein biomarker studies (summarized in [23,24])
  • SVM, Random Forest, logistic regression
  • ELISA-based quantification of total/p-α-syn; CSF biochemical panels
  • Differential diagnosis of PD vs. atypical parkinsonism
  • ML improves diagnostic performance when α-syn is combined with other biomarkers. Limitations: assay variability, small cohorts, overlapping biomarker distributions.
Plasma/CSF Neurofilament Light Chain (NfL) studies (summarized in [23,24])
  • Regression, Random Forest, Cox-ML survival models
  • Plasma/CSF NfL via immunoassay + clinical scores + imaging
  • Progression prediction and severity estimation
  • Strong correlation with cognitive & motor decline; ML improves prognostic accuracy. Limitations: NfL lacks disease specificity and requires multimodal context.
Plasma & CSF proteomics/metabolomics ML studies (summarized in [21,23,24])
  • XGBoost, Elastic Net, RF, autoencoders; PCA/UMAP for dimensionality reduction
  • Mass-spectrometry proteomics; targeted metabolomics; biomarker panels
  • Biomarker discovery & early diagnostic classification
  • High-dimensional patterns outperform single biomarkers. Limitations: batch effects, overfitting, and lack of external validation.
Neuron-derived extracellular vesicle (NDEV) biomarker studies (from uploaded EV paper)
  • Random Forest, SVM, small CNNs for spectral data
  • Immunoaffinity-isolated NDEVs; proteomic and miRNA cargo profiling
  • Early diagnosis/mechanistic biomarker discovery
  • NDEV-derived cargo reflects neuronal dysfunction, and ML models demonstrate good discriminative performance. However, limitations include inconsistent NDEV isolation protocols, low vesicle yield, and assay variability. This study was not included in the numbered reference list and is cited only as supplemental evidence.
Multimodal fluid + imaging fusion studies (from reviews [23,24,26])
  • Early/late fusion, ensemble learning, deep multimodal architectures
  • Plasma/CSF biomarkers + MRI/DAT-SPECT + clinical scales
  • Multimodal diagnostic modeling
  • Fusion consistently improves diagnostic precision; mitigates weaknesses of single modalities. Limitations: computational complexity, missing-modality handling.
Abbreviations: SVFS: Sequential Variable Feature Selection; SNP: Single Nucleotide Polymorphism; dbGaP: Database of Genotypes and Phenotypes; PPMI: Parkinson’s Progression Markers Initiative; CSF: Cerebrospinal Fluid; α-syn: Alpha-synuclein; p-α-syn: Phosphorylated Alpha-synuclein; Aβ: Amyloid-beta; Tau: Tau protein; NfL: Neurofilament Light Chain; DAT-SPECT: Dopamine Transporter Single Photon Emission Computed Tomography; ELISA: Enzyme-Linked Immunosorbent Assay; UMAP: Uniform Manifold Approximation and Projection; NDEV: Neuron-Derived Extracellular Vesicles; miRNA: MicroRNA; EV: Extracellular Vesicle; Cox-ML: Cox Proportional Hazards Machine Learning model.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.