The Role of Neural Network for the Detection of Parkinson’s Disease: A Scoping Review

Background: Parkinson’s disease (PD) is a chronic neurodegenerative disorder that ranks second worldwide among neurodegenerative disorders, after Alzheimer’s disease. Early diagnosis of PD is crucial so that patients can manage the disease properly. However, no medical test is available to diagnose PD conclusively. Computer-aided diagnosis (CAD) systems therefore offer a way to make the necessary data-driven decisions and assist the physician. Numerous studies have proposed CAD systems to diagnose PD in its early stages, but no comprehensive review has summarized the role of AI tools in combating PD. Objective: The study aimed to explore and summarize the applications of neural networks to diagnose PD. Methods: The PRISMA Extension for Scoping Reviews (PRISMA-ScR) was followed to conduct this scoping review. To identify relevant studies, both medical databases (e.g., PubMed) and technical databases (e.g., IEEE) were searched. Three reviewers carried out the study selection and extracted the data from the included studies independently. A narrative approach was then adopted to synthesize the extracted data. Results: Out of 1061 studies, 91 satisfied the eligibility criteria of this review. About half of the included studies implemented artificial neural networks to diagnose PD. Many of the included studies focused on freezing of gait (FoG). Biomedical voice and signal datasets were the most commonly used data types to develop and validate these models; MRI and CT scan images were also utilized. Conclusion: Neural networks play an integral and substantial role in combating PD. Many possible applications of neural networks were identified in this review; however, most of them are limited to research purposes.


Background
The human brain is the primary control center of the human body. Even minor damage to any of its parts can severely affect other organs; one such adverse effect is Parkinson's disease (PD) [1]. "PD is a chronic and progressive neurodegenerative disease" [2], and it occurs mainly in people over 50 years old [3]. Its symptoms start slowly and increase over time. PD symptoms are categorized as motor and nonmotor [4]. Motor symptoms include movement disorders, shaking, walking issues [5], stiffness, and postural instability [6], while nonmotor symptoms include cognitive dysfunction, mood disorder [7], depression, and anxiety [8].
Parkinson's disease is the second most common neurodegenerative disease worldwide after Alzheimer's disease. In 2019, its incidence rate ranged from 40.37 to 53.89 per 100,000 population per year in the US alone [9]. Diagnosing PD at an early stage is important for mitigating its complications. However, no medical test is available to diagnose it conclusively in the early stages. In a traditional clinical setup, the physician

Research Problem and Objectives
The scope of this paper is limited to the detection of Parkinson's disease (PD) in the early stage using neural networks. Patient data such as electronic health records (EHRs) and medical images can be analyzed using neural networks (NNs). In particular, patient data can undergo many processes (analysis, segmentation, augmentation, scaling, normalization, sampling, aggregation, and sifting) in order to obtain accurate predictions that assist the healthcare ecosystem and its stakeholders. Many studies have recently been conducted to propose solutions that mitigate and prevent neurodegenerative disorders such as PD. However, most of this research is dispersed. Therefore, a summary of how NN technologies have been used to address challenges related to PD is needed. An appropriate summary allows new researchers to understand the current role of neural networks against PD and gives them a base to build on instead of starting from scratch.
Several reviews and surveys have covered AI techniques used to mitigate and prevent PD [11][12][13][14]. These generally focus on artificial intelligence (AI) applications such as patient diagnosis, epidemiological monitoring, and drug and vaccine discovery [15]. Nevertheless, a massive number of research papers are constantly being published, overwhelming electronic databases. Therefore, an updated review that focuses on the uses of neural networks in PD prevention is necessary.
This review aims to identify and illustrate the role of neural network technology in detecting PD early, based on the following aspects: (1) identifying the role of neural networks in PD detection, (2) highlighting the recent algorithms applied to PD datasets, (3) observing dataset types, (4) categorizing the type of PD based on symptoms, (5) investigating the best results achieved by the research community, and (6) providing recommendations for researchers and healthcare professionals. The outcome can be used in the healthcare sector as guidance for developers who consider using neural networks to improve public health capabilities in response to PD.

Methodology
We carried out a scoping review to explore, in a structured manner, the evidence on the application of neural networks in diagnosing Parkinson's disease. This section describes the methodology adopted to conduct the review, which followed the PRISMA Extension for Scoping Reviews (PRISMA-ScR) [16].

Search Sources
We selected five bibliographic databases (PubMed, IEEE, ACM, ScienceDirect, and Google Scholar) to retrieve the research studies relevant to the topic. From Google Scholar, we scanned only 100 articles, chosen according to their relevance to this review. Backward and forward reference list checking was not performed because the number of included studies was already sufficient. The search was performed from 24 February to 1 March 2021.

Search Terms
In the present review, we considered two groups of search terms, based on population and intervention. Given the population of "Parkinson's disease" and the intervention of "deep learning", the search strategy was as follows: (("Parkinson's disease" OR "Parkinson*" OR "Parkinsonism" OR "paralysis agitans" OR "shaking palsy") AND ("artificial intelligence*" OR "machine learning" OR "neural network*" OR "deep learning" OR "natural language processing" OR "neural network*" OR "supervised learning" OR "unsupervised learning" OR "ensemble learning" OR "reinforcement learning")). The total number of studies retrieved is reported in Appendix A.
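As an illustration only, the Boolean structure of this search string can be reproduced programmatically. The sketch below is not part of the original search protocol: the helper name `or_group` is hypothetical, the term lists are copied from the strategy above, and repeated terms (such as "neural network*") are collapsed because a duplicate does not change the meaning of an OR group. Individual databases may require their own syntax.

```python
def or_group(terms):
    """Quote each term and join the unique terms with OR inside parentheses."""
    # dict.fromkeys keeps the first occurrence of each term and preserves order.
    unique_terms = list(dict.fromkeys(term.strip() for term in terms))
    return "(" + " OR ".join(f'"{term}"' for term in unique_terms) + ")"


# Term lists copied from the search strategy described in the text.
population_terms = [
    "Parkinson's disease", "Parkinson*", "Parkinsonism",
    "paralysis agitans", "shaking palsy",
]
intervention_terms = [
    "artificial intelligence*", "machine learning", "neural network*",
    "deep learning", "natural language processing", "neural network*",
    "supervised learning", "unsupervised learning", "ensemble learning",
    "reinforcement learning",
]

# Population AND intervention, each expressed as an OR group.
query = "(" + or_group(population_terms) + " AND " + or_group(intervention_terms) + ")"
print(query)
```

Keeping the terms in lists makes it straightforward to add or remove a synonym and regenerate the same query for each database.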

Study Eligibility Criteria
This study aims to summarize and review the application of deep learning, particularly in diagnosing Parkinson's disease. Therefore, studies were eligible only if they introduced or developed a deep learning approach or technique that primarily focused on diagnosing Parkinson's disease. Further constraints were placed on the type of publication and the language of the studies. Only studies published in English between 2018 and 2021 were selected, and only peer-reviewed articles, conference proceedings, reports, theses, and dissertations were admitted. Reviews, conference abstracts, commentaries, proposals, and editorials were excluded. The inclusion and exclusion criteria for study selection are listed in Table 1.

Study Selection
The study selection process was conducted in two stages: screening the titles and abstracts of the retrieved studies, and screening the full text of the studies selected in the first stage. In the first stage, the first reviewer, MA, independently screened the titles and abstracts of all retrieved studies; due to time constraints, the second reviewer, US, and the third reviewer, KD, reviewed the first half and the second half of the complete set of articles, respectively. Rayyan, a web-based systematic review tool, was employed for title and abstract screening [17]. In the second stage, the first reviewer, MA, performed full-text screening of the studies identified in the first stage. Any disagreement between reviewers was resolved through discussion and consensus.

Data Extraction and Data Synthesis
To extract the study-specific information and data, an extraction form was created and tested on eight included studies (Appendix B). MA and US undertook the data extraction, and the data were recorded in an Excel sheet to summarize the following: general characteristics of the included studies (e.g., country, type, and year of publication), aim/purpose of the study, type of Parkinson's disease, branch/type of neural network, type of validation, performance metrics, the dataset used to train and test the model, number of Parkinson's and healthy samples, type of dataset, size of the dataset, data collection device or sensor, and dataset source. We used a narrative approach to synthesize the extracted data.

Search Results
In total, 1061 studies were retrieved by searching five electronic databases. Of these, 190 (17.90%) were removed as duplicates, while 871 (82.09%) went through title and abstract screening; in this screening, we excluded 598 (56.36%) studies for various reasons, as shown in Figure 1. The remaining 273 (25.73%) studies went through full-text screening, and 181 (17.05%) studies were excluded, as detailed in Figure 1. In total, 91 (8.67%) studies were included in this review.
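The screening flow can be rechecked with simple arithmetic. The sketch below uses only the counts reported above; the variable names are illustrative. Subtracting the reported exclusions from the 273 full-text studies leaves 92, and 92/1061 corresponds to the reported 8.67%, whereas the text states that 91 studies were included. The source does not explain this one-study difference, so the sketch only surfaces it and does not resolve it.

```python
# Counts as reported in the text (see Figure 1 of the source for the flow diagram).
retrieved = 1061
duplicates_removed = 190
excluded_title_abstract = 598
excluded_full_text = 181
included_reported = 91

# Each stage is the previous stage minus the studies removed at that stage.
screened = retrieved - duplicates_removed
full_text_assessed = screened - excluded_title_abstract
remaining_after_full_text = full_text_assessed - excluded_full_text


def percent_of_retrieved(count):
    """Share of all retrieved studies, as a percentage rounded to 2 decimals."""
    return round(100 * count / retrieved, 2)


for label, count in [
    ("screened (title/abstract)", screened),
    ("assessed in full text", full_text_assessed),
    ("remaining after full-text exclusions", remaining_after_full_text),
    ("included as reported", included_reported),
]:
    print(f"{label}: {count} ({percent_of_retrieved(count)}%)")
```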

General Description of the Included Studies
As shown in Table 2 and Figure 2, the included studies were published in more than 30 different countries: about 13 studies (14.13%) came from the US, followed by 9 studies (9.78%) from China and India (Figure 3). Numerous papers were published in the last 3 years; for instance, 30 papers (32.60%) were published in each of 2019 and 2020. More than half (56.2%) of the included studies were conference papers. Conference papers were most frequent in 2018 and 2020 (n = 18 each), and 16 conference articles were reported in 2019. In addition, 39 journal articles were published in the last few years: 10 in 2018, 14 in 2019, 12 in 2020, and 3 in 2021.
Table 2. General characteristics of the included studies (n = 91).
As shown in Figure 4, some studies target specific symptoms of PD, such as freezing of gait, vocal impairment, and tremor disorder. A limited number of included studies proposed a deep learning approach to detect tremor disorder (n = 5) and vocal impairment (n = 13). Most studies, however, used deep learning to diagnose PD in general (n = 50) or freezing of gait (FoG) in particular (n = 23).
As reported in Table 3, neural networks are divided into five main branches (CNN, RNN, ANN, FNN, NN), and all subclassification techniques are listed as the backbone model. LSTM was heavily used across studies (n = 11), followed by the non-deep-learning classifier SVM (n = 8). We report SVM in this review because many studies used neural networks to perform feature extraction while classification was handled by a machine learning classifier such as SVM. DNN was used in six studies, and a predefined model such as VGG was used in three studies. Other algorithms were used only rarely; their use depended on each study's design or on having achieved a remarkable result.
In most of the studies, the dataset was divided into three parts (training, testing, and validation), whereas a limited number of studies divided the dataset only into a training set and a validation set, as presented in Table 3. We reported only the training and testing datasets. Most of the experiments (n = 21) used ≥80% of the data for training, and nine used ≥70%. Only a few experiments used a smaller training volume: five used ≥60%, three used ≥50%, and one used ≥40%. However, 43 of the studies did not mention the volume of the training dataset.
In addition, the volume of the testing dataset was not clarified in most of the studies: 53 studies did not specify the volume of the testing dataset used during the experiment. Among those that did, a volume of ≥20% was the most common (n = 18), followed by ≥10% (n = 9) and ≥30% (n = 6). The testing dataset is usually smaller than the training dataset; half of the dataset (≥50%) was used for testing in only three studies. Less common testing volumes, ≥5% and ≥40%, were reported in two studies and one study, respectively.
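To make the reported proportions concrete, the following minimal sketch shows a hold-out split in which 80% of the samples are used for training and 20% for testing, the most common configuration among the studies that reported it. The function name, the seed, and the placeholder records are assumptions made for illustration; the included studies used their own tooling and, in many cases, an additional validation set.

```python
import random


def train_test_split(samples, train_fraction=0.8, seed=0):
    """Shuffle the samples reproducibly and split them into train and test parts."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(samples)
    # A dedicated Random instance keeps the shuffle reproducible for a given seed.
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical placeholder records standing in for PD and healthy-control samples.
records = list(range(100))
train_set, test_set = train_test_split(records, train_fraction=0.8)
print(len(train_set), len(test_set))
```

Changing `train_fraction` to 0.7 or 0.6 reproduces the other training volumes mentioned above.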
Various evaluation metrics were used to check each model's performance. Accuracy was the most commonly used metric for measuring how well a model predicts the result on the testing dataset and was reported in 57 studies. Along with accuracy, other evaluation metrics were used, such as recall/sensitivity (n = 36), followed by specificity (n = 24) and precision (n = 17); only a few studies (n = 8) used the area under the curve (AUC) as an evaluation metric.
While summarizing the results of all 91 studies, we did not come across any empirical validation or real-life implementation in a hospital. Moreover, among the 91 studies, we found only one that developed diagnosis software that identifies neurological disorders such as PD and that can be employed in a medical center [51].
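For reference, the metrics named above have standard definitions in terms of true/false positives and negatives. The sketch below is a plain-Python illustration with made-up counts and scores; it is not taken from any included study. AUC is computed here as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
def safe_div(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(tp, fp, tn, fn):
    """Standard metrics for a binary classifier (PD = positive, healthy = negative)."""
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "sensitivity": safe_div(tp, tp + fn),  # also called recall
        "specificity": safe_div(tn, tn + fp),
        "precision": safe_div(tp, tp + fp),
    }


def auc_from_scores(positive_scores, negative_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positive_scores
        for n in negative_scores
    )
    return safe_div(wins, len(positive_scores) * len(negative_scores))


# Made-up confusion-matrix counts and classifier scores, for illustration only.
metrics = binary_metrics(tp=40, fp=5, tn=45, fn=10)
auc = auc_from_scores([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
print(metrics, auc)
```

Reporting sensitivity and specificity alongside accuracy matters when the PD and healthy-control groups are of unequal size, since accuracy alone can then be misleading.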

Public Dataset
As discussed earlier, the total number of public datasets was 57. Table 4 summarizes the most used publicly available dataset sources and repositories (n = 36), e.g., the Parkinson Progression Markers Initiative (PPMI) database, the UCI repository, and PhysioNet; these were the datasets most often used to develop and validate the AI models. Other public dataset sources used by the included studies were Kaggle, HandPD, DaphNet, the NTUA Parkinson Dataset, the Neurovoz corpus, the PC-GITA database, etc. Table 4 provides only a sample of the public datasets used within the included studies. As seen, in most cases the number of males is higher than the number of females in both the PD sample and the healthy control sample. Furthermore, different types of hardware devices were used to collect the datasets. Most of the data are images collected with different devices, ranging from hospital imaging devices (MRI, CT, and DaTscan) to smartphones used to capture the handwriting or drawings of the PD samples (n = 28) and to record video (n = 4). Biometric signal and time-series-based datasets were collected using a digital keyboard or a sensor/accelerometer (n = 16) attached to the PD and healthy control samples or placed at different angles to measure the severity of freezing of gait or tremor. Devices such as a high-quality standalone microphone or a smartphone were used to collect the biomedical voice datasets, and 15 studies reported a public vocal dataset. Among the public datasets, only 11 reported the gender of the PD and healthy control samples, and only five studies identified each sample's mean age.

Private Dataset
As mentioned earlier, the total number of private datasets was 31, as shown in Table 5. We summarized the datasets that were clearly explained within the studies (n = 5). These datasets were collected and labeled by different entities such as hospitals, universities, and research centers. The numbers of PD and healthy control samples are reported, including gender. Table 5 provides only a sample of the private datasets used within the included studies. The number of males in the PD sample is higher than the number of females, whereas the number of females in the healthy control sample is higher than the number of males. Furthermore, different types of hardware devices were used to collect the datasets. Most of the data were images collected with different devices, ranging from hospital imaging devices (MRI, CT, and DaTscan) to smartphones used to capture the handwriting or drawings of the PD samples (n = 11). Biometric signal and time-series-based datasets were collected using a digital keyboard or a sensor/accelerometer (n = 14) attached to the PD and healthy control samples or placed at different angles to measure the severity of freezing of gait or tremor. Devices such as a high-quality standalone microphone or a smartphone were used to collect the biomedical voice datasets, and six studies reported a private vocal dataset. Among the private datasets, only four reported the gender of the PD and healthy control samples, and only four studies identified each sample's mean age.

Principal Findings
This study focuses on identifying deep learning and neural network applications for detecting Parkinson's disease in the early stage. We found that some proposed models show promising results and could be employed in hospitals. This review provides recommendations for healthcare professionals and researchers based on the outcomes of the included studies. We noticed that five studies [21,37,49,55,81] used the Vertical Ground Reaction Force (VGRF) dataset, obtained from the PhysioNet hub, to train classification models including fuzzy neural networks (FNNs), stacked 2D CNNs, deep neural networks (DNNs), artificial neural networks (ANNs), and neighborhood representation local binary pattern (NR-LBP). Notably, the DNN in [49] achieved outstanding results for early detection of PD using the VGRF dataset, compared to the other studies.
We found that most of the biomedical voice measurement datasets were obtained from the University of California, Irvine (UCI) Machine Learning Repository; in [53,84] and [23], the same dataset is used; however, 19 achieved an outstanding result using the sequential model in a deep neural network for detecting PD based on voice measurements. In [33,44] and [4], the same voice measurement dataset with 756 instances and 754 attributes was used to identify PD, and the autoencoder neural network in [33] achieved better results than the other studies.
Electroencephalography (EEG) datasets were obtained from different sources and used in five studies [32,38,51,83,85]. In [38,83], long short-term memory (LSTM) achieved outstanding results, suggesting that it is the best option for dealing with EEG data. On the other hand, eight studies [3,19,25,27,40,69,101,102] focused on the classification of handwriting images to identify PD in the early stage, and outstanding results were achieved by ANN + SVM in [3], dual-path RNN (DPRNN) in [40], and CNN + Optimum-Path Forest (OPF) in [102].
Detecting PD using a neural network is not an easy task compared with other diseases because PD symptoms (vocal disorder, tremor disorder, freezing of gait) are inconsistent, and data collection is difficult because it depends on the type of device. Therefore, many public repositories mainly focus on collecting and processing certain types of datasets. Based on our findings, we can conclude that the sequential model in DNN and the autoencoder neural network proved to be suitable models for PD detection from speech. DNN is recommended for identifying PD from VGRF data. Additionally, CNN remains the leading choice for classifying medical images such as MRI, PET/CT, and DaTSCAN, and FNN also shows significant results in classifying medical images. With regard to handwriting images, we found that ANN combined with the machine learning classifier SVM achieved a remarkable result for the identification of PD.
Based on the findings of this review, we can highlight the most used repositories that contain PD public datasets for the research community as follows: (1)

Strengths
This review covered deep learning neural network techniques used for PD detection regardless of characteristics, country, and study design. We claim that this review is a comprehensive study of neural network approaches used for PD detection. It will help researchers to understand how neural networks are used to detect PD in its early stages. Compared with other reviews [106][107][108] that do not focus on PD, this review is unique in its field because it describes and summarizes the features of the identified neural network models, datasets, available repositories, type of PD evaluation, validation, and research implications. Moreover, this review differs from the previously mentioned reviews in that it follows the latest version of PRISMA-ScR [16]. Unlike other reviews, we retrieved the studies from the most popular computer science and healthcare databases to identify the most relevant studies possible.

Limitations
Initially, we carried out a preliminary search covering 2015 to 2021 in the five selected databases, which retrieved a massive number of studies. Therefore, we limited our search to the period between 2018 and 2021. As a result, we may have missed some significant studies. Because of the large number of included studies (n = 92), backward and forward reference checking was not performed in this review. PD is an extensive topic that is divided into many types of disease with various symptoms. Therefore, we may have missed categorizing some diseases from a clinical perspective.

Practical and Research Implications
Although this review investigates the neural networks used to detect Parkinson's disease (PD), some applications could significantly mitigate this neurodegenerative disorder. Nowadays, computer-aided diagnosis systems are essential because they are less time consuming and more user friendly. For example, the authors of [51] designed a GUI system that physicians may use for fast diagnosis of Parkinson's disease in its early stages. Researchers can also use the system to continue their future research on disease diagnosis, especially neurodegenerative disorders. The system will show the patient's disease progression and help clinicians monitor the disease in its early stages. Furthermore, the system can differentiate between PD patients and healthy subjects and compare various parameters (EEG, EMG, MRI/PET scan). In both PD and control subjects, the model can detect the region of dopamine output in the substantia nigra. As a result, the proposed model would be a novel solution containing all of the PD detection parameters in a single window, which would be extremely useful for disease monitoring.
In the included studies [6,18,19,30,61,75,87,91,92,96,98,101], clinicians could obtain PD patient data through telemonitoring using devices such as tablets and smartphones. This is a promising solution because it can increase monitoring frequency without putting a strain on professional resources during the COVID-19 pandemic. However, the cost of training and testing the detection algorithm on a smartphone was too high; thus, the results were computed on a remote server and then transferred to the computer.
Clinical studies can refer to a video recorded of the patient while performing physical activities such as a PD bed test. As mentioned in [18,43,70,87], a neural network was able to identify the symptoms of PD from a video sample of the patient. In the future, clinical studies may analyze videos recorded in the hospital for other patients, for example, during therapy sessions, and predict whether the patient is likely to develop PD.

Conclusions
This scoping review summarized 91 studies investigating the use of neural networks, specifically deep learning algorithms, for early diagnosis of PD based on various data collected from different public and private sources, including medical images, biomedical voice, and sensor signals, for both PD and healthy control samples. Included studies were categorized into different groups based on the neural network model, type of PD symptoms, and type of dataset. Additionally, the most used datasets and the best-performing models were highlighted for the detection of particular symptoms of PD. All technical experiment methods were reported, including submodel, dataset volume, training, testing, evaluation metrics, and validation type. We noted whether any real-life implementation was reported in a hospital or university setting, and based on this review, we made particular recommendations for healthcare professionals. Future work could be a meta-analysis that examines each study and provides a comprehensive comparison between them in terms of quality.

Study Characteristics
Author: The first author of the study.
Year of submission: The year in which the study was submitted.
Country of publication: The country where the study was published.
Publication type: The paper type (i.e., peer-reviewed, conference, or preprint).

AI Technique Characteristics
Purpose/use of AI: What are the applications or uses of AI in the diagnosis of Parkinson's disease (e.g., diagnosis, classification, and detection)?
AI branches: The branches/areas that were used (e.g., traditional machine learning, deep learning, natural language processing).
AI models/algorithms: The specific AI models or algorithms that were used (e.g., decision tree, random forest, convolutional neural network).

Dataset Characteristics
Data sources: Source of data that were used for the development and validation of AI models/algorithms (e.g., public databases, clinical settings, government sources).
Data types: Type of data that were used for the development and validation of AI models/algorithms (e.g., radiology images, biological data, laboratory data).
Dataset size: The total number of data that were used for the development and validation of AI models/algorithms.
Type of validation: How the dataset was split/used to develop and test the proposed models/algorithms (e.g., train-test split, K-fold cross-validation, external validation).
Proportion of training set: Percentage of the training set of the total dataset.
Proportion of validation set: Percentage of the validation set of the total dataset.
Proportion of test set: Percentage of the test set of the total dataset.
Type of device: The device used to collect the data (e.g., accelerometer, smartphone).
At-risk group: The number of Parkinson's participants included in the study.
Control group: The number of healthy participants included in the study.