Implementation of a Deep Learning Algorithm Based on Vertical Ground Reaction Force Time–Frequency Features for the Detection and Severity Classification of Parkinson’s Disease

Setiawan, Febryan; Lin, Che-Wei

doi:10.3390/s21155207

Open AccessArticle

Implementation of a Deep Learning Algorithm Based on Vertical Ground Reaction Force Time–Frequency Features for the Detection and Severity Classification of Parkinson’s Disease

by

Febryan Setiawan

¹

and

Che-Wei Lin

^1,2,*

¹

Department of Biomedical Engineering, College of Engineering, National Cheng Kung University, Tainan City 701, Taiwan

²

Medical Device Innovation Center, National Cheng Kung University, Tainan City 701, Taiwan

^*

Author to whom correspondence should be addressed.

Sensors 2021, 21(15), 5207; https://doi.org/10.3390/s21155207

Submission received: 9 July 2021 / Revised: 29 July 2021 / Accepted: 30 July 2021 / Published: 31 July 2021

(This article belongs to the Special Issue Sensors for Physiological Parameters Measurement)

Download

Browse Figures

Review Reports Versions Notes

Abstract

:

Conventional approaches to diagnosing Parkinson’s disease (PD) and rating its severity level are based on medical specialists’ clinical assessment of symptoms, which are subjective and can be inaccurate. These techniques are not very reliable, particularly in the early stages of the disease. A novel detection and severity classification algorithm using deep learning approaches was developed in this research to classify the PD severity level based on vertical ground reaction force (vGRF) signals. Different variations in force patterns generated by the irregularity in vGRF signals due to the gait abnormalities of PD patients can indicate their severity. The main purpose of this research is to aid physicians in detecting early stages of PD, planning efficient treatment, and monitoring disease progression. The detection algorithm comprises preprocessing, feature transformation, and classification processes. In preprocessing, the vGRF signal is divided into 10, 15, and 30 s successive time windows. In the feature transformation process, the time domain vGRF signal in windows with varying time lengths is modified into a time–frequency spectrogram using a continuous wavelet transform (CWT). Then, principal component analysis (PCA) is used for feature enhancement. Finally, different types of convolutional neural networks (CNNs) are employed as deep learning classifiers for classification. The algorithm performance was evaluated using k-fold cross-validation (kfoldCV). The best average accuracy of the proposed detection algorithm in classifying the PD severity stage classification was 96.52% using ResNet-50 with vGRF data from the PhysioNet database. The proposed detection algorithm can effectively differentiate gait patterns based on time–frequency spectrograms of vGRF signals associated with different PD severity levels.

Keywords:

deep learning; gait analysis; Parkinson’s disease severity stages; time–frequency spectrogram; vertical ground reaction force (vGRF) signal

1. Introduction

Parkinson’s disease (PD) is a neurodegenerative disease that belongs to a group of motor system disorders caused by the loss of dopamine-producing brain cells. PD is the second most common neurodegenerative disease [1]; its prevalence is approximately 0.3% in the general population, approximately 1% in individuals older than 60, and approximately 3% in people aged 80 and over [1]. The incidence of PD is 8–18 per 100,000 people. The median age at onset is 60 years, and the mean duration of the progression of the disease from diagnosis to death is approximately 15 years [1]. There is a 1.5–2-fold greater prevalence and incidence of this disease in men [1]. PD treatments cost approximately USD 2500 each year, and therapeutic surgery costs up to USD 100,000 per patient [2]. The primary PD symptoms are tremors in the hands, arms, legs, jaw, and face; rigidity (inflexibility of the limbs and trunk); bradykinesia (slowness in movement); and postural instability (balance and coordination disturbance) [3,4,5]. As these symptoms become more severe, patients may experience difficulties walking, talking, or accomplishing simple tasks. Currently, there are no blood or laboratory tests that assist in diagnosing PD. Symptoms of the disease include certain characteristic walking difficulties, such as a shortened stride length, decreased gait speed, increased stride-to-stride variation, a shambling gait, and frozen gait.

Gait analysis is used to assess and treat individuals with conditions affecting their ability to walk, such as poor health, advanced age, size, weight, and speed. A standard assessment is needed to clinically identify and evaluate gait characteristics and other phenomena in PD patients, such as gait count, walking speed, and step length. Pistacchi et al. analyzed temporal parameters (see Figure 1) in patients with early PD using 3D gait analysis-related cadence (PD patients: 102.46 ± 13.17 steps/min and healthy subjects: 113.84 ± 4.30 steps/min), stride duration (PD patients: 1.19 ± 0.18 s right limb and 1.19 ± 0.19 s left limb; healthy subjects: 0.426 ± 0.16 s right limb and 0.429 ± 0.23 s left limb), stance duration (PD patients: 0.74 ± 0.14 s right limb and 0.74 ± 0.16 s left limb; healthy subjects: 1.34 ± 1.1 s right limb and 0.83 ± 0.6 s left limb), and velocity (PD patients: 0.082 ± 0.29 m/s; healthy subjects: 1.33 ± 0.06 m/s) [5]. Sofuwa et al. concluded that individuals with PD showed a significant reduction in step length and walking speed compared with the non-PD control group [6]. These observations suggest that foot force is affected by PD. Lescano et al. aimed to analyze gait parameters, stance, swing phase duration, and the magnitude of the vertical component of the ground reaction force for the purpose of assessing whether there are statistically significant differences between PD patients in stages 2 and 2.5 (modified Hoehn and Yahr (HY) scale, see description in Table 1) [7]. Gait information has been developed for movement analysis in healthy control (CO, term defined by PhysioNet [8]) subjects and other subjects with different types of diseases. This approach is useful for understanding movement disorders arising from PD, and may be valuable in developing non-invasive automatic detection and severity classification approaches for PD.

Classification is the process of identifying the class of a new observation using a set of categories based on a training process involving observations for which the classes are known. In PD classification, various machine learning algorithms have been implemented as classifiers and combined with sophisticated feature extraction methods for dimensionality reduction. Recently, deep learning approaches, instead of conventional machine learning algorithms, have been applied to improve PD classification performance. For example, Jane et al. presented a Q-backpropagated time delay neural network (Q-BTDNN) in a clinical decision-making system (CDMS) to diagnose patients with PD (PD vs. CO) [11]. The Q-BTDNN was trained using a Q-learning induced backpropagation (Q-BP) training algorithm by generating a reinforced error signal, and the weights of the network were corrected through the backpropagation of the generated error signal. Correa et al. implemented a method to model PD patients’ difficulties in starting and ending movements by examining information from speech, handwriting, and gait [12]. These researchers trained a convolutional neural network (CNN) to classify PD patients and CO subjects. The PD population in the database was divided into three groups based upon the stage of PD: low, intermediate, or severe. Lee and Lim classified idiopathic PD patients and COs based on their gait force characteristics using a continuous wavelet transform (CWT) to generate approximate coefficients and detail coefficients [13]. Forty features were extracted from those coefficients using statistical approaches, including frequency distributions and their variabilities. The features of idiopathic PD patients and COs were classified using a neural network with weighted fuzzy membership functions (NEWFM). Zhao et al. developed a two-channel model that combined Long Short-Term Memory (LSTM) and CNNs to learn spatio-temporal patterns in gait data recorded by foot sensors [14]. The model was trained and tested on three public vGRF datasets. The model could perform multi-category classification on features such as the severity level of PD, while previous machine learning-based approaches could only perform binary classification.

As previously mentioned, only a few studies have used the deep learning approach for the detection and severity classification of PD, and some of them have used statistical features combined with machine learning methods. The drawbacks of using machine learning are the dependence of its performance on data size and the understanding of features [15,16]. Machine learning only performs well on small to medium datasets and needs a better understanding of features to represent the data. The objective of this work was to develop a deep learning classifier to help physicians screen and classify the severity of PD in patients, using vGRF spectrograms. The effectiveness of time–frequency spectrogram (feature transformation) of vGRF signals from left (LF), right (RF), and compound foot (CF = LF + RF) movements in classifying features of PD severity was investigated. Specifically, the aim was to determine whether a significant difference in vGRF is related to the specifics of disease severity, as passive (weight acceptance) and active (push off) peaks of vGRF are important gait parameters [17] and exhibit significant relevance in the detection of gait abnormalities, especially in the PD gait assessment [13,14,18,19,20]. Different deep learning algorithms (including AlexNet, ResNet-50, ResNet-101, and GoogLeNet) were also utilized with the proposed method to compare the effectiveness among classifiers.

2. Materials and Methods

The proposed PD severity classification algorithm attempts to extract pattern features and visualizations from vGRF signals in PD patients with severity stages of 0, 2, 2.5, and 3 on the HY rating scale by transforming one-dimensional time domain signals into two-dimensional patterns (images) using the feature transformation method from a CWT. The proposed PD severity classification algorithm consists of four main steps, as shown in Figure 2: (1) signal preprocessing of PD patients’ vGRF signals, (2) feature extraction from a spectrogram of the vGRF signal generated using CWT and PCA, (3) construction and training of a CNN classifier, and (4) cross-validation to evaluate the performance of the classification algorithm.

2.1. Gait in Parkinson’s Disease Database

The vGRF database used in this research, the Gait in Parkinson’s Disease Database (gaitpdb), is available online from PhysioNet [8]. The database comprises three datasets, which were contributed by Yogev et al. (Ga) [21], Hausdorff et al. (Ju) [22], and Frenkel-Toledo et al. (Si) [23,24].

The database contains information recorded from 93 idiopathic PD patients (average age: 66.3 years; 63% men and 37% women) and 73 CO subjects (average age: 66.3 years; 55% men and 45% women). Every subject was instructed to walk at their usual pace for about two minutes while wearing a pair of shoes with eight force sensors located under each insole. The raw vGRF signal data in this database were obtained using force-sensitive sensors (Ultraflex Computer Dyno Graphy, Intronic Inc., NL-7650 AB Tubbergen, The Netherlands) with the output proportional to the force under the foot in Newtons, collected at 100 samples per second (frequency of readings during movement was 100 Hz). The recordings also included two signals that reflect the sum of the eight sensor outputs from the left and right foot.

The database also contains information about each participant, including gender, age, height, weight, walking velocity, and severity level of PD. The PD severity level was assigned according to two rating scales, HY [10] and the Unified Parkinson’s Disease Rating Scale (UPDRS) [25]. The HY rating scale, widely used to represent the way in which symptoms of PD progress, defines five stages of PD, with two additional intermediate stages, 1.5 and 2.5 (Table 1) [10]. The number of participants diagnosed using the HY rating scale is shown in Table 2.

2.2. Signal Preprocessing

A two-minute foot force signal was acquired during data collection from subjects. The LF, RF, and CF vGRF signals of the CO and PD subjects were used as inputs to the proposed algorithm. It was difficult to interpret the foot force data directly, despite using a CWT to transform the features, due to the length of the foot force signal. To observe the foot force signal more accurately, a window function was employed. A window function is a mathematical construct that is zero-valued outside of selected intervals. In this research, 10, 15, and 30 s window sizes were used. The aim of the time-windowing process was to obtain shorter signal data. In the clinical application, this data collection is more convenient for the PD patient and, furthermore, reduces the fall risk. The possibility of patient injury rises if the data collection time is longer. Normalization and zero-mean processing were also used, to reduce the redundancy and dependency of data.

In 1987, Nilsson and Thorstensson observed the adaptability in the frequency and amplitude of leg movements during human locomotion at different speeds [26]. They reported that the overall range of stride frequency for normal leg movements is 0.83–1.95 Hz. The stride cycle period is defined as the time from the heel contact of one foot with the ground to the next heel contact of the same foot with the ground. The stride cycle period can be derived from the vGRF signal, and the stride frequency is the inverted value of the stride cycle duration. In conclusion, we selected two frequency ranges, 0.83–1.95 Hz and 1.95–50 Hz, for detailed observations of vGRF spectrograms among CO and PD subjects.

2.3. The Continuous Wavelet Transform

The continuous wavelet transform (CWT) is a signal processing tool used to observe the time–frequency spectrum characteristics of non-stationary signals [27]. As in the case of the Gabor transform [28], a CWT can be used to filter a signal using a dilated version of the mother wavelet, but the frequency translation is affected by dilation (scaling) and contraction. A CWT is a time–frequency transformation method, because it changes the signal time domain to the time–frequency domain. The output of a CWT is a time–frequency spectrogram (time–scale representation), which provides valuable information about the relationship between time and frequency.

A CWT consists of a time series function

x (t) \in L^{2} (R)

, with a scaling or dilation factor

s \in R^{+} (s > 0)

that controls the width of the wavelet, and a translation parameter,

τ

, controlling the location of the wavelet as expressed in the following equation:

X_{w} (s, τ) = \frac{1}{\sqrt{s}} \int_{- \infty}^{\infty} x (t) ψ^{*} (\frac{t - τ}{s}) d t

where

ψ (t)

is a mother wavelet, also called a window function. The mother wavelet function used in this research was a Morlet or Gabor wavelet. This wavelet function consists of a Gaussian-windowed complex sinusoid (a complex exponential multiplied by a Gaussian window) as follows:

ψ_{ω_{0}} (t) = (e^{- i f t} - e^{- \frac{1}{2} f^{2}}) e^{- \frac{1}{2} t^{2}}

The parameter

t

refers to the time, and

f

represents the reference frequency.

The aim of the time–frequency transformation is to represent the vGRF signal (Figure 3a) as a time–frequency spectrogram image, as shown in Figure 3b,c, Figure 4 and Figure 5. The images clearly show different patterns of vGRF between CO and PD subjects that cannot be found in the time and frequency domains of the signal. Using the time–frequency spectrogram, variations in the foot pressure signal caused by temporal characteristics can also be analyzed. Temporal characteristics, also known as spatial characteristics or linear gait variabilities, consist of the measurements of step length, the stance width, the length of the step rhythm, and the step velocity.

2.4. Principal Component Analysis

The main goal of principal component analysis (PCA) is to perform dimension reduction for a dataset containing a large number of interrelated variables while, to the greatest extent possible, retaining the variations present in the dataset [29]. This reduction is achieved by transforming the dataset into a new set of variables, the principal components (PCs), which are ordered, de-correlated variables.

The PCA method is defined mathematically using the following steps. Consider a matrix

X = {[P_{1}; P_{2}; P_{3}; \dots; P_{2}]}^{T}

constructed using the spectrogram images of PDs and COs, where

P

is a row vector consisting of the pixels of a spectrogram image of PDs or COs, and

i

is the number of spectrogram images of PDs and COs. The PC is built using the equation

X^{T} X

, a covariance matrix of the matrix X, to determine its eigenvalues and eigenvectors. The

W

matrix, an

m \times m

matrix of weights whose columns are the eigenvectors of

X^{T} X

, is obtained. Finally, the matrix for extracted feature

F

can be described as the full PCs’ decomposition of

X

, as shown in the following equation:

F = X W

.

The purpose of using PCA for feature enhancement was to extract fewer patterns while identifying the most important texture and pattern features. This processing was conducted in order to improve the performance of machine learning and artificial intelligence algorithms used for classifying the data points. The full PCs of each spectrogram image sample were selected to preserve the important texture and pattern features for visualization.

2.5. Convolutional Neural Network

A convolutional neural network (CNN) is composed of one or more convolutional layers, often with subsampling and pooling layers, followed by one or more fully-connected layers, as in a basic multi-layer neural network [30]. CNN was utilized to distinguish the time–frequency spectrogram representation of vGRF between PD severity stages. The convolutional layer plays the most important role in CNN performance. This layer is composed of a set of kernels (learnable filters) as parameters, which contain a small receptive field but are expanded through the full depth of the input. When the data pass through this layer, each kernel is convolved across the spatial dimensionality of the input (width and height of the input volume), resulting in the calculation of the dot product and production of a 2D activation map. The filters in the convolutional layers are edge detectors and color filters. An activation layer utilizes a non-saturating activation function

f (x) = \max (0, x)

, such as a sigmoid function, in which

σ (x) = {(1 + e^{- x})}^{- 1}

, to generate the output from the input produced by the previous layer. Another important concept in CNNs is pooling, also known as non-linear down-sampling. The aim of the pooling layer is to reduce the dimensionality and minimize the number of parameters and the complexity of model computation. This layer, known as the max-pooling layer, takes the input of each activation map and scales the input dimension using the MAX function. Finally, the fully connected layers attempt to generate scores from the previous activations to use for classification, as in traditional artificial neural networks (ANNs). Neurons in this layer have connections to all of the outputs of the previous layer. The performance of AlexNet, ResNet-50, ResNet-101, and GoogLeNet was examined in this study.

2.5.1. AlexNet CNN

The AlexNet architecture [31] comprises 25 layers, including an input layer, 5 convolution 2D layers, 7 rectified linear unit (ReLU) layers, 2 cross-channel normalization layers, 3 max-pooling 2D layers, 3 fully connected layers, 2 dropout layers for regularization, a softmax layer using a normalized exponential function, and an output layer. The input to the AlexNet CNN in the proposed method is a time–frequency spectrogram of the vGRF signals produced by the CWT. There are two methods for fine-tuning a pretrained AlexNet CNN: transfer learning and feature extraction. We chose the feature extraction method because it is easy to apply to pretrained networks without expending a lot of time, as it is faster than the transfer learning method and requires less training. This method applies two previous fully connected layers and uses a support vector machine (SVM) for classification.

2.5.2. ResNet-50 and ResNet-101 CNN

The main idea behind a residual network (ResNet) [32] is the presentation of a so-called “identity shortcut connection” that skips one or more layers. A shortcut (or skip) connection is used to solve the problem of vanishing or exploding gradients by using blocks that re-route the input and add to the concept learned in the previous layer. During learning, a layer learns the concepts of the previous layer and merges with inputs from that previous layer. ResNet-X refers to a residual deep neural network with X number of layers; for example, ResNet-50 indicates a ResNet developed using 50 layers. The architectures of ResNet-50 and ResNet-101 are described in Table 3.

2.5.3. GoogLeNet CNN

GoogLeNet [33] is a pretrained CNN that has 22 layers with 9 inception layers. An inception layer determines the optimal local sparse structure in a convolutional vision network, which can be approximated and covered by readily available dense components. In general, the inception layer is a network consisting of parallel convolutions of different sizes and types (

1 \times 1

,

3 \times 3

, and

5 \times 5

) for the same input, which stacks all of the outputs. The exact structure of GoogLeNet is as follows:

An average pooling layer with a $5 \times 5$ filter size and a stride of 3.
A $1 \times 1$ convolution with 128 filters for dimension reduction and rectified linear activation.
A fully connected layer with 1024 units and rectified linear activation.
A dropout layer with 70% rate of dropped outputs.
A linear layer with softmax loss as the classifier.

Although AlexNet, ResNet-50, ResNet-101, and GoogLeNet achieved significant performance in the PD severity detection (overall accuracy ~97%), their architecture characteristics exhibited different influences on performance based on the benefits and drawbacks of the networks. The advantages and disadvantages of AlexNet, ResNet-50, ResNet-101, and GoogLeNet applied in the proposed method are summarized in Table 4.

2.6. Cross-Validation

Cross-validation is a statistical method used to assess and compare learning algorithms by dividing data into two groups: a training set used to train a model and a testing set used to test the model [36]. The training and testing sets are varied in consecutive rounds so that each data point is tested using a classifier in whose training it did not participate. There are two main purposes of using cross-validation. Cross-validation is used to quantify the generalizability of an algorithm, by testing the classifier on unseen data. The second purpose is to evaluate the performance of different algorithms and identify the best algorithm with which to classify the available data or, alternatively, to compare the performance of two or more variants of a parameterized model. In order to compare the results with the existing literature, k-fold cross-validation was utilized. Consequently, k iterations of training and testing were carried out in such a way that within each iteration, a different fold of the dataset was used for testing, while the remaining (k-1) folds were used for training. In this research, 10-fold cross-validation was applied.

3. Results

The experiments were carried out using MATLAB R2018a software on an NVIDIA GeForce GTX 1060 6 GB computer with 24 GB RAM. The computation time is affected by the number of input time–frequency spectrogram images (related to the time-windowing process, where a smaller time window will result in more images and longer computation time) and the number of neurons in the CNN. We employed multi-class classification for the COs and PD Stages 2, 2.5, and 3. This approach is representative of real-life applications, because doctors and neurologists do not have preliminary information about whether a patient is healthy or suffers from PD and, if the latter, what the severity is.

The sensitivity, specificity, accuracy, and AUC value of the proposed method were included as parameters for evaluation. The detailed definition of each evaluation parameter is provided in [37]. When selecting between diagnostic tests, Youden’s index is often applied to evaluate the effectiveness of the test [38]. Youden’s index is a function of sensitivity and specificity, and its value ranges between 0 and 1. A value close to 1 indicates that the diagnostic test’s effectiveness is relatively high and the test is close to perfect, and a value close to 0 indicates poor effectiveness, where the test is useless. Youden’s index (J) is the sum of the two fractions and indicates whether the measurements correctly diagnosed the diseased group (sensitivity) and healthy controls (specificity) over all cut-points

c

,

- \infty < c < \infty

:

J = m a x_{c} \{sensitivity (c) + specificity (c) - 1\}

3.1. PD Severity Classification of Separated Ga, Ju, and Si Datasets

The gaitpdb database [8] contains three different vGRF datasets based on different studies: the Ga dataset describes dual tasking in PD patients, the Ju dataset indicates rhythmic auditory stimulation (RAS) in PD patients, and the Si dataset represents treadmill walking in PD patients. There are 29 PD patients and 18 CO subjects in the Ga dataset, 29 PD patients and 26 CO subjects in the Ju dataset, and 35 PD patients and 29 CO subjects in the Si dataset. The input signal for the proposed algorithm was dependent on the window size during the time-windowing process. For the 10 s window, the input signal numbers for CO, PD Stage 2, PD Stage 2.5, and PD Stage 3 in the Ga, Ju, and Si datasets were 447, 492, 240, and 168; 199, 352, 460, and 109; and 348, 336, and 84, respectively. In the 15 s time window, the input signal numbers for CO, PD Stage 2, PD Stage 2.5, and PD Stage 3 in the Ga, Ju, and Si datasets were 297, 328, 160, and 112; 129, 229, 300, and 72; and 232, 224, and 56, respectively. In the 30 s time window, the input signal numbers for CO, PD Stage 2, PD Stage 2.5, and PD Stage 3 in the Ga, Ju, and Si datasets were 147, 164, 80, and 56; 58, 103, 135, and 35; and 116, 112, and 28, respectively.

The proposed method covered two kinds of classifications, multi-class (CO vs. PD Stage 2 vs. PD Stage 2.5 vs. PD Stage 3) classification and two-class (CO vs. PD) classification. In the two-class classification, PD Stage 2, 2.5, and 3 datasets were combined into one PD dataset. The best classification performance was obtained using AlexNet CNN for multi-class classification and ResNet CNN for two-class classification. The best classification result of the Ga dataset has 98.15% sensitivity, 98.16% specificity, 98.16% accuracy, and an AUC value of 0.9816 on average for multi-class classification and 99.77% sensitivity, 98.80% specificity, 99.11% accuracy, and an AUC value of 0.9995 for two-class classification. The best classification result of the Ju dataset has 98.06% sensitivity, 98.38% specificity, 98.24% accuracy, and an AUC value of 0.9822 on average for multi-class classification and 98.94% sensitivity, 99.04% specificity, 99.01% accuracy, and an AUC value of 0.9993 for two-class classification. The best classification result for the Si dataset has 97.73% sensitivity, 98.76% specificity, 98.27% accuracy, and an AUC value of 0.9825 on average for multi-class classification and 98.85% sensitivity, 98.41% specificity, 98.56% accuracy, and an AUC value of 0.9964 for two-class classification. Based on these classification results, the performance of the proposed method was not influenced by different datasets in the database, even though the data collection processes varied among these datasets.

3.2. PD Severity Classification of All Datasets (Merged)

For this classification, the three vGRF datasets in gaitpdb were merged and used as inputs to the proposed PD severity classification algorithm. For the 10 s, 15 s, and 30 s time windows, the input signal numbers for CO, PD Stage 2, PD Stage 2.5, and PD Stage 3 were 994, 1180, 784, and 277; 658, 781, 516, and 184; and 321, 379, 243, and 91, respectively. The best result for this classification type was obtained using the ResNet CNN, with 92.08% sensitivity, 95.60% specificity, 94.58% accuracy, and an AUC value of 0.9384 on average for multi-class classification and 94.46% sensitivity, 97.69% specificity, 96.63% accuracy, and an AUC value of 0.9949 for two-class classification. The complete classification results are shown in Table 5, Table 6, Table 7, Table 8, Table 9, Table 10, Table 11, Table 12, Table 13, Table 14, Table 15 and Table 16 for multi-class and Table 17, Table 18 and Table 19 for two-class, and Table 20, Table 21, Table 22 and Table 23 summarizes the classification results.

4. Discussion

In this section, we discuss the gait analysis for each severity stage of PD based on the time and frequency analyses of the time–frequency spectrograms. Some key features of a signal are difficult to observe with the naked eye, but time–frequency spectrogram analysis can help to decipher important information regarding time and frequency characteristics. A CWT was used in this study to transform the signal from the time domain into the time–frequency domain. The gait phenomena could be identified using pattern visualization and recognition based on time–frequency spectrograms for CO subjects and PD patients with severity stages of 2, 2.5, and 3.

This observation was only performed for the CF vGRF signal. Since this type of input signal is the additive force between the left and right foot force signals, it describes the correlations between the features of the left and right feet instead of a single feature of the left or right foot. In order to further investigate the gait phenomena, a 10 s time window spectrogram was selected because the image feature was derived from a shorter input signal, and more detail can be perceived from the texture and pattern visualization of gait phenomena. For a 15 and 30 s time window spectrogram, the texture and pattern information is more compressed, and thus, the gait phenomena are blurred and not easily observed (see Figure 4 and Figure 5). The 0.1–5 Hz and 5–50 Hz frequency ranges were only applied to the detailed observations of the CWT time–frequency spectrogram and were not used for the classification.

4.1. Healthy Controls

Normal gait phenomena were interpreted by observing the time–frequency spectrogram of CO subjects, as shown in the first column of Figure 3. In the 0.1–5 Hz frequency range (Figure 3, first column, second row), the strongest walking force magnitude, represented in yellow, of the normal gait occurs at 1.6–2.1 Hz and is stable from the initial time to the end of the experiment. The foot force distributions and walking velocities for normal subjects were therefore the same when they were walking. At 2.5–3 Hz and approximately 4.5–5 Hz, small areas signifying the lowest force magnitude, shown in dark blue, alternate with a significant force magnitude, indicated by light blue, forming a regular pattern. This phenomenon appears in the spectrogram and is caused by the CF force signal at the lowest magnitudes. The three lowest magnitudes can be observed in one cycle of the CF force time domain signal (top left of Figure 3); the lowest magnitudes are almost equal in every cycle of the signal. The lowest magnitudes that occur at the beginning and end of the half gait cycle (that is, only the left or right foot gait cycle), close to the 0 force unit, show that the toe-off and initial contact and the lowest magnitude that occur between the half gait cycle are demonstrated only when one foot is in contact with the ground.

In the 5–50 Hz frequency range (Figure 3, first column, third row), a steady, strong force level, represented in yellow, occurs at approximately 5 Hz, with the same magnitude as that which occurs during walking, from the beginning to the end of the recording, and a significant force magnitude, shown in light blue, occurs up to 50 Hz in all records. Both time–frequency spectrograms indicate that the time and frequency components in the spectrogram have a regular pattern. This interpretation became a benchmark for investigating PD gait phenomena. These data were compared to analyze the gait characteristics of PD patients based on spectrogram analyses.

4.2. Parkinson’s Disease Stage 2

The time–frequency spectrograms for PD patients were similar to those of the CO spectrograms. For PD Stage 2 patients, as presented in the second column of Figure 3, the strongest force is at 1.6–2.1 Hz in the 0.1–5 Hz (Figure 3, second column, second row) frequency range, and there is a significant, strong magnitude, shown in light yellow, at 1 Hz, which is weaker than the force magnitude at 1.6–2.1 Hz. The significant force magnitude at 2.5–3 Hz and approximately 4.5–5 Hz becomes more yellow instead of light blue as in the CO spectrogram. It is also apparent that the pattern of the lowest force magnitude at 2.5–5 Hz is regular at some times and irregular at other times. This observation indicates that the magnitudes of the global and local minima are not the same in every gait cycle (Figure 3, second column, first row). In the time domain, the CF vGRF signal has fluctuating force magnitudes that cause an irregularity in the signal.

In the 5–50 Hz frequency range (Figure 3, second column, third row), the strongest force magnitude, shown in yellow, is about 5 Hz, and significant force, represented by light blue, occurs up to 50 Hz every time. However, the force magnitude is not distributed equally over the entire walking period.

4.3. Parkinson’s Disease Stage 2.5

As shown in the third column of Figure 3, the spectrogram for PD Stage 2.5 patients is not very different from the PD Stage 2 spectrogram in either frequency range. The only difference is that, in the 0.1–5 Hz frequency range, a significant, strong magnitude at 1 Hz becomes stronger, and yellow areas of force magnitude appear in the image. PD patients in the early stages—2 and 2.5—of the disease can have a walking velocity similar to that of COs, but their force distribution is typically not equally distributed, due to the presence of tremors.

4.4. Parkinson’s Disease Stage 3

Of the patients studied in this research, those with PD Stage 3 had the most severe level of disease. The spectrograms of this group exhibit the most irregular patterns of all severity levels. In the fourth column, first row of Figure 3, the CF vGRF signal has the most fluctuation and irregular force magnitudes because of the jerky movements and tremors of the patients.

In the 0.1–5 Hz frequency range (third column, second row of Figure 3), the strongest walking force magnitude, shown in yellow, occurs at a lower frequency than in stages 2 and 2.5 at 1–1.5 Hz. A significant strong force magnitude, depicted in light yellow, also appears at approximately 0.75 Hz, although the force level is not the same in every gait cycle. At 2–3 Hz and 3.5–4 Hz, significant force magnitude regions occur, as shown by colors that are more yellow.

In the 5–50 Hz frequency range (third column, third row of Figure 3), the strongest force magnitude only appears in certain gait cycles and is not equally distributed. A significant force magnitude, shown in light blue, only occurs up to 20 Hz, and forms an irregular pattern in every gait cycle.

4.5. Comparison of Results with the Existing Literature

A comparison between the proposed methodology and a study by Zhao et al. [14] is presented in Table 24. The authors carried out multi-class classification of vGRF signals for CO vs. PD Stage 2 vs. PD Stage 2.5 vs. PD Stage 3 using the same information found in the same database used for the proposed method, gaitpdb. These authors separated the classification types based on the three datasets—Ga, Ju, and Si—and used 10-fold cross-validation as the evaluation method. The two-class classification results were also compared with those of studies conducted by Maachi et al. [39], Wu et al. [40], Ertugrul et al. [41], Zeng et al. [42], Daliri [43], and Khoury et al. [44,45]. These comparison results are shown in Table 25 and Table 26. In Khoury et al.’s study, the classification types were divided based on the three datasets—Ga, Ju, and Si.

In summary, the proposed method produced almost the same classification results as those published in the existing literature, but the proposed algorithm generated better visualizations via time–frequency spectrograms associated with the progression of PD severity. The irregularity in patterns in the spectrograms is proportional to the severity level of PD. The more severe the disease, therefore, the more irregular the spectrogram’s pattern. This phenomenon could be helpful for medical specialists or neurologists in monitoring PD progression, allowing them to provide more effective and accurate medications and therapies to patients.

5. Conclusions

In this study, a deep learning algorithm was implemented based on vGRF time–frequency features for the detection and severity classification of Parkinson’s disease. Pattern visualization and recognition of the time–frequency spectrogram made it possible to successfully differentiate PD severity stages and COs. A CWT was used to generate spectrograms to visualize gait foot force signals by transforming signals from the time domain into the time–frequency domain. Three time-window sizes (10, 15, and 30 s), two frequency ranges (0.83–1.95 and 1.95–50 Hz), and three types of gait foot force signals (LF, RF, and CF force signals) were selected as inputs to obtain good feature visualization. After the original signal was transformed, PCA was applied for feature enhancement, to increase between-class separability and to reduce within-class separability. Finally, CNNs were used to perform classification. To evaluate the CNNs’ classification process, 10-fold cross-validation was performed, and the accuracy, sensitivity, specificity, and the AUC value were evaluated. The proposed method was able to achieve the highest performance for more than 97.42% of the parameters being evaluated and achieved superior performance in comparison with the detection and PD severity classification performance of state-of-the-art methods found in the literature.

Although the evidence indicates that the proposed method achieved good performance, there are several major drawbacks that could be improved. First, an existing database was used with the proposed method, and clinical data with a greater number of severity levels should be used to verify the performance and to resolve the limitation of the relatively small number of PD patients at certain severity levels in the current database. Clinical data collection will be carried out using a smart insole with an embedded 0.5” force sensing resistor of our own design. The precision and accuracy of force sensing resistor readings are also considered in order to obtain the correct representation of the vGRF signal. PD patients will be asked to perform some simple daily activities, such as turning around and sitting, instead of only walking down a long pathway. Second, long-term data collection to monitor PD progression is important for treatment decisions, since the gait patterns of PD patients appear to change with the long-term progression of the disease. Third, to further investigate the clinical meaning of the results, PD gait phenomena based on time–frequency spectrograms should be discussed with physicians. Fourth, other input data, such as kinetic data, temporal data, step length, and cadence, and other classifiers should be used to confirm and compare the effectiveness of pattern visualization and recognition based on the use of time–frequency spectrograms in PD detection.

Author Contributions

Conceptualization, F.S. and C.-W.L.; methodology, F.S. and C.-W.L.; software, F.S.; validation, F.S.; investigation, F.S. and C.-W.L.; resources, C.-W.L.; writing—original draft preparation, F.S.; writing—review and editing, C.-W.L.; supervision, C.-W.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Ministry of Science and Technology (Taiwan), grant number 108-2628-E-006-003-MY3.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

Lee, A.; Gilbert, R. Epidemiology of Parkinson Disease. Neurol. Clin. 2016, 34, 955–965. [Google Scholar] [CrossRef]
Parkinson’s Disease Foundation. Statistics on Parkinson’s EIN: 13-1866796. 2018. Available online: https://bit.ly/2RCeh9H (accessed on 12 July 2019).
National Institute of Neurological Disorders and Stroke. Parkinson’s Disease Information Page. 2016. Available online: https://bit.ly/2xTA6rL (accessed on 12 July 2019).
Hoff, J.I.; Plas, A.A.; Wagemans, E.A.H.; Van Hilten, J.J. Accelerometric assessment of levodopa-induced dyskinesias in Parkinson’s disease. Mov. Disord. 2001, 16, 58–61. [Google Scholar] [CrossRef]
Pistacchi, M.; Gioulis, M.; Sanson, F.; De Giovannini, E.; Filippi, G.; Rossetto, F.; Marsala, S.Z. Gait analysis and clinical correlations in early Parkinson’s disease. Funct Neurol. 2017, 32, 28. [Google Scholar] [CrossRef] [PubMed]
Sofuwa, O.; Nieuwboer, A.; Desloovere, K.; Willems, A.-M.; Chavret, F.; Jonkers, I. Quantitative Gait Analysis in Parkinson’s Disease: Comparison with a Healthy Control Group. Arch. Phys. Med. Rehabil. 2005, 86, 1007–1013. [Google Scholar] [CrossRef] [Green Version]
Lescano, C.N.; Rodrigo, S.E.; Christian, D.A. A possible parameter for gait clinimetric evaluation in Parkinson’s disease patients. J. Phys. Conf. Ser. 2016, 705, 12019. [Google Scholar] [CrossRef]
Goldberger, A.L.; Amaral, L.A.; Glass, L.; Hausdorff, J.M.; Ivanov, P.C.; Mark, R.G.; Mietus, J.E.; Moody, G.B.; Peng, C.K.; Stanley, H.E. PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals. Circulation 2003, 101, e215–e220. [Google Scholar] [CrossRef] [Green Version]
Arafsha, F.; Hanna, C.; Aboualmagd, A.; Fraser, S.; El Saddik, A. Instrumented Wireless SmartInsole System for Mobile Gait Analysis: A Validation Pilot Study with Tekscan Strideway. J. Sens. Actuator Netw. 2018, 7, 36. [Google Scholar] [CrossRef] [Green Version]
Hoehn, M.M.; Yahr, M.D. Parkinsonism: Onset, progression, and mortality. Neurology 1967, 17, 427. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Jane, Y.N.; Nehemiah, H.K.; Arputharaj, K. A Q-backpropagated time delay neural network for diagnosing severity of gait disturbances in Parkinson’s disease. J. Biomed. Inform. 2016, 60, 169–176. [Google Scholar] [CrossRef]
Vasquez-Correa, J.C.; Arias-Vergara, T.; Orozco-Arroyave, J.R.; Eskofier, B.M.; Klucken, J.; Noth, E. Multimodal Assessment of Parkinson’s Disease: A Deep Learning Approach. IEEE J. Biomed. Health Inform. 2018, 23, 1618–1630. [Google Scholar] [CrossRef]
Lee, S.-H.; Lim, J.S. Parkinson’s disease classification using gait characteristics and wavelet-based feature extraction. Expert Syst. Appl. 2012, 39, 7338–7344. [Google Scholar] [CrossRef]
Zhao, A.; Qi, L.; Li, J.; Dong, J.; Yu, H. A hybrid spatio-temporal model for detection and severity rating of Parkinson’s disease from gait data. Neurocomputing 2018, 315, 1–8. [Google Scholar] [CrossRef] [Green Version]
Hoffmann, A.G. General Limitations on Machine Learning. In Proceedings of the 9th European Conference on Artificial Intelligence, Stockholm, Sweden, 1 January 1990; pp. 345–347. [Google Scholar]
Mastorakis, G. Human-like machine learning: Limitations and suggestions. arXiv 2018, arXiv:1811.06052. [Google Scholar]
Jiang, X.; Napier, C.; Hannigan, B.; Eng, J.J.; Menon, C. Estimating Vertical Ground Reaction Force during Walking Using a Single Inertial Sensor. Sensors 2020, 20, 4345. [Google Scholar] [CrossRef]
Lin, C.-W.; Wen, T.-C.; Setiawan, F. Evaluation of Vertical Ground Reaction Forces Pattern Visualization in Neurodegenerative Diseases Identification Using Deep Learning and Recurrence Plot Image Feature Extraction. Sensors 2020, 20, 3857. [Google Scholar] [CrossRef]
Setiawan, F.; Lin, C.-W. Identification of Neurodegenerative Diseases Based on Vertical Ground Reaction Force Classification Using Time–Frequency Spectrogram and Deep Learning Neural Network Features. Brain Sci. 2021, 11, 902. [Google Scholar] [CrossRef]
Alam, N.; Garg, A.; Munia, T.T.K.; Fazel-Rezai, R.; Tavakolian, K. Vertical ground reaction force marker for Parkinson’s disease. PLoS ONE 2017, 12, e0175951. [Google Scholar] [CrossRef]
Yogev, G.; Giladi, N.; Peretz, C.; Springer, S.; Simon, E.S.; Hausdorff, J.M. Dual tasking, gait rhythmicity, and Parkinson’s disease: Which aspects of gait are attention demanding? Eur. J. Neurosci. 2005, 22, 1248–1256. [Google Scholar] [CrossRef]
Hausdorff, J.M.; Lowenthal, J.; Herman, T.; Gruendlinger, L.; Peretz, C.; Giladi, N. Rhythmic auditory stimulation modulates gait variability in Parkinson’s disease. Eur. J. Neurosci. 2007, 26, 2369–2375. [Google Scholar] [CrossRef]
Frenkel-Toledo, S.; Giladi, N.; Peretz, C.; Herman, T.; Gruendlinger, L.; Hausdorff, J.M. Treadmill walking as an external pacemaker to improve gait rhythm and stability in Parkinson’s disease. Mov. Disord. 2005, 20, 1109–1114. [Google Scholar] [CrossRef]
Frenkel-Toledo, S.; Giladi, N.; Peretz, C.; Herman, T.; Gruendlinger, L.; Hausdorff, J.M. Effect of gait speed on gait rhythmicity in Parkinson’s disease: Variability of stride time and swing time respond differently. J. Neuroeng. Rehabil. 2005, 2, 23. [Google Scholar] [CrossRef] [Green Version]
Fahn, S.; Marsden, C.D.; Calne, D.B.; Goldstein, M. Recent Developments in Parkinson’s Disease. Florham Park N. J. Macmillan Health Care Inf. 1987, 2, 153–164. [Google Scholar]
Nilsson, J.; Thorstensson, A. Adaptability in frequency and amplitude of leg movements during human locomotion at different speeds. Acta Physiol. Scand. 1987, 129, 107–114. [Google Scholar] [CrossRef]
Sadowsky, J. The continuous wavelet transform: A tool for signal investigation and understanding. Johns Hopkins APL Tech. Dig. 1994, 15, 306. [Google Scholar]
Bernardino, A.; Santos-Victor, J. A real-time gabor primal sketch for visual attention. In Iberian Conference on Pattern Recognition and Image Analysis; Springer: Berlin/Heidelberg, Germany, 2005; pp. 335–342. [Google Scholar]
Jolliffe, I.T. Introduction. In Principal Component Analysis, 2nd ed.; Springer: New York, NY, USA, 2002. [Google Scholar]
O’Shea, K.; Nash, R. An introduction to convolutional neural networks. arXiv 2015, arXiv:1511.08458. [Google Scholar]
Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90. [Google Scholar] [CrossRef]
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar] [CrossRef] [Green Version]
Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1–9. [Google Scholar] [CrossRef] [Green Version]
Khan, A.; Sohail, A.; Zahoora, U.; Qureshi, A.S. A survey of the recent architectures of deep convolutional neural networks. Artif. Intell. Rev. 2020, 53, 5455–5516. [Google Scholar] [CrossRef] [Green Version]
Alzubaidi, L.; Zhang, J.; Humaidi, A.J.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.A.; Al-Amidie, M.; Farhan, L. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. J. Big Data 2021, 8, 1–74. [Google Scholar] [CrossRef]
Refaeilzadeh, P.; Tang, L.; Liu, H. Cross-Validation. Encycl. Database Syst. 2009, 532–538. [Google Scholar] [CrossRef]
Fawcett, T. An introduction to ROC analysis. Pattern Recognit. Lett. 2006, 27, 861–874. [Google Scholar] [CrossRef]
Youden, W.J. Index for rating diagnostic tests. Cancer 1950, 3, 32–35. [Google Scholar] [CrossRef]
El Maachi, I.; Bilodeau, G.-A.; Bouachir, W. Deep 1D-Convnet for accurate Parkinson disease detection and severity prediction from gait. Expert Syst. Appl. 2020, 143, 113075. [Google Scholar] [CrossRef]
Wu, Y.; Chen, P.; Luo, X.; Wu, M.; Liao, L.; Yang, S.; Rangayyan, R. Measuring signal fluctuations in gait rhythm time series of patients with Parkinson’s disease using entropy parameters. Biomed. Signal Process. Control. 2017, 31, 265–271. [Google Scholar] [CrossRef]
Ertuğrul, Ö.F.; Kaya, Y.; Tekin, R.; Almali, M.N. Detection of Parkinson’s disease by Shifted One Dimensional Local Binary Patterns from gait. Expert Syst. Appl. 2016, 56, 156–163. [Google Scholar] [CrossRef]
Zeng, W.; Liu, F.; Wang, Q.; Wang, Y.; Ma, L.; Zhang, Y. Parkinson’s disease classification using gait analysis via deterministic learning. Neurosci. Lett. 2016, 633, 268–278. [Google Scholar] [CrossRef] [PubMed]
Daliri, M.R. Chi-square distance kernel of the gaits for the diagnosis of Parkinson’s disease. Biomed. Signal Process. Control. 2013, 8, 66–70. [Google Scholar] [CrossRef]
Khoury, N.; Attal, F.; Amirat, Y.; Oukhellou, L.; Mohammed, S. Data-Driven Based Approach to Aid Parkinson’s Disease Diagnosis. Sensors 2019, 19, 242. [Google Scholar] [CrossRef] [Green Version]
Khoury, N.; Attal, F.; Amirat, Y.; Chibani, A.; Mohammed, S. CDTW-Based Classification for Parkinson’s Disease Diagnosis. In Proceedings of the 26th European Symposium on Artificial Neural Networks, Bruges, Belgium, 25–27 April 2018. [Google Scholar]

Figure 1. Description of beginning and ending of different gait phases in a normal gait cycle (Arafsha et al. [9]).

Figure 2. Flowchart of the proposed PD detection and severity classification algorithm using the continuous wavelet transform as the feature transformation.

Figure 3. Time-frequency spectrograms using CWT of the first 10 s CF vGRF signals of CO, PD Stage 2, PD Stage 2.5, and PD Stage 3 subjects in a 10 s time window size from Ga [21] dataset: (a) original vGRF signal, (b) 0.1–5 Hz time-frequency spectrogram, and (c) 5–50 Hz time-frequency spectrogram.

Figure 4. Time-frequency spectrograms using CWT of the CF vGRF signals of CO, PD Stage 2, PD Stage 2.5, and PD Stage 3 subjects in the 0.1–5 Hz frequency range for (a) a 10 s time window size, (b) a 15 s time window size, and (c) a 30 s time window size from Ga [21] dataset.

Figure 5. Time-frequency spectrograms using CWT of the CF vGRF signals of CO, PD Stage 2, PD Stage 2.5, and PD Stage 3 subjects in the 5–50 Hz frequency range for (a) a 10 s time window size, (b) a 15 s time window size, and (c) a 30 s time window size from Ga [21] dataset.

Table 1. Hoehn and Yahr (HY) scale [10] for PD severity stage.

Stages	Description
0	No signs of disease
1	Symptoms are very mild; unilateral involvement only
1.5	Unilateral and axial involvement
2	Bilateral involvement without impairment of balance
2.5	Mild bilateral disease with recovery on pull test
3	Mild to moderate bilateral disease; some postural instability; physical independence
4	Severe disability; still able to walk or stand unassisted
5	Wheelchair bound or bedridden unless aided

Table 2. Number of subjects in three sub-datasets of Parkinson’s Disease database (gaitpdb) [8] based on the HY rating scale of severity.

Author	Stage 0	Stage 2	Stage 2.5	Stage 3
Ga [21]	18	15	8	6
Ju [22]	26	12	13	4
Si [23]	29	29	6	0

Table 3. Architectures for ResNet-50 and ResNet-101.

Layer Name	Output Size	50 Layer	101 Layer
conv1	$112 \times 112$	$7 \times 7$ , $64$ , stride 2
conv2_x	$56 \times 56$	$3 \times 3$ max pool, stride 2
conv2_x	$56 \times 56$	$[\begin{matrix} 1 \times 1, 64 \\ 3 \times 3, 64 \\ 1 \times 1, 256 \end{matrix}] \times 3$	$[\begin{matrix} 1 \times 1, 64 \\ 3 \times 3, 64 \\ 1 \times 1, 256 \end{matrix}] \times 3$
conv3_x	$28 \times 28$	$[\begin{matrix} 1 \times 1, 128 \\ 3 \times 3, 128 \\ 1 \times 1, 512 \end{matrix}] \times 4$	$[\begin{matrix} 1 \times 1, 128 \\ 3 \times 3, 128 \\ 1 \times 1, 512 \end{matrix}] \times 4$
conv4_x	$14 \times 14$	$[\begin{matrix} 1 \times 1, 256 \\ 3 \times 3, 256 \\ 1 \times 1, 1024 \end{matrix}] \times 6$	$[\begin{matrix} 1 \times 1, 256 \\ 3 \times 3, 256 \\ 1 \times 1, 1024 \end{matrix}] \times 23$
conv5_x	$7 \times 7$	$[\begin{matrix} 1 \times 1, 512 \\ 3 \times 3, 512 \\ 1 \times 1, 2048 \end{matrix}] \times 3$	$[\begin{matrix} 1 \times 1, 512 \\ 3 \times 3, 512 \\ 1 \times 1, 2048 \end{matrix}] \times 3$
	$1 \times 1$	average pool, 1000-d fc, softmax
FLOPs		$3.8 \times 10^{9}$	$7.6 \times 10^{9}$

Table 4. The advantages and disadvantages of the AlexNet, ResNet, and GoogLeNet CNN architecture [34,35].

Architecture	Advantage	Disadvantage
AlexNet layer depth: 8 parameters: 60 million	There is low feature loss, as the ReLU activation function does not limit the output. Uses data enhancement, dropout, and normalization layers to prevent the network from overfitting and improve the model generalization.	This model has much less depth; hence, it struggles to learn features from image sets. Takes more time to achieve higher accuracy (highest accuracy achieved: 99.11%).
ResNet-50 layer depth: 50 parameters: 25.6 million ResNet-101 layer depth: 101 parameters: 44.5 million	Decreased the error rate for deeper networks by introducing the idea of residual learning. Instead of widening the network, the increased depth of the network results in fewer additional parameters. This greatly reduces the training time and improves accuracy (highest accuracy ResNet-50: 99.20%; highest accuracy ResNet-101: 99.01%). Mitigates the effect of vanishing gradient.	A complex architecture Many layers may provide very little or no information. Redundant feature-maps may happen to be relearned.
GoogLeNet layer depth: 22 parameters: 7 million	Computational and memory efficiency. Reduced number of parameters by using bottleneck and global average pooling layer. Use of auxiliary classifiers to improve the convergence rate.	Lower accuracy (highest accuracy: 98.77%). Its heterogeneous topology necessitates adaptation from module to module. Substantially reduces the feature space because of the representation bottleneck and thus sometimes may lead to loss of useful information.

Table 5. Multi-class classification of LF from Ga dataset for CO (Class 0), PD Stage 2 (Class 2), PD Stage 2.5 (Class 2.5), and PD Stage 3 (Class 3) using several CNN classifiers (AlexNet, ResNet-50, ResNet-101, and GoogLeNet) with 10-fold cross-validation.

Time Window	Frequency Range	Disease Severity (Class)	AlexNet				ResNet-50				ResNet-101				GoogLeNet
Time Window	Frequency Range	Disease Severity (Class)	Sen (%)	Spec (%)	Acc (%)	AUC	Sen (%)	Spec (%)	Acc (%)	AUC	Sen (%)	Spec (%)	Acc (%)	AUC	Sen (%)	Spec (%)	Acc (%)	AUC
10 s	0.83–1.95 Hz	Class 0	98.21	98.11	98.14	0.9816	97.54	99.44	98.81	0.9849	97.54	99.56	98.89	0.9855	97.54	97.89	97.77	0.9771
		Class 2	93.29	97.78	96.14	0.9554	95.53	97.66	96.88	0.9659	97.15	96.49	96.73	0.9682	93.09	97.89	96.14	0.9549
		Class 2.5	82.50	96.12	93.69	0.8931	91.25	95.93	95.10	0.9359	88.75	97.29	95.77	0.9302	94.17	94.67	94.58	0.9442
		Class 3	83.93	97.37	95.69	0.9065	83.33	98.98	97.03	0.9116	88.10	99.32	97.92	0.9371	73.81	99.41	96.21	0.8661
	1.95–50 Hz	Class 0	97.76	98.56	98.29	0.9816	95.30	98.89	97.70	0.9710	94.63	98.44	97.18	0.9654	97.09	97.78	97.55	0.9743
		Class 2	95.93 *	98.25 *	97.40 *	0.9709 *	96.54	96.26	96.36	0.9640	96.34	94.74	95.32	0.9554	93.09	97.66	95.99	0.9538
		Class 2.5	93.75 *	98.83 *	97.92 *	0.9629 *	91.67	98.55	97.33	0.9511	87.50	98.01	96.14	0.9276	90.42	97.65	96.36	0.9403
		Class 3	96.43	99.15	98.81	0.9779	94.64	99.24	98.66	0.9694	89.29	99.24	98	0.9426	92.86	98.64	97.92	0.9575
15 s	0.83–1.95 Hz	Class 0	97.98	98.33	98.22	0.9816	93.94	98.50	96.99	0.9622	95.29	99.33	97.99	0.9731	97.98 *	99.17 *	98.77 *	0.9857 *
		Class 2	94.21	96.66	95.76	0.9543	93.90	95.08	94.65	0.9449	97.26	95.43	96.10	0.9634	96.04	96.84	96.54	0.9644
		Class 2.5	85	97.29	95.09	0.9114	88.75	96.74	95.32	0.9275	90	98.51	96.99	0.9425	88.75	97.69	96.10	0.9322
		Class 3	90.18	98.60	97.55	0.9439	88.39	98.98	97.66	0.9369	95.54	99.62	99.11	0.9758	91.96	99.24	98.33	0.9560
	1.95–50 Hz	Class 0	96.30	98.33	97.66	0.9731	93.94	98.50	96.99	0.9622	95.96	96.33	96.21	0.9615	95.96	98	97.32	0.9698
		Class 2	93.29	97.54	95.99	0.9542	93.90	95.08	94.65	0.9449	91.77	95.78	94.31	0.9378	94.51	95.78	95.32	0.9515
		Class 2.5	88.13	97.56	95.88	0.9284	88.75	96.74	95.32	0.9275	89.38	96.47	95.21	0.9292	84.38	96.74	94.54	0.9056
		Class 3	94.64	97.96	97.55	0.9630	88.39	98.98	97.66	0.9369	81.25	99.36	97.10	0.9031	84.82	98.47	96.77	0.9165
30 s	0.83–1.95 Hz	Class 0	95.24	98.33	97.32	0.9679	95.92	98.33	97.54	0.9713	91.84	99.67	97.09	0.9575	94.56	96.33	95.75	0.9545
		Class 2	93.90	97.17	95.97	0.9554	92.68	95.41	94.41	0.9404	95.73	91.87	93.29	0.9380	90.85	94.70	93.29	0.9278
		Class 2.5	95	97.55	97.09	0.9627	90	95.64	94.63	0.9282	85	94.82	93.06	0.8991	85	97.82	95.53	0.9141
		Class 3	94.64	99.49	98.88	0.9707	83.93	99.74	97.76	0.9184	76.79	99.74	96.87	0.8826	92.86	98.72	97.99	0.9579
	1.95–50 Hz	Class 0	95.92	98	97.32	0.9696	93.20	99	97.09	0.9610	93.20	97.33	95.97	0.9527	95.24	97.33	96.64	0.9629
		Class 2	93.29	95.76	94.85	0.9453	93.90	94.35	94.18	0.9412	93.30	92.58	93.06	0.9324	88.41	94.35	92.17	0.9138
		Class 2.5	80	98.37	95.08	0.8918	86.25	94.82	93.29	0.9054	82.50	95.91	93.51	0.8921	76.25	94.28	91.05	0.8526
		Class 3	98.21 *	97.44 *	97.54 *	0.9783 *	80.36	98.98	96.64	0.8967	76.79	99.23	96.42	0.8801	83.93	97.70	95.97	0.9081