Changes in Phonation and Their Relations with Progress of Parkinson’s Disease

Hypokinetic dysarthria, which is associated with Parkinson’s disease (PD), affects several speech dimensions, including phonation. Although the scientific community has dealt with a quantitative analysis of phonation in PD patients, a complex research revealing probable relations between phonatory features and progress of PD is missing. Therefore, the aim of this study is to explore these relations and model them mathematically to be able to estimate progress of PD during a two-year follow-up. We enrolled 51 PD patients who were assessed by three commonly used clinical scales. In addition, we quantified eight possible phonatory disorders in five vowels. To identify the relationship between baseline phonatory features and changes in clinical scores, we performed a partial correlation analysis. Finally, we trained XGBoost models to predict the changes in clinical scores during a two-year follow-up. For two years, the patients’ voices became more aperiodic with increased microperturbations of frequency and amplitude. Next, the XGBoost models were able to predict changes in clinical scores with an error in range 11–26%. Although we identified some significant correlations between changes in phonatory features and clinical scores, they are less interpretable. This study suggests that it is possible to predict the progress of PD based on the acoustic analysis of phonation. Moreover, it recommends utilizing the sustained vowel /i/ instead of /a/.


Introduction
Parkinson's disease (PD) is a frequent neurodegenerative disorder that is associated with a substantial reduction of dopaminergic neurons especially in substancia nigra pars compacta [1]. The primary motor symptoms of PD comprise tremor at rest, muscular rigidity, bradykinesia, and postural instability [1]. Patients with PD also develop a variety of non-motor symptoms [2] such as sleep disturbances, depression, cognitive impairment, etc. To diagnose, rate and monitor motor and non-motor symptoms of PD, various clinical rating scales such as Unified Parkinson's Disease Rating Scale (UPDRS) [3], Freezing Of Gait Questionnaire (FOG-Q) [4], or Addenbrooke's Cognitive Examination-Revised (ACE -R) [5] have been developed. Nevertheless, reliability of the assessment is often reduced by inter-rater variability [6].
Up to 90% [7] of patients with PD develop a multi-dimensional speech disorder named hypokinetic dysarthria (HD) [8], which is manifested in phonation, articulation, and prosody [9][10][11]. In the area of phonation, insufficient breath support, reduction in phonation time, increased acoustic noise, instability of articulatory organs, microperturbations of frequency/amplitude, and harsh breathy voice quality has been observed [9,12]. HD leads to serious complications in daily communication of patients with PD [13]. Generally, HD was found to be more severe in the advanced stages of PD [14].
As reported by the recent studies, acoustic analysis of HD can provide clinicians with non-invasive and reliable methodology of PD diagnosis, assessment and monitoring [9,15]. Moreover, this methodology has also been used to monitor the efficiency of PD treatment [10,[16][17][18]. In the field of acoustic analysis of PD phonation, the authors mostly focused on the sustained vowel /a/ [9]. Conventional phonatory features such as jitter, shimmer, harmonic-to-noise ratio, degree of unvoiced segments, and formant-based parameters extracted from this vowel have been widely used to diagnose PD [12,[19][20][21][22][23]. Although Hazan et al. [24] employed analysis of sustained phonation for diagnosis of PD even in its early stage, based on the recent review [9], most of the researchers find relevant applications of the phonatory analysis especially in moderate or severe stages of this disorder.
For example, the analysis of sustained phonation has been utilized during PD severity assessment. In 2010, Tsanas et al. [15] enrolled 42 PD patients and parameterized their sustained phonation of vowel /a/ by a set of conventional features that were consequently mapped to UPDRS, part III (motor examination) and the total score of this scale. Using classification and regression trees, they estimated the UPDRS III score with MAE (mean absolute error) equal to 5.95. The total UPDRS score was estimated with MAE = 7.52. A parametric version of this dataset has been made available for research purposes and other research teams further decreased the estimation error [25][26][27]. Another work that deals with the automatic clinical scores estimation was published by Mekyska et al. [21]. In this study, they acquired sustained phonation of vowels /a/, /e/, /i/, /o/, /u/ in 84 PD patients. Modeling conventional and advanced features by random forests provided the estimation of UPDRS III with MAE = 5.70. In addition, the authors estimated several other clinical scores such as UPDRS, part IV (complications of therapy) with MAE = 1.30 or Beck depression inventory (BDI) with MAE = 3.12.
Even though HD is one of the most problematic aspects of PD, the number of longitudinal studies investigating the evolution of HD in PD over time (based on the acoustic analysis) is very limited [28][29][30][31]. If we focus specifically on longitudinal monitoring of sustained phonation, then, in fact, we can identify only one study, which is published by Skodda et al. [31]. In this work, the authors repeatedly (with average time interval 32.50 months) acquired sustained vowel /a/ in 32 female and 48 male PD patients (age in session 1: 66.28 ± 8.11 years; PD duration in session 1: 6.10 ± 4.63 years; UPDRS III in session 1: 20.16 ± 10.96; UPDRS III in session 2: 19.58 ± 8.29). The voice was quantified by jitter, shimmer, noise-to-harmonic ratio, and mean fundamental frequency. Based on the paired t-test, the authors identified significant changes in shimmer and noise-to-harmonic ratio. In both cases, the values of these parameters increased. Another interesting finding is that, although some phonatory features significantly changed, UPDRS III was held widely stable over time. The authors provide two possible explanations: (1) voice impairment could be the result of an escalation of axial dysfunction too subtle to be mirrored by UPDRS III; (2) alterations of speech parameters could be completely independent of motor performance that may be based upon non-dopaminergic mechanisms. Inconsistencies in terms of the L-dopa effect on HD are further discussed in Brabenec et al. [9].
To sum it up, although the scientific community frequently addresses phonation in association with HD (especially when diagnosing or assessing PD), to the best of our knowledge, there is only one study that focuses on HD phonatory disorders from a longitudinal perspective. Moreover, the work deals with the analysis of phonation just partially, it considers only the sustained vowel /a/, and it does not explore a possibility of PD progress prediction based on a combination of acoustic analysis and machine learning. Therefore, in the frame of our two-year follow-up study, we are going much further with the following aims: 1.
to identify phonatory acoustic features at baseline that are significantly correlated with changes in various clinical rating scales, 2.
to investigate relationship between changes in the phonatory acoustic features and the clinical rating scales after the two-year follow-up, 3.
to establish mathematical models that will estimate the change in clinical rating scales based on the change in acoustic measures, 4.
The rest of this article is organized as follows: Section 2 describes a dataset of PD patients as well as methodology in terms of acoustic analysis, statistical analysis and machine learning. Results are reported in Section 3 and consequently discussed in Section 5. Finally, conclusions are given in Section 4.

Dataset
In this work, we enrolled 51 patients with idiopathic PD. All of them are Czech native speakers ( None of the patients had a disease affecting the central nervous system other than PD. All patients were examined on their regular dopaminergic medication approximately 1 h after the L-dopa [32] dose. The following rating scales were used to evaluate the clinical symptoms of PD: UPDRS III and UPDRS IV [3], FOG-Q [4], REM sleep behavior disorder screening questionnaire (RBDSQ) [33], and ACE-R [5]. The full clinical characteristics of the dataset, i e., mean ± sd values for the clinical rating scales in session 1, session 2, and session ∆ (session 2 − session 1) can be seen in Table 1. Moreover, to identify statistically significant differences, the table reports p-values of the Wilcoxon signed-rank test between the data acquired in session 1 (baseline examination) and session 2 (two-year follow-up examination) too.
The clinical data from the ∆ session were also used to generate descriptive visualizations (i.e., histograms, regression and residual plots) for the change in selected clinical rating scales, more specifically: LED, UPDRS III, UPDRS IV, FOG-Q, RBDSQ, ACE-R, see Figure 1. With this approach, it is possible to assess the improvement and/or decline in motor and non-motor deficits associated with PD in the horizon of two years as well as a relationship between the change in each of the scales relative to other scales in the selected set.  Figure 1. Descriptive statistical graphs of clinical characteristics of the PD patient dataset: on the main diagonal, histograms are visualized. Next, the upper triangular part of the graph-grid shows scatter plots with the fitted lines of linear regression models. Finally, the lower triangular part of the graph-grid is used to display residuals for the models shown in the upper grid. Color notation: the blue color represents data for session 1, and the green color represents data for session 2.

Vocal Tasks
To quantify the deterioration of phonation in patients with PD, we used a sustained phonation of vowels: /a/, /e/, /i/, /o/, /u/ as a basis for our experiments. The reason behind using all of the five vowels is to employ the analysis with the emphasis on quantifying all positions of a tongue during phonation. For more information, see the Hellwag (vowel) triangle [34]. In our view, using only a sustained phonation of the vowel /a/ is not fully justified as there is very little or no reason to assume that this particular position of the tongue can provide more information about phonatory disorders. In fact, as shown by previous studies, the analysis of other vowels is important for a more robust description of HD [19][20][21]23,24,[35][36][37].
Sustained phonation of a vowel is a standard measure used to assess quality of phonation [9]. During this particular vocal task, a speaker is asked to sustain phonation of a vowel, attempting to maintain steady frequency and amplitude at a comfortable level [38]. The advantage of this task in comparison with other commonly used vocal tasks is its independence of articulatory and other linguistic confounds [38]. Moreover, it is also present in most of the databases and therefore the experiments proposed in our work are comparable with other commonly used databases [39,40].
The sustained phonation task used in this study is a part of a speech acquisition protocol derived from the standardized 3F Dysarthria Profile [41]. During the data acquisition, a large capsule cardioid microphone M-AUDIO Nova (Cumberland, RI, United States) mounted to a boom arm RODE PSA1 (Silverwater, Australia) and positioned at a distance of approximately 20 cm from the patient's mouth was used for the recording. Consequently, the signals were digitized by audio interface M-AUDIO Fast Track Pro (Cumberland, RI, United States) with the sampling frequency of 48 kHz (16-bit resolution) and checked by a trained acoustic engineer without having seen the patient's clinical data. Finally, the signals were parameterized using Praat [42] software as well as a set of MATLAB (MATLAB 9.4, MathWorks, Natick, MA, United States) parametrization functions [43] developed at the Brno University of Technology.

Acoustic Features
To describe a variety of phonatory disorders associated with HD, we quantified the following: (a) microperturbations in frequency of voice using period perturbation quotient (PPQ); (b) microperturbations in intensity of voice using amplitude perturbation quotient (APQ); (c) irregular pitch fluctuations using coefficient of variation of fundamental frequency (F0 (CV)); (d) irregular amplitude fluctuations using coefficient of variation of Teager-Kaiser operator (TKEO (CV)); (e) tremor of articulatory organs (such as jaw, tongue and lips), coefficient of variation of 1st formant (F1 (CV)), coefficient of variation of 2nd formant (F2 (CV)), coefficient of variation of 3rd formant (F3 (CV)); (f) increased acoustic noise using median of harmonic-to-noise ratio (HNR (Q2)), median of energy ratio (ER (Q2), energy ratio of bands 2000-4000 Hz and 70-900 Hz)), median of glottal-to-noise excitation ratio (GNE (Q2)), median of normalized noise energy (NNE (Q2)); (g) irregular acoustic noise fluctuations using standard deviation of harmonic-to-noise ratio (HNR (SD)), coefficient of variation of energy ratio (ER (CV)), standard deviation of glottal-to-noise excitation ratio (GNE (SD)), standard deviation of normalized noise energy (NNE (SD)); and (h) aperiodicity of voice using fraction of locally unvoiced frames (FLUF). All of these features are standard and clinically interpretable dysphonic measures and were selected based on a recommendation given in our recent review on acoustic analysis of voice/speech signals in patients suffering from HD [9]. For more information about the voice/speech parametrization, see [43].

Statistical Analysis
Before describing the analytical setup applied in this work, it is important to mention that the dataset did not contain any missing values, and therefore all data samples were used. Furthermore, even though we used six clinical rating scales when describing the dataset (see Section 2.1), only four of these scales were used for the analysis, specifically: UPDRS III, UPDRS IV, RBDSQ, and FOG-Q. The reason is that previous studies have already shown that non-motor manifestations of PD are not linked with the phonatory aspects of HD, but rather with the impairments of prosody and articulation [44] that are commonly being quantified using a sentence reading task, free speech (monologue), etc. Since this study is focused on the phonatory aspects of HD, clinical rating scales describing only motor symptoms of PD were used.
To reveal and assess the strength of a relationship between the computed acoustic features and patients' clinical data (UPDRS III, UPDRS IV, RBDSQ, and FOG-Q), Spearman's correlation coefficient was computed (the statistical assumptions for Spearman's correlation coefficient were satisfied as: (a) the acoustic features as well as the the clinical data are both variables that are measured on at least an ordinal scale, and (b) there is a monotonic relationship between the two variables). Since age, gender, and probably L-dopa, are manifested in a voice of PD patients [9], for the purpose of this work, we employed partial Spearman's correlation controlling for the effect of the following confounding factors (also known as covariates): patients' age, gender [29,45], and dopaminergic medication [32,46]. The significance level of correlation was set to 0.05. More specifically, two correlation scenarios were considered: (a) correlation between the acoustic features at the baseline and the change in values of the selected clinical rating scales, and (b) correlation between the change in the acoustic features and the change in the values of the selected clinical rating scales. With this approach, we aimed at identifying those acoustic features that are significantly correlated with the specific motor and non-motor symptoms assessed by the selected clinical rating scales in both scenarios.
Next, to evaluate the power of the acoustic features at the baseline to predict the change of the patients' clinical data in the horizon of two years, we used the acoustic features computed for the recordings acquired in session 1 (baseline examination) and built mathematical models predicting the change in the selected clinical rating scales (∆). For this purpose, we employed Gradient Boosted Trees (more specifically, the famous XGBoost algorithm [47]) in a supervised learning setup: 10-fold cross-validation with 20 repetitions [48]. The XGBoost algorithm belongs to the state-of-the-art in machine learning, which is supported by the fact that it has been recently used to win competitions on Kaggle. It works well even on small datasets (where it outperforms deep learning approaches), it is robust to outliers and it is able to model complex interdependencies. For these reasons, it has been used by many researchers in various biomedical fields, e.g., [49][50][51], etc.
The performance of the models (precision of the predictions) was evaluated by MAE and estimation error rate (EER). These measures are defined as: where y i stands for the true label of i-th observation,ŷ i represents the predicted label of the i-th observation, n denotes the number of observations, and finally r stands for the range of values in the predicted clinical rating scale (not the range that can be theoretically reached, but the actual range of the values in the dataset). As can be seen, EER therefore describes a percentage of error predictions with respect to statistical properties of the dataset, which is particularly useful for easy interpretation of the results.

Results
The values of 16 acoustic features extracted from both sessions, as well as values of their differences (session 2 − session 1), are reported in Table 2. Based on the Wilcoxon signed-rank test, we can observe that none of the features extracted from vowel /a/ significantly changed after two years. Regarding vowel /e/, we identified significantly increased microperturbations in intensity of voice and also increased aperiodicity. The same significant changes were identified in vowel /i/ and /u/. In the case of vowel /u/, in addition, we monitored the increase of microperturbations in frequency of voice. The repeated acquisition of vowel /o/ was associated with increased aperiodicity and more dominant microperturbations in frequency of voice. The results of Spearman's partial correlation between the baseline acoustic features (session 1) and change in clinical data (∆) can be seen in Table 3  The results of Spearman's partial correlation between the change of baseline acoustic features (∆) and the change in clinical data (∆) can be seen in Table 4  The results of the clinical scales' estimation are reported in Table 5. Using the acoustic analysis of sustained phonation of the baseline vowel /e/ in combination with mathematically modeling based on the XGBoost algorithm, we estimated the change in UPDRS III score with 25.7% error (MAE = 7.3, range(UPDRS III ∆) = 29). The change in UPDRS IV was estimated with the lowest error equal to 11.3% (MAE = 1.7, range(UPDRS IV ∆) = 15) when employing acoustic analysis of the baseline vowel /o/. The change in RBDSQ was estimated with 16.3% error (MAE = 2.0, range(RBDSQ ∆) = 13) based on phonatory analysis of vowel /i/. Finally, the lowest error of FOG-Q change estimation is 13.2% (MAE = 2.8, range(FOG-Q ∆) = 22). In this case, the acoustic analysis of vowel /u/ outperformed the other ones. Due to inter-rater variability as well as intra-rater variability [52][53][54], consistent scoring of PD using the commonly used clinical rating scales is not an easy task. Automatic scoring, i.e., the estimation of the values of the clinical rating scales must be viewed as a tool that can provide clinicians with an additional, unbiased, and objective information that can help them with their decision-making, not as a tool that will substitute the work of clinicians. With this in mind, the predictions made by the trained XGBoost models can be considered rather reasonable as the error of 10-20% is comparable with a deviation caused by inter/intra-rater variability. Moreover, each clinical rating scale is different. On one hand, there are complex scales such as UPDRS III describing various motor aspects of PD, and, on the other hand, there are scales specifically focusing on a subset of its symptoms, e.g., FOG-Q (gait difficulties), RBDSQ (sleep disorders), etc. This information must be taken into account when evaluating the prediction errors because, the more complex the scale is, the more difficult it becomes to predict its values. This can be seen in our results as well. The most complex of the scales was predicted with the largest prediction error.
Feature importances of the SGBoost models are visualized in Figure 2. The figure shows the feature importances for all of the trained models. Feature importances quantify a relative importance of the features in the ensemble of the trained XGBoost model [47]. Therefore, the higher the value of the feature importance, the more important the feature is for the prediction of the dependent variable. With this in mind, the rationale behind this visualization is to show which features are important, and how strong that importance is, for the trained models in direction of predicting the change in the particular clinical rating scales in the horizon of two years given the acoustic features at the baseline.
Based on these graphs, we can conclude that the estimation of UPDRS III change requires a complex parametrization because, in all scenarios, at least 13 acoustic features were employed. In this case, especially median NNE was not frequently used. Although the models estimate the change of UPDRS IV with the lowest error, they usually use just a few phonatory parameters. In fact, in the case of vowel /o/, we observed 11.3% estimation error based on the following three phonatory features: GNE (Q2), ER (Q2), and FLUF. Generally, these features quantify quality of voicing. The best estimation of the RBDSQ change is based on eight phonatory parameters extracted from vowel /i/. The most important features quantify tremor of jaw (F1 (CV)), aperiodicity (FLUF), and microperturbations in intensity (APQ). Finally, based on the feature importances, we can observe that the most important role in FOG-Q change estimation was played by formant frequencies quantifying tremor of the articulatory organs.

Discussion
Although the only existing longitudinal study [31] is different in the interval between sessions (32.5 vs. 24.0 months), we are going to compare our findings with the results reported by these authors. In contrary to Skodda et al., who observed significant change in shimmer of the sustained phonation of vowel /a/, we have not identified any significant differences in this vowel. Nevertheless, we identified significant changes in the same feature extracted from vowels /e/, /i/, /u/. In addition, we monitored some significant changes in jitter and FLUF. Based on these results, we can conclude that, for two years, patients' voices became more aperiodic with increased microperturbations of frequency and amplitude.
None of the acoustic features at baseline significantly correlated with a change in UPDRS III, which supports the results of the clinical scales' estimation where the lowest estimation error was above 25%. However, we identified some significant correlations between changes of phonatory features and the clinical scale. Surprisingly, except tremor of lips (F3 (CV)), acoustic noise (ER (Q2)), and variation of voice quality (GNE (SD)), worsening in UPDRS III (motor performance) was associated with improvement in phonatory characteristics. This could be explained by the fact that HD belongs to axial symptoms [9,31] that do not play significant part in UPDRS III. In other words, although several significant correlations were identified, we hypothesize that some underlying pathophysiological mechanism are involved and a direct interpretation is not possible.
Regarding the change in complications of therapy (as assessed by UPDRS IV), although the most significant correlations were observed with baseline features extracted from the vowel /a/, the lowest estimation error (11%) was based on vowel /o/. In this case, low aperiodicity, but increased lips tremor and increased acoustic noise at baseline, was associated with increased complications in the follow-up examination.
Only three significant correlations are reported between baseline acoustic parameters (quantifying microperturbations of frequency/amplitude and variation in voice quality) and change in RBDSQ. Although we have not identified any significant correlations based on vowel /i/, the XGBoost algorithm reached the lowest error (16%) including features calculated from this vowel. This result could originate from the ability of XGBoost to model complex interdependencies that are not evident at first sight [47]. Regarding the partial correlations between changes in RBDSQ and phonatory features, we can conclude that mainly changes in voice aperiodicity and voice quality are linked with changes in sleep disorders.
HD and freezing of gait (FOG) are both axial symptoms of PD [55]. In our recent work, we have found out that these symptoms share some pathophysiological mechanism [56]. More specifically, we proved that FOG is mainly linked with improper articulation, disturbed speech rate and with intelligibility. We did not identify any significant relations between FOG and phonatory features. On the other hand, we analyzed only the sustained vowel /a/ and partial correlations were calculated only with some baseline FOG-Q sub-scores. The current study provides deeper and more complex results in terms of FOG and phonatory features relations. The first correlation analysis (baseline features vs. ∆FOG-Q) identified just a few significant correlations. However, based on mainly formant frequencies extracted from vowel /u/, the XGBoost model estimated the change in FOG-Q with 13% error. Generally, the significant impact of formants in this specific task is in line with our previous study [56]. The second correlation analysis (∆ of the baseline features vs. ∆FOG-Q) revealed some relations between changes in FOG and changes in aperiodicity, tremor of jaw/tongue, and acoustic noise.
Although most of the studies dealing with the acoustic analysis of phonation in PD patients focus on sustained vowel /a/, it is not sufficiently explained why this corner vowel is more important than the other two, i.e., /i/ or /u/. Looking at the Hellwag (vowel) triangle [34], we can see that, during phonation of vowel /a/, the tongue is in its lowest position from a vertical point of view, and in its central position from a horizontal one. In other words, a speaker does not have to make an effort to keep the tongue in a limit position (the tongue is almost relaxed). Therefore, some phonatory disorders could not be accented. This limitation is not present in vowels /i/ or /o/, where the speaker has to exert a force in both directions. On the other hand, the lowest limit position of jaw is reached during the phonation of vowel /a/. In summary, although some research teams employed a more complex set of vowels in their experiments [19][20][21]23,24,[35][36][37], the vowel /a/ is still the most frequently used one. However, this choice should be supported by a complex, robust, and multilingual study (theoretically, the effect of culture and language plays no role here, but this should be proven as well). Based on these assumptions, we have decided to explore significance of all five Czech vowels. In addition, the results suggest that the progress of PD is reflected in each vowel differently. Moreover, each vowel differently correlates with changes in scores of clinical scales. Finally, in our case, the best prediction of the change in the clinical rating scales under the focus have never been based on phonatory parameters of the vowel /a/. If we have to choose one optimal candidate for considered clinical scores changes prediction (see Table 5), it would be the corner vowel /i/, where the tongue is in limit position in both directions.
In our previous works, we proved that HD shares some pathophysiological mechanisms with other motor/non-motor features of PD. For instance, based on a combination of acoustic analysis and machine learning approaches, it is possible to predict cognitive deficits or gait disorders [44,56]. Although in the frame of this research we explored only the field of phonation, our results confirm the ability of acoustic HD analysis to predict the progress of PD. These findings and conclusions could have practical applications in eHealth, mHealth and generally Health 4.0 systems that could be used to remotely monitor and assess motor/non-motor deficits in PD patients.

Conclusions
This study deals with a quantitative analysis of changes in sustained phonation that has been acquired twice (with a two-year interval) in 51 PD patients. These changes are linked with progress of PD as assessed by three commonly used clinical scales. Finally, it explores a possibility of PD progress prediction based on a combination of acoustic analysis and machine learning modeling.
Based on the reported results, we conclude that, for two years, patients' voices became more aperiodic with increased microperturbations of frequency and amplitude. Although we did not identify many significant correlations between baseline values of phonatory features and changes in clinical scores, the XGBoost algorithm was able to predict these changes with errors ranging from 11% (in the case of UPDRS IV) to 26% (in the case of UPDRS III). These results accent the impact of acoustic HD analysis in Health 4.0 systems. Next, we identified significant correlations between changes in phonatory features and changes in clinical scores; however, probably due to some underlying pathophysiological mechanisms and complex interdependencies, these relations are less interpretable. Finally, our results suggest that the researchers should consider acoustic analysis of corner vowel /i/ instead of the corner vowel /a/.
Admittedly, the main limitation of this study is the small size of patient cohort. On the other hand, longitudinal studies of PD patients are very time-consuming (the patients are usually examined by several experts such as neurologists, clinical psychologists, and clinical speech therapists), physically demanding (PD is a movement disorder, therefore it requires patients to make significant effort to get into a hospital), and it is difficult to assess a large number of patients due to a low prevalence which is estimated to 1.5% for people aged over 65 years [57]. In fact, as far as we know, this is the first complex study analyzing changes in phonation and their relations with progress of PD based on such a big dataset. Moreover, it is the first study employing acoustic analysis of phonation in combination with machine learning modeling in order to predict the progress of PD. Nevertheless, our findings should be confirmed by further scientific research that will include bigger cohorts. Funding: This research was funded by the grant of the Czech Ministry of Health 16-30805A (Effects of non-invasive brain stimulation on hypokinetic dysarthria, micrographia, and brain plasticity in patients with Parkinson's disease) and the following projects: LO1401, FEDER and MEC, TEC2016-77791-C4-2-R, TEC2016-77791-C4-1-R, CENIE_TECA-PARK_55_02 INTERREG V-A Spain-Portugal (POCTEP), and TEC2016-77791-C4-4-R from the Ministry of Economic Affairs and Competitiveness of Spain. For the research, infrastructure of the SIX Center was used.

Conflicts of Interest:
The authors declare no conflict of interest.