Numerous studies have reported high accuracy when using voice tasks for the remote detection and monitoring of Parkinson’s Disease (PD). Most of these studies, however, report findings on a small number of voice recordings, often collected under acoustically controlled conditions, and therefore cannot scale to large populations without specialized equipment. In this study, we aimed to evaluate the potential of voice as a population-based PD screening tool in resource-constrained settings. We processed 11,942 sustained vowel /a/ phonations, collected over the standard telephone network, from a US-English cohort comprising 1078 PD and 5453 control participants. We characterized each phonation using 304 dysphonia measures to quantify a range of vocal impairments. Because this is a highly unbalanced problem, we used the following strategy: we selected a balanced subset (n = 3000 samples) for training and testing using 10-fold cross-validation (CV), and used the remaining samples (unbalanced held-out dataset, n = 8942) for further model validation. Using robust feature selection methods, we selected 27 dysphonia measures as inputs to a radial-basis-function support vector machine and demonstrated differentiation of PD participants from controls with 67.43% sensitivity and 67.25% specificity. These findings could help pave the way toward the development of an inexpensive, remote, and reliable diagnostic support tool for PD using voice as a digital biomarker.
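The data-splitting strategy and the two reported metrics can be illustrated with a minimal sketch. It is not the study's pipeline: the function names `balanced_split` and `sensitivity_specificity` are hypothetical, the equal per-class count is an assumption about what "balanced" means here, and feature selection and the support vector machine itself are omitted.

```python
import random


def balanced_split(labels, n_per_class, seed=0):
    """Split sample indices into a class-balanced subset and a held-out remainder.

    Assumption: "balanced" means the same number of samples per class.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)

    balanced = []
    for y, idx in by_class.items():
        if len(idx) < n_per_class:
            raise ValueError(f"class {y!r} has only {len(idx)} samples")
        # Draw n_per_class indices from this class without replacement.
        balanced.extend(rng.sample(idx, n_per_class))

    chosen = set(balanced)
    # Everything not drawn keeps the original (unbalanced) class ratio, roughly.
    held_out = [i for i in range(len(labels)) if i not in chosen]
    return sorted(balanced), held_out


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    Both classes must be present in y_true, otherwise a division by zero occurs.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data only: 20 "PD" (1) and 80 "control" (0) labels, not the study's recordings.
toy_labels = [1] * 20 + [0] * 80
balanced_idx, held_out_idx = balanced_split(toy_labels, n_per_class=10)
sens, spec = sensitivity_specificity([1, 1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 1])
```

In this arrangement the balanced subset would feed cross-validated training, while the held-out remainder checks how the trained model behaves at a class ratio closer to the one in the full dataset. Reporting sensitivity and specificity separately, rather than overall accuracy, keeps the majority class from dominating the result.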
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.