Article

Optimising Speaker-Dependent Feature Extraction Parameters to Improve Automatic Speech Recognition Performance for People with Dysarthria

Department of Information Engineering, University of Pisa, Via G. Caruso 16, 56122 Pisa, Italy
* Author to whom correspondence should be addressed.
This paper is an extended version of the conference paper: Marini, M.; Meoni, G.; Mulfari, D.; Vanello, N.; Fanucci, L. Enabling Smart Home Voice Control for Italian People with Dysarthria: Preliminary Analysis of Frame Rate Effect on Speech Recognition. In Proceedings of the International Conference on Applications in Electronics Pervading Industry, Environment and Society, Online, 19–20 November 2020; pp. 104–110.
Academic Editor: Chiman Kwan
Sensors 2021, 21(19), 6460; https://doi.org/10.3390/s21196460
Received: 31 August 2021 / Revised: 22 September 2021 / Accepted: 22 September 2021 / Published: 27 September 2021
Within the field of Automatic Speech Recognition (ASR), impaired speech is a major challenge because standard approaches are ineffective in the presence of dysarthria. The first aim of our work is to confirm the effectiveness of a new speech analysis technique for speakers with dysarthria. This approach fine-tunes the size and shift parameters of the spectral analysis window used to compute the initial short-time Fourier transform, in order to improve the performance of a speaker-dependent ASR system. The second aim is to determine whether a correlation exists between a speaker’s voice features and the optimal window and shift parameters that minimise the error of an ASR system for that specific speaker. For our experiments, we used both impaired and unimpaired Italian speech. Specifically, we used 30 speakers with dysarthria from the IDEA database and 10 professional speakers from the CLIPS database. Both databases are freely available. The results confirm that, if a standard ASR system performs poorly for a speaker with dysarthria, its performance can be improved by using the new speech analysis. Conversely, the new approach is ineffective for unimpaired and mildly impaired speech. Furthermore, a correlation exists between some of the speakers’ voice features and their optimal parameters.
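To make the role of the two tuned parameters concrete, the following is a minimal sketch of short-time Fourier analysis in which the window size and shift are explicit arguments. It is not the authors' implementation (the keywords indicate that Kaldi was used): the function name `stft_frames`, the Hamming window, the 16 kHz test tone and the 25 ms / 10 ms starting values are all assumptions chosen for illustration.

```python
import numpy as np


def stft_frames(signal, sample_rate, window_ms=25.0, shift_ms=10.0):
    """Magnitude spectrogram of `signal` (hypothetical helper).

    window_ms and shift_ms are the two analysis parameters that a
    speaker-dependent search would tune; the defaults here are only
    an assumed, commonly used starting point.
    """
    win = int(round(sample_rate * window_ms / 1000.0))   # samples per frame
    hop = int(round(sample_rate * shift_ms / 1000.0))    # samples between frames
    if len(signal) < win:
        return np.empty((0, win // 2 + 1))
    n_frames = 1 + (len(signal) - win) // hop
    # Index matrix: row i selects samples [i*hop, i*hop + win).
    idx = np.arange(win)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(win)
    return np.abs(np.fft.rfft(frames, axis=1))


# One second of a 440 Hz tone sampled at 16 kHz (synthetic test signal).
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)

default_spec = stft_frames(tone, sr)                                # 25 ms window, 10 ms shift
wide_spec = stft_frames(tone, sr, window_ms=50.0, shift_ms=20.0)    # longer window, larger shift

print(default_spec.shape, wide_spec.shape)
```

A longer window yields more frequency bins per frame, and a larger shift yields fewer frames per second, so each (window, shift) pair gives the recogniser a different time-frequency view of the same utterance.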
Keywords: dysarthria; automatic speech recognition; speech analysis; genetic algorithm; kaldi
MDPI and ACS Style

Marini, M.; Vanello, N.; Fanucci, L. Optimising Speaker-Dependent Feature Extraction Parameters to Improve Automatic Speech Recognition Performance for People with Dysarthria. Sensors 2021, 21, 6460. https://doi.org/10.3390/s21196460
