Sign in to use this feature.

Years

Between: -

Subjects

remove_circle_outline
remove_circle_outline
remove_circle_outline

Journals

Article Types

Countries / Regions

Search Results (1)

Search Parameters:
Keywords = non-native children speech recognition

Order results
Result details
Results per page
Select all
Export citation of selected articles as:
15 pages, 529 KiB  
Article
Audio Augmentation for Non-Native Children’s Speech Recognition through Discriminative Learning
by Kodali Radha and Mohan Bansal
Entropy 2022, 24(10), 1490; https://doi.org/10.3390/e24101490 - 19 Oct 2022
Cited by 19 | Viewed by 2963
Abstract
Automatic speech recognition (ASR) in children is a rapidly evolving field, as children become more accustomed to interacting with virtual assistants, such as Amazon Echo, Cortana, and other smart speakers, and it has advanced the human–computer interaction in recent generations. Furthermore, non-native children [...] Read more.
Automatic speech recognition (ASR) in children is a rapidly evolving field, as children become more accustomed to interacting with virtual assistants, such as Amazon Echo, Cortana, and other smart speakers, and it has advanced the human–computer interaction in recent generations. Furthermore, non-native children are observed to exhibit a diverse range of reading errors during second language (L2) acquisition, such as lexical disfluency, hesitations, intra-word switching, and word repetitions, which are not yet addressed, resulting in ASR’s struggle to recognize non-native children’s speech. The main objective of this study is to develop a non-native children’s speech recognition system on top of feature-space discriminative models, such as feature-space maximum mutual information (fMMI) and boosted feature-space maximum mutual information (fbMMI). Harnessing the collaborative power of speed perturbation-based data augmentation on the original children’s speech corpora yields an effective performance. The corpus focuses on different speaking styles of children, together with read speech and spontaneous speech, in order to investigate the impact of non-native children’s L2 speaking proficiency on speech recognition systems. The experiments revealed that feature-space MMI models with steadily increasing speed perturbation factors outperform traditional ASR baseline models. Full article
(This article belongs to the Special Issue Information-Theoretic Approaches in Speech Processing and Recognition)
Show Figures

Figure 1

Back to TopTop