Search Results (371)

Search Parameters:
Keywords = speech production

19 pages, 1182 KB  
Article
Phonetic Attrition Beyond the Segment: Variability in Transfer Effects Across Cues in Voiced Stops
by Divyanshi Shaktawat
Languages 2025, 10(11), 281; https://doi.org/10.3390/languages10110281 - 7 Nov 2025
Viewed by 333
Abstract
Previous research shows that L2 learning can cause non-nativeness in the L1 of adult learners. These effects vary across segments, even across members of the same natural class (e.g., voiceless or voiced stops), differing in the presence or absence of transfer, its direction (‘assimilation’ toward the L2 or ‘dissimilation’ away from it), and the magnitude of the shift. However, little is known about how multiple phonetic cues within a single segment jointly exhibit transfer, or about the cross-linguistic linkages formed at this fine-grained, cue-specific level of phonetic structure. This study investigates phonetic backward transfer by analyzing the production of three cues (voice onset time, voicing during closure, and relative burst intensity) across the voiced stops /b d g/. Conducted among first-generation bilingual Indian immigrants in Glasgow, it explores how their native varieties (Hindi and Indian English) are influenced by the dominant host variety (Glaswegian English), with reference to the revised Speech Learning Model and its predictions of assimilation, dissimilation, and no change. Two control groups (Indians and Glaswegians) and an experimental group (Glasgow Indians) were recorded reading English and Hindi words containing the three voiced stops. Findings reveal cue-specific variability, highlighting the multidimensional nature of cross-linguistic influence (CLI) and challenging segment-level generalizations in models of phonetic transfer. Full article

27 pages, 1695 KB  
Review
Overcoming the Challenge of Singing Among Cochlear Implant Users: An Analysis of the Disrupted Feedback Loop and Strategies for Improvement
by Stephanie M. Younan, Emmeline Y. Lin, Brooke Barry, Arjun Kurup, Karen C. Barrett and Nicole T. Jiam
Brain Sci. 2025, 15(11), 1192; https://doi.org/10.3390/brainsci15111192 - 4 Nov 2025
Viewed by 478
Abstract
Background: Cochlear implants (CIs) are transformative neuroprosthetics that restore speech perception for individuals with severe-to-profound hearing loss. However, temporal envelope cues are well-represented within the signal processing, while spectral envelope cues are poorly accessed by CI users, resulting in substantial deficits compared to normal-hearing individuals. This profoundly impairs the perception of complex auditory stimuli like music and vocal prosody, significantly impacting users’ quality of life, social engagement, and artistic expression. Methods: This narrative review synthesizes research on CI signal-processing limitations, perceptual and production challenges in music and singing, the role of the auditory–motor feedback loop, and strategies for improvement, including rehabilitation, technology, and the influence of neuroplasticity and sensitive developmental periods. Results: The degraded signal causes marked deficits in pitch, timbre, and vocal emotion perception. Critically, this impoverished input functionally breaks the high-fidelity auditory–motor feedback loop essential for vocal control, transforming it from a precise fine-tuner into a gross error detector sensitive only to massive pitch shifts (~6 semitones). This neurophysiological breakdown directly causes pervasive pitch inaccuracies and melodic distortion in singing. Despite these challenges, improvements are possible through advanced sound-processing strategies, targeted auditory–motor training that leverages neuroplasticity, and capitalizing on sensitive periods for auditory development. Conclusions: The standard CI signal creates a fundamental neurophysiological barrier to singing. Overcoming this requires a paradigm shift toward holistic, patient-centered care that moves beyond speech-centric goals. 
Integrating personalized, music-based rehabilitation with advanced CI programming is essential for improving vocal production, fostering musical engagement, and ultimately enhancing the overall quality of life for CI users. Full article
(This article belongs to the Special Issue Language, Communication and the Brain—2nd Edition)
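The ~6-semitone error-detection threshold described in the abstract can be made concrete: on the equal-tempered scale, a shift of n semitones corresponds to a frequency ratio of 2^(n/12). A minimal sketch of that conversion (function names and example values are illustrative, not from the article):

```python
import math

def semitones_to_ratio(n: float) -> float:
    """Frequency ratio corresponding to a shift of n equal-tempered semitones."""
    return 2.0 ** (n / 12.0)

def shift_hz(f0: float, n: float) -> float:
    """Frequency reached after shifting f0 (Hz) by n semitones."""
    return f0 * semitones_to_ratio(n)

# A 6-semitone shift multiplies frequency by ~1.414 (a tritone), so a
# 220 Hz voice must drift past ~311 Hz before a feedback loop sensitive
# only to shifts of that size registers an error.
ratio_6st = semitones_to_ratio(6)
```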

18 pages, 3330 KB  
Article
Mycelium-Based Composites for Interior Architecture: Digital Fabrication of Acoustic Ceiling Components
by Müge Özkan and Orkan Zeynel Güzelci
Biomimetics 2025, 10(11), 729; https://doi.org/10.3390/biomimetics10110729 - 1 Nov 2025
Viewed by 541
Abstract
This study examines the integration of digital fabrication technologies into the design and production of mycelium-based components, addressing the growing demand for sustainable and innovative interior design solutions. Using a parametric design approach, modular and customized suspended ceiling elements were developed for a specific interior setting to explore a material-specific design approach for mycelium-based components. Three-dimensional printing was employed to produce molds, which were subsequently tested with plaster, silicone, and mycelium across three different scales. Experimental observations focused on the overall form, surface details, growth behavior and dimensional accuracy, systematically capturing volumetric deviations arising from the living nature of the material. In parallel, acoustic performance was evaluated through simulations using the Sabine method. The untreated condition demonstrated the longest reverberation times, whereas conventional panels achieved reductions consistent with typical comfort standards. Prototypes produced with mycelium yielded measurable decreases in reverberation time compared to the untreated condition, particularly within the speech frequency range, and approached the performance of standard acoustic panels. These findings suggest that mycelium-based components, when further optimized in terms of density and geometry, hold the potential to contribute both aesthetic and acoustic value within sustainable interior environments. Full article
(This article belongs to the Section Biomimetics of Materials and Structures)
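The Sabine method used in the acoustic simulations estimates reverberation time from room volume and total absorption: RT60 = 0.161 · V / Σ(Sᵢαᵢ), with V in m³ and each surface area Sᵢ (m²) weighted by its absorption coefficient αᵢ. A minimal sketch under that formula (room dimensions and coefficients are illustrative, not the article's data):

```python
def sabine_rt60(volume_m3, surfaces):
    """RT60 in seconds from room volume and (area_m2, absorption_coeff) pairs."""
    total_absorption = sum(area * alpha for area, alpha in surfaces)
    return 0.161 * volume_m3 / total_absorption

# Illustrative room: replacing part of a hard ceiling with absorptive
# panels increases total absorption and shortens reverberation time.
bare = sabine_rt60(120.0, [(150.0, 0.05)])                 # hard surfaces only
treated = sabine_rt60(120.0, [(120.0, 0.05), (30.0, 0.6)])  # 30 m2 of panels
```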

20 pages, 918 KB  
Article
MVIB-Lip: Multi-View Information Bottleneck for Visual Speech Recognition via Time Series Modeling
by Yuzhe Li, Haocheng Sun, Jiayi Cai and Jin Wu
Entropy 2025, 27(11), 1121; https://doi.org/10.3390/e27111121 - 31 Oct 2025
Viewed by 380
Abstract
Lipreading, or visual speech recognition, is the task of interpreting utterances solely from visual cues of lip movements. While early approaches relied on Hidden Markov Models (HMMs) and handcrafted spatiotemporal descriptors, recent advances in deep learning have enabled end-to-end recognition using large-scale datasets. However, such methods often require millions of labeled or pretraining samples and struggle to generalize under low-resource or speaker-independent conditions. In this work, we revisit lipreading from a multi-view learning perspective. We introduce MVIB-Lip, a framework that integrates two complementary representations of lip movements: (i) raw landmark trajectories modeled as multivariate time series, and (ii) recurrence plot (RP) images that encode structural dynamics in textural form. A Transformer encoder processes the temporal sequences, while a ResNet-18 extracts features from RPs; the two views are fused via a product-of-experts posterior regularized by the multi-view information bottleneck. Experiments on the OuluVS dataset and a self-collected dataset demonstrate that MVIB-Lip consistently outperforms handcrafted baselines and improves generalization to speaker-independent recognition. Our results suggest that recurrence plots, when coupled with deep multi-view learning, offer a principled and data-efficient path forward for robust visual speech recognition. Full article
(This article belongs to the Special Issue The Information Bottleneck Method: Theory and Applications)
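A recurrence plot of the kind used for the second view marks which pairs of time points in a trajectory fall within a distance threshold of each other. A minimal sketch for a 1-D series (the threshold and series are illustrative; the paper applies this idea to landmark trajectories):

```python
def recurrence_plot(series, eps):
    """Binary recurrence matrix: R[i][j] = 1 iff |x_i - x_j| <= eps."""
    n = len(series)
    return [[1 if abs(series[i] - series[j]) <= eps else 0
             for j in range(n)]
            for i in range(n)]

# A periodic signal produces off-diagonal bands of recurrences,
# which is the texture a CNN can then learn from.
rp = recurrence_plot([0.0, 1.0, 0.0, 1.0], eps=0.1)
```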

29 pages, 1961 KB  
Article
Developing an AI-Powered Pronunciation Application to Improve English Pronunciation of Thai ESP Learners
by Jiraporn Lao-un and Dararat Khampusaen
Languages 2025, 10(11), 273; https://doi.org/10.3390/languages10110273 - 28 Oct 2025
Viewed by 731
Abstract
This study examined the effects of a specially designed AI-mediated pronunciation application on the production of English fricative consonants among Thai English for Specific Purposes (ESP) learners. The research used a quasi-experimental design involving intact classes of 74 undergraduate students majoring in Thai Dance and Music Education, divided into control (N = 38) and experimental (N = 36) groups. Grounded in Skill Acquisition Theory, the experimental group received pronunciation training via a custom-designed AI application leveraging automatic speech recognition (ASR) that offered ESP-contextualized practice and real-time, individualized feedback. In contrast, the control group received traditional teacher-led articulatory training with teacher-assisted feedback. Pre- and post-test evaluations measured pronunciation of nine target fricatives in ESP-relevant contexts. Statistical analyses revealed significant improvements in both groups, with the AI-mediated group demonstrating substantially greater gains, particularly on challenging sounds absent in Thai, such as /θ/, /ð/, /z/, /ʃ/, and /h/. The findings underscore the potential of AI-driven interventions to address language-specific phonological challenges through personalized, immediate feedback and adaptive practice. The study provides empirical evidence for integrating advanced technology into ESP pronunciation pedagogy, informing future curriculum design for EFL contexts. Implications for theory, practice, and future research are discussed, emphasizing tailored technological solutions for language learners with specific phonological profiles. Full article

17 pages, 1174 KB  
Article
Pauses as a Quantitative Measure of Linguistic Planning Challenges in Parkinson’s Disease
by Sara D’Ascanio, Fabrizio Piras, Caterina Spada, Clelia Pellicano and Federica Piras
Brain Sci. 2025, 15(11), 1131; https://doi.org/10.3390/brainsci15111131 - 22 Oct 2025
Viewed by 418
Abstract
Background/Objectives: Pausing is a multifaceted phenomenon relevant to motor and cognitive disorders, particularly Parkinson’s Disease (PD). Thus, examining pauses as a metric for linguistic planning and motor speech difficulties in PD patients has gained significant attention. Here, we examined the production of silent and filled pauses (indexing difficulties at various linguistic processing levels) during narrative tasks to investigate the interplay between pausing behavior and informativeness/productivity measures. Methods: Individuals’ pausing patterns during narratives were analyzed relative to their syntactic context (within and between sentences expressing motor and non-motor related content), in 29 patients in the mild-to-moderate stage of PD, and 29 age-matched healthy speakers. The interaction between communicative metrics (informativeness and productivity), motor symptoms, cognitive capabilities, and pausing behavior was explored to characterize the mechanisms underlying pause production and its influence on discourse content. Results: PD patients’ pausing profile was characterized by an overall reduced number of pauses, longer silent pauses and fewer/shorter filled pauses, particularly before words that extend or specify the semantic content of sentences. Contrary to what was observed in healthy speakers, both the duration of silent pauses and the total number and duration of filled pauses could explain a significant proportion of variance in informativeness measures. Silent pause duration significantly correlated with measures of lexical access, indicating that cognitive processes influence pause production, while motor speech and cognitive challenges may also interact. Conclusions: Current results have significant implications for understanding discourse difficulties linked to PD and for formulating intervention strategies to improve communication efficacy. Full article
(This article belongs to the Section Neurolinguistics)

10 pages, 734 KB  
Article
Electromyographic Assessment of the Extrinsic Laryngeal Muscles: Pilot and Descriptive Study of a Vocal Function Assessment Protocol
by Jéssica Ribeiro, André Araújo, Andreia S. P. Sousa and Filipa Pereira
Sensors 2025, 25(20), 6430; https://doi.org/10.3390/s25206430 - 17 Oct 2025
Viewed by 571
Abstract
Aim: The aim of this study was to develop and test a surface electromyography (sEMG) assessment protocol to characterise the activity of the extrinsic laryngeal muscles (suprahyoid and infrahyoid) during phonatory tasks and vocal techniques. Methodology: The assessment protocol was based on electromyographic assessment guidelines and on clinical voice evaluation needs and was tested in six healthy adults with no vocal disorders. Surface electromyographic activity of the suprahyoid muscles (SHM) and infrahyoid muscles (IHM) was acquired during different reference tasks (rest, reading, maximum contractions) and six vocal tasks, including nasal sounds, fricatives, and semi-occluded vocal tract exercises. A laryngeal accelerometer was used to detect the beginning and end of each exercise. The average activity during each task was normalised by the signal obtained in the incomplete swallowing task for the SHM and by the sniff technique for the IHM. Results: The range of activation values varied across tasks, with higher percentages observed in plosive production and in the “spaghetti” technique, while nasal and fricative sounds tended to show lower activation values within the group. A consistent pattern of simultaneous activation of the suprahyoid and infrahyoid muscles was observed during phonation. Conclusions: The protocol showed potential for clinical application in speech–language pathology, as it enabled the characterisation of activity in muscles that are determinant for vocal function. Larger samples and further validation of the time-marking system are needed. This study provides a foundation for integrating sEMG measures into functional voice assessment. Full article
(This article belongs to the Special Issue Flexible Pressure/Force Sensors and Their Applications)
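The normalisation step described above expresses each task's mean sEMG amplitude as a percentage of a reference contraction (incomplete swallowing for the suprahyoid muscles, sniff for the infrahyoid). A minimal sketch of that computation (function and variable names are illustrative):

```python
def normalize_emg(task_mean_uv, reference_mean_uv):
    """Task activation as a percentage of the reference task's mean amplitude."""
    return 100.0 * task_mean_uv / reference_mean_uv

# A vocal task averaging 45 uV against a 60 uV reference contraction
# is reported as 75% activation, making tasks comparable across muscles.
activation_pct = normalize_emg(45.0, 60.0)
```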

11 pages, 851 KB  
Article
Distinguishing Among Variants of Primary Progressive Aphasia with a Brief Multimodal Test of Nouns and Verbs
by Marco A. Lambert, Melissa D. Stockbridge, Lindsey Kelly, Isidora Diaz-Carr, Voss Neal and Argye E. Hillis
Brain Sci. 2025, 15(10), 1108; https://doi.org/10.3390/brainsci15101108 - 15 Oct 2025
Viewed by 534
Abstract
Background: Primary Progressive Aphasia (PPA) variants include the non-fluent agrammatic (nfvPPA), logopenic (lvPPA), and semantic (svPPA), which differ in their effects on speech production. However, their impact on modality (oral vs. written) and grammatical word class (nouns vs. verbs) remains controversial. A significant effect of these variables might assist in classification. Materials and Methods: This study used first-visit data from 300 participants with PPA who completed oral and written noun and verb naming (matched in surface word frequency across word class) to test the hypothesis that the three variants show differential impairment by word class or modality. Group differences were evaluated with rank-transformed repeated-measures ANOVA. Within-individual differences between nouns and verbs and between oral and written modalities were tested with Fisher’s exact tests. Results: A significant modality × variant interaction (p = 0.017) was observed. Participants with lvPPA and nfvPPA demonstrated greater oral than written naming accuracy, with nfvPPA also performing better on nouns than verbs. Those with svPPA showed no modality or word class effects but had overall low accuracy. Three participants with svPPA (but no individuals with the other variants) demonstrated significantly (p = 0.003) more accurate verb than noun naming. Conclusions: Differing modality and word class patterns characterize the PPA variants, with nfvPPA more accurate on nouns than verbs on average. Within individuals, only those with svPPA occasionally showed significantly more proficient verb than noun naming. Grammatical word class effects likely arise at distinct levels of the cognitive processing underlying naming. Full article

22 pages, 617 KB  
Review
Mapping the Neurophysiological Link Between Voice and Autonomic Function: A Scoping Review
by Carmen Morales-Luque, Laura Carrillo-Franco, Manuel Víctor López-González, Marta González-García and Marc Stefan Dawid-Milner
Biology 2025, 14(10), 1382; https://doi.org/10.3390/biology14101382 - 10 Oct 2025
Viewed by 635
Abstract
Vocal production requires the coordinated control of respiratory, laryngeal, and autonomic systems. In individuals with high vocal demand, this physiological load may influence autonomic regulation, even in the absence of voice disorders. This scoping review systematically mapped current evidence on the relationship between voice production and autonomic nervous system (ANS) activity in adults, focusing exclusively on studies that assessed both systems simultaneously. A systematic search was conducted in PubMed, Scopus, Web of Science, Embase, and CINAHL, following PRISMA-ScR guidelines. Eligible studies included adults performing structured vocal tasks with concurrent autonomic measurements. Data were extracted and synthesized descriptively. Fifteen studies met the inclusion criteria. Most involved healthy adults with high vocal demand, while some included participants with subclinical or functional voice traits. Vocal tasks ranged from singing and sustained phonation to speech under cognitive or emotional load. Autonomic measures included heart rate (HR), heart rate variability (HRV), and electrodermal activity (EDA), among others. Four thematic trends emerged: autonomic synchronization during group vocalization; modulation of autonomic tone by vocal rhythm and structure; voice–ANS interplay under stress; and physiological coupling in hyperfunctional vocal behaviours. This review’s findings suggest that vocal activity can modulate autonomic function, supporting the potential integration of autonomic markers into experimental and clinical voice research. Full article

26 pages, 842 KB  
Article
Speech Production Intelligibility Is Associated with Speech Recognition in Adult Cochlear Implant Users
by Victoria A. Sevich, Davia J. Williams, Aaron C. Moberly and Terrin N. Tamati
Brain Sci. 2025, 15(10), 1066; https://doi.org/10.3390/brainsci15101066 - 30 Sep 2025
Viewed by 838
Abstract
Background/Objectives: Adult cochlear implant (CI) users exhibit broad variability in speech perception and production outcomes. Cochlear implantation improves the intelligibility (comprehensibility) of CI users’ speech, but the degraded auditory signal delivered by the CI may attenuate this benefit. Among other effects, degraded auditory feedback can lead to compression of the acoustic–phonetic vowel space, which makes vowel productions confusable, decreasing intelligibility. Sustained exposure to degraded auditory feedback may also weaken phonological representations. The current study examined the relationship between subjective ratings and acoustic measures of speech production, speech recognition accuracy, and phonological processing (cognitive processing of speech sounds) in adult CI users. Methods: Fifteen adult CI users read aloud a series of short words, which were analyzed in two ways. First, acoustic measures of vowel distinctiveness (i.e., vowel dispersion) were calculated. Second, thirty-seven normal-hearing (NH) participants listened to the words produced by the CI users and rated the subjective intelligibility of each word from 1 (least understandable) to 100 (most understandable). CI users also completed an auditory sentence recognition task and a nonauditory cognitive test of phonological processing. Results: CI users rated as having more understandable speech demonstrated more accurate sentence recognition than those rated as having less understandable speech, but intelligibility ratings were only marginally related to phonological processing. Further, vowel distinctiveness was marginally associated with sentence recognition but not related to phonological processing or subjective ratings of intelligibility. 
Conclusions: The results suggest that speech intelligibility ratings are related to speech recognition accuracy in adult CI users, and future investigation is needed to identify the extent to which this relationship is mediated by individual differences in phonological processing. Full article
(This article belongs to the Special Issue Language, Communication and the Brain—2nd Edition)
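Vowel dispersion of the kind measured here is typically the mean Euclidean distance of a talker's vowel tokens from the centre of their F1×F2 space, so a compressed vowel space yields a smaller value. A minimal sketch (the formant values are illustrative, not the study's data):

```python
import math

def vowel_dispersion(formants):
    """Mean Euclidean distance of (F1, F2) points from their centroid, in Hz."""
    n = len(formants)
    cf1 = sum(f1 for f1, _ in formants) / n
    cf2 = sum(f2 for _, f2 in formants) / n
    return sum(math.hypot(f1 - cf1, f2 - cf2) for f1, f2 in formants) / n

# Illustrative corner vowels: shrinking the space toward the centroid
# (as degraded auditory feedback may do) lowers the dispersion score.
expanded = vowel_dispersion([(300, 2300), (300, 900), (700, 1700)])
compressed = vowel_dispersion([(450, 1950), (450, 1250), (650, 1650)])
```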

29 pages, 2068 KB  
Article
Voice-Based Early Diagnosis of Parkinson’s Disease Using Spectrogram Features and AI Models
by Danish Quamar, V. D. Ambeth Kumar, Muhammad Rizwan, Ovidiu Bagdasar and Manuella Kadar
Bioengineering 2025, 12(10), 1052; https://doi.org/10.3390/bioengineering12101052 - 29 Sep 2025
Viewed by 1321
Abstract
Parkinson’s disease (PD) is a progressive neurodegenerative disorder that significantly affects motor functions, including speech production. Voice analysis offers a less invasive, faster, and more cost-effective approach for diagnosing and monitoring PD over time. This research introduces an automated system to distinguish between PD and non-PD individuals based on speech signals using state-of-the-art signal processing and machine learning (ML) methods. A publicly available voice dataset (Dataset 1, 81 samples) containing speech recordings from PD patients and non-PD individuals was used for model training and evaluation. Additionally, a small supplementary dataset (Dataset 2, 15 samples) was created, though excluded from the experiments, to illustrate potential future extensions of this work. Features such as Mel-frequency cepstral coefficients (MFCCs), spectrograms, Mel spectrograms, and waveform representations were extracted to capture key vocal impairments related to PD, including diminished vocal range, weak harmonics, elevated spectral entropy, and impaired formant structures. These extracted features were used to train and evaluate several ML models, including support vector machine (SVM), XGBoost, and logistic regression, as well as deep learning (DL) architectures such as deep neural networks (DNN), convolutional neural networks (CNN) combined with long short-term memory (LSTM), CNN + gated recurrent unit (GRU), and bidirectional LSTM (BiLSTM). Experimental results show that DL models, particularly BiLSTM, outperform traditional ML models, achieving 97% accuracy and an AUC of 0.95. The comprehensive feature extraction enabled robust classification of PD and non-PD speech signals. These findings highlight the potential of integrating acoustic features with DL methods for early diagnosis and monitoring of Parkinson’s disease. Full article
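One of the cues listed in the abstract, spectral entropy, measures how evenly a frame's energy is spread across frequency bins: a noise-like, breathy voice spreads energy widely and raises the entropy, while a clean harmonic concentrates it. A minimal sketch over a precomputed power spectrum (not the paper's feature pipeline):

```python
import math

def spectral_entropy(power_spectrum):
    """Shannon entropy (bits) of a power spectrum normalised to a distribution."""
    total = sum(power_spectrum)
    probs = [p / total for p in power_spectrum if p > 0]
    return -sum(p * math.log2(p) for p in probs)

# A flat 8-bin spectrum (maximally noise-like) gives log2(8) = 3 bits;
# a single dominant harmonic gives 0 bits.
flat = spectral_entropy([1.0] * 8)
tonal = spectral_entropy([5.0, 0.0, 0.0, 0.0])
```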

12 pages, 685 KB  
Article
Changes in Bilabial Contact Pressure as a Function of Vocal Loudness in Individuals with Parkinson’s Disease
by Jeff Searl
Appl. Sci. 2025, 15(18), 10165; https://doi.org/10.3390/app151810165 - 18 Sep 2025
Viewed by 421
Abstract
This study evaluated the impact of vocal loudness on bilabial contact pressure (BCP) during the production of bilabial English consonants in adults with Parkinson’s disease (PD). Twelve adults with PD produced sentences with the phonemes /b, p, m/ initiating a linguistically meaningful word within the sentence, while BCP was sensed with a miniature pressure transducer positioned at the midline between the upper and lower lips. Stimuli were produced at two loudness levels: Habitual and twice as loud as habitual loudness (Loud). A linear mixed model (LMM) indicated a statistically significant main effect of Condition (F (1, 714) = 16.210, p < 0.001) with Loud having greater BCP than Habitual (mean difference of 0.593 kPa). The main effect of Phoneme was also significant (F (1, 714) = 31.905, p < 0.001), with post hoc tests revealing that BCP was significantly higher for /p/ compared to /m/ (p = 0.007), and for /b/ compared to /m/ (p = 0.002). An additional LMM of the magnitude of the percent change in BCP in the Loud condition relative to the Habitual condition had a significant main effect of Phoneme (F (2, 22.3) = 5.871, p = 0.006). The percent change in BCP was the greatest for /p/ (47.7%), followed by /b/ (35.7%) and /m/ (27.4%), with statistically significant differences for both /p/ and /b/ compared to /m/ in post hoc tests. The results indicated that changes in vocal loudness cause changes in BCP in individuals with PD. A louder voice was associated with higher BCP for all three phonemes, although the increase was the greatest on bilabial stops compared to nasal stops. These results provide initial insights regarding the mechanism by which therapeutic interventions focused on increasing loudness in people with PD alter oral articulatory behaviors. 
Future work detailing potential aerodynamic (e.g., oral air pressure build-up) and articulatory-acoustic (e.g., burst intensity) effects is needed to better explain the mechanistic actions of increased loudness and why loud-focused speech treatments for people with PD may improve speech intelligibility. Full article

28 pages, 453 KB  
Article
Language Learning in the Wild: The L2 Acquisition of English Restrictive Relative Clauses
by Stephen Levey, Kathryn L. Rochon and Laura Kastronic
Languages 2025, 10(9), 232; https://doi.org/10.3390/languages10090232 - 10 Sep 2025
Viewed by 940
Abstract
We argue that quantitative analysis of community-based speech data furnishes an indispensable adjunct to theoretical and experimental studies targeting the acquisition of relativization. Drawing on a comparative sociolinguistic approach, we make use of three corpora of natural speech to investigate second-language (L2) speakers’ acquisition of restrictive relative clauses in English. These corpora comprise: (i) spontaneous L2 speech; (ii) a local baseline variety of the target language (TL); and (iii) L2 speakers’ first language (L1), French. These complementary datasets enable us to explore the extent to which L2 speakers reproduce the discursive frequency of relative markers, as well as their fine-grained linguistic conditioning, in the local TL baseline variety. Comparisons with French facilitate exploration of possible L1 transfer effects on L2 speakers’ production of English restrictive relative clauses. Results indicate that evidence of L1 transfer effects on L2 speakers’ restrictive relative clauses is tenuous. A pivotal finding is that L2 speakers, in the aggregate, closely approximate TL constraints on relative marker selection, although they use the subject relativizer who significantly less often than their TL counterparts. We implicate affiliation with, and integration into, the local TL community as key factors facilitating the propagation of TL vernacular norms to L2 speakers. Full article
32 pages, 2361 KB  
Article
Exploring the Use and Misuse of Large Language Models
by Hezekiah Paul D. Valdez, Faranak Abri, Jade Webb and Thomas H. Austin
Information 2025, 16(9), 758; https://doi.org/10.3390/info16090758 - 1 Sep 2025
Abstract
Language modeling has evolved from simple rule-based systems into complex assistants capable of tackling a multitude of tasks. State-of-the-art large language models (LLMs) are capable of scoring highly on proficiency benchmarks, and as a result have been deployed across industries to increase productivity and convenience. However, the prolific nature of such tools has provided threat actors with the ability to leverage them for attack development. Our paper describes the current state of LLMs, their availability, and their role in benevolent and malicious applications. In addition, we propose how an LLM can be combined with text-to-speech (TTS) voice cloning to create a framework capable of carrying out social engineering attacks. Our case study analyzes the realism of two different open-source TTS models, Tortoise TTS and Coqui XTTS-v2, by calculating similarity scores between generated and real audio samples from four participants. Our results demonstrate that Tortoise is able to generate realistic voice clone audios for native English speaking males, which indicates that easily accessible resources can be leveraged to create deceptive social engineering attacks. As such tools become more advanced, defenses such as awareness, detection, and red teaming may not be able to keep up with dangerously equipped adversaries.
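The similarity scoring described in the case study can be sketched in outline. The assumption here is that each audio clip has already been reduced to a fixed-length speaker embedding (e.g., by a speaker-verification model); the vectors below are random stand-ins, not outputs of Tortoise TTS or Coqui XTTS-v2:

```python
# Hedged sketch: scoring how closely a synthesized voice matches a real one
# via cosine similarity of speaker embeddings. The embeddings are random
# stand-ins for illustration, not real model outputs.
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two 1-D embedding vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
real_embedding = rng.normal(size=192)                      # "real" clip
cloned_embedding = real_embedding + rng.normal(scale=0.3, size=192)  # "clone"

score = cosine_similarity(real_embedding, cloned_embedding)
print(f"similarity: {score:.3f}")
```

Scores closer to 1.0 indicate a more convincing clone; a threshold on such scores is one simple way to quantify the realism the study measures.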
20 pages, 1589 KB  
Article
Articulatory Control by Gestural Coupling and Syllable Pulses
by Christopher Geissler
Languages 2025, 10(9), 219; https://doi.org/10.3390/languages10090219 - 29 Aug 2025
Abstract
Explaining the relative timing of consonant and vowel articulations (C-V timing) is an important function of speech production models. This article explores how C-V timing might be studied from the perspective of the C/D Model, particularly the prediction that articulations are coordinated with respect to an abstract syllable pulse. Gestural landmarks were extracted from kinematic data from English CVC monosyllabic words in the Wisconsin X-Ray Microbeam Corpus. The syllable pulse was identified using velocity peaks, and temporal lags were calculated among landmarks and the syllable pulse. The results directly follow from the procedure used to identify pulses: onset consonants exhibited stable timing to the pulse, while vowel-to-pulse timing was comparably stable with respect to C-V timing. Timing relationships with jaw displacement and jaw-based syllable pulse metrics were also explored. These results highlight current challenges for the C/D Model, as well as opportunities for elaborating the model to account for C-V timing.
(This article belongs to the Special Issue Research on Articulation and Prosodic Structure)
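The landmark-extraction procedure the abstract outlines (velocity peaks as landmarks, temporal lags computed between them) can be sketched with NumPy on a synthetic trajectory. The sinusoidal signal and sampling rate below are invented stand-ins, not the Wisconsin X-Ray Microbeam recordings:

```python
# Illustrative sketch: locate velocity peaks in a 1-D articulator trajectory
# and compute temporal lags between successive landmarks. The trajectory and
# 145 Hz sampling rate are assumptions for illustration only.
import numpy as np

fs = 145.0                                 # assumed sampling rate in Hz
t = np.arange(0.0, 0.5, 1.0 / fs)
position = np.sin(2 * np.pi * 3 * t)       # synthetic opening/closing cycle
velocity = np.gradient(position, 1.0 / fs)
speed = np.abs(velocity)

# Interior local maxima of |velocity| serve as candidate gestural landmarks.
is_peak = (speed[1:-1] > speed[:-2]) & (speed[1:-1] > speed[2:])
peak_times = t[np.where(is_peak)[0] + 1]

# Temporal lags between successive landmarks (e.g., consonant landmark
# relative to a syllable pulse would be computed the same way).
lags = np.diff(peak_times)
print("landmarks (s):", peak_times, "lags (s):", lags)
```

In a real analysis, the same lag computation would be applied between each gestural landmark and the identified syllable pulse rather than between adjacent peaks.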