Search Results (63)

Search Parameters:
Keywords = L2 speech perception

18 pages, 720 KB  
Article
The Effect of Second Language Immersion Experience on the Perception of VOT by Saudi Arabic Learners of English
by Wafaa Alshangiti
Languages 2026, 11(5), 81; https://doi.org/10.3390/languages11050081 - 22 Apr 2026
Viewed by 511
Abstract
Increased experience with a second language (L2) can affect one’s speech perception and production. Some studies have suggested that experience does not affect the production of English bilabial stops by Arabic speakers. They produce the English bilabial stops /p/ and /b/ as the Arabic /b/, which differs in VOT. However, the effect of English experience on the perception of English bilabial stops remains underinvestigated. This study examines the effect of L2 immersion experience on the perception of the English stops /p/–/b/ to investigate whether the lack of /p/ in Arabic can affect the perception of the /p/–/b/ contrast and whether L2 experience shifts the category boundary toward that of native speakers. Sixty-six participants, comprising two groups of Arabic speakers with differing L2 experience and a control group of native English speakers, completed identification and discrimination tasks using the /p/–/b/ VOT continuum. The regression analysis showed that listeners with more L2 experience (i.e., ≥3 years in the UK) had a category boundary closer to that of native listeners than those with less L2 experience. However, category discrimination accuracy did not differ significantly between the Arabic groups. The results highlight the importance of L2 immersion experience in altering VOT perceptual strategies, which can help in designing future training studies that focus on VOT perception as an L2 phonetic cue.

16 pages, 434 KB  
Article
Examining the Effect of Assimilation Overlap on Discrimination of English and Persian Stop–Fricative Contrasts in Chinese Listeners
by Youngja Nam
Behav. Sci. 2026, 16(4), 562; https://doi.org/10.3390/bs16040562 - 9 Apr 2026
Viewed by 430
Abstract
Research on cross-language adult speech perception shows that non-native speech sounds are interpreted through the listener’s L1 phonological system. According to the Perceptual Assimilation Model (PAM) and its extension, PAM-L2, discriminability of non-native/L2 speech contrasts is determined by how two phones are assimilated to L1 phonological categories. Specifically, discriminability varies depending on perceived overlap with L1 phonological categories. This study assessed the PAM/PAM-L2 account of the assimilation–discrimination relationship in discrimination of non-native/L2 stop–fricative contrasts, focusing on how discrimination varies with assimilation overlap. Chinese listeners completed assimilation and AXB discrimination tasks with six English (/p-f/, /b-v/, /t-θ/, /t-s/, /d-ð/, /d-z/) and two Persian (/k-x/, /g-ɣ/) stop–fricative contrasts. The contrasts were assimilated as four Uncategorized–Categorized (UC) contrasts, one with no overlap and three with partial overlap, and four Two-Category (TC) contrasts. The discrimination results showed that TC and non-overlapping UC contrasts were more accurately discriminated than partially overlapping UC contrasts, consistent with PAM/PAM-L2. Further analysis revealed that overlap scores were strongly negatively correlated with discrimination accuracy at the group level, and this correlation was also significant for most contrasts at the individual level. These findings suggest that exploring assimilation overlap may help clarify the assimilation–discrimination relationship in non-native/L2 stop–fricative contrast discrimination.
(This article belongs to the Section Cognition)

24 pages, 11178 KB  
Article
FLAMA: Frame-Level Alignment Margin Attack for Scene Text and Automatic Speech Recognition
by Yikun Xu, Zhiheng Xu and Pengwen Dai
Electronics 2026, 15(5), 1064; https://doi.org/10.3390/electronics15051064 - 4 Mar 2026
Cited by 1 | Viewed by 525
Abstract
Scene text recognition (STR) and automatic speech recognition (ASR) translate visual or acoustic signals into linguistic sequences and underpin many modern perception systems. Although their front-ends and decoders differ (e.g., CTC-based, attention-based, or variants), both tasks ultimately rely on aligning input frames to output tokens using deep learning techniques, which exposes a shared vulnerability to adversarial perturbations. Existing attacks commonly optimize global sequence-level objectives. As a result, decisive frames are treated implicitly, and optimization can become unnecessarily diffuse over long input sequences, hindering convergence and perceptual quality. To address these issues, we propose FLAMA, a unified Frame-Level Alignment Margin Attack, which can be used for both STR and ASR models. FLAMA explicitly targets alignment by maximizing per-frame (or per-step) recognition margins. The design is decoder-agnostic and applies to both CTC-based and attention-based pipelines. It employs a recognition-score-aware Step/Halt gate that concentrates updates on the most critical frames, and a stabilization stage that suppresses late-iteration oscillations to improve optimization stability and perceptual control. Ablation analyses show that stabilization consistently enhances attack success and reduces distortion. We evaluate FLAMA on STR benchmarks (SVT, CUTE80, and IC13) with CRNN, STAR, and TRBA, and on the ASR benchmark (LibriSpeech) with a Wav2Vec 2.0 model. Across modalities and architectures, FLAMA achieves near-100% attack success while substantially reducing ℓ2 distortion and improving perceptual metrics compared with FGSM/PGD baselines. These results highlight frame-level alignment as a shared weak point across visual and audio sequence recognizers and suggest localized margin objectives as a principled route to effective sequence attacks.

25 pages, 2133 KB  
Article
Phonological Feature Posteriors and Cue-Specific Accent Perception in Hindi- and Tamil-Accented English
by Nitin Venkateswaran and Ratree Wayland
Brain Sci. 2026, 16(2), 177; https://doi.org/10.3390/brainsci16020177 - 31 Jan 2026
Viewed by 558
Abstract
Background/Objectives: Accented speech reflects systematic deviation from target-language phonetic norms. This study demonstrates that perceived accent strength covaries with selective, gradient differences in phonological feature realization. We examine whether perceived accents in Hindi- and Tamil-accented English reflect uniform segmental deviation or cue-specific patterns of phonological feature realization. Methods: English speech produced by native speakers of Hindi and Tamil was evaluated using native listener accentedness ratings. Phonetic variation was analyzed using posterior probabilities of phonological features derived from a machine learning model, Phonet. The analyses focused on liquids (laterals and rhotics, e.g., /l/, /ɭ/, and /ɻ/) and labial segments in the fricative–glide space (e.g., /v/, /w/, and /ʋ/), with attention to word position and feature-level generalization. Results: Accentedness ratings differed systematically for Hindi- and Tamil-accented English and covaried with a subset of phonological feature dimensions, yielding contrast- and context-specific patterns of perceptually relevant variation. Not all features that varied in production contributed to perceived accent strength. Conclusions: These findings support a cue-specific, perception-grounded account of accentedness and establish phonological feature posteriors derived from Phonet as interpretable phonological categories through which gradient L2 production differences are evaluated by listeners.
(This article belongs to the Special Issue Language Perception and Processing)

21 pages, 2995 KB  
Article
Language Experience Shapes Neural Grouping of Speech by Accent: EEG Evidence from Native, Second-Language, and Heritage Listeners
by Lauren L. Hong, Chao Han and Philip J. Monahan
Brain Sci. 2026, 16(2), 174; https://doi.org/10.3390/brainsci16020174 - 31 Jan 2026
Viewed by 947
Abstract
Background: Accented speech contains talker-indexical cues that listeners can use to infer social group membership, yet it remains unclear how the auditory system categorizes accent variability and how this process depends on language experience. Methods: The current study used EEG and the MMN oddball paradigm to test pre-attentive neural sensitivity to accent changes of English words stopped produced by Canadian English or Mandarin Chinese-accented English talkers. Three participant groups were tested: Native English listeners, L1-Mandarin listeners, and Heritage Mandarin listeners. Results: In the Native English and L1-Mandarin groups, we observed MMNs to the Canadian accented English deviant, indicating that the brain can group speech by accent despite substantive inter-talker variation and that this grouping is consistent with an experience-dependent sensitivity to accent. Exposure to Mandarin Chinese-accented English modulated MMN magnitude. Time-frequency analyses suggested that α and low-β power during accent encoding varied with language background, with Native English listeners showing stronger activity when presented with Mandarin Chinese-accented English. Finally, the neurophysiological response in the Heritage Mandarin group reflected a broader phonological space encompassing both Canadian English and Mandarin-accented English, and its magnitude was predicted by Chinese proficiency. Conclusions: These findings provide brain-based evidence that automatic accent categorization is not uniform across listeners but interacts with native phonology and second-language experience.
(This article belongs to the Special Issue Language Perception and Processing)

25 pages, 2358 KB  
Article
Near-Merger and Contextual Sensitivity in the Perception of /n-l/ in Sichuan Mandarin
by Minghao Zheng, Allen Shamsi and Ratree Wayland
Brain Sci. 2026, 16(2), 155; https://doi.org/10.3390/brainsci16020155 - 29 Jan 2026
Viewed by 610
Abstract
Background/Objectives: Sichuan Mandarin is often described as exhibiting overlap or merger between word-initial /n/ and /l/, but perceptual sensitivity across phonetic contexts remains underexplored. This study examines whether perception of the /n-l/ contrast varies by vowel context and listener experience. Methods: Thirty-two Sichuan Mandarin listeners completed categorical identification and same–different AX discrimination tasks using seven-step /n/ → /l/ continua derived from native-speaker productions in /i/ and /a/ contexts. Sensitivity, response bias, accuracy, and response times were analyzed alongside individual differences. Acoustic properties of the stimuli were quantified using spectral and amplitude-based measures. Results: Listeners showed overall reduced sensitivity to the /n-l/ contrast, with substantially stronger perceptual differentiation in /i/ than in /a/ contexts. Bias patterns were comparable across contexts, indicating sensitivity-driven effects. Acoustic analyses showed more robust cue structure in the /i/ continuum. Age, education, and Standard Mandarin experience modulated response efficiency but did not eliminate the vowel asymmetry. Conclusions: Results support a context-dependent near-merger of /n/ and /l/, shaped by acoustic cue availability and experience-based cue exploitation.
(This article belongs to the Special Issue Language Perception and Processing)

19 pages, 979 KB  
Article
Long-Term Auditory, Tinnitus, and Psychological Outcomes After Cochlear Implantation in Single-Sided Deafness: A Two-Year Prospective Study
by Jasper Karl Friedrich Schrader, Moritz Gröschel, Agnieszka J. Szczepek and Heidi Olze
J. Clin. Med. 2026, 15(2), 644; https://doi.org/10.3390/jcm15020644 - 13 Jan 2026
Cited by 1 | Viewed by 999
Abstract
Background/Objectives: Single-sided deafness (SSD) impairs speech perception, reduces spatial hearing, decreases quality of life, and is frequently accompanied by tinnitus. Cochlear implantation (CI) has become an established treatment option, but long-term prospective evidence across multiple functional and psychological domains remains limited. This study investigated auditory performance, subjective hearing outcomes, tinnitus burden, and psychological well-being over a two-year follow-up in a large SSD cohort. Methods: Seventy adults with SSD underwent unilateral CI. Assessments were conducted preoperatively and at 6 months, 1 year, and 2 years postoperatively. Outcome measures included the Freiburg Monosyllable Test (FS), Oldenburg Inventory (OI), Nijmegen Cochlear Implant Questionnaire (NCIQ), Tinnitus Questionnaire (TQ), Perceived Stress Questionnaire (PSQ), Generalized Anxiety Disorder scale (GAD-7), and General Depression Scale (ADS-L). Longitudinal changes were analyzed using Wilcoxon signed-rank tests with effect sizes; Holm-adjusted p-values were applied for baseline-to-follow-up comparisons. Results: Speech perception improved markedly within the first 6 months and remained stable through 2 years, with large effect sizes. All OI subdomains demonstrated early and sustained improvements in subjective hearing ability. Several hearing-related quality-of-life domains assessed by the NCIQ, particularly social interaction, self-esteem, and activity participation, showed medium-to-large long-term improvements. Tinnitus severity decreased substantially, with marked reductions observed by 6 months and maintained thereafter; the proportion of tinnitus-free patients increased at follow-up, although tinnitus symptoms persisted in a substantial subset of participants. Perceived stress was reduced initially at the early follow-up and remained below baseline thereafter. Anxiety and depressive symptoms mostly stayed within nonclinical ranges, showing no lasting changes after adjusting for multiple comparisons. Conclusions: In this prospective cohort, cochlear implantation was associated with durable improvements in auditory outcomes, tinnitus burden, and selected patient-reported quality-of-life domains over two years. Although significant functional and patient-centered improvements were noted, persistent tinnitus and diverse psychosocial outcomes underscore the need for personalized counseling and comprehensive follow-up that incorporate patient-reported outcomes and psychological assessments.

20 pages, 707 KB  
Article
Beyond Native Norms: A Perceptually Grounded and Fair Framework for Automatic Speech Assessment
by Mewlude Nijat, Yang Wei, Shuailong Li, Abdusalam Dawut and Askar Hamdulla
Appl. Sci. 2026, 16(2), 647; https://doi.org/10.3390/app16020647 - 8 Jan 2026
Cited by 1 | Viewed by 624
Abstract
Pronunciation assessment is central to computer-assisted pronunciation training (CAPT) and speaking tests, yet most systems still adopt a native norm, treating deviations from canonical L1 pronunciations as errors. In contrast, rating rubrics and psycholinguistic evidence emphasize intelligibility for a target listener population and show that listeners rapidly adapt their phonetic categories to new accents. We argue that automatic assessment should likewise be referenced to the target learner group. We build a Transformer-based mispronunciation detection (MD) model that computationally mimics listener adaptation: it is first pre-trained on multi-speaker LibriSpeech, then fine-tuned on the non-native L2-ARCTIC corpus that represents a specific learner population. Fine-tuning, using either synthetic or human MD labels, constrains updates to the phonetic space (i.e., the representation space used to encode phone-level distinctions, the learned phone/phonetic embedding space, and its alignment with acoustic representations), which means that only the phonetic module is updated while the rest of the model stays fixed. Relative to the pre-trained model, L2 adaptation substantially improves MD recall and F1, increasing ROC–AUC from 0.72 to 0.85. The results support a target-population norm and inform the design of perception-aligned, fairer automatic pronunciation assessment systems.

28 pages, 742 KB  
Article
L2 Pragmatics Instruction in the Greek EFL Classroom: Teachers’ Competence, Beliefs, and Classroom Challenges
by Despoina Tosounidou and Marina Terkourafi
Languages 2026, 11(1), 12; https://doi.org/10.3390/languages11010012 - 31 Dec 2025
Viewed by 1396
Abstract
While Greek EFL learners’ pragmatic competence has been frequently investigated, few studies have focused on Greek EFL teachers’ pragmatic knowledge. Complementing these earlier studies based on semi-structured interviews, we employed an extended online questionnaire and discourse completion tasks (DCTs) to explore the pragmatic competence of 72 Greek EFL teachers. Pragmatic comprehension was evaluated using scenarios that required participants to assess speech acts, while their ability to produce pragmatically appropriate responses was also assessed. Likert-scale items explored teachers’ perceptions about L2 instruction and their own abilities in this regard. Findings suggest that Greek EFL teachers possess an above-average level of pragmatic competence, which nevertheless has not led to them systematically integrating L2 pragmatics instruction in their classrooms. Additional qualitative data collected through semi-structured interviews suggest that teachers’ lack of integration of explicit pragmatics instruction is not due to their not recognizing its importance, but rather to feeling inadequately prepared to implement this, which in turn points to the lack of emphasis on L2 pragmatics in teacher education programs. We catalog the most significant challenges in incorporating L2 pragmatics instruction in Greek EFL classrooms in terms of teacher and learner factors, as well as the Greek EFL context itself.
(This article belongs to the Special Issue Greek Speakers and Pragmatics)

33 pages, 3147 KB  
Review
Perception–Production of Second-Language Mandarin Tones Based on Interpretable Computational Methods: A Review
by Yujiao Huang, Zhaohong Xu, Xianming Bei and Huakun Huang
Mathematics 2026, 14(1), 145; https://doi.org/10.3390/math14010145 - 30 Dec 2025
Cited by 1 | Viewed by 1824
Abstract
We survey recent advances in second-language (L2) Mandarin lexical tones research and show how an interpretable computational approach can deliver parameter-aligned feedback across perception–production (P ↔ P). We synthesize four strands: (A) conventional evaluations and tasks (identification, same–different, imitation/read-aloud) that reveal robust tone-pair asymmetries and early P ↔ P decoupling; (B) physiological and behavioral instrumentation (e.g., EEG, eye-tracking) that clarifies cue weighting and time course; (C) audio-only speech analysis, from classic F0 tracking and MFCC–prosody fusion to CNN/RNN/CTC and self-supervised pipelines; and (D) interpretable learning, including attention and relational models (e.g., graph neural networks, GNNs) opened with explainable AI (XAI). Across strands, evidence converges on tones as time-evolving F0 trajectories, so movement, turning-point timing, and local F0 range are more diagnostic than height alone, and the contrast between Tone 2 (rising) and Tone 3 (dipping/low) remains the persistent difficulty; learners with tonal vs. non-tonal language backgrounds weight these cues differently. Guided by this synthesis, we outline a tool-oriented framework that pairs perception and production on the same items, jointly predicts tone labels and parameter targets, and uses XAI to generate local attributions and counterfactual edits, making feedback classroom-ready.
(This article belongs to the Section E1: Mathematics and Computer Science)

16 pages, 5040 KB  
Article
Phonetic Training and Talker Variability in the Perception of Spanish Stop Consonants
by Iván Andreu Rascón
Languages 2026, 11(1), 1; https://doi.org/10.3390/languages11010001 - 23 Dec 2025
Viewed by 1224
Abstract
This study examined how variability in phonetic training input (high vs. low) influences the perception and acquisition of Spanish stop consonants by English-speaking beginners. A total of 128 participants completed 20 online identification sessions targeting /p, t, k, b, d, g/. In the high-variability condition (HVPT), learners heard tokens from six speakers, and in the low-variability condition (LVPT), all input came from a single speaker. Training followed an interleaved-talker design with immediate feedback, and perceptual learning was evaluated using a Bayesian hierarchical logistic regression analysis. Results showed improvement across sessions for both groups, with identification accuracy reaching ceiling by the end of the training sessions. Differences between HVPT and LVPT were small: LVPT showed steeper categorization trajectories in some cases due to slightly lower baselines, but neither condition yielded a measurable advantage. The pattern observed suggests that for boundary-shift contrasts such as Spanish stops, perceptual improvements are driven primarily by input quantity rather than variability. This interpretation aligns with input-based models of L2 speech learning (SLM-r, L2LP) and underscores the role of repeated exposure in restructuring phonological categories.
(This article belongs to the Special Issue The Impacts of Phonetically Variable Input on Language Learning)

16 pages, 1176 KB  
Article
Hearing Tones, Missing Boundaries: Cross-Level Selective Transfer of Prosodic Boundaries Among Chinese–English Learners
by Lan Fang, Zilong Li, Keke Yu, John W. Schwieter and Ruiming Wang
Behav. Sci. 2025, 15(12), 1605; https://doi.org/10.3390/bs15121605 - 21 Nov 2025
Viewed by 711
Abstract
Second language (L2) learners often struggle to process prosodic boundaries, which are essential for speech comprehension. This study investigated the nature of these difficulties and how first language (L1) cue-weighting strategies transfer to L2 processing among Chinese (Mandarin)–English learners. The rising pitch that cues English phrase boundaries acoustically overlaps with functionally distinct Chinese lexical tones. Through two experiments comparing Chinese–English learners and native English speakers, we assessed sensitivity across lexical constituent, phrase, and sentence boundaries and manipulated acoustic cues (pause, lengthening, pitch) to estimate their perceptual weights during phrase-boundary identification. L2 learners showed reduced discrimination sensitivity only at the phrase level, performing comparably to native speakers at lexical constituent and sentence boundaries. For phrase boundaries, learners over-relied on pitch and under-relied on pre-boundary lengthening compared to native speakers, though both groups weighted pauses strongly. This selective deficit implicates the transfer of L1 cue-weighting strategies more than a global knowledge deficit. Our findings support a dynamic transfer model where L1 sensitivity to lexical tone transfers to L2 phrase perception, elevating the weight of pitch. While learners show partial adaptation, these results refine the Cue-Weighting Transfer Hypothesis by demonstrating that L2 prosodic acquisition involves both integrated L1 transfer and L2-driven reweighting strategies.

28 pages, 1026 KB  
Review
Neuropsychological Assessments to Explore the Cognitive Impact of Cochlear Implants: A Scoping Review
by Brenda Villarreal-Garza and María Amparo Callejón-Leblic
J. Clin. Med. 2025, 14(21), 7628; https://doi.org/10.3390/jcm14217628 - 27 Oct 2025
Cited by 1 | Viewed by 1918
Abstract
Background/Objectives: Hearing loss constitutes a modifiable risk factor for dementia. Auditory rehabilitation with devices such as cochlear implants (CIs) has been reported to prevent cognitive decline in older adults. However, post-implant cognitive effects remain highly heterogeneous across studies. Thus, the aim of this review is to synthesize the evidence on cognitive outcomes and their interplay with speech perception, quality of life (QoL), and psychological status. Methods: A bibliographic search was conducted following PRISMA guidelines from January 2015 to July 2025. Studies were eligible if they included adult CI candidates who completed cognitive and audiometric assessments. In total, 43 studies, including longitudinal and cross-sectional designs, were reviewed. Several studies also assessed hearing aid (HA) users and normal-hearing (NH) controls. Principal results were identified and analyzed across cognitive domains, audiological performance, QoL, and psychological outcomes. Results: CIs significantly improved cognition across longitudinal studies, with a higher number of assessments reporting gains in memory (61%), global cognition (57%), and executive function (46%); while attention, language, and visuospatial skills were less frequently evaluated. Though findings are not fully consistent, interactions between speech intelligibility and cognitive subdomains have also been found in several studies: global cognition (25%), executive function (22%), visuospatial skills (20%), attention (21%), language (17%), and memory (12%). Improvements in QoL, social engagement, depression, and anxiety are frequently observed. Conclusions: The lack of unified and adapted neurocognitive tools may prevent the observation of consistent outcomes across studies. Further research and multimodal data are still needed to fully understand the interaction between cognition, speech intelligibility, and QoL in CI users.
(This article belongs to the Special Issue The Challenges and Prospects in Cochlear Implantation)

20 pages, 594 KB  
Article
Identification of Mandarin Tones in Loud Speech for Native Speakers and Second Language Learners
by Hui Zhang, Xinwei Chang, Weitong Liu, Yilun Zhang and Na Wang
Behav. Sci. 2025, 15(8), 1062; https://doi.org/10.3390/bs15081062 - 5 Aug 2025
Viewed by 3078
Abstract
Teachers often raise their vocal volume to improve intelligibility or capture students’ attention. While this practice is common in second language (L2) teaching, its effects on tone perception remain understudied. To fill this gap, this study explores the effects of loud speech on Mandarin tone perception for L2 learners. Twenty-two native Mandarin speakers and twenty-two Thai L2 learners were tested on their perceptual accuracy and reaction time in identifying Mandarin tones in loud and normal modes. Results revealed a significant between-group difference: native speakers consistently demonstrated a ceiling effect across all tones, while L2 learners exhibited lower accuracy, particularly for Tone 3, the falling-rising tone. The loud speech had different impacts on the two groups. For native speakers, tone perception accuracy remained stable across different speech modes. In contrast, for L2 learners, loud speech significantly reduced the accuracy of Tone 3 identification and increased confusion between Tones 2 and 3. Reaction times in milliseconds were prolonged for all tones in loud speech for both groups. When subtracting the length of the tones, the delay of RT was evident only for Tones 3 and 4. Therefore, raising the speaking volume negatively affects the Mandarin tone perception of L2 learners, especially in distinguishing Tone 2 and Tone 3. Our findings have implications for both theories of L2 tone perception and pedagogical practices.
(This article belongs to the Section Cognition)

12 pages, 639 KB  
Article
Identification of Perceptual Phonetic Training Gains in a Second Language Through Deep Learning
by Georgios P. Georgiou
AI 2025, 6(7), 134; https://doi.org/10.3390/ai6070134 - 23 Jun 2025
Cited by 1 | Viewed by 2482
Abstract
Background/Objectives: While machine learning has made substantial strides in pronunciation detection in recent years, there remains a notable gap in the literature regarding research on improvements in the acquisition of speech sounds following a training intervention, especially in the domain of perception. This study addresses this gap by developing a deep learning algorithm designed to identify perceptual gains resulting from second language (L2) phonetic training. Methods: The participants underwent multiple sessions of high-variability phonetic training, focusing on discriminating challenging L2 vowel contrasts. The deep learning model was trained on perceptual data collected before and after the intervention. Results: The results demonstrated good model performance across a range of metrics, confirming that learners’ gains in phonetic training could be effectively detected by the algorithm. Conclusions: This research underscores the potential of deep learning techniques to track improvements in phonetic training, offering a promising and practical approach for evaluating language learning outcomes and paving the way for more personalized, adaptive language learning solutions. Deep learning enables the automatic extraction of complex patterns in learner behavior that might be missed by traditional methods. This makes it especially valuable in educational contexts where subtle improvements need to be captured and assessed objectively.