Article

Speech Production Intelligibility Is Associated with Speech Recognition in Adult Cochlear Implant Users

by Victoria A. Sevich 1,*, Davia J. Williams 2, Aaron C. Moberly 2 and Terrin N. Tamati 1

1 Department of Speech and Hearing Science, The Ohio State University, Columbus, OH 43210, USA
2 Department of Otolaryngology, Vanderbilt University Medical Center, Nashville, TN 37232, USA
* Author to whom correspondence should be addressed.
Brain Sci. 2025, 15(10), 1066; https://doi.org/10.3390/brainsci15101066
Submission received: 23 August 2025 / Revised: 24 September 2025 / Accepted: 28 September 2025 / Published: 30 September 2025
(This article belongs to the Special Issue Language, Communication and the Brain—2nd Edition)

Abstract

Background/Objectives: Adult cochlear implant (CI) users exhibit broad variability in speech perception and production outcomes. Cochlear implantation improves the intelligibility (comprehensibility) of CI users’ speech, but the degraded auditory signal delivered by the CI may attenuate this benefit. Among other effects, degraded auditory feedback can lead to compression of the acoustic–phonetic vowel space, which makes vowel productions confusable, decreasing intelligibility. Sustained exposure to degraded auditory feedback may also weaken phonological representations. The current study examined the relationship between subjective ratings and acoustic measures of speech production, speech recognition accuracy, and phonological processing (cognitive processing of speech sounds) in adult CI users. Methods: Fifteen adult CI users read aloud a series of short words, which were analyzed in two ways. First, acoustic measures of vowel distinctiveness (i.e., vowel dispersion) were calculated. Second, thirty-seven normal-hearing (NH) participants listened to the words produced by the CI users and rated the subjective intelligibility of each word from 1 (least understandable) to 100 (most understandable). CI users also completed an auditory sentence recognition task and a nonauditory cognitive test of phonological processing. Results: CI users rated as having more understandable speech demonstrated more accurate sentence recognition than those rated as having less understandable speech, but intelligibility ratings were only marginally related to phonological processing. Further, vowel distinctiveness was marginally associated with sentence recognition but not related to phonological processing or subjective ratings of intelligibility. 
Conclusions: The results suggest that speech intelligibility ratings are related to speech recognition accuracy in adult CI users, and future investigation is needed to identify the extent to which this relationship is mediated by individual differences in phonological processing.

1. Introduction

Cochlear implants (CIs) provide access to spoken language for adults with moderate-to-profound hearing loss (HL), but speech communication outcomes are highly variable [1,2,3,4,5]. While some CI users achieve excellent speech recognition outcomes, others continue to report significant difficulties understanding speech, particularly in real-world, challenging listening conditions [4,6,7]. Standard clinical outcome assessments typically focus on speech recognition, often using isolated words or sentences presented in quiet or noise. While these measures provide valuable insight into auditory processing, they likely do not capture individual differences in broader communication ability. Successful communication via spoken language requires that individuals not only perceive speech accurately but also produce it accurately. However, speech production outcomes have received less attention than perception outcomes among adult CI users, and the extent to which speech production reflects individual differences in speech perception and processing remains unclear. Understanding the relationship between speech production and perception outcomes may offer new insights into the underlying mechanisms that shape communication outcomes in adult CI users. In this study, we examined whether individual differences in speech production, measured perceptually and acoustically, are associated with speech recognition scores in experienced adult CI users.

1.1. Auditory Feedback Influences Speech Production

Auditory feedback refers to the sensory information a talker receives about their own speech. Auditory feedback plays a critical role in influencing speech production, both in real time and over a longer time scale. In the short term, auditory feedback mechanisms help talkers detect and correct speech errors in real time, facilitating the intelligibility of their speech. In the long term, broader auditory experiences and auditory input across the lifespan (which include auditory feedback) support the maintenance and refinement of phonological representations, or a listener’s mental representation of speech sounds. In individuals with normal hearing (NH), ongoing, long-term access to detailed auditory input supports the formation and refinement of robust phonological representations. Phonological representations serve as a foundation for both the perception and production of speech, allowing for accurate recognition of speech sounds and clear and intelligible articulation, e.g., [8,9,10].
Evidence for a link between speech perception and production is robust, e.g., [11,12]. For example, real-time experimental manipulations of auditory feedback consistently drive talkers to make online modifications to their own speech [13,14,15,16,17,18]. The authors of [14] instructed talkers to produce a series of consonant–vowel–consonant (CVC) sequences and modified, in real time, the first and second formant frequencies (F1 and F2) of one of the vowels (/ɛ/) produced by the talkers. Over the course of the experimental session, this altered feedback led talkers to modify their productions not only of /ɛ/ but also of the other vowels tested (/i, ɪ, æ, ɑ/). These findings support the notion that NH listeners make use of short-term auditory feedback to adjust productions of their own speech in real time. In addition, the articulatory adjustments were made not only to the vowel whose formants were manipulated but also to the other vowels in each talker’s vowel space. These modifications to all vowels in the vowel space indicate that talkers can modify vowel production to maintain phonetic contrast among vowels. While most of these experiments manipulated short-term auditory feedback, changes to auditory feedback over longer periods of time may also shape phonological representations if altered feedback is sustained.
For individuals with CIs, auditory input is spectrally degraded and lacks the fine-grained acoustic detail that supports detailed phonological representations. In post-lingually deafened adult CI users—i.e., CI users who acquired hearing loss after learning their first language—cochlear implantation often follows a long period of auditory deprivation in the form of sensorineural HL [19]. This reduced auditory fidelity of the signal delivered by the CI following a period of auditory deprivation can reduce the precision of phonological categories over time, in part by limiting the quantity and quality of auditory feedback [20,21,22,23,24,25,26]. Despite this degraded auditory feedback, however, many CI users use this feedback in real time to monitor their speech as it is produced. Studies comparing speech production in conditions where CI users’ processors are on (i.e., auditory feedback is available) versus off (i.e., no auditory feedback is available) have observed changes in acoustic speech parameters as a result of these short-term modifications to auditory feedback [27,28,29,30,31,32]. Collectively, evidence from these studies suggests that even shortly after CI activation, many CI users take advantage of auditory feedback to shape their speech production in real time, potentially to facilitate the intelligibility of their own speech.
The results reviewed above are consistent with models that argue for the importance of auditory feedback for the development and maintenance of intelligible speech. For instance, the Directions Into Velocities of Articulators (DIVA) model of speech motor control [33,34,35] posits a direct link between auditory feedback and motor control of speech articulators. Specifically, the DIVA model proposes the existence of a neurocognitive representation of the expected auditory consequences of a spoken utterance in addition to an acceptable range of variation for that utterance. According to DIVA, if the incoming acoustic signal of a talker’s own utterance does not fall within this acceptable range of variation, a feedback control system initiates corrective motor signals that the somatosensory system translates into movements to correct speech errors or to otherwise enhance speech production (e.g., by hyperarticulating speech when asked to repeat it). Further, the DIVA model proposes that the targets of speech utterances are auditory, rather than articulatory. Therefore, under this framework, long-term limitations in auditory feedback could impact speech production through a combination of reduced motor control of articulators and imprecise auditory targets of speech gestures. Intact phonological representations are critical for maintaining auditory targets, underscoring the importance of phonological processing (i.e., the ability to access, compare, and manipulate speech sounds) for both the perception and production of spoken language.

1.2. Phonological Processing for Speech Perception and Production

A large body of research across diverse populations has provided evidence that more accurate speech perception is associated with more precise or intelligible speech production [36,37,38,39,40,41,42,43]. For example, Ref. [42] found that NH participants who were more accurate in discriminating between vowel contrasts in perception also produced a given pair of vowels with more phonetic contrast than those who were less accurate in discrimination. This relationship between perception and production may be especially relevant for CI users, whose perception and production systems must adapt to a degraded input signal. Phonological processing supports both perception and production. Phonological processing tends to be poorer in adults with HL relative to adults with NH, likely due to long-term reductions in and changes to auditory input during long periods of deafness [20,22,23,24,26,44,45,46,47]. In CI users specifically, weaker phonological processing skills may limit the ability to extract detail from the speech signal, resulting in difficulties recognizing phonemic differences across stimuli and in producing clearly differentiated speech sounds [22,24,25,26,48,49,50,51]. Collectively, these findings demonstrate that auditory deprivation leads to modifications in the sound structure of the lexicon among adults with HL with and without CIs and that these modifications have consequences for both speech production and perception.
Because modified phonological representations affect both the production and perception of spoken language, both production and perception measures can be used to assess the quality of phonological representations. One way of indirectly assessing phonological representations in CI users is through timed single-word reading tasks. These reading efficiency tasks provide an indirect measure of phonological representations by assessing processing efficiency, which reflects the quality of underlying representations. More specifically, reading efficiency relies on phonological decoding processes and direct access to the sound structure of the lexicon, both for nonwords and real words [52,53,54]. In a study assessing phonological processing in experienced adult CI users, Ref. [51] found that CI users with better phonological processing efficiency, reflected by higher word reading scores, demonstrated higher accuracy on a measure of sentence recognition than CI users with relatively poorer phonological processing. Substantial individual differences in phonological processing were observed among the CI users in their study, suggesting that even experienced adult CI users show variability in the efficiency through which they process and encode speech sounds. The implications of this variability have been explored to some extent using measures of speech perception in adult CI users [24,25,49,55], but the extent to which individual differences in speech production are influenced by phonological processing ability is not well understood.

1.3. Acoustic and Perceptual Measures of Speech Production in CI Users and Their Relation to Speech Perception

Despite growing interest in the cognitive factors that contribute to individual differences in CI speech recognition outcomes, the relationship between speech production and perception remains relatively underexplored in adult CI users. Most research has focused on speech recognition performance, with relatively little attention paid to how speech production quality relates to perceptual and cognitive–linguistic abilities. Some studies suggest that both speech production and phonological processing improve following cochlear implantation [29,43,56], likely reflecting the benefits of restored auditory input on phonological representations. One way of assessing the effects of changes in phonological representations after cochlear implantation through speech production is by identifying the extent to which CI users produce phonetic contrasts between similar-sounding words or phonemes, which can entail measuring either the acoustic aspects or the subjective intelligibility of speech produced by CI users [28,29,30,31,43,56,57]. Further, examining the relationship between these measures of production and measures of perception can provide insight into the production–perception link in CI users.
Acoustic measures of vowel production, like vowel space area and first and second formant (F1 and F2) frequency measurements, have been used in both cross-sectional and longitudinal studies to quantify speech production in CI users [31,56,57,58]. However, few studies have compared acoustic measures of speech production to speech perception in CI users. Ref. [59] collected longitudinal vowel production data for four CI users in addition to performance on a closed-set vowel identification (perception) task. The authors of that study did not directly compare production and perception measures to one another but found that increases in vowel identification accuracy over about 18 months post-activation were accompanied by global increases in F2 across all vowels (corresponding to fronting of the acoustic–phonetic vowel space). These results demonstrate that changes in production and perception following implantation tend to track together, at least in the first 18 months.
One acoustic measure of vowel production that may be related to subjective speech intelligibility entails measuring the overall size of the acoustic–phonetic vowel space to characterize both phonological representation of contrast between vowels and the resulting intelligibility of vowel productions. Auditory deprivation can lead to overall compression of the vowel space, resulting in vowel productions that are ambiguous or confusable with one another [60,61,62,63]. Vowel dispersion is an acoustic metric used to measure relative expansion or compression of the vowel space and is quantified as the Euclidean distance from a vowel-dependent midpoint of a talker’s vowel space to the acoustic realization of a vowel in the F1 × F2 dimension, e.g., [10,50,64,65,66]. Talkers who produce more dispersed vowels are more intelligible than those who produce less dispersed vowels [64], at least as assessed using vowel dispersion measures from vowels in running sentences and measuring intelligibility as the number of words in a sentence that a listener correctly reports back. These findings suggest that vowel dispersion could serve as an acoustic predictor of speech intelligibility.
Restoration of auditory feedback provided by the CI in post-lingually deafened adults can allow CI users to enhance the acoustic–phonetic contrast between speech sounds, and it has been assumed that this contrast enhancement can facilitate the intelligibility of their speech [28,29,56,57]; for reviews, see [67,68]. In one study that examined changes in speech production longitudinally following cochlear implantation, Ref. [57] assessed acoustic and perceptual measures of speech production in post-lingually deafened adults before cochlear implantation and at 1 month, 6 months, and 2 years post-CI activation. They found acoustic changes in speech production up until 2 years post activation, such that the CI users’ speech became more acoustically similar to that of their NH peers over time. In addition, perceptual ratings of the quality of the CI users’ speech indicated that their speech became more subjectively intelligible over the 2-year period following CI activation, as rated by trained clinicians. These results suggest that the restoration of auditory input via CI and experience with the signal delivered by the CI can yield improvements in both speech perception and production relative to performance pre-implant.
Due to the substantial variability observed in speech perception outcomes post-implantation, several studies have sought to identify potential pre-implant predictors of post-implant performance. For example, in pre-lingually deafened adults, Refs. [69,70] found that CI users with more subjectively intelligible speech prior to cochlear implantation demonstrated stronger speech recognition outcomes post-implantation than CI users with less intelligible speech pre-implantation. These findings suggest that aspects of speech production are related to speech perception in CI users, even when speech production is assessed prior to implantation. These results potentially reflect a common underlying cognitive–linguistic mechanism—phonological processing—that both influences the intelligibility of speech pre-implant and shapes the success with which new CI users can adapt to the degraded signal to recognize and perceive speech post-implant. In addition, these results highlight the potential clinical utility of using intelligibility of speech production as both a predictive measure of clinical speech recognition outcomes and as a standalone clinical outcome measure. Given that both speech production and perception draw on phonological representations, investigating the relationship between speech production, speech perception, and phonological processing may provide a clearer window into the mechanisms that underlie variability in CI production and perception outcomes.

1.4. The Current Study

The purpose of this exploratory study is to establish the relationship between individual differences in speech production and speech perception in experienced adult CI users. This cross-sectional study focused on stable patterns of speech production and perception among CI users with greater than 2 years of CI use, rather than longitudinal change, to establish preliminary relationships among measures. We analyzed both subjective perceptual ratings of speech intelligibility and acoustic features of vowel distinctiveness using isolated words elicited from 15 CI users in a word reading task. The primary goal of this study was to establish the relationships among the subjective intelligibility of a CI user’s speech, their speech recognition accuracy, and an indirect measure of phonological processing. As a secondary goal, we determined whether an acoustic measure of clarity of speech production, vowel dispersion, is associated with both speech recognition accuracy and phonological processing. Finally, we sought to determine whether vowel dispersion is a reliable predictor of the subjective intelligibility ratings of a CI user’s speech. We hypothesized, first, that subjective ratings of CI speech intelligibility are associated with both speech recognition and phonological processing. We further hypothesized that vowel dispersion is associated with both speech recognition and phonological processing. Finally, we expected that vowels produced with a higher degree of dispersion would be rated as more subjectively intelligible than vowels produced with less dispersion. These findings will serve to establish preliminary evidence of relationships among measures of speech production, perception, and phonological processing in adult CI users, which could motivate future work to identify the utility of these metrics as predictive and explanatory clinical metrics of performance with a CI.
In turn, these findings could provide clinicians with feasible metrics by which to assess CI speech production and predict and explain variability in perception outcomes.

2. Materials and Methods

2.1. Participants: Cochlear Implant Users

Fifteen adult CI users participated in the current study. All CI users were peri- or post-lingually deafened, defined as having an onset of HL at or after 12 years of age. CI users were identical to those described in [50] and formed a subset of participants from [51]. CI users ranged in age from 24 to 76 years (mean age = 58.5 years, SD = 13.8). Ten participants were female and five were male. Seven of the CI users had a bimodal listening configuration, with one CI and a hearing aid in the contralateral, non-implanted ear. Five participants were bilateral CI users, and three were unilateral CI users (one CI and no hearing aid in the contralateral ear). The mean duration of deafness, defined as the years between self-reported onset of HL and age at first CI, was 30.1 years (SD = 15.2). Duration of deafness information was unavailable for 4 of the 15 CI participants, who were unable to report when their HL began. A summary of demographic information for CI users is included in Table 1. During testing, CI users wore their own devices (including contralateral hearing aids, if applicable), set to their everyday clinical settings to most closely replicate real-world communicative conditions. All participants were native, monolingual speakers of American English and reported no history of speech or cognitive disorders. Participants were recruited through the Department of Otolaryngology at The Ohio State University Wexner Medical Center, through The Ohio State University, and from the surrounding Columbus, Ohio metropolitan area. This study was approved by the Institutional Review Board at The Ohio State University. CI users completed all tasks in the laboratory in an audio booth or sound-treated testing room.

2.2. Participants: Normal Hearing Listeners

Forty-seven adults with self-reported NH were recruited to rate the subjective intelligibility of the CI users’ speech. All participants were recruited and participated online using the Prolific recruitment service and the gorilla.sc experimental platform [71] using their own desktop or laptop devices and headphones. Prior to participation, NH participants completed a headphone screener [72] and were excluded from analysis if their results suggested that they were not wearing headphones (i.e., a score of <5 out of 6 on the headphone screener). Ten participants were excluded in this way, resulting in a total of 37 participants retained for the current analysis. These 37 participants ranged in age from 18 to 60 years (mean age = 29.4 years) and included 20 women and 17 men. All participants were native speakers of American English with self-reported NH and no speech or language disorders.

2.3. Speech Production: Word Reading and Acoustic Analysis

2.3.1. Materials

Materials for the CI word reading task were the same as those described in [50]. Each CI participant read 240 monosyllabic CVC words that varied in neighborhood density and lexical frequency. Neighborhood density was operationalized as the number of words differing from a target word by the addition, deletion, or substitution of a single phoneme [8] and was obtained for each word using the University of Kansas Similarity Neighborhood Calculator or the Hoosier Mental Lexicon [73]. Lexical frequency was operationalized as the log of the contextual diversity in the SUBTLEXUS database, which derives word frequency information from movie subtitles [74,75]. Contextual diversity is defined within SUBTLEXUS as the count of movies where a target word is identified within its subtitles.
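The one-phoneme-edit definition of neighborhood density above can be sketched computationally. The following is a minimal illustration, not the actual Similarity Neighborhood Calculator: the toy lexicon and the `one_phoneme` / `neighborhood_density` helpers are hypothetical, and real use would require full phonemic transcriptions such as those in the Hoosier Mental Lexicon.

```python
def one_phoneme(a, b):
    # True if b differs from a by exactly one phoneme
    # addition, deletion, or substitution.
    if abs(len(a) - len(b)) > 1 or a == b:
        return False
    if len(a) == len(b):  # substitution
        return sum(x != y for x, y in zip(a, b)) == 1
    short, long_ = (a, b) if len(a) < len(b) else (b, a)
    # addition/deletion: removing one phoneme from the longer
    # sequence yields the shorter one
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))

def neighborhood_density(target, lexicon):
    # Count lexicon entries exactly one phoneme away from the target.
    return sum(one_phoneme(target, w) for w in lexicon)

# Toy phonemic lexicon (hypothetical transcriptions, ARPAbet-like).
lexicon = [("k", "ae", "t"), ("b", "ae", "t"), ("k", "ah", "t"),
           ("k", "ae"), ("s", "k", "ae", "t"), ("d", "ao", "g")]
print(neighborhood_density(("k", "ae", "t"), lexicon))  # → 4
```

Here "cat" has four neighbors in the toy lexicon ("bat" and "cut" by substitution, plus one deletion and one addition neighbor), matching the addition/deletion/substitution definition used in the study.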
For the current study, a subset of 84 of these 240 words was selected for analysis, which included the point vowels [æ, ɑ, i, u]. Point vowels have F1 × F2 values that represent the extreme points of the vowel quadrilateral, and previous studies found that these vowels yielded differences in vowel dispersion across vowel categories [10,66].

2.3.2. Procedure

CI users were seated with the experimenter in a quiet room. Words were presented visually, one at a time, on a computer screen, using the gorilla.sc experimental platform. Participants were instructed to read each word aloud naturally as it appeared on the screen. The experiment advanced automatically to the following word after 2500 ms. Word order was randomized for each participant. Participant responses were digitally recorded using a Shure cardioid condenser microphone with a 44.1 kHz sampling rate and 16-bit resolution. The total duration of the task was about 15 min.

2.3.3. Acoustic Analysis

In order to derive an acoustic metric of vowel distinctiveness for each token and each talker, a measure of dispersion from the center of each talker’s vowel space was calculated for each vowel token. Each recording was acoustically analyzed to identify F1 and F2 for each vowel in each CVC word. First, vowels in audio recordings were annotated using a forced aligner (MAUS; [76]) and manually verified by the first author in Praat [77]. The temporal midpoint of each annotated vowel was identified using Parselmouth, a Python (3.9.16) library that interfaces with Praat [78], and formant values were estimated at the measured midpoint of each vowel. For analysis, formant values were converted from Hz to Bark [79] following procedures in similar acoustic analyses [10,50,65,66]. Words that could not be acoustically analyzed were excluded, including words that participants misread (e.g., “ran” instead of “rune”) and words that were otherwise interrupted (e.g., by a cough). The total number of words for each of the 15 CI talkers included in the current analysis is provided in Table 2, shown separately by vowel category.
In order to derive a measure of dispersion from the center of each talker’s vowel space for each vowel, the estimated center of each talker’s vowel space was calculated separately for each of the 15 CI users. Following previous studies that have calculated vowel dispersion [65,80,81], the center of the vowel space was defined as the grand mean of the F1 and F2 values for /i/ and /ɑ/. However, given that the vowels used in this analysis were not evenly distributed around the vowel space and words used for analysis did not include an equal number of vowels for each talker, the resulting acoustic centers of the vowel space are dependent on the vowels used in the task and analysis and do not necessarily represent a true articulatory or perceptual midpoint. Vowel dispersion was calculated and defined as the Euclidean distance in the F1 × F2 Bark space from a given instance of a vowel to the center of that talker’s acoustic vowel space.
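The dispersion computation described above can be sketched as follows. This is a minimal illustration with made-up formant values, not the study’s analysis code: it assumes the Traunmüller (1990) Hz-to-Bark approximation (the paper cites its own conversion reference) and pools all /i/ and /ɑ/ tokens when computing the grand-mean center.

```python
import math

def hz_to_bark(f_hz):
    # Traunmüller (1990) Hz-to-Bark approximation (an assumption here;
    # the study cites its own Bark conversion reference [79]).
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

def vowel_space_center(tokens):
    # Grand mean of F1 and F2 (in Bark) over /i/ and /ɑ/ tokens,
    # here pooled across all tokens of both vowels.
    f1 = sum(t[0] for t in tokens) / len(tokens)
    f2 = sum(t[1] for t in tokens) / len(tokens)
    return (f1, f2)

def dispersion(token, center):
    # Euclidean distance from one vowel token to the talker's
    # vowel-space center in the F1 x F2 Bark plane.
    return math.hypot(token[0] - center[0], token[1] - center[1])

# Hypothetical (F1, F2) values in Hz for one talker.
i_hz = [(300.0, 2300.0), (320.0, 2250.0)]   # /i/ tokens
a_hz = [(750.0, 1200.0), (780.0, 1150.0)]   # /ɑ/ tokens
to_bark = lambda t: (hz_to_bark(t[0]), hz_to_bark(t[1]))
center = vowel_space_center([to_bark(t) for t in i_hz + a_hz])
print(dispersion(to_bark((300.0, 2300.0)), center))
```

A more peripheral token yields a larger distance from the center, so higher values correspond to a more expanded (less compressed) vowel space.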

2.4. Intelligibility Ratings of Cochlear Implant Users’ Speech

2.4.1. Materials

Recordings of words spoken by the 15 CI participants were used as stimulus materials for NH raters. Words containing the point vowels [æ, ɑ, i, u] were presented along with words containing other vowels, which were examined separately as part of a larger study (only words with point vowels were included in the current analysis). Across all vowels, words that could not be acoustically analyzed due to mispronunciations or misreadings were excluded from the auditory stimuli presented to the NH raters. Recordings were trimmed to remove silence before and after each utterance and normalized to the same RMS level prior to presentation.
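The RMS normalization step reduces to computing one gain factor per recording. The following is a minimal sketch on a raw sample sequence; the target level and function names are ours, not part of the study’s processing pipeline.

```python
import math

def rms(samples):
    # Root-mean-square amplitude of a sample sequence.
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def normalize_rms(samples, target_rms):
    # Scale every sample by one gain factor so that the
    # whole sequence reaches the target RMS level.
    gain = target_rms / rms(samples)
    return [s * gain for s in samples]

quiet = [0.01, -0.02, 0.015, -0.005]  # toy waveform samples
loud = normalize_rms(quiet, 0.1)
print(round(rms(loud), 6))  # → 0.1, the target level
```

Because every token is scaled to the same RMS level, loudness differences between talkers are removed and ratings reflect intelligibility rather than recording level.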
To ensure that each word spoken by each CI user was rated by multiple listeners, 10 lists of between 352 and 355 stimuli were created. Each list included between 16 and 33 utterances from each of the 15 talkers, determined based on the total number of available tokens from each CI user. Lists were pseudo-randomly assigned to each of the 37 NH raters. Each of the 1068 tokens included in the current analysis was rated between 4 and 8 times, with a mean of 5.2 ratings per token.

2.4.2. Procedure

NH participants completed the experiment on the gorilla.sc platform using their own desktop or laptop devices and headphones. They were asked to sit in a quiet room for the duration of the experiment. Listeners completed a headphone screener prior to participation. The online headphone screener consisted of a three-alternative forced-choice psychophysical task in which listeners heard three 1000 ms sequences of white noise and were asked to identify the interval that differed from the other two. In two of these intervals, the white noise bursts delivered to each ear were identical. In one of these intervals, one of the white noise bursts was shifted 180 degrees in phase relative to the noise presented to the opposite ear, yielding the percept of a faint tone embedded in noise. Critically, this percept is only present when the two signals are presented dichotically over headphones, so participants listening over loudspeakers are not likely to pass the screener [72]. For the purpose of the current study, participants who scored <5 correct answers out of 6 were considered to have failed the screener and were not included in the current analysis. Although the screener was not designed to detect HL or background noise, listeners with hearing loss and/or listeners participating in a noisy environment may also fail it [82].
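The pass/fail criterion for the screener reduces to a simple count of correct trials. A small sketch, with a hypothetical function name (interval responses are coded 1–3; a listener passes with at least 5 of 6 correct):

```python
def passes_headphone_screener(responses, answers, threshold=5):
    # A trial is correct when the chosen odd-one-out interval
    # matches the phase-shifted interval; pass if at least
    # `threshold` of the 6 trials are correct.
    correct = sum(r == a for r, a in zip(responses, answers))
    return correct >= threshold

print(passes_headphone_screener([1, 3, 2, 2, 1, 3],
                                [1, 3, 2, 2, 1, 1]))  # 5/6 correct → True
```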
For the main intelligibility rating task, listeners were pseudo-randomly assigned to 1 of 10 possible lists of auditory stimuli, described above. Each list contained words spoken by each of the 15 talkers. Following five practice trials, listeners were presented with one word at a time and asked to rate how understandable that word was using a sliding scale, with the endpoints “not understandable at all” and “very understandable” visible on either side of the scale for the duration of the task. Points on the scale were numerically coded as 1 (not understandable) to 100 (very understandable), though these numbers were not visible to participants. Participants were instructed to ignore differences in audio quality or sound quality across tokens and focus on how understandable or intelligible each word was. They were informed that all words were real words in the English language. Each auditory stimulus was preceded by a 500 ms fixation cross on the computer screen. Tokens were fully randomized within each list. Rating responses were recorded as numbers between 1 and 100 for analysis.

2.5. Speech Perception Tasks (Outcome Measures)

2.5.1. Phonological Processing: Word and Nonword Reading Efficiency

The Test of Word Reading Efficiency, Second Edition (TOWRE-2; [83]), was used as an indirect measure of phonological processing. This measure assesses single-word reading accuracy and fluency in the absence of sentence context. Participants completed two subtests: rapid real-word reading and nonsense word reading. Participants were asked to read as many real words as possible in 45 seconds from a 108-word list and as many nonsense words as possible from a 66-nonword list. Form A lists were used. Responses were transcribed from video recordings by two trained scorers who had previously attained 95% agreement with an established reliable scorer of the TOWRE-2. Each participant’s TOWRE-2 was scored by one primary and one secondary scorer, and scores from the primary scorer were used for analysis. No data were omitted. For analysis, one total score was computed per participant by summing the total number of real words and nonwords correctly read out loud. We chose to analyze this aggregated TOWRE-2 total score, rather than analyzing words and nonwords separately, based on findings that this total score is associated with three different measures of sentence recognition accuracy in CI users in both cross-sectional and longitudinal studies [51,84]. In contrast, the separate associations between word reading efficiency and speech recognition and between nonword reading efficiency and speech recognition are inconsistent across these studies.

2.5.2. Sentence Recognition Accuracy

Sentence recognition in quiet was assessed using the Perceptually Robust English Sentence Test Open-Set (PRESTO; [85]). PRESTO maximizes talker variability across sentence materials by including sentences produced by multiple talkers with different genders, regional dialects, and speaking rates. PRESTO is similar to sentence materials used in traditional clinical testing but was chosen over these traditional clinical measures in order to be relatively unfamiliar to participants. Original PRESTO sentence lists were balanced for talker gender, lexical frequency of sentence keywords, and familiarity of sentence keywords, with no repeated talkers. For this study, participants were presented with 36 sentences (PRESTO lists 7 and 8), with the first 2 sentences from list 7 used as practice. Each PRESTO list included 18 sentences, ranging from 5 to 10 words per sentence. The number of keywords per sentence ranged from 3 to 5, but each list included a total of 76 keywords. Auditory stimuli were presented at 68 dB SPL via a Roland MA-12C loudspeaker placed 1 m directly in front of the participant’s head. Participants heard one sentence at a time over the loudspeaker and were asked to repeat each sentence aloud, guessing if unsure. A single accuracy score was calculated per participant, defined as the percentage of sentence keywords correctly identified from the PRESTO sentences not used as practice, with 143 possible keywords in total. As with the TOWRE-2, each participant’s PRESTO keyword recognition accuracy was scored by one primary and one secondary scorer who had previously been trained until they reached 95% agreement with an established reliable scorer and with each other. Scores from the primary scorer were used for analysis, and no data were omitted.

2.6. Data Analysis

To address the objectives of this study, a series of Pearson’s correlations was conducted to examine the associations between several pairs of variables. The distribution of data for each variable was approximately normal, based on visual inspection of histograms and of Q–Q plots generated with the qqnorm() and qqline() functions in R (4.5.0). First, Pearson’s correlation analyses were conducted to establish the associations between subjective ratings of the intelligibility of CI users’ speech and speech recognition accuracy (PRESTO) and between intelligibility ratings and phonological processing efficiency (TOWRE-2). A second series of Pearson’s correlations was conducted to identify the relationships between vowel dispersion and speech recognition accuracy (PRESTO) and between vowel dispersion and phonological processing efficiency (TOWRE-2). Finally, to assess the validity of vowel dispersion as a metric of the subjective intelligibility of a talker’s speech, a linear mixed-effects regression model was used to predict subjective ratings of CI users’ speech intelligibility by NH raters from vowel dispersion. Mixed-effects modeling was chosen for the final analysis because repeated measures were available for both vowel dispersion and intelligibility ratings, whereas PRESTO and TOWRE-2 data were collected as part of a larger study, and therefore, repeated trial-by-trial measures were unavailable.
For correlational analyses, predictor measures (intelligibility ratings and vowel dispersion) were first aggregated by talker, vowel, and lexical difficulty to account for inherent differences in subjective ratings of speech clarity and vowel dispersion based on vowel category and lexical difficulty [10,66,86] and because the number of tokens used for analysis differed across vowel category and lexical difficulty by talker. Then, for each talker, average intelligibility ratings and average vowel dispersion were calculated over these aggregated data. For all correlational analyses, the false discovery rate (FDR) correction was used to correct for multiple comparisons, and corrected p-values are reported. Finally, we note that our sample of participants includes bilateral CI users (n = 5), bimodal CI users (n = 7), and unilateral CI users (n = 3). Due to the small sample size, we cannot draw conclusions about differences across listening configurations. However, we have plotted the data in Figure 1 and Figure 2 using different shapes for each listening configuration to facilitate visualization of potential group-level trends.
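As a rough illustration of the correlational pipeline with FDR correction (the original analyses were run in R; the variable names and synthetic per-talker values below are hypothetical, not the study data), the steps might look like this in Python:

```python
import numpy as np
from scipy import stats

def fdr_bh(pvals):
    """Benjamini-Hochberg adjusted p-values for a family of tests."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # enforce monotone adjusted p-values from the largest rank down
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.clip(adjusted, 0.0, 1.0)
    return out

# hypothetical per-talker means (one aggregated value per CI user)
rng = np.random.default_rng(1)
ratings = rng.uniform(40, 85, 15)   # mean intelligibility ratings
presto = rng.uniform(45, 90, 15)    # PRESTO % keywords correct
towre = rng.uniform(80, 160, 15)    # TOWRE-2 total scores

r1, p1 = stats.pearsonr(ratings, presto)
r2, p2 = stats.pearsonr(ratings, towre)
p_adj = fdr_bh([p1, p2])  # corrected p-values, as reported in the text
```

The same adjustment could equivalently be obtained with R's `p.adjust(p, method = "fdr")`.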

3. Results

A summary of mean intelligibility ratings, mean vowel dispersion, PRESTO scores (sentence recognition accuracy), and TOWRE-2 scores (phonological processing efficiency) is provided in Table 3 for each CI user. Across all CI users, the mean intelligibility, as evaluated by NH raters, was 65.9 (range 42.9 to 84.0). Mean vowel dispersion was 2.41 Bark (range 1.8 to 3.3). Mean PRESTO accuracy was 70.8% keywords correct (range 47.6 to 86.6). Finally, the mean TOWRE-2 total score was 128.1 total combined words and nonwords correctly read out loud (range 84 to 157).
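Vowel dispersion is reported above in Bark. As an illustrative sketch only (the study's exact computation is specified in the Methods and is not reproduced here), one common approach converts formants to Bark via Traunmüller's formula and takes each token's Euclidean distance from the talker's F1-F2 centroid:

```python
import numpy as np

def hz_to_bark(f_hz):
    """Traunmüller (1990) Hz-to-Bark conversion."""
    f = np.asarray(f_hz, dtype=float)
    return 26.81 / (1.0 + 1960.0 / f) - 0.53

def mean_vowel_dispersion(f1_hz, f2_hz):
    """Mean Euclidean distance (in Bark) of each vowel token's
    (F1, F2) point from the talker's vowel-space centroid."""
    pts = np.column_stack([hz_to_bark(f1_hz), hz_to_bark(f2_hz)])
    centroid = pts.mean(axis=0)
    return float(np.linalg.norm(pts - centroid, axis=1).mean())
```

Under this kind of metric, a talker whose vowel tokens cluster near the center of the vowel space yields a small dispersion value, consistent with the compressed vowel spaces described in the Introduction.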

3.1. The Association Between Intelligibility Ratings and Speech Perception Outcomes

Pearson’s correlation was carried out to establish the association between subjective ratings of the intelligibility of CI users’ speech and CI users’ PRESTO sentence recognition scores. Results are displayed in Table 4 and the left panel of Figure 1. Intelligibility ratings were moderately to strongly positively correlated with PRESTO accuracy (r = 0.67, p = 0.024), revealing that CI users who were rated as having more intelligible speech performed more accurately on sentence recognition than CI users with less intelligible speech.
Pearson’s correlation was carried out to establish the association between subjective ratings of the intelligibility of CI users’ speech and overall TOWRE-2 scores. Results are displayed in Table 4 and the right panel of Figure 1. The relationship was marginally significant: intelligibility ratings were moderately positively correlated with TOWRE-2 performance (r = 0.47, p = 0.09), potentially revealing that CI users who were rated as having more intelligible speech demonstrated somewhat more efficient word reading and phonological processing than CI users with less intelligible speech, though this relationship did not reach statistical significance at an alpha of 0.05.
Figure 1. Relationships between intelligibility ratings and PRESTO accuracy (left panel) and intelligibility ratings and TOWRE-2 number of words read (right panel). Solid lines represent the best-fit linear regression line and shading represents the 95% confidence interval of the fitted regression line. Coefficients and associated p-values from Pearson’s correlations are provided. Participants’ listening configurations are denoted by the shape of each point.

3.2. The Association Between Vowel Dispersion and Speech Perception Outcomes

Pearson’s correlation was carried out to establish the association between vowel dispersion and PRESTO scores. Results are displayed in Table 4 and the left panel of Figure 2. The relationship was marginally significant: vowel dispersion was moderately positively correlated with PRESTO accuracy (r = 0.50, p = 0.098), potentially revealing that CI users who produced more distinct vowels demonstrated somewhat more accurate sentence recognition than CI users who produced more ambiguous vowels, though this relationship did not reach statistical significance at an alpha of 0.05.
Pearson’s correlation was carried out to establish the association between vowel dispersion and overall TOWRE-2 scores. Results are displayed in Table 4 and the right panel of Figure 2. No clear relationship was observed (r = 0.13, p = 0.63), indicating that acoustic distinctness of vowel production does not appear to be associated with this measure of phonological processing.
Figure 2. Relationships between vowel dispersion and PRESTO accuracy (left panel) and vowel dispersion and TOWRE-2 number of words read (right panel). Solid lines represent the best-fit linear regression line and shading represents the 95% confidence interval of the fitted regression line. Coefficients and associated p-values from Pearson’s correlations are provided. Participants’ listening configurations are denoted by the shape of each point.

3.3. The Relationship Between the Two Measures of Speech Production

A linear mixed-effects regression model was used to identify the impact of vowel dispersion on subjective intelligibility ratings. A maximal, data-driven random-effects structure was initially specified [87] and random slopes were removed until the model converged. Statistical significance was assessed using the Satterthwaite approximation of degrees of freedom for F-statistics, implemented using the anova function within the lmerTest package in R [88,89]. The dependent measure in the current analysis was listeners’ intelligibility ratings and was predicted from vowel dispersion. Based on evidence that words’ lexical frequency and neighborhood density influence subjective ratings of speech clarity [86], both lexical frequency and neighborhood density were included as covariates. All variables and covariates were continuous. In the final model, random intercepts for talker (CI user), listener (NH rater), and item (word) were included. For all measures, an alpha of 0.05 was set. A summary of model output is included in Table 5.
The results revealed no main effect of vowel dispersion on intelligibility ratings [F(1,4745) = 0.47, n.s.], suggesting that subjective ratings of speech intelligibility did not depend on the degree of dispersion with which vowels were produced. The lexical frequency covariate reached significance [F(1,86.3) = 7.04, p = 0.009], revealing that words with a higher frequency of occurrence in spoken language were rated as more intelligible than words with a lower frequency of occurrence. The neighborhood density covariate did not reach significance [F(1,84.3) < 0.001, n.s.].
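The model described in this section was fit with lmer/lmerTest in R. For readers working in Python, crossed random intercepts for talker, listener, and item can be approximated using variance components in statsmodels; the data below are synthetic and the setup is a sketch under those assumptions, not the original analysis.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 300
talker = rng.integers(0, 5, n)
listener = rng.integers(0, 8, n)
item = rng.integers(0, 12, n)

df = pd.DataFrame({
    "dispersion": rng.normal(2.4, 0.4, n),  # vowel dispersion (Bark)
    "freq": rng.normal(0.0, 1.0, n),        # lexical frequency covariate
    "density": rng.normal(0.0, 1.0, n),     # neighborhood density covariate
    "talker": talker.astype(str),
    "listener": listener.astype(str),
    "item": item.astype(str),
})
# simulate ratings with random intercepts per talker, listener, and item
df["rating"] = (60 + 3 * rng.standard_normal(5)[talker]
                + 3 * rng.standard_normal(8)[listener]
                + 2 * rng.standard_normal(12)[item]
                + rng.normal(0, 5, n))

# a single dummy group lets crossed intercepts enter as variance components
df["grp"] = 1
vc = {"talker": "0 + C(talker)",
      "listener": "0 + C(listener)",
      "item": "0 + C(item)"}
model = smf.mixedlm("rating ~ dispersion + freq + density",
                    data=df, groups="grp", vc_formula=vc)
result = model.fit()
```

The equivalent lme4 formula would be `rating ~ dispersion + freq + density + (1|talker) + (1|listener) + (1|item)`; Satterthwaite-approximated F-tests, as used in the paper, come from lmerTest rather than statsmodels.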

4. Discussion

The primary goal of this exploratory study was to establish the relationships between the subjective intelligibility of a CI user’s speech and their performance on measures of speech recognition and phonological processing. As a secondary goal, we determined whether vowel dispersion, an acoustic metric of speech production, was associated with speech recognition performance and phonological processing. Finally, we determined whether vowel dispersion is a reliable predictor of subjective intelligibility ratings of a CI user’s speech. We hypothesized, first, that subjective ratings of speech intelligibility are associated with both speech recognition and phonological processing. Similarly, we expected that vowel dispersion is also associated with speech recognition and phonological processing. Finally, we predicted that words with more dispersed vowels would be rated as more subjectively intelligible than words with less dispersed vowels. Due in part to the small sample size of CI users in the current study (n = 15), this exploratory study is intended to establish potential relationships among cognitive–linguistic factors to motivate future investigation. Accordingly, in the following sections, we propose possible factors underlying the observed results and suggest avenues for future work to delineate the relationships between speech production, phonological processing, and speech perception in adult CI users.

4.1. Subjective Ratings of Speech Intelligibility Are Associated with Speech Recognition

In support of our primary hypothesis, the results from the correlational analyses demonstrated a significant, positive, moderate-to-strong association between subjective intelligibility ratings of CI users’ speech and sentence recognition accuracy. Specifically, we found that CI users who were rated as having more intelligible speech also demonstrated more accurate sentence recognition than CI users who had less intelligible speech. Though relatively few studies have examined the intelligibility of the speech of post-lingually deafened CI users, these findings are consistent with those of longitudinal studies examining changes in the intelligibility of CI users’ speech utterances pre- and post-implant [30,57,90]. For example, Ref. [57] found that improvements in speech recognition accuracy over the first two years after CI activation were accompanied by increases in the subjective clarity of speech utterances of the participants, and these improvements appeared to follow similar trajectories. Examining changes in intelligibility over time, particularly for post-lingually deafened adults, is limited by the fact that many participants have high baseline (pre-implant) intelligibility (often reflected by the choice to assess the intelligibility of these talkers by embedding their utterances in background noise to avoid ceiling effects; e.g., [30,90]). Therefore, it may be difficult to assess improvement in speech production post-implant for talkers with relatively high baseline intelligibility pre-implant. However, despite high intelligibility among some post-lingually deafened adults, several studies have noted substantial variability in intelligibility pre-implant [30,91]. This variability suggests that not all post-lingually deafened adults consistently produce highly intelligible speech. 
This variability was also present among the current sample of 15 CI users who had at least 2 years of CI experience: mean intelligibility ratings ranged from 42.9 to 84.0 per talker, averaged across NH raters and stimulus words (see Table 3).
Understanding the time course of improvements in perception vs. production in adult CI users can provide insight into mechanisms underlying these improvements (for example, whether improvements in one drive improvements in the other). Evidence from the second language (L2) acquisition literature in adults may shed light on these changes. Some studies that have examined L2 acquisition patterns over a short timescale early in the acquisition process have found that gains in speech perception precede speech production improvements [92]. However, at later stages of learning, changes in production may precede changes in perception [40]. Further, although evidence is mixed, there is some evidence that training L2 learners in production may yield improvements in perception under certain conditions [93,94]. These results may suggest that the time course of acquisition of perception and production skills differ, or potentially that improvements in one skill drive improvements in the other at different stages of learning. To the extent that L2 acquisition shares cognitive and linguistic resources with the process of re-mapping sounds delivered through a CI to existing phonological representations, this evidence collectively suggests a complex link between the development of speech recognition and speech production. Further, these findings demonstrate that this relationship may change over time, as with more experienced L2 learners or more experienced CI users.
Methodological choices in assessing intelligibility of CI users’ speech may have facilitated our observation of the association between speech intelligibility ratings and speech recognition scores. Studies that have observed small or no improvements in the intelligibility of CI users’ speech from pre- to post-implant primarily had listeners perform closed- or open-set speech recognition or shadowing tasks in background noise [30,43,90]. Accuracy in these tasks is often high, given the relatively good intelligibility of the speech of post-lingually deafened CI users [90]. However, tasks eliciting a more subjective rating of speech intelligibility (sometimes referred to as clarity or comprehensibility [86]; or voice quality [57]) typically yield a wide range of responses both within and across NH raters, even when the speech is objectively intelligible (i.e., when listeners would be able to accurately identify the word they were presented with). Our preliminary results, along with those from studies that have used similar techniques to elicit subjective ratings of intelligibility, provide support for the use of these ratings to assess perceptually meaningful aspects of CI users’ speech that may not be captured by traditional tests of recognition accuracy of these utterances.
Our results also revealed a nonsignificant but positive association between subjective ratings of speech intelligibility and TOWRE-2 scores. While not statistically significant, this trend toward a positive association may tentatively suggest a relationship between speech intelligibility and phonological processing, such that CI users with speech that was rated as more subjectively intelligible demonstrated more efficient phonological processing than CI users with less intelligible speech. The absence of a more robust relationship between intelligibility ratings and TOWRE-2 scores may reflect limitations in the sensitivity of the TOWRE-2 task as an indirect measure of phonological processing and representations. This task may not fully capture the specific phonological mechanisms relevant to speech intelligibility. Further, nonauditory cognitive load and literacy factors may also influence performance on the TOWRE-2, potentially impacting our results. We did not assess the relationship between phonological processing and speech recognition accuracy in the current study, but previous work has revealed that post-lingually deafened CI users with better word reading efficiency (operationalized as higher total word + nonword TOWRE-2 reading fluency scores, as used in the current study) also demonstrated higher PRESTO sentence recognition accuracy [51,84]. An exhaustive investigation of the extent to which phonological processing mediates the relationship between speech perception and production in adult CI users is beyond the scope of this paper. However, based on evidence from NH adults that phonological processing influences speech perception and speech production [9,10], these results motivate further study into the role of phonological processing in speech communication outcomes among adult CI users.
The observed relationships between speech perception, speech production, and phonological processing are consistent with several existing models of speech processing. To the extent that phonological representations are entailed in the speech sound map described by the DIVA model of speech production [33,35], DIVA posits that auditory and perceptual information derived from exposure to spoken language contributes to the formation and maintenance of the speech sound map. This relationship between mapping auditory input onto phonological representations is further corroborated by the Ease of Language Understanding (ELU) model [95], which posits that speech recognition is facilitated by robust mappings between auditory input and phonological representations. Taken together, these models support a link between speech recognition and phonological representation, and previous studies have provided experimental evidence in support of this link in adult CI users [20,21,22,24,25,26,47,51]. Importantly, this relationship exists despite the degraded auditory signal delivered by the CI, potentially suggesting either that phonological representations formed prior to hearing loss are able to be recovered via auditory input delivered by the CI, or that the modification of phonological representations is possible with the novel input provided by the CI.
The relationship between phonological processing and intelligibility of speech production is less well studied, particularly among post-lingually deafened adult CI users. However, DIVA proposes direct and indirect links between speech sound maps (similar to phonological representations) and speech production via feedback control of the auditory and somatosensory systems. Experimental evidence in support of this relationship comes from neural imaging and behavioral studies alike [31,96,97]. In addition, studies investigating how phonological representations influence speech production, both in NH adults and adult CI users, have shown that talkers can use their knowledge of phonological contrast (i.e., differences in speech sounds) to enhance or reduce acoustic–phonetic realizations of their speech [10,50,65,66,80,81]. For example, evidence from vowel production studies suggests that talkers hyperarticulate vowels in short words when there are many similar-sounding words in the lexicon of that talker, presumably in order to enhance phonetic contrast between the target utterance and other similar-sounding words [10,66,80]. This phonetic realization of underlying phonological contrast supports the notion that phonological processing and speech production are linked, and preliminary evidence of these speech production patterns has been observed in adult CI users [50]. Phonetic contrast enhancement is one acoustic marker of clear speech and can therefore enhance the intelligibility of speech utterances by disambiguating between a target utterance and similar sounds in a talker’s or listener’s lexicon [98,99]. Evidence from this literature somewhat contradicts the non-significant association we observed between intelligibility ratings and phonological processing in the current study, though our measure of phonological processing (TOWRE-2) differs from the measures of phonetic contrast used in this literature. 
Therefore, future investigations should assess the validity and sensitivity of these different measures in assessing phonological processing and representations.

4.2. Vowel Dispersion Is Not Associated with Speech Recognition, Phonological Processing, or Intelligibility Ratings

Given the evidence that phonetic contrast enhancement is reflective of underlying phonological representations in NH adults [10] and that acoustic measures of vowel distinctiveness are associated with speech intelligibility [64,99], we investigated the extent to which vowel dispersion, an acoustic measure of vowel distinctiveness, was associated with speech recognition accuracy and phonological processing in CI users. The results revealed a non-significant but positive association between vowel dispersion and sentence recognition accuracy, such that talkers who produced vowels further from the center of their vowel space tended to demonstrate higher sentence recognition accuracy than talkers who produced vowels closer to the center of their vowel space. The absence of a significant association between vowel dispersion and PRESTO accuracy is somewhat inconsistent with our finding that CI users with more intelligible speech demonstrate higher sentence recognition accuracy than CI users with less intelligible speech. This discrepancy could suggest that intelligibility ratings and vowel dispersion reflect different underlying processes in speech production. Alternatively, it is possible that our measure of vowel dispersion is not sensitive enough or our study is not adequately powered to detect a reliable relationship between these measures.
The absence of a relationship between vowel dispersion and phonological processing raises additional questions about the relationship between vowel dispersion and intelligibility ratings. Given previous work showing a relationship between vowel dispersion and speech intelligibility [64], and given that clear speech, which is typically more intelligible than conversational speech, is characterized in part by high degrees of vowel dispersion and large vowel spaces [99,100,101], we expected that enhanced vowel dispersion would be a feature of intelligible speech and would therefore be related to both sentence recognition and phonological processing.
To test our assumption that vowel dispersion and intelligibility are related, we conducted a repeated-measures analysis assessing whether more dispersed vowels were associated with higher subjective intelligibility ratings than less dispersed vowels. The results revealed that vowel dispersion did not predict intelligibility ratings, suggesting that the dispersion, or distinctiveness, of a given vowel in a CVC word did not influence listeners’ intelligibility ratings of that word. This finding somewhat conflicts with that of [64], who found that keywords in sentences produced by talkers with more dispersed vowels were more accurately transcribed than keywords produced by talkers with less dispersed vowels. However, the materials from which vowel dispersion was derived differed across our studies: [64] calculated dispersion based on vowels from mono- and multisyllabic words in sentences, whereas dispersion in our study was calculated based only on vowels in isolated monosyllabic words. Different patterns of vowel dispersion have been observed in vowels in words extracted from conversational speech compared to words read in isolation [102]. Therefore, the patterns of vowel dispersion observed in [64] and in the current study may have differed due to the more conversational nature of words extracted from sentences compared to words read in isolation. However, further work is needed to delineate the impact of these potential differences in vowel dispersion on intelligibility.
A second reason why we may have failed to find evidence of a relationship between vowel dispersion and intelligibility ratings is due to potential differences between subjective intelligibility ratings, as employed in the current study, and speech recognition accuracy, as used in [64]. While these subjective ratings can be informative, particularly when intelligibility is high [57,69,70,86,90], they may recruit different auditory or cognitive processes than those required for speech recognition (i.e., transcription or shadowing) tasks. Evidence from NH L2 learners suggests that comprehensibility ratings of speech from nonnative talkers (similar to intelligibility ratings in the current study) are somewhat dissociated from speech recognition accuracy, and the authors speculate that comprehensibility ratings could reflect something similar to the effort expended while listening to an utterance rather than how intelligible the utterance was [103,104].
Finally, listeners may have used other properties of speech beyond just vowel dispersion when rating speech intelligibility. Vowel dispersion characterizes the acoustic properties of vowels within words but does not reflect the surrounding consonantal context or suprasegmental factors, like articulation rate, which are known to influence perceived intelligibility [105]. Therefore, the absence of a strong relationship between vowel dispersion and intelligibility ratings may indicate that, at least for some listeners, factors beyond acoustic properties of the vowel may influence perceived intelligibility of an utterance. Broader speech cues may therefore be more relevant for identifying CI users with poorer speech recognition outcomes, though further research is needed to evaluate the impact of these cues on intelligibility ratings.
One factor that may explain the lack of relationship between vowel dispersion and phonological processing is the way that vowel dispersion patterns are interpreted in the literature with respect to phonological representations. Specifically, vowel dispersion is often used as a metric by which to quantify phonetic contrast, assumed to reflect the structure of underlying phonological representations [10,50,65,66,80,81,102]. However, to measure contrast, at least two factors must typically be contrasted with one another. For example, in the studies listed above, vowel dispersion was compared across vowels in words that were lexically easy (i.e., occur often in spoken language, have few similar-sounding words, and tend to be intelligible) vs. lexically hard (i.e., occur infrequently, have many similar-sounding words, and are less likely to be intelligible). Therefore, the overall measure of vowel dispersion used in the current study may be less representative of phonological processing than a comparison of vowel dispersion between vowels across different lexical contexts, which may further explain the lack of a relationship between vowel dispersion and TOWRE-2 scores in the current study. However, other studies have used vowel dispersion or related measures to assess how distinct specific vowel utterances or utterances of vowel categories are from one another in adult CI users [31,32,106]. Accordingly, future analyses should pair acoustic measures of vowel distinctiveness with measures of the acoustic variability in vowel production (as in [106]), which could also account for inherent variability in formant measurements (especially in CI speech), or they should compare instances of vowel dispersion across particular words or vowel categories in order to more comprehensively characterize the acoustic markers of phonological contrast in adult CI users.

4.3. Limitations

This study is exploratory and is intended to establish the feasibility of using subjective ratings of speech intelligibility as a potential predictor of speech recognition performance among post-lingually deafened adult CI users. Due to the small sample size (n = 15), which limits statistical power and generalizability, the cross-sectional nature of the study, and analyses that are primarily correlational in nature, future work is necessary to determine the causality of the associations observed and to identify potential mechanisms that mediate or moderate these relationships. For instance, examining separate relationships between measures of speech production and word and nonword reading efficiency (rather than using an aggregated TOWRE-2 score) may provide additional insight into the relationship between speech production and phonological processing. In addition, multiple demographic, cognitive, audiological, and linguistic factors are known to contribute to the measures used in the current study, which we did not examine in detail. For example, participants’ listening configurations—specifically, whether they receive acoustic auditory input through a hearing aid in addition to a CI—may influence the quality of auditory feedback they receive, but no clear trends based on listening configuration were apparent in the current study. In the following section, we briefly review some factors that may contribute to individual differences in measures of speech production and perception. The contribution of these factors to outcome measures should be more thoroughly evaluated before the clinical utility of intelligibility measures can be determined.

4.4. Future Directions: Other Factors That May Influence Speech Production and Intelligibility Ratings

Due to our small sample size, we were unable to conduct an exhaustive analysis of individual differences among CI users and NH raters and how they contributed to intelligibility ratings. However, some factors that may impact speech intelligibility are outlined below as directions for future analyses. Briefly, these factors encompass those related to properties of the stimuli, audiological and demographic characteristics of the CI users, and characteristics of the NH raters.
Properties of the stimuli used in this and other studies have been shown to influence subjective ratings of clarity and intelligibility. In the current study, we included lexical frequency as a covariate in our analysis predicting intelligibility ratings from vowel dispersion. The lexical frequency covariate reached significance, expanding upon previous findings that words occurring more frequently in spoken language are rated as clearer or more intelligible than words occurring less frequently [86]. In addition, while not directly assessed in the current study, previous studies have determined that speaking rate influences comprehensibility ratings (similar to the intelligibility ratings in this study), such that talkers with very fast or very slow speaking rates were rated as less comprehensible than other talkers [105]. Therefore, lexical factors and speaking rate should be accounted for in future studies assessing intelligibility ratings.
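As an illustration, a covariate analysis of the kind described above corresponds roughly to a crossed random-effects model of the following form; the specific predictors, transformation, and random-effects structure shown here are assumptions for exposition, not the authors' exact specification:

```latex
\text{rating}_{ijk} = \beta_0
  + \beta_1\,\text{dispersion}_{ij}
  + \beta_2\,\log(\text{frequency}_{j})
  + u_i + v_k + \varepsilon_{ijk}
```

where $i$ indexes CI talkers, $j$ indexes words, and $k$ indexes NH raters, with $u_i$ and $v_k$ as random intercepts for talker and rater, respectively. Under such a model, the significance of $\beta_2$ reflects the lexical frequency effect reported above, holding vowel dispersion constant.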
Audiological, demographic, and cognitive characteristics of the talkers who produce stimuli may also be associated with individual differences in outcomes. For example, duration of deafness prior to cochlear implantation has been negatively correlated with intelligibility in some studies [91], and longer durations of deafness have been associated with more degraded phonological representations [22,23,49], though challenges in obtaining reliable reports of the onset of hearing loss can make duration-of-deafness estimates unreliable [24,107]. Demographic characteristics of talkers, such as their gender and age, can also influence the intelligibility of their speech [64]. For example, women tend to produce more hyperarticulated speech than men, which can be reflected acoustically by greater vowel dispersion or larger vowel space sizes in women than in men [108,109]. Further, older talkers tend to be less intelligible than younger talkers [110], suggesting an age-related component to talker intelligibility (and potentially to listener ratings). Finally, examining individual differences in inner speech among talkers may provide insight into the neural and cognitive mechanisms underlying the production–perception link [111]. Inner speech is inherently related to error monitoring, which entails the integration of auditory feedback into motor planning [112]. Therefore, a more thorough investigation of the relationships among inner speech, phonological processing, and motor planning in CI users may be informative.
Subjective intelligibility ratings may be a promising tool for future clinical and research use, but open questions remain about what constructs these ratings represent. Specifically, individual differences among NH raters may have influenced how they rated speech from the adult CI talkers in the current study. Familiarity with deaf speech has been shown to influence intelligibility, such that listeners with more experience listening to talkers with hearing loss find their speech more intelligible than listeners with little experience [113]. Cognitive–linguistic factors, such as working memory capacity and receptive vocabulary, can also influence intelligibility, especially in adverse listening conditions [25,95,114,115,116]. Further, listeners’ biases, attitudes toward aspects of a talker’s perceived identity, and expectations about how a given talker’s speech will sound are associated with intelligibility and other measures of speech perception [117,118,119,120]. These associations reveal a potential limitation of intelligibility ratings: they are necessarily subjective and likely depend on a range of listener factors. Therefore, future work using these ratings should pair them with supplemental questionnaires or tasks that quantify some of these properties among NH raters and clinicians in order to determine their clinical utility.

4.5. Clinical Implications

Despite the exploratory, cross-sectional nature of the current study, the results may have clinical relevance for adult CI users. Our finding that subjective intelligibility ratings of CI users’ speech are related to their sentence recognition accuracy expands upon longitudinal work demonstrating the utility of subjective impressions of speech intelligibility both in predicting long-term outcomes [69] and in tracking improvements over the first two years post-activation [57,90]. Although these subjective impressions can be influenced by, for example, listener biases and social attitudes toward a talker or social group [120,121], they may still provide clinicians with useful information for predicting (pre-CI) or explaining (post-CI) CI users’ speech recognition outcomes.
Current clinical interventions and auditory rehabilitation strategies for adult CI users primarily target speech recognition, but our findings provide preliminary support for the possibility that interventions targeting speech production could both enhance speech recognition and influence broader spoken communication outcomes. Regardless of the reasons underlying differences in subjective intelligibility ratings across talkers or listeners, speech that a listener finds difficult to understand can evoke negative judgments from peers; for example, speech produced by adolescent CI users is rated more negatively on personality traits related to competence and friendship skills than that of NH peers [122,123]. Though the burden of compensating for these judgments should not necessarily be placed on CI users themselves, listener biases and judgments may contribute to the social isolation experienced by adults with hearing loss, potentially impacting their quality of life [124,125,126]. Therefore, clinicians may be able to track a combination of acoustic and perceptual metrics related to speech production over time to supplement current clinical measures of speech recognition. The feasibility of incorporating measures of speech production intelligibility into clinical assessments should be more thoroughly evaluated, as time constraints and the need for raters may be prohibitive. However, automated acoustic measures may be able to serve as proxies for perceptual judgments, and their development should be further explored.

5. Conclusions

The current study examined the relations among speech intelligibility, vowel dispersion, and speech perception outcomes in adult CI users. These exploratory results reveal a potential association between subjective ratings of the intelligibility of CI users’ speech and a measure of sentence recognition accuracy. The marginal but non-significant relationship between intelligibility ratings and phonological processing suggests that phonological processing may be a shared mechanism that contributes to both production and perception performance among adult CI users, but future work with larger sample sizes is necessary to draw conclusions about the nature of this relationship. Similarly, CI users with more dispersed vowel spaces also tended to demonstrate higher speech recognition accuracy than those with less dispersed vowel spaces, but this relationship was not significant. Finally, vowel dispersion was unrelated to phonological processing and did not predict subjective intelligibility ratings. Taken together, these preliminary findings lay the groundwork for future investigation into the use of speech intelligibility as a predictor of speech recognition outcomes and the role of phonological processing in the speech production–perception link among post-lingually deafened adult CI users.

Author Contributions

Conceptualization, V.A.S., A.C.M., and T.N.T.; Methodology, V.A.S. and T.N.T.; Formal Analysis, V.A.S.; Investigation, V.A.S., A.C.M., and T.N.T.; Writing—Original Draft Preparation, V.A.S. and D.J.W.; Writing—Review and Editing, V.A.S., D.J.W., A.C.M., and T.N.T.; Visualization, V.A.S.; Supervision, T.N.T.; Project Administration, A.C.M. and T.N.T.; Funding Acquisition, A.C.M. and T.N.T. All authors have read and agreed to the published version of the manuscript.

Funding

Funding for this project was provided by the National Institutes of Health, National Institute on Deafness and Other Communication Disorders (NIDCD) Career Development Award 5K23DC015539-02 and American Otological Society Clinician-Scientist Award to A.C.M. and by the National Institutes of Health NIDCD Award R21DC01938 and an American Hearing Research Foundation research grant to T.N.T.

Institutional Review Board Statement

This study was approved by the Institutional Review Board at The Ohio State University (Protocol 2020H0433, initial approval 6/4/2021).

Informed Consent Statement

Informed consent was obtained from all participants involved in this study.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request. The data are not publicly available to protect the privacy of the participants.

Acknowledgments

The authors gratefully acknowledge Emily Clausing, Jessica Lewis, Ally Schmitzer, and Kara Schneider for their assistance with this project.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Blamey, P.; Artieres, F.; Başkent, D.; Bergeron, F.; Beynon, A.; Burke, E.; Dillier, N.; Dowell, R.; Fraysse, B.; Gallégo, S.; et al. Factors Affecting Auditory Performance of Postlinguistically Deaf Adults Using Cochlear Implants: An Update with 2251 Patients. Audiol. Neurotol. 2013, 18, 36–47. [Google Scholar] [CrossRef]
  2. Boisvert, I.; Reis, M.; Au, A.; Cowan, R.; Dowell, R.C. Cochlear Implantation Outcomes in Adults: A Scoping Review. PLoS ONE 2020, 15, e0232421. [Google Scholar] [CrossRef] [PubMed]
  3. Lazard, D.S.; Vincent, C.; Venail, F.; van de Heyning, P.; Truy, E.; Sterkers, O.; Skarzynski, P.H.; Skarzynski, H.; Schauwers, K.; O’Leary, S.; et al. Pre-, Per- and Postoperative Factors Affecting Performance of Postlinguistically Deaf Adults Using Cochlear Implants: A New Conceptual Model over Time. PLoS ONE 2012, 7, e48739. [Google Scholar] [CrossRef]
  4. Tamati, T.N.; Pisoni, D.B.; Moberly, A.C. Speech and Language Outcomes in Adults and Children with Cochlear Implants. Annu. Rev. Linguist. 2022, 8, 299–319. [Google Scholar] [CrossRef]
  5. Ma, C.; Fried, J.; Nguyen, S.A.; Schvartz-Leyzac, K.C.; Camposeo, E.L.; Meyer, T.A.; Dubno, J.R.; McRackan, T.R. Longitudinal Speech Recognition Changes After Cochlear Implant: Systematic Review and Meta-Analysis. Laryngoscope 2023, 133, 1014–1024. [Google Scholar] [CrossRef]
  6. Baskent, D.; Gaudrain, E.; Tamati, T.N.; Wagner, A. Perception and Psychoacoustics of Speech in Cochlear Implant Users. In Scientific Foundations of Audiology: Perspectives from Physics, Biology, Modeling, and Medicine; Plural Publishing, Inc.: San Diego, CA, USA, 2016; pp. 185–320. [Google Scholar]
  7. Tamati, T.N.; Ray, C.; Vasil, K.J.; Pisoni, D.B.; Moberly, A.C. High- and Low-Performing Adult Cochlear Implant Users on High-Variability Sentence Recognition: Differences in Auditory Spectral Resolution and Neurocognitive Functioning. J. Am. Acad. Audiol. 2020, 31, 324–335. [Google Scholar] [CrossRef]
  8. Chomsky, N.; Halle, M. The Sound Pattern of English; Harper & Row Publishers: New York, NY, USA, 1968; pp. 295–298. [Google Scholar]
  9. Luce, P.A.; Pisoni, D.B. Recognizing Spoken Words: The Neighborhood Activation Model. Ear Hear. 1998, 19, 1–36. [Google Scholar] [CrossRef]
  10. Wright, R. Factors of Lexical Competition in Vowel Articulation; Cambridge University Press: Cambridge, UK, 2004. [Google Scholar]
  11. Schwartz, J.-L.; Basirat, A.; Ménard, L.; Sato, M. The Perception-for-Action-Control Theory (PACT): A Perceptuo-Motor Theory of Speech Perception. J. Neurolinguistics 2012, 25, 336–354. [Google Scholar] [CrossRef]
  12. Liberman, A.M.; Cooper, F.S.; Shankweiler, D.P.; Studdert-Kennedy, M. Perception and the Speech Code. Psychol. Rev. 1967, 74, 431–461. [Google Scholar] [CrossRef] [PubMed]
  13. Chesters, J.; Baghai-Ravary, L.; Möttönen, R. The Effects of Delayed Auditory and Visual Feedback on Speech Production. J. Acoust. Soc. Am. 2015, 137, 873–883. [Google Scholar] [CrossRef]
  14. Houde, J.F.; Jordan, M.I. Sensorimotor Adaptation in Speech Production. Science 1998, 279, 1213–1216. [Google Scholar] [CrossRef]
  15. Jones, J.A.; Munhall, K.G. Perceptual Calibration of F0 Production: Evidence from Feedback Perturbation. J. Acoust. Soc. Am. 2000, 108, 1246–1251. [Google Scholar] [CrossRef] [PubMed]
  16. Liang, D.; Xiao, Y.; Feng, Y.; Yan, Y. The Role of Auditory Feedback in Speech Production: Implications for Speech Perception in the Hearing Impaired. In Proceedings of the 14th International Symposium on Integrated Circuits, ISIC 2014, Singapore, 10–12 December 2014; pp. 192–195. [Google Scholar] [CrossRef]
  17. Mitsuya, T.; MacDonald, E.N.; Munhall, K.G.; Purcell, D.W. Formant Compensation for Auditory Feedback with English Vowels. J. Acoust. Soc. Am. 2015, 138, 413–424. [Google Scholar] [CrossRef] [PubMed]
  18. Shiller, D.M.; Sato, M.; Gracco, V.L.; Baum, S.R. Perceptual Recalibration of Speech Sounds Following Speech Motor Learning. J. Acoust. Soc. Am. 2009, 125, 1103–1113. [Google Scholar] [CrossRef]
  19. Beyea, J.A.; McMullen, K.P.; Harris, M.S.; Houston, D.M.; Martin, J.M.; Bolster, V.A.; Adunka, O.F.; Moberly, A.C. Cochlear Implants in Adults: Effects of Age and Duration of Deafness on Speech Recognition. Otol. Neurotol. 2016, 37, 1238–1245. [Google Scholar] [CrossRef]
  20. Andersson, U. Deterioration of the Phonological Processing Skills in Adults with an Acquired Severe Hearing Loss. Eur. J. Cogn. Psychol. 2002, 14, 335–352. [Google Scholar] [CrossRef]
  21. Andersson, U.; Lyxell, B. Phonological Deterioration in Adults with an Acquired Severe Hearing Impairment. Scand. Audiol. Suppl. 1998, 27, 93–100. [Google Scholar] [CrossRef]
  22. Lyxell, B.; Andersson, J.; Andersson, U.; Arlinger, S.; Bredberg, G.; Harder, H. Phonological Representation and Speech Understanding with Cochlear Implants in Deafened Adults. Scand. J. Psychol. 1998, 39, 175–179. [Google Scholar] [CrossRef]
  23. Andersson, U.; Lyxell, B. Phonological Deterioration in Adults with an Acquired Severe Hearing Impairment: A Deterioration in Long-Term Memory or Working Memory? Scand. Audiol. 1999, 28, 241–247. [Google Scholar] [CrossRef]
  24. Moberly, A.C.; Lowenstein, J.H.; Nittrouer, S. Word Recognition Variability with Cochlear Implants: The Degradation of Phonemic Sensitivity. Otol. Neurotol. 2016, 37, 470–477. [Google Scholar] [CrossRef]
  25. Moberly, A.C.; Harris, M.S.; Boyce, L.; Nittrouer, S. Speech Recognition in Adults With Cochlear Implants: The Effects of Working Memory, Phonological Sensitivity, and Aging. J. Speech Lang. Hear. Res. 2017, 60, 1046–1061. [Google Scholar] [CrossRef] [PubMed]
  26. Lane, H.; Denny, M.; Guenther, F.H.; Hanson, H.M.; Marrone, N.; Matthies, M.L.; Perkell, J.S.; Stockmann, E.; Tiede, M.; Vick, J.; et al. On the Structure of Phoneme Categories in Listeners with Cochlear Implants. J. Speech Lang. Hear. Res. 2007, 50, 2–14. [Google Scholar] [CrossRef] [PubMed]
  27. Lane, H.; Denny, M.; Guenther, F.; Matthies, M.; Ménard, L.; Perkell, J.; Stockmann, E.; Tiede, M.; Vic, J.; Zandipour, M. Effects of Bite Blocks and Hearing Status on Vowel Production. J. Acoust. Soc. Am. 2005, 118, 1636–1646. [Google Scholar] [CrossRef]
  28. Lane, H.; Matthies, M.L.; Guenther, F.H.; Denny, M.; Perkell, J.S.; Stockmann, E.; Tiede, M.; Vick, J.; Zandipour, M. Effects of Short- and Long-Term Changes in Auditory Feedback on Vowel and Sibilant Contrasts. J. Speech Lang. Hear. Res. 2007, 50, 913–927. [Google Scholar] [CrossRef]
  29. Langereis, M.C.; Bosman, A.J.; van Olphen, A.F.; Smoorenburg, G.F. Changes in Vowel Quality in Post-Lingually Deafened Cochlear Implant Users. Int. J. Audiol. 1997, 36, 279–297. [Google Scholar] [CrossRef]
  30. Langereis, M.C.; Bosman, A.J.; van Olphen, A.F.; Smoorenburg, G.F. Intelligibility of Vowels Produced by Post-Lingually Deafened Cochlear Implant Users. Audiology 1999, 38, 206–224. [Google Scholar] [CrossRef]
  31. Ménard, L.; Polak, M.; Denny, M.; Burton, E.; Lane, H.; Matthies, M.L.; Marrone, N.; Perkell, J.S.; Tiede, M.; Vick, J. Interactions of Speaking Condition and Auditory Feedback on Vowel Production in Postlingually Deaf Adults with Cochlear Implants. J. Acoust. Soc. Am. 2007, 121, 3790–3801. [Google Scholar] [CrossRef]
  32. Perkell, J.S.; Lane, H.; Denny, M.; Matthies, M.L.; Tiede, M.; Zandipour, M.; Vick, J.; Burton, E. Time Course of Speech Changes in Response to Unanticipated Short-Term Changes in Hearing State. J. Acoust. Soc. Am. 2007, 121, 2296–2311. [Google Scholar] [CrossRef]
  33. Guenther, F.H.; Hampson, M.; Johnson, D. A Theoretical Investigation of Reference Frames for the Planning of Speech Movements. Psychol. Rev. 1998, 105, 611–633. [Google Scholar] [CrossRef] [PubMed]
  34. Guenther, F.H.; Ghosh, S.S.; Tourville, J.A. Neural Modeling and Imaging of the Cortical Interactions Underlying Syllable Production. Brain Lang. 2006, 96, 280–301. [Google Scholar] [CrossRef]
  35. Guenther, F.H.; Hickok, G. Role of the Auditory System in Speech Production. In Handbook of Clinical Neurology; Elsevier: Amsterdam, The Netherlands, 2015; Volume 129, pp. 161–175. [Google Scholar]
  36. Baese-Berk, M.M. Interactions between Speech Perception and Production during Learning of Novel Phonemic Categories. Atten. Percept. Psychophys. 2019, 81, 981–1005. [Google Scholar] [CrossRef] [PubMed]
  37. Baese-Berk, M.M.; Kapnoula, E.C.; Samuel, A.G. The Relationship of Speech Perception and Speech Production: It’s Complicated. Psychon. Bull. Rev. 2025, 32, 226–242. [Google Scholar] [CrossRef]
  38. Chao, S.-C.; Ochoa, D.; Daliri, A. Production Variability and Categorical Perception of Vowels Are Strongly Linked. Front. Hum. Neurosci. 2019, 13, 96. [Google Scholar] [CrossRef]
  39. Flege, J.E. Production and Perception of a Novel, Second-language Phonetic Contrast. J. Acoust. Soc. Am. 1993, 93, 1589–1608. [Google Scholar] [CrossRef]
  40. Flege, J.E.; MacKay, I.R.A.; Meador, D. Native Italian Speakers’ Perception and Production of English Vowels. J. Acoust. Soc. Am. 1999, 106, 2973–2987. [Google Scholar] [CrossRef]
  41. Newman, R.S. Using Links between Speech Perception and Speech Production to Evaluate Different Acoustic Metrics: A Preliminary Report. J. Acoust. Soc. Am. 2003, 113, 2850–2860. [Google Scholar] [CrossRef] [PubMed]
  42. Perkell, J.S.; Guenther, F.H.; Lane, H.; Matthies, M.L.; Stockmann, E.; Tiede, M.; Zandipour, M. The Distinctness of Speakers’ Productions of Vowel Contrasts Is Related to Their Discrimination of the Contrasts. J. Acoust. Soc. Am. 2004, 116, 2338–2344. [Google Scholar] [CrossRef]
  43. Vick, J.C.; Perkell, J.S.; Matthies, M.L.; Gould, J. Covariation of Cochlear Implant Users’ Perception and Production of Vowel Contrasts and Their Identification by Listeners with Normal Hearing. J. Speech Lang. Hear. Res. 2001, 44, 1257–1268. [Google Scholar] [CrossRef] [PubMed]
  44. Classon, E.; Rudner, M.; Rönnberg, J. Working Memory Compensates for Hearing Related Phonological Processing Deficit. J. Commun. Disord. 2013, 46, 17–29. [Google Scholar] [CrossRef]
  45. Classon, E.; Löfkvist, U.; Rudner, M.; Rönnberg, J. Verbal Fluency in Adults with Postlingually Acquired Hearing Impairment. Speech Lang. Hear. 2014, 17, 88–100. [Google Scholar] [CrossRef]
  46. Rudner, M.; Foo, C.; Sundewall-Thorén, E.; Lunner, T.; Rönnberg, J. Phonological Mismatch and Explicit Cognitive Processing in a Sample of 102 Hearing-Aid Users. Int. J. Audiol. 2008, 47, S91–S98. [Google Scholar] [CrossRef] [PubMed]
  47. Rudner, M.; Danielsson, H.; Lyxell, B.; Lunner, T.; Rönnberg, J. Visual Rhyme Judgment in Adults with Mild-to-Severe Hearing Loss. Front. Psychol. 2019, 10, 1149. [Google Scholar] [CrossRef]
  48. Harnsberger, J.D.; Svirsky, M.A.; Kaiser, A.R.; Pisoni, D.B.; Wright, R.; Meyer, T.A. Perceptual “Vowel Spaces” of Cochlear Implant Users: Implications for the Study of Auditory Adaptation to Spectral Shift. J. Acoust. Soc. Am. 2001, 109, 2135–2145. [Google Scholar] [CrossRef]
  49. Lazard, D.S.; Lee, H.J.; Gaebler, M.; Kell, C.A.; Truy, E.; Giraud, A.L. Phonological Processing in Post-Lingual Deafness and Cochlear Implant Outcome. NeuroImage 2010, 49, 3443–3451. [Google Scholar] [CrossRef]
  50. Sevich, V.A.; Moberly, A.C.; Tamati, T.N. Lexical Difficulty Affects Vowel Articulation in Adult Cochlear Implant Users. Proc. Meet. Acoust. 2022, 50, 060003. [Google Scholar] [CrossRef]
  51. Tamati, T.N.; Vasil, K.J.; Kronenberger, W.G.; Pisoni, D.B.; Moberly, A.C.; Ray, C. Word and Nonword Reading Efficiency in Postlingually Deafened Adult Cochlear Implant Users. Otol. Neurotol. 2021, 42, e272–e278. [Google Scholar] [CrossRef]
  52. Baddeley, A.; Logie, R.; Nimmo-Smith, I.; Brereton, N. Components of Fluent Reading. J. Mem. Lang. 1985, 24, 119–131. [Google Scholar] [CrossRef]
  53. Jackson, M.D.; McClelland, J.L. Processing Determinants of Reading Speed. J. Exp. Psychol. Gen. 1979, 108, 151–181. [Google Scholar] [CrossRef]
  54. Wolf, M.; Katzir-Cohen, T. Reading Fluency and Its Intervention. In The Role of Fluency in Reading Competence, Assessment, and Instruction; Routledge: London, UK, 2001; ISBN 978-1-4106-0824-6. [Google Scholar]
  55. Moberly, A.C.; Reed, J. Making Sense of Sentences: Top-down Processing of Speech by Adult Cochlear Implant Users. J. Speech Lang. Hear. Res. 2019, 62, 2895–2905. [Google Scholar] [CrossRef] [PubMed]
  56. Schenk, B.S.; Baumgartner, W.-D.; Hamzavi, J.S. Changes in Vowel Quality after Cochlear Implantation. ORL 2003, 65, 184–188. [Google Scholar] [CrossRef] [PubMed]
  57. Kishon-Rabin, L.; Taitelbaum, R.; Tobin, Y.; Hildesheimer, M. The Effect of Partially Restored Hearing on Speech Production of Postlingually Deafened Adults with Multichannel Cochlear Implants. J. Acoust. Soc. Am. 1999, 106, 2843–2857. [Google Scholar] [CrossRef]
  58. Waldstein, R.S. Effects of Postlingual Deafness on Speech Production: Implications for the Role of Auditory Feedback. J. Acoust. Soc. Am. 1990, 88, 2099–2114. [Google Scholar] [CrossRef] [PubMed]
  59. Perkell, J.; Lane, H.; Svirsky, M.; Webster, J. Speech of Cochlear Implant Patients: A Longitudinal Study of Vowel Production. J. Acoust. Soc. Am. 1992, 91, 2961–2978. [Google Scholar] [CrossRef] [PubMed]
  60. Dawson, P.W.; Blarney, P.J.; Dettman, S.J.; Rowland, L.C.; Barker, E.J.; Tobey, E.A.; Busby, P.A.; Cowan, R.C.; Clark, G.M. A Clinical Report on Speech Production of Cochlear Implant Users. Ear Hear. 1995, 16, 551–561. [Google Scholar] [CrossRef] [PubMed]
  61. Monsen, R.B. Normal and Reduced Phonological Space: The Production of English Vowels by Deaf Adolescents. J. Phon. 1976, 4, 189–198. [Google Scholar] [CrossRef]
  62. Monsen, R.B.; Shaughnessy, D.H. Improvement in Vowel Articulation of Deaf Children. J. Commun. Disord. 1978, 11, 417–424. [Google Scholar] [CrossRef]
  63. Schenk, B.S.; Baumgartner, W.D.; Hamzavi, J.S. Effect of the Loss of Auditory Feedback on Segmental Parameters of Vowels of Postlingually Deafened Speakers. Auris Nasus Larynx 2003, 30, 333–339. [Google Scholar] [CrossRef]
  64. Bradlow, A.R.; Torretta, G.M.; Pisoni, D.B. Intelligibility of Normal Speech I: Global and Fine-Grained Acoustic-Phonetic Talker Characteristics. Speech Commun. 1996, 20, 255–272. [Google Scholar] [CrossRef]
  65. Clopper, C.G.; Mitsch, J.F.; Tamati, T.N. Effects of Phonetic Reduction and Regional Dialect on Vowel Production. J. Phon. 2017, 60, 38–59. [Google Scholar] [CrossRef]
  66. Munson, B.; Solomon, N.P. The Effect of Phonological Neighborhood Density on Vowel Articulation. J. Speech Lang. Hear. Res. 2004, 47, 1048–1058. [Google Scholar] [CrossRef]
  67. Gautam, A.; Naples, J.G.; Eliades, S.J. Control of Speech and Voice in Cochlear Implant Patients. Laryngoscope 2019, 129, 2158–2163. [Google Scholar] [CrossRef]
  68. Ashjaei, S.; Behroozmand, R.; Fozdar, S.; Farrar, R.; Arjmandi, M. Vocal Control and Speech Production in Cochlear Implant Listeners: A Review within Auditory-Motor Processing Framework. Hear. Res. 2024, 453, 109132. [Google Scholar] [CrossRef]
  69. van Dijkhuizen, J.N.; Boermans, P.P.B.M.; Briaire, J.J.; Frijns, J.H.M. Intelligibility of the Patient’s Speech Predicts the Likelihood of Cochlear Implant Success in Prelingually Deaf Adults. Ear Hear. 2016, 37, e302–e310. [Google Scholar] [CrossRef]
  70. van Dijkhuizen, J.N.; Beers, M.; Boermans, P.-P.B.M.; Briaire, J.J.; Frijns, J.H.M. Speech Intelligibility as a Predictor of Cochlear Implant Outcome in Prelingually Deafened Adults. Ear Hear. 2011, 32, 445. [Google Scholar] [CrossRef]
  71. Anwyl-Irvine, A.; Massonnie, J.; Flitton, A.; Kirkham, N.; Evershed, J.K. Gorilla in Our Midst: An Online Behavioral Experiment Builder. Behav. Res. Methods 2020, 52, 388–407. [Google Scholar] [CrossRef]
  72. Milne, A.E.; Bianco, R.; Poole, K.C.; Zhao, S.; Oxenham, A.J.; Billig, A.J.; Chait, M. An Online Headphone Screening Test Based on Dichotic Pitch. Behav. Res. Methods 2021, 53, 1551–1562. [Google Scholar] [CrossRef]
  73. Vitevitch, M.S.; Luce, P.A. A Web-Based Interface to Calculate Phonotactic Probability for Words and Nonwords in English. Behav. Res. Methods 2004, 36, 481–487. [Google Scholar] [CrossRef] [PubMed]
  74. Adelman, J.S.; Brown, G.D.A.; Quesada, J.F. Contextual Diversity, Not Word Frequency, Determines Word-Naming and Lexical Decision Times. Psychol. Sci. 2006, 17, 814–823. [Google Scholar] [CrossRef]
  75. Brysbaert, M.; New, B. Moving Beyond Kučera And Francis: A Critical Evaluation of Current Word Frequency Norms and the Introduction of a New and Improved Word Frequency Measure for American English. Behav. Res. Methods 2009, 41, 977–990. [Google Scholar] [CrossRef] [PubMed]
  76. Schiel, F. Automatic Phonetic Transcription of Nonprompted Speech. In Proceedings of the XIVth International Congress of Phonetic Sciences, San Francisco, CA, USA, 1–7 August 1999; pp. 607–610. [Google Scholar]
  77. Praat: Doing Phonetics by Computer. Available online: https://www.fon.hum.uva.nl/praat/ (accessed on 22 August 2025).
  78. Jadoul, Y.; Thompson, B.; de Boer, B. Introducing Parselmouth: A Python Interface to Praat. J. Phon. 2018, 71, 1–15. [Google Scholar] [CrossRef]
  79. Traunmüller, H. Analytical Expressions for the Tonotopic Sensory Scale. J. Acoust. Soc. Am. 1990, 88, 97–100. [Google Scholar] [CrossRef]
  80. Scarborough, R.; Zellou, G. Clarity in Communication: “Clear” Speech Authenticity and Lexical Neighborhood Density Effects in Speech Production and Perception. J. Acoust. Soc. Am. 2013, 134, 3793–3807. [Google Scholar] [CrossRef]
  81. Zellou, G.; Scarborough, R. Lexically Conditioned Phonetic Variation in Motherese: Age-of-Acquisition and Other Word-Specific Factors in Infant- and Adult-Directed Speech. Lab. Phonol. 2015, 6, 305–336. [Google Scholar] [CrossRef]
  82. Santurette, S.; Dau, T. Binaural Pitch Perception in Normal-Hearing and Hearing-Impaired Listeners. Hear. Res. 2007, 223, 29–47. [Google Scholar] [CrossRef]
  83. TOWRE-2-Test of Word Reading Efficiency|Second Edition|Pearson Assessments US. Available online: https://www.pearsonassessments.com/en-us/Store/Professional-Assessments/Speech-%26-Language/Test-of-Word-Reading-Efficiency-%7C-Second-Edition/p/100000451?srsltid=AfmBOopJ28erRVKHx5I1aQ5YtHUY62pPljE_JOzAFqv7KLig3hXyHt6N (accessed on 22 August 2025).
  84. Moberly, A.C.; Afreen, H.; Schneider, K.J.; Tamati, T.N. Preoperative Reading Efficiency as a Predictor of Adult Cochlear Implant Outcomes. Otol. Neurotol. 2022, 43, e1100. [Google Scholar] [CrossRef]
  85. Gilbert, J.L.; Tamati, T.N.; Pisoni, D.B. Development, Reliability, and Validity of PRESTO: A New High-Variability Sentence Recognition Test. J. Am. Acad. Audiol. 2013, 24, 26–36. [Google Scholar] [CrossRef] [PubMed]
  86. Tamati, T.N.; Sevich, V.A.; Clausing, E.M.; Moberly, A.C. Lexical Effects on the Perceived Clarity of Noise-Vocoded Speech in Younger and Older Listeners. Front. Psychol. 2022, 13, 837644. [Google Scholar] [CrossRef] [PubMed]
  87. Barr, D.J.; Levy, R.; Scheepers, C.; Tily, H.J. Random Effects Structure for Confirmatory Hypothesis Testing: Keep It Maximal. J. Mem. Lang. 2013, 68, 255–278. [Google Scholar] [CrossRef] [PubMed]
  88. Bates, D.; Mächler, M.; Bolker, B.M.; Walker, S.C. Fitting Linear Mixed-Effects Models Using Lme4. J. Stat. Softw. 2015, 67, 1–48. [Google Scholar] [CrossRef]
  89. Kuznetsova, A.; Brockhoff, P.B.; Christensen, R.H.B. lmerTest Package: Tests in Linear Mixed Effects Models. J. Stat. Softw. 2017, 82, 1–26. [Google Scholar] [CrossRef]
  90. Gould, J.; Lane, H.; Vick, J.; Perkell, J.S.; Matthies, M.L.; Zandipour, M. Changes in Speech Intelligibility of Postlingually Deaf Adults after Cochlear Implantation. Ear Hear. 2001, 22, 453. [Google Scholar] [CrossRef]
  91. Ruff, S.; Bocklet, T.; Nöth, E.; Müller, J.; Hoster, E.; Schuster, M. Speech Production Quality of Cochlear Implant Users with Respect to Duration and Onset of Hearing Loss. ORL 2017, 79, 282–294. [Google Scholar] [CrossRef] [PubMed]
  92. Nagle, C.L. Examining the Temporal Structure of the Perception–Production Link in Second Language Acquisition: A Longitudinal Study. Lang. Learn. 2018, 68, 234–270. [Google Scholar] [CrossRef]
  93. Borden, G.; Gerber, A.; Milsark, G. Production and perception of the /r/-/l/ contrast in Korean adults learning English. Lang. Learn. 1983, 33, 499–526. [Google Scholar] [CrossRef]
  94. Sakai, M.; Moorman, C. Can Perception Training Improve the Production of Second Language Phonemes? A Meta-Analytic Review of 25 Years of Perception Training Research. Appl. Psycholinguist. 2018, 39, 187–224. [Google Scholar] [CrossRef]
  95. Rönnberg, J.; Lunner, T.; Zekveld, A.; Sörqvist, P.; Danielsson, H.; Lyxell, B.; Dahlström, Ö.; Signoret, C.; Stenfelt, S.; Pichora-Fuller, M.K.; et al. The Ease of Language Understanding (ELU) Model: Theoretical, Empirical, and Clinical Advances. Front. Syst. Neurosci. 2013, 7, 31. [Google Scholar] [CrossRef]
  96. Turgeon, C.; Prémont, A.; Trudeau-Fisette, P.; Ménard, L. Exploring Consequences of Short- and Long-Term Deafness on Speech Production: A Lip-Tube Perturbation Study. Clin. Linguist. Phon. 2015, 29, 378–400. [Google Scholar] [CrossRef]
  97. Tourville, J.A.; Reilly, K.J.; Guenther, F.H. Neural Mechanisms Underlying Auditory Feedback Control of Speech. NeuroImage 2008, 39, 1429–1443. [Google Scholar] [CrossRef] [PubMed]
  98. Ferguson, S.H.; Kewley-Port, D. Vowel Intelligibility in Clear and Conversational Speech for Normal-Hearing and Hearing-Impaired Listeners. J. Acoust. Soc. Am. 2002, 112, 259–271. [Google Scholar] [CrossRef] [PubMed]
  99. Ferguson, S.; Kewley-Port, D. Talker Differences in Clear and Conversational Speech: Acoustic Characteristics of Vowels. J. Speech Lang. Hear. Res. 2007, 50, 1241–1255. [Google Scholar] [CrossRef]
  100. Bradlow, A.R.; Kraus, N.; Hayes, E. Speaking Clearly for Children With Learning Disabilities: Sentence Perception in Noise. J. Speech Lang. Hear. Res. 2003, 46, 80–97. [Google Scholar] [PubMed]
  101. Picheny, M.A.; Durlach, N.I.; Braida, L.D. Speaking Clearly for the Hard of Hearing II. J. Speech Lang. Hear. Res. 1986, 29, 434–446. [Google Scholar] [CrossRef]
  102. Gahl, S.; Yao, Y.; Johnson, K. Why Reduce? Phonological Neighborhood Density and Phonetic Reduction in Spontaneous Speech. J. Mem. Lang. 2012, 66, 789–806. [Google Scholar] [CrossRef]
  103. Derwing, T.M.; Munro, M.J. Accent, Intelligibility, and Comprehensibility: Evidence from Four L1s. Stud. Second Lang. Acquis. 1997, 19, 1–16. [Google Scholar]
  104. Munro, M.J.; Derwing, T.M. Processing Time, Accent, and Comprehensibility in the Perception of Native and Foreign-Accented Speech. Lang. Speech 1995, 38, 289–306. [Google Scholar] [CrossRef]
  105. Munro, M.J.; Derwing, T.M. Modeling Perceptions of the Accentedness and Comprehensibility of L2 Speech the Role of Speaking Rate. Stud. Second Lang. Acquis. 2001, 23, 451–468. [Google Scholar] [CrossRef]
  106. Turgeon, C.; Trudeau-Fisette, P.; Lepore, F.; Lippé, S.; Ménard, L. Impact of Visual and Auditory Deprivation on Speech Perception and Production in Adults. Clin. Linguist. Phon. 2020, 34, 1061–1087. [Google Scholar] [CrossRef]
  107. Kelly, R.; Tinnemore, A.R.; Nguyen, N.; Goupell, M.J. On the Difficulty of Defining Duration of Deafness for Adults With Cochlear Implants. Ear Hear. 2025, 46, 1125. [Google Scholar] [CrossRef] [PubMed]
  108. Munson, B. Lexical Characteristics Mediate the Influence of Sex and Sex Typicality on Vowel-Space Size. In Proceedings of the International Congress of Phonetic Sciences, Saarbrücken, Germany, 6–10 August 2007; pp. 885–888. [Google Scholar]
  109. Munson, B.; Babel, M. The Phonetics of Sex and Gender. In The Routledge Handbook of Phonetics; Taylor and Francis: Abingdon, UK, 2019; pp. 499–525. [Google Scholar]
  110. Smiljanic, R.; Gilbert, R.C. Intelligibility of Noise-Adapted and Clear Speech in Child, Young Adult, and Older Adult Talkers. J. Speech Lang. Hear. Res. 2017, 60, 3069–3080. [Google Scholar] [CrossRef]
  111. Dahò, M.; Monzani, D. The Multifaceted Nature of Inner Speech: Phenomenology, Neural Correlates, and Implications for Aphasia and Psychopathology. Cogn. Neuropsychol. 2025, 3, 1–21. [Google Scholar] [CrossRef]
  112. Wessel, J.R. An Adaptive Orienting Theory of Error Processing. Psychophysiology 2018, 55, e13041. [Google Scholar] [CrossRef]
  113. McGarr, N.S. The Intelligibility of Deaf Speech to Experienced and Inexperienced Listeners. J. Speech Hear. Res. 1983, 26, 451–458. [Google Scholar] [PubMed]
  114. Baese-Berk, M.M.; Levi, S.V.; Van Engen, K.J. Intelligibility as a Measure of Speech Perception: Current Approaches, Challenges, and Recommendations. J. Acoust. Soc. Am. 2023, 153, 68–76. [Google Scholar] [CrossRef] [PubMed]
  115. Bent, T.; Baese-Berk, M.; Borrie, S.A.; McKee, M. Individual Differences in the Perception of Regional, Nonnative, and Disordered Speech Varieties. J. Acoust. Soc. Am. 2016, 140, 3775–3786. [Google Scholar] [CrossRef]
  116. McLaughlin, D.J.; Baese-Berk, M.M.; Bent, T.; Borrie, S.A.; Van Engen, K.J. Coping with Adversity: Individual Differences in the Perception of Noisy and Accented Speech. Atten. Percept. Psychophys. 2018, 80, 1559–1570. [Google Scholar] [CrossRef]
  117. Baese-Berk, M.M.; McLaughlin, D.J.; McGowan, K.B. Perception of Non-Native Speech. Lang. Linguist. Compass 2020, 14, e12375. [Google Scholar] [CrossRef]
  118. Dossey, E.; Clopper, C.G.; Wagner, L. The Development of Sociolinguistic Competence across the Lifespan: Three Domains of Regional Dialect Perception. Lang. Learn. Dev. 2020, 16, 330–350. [Google Scholar] [CrossRef]
  119. McGowan, K.B. Social Expectation Improves Speech Perception in Noise. Lang. Speech 2015, 58, 502–521. [Google Scholar] [CrossRef]
  120. Rubin, D.L. Nonlanguage Factors Affecting Undergraduates’ Judgments of Nonnative English-Speaking Teaching Assistants. Res. High. Educ. 1992, 33, 511–531. [Google Scholar] [CrossRef]
  121. Simon, E.; Lybaert, C.; Plevoets, K. Social Attitudes, Intelligibility and Comprehensibility: The Role of the Listener in the Perception of Non-Native Speech. Vigo Int. J. Appl. Linguist. 2022, 19, 177–222. [Google Scholar] [CrossRef]
  122. Freeman, V. Speech Intelligibility and Personality Peer-Ratings of Young Adults with Cochlear Implants. J. Deaf Stud. Deaf Educ. 2018, 23, 41–49. [Google Scholar] [CrossRef] [PubMed]
  123. Freeman, V. Attitudes toward Deafness Affect Impressions of Young Adults with Cochlear Implants. J. Deaf Stud. Deaf Educ. 2018, 23, 360–368. [Google Scholar] [CrossRef] [PubMed]
  124. Cuda, D.; Manrique, M.; Ramos, Á.; Marx, M.; Bovo, R.; Khnifes, R.; Hilly, O.; Belmin, J.; Stripeikyte, G.; Graham, P.L.; et al. Improving Quality of Life in the Elderly: Hearing Loss Treatment with Cochlear Implants. BMC Geriatr. 2024, 24, 16. [Google Scholar] [CrossRef] [PubMed]
  125. Hawton, A.; Green, C.; Dickens, A.P.; Richards, S.H.; Taylor, R.S.; Edwards, R.; Greaves, C.J.; Campbell, J.L. The Impact of Social Isolation on the Health Status and Health-Related Quality of Life of Older People. Qual. Life Res. 2011, 20, 57–67. [Google Scholar] [CrossRef]
  126. Mo, B.; Lindbæk, M.; Harris, S. Cochlear Implants and Quality of Life: A Prospective Study. Ear Hear. 2005, 26, 186. [Google Scholar] [CrossRef]
Table 1. Demographic characteristics of cochlear implant (CI) users. YR = years, HA = hearing aid, HL = hearing loss, N/A = not applicable. Duration of deafness was defined as the difference in years between patient-reported onset of hearing loss and age at first CI.
| Subject | Gender | Age (YR) | Side of Implant | HA in Contralateral Ear | Etiology of HL | Age at First CI (YR) | Duration of Deafness (YR) | Years of CI Use |
|---|---|---|---|---|---|---|---|---|
| CI1 | Woman | 65 | Bilateral | N/A | Genetic | 54 | 13 | 11 |
| CI2 | Woman | 57 | Left | Yes | Genetic | 48 | 41 | 9 |
| CI3 | Woman | 69 | Bilateral | N/A | Otosclerosis | 56 | 41 | 13 |
| CI4 | Woman | 76 | Left | No | Autoimmune | 68 | 23 | 8 |
| CI5 | Man | 59 | Bilateral | N/A | Sudden (idiopathic) | 57 | 2 | 2 |
| CI6 | Woman | 62 | Right | Yes | Sudden (idiopathic) | 56 | 29 | 6 |
| CI7 | Man | 66 | Left | No | Meniere’s | 60 | 46 | 6 |
| CI8 | Woman | 35 | Left | No | Physical trauma | 31 | 18 | 4 |
| CI9 | Woman | 64 | Left | Yes | Genetic | 59 | 50 | 5 |
| CI10 | Man | 56 | Bilateral | N/A | Unknown | 45 | Unknown | 11 |
| CI11 | Woman | 24 | Bilateral | N/A | Unknown | 19 | Unknown | 5 |
| CI12 | Man | 67 | Right | Yes | Physical trauma | 60 | 48 | 7 |
| CI13 | Man | 73 | Right | Yes | Genetic | 60 | 20 | 13 |
| CI14 | Woman | 49 | Right | Yes | Genetic | 39 | Unknown | 10 |
| CI15 | Woman | 56 | Right | Yes | Unknown | 48 | Unknown | 8 |
Table 2. A count of the total number of words per CI user (talker) that were acoustically analyzed and used as stimulus materials for the intelligibility rating task. Counts are provided separately for each of the four vowels included.
| Talker | æ | ɑ | i | u | Total |
|---|---|---|---|---|---|
| CI1 | 18 | 23 | 18 | 15 | 74 |
| CI2 | 18 | 26 | 21 | 16 | 81 |
| CI3 | 15 | 19 | 20 | 13 | 67 |
| CI4 | 13 | 22 | 19 | 16 | 70 |
| CI5 | 15 | 24 | 21 | 16 | 76 |
| CI6 | 15 | 18 | 17 | 14 | 64 |
| CI7 | 15 | 22 | 17 | 14 | 68 |
| CI8 | 17 | 21 | 21 | 13 | 72 |
| CI9 | 15 | 23 | 17 | 14 | 69 |
| CI10 | 15 | 23 | 17 | 13 | 68 |
| CI11 | 16 | 20 | 21 | 16 | 73 |
| CI12 | 16 | 23 | 22 | 15 | 76 |
| CI13 | 12 | 22 | 20 | 15 | 69 |
| CI14 | 15 | 23 | 18 | 12 | 68 |
| CI15 | 15 | 24 | 20 | 14 | 73 |
| Total | 230 | 333 | 289 | 216 | 1068 |
Table 3. A summary of mean intelligibility ratings, vowel dispersion, PRESTO accuracy, and TOWRE-2 scores, provided separately for each CI user. Values in parentheses are standard deviations.
| Subject | Intelligibility Rating | Vowel Dispersion (Bark) | PRESTO Accuracy (Percent Correct) | TOWRE-2 Score (Total Words + Nonwords Correctly Reported) |
|---|---|---|---|---|
| CI1 | 71.0 (27.6) | 3.29 (0.74) | 86.6 | 129 |
| CI2 | 71.9 (26.5) | 2.37 (0.46) | 81.4 | 136 |
| CI3 | 63.6 (27.8) | 2.81 (0.44) | 74.1 | 124 |
| CI4 | 72.9 (25.4) | 2.30 (0.63) | 65.0 | 131 |
| CI5 | 72.7 (26.3) | 2.50 (0.59) | 84.4 | 136 |
| CI6 | 77.8 (21.7) | 1.82 (0.36) | 69.4 | 121 |
| CI7 | 84.0 (19.2) | 2.47 (0.32) | 81.2 | 147 |
| CI8 | 68.6 (27.1) | 2.67 (0.91) | 72.8 | 157 |
| CI9 | 56.9 (32.6) | 2.88 (0.81) | 69.9 | 131 |
| CI10 | 51.0 (32.0) | 2.33 (0.58) | 51.6 | 84 |
| CI11 | 78.4 (23.5) | 2.37 (0.37) | 84.7 | 126 |
| CI12 | 52.9 (31.5) | 1.89 (0.41) | 47.6 | 132 |
| CI13 | 42.9 (31.2) | 2.01 (0.52) | 67.7 | 125 |
| CI14 | 57.6 (33.2) | 2.54 (0.87) | 60.2 | 111 |
| CI15 | 67.3 (27.5) | 1.89 (0.57) | 64.9 | 131 |
| Mean | 65.9 (11.5) | 2.41 (0.41) | 70.8 (11.9) | 128.1 (16.2) |
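The per-talker vowel dispersion values in Table 3 are reported in Bark. The article's exact computation is not given in this excerpt; a common approach, sketched below under the assumptions that formants are Bark-transformed with the Traunmüller (1990) formula and that dispersion is the mean Euclidean distance of each (F1, F2) token from the talker's vowel-space centroid, looks like this (the formant values are hypothetical, not the study's data):

```python
import math

def bark(f_hz):
    """Traunmüller (1990) Hz-to-Bark transform."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

def vowel_dispersion(formants_hz):
    """Mean Euclidean distance (in Bark) of each (F1, F2) token
    from the talker's vowel-space centroid."""
    pts = [(bark(f1), bark(f2)) for f1, f2 in formants_hz]
    cx = sum(x for x, _ in pts) / len(pts)
    cy = sum(y for _, y in pts) / len(pts)
    return sum(math.hypot(x - cx, y - cy) for x, y in pts) / len(pts)

# Hypothetical (F1, F2) means in Hz for /i/, /u/, /æ/, /ɑ/ tokens:
tokens = [(300, 2300), (320, 900), (700, 1700), (680, 1100)]
print(round(vowel_dispersion(tokens), 2))
```

Larger values indicate a more expanded (more distinctive) vowel space; identical tokens yield a dispersion of zero.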
Table 4. Results from Pearson’s correlations between measures of speech production (intelligibility ratings and vowel dispersion), speech recognition (PRESTO), and phonological processing (TOWRE-2). Bolded comparisons are significant after FDR corrections.
| | Sentence Recognition (PRESTO) Accuracy | Phonological Processing (TOWRE-2) |
|---|---|---|
| Intelligibility Ratings | **r = 0.67, p = 0.024** | r = 0.47, p = 0.090 |
| Vowel Dispersion | r = 0.50, p = 0.098 | r = 0.13, p = 0.63 |
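The Table 4 caption states that significance was assessed after FDR corrections. The exact procedure and comparison family used by the authors are not specified in this excerpt; purely as an illustration of how such a correction works, the sketch below applies the standard Benjamini–Hochberg adjustment to the four raw p-values reported above. Because the authors' correction details may differ, the adjusted values it prints should not be read as the paper's results.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices sorted by raw p-value, ascending.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Raw p-values from Table 4: intelligibility-PRESTO, intelligibility-TOWRE-2,
# dispersion-PRESTO, dispersion-TOWRE-2.
raw = [0.024, 0.090, 0.098, 0.63]
print([round(p, 3) for p in benjamini_hochberg(raw)])
```

In practice one would use a vetted implementation such as `statsmodels.stats.multitest.multipletests(method="fdr_bh")` rather than hand-rolling the adjustment.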
Table 5. Model output predicting intelligibility rating from vowel dispersion with covariates for lexical frequency and neighborhood density. p-values < 0.05 are denoted with an asterisk.
| Effect | Estimate | Error | T-Value | p-Value |
|---|---|---|---|---|
| Intercept | 55.56 | 8.83 | 6.30 | <0.0001 * |
| Vowel Dispersion | −0.32 | 0.46 | −0.69 | 0.49 |
| Lexical Frequency | 4.60 | 1.74 | 2.65 | 0.009 * |
| Neighborhood Density | 0.001 | 0.23 | 0.001 | 0.99 |
Sevich, V.A.; Williams, D.J.; Moberly, A.C.; Tamati, T.N. Speech Production Intelligibility Is Associated with Speech Recognition in Adult Cochlear Implant Users. Brain Sci. 2025, 15, 1066. https://doi.org/10.3390/brainsci15101066