Article

Spatial Release from Masking for Small Spatial Separations Using Simulated Cochlear Implant Speech †

by Nirmal Srinivasan *, SaraGrace McCannon and Chhayakant Patro
Department of Speech-Language Pathology and Audiology, Towson University, Towson, MD 21252, USA
* Author to whom correspondence should be addressed.
Portions of the data were presented at the 2023 Conference on the Implantable Auditory Prostheses (CIAP) meeting, Lake Tahoe, CA, USA, 9–14 July 2023.
Current address: Audiology Program, Vanderbilt School of Medicine, Vanderbilt University, Nashville, TN 37235, USA.
J. Otorhinolaryngol. Hear. Balance Med. 2024, 5(2), 18; https://doi.org/10.3390/ohbm5020018
Submission received: 12 September 2024 / Revised: 15 November 2024 / Accepted: 23 November 2024 / Published: 27 November 2024

Abstract

Background: Spatial release from masking (SRM) is the improvement in speech intelligibility when the masking signals are spatially separated from the target signal. Young, normal-hearing listeners have a robust auditory system that is capable of using binaural cues even with a very small spatial separation between the target and the maskers. Prior studies exploring SRM through simulated cochlear implant (CI) speech have used substantial spatial separations, exceeding 45°, between the target signal and masking signals. Nevertheless, in real-world conversational scenarios, the spatial separation between the target and the maskers may be considerably less than what has been previously investigated. This study presents SRM data utilizing simulated CI speech with young, normal-hearing listeners, focusing on smaller but realistic spatial separations between the target and the maskers. Methods: Twenty-five young, normal-hearing listeners participated in this study. Speech identification thresholds, the target-to-masker ratio required to accurately identify 50% of the target words, were measured for both natural speech and simulated CI speech. Results: The results revealed that young, normal-hearing listeners had significantly higher speech identification thresholds when presented with simulated CI speech in comparison to natural speech. Furthermore, the amount of SRM was greater for natural speech than for the simulated CI speech. Conclusions: The data suggest that young, normal-hearing individuals are capable of utilizing the interaural level difference cues in the simulated cochlear implant signal to achieve masking release at reduced spatial separations between the target and the maskers, highlighting the auditory system’s capability to extract these interaural cues even in the presence of degraded speech signals.

1. Introduction

In real-world listening scenarios, the ability to segregate speech sources can be quite complex, leading to a reduction in speech intelligibility even for listeners with normal hearing. This difficulty is exacerbated for cochlear implant users who frequently struggle with communicating in such challenging listening environments. For individuals with normal hearing, spatial separation between the target and the maskers offers robust binaural cues which can be used to focus attention on the target speech. This improvement in speech intelligibility when the target and the maskers are spatially separated, compared to when they are not, is defined as spatial release from masking (SRM) [1,2,3,4,5].
Some of the factors that typically contribute to SRM are the acoustic head shadow effect, binaural decorrelation processing, and spatial attention. When a masker is spatially separated from the target, the listener’s head attenuates the incoming sound, thereby creating a favorable listening experience in one ear compared to the other ear. The listener can use this difference in signal-to-noise ratio (SNR) and attend to the ear with the favorable SNR to increase overall speech intelligibility. This acoustic head shadow is a monaural effect and can be minimized by separating the maskers symmetrically from the target [6,7,8]. When the target and maskers are spatially separated, binaural decorrelation processing can enable the listener to detect the target signal in the presence of maskers by using the interaural cues, namely the interaural time differences (ITDs) and the interaural level differences (ILDs) [9]. For spatial attention, the binaural cues need not be large to benefit from those cues as long as the listeners have perceived the distinct spatial location of the target and the maskers [2,5,10,11,12]. In fact, Srinivasan et al. [5] found that young, normal-hearing listeners could use spatial separations as small as 2° between the target and symmetrical maskers to achieve SRM. For older listeners with normal hearing, the spatial separation required to achieve SRM was increased to 6°, while the older listeners with hearing impairment required a much larger spatial separation (30°) between the target and the maskers to achieve SRM.
Bilateral cochlear implants (CIs) aim to improve the user’s ability to use spatial cues to increase their sound localization and stream segregation capabilities [13,14], and an increasing number of CI users are undergoing bilateral implantation to have better sound localization abilities and improved speech understanding in complex listening environments [15]. Clinical sound processors generally convey the ILD cues via the temporal envelope of the signal [16,17], whereas the ITD cues are generally not conveyed [18,19]. These ILD cues could potentially enable the user to hear sounds from distinct spatial locations [14] and could help them to differentiate the target signal from the maskers as CI users do not have access to cues such as pitch differences or onset similarities [20,21].
Bilateral CI users have been shown to have better binaural processing abilities compared to unilateral CI users [13,22]. The amount of SRM achieved by bilateral CI users varies depending on the type of stimuli used and the spatial configurations tested [4,13,17,23,24,25,26,27]. Hawley et al. [4] used simulated CI speech with young, normal-hearing listeners and found that SRM was larger when the target and the maskers were similar (speech on speech; 6 dB) than when the target and maskers were dissimilar (speech on noise; 4 dB). These findings with simulated speech were similar to the amount of SRM achieved by young, normal-hearing listeners when the target speech and the masking speech were perceptually similar to each other [4,5,21]. Similarly, Garadat et al. [28] measured SRM using sine-vocoded speech and found results similar to those of Hawley et al. [4].
Together, these findings indicate that normal-hearing listeners listening to simulated CI speech show benefits in speech understanding when the target and the maskers are spatially separated. However, most of the studies used spatial separations greater than 45° to measure speech identification thresholds. In realistic listening scenarios, the target and the maskers can be closer than the separations previously used in the literature. The goal of this study is to understand the role of small spatial separations (in the azimuthal plane) in the speech understanding of young, normal-hearing listeners while using simulated CI speech.

2. Methods

2.1. Listeners

Twenty-five young, normal-hearing adults (mean age = 21.3 years, age range: 19–23 years) participated in the experiment. Standard air conduction audiometric thresholds were obtained from all study participants, and all participants had normal hearing, defined as audiometric thresholds of ≤15 dB HL at all octave frequencies from 250 to 8000 Hz. None of the participants had any asymmetry in the audiometric thresholds, defined as a threshold difference >10 dB HL between the ears. All testing conditions and the testing protocol were approved by Towson University’s Institutional Review Board, and all participants were financially compensated for their participation in the study.

2.2. Stimuli

Three male talkers from the Coordinate Response Measure [29] (CRM) were used in this experiment. All sentences in the CRM corpus had the form “Ready [CALL SIGN] go to [COLOR] [NUMBER] now”. There were eight possible call signs (Arrow, Baron, Charlie, Eagle, Hopper, Laker, Ringo, and Tiger), four colors (Blue, Red, White, and Green), and eight numbers (1–8). All possible combinations of the call signs, colors, and numbers were used. On each trial, a target sentence (call sign “Charlie”) and two masker sentences (different call signs) were presented simultaneously. Each talker would mention a unique color and number, and the listener would report the target color/number combination with a button press.
For both natural and simulated CI speech conditions, the target and the masker signals were first convolved with location-specific head-related impulse responses (HRIRs) to create signals that had binaural cues. In the natural speech conditions, the directionally dependent signals were added together and presented over headphones. In the simulated CI speech conditions, the directionally dependent signals were added together and processed through the CI simulation before presenting the CI simulated signals over the headphones. For all the listening conditions, the target level was fixed, and the masker level was varied across the trials.
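The spatialization pipeline above (convolve each source with a location-specific HRIR pair, then sum the binaural signals) can be sketched as follows. This is an illustrative sketch, not the authors’ code: the function names are hypothetical, and the HRIRs are assumed to be loaded elsewhere as NumPy arrays of equal length per pair.

```python
import numpy as np
from scipy.signal import fftconvolve

def spatialize(mono, hrir_left, hrir_right):
    # Convolve a mono source with a location-specific HRIR pair, yielding
    # a two-channel signal that carries the interaural (ITD/ILD) cues.
    return np.column_stack([fftconvolve(mono, hrir_left),
                            fftconvolve(mono, hrir_right)])

def mix_scene(*binaural_signals):
    # Sum the directionally rendered target and masker signals sample by
    # sample, zero-padding shorter signals to the length of the longest.
    n = max(s.shape[0] for s in binaural_signals)
    scene = np.zeros((n, 2))
    for s in binaural_signals:
        scene[:s.shape[0]] += s
    return scene
```

In the natural speech conditions the mixed scene would be presented directly; in the simulated CI conditions it would be passed through the vocoder first, as described in the CI Simulation section.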

2.3. CI Simulation

Spectral degradations were applied to the stimuli using noise-band vocoder processing, a technique modeled after CI signal processing and widely used to simulate CI hearing [30]. An eight-channel vocoder was chosen, as this configuration has been shown to provide speech recognition performance comparable to that of the top-performing CI users [31,32]. The input bandwidth was first limited to 150–8000 Hz, and the stimuli were then filtered into eight spectral bands using fourth-order Butterworth filters with a slope of 24 dB per octave. The band cutoff frequencies were distributed according to the Greenwood mapping function [33], ensuring a physiologically relevant frequency distribution that reflects the cochlear tonotopic organization. The temporal envelope of each band was extracted using half-wave rectification and low-pass filtering, with a 160 Hz cutoff frequency and a roll-off of 24 dB per octave. This extracted envelope was then used to modulate the corresponding bandpass-filtered noise. Finally, the outputs from all eight channels were combined to create the noise-vocoded stimuli, which were presented bilaterally to the participants, simulating CI listening conditions.
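The vocoder steps above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors’ implementation: it uses the standard human Greenwood-map constants (A = 165.4, a = 2.1, k = 0.88) to place the band edges, and fourth-order Butterworth filters for the 24 dB/octave slopes; all function names are illustrative.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def greenwood_edges(f_lo=150.0, f_hi=8000.0, n_bands=8):
    # Greenwood map (human constants): f = A * (10**(a*x) - k), where x is
    # the normalized place along the cochlea.  Band edges are spaced
    # linearly in place between the places for f_lo and f_hi.
    A, a, k = 165.4, 2.1, 0.88
    to_place = lambda f: np.log10(f / A + k) / a
    to_freq = lambda x: A * (10.0 ** (a * x) - k)
    places = np.linspace(to_place(f_lo), to_place(f_hi), n_bands + 1)
    return to_freq(places)

def vocode(signal, fs, n_bands=8, env_cutoff=160.0, seed=0):
    # Noise-band vocoder: band-pass each channel (4th-order Butterworth,
    # ~24 dB/octave), half-wave rectify and low-pass (160 Hz) to extract
    # the envelope, modulate band-limited noise, and sum the channels.
    rng = np.random.default_rng(seed)
    edges = greenwood_edges(n_bands=n_bands)
    sos_env = butter(4, env_cutoff, btype="low", fs=fs, output="sos")
    out = np.zeros_like(signal)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos_band = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        band = sosfilt(sos_band, signal)
        env = sosfilt(sos_env, np.maximum(band, 0.0))   # rectified envelope
        noise = sosfilt(sos_band, rng.standard_normal(signal.shape[0]))
        out += env * noise
    return out
```

A sampling rate comfortably above twice the 8000 Hz upper band edge (e.g., 44.1 kHz) is assumed so that the band-pass design is valid.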

2.4. Conditions

A virtual auditory spatial array was used to present the speech stimuli. Head-related impulse responses (HRIRs) were generated using techniques described by Zahorik (2009) [34]. The simulation method used an image model [35] to compute the directions, delays, and attenuations of early reflections, which were then, along with the direct path, spatially rendered using non-individualized head-related transfer functions (HRTFs). Overall, this method of simulation was found to produce HRIRs that were reasonable physical and perceptual approximations of those measured in real environments [34]. Eight spatial configurations were used in this study: a colocated condition (target and maskers presented from 0° azimuth) and seven spatially separated conditions (target at 0°, symmetrical maskers at ±2°, ±4°, ±6°, ±8°, ±10°, ±15°, or ±30°). The HRIRs used here were the same HRIRs used in [5], which investigated the effects of age and hearing loss on spatial release from masking. In the natural speech stimuli conditions, the target and the maskers were convolved with the HRIRs for their appropriate locations relative to the listener before being presented over the headphones.

2.5. Procedure

All the study participants were seated in a sound-treated audiology test booth in the Speech-Language Pathology and Audiology Department at Towson University. The auditory stimuli were presented over circumaural headphones (Sennheiser HD 650, Sennheiser, Hanover, Germany). MATLAB was used to create the stimuli, which were then played via a Lynx Hilo sound card. To familiarize the listeners with the testing procedure, the test session always started with a quiet threshold estimation, where the quiet threshold was defined as the level required to obtain the 50% correct point on the psychometric function [36]. Quiet identification thresholds were determined through a one-up, one-down adaptive procedure based on the accuracy of reporting the color and number combination of the target sentence. Both the color and the number had to be reported correctly for a response to be scored as correct. At the start of each quiet threshold track, the presentation level of the target speech was set at 20 dB above the pure-tone average (PTA) calculated from audiometric thresholds at 0.5, 1, 2, and 4 kHz. The presentation level of the target sentence was decreased by 5 dB following each correct response and increased by 5 dB following each incorrect response. This process continued until the first three directional reversals were recorded, at which point the step size was decreased to 1 dB, starting from the fourth reversal. Each quiet threshold estimation track included nine reversals, with the quiet threshold estimated as the average of the last six reversals.
Two such quiet threshold estimates were obtained, and the average of these two quiet threshold estimates was used to set the presentation level of the target stimulus for all test scenarios.
In all the blocks of these trials, the target speech was presented in the presence of speech masking. Speech identification thresholds for the different listening conditions were measured using a similar adaptive procedure as for the quiet threshold measurements. The only difference was that the target level was fixed at 20 dB above the quiet identification threshold and the masker level was varied adaptively in order to estimate the target-to-masker ratio (TMR, measured in dB SNR) associated with 50% correct identification performance. The kind of speech stimuli presented varied randomly between the blocks of these trials. The threshold estimates were averaged over three adaptive tracks per spatial separation for both the natural speech and simulated CI speech conditions.
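The one-up, one-down adaptive track described above (5 dB steps shrinking to 1 dB after the third reversal, threshold taken as the mean of the last six of nine reversals) can be sketched as follows. This is an illustrative sketch, not the authors’ MATLAB code; `respond` stands in for the listener’s trial-by-trial correct/incorrect judgment.

```python
def adaptive_track(respond, start_level, big_step=5.0, small_step=1.0,
                   n_reversals=9, n_average=6):
    # One-up, one-down track: the level drops after each correct response
    # and rises after each incorrect one.  The step size shrinks from
    # big_step to small_step after the third reversal, and the threshold
    # is the mean of the levels at the last n_average reversals.
    level, step = start_level, big_step
    last_direction, reversals = None, []
    while len(reversals) < n_reversals:
        direction = -1 if respond(level) else +1   # down if correct, up if not
        if last_direction is not None and direction != last_direction:
            reversals.append(level)                # direction change = reversal
            if len(reversals) == 3:
                step = small_step                  # 1 dB from the 4th reversal on
        last_direction = direction
        level += direction * step
    return sum(reversals[-n_average:]) / n_average
```

For the masked conditions, the same track would vary the masker level (and hence the TMR) while the target level stays fixed, rather than varying the target level itself.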
Responses from the listeners were obtained using a computer monitor located on a table in front of the listener. Feedback was given after each presentation in the form of “Correct” or “Incorrect”. Data collection was self-paced, and listeners were instructed to take breaks whenever they felt the need.

2.6. Data Analysis

Analyses were performed with SPSS 28.0 (IBM Corp., Armonk, NY, USA). Repeated Measures ANOVAs (RM-ANOVAs) were used to investigate the differences in speech identification thresholds as a function of spatial separation for natural speech and CI-simulated speech. Pearson’s correlations were used to understand the relationship between the natural and the simulated CI speech identification thresholds at every spatial separation tested.

3. Results

The darker line on the left panel of Figure 1 displays the mean TMR thresholds (±1 standard error of the mean) at different spatial separations between the target and the masker for natural speech, while the darker line on the right panel displays the mean TMR thresholds (±1 standard error of the mean) for CI-simulated speech. Within each panel, the lighter lines show the mean TMR thresholds for the individual listeners. An RM-ANOVA was conducted with stimuli type (natural and CI-simulated speech) and spatial separation (0°, ±2°, ±4°, ±6°, ±8°, ±10°, ±15°, and ±30°) as within-subject factors and the TMR required to identify 50% of the target speech correctly as the dependent variable. Mauchly’s test indicated that the assumption of sphericity had been violated for both spatial separation and the interaction between stimuli type and spatial separation, and therefore degrees of freedom were corrected using Greenhouse–Geisser estimates of sphericity (spatial separation: χ2(27) = 60.91, p < 0.001, ε = 0.51; stimuli type × spatial separation: χ2(27) = 51.29, p = 0.004, ε = 0.58). Results indicated a significant main effect of stimuli type (F(1,24) = 324.40, p < 0.001, partial η2 = 0.93, indicating a very large effect) and a significant main effect of spatial separation (F(3.56, 85.32) = 263.01, p < 0.001, partial η2 = 0.92, indicating a very large effect) on TMR thresholds. There was also a significant interaction between stimuli type and spatial separation on the TMR thresholds (F(4.05, 97.27) = 68.61, p < 0.001, partial η2 = 0.74, indicating a large effect).
To better understand the significant interaction, separate RM-ANOVAs were conducted for each stimuli type. Mauchly’s test indicated that the assumption of sphericity had been violated for both stimuli types, and therefore degrees of freedom were corrected using Greenhouse–Geisser estimates of sphericity (natural speech: χ2(27) = 58.94, p < 0.001, ε = 0.52; simulated CI speech: χ2(27) = 77.56, p < 0.001, ε = 0.44). There was a significant effect of spatial separation on the TMR thresholds for both natural speech (F(3.65, 87.49) = 174.20, p < 0.001, partial η2 = 0.88, indicating a large effect) and CI-simulated speech (F(3.09, 74.16) = 153.99, p < 0.001, partial η2 = 0.87, indicating a large effect). A post hoc analysis using paired-sample t-tests with Bonferroni correction showed that the TMR thresholds at all spatially separated conditions were significantly better than at the colocated condition (all p < 0.05) for both the natural and simulated CI speech. This result indicates that even a small spatial separation, of the order of 2°, between the target and the maskers helped the listeners access the binaural cues and benefit from the spatial separation.
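The post hoc comparisons of each separated condition against the colocated condition can be sketched with SciPy. This is an illustrative sketch rather than the SPSS analysis the authors ran: it assumes the thresholds are stored as a listeners-by-conditions array with the colocated condition in column 0, and applies a simple Bonferroni-corrected criterion.

```python
import numpy as np
from scipy.stats import ttest_rel

def posthoc_vs_colocated(tmr, alpha=0.05):
    # Paired-sample t-tests comparing each spatially separated condition
    # (columns 1..n) against the colocated condition (column 0), with a
    # Bonferroni-corrected alpha for the family of comparisons.
    n_comparisons = tmr.shape[1] - 1
    results = []
    for j in range(1, tmr.shape[1]):
        t_stat, p_val = ttest_rel(tmr[:, 0], tmr[:, j])
        results.append((j, t_stat, p_val, bool(p_val < alpha / n_comparisons)))
    return results
```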
Spatial release from masking (SRM) was calculated as the difference between the TMR thresholds under the colocated and spatially separated conditions. The darker line on the left panel of Figure 2 displays the mean SRM (±1 standard error of the mean) at different spatial separations for the natural speech, while the darker line on the right panel displays the mean SRM (±1 standard error of the mean) for the CI-simulated speech. Within each panel, the lighter lines show the mean SRM for the individual listeners. To investigate the effect of the listening conditions on SRM, an RM-ANOVA was conducted with stimuli type (natural and CI-simulated speech) and spatial separation (±2°, ±4°, ±6°, ±8°, ±10°, ±15°, and ±30°) as within-subject factors and SRM as the dependent variable. Mauchly’s test indicated that the assumption of sphericity had been violated for both spatial separation and the interaction between stimuli type and spatial separation, and therefore the degrees of freedom were corrected using Greenhouse–Geisser estimates of sphericity (spatial separation: χ2(20) = 42.75, p = 0.002, ε = 0.56; stimuli type × spatial separation: χ2(20) = 41.37, p = 0.004, ε = 0.61). The results indicate a significant main effect of stimuli type (F(1,24) = 91.56, p < 0.001, partial η2 = 0.79, indicating a large effect) and spatial separation (F(3.35, 80.31) = 202.54, p < 0.001, partial η2 = 0.89, indicating a very large effect) on the SRM. There was a significant interaction between the stimuli type and spatial separation (F(3.68, 88.30) = 64.47, p < 0.001, partial η2 = 0.73, indicating a large effect). Simple effect analyses indicated no significant difference in the amount of SRM obtained when the listeners were presented with natural and simulated CI speech at 2° of spatial separation between the target and maskers.
However, at larger separations (≥4° of spatial separation between the target and the maskers), there was a significant difference in the amount of SRM, with higher SRM for the natural speech stimuli compared to the CI-simulated speech (all p < 0.001). The difference in the amounts of SRM also became increasingly larger as the spatial separation between the target and maskers increased.
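The SRM computation itself is a simple subtraction per listener and separation. A minimal sketch, assuming the thresholds are stored as a listeners-by-conditions array with the colocated condition in column 0, and using the convention that SRM = colocated threshold minus separated threshold (so positive values indicate a benefit of separation):

```python
import numpy as np

def spatial_release(tmr):
    # tmr: (n_listeners, n_conditions) TMR thresholds in dB, with the
    # colocated condition in column 0 and the spatially separated
    # conditions in the remaining columns.  SRM is the colocated
    # threshold minus each separated threshold, per listener.
    tmr = np.asarray(tmr, dtype=float)
    return tmr[:, [0]] - tmr[:, 1:]
```

Using the group means reported in the Discussion (3.69 dB colocated and −7.75 dB at 30° for natural speech) would give an average SRM of about 11.4 dB at the largest separation.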
Pearson’s correlational analyses were performed to investigate the relationships between the individual TMR thresholds obtained at various spatial separations for the natural and CI-simulated speech listening conditions. Figure 3 shows the scatterplot between the natural speech and the CI-simulated speech TMR thresholds at the eight spatial separations tested. The correlations were positive and statistically significant for all the conditions, with the correlation value (r, df = 23 for all conditions) ranging between 0.46 and 0.75. These results indicate that listeners with better thresholds in the natural speech condition tend to have better thresholds in the simulated CI speech condition as well.

4. Discussion

The present study investigated the effects of spatial separation between the target and maskers on speech identification thresholds when presented with natural speech or simulated CI speech. This study used smaller spatial separations between the target and the maskers than have traditionally been implemented in SRM studies.
The TMR thresholds for natural speech improved as the spatial separation between the target and the maskers increased. These findings were consistent with the results reported in Srinivasan et al. (2016) [5]. To the knowledge of the authors, there are no other studies apart from Srinivasan et al. (2016) [5] that have investigated the effects of small spatial separations between the target and the maskers on speech understanding. Additionally, there is strong consistency between the thresholds obtained from the listeners in this study and equivalent thresholds described in the literature. The threshold obtained for the colocated condition (M = 3.69 dB, 95% CI [3.12, 4.25]) was comparable to the thresholds reported in the literature [3,37,38,39,40]. Also, the thresholds (M = −7.75 dB, 95% CI [−8.95, −6.56]) obtained for the largest separation (30°) used in this study were also comparable to those reported in the literature [3,5,8,39,40,41]. The thresholds for the other smaller spatial separations were also comparable to those reported in Srinivasan et al. (2016) [5].
The TMR thresholds for simulated CI speech improved as the spatial separation between the target and the maskers increased. For all spatial separations tested, the simulated CI speech thresholds were significantly poorer than the natural speech thresholds, indicating the difficulty of perceiving simulated CI speech. This finding is consistent with the results reported in the literature [42,43,44]. The colocated TMR threshold seen here (M = 5.46 dB, 95% CI [4.80, 6.12]) is comparable to the threshold reported earlier in the literature. Also, the lower SRM for the simulated CI speech in comparison to the natural speech is consistent with the findings of Schoof et al. (2013) [45]. Even though the TMR thresholds became better with increasing spatial separation between the target and the maskers, it should be noted that even at the largest spatial separation (30°) tested, the TMR was positive (M = 1.29 dB, 95% CI [0.63, 1.95]), indicating that the listeners were still responding to the louder signal. The participants’ average best performance had a positive SNR, illustrating how challenging the task was with simulated CI speech. It has been shown that the listener’s susceptibility to informational masking is reduced when target identification happens within positive SNRs, thereby reducing the amount of SRM [1,46].
It should also be noted that the variability in the TMR thresholds for simulated CI speech was much larger than for natural speech. Not all participants struggled when presented with the simulated CI speech: four participants had negative TMR thresholds when the spatial separation between the target and the maskers was 30°, and two participants (IDs: 12 and 25) had negative TMR thresholds for spatial separations greater than 8°. Participant 12 had negative TMR thresholds starting at 2° of spatial separation. However, it is worth pointing out that neither of these two participants had exceptionally better thresholds in the natural speech condition. The source of this variability among participants is unknown and warrants future investigation.
To the knowledge of the authors, there are no other studies in the literature that have used spatial separations of less than 45° to measure speech identification thresholds with simulated CI speech. Also, most of the studies that have used simulated CI speech presented the auditory stimuli at much higher levels compared to the presentation level of 20 dB SL used in this study. Consistent with earlier research, the present findings highlight the limitations of CI vocoders in capturing the fine structure information that is recognized as critical for speech recognition, especially when target and masking speech overlap temporally and are semantically similar [28,47,48,49,50]. These results could also provide insight into the development of new signal processing algorithms for binaural CIs, as the listeners in this study were able to use the interaural level differences between the two ears to achieve release from masking.

5. Conclusions

The current study examined the effects of small spatial separations between the target and the maskers on spatial release from masking with simulated cochlear implant speech. Speech identification thresholds improved as the spatial separation between the target and the masker increased. This was true for both natural and simulated CI speech. Even at small spatial separations, a young, normal-hearing listener’s binaural system is so robust that it can make use of the available binaural cues to understand speech in challenging listening environments.

Author Contributions

Conceptualization: N.S. and C.P.; Methodology: N.S. and S.M.; Data Collection: S.M.; Data Analysis: N.S., S.M. and C.P.; Writing—Review and Editing: N.S. and C.P. All authors have read and agreed to the published version of the manuscript.

Funding

This study was partly supported by Towson University’s College of Health Professions’ Summer Undergraduate Research Internship awarded to SaraGrace McCannon and Towson University’s Seed Funding Grant awarded to Nirmal Srinivasan.

Institutional Review Board Statement

This study protocol was reviewed and approved by Towson University Institutional Review Board (approval #: 1703016568, 9 April 2020) in accordance with the World Medical Association Declaration of Helsinki.

Informed Consent Statement

All participants signed an informed consent form approved by the Towson University Institutional Review Board prior to the start of the study.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author, Nirmal Srinivasan, upon reasonable request.

Acknowledgments

The authors would like to thank all individuals who participated in this study. The authors would like to thank Sadie O’Neill and Morgan Barkhouse for their assistance with data collection.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Arbogast, T.L.; Mason, C.R.; Kidd, G. The effect of spatial separation on informational masking of speech in normal-hearing and hearing-impaired listeners. J. Acoust. Soc. Am. 2005, 117, 2169–2180. [Google Scholar] [CrossRef] [PubMed]
  2. Freyman, R.L.; Helfer, K.S.; McCall, D.D.; Clifton, R.K. The role of perceived spatial separation in the unmasking of speech. J. Acoust. Soc. Am. 1999, 106, 3578–3588. [Google Scholar] [CrossRef] [PubMed]
  3. Gallun, F.J.; Diedesch, A.C.; Kampel, S.D.; Jakien, K.M. Independent impacts of age and hearing loss on spatial release in a complex auditory environment. Front. Neurosci. 2013, 7, 252. [Google Scholar] [CrossRef] [PubMed]
  4. Hawley, M.L.; Litovsky, R.Y.; Culling, J.F. The benefit of binaural hearing in a cocktail party: Effect of location and type of interferer. J. Acoust. Soc. Am. 2004, 115, 833–843. [Google Scholar] [CrossRef] [PubMed]
  5. Srinivasan, N.K.; Jakien, K.M.; Gallun, F.J. Release from masking for small spatial separations: Effects of age and hearing loss. The J. Acoust. Soc. Am. 2016, 140, EL73–EL78. [Google Scholar] [CrossRef]
  6. Helfer, K.S. Aging and the binaural advantage in reverberation and noise. J. Speech Lang. Hear. Res. 1992, 35, 1394–1401. [Google Scholar] [CrossRef]
  7. Bronkhorst, A.W.; Plomp, R. Effect of multiple speechlike maskers on binaural speech recognition in normal and impaired hearing. J. Acoust. Soc. Am. 1992, 92, 3132–3139. [Google Scholar] [CrossRef]
  8. Marrone, N.; Mason, C.R.; Kidd, G. Tuning in the spatial dimension: Evidence from a masked speech identification task. J. Acoust. Soc. Am. 2008, 124, 1146–1158. [Google Scholar] [CrossRef]
  9. Stecker, G.C.; Gallun, F. Binaural hearing, sound localization, and spatial hearing. In Translational Perspectives in Auditory Neuroscience: Normal Aspects of Hearing; Tremblay, K., Burkard, R.F., Eds.; Plural Publishing: San Diego, CA, USA, 2012; pp. 383–433. [Google Scholar]
  10. Best, V.; Gallun, F.J.; Ihlefeld, A.; Shinn-Cunningham, B.G. The influence of spatial separation on divided listening. J. Acoust. Soc. Am. 2006, 120, 1506–1516. [Google Scholar] [CrossRef]
Figure 1. The left panel shows speech identification thresholds (defined as the target-to-masker ratio required to correctly identify 50% of the target speech) for natural speech, and the right panel shows the same for simulated CI speech, at all spatial separations tested in the study. Within each panel, the dark black lines indicate the average threshold, and the light blue lines indicate thresholds for individual listeners (listener IDs are plotted as unique points). Error bars in all panels indicate ±1 SEM. Blue stars indicate that the thresholds at the corresponding spatial separation were significantly different from the colocated threshold.
Figure 2. The left panel shows spatial release from masking (SRM; defined as the difference between colocated and separated speech identification thresholds) for natural speech, and the right panel shows the same for simulated CI speech, at all spatial separations tested in the study. Within each panel, the dark black lines indicate the average SRM, and the light blue lines indicate SRM for individual listeners (listener IDs are plotted as unique points). Error bars in all panels indicate ±1 SEM. Blue stars indicate that the SRM at the corresponding spatial separation was significantly different from no release from masking (0 dB).
Figure 3. Scatter plots of natural speech versus simulated CI speech identification thresholds at all spatial separations tested. Individual participants' IDs are used as points in each panel. Pearson's correlation coefficients between the identification thresholds are also indicated in each panel. All correlations are significant at the 0.01 level. The line within each panel indicates the best-fit line to the data.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

Srinivasan, N.; McCannon, S.; Patro, C. Spatial Release from Masking for Small Spatial Separations Using Simulated Cochlear Implant Speech. J. Otorhinolaryngol. Hear. Balance Med. 2024, 5, 18. https://doi.org/10.3390/ohbm5020018

