Relationship between Behavioral and Objective Measures of Sound Intensity in Normal-Hearing Listeners and Hearing-Aid Users: A Pilot Study

Background: For hearing-impaired individuals, hearing aids are clinically fit according to subjective measures of threshold and loudness. The goal of this study was to evaluate objective measures of loudness perception that might benefit hearing aid fitting. Method: Seventeen adult hearing aid users and 17 normal-hearing adults participated in the study. Outcome measures included categorical loudness scaling, cortical auditory evoked potentials (CAEPs), and pupillometry. Stimuli were 1-kHz tone bursts presented at 40, 60, and 80 dBA. Results: Categorical loudness scaling showed that loudness significantly increased with intensity for all participants (p < 0.05). For CAEPs, high intensity was associated with greater P1, N1, and P2 peak amplitude for all listeners (p < 0.05); a significant but small effect of hearing aid amplification was observed. For all participants, pupillometry showed significant effects of high intensity on pupil dilation (p < 0.05); there was no significant effect of hearing aid amplification. A Focused Principal Component Analysis revealed significant correlations between subjective loudness and some of the objective measures. Conclusion: The present data suggest that intensity had a significant impact on loudness perception, CAEPs, and pupil response. The correlations suggest that pupillometry and/or CAEPs may be useful in determining comfortable amplification for hearing aids.


Introduction
Hearing-impaired individuals experience decreased auditory dynamic range due to recruitment and the loss/dysfunction of outer hair cells [1]. A dysfunction of outer hair cells reduces cochlear mechanical amplification of low-intensity sounds without altering cochlear mechanical responses to high-intensity sounds [2,3]. Approximately 60-70% of hearing loss can be explained by a loss of cochlear amplification [4,5]. To restore amplification, hearing-impaired individuals can use hearing aids (HAs), which are fit according to individual patterns of hearing loss, as reflected by the audiogram, with compression applied to increase the auditory dynamic range [6,7]. However, these clinical fitting methods do not guarantee patient satisfaction with their HAs. Patients may need several adjustments to their hearing aids before being satisfied with the fittings [8-13].
While HAs are clinically fit using subjective measures of threshold and comfortable loudness, some individuals may not be able to provide subjective judgements of loudness due to difficulties in communication [14]. For very young children, proper HA fitting is essential for successful outcomes. Early rehabilitation is recognized as a prognostic factor listeners and individuals with hearing loss that use HAs. The secondary objective was to analyze subjective loudness perception, which is well known to be linked to sound intensity, and to observe potential associations between behavioral and objective measures. If associations were to be found, objective measures may predict auditory comfort or discomfort, and could be used in clinical HA fitting.

Participants
Seventeen adult (9 women, 8 men), right-handed, native French speakers with presbycusis and moderate sensorineural hearing loss (SNHL) according to Bureau International d'Audiophonologie (BIAP) criteria [65] participated in the study. All hearing-impaired participants had been bilateral hearing aid (HA) users for at least one year and wore their HAs at least 6 h per day; the mean daily HA use was 10.0 ± 3.2 h according to data logging. Participants were recruited from the Cochlear Implant unit of the Otolaryngology Department at University Hospital of Tours, France. None of the participants had reported neurologic disorders or history of Meniere's disease. The mean age at testing was 76.0 ± 7.0 years (range: 62-88 years), and all had a mini mental state score of 30/30. Throughout this study, these participants are referred to as the "HA group". Table 1 shows demographic information for the HA group. The mean unaided air pure tone average (PTA) threshold across 0.5, 1.0, 2.0, and 4.0 kHz was 47.7 ± 14.0 dB HL for the left ear and 47.5 ± 11.7 dB HL for the right ear; a paired t-test showed no significant difference in PTA thresholds between the right and left ear (p = 0.902). The unaided speech audibility threshold (SAT) for French monosyllable words [66] was 52.9 ± 12.5 dB HL for the left ear and 51.7 ± 7.4 dB HL for the right ear; a paired t-test showed no significant difference in SATs between the right and left ear (p = 0.627). The mean amplification for PTA thresholds (the difference between aided and unaided PTA thresholds across both ears, evaluated in the sound field) was 16.4 ± 8.6 dB HL. The mean amplification for SATs (the difference between aided and unaided SATs across both ears, evaluated in the sound field) was 12.5 ± 8.0 dB SPL.
Seventeen normal-hearing (NH) adults (11 women, 6 men) served as experimental controls. All had PTA thresholds ≤25 dB HL and none had any reported neuronal disease. The mean age at testing was 64.0 ± 3.9 years (range: 59-75 years), and all had a mini mental state score of 30/30. Throughout the study, these participants are referred to as the "NH group".
The Ethics Committee of the University Hospital of Tours specifically approved the protocol for participants in the HA group (N° ID RCB No. 2015-A01249-40) and the NH group (N° ID RCB No. 2017-A00756-47). Written informed consent was obtained from all participants.

Hearing Aids
All participants in the HA group used in-the-ear receiver HAs. All HA devices were less than 5 years old and were from Oticon (n = 10) or Bernafon (n = 7). HAs were fitted using a receiver of 85 dB and all had the same frequency range (8 kHz) and maximal gain (55 dB), as measured using a 2-cc coupler. HAs were programmed for omnidirectional amplification with a deactivated volume control. HAs were fit using NOAH software and the manufacturer's programming module, and all were fit using the NAL-NL2 prescription [6]. The HA gain was checked using a real-ear insertion response (REIR) measurement; an international speech test signal (ISTS) was presented at 65 dB SPL at 45° azimuth and recorded from the ear canal with the probe-tube microphone of the Affinity module to check that the gain was consistent with hearing loss.

Procedure
All tests were conducted in a single session, which lasted an average of 65 min for the NH group and 100 min for the HA group, for whom tests were conducted with and without their HAs. The order of testing was EEG recording (15 min), followed by pupillometry (15 min), and then loudness rating (5 min). Between EEG recording and pupillometry, participants were given a 30-min break. If a participant fell asleep during recording, the test was stopped and a break was given before continuing the session.

Categorical Loudness Scaling
Three 1-kHz tone bursts (total duration = 1 s) at three intensities (40, 60, 80 dBA) were presented in random order via two loudspeakers positioned 1.3 m away from the participant and at −45° and +45° relative to center. After the stimulus presentation, participants were asked to rate the loudness according to a scale that ranged from 0 (inaudible level) to 10 (intolerable level). Aided and unaided (with and without HA) loudness ratings were obtained from the HA group. For all participants, loudness ratings were obtained 3 times for each intensity.

Cortical Auditory Evoked Potentials (CAEPs) Stimuli
Stimuli (n = 360) were 1-kHz tone bursts (duration = 200 ms) generated using MATLAB (Mathworks, Natick, MA). Stimuli were presented at 40 (n = 120), 60 (n = 120), and 80 (n = 120) dBA via two loudspeakers situated 1.3 m away from the participant and at −45° and +45° relative to center. The inter-stimulus interval (ISI) varied from 2 to 3 s (offset to onset). For the HA group, cortical recordings were made with and without the HAs in two separate sessions. The orders of the stimulus intensity and listening condition (with or without HA) were randomized. The neurophysiological recordings took approximately 30 min for each session. During the recording session, participants sat in a sound-attenuated room and watched a silent movie.
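The stimulus design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' MATLAB code: the 44.1-kHz sampling rate, the 10-ms raised-cosine ramps, and the function names (tone_burst, make_schedule, amplitude_ratio) are hypothetical choices that the text does not specify. Absolute calibration to dBA depends on the playback chain and is not modeled here; only the relative amplitude between levels is computed.

```python
import math
import random

def tone_burst(freq_hz=1000.0, dur_s=0.2, fs_hz=44100, ramp_s=0.01):
    """Sine tone burst with raised-cosine onset/offset ramps (fs and ramp are assumptions)."""
    n = int(round(dur_s * fs_hz))
    n_ramp = int(round(ramp_s * fs_hz))
    samples = []
    for i in range(n):
        gain = 1.0
        if i < n_ramp:                      # onset ramp
            gain = 0.5 * (1 - math.cos(math.pi * i / n_ramp))
        elif i >= n - n_ramp:               # offset ramp
            gain = 0.5 * (1 - math.cos(math.pi * (n - 1 - i) / n_ramp))
        samples.append(gain * math.sin(2 * math.pi * freq_hz * i / fs_hz))
    return samples

def amplitude_ratio(level_db, ref_db=80.0):
    """Linear amplitude of one level relative to a reference level (20 dB = factor 10)."""
    return 10 ** ((level_db - ref_db) / 20.0)

def make_schedule(levels_dba=(40, 60, 80), n_per_level=120,
                  isi_range_s=(2.0, 3.0), seed=0):
    """Randomized trial list of (level, ISI) pairs with a jittered offset-to-onset ISI."""
    rng = random.Random(seed)
    levels = [lv for lv in levels_dba for _ in range(n_per_level)]
    rng.shuffle(levels)
    return [(lv, rng.uniform(*isi_range_s)) for lv in levels]

burst = tone_burst()        # 200-ms, 1-kHz burst
schedule = make_schedule()  # 360 trials, 120 per level
```

With these settings, the 40-dBA burst has one hundredth of the amplitude of the 80-dBA burst, and the schedule contains 360 trials with ISIs drawn uniformly between 2 and 3 s.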

Electroencephalogram (EEG) Data Recording
EEG data were recorded using a Compumedics Neuroscan EEG system (Synamps RT amplifier and Curry 7 software) with 64 electrodes referenced online to the nose. All electrodes were placed according to the international 10-20 electrode placement standard. Electrode impedances were kept below 5 kΩ. In addition, electrooculogram (EOG) activity was recorded from electrodes placed at the outer canthi of both eyes (horizontal EOG) and above and below the right eye (vertical EOG). The EEG data were recorded with a sampling frequency of 500 Hz and low-pass filtered at 200 Hz. The stimulus presentation was controlled by Presentation software.
EEG analysis was performed using ELAN software [67]. EEG recordings were filtered by a band-pass filter (0.3-70 Hz). Artifacts resulting from eye movements were removed using independent component analysis, and movement artifacts characterized by a high-frequency or high-amplitude signal were discarded manually by the experimenter. Afterwards, EEG was segmented into epochs from −100 to 500 ms relative to the stimulus onset. The epochs were baseline-corrected relative to a 100-ms pre-stimulus time window, and a digital zero-phase-shift low-pass filter of 30 Hz was applied. The mean number of epochs varied: 81.7 ± 14.6 (without HA), 82.3 ± 13.7 (with HA), and 85.6 ± 12.5 (NH) at 40 dBA; 80.8 ± 16.0 (without HA), 81.6 ± 13.6 (with HA), and 83.5 ± 12.9 (NH) at 60 dBA; and 77.5 ± 17.7 (without HA), 80.8 ± 14.3 (with HA), and 84.4 ± 12.7 (NH) at 80 dBA.
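The segmentation and baseline-correction steps described above can be illustrated with a minimal Python sketch for a single channel. It is not the ELAN pipeline used in the study: the band-pass filter, ICA-based artifact removal, manual rejection, and the 30-Hz zero-phase low-pass filter are all omitted, and the function names and synthetic signal are hypothetical. Only the sampling rate (500 Hz), the epoch window (−100 to 500 ms), and the 100-ms pre-stimulus baseline come from the text.

```python
FS_HZ = 500                  # sampling rate reported in the text
PRE_S, POST_S = 0.1, 0.5     # epoch window: -100 ms to +500 ms around stimulus onset

def extract_epoch(signal, onset_idx, fs_hz=FS_HZ, pre_s=PRE_S, post_s=POST_S):
    """Cut one epoch and subtract the mean of its pre-stimulus baseline."""
    n_pre = int(round(pre_s * fs_hz))
    n_post = int(round(post_s * fs_hz))
    start, stop = onset_idx - n_pre, onset_idx + n_post
    if start < 0 or stop > len(signal):
        return None                           # epoch runs off the recording
    epoch = signal[start:stop]
    baseline = sum(epoch[:n_pre]) / n_pre     # mean over the 100-ms pre-stimulus window
    return [v - baseline for v in epoch]

def average_epochs(signal, onsets):
    """Average all complete, baseline-corrected epochs sample by sample."""
    epochs = [e for e in (extract_epoch(signal, o) for o in onsets) if e is not None]
    return [sum(col) / len(epochs) for col in zip(*epochs)]

# Synthetic channel: constant 5-unit offset plus a 2-unit deflection 100 ms after each onset.
onsets = [20, 1000, 2000, 3000]   # the first onset is too early and is dropped
signal = [5.0] * 5000
for o in onsets[1:]:
    signal[o + 50] += 2.0
erp = average_epochs(signal, onsets)
```

In the averaged trace, the constant offset is removed by the baseline correction and the deflection appears at sample index 100, which corresponds to 100 ms after stimulus onset.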

Stimuli
Stimuli were 1-kHz tone bursts (duration = 1 s) generated using MATLAB (Mathworks, Natick, MA, USA). Seven stimuli for each of the 40, 60, and 80 dBA intensities were presented via 2 loudspeakers situated 1.3 m away from the subject and at −45° and +45° relative to center. The order of intensity presentation was randomized, and the ISI ranged from 20 to 30 s (offset to onset). Pupillometry was measured with and without HAs in the HA group in two separate sessions.

Data Recording
Visual stimuli were presented using an SMI iView X RED (version 2.8) remote eye-tracking system with a spatial resolution of 0.03° and a sampling frequency of 500 Hz. The system consisted of a computer equipped with two cameras sensitive to infrared light as well as a light source. A PC screen was positioned on top of the pupillometry system, about 45 cm away from the participant's head. The illumination of the room (20 lux) was kept constant during the experiment for all participants. Participants did not need to wear any eye-tracking equipment, as the corneal reflection of infrared light allowed for monitoring of ocular behavior.
The experimental paradigm is presented in Figure 1. First, a white image was presented for 2 s, followed by a black image for 2 s, and finally a white image for 2 s to record the photo-motor response. For the rest of the experimental run, the visual stimulation was a gray image (20 lux) and included a cross to direct participants' gaze on the screen. Experimental data recording began after 2 s of silence. Participants were seated in a comfortable armchair in a silent room and were asked to keep calm, not move, and watch the screen. For the HA group, pupillometry was measured with and without HAs.
Figure 1. Pupillometry paradigm. Progression of the visual stimuli: first a white image was presented for 2 s, followed by a black image for 2 s, followed by a white image for 2 s, followed finally by a gray image, which remained present during the experimental run. A 1-kHz tone burst was presented at an interval of 20-30 s.

Analysis
The baseline pupil size was determined as the average pupil size in the 1.0 s interval preceding the auditory stimulation. Only pupil sizes between 1 and 9 mm were considered for the analysis [68]. All remaining traces were baseline corrected by subtracting the baseline value from each time point within that trace. The mean pupil diameter at each intensity presentation was calculated by averaging the pupil diameter between stimulus onset and 5 s after stimulus offset. The mean pupil response at each intensity presentation was estimated during a period of 5 s from baseline (1 s before to 4 s after stimulus onset). If the pupil data contained more than 50% blinks between the start of the baseline and the prompt signal, it was excluded from the analysis. After this, the mean curves for each stimulation level were generated using MATLAB. Two pupil measures were extracted from the average trace: (1) peak dilation amplitude, defined as the maximum pupil diameter after the onset of the tone burst (peak level minus baseline); and (2) latency of the peak dilation amplitude. Peak dilation amplitude was determined manually.
Pupil analysis data were analyzed for only 15 participants in the HA group, as data from two participants (HA-8 aided and HA-5 unaided) could not be used. For categorical loudness ratings and CAEP analysis, aided and unaided data were used for all 17 participants in the HA group.
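The pupil analysis steps described above (baseline estimation, validity range, blink-based rejection, and extraction of peak dilation amplitude and latency) can be sketched as follows. This is an illustrative Python sketch, not the authors' MATLAB analysis. Several details are assumptions: blinks are encoded as None, the rejection window is taken to be the same 5-s analysis window (1 s before to 4 s after onset), and the peak is located automatically with a maximum search, whereas the study determined peak dilation manually on the average trace.

```python
FS_HZ = 500                 # eye-tracker sampling rate reported in the text
BASELINE_S = 1.0            # baseline: 1.0 s preceding the auditory stimulus
POST_S = 4.0                # analysis window extends 4 s after stimulus onset
VALID_MM = (1.0, 9.0)       # only pupil sizes between 1 and 9 mm are kept

def analyze_trial(trace_mm, onset_idx, fs_hz=FS_HZ, max_invalid=0.5):
    """Return (peak_dilation_mm, peak_latency_s), or None if the trial is rejected."""
    n_base = int(round(BASELINE_S * fs_hz))
    n_post = int(round(POST_S * fs_hz))
    window = trace_mm[onset_idx - n_base:onset_idx + n_post]
    valid = [v is not None and VALID_MM[0] <= v <= VALID_MM[1] for v in window]
    if valid.count(False) > max_invalid * len(window):
        return None                                   # more than 50% blinks/invalid samples
    base_vals = [v for v, ok in zip(window[:n_base], valid[:n_base]) if ok]
    if not base_vals:
        return None
    baseline = sum(base_vals) / len(base_vals)
    # Baseline-corrected post-onset samples, keeping their index relative to onset.
    post = [(i, v - baseline)
            for i, (v, ok) in enumerate(zip(window[n_base:], valid[n_base:])) if ok]
    if not post:
        return None
    peak_idx, peak = max(post, key=lambda pair: pair[1])
    return peak, peak_idx / fs_hz

# Synthetic trial: 4.0-mm pupil, 0.5-mm dilation peaking 1.2 s after onset, one short blink.
onset = 500
trace = [4.0] * 2500
trace[onset + 600] = 4.5
for i in range(onset + 700, onset + 720):
    trace[i] = None
result = analyze_trial(trace, onset)
rejected = analyze_trial([None] * 2500, onset)
```

For the synthetic trial, the sketch recovers a peak dilation of 0.5 mm at a latency of 1.2 s, while a trace consisting only of blinks is rejected.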

Statistical Analyses
Analyses of variance (ANOVAs) were performed to evaluate the effects of intensity (40, 60, 80 dBA) and listening condition (NH, HA aided, HA unaided) on subjective and objective data. Within the NH and HA groups, separate repeated-measures (RM) ANOVAs were used to evaluate intensity effects (both groups) and HA effect (HA group); in cases where assumptions of normal distribution and equal variance were violated, nonparametric tests were used. Across the NH and HA groups, non-parametric tests were used to compare NH data to HA data (aided or unaided). For all analyses, the significance level was 0.05; Bonferroni or Tukey adjustments to the significance level were applied to all post hoc pairwise comparisons. Analyses were performed using IBM SPSS software.
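To make the non-parametric across-group comparison and the Bonferroni adjustment concrete, the following Python sketch computes a Kruskal-Wallis H statistic with midranks and the standard tie correction. It is an illustration only and not the SPSS procedure used in the study; the function names and the example data are hypothetical. For three listening conditions (NH, HA aided, HA unaided) the test has 2 degrees of freedom, and the chi-square tail probability for 2 degrees of freedom is exactly exp(−H/2), which the sketch uses in place of a statistics library.

```python
import math

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic using midranks and the standard tie correction."""
    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)
    rank_of, tie_term, i = {}, 0.0, 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        t = j - i                                  # size of this tie group
        rank_of[pooled[i]] = (i + 1 + j) / 2.0     # midrank of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j
    rank_sum_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n_total * (n_total + 1)) * rank_sum_term - 3.0 * (n_total + 1)
    return h / (1.0 - tie_term / (n_total ** 3 - n_total))

def p_three_groups(h):
    """Chi-square tail probability for df = 2 (three groups): exactly exp(-H/2)."""
    return math.exp(-h / 2.0)

def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni adjustment."""
    return alpha / n_comparisons

# Hypothetical example with three fully separated groups of three observations each.
h_demo = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
p_demo = p_three_groups(h_demo)
alpha_two_comparisons = bonferroni_alpha(0.05, 2)
```

In this example H is 7.2 and the adjusted significance level for two post hoc comparisons is 0.025. The tail formula applies only to the three-group case; other numbers of groups require the general chi-square distribution.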
Focused Principal Component Analysis (FCPA) was used to characterize relationships among the behavioral and objective measures [69]. FCPA is based on Principal Component Analysis (PCA) and converts the structure of a correlation matrix into a distance matrix. FCPA allows for a graphical representation of associations between the dependent variable (here, subjective loudness scaling) and explanatory variables (here, objective measures of pupillometry and CAEPs), as well as the relationships among the explanatory variables [69,70]. FCPA was performed using the psy package in R software, and figures were generated using the corrplot package in R software.

Categorical Loudness Scaling
All data are reported in Supplementary Material (S1). Figure 2 shows boxplots of loudness ratings for the NH and HA groups (aided or unaided). In general, loudness ratings increased with intensity for both groups. For the NH group, mean loudness ratings were 2.9 ± 1.7, 5.4 ± 1.5, and 7.8 ± 1.4 at 40, 60, and 80 dBA, respectively. An RM ANOVA was performed on the NH data, with intensity (40, 60, 80 dB) as the factor. Results showed a significant effect of intensity [F (2,32) = 124.0, p < 0.001]; Bonferroni-adjusted post hoc pairwise comparisons showed significant differences among all three intensities (p < 0.001 in all cases). For the HA group, mean loudness ratings with the HA off were 1.8 ± 1.4, 4.3 ± 2.3, and 7.2 ± 1.4 at 40, 60, and 80 dB, respectively; mean ratings with the HA on were 3.4 ± 1.3, 5.5 ± 1.1, and 7.2 ± 1.3 at 40, 60, and 80 dB, respectively. An RM ANOVA was performed on the HA group data, with HA (on, off) and intensity (40, 60, 80 dB) as factors. Results showed significant effects of HA [F (1,32) = 16.3, p < 0.001] and intensity [F (2,32) = 77.7, p < 0.001]; there was a significant interaction [F (2,32) = 11.0, p < 0.001]. Bonferroni-adjusted post hoc pairwise comparisons showed significant differences among all intensities for both listening conditions (p < 0.001 in all cases), and significantly higher ratings with the HA on than off at 40 and 60 dB (p < 0.001 in both cases), but not at 80 dB.
To determine across-group differences, separate Kruskal-Wallis ANOVAs on ranked loudness rating data were performed at each intensity, with listening group (NH, HA aided, and HA unaided) as the factor. Results showed significant effects of listening group at 40 dB (df = 2, H = 11.1, p = 0.004) and 60 dB (df = 2, H = 7.4, p = 0.024); there was no significant effect at 80 dB. After Bonferroni adjustment, post hoc Dunn pairwise comparisons showed no significant difference between the NH and the HA-aided or HA-unaided loudness ratings.
Figure 3 shows mean CAEP data for the NH and HA groups (aided and unaided). In general, peak amplitude increased with intensity for GFP and at Cz, and latency decreased with intensity. For the HA group, amplitude was generally higher and latency was generally earlier with the HA on than with the HA off. Amplitude and latency values were generally similar between the NH and HA group with the HA on, with slightly lower amplitudes and longer latencies observed for the HA group with the HA off. Mean and standard deviation for peak CAEP values for the NH group and the HA group (aided and unaided) are shown in Table 2.
The effects of intensity (40, 60, 80 dB) on CAEP responses were analyzed within the NH group using RM ANOVAs or non-parametric tests, as appropriate; complete results are shown in Table 3. For Cz amplitude, significant effects of intensity were observed at N1 (80 > 60 or 40 dB), P2 (80 > 60 or 40 dB), and N1-P2 (80 > 60 > 40 dB). For Cz latency, a significant effect of intensity was observed only at N1 (40 > 60 or 80 dB). For GFP amplitude, significant effects of intensity were observed at P1 (80 > 40 dB), N1 (80 > 60 > 40 dB), and P2 (80 > 60 > 40 dB). For GFP latency, significant effects of intensity were observed at P1 (40 > 80) and N1 (40 > 60, 80).
The effects of intensity (40, 60, 80 dB) and HA (on or off) on CAEP responses were analyzed within the HA group (aided and unaided) using RM ANOVAs or non-parametric tests, as appropriate; complete results are shown in Table 4. For Cz amplitude, significant effects of intensity were observed at P1 (80 > 60 or 40 dB), N1 (80 > 60 or 40 dB), P2 (80 > 60 or 40 dB), and N1-P2 (80, 60 > 40 dB); there was no effect of HA. For Cz latency, significant effects of intensity were observed at P1 (40 > 80) and at N1 with the HA on (40 > 60, 80). For GFP amplitude, significant effects of intensity were observed at P1 (80 > 60 > 40), N1 (80 > 60 > 40), and P2 (80 > 40); significant effects of HA were observed at P1 (on > off) and N1 (off > on). For GFP latency, significant effects of intensity were observed at P1 (40 > 60, 80) and N1 (40 > 60 > 80); there was no significant effect of HA.
Table 4. Results of RM ANOVAs performed on HA CAEP amplitude and latency data (aided and unaided) at Cz and for global field power (GFP); in cases where assumptions of normality and equal variance were violated, non-parametric tests were performed (shown in lower part of the table). For the HA factor, the HA was on or off. Significant effects are indicated by asterisks and italics; post hoc significant differences are shown after Bonferroni or Tukey correction for multiple comparisons.
CAEP responses were compared between the NH group and the HA group with the HA on or off at each intensity using Kruskal-Wallis ANOVAs on ranked data; complete results are shown in Table 5. At Cz, N1 amplitude was significantly lower for the NH group than for the HA group with the HA off at all intensities. N1-P2 amplitude was significantly higher for the NH group compared to the HA group with the HA off at all intensities and with the HA on at 40 dBA. P1 latency was significantly shorter for the NH group compared to the HA group with the HA on at all intensities and with the HA off at 60 dBA. N1 latency was significantly shorter for the NH group compared to the HA group with the HA on at 40 and 60 dBA and with the HA off at 60 dBA. For GFP, N1 amplitude was significantly larger for the NH group than for the HA group with the HA off at 40 dB; there were no other significant differences between the NH group and the HA group with the HA on or off. P1 latency was significantly shorter for the NH group compared to the HA group with the HA on at all intensities and with the HA off at 80 dBA. N1 latency was significantly shorter for the NH group compared to the HA group with the HA on or off at 40 and 60 dBA.
Table 5. Results of Kruskal-Wallis ANOVAs on ranked data. CAEP amplitude and latency data (aided and unaided) were compared across the NH and HA groups (aided or unaided) at Cz and for global field power (GFP) at each intensity. Significant effects are indicated by asterisks and italics; post hoc significant differences are shown after Bonferroni correction (adjusted p = 0.025) for multiple comparisons (Dunn) where NH data were compared to HA data with the HA on or off.
… Table 6. In general, pupil diameter and latency increased with intensity. Within the NH group, an RM ANOVA showed significant effects of intensity on pupil diameter [F (2, 32) = 27.1, p < 0.001] and latency [F (2, 32) = 21.5, p < 0.001]. Post hoc Bonferroni pairwise comparisons showed that both pupil diameter and latency were significantly lower at 40 dB than at 60 or 80 dB (p < 0.001 in all cases). Within the HA group, an RM ANOVA showed a significant effect of intensity on pupil diameter [F (2, 28) = 3.8, p = 0.036]; post hoc Bonferroni pairwise comparisons showed that pupil diameter was significantly lower at 40 dB than at 80 dB (p = 0.032). There was no significant effect of HA (on or off) on pupil diameter or latency. A Kruskal-Wallis ANOVA on ranked data showed a difference across the NH and HA groups only for pupil latency at 80 dBA (df = 2, H = 9.5, p = 0.009).
After Bonferroni adjustment for multiple comparisons, latency was significantly higher for the NH group than for the HA group with the HA off (p = 0.008).

Relationships among Behavioral and Objective Measures Using FCPA
FCPA was performed on the subjective loudness, CAEP, and pupillometry data to explore the relationships between loudness perception and explanatory objective measures, as well as the relationships among the objective measures. The right panels of Figure 6 show the correlation matrix among all variables. The left panels of Figure 6 visualize these correlations. The strength of the correlation between the dependent variable of loudness (the center of the circle) and the explanatory objective measures is represented by the concentric circles; variables with r values ≥ 0.5 were considered to be the strongest explanatory variables, and data within the red circle indicate significant relationships (p < 0.05). The closer the variable is to the center of the plot, the stronger the correlation. The distance among the explanatory variables indicates the degree of inter-correlation. When points are close together, the variables are strongly and positively correlated. When points are diametrically opposed, the variables are strongly and negatively correlated. When points are equidistant from the origin, there is no significant inter-correlation [69,70].
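The quantities that this kind of plot is read for (the correlation of each explanatory variable with loudness, its sign, whether it meets the r ≥ 0.5 criterion, and whether it is significant at p < 0.05) can be sketched in Python as follows. This is not the psy implementation in R and does not reproduce the graphical layout. The significance check uses the Fisher z approximation as a stand-in; whether the red circle in the original software is based on this or on an exact test is an assumption here. The variable names and data are hypothetical.

```python
import math
from statistics import NormalDist

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_p(r, n):
    """Approximate two-sided p-value for r = 0 via the Fisher z transform (|r| < 1, n > 3)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

def focus_summary(dependent, explanatory, strong_r=0.5, alpha=0.05):
    """Correlate each explanatory variable with the dependent one, strongest first."""
    rows = []
    for name, values in explanatory.items():
        r = pearson_r(dependent, values)
        rows.append({
            "name": name,
            "r": r,
            "sign": "positive" if r >= 0 else "negative",
            "strong": abs(r) >= strong_r,
            "significant": fisher_p(r, len(values)) < alpha,
        })
    return sorted(rows, key=lambda row: -abs(row["r"]))

# Hypothetical data for eight listeners; names are illustrative only.
loudness = [1, 2, 3, 4, 5, 6, 7, 8]
measures = {
    "n1_amplitude": [1, 2, 3, 4, 5, 6, 8, 7],     # rises with loudness
    "n1_latency": [8, 7, 6, 5, 4, 3, 1, 2],       # falls with loudness
    "unrelated": [1, -1, -1, 1, 1, -1, -1, 1],    # uncorrelated with loudness
}
rows = focus_summary(loudness, measures)
```

In this example the first two measures are strongly and significantly related to loudness with opposite signs, and the third is neither strong nor significant, which corresponds to a point far from the center of the plot.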

Figure 6. … and explanatory variables (pupillometry and CAEPs). The circles represent Cz amplitude, the squares represent Cz latency, the up triangles represent GFP amplitude, the down triangles represent GFP latency, the stars represent pupil diameter, and the diamonds represent pupil latency. The green and yellow symbols represent positive and negative correlations between the dependent and explanatory variables, respectively. The explanatory variables within the red circle were significantly correlated with the dependent variable (p < 0.05). Right panels: Correlation matrices among the behavioral and objective measures. The bar to the right of the matrices shows the color coding for the correlation coefficients, ranging from −1 to 1. Within the matrices, the color and size of the circle represent the strength of correlation between two variables.

Discussion
To the best of our knowledge, this is the first exploratory study to compare loudness, pupillometry, and CAEPs in NH and HA listeners. Most previous studies have compared such subjective and objective measures in cochlear implant (CI) rather than HA users [20,21]. The goal of this study was to determine if pupillometry and/or CAEP could be a marker of auditory listening comfort. Results showed a strong impact of intensity level on loudness ratings, pupillometry, and CAEP responses in the NH and HA groups (with the HA on or off). For the HA group, there was little difference in behavioral responses when the HA was on or off. Significant relationships were observed between loudness ratings and some CAEP and pupillometry measures, though these differed between the NH and HA groups. Below we discuss the findings in greater detail.

Effect of Intensity Level on Loudness Perception, CAEP, and Pupillometry
Not surprisingly, loudness perception was closely related to intensity level for the NH and HA groups (Figure 2). No significant difference was observed between the NH group and the HA group with the HA on or off. Within the NH and HA groups, loudness ratings significantly increased with intensity. Within the HA group, loudness ratings were significantly higher with the HA on than off at 40 and 60 dB, but not at 80 dB. This was likely due to the compression in the HA, where softer sounds would be amplified, and loud sounds would be peak-limited.
For all groups, stimulus intensity affected CAEP waveform morphology (Figure 3). For the NH group, significant effects of intensity were observed for all CAEP responses except for P1 amplitude at Cz, P2 latency at Cz, and P2 latency GFP (Table 3). Similarly, for the HA group, significant effects of intensity were observed for all CAEP responses except for P2 latency at Cz and P2 latency GFP (Table 4). Shorter peak latencies signify decreased neural conduction time, and higher amplitudes represent increased response strength [40][41][42][43][44][45][46][47][48][49][50][51][52][53]. Similar patterns were observed in the present study, especially for P1 and N1 responses. Neural encoding for sound intensity has been directly linked to N1 peak amplitude [49,58]. Some significant differences in CAEP responses were observed between the NH group and the HA group, mostly for P1 latency and N1 amplitude and latency (Table 5). With the HA off, peak amplitude negativity was generally smaller for the HA group than for the NH group. This may have corresponded to the lower loudness ratings at 40 and 60 dB when the HA was off, compared to the NH group. With the HA on or off, peak latency was generally longer for the HA group than for the NH group. When the HA was on, compressor time constants may have affected latency. When the HA was off, poorer audibility may have resulted in longer latency [71].
Significant effects of intensity on pupillometry were observed for the NH and HA groups (Figure 5). For the NH group, peak pupil diameter and latency significantly increased across all intensities. These results agree with previous NH studies that showed larger pupil responses with increasing intensity [46][47][48]. The increase in pupil response could be explained by the nature of the stimuli, which were sudden and novel from trial to trial, and may have evoked automatic attentional effects. As hearing is important for warning sounds, high intensity levels could be interpreted as an alarm signal or an environmental stressor, which would disturb the autonomic nervous system. As pupil dilation depends on the autonomic nervous system [72], it adapts to the environmental demands, including auditory stimulation. Kahneman described pupillometry as an index of "load on attention capacity" [73]; high intensity stimulation likely leads to greater attention. Thus, the increase in pupil dilation with intensity could be interpreted as an automatic direction of attention to the stimulus. Pupillometry can also provide an estimate of mental effort and has been correlated with activity in the locus coeruleus [74]. High intensity may require more time to arrive at peak dilation because of noradrenaline discharge [60], resulting in longer peak latency, as observed in the present NH group.
For the HA group, peak dilation was significantly different only between 40 and 80 dB, and only when the HA was on. There was no significant difference in peak latency across intensities with the HA on or off. When the HA was on, peak intensity may have been compressed and lower intensities would likely be amplified, resulting in less differentiation across intensity than would occur with NH listeners. When the HA was off, reduced audibility may have limited the pupil response.
Interestingly, the only significant difference observed between the NH and HA groups was for pupil dilation at 80 dB when the HA was off. While pupil response to intensity was more pronounced for the NH group, the range of pupil responses was generally similar between the NH group and the HA group when the HA was on or off (Table 6).

Effect of HA Amplification on Loudness Perception, CAEP, and Pupillometry Responses
Because intensity significantly affected loudness ratings and CAEP responses in the HA group, and because softer sounds would be amplified when the HA was on, we expected that HA amplification would significantly affect loudness perception and CAEP response. HA amplification significantly increased loudness ratings only at 40 and 60 dB, with no difference at 80 dB. Note that the compression ratio was greater for high and moderate sound levels (1.7 ± 0.7) than for low sound levels (1.3 ± 0.4), which may have contributed to this finding. Alternatively, there may have been some physiological saturation at the higher 80 dB intensity level that was unaffected by HA amplification [75]. HA amplification significantly increased CAEP responses for P1 and N1 amplitude GFP across all intensities, and for P2 amplitude GFP at 80 dB (Table 4). No significant differences for HA amplification were observed for latency measures. The greater neural recruitment associated with HA amplification may increase CAEP amplitude in some cases but may not affect auditory processing time at the cortical level [76].
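To illustrate how the reported compression ratios could narrow the level differences reaching the listener, the following Python sketch applies the usual definition of a static compression ratio (input-level change divided by output-level change). It is a simplification under stated assumptions: it assumes the input lies above the compression kneepoint and ignores channel-specific processing and time constants, and the function name is hypothetical. Only the mean ratios (1.3 and 1.7) and the 20-dB spacing of the test levels come from the text.

```python
def output_level_change(input_change_db, compression_ratio):
    """Output-level change (dB) for a given input-level change.

    Assumption: static compression above the kneepoint, where
    compression ratio = input change / output change.
    """
    if compression_ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return input_change_db / compression_ratio


step_db = 20.0  # spacing between the 40, 60, and 80 dB test levels
for label, ratio in [("low levels", 1.3), ("moderate/high levels", 1.7)]:
    change = output_level_change(step_db, ratio)
    print(f"{label}: {step_db:.0f} dB input step -> {change:.1f} dB output step")
```

Under these assumptions, a 20-dB input step shrinks more at moderate and high levels than at low levels, which is in line with the smaller effect of amplification at 80 dB.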
Previous studies in which NH listeners were tested with or without simulations of HA amplification showed that CAEP amplitudes decreased with HA amplification [63] or were unaffected by HA amplification [62,64]. Other studies in individuals with hearing loss showed no change in cortical responses with HA amplification [77]. Karawani et al. (2018) found increased N1 and P2 peak amplitude with HA amplification that was correlated with improvements in working memory [78], suggesting that HA experience may enhance cortical sound processing and improve cognitive function. In the present study, we did not observe a robust effect of HA amplification, even though participants had used their HA for at least one year. One possible explanation is that daily auditory stimulation via HA did not alter the neurophysiological representation of sound at the level of the auditory cortex.
The mean changes in loudness across intensities were different with and without HA amplification. With the HA off, the mean change in loudness was 2.5 from 40 to 60 dB, and 2.9 from 60 to 80 dB; with the HA on, the mean change in loudness was 2.1 from 40 to 60 dB, and 1.7 from 60 to 80 dB. As such, loudness grew less with intensity with HA amplification than without it. It is also possible that HA amplification changes more than just the intensity of a stimulus. HA amplification may differently affect neural responses to changes in intensity, although this may be truer for more complex stimuli (e.g., speech, noise) than for the pure tone stimuli used in this study. Taken together, the present data suggest that cortical responses and pupillometry may not be good approaches to evaluate the effects of HA amplification.
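The arithmetic behind this comparison can be made explicit with a short Python sketch. The four step values are the mean loudness changes quoted above; the helper names and the per-dB slope summary are illustrative assumptions, not quantities reported in the study.

```python
# Mean loudness changes quoted in the text, in loudness-rating units.
loudness_steps = {
    "HA off": {"40-60 dB": 2.5, "60-80 dB": 2.9},
    "HA on": {"40-60 dB": 2.1, "60-80 dB": 1.7},
}


def total_growth(steps):
    """Total loudness change from 40 to 80 dB (sum of the two steps)."""
    return sum(steps.values())


def growth_per_db(steps, span_db=40.0):
    """Average loudness change per dB over the 40-80 dB span."""
    return total_growth(steps) / span_db


for condition, steps in loudness_steps.items():
    print(
        condition,
        f"total = {total_growth(steps):.1f}",
        f"per dB = {growth_per_db(steps):.3f}",
    )

# Ratio of loudness growth with the HA on relative to off.
ratio = total_growth(loudness_steps["HA on"]) / total_growth(loudness_steps["HA off"])
print(f"HA on / HA off growth ratio = {ratio:.2f}")
```

On these figures, total loudness growth from 40 to 80 dB with the HA on is roughly 70% of the growth with the HA off.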

Relationships between Subjective and Objective Measures of Auditory Intensity
FPCA revealed few significant relationships between behavioral and objective responses to intensity; the observed relationships differed among the NH group and the HA group with the HA on or off. For the NH group, loudness was significantly associated with [N1-P2] amplitude at Cz, N1 amplitude GFP, as well as pupil dilation and latency. For the HA group with the HA on, loudness was significantly associated with P1 and N1 latency at Cz, P1 amplitude GFP, and N1 latency GFP; with the HA off, loudness was significantly associated with P1 and N1 amplitude GFP.
Loudness was significantly associated with pupil dilation and latency only in the NH group. The mean dynamic range for pupil dilation was 0.13 mm for the NH group, 0.11 mm for the HA group with the HA on, and 0.07 mm for the HA group with the HA off (Table 6). Increasing age has been associated with a reduced dynamic range of pupil dilation [79]. Age might differentially affect different components of the pupil response, reflecting parasympathetic versus sympathetic effects [80]. In this study, the NH group (mean age 64.0 ± 3.9 years) was on average 12 years younger than the HA group (76.0 ± 7.0 years). The older age of the HA group may partly explain the lack of association between loudness and pupil response, due to the reduced dynamic range of pupil dilation.
Concerning CAEPs, Bakhos et al. (2014) showed that some pediatric HA users exhibit abnormal temporal brain function (absence of N1c) that may underlie language impairment [81]. Of course, children react differently than adults, and hearing loss may have different consequences for children (e.g., language development) than for elderly adults (e.g., CAEPs or pupil responses). Behavioral and objective responses to sound intensity should also be measured in children. Indeed, the principal value of objective measures lies in assessing responses to sound intensity where behavioral measures are difficult to obtain, as is the case for young children.

Limits to Study
While this study demonstrated the impact of intensity on CAEPs and pupillometry, the number of participants was limited. Indeed, a power analysis with G*Power software indicated that for power = 0.95, 43 participants would be needed for each group. Additional studies with a larger cohort should be conducted to confirm the present findings. In addition, only 1-kHz tone bursts were used, and it remains unclear how tone frequency might affect responses to sound intensity. Indeed, loudness has been shown to depend on spectral [82][83][84] and temporal [85][86][87] properties of sound, as well as other factors.
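The dependence of the required sample size on the expected effect size can be sketched in Python with a normal-approximation formula for comparing two independent group means. This is not the G*Power computation reported above: the statistical test, the effect sizes, and the function name are assumptions for illustration, and exact t-based calculations give slightly larger values.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.95):
    """Approximate participants per group for a two-sided, two-group comparison.

    Assumption: normal approximation, n = 2 * ((z_alpha + z_power) / d) ** 2,
    where d is a hypothetical standardized effect size (Cohen's d).
    """
    if effect_size_d <= 0:
        raise ValueError("effect size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


# Hypothetical effect sizes; the effect size used in the study is not stated here.
for d in (0.5, 0.8, 1.0):
    print(f"d = {d}: about {n_per_group(d)} participants per group")
```

Under this approximation, group sizes in the low forties correspond to a fairly large assumed effect, and smaller effects would call for considerably larger cohorts.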
Age at testing was significantly different between the NH and HA groups and might have contributed to group differences by affecting neural responses. However, it is difficult to recruit adults over 70 years old who still have normal hearing thresholds, given the prevalence of presbycusis. Moreover, HA devices are very complex. In this study, only Oticon and Bernafon in-ear HAs were used. HA signal processing, such as channel-specific compression time constants, noise reduction algorithms, and adaptive directionality, may affect CAEPs [88]. It may be preferable to conduct this study using the exact same devices and settings for all HA participants.
Finally, good HA outcomes likely cannot be reduced to auditory comfort markers such as CAEP or pupillometry. Many factors can contribute to good HA outcomes, such as device-specific (e.g., directional microphones, signal processing, gain settings) and patient-specific variables (age, attention, motivation, biology, personality, lifestyle) [63].

Summary of the Results
SNHL can distort perception of sound intensity. In this study, behavioral (loudness) and objective responses (CAEPs, pupillometry) to sound intensity were measured in NH and HA participants (with the HA on or off). For all groups, loudness increased with intensity; for the HA group, loudness was significantly higher with the HA on than off only at the low-to-mid intensities. Most CAEPs showed a significant response to intensity in the NH and HA groups; there was little effect of HA amplification on CAEPs. Similarly, there was a significant response in pupil dilation and latency to intensity in the NH and HA groups; there was little effect of HA amplification on pupil response. FPCA showed only a few significant relationships between behavioral and objective measures. Further research is needed to better understand the relationship between stimulus intensity, loudness perception, and objective measures.