Effects of Background Sounds on Annoyance Reaction to Foreground Sounds in Psychoacoustic Experiments in the Laboratory: Limits and Consequences

Abstract: In a variety of applications, e.g., psychoacoustic experiments, virtual sound propagation demonstration, or synthesized noise production, noise samples are played back in laboratories. To simulate realistic scenes or to mask unwanted background sounds, it is sometimes preferable to add background ambient sounds to the noise. However, this can influence noise perception. It should be ensured that either background sounds do not affect, e.g., annoyance from foreground noise or that possible effects can be quantified. Two laboratory experiments are reported, in which effects of mixing background sounds with foreground helicopter samples were investigated. By means of partially balanced incomplete block designs, possible effects of three independent variables, i.e., helicopter's sound exposure level, background type, and background sound pressure level, were tested on the dependent variable annoyance, rated on the ICBEN 11-point numerical scale. The main predictor of annoyance was helicopter's sound exposure level. Stimuli with eventful background sounds were found to be more annoying than those with less eventful background sounds. Furthermore, background type and level interacted significantly. For the major part of the background sound level range, increasing the background level was associated with increased annoyance for stimuli with eventful background sounds and with decreased annoyance for stimuli with less eventful background sounds.


Introduction
There are a variety of applications in the context of laboratory noise research or virtual acoustics for which background sounds are useful or, in many cases, necessary. The following is a non-exhaustive list in this regard.
It is possible to generate noise synthetically by means of emission synthesizers using physical and/or parametric approaches [1][2][3][4][5][6][7]. Adding recorded or synthesized (natural) background sounds to the noise samples helps to increase the degree of realism of the auralization. It is further possible to use propagation filters to simulate farther distances [4,8]. In that case, however, the recorded background sounds are also shifted to the target distance, although only the source is meant to be moved there and the background sounds are supposed to remain close to the observer. This problem can be solved by adding similar background sounds at the observer position [8].
in soundscape. It can hence be summarized that not only might different background sources (water, birds, vegetation, etc.) influence the perception of the noise differently, but the temporal structure [13,24], spectral structure [13,18,19], and level [12][13][14][25] of the background sound could also be important parameters for the altered perception of the noise. In accordance with that, Aletta et al. [26] suggested a holistic approach when categorizing the soundscape, considering the physical spectro-temporal characteristics, as well as the type and the meaning (i.e., the semantic content) of sounds.
With that background, this paper investigates possible effects of background sounds of different types and levels on the perceived annoyance of foreground noise. In particular, sounds of landing helicopters were used as an exemplary noise source. Unlike studies which investigated relatively high-level background sounds (partially masking the noise) in order to improve the urban soundscape [12][13][14], the study presented here focused on the use of background sounds in psychoacoustic experiments and in virtual acoustic applications. Section 2 of this paper briefly introduces the experimental concept. Experimental design and setup follow in Section 3. Experiments 1 and 2 and their results are presented in Sections 4 and 5, respectively. After a discussion of the outcomes in Section 6, Section 7 concludes the paper.

Experimental Concept
Acute annoyance reactions to helicopter noise were investigated under laboratory conditions. The observed annoyance ratings correspond to "short-term annoyance" [27] or "psychoacoustic annoyance" [16], similar to the reported laboratory experiments by Schäffer et al. [28] and Taghipour et al. [8]. The term "short-term" refers to the time period during and after an acoustic stimulus' playback and before the next stimulus is presented [9].
To investigate differences in short-term annoyance, stimuli were generated from field recordings (see Taghipour et al. [8]) of helicopter landings and from background ambient sounds. In both experiments presented here, three design parameters, i.e., two continuous variables, flight event's A-weighted sound exposure level, L AE,F , and background sound's A-weighted equivalent continuous sound pressure level, L Aeq,B , as well as the categorical variable background sound type (i.e., eventful vs. less eventful) were systematically varied to study their individual and combined associations with short-term annoyance ratings.
To avoid confusion between the foreground sound (i.e., the helicopter noise) and the ambient background sound, in this article, "noise" and "sound" are repeatedly used for these two categories of sounds, respectively. However, it should be noted that, in general, sounds are not to be labeled or prejudged as "noise" per se. It is rather for each individual observer to judge subjectively whether they perceive a sound as noise. Pleasantness or unpleasantness (including declaring a sound as "noise") is not a definitive, but a (subjectively) perceived quality of sound, which can only be interpreted in the context under investigation [9,29].
The choice of landing helicopters as foreground noise was rather exemplary. Recordings of a number of takeoffs and landings of civil helicopters and propeller-driven aircraft were available, which had been calibrated, analyzed, and already tested with regard to perceived annoyance [8]. In order to avoid possible variations due to the type of aircraft (helicopter vs. propeller-driven aircraft) and procedure (takeoff vs. landing), only landing helicopters were chosen for this study. Whereas the ecological validity or plausibility of this particular noise situation is supported by civil helicopters landing in everyday life for medical, rescue, special transport, or sport (hobby) purposes, it should be noted that other types of foreground noise samples (e.g., road traffic) might exhibit higher ecological validity [30]. Similar experiments are planned with other noise sources.

Listening Test Facility
The two experiments presented here were conducted in AuraLab, a listening test facility of the Empa (Swiss Federal Laboratories for Materials Science and Technology), which has a separate listening and control room allowing for audio-visual supervision to comply with ethical requirements [8,9].
AuraLab satisfies room acoustical requirements for high-quality audio reproduction in terms of its background noise and reverberation time. The 3D immersive sound system with 15 loudspeakers (KH 120 A, Georg Neumann GmbH, Berlin, Germany) in a hemispherical arrangement and two subwoofers (KH 805, Georg Neumann GmbH, Berlin, Germany) is controlled by a digital signal processor [8]. The stimuli of the experiments were played back through one loudspeaker at a distance of 2 m from the listening spot and at 0° elevation (subject's ear level), as well as through both subwoofers. The carpeted floor was covered with additional absorbers.

Recordings
The helicopter noise samples originated from field recordings of helicopter landings at Grenchen Airport, Switzerland (ICAO code: LSZG) [8]. Single-channel audio recordings were performed by means of a series of microphones (B&K 4188, Brüel & Kjær, Nærum, Denmark) and (Nor 1227, Norsonic AS, Tranby, Norway) placed 4 m above the ground on the eastern side of the airport, on the left and right sides of the extension of the runway [8]. The six recordings selected for this study were from helicopter types A109, A119, AS35, and EC20, all of which are categorized as light-weight civil helicopters (weight: 1.7-3.2 tons). Helicopter recordings were made at a sampling frequency of 44.1 kHz and with 16 bit depth.
Single-channel recordings of ambient sound were made by means of a microphone (B&K 4006, Brüel & Kjær, Nærum, Denmark) placed 2 m above the ground in and around the city of Dübendorf, Switzerland. Background sound recordings were carried out at a sampling frequency of 48 kHz and with 24 bit depth.

Signal Processing and Mixing
As mentioned above, six recordings of landing helicopters were chosen for the psychoacoustic experiments reported here. Furthermore, four background sound extracts were selected: two sounds with dominant events (i.e., a chiming church bell and singing birds) and two sounds with less dominant events (i.e., water stream and birds). Eventfulness in this context refers to a series of single dominant events with relatively high energy which arise from a rather low-level background. Less eventful sounds are, on the contrary, sounds whose momentary sound pressure level does not vary extensively from their A-weighted equivalent continuous sound pressure level, L Aeq,B [31]. The categorization of background sounds into eventful or less eventful was carried out by (qualitative) subjective evaluations by the experimenters and another environmental acoustician, as well as by objective differentiation based on the signals' intermittency ratio (IR) [31] and "L 10 − L 90 ". A series of sound temporal dynamics measures is listed in Table 1 to characterize the amount of single events in the signals. Furthermore, Figure 1 shows A-weighted spectrograms of the background sound signals.
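As a rough illustration of the two objective measures named above, the following Python sketch computes an intermittency ratio and L 10 − L 90 from a series of short-term levels. It assumes the energy-based reading of IR, i.e., the percentage of the total sound energy contributed by those portions of the signal whose level exceeds L eq + C. The function names, the simple percentile indexing, and the toy level series are illustrative assumptions and not the authors' implementation.

```python
import math

def leq(levels_db):
    """Energy-equivalent level (dB) of equally spaced short-term levels."""
    mean_energy = sum(10 ** (l / 10) for l in levels_db) / len(levels_db)
    return 10 * math.log10(mean_energy)

def intermittency_ratio(levels_db, c_db=3.0):
    """Share (%) of total energy from samples whose level exceeds Leq + C."""
    threshold = leq(levels_db) + c_db
    total = sum(10 ** (l / 10) for l in levels_db)
    events = sum(10 ** (l / 10) for l in levels_db if l > threshold)
    return 100 * events / total

def exceedance_level(levels_db, exceed_percent):
    """Level reached or exceeded during exceed_percent % of the samples."""
    ordered = sorted(levels_db, reverse=True)
    idx = min(len(ordered) - 1, int(len(ordered) * exceed_percent / 100))
    return ordered[idx]

# Toy level series (hypothetical values, not taken from Table 1)
steady = [40.0] * 100                 # nearly constant sound
eventful = [35.0] * 80 + [55.0] * 20  # quiet background with loud events

ir_steady = intermittency_ratio(steady)
ir_eventful = intermittency_ratio(eventful)
spread_eventful = exceedance_level(eventful, 10) - exceedance_level(eventful, 90)
```

In this toy case the steady series yields an IR of zero, whereas almost all of the energy of the eventful series is carried by its events, which is the kind of contrast the categorization relies on.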
The signals underwent the following processing steps to prepare them for playback in AuraLab. Most of the processing (besides bass management) was carried out with MATLAB R2016b (MathWorks, Natick, USA). In a first stage of processing, the recordings were calibrated, high-pass filtered (f c = 20 Hz), and low-pass filtered (f c = 10 kHz) to attenuate irrelevant noise (e.g., microphone inherent noise).
Table 1. Information about the background sounds at the exemplary L Aeq,B of 41 dB(A). For the calculation of intermittency ratio (IR), C is the dB-offset (i.e., the dB difference to L Aeq,B ) beyond which events were counted [31].
[Table 1 columns: Sound; Category; IR (%), C = 3 dB; IR (%), C = 5 dB; L AF,B,10]
The duration of each stimulus was set to 20 s. While background signals were available at a sampling rate of 48 kHz, helicopter signals were upsampled from 44.1 to 48 kHz, the sampling rate required by the playback system in AuraLab. The output (mixed) signals were of 16 bit depth.
Background and helicopter signals were faded in and out by squared cosine ramps of 0.5 and 4 s, respectively. That is, whereas the background sound was meant to be present throughout the stimulus duration of 20 s, the helicopter sound was meant to arise from and fade out into the background sound. The helicopter signals were cut level-symmetrically around their maximum A-weighted noise level, L ASmax,F ; i.e., in the range from L min to L ASmax,F , whereby the exact value of L min (i.e., a specific minimum level on the L AS curve) varied between flight events so as to yield a (windowed) signal duration of 20 s. The sound pressure levels of the helicopter noise signals and the background signals were then modified to reach the target levels. That is, helicopter signals were attenuated and set to the target L AE,F . Furthermore, background signals were either attenuated or amplified to scale them to the target L Aeq,B .
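The fading, level scaling, and mixing steps can be sketched in Python as follows. This is a minimal illustration under simplifying assumptions: levels are treated as unweighted RMS levels relative to an arbitrary reference (the study used A-weighted target levels on calibrated signals), and all function names are hypothetical. The original processing was carried out in MATLAB.

```python
import math

def cos2_fade(signal, fs, ramp_s):
    """Apply squared-cosine fade-in and fade-out ramps of ramp_s seconds."""
    n = int(round(ramp_s * fs))
    if 2 * n > len(signal):
        raise ValueError("ramps are longer than the signal")
    out = list(signal)
    for i in range(n):
        # sin^2 rising from 0 towards 1 is a squared-cosine ramp shifted by 90 degrees
        gain = math.sin(0.5 * math.pi * i / n) ** 2
        out[i] *= gain        # fade-in
        out[-1 - i] *= gain   # fade-out (mirror image)
    return out

def rms_level_db(signal, ref=1.0):
    """Unweighted RMS level in dB relative to ref."""
    rms = math.sqrt(sum(x * x for x in signal) / len(signal))
    return 20 * math.log10(rms / ref)

def scale_to_level(signal, target_db, ref=1.0):
    """Scale the signal so that its unweighted RMS level equals target_db."""
    gain = 10 ** ((target_db - rms_level_db(signal, ref)) / 20)
    return [gain * x for x in signal]

def mix(foreground, background):
    """Sample-wise sum of two equally long signals."""
    return [f + b for f, b in zip(foreground, background)]
```

A stimulus would then be obtained by fading each signal with its own ramp length (4 s for the helicopter, 0.5 s for the background), scaling each to its target level, and summing the two.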
It should be noted that not all stimuli were mixed. For comparison and as a baseline, the six helicopter samples were also prepared as single flight event stimuli of the target sound exposure levels of experiments 1 (six levels) and 2 (three levels), without any additional background sounds.

Experimental Sessions
The two experiments were conducted as focused listening tests with repeated measures and a partially balanced incomplete block design (PBIBD) [32]. Each block was assigned to only one subject and included only a subset of the stimuli; it was therefore incomplete. Partially balanced means that all stimuli of an experiment were offered in an approximately equal number of blocks and, hence, were rated an approximately equal number of times. Furthermore, an effort was made to partially balance every single parameter of the design for every single block/subject; for example, all the helicopter events were tested an approximately equal number of times by each subject.
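The idea of offering every stimulus in an approximately equal number of blocks can be illustrated with a small Python sketch. It uses a simple greedy rule (each block receives the stimuli that have been used least so far, with ties broken at random), which is an assumption made for illustration and not the procedure used by the authors; it also does not attempt the additional per-parameter balancing within each block. The example numbers (78 stimuli, 32 blocks of 40) are those reported for experiment 1.

```python
import random

def build_blocks(n_stimuli, n_blocks, block_size, seed=0):
    """Assign stimuli to incomplete blocks with near-equal usage counts."""
    if block_size > n_stimuli:
        raise ValueError("a block cannot contain more stimuli than exist")
    rng = random.Random(seed)
    counts = [0] * n_stimuli
    blocks = []
    for _ in range(n_blocks):
        order = list(range(n_stimuli))
        rng.shuffle(order)                    # random tie-breaking
        order.sort(key=lambda s: counts[s])   # stable sort: least-used stimuli first
        block = order[:block_size]
        for s in block:
            counts[s] += 1
        rng.shuffle(block)                    # random playback order within the block
        blocks.append(block)
    return blocks, counts

blocks, counts = build_blocks(n_stimuli=78, n_blocks=32, block_size=40, seed=1)
```

With this rule the usage counts of any two stimuli never differ by more than one, so every stimulus is rated 16 or 17 times in the example.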
The experiment was carried out individually for each subject. After reading the study information and signing a consent form to participate, the subjects answered the first part of the questionnaire about their hearing and well-being. They were then introduced to the listening test and the test software. Subjects rated the annoyance they associated with each stimulus during or directly after its playback. After the listening test, the subjects filled out the remaining part of the questionnaire about their demographic data.
Figure 2 shows the experimental software's graphical user interface (GUI). The software guided the subjects through the test and recorded their annoyance ratings. The subjects listened to a few orienting and a few training sounds before starting the main listening test. The stimuli were presented in a random order for each subject. They were played back once only, with a 1.5-s break after each complete playback. As shown in Figure 2, short-term annoyance was rated on the ICBEN 11-point numerical scale [33]. The following (modified) question was asked for every stimulus: "when you imagine that this is the sound situation in your garden, what number from 0 to 10 represents best how much you would be bothered, disturbed or annoyed by it?"

Statistical Analysis
Statistical analysis was carried out with IBM SPSS Statistics, version 25 (IBM Corporation, Armonk, USA). Tested effects were considered significant if the observed data's probability, p, of occurrence under the null hypothesis was ≤0.05.
Associations of design variables (L AE,F , L Aeq,B , and background type) with annoyance were analyzed by means of linear mixed effects models [34], combining fixed effects of the categorical variable background type, the covariates L AE,F , L Aeq,B , and playback number, subjects' random effects, and their interactions to predict the dependent variable, i.e., the annoyance ratings. It should be noted that, for the statistical analysis, the variable background type was coded with three levels, "no background", "eventful (E)", and "less eventful (L)", as the experimental hypothesis mainly focused on the eventfulness of the background.
Several models of different degrees of complexity were established. The models were compared using the Akaike information criterion (AIC) [35] and Bayesian information criterion (BIC) [36], where the model with the lowest AIC/BIC was preferred. Non-significant variables and interactions were excluded from the final models that will be presented later in this paper.
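For orientation, the two criteria can be written out in a few lines of Python using their standard definitions, AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of estimated parameters, n the number of observations, and ln L the maximized log-likelihood. The log-likelihood values and parameter counts below are made-up placeholders chosen only to show the mechanics; they are not results of this study, and statistical packages may differ in how they count parameters.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion (smaller is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical candidate models fitted to the same n_obs ratings
n_obs = 1280
simple = {"loglik": -2500.0, "k": 6}    # placeholder values
complex_ = {"loglik": -2490.0, "k": 9}  # placeholder values

aic_simple = aic(simple["loglik"], simple["k"])
aic_complex = aic(complex_["loglik"], complex_["k"])
bic_simple = bic(simple["loglik"], simple["k"], n_obs)
bic_complex = bic(complex_["loglik"], complex_["k"], n_obs)
```

In this constructed case the AIC favors the more complex model while the BIC, which penalizes additional parameters more strongly at this sample size, favors the simpler one. This shows why it is informative to consult both criteria.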

Experiment 1
Experiment 1 focused on the scenario in which the level of the background sound was reasonably low (i.e., compared to the level of the foreground helicopter noise). That is, mixed signal sound levels were not to be considerably higher than flight events' sound levels:

Experimental Design
Three design variables (i.e., independent variables) were considered in the design of experiment 1: L AE,F (six levels), L Aeq,B (four levels), and background type (five levels). The partially balanced incomplete block design consisted of 32 blocks (i.e., subjects), each containing 40 stimuli.
Flight events' L AE,F were set to 68, 70, 72, 74, 76 or 78 dB(A). Background sounds' L Aeq,B were set to 6 (i.e., background level of AuraLab in case of "no background sound"), 34, 41, or 48 dB(A). Four background sounds were used which were either eventful, E, or less eventful, L: water stream (L), birds and vegetation (L), birds and vegetation (E), and church bell (E). An additional category was coded as "no background" when no background sound was available in the corresponding stimulus.
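The size of the stimulus set follows from these factor levels. The sketch below enumerates it in Python under the assumption that the three added background levels were fully crossed with the four background sounds and the six flight event levels, and that each flight event level was additionally presented once without added background. This assumption reproduces the total of 78 stimuli reported in the results; the variable names are illustrative.

```python
from itertools import product

flight_levels = [68, 70, 72, 74, 76, 78]  # L_AE,F in dB(A)
background_levels = [34, 41, 48]          # L_Aeq,B in dB(A), added backgrounds only
backgrounds = [
    ("water stream", "L"),
    ("birds and vegetation", "L"),
    ("birds and vegetation", "E"),
    ("church bell", "E"),
]

# Mixed stimuli: every flight level with every background sound at every level
mixed = [
    {"L_AE_F": f, "L_Aeq_B": b, "background": name, "category": cat}
    for f, b, (name, cat) in product(flight_levels, background_levels, backgrounds)
]

# Baseline stimuli: helicopter only, AuraLab background level of 6 dB(A)
baseline = [
    {"L_AE_F": f, "L_Aeq_B": 6, "background": None, "category": "no background"}
    for f in flight_levels
]

stimuli = baseline + mixed
n_stimuli = len(stimuli)  # 6 + 6 * 3 * 4
```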

Subjects
Thirty-two subjects (13 females and 19 males) participated in the main experiment. All subjects declared to have normal hearing (self judgement) and to feel well. They were aged between 21 and 50 yr (median 27.5 yr).

Results
In total, 1280 annoyance ratings were collected (32 subjects × 40 stimuli), each of which was assigned to a particular subject and a specific stimulus (of the total 78 stimuli). The results of experiment 1 are illustrated in Figure 4.
The following simple linear mixed effects model was found to be effective to predict annoyance:
$$y_{ik} = \mu + \tau_{B,i} + \beta \cdot L_{AE,F} + \theta \cdot S_{ik} + u_k + \varepsilon_{ik} \qquad (1)$$
In Equation (1), y ik is the dependent variable annoyance, predicted for each background type and subject at different helicopter sound exposure levels and playback numbers. µ is the overall grand mean and τ B,i denotes the categorical variable background type (three levels: i = 1, 2, 3). Parameters β and θ are regression coefficients for L AE,F and the playback number S ik , respectively. The (unstructured) random effect term u k is the subjects' random intercept, for k = 1, . . . , 32. Finally, the error term ε ik is the random deviation between observed and expected values of y ik . Model coefficients are listed in Table 2.
The main predictor of annoyance was the flight event's sound exposure level, L AE,F (p < 0.001). Annoyance increased with increasing L AE,F . Background type was a further significant predictor of annoyance (p < 0.05). Whereas no significant difference was found between average annoyance ratings for stimuli without background sounds and those with less eventful background sounds, stimuli containing eventful background sounds were found to be more annoying on average than the other two categories. Furthermore, annoyance increased slightly, but significantly, with the playback number (p < 0.001).
A second, more complex linear mixed effects model was found to be optimal to predict annoyance:
$$y_{ik} = \mu + \tau_{B,i} + \beta \cdot L_{AE,F} + \gamma \cdot L_{Aeq,B} + \gamma_{B,i} \cdot L_{Aeq,B} + \theta \cdot S_{ik} + u_k + \varepsilon_{ik} \qquad (2)$$
In Equation (2), y ik , µ, τ B,i , β, θ, u k , and ε ik are the same parameters as in Equation (1). γ is the regression coefficient for L Aeq,B and γ B,i represents the interaction between background type and background level L Aeq,B . Model coefficients are listed in Table 3.
The main effect of the background level (L Aeq,B ) on annoyance was not significant (p > 0.05). The significant interaction between background type and L Aeq,B in predicting annoyance (p < 0.05) can be observed in Figure 5. While, with increasing L Aeq,B , annoyance from stimuli with eventful background sounds slightly increased, it decreased for the stimuli with less eventful background sounds. Furthermore, practically no difference could be observed between annoyance from the stimuli with less eventful and those with eventful background sounds at a background level of 34 dB(A). That is, when the level of the background sound was sufficiently low, it did not matter which type of background sound was used; at this background sound pressure level, annoyance was not different on average from that for the stimuli without background sound.
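To make the structure of the fixed-effects part of such a model concrete, the following Python sketch evaluates a prediction of the form described for Equation (2), without the subject-specific random intercept. Only β = 0.413 and θ = 0.017 are taken from this paper (they are reported in the Discussion for Equation (2)); all other coefficients are set to zero as explicit placeholders, so only differences that depend on β or θ are meaningful. Function and key names are hypothetical.

```python
def predict_annoyance(l_ae_f, l_aeq_b, playback, bg_type, coef):
    """Fixed-effects prediction: grand mean, background type, levels, playback number."""
    return (
        coef["mu"]
        + coef["tau"][bg_type]
        + coef["beta"] * l_ae_f
        + (coef["gamma"] + coef["gamma_int"][bg_type]) * l_aeq_b
        + coef["theta"] * playback
    )

coef = {
    "mu": 0.0,                                      # placeholder, not from the paper
    "tau": {"none": 0.0, "E": 0.0, "L": 0.0},       # placeholders
    "beta": 0.413,                                  # reported for Equation (2)
    "gamma": 0.0,                                   # placeholder
    "gamma_int": {"none": 0.0, "E": 0.0, "L": 0.0}, # placeholders
    "theta": 0.017,                                 # reported for Equation (2)
}

# Change in predicted annoyance across the 10 dB range of L_AE,F (68 to 78 dB(A))
delta_level = (predict_annoyance(78, 41, 1, "E", coef)
               - predict_annoyance(68, 41, 1, "E", coef))

# Change between the first and the 40th playback of a session
delta_order = (predict_annoyance(70, 41, 40, "E", coef)
               - predict_annoyance(70, 41, 1, "E", coef))
```

Under these coefficients, the 10 dB range of flight event levels corresponds to roughly 4.1 scale points, whereas the playback position accounts for about 0.66 points over a session of 40 stimuli.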

Experiment 2
Experiment 2 focused on the scenario in which background sound levels (L Aeq,B ) exhibited a higher range than in experiment 1.

Experimental Design
Three design variables (i.e., independent variables) were considered in the design of experiment 2: L AE,F (three levels), L Aeq,B (four levels), and background type (three levels). The partially balanced incomplete block design consisted of 25 blocks (i.e., subjects), each containing 14 stimuli.
Flight events' L AE,F were set to 66, 71, or 76 dB(A). Background sounds' L Aeq,B were set to 6 (i.e., background level of AuraLab in case of "no background sound"), 38, 46, or 54 dB(A). Two bird (and vegetation) sounds were used which were either eventful, E, or less eventful, L. An additional category was coded as "no background" when no background sound was available in the corresponding stimulus.
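How much a background sound raises the overall level of a stimulus can be estimated by energetic summation. The Python sketch below converts a sound exposure level into an equivalent continuous level over the stimulus duration (L Aeq = L AE − 10 log10(T / 1 s)) and adds the background level energetically. It assumes that L AE,F refers to the full 20-s stimulus, that the background is present at its nominal level throughout, and it ignores the fade ramps; the function names are illustrative.

```python
import math

def lae_to_laeq(l_ae, duration_s):
    """Equivalent continuous level over duration_s for a given exposure level."""
    return l_ae - 10 * math.log10(duration_s)

def energetic_sum(*levels_db):
    """Level of the sum of incoherent sounds with the given levels."""
    return 10 * math.log10(sum(10 ** (l / 10) for l in levels_db))

def mixture_increase(l_ae_f, l_aeq_b, duration_s=20.0):
    """Increase in L_Aeq (dB) caused by adding the background to the flight event."""
    l_aeq_f = lae_to_laeq(l_ae_f, duration_s)
    return energetic_sum(l_aeq_f, l_aeq_b) - l_aeq_f

# Least favorable combinations: quietest flight event with loudest background
increase_exp1 = mixture_increase(68, 48)  # experiment 1
increase_exp2 = mixture_increase(66, 54)  # experiment 2
```

Under these assumptions, the loudest background of experiment 1 raises the level of the quietest flight event stimulus by less than 1 dB, whereas the corresponding combination in experiment 2 raises it by roughly 3.5 dB. This illustrates the difference in scenario between the two experiments.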

Stimuli
Based on the combination of three flight events (or levels, L AE,F ), four L Aeq,B , i.e., 3 + 1 (AuraLab background level), and three background types, i.e., 2 + 1 (no background sound), a total number of 21 stimuli were prepared for this experiment: $3 \times (3 \times 2 + 1) = 21$.

Subjects
Twenty-five subjects (seven females and 18 males) participated in the main experiment. All subjects declared to have normal hearing (self judgement) and to feel well. They were aged between 18 and 51 yr (median 27 yr).

Results
In total, 350 annoyance ratings were collected (25 subjects × 14 stimuli), each of which was assigned to a particular subject and a specific stimulus (of the total 21 stimuli). The results of experiment 2 are illustrated in Figure 7. A simple linear mixed effects model such as in Equation (1) was not found to be sufficient to predict annoyance observed in experiment 2. However, a linear mixed effects model similar to that in Equation (2) was found to be optimal to predict annoyance in this experiment:
$$y_{ik} = \mu + \tau_{B,i} + \beta \cdot L_{AE,F} + \gamma \cdot L_{Aeq,B} + \gamma_{B,i} \cdot L_{Aeq,B} + \theta \cdot S_{ik} + u_k + \varepsilon_{ik} \qquad (3)$$
Model parameters in Equation (3) are the same as for Equation (2). The significance levels were comparable to those for experiment 1. Model coefficients are shown in Table 4.
Similar to the results of experiment 1, the main predictor of annoyance in experiment 2 was L AE,F (p < 0.001). Annoyance increased with increasing L AE,F . Background type was a significant predictor of annoyance (p < 0.05). Whereas no difference was found between average annoyance ratings for stimuli without background sounds and those with eventful background sounds, stimuli containing less eventful background sounds were found to be less annoying on average than the other two categories. Furthermore, annoyance increased slightly, but significantly, with the playback number (p < 0.001), as was the case for experiment 1. The main effect of the background level (L Aeq,B ) on annoyance was not significant (p > 0.05).
The interaction between background type and background level (L Aeq,B ) can be observed in Figure 8. This interaction missed the significance level by a small margin (p = 0.05). With increasing L Aeq,B , annoyance from stimuli with eventful background sounds did not change significantly and was on average at the level of annoyance from stimuli without mixed background sounds. Annoyance from the stimuli with less eventful background sounds decreased with increasing L Aeq,B , however only from 38 to 46 dB(A). At an L Aeq,B of 54 dB(A), annoyance from the stimuli with the less eventful background sound was similar to that for the stimuli with the eventful background sound. That is, when the level of the background sound was sufficiently high, it did not matter which type of background sound was used; at this background sound pressure level, annoyance was not different on average from that for the stimuli without background sound.
Figure 8. Interaction (experiment 2) between background type and L Aeq,B : mean annoyance ratings across subjects and their 95% confidence intervals are depicted. Circle (•), square, and cross (×) stand for no, less eventful, and eventful background, respectively. Note that only the annoyance range from 4 to 7 is shown here, although the scale ranged from 0 to 10.

Discussion
Besides the differences in the number of observations and the range of L Aeq,B , experiments 1 and 2 were designed and conducted similarly. It therefore seems plausible to compare the results of the two experiments and to try to summarize them in a combined analysis. Annoyance rating histograms for the two experiments are shown in Figure 9. Median annoyance ratings in experiments 1 and 2 were 5 and 6, respectively. Furthermore, mean annoyance ratings in experiments 1 and 2 were 5.54 and 5.47, respectively. Considering that the L AE,F (or L AE,S ) ranges of the two experiments were comparable, this outcome is not surprising. In conjunction with the similarity of the linear mixed effects models, i.e., Equations (2) and (3), this could be interpreted as a quasi-indicator for reliability and reproducibility of the experiment. Since experiments 1 and 2 were conducted by the second author and the first author, respectively, this is additionally indicative of the objectivity with which the experiments were conducted.
Figure 9. Annoyance rating histograms for the data from experiments 1 (above) and 2 (below). Relative frequency is shown in percentages. The black curve shows the normal distribution.
The effect of L AE,F on perceived annoyance was very similar in both experiments: annoyance increased with increasing L AE,F . The value of its regression coefficient, β, was equal to 0.413 and 0.360 in Equations (2) and (3), respectively. In both experiments, stimuli containing eventful background sounds were rated systematically higher (i.e., more annoying) than those containing less eventful background sounds. In the mid range of L Aeq,B , annoyance generally tended to increase with increasing L Aeq,B for stimuli with eventful background sounds and to decrease for stimuli with less eventful background sounds (see Figures 5 and 8). The differences between the annoyance ratings for these two categories of background sounds saturated (i.e., vanished) at the lower and higher ends of the L Aeq,B range, which is similar to saturation effects in general psychometric functions [37,38]. That is, below and above certain L Aeq,B thresholds (for these laboratory experiments 34 and 54 dB(A), respectively), it would not make a difference whether the background sound is eventful or less eventful. In the case of the lower threshold, the background sound might be generally irrelevant in the presence of a much louder and partially masking flight event. In the case of the higher threshold, the background is probably clearly present and audible in the foreground regardless of its eventfulness.
While the differences between annoyance ratings for the eventful and the less eventful background sounds were robust and alike for the two experiments, their differences to the baseline stimuli containing only helicopter noise (i.e., no background sound) were not similar in the two experiments. Table 5 shows mean and median annoyance for these three categories. In experiment 1, on average, annoyance from the stimuli with eventful background tended to be higher than annoyance from the helicopter stimuli (with no added background sound). In experiment 2, on average, annoyance from the stimuli with less eventful background tended to be lower than annoyance from the helicopter stimuli (with no added background sound). Further investigations are needed in order to explain this difference and/or to provide more clarity in this regard.
Annoyance increased slightly with increasing playback number in both experiments. The value of its regression coefficient, θ, was equal to 0.017 and 0.060 in Equations (2) and (3), respectively. This is in accord with other laboratory studies [8,28,39]: generally, annoyance increases with increasing number of stimuli listened to in an experimental session. This emphasizes the importance of a randomized playback order of the stimuli, as done for the two experiments presented here.
For both experiments, including subjects' random intercept in the linear mixed effects models improved the models significantly. This confirms findings of other psychoacoustic laboratory experiments [8,28,40,41]. Furthermore, this shows the general interindividual differences in the preferences and sensibilities of human observers to background sounds [42]. It should be noted that including subjects' age and gender in the models, i.e., Equations (2) and (3), did not improve the models significantly.
Similar further analyses were carried out with L AE,S , L ASmax,S , or L Aeq,S instead of L AE,F in the linear mixed effects models. This was done to investigate whether, besides the differences caused by the background type, the subjects were annoyed mainly by the flight event levels (F) or by the level of the mixture of flight and background sound (S). All the linear mixed effects models with L AE,S , L ASmax,S , or L Aeq,S led to higher AIC and BIC than the same models with L AE,F , which indicates that L AE,F was a better predictor of short-term annoyance. Table 6 shows Pearson correlation coefficients for bivariate correlations between annoyance and these level variables. Table 6 confirms the above analysis: annoyance correlated more strongly with the flight event sound exposure level (L AE,F ) than with the stimulus level variables.
Table 6. Pearson's r for correlations between a series of sound level variables and annoyance.
The outcome of this study lends support to theories of urban sound design. Enabling the presence of less eventful background sounds (such as water stream, birds, and vegetation) in the mid L Aeq,B range should be effective in reducing perceived annoyance from noise sources in the living areas affected by them. The highest annoyance rating in experiment 1 was given for the stimuli with the eventful church bell in the background. Similarly, Kang and Zhang [43] reported that natural and culture-related sounds (e.g., music) were preferred compared to artificial sounds. More importantly, Hong and Jeon [44] reported that, in particular, water and bird sounds have usually been evaluated as the most effective and most favorable sounds to improve urban sound environments [11,12].
The results of the present study indicate that, for background sounds to be effective in reducing perceived annoyance and improving the soundscape, it is not only important that they are natural, e.g., from water and birds, but they should also exhibit a low degree of eventfulness. Furthermore, their sound pressure levels should be neither too low nor too high.
On the other hand, if the goal is, for example in a psychoacoustic experiment or in virtual acoustic demonstrations, not to affect the perception of the foreground sound (i.e., noise), and hence to avoid weakening the validity of the experiment, the data suggest using relatively low-level natural background sounds with a low degree of eventfulness. For the data collected from the two presented experiments, this L Aeq,B range was about 34 to 41 dB(A). Consistent with this outcome, Taghipour et al. [8] reported that two mixed "birds and vegetation" sound samples exhibiting low degrees of eventfulness and sound pressure levels (L Aeq,B ) of around 37.5 dB(A) were found to be optimal for a 3D auralization of aircraft noise.
De Coensel et al. [15] and Hong and Jeon [44] reported that, in similar contexts, (mixed) bird sounds were more pleasant or more preferable than the sound of water features. In the present study (i.e., in experiment 1), no significant difference was found between annoyance from stimuli containing the less eventful water stream and those containing less eventful birds.

Conclusions
Two laboratory psychoacoustic experiments were reported in this paper, in which ratings of short-term perceived annoyance from (mixed) sounds were collected. Stimuli consisted of foreground helicopter noise and background ambient sounds. The main predictor of annoyance was the helicopter's L AE,F . The stimuli containing eventful background sounds were associated with higher annoyance ratings than the stimuli containing less eventful background sounds. Generally, increasing L Aeq,B accentuated this difference, albeit with saturation effects at the lowest and highest L Aeq,B (at which no significant difference was observed between these two categories of background sounds). Furthermore, the statistical analysis did not lead to a unique correction factor compared to the baseline stimuli containing only helicopter flight events (i.e., no background sound). The observed differences were within one point on the ICBEN 11-point scale but differed between the two experiments. Further studies are needed in this regard, as the quantification of such a correction factor is helpful (and partially essential) for applications in psychoacoustic experiments on annoyance and virtual acoustic demonstrations.
With respect to applications in urban sound design and acoustic comfort improvement, the outcomes suggest that enabling less eventful background sounds of water stream, birds, and vegetation could decrease the annoyance perceived by residents. For this purpose, a mid L Aeq,B range seems to be appropriate. With respect to applications in psychoacoustic experiments, and in order to ensure the experimental validity (with respect to the foreground aircraft noise), the outcomes suggest using low-level water stream, birds, and vegetation sounds which exhibit a low degree of eventfulness.