1. Introduction
Birdsong and insect song are near universal experiences in the outdoor environment. Although some birdsongs and insect songs communicate seasonal changes and are considered pleasant by Japanese listeners, not all instances of birdsong elicit pleasant feelings [1]. Further, while not all birdsongs and insect songs are considered by humans to be beneficial components of an environment, some have been found to contribute to perceived attention restoration and stress recovery [2,3].
From the perspective of soundscapes, natural sounds (e.g., water, birdsongs, and wind in trees) can play a key role in acoustic comfort. Water sounds are often used to mask other sounds and as noise barriers to enhance urban soundscapes [4,5,6]. The introduction of birdsong has been found to increase the subjective pleasantness of soundscapes in public spaces [7]. Among various natural sounds, birdsong was judged as the most effective and beneficial type of sound for improving sound environments [8,9,10].
In Japanese public spaces, sound signals are often used to guide visually impaired people to specific destinations, such as a ticket gate or staircase. For instance, birdsong is often used to signal the presence of a staircase in train stations. However, more than 40% of visually impaired people reported that a birdsong stimulus was difficult to localize in a train station setting [11]. Although guidelines exist regarding the use of birdsongs as information signals [12], these are not always strictly followed by train company staff, who may prefer to use lower-than-recommended sound pressure levels (SPLs) to reduce discomfort in customers, staff, and surrounding residents.
The physical factors that affect the sound signals used to guide visually impaired people have been investigated from the viewpoint of sound localization. Based on the percentages of correct localization responses, researchers have found that the signal-to-noise ratio, initial delay time, reverberation energy, distance, elevation angle, and the temporal pattern of the signal all affect sound localization [13,14]. Additionally, researchers have proposed that sounds with specific temporal patterns, such as those with particular early component and silent interval lengths, might be more easily detectable by visually impaired people, and further, that these factors might be uncovered by examining human brain responses [15].
The aim of this study was to evaluate the salience of a number of birdsongs and insect songs, and to determine the physical factors that modulated the observed salience. Here, we used the term salience to refer to whether or not a sound stood out from background noise. While previous studies have indicated that loudness is a significant predictor of salience [16,17], it is preferable that sound signals used for guidance purposes be salient even at a low SPL, as this reduces unnecessary discomfort elicited by loud sounds in the environment. Although the abovementioned guidelines include an appropriate volume in SPL, they do not consider variations among specific sound sources [12]. To address this in the present study, we investigated the subjective salience of birdsong and insect song presented with equal SPLs and loudness to clarify the effects of physical factors in physically and subjectively equal sound intensity conditions.
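To make the notion of presenting stimuli at an equal A-weighted equivalent level concrete, the following is a minimal sketch, not the procedure used in the study (which this section does not describe). It assumes the standard analytic A-weighting curve, a naive DFT suitable only for short signals, and levels expressed in dB relative to digital full scale rather than calibrated SPL. The function names `a_weight_power_gain`, `laeq_db_fs`, and `gain_to_target_db` are hypothetical.

```python
import cmath
import math


def a_weight_power_gain(freq_hz):
    """Linear power gain of the standard A-weighting curve at freq_hz."""
    if freq_hz <= 0.0:
        return 0.0  # A-weighting rejects DC entirely
    f2 = freq_hz ** 2
    r_a = (12194.0 ** 2 * f2 ** 2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The +2.00 dB offset normalises the curve to roughly 0 dB at 1 kHz.
    return (r_a * 10.0 ** (2.0 / 20.0)) ** 2


def laeq_db_fs(samples, sample_rate):
    """A-weighted equivalent level in dB re full scale (uncalibrated, assumption)."""
    n = len(samples)
    mean_square = 0.0
    for k in range(1, n // 2 + 1):
        # Naive DFT bin: O(n^2) overall, acceptable for a short illustration only.
        x_k = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        # Every bin except Nyquist (even n) has a mirrored negative-frequency twin.
        sides = 1.0 if (n % 2 == 0 and k == n // 2) else 2.0
        freq = k * sample_rate / n
        mean_square += sides * abs(x_k) ** 2 / n ** 2 * a_weight_power_gain(freq)
    return 10.0 * math.log10(mean_square)  # raises ValueError for pure silence


def gain_to_target_db(samples, sample_rate, target_db):
    """Gain in dB that would bring the signal to the target A-weighted level."""
    return target_db - laeq_db_fs(samples, sample_rate)
```

Under these assumptions, applying the gain returned by `gain_to_target_db` to each stimulus would bring all stimuli to the same A-weighted level; equalizing loudness would require a loudness model instead and is not sketched here.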
3. Results and Discussion
The ANOVA for the SV of salience in the equal LAeq condition revealed that the main effect was statistically significant (F(17, 4184) = 194.3, p < 0.001 for birdsongs; F(15, 3479) = 325.6, p < 0.001 for insect songs). The ANOVA for the SV of salience in the equal loudness condition revealed that the main effect was statistically significant (F(17, 4472) = 187.1, p < 0.001 for birdsongs; F(15, 3255) = 120.9, p < 0.001 for insect songs).
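As an illustration of how an F statistic with degrees of freedom such as (17, 4184) arises, the sketch below computes a one-way ANOVA from grouped scores. It assumes a one-way design with the stimulus as the only factor, which the text does not confirm, and the function name `one_way_anova` is hypothetical.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: spread of the scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within
```

With 18 groups (one per stimulus), `df_between` is 17, matching the first degree of freedom reported for birdsongs.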
Figure 5 shows the SVs of salience for birdsongs in the equal LAeq and loudness conditions. The most salient birdsong was that of Garrulus glandarius in the equal LAeq condition, although the Garrulus glandarius song was not as salient in the loudness condition. This is probably because the LAeq of the Garrulus glandarius song was lower than that of the other birdsongs in the equal loudness condition. The relatively salient birdsongs in both the equal LAeq and loudness conditions were those of Horornis diphone, Cuculus canorus, Latham, Porzana fusca, and Terpsiphone atrocaudata. Horornis diphone is one of three major passerine species in Japan that are known for their beautiful vocalizations. Its birdsong was the most preferred stimulus in a previous study [1]. Since ancient times, Cuculus canorus has appeared in various documents in Japan, and its song is often described using onomatopoeia. Compared with other stimuli, it was found to elicit the largest N1m responses in the human brain, and these responses were most strongly correlated with the sound envelope [15]. The less salient birdsongs in both the equal LAeq and loudness conditions were those of Zosterops japonicus, Emberiza cioides, Strix uralensis, and Cuculus saturatus.
Figure 6 shows the SVs of salience for insect songs in the equal LAeq and loudness conditions. The most salient insect songs in both the equal LAeq and loudness conditions were those produced by Meimuna opalifera, Oncotympana maculaticollis, Mecopoda nipponensis, and Tanna japonensis, which are all cicadas except for Mecopoda nipponensis. Cicadas are famous in Japan as noisy insects in the summertime. The less salient insect songs in both the equal LAeq and loudness conditions were those produced by the Japanese katydid, Meloimorpha japonica, Gryllotalpa orientalis, and Tettigonia orientalis. The Japanese katydid and Meloimorpha japonica are well known in Japan and produce sounds during autumn. For the songs of Xenogryllus marmoratus and Hexacentrus hareyamai, the judgment of salience varied among the participants. These songs consist mainly of higher frequency components and have shorter durations, which might have caused the large differences between participants.
We conducted a multiple linear regression analysis with the SVs of salience for birdsongs in both the equal LAeq and loudness conditions as the outcome variable. The final model showed that IACC, entropy, spectral flux, and spectral skewness were significant parameters in the equal LAeq condition, while τ1, QR of τe, QR of Wϕ(0), IACC, QR of loudness, and roughness were significant parameters in the equal loudness condition:
The correlation coefficients between all of the explanatory variables in the equal LAeq and loudness conditions are shown in Table 1 and Table 2, respectively. The ANOVA indicated the statistical significance of the model (F(5, 264) = 40.34, p < 0.001 for the equal LAeq condition; F(6, 281) = 26.26, p < 0.001 for the equal loudness condition). The adjusted coefficient of determination, R², was 0.41 for the equal LAeq condition and 0.35 for the equal loudness condition. The standardized partial regression coefficients in Equations (1) and (2) are summarized in Table 3.
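The model F statistic and the adjusted coefficient of determination are linked through the ordinary R². The sketch below shows that relationship for an ordinary least squares model with an intercept, which is an assumption about the fitting procedure. The function names are hypothetical.

```python
def r_squared_from_f(f_value, df_model, df_resid):
    """Recover R^2 from the overall F test of an OLS model with an intercept.

    Uses F = (R^2 / df_model) / ((1 - R^2) / df_resid), solved for R^2.
    """
    return (f_value * df_model) / (f_value * df_model + df_resid)


def adjusted_r_squared(r_squared, df_model, df_resid):
    """Penalise R^2 for the number of predictors.

    With an intercept, n - 1 = df_model + df_resid, so
    adjusted R^2 = 1 - (1 - R^2) * (n - 1) / df_resid.
    """
    n_minus_1 = df_model + df_resid
    return 1.0 - (1.0 - r_squared) * n_minus_1 / df_resid
```

For example, `adjusted_r_squared(r_squared_from_f(40.34, 5, 264), 5, 264)` gives a quick consistency check of a reported F value against the reported adjusted R².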
The IACC, which signifies the apparent source width, was a significant predictive variable for both the equal LAeq and loudness conditions. The partial regression coefficients of the IACC were positive, indicating that birdsongs with a narrower sound source width were perceived as more salient. This is consistent with previous studies regarding the accuracy of sound source localization [36,37]. Spectral flux was also a significant predictive variable in the equal LAeq condition, with positive partial regression coefficients. This suggests that quick spectral change led to higher perceived salience.
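For readers unfamiliar with the IACC, the sketch below computes it in the commonly used way: as the maximum absolute value of the normalized cross-correlation between the left-ear and right-ear signals within an interaural delay of ±1 ms. The ±1 ms window and the absence of any integration-window or running analysis are assumptions; the study's exact analysis settings are not given in this section. The function name `iacc` is hypothetical.

```python
import math


def iacc(left, right, sample_rate, max_lag_ms=1.0):
    """Maximum absolute normalized interaural cross-correlation within +/- max_lag_ms."""
    n = min(len(left), len(right))
    left, right = left[:n], right[:n]
    norm = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    if norm == 0.0:
        return 0.0  # at least one channel is silent

    max_lag = int(round(max_lag_ms * 1e-3 * sample_rate))
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        # Correlate left[i] with right[i + lag], skipping indices outside the signal.
        acc = 0.0
        for i in range(max(0, -lag), min(n, n - lag)):
            acc += left[i] * right[i + lag]
        best = max(best, abs(acc) / norm)
    return best
```

Values close to 1 indicate nearly identical signals at the two ears (a narrow, well-defined source), while values near 0 indicate decorrelated ear signals (a diffuse source).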
Higher frequency components might play a key role in saliency. The delay time of the maximum peak amplitude of the ACF, τ1, and the spectral skewness were significant predictive variables in the equal loudness and LAeq conditions, respectively. The negative τ1 regression coefficient indicates that birdsongs with a higher pitch were perceived to be more salient, while the positive regression coefficient of spectral skewness demonstrates that birdsongs with more energy at high frequencies were perceived as more salient.
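The link between τ1 and pitch can be illustrated with a short sketch that locates the largest peak of the normalized autocorrelation function (ACF) away from zero lag: its delay is τ1 (the reciprocal of the perceived pitch) and its height is ϕ1 (pitch strength). The biased ACF estimate, the single analysis frame, and the function name `acf_peak` are assumptions made for illustration, not the analysis settings of the study.

```python
def acf_peak(samples, sample_rate, max_lag):
    """Return (tau1_seconds, phi1) from the normalized ACF, or None if no peak exists.

    Requires max_lag + 1 < len(samples).
    """
    n = len(samples)
    energy = sum(x * x for x in samples)
    if energy == 0.0:
        return None

    # Biased, normalized ACF for lags 0 .. max_lag + 1 (acf[0] == 1).
    acf = [
        sum(samples[i] * samples[i + lag] for i in range(n - lag)) / energy
        for lag in range(max_lag + 2)
    ]

    best_lag, best_val = None, float("-inf")
    for lag in range(1, max_lag + 1):
        # A local maximum away from lag 0; keep the highest one.
        is_peak = acf[lag - 1] < acf[lag] >= acf[lag + 1]
        if is_peak and acf[lag] > best_val:
            best_lag, best_val = lag, acf[lag]

    if best_lag is None:
        return None
    return best_lag / sample_rate, best_val
```

For a pure 100 Hz tone sampled at 8 kHz, the largest peak falls at a lag of 80 samples, so τ1 is 10 ms; a higher-pitched tone yields a shorter τ1, which is why a negative τ1 coefficient corresponds to higher pitch being more salient.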
We also conducted a multiple linear regression analysis for insect songs. The final model showed that the QR of Wϕ(0), QR of loudness, fluctuation strength, spectral entropy, and spectral skewness were significant parameters in the equal LAeq condition, while ϕ1, τe, roughness, fluctuation strength, and the QR of spectral entropy were significant parameters in the equal loudness condition:
The correlation coefficients between all of the explanatory variables in the equal LAeq and loudness conditions are shown in Table 4 and Table 5, respectively. The ANOVA indicated the statistical significance of the model (F(5, 250) = 106.00, p < 0.001 for the equal LAeq condition; F(5, 234) = 14.58, p < 0.001 for the equal loudness condition). The adjusted coefficient of determination, R², was 0.68 for the equal LAeq condition and 0.22 for the equal loudness condition. The standardized partial regression coefficients in Equations (3) and (4) are summarized in Table 6.
Fluctuation strength was a significant predictive variable in both the equal LAeq and loudness conditions. This suggests that strong and slow amplitude modulation of insect songs is important for salience perception. Although IACC was a significant predictor of birdsong salience in both the equal LAeq and loudness conditions, it was not a significant predictor of insect song salience. This may be because the insect songs used in the experiment were not recorded using a dummy head microphone [1], and so spatial impressions of the sound sources were not accurately reproduced.
Loudness variations can also be important for saliency. The QR of loudness was a significant predictive variable in the equal loudness condition. This is consistent with the results for birdsongs in the equal LAeq condition, although the partial regression coefficient was negative for insect songs and positive for birdsongs. This suggests that sound sources with moderate loudness variations are perceived to be more salient. We observed a similar pattern for roughness. Roughness was a significant predictor of both birdsong and insect song salience in the equal loudness condition. Although the partial regression coefficient for birdsong was positive, that for insect songs was negative. This suggests that sound sources with moderately fast amplitude modulation are perceived to be more salient.
Spectral entropy appears to play an important role in saliency. Spectral entropy and the QR of spectral entropy were significant predictors of salience in the equal LAeq and loudness conditions, respectively. The positive partial regression coefficient of spectral entropy suggests that abrupt energy changes in the frequency domain of a sound increase the salience. This is partially consistent with our finding regarding the role of spectral flux in the salience of birdsongs. The negative partial regression coefficient of the QR of spectral entropy suggests that stable energy changes in the frequency domain are more important for saliency.
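Spectral entropy and spectral flux have several definitions in the literature, and this section does not state which one was used. The sketch below adopts two common ones as assumptions: entropy as the Shannon entropy of the normalized power spectrum, scaled to the range 0 to 1, and flux as the Euclidean distance between the magnitude spectra of successive frames. The naive DFT and all function names are illustrative only.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """One-sided magnitude spectrum via a naive DFT (short frames only)."""
    n = len(frame)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_entropy(magnitudes):
    """Shannon entropy of the normalized power spectrum, scaled to [0, 1]."""
    power = [m * m for m in magnitudes]
    total = sum(power)
    if total == 0.0 or len(power) < 2:
        return 0.0
    # Bins with zero power contribute nothing (limit of p * log p as p -> 0).
    h = -sum((p / total) * math.log(p / total) for p in power if p > 0.0)
    return h / math.log(len(power))


def spectral_flux(prev_magnitudes, curr_magnitudes):
    """Euclidean distance between the magnitude spectra of two successive frames."""
    return math.sqrt(
        sum((c - p) ** 2 for p, c in zip(prev_magnitudes, curr_magnitudes))
    )
```

Under these definitions, a pure tone gives an entropy near 0 and a spectrally flat sound gives an entropy near 1, while flux is 0 for two identical frames and grows as the spectrum changes from one frame to the next.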
Pitch strength can also be an important modulator of saliency. The maximum peak amplitude of the ACF, ϕ1, was a significant predictor of insect song salience in the equal loudness condition. The negative partial regression coefficient of ϕ1 suggests that broader frequency components are necessary for salience. This is inconsistent with previous findings regarding preference [1]. One possible explanation for this discrepancy is that tonal components, specifically melody, are important for preference, whereas broader frequency components are important for salience, as they enable the listener to more deeply understand the characteristics of the sound source.
4. Conclusions
We examined the salience of birdsong and insect song in terms of several physical parameters. The results indicated that Horornis diphone and Cuculus canorus produce the most salient birdsongs, while Meimuna opalifera and Oncotympana maculaticollis produce the most salient insect songs. All of these creatures are well-known in Japan. The variation of loudness, roughness, and spectral skewness were significant predictors of salience for both birdsongs and insect songs. Spatial content related to the interaural cross-correlation coefficient, IACC, and spectral content expressed by spectral flux were significantly associated with birdsong salience. The maximum peak amplitude of the ACF, ϕ1, was significantly associated with insect song salience. These findings may be useful to designers of sound landmarks regarding physical parameters to consider, such as ϕ1, IACC, and spectral skewness.
Considering the findings of the current study together with those of a previous study on preference for birdsongs [1], the birdsongs of Horornis diphone and Cuculus canorus appear to be desirable information signals because they are salient and preferred. As for insect songs, the song of Tanna japonensis appears to be a desirable signal because it is salient and preferred. This may be partly because they are ubiquitous in Japan, where they are well-liked.
Subjective salience in the current study was not well correlated with the physical parameters of the sounds. Soundscapes are not only affected by the physical aspects of sounds, but also by the context, which includes relationships between people and activities and position in space and time. Thus, the context with respect to the participants may be a dominant factor influencing subjective salience, and could be an interesting topic for future study. Furthermore, the salience of birdsong and insect song stimuli may differ according to culture. We hope to examine cognitive and cultural factors influencing salience in future work. In addition, the present findings need to be verified in a study with visually impaired participants.