Using Acoustic Data Repositories to Study Vocal Responses to Playback in a Neotropical Songbird

Simple Summary: Professional and amateur ornithologists may use conspecific playback to attract or study birds, record their songs, and then archive these songs in acoustic data repositories. This allows us to explore whether birdsong varies after simulated aggressive encounters. We investigated this idea in Rufous-browed Peppershrikes (Cyclarhis gujanensis), a widespread Neotropical bird. When accounting for geographic variation in song traits, we found that males recorded after playback produce longer songs than males recorded singing spontaneously. In contrast, playback altered neither song rate nor song frequency. Despite the limitations derived from unstandardized playback designs, data from acoustic repositories can provide hints about signal flexibility in aggressive contexts.

Abstract: Birds may alter song structure in response to territorial challenges to convey information about aggressive intent or fighting ability. Professional and amateur ornithologists upload many birdsong recordings daily to acoustic data repositories, usually noting whether songs were recorded in response to conspecific playback or produced spontaneously. We analyzed recordings from these repositories to evaluate whether song traits of Rufous-browed Peppershrikes (Cyclarhis gujanensis) differ between playback-elicited songs and spontaneous songs. For each recording made after playback, we chose one spatially close spontaneous recording to avoid geographic bias. Birds recorded after playback produced slightly longer songs than birds singing spontaneously. This result was accounted for by increases in the amount of sound and silence within a song after playback rather than by changes in the mean number or duration of elements. Playback altered neither song frequency parameters (bandwidth and minimum, mean, and maximum frequencies) nor song rate. These results indicate that song duration might mediate aggressive interactions in Rufous-browed Peppershrikes. Even considering limitations such as unknown playback stimulus identity and possible pseudoreplication, acoustic data repositories offer a unique yet largely unexplored opportunity to gain insights into the evolution of song flexibility during aggressive encounters.


Introduction
Acoustic signals are evolutionarily conserved forms of vocal or non-vocal communication that enable the exchange of information between individuals of the same or different species [1]. Acoustic communication mediates activities such as foraging, mate choice, and territory defense [2,3]. In conflict situations, for example, acoustic signals may be favored over physical combat, as the latter poses risks of injury, higher energy expenditure, or death [4]. Acoustic signals can exhibit plasticity and vary structurally (e.g., in frequency or duration) depending on the costs of those signals [5], signaler motivation [6], the environment [7], urbanization [8], anthropogenic noise [9], and the social pressures exerted on these signals [10].
Flexibility in acoustic signal traits (e.g., consistency, complexity) can convey information about the individual's intention or quality in both territorial defense and courtship contexts [11]. This flexibility may also be related to sound transmission efficiency. A bird may produce more redundant acoustic signals to increase the probability of signal detection and recognition by distant receivers [12], given the risk of degradation and reverberation along the sound path [13]. In short-distance communication, on the other hand, the production of soft songs with fewer syllables or longer intervals between syllables may be favored to conceal the sender's position from unwanted eavesdroppers, as predicted by the eavesdropping avoidance hypothesis [14,15].
Changes in acoustic signal duration can also convey the signaler's intention to escalate the agonistic interaction [16], especially when song complexity increases through more phrases per unit of time or longer phrases [17,18]. Despite the cost of producing more complex signals, this vocal plasticity can be a way of displaying the individual's physical ability to fight or its quality as a mating partner [19]. Like song duration, acoustic frequency (or pitch) may indicate aggressive intent in birds according to the motivation-structural hypothesis [20]. This idea is based on an inversely proportional relationship between the sender's body size and song pitch [21,22]. The larger the bird, the larger the syrinx and, as a result, the lower the frequencies it can produce, as the syringeal folds tend to vibrate more slowly under this allometry [21]. Consequently, an individual can benefit from producing the lowest-pitched signal possible, as it would reliably transmit information about its size and, likely, its fighting ability [23][24][25][26][27].
Professional and especially amateur ornithologists might use playbacks of conspecifics to attract or study birds [28], record their songs, and then upload the audio files into acoustic data repositories. This makes acoustic data repositories a rich source of information to study acoustic signaling during simulated agonistic interactions across taxa. The collection and deposition of sound data by citizen scientists, including birdwatchers, have revolutionized avian bioacoustics. By allowing access to recordings of birds from different locations, contexts, and times [29,30], acoustic data repositories have been useful for understanding vocal dialects [31], the effects of urbanization on vocalizations [32], the cultural evolution of vocalizations [33], and the variation in song frequency across species [34]. However, we are unaware of any study on acoustic communication in agonistic contexts using data from acoustic data repositories.
There are drawbacks and benefits to using acoustic data repositories to investigate vocal responses to playbacks. The primary limitations are related to the unstandardized and unknown playback stimuli and designs [35], spatiotemporal confounding effects (such as geographic and year changes) [31,32,36], and the lack of knowledge regarding how and which song features communicate aggressive intent and motivation [37]. These limitations may add substantial noise to the analysis, which means the occurrence of differences between spontaneous songs and songs produced in response to playback may be a strong indication that this pattern would be found in wild birds subjected to a standardized playback experiment. The advantages of using acoustic data repositories reside in the increased possibility of gaining insights into whether vocal responses to playback vary with population identity [38], population density [39], species [40], space [41], and time [42]. The research effort required to sample multiple populations or species over a wide range of time or places makes it challenging to carry out this task in the field.
Here, we compared spontaneous songs and songs produced after conspecific playback using audio recordings of the Rufous-browed Peppershrike (Cyclarhis gujanensis) retrieved from acoustic repositories. This songbird has a broad distribution in the Neotropics [43] and a large number of recordings available in the repositories. Male Rufous-browed Peppershrikes produce a short song that varies widely with latitude and habitat openness, suggesting signaling variation at evolutionary and/or ecological time scales [43]. We expected that birds recorded after playback would produce more, longer, and lower-pitched songs with a wider bandwidth [44] than birds that were recorded singing spontaneously (i.e., without playback), assuming these song traits signal aggressive intention or fighting abilities to simulated intruders [37].

Study Species
The Rufous-browed Peppershrike (Cyclarhis gujanensis, Aves: Vireonidae) is a medium-sized songbird (22-35 g) that inhabits a wide range of open and semi-open habitats in the Neotropics (from Mexico to Argentina) [45]. This species is not globally threatened, and it is often one of the most frequent species in the avian assemblages in which it occurs [45][46][47]. The Rufous-browed Peppershrike produces two short song types (<5 s): an infrequent, slow-paced series of descending notes (song type 1, attributed to the female) and a loud series of melodious, whistled, and frequency-modulated notes (song type 2, attributed to the male) [43,45]. In this study, we focused on song type 2, which is considered the primary song of the species [43]. A previous study has shown that males produce on average ~2 variants of song type 2 (range: 1, 7), acoustic features of this song type do not differ among subspecies, and there is no evidence for dialects (discrete geographic variation in song traits) [43]. However, songs from lower latitudes are shorter and have more elements, broader bandwidth, and higher pitch (maximum frequency) than those from higher latitudes [43]. In addition, songs produced in open habitats have narrower bandwidth and lower pitch than songs produced in closed habitats [43]. The widespread distribution, the clinal geographic variation in song features, and the large number of recordings available in acoustic data repositories make the Rufous-browed Peppershrike an ideal species for investigating song variation in an aggressive context using data from acoustic data repositories.

Song Recordings
We downloaded recordings of Rufous-browed Peppershrikes from three acoustic data repositories: xeno-canto (https://xeno-canto.org/, accessed on 24 January 2021), WikiAves (https://www.wikiaves.com.br/, accessed on 8 February 2021), and Macaulay Library (https://www.macaulaylibrary.org/, accessed on 31 January 2022). First, we visually inspected oscillograms and spectrograms in Raven Pro 1.6.3 [48] to select 54 high-quality recordings (high signal-to-noise ratio and no acoustic overlaps) made after playback was used to stimulate a vocal response (post-playback recordings, hereafter). For each post-playback recording, we selected a high-quality spontaneous recording, classified as obtained without the use of playback and made in the same city or a city located within a radius of 300 km. This is a reasonable distance threshold because song type 2 of the study species has no dialects and does not vary among subspecies [43]. There was no information about the use of pishing (an imitated alarm call used by observers to attract birds) [49] in the metadata of our selected recordings; thus, we assumed that these recordings were obtained without pishing. We chose pairs of one post-playback recording and one nearby spontaneous recording to isolate the confounding effect of geographic variation on song structure [43]. In other words, we compared two recordings from the same region: a post-playback recording and a spontaneous recording. All recordings belonging to the same pair of comparisons were obtained from the same subspecies.

We discarded recordings from the same date and place made by the same observer because they were likely recordings of the same bird. Thus, each recording used in our study belonged to a different bird. When two post-playback recordings had only one equivalent spontaneous recording from the same or a nearby location, we selected an additional spontaneous recording from a second location close to the two post-playback recordings. Two post-playback recordings were excluded because they had no geographically close (<300 km) spontaneous recording. Altogether, we used 88 recordings (44 post-playback and 44 spontaneous) containing 9.27 ± 10.92 songs per recording (mean ± SD, range: 1-58, n = 816 songs). These recordings encompass most of the distribution range of the Rufous-browed Peppershrike, though there were fewer recordings from central South America (Figure 1). The number of songs per recording was insufficient to estimate song repertoire (variants of song type 2) for each male; thus, we treated all songs as belonging to the same variant of song type 2.
The mean distance (± SD) and median time interval between a post-playback and a spontaneous recording of the same pair of comparisons were, respectively, 56.10 ± 74.26 km (range: 0.00, 253.07) and 716.5 days (range: 0, 18,015). All audios were converted to WAVE format and had sample rate (44.1 kHz) and resolution (16 bits) standardized in Adobe Audition 2015.0.
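The spatial pairing described above amounts to a nearest-neighbor search under a 300 km cutoff. The sketch below illustrates that logic in Python; it is not the authors' workflow, and the metadata fields (`id`, `lat`, `lon`) and the strict one-to-one matching rule are assumptions (the study also allowed one extra spontaneous recording in a special case).

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def pair_recordings(playback_recs, spontaneous_recs, max_km=300.0):
    """For each post-playback recording, pick the nearest unused spontaneous
    recording within max_km; drop the post-playback recording if none qualifies."""
    pairs, used = [], set()
    for pb in playback_recs:
        best, best_d = None, max_km
        for i, sp in enumerate(spontaneous_recs):
            if i in used:
                continue
            d = haversine_km(pb["lat"], pb["lon"], sp["lat"], sp["lon"])
            if d <= best_d:
                best, best_d = i, d
        if best is not None:
            used.add(best)
            pairs.append((pb["id"], spontaneous_recs[best]["id"], round(best_d, 1)))
    return pairs
```

A post-playback recording with no spontaneous neighbor within 300 km is simply excluded, mirroring the two exclusions reported above.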

Acoustic Analyses
We selected the start and end of each song (Figure 2) in Raven Pro 1.6.3 [48] (settings: window type: Hann, window length: 512, overlap: 50%). We then imported the tables containing these song selections into R 4.1.1. From each recording, we calculated song rate as the number of songs per minute in the time interval between the start of the first song and the end of the last song in a recording. We did not calculate song rate for four audio files that were composed of two or more recordings of the same bird; in these files, different recordings (i.e., audio files) were grouped into a single file and separated by silent periods created by the recordist when editing the file. Song rate was calculated only for recordings with ≥4 songs (mean ± SD: 9.58 ± 6.88 songs per recording, range: 4-34), resulting in 38 recordings with measurements of song rate (19 pairs of post-playback and spontaneous recordings from geographically close locations). At the song level of analysis, we measured song duration (song length, in s) and minimum frequency, maximum frequency, frequency bandwidth, and median frequency (all in Hz). These metrics were extracted using the warbleR package (v. 1.1.27) in R [51]. Song duration was measured as the time interval between the start and the end of each song (in s) [52], whereas median frequency was the frequency that divided the frequency spectrum into two parts of equal energy [53].
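The song-rate definition above (songs per minute over the interval from the start of the first song to the end of the last, computed only for recordings with at least four songs) can be expressed as a short sketch. Python is used here purely for illustration; the analysis itself was done in R.

```python
def song_rate(song_onsets, song_offsets, min_songs=4):
    """Songs per minute over the span from the start of the first song
    to the end of the last song; None if fewer than min_songs songs.
    Onsets/offsets are in seconds from the start of the recording."""
    n = len(song_onsets)
    if n < min_songs:
        return None  # too few songs for a reliable rate estimate
    span_min = (max(song_offsets) - min(song_onsets)) / 60.0
    return n / span_min
```

For example, four songs spread over 93 s yield a rate of about 2.6 songs per minute.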
We used the power spectrum to select the frequencies located at a threshold of 15 dB below (minimum frequency) and above (maximum frequency) the dominant frequency (i.e., the frequency with most energy in a song) for each song (freq_range function, window type: Blackman, window length for frequency domain: 1024, window length for time domain: 256, frequency spectrum smooth: 1, overlap: 90%) [54,55]. We included a filter on this function to remove frequencies below 1 kHz and above 6 kHz, which fell outside the song frequency range in Rufous-browed Peppershrikes. The variable window length ensured an appropriate resolution for time and frequency measurements [56]. Finally, we calculated frequency bandwidth of each song as the difference between maximum and minimum frequencies [57].
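The amplitude-threshold logic can be sketched as follows: within the 1-6 kHz band-pass, find the dominant (peak) frequency and keep the lowest and highest frequencies whose power remains within 15 dB of that peak. This is a minimal illustration of the idea behind warbleR's freq_range, not a reimplementation of it.

```python
import numpy as np

def freq_range_db(freqs_hz, power, db_threshold=15.0, flim=(1000.0, 6000.0)):
    """Lowest/highest frequencies whose power is within db_threshold dB of
    the dominant frequency, restricted to the band-pass flim (here 1-6 kHz)."""
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    power = np.asarray(power, dtype=float)
    band = (freqs_hz >= flim[0]) & (freqs_hz <= flim[1])
    freqs_hz, power = freqs_hz[band], power[band]
    power_db = 10.0 * np.log10(power / power.max())  # 0 dB at the dominant frequency
    above = freqs_hz[power_db >= -db_threshold]
    fmin, fmax = above.min(), above.max()
    return fmin, fmax, fmax - fmin  # min frequency, max frequency, bandwidth
```

Songs whose power never rises above the threshold inside the 1-6 kHz band are the ones discarded in the next paragraph.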
We could not measure minimum frequency for 114 songs, maximum frequency for three songs, and frequency bandwidth for 117 songs. These songs were discarded because they had a low signal-to-noise ratio and reached the amplitude threshold outside the chosen frequency range (i.e., 1-6 kHz, see above). We measured minimum frequency for, on average, 18.51 songs per male (range: 3-66, n = 648 songs), maximum frequency for 18.47 songs per male (range: 3-68, n = 813), and frequency bandwidth for 18.43 songs per male (range: 3-66, n = 645), for a total of 35 pairs of spontaneous versus post-playback recordings from geographically close locations. Because we had to discard songs from subsequent analyses, we also measured alternative energy-based frequency measurements: first quartile frequency (frequency at 25% of signal energy), third quartile frequency (frequency at 75% of signal energy), and interquartile frequency range (the difference between these first two measures) [51].
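Unlike the threshold-based measures, the energy-based quartile frequencies follow directly from the cumulative spectral energy, so they remain defined even for noisy songs. A minimal sketch (binning and inputs are illustrative):

```python
import numpy as np

def quartile_freqs(freqs_hz, power):
    """Frequencies at 25% and 75% of cumulative spectral energy,
    plus their difference (the interquartile frequency range)."""
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    power = np.asarray(power, dtype=float)
    cum = np.cumsum(power) / power.sum()  # cumulative energy fraction
    q1 = freqs_hz[np.searchsorted(cum, 0.25)]
    q3 = freqs_hz[np.searchsorted(cum, 0.75)]
    return q1, q3, q3 - q1
```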

Statistical Analyses
We compared post-playback songs and spontaneous songs using multivariate and univariate approaches. The multivariate approach tests for differences between post-playback songs and spontaneous songs on a multivariate acoustic scale, whereas the univariate analyses allow us to examine specific predictions such as changes in song frequency (motivation-structural hypothesis). For both approaches, we averaged each acoustic metric within each male, thus removing within-individual variance in song features. We also natural log-transformed (ln) minimum frequency, maximum frequency, median frequency, and bandwidth to approach normal distributions. We also corrected for geographic variation in song traits (see below). Because the possible effect of time of year on song traits varies geographically (e.g., between the northern and southern hemispheres) owing to variation in the timing of breeding seasons [58,59], and because data on the seasonality of breeding across the Rufous-browed Peppershrike's distribution range are lacking, we were unable to account for seasonal and year changes in song features [42].
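The per-male averaging and ln-transformation step might look like the sketch below. The field names are hypothetical, and the ordering (average within male first, then ln-transform the frequency variables) is our reading of the text, not a confirmed detail.

```python
import math
from collections import defaultdict

# Hypothetical names for the frequency variables that are ln-transformed.
LOG_VARS = {"min_freq", "max_freq", "median_freq", "bandwidth"}

def per_male_features(songs):
    """Average each acoustic metric within each male, then ln-transform
    the frequency variables; `songs` is a list of per-song dicts."""
    grouped = defaultdict(lambda: defaultdict(list))
    for s in songs:
        for key, value in s.items():
            if key != "male_id":
                grouped[s["male_id"]][key].append(value)
    out = {}
    for male, feats in grouped.items():
        out[male] = {k: sum(v) / len(v) for k, v in feats.items()}
        for k in LOG_VARS & out[male].keys():
            out[male][k] = math.log(out[male][k])
    return out
```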
We first performed a permutational multivariate analysis of variance (PERMANOVA) to compare distance matrices of acoustic features between post-playback and spontaneous songs (adonis function in the vegan package, v. 2.5-7, in R; 9999 permutations) [60]. For this analysis, we considered normalized Euclidean distance matrices (decostand and vegdist functions) of all acoustic variables except song rate, which had a smaller sample size (see above). We also treated pairs of geographically close recordings as blocks (strata) in this analysis, thus comparing multivariate acoustic features within each pair. This procedure takes geographic variation in song traits into account when comparing post-playback and spontaneous songs, despite not correcting for geographic variation in song traits within each of these two song categories.
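The essential idea of treating pairs as strata is that condition labels are shuffled only within each geographic block, never across blocks. The sketch below is a simplified restricted permutation test on a mean paired-difference statistic; it illustrates the blocking logic only and is not a reimplementation of vegan's adonis or of the PERMANOVA statistic.

```python
import math
import random

def blocked_permutation_test(pairs, n_perm=9999, seed=1):
    """Restricted permutation test. Each element of `pairs` is
    (playback_vector, spontaneous_vector) for one geographic block;
    labels are swapped only within blocks, mirroring strata.
    Statistic: Euclidean norm of the mean paired-difference vector."""
    rng = random.Random(seed)

    def stat(ps):
        dims = len(ps[0][0])
        mean_diff = [sum(a[d] - b[d] for a, b in ps) / len(ps) for d in range(dims)]
        return math.sqrt(sum(x * x for x in mean_diff))

    observed = stat(pairs)
    hits = 0
    for _ in range(n_perm):
        # swap the two labels within each block with probability 1/2
        permuted = [(b, a) if rng.random() < 0.5 else (a, b) for a, b in pairs]
        if stat(permuted) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Because permutations never mix recordings from different blocks, geographic differences between blocks cannot inflate the test statistic.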
We also compared each song feature between post-playback and spontaneous songs univariately using paired t-tests, adopting the Welch approximation to the degrees of freedom. This procedure also takes geographic variation into account because comparisons are made only between songs from spatially matched locations. Because we had multiple frequency measurements testing the same hypothesis (motivation-structural hypothesis), we adjusted p-values (method: false discovery rate) for tests using frequency measurements [61]. We repeated these paired t-tests replacing the amplitude threshold-based frequency measurements with the energy-based frequency measurements; the results remained qualitatively the same. Therefore, we show only the results from analyses including the amplitude threshold-based frequency measurements.
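The false discovery rate correction cited here corresponds to the Benjamini-Hochberg step-up procedure (as in R's p.adjust with method = "fdr"). A self-contained sketch of that adjustment:

```python
def fdr_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values: adj(i) = min over j >= i
    (in ascending p order) of p(j) * m / j, capped at 1."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, ascending p
    adjusted = [0.0] * m
    prev = 1.0
    for rank_from_end, i in enumerate(reversed(order)):
        rank = m - rank_from_end  # 1-based rank of this p-value
        prev = min(prev, pvalues[i] * m / rank)  # enforce monotonicity
        adjusted[i] = prev
    return adjusted
```

Applying it only to the frequency tests, as done here, keeps the family-wise adjustment restricted to the measurements that test the same hypothesis.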
Results

Playback use did not affect song frequency parameters: median frequency, minimum frequency, maximum frequency, and bandwidth (Table 1). In addition, song rate was similar whether the bird sang spontaneously or after conspecific playback. Acoustic features did not differ between post-playback songs and spontaneous songs at the multivariate scale (PERMANOVA: t = -2.20, df = 21, p = 0.84).


Discussion
We show that male Rufous-browed Peppershrikes recorded after conspecific playback produced slightly longer songs but did not differ in song rate and song frequency in comparison with males that produced spontaneous songs. The longer song duration in birds recorded after conspecific playback was driven by an increasing amount of sound and silence within a song. Since we controlled for geographic variation in song traits, our results suggest that longer songs signal aggressive intent or fighting ability or that playback is used to attract shy and aggressive individuals. We discuss our results considering the limitations and advantages of acoustic data repositories to study the evolution of signaling during aggressive interactions across extant birds.
Signal duration mediates aggressive interactions in many other bird species [18,[62][63][64]. However, because our study relied on data from acoustic data repositories, we should consider alternative explanations for this vocal response to playback. For example, the longer songs in post-playback recordings could be due to recordists being more likely to use playback (or longer stimuli) to attract shy birds and those birds behaving consistently more aggressively [65]. Field studies with Rufous-browed Peppershrikes are needed to confirm three criteria in order to consider song duration as an aggressive signal [37]: males should be able to adjust song duration in response to conspecific playback (the context criterion) [66]; song duration should predict an escalation in aggressiveness by the signaler (the predictive criterion); and longer songs should elicit differential aggressive responses from receivers [63,64] (the response criterion). These studies would be important to evaluate the reliability of using acoustic data repositories as a source to study aggressive vocal signals.
According to the motivation-structural hypothesis [20], we expected that the production of lower-pitched songs of the same or different type would be favored in aggressive contexts because it carries information about body size and fighting ability [23][24][25][26]. However, frequency did not vary between songs produced after conspecific playbacks and songs produced spontaneously, which does not support this hypothesis. Alternatively, habitat may have masked a potential relationship between song frequency and aggressive context. Rufous-browed Peppershrikes are known to produce higher-pitched songs in closed than in open habitats at the macrogeographical scale [43]. Recordists may be more likely to broadcast conspecific songs to attract birds in closed than in open habitats because visualization of the bird is more difficult in closed habitats [28]. Therefore, even if Rufous-browed Peppershrikes are producing lower-pitched songs in aggressive contexts (e.g., [26]), these birds that were subjected to conspecific playback are more likely to inhabit closed habitats [28], which in turn is related to the production of higher-pitched songs in this species [43]. In other words, the effects of habitat and aggressive context on song frequency may be masking each other in our dataset. Further studies using acoustic recordings from acoustic data repositories should always consider the effects of habitat on song traits (e.g., [67]).
We expected that conspecific playback would lead to higher song rates, as observed in many bird species [68][69][70]. However, we found no variation in song rate when comparing spontaneous songs and playback-induced songs produced by different birds from geographically close locations. Although variability in playback efficacy could have hidden a response, we did observe a small effect of conspecific playback on song duration, which suggests that song rate does not vary in aggressive contexts in Rufous-browed Peppershrikes. Alternatively, one could attribute this result to the unstandardized and variable playback procedures among recordists. Playback procedures may have varied in many ways, including stimulus location, acoustic quality, duration, and amplitude, as well as bird identity and its attributes; all of these factors may affect the vocal response to playback [38,71,72]. For example, recordists may stop broadcasting the stimulus as birds approach them, which may lead to a weak vocal response or none at all. In addition, birds may sing less and spend more time in vigilance after approaching the human holding the speaker [73]. Thus, variability in the efficacy of conspecific playbacks may have hidden an actual effect of conspecific playback on song rate.
We acknowledge that our study has limitations associated with the use of data from acoustic data repositories and the variability arising from between-individual comparisons. One limitation is that we have no information about the playback procedures, as outlined above. Different recordists might have used the same stimulus with different birds, leading to pseudoreplication [35]. Pseudoreplication reduces the ability to detect real effects because vocal responses can be biased by unique characteristics (e.g., frequency) of the most frequently broadcast stimuli [35]. This could be mitigated if acoustic data repositories required more details about the playback procedure (e.g., stimulus identity, duration) rather than mere playback usage. Another issue is dealing with spatiotemporal confounding effects on song traits [36,74], such as the effects of urbanization and anthropogenic noise on song frequency and duration [8]. We must account for geographic variation in song, especially in vocal-learning species [75], as we did here by comparing pairs of spontaneous and playback-induced songs from geographically close locations.
There is a significant ethical issue with the use of conspecific playback to lure birds and record their songs [29,76,77], especially given that the majority of contributors to acoustic data repositories are amateur ornithologists or birdwatchers [28]. Conspecific playback may have negative impacts on bird behavior and physiology [78][79][80][81]. For example, frequent birdwatching may cause habituation in birds [80], while acute playback perturbs the oxidative state of birds [79]. Although we emphasize the importance of acoustic data repositories for the study of aggressive vocal signals, we advise against birdwatchers using playback or pishing [82] to attract or provoke birds to vocalize.

Conclusions
Here, we showed that Rufous-browed Peppershrikes produce slightly longer songs after conspecific playback, suggesting that song duration mediates aggressive interactions in this species. Our study highlights the potential of acoustic data repositories to contribute to a better understanding of the function of acoustic signals in aggressive contexts, but results should be treated with caution and contrasted with field studies given the limitations of repository data. Replicating this study in more species may provide insight into the ecological and evolutionary drivers of signaling lability during aggressive interactions (reviewed in [37]). We hope these results encourage researchers to consider acoustic data repositories as a source for studying acoustic communication in aggressive contexts.