Using Machine Learning to Identify Associations between the Environment, Occurrence, and Outcomes of Songbird Displacements at Supplemental Feeders

Simple Summary: Animals interact with their environment via a wide range of behaviors. Thus, exploring the factors that influence the occurrence and outcome of these consequential behaviors is important to understanding how animals interact with and are affected by the world around them. Displacements, an aggressive behavior wherein one individual is chased from a resource by another, have implications for social hierarchies and geographic distribution in songbirds. At bird feeders, factors like body size and dominance rank have been shown to mediate these displacement behaviors. However, the role of the physical environment, namely temperature, humidity, and time of day, which may influence an individual’s energy needs and thus motivation to displace another individual, has remained understudied. We monitored songbird displacement behaviors using computerized bird feeders, which recorded who ate at the feeder, when, and under what environmental conditions. With these data, we used a machine learning algorithm to identify what social and environmental factors predict the occurrence and outcome of songbird displacement events. We found that the physical environment (i.e., humidity and the time of day) is associated with the occurrence of a displacement event, whereas the social environment (i.e., who’s displacing and being displaced) is associated with who’s involved in a displacement event.
Abstract: The context and outcome of aggressive interactions between individuals have important fitness consequences. Displacements, aggressive interactions wherein one individual is chased from a location by another, also have implications for social hierarchy formation and geographic distribution in songbirds. Morphological correlates, like body size, and social correlates, such as dominance rank, have been shown to mediate displacements in songbirds. However, the role of the physical environment, namely temperature, humidity, and time of day, which may influence an individual’s energy needs and thus displacement motivation, has remained understudied. We monitored songbird feeding and displacement behaviors using computerized automated feeders. We observed asymmetric differences across species in displacement involvement. To identify the conditions of the social and physical environment that are associated with the occurrence and outcome of songbird displacements at supplemental feeders, we used random forest, a machine learning approach that is novel to the fields of ornithology and animal behavior. From our random forest models, we found that the attributes of the physical environment (i.e., humidity and the time of day) are associated with the occurrence of a displacement event, whereas the attributes of the social environment (i.e., species of the displacer and displaced individuals) are associated with which species are involved. These results provide context to develop further observational and experimental hypotheses to tease apart the inner workings of these multifactorial behaviors on a larger scale and provide a proof of concept for our analytical methods in the study of avian behavior.

Birds 2022, 3 308 ment at feeders. Other physical features, such as the local availability of naturally-occurring foods, could also increase competition at feeders. The social environment, specifically the presence of conspecifics and heterospecifics at a feeder, also influences non-displacement feeding events and is additionally mediated by physical environmental features [25]. Thus, as the social environment influences when an individual chooses to feed, it may also influence whether an aggressive bout to achieve the food resource is necessary. By incorporating features of songbirds' physical and social environments into an analytical model, we may be able to better identify conditions that relate to the occurrence and outcomes of songbird displacements.
Here, we use machine learning algorithms to explore whether features of the physical and social environments relate to the occurrence and outcomes of songbird displacements at supplemental feeders. To address this question, we monitored the feeding behaviors of wild passerine populations in semi-urban Appalachian environments across a range of weather conditions using automated computerized feeding units [40]. Modern automated approaches, which allow for data collection under a wider range of conditions and can collect observations of unusual events without the biasing presence of an observer, have great potential to help us study such behaviors. To explore the data, we used random forest, an ensemble algorithm that uses multiple decision trees (a forest) to predict a response (e.g., feeding behavior) based on many potential predictor variables (e.g., environmental and social cues) [44,45]. This study's aim is to identify associations between the environment, both social and physical, and displacement behaviors. Our goals are to better understand the complex interactions of environmental factors driving displacements, provide context to develop further observational and experimental hypotheses to tease apart the inner workings of these multifactorial behaviors on a larger scale, and provide a proof of concept for our analytical methods in the study of avian behavior.
Both to understand whether this analytical approach is efficient and interpretable for avian behavior studies and to identify potential correlates of avian displacement behaviors, we address two general a priori hypotheses: (1) the environmental attributes that relate to the occurrence of displacements differ from the environmental attributes that relate to non-displacement feeding events; and (2) the social environment is a more important predictor of which species displaces, and which is displaced, than the physical environment. Our hypotheses are intentionally broad as we use random forest as an exploratory tool to initiate future hypothesis generation in a complex system [46][47][48]. Machine learning methods, such as random forest, are especially apt to identify the most important variables from a large number of predictor variables, even if measures are not independent, and are thus particularly useful for analyzing the large and complex datasets that can be collected via automated methods. Random forest has been shown to have high accuracy in making predictions using ecological and behavioral data [46,48,49,50]. This study serves as a proof of concept that automated data collection used in conjunction with machine learning models may be a useful tool in the study of animal behavior. The results from these models may also inform future studies of songbird displacement behaviors by identifying important environmental correlates that can be explored in further detail or manipulated in experimental design.

Data Collection
Observations of feeding events and behaviors were captured using PASSER (Programmable Automated System for Songbird Ecobehavioral Research) smart feeders [40]. These feeders operate independently of the researcher for extended periods of time, removing observer confounds and restrictions. Once a bird arrives, feeders capture 10 photos over 8 s of the bird sitting on the feeder's single perch via a camera that extends from the side of the feeder. For every set of photos, the feeder also captures the time of day, date, ambient temperature (°C), and relative humidity (%) of the feeding event. This system can be re-activated every 10 s. A gravity feed mechanism dispenses food: in this study, it was a mixture of millet, shelled/unshelled sunflower seeds, and safflower seeds.
Two feeders were located 2.25 km apart in the city of Radford in southwest Virginia and collected data 24 h a day over a 167-day period from September 2017 to February 2018. While two feeders only captured a relatively localized snapshot of songbird behavior, these two feeders collected hundreds of feeding events a day and provide a strong platform for a proof of concept for this methodology. We use site as a variable in our analysis to control for any differential effects these two feeders' locations may have. Data were collected under Radford University's Institutional Animal Care and Use Committee (IACUC) protocol FY17-09.

Feeding Displacements
Displacements were identified manually using captured photos. An individual bird could be in one of two roles in a displacement event: displaced or displacer. The displaced is the individual removed from the perch, where the displacer is the individual removing the original bird from the perch. We classified a displacement as any set of photographs in which one individual was replaced on the feeder perch by another individual in the same or consecutive photographs within a set of ten. This timeframe of less than two seconds between the disappearance of the first individual and the appearance of the second is in accordance with previous feeder displacement works [51] and minimizes the likelihood of incorrectly classifying incidences in which a bird leaves the feeder of their own accord and is followed by, but not displaced by, another bird as a displacement. Intraspecific displacements were only included when the displacement was clearly visible in a single photo or when the individuals involved were visibly distinct due to sexually dimorphic plumage. This ensures that an individual leaving the perch and immediately returning was not incorrectly classified as a displacement.
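The classification rule above can be written down as a short sketch. In the study, displacements were identified manually from the photos, so the code below is not the authors' procedure; it only encodes the stated rule under simplifying assumptions. The function name `find_displacement` and the encoding of each photo as a set of bird labels are hypothetical, and two birds appearing together in one frame is treated as the "same photograph" case.

```python
from typing import List, Optional, Set, Tuple


def find_displacement(photos: List[Set[str]]) -> Optional[Tuple[str, str]]:
    """Return (displaced, displacer) for the first replacement found, else None.

    `photos` is a hypothetical encoding of one photo set: each element is the
    set of bird labels visible on the perch in that frame, in time order.
    """
    holder = None  # the bird currently occupying the perch
    for frame in photos:
        if holder is None:
            # Wait for a frame with a single, unambiguous occupant.
            if len(frame) == 1:
                holder = next(iter(frame))
            continue
        newcomers = frame - {holder}
        if newcomers:
            # A different bird appears in the same frame as the holder, or in
            # the frame directly after the holder was last seen.
            return holder, sorted(newcomers)[0]
        if holder not in frame:
            # Empty perch: the holder left on its own, so a later arrival
            # is not counted as a displacement.
            holder = None
    return None


print(find_displacement([{"NOCA"}, {"NOCA"}, {"HOSP"}]))  # replacement in consecutive frames
print(find_displacement([{"NOCA"}, set(), {"HOSP"}]))     # gap between birds: no displacement
```

Because the labels must differ for a replacement to be detected, two birds with the same label cannot be told apart here, which mirrors the restriction placed on intraspecific displacements in the text.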

Random Forest
We built a predictive model using the random forest algorithm [44,45] to determine whether any environmental variables or species-species interactions could identify correlates of displacement behavior. At each node in each decision tree, a random subset of predictor variables is assessed, and the variable that most increases node purity is chosen to split at that node, thereby making each decision tree different. This strategy helps overcome issues that could be presented by including variables in a predetermined order or including variables that are highly correlated. Additionally, the data are randomly sampled for each decision tree in the forest, and the unsampled (out-of-bag) observations are used to test the model. This information is used to build a confusion matrix for the prediction and calculate the out-of-bag (OOB) error rate, which provides an estimate of accuracy for the model compared to the observed data. Once a model with good predictive power is established, the importance of each variable can be determined by measuring the mean decrease in accuracy (MDA) and mean decrease in Gini (node) impurity (MDG). MDA is calculated by randomly permuting the values of each variable in the out-of-bag data and measuring the resulting loss of prediction accuracy relative to the unpermuted data, whereas MDG sums the decreases in node impurity from all splits on that variable, averaged over the trees in the forest. Homogeneous MDA values across variables suggest there are no notable associations between the predictor variables and the dependent variable. Heterogeneous MDA and MDG values suggest select variables were strongly associated with the dependent variable and are strong predictors. We used the R v3.5.2 [52] package 'randomForest' [53] with 1000 decision trees for all analyses discussed below.
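To make the node-splitting step concrete, the following is a minimal sketch of choosing a split by Gini impurity from a random subset of predictors. It is an illustration only, not the 'randomForest' implementation used in the study; the function names, the `mtry` argument name as used here, and the four toy observations (with invented humidity and temperature values) are all assumptions made for the example.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(rows, labels, feature, threshold):
    """Drop in Gini impurity when splitting on `feature <= threshold`."""
    left = [y for x, y in zip(rows, labels) if x[feature] <= threshold]
    right = [y for x, y in zip(rows, labels) if x[feature] > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


def best_split(rows, labels, features, mtry, rng):
    """Assess a random subset of `mtry` features and keep the purest split."""
    best = None
    for feature in rng.sample(features, mtry):
        for threshold in sorted({row[feature] for row in rows}):
            gain = impurity_decrease(rows, labels, feature, threshold)
            if best is None or gain > best[0]:
                best = (gain, feature, threshold)
    return best


# Invented toy data: four feeding events with made-up values.
rows = [
    {"humidity": 30, "temperature": 5},
    {"humidity": 35, "temperature": 12},
    {"humidity": 80, "temperature": 6},
    {"humidity": 90, "temperature": 11},
]
labels = ["no", "no", "yes", "yes"]
print(best_split(rows, labels, ["humidity", "temperature"], 2, random.Random(0)))
```

In a full forest, `mtry` would be smaller than the number of predictors, so different trees consider different variables at each node; summing the impurity decreases credited to a variable across all of its splits is the idea behind the MDG measure.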
Initially, we set up the random forest algorithm to predict whether a displacement occurred (i.e., displacement occurrence model), with the response coded as "yes" (a displacement occurred) or "no" (a displacement did not occur; i.e., an uninterrupted feeding event). The predictor variables (n = 10) used are outlined in Table 1. Since displacement events are relatively rare, the response variable was strongly biased toward no responses (yes = 668 and no = 24,789), and the resulting model could not accurately predict the yes response (Table A1). Therefore, we implemented a downsampling scheme to overcome bias in uneven response classes [54]. For each of 100 iterations, we randomly subsampled 668 of the no observations to match the number of yes observations and used all the yes observations, for a total of 1336 observations to inform the model. The iterations were then averaged to calculate the OOB error rates and the MDA and MDG values. We decided to use 100 iterations because the average and standard deviation of the OOB error rates did not change when we increased this number (Table A2), and it produced balanced error rates across classes.
Table 1. Variables used in the two suites of random forest models. Temperature, humidity, time of day, previous bird, Y/N prior 2 min, Y/N prior 5 min, Y/N prior 10 min, Y/N prior 20 min (time intervals selected to account for immediate and delayed effects of presence), and site are used in both suites of models (what is associated with displacements, i.e., displacement occurrence model, and what is associated with which individual is the displacer and which the displaced, i.e., the role prediction models). Species (who is the first species on the feeder) is used only in the displacement occurrence model. Displaced or displacer, Y/N each species, n species 15 min, 15 min count, and 15 min frequency were only used in the role prediction models.
15 min frequency: the most frequent species seen at the feeder in the 15 min prior to a displacement event. Accounts for the influence, or lack thereof, of the most frequent species at the feeder prior to a displacement event.
Next, we set up two more random forest analyses, one to identify the correlates of when a species was in the displaced role and another for when a species was the displacer (i.e., the role prediction models). In these cases, the response variable is the species in the displaced or displacer role (668 observations total). The predictor variables (n = 20) used are outlined in Table 1. These role prediction models include 9 of the 10 variables used in the displacement occurrence model described above, plus an additional 11 variables specific to displacement events. The 10th variable included in the displacement occurrence model, species on the feeder, was not included in the role prediction models because the additional displacement-specific variables incorporate species information. See Table 1 for detailed descriptions of each variable.
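The downsampling scheme described for the displacement occurrence model can be sketched as follows. This is an illustration under stated assumptions, not the authors' R code: the function names are hypothetical, the rows are placeholder tuples rather than real observations, and `share_of_yes` is only a stand-in for the step that would fit a random forest and return its OOB error rate.

```python
import random
from statistics import mean, stdev


def balanced_subsample(yes_rows, no_rows, rng):
    """Keep every 'yes' row and draw an equal number of 'no' rows without replacement."""
    return list(yes_rows) + rng.sample(no_rows, len(yes_rows))


def average_over_downsamples(yes_rows, no_rows, score, iterations=100, seed=0):
    """Score `iterations` balanced subsamples and return the mean and SD of the scores."""
    rng = random.Random(seed)
    scores = [
        score(balanced_subsample(yes_rows, no_rows, rng))
        for _ in range(iterations)
    ]
    return mean(scores), stdev(scores)


def share_of_yes(rows):
    """Stand-in scorer: the fraction of 'yes' rows in a subsample."""
    return sum(1 for label, _ in rows if label == "yes") / len(rows)


# Placeholder rows with the class counts reported in the text.
yes_rows = [("yes", i) for i in range(668)]
no_rows = [("no", i) for i in range(24789)]

sample = balanced_subsample(yes_rows, no_rows, random.Random(1))
print(len(sample))  # 668 'yes' rows plus 668 sampled 'no' rows
print(average_over_downsamples(yes_rows, no_rows, share_of_yes))
```

Replacing `share_of_yes` with a function that fits a model on each balanced subsample and returns its error would give the averaged error rate and its standard deviation across the iterations.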

Species Involvement
Seven species were observed in displacement events at our feeders: American Goldfinches (Spinus tristis; AMGO), Black-Capped Chickadees (Poecile atricapillus; BCCH), Eastern Tufted Titmice (Baeolophus bicolor; ETTI), House Finches (Haemorhous mexicanus; HOFI), House Sparrows (Passer domesticus; HOSP), Northern Cardinals (Cardinalis cardinalis; NOCA), and Song Sparrows (Melospiza melodia; SOSP). Overall, we captured 23,963 individual feeding events of these seven species from August 2017 to March 2018. A total of 668 displacement events were observed between these 7 species, accounting for 2.62% of all feeding events. Intraspecific displacements accounted for 52.40% of all displacements. Displacement events occurred on 103 days, or 70.07% of all days. Overall, an average of 4.54 displacements occurred daily during the study.
We observed asymmetric differences across species in displacement involvement (Table 2). For example, the lowest rate of displacement was demonstrated in Northern Cardinals, which were displaced in 0.87% of their feeding events. The highest rate of displacement was observed in House Sparrows, which were displaced in 4.99% of their feeding events. This trend was also seen in the displacer role, with Northern Cardinals displacing other species in only 1.14% of their total feeding events and House Sparrows displacing others in 5.47% of their feeding events.

Displacement Occurrence Model
The overall out-of-bag (OOB) classification error rate of the random forest model was 2.63%; however, the confusion matrix showed that predictions were strongly skewed against "yes" responses (i.e., the occurrence of a displacement; Tables A1 and A2). Therefore, we applied the downsampling scheme described above (100 iterations), which returned an error rate of 35.74% with a balanced confusion matrix, allowing for more accurate interpretations to be drawn from this model (Tables A2 and A3) [53][54][55]. The increase in error rate was expected with downsampling and remained in an acceptable and predictive range, especially given the complexities of the behavior.
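A short calculation shows why a low overall error rate can hide poor prediction of a rare class. The confusion matrix below is hypothetical and is not Table A1: it describes an imagined classifier that always answers "no", applied to the class counts reported in the methods (668 yes and 24,789 no). The function name and the dictionary layout are assumptions made for the example.

```python
def error_rates(confusion):
    """Return (overall error, per-class error) from confusion[observed][predicted] counts."""
    total = sum(sum(row.values()) for row in confusion.values())
    wrong = sum(
        count
        for observed, row in confusion.items()
        for predicted, count in row.items()
        if predicted != observed
    )
    per_class = {
        observed: sum(c for predicted, c in row.items() if predicted != observed)
        / sum(row.values())
        for observed, row in confusion.items()
    }
    return wrong / total, per_class


# Hypothetical classifier that predicts "no" for every feeding event.
always_no = {
    "yes": {"yes": 0, "no": 668},
    "no": {"yes": 0, "no": 24789},
}

overall, per_class = error_rates(always_no)
print(f"overall error: {overall:.2%}")  # about 2.62%
print(per_class)                        # every 'yes' missed, every 'no' correct
```

Such a classifier would have an overall error close to the pre-downsampling rate while missing every displacement, which is why per-class error rates and a balanced confusion matrix are the more informative check for rare events.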
This model exploring the differences in displacement and non-displacement events found attributes of the physical and social environments to be important in different contexts (Figure 1a,b). The most important variables in identifying the occurrence of a displacement were humidity and the time of day (Figure 1c). In contrast, the most important variables in identifying a non-displacement event were the species previously and currently at the feeder (Figure 1d). In summary, correlates of the physical environment were primarily associated with a displacement occurring, whereas correlates of the social environment were primarily associated with the occurrence of a non-displacement feeding event.
Figure A1. Label meanings: Species (the first bird on the feeder), temperature (ambient; °C), humidity (relative; %), previous bird (that fed last before a new feeding event), Y/N prior 2 min (yes or no that there was a bird present in the 2 min prior to the feeding event), Y/N prior 5 min (yes or no that there was a bird present in the 5 min prior to the feeding event), Y/N prior 10 min (yes or no that there was a bird present in the 10 min prior to the feeding event), Y/N prior 20 min (yes or no that there was a bird present in the 20 min prior to the feeding event), displaced (species that was removed from the feeder), displacer (species that removed another from the feeder), Y/N "species code" (yes or no if each species was present in the 15 min prior to a displacement), n "species code" 15 min (the number of times each species was present in the 15 min prior to a displacement), 15 min count (the number of birds present in 15 min prior to a displacement), and 15 min frequency (the most frequent species seen at the feeder in the 15 min prior to a displacement event).

What Is Associated with Who Is Displaced?
The first role prediction model, which identified variables associated with which species were displaced, had an error rate of 7.63% and had a balanced confusion matrix, thereby enabling accurate interpretations to be made (Table A3). The variables most strongly associated with who was displaced were: the presence/absence of a Song Sparrow(s) and a House Sparrow(s) in the 15 min prior to a displacement, the species of the previous bird at the feeder immediately before a displacement, and the species of the displacer (Figure 1e,f). The variables least associated with who was displaced were overall feeder use (15 min count from Table 1) and select species' feeder use (Y/N each species from Table 1) in the minutes prior to the occurrence of a displacement. Overall, descriptors of the social environment were primarily associated with what species was displaced.

What Is Associated with Who Displaces?
The second role prediction model, which identified what is associated with who displaces, had an error rate of 7.04% and had a balanced confusion matrix, again allowing for accurate interpretations to be made (Table A3). The presence/absence of a Song Sparrow(s), House Sparrow(s), Black-Capped Chickadee(s), and Northern Cardinal(s) in the 15 min prior to a displacement and the species of the displaced bird were most strongly associated with what species was the displacer in the random forest model (Figure 1g,h). The variable that was least associated involved feeder use prior to a displacement (15 min count from Table 1). Overall, the features of the social environment were primarily associated with what species will displace another.

Discussion
In summary, we found that the occurrence of a displacement event is associated with the attributes of the physical environment, specifically humidity and time of day. However, the identity of species involved in displacement events is associated with the attributes of the social environment, specifically the respective species of the displaced and the displacer and the species at the feeder immediately and 15 min prior to the displacement event. These results support our initial hypotheses and provide context to develop further observational and experimental hypotheses.
While our study inferences are restricted to a small portion of the Appalachian region, our novel use of this methodology for ornithology is a successful proof of concept in identifying associations between the environment and songbird behaviors on a small scale. We used random forest to test intentionally broad hypotheses, identifying attributes of the social and physical environment associated with displacement behaviors for future, pointed observational and experimental hypotheses to be created. Specifically, manipulating the physical environment (e.g., the time of day via artificial light) and the social environment (e.g., the species using the feeder by manipulating which food types are provided) may provide more insight into when and why displacement events occur. Controlling the species present around the feeder and the displacing/displaced species may provide deeper understanding of who displaces, or is displaced, when, and how often. Controlling for individuals would also be useful to facilitate independence of the data and to study the potential impacts of behavioral syndromes on displacements.
As random forest was able to identify that the variables associated with displacement and non-displacement events were different, we can tentatively accept our first hypothesis. The difference in the specific environmental attributes associated with displacement vs. nondisplacement feeding suggests that each behavior may be driven by different environmental conditions [56]. While we are unable to make broad inferences of songbird displacement behaviors due to the limited spatial reach of our study, these results preliminarily suggest that the specific conditions of the local physical environment may be contributing factors in songbirds' motivation to displace, or not to attempt to displace, another individual. For example, a bird's ability to cope with changes in irregular weather may influence (and be influenced by) its physiological energy budget. Increased demand for energy resources in the face of inclement weather may make displacements worth attempting, particularly for poorly prepared individuals. However, to understand the directionality and causality of this potential relationship, directed experimentation and hypothesis testing is required.
In testing our second hypothesis, we found that social factors are primarily associated with which species displaces and which is displaced. Both the species present around the feeder and the species of the individual displacing/being displaced were the primary identifiers in both models. The presence/absence of Song Sparrows, House Sparrows, Black-Capped Chickadees, and Northern Cardinals prior to a displacement may suggest that these species are indicators of the conditions under which a displacement may be favorable. The presence of these four species may also be a deterrent, or depending on the species, an inducement, to the potential displacer (as suggested in [57,58]). The low association for overall and select species-specific feeding frequency in identifying a displacement indicates that feeder use, and thus encounter probability, are not substantially associated with displacements. This further supports the notion that the species at the feeder may dictate displacement occurrence and outcome at our feeders, and not the number of individuals vying for the feeder at our study sites as suggested by Wojczulanis-Jakubas et al. [25]. If feeder use and encounter probability did play a role in displacements at our feeders, they were subtle.
While our models suggest the physical environment is central to identifying the occurrence of a displacement event, there was little suggestion in our models that physical correlates had a noteworthy role in identifying the species-specific makeup of displacements. Taken together, our results suggest that certain conditions of the physical environment may increase the motivation to displace, and once increased, the social environment may dictate which dyadic displacement pairings are likely to, or not to, occur. However, conditions that motivate displacements may also be likely to increase subordinates' incentives to stay [25], potentially making displacements less successful. For instance, adverse weather may increase the relative importance of feeding, and thus displacement, for all species. However, different species may be more or less likely to succeed or fail in displacement attempts, and more or less likely to benefit from initiating or resisting displacement, based on factors like size, mobility, and feeding preference. Examples of this nature are seen in the contest theory literature [59][60][61][62]. Additionally, while some species are involved more often in displacements, the fact that their presence does not predict a displacement need not be surprising. These often-involved species (e.g., House Sparrows) may be motivated to displace, and displace often, but only under certain environmental conditions.
Our findings add support for the role of species interactions in mediating displacement events in songbirds [25][26][27]. We add to the nuance of these complex behaviors by suggesting that more specific details of the social environment, not only species size or prevalence, may also be important. With aspects of the social environment also being associated with displacement interactions, research looking at the larger-scale consequences of aggression, such as the formation of social structures and geographic distributions, may help account for this nuance.
Overall, on a very localized scale, our results suggest that the variables identifying songbird displacement occurrence and outcome may be interconnected and more complex than previously shown. While the physical environment is associated with the occurrence of displacements, and the social environment is associated with the outcome, the social environment is both mediated by, and serves as a buffer to, the physical environment. Such complexities require much further study to deconstruct and identify the correlates of songbird displacements across time and space, and their directionality. This future work may benefit from selecting physical and social attributes through the lens of contest theory [59][60][61][62] and similar frameworks. As previously stated, our results lay the groundwork for future observational and/or experimental design, which would benefit from beginning with the physical and social correlates we have identified here using random forest analysis. Further, as our random forest analysis did not identify the directionality of variables (only the importance of each variable in predicting the displacement occurrence or role outcome), future analysis should tease apart the directionality of these physical and social correlates.
Furthermore, future studies employing computerized data collection units, such as our feeders, should learn from the shortcomings of our machine. Our limited computer processing power prevented the video recording of feeding events, which would have allowed for higher resolution in behavioral observations. Photographs may allow nuanced behaviors to go undetected due to the few tenths of a second between photographs. Additionally, and importantly, our use of only two feeders limits the scope of our findings. While enough for a proof of concept and to make localized inferences, more feeders across time and space would greatly aid in addressing broader research questions. For studies with pointed observational and/or experimental hypotheses developed from our work, and for those with larger sets of data collected from across more sites and seasons, random forest will be a valuable tool to identify correlates of behavior. More predictors, such as the vegetation cover around the feeder, nutritional value of the food, and feeder accessibility, could be incorporated. Future studies could also uniquely mark individuals to facilitate data independence by determining if repeated displacements are the result of one or few individuals doing the displacing or being displaced. A larger dataset with observations of uniquely marked individuals will also aid in understanding the roles that sex and age may play in displacements, given that behavioral syndromes and profiles vary greatly between species and individuals [19,63,64]. As understanding the environmental correlates that underlie avian behavior is critical to conservation efforts, our study and methods provide the groundwork for developing more specific hypotheses pertaining to avian behavior.

Data Availability Statement:
The data generated and analyzed in this study will be available on Dryad if accepted.
Table A2. The out-of-bag (OOB) classification error rate (pre-downsampling) and the mean and SD of OOB error rates for the post-downsample model (both yes and no responses for the random forest model asking "did a displacement occur"). The mean and SD did not notably differ between 100 and 200 downsample iterations, resulting in our use of the 100-iteration downsample model in our analyses. The OOB error rate increase post-downsampling was expected and equated to a level acceptable for accurate predictions to be made from the model [53][54][55].

Figure A1. Results from the random forest models for what is associated with which species is displaced (top row) and displaces (bottom row). Displayed as MDA (mean decrease in accuracy) and MDG (mean decrease in Gini impurity) variable importance values (higher values indicate greater predictive significance). Label meanings: species (the first bird on the feeder), temperature (ambient; °C), humidity (relative; %), previous bird (that fed last before a new feeding event), y/n prior 2 min (yes or no that there was a bird present in the 2 min prior to the feeding event), y/n prior 5 min, y/n prior 10 min, y/n prior 20 min, displaced (species that was removed from the feeder), displacer (species that removed another from the feeder), y/n "species" (yes or no if each species was present in the 15 min prior to a displacement), n "species" 15 min (the number of times each species was present in the 15 min prior to a displacement), 15 min count (the number of birds present in 15 min prior to a displacement), 15 min freq (the most frequent species seen at the feeder in the 15 min prior to a displacement event).