Open-Source Wearable Sensors for Behavioral Analysis of Sheep Undergoing Heat Stress

Abstract: Heat stress (HS) negatively affects animal productivity and welfare. The use of wearable sensors to detect behavioral changes in ruminants undergoing HS has not been well studied. This study aimed to investigate changes in sheep behavior using a wearable sensor and to explore how ambient temperature influenced the algorithm's capacity to classify behaviors. Six sheep (Suffolk, Dorset, or Suffolk × Dorset) were assigned to 1 of 2 groups in a cross-over experimental design. Groups were assigned to one of two rooms where they were housed for 20 d prior to switching rooms. The thermal environment within the rooms was altered five times per period. In the first room, the temperature began at a thermoneutral level and gradually increased before decreasing. Simultaneously, in the second room, the temperature began at a hot level and gradually decreased before increasing again. Physiological responses (respiratory rate, heart rate, and rectal temperature) were analyzed using a linear mixed-effects model. A random forest algorithm was developed to classify lying, standing, eating, and ruminating (while lying and standing). Thermal stress shifted daily animal behavior budgets, increasing total time spent standing in hot conditions (p = 0.036). Although models had a similar capacity to classify behaviors within a temperature range, their accuracy decreased when applied outside that range. Although wearable sensors may help classify behavioral shifts indicative of thermal stress, algorithms must be robustly derived across environments.


Introduction
Livestock welfare and production are notably affected by the thermal environment. High ambient temperature compromises animals' efficiency because maintaining body temperature becomes the highest priority instead of using nutrients for production purposes, e.g., milk and meat production [1]. Although advancements in management systems such as cooling systems and modern barn construction alleviate some negative effects of heat stress (HS), globally, it is estimated that livestock producers may face a financial loss of $40 billion annually by the end of the 21st century [2]. Physiological adaptations such as increased respiration rate [3] and sweating [4] are employed as heat abatement mechanisms by domestic animals intended to reduce heat loads and increase heat dissipation.
Traditionally, measurements of body temperature have been the most widely used means to assess or predict HS in livestock [5]. With the advancement of technology, the ability to monitor and predict physiological responses like body temperature automatically is progressing rapidly [6]. However, these automated temperature monitoring technologies may miss opportunities to identify changes within heat-stressed environments (i.e., before severe shifts in productivity occur) because a deviation in temperature will only be apparent once animals have exceeded their homeothermic limits. Nevertheless, these technologies aim to reduce human intervention, decrease the costs associated with management, and increase animal welfare through an improved understanding of stress exposure. In this context, methods used to measure body temperature and physiological responses to HS using remote sensing include vaginal and rectal probes [7], rumen boluses [8], thermal imaging [9], and ear canal sensors [10]. As an alternative to monitoring temperature directly, monitoring animal behaviors during exposure to heat stress may allow for a more precise understanding of stress levels and adaptation across a temperature gradient, rather than signaling only when animals have exceeded their thermoregulatory capacity.
Animals in HS environments tend to alter their natural behavior in order to maintain euthermia in comparison with animals in thermoneutral conditions. Rumination is reduced in cows [11], and rumination patterns shift, with more than 60% of the daily rumination occurring at night [12]. Moreover, standing bouts have been demonstrated to increase in hot temperatures [13] as an attempt to increase the exposed body surface area and support cooling. Although previous studies have characterized some of these changes related to HS, with the decrease in feed intake and its effects on production being the most studied [14,15], other behavioral changes in ruminants undergoing HS, such as the frequency and duration of activities including grazing, eating, lying, and standing, have not been well studied. Moreover, there is limited information available regarding measurements of HS in sheep when compared to other species, such as dairy cattle. Enhancing our understanding of HS in sheep could significantly benefit the sheep industry because sheep serve as a cost-effective animal for extensive systems. A challenge with using behavior to evaluate animal stress responses induced by climate is that behavioral adaptations may differ considerably between scenarios with gradual versus extreme shifts in the environment. For example, during extreme temperature changes, animals may show much more obvious behavioral adaptations, whereas those employed during gradual temperature shifts may be more challenging to observe and detect. Previously, inertial measurement units (IMU) have been used to monitor and classify animal behaviors in thermoneutral environments [16][17][18]. However, the robustness of the behavior classification algorithms that these sensors use within thermoneutral environments has not been well characterized.
As such, the exploration of the robustness of these sensors for use in heat stress behavior monitoring is critical before use in studying behavioral shifts associated with different ambient temperatures and with different patterns of temperature change.
To investigate the role of HS on behavior, the objective of this study was to evaluate the usage of an open-source wearable sensor equipped with a three-axis accelerometer, gyroscope, and magnetometer to classify behaviors of interest in sheep exposed to different patterns of ambient temperature fluctuation. We hypothesized that animals experiencing HS would shift daily time bouts relative to those housed in thermoneutral conditions. In addition, we expected that algorithms for behavior classification derived within thermoneutral environments would be insufficient for accurate and precise behavioral classification in heat-stressed environments.

Animals and Experimental Design
The animals and procedures in this study were approved by the Virginia Tech Institutional Animal Care and Use Committee (Protocol #20-200). Six commercial wethers (Suffolk, Dorset, or Suffolk × Dorset) were used in the study. Wethers were approximately 5.5 years of age and averaged 90.2 ± 13.4 kg body weight at the beginning of the experiment. The experiment consisted of a crossover design with wethers randomly assigned to 1 of 2 groups exposed to different patterns of thermal stress. The groups were randomly assigned to 1 of 2 identical rooms (2.71 m) where they were housed for 27 days. The floor of each room was layered with rubber flooring and lined with sawdust bedding. After completing the 27-day period, animals were moved between rooms to initiate the cross-over component of the design. The animals were adapted in each room for a period of 7 days. The thermal environment in each room changed 5 times per period (once every 4 days). In one room, the temperature started at thermoneutral before increasing and decreasing gradually (20 °C, 27 °C, 35 °C, 27 °C, 20 °C), and in the other room, the animals were immediately exposed to a hot temperature that then gradually decreased and increased again (35 °C, 27 °C, 20 °C, 27 °C, 35 °C). The room temperature was adjusted and maintained under the control of a computer system (Siemens Building Automation System, Siemens Industry Inc., Alpharetta, GA, USA). Animals were fed Timothy (Phleum pratense L.) hay ad libitum, replenished twice daily at 08:00 and 17:30 h for the entire experiment in order to mimic continuous feed access as would be experienced in a variety of production environments.

Collar Instrumentation
Four animals were fitted with an inertial measurement unit for behavior classification. The behavior monitoring collars fitted to the wethers included an open-source, low-cost, low-power sensor: a generic HiLetgo® MPU-9250 motion sensor (Shenzhen Hiletgo Co., Ltd., Shenzhen, Guangdong, China), which included a 3-axis accelerometer, gyroscope, and magnetometer, together with an Arduino-compatible microprocessor and a data storage module (SD Card 16 GB). The sensors were powered using a rechargeable lithium-ion battery (6700 mAh) connected to the microcontroller by a micro-USB cord. The microprocessor was programmed using the open-source Arduino Integrated Development Environment (IDE) software version 1.0 (https://www.arduino.cc/en/Main/Software (accessed on 6 October 2021)) and configured to record the data onto the SD card at 100 Hz. All electronic components were connected and affixed to a collar, which was positioned around the animal's neck.

Physiological and Behavioral Measurements
Throughout the experimental period, respiration rate, heart rate, and rectal temperature were measured twice daily at 07:00 and 17:00 h. Respiration rate (breaths/minute [BrPM]) was measured by visually observing 10 flank movements and converting the observation to breaths per minute. Heart rate (beats/minute [BPM]) was measured using a stethoscope for 10 s and multiplied by six to obtain the beats per minute. Rectal temperature (°C) was recorded using a clinical thermometer inserted into the rectum. Animal behavior was video recorded over the experimental period. Two cameras were utilized per room, which together provided continuous observation of the entire room and of the animals at all times.
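The two conversions above can be sketched as follows. This is an illustrative Python sketch with hypothetical function names; it assumes that the elapsed time for the 10 flank movements was recorded, which the text implies but does not state.

```python
def respiration_rate_brpm(seconds_for_10_flank_movements: float) -> float:
    """Convert the time taken for 10 flank movements into breaths/minute.

    Assumes the observer timed the 10 movements (not stated in the text).
    """
    if seconds_for_10_flank_movements <= 0:
        raise ValueError("elapsed time must be positive")
    return 10.0 / seconds_for_10_flank_movements * 60.0


def heart_rate_bpm(beats_in_10_s: int) -> int:
    """Scale a 10 s stethoscope count to beats/minute (count x 6)."""
    return beats_in_10_s * 6
```

For example, 10 flank movements observed over 20 s correspond to 30 BrPM, and 12 beats counted in 10 s correspond to 72 BPM.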
The timing of the video recording extended from morning to the next day (~24 h), with the time stamp from the video analysis used to match the inertial measurement unit data. Activity (eating, lying, standing, and ruminating) was determined for each minute for the four animals fitted with sensor collars. Video-recorded activities were classified using the following criteria: (1) eating: feed intake from the feed bucket, chewing and swallowing the feed; (2) standing: static standing, with minor head movement and no jaw movements; (3) lying: lying down in a rest position without rumination; and (4) ruminating: the animal was standing or lying down while regurgitating a rumen bolus, chewing, and then re-swallowing.
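One way to picture the matching of per-minute video labels to the 100 Hz sensor stream is sketched below. The data layout, the function name, and the choice to drop samples from unlabeled minutes are all assumptions made for illustration; the source does not describe how the alignment was implemented.

```python
SAMPLE_RATE_HZ = 100  # logging rate stated in the text
SAMPLES_PER_MINUTE = SAMPLE_RATE_HZ * 60  # nominal samples behind one label


def label_imu_samples(imu_rows, minute_labels, start_s=0.0):
    """Attach the per-minute video label to every IMU sample.

    imu_rows: iterable of (timestamp_s, features) tuples (hypothetical layout).
    minute_labels: dict mapping minute index (0, 1, ...) to a behavior string.
    start_s: timestamp of the first video minute on the IMU clock.
    Samples that fall in minutes without a label are dropped.
    """
    labeled = []
    for timestamp_s, features in imu_rows:
        minute = int((timestamp_s - start_s) // 60)
        label = minute_labels.get(minute)
        if label is not None:
            labeled.append((minute, features, label))
    return labeled
```

Under this layout, each labeled minute nominally contributes 6000 sensor samples that share one observed behavior.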

Statistical Analysis
Statistical analyses were conducted using R Statistical Software v4.1.2 [19]. For the physiological measurements, the lme4 package [20] was used to derive the model. Before the experiment, a power analysis was conducted to determine the required sample size per treatment group to achieve a type I error probability of 0.05 with a power greater than 0.90 based on a power t-test. Normality was tested for the model derivation process through the evaluation of residual plots. Relationships were analyzed as a linear mixed-effects model with period and group as random effects. Response variables included respiration rate (RR) (breaths/minute [BrPM]), heart rate (HR) (beats/minute [BPM]), and rectal temperature (RT, °C). Because of the temperature pattern used, there were several explanatory variables used to dissect aspects of the thermal environment for their influence on animal vital signs. The primary explanatory variable of interest was the temperature experienced during a 4-day monitoring period. However, because that temperature could be experienced following a higher or a lower temperature, we also included a continuous variable to indicate the change in temperature from the previous period. Furthermore, temperatures could be experienced in the gradual (low temperature, to high temperature, back to low temperature) sequence or in the drastic (high temperature immediately) sequence. These sequence differences were also accounted for, as were the 2- and 3-way interactions among the thermal environment variables.
Formally, the response variables were analyzed using the following model:

$$Y_{ijklm} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta\gamma)_{ijk} + c_l + d_m + \varepsilon_{ijklm}$$

where $Y_{ijklm}$ is the response variable (RR, HR, or RT), $\mu$ represents the overall mean, $\alpha_i$ is the effect of the ith ambient temperature, $\beta_j$ is the effect of the jth overall trend in the temperature changes (representing whether the gradual or drastic temperature pattern was used), $\gamma_k$ is the effect of the kth difference in the temperature from the current period to the previous period, $(\alpha\beta\gamma)_{ijk}$ represents the 2- and 3-way interactions of ambient temperature i, temperature trend j, and temperature lag k, $c_l$ is the random effect of period l, $d_m$ is the random effect of group m, and $\varepsilon_{ijklm}$ is the residual error associated with temperature i, temperature trend j, temperature lag k, period l, and group m.
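To make the three thermal explanatory variables concrete, the sketch below derives the temperature, trend, and lag values for the two temperature sequences used in the rooms. It is illustrative Python, not the R code used in the study; the function and field names are hypothetical, and coding the lag of the first step as missing is an assumption, since the source does not say how that step was handled.

```python
GRADUAL = [20, 27, 35, 27, 20]  # degrees C, thermoneutral first
DRASTIC = [35, 27, 20, 27, 35]  # degrees C, hot first


def thermal_covariates(sequence, trend):
    """Return one dict per 4-day step with temperature, trend, and lag.

    The lag is the change from the previous step's temperature. The first
    step has no predecessor in the sequence, so its lag is None here
    (an assumption, not taken from the source).
    """
    rows = []
    previous = None
    for step, temp in enumerate(sequence, start=1):
        lag = None if previous is None else temp - previous
        rows.append({"step": step, "temp_c": temp, "trend": trend, "lag_c": lag})
        previous = temp
    return rows
```

With this coding, 27 °C appears with a lag of +7 °C and of -8 °C in the gradual sequence, which is why the same ambient temperature can carry different lag values in the model.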
For the behavior classification, the random forest (RF) algorithm from the package randomForest [21] was derived for each temperature of interest. To investigate the predictive capacity of the algorithm, 70% of the IMU sensing data were randomly selected to build a training dataset, and the remaining 30% were used to build the test dataset. A set of four models was derived, using data either from all thermal environments together or from each thermal environment individually. In addition to evaluating against the 30% of each dataset that was held out during derivation, we also evaluated the individual thermal environment algorithms against the data obtained under the other thermal conditions to explore the transferability of these algorithms among thermal conditions. Irrespective of the derivation dataset, models were derived using the 9-axis data from the IMU as explanatory variables and the behaviors of interest as the response variable. During each evaluation task (i.e., using either held-out data or data from other thermal environments), the confusionMatrix function of the package caret [19] was used to compute accuracy, precision, sensitivity, and specificity metrics. The evaluation metrics were computed as follows:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\text{Precision} = \frac{TP}{TP + FP}$$

$$\text{Sensitivity} = \frac{TP}{TP + FN}$$

$$\text{Specificity} = \frac{TN}{TN + FP}$$

where TP (true positives) is the number of occurrences where behavior was appropriately classified by the model as the behavior that was observed (video analysis observation, the gold standard). TN (true negatives) is the number of occurrences where behavior was correctly classified as not being detected. FP (false positives) is the number of occurrences where the model incorrectly classified a behavior that was not detected. FN (false negatives) is the number of occurrences where the algorithm classified a specific observed behavior as some other behavior.
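The TP, TN, FP, and FN definitions above can be applied to a multi-class confusion matrix in a one-versus-rest fashion. The following Python sketch illustrates that bookkeeping with the standard metric definitions; it is not the caret implementation, and the function names and the matrix orientation (rows observed, columns predicted) are assumptions.

```python
def _ratio(numerator, denominator):
    """Safe division that returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def one_vs_rest_metrics(confusion, labels):
    """Per-behavior metrics from a square confusion matrix.

    confusion[i][j] = count observed as labels[i] (video) and
    predicted as labels[j] (model).
    """
    total = sum(sum(row) for row in confusion)
    metrics = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                 # observed k, predicted other
        fp = sum(row[k] for row in confusion) - tp  # predicted k, observed other
        tn = total - tp - fn - fp
        metrics[label] = {
            "accuracy": _ratio(tp + tn, total),
            "precision": _ratio(tp, tp + fp),
            "sensitivity": _ratio(tp, tp + fn),
            "specificity": _ratio(tn, tn + fp),
        }
    return metrics
```

A behavior that the model never predicts has TP + FP = 0, so its precision is undefined (NaN in this sketch) rather than a meaningful zero, which is worth keeping in mind when reading very low precision values.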

Physiological Measurements
Results for physiological variables are presented in Table 1. Ambient temperature, temperature trend, temperature lag, and their interaction had a significant effect (p < 0.05) on respiration rate, with wethers having a significant increase in RR [BrPM] at 35 °C in comparison to 20 °C in both the gradual and drastic temperature patterns. However, the RR [BrPM] was higher when the temperature was changed drastically to 35 °C. The effect of temperature lag and its interaction with ambient temperature was also significant (p < 0.05) for HR [BPM], and animals showed lower HR [BPM] at 35 °C, particularly during the drastic temperature shift pattern. Finally, ambient temperature and its interaction with temperature trend and temperature lag and the interaction of temperature trend with temperature lag also had a significant effect (p < 0.05) on RT (°C). In this case, rectal temperatures were highest among animals at 35 °C after the drastic shift in temperature.

Behavior Classification
The classification performance metrics using the RF algorithm for identifying eating, lying, lying and ruminating, standing, and standing and ruminating are presented in Tables 2-4, representing a model fit across data obtained from all temperatures versus models fit from data within each temperature. When derived and evaluated across all data, the RF had moderate accuracy for behavioral classification (67 to 78%; Table 2). This range of accuracy was similar to those fits obtained from evaluating models derived within each temperature range (Table 3). Although the specificity of models for most behaviors was generally high (69 to 100%), the sensitivity was lower for some behaviors, particularly standing and standing/ruminating behaviors (28 to 43%; Table 3). Crucially, those models derived within individual temperature ranges did not achieve similar performance when evaluated against data obtained from other temperature ranges, resulting in marked drops in precision and somewhat lower accuracy (Table 4). To explore shifts in the data structure that may have contributed to the poor translatability across thermal ranges, Table 5 shows the influence of temperature range on measured behavior characteristics. Time per bout of eating, lying, and lying and ruminating was significantly affected by the thermal environment. Total time standing was significantly affected by the thermal environment, and as a reflection of that, the animals spent more time standing in the hot condition.

Physiological Responses
As a response to heat increment, livestock exhibit thermoregulatory adaptations such as behavioral, physiological, neuroendocrine, and molecular shifts designed to support the maintenance of body temperature within survivable limits [22]. The physiological responses categorized as respiratory rate, heart rate, rectal temperature, skin temperature, and sweating rate are usually exhibited following behavioral responses [23]. In the present study, animal responses such as behavior (lying, standing, eating, ruminating responses), respiration rate, heart rate, and rectal temperature of sheep undergoing heat stress were assessed. The animals showed higher RR and RT when the temperature was drastically changed to 35 °C in comparison with gradually changing temperature, suggesting that the effect of HS on RR and RT is amplified during extreme weather events. When developing management strategies to support climate-smart agriculture, specific mitigation approaches for extreme weather events may be necessary.
Despite thermoregulatory responses, animals' productivity tends to decrease during summertime [24]. In the United States, both large and small sheep farms rely heavily on pasture-based systems, with the large-scale farms typically operating on pasturelands and the smaller farms maintained on pastures and feedlots [25]. The outdoor housing systems used in the sheep industry increase the susceptibility of sheep to climate-related stress, specifically heat stress [26][27][28]. Therefore, mitigation strategies and technologies to help minimize the negative impact of heat stress on sheep will be necessary to maintain the resilience of this industry as the climate changes.
Increased RR is a very sensitive and widely used indicator of heat stress [3]. In ruminants, RR can be influenced by several factors, including the level of production, body condition, housing, cooling systems, and prior exposure to high temperatures [3]. Sweating and evaporation through the respiratory tract are the most important mechanisms of heat exchange between the sheep and its environment, with sweating being secondary due to the presence of wool [29]. In sheep, the normal respiratory rate is typically between 20 and 38 BrPM [30]. In this study, mean respiratory rates increased when the ambient temperature was gradually increased (17.9 vs. 66.0 BrPM), and there was a marked increase when the temperature underwent drastic changes (90.2 BrPM). However, the mean respiratory rate at 27 °C was similar to that at 20 °C, suggesting that sheep leverage changes in RR at temperatures higher than 27 °C.
Similar results for RR in Suffolk sheep were demonstrated by [31], who evaluated the effects of HS in different sheep breeds in tropical regions. Because RR changes were not observed at intermediate temperatures, this physiological indicator may be an insensitive means of identifying low-to-moderate stress instances associated with elevated but not excessive ambient temperatures. For reference, an RR of 60 to 80 BrPM is typically considered medium to high stress [32].
High environmental temperatures also increase pulse rates, reflecting altered circulation as a response to the increase in blood flow from the core to the periphery [29]. However, at extremely high temperatures, due to the decrease in the metabolic rate, the pulse rate might drop [33]. Moreover, heart rate is influenced by other factors such as age, metabolism, and biological activity [27]. The authors in [34] evaluated the responses of Merino sheep (average age of 12 months) under HS conditions and found an increase in HR up to 109 BPM when animals were exposed to 30 °C in an environmentally controlled experimental room. In this study, the animals' HR decreased during both gradual and drastic changes in temperature, with significantly lower HR for drastic change (52.8 beats/minute), suggesting that the decrease in metabolic rate during extreme weather events might have a stronger influence on HR. The HS heart rate identified is lower than those identified in previous studies [35,36], and these differences might be attributed to the influence of breed and age on HR. The animals from this study averaged 5.5 years of age, which would be considered quite old for wethers. Therefore, it is not surprising that the maximal HR observed in this study was lower than in previous studies using growing animals because maximal heart rates tend to decline with age.
A representative assessment of animal core temperature is the rectal temperature. Even under unfavorable climatic conditions, sheep, as strict homeotherms, leverage numerous physiological mechanisms to maintain their body temperature within a relatively narrow range [29]. Previous work has demonstrated that a rise of 1 °C or less in RT is sufficient to diminish the performance of most livestock [37], especially sheep. Under thermoneutral conditions, sheep have a normal RT range between 38.3 and 39.9 °C [29], with temperatures exceeding 42 °C considered dangerous [38]. It has been stated that the rise in RT in sheep starts at an ambient temperature of 32 °C, with mouth panting beginning at an RT of 40 °C [39]. In this work, ambient temperature had a significant effect (p < 0.05) on the RT, with an increase of 0.3 °C as animals were gradually adjusted from 20 to 35 °C. During drastic changes in temperature, the RT increased by twice that (0.6 °C). These values are still within the normal range for sheep considered to be in the thermoneutral zone. Our findings are consistent with the findings from [39], where HS effects in Merino and Omani sheep in an extensive system were evaluated. Similar results for RT were also found by [40] in Merino sheep under HS conditions in an extensive system. Despite the fact that the mean rectal temperatures observed within the present study were within the normal range, the increase in rectal temperature associated with 35 °C does suggest animals were experiencing HS conditions.
Although the physiological responses of sheep to elevated ambient temperatures are well documented within the literature, these values represent practical challenges because, in extensive production systems, their measurement as a means of determining thermal stress is not practical. In order to monitor HR and RT, in particular, the animal would need to be restrained and often separated from the remainder of the flock, which in and of itself can be a source of stress. As such, practical reliance on HR and RT as a means of diagnosing heat stress will be limited until better technologies for remote monitoring of these physiological parameters become available. An alternative to monitoring these physiological parameters is to monitor animal behaviors. Indeed, animals may begin to adapt daily behavioral budgets well before measurable changes in physiological indicators of heat stress are detectable. Therefore, monitoring behavioral characteristics might help to characterize animals experiencing early HS, which may help inform when and where mitigation-related management should be applied.

Effects of Temperature on Animal Behavior
An animal's physical state can be inferred from its action, which allows using animal behavior as an indicator of welfare [41]. The variability in behavior, the length of behavioral bouts, and the transition frequencies between activities and rest can be altered by HS [41]. To quantify those changes, behavior can be recorded continuously or sampled at regular intervals and can be characterized by metrics on individual activities or a complete ethogram of an individual or group [42]. The number of bouts, time per bout, and total time that animals spent in each behavior are presented in Table 5.
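The three summary metrics named above (number of bouts, time per bout, and total time) can be derived from a chronological sequence of per-minute behavior labels. The Python sketch below is illustrative only; it assumes that a bout is a run of consecutive minutes with the same label, a definition the source does not spell out, and the function name is hypothetical.

```python
from itertools import groupby


def behavior_budget(minute_labels):
    """Summarize a chronological list of per-minute behavior labels.

    Returns {behavior: {"bouts", "total_min", "min_per_bout"}}, where a
    bout is a run of consecutive minutes with the same label (an assumed
    definition).
    """
    summary = {}
    for behavior, run in groupby(minute_labels):
        length = sum(1 for _ in run)
        entry = summary.setdefault(behavior, {"bouts": 0, "total_min": 0})
        entry["bouts"] += 1
        entry["total_min"] += length
    for entry in summary.values():
        entry["min_per_bout"] = entry["total_min"] / entry["bouts"]
    return summary
```

For example, three minutes of lying, two of eating, and one more of lying yield two lying bouts totaling 4 min (2 min per bout) and one eating bout of 2 min.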
Behavioral responses to HS have been previously investigated in ruminants, mainly in cattle. Generally, the behaviors of interest involve dry matter intake (DMI) measurements. Even a slight rise in ambient temperature from 25 to 27 °C has been shown to decrease DMI in dairy cattle [43]. Collectively, the DMI responses to HS support the notable negative impact on health and production [23]. This influence of the thermal environment on HS exemplifies the challenge of waiting until clinical HS symptoms appear to apply stress abatement. Although the decrease in DMI is important for modulating the depressed productivity of HS animals, studies involving pair-fed thermoneutral controls reveal that the reduced DMI contributed to only ~50% of the decrease in productivity related to milk production in dairy cows [44]. Residual changes in productivity could be due to shifts in pre- or post-absorptive metabolism [45]. Measuring DMI was not an objective of this study. However, bouts, time per bout, and length of eating were lower in high ambient temperatures (Table 5), which is consistent with impaired eating behavior and reduced DMI in heat-stressed animals.
Measures of lying and standing behaviors are indicators of animal welfare and how animals interact with the environment [46][47][48]. Increased standing time was believed to be a response to heat stress in cattle driven by the need to expose more surface area and enhance cooling [13,49,50] with decreasing lying bouts as a consequence. Despite the fact that an increase in the total time standing was significantly associated with elevated ambient temperatures in this study, animals spent the majority of their time lying down. This large daily time budget for lying behavior is in contradiction with the literature. For example, [51] evaluated lying time and frequency as a behavioral response of HS in Holstein bull calves using an accelerometer and found that an increase in lying time and frequency during the daytime was possibly caused by acute HS. A possible explanation for the notable difference in daily time budgets measured in this study may reflect the age of animals and their comfort with experimental procedures. The sheep used in this work have been involved in numerous intensive research projects since their cannulation at 15 to 17 months of age. Their old age in the present study, coupled with habituation to experimentation, may have supported improved comfort with the experimental procedures supporting elevated lying time budgets observed herein. The fact that animals increased total standing time with elevated temperatures is likely the best reflection within the behavioral observations of animals' behavioral adaptation to heat stress.

Ability of an Open-Source Sensing System to Classify Behaviors
The analysis of animal behavior has been extensively explored with the advancement of precision technologies, which allow for automatically monitoring animal behaviors with a degree of detail not possible previously. However, such technologies for monitoring behavior have not been widely used in heat-stressed animals, and it is not clear whether traditional behavioral classification algorithms need to be updated to reflect the distribution of behaviors expected during HS. In sheep, the use of wearable sensors, mainly accelerometers, has been explored to detect behaviors such as lying, ruminating, walking, and grazing [52][53][54] and for more specific applications in health monitoring, such as lameness detection [55]. Nevertheless, these technologies can also be applied to understand the physiological state of an animal by monitoring its responses in a specific environment. Therefore, sensing technologies that distinguish the activities of animals under thermoneutral and stressed conditions, and that are more sensitive across the true distribution of behaviors during exposure to heat load, are critical for more precise management of HS conditions. Before sensing technologies can be used to understand behaviors, machine learning algorithms are needed to translate sensed data into meaningful output. The RF algorithm in this study could differentiate between behaviors with medium-high accuracy (Table 2) when derived and evaluated across all data from all temperature ranges. Eating was the most accurately classified behavior (78%), followed by lying and lying while ruminating (77% and 76%, respectively). These accuracies were lower in comparison with the 95% performance for standing and lying classification in sheep found by [56] and for standing and lying (87% and 84% accuracy, respectively) classification in cows found by [57].
The authors in [58] evaluated the use of an accelerometer to classify seven behaviors of interest in cows and found similar accuracy (80%) and lower sensitivity (52%) for feeding behavior in comparison with the present study even though a different algorithm was used. Overall, developing an approach that maintains satisfactory levels of accuracy while minimizing the amount of data analyzed is considered a main obstacle in using IMU [59]. Several factors can influence the accuracy of algorithms used in classifying behaviors, including the placement and orientations of the sensors, appropriate sampling frequency, number of animals, and the accelerometer itself [60]. Despite the significant progress of technologies, there is a gap in the use of those technologies to measure the behavior and adaptations of livestock to environmental challenges, specifically a lack of appropriate algorithms that are robust across climate conditions. In this context, to evaluate the robustness of the algorithm in different temperatures, the accuracy and precision of the random forest algorithm across the different temperature ranges were also tested (Table 3). In general, the results show a similar range of accuracies when evaluated within the temperature range used for derivation. Interestingly, the accuracy of eating behavior classification was highest in the 27 °C environment, while the lower accuracies for standing and ruminating while standing behaviors were consistent among all environments. Although lying behavior was the most frequent among all thermal environments, it was not always the best classified, which suggests that the imbalance of data associated with normal time budget differences among behaviors was not solely responsible for creating deviations in the accuracy of the models. When models derived under specific thermal environments were evaluated based on data obtained from different temperatures, there were notable shifts in model performance (Table 4).
The models derived based on thermoneutral data only (20 °C) had notable reductions in accuracy and extremely poor precision for nearly all behaviors when evaluated on data obtained from 27 °C and 35 °C. Lying behavior was the most resilient, with marginally improved accuracy and acceptable precision when evaluated against data from 27 °C, likely due to the fact that the time budgets for lying behavior were similar between 20 °C and 27 °C. The models derived from data at 27 °C and 35 °C also performed poorly when compared against data obtained from other thermal environments, with several behavioral evaluations yielding accuracies of ≤50% and precision of 0.
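The within-environment versus cross-environment evaluation described above can be sketched as a small protocol: derive a model on 70% of the data from one temperature, score it on the held-out 30% from that temperature, and score it on the data from every other temperature. The Python below is a hedged illustration of that protocol only. To stay self-contained it swaps in a deliberately simple nearest-centroid classifier in place of the random forest used in the study, and all function names and the data layout are hypothetical.

```python
import random


def split_70_30(rows, seed=0):
    """Shuffle rows reproducibly and split them 70% train / 30% test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def fit_centroids(train):
    """Stand-in classifier (not a random forest): mean features per label."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest (squared distance)."""
    def dist(label):
        return sum((f - c) ** 2 for f, c in zip(features, centroids[label]))
    return min(centroids, key=dist)


def accuracy(centroids, rows):
    """Fraction of rows whose predicted label matches the observed label."""
    hits = sum(predict(centroids, f) == y for f, y in rows)
    return hits / len(rows)


def cross_environment_accuracy(data_by_temp, seed=0):
    """Train per temperature; score on own held-out 30% and on all data
    from every other temperature.

    data_by_temp: {temp_c: [(features, label), ...]}
    Returns {(train_temp, eval_temp): accuracy}.
    """
    results = {}
    for train_temp, rows in data_by_temp.items():
        train, held_out = split_70_30(rows, seed)
        model = fit_centroids(train)
        for eval_temp, eval_rows in data_by_temp.items():
            target = held_out if eval_temp == train_temp else eval_rows
            results[(train_temp, eval_temp)] = accuracy(model, target)
    return results
```

Laying the results out as a train-by-evaluation grid makes the pattern easy to read: diagonal entries reflect within-range performance, and off-diagonal entries reflect transferability.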
Collectively, this exploration shows a critical limitation of precision technologies leveraging machine learning to translate sensor data to behavioral classification. The training data for these algorithms must represent a wide array of expected production conditions to help ensure adequate behavioral classification among those conditions. For example, many evaluations of accelerometers report details about the animals, their housing, the time of day, and other important methodological aspects of data collection, but they rarely refer to other environmental descriptors, such as the temperature. Similarly, longitudinal analyses of precision technologies focus predominantly on overall classification performance and fail to specifically investigate systematic errors in prediction, such as those which may be driven by failing to properly account for behavioral budget shifts during altered thermal environments. As the body of the literature leveraging commercial and open-source behavioral sensing technologies expands and matures, efforts to collate data among tools, and associated meta-data, may be essential to help better understand how methodological and environmental factors (such as age, breed, housing system, thermal environment, feeding schedules, group size, etc.) can be controlled for to help develop robust behavioral classification techniques that are applicable across a broader array of potential production contexts.

Conclusions
Heat stress is a well-documented factor that can negatively impact livestock productivity. Traditionally, monitoring heat-stressed animals has relied on visual observations of behaviors or conventional methods such as manual recording of body temperature and respiration rate. In this study, we sought to address this challenge by employing open-source IMU sensors to detect changes in sheep's behavior occurring in response to heat stress. Our findings demonstrate an acceptable level of accuracy in differentiating between animals' behaviors in hot and thermoneutral environments when using algorithms trained on data from all thermal conditions studied. However, we observed a significant decline in algorithm performance when tested outside the thermal range used for the derivation data. This highlights the importance of carefully considering the experimental contexts when training IMU algorithms for behavioral classification. Given the diverse production contexts, environments, and animal types in which these sensors may be applied, data sharing among research teams studying similar technologies is crucial. Moreover, sharing data facilitates the validation and reproducibility of results, which can enhance the capacity to generate accurate and reliable classification approaches. However, such an effort would require considerable expansion in the meta-data recorded and considered with traditional behavioral classification studies. Overall, the utilization of open-source IMU sensors offers a promising approach for monitoring heat-stressed livestock and enhancing livestock management practices.

Institutional Review Board Statement: The animal study protocol was approved by Virginia Tech Institutional Animal Care and Use Committee (Protocol #20-200, 06/01/2021).

Informed Consent Statement: Not applicable.
Data Availability Statement: The authors will provide the raw data supporting the conclusions of this article upon request.

Conflicts of Interest: The authors declare no conflict of interest.