Evaluation of a Binary Classification Approach to Detect Herbage Scarcity Based on Behavioral Responses of Grazing Dairy Cows

In precision grazing, pasture allocation decisions are made continuously to ensure demand-based feed allowance and efficient grassland utilization. The aim of this study was to evaluate existing prediction models that determine feed scarcity based on changes in dairy cow behavior. During a practice-oriented experiment, two groups of 10 cows each grazed separate paddocks in half-days in six six-day grazing cycles. The allocated grazing areas provided 20% less feed than the total dry matter requirement of the animals for each entire grazing cycle. All cows were equipped with noseband sensors and pedometers to record their head, jaw, and leg activity. Eight behavioral variables were used to classify herbage sufficiency or scarcity using a generalized linear model and a random forest model. Both predictions were compared to two individual-animal and day-specific reference indicators for feed scarcity: reduced milk yields and rumen fill scores that undercut normal variation. The predictive performance of the models was low. The two behavioral variables “daily rumination chews” and “bite frequency” were confirmed as suitable predictors, the latter being particularly sensitive when new feed allocation is present in the grazing set-up within 24 h. Important aspects were identified to be considered if the modeling approach is to be followed up.


Introduction
Current sensor technology in farming offers decision support in addition to the farmer's knowledge and experience. Binary systems indicate need or no need for action, thereby enabling farm staff to respond to crop and animal reactions in a timely manner.
In arable farming, many precision technologies have already been adopted; however, the implementation of such in the livestock sector is hugely important, as it can improve consumer trust and product traceability [1]. Sensor data gathered in livestock farming play a growing role for proactive, instead of reactive, management to improve animal production, health, and welfare [1][2][3]. The increasingly available data benefit animal identification and monitoring as well as the control of animal production according to current regulations [2]. Predictions on an individual animal basis that consider physiological evolvements over time and might even combine information of various sensors may have the advantage of detecting vulnerabilities in large herds before it is possible for humans, but they generate large amounts of data, which calls for machine learning approaches [4,5].
However, assessment of the generalizability of prediction models is often lagging behind and, thus, their commercial implementation and farmer's adoption [1,6].
In livestock farming, cattle have been the most studied animal species for using the Internet of things and developing machine learning algorithms to improve management [4]. Non-invasive biometric sensor technologies that monitor behavior and physiological parameters in real time are an important information source for management decisions [1]. To date, many sensor technologies have focused on applications in indoor systems, because conditions are more steady, and technologies rely less on battery power than in pasturebased systems, but tools are increasingly being developed for use under grazing conditions as well [7].
Using behavioral sensor data combined with machine learning approaches is particularly of relevance in pasture-based dairy systems, because grazing animals are not under supervision at all times. For instance, a binary system could support the decision to allocate new pasture as soon as paddocks are grazed empty and feed intake of cows decreases. This could allow for more efficient herbage utilization in future, as farmers could respond to animal reactions with a pasture change before economic losses occur. Additionally, such an approach could be of great use in grazing areas that are difficult to reach under severe weather conditions.
With a known dry matter (DM) intake of cows, a precision grazing strategy can be pursued [8], and the timing of a pasture reallocation can be optimally chosen without wasting feed residues or using grasslands too intensively. However, measuring available herbage quantity by means of weekly farm walks using rising plate meters is laborious [9,10] and error-prone [11] and so are visual observations by humans. Measuring the available herbage quantity on pastures via unmanned aerial vehicles and satellites has been heavily investigated in recent years [12][13][14]. However, when coupled with grass growth models, this approach only allows for estimation of DM intake on a herd basis and not for individual animals. Moreover, it is difficult to monitor herbage DM intake of individual dairy cows, because it is influenced by multiple environmental and animal physiological factors [15] and, thus, highly variable.
Nevertheless, the decision of changing paddocks can also be based on changes in the animals' behavior in response to declining herbage availability. A few studies have already ventured into this vision. For instance, Hamidi et al. [8] observed longer walking distances of suckler cows, and O'Driscoll et al. [16] found fewer and shorter lying bouts together with less time lying in dairy cows with low than with high herbage availability on pasture. Moreover, Werner et al. [17] observed, amongst other behavioral indicators, a reduced rumination activity and an increased grazing bite frequency under restricted, compared to unrestricted, herbage allowance in a one-day rotational system. This is consistent with the results of Rombach et al. [18], who included bite frequency during grazing as an important predictor, with a negative relationship in their estimation model of herbage DM intake.
The study by Werner et al. [17] was followed up by Shafiullah et al. [19], who used their data to train models via supervised learning that evaluated sufficiency or insufficiency of herbage allowance on a 24 h basis in grazing dairy cows. The idea was to support farmers in scheduling paddock rotations before cow performance dropped. Using a binary system, Shafiullah et al. [19] developed and validated several prediction approaches based on changes in the cows' number of rumination chews per day and of their grazing bite frequency, along with six other variables. Among various statistical learning methods, a random forest model (RFM) performed best in classifying the binary feeding status of grazing dairy cows. The more classical generalized linear model (GLM) has performed slightly worse. The two approaches attained AUCs (area under receiver operating characteristic curve) of 88% (RFM) and 85% (GLM) using cross validation, which is satisfactory.
However, in order to correctly assess the generalizability of such models to different grazing conditions of dairy cows, and, thus, their possible implementation in future smart farming systems, they should be evaluated in commercial herds [6,20] and for a differing grazing system, such as a multi-day rotation. Variables selected and prediction models developed by Shafiullah et al. [19] have not yet been evaluated using independent data from another dairy farm.
The aim of the present study was, therefore, to evaluate two prediction models recommended by Shafiullah et al. [19] under commercial farming conditions. For this purpose, a practice-oriented rotational grazing system was set up with a gradually decreasing pasture allowance over six subsequent grazing days. Allocated paddocks yielded 20% less herbage DM than needed by the grazing cows. During the six-day grazing cycles, cow-individual days with feed scarcity were identified on the basis of the prediction models by means of behavioral changes and compared with reference indicators based on milk yield and rumen fill. More specifically, the two objectives were: (i) to investigate how well the modelpredicted binary classification of "sufficient" and "scarce" pasture allowance for each cow and day agreed with the reference indicators; and (ii) to assess if the behavioral variables selected by Shafiullah et al. [19] to predict a pronounced feed insufficiency are also sensitive to a gradual decline in herbage availability.

Materials and Methods
In an experiment with commercial farming conditions, dairy cows grazed half-days in a rotational grazing set-up at the surrounding grasslands of a research barn during an entire grazing season. During three experimental months in May, July, and September, data were gathered. In these periods, the animals grazed for six days in an allowance-restricted pasture so that a herbage scarcity was achieved towards the end of the six-day grazing cycle. Animals were then allocated to a new paddock.

Grazing Site
The experiment was carried out at the Agroscope Dairy Research Barn Waldegg in 2019, located in the north-east of Switzerland (47 • 49 N, 8 • 92 E, and 534 m above sea level). While the cumulative annual precipitation was 1293 mm, average daily ambient air temperatures during the course of the year varied from −5.3 • C to 26.8 • C (agrometeo.ch, location: Wil, Saint Gall, Switzerland). A total grazing area of approximately 1.7 ha was used during the three experimental months (Figure 1). For the purpose of the experiment, this grazing area was divided into four adjacent paddocks per experimental month with varying sizes between 0.17 and 0.52 ha each, depending on available herbage DM per ha. The paddocks were grazed sequentially. At the beginning of each experimental month, the paddock rotation started again from the area closest to the barn. While the boundaries of the grazing area were fenced with a fixed electric fence, a mobile electric fence separated the paddocks. The soil beneath the grazing area was rich in clay (27%, determined by field texturing) and had a low humus content of 2.5-3.5%. The permanent grassland had not been grazed for several years but was used for silage and hay production. Approximately 80% of the area had been re-sown in 2015 with a grass mixture (28% perennial ryegrass, 72% other grasses) and the remaining area in 2013 with a grass-clover mixture (10% white and 4% red clover, 28% perennial ryegrass, 58% other grasses). Each paddock was spatially arranged to include both sward mixtures so that the botanical composition of their vegetation was very similar.

Animals and Housing
In total, 20 Brown Swiss cows were used, comprising seven primiparous and 13 multiparous cows in their second to fifth lactation, resulting in a mean lactation number of 2.4 (one standard deviation (SD): 1.2). At the beginning of the experiment, 19 cows were, on average, 142 days in milk (SD: 44) and one exceptional cow 420 days in milk. The average body weight of the cows at the beginning of the experiment was 626 kg (SD: 64) and their daily milk yield 29.1 kg (SD 6.3). Body weight of cows was determined using a common measuring tape for cattle that measured the circumference of the torso close to the front legs. Body condition scores (BCS) of cows were assessed by a well-trained dairy production advisor once before each experimental month using scores from 2 to 5, including 0.25 steps [21]. The cows had, on average, a BCS of 2.8 (SD: 0.4) in April before the experiment began and of 3.1 (SD: 0.5) in September, when it ended.
The cows were kept in two groups of 10 cows each, hereafter referred to as Group A and Group B. The groups represented a treatment repetition and differed only in that they were allocated to a new paddock with a temporal shift. Individual cows were assigned to the two groups so that the number of cows with and without grazing experience was balanced (i.e., always or never grazed before, respectively). In addition, the Cow Groups A and B were balanced with respect to average body weight (627 kg and 625 kg), milk yield (29.4 kg d −1 and 28.8 kg d −1 ), and BCS (2.9 and 2.8). The average lactation number slightly differed between the two groups (A: 2.7 and B: 2.0).
Cows were milked twice daily at 05:30 and 16:30 in a side-by-side milking parlor (Lemmer-Fullwood GmbH, Lohmar, Germany). Both groups were housed in separate areas of a cubicle free stall barn and grazed on separate paddocks. Throughout the entire vegetation period (mid-March to mid-November), cows grazed the permanent grassland for half-days in a multi-day rotational regime. Cows were kept on pasture between milkings from 07:00 to 16:00 during the spring and autumn, whereas, during the warm summer months (June to August), they grazed overnight from 18:00 to 05:00. During the remainder of the day, the animals were kept indoors, where they had access to roughages and concentrates offered at a feeding bank and a feeding station, respectively. Both on pasture and indoors, cows had free access to fresh drinking water. All experimental procedures were in accordance with the Swiss guidelines for animal protection and approved by the responsible Animal Experimental Committee under the file reference TG01/19.

Experimental Setup
Each group of 10 cows grazed on one paddock over six consecutive days (i.e., one grazing cycle; Figure 1). The allocated paddock area was calculated to supply 20% less than the group's total DM requirement for maintenance and lactation for the entire six-day grazing cycle (see section below). As a consequence, the herbage allowance on pasture decreased from day 1 to day 6, when it was considered to have become scarce.
The six-day grazing cycles were repeated twice every experimental month, directly one after the other, with the exception of May, when grazing had to be interrupted for four days between both grazing cycles due to snowfall. To account for possible weather effects, the commencement of grazing cycles of the two groups was offset by two days (Figure 1).

Feed Allocation
In order to adjust the paddock size and, thus, herbage allowance for the cows' halfdays on pasture and half-days in the barn, the daily DM requirements were estimated using an Excel-based calculation tool for feed advisors (Agridea FuPlan, version 7.90, Lindau, Switzerland). The tool took into account information on the nutritional quality of the supplement feeds and of the pasture vegetation from a feed database (www.feedbase.ch (accessed on 21 March 2019)) and the average body weight of the 20 cows as well as their average milk yield and concentrations of lactose, fat, and protein. The latter three parameters were measured by an accredited laboratory (ISO 9622 method; Suisselab AG, Zollikofen, Switzerland). The estimated daily herbage requirement was 11.3 kg DM per cow on pasture and 8.8 kg DM per cow of supplement feed offered in the barn, with 5.7 kg DM per cow of meadow hay as well as 3.1 kg DM per cow of dried whole-plant corn pellets in the barn (i.e., 100% feed allowance). From this, the feed allowance on pasture and in the barn was reduced by 20%. The restriction level of 20% was chosen to represent a practical on-farm scenario of a mild feed restriction, which modifies ingestive behavior [16,17,22] but does not dramatically reduce milk yield [23][24][25].

Herbage Allowance
To determine the respective size of each paddock, the compressed sward height (CSH) was measured one day before each grazing cycle by using a rising-plate meter (RPM; Grasshopper ® G2 Sensor, App version 4.02, TrueNorth Technologies, Shannon, Ireland). Based on established calibrations per experimental month (see Section 2.5), average pregrazing CSH (mm, n = 25) was converted into estimates for above-ground herbage mass (HM, kg DM ha −1 ), assuming herbage > 20 mm CSH is available for cows, because cows are able to graze rather deeply when they are at strongly restricted pasture allowance [17]. Daily herbage regrowth during the six-day grazing cycles of 90, 60, and 30 kg DM ha −1 were assumed for May, July, and September, respectively, according to the location's altitude, soil type, climatic conditions, and time of the year [26]. Additionally, 10% of HM losses (on DM basis) were considered due to animal trampling and rejected patches.
Subsequently, the required size of the paddocks was calculated from the net available HM and the HM needed to achieve a herbage allowance on pasture of 80% of the animals' requirements for herbage DM over the entire six-day grazing cycles. Then, paddocks were established using electric fences and the RPM's global positioning module. Due to inaccuracies in measuring CSH and spatial coordinates, as well as to inaccuracies in estimating HM, the amount of allocated herbage varied between paddocks (Table 1).

Allowance of Roughage and Concentrate Feeds Offered in the Barn
The amounts of both roughages (31 kg DM per group of dried whole-plant corn pellets and 57 kg DM per group of meadow hay) offered in the barn were also reduced by 20%. To ensure that all animals had the same access to roughages, cows were fixed one by one after they returned from milking. After milking and rumen fill scoring (see Section 2.6.2) were completed, roughages were made accessible to the whole group of cows simultaneously. For this, the roughages were distributed equally along the feeding banks of both groups once daily. Additionally, commercial dairy concentrates were fed to cows via a feeding station, with the amounts depending on their individual milk performance (average 0.7 kg DM cow −1 d −1 ; SD: 1.2). Daily amounts of both roughages and concentrate feeds offered were kept constant across each six-day grazing cycle, except meadow hay, which was offered ad libitum at day 1 of each grazing cycle to compensate possible effects of the noticeable feed restriction on the last day of a previous grazing cycle.
From day 2 on, refusals of the roughage feeds were weighed daily after cows spent the half-day in the barn. While cows refused on average 4.3 kg DM (SD 3.3; n = 12) per group of roughages across all grazing cycles and both cow groups on day 2 of each grazing cycle, daily amounts of residuals were very low during days 3 to 6 of each grazing cycle (average 1.1 kg DM per group; SD: 0.9; n = 12).

Pasture Measurements
Prior to each experimental month, i.e., in May, July, and September, a calibration was developed to estimate HM from CSH. For this, above-ground HM was harvested manually in 9-15 randomly selected but evenly distributed plots (0.5 m × 0.5 m) across the anticipated grazing area. Within the respective plots, an average CSH (n = 3) was determined beforehand. The fresh samples were weighed, dried in a ventilated oven at 60 • C for a minimum of 48 h, and weighed again to determine DM concentrations and the HM (kg DM ha −1 ). Subsequently, linear relationships between HM (y) and CSH (x) were established (Table A1).
After each daily grazing event, herbage availability on paddocks was monitored. To do so, the CSH was measured at 25 equally distributed points per paddock using the above-mentioned RPM to determine post-grazing CSH.
2.6. Animal-Related Measurements 2.6.1. Behavior To measure the lying and feeding behavior, all 20 cows were equipped with pedometers on the right foreleg (Lemmer-Fullwood GmbH, Lohmar, Germany) and halters with a noseband sensor (RumiWatch, Itin&Hoch, Liestal, Switzerland) at the beginning of each experimental month. Pedometers measured the daily number of lying bouts (i.e., LAYDOWN, Table 2) using a 3-axis accelerometer. They were automatically read when the animals entered the milking parlor. The noseband sensors measured jaw and head movements in a 10 Hz resolution using a pressure tube and pressure sensor on the cow's nose in combination with a 3-axis accelerometer, which were both integrated into a halter around the head of the animal. The RumiWatch system allows for real-time transmission of measurements via the wireless networking protocol ANT+ and preliminary classification of behaviors so that users can check data plausibility. However, using the raw data to classify behaviors includes validity checks and makes the classification more accurate [27]. The raw data are saved on a memory card integrated in the halter and can be read via micro USB. User-defined post-processing of sensor data is then possible on a desktop computer using the RumiWatch Converter software. Technical specifications of the RumiWatch system and information about the algorithms used to classify the behaviors are given by Zehner et al. [27] and Werner et al. [17]. Table 2. Behavioral variables used in the present study (adapted from Werner et al. [17] and Shafiullah et al. [19]).

Variable (Unit)
Corresponding Eating, grazing, and rumination behaviors as well as head activity were extracted from the sensors' raw data at a 1-hour resolution using the RumiWatch Converter V0.7.4.13. Grazing and rumination bouts were detected and their duration recorded, as described by Werner et al. [28].
The eight behavioral variables selected by Shafiullah et al. [19] were relevant for the purpose of the present study, i.e., the daily prediction of herbage scarcity (Table 2). Therefore, the hour-based behavioral variables-bite frequency, rumination chews per bolus, and head movement activity-were averaged across each day, whereas the daily sum was calculated for the following variables: daily rumination chews, daily rumination time, and eating bouts. Finally, the mean duration of a rumination bout (i.e., the variable "rumination time per bout") was calculated by dividing the daily rumination time by the number of detected rumination bouts. Daily rumination times of <117 min (n = 9), which was the minimum value in the dataset of Shafiullah et al. [19], were not considered plausible [29]. Therefore, nine observations were removed from the present dataset.

Milk Yield and Rumen Fill
Milk yield of individual cows was automatically recorded by the milking system throughout the whole experiment and the daily sum of individual morning and evening milk yields computed (kg cow −1 ). Rumen fill was visually assessed from the depth of the paralumbar fossa as an indicator for the feed intake during the last four to six hours [30,31]. Rumen fill scoring was performed when cows were fixed at the feeding bank after the milking event that followed grazing and before roughages were offered. Therefore, it took place 1 to 1.5 h after cows had grazed last. Two well-trained direct observers assigned scores between 1 (low fill quantity) and 5 (high fill quantity), following the criteria of Zaaijer and Noordhuizen [30] and using the further developed protocol of Agroscope [32] that included 0.5 scores. The two observers showed high concordance in their scoring [33].

Reference Indicators for Herbage Scarcity
Both rumen fill scoring and milk yield served the purpose of having a practice-oriented reference indicator for evaluating the herbage availability experienced by individual cows. The first is relatively easy to determine by farmers themselves. Milk yield is usually automatically recorded and represents an important economic factor for the producer. The amount of milk a cow produces is strongly related to the cow's DM intake [34][35][36].
In the present experiment, we deliberately chose not to use estimation models for DM intake during grazing as reference measurement. First, they are not validated yet to evaluate day-to-day differences in individual cows. Additionally, they are error-prone, depending on the measurability of their input variables and their variability under varying environmental conditions [18,37,38].

Baseline Values
Cow individual baseline values for rumen fill score and milk yield were calculated by taking an average from days 1 and 2, in the case of rumen fill, and days 2 and 3 of each grazing cycle, in the case of milk yields. Therefore, changes in rumen fill score and daily milk yield relative to cows' respective baseline values were used as indicators for scarce herbage availability and the subsequent low DM intake. For this, absolute changes in rumen fill scores and relative changes in milk yields were calculated for each cow for days 3 to 6 and 4 to 7, respectively. As daily milk yields gradually declined across the six-day grazing cycle in some cows until one day after each grazing cycle ended, in May and September we used the milk yield of the subsequent day to evaluate the herbage availability on the previous day (e.g., daily milk yield on day 4 as an indicator of the herbage availability on day 3). In July, when cows grazed overnight until 05:00, the daily sums of milk yields were used for that exact day and not the day after, because milk yields began to recover at the subsequent day already.

Reference Classification
Declines in milk yield and rumen fill beyond the normal variation were considered to indicate whether herbage availability experienced by each cow on days 3 to 6 was either "sufficient" or "scarce". To determine reduction limits of normal variation in either reference indicator, SDs in daily milk yields and rumen fill scores were calculated for each cow and each experimental month across days 1 and 2 and days 2 and 3, respectively, of both consecutive grazing cycles (n = 4). The average SD across all cows and experimental months in rumen fill scores was ± 0.38 (n = 60; Figure A1). Since milk yield of cows varied considerably, respective SDs were expressed relative to the mean per cow and experimental month, with an average of ± 5.9% (n = 57; three extreme outliers visually identified and excluded). These generated SD values were then multiplied by a factor of 3, which represents the 99.73% confidence interval of the standard normal distribution [39] ( Figure A1), resulting in the reduction limits of 1.14 for rumen fill scores and 17.7% for milk yields. The reduction limits were interpreted as follows; compared to respective baseline values, absolute differences in rumen fill scores < 1.14 and relative differences in milk yields <17.7% on days 3 to 6 were attributed to an expected normal day-to-day variation (class "no herbage scarcity" (=0)), whereas those above these limits were considered as a warning sign for scarce herbage availability (=1). For example, a reduction in milk yield by 20% compared to the baseline value was an indication of herbage scarcity. No herbage scarcity was assumed for days 1 and 2, used to calculate the baseline values (classification = 1).

Application of Prediction Models
The herbage availability was additionally predicted based on changes in the eight behavioral variables (Table 2) as determined by Shafiullah et al. [19]. The authors used, among others, GLM and RFM to predict the binary herbage availability status of one cow at one specific day, i.e., of a cow-day. The prediction models of Shafiullah et al. [19] were established and validated using a dataset from an Irish full-time grazing system with a 1-day rotation and two portions of new pasture strips per day [17]. The dataset included 1221 observations from 40 individual cows. Animal behavior was measured using the RumiWatch noseband sensors and pedometers as described above. In the Irish study, cows had been offered (in % of their DM intake capacity) 100% of herbage DM ("sufficient", n = 629) or 60% of herbage DM ("insufficient", n = 592) on pasture; the latter is referred to as "scarce" hereafter. This dataset was used as training data in the present study, whereas the new data collected under commercial farming conditions served as test data. The test dataset excluded cows in heat, because estrus is known to affect feed intake as well as behavior of dairy cows [29]. Besides, in commercial farming, animal caretakers are often aware of a cow's estrus and know how to interpret the associated behavioral changes. Moreover, some observations had to be omitted, because data on one or several of the eight behavioral variables, the milk yield, or rumen fill scores were missing. In total, 583 cow-days served as test data.
For the test dataset (provided in File S1), the herbage availability classes 0 and 1 were predicted using both approaches, GLM and RFM. The analysis was performed using the statistical computing language R (Version 4.0.3, R Foundation, Vienna, Austria). Therefore, the functions "glm" of the R-package "stats" and the function "randomForest" of the R-package "randomForest" were applied as described by Shafiullah et al. [19,40,41]. The binary classification, i.e., the response variable, was based on the probability level 0.5, such that ifp < 0.5, y = 0 (i.e., negative class indicating sufficient herbage availability), and if p > 0.5, y = 1 (i.e., positive class indicating scarce herbage availability).

Predictive Performance
To evaluate the performance of the binary classification, model predictions were compared to both reference indicators, the herbage availability classification derived from the milk yield recordings and from the rumen fill scoring, as well as the combination of both reference indicators. For the latter, cow-days with either a significant milk yield (positive class = 1) or rumen fill reduction (positive class = 1) detected on days 3, 4, 5, or 6 were classified as positive. The remaining cow-days were classified as negative (=0).
With this, confusion matrices were created using the R package "caret" [42], where prediction classes and reference classes (i.e., true) were contrasted to identify true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). Thereafter, the sensitivity (i.e., ratio between TP and the sum of TP and FN), the specificity (i.e., ratio between TN and the sum of TN and FP), and the positive predictive value (PPV; i.e., ratio between TP and the sum of TP and FP) were calculated. Specificity and sensitivity were measures of the relative number of the true negative and true positive cases a model was able to detect. In contrast, the PPV indicated how many cases predicted as positive were actually positive cases. Finally, the area under the receiver operating characteristic curve (AUC) was determined using the R package caret to obtain a measure for model accuracy when corrected for imbalanced class distribution. Binary classifications with an AUC < 0.5 were assumed to indicate a failing performance, because 0.5 describes a random classification [20]. AUC values of 0.5 to 0.6 were considered to indicate a weak performance, whereas those greater than 0.6 indicated an even more viable model performance.
In order to assess the performance of the prediction models in different animals more accurately, several cow subgroups were formed based on their characteristics: parity, grazing experience from previous years, milk performance, gestation stage, and body condition (Table 3). Classification limits for milk performance and BCS were chosen to balance the sample size per class as much as possible. The assignment to a subgroup was based on the cows' characteristics at the beginning of each grazing cycle (i.e., for milk performance, gestation stage, and BCS) and did not change within a six-day grazing cycle. Lastly, the performance in predicting scarce herbage availability of the reference classification created from milk yield recordings was investigated. To do so, the classification derived from milk yield recordings was compared to the reference classification derived from rumen fill scoring. The same performance metrics were calculated as described above.

Variable Importance
To visualize the effects of the gradually decreasing herbage availability on pasture over the six grazing days on cow behavior, a mixed effects model was constructed using the R packages nmle and splines. Within the function lme [43], cow ID and paddock ID were random effects for intercept and slope, respectively. The day within each grazing cycle (six factor levels: days 1, 2, 3, 4, 5, and 6) was a fixed effect specified as natural cubic spline. For the natural cubic splines, five degrees of freedom were used [44]. As a result, the model included four internal knots as well as two boundary knots at days 1 and 6. The predicted splines were plotted in combination with violin plots along with the lower and upper 95% confidence intervals. Behavioral variables that did not indicate a relevant change in one direction, i.e., either decrease or increase, during the last four grazing days were identified as predictors of low relevance. They were then removed as predictors, and the predictive performance of the models was retested as described above.
The R package ggplot2 and the function geom_violin were used to present data in violin plots [45]. With their width, the violin plots indicate the probability density of the data at certain values.
For the analysis of variable importance (i.e., changes in behavioral variables as a response to gradually declining herbage availability), Paddock 1 was excluded (highlighted in Figure 2), because cows might have avoided grazing due to heavy rains on day 3 that had covered herbage with soil. Figure 2. Mean post-grazing compressed sward heights per paddock (a), mean rumen fill score across 20 cows (b), and mean next day's milk yield (recorded on days 2 to 7 as indicator for herbage availability on days 1 to 6, respectively) across 20 cows (c) at each grazing day, respectively. Lines between points connect measurements that belong to the same grazing cycle and do not visualize a regression analysis. Highlighted in gray is Paddock 1, which was excluded for the analysis in Section 3.3.

Effects of the Paddock Allocation
During the six-day grazing cycle, the available herbage on pasture decreased from an average pre-grazing CSH of 125 mm (SD: 24) to an average post-grazing CSH of 50 mm (SD: 14) at day 6 ( Figure 2). At that day, the CSH ranged from 30 mm to 67 mm, except for Paddock 1 (Table 1), where post-grazing CSH was 80 mm.
Both reference indicators, which were used to identify scarce herbage availability, changed noticeably across all cows during the six grazing days (Figure 2). Milk yields decreased on average from 23.2 kg (SD 6.8) to 20.2 kg (SD 6.4) over six days, while rumen fill scores declined from 3.7 (SD 0.59) to 2.9 (SD 0.59) from day 1 to day 6, respectively.

Herbage Availability According to Reference Indicators
On average, milk yields and rumen fill scores steadily declined until day 6 ( Figure 3). On that last day of grazing at paddock, average milk yield and average rumen fill score decreased by 10.4% (SD 13.5%) and 0.73 (SD 0.54), respectively, compared to their baseline values on days 1 and 2. However, even on day 6, decline in milk yield and rumen fill score was not always above the limit to classify a cow-day as "scarce". Thus, there was a relatively small number of positive reference classes in the test dataset. Points indicate means across 20 cows. The width of the violins indicates how high the probability density is at the given value of the y-axis. Any change below the red line was classified as 1 (i.e., scarce herbage availability), whereas any change above the line was classified as 0 (i.e., sufficient herbage availability). The presented milk yields were recorded one day later than indicated in the graph as indicators for the herbage availability on the previous grazing day.
With the limit in the reduction in daily milk yields of 17.7% of the baseline value, a total of 72 cow-days were classified positive, indicating scarce herbage availability (Table 4). Using the reference indicator of rumen fill scoring with the reduction of 1.14 as limit, there were 19 cases fewer (53 cow-days). Both reference indicators agreed in 24 cases; hence, combining both resulted in 101 out of 583 cases being positive (17.3%). Nevertheless, with all three measurements, the number of positive cases increased towards day 6 ( Table 4), from 12.9% of the cow-days on day 3 to 42.4% of the cow-days classified as "scarce" on day 6 across all grazing cycles when both reference indicators were combined. Table 4. Number of cow-days with scarce herbage availability identified by the reference methods on days 1 to 6 across six grazing cycles for two groups of 10 cows each. The number of total observations per grazing day is provided in parentheses. + It was assumed that herbage on pasture was sufficient on days 1 and 2, because meadow hay was fed ad libitum on day 1 before grazing began and compressed sward height was at its highest during these days (Figure 1). * Classified as "scarce" as soon as either the milk yield or rumen fill reference indicator classified a cow-day as "scarce".

Behavioral Changes
Compared to the training data, mean bite frequencies were lower in the test data, and mean number of rumination chews per bolus tended to be greater (Figure 4). During the six days at paddock, some of the eight behavioral variables changed in a relevant manner as herbage availability gradually decreased, indicating their importance in predicting herbage scarcity. All variables indicated a visible decline from day 1 to day 2, except the frequency of eating bites, which increased. This change in behavior was due to ad libitum feeding of meadow hay on day 1.
Mean bite frequency, estimated by the mixed effects model with splines, continued to increase from day 2 (48.8 bites min −1 ) to day 6 (54.6 bites min −1 ), whereas the increase was pronounced from day 2 to day 3 (+3.0 bites min −1 ) but very small during subsequent days (+1.4, +0.5, and +0.9 bites min −1 ). At the same time, the number of eating bouts, the daily number of rumination chews, the rumination time per bout, and the daily rumination time tended to decrease as days at paddock progressed, particularly towards days 5 and 6 of each grazing cycle ( Figure 4). Conversely, the head movement activity and the number of lying bouts did not change in a clear direction.
The greatest decline in the number of eating bouts was observed from day 5 to 6 (−1.5 n d −1 ). Daily rumination chews and rumination time decreased greatly from day 4 to 5 (−1916 n and −26.5 min d −1 , respectively). Similarly, rumination time per bout declined pronouncedly from day 4 to 5 (−1.5 min bout −1 ) as well as from day 5 to 6 (−1.5 min bout −1 ).
The number of rumination chews per bolus decreased slightly over days 3 to 6, but it was not until day 6 that a slight change was seen (−1.0 n bolus −1 compared to day 5). Nonetheless, this variable, along with the head movement activity and the number of lying bouts, was irrelevant in predicting a gradually decreasing herbage availability. This is supported by the fact that the variables did not change relevantly during the last four grazing days; they neither increased nor decreased.
When data were analyzed separately for the time cows spent on pasture or in the barn, bite frequency and number of eating bouts during the time in the barn changed more pronouncedly with advancing grazing cycle than during the time on pasture ( Figure 5). Conversely, head movement activity during the time in the barn was similar across days 2 to 6 but increased continuously until day 4 and then decreased until day 6 during the time on pasture.
Rumination-related variables, which were calculated as sums within a certain time interval (rumination chews and rumination time per half-day), often changed in an opposite pattern with the period on pasture in contrast to the period in the barn ( Figure A2). The ranges of these two variables and of eating bouts were only within the range of the training dataset when barn and pasture periods were summed up.

Predicted Herbage Availability
With the underlying training dataset, the two prediction approaches, GLM and RFM, classified a total of 57 and 45 cow-days, respectively, as positive, i.e., scarce herbage availability (Table 5), equivalent to 9.8% and 7.7% of the total cow-day observations. These proportions are approximately in line with the number of cases identified as "scarce" by both reference indicators (9.1% and 12.3%; Table 4). In both prediction approaches, no cow was classified as "scarce" on day 1. In the case of the GLM, the number of positive cases increased continuously towards day 6. For the RFM approach, the number of positive cases remained almost identical across days 3 to 5 and increased on day 6.

Predictive Performance
Model predictions best matched classifications using rumen fill scoring as a reference classification ( Table 6). Specificity of the prediction models, i.e., the ability to correctly identify herbage sufficiency, was high (0.90 to 0.92), irrespective of the prediction approach (i.e., GLM or RFM). In contrast, sensitivity and PPV were poor (0.06 to 0.15 and 0.07 to 0.21, respectively). Predictions were based either on all eight behavioral variables presented in Table 2 or on only five variables, where the variables "rumination chews per bolus", "head movement activity", and "lying bouts" were excluded, because they were identified as predictors of low relevance (see Section 3.3). Results of the prediction with only five variables are provided in parentheses. PPV: positive predictive value, AUC: area under receiver operating characteristic curve.
In general, GLM performed slightly better than RFM before the number of predictors was reduced. Reducing the number of predictors from eight to five relevant behaviors slightly improved the models' performances, especially their sensitivity (Table 6). Both models, GLM and RFM, performed similarly after variable reduction.
Using rumen fill scoring as the reference classification, milk yield predicted herbage availability with a sensitivity of 0.45, a specificity of 0.91, a PPV of 0.33, and AUC of 0.68. The fact that these performance metrics are higher than the metrics resulting from GLM and RFM predictions compared to the rumen fill reference classification indicates that milk yield recordings outperformed the prediction of herbage availability based on behavioral variables.
Comparing different subgroups of animals, predictive performance of both models was greater in high-performing (>27 kg d −1 ) than in medium-or low-performing cows; in multi-than in primiparous cows; in cows with low BCS (≤2.5) than in cows with medium and high BCS; in cows in gestation than in empty cows; and in cows with grazing experience from previous years than in cows without grazing experience (Table A2). Thus, changes in behavior of the above-mentioned animal groups were more congruent with a decrease in their milk yield and rumen fill score. Using the GLM approach with five predictors and comparing it to the classification where reference indicators were combined, performance for the animal groups rose to AUC ≥ 0.56, sensitivities to between 0.26 and 0.47, and PPV to between 0.19 and 0.36. The highest AUC values were obtained for cows with low body condition scores (0.60 and 0.67 with eight and five predictors, respectively) and high-performing cows (0.57 and 0.56 with eight and five predictors, respectively).

Discussion
Both modeling approaches for predicting scarce herbage availability performed more poorly in the present evaluation study under commercial farming conditions than they did during model development, where they reached an AUC of 0.76 for both GLM and RFM. The sensitivities were 0.74 and 0.75 and PPVs 0.63 and 0.60 using leave-one-outcross-validation [19]. Therefore, GLM and RFM including the eight selected variables proposed by Shafiullah et al. [19] were not generalizable to the grazing set-up in the present study using the underlying training dataset. This shows that the prediction models for herbage scarcity based on behavioral changes cannot yet be applied to various grazing and pasture conditions, feed supplementation levels, or cow breeds without impairing model performance.
Thus, we discuss different possible sources that may have contributed to low model performance before providing an outlook for further model-based decision support in allocating new pasture at the end of this section.

Differences between Training and Test Data as a Cause for Low Model Performance
The main differences between the training and testing experiments were, firstly, the level of feed restriction on pasture; secondly, the grazing set-up (i.e., timing of allocating new pasture and the hours spent daily on pasture and in the barn); and, lastly, the choice of reference measurements.
The first two points mentioned could have affected the response of some behavioral variables to declining herbage availability over the six grazing days and, thus, their contribution to a correct prediction of herbage scarcity. In the present experiment (testing), allowances of herbage and supplement feed were reduced by 20%, whereas a reduction in the daily feed intake by 40% was targeted in the Irish experiment (training) used as a basis for model development [17,19]. The restriction level of 20% was intentionally chosen to ensure animal welfare and to mimic practical situations in which farmers should notice and react early in the event of herbage scarcity. Nevertheless, the lower level of feed restriction in the present experiment might be the reason why, in some behavioral variables, no clear changes were visible with decreasing herbage availability when looking at the 24 h data ( Figure 4). However, differences in the grazing set-up likely also played a role, because the reference indicators did respond to declining herbage availability ( Figure 3) and clearly indicated herbage scarcity ( Table 4).
The grazing set-up in training and testing experiments differed in the timing of when cows were allocated to new pasture area and in the time they spent on pasture each day.
During the training experiment, cows were on pasture full-time, and a new strip was allocated twice daily after milking, albeit with the defined herbage restriction [17]. Thus, the animals were able to graze fresh but very limited amounts of herbage which, additionally, declined during the following hours. In contrast, during the present testing experiment, one grazing cycle lasted six days before a new paddock was allocated so that the CSH decreased with each additional day (Figure 2a). Therefore, cows did not have access to fresh herbage each day but went to a pasture with less and less herbage mass left.
In addition, the animals were out on pasture only half the day and in the barn the other half of the day, where they were fed a restricted amount of meadow hay once daily. This feed allocation in the barn was comparable to the allocation of a new strip of pasture. The effects of allocating new feed on the animals' behavior are presented in Figure 5, which shows the 24 h data separately for the time on pasture and in the barn.
Changes in the behavior of cows on pasture might have often been masked by their behavioral responses during the time in the barn and vice versa when analyzing the animals' responses to gradually declining herbage availability on a 24 h basis including both the time on pasture and in the barn. In this line, some behavioral variables relevant to the daily prediction of herbage scarcity in the Irish training dataset responded more pronouncedly in the present test dataset when looking only at the behavior during the time on pasture (i.e., head movement activity) or in the barn (i.e., increasing bite frequency and decreasing eating bouts).
Moreover, the ranges of values in the behavioral variables differed between the training and test datasets, which could have affected the predictive performance of the models. For example, on day 6, bite frequency (n min −1 ), head movement activity (units h −1 ), and rumination chews per bolus (n bolus −1 ) were still closer to the mean values of Training Class 0 than to those of Training Class 1 (Figure 4). The importance of these three variables was below that reported in Shafiullah et al. [19] (identified with variable importance plot of the RFM, results not shown). Interestingly, in the present study, the behavioral variables contributed less to the correct prediction than what was expected by Shafiullah et al. [19].
All three variables were calculated as a value per hour, using the RumiWatch Converter, and averaged across the day. The bite frequency and the head movement activity were often below and the rumination chews per bolus above those of the training dataset ( Figure 4). In contrast, values related to the three behavioral variables were closer to those of the training dataset when considering only the grazing phase, eliminating data from the barn phase ( Figure 5). This is plausible, because behavior of cows in the barn differs considerably from that on pasture, which is due to the finite amount and structure of the feed in the barn (see also Section 4.3), and, thus, mean values for the whole 24 hours were distorted. In the end, however, it also means that some behavioral variables on a 24 h basis generated from a part-time grazing system cannot be compared to 24 h summaries from a full-time grazing system. This leads to the assumption that some variables are better suited for developing a robust decision support algorithm than others.
Finally, another difference between training and testing experiments was in the reference measurements to distinguish between sufficient or scarce herbage availability. In the Irish experiment, a parallel group design was established to generate balanced training data for model development [17,19]. There, daily herbage allowance for the control group (n = 10 cows) and the group subjected to 40% herbage restriction (n = 30 cows) could be more precisely adjusted by moving the fences, compared with the herbage allowance in a six-day allocation set-up. Therefore, the reference classification was much clearer in the Irish experiment. In contrast, in the present experiment, it was more challenging to reveal less pronounced day-to-day changes in DM intake of individual animals. The chosen reference indicators were able to fulfill this objective and proved their practicability as indicators in farming practice. However, the formulation of their thresholds for a change in the binary classification certainly played a role in the model evaluation.

The Challenge of Evaluating the Binary Predictions with Gradually Changing Reference Indicators
In our grazing set-up, a decline in milk yield and a reduced rumen fill score beyond the normal variation were used as proxies for scarce herbage availability. Both reference indicators were chosen because they have the great advantage of responding rapidly to variations in DM intake [30,31,46], can be easily measured on commercial farms to support herd management [5,47], and, thus, can potentially be used to generate training data on-farm.
However, both reference indicators are subject to uncertainties with respect to the study objective. Milk yield and rumen fill could also be reduced by factors other than a scarce herbage allowance. For example, milk yield may also be reduced in the short term once estrus sets in [29] or as a result of reduced feed intake and/or an increased activity and energy expenditures caused by diseases [48,49], which would also affect rumen fill [31,47]. However, data of cows on days where signs of estrus or illness were observed were excluded from the dataset in the present study. Therefore, a decline in milk yield and a reduction in rumen fill was most probably due to decreasing herbage availability on pasture.
Another uncertainty is the agreement of the binary classification resulting from the model predictions with the binary classification by the reference indicators, because the exact response time of both reference indicators, i.e., the duration until the indicator changes after a feed challenge, and the degree of response are uncertain.
For rumen fill, studies have shown that the score directly reflects the level of feed intake during the previous two to six [30], or even eight, hours [47]. Rumen fill scoring was, therefore, the more timely and direct indicator for the amount of feed consumed by a cow in the present study, compared to milk yield, where metabolic processes might have influenced the response time and response degree to a previous feed challenge [50].
Daily milk yield in the present study often decreased gradually and with a delay of one day, because it had recovered almost completely at the day after feeding ad libitum, which was done immediately at the end of the first grazing cycles. Similarly, Herve et al. [46] showed that daily milk yield decreases directly upon a feed restriction to 80% of the cows' ad libitum DM intake but not, however, to the full extent, as it continued to decrease with continued feed restriction [46]. The trend, over days, of the milk reduction played an important role in our experiment, because, depending on which day the milk yield of one cow reached the pre-defined threshold, the classification changed from 0 to 1.
If this change from Class 0 to 1 did not occur on the exact same day as predicted by the GLM and RFM, there were FPs and FNs that depressed model performance. Likewise, Post et al. [20] and Post et al. [51] highlighted the challenge of detecting the correct time points of occurrence of rare events with a binary system; in their studies, the first signs of lameness and mastitis were already detectable in behavioral data a few days before being diagnosed by a person, which resulted in a high number of FPs.
Consequently, we may be able to detect diseases, or in the present case herbage scarcity, earlier with behavioral data than by the reference methods, but we may evaluate the developed models to be worse than they are.
In addition, it would be useful if models would consider the trends of behavioral changes between days on an individual-animal basis. So far, using Shafiullah's models, the same animal could have been classified as 1 on day 4 and again Class 0 on day 5. Therefore, future models should consider day-to-day correlations within one cow.
A third challenge for the modeling approach is individual animals in which behavior or reference indicators do not change even though herbage availability declines.
If a cow maintains its normal level of milk yield even when herbage availability on pasture declines, it would not be classified as "scarce" (i.e., Class 0) by the milk yield reference indicator. Nevertheless, the cow may have been restless and intensified its foraging activity (i.e., less lying down and rumination activity as well as increased bite frequency and head movement activity), causing it to be identified as "scarce" (i.e., Class 1) by GLM and RFM.
One explanation could be that cows which did not show any decline in milk yield could be animals who are more adaptable than others in terms of nutrient partitioning to support lactation [50] or showed more grazing perseverance to the point of possibly even consuming the required amount of feed from pasture with little herbage mass left. Another reason could be that those cows have generally lower feed requirement (and presumably produce less milk) but continue grazing, even though feed is not required.
The said potential sources of error resulted in a great number of FPs (50 out of 583 cow-days in the case of using GLM and the milk yield reference indicator; Table A3) and, thus, in a low model performance.
Not only was the number of FPs relatively high in our study, but the number of FNs was too (65 out of 583 cow-days in the case of using GLM and the milk yield reference indicator; Table A3). False negatives are days when a cow produced noticeably less milk or had a lower rumen fill than usual (i.e., Class 1) but did not exhibit the characteristic behavior of herbage scarcity as found by Werner et al. [17] and Shafiullah et al. [19], namely, increased restlessness and intensified foraging behavior (i.e., Class 0). These are likely animals with low plasticity to adapt to changing circumstances. Such days will remain FNs in any future usage of the prediction model and, thus, will reduce the sensitivity of the models.
Low sensitivity values occurred mainly in the animals with lower levels of milk production and better body condition (Table A2) and, presumably, in cows with lower levels of feed intake [50]. These cows had enough body reserves, and there was no need to change behavior in order to have sufficient feed intake for maintenance. Therefore, it might be useful to include milk yield recordings, although somewhat delayed in its response to reduced herbage availability, as an additional predictor in the model [52,53].
Furthermore, it is essential to base pasture allocation decisions on a percentage of the herd rather than one animal. The individuals in focus should be cows that produce high milk yields and have lower BCS, because they have greater feed intake levels and greater energy requirements [50], and it is likely that they respond more sensitively to declining herbage availability (Table A2).

Suitability of Behavioral Variables as Predictors Depends on Grazing Condition
Previous studies found decreases in lying bouts [16], eating bouts [54], and rumination time [55] and an increase in prehension bite frequency [56] and head movement activity [19] in response to gradually declining herbage availability. Correspondingly, we expected similar changes in the behavior of cows from days 1 to 6. These expected changes in animal behavior appeared in five variables, namely, the daily rumination chews, bite frequency, daily rumination time, rumination time per bout, and eating bouts. However, the expected behavioral changes were ambiguous in three variables (i.e., rumination chews per bolus, head movement activity, and lying bouts) where no pronounced and stepwise changes across two or three days were visible, such that they were not suitable predictors for a binary classification approach. Therefore, the question arose whether these behavioral variables are suitable predictors of scarce herbage availability in general and under which grazing conditions and supplementation schemes they allow for accurate prediction.
Both the head movement activity and the number of lying bouts did not change notably across the last four days of a grazing cycle. Similarly, rumination chews per bolus remained relatively constant until day 5 and then decreased only minimally (Figure 4). We suspected that these three behavioral variables might not be important as predictors of herbage scarcity in a part-time grazing system, at least not when evaluated on a 24 h basis (see Section 3.3), because model performance rose when we excluded them from the list of predictors (Table 6). However, after separating the 24 h data into the hours on pasture and in the barn, it became clear that a behavioral change in response to declining herbage availability on pasture was present in some variables, but especially in the head movement activity, masked by the exhibited behavior during the time spent in the barn ( Figure 5). Thus, the variables measured on a 24 h basis were less sensitive to changing conditions on pasture than expected, because they included the half-day phase in the barn. In the barn, the cows could not exhibit an intensive foraging behavior and restlessness with high head movement activity, because, at a certain point in time, the limited amount of feed at the feeding bank was eaten. Presumably, a main part of the remaining time was used to stand and to rest, to lay down, and to ruminate.
Contrarily to head movement activity, the number of eating bouts decreased even more strongly in the barn than on pasture. Werner et al. [17] explained the decreasing number of eating bouts under herbage scarcity by the fact that the duration of the bouts was longer in order for cows to find the required amount of herbage. In the present experiment, the decreasing number of eating bouts in the barn can be explained by the fact that, with declining herbage availability on pasture, the limited amount of meadow hay allocated in the barn was increasingly consumed in one piece, with a decreasing number of interruptions.
For future development of the approach, the variable "head movement activity" could be a suitable predictor when considering only behavior exhibited on pasture. Although it responds in a non-linear way, its vertex carries important information, because the variable increases as long as there is still new feed to be found in the pasture, and the vertex emerges presumably at the time when most new leaf mass within close range is consumed and cows must intensify foraging behavior (day 3 or day 4, Figure 5).
Consistent with the findings of Kennedy et al. [56], Figures 5 and A2 show that rumination mainly occurred during the time spent in the barn, and the hours spent on pasture were used for foraging. To consider this compensation, in future applications of the model in part-time grazing systems, the rumination-related variables, representing a daily sum, should address both the grazing and barn phases. However, the variable "rumination chews per bolus" might be excluded in future applications of the modeling approach. This is not only because the variable did not change relevantly across the six-day grazing cycle, but because, already during model development, Shafiullah et al. [19] found that the variable, which ranged from around 40 to around 70 n bolus −1 , could not be assigned to one of the two classes in the range between 50 and 60 n bolus −1 , indicating that the variable is less suitable as a predictor in a binary classification problem.
The average frequency of prehension bites increased clearly over the six-day grazing cycle when looking at the 24 h data ( Figure 4) and was already identified as a very relevant predictor in previous studies [17,19]. However, in the present grazing set-up, the variable was even more sensitive to declining herbage availability during the time in the barn than on pasture ( Figure 5). This could be explained by the fact that there was a limited amount of new feed allocated each day, comparable to the strip-grazing conditions in the training experiment, and that feed was eaten more and more greedily towards the end of the six-day grazing cycles when the cows returned from the pasture. This suggests that the variable "bite frequency" is definitely suitable as a predictor in a grazing system with an allocation of new feed or pasture strips, and changes in bite frequency should be taken into account by focusing on the time span when new feed is allocated and not when cows are actually on pasture with declining herbage availability.

Outlook
In order to develop future commercial decision support tools, they must be suitably robust to system differences [6,20]. In the present experiment, the prediction models, which used 24 h measurements of cow behavior to detect scarce herbage availability, did not perform satisfactorily with the underlying training data. Thus, the evaluated model approach did not fit sufficiently to a more typically Swiss grazing system, a six-day rotational grazing system under half-day barn conditions. However, the data analysis has provided a useful input. We identified three key points which can be put into context for further model development as follows: Firstly, the bite frequency and the daily number of rumination chews are confirmed to be important variables, as already identified by Shafiullah et al. [19]. However, bite frequency needs to be seen in the context of new feed allocation, because it increased with scarce herbage availability, particularly strongly during allocation of new pasture strips in the study of Shafiullah et al., and during supplementary feeding in the barn in the present study. Therefore, the aspect of new feed allocation within the grazing system holds important information regarding possible herbage scarcity [57].
Secondly, data analysis should focus on a higher resolution than 24 h summaries. Authors working on similar topics, such as Paulenz et al. [58], also suggested analyzing shorter time spans, in the context of automated behavioral monitoring, for an estimate of herbage availability on pasture instead of analyzing 24 h summaries.
Thirdly, a future decision support tool for pasture allocation scheduling based on animal behavior could include milk yield recordings [52,53]. In general, for binary classification models, it is the case that the more a predictor is related to the classification variable, the better is the AUC [51]. For example, milk yield is known to be very important in predicting the feed intake of cows [53,59]. Our results showed that milk yield recordings, which were used as reference indicator in the present study, have potential as a reliable predictor of herbage availability on pasture. However, milk yield has a postponed response [60] of roughly one day [46], which might delay the warning for herbage scarcity.
On the contrary, the reference indicator based on rumen fill responds relatively promptly to changing feed intake [30,31]. Thus, as a reference indicator, rumen fill is more favorable than milk yield. However, milk yield is automatically recorded so that it can be a useful additional source of information for the model.
Summarizing this chapter, there is scope to improve the modeling approach to identify herbage scarcity on pasture and to provide farmers with a more robust tool to support their decision making in allocating new pasture.

Conclusions
The prediction models showed poor performance under the applied grazing conditions, but this increased when insensitive predictors (rumination chews per bolus, number of lying bouts, and head movement activity) were excluded. Moreover, daily rumination chews and the (prehension) bite frequency are confirmed to be suitable predictors, as evaluated in previous studies.
There is potential for some of the behavioral predictors to indicate herbage scarcity and to further develop this approach according to different grazing set-ups. For example, in order to predict herbage scarcity on a daily basis, the predictor "bite frequency" is particularly sensitive to gradually declining herbage availability on pasture when new feed is allocated. Following this train of thought, to predict herbage scarcity on the basis of behavioral changes, it is important to provide either new herbage on pasture or feed supplementation in the barn at least once a day in order to identify the characteristic changes in bite frequency. If the system aims for a longer-duration grazing set-up, we recommend training a separate model which might include differing predictors, as the characteristic bite frequency changes do not occur under such circumstances. Furthermore, we identified the head movement activity during hours of grazing as an additional meaningful predictor for the time span on pasture, because the vertex emerges before the rumination-related variables respond to declining herbage availability. Therefore, the potential decision support tool could issue a warning before a drop in milk yield occurs. Predictions were based either on all eight behavioral variables presented in Table 2 or on only five variables, where the variables "rumination chews per bolus", "head movement activity", and "lying bouts" were excluded, because they were identified as predictors of low relevance in Section 3.3. Results of the latter are provided in parentheses. BCS: body condition score, PPV: positive predictive value, AUC: area under receiver operating characteristic curve. Table A3. Results of the confusion matrixes for the prediction * via generalized linear model (GLM) and random forest model (RFM • ) in comparison to the reference indicators of milk yield and rumen fill scoring as well as a combination of both (n = 583).

Reference Indicator Prediction Approach
True Negative False Negative True Positive False Positive Figure A2. Violin plots showing differences in animal behavior between the feed status Classes 0 (i.e., sufficient) and 1 (i.e., scarce) of the training dataset, as well as differences during the hours on pasture or in the barn from days 1 to 6 across six grazing cycles and 20 cows of the test dataset. The points indicate mean values. The width of the violins indicates how high the probability density is at the given value of the y-axis. Data from other variables are shown in Figure 5, except for the variable "lying bouts", where data could not be split into hours on pasture and in the barn, because it was recorded as a daily sum.