1. Introduction
There is increasing interest in the welfare of farmed animals among both consumers [
1] and scientists [
2]. Welfare assessments are commonly included in farm quality assurance schemes across the farming industry [
3,
4,
5,
6] but typically employ welfare indicators such as the presence or absence of disease and the availability (or lack thereof) of resources (e.g., space allowances) [
3,
7,
8]. Whilst these factors are straightforward to measure, they tend towards the assessment of the absence of a negative welfare state and do not consider the presence (or absence) of positive welfare states. In recent years, the concept of identifying and assessing the presence of good welfare conditions and positive welfare states has been gaining traction [
9]. Positive animal welfare encompasses the concepts of positive emotions, positive affective engagement, happiness and good quality of life [
9] and, as such, is potentially more difficult to quantify than traditional welfare measures such as husbandry standards and health outcomes.
Behavioural analysis has shown promise as a method of assessing positive welfare [
10,
11] and is recommended for inclusion in on-farm welfare assessment protocols [
12]. Detailed behavioural analysis is time consuming and labour intensive [
13], which means its routine inclusion in on-farm welfare assessment protocols is rarely feasible [
11]. Continuous visual observation provides an exact record of the behaviours observed, but it is only practical for behavioural analysis over short time periods [
14]. Other methods of behavioural analysis such as instantaneous sampling and one-zero sampling techniques have been devised to improve the feasibility of analysing behaviour over longer time periods and also allow larger numbers of behavioural categories to be measured at the same time [
14,
15]. Time budgets constructed using these methods correlate well with continuous observations for common and long-duration behaviours [
16]; however, the time budgets constructed from these methods often underestimate the duration of time engaged in short-duration or infrequent behaviours [
17], and selecting an appropriate sampling interval in studies where several different behaviours are being observed can be difficult [
18]. Recent technological advances have allowed animal activity to be recorded automatically using animal-mounted activity monitors [
19,
20]. Activity can be monitored over long periods of time, and data can be downloaded for detailed analysis at any point in time at which it is required, conferring advantages over the visual observation of animals either pen-side or with video recordings. The challenge, therefore, is to devise remote activity-monitoring technology that maintains an accurate correlation with continuous visual observations, even for rare or short-duration behaviours [
19,
21,
22]. Whilst remote monitoring devices typically record measurements at set sampling intervals [
21,
23]—analogous to instantaneous sampling—the sampling intervals are much shorter than those that would be practical or even possible for visual observations (sometimes fractions of a second) and, consequently, are a close approximation for continuous visual observations. A wide range of different animal-mounted sensors are now available for the remote monitoring of animal health and behaviour [
24], and leg-mounted tri-axial accelerometers are commonly used in behavioural studies [
19]. This type of accelerometer contains a piezoelectric sensor that generates a voltage signal in response to any change in velocity experienced in three planes and produces outputs representative of three-dimensional movement [
19].
Accelerometer technology has been utilised to monitor different behaviours in a wide variety of wild and domestic species including cattle [
19,
25,
26,
27,
28,
29]. Accelerometers have been evaluated as tools for identifying many different types of bovine behaviour including lying behaviours [
30,
31,
32,
33,
34,
35], locomotion [
34,
36,
37], feeding/drinking behaviours [
35,
38,
39,
40] and play behaviour [
41,
42]. Whilst accelerometer-generated data have shown good correlation with visual observations for standing and lying behaviours in adult cows [
21,
32,
35,
43,
44] and lying behaviours in calves [
30,
45], the reported correlation between accelerometer measurements and locomotor activity in calves is inconsistent. For example, Luu et al. [
41] reported a good correlation between the number of acceleration peaks and the duration of time engaged in running, jumping/kicking and walking (
r = 0.96, 0.86 and 0.75, respectively), whereas Trénel et al. [
33] reported a low sensitivity (raw data sensitivity = 0.15; filtered data sensitivity = 0.22) for identifying movement in calves. Some studies have evaluated the use of accelerometers to identify play behaviours in calves [
41,
42,
46,
47], but these evaluate behaviour in calves aged four weeks or older [
41,
42,
46], evaluate play behaviour in arena tests [
41,
46,
47] or apply data manipulation methods to the raw accelerometer data [
41]. No studies have evaluated the use of accelerometers to identify play behaviour in neonatal calves aged up to 48 h old in their home pen using the raw accelerometer data that would be available to farmers, welfare auditors and veterinarians.
Play behaviour is observed in almost all species’ young and is widely considered to be an indicator of good welfare that occurs when an animal’s basic needs (e.g., nutrition) are fulfilled [
48,
49,
50]. Play behaviour is not considered to be essential for survival, and animals typically do not expend energy expressing play behaviour when resources are limited or welfare is compromised [
48]; play is therefore considered to be a “luxury behaviour” that is exhibited when animals are in a positive welfare state [
51]. Play is also thought to be an indicator of positive emotions in animals [
50,
52]; however, play behaviour can be time consuming and laborious to assess as it occurs spontaneously, infrequently and over short durations [
42]. Play cannot be predicted, and long periods of observation are required to accurately determine the duration and number of play bouts, making the assessment of play behaviour impractical for inclusion into on-farm welfare assessments. Accelerometer technology can potentially mitigate these limitations, and, as such, the use of accelerometers to identify play behaviour for the purposes of the assessment of positive states is of interest. Größbacher et al. [
42] assessed locomotor play in group-housed calves aged four and eight weeks and found that the Hobo Pendant G Acceleration Data Logger (Onset Computer Corporation, Bourne, MA, USA) correctly identified 79% of the sampling periods in which play occurred but consistently overestimated play behaviour, and a correction factor had to be applied to enable the accelerometer-derived data to reflect visual one-zero sampling. They concluded that, although this accelerometer provided a good approximation of spontaneous locomotor behaviour in calves, the sensor did not have a high enough recording frequency (1 Hz) for the accurate measurement of play behaviour [
42].
The synchronisation of observed behavioural patterns with device-generated data is considered best practice for validating remote monitoring devices as tools for recording behaviour [
19]. The IceTag accelerometer (IceRobotics, South Queensferry, UK) was chosen for use in this study as it is a small device (measuring 66.0 mm × 55.0 mm × 27.0 mm), weighing only 117 g, and as such, it was unlikely to cause disruption to the calves’ normal behaviour. It also has a high frequency of data collection of 16 Hz (i.e., 16 samples are measured every second) and the data are presented in intervals as short as one second, ideal for potentially capturing short duration behaviours such as play in calves.
The objective of this study was to determine whether IceTag-generated motion index (MI) data had the potential to identify play behaviour in very young dairy calves (up to 48 h old). IceTag-generated MI data were compared to detailed focal observations of calf behaviour, and different analytical approaches were used to define the optimal method of utilising MI for identifying play behaviour. Initial work was undertaken to investigate whether the number of locomotor play bouts or the duration of locomotor play in the first 48 h of life was correlated with the cumulative IceTag-generated MI data. Although valuable information was obtained from this correlation analysis, it is a crude measure of behaviour; hence, more detailed analyses were subsequently performed. Firstly, an epidemiological approach was employed to calculate the sensitivity and specificity of selected MI cut points (thresholds) for detecting play behaviour with 1 min and 15 min sampling intervals. Sensitivity and specificity are test performance characteristics indicating the ability of a diagnostic test (in this case, MI values) to correctly detect the presence or absence of a condition (in this case, locomotor play) [
53]. This approach provides detailed information about the ability of MI to detect the presence or absence of play behaviour in each sample interval; however, it does not provide information on the number of sample intervals during which play occurs, or information about behavioural patterns over time. The motion index is a single figure generated by the IceTag for selected sample intervals; therefore, it can only detect whether play was present or absent in each sample interval, which is analogous to one-zero behavioural sampling [
14]. One-zero sampling records whether (or not) a behaviour was observed in a sample interval selected by the investigator and can be used when the presence or absence of a behaviour is the point of interest [
54]. Thus, our final analysis compared selected MI thresholds to the results obtained from the visual one-zero behavioural analysis of video footage, aiming to establish a practical method of analysing IceTag-generated data that might have potential for future use in the on-farm welfare assessment of neonatal calves up to 48 h old.
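As an illustration of the epidemiological approach described above, the following minimal sketch computes the sensitivity and specificity of a single MI cut point against visual observations. It is not the analysis code used in the study: the function name, the rule `MI >= cut_point` and the interval data are hypothetical and serve only to show how the four outcome counts are formed.

```python
def sensitivity_specificity(mi_values, play_observed, cut_point):
    """Return (sensitivity, specificity) of the rule `MI >= cut_point`.

    mi_values     -- motion index per sample interval (hypothetical data)
    play_observed -- True/False per interval, from visual observation
    """
    tp = fp = tn = fn = 0
    for mi, played in zip(mi_values, play_observed):
        predicted = mi >= cut_point
        if predicted and played:
            tp += 1          # play seen and MI at or above the cut point
        elif predicted and not played:
            fp += 1          # MI at or above the cut point, no play seen
        elif played:
            fn += 1          # play seen but MI below the cut point
        else:
            tn += 1          # no play seen and MI below the cut point
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Ten hypothetical 1 min intervals (values invented for illustration).
mi = [0, 1, 5, 0, 3, 12, 0, 2, 0, 7]
play = [False, False, True, False, False, True, False, True, False, True]
sens, spec = sensitivity_specificity(mi, play, cut_point=3)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this invented example, one interval with play falls below the cut point (a false negative) and one interval without play reaches it (a false positive), so neither measure equals 1.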
4. Discussion
The objectives of this study were to determine whether data generated by a commercially available leg-mounted tri-axial accelerometer can be used to identify play behaviour in calves up to 48 h old and to determine the optimal method for the analysis of the raw data generated to enable the occurrence of play behaviour to be accurately evaluated. The design of the study allowed us to assess different methods of analysing accelerometer-generated data without needing to repeat data collection, thus keeping animal usage to a minimum. The cumulative MI over 48 h correlated well with both the total duration of play behaviour and the number of play bouts in the same 48 h observation period. Whilst this is a rather crude method of behavioural analysis, it is straightforward to calculate, and the data are readily available. This method is a way of comparing the duration and number of bouts of play behaviour exhibited by different calves over longer periods of time without the requirement for time-consuming visual observations. As such, this method allows for the analysis and comparison of large numbers of calves over longer periods of time where visual observations would be impractical, meaning it can easily be applied in the wider context of on-farm welfare assessment. In this study, 48 h was identified as the most appropriate period over which to accumulate the MI for the analysis of play behaviour, as interference from other behaviours was lowest. It is possible that longer sampling periods may be even more appropriate, and further research is needed to identify the sampling period most suited to the use of the cumulative MI for comparing the duration and number of bouts of play behaviour between calves. Unexpectedly, we identified a positive correlation between the number of lying bouts and MI in some observation periods; as lying is a low-activity behaviour, a negative correlation was expected.
It is likely that this is due to the definition of “lying bout” and method of recording lying bouts employed by the IceTag accelerometer. A lying bout is recorded as having occurred when the IceTag moves from vertical to horizontal and back to vertical again; this corresponds to the animal moving from standing to lying (this initiates the recording of a lying bout) before standing back up again (thus terminating the recording of the lying bout). Although lying itself is associated with a low MI, a lying bout is both initiated and terminated by a transition between lying and standing, which is recorded as movement by the IceTag and presented as an increase in the MI. Consequently, increased numbers of lying bouts are associated with increased numbers of posture transitions, resulting in a positive correlation between number of lying bouts and cumulative MI in some observation periods.
Pilot work had previously indicated that play behaviour could not be accurately defined by the motion index at a 1 s resolution; the MI at 1 min and 15 min resolutions showed greater potential for detecting the presence or absence of play behaviour. As these are longer-duration sampling intervals, patterns of play behaviour could not be accurately defined using these MI resolutions because multiple behaviours could have occurred during the sampling interval. Rather, the number of sampling intervals in which the MI exceeded a defined cut point could be calculated in a method analogous to one-zero sampling (using visual observations) where the number of sampling intervals in which a defined behaviour is observed (in this case, play) is calculated [
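The relationship between lying bouts and posture transitions can be illustrated with a short sketch. This is not the IceTag's internal algorithm, which is not described here; it simply applies the bout definition given above (standing to lying and back to standing) to a hypothetical sequence of posture labels, and shows that each completed bout contributes two transitions.

```python
def count_lying_bouts(postures):
    """Count completed lying bouts and posture transitions.

    postures -- list of "standing"/"lying" labels, one per sample
                (hypothetical input; not the device's raw output).
    A bout is counted only once the animal has lain down AND stood up again.
    """
    transitions = 0
    bouts = 0
    lying_started = False
    for prev, curr in zip(postures, postures[1:]):
        if prev == curr:
            continue
        transitions += 1                      # any change of posture
        if prev == "standing" and curr == "lying":
            lying_started = True              # bout initiated
        elif prev == "lying" and curr == "standing" and lying_started:
            bouts += 1                        # bout terminated
            lying_started = False
    return bouts, transitions


S, L = "standing", "lying"
series = [S, S, L, L, L, S, S, L, S, S, L, L]   # ends mid-bout
bouts, transitions = count_lying_bouts(series)
print(bouts, transitions)
```

Here two bouts are completed and a third is still in progress at the end of the series, giving five transitions in total; more bouts therefore imply more transition-related movement, whatever the time spent lying.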
14]. One-zero sampling produces a single score for the required recording session that is expressed as a proportion of the total number of sample intervals during which the defined behaviour was observed and has previously been used by authors recording play, as it can capture sporadic, short-duration behaviours and is suited to capturing patterns of behaviour that are clustered [
14,
42,
64]. Short sampling intervals are optimal when one-zero sampling is used, and this was reflected in our finding that the sensitivity and specificity obtained with 1 min sampling intervals were more accurately repeatable than those obtained with 15 min sampling intervals. Motion index cut points of ≥ 2.5 for a 1 min resolution and ≥ 24.5 for a 15 min resolution were determined to have the optimum sensitivity and specificity for detecting the presence/absence of play behaviour in each sampling interval. For practical use, these would need to be rounded up to MI thresholds of ≥ 3 and ≥ 25 because the MI is reported as an integer. These cut points consistently overestimated the one-zero sampling score obtained from visual observations, possibly because of the different methods used to calculate each metric. Sensitivity and specificity are diagnostic test characteristics (in this case, the “diagnostic test” is the selected MI cut point) and account for both positive and negative results (both true and false) [
53], whereas one-zero sampling only records the number of positive samples out of the total number of recorded samples [
14] and may therefore have included false positive results as well as true positive results. The optimised MI cut point for detecting play behaviour was determined in this study based on a balance between sensitivity and specificity; this method was chosen because false negatives and false positives were considered to have equal importance for our analysis. When the aim is to replicate visual one-zero sampling, false negative results affect the calculated one-zero score less than false positive results (because only positive results contribute to the calculation of a one-zero score); therefore, an MI cut point with higher specificity for indicating play behaviour is more suitable for this purpose despite the associated loss of sensitivity.
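The effect of false positives on an MI-derived one-zero score can be sketched as follows. The data, function names and the two cut points compared are hypothetical and are not values from this study; the sketch only shows how the proportion of positive intervals is formed and why a more specific cut point can bring it closer to the visual score when play is infrequent.

```python
def one_zero_score(flags):
    """Proportion of sample intervals scored positive (one-zero score)."""
    return sum(flags) / len(flags)


def compare_one_zero(mi_values, play_observed, cut_point):
    """Compare the visual one-zero score with the score from `MI >= cut_point`."""
    mi_positive = [mi >= cut_point for mi in mi_values]
    false_pos = sum(p and not o for p, o in zip(mi_positive, play_observed))
    false_neg = sum(o and not p for p, o in zip(mi_positive, play_observed))
    return {
        "visual_score": one_zero_score(play_observed),
        "mi_score": one_zero_score(mi_positive),
        "false_positives": false_pos,
        "false_negatives": false_neg,
    }


# Twenty hypothetical intervals; play was seen in only two of them.
mi = [0, 0, 4, 0, 9, 0, 3, 0, 0, 1, 0, 5, 0, 0, 2, 0, 8, 0, 0, 0]
play = [False] * 20
play[4] = True
play[11] = True

low_cut = compare_one_zero(mi, play, cut_point=3)    # more sensitive
high_cut = compare_one_zero(mi, play, cut_point=6)   # more specific
print(low_cut)
print(high_cut)
```

With the lower cut point, three false positives inflate the MI-derived score above the visual score. With the higher cut point the two scores agree in this invented example, but only because one false positive offsets one false negative, so agreement of the overall score does not imply that individual intervals are matched.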
An unavoidable limitation of any study of the behaviour of neonatal calves is the large proportion of time calves of this age spend lying [
55,
65,
66]. Play behaviour, in particular, is infrequent in calves of this age [
55]; therefore, the absence of play predominates over the presence of play, which will have had an effect on the calculated sensitivity and specificity. In clinical medicine, the sensitivity and specificity of a test are typically used to calculate the positive and negative predictive value of the test—i.e., how good the test is at predicting the presence or absence of a specified condition in individuals [
53]. Trénel et al. (2009) reported that the IceTag accelerometer had poor predictive value for movement [
33]. Despite achieving high sensitivity and specificity, the optimised MI cut points for detecting play behaviour in 1 min and 15 min intervals consistently overestimated the proportion of sampling intervals in which play occurred in our study, seemingly supporting the findings of Trénel et al. [
33]. Positive and negative predictive value is affected by the prevalence of a condition in a population [
53] (in this case, the duration of time spent engaged in play behaviour) and therefore would be different for each calf studied, as the unit of study was the individual calf and not the population. Consequently, positive and negative predictive values are of limited use for this type of analysis, and the predictive values of the selected MI cut points were not calculated in our study.
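The dependence of predictive values on prevalence can be shown with the standard formulae. The sensitivity, specificity and prevalence figures below are hypothetical and are not results from this study; the sketch only demonstrates that the same test characteristics give very different positive predictive values when play occupies different proportions of intervals.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence.

    All arguments are proportions between 0 and 1.
    """
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Same hypothetical test characteristics, two hypothetical calves:
# one with play in 2% of intervals, one with play in 20% of intervals.
ppv_rare, npv_rare = predictive_values(0.9, 0.9, prevalence=0.02)
ppv_common, npv_common = predictive_values(0.9, 0.9, prevalence=0.20)
print(f"2% prevalence:  PPV={ppv_rare:.3f}, NPV={npv_rare:.3f}")
print(f"20% prevalence: PPV={ppv_common:.3f}, NPV={npv_common:.3f}")
```

When play is rare, most positive intervals are false positives even for a test with high sensitivity and specificity, which is consistent with the reasoning above that predictive values would differ for each calf.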
Classification and Regression Tree analysis is a methodology that is well suited to this type of study as it does not assume any particular data distribution and can tolerate imbalanced datasets [
22,
67]. In this study, we were interested in determining whether there was potential for a single MI threshold (cut point) to detect the presence of locomotor play in selected sample intervals; therefore, only the top nodes of the classification tree were of interest, and two-node trees were selected for both sample intervals. This approach is simple to interpret and has allowed us to identify single MI thresholds for both 15 min and 1 min sample intervals that have a high sensitivity and specificity for detecting play behaviour with acceptable accuracy. A more complex decision tree may allow play behaviour to be detected with even greater accuracy, and further work is warranted to develop predictive models for this purpose. However, increasingly complex decision trees are also increasingly difficult to interpret, and there is a risk that in the pursuit of increasing the accuracy of prediction, the models that are used become less applicable to the wider welfare and farming industries.
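A two-node classification tree is equivalent to choosing one threshold that best separates intervals with and without play. The sketch below is a generic illustration rather than the software or settings used in this study: it assumes Gini impurity as the splitting criterion and takes candidate cut points at the midpoints between adjacent observed MI values, a common convention in CART implementations and one way in which half-integer cut points can arise from integer data. The data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of True/False labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_cut_point(mi_values, play_observed):
    """Return (cut_point, weighted_impurity) for the best single split."""
    values = sorted(set(mi_values))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    n = len(mi_values)
    best = None
    for cut in candidates:
        pairs = list(zip(mi_values, play_observed))
        left = [played for mi, played in pairs if mi < cut]
        right = [played for mi, played in pairs if mi >= cut]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if best is None or impurity < best[1]:
            best = (cut, impurity)
    return best


# Twelve hypothetical 1 min intervals (values invented for illustration).
mi = [0, 0, 0, 1, 1, 2, 2, 3, 4, 4, 6, 9]
play = [False, False, False, False, False, False, False,
        True, True, False, True, True]
cut, impurity = best_cut_point(mi, play)
print(cut, round(impurity, 4))
```

A deeper tree would continue splitting each side, which may improve accuracy at the cost of the interpretability discussed above.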
One-zero sampling can be limited in its application, as it does not always accurately reflect the true duration of a behaviour and is not a true reflection of the frequency of bouts of behaviour, as only the first bout observed in a sampling interval is recorded [
15,
54]. However, one-zero sampling is a practical method of recording the behaviour of large numbers of animals over longer periods of time, has good inter-observer reliability and is a suitable method for recording behaviour when only the presence or absence of a defined behaviour is of interest [
54]. Recording visual observations using focal one-zero sampling is a well recognised technique in behavioural studies [
14] and was well suited to matching the data output generated by the IceTag accelerometer at 1 min and 15 min resolutions; additionally, this study method was easily replicated in a different group of calves with good results. Although this method offers a good technique for comparing the frequency of positive sample intervals between calves, its accuracy is limited as it is not possible to be certain that positive visual and MI sample intervals are accurately matched using this method. As such, whilst this method is of value in situations where a simple count of positive samples is of interest, it cannot be used if the temporal pattern of calf play behaviour needs to be determined. This limitation can, however, be mitigated by using longer sample intervals, which better compensate for the lag time in recording a change in velocity. Although the calculated sensitivity and specificity of selected MI cut points to indicate play were accurately repeated when the optimised MI cut point for detecting play was applied to the video footage of a second group of calves, the proportion of sample intervals in which play occurred was consistently overestimated. We therefore consider that the calculation of the proportion of sample intervals in which play occurs (out of the total number of samples), analogous to a one-zero sampling score, is a method preferable to the calculation of sensitivity and specificity for the ability of selected MI cut points to detect the presence/absence of play behaviour in each sampling interval.
Play behaviour is an indicator of positive animal welfare [
48,
51] and has potential for use in the assessment of the welfare of young calves, especially when comparisons between calves experiencing different challenges or environments are required. Traditional methods of recording behaviour in animals can be time-consuming, laborious and not always suitable for on-farm welfare assessment. The value of accelerometer technology for monitoring behaviour in adult cattle is already well recognised, and the technology has benefits over traditional methods of recording behaviour, in particular the ability to record activity over long periods of time [
43] and generate large amounts of data that would be impractical to generate using visual observation techniques [
31]. This study has described and evaluated different methods of utilising raw data generated by a commercially available tri-axial accelerometer to detect play behaviour in very young calves without the need for extra software or advanced data manipulation. The cumulative MI over 48 h correlates well with the total number of bouts of play behaviour and the duration of play behaviour observed during the same observation period and can be used to provide a simple comparison of the amount of play behaviour exhibited by different calves. The sensitivity and specificity of selected MI cut points for detecting the presence/absence of play behaviour in 1 min and 15 min sampling intervals were repeatable, but the MI cut points with optimal sensitivity and specificity consistently overestimated the proportion of sample intervals positive for play. This latter method, therefore, may have less value for future application to the analysis of play behaviour in very young calves; however, if this method is chosen for use by future researchers, 1 min sampling intervals are recommended over 15 min sampling intervals, as the sensitivity and specificity of the optimised MI cut point for detecting play during 1 min intervals were more accurately reproduced when it was applied to a second group of calves.
Locomotor play was analysed in the study reported, as this type of play involves leg movement and could therefore be captured by the IceTag accelerometer; however, social play behaviour is also of interest when assessing positive animal welfare [
68]. Social play in calves typically involves head movement [
49], which has the potential to be captured using accelerometers worn on the head or neck. Accelerometer devices are rarely used to detect social behaviours [
19], and to our knowledge, the use of accelerometer-generated data to detect social behaviours in calves has not yet been studied. Further work is warranted to determine whether accelerometers worn on the head or neck have potential for use as a tool for detecting calf social play behaviours.
The calves studied were all dairy or dairy × beef calves housed in group pens away from the dam in a rearing system typical of the UK dairy industry. Further work is needed to determine whether our findings can be extrapolated to calves housed and reared in different systems (e.g., in individual pens or in a beef suckler system) and whether similar findings are obtained when similar methods for analysis are applied to data generated by IceTag accelerometers worn by calves older than those recruited for this study.