
Data 2019, 4(4), 131; https://doi.org/10.3390/data4040131

Data Descriptor
Horsing Around—A Dataset Comprising Horse Movement
Pervasive Systems Group, University of Twente, 7522 NB Enschede, The Netherlands
* Author to whom correspondence should be addressed.
Received: 22 August 2019 / Accepted: 18 September 2019 / Published: 22 September 2019

Abstract
Movement data were collected at a riding stable over seven days. The dataset comprises data from 18 individual horses and ponies, with 1.2 million 2-s data samples, of which 93,303 samples have been tagged with labels (labeled data). Data from 11 subjects were labeled; the data from six subjects and six activities were labeled more extensively. Data were collected during horse riding sessions and while the horses roamed freely in the pasture. The sensor devices were attached to a collar positioned around the neck of each horse; the orientation of the devices was not strictly fixed. Each sensor device contained a three-axis accelerometer, gyroscope, and magnetometer and was sampled at 100 Hz.
Dataset: The complete dataset is available online with open access at the 4TU.Centre for Research Data and can be accessed via https://doi.org/10.4121/uuid:2e08745c-4178-4183-8551-f248c992cb14.
Dataset License: The dataset has been made available under the CC0 license.
Keywords:
animals; horses; activity recognition; accelerometer; gyroscope; compass; IMU; orientation independent; neck

1. Summary

Animal activities can be recognized from motion data [1,2]. The advent of small, lightweight, and low-power electronics has propelled research in Animal Activity Recognition (AAR). Most AAR approaches utilize motion data recorded with Inertial Measurement Units (IMUs). An IMU generally consists of an accelerometer, a gyroscope, and a magnetometer, which measure acceleration, angular velocity, and magnetic field, respectively. The recorded sensor data, or part thereof, are tagged with labels by an observer to obtain a labeled dataset, which comprises several different activity categories. Ground truth for the observations is often recorded with a video camera during the sensor data collection. Various Machine Learning (ML) techniques are used to train, tune, and validate an AAR classifier. After training, the classifier can classify unlabeled raw data samples into the learned activity categories. Recently, various studies have utilized IMUs for AAR of wildlife [3,4,5,6,7,8,9], livestock [1,2,10,11,12,13,14,15,16,17,18], and pets [19,20,21].
In this paper, we describe a horse movement dataset [22] and its collection process. Studying AAR from motion data requires a large dataset of movement data. In earlier data collection campaigns [1,2], we found that the observed animals spent most of the day eating and resting, so the activity dataset became unbalanced and skewed towards a few activities. We therefore chose to monitor horses and ponies that were ridden at an equestrian facility, because they exercise a variety of activities during the day. This eased the task of collecting and labeling relatively large amounts of movement data for several activities and resulted in a more balanced dataset across the different gaits and eating behaviors. Ground truth was collected by placing cameras that could oversee most of the horse paddocks and the outdoor pasture. The subjects were ridden in various gaits over multiple days. More natural activity data were collected by observing the animals while they roamed freely in an outdoor pasture during their daily break. Figure 1 shows three horses during the outdoor collection process. In total, 17 different activities exercised by the horses were observed and annotated.
This dataset has been used to evaluate a Naive Bayes (NB) classifier [23]. The paper briefly describes the dataset and shows that an AAR performance of 90% accuracy can be achieved using only the 3D acceleration vector as input. Moreover, the paper demonstrates the effect of increased complexity in AAR, parameter tuning, and class balancing on the classification performance and identifies open research challenges for AAR.
Most of this dataset is unlabeled (denoted as null and unknown in the dataset). The distinction between the two is that null data have never been seen by an observer and are essentially unprocessed, whereas unknown data have been seen by an observer but the ground truth was unclear or the activity did not fit into one of the predetermined categories. Because the dataset contains a vast amount of unlabeled data along with a substantial amount of labeled data, it is particularly suitable for benchmarking unsupervised representation learning algorithms. Unsupervised representation learning is a set of ML techniques that do not utilize data labels and aim to automatically discover a compact and descriptive representation of raw data from the data itself [24]. Two recent surveys [25,26] both identified unsupervised Activity Recognition (AR) through Deep Learning (DL) as an urgent open research question. Unsupervised representation learning is interesting not only for improving AAR and Human Activity Recognition (HAR), but for Artificial Intelligence (AI) applications in general [24,27]. A paper that uses part of this dataset for unsupervised representation learning has been submitted [28]. That paper focuses on unsupervised representation learning for AAR and compares engineered representations with various representations learned from unlabeled data. The aim of publicly releasing and describing our dataset is to allow other researchers to improve AAR methods and to benchmark novel approaches to unsupervised representation learning for AAR. Furthermore, this dataset could be useful for research related to gait analysis and comparison, feature selection for AAR, and transfer learning. For example, the dataset might be valuable for improving AAR methods for other quadrupeds within the Equidae family, such as zebras or donkeys.

2. Data Description

In this section, we describe the labeled part of the dataset. The raw sensor data are stored in tables where each row denotes one raw data sample. Because we used a sampling rate of 100 Hz, 1 s of data equals 100 rows. Figure 2 shows five 2-s examples of accelerometer and gyroscope data that were recorded during different activities. The columns of the tables are described in Table 1.
The composition and size of the dataset are shown in Table 2. The samples in Table 2 were obtained by applying a 2 s window with 50% overlap over the raw data segments. The number of samples per segment was calculated as follows:
n = σ/(ωτ) - 1,
where σ is the length of the segment, ω is the size of the window, and τ is the overlap fraction (0.5). Information leakage may occur when overlapping windows are used and two overlapping windows are placed in the training and test set, respectively. To prevent information leakage, the activity segments, rather than the individual windows, should be divided into training, tuning, and test sets. Therefore, each continuous activity is marked with a unique segment identifier throughout the dataset. Because some segments (activities) may have a long duration, dividing segments instead of windows can leave the training and test sets with different class balance ratios, even when stratified sampling is used [1]. Therefore, the segments have a maximum length of 10 s, which maintains the same class balance ratio across the different subsets.
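As a sanity check on the formula above, the window count can be sketched in a few lines of Python (a hypothetical helper, not part of the dataset tooling):

```python
def windows_per_segment(sigma: float, omega: float = 2.0, tau: float = 0.5) -> int:
    """Number of overlapping windows that fit in one segment.

    sigma: segment length in seconds, omega: window size in seconds,
    tau: overlap fraction; assumes sigma is a multiple of omega * tau.
    """
    # Windows start every omega * tau seconds, so n = sigma / (omega * tau) - 1.
    return int(round(sigma / (omega * tau))) - 1
```

For example, a 10-s segment yields nine 2-s windows at 50% overlap, because the window start times are spaced 1 s apart.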
Table 2 shows that most of the dataset is null data (85.22%) or data that were labeled as unknown (7.52%). Null data are data that have not been seen by an annotator and unknown data are data that were labeled as such by an annotator because the ground truth was unclear or the behavior did not fit one of the 17 activities that were mainly exercised by the horses. Figure 3 shows the distribution of labeled activities. During the monitoring in the paddock, the horses were mainly walking and trotting with a rider on their back. During the monitoring in the outside pasture, the horses were mainly grazing. Therefore, these activities represent the largest part of the labeled dataset. Six activities from six subjects were annotated more extensively so that leave-one-out validation can be used for a subset of subjects and activities in [23,28].
Figure 4 shows the distribution of the labeled data using three summary statistics for each 2-s window of data: frequency entropy, the frequency component with the largest magnitude, and standard deviation. More details regarding these features can be found in [1]. The figure shows mixed data from 11 different subjects and all labeled activities. The different activity clusters are overlapping and activities such as head-shake, walking-rider, and galloping-rider are more scattered (they have a higher variability in the measurements). Although we did not consider the reasons for the higher variability empirically, one reason could be the variation in the size of horses. Table 3 shows that some of the subjects were smaller ponies, while others were larger horses. The difference in size could cause a higher variability in certain behaviors. Besides the distinction between horses and ponies, we did not record any other physical properties during the data collection.
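The three summary statistics can be computed per 2-s window roughly as follows. This is an illustrative NumPy sketch; the exact feature definitions used in [1] may differ:

```python
import numpy as np

def window_features(x: np.ndarray, fs: int = 100):
    """Standard deviation, dominant frequency, and frequency entropy of one window."""
    std = float(np.std(x))
    spectrum = np.abs(np.fft.rfft(x - np.mean(x)))   # one-sided magnitude spectrum
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    dominant = float(freqs[np.argmax(spectrum)])     # frequency component with largest magnitude
    p = spectrum / (np.sum(spectrum) + 1e-12)        # normalized spectral distribution
    entropy = float(-np.sum(p * np.log2(p + 1e-12))) # frequency entropy
    return std, dominant, entropy
```

For a pure 4 Hz sine sampled for 2 s at 100 Hz, the dominant frequency is 4 Hz and the frequency entropy is close to zero, since the spectral energy is concentrated in a single bin.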

File Structure

The following list describes the folders and files within the dataset:
/matlab 
Folder that contains the datasets in Matlab format organized per subject number (%ID) and name (%NAME) as subject_%ID_%NAME.mat. The columns of the tables are described in Table 1. Each row in the tables denotes a raw data sample.
/csv 
Folder that contains the datasets in .csv format. Each .csv file contains a maximum of 2^20 rows; the datasets are therefore separated into multiple .csv parts (denoted by %PART in the filename). Files are named as follows: subject_%ID_%NAME_part_%PART.csv.
subject_mapping[ .xlsx, .csv ] 
A table that maps the name of each subject to an integer subject identifier.
activity_distribution[ .xlsx, .csv ] 
A table containing the number of data samples per activity for each subject (Table 2).
settings[ .xlsx, .csv ] 
Table that shows the used settings to organize the dataset and activity_distribution table.
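A minimal sketch of loading the .csv parts with pandas follows. The column names match Table 1, but the tiny file written here contains synthetic illustration values and a hypothetical filename:

```python
import pandas as pd

# Columns as described in Table 1.
cols = ["Ax", "Ay", "Az", "Gx", "Gy", "Gz", "Mx", "My", "Mz",
        "A3D", "G3D", "M3D", "label", "segment", "subject"]
rows = [[0.1, 9.8, 0.0, 0.0, 0.0, 0.0, 20.0, 5.0, -30.0,
         9.8005, 0.0, 36.4005, "standing", 1, 2],
        [0.2, 9.7, 0.1, 0.0, 0.0, 0.0, 20.0, 5.0, -30.0,
         9.7026, 0.0, 36.4005, "standing", 1, 2]]
pd.DataFrame(rows, columns=cols).to_csv("subject_2_demo_part_0.csv", index=False)

# A subject's data span multiple parts; concatenate them in part order.
parts = ["subject_2_demo_part_0.csv"]
df = pd.concat((pd.read_csv(p) for p in parts), ignore_index=True)

labeled = df[df["label"] != "unknown"]              # drop the 'unknown' class
samples_per_segment = labeled.groupby("segment").size()   # rows (100 per second) per segment
```

Because each row is one raw 100 Hz sample, a 2-s window corresponds to 200 consecutive rows within one segment.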

3. Methods

Movement data were collected from 18 individual horses and ponies over seven days. Table 3 describes which of the subjects was a horse or a pony. The data were labeled according to the 17 preconceived activity categories listed in Table 4. The animals were recorded on video from various angles during the day. The videos were later used as ground truth for labeling the data.

3.1. Data Acquisition

All experiments with the animals complied with Dutch ethics law concerning working with animals. A sensor node was attached to the neck of a horse by means of a collar fabricated from hook and loop fastener. Different colors were used for the collars to ease the identification of the animals in the videos during the labeling process. Figure 5 shows how the sensors were attached to horses. We studied the effect of sensor orientation in earlier work [1] and showed that robust AAR is possible with sensor-orientation-independent features. To be able to evaluate AAR approaches that are robust against the orientation, we did not fix the orientation of the sensor devices. The sensor devices were always attached around the neck of the horses so that they could be worn without a saddle or halter. Furthermore, this location is often used in studies that monitor wildlife such as zebra [3], which increases the usability of our dataset for research related to other animals.
We used the Human Activity Monitor [29] sensor devices from Gulf Coast Data Concepts, which contain a three-axis accelerometer, gyroscope and magnetometer. The sensor parameter settings are described in Table 5.
The sensors were enabled and attached to the horses while they were in their stables; this was done for all subjects on each day of monitoring. The horses were ridden in turns in random order, so not every horse was outside in the pasture or being ridden the whole day. In some cases, part or most of the data recorded for a day may therefore capture a horse standing or roaming around in its stable. Activities such as eating, scratch biting, and rubbing were also exercised in the stable.

3.2. Data Labeling

The data were annotated with our labeling application [30] that is publicly available online [31]. The application is based on a Matlab GUI [32]. A screen capture of the application is shown in Figure 6. Clock timestamps from the sensor nodes were used to obtain a coarse synchronization. The labeling application was used to further synchronize videos with sensor data by adjusting the offset. The magnitude of the accelerometer vector (Equation (2)) was displayed to visualize the sensor data. The orientation-independent magnitude of the 3D vector is defined as:
M(t) = √( sx(t)² + sy(t)² + sz(t)² ),
where sx, sy, and sz are the three respective axes of the sensor. The data were labeled by clicking on the point in the graph representing a change in behavior. The activity belonging to the data following the selected point in time was then selected from a drop-down menu and added to the graph. A file with activity label and timestamp tuples was updated instantly whenever an annotation was added. The visualization of the sensor data and the tight synchronization with the video allowed the annotator to accurately label the activity associated with the sensor data.
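The magnitude computation is straightforward; a minimal NumPy sketch (the function name is ours):

```python
import numpy as np

def magnitude(sx: np.ndarray, sy: np.ndarray, sz: np.ndarray) -> np.ndarray:
    """Orientation-independent l2-norm M(t) of the three sensor axes."""
    return np.sqrt(sx ** 2 + sy ** 2 + sz ** 2)
```

This is the same quantity stored in the A3D, G3D, and M3D columns of the dataset (Table 1).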
Data from 11 subjects were annotated according to the behaviors listed in Table 4. The stop marker for one activity also served as the start marker for the following activity when that activity was of any class other than unknown. Transitions between activities were not always excluded from the data, so some data samples may include a transition phase to another activity. When a horse performed multiple activities simultaneously, the activity that was mainly exercised was chosen as the label. For example, when a horse was eating while slowly walking, the activity was labeled as grazing, because the movement is part of the grazing behavior.
In leave-one-subject-out cross-validation, all labeled data from one subject are withheld from the training and tuning of an AAR classifier and used only as a test set for assessing its performance. This training and assessment sequence is repeated until the data of each subject have been in the test set. To evaluate AAR methods through leave-one-subject-out cross-validation, the dataset should contain, for each subject, sufficient labeled data for each activity within a set of activities that is identical across all subjects used in the cross-validation. Therefore, we chose to label a subset of activities for a subset of the subjects more extensively. Acquiring a sufficient amount of labeled data for these subsets proved challenging. The reliability of the labeling can be improved when multiple people label the same parts of the data so that inter-observer reliability can be calculated. However, when multiple people label the same data, fewer data overall can be labeled within the same amount of time. Because we did not have the resources (in people and time) to label a sufficient amount of data multiple times, we chose a larger quantity of labeled data over higher reliability. The data were labeled a single time and later verified through visual inspection. All labeled data were visually inspected and corrected by a single person to minimize label ambiguity. Every effort was made to ensure a high-quality labeling process; e.g., we did not label data for very long consecutive periods, to prevent sloppiness due to repetitive work, and we kept thorough records of which parts of the labels still required validation.
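The leave-one-subject-out procedure described above can be sketched as follows. scikit-learn's LeaveOneGroupOut cross-validator produces equivalent splits, but a plain NumPy version keeps the idea explicit (the function name is ours):

```python
import numpy as np

def leave_one_subject_out(subjects: np.ndarray):
    """Yield (train_idx, test_idx) index pairs, one fold per unique subject."""
    for s in np.unique(subjects):
        test = np.flatnonzero(subjects == s)    # all windows of the held-out subject
        train = np.flatnonzero(subjects != s)   # windows of every other subject
        yield train, test
```

The unique segment identifiers in the dataset can be used the same way to keep overlapping windows of one segment out of both the training and test set.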

Author Contributions

Conceptualization, J.W.K.; methodology, J.W.K.; software, J.W.K.; validation, J.W.K. and L.M.J.; formal analysis, J.W.K.; investigation, J.W.K.; resources, J.W.K., N.M., and P.J.M.H.; data curation, J.W.K. and L.M.J.; writing—original draft preparation, J.W.K.; writing—review and editing, J.W.K. and L.M.J.; visualization, J.W.K.; supervision, N.M.; project administration, N.M.; and funding acquisition, N.M. and P.J.M.H.

Funding

This research was supported by the Smart Parks Project, which involves the University of Twente, Wageningen University & Research, ASTRON Dwingeloo, and Leiden University. The Smart Parks Project is funded by the Netherlands Organisation for Scientific Research (NWO).

Acknowledgments

We would like to thank the Equestrian facility “de Horstlinde” in Enschede, The Netherlands for their kind cooperation during the collection of this dataset. We would like to thank Lieke Hamelers and Heleen Visserman for their help during parts of the data collection and labeling process.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

References

  1. Kamminga, J.W.; Le, D.V.; Meijers, J.P.; Bisby, H.; Meratnia, N.; Havinga, P.J. Robust Sensor-Orientation-Independent Feature Selection for Animal Activity Recognition on Collar Tags. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2018, 2, 1–27.
  2. Kamminga, J.W.; Bisby, H.C.; Le, D.V.; Meratnia, N.; Havinga, P.J.M. Generic Online Animal Activity Recognition on Collar Tags. In Proceedings of the 2017 ACM International Joint Conference on Pervasive and Ubiquitous Computing and the 2017 ACM International Symposium on Wearable Computers; ACM: New York, NY, USA, 2017; pp. 597–606.
  3. Juang, P.; Oki, H.; Wang, Y.; Martonosi, M.; Peh, L.S.; Rubenstein, D. Energy-efficient computing for wildlife tracking. ACM SIGOPS Oper. Syst. Rev. 2002, 36, 96.
  4. Shepard, E.L.C.; Wilson, R.P.; Quintana, F.; Laich, A.G.; Liebsch, N.; Albareda, D.A.; Halsey, L.G.; Gleiss, A.; Morgan, D.T.; Myers, A.E.; et al. Identification of animal movement patterns using tri-axial accelerometry. Endanger. Spec. Res. 2010, 10, 47–60.
  5. Nathan, R.; Spiegel, O.; Fortmann-Roe, S.; Harel, R.; Wikelski, M.; Getz, W.M. Using tri-axial acceleration data to identify behavioral modes of free-ranging animals: General concepts and tools illustrated for griffon vultures. J. Exp. Biol. 2012, 215, 986–996.
  6. Wilson, R.P.; White, C.R.; Quintana, F.; Halsey, L.G.; Liebsch, N.; Martin, G.R.; Butler, P.J. Moving towards acceleration for estimates of activity-specific metabolic rate in free-living animals: The case of the cormorant. J. Anim. Ecol. 2006, 75, 1081–1090.
  7. Yoda, K.; Sato, K.; Niizuma, Y.; Kurita, M.; Bost, C.A.; Le Maho, Y.; Naito, Y. Precise monitoring of porpoising behaviour of Adélie penguins determined using acceleration data loggers. J. Exp. Biol. 1999, 202, 3121–3126.
  8. Tapiador-Morales, R.; Rios-Navarro, A.; Jimenez-Fernandez, A.; Dominguez-Morales, J.; Linares-Barranco, A. System based on inertial sensors for behavioral monitoring of wildlife. In Proceedings of the 2015 International Conference on Computer, Information and Telecommunication Systems (CITS), Gijon, Spain, 15–17 July 2015.
  9. le Roux, S.P.; Marias, J.; Wolhuter, R.; Niesler, T. Animal-borne behaviour classification for sheep (Dohne Merino) and Rhinoceros (Ceratotherium simum and Diceros bicornis). Anim. Biotelem. 2017, 5, 25.
  10. Bishop-Hurley, G.; Henry, D.; Smith, D.; Dutta, R.; Hills, J.; Rawnsley, R.; Hellicar, A.; Timms, G.; Morshed, A.; Rahman, A.; et al. An investigation of cow feeding behavior using motion sensors. In Proceedings of the 2014 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), Montevideo, Uruguay, 12–15 May 2014; pp. 1285–1290.
  11. González, L.A.; Bishop-Hurley, G.J.; Handcock, R.N.; Crossman, C. Behavioral classification of data from collars containing motion sensors in grazing cattle. Comput. Electron. Agric. 2015, 110, 91–102.
  12. Vázquez Diosdado, J.A.; Barker, Z.E.; Hodges, H.R.; Amory, J.R.; Croft, D.P.; Bell, N.J.; Codling, E.A. Classification of behaviour in housed dairy cows using an accelerometer-based activity monitoring system. Anim. Biotelem. 2015, 3, 15.
  13. Martiskainen, P.; Järvinen, M.; Skön, J.P.; Tiirikainen, J.; Kolehmainen, M.; Mononen, J. Cow behaviour pattern recognition using a three-dimensional accelerometer and support vector machines. Appl. Anim. Behav. Sci. 2009, 119, 32–38.
  14. Dutta, R.; Smith, D.; Rawnsley, R.; Bishop-Hurley, G.; Hills, J.; Timms, G.; Henry, D. Dynamic cattle behavioural classification using supervised ensemble classifiers. Comput. Electron. Agric. 2015, 111, 18–28.
  15. Sneddon, J.; Mason, A. Automated Monitoring of Foraging Behaviour in Free Ranging Sheep Grazing a Bio-diverse Pasture using Audio and Video Information. In Proceedings of the 8th International Conference on Sensing Technology, Wellington, New Zealand, 3–5 December 2014; pp. 2–4.
  16. Marais, J.; le Roux, S.P.; Wolhuter, R.; Niesler, T. Automatic classification of sheep behaviour using 3-axis accelerometer data. In Proceedings of the 2014 PRASA, RobMech and AfLaT International Joint Symposium, Cape Town, South Africa, 27–28 November 2014; pp. 97–102.
  17. Petrus, S. A Prototype Animal Borne Behaviour Monitoring System. Ph.D. Thesis, Stellenbosch University, Stellenbosch, South Africa, 2016.
  18. Terrasson, G.; Llaria, A.; Marra, A.; Voaden, S. Accelerometer based solution for precision livestock farming: Geolocation enhancement and animal activity identification. IOP Conf. Ser. Mater. Sci. Eng. 2016, 138.
  19. Watanabe, S.; Izawa, M.; Kato, A.; Ropert-Coudert, Y.; Naito, Y. A new technique for monitoring the detailed behaviour of terrestrial animals: A case study with the domestic cat. Appl. Anim. Behav. Sci. 2005, 94, 117–131.
  20. Ladha, C.; Hammerla, N.; Hughes, E.; Olivier, P.; Ploetz, T. Dog's life: Wearable Activity Recognition for Dogs. In Proceedings of the 2013 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp '13), Zurich, Switzerland, 8–12 September 2013; pp. 415–418.
  21. Gutierrez-Galan, D.; Dominguez-Morales, J.P.; Cerezuela-Escudero, E.; Rios-Navarro, A.; Tapiador-Morales, R.; Rivas-Perez, M.; Dominguez-Morales, M.; Jimenez-Fernandez, A.; Linares-Barranco, A. Embedded neural network for real-time animal behavior classification. Neurocomputing 2018, 272, 17–26.
  22. Kamminga, J.W. Horsing Around—A Dataset Comprising Horse Movement. 4TU.Centre for Research Data. Dataset. 2019. Available online: https://data.4tu.nl/repository/uuid:2e08745c-4178-4183-8551-f248c992cb14 (accessed on 22 September 2019).
  23. Kamminga, J.W.; Meratnia, N.; Havinga, P.J. Dataset: Horsing Around—Description and Analysis of Horse Movement Data. In Proceedings of the 2nd Workshop on Data Acquisition To Analysis (DATA'19), New York, NY, USA, 10 November 2019; Association for Computing Machinery: New York, NY, USA, 2019.
  24. Bengio, Y.; Courville, A.; Vincent, P. Representation Learning: A Review and New Perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1798–1828.
  25. Nweke, H.F.; Teh, Y.W.; Al-garadi, M.A.; Alo, U.R. Deep learning algorithms for human activity recognition using mobile and wearable sensor networks: State of the art and research challenges. Expert Syst. Appl. 2018, 105, 233–261.
  26. Wang, J.; Chen, Y.; Hao, S.; Peng, X.; Hu, L. Deep learning for sensor-based activity recognition: A survey. Pattern Recognit. Lett. 2019, 119, 3–11.
  27. Lecun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
  28. Kamminga, J. Deep Unsupervised Representation Learning for Animal Activity Recognition. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2019, submitted.
  29. Gulf Coast Data Concepts, LLC. Human Activity Monitor: HAM; Gulf Coast Data Concepts, LLC: Waveland, MS, USA, 2019.
  30. Kamminga, J.W.; Jones, M.; Seppi, K.; Meratnia, N.; Havinga, P.J. Synchronization between Sensors and Cameras in Movement Data Labeling Frameworks. In Proceedings of the 2nd Workshop on Data Acquisition To Analysis (DATA'19), New York, NY, USA, 10 November 2019; Association for Computing Machinery: New York, NY, USA, 2019.
  31. Kamminga, J. Jacob-Kamminga/Matlab-Movement-Data-Labeling-Tool: Generic Version Release. 8 August 2019. Available online: https://doi.org/10.5281/zenodo.3364004 (accessed on 22 September 2019).
  32. MATLAB. Version 9.5.0.944444 (R2018b); The MathWorks Inc.: Natick, MA, USA, 2018.
Figure 1. Horses in the outside paddock: (a) two subjects standing still; and (b) subject grazing.
Figure 2. Example of accelerometer and gyroscope data. Data from several activities are concatenated. Ax, Ay, and Az denote the x-, y-, and z-axes of the 3D accelerometer, respectively. Similarly, Gx, Gy, and Gz denote the x-, y-, and z-axes of the 3D gyroscope, respectively.
Figure 3. Labeled activity distribution from all subjects.
Figure 4. Data distribution in 3D, using three statistical features.
Figure 5. Sensor device placement around neck of a horse. The sensor devices were attached with a collar made out of hook and loop fastener. The sensor devices were attached to the manes using elastic bands. The orientation of the sensor devices was not fixed. The collars and sensor devices did not bother the horses.
Figure 6. Screenshot of the labeling application.
Table 1. Column description.
Column Name | Description
Ax | Raw data from accelerometer x-axis
Ay | Raw data from accelerometer y-axis
Az | Raw data from accelerometer z-axis
Gx | Raw data from gyroscope x-axis
Gy | Raw data from gyroscope y-axis
Gz | Raw data from gyroscope z-axis
Mx | Raw data from compass (magnetometer) x-axis
My | Raw data from compass (magnetometer) y-axis
Mz | Raw data from compass (magnetometer) z-axis
A3D | l2-norm (3D vector) of accelerometer axes
G3D | l2-norm (3D vector) of gyroscope axes
M3D | l2-norm (3D vector) of compass axes
label | Label that belongs to each row's data
segment | Each activity has been segmented with a maximum length of 10 s. Data within one segment are continuous. Segments are numbered incrementally.
subject | Subject identifier
Table 2. Number of data samples per subject and activity. Each sample denotes a 2-s window of raw data.
Name/Activity | Null | Unknown | Walking_Rider | Trotting_Rider | Grazing | Standing | Galloping_Rider | Walking_Natural | Head_Shake | Scratch_Biting | Galloping_Natural | Trotting_Natural | Rolling | Eating | Fighting | Shaking | Jumping | Rubbing | Scared | Total
Galoway62,15523,2649653637443151750103014025917013491316254 110,292
Bacardi92,7759850131719811116245288360 2240 13 6 108,013
Driekus85,46811,27140242670246534131027055141332331 4 106,962
Patron78,53615,156515033851951124470938837 51731 106,609
Happy68,46813,6068896703250621186689746238876 1 105,945
Zonnerante90,431 90,431
Duke81,885 81,885
Viva69,44144131066700 58827954 175,849
Flower75,741 75,741
Pan68,6281575241 36 44 70,524
Porthos67,080 67,080
Barino66,517 66,517
Zafir38,42410,34950783546109134782616110523913 12 59,984
Niro43,5632740 8520 2 46,410
Sense38,8231569 197739 15712044156 6 242,758
Blondy31,579 31,579
Noortje17,7772878 31 20,686
Clever17,696 17,696
total | 1,094,987 | 96,671 | 35,425 | 25,688 | 18,062 | 5297 | 3934 | 3609 | 619 | 285 | 102 | 94 | 67 | 48 | 31 | 21 | 12 | 6 | 3 | 1,284,961
fraction | 85.22% | 7.52% | 2.76% | 2.00% | 1.41% | 0.41% | 0.31% | 0.28% | 0.05% | 0.02% | 0.01% | 0.01% | 0.005% | 0.004% | 0.002% | 0.002% | 0.001% | 0.000% | 0.000% | 100.00%
fraction of labeled | | | 37.97% | 27.53% | 19.36% | 5.68% | 4.22% | 3.87% | 0.66% | 0.31% | 0.11% | 0.10% | 0.072% | 0.051% | 0.033% | 0.023% | 0.013% | 0.006% | 0.003% |
Table 3. Horse names and the distinction between horses and ponies.
Name | Type
Viva | horse
Driekus | horse
Galoway | horse
Barino | horse
Zonnerante | horse
Patron | horse
Duke | horse
Porthos | horse
Bacardi | horse
Happy | horse
Clever | horse
Zafier | horse
Noortje | pony
Blondy | pony
Flower | pony
Peter Pan | pony
Niro | horse
Sense | horse
Table 4. Observed daytime activities exercised by horses.
Activity | Description
Standing | Horse standing on four legs, no movement of the head, standing still
Walking natural | No rider on horse; the horse puts each hoof down one at a time, creating a four-beat rhythm
Walking rider | Rider on horse; the horse puts each hoof down one at a time, creating a four-beat rhythm
Trotting natural | No rider on horse; one front hoof and its opposite hind hoof come down at the same time, making a two-beat rhythm; different speeds possible but always a two-beat gait
Trotting rider | Rider on horse; one front hoof and its opposite hind hoof come down at the same time, making a two-beat rhythm; different speeds possible but always a two-beat gait
Galloping natural | No rider on horse; one hind leg strikes the ground first, then the other hind leg and one foreleg come down together, then the other foreleg strikes the ground, creating a three-beat rhythm
Galloping rider | Rider on horse; can be right or left leaning; one hind leg strikes the ground first, then the other hind leg and one foreleg come down together, then the other foreleg strikes the ground, creating a three-beat rhythm
Jumping | All legs off the ground, going over an obstacle
Grazing | Head down in the grass, eating and slowly moving to get to new grass spots
Eating | Head is up, chewing and eating food, usually hay or long grass
Head shake | Shaking the head alone, no body shake, with the head either up or down
Shaking | Shaking the whole body, including the head
Scratch biting | Horse uses its head/mouth to scratch, mostly the front legs
Rubbing | Scratching the body against an object
Fighting | Horses try to bite and kick each other
Rolling | Horse lying down on the ground, rolling onto its back, from one side to the other, not always a full roll
Scared | Quick sudden movement; the horse is startled
Table 5. Sensor information and parameter settings.
Parameter | Accelerometer | Gyroscope | Magnetometer
Unit | m/s² | °/s | μT
Sampling rate (Hz) | 100 | 100 | 12
Full scale range | 78.45 m/s² (8 g) | 2000 °/s | 1200 μT
Sensitivity | 9.8 m/s² (1 g) | 1 °/s | 1 μT

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).