
Happy Cow or Thinking Pig? WUR Wolf—Facial Coding Platform for Measuring Emotions in Farm Animals

Suresh Neethirajan
Farmworx, Department of Animal Sciences, Wageningen University & Research, 6700 AH Wageningen, The Netherlands
AI 2021, 2(3), 342–354
Submission received: 22 June 2021 / Revised: 31 July 2021 / Accepted: 3 August 2021 / Published: 5 August 2021


Emotions play an indicative and informative role in the investigation of farm animal behaviors. Systems that can measure and respond to emotions provide a natural user interface for the digitalization of animal welfare platforms. The faces of farm animals can be one of the richest channels for expressing emotions. WUR Wolf (Wageningen University & Research: Wolf Mascot), a real-time facial recognition platform that can automatically code the emotions of farm animals, is presented in this study. The developed Python-based algorithms detect and track the facial features of cows and pigs, analyze the appearance, ear postures, and eye white regions, and correlate these with the mental/emotional states of the farm animals. The system is trained on a dataset of facial feature images of farm animals collected across six farms and has been optimized to operate with an average accuracy of 85%. From these features, the emotional states of animals are determined in real time. The software detects 13 facial actions and nine inferred emotional states, including whether the animal is aggressive, calm, or neutral. A real-time emotion recognition system based on YOLOv3, YOLOv4, and Faster R-CNN detection models is presented. Detecting facial features of farm animals simultaneously in real time enables many new interfaces for automated decision-making tools for livestock farmers. Emotion sensing offers a vast potential for improving animal welfare and animal–human interactions.

1. Introduction

Digital technologies, in particular precision livestock farming and artificial intelligence, have the potential to transform animal welfare [1]. Innovative tools are needed to ensure access to sustainable, high-quality health care and welfare in animal husbandry management. Unlocking the full potential of automated measurement of the mental and emotional states of farm animals through digitalization, such as facial coding systems, would help blur the lines between biological, physical, and digital technologies [1,2].
Animal caretakers, handlers, and farmworkers typically rely on hands-on observations and measurements while investigating methods of monitoring animal welfare. To avoid the increased handling of animals in the process of taking functional or physiological data, and to reduce the subjectivity associated with manual assessments, automated animal behavior and physiology measurement systems can complement the current traditional welfare assessment tools and processes in enhancing the detection of animals in distress or pain in the barn [3]. Automated and continuous monitoring of animal welfare through digital alerting is rapidly becoming a reality [4].
In the human context, facial analysis platforms have long been in use for various applications, such as password systems on smartphones, identification at international border checkpoints, identification of criminals [5], diagnosis of Turner syndrome [6], detection of genetic disorder phenotypes [7], as a potential diagnostic tool for Parkinson disease [8], measuring tourist satisfaction through emotional expressions [9], and quantification of customer interest during shopping [10].

1.1. Emotions

Emotions are believed to be a social and survival mechanism that is present in many species. In humans, emotions are understood as deep and complex psychological experiences that influence physical reactions. There is an entire sector of science devoted to understanding the sophisticated inner workings of the human brain, yet many questions related to human emotions remain unanswered. Even less scientific research is focused on understanding the emotional capacity of non-human primates and other animals. The ability to interpret the emotional states of an animal is considerably more difficult than understanding the emotional state of a human [2]. The human face is capable of a wide array of expressions that communicate emotion and social intent to other humans. These expressions are so clear that even some non-human species, like dogs, can identify human emotion through facial expression [11]. Each species has its own unique physiological composition resulting in special forms of expression. Despite human intellectual capacity, emotional understanding of other species through facial observation has proven difficult.
Early studies [12] noted the influence of human bias in interpretations and accidental interference with the natural responses of animals. It is not uncommon for humans to anthropomorphize the expressions of animals. The baring of teeth is an example. Humans commonly consider such an expression to be “smiling” and interpret it as a sign of positive emotions. In other species, such as non-human primates, the baring of teeth is more commonly an expression of a negative emotion associated with aggression [13]. For these reasons and many others, the involvement of technology is critical in maintaining accurate and unbiased assessments of animal emotions and individual animal identification. In recent years, the number of studies concerning technological intervention in the field of animal behavior has increased [2]. Customized software has considerable potential to improve research, animal welfare, the production of food animals, legal identification, and medical practices.

1.2. Understanding Animal Emotions

The human comprehension of animal emotions may seem trivial; however, it is a mutually beneficial skill. The ability of animals to express complex emotions, such as love and joy, is still debated within the field of behavioral science. Other emotions, such as fear, stress, and pleasure, are more commonly studied [14]. These basic emotions have an impact on how animals feel about their environment and interact with it. They also impact an animal’s interactions with its conspecifics [15].
Non-domesticated species of animals are commonly observed in the wild and maintained in captivity to understand and conserve their species. Changes in the natural environment, because of human actions, can be stressful for individuals within a species. Captive non-domesticated animals also experience stress created through artificial environments and artificial mate selection. If even one animal experiences and displays signs of stress or aggression, its companions are likely to understand and attempt to respond to the emotional state [16]. These responses can result in stress, conflict, and the uneven distribution of resources [17]. The understanding of emotional expression in captive animals can help caretakers determine the most beneficial forms of care and companion matching for each individual, resulting in a better quality of life for the animals in question.
Companion animals are another category which can benefit from a deeper understanding of animal emotion. Just like humans, individual animals experience different thresholds for coping with pain and discomfort. Since many companion animals must undergo voluntary medical procedures for the well-being of their health and their species, it is important to understand their physical responses. Animals cannot tell humans how much pain they are in, so it is up to their caretakers to interpret the pain level an animal is experiencing and treat it appropriately [18]. This task is most accurately completed when the emotions of an animal are clearly and quickly detectable.
The understanding of expressions related to stress and pain from indicative facial features is impactful in animal agriculture. Animals used for food production often produce higher quality products when they do not experience unpleasant emotions [19]. The detection of individual animals experiencing stress also allows for the early identification of medical complications. A study on sows in parturition showed a uniform pattern of facially expressed discomfort during the birthing cycle [20]. In such a case, facial identification of emotional distress could be used to detect abnormally high levels of discomfort and alert human caretakers to the possibility of dystocia.

1.3. Facial Recognition Software

Facial recognition software has been used on human subjects for years. It has even contributed to the special effect capabilities in films and is used as a password system for locked personal devices [21]. It is a non-invasive method that tracks specific points on an individual’s face using photos and videos. These points need not be placed directly on the subject’s face; instead, computer software can be customized and trained to identify the location of each point. Once this software identifies an individual’s characteristics, it can be modified to detect changes in facial positioning and associate these changes with emotional states. In addition to this traditional way of tracking specific points on faces, there are a varied number of approaches [22] to identifying people from their faces.
The method of tracking specific points on faces can also be used to identify individuals and emotional states in animal subjects. With some software modification, scientists have been able to create reliable systems for the assessment of animal emotions through technological means [23,24]. These systems have been adapted to identify multiple species, including cows, cats, sheep, large carnivores, and many species of non-human primates. In studies focusing on identifying individual members of the same species within a group, the accuracy of specialized facial recognition software was found to be between 94% and 98.7%. Some of these studies even demonstrated the ability of software to identify and categorize new individuals within a group and to identify individuals at night [23,24,25]. Other studies focused more on the emotional expressions that could be identified through facial recognition software, and some of these showed an accuracy of around 80% when compared to the findings of professionals in the field of animal emotion identification [26]. Differences in facial features, such as the area of the eyes and the size, shape, and form of the ears and snout regions of farm animals, have been used as parameters for identifying individual animals in previous studies. The focus of our study is to measure inferred affective states/emotions of animals and not to identify individual animals.

1.4. The Grimace Scale

The facial landmark detection software used is based on a series of points in relation to phenotypic features of the species in question, but it uses an older theory to attach the location of those points to emotional states.
The grimace scale is a template created to depict the physical reactions associated with varying levels of discomfort. These scales are created in relation to a specific species and are defined by a numerical scale [27]. In the case of pigs, sheep, and cattle, grimace scales normally focus on tension in the neck, shape of the eye, tension in the brow, nose bunching, and positioning of the ears [20,28]. These visual cues can be combined with vocal cues to further depict the level of discomfort an animal is experiencing. In species like mice, other expressive physical features must be accounted for, such as whisker movement [29]. For less social species, like cats, the changes in facial expression in response to pain are more minute but still identifiable with the use of a grimace scale [18]. These scales have been proven as an accurate way to assess pain with minimal human bias [27]. They are created through the professional observation of species during controlled procedures that are known to trigger pain receptors. Once created, grimace scales can be converted to specific measurements that are detectable through facial recognition software with the assistance of the Viola-Jones algorithm. This algorithm breaks down the facial structure of animal subjects into multiple sections to refine, crop, and identify major facial features [26]. These features make the technological interpretation of animal emotions feasible across a variety of species and in a variety of settings.

1.5. Best Way to Manage Animal Emotion Recognition

Studies are most accurate when the spectrum of discomfort, including everything from acute low-grade pain to severe chronic pain, is fully identified. Events of low-grade discomfort are significant; however, they may not be identifiable through the production of the stress hormone cortisol [30]. In such situations, discomfort may only be discernible through the facial expressions of an animal, as detected by facial feature measurement software. Because of their quantitative nature, physiological indicators are typically preferred in the investigation of farm animal emotions. However, because physiological profiles do not correlate precisely with affective states, animal emotion researchers should not rely on physiological indicators of stress and pain alone, but should combine them with other indicators when interpreting animal behavior phenotypes.
On large-scale farms, it is important to keep the animals comfortable and relaxed, but it would be impractical and expensive to test the chemical levels of stress present in every animal. The identification of emotional states through facial recognition software provides a more efficient and cost-effective answer. It also provides an opportunity for the identification of very similar individuals in a way that cannot be illegally altered, unlike ear tags, which are sometimes changed for false insurance claims [25].
The use of facial recognition software also reduces the need for human interaction with animal subjects. For non-domesticated animals, the presence of human observers can be a stressful experience and alter their natural behavior. Facial recognition software allows researchers to review high-quality video and photo evidence of the subject’s emotional expressions without any disturbance. Researchers can even record the identification and actions of multiple individuals within a group of animals at the same time with the help of software such as LemurFaceID [24].
Facial recognition software also reduces the room for human error in the form of bias. Since humans experience emotions and have the ability to empathize with other emotional beings, human observers run the risk of interpreting animals’ emotional expressions improperly. In a study concerning the pain expressions of sows during parturition, it was noted that all female observers rated the sows’ pain significantly higher than the male observers did [20]. Distinguishing the emotional sensitivity of the person interpreting animal behavior from measurement error is therefore of fundamental importance in this kind of assessment. With well-calibrated software, these discrepancies can be eliminated, and researchers can focus more of their time on finding significant points and connections within recorded data, rather than spending their time recording the data.

2. Materials and Methods

2.1. Dataset Characteristics

The mapping of images to specific emotion classes of cows and pigs, based on facial feature indicators, is shown in Figure 1. Images (Figure 2) and videos of cows and pigs were collected from multiple locations: 3 farms in Canada, 2 farms in the USA, and 1 farm in India. Collecting varied videos and images and augmenting the dataset were intended to improve the generalization of the model and the diversity of the training dataset. Videos of cows and pigs were converted to images based on frames per second and augmentation methods. To create ground truth for testing the model and to validate it with unseen data, images were collected from the Internet by querying search engines with animal keywords and were then cropped. These images were also manually annotated for the potential inferred affective states perceived as emotions of the farm animals. Images were included for analysis only when the entire face of the cow or pig was contained in the frame, with the ears, eyes, and snout visible. The dataset consisted of 7800 images and over 150 videos from a total of 235 pigs and 210 dairy cows. In the on-farm scenario, cows and pigs were often crowded together, with occlusions due to mechanical farm structures and other barriers in the environment. Hence, images and videos were cropped before being used for further processing. Videos were converted into images and grouped into folders after labelling. The length of the videos ranged from 2 to 6 min each. All the images were grouped and categorized into multiple subfolders, based on three emotions of cows and six emotions of pigs. During data collection, the farm animals’ facial expressions moved from positive to neutral to negative states and then returned to the neutral state. No elicitation or inducement of affective states in the farm animals was conducted during the data collection.
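The conversion of videos to images based on frames per second can be sketched as follows. This is only an illustration under stated assumptions, not the pipeline used in the study: the function name `sample_frame_indices`, the native frame rate of 30 FPS, and the sampling rate of one image per second are hypothetical, since the source does not report them.

```python
def sample_frame_indices(total_frames, video_fps, images_per_second):
    """Return the indices of the frames to keep when downsampling a video.

    All parameter values used below are illustrative assumptions; the
    source does not state the frame rate or sampling rate that was used.
    """
    if video_fps <= 0 or images_per_second <= 0:
        raise ValueError("frame rates must be positive")
    # Keep every `step`-th frame; never step by less than one frame.
    step = max(1, round(video_fps / images_per_second))
    return list(range(0, total_frames, step))


# A 2 min clip at an assumed 30 FPS, sampled at an assumed 1 image per second.
indices = sample_frame_indices(total_frames=2 * 60 * 30,
                               video_fps=30,
                               images_per_second=1)
print(len(indices))  # 120 frames kept
```

In practice the selected indices would be passed to a video reader that decodes and saves only those frames before cropping and labelling.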
Datasets were split into 70% for training, 10% for validation, and 20% for testing for the evaluation of all three models. At least 100 images were present for each specific emotion in these folders for the test dataset. The data obtained from the Internet were of real, live farm animals rather than virtual images, and were cross-checked with their sources. One of the core design requirements of an emotion recognition model is the availability of sufficient labeled training data with varying features and conditions. For this, the data collected from farms were predominantly used. Data from the Internet were not used in the training of the model but only for testing the model.
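A 70/10/20 split of this kind can be sketched as below. The helper `split_dataset`, the fixed random seed, and the placeholder file names are assumptions made for illustration; the sketch also ignores the distinction between farm-collected and Internet-collected images described above.

```python
import random


def split_dataset(items, train=0.7, val=0.1, seed=0):
    """Shuffle items and split them into train/validation/test subsets.

    The test subset receives whatever remains after train and validation,
    i.e. about 20% with the default fractions.
    """
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded for a reproducible split
    n_train = round(len(items) * train)
    n_val = round(len(items) * val)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


# Hypothetical file names standing in for labelled images.
paths = [f"img_{i:04d}.jpg" for i in range(1000)]
train_set, val_set, test_set = split_dataset(paths)
print(len(train_set), len(val_set), len(test_set))  # 700 100 200
```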

2.2. Features and Data Processing

Several recent studies have clearly laid the foundation for the measurement of emotional states of farm animals based on their facial features such as ears, eyes, and orbital tightening (Table 1). The emotional state of a farm animal is inferred from the evidence of correlations between facial features, physiological responses, and inferred internal states. The collected and grouped image dataset was divided into nine classes based on the correlation between facial features, such as ear posture and eye whites of cows and pigs, and the sensing parameters, as compiled in Table 1. The eye white region and the ear posture direction of cows and pigs, and their relationship to mental states such as whether the animals are feeling positive, negative, or neutral, have been studied previously. The data were labelled and annotated by trained ethologists based on the established protocols given in [31,32,33,34,35]. The videos and images were preprocessed initially using a three-stage method: (1) detection of faces, (2) alignment of faces, (3) normalization of input. A regular smartphone (Samsung Galaxy S10) was used for capturing images and videos from different angles and directions when the animals were in the barn or pen. The collected data were labelled based on the time stamp and the RFID tags and markers. Faces were extracted not manually but with the MIT LabelImg code [36]. Bounding-box annotations were saved in the standard format for each model: PASCAL format for Faster-RCNN and YOLO format for both YOLOv3 and YOLOv4.
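The two annotation formats differ in how a bounding box is written: PASCAL VOC stores pixel corner coordinates (xmin, ymin, xmax, ymax), whereas the YOLO format stores the box center, width, and height normalized by the image size. A minimal conversion sketch follows; the function name `voc_to_yolo` and the example box are hypothetical and are not taken from the study.

```python
def voc_to_yolo(box, image_width, image_height):
    """Convert a PASCAL VOC box (xmin, ymin, xmax, ymax) in pixels to the
    YOLO convention (x_center, y_center, width, height) in [0, 1]."""
    xmin, ymin, xmax, ymax = box
    x_center = (xmin + xmax) / 2.0 / image_width
    y_center = (ymin + ymax) / 2.0 / image_height
    width = (xmax - xmin) / image_width
    height = (ymax - ymin) / image_height
    return x_center, y_center, width, height


# Hypothetical face box inside a 400 x 400 pixel image.
print(voc_to_yolo((100, 50, 300, 250), 400, 400))  # (0.5, 0.375, 0.5, 0.5)
```

In a YOLO label file each such tuple is preceded by the integer class index, one object per line.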

2.3. Hardware

The training and the testing of the three models based on YoloV3, YoloV4 and Faster RCNN were performed on NVidia GeForce GTX 1080 Ti graphics processing unit (GPU) running on CUDA 9.0 (compute unified device architecture) and cuDNN 7.6.1 (CUDA deep neural network library), equipped with 3584 CUDA cores and 11 GB memory.

2.4. YOLOv3

You Only Look Once (YOLO) is one of the fastest object detection systems, with a 30 FPS image processing capability and a 57.9% mAP (mean Average Precision) score [40]. YOLO is based on a single Convolutional Neural Network (CNN), i.e., one-step detection and classification. The CNN divides an image into blocks and then predicts the bounding boxes and probabilities for each block. The earlier version, YOLOv2, was built on a custom Darknet architecture: darknet-19, a 19-layer network supplemented with 11 object detection layers. This architecture, however, struggled with small object detection. YOLOv3 uses a variant of Darknet, a 53-layer ImageNet-trained network combined with 53 more layers for detection, with 61.5 M parameters. Detection is done at three receptive fields (85 × 85, 181 × 181, and 365 × 365), addressing the small object detection issue. The loss function does not utilize exhaustive candidate regions but generates the bounding box coordinates and confidence using regression. This gives faster and more accurate detection. The loss consists of four parts, each given equal weight: regression loss, confidence loss, classification loss, and loss for the absence of any object. When applied to face detection, multiple pyramid pooling layers capture high-level semantic features, and the loss function is altered so that regression loss and confidence loss are given a higher weight. These alterations produce accurate bounding boxes and efficient feature extraction. YOLOv3 provides detection at an excellent speed. However, it suffers from some shortcomings: expressions are affected by the external environment, and orientations/posture are not taken into account.
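The idea that the image is divided into blocks, each responsible for the objects centered in it, can be illustrated with a short sketch. The grid sizes 13, 26, and 52 correspond to a 416 × 416 input in the commonly used YOLOv3 configuration; the input size used in this study is not stated, so these values and the function name `responsible_cell` are assumptions.

```python
def responsible_cell(x_center, y_center, grid_size):
    """Return the (row, col) of the grid cell that contains a box center.

    Coordinates are normalized to [0, 1]; a center lying exactly on the
    right or bottom edge is clamped into the last cell.
    """
    col = min(int(x_center * grid_size), grid_size - 1)
    row = min(int(y_center * grid_size), grid_size - 1)
    return row, col


# The same normalized face center located on three detection scales.
for grid_size in (13, 26, 52):
    print(grid_size, responsible_cell(0.5, 0.375, grid_size))
# 13 (4, 6)
# 26 (9, 13)
# 52 (19, 26)
```

Finer grids place the same object in a smaller cell, which is one reason multi-scale detection helps with small objects.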

2.5. YOLOv4

YOLOv4 introduces several features that improve the learning of Convolutional Neural Networks (CNNs) [41]. These include Weighted Residual Connections (WRC), Cross-Stage-Partial connections (CSP), Cross mini-Batch Normalization (CmBN), and Self-adversarial training (SAT). CSPDarknet53 is used as the backbone architecture. It contains 29 convolutional layers of 3 × 3, a 725 × 725 receptive field, and 27.6 M parameters. Spatial Pyramid Pooling (SPP) is added on top of this backbone. YOLOv4 improves the Average Precision (AP) and FPS of YOLOv3 by 10% and 12%, respectively. It is faster, more accurate, and can be used on a conventional GPU with 8 to 16 GB of VRAM, which enables widespread adoption. The new features address the weaknesses of its predecessor and improve on its already impressive face detection capabilities.

2.6. Faster R-CNN

Faster R-CNN is the third iteration of the R-CNN architecture. R-CNN, introduced in 2014 in work on rich feature hierarchies for accurate object detection and semantic segmentation, used Selective Search to detect regions of interest in an image and a CNN to classify and adjust them [42]. However, it struggled to produce real-time results. The next step in its evolution was Fast R-CNN, a faster model with shared computation capabilities owing to the Region of Interest Pooling technique. Finally came Faster R-CNN, the first fully differentiable model. The architecture consists of a CNN pre-trained on ImageNet, used up to an intermediate layer, which gives a convolutional feature map. This serves as a feature extractor and is provided as input to the Region Proposal Network, which tries to find bounding boxes in the image. Region of Interest (RoI) Pooling then extracts the features that correspond to the relevant objects into a new tensor. Finally, the R-CNN module classifies the contents of the bounding box and adjusts its coordinates to better fit the detected object. Maximum pooling is used to reduce the dimensions of the extracted features. A Softmax layer and a regression layer were used to classify facial expressions. As a result, Faster R-CNN achieves higher precision and a lower miss rate. However, it is prone to overfitting: the model can stop generalizing at any point and start learning noise.
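RoI pooling with maximum pooling can be sketched on a single-channel feature map as follows. This is a simplified illustration, not the implementation used in the study: real implementations operate on multi-channel tensors and first map image coordinates onto feature-map coordinates. The function names, the toy feature map, and the 2 × 2 output size are assumptions.

```python
def ceil_div(a, b):
    """Integer ceiling division for positive b."""
    return -(-a // b)


def roi_max_pool(feature_map, roi, output_size):
    """Max-pool one region of a 2D feature map to a fixed output size.

    feature_map: list of rows; roi: (top, left, bottom, right) with
    exclusive bottom/right; output_size: (out_h, out_w).
    """
    top, left, bottom, right = roi
    out_h, out_w = output_size
    roi_h, roi_w = bottom - top, right - left
    if roi_h <= 0 or roi_w <= 0:
        raise ValueError("region of interest must be non-empty")
    pooled = []
    for i in range(out_h):
        # Row range of bin i: floor for the start, ceiling for the end.
        r_start = top + (i * roi_h) // out_h
        r_end = top + ceil_div((i + 1) * roi_h, out_h)
        row = []
        for j in range(out_w):
            c_start = left + (j * roi_w) // out_w
            c_end = left + ceil_div((j + 1) * roi_w, out_w)
            row.append(max(feature_map[r][c]
                           for r in range(r_start, r_end)
                           for c in range(c_start, c_end)))
        pooled.append(row)
    return pooled


features = [[1, 2, 3, 4],
            [5, 6, 7, 8],
            [9, 10, 11, 12],
            [13, 14, 15, 16]]
print(roi_max_pool(features, (0, 0, 4, 4), (2, 2)))  # [[6, 8], [14, 16]]
print(roi_max_pool(features, (1, 1, 4, 4), (2, 2)))  # [[11, 12], [15, 16]]
```

Regions of different sizes are reduced to the same fixed shape, which is what lets a single classifier head process every proposal.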

3. Results

3.1. Model Parameters

YOLOv3 and YOLOv4 were given image inputs in batches of 64. Learning rate, Momentum, and Step Size were set to 0.001, 0.9, and 20,000 steps, respectively. Training took 10+ hours for the former and 8+ hours for the latter. Faster R-CNN accepted input in batches of 32. Learning rate, Gamma, Momentum, and Step Size were set to 0.002, 0.1, 0.9, and 15,000, respectively. It was the most time-consuming of the three to train, taking 14+ hours. The confusion matrices of DarkNet-53, CSPDarkNet-53, and VGG-16, trained and tested on the farm animals’ images and videos dataset using YoloV3, YoloV4, and Faster RCNN, respectively, are shown in the Supplementary Material Tables S1–S3.
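Learning rate, Gamma, and Step Size are typically combined in a step-decay schedule, in which the learning rate is multiplied by Gamma once every Step Size iterations. The sketch below uses the Faster R-CNN values reported above; that the schedule was applied in exactly this form is an assumption, and the function name `step_decay_lr` is hypothetical.

```python
import math


def step_decay_lr(base_lr, gamma, step_size, iteration):
    """Learning rate after `iteration` steps under step decay:
    base_lr * gamma ** floor(iteration / step_size)."""
    return base_lr * gamma ** (iteration // step_size)


# Values reported for Faster R-CNN: lr 0.002, gamma 0.1, step size 15,000.
for iteration in (0, 14_999, 15_000, 30_000):
    lr = step_decay_lr(0.002, 0.1, 15_000, iteration)
    print(iteration, f"{lr:.6f}")

# The rate drops by a factor of ten at each step boundary.
assert math.isclose(step_decay_lr(0.002, 0.1, 15_000, 15_000), 0.0002)
```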

3.2. Computation Resources

The reported inference times are 0.0331 s for YOLOv3 (Darknet53 architecture), 0.27 s for YOLOv4 (CSPDarknet53), and 0.3 s for Faster R-CNN (VGG-16). YOLOv4 is the most memory-efficient model, using 3479 MB compared to 4759 MB for YOLOv3 and 5877 MB for Faster R-CNN. YOLOv4 outperforms its two competitors in resource efficiency, with the lowest memory usage and an acceptable inference time. Figure 3 illustrates the proposed WUR Wolf model developed using the pre-trained Deep CNN. Figure 4 shows the images of farm animals detected by the WUR Wolf facial coding platform from the dataset using the Faster RCNN technique.
YOLOv3 takes the least time to learn most of the features compared to the other two models, and its accuracy curve (Figure 5a) flattens earlier as a result. Its fluctuating loss curve is a result of more repetitive predictions and slower convergence compared to YOLOv4 and Faster R-CNN.
YOLOv4 is slower in learning than Yolov3 but achieves a higher accuracy score and a smoother loss function. Validation accuracy is also very close to train accuracy, indicating that the model generalizes well on unseen data (Figure 6) and would perform better in real-time than v3.
Faster R-CNN achieves a higher accuracy score (Figure 7) than both of the YOLO variants and also converges quickly. However, it generalizes poorly, as the difference between validation and training accuracy is very large at multiple points during training. Faster R-CNN’s accuracy score (93.11% on the training set and 89.19% on the validation set) outperforms both YOLOv4 (89.96% on training and 86.45% on validation) and YOLOv3 (85.21% on training and 82.33% on validation). Its loss curve is also the fastest to converge, followed closely by v4, with v3 the worst performer on this metric.
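The gap between training and validation accuracy is a simple proxy for how well each model generalizes. The short calculation below uses only the accuracies reported above; the dictionary layout and variable names are illustrative.

```python
# (training accuracy %, validation accuracy %) as reported in the text.
reported = {
    "Faster R-CNN": (93.11, 89.19),
    "YOLOv4": (89.96, 86.45),
    "YOLOv3": (85.21, 82.33),
}

# Gap in percentage points; a larger gap suggests weaker generalization.
gaps = {name: round(train - val, 2) for name, (train, val) in reported.items()}

for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {gap:.2f} percentage points")
# Faster R-CNN: 3.92 percentage points
# YOLOv4: 3.51 percentage points
# YOLOv3: 2.88 percentage points
```

On these reported figures Faster R-CNN has the highest accuracy but also the widest gap, consistent with the overfitting tendency noted for that model.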

3.3. Mean Average Precision (mAP)

The mAP score compares the actual bounding box to the detected box and returns a score. The higher the score, the more accurate the model’s object boundary detection. YOLOv4 has an mAP score of 81.6% at 15 FPS, performing better than both of the other models. YOLOv3 also performs well on this metric, with an mAP score of 77.60% at 11 FPS. Faster R-CNN provides a moderate mAP score of 75.22%; however, its processing speed is very slow at just 5 FPS. Among the three, YOLOv4 provides the best bounding boxes at the highest speed.
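The comparison between an annotated box and a detected box is usually expressed as Intersection over Union (IoU), on which average precision, and hence mAP, is built. The sketch below shows only the IoU step; the function name `iou` and the example boxes are assumptions, and the IoU threshold used to compute mAP in this study is not stated.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical annotated and detected face boxes (overlap 25, union 175).
ground_truth = (0, 0, 10, 10)
detection = (5, 5, 15, 15)
print(round(iou(ground_truth, detection), 4))  # 0.1429
```

A detection is typically counted as correct when its IoU with the annotation exceeds a chosen threshold; precision and recall over all detections then yield the average precision per class.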

3.4. F1 Score

The annotated labels for both cows and pigs can be grouped on the basis of mental states such as positive, negative, and neutral. Analyzing model performance on these groups is useful in measuring how the model works in different contexts. The F1 score is a good measure for this analysis. A confusion matrix tabulates the performance of a model on a dataset for which the true values are known. Model results are compared against pre-set annotations, and an analysis reveals the performance of each model in detecting the emotion portrayed in the picture. Confusion matrices of all three models are given in the Supplementary Reading Section alongside the respective F1 scores. Negative contexts require additional effort and reactions from the animal, and as a result there are more pixels with information useful for classification. All three models return more true positives for such cases (Tables S4–S6). The average F1 scores of the models are as follows: 85.44% for YOLOv3, 88.33% for YOLOv4, and 86.66% for Faster R-CNN. YOLOv4 outperforms the other two in predicting emotion states for each image.
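Per-class F1 scores and their average can be derived from a confusion matrix as sketched below. The matrix values are hypothetical and are not taken from the supplementary tables; the function name `f1_per_class` and the use of an unweighted (macro) average are also assumptions.

```python
def f1_per_class(confusion):
    """Compute the F1 score of each class from a square confusion matrix.

    confusion[t][p] is the number of samples with true class t that were
    predicted as class p.
    """
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[r][k] for r in range(n)) - tp  # predicted k, wrong
        fn = sum(confusion[k]) - tp                       # true k, missed
        denominator = 2 * tp + fp + fn
        scores.append(2 * tp / denominator if denominator else 0.0)
    return scores


# Hypothetical counts for the classes (positive, neutral, negative).
matrix = [[90, 5, 5],
          [10, 80, 10],
          [5, 5, 90]]
scores = f1_per_class(matrix)
macro_f1 = sum(scores) / len(scores)
print([round(s, 4) for s in scores], round(macro_f1, 4))
```

F1 is the harmonic mean of precision and recall, so it penalizes a model that achieves one at the expense of the other.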

4. Discussions

Non-invasive technology that can assess good and poor welfare of farm animals, including positive and negative emotional states, will soon be possible using the proposed WUR Wolf Facial Coding Platform. The ability to track and analyze how animals feel will be a breakthrough in establishing animal welfare auditing tools.
In this project, the applicability of three deep learning-based models for determining the emotions of farm animals, Faster R-CNN, and two variants of YOLO, i.e., YOLOv3 and YOLOv4, has been determined. For training the YOLOv3 and YOLOv4 algorithms, the darknet framework was employed. YOLOv4 has the CSPDarknet53, while YOLOv3 has the Darknet53. Because of the differences between the backbones, Yolov4 is faster and provides more accurate results for real-time applications.
A demonstration and results of emotion detection in cows and pigs using Faster RCNN (Figure 2) are shown in the attached Supplementary Video S1. Faster RCNN is suitable for facial expression recognition on mobile terminals, where hardware resources are lacking [43]. If speed (time for data processing) is the deciding factor, then YoloV4 is a better choice than Faster RCNN. Owing to the advantages of its network design, the two-stage Faster RCNN method is better at analyzing datasets with large variations, composed of facial images and videos with complex and multiscale objects. Hence, for higher accuracy in the results of emotion detection, Faster RCNN is recommended over YoloV4. In on-farm conditions where equipment with strong data processing ability may be lacking, Faster RCNN would be a good choice. Technological advances in the field of animal behavior are a huge step in improving humans’ understanding of the animals they share this world with, but there is still room to grow.
The performance of the three models on a complex dataset of animal faces has been evaluated in this study. None of the detectors has been deployed on animal data before. Animal faces are different from human faces, as animals have fur and fewer facial muscles than humans. Moreover, there has been no study that compares facial coding analysis between Yolov3, Yolov4, and Faster RCNN. Our study also provides critical insights into the specific advantages of one ML model over another for on-farm practical applications. The facial coding platform presented in our study is only a preliminary step, a proof-of-concept for a yet to be fully developed platform for assessing the emotions and mental states of farm animals. For example, mixed ear position (one ear directed forwards and one ear backwards) indicates a negative mental state in pigs, as evidenced by [32,33]; this and other lateral ear posture data from pigs were not available in our study. Farm animals are phenotypically very diverse depending on the genotype. Additional studies with varying animal breeds would further strengthen the validation of the developed emotion recognition platform. The proposed tool has the ability to offer new ways of investigating individual variations within the same species based on facial features. Future investigation with additional comprehensive facial feature data from farm animals under varying conditions is warranted for full and thorough validation of facial coding platforms for determining the affective states of farm animals.
No facial recognition software created for animals is 100% accurate yet, and so far such software has been adapted to identify the physical features of only a few common species and non-human primates. Non-mammalian animal species are minimally expressive and have not been tested with facial recognition software for the study of their emotions. One study even raised the possibility that animals may be able to suppress emotional expression, much as people do in situations where it is socially appropriate to express only certain emotions [29]. The results presented in this study are preliminary: early work that forms a basis for the development of a more comprehensive platform. Studies are currently underway to elicit specific emotions in cows and pigs and to measure these emotions using the facial coding features. In this study, we present a framework for measuring emotions using facial features of cows and pigs. Further study is needed to explore the intensity of the emotions and their relationship to the valence and arousal components in the measurement of emotions from farm animals. A key takeaway from this study is the ability of automated systems to measure not just pain and suffering but also positive emotions. Unlike humans, farm animals are not capable of hiding or disguising their true emotions, hence the developed system is expected to exert an influence on the welfare of farm animals. An analysis of human facial expressions in 6 million video clips from 144 countries using a deep neural network [44] determined that 16 expressions are displayed in significantly similar ways across varying social contexts. That study further clarified that 70% of human facial expressions as emotional responses are shared across cultures.
There are many questions related to animal emotional expression that have yet to be answered, but there is a good chance that the advancement and implementation of facial recognition software will lead scientists to those answers in the future.

5. Conclusions

The detailed analysis of the performance of the three Python-based machine learning models shows the utility of each model in specific farm conditions and how they compare against each other. YOLOv3 learns quickly but gives random predictions and fluctuating losses. Its next iteration, YOLOv4, has improved considerably in many regards. If the aim is to balance higher accuracy with faster response and less training time, YOLOv4 works best. If training speed and memory usage are not a concern, the two-stage Faster R-CNN method performs well and has a robust design for predicting in different contexts. The output is accurate, and overfitting is avoided. There is no one-size-fits-all model, but with careful consideration, the most efficient and cost-effective methods can be selected and implemented in automating the facial coding platform for determining farm animal emotions. Facial features as an indicator of the emotions of farm animals provide only a one-dimensional view of their affective states. With the advent of Artificial Intelligence and sensor technologies, multi-dimensional models of mental and emotional affective states will emerge in the near future, combining measurements of behavioral patterns and tracked changes in farm animal postures and behavior with large-scale neural recordings.

Supplementary Materials

The following are available online at, Table S1: The confusion matrix of DarkNet-53 trained and tested on the farm animals’ images and videos dataset using YoloV3, Table S2: The confusion matrix of CSPDarkNet-53 trained and tested on the farm animals’ images and videos dataset using YoloV4, Table S3: The confusion matrix of VGG-16 trained and tested on the farm animals’ images and videos dataset using Faster RCNN, Table S4: F1-Scores for emotion detection and recognition by Yolov3 from facial features of cows and pigs’ images and videos data set, Table S5: F1-Scores for emotion detection and recognition by Yolov4 from facial features of cows and pigs’ images and videos data set, Table S6: F1-Scores for emotion detection and recognition by Faster RCNN from facial features of cows and pigs’ images and videos data set, Video S1: Demonstration of the WUR Wolf Facial Coding Platform.
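
The supplementary tables pair a confusion matrix with per-class F1-scores for each detector. As a minimal sketch of how the two relate, the snippet below derives per-class F1 from a confusion matrix. The function name `per_class_f1`, the row/column convention (rows are true classes, columns are predicted classes), and the example counts are assumptions made for illustration; they are not the values reported in Tables S1 to S6.

```python
from typing import Dict, List


def per_class_f1(confusion: List[List[int]], labels: List[str]) -> Dict[str, float]:
    """Compute the F1-score of each class from a square confusion matrix.

    Assumed convention: confusion[i][j] counts samples whose true class is
    labels[i] and whose predicted class is labels[j].
    """
    scores: Dict[str, float] = {}
    n = len(labels)
    for i, label in enumerate(labels):
        tp = confusion[i][i]                                  # correctly predicted
        fn = sum(confusion[i]) - tp                           # missed samples of this class
        fp = sum(confusion[r][i] for r in range(n)) - tp      # other classes predicted as this one
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never occurs
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


# Made-up counts for illustration only, not results from the paper.
example_labels = ["excited", "relaxed", "stress"]
example_confusion = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
example_f1 = per_class_f1(example_confusion, example_labels)
```

With these illustrative counts, the "relaxed" class has 7 true positives, 3 false negatives and 3 false positives, so its F1-score is 14/20 = 0.7.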

Author Contributions

Conceptualization, S.N.; methodology, S.N.; algorithm development & coding, S.N.; validation, S.N.; formal analysis, S.N.; investigation, S.N.; writing—original draft preparation, S.N.; writing—review and editing, S.N.; supervision, S.N.; project administration, S.N. All authors have read and agreed to the published version of the manuscript.

Funding


This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References


  1. Neethirajan, S.; Kemp, B. Digital Livestock Farming. Sens. Bio-Sens. Res. 2021, 32, 100408.
  2. Neethirajan, S.; Reimert, I.; Kemp, B. Measuring Farm Animal Emotions—Sensor-Based Approaches. Sensors 2021, 21, 553.
  3. Neethirajan, S. Transforming the adaptation physiology of farm animals through sensors. Animals 2020, 10, 1512.
  4. Do, J.P.; Defensor, E.B.; Ichim, C.V.; Lim, M.A.; Mechanic, J.A.; Rabe, M.D.; Schaevitz, L.R. Automated and Continuous Monitoring of Animal Welfare through Digital Alerting. Comp. Med. 2020, 70, 313–327.
  5. Purshouse, J.; Campbell, L. Privacy, crime control and police use of automated facial recognition technology. Crim. Law Rev. 2019, 3, 188–204.
  6. Pan, Z.; Shen, Z.; Zhu, H.; Bao, Y.; Liang, S.; Wang, S.; Xiong, G. Clinical application of an automatic facial recognition system based on deep learning for diagnosis of Turner syndrome. Endocrine 2021, 72, 865–873.
  7. Hadj-Rabia, S.; Schneider, H.; Navarro, E.; Klein, O.; Kirby, N.; Huttner, K.; Grange, D.K. Automatic recognition of the XLHED phenotype from facial images. Am. J. Med. Genet. A 2017, 173, 2408–2414.
  8. Jin, B.; Qu, Y.; Zhang, L.; Gao, Z. Diagnosing Parkinson Disease Through Facial Expression Recognition: Video Analysis. J. Med. Internet Res. 2020, 22, e18697.
  9. González-Rodríguez, M.R.; Díaz-Fernández, M.C.; Gómez, C.P. Facial-expression recognition: An emergent approach to the measurement of tourist satisfaction through emotions. Telemat. Inform. 2020, 51, 101404.
  10. Yolcu, G.; Oztel, I.; Kazan, S.; Oz, C.; Bunyak, F. Deep learning-based face analysis system for monitoring customer interest. J. Ambient Intell. Humaniz. Comput. 2020, 11, 237–248.
  11. Siniscalchi, M.; D’Ingeo, S.; Quaranta, A. Orienting asymmetries and physiological reactivity in dogs’ response to human emotional faces. Learn. Behav. 2018, 46, 574–585.
  12. Kujala, M.V.; Somppi, S.; Jokela, M.; Vainio, O.; Parkkonen, L. Human empathy, personality and experience affect the emotion ratings of dog and human facial expressions. PLoS ONE 2017, 12, e0170730.
  13. Zhao, H.; Li, J.; Wang, X.; Ruliang, P. Facial expression recognition in golden snub-nosed monkeys. Curr. Zool. 2020, 66, 695–697.
  14. Paul, E.S.; Mendl, M.T. Animal emotion: Descriptive and prescriptive definitions and their implications for a comparative perspective. Appl. Anim. Behav. Sci. 2018, 205, 202–209.
  15. Nawroth, C.; Langbein, J.; Coulon, M.; Gabor, V.; Oesterwind, S.; Benz-Schwarzburg, J.; von Borell, E. Farm animal cognition—linking behavior, welfare and ethics. Front. Vet. Sci. 2019, 6, 1–16.
  16. Howarth, E.R.I.; Kemp, C.; Thatcher, H.R.; Szott, I.D.; Farningham, D.; Witham, C.L.; Bethel, E.J. Developing and Validating Attention Bias Tools for Assessing Trait and State Affect in Animals. Appl. Anim. Behav. Sci. 2021, 234, 1–46.
  17. Crump, A.; Bethel, E.; Earley, R.; Lee, V.E.; Arnott, G. Emotion in Animal Contests. Proc. R. Soc. B 2020, 287, 20201715.
  18. Finka, L.R.; Stelio, P.L.; Brondani, J.; Tzimiropolos, Y.; Mills, D. Geometric morphometrics for the study of facial expressions in non-human animals, using the domestic cat as an exemplar. Sci. Rep. 2019, 9, 9883.
  19. Mota-Rojas, D.; Olmos-Hernandez, A.; Verduzco-Mendoza, A.; Hernandez, E.; Whittaker, A. The Utility of Grimace Scales for Practical Pain Assessment in Laboratory Animals. Animals 2020, 10, 1838.
  20. Navarro, E.; Mainau, E.; Manteca, X. Development of a Facial Expression Scale Using Farrowing as a Model of Pain in Sows. Animals 2020, 10, 2113.
  21. Seng, S.; Al-Ameen, M.N.; Wright, M. A first look into users’ perceptions of facial recognition in the physical world. Comput. Secur. 2021, 105, 102227.
  22. Kumar, A.; Kaur, A.; Kumar, M. Face detection techniques: A review. Artif. Intell. Rev. 2019, 52, 927–948.
  23. Guo, S.; Xu, P.; Miao, Q.; Shao, G.; Li, B. Automatic Identification of Individual Primates with Deep Learning Techniques. iScience 2020, 23, 101412.
  24. Crouse, D.; Jacobs, R.; Richardson, Z.; Klum, S.; Tecot, S. LemurFaceID: A face recognition system to facilitate individual identification of lemurs. BMC Zool. 2017, 2, 2.
  25. Kumar, S.; Singh, S.J.; Singh, R.; Singh, A.K. Deep Learning Framework for Recognition of Cattle Using Muzzle Point Image Pattern. Measurement 2017, 116, 1–17.
  26. Blumrosen, G.; Hawellek, D.; Pesaran, B. Towards Automated Recognition of Facial Expressions in Animal Models. In Proceedings of the IEEE International Conference on Computer Vision Workshops (ICCVW), Venice, Italy, 22–29 October 2017; pp. 2810–2819.
  27. Mogil, J.S.; Pang, D.S.J.; Dutra, G.G.S.; Chambers, C. The development and use of facial grimace scales for pain measurement in animals. Neurosci. Biobehav. Rev. 2020, 116, 480–493.
  28. Guesgen, M.; Beausoleil, N.J.; Leach, M.; Minot, E.O.; Stafford, K.J. Coding and quantification of a facial expression for pain in lambs. Behav. Process. 2016, 132, 49–56.
  29. Dolensek, N.; Gehrlach, D.A.; Klein, A.S.; Gogolla, N. Facial expressions of emotion states and their neuronal correlates in mice. Science 2020, 368, 89–94.
  30. Lansade, L.; Nowak, R.; Lainé, A.; Leterrier, C.; Bertin, A. Facial expression and oxytocin as possible markers of positive emotions in horses. Sci. Rep. 2018, 8, 14680.
  31. Lambert, H.S.; Carder, G. Looking into the eyes of a cow: Can eye whites be used as a measure of emotional state? Appl. Anim. Behav. Sci. 2017, 186, 1–6.
  32. Reimert, I.; Bolhuis, J.E.; Kemp, B.; Rodenburg, T.B. Indicators of positive and negative emotions and emotional contagion in pigs. Physiol. Behav. 2013, 109, 42–50.
  33. Reimert, I.; Bolhuis, J.E.; Kemp, B.; Rodenburg, T.B. Emotions on the loose: Emotional contagion and the role of oxytocin in pigs. Anim. Cogn. 2015, 18, 517–532.
  34. Krugmann, K.L.; Mieloch, F.J.; Krieter, J.; Czycholl, I. Can tail and ear postures be suitable to capture the affective state of growing pigs? J. Appl. Anim. Welf. Sci. 2020, 1–13.
  35. Czycholl, I.; Hauschild, E.; Büttner, K.; Krugmann, K.; Burfeind, O.; Krieter, J. Tail and ear postures of growing pigs in two different housing conditions. Behav. Process. 2020, 176, 104138.
  36. Tzutalin, D. LabelImg, Github. 2015. Available online: (accessed on 1 October 2020).
  37. Battini, M.; Agostini, A.; Mattiello, S. Understanding cows’ emotions on farm: Are eye white and ear posture reliable indicators? Animals 2019, 9, 477.
  38. Gómez, Y.; Bieler, R.; Hankele, A.K.; Zähner, M.; Savary, P.; Hillmann, E. Evaluation of visible eye white and maximum eye temperature as non-invasive indicators of stress in dairy cows. Appl. Anim. Behav. Sci. 2018, 198, 1–8.
  39. Camerlink, I.; Coulange, E.; Farish, M.; Baxter, E.M.; Turner, S.P. Facial expression as a potential measure of both intent and emotion. Sci. Rep. 2018, 8, 1–9.
  40. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
  41. Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. Yolov4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934.
  42. Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1440–1448.
  43. Bao, J.; Wei, S.; Li, J.; Zhang, W. Optimized faster-RCNN in real-time facial expression classification. IOP Conf. Ser. Mater. Sci. Eng. 2020, 790, 012148.
  44. Cowen, A.S.; Keltner, D.; Schroff, F.; Jou, B.; Adam, H.; Prasad, G. Sixteen facial expressions occur in similar contexts worldwide. Nature 2021, 589, 251–257.
Figure 1. Mapping the emotions from facial features of cows and pigs. Facial feature classes and subclasses indicating potential affective states of cows and pigs.
Figure 2. Sample of images from the data set. Facial features of pigs and cows expressing varying emotions.
Figure 3. Illustration of the proposed WUR Wolf Facial Emotion Recognition model based on Deep Convolutional Neural Networks. The process pipeline includes feature selection and extraction, bounding box prediction, and the output of recognition of affective states.
Figure 4. Example emotion detection results from a neutral emotional state pig, and an excited emotional state cow as determined by the WUR Wolf Facial Coding Platform.
Figure 5. Training and validation process of modified YOLOv3. (a) The curve of Accuracy. (b) The curve of Loss.
Figure 6. Training and validation process of modified YOLOv4. (a) The curve of Accuracy. (b) The curve of Loss.
Figure 7. Training and validation process of modified Faster RCNN. (a) The curve of Accuracy. (b) The curve of Loss.
Table 1. Sensing parameters that were used for each of the nine classes related to recognizing emotions of cows and pigs [2].
| Species Type | Indicators Inferring Emotions | Emotions/Affective States | References |
|---|---|---|---|
| Cow | Upright ear posture longer | Excited state (positive emotion) | [31] |
| Cow | Forward facing ear posture | Frustration (negative emotion) | [31] |
| Cow | Half-closed eyes and ears backwards or hung-down | Relaxed state (positive emotion) | [37] |
| Cow | Eye white clearly visible and ears directed forward | Excited state (positive emotion) | [37] |
| Cow | Visible eye white | Stress (negative emotion) | [38] |
| Pigs | Ears forward | Alert | [32,33] |
| Pigs | Ears backward | Negative emotion | [32,33] |
| Pigs | Hanging ears flipping in the direction of eyes | Neutral emotion | [32,33] |
| Pigs | Standing upright ears | Normal (neutral state) | [32,33] |
| Pigs | Ears forward oriented | Aggression (negative emotion) | [39] |
| Pigs | Ears backward and less open eyes | Retreat from aggression or transition to neutral state | [39] |
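
The indicator-to-state mapping in Table 1 can be read as a simple lookup structure. The following is a minimal sketch of that reading, not the platform's actual implementation: the dictionary layout, the name `TABLE_1`, and the helper `lookup_affective_state` are hypothetical, while the entries themselves are transcribed from the table.

```python
from typing import Dict, Optional, Tuple

# Keys are (species, indicator) in lower case; values are (affective state, references).
# Entries are transcribed from Table 1; the dictionary layout itself is an assumption.
TABLE_1: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("cow", "upright ear posture longer"): ("Excited state (positive emotion)", "[31]"),
    ("cow", "forward facing ear posture"): ("Frustration (negative emotion)", "[31]"),
    ("cow", "half-closed eyes and ears backwards or hung-down"): ("Relaxed state (positive emotion)", "[37]"),
    ("cow", "eye white clearly visible and ears directed forward"): ("Excited state (positive emotion)", "[37]"),
    ("cow", "visible eye white"): ("Stress (negative emotion)", "[38]"),
    ("pigs", "ears forward"): ("Alert", "[32,33]"),
    ("pigs", "ears backward"): ("Negative emotion", "[32,33]"),
    ("pigs", "hanging ears flipping in the direction of eyes"): ("Neutral emotion", "[32,33]"),
    ("pigs", "standing upright ears"): ("Normal (neutral state)", "[32,33]"),
    ("pigs", "ears forward oriented"): ("Aggression (negative emotion)", "[39]"),
    ("pigs", "ears backward and less open eyes"): ("Retreat from aggression or transition to neutral state", "[39]"),
}


def lookup_affective_state(species: str, indicator: str) -> Optional[Tuple[str, str]]:
    """Return (affective state, references) for an indicator, or None if it is not in Table 1."""
    key = (species.strip().lower(), indicator.strip().lower())
    return TABLE_1.get(key)
```

For example, `lookup_affective_state("Cow", "Visible eye white")` returns the stress entry cited to [38]. Indicators that are absent from the table, such as the mixed ear position discussed above, return `None` rather than a guessed state.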
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Neethirajan, S. Happy Cow or Thinking Pig? WUR Wolf—Facial Coding Platform for Measuring Emotions in Farm Animals. AI 2021, 2, 342-354.
