Proceeding Paper

AI Audio-Based Poultry Behavior Monitoring Using Vocal Sound Analysis †

Department of Electrical and Computer Engineering, University of Victoria, Victoria, BC V8P 5C2, Canada
Presented at the 3rd International Online Conference on Agriculture (IOCAG 2025), 22–24 October 2025; Available online: https://sciforum.net/event/IOCAG2025.
Biol. Life Sci. Forum 2025, 54(1), 19; https://doi.org/10.3390/blsf2025054019
Published: 9 February 2026
(This article belongs to the Proceedings of The 3rd International Online Conference on Agriculture)

Abstract

The aim is to develop a simple and efficient AI audio-based approach to recognize chickens’ key behaviors, such as eating, greeting, foraging, hunting, and tidbitting, to improve poultry farming. First, the proposed study performs cepstral and entropy analyses on the chickens’ vocalizations to extract new vocal features. Second, a simple deep unsupervised clustering method is proposed to recognize the behaviors of the chickens. Alterations in recognized behaviors can be indicators of lameness in chickens. Here, we used an open-access chicken language dataset consisting of 74 distinct chicken calls with their probable meanings based on careful observations. Promising results are obtained by the proposed scheme for chicken behavior monitoring, enabling poultry personnel to accurately determine the health and well-being of chickens.

1. Introduction

In recent years, much attention has been paid to exploring Artificial Intelligence (AI) for analyzing audio and vocal data, offering a wide range of capabilities in precision livestock farming, including poultry behavior monitoring. Animal behaviors provide significant insights into the mental and physical well-being of poultry, serving as an important indicator of their health and subjective states. With the world’s population projected to reach 9.5 billion by 2050, and the demand for animal products like eggs, meat, and milk expected to increase by 70% from 2005 levels, it becomes vital to develop automated, precise systems for monitoring poultry behaviors. Such automation is especially important for overcoming the constraints of manual behavioral observation, which is time-consuming.
Recently, some progress has been made in developing AI-based methods for poultry behavior monitoring using acoustical data. In a study [1], an AI-based approach is introduced that utilizes a single transformer-based neural network model to classify multiple stress conditions, including cold, heat, and wind stress, as well as the normal state. By employing a single model capable of handling multiple stress types and age variations, this approach simplifies the classification process and enhances the practicality of stress monitoring in broiler production, assisting better management practices in the future. In [2], the work exploits a CNN (Convolutional Neural Network) combined with MFCCs (Mel Frequency Cepstral Coefficients) to decode the vocalization patterns of laying hens under acute environmental stress. Further, the effect of age in modulating stress responses is investigated by comparing the vocal behavior of younger hens to that of older hens under similar experimental conditions. A high classification accuracy of 94% is achieved by the CNN model in differentiating stressor types, age categories, and exposure conditions solely from MFCC-derived acoustic signatures. As published in [3], an investigation is conducted into processing chicken audio recordings with Whisper (a human-oriented ASR (Automatic Speech Recognition) model), producing text-like outputs. The work in [3] reveals that Whisper has the potential to generate text outputs consistently correlated with variations in chicken vocal patterns across stress, noise, and health conditions. In [4], the study demonstrates the feasibility and effectiveness of employing acoustic sensing, together with statistical feature analysis and interpretable machine learning models, for non-invasive poultry welfare assessment.
The integration of MFCCs, spectral features, and temporal dynamics enables robust classification of welfare indicators, with ensemble methods and LSTM (Long Short-Term Memory) [5] models providing high predictive performance.

2. Data Used

We have used an open-access chicken language dataset [6] consisting of 74 distinct chicken calls with their probable meanings based on careful observations, where the sampling rate is 44.1 kHz.
The chicken behaviors considered here are displayed in Figure 1 and include tidbitting, foraging, and greeting. (All images are included here with the authors’ full permission).
For illustration, the chicken calls referring to the three types of behaviors, i.e., tidbitting, foraging, and greeting, are presented in Figure 2.
As we see in Figure 2, the signals have different waveforms in terms of their amplitude variations and have varying lengths.

3. Proposed Method

Figure 3 shows the overall block diagram of the proposed method. It consists of feature extraction and unsupervised deep clustering. The input chicken vocalizations are passed through the feature extraction module generating a feature matrix, which is then used for unsupervised deep clustering. The output of the clustering provides the clustered classes with prediction indices for different chicken behaviors.

3.1. Feature Extraction

Two types of features are extracted from the raw data to generate a feature matrix for the dataset, which consists of different types of chicken vocalizations (see Figure 4).

3.1.1. CPPS (Cepstral Peak Prominence Smoothed)

The CPPS features are calculated using the following steps [10]:
1.
Each input signal is Hamming-windowed and the Fast Fourier Transform (FFT) is then taken twice: the first time on the time-domain signal, the second time on the log power spectrum, obtaining the cepstrum.
2.
A regression line is obtained after quefrency smoothing, which is performed by averaging the cepstral magnitude across quefrency using a three-bin averaging window.
3.
Lastly, the level difference (in dB) between the peak in the cepstrum and the value of the regression line at the same quefrency represents the CPPS measure, where the peak search is limited to quefrencies corresponding to fundamental frequencies between 200 Hz and 1000 Hz.
For example, Figure 5 shows the CPPS features extracted for the three types of chicken vocalizations in Figure 2.
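The three steps above can be sketched in a few lines of NumPy. This is an illustrative approximation only; the exact window, smoothing, and regression details of [10] may differ:

```python
import numpy as np

def cpps(x, fs, f0_min=200.0, f0_max=1000.0):
    """Illustrative CPPS sketch (not the exact implementation of [10])."""
    # Step 1: Hamming window, FFT of the signal, log power spectrum,
    # then a second FFT of the log spectrum to obtain the cepstrum.
    xw = x * np.hamming(len(x))
    log_power = 10.0 * np.log10(np.abs(np.fft.fft(xw)) ** 2 + 1e-12)
    cep = np.abs(np.fft.fft(log_power)) / len(x)  # cepstral magnitude

    # Step 2: quefrency smoothing with a three-bin averaging window,
    # then a regression line fitted over the peak-search range.
    smooth = np.convolve(cep, np.ones(3) / 3.0, mode="same")
    q_lo, q_hi = int(fs / f0_max), int(fs / f0_min)  # quefrency bins for 200-1000 Hz
    q = np.arange(q_lo, q_hi + 1)
    seg = smooth[q_lo:q_hi + 1]
    slope, intercept = np.polyfit(q, seg, 1)

    # Step 3: CPPS = level difference between the cepstral peak and the
    # regression line at the same quefrency.
    k = np.argmax(seg)
    return seg[k] - (slope * q[k] + intercept)
```

A strongly periodic call (clear fundamental frequency) yields a much larger CPPS value than an aperiodic one, which is what makes it a useful vocal feature.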

3.1.2. Histograms of Sample Entropy

The histograms of sample entropy are calculated based on the following steps [11]:
1.
Signals are first orthogonally decomposed using a linear time-frequency transform, namely the Short-Time Fourier Transform (STFT).
2.
The sample entropy is calculated for each output of the STFT.
3.
The histogram of each sample entropy is calculated.
4.
Finally, the distortions of the histograms (histogram distortion) are calculated as the difference between each histogram and the mean histogram, and are used as features.
Figure 6 illustrates the extraction of the histograms of sample entropy features for the three types of chicken vocalizations in Figure 2.
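The four steps above can be illustrated with a short NumPy sketch. The frame length, tolerance r, and histogram bin settings here are arbitrary illustrative choices, not those of [11], and the cross-signal part of step 4 is factored into a helper:

```python
import numpy as np

def sample_entropy(u, m=2, r=0.2):
    # Classic sample entropy: -log(A/B), where B and A count template
    # matches of length m and m+1 within tolerance r * std (Chebyshev distance).
    u = np.asarray(u, float)
    tol = r * np.std(u)
    n = len(u)
    def matches(mm):
        tpl = np.array([u[i:i + mm] for i in range(n - mm)])
        c = 0
        for i in range(len(tpl)):
            d = np.max(np.abs(tpl - tpl[i]), axis=1)
            c += np.sum(d <= tol) - 1  # exclude the self-match
        return c
    B, A = matches(m), matches(m + 1)
    return -np.log(A / B) if A > 0 and B > 0 else np.inf

def entropy_histogram(x, nfft=256, hop=128, bins=10):
    # Steps 1-3: STFT (framed FFT magnitudes), sample entropy of each
    # STFT output (subband magnitude sequence), then a histogram.
    frames = np.array([x[i:i + nfft] * np.hanning(nfft)
                       for i in range(0, len(x) - nfft, hop)])
    S = np.abs(np.fft.rfft(frames, axis=1)).T          # freq x time
    ent = np.array([sample_entropy(row) for row in S])
    hist, _ = np.histogram(ent[np.isfinite(ent)], bins=bins, range=(0.0, 3.0))
    return hist

def histogram_distortions(hists):
    # Step 4: distortion of each signal's histogram from the mean histogram.
    H = np.asarray(hists, float)
    return H - H.mean(axis=0)
```

By construction, the distortions sum to zero across signals, so they capture how each vocalization's entropy distribution deviates from the average.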

3.2. Unsupervised Deep Clustering

The unsupervised deep clustering consists of an encoder and decoder (AE (AutoEncoder)-based method), as depicted in Figure 7.
The encoder consists of two LSTM layers with 64 and 128 hidden units, respectively, each followed by batch normalization and dropout for regularization. The final LSTM outputs a compressed latent representation with 128 dimensions.
The decoder reconstructs the input to the encoder from the generated latent space, mirroring the encoder structure.
The LSTM decoder first expands the latent embedding using a RepeatVector, followed by stacked LSTM layers that progressively refine the temporal reconstruction. A dense layer is used to restore the original feature dimensions.
After the self-supervised reconstruction training is finished, we extract the encoder part as the feature projector, which transforms the input sequences into the learned latent space representation for the final clustering.
By leveraging the LSTM’s sequential dependencies, the learned representations facilitate effective unsupervised clustering.
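As a highly simplified, self-contained stand-in for this pipeline, the sketch below replaces the LSTM autoencoder with a linear autoencoder trained by gradient descent, and the final clustering with plain k-means; it illustrates only the encode-then-cluster idea, not the actual stacked-LSTM architecture with batch normalization, dropout, and a RepeatVector-based decoder:

```python
import numpy as np

def train_linear_ae(X, latent_dim=8, epochs=200, lr=1e-2, seed=0):
    # Linear autoencoder trained on reconstruction error; a toy
    # stand-in for the LSTM encoder-decoder described above.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(0.0, 0.1, (d, latent_dim))
    W_dec = rng.normal(0.0, 0.1, (latent_dim, d))
    for _ in range(epochs):
        Z = X @ W_enc            # encode
        err = Z @ W_dec - X      # reconstruction error
        W_dec -= lr * (Z.T @ err) / n
        W_enc -= lr * (X.T @ (err @ W_dec.T)) / n
    return W_enc                 # keep only the encoder as feature projector

def kmeans(Z, k=3, iters=50):
    # Plain k-means with deterministic farthest-point initialization.
    centers = [Z[0]]
    for _ in range(k - 1):
        d = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((Z[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels
```

In the actual model, the trained LSTM stack would play the role of the encoder and its 128-dimensional latent embedding the role of `X @ W_enc`.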

3.2.1. Hyperparameters

We train the deep model for 100 epochs with a batch size of 32. We adopt the Adam optimizer with a learning rate of 0.001.

3.2.2. Evaluation Metrics

Since the ground truth is available, we measure the accuracy (ACC) as the percentage of correctly clustered samples with respect to the true labels, given by the following:
$$\mathrm{ACC} = \frac{TP + TN}{(TP + FP) + (TN + FN)}$$
where TP, TN, FP, FN represent true positive, true negative, false positive, false negative, respectively.
We also calculate the Normalized Mutual Information (NMI), which quantifies shared information between clustering results and true labels, normalized to ensure values between 0 and 1. It is given by the following:
$$\mathrm{NMI} = \frac{2\,I(Y;\hat{Y})}{H(Y) + H(\hat{Y})}$$
where $I(Y;\hat{Y})$ is the mutual information between the ground-truth labels $Y$ and the predicted clusters $\hat{Y}$, and $H(\cdot)$ denotes entropy.
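Both metrics are straightforward to compute from the two label sequences. The small Python implementation below uses natural logarithms (the base cancels in the normalization) and assumes the cluster indices have already been aligned with the class labels:

```python
import numpy as np
from collections import Counter
from math import log

def accuracy(y_true, y_pred):
    # Percentage of correctly clustered samples, assuming cluster
    # indices are already aligned with the true labels.
    return 100.0 * np.mean(np.asarray(y_true) == np.asarray(y_pred))

def nmi(y_true, y_pred):
    # NMI = 2 I(Y; Y_hat) / (H(Y) + H(Y_hat)), normalized into [0, 1].
    n = len(y_true)
    def H(labels):
        return -sum((c / n) * log(c / n) for c in Counter(labels).values())
    py, pyh = Counter(y_true), Counter(y_pred)
    I = sum((c / n) * log(c * n / (py[a] * pyh[b]))
            for (a, b), c in Counter(zip(y_true, y_pred)).items())
    return 2.0 * I / (H(y_true) + H(y_pred))
```

For a perfect clustering both metrics reach their maxima (100% and 1.0); any disagreement between predicted and true labels reduces both.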

4. Results and Discussion

In Figure 8, the clustering result is visualized as a 2D scatter plot. The three clusters are shown to be well separated into the three types of chicken behaviors, i.e., greeting, foraging, and tidbitting, labelled as ‘1’, ‘2’, and ‘3’, respectively.
The sequences of predicted labels and the ground truth (GT) labels are shown in Table 1.
Figure 9a,b display the confusion matrix and the performance of the method in terms of classification accuracy and NMI, which are found to be as high as 94.23% and 86.16%, respectively.

5. Conclusions

An AI audio-based poultry behavior monitoring method is proposed using vocal sound analysis. The proposed scheme is based on feature sequence generation followed by LSTM-AE-based clustering. Preliminary results are presented using a small dataset for monitoring three types of chicken behaviors, namely greeting, foraging, and tidbitting, from their vocalizations. Promising results are obtained by the proposed scheme, enabling poultry personnel to accurately determine the health and well-being of the chickens. This acoustical poultry behavior monitoring method, combined with a vision-based method, can be a very powerful tool for veterinarians in distinguishing sick chickens from healthy ones for precision animal farming.
Upon the availability of sufficiently large data samples, other chicken behaviors, such as eating and hunger calls, will be considered. Regarding the unsupervised AE-based model, a parallel autoencoder architecture consisting of an LSTM-AE and a CNN-AE will be explored further, leveraging the two distinct low-dimensional feature vectors extracted from the LSTM and convolutional autoencoders [12].

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available in Chicken Language Dataset at https://www.juheapi.com/datasets/chickenlanguagedataset (accessed on 29 May 2025), reference number [6]. These data were derived from the following resources available in the public domain: https://www.juheapi.com/datasets/chickenlanguagedataset (accessed on 29 May 2025).

Acknowledgments

The author would like to thank the anonymous reviewer and the editor for their critical comments and suggestions that helped to improve this paper.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Lev-ron, T.; Yitzhaky, Y.; Halachmi, I.; Druyan, S. Classifying Vocal Responses of Broilers to Environmental Stressors Via Artificial Neural Network. Animal 2025, 19, 101378. [Google Scholar] [CrossRef] [PubMed]
  2. Neethirajan, S. Decoding Vocal Indicators of Stress in Laying Hens: A CNN-MFCC Deep Learning Framework. Smart Agric. Technol. 2025, 11, 101056. [Google Scholar] [CrossRef]
  3. Neethirajan, S. Adapting a Large-Scale Transformer Model to Decode Chicken Vocalizations: A Non-Invasive AI Approach to Poultry Welfare. AI 2025, 6, 65. [Google Scholar] [CrossRef]
  4. Manikandan, V.; Neethirajan, S. Decoding Poultry Welfare from Sound—A Machine Learning Framework for Non-Invasive Acoustic Monitoring. Sensors 2025, 25, 2912. [Google Scholar] [CrossRef] [PubMed]
  5. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
  6. Chicken Language Dataset. Available online: https://www.juheapi.com/datasets/chickenlanguagedataset (accessed on 29 May 2025).
  7. Steele, L. Fresh Eggs Daily. Available online: https://www.fresheggsdaily.blog/2017/07/tidbitting-what-it-is-and-why-chickens.html (accessed on 6 January 2026).
  8. Two Chickens Foraging in a Grassy Yard. Photo–Free Outdoor Image on Unsplash. Available online: https://unsplash.com/photos/two-chickens-foraging-in-a-grassy-yard-Lct9U7wX89o (accessed on 6 January 2026).
  9. Two Black and White Chickens Photo – Free Grey Image on Unsplash. Available online: https://unsplash.com/photos/two-black-and-white-chickens-HnePInoaEe8 (accessed on 6 January 2026).
  10. Selamtzis, A.; Castellana, A.; Salvi, G.; Carullo, A.; Astolfi, A. Effect of Vowel Context in Cepstral and Entropy Analysis of Pathological Voices. Biomed. Signal Process. Control 2019, 47, 350–357. [Google Scholar] [CrossRef]
  11. Jin, F.; Sattar, F.; Goh, D.Y.T. Automatic Wheeze Detection Using Histograms of Sample Entropy. In Proceedings of the 2008 30th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Vancouver, BC, Canada, 20–25 August 2008. [Google Scholar] [CrossRef]
  12. Yang, W.; Sui, Y.; Zhang, Y.; Xia, S. Unsupervised Deep Clustering for Human Behavior Understanding. In Proceedings of the 3rd International Workshop on Human-Centered Sensing, Modeling, and Intelligent Systems, Irvine, CA, USA, 6–9 May 2025. [Google Scholar]
Figure 1. The chicken behaviors: (a) tidbitting (left) [7]; (b) foraging (middle) [8]; (c) greeting (right) [9]. (a) is reprinted with permission from Ref. [7]. Copyright 2026 Fresh Eggs Daily.
Figure 2. The samples of chicken vocalizations used.
Figure 3. The overall block diagram of the proposed method.
Figure 4. The process of feature matrix generation for three types of input signals as represented by three colors.
Figure 5. The CPPS features extracted for the three types of chicken vocalizations in Figure 2.
Figure 6. The histograms of sample entropy features extracted for the three types of chicken vocalizations in Figure 2.
Figure 7. The unsupervised deep clustering method with output of five clusters shown in different colors.
Figure 8. The clustering results shown as a 2D scatter plot.
Figure 9. The classification results: (a) confusion matrix; (b) performance indices.
Table 1. The cluster index and the corresponding ground truth (GT).
Cluster Index: 3 3 3 2 3 2 3 2 3 3 3 3 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
GT: 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
