This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

Overcoming Data Scarcity: Few-Shot Pig Vocalization Recognition via Domain Expansion, Knowledge Transfer, and Feature Alignment

1 College of Electronic and Information Engineering, Huaibei Institute of Technology, Huaibei 235000, China
2 College of Command and Control Engineering, Army Engineering University of the People’s Liberation Army, Nanjing 210007, China
3 School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei 230036, China
* Author to whom correspondence should be addressed.
Animals 2026, 16(13), 2074; https://doi.org/10.3390/ani16132074 (registering DOI)
Submission received: 2 June 2026 / Revised: 29 June 2026 / Accepted: 30 June 2026 / Published: 5 July 2026
(This article belongs to the Section Pigs)

Simple Summary

Pig vocal sounds can help monitor animal state without disturbing the animals, but labelled recordings for specific behaviours or physiological states are often scarce. This study focused on few-shot recognition of five pig vocalization categories: eat, estrous, farrowing (fap), howl, and oink. We proposed PSA-AP, a pig-sound adaptation pipeline that converts audio into log-Mel spectrograms and combines spectrogram augmentation, knowledge transfer, and feature alignment. Using the last training checkpoint as the final evaluation model, PSA-AP achieved the best mean Accuracy, Macro-F1, and UAR across all six K-shot settings from 5-shot to 30-shot. At 30-shot, PSA-AP reached 90.60% Accuracy, 90.49% Macro-F1, and 90.60% UAR. These findings suggest that the proposed task-adapted supervised model is feasible and effective for limited-label pig-sound recognition under the current protocol.
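The three reported metrics can be illustrated with a minimal sketch. It assumes the standard definitions: Accuracy as the fraction of correct predictions, Macro-F1 as the unweighted mean of per-class F1 scores, and UAR (unweighted average recall) as the unweighted mean of per-class recalls. The function name `classification_metrics` and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return (accuracy, macro_f1, uar) as fractions in [0, 1].

    Assumption: classes are taken from y_true, so every class has at
    least one true sample and per-class recall is always defined.
    """
    labels = sorted(set(y_true))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fn[t] += 1  # a sample of class t was missed
            fp[p] += 1  # class p was predicted wrongly

    recalls, f1s = [], []
    for c in labels:
        recalls.append(tp[c] / (tp[c] + fn[c]))
        denom = 2 * tp[c] + fp[c] + fn[c]
        f1s.append(2 * tp[c] / denom if denom else 0.0)

    accuracy = sum(tp.values()) / len(y_true)
    macro_f1 = sum(f1s) / len(labels)
    uar = sum(recalls) / len(labels)
    return accuracy, macro_f1, uar


# Toy data (hypothetical): two test samples per class, one howl misread as oink.
classes = ["eat", "estrous", "fap", "howl", "oink"]
y_true = [c for c in classes for _ in range(2)]
y_pred = list(y_true)
y_pred[y_true.index("howl")] = "oink"

accuracy, macro_f1, uar = classification_metrics(y_true, y_pred)
print(f"Accuracy={accuracy:.4f}  Macro-F1={macro_f1:.4f}  UAR={uar:.4f}")
```

In this toy case the test set is class-balanced, so Accuracy and UAR coincide while Macro-F1 differs slightly. The same pattern appears in the 30-shot figures above, where Accuracy and UAR are both 90.60%.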

Abstract

Pig vocalization recognition can support non-invasive monitoring in precision livestock farming, but labelled pig-sound recordings are often limited for specific behaviours or physiological states. Under few-shot conditions, deep models may overfit, whereas traditional acoustic features may not fully describe class-specific time-frequency patterns. This study proposed PSA-AP, a pig-sound adaptation pipeline that uses log-Mel spectrograms and integrates SpecAugment-based domain expansion, ImageNet-pretrained ResNet18 knowledge transfer, and ArcFace-based feature alignment. The method was designed to reduce dependence on limited labelled samples, improve task-adapted representation learning, and enhance inter-class separability in the embedding space. Experiments were conducted on a five-class few-shot pig vocalization classification task, including eat, estrous, farrowing (fap), howl, and oink sounds collected from 10 adult Landrace pigs. Using K={5,10,15,20,25,30} labelled wav files per class and five random seeds, each selected training wav file and each held-out test wav file was converted into one 1.0 s log-Mel spectrogram for model training or evaluation. Final evaluation was based on the last checkpoint of each training run. PSA-AP achieved the best mean Accuracy, Macro-F1, and UAR at every K-shot setting. At K=30, PSA-AP reached 90.60% Accuracy, 90.49% Macro-F1, and 90.60% UAR, exceeding Raw by 7.80, 7.82, and 7.80 percentage points, respectively. These results indicate that the proposed integration of domain expansion, knowledge transfer, and feature alignment provides a feasible supervised adaptation strategy for few-shot pig vocalization recognition within the current protocol.
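The idea behind ArcFace-based feature alignment can be sketched as follows. This is a simplified, pure-Python illustration of the additive angular margin, not the authors' implementation: the scale `s`, the margin `m`, and the function names are assumptions, and the abstract does not state the hyperparameters used in PSA-AP. Practical implementations also handle the case where the target angle plus the margin exceeds π, which this sketch omits.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def arcface_logits(embedding, class_weights, target, s=30.0, m=0.5):
    """Return scaled cosine logits with an angular margin on the target class.

    embedding: feature vector produced by the backbone
    class_weights: one weight vector per class
    target: index of the ground-truth class
    s, m: illustrative scale and margin (assumed values)
    """
    e = l2_normalize(embedding)
    logits = []
    for j, w in enumerate(class_weights):
        cos = sum(a * b for a, b in zip(e, l2_normalize(w)))
        cos = max(-1.0, min(1.0, cos))  # guard against rounding error
        if j == target:
            # Penalize the true class by widening its angle by m,
            # so the embedding must sit closer to its class centre.
            cos = math.cos(math.acos(cos) + m)
        logits.append(s * cos)
    return logits


# Hypothetical 2-D example: the embedding lies exactly on class 0's weight.
logits = arcface_logits([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], target=0)
print(logits)
```

The logits would then be passed to a softmax cross-entropy loss. Because the target logit is reduced from `s` to `s * cos(m)` even for a perfectly aligned embedding, training is pushed towards tighter class clusters and larger angular gaps between classes, which is the inter-class separability the abstract refers to.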
Keywords: pig vocalization; few-shot learning; bioacoustics; spectrogram classification; SpecAugment; ArcFace; self-supervised audio representation

Share and Cite

MDPI and ACS Style

Li, G.; Liu, W. Overcoming Data Scarcity: Few-Shot Pig Vocalization Recognition via Domain Expansion, Knowledge Transfer, and Feature Alignment. Animals 2026, 16, 2074. https://doi.org/10.3390/ani16132074

AMA Style

Li G, Liu W. Overcoming Data Scarcity: Few-Shot Pig Vocalization Recognition via Domain Expansion, Knowledge Transfer, and Feature Alignment. Animals. 2026; 16(13):2074. https://doi.org/10.3390/ani16132074

Chicago/Turabian Style

Li, Guangbo, and Wenxiu Liu. 2026. "Overcoming Data Scarcity: Few-Shot Pig Vocalization Recognition via Domain Expansion, Knowledge Transfer, and Feature Alignment" Animals 16, no. 13: 2074. https://doi.org/10.3390/ani16132074

APA Style

Li, G., & Liu, W. (2026). Overcoming Data Scarcity: Few-Shot Pig Vocalization Recognition via Domain Expansion, Knowledge Transfer, and Feature Alignment. Animals, 16(13), 2074. https://doi.org/10.3390/ani16132074
