High-Accuracy Chicken Breed Identification Using Microsatellite Genotype Data and AutoGluon Framework

Rajaonarison Faniriharisoa Maxime Toky; Sutthisak Sukhamsri; Sadeep Medhasi; Trifan Budi; Thitipong Panthum; Worapong Singchat; Kornsorn Srikulnath

doi:10.3390/biology15010021

,

and

¹

Animal Genomics and Bioresource Research Unit (AGB Research Unit), Faculty of Science, Kasetsart University, Bangkok 10900, Thailand

²

Department of Information Technology, Faculty of Science and Agricultural Technology, Rajamangala University of Technology Lanna Tak, Tak 63000, Thailand

³

Biodiversity Center, Kasetsart University (BDCKU), Bangkok 10900, Thailand

^*

Authors to whom correspondence should be addressed.

Biology2026, 15(1), 21;https://doi.org/10.3390/biology15010021

This article belongs to the Section Bioinformatics

Version Notes

Order Reprints

Simple Summary

Identifying chicken breeds correctly is important for conserving local breeds and improving breeding programs. However, many breeds look very similar, making visual identification difficult and sometimes inaccurate. In this study, genetic information from several chicken populations was used to train a machine learning model to recognize breed patterns. The model was tested and showed high accuracy in identifying most breeds. This demonstrates that computer-based methods can offer a practical and reliable tool for farmers, breeders, and conservation groups. As more genetic data becomes available, this approach is expected to become even more accurate and useful for protecting and managing valuable chicken breeds.

Abstract

The practical applications of breed identification are numerous and diverse, and they include breed conservation and breeding program design. However, distinguishing between breeds remains challenging and costly, especially for phenotypically similar chicken populations. Continued research is necessary to develop more accessible and optimized methodologies. To address these challenges, machine learning (ML) offers promising tools for analyzing complex genetic data. The capabilities of machine learning, especially the random forest (RF) model, to enhance various fields, including bioinformatics, have recently been demonstrated. In this study, microsatellite genotype data from 651 individuals across 30 chicken populations filtered from a larger initial dataset for consistency were used to classify breeds using an RF model. Cross-validation techniques, including 10-fold cross-validation and leave-one-out cross-validation, were employed to assess the performance of the model. The model performance was evaluated using metrics such as accuracy, Cohen’s Kappa, 95% confidence interval, and F1-score. Results showed that the RF model achieved a 95.38% accuracy on the testing dataset. Accuracies of 91.44% and 90.99% were observed for 10-fold cross-validation and leave-one-out cross-validation, respectively. It is believed that larger datasets will significantly improve outcomes for other breeds. Because of its generalizability, the trained model can serve as a straightforward and modern method for chicken breed determination using machine learning. This study demonstrates that ML, particularly automated approaches like AutoGluon, provides a robust and accessible framework for chicken breed identification using cost-effective microsatellite data.

Keywords:

breed determination; chicken breeds; machine learning; microsatellite; random forest

High-Accuracy Chicken Breed Identification Using Microsatellite Genotype Data and AutoGluon Framework

Simple Summary

Abstract

Article Metrics

Citations

Article Access Statistics