Simple Summary
Identifying chicken breeds correctly is important for conserving local breeds and improving breeding programs. However, many breeds look very similar, making visual identification difficult and sometimes inaccurate. In this study, genetic information from several chicken populations was used to train a machine learning model to recognize breed patterns. The model was tested and showed high accuracy in identifying most breeds. This demonstrates that computer-based methods can offer a practical and reliable tool for farmers, breeders, and conservation groups. As more genetic data becomes available, this approach is expected to become even more accurate and useful for protecting and managing valuable chicken breeds.
Abstract
Breed identification has many practical applications, including breed conservation and the design of breeding programs. However, distinguishing between breeds remains challenging and costly, especially for phenotypically similar chicken populations, and more accessible and optimized methods are needed. Machine learning (ML) offers promising tools for analyzing complex genetic data, and the random forest (RF) model in particular has recently proven useful in several fields, including bioinformatics. In this study, microsatellite genotype data from 651 individuals across 30 chicken populations, filtered from a larger initial dataset for consistency, were used to classify breeds with an RF model. Model performance was assessed with 10-fold cross-validation and leave-one-out cross-validation and evaluated using accuracy, Cohen’s Kappa, 95% confidence interval, and F1-score. The RF model achieved 95.38% accuracy on the testing dataset, and accuracies of 91.44% and 90.99% under 10-fold and leave-one-out cross-validation, respectively. Larger datasets are expected to improve outcomes for other breeds. Because it generalizes well, the trained model can serve as a straightforward, modern method for chicken breed determination. This study demonstrates that ML, particularly automated approaches such as AutoGluon, provides a robust and accessible framework for chicken breed identification using cost-effective microsatellite data.
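The evaluation scheme described in the abstract (a held-out test set, 10-fold cross-validation, and leave-one-out cross-validation, each scored with accuracy, Cohen's Kappa, and F1) can be illustrated with a minimal sketch. It is not the study's pipeline: scikit-learn is an assumed library choice, the genotype matrix is synthetic, and the one-allele-size-per-locus encoding, the breed count, the 70/30 split, and the `summarize` helper are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score, f1_score
from sklearn.model_selection import (
    LeaveOneOut,
    StratifiedKFold,
    cross_val_predict,
    train_test_split,
)

rng = np.random.default_rng(0)

# Synthetic stand-in for microsatellite data (assumption): each breed has a
# typical allele size per locus, and individuals vary around it.
n_breeds, n_per_breed, n_loci = 3, 20, 8
breed_means = rng.integers(100, 200, size=(n_breeds, n_loci))
X = np.vstack([
    rng.normal(loc=breed_means[b], scale=4.0, size=(n_per_breed, n_loci)).round()
    for b in range(n_breeds)
])
y = np.repeat(np.arange(n_breeds), n_per_breed)


def summarize(y_true, y_pred):
    """Hypothetical helper: the three point metrics named in the abstract."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "kappa": cohen_kappa_score(y_true, y_pred),
        "macro_f1": f1_score(y_true, y_pred, average="macro"),
    }


def new_model():
    # Small forest so that leave-one-out stays fast in this toy example.
    return RandomForestClassifier(n_estimators=50, random_state=0)


# 1) Held-out test set (the 70/30 stratified split is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
holdout = summarize(y_test, new_model().fit(X_train, y_train).predict(X_test))

# 2) 10-fold cross-validation, stratified by breed.
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
cv10 = summarize(y, cross_val_predict(new_model(), X, y, cv=kfold))

# 3) Leave-one-out cross-validation: one model fit per individual.
loo = summarize(y, cross_val_predict(new_model(), X, y, cv=LeaveOneOut()))

for name, scores in [("hold-out", holdout), ("10-fold", cv10), ("LOOCV", loo)]:
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

Pooling the out-of-fold predictions with `cross_val_predict` lets the same metric functions score all three schemes, which keeps the numbers directly comparable. Scores from this toy data say nothing about the accuracies reported in the study.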