Integrating Bioinformatics and Machine Learning for Genomic Prediction in Chickens

Genomic prediction plays an increasingly important role in modern animal breeding, with predictive accuracy being a crucial aspect. The classical linear mixed model is gradually becoming unable to accommodate the growing number of target traits and the increasingly intricate genetic regulatory patterns, so novel approaches are needed for future genomic prediction. In this study, we used an Illumina 50K SNP chip to genotype 4190 egg-type female Rhode Island Red chickens. Machine learning (ML) and classical bioinformatics methods were integrated to fit genotypes to 10 economic traits in chickens. We evaluated the effectiveness of the ML methods using Pearson correlation coefficients and the root-mean-square error (RMSE) between predicted and actual phenotypic values and compared them with rrBLUP and BayesA. Our results indicated that ML algorithms performed significantly better than rrBLUP and BayesA in predicting body weight and eggshell strength traits. Conversely, rrBLUP and BayesA showed 2–58% higher predictive accuracy in predicting egg numbers. Additionally, incorporating suggestively significant SNPs obtained through a GWAS into the ML models increased predictive accuracy by 0.1–27% across nearly all traits. These findings suggest the potential of combining classical bioinformatics methods with ML techniques to improve genomic prediction in the future.


Introduction
Chickens (Gallus gallus) are a crucial source of food for humans and are currently the most economically significant animals in the poultry industry. Breeders aim to enhance the production of chicken eggs and meat to meet the increasing demand for high-quality and cost-effective protein. Over the past two decades, scientists have progressively integrated genomic data into the best linear unbiased prediction (BLUP) model for genomic selection, as well as refined and optimized BLUP models for diverse application scenarios [1][2][3][4]. The intricacy of animal genomes presents challenges in handling high-throughput genomic data (e.g., single nucleotide polymorphisms, SNPs), in contrast to the relative simplicity and controllability of plant genomes [5]. Classical linear mixed models struggle to accommodate the complex relationships present in large, noisy, and redundant animal genomic data while also facing a lack of adequate observational data and predictors [6,7]. Additionally, focusing solely on additive effects does not fulfill the requirements for accurate phenotype prediction. Hence, there is a continual need to explore more suitable and precise genomic selection and prediction algorithms.
Many researchers have recently proposed genomic selection and prediction models based on machine learning (ML) algorithms that are suitable for various breeding scenarios [8,9]. In studies of pigs, cattle, and broilers, researchers performed genomic predictions for different economic traits using various ML methods [10][11][12][13]. However, a common issue is that the feature engineering and parameter selection process almost completely ignores classical bioinformatics methods, resulting in poor interpretability of these models, particularly those based on neural networks [14,15]. Additionally, animal genomic data often comprise over 10K SNPs, leading to excessive parameter fitting in both ML and deep learning models and causing dimension explosion and vanishing gradients [16]. Therefore, prior to utilizing ML algorithms for genomic selection, classical bioinformatics methods can be employed to screen SNPs, thereby enhancing the reliability and interpretability of ML models.
The primary objective of this study was to investigate the suitability and reliability of different ML methods for developing genomic prediction models for various economic traits in chickens. Additionally, we aimed to assess the impact of SNP sets selected using distinct bioinformatics methods on the ultimate fitting outcomes. This study is anticipated to generate novel insights and perspectives for genomic prediction in animal breeding, thus assuming a pivotal role in chicken breeding.

Data Collection
The sample dataset was obtained from the Rhode Island Red pure-line population of Beijing Huadu Yukou Poultry Industry Co., Ltd. (Beijing, China), which provided the phenotype statistics and pedigree-based heritability. This line has undergone more than 10 generations of high-intensity artificial selection for egg production traits. We selected 4190 female chickens with phenotype records of 10 traits, including body weight (BW), egg number (EN), egg weight (EW), eggshell color (ESCA, ESCB, ESCI, and ESCL), eggshell strength (ESS), albumen height (AH), and egg glossiness (SINS) at different ages. Blood samples of all chickens were collected from the wing and placed in an anticoagulant tube containing EDTA. The integrity of the DNA was verified, and the genotypes were captured using an Illumina 50K SNP chip (Phoenix No. 1) [17]. The chip was designed based on the Gallus_gallus-5.0 reference genome, and all loci were converted to Gallus_gallus-6.0 coordinates with Liftover (https://genome.ucsc.edu/cgi-bin/hgLiftOver, accessed on 15 February 2024) [18]. All animal experiments were approved by the Animal Care and Use Committee of China Agricultural University (permit number SYXK 2007-0023) and were performed strictly according to the requirements of the Animal Ethics Procedures and Guidelines of the People's Republic of China.

Quality Control and Imputation
We used PLINK v1.90 [19] to perform quality control on the genotype data, filtering on minor allele frequency (MAF), SNP call rate, and individual call rate. The criteria for all filters were set to 0.05 (--maf 0.05 --geno 0.05 --mind 0.05), and we ultimately obtained 36,985 high-quality SNPs for subsequent imputation with BEAGLE v5.2 [20].
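As an illustration of what these three filters do, the following sketch applies the same MAF, SNP call-rate, and individual call-rate criteria to a toy 0/1/2 genotype matrix with `None` marking missing calls. The function `qc_filter` is a hypothetical helper for illustration only; the actual filtering in the study was performed with PLINK.

```python
# Sketch of PLINK-style quality control (--maf 0.05 --geno 0.05 --mind 0.05)
# on a toy genotype matrix: rows are individuals, columns are SNPs,
# entries are allele counts 0/1/2 or None for a missing call.

def minor_allele_freq(calls):
    """Minor-allele frequency among non-missing 0/1/2 calls."""
    obs = [g for g in calls if g is not None]
    p = sum(obs) / (2 * len(obs))  # frequency of the counted allele
    return min(p, 1 - p)

def qc_filter(genos, maf=0.05, geno=0.05, mind=0.05):
    """Return (kept individual indices, kept SNP indices)."""
    n_ind, n_snp = len(genos), len(genos[0])
    # --mind: drop individuals with too high a missing-genotype rate
    inds = [i for i in range(n_ind)
            if sum(g is None for g in genos[i]) / n_snp <= mind]
    snps = []
    for j in range(n_snp):
        col = [genos[i][j] for i in inds]
        miss_rate = sum(g is None for g in col) / len(col)
        # --geno: SNP call-rate filter; --maf: minor-allele-frequency filter
        if miss_rate <= geno and minor_allele_freq(col) >= maf:
            snps.append(j)
    return inds, snps
```

For example, an individual missing 25% of its calls fails the --mind filter, and a monomorphic SNP (MAF = 0) fails the --maf filter.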

Genome-Wide Association Study (GWAS)
We conducted genome-wide association studies (GWASs) on all traits, incorporating all valid samples and SNPs, using a univariate linear mixed model (LMM) implemented in GCTA v0.98.6 software [21]. To address population stratification, we employed the genomic relationship matrix (GRM) and included the first five principal components (PCs) as covariates. The statistical model employed in this investigation is as follows:

y = Wα + xβ + µ + ε

Here, y represents the phenotype records of the 4190 individuals; W is a matrix of covariates (fixed effects: the top five principal components) accounting for population structure, with α being a vector of corresponding effects that form the intercept; x signifies the marker genotypes; β refers to the associated marker effects; µ constitutes a vector of random polygenic effects with a covariance structure; and ε denotes a vector of random residuals. The p value served as a benchmark to assess the significance of the association between SNPs and the traits. The threshold for genome-wide significance was defined using a modified Bonferroni correction applied with the R package simpleM [22]. A total of 1509 independent SNPs were estimated, and the thresholds for genome-wide significance and suggestive significance were 3.31 × 10−5 (0.05/1509) and 6.63 × 10−4 (1/1509), respectively.
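Both thresholds follow directly from the effective number of independent tests estimated by simpleM:

```python
# The modified Bonferroni thresholds follow from the effective number
# of independent SNPs (M_eff = 1509) estimated by simpleM.
M_eff = 1509
genome_wide = 0.05 / M_eff  # genome-wide significance threshold
suggestive = 1.0 / M_eff    # suggestive significance threshold

print(f"{genome_wide:.2e}")  # 3.31e-05
print(f"{suggestive:.2e}")   # 6.63e-04
```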

Feature Engineering
In the current study, we used each of three methods provided by PLINK for feature engineering. We utilized --pca n (n = 10, 30, 50, 100, 200, and 500) to achieve different degrees of parameter dimension reduction. We employed --maf x (x = 0.1, 0.2, 0.3, 0.4) to derive parameter sets with different minimum allele frequencies. We utilized --indep-pairwise 25 5 r2 (r2 = 0.1, 0.2, 0.3, 0.4, 0.5) to identify SNPs at various levels of linkage disequilibrium and obtain independent loci.
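The r2 statistic behind LD pruning is the squared correlation of allele states at two biallelic loci. A minimal sketch follows, assuming phased 0/1 haplotypes for clarity; PLINK itself estimates r2 from unphased genotypes, so `ld_r2` here is an illustrative simplification, not PLINK's implementation.

```python
# Simplified r^2 between two biallelic loci from phased 0/1 haplotypes:
# D = P(AB) - p_A * p_B, and r^2 = D^2 / (p_A(1-p_A) p_B(1-p_B)).
def ld_r2(hapA, hapB):
    n = len(hapA)
    pA, pB = sum(hapA) / n, sum(hapB) / n
    pAB = sum(a * b for a, b in zip(hapA, hapB)) / n
    D = pAB - pA * pB  # linkage disequilibrium coefficient
    return D * D / (pA * (1 - pA) * pB * (1 - pB))
```

Perfectly linked loci give r2 = 1 and would be pruned at any of the thresholds above, while independent loci give r2 = 0 and are retained.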
In addition to the three methods described above, we combined the genome-wide significant and suggestively significant loci for each trait obtained from the GWAS into a subset of SNPs for the corresponding trait, which was used to fit that trait.

Model Building
Following genotyping and feature engineering, we used an automatic ML software package, AutoGluon v1.0.0 [24], to construct multiple common ML algorithms, including Random Forest (RF), Extra Trees (ET), K-nearest neighbors (KNN), CatBoost (CAT), LightGBM (LGB), and neural networks (NNs), to fit different sets of SNPs and model the 10 economic traits of chickens, comparing their predictive accuracy and bias with those of the ridge regression BLUP (rrBLUP) [25] and BayesA methods [26]. All models reduced overfitting through 10-fold cross-validation.
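The 10-fold split partitions the birds so that each individual is held out exactly once and every phenotype receives an out-of-fold prediction. A minimal sketch of the fold construction follows; `kfold_indices` is an illustrative helper, not part of AutoGluon, which handles the splitting internally.

```python
# Sketch of a 10-fold cross-validation split: divide n samples into k
# nearly equal folds; each fold serves once as the validation set while
# the remaining folds form the training set.
def kfold_indices(n, k=10):
    """Yield (train_idx, valid_idx) pairs for k contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        valid = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, valid
        start += size
```

With the 4190 birds in this study, each of the 10 validation folds holds 419 individuals.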
The rrBLUP models for genomic prediction were developed using the R package "rrBLUP" (version 4.6.2). This model employs ridge regression to estimate marker effects and effectively addresses potential multicollinearity issues in the genetic correlation matrices. We first converted the genotype file from the bed format to the raw format using the --recodeA command in PLINK. The raw-format files were imported into R and divided into training and test sets. We then fitted the input files using the mixed.solve() function in rrBLUP. After obtaining the predicted values, the Pearson correlation coefficient and the root-mean-square error (RMSE) between the predicted values and the raw phenotypes were calculated using the cor() function and the rmse() function of the "Metrics" package, respectively.
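The two evaluation metrics used throughout the study can be written out explicitly. The sketch below is a plain-Python restatement for illustration; the analyses themselves used R's cor() and Metrics::rmse().

```python
# Evaluation metrics: Pearson correlation (predictive accuracy) and
# root-mean-square error (prediction bias) between predicted and
# observed phenotypes.
from math import sqrt

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def rmse(pred, obs):
    return sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred))
```

Note that the two metrics answer different questions: predictions can rank individuals well (high Pearson r) yet still be shifted or scaled away from the observed values (high RMSE), which is why both are reported.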
The BayesA models were developed using the R package "BGLR" (version 1.1.2). After importing the genotypes and phenotypes into R, we employed the BGLR() function to fit the input files with the parameters set as nIter = 50,000, burnIn = 5000, and thin = 10, which specified the total number of MCMC iterations, the number of iterations discarded at the beginning of the MCMC sampling process, and the sampling interval, respectively.
We used the open-source AutoML toolkit AutoGluon v1.0.0 to construct genomic prediction models based on ML algorithms. The ML models utilized in this study encompassed RF, ET, CAT, KNN, LGB, NN, and a stacking structure derived from these algorithms by linear fitting, named Weighted Ensemble (WE). All of these models have been shown in some animal or plant datasets to potentially improve the accuracy of genomic prediction [10,[27][28][29]. The toolkit's hyperparameter optimization function automatically fine-tuned model parameters to select the best-performing model via TabularPredictor().fit(presets='best_quality'). We chose the Pearson correlation coefficient and the RMSE as the evaluation metrics for the validation data by setting eval_metric = pearsonr or root_mean_squared_error, with the default hyperparameters for each ML model mentioned above (Table 1). The 10-fold cross-validation was carried out by setting num_bag_folds = 10.
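The linear-fitting step behind the WE stacking structure can be sketched as a second-level least-squares fit over first-level predictions. The following is a minimal illustration assuming just two base models and no intercept; AutoGluon's actual ensembling is considerably more elaborate.

```python
# Minimal stacking sketch: out-of-fold predictions from two first-level
# models become inputs to a second-level linear fit (normal equations
# for two weights, no intercept).
def stack_weights(pred1, pred2, y):
    """Solve the 2x2 normal equations for least-squares weights (w1, w2)."""
    a11 = sum(p * p for p in pred1)
    a22 = sum(q * q for q in pred2)
    a12 = sum(p * q for p, q in zip(pred1, pred2))
    b1 = sum(p * t for p, t in zip(pred1, y))
    b2 = sum(q * t for q, t in zip(pred2, y))
    det = a11 * a22 - a12 * a12
    w1 = (b1 * a22 - b2 * a12) / det
    w2 = (a11 * b2 - a12 * b1) / det
    return w1, w2

def stack_predict(pred1, pred2, w):
    """Blend the two base-model predictions with the fitted weights."""
    return [w[0] * p + w[1] * q for p, q in zip(pred1, pred2)]
```

Intuitively, a base model whose predictions track the phenotype closely receives a large weight, while an uninformative one is weighted toward zero.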

Phenotype Statistics and Genome-Wide Association Study (GWAS)
Phenotype data were collected from a total of 4190 hens for 10 traits at multiple age points, primarily including body weight (BW), egg number (EN), egg weight (EW), eggshell color (ESCA, ESCB, ESCI, and ESCL), eggshell strength (ESS), albumen height (AH), and egg glossiness (SINS). Descriptive statistical analysis revealed the number of reliable samples and the mean, variation, range, and pedigree-based heritability of each trait at different age points (Table 2). Our analysis of the phenotype records indicated that the mean BW of the population increased from 1.7 to 2.2 kg with hen aging, and EW increased from 43 to 62 g. Among the egg quality-related traits, AH and ESS showed an increasing and then decreasing trend throughout the laying period (Figure 1a). In addition, ESC (A, B, L, I) and SINS did not show a clear pattern of change. Furthermore, the coefficients of variation (CVs) for most traits increased with age. Notably, the CV values for ESC, SINS, AH, and ESS were the highest among the traits, indicating instability in ESC formation and egg quality in this population. In addition, AH was a low-heritability trait (0.157–0.258), and the rest of the traits had medium to high heritability at most time points. For the genetic analysis, we set the quality control criterion as an MAF greater than 0.05, and the MAF values of all filtered SNPs were evenly distributed between 0.1 and 0.5 (Figure 1b). Following quality control, a GWAS was conducted for each trait, and in total, we identified 953 significant and 3133 suggestively significant SNPs (Figure 1c, Table S1). Significant SNPs associated with BW were concentrated on GGA1, 2, 4, 5, 7, 9, 17, and 24; those associated with EN and EW were concentrated on GGA2, ESS on GGA8, and AH on GGA3, 12, 13, 18, and 24; and those associated with ESC and SINS were more widely distributed across the whole genome. These SNPs were mapped to 1457 genes (Table S2) and 26 reported QTLs corresponding to the traits (Table S1). Specifically, 190 genes were associated with BW, 61 with EN, 249 with EW, 421 with ESC, 111 with AH, 98 with ESS, and 78 with SINS. Functional enrichment analysis of the genes related to each trait revealed pathways that correlated well with the observed traits, highlighting a link between these loci and traits (Table 3). Most of the pathways were associated with the regulation of gene expression, intercellular interactions, signal transduction, metabolic regulation, and growth and development. Specifically, the genes related to BW were enriched in the regulation of transcription and expression (RNA polymerase II transcription regulatory region sequence-specific DNA binding), mineral uptake and metabolism (inorganic cation transmembrane transport), and the cellular uptake and metabolism of substances (regulation of receptor-mediated endocytosis). Genes related to EN were enriched in intracellular homeostasis (cellular response to salt) and histone modification (MOZ/MORF histone acetyltransferase complex). Genes related to EW were enriched in cell growth and differentiation (frizzled binding), the regulation of metabolic activity (response to oxygen-containing compounds), the post-translational modification of proteins (peptidyl-threonine modification), and glucose metabolism and lipid synthesis (cellular response to insulin stimulus). Genes related to ESS were enriched in calcium channel activity (voltage-gated calcium channel activity involved in cardiac muscle cell action potential) and signal transduction (membrane depolarization during cardiac muscle cell action potential and cell-cell signaling involved in cardiac conduction).

Machine Learning Methods for Chicken Genomic Prediction
We utilized AutoGluon to establish individual ML models for each trait using the genotype information from all 36,985 loci. Subsequently, we employed rrBLUP and BayesA as references to assess the Pearson correlation coefficients and the RMSE between the predicted values from each method and the raw phenotypes. Our preliminary results revealed that when using all loci, the predictive accuracy of all ML methods was notably low and the prediction biases were relatively high (Figures S1 and S2). Consequently, we conducted PCA to reduce the dimensionality of the genotype data and used the first 100 eigenvalues of the samples as parameters to fit each trait with the ML methods. In addition, the rrBLUP and BayesA methods achieved better predictive accuracy when using all loci than when using eigenvalues from PCA, with no significant difference in prediction biases. Therefore, in the latter sections of the current study, our focus was on the comparison between the rrBLUP and BayesA methods, which used all loci, and the ML methods, which used eigenvalues.
Overall, across all 10 traits and 44 phenotype records, the ML models outperformed rrBLUP and BayesA in predictive accuracy for 25 phenotypes, including ESS for all periods and BW, AH, ESC (A, B, I), EW, and SINS for partial periods (Figure 2). However, the ML models performed significantly worse than rrBLUP and BayesA for all periods of EN and ESCL. Specifically, with the exception of the neural network-based NN algorithm, the remaining ML algorithms showed 7–180% higher predictive accuracy for ESS compared to the other two methods. In contrast, the predictive accuracies of rrBLUP and BayesA for EN were 17–108% higher than those of the ML algorithms. Among the various ML algorithms, the WE model performed best in sixteen phenotypes and outperformed rrBLUP and BayesA in twenty phenotypes, while CAT and KNN performed best in nine and eight phenotypes, respectively (Table 4).

However, the results for prediction biases varied considerably. Across all phenotypes, the ML models outperformed rrBLUP and BayesA in the RMSE for 26 phenotypes, including BW, EN, ESCB, and ESCL for all periods and EW for partial periods, while they performed worse for AH and ESS. Among all ML algorithms, the WE model performed best in 17 phenotypes and outperformed rrBLUP and BayesA in 25 phenotypes. It is worth noting that the WE model performed best in both predictive accuracy and bias for early body weight (BW28) and mid-term egg weight (EW36, EW56, and EW72).
In addition, we investigated the association between sample size, heritability, and the predictive accuracy of the WE, rrBLUP, and BayesA models (Figure 3). Our findings revealed no distinct relationship between sample size and the predictive accuracy of any model. In contrast, there was a more pronounced linear relationship between predictive accuracy and heritability for the WE model than for rrBLUP and BayesA. This finding was consistent with our expectations, indicating that ML models could capture genetic information for various traits.

Comparison of Different Feature Engineering Methods
In order to investigate the impact of commonly used locus-screening methods on the predictive accuracy of the models, we set gradients for PCA, MAF, and LD pruning to obtain different subsets of SNPs and used these subsets to fit each trait. First, we varied the number of principal components (PCs = 10, 50, 100, 200, 300, 400, and 500) and found a consistent pattern in which the predictive accuracy of most trait models initially rose and then declined as the number of PCs increased, with each trait exhibiting a distinct peak at a different PC value (Figure S3). Furthermore, we also applied gradients for MAF (0.1, 0.2, 0.3, 0.4) and LD pruning (r2 = 0.1, 0.2, 0.3, 0.4, 0.5). In general, the models for most traits did not respond significantly to changes in MAF or LD pruning (Figure S4).
We also utilized the genotypes of the suggestively significant SNPs obtained from the GWAS to fit the traits. Interestingly, our analysis revealed that directly using the genotypes of these suggestively significant SNPs increased the predictive accuracy for almost all phenotypes compared to using the 100 eigenvalues from PCA (Figure 4). Specifically, the predictive accuracy of the ML methods improved for all but five phenotypes (AH56, AH80, EN48, EN56, and ESS72), with enhancements ranging from 0.69% to 277%. Twenty-nine phenotypes achieved more than 10% improvement, and eighteen achieved more than 20% improvement. In the prediction of pre-intermediate SINS (SINS36, SINS56), training the models with suggestively significant SNPs identified by the GWAS improved the predictive accuracy by 188% and 277%, respectively. In addition, twenty-one phenotypes yielded the same optimal ML model during training on the two parameter sets, with thirteen phenotypes achieving more than 10% improvement. With the exception of all-period EN, AH56, and ESCB80, the predictive accuracies of the ML methods surpassed those of rrBLUP and BayesA (2–466%), including for ESCL, which had not performed well before. Notably, the WE model sustained its superior performance and remained at the forefront among 23 phenotypes, particularly in predicting BW, EW, and ESS. On the other hand, there were similar decreases in prediction biases with the ML methods (Figure 5), excluding five phenotypes (EN48, EN56, ESS72, AH80, and ESCB80). However, these decreases were not significant and did not produce qualitative changes in the comparison with the rrBLUP and BayesA methods.


Discussion
Our research first characterized the performance of ten traits in the Rhode Island Red chicken population across multiple time points. As age increased, both BW and EW showed a gradual upward trend; AH and ESS, indicators of egg quality and shell quality, respectively, demonstrated patterns of increase and then decrease. Heritability for all traits, except AH, which was low, ranged from medium to high, which aligned with our expectations and previous research findings [30][31][32][33].
Through the GWAS, we identified thousands of significant or suggestively significant SNPs associated with these traits, along with the genes and reported QTLs in which these loci were located. We found that some significant SNPs associated with traits such as BW, EN, AH, and ESC were located in the corresponding reported QTLs. Functional enrichment analyses revealed the biological pathways involving these genes. We found that these genes were commonly associated with pathways such as the regulation of gene expression, signal transduction, calcium channels, and growth and development, displaying a good correlation between pathways and traits [34][35][36][37]. Both the QTL and functional enrichment analyses demonstrated the potential association of the SNPs and genes we identified with these traits, providing a foundation for subsequent feature engineering.
This study focused on comparing the accuracy and bias of different genomic prediction methods. Initially, employing the genotypes of all SNPs to predict phenotypes, we observed that nearly all ML methods yielded poor predictive accuracy and high prediction biases. We hypothesized that this was due to the excessive number of parameters leading to dimensionality explosion, as well as interference from redundant information during model training [38,39]. Consequently, the latter part of this study focused on how to effectively filter and preprocess SNPs with classical bioinformatics methods to enhance the predictive accuracy and reduce the prediction biases of the ML models.
We attempted to reduce the dimensionality through PCA, a widely used and efficient dimensionality reduction method in the fields of bioinformatics and machine learning. After training the ML models using the top 100 PCs of the samples, we observed a significant improvement in the results. Of the 44 phenotypes, the ML models outperformed rrBLUP and BayesA in predictive accuracy for more than half, particularly for BW and ESS. However, the models performed poorly in fitting EN and ESCL. Interestingly, the performance of the ML models on the prediction biases of EN and ESS was completely reversed: in predicting EN, the prediction biases of the ML models were significantly smaller than those of rrBLUP and BayesA, while for ESS the result was exactly the opposite. Most machine learning models outperform linear models in feature extraction and processing. Therefore, the direct reason for the difference in the predictive accuracy of the traits is that, after dimensionality reduction, the PCs of the population's genomic information were more strongly related to ESS and showed no significant correlation with EN, making it difficult for the ML models to extract information relevant to EN from the parameters. This might imply that the genetic structure of EN was relatively more linear, while the genetic structure of ESS contained more nonlinear relationships, which contributed to the difference in the performance of ML on the two traits. Since EN is controlled by multiple loci with small effects, even after aggregating the PCs, it may still be impossible to capture the influencing factors related to EN [34,40,41]. On the other hand, the ML models were able to capture the key eigenvalues affecting BW and ESS more effectively. Another possible explanation is that when predicting ESS, the ML models were overly sensitive to the noise in the training data, resulting in higher accuracy of the predictions but increased bias in the predicted values, which manifested as overfitting. In contrast, the genetic structure of EN contained less noise, which, in turn, led to better predictive accuracy in the linear models. ML models might not have a significant advantage when dealing with cleaner data, but they were able to reduce prediction bias by learning patterns.
Among the ML models used, the WE model performed best.The WE model combined the prediction results from other methods employed in AutoGluon and used linear fitting to generate new predicted values.This modeling framework, known as stacking in ML, places the predictions of multiple first-level models as parameters into a second-level model for re-fitting, making it a widely employed approach for enhancing model predictive accuracy.The CAT and KNN demonstrated good performances.CatBoost is an ML algorithm that excels in processing categorical data, employing a gradient boosting approach, making it well suited for the characteristics of SNP genotype data [42][43][44].The KNN, on the other hand, is an algorithm based on the principle that "similar entities share attributes" [45].Considering that researchers of genomic selection and prediction often assume that the same genotypes will have the same effects in a population [1,10,46], the KNN algorithm fits this application scenario well.This could explain its relative success with certain traits, particularly with regard to EN.Compared with other ML algorithms, the KNN has a more obvious advantage in predicting EN.In summary, there is no one model that is superior to others in terms of predictive accuracy across all traits, and there does not seem to be a clear pattern to the variations in predictive accuracy among methods.In linear models, different assumptions about marker effects and their variances are made, and models whose assumptions are closer to the true genetic structure of a particular trait often have higher predictive accuracy [47].Although ML does not require distributional assumptions on the variables chosen, this situation is also found in ML models, where the predictive performance largely depends on the congruence between model assumptions and the true structure of the data [48].Therefore, examining the genetic structure of traits through the predictive accuracy of ML models may become a viable method 
in the future. Subsequently, we examined the impact of several locus-filtering methods on the accuracy of the predictive models. PCA with different dimensions had a relatively clear effect on predictive accuracy. Interestingly, this effect seemed to follow the "elbow rule", implying that there is a specific dimension at which the dimensionality-reduced model achieves its best predictive performance [49,50]. However, the "elbow" positions differed across traits and exhibited no clear pattern. On the other hand, surprisingly, neither MAF-based locus filtering nor LD pruning had a significant effect on predictive accuracy. Within our study population, MAF was used as a preliminary, albeit imprecise, criterion for judging whether loci were affected by artificial selection [51,52]. Loci under strong artificial selection are generally considered to show reduced polymorphism, reflected in lower MAF values [53,54]. We speculate that, because breeding in this population focused primarily on EN traits, with less selection for other traits, the MAF values of loci related to these traits were almost uniformly distributed between 0.1 and 0.5 and thus had no definitive effect on model accuracy.
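The two filtering steps discussed above can be illustrated with a small numpy sketch. The genotype matrix here is simulated and the cutoffs (MAF 0.05, 80% explained variance) are arbitrary illustration values, not the settings used in this study:

```python
import numpy as np

rng = np.random.default_rng(0)
n_ind, n_snp = 200, 1000
# Simulated SNP genotypes coded 0/1/2 (toy data, not real chicken genotypes)
geno = rng.integers(0, 3, size=(n_ind, n_snp)).astype(float)

# --- MAF filtering: drop loci whose minor allele frequency is below a cutoff ---
allele_freq = geno.mean(axis=0) / 2.0            # frequency of the counted allele
maf = np.minimum(allele_freq, 1.0 - allele_freq)
kept = geno[:, maf >= 0.05]

# --- PCA via SVD: sweep dimensions and track cumulative explained variance ---
centered = kept - kept.mean(axis=0)
_, s, _ = np.linalg.svd(centered, full_matrices=False)
explained = (s ** 2) / np.sum(s ** 2)
cumulative = np.cumsum(explained)
# "Elbow" proxy: smallest dimension explaining, say, 80% of the variance
elbow_dim = int(np.argmax(cumulative >= 0.80)) + 1
```

In practice, the elbow would be located by evaluating the downstream model's predictive accuracy at each candidate dimension rather than by explained variance alone.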
The GWAS identified SNPs significantly associated with various traits across the genome. Applying these suggestively significant SNPs to the ML models improved predictive accuracy and reduced prediction bias for nearly all traits, which is consistent with our expectations. In our results, direct genomic prediction using the genotypes of these SNPs yielded better performance for most phenotypes than using eigenvalues from PCA. For SINS and ESCL, the improvements in predictive accuracy were striking, approaching three times that of the previous method (188-277%). This result indicates that the GWAS can serve as a feature-engineering method within genomic prediction models. It is worth noting, however, that these values are not absolute predictive accuracies for the traits but rather the magnitude of improvement over the previous method. Because typical GWASs use linear mixed models, control for population structure, and correct for false positives among SNPs, the identified significant and suggestively significant loci have a clear statistical association with the traits. This not only reduces the redundant information the ML models must process but also focuses the feature extraction, thereby yielding more accurate predictions.
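The idea of using GWAS as a feature-engineering step can be sketched as follows. This toy version uses a plain single-SNP linear regression on simulated data; a real GWAS, as noted above, would additionally model population structure and apply false-positive control. The suggestive threshold of 6.63 × 10−4 is taken from this study's setting:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_ind, n_snp = 300, 2000
geno = rng.integers(0, 3, size=(n_ind, n_snp)).astype(float)
# Phenotype driven by a handful of causal SNPs plus noise (toy data)
causal = rng.choice(n_snp, size=10, replace=False)
pheno = geno[:, causal].sum(axis=1) + rng.normal(0.0, 1.0, n_ind)

# Single-SNP association scan: correlation t-test per locus
p_values = np.empty(n_snp)
for j in range(n_snp):
    r = np.corrcoef(geno[:, j], pheno)[0, 1]
    # t statistic for the correlation, converted to a two-sided p-value
    t = r * np.sqrt((n_ind - 2) / (1.0 - r ** 2))
    p_values[j] = 2.0 * stats.t.sf(abs(t), df=n_ind - 2)

# Keep SNPs passing the suggestive threshold and use them as ML input features
threshold = 6.63e-4
features = geno[:, p_values < threshold]
```

The resulting `features` matrix, far narrower than the full genotype matrix, would then replace the raw SNP panel (or the PCA eigenvalues) as input to the ML models.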

Conclusions
Our study revealed that ML algorithms outperformed the traditional rrBLUP and BayesA in predicting economic traits from genotype data in a Rhode Island Red female chicken population. ML performed better at predicting most traits when using eigenvalues from PCA, and it outperformed rrBLUP and BayesA in more than half of the phenotypes, especially BW and EW. In addition, applying the suggestively significant SNPs obtained from the GWAS to genomic prediction substantially improved predictive performance, suggesting that the GWAS can serve as a feature-extraction step within genomic prediction. In conclusion, this study strengthens our understanding of ML algorithms and feature engineering in animal genomic prediction and provides valuable insights for future research in this area.

Supplementary Materials:
The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/genes15060690/s1. Figure S1: Predictive accuracy of ML models for 56-week traits using all SNPs; Figure S2: Prediction biases of ML models for 56-week traits using all SNPs; Figure S3: Predictive accuracy of the WE model for traits at 56 weeks using different PCA methods; Figure S4: Predictive accuracy of the WE model for traits at 56 weeks using different MAF and LD pruning methods; Table S1: Significant and suggestively significant SNPs identified by the GWAS for all phenotypes; Table S2: Genes in which significant loci were located.

Figure 1 .
Figure 1. Phenotype statistics and genome-wide association studies. (a) Box plots of all the candidate traits. (b) Histogram of the MAF of all SNPs passing quality control. (c) Manhattan plots for associations of suggestively significant SNPs with all candidate traits. The threshold for genome-wide suggestive significance was 3.18 (p value = 6.63 × 10−4). The horizontal black line indicates the genome-wide significance threshold (p value = 3.31 × 10−5, threshold = 4.48). Different colors represent different traits.

Figure 2 .
Figure 2. Accuracy of genomic prediction of candidate traits. Comparison of the predictive accuracy of the ML algorithms with rrBLUP (a) and BayesA (b). Red and positive values indicate that the ML model has an advantage over rrBLUP or BayesA; blue and negative values indicate that rrBLUP or BayesA has an advantage over the ML methods.

Figure 3 .
Figure 3. Correlation of heritability and sample size with the predictive accuracy of WE, rrBLUP, and BayesA. Scatter plots of heritability and sample size versus the predictive accuracy of 44 phenotypes.

Figure 4 .
Figure 4. Bar plot of differences in predictive accuracy between suggestively significant SNPs obtained from the GWAS and eigenvalues from PCA. The text above the bars indicates the ML model that achieved the best predictive accuracy.

Figure 5 .
Figure 5. Bar plot of differences in prediction biases between suggestively significant SNPs obtained from the GWAS and eigenvalues from PCA. The text above the bars indicates the ML model that achieved the lowest prediction bias.

Table 1 .
Hyperparameter table for the AutoGluon and ML models.

Table 2 .
Descriptive statistics of important economic traits of chickens along with the aging process.

Table 3 .
Functional annotation of genes associated with suggestively significant SNPs from the GWAS.

Table 4 .
Predictive accuracy and biases of all candidate traits. Numbers outside the parentheses are Pearson correlation coefficients, and those inside are the RMSE.