Genomic Prediction Accuracies for Growth and Carcass Traits in a Brangus Heifer Population

Simple Summary
Genomic estimated breeding values (GEBV) of Brangus heifers were obtained with genomic selection (GS) methods that associate single nucleotide polymorphism (SNP) marker genotypes with phenotypic data for economically important growth (birth, weaning, and yearling weights) and carcass (depth of rib fat, percent intramuscular fat, and longissimus muscle area) traits, using the linkage disequilibrium (LD) between SNP markers and quantitative trait loci (QTL) and/or the genomic relationships between animals. Heritability estimates were similar between k-means and random clusters for genomic best linear unbiased prediction (GBLUP) and for the Bayesian (BayesA, BayesB, BayesC, and Lasso) GS methods. The Bayesian methods resulted in underestimates of heritabilities and overestimates of the accuracy of GEBV. However, the GBLUP method resulted in more reasonable estimates of heritabilities and accuracies of GEBV for growth and carcass traits of heifers from a composite population.

Abstract
The predictive abilities and accuracies of genomic best linear unbiased prediction (GBLUP) and the Bayesian (BayesA, BayesB, BayesC, and Lasso) genomic selection (GS) methods for economically important growth (birth, weaning, and yearling weights) and carcass (depth of rib fat, percent intramuscular fat, and longissimus muscle area) traits were evaluated, and the linkage disequilibrium (LD) structure was characterized, in Brangus heifers using single nucleotide polymorphism (SNP) markers. Sharp declines in LD were observed as the distance between SNP markers increased. The application of GBLUP and the Bayesian methods to obtain GEBV for growth and carcass traits within k-means and random clusters showed that k-means and random clustering gave quite similar heritability estimates, but the Bayesian methods resulted in lower estimates of heritability, between 0.06 and 0.21 for growth and carcass traits, compared with those between 0.21 and 0.35 from GBLUP. Although the prediction abilities of GBLUP and the Bayesian methods were quite similar for growth and carcass traits, the Bayesian methods overestimated the accuracies of GEBV because of their lower estimates of heritability. In contrast, GBLUP resulted in accuracies of GEBV for growth and carcass traits that parallel previous reports.


Introduction
The availability of high-density SNP genotypes from high-throughput genotyping technologies [1][2][3][4] and the development of linear and nonlinear methods (such as GBLUP, BayesA, BayesB, BayesC, and the Bayesian Lasso) [1,5,6] have made genomic selection applicable to economically important traits in animal and plant breeding [7][8][9][10][11][12][13][14]. Genomic selection methods associate SNP marker genotypes with phenotypic data for economically important traits to obtain the GEBV of animals based on the LD between SNP and QTL and/or the genomic relationships among animals. The accuracy of GEBV, which is important for genetic progress in GS, is influenced by many factors, including the level of LD between SNP and QTL, the heritability of the trait, and the method used to estimate GEBV [15][16][17]. Habier et al. [18] reported that the accuracies of GEBV depend on LD between SNP and QTL, and on genomic relationships among animals in the training and validation datasets. Their findings indicated that the accuracy of GEBV of a selection candidate decreased as the genomic relationship between selection candidates and training animals decreased. Saatchi et al. [19] also showed that if the genetic relationships between animals in the training and validation data were minimized, using the pedigree-based additive genetic relationships among animals in the k-means clustering procedure, accuracies of GEBV of animals in the validation data were less affected by their genomic relationships. Villumsen et al. [20] studied the effect of heritability on the accuracy of GEBV in GS using simulated data and reported that the accuracy of GEBV increased by about 17% as the heritability increased from 0.02 to 0.30. Clark et al. [21] compared the accuracy of GEBV from BLUP, GBLUP, and BayesB, finding that the accuracies of genomic prediction from GS methods depended on the magnitude of QTL effects on the trait, and that QTL with small effects resulted in a non-significant difference between GBLUP and BayesB.
The objectives of this research were to characterize LD structure of Brangus heifers and to compare the predictive ability and accuracy of the GBLUP and the Bayesian methods for economically important growth (birth, weaning, and yearling weights) and carcass (depth of rib fat, percent of intramuscular fat, and longissimus muscle area) traits using BovineSNP50 Infinium BeadChip SNP markers (n = 54,001 SNP).

Phenotypes
Birth weight (BW), weaning weight (WW), and yearling weight (YW) were the phenotypes for growth traits, and depth of rib fat (FAT), percent intramuscular fat (IMF), and longissimus muscle area (LMA), measured at yearling ultrasound evaluation, were the phenotypes for carcass traits. Phenotypes were collected from 738 Brangus heifers that were registered with the International Brangus Breeders Association [9,22,23]. Year of birth (2005 to 2007), season of calving (spring or autumn), and age of dam were also obtained from the database of the International Brangus Breeders Association. The descriptive statistics of these growth and carcass traits are presented in Table 1.

SNP Marker Genotypes
Each heifer was genotyped with the BovineSNP50 Infinium BeadChip for 54,001 SNP markers [2]. Genotypes of SNP markers were determined in the A/B allele format and coded as 0, 1, or 2, based on the number of B alleles at each locus. Using the snpReady package in R [24], three quality-control filters were applied in the following sequence: (a) animals with > 50% missing data were removed; (b) SNP markers with > 5% missing data (i.e., < 95% call rate) were removed; and (c) SNP markers with < 10% minor allele frequency were removed. After imputation of missing SNP genotypes, the complete SNP genotype data included 35,351 SNP markers from 738 animals. On each chromosome, the distribution of the number of SNP markers within a 1 Mb window was determined using the rMVP package in R [25].
Animals 2023, 13, 1272
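The three sequential filters described in this section can be sketched in a few lines. This is only an illustration in Python with NumPy, not the snpReady implementation used in the study; the function name `quality_control`, its arguments, and the use of `np.nan` for missing calls are assumptions made for the example, and imputation is not covered.

```python
import numpy as np

def quality_control(M, max_animal_missing=0.50, max_snp_missing=0.05, min_maf=0.10):
    """Apply the three filters in sequence to an (animals x SNP) genotype matrix.

    M holds B-allele counts (0, 1, 2) with np.nan marking missing calls.
    Returns the filtered matrix, a boolean mask of kept animals,
    and the column indices of kept SNPs.
    """
    M = np.asarray(M, dtype=float)

    # (a) remove animals with more than 50% missing genotypes
    keep_animals = np.isnan(M).mean(axis=1) <= max_animal_missing
    M = M[keep_animals]

    # (b) remove SNPs with more than 5% missing genotypes (call rate below 95%)
    snp_idx = np.flatnonzero(np.isnan(M).mean(axis=0) <= max_snp_missing)
    M = M[:, snp_idx]

    # (c) remove SNPs with minor allele frequency below 10%
    p = np.nanmean(M, axis=0) / 2.0          # frequency of the B allele
    maf = np.minimum(p, 1.0 - p)
    keep = maf >= min_maf
    return M[:, keep], keep_animals, snp_idx[keep]
```

Because the filters run in sequence, SNP missingness and allele frequencies are computed only on the animals that survive step (a), which mirrors the ordering stated in the text.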

Linkage Disequilibrium
The success of GS and genome-wide association studies (GWAS) depends on LD, which is the non-random association among SNP markers. LD is measured using the squared correlation ($r^2$) between SNP markers, which ranges between 0 and 1. Linkage disequilibrium is expressed as

$$r^2_{ij} = \frac{(p_{AB} - p_A p_B)^2}{p_A \, p_a \, p_B \, p_b} \tag{1}$$

where $p_{AB}$, $p_A = 1 - p_a$, and $p_B = 1 - p_b$ are the observed frequencies of haplotype AB and of alleles A and B at loci $i$ and $j$, respectively. The estimates of LD for pairwise combinations of all SNP markers were obtained from the pairwise LD function of the Synbreed package in R [26,27].
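As a worked illustration of the LD measure, the sketch below computes $r^2$ for one pair of loci, assuming the standard definition $r^2 = D^2 / (p_A p_a p_B p_b)$ with $D = p_{AB} - p_A p_B$. The function name `ld_r2` and the example frequencies are hypothetical and are not taken from the Brangus data.

```python
def ld_r2(p_AB, p_A, p_B):
    """Squared correlation between two biallelic loci.

    p_AB: frequency of haplotype AB; p_A, p_B: frequencies of alleles A and B.
    """
    D = p_AB - p_A * p_B                        # disequilibrium coefficient
    denom = p_A * (1.0 - p_A) * p_B * (1.0 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return D * D / denom

# Hypothetical frequencies: complete LD, partial LD, and linkage equilibrium
complete = ld_r2(0.5, 0.5, 0.5)
partial = ld_r2(0.4, 0.5, 0.5)
equilibrium = ld_r2(0.25, 0.5, 0.5)
```

With both allele frequencies at 0.5, a haplotype frequency of 0.25 corresponds to independence ($r^2 = 0$), and a haplotype frequency of 0.5 corresponds to complete LD ($r^2 = 1$).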

Genomic Best Linear Unbiased Prediction (GBLUP)
The model for GEBV was:

$$y = Xb + Zg + e \tag{2}$$

where $y$ was a vector of BW, WW, YW, LMA, IMF, or FAT records; $X$ was a design matrix relating the records to the fixed effects of overall mean, contemporary group, and dam age; $Z$ was a design matrix relating the records to the additive genetic effects of animals; $b$ was a vector of fixed effects of overall mean, contemporary groups, and dam age; and $g$ was a vector of additive genomic breeding values of animals following a multivariate normal distribution $g \sim N(0, G\sigma_g^2)$ with genomic relationship matrix ($G$) and additive genetic variance ($\sigma_g^2$) among animals. $e$ was a vector of residuals following a multivariate normal distribution $e \sim N(0, I\sigma_e^2)$ with residual variance ($\sigma_e^2$). The $G$ matrix, indicating the realized relatedness among animals, was calculated as

$$G = \frac{WW'}{2\sum_{i=1}^{k} p_i (1 - p_i)} \tag{3}$$

where $W = M - P$; $M$ was the ($n \times k$) matrix of SNP marker genotypes for the $n = 738$ animals and the $k = 35{,}351$ SNP markers; $P$ was the ($n \times k$) matrix of the allele frequencies multiplied by 2; $p_i$ was the allele frequency of SNP marker $i$; and the sum was over all loci [18,28]. The GBLUP used for the GEBV of animals was equivalent to solving the mixed model equations:

$$\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z + G^{-1}\dfrac{\sigma_e^2}{\sigma_g^2} \end{bmatrix} \begin{bmatrix} \hat{b} \\ \hat{g} \end{bmatrix} = \begin{bmatrix} X'y \\ Z'y \end{bmatrix} \tag{4}$$

where $\sigma_g^2$ and $\sigma_e^2$ were the additive genetic and residual variances and $G^{-1}$ was the inverse of the $G$ matrix. The heritability of the trait was therefore defined as $h^2 = \sigma_g^2 / (\sigma_g^2 + \sigma_e^2)$. The BGLR package (https://cran.r-project.org/web/packages/BGLR/index.html (accessed on 10 March 2022)) in R [6,26] was used to solve the mixed model equations in Equation (4) for $b$ and $g$ by estimating the additive genetic and residual variances ($\hat{\sigma}_g^2$ and $\hat{\sigma}_e^2$). The estimate of heritability was then calculated as $\hat{h}^2 = \hat{\sigma}_g^2 / (\hat{\sigma}_g^2 + \hat{\sigma}_e^2)$.
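To make the GBLUP steps concrete, the following sketch builds $G = WW' / (2\sum_i p_i(1-p_i))$ from a genotype matrix and solves the mixed model equations with $Z = I$ (one record per animal). It is a simplified illustration, not the BGLR procedure used in the study: the variance components are assumed known rather than estimated, and the function names and the small `ridge` constant added to make $G$ invertible are assumptions of the example.

```python
import numpy as np

def genomic_relationship(M):
    """G = W W' / (2 * sum_i p_i (1 - p_i)), with W = M - 2p."""
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0                  # allele frequency of each SNP
    W = M - 2.0 * p                           # centre genotypes by 2 * p_i
    return W @ W.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup(y, X, G, var_g, var_e, ridge=1e-6):
    """Solve the mixed model equations for fixed effects b and GEBV g (Z = I).

    var_g and var_e are treated as known here; in practice they are estimated.
    """
    n = len(y)
    lam = var_e / var_g                       # variance ratio in the G^-1 block
    # G built from centred markers is singular, so a small ridge is added
    # (an assumption of this sketch) before inversion.
    G_inv = np.linalg.inv(G + ridge * np.eye(n))
    Z = np.eye(n)
    lhs = np.block([[X.T @ X, X.T @ Z],
                    [Z.T @ X, Z.T @ Z + lam * G_inv]])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    sol = np.linalg.solve(lhs, rhs)
    k = X.shape[1]
    return sol[:k], sol[k:]

def heritability(var_g, var_e):
    """h^2 = var_g / (var_g + var_e)."""
    return var_g / (var_g + var_e)
```

For example, with an intercept-only `X = np.ones((n, 1))`, `gblup(y, X, genomic_relationship(M), var_g, var_e)` returns the estimated mean and one GEBV per animal.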

The Bayesian BayesA, BayesB, BayesC and Lasso Methods
The Bayesian (BayesA, BayesB, BayesC, and Lasso) methods were applied to estimate the SNP effects for genomic prediction using cross-validation datasets of BW, WW, YW, LMA, IMF, and FAT. The cross-validation data of BW, WW, YW, LMA, IMF, and FAT were modeled as a function of the individual SNP effects:

$$y = Xb + Mm + e \tag{5}$$

where $y$ was a vector of BW, WW, YW, LMA, IMF, or FAT records; $X$ was a design matrix relating the records to the corresponding fixed effects of overall mean, contemporary group, and dam age; $M$ was an $n \times k$ matrix of SNP genotypes (0, 1, or 2); $b$ was a vector of fixed effects of overall mean, contemporary groups, and dam age; and $m$ was a $k \times 1$ vector of SNP effects assumed a priori to follow a multivariate normal distribution $m \sim N(0, \Omega)$ with $\Omega = \mathrm{diag}(\sigma^2_{m_1}, \sigma^2_{m_2}, \cdots, \sigma^2_{m_k})$ a diagonal matrix and $\sigma^2_{m_i}$ the variance of SNP $i$. The prior distribution of the SNP effect $m_i$ depended on the SNP variance $\sigma^2_{m_i}$ and the prior probability $\pi$ that SNP $i$ had zero effect:

$$m_i \mid \pi, \sigma^2_{m_i} \sim \begin{cases} 0 & \text{with probability } \pi \\ N(0, \sigma^2_{m_i}) & \text{with probability } (1 - \pi) \end{cases} \tag{6}$$

where the parameter $\pi$ was defined between 0 and 1 [5]. The specifications for $\pi$ and the SNP variance $\sigma^2_{m_i}$ determined the BayesA, BayesB, and BayesC methods. In the BayesA and BayesB methods, $\sigma^2_{m_i}$ denoted the variance of the $i$th SNP, which had a scaled inverse chi-square distribution ($\chi^{-2}(\nu, S)$) with degrees of freedom $\nu$ and scale $S$. These specifications result in a univariate Student's t distribution $t(0, \nu, S)$ for the marginal distribution of the SNP effect $m_i \mid \nu, S$ with probability $(1 - \pi)$ [5,6]. In BayesC, with the SNP variance $\sigma^2_{m_i} = \sigma^2_m$, the prior distributions of the SNP effects had a common variance distributed as $\chi^{-2}(\nu, S)$. Therefore, these specifications resulted in a mixture of multivariate Student's t distributions $t(0, \nu, IS)$ for the marginal distribution of the SNP effect $m_i \mid \nu, S$ with probability $(1 - \pi)$ [5,6].
In the BayesA method, $\pi$ was set to zero, so that all $k$ SNP were in the model. In the BayesB and BayesC methods, $\pi$ was fixed at 0.95, so that 5% of the $k$ SNP markers had non-null variances in the model. In the Bayesian Lasso (BL), all $k$ SNP ($\pi = 0$) were in the model, as in the BayesA method, and each SNP marker variance $\sigma^2_{m_i}$ had an exponential distribution $\mathrm{Exp}(\lambda^2/2)$ with parameter $\lambda$, which had a conjugate Gamma prior distribution. These specifications result in a Double Exponential (DE) distribution for the marginal distribution of the SNP effect $m_i \mid \lambda^2$ with probability $(1 - \pi)$ [6,29]. The vector $e$ represented normally distributed residuals ($e \sim N(0, I\sigma_e^2)$) with variance $\sigma_e^2$, which had a $\chi^{-2}(\nu_e, S_e)$ prior with degrees of freedom $\nu_e$ and scale $S_e$. The BGLR package (https://cran.r-project.org/web/packages/BGLR/index.html (accessed on 10 March 2022)) in R [6,26] was used to estimate SNP effects for BW, WW, YW, LMA, IMF, and FAT.
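The role of $\pi$ and of the SNP-specific versus common variance can be illustrated by drawing SNP effects from the prior itself. The sketch below is not a sampler for the posterior and is not how BGLR fits these models; it only draws from a spike-and-slab prior under the assumption that a scaled inverse chi-square variate is generated as $\nu S / \chi^2_\nu$ (one common parameterization). The function name and the values of `nu` and `S` are hypothetical.

```python
import numpy as np

def sample_snp_effects(k, pi, nu=5.0, S=0.01, common_variance=False, seed=0):
    """Draw k SNP effects from a spike-and-slab prior.

    pi: prior probability that a SNP has zero effect
        (0 mimics BayesA, 0.95 mimics the BayesB/BayesC setting in the text).
    common_variance: False gives one variance per SNP (BayesA/BayesB style),
        True gives a single shared variance (BayesC style).
    """
    rng = np.random.default_rng(seed)
    n_var = 1 if common_variance else k
    # Scaled inverse chi-square draw, assuming the nu * S / chi2(nu) form
    var = nu * S / rng.chisquare(nu, size=n_var)
    var = np.broadcast_to(var, (k,))
    in_model = rng.random(k) >= pi            # SNP is in the model with prob. 1 - pi
    effects = rng.normal(0.0, np.sqrt(var))
    return np.where(in_model, effects, 0.0)

bayes_a_like = sample_snp_effects(20000, pi=0.0)
bayes_b_like = sample_snp_effects(20000, pi=0.95)
bayes_c_like = sample_snp_effects(20000, pi=0.95, common_variance=True)
```

With `pi=0.95`, roughly 5% of the drawn effects are non-zero, which is the sparsity the BayesB and BayesC settings impose a priori.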

K-Means and Random Clustering
The animals were divided into 10 folds for cross-validation using the k-means clustering approach. K-means clustering maximizes genetic relatedness within each cross-validation set and minimizes it between cross-validation sets based on the genetic dissimilarity ($D$) matrix among animals [19], which was calculated from the pedigree numerator relationship ($A$) matrix [30]:

$$d_{ij} = 1 - \frac{a_{ij}}{\sqrt{a_{ii} a_{jj}}} \tag{7}$$

where $d_{ij}$ was a measure of genetic dissimilarity between individuals $i$ and $j$, $a_{ij}$ was the additive genetic relationship between individuals $i$ and $j$, and $a_{ii}$ ($a_{jj}$) was the $i$th ($j$th) diagonal element of the $A$ matrix; the ratio $a_{ij}/\sqrt{a_{ii} a_{jj}}$ represented Wright's coefficient of relationship ($r_{ij}$) between individuals $i$ and $j$. The GeneticsPed package in R [31] was used to create the pedigree numerator relationship ($A$) matrix, and the factoextra package in R [32], implementing the Hartigan and Wong [33] algorithm, was used for k-means clustering. For random clustering, the animals were randomly divided into 10 folds for cross-validation, and this procedure was replicated five times.
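A short sketch of the two ingredients described here: the dissimilarity matrix, assuming $d_{ij} = 1 - a_{ij}/\sqrt{a_{ii} a_{jj}}$, and a balanced random assignment of animals to 10 folds. The k-means step itself is omitted; the function names and the 3 × 3 relationship matrix are hypothetical, and the code is not the GeneticsPed or factoextra workflow used in the study.

```python
import numpy as np

def genetic_dissimilarity(A):
    """D with d_ij = 1 - a_ij / sqrt(a_ii * a_jj), from a numerator relationship matrix A."""
    A = np.asarray(A, dtype=float)
    d = np.sqrt(np.diag(A))
    return 1.0 - A / np.outer(d, d)           # 1 minus Wright's coefficient of relationship

def random_folds(n, k=10, seed=0):
    """Assign n animals to k folds of (nearly) equal size at random."""
    rng = np.random.default_rng(seed)
    return rng.permutation(n) % k             # fold label (0..k-1) for each animal

# Hypothetical pedigree relationships: animals 0 and 1 are half related, 0 and 2 unrelated
A_demo = np.array([[1.0, 0.5, 0.0],
                   [0.5, 1.0, 0.25],
                   [0.0, 0.25, 1.0]])
D_demo = genetic_dissimilarity(A_demo)
folds = random_folds(738, k=10, seed=1)
```

Repeating `random_folds` with five different seeds corresponds to the five replicates of random clustering mentioned in the text.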

Accuracy of Genomic Prediction
In the 10-fold cross-validation from the k-means and random clustering approaches, each fold in turn was excluded as the validation set, the model was trained on the remaining nine folds, and the GEBV of animals in the omitted validation set were then obtained. The predictive ability of GBLUP and the Bayesian BayesA, BayesB, BayesC, and Lasso methods in the 10-fold cross-validation datasets was determined using Pearson's correlation coefficient ($r_{y,\hat{y}}$) between the observed ($y$) and predicted ($\hat{y}$) phenotypic values for BW, WW, YW, LMA, IMF, and FAT.
The accuracy of GEBV is the correlation ($r_{BV,GEBV}$) between the breeding values (BV) and GEBV. However, because the BV of animals are unknown, the accuracy of GEBV of animals for each trait was calculated by pooling estimates from the 10-fold cross-validation strategy. The accuracy of the GEBV of animals for traits was estimated using Pearson's correlation coefficient ($r_{y,\hat{y}}$) weighted by the heritability ($h^2$) of the traits in the validation datasets [34]:

$$r_{BV,GEBV} = \frac{r_{y,\hat{y}}}{\sqrt{h^2}} \tag{8}$$
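The dependence of the estimated accuracy on the heritability estimate can be seen in a small numerical sketch. It assumes the commonly used form $r_{BV,GEBV} = r_{y,\hat{y}} / \sqrt{h^2}$; the function names and the numbers (a predictive ability of 0.20 and heritabilities of 0.25 and 0.10) are hypothetical and chosen only to show the direction of the effect.

```python
import numpy as np

def predictive_ability(y, y_hat):
    """Pearson correlation between observed and predicted phenotypes."""
    return float(np.corrcoef(y, y_hat)[0, 1])

def gebv_accuracy(r_y_yhat, h2):
    """Accuracy of GEBV, assuming r_BV,GEBV = r_y,yhat / sqrt(h2)."""
    if not 0.0 < h2 <= 1.0:
        raise ValueError("heritability must lie in (0, 1]")
    return r_y_yhat / np.sqrt(h2)

# Same predictive ability, two hypothetical heritability estimates
acc_higher_h2 = gebv_accuracy(0.20, 0.25)
acc_lower_h2 = gebv_accuracy(0.20, 0.10)
```

For a fixed predictive ability, a lower heritability estimate yields a larger estimated accuracy, which is why methods that underestimate heritability can appear to give more accurate GEBV.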

Distribution of SNP Markers and LD Analysis
We retained 35,351 SNP after filtering markers based on the quality-control criteria. The distribution and density plots of SNP markers per chromosome are presented in Figure 1A,B. The total length of the autosomal genome was 2509.0 Mb, with the shortest chromosome (chromosome 25) being 42.9 Mb in length and the longest chromosome (chromosome 1) being 158.2 Mb in length. The length of chromosome X was 148.6 Mb. As seen in Figure 1A, there was a decreasing trend in the number of SNP markers from chromosome 1 to chromosome X, and the number of SNP ranged from 620 (1.78%) on chromosome 25 to 2194 (6.31%) on chromosome 1. Chromosomes 1 and 25 were likewise the longest and the shortest chromosomes in other studies, at 158.49 Mb and 42.91 Mb in Sahiwal cattle [35], 157.78 Mb and 42.21 Mb in Charolais, Limousine, and Blonde d'Aquitaine cattle [36], and 158.03 Mb and 42.80 Mb in Vrindavani crossbred cattle in India [37]. Singh et al. [37] also reported that, since the distribution of SNP was related to the length of chromosomes, chromosome 1 had the highest number of SNP (2798) and chromosome 25 had the lowest number of SNP (792). In the current study, the largest distance between SNP markers was 3.26 Mb on chromosome 10, and the shortest distance was 0.01 kb on chromosome 15. The average distance between SNP markers was 57.24 kb. Lu et al. [38] reported that the total genome length for Angus, Charolais, and crossbred beef cattle in Canada was between 2534.98 and 2535.30 Mb, with the shortest chromosome 25 being 42.72 Mb and the longest chromosome 1 being 158.09 Mb. The number of SNP ranged from 2026 to 2176 for chromosome 1 and from 580 to 607 for chromosome 28, and the overall average distance between two adjacent SNP markers was 70 kb.
The density plot of SNP markers in Figure 1B shows the number of SNP markers within a 1 Mb window on each chromosome. The horizontal axis of the density plot indicates the length of the chromosome (Mb), and the colors indicate the SNP density, from 0 to 37 SNP markers per window, along each chromosome. The distribution of SNP markers on the autosomal chromosomes was not uniform, with a tendency for markers to be clustered in some regions. The density of the SNP markers ranged from 12.0 SNP/Mb on chromosome 12 to 15.1 SNP/Mb on chromosome 19. For the X chromosome, the density of the SNP markers was 4.7 SNP/Mb. Chromosome 1 had a similar density of SNP along its length; however, chromosomes 11, 14, 24, and 25 had a higher density of SNP at the beginning of the chromosome than along the rest of the chromosome. The X chromosome was the second largest chromosome, but green and grey colors indicated very sparse densities of SNP markers. In addition, chromosome 6 had more SNP markers than chromosomes 3, 4, and 5 and was shorter than those chromosomes; therefore, the density of SNP markers on chromosome 6 (14.6 SNP/Mb) was higher than those on chromosomes 3 (14.1 SNP/Mb), 4 (13.3 SNP/Mb), and 5 (12.3 SNP/Mb).
Pairwise LD between the 35,351 SNP markers was assessed using the squared correlation ($r^2$) between SNP markers. For marker pairs within 1 Mb of each other, the average LD (SD) was 0.125 (0.156) and the average genetic distance (SD) was 0.503 (0.285) Mb across all chromosomes. Over all marker pairs, the average LD and genetic distance were 0.022 (0.054) and 29.060 (24.209) Mb, respectively. The distribution of LD ($r^2$) against the genetic distance (Mb) in Figure 2 showed a sharp, approximately exponential decline in LD as the genetic distance between SNP markers increased, with higher LD values for SNP markers located in close proximity. For SNP markers less than 0.1 Mb apart, the mean LD (SD) was 0.195 (0.224). For pairs of SNP markers separated by 0 to 0.1, 0 to 0.2, and 0 to 0.5 Mb, 11.22, 8.73, and 5.67% of SNP marker pairs, respectively, showed an LD higher than 0.5.
McKay et al. [39] and El Hou et al. [36] reported that most studies based on bovine SNP data have shown that the average LD was close to zero for distances between SNP greater than 500 kb. Lu et al. [38] reported rapidly decreasing LD, from 0.29 to 0.23 to 0.19 in Angus, 0.22 to 0.16 to 0.12 in Charolais, and 0.21 to 0.15 to 0.11 in crossbred cattle, for distances of 0-30 kb, 30-70 kb, and 70-100 kb, respectively. El Hou et al. [36] also found that the average LD values between pairs of SNP markers ranged from 0.079 to 0.121 for Charolais, Limousine, and Blonde d'Aquitaine cattle, and that the average LD changed from 0.5 to 0.1 as the distance increased from less than 15 kb to more than 120 kb. Singh et al. [37] also calculated an average LD of 0.43 for distances of less than 10 kb, which decreased to 0.21 for distances of 25 to 50 kb, in Vrindavani crossbred cattle.

Heritability Estimates from GBLUP, and Bayesian (BayesA, BayesB, BayesC and Lasso) Methods in k-Means and Random Training Data Sets
Heritability estimates of growth (BW, WW, and YW) and carcass (FAT, IMF, and LMA) traits from 10-fold k-means and random cluster training datasets were obtained from GBLUP and the Bayesian (BayesA, BayesB, BayesC, and Lasso) methods.

As presented in Figures 3 and 4, the 10-fold k-means and random cluster training datasets resulted in very similar heritability ($h^2$) estimates for growth and carcass traits. The comparison of methods showed that GBLUP yielded heritability ($h^2$) estimates almost double those of the Bayesian (BayesA, BayesB, BayesC, and Lasso) methods for growth and carcass traits within the 10-fold k-means and random cluster training datasets. Among the Bayesian methods, the BL method resulted in higher estimates of heritability ($h^2$) than the BayesA, BayesB, and BayesC methods for growth traits; however, heritability ($h^2$) estimates for carcass traits were similar across the Bayesian methods. Peters et al. [9] reported pedigree- and genome-based estimates of heritabilities for growth and carcass traits from GWAS analyses using the BayesC method and the SNP markers for the Brangus cattle of this study.
Pedigree-based estimates of heritabilities were similar to those from GBLUP for growth traits, but were higher than those from the Bayesian (BayesA, BayesB, BayesC, and Lasso) methods for growth and carcass traits. Genome-based (BayesC) estimates of heritabilities for growth and carcass traits were lower than those from GBLUP, but were similar to those from the other Bayesian methods. The heritability ($h^2$) estimates from GBLUP for growth and carcass traits were within the range of heritability ($h^2$) estimates reported in the literature [9,[40][41][42][43][44], which suggests that GBLUP resulted in more reasonable heritability estimates than the Bayesian (BayesA, BayesB, BayesC, and Lasso) methods in analyses using smaller subsets of data [42].

Comparison of Genome-Wide Prediction Ability
The GBLUP and the Bayesian methods were used to analyze growth (BW, WW, and YW) and carcass (FAT, IMF, and LMA) traits. The means and standard deviations of Pearson correlations ($r_{y,\hat{y}}$) between actual and predicted phenotypes are presented in Figure 5 for growth (BW, WW, and YW) traits and in Figure 6 for carcass (FAT, IMF, and LMA) traits.
The predictive performances from k-means and random cluster training and validation datasets differed among growth (BW, WW, and YW) and carcass (FAT, IMF, and LMA) traits within GBLUP and the Bayesian methods, which reflects the genetic architecture of the traits. The similar predictive performances of GBLUP and the Bayesian methods for growth and carcass traits also suggested that these traits are controlled by many genes with small effects. Carcass traits also had higher heritabilities, and therefore higher predictive performances, than growth traits. These results also revealed that the 10-fold k-means and random cluster cross-validation datasets resulted in significantly lower correlations than the training datasets for growth and carcass traits within GBLUP and the Bayesian (BayesA, BayesB, BayesC, and Lasso) methods. The decrease in mean correlations for growth and carcass traits ranged from 52% to 87% in the 10-fold k-means and random cluster cross-validation datasets. In addition, k-means clustering, which minimizes the genetic relationship among cross-validation datasets, produced lower correlations than random clustering; the decreases ranged from 16% to 22% for BW, 40% to 49% for WW, and 23% to 26% for YW among growth traits, and from 12% to 15% for FAT, 16% to 18% for IMF, and 25% to 30% for LMA among carcass traits, across GBLUP and the Bayesian methods.
The accuracies of GEBV averaged over all methods were 0.559 (0.696) for BW, 0.347 (0.645) for WW, 0.593 (0.790) for YW, 0.575 (0.663) for FAT, 0.697 (0.831) for IMF, and 0.562 (0.774) for LMA in the k-means (random) cluster cross-validation datasets. As seen in Figures 7 and 8, the random clustering approach resulted in higher accuracies of GEBV (by 24% for BW, 87% for WW, 33% for YW, 15% for FAT, 19% for IMF, and 37% for LMA) than the k-means clustering approach because of the higher relationship between training and validation datasets in random clustering. Habier et al. [45] carried out a genome-wide analysis of milk yield, fat yield, protein yield, and somatic cell score, and indicated that the accuracy of GEBV decreased as the genomic relationship between animals in the training and validation datasets was reduced. Saatchi et al. [19] found accuracies of 0.554 and 0.700 for BW, 0.333 and 0.534 for WW, 0.356 and 0.573 for YW, 0.603 and 0.793 for FAT, 0.690 and 0.817 for marbling, and 0.601 and 0.694 for LMA in the k-means and random cross-validation datasets from Angus cattle, and they suggested that minimizing the genetic relationships between animals in the training and validation sets using k-means clustering resulted in conservative accuracies of GEBV.
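The relative gains of random over k-means clustering quoted above can be checked directly from the averaged accuracies given in this paragraph. The short calculation below uses only those rounded means, so its results can differ slightly from the percentages in the text, which were presumably computed from unrounded values; the variable names are illustrative.

```python
# Averaged accuracies of GEBV over all methods, as reported in the text
kmeans_acc = {"BW": 0.559, "WW": 0.347, "YW": 0.593,
              "FAT": 0.575, "IMF": 0.697, "LMA": 0.562}
random_acc = {"BW": 0.696, "WW": 0.645, "YW": 0.790,
              "FAT": 0.663, "IMF": 0.831, "LMA": 0.774}

# Percentage increase in accuracy of random over k-means clustering
gain_pct = {trait: 100.0 * (random_acc[trait] / kmeans_acc[trait] - 1.0)
            for trait in kmeans_acc}

for trait, gain in gain_pct.items():
    print(f"{trait}: {gain:.1f}% higher accuracy with random clustering")
```

The largest gain is for WW, the trait with the lowest accuracy under k-means clustering, which is consistent with the pattern described in the text.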
Daetwyler et al. [46] determined that the high accuracy of GEBV resulted from family relationships rather than LD between SNP and QTL in a multiple-breed sheep population. Chen et al. [47] showed that individuals with close relatives in the training population had a higher accuracy of GEBV. Kang et al. [48] also reported that the accuracy of GEBV decreased with an increasing generation gap between the training and validation datasets. Zhou et al. [49] studied the factors affecting GEBV accuracy and reported that the genetic relationship between animals in the cross-validation datasets had a more important effect on the accuracy of GEBV than the LD between SNP and QTL, because the accuracy of GEBV decreased even when the LD between SNP increased.
In Figures 7 and 8, the averaged accuracies of GEBV suggested that GBLUP resulted in lower accuracies of GEBV than the Bayesian methods for growth and carcass traits. The Bayesian methods exhibited quite similar accuracies of GEBV; among them, the BayesC method for growth traits and the Bayesian Lasso method for carcass traits provided higher accuracies of GEBV than the other Bayesian methods. Sun et al. [50] compared GBLUP and the Bayesian methods using simulated data and found that GBLUP had lower accuracy than BayesB and BayesCπ, and that the Bayesian methods resulted in quite similar accuracies. Gao et al. [51] also reported that the Bayesian methods gave better accuracy of GEBV than the GBLUP methods in the genomic analysis of milk production traits of Nordic Holstein cows. However, Chen et al. [52] reported that GBLUP performed better than BayesB in the genomic analysis of carcass traits from Angus and Charolais beef cattle. Hayes et al. [53] also found that the GBLUP and BayesB methods resulted in similar accuracies in a multibreed dairy population. Additionally, Ostersen et al. [54] reported no difference among GBLUP, the Bayesian Lasso, and the Bayesian mixture methods based on 60,000 SNP data, and Ge et al. [55] reported similar predictive accuracy for GBLUP and the Bayesian methods for growth traits at weaning and yearling ages in yaks. Although the predictive abilities of GBLUP and the Bayesian methods were quite similar for growth and carcass traits and for k-means and random clusters (Figures 5 and 6) in the current study, the realized accuracies of the GEBV from GBLUP and the Bayesian methods were not similar (Figures 7 and 8) because the heritability estimates differed between GBLUP and the Bayesian (BayesA, BayesB, BayesC, and Lasso) methods. As described by Rolf et al. [42] for analyses of smaller subsets of data, more robust and reasonable heritability estimates can be obtained from GBLUP methodologies than from the Bayesian methods. The accuracies of GEBV from GBLUP in this study were found to be within the range of the theoretically predicted accuracies, between 0.26 and 0.34, based on heritability estimates of traits from 0.25 to 0.40 [56].
Figure 7. Accuracy of GEBV from Genomic Best Linear Unbiased Prediction (GBLUP) and the Bayesian (BayesA, BayesB, BayesC, and Lasso (BL)) methods for birth weight (BW), weaning weight (WW), and yearling weight (YW) for growth traits from 10-fold k-means and random cluster cross-validation datasets.
Figure 8. Accuracy of GEBV from Genomic Best Linear Unbiased Prediction (GBLUP) and the Bayesian (BayesA, BayesB, BayesC, and Lasso (BL)) methods for depth of rib fat (FAT), intramuscular fat (IMF), and longissimus muscle area (LMA) for carcass traits from 10-fold k-means and random cluster cross-validation datasets.

Conclusions
To explore the translation of genomic prediction to growth and carcass traits in Brangus cattle, GBLUP and the Bayesian (BayesA, BayesB, BayesC, and Lasso) methods were used to estimate GEBV for growth (BW, WW, and YW) and carcass (FAT, IMF, and LMA) traits within k-means and random clusters. The heritability estimates from k-means and random clusters were quite similar within each genomic prediction method. The Bayesian methods underestimated the heritabilities, with estimates between 0.06 and 0.21, whereas the heritability estimates from GBLUP were between 0.21 and 0.35 for growth and carcass traits and parallel estimates of this type in the literature. Including low-density SNP markers with low minor allele frequency may cause the Bayesian methods to estimate the heritabilities of traits poorly compared with GBLUP, which uses the genomic relationships. K-means clustering appears to minimize the genetic relationships among cross-validation datasets and yields lower correlations than random clustering. These results suggest that the level of genetic relationship between the training and validation data influences the prediction ability of genomic selection methods and the accuracy of GEBV. The prediction abilities of GBLUP and the Bayesian methods within k-means and random clusters were quite similar for growth and carcass traits; however, the Bayesian methods overestimated the accuracies of GEBV because of their lower estimates of heritability, whereas GBLUP resulted in more reasonable accuracies of GEBV for growth and carcass traits collected from Brangus heifers.