Article

Weighted GBLUP in Simulated Beef Cattle Populations: Impact of Reference Population, Marker Density, and Heritability

1
College of Animal Science and Technology, Inner Mongolia Agricultural University, Hohhot 010018, China
2
Key Laboratory of Animal Genetics, Breeding and Reproduction of the Inner Mongolia Autonomous Region, College of Animal Science and Technology, Inner Mongolia Agricultural University, Hohhot 010018, China
*
Author to whom correspondence should be addressed.
Animals 2025, 15(8), 1118; https://doi.org/10.3390/ani15081118
Submission received: 13 March 2025 / Revised: 8 April 2025 / Accepted: 11 April 2025 / Published: 12 April 2025
(This article belongs to the Section Cattle)

Simple Summary

Genomic selection (GS) enhances breeding efficiency by integrating genomic data with pedigree information and phenotypes. Its effectiveness varies among livestock species, and beef cattle pose particular challenges because of their breed diversity. This study therefore used simulation to evaluate how heritability, marker density, and selection design affect the accuracy of genomic prediction (GP) in multiple beef cattle breeds. It compared the predictive accuracy of pedigree-based BLUP (PBLUP), genomic BLUP (GBLUP), and weighted genomic BLUP (wGBLUP) in simulated populations, with the goal of improving the accuracy of GP in beef cattle across different genetic backgrounds. We found that the wGBLUP method can significantly enhance the accuracy of GP. These findings are crucial for optimizing GS in beef cattle breeding.

Abstract

Genomic selection (GS) is a technique that integrates genomic data, pedigree information, and individual phenotypes to enhance genetic improvements of economically important traits in livestock. While it has shown significant effects in dairy cattle, its efficacy in beef cattle is lower due to breed diversity and differences in reproductive structures. Therefore, this study evaluated the impact of heritability levels, marker densities, and assessment methods (such as pedigree-based BLUP, genomic BLUP, and weighted genomic BLUP) on genomic prediction accuracy across multiple beef cattle breeds through simulations. Three beef cattle populations were simulated with heritability levels set at 0.3, 0.5, and 0.7 and marker densities set at 50 k and 770 k. The results showed that the predictive accuracy of PBLUP and GBLUP increased with higher heritability and larger reference populations. Increasing the marker density also improved the accuracy of genomic predictions; even a low marker density (50 k SNP) can significantly enhance the accuracy of genetic evaluation, although the size of the reference population needs to be optimized according to population structure, heritability, and the genetic architecture of the trait. Overall, integrating pedigree, genomic, and weighted SNP information can significantly improve the precision of GEBV prediction and reduce bias. In particular, the wGBLUP method demonstrated an improvement in the prediction accuracy of low-heritability traits in small but high-density marker populations.

1. Introduction

In the field of animal breeding, the estimation of breeding values (EBVs) typically relies on phenotypic data obtained from half-sib, full-sib, and progeny testing. However, inaccuracies in pedigree information, such as errors or omissions in individual identification, can lead to a loss of accuracy in EBV predictions [1]. Despite this, genetic improvements in livestock have progressed to the point where all available phenotypic and pedigree information can be used for evaluations, with the best linear unbiased prediction (BLUP) method calculating EBVs by correcting for environmental effects on both pedigree and phenotype [2]. This has significantly propelled genetic improvements. However, pedigree-based genetic evaluations often overlook Mendelian sampling effects, which can lead to the over- or underestimation of EBVs within families [3,4]. Sullivan emphasized that EBV assessments that do not account for Mendelian sampling can result in biased outcomes, which, in the long term, can undermine the effectiveness of genetic improvements and potentially increase the risk of unidentified deleterious genetic variations [5]. Fortunately, with the advancement of genomic technologies, particularly the development of genomic selection (GS), we are able to estimate genomic breeding values (GEBVs) more accurately by analyzing phenotypes and genotypes of single-nucleotide polymorphisms (SNP) spread across the genome [6,7]. The GS method can more truthfully estimate the kinship between individuals, and with the reduction in genotyping costs, GS is being increasingly applied in the livestock industry, significantly enhancing the accuracy of animal breeding [8]. Studies have shown that even in the presence of pedigree errors, using genomic information can improve the accuracy of GEBVs, as observed in research on Hanwoo cattle [9]. 
By applying GS to livestock populations, we can achieve more precise breeding value estimations for each individual, enabling more refined individual selection.
Genomic selection (GS) has become a powerful tool for improving the genetic gains of economically important traits in livestock breeding, thanks to the development of high-throughput and cost-effective genotyping technologies. By combining genomic data, pedigree information, and individual phenotypic performance, GS can accurately estimate individual genomic estimated breeding values (GEBVs), thereby enhancing the genetic gains of economic traits in livestock breeding. The predictive accuracy of GEBVs is crucial for livestock genetic evaluations, but its precision is influenced by various factors, including prediction methods [10,11,12], training population size [3,13], heritability [14,15], and marker density [16]. Although GS has achieved significant success in dairy cattle breeding, its implementation still faces challenges in other livestock such as beef cattle, due to differences in breed diversity, reproductive structure, and economic characteristics. Therefore, innovative methods are essential for the evolution of animal breeding. In livestock genetic evaluation studies, research has been conducted on three models—GBLUP, wGBLUP, and BayesR—to enhance the genomic prediction accuracy of Hanwoo carcass traits using gene expression and whole-genome sequence information. wGBLUP optimizes predictions by assigning weights to each SNP in the GRM based on its effect estimate, but studies have found its role in enhancing predictive accuracy to be limited. This may be due to the insufficient association between the preselected SNP and the causal variants that actually control the traits [17]. Zhao et al. 
[18] constructed a weighted GRM considering the heterogeneity of the minor allele frequency (MAF) of SNPs across different populations in large white pigs, and the results showed that wGBLUP could enhance the accuracy of joint genomic predictions for small populations with different genetic backgrounds, but its advantage diminished in large populations due to increased genetic diversity. Romé et al. [19] compared the accuracy of four BLUP models in predicting the breeding values for body weight in commercial broilers and found that the wGBLUP model, which uses GWAS information to weight SNPs, could improve accuracy by 2% to 7% over the GBLUP model in some cases. However, due to the complexity of genetic structure and the accuracy of SNP effect estimation, this advantage of the wGBLUP model is not always significant. Furthermore, in genetic evaluation studies of aquaculture species, Song et al. found that wGBLUP improved the predictive accuracy by an average of 1.5% compared to traditional GBLUP methods across four aquaculture species, demonstrating its potential to enhance genomic prediction accuracy. From these studies [20], it is shown that the advantage of wGBLUP mainly lies in assigning weights to each SNP in the genomic relationship matrix to optimize predictions, potentially increasing predictive accuracy. Although the wGBLUP method has significant theoretical advantages, its performance still varies across different cattle populations or the improvement is not significant, especially in populations with complex genetic backgrounds and environmental factors, possibly due to the following: (1) the scale and structure of the reference population, which may limit the model’s accuracy; and (2) the estimation of SNP effects, which may be influenced by the population structure and genetic background, potentially limiting the advantage of the weighted approach. 
Although wGBLUP offers a way to improve predictive accuracy, its effectiveness depends on several factors, and further research is needed to optimize its application. We therefore compared prediction accuracy across reference population sizes, heritability levels, and chip densities in several beef cattle populations under various scenarios.
Genomic selection plays a central role in genetic improvement, and its accuracy is influenced by various factors such as genetic architecture, marker density, and kinship. Through simulation studies, we can rapidly assess the impact of these factors on the accuracy of genomic selection under low-cost and repeatable conditions, especially for the assessment of long-term selection effects, which is difficult to achieve with real data. Simulation studies have revealed the key role of core animals in genomic selection and how it can improve the accuracy of GS when data are limited. For instance, genomic predictions in multi-breed and purebred cattle [21], the accuracy of genomic selection in populations [22], and comparative studies of different genomic selection methods provide a scientific basis for improving the accuracy of breeding objectives and economic benefits [23]. Therefore, the main objectives of this study are as follows: (1) to investigate the effects of different levels of heritability, marker densities, and selection designs on the accuracy of genomic predictions in different beef cattle breeds, (2) to evaluate the predictive accuracy of various methods (including PBLUP, GBLUP, and wGBLUP) in simulated populations, and (3) to determine the optimal reference population size for genomic selection in different beef cattle breeds. We aim to reveal the performance differences of the wGBLUP method across different cattle populations and identify the key factors for optimizing genomic predictions by comparing different reference population sizes, heritability levels, and marker densities.

2. Materials and Methods

2.1. Data Simulation

Three different beef cattle breeds were simulated using QMSim [24]. In most reported simulation studies, just five repetitions of the simulation were conducted due to computational time and storage requirements. The initial historical populations consisted of 10,000, 5000, and 1000 individuals, respectively, and evolved over 1000 generations. At the 500th generation, the population sizes were changed from 10,000 to 1000, from 5000 to 3000, and from 1000 to 4000, respectively. By the 1000th generation, the population sizes had stabilized at 7120 breeding individuals, with equal sex ratios, no overlapping generations, random mating, no selection, and no migration, in order to create initial linkage disequilibrium (LD) and establish mutation–drift equilibrium in the historical generations. Selection designs were based on phenotypic performance and BLUP (EBV) approaches to create the breeds (i.e., recent populations) or distant lines from this population. Each dam had only one progeny. Within each breed, animals descended from the last historical generation were randomly mated for 10 generations without artificial selection, using different mating designs and different male and female replacement ratios. Some breeds have different initial animal numbers, which results in slightly different breed sizes and therefore different effective population sizes (Ne). To save data space without compromising computational speed, only the genotype and phenotype data of animals from generations 9 and 10 were simulated, with phenotypes derived from a standard normal distribution with a mean of 0 and a variance of 1, and with an overall mean as the only fixed effect. Simulations were conducted for three different beef cattle breeds, with heritability set at low, medium, and high levels, corresponding to 0.3, 0.5, and 0.7, respectively.
Based on the bovine genome assembly version ARS-UCD1.2 published by Ensembl (<https://jul2023.archive.ensembl.org/Bos_taurus/Info/Index> accessed on 7 November 2024), the 29 pairs of autosomes of the beef cattle genome were simulated with a total length of 2715.85 centimorgans (cM). The aim was to create a more realistic scenario with respect to the true distances between markers and QTL. SNP markers were randomly generated and uniformly distributed. Two densities of biallelic markers were formed from different initial numbers of markers, namely 50 k and 770 k, with the number of SNPs on each chromosome being proportional to its size. The effects of markers on traits were neutral. Across the whole genome, 25 quantitative trait loci (QTL) were evenly distributed on each chromosome, making up a total of 725 QTL. QTL effects were randomly sampled from a gamma distribution with a shape parameter equal to 0.4. Over the 1000 generations of the historical population, the mutation rate for QTLs and SNPs was $2.5 \times 10^{-5}$, with a “recurrent” SNP mutation pattern, meaning that mutations only switched between existing alleles without producing new allele types. A summary of the parameters used throughout the simulation process is presented in Table 1.
The genotyped animals consisted of all individuals from generations 9 and 10. The differences in the number of animals in generations 9 and 10 led to slightly different numbers of genotyped individuals per breed (Table 1). The simulation was replicated four times and the entire process is visually explained in Figure 1, including the total number of animals (genotyped or not) in generations 9 and 10 which underwent selection.
A single trait was simulated with heritabilities of 0.3, 0.5, and 0.7, each with a phenotypic variance of 1.0. The true breeding value (TBV) of each animal was calculated as the sum of the additive effects of its QTL, as follows:
$$TBV_k = \sum_{j=1}^{\mathrm{qtl}} \beta_j \cdot Q_{kj}, \tag{1}$$
where $\beta_j$ is the additive effect of QTL $j$, and $Q_{kj}$ is the QTL genotype of individual $k$ at locus $j$, coded as 0, 1, or 2 according to the number of copies of a specified QTL allele carried by the individual. The phenotypes ($y_i$) were simulated by adding to the TBV a residual term sampled as $\varepsilon_i \sim N(0, \sigma_e^2)$, where $\sigma_e^2$ is the residual variance.
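As a rough illustration of Equation (1), the following NumPy sketch draws QTL genotypes, samples effects from a gamma distribution with shape 0.4, and adds a normal residual. It is not the QMSim procedure: the uniform allele frequencies, the random effect signs, the rescaling of effects to reach the target heritability, and the function name `simulate_tbv_and_phenotype` are all assumptions made for the example.

```python
import numpy as np

def simulate_tbv_and_phenotype(n_animals, n_qtl, h2, seed=0):
    """Toy simulation of TBV_k = sum_j beta_j * Q_kj plus a normal residual."""
    rng = np.random.default_rng(seed)
    # QTL genotypes coded 0/1/2; the allele frequencies are hypothetical.
    freqs = rng.uniform(0.1, 0.9, size=n_qtl)
    Q = rng.binomial(2, freqs, size=(n_animals, n_qtl))
    # Effect sizes from a gamma distribution (shape 0.4, as in the text);
    # the random sign is an assumption of this sketch.
    beta = rng.gamma(shape=0.4, scale=1.0, size=n_qtl)
    beta *= rng.choice([-1.0, 1.0], size=n_qtl)
    # Rescale effects so that Var(TBV) = h2 (assumption), which gives a
    # phenotypic variance of about 1 when the residual variance is 1 - h2.
    beta *= np.sqrt(h2) / (Q @ beta).std()
    tbv = Q @ beta
    residual = rng.normal(0.0, np.sqrt(1.0 - h2), size=n_animals)
    return Q, beta, tbv, tbv + residual

Q, beta, tbv, y = simulate_tbv_and_phenotype(n_animals=1000, n_qtl=725, h2=0.3)
print(round(float(tbv.var()), 3), round(float(y.var()), 3))
```

The realized phenotypic variance fluctuates around 1 because the residuals are sampled rather than rescaled.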
The EBVs were estimated for all individuals in the current population from generations 9 to 10 based on phenotypic values and pedigree data. The best linear unbiased prediction (BLUP) of breeding values was obtained by Henderson’s [25] mixed linear model. The BLUP predictor has the smallest prediction error variance among all possible linear unbiased predictors. The numerator relationship matrix (A) is used in the following mixed model equations to derive BLUP of random additive effects (including polygenes and QTL):
$$\left( Z'Z + A^{-1} \frac{\sigma_e^2}{\sigma_a^2} \right) \hat{a} = Z'y, \tag{2}$$
where $y$ is the vector of phenotypic records, $Z$ is the incidence matrix relating the records to the random additive effects ($a$), $\sigma_e^2$ is the residual variance, and $\sigma_a^2$ is the additive genetic variance. The mixed model equations are solved by the conjugate gradient method.
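To make the solution step concrete, here is a minimal sketch of a textbook conjugate gradient solver applied to the system in Equation (2). It is an illustration under stated assumptions, not QMSim's implementation: the function names, the tolerance, and the tiny three-animal relationship matrix are hypothetical.

```python
import numpy as np

def conjugate_gradient(C, rhs, tol=1e-10, max_iter=1000):
    """Solve C x = rhs for a symmetric positive definite matrix C."""
    x = np.zeros(len(rhs))
    r = rhs - C @ x          # residual of the linear system
    p = r.copy()             # search direction
    rs_old = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs_old) < tol:
            break
        Cp = C @ p
        step = rs_old / (p @ Cp)
        x += step * p
        r -= step * Cp
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

def blup_additive_effects(y, Z, A_inv, sigma_e2, sigma_a2):
    """Solve (Z'Z + A^{-1} * sigma_e2 / sigma_a2) a_hat = Z'y."""
    lhs = Z.T @ Z + A_inv * (sigma_e2 / sigma_a2)
    return conjugate_gradient(lhs, Z.T @ y)

# Hypothetical example: animals 1 and 2 are related, animal 3 is unrelated.
A = np.array([[1.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 1.0]])
y = np.array([1.0, -0.5, 2.0])
print(blup_additive_effects(y, np.eye(3), np.linalg.inv(A), 0.7, 0.3))
```

For systems of realistic size, the appeal of conjugate gradient is that it only needs matrix-vector products, so the left-hand side never has to be inverted explicitly.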
Genotype data were preprocessed as follows. Using PLINK v1.9 [26], SNPs were further screened with the following criteria: MAF < 0.05, genotype call rate < 0.10, individual call rate < 0.10, and Hardy–Weinberg equilibrium P-value < $1 \times 10^{-5}$. After genotype quality control, a total of 58,990 and 777,962 segregating SNPs with MAF greater than 0.05 were retained for subsequent analysis. The simulation assumes that the QTL allele effects are the same across all breeds, but the frequencies of QTL alleles vary among breeds, leading to differences in the QTL variance for each breed. The maximum QTL variance in generations 9 and 10 did not exceed 0.02 for any breed.

2.2. Model and Analysis

In this study, statistical methods used to estimate breeding values include the traditional pedigree-based best linear unbiased prediction (PBLUP) method, the genomic best linear unbiased prediction (GBLUP) based on the genomic relationship matrix, and the weighted genomic best linear unbiased prediction (wGBLUP) based on genomic information.

2.2.1. Pedigree-Based Best Linear Unbiased Prediction (PBLUP)

When estimating parameters using pedigree information, a mixed model is employed for PBLUP [27], which is constructed using the HIBLUP_1.4.0 software [28]. The model equation is as follows:
$$y = Xb + Zu + e \tag{3}$$
$$\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z + \lambda A^{-1} \end{bmatrix} \begin{bmatrix} \hat{b} \\ \hat{u} \end{bmatrix} = \begin{bmatrix} X'y \\ Z'y \end{bmatrix}, \tag{4}$$
where $y$ is the vector of phenotypic values; $X$ is the design matrix relating the observations to the fixed effects; $b$ is the vector of fixed effects, including sex, generation, polygene effect, and sire number; $Z$ is the design matrix relating the phenotypic values to the random effects; $u$ is the vector of random effects with a normal distribution; $e$ is the vector of residual effects with a normal distribution, $e \sim N(0, I\sigma_e^2)$; $\sigma_e^2$ is the residual variance; and $I$ is the identity matrix. $\lambda = \sigma_e^2 / \sigma_a^2$. $A$ is the numerator relationship matrix (NRM) constructed from the pedigree information.
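The two ingredients of PBLUP, the pedigree relationship matrix and the mixed model equations, can be sketched as follows. This is an illustrative toy, not the HIBLUP implementation: the function names, the use of the tabular method, the convention that parents are indexed before their offspring with -1 for unknown parents, and the five-animal pedigree are all assumptions of the example.

```python
import numpy as np

def numerator_relationship_matrix(sires, dams):
    """Tabular method. Animals are 0..n-1, parents come first, -1 = unknown."""
    n = len(sires)
    A = np.zeros((n, n))
    for i in range(n):
        s, d = sires[i], dams[i]
        # Diagonal: 1 + inbreeding coefficient = 1 + 0.5 * a(sire, dam).
        A[i, i] = 1.0 + (0.5 * A[s, d] if s >= 0 and d >= 0 else 0.0)
        for j in range(i):
            a_js = A[j, s] if s >= 0 else 0.0
            a_jd = A[j, d] if d >= 0 else 0.0
            A[i, j] = A[j, i] = 0.5 * (a_js + a_jd)
    return A

def solve_mme(y, X, Z, K_inv, lam):
    """Build and solve the mixed model equations for b_hat and u_hat."""
    lhs = np.vstack([
        np.hstack([X.T @ X, X.T @ Z]),
        np.hstack([Z.T @ X, Z.T @ Z + lam * K_inv]),
    ])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    sol = np.linalg.solve(lhs, rhs)
    n_fixed = X.shape[1]
    return sol[:n_fixed], sol[n_fixed:]

# Hypothetical pedigree: 0 and 1 are founders, 2 and 3 are full sibs,
# and 4 is the inbred offspring of 2 and 3.
sires = [-1, -1, 0, 0, 2]
dams = [-1, -1, 1, 1, 3]
A = numerator_relationship_matrix(sires, dams)
y = np.array([1.0, 2.0, 1.5, 2.5, 3.0])
X = np.ones((5, 1))          # overall mean as the only fixed effect here
Z = np.eye(5)                # one record per animal
b_hat, u_hat = solve_mme(y, X, Z, np.linalg.inv(A), lam=0.7 / 0.3)
print(b_hat, u_hat)
```

A dense inverse of `A` is used only because the example is tiny; large evaluations typically build the inverse directly from the pedigree instead.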

2.2.2. Genomic-Based Best Linear Unbiased Prediction (GBLUP)

For parameter estimation using genomic information, GBLUP [29] employs a general linear mixed model, which was fitted using the HIBLUP software. Although GBLUP and PBLUP use the same fixed effects, GBLUP uses a genomic relationship matrix (GRM) built from SNP markers to estimate GEBVs. The expression for the GRM is as follows [30]:
$$G = \frac{MM'}{\sum_{i=1}^{m} 2 p_i (1 - p_i)}, \tag{5}$$
where $M$ is the matrix of individual genotypes (with homozygotes, heterozygotes, and alternative homozygotes coded as 0, 1, and 2, respectively), $m$ is the total number of SNP markers, and $p_i$ is the allele frequency at the $i$-th SNP.
Because only additive genetic effects are modeled, the variance of the genetic effects is as follows:
$$\mathrm{Var}(g) = G\sigma_g^2, \tag{6}$$
The general linear mixed model and the corresponding mixed model equations for GBLUP are as follows [31]:
$$y = Xb + Zg + e, \tag{7}$$
$$\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z + \lambda G^{-1} \end{bmatrix} \begin{bmatrix} \hat{b} \\ \hat{g} \end{bmatrix} = \begin{bmatrix} X'y \\ Z'y \end{bmatrix}, \tag{8}$$
where $y$ is the vector of phenotypes, $X$ is the matrix associating fixed effects with each animal, $b$ is the vector of fixed effects, $Z$ is the design matrix allocating records to genetic values, $g$ is the vector of additive genetic effects for individuals, $G$ is the genomic relationship matrix, $e$ is the vector of residual effects with a normal distribution, $e \sim N(0, I\sigma_e^2)$, and $\sigma_e^2$ is the residual variance. $\lambda = \sigma_e^2 / \sigma_g^2$.
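A short NumPy sketch of the GRM in Equation (5) is given below. It is an illustration rather than the HIBLUP code. One assumption is labeled explicitly: the widely used VanRaden formulation subtracts twice the allele frequency from each genotype column before forming the cross-product, whereas the text describes the genotype matrix only as 0/1/2 codes, so centering is exposed as an option. The function name and the toy genotypes are hypothetical.

```python
import numpy as np

def genomic_relationship_matrix(M, center=True):
    """G = W W' / sum_i 2 p_i (1 - p_i), with genotypes coded 0/1/2.

    center=True uses W = M - 2p (VanRaden-style centering, an assumption
    of this sketch); center=False uses the raw 0/1/2 codes.
    """
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0             # frequency of the counted allele
    W = M - 2.0 * p if center else M
    scale = np.sum(2.0 * p * (1.0 - p))  # expected heterozygosity summed over SNPs
    return W @ W.T / scale

# Hypothetical genotypes: 4 animals (rows) by 3 SNPs (columns).
M = np.array([[0, 1, 2],
              [1, 1, 0],
              [2, 0, 1],
              [1, 2, 1]])
G = genomic_relationship_matrix(M)
print(np.round(G, 3))
```

The resulting matrix takes the place of the pedigree relationship matrix in the mixed model equations. In practice a small value is often added to the diagonal before inversion, because a centered G is singular.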

2.2.3. Weighted Genomic-Based Best Linear Unbiased Prediction (wGBLUP)

In the weighted GBLUP, the model and inference are the same, except a different SNP effect vector is used when constructing the GRM. In the SLEMM-0.90.1 software [32] developed by Jiang et al., two schemes for optimizing genomic prediction by SNP weighting are provided, namely (1) based on MAF dependence of SNP effect sizes; and (2) based on the SNP effect estimates with weight W equal to the identity matrix.
SLEMM fits the following linear mixed model:
$$y = X\beta + Z\alpha + e, \quad \alpha \sim N(0, W\sigma_\alpha^2), \quad e \sim N(0, R\sigma_e^2), \tag{9}$$
where $y$ is a vector of phenotypes for a quantitative trait, $\beta$ is a vector of fixed effects including the mean, $X$ is the design matrix for $\beta$, $\alpha$ is a vector of SNP effects with a diagonal covariance matrix $W\sigma_\alpha^2$, $Z$ is a matrix of standardized genotypes, and $e$ is a vector of residuals with a diagonal covariance matrix $R\sigma_e^2$. $R$ is usually equal to an identity matrix, and the diagonal elements of $W$ are weights representing the relative contribution of each SNP to the genetic variance; that is, $W_{jj}$ represents the contribution of SNP $j$ to the genetic variance.
Due to LD, the effect of a QTL can be captured by nearby SNP loci, and the effects of adjacent SNP tend to be similar in model fitting. Therefore, the second SNP weighting scheme is used in this study, with the formula as follows:
$$W_{jj} = C \cdot \frac{1}{2S+1} \sum_{k=j-S}^{j+S} \hat{\alpha}_k^2, \tag{10}$$
where $C$ is a scaling constant that sets the mean weight to 1, $S$ is the number of SNPs on each side of SNP $j$, and $\hat{\alpha}_k$ is the estimate of the effect of the $k$th SNP in an existing BLUP with $W$ equal to the identity matrix. This specification of the $j$th SNP's weight borrows information from a window of $2S+1$ SNPs. SLEMM first fits model (9) with training data, where $W$ is equal to the identity matrix, and then fits it with $W$ computed by Equation (10).
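The window-based weight in Equation (10) can be sketched in a few lines of NumPy. This is not SLEMM's code: the function name is hypothetical, all SNPs are treated as lying on one chromosome, and windows are simply truncated at the ends of the SNP list while still dividing by 2S+1, which are assumptions of the example.

```python
import numpy as np

def window_weights(alpha_hat, S):
    """Weight of SNP j = C * (1 / (2S + 1)) * sum of squared effects in its window."""
    a2 = np.asarray(alpha_hat, dtype=float) ** 2
    m = a2.size
    raw = np.empty(m)
    for j in range(m):
        lo, hi = max(0, j - S), min(m, j + S + 1)   # window truncated at the ends
        raw[j] = a2[lo:hi].sum() / (2 * S + 1)
    # C is chosen so that the mean weight equals 1.
    return raw / raw.mean()

# Hypothetical first-pass SNP effect estimates (from a fit with W = I).
alpha_hat = [1.0, 0.0, 0.0, 2.0, 0.0]
print(window_weights(alpha_hat, S=1))
```

SNPs near a large estimated effect receive weights above 1, which is how the scheme borrows information from neighbors in LD with the same QTL. The sketch assumes at least one nonzero effect estimate.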

2.3. Accuracy of Genomic Prediction

Because the true breeding values (TBVs) of the individuals were given directly by the simulation, the prediction accuracy can be calculated as the correlation between TBV and EBV:
$$\text{Accuracy} = \mathrm{Corr}(TBV, EBV), \tag{11}$$
where Corr is the Pearson correlation, which indicates the strength of the linear relationship between TBV and EBV. In general, it ranges from −1 to 1; the accuracies reported here lie between 0 and 1, with values closer to 1 indicating more accurate prediction.
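The accuracy measure can be computed directly with NumPy's `corrcoef`. The sketch below uses made-up TBV and EBV vectors; the function name `prediction_accuracy` is hypothetical.

```python
import numpy as np

def prediction_accuracy(tbv, ebv):
    """Pearson correlation between true and estimated breeding values."""
    return float(np.corrcoef(tbv, ebv)[0, 1])

# Hypothetical values for five animals.
tbv = np.array([0.2, -0.5, 1.1, 0.0, 0.7])
ebv = np.array([0.1, -0.3, 0.8, 0.2, 0.5])
print(prediction_accuracy(tbv, ebv))
```

A Pearson correlation is bounded by −1 and 1, so a predictor that ranked animals in reverse order would return a negative value.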

3. Results

3.1. Genomic Prediction Accuracy Across All Scenarios Based on Pedigree Information

The accuracy of genomic prediction using the PBLUP method, which is based on pedigree information, was evaluated for the three breeds across different scenarios (marker densities of 50 k and 770 k; heritabilities (h2) of 0.3, 0.5, and 0.7; and varying reference population (RP) sizes). The PBLUP-predicted accuracies for breeds A, B, and C at a heritability of 0.3 were between 0.56 and 0.64 (Figure 2), 0.54 and 0.71, and 0.59 and 0.66, respectively. At a heritability of 0.5, they were between 0.58 and 0.65 (Figure 3), 0.67 and 0.75, and 0.61 and 0.70, respectively. At a heritability of 0.7, they were between 0.59 and 0.67 (Figure 4), 0.69 and 0.77, and 0.62 and 0.68, respectively. Across the three levels of heritability, a higher heritability increased the accuracy of PBLUP predictions (Supplementary Figures S1 and S2).
The prediction accuracy of PBLUP increases significantly with the size of the training population (RP) under low and medium heritability, but it does not increase, and even decreases, under high heritability (Table 2). Under low heritability, the lowest accuracy is obtained when the training population size is 5000, and the highest when it is 15,000 individuals. Under high heritability, the highest accuracy is obtained when the training population size is 5000, and the lowest when it is 15,000 individuals. In addition, the prediction accuracy obtained with pedigree information increases with marker density. For example, for breed A at a population size of 8000 and a medium level of heritability (h2 = 0.5), the accuracies with 50 k and 770 k marker densities are 0.5909 and 0.6338, respectively, which is an increase of 4.29 percentage points; for breed B at a population size of 12,000 and a high level of heritability (h2 = 0.7), the accuracies with 50 k and 770 k marker densities are 0.6956 and 0.7781, respectively, which is an increase of 8.25 percentage points; and for breed C at a population size of 15,000 and a low level of heritability (h2 = 0.3), the accuracies with 50 k and 770 k marker densities are 0.6162 and 0.6587, respectively, which is an increase of 4.25 percentage points. Therefore, under the PBLUP method, the prediction accuracy with a 770 k marker density is higher than that with 50 k.
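The increases quoted in this paragraph equal the plain difference between the two accuracies (for example, 0.6338 − 0.5909 = 0.0429), which is not the same as the gain relative to the 50 k value. The short sketch below computes both from the accuracies reported above; the function name `accuracy_gain` is hypothetical.

```python
def accuracy_gain(acc_low, acc_high):
    """Return (difference in percentage points, relative gain in percent)."""
    diff = acc_high - acc_low
    return 100.0 * diff, 100.0 * diff / acc_low

# Accuracies at 50 k and 770 k as reported in the text for PBLUP.
reported = {"A": (0.5909, 0.6338), "B": (0.6956, 0.7781), "C": (0.6162, 0.6587)}
for breed, (acc_50k, acc_770k) in reported.items():
    points, relative = accuracy_gain(acc_50k, acc_770k)
    print(f"breed {breed}: {points:.2f} points, {relative:.2f}% relative")
```

Because all accuracies are below 1, the relative gain is always larger than the difference expressed in percentage points.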

3.2. Genomic Prediction Accuracy Across All Scenarios Based on Genomic Information

Assessments of the genomic prediction accuracy for the three breeds (A, B, and C) using the GBLUP method based on genomic information are shown at different marker densities of 50 k and 770 k, heritability (h2) levels of 0.3, 0.5, and 0.7, and training population (RP) sizes ranging from 5000 to 15,000 (Figure 2, Figure 3 and Figure 4). The GBLUP prediction accuracy using genomic information is higher than the accuracy of the PBLUP method using pedigree information. The prediction accuracies for breeds A, B, and C using GBLUP are from 0.61 to 0.70, 0.65 to 0.73, and 0.63 to 0.66 at a heritability of 0.3, from 0.70 to 0.76, 0.74 to 0.79, and 0.69 to 0.71 at a heritability of 0.5, and from 0.74 to 0.79, 0.78 to 0.82, and 0.70 to 0.74 at a heritability of 0.7, respectively. These results further indicate that, compared to PBLUP, the prediction accuracy of GBLUP increases by 4.77%, 6.85%, 8.95%, and 10.19% for breed A as the training population size increases from 5000 to 15,000 at a heritability of 0.3 and a marker density of 50 k (the prediction accuracies for breeds B and C also improve, but the details are not listed here).
The prediction accuracy of GBLUP also significantly increases with the enlargement of the training population (RP) size. The lowest accuracy is calculated when the training population is 5000, and the highest accuracy is calculated when the size is 15,000 individuals. In contrast, the genomic prediction accuracy under the genomic information does not increase with the increase in marker density, and even if there is an increase, it is not significant. For example, for breed A at a population size of 8000 and medium level of heritability (h2 = 0.5), the accuracies with 50 k and 770 k marker densities are 0.7214 and 0.7204, respectively, with a difference of 0.001; for breed B at a population size of 12,000 and a high level of heritability (h2 = 0.7), the accuracies with 50 k and 770 k marker densities are 0.8008 and 0.8106, respectively, with a difference of 0.0098; for breed C at a population size of 15,000 and a low level of heritability (h2 = 0.3), the accuracies with 50 k and 770 k marker densities are 0.6648 and 0.6643, respectively, with a difference of 0.0005. When the marker density is 50 k, the heritability is 0.7, and the training population size is 15,000, the highest genomic prediction accuracy is observed. Therefore, in GBLUP, a marker density of 50 k can demonstrate good prediction accuracy (Supplementary Table S1).
Figure 3. Prediction accuracy of genomic estimated breeding values (GEBVs) for simulated traits with a heritability of 0.5 in breed A using different evaluation methods: PBLUP, GBLUP, or wGBLUP. The X-axis represents the number of animals in the reference population, while the Y-axis indicates the predicted accuracy of GEBVs for the simulated traits. The blue and red lines correspond to marker densities of 50 k and 770 k, respectively.

3.3. Genomic Prediction Accuracy Across All Scenarios Based on SNP Weighting

Assessments of genomic prediction accuracy for three breeds (A, B, and C) using the SNP-weighted wGBLUP method are shown across different marker densities of 50 k and 770 k, heritability (h2) levels of 0.3, 0.5, and 0.7, and training population (RP) sizes ranging from 5000 to 15,000 (Figure 2, Figure 3 and Figure 4). The prediction accuracy of wGBLUP using SNP weighting was higher than that of PBLUP, which uses pedigree information, and GBLUP, which uses genomic information. For breeds A, B, and C, the prediction accuracies using wGBLUP were from 0.63 to 0.72, 0.61 to 0.76, and 0.70 to 0.72 at a heritability of 0.3, from 0.68 to 0.76, 0.76 to 0.86, and 0.74 to 0.85 at a heritability of 0.5, and from 0.73 to 0.81, 0.79 to 0.88, and 0.77 to 0.90 at a heritability of 0.7, respectively. Under a heritability of 0.3 and a marker density of 50 k, as the training population size increased from 5000 to 15,000, the prediction accuracy of wGBLUP compared to PBLUP for breed C increased by 10.22%, 10.2%, 10.45%, and 9.64%, respectively. Compared to GBLUP, the prediction accuracy of wGBLUP under the same conditions for breed C increased by 8.92%, 7.95%, 6.51%, and 4.78%, respectively. These results further indicate that the wGBLUP method has a higher prediction accuracy than both the PBLUP and GBLUP methods (Supplementary Figures S3 and S4).
Figure 4. Prediction accuracy of genomic estimated breeding values (GEBVs) for simulated traits with a heritability of 0.7 in breed A using different evaluation methods: PBLUP, GBLUP, or wGBLUP. The X-axis represents the number of animals in the reference population, while the Y-axis indicates the predicted accuracy of GEBVs for the simulated traits. The blue and red lines correspond to marker densities of 50 k and 770 k, respectively.
Unlike that of PBLUP and GBLUP, the prediction accuracy of wGBLUP does not increase significantly with the training population (RP) size in every scenario. When the heritability is 0.3, the prediction accuracy of wGBLUP increases significantly with RP size. However, when the heritability is 0.5 or 0.7, the prediction accuracy of wGBLUP shows no increasing trend, or even decreases, as RP size grows. For instance, with a heritability of 0.3 and a marker density of 50 k, the prediction accuracy of breed B is the lowest (0.61) when the population size is 5000, but the highest (0.73) when the population size is 15,000. With a heritability of 0.7 and a marker density of 50 k, the prediction accuracy of breed B is the highest (0.86) when the population size is 5000, but the lowest (0.79) when the population size is 15,000 (Supplementary Figures S5 and S6).
Additionally, the genomic prediction accuracy with SNP weighting increases as the marker density increases. For instance, for breed A at a population size of 8000 and medium heritability (h2 = 0.5), the accuracies for 50 k and 770 k marker densities are 0.6919 and 0.7360, respectively, an improvement of 4.41 percentage points; for breed B at a population size of 5000 and a low level of heritability (h2 = 0.3), the accuracies for 50 k and 770 k marker densities are 0.6053 and 0.7441, respectively, an increase of 13.88 percentage points; and for breed C at a population size of 12,000 and a high level of heritability (h2 = 0.7), the accuracies for 50 k and 770 k marker densities are 0.7750 and 0.8453, respectively, an increase of 7.03 percentage points. Thus, in the wGBLUP method, the prediction accuracy with 770 k markers is superior to that with 50 k markers (Supplementary Table S2).

4. Discussion

Due to the varying patterns of linkage disequilibrium (LD) decay across different cattle breeds, it is essential to investigate the factors affecting the accuracy of genomic predictions within diverse populations [33]. In our study, we examined the changes in the accuracy of GEBV under various selection scenarios, evaluation methods, reference population sizes, heritability levels, and marker densities, using three simulated beef cattle populations as our subjects. Breeding values are frequently employed for selecting superior individuals within a population. Among the factors considered, the evaluation method used is a potential influence on the accuracy of GEBV predictions. Across different reference population sizes, heritability levels, and marker densities, the prediction accuracy of wGBLUP was higher than that of PBLUP and GBLUP, indicating the advantage of wGBLUP over PBLUP and GBLUP. This may be attributed to the use of an SNP effect vector in the construction of the GRM, which is REML-estimated based on the stochastic Lanczos algorithm for genomic variance component estimation, thereby enhancing the accuracy of predictions through SNP weighting [34]. This approach can also be optimized for different datasets and research objectives by adjusting the window size and number of iterations. Our findings are consistent with those of Zhang et al. [35], who reported a higher prediction accuracy with wGBLUP compared to GBLUP in simulated populations.
Numerous studies have demonstrated that the size of the reference population significantly influences the accuracy of genomic selection. Generally, larger reference populations enhance the predictive accuracy of GEBVs by providing a richer source of genomic information, which allows the GRM to more accurately reflect the genetic relationships among individuals. Conversely, smaller population sizes may prevent the GRM from adequately capturing genetic variation, thereby reducing the prediction accuracy. Therefore, to improve the accuracy of GEBVs, it is advisable to use a large reference population in genomic selection whenever possible. However, considering the economic costs of whole-genome sequencing, it is necessary to find an optimal balance between population size and cost-effectiveness—the ideal reference population size. Brito et al. [36], based on previous research conclusions, used a large reference population (~15,000 animals) based on a high-density SNP chip to predict the accuracy of GEBV for growth, carcass, and meat quality traits. The results emphasized the importance of reference population size in genomic selection and the need to consider the balance between cost and prediction accuracy in practical applications. In addition, this study showed that the accuracy of genomic predictions based on PBLUP and GBLUP evaluation methods increased with the enlargement of the reference population size, which is consistent with previous genetic evaluation studies on simulated data [37], beef cattle data [27,38], dairy cattle data [39,40], and dairy goat data [41].
However, when the wGBLUP evaluation method was used in this study, the improvement in accuracy with increasing reference population size was moderate and reached a plateau, even showing a declining trend. When the reference population included 5000 animals, the GEBV accuracy was maximized for the three breeds under medium and high heritability levels, with values of 0.70 and 0.78 (A), 0.79 and 0.86 (B), and 0.79 and 0.85 (C). This trend is consistent with the results of Takeda et al. [42], who conducted a weighted assessment based on the maximum likelihood (ML) method, but the predictive accuracy we obtained was higher. The reason for this outcome might be that the ML method yields fewer correlation weight factors and less close genetic relationships between SNPs, while wGBLUP utilizes LD relationships to obtain physically proximate SNPs and makes their effect sizes similar [43]. Therefore, in the case of high heritability and large reference populations, the prediction accuracy of wGBLUP may not significantly improve and may even show a downward trend. This is likely because, in such cases, the basic GBLUP method can already provide a relatively high prediction accuracy, and the additional improvements brought by the weighting mechanism of wGBLUP are limited. Moreover, large reference populations may encompass more genetic variation, which could dilute the effects of the weighting mechanism. Although wGBLUP performs well in most scenarios, its performance enhancement may be restricted in certain specific contexts. This indicates that, in practical applications, it is necessary to select the appropriate genomic prediction method based on specific research objectives and data characteristics. The computational complexity of wGBLUP and its reliance on the genetic diversity of the training population also need to be taken into account. Additionally, in the study by Uemoto et al. [44], which used a reference population size increasing from 200 to 1200 animals for a genetic evaluation, the accuracy of the genomic prediction did not reach a plateau. However, this study used about 13 times more animals than previous ones, indicating that under phenotype data with medium to high levels of heritability, the accuracy of genomic prediction gradually approaches a plateau. The results of this study and the aforementioned studies demonstrate that when the reference population reaches a certain level, it can effectively ensure the accuracy of genomic selection.
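The diminishing return from larger reference populations can be illustrated with the widely used deterministic approximation r = sqrt(N·h² / (N·h² + Me)), where N is the reference population size and Me is the number of independent chromosome segments. This formula was not used in the present study, and the value Me = 1000 below is a purely hypothetical choice for illustration.

```python
import math


def expected_accuracy(n_ref, h2, m_e):
    """Deterministic approximation: r = sqrt(N*h2 / (N*h2 + Me))."""
    return math.sqrt(n_ref * h2 / (n_ref * h2 + m_e))


M_E = 1000  # hypothetical number of independent chromosome segments
sizes = [1000, 3000, 5000, 8000, 12000]

curve = {h2: [expected_accuracy(n, h2, M_E) for n in sizes]
         for h2 in (0.3, 0.5, 0.7)}

for h2, values in curve.items():
    formatted = ", ".join(f"{v:.3f}" for v in values)
    print(f"h2={h2}: {formatted}")
```

Under this approximation, accuracy rises monotonically with N and h², but each additional animal contributes less than the one before. The approximation therefore reproduces a plateau, although it cannot account for the slight decline observed for wGBLUP at the largest reference sizes.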
Genomic selection is a key strategy for improving the efficiency of genetic improvements, especially for traits with low heritability and those that are difficult to measure directly. In this study, we simulated scenarios with low, medium, and high levels of heritability to assess the effectiveness of genomic selection, and the results showed that an increase in heritability affects the accuracy of genomic prediction. Regardless of the evaluation method used, whether PBLUP, GBLUP, or wGBLUP, the accuracy of genomic prediction increases with the size of h2. Under different h2 scenarios, the accuracy of wGBLUP is higher than that of PBLUP and GBLUP. Nwogwugwu et al. [9] indicated that a higher h2 leads to greater predictive accuracy, as h2 represents the proportion of phenotypic variation caused by genetic factors, which directly affects the accuracy of EBV prediction. High heritability implies a strong correlation between phenotypic values and breeding values, thereby improving the accuracy of EBV prediction. This means there is a relationship between h2 and predictive accuracy, as we have observed. Many studies have proven that accuracy increases with increasing h2 values, which is consistent with our research findings [45,46,47]. Furthermore, Gualdrón Duarte et al. [48] mentioned that when large-effect variations contribute to complex traits, genomic prediction methods that assign higher variance to these variations can achieve a higher predictive accuracy. This implies that if a trait has high heritability, prediction models based on these genetic variations may be more accurate. Moreover, by using a weighted GRM, the predictive accuracy in the GBLUP method can be further improved. This is also consistent with our study results, in which the accuracy of wGBLUP is higher than that of PBLUP and GBLUP.
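The link between heritability and the phenotype–breeding value correlation described above can be checked with a small simulation. The sketch assumes a simple additive model, phenotype = TBV + residual, with total phenotypic variance fixed at 1; under that assumption, the correlation between phenotype and TBV equals the square root of h². The function name and the sample size are illustrative.

```python
import numpy as np


def phenotype_tbv_correlation(h2, n_animals=200_000, seed=1):
    """Simulate phenotype = TBV + residual and return cor(phenotype, TBV)."""
    rng = np.random.default_rng(seed)
    tbv = rng.normal(0.0, np.sqrt(h2), n_animals)             # var(TBV) = h2
    residual = rng.normal(0.0, np.sqrt(1.0 - h2), n_animals)  # var(e) = 1 - h2
    phenotype = tbv + residual
    return np.corrcoef(phenotype, tbv)[0, 1]


for h2 in (0.3, 0.5, 0.7):
    simulated = phenotype_tbv_correlation(h2)
    print(f"h2={h2}: simulated r={simulated:.3f}, sqrt(h2)={np.sqrt(h2):.3f}")
```

Because each phenotype carries more information about the underlying breeding value when h² is higher, any method that learns from phenotypes, whether PBLUP, GBLUP, or wGBLUP, starts from a stronger signal.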
A higher marker density provides more genetic information on chromosomes of the same length, making it easier to identify markers in linkage disequilibrium with QTL. Generally, as the marker density increases, the accuracy of genomic predictions also improves. Some authors have reported that the accuracy of genetic evaluations using genotype data from high-density chips is higher than that using genotype data from low-density chips [31,49]. In the article by Zhu et al. [50], different subsets of single-nucleotide polymorphisms (SNPs) were constructed to estimate GEBVs, and the impact of these different densities on predictive ability was assessed. The results showed that predictive ability significantly increased with the inclusion of more SNPs up to a certain marker density (200 k SNPs). The simulation study of Brito et al. [51], which used 50 k and 770 k marker densities, demonstrated that the predictive accuracy improved with an increase in the number of markers. This is consistent with the conclusions obtained in this study using two evaluation methods, PBLUP and wGBLUP: the predictive accuracy increases with the marker density. By using high-density marker genotypes, a realized relationship matrix between individuals can be constructed, with elements representing the proportion of identical-by-descent (IBD) genome content between individual pairs, which enhances the BLUP estimation of breeding values, especially for individuals lacking direct phenotypic data, thus increasing the predictive precision. In short, dense genetic markers contribute to more accurate predictions of individual breeding potential [3].
However, the research by Solberg et al. [16] showed that as the marker density increases, the accuracy of genomic predictions improves, but the rate of improvement slows down and may stabilize after reaching a certain density. This indicates that there is a balance point between the marker density and predictive accuracy; beyond this point, the marginal benefits of increasing the number of markers on the predictive accuracy will gradually decrease. Rabier et al. [52], in their study on perennial ryegrass, demonstrated that using 3000 to 5000 evenly distributed SNP markers can achieve a genomic prediction accuracy close to that of high-density markers. Simulation analyses further revealed that when the marker density is reduced to about 1000 SNP, the predictive accuracy reaches a plateau, and additional increases in the number of markers contribute little to enhancing the predictive accuracy. Therefore, a reasonable choice of marker density can strike a balance between costs and benefits. Moreover, studies have shown that imputing missing genotypes in the 54K dataset does not significantly enhance the accuracy of genomic predictions. Although high-density marker panels should theoretically enhance the predictive accuracy, the actual improvement is modest and significantly influenced by the choice of model and data quality [53,54,55]. When applying the GBLUP method for genomic evaluation, we found that whether using 50 k or 770 k SNP marker densities, the difference in predictive accuracy becomes insignificant after reaching a certain threshold, which is consistent with previous studies. 
The reasons for this phenomenon may include the following: (1) when the marker density reaches a certain level, most significant LD regions have been covered; (2) after most major-effect QTL have been tagged, additional markers may only capture additional minor-effect QTL, contributing little to the overall accuracy; and (3) as the marker density increases, the LD between markers increases, leading to redundant marker information. After reaching a certain density, many markers may provide similar information about the same genetic variation, no longer providing additional improvements in predictive accuracy [16,56,57]. The use of over 500K SNP markers only brings a limited increase in accuracy, indicating that the addition of more SNPs mainly reduces the sampling error of the genomic relationship matrix G. Therefore, the primary role of genomic information is to enhance the accuracy of genetic evaluations by more precisely estimating the genetic relationships between individuals, including Mendelian sampling [58]. Even though the accuracy of predictions via the PBLUP method increases with marker density, it does not exceed the accuracy of predictions via the GBLUP method, and among these three methods, wGBLUP is the best choice for prediction accuracy. This is because wGBLUP assigns different weights to markers by considering the actual extent of the LD between the markers and QTL. This means that these methods can more effectively utilize marker information, especially at higher marker densities, and can better distinguish the importance of markers, thereby continuously improving the predictive accuracy.
In the field of genomic selection in plants and animals, numerous genetic evaluation methods such as GBLUP, ssGBLUP, wGBLUP, and Bayesian methods have been widely implemented and have significantly impacted prediction accuracy. Research has indicated that in genomic evaluations of beef cattle [30,59,60,61], dairy cattle [10], pigs [62], and simulated American mink [63], the average accuracy of GBLUP methods is markedly higher than that of PBLUP methods based on pedigree information. Although GBLUP is widely used in beef cattle genomic evaluations due to its simplicity and low computational requirements, its reliance on the LD between the markers and QTL limits its potential to capture new information and improve the prediction accuracy. Consequently, Nwogwugwu et al. [64] introduced three different weights in ssGBLUP to address the issue of multicollinearity among variables and the low-rank problem of matrices, which may render the inversion of matrices difficult or impossible. Similarly, Haque et al. [65], in their study on the genomic prediction accuracy for reproductive and carcass traits in Hanwoo cattle, demonstrated a clear advantage of wGBLUP over traditional GBLUP methods by assigning different weights to significant SNPs, highlighting the importance of incorporating heterogeneity in SNP effect sizes in genomic evaluations to enhance prediction performance. This result aligns with the comparative results observed in this study, and other studies have also shown that the accuracy of predictions via the wGBLUP method is higher than that of predictions via the PBLUP and GBLUP methods. Lourenco et al. [12], in their analysis of genomic prediction accuracy in an Israeli Holstein cattle population, found that weighted single-step GBLUP (WssGBLUP) significantly improved prediction accuracy by assigning higher weights to SNPs that have a greater impact on the target traits, especially when evaluating percentage traits, showing its significant potential to enhance the accuracy of genomic evaluations. Compared with PBLUP and GBLUP, wGBLUP considers the similar effects of physically adjacent SNPs due to the LD as weights in the model. Although there are various methods for calculating the allocation of weights, they all further optimize the accuracy of genomic predictions and even provide the flexibility to adapt to different datasets and research objectives by adjusting the window size and iteration number. Karaman et al. [66] estimated the covariance of SNP effects using Bayesian whole-genome regression methods and applied these covariances as weights in multi-trait wGBLUP methods. The results showed that when using the SNP covariances estimated in the Bayesian method as weights, wGBLUP can achieve a prediction accuracy comparable to the Bayesian method in multi-trait genomic predictions. In particular, when considering 100 adjacent SNPs as a common weight, wGBLUP outperforms the traditional GBLUP method. There are also many studies on the genomic prediction accuracy of reproductive and carcass traits in Hanwoo cattle that compare various genomic evaluation methods, including the traditional GBLUP, wGBLUP, and machine learning methods. The results show that compared with the traditional GBLUP method, wGBLUP can significantly improve the prediction accuracy by assigning higher weights to SNPs with larger effects, especially for traits influenced by major genes. Additionally, the application of SNPs pre-selected based on gene expression information and GWAS results in wGBLUP further enhances the predictive power of the model.
These findings emphasize the importance of considering SNP weights in genomic predictions and demonstrate the potential of wGBLUP to improve the accuracy of predictions of complex traits, especially for traits with known genetic structures and those that are influenced by a few major genes [17,67,68]. According to the above research results, the wGBLUP method has shown superior predictive performance to PBLUP and GBLUP in all cases, thereby enhancing the genetic improvement in beef cattle breeding. Future research could build on this foundation by integrating deep learning models (such as convolutional and recurrent neural networks) with wGBLUP to capture complex interactions between SNPs, thereby improving the prediction accuracy. Additionally, combining wGBLUP with Bayesian methods, using the posterior variance of SNP effects estimated by Bayesian methods as weights, can further optimize predictions, especially for traits influenced by large-effect QTL. These advancements not only enhance the prediction accuracy but also provide more powerful tools for beef cattle breeding, driving the continuous development of breeding technologies.
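The idea of letting adjacent SNPs share a common weight, as in the 100-SNP windows mentioned in this section, can be sketched as follows. The weight of each window is taken here as the mean squared SNP effect within it, rescaled so that the weights average 1. This weighting rule and the function name `window_weights` are assumptions for illustration; the studies cited above may define their weights differently.

```python
import numpy as np


def window_weights(snp_effects, window_size=100):
    """Assign every SNP the mean squared effect of its window.

    SNPs are assumed to be ordered by physical position, and at least
    one effect must be non-zero. Weights are rescaled to a mean of 1.
    """
    u = np.asarray(snp_effects, dtype=float)
    weights = np.empty_like(u)
    for start in range(0, u.size, window_size):
        block = slice(start, start + window_size)
        weights[block] = np.mean(u[block] ** 2)  # common weight per window
    return weights / weights.mean()


# Toy usage: 300 SNPs with a single large effect in the second window.
effects = np.zeros(300)
effects[150] = 1.0
w = window_weights(effects, window_size=100)
```

Sharing one weight per window smooths noisy single-SNP estimates and reflects the expectation that SNPs in LD with the same QTL carry similar information. Window size then becomes a tuning parameter, as noted in the discussion.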

5. Conclusions

In this study, we explored the potential application of genomic selection in beef cattle breeding using simulated datasets that encompassed diverse reference populations, heritability levels, selection strategies, and marker densities. We found that even a low marker density (50 k SNP) can significantly enhance the accuracy of genetic evaluations, although the size of the reference population needs to be optimized based on the population structure, heritability, and genetic architecture of the traits. The study demonstrated that integrating pedigree, genomic, and weighted SNP information can significantly improve the precision of GEBV predictions and reduce bias. Particularly, the wGBLUP method showed an advantage in enhancing the predictive accuracy for low heritability traits in small-scale but high-density marker populations. Our research emphasizes the importance of fully utilizing genetic and population structure information in genetic evaluations and points out that by fine-tuning selection methods, evaluation strategies, reference population sizes, heritability levels, and marker densities, one can more effectively harness the genetic diversity within populations, advancing the genetic progress in beef cattle breeding. This study not only provides key technical parameters for the formulation of genomic breeding strategies in beef cattle but also establishes an optimizable framework for genomic prediction that can be extended to other livestock species. Future research can build on this foundation to further explore the integration of multi-omics data and the application of machine learning algorithms in genomic predictions, thereby continuously driving the innovative development of livestock breeding technologies.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ani15081118/s1, Figure S1: Prediction accuracy of Genomic Estimated Breeding Values (GEBVs) for simulated traits with a heritability of 0.3 in breed B using different evaluation methods: PBLUP, GBLUP or wGBLUP; Figure S2: Prediction accuracy of Genomic Estimated Breeding Values (GEBVs) for simulated traits with a heritability of 0.3 in breed C using different evaluation methods: PBLUP, GBLUP or wGBLUP; Figure S3: Prediction accuracy of Genomic Estimated Breeding Values (GEBVs) for simulated traits with a heritability of 0.5 in breed B using different evaluation methods: PBLUP, GBLUP or wGBLUP; Figure S4: Prediction accuracy of Genomic Estimated Breeding Values (GEBVs) for simulated traits with a heritability of 0.5 in breed C using different evaluation methods: PBLUP, GBLUP or wGBLUP; Figure S5: Prediction accuracy of Genomic Estimated Breeding Values (GEBVs) for simulated traits with a heritability of 0.7 in breed B using different evaluation methods: PBLUP, GBLUP or wGBLUP; Figure S6: Prediction accuracy of Genomic Estimated Breeding Values (GEBVs) for simulated traits with a heritability of 0.7 in breed C using different evaluation methods: PBLUP, GBLUP or wGBLUP; Table S1: Accuracies of genomic prediction using PBLUP, GBLUP or wGBLUP procedures under different training populations, with varying levels of heritability and marker densities for breed B; Table S2: Accuracies of genomic prediction using PBLUP, GBLUP or wGBLUP procedures under different training populations, with varying levels of heritability and marker densities for breed C.

Author Contributions

Conceptualization, L.Z. (Le Zhou), L.Z. (Lin Zhu) and W.Z.; Data curation, L.Z. (Le Zhou); Formal analysis, L.Z. (Lin Zhu) and F.M.; Funding acquisition, W.Z.; Investigation, L.Z. (Le Zhou) and C.C.; Methodology, C.C.; Project administration, R.N.; Resources, Z.L., M.G. and R.N.; Supervision, L.Z. (Lin Zhu), R.N. and W.Z.; Validation, F.M.; Visualization, L.Z. (Le Zhou) and Z.L.; Writing—original draft, L.Z. (Le Zhou); writing—review and editing, L.Z. (Le Zhou) and W.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Double First-Class Construction Funds for IMAU (BZX202201), the Fundamental Research Funds for Inner Mongolia Autonomous Region Direct Affiliated Universities (BR22-11-13 and BR221024), and the Inner Mongolia Natural Science Foundation project (2021ZD05).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
GS: Genomic selection
TBV: True breeding values
EBV: Estimation of breeding values
BLUP: Best linear unbiased prediction
GEBV: Genomic estimated breeding value
SNP: Single-nucleotide polymorphisms
MAF: Minor allele frequency
LD: Linkage disequilibrium
QTL: Quantitative trait loci
PBLUP: Traditional pedigree-based best linear unbiased prediction
GBLUP: Genomic best linear unbiased prediction
wGBLUP: Weighted genomic best linear unbiased prediction
GRM: Genomic relationship matrix

References

  1. Mehrban, H.; Lee, D.H.; Naserkheil, M.; Moradi, M.H.; Ibáñez-Escriche, N. Comparison of conventional BLUP and single-step genomic BLUP evaluations for yearling weight and carcass traits in Hanwoo beef cattle using single trait and multi-trait models. PLoS ONE 2019, 14, e0223352.
  2. Henderson, C.R. Theoretical Basis and Computational Methods for a Number of Different Animal Models. J. Dairy Sci. 1988, 71, 1–16.
  3. Hayes, B.J.; Visscher, P.M.; Goddard, M.E. Increased accuracy of artificial selection by using the realized relationship matrix. Genet. Res. 2009, 91, 47–60.
  4. Goddard, M.E.; Hayes, B.J.; Meuwissen, T.H. Using the genomic relationship matrix to predict the accuracy of genomic selection. J. Anim. Breed. Genet. 2011, 128, 409–421.
  5. Sullivan, P.G. Mendelian Sampling variance tests with genomic preselection. In Proceedings of the 2018 Interbull Technical Workshop, Dubrovnik, Croatia, 25–26 August 2018.
  6. Henderson, C.R. Best Linear Unbiased Estimation and Prediction under a Selection Model. Biometrics 1975, 31, 423.
  7. Meuwissen, T.H.; Hayes, B.J.; Goddard, M.E. Prediction of total genetic value using genome-wide dense marker maps. Genetics 2001, 157, 1819–1829.
  8. Misztal, I.; Lourenco, D.; Legarra, A. Current status of genomic evaluation. J. Anim. Sci. 2020, 98, skaa101.
  9. Nwogwugwu, C.P.; Kim, Y.; Chung, Y.J.; Jang, S.B.; Roh, S.H.; Kim, S.; Lee, J.H.; Choi, T.J.; Lee, S.H. Effect of errors in pedigree on the accuracy of estimated breeding value for carcass traits in Korean Hanwoo cattle. Asian-Australas. J. Anim. Sci. 2020, 33, 1057–1067.
  10. Gao, H.; Christensen, O.F.; Madsen, P.; Nielsen, U.S.; Zhang, Y.; Lund, M.S.; Su, G. Comparison on genomic predictions using three GBLUP methods and two single-step blending methods in the Nordic Holstein population. Genet. Sel. Evol. 2012, 44, 8.
  11. Mehrban, H.; Lee, D.H.; Moradi, M.H.; IlCho, C.; Naserkheil, M.; Ibáñez-Escriche, N. Predictive performance of genomic selection methods for carcass traits in Hanwoo beef cattle: Impacts of the genetic architecture. Genet. Sel. Evol. 2017, 49, 1.
  12. Lourenco, D.A.; Misztal, I.; Tsuruta, S.; Aguilar, I.; Ezra, E.; Ron, M.; Shirak, A.; Weller, J.I. Methods for genomic evaluation of a relatively small genotyped dairy population and effect of genotyped cow information in multiparity analyses. J. Dairy Sci. 2014, 97, 1742–1752.
  13. van den Berg, I.; Meuwissen, T.H.E.; MacLeod, I.M.; Goddard, M.E. Predicting the effect of reference population on the accuracy of within, across, and multibreed genomic prediction. J. Dairy Sci. 2019, 102, 3155–3174.
  14. Ren, D.; Cai, X.; Lin, Q.; Ye, H.; Teng, J.; Li, J.; Ding, X.; Zhang, Z. Impact of linkage disequilibrium heterogeneity along the genome on genomic prediction and heritability estimation. Genet. Sel. Evol. 2022, 54, 47.
  15. Grotzinger, A.D.; Rhemtulla, M.; de Vlaming, R.; Ritchie, S.J.; Mallard, T.T.; Hill, W.D.; Ip, H.F.; Marioni, R.E.; McIntosh, A.M.; Deary, I.J.; et al. Genomic structural equation modelling provides insights into the multivariate genetic architecture of complex traits. Nat. Hum. Behav. 2019, 3, 513–525.
  16. Solberg, T.R.; Sonesson, A.K.; Woolliams, J.A.; Meuwissen, T.H. Genomic selection using different marker types and densities. J. Anim. Sci. 2008, 86, 2447–2454.
  17. de Las Heras-Saldana, S.; Lopez, B.I.; Moghaddar, N.; Park, W.; Park, J.E.; Chung, K.Y.; Lim, D.; Lee, S.H.; Shin, D.; van der Werf, J.H.J. Use of gene expression and whole-genome sequence information to improve the accuracy of genomic prediction for carcass traits in Hanwoo cattle. Genet. Sel. Evol. 2020, 52, 54.
  18. Zhao, W.; Zhang, Z.; Ma, P.; Wang, Z.; Wang, Q.; Zhang, Z.; Pan, Y. The effect of high-density genotypic data and different methods on joint genomic prediction: A case study in large white pigs. Anim. Genet. 2023, 54, 45–54.
  19. Romé, H.; Chu, T.T.; Marois, D.; Huang, C.H.; Madsen, P.; Jensen, J. Accounting for genetic architecture for body weight improves accuracy of predicting breeding values in a commercial line of broilers. J. Anim. Breed. Genet. 2021, 138, 528–540.
  20. Song, H.; Hu, H. Strategies to improve the accuracy and reduce costs of genomic prediction in aquaculture species. Evol. Appl. 2021, 15, 578–590.
  21. Lund, M.S.; Su, G.; Janss, L. Genomic evaluation of cattle in a multi-breed context. Livest. Sci. 2014, 166, 101–110.
  22. Cole, J.B.; Silva, M.V.G.B.D. Genomic selection in multi-breed dairy cattle populations. Rev. Bras. Zootec. 2016, 45, 195–202.
  23. Barani, S.; Miraie Ashtiani, S.R.; Nejati Javaremi, A.; Khansefid, M.; Esfandyari, H. Optimizing purebred selection to improve crossbred performance. Front. Genet. 2024, 15, 1384973.
  24. Sargolzaei, M.; Schenkel, F.S. QMSim: A large-scale genome simulator for livestock. Bioinformatics 2009, 25, 680–681.
  25. Henderson, C.R. Inverse of a Matrix of Relationships Due to Sires and Maternal Grandsires. J. Dairy Sci. 1975, 58, 1917–1921.
  26. Purcell, S.; Neale, B.; Todd-Brown, K.; Thomas, L.; Ferreira, M.A.; Bender, D.; Maller, J.; Sklar, P.; de Bakker, P.I.; Daly, M.J.; et al. PLINK: A tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 2007, 81, 559–575.
  27. Lee, H.S.; Kim, Y.; Lee, D.H.; Seo, D.; Lee, D.J.; Do, C.H.; Dinh, P.T.N.; Ekanayake, W.; Lee, K.H.; Yoon, D.; et al. Comparison of accuracy of breeding value for cow from three methods in Hanwoo (Korean cattle) population. J. Anim. Sci. Technol. 2023, 65, 720–734.
  28. Yin, L.; Zhang, H.; Tang, Z.; Yin, D.; Fu, Y.; Yuan, X.; Li, X.; Liu, X.; Zhao, S. HIBLUP: An integration of statistical models on the BLUP framework for efficient genetic evaluation using big genomic data. Nucleic Acids Res. 2023, 51, 3501–3512.
  29. Clark, S.A.; van der Werf, J. Genomic best linear unbiased prediction (gBLUP) for the estimation of genomic breeding values. Methods Mol. Biol. 2013, 1019, 321–330.
  30. Kim, E.H.; Kang, H.C.; Sun, D.W.; Myung, C.H.; Kim, J.Y.; Lee, D.H.; Lee, S.H.; Lim, H.T. Estimation of breeding value and accuracy using pedigree and genotype of Hanwoo cows (Korean cattle). J. Anim. Breed. Genet. 2022, 139, 281–291.
  31. Ma, H.; Li, H.; Ge, F.; Zhao, H.; Zhu, B.; Zhang, L.; Gao, H.; Xu, L.; Li, J.; Wang, Z. Improving Genomic Predictions in Multi-Breed Cattle Populations: A Comparative Analysis of BayesR and GBLUP Models. Genes 2024, 15, 253.
  32. Cheng, J.; Maltecca, C.; VanRaden, P.M.; O’Connell, J.R.; Ma, L.; Jiang, J. SLEMM: Million-scale genomic predictions with window-based SNP weighting. Bioinformatics 2023, 39, btad127.
  33. Porto-Neto, L.R.; Kijas, J.W.; Reverter, A. The extent of linkage disequilibrium in beef cattle breeds using high-density SNP genotypes. Genet. Sel. Evol. 2014, 46, 22.
  34. Border, R.; Becker, S. Stochastic Lanczos estimation of genomic variance components for linear mixed-effects models. BMC Bioinformatics 2019, 20, 411.
  35. Zhang, X.; Lourenco, D.; Aguilar, I.; Legarra, A.; Misztal, I. Weighting Strategies for Single-Step Genomic BLUP: An Iterative Approach for Accurate Calculation of GEBV and GWAS. Front. Genet. 2016, 7, 151.
  36. Brito, L.F.; Clarke, S.M.; McEwan, J.C.; Miller, S.P.; Pickering, N.K.; Bain, W.E.; Dodds, K.G.; Sargolzaei, M.; Schenkel, F.S. Prediction of genomic breeding values for growth, carcass and meat quality traits in a multi-breed sheep population using a HD SNP chip. BMC Genet. 2017, 18, 7.
  37. de Rezende Neves, H.H.; Carvalheiro, R.; de Queiroz, S.A. Trait-specific long-term consequences of genomic selection in beef cattle. Genetica 2018, 146, 85–99.
  38. Erbe, M.; Gredler, B.; Seefried, F.R.; Bapst, B.; Simianer, H. A function accounting for training set size and marker density to model the average accuracy of genomic prediction. PLoS ONE 2013, 8, e81046.
  39. Liu, Z.; Seefried, F.R.; Reinhardt, F.; Rensing, S.; Thaller, G.; Reents, R. Impacts of both reference population size and inclusion of a residual polygenic effect on the accuracy of genomic prediction. Genet. Sel. Evol. 2011, 43, 19.
  40. Moser, G.; Khatkar, M.S.; Hayes, B.J.; Raadsma, H.W. Accuracy of direct genomic values in Holstein bulls and cows using subsets of SNP markers. Genet. Sel. Evol. 2010, 42, 37.
  41. Carillier, C. Evaluation of a Reference Population in Dairy Goats for Genomic Selection. Master’s Thesis, AgroParisTech, Paris, France, 2012.
  42. Takeda, M.; Inoue, K.; Oyama, H.; Uchiyama, K.; Yoshinari, K.; Sasago, N.; Kojima, T.; Kashima, M.; Suzuki, H.; Kamata, T.; et al. Exploring the size of reference population for expected accuracy of genomic prediction using simulated and real data in Japanese Black cattle. BMC Genom. 2021, 22, 799.
  43. Wang, H.; Misztal, I.; Aguilar, I.; Legarra, A.; Muir, W.M. Genome-wide association mapping including phenotypes from relatives without genotypes. Genet. Res. 2012, 94, 73–83.
  44. Uemoto, Y.; Sasaki, S.; Kojima, T.; Sugimoto, Y.; Watanabe, T. Impact of QTL minor allele frequency on genomic evaluation using real genotype data and simulated phenotypes in Japanese Black cattle. BMC Genet. 2015, 16, 134.
  45. Kolbehdari, D.; Schaeffer, L.R.; Robinson, J.A. Estimation of genome-wide haplotype effects in half-sib designs. J. Anim. Breed. Genet. 2007, 124, 356–361.
  46. Zhang, M.; Luo, H.; Xu, L.; Shi, Y.; Zhou, J.; Wang, D.; Zhang, X.; Huang, X.; Wang, Y. Genomic Selection for Milk Production Traits in Xinjiang Brown Cattle. Animals 2022, 12, 136.
  47. Luan, T.; Woolliams, J.A.; Lien, S.; Kent, M.; Svendsen, M.; Meuwissen, T.H. The accuracy of Genomic Selection in Norwegian red cattle assessed by cross-validation. Genetics 2009, 183, 1119–1126.
  48. Gualdrón Duarte, J.L.; Gori, A.S.; Hubin, X.; Lourenco, D.; Charlier, C.; Misztal, I.; Druet, T. Performances of Adaptive MultiBLUP, Bayesian regressions, and weighted-GBLUP approaches for genomic predictions in Belgian Blue beef cattle. BMC Genom. 2020, 21, 545.
  49. Zhu, S.; Guo, T.; Yuan, C.; Liu, J.; Li, J.; Han, M.; Zhao, H.; Wu, Y.; Sun, W.; Wang, X.; et al. Evaluation of Bayesian alphabet and GBLUP based on different marker density for genomic prediction in Alpine Merino sheep. G3 2021, 11, jkab206.
  50. Zhu, B. Effects of marker density and minor allele frequency on genomic prediction for growth traits in Chinese Simmental beef cattle. J. Integr. Agric. 2017, 16, 911–920.
  51. Brito, F.V.; Neto, J.B.; Sargolzaei, M.; Cobuci, J.A.; Schenkel, F.S. Accuracy of genomic selection in simulated populations mimicking the extent of linkage disequilibrium in beef cattle. BMC Genet. 2011, 12, 80.
  52. Rabier, C.E.; Barre, P.; Asp, T.; Charmet, G.; Mangin, B. On the Accuracy of Genomic Selection. PLoS ONE 2016, 11, e0156086.
  53. Su, G.; Brøndum, R.F.; Ma, P.; Guldbrandtsen, B.; Aamand, G.P.; Lund, M.S. Comparison of genomic predictions using medium-density (~54,000) and high-density (~777,000) single nucleotide polymorphism marker panels in Nordic Holstein and Red Dairy Cattle populations. J. Dairy Sci. 2012, 95, 4657–4665.
  54. Toosi, A.; Fernando, R.L.; Dekkers, J.C. Genomic selection in admixed and crossbred populations. J. Anim. Sci. 2010, 88, 32–46.
  55. Ogawa, S.; Matsuda, H.; Taniguchi, Y.; Watanabe, T.; Nishimura, S.; Sugimoto, Y.; Iwaisaki, H. Effects of single nucleotide polymorphism marker density on degree of genetic variance explained and genomic evaluation for carcass traits in Japanese Black beef cattle. BMC Genet. 2014, 15, 15.
  56. Calus, M.P.; Meuwissen, T.H.; de Roos, A.P.; Veerkamp, R.F. Accuracy of genomic selection using different methods to define haplotypes. Genetics 2008, 178, 553–561.
  57. Muir, W.M. Comparison of genomic and traditional BLUP-estimated breeding value accuracy and selection response under alternative trait and genomic parameters. J. Anim. Breed. Genet. 2007, 124, 342–355.
  58. Misztal, I.; Aggrey, S.E.; Muir, W.M. Experiences with a single-step genome evaluation. Poult. Sci. 2013, 92, 2530–2534. [Google Scholar] [CrossRef]
  59. Silva, R.M.; Fragomeni, B.O.; Lourenco, D.A.; Magalhães, A.F.; Irano, N.; Carvalheiro, R.; Canesin, R.C.; Mercadante, M.E.; Boligon, A.A.; Baldi, F.S.; et al. Accuracies of genomic prediction of feed efficiency traits using different prediction and validation methods in an experimental Nelore cattle population. J. Anim. Sci. 2016, 94, 3613–3623. [Google Scholar] [CrossRef]
  60. Naserkheil, M.; Mehrban, H.; Lee, D.; Park, M.N. Evaluation of Genome-Enabled Prediction for Carcass Primal Cut Yields Using Single-Step Genomic Best Linear Unbiased Prediction in Hanwoo Cattle. Genes 2021, 12, 1886. [Google Scholar] [CrossRef]
  61. Lourenco, D.A.; Tsuruta, S.; Fragomeni, B.O.; Masuda, Y.; Aguilar, I.; Legarra, A.; Bertrand, J.K.; Amen, T.S.; Wang, L.; Moser, D.W.; et al. Genetic evaluation using single-step genomic best linear unbiased predictor in American Angus. J. Anim. Sci. 2015, 93, 2653–2662. [Google Scholar] [CrossRef]
  62. Putz, A.M.; Tiezzi, F.; Maltecca, C.; Gray, K.A.; Knauer, M.T. A comparison of accuracy validation methods for genomic and pedigree-based predictions of swine litter size traits using Large White and simulated data. J. Anim. Breed. Genet. 2018, 135, 5–13. [Google Scholar] [CrossRef]
  63. Karimi, K.; Sargolzaei, M.; Plastow, G.S.; Wang, Z.; Miar, Y. Opportunities for genomic selection in American mink: A simulation study. PLoS ONE 2019, 14, e0213873. [Google Scholar] [CrossRef] [PubMed]
  64. Nwogwugwu, C.P.; Kim, Y.; Choi, H.; Lee, J.H.; Lee, S.H. Assessment of genomic prediction accuracy using different selection and evaluation approaches in a simulated Korean beef cattle population. Asian-Australas. J. Anim. Sci. 2020, 33, 1912–1921. [Google Scholar] [CrossRef] [PubMed]
  65. Haque, M.A.; Iqbal, A.; Alam, M.Z.; Lee, Y.M.; Ha, J.J.; Kim, J.J. Estimation of genetic correlations and genomic prediction accuracy for reproductive and carcass traits in Hanwoo cows. J. Anim. Sci. Technol. 2024, 66, 682–701. [Google Scholar] [CrossRef]
  66. Karaman, E.; Lund, M.S.; Anche, M.T.; Janss, L.; Su, G. Genomic Prediction Using Multi-trait Weighted GBLUP Accounting for Heterogeneous Variances and Covariances Across the Genome. G3 2018, 8, 3549–3558. [Google Scholar] [CrossRef] [PubMed]
  67. Lopez, B.I.; Lee, S.H.; Park, J.E.; Shin, D.H.; Oh, J.D.; de Las Heras-Saldana, S.; van der Werf, J.; Chai, H.H.; Park, W.; Lim, D. Weighted Genomic Best Linear Unbiased Prediction for Carcass Traits in Hanwoo Cattle. Genes 2019, 10, 1019, Erratum in Genes 2020, 11, E1013. [Google Scholar] [CrossRef]
  68. Nishio, M.; Arakawa, A.; Inoue, K.; Ichinoseki, K.; Kobayashi, E.; Okamura, T.; Fukuzawa, Y.; Ogawa, S.; Taniguchi, M.; Oe, M.; et al. Evaluating the performance of genomic prediction accounting for effects of single nucleotide polymorphism markers in reproductive traits of Japanese Black cattle. Anim. Sci. J. 2023, 94, e13850. [Google Scholar] [CrossRef]
Figure 1. Visual presentation of the simulated data. The historical populations of 10,000, 5000, and 1000 animals were mated randomly for 1000 generations, experiencing a bottleneck and expansion phase at generation 500. Founder animals for three breeds were selected and randomly mated for 10 generations, resulting in different breed sizes and numbers of genotyped animals selected.
Figure 2. Prediction accuracy of genomic estimated breeding values (GEBVs) for simulated traits with a heritability of 0.3 in breed A using different evaluation methods: PBLUP, GBLUP, or wGBLUP. The X-axis represents the number of animals in the reference population, while the Y-axis indicates the predicted accuracy of GEBVs for the simulated traits. The blue and red lines correspond to marker densities of 50 k and 770 k, respectively.
Table 1. Parameters of the simulation process.
Population structure | A | B | C
Step 1: Historical generations (HG)
Number of generations phase 1 (size) | 0 (10,000) | 0 (5000) | 0 (1000)
Number of generations phase 2 (size) | 500 (1000) | 500 (3000) | 500 (4000)
Number of generations phase 3 (size) | 1000 (7120)
Step 2: Expanded generations (EG)
Number of founder males from HG | 620 | 350 | 300
Number of founder females from HG | 5800 | 5100 | 5000
Number of generations | 10
Number of offspring per dam | 1
Selection and mating | ebv/h
Sire replacement and growth rate | 0.5065; 0.072; 0.1851; 0.1038; 0.063; 0.123
Dam replacement and growth rate | 0.30; 0.098; 0.3015; 0.1629; 0.105; 0.355
Mating system | Random
Culling design | Random
Genome
Number of chromosomes | 29 (no X Chr)
Genome length | 2486 cM
Number of markers | 58,990 (50 k) / 777,962 (770 k)
Marker/QTL positions | Random
Number of marker/QTL alleles | 2/2 3 4
Marker allele frequencies | Equal
QTL allele effects | Equal
Mutation rate | 2.5 × 10⁻⁵
Table 2. Accuracies of genomic prediction using PBLUP, GBLUP, or wGBLUP procedures under different training populations, with varying levels of heritability and marker densities for breed A.
Population size | h² | 50 k PBLUP | 50 k GBLUP | 50 k wGBLUP | 770 k PBLUP | 770 k GBLUP | 770 k wGBLUP
5000 | 0.3 | 0.5609333 | 0.6085367 | 0.6289896 | 0.5793174 | 0.6243117 | 0.6543064
5000 | 0.5 | 0.6018924 | 0.703766 | 0.7018991 | 0.6524662 | 0.7065225 | 0.7386434
5000 | 0.7 | 0.6658209 | 0.7908608 | 0.781666 | 0.6588115 | 0.7497605 | 0.8060416
8000 | 0.3 | 0.5639306 | 0.6323918 | 0.6272176 | 0.5843689 | 0.6368854 | 0.6608198
8000 | 0.5 | 0.590504 | 0.7213553 | 0.6919124 | 0.6338298 | 0.7204071 | 0.7359905
8000 | 0.7 | 0.6403381 | 0.7882723 | 0.7673341 | 0.646572 | 0.7690514 | 0.8011734
12,000 | 0.3 | 0.5797459 | 0.6691037 | 0.6550116 | 0.6209803 | 0.6783403 | 0.6995511
12,000 | 0.5 | 0.5835927 | 0.7372556 | 0.6898773 | 0.6443724 | 0.7414165 | 0.7533543
12,000 | 0.7 | 0.5935911 | 0.7806763 | 0.733462 | 0.652524 | 0.7937998 | 0.8052261
15,000 | 0.3 | 0.5963547 | 0.7011606 | 0.6717033 | 0.6431027 | 0.7049134 | 0.7192473
15,000 | 0.5 | 0.5854196 | 0.750636 | 0.684855 | 0.6465078 | 0.7560124 | 0.7554457
15,000 | 0.7 | 0.6052722 | 0.7941701 | 0.7355052 | 0.6588932 | 0.7974239 | 0.803195
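As a reading aid for Table 2, the following minimal Python sketch computes the relative change in accuracy of wGBLUP over GBLUP for the rows with a reference population of 5000 animals. The accuracies are copied from the table; the names `ACCURACY_5000`, `relative_gain`, and `wgblup_gain_over_gblup` are hypothetical helpers introduced here for illustration and are not part of the authors' analysis pipeline.

```python
# Accuracies copied from Table 2 (breed A, reference population of 5000).
# Keys: (heritability, marker density); values: (PBLUP, GBLUP, wGBLUP).
# The dictionary name and layout are illustrative assumptions.
ACCURACY_5000 = {
    (0.3, "50k"): (0.5609333, 0.6085367, 0.6289896),
    (0.3, "770k"): (0.5793174, 0.6243117, 0.6543064),
    (0.7, "50k"): (0.6658209, 0.7908608, 0.781666),
    (0.7, "770k"): (0.6588115, 0.7497605, 0.8060416),
}


def relative_gain(new: float, baseline: float) -> float:
    """Return the percentage change of `new` relative to `baseline`."""
    return 100.0 * (new - baseline) / baseline


def wgblup_gain_over_gblup(table: dict) -> dict:
    """Map each (h2, density) scenario to the wGBLUP gain over GBLUP in percent."""
    return {key: relative_gain(w, g) for key, (_p, g, w) in table.items()}


if __name__ == "__main__":
    for (h2, density), gain in wgblup_gain_over_gblup(ACCURACY_5000).items():
        print(f"h2={h2}, {density}: wGBLUP vs GBLUP = {gain:+.2f}%")
```

In these four rows the computed gain is positive in every scenario except h² = 0.7 at 50 k, where the tabulated wGBLUP accuracy (0.781666) is lower than the GBLUP accuracy (0.7908608).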
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhou, L.; Zhu, L.; Chang, C.; Ma, F.; Liu, Z.; Gu, M.; Na, R.; Zhang, W. Weighted GBLUP in Simulated Beef Cattle Populations: Impact of Reference Population, Marker Density, and Heritability. Animals 2025, 15, 1118. https://doi.org/10.3390/ani15081118


Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.
