MDPI - Publisher of Open Access Journals

17 pages, 1743 KiB

Open Access Article

Prioritized SNP Selection from Whole-Genome Sequencing Improves Genomic Prediction Accuracy in Sturgeons Using Linear and Machine Learning Models

by Hailiang Song, Wei Wang, Tian Dong, Xiaoyu Yan, Chenfan Geng, Song Bai and Hongxia Hu

Int. J. Mol. Sci. 2025, 26(14), 7007; https://doi.org/10.3390/ijms26147007 - 21 Jul 2025

Viewed by 290

Abstract

Genomic prediction has emerged as a powerful tool in aquaculture breeding, but its effectiveness depends on the careful selection of informative single nucleotide polymorphisms (SNPs) and the application of appropriate prediction models. This study aimed to enhance genomic prediction accuracy in Russian sturgeon (Acipenser gueldenstaedtii) by optimizing SNP selection strategies and exploring the performance of linear and machine learning models. Three economically important traits (caviar yield, caviar color, and body weight) were selected due to their direct relevance to breeding goals and market value. Whole-genome sequencing (WGS) data were obtained from 971 individuals with an average sequencing depth of 13.52×. To reduce marker density and eliminate redundancy, three SNP selection strategies were applied: (1) genome-wide association study (GWAS)-based prioritization to select trait-associated SNPs; (2) linkage disequilibrium (LD) pruning to retain independent markers; and (3) random sampling as a control. Genomic prediction was conducted using both linear (e.g., GBLUP) and machine learning models (e.g., random forest) across varying SNP densities (1 K to 50 K). Results showed that GWAS-based SNP selection consistently outperformed other strategies, especially at moderate densities (≥10 K), improving prediction accuracy by up to 3.4% compared to the full WGS dataset. LD-based selection at higher densities (30 K and 50 K) achieved comparable performance to full WGS. Notably, machine learning models, particularly random forest, exceeded the performance of linear models, yielding an additional 2.0% increase in accuracy when combined with GWAS-selected SNPs. In conclusion, integrating WGS data with GWAS-informed SNP selection and advanced machine learning models offers a promising framework for improving genomic prediction in sturgeon, with potential for broader application in aquaculture breeding programs.
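The three SNP selection strategies named in this abstract (GWAS-based prioritization, LD pruning, and random sampling) can be illustrated with a minimal NumPy sketch. The function names, the greedy windowed pruning rule, the r² threshold of 0.5, and the window of 50 SNPs are all assumptions made for illustration; the abstract does not state the software or parameters the authors used.

```python
import numpy as np

def select_top_gwas(p_values, k):
    """Return sorted indices of the k SNPs with the smallest GWAS p-values."""
    order = np.argsort(p_values, kind="stable")
    return np.sort(order[:k])

def select_random(n_snps, k, seed=0):
    """Return sorted indices of k SNPs sampled at random (control strategy)."""
    rng = np.random.default_rng(seed)
    return np.sort(rng.choice(n_snps, size=k, replace=False))

def ld_prune(genotypes, r2_threshold=0.5, window=50):
    """Greedy LD pruning on an (individuals x SNPs) matrix coded 0/1/2.

    A SNP is kept only if its squared correlation with every already-kept
    SNP within the preceding `window` positions is below `r2_threshold`.
    """
    n_snps = genotypes.shape[1]
    kept = []
    for j in range(n_snps):
        gj = genotypes[:, j]
        if gj.std() == 0:
            continue  # monomorphic SNPs carry no information
        independent = True
        for i in reversed(kept):
            if j - i > window:
                break  # earlier kept SNPs are outside the window
            r = np.corrcoef(genotypes[:, i], gj)[0, 1]
            if r * r >= r2_threshold:
                independent = False
                break
        if independent:
            kept.append(j)
    return np.array(kept, dtype=int)
```

In practice, the GWAS p-values should be computed on the training set only; selecting SNPs with phenotypes from the validation animals would inflate the apparent prediction accuracy.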

(This article belongs to the Special Issue Advances in Traits in Animals and Aquatic Species, and Their Improvement Technologies)

14 pages, 276 KiB

Open Access Article

Genomic Selection for Early Growth Traits in Inner Mongolian Cashmere Goats Using ABLUP, GBLUP, and ssGBLUP Methods

by Tao Zhang, Linyu Gao, Bohan Zhou, Qi Xu, Yifan Liu, Jinquan Li, Qi Lv, Yanjun Zhang, Ruijun Wang, Rui Su and Zhiying Wang

Animals 2025, 15(12), 1733; https://doi.org/10.3390/ani15121733 - 12 Jun 2025

Viewed by 893

Abstract

This study aimed to identify the best model and method for the genomic selection of early growth traits in Inner Mongolian cashmere goats (IMCGs). Using data from 50,728 SNPs, the phenotypes (birth weight, BW; weaning weight, WW; daily weight gain, DWG; and yearling weight, YW) of 2256 individuals, and pedigree information from 14,165 individuals, fixed effects were analyzed using a generalized linear model. Four single-trait animal models with varying combinations of individual and maternal effects were evaluated using the ABLUP, GBLUP, and ssGBLUP methods. The best model was selected based on a likelihood ratio test. Five-fold cross-validation was used to assess the accuracy and reliability of the genomic estimated breeding values (GEBVs). Birth year and herd significantly affected BW (p < 0.05) and WW, DWG, and YW (p < 0.01), while sex, birth type, and dam age had highly significant effects on all traits (p < 0.01). Model 4, incorporating direct and maternal additive genetic effects, maternal environmental effects, and their covariance, was optimal. Additionally, ssGBLUP achieved the highest GEBV accuracy (0.61–0.70), outperforming the GBLUP and ABLUP methods. Thus, ssGBLUP is recommended for enhancing the genetic progress in IMCGs. Under the best method, the heritability estimates for BW, WW, DWG, and YW were 0.11, 0.25, 0.15, and 0.23, respectively.
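The five-fold cross-validation step mentioned here can be sketched as follows. This is only an illustration under stated assumptions: a ridge regression on SNP genotypes stands in for the BLUP models, accuracy is taken as the Pearson correlation between predictions and phenotypes in each held-out fold, and the names `ridge_predict` and `kfold_accuracy` and the penalty `lam` are hypothetical. The abstract does not specify how the authors defined accuracy or reliability.

```python
import numpy as np

def ridge_predict(X_train, y_train, X_test, lam=1.0):
    """Ridge regression on centred genotypes, solved in dual (n x n) form."""
    mu = y_train.mean()
    col_means = X_train.mean(axis=0)
    Xc = X_train - col_means
    K = Xc @ Xc.T
    alpha = np.linalg.solve(K + lam * np.eye(K.shape[0]), y_train - mu)
    # Marker effects are Xc.T @ alpha; apply them to the centred test genotypes.
    return mu + (X_test - col_means) @ (Xc.T @ alpha)

def kfold_accuracy(X, y, k=5, lam=1.0, seed=0):
    """Mean correlation between predicted and observed values over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accuracies = []
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        pred = ridge_predict(X[train_mask], y[train_mask], X[test_idx], lam)
        accuracies.append(np.corrcoef(pred, y[test_idx])[0, 1])
    return float(np.mean(accuracies))
```

Every individual appears in exactly one validation fold, so each prediction is made by a model that never saw that individual's phenotype.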

(This article belongs to the Topic Advances in Molecular Genetics and Breeding of Cattle, Sheep, and Goats)

17 pages, 973 KiB

Open Access Article

Enhancing Genomic Prediction Accuracy in Beef Cattle Using WMGBLUP and SNP Pre-Selection

by Huqiong Zhao, Xueyuan Xie, Haoran Ma, Peinuo Zhou, Boran Xu, Yuanqing Zhang, Lingyang Xu, Huijiang Gao, Junya Li, Zezhao Wang and Xiaoyan Niu

Agriculture 2025, 15(10), 1094; https://doi.org/10.3390/agriculture15101094 - 19 May 2025

Viewed by 571

Abstract

Genomic selection (GS) plays a crucial role in livestock breeding. However, its implementation in Chinese beef cattle breeding is constrained by a limited reference population and incomplete data records. To address these challenges, this study aimed to identify more effective models for multi-population genomic selection. We simulated five different beef cattle populations and selected three populations with varying levels of kinship to investigate the impact of population relationships on genomic prediction. Utilizing results from a genome-wide association study (GWAS), we preselected different proportions of single nucleotide polymorphisms (SNPs). Subsequently, we performed within-population and multi-population genomic prediction with three models: genomic best linear unbiased prediction (GBLUP), multi-genomic best linear unbiased prediction (MGBLUP), and weighted multi-genomic best linear unbiased prediction (WMGBLUP). Our results showed that increasing the size of the training set improved within-population prediction accuracy. Furthermore, both MGBLUP and WMGBLUP outperformed GBLUP in terms of prediction accuracy for both within-population and multi-population analyses. Among the models evaluated, the WMGBLUP model, which utilized the top 5% of preselected SNPs based on GWAS findings, demonstrated superior performance, yielding an improvement of up to 11.1% in within-population prediction and 16.5% in multi-population prediction. In summary, both WMGBLUP and MGBLUP models exhibit enhanced efficacy in improving genomic prediction accuracy, and the incorporation of GWAS results can further optimize their performance.

(This article belongs to the Section Farm Animal Production)

26 pages, 4120 KiB

Open Access Article

Pleiotropic Genes Affecting Milk Production, Fertility, and Health in Thai-Holstein Crossbred Dairy Cattle: A GWAS Approach

by Akhmad Fathoni, Wuttigrai Boonkum, Vibuntita Chankitisakul, Sayan Buaban and Monchai Duangjinda

Animals 2025, 15(9), 1320; https://doi.org/10.3390/ani15091320 - 2 May 2025

Viewed by 685

Abstract

Understanding the genetic basis of economically important traits is essential for enhancing the productivity, fertility, and health of dairy cattle. This study aimed to identify the pleiotropic genes associated with the 305-day milk yield (MY305), days open (DO), and milk fat-to-protein ratio (FPR) in Thai-Holstein crossbred dairy cattle using a genome-wide association study (GWAS) approach. The dataset included 18,843 records of MY305 and milk FPR, as well as 48,274 records of DO, collected from first-lactation Thai-Holstein crossbred dairy cattle. A total of 868 genotyped animals and 43,284 informative SNPs out of 50,905 were used for the analysis. The single-nucleotide polymorphism (SNP) effects were evaluated using a weighted single-step GWAS (wssGWAS), which estimated these effects based on genomic estimated breeding values (GEBVs) through a multi-trait animal model with single-step genomic BLUP (ssGBLUP). Genomic regions explaining at least 5% of the total genetic variance were selected for candidate gene analysis. Single-step genomic REML (ssGREML) with a multi-trait animal model was used to estimate components of (co)variance. The heritability estimates from additive genetic variance were 0.262 for MY305, 0.029 for DO, and 0.102 for milk FPR, indicating a moderate genetic influence on milk yield and a lower genetic impact on fertility and milk FPR. The genetic correlations were 0.559 (MY305 and DO), −0.306 (MY305 and milk FPR), and −0.501 (DO and milk FPR), indicating potential compromises in genetic selection. wssGBLUP showed a higher accuracy than ssGBLUP, although the improvement was modest. A total of 24, 46, and 33 candidate genes were identified for MY305, DO, and milk FPR, respectively. Pleiotropic effects, identified by SNPs with a significant influence on more than one trait, were observed in 14 genes shared among all three traits, 17 genes common between MY305 and DO, 14 genes common between MY305 and milk FPR, and 26 genes common between DO and milk FPR.
Overall, wssGBLUP is a promising approach for improving the genomic prediction of economic traits in multi-trait analyses, outperforming ssGBLUP. This presents a viable alternative for genetic evaluation in dairy cattle breeding programs in Thailand. However, further studies are needed to validate these candidate genes and refine marker selection for production, fertility, and health traits in dairy cattle.

(This article belongs to the Section Animal Genetics and Genomics)

15 pages, 1266 KiB

Open Access Article

Enhancing Genomic Prediction Accuracy with a Single-Step Genomic Best Linear Unbiased Prediction Model Integrating Genome-Wide Association Study Results

by Zhixu Pang, Wannian Wang, Pu Huang, Hongzhi Zhang, Siying Zhang, Pengkun Yang, Liying Qiao, Jianhua Liu, Yangyang Pan, Kaijie Yang and Wenzhong Liu

Animals 2025, 15(9), 1268; https://doi.org/10.3390/ani15091268 - 29 Apr 2025

Viewed by 558

Abstract

Genomic selection (GS) is a genetic breeding method that uses genome-wide marker information to improve the accuracy of the prediction of complex traits. The single-step GBLUP (ssGBLUP) model, which integrates pedigree, phenotypic, and genomic data, has improved genomic prediction. However, ssGBLUP assumes that all markers contribute equally to genetic variance, which can limit its predictive accuracy, especially for traits controlled by major genes. To overcome this limitation, we integrate results from genome-wide association studies (GWAS) into an enhanced ssGBLUP framework, termed single-step genome-wide association assisted BLUP (ssGWABLUP). Our approach assigns differential weights to markers on the basis of their GWAS results, thereby increasing the contribution of effective markers while diminishing the influence of ineffective ones during the construction of the genomic relationship matrix. By incorporating pseudo quantitative trait nucleotides (pQTNs) as covariates, we aim to capture the effects of markers closely associated with major causal variants, leading to the development of the ssGWABLUP_pQTNs. Compared with weighted ssGBLUP (WssGBLUP), the ssGWABLUP model demonstrated superior accuracy and dispersion across different genetic architectures. We then compared the performance of our proposed ssGWABLUP_pQTNs model against both ssGBLUP and ssGWABLUP across various genetic scenarios. Our results demonstrate that ssGWABLUP_pQTNs outperforms other models in terms of prediction accuracy, particularly in scenarios with simpler genetic architectures. Additionally, evaluation using a pig dataset confirmed the effectiveness of ssGWABLUP_pQTNs, highlighting its potential for practical breeding applications.
The incorporation of pQTNs and a weighted genomic relationship matrix presents a promising and potentially scalable approach to further enhance genomic prediction, with potential implications for improving the accuracy of genomic selection in breeding programs.
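The idea of weighting markers while building the genomic relationship matrix can be illustrated with a generic weighted form of VanRaden's first method, G = Z D Zᵀ / (2 Σ p(1 − p)), where Z holds centred genotypes and D is a diagonal matrix of marker weights. This is a sketch of the general technique, not the ssGWABLUP weighting itself, which the abstract does not spell out. The function name `weighted_grm` and the rescaling of weights to a mean of 1 are assumptions.

```python
import numpy as np

def weighted_grm(genotypes, weights=None):
    """Weighted genomic relationship matrix from 0/1/2 genotype codes.

    genotypes: (individuals x SNPs) array; all SNPs assumed polymorphic.
    weights:   one non-negative weight per SNP (e.g. derived from GWAS);
               None gives the unweighted matrix.
    """
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0              # allele frequency per SNP
    Z = M - 2.0 * p                       # centre each SNP column
    if weights is None:
        weights = np.ones(M.shape[1])
    w = np.asarray(weights, dtype=float)
    w = w * (len(w) / w.sum())            # rescale so the mean weight is 1
    denom = 2.0 * np.sum(p * (1.0 - p))
    return (Z * w) @ Z.T / denom          # Z D Z' / (2 * sum p(1 - p))
```

Rescaling the weights keeps the matrix on the same scale as the unweighted one, so equal weights of any magnitude reproduce the standard relationship matrix.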

(This article belongs to the Section Animal Genetics and Genomics)

18 pages, 2151 KiB

Open Access Article

Genetic Parameter Estimation of Body Weight and Vp_AHPND Resistance in Two Strains of Penaeus vannamei

by Guixian Huang, Jie Kong, Jiteng Tian, Sheng Luan, Mianyu Liu, Kun Luo, Jian Tan, Jiawang Cao, Ping Dai, Guangfeng Qiang, Qun Xing, Juan Sui and Xianhong Meng

Animals 2025, 15(9), 1266; https://doi.org/10.3390/ani15091266 - 29 Apr 2025

Viewed by 408

Abstract

This study evaluated the genetic parameters for growth and Vibrio parahaemolyticus (Vp_AHPND) resistance in both the introduced MK strain and the self-constructed GK strain of Penaeus vannamei, investigating the impact of genotyped female parents on trait estimates under a single-parent nested mating design. A total of 32 families from the MK strain and 44 families from the GK strain were analyzed. Fifty-four female parents from both strains were genotyped using the “Yellow Sea Chip No. 1” containing 10.0 K SNPs. In the MK strain, heritability estimates ranged from 0.439 to 0.458 for body weight (Bw) and from 0.308 to 0.489 for survival time (ST) and survival rates at 36 h (36 SR), 50% mortality (SS₅₀), and 60 h (60 SR). In the GK strain, heritability for Bw ranged from 0.724 to 0.726, while ST, 36 SR, SS₅₀, and 60 SR had heritability estimates between 0.370 and 0.593. Genetic correlations between Bw and ST were 0.601 to 0.622 in the MK strain and 0.742 to 0.744 in the GK strain. For Bw and survival rates, correlations ranged from 0.120 to 0.547 in the MK strain and from 0.426 to 0.906 in the GK strain. The genetic correlation between ST and survival rates was not significantly different from 1 (p > 0.05) in both strains. High Pearson correlations (0.853 to 0.997, p < 0.01) were observed among survival rates at different points. Predictive accuracies for Bw, ST, and survival rates using single-step genomic best linear unbiased prediction (ssGBLUP) were comparable to pedigree-based best linear unbiased prediction (pBLUP) in the MK strain, while in the GK strain, ssGBLUP improved predictive accuracies for Bw, ST, and SS₅₀ by 0.20%, 0.32%, and 0.38%, respectively. The results indicate that both growth and Vp_AHPND resistance have significant breeding potential. 
Although the genetic correlation between weight and resistance varies across different populations, there is a positive genetic correlation between these traits, supporting the feasibility of multi-trait selection. To enhance genetic accuracy, breeding programs should include more genotyped progeny. These findings also suggest that infection frequency and observation time influence resistance performance and breeding selection, emphasizing the need for a tailored resistance evaluation program to improve breeding efficiency and reduce costs.

(This article belongs to the Section Animal Genetics and Genomics)

20 pages, 1177 KiB

Open Access Article

Weighted GBLUP in Simulated Beef Cattle Populations: Impact of Reference Population, Marker Density, and Heritability

by Le Zhou, Lin Zhu, Chencheng Chang, Fengying Ma, Zaixia Liu, Mingjuan Gu, Risu Na and Wenguang Zhang

Animals 2025, 15(8), 1118; https://doi.org/10.3390/ani15081118 - 12 Apr 2025

Viewed by 538

Abstract

Genomic selection (GS) is a technique that integrates genomic data, pedigree information, and individual phenotypes to enhance genetic improvements of economically important traits in livestock. While it has shown significant effects in dairy cattle, its efficacy in beef cattle is lower due to breed diversity and differences in reproductive structures. Therefore, this study evaluated the impact of heritability levels, marker densities, and assessment methods (such as pedigree-based BLUP, genomic BLUP, and weighted genomic BLUP) on genomic prediction accuracy across multiple beef cattle breeds through simulations. Three beef cattle populations were simulated with heritability levels set at 0.3, 0.5, and 0.7 and marker densities set at 50 k and 770 k. The results showed that the predictive accuracy of PBLUP and GBLUP increased with higher heritability and larger reference populations. Increasing the marker density also improved the accuracy of genomic predictions; even a low marker density (50 k SNP) can significantly enhance the accuracy of genetic evaluation, although the size of the reference population needs to be optimized according to population structure, heritability, and the genetic architecture of the trait. Overall, integrating pedigree, genomic, and weighted SNP information can significantly improve the precision of GEBV prediction and reduce bias. In particular, the wGBLUP method demonstrated an improvement in the prediction accuracy of low-heritability traits in small but high-density marker populations.
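The link between heritability and GBLUP predictions can be made concrete with a small sketch. It assumes a single-trait model y = μ + u + e with var(u) = Gσ²ᵤ and var(e) = Iσ²ₑ, treats the heritability h² as known, and uses the variance ratio λ = (1 − h²)/h². The function name `gblup` is hypothetical, and estimating the variance components (for example by REML) is outside the scope of this illustration.

```python
import numpy as np

def gblup(G, y, h2):
    """BLUP of genomic breeding values for known heritability h2.

    G:  genomic relationship matrix (n x n)
    y:  phenotypes (length n)
    Returns the generalized least squares mean and the predicted values u_hat.
    """
    n = len(y)
    lam = (1.0 - h2) / h2                 # sigma_e^2 / sigma_u^2
    V = G + lam * np.eye(n)               # phenotypic covariance up to a scalar
    ones = np.ones(n)
    Vinv_y = np.linalg.solve(V, y)
    Vinv_1 = np.linalg.solve(V, ones)
    mu = (ones @ Vinv_y) / (ones @ Vinv_1)
    u_hat = G @ np.linalg.solve(V, y - mu)
    return mu, u_hat
```

With an identity relationship matrix and h² = 0.5, each deviation from the mean is shrunk by half, which shows directly why lower heritability leads to stronger shrinkage of the predicted breeding values.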

(This article belongs to the Section Cattle)

13 pages, 997 KiB

Open Access Article

Weighted Kernel Ridge Regression to Improve Genomic Prediction

by Chenguang Diao, Yue Zhuo, Ruihan Mao, Weining Li, Heng Du, Lei Zhou and Jianfeng Liu

Agriculture 2025, 15(5), 445; https://doi.org/10.3390/agriculture15050445 - 20 Feb 2025

Cited by 1 | Viewed by 827

Abstract

Nonparametric models have recently been receiving increased attention due to their effectiveness in genomic prediction for complex traits. However, regular nonparametric models cannot effectively differentiate the relative importance of various SNPs, which significantly impedes the further application of these methods for genomic prediction. To enhance the fitting ability of nonparametric models and improve genomic prediction accuracy, a weighted kernel ridge regression model (WKRR) was proposed in this study. For this new method, different weights were assigned to different SNPs according to the p-values from GWAS, and then a KRR model based on these weighted SNPs was constructed for genomic prediction. Cross-validation was further adopted to choose appropriate hyper-parameters during the weighting and prediction process for generalization. We compared the predictive accuracy of WKRR with the genomic best linear unbiased prediction (GBLUP), BayesR, and unweighted KRR using both simulated and real datasets. The results showed that WKRR outperformed unweighted KRR in all simulated scenarios. Additionally, WKRR achieved an average improvement of 1.70% in accuracies across all traits in a mouse dataset and 2.17% for three lactation-related traits in a cattle dataset compared to GBLUP, and yielded competitive results compared to BayesR. These findings demonstrated the great potential of weighted nonparametric models for genomic prediction.
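A weighted kernel ridge regression of the kind described here might be sketched as below. The abstract says only that SNP weights come from GWAS p-values, so several details are assumptions: weights proportional to −log10(p), a Gaussian kernel on weighted squared genotype distances, and the hypothetical names `gwas_weights`, `weighted_gaussian_kernel`, and `wkrr_predict`. The penalty `lam` and `bandwidth` are the hyper-parameters that would be tuned by cross-validation.

```python
import numpy as np

def gwas_weights(p_values):
    """SNP weights proportional to -log10(p), normalised to sum to 1."""
    p = np.clip(np.asarray(p_values, dtype=float), 1e-300, 1.0)
    w = np.maximum(-np.log10(p), 0.0)
    if w.sum() == 0:
        return np.full(len(w), 1.0 / len(w))  # fall back to equal weights
    return w / w.sum()

def weighted_gaussian_kernel(A, B, w, bandwidth):
    """exp(-d2 / bandwidth), where d2 is the weighted squared distance."""
    Aw = A * np.sqrt(w)
    Bw = B * np.sqrt(w)
    d2 = (Aw ** 2).sum(axis=1)[:, None] + (Bw ** 2).sum(axis=1)[None, :] - 2.0 * Aw @ Bw.T
    return np.exp(-np.maximum(d2, 0.0) / bandwidth)

def wkrr_predict(X_train, y_train, X_test, p_values, lam=1.0, bandwidth=1.0):
    """Fit kernel ridge regression on weighted SNPs and predict test phenotypes."""
    w = gwas_weights(p_values)
    mu = y_train.mean()
    K = weighted_gaussian_kernel(X_train, X_train, w, bandwidth)
    alpha = np.linalg.solve(K + lam * np.eye(len(y_train)), y_train - mu)
    return mu + weighted_gaussian_kernel(X_test, X_train, w, bandwidth) @ alpha
```

SNPs with p-values near 1 receive weights near zero and barely affect the kernel, which is how the weighting lets the model concentrate on trait-associated markers.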

(This article belongs to the Topic Application of Reproductive and Genomic Biotechnologies for Livestock Breeding and Selection)

19 pages, 4489 KiB

Open Access Article

Genomic Prediction and Genome-Wide Association Study for Growth-Related Traits in Taiwan Country Chicken

by Tsung-Che Tu, Chen-Jyuan Lin, Ming-Che Liu, Zhi-Ting Hsu and Chih-Feng Chen

Animals 2025, 15(3), 376; https://doi.org/10.3390/ani15030376 - 28 Jan 2025

Cited by 4 | Viewed by 1037

Abstract

Taiwan Country chickens are integral to Taiwanese culture and the poultry industry. When establishing a crossbreeding system, breeders must consider the growth-related traits of the dam line to achieve acceptable traits in commercial meat-type chickens. This study compared the accuracy of genomic estimated breeding values (GEBVs) predicted using the pedigree-based best linear unbiased prediction (PBLUP) model and the single-step genomic BLUP (ssGBLUP) model. Additionally, we conducted a genome-wide association study (GWAS) to identify single-nucleotide polymorphisms (SNPs) associated with growth, shank, and body conformation traits to support marker-assisted selection (MAS). The results showed that the ssGBLUP model achieved 4.3% to 16.4% higher prediction accuracy than the PBLUP model. GWAS identified four missense SNPs and four significant SNPs associated with body weight, shank length, and shank width at 12 weeks. These findings highlight the potential of integrating the ssGBLUP model with identified SNPs to improve genetic gain and breeding efficiency and provide preliminary results to assess the feasibility of genomic prediction and MAS in Taiwan Country chicken breeding programs. Further research is necessary to validate these findings and explore their mechanisms and broader application across different breeding programs, particularly for the NCHU-G101 breed of Taiwan Country chickens.

(This article belongs to the Section Animal Genetics and Genomics)

14 pages, 973 KiB

Open Access Article

Optimizing Breeding Strategies for Pekin Ducks Using Genomic Selection: Genetic Parameter Evaluation and Selection Progress Analysis in Reproductive Traits

by Jun Zhou, Jiang-Zhou Yu, Mei-Yi Zhu, Fang-Xi Yang, Jin-Ping Hao, Yong He, Xiao-Liang Zhu, Zhuo-Cheng Hou and Feng Zhu

Appl. Sci. 2025, 15(1), 194; https://doi.org/10.3390/app15010194 - 29 Dec 2024

Cited by 2 | Viewed by 1197

Abstract

Reproductive performance is an important trait in poultry production. Traditional methods of improving reproductive traits can only use recorded information from females, making it difficult to effectively assess the reproductive potential of males. Although genomic selection is thought to remedy this shortcoming, most studies now use simulated data or one or two generations of data to assess its effects. Also, the effectiveness of genomic selection for use in the improvement of reproductive traits in ducks has hardly been reported. In this study, data from four consecutive generations of Pekin duck populations were used to assess the effect of genomic selection on reproductive trait improvement. Whole-genome resequencing was performed for genotyping, and pedigree and SNP genetic parameters were evaluated. Using the BLUP (Best Linear Unbiased Prediction), GBLUP (Genomic Best Linear Unbiased Prediction), and ssGBLUP (Single-step Genomic Best Linear Unbiased Prediction) models, we assessed selection progress for body weight at 6 weeks, age at first egg, and egg number from 25 to 44 weeks over multiple generations. Ten-fold cross-validation was used to evaluate the genomic prediction performance. The results indicated that the heritability of growth traits decreased after routine selection, while reproductive and egg quality traits maintained moderate heritability (0.2–0.4). Selection progress showed a one-day advancement in age at first egg and an increase of one egg per generation from the 13th to 15th generations. The GBLUP model significantly outperformed BLUP, but ssGBLUP showed minimal improvement due to comprehensive genotyping. In conclusion, this study provides crucial insights for optimizing breeding strategies and improving economic efficiency in Pekin duck breeding.

(This article belongs to the Section Agricultural Science and Technology)

12 pages, 5164 KiB

Open Access Article

Comparative Analysis of Genomic Prediction for Production Traits Using Genomic Annotation and a Genome-Wide Association Study at Sequencing Levels in Beef Cattle

by Zhida Zhao, Qunhao Niu, Tianyi Wu, Feng Liu, Zezhao Wang, Huijiang Gao, Junya Li, Bo Zhu and Lingyang Xu

Agriculture 2024, 14(12), 2255; https://doi.org/10.3390/agriculture14122255 - 10 Dec 2024

Viewed by 1116

Abstract

Leveraging whole-genome sequencing (WGS) that includes the full spectrum of genetic variation provides a better understanding of the biological mechanisms involved in the economically important traits of farm animals. However, the effectiveness of WGS in improving the accuracy of genomic prediction (GP) is limited. Recent genetic analyses of complex traits, such as genome-wide association studies (GWAS), have identified numerous genomic regions and potential genes, which can provide valuable prior information for the improvement of genomic selection (GS). In this study, we applied different genomic prediction methods to integrate GWAS results and gene feature annotations, which significantly improved the accuracy of GS for beef production traits. The Bayesian models incorporating genomic features showed the highest prediction accuracy, particularly for average daily gain (ADG) and bone weight (BW). Compared to prediction models based on WGS data, GP including biological priors can improve the prediction accuracy by up to 11.56% for ADG and 14.60% for BW. Also, GP using GBLUP and Bayesian methods integrating biological priors for single-trait GWAS can significantly increase the prediction accuracy. Bayesian methods generally outperformed GBLUP models, with average improvements of 2.25% for ADG, 5.04% for BW, and 3.44% for live weight (LW). Our results indicate that leveraging biological prior knowledge can significantly refine GS models and underline the potential of combining WGS data with biological prior knowledge to further enhance the breeding process.

(This article belongs to the Special Issue Advances in the Genetic Improvement of Farm Animals Using Genomic Tools)

12 pages, 2898 KiB

Open Access Article

Integrating Gene Expression Data into Single-Step Method (ssBLUP) Improves Genomic Prediction Accuracy for Complex Traits of Duroc × Erhualian F₂ Pig Population

by Fangjun Xu, Zhaoxuan Che, Jiakun Qiao, Pingping Han, Na Miao, Xiangyu Dai, Yuhua Fu, Xinyun Li and Mengjin Zhu

Curr. Issues Mol. Biol. 2024, 46(12), 13713-13724; https://doi.org/10.3390/cimb46120819 - 3 Dec 2024

Viewed by 996

Abstract

The development of multi-omics has increased the likelihood of further improving genomic prediction (GP) of complex traits. Gene expression data can directly reflect the genotype effect, and thus, they are widely used for GP. Generally, the gene expression data are integrated into multiple random effect models as independent data layers or used to replace genotype data for genomic prediction. In this study, we integrated pedigree, genotype, and gene expression data into the single-step method and investigated the effects of this integration on prediction accuracy. The integrated single-step method improved the genomic prediction accuracy of more than 90% of the 54 traits in the Duroc × Erhualian F₂ pig population dataset. On average, the prediction accuracy of the single-step method integrating gene expression data was 20.6% and 11.8% higher than that of the pedigree-based best linear unbiased prediction (ABLUP) and genome-based best linear unbiased prediction (GBLUP) when the weighting factor (w) was set as 0, and it was 5.3% higher than that of the single-step best linear unbiased prediction (ssBLUP) under different w values. Overall, the analyses confirmed that the integration of gene expression data into a single-step method could effectively improve genomic prediction accuracy. Our findings enrich the application of multi-omics data to genomic prediction and provide a valuable reference for integrating multi-omics data into the genomic prediction model.

(This article belongs to the Section Biochemistry, Molecular and Cellular Biology)

13 pages, 2102 KiB

Open Access Article

Optimizing Genomic Selection Methods to Improve Prediction Accuracy of Sugarcane Single-Stalk Weight

by Zihao Wang, Chengcai Xia, Yanjie Lu, Qi Liu, Meiling Zou, Fenggang Zan and Zhiqiang Xia

Agronomy 2024, 14(12), 2842; https://doi.org/10.3390/agronomy14122842 - 28 Nov 2024

Viewed by 968

Abstract

Sugarcane (Saccharum spp. hybrids), serving as a vital sugar and energy crop, holds immense development potential on a global scale. In the process of sugarcane breeding and variety improvement, single-stalk weight stands as a crucial selection criterion. By cultivating sugarcane varieties with heavier single stalks, robust growth, high yields, and superior quality, the planting efficiency and market competitiveness of sugarcane can be further enhanced. Single-stalk weight was determined by measuring individual stalks three times in the field, calculating the average value as the phenotypic expression. The distribution of single-stalk weights in the orthogonal and reciprocal populations revealed coefficients of variation of 19.3% and 17.7%, respectively, with the reciprocal population showing greater genetic stability. After rigorous filtering of Hyper_seq_FD sequencing data from 409 sugarcane samples, we identified 31,204 high-quality single-nucleotide polymorphisms (SNPs) evenly distributed across all 32 chromosomes, providing a comprehensive representation of the sugarcane genome. In this study, we evaluated the predictive performance of various genomic selection (GS) methods for single-stalk weight in the orthogonal population of 299 individuals, with the male parent being GZ_73-204 and the female parent being GZ_P72-1210, and in the reciprocal population of 108 individuals, with the male parent being GZ_P72-1210 and the female parent being GZ_73-204. Initially, we compared the performance of five prediction approaches: genomic best linear unbiased prediction (GBLUP), single-step genomic best linear unbiased prediction (SSBLUP), Bayes A, machine learning (ML), and deep learning (DL). The results showed that the GBLUP model had the highest prediction accuracy, at 0.35, while the deep learning model had the lowest accuracy, at 0.20.
To improve prediction accuracy, we assigned different scores to various regions of the sugarcane genome based on gene annotation information, thereby giving different weights to SNPs located in these regions. Additionally, we incorporated inbred and outbred populations as fixed effects into the model. The optimized SSBLUP model achieved a prediction accuracy of 0.44, which was a 17% improvement over the original SSBLUP model and a 9% increase compared to the originally optimal GBLUP model. The research results indicate that it is crucial to fully consider genomic structural regions, population structure characteristics, and fixed effects in GS predictions. Full article

(This article belongs to the Section Crop Breeding and Genetics)

13 pages, 2243 KiB

Open AccessArticle

Enhancing Across-Population Genomic Prediction for Maize Hybrids

by Guangning Yu, Furong Li, Xin Wang, Yuxiang Zhang, Kai Zhou, Wenyan Yang, Xiusheng Guan, Xuecai Zhang, Chenwu Xu and Yang Xu

Plants 2024, 13(21), 3105; https://doi.org/10.3390/plants13213105 - 4 Nov 2024

Viewed by 1424

Abstract

In crop breeding, genomic selection (GS) serves as a powerful tool for predicting unknown phenotypes by using genome-wide markers, aimed at enhancing genetic gain for quantitative traits. However, in practical applications of GS, predictions are not always made within populations or for individuals that are genetically similar to the training population. Therefore, exploring possibilities and effective strategies for across-population prediction becomes an attractive avenue for applying GS technology in breeding practices. In this study, we used an existing maize population of 5820 hybrids as the training population to predict another population of 523 maize hybrids using the GBLUP and BayesB models. We evaluated the impact of optimizing the training population based on the genetic relationship between the training and breeding populations on the accuracy of across-population predictions. The results showed that the prediction accuracy improved to some extent with varying training population sizes. However, the optimal size of the training population differed for various traits. Additionally, we proposed a population structure-based across-population genomic prediction (PSAPGP) strategy, which integrates population structure as a fixed effect in the GS models. Principal component analysis, clustering, and Q-matrix analysis were used to assess the population structure. Notably, when the Q-matrix was used, the across-population prediction exhibited the best performance, with improvements ranging from 8 to 11% for ear weight, ear grain weight and plant height. This is a promising strategy for reducing phenotyping costs and enhancing maize hybrid breeding efficiency. Full article

(This article belongs to the Special Issue Maize Cultivation and Improvement)

9 pages, 266 KiB

Open AccessArticle

Machine Learning for the Genomic Prediction of Growth Traits in a Composite Beef Cattle Population

by El Hamidi Hay

Animals 2024, 14(20), 3014; https://doi.org/10.3390/ani14203014 - 18 Oct 2024

Cited by 3 | Viewed by 2087

Abstract

The adoption of genomic selection is prevalent across various plant and livestock species, yet existing models for predicting genomic breeding values often remain suboptimal. Machine learning models present a promising avenue to enhance prediction accuracy due to their ability to accommodate both linear and non-linear relationships. In this study, we evaluated four machine learning models—Random Forest, Support Vector Machine, Convolutional Neural Networks, and Multi-Layer Perceptrons—for predicting genomic values related to birth weight (BW), weaning weight (WW), and yearling weight (YW), and compared them with other conventional models—GBLUP (Genomic Best Linear Unbiased Prediction), Bayes A, and Bayes B. The results demonstrated that the GBLUP model achieved the highest prediction accuracy for both BW and YW, whereas the Random Forest model exhibited a superior prediction accuracy for WW. Furthermore, GBLUP outperformed the other models in terms of model fit, as evidenced by the lower mean square error values and regression coefficients of the corrected phenotypes on predicted values. Overall, the GBLUP model delivered a superior prediction accuracy and model fit compared to the machine learning models tested. Full article

(This article belongs to the Section Animal Genetics and Genomics)
