A Unified Comparative Evaluation of Genomic Prediction Models Across Four Aquaculture Species
Abstract
1. Introduction
2. Materials and Methods
2.1. Population and Phenotypes
2.2. SNP Detection, Quality Control and Principal Component Analysis
2.3. Genomic Prediction Models
2.3.1. Genomic Best Linear Unbiased Prediction (GBLUP)
2.3.2. Bayesian Models
2.3.3. Machine Learning and Regularized Regression Models
Elastic Net (EN)
Nonlinear Machine Learning Models
2.4. Incremental Feature Selection Based on GWAS
2.5. Genomic Prediction Performance Evaluation
3. Results
3.1. Population Genetic Structure
3.2. Substantial Variation in Genomic Prediction Accuracy Across Species and Models
3.3. Assessment of Prediction Bias Across Species and Models
3.4. Optimization of Genomic Prediction Through Incremental Feature Selection
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| XGB | eXtreme Gradient Boosting |
| MAE | Mean Absolute Error |
| MSE | Mean Squared Error |
| GBLUP | Genomic Best Linear Unbiased Prediction |
| SVR | Support Vector Regression |
| KRR | Kernel Ridge Regression |
| RF | Random Forest |

| Species | Trait | N-obs | Mean ± SD | QC SNPs | Heritability ± SE |
|---|---|---|---|---|---|
| Atlantic salmon | Mean gill score | 1481 | 2.79 ± 0.85 | 10,383 | 0.25 ± 0.06 |
| Gilthead sea bream | Number of days to death | 777 | 10.34 ± 4.09 | 8545 | 0.12 ± 0.06 |
| Common carp | Body weight | 1214 | 16.32 ± 4.58 | 8531 | 0.26 ± 0.06 |
| Rainbow trout | Number of days to death | 749 | 51.47 ± 13.98 | 37,958 | 0.50 ± 0.06 |
| Species | Methods | Optimal Hyperparameters |
|---|---|---|
| Atlantic salmon | SVR | C: 1, gamma: auto, kernel: poly |
| | KRR | alpha: 10, gamma: 0.0001, kernel: poly |
| | RF | Max depth: 20, min samples leaf: 1, min samples split: 2, n estimators: 500 |
| | XGB | Learning rate: 0.01, max depth: 5, n estimators: 500, subsample: 0.6 |
| | Elastic Net | alpha: 0.1, l1_ratio: 0.5, max iter: 1000 |
| Gilthead sea bream | SVR | C: 10, gamma: scale, kernel: rbf |
| | KRR | alpha: 100, gamma: 0.001, kernel: poly |
| | RF | Max depth: 30, min samples leaf: 4, min samples split: 2, n estimators: 100 |
| | XGB | Learning rate: 0.01, max depth: 5, n estimators: 500, subsample: 0.6 |
| | Elastic Net | alpha: 1, l1_ratio: 0.1, max iter: 1000 |
| Common carp | SVR | C: 10, gamma: scale, kernel: rbf |
| | KRR | alpha: 1, gamma: 0.0001, kernel: rbf |
| | RF | Max depth: 30, min samples leaf: 4, min samples split: 2, n estimators: 200 |
| | XGB | Learning rate: 0.01, max depth: 7, n estimators: 500, subsample: 0.6 |
| | Elastic Net | alpha: 1, l1_ratio: 0.1, max iter: 1000 |
| Rainbow trout | SVR | C: 100, gamma: scale, kernel: rbf |
| | KRR | alpha: 10, gamma: 0.0001, kernel: poly |
| | RF | Max depth: 20, min samples leaf: 4, min samples split: 5, n estimators: 100 |
| | XGB | Learning rate: 0.01, max depth: 3, n estimators: 500, subsample: 0.8 |
| | Elastic Net | alpha: 1, l1_ratio: 0.5, max iter: 1000 |
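
The Atlantic salmon rows of the hyperparameter table can be read as constructor arguments. The sketch below assumes the models were fitted with scikit-learn and XGBoost, which the parameter names suggest but the table itself does not state. The dictionary name `ATLANTIC_SALMON` and the helper `build_models` are hypothetical and serve only as illustration.

```python
# Tuned values for Atlantic salmon, copied from the hyperparameter table.
# Key spellings follow scikit-learn / XGBoost argument names (an assumption).
ATLANTIC_SALMON = {
    "SVR": {"C": 1, "gamma": "auto", "kernel": "poly"},
    "KRR": {"alpha": 10, "gamma": 0.0001, "kernel": "poly"},
    "RF": {"max_depth": 20, "min_samples_leaf": 1,
           "min_samples_split": 2, "n_estimators": 500},
    "XGB": {"learning_rate": 0.01, "max_depth": 5,
            "n_estimators": 500, "subsample": 0.6},
    "EN": {"alpha": 0.1, "l1_ratio": 0.5, "max_iter": 1000},
}


def build_models(params):
    """Instantiate one unfitted regressor per method (hypothetical helper)."""
    # Imported here so the table above can be inspected without the libraries.
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.kernel_ridge import KernelRidge
    from sklearn.linear_model import ElasticNet
    from sklearn.svm import SVR

    models = {
        "SVR": SVR(**params["SVR"]),
        "KRR": KernelRidge(**params["KRR"]),
        "RF": RandomForestRegressor(**params["RF"]),
        "EN": ElasticNet(**params["EN"]),
    }
    try:
        from xgboost import XGBRegressor
        models["XGB"] = XGBRegressor(**params["XGB"])
    except ImportError:
        pass  # xgboost is optional in this sketch
    return models
```

Calling `build_models(ATLANTIC_SALMON)` returns unfitted estimators, each of which would then be trained with `fit(X, y)` on a genotype matrix and a phenotype vector. The other three species differ only in the values placed in the dictionary.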
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Zhang, J.; Yang, X.; Wang, W.; Hu, H.; Xu, S.; Song, H. A Unified Comparative Evaluation of Genomic Prediction Models Across Four Aquaculture Species. Fishes 2026, 11, 115. https://doi.org/10.3390/fishes11020115
