Article

A Self-Supervised Pre-Trained Transformer Model for Accurate Genomic Prediction of Swine Phenotypes

1 Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
2 Key Laboratory of Agricultural Big Data, Ministry of Agriculture and Rural Affairs, Beijing 100081, China
* Author to whom correspondence should be addressed.
Animals 2025, 15(17), 2485; https://doi.org/10.3390/ani15172485
Submission received: 4 July 2025 / Revised: 20 August 2025 / Accepted: 21 August 2025 / Published: 24 August 2025
(This article belongs to the Section Pigs)

Simple Summary

Predicting complex genetic traits is essential for improving swine breeding programs, but traditional statistical methods have difficulty capturing the non-additive genetic effects that contribute to these traits. This study introduces a novel deep learning framework, based on a Transformer model, to more accurately predict swine phenotypes. The model first learns the fundamental patterns of the pig genome from genetic data and is then fine-tuned to predict key economic traits. Our results show that this method outperforms existing approaches such as GBLUP. This enhanced accuracy provides breeders with a powerful tool for selecting superior animals, potentially accelerating genetic gain and delivering substantial economic benefits to the swine industry.

Abstract

Accurate genomic prediction of complex phenotypes is crucial for accelerating genetic progress in swine breeding. However, conventional methods such as Genomic Best Linear Unbiased Prediction (GBLUP) face limitations in capturing complex non-additive effects that contribute significantly to phenotypic variation, restricting the potential accuracy of phenotype prediction. To address this challenge, we introduce a novel framework based on a self-supervised, pre-trained encoder-only Transformer model. Its core novelty lies in tokenizing SNP sequences into non-overlapping 6-mers (sequences of 6 SNPs), enabling the model to directly learn local haplotype patterns instead of treating SNPs as independent markers. The model first undergoes self-supervised pre-training on the unlabeled version of the same SNP dataset used for subsequent fine-tuning, learning intrinsic genomic representations through a masked 6-mer prediction task. Subsequently, the pre-trained model is fine-tuned on labeled data to predict phenotypic values for specific economic traits. Experimental validation demonstrates that our proposed model consistently outperforms baseline methods, including GBLUP and a Transformer of the same architecture trained from scratch (without pre-training), in prediction accuracy across key economic traits. This advantage suggests that the model captures non-linear genetic signals missed by linear models. This research contributes not only a new, more accurate methodology for genomic phenotype prediction but also validates the potential of self-supervised learning to decipher complex genomic patterns for direct application in breeding programs. Ultimately, this approach offers a powerful new tool to enhance the rate of genetic gain in swine production by enabling more precise selection based on predicted phenotypes.
Keywords: genomic prediction; phenotype prediction; transformer; self-supervised learning; non-additive effects; swine
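
Because only the abstract is available here, the following is a minimal, illustrative sketch (not the authors' released code) of the two ideas it describes: tokenizing a SNP genotype sequence into non-overlapping 6-mers, and pre-training an encoder-only Transformer with a masked 6-mer prediction objective before attaching a regression head for phenotype fine-tuning. The 0/1/2 genotype coding, vocabulary construction, mean pooling, 15% masking ratio, and all hyperparameters are assumptions made for illustration.

```python
# Illustrative sketch only; names, coding, and hyperparameters are assumptions.
import itertools
import torch
import torch.nn as nn

K = 6  # non-overlapping 6-mers: each token covers 6 consecutive SNPs coded 0/1/2

# Vocabulary: every possible 6-SNP genotype pattern plus special tokens.
SPECIALS = ["[PAD]", "[MASK]"]
KMERS = ["".join(p) for p in itertools.product("012", repeat=K)]  # 3**6 = 729 patterns
TOKEN2ID = {tok: i for i, tok in enumerate(SPECIALS + KMERS)}

def tokenize(genotypes):
    """Split a 0/1/2-coded SNP vector into non-overlapping 6-mer token ids."""
    s = "".join(str(int(g)) for g in genotypes)
    s = s[: len(s) - len(s) % K]                       # drop a trailing incomplete k-mer
    return [TOKEN2ID[s[i:i + K]] for i in range(0, len(s), K)]

class SNPTransformer(nn.Module):
    """Encoder-only Transformer with a masked-token head (pre-training)
    and a pooled regression head (phenotype fine-tuning)."""
    def __init__(self, vocab_size, d_model=128, n_heads=4, n_layers=4, max_len=2048):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, 4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.mlm_head = nn.Linear(d_model, vocab_size)  # predicts the masked 6-mer
        self.reg_head = nn.Linear(d_model, 1)           # predicts the phenotype

    def forward(self, ids):
        pos = torch.arange(ids.size(1), device=ids.device).unsqueeze(0)
        return self.encoder(self.tok_emb(ids) + self.pos_emb(pos))

    def mlm_logits(self, ids):
        return self.mlm_head(self.forward(ids))

    def predict_phenotype(self, ids):
        return self.reg_head(self.forward(ids).mean(dim=1)).squeeze(-1)

def mask_tokens(ids, mask_id, p=0.15):
    """BERT-style masking: hide a fraction of 6-mer tokens; loss is computed only on them."""
    ids, labels = ids.clone(), ids.clone()
    mask = torch.rand(ids.shape) < p
    labels[~mask] = -100                  # positions ignored by CrossEntropyLoss
    ids[mask] = mask_id
    return ids, labels

if __name__ == "__main__":
    # Toy data: 8 animals x 600 random genotypes (illustration only, not real SNP data).
    genos = torch.randint(0, 3, (8, 600))
    ids = torch.tensor([tokenize(row) for row in genos])          # (8, 100) token ids
    model = SNPTransformer(vocab_size=len(TOKEN2ID))

    # Self-supervised pre-training step: masked 6-mer prediction.
    masked, labels = mask_tokens(ids, TOKEN2ID["[MASK]"])
    mlm_loss = nn.CrossEntropyLoss()(model.mlm_logits(masked).transpose(1, 2), labels)

    # Fine-tuning step: regress phenotypes (random here) with the same encoder.
    phenos = torch.randn(8)
    reg_loss = nn.MSELoss()(model.predict_phenotype(ids), phenos)
    print(float(mlm_loss), float(reg_loss))
```

In practice, the pre-training loop would run over the unlabeled SNP data until the masked-prediction loss converges, after which the same encoder weights are reused and only the regression objective is optimized on the labeled animals; the details of that schedule are described in the full article.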

