Genome-Wide Association Study Reveals Additive and Non-Additive Effects on Growth Traits in Duroc Pigs

Growth rate plays a critical role in the pig industry and is related to quantitative traits controlled by many genes. Here, we aimed to identify causative mutations and candidate genes responsible for pig growth traits. In this study, 2360 Duroc pigs were used to detect significant additive, dominance, and epistatic effects associated with growth traits. As a result, a total number of 32 significant SNPs for additive or dominance effects were found to be associated with various factors, including adjusted age at a specified weight (AGE), average daily gain (ADG), backfat thickness (BF), and loin muscle depth (LMD). In addition, the detected additive significant SNPs explained 2.49%, 3.02%, 3.18%, and 1.96% of the deregressed estimated breeding value (DEBV) variance for AGE, ADG, BF, and LMD, respectively, while significant dominance SNPs could explain 2.24%, 13.26%, and 4.08% of AGE, BF, and LMD, respectively. Meanwhile, a total of 805 significant epistatic effects SNPs were associated with one of ADG, AGE, and LMD, from which 11 sub-networks were constructed. In total, 46 potential genes involved in muscle development, fat deposition, and regulation of cell growth were considered as candidates for growth traits, including CD55 and NRIP1 for AGE and ADG, TRIP11 and MIS2 for BF, and VRTN and ZEB2 for LMD, respectively. Generally, in this study, we detected both new and reported variants and potential candidate genes for growth traits of Duroc pigs, which might to be taken into account in future molecular breeding programs to improve the growth performance of pigs.


Introduction
Pork is a major meat resource for humans, constituting over 31% of all meat consumed worldwide in 2020 (https://www.fao.org/3/cb1993en/cb1993en_meat.pdf, accessed on 20 July 2022). The huge demand for pork has greatly increased the global production of pork. There is a positive correlation between pig growth rate and meat production, which can be directly used to improve meat production in pig breeding [1]. In pig breeding programs, traits such as adjusted age at a specified weight (AGE), average daily gain (ADG), backfat thickness (BF), and loin muscle depth (LMD) are usually used to measure growth rate and carcass performance, as they are direct indicators of productivity [2]. Thus, understanding the genetic mechanisms of these traits might provide valuable information for marker-assisted selection in pigs. Fortunately, all the growth traits mentioned above have moderate to high heritabilities [3], which indicates that an effective breeding method could directly improve these traits.
The economic traits of animals are often controlled by multiple loci with additive, dominance, epistasis, and environmental interaction effects. Dominance refers to a relationship between two alleles or variants of the same gene, whereas epistasis refers to a relationship between alleles of two different genes. Both of them are known as non-additive effects which result in non-linear effects that control variation in phenotypes. Identification of non-additive effects associated with economic traits is an effective approach to understand the genetic architecture that underlies the complex variation of traits [4,5].
With the advance of whole-genome sequencing and high-throughput genotyping technology, genome-wide association study (GWAS), as effective methods for detecting causative mutations associated with economic traits, have been more and more widely used in humans [6] and domestic animals [7,8]. Several GWASs have been performed to search for QTL and candidate genes for growth traits in many species [9][10][11], especially in pigs [2,[12][13][14]. However, most studies have focused on additive genetic effects and ignored non-additive effects. Interestingly, significant dominance effects accounted for 6% of the total phenotypic variance in average daily gain, which were detached in a Duroc population, emphasizing the importance of non-additive genetic effects [15]. Meanwhile, it has been suggested that the effects of dominance and/or epistasis should not be ignored in predicting phenotypes [5,16], as these would provide new insights into the genetic architecture of complex economic traits.
Duroc boars, due to their maximized production performance, are commonly used as terminal boars in modern three-way crossbreeding production systems (Duroc × Landrace × Yorkshire, DLY), which fully utilize heterosis from both maternal and paternal sides. This is the most efficient way to boost the genetic improvement of growth traits, and this study aimed to detect additive and non-additive effects on growth traits to find causative variants and potential candidate genes for growth-related traits.

Animals and Phenotype Data
The data used in this study were all from a nucleus breeding farm in the Jiangxi province of China. After data cleaning, a total number of 2360 Duroc pigs born from 2014 to 2021, including 1559 males and 801 females, were used in our study. Pedigrees can be traced back for three generations. Regarding phenotypes, there were 1854 records for AGE and ADG, 1847 records for BF, and 1651 records for LMD, respectively. ADG and AGE for each individual was calculated with information provided by automated feeding stations. Daily weights were collected from 30 to 100 kg of body weight, and they were used to calculate AGE and ADG. Phenotypes of BF and LMD were measured individually alive at around 100 kg body weight using Aloka 500 V real-time B-mode ultrasound, according to the Chinese national pig performance testing standard. The ultrasound images were taken from the dorsum at a distance of 5 cm from the dorsal midline in the middle of the last third and fourth ribs of each pig. All phenotypes in this study were adjusted to 100 kg using the formulas shown below: (1) AGE: where AGE test represents measured age, wt represents measured weight, and A is the correction coefficient for sire and dam; A sire = 50.775 and A dam = 46.415.
(2) BF: where BF test represents measured BF, wt represents measured weight, and B is the correction coefficient for sire and dam; B sire = −7.277 and B dam = −9.440.
(4) LMD where LMD test represents measured LMD, wt represents measured weight, and C and D are the correction coefficients for boars and dam respectively; C sire = 50.52, C dam = 52.01, and D = 0.228. Descriptive information for the phenotypes is presented in Table 1.

Genotype Data and Quality Control
A total number of 2360 Duroc pigs born from 2016 to 2022 were genotyped using Illumina Porcine SNP50 BeadChip (Illumina, San Diego, USA). Data quality control was conducted using the PLINK (version: 1.90, Christopher C Chang, San Jose, USA) software [17]. The genotype data were filtered by the following procedures: (1) individuals with call rate < 0.95, (2) SNP call rate < 0.95, (3) minor allele frequency < 0.01, (4) Hardy-Weinberg equilibrium p-value < 10 −6 , and (5) SNPs located in sex chromosomes or unmapped to the reference genome. After the quality control, there were 31,372 high-quality SNPs and 2360 pigs left. Then, missing genotypes were further imputed by Beagle v5.2 software [18].

Statistical Analyses
The estimated breeding values (EBV) for AGE, ADG, BF, and LMD traits were estimated according to average information restricted maximum likelihood using the pedigreebased best linear unbiased prediction (PBLUP) method and the DMUAI module of DMU software [19], as described below: where y c was the vector of corrected phenotypic values; X and Z were the incidence matrices of b and a, respectively; b was the vector of fixed effects, including farm, sex, year-season of birth; and e was the vector of residuals with a normal distribution of e ∼ N 0, Iσ 2 e , I being an identity matrix and σ 2 e the residual variance. In the PBLUP model, a was the vector of additive genetic effects with a normal distribution of a ∼ N 0, Aσ 2 a , σ 2 a being additive genetic variance and A being a pedigree-based additive genetic relationship matrix. Subsequently, the deregressed EBVs (DEBVs) were obtained with the formula DEBV = g i /r 2 i , g i being the EBV of the ith individual and r 2 i being the square of estimated accuracies for the ith individual [20], which were calculated using the blupADC [21] R package.
Then, a linear mixed model, including additive and non-additive SNP effects, was generated by GMAT software [22], which can be written as: y dEBV = 1 n µ + ZM a a a + ZM d a d + ZM aa a aa + ZM ad a ad + ZM dd a dd + e (6) where y dEBV is the vector of DEBV; 1 n is a vector of ones; µ is the overall mean; vectors a a , a d , a aa , a ad , and a dd are additive, dominance, additive-by-additive (A × A), additiveby-dominance (A × D), and dominance-by-dominance (D × D) SNP effects, respectively; vector e is a collection of residual effects; Z is a design matrix for all the random model effects; and M a , M d , M aa , M ad , and M dd are additive, dominance, A × A, A × D, and D × D SNP matrixes, respectively. Then, Bonferroni correction was implemented to define the significant threshold. To avoid missing the true hints of linkage, the genome-wide significant and suggestive levels were set as p = 0.05/N = 1.59 × 10 −6 and p = 1/N = 3.19 × 10 −5 , respectively, where N is the number of analyzed SNPs. The proportion of DEBV variance explained by significant SNPs was calculated according to [23]. The p-values of the epistatic effects were Bonferroni-corrected for multiple testing, which resulted in p < 1.03 × 10 −9 as a significance threshold.

SNP-SNP Network and Linkage Disequilibrium (LD) Analysis
Interaction effects between every two SNPs across the whole genome for growth traits were calculated using GMAT (version: 1.01, Chao Ning, Beijing China) software [22]. Then, the SNP-SNP networks with significant epistatic effects for the studied traits were drawn using the epiNet option within epiSNP (version: 4.2, Yang Da, St. Paul, MN, USA) software [24].
The solid spine method in Haploview software (version: 4.2, Mark Daly, Cambridge, UK) [25] was used to predict the independent LD block.

Identification of Candidate Genes
Functional genes within 1 Mbp centered on each significant SNP were annotated as candidate genes. Identified potential genes were searched using the Ensemble Sus scrofa 11.1 reference genome [26]. Pig QTLdb [27] was used to annotate significant SNPs located in previously mapped QTLs in pigs. Additionally, the GeneCards [28] and NCBI [29] databases were used to query gene functions and determine promising candidates.

Additive Effects
A total number of 10 SNPs that surpassed the suggestive significance level were detected as being associated with one of the studied traits (Table 2); a Manhattan plot of the −log 10 (p-values) of SNP additive effects is shown in Figure 1. Notice that three SNPs are significantly associated with more than one growth trait, indicating that they exert pleiotropic effects on multiple growth traits. These significant SNPs explained 2.49%, 3.02%, 3.18%, and 1.96% DEBV variance for ADG, AGE, BF, and LMD, respectively. When comparing previously reported QTLs, eight of ten significant SNPs were located in the reported QTL regions (Table S1). The remaining two additive SNPs, CNCB10001620 (Sus scrofa chromosome (SSC) 2: 20310474) and CNCB10006791 (SSC9: 67900715), have not been reported previously in the literature and are likely to be potential novel loci for growth traits in pigs.

Dominance Effects
In summary, we also detected 22 SNPs associated with one of the tested studied traits (Table 3), and the p-values of the GWAS results were visualized using Manhattan plots and Q-Q plots (Figure 2). These significant SNPs explained 2.24%, 13.26%, and 4.08% DEBV variance for AGE, BF, and LMD, respectively. When comparing reported QTLs, 16 significant SNPs were located in the identified QTL regions (Table S2). The remaining six dominance SNPs have not been reported in pigs and are likely to be potential novel loci for growth traits. Additionally, we detected a haplotype block that spanned 492 kb ( Figure S1) in BF containing 11 significant SNPs.  traits. The X-axis shows the physical position of SNPs on each chromosome; the Y-axis shows the significance levels (−log 10 p-values). The solid line indicates genome-wide significance (p < 1.59 × 10 −6 ); the dashed line shows suggestive significance (p < 3.19 × 10 −5 ). The Q-Q plots show the observed vs. expected log p-values.

Epistatic Analysis
After cleaning, 287, 355, and 163 pairs of SNPs were significantly associated with ADG, AGE, and LMD, respectively (Table 4), while for BF, no significant SNP pairs were detached. Of these significant SNP-SNP interactions, 43.21-49.08% exhibited an A × A interaction, 32.52-38.03% exhibited an A × D interaction, and 17.18-23.00% exhibited a D × D interaction. To investigate the complicated mechanism of epistasis on growth traits, SNP-SNP networks were constructed for each trait by EPISNP3 [24]. The epistatic interaction subnetworks that contained more than three nodes are shown in Supplementary Figures S2-S9. There were three, five, and three networks detected for ADG, AGE, and LMD, respectively.
For ADG, sub-network 1 was the largest one and contained 28 pairs of SNP-SNP interactions between SSC14 and SSC15, which indicated interactions between the two chromosomes ( Figure 3). These interactions detected between SSC14 and SSC15 involved five SNPs on SSC14 and eight SNPs on SSC15, which spanned 192 kb (from 14.93 Mbp to 15.12 Mbp) and 422 kb (from 65.71 Mbp to 66.13 Mbp), respectively. According to gene annotation, we found that there were five genes in the 192 kb region on SSC14 and two genes in the 422 kb region on SSC15. Several SNPs in the same LD block on one chromosome appeared in sub-networks 2, 3, 4, and 5, which interacted with a single SNP on another chromosome. More details can be found in Figures S2-S5. Interestingly, for AGE, sub-network 1 contained 28 pairs of SNP-SNP interaction effects between SSC14 and SSC15, which was the same as ADG sub-network 1, except for one more SNP on SSC14 (Figure 4). These interactions detected between SSC14 and SSC15 involved five SNPs and eight SNPs respectively, which spanned 192 kb (from 14.93 Mbp to 15.12 Mbp) and 422 kb (from 65.71 Mbp to 66.13 Mbp). There were five genes in the 192 kb region on SSC14 and two genes in the region on SSC15. Several SNPs in the same LD block on one chromosome can be found in sub-networks 2 and 3, which interacted with a single SNP on another chromosome. More details can be found in Figures S6 and S7. For LMD, we detected 10 significant SNP-SNP interaction effects. These interactions were A × A interactions or A × D interactions, which implicated an interaction between SSC2 and SSC12 ( Figure 5). The interactions occurred between five SNPs in a single LD block on SSC12. Multiple SNPs in the same LD block on one chromosome were located in sub-networks 2 and 3, which interacted with a single SNP on another chromosome. More details are shown in Figures S8 and S9.

Discussion
To eliminate the contribution of information from relatives, which may be significantly associated with the trait analyzed rather than the phenotype, the deregressed EBV (DEBV) was calculated for each individual [30]. DEBV, especially, could take full advantage of the information available on genotyped animals and their relatives, which may properly correct the bias introduced by simply pooling or averaging data information and explaining heterogeneous variance [20]. Therefore, in this study, DEBVs obtained from genetic evaluations were used as response variables to improve the detecting power of GWAS. In recent years, much GWAS research has used DEBVs as pseudo-phenotypes, particularly in work on livestock animals [31,32].
Many significant additive and dominance associations were discovered or involved genomic regions previously reported in the literature. Surprisingly, our study identified a number of novel loci associated with dominance effects on growth traits. SNP-SNP networks of epistatic interactions directly revealed the relationships between SNP effects, which could aid in understanding genetic bias with respect to growth traits.
In this study, the SNP-SNP interactions which occurred on the same chromosome were discovered as interactions between two SNPs on the same chromosome. However, remarkably, these may have been haplotype effects and not interactions [33]. Meanwhile, only interactions involving 10 animals in each genotype combination were considered in order to increase the power of detaching interactions [34]. Then, the results were filtered according to these criteria and large numbers of interactions were identified.
Due to the negative genetic correlation between ADG and AGE, the analysis of these two traits used different modeling approaches to detect candidate genes for growth traits. As a result, some commonly detected signals and some differences were also found as well. There were three significant additive SNPs detected for both ADG and AGE, while, partially overlapped with these, there were SNPs identified as being significantly associated with ADG in Yorkshire and Pietrain pigs [2,35]. The CD55 molecule (CD55) gene, which plays an important role in adipocyte development and could affect the growth rates of animals by regulating adipocyte production, was identified in nearby candidate SNPs (CNCB10006791 and CNCB10006792) [36,37]. The nuclear receptor interacting protein 1 (NRIP1) gene encodes a nuclear protein also known as receptor-interacting protein 140 (RIP140), which was identified as a candidate gene for AGE with dominant effect. RIP140 is widely expressed and is involved in regulating lipid and glucose metabolism and fat cell regulation [38][39][40][41]. The KH RNA binding domain-containing, signal transduction-associated 2 (KHDRBS2) gene encodes an RNA-binding protein, which is involved in mediating uterine endometrial stroma progenitor development [42]. Studies have shown that KHDRBS2 is associated with number of teats in Yorkshire pigs [43] and with reproductive traits in Polish commercial pig breeds [44]. Meanwhile, the farnesyl-diphosphate farnesyltransferase 1 (FDFT1) gene was reported as being associated with growth rate through the muscle transcriptome in Iberian pigs [45].
For BF, the thyroid hormone receptor interactor 11 (TRIP11) gene encodes a protein product that interacts with thyroid hormone receptor β, which functions in regulating lipid metabolism [46] and was also found to be associated with intramuscular fat content in a Meishan × Duroc crossbred population [47]. The musashi homolog 2 (MIS2) gene encodes an RNA-binding protein and plays a central role in controlling feed intake and feeding behavior in mammals [48,49]. A haplotype block that spanned 492 kb among significant dominance SNPs was harbored in a reported BF QTL in Duroc, Yorkshire, and Pietrain pigs [50]. Our study increased the power of QTL detection and narrowed the QTL locations using high-density chip arrays and large sample size.
As for LMD, the vertnin (VRTN) gene is known to affect the variation in vertebral number, and, furthermore, an allele of VRTN affects the length of the longissimus muscle, as reported in [51,52]. Interestingly, two significant additive SNPs within or nearby VRTN was found in the current study to be associated with LMD. In addition, the Zinc Finger E-Box Binding Homeobox 2 (ZEB2) gene was proved to play a role in skeletal muscle differentiation in pluripotent stem cells [53]. Although it has not been reported in pigs, it would be worth seeking to verify it in the future. Additionally, considering the complexity of implementing a breeding program, the results obtained here may not be sufficient for effective incorporation into programs, though they could be combined with other omics data, such as proteomics and metabolomics data.

Conclusions
In conclusion, 32 significant SNPs-10 for additive and 22 for dominance effects-and many epistatic interactions were identified for one of four growth traits in a purebred Duroc population through a GWAS fitting additive and non-additive genetic effects. Furthermore, 46 candidate genes with potential functions in muscle development, fat deposition, and regulation of cell growth were considered as candidates for growth traits, including CD55 and NRIP1 for AGE and ADG, TRIP11 and MIS2 for BF, and VRTN and ZEB2 for LMD, respectively. This study presents novel putative causative variants and genes for future pig breeding programs, which may be used in developing trait-specific marker-assisted selection models.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/genes13081454/s1, Table S1: The additive QTL regions detected in the current study overlapped with previous studies; Table S2: The dominance QTL regions detected in the current study overlapped with previous studies; Figure S1: Linkage disequilibrium (LD) blocks in the significant region on SSC3 for BF; Figure S2: Sub-network 2 for ADG and the LD information; Figure S3: Sub-network 3 for ADG and the LD information; Figure S4: Sub-network 4 for ADG and the LD information; Figure S5: Sub-network 5 for ADG and the LD information; Figure S6: Sub-network 2 for AGE and the LD information; Figure S7: Sub-network 3 for AGE and the LD information; Figure S8: Sub-network 2 for LMD and the LD information; Figure S9 Informed Consent Statement: Not applicable.

Data Availability Statement:
The datasets generated and/or analyzed in the current study are not publicly available, since the studied population consists of the nucleus herd of Jiangxi Zhengbang Breeding Co., Ltd., but are available from the corresponding author upon reasonable request.