Genome-Wide Association Analysis and Genetic Parameters for Feed Efficiency and Related Traits in Yorkshire and Duroc Pigs

Simple Summary Genetic improvements in feed efficiency (FE) and related traits could considerably reduce pig production costs and energy consumption. Thus, we performed a genetic parameter estimation and genome-wide association study of four FE and FE-related traits, namely, average daily feed intake, average daily gain, the feed conversion ratio, and residual feed intake, of two pig breeds, Yorkshire and Duroc. The results demonstrate the genetic relationships of FE and FE-related traits with two growth traits, age and backfat thickness at 100 kg. We also identified many single-nucleotide polymorphisms (SNPs) and novel candidate genes related to these traits. In addition, we found many pathways significantly associated with FE and FE-related traits, and they are generally involved in digestive and metabolic processes. The results of this study are expected to provide a valuable reference for the genomic selection of FE and FE-related traits in pigs. Abstract Feed efficiency (FE) traits are key factors that can influence the economic benefits of pig production. However, little is known about the genetic architecture of FE and FE-related traits. This study aimed to identify SNPs and candidate genes associated with FE and FE-related traits, namely, average daily feed intake (ADFI), average daily gain (ADG), the feed conversion ratio (FCR), and residual feed intake (RFI). The phenotypes of 5823 boars with genotyped data (50 K BeadChip) from 1365 boars from a nucleus farm were used to perform a genome-wide association study (GWAS) of two breeds, Duroc and Yorkshire. Moreover, we performed a genetic parameter estimation for four FE and FE-related traits. The heritabilities of the FE and FE-related traits ranged from 0.13 to 0.36, and there were significant genetic correlations (−0.69 to 0.52) of the FE and FE-related traits with two growth traits (age at 100 kg and backfat thickness at 100 kg). A total of 61 significant SNPs located on eight different chromosomes associated with the four FE and FE-related traits were identified. We further identified four regions associated with FE and FE-related traits that have not been previously reported, and they may be potential novel QTLs for FE. Considering their biological functions, we finally identified 35 candidate genes relevant for FE and FE-related traits, such as the widely reported MC4R and INSR genes. A gene enrichment analysis showed that FE and FE-related traits were highly enriched in the biosynthesis, digestion, and metabolism of biomolecules. This study deepens our understanding of the genetic mechanisms of FE in pigs and provides valuable information for using marker-assisted selection in pigs to improve FE.


Introduction
Feed costs account for approximately 64-72% of the total cost of pig production [1]. Improving the feed efficiency (FE) of pigs dramatically reduces production costs. The feed conversion ratio (FCR) and residual feed intake (RFI) are two traits that have been used to evaluate feed efficiency [2]. Studies have shown significant genetic correlations between feed efficiency traits with average daily gain (ADG) and average daily feed intake (ADFI) [3,4], so they are important indicators of feed efficiency. In addition to nutritional and management strategies [5,6], the identification of causative mutations constitutes a promising perspective for the application of genetic selection on FE relying on molecularbased prediction. An effective genome-based breeding program requires the understanding of the genetic architecture of these traits. Genome-wide association analysis (GWAS) has previously been demonstrated to be an effective method for the detection of FE and related quantitative trait loci (QTL) and candidate genes in pigs. A previous study showed that a total of 829 selected single-nucleotide polymorphisms (SNPs) from an association analysis explained 61% of the phenotypic variance in RFI [7]. Wu et al. [8] identified 24 SNPs that had direct genetic effects and 31 SNPs that had social genetic effects on ADFI and ADG in Yorkshire pigs. Another study identified eight common significant QTL regions associated with the same four FE and FE-related traits as those in our study [9]. Do et al. performed a GWAS, and 19 SNPs were found to be significantly associated with two different measures of RFI based on the phenotypes of 596 Yorkshire boars. In addition to SNPs, studies have also revealed many potential candidate genes associated with FE and FE-related traits in pigs. The melanocortin 4 receptor (MC4R), a signaling molecule involved in the regulation of energy homeostasis [10], was identified as a candidate gene associated with ADG, ADFI [11,12], and FCR [13]. Duy et al. [14] found that several nearby genes of significant SNPs for RFI were clustered in the insulin signaling pathway, and they finally identified insulin receptor tyrosine kinase (INSR) as a candidate gene for RFI. Several genes involved in the transport process, such as AQP4, SLC22A23, and SLC6A14, have been identified as potential candidate genes for FCR and ADFI in previous studies [15]. An enrichment analysis showed that candidate genes affecting FCR pointed to corresponding biological pathways related to lipid metabolism, olfactory reception, and immunological status [16]. These studies might contribute to the improvement of the genome selection of FE and FE-related traits. However, only 454 and 385 QTLs have been listed as feed conversion and feed intake traits, respectively, while more than 2597 and 3511 QTLs have been listed as growth traits and fatness traits, respectively (PigQTLdb, https://www.animalgenome.org/cgi-bin/QTLdb/SS/index (accessed on 13 May 2022)). Studying genetic structures and variants affecting FE and FE-related traits does not provide enough detail since phenotype data collection is difficult. Fu et al. [9] found that three QTL regions related to ADFI and RFI traits overlapped. Ding et al. [17] found that a key SNP on SSC 7 contributed 2.16% and 2.37% of the observed phenotypic variance for DFI and RFI, respectively, and that another key SNP on SSC 1 contributed 3.22% and 5.46% of the observed phenotypic variance for FCR and RFI, respectively. Additionally, many studies have found significant correlations of FE and FE-related traits with growth traits [3,4,18]. Identifying genetic correlations with main growth traits is required before FE and FE-related traits can be included in the breeding program.
To further identify quantitative trait loci (QTL) and potential genes associated with four FE and FE-related traits in pigs, we conducted a GWAS on two purebred pig breeds, Yorkshire (YY) and Duroc (DD). In addition, we estimated the genetic parameters of FE and FE-related traits and their genetic correlation with two growth traits, backfat thickness at 100 kg (BF) and age at 100 kg (AGE).

Phenotypes and Genotypes
The data used in this study were all from a nucleus breeding farm in the province of Inner Mongolia of China. Details of the phenotype data collection and processing procedures are explained in Appendix A. After data cleaning, a total of 3661 YY boars were recorded to have at least one of the four FE and FE-related traits or two growth traits, and 239,234 individuals were included in the pedigree, including 880 boars genotyped with an in-house-designed SNP chip named CAU50K (detailed information presented in Appendix B). For DD, there were 2162 boars recorded with at least one of the six traits in this study, and 24,576 individuals were included in the pedigree, of which 485 boars were genotyped with the CAU50K SNP chip. After quality control, YY and DD populations contained 32,322 and 25,539 SNPs, respectively.

Statistical Model
The best linear unbiased prediction (BLUP) method was applied to the six traits in this study to determine the (co)variance components. For an evaluation of the genetic relationships, the phenotypes have eight combinations between two growth traits with four FE and FE-related traits. The following univariate and bivariate animal models were used to estimate the variance components: where y is the vector of observations for one of the six performance traits in univariate analyses, while in the bivariate model, y is the vector of observations composed of one FE-related trait and one growth trait. b is the vector of the fixed year-season effect; a is the vector of additive genetic effects; p is the vector of random litter effects; e is the vector of residual effects; and X, Z 1 , and Z 2 are the incidence matrices of b, a, and p, respectively. The distributions assumed for the random terms in the univariate model are a ∼ N 0, Aσ 2 a , p ∼ N 0, Iσ 2 p , and e ∼ N 0, Iσ 2 e , while in the bivariate model, they are a ∼ N(0, G 0 ⊗ A), p ∼ N 0, R p ⊗ I , and e ∼ N(0, R e ⊗ I). σ 2 a , σ 2 p , and σ 2 e are the additive genetic, litter, and residual variances, respectively. G 0 , R p , and R e are the 2 × 2 symmetrical direct additive genetic effect, litter effect, and residual effect (co)variance matrices, respectively. I denotes the identity matrix of adequate dimension, A denotes the numerator relationship matrix, and ⊗ denotes the Kronecker product. The average body weight (ABW) of each individual was computed as the average of the body weight at the start and end of testing. Metabolic body weight (MBW) mid-test calculated as ABW raised to the power of 0.75 [17] was included as a covariate for ADFI and FCR.
Using the estimated variance or covariance component from the bivariate analysis, estimated genetic correlations were calculated as r g = σ a 12 /(σ a 1 σ a 2 ). We performed a chi-square test with one degree of freedom using genetic correlation estimates and their SE to study whether the genetic correlation is different from 0.
In the GWAS, a single-step genomic BLUP (ssGBLUP) [19,20] was first carried out to estimate the genomic breeding values (GEBVs) for all animals in the pedigree by combining their pedigree and genomic information, and the inverse of the relationship matrix (H) was as follows: where G −1 is the inverse of the genomic relationship matrix, and A −1 22 is the inverse of the numerator relationship matrix for genotyped individuals. G was calculated as [21], where w = 0.95, and Gr is a genomic matrix before weighting, calculated as VanRaden [22]. The same model as the above univariate model was used for GEBV estimation by replacing A with H. Estimated variance components/GEBVs were determined based on BLUP and ssGBLUP using the DMUAI module in the DMU software package e. To eliminate the contribution of information from relatives, de-regressed estimated breeding values (DEBVs) were used as the response variables in the GWAS. According to the method proposed by Garrick et al. [23], the DEBVs of the genotyped individuals were calculated based on the GEBVs from the univariate model using the R package "blupADC" [24]. The GWAS was performed with the following a single-marker regression model using GEMMA Software [25]: where y is the vector of the dependent variable (DEBV); m is the vector of the SNP marker effects; a is the vector of the residual polygenic effects with a normal distribution a ∼ N 0, Gσ 2 a , where G is the realized relationship matrix constructed with markers [8], and σ 2 a is the additive genetic variance; e is the vector of residual errors with a normal distribution e ∼ N 0, Iσ 2 e , where I is an identity matrix, and σ 2 e is the residual variance; and X and W are the incidence matrices of y and are related to m and a, respectively.
For each SNP, the Wald statistic [26,27] was implemented to examine the significance of the SNP's association with each trait. Then, the Bonferroni correction was implemented to define the significant threshold. To avoid missing the true hints of linkage, the genomewide significant and suggestive levels were set as p = 0.05/N and p = 1/N, respectively, where N is the number of analyzed SNPs [28].

Candidate Genes and Functional Analysis
PigQTLdb [29] was used to annotate significant SNPs located in previously mapped QTLs in pigs. The definition of QTL intervals followed the same procedures as those of Delpuech et al. [30]. The region within 1 Mbp centered on each significant SNP was defined as a "QTL window". Then, overlapping windows were combined into a single QTL region per trait within each breed. Gene contents within the detected QTLs were retrieved from the Ensembl Genes 105 Database based on the Sscrofa11.1 genome assembly using BioMart [31]. The biological functions of these genes were investigated in GeneCards [32] and the reported literature. To provide insight into the functional enrichment of genes, Gene Ontology (GO) [33] term and Kyoto Encyclopedia of Genes and Genomes (KEGG) [34] pathway enrichment analyses were subsequently conducted using the KOBAS [35] database. Table 1 shows the descriptive statistical results of the phenotypes and heritabilities estimated in this study. A two-tailed Student's t-test showed that the phenotypes among the breeds were significantly different for all six traits. In general, AGE has a moderate heritability, whereas BF has a moderate-to-high heritability [4,36]. In this study, AGE and BF also had moderate heritabilities in YY and DD pigs ( Table 1). The heritabilities of the four traits in this study ranged from 0.14 (RFI) to 0.24 (ADFI) for YY and from 0.13 (ADG) to 0.36 (FCR) for DD. The heritabilities estimated in this study were lower than those in the study conducted by Homma et al. [18], which did not consider random environmental effects. Our results agree with those of a previous study [4]; the estimated heritabilities are different among breeds. In Chinese pig breeding, DD pigs are generally used as terminal sires in combination with crossbred Landrace x Yorkshire sows. In DD pigs, there is a greater focus on the selection of growth and FE, while in YY pigs, a major emphasis is placed on improving litter size. In our study, the heritabilities of FE and FE-related traits in DD were all above 0.30, except for that of ADG. Therefore, differences in breed management might lead to differences in estimated heritability. In addition, differences in the inherent genetic background of the varieties may be the main reason for this. Overall, it can be concluded that the heritability estimates for the FE and FE-related traits of pig breeds vary considerably depending on trait, statistical model, and population. The results show that these traits have a moderate heritability. Thus, FE and FE-related traits could gain considerable genetic advances with the implementation of genomic selection.

Genetic Correlations of FE and FE-Related Traits with Growth Traits
Except for BF and RFI in DD (0.07), non-ignorable genetic correlations (>0.10) can be found between two growth traits with FE and FE-related traits ( Table 2). However, it is worth noting that several genetic correlation coefficients have relatively large standard errors. Consistent with Do et al. [4], we found that the BF of YY had positive genetic correlations with ADFI (0.44) and ADG (0.13), while it was negative with RFI (−0.14). This result implies that the selection of BF might have a negative effect on these FE and FE-related traits, which is an unfavorable phenomenon for breeders. However, Hong et al. [37] and Homma et al. [18] obtained positive estimated genetic correlations between BF and RFI. Surprisingly, AGE was negatively correlated with ADFI (−0.25 for YY and −0.36 for DD) and ADG (−0.69 for YY and −0.56 for DD), implying that selecting for a lower AGE can improve a pig's feed intake and growth rate. We also found that AGE and BF had positive genetic correlations with FCR, suggesting that selecting a lower AGE made animals eat more slowly, and this has also previously been reported [37]. As discussed above, breeding experts must carefully determine the breeding scheme according to the genetic relationships of growth traits with FE and FE-related traits in order to obtain the maximum benefit.

Genome-Wide Association Study
The descriptive statistics of the DEBVs used for the GWAS for FE and FE-related traits included in this study are shown in Supplementary Table S1. We identified 61 significant SNPs (Table S2) located on 12 different chromosomes (SSC), of which 15 SNPs reached genome-wide significant thresholds ( Table 3). As in previous studies [4,17], the genetic backgrounds of the FE and FE-related traits were significantly different between breeds. In addition, the P-values of the GWAS results were visualized using Manhattan plots and Q-Q plots (Figures 1 and 2) by the R package "CMplot" [38]. Most of the significant SNPs were retrieved in PigQTLdb (Table S2), revealing the reliability of the results. Surprisingly, we found one novel SNP significantly related to each trait, ADFI (SSC18: 32623286), ADG (SSC11: 67792739), FCR (SSC15: 16281234), and RFI (SSC8: 89446476), which might be a potential new causative mutation for FE and FE-related traits.

Candidate Genes and Functional Analysis
After merging overlapping "QTL windows", we finally identified 8, 20, 2, and 7 QTLs associated with ADFI, ADG, FCR, and RFI, respectively (Table S3). The four QTLs determined by four novel significant SNPs have not been reported in PigQTLdb. Considering the biological functions of the genes annotated based on QTLs, 35 genes were identified as positional candidate genes for FE and FE-related traits (Table S4). For example, ATPase phospholipid transportation 8B1 (ATP8B1), a candidate gene of ADFI, is a protein coding gene that is involved in ion-channel transport and the transport of glucose, bile salts, organic acids, metal ions, and amine compounds, and it has been identified as a candidate gene for backfat thickness in pigs [39]. The melanocortin 4 receptor (MC4R) and lectin, mannose binding 1 (LMAN1) were also identified as candidate genes for ADFI, and they have been reported to be associated with AGE [10][11][12][13] and ADFI [40], respectively. The insulin receptor (INSR) has repeatedly been reported to be an important gene affecting RFI in relevant studies [14,41,42] because it plays an important role in synthesizing and storing carbohydrates, lipids, and proteins. In addition, we identified eight candidate genes (Table S4) in four novel QTLs, increasing the credibility of these QTLs, such as ZRANB3 and MCM6, which are essential for the initiation of genome replication.
In total, 617 enriched GO terms and 102 KEGG pathways were identified for candidate genes of FE and FE-related traits, of which, 96 GO terms and 4 KEGG pathways were considered statistically significant at FDR-corrected p < 0.05. The top 30 enrichment terms are shown in Supplementary Table S5. The GWAS signals for FE and FE-related traits were highly enriched in the biosynthesis (GO:0017101), digestion (KEGG:ssc04973), and metabolism (KEGG:ssc00052) of biomolecules, such as carbohydrates and proteins. These results are consistent with earlier findings by Li et al. [43]. We found that sialylation was significantly associated with ADFI (GO: 0097503; p < 0.01), which is involved in the covalent connection between sialic acid and substrate molecules and plays an important role in the metabolism of macromolecular organic substances. The ephrin receptor signaling pathway (GO: 0019899; p < 0.01) enriched on ADG is very important for protein synthesis, and it regulates many major cellular processes that produce or use large amounts of energy and nutrients. Therefore, it is an essential pathway for FE and FE-related traits [44]. Overall, FE and FE-related traits were found to be related to diverse biological processes and pathways, suggesting that they are highly polygenic traits regulated by many genes.

Conclusions
In conclusion, the results show that the heritabilities of FE and FE-related traits ranged from 0.13 to 0.36, and there were distinct genetic correlations (−0.69-0.52) of growth traits with FE and FE-related traits. We identified four critical genomic regions for ADFI (SSC18: 32.12-33.12 Mbp), ADG (SSC11: 67.29-68.29 Mbp), FCR (SSC15: 15.78-16.78 Mbp), and FCR (SSC8: 88.95-89.95 Mbp), and they might be potential new QTL regions for FE. We also further identified 35 potential candidate genes for four FE and FE-related traits, and they play an essential role in many biological processes, such as ion transfer, mitochondrial activities, and the macromolecule metabolism. Some interesting KEGG pathways and GO terms, e.g., sialylation, were found to have potential functions in FE in pigs. Overall, this study's estimation of genetic parameters and GWAS results helps us better understand the genetic structure of FE and FE-related traits and provides important information for the future genomic predictions of these traits.  Informed Consent Statement: Not applicable.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to commercial interests.

Conflicts of Interest:
The authors have read the journal's guideline and have the following competing interests: the co-authors (Z.W., J.W.) are employees of Beijing Zhongyu Pig Breeding Co., Ltd that partially participated in the analysis process. The other authors have no competing interests.

Appendix A
Two purebred pig breeds, Yorkshire (YY) and Duroc (DD), were used in this study. Phenotypes were recorded from 2017 to 2019 to keep the environmental conditions consistent as much as possible. All individuals in this study were boars, and they were group-housed in half-open pens, with 12-15 individuals in each pen. All individuals in this study remained in good health during the testing period. Both growth and feed intake information were measured throughout the testing period, ranging from approximately 30 to 100 kg body weight (BW). Data collection in this weight range is mainly used to improve the number of effective records and data accuracy. The original feeding record was automatically generated by the automated feeding stations (Pig Performance Tester, Nedap, Groenlo, the Netherlands). Backfat thickness was measured between the 10th and 11th ribs of the pigs using real-time B-mode ultrasound at the end of testing. The phenotypic data comprised individual ID, breed, birth date, starting weight, starting date, daily feed intake, weight gain during testing, final weight, final date, and final backfat thickness. The 30 kg to 100 kg BW data were collected from each pig during the testing period. Each animal was labeled with a unique Radio Frequency Identification (RFID) tag on their ear, which can be detected by the performance testing system. Once the pig visited the feeder, the date and accurate start and stop times of feeding, the ID of the individuals, and feed consumption were recorded.
All phenotypes in this study were from YY or DD boars. The traits (ADFI, ADG, FCR, and ADG) of each pig were calculated during performance testing according to the information provided by the performance testing system. Age at 100 kg (AGE) was adjusted using the following formula: where AGE test represents the age at the end of testing; BW is the body weight at the end of testing; and A is the constant correction coefficients for boars and dams, with A = 50.775. Backfat thickness at 100 kg (BF) was adjusted using the following formula: where BF test represents the backfat thickness at the end of testing; BW is the body weight at the end of testing; and B is the constant correction coefficients for boars and dams, with B = −7.277. The average daily feed intake (ADFI) is the ratio of total feed intake and days during the testing period. The feed conversion ratio (FCR) is the ratio of total feed intake to weight gain during testing. The following formula calculated the average daily gain (ADG) during the testing period: where AGE 30 is the age corrected to 30 kg; BW s is the body weight at the beginning of testing; and b is the constant correction coefficients, with b YY = 1.550 and b DD = 1.536.
RFI was computed as the difference between the observed ADFI and the predicted ADFI. The equation for the predicted ADFI [45] is as follows: where β 0 is the intercept, β 1 is the partial regression coefficient of ADFI on metabolic BW (MBW) mid-test [17], β 2 is the partial regression coefficient of ADFI on ADG, β 3 is the partial regression coefficient of ADFI on BF, and e is the residual error term. Then, we can obtain the RFI from the residual. Data from animals with outlier phenotypes or DEBVs, which were greater (lower) than the average plus (minus) 3.5 × SD, were also removed from the final dataset. There were 239,234 and 24,576 individuals with YY and DD pedigrees, respectively. After preliminary screening, 3661 YY boars and 2162 DD boars could be used for the subsequent data analysis. The SNP chip CAU50k designed in-house was used for genotyping. A total of 1314 boars, 920 YY and 494 DD, were genotyped using the CAU50K SNP chip. Missing genotypes were imputed using Beagle software [46]. The quality control of the genotype was carried out using PLINK [47] software (genotype call rate > 0.9, HWE p > 10e −6 , and MAF > 0.05). Individuals with an average SNP marker call rate < 0.9 or missing DEBV were removed. Finally, 880 boars and 32,322 SNPs were retained in YY, while 485 boars and 25,539 SNPs were retained in DD.