GWAS of Reproductive Traits in Large White Pigs on Chip and Imputed Whole-Genome Sequencing Data

Total number born (TNB), number of stillborn (NSB), and gestation length (GL) are economically important traits in pig production, and disentangling the molecular mechanisms associated with traits can provide valuable insights into their genetic structure. Genotype imputation can be used as a practical tool to improve the marker density of single-nucleotide polymorphism (SNP) chips based on sequence data, thereby dramatically improving the power of genome-wide association studies (GWAS). In this study, we applied Beagle software to impute the 50 K chip data to the whole-genome sequencing (WGS) data with average imputation accuracy (R2) of 0.876. The target pigs, 2655 Large White pigs introduced from Canadian and French lines, were genotyped by a GeneSeek Porcine 50K chip. The 30 Large White reference pigs were the key ancestral individuals sequenced by whole-genome resequencing. To avoid population stratification, we identified genetic variants associated with reproductive traits by performing within-population GWAS and cross-population meta-analyses with data before and after imputation. Finally, several genes were detected and regarded as potential candidate genes for each of the traits: for the TNB trait: NOTCH2, KLF3, PLXDC2, NDUFV1, TLR10, CDC14A, EPC2, ORC4, ACVR2A, and GSC; for the NSB trait: NUB1, TGFBR3, ZDHHC14, FGF14, BAIAP2L1, EVI5, TAF1B, and BCAR3; for the GL trait: PPP2R2B, AMBP, MALRD1, HOXA11, and BICC1. In conclusion, expanding the size of the reference population and finding an optimal imputation strategy to ensure that more loci are obtained for GWAS under high imputation accuracy will contribute to the identification of causal mutations in pig breeding.


Introduction
Reproductive traits, such as total number born (TNB), number of stillborn (NSB), and gestation length (GL), are economically important traits that directly affect the economic benefits of the pig industry. The heritabilities of the TNB and NSB traits are about 0.1, whereas the heritability of the GL trait is about 0.3 [1,2]. The TNB trait is often used as one of the key indicators of the overall profitability of pig production, and the NSB trait is the most important measure of reproductive loss in pigs [3]. One study found that the NSB trait was positively correlated with the TNB trait [4]. Several studies have shown that late gestation is the key period for piglet growth and maturation [5], and that a gestation length greater than 114 days may improve piglet survival after birth and reduce postnatal mortality to a certain extent [6].
Genome-wide association studies (GWAS) have emerged as an efficient method for dissecting the genetic mechanisms of complex traits. Recently, GWAS have been widely

Genotype Imputation and Imputation Accuracy
We performed initial quality control of the WGS reference and target chip data. After quality control, 14,561,445 and 43,549 SNPs were retained on the 18 autosomes of the reference and target populations, respectively. For each chromosome, we summarized the number of SNPs before and after imputation and the number of SNPs in the imputed WGS data after quality control. In addition, we calculated the average imputation accuracy (R2) before and after quality control at R2 > 0.8. These are shown in Supplementary Table S1. Figure 1a,b show the number of loci and the imputation accuracy for the 18 autosomes before and after quality control, respectively. After imputation, 14,561,445 SNPs were obtained on the 18 autosomes of the 2655 Large White pigs in this study, with 11,521,836 loci remaining after loci with R2 < 0.8 or MAF < 0.05 were removed. In addition, when we applied the same quality control criteria within each line, 10,006,597 and 9,944,741 SNPs were retained for the Canadian and French lines, respectively. The average imputation accuracy (R2) across all loci was 0.876 before quality control and 0.943 after quality control.

For the TNB trait using the 50K chip data (Supplementary Table S2), no genome-wide significant SNPs were detected in the GWAS within either of the two Large White lines. Two SNPs (SSC4: 101,156,553 and SSC8: 30,016,379) at the suggestive significant level were observed in the Canadian line, and only one SNP (SSC8: 56,076,247) at the suggestive significant level was observed in the French line. In the GWAS of the combined Large White population, one SNP (SSC8: 56,076,247) at the genome-wide significant level was observed. In the cross-population meta-analyses, there was one SNP (SSC8: 56,076,247) at the genome-wide significant level and one SNP (SSC5: 79,145,588) at the suggestive significant level (Table 2).
For the NSB trait, the Manhattan plots are shown in Figure 3a-d. The Q-Q plots are shown in Supplementary Figure S3a- [...] and one SNP (SSC14: 138,357,861) at the suggestive significant level was observed. In the GWAS of the combined Large White population, five SNPs (SSC4: 125,301,443, SSC18: 5,909,015, SSC11: 70,923,126, SSC11: 70,551,880, and SSC1: 9,677,339) at the suggestive significant level were observed. The suggestive significant SNPs in the cross-population meta-analyses were the same as in the combined Large White population (Table 2).
For the GL trait using the 50K chip data (Supplementary Table S2), no genome-wide significant SNPs were detected in the GWAS within either Large White line. One SNP (SSC6: 156,647,853) at the suggestive significant level was observed in the Canadian line, and two SNPs (SSC11: 25,293,190 and SSC1: 254,755,615) at the suggestive significant level were observed in the French line. In the GWAS of the combined Large White population, one genome-wide significant SNP (SSC1: 254,755,615) and one SNP (SSC14: 61,937,863) at the suggestive significant level were detected. There were two SNPs (SSC1: 254,755,615 and SSC12: 3,251,323) at the suggestive significant level in the cross-population meta-analyses (Table 2).
For the TNB trait using the imputed WGS data (Supplementary Table S3), no genome-wide significant SNPs were detected in the GWAS within either Large White line. There were 147 SNPs at the suggestive significant level in the Canadian line and 136 SNPs at the suggestive significant level in the French line. In the GWAS of the combined Large White population, there were 3 SNPs at the genome-wide significant level and 203 SNPs at the suggestive significant level. There were 175 SNPs at the suggestive significant level in the cross-population meta-analyses (Table 3).
Table 3. Genome-wide significant and suggestive SNPs for the total number born (TNB) trait using imputed WGS data in pigs.
For the NSB trait using the imputed WGS data (Supplementary Table S3), seven genome-wide significant SNPs and 708 suggestive significant SNPs were observed in the Canadian line. In the French line, no genome-wide significant SNPs were detected and 152 SNPs at the suggestive significant level were observed. In the GWAS of the combined Large White population, one SNP at the genome-wide significant level and 385 suggestive significant SNPs were observed. One genome-wide significant SNP and 506 SNPs at the suggestive significant level were observed in the cross-population meta-analyses (Table 4).
For the GL trait using the imputed WGS data (Supplementary Table S3), no genome-wide significant SNPs were detected in the GWAS within either Large White line. Seven SNPs at the suggestive significant level were observed in the Canadian line and 136 SNPs at the suggestive significant level in the French line. In the GWAS of the combined Large White population, no genome-wide significant SNPs and 156 SNPs at the suggestive significant level were detected. There were 48 SNPs at the suggestive significant level in the cross-population meta-analyses (Table 5).

Bioinformatics Annotation Analysis
In this study, GWAS based on 50K chip data and imputed WGS data were used to detect candidate functional genes. According to the Sus Scrofa 11.1 pig genome, candidate genes were detected within the ±20 kb region around each significant and suggestive SNP.
For the TNB trait, 2 and 30 genes were found for 50K chip data and imputed WGS data, respectively. Additionally, one gene was simultaneously identified in both sets of data. For the NSB trait, 5 and 47 genes were found for 50K chip and imputed WGS data, respectively. Moreover, one gene was simultaneously identified in both datasets. For the GL trait, 3 and 14 genes were found for 50K chip and imputed WGS data, respectively. Furthermore, one gene was simultaneously identified in both datasets.

Imputation of 50K Chip Data to WGS Data
In recent years, genotype imputation has been widely applied, driven by the rapid decline in the cost of whole-genome resequencing and the need for high-density markers. Genotype imputation can be used to impute data with lower-density markers to WGS data, and the imputation accuracy may be affected by the size of the reference population, the genetic distance between the reference and target populations, and the imputation strategy [23].

Reference Population Size and Imputation Accuracy
The thirty Large White pigs used as the reference population in this study were the key ancestral individuals in the population, so the genetic distance between the reference and target populations was relatively small. In addition, 14 of these pigs were also genotyped with the 50K chip and participated in the subsequent analysis. In this study, the imputation accuracy was higher than 0.858 for each chromosome before quality control, with an average of 0.876 across the 18 autosomes, and higher than 0.928 for each chromosome after quality control, with an average of 0.942. When quality control was applied within each line, the loci loss rate was 13.15% and 13.69% for the Canadian and French lines, respectively. We also imputed the target pigs of the Canadian and French lines to WGS data separately, using the 13 Canadian-line and 17 French-line Large White pigs as the respective reference populations. The imputation accuracy before quality control was then lower than with the combined reference population, at 0.830 and 0.825 in the Canadian and French lines, respectively. After quality control, the imputation accuracy was almost the same as with the combined reference population, at 0.943 and 0.944, respectively. However, the loci loss rate after quality control was 29.07% and 30.18%, respectively, more than twice the loss rate obtained with the combined reference population. The difference in reference population size may be an important reason for this phenomenon. In a previous study that used Beagle software to impute a medium-density 50K chip to a high-density 777K chip, the imputation error rate decreased from 0.67% to 0.41% as the size of the reference population increased from 488 to 1229 [26].
Another study used Beagle software to impute GBS data to WGS data in Landrace pigs with 20 animals in the reference population and in Large White pigs with 40 animals in the reference population, resulting in imputation accuracies of 0.42 and 0.45, respectively [27].
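The loci loss rates quoted above can be checked with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the choice of denominator (the 11,521,836 loci retained in the combined population after quality control) is an assumption, adopted because it reproduces the reported 13.15% and 13.69%.

```python
def loss_rate(before: int, after: int) -> float:
    """Percentage of loci removed between two SNP counts."""
    return 100.0 * (before - after) / before


# Counts reported in the Results section.
combined_after_qc = 11_521_836  # combined population, after quality control
canadian_after_qc = 10_006_597  # Canadian line, after quality control
french_after_qc = 9_944_741     # French line, after quality control

# Assumption: the per-line loss is measured against the combined post-QC count.
canadian_loss = loss_rate(combined_after_qc, canadian_after_qc)
french_loss = loss_rate(combined_after_qc, french_after_qc)

print(f"Canadian line: {canadian_loss:.2f}%")
print(f"French line:   {french_loss:.2f}%")
```

With this denominator the two values round to the percentages given in the text; a different denominator (for example the 14,561,445 loci before quality control) would give different figures.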

Genetic Distance between Reference and Target Populations and Imputation Accuracy
The imputation accuracy of GBS data imputed to WGS data in a study on Large White pigs was 0.42 before quality control [27], much lower than the 0.876 obtained in this study before quality control. A possible reason is that the reference population in this study contained the key individuals in the population, so the genetic relationship between the target and reference populations was close. In a previous study, imputation was performed from 600K chip data to WGS data using multiple pig populations, and the average imputation accuracy before quality control was 0.49 [28]. In another study, Beagle software was used to impute the 60K chip data of 933 F2 individuals to WGS data. In that study, the reference population of 117 individuals included 19 ancestors of the F2 generation, and the genotypic concordance and imputation accuracy estimated by cross-validation were 0.89 and 0.80, respectively [29]. In our study, we used 20-fold cross-validation to evaluate the genotypic concordance and imputation accuracy for chromosomes 1, 6, and 12; the genotypic concordance of the three chromosomes was 0.931, 0.936, and 0.899, respectively, and the imputation accuracy was 0.866, 0.867, and 0.812, respectively, which is close to the previously reported finding [29]. In a study on imputation in multibreed sheep, removing from the reference population the individuals related to those being imputed reduced the concordance and accuracy of imputation by 2.63% and 4.60%, respectively [30].

Imputation Strategy and Imputation Accuracy
Researchers used Beagle software to impute 60K and 600K chip data to WGS data in a chicken population, obtaining imputation accuracies of 0.620 and 0.812, respectively. In a two-step imputation approach, the authors imputed from 60K to 600K chip data and then from 600K chip data to WGS data, with an average imputation accuracy of 0.742 [22]. In sheep, imputing 5K to 50K chip data and then 50K chip data to HD data was superior to imputing 5K chip data directly to HD data, increasing the genotypic concordance by 5.67% [30]. In a study of genotype imputation in Holstein cattle, first imputing 50K chip data to HD data and then to WGS data improved the imputation accuracy from 0.28 to 0.65 compared with imputing 50K chip data directly to WGS data, but this was still lower than the imputation accuracy of 0.77 obtained when imputing from HD to WGS data [31]. In a study of a small cattle population, the endangered German Black Pied cattle, the accuracy of the two-step imputation method was 92.1%, while that of the one-step method was 93.2%. The authors attributed this to the small size of the intermediate reference population, which caused imputation errors from the low-density chip to the medium- to high-density chip in the first step [32]. In future studies, we could add a medium- to high-density chip and test whether the two-step imputation method improves the imputation accuracy and genotypic concordance compared with the one-step method used here.

Potential Candidate Genes
Imputing chip data to WGS data allows more marker loci to be obtained for GWAS at low cost. In this study, 50K chip data were imputed to WGS data, and the average imputation accuracy was 0.943 after quality control, which supports the reliability of GWAS based on the imputed WGS data. Compared with GWAS using the chip data, GWAS based on imputed WGS data detected more potential candidate genes. In addition, the meta-analysis improved the power to detect SNPs by combining different populations. The advantage of meta-analyses has been reported in pigs [10,33,34]. In our study, the meta-analysis detected novel significant SNPs that were not found in the single-line analyses. However, there were no candidate genes within the 20 kb region around any of these novel significant SNPs.
For the TNB trait, a number of candidate genes located within 20 kb of genome-wide significant and suggestive significant SNPs were identified in both lines. Among them, the NOTCH2 gene plays an important role in pregnancy recognition and corpus luteum maintenance in mice [35]. A study indicated that the NOTCH2 gene can inhibit the synthesis of estradiol [36]. Another study showed a role of NOTCH2 in T-cell differentiation in subsets of T cells between intrauterine growth-retarded groups and normal groups [37]. In the early stages of human embryogenesis, KLF3 is a transcription factor that persists during the transition from the zygote to the morula stage [38]. The KLF3 gene may regulate fatty acid use in the intestine and reproductive tissue [39]. PLXDC2 may play a role in reproduction and ectopic pregnancies [40]. A study showed a role for the TBX10 gene in embryo development and diseases of mice [41]. It has been revealed that maternal nutrition in sows may alter birth weight mainly by regulating placental lipid and energy metabolism, and the NDUFV1 gene plays an important role in energy metabolism [42]. The expression level of NDUFV1 was downregulated in the placenta tissues compared with the normal pregnancy group, and the NDUFV1 gene is involved in energy production processes in the mitochondrial matrix and membrane [43]. The TLR10 gene is expressed in the endometrium, conceptus, and chorioallantoic tissues of pigs, and may play a key role in regulating mucosal immune responses to support the establishment and maintenance of pregnancy [44]. The CDC14A gene can regulate oocyte maturation in mice [45]. It is also a possible candidate gene for protein yield associated with milk production in North American Holstein cattle [46] and a candidate gene for body size traits in pigs [25]. The EPC2 gene was found to be a novel candidate gene associated with reproductive performance in indigenous Chinese pigs [47].
The ORC4 gene plays an important role in polar body extrusion during oogenesis [48][49][50]. The ACVR2A gene is widely expressed in ovarian granulosa cells and closely related to granulosa cell proliferation and follicular development [51]. The ACVR2A gene is a candidate gene for reproductive traits in pigs [52]. A study showed that ACVR2A is associated with female fertility in Japanese Black cattle [53]. The GSC gene can be used as an early marker of embryonic differentiation and describe embryonic diversity in pigs [54].
For the NSB trait, a number of candidate genes located within 20 kb of the genome-wide significant and suggestive significant SNPs were identified in both lines. Among them, the NUB1 gene has been reported to be associated with milk production traits in cows and sheep [55,56]. The TGFBR3 gene was also reported to be associated with oocyte maturation in pigs [57]. The ZDHHC14 gene may act as a marker and target for the clinical diagnosis and treatment of pre-eclampsia [58]. The FGF14 gene may be a promising candidate gene associated with litter traits in pigs [59] and a potential candidate gene for the teat number trait in Duroc pigs [60]. The BAIAP2L1 gene may serve as a biomarker in ovarian cancer [61]. The BCAR3 gene may provide new insights into the mechanism of local estrogen action in endometriosis [62], and may contribute to the complex tumor heterogeneity of ovarian cancer cells [63]. The EVI5 gene displayed significantly differential expression in trophectoderm biopsies associated with live birth versus non-implantation [64]. A study observed that the absence of the TAF1B gene in germline cells leads to the accumulation of late-stage egg chambers in the ovaries [65]. The TAF1B gene is also a candidate gene for congenital splay leg. Porcine splay leg syndrome, which can be caused by myofibrillar hypoplasia, is still one of the most important causes of piglet loss [66].
For the GL trait, a number of candidate genes located within 20 kb of genome-wide significant and suggestive significant SNPs were identified in both lines. Among them, the PPP2R2B gene had a significant genetic effect on milk production traits in Chinese Holstein cattle [67]. This gene is also a candidate gene that may be associated with sperm motility in Duroc pigs [68], and it may act as an important reproductive driver gene [69]. A study found that AMBP was overexpressed in the amniotic fluid of women without intra-amniotic infection/inflammation [70]. Increased concentrations of AMBP are often considered an indicator of pre-eclampsia [71]. The MALRD1 gene is associated with endometriosis in humans [72]. The BICC1 gene is differentially expressed during prenatal development of skeletal muscle in Pietrain and Duroc pigs [73]. A study identified the BICC1 gene as an important candidate gene for reproductive traits in Duroc pigs [74]. The HOXA3, HOXA7, HOXA10, and HOXA11 genes were found to be candidates for reproductive traits in a study of runs of homozygosity in Jinhua pigs [75]. It has also been shown that the HOXA11 gene is expressed in the endometrium [76] and is associated with endometrial epithelial function [77].

Animals and Phenotype
The Large White pigs used in this study were from a commercial pig company in Shanghai, China, and had been introduced from Canadian and French lines. Feeding and performance testing for these two lines were conducted on two different farms, with essentially the same level of nutritional management. A total of 13,379 reproduction records of 2655 individuals from parity 1 to 7 were collected during the period of 2014-2020, of which 1403 individuals were from the Canadian line and 1252 were from the French line. According to pedigree information, there was no genetic connectedness between the two lines. Three reproductive traits, TNB, NSB, and GL, were selected for subsequent analysis. The DMUAI procedure of DMU software (Version 6, release 5.2) was used to adjust the phenotypes for the repeated records of multiple parities based on the pedigree information [78]. The statistical model is described below:

$$y_{ijklm} = \mu + L_i + T_j + YS_k + a_{ijkl} + pe_{ijkl} + e_{ijklm}$$

where $y_{ijklm}$ is the phenotype (TNB, NSB, or GL); $\mu$ is the overall mean; $L_i$ is the line effect; $T_j$ is the parity effect; $YS_k$ is the year-season effect, where the season is defined by month and has four levels (spring = March to May; summer = June to August; autumn = September to November; winter = December to February); $a_{ijkl}$ is the additive genetic effect, with $a \sim N(0, A\sigma_a^2)$, where $\sigma_a^2$ is the additive genetic variance and $A$ is the numerator relationship matrix; $pe_{ijkl}$ is the permanent environmental effect, with $pe \sim N(0, I\sigma_{pe}^2)$; and $e_{ijklm}$ represents the residuals.
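The year-season levels defined above can be derived from a farrowing date with a small helper. This is a minimal sketch: the function names and the label format are hypothetical, and pairing each month with its own calendar year (so that December and the following January fall in different year-season levels) is an assumption that the text does not confirm.

```python
def season_of(month: int) -> str:
    """Map a calendar month (1-12) to the four season levels used in the model."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    if 3 <= month <= 5:
        return "spring"
    if 6 <= month <= 8:
        return "summer"
    if 9 <= month <= 11:
        return "autumn"
    return "winter"  # December, January, February


def year_season(year: int, month: int) -> str:
    """Build a year-season label; assumes the calendar year is used as-is."""
    return f"{year}-{season_of(month)}"


print(year_season(2014, 7))
print(year_season(2019, 12))
```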

SNP Chip Data
We selected 2655 Large White pigs as the target population. Genotyping was performed using a GeneSeek Porcine 50K array. The chip was designed according to Sus Scrofa 10.2 and contained 50,915 SNPs. We mapped autosomal SNPs to the latest version of the pig genome Sus Scrofa 11.1, resulting in 46,311 autosomal SNPs for subsequent analysis.
Quality control was performed with PLINK v1.90 software [79]. In each population, pigs with an individual call rate lower than 0.9 were excluded. SNPs with call rates lower than 0.9 were removed, and SNPs with minor allele frequencies (MAF) of 0.05 or higher were retained. After quality control, 43,549 SNPs remained in the combined Large White population for subsequent analysis. For the Canadian and French lines of Large White pigs, we used 41,039 and 40,495 autosomal SNPs for subsequent analysis, respectively.
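To make the SNP-level filters concrete, the sketch below computes the call rate and MAF of a single SNP on toy data and applies the 0.9 and 0.05 thresholds stated above. It is not PLINK's implementation; the function names, the 0/1/2 allele-count coding, and the use of `None` for missing calls are assumptions made for illustration.

```python
from typing import Optional, Sequence


def call_rate(genotypes: Sequence[Optional[int]]) -> float:
    """Fraction of non-missing genotype calls for one SNP."""
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)


def minor_allele_frequency(genotypes: Sequence[Optional[int]]) -> float:
    """MAF from allele counts (0, 1, 2), ignoring missing calls."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def passes_qc(genotypes: Sequence[Optional[int]],
              min_call_rate: float = 0.9,
              min_maf: float = 0.05) -> bool:
    """Keep a SNP only if both thresholds from the text are met."""
    return (call_rate(genotypes) >= min_call_rate
            and minor_allele_frequency(genotypes) >= min_maf)


# Toy data: one common SNP with a single missing call, one rare SNP.
snp_common = [0, 1, 2, 1, 0, 1, None, 1, 2, 0]
snp_rare = [0] * 19 + [1]

print(passes_qc(snp_common), passes_qc(snp_rare))
```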

Reference Sequence Data
Based on the pedigree information, we first ranked the individuals in the Large White population according to their number of offspring. Then, we selected the top thirty ancestral individuals, which we called the key individuals in the Large White population, as the reference population. Among these 30 Large White pigs, there were 13 Canadian-line pigs and 17 French-line pigs. In addition, fourteen of these pigs also had chip data and participated in the subsequent analysis. The whole-genome resequencing of the 30 Large White pigs was carried out on an Illumina HiSeq platform with an average sequencing depth of 10-fold. Initial quality control of the resequencing data was performed with Trimmomatic (version 0.39) [80]. The clean reads were mapped to the Sus Scrofa 11.1 reference sequence with BWA (version 0.7.17) software [81]. Afterwards, GATK (version 4.1.8.1) software was used to realign the mapped reads and call the SNPs [82]. A total of 21,039,605 SNPs were called by GATK. Quality control was performed by removing duplicated sites and SNPs with no position information or located on sex chromosomes. We retained SNPs with minor allele frequency (MAF) > 0.05 and SNP call rate > 0.9, and removed SNPs that deviated from Hardy-Weinberg equilibrium (HWE p-value < 1.0 × 10−6); quality control was performed with VCFtools (version 0.1.16) [83]. After quality control, a total of 14,561,445 SNPs remained.

Genotype Imputation
Using the WGS reference data of 30 Large White pigs, the GeneSeek Porcine 50K chip data of 2655 target Large White pigs were imputed to WGS data. Genotype imputation was conducted with Beagle (version 5.2.2) software [84]. After imputation, quality control was performed with BCFtools (version 1.8) software in each of the two lines [85]. In each population, SNPs with an imputation accuracy lower than 0.8 or a minor allele frequency (MAF) lower than 0.05 were excluded. The imputation accuracy R2 at each SNP was the squared correlation between the known true genotypes and the expected dosages [86].
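The two measures used in this work, imputation accuracy (squared correlation between true genotypes and expected dosages) and genotypic concordance, can be illustrated on toy data. The sketch below follows the definition stated above and is not Beagle's internal estimator; the function names, the 0/1/2 genotype coding, and calling the best-guess genotype by rounding the dosage are assumptions.

```python
from typing import Sequence


def squared_correlation(true_genotypes: Sequence[float],
                        dosages: Sequence[float]) -> float:
    """Squared Pearson correlation between true genotypes and imputed dosages."""
    n = len(true_genotypes)
    mean_t = sum(true_genotypes) / n
    mean_d = sum(dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in dosages)
    if var_t == 0 or var_d == 0:
        return 0.0  # undefined for monomorphic sites; reported as 0 here
    return cov * cov / (var_t * var_d)


def genotype_concordance(true_genotypes: Sequence[int],
                         dosages: Sequence[float]) -> float:
    """Share of animals whose rounded dosage equals the true genotype."""
    best_guess = [round(d) for d in dosages]
    matches = sum(t == b for t, b in zip(true_genotypes, best_guess))
    return matches / len(true_genotypes)


# Toy SNP: six animals with masked true genotypes and their imputed dosages.
true_g = [0, 1, 2, 1, 0, 2]
dosage = [0.1, 0.9, 1.8, 1.2, 0.0, 1.4]

r2 = squared_correlation(true_g, dosage)
concordance = genotype_concordance(true_g, dosage)
print(f"R2 = {r2:.3f}, concordance = {concordance:.3f}")
```

In a cross-validation setting, `true_g` would hold genotypes that were masked before imputation, and a SNP would pass the filter above when its R2 is at least 0.8.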

Genome-Wide Association Studies
In this study, we used the sum of the estimated breeding value (EBV) and the residual of each individual as the adjusted phenotype to conduct GWAS. Single-SNP regression models were fitted independently to the GeneSeek Porcine 50K chip data and the imputed WGS data using GCTA (version 1.93.3beta) [87]. The statistical model is described below:

$$\mathbf{y} = \mathbf{W}\mathbf{b} + \mathbf{X}\mathbf{g} + \mathbf{e}$$

where $\mathbf{y}$ is the vector of the adjusted phenotypes (TNB, NSB, or GL); $\mathbf{b}$ is the vector of fixed effects (there was no fixed effect in the within-population analysis, and the line effect was added as a fixed effect in the combined Large White population); $\mathbf{g}$ is the vector of SNP effects; $\mathbf{W}$ and $\mathbf{X}$ are the incidence matrices for $\mathbf{b}$ and $\mathbf{g}$, respectively; and $\mathbf{e}$ is the vector of residual effects, with $\mathbf{e} \sim N(0, I\sigma_e^2)$. In addition, the cross-population meta-analysis was conducted with the default method of METAL software (version "2011-03-25") [88]. The default method in METAL combines p-values across studies, taking into account the sample size and the direction of the effect.
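A sample-size weighted combination of p-values, as described for the cross-population meta-analysis, can be sketched as follows. This is an illustration of the general Z-score approach and not METAL's code: the function name is hypothetical, the p-values and effect signs in the example are invented, and only the line sizes (1403 and 1252) come from the text.

```python
from math import sqrt
from statistics import NormalDist
from typing import Iterable, Tuple

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def sample_size_weighted_meta(
        studies: Iterable[Tuple[float, float, int]]) -> Tuple[float, float]:
    """Combine (p_value, effect, n) triples into a meta Z-score and p-value.

    Each two-sided p-value (0 < p <= 1) is converted to a Z-score that
    carries the sign of the effect and is weighted by sqrt(n).
    """
    weighted_sum = 0.0
    sum_sq_weights = 0.0
    for p_value, effect, n in studies:
        z = -_STD_NORMAL.inv_cdf(p_value / 2.0)  # magnitude from two-sided p
        if effect < 0:
            z = -z
        weight = sqrt(n)
        weighted_sum += weight * z
        sum_sq_weights += weight * weight
    z_meta = weighted_sum / sqrt(sum_sq_weights)
    p_meta = 2.0 * _STD_NORMAL.cdf(-abs(z_meta))
    return z_meta, p_meta


# Hypothetical SNP with the same effect direction in both lines.
z_same, p_same = sample_size_weighted_meta([(1e-4, 0.3, 1403),
                                            (1e-4, 0.2, 1252)])
print(f"Z = {z_same:.2f}, p = {p_same:.2e}")
```

When the two lines agree in direction, the combined p-value is smaller than either input; when they disagree, the signed Z-scores partly cancel, which is how the direction of effect enters the test.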
For the 50K chip data, the thresholds were determined by the Bonferroni correction method. On the −log10(p) scale, the genome-wide significant and suggestive thresholds were set to −log10(0.05/SNPs) and −log10(1/SNPs), respectively, where SNPs is the number of SNPs analyzed. For the imputed WGS data, we used 5 × 10−8 as the genome-wide significance level, which has also been applied in human GWAS [89], and adopted 5 × 10−6 as the suggestive level. The Manhattan and quantile-quantile (Q-Q) plots were drawn with the R package "qqman" [90].
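As a worked example of these thresholds, the sketch below evaluates the two chip-data formulas for the 43,549 SNPs retained in the combined population and converts the fixed WGS levels to the same −log10 scale. The function name is hypothetical; the per-line analyses would use their own SNP counts.

```python
from math import log10
from typing import Tuple


def chip_thresholds(n_snps: int) -> Tuple[float, float]:
    """Return (genome-wide, suggestive) thresholds on the -log10(p) scale."""
    genome_wide = -log10(0.05 / n_snps)
    suggestive = -log10(1.0 / n_snps)
    return genome_wide, suggestive


# Combined Large White population: 43,549 SNPs after quality control.
gw_chip, sug_chip = chip_thresholds(43_549)

# Fixed levels used for the imputed WGS data.
gw_wgs = -log10(5e-8)
sug_wgs = -log10(5e-6)

print(f"chip: genome-wide {gw_chip:.2f}, suggestive {sug_chip:.2f}")
print(f"WGS:  genome-wide {gw_wgs:.2f}, suggestive {sug_wgs:.2f}")
```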

Bioinformatics Annotation Analysis
The bioinformatics database BioMart (http://www.ensembl.org/, accessed on 27 August 2022) was used to screen candidate genes located within significant and suggestive loci. We only considered genes located in the ±20 kb region around significant and suggestive SNPs.
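The ±20 kb screening rule amounts to an interval-overlap test between each gene and a window around the SNP. The sketch below is illustrative only: the `Gene` type, the function name, and all gene names and coordinates are hypothetical placeholders, not annotations retrieved from BioMart; only the SNP position (SSC8: 56,076,247) is taken from the Results.

```python
from typing import List, NamedTuple, Sequence


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def genes_near_snp(chrom: str, pos: int, genes: Sequence[Gene],
                   flank: int = 20_000) -> List[str]:
    """Names of genes overlapping [pos - flank, pos + flank] on the same chromosome."""
    lo, hi = pos - flank, pos + flank
    return [g.name for g in genes
            if g.chrom == chrom and g.start <= hi and g.end >= lo]


# Hypothetical annotation records used only to exercise the function.
annotation = [
    Gene("GENE_A", "8", 56_090_000, 56_120_000),  # starts inside the window
    Gene("GENE_B", "8", 56_100_000, 56_150_000),  # starts beyond +20 kb
    Gene("GENE_C", "5", 56_070_000, 56_080_000),  # different chromosome
]

print(genes_near_snp("8", 56_076_247, annotation))
```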

Conclusions
In this study, we imputed 50K chip data to WGS data, with an average imputation accuracy of 0.876 before quality control and 0.943 after quality control (R2 > 0.8 and MAF > 0.05). Using imputed WGS data for GWAS is cost-effective and can reduce mapping noise. These results provide useful new insights into the genetic variation and genes associated with TNB, NSB, and GL traits in different lines of Large White pigs. However, further studies are needed to determine the optimal imputation strategy from chip to WGS data. GWAS based on chip data and imputed WGS data were performed for three reproductive traits in the Canadian and French lines of Large White pigs. Finally, combining the results of GWAS and bioinformatics annotation analysis, NOTCH2, KLF3, PLXDC2, NDUFV1, TLR10, CDC14A, EPC2, ORC4, ACVR2A, and GSC genes were identified as potential candidate genes associated with the TNB trait; NUB1, TGFBR3, ZDHHC14, FGF14, BAIAP2L1, EVI5, TAF1B, and BCAR3 were considered potential candidate genes related to the NSB trait; and PPP2R2B, AMBP, MALRD1, HOXA11, and BICC1 were detected as potential candidate genes related to the GL trait in Large White pigs. In addition, the size of the reference population used in this study was small, and the detection power of the GWAS was weak. In future work, we can consider expanding the size of the reference population and adopting a finer imputation strategy to discover causal mutations and validate the identified SNPs and genes.

Data Availability Statement:
The datasets analyzed during this study are available from the authors upon reasonable request.

Conflicts of Interest:
The authors declare no conflict of interest.