Investigation of the Genetic Architecture of Pigs Subjected to Breeding Intensification

Pigs are strategically important animals for the agricultural industry. An assessment of the genetic differentiation between pigs that have and have not undergone selection intensification is of particular interest. Our research was conducted on two groups of Large White pigs raised on the same farm but in different years. A total of 165 samples were selected: LW_A (n = 78, the Russian selection) and LW_B (n = 87, commercial livestock). For genotyping, we used the GeneSeek® GGP Porcine HD Genomic Profiler v1 (Illumina Inc, San Diego, CA, USA). To characterize the signatures of selection, we used smoothed FST and the identification of HBD (homozygous-by-descent) segments. Smoothed FST revealed 20 genomic regions with strong outliers, located on all chromosomes except SSC2, SSC3, and SSC8. The average realized autozygosity was 0.21 in Large White pigs of native selection (LW_A) and 0.29 in LW_B. LW_A showed 13,338 HBD segments (171 per animal), and LW_B showed 15,747 HBD segments (181 per animal). The outliers found by the smoothed FST method were partially localized in HBD regions. In these regions, the genes NCBP1, PLPPR1, GRIN3A, NBEA, TRPC4, HS6ST3, NALCN, SMG6, TTC3, KCNJ6, IKZF2, OBSL1, CARD10, ETV6, VWF, CCND2, TSPAN9, CDH13, CEP128, SERPINA11, PIK3CG, COG5, BCAP29, and SLC26A4 were identified. These genes may be of special interest for further study of their effects on the animal, since they can act as candidate genes for selection-significant traits.


Introduction
People noticed long ago that some traits are expressed differently among animals of the same species, selected the individuals with traits of interest, and raised them on their farms. This gave rise to selection, traditionally based on breeding aimed at fixing desirable traits in the population. The development of genetics and molecular biology considerably raised the efficiency of selection and breeding work, which also accelerated the improvement of animals. Long-term, targeted selection for the same economically important traits leads to the appearance of so-called selection signatures in the genome of farm animals, which are associated with specific traits [1].
The FST method is of particular interest in the search for selection signatures. The FST value is a measure of differentiation between populations [11]. For a locus, FST is calculated as the ratio of the variance of allele frequencies between populations to the sum of the variances within and between populations. A locus with a markedly higher FST value than other loci may indicate positive selection [12,13]. Various modifications of this method have been proposed, and it remains a widespread and reliable approach for identifying selection signatures between the populations under study [14,15].
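The per-locus definition above can be illustrated with a minimal sketch for two populations. It is an assumption-laden illustration rather than the estimator used in this study: the function name `locus_fst` is hypothetical, populations are weighted by sample size, and the within-population variance is taken as p(1 - p).

```python
def locus_fst(p1, p2, n1, n2):
    """Per-locus FST for two populations from allele frequencies p1 and p2.

    FST = between-population variance / (within + between variance).
    Populations are weighted by sample size (an illustrative assumption).
    """
    w1 = n1 / (n1 + n2)
    w2 = 1.0 - w1
    p_bar = w1 * p1 + w2 * p2
    # Variance of allele frequencies between populations.
    between = w1 * (p1 - p_bar) ** 2 + w2 * (p2 - p_bar) ** 2
    # Average variance within populations.
    within = w1 * p1 * (1.0 - p1) + w2 * p2 * (1.0 - p2)
    total = between + within
    if total == 0.0:
        return 0.0  # locus is monomorphic in both populations
    return between / total


# Hypothetical allele frequencies; group sizes follow the text (78 and 87).
fst_same = locus_fst(0.5, 0.5, 78, 87)       # identical frequencies
fst_fixed = locus_fst(1.0, 0.0, 50, 50)      # alternative alleles fixed
fst_partial = locus_fst(0.2, 0.8, 50, 50)    # intermediate differentiation
```

Identical frequencies give a value of 0 and fixation of alternative alleles gives 1, so loci with unusually high values stand out against the genome-wide background.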
The FST method is used to identify adjacent genomic regions under selection and is useful for analyzing distantly related populations, as it reveals subtle differences between them [16]. The smoothed FST method is based on the pure drift model of Nicholson [17], in which the values for individual SNPs are averaged within genome windows.
Selection signatures can also be localized in homozygous regions of various lengths [1]. In the creation of farm animal breeds, the accumulation of homozygosity allows purebred animals to possess certain qualities and to pass them on reliably to their offspring [18]. Runs of homozygosity (ROH) make it possible to reveal long homozygous regions in a genome [19]. In such research, it is reasonable to use the method proposed by Druet and Gautier, based on a model with multiple HBD (homozygous-by-descent) classes. This method allows autozygosity to be estimated according to the age of the ancestors.
Pigs are strategically important animals for agriculture. For this reason, targeted work has been carried out over the past decades to improve selection-significant indicators [20]. This has considerably increased the reproductive, growth, and meat performance of pigs. However, alongside high performance, various anomalies, congenital defects, limb problems, and susceptibility to various diseases have appeared [21]. An assessment of the genetic differentiation between pigs that have and have not undergone selection intensification is therefore of particular interest. We previously conducted research on pigs raised in the Russian Federation in different periods and identified signatures of selection associated with trends in different socioeconomic conditions [22]. However, studies of the modern Large White population have shown that the formation of selection signatures is influenced by the breeding centers themselves, since each implements its own breeding strategy. We therefore focused this work on pigs raised on a single farm in the Russian Federation, but in different periods. In addition to the FST method, we investigated regions of homozygosity using a model with multiple HBD classes, defined the genome signatures identified by both methods, determined QTL enrichment, and located genes in these regions.

Animals
No anesthesia, euthanasia, or animal sacrifice was used in this study. This study did not involve any endangered or protected species. Specialists of the participating holdings collected tissue samples according to standard monitoring procedures and guidelines, following the ethical protocols outlined in Directive 2010/63/EU (2010). The pig ear samples (ear pluck) were obtained as part of a general breeding monitoring procedure. The collection of ear samples is a standard practice in pig breeding [23].

Sampling and Genotyping
For our work, we chose Large White pigs that were kept on the same farm but in different years: group LW_A (date of assembly 2008-2010) and group LW_B (date of assembly 2014-2016). Pigs of the LW_A group belonged to the Russian selection, which is based on Large White pigs imported from England in 1923-1931. Long-term breeding work that took the local climate into account changed the English type of Large White pigs, and, in effect, a new domestic Large White breed was created, which at that time surpassed the English one in many respects. At the beginning of the 21st century, these pigs almost completely disappeared, and pigs of foreign selection began to be imported into the Russian Federation [24]. LW_A pigs were distinguished by good adaptation and resistance to various diseases and were less demanding in terms of housing and feeding conditions, but pigs of imported selection were significantly superior in growth rate, reproductive performance, and thinner fat. Pigs of the LW_B group belonged to commercial livestock delivered to the farm from Europe in 2013. In total, 165 samples were selected: 78 LW_A and 87 LW_B. Genomic DNA was extracted from tissue (ear pinch) using the DNA-Extran-2 reagent kit (OOO NPF Sintol, Russia). For genotyping, we used the GeneSeek® GGP Porcine HD Genomic Profiler v1 (Illumina Inc, San Diego, CA, USA), which included 68,516 SNPs evenly distributed with an average spacing of 25 kb. The total genotyping rate was 0.99.

Data Analysis
To visualize the relationships between populations, we performed singular value decomposition (SVD) with the base svd function in R. The heatmap was plotted from the genomic relationship matrix (GRM). To define selection signatures, we used the smoothed FST method. For smoothed FST, the data were filtered with the following parameters: hwe 1 × 10^-7, maf 0.01, geno 0.2, mind 0.2, indep-pairwise 50 5 0.2. A total of 42,442 variants passed the QC filters and were retained for further analysis. To filter the noise in the FST estimates, the lokern smoothing algorithm (Kernel Regression Smoothing with Local or Global Plug-in Bandwidth, lokern package in R [25]) was applied with the parameter n.out = 424, which corresponds to approximately one point per 100 SNPs and smooths the SNP data set. The x.out parameter was used to match the smoothed values to the positions in the SNP reference map. The top smoothed FST values, corresponding to 0.999%, were identified and translated into genomic positions of Sus scrofa 11.1, and the gene content of each region was analyzed.
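The smoothing step can be sketched as follows. This is a simplified stand-in, not the lokern algorithm: it uses Gaussian kernel regression with a fixed, user-chosen bandwidth, whereas lokern selects the bandwidth by a plug-in rule. The function name `kernel_smooth`, the bandwidth, and the toy data are hypothetical; only the variant count (42,442) and the grid size (424) come from the text.

```python
import math


def kernel_smooth(positions, values, n_out, bandwidth):
    """Gaussian kernel regression (Nadaraya-Watson) on an evenly spaced grid.

    Simplified illustration with a fixed bandwidth; requires n_out >= 2.
    """
    lo, hi = min(positions), max(positions)
    step = (hi - lo) / (n_out - 1)
    grid = [lo + i * step for i in range(n_out)]
    smoothed = []
    for g in grid:
        # Weight every SNP by its distance from the grid point.
        weights = [math.exp(-0.5 * ((x - g) / bandwidth) ** 2) for x in positions]
        total = sum(weights)
        smoothed.append(sum(w * v for w, v in zip(weights, values)) / total)
    return grid, smoothed


# Grid size reported in the text: about one output point per 100 SNPs.
n_snps = 42442
n_out_text = round(n_snps / 100)

# Hypothetical toy data: one high-FST SNP on a low background.
toy_pos = list(range(20))
toy_fst = [0.02] * 20
toy_fst[10] = 0.60
grid, smooth = kernel_smooth(toy_pos, toy_fst, n_out=5, bandwidth=2.0)
peak_index = smooth.index(max(smooth))
```

On the toy data the smoothed curve peaks at the grid point nearest the outlying SNP, while the isolated spike is damped toward the background level. This is the behavior that makes smoothed values easier to threshold than raw per-SNP FST.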
To identify HBD segments and to assess autozygosity (the inbreeding coefficient), we used the multiple HBD classes model implemented in the RZooRoH package [26,27]. The method is insensitive to MAF filtering and rather robust to LD structure; therefore, the data were not filtered by MAF or LD. The Rk rates were set from 2 to 512 (2, 4, 8, 16, 32, 64, 128, 256, 512). The inbreeding coefficient was calculated as the sum of autozygosity over all HBD classes. The total number of HBD segments, the average number per individual, the average segment length per individual, and the distribution of segments (and their average length) across the chromosomes were assessed for each group. We determined the SNP frequency (%) in the detected HBD segments and, for each group, selected the top HBD regions, provided that the HBD frequency was at least 60% and that the region included at least 10 SNPs. Based on the results of the two methods (FST and HBD), we identified genomic regions and genes probably associated with the intensification of the selection process in commercial pigs.
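The two summary steps described above, summing autozygosity over HBD classes and selecting top HBD regions, can be sketched as follows. This does not reproduce RZooRoH output handling: the function names are hypothetical, the per-class values are made up for illustration, and treating a region as a run of consecutive SNPs that all meet the frequency threshold is an assumption about how the 60% and 10-SNP criteria were applied.

```python
RK_CLASSES = [2, 4, 8, 16, 32, 64, 128, 256, 512]  # rates listed in the text


def inbreeding_coefficient(class_autozygosity):
    """Sum the autozygosity attributed to each HBD class (keyed by Rk)."""
    return sum(class_autozygosity.values())


def top_hbd_regions(hbd_freq, min_freq=0.60, min_snps=10):
    """Return inclusive (start, end) SNP index pairs for runs of consecutive
    SNPs whose HBD frequency is at least min_freq, spanning >= min_snps SNPs.
    """
    regions = []
    start = None
    for i, freq in enumerate(hbd_freq):
        if freq >= min_freq:
            if start is None:
                start = i  # a new candidate run begins
        else:
            if start is not None and i - start >= min_snps:
                regions.append((start, i - 1))
            start = None
    # Close a run that reaches the end of the chromosome.
    if start is not None and len(hbd_freq) - start >= min_snps:
        regions.append((start, len(hbd_freq) - 1))
    return regions


# Hypothetical per-class autozygosity for one animal (not study data).
example_classes = {64: 0.05, 128: 0.10, 256: 0.08}
f_total = inbreeding_coefficient(example_classes)

# Hypothetical HBD frequencies along a chromosome: one run of 12 SNPs
# qualifies; the final run of 4 SNPs is too short.
example_freq = [0.1] * 5 + [0.7] * 12 + [0.2] * 3 + [0.9] * 4
regions = top_hbd_regions(example_freq)
```

In the toy example only the 12-SNP run is returned, because the shorter run fails the minimum-SNP criterion even though its frequency is higher.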

Search and the Analysis of QTL Enrichment
The search for QTLs and genes and the QTL enrichment analysis were performed with the R package Genomic Annotation in Livestock for positional candidate LOci (GALLO) and the Ensembl genome browser [28]. A manual literature search was also carried out for data on associations of the genes with any traits in humans and animals.

Results
To assess the genetic structure of the studied populations of Large White pigs, we used SVD and a heatmap. Figure 1A,B shows that the pigs of groups LW_A and LW_B each form their own cluster.

The results of smoothing FST showed 20 regions of the genome with strong outliers, located on all chromosomes with the exception of SSC2, SSC3, and SSC8 (additional Table S1). These regions overlap with quantitative trait loci (QTLs), among which Meat and Carcass traits were the most represented (Figure 2A). In the QTL enrichment analysis, the most significantly enriched traits were pH 24 hr post-mortem (loin), meat color a*, and cortisol level (Figure 2B).
Figure 2B. QTL enrichment analysis (the more intense the red shade, the more significant the enrichment; the area of the circles is proportional to the number of QTLs; richness factor: the ratio of the number of QTLs annotated in the study regions to the total number of each QTL in the reference database).
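The richness factor used in the QTL enrichment analysis can be computed directly from two counts, as sketched below. The function names and all counts are hypothetical. The hypergeometric upper-tail probability is included only as a generic way to score enrichment and is an assumption; it may differ from what GALLO computes internally.

```python
from math import comb


def richness_factor(n_in_regions, n_in_database):
    """Ratio of QTLs of one trait annotated in the study regions to the
    total number of QTLs of that trait in the reference database."""
    return n_in_regions / n_in_database


def hypergeom_upper_tail(k, n_drawn, n_success, n_total):
    """P(X >= k) when n_drawn items are sampled without replacement from
    n_total items, of which n_success belong to the trait of interest."""
    denom = comb(n_total, n_drawn)
    upper = min(n_drawn, n_success)
    numer = sum(
        comb(n_success, i) * comb(n_total - n_success, n_drawn - i)
        for i in range(k, upper + 1)
    )
    return numer / denom


# Hypothetical counts (not study data): 5 of 50 database QTLs for a trait
# fall in the study regions, out of 40 annotated QTLs and 2000 in total.
rf = richness_factor(5, 50)
p_enrich = hypergeom_upper_tail(5, 40, 50, 2000)
```

A larger richness factor means that a larger share of the known QTLs for a trait falls in the candidate regions; the tail probability indicates how surprising that overlap would be under random sampling.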

The class Rk = 128 contributed greatly to autozygosity in LW_B (about 0.1 of the genome) (Figure 4A,B; additional Table 2).

In general, LW_A had 13,338 HBD segments, an average of 171 per animal; LW_B had 15,747 HBD segments, an average of 181 per animal. The longest HBD segment was found in LW_A (138.52 Mb, 1759 SNPs, SSC1). The average length of HBD segments was about 2.47 Mb (54 SNPs) for LW_A and 3.41 Mb (77 SNPs) for LW_B. The average length of HBD segments on the chromosomes (taking the different classes into account) is shown in Figure 5. Segments of the Rk = 2 class were found only in pigs of the LW_A group. These segments were located on SSC1, and their average length was 108.67 Mb. Segments of class Rk = 4 were found on SSC1, SSC2, SSC4, SSC5, SSC6, SSC9, SSC13, and SSC17 in LW_A and on SSC1, SSC2, SSC8, SSC9, SSC13, and SSC15 in LW_B. HBD segments of class Rk = 8 were absent on SSC10 in LW_A and on SSC10 and SSC16 in LW_B. From Rk = 16 up to Rk = 256, HBD segments were relatively evenly distributed across all chromosomes in both LW_A and LW_B.
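The per-animal averages quoted above follow directly from the segment totals and the group sizes; the short check below uses only numbers reported in the text (the variable names are illustrative).

```python
# (total HBD segments, number of animals) as reported in the text.
groups = {"LW_A": (13338, 78), "LW_B": (15747, 87)}

# Average number of HBD segments per animal in each group.
per_animal = {name: segments / animals for name, (segments, animals) in groups.items()}

for name, value in per_animal.items():
    print(f"{name}: {value:.0f} HBD segments per animal")
```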
The SNP frequencies (%) in the detected HBD segments were estimated for each group of pigs and plotted against the position of the SNP on the autosomes (Figures 6 and 7).
For each group, the top HBD regions were selected, provided that the HBD frequency was at least 60% and that at least 10 SNPs were included. As a result, LW_A had 4 regions, all located on SSC1 (Table 1). LW_B had 10 regions, of which 5 were on SSC1, 2 on SSC6, and one each on SSC10, SSC14, and SSC15. The top HBD regions did not overlap between groups.
In both groups, the top HBD regions overlapped with QTLs, among which Meat and Carcass traits were the most represented (Figure 8). In LW_A pigs, relative to LW_B, the QTL types Exterior, Health, and Production were more represented. In LW_B pigs, the QTL type Reproduction was more represented. In the QTL enrichment analysis of the top HBD regions, the most represented traits in LW_A were pH of longissimus dorsi, carcass weight (hot), body weight (weaning), backfat at last rib, and average daily gain. In LW_B, they were shoulder subcutaneous fat thickness, shear force, loin muscle area, fat area percentage in the carcass, estimated carcass lean content, and dressing percentage. In the top HBD regions, named protein-coding genes (according to Ensembl genome browser 104) were identified, but genes encoding lncRNA, snoRNA, and snRNA were also represented in these regions with a high frequency (additional Table 1).
The outliers detected by the smoothed FST method were partially localized in HBD regions. Table 2 shows the regions in which high outliers were found by FST smoothing and their frequencies in HBD regions.

Discussion
The average length of HBD segments, their chromosomal distribution, and the proportion of the genome covered by HBD can be used as indicators of the origin and history of a population and can also reflect events of artificial selection. The Large White breed was created in the 1870s-1880s and officially received its name in 1885 [24]. According to our results, the shortest HBD segments were inherited from ancestors about 128 years ago, which broadly corresponds to the period when the Large White breed was formed. Subsequently, the Large White breed took part in the creation and improvement of most modern European breeds, as well as of local breeds created in the USSR. Our study showed that the total autozygosity of LW_B pigs was 0.29 of the genome, of which segments of the Rk = 128 class contributed about 0.1. On this basis, we may assume that ancestors from about 64 years ago contributed greatly to the autozygosity of this group. This period coincided with growing intensification in pig breeding and the formation of commercial livestock characterized by high production indicators [29]. In turn, the total autozygosity of pigs of the LW_A group was 0.21 of the genome, but no single class made a dominant contribution: the contributions of ancestors from about 64 (Rk = 128) and 128 (Rk = 256) years ago amounted to 0.05. It can be assumed that the period of intensification was also reflected in the LW_A population, but to a much smaller extent, which made it possible to preserve the signatures inherited from more distant ancestors.
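The correspondence used in this paragraph between class rates and ancestor age (Rk = 128 with about 64, Rk = 256 with about 128) amounts to halving the rate. The sketch below applies that rule and expresses the reported class contributions as shares of total autozygosity. The function name is hypothetical, and the shares are simple ratios of the rounded values quoted in the text, so they are approximate.

```python
def approx_ancestor_age(rk):
    """Approximate age of the common ancestors for an HBD class,
    following the correspondence used in the text (age = Rk / 2)."""
    return rk / 2


ages = {rk: approx_ancestor_age(rk) for rk in (128, 256)}

# Reported values: (total autozygosity, contribution of the Rk = 128 class).
reported = {"LW_A": (0.21, 0.05), "LW_B": (0.29, 0.10)}

# Share of total autozygosity attributable to the Rk = 128 class.
share_rk128 = {name: part / total for name, (total, part) in reported.items()}
```

Under these rounded figures, the Rk = 128 class accounts for roughly a third of total autozygosity in LW_B and roughly a quarter in LW_A, which is consistent with the stronger imprint of the intensification period on the commercial group.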
Common autozygosity regions in a population identify selection hotspots [30]. In the groups under study, the top HBD regions were determined, provided that their frequency was at least 60% and that they included at least 10 SNPs. In both groups, QTLs for Meat and Carcass were the most represented, but in group LW_A traits associated with carcass weight stood out, whereas in LW_B traits associated with lean pork production stood out.
Outliers identified by FST smoothing were partially localized in HBD segments. For example, SNP rs81349176 (SSC1) occurred with a frequency of 0.73 in HBD in LW_A. The functional significance of SNP rs81349176 is difficult to interpret since it is located in an intergenic region. However, it is worth noting that the adjacent genes CDH19 and CDH7 belong to the cadherin family, which plays a key role in the regulation of adhesion. Dysregulation of adhesion molecules often causes various diseases, including inflammation and tumors [31,32,33]. Earlier, we also proposed a hypothesis about the connection of cadherins with capped hock in pigs [34]. In the future, it will be important to study cadherins in more detail in relation to limb tumors, since these are a source of significant economic losses in pig production [35,36].
It is interesting to note variants in genes that influence hematopoiesis and hereditary diseases of the circulatory and cardiovascular systems, as well as other pathologies (HS6ST3, IKZF2, ETV6, SMG6, SLC26A4, VWF). These genes participate in proliferation and differentiation, adhesion, migration, inflammation, fibrillation, and various other processes (HS6ST3) and in the regulation of lymphocyte development (IKZF2); play a role in hematopoiesis and malignant transformation (ETV6); and are associated with an increased risk of ischemic heart disease (SMG6) and with inherited hearing loss in domestic animals (SLC26A4).
Genes regulating important components of the nervous system were also identified (KCNJ6, TRPC4, NALCN, GRIN3A, NBEA). Mutations in these genes are associated with severe developmental delay, facial dysmorphism, mental retardation, and reduced cognitive ability (KCNJ6, NALCN). The genes play an important role in dopamine-related processes, including addiction and attention (TRPC4), in physiological and pathological processes in the central nervous system (GRIN3A), and in autism (NBEA).
These genes may be of special interest for further study of their effects on the animal, since they may be candidate genes linked to the physiological features that underlie high production performance, but they may also be associated with changes in the nervous system and with various pathologies, including those of the circulatory and cardiovascular systems.

Conclusions
Studying the presence and localization of selection signatures (FST) and identifying regions of homozygosity (HBD) in two groups of pigs bred at different times on the same farm enabled us to identify differences between the populations. The presence of QTLs located in regions of homozygosity and associated with traits targeted by breeding work makes such regions the most promising for the search for candidate genes associated with productivity and with the presence of diseases.
In the genome regions determined by using the FST and HBD methods, we identified genes that may have contributed to the changes associated with the intensification of the selection process in pigs. In general, the results presented in our work show promising prospects for genome scanning using FST and HBD methods for studying population history, as well as for identifying genomic regions and genes associated with important economic traits and various pathologies.