Genetic Diversity and Population Structures in Chinese Miniature Pigs Revealed by SINE Retrotransposon Insertion Polymorphisms, a New Type of Genetic Markers

Simple Summary
Our previous studies suggested that short interspersed nuclear element (SINE) retrotransposon insertion polymorphisms (RIPs), a recently developed type of molecular marker, are well suited to population genetic analysis and molecular breeding in pigs, and possibly in other livestock animals as well. However, no report is available on the application of SINE RIPs to population genetic analysis in livestock, including pigs. Here, we evaluated 30 SINE RIPs in several indigenous Chinese miniature pig breeds, including three subpopulations of Bama pigs (BM-cov, BM-clo, and BM-inb). BM-cov is a subpopulation conserved in the national conservation farm; BM-clo is a closed population maintained for over 30 years, founded with only 2 boars and 14 sows imported from the breed's area of origin; and BM-inb is a line derived from the BM-clo population by 18 generations of continuous inbreeding. To our knowledge, this is the first report of the genetic diversity, breed differentiation, and population structures of these populations based on SINE RIPs, and it suggests the feasibility of SINE RIPs in pig genetic analysis.

Abstract
RIPs have been developed as effective genetic markers and widely applied for genetic analysis in plants, but few reports are available for domestic animals. Here, we established 30 new molecular markers based on SINE RIPs and applied them to population genetic analysis in seven Chinese miniature pig populations. The data revealed that the closed herd (BM-clo) and the inbred herd (BM-inb) of Bama miniature pigs were distinctly different from the BM-cov herd in the conservation farm and from the other miniature pigs (Wuzhishan, Congjiang Xiang, Tibetan, and Mingguang small ear). These latter five miniature pig populations can be further classified into two clades based on a phylogenetic tree: one included BM-cov and Wuzhishan, and the other included Congjiang Xiang, Tibetan, and Mingguang small ear, a grouping that was well supported by the structure analysis. The polymorphic information contents estimated using SINE RIPs are lower than the estimates based on microsatellites. Overall, the genetic distances and breed relationships between these populations revealed by the 30 SINE RIPs generally agree with their evolutionary histories and geographic distributions. We demonstrated the potential of SINE RIPs as new genetic markers for genetic monitoring and population structure analysis in pigs, which may be extended to other livestock animals.


Introduction
Miniature pigs, owing to their physiological, anatomical, and genetic similarities to humans and their relatively easy handling, are regarded as a key animal model in biomedical studies [1,2]. There are several indigenous miniature pig breeds in China, such as Xiang, Wuzhishan, Bama, Mingguang small-ear, and Tibetan. All of them originated in remote mountainous areas of the south or south-west of China [3][4][5]. The Congjiang Xiang pig, a subpopulation of the Xiang pig breed, originated in Guizhou Province, the Wuzhishan pig originated in the Wuzhishan Mountains of Hainan Island [3], and the Mingguang small-ear pig originated in Tengchong, Yunnan Province. Tibetan pigs, which originated on the Tibetan Plateau and are adapted to a high-altitude, low-temperature environment, are distributed across Sichuan, Gansu, Yunnan, and Tibet. The Bama miniature pig breed formed in the isolated Bama Yao Autonomous County of Guangxi Zhuang Autonomous Region, in the south-west of China [4]. Currently, three subpopulations are kept in two conservation farms located in Guangxi. The national conservation farm in Bama County has conserved one population, named BM-cov, while one closed herd (BM-clo) and one highly inbred line (BM-inb) are kept at Guangxi University [4]. All these breeds are characterized by early sexual maturity, good disease resistance, and strong adaptability to local environments [4][5][6]. In addition, because of isolation and long-term natural and artificial selection, inbreeding has continuously increased in these populations, and their genetic diversity is expected to be significantly lower than that of other Chinese local pig populations [4,7]. However, the genetic diversity, breed differentiation, and population structures of these populations remain largely unknown.
Retrotransposons, as major genomic parasites of mammals, occupy 30-45% of mammalian genomic sequences [8][9][10][11][12][13] and can be classified into three major groups: long interspersed nuclear elements (LINEs), short interspersed nuclear elements (SINEs), and endogenous retroviruses (ERVs) [14]. Some retrotransposons remain active: they can move within the host genome and generate insertion polymorphisms within a specific population [15][16][17]. Owing to their ubiquitous distribution and high copy number in the genome, RIPs are believed to be suitable for genetic marker development, with potential use in population genetic analysis [18]. This is particularly true for SINEs, which are the second most abundant retrotransposons in the genomes of most mammals but are the most extensively distributed because of their small size [11,15,19]. Furthermore, SINE RIPs have been suggested as "nearly ideal" genetic markers [20]. Insertion polymorphisms of the primate SINE (Alu) have been extensively applied as genetic markers for population genetic analysis in humans [21]. Our recent studies revealed that SINEs account for 11.05% of the pig genome [22] and are evenly distributed across chromosomes; large numbers of SINE RIPs (over 10,000) have been identified in the dog genome [19,23], and a similar abundance could therefore be expected in the pig genome. The objectives of the present study were to assess SINE RIPs as a new type of molecular marker in terms of polymorphic information content and heterozygosity in pigs that have evolved recently. 
Moreover, the genetic diversity, differentiation, and population relationships among BM-inb, BM-clo, and BM-cov were studied using RIPs and compared with those of the other miniature pigs (Xiang, Wuzhishan, Mingguang small-ear, Tibetan), an Italian native pig breed, and Landrace; the latter two were used as outgroup controls because of their origin distant from China and the absence of potential intercrossing with Chinese miniature pigs. Our data provide an important validation of SINE RIPs for genetic monitoring and population structure analysis, suggesting their potential application in genetic analysis and molecular breeding in pigs, and even in other livestock, since most livestock share a similar mobilome landscape.

Animals and DNA Isolation
Ear or blood samples were collected from seven miniature pig populations (BM-cov, BM-clo, BM-inb, Congjiang Xiang, Wuzhishan, Tibetan, and Mingguang small ear), one Italian pig breed (Sicilian black pig/Nero Siciliano pig), and one commercial breed (Landrace pig), with sample sizes of 24, 29, 22, 28, 24, 28, 20, 32, and 32, respectively. The TIANamp Genomic DNA Kit (TIANGEN Biotech Co. Ltd., Beijing, China) was used for DNA isolation from the samples of each animal. The quality of the DNA was verified using a NanoPhotometer (Implen, Munich, Germany) and electrophoresis. BM-cov samples were taken from the national Bama conservation farm in Bama Autonomous County, Guangxi Zhuang Autonomous Region. BM-clo and BM-inb samples were collected from the Guangxi University pig farm in Nanning, Guangxi province. Congjiang Xiang samples were provided by the Guizhou University pig farm in Guiyang, Guizhou province. Wuzhishan samples were from the national Wuzhishan conservation farm in Haikou, Hainan province. Tibetan samples were from the pig farm of the Animal Husbandry Research Institute of Ganzi Tibetan Autonomous Prefecture, Sichuan province. Mingguang small ear pig samples were taken from the national Mingguang small ear pig conservation farm in Tengchong, Yunnan Province. The Sicilian black pig is an autochthonous genetic type that lives in the woods of the Nebrodi and Madonie mountains on the northern coast of the Mediterranean island of Sicily (Italy) [24]. The Landrace pig samples, used as a positive outbred control, were taken from a breeding farm in Xuzhou, China. Photos of six local breeds are shown in Figure 1A. The geographical distribution of the seven miniature pig populations is shown in Figure 1B, and the Sicilian black pigs are shown in Figure 1C.

Development of RIP Markers
SINE RIPs were identified based on our recently established protocol (unpublished data). Briefly, the process was divided into four main steps. (1) SINE insertions in the genomes were screened using RepeatMasker with a custom library built in advance [22]. (2) The flanking sequences of these SINE insertions in the nonreference genomes were mapped to the reference genome using Blat [25], so that the position of each insertion relative to the reference genome was obtained for each nonreference genome. (3) The differential insertions, designated as putative SINE insertion polymorphisms, were obtained using bedtools window. (4) The putative SINE RIPs were manually verified by local BLAST [26] and PCR amplification. In the current study, a total of 36 SINE RIPs (two per chromosome) were randomly selected for PCR evaluation. Of these 36 SINE RIPs, 30 showed clear polymorphic bands across the seven miniature pig populations by PCR analysis. PCR primers were designed according to the 5′ and 3′ flanking sequences of the SINE insertion sites and synthesized by TSINGKE Biological Technology Co., Ltd. (TSINGKE, Nanjing, China). All primer sequences and information are listed in Table S1.
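Step (3) of the protocol can be pictured with a small sketch. The code below is not the authors' pipeline and does not call bedtools; it is a pure-Python illustration of the underlying idea, keeping only those insertions from a nonreference genome that have no reference insertion within a fixed window. The function name `differential_insertions`, the 50 bp window, and the toy coordinates are all assumptions made for illustration.

```python
from collections import defaultdict


def differential_insertions(query, reference, window=50):
    """Return query insertions with no reference insertion within `window` bp.

    Both inputs are lists of (chromosome, start, end) tuples.
    The window size is an illustrative assumption, not a value from the paper.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in reference:
        by_chrom[chrom].append((start, end))

    unique = []
    for chrom, start, end in query:
        # An insertion is "shared" if any reference interval on the same
        # chromosome overlaps the query interval extended by `window` bp.
        shared = any(
            r_start <= end + window and r_end >= start - window
            for r_start, r_end in by_chrom[chrom]
        )
        if not shared:
            unique.append((chrom, start, end))
    return unique


# Toy coordinates (hypothetical, for illustration only).
reference_sines = [("chr1", 1000, 1250), ("chr2", 5000, 5300)]
nonreference_sines = [
    ("chr1", 1010, 1260),  # close to a reference insertion: shared
    ("chr1", 9000, 9280),  # no reference insertion nearby: putative RIP
    ("chr2", 7000, 7290),  # no reference insertion nearby: putative RIP
]
putative_rips = differential_insertions(nonreference_sines, reference_sines)
print(putative_rips)
```

In practice, candidates found this way would still need the manual BLAST and PCR verification described in step (4).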
PCR reactions were carried out in a total volume of 20 µL, composed of 1 µL of 50 ng/µL genomic DNA, 10 µL of 2× Taq Master Mix buffer (Vazyme, Nanjing, China), 1 µL of 10 µM primer F, 1 µL of 10 µM primer R, and 7 µL of water. The PCR reaction conditions were set as follows: an initial denaturation at 94 °C for 5 min, followed by 30 cycles (94 °C for 30 s, 58 °C for 30 s, 72 °C for 1 min) and a final extension at 72 °C for 10 min. The PCR products were detected by electrophoresis in a 1.5% agarose gel with 1× TAE buffer at a constant voltage of 130 V for 30 min. Gels were stained with ethidium bromide and visualized under UV light.

Statistics and Population Genetic Analyses
Allele frequencies, number of effective alleles per locus (Ne), observed heterozygosity (Ho), expected heterozygosity (He), and fixation indices (F, including Fis, Fst, and Fit) were determined, and the Hardy-Weinberg equilibrium test was performed, using Popgene [27] (version 1.32). The polymorphic information content (PIC) was calculated according to the formula:

$$\mathrm{PIC} = 1 - \sum_{i=1}^{n} p_i^2 - \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} 2 p_i^2 p_j^2$$

where n was the number of alleles (n = 2 for the biallelic markers used here), p_i was the frequency of the insertion allele in the population, and p_j was the frequency of the deletion allele in the population.
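For a biallelic insertion/deletion marker, these parameters reduce to simple functions of the genotype counts. The following is a minimal sketch, not Popgene's implementation: it assumes the standard PIC definition for n alleles, which for two alleles with frequencies p and q becomes 1 − p² − q² − 2p²q², and it uses the uncorrected expected heterozygosity 1 − p² − q², so values may differ slightly from software that applies a small-sample correction. The function name `locus_stats` and the example counts are hypothetical.

```python
def locus_stats(n_ins_hom, n_het, n_del_hom):
    """Per-locus diversity parameters for one biallelic SINE RIP.

    Arguments are counts of SINE +/+, SINE +/- and SINE -/- animals.
    """
    n = n_ins_hom + n_het + n_del_hom
    p = (2 * n_ins_hom + n_het) / (2 * n)  # insertion allele frequency
    q = 1 - p                              # deletion allele frequency
    homozygosity = p * p + q * q
    return {
        "p_insertion": p,
        "Ho": n_het / n,                   # observed heterozygosity
        "He": 1 - homozygosity,            # expected heterozygosity (uncorrected)
        "Ne": 1 / homozygosity,            # effective number of alleles
        "PIC": 1 - homozygosity - 2 * p * p * q * q,
    }


# Hypothetical counts for one marker in a population of 24 animals.
stats = locus_stats(n_ins_hom=6, n_het=12, n_del_hom=6)
print(stats)
```

With equal allele frequencies (p = q = 0.5) the sketch gives He = 0.5, Ne = 2, and PIC = 0.375, which is the largest PIC a biallelic marker can reach under this formula.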
Cluster analysis based on Nei's genetic distance [28] was carried out, and a UPGMA tree was constructed with MEGA7 [29]. Based on the SINE RIP genotypes, we performed principal component analysis (PCA) using R (v. 3.6.3). The population structure of the seven miniature pig groups was inferred using the Bayesian clustering method in STRUCTURE [30].
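As an illustration of the distance underlying the UPGMA tree, the sketch below computes Nei's standard genetic distance, D = −ln(Jxy / √(Jx·Jy)), from insertion allele frequencies at biallelic loci. It assumes the uncorrected form of the distance (no small-sample adjustment); which variant the analysis software applied is not stated in the text, so this is an assumption. The function name `nei_distance` and the frequencies are hypothetical.

```python
import math
from itertools import combinations


def nei_distance(freqs_x, freqs_y):
    """Nei's standard genetic distance for biallelic loci.

    Each argument lists the insertion allele frequency per locus, in the
    same locus order. Raises ValueError if the populations share no allele
    at any locus (Jxy = 0), where the distance is undefined.
    """
    jx = jy = jxy = 0.0
    for p, r in zip(freqs_x, freqs_y):
        jx += p * p + (1 - p) ** 2         # homozygosity in population X
        jy += r * r + (1 - r) ** 2         # homozygosity in population Y
        jxy += p * r + (1 - p) * (1 - r)   # shared identity between X and Y
    return -math.log(jxy / math.sqrt(jx * jy))


# Hypothetical insertion allele frequencies at three loci.
populations = {
    "pop_a": [0.10, 0.50, 0.90],
    "pop_b": [0.20, 0.40, 0.80],
    "pop_c": [0.90, 0.50, 0.10],
}
distances = {
    (a, b): nei_distance(populations[a], populations[b])
    for a, b in combinations(populations, 2)
}
for pair, d in distances.items():
    print(pair, round(d, 4))
```

A pairwise matrix built this way is the kind of input a UPGMA clustering step would take.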

Evaluation of SINE RIPs in Chinese Miniature Pig Populations
Thirty-six SINE RIP genetic markers (two SINE RIPs in each chromosome), predicted according to the protocol described in the methods, were selected to evaluate their polymorphisms in 243 animals of seven Chinese miniature pig populations, one commercial breed, and one Italian native breed (Sicilian black pig), which was selected as an outbred control. The genomic coordinates of these markers, their PCR primers, and the predicted PCR product sizes are listed in Table S1. Thirty RIPs displayed polymorphism in these miniature pig populations and were used for further population genetic analysis. Six RIP markers were monomorphic in these miniature pigs and were not used in the present study (Table S1). Representative PCR detection results of these RIPs are shown in Figure 2, and the PCR detection results of the final thirty RIPs are summarized in Figure S1. These RIPs were biallelic with clear and stable amplified bands. Based on these data, three genotypes were identified: the first, a single small band representing the homozygote lacking the SINE insertion, defined as SINE −/−, with band sizes ranging from 273 to 450 bp; the second, a single large band representing the homozygote carrying the SINE insertion, named SINE +/+, with PCR product sizes ranging from 415 to 739 bp; and the third, the heterozygote, named SINE +/−, with both small and large bands. Across the thirty RIP markers, the inbred population BM-inb displayed very low within-breed diversity, and only three RIP markers were polymorphic. Low within-breed diversity was also observed for Landrace and the closed herd of Bama miniature pigs, where 15 and 17 RIP markers were polymorphic, respectively, while Bama miniature pigs at the conservation farm displayed within-breed diversity similar to that of the other miniature pig breeds and the Italian pig breed. 
In total, 20, 30, 26, 23, 30, and 25 polymorphic RIPs were detected in BM-cov, Congjiang Xiang, Wuzhishan, Sichuan Tibetan, Mingguang small ear, and Sicilian black pigs, respectively (Table 1). The genotype and allele frequencies of these RIPs and the Hardy-Weinberg equilibrium test for each RIP in each breed are summarized in Table 1 and Table S2. Significant variation in SINE insertion/deletion allele frequencies across these breeds was observed. However, most RIP insertion/deletion alleles in both BM-clo and BM-inb tend to be fixed (13 RIPs) (Table 1 and Table S2).
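The path from gel bands to genotype calls and a Hardy-Weinberg check can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the band observations are invented, and the Pearson chi-square statistic used here is one common form of the test, which may not match the exact statistic reported by Popgene.

```python
def call_genotype(has_small_band, has_large_band):
    """Translate band presence on the gel into a SINE RIP genotype."""
    if has_small_band and has_large_band:
        return "+/-"   # heterozygote: both bands
    if has_large_band:
        return "+/+"   # homozygous for the SINE insertion
    if has_small_band:
        return "-/-"   # homozygous for the allele lacking the insertion
    return None        # no band: failed amplification


def hwe_chi_square(genotypes):
    """Return (insertion allele frequency, Pearson chi-square) for one locus."""
    called = [g for g in genotypes if g is not None]
    n = len(called)
    observed = {g: called.count(g) for g in ("+/+", "+/-", "-/-")}
    p = (2 * observed["+/+"] + observed["+/-"]) / (2 * n)
    q = 1 - p
    expected = {"+/+": n * p * p, "+/-": 2 * n * p * q, "-/-": n * q * q}
    chi2 = sum(
        (observed[g] - expected[g]) ** 2 / expected[g]
        for g in expected
        if expected[g] > 0
    )
    return p, chi2


# Hypothetical gel readings as (small band seen, large band seen) per animal.
bands = [(False, True)] * 10 + [(True, False)] * 10
genotypes = [call_genotype(small, large) for small, large in bands]
p_insertion, chi2 = hwe_chi_square(genotypes)
print(p_insertion, chi2)
```

For a biallelic locus the statistic has one degree of freedom, so a value above 3.841 indicates departure from equilibrium at the 0.05 level; the invented example, which contains no heterozygotes, deviates strongly.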

Genetic Diversity of Chinese Miniature Pig Populations Revealed by SINE RIPs
The genetic parameters, including Ne, He, Ho, PIC, and Fis for each population, are presented in Table 2. The average Ne among the nine populations was 1.3481, ranging from 1.0542 to 1.5813. The average PIC among the nine populations was 0.1736, ranging from 0.0263 to 0.2708; the Mingguang small ear population was the highest and BM-inb the lowest. The average He among the nine populations was 0.2139, ranging from 0.0333 (BM-inb) to 0.3477 (Mingguang small ear). A similar pattern of variation was observed for Ho in these breeds. BM-inb had the lowest genetic diversity, with the lowest Ne, He, Ho, and PIC values, and BM-clo had the second-lowest diversity. These results further confirmed that inbreeding reduces genetic diversity: the BM-inb pigs had been inbred for many years, and most loci were homozygous. The genetic diversity of BM-clo also decreased significantly because of the limited bloodlines available for mating within the subpopulation. The genetic diversity of BM-cov was similar to that of the other Chinese miniature pig breeds, and the estimates of Ho and He were relatively higher in Congjiang Xiang, Wuzhishan, Tibetan, Mingguang small ear, and BM-cov than in the Italian Sicilian black breed and Landrace. The Mingguang small ear population displayed the highest genetic diversity among the investigated populations.  Figure 3. Overall, the BM-clo and BM-inb populations showed a relatively higher degree of differentiation from BM-cov than was observed among Congjiang Xiang, Mingguang small ear, Wuzhishan, Tibetan, and BM-cov. Estimates for BM-clo and BM-inb against BM-cov were 0.2773 and 0.4038, respectively, while the average differentiation among the other miniature pigs (Congjiang Xiang, Mingguang small ear, Wuzhishan, Tibetan, and BM-cov) was 0.0907 ± 0.0376. Both BM-clo and BM-inb showed relatively high genetic differentiation from the other miniature pig populations as well as from the Italian breed and Landrace. 
The genetic differentiation between BM-inb and Landrace was the highest (Fst = 0.6743), followed by that between BM-inb and Sicilian black (Fst = 0.6294).  Table 1). A very high inbreeding coefficient (Fis >0.1) was found in Wuzhishan (0.0397) and BM-cov (0.0299) pigs (Table 2). However, as expected, BM-inb, BM-clo, and Landrace had low Fis scores, with 27, 13, and 14 SINE RIPs being homozygous in these three populations, respectively.

Genetic Distances between Chinese Miniature Pig Populations Based on SINE RIPs
The pairwise Nei's distances between populations are shown in Table 3. The genetic distances among the seven miniature pig populations were relatively low, ranging from 0.01 to 0.27 by RIP scores, while Sicilian black and Landrace, which were included as an outbred group for the genetic distance computations, showed large distances from all miniature pig populations, indicating a great difference between these breeds and the Chinese miniature pigs. The smallest genetic distance (0.01) was obtained between BM-clo and BM-inb, indicating a very low divergence between these populations. However, unexpectedly large genetic distances (0.13) were also obtained between the subpopulations of the BM breed (BM-cov, BM-inb, and BM-clo). BM-cov, as a subpopulation of Bama miniature pigs, has relatively high genetic distances from BM-clo (0.16) and BM-inb (0.21) but relatively small genetic distances from the other miniature pig breeds, ranging from 0.07 (Congjiang Xiang and Wuzhishan) to 0.11 (Mingguang small ear and Tibetan pigs). On the other hand, both BM-clo and BM-inb have relatively small genetic distances from the Congjiang Xiang and Mingguang small ear breeds but large genetic distances from BM-cov, whereas the average pairwise genetic distance among Congjiang Xiang, Tibetan, and Mingguang small ear pigs was 0.04. Congjiang Xiang had the lowest distances from the rest of the Chinese miniature pigs, ranging from 0.03 (Mingguang small ear pigs) to 0.07 (BM-cov), except for BM-inb (0.17) and BM-clo (0.12).

Population Structure of Chinese Miniature Pigs Revealed by SINE RIPs
To measure the population structure and degree of admixture, we applied the STRUCTURE algorithm and principal component analysis (PCA), and a UPGMA tree was generated based on Nei's genetic distance. We analyzed the grouping when K ranged from 2 to 7, meaning that we presupposed that all individuals originated from K ancestors or breeds. The clustering results based on STRUCTURE are shown in Figure 4A. Interestingly, BM-clo and BM-inb were separated from the other breeds when K = 2, and they lacked any affinity with the Chinese miniature pigs, even BM-cov. When K = 3, the European pigs (Sicilian black) and Landrace were separated from the Chinese miniature pigs, forming three distinct ancestries; Congjiang Xiang, Mingguang small ear, Tibetan, and BM-cov had large proportions of common ancestry. This agrees with the results of the PCA and the UPGMA tree analyses, which placed BM-inb and BM-clo in a cluster distinct from the other miniature pigs (Congjiang Xiang, Mingguang small ear, Tibetan, BM-cov, and Wuzhishan) and from the outbred Sicilian black and Landrace pigs (Figure 4B,C). A particular feature at K = 4 is that Congjiang Xiang, Mingguang small ear, and Tibetan were separated completely from BM-cov and Wuzhishan. BM-cov clearly shares a common ancestry with the Wuzhishan breed, which also agrees with the UPGMA tree analysis: BM-cov and Wuzhishan tended to group in one subclade, while Congjiang Xiang, Mingguang small ear, and Tibetan tended to cluster together in another (Figure 4C). BM-cov separated from Wuzhishan when K ≥ 5. As K increased further, the assumed ancestral populations progressively resolved until the seven populations were completely separated.

Discussion
Active retrotransposons move randomly in the genome, resulting in different types of structural variations, such as insertion, deletion, reversion, and recombination [34], and may influence the activity of nearby genes and result in phenotypic variation [35]. Thus, genetic markers based on RIPs have been suggested as an important tool for studies of genetic diversity and evolution, for QTL mapping, and even for molecular breeding in plants [36][37][38][39]. RIPs have also been developed and efficiently applied for genetic analysis in animals, such as ERV RIPs in sheep [40], deer [41], chicken [38], and mice [42], and for disease analysis in humans [43,44]. In pigs, the impact of retrotransposons on lncRNA and protein-coding genes has been systematically evaluated: over 80% of genes contained retrotransposon insertions, and about half of protein-coding genes (44.30%) and one-fourth (24.13%) of lncRNA genes contained the youngest retrotransposon insertions [22], which are putative polymorphic insertions and may contribute to genetic and phenotypic variation across breeds. Two cases of phenotypic variation associated with L1 RIPs were reported in pigs previously [45,46], and a recent study identified eight L1 RIPs in pigs, one of which was significantly associated with economic traits [47]. Three SINE RIPs were reported in the pig Vertnin gene, and one of them was suggested as a putatively causative mutation of vertebral number variation [48,49]. One SINE insertion in the first intron of the PDIA4 gene was associated with litter size in pigs [50]. These data suggest that genetic and phenotypic variations caused by RIPs are common in pigs and that they may play roles in population differentiation and breed formation. 
In the present study, 36 SINE RIPs, predicted using the recently developed protocol (unpublished data), were used to evaluate the genetic diversity and population structure of seven miniature pig populations; 83% of them (30/36) were confirmed to be polymorphic by PCR, indicating that the established SINE RIP screening protocol is highly reliable. Furthermore, high-quality bands were obtained when the PCR products were designed to be 500-700 bp in size, and the validity of the SINE RIP markers was also well supported by the genetic parameter estimates.
Genetic markers based on microsatellites are widely used for the analysis of the genetic diversity of Chinese miniature pigs, and He and PIC are regarded as important parameters of genetic diversity [51]. Botstein [52] proposed, based on microsatellite markers, that loci with PIC >0.5 are highly informative, loci with a PIC value between 0.25 and 0.5 are moderately informative, and loci with PIC <0.25 are of low informative value. Wang et al. used 32 microsatellite markers to analyze the genetic diversity of miniature pigs [53] and found that the PIC values of Bama, Guizhou Xiang, and Tibetan pigs were 0.5469, 0.7296, and 0.7663, respectively; Min et al. [54] and Yao et al. [55] used microsatellites to evaluate the genetic diversity of Wuzhishan pigs and found mean PIC values of 0.7069 and 0.84, respectively. In another report, the PICs of Tibetan, Xiang, Wuzhishan, and Diannan small-ear pigs were estimated as 0.696, 0.552, 0.653, and 0.585, respectively [56]. These data suggest that microsatellite-based genetic markers are highly informative in most miniature pig populations and that these breeds display high genetic diversity. However, the PIC values of the seven miniature pig populations estimated using the SINE RIPs ranged from 0.0263 to 0.2708, substantially lower than the PICs estimated from microsatellite markers. This is because SINE RIP markers are biallelic, while microsatellite markers are multiallelic; the PIC of a biallelic locus cannot exceed 0.375, the value reached when both alleles have a frequency of 0.5. Ho values in Bama (0.21), Wuzhishan (0.25), and Tibetan (0.24) pigs estimated from a 1.4 million SNP chip [57] are generally similar to our estimates for Bama (0.2097 ± 0.1974), Wuzhishan (0.2698 ± 0.1999), and Tibetan (0.2431 ± 0.2021) pigs, which are listed in Table S3. These data again indicate that SINE RIPs are reliable and applicable in genetic analysis, with the advantages of low cost and easy handling and genotyping compared with SNP chips and microsatellite markers.
Based on the SINE RIPs, we also found that most of the investigated miniature pigs (Congjiang Xiang, Mingguang small ear, Wuzhishan, Tibetan, and BM-cov) displayed relatively high genetic diversity compared with Sicilian black and Landrace pigs according to the genetic parameters (He, Ho, PIC), while BM-inb and BM-clo showed low genetic diversity, which generally agrees with the known genetic backgrounds and histories of these two populations. The BM-clo population, kept as a closed population at the Guangxi University farm for over 30 years, was originally set up by importing 14 sows and 2 boars from the place of origin (Bama town) in 1987, while the BM-inb population, derived from the 10th generation of BM-clo pigs, is a highly inbred line resulting from continuous inbreeding (>18 generations). However, the large genetic distances of BM-inb and BM-clo from the BM-cov pigs disagreed with their population relationships, since both the BM-inb and BM-clo populations originated from BM-cov. The exact reason is not apparent, but it may be that most of the detected loci (30 SINE RIPs) have become highly homozygous in BM-inb and BM-clo pigs as a result of inbreeding, which led to inaccurate estimates of genetic distances and population structures. The low Fis estimate in BM-inb may have the same cause. It was clear that BM-inb and BM-clo shared a large proportion of ancestry, but they did not show a close genetic relationship with BM-cov. BM-cov was admixed with Wuzhishan in the same clade, and Congjiang Xiang, Tibetan, and Mingguang small ear pigs formed a distinct clade, which is in good agreement with the phylogenetic analysis of 47 Chinese and European domestic breeds and wild boars based on a 1.4 million SNP chip [57]. 
Bama and Wuzhishan display a very close phylogenetic relationship in the same branch, while Congjiang Xiang and other local pig breeds also cluster in the same clade but occupy a phylogenetic position distinct from Bama and Wuzhishan. In addition, the low genetic distances between the Congjiang Xiang, Tibetan, and Mingguang small ear populations indicate a small divergence among these breeds and suggest that they may share common ancestors. The inclusion of Sicilian black and Landrace pigs as outgroups in the PCA, UPGMA tree, and STRUCTURE analyses supported the genetic relationships inferred among the seven Chinese miniature pig populations.
In summary, we identified 30 SINE RIP markers and applied them to determine the genetic diversity, differentiation, and population structure of seven Chinese miniature pig populations. Low genetic diversity and large genetic distance and differentiation of BM-inb and BM-clo from BM-cov and the other miniature pig populations were observed. Our data revealed that the genetic distances, diversity, and breed relationships between these populations generally agree with their evolutionary histories and geographic distributions, and also broadly agree with the population genetic analysis based on the SNP array, indicating that SINE RIPs are reliable and applicable for population genetic analysis in pigs. In addition, our data suggested that more SINE RIPs are required for population genetic analysis of highly inbred populations. Overall, we demonstrated the potential of SINE RIPs in population genetic analysis, suggesting an alternative genetic marker that is simple, reliable, and high-quality. When only a small number of RIP markers are analyzed, the approach has the advantage of not requiring the sophisticated instruments needed for capillary electrophoresis of labeled microsatellites or for reading SNP chips. When a larger number of RIPs are analyzed, a labor-saving approach for genotyping is expected to be developed.