Genetic Signatures of Selection for Cashmere Traits in Chinese Goats

Simple Summary Cashmere goats are a unique husbandry resource in China. These goats are well known for producing the highest cashmere yield and best fiber quality in the world. Although cashmere is highly valued and also known as “fiber gem” and “soft gold”, few studies have examined the genetic basis of cashmere traits in cashmere goats. Here, we identified selection signals by comparing Fst and XP-EHH (the cross population extend haplotype homozygosity test) of a non-cashmere breed (Huanghuai goat) with those of two cashmere breeds (Inner Mongolia and Liaoning cashmere goats). Two genes (WNT10A and CSN3) were potentially associated with cashmere traits. This information may be valuable for studying the genetic uniqueness of cashmere goats and elucidating the mechanisms underlying cashmere traits in cashmere goats. Abstract Inner Mongolia and Liaoning cashmere goats in China are well-known for their cashmere quality and yield. Thus, they are great models for identifying genomic regions associated with cashmere traits. Herein, 53 Inner Mongolia cashmere goats, Liaoning cashmere goats and Huanghuai goats were genotyped, and 53,347 single-nucleotide polymorphisms (SNPs) were produced using the Illumina Caprine 50K SNP chip. Additionally, we identified some positively selected SNPs by analyzing Fst and XP-EHH. The top 5% of SNPs had selection signatures. After gene annotation, 222 and 173 candidate genes were identified in Inner Mongolia and Liaoning cashmere goats, respectively. Several genes were related to hair follicle development, such as TRPS1, WDR74, LRRC14, SPTLC3, IGF1R, PADI2, FOXP1, WNT10A and CSN3. Gene enrichment analysis of these cashmere trait-associated genes related 67 enriched signaling pathways that mainly participate in hair follicle development and stem cell pluripotency regulation. Furthermore, we identified 20 overlapping genes that were selected in both cashmere goat breeds. Among these overlapping genes, WNT10A and CSN3, which are associated with hair follicle development, are potentially involved in cashmere production. These findings may improve molecular breeding of cashmere goats in the future.

Cashmere goats are only found in specific areas of Asia [7]. In China, there are more than 20 native Cashmere goat breeds that yield 75% of the Cashmere produced worldwide [8]. Among these breeds, Inner Mongolia and Liaoning Cashmere goats are well known for the thin fiber diameter and high yield of their Cashmere [9]. Therefore, researchers have attempted to identify the underlying genetic mechanisms to produce higher Cashmere quality and yield.
With the development of high-throughput technology, association studies between genetic diversity and phenotypes can be carried out at the genome level. Selection of a trait could be traced by detecting selection signatures in a genome, and the relevant genes can be identified by selection signal analysis [10]. Recently, several preliminary studies have attempted to reveal the natural and artificial selection footprint in Cashmere goat breeds. Su et al. evaluated the primary and secondary hair follicle transcriptome of the Inner Mongolia Cashmere goats. Several genes have been identified that may regulate the Cashmere growth phases, including eight Interleukin-17 receptor B and Zinc-finger protein [4]. Wang et al. used medium-coverage (9-13x) sequences from eight domesticated goat breeds to identify genomic regions. Three genes related to Cashmere trait (LHX2, FGF9 and WNT2) were identified [11]. Guan et al. used the whole genome sequence from six Cashmere goat breeds and six non-Cashmere goat breeds to identify candidate genes can improving fiber traits (e.g., FGF5) [12]. Li et al. performed selective signal analysis on the resequencing data of 70 Cashmere goats and 14 non-Cashmere goats, and identified some genes that are potentially involved in Cashmere fiber production (FGF5, SGK3, IGFBP7, OXTR and ROCK1) [13]. Despite these useful findings, research on genetic mechanisms underlying Cashmere production is still insufficient. Therefore, it is important to study the biological characteristics of any genes that may be involved in regulating Cashmere growth [14].
Here, we compared Inner Mongolia and Liaoning Cashmere goats with the Huanghuai goats. Based on Fst and Cross Population Extend Haplotype Homozygosity Test (XP-EHH) analyses, relevant genes correlated with Cashmere traits were identified based on selection signals. Fst, which is based on allele frequency difference, can identify candidate genes between populations, and it is widely used because of its simplicity [15]. XP-EHH detect candidate regions based on haplotypes among populations, and is implemented along with Fst to eliminate false positives and improve signal analysis accuracy [16]. Therefore, we used Fst and XP-EHH to detect genes and merged the genes to identify the strongest selection signature by Fst and XP-EHH. These finding will provide a further theoretical basis for research on the formation mechanism of Cashmere and protection of local goat resources. Additionally, some important genes related to Cashmere traits were identified that can be used to enrich the gene pool of candidate Cashmere goats.

Biological Sample Collection and Genotyping
All experimental procedures involving goats were approved by the Science Research Department (in charge of animal welfare issue) of the Institute of Animal Sciences, Chinese Academy of Agricultural Sciences (IAS-CAAS) (Beijing, China). Ethical approval on animal survival was given by the animal ethics committee of IAS-CAAS (No. IASCAAS-AE-03, 12 December 2016). In this study, 53 individuals of three goat breeds were collected from conservation farms or core areas of origin and including 17 Inner Mongolia Cashmere goats (MGR), 20 Liaoning Cashmere goats (LNR) and 16 Huanghuai goats (HHG) ( Table 1). The Cashmere goat breed probably originated from the Capra falconeri [17]. Moreover, three other representative goat breeds were included to assess goat breed relationships; we included 15 Daiyun goats (DYG), 16 Chengdu Brown goats (CDM) and 15 Longlin goats (LLG). All specimens were arbitrarily selected females, and all specimens were healthy and did not have any gastrointestinal disease. Genomic DNA samples were obtained by blood collection from the jugular vein, and the TIAN amp Blood DNA Kit (Tian gen Biotech Co. Ltd., Beijing, China) was used to extract genomic DNA. A Nanodrop 2000 nucleic acid protein analyzer (Thermo, Scientific, Wilmington, NC, USA) was used to measure the genomic DNA purity and concentration. After removing the low-quality DNA samples, standard quality samples were genotyped using the Illumina Caprine SNP 50K Chip, which contains 53,347 single-nucleotide polymorphisms (SNPs).

Quality Control
To increase the data processing accuracy, stringent quality control criteria were applied. SNPs were removed if they could not pass the following four criteria: (i) Hardy_Weinberg equilibrium p < 10 −6 ; (ii) SNP call rate > 90%, (iii) SNP minimum allele frequency (MAF) > 0.01 and (iv) presence of mapped autosomal loci.

Selection Signals Analysis
To detect the potential mechanisms underlying Cashmere traits, we searched for autosomal SNPs that showed evidence of specific selection by comparing Fst and XP-EHH values between the Cashmere and non-Cashmere goats detecting selection signals among genome-wide SNP genotypes [24,25]. We compared the reference population (HHG) with two Cashmere goat breeds (LNR and MGR) to obtain Fst and XP-EHH values. Two approaches were used to identify selection signatures in the domestic goat genomes. First, the genomic regions with strongly differing or differentially fixed variants in alleles frequency between different breeds were identified by the population differentiation index (Fst), which is the conventional measure of population genetic differentiation. We calculated Fst value for each SNP as Fst = MSP−MSG MSP+(n c −1)MSG , where MSP is the mean square error within the populations, MSG represents the mean square error between the two populations, and n c represents the average sample size of the entire population after correction [26,27].
Furthermore, we also computed the XP-EHH values using haplotype information in the XP-EHH program [28] to determine the positive selection in MGR and LNR. Haplotypes were estimated with fastphase 1.4 [29]. We used population label information to estimate phased haplotype background and the following options for each chromosome: -Ku40 -Kl10 -Ki10. Specifically, XP-EHH values of individual SNPs were first calculated based on where n a and n A are the numbers of alleles a and A haplotypes, respectively; n i is the number of the ith haplotype in a population, and hx represents the number of different haplotypes in a genomic region at locus X. The integral domain D (the cutoff value of the X integral) was selected, so that the EHH values of both populations were reduced to sufficiently small values [30].

Gene Annotation and Gene Enrichment Analysis
We defined candidate selection SNPs as those that fell into the upper 95th percentile of the empirical distribution with extremely high Fst and XP-EHH values (top 5% level). Based on the goat reference genome annotation, the 50-kb upstream and downstream regions of significant SNP loci were operationally defined as candidate regions under selection. Genes were identified by these different methods and annotated using the goat genome information database. Gene function was determined using the National Center for Biotechnology Information database (https://www.ncbi.nlm.nih.gov/) and by literature search. We selected some significant genes associated with Cashmere trait submitted to the DAVID database (http://david.abcc.ncifcrf.gov/) for Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis [31]. The selected background organism was Capra hircus. Significantly enriched pathway was indicated by Q-value < 0.05.

Genetic Variation and Population Genetic Analysis
After data quality control, 45,575 SNP markers were passed through filters and quality control. The individuals were representative of six native goat breeds in China: MGR, LNR, HHG, DYG, CDM and LLG (north to south) ( Figure 1a). According to the N-J tree, the phylogenetic relationship of the six breeds revealed genetically distinct clusters, and LNR, HHG and MGR were located on the same distinct branch (Figure 1b). This is consistent with a report that MGR and LNR came from a recent common origin [32]. This result was also similar to the finding of another previous study [33]. The genetic relationships among the six Chinese native goat breeds revealed by PCA are shown in Figure 1c. The first dimension (PC1) clearly separated MGR and LNR from the others breeds. The second dimension (PC2) separated MGR, LNR, HHG and DYG from the other two breeds. PC1 and PC2 showed that MGR, LNR and HHG clusters grouped closest to each other (Figure 1c). The PCA results showed a similar pattern as the N-J tree plot. For ADMIXTURE analysis, which was conducted for ancestry estimation [22], when K = 3, there was a division between MGR, LNR, HHG and other breeds (Figure 1d).

Detecting Positive Selection Genes
Fst revealed a high coefficient of genetic differentiation, and has been widely used to identify selection signals among whole-genome SNPs [28,34]. We calculated the Fst for each SNP and ultimately annotated 1907 (Fst > 0.2878) and 1821 genes (Fst > 0.3248) in MGR and LNR, respectively. XP-EHH was also calculated for each SNP [24,28]; the top 5% of the XP-EHH values, which were consider candidate selection SNPs, included 1327 and 1255 genes in MGR and LNR, respectively. We merged the gene lists generated by these two approaches and identified 222 (MGR-HHG) and 173 (LNR-HHG) specific genes that showed the strongest selection signatures (Figure 2a). Finally, there were also 20 genes that overlapped between the two Cashmere breeds (Figure 2b).

Detecting Positive Selection Genes
Fst revealed a high coefficient of genetic differentiation, and has been widely used to identify selection signals among whole-genome SNPs [28,34]. We calculated the Fst for each SNP and ultimately annotated 1907 (Fst > 0.2878) and 1821 genes (Fst > 0.3248) in MGR and LNR, respectively. XP-EHH was also calculated for each SNP [24,28]; the top 5% of the XP-EHH values, which were consider candidate selection SNPs, included 1327 and 1255 genes in MGR and LNR, respectively. We merged the gene lists generated by these two approaches and identified 222 (MGR-HHG) and 173 (LNR-HHG) specific genes that showed the strongest selection signatures (Figure 2a). Finally, there were also 20 genes that overlapped between the two Cashmere breeds (Figure 2b).

KEGG Pathway Enrichment Analysis of Important Genes in Cashmere Goat
In our study, Cashmere trait-related genes were selected for KEGG enrichment analysis and mainly showed clustering within follicle-related pathways (Q-value < 0.05). The KEGG pathway analysis revealed that the genes associated with the significantly correlated SNPs were highly enriched in 67 pathways (Q-value < 0.05, Supplementary Table S1). Further analysis indicated that these pathways included the signaling pathways that regulate stem cells pluripotency, basal cell carcinoma, TGF-beta signaling, melanogenesis, Wnt signaling, TNF signaling and PI3K-Akt signaling (Supplement Figure S1).

Discussion
Detection of selection signals associated with economic traits has been successfully implemented in several livestock species, such as cattle [35], pigs [36], sheep [37] and horses [38]. However, many approaches have been proposed to identify selection signals in genomes, such as Fst, XP-EHH and iHS. In this study, the SNPs identified by high Fst and XP-EHH values produced a subset of 18 SNPs, which represented 20 genes (Supplementary Table S2). The genes that overlapped between the two approaches were putatively defined as candidate genes with strong selection signatures. Of these genes, WNT10A and CSN3 have plausible biological functions associated with Cashmere traits. WNT10A is a member of the WNT gene family [39][40][41][42]. WNT paracrine signaling molecules belong to a family of conserved glycoproteins with at least 19 members in humans and mice [42]. Additionally, the WNT pathway is considered the master regulator during hair follicle morphogenesis [43]. WNT signaling proceeds through EDA/EDAR/NF-kappa B signaling. NF-kappa B regulates the WNT pathway and acts as a signal mediator by upregulating Shh ligand expression [44,45]. Moreover, WNT10A plays an important role in hair follicle development [46]. Thomas et al. [47] demonstrated that a primary WNT (WNT3) and secondary WNTs (WNT0A and WNT10B) are

KEGG Pathway Enrichment Analysis of Important Genes in Cashmere Goat
In our study, Cashmere trait-related genes were selected for KEGG enrichment analysis and mainly showed clustering within follicle-related pathways (Q-value < 0.05). The KEGG pathway analysis revealed that the genes associated with the significantly correlated SNPs were highly enriched in 67 pathways (Q-value < 0.05, Supplementary Table S1). Further analysis indicated that these pathways included the signaling pathways that regulate stem cells pluripotency, basal cell carcinoma, TGF-beta signaling, melanogenesis, Wnt signaling, TNF signaling and PI3K-Akt signaling (Supplement Figure S1).

Discussion
Detection of selection signals associated with economic traits has been successfully implemented in several livestock species, such as cattle [35], pigs [36], sheep [37] and horses [38]. However, many approaches have been proposed to identify selection signals in genomes, such as Fst, XP-EHH and iHS. In this study, the SNPs identified by high Fst and XP-EHH values produced a subset of 18 SNPs, which represented 20 genes (Supplementary Table S2). The genes that overlapped between the two approaches were putatively defined as candidate genes with strong selection signatures. Of these genes, WNT10A and CSN3 have plausible biological functions associated with Cashmere traits. WNT10A is a member of the WNT gene family [39][40][41][42]. WNT paracrine signaling molecules belong to a family of conserved glycoproteins with at least 19 members in humans and mice [42]. Additionally, the WNT pathway is considered the master regulator during hair follicle morphogenesis [43]. WNT signaling proceeds through EDA/EDAR/NF-kappa B signaling. NF-kappa B regulates the WNT pathway and acts as a signal mediator by upregulating Shh ligand expression [44,45]. Moreover, WNT10A plays an important role in hair follicle development [46]. Thomas et al. [47] demonstrated that a primary WNT (WNT3) and secondary WNTs (WNT0A and WNT10B) are essential for hair follicle initiation, morphogenesis and development. CSN3(casein kappa) represents one of the most important proteins that determine the manufacturing properties of milk [48]. However, CSN3 has other plausible biological functions related to Cashmere traits, and may be involved in regular hair follicle development [49,50]. STAT5 activation acts as a mesenchyme switch to trigger natural anagen entry in post-developmental hair follicle cycling. STAT5 deletion results in significantly upregulated expression of genes, such as CSN3, Dkk3 and Dlk1, which inhibit WNT and Notch signaling. Alternatively, WNT6, FGF7 and FGF10, known regulators of anagen induction/progression, were all significantly downregulated [49,50]. Therefore, we inferred that CSN3 also has an important role in hair follicle development, although the precise function of CSN3 relative to Cashmere traits remains unknown and should be verified in future studies.
In mammals, coat hair protects against environmental changes [11]. Mohair is produced in the primary and secondary hair follicles [51]. However, the coat hair of Cashmere goats differs from mohair, because Cashmere is produced by the secondary hair follicle and coarse hair is produced by the primary hair follicle. Additionally, Cashmere yield is the main economic trait of Cashmere goats. There is now clear evidence that associated gastrointestinal parasitic status of goats with characteristics of their hair, like Diarrhoea, which is a major impediment to profitable hair goat production in many countries as it predisposes animals to blowfly strike and contaminates wool [52]. Haemonchus contortus is a bloodsucking parasite that causes an extreme stress on animals that are already protein deficient owing to the requirements for hair growth, especially in hair goat. Ectoparasites also can inflict serious damage in the hair and pelts of goats, such as Bovicola (Damalinia) caprae, the biting louse of goats, which is the most serious external parasite in hair goats. Economic losses are realized by hair damage occurring when goats bite or rub themselves to relieve itching [53]. In addition, Cashmere is also influenced by hair follicle depth and density, which have high heritability [54]. Genetically, the size and number of cells in the Cashmere tissue also determines the differences in Cashmere growth. Cashmere production by goat hair follicles is related to epidermal hair cells change, which indicates that this periodic change is related to the change in the epidermis, but not dermal thickness [55]. Candidate genes that were correlated with the hair follicle were identified in this study. PADI2 is predominantly in skeletal muscle and belongs to PAD, which is responsible for the formation of protein-bound citrulline, a major amino acid in the inner root sheath and hair follicle medulla [56]. FOXP1 is crucial for maintaining quiescence of hair follicle stem cells. FOXP1 loss in skin epithelial cells leads to precocious stem cell activation, which results in a drastic shortening of the quiescent phase of the hair cycle [57]. In addition, hair growth regulation mainly depends on the cyclic activities of the hair follicle and the successive catagen and telogen phases in the growth cycles [6]. A previous study found that hair growth of Inner Mongolia Cashmere goats was regulated by the hypertrichosis gene TRPS1 [12], which is similar to our study outcomes. Moreover, many hair follicle-related genes were identified in this study. For example, WDR74 and WNT10A were related to hair follicle morphogenesis [43], and LRRC14 regulates hair follicle induction [45] and inhibits NF-kB activation [58]; therefore, these genes may regulate hair follicle induction and are important for hair follicle and epidermal appendage development [59].
In general, the secondary follicle cycle can be divided into three periods: anagen (April-November), regression (December-January the following year), and telogen (from February to March the following year) [60]. In the anagen phase, Cashmere rapidly grows under the control of related genes [6]. The anagen phase can be further divided into early (April_August) and flourishing anagen phases (August_November). Early research on mouse hair follicles showed that the ability of the secondary hair follicle cycle to enter the growth stage was limited to the early anagen phase; once the secondary hair follicle cycle entered the flourishing anagen phase, the ability to return to the early anagen phase was lost [61]. The transition between early and flourishing growth was regulated by the expression of many inhibitors, such as bone morphogenetic proteins (BMPs) [61,62]. Interestingly, in our study, IGF1R was identified in Liaoning Cashmere goats; this gene regulates the anagen phase by activating BMP4, which affects the stem cell bulge and hair follicle cycle [63]. Moreover, pioneering research demonstrated that exogenously supplied FGF18 can prevent the hair follicle stem cells of FOXP1 null mice from being prematurely activated. As FGF18 controls telogen phase length and is a key downstream target of FOXP1, FOXP1 regulates the telogen stem cell state in the hair follicle stem cell niche by controlling FGF18 expression [57]. TRPS1 directly represses expression of the hair follicle stem cell regulator Sox9 to control follicle epithelium proliferation [64], and plays an important role in hair follicle cycling [65].
KEGG pathway analysis revealed that some important pathways were remarkably enriched for Cashmere traits. The hair follicle is a unique mini organ that self-renews throughout its lifetime. Many stem cell populations reside within human hair follicles and enable their regeneration [66]. Stem cell pluripotency plays a crucial role in cell fate decisions by controlling self-renewal and differentiation [44,67]. The Wnt signaling pathway is up-regulated in the telogen phase and peaks in the anagen phase, which is considered a key factor in stimulating hair growth in dermal papilla cells [68]; therefore, this pathway plays an essential role in hair follicle induction [44]. The Wnt signalling pathway has emerged as a potential regulator of hematopoietic stem cell self-renewal [69]. The PI3K/Akt signaling pathway is essential for de novo hair follicle regeneration [70]. Moreover, the role of melatonin in promoting Cashmere yield has been demonstrated for decades [71], and melatonin also participates in stem cell differentiation pathways. Melatonin induces Cashmere growth through melatonin receptors [72] and is a critical intermediary between photoperiod and Cashmere growth [73]. Cashmere goats breed seasonally, and usually give birth between March and April, which conforms to a spring kidding system [72]. Melatonin is administered to Cashmere goats in April, because melatonin treatment in adult Cashmere goats during the Cashmere non-growth period has been found to induce Cashmere growth, increase Cashmere yield, improve Cashmere quality and decrease the fiber diameter [74], which also present similar results from other countries of the world [75]. In Australian Cashmere-bearing feral goats, sustained immunization against melatonin gave rise to two growth cycles, resulting in a mean increase in Cashmere production of 78% over controls in the first year [76].

Conclusions
We conducted a comprehensive study to identify selection signatures in Cashmere and non-Cashmere goats based on Illumina Caprine 50K SNP chip data. Our results showed that a high degree of genetic diversity was present in our goat samples, and many important genes and pathways related to Cashmere traits could be targeted for trait improvement. We found that CSN3 may have a novel selection signal associated with goat Cashmere production. However, no clear main effect genes for Cashmere growth have been found. Our research enriches the candidate gene pool for elucidating the mechanisms underlying Cashmere traits and provides an important basis for protecting and sustainably using Cashmere goats. However, further study is needed to fully understand Cashmere production in Inner Mongolia and Liaoning Cashmere goats.
Supplementary Materials: The following are available online at http://www.mdpi.com/2076-2615/10/10/1905/s1, Figure S1: KEGG analysis for the regional candidate genes with genome-wide significant association, Table S1: KEGG analysis for the regional candidate genes with genome-wide significant association, Table S2: Overlapping selection candidate genes for Cashmere goats by Fst and XP-EHH.

Conflicts of Interest:
The authors declare no conflict of interest.