Genetic Characterization of Native Donkey (Equus asinus) Populations of Turkey Using Microsatellite Markers

Simple Summary This study was conducted to evaluate the genetic variability of Turkish native donkey (Equus asinus) populations, using polymorphism of 17 microsatellite markers. The results revealed a highly mixed genotype of all the examined donkeys, suggesting that two different group of breeds can be distinguished from each other on the basis of microsatellite markers. Abstract This study presents the first insights to the genetic diversity and structure of the Turkish donkey populations. The primary objectives were to detect the main structural features of Turkish donkeys by microsatellite markers. A panel of 17 microsatellite markers was applied for genotyping 314 donkeys from 16 locations of Turkey. One hundred and forty-two alleles were identified and the number of alleles per locus ranged from 4 to 12. The highest number of alleles was observed in AHT05 (12) and the lowest in ASB02 and HTG06 (4), while ASB17 was monomorphic. The mean HO in the Turkish donkey was estimated to be 0.677, while mean HE was 0.675. The polymorphic information content (PIC) was calculated for each locus and ranged from 0.36 (locus ASB02) to 0.98 (locus AHT05), which has the highest number of alleles per locus in the present study. The average PIC in our populations was 0.696. The average coefficient of gene differentiation (GST) over the 17 loci was 0.020 ± 0.037 (p < 0.01). The GST values for single loci ranged from −0.004 for LEX54 to 0.162 for COR082. Nei’s gene diversity index (Ht) for loci ranged from 0.445 (ASB02) to 0.890 (AHT05), with an average of 0.696. A Bayesian clustering method, the Structure software, was used for clustering algorithms of multi-locus genotypes to identify the population structure and the pattern of admixture within the populations. When the number of ancestral populations varied from K = 1 to 20, the largest change in the log of the likelihood function (ΔK) was when K = 2. The results for K = 2 indicate a clear separation between Clade I (KIR, CAT, KAR, MAR, SAN) and Clade II (MAL, MER, TOK, KAS, KUT, KON, ISP, ANT, MUG, AYD and KAH) populations.


Introduction
Turkey is one of the important bio-geographical countries which encompass three important biodiversity hotspots such as the Caucasus, Iran-Anatolian, and Mediterranean basin [1]. Also Turkey has quite a wide range of biodiversity on account of geomorphologic, topographic features, the variety of climate and geographic conditions, either flora and fauna. This biodiversity has included a high amount of endemic and rare species [2].
Numerous factors such as technological improvement, population explosion, industrialization are responsible for the disturbance of natural balance permanently. For this reason, in order to conserve the plant and animal biodiversity, especially against the threat of endanger or disappearance, protection and conservation strategies have to be formed. The morphological and genetic identification of the species for the establishment of these strategies become crucial issues for the conservation of biodiversity. Without these protective strategies, significant changes are likely to occur in the ecosystem balance and biodiversity level.
Turkey is an agricultural country where plants and animals have been raised since ancient times. For this reason, it is considered that many local animal breeds were first bred here and spread to other geographical regions of the world. Turkey has rich animal genetic resources and donkeys are one of these resources, but yet so little information is verified for the donkey breeds of Turkey. Donkey breeds that helped the breeders under severe natural conditions, and have assisted in transportation for centuries, and are still used in some rural areas in Turkey, are under threat of extinction due to the impact of industrialization [3][4][5]. In the last years, donkey populations have declined dramatically in Turkey. For these reasons, the conservation strategies have to be explored and breeds must be identified before the loss of genetic resources.
To date, little information has been found about the donkey breeds of Turkey; even some of the donkey breeds of Turkey had been extinct without clear identification. According to the Food and Agriculture Organization of the United Nations (FAO), Domestic Animal Diversity Information System (DAD-IS), Turkey has three native donkey breeds: Anatolian, Merzifon and Karakaçan donkey breeds. But on the other hand, in other unpublished records, Mardin White donkey (close to Mardin province), Toros donkey (close to Toros Mountains, Antalya province), Urfa Rahvan donkey (close toŞanlıurfa province) and Kars Yorga donkey (close to Kars Province that is located in extremely north east part of Turkey) breeds are also found in Turkey [6].
So the aim of this study has threefold: (i) to provide a detailed sampling across the entire country for molecular characterization to check if the above mentioned breeds are still found in Turkey or not, (ii) to analyze the genetic diversity of donkeys raised in Turkey using a set of 17 microsatellite markers, (iii) and to determine the genetic relationship and characterize geographical and genetic differentiation between different donkey breeds at different sites in Turkey. This research is the first application of molecular markers to characterize the donkey breeds raised in Turkey.

Sampling and DNA Isolation
According to the FAO database (Domestic Animal Diversity Information System, accessed Sep 15, 2019); the estimated number of donkeys in Turkey is about 150,000. In this study, a total of 314 blood samples from different individuals were collected from 16 different locations ( Figure 1). These locations were selected to represent the expected native Turkish donkey breeds: Anatolian, Merzifon, Karakaçan etc. The locations and the potential distribution of these breeds with the number of sample sizes are given in Table 1.  Table 1.
Blood samples were taken to Ethylenediaminetetra-acetic acid (EDTA) (0.5 mM, pH 8.0) coated vacutainer tubes and stored at +4 °C until DNA extraction. Approximately 10 mL of blood per animal was collected and genomic DNA was extracted from whole blood using the standard phenol-chloroform-isoamyl alcohol (25:24:1) extraction method. Extracted DNA was diluted in TE buffer (10:1) and stored +4 °C till analysis.   Table 1. Blood samples were taken to Ethylenediaminetetra-acetic acid (EDTA) (0.5 mM, pH 8.0) coated vacutainer tubes and stored at +4 • C until DNA extraction. Approximately 10 mL of blood per animal was collected and genomic DNA was extracted from whole blood using the standard phenol-chloroform-isoamyl alcohol (25:24:1) extraction method. Extracted DNA was diluted in TE buffer (10:1) and stored +4 • C till analysis.
The cycling conditions included an initial denaturation step at 95 • C for 5 min, 35 cycles of 95 • C for 1 min, annealing temperature between 55-62 • C (see Table 2) for 45-60 s, elongation step at 72 • C for 1 min and a final extension at 72 • C for 5 min. Amplification was carried out using a Veriti™ 96-Well Thermal Cycler or ProFlex PCR System (Applied Biosystems, Foster City, CA, USA).
PCR products were checked in 2% agarose gel stained with SYBR™ Safe DNA Gel Stain. The samples were mixed with formamide and LIZ®500-bp internal size standard (Applied Biosystems™) and detected by capillary electrophoresis using a 3500 XL Genetic Analyzer®(Applied Biosystems™) sequencer. Allele sizes were determined with the GeneMapper®Software V4.0 (Applied Biosystems™). * ASB17, LEX73 and COR022 were excluded from the study.

Statistical Analysis
Genetic diversity parameters were estimated for each microsatellite locus and across all loci for each population by total number of alleles, the mean number of alleles (Na), effective number of alleles (Ne), polymorphic information content for each locus (PIC), observed heterozygosity (H O ), expected heterozygosity (H E ), private alleles (N P ) Hardy-Weinberg equilibrium and null allele frequencies using Genetix v4.05 [21], FSTAT v2.9.4 [22], POPGENE Version 1.31 [23] and GenAlEx Version 6.5 [24]. Wright's F statistics (F ST , F IS and F IT ) as proposed by Weir and Cockerham [25] were computed using Genetix®software. Nei's gene diversity (H T ), diversity between breeds (D ST ) and coefficient of gene differentiation (G ST ) values were calculated with FSTAT v2.9.4 [22]. Exact tests for deviation from the Hardy-Weinberg (HW) equilibrium and partitioning of genetic diversity using analysis of molecular variance (AMOVA) were performed using the ARLEQUIN v. 3.5.2.2 [26]. Pairwise genetic distances (Reynold's genetic distance) and Nei's [27] unbiased D A genetic distances were calculated using the Populations v 1.2.30 software. Neighbour-net dendrogram constructed from Reynold's genetic distances by using SplitsTree v4. 16.0 [28].
The genetic structure of the populations was investigated using STRUCTURE 2.3.4, (Oxford, UK) [29][30][31]. Analysis was performed with a burn of 500,000 in length, followed by 500,000 Markov chain Monte Carlo iterations for each from K= 1-18, with 20 replicate runs for each K, using independent allele frequencies and an admixture model. Evanno's method [32] was used to identify the appropriate number of clusters using ∆K, based on the rate of change in the log probability of the data. The optimal K values were selected by means of STRUCTURE HARVESTER [33]. This software, a web-based program, was used for collating the results generated by the program STRUCTURE. The clustering pattern was implemented in the CLUMPP program and visualized using the software DISTRUCT software version 1.1 [34].
In this study, the mean number of alleles (N a ), the number of effective alleles (N e ), the number of private alleles (N p ), expected (H E ) and observed (H O ) heterozygosity, F IS, and PIC values for each population were given in Table 3. Among 17 tested loci, the number of alleles varied from 4 (ASB02 and HTG06) to 12 (AHT5), while the total number of observed alleles was 142 with an average of 8.235 per locus in 314 examined donkeys. The mean number of alleles observed in populations differed slightly: the minimum 4.529 was observed in AYD (AER) and the maximum 6.706 was in KIR (MAR). The observed (H O ) and unbiased expected (H E ) heterozygosities per location ranged from 0.6266 (KIR) to 0.7139 (AYD) and 0.6294 (ISP) to 0.6983 (ANT), respectively. The mean H O in the Turkish donkey was estimated to be 0.677, while mean H E was 0.675 (Table 3). The mean number of alleles N a was the highest in KIR (6.706) and lowest in AYD donkeys (4.529). F IS value within the populations varied between −0.0557 in KAS and 0.0923 in KIR population, wheras F IS was statistically significant only for KIR breeds due to the deficiency of heterozygosity. A total of 13 private alleles were identified in the present work, and most of the private alleles (ten) were at low frequencies of below 5%. Three alleles unique to KAH (0.115), MAL (0.050) and AYD (0.083) showed a frequency that exceeded 5% ( Table 3).
The characteristics of the analyzed loci along with the genetic variability statistics were summarized in Table 4. The total number of alleles per locus ranged from 4 (ASB02, HTG06) to 12 (ATH005), while the mean number of alleles per locus varied between 3.938 and 9.625 for the same loci, with a mean number of alleles per locus of 5.714. The number of effective alleles per locus (Ne) indicates the genetic diversity. Ne varied between 2.154 (ASB02) and 6.966 (ATH05), with a mean of 4.40 ± 2.21.
For all the analyzed samples, the Na is higher than Ne, which indicates a relatively high genetic diversity. The polymorphic information content (PIC) was calculated for each locus and ranged from 0.36 (locus ASB02) to 0.98 (locus AHT05), which has the highest number of alleles per locus in the present study. The average PIC in our populations was 0.696. Thirteen microsatellites (ASB23, HMS02, COR058, HMS03, HMS20, COR007, HTG07, COR018, COR071, COR082, HTG06, LEX54 and AHT05), having a PIC value higher than the threshold of (0.5). Additionally, four loci (HMS07, VHL209, ASB02 and HTG10) showed moderate polymorphism (PIC > 0.25) ( Table 4) The presence of null alleles, defined as non-amplifying alleles, due to mutations at PCR priming sites, causes overestimation of both F ST and genetic distance values. The null allele frequencies ranged from 0.000 (VHL209, ASB02) to 0.1894 (HTG06). The null allele frequencies in the studied microsatellite loci were below 20% ( Table 4). The lowest and highest null allele frequencies were 0.000 (VHL209, ASB02) and 0.1894 (HTG06), respectively.  The interbreed genetic distance, or F ST values of pairwise comparisons among the Turkish donkey populations are shown in Figure 2. In some cases, negative values were observed, and these equate to zero F ST values. Our F ST values fall into a small range, 0.00-0.056 (varying from white, and light to dark blue colors in Figure 2).
The neighbour-net phylogeny drawn from Reynold's genetic distances (Figure 3) visualizes the relationships between Turkish donkey populations. Populations that shared close genetic relationships were placed on different branches that originated from the same basal node. It identifies four distinct clusters, which are clearly separated, i.e., (I) from the SAR-EAR-MRM region donkey populations, (II) and (III) from the AER-MRD region donkey populations and (IV) from the MRM, BSR, MDR, CAR and AER region donkey populations in Turkey (Figure 3). The phylogeny of Reynold's distances was similar to that generated using the Nei's D A distances (Supplementary Table S1).
The analysis of molecular variance (AMOVA) is a useful tool to check how the genetic diversity is distributed within and among populations, whose structure is quantified by F ST . We analyzed different possible structures by creating and comparing different population groups. We ran the analysis under two hypotheses: Hypothesis (1)  The Table 5 reports the results for the AMOVA analysis according to two hypotheses. The results highlight that most of the observed variance is due to differences within individuals. The Hypothesis (1) AMOVA (Table 5) analyses results showed that the variation among groups, among populations within groups, among individuals within populations, and within individuals were 1.07%, 0.96%, 1.69% and 96.29%, respectively. Variance components among groups and among individuals within populations were significant (p < 0.001) for all the studied loci (Table 5), demonstrating significant geographical distribution in studied donkey populations. Furthermore, variance component among populations within groups and within individuals were significant (p < 0.05). The Hypothesis (2) AMOVA analyses results showed that the variation among groups, among populations within groups, among individuals within populations, and within individuals were 2.16%, 0.49%, 1.68% and 95.67%, respectively. Variance components among groups, among individuals within populations and within individuals were significant (p < 0.001) for all the studied loci demonstrating significant Reynold's genetic distances distribution in studied donkey populations (Table 5). Furthermore, variance component among populations within groups were significant (p < 0.05).          The populations' structure ( Figure 4) was analyzed using Bayesian clustering analysis to determine the number of clusters (K) present in the populations, permitting the identification of differences among populations and hidden substructures within them. The genetic structure of each population was determined based on admixture level for each donkey individual using correlated allele frequencies model implemented within the STRUCTURE software. When the number of ancestral populations varied from K = 1 to 20, the largest change in the log of the likelihood function (∆K) was when K = 2 ( Figure 4). The results for K = 2 ( Figure 4)

Discussion
In this study, 20 microsatellite markers (AHT05, ASB02, ASB17, ASB23, COR007, COR018,  COR022, COR058, COR071, COR082, HMS02, HMS03, HMS07, HMS20, HTG06, HTG07, HTG10, LEX54, LEX73 and VHL209) were used for genetic characterizing the population genetic structure and genetic diversity of 16 donkey population that cover the 7 geographical regions in Turkey. The ASB17, COR022 and LEX73 loci were excluded from the statistical analysis. The COR022 and LEX73 loci failed to amplify in all the samples and did not provide data for reliable analysis while the ASB17 marker was monomorphic (91 bp) for all examined animals. Similar results were reported in the Balkan donkey populations [35]. This study is the first systematic large-scale study of different geographical regions populations from Turkey by using the microsatellites.
The 17 microsatellite analysis revealed relatively high level of genetic diversity between individuals, but no high significant differences in the main genetic characteristics of the geographical region's population (MRM, BSR, AER, CAR, MDR, EAR, SAR): number of alleles (N a ), effective number of alleles (N e ), and expected and observed heterozygosities. These results are comparable to the previous values reported in Balkan donkeys [35], Spanish donkeys [36] and Catalonian donkeys [37]. In terms of mean number of alleles, the genetic variability observed in the 16 Turkish donkey populations was lower than that reported in Catalonian donkey breed [37], five Spanish breeds [36], three Croatian donkey populations [38], 15 indigenous Chinese donkey breeds [39][40][41][42], Balkan donkeys [35] and Tunisian donkeys [43], but higher than that observed in the seven indigenous Italian donkey breeds [44][45][46][47].
The results of microsatellite polymorphism revealed relatively high degree of heterozygosity in the Turkish donkey populations investigated in this study. Among Turkish donkey populations, the H E ranged from 0.6294 (ISP) to 0.6983 (ANT), which showed a comparable level to the previous values reported in Spanish donkeys [36], Catalonian donkey breeds [37], Croatian coast donkey populations [48], Balkan donkey breeds [35], Tunisian donkeys [43], Chinese donkey breeds [39][40][41][42] and was more diversified than Italian [44][45][46][47] and American donkeys [49]. This finding indicates that there are appreciable differences in the level of genetic variability among 16 Turkish donkey populations.
F IS index indicates the excess of homozygosity in a subpopulation and, with reference to molecular markers, informs if a pattern of reduction in diversity owing to several causes exists. In this study, F IS ranged from a minimum of −0.0557 in KAS to a maximum of 0.0923 in KIR population, whereas F IS was statistically significant only for KIR breeds due to the deficiency of heterozygosity. The highly significant (p < 0.001) F IS value (0.0923) revealed a rather high inbreeding degree within populations. The heterozygote deficiency found in the KIR population, could be due to the higher rate of inbreeding, to the population subdivision (Wahlund effect), and to the presence of "null alleles" (non-amplifying alleles).
Null alleles, defined as nonamplifiable alleles due to mutations in the PCR binding site, cause only a single allele to peak like a homozygote, thus cause erroneous readings. These alleles cause overestimation of both F ST and genetic distance values. It was reported by Dakin and Avise [50] that null allele frequencies below 0.20 have no significant effect on paternity tests. When the null allele frequencies obtained are examined, it is seen that the null allele frequency values of 17 microsatellites to be studied are below 0.20. The lowest and highest null allele frequencies were 0.000 (VHL209, ASB02) and 0.1894 (HTG06), respectively. Taking this value into consideration, it has been demonstrated that the studied loci can be safely used in paternity tests.
Private alleles are alleles that are found only in a single population among a broader collection of populations. These alleles have proven to be informative for diverse types of population-genetic studies. These private alleles are present in greater numbers in differentiated donkey breeds [36,38,42]. There were 13 private alleles among Turkish donkey breeds and most of the private alleles (ten) were at low frequencies of below 5%. Three alleles unique to KAH (0.115), MAL (0.050) and AYD (0.083) showed a frequency that exceeded 5%. In the present study, these alleles were consistent with those found by other authors [36,38,42], although we observed a higher value in the province of KIR and KAR, which is genetically similar to these two populations than the other provinces. Most of the KIR population (60%), samples are collected from the donkey farm in Kırklareli. In this farm, the individuals were collected from Eastern and South Eastern Anatolian region of Turkey.
PIC is a parameter indicative of the degree of informativeness of a marker. The PIC value may range from 0 to 1. In the studied Turkish donkey population, the average PIC value was 0.696 ranging from 0.3600 (ASB02) to 0.9800 (AHT05). When PIC > 0.5, 0.5 > PIC > 0.25, and PIC < 0.25, it indicates the locus has high polymorphism, moderate polymorphic, and low polymorphism, respectively [51]. Thirteen microsatellites (ASB23, HMS02, COR058, HMS03, HMS20, COR007, HTG07, COR018, COR071, COR082, HTG06, LEX54 and AHT05), having a PIC value higher than the threshold of 0.5 [51,52], seemed to be highly informative and can be used in quantifying the genetic diversity and also in paternity studies in Turkish donkey population. Additionally, four loci (HMS07, VHL209, ASB02 and HTG10) showed moderate polymorphism (PIC > 0. 25). PIC values calculated in the present investigation were comparable with those reported by Aranguren -Mèndez et al. [36] in 5 endangered Spanish donkey breeds (0.20-0.85), Ivankovic et al. [38] in 3 Croatian donkey populations (0.36-0.78), Bordonaro et al. [46] in Pantesco and two Sicilian autochthonous donkey breeds (0.146-0.796), Matassino et al. [47] in two Italian autochthonous donkey breeds (0.1918-0.8522), Zhang et al. [40] in 10 Chinese donkey breeds (0.7218-0.7967), Stanisic et al. [35] in Balkan donkey breeds (0.07-0.84) and by Zeng et al. [42] in Chinese donkey breeds (0.1489-0.8670). The variability in PIC values found in literature may be due to different microsatellite markers used in the studied populations. In the present study, the high PIC values prove that the microsatellite markers used are highly polymorphic and can be well utilized for studying the genetic diversity in Turkish donkey populations.
In this study, the mean value of between-population diversity value (D ST ), coefficient of gene diversity (G ST ) and Nei gene diversity (H T ), were determined as 0.014, 0.020, and 0.696, respectively. The global mean of the genetic diversity value (D ST ) indicated that the low diversity among 16 Turkish donkey populations studied. Nei's gene diversity values (H T ) was considerably lower than Zhang et al. [40] in Chinese donkey breeds, but similar to Aranguren-Mèndez et al. [36] in 5 endangered Spanish donkey breeds. The average G ST value obtained from overall loci pointed out that 2% of total genetic variation resulted from the differences between the populations. In all other respects, it can be said that 98% genetic variation is caused by the difference between individuals. All studied loci showed a not significant deviation from the Hardy-Weinberg Equation. Null allele frequencies were lower than the reported value (20%) by Dakin and Avise [50]. These results indicated that the microsatellite markers studied may be safely used in genetic diversity studies in Turkish donkey populations.
According to all the pairwise differences (Slatkins linearized F ST ) in this study, the distribution of F ST showed low genetic divergence (0.000 < F ST < 0.05) among populations in general. The F ST comparison values obtained were significant in 70 pairwise calculations (p < 0.05; p < 0.01; p < 0.001).  [49], Chinese donkeys [42], but lower than that of donkeys in the Europe [45,46], Near East and northeast Africa [53].
An AMOVA was carried out to investigate the relative contribution of different factors to the observed genetic variability, with each factor considered in a separate analysis, i.e., seven groups according to the geographical prevalence (MAR, BSR, AER, CAR, MDR, EAR and SAR), four groups according to the Reynold's genetic distances distribution (1st: MRM region (KIR, CAT), SAR region (SAN, MA) and EAR region (KAR) populations, 2nd: AER region (KUT) population and MDR region population (ISP); 3rd: AER region population (AYD) and MDR region population (KAH); 4rd group: MRM region population (MAL), BSR region populations (MER, TOK, KAS), AER region population (MUG), CAR region population (KON) and MDR region population (ANT)). AMOVA analysis results indicate that the majority of the observed variance is due to differences among individuals within populations. The most part of the variation is observed within the individuals (96.29% Hypothesis 1 and 95.67% Hypothesis 2) whereas the differences among groups represent only the 1.07% and 2.16% of the variation, respectively. These results are similar to a wide range of studies [38,45,54]. Among groups, among populations within groups, among individuals within populations were also a significant source of variation (p < 0.001, p < 0.05), although substantially smaller than the within individual's component.
In this study the analysis with the STRUCTURE programme revealed that Turkish donkey populations were grouped into two lineages when K = 2 ( Figure 4). Cluster I included SAN, KAR, MAR, KIR and CAT populations and Cluster II gathered MUG, ANT, KON, KAS, MAL, TOK and MER breeds, while other donkey populations (KUT, ISP, AYD and KAH) appeared to be the contact zone between both clusters, as individuals had mixed lineages. The STRUCTURE analysis results support the neighbour-net dendrogram results, as well as F ST and genetic distance results. Our results provide a broad perspective on the extant genetic diversity and population structure of Turkish donkey populations.

Conclusions
Genetic diversity is the main component of the adaptive evolution mechanism because of its preeminent role for the long-term survival probability of all species. In summary, our results suggested the relatively high genetic diversity of 16 Turkish donkey populations and brought an insight in the structure of the analyzed populations. This study is the first attempt towards a comprehensive genetic characterization of Turkish donkey populations. Despite the decreasing population size, the genetic diversity of Turkish donkey population seems still conserved. Nevertheless, further studies should be conducted to deeply evaluate the genetic variability of the Turkish donkey breeds. Furthermore, the results of this study can be utilized for future breeding strategies and conservation.