Genetic Diversity of Kazakhstani Equus caballus (Linnaeus, 1758) Horse Breeds Inferred from Microsatellite Markers

Simple Summary: Traditional horse breeding has long developed several kinds and lineages of horse breeds in Kazakhstan. Among them, Kushum and Mugalzhar are the breeds most prolific and resistant to harsh climatic conditions. Microsatellite analysis is employed to examine present genetic variability and population structure. The subpopulation structure shows three regional groups indicating (I) purebred Kushum populations from western Kazakhstan, (II) the Kozhamberdy type of Mugalzhar interbreed in populations from central Kazakhstan, and (III) admixed Kushum–Mugalzhar populations from western and southwestern Kazakhstan. The majority of microsatellite markers utilized are informative, with seven of them being extremely variable. A high level of genetic admixture among Kazakhstani Equus caballus breeds is found, as well as a shallow level of genetic differentiation between the examined populations.

Abstract: Understanding the genetic diversity and structure of domesticated horse (Equus caballus) populations is critical for long-term herd management and breeding programs. This study examines 435 horses from Kazakhstan, covering seven groups in three geographic areas using 11 STR markers. Identified are 136 alleles, with the mean number of alleles per locus ranging from 9 to 19. VHL20 is the most variable locus across groups, while loci HTG4, AHT4, AHT5, HTG7, and HMS3 are variable in most populations. The locus AHT5 in the Emba population shows the highest frequency of rare alleles, while the lowest frequency, 0.005, is observed in the Kulandy population. All loci were highly informative for the Kazakhstani populations of E. caballus, with PIC values higher than 0.5. Pairwise variations in Wright's FST distances show that the examined varieties have little genetic differentiation (0.05%), indicating a high degree of admixture and a continuing lineage sorting process. Phylogenetic and population structure analyses reveal three major clusters of Kazakh horses, representing (I) the Uralsk population of the Kushum breed and the monophyly of two groups: (II) the Kozhamberdy population of the Mugalzhar breed, and (III) the Mugalzhar–Kushum breed populations. Kazakhstani horse populations, while being regionally isolated, were recently in contact with each other.


Introduction
The horse was the last domesticated animal to have a long-term, significant, and enduring impact on human civilizations, leading to advancements in transportation and trade, influencing people's lifestyles, and revolutionizing the nature of warfare. In turn, human selection changed genetic diversity in horse populations, resulting in the variation seen among current horse characteristics and types. Domestic horses were subject to constant genetic restocking from the wild, mostly from females [1][2][3][4].

Vet. Sci. 2023, 10, 598
Kazakhstan, the largest Central Asian country, combines several climatic and geomorphologic features that offer an optimal environment for agricultural growth due to its geographical location at the intersection of the western and eastern Palearctic. The FAO database (Domestic Animal Diversity Information System, accessed on 5 December 2022) identifies Kazakhstan as having eight registered domesticated horse breeds (Kushum, Mugalzhar, Kostanay, Akhal-Teke, Aday, Kazakh, Jabe, and Karabair) [5]. Among them, the Kushum breed was formed via purebred crossbreeding, whereas the Mugalzhar breed was established by traditional crossing methods; both breeds are dedicated to meat and milk production. The dairy products fermented mare's milk ("Kymyz") and fresh milk ("Saumal") are national drinks of Kazakhstan. They are rich in minerals and vitamins, and benefit human health with anticancer and anti-inflammatory effects [6]. The overall number of horses in Kazakhstan now surpasses 3 million, with 90% of them raised in herds in practically all areas of the country [7].
The Kushum horse was bred by the herd method on Ural and Aktobe farms and is named after the Kushum River, which flows across western Kazakhstan. Initial approval as a separate breed came in 1976. The breed was established by a complex reproductive crossing procedure: Trotting, Thoroughbred riding, and Don breeds were crossed with local native mares [8]. The farmers chose the best animals in terms of growth and maintained them on semi-desert cereal-wormwood pastures throughout the year. As a result, Kushum horses are ideally suited for herding. In the middle of the twentieth century, crosses and rigorous selection of stallions were carried out for desired external features and adaptation to local environments. Later, the crossed horses were bred "in themselves" to reinforce breeding results [9].
Mugalzhar horses are an upgraded dual-purpose form of the Jabe type of the Kazakh horse breed, developed by Kazakhstani scientists between 1969 and 1998 [10]. They are distinguished by high-quality meat and milk production, as well as resilience to extreme weather conditions and year-round grassland farming [11].
Genetic distinctiveness can be clarified by population genetic structure, which supports breed preservation, including future breeding methods and management plans. Microsatellites are a widely utilized genetic marker and have been effectively applied to studies of inter- and intrabreed variability in domestic and feral horse populations [25][26][27][28][29][30][31][32].
Here we determine the genetic diversity of the Kushum and Mugalzhar horse breeds within and between populations. Several approaches are applied to evaluate the distribution of molecular indices, genetic distances, and population structural variation. These can elucidate lineage sorting processes, help to organize individual genetic feature panels of a type or line, and improve the inbreeding management strategy of Kazakh horse-breed populations.

Sampling
A total of 435 blood samples for DNA isolation were taken from 7 populations across Kazakhstan, representing the Mugalzhar and Kushum horse breeds (Table 1; Figure 1).
DNA extractions were conducted using the "DNA-sorb-B" set (AmpliSens, Moscow, Russia). Biomaterial was processed by the solid-phase sorption method, which consists of adding a lysing solution, sorption of DNA on a sorbent, repeated washing, and elution of DNA with a buffer solution, yielding a purified DNA solution. Spectrometric quantification was then performed, and amplification reactions were carried out using the StockMarks Equine Kit (Applied Biosystems, Waltham, MA, USA) [34]. PCR amplifications were performed on a Thermocycler 2730 (Applied Biosystems) following a touchdown cycling protocol with an initial denaturation at 95 °C for 15 min, followed by 30 cycles: the first 4 cycles, 58 °C (30 s), 59 °C (120 s), 72 °C (75 s); the next 6 cycles, 94 °C (30 s), 59 °C (120 s), 72 °C (75 s); [...] °C hold temperature. Amplification products were separated by capillary electrophoresis on an AB 3130 automatic genetic analyzer (Applied Biosystems) and analyzed using the GeneMapper™ v. 4.0 program. Amplified DNA fragments were interpreted using a control DNA profile with a known genotype and data from international comparative tests (Horse Comparison Tests) conducted by ISAG.

Population Genetic Structure
Allele frequencies and polymorphic information content (PIC) were calculated using Cervus 3.0 software [35,36]. Genetic diversity within and between breeds, as well as basic parameters, including total number of allele variants (NA), effective number of alleles (NE), observed (HO), expected (HE), and unbiased expected (UHE) heterozygosity, and Shannon's information index (I), were measured using GenAlEx 6.5 software (New Brunswick, NJ, USA) [37]. Variance components of microsatellite diversity within and between populations for all pairs of populations were analyzed using analysis of molecular variance (AMOVA) with permutations set to 999 in GenAlEx 6.5 [37]. Chi-square tests of Hardy-Weinberg equilibrium and rare alleles were calculated for each population using Microsatellite Analyzer v. 4.05 (MSA) [38]. Fixation indices (FIT, FIS, and FST) of Wright's F-statistics were obtained using GenAlEx 6.5 and Excel microsatellite toolkit (version 3.1) [37]. The neighbor-joining method of Saitou and Nei (1987) [39] was used to construct a phylogenetic tree based on Nei's genetic distance in MEGA 7 [40]. Factorial correspondence analysis (FCA) was performed on the individual multilocus genotypes using GENETIX version 4.03 [41].

Bayesian clustering analysis was implemented in Structure 2.3.4 [42] without prior structure information. All possibilities were considered by dividing 7 populations into 7 groups. An ad hoc quantity based on the second-order rate of change in the likelihood function with respect to K (ΔK) was used for estimating the number of clusters from the structure analysis [43]. In addition, ln Pr(X|K) values were used to identify the k for which Pr(K = k) is highest (as described in the STRUCTURE manual, Section 5.1). Twenty runs for K = 1 to 7 were analyzed under the admixture model with correlated allele frequencies and a burn-in of 250,000 followed by 1,000,000 Markov chain Monte Carlo (MCMC) iterations. Structure Harvester 0.6.93 [44] was applied to choose the optimal K-value based on the Delta K method. The 20 replicates for the chosen K-value were merged using CLUMPP 1.1.2 [45] and the final plots were generated using DISTRUCT 1.1 [46].
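The Delta K criterion described above can be sketched in a few lines. This is a minimal illustration, not the Structure Harvester implementation: it assumes ΔK is taken as the absolute second difference of the mean ln Pr(X|K) across replicate runs, divided by the standard deviation of ln Pr(X|K) at that K. The function name delta_k and the log-likelihood values are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: deltaK} from replicate ln Pr(X|K) values.

    lnp_by_k maps each K to a list of ln Pr(X|K) values, one per run.
    Delta K is only defined for K values that have both neighbours
    (K - 1 and K + 1) and a non-zero standard deviation.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    sds = {k: stdev(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means and sds[k] > 0:
            # Absolute second-order rate of change of the mean likelihood.
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / sds[k]
    return result


# Hypothetical replicate log-likelihoods (three runs per K), for illustration.
runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-900.0, -901.0, -899.0],
    3: [-890.0, -892.0, -888.0],
    4: [-889.0, -893.0, -885.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Because ΔK needs both neighbouring K values, it cannot be evaluated at the smallest or largest K tested, which is one reason the ln Pr(X|K) values are inspected alongside it.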

Microsatellite Genotyping, and Population Genetic Diversity and Structure
A total of 136 alleles at 11 STR loci from 435 genotyped individuals of two Kazakh horse breeds from seven populations were identified. All markers were found to be polymorphic (p ≥ 0.05) (Table 2). The mean number of alleles varied from 9 at loci AHT4, HMS6, and HMS7 to 19 at locus ASB23. The mean number of alleles (Na) per locus was 12.36, and the effective number of alleles (Ne) was 5.82. The expected heterozygosity (He), which is a widely accepted measure of genetic diversity in a population, ranged from 0.64 in locus HTG4 to 0.83 in locus VHL20, with an average He of 0.77 across the seven populations for the 11 microsatellite loci analyzed. The observed heterozygosity (Ho) ranged from 0.48 in locus LEX3 to 0.85 in locus VHL20, with a population mean of 0.68, indicating that all studied lineages are characterized by considerable genetic variability. The polymorphic information content (PIC) varied from 0.62 for the marker ASB2 to 0.82 for the AHT4 locus. The average PIC for the 11 microsatellite markers was 0.74 and there were no markers with a PIC of less than 0.5, indicating that all loci were highly polymorphic. Shannon's information (diversity) index (I), which is an indicator of the genetic variability of a population, ranged from 1.33 in locus HTG4 to 1.99 in the VHL20 marker. The average value of the I-index for all seven populations was 1.73, which reflects the level of allele abundance (Table 2). Further, FIS, FIT, and FST indices were calculated for each marker across all populations. FIS ranged from −0.031 (VHL20) to 0.093 (AHT4) with an average value of 0.211 for all loci. FIT had a mean value of 0.157, ranging from −0.010 for HTG7 to 0.439 for LEX3. FST was between 0.030 (AHT5) and 0.074 (ASB2) with a mean value of 0.041 in the total population.
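The per-locus indices reported above all derive from allele frequencies. The following is a minimal sketch, assuming the standard definitions: Ne = 1/Σp², He = 1 − Σp², unbiased He with the 2N/(2N − 1) sample-size correction, Shannon's I = −Σ p ln p, and PIC = 1 − Σp² − Σ 2pᵢ²pⱼ² over allele pairs. The function name and the example frequencies are hypothetical; the study itself obtained these values with Cervus and GenAlEx.

```python
import math
from itertools import combinations


def locus_diversity(freqs, n_individuals):
    """Diversity indices for one locus from its allele frequencies.

    freqs: allele frequencies at the locus (must sum to 1).
    n_individuals: number of genotyped diploid individuals (N).
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    he = 1.0 - homozygosity                      # expected heterozygosity
    ne = 1.0 / homozygosity                      # effective number of alleles
    two_n = 2 * n_individuals
    uhe = (two_n / (two_n - 1)) * he             # unbiased expected heterozygosity
    shannon = -sum(p * math.log(p) for p in freqs if p > 0)
    # PIC subtracts the share of uninformative heterozygous matings from He.
    pic = he - sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return {"Na": len(freqs), "Ne": ne, "He": he, "uHe": uhe,
            "I": shannon, "PIC": pic}


# Hypothetical locus with four equally frequent alleles in 50 horses.
stats = locus_diversity([0.25, 0.25, 0.25, 0.25], n_individuals=50)
print(stats)
```

With equally frequent alleles Ne equals Na; the gap between the mean Na (12.36) and mean Ne (5.82) reported above therefore reflects uneven allele frequencies, including the rare alleles.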
AMOVA performed on the seven populations suggests that the majority of the variation (70%) occurred within individuals (Table 3). Fixation indices based on standard permutation demonstrated significant differences (p ≤ 0.001), indicating a reduction of heterozygosity, panmixia, and inbreeding processes in Kazakhstani populations of E. caballus. Across Kazakhstan, rare alleles typical for each population were identified at loci VHL20, HTG4, HMS3, HMS6, HMS7, AHT4, AHT5, ASB2, ASB23, and LEX3. Among them, five unique alleles were observed in the Uralsk (Population 1) and Kulandy (Population 7) populations. For the Uralsk population, unique alleles were found at the HTG locus (121 bp and 125 bp), at HMS6 (155 bp), and at HMS7 (169 bp). For the Kulandy population, a unique allele was found at locus LEX3 (143 bp).
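The relationship between the three fixation indices can be illustrated with a short sketch. It assumes Wright's heterozygosity-based definitions, with HI the mean observed heterozygosity within subpopulations, HS the mean expected heterozygosity within subpopulations, and HT the expected heterozygosity of the pooled population. The function name and the input values are hypothetical and are not the study's estimates.

```python
def f_statistics(h_i, h_s, h_t):
    """Wright's F-statistics from three heterozygosity levels.

    h_i: mean observed heterozygosity within subpopulations (HI)
    h_s: mean expected heterozygosity within subpopulations (HS)
    h_t: expected heterozygosity in the total population (HT)
    """
    f_is = (h_s - h_i) / h_s   # deficit of heterozygotes within subpopulations
    f_st = (h_t - h_s) / h_t   # differentiation among subpopulations
    f_it = (h_t - h_i) / h_t   # overall deficit of heterozygotes
    return f_is, f_st, f_it


# Hypothetical heterozygosities chosen only to illustrate the arithmetic.
f_is, f_st, f_it = f_statistics(h_i=0.60, h_s=0.75, h_t=0.80)
print(round(f_is, 4), round(f_st, 4), round(f_it, 4))

# The three indices are linked by (1 - FIT) = (1 - FIS) * (1 - FST).
identity_gap = (1 - f_it) - (1 - f_is) * (1 - f_st)
```

Under these definitions a positive FIS indicates fewer heterozygotes than expected within subpopulations, while a small FST indicates that little of the total diversity is due to differences between subpopulations.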
The value for gene differentiation based on F-statistic (FST) distance over all loci between populations of the Kushum breed (Uralsk with Aktobe) was 4.5%, whereas between populations of the Mugalzhar breed it varied in the range of 0.008-2.8%. Genetic variability between the Uralsk population and Mugalzhar populations ranged from 3.3% to 4.7%, which indicates that up to 4.7% of the variability could be attributed to differences between breeds (Table 4). Chi-squared tests gave statistically significant (p ≤ 0.05) results at all loci, rejecting the null hypothesis of random mating (Table S1). Factorial correspondence analysis revealed three clusters of horse populations, distinct along three axes that explain 35.98%, 25.27%, and 17.40% of the variance, respectively (Figure 2). The FCA plot demonstrated that the Uralsk population of the Kushum breed was clearly separated from other horses; this result is consistent with the phylogenetic tree and structure inferences.

An unrooted neighbor-joining tree for all samples was constructed using a pairwise population matrix of Nei's genetic distances in order to represent relationships among seven populations of Kazakh horse breeds (Figure 3). Three main groups were recovered: Group I is the Uralsk population with a distance of 2.8% to monophyletic groups II and III. Group II is the Kozhamberdy type population representing three lines: IIa, Meiman; IIb, Maupas; and IIc, Mesker (1.4%). Group III clusters members of two breeds consisting of IIIa, Aktobe (Kushum breed); IIIb, Emba (Mugalzhar breed); and IIIc, Kulandy type populations (Mugalzhar breed) (1.6-1.7%).

Bayesian cluster analysis performed with STRUCTURE [33] showed that the independent runs from K = 2 to K = 7 produced consistent results, where the most likely K values were identified at K = 3 and 4 (ΔK = 12.953; 12.935), respectively (Figure 4); the subpopulations' structure [32] using the median values of Ln Prob of data to calculate Prob(K = k) yielded the uppermost value at K = 7. A plot with the clustering of individuals is presented in Figure S1.
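The pairwise distances behind the neighbor-joining tree (Figure 3) can be illustrated with a small sketch. It assumes that Nei's standard genetic distance is meant, D = −ln(Jxy / √(Jx·Jy)), where Jx, Jy, and Jxy are the averages over loci of Σx², Σy², and Σxy for the allele frequencies x and y of two populations. The function name, allele labels, and frequencies are hypothetical and are not taken from the study's data.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two populations.

    pop_x, pop_y: lists with one dict per locus, mapping allele -> frequency.
    Both lists must describe the same loci in the same order.
    """
    if len(pop_x) != len(pop_y) or not pop_x:
        raise ValueError("populations must share the same non-empty locus list")
    n_loci = len(pop_x)
    j_x = sum(sum(p * p for p in locus.values()) for locus in pop_x) / n_loci
    j_y = sum(sum(p * p for p in locus.values()) for locus in pop_y) / n_loci
    # Alleles missing from one population contribute a frequency of zero.
    j_xy = sum(
        sum(freq * locus_y.get(allele, 0.0) for allele, freq in locus_x.items())
        for locus_x, locus_y in zip(pop_x, pop_y)
    ) / n_loci
    identity = j_xy / math.sqrt(j_x * j_y)   # Nei's genetic identity
    return -math.log(identity)


# Hypothetical single-locus example: one population carries two alleles at
# equal frequency, the other is fixed for one of them.
pop_a = [{"a": 0.5, "b": 0.5}]
pop_b = [{"a": 1.0}]
d_ab = nei_standard_distance(pop_a, pop_b)
d_aa = nei_standard_distance(pop_a, pop_a)
print(d_ab, d_aa)
```

A matrix of such pairwise values for the seven populations is the kind of input a neighbor-joining program takes; populations with similar allele frequencies give distances close to zero.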

Discussion
A comprehensive genetic analysis of microsatellite markers conducted for seven populations of two main horse breeds of Kazakhstan revealed a high genetic diversity. Wright's

Figure 1. Distribution map of the two main Kazakh horse breeds, Kushum and Mugalzhar.

Figure 2. Factorial correspondence analysis of 7 horse populations studied on the basis of 11 STR loci. Dashed lines represent the three clusters with the following colors: yellow, Uralsk; black, Kulandy; and waterloo color, Aktobe, Emba, Maupas, Meiman, and Mesker.

Figure 3. Neighbor-joining dendrogram showing the relationships of seven Kazakh horse populations with Nei's genetic distances plotted. Outline colors of the tree branches represent the breeds according to Figure 1: red = Kushum breed, green = Mugalzhar breed.

Table 1. Sampling information of studied populations of E. caballus.

Table 2. Summary statistics of mean genetic diversity at 11 microsatellite loci in 435 individuals of E. caballus from Kazakhstan.
Number of observed alleles (NA), number of effective alleles (NE), Shannon's information index (I), observed heterozygosity (HO), expected heterozygosity (HE), unbiased expected heterozygosity (UHE), inbreeding coefficient (FIS), fixation index (FIT), population differentiation statistic (FST); p ≤ 0.001.

Table 3. Analysis of molecular variance.

Table 4. Genetic distances of studied horse populations.
Wright's FST fixation indices are given below the diagonal.