Analysis of the Genetic Relationship and Inbreeding Coefficient of the Hetian Qing Donkey through a Simplified Genome Sequencing Technology

The Hetian Qing donkey is an excellent local donkey breed in Xinjiang. Understanding the genetic basis of the population is of great significance for accelerating its breeding, purification, and rejuvenation and for designing conservation strategies. This study collected samples from a total of 4 male and 28 female donkeys and obtained genotype data through genotyping by sequencing (GBS), a simplified genome sequencing technology. A total of 55,399 SNP loci were detected, and the genotype detection rate of individuals was ≥90%. After quality control, 45,557 SNP loci were retained, of which 95.5% were polymorphic. The average minor allele frequency was 0.250. The average observed heterozygosity was 0.347. The average expected heterozygosity was 0.340. The average identity-by-state (IBS) genetic distance was 0.268. A total of 49 runs of homozygosity (ROHs) were detected, 73.47% of which were between 1 and 5 Mb long. The average ROH length was 1.75 Mb. The mean inbreeding coefficient was 0.003. The 32 Hetian Qing donkeys could be divided into six families, and the number of individuals differed considerably among families. In summary, the Hetian Qing donkey population has low heterozygosity, few families, and large differences in the number of individuals per family, which can easily cause a loss of genetic diversity. In subsequent conservation work, selection and mating should follow the identified family structure to ensure the long-term protection of the genetic resources of Hetian Qing donkeys.


Introduction
The Hetian Qing donkey (formerly known as the Gola donkey) is a valuable local medium-sized donkey breed in China. In 2009, it was included in the livestock and poultry genetic resources protection lists of both the state and the Xinjiang Uygur Autonomous Region. Hetian Qing donkeys are mainly distributed in Qiaoda, Sangzhu, Muji, Zanggui, Piyaman, and other townships of Pishan County, Hetian. The main production area is the plain of Qiaoda township, which has a typical continental warm-temperate arid desert climate and is rich in light and heat resources. The annual average temperature is 11.8 °C, the annual average precipitation is 51.3 mm, and the annual evaporation is 2700 mm [1]. Accordingly, the Hetian Qing donkey favors a dry and warm climate and shows strong disease resistance, tolerance of coarse feed, tolerance of hunger and thirst, early sexual maturity, a high reproductive rate, fast growth, good working performance and health, high meat yield, good hide quality, and dual-purpose use [2]. At the end of 2009, only approximately 2000 animals remained in the county, so the breed urgently needs protection [3]. Strengthening the breeding of Hetian Qing donkey strains, carrying out selection and planned mating, and achieving purification, rejuvenation, and population expansion have therefore become the primary tasks in Hetian Qing donkey conservation. Male and female donkeys become sexually active at around one year of age, and females can start breeding at this point under good dietary conditions. Female donkeys can be used for breeding up to 15-20 years of age. The average gestation period is 361 days, sexual maturity is reached at 10-18 months, and 15 to 20 foals are born in a lifetime. The survival rate of foals exceeds 90% [4]. Because of the species' long breeding cycle and poor breeding technology, selection and breeding are difficult and population expansion is slow.
In recent years, with continuous progress in molecular biology and substantial reductions in the cost of mutation detection, obtaining large amounts of SNP information has become easier. High-throughput mutation detection technologies such as genome sequencing and SNP chips are widely used in research on the genetic structure, kinship, and inbreeding of livestock [5]. Significant progress has been made in this area, which provides an important reference for formulating livestock breeding and conservation strategies. In 2007, reduced-representation genome sequencing (RRGS) technology was proposed by Miller et al. [6]. Building on next-generation sequencing (NGS), a method was then proposed for sequencing specific genomic regions that reflect the whole genome sequence [7]. This technology not only reduces the complexity of the genome but also avoids the complex workflow and high cost of earlier high-throughput sequencing methods and improves the reliability of sequencing results when no reference genome is available. It therefore has great advantages in research on species without a reference genome [8]. For most species, human activities and environmental factors can easily alter gene flow, resulting in varying degrees of genetic differentiation among populations in a region and changing the genetic structure of the species. Genomics has therefore become an important area of species research [9]. According to the method used to construct the genome library, simplified genome sequencing technology can be divided into RRL (reduced representation library sequencing), RAD (restriction-site associated DNA), and GBS (genotyping by sequencing). GBS uses restriction endonucleases to fragment and tag the genome, and high-density SNP markers of the sampled species can be obtained through high-throughput parallel sequencing of multiple samples [10]. Compared with RAD technology, fragment selection in GBS is simpler, library construction involves fewer steps, and the method is more advantageous for species with low polymorphism and highly repetitive sequences [11].
Although this technology has been widely used, it has not yet been applied to the Hetian Qing donkey, and research on this breed is scarce, especially at the molecular level. Therefore, the purpose of this study was to analyze the genetic structure of 32 Hetian Qing donkeys using GBS technology and to discuss the degree of genetic diversity, population structure, and differentiation of this breed. The knowledge gathered in this investigation could serve as an important tool for protecting the resources of Hetian Qing donkeys and for the sustainable development of the Xinjiang donkey industry.

Test Animals
A total of 32 Hetian Qing donkeys, including 28 females and 4 males, were selected from private herdsmen's homes in Pishan County, Hetian, Xinjiang. After selection, blood was collected from the jugular vein of each donkey into a 5 mL EDTA anticoagulant tube. The animals were released after blood collection was completed, and the blood samples were stored at −20 °C for future use.

DNA Extraction
DNA was extracted using the magnetic bead method, and genomic DNA was isolated and purified using the cwe9600 magnetic bead blood DNA kit. After extraction, DNA quality was checked on a 2% agarose gel, DNA concentration was measured with Qubit, and DNA purity was assessed with a NanoDrop spectrophotometer (OD260/OD280 = 1.8-2.0), providing reliable basic data for subsequent experiments.

Construction of the GBS Library
Following the simplified genome sequencing workflow, the qualified DNA was digested with the Mse I restriction endonuclease, and a barcoded adapter was ligated to both ends of the digested fragments. PCR amplification was then carried out, and the amplified products were pooled. Fragments of the required size were selected to establish the library. The constructed library was initially quantified using Qubit® 2.0 and diluted to 1 ng·µL⁻¹. The library was examined using an Agilent 2100 Bioanalyzer (Santa Clara, CA, USA). After the insert size was confirmed, the effective concentration of the library was accurately quantified using q-PCR (effective concentration > 2 nmol·L⁻¹) to ensure the quality of the library [12]. Once the libraries passed quality control, they were pooled according to their effective concentrations and the amount of data required, followed by Illumina HiSeq PE150 sequencing [13].

Sequencing Data Quality Control
The raw high-throughput sequencing data were preprocessed with Trimmomatic [14] to filter by quality. The filtering conditions were as follows: remove adapter sequences contained in the reads; remove bases with a quality score below 20; then filter the trimmed data by length, removing reads shorter than 50 bp and reads for which only one end remained. The reads that passed these filters constituted the effective sequencing data.

SNP Detection and Quality Control
The original image data files obtained from the Illumina HiSeq sequencing platform were converted into raw sequence reads, and the effective reads were obtained after quality control. Samtools [15] was used for preprocessing based on the alignment of clean reads to the reference genome, including steps such as Mark Duplicates and Base Recalibration. With GATK [16], the joint calling method was used to improve the performance of the population analysis. This allowed us to detect single nucleotide polymorphisms (SNPs) in multiple samples simultaneously, filter the calls, and obtain the final SNP locus set, thus ensuring the accuracy of the detected SNPs. Plink v1.90 software was used to perform quality control on the detected SNP data. The following quality control criteria were applied to the SNP genotype data before subsequent analysis: sequencing depth ≥ 10×; Q20 > 95%; SNP detection rate ≥ 90%; minor allele frequency (MAF) ≥ 0.01; and a Hardy-Weinberg p value ≥ 10⁻⁶.
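The two simplest of the criteria above, the SNP detection (call) rate and the MAF threshold, can be illustrated on a genotype matrix. The following is a minimal sketch, not the authors' pipeline: the function name `snp_qc`, the 0/1/2 genotype coding, and the use of NaN for missing calls are assumptions made for illustration, and the depth, Q20, and Hardy-Weinberg filters are deliberately left out. In PLINK 1.9 the comparable per-variant filters are exposed as `--geno`, `--maf`, and `--hwe`; the exact command line used in the study is not given in the text.

```python
import numpy as np

def snp_qc(genotypes, min_call_rate=0.90, min_maf=0.01):
    """Return a boolean mask of SNPs that pass call-rate and MAF filters.

    genotypes: (n_individuals, n_snps) array of alternate-allele counts
    coded 0, 1, 2, with np.nan marking a missing call (assumed coding).
    """
    g = np.asarray(genotypes, dtype=float)
    called = ~np.isnan(g)
    call_rate = called.mean(axis=0)          # fraction of individuals genotyped
    n_called = called.sum(axis=0)
    # Alternate-allele frequency over called genotypes (two alleles each).
    # np.maximum avoids division by zero for SNPs with no calls at all;
    # such SNPs get frequency 0 and fail both filters.
    alt_freq = np.nansum(g, axis=0) / (2 * np.maximum(n_called, 1))
    maf = np.minimum(alt_freq, 1 - alt_freq)
    return (call_rate >= min_call_rate) & (maf >= min_maf)
```

Applying the mask with `genotypes[:, snp_qc(genotypes)]` keeps only the passing loci; the thresholds default to the values stated in the text.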

Genetic Diversity Analysis
The genetic diversity analysis included the effective population size (Ne), which is the size of an ideal population that has the same variance in gene frequency or the same rate of increase in inbreeding (rate of heterozygosity decay) as the actual population [17]. It is usually estimated from the level of linkage disequilibrium (LD) in the population [18]. Plink software was used to calculate the minor allele frequency (MAF) [19] of each site, that is, the frequency of the less common allele at a locus in the population, and to determine the proportion of polymorphic markers (PN) in the population. The expected heterozygosity (He) is the probability that any individual in the population is heterozygous at any site, and the observed heterozygosity (Ho) is the proportion of individuals in the population that are heterozygous at a site. The number of effective alleles was also calculated. The polymorphism information content (PIC) is an index that measures the degree of gene variation and reflects the amount of genetic information. PIC can be calculated according to the formula of Botstein et al. [20]:
$$\mathrm{PIC} = 1 - \sum_{i=1}^{n} P_i^2 - \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} 2 P_i^2 P_j^2$$
where P_i and P_j are the frequencies of the ith and jth alleles of the selected marker, respectively, and n is the number of alleles.
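For a biallelic SNP with allele frequencies p and q = 1 − p, the Botstein formula reduces to PIC = 1 − p² − q² − 2p²q². The sketch below shows how the per-locus statistics described above could be computed from a genotype matrix. It is an illustration under assumptions, not the software used in the study: the function name `diversity_stats` and the 0/1/2 coding with NaN for missing calls are hypothetical, He is taken as the uncorrected 2pq, and every SNP is assumed to have at least one called genotype.

```python
import numpy as np

def diversity_stats(genotypes):
    """Per-SNP MAF, Ho, He, PIC and effective allele number (biallelic SNPs).

    genotypes: (n_individuals, n_snps) array coded 0/1/2, np.nan = missing.
    Assumes each SNP has at least one called genotype.
    """
    g = np.asarray(genotypes, dtype=float)
    n = (~np.isnan(g)).sum(axis=0)            # called individuals per SNP
    p = np.nansum(g, axis=0) / (2 * n)        # alternate-allele frequency
    q = 1 - p
    homozygosity = p**2 + q**2                # expected homozygosity
    return {
        "maf": np.minimum(p, q),
        "ho": (g == 1).sum(axis=0) / n,       # NaN == 1 is False, so missing calls are ignored
        "he": 2 * p * q,                      # no small-sample correction
        "pic": 1 - homozygosity - 2 * p**2 * q**2,
        "n_eff_alleles": 1 / homozygosity,
    }
```

Population-level values such as those reported in a summary table would then be means over loci, for example `diversity_stats(g)["pic"].mean()`.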

Inbreeding Coefficient Analysis Based on Long Homozygous Fragments
Runs of homozygosity (ROHs) are widely present in all populations. An ROH is a continuous stretch of homozygous genotypes in an individual, which is produced when identical haplotypes are transmitted from both parents to the offspring. The length and frequency of ROHs can reflect population history. A long ROH indicates recent relatedness; the more such fragments there are, the higher the likelihood of inbreeding within the family. A short ROH indicates relatedness that arose in the distant past. Plink (V1.90) was used to calculate the ROH length of each sample. The ROH-based inbreeding coefficient is the proportion of the autosomal genome covered by ROH fragments [21]. Therefore, the longer the total length or the higher the number of ROHs in an individual, the higher the inbreeding coefficient of that individual.
$$F_{ROH} = \frac{\sum_{k} \mathrm{Length}(ROH_k)}{L}$$
where k indexes the ROHs in an individual, and L is the length of the autosomal genome covered by genotype data (for the donkey, approximately 2,302,664.694 kb).
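As a worked illustration of the F_ROH definition, the sketch below divides the summed ROH length of one individual by the autosomal genome length quoted in the text. The function name `f_roh` and the example ROH lengths are hypothetical and do not correspond to any animal in the study.

```python
# Autosomal genome length covered by genotype data, in kb (value from the text).
AUTOSOME_LENGTH_KB = 2_302_664.694

def f_roh(roh_lengths_kb, genome_length_kb=AUTOSOME_LENGTH_KB):
    """ROH-based inbreeding coefficient: total ROH length / autosome length."""
    return sum(roh_lengths_kb) / genome_length_kb

# Hypothetical individual carrying two ROHs of 1,070 kb and 10,660 kb.
example = f_roh([1_070.0, 10_660.0])
print(f"F_ROH = {example:.4f}")
```

An individual with no detected ROH gets F_ROH = 0, and the population mean is the average of the per-individual values.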

Genetic Relationship Analysis of Population Genome
Principal component analysis (PCA) is a purely mathematical method that reduces many correlated variables to a smaller number of important variables through a linear transformation. PCA is applied for cluster analysis in many disciplines, including genetics. Based on the degree of SNP differences between individual genomes, it clusters individuals into subgroups along the principal components and is used for mutual validation with other methods. This project used smartpca software to conduct a SNP-based PCA and obtain the principal component clustering of the samples. PCA makes it possible to determine which samples are relatively close and which are relatively distant, which can assist evolutionary analysis. The genomic relationship matrix (G matrix) was constructed from genome-wide marker information, and molecular kinship analysis based on the G matrix was carried out using GCTA (V1.94) software. A heat map was drawn to show the genomic kinship among individuals [22]. Plink (v1.90) software was used to calculate the genetic distance between individuals, construct an identity-by-state (IBS) matrix, and analyze the IBS genetic distance.
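The two pairwise measures described above can be sketched in a few lines. This is an illustrative approximation rather than a re-implementation of Plink or GCTA: the function names are hypothetical, genotypes are assumed to be complete and coded 0/1/2, the IBS distance is taken as one minus the mean proportion of alleles shared, and the relationship matrix uses the standardized-genotype cross-product for every entry, including the diagonal, which GCTA treats with a separate formula.

```python
import numpy as np

def ibs_distance(genotypes):
    """Pairwise IBS distance: 1 - mean proportion of alleles shared by state.

    genotypes: (n_individuals, n_snps) array coded 0/1/2, no missing data.
    """
    g = np.asarray(genotypes, dtype=float)
    n = g.shape[0]
    dist = np.zeros((n, n))
    for i in range(n):
        # |g_i - g_j| is the number of alleles NOT shared at each SNP (0, 1, 2).
        dist[i] = np.abs(g - g[i]).mean(axis=1) / 2
    return dist

def genomic_relationship_matrix(genotypes):
    """G matrix from standardized genotypes, averaged over polymorphic SNPs."""
    g = np.asarray(genotypes, dtype=float)
    p = g.mean(axis=0) / 2                     # allele frequency per SNP
    poly = (p > 0) & (p < 1)                   # drop monomorphic SNPs
    z = (g[:, poly] - 2 * p[poly]) / np.sqrt(2 * p[poly] * (1 - p[poly]))
    return z @ z.T / poly.sum()
```

Both functions return symmetric matrices whose rows and columns follow the order of individuals in the input, so either can be passed directly to a heat map or a clustering routine.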

Cluster Analysis to Construct the Hetian Qing Donkey Family
Cluster analysis is a method of generating a relatively simple class structure from a group of complex data and classifying groups according to the degree of correlation or similarity between individuals [23]. The neighbor-joining (NJ) method was used to cluster the samples based on the genetic distance matrix obtained from the genetic distance analysis. Finally, the whole classification was drawn as a genealogy chart, which shows the kinship between all samples and represents the families of the Hetian Qing donkey conservation population.
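To make the neighbor-joining step concrete, the following is a compact sketch of the standard NJ algorithm applied to a distance matrix. The text does not name the software used for tree building, so the function `neighbor_joining`, its return format, and the Newick output are illustrative assumptions only.

```python
import numpy as np

def neighbor_joining(dist, labels):
    """Neighbor-joining on a symmetric distance matrix.

    Returns (joins, newick): joins lists (node_a, node_b, length_a, length_b)
    in the order the pairs were merged; newick is the unrooted tree string.
    """
    D = np.array(dist, dtype=float)
    nodes = list(labels)
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        r = D.sum(axis=1)
        # Q criterion: the pair with the smallest Q is joined next.
        Q = (n - 2) * D - r[:, None] - r[None, :]
        np.fill_diagonal(Q, np.inf)
        i, j = np.unravel_index(np.argmin(Q), Q.shape)
        i, j = int(min(i, j)), int(max(i, j))
        # Branch lengths from the two joined nodes to the new internal node.
        len_i = 0.5 * D[i, j] + (r[i] - r[j]) / (2 * (n - 2))
        len_j = D[i, j] - len_i
        joins.append((nodes[i], nodes[j], len_i, len_j))
        # Distances from the new node to every remaining node.
        d_new = 0.5 * (D[i] + D[j] - D[i, j])
        keep = [k for k in range(n) if k not in (i, j)]
        D_next = np.zeros((n - 1, n - 1))
        D_next[:-1, :-1] = D[np.ix_(keep, keep)]
        D_next[-1, :-1] = d_new[keep]
        D_next[:-1, -1] = d_new[keep]
        merged = f"({nodes[i]}:{len_i:.4f},{nodes[j]}:{len_j:.4f})"
        nodes = [nodes[k] for k in keep] + [merged]
        D = D_next
    half = D[0, 1] / 2                         # split the last branch evenly
    newick = f"({nodes[0]}:{half:.4f},{nodes[1]}:{half:.4f});"
    return joins, newick
```

In this setting the input would be a pairwise genetic distance matrix such as the IBS distance matrix, with sample identifiers as labels; the Newick string can be opened in any standard tree viewer.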

Genomic DNA Detection
The concentration of the DNA samples from the 32 Hetian Qing donkeys was greater than 100 ng/µL, and the OD260/OD280 ratio ranged from approximately 1.8 to 2.0. The genomic DNA samples were of good quality, with no protein contamination and no degradation, and therefore met the requirements for subsequent analysis.

Sequencing Data Output and Quality Control
According to Table 1, the average number of effective bases obtained from GBS sequencing and quality control of the blood genomes of the 32 Hetian Qing donkeys was 3,813,442.839. The average GC content was 41.55%. The average proportion of bases reaching Q20 quality was 96.83%, and the average proportion reaching Q30 quality was 91.06%. According to Table 2, the average number of effective reads obtained was 630,750.875, and the average proportion of effective reads mapped to the reference genome was 98.32%. The sequencing depth ranged from 2.47× to 3.62×, with an average of 3.17×. This indicates that the quality of GBS sequencing was high, the GC distribution was normal, the samples were not contaminated, and the sequencing was successful. The obtained data met the requirements of subsequent analysis.

SNP Locus Quality Control
Plink (V1.90) software was used to conduct subsequent analysis on the sites with the best SNP typing quality. The specific quality control conditions and results are shown in Table 3. The distribution of SNPs on each chromosome before and after quality control is shown in Figure 1.
Table notes (as recovered): number of clean reads mapped to the reference genome; (4) Properly mapped (%): the number of clean reads with both ends of the sequenced sequence located on the reference genome and the distance consistent with the length distribution of the sequenced segment; (5) sites_CovgMean: average coverage depth of all sites in the genome; (6) sites_NumCovg1: the proportion of bases with a sequencing depth greater than or equal to 1× to the total length of the genome.

Genetic Diversity and Population Structure Analysis
The results of the genetic diversity analysis are shown in Table 4. A total of 1.536 effective alleles were detected, with an average number of effective alleles of 0.048 and an average MAF of 0.250. MAF values between 0.1 and 0.2 were the most common, accounting for 23.89% of loci. The distribution is shown in Figure 2a. The PN of the SNP loci was 0.955, indicating that 95.5% of the SNP loci were polymorphic, and the polymorphism information content of the SNP loci was 0.273, as shown in Figure 2b. The Ne of the Hetian Qing donkeys was 4.1, Ho was 0.347, and He was 0.340. The average observed heterozygosity was slightly higher than the average expected heterozygosity, but the two were very close, as shown in Figure 2c.

Inbreeding Coefficient Analysis Based on Long Homozygous Segments
As shown in Figure 3a, a total of 49 ROH fragments were detected in the Hetian Qing donkey population. ROH fragments between 1 and 5 Mb long were the most numerous, accounting for 73.47%. The shortest ROH was 1.07 Mb long and located on chromosome 17, and the longest ROH was 10.66 Mb long and located on chromosome 1. As shown in Figure 3b, chromosomes 2 and 7 carried the most ROHs, with six each, whereas no ROHs were detected on chromosomes 14, 15, 16, 20, 22, 25, 26, and 27. The largest group of individuals, 19 animals (59.37%), had a total ROH length between 0 and 5 Mb, as shown in Figure 3c. The average ROH-based inbreeding coefficient of the Hetian Qing donkey population was 0.003, with most individual values concentrated between 0.00 and 0.0025, as shown in Figure 3d.

Kinship Analysis Based on G Matrix
We performed principal component analysis on the 45,557 SNP loci retained after quality control. Figure 4a shows that most Hetian Qing donkeys clustered closely, indicating close genetic relationships, while three male donkeys were located far from the main cluster. Male and female donkeys were far apart from each other, which indicates diversity within the population. GCTA (V1.94) software was then used to calculate the genetic relationship coefficients between individuals and construct a population G matrix (Figure 4b) in order to further analyze the genetic relationships in the conserved population of Hetian Qing donkeys and validate the results of the principal component analysis. Figure 4c shows that the IBS genetic distance ranged from 0.011 to 0.339, with an average of 0.267. Most blocks were relatively light (moderately correlated), and the IBS genetic distance between most individuals was short, indicating relatively close genetic relationships.

Cluster Analysis to Construct the Families in the Hetian Qing Donkey Population

Cluster Analysis
In view of the importance of the male donkeys to the whole population, we extracted the male donkey samples and conducted a separate cluster analysis to judge the distance between them. The results are shown in Figure 5a, and the cluster analysis results of all samples are shown in Figure 5b.

Family Structure Analysis of Hetian Qing Donkey Population
Families were defined from the genomic relationship and cluster analysis results, using a genomic relationship coefficient between male donkeys of greater than or equal to 0.1 as the clustering criterion. On this basis, the existing male donkey samples could be divided into three families. Female donkeys were then assigned to families according to their genetic relationship with the male donkeys. In addition, 25 female donkeys were found to be only distantly related to the tested male donkeys, so they were classified as "other". The results are shown in Table 5.
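One way to read the family construction above is as a two-step rule on the G matrix: group the males whose pairwise relationship coefficient reaches 0.1, then attach each female to the family of her most closely related male, or to "other" if no male reaches the threshold. The sketch below encodes that reading. It is an assumption-laden illustration, not the authors' procedure: the function name `assign_families`, the single-linkage grouping of males, the best-match rule for females, and the sample names in the usage note are all hypothetical.

```python
import numpy as np

def assign_families(G, ids, males, threshold=0.1):
    """Assign each individual a family number or the label 'other'.

    G: symmetric relationship matrix ordered like ids; males: list of male ids.
    """
    G = np.asarray(G, dtype=float)
    idx = {name: k for k, name in enumerate(ids)}

    # Step 1: males linked by a coefficient >= threshold share a family
    # (connected components, found with a simple depth-first search).
    family = {}
    n_families = 0
    for m in males:
        if m in family:
            continue
        n_families += 1
        family[m] = n_families
        stack = [m]
        while stack:
            cur = stack.pop()
            for other in males:
                if other not in family and G[idx[cur], idx[other]] >= threshold:
                    family[other] = n_families
                    stack.append(other)

    # Step 2: every remaining animal joins the family of its closest male,
    # provided the coefficient reaches the threshold; otherwise 'other'.
    result = dict(family)
    for name in ids:
        if name in family:
            continue
        best = max(males, key=lambda m: G[idx[name], idx[m]])
        close_enough = G[idx[name], idx[best]] >= threshold
        result[name] = family[best] if close_enough else "other"
    return result
```

For a toy matrix with two unrelated males `M1` and `M2`, a female related to `M1` at 0.3 would be placed in family 1, while a female whose coefficients with both males are below 0.1 would be labeled "other".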

Genetic Diversity Analysis
The Hetian Qing donkey is a rare and precious genetic resource in the Hetian region of Xinjiang, China. It plays a vital role in animal husbandry and transportation in southern Xinjiang. Given the particularity of its geographical environment, research on the level of genetic diversity and the genetic structure of Hetian Qing donkeys is not only central to protecting the genetic resources of this breed but is also the prerequisite for implementing effective, scientific breed protection measures [1]. There are many methods of studying the population genetics of species, such as using morphological, cytological, biochemical, and molecular markers. Compared with molecular markers, the first three are more vulnerable to environmental and other influences [24]. Previous studies have shown that a limited number of SNPs is the main bottleneck limiting the accuracy of kinship estimation [25]. Therefore, rapid access to a large amount of SNP information is essential to further improve the applicability of molecular markers in estimating genetic relationships. GBS technology can obtain a large number of effective SNPs, provide more genome-level information for molecular markers of species polymorphism, and ensure the accuracy and reliability of the data used for estimating genetic relationships [10,11]. It also has great potential for breeding superior lines when simplified genome sequencing is used to construct genetic maps. Simplified genome sequencing technology can be used to measure genetic data, build phylogenetic trees from the genetic distance between populations, understand the evolutionary mechanisms and environmental adaptability of a population, and reconstruct its evolutionary history, so it is valuable in guiding the formulation of conservation strategies and methods [26]. Recently, it has also been widely used in animal genetic diversity analysis, genome characterization, heterosis evaluation, and other studies [27]. Zhu Wenjin et al. [28] carried out a microsatellite analysis of the genetic diversity and phylogenetic relationships of eight local donkey breeds in China. All 24 microsatellite loci except NVHEQ18 were highly polymorphic in the eight donkey breeds. The average PIC (0.6940), H (0.7119), and E (3.94) of the eight donkey breeds showed relatively high genetic polymorphism and genetic diversity. Yang Hu et al. [29] carried out a microsatellite genetic analysis of three local donkey breeds in Xinjiang and found that eight microsatellite loci were highly polymorphic in the three populations. The average polymorphism information content (PIC, 0.7568) and the genetic heterozygosity (h, 0.7841) of the three local donkey populations were higher than those of other domestic donkey breeds, indicating that the local Xinjiang donkey has rich genetic diversity, a high degree of population genetic variation, and great breeding potential. Through a microsatellite analysis of genetic diversity among Chinese donkey breeds, Zhang RF et al. [30] found that the mean values of PIC, HE, and NE of seven polymorphic loci in 10 donkey breeds were 0.7679, 0.8072, and 6.0275, respectively. In general, Chinese donkeys showed relatively high genetic diversity at the seven polymorphic loci investigated in their study. Furthermore, Lulan Zeng et al. [31] used microsatellite markers to study the genetic diversity and genetic relationships of Chinese donkeys. The average values of polymorphism information content, observed allele number, and expected allele number of all tested Chinese donkeys were 0.6600, 6.890, and 3.700, respectively, indicating that Chinese donkeys have relatively rich genetic diversity. Although there is abundant genetic variation among Chinese donkey breeds, their degree of genetic differentiation is relatively low, with differences among breeds accounting for only 5.99% of the total genetic variation. Most studies on the inbreeding and genetic relationships of local donkey breeds have been based on microsatellites, and the application of GBS to study genetic diversity is rare.
The effective population size of the Hetian Qing donkeys measured using GBS in this study was 4.1. Because the expected heterozygosity (He = 0.340) of the Hetian Qing donkey population was slightly lower than the observed heterozygosity (Ho = 0.347), it can be considered a moderately heterozygous population. The average observed heterozygosity of the whole population was close to the average expected heterozygosity, which indicates that Hetian Qing donkeys have high purity and rich genetic diversity, which may be more conducive to individual health. Polymorphism information content (PIC) is a good indicator of gene polymorphism. Botstein et al. [20] proposed the index of polymorphism information content to measure the degree of gene variation: when PIC > 0.5, the locus is highly polymorphic; when 0.25 < PIC < 0.5, the locus is moderately polymorphic; and when PIC < 0.25, the locus has a low degree of polymorphism. This study found that 95.5% of SNPs were polymorphic, with an average PIC of 0.273, representing moderate polymorphism. This PIC value lies at the lower end of the moderate range, indicating that there has been a loss of genetic diversity during the conservation of Hetian Qing donkeys and that conservation measures need to be strengthened.

Analysis of Inbreeding Degree
Recently, research on the genomic inbreeding coefficient based on ROH has mainly concentrated on pigs, cattle, and other domestic animals, and many achievements have been made [32]. A run of homozygosity (ROH) is a continuous stretch of homozygous genotypes in the genome of an individual, and ROHs are commonly used to estimate the genomic inbreeding coefficient. The length of ROHs and the proportion of the genome they cover can reflect the age and origin of inbreeding and thus the level of inbreeding. ROH length increases with the degree of relatedness between an individual's parents: the longer the ROH segment, the greater the likelihood of recent inbreeding, and vice versa. Likewise, the more ROH segments there are, the more inbred a population is generally considered to be [33]. Inbreeding increases the homozygosity of populations. The probability that harmful recessive genes become homozygous increases with inbreeding, which may reduce the reproduction and survival ability of offspring. By scanning the whole genomes of 32 Hetian Qing donkeys, this study detected 49 ROH fragments, 73.47% of which were between 1 and 5 Mb long, and the average length of each ROH was 1.75 Mb. Most individuals in the Hetian Qing donkey population had a total ROH length in the range of 0 to 5 Mb, and the average inbreeding coefficient was 0.003. The number and length of ROH fragments indicate that neither inbreeding nor artificial selection is pronounced in Hetian Qing donkeys and that conservation efforts have been effective. In a whole-genome sequencing study of the genetic structure of Chinese domestic donkey populations, Wang Gang [34] found that the Kunsha donkey had the highest number of ROHs and the highest average ROH length (SROH = 17,389.90 kb), whereas the Qinghai donkey had the lowest average ROH length (SROH = 7746.38 kb) and the lowest number of ROHs. The length and number of ROHs in the Hetian Qing donkey were lower even than those reported for the Qinghai donkey in Wang Gang's study. This may be due to the following two reasons: (1) Xinjiang is a vast area, so Hetian Qing donkeys are less likely to inbreed there; (2) the Hetian Qing donkey is neither the only local livestock species nor the most important economic and transportation animal, so the intensity of artificial selection is relatively low. The low degree of inbreeding and low selection pressure may be the factors contributing to the short ROHs of the Hetian Qing donkey.
Because of the limited population size and the relatively closed management of the Hetian Qing donkey breed in Xinjiang, China, combined with the lengthening conservation period and increasing generation overlap, a rise in the inbreeding coefficient within the population and a change in genetic structure are inevitable. At later stages, sequencing data can be used to adjust the breeding plan, the breeding method can be altered if necessary, and the stability of the population structure can be maintained through artificial insemination and other methods. This study therefore provides a valuable reference for formulating selection and mating schemes.

Analysis of Genetic Relationship and Genetic Structure
In this study, the 45,557 SNPs that passed quality control were used to construct a G matrix, and the resulting genetic relationship coefficients reflected the true genetic relationships between individuals. Combining the genomic relationship analysis with the cluster analysis, the male donkey samples could be divided into three families, using a genetic relationship coefficient between male donkeys of greater than or equal to 0.1 as the clustering criterion. The population can then be divided into families according to the genetic relationship between female and male donkeys. In addition, 25 female donkeys were only distantly related to the sampled male donkeys, so they were classified as "other". The three male donkeys HTQL_8, HTQL_31, and HTQL_25 were relatively distantly related to one another, which was consistent with the results of the NJ cluster analysis, and the two-dimensional PCA of the 32 Hetian Qing donkeys also showed three points, each forming a cluster far from the other clusters. The three SNP-based analyses in this study gave largely the same results, which supports the reliability of the genetic relationships obtained for the 32 Hetian Qing donkeys.
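The text does not state which formula was used to build the G matrix. As a minimal sketch, assuming the widely used first method of VanRaden (centering allele counts by twice the allele frequency and scaling by the summed heterozygosity term), a genomic relationship matrix can be computed as follows. The genotypes below are invented, coded as 0/1/2 allele counts, and contain no missing calls.

```python
def vanraden_g(genotypes):
    """Genomic relationship matrix from 0/1/2 genotypes.

    genotypes: list of individuals, each a list of allele counts per SNP.
    Assumes no missing genotypes and at least one polymorphic SNP.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Frequency of the counted allele at each SNP, estimated from the sample.
    p = [sum(ind[j] for ind in genotypes) / (2 * n) for j in range(m)]
    denom = 2 * sum(pj * (1 - pj) for pj in p)
    if denom == 0:
        raise ValueError("all SNPs are monomorphic")
    # Center each genotype by its expected value 2p.
    z = [[ind[j] - 2 * p[j] for j in range(m)] for ind in genotypes]
    return [
        [sum(z[a][j] * z[b][j] for j in range(m)) / denom for b in range(n)]
        for a in range(n)
    ]


# Hypothetical toy genotypes: 4 individuals x 4 SNPs.
toy = [
    [0, 1, 2, 1],
    [0, 1, 2, 1],  # identical to the first individual
    [2, 1, 0, 1],  # opposite homozygotes at SNPs 1 and 3
    [1, 1, 1, 1],
]
g = vanraden_g(toy)
for row in g:
    print([round(value, 3) for value in row])
```

In this toy case the two identical individuals have an off-diagonal value equal to their diagonal values, while the individual carrying the opposite homozygotes has a negative relationship with them, meaning it is less related than the average pair in the sample.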
The average inbreeding coefficient of this population was 0.003. To limit inbreeding, it is recommended that male and female donkeys of the same family not be bred together. In the family construction results, the genetic relationship coefficient between each of the 25 "other" female donkeys and every male donkey was less than 0.1, so these females can be mated with any male donkey once the genetic relationship coefficient of the pair has been confirmed. Effective ways to keep the inbreeding coefficient low are to expand the population size, carry out proper selection, and strengthen management to avoid inbreeding caused by incomplete or incorrect pedigree records. This study also found that the male and female donkeys in the conservation population of Hetian Qing donkeys formed three families; the number of male and female donkeys varied greatly among these families, so the family structure was unbalanced. Therefore, it is necessary to introduce new lineages from other regions, expand the core group, implement reasonable selection, avoid inbreeding, and maintain the genetic diversity of the conservation population of the Hetian Qing donkey.
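The family construction and mating rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: males are grouped by linking any pair whose relationship coefficient is at least 0.1 (a simple stand-in for the cluster analysis actually used), each female joins the family of her most closely related male if that coefficient reaches 0.1, and pairs missing from the lookup are treated as unrelated. All IDs and coefficients are invented.

```python
THRESHOLD = 0.1  # relationship coefficient used as the family criterion


def rel_lookup(rel, a, b):
    """Relationship coefficient of a pair; missing pairs count as 0.0."""
    return rel.get(frozenset((a, b)), 0.0)


def male_families(males, rel, threshold=THRESHOLD):
    """Group males into families by linking pairs at or above the threshold."""
    families = []
    for male in males:
        linked = [f for f in families
                  if any(rel_lookup(rel, male, m) >= threshold for m in f)]
        merged = {male}
        for family in linked:
            merged |= family
            families.remove(family)
        families.append(merged)
    return families


def assign_females(females, families, rel, threshold=THRESHOLD):
    """Map each female to a family index, or 'other' if below the threshold."""
    assignment = {}
    for female in females:
        best_family, best_value = "other", threshold
        for index, family in enumerate(families):
            value = max(rel_lookup(rel, female, m) for m in family)
            if value >= best_value:
                best_family, best_value = index, value
        assignment[female] = best_family
    return assignment


def may_mate(female, male, rel, threshold=THRESHOLD):
    """A pair is acceptable when its coefficient is below the threshold."""
    return rel_lookup(rel, female, male) < threshold


# Hypothetical toy data, not taken from the study.
rel = {
    frozenset(("M1", "M2")): 0.25,
    frozenset(("M1", "M3")): 0.02,
    frozenset(("F1", "M3")): 0.30,
    frozenset(("F1", "M1")): 0.05,
    frozenset(("F2", "M1")): 0.04,
}
families = male_families(["M1", "M2", "M3", "M4"], rel)
assignment = assign_females(["F1", "F2"], families, rel)
print(families)
print(assignment)
print(may_mate("F1", "M3", rel), may_mate("F2", "M3", rel))
```

Checking each candidate pair directly, as `may_mate` does, matches the recommendation to confirm the coefficient before mating rather than relying on family labels alone.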

Conclusions
The 32 Hetian Qing donkeys were divided into six families, with marked differences in the number of individuals among families. The genetic relationships between families can serve as a basis for breeding selection. Families 1, 2, and 3 contain only male donkeys (HTQL-8, HTQL-31; HTQL-25; and HTQL-6). During breeding, attention should be paid to preventing the loss of diversity among these three families. The degree of inbreeding of male donkeys was not high, with an average inbreeding coefficient of 0.003, which leaves considerable room for random mating. The conservation effect in the population was good, and future generations can be compared to further assess it.

Figure 1. Distribution of SNPs on each chromosome before and after quality control. Note: The X-axis indicates the chromosome number, and the Y-axis indicates the number of SNPs.

Figure 2. Genetic diversity analysis results. Note: (a) Minimum allele frequency distribution (the X-axis indicates the minimum allele frequency interval, and the Y-axis indicates the SNP proportion). (b) Distribution of the polymorphism information content (the X-axis represents the PIC interval value, and the Y-axis represents the SNP proportion). (c) Heterozygosity analysis (the X-axis indicates the classification of Ho and He, and the Y-axis indicates the heterozygosity value).

Figure 3. ROH analysis results. Note: (a) Distribution of ROH length by population (the X-axis represents the length interval of ROH, and the Y-axis represents the population proportion). (b) Distribution of the ROH number on each chromosome (the X-axis represents chromosome number, and the Y-axis represents ROH quantity). (c) Sample number distribution of individual ROH length (the X-axis represents the length interval of ROH, and the Y-axis represents the number of individuals). (d) The width of the violin chart indicates the probability density distribution of the population's FROH. The wider part of the violin chart indicates that there are a larger number of samples at this level and vice versa.

Figure 4. Analysis chart of kinship relationship.Note: (a) Principal component analysis results (The X-axis is PC1, and the Y-axis is PC2).(b) Visualization results of the genome kinship analysis (The closer the color, the closer the kinship.The X-axis and Y-axis represent the individual IDs).(c) Visualization results of the genetic distance analysis (The closer the color is, the closer the genetic relationship is.The X-axis and Y-axis represent the individual IDs).

Figure 5. Cluster analysis chart. Note: (a) Cluster analysis results of male donkey samples (one color represents a family). (b) Cluster analysis results of all samples (the colored branches of the evolutionary tree are the male donkey samples, and one color represents a family).

Table 1. Summary of quality of original sequencing data.

Table 2. Sequencing data quality assessment.
Note: (1) Sample: sample name; (2) Clean Reads: the number of filtered clean reads; (3) Mapped (%): number of clean reads mapped to the reference genome; (4) Properly mapped (%): the number of clean reads with both ends of the sequenced sequence located on the reference genome and the distance consistent with the length distribution of the sequenced segment; (5) sites_CovgMean: average coverage depth of all sites in the genome; (6) sites_NumCovg1: the proportion of bases with a sequencing depth greater than or equal to 1× relative to the total length of the genome.

Table 3. Statistical results of SNP quality control.

Table 4. Analysis results of population genetic diversity.

Table 5. Results of population and family construction.