Distribution of Homozygosity Regions in the Genome of Kazakh Cattle Breeds

Runs of homozygosity (ROH) are contiguous stretches of homozygous genotypes that are passed from parents to their offspring. ROHs are suitable for determining population history, inbreeding rates, and the genetic relationships between individuals in the populations, as well as to detect candidate genes responsible for economic traits in farm animals. In this study, we observed that the Kazakh white-headed (KWh) cattle breed (ROH n = 55,976) had a higher number of ROH compared to the Auliekol (AK) breed (ROH n = 13,137). When calculating the mean length of ROH, there were considerable differences between Kazakh white-headed (211.59 ± 92.98 Mb) and Auliekol (99.62 ± 46.48 Mb) populations. The maximum length of ROH was higher in Auliekol cattle (510.25 Mb) than in Kazakh white-headed cattle (498.91 Mb). The average inbreeding coefficient rate was equal to 0.084 ± 0.037 in Kazakh white-headed cattle and 0.039 ± 0.018 in Auliekol cattle. The high frequency of genomic regions showed that the strongest patterns were observed on chromosomes 2, 6, and 26 for KWh and 1, 5, and 14 for AK. The estimation of ROH numbers per animal showed that the number of ROH decreased with increasing ROH length in both populations. The genomic inbreeding coefficient of both cattle breeds was calculated based on the ROH, and ancient inbreeding was observed. The harbored genes within ROH islands were associated with meat growth and milk production.


Introduction
Kazakh white-headed cows and Auliekol cows are two of the main domestic breeds of beef cattle in Kazakhstan, characterized by their meat quality, rusticity, and adaptation to marginal pastures and extreme weather conditions [1][2][3]. Their share in the country is about 90% of the total number of beef cattle. These breeds are raised in almost all regions of the country, and before the agricultural economic reform, there were about 1.2 million of these animals [1].
The Kazakh white-headed cattle breed was created by the reproductive crossing of Kazakh and Kalmyk cattle with Herefords and approved on 30 May 1950 [2]. The Kazakh white-headed breed is characterized by its endurance; high adaptive plasticity; high reproductive, fattening, and feeding properties; early maturity; and good meat quality [2,3].
Work on the Auliekol breed creation began in 1960 at the Moskalevsky breeding farm in the Kostanay region by means of a complex reproductive crossing of three meat breeds to combine their desired inherited traits in the offspring. These traits are the size, high growth, and good milk production, characteristic of the French Charolais breed; horniness, early maturity, and excellent meat forms and quality of Aberdeen Angus; and the reproductive ability of the Kazakh white-headed breed [4]. Auliekol cows are distinguished by high productivity, yielding a large amount of milk, although this breed of cattle was officially registered for meat production [1].
Runs of homozygosity (ROH), as the name suggests, are consecutive runs of homozygous stretches containing single nucleotide polymorphisms (SNP) on a long genomic region. However, there is no agreement on exactly what length of homozygotes qualifies as ROH. At first, Weber J.L. and Broman K.W. [5] defined the concept of ROH as the cross-section lengths of homozygous genotypes that exist in animal parents and are passed equally to their offspring. The ROH length is a sign of animal kinship and can be an indication of consanguinity, as the longer the ROH segment, the more recent inbreeding occurred in pedigree [6,7]. The shorter ROH lengths are due to the presence of a more ancient relationship, which is usually not considered in animal pedigree [7]. As a result, past and present breeding practices play important roles in determining the length of ROH for a particular animal. Inbreeding depression in cattle, which is also elicited by ROH, is manifested by a decrease in the survival and fertility of offspring and is also associated with a decrease in productive qualities, longevity, and the ability to cope with environmental problems [8]. Recent studies [9] have shown that high levels of homozygosity are compatible with life and livelihood in cattle that have been isolated for many years or generations. ROH analysis is used as an alternative to GWAS studies aimed at identifying biological factors affecting the phenotypes of organisms and can contain information on the inbreeding coefficient in the herd. ROH analysis is also used to decipher the economically important genetic traits in farm animals [10] and to compare meat and milk production in cattle [11]. The objectives of this study were to characterize the occurrence of ROH in the population of the Kazakh white-headed and Auliekol cattle breeds and evaluate inbreeding levels, as well as to describe within the ROH islands harbored genes associated with production traits in the two native cattle breeds.

Ethics Statement
All experimental procedures and ethical norms were approved by the Biological Safety and Animal Ethics Committee of the NJSC "West Kazakhstan Agrarian and Technical University named after Zhangir Khan", Kazakhstan, and performed in accordance with the relevant national guidelines on farm animal care (Protocol N4, 9 March 2020).

Sample Collection and Genotyping
To examine the two Kazakh cattle breeds, ear tissue was collected from 1478 specimens, including 479 Auliekol cows and 999 Kazakh white-headed cows. Genomic DNA was extracted and genotyped in Neogen Agrigenomics, Lincoln, NE, USA, according to the manufacturer's protocol by GeneSeek GGP Bovine 150 K, which contains 150,000 SNPs (Neogen Corporation Company, Lincoln, NE, USA).

Quality Control and Data Analysis
PLINK v1.9 software [12] was used for quality control of the genotype data with the following thresholds: a maximum missing rate per SNP of 0.1, a maximum missing rate per individual of 0.1, and a minimum minor allele frequency (MAF) of 0.05. All SNPs not located on autosomes were excluded.
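The three thresholds can be illustrated with a minimal pure-Python sketch. It is not the PLINK implementation: the function names, the genotype coding (0, 1, 2 copies of one allele, `None` for a missing call), and the order of the steps (SNP missingness, then individual missingness, then MAF, mirroring the order in which removals are reported in the Results) are assumptions made for illustration.

```python
def missing_rate(calls):
    """Fraction of missing (None) genotype calls; 0.0 for an empty list."""
    return sum(g is None for g in calls) / len(calls) if calls else 0.0


def minor_allele_frequency(calls):
    """MAF from genotypes coded 0/1/2, ignoring missing calls."""
    observed = [g for g in calls if g is not None]
    if not observed:
        return 0.0
    freq = sum(observed) / (2 * len(observed))
    return min(freq, 1 - freq)


def qc_filter(genotypes, max_snp_missing=0.1, max_ind_missing=0.1, min_maf=0.05):
    """genotypes: dict animal_id -> list of calls, same SNP order for all animals.

    Returns (kept_animal_ids, kept_snp_indices).
    """
    ids = list(genotypes)
    n_snps = len(genotypes[ids[0]])
    # Step 1: drop SNPs with too many missing calls across animals.
    snps = [j for j in range(n_snps)
            if missing_rate([genotypes[i][j] for i in ids]) <= max_snp_missing]
    # Step 2: drop animals with too many missing calls among the remaining SNPs.
    kept_ids = [i for i in ids
                if missing_rate([genotypes[i][j] for j in snps]) <= max_ind_missing]
    # Step 3: drop SNPs whose minor allele frequency is below the threshold.
    snps = [j for j in snps
            if minor_allele_frequency([genotypes[i][j] for i in kept_ids]) >= min_maf]
    return kept_ids, snps
```

The defaults correspond to the thresholds given above (0.1, 0.1, and 0.05); on a toy data set with only a handful of animals, looser thresholds are needed for any SNP with a missing call to survive.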
After quality control of the SNP data, 112,655 variants from 1468 cattle were used to detect ROH segments for each individual with PLINK v1.9, and an R package was applied for additional analysis [13]. The following parameters were applied to call ROH: a minimum of 30 consecutive homozygous SNPs per ROH, a minimum density of 1 SNP per 50 kb, a maximum gap of 500 kb between consecutive homozygous SNPs, and a minimum ROH length of 1 Mb, set to exclude short homozygous segments that can arise from high linkage disequilibrium. ROH were classified into five length classes following the nomenclature of Kirin et al. [7] and Ferenčaković et al. [14,15]: 1-2, 2-4, 4-8, 8-16, and > 16 Mb. The average number of ROH segments per individual, the mean length of ROH per animal, the total length of the autosomal genome, the number of homozygosity segments per chromosome, and the mean length of runs per chromosome were estimated.
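To make the role of each parameter concrete, the following sketch scans one chromosome of one animal for consecutive homozygous SNPs and then assigns each segment to a length class. It is a deliberately simplified consecutive-run scan, not the window-based procedure that PLINK uses, and it treats a missing call as ending a run. The function names, the genotype coding (0 and 2 homozygous, 1 heterozygous, `None` missing), and the handling of class boundaries are assumptions.

```python
def find_roh(positions, genotypes, min_snps=30, max_gap_bp=500_000,
             min_length_bp=1_000_000, max_bp_per_snp=50_000):
    """Return ROH as (start_bp, end_bp, n_snps) for one chromosome of one animal.

    positions: SNP coordinates in bp, sorted ascending.
    genotypes: calls in the same order, coded 0/1/2 or None.
    """
    segments = []
    run = []  # positions of the SNPs in the current homozygous run

    def flush():
        # Keep the run only if it passes the SNP-count, length, and density filters.
        if len(run) >= min_snps:
            length = run[-1] - run[0]
            if length >= min_length_bp and length / len(run) <= max_bp_per_snp:
                segments.append((run[0], run[-1], len(run)))
        run.clear()

    for pos, g in zip(positions, genotypes):
        homozygous = g in (0, 2)
        # A heterozygous or missing call, or an oversized gap, ends the run.
        if run and (not homozygous or pos - run[-1] > max_gap_bp):
            flush()
        if homozygous:
            run.append(pos)
    flush()
    return segments


def roh_class(length_bp):
    """Assign a segment to one of the five length classes (in Mb)."""
    mb = length_bp / 1e6
    for upper, label in ((2, "1-2"), (4, "2-4"), (8, "4-8"), (16, "8-16")):
        if mb < upper:
            return label
    return ">16"
```

For example, 40 homozygous SNPs spaced 40 kb apart form one segment of 1.56 Mb that falls in the 1-2 Mb class, whereas a single heterozygous call in the middle splits it into two runs that are both too short to count.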
The individual inbreeding coefficients based on runs of homozygosity were calculated for each breed as $F_{ROH} = \sum L_{ROH} / L_{genome}$, where $\sum L_{ROH}$ is the sum of the lengths of all ROH discovered in an individual, and $L_{genome}$ is the total length of the autosomal genome covered by SNPs [16]. For each of the studied breeds, individual inbreeding coefficients were also estimated for the five length categories: $F_{ROH}$ 1-2 Mb, $F_{ROH}$ 2-4 Mb, $F_{ROH}$ 4-8 Mb, $F_{ROH}$ 8-16 Mb, and $F_{ROH}$ > 16 Mb.
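The ROH-based coefficient is the summed ROH length divided by the autosomal genome length, and it can be sketched in a few lines. The function name and the optional length bounds are hypothetical; the default genome length is the 2.34 Gb autosomal length reported in the Results, and treating each class as a half-open interval is an assumption.

```python
AUTOSOME_LENGTH_BP = 2_340_000_000  # 2.34 Gb, as reported in the Results


def f_roh(segment_lengths_bp, genome_length_bp=AUTOSOME_LENGTH_BP,
          min_bp=0, max_bp=float("inf")):
    """Summed length of ROH with min_bp <= length < max_bp, divided by genome length."""
    total = sum(length for length in segment_lengths_bp if min_bp <= length < max_bp)
    return total / genome_length_bp
```

Calling `f_roh(lengths)` gives the overall coefficient for one animal, while `f_roh(lengths, min_bp=1_000_000, max_bp=2_000_000)` restricts the sum to segments in the 1-2 Mb class.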
To identify genomic regions associated with a high frequency of ROH, the percentage of the occurrence of SNPs in an ROH was computed by counting the number of times the SNP was detected inside the ROH in the sampled individuals. Then, the proportion of times each SNP falls inside an ROH was plotted against SNP positions along the chromosomes.
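The per-SNP counting step can be sketched as follows for a single chromosome. The data layout (a dictionary from animal identifier to a list of ROH intervals) and the function name are assumptions made for illustration; animals without any ROH are still counted in the denominator.

```python
def snp_roh_frequency(snp_positions, roh_by_animal):
    """Percentage of animals in which each SNP lies inside an ROH.

    snp_positions: SNP coordinates (bp) on one chromosome.
    roh_by_animal: dict animal_id -> list of (start_bp, end_bp) on that chromosome.
    """
    n_animals = len(roh_by_animal)
    percentages = []
    for pos in snp_positions:
        # Count the animals that have at least one ROH covering this SNP.
        hits = sum(
            any(start <= pos <= end for start, end in segments)
            for segments in roh_by_animal.values()
        )
        percentages.append(100.0 * hits / n_animals)
    return percentages
```

Plotting the returned percentages against `snp_positions` for every chromosome yields the kind of genome-wide profile described above.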
To determine genomic coordinates of identified regions associated with ROH, the Genome Data Viewer of the Bos taurus UMD3.1.1 was applied, available at "National Center for Biotechnology Information" (https://www.ncbi.nlm.nih.gov/genome/gdv/?org=bostaurus accessed on 3 March 2022).
To identify genome regions with a high frequency of ROH occurrence, "ROH islands" were generated separately for the two studied cattle breeds. Genes overlapping these regions were retrieved using Ensembl BioMart (https://may2012.archive.ensembl.org/biomart/martview/7b00b694654ab243c679ef8376bd080b (accessed on 3 March 2022)) based on the UMD3.1 bovine genome assembly. The Panther Classification System was then used to characterize their molecular functions and biological processes [17].

Results
In this study, we analyzed 1478 individuals from two Kazakh cattle breeds. During filtering, 1752 variants were removed because of a missing genotype rate of > 10%, 10 cattle were excluded because more than 10% of their genotypes were missing, and 5434 SNPs were removed for failing the MAF threshold (--maf 0.05). As a result of the quality check, we retained 112,655 SNPs and 1468 individuals. The BeadChip SNP panels were used to characterize the distribution of runs of homozygosity across the genomes of the animals. Using 112,655 SNPs in 1468 individuals of two breeds, a total of 69,113 ROH segments were identified, with 13,137 segments for the Auliekol breed and 55,976 segments for the Kazakh white-headed breed. The distribution of ROH along the autosomal chromosomes is shown in Figure S4a,b. The average number and length of ROH per individual are the most important measures for characterizing ROH composition. In our study, the mean number of segments per animal was 27.15 ± 12.48 with a range of 2-84 in the Kazakh white-headed cows and 12.1 ± 3.5 with a range of 2-24 in the Auliekol cows.
The mean length of ROH per individual was considerably higher in Kazakh white-headed cattle than in the Auliekol breed (211.59 ± 92.98 Mb and 99.62 ± 46.48 Mb, respectively), though the maximum length of ROH was higher in Auliekol (510.25 Mb) than in the Kazakh white-headed cattle (498.91 Mb). Moreover, the Auliekol breed was characterized by a higher minimum length of ROH relative to Kazakh white-headed cattle, representing 29.92 Mb and 8.02 Mb, respectively (Table 1). In the population examined, the total length of the autosomal genome was equal to 2.34 Gb. The highest number of ROH was recorded for chromosomes 5 (1281) and 6 (4019), while the lowest number of ROH was observed for chromosomes 25 (124) and 18 (163) in the Auliekol and Kazakh white-headed breeds, respectively (Figure 1). The first 6 out of 29 chromosomes had a higher percentage of ROH in both breeds (Figure S1). Furthermore, the mean length of runs per chromosome was calculated, varying between 2.53 and 4.48 per chromosome. In the Auliekol population, the mean length of runs per chromosome was higher than in Kazakh white-headed cows. The distribution of ROH segments by length showed that short segments were more numerous than long ones in both populations, with 5740 and 23,020 ROH in the 1-2 Mb class in the Auliekol and Kazakh white-headed breeds, respectively.
Among all the identified ROHs, the frequencies of ROH in different length classes were 44% (shorter than 2 Mb), 31.3% (2-4 Mb), 16.6% (4-8 Mb), 6.3% (8-16 Mb), and 1.9% (longer than 16 Mb) in the Auliekol breed. In the Kazakh white-headed breed, the frequencies of ROH were 41.1% (shorter than 2 Mb), 36.5% (2-4 Mb), 17.1% (4-8 Mb), 4.4% (8-16 Mb), and 0.9% (longer than 16 Mb). When comparing the distribution of the number of ROHs in different classes, there was no notable difference between the two breeds (Figure S2). We calculated average ROH length in the five categories, and all the average lengths of ROH classes were similar in the two populations, except those longer than 16 Mb (Figure S3). Comparing the average length of ROH in the five categories, Auliekol cattle had the shortest (1.45 Mb) and longest (25.41 Mb) lengths.
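As a quick arithmetic check, the share of ROH in the 1-2 Mb class can be recomputed from the segment counts reported above (5740 of 13,137 in Auliekol and 23,020 of 55,976 in Kazakh white-headed cattle). The helper name is hypothetical.

```python
def class_share_percent(class_count, total_count):
    """Share of ROH falling in one length class, as a percentage rounded to 0.1."""
    return round(100 * class_count / total_count, 1)


print(class_share_percent(5740, 13137))   # 43.7, reported as 44% for Auliekol
print(class_share_percent(23020, 55976))  # 41.1, as reported for Kazakh white-headed
```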
Furthermore, the average inbreeding coefficient values were estimated in the five classes; the range of variation is presented in Table 2. The average genomic inbreeding ($F_{ROH}$) coefficients of KWh cattle declined steadily with increasing ROH length, from 0.084 ± 0.037 to 0.016 ± 0.013. As in KWh cattle, the average $F_{ROH}$ value of the AK breed decreased consistently with increasing ROH length (Table 2). Generally, the average $F_{ROH}$ of KWh cattle was much greater than that of AK, but the same value was observed in the category longer than 16 Mb. To identify the genomic regions associated with ROH, the frequency of individual SNPs within ROH was estimated, and the results were visualized with Manhattan plots for each breed. The Manhattan plots showed that the highest frequencies of SNPs in the ROH regions were found in KWh cattle on chromosomes 2, 5, 6, and 26 and in AK cattle on chromosomes 1, 5, and 14 (Figure 2a,b).
The genomic regions in the top 20% by ROH frequency contained a total of 279 and 1471 SNPs in the AK and KWh populations, respectively. The lowest and highest numbers of SNPs were observed on chromosomes 1 (10 SNPs) and 5 (209 SNPs) in AK cattle and on chromosomes 13 (5 SNPs) and 6 (982 SNPs) in the KWh breed, respectively (Table 3).
The results of the classification of genes by biological process, without enrichment analysis, are presented in Table 4. In these two breeds, most genes are involved in the metabolic process (from 44 to 46 genes), cellular process (from 59 to 69 genes), and biological regulation (39 genes for both breeds). A distinctive feature of Kazakh white-headed and Auliekol cattle is the high content of genes associated with the functioning of immune system processes, signaling, multicellular organismal processes, localization, response to stimuli, and developmental processes.
Table 4. Genes with the highest frequency of ROH occurrence and their associated biological processes within genome regions.
The associated genes involved in significant pathways are listed in Table 5. The number of pathways identified differed between the two breeds. For the Auliekol breed, genes associated with PDGF and JAK/STAT signaling pathways were identified. In the case of Kazakh white-headed cows, characteristic pathways associated with angiogenesis, the ubiquitin proteasome pathway, the heterotrimeric G-protein signaling pathway, and the Gi alpha and Gs alpha-mediated pathways were identified. In the regions of the genome with a high frequency of occurrence of ROH, a number of genes were also identified with a confirmed effect on the level of production characteristics, including STAT6, STAT2, ITCH, MSTN, PDGFRA, and ERBB3.

Distribution of ROH
In the present study, the Illumina GeneSeek GGP Bovine 150K BeadChip was used to characterize the frequency and distribution of ROH in the genomes of two native Kazakh cattle breeds. After filtering, 112,655 SNPs remained, and a total of 69,113 ROH were found. The identified total number of ROH in the Kazakh white-headed breed was higher than in the Auliekol breed. The pattern of ROH differed considerably among the breeds [18]. However, this difference may reflect sample size, because fewer Auliekol than KWh animals were sampled. On the other hand, when Rui Xie et al. investigated three pig breeds with different sample sizes (Landrace n = 83, Songliao n = 86, and Yorkshire n = 477), the Yorkshire population had a greater number of ROH in different length categories compared to Landrace and Songliao, but the mean number of ROH did not differ significantly between the three populations [19]. Furthermore, our findings showed that the mean number of ROH per animal was 27.15 ± 12.48 for the Kazakh white-headed cattle and 12.1 ± 3.5 for the Auliekol cattle. The genome of the Auliekol breed contained a low proportion of ROH. Similar results were observed in Fleckvieh animals, for which the authors attributed the low proportion of ROH to a larger effective population size [20]. The Auliekol result was also similar to that for Red Holstein, although higher than those for Angler and Red-and-White dual-purpose cows, highlighting that the genetic diversity was high in the genomes of these breeds [21]. The average number of ROH per animal observed in KWh was comparable to the results for White-Backed, Polish Red, and Polish Red-and-White cattle breeds [20]. Kazakh white-headed and Auliekol genomes are composed mostly of short segments. This result is consistent with the finding reported by Purfield et al. [18].
Similarly, the average length of ROH was greater in KWh (211.59 Mb) than in AK (99.62 Mb), although the longest mean length of the segment was observed in AK cattle. This indicates that the ancient and recent autozygosity patterns are low in AK cows. When calculating the ROH number per chromosome, the highest value was observed on chromosome 5 of the AK breed. Charolais cattle also had the highest ROH numbers on chromosome 5 [22]. Both cattle breeds are reared for meat purposes, and some cattle breeds were characterized by greater ROH numbers for BTA5 [23,24]. In the KWh cattle population, the maximum number of autozygosity segments was found on chromosome 6. The percentage of ROH coverage across autosomes was greater in the first six chromosomes than in the other chromosomes. This could be because of the larger size of these chromosomes.

ROH-Based Genomic Inbreeding Coefficients
Marras et al., who examined some cattle breeds farmed in Italy, reported that the most frequent ROHs were found in the 1-2 Mb length class in Piedmontese and Simmental cows; both of these breeds had high inbreeding rates in ancient times [25]. In the present study, the frequency of ROH in the five length classes showed that the shorter than 2 Mb length category had a higher ROH proportion compared to all other classes in both AK (44%) and KWh (41.1%) populations, which may be indicative of ancient inbreeding. According to many studies, $F_{ROH}$ is a more powerful tool than $F_{PED}$ for estimating both past and recent relatedness, and it captures inbreeding more accurately over many generations. Furthermore, to clarify the history of genomic inbreeding coefficients, we computed the average inbreeding coefficient values according to the different length categories. The obtained results showed that the level of $F_{ROH}$ ranged from 0.084 ± 0.037 (1-2 Mb) to 0.016 ± 0.013 (>16 Mb) in KWh, whereas it was between 0.039 ± 0.018 (1-2 Mb) and 0.016 ± 0.016 (>16 Mb) in the AK population. When comparing the mean inbreeding coefficients in different length classes, the highest $F_{ROH}$ value was found in the 1-2 Mb category for both breeds. Similar results regarding $F_{ROH}$ were observed in Piedmontese and Simmental breeds, the genomic inbreeding coefficients of which were 0.041 and 0.083, respectively [25]. Our estimated $F_{ROH}$ findings confirmed that our two explored breeds had high ancient inbreeding coefficients compared to recent inbreeding coefficients. This outcome may be consistent with the foundation of breeds. When a new breed is initially reared, there is a limited number of animals in the herd, and inbreeding is inevitable.

Genomic Regions within ROH
High frequencies of ROH occurrence associated with the pathways were observed separately for each cattle breed. In the Auliekol breed, the growth factor-related genes INHBC, INHBE, and GDF11 were identified; the first two are part of the TGF-beta (transforming growth factor-beta) signaling pathway, which can affect muscle atrophy [22]. The ERBB3 gene, which belongs to the EGF (epidermal growth factor) receptor signaling pathway, was also detected. Figeac et al. showed that this gene is responsible for the control of myogenic diversity and the proliferation of muscle stem cells [26]. All four genes can be associated with meat traits in the Auliekol breed. In addition, the STAT6 and STAT2 genes were found to be involved in multiple functional pathways: the PDGF signaling pathway, JAK/STAT signaling pathway, EGF receptor signaling pathway, interleukin signaling pathway, and chemokine- and cytokine-mediated inflammation, described by Cobanoglu et al. [27]. It was found that these STAT family genes may potentially be associated with milk traits in cattle [27]. Studies of phenotypic data, including data on the genotypes of the Auliekol breed [1], show that this breed has high productivity potential in both the meat and dairy sectors. This may lead to an increase in the number of livestock for different types of farms, both dairy and meat, or in the number of mixed-type farms in Kazakhstan.
In the Kazakh white-headed cattle breed, we identified the PDGFRA gene, which has a role in cell survival, and the KDR gene, which is a receptor for the main growth factor VEGF (vascular endothelial growth factor) [28]. These two genes are included in the angiogenesis pathway (P00005). The PDGFRA gene may help explain the adaptability and survival of the Kazakh white-headed breed under high temperatures in the summer and low temperatures in the winter, as Kazakhstan has this type of harsh climate [1,26]. The GNRHR gene, observed in the heterotrimeric G-protein signaling pathway and Gi alpha and Gs alpha-mediated pathways (P00026), is an important factor in reproductive function [29]. Marras et al. [30] and Szmatoła et al. [22] observed ROH islands on chromosomes 2 and 6, and the identified ROH islands on chromosome 2 were associated with the MSTN gene in Piedmontese and Limousin cattle breeds [22,30]. A similar finding resulting from strong selection pressure on the MSTN gene was observed in this study concerning the Kazakh white-headed breed. In particular, the Kazakh white-headed breed should be investigated in detail since the population of this breed in Kazakhstan is growing every year.

Conclusions
It is worth noting that this is the first study to characterize the distribution and frequency of ROH in the genome of Kazakh white-headed and Auliekol cattle breeds. The obtained results revealed that short ROH were more frequent than long ROH in both cattle populations. Genomic inbreeding coefficients calculated based on the ROH demonstrated higher ancient inbreeding rates compared to recent inbreeding coefficients. Hence, we can assume that mating strategies are going well in herds, as breeders usually avoid crossing closely related individuals. The genes located in ROH islands need to be investigated further in Kazakh cattle breeds. Several genes within these genomic regions were observed, with confirmed effects on the level of production characteristics, including STAT6, STAT2, ITCH, MSTN, PDGFRA, and ERBB3. The results of this study and, accordingly, the identified genes can become the basis for further research aimed at identifying genes and markers to determine economically useful traits in cattle. Our findings provide further insight into the genomes of the two studied breeds.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/d14040279/s1. Figure S1: The percent distribution of ROH per chromosome in the two cattle populations. Figure S2: Percentage of distribution of the number of ROH in different classes. Figure S3: The mean length of runs per size class. Figure S4: (a,b) The detected ROH segments in each individual against their position along the chromosome.
Author Contributions: I.B., data processing, analysis, and manuscript preparation. A.B. (Alena Belaya), study design, coordination of material collection, and experimental work. K.D.: interpretation and discussion of results, data analysis, and manuscript preparation. A.B. (Anuarbek Bissembayev) and A.K., experimental work. A.S., research management and coordination of manuscript preparation. K.K. and A.N., project supervision. All authors have read and agreed to the published version of the manuscript.

Funding:
The work was carried out within the framework of the project of program-targeted financing of the Ministry of Agriculture of the Republic of Kazakhstan for 2021-2023 BR10764981 "Development of technologies for effective management of the selection process of preserving and improving genetic resources in the beef cattle breeding" and grant funding for fundamental and applied scientific research of young scientists on scientific and (or) scientific and technical projects for 2020-2022 of the Ministry of Education and Science of the Republic of Kazakhstan AP08052960, "Breed-specific QTL-marking of meat productivity of cattle of Auliekol and Kazakh white-headed breeds based on full-genome SNP chipping" (state registration no. 0120RK00043).

Institutional Review Board Statement:
The study was approved by the Ethics Commission of the NJSC "West Kazakhstan Agrarian and Technical University named after Zhangir Khan", Kazakhstan (Protocol N4, 9 March 2020).

Data Availability Statement:
The SNP data of two Kazakh cattle breeds generated for this project are available on https://doi.org/10.5061/dryad.3bk3j9kk3 (accessed on 3 March 2022) and can be downloaded upon request.