Identification of Genomic Regions and Candidate Genes Associated with Body Weight and Body Conformation Traits in Karachai Goats

The objective of this study was to identify the SNPs and candidate genes related to body weight and seven body conformation traits at the age of 8 months in the Russian aboriginal Karachai goats (n = 269) by conducting genome-wide association studies (GWAS), using genotypes generated by Goat SNP BeadChip (Illumina Inc., USA). We identified 241 SNPs, which were significantly associated with the studied traits, including 47 genome-wide SNPs (p < 10−5) and 194 suggestive SNPs (p < 10−4), distributed among all goat autosomes except for autosome 23. Fifty-six SNPs were common for two and more traits (1 SNP for six traits, 2 SNPs for five traits, 12 SNPs for four traits, 20 SNPs for three traits, and 21 SNPs for two traits), while 185 SNPs were associated with single traits. Structural annotation within a window of 0.4 Mb (±0.2 Mb from causal SNPs) revealed 238 candidate genes. The largest number of candidate genes was identified at Chr13 (33 candidate genes for the five traits). The genes identified in our study were previously reported to be associated with growth-related traits in different livestock species. The most significant genes for body weight were CRADD, HMGA2, MSRB3, MAX, HACL1 and RAB15, which regulate growth processes, body sizes, fat deposition, and average daily gains. Among them, the HMGA2 gene is a well-known candidate for prenatal and early postnatal development, and the MSRB3 gene is proposed as a candidate gene affecting the growth performance. APOB, PTPRK, BCAR1, AOAH and ASAH1 genes associated with withers height, rump height and body length, are involved in various metabolic processes, including fatty acid metabolism and lipopolysaccharide catabolism. In addition, WDR70, ZBTB24, ADIPOQ, and SORCS3 genes were linked to chest width. KCNG4 was associated with rump height, body length and chest perimeter. The identified candidate genes can be proposed as molecular markers for growth trait selection for genetic improvement in Karachai goats.


Introduction
Goat (Capra hircus) farming is widespread in almost every country all over the world, due to their low prices and high quality of their products, and so it is attracting new farmers and investors [1]. The highest numbers of goats in the world are located in developing countries under extreme climate conditions and under traditional farming systems [2].
In Russia, there are about 2 million goats of 10 breeds of various productivity directions [3]. Of the total number of goats in Russia, 80% are concentrated in three federal regions: Caucasus (40%), South (25%), and Siberia (15.5%) [4]. Karachai goats (Supplementary Materials Figure S1) are one of the most interesting breeds for research due to their ability to survive in the harsh

Animals, Sampling and Genotyping
A total of 269 Karachai goats from six breeding herds, including Darinsk, Kyzyl Kala, Maysky, Piatigorsky, Storozhevaya, and Uchkulan, were randomly selected for our study. The ear tissue samples were collected by trained personnel under strict veterinary rules in accordance with the rules for conducting of laboratory research (tests) in the implementation of the veterinary control (supervision) approved by Council Decision Eurasian Economic Commission No 80 (10 November 2017). The DNA was extracted using the DNA-Extron reagent kit (JSC Sintol, Russia) according to the manufacturer's protocol. The concentration of double-stranded DNA was determined on a Qubit device (Thermofisher, Waltham, MA, USA). The purity of DNA was estimated by the degree of absorption at a wavelength of 260 and 280 nm, using a NanoDrop 8000 micro-spectrophotometer (Thermo Fisher Scientific, DE). The DNA samples were genotyped using the Illumina Goat SNP BeadChip (Illumina Inc., San Diego, CA, USA) comprising of 53,347 SNPs.

Quality Control of Data
Quality control and filtering of genotyping data for each SNP and each sample was performed using the PLINK 1.9 software package (http://zzz.bwh.harvard.edu/plink/ accessed on 1 August 2020) using the following filters (corresponding commands in the PLINK software are given in brackets): call-rate for all studied SNPs for an individual sample is not less than 90% (-mind); call-rate for each of the studied SNPs for all genotyped samples is not less than 90% (-geno); minor allele frequency (MAF) greater than 0.01 or 0.05 (-maf 0.01); and deviation of SNP genotypes from Hardy-Weinberg distribution in the totality of tested samples with a significance p-value < 10 −6 (-hwe). In addition, the linkage disequilibrium of the studied SNPs (LD score) was assessed with R 2 < 0.2 with a step of 50 kb (-indep-pairwise). A total of 269 animals and 47,647 SNPs were included in our final data set after quality control.

Principal Component Analysis
The studied population of Karachai goats originated from six breeding herds. To evaluate the population stratification, we performed principal component analysis (PCA). PLINK v1.9 software was used to perform PCA.

Phenotypic Traits
The records for the body weight (BW) and seven body conformation traits including withers height (WH), rump height (RH), body length (BL), chest perimeter (CP), chest width (CW), chest depth (CD) and rump width (RW) were collected at age 8 months. Elimination of environmental and permanent effects by the method of generalized linear models for fitting descriptive statistics to analyze the normal distribution of the studied animals' features were carried out by STATISTICA 10 software: where: y-the corresponding GLM (general linear model) animal phenotype; HY i -fixed effect "herd-year" of animal birth (i = 1-10); Sex k -the fixed effect of the sex of the animal (k-male, female); Age-the regression effect of age in days at the time the animal was assessed; b 1 -the regression coefficient of the model; animal j -the fixed effect of individual (j = 1-269) weighed by covariance structure for genomic relationship matrix (GRM, N(0, Gσ g 2 )) built from the genomic information using VanRaden's method [23]; e-residual effect of the model.

Genome-Wide Association Studies
To identify associations of SNP markers with the studied pure (direct) phenotypic traits, we used multiple linear regression analysis implemented in PLINK 1.90, preliminarily adjusted studied population according to its structure (-genome, -covar). After a quality check, a total of 53,347 SNPs were used in this analysis. To confirm the significant influence of SNPs and identify significant regions in the goat genome, the Bonferroni null hypothesis test was used: threshold p < 1.05 × 10 −6 ; 0.05/47647 SNP. Data visualization was carried out in the qqman package using the R programming language [24].

Gene Analysis
The SNP positions used for GWAS, specified according to the AdaptMap genome assembly, were converted into the ARS1.2 genome assembly and used for gene identification using the Ensembl Genes release 103 database web resource [25]. Structural annotation was performed for genomic regions covering a window of ±0.2 Mb from the identified SNP. The genes were considered as candidate genes when they (or part of them) were localized within selected 0.4 Mb window. For functional annotation and gene ontology (GO) term enrichment analysis for identified structural candidate genes, we used the Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.8 software [26]. Significant annotation clusters were selected using an enrichment score of more than 1 and a p-value < 0.05.

Population Stratification
The principal component analysis shows a distribution of the studied population between three clusters. The first principal component (PC1), which is responsible for 8.76% of genetic variability, clearly separated the Maysky herd from five remaining herds, while the principal component two (PC2), which explained 4.86% of genetic variability, distinguished the Darinsk, Kyzyl Kala and Storozhevaya herds from the Uchkulan herd. The individuals of the Piatigorsky herd were distributed between two clusters ( Figure 1). quality check, a total of 53,347 SNPs were used in this analysis. To confirm the significant influence of SNPs and identify significant regions in the goat genome, the Bonferroni null hypothesis test was used: threshold p < 1.05 × 10 −6 ; 0.05/47647 SNP. Data visualization was carried out in the qqman package using the R programming language [24].
2.6. Gene analysis The SNP positions used for GWAS, specified according to the AdaptMap genome assembly, were converted into the ARS1.2 genome assembly and used for gene identification using the Ensembl Genes release 103 database web resource [25]. Structural annotation was performed for genomic regions covering a window of ±0.2 Mb from the identified SNP. The genes were considered as candidate genes when they (or part of them) were localized within selected 0.4 Mb window. For functional annotation and gene ontology (GO) term enrichment analysis for identified structural candidate genes, we used the Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.8 software [26]. Significant annotation clusters were selected using an enrichment score of more than 1 and a p-value < 0.05.

Population Stratification
The principal component analysis shows a distribution of the studied population between three clusters. The first principal component (PC1), which is responsible for 8.76% of genetic variability, clearly separated the Maysky herd from five remaining herds, while the principal component two (PC2), which explained 4.86% of genetic variability, distinguished the Darinsk, Kyzyl Kala and Storozhevaya herds from the Uchkulan herd. The individuals of the Piatigorsky herd were distributed between two clusters ( Figure 1).  Considering the observed population stratification, we performed GWAS using the first two PCs as covariates.

Genome-Wide Association Studies
The descriptive GLM statistics for body weight and body conformation traits in Karachai goats at age 8 months are summarized in Table 1. The phenotypic value distribution by GLM model for measured traits in studied goats of the Karachai breed is shown in Supplementary Materials Figure S2. Descriptive statistics of the genomic inbreeding coefficient derived from the genomic relationship matrix in Karachai goats by herds revealed the low variation from 3.14 to 6.68% and had a normal distribution for all studied population (Supplementary Table S1, Supplementary Figures S3 and S4). The results of GWAS analysis for body weight and body conformation traits in Karachai goats at age 8 months are shown in Figure 2.  Using GWAS, significant SNPs associated with body weight at age of 8 months were found on Chr1, Chr5, Chr6, Chr10, Chr12, Chr16, Chr18 and Chr24 (Figure 2A). At the same time, significant SNPs related to withers height at age 8 months were located on Chr1, Chr3, Chr8, Chr9, Chr10, Chr13 and Chr18 ( Figure 2B). For rump height, SNPs associated with this trait found on Chr1, Chr3, Chr8, Chr9, Chr10, Chr18, Chr26 and Chr29. This fact is due to the lower variability of these traits in the studied samples of goats ( Figure 2C). For body length at age 8 months, significant SNPs were found on Chr3, Chr10, Chr13, Chr18 and Chr29 ( Figure 2D). Significant SNPs associated with chest perimeter at age 8 months were located on Chr4, Chr5, Chr6, Chr7, Chr9, Chr10, Chr12, Chr13, Chr17, Chr18, Chr19, Chr20 and Chr24 ( Figure 2E). Significant SNPs related to chest width were also found on Chr1, Chr2,Chr3, Chr4, Chr5, Chr7, Chr9, Chr10, Chr12, Chr17, Chr18, Chr20, Chr21, Chr22, Chr24, Chr26 and Chr28 ( Figure 2F). On the other hand, SNPs significantly associated with chest depth at age 8 months were found on Chr9, Chr13, Chr17 and Chr18 ( Figure 2G). Significant SNPs related to rump width were located on Chr1, Chr2, Chr3, Chr4, Chr8, Chr9, Chr12, Chr14, Chr16, Chr18 and Chr25 ( Figure 2H).
Analysis of Q-Q plots ( Figure 2A-H) shows that, for most traits, the observed p-values from the GWAS did not deviate significantly from the expected values, suggesting that the models for GWAS were reasonable. Greater deviations obtained at Q-Q plots for several traits including withers height, rump height, body length, chest width may be the result of some deviation from normal distribution for these traits (Supplementary Materials Figure S2). For this reason, the results produced for these traits should be treated with caution.
Among 241 SNPs significantly associated with studied traits, 56 SNPs were common for two or more traits (1 SNP for six traits, 2 SNPs for five traits, 12 SNPs for four traits, 20 SNPs for three traits, and 21 SNPs for two traits), while 185 SNPs were associated with single traits (Supplementary Materials Table S3).

Candidate Genes
The structural annotation revealed 238 genes, which (or part of which) are localized within a window of 0.4 Mb (±0.2 Mb of causal SNPs). The full list of genes is available in the Supplementary Materials, Table S3. Candidate genes for the studied traits were identified on 21 of 29 goat autosomes. The greatest number of candidate genes was found on Chr13 with 33 candidate genes for 5 traits studied. Candidate genes linked to significant SNPs (p < 10 −5 ) associated with body weight and body conformation traits based on GWAS in Karachai goats at age 8 months are shown in Table 3.  Our results showed that among the total number of identified candidate genes (Supplementary Materials Table S3), the most significant genes associated with body weight at age 8 months are CRADD, HMGA2, MSRB3, MAX, HACL1 and RAB15, which are growth factors, including those with high expression in cell growth and development in animal bodies. APOB, PTPRK, BCAR1, AOAH and ASAH1 genes regulate fatty acids metabolism in cells and are involved in various metabolic processes in animal bodies. These genes were noted as being linked to some body conformation traits such as withers height and rump height. The APOB gene, associated with intrauterine embryonic development, spermatogenesis, development of the nervous system, cholesterol metabolism, fertilization, and postembryonic development, is noted as being related to withers height, rump height and body length. Furthermore, WDR70, ZBTB24 and SORCS3 genes were found to be linked to chest width. Moreover, the KCNG4 gene was showed to be associated with rump height, body length and chest perimeter.
Using the DAVID web tool and a list of 238 genes found within the 0.4 Mb regions surrounding the identified SNPs (Supplementary Materials, Table S3), we performed the analysis of functional annotation and enrichment of GO terms (Table 4). We found two significant annotation clusters enriched with GO terms. One of them is related to the cystatin protein subfamily, while the other one is associated with the cell division process.

Discussion
The study of the goat genome is the main basis for genetic improvement in goat breeding programs. GWAS has been used as a primary strategy to detect QTL (quantitative traits loci) for complex traits [27]. Many genomic regions and candidate genes associated with growth and exterior traits have been identified by means of GWAS in different farm animals [13,14,22,[28][29][30], including goats [31].
Our study aimed to identify genomic regions and candidate genes associated with growth and exterior traits in Karachai goats at age 8 months. Karachai ( Figure S1) is the aboriginal dual-purpose goat breed, which is bred in high-altitude regions of Caucasus [5]. Phylogenetic studies performed based on Goat SNP BeadChip revealed that the Karachai goats were the most distant among other goat breeds in Russia [32].
The free-range raising of Karachai goats in small herds in the high-altitude areas complicated the phenotypic data collection. In this regard, we managed to obtain reliable body measurements and weights for only 269 goats. This sample size is relatively low for classical GWAS and may be considered as a limitation of our study. However, successful results provided by GWAS in low samples were reported in several previous research works. For example, significant associations with body weight, growth-related and body conformation traits were identified by GWAS in 150 Dazu Black goats [18], in 95 Sudanese goats [28], in 69 Egyptian Barki sheep [33], and in 96 Baluchi sheep [34]. Genome-wide associations with other economic important traits were found in sample sizes of 192 animals [35,36]. In this regard, we believe that it possible to use a sample of 269 animals for our research.
Genome-wide association studies of 269 Karachai goats performed for body weight and seven conformation traits at the age of 8 months using 47,647 SNPs identified genomewide SNPs (p < 10 −5 ) for all studied traits, including 5 SNPs for body weight, 7 SNPs for withers height, 9 SNPs for rump height, 6 SNPs for body length, 4 SNPs for chest perimeter, 30 SNPs for chest width, 1 SNP for chest depth, and 5 SNPs for rump width (Table S2). Previous studies identified significant SNPs and obtained genomic regions associated with some of the body conformation traits, which are subjects of our present study. Thus, in Dazu Black goats, 53 significant SNPs were found related to body height, body length, cannon circumference, chest depth, chest width and heart girth [18]. GWAS based on SNP genotypes of 150 Punjab goats and linear body measurements identified two significant SNPs influencing body weight, as well as two, three, four, four and five significant SNPs related to heart girth, height, body length, chest length and pubic bone length, respectively [13]. Several SNPs identified in present work overlapped with genomic regions and QTL associated with body weight and body conformation traits, which were reported in previous studies of goats. Thus, the SNP located on Chr3 (snp6325-scaffold1223-530258, position 107,364,229) was significantly associated with body length (p < 10 −5 ) in our study and was located near the body length QTL (3: 107,533,545-107,548,435), which was reported in Sudanese goats [28]. Moreover, three SNPs at Chr5 including snp54059-scaffold822-1742056 (position 8,501,612), snp54060-scaffold822-1781620 (position 8,540,781), and snp15220-scaffold1622-150966 (position 9,865,450) were located nearby the QTL (2-8 cM) associated with growth, which was reported previously in Markhoz goats [37]. A small number of matches between our study and previous results may be due to several reasons. Firstly, discovery of genomic variants, which are involved in the formation of economically important phenotypes in goats, is still lagging behind other livestock species. Moreover, the genetic differences between the goat breeds due to origin and breeding strategies will probably result in revealing various associations. In the present study, the genes related to body weight at 8 months included CRADD, HMGA2, CRADD, MSRB3, MAX, HACL1 and RAB15. In addition, several genes were associated with body conformations traits in our study, such as APOB, PTPRK, BCAR1, AOAH, ASAH1, WDR70, ZBTB24, SORCS3 and KCNG.
In our study, the gene CRADD (CASP2 and RIPKI domain containing adaptor with death domain) was related to body weight in Karachai goats at age 8 months. It has been observed that the CRADD gene, beside several other genes (SOCS2 and PLXNC1), is localized within the so-called "tall region" on chromosome 10 in mice, mutations in which are related to the appearance of a non-obese tall phenotype in mice [38]. The CRADD gene indirectly affects cysteine proteases involved in apoptosis [39]. Therefore, the increase in cell number observed in the tall phenotype appears to be the result of a change in the apoptotic metabolic pathway [40].
A previous study was carried out on the effect of genes localized within the "high growth region" in mice (CRADD, SOCS2 and PLXNC1), as well as two closely located genes (ATP2B1, DUSP6), on the growth, meat and fat quality in pigs. There was shown to be a significant relationship between these genes and phenotypic parameters, including growth and fat deposition in pigs [41].
Our study revealed an association between the gene HMGA2 (high-mobility group AThook 2) and the body weight in Karachai goats at age 8 months, the HMGA2 gene encodes a small, chromatin-associated protein that belongs to the non-histone chromosomal highly mobile group A of the DNA-binding protein family, this protein can modulate transcription and enhance or inhibit the action of transcriptional enhancers by altering chromatin structure or facilitating the assembly of transcription factor multiprotein complexes [42][43][44].
Several studies have revealed a high level of HMGA2 expression during embryogenesis; analysis of expression patterns showed the main role of this gene in the determination of growth and development [45,46]. Knockout of the HMGA2 gene in mice demonstrated the involvement of this gene in diet-induced obesity [47]. Using CRISPR/Cas9, a null HMGA2 allele was generated in mice in which only the coding sequence was specifically disrupted. Loss of one or both HMGA2 alleles has resulted in a 20% and 60% reduction in body weight, respectively, compared to wild-type littermates, as well as an allometric reduction in skull length, which shows the important role of this gene in mice body weight [48]. Similarity, several previous studies found that the HMGA2 gene has been shown to be related to growth in humans [49,50]. At the same time, in Duroc pigs, the HMGA2 gene was identified as one of the candidate genes linked to growth traits [51]. In numerous studies, evidence of a close association between HMGA2 gene expression and pig weight was also found; the HMGA2 gene is activated only during early postnatal development and controls the total number of cells in the animal. In particular, the level of its expression is proportional to the animal body weight [52,53]. Thus, the involvement of HMGA2 in the regulation of prenatal and postnatal growth in various animal and human species was confirmed. Summarizing, we can conclude that all the functions described above are directly or indirectly associated with average daily gains and consequently with an increase in body weight.
In the present study, the gene MSRB3 (methionine sulfoxide reductase B3) was found to be linked to body weight in Karachai goats at age 8 months. The MSRB3 gene is one of the most important members of the MSRB gene family; it can reduce the catalytic effect of methionine-R-sulfoxide to methionine, particularly as an oxidoreductase [54]. Many previous studies have shown that the MSRB3 gene can influence the shape and size of ears in pigs and sheep [55,56]. Based on genome-wide association analysis, gene silencing and protein precipitation, the MSRB3 gene was considered as a candidate gene affecting the growth performance of cattle [57].
More recent studies have shown that indels in the MSRB3 gene are associated with growth and development indicators (body weight, body length, rump height, chest girth behind the shoulder blades) in four Chinese native cattle breeds [58].
Our research indicates an association between the gene MAX (MYC-associated factor X) and body weight in Karachai goats at age 8 months. The protein encoded by the MAX gene is a member of the bHLHZ family of transcription factors. It is able to form homodimers and heterodimers with other family members which include MAD, MXI1 and MYC. MYC is an oncoprotein involved in cell proliferation, differentiation and apoptosis. MAX, as a partner of MYC, is involved in the control of cell proliferation [59]. The SNP in the MAX gene has been shown to be associated with growth in humans [60].
The gene HACL1 (hydroxyacyl-CoA lyase 1) has been related to body weight; this gene plays an important role in fatty acid oxidation and the fatty acid metabolism process. A previous study on dairy cattle showed that the gene HACL1 has been associated with two metabolites (-α-ketoglutarate and succinic acid), which were identified in the genemetabolite interaction network [61]. For the gene RAB15 (a member of the RAS oncogene family), there is no information on the relationship with animal growth and development in the previous studies.
Our study revealed an association between the genes APOB, PTPRK, AOAH and ASAH1 and several body conformation traits in Karachai goats at age 8 months. The gene APOB (apoliporotein B) has been associated with withers height, rump height and body length. At the same time, the APOB gene has been associated with body growth in chickens [62,63]. Furthermore, a previous study carried out on goats observed that the APOB/HaeIII polymorphism is linked to lactose content, lactation length, and somatic cell score, and also, the APOB/SmaI polymorphism is associated with lactation length, lactose percentage, and total yield of solids non-fat, lactose, protein, fat, and milk [64]. The gene PTPRK (protein tyrosine phosphatase receptor type K) gene has been related to chest perimeter. In a previous study, it was observed that the PTPRK gene has been associated with marbling in Nelore cattle [65]. At the same time, we observed that the gene BCAR1 (breast cancer anti-estrogen resistance1) was related to chest width in Karachai goats at age 8 months, several previous studies carried out on cattle and sheep identified BCAR1 as a candidate regulatory gene of intramuscular fat deposition and fatty acids content [66][67][68].
In the present study, we also observed that the gene AOAH (acyloxyacyl hydrolase) was linked to withers height and rump height, and this gene plays an important role in fatty acid metabolism and the lipopolysaccharide catabolism process [69]. At the same time, the gene ASAH1 (N-acylsphingosine amidohydrolase 1) was associated with rump height and body length, and this gene plays an important role in the fatty acid metabolism process [70].
Most of the candidate genes which we identified as being associated with body conformation traits in Karachai goats at age 8 months play important roles in the metabolism process, which have a direct effect on the size of these traits in animal bodies.
In our research, we observed that the gene WDR70 (WD repeat domain 70) was associated with chest width in Karachai goats at age 8 months. In a previous study, the WDR70 gene has been linked to fertility traits in Chinese and Nordic Holsteins [71]. Moreover, we found that the following genes were associated with chest width in our study: ZBTB24 (Zinc finger and BTB domain containing 24) and SORCS3 (Sortilin related VPS10 domain containing receptor 3). At the same time, KCNG4 (potassium voltage-gated channel modifier subfamily G member 4) has been related to rump height, body length, chest perimeter and chest width in our research. In numerous studies, KCNG4, ZBTB24 and SORCS3 have been related to the immune system and some diseases in human, but there is no information on the relationship with animal body conformation traits in the previous studies. On the other hand, the SORCS3 gene has been associated with coat color trait in goats [72]. Furthermore, we found an association between the ADIPOQ (adiponectin) gene and chest width. The ADIPOQ gene plays a role in the skeletal muscle satellite cells' differentiation into adipocytes and is potentially involved in intramuscular adipogenesis and postnatal muscle growth in goats [73][74][75].

Conclusions
In the present study, a genome-wide association studies analysis was carried out for body weight and body conformation traits in 269 Karachai goats at age 8 months. The analysis showed that SNPs were identified for all studied traits, including body weight (5 SNPs), withers height (7 SNPs), rump height (9 SNPs), body length (6 SNPs), chest perimeter (4 SNPs), chest width (30 SNPs), chest depth (1 SNPs) and rump width (5 SNPs). We identified the candidate genes related to body weight at age 8 months included CRADD, HMGA2, CRADD, MSRB3, MAX, HACL1 and RAB15. At the same time, several genes were obtained associated with body conformation traits in our study, which included: APOB, PTPRK, BCAR1, AOAH, ASAH1, WDR70, ZBTB24, SORCS3, ADIPOQ, and KCNG. These results will be useful for the development of genetic selection programs aimed at genetic improvement and will increase the productive efficiency of goats.