Genome-Wide Association Study of Milk Composition in Karachai Goats

Simple Summary The growth rate of dairy goat farming is due to the special qualities of goat milk, which are directly determined by its composition. Goat’s milk has a higher concentration of special fatty acids but less protein than cow’s milk, which is why high-quality cheeses are made from goat’s milk. Therefore, the goal of breeding dairy goats is not only to increase their milk yield but also to improve the quality composition of their milk. Karachai goats are a breed of local selection, the milk of which is widely used in the regions of the North Caucasus for the production of unique cheeses. While a relatively large number of genome studies have been carried out for transboundary goat breeds, such as Saanen goats, there is not yet enough research on local goat breeds. That is why we conducted a genome-wide association study (GWAS) on the milk component traits of Karachai goats. The use of genomic evaluation in the selection of Karachai goats will increase the rate of breeding progress in terms of enhancing milk productivity traits and thereby increase the profitability of their use. Abstract This study is first to perform a genome-wide association study (GWAS) to investigate the milk quality traits in Karachai goats. The objective of the study was to identify candidate genes associated with milk composition traits based on the identification and subsequent analysis of all possible SNPs, both genome-wide (high-confidence) and suggestive (subthreshold significance). To estimate the milk components, 22 traits were determined, including several types of fatty acids. DNA was extracted from ear tissue or blood samples. A total of 167 Karachai goats were genotyped using an Illumina GoatSNP53K BeadChip panel (Illumina Inc., San Diego, CA, USA). Overall, we identified 167 highly significant and subthreshold SNPs associated with the milk components of Karachai goats. A total of 10 SNPs were located within protein-coding genes and 33 SNPs in close proximity to them (±0.2 Mb). The largest number of genome-wide significant SNPs was found on chromosomes 2 and 8 and some of them were associated with several traits. The greatest number of genome-wide significant SNPs was identified for crude protein and lactose (6), and the smallest number—only 1 SNP—for freezing point depression. No SNPs were identified for monounsaturated and polyunsaturated fatty acids. Functional annotation of all 43 SNPs allowed us to identify 66 significant candidate genes on chromosomes 1, 2, 3, 4, 5, 8, 10, 13, 16, 18, 21, 23, 25, 26, and 27. We considered these genes potential DNA markers of the fatty acid composition of Karachai goat milk. Also, we found 12 genes that had a polygenic effect: most of them were simultaneously associated with the dry matter content and fatty acids (METTL, SLC1A 8, PHACTR1, FMO2, ECI1, PGP, ABCA3, AMDHD2). Our results suggest that the genes identified in our study affecting the milk components in Karachai goats differed from those identified in other breeds of dairy goats.


Introduction
In the last decade, dairy goat farming has rapidly developed in Russia.For example, in 2010, out of the total population of 2.1 million dairy and dairy-meat goats, there were only about 340 thousand, whereas in 2022, out of 1.9 million, there were around 750 thousand, which is 2.2 times more [1].This trend corresponds with global tendencies toward increasing the number of dairy goats and goat milk production [2,3].The interest in dairy and dairy-meat goat farming worldwide is driven by the stable demand for environmentally friendly and natural products, including goat milk, which possesses a number of valuable functional properties.It is known that goat milk, unlike cow milk, is characterized by a high dispersion of fat globules, a high content of unsaturated short-chain fatty acids and β-casein, an extremely small amount of alpha-1s-casein (which causes allergic reactions to cow milk), and virtually no monosaccharides.These characteristics determine the dietary properties of goat milk and its advantage in producing high-quality cheeses [4].
The tourism industry is actively developing in the North Caucasus, which, in turn, stimulates an increase in the volume of dairy products with improved functional characteristics, particularly from goat milk.Karachai goats belong to the dairy-meat direction of productivity and are widely spread in this region.They efficiently utilize alpine and subalpine pastures, known for their exceptionally rich flora.However, due to steep slopes and rocky outcrops, these pastures are often inaccessible to other types of livestock [20].
Although genome-wide association studies (GWASs) on goats have been conducted in many countries and on different breeds, the loci associated with quantitative and qualitative traits of milk production remain largely unknown (Saleh, A.A., 2022 [21]).This applies to the study of milk production in Karachai goats as well, which justifies the relevance of this research.
The objective of this study was to identify candidate genes associated with milk production traits in Karachai goats based on a genome-wide analysis of associations.

Sample Collection and Phenotypic Measurements
For the study, a random sampling method was used to select 167 lactating Karachai goats (from peasant/farmer households-KFH "Pyatigorsky" in the Stavropol region and KFH "Maysky" in the village of Kyzyl-Kala, the Karachai-Cherkess Republic), from which ear tissue or blood samples were obtained for DNA extraction.
The quality control and data filtering for each SNP and sample were performed using the plink 1.9 software package (http://zzz.bwh.harvard.edu/plink/,accessed on 5 September 2023), applying the following filters: individual sample call rate no less than 90% for all investigated SNPs (--mind); SNP call rate no less than 90% for all genotyped samples (--geno); minor allele frequency (MAF) greater than 0.01 or 0.05 (--maf 0.01); deviation of SNP genotypes from Hardy-Weinberg equilibrium in the tested samples with a significance level of p-value < 10 −6 (--hwe).Additionally, an assessment of the linkage disequilibrium (LD estimation) was performed for the investigated SNPs with an r 2 < 0.2 and a step size of 50 kb (--indep-pairwise).After data filtering, a total of 47155 SNPs were used in the analysis.

Genome-Wide Association Studies and Gene Analysis
To identify associations between the SNP markers and milk components, a multiple linear regression analysis implemented in plink 1.9 was used.To confirm the significant influence of the SNPs and determine significant regions in the genome of the studied goats, a Bonferroni test was performed with a p-value threshold of <1.06 × 10 −6 ; 0.05/47,155 for genome-wide signification and a p-value threshold for suggestive associations of 2.12 × 10 −5 ; 1/47,155.Manhattan plots illustrating the distribution of significant DNA polymorphisms across autosomes were generated using the ggplot2 package in RStudio [23].
To remove environmental and permanent effects and analyze the normal distribution of the studied traits, generalized linear models were applied using the STATISTICA 10 software.
The typical linear regression model in a genetic association study is: where β G is the parameter of interest quantifying the association between a genotype G and the mean of an outcome Y. Further, X is a small set of p covariates, such as age and gender.Denote X = (1, X, G) and β = (β 0 , β X , β G ).As long as the model for the mean of the outcome (1) is correct, the ordinary least squares estimator β = X T X −1 X T is an unbiased estimator of β.It is a weighted average of the model outcome Y with weights that dependent on the covariate set X.
Gene identification based on SNP positions, as defined according to the ARS1.2 genome assembly, was performed using the Ensembl Genes web resource, release 103 [24].Structural annotation of the genomic regions covering a window of ±0.20 Mb from the identified SNPs was conducted using the DAVID v6.8 program (https://david.ncifcrf.gov,accessed on 10 September 2023) [25].
The analysis of the obtained data shows that the freezing point depression and acidity were characterized as the parameters with the least variability, with a coefficient of variation (Cv) not exceeding 4.2%.The coefficient of variation in the mass fraction of truth protein (PT) and crude protein (PC), as well as β-casein, was in the range of 23-26%, which does not contradict the data from other studies.They were followed by MSNF, lactose, TS, urea, polyunsaturated, saturated, and short-chain fatty acids with a Cv not exceeding 33.0%.The Cv of the other fatty acids ranged within 45.5%.Metabolites such as acetone and BHB demonstrated a great variability, with a Cv exceeding 330.0%.
The lack of variability in acidity may be due to the biological characteristics of goats of this breed and this aspect requires further research.
The range of variability in the content of dry matter, fat, protein, lactose, urea, and saturated fatty acids indicated a normal distribution of these traits in the studied sample.However, monounsaturated fatty acids showed greater variability, which may be related to the individual characteristics of goats in producing certain fatty acids, such as palmitic acid and oleic acid.
The high variability in the levels of acetone and BHB is likely due to the manifestation of hidden ketosis in some goats, leading to a significant increase in the concentration of these substances.For example, the acetone level in certain animals reached 1.62 mmol/L compared to the sample mean of 0.035 mmol/L, while BHB reached 2.36 mmol/L compared to the mean of 0.038 mmol/L.There is no available literature on the normal range of acetone and BHB content in goat milk.However, it is known that in cows, the normal levels are up to 0.72 mmol/L for acetone and 1.2 mmol/L for BHB, with an excess indicating subclinical ketosis [26].
Figure 1 presents the visualization results of the location of statistically significant polymorphic sites across 29 chromosomes for some of the milk parameters in Karachai goats.
exceeding 33.0%.The Cv of the other fatty acids ranged within 45.5%.Metabolites such as acetone and BHB demonstrated a great variability, with a Cv exceeding 330.0%.
The lack of variability in acidity may be due to the biological characteristics of goats of this breed and this aspect requires further research.
The range of variability in the content of dry matter, fat, protein, lactose, urea, and saturated fatty acids indicated a normal distribution of these traits in the studied sample.However, monounsaturated fatty acids showed greater variability, which may be related to the individual characteristics of goats in producing certain fatty acids, such as palmitic acid and oleic acid.
The high variability in the levels of acetone and BHB is likely due to the manifestation of hidden ketosis in some goats, leading to a significant increase in the concentration of these substances.For example, the acetone level in certain animals reached 1.62 mmol/L compared to the sample mean of 0.035 mmol/L, while BHB reached 2.36 mmol/L compared to the mean of 0.038 mmol/L.There is no available literature on the normal range of acetone and BHB content in goat milk.However, it is known that in cows, the normal levels are up to 0.72 mmol/L for acetone and 1.2 mmol/L for BHB, with an excess indicating subclinical ketosis [26].
Figure 1 presents the visualization results of the location of statistically significant polymorphic sites across 29 chromosomes for some of the milk parameters in Karachai goats.Associations were found for the TS parameter with 23 SNPs (5 highly significant and 11 suggestive), distributed across 11 chromosomes.For PT, PC, and Cas.β, 16 SNPs were associated, distributed across 10 chromosomes, with a genome-wide significance observed for 3, 6, and 2 SNPs, respectively, located on chromosomes 1, 2, 8, and 24.
For fat and SFA, associations were also found with 16 SNPs, localized on 12 chromosomes, with 2 and 3 SNPs, respectively, showing genome-wide significance on chromosomes 2, 3, 8, and 25.MUFA and PUFA were associated with 13 and 12 SNPs of suggestive significance, respectively, distributed across nine chromosomes.No genome-wide significant SNPs were identified for these parameters.PUFA, MCFA, and SCFA were associated with 12, 10, and 8 SNPs, respectively, distributed across 16 chromosomes.Among them, three and four SNPs showed genome-wide significance on chromosomes 2, 3, 8, 10, 16, and 25.
Of the 167 highly significant and subthreshold SNPs associated with the studied indicators of the composition and properties of Karachai goat milk, 10 SNPs were localized within the protein-coding genes and 33 SNPs in close proximity to them (±0.2Mb).Functional annotation of all 43 SNPs in the biological library in the DAVID program (https://david.ncifcrf.gov,accessed on 5 September 2023) identified 66 significant candidate genes on 15 chromosomes (Figure 2) (Table S2).
Of the 167 highly significant and subthreshold SNPs associated with the studied indicators of the composition and properties of Karachai goat milk, 10 SNPs were localized within the protein-coding genes and 33 SNPs in close proximity to them (±0.2Mb).Functional annotation of all 43 SNPs in the biological library in the DAVID program (https://david.ncifcrf.gov,accessed on 5 September 2023) identified 66 significant candidate genes on 15 chromosomes (Figure 2) (Table S2).A total of 14 candidate genes associated with the protein and β-casein content in the milk of Karachai goats were identified, localized on six different chromosomes (Table 3).These genes were related to heart development and function (ADRA1A, NKX3-1, NKX2-6, STC1 on chromosome 8, ODAD2 on chromosome 13, CASP7 on chromosome 26), digestive tract development (NKX2-6), bone development (TNN on chromosome 16), muscle tissue Animals 2024, 14, 327 9 of 16 development (NRAP on chromosome 26), protein metabolism processes (BAG2 on chromosome 23), lipid transport, and the maintenance of mitochondrial calcium ion homeostasis (PDZD8 on chromosome 26) (Table 3).A total of 25 candidate genes associated with the fat content and fatty acid composition in the milk of Karachai goats were identified, distributed across 10 different chromosomes (Table 4).
The described functions of these candidate genes are related to lipid metabolism processes and adipose tissue development.For example, the INSIG1 gene (chromosome 4) is involved in triglyceride metabolism, cholesterol biosynthesis, metabolism and homeostasis, the regulation of fatty acid biosynthesis, and adipocyte differentiation.The PAXIP1 gene (chromosome 4) participates in adipose tissue development, the BAAT gene (chromosome 8) is involved in fatty acid metabolism, the PLPPR1 gene (chromosome 8) is involved in phospholipid metabolism, the LACTB gene (chromosome 10) regulates lipid metabolism processes, the FMO1 and FMO2 genes (chromosome 16) are involved in organic acid metabolism, the ECI1 gene (chromosome 25) is involved in the beta-oxidation of fatty acids, the PGP gene is involved in glycerol biosynthesis, and the ABCA3 gene (chromosome 25) regulates cholesterol efflux, phosphatidylcholine and phosphatidylglycerol metabolic processes, lipid biosynthesis regulation, and phospholipid homeostasis.
Among all the identified genes, ECI and PGP should be highlighted, as they are associated with the beta-oxidation of fatty acids and glycerol biosynthesis, making them candidate genes for the fatty acid composition in the milk of Karachai goats.
These genes should be considered potential DNA markers for the fatty acid composition of Karachai goat milk. 1 Note.Traits: Fat-fat content; SFA-saturated fatty acids; MUFA-monounsaturated fatty acids; PUFApolyunsaturated fatty acids; LCFA-long-chain fatty acids; MCFA-medium-chain fatty acids; SCFA-short-chain fatty acids; C14:0-myristic acid; C16:0-palmitic acid; C18:1-oleic acid; TFA-trans fatty acids; № Chr-goat chromosome number; № SNP-sequential number of SNP in Figure 1; SNP-name of the reliable SNP and its position in the genome (indicated as superscript numbers); gene-gene within or in close proximity to which the reliable SNP is localized (genes with localized SNPs are shown in bold).
A total of 17 candidate genes associated with the acetone, BHB, and urea content in the milk of Karachai goats were identified, distributed across six chromosomes (Table 5).The described functions of these candidate genes are related to neurogenesis (ETV6 on chromosome 2 and NTN5 on chromosome 18), immune response (ATF2 on chromosome 2, INPP5D on chromosome 3, CDK6 on chromosome 4, CACNA1C on chromosome 5), the development of the heart, blood vessels, and brain (ATF2 on chromosome 2, CACNA1C on chromosome 5, SPHK2 on chromosome 18), cholesterol metabolism (SULT2B1 on chromosome 18) and carbohydrate and glycoprotein metabolism (NEU2 on chromosome 3, FUT1 on chromosome 18) (Table S2).The summary of data on the significant biological functions of candidate genes allows us to conclude that the largest number-29 genes, or 37.7%-was associated with the development of physiological traits, including response to the external environment (GO:0003298).
The smallest number of genes (four genes or 5.2%) was associated with intrauterine (embryonic) development (GO:0009790) and the same number of genes was associated with immunity (GO:0006955) (Table S2).

Discussion
Goats, as subjects of genetic research, are increasingly attracting the attention of scientists.With the development of the GoatSNP50 BeadChip by Illumina Inc. [27], GWASs (genome-wide association studies) became available for goats of different productivity directions [9,21].Through these studies, it has been possible to identify certain genes associated with various economically important traits in goats, such as fiber productivity, including fiber color [28][29][30], meat productivity traits [31][32][33], reproduction [34][35][36], and adaptation ability [37][38][39].Several studies have focused on identifying genes associated with milk production traits in goats, as well as exterior traits that have a greater impact on level of their development [40][41][42].To further expand research in this direction for dairy goats, work is being undertaken to develop a new panel of liquid SNP chips at a lower cost, making them more accessible for both scientific research and future practical breeding.This panel contains 54,188 SNPs based on genotyping technology using targeted sequencing (GBTS) [43].
Most of the research in dairy goat breeding has been carried out on goats with pronounced milk productivity: Saanen, Alpine, and Toggenburg breeds.In Russia, dairy goat farming, as well as throughout the world, is largely based on the use of goats of these transboundary breeds; however, in certain regions, for example, the North Caucasus, locally selected goats, which include Karachai goats, are becoming more common.It should be noted that for dairy-meat goat breeds in general, and local breeds in particular, not enough genetic research has been carried out, which justifies the relevance of this work.
The objective of the research was to search for candidate genes associated with traits related to milk production in Karachai goats.To determine candidate genes, all identified SNPs were used, both genome-wide (highly significant) and subthreshold significance (suggestive).The largest number of highly significant SNPs were identified on chromosomes 2 and 8, with some SNPs associated with multiple traits.Thus, two SNPs on chromosomes 2 and 8 were shared by protein, β-casein, lactose, and saturated fatty acids.One SNP each on chromosomes 8 and 5 is common to closely related traits such as fat content, all classes of fatty acids, and their trans-isomers.
Comparing our data with the results of other studies, it should be noted that the established SNPs are located on different chromosomes; however, data from other authors have established a connection between SNPs and not one trait but multiple traits.Thus, Scholtens M. et al., (2020) [17] identified 43 SNPs, of which the largest number-31-was located on 19 chromosomes, while 7 were associated with several traits of milk productivity of goats, 16 with milk yield, fat increase, and protein in milk, and 8 with one or two traits.Martin et al., (2018) [14] and Taluarn, E. et al., (2020) [15] also found more "selection signatures" on chromosome 19 than on other dairy goat chromosomes, with some SNPs associated with multiple traits.Other studies found that most of the SNPs associated with increased fat in goat milk other than chromosome 19 were located on chromosomes 14 and 29.Massender E et al. [18] identified 189 unique SNPs occurring on all chromosomes, but regions on chromosomes I6 (86,050,088 and 86,858,026 bp) and CHI14 (80,143,561, 81,347,395 and 81 658,383 bp) were the most significant to the milk composition traits.
Genomic identification of the currently found SNPs associated with the addition of protein and casein and functional annotation of the candidate genes revealed their connection with heart function (ADRA1A, NKX3-1, NKX2-6, STC1, ODAD2, CASP7), the alimentary canal (NKX2-6), the development of bone (TNN) and muscle tissue (NRAP), and protein metabolism processes (BAG2).
In studies by Signer-Hasler H. et al. [44], the STC1 gene, determined using the ROH method, was identified as a novel domestication gene affecting important traits such as body size and milk production in Swiss goats.According to its biological function, it is associated with the contraction of the cardiac muscle cells (group 7 of this study) and milk production, and it can be assumed to be a special marker gene for dairy goats adapted to mountainous areas.
The other genes we identified were not described in studies by other authors as potential genetic markers of the milk protein composition.Results obtained by other researchers on Murciano-Granadina, Norwegian and Sarda goats demonstrated the association of other genes (CSN1S1, CSN1S2, CSN2, CSN3, as well as α-LA) with milk yield and milk composition (amount of protein, fat, dry matter, lactose and somatic cells) [45][46][47].
In several other studies, the polygenic nature of the influence of individual genes on the compositional characteristics of goat milk is noted.For example, the AGPAT6 gene (1-acylglycerol-3-phosphate-O-acyltransferase) is considered to be associated with both decreased milk production and increased milk protein and fat content [48].The PRL (prolactin) and PRLR (prolactin receptor) genes are associated with lactose, protein, and fat content [49].The ACACA gene (acetyl-CoA carboxylase alpha) is involved in the regulation of enzymes involved in fatty acid biosynthesis and is associated with milk protein and fat content [50].The SCD gene (Stearoyl-CoA desaturase), which is involved in the synthesis of monounsaturated fatty acids in the adipose tissue and mammary glands, directly influences the profile of other fatty acids in goat milk [51].The PPARγ gene (Peroxisome Proliferatoractivated Receptor Gamma) encodes a protein that has a high impact on the transcription of genes such as LPL, FASN, ACACA, SCD, PLIN2, PLIN3, FABP3, and PNPLA2, which are involved in lipid metabolism [52].In turn, the SCD gene alters the composition of long-chain fatty acids (LCFA), and its expression is directly regulated by SREBP-1 and PPARγ1 in lactating goat mammary gland cells [53].
A comparison of our data with the results of other researchers made it possible to identify analogies in relation to such genes as INSIG1, BAAT, and LACTB.A study of the INSIG1 gene and its expression showed that it is one of the genes that controls the synthesis of milk fat in goats' mammary gland cells [54].In our studies, the INSIG1 gene was identified in structural indicators such as polyunsaturated and long-chain fatty acids and oleic acid, and, within it, there is a reliably obtained SNP (snp1935-scaffold1053-1528396).
Analysis of the BAAT gene in a population of Liaoning cashmere goats revealed that the GG genotype is associated with cashmere characteristics and the AG genotype is associated with body size and milk production traits [55].In our studies, concerning Karachai goats, the BAAT gene is associated with the mass fraction of fat in the milk.
In our study, we also identified 12 genes that exhibited a polygenic influence.Threequarters of these genes were simultaneously associated with the dry matter content and fatty acids.These genes include METTL, SLC1A8, PHACTR1, FMO2, ECI1, PGP, ABCA3, and AMDHD2, which influenced the content of dry matter and polyunsaturated fatty acids (PUFA), saturated fatty acids (SFA), monounsaturated fatty acids (MUFA), trans-isomers of fatty acids, C14, and C16.Other genes were associated with different traits.For example, the AGTPBP1 gene was associated with PC, β-casein, lactose, and pH; the TNN gene was associated with PT and PC; the ATF2 gene was associated with BHB and acetone; and the NECAP2 gene was associated with TS and urea.

Conclusions
Summarizing the above, it should be noted that the genes identified in our study related to milk components in Karachai goats, which are dual-purpose goats, differed from those identified in relation to milk in dairy goats.Apparently, the genetic architecture in goats with different levels of the main productivity trait, milk production, is different.Additional research on larger populations and different breeds is necessary for a better understanding of the genetic mechanisms underlying milk production in goats.

Table 1 .
Descriptive statistics for milk parameters in the studied sample of Karachai goats 1 .

Table 2 .
Number of SNPs significantly associated with milk composition parameters of Karachai goats 1 .

Table 3 .
Candidate genes associated with protein and β-casein content in the milk of Karachai goats 1 .
1Note.Traits: PC-crude protein; PT-true protein; Cas.β-β-casein; № Chr-goat chromosome number; № SNP-sequential number of SNP in Figure1; SNP-name of the reliable SNP and its position in the genome (indicated as superscript numbers); gene-gene within or in close proximity to which the reliable SNP is localized (genes with localized SNPs are shown in bold).

Table 4 .
Candidate genes associated with fat content and fatty acid composition in the milk of Karachai goats 1 .

Table 5 .
Candidate genes associated with acetone, BHB, and urea content in the milk of Karachai goats 1 .