Genetic Regulation of Biomarkers as Stress Proxies in Dairy Cows

Stress in livestock reduces productivity and is a welfare concern. At a physiological level, stress is associated with the activation of inflammatory responses and increased levels of harmful reactive oxygen species. Biomarkers that are indicative of stress could facilitate the identification of more stress-resilient animals. We examined twenty-one metabolic, immune response, and liver function biomarkers that have been associated with stress in 416 Italian Simmental and 436 Italian Holstein cows which were genotyped for 150K SNPs. Single-SNP and haplotype-based genome-wide association studies were carried out to assess whether the variation in the levels in these biomarkers is under genetic control and to identify the genomic loci involved. Significant associations were found for the plasma levels of ceruloplasmin (Bos taurus chromosome 1—BTA1), paraoxonase (BTA4) and γ-glutamyl transferase (BTA17) in the individual breed analysis that coincided with the position of the genes coding for these proteins, suggesting that their expression is under cis-regulation. A meta-analysis of both breeds identified additional significant associations with paraoxonase on BTA 16 and 26. Finding genetic associations with variations in the levels of these biomarkers suggests that the selection for high or low levels of expression could be achieved rapidly. Whether the level of expression of the biomarkers correlates with the response to stressful situations has yet to be determined.


Introduction
The intensification of animal production systems has imposed a wide range of stressors on animals, including heat and metabolic stress, which are likely to increase as a result of climate change [1]. Routine management practices such as diet changes, vaccination and group rearrangement, in addition to normal physiological events, including calving, lactating and weaning, place animals under stress, particularly high-producing dairy cows [2,3]. The consequences of stress include a decrease in immune competence, with the potential were collected and stored in a paper envelope, both for the measurement of biomarkers, as described below. At the same time, blood samples were collected from the coccygeal vein during the morning milking before feeding, into two 10 mL vacutainers, one containing EDTA, for DNA extraction, and the second containing Li-heparin, for haematochemical profiling, as anticoagulants. Li-heparin samples were kept at 6-8 • C for no more than 2 hours until centrifugation at 3000 RPM for 15 min; then, plasma was collected and frozen (−80 • C) until its analysis. The local Farmer and Breeder Association [35] provided information on the animals, specifically days in milk (DIM), parity, milk yield on the day of sampling, and on management practices, including housing, ration composition, and reproductive parameters.
Milk samples were analysed for protein, fat, lactose, urea and somatic cell count (SCC) using a FT-NIR FOSS 6000 (FOSS Analytics, Hilleroed, Denmark). BHB was analysed in milk using the same test as for plasma. Cortisol, a biomarker of chronic stress, was assayed in milk and in hair, both following the procedure described by Sgorlon et al. [37].

Analysis of Phenotypes
The distribution of phenotypic data was tested for normality using a Shapiro-Wilk test. Non-normal phenotypes were adjusted by either the truncation of outliers (±3 SD and/or ±1.5 Q1Q3 values) or square-root transformation. The effect of factors potentially affecting phenotypic data was assessed by the analysis of variance (ANOVA) implemented in R [38], and significant factors (farm, DIM, Body Condition Score (BCS), SCC and milk yield) were included as fixed effects in the GWAS model. Animals with somatic cell counts (SCC) ≥500,000 were considered to have subclinical mastitis and were removed from the dataset.

Genotyping and Quality Controls
Genomic DNA was isolated from whole blood using the GenElute Mammalian Genomic DNA Miniprep Kit (Sigma-Aldrich, Carlscad, CA, USA). A total of 951 cows were genotyped: 152 IS cows were genotyped with the BovineHD Genotyping Beadchip (Illumina, San Diego, CA, USA), containing 777,000 SNPs, and 348 IS and 451 IH cows were genotyped with the GeneSeek GGP Bovine 150K array (Neogene, Lincoln, NE, USA), containing 138,892 SNPs. SNP positions were updated to the most recent bovine reference sequence, ARS-UCD1.2 [39]. SNP markers with more than 10% of missing data, with a minor allele frequency (MAF) less than 1%, and with duplicate physical positions or that were not on autosomes were removed. Animals with more than 10% of missing data were also removed. The quality control was performed using Plink v1.9 [40,41].
To create a uniform set of SNP markers across individuals and breeds, the SNPs in common between the two arrays were extracted, and those present in the 150K array and absent in the HD array were imputed using BEAGLE v4.0 [42,43]. Imputation errors were identified by allelic correlation analysis (r 2 ) calculated in BEAGLE. Imputed SNP markers showing an r 2 lower than 0.75 were excluded from further analyses. After imputation, genotypes were quality-checked. Individuals with extremely low or high observed heterozygosities (ObsHet, average ± 4 SD; lower than 0.3199 or greater than 0.4414) or admixed (i.e. more than 20% of another genetic component, evaluated at K2 from Admixture software v1.3) [44]) were excluded.
Finally, SNPs with MAFs lower than 5% or Hardy-Weinberg equilibrium (HWE) p-values lower than 10 −6 were removed (Supplementary Table S1). After this final filter, the dataset consisted of 350 Italian Simmental cows genotyped at 119,858 SNPs and 343 Italian Holstein cows genotyped at 121,432 SNPs.

Single-SNP Genome-Wide Association Study
The GWAS on all twenty-one plasma, hair and milk biomarkers, described in the Materials and Methods, were performed in GCTA v1.93.0 using the leave-one-chromosomeout (LOCO) procedure [45,46]. The regression model used for estimating the SNP markertrait association was: where Y is the vector of trait values; µ is the overall mean; b is the vector of fixed effects; a is the fixed effect of the SNP genotype; u and ε are vectors of random additive polygenic effects and random residuals, respectively; u~N(0, Aσ 2 a ) and ε~N(0, Iσ 2 ε ), where A is the additive genetic relationship matrix, I is an identity matrix, and σ 2 a and σ 2 ε are the additive genetic and residual error variance, respectively. X, S and Z are the related incidence matrices. The fixed effects included were: farm, the effect of farm; parity, the effect of the number of calvings (three levels: 1, 2 and ≥3); BCS, the fixed effect of body condition score; SCC, the covariate effect of somatic cell count, expressed as a logarithm in base 10; DIM, the covariate describing the effect of days in milk; yield, the covariate effect of milk yield. Associations were considered significant if the p-value was higher than Bonferroni genome-wide significant thresholds at 0.05 (from 4.28 × 10 −7 to 4.41 × 10 −7 , based on the number of SNP markers used in the analyses). For both breeds, the suggestive p-value threshold was set at 10 −5 . Heritability was calculated as the proportion of phenotypic variance explained by all SNPs. First, the restricted maximum likelihood estimation (REML) algorithm was used to estimate the variance components. Then, heritability was estimated from the ratio between additive genetic variance and phenotypic variance (VG/VP) using the GCTA suite [46]. The proportion of phenotypic variance explained (PVE) by a given SNP or haplotype was calculated following the method described by Shim et al., 2015 [47].

Haplotype Genome-Wide Association Study
Haplotypes were calculated and then used as if they were multiallelic genetic markers in a haplotype-based regression GWAS approach. Eagle v2.4.1 [48] was used to phase the data, and haplotypes were constructed with GHap v2.0 [49,50] using overlapping segments of four consecutive SNP markers (sliding windows moving one SNP at a time). The window size was selected based on the average inter-marker distance, which was equal to 21 kbp, and based on the calculated linkage disequilibrium, which extends up to 80 kbp in IH and 60 kbp in IS. Animals were scored 0, 1 or 2 depending on the number of copies of each haplotype allele (HapAllele) they possessed. HapAlleles were used as genetic markers in a GWAS using their allele dose. Before running the GWAS, HapAlleles were filtered, removing those with MAFs lower than 5% or Hardy-Weinberg equilibrium (HWE) p-values lower than 10 −6 . The same model and software (GCTA) used in the single-SNP GWAS were used in the haplotype-based GWAS. Associations were considered significant if the p-value was higher than Bonferroni genome-wide significance thresholds at 0.05 (from 9.16 × 10 −8 to 9.34 × 10 −8 , based on the number of haplotypes used in the analyses). For both breeds, the suggestive p-value threshold was set at 10 −5 .

Genome-Wide Association Meta-Analysis
A meta-analysis combining the data from both breeds was run using the METAL software [51], based on the approach described by Stouffer et al. [52]. First, the direction of the effect and the p-value observed for all SNPs in the analysis of each breed were converted into a signed Z-score. Then, each signed Z-score was weighted by the square-root of the sample size of the two breeds before being summed. The overall Z-score was reconverted into p-values and thresholds were set at a nominal value of 0.05 following the Bonferroni correction for the number of SNP markers (4.71 × 10 −7 ). The suggestive p-value threshold was set at 10 −5 , as for single-breed analyses.

Analysis of Candidate for Putative Causative Variants
Variants within 1 Mbp flanking the most significant SNPs in the single-SNP GWAS were identified using the 1059 Holstein and 156 Simmental genome sequences available from the "1000 bull genome" Run 8 database [53]. Variants in high linkage disequilibrium (LD) (r 2 > 0.3) with the SNP with the lowest GWAS p-value [54] were analysed for the impact on protein structure and function using the Ensembl Variant Effect Predictor (VEP) suite [55]. The effect of non-synonymous variants by the SIFT algorithm [56].
The on-line Lasagna-search 2.0 tool [57] was used to identify the transcription factor binding site (TFBS) in the 2000 bp upstream of transcription initiation sites of CP, GGT1, GGT5 and PON1. Variants extracted from the 1000 bull genome RUN 8 dataset and intersecting TFBS were analysed for LD with the most significant SNPs identified in the GWAS.

Blood Assays and Outlier Analyses
A total of 78 IH and 63 IS had SCCs higher than 500,000, most likely due to subclinical mastitis, and were therefore excluded from the genomic analyses. Blood metabolic parameters recorded in all other animals were within a normal range of physiological values (Table 1) [30,58], and veterinary inspection did not identify any of these cows that had clinical symptoms or any signs of disease.

Single-SNP GWAS Results
The final SNP dataset comprised 343 IH individuals genotyped at 121,432 SNPs, and 350 IS cows genotyped at 119,858 SNP. The dataset used for phasing, to identify haplotypes for the GWAS, comprised 116,904 SNP for IH that identified 872,520 haplotype alleles (HapAlleles) in 120,913 haplotype blocks (HaploBlocks), and 113,592 SNPs for IS that identified 961,249 HapAlleles in 117,452 HaploBlocks. Both single-SNP and haplotype GWAS analyses identified genome-wide significant associations for only three phenotypes: GGT, CP and PON (Supplementary Tables S2 and S3). The heritabilities of these biomarkers range from 0.06 (s.e. 0.14) for CP in IH to 0.69 (s.e. 0.13) for PON in IS (Supplementary Table S1).

Ceruloplasmin (CP)
The single-SNP analysis identified a significant SNP (BTA-49362-no-rs) associated with CP in IH at 118,970 kb (Figure 1), within the CP gene (118,956,369-119,016,682 bp) on BTA1. In the same region, the haplotype analysis detected haplotypes reaching the genome-wide suggestive threshold (Supplementary Figure S1, Supplementary Tables S2 and S3). For IS, a single significant SNP was detected on BTA1 at 119,190 kb in the 3 region of CP. Three significant HapAlleles were found in the interval 118,900-119,220 kb. The two most significant HapAlleles included the CP gene (see Figure 1).

Paraoxonase (PON)
Two SNPs on BTA4 were significantly associated with the level of PON for IH ( Figure  2), at 11,668 and 11,992 kb, which are close to PON1 (12,542,354-12,576,328 bp). Haplotypes that were close to significance were identified in the same region but did not reach the genome-wide threshold (Supplementary Tables S2 and S3). No single SNP reached the significance threshold for IS, but a peak approaching significance was seen that overlapped the significant peak for IH. In addition, 26 significant HapAlleles were identified at nine neighbouring, non-overlapping BTA4 regions (from 10,500 to 18,900 kb, Figure 2), which were close to, but not overlapping, the PON1 coding region. In addition to the association identified on BTA4, which included PON1, a suggestive signal was seen on BTA16 in both IH and IS for the single and haplotype GWAS analyses, and a second lower signal was also seen on BTA26, again at overlapping positions in both breeds (Supplementary Figure S2). These were tested further in a meta-analysis of IH and IS data (see below).

Paraoxonase (PON)
Two SNPs on BTA4 were significantly associated with the level of PON for IH ( Figure 2), at 11,668 and 11,992 kb, which are close to PON1 (12,542,354-12,576,328 bp). Haplotypes that were close to significance were identified in the same region but did not reach the genome-wide threshold (Supplementary Tables S2 and S3). No single SNP reached the significance threshold for IS, but a peak approaching significance was seen that overlapped the significant peak for IH. In addition, 26 significant HapAlleles were identified at nine neighbouring, non-overlapping BTA4 regions (from 10,500 to 18,900 kb, Figure 2), which were close to, but not overlapping, the PON1 coding region. In addition to the association identified on BTA4, which included PON1, a suggestive signal was seen on BTA16 in both IH and IS for the single and haplotype GWAS analyses, and a second lower signal was also seen on BTA26, again at overlapping positions in both breeds (Supplementary Figure S2). These were tested further in a meta-analysis of IH and IS data (see below).

γ Glutamyl Transferase (GGT)
Eight SNPs with genome-wide significance were associated with the level of GGT for IH on BTA17, between 67,998 and 71,462 kb ( Figure 3). Seventeen significant HapAlleles were also identified at five BTA17 regions, between 68,000 and 71,300 kb near to, but not overlapping, the significant SNP. The two most significant HapAlleles (BTA17_B3319_71348361-71461810_TTTG) spanned a region that included GGT5

γ Glutamyl Transferase (GGT)
Eight SNPs with genome-wide significance were associated with the level of GGT for IH on BTA17, between 67,998 and 71,462 kb ( Figure 3). Seventeen significant HapAlleles were also identified at five BTA17 regions, between 68,000 and 71,300 kb near to, but not overlapping, the significant SNP. The two most significant HapAlleles (BTA17_B3319_71348361-71461810_TTTG) spanned a region that included GGT5 (71,443-71,453 kbp) and GGT1 (71,455-71,471 kbp). For IS, three SNPs had genome-wide significance for GGT levels, in the same BTA17 region as that detected in IH between 69,823 and 71,484 kb (Supplementary Figure S3, Supplementary Tables S2 and S3). Four HapAlleles were significantly associated with GGT levels in two regions nested within the regions identified in IH, between 68,900 and 70,500 kb, although the GGT5 and GGT1 genes map outside these regions. 13

Genome-Wide Association Meta-analysis
A meta-analysis [59] combining data from IH and IS for CP, GGT (Supplementary Figures S4 and S5) and PON (Figure 4)

Genome-Wide Association Meta-analysis
A meta-analysis [59] combining data from IH and IS for CP, GGT (Supplementary Figures S4 and S5) and PON (Figure 4) confirmed the significant associations identified in the single-breed GWAS in close proximity to the genes coding for these three proteins (Supplementary Table S4  The BTA4 peak corresponds to the PON1 gene. A significant peak is seen in BTA26 and a suggestive peak is seen in BTA16. An isolated significant SNP marker is seen in BTA17.

Candidate Causal Variant Identification
Within a 500 kb interval upstream and downstream of the most significant SNPs, a total of 1446 variants with r 2 > 0.3 were identified in IH and 633 in IS (Supplementary Table  S5). Examining 5 kb up-and downstream of CP, PON and GGT, 373 SNP were associated with CP in IH, of which 173 were within the boundaries of the gene, and in IS, 86 were associated with CP, of which 82 were within the boundaries of the gene. No significant SNPs were associated with PON1 in either breed. In HS, 76 SNPs were found in GGT1, including the 5 kb flanking regions, of which 52 were within the gene boundaries and 33 were associated with GGT5, of which 16 were within the GGT5 gene.
Of the variants within the genes, those with a moderate-to-high effect on the protein were identified using VEP. Five missense and two frameshift mutations were found in the CP gene in IH. Among them, a non-synonymous variant (rs43265346) that changes an asparagine into a histidine was classified as deleterious by the SIFT algorithm (SIFT score = 0.03). The other variants were classified as tolerated by SIFT. A single missense variation was identified within the GGT1 gene boundaries in IH, which was classified as tolerated by SIFT. No variant of high or moderate effect was identified by VEP in IS within the CP gene.
The analysis of the CP, GGT1, GGT5 and PON1 promoter regions identified 193 variants (Supplementary Table S6). Eleven of these were in LD with the most significant SNPs The BTA4 peak corresponds to the PON1 gene. A significant peak is seen in BTA26 and a suggestive peak is seen in BTA16. An isolated significant SNP marker is seen in BTA17.
In addition to the significant SNP close to CP, PON1, GGT1 and GGT5, the significance of the suggestive signal on BTAs 16 and 26 associated with the level of PON increased. The SNP on BTA16 had an increased p value but did not reach genome-wide significance, while the SNP on BTA26 did reach genome-wide significance threshold in the meta-analysis. The SNP with the highest significance on BTA16 was BovineHD1600012174 at 42,784,481. There are five genes within a 500 kb interval flanking this SNP; the closest is castor zinc finger 1 (CASZ1) followed by peroxisomal biogenesis factor 14 (PEX14), and the other genes are TAR DNA binding protein (TARDP), spermidine synthase (SRM), mannan binding lectin serine peptidase 2 (MASP2) and exosome component 10 (EXOSC10). The signal on BTA26 became the most significant association for PON, higher than the SNP within the PON1 gene. The most significant SNP on BTA26 in the meta-analysis is BovineHD2600003408, at 13,116,737, in the 5 of the domain containing E3 ubiquitin protein ligase 2 (HECTD2) gene. A second SNP near to the significant SNP, BovineHD2600003402, at 13,083,916 bp also maps within HECTD2. Two other genes map within 500 kb flanking this SNP: Polycomb group ring finger 5 (PCGF5) and protein phosphatase 1 regulatory subunit 3C (PPP1R3C). A single isolated significant SNP (BTA-113142-no-rs) on BTA17 at 42,821,258 that was above the significance threshold falls in an intergenic gene-poor region and no gene maps in the 500 kb flanking interval.

Candidate Causal Variant Identification
Within a 500 kb interval upstream and downstream of the most significant SNPs, a total of 1446 variants with r 2 > 0.3 were identified in IH and 633 in IS (Supplementary Table S5). Examining 5 kb up-and downstream of CP, PON and GGT, 373 SNP were associated with CP in IH, of which 173 were within the boundaries of the gene, and in IS, 86 were associated with CP, of which 82 were within the boundaries of the gene. No significant SNPs were associated with PON1 in either breed. In HS, 76 SNPs were found in GGT1, including the 5 kb flanking regions, of which 52 were within the gene boundaries and 33 were associated with GGT5, of which 16 were within the GGT5 gene.
Of the variants within the genes, those with a moderate-to-high effect on the protein were identified using VEP. Five missense and two frameshift mutations were found in the CP gene in IH. Among them, a non-synonymous variant (rs43265346) that changes an asparagine into a histidine was classified as deleterious by the SIFT algorithm (SIFT score = 0.03). The other variants were classified as tolerated by SIFT. A single missense variation was identified within the GGT1 gene boundaries in IH, which was classified as tolerated by SIFT. No variant of high or moderate effect was identified by VEP in IS within the CP gene.
The analysis of the CP, GGT1, GGT5 and PON1 promoter regions identified 193 variants (Supplementary Table S6). Eleven of these were in LD with the most significant SNPs in the single-SNP GWAS and were located in sites recognised by transcription factors ( Table 2). On BTA1, two variants in linkage with the most significant SNP for CP in Italian Simmental were mapped within transcription binding sites upstream of the CP transcription start site. One at position 118,900,034 bp (−1347 bp relative to gene transcription start site) is within a binding site recognised by both Serum Response Factor (SRF) and ETS Transcription Factor (ELK1). The second is at position 118,900,683 bp (−698 bp from the start site) within a Forkhead Box J2 (FOXJ2) binding site. The LD between these variants and the most significant SNP was above the threshold set for linkage disequilibrium (r 2 > 0.3). In Holstein, variants in the same upstream region had a low LD (r 2 < 0.3) with the most significant CP SNP detected in Italian Holstein.
On BTA17, five variants were in LD with the most significant SNP for GGT in IH. On BTA4, no variants in transcription binding sites were identified above the LD threshold (r 2 > 0.3) with the most significant SNP for PON.

GWAS Analyses
Stress is associated with various physiological and metabolic responses that differ among animals. If the predisposition of animals to higher levels of stress can be predicted from variations in the biomarkers measured under non-stressed conditions, these may be used to investigate complex stress responses and to select animals that cope better in stressful situations. The current study investigated the variation in the levels of biomarkers previously reported to differ between stressed animals and controls [30,36,60,61]. Our results showed that there is considerable variation in the levels between individuals of most of the biomarkers tested. The genetic control of expression of these biomarkers was assessed, and their genetic architecture was investigated to determine if they could act as intermediate phenotypes ("endophenotypes") [62] that may be used to investigate stress traits, or applied directly in genomic breeding programmes to improve stress resilience in dairy cattle.
Among the twenty-one putative stress biomarkers tested, genome-wide statistical significance in both single-SNP and haplotype-based GWAS analyses was only found for ceruloplasmin (BTA1) paraoxonase (BTA4) and γ-glutamyl transferase (BTA17). The levels of these three biomarkers had an estimated heritability between 0.06 (s.e. 0.14), for CP in IH, and 0.69 (s.e. 0.13), for PON in IS, although the small number of animals analysed does not allow an accurate estimation of heritability. In each case, a well-defined chromosomal region reached genome-wide significance, which included the gene itself (or in the case of GGT, the gene family), in both breeds analysed, the Italian Holstein and Italian Simmental. This suggests that the control of expression of these biomarkers is regulated by a variation near or close to the gene, and there are no major direct effects of other genetic loci on their expression. The most significant SNP for these biomarkers explains 7.33% and 8.87% of the phenotypic variance for CP in IH and IS, respectively; 7.98% and 6.69% for PON in IH and IS, respectively; and 14.64% and 8.08% for GGT in IH and IS, respectively. Genetic selection could rapidly change allele frequencies of these genes and, hence, the expression of the biomarker, which would be valuable if they can be shown to be associated with improved stress response.
The three biomarkers for which significant genetic associations were found are all primarily produced by the liver with functions related to protection from oxidative stress. Oxidative stress is associated with various diseases, including diabetes, atherosclerosis and cancer [63], and in the brain, where high oxygen consumption is associated with the production of free radicals; oxidative stress has been linked with a range of psychological and anxiety-related disorders in humans [64]. The ability to control the effects of oxidative stress is likely to impact the response of an individual to stressful situations. Two of the biomarkers, CP and GGT, are also associated with the acute-phase inflammatory immune response. Psychological stress can activate an inflammatory response [65]. Proinflammatory cytokines, which are mediators of the inflammatory response, communicate with the brain and affect neuroendocrine activity in the brain, inducing emotional, cognitive, and behavioural changes [66]. Indeed, prolonged exposure to chronic stress results in neurophysiological changes, including glucocorticoid resistance and changes in the cytokine response that increase the susceptibility to stress.
The CP gene is located on BTA1 and is approximately 60 kb long. Ceruloplasmin is an a-2 glycoprotein involved in copper transport and iron metabolism, and it is an antioxidant produced by the liver [67,68]. Levels of ceruloplasmin increase in the plasma during the acute phase of inflammation [69]. Genetically modified murine models of experimental colitis show that CP plays a role in reducing bowel inflammation [70]. In CP knockout mice, the impact of CP deficiency results in the accumulation of iron in the hippocampus that impacts brain neurochemistry and behaviour. The CP knockout mice showed elevated anxiety levels compared with normal mice [71]. This suggests that CP could be involved in reducing anxiety in stressful situations, and indeed in dairy cattle, CP has been shown to increase in relation to stress, e.g. around calving [58,72,73].
The gene encoding the acute-phase protein PON is located on BTA4 and is approximately 33 kb long. PON is a high-density lipoprotein-associated enzyme that protects lipoproteins against oxidation. PON1 knock-out mice on a high-fat, high-cholesterol diet develop atherosclerosis more rapidly, and the LDL in their artery walls is more highly oxidised than in controls [74]. In contrast, in PON1-overexpressing transgenic mice, there are fewer lesions and the oxidation status of aorta is improved. Oxidative stress and the consequent lipid oxidation are thought to play a role in psychiatric disorders. In humans, decreased PON activity and specific polymorphisms in the gene have been associated with clinical conditions, including depression, generalised anxiety disorder (GAD) and schizophrenia [75]. The concentration of PON has been shown to be inversely related to oxidative stress [76] and decreases during the acute-phase response induced by stress [36], e.g. associated with calving [60,73]. Six polymorphisms in the PON1 gene promoter region have been found to be associated with plasma PON levels in Brazilian Holstein [77].
The GGT1 and GGT5 genes are located on BTA17 and are approximately 22 kb long. The GGT enzyme is a biomarker of liver function. GGT is a membrane-bound glycolprotein, the production of which increases during the acute-phase response to stressinduced inflammation and oxidative stress [78]. Under conditions of oxidative stress, GGT plays a key role in the biosynthesis of glutathione (GSH), which is a potent antioxidant [79]. GGT1 knock-out mice showed chronic GSH deficiency, growth retardation, skeletal abnormalities and cataracts, and they died younger [80]. GGT1 is in a large linkage block containing numerous genes, including GGT1, GGT5, GSTT1 and GSTT. GGT1 knock-out mice have no glutathione activity in the spleen, small intestine and kidney, suggesting that GGT5 does not compensate for the lack of GGT1.
The only SNPs above the genome-wide threshold for significance in the individualbreed GWAS were those associated with CP, PON1, GGT1 and GGT5. However, two additional signals were identified on BTAs 16 and 26 associated with the expression of PON. The signal on BTA16 was above the genome-wide suggestive, the one on BTA26 above the significant threshold in the meta-analysis, in which data from both breeds were combined in the GWAS. Indeed, the significant SNP on BTA26 had a higher p value than the SNP associated with PON on BTA4. The significant BTA26 SNPs are within the domain containing the E3 ubiquitin protein ligase 2 (HECTD2) gene, which is involved in the function of the innate immune system and is a crucial regulator of cytokine secretion [81]. The closest gene to the SNP with the highest suggestive-significance on BTA16 is castor zinc finger 1 (CASZ1). CASZ1 is involved in blood pressure regulation [82], neuronal development [83] and inflammation [84], all functions that are affected by stress. The next closest gene is peroxisomal biogenesis factor 14 (PEX14), which also plays a role in countering oxidative stress [85]. Interestingly, PEX14 seems to have been under selection during domestication [86], where docility and the tolerance of stress are key traits.

Search for Candidate Causative Variants
Significant GWAS associations were identified for the plasma levels of CP, PON and GGT. The significant SNPs in the GWAS were close to the locations of the genes coding for these proteins. Therefore, it is likely that the functional variants are close to these genes in sequences regulating their expression levels. A search of the 1000 bull sequence data identified a total of 556 variants in LD with the most significant SNP for each of the three genes, of which 457 were in CP, 373 in IH and 86 in IS, of which two were in common. None of the variants were above the LD threshold for PON1, in either breed. In IH, 76 variants were in GGT1 and 33 in GGT5, most of which were within the coding region or introns (Supplementary Table S5). Among these, a single missense variant in CP in IH was classified as deleterious by SIFT. All other variants were classified as tolerated. Although we cannot exclude that these variants may affect the level of the proteins investigated, we focus our discussion on variants in transcription factor binding sites that are more likely to affect the expression of the three proteins.
The CP gene has 25 variants upstream of the coding start site that affect known transcription factor binding sites, and among these, two were identified that are in strong LD with the significant SNP, one of which, 1347 bp upstream (BTA1:118900034) overlapped Serum Response Factor (SFR) and Elk-1 binding sites. ELK-1 and SRF form a tertiary complex [87] and have several functions, including intermediate gene activation associated with neuronal development, learning and memory [88]. The second at 698 (BTA1: 118900683) is within a FOXJ2 binding site. FoxJ2 can be expressed as two isoforms, FHX.L and FHX.S, that differ in their C-terminal ends. FoxJ2 is expressed early in embryonic development, in many tissues [89].
There have been 54 variants described upstream of the PON1 coding region, among which six have been associated with increased PON expression, e.g. in relation to calving stress [90], of which two, at 12,576,347 and 12,576,853, are in weak linkage (0.2 < r 2 < 0.3) with BovineHD0400003479, the second most significant SNP associated with PON in the present study. The first SNP is in a GATA3 binding site. GATA3 is a transcription factor involved in the development and survival of sympathetic neurons [91]. The second SNP is within a cJUN binding site. The cJUN transcription factor activates the response to oxidative stress due to ROS produced by mitochondrial dysfunction [92]. We also identified a variant at 12,576,853 in linkage with the significant SNP in our study, which has been associated with an increase in PON expression by Silveira et al. [77]. However, we did not find this SNP associated with a transcription factor binding site. Three other SNPs have been reported associated with PON levels within PON1 transcription binding sites in cattle [77]; however, in the present study, these were not in linkage with the SNP significant for PON.
The GGT1 gene has 69 variants upstream of the coding region of which eight were in LD with the significant SNP for serum GGT levels. We identified transcription factor binding sites at four of these at 71,471,816, 71,472,255, 71,472,377 and 71,473,234 bp on BTA17. The SNP at BT17:71471816 is within an activating transcription factor (AFT) binding site, the SNP at BT17:71472255 is within a Vitamin D Receptor (VDR), and the SNP at BT17:71472377 is within a Myc-associated zinc-finger protein (MAZ) binding site and a Specific protein 1 (Sp1) binding site. MAZ expression is increased in chronic inflammation [93]. Sp1 is a zinc finger transcription factor expressed in many tissues. In neuronal tissues, Sp1 and Sp3 expressions is increased in response to oxidative stress, and in rodent models, protects against neuronal loss [94]. The SNP at BTA17:71473234, which is significantly associated with GGT, is within COUP-TF1 and ER-α binding sites. COUP-TF1 is activated in response to oxidative stress, as is the ER-α transcription factor, as discussed for PON1.
The GGT5 gene has 45 variants upstream of the coding region, of which three are in LD with the most significant SNP in the single-SNP GWAS. Only one (BTA17:71455325) is within a transcription binding site, which is SMAD4. Smad-dependent signalling regulates the expression of the TGF-β family of cytokines, which in turn, negatively regulates neuronal morphogenesis during brain development and plays a role in neuronal disease in humans [95,96].

Conclusions
We assessed variations in potential biomarkers of stress and the genetic control of their expression. Three biomarkers, all associated with oxidative stress, showed strong genetic associations that indicated that their expression was primarily under cis-acting control. In the case of PON, two additional genetic loci were identified that were associated with the level of PON expression. Candidate genes at these loci are also implicated in oxidative stress. In this study, the genetic association between levels of biomarkers was examined in non-stressed animals. It would be interesting to assess changes in the levels of these biomarkers in stressed animals, and in particular, if genetic variations predict the response of animals to stress, which would facilitate the selection of animals that respond better to stressful situations and hence remain healthier and more productive.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10 .3390/genes12040534/s1, Figure S1: Manhattan plots showing genetic associations with the levels of ceruloplasmin. Figure S2: Manhattan plots showing genetic associations with the levels of paraoxonase. Figure S3: Manhattan plots showing genetic associations with the level of γ-glutamyltransferase. Figure S4: Manhattan plot of CP single-SNP GWAS meta-analysis. Figure S5: Manhattan plot of GGT single-SNP GWAS meta-analysis. Table S1: Significance of fixed effects and number of animals and SNPs in the working dataset. Table S2: SNPs significantly associated with CP, PON and GGT identified by single-SNP GWAS. Table S3: Haplotypes significantly associated with CP, PON and GGT identified by haplo-GWAS. Table S4: SNPs significantly associated with CP, PON and GGT identified by single-SNP meta-GWAS. Table S5. Variants in CP, PON1, GGT1 and GGT5 genes linked to the most significant SNP associated with CP, PON and GGT. Table S6: Variants in the promoter regions of CP, PON1, GGT1 and GGT5 genes linked to the most significant SNP associated with CP, PON and GGT.  Institutional Review Board Statement: All experimental procedures and the care of animals complied with Italian legislation on animal ethics (DL n.116, 27/1/1992) and adhered to the bioethical rules of the University of Udine. Ethical approval for this study was given by the qualified veterinar-ian responsible for animal welfare of the Department of Agricultural and Environmental Science of the University of Udine.

Informed Consent Statement: Not applicable.
Data Availability Statement: The data presented in this study are available on request from the corresponding author. The data are not publicly available, due to shared ownership with breeder associations.

Conflicts of Interest:
None of the authors declare a conflict of interest.