Genome-Wide Association Studies for Milk Somatic Cell Score in Romanian Dairy Cattle

Mastitis is one of the most frequently encountered diseases in dairy cattle, negatively affecting animal welfare and milk production. For this reason, contributions to understanding its genomic architecture are of great interest. Genome-wide association studies (GWAS) have identified multiple loci associated with somatic cell score (SCS) and mastitis in cattle. However, most of the studies have been conducted in different parts of the world on various breeds, and none of the investigations have studied the genetic architecture of mastitis in Romanian dairy cattle breeds up to this point in time. In this study, we report the first GWAS for SCS in dairy cattle breeds from Romania. For GWAS, we used an Axiom Bovine v3 SNP-chip (>63,000 Single Nucleotide Polymorphism -SNPs) and 33,330 records from 690 cows belonging to Romanian Spotted (RS) and Romanian Brown (RB) cattle. The results found one SNP significantly associated with SCS in the RS breed and 40 suggestive SNPs with −log10 (p) from 4 to 4.9 for RS and from 4 to 5.4 in RB. From these, 14 markers were located near 12 known genes (AKAP8, CLHC1, MEGF10, SATB2, GATA6, SPATA6, COL12A1, EPS8, LUZP2, RAMAC, IL12A and ANKRD55) in RB cattle, 3 markers were close to ZDHHC19, DAPK1 and MMP7 genes, while one SNP overlapped the HERC3 gene in RS cattle. Four genes (HERC3, LUZP2, AKAP8 and MEGF10) associated with SCS in this study were previously reported in different studies. The most significant SNP (rs110749552) associated with SCS was located within the HERC3 gene. In both breeds, the SNPs and position of association signals were distinct among the three parities, denoting that mastitis is controlled by different genes that are dependent according to parity. The current results contribute to an expansion in the body of knowledge regarding the proportion of genetic variability explained by SNPs for SCS in dairy cattle.


Introduction
In recent decades, the prevalence of clinical mastitis in cattle has increased significantly [1] in most of developed agricultural countries where farming practices are more intensive. This tendency is manifested in parallel with a higher selection pressure for high milk yields in modern dairy breeds. Therefore, improving the understatement of the genetic base of mastitis and its indicator traits represents a major goal for the dairy cattle breeding industry as it can improve udder health and increase the milk quality, while also reducing early involuntary culling, discarded milk, veterinary services and labor costs.

Animals and Phenotypes
This study was conducted on 723 dairy cattle out of which 601 were RS and 122 were RB cattle. The animals involved were managed in three breeding herds: two experimental herds belonging to the Romanian Academy of Agricultural Science (the Research and Development Station for Bovine Arad and the Research and Development Station for Bovine Sighetu Marmatiei) and one commercial herd located in Berliste. All cows involved in the study were kept under comparable housing and feeding conditions, with similar milking and sanitary conditions, and were mechanically milked twice a day. All the animals had at least 3 monthly test-day records per lactation. Only data corresponding to the first three lactations were included in the analysis.
Phenotypic data consisted of 33,330 SCC records, of which 24,295 were for RS and 9035 were for RB cattle. For SCC determination, milk records were conducted every 28 days by the Official Dairy Control service. The milk samples were analyzed for SCC determination in the laboratory of the Milk Quality Control Foundation (Cluj-Napoca, Romania) by using CombiFoss™ FT + integrating MilkoScan™ FT + and Fossomatic™ FC.
In order to achieve a normal distribution of the data, the values of SCC were transformed to SCS according to the formula of Wiggans and Shook [15] as SCS = log2 (SCC/100,000) + 3, specified by the international standard (Interbull Code of Practice. https://interbull.org/ ib/codeofpractice, accessed on 25 August 2021 [16]).

Sampling and Genotyping
Biological samples needed for genotyping consisted of whole blood and hair roots that were collected by trained veterinarians. Blood samples (n = 627) were collected from the tail vein in vacutainers containing K3EDTA as an anticoagulant, and the hair roots (approximately 20-40 hairs with follicles attached) were collected from the tail of the animals (n = 96). All protocols used for collection of the biological samples were approved by the Ethics Committee of the Research and Development Station for Bovine Arad (Statement no 88/04. 10.2019). After collection, the samples were transferred to the laboratory and stored at 4 • C, and they were subsequently sent to IFN Schönow GmbH (Bernau bei Berlin, Germany) for DNA extraction and genotyping using an Axiom Bovine v3 microarray (based on the reference genome Bos_taurus_UMD_3.1.1), which includes >63,000 SNPs. The used SNP chip included a high percentage of SNPs that are found distributed on all chromosomes. The SNPs on the array include 44,000 markers from the Council on Dairy Cattle Breeding (CDCB), International Society for Animal Genetics (ISAG) core parentage markers, VIP trait-associated SNPs and SNPs optimized for STR imputation. In total, 723 samples were genotyped.

Quality Control (QC)
Data quality control was performed in order to remove SNPs and individuals with insufficient genotyping quality. QC procedures were carried out by using PLINK v1.90b6.16 64-bit [17]. The SNPs located on the sex chromosome or those with unknown positions were removed from the current study. Individuals with a call rate of less than 95% were excluded. The SNPs with call rates <95% or with minor allele frequency (MAF) < 0.05 and SNPs with genotypes not in accordance with the Hardy-Weinberg equilibrium (p > 10 −6 ) were eliminated.

Association Analysis
Principal component analysis (PCA) was performed for each breed-specific subpopulation in order to check for population stratification. In both cases, the first 20 principal components were computed by using the R package argyle v0.2.2 [18]. Based on the proportion of variation explained by principal components, we noted that the first 8 principal components explained the most significant amount of genetic variability (23.86% of the total variance in RB and 17.49% in RS, respectively) and, thus, captured any possible effects of population stratification (Supplementary Figure S1). For this reason, we included the first 8 principal components into the GWAS study as covariates for both breeds.
The GWAS was performed for the SCS trait by using two-stage analysis. First, a random-regression test-day model using only phenotypes and pedigree data was implemented in Blupf90 [19] by using the combined variable herd-year-season as the fixed effect in order to represent contemporary groups and by using fourth-order Legendre polynomial coefficients for both fixed and random effects (additive genetic and permanent environmental effects) in order to represent components of the lactation curve. The variance components were computed by using the remlf90 command, after which the estimated breeding values (EBVs) and prediction error variances and covariances (PEVs) were computed using the blupf90 command with the store_pev_pec option. PEVs were used to compute EBV reliabilities by using the following formula: where V A is the additive genetic variance.
EBV reliabilities were then used to compute de-regressed proofs (DRPs) by using the R package DRP [20], which implements the approach proposed by Garrick et al. [21]. More specifically, the wideDRP function was used by setting the heritability to its estimated value and using the default value of 0.5 for the c parameter, which represents the fraction of genetic variance not explained by markers.
For the second GWAS stage, DRPs were used as phenotypes in the rrBLUP package v.4.6.1 [22] by using the GWAS function, with only the first 8 principal components included as fixed effects. SNP effects were computed within the same R package by using the mixed.solve function. A genome-wide significance threshold of p < 1.00 × 10 −4 was used, and SNPs exceeding this threshold were considered as candidate variants. Manhattan plots were created by using the R package qqman v.0.1.4 [23].
Linkage disequilibrium analyses for some of the significant SNPs on chromosome 7 and 9 in RB breed and chromosome 6 in RS breed and SNPs from the surrounding regions (±500,000 KBp relative to the SNP's position) were performed by using the IntRegionalPlot function of the R package IntAssoPlot v.0.99.21 [24]. This function produces a figure consisting of two layers. The top layer represents a scatter plot of the −log 10 (p) values for all the SNPs in a selected region, with the chromosomal position on the x-axis. The bottom layer depicts a heat map of LD blocks for each of the pairs of SNPs within the region. In addition, the function adds lines between the two layers to show which SNP corresponds to which LD block.

Descriptive Statistics
A total number of 690 cows with 33,330 test-day milk records were used in this study. The number of records for second (31.66%) and third parity (24.71%) was lower than those for the first lactation (43.63%). For the two breed-specific subpopulations by parity (L 1 -L 3 ), the corresponding percentages were as follows: 45.61%, 30.78% and 23.61% in RS and 38.32%, 34.00% and 27.68% in the RB breed, respectively. The descriptive statistics of the observed phenotypes for each breed and lactation are presented in Table 1.
According to our results, the average SCSs among parities (L 1 -L 3 ) were 2.81, 3.06 and 3.27 in RS and 4.53, 4.51 and 4.62 in RB cattle, respectively. The maximum level of SCS was 8.84; nevertheless, the number of cows with this level of SCS was low in both breeds. The overall mean SCS for RS was 2.99 ± 1.82 and 4.55 ± 1.89 in RB cattle. Estimates of heritability for the SCS showed different variation between breeds over the 305 days of lactation. The range of SCS heritability values for L 1 -L 3 in RS (0.08, 0.10 and 0.11) was higher than compared to RB (0.03, 0.07 and 0.06). The highest heritability was recorded for RS cattle in the third lactation (0.11). Between breeds, a distinct pattern of variation in heritability values was observed. The shape of the SCS heritability curves (Supplementary Figure S2) was similar for L 1 -L 3 in the RS breed, with the highest values in the first and last parts of the lactation. Conversely, the analysis of daily SCS heritabilities in the RB breed revealed different shapes of curves between the first, second and third lactation. This could be due to the smaller number of RB animals included in the study, which means that possible outlier individuals had a larger impact on the breed-specific population average. The SCS heritabilities in RB decreased in the first part of the lactation (up to 110 days in milk) and then remained on the plateau in mid-lactation (up to 200 days in milk), after which it gradually increased in the last part of the lactation (up to 305 days in milk). This trend appears similar to that of the incidence of mastitis.

Quality Control
After computing descriptive statistics for the phenotypic data, quality control checking of genotyping data was the second step in our data analysis. Since the success of a genomewide association study is considerably influenced by appropriate study design and data analysis workflow [27], extra attention was paid towards data quality control of genotyping data. After eliminating markers from unplaced contigs and sex chromosomes, markers with unknown chromosomes or positions within the chromosome, multiallelic markers, indels and duplicate SNPs from the original loci of the microarray, a number of 59,915 markers for 723 animals entered the QC phase. From these, a number of animals and SNPs were discarded as follows: 11 animals that had greater than 5% missing SNP calls; 795 SNPs with call rates <95%; 17,701 SNPs with MAF <0.05; and 1114 SNPs with genotypes that are not in accordance with the Hardy-Weinberg equilibrium (p > 10 −6 ). Finally, after the quality control analysis, a data set of 40,305 SNPs across the entire bovine genome from 690 cows (571 RS and 119 RB) was used in the subsequent analysis to carry out genotype association tests.

Principal Component Analysis
The animals involved in this study came from two breeds and three breeding herds. Thus, in order to assess population stratification, we performed principal component analysis. The PCA plot revealed a clear population structure for the animals in the two cattle breeds included in the study, showing a clear separation between individuals, according to breed (Figure 1). Clusters of the same color represent individuals from the same breed. All animals from the RB breed clustered together in all pairwise scatter plots of the first three principal components, but they were separated from the RS individuals. From the three plots in Figure 1, we observed that PC1 separates the individuals from the two breeds, whereas PC2 and PC3 divide individuals within the RS breed. The individuals from the RS breed split into several subgroups that were best observed in the scatterplot of PC2 and PC3 (Figure 1b). We have conducted a separate investigation into the origins of these Genes 2021, 12, 1495 6 of 16 different clusters (data not shown) and concluded that one subgroup consists of cows with paternity from one single sire and the rest of the subgroups consist of cows from several different sires.
cording to breed ( Figure 1). Clusters of the same color represent individuals from the same breed. All animals from the RB breed clustered together in all pairwise scatter plots of the first three principal components, but they were separated from the RS individuals. From the three plots in Figure 1, we observed that PC1 separates the individuals from the two breeds, whereas PC2 and PC3 divide individuals within the RS breed. The individuals from the RS breed split into several subgroups that were best observed in the scatterplot of PC2 and PC3 (Figure 1b). We have conducted a separate investigation into the origins of these different clusters (data not shown) and concluded that one subgroup consists of cows with paternity from one single sire and the rest of the subgroups consist of cows from several different sires.

Significant SNPs Associated with SCS
To investigate the genetic variation that underlies the SCS, GWAS was performed in roder to identify the associated SNP loci in cattle. The Manhattan plots for SCS in the RS and RB breeds are shown in Figure 2. The upper horizontal lines represent the Bonferroniadjusted genome-wide significance threshold −log10(p) ≥ 5.89, and the inferior horizontal lines represent the suggestive threshold of −log10(p) ≥ 4.00, which was set to report further significant associations that were not observed under Bonferroni corrections. The suggestive threshold was chosen according to previous studies where researchers defined a suggestive threshold between p < 10 −5 and p < 10 −3 [28,29]. The candidate genes were detected by verifying whether significant or suggestive SNPs overlapped a gene or were located within 1 MBp upstream or downstream from a gene. The genomic positions were based on the UMD3.1 genome assembly of Bos taurus [7]. (a)

Significant SNPs Associated with SCS
To investigate the genetic variation that underlies the SCS, GWAS was performed in roder to identify the associated SNP loci in cattle. The Manhattan plots for SCS in the RS and RB breeds are shown in Figure 2. The upper horizontal lines represent the Bonferroni-adjusted genome-wide significance threshold −log 10 (p) ≥ 5.89, and the inferior horizontal lines represent the suggestive threshold of −log 10 (p) ≥ 4.00, which was set to report further significant associations that were not observed under Bonferroni corrections. The suggestive threshold was chosen according to previous studies where researchers defined a suggestive threshold between p < 10 −5 and p < 10 −3 [28,29]. The candidate genes were detected by verifying whether significant or suggestive SNPs overlapped a gene or were located within 1 MBp upstream or downstream from a gene. The genomic positions were based on the UMD3.1 genome assembly of Bos taurus [7].
The genome-wide SNPs detected for SCS among parities (L 1 -L 3 ) are presented in Table 2 for the RS breed and Table 3 for the RB breed. A total number of 41 SNPs were detected in the RS and RB breeds, of which 40 SNPs passed the suggestive threshold and, ultimately, one SNP passed the significance threshold after Bonferroni correction and was associated with SCS in the RS breed in L 3 (Figure 2c, left side). As shown in Table 2, for the RS breed, 3 out of 15 SNPs were located near three known genes, and one SNP overlapped the HERC3 gene. Of the fifteen SNPs in the RS breed, AX-106761943 (rs110749552) was located on chromosome 6 within the HERC3 gene (HECT and RLD domain containing E3 ubiquitin protein ligase 3) and was the most significant SNP for SCS (log 10 (p) = 6.37). In the RB breed, the genome-wide analysis detected 26 SNPs that reached genome-wide significance for association with SCS (Table 3). Among these SNPs, 14 out of 26 SNPs were located near 12 known genes; two SNPs, AX-106741653 and AX-115114947, were located near the AKAP8 gene, and two other SNPs, AX-106735825 (rs43585636) and AX-117085949 (rs43585209) were located near the COL12A1 gene, respectively. None of the 15 detected SNPs in the RS breed were in common with the detected 26 SNPs that reached genome-wide significance for association with SCS in the RB breed. significant associations that were not observed under Bonferroni corrections. The suggestive threshold was chosen according to previous studies where researchers defined a suggestive threshold between p < 10 −5 and p < 10 −3 [28,29]. The candidate genes were detected by verifying whether significant or suggestive SNPs overlapped a gene or were located within 1 MBp upstream or downstream from a gene. The genomic positions were based on the UMD3.1 genome assembly of Bos taurus [7]. The genome-wide SNPs detected for SCS among parities (L1-L3) are presented in Table 2 for the RS breed and Table 3 for the RB breed. A total number of 41 SNPs were detected in the RS and RB breeds, of which 40 SNPs passed the suggestive threshold and, ultimately, one SNP passed the significance threshold after Bonferroni correction and was associated with SCS in the RS breed in L3 (Figure 2c, left side). As shown in Table 2, for the RS breed, 3 out of 15 SNPs were located near three known genes, and one SNP overlapped the HERC3 gene. Of the fifteen SNPs in the RS breed, AX-106761943 (rs110749552) was located on chromosome 6 within the HERC3 gene (HECT and RLD domain containing E3 ubiquitin protein ligase 3) and was the most significant SNP for SCS (log10 (p) = 6.37). In the RB breed, the genome-wide analysis detected 26 SNPs that reached genomewide significance for association with SCS (Table 3). Among these SNPs, 14 out of 26 SNPs were located near 12 known genes; two SNPs, AX-106741653 and AX-115114947, were located near the AKAP8 gene, and two other SNPs, AX-106735825 (rs43585636) and AX-117085949 (rs43585209) were located near the COL12A1 gene, respectively. None of the 15 detected SNPs in the RS breed were in common with the detected 26 SNPs that reached genome-wide significance for association with SCS in the RB breed.

Linkage Disequilibrium (LD) Blocks of the Significant SNPs
We performed the linkage disequilibrium analysis (r 2 value) on chromosome 7 in the RB breed for a region of 1 MBp interval (from 7,772,794 to 8,772,794 Bp), which covers seventeen SNPs (Figure 3a). High LD blocks were considered for SNPs with r 2 ≥ 0.8. Four high LD blocks were revealed where the most significant SNP (AX-106741653) was located in a block along with another significant SNP (AX-115114947). The two significant SNPs in block 2 were located upstream of the AKAP8 gene at distances of 514,640 and 511,009 Bp, respectively. A second linkage disequilibrium analysis for a total of twenty SNPs on chromosome 9 in the RB breed was performed and showed three LD blocks (Figure 3b), each of them including two SNPs, of which the SNPs in block 2 were located upstream of the COL12A1 gene. For the RS breed, we performed a single LD analysis on chromosome 6 in a region of ±500,000 KBp relative to the position 37,526,622 bp of the SNP AX-106761943 (rs110749552), which was the only marker in our study that had a significant association according to Bonferroni correction. For the RS breed, the LD analysis revealed three high LD blocks (Figure 3c), of which the first block contained the previously mentioned SNP AX-106761943 and a non-significant SNP AX-117085324 (rs109478631), both of them located within the HERC3 gene.

Linkage Disequilibrium (LD) Blocks of the Significant SNPs
We performed the linkage disequilibrium analysis (r 2 value) on chromosom RB breed for a region of 1 MBp interval (from 7,772,794 to 8,772,794 Bp), whic seventeen SNPs (Figure 3a). High LD blocks were considered for SNPs with r 2 ≥ high LD blocks were revealed where the most significant SNP (AX-106741653) wa in a block along with another significant SNP (AX-115114947). The two signific in block 2 were located upstream of the AKAP8 gene at distances of 514,640 an Bp, respectively. A second linkage disequilibrium analysis for a total of twenty chromosome 9 in the RB breed was performed and showed three LD blocks (Fi each of them including two SNPs, of which the SNPs in block 2 were located up the COL12A1 gene. For the RS breed, we performed a single LD analysis on chro 6 in a region of ±500,000 KBp relative to the position 37,526,622 bp of the 106761943 (rs110749552), which was the only marker in our study that had a s association according to Bonferroni correction. For the RS breed, the LD analysis three high LD blocks (Figure 3c), of which the first block contained the previou tioned SNP AX-106761943 and a non-significant SNP AX-117085324 (rs1094786 of them located within the HERC3 gene. (a)

Discussion
Since mastitis is one of the most frequent diseases in dairy cattle negatively milk production and animal welfare, contributions to understanding the genom tecture are of great interest. Mastitis, on the other hand, is exceedingly difficult ically manage due to its low heritability. However, as a result of the genetic c between mastitis incidence and SCC, the latter indicator trait has been used as a

Discussion
Since mastitis is one of the most frequent diseases in dairy cattle negatively affecting milk production and animal welfare, contributions to understanding the genomic architecture are of great interest. Mastitis, on the other hand, is exceedingly difficult to genetically manage due to its low heritability. However, as a result of the genetic correlation between mastitis incidence and SCC, the latter indicator trait has been used as an indirect technique for predicting udder infections [30]. Moreover, SCC is recorded in most cases through the Official Dairy Control service in all developed countries and is used as a marker to monitor the prevalence of mastitis in dairy cattle [31], thus being more accessible to farmers compared to microbiological tests. In the present study we have performed GWAS and detected SNP variants associated with SCS variation in native Romanian dairy cattle breeds. To our knowledge, this is the first research study that identified potential causative genes and reported associated SNP variants linked to mastitis in Romanian cattle breeds.
In our study, the overall mean ± SD SCS values for the RS and RB breeds were 2.99 ± 1.82 and 4.55 ± 1.89, respectively. These averages are in agreement with previous studies. For instance, Alam et al. [32] reported means of SCS for five lactations of Holstein cows to range between 3.50 and 4.32. Zavadilová et al. [33], using records from the first three lactations of Fleckvieh cows, reported mean SCS of 3.16 to 4.01. The average SCS in this study increased for later lactations, and the trends were similar for both breeds, with the lowest values of SCS during the first and second lactation and higher values during the third lactation. The increasing trend of SCS with the increment of lactation rank observed in our study was also reported in other articles [32,34,35]. Differences in SCS suggest that increased milk flow may contribute to an increase in SCS for later lactations.
The average heritabilities for SCS over the entire days-in-milk range for 305 days were small to moderate and varied from 0.03 (RB) to 0.11 (RS), which are in agreement with previous studies. For example, for the Simmental breed, Wei et al. [36] reported values for SCS heritability of 0.09, whereas Mancin et al. [37] found a slightly higher heritability of 0.133 in Alpine Grey cattle, a local dual-purpose breed from the Italian Alps. However, substantial variations in the heritability of SCS were reported in different studies for distinct breeds. Previous estimates of heritability were low in Jersey and Holstein cattle (0.01-0.06) [38,39], moderate in Finnish Ayrshire (0.07-0.12) [40] and Valdostana cattle (0.06) [41] and higher values in Holstein-Friesian cattle were estimated (0.19-0.21) [42,43]. Various studies have resulted in the conclusion that several factors, including populations, statistical models and estimation methods, can account for these discrepancies in heritabilities [33,44].
Genome-wide association studies provide a valuable source of information for the comprehension of genetic mechanisms that point out the various phenotypes in dairy cattle. Nevertheless, when looking at the GWAS results so far, it is clear that identifying specific SNPs and genes located around them linked to SCS is difficult. As observed in the present study (Tables 2 and 3), from a total of 41 SNPs that were found to be associated with SCS in both Romanian cattle breeds, the majority of them were dissimilar from significant SNPs reported in previous publications [8,10,11,29,45,46], while the HERC3, LUZP2, AKAP8 and MEGF10 genes located near significant SNPs on chromosome 6, 29 and 7 are consistent with other studies [47][48][49], which strongly suggests that these genes are candidate genes for SCS.
In the RS breed, the GWAS results pointed out 15 SNPs associated with SCS. Out of the significant SNPs, three markers were close to the ZDHHC19, DAPK1 and MMP7 genes, while one SNP overlapped the HERC3 gene. The most significant SNP in our study, rs110749552, (6:g.36100172G > A on ARS-UCD1.2; −log 10 (p) = 6.36) associated with SCS in RS breed in L 3 was located within the HERC3 gene that encodes a member of the HERC ubiquitin ligase family. Significant associations of the HERC3 gene with different traits have been previously reported by several authors in cattle or sheep populations. This gene was previously found to be related to mastitis and somatic cell count in Ethiopian cattle populations [47]. According to Jiang [50], the HERC3 gene was associated with milk protein percentage in Chinese Holstein cattle. In French Holstein, the HERC3 gene had highly significant effects on αs1-casein [51]. Porto-Neto et al. [52] explored the genetic architecture of ten traits in tropical cattle breeds and reported the HERC3 as a candidate gene for yearling weight. In addition, Al Kalaldeh [53] showed that the HERC3 gene on Ovis aries chromosome 6 fell within the QTL region that underlies genetic variation for parasite resistance in sheep and also noticed that the genes from the HERC family of ubiquitin ligases are associated with antigen processing: ubiquitination and proteasome degradation; immune system; class I MHC mediated antigen processing and presentation; and adaptive immune system pathways. However, this is the second study where this gene has been found to be involved in the genetic makeup of mastitis.
With regards to the RB breed, the GWAS results revealed 26 promising SNPs associated with SCS. Out of the significant SNPs, 14 were located near 12 known genes, including AKAP8, CLHC1, MEGF10, SATB2, GATA6, SPATA6, COL12A1, EPS8, LUZP2, RAMAC, IL12A and ANKRD55. On chromosome 7, the rs42228650 SNP was close to MEGF10, a gene that was found to be associated with SCS in dairy cows in a previous investigation [48] and was considered as a candidate gene associated with residuals for fecal egg counts of gastrointestinal nematodes in German Black Pied cattle in a recent study [54]. On the same chromosome, two significant SNPs (AX-106741653 and AX-115114947) in the RB breed were located close to the AKAP8 gene, which was associated with somatic cell count in seven different populations of dairy cows in a complex study that used the combined approach of examining previous results from differential gene expression analysis and GWAS studies to identify the key pathways and candidate genes that potentially confer genetic resistance to mastitis in dairy cows [48]. Furthermore, another relevant gene revealed in our work located 519,687 Bp from the rs110955314 SNP was LUZP2 on chromosome 29, which has been associated with SCS [48] and milk fatty acid traits in cattle [55].
In addition to the genes known to be associated with mastitis, the present study identified novel candidate genes associated with SCS in Romanian cattle. Some of these genes have been shown to be linked with other traits in ruminants. For instance, the rs109189476 SNP on chromosome 2 was close to SATB2, a gene that encodes a DNA binding protein that specifically binds nuclear matrix attachment regions. A previous study disclosed the SATB2 gene to be associated with tick burden, and the author mentioned a potential link of this gene to the immune system [56]. Jiang et al. [57] conducted a large-scale GWAS using first-lactation Holstein cows and identified the 91.13-94.62 MBp region of chromosome 5, including the EPS8 gene associated with fat yield, covered by SNP AX-106726354 located near EPS8 in this study. The SNP AX-124381310 on chromosome 24 was close to the GATA6 gene. The GATA gene family has six members (GATA1-GATA6) in vertebrates and plays a critical role in controlling the growth, differentiation and survival of diverse cells, as well as maintaining the body's functions [58]. Currently, mutations in GATA6 were reported to be associated with microtia in Awassi sheep in ruminants [59].
Regarding the COL12A1 (Collagen Type XII α 1 Chain) gene, it is known that degradation of the extracellular matrix and collagen chain trimerization is common among the related pathways of this gene. The Gene Ontology annotations associated with this gene include extracellular matrix structural constituent conferring tensile strength. Collagens consist of 28 members in vertebrates, and they are the most abundant extracellular matrix proteins [60]. Chen et al. [61] showed that gene expression levels of a collagen member (COL4A1) were significantly downregulated in the inflammation-associated fibroblasts from bovine mammary glands compared to normal fibroblasts, suggesting an implication of collagen members in immune response since one of the related pathways include extracellular matrix-receptor interaction. Moreover, Lu et al. [60] revealed that several collagen genes in bovine had higher expression levels during lactation, among which COL12A1 is upregulated in late lactation, suggesting that the collagen genes had varied biological functions at different phases of lactation. In the present study, we identified two significant SNPs associated with SCS (rs43585636 and rs43585209) located close to COL12A1. All of the aforementioned aspects suggest that the COL12A1 gene might be involved in SCS.
This study revealed that different markers and candidate genes were found for the two breeds in our study. In addition, all identified SNPs have been distinct among the three parities, denoting that SCS is influenced by different genes according to parity. The different sets of markers discovered in this work compared to studies reported in the literature can be attributed to various factors. We can assume that the power of detecting SNPs can be lower in the RB breed compared to the RS breed as a consequence of the comparatively lower number of genotyped RB animals used in the analysis. Furthermore, previously known mastitis related genes such as LUZP2, AKAP8 and MEGF10 were also identified in our study as candidate genes for SCS in the RB breed. For this specific case in the RB breed, further studies using larger sample sizes should be performed in order to validate previous results. Finally, an additional factor that may influence the different sets of candidate SNPs/breed is the distinct genetic backgrounds of the two breeds as they have different levels of clustering, as shown in previous studies on Romanian cattle breeds [62,63].
The specific novel SNPs and candidate genes located around them reported in the present study can be considered as candidates involved in SCS and mastitis resistance; however, these need to be kept in perspective, and polymorphisms in those genes should be further analyzed to highlight whether they influence the ability of dairy cows to resist mastitis.

Conclusions
Our study represents the first GWAS for SCS in Romanian dairy cattle and, thus, provides new perspective into the genetic architecture of udder infections in these native dairy cattle. We identified 41 SNPs and detected their significant associations with variation in SCS in dairy cattle. In both breeds, the SNPs and position of association signals were distinct among the three parities, denoting that mastitis is influenced by different genes according to parity. The results contribute to an increase in knowledge regarding the proportion of genetic variability explained by SNPs for SCS in dairy cattle and, particularly, in Romanian native cattle. However, further large-scale studies of a vast number of native dairy cattle breeds are needed in order to investigate others markers and genes that could be involved in mastitis.