Does the Acknowledgement of αS1-Casein Genotype Affect the Estimation of Genetic Parameters and Prediction of Breeding Values for Milk Yield and Composition Quality-Related Traits in Murciano-Granadina?

Simple Summary Genetic evaluations and the selection of breeding animals require the accurate estimation of genetic parameters for economically important traits. As a result, dairy livestock has evolved in response to the needs of producers and consumers. Genetic selection in goats has been mostly based on quantitative traits such as milk yield, fat, protein, and dry matter. However, as reported by the increased heritability values of these parameters after the inclusion of the different allelic variants of αS1 casein in evaluation models, the selection of animals carrying this gene could result in a more efficient genetic selection. High levels of genetic polymorphism (89.58% of polymorphic SNP—as only five out of the 48 SNPs assessed were monomorphic) that are related to greater production of coagulable proteins in milk, a fact that could be associated with a higher yield and improved curd firmness properties. Abstract A total of 2090 lactation records for 710 Murciano-Granadina goats were collected during the years 2005–2016 and analyzed to investigate the influence of the αS1-CN genotype on milk yield and components (protein, fat, and dry matter). Goats were genetically evaluated, including and excluding the αS1-CN genotype, in order to assess its repercussion on the efficiency of breeding models. Despite no significant differences being found for milk yield, fat and dry matter heritabilities, protein production heritability considerably increased after aS1-CN genotype was included in the breeding model (+0.23). Standard errors suggest that the consideration of genotype may improve the model’s efficiency, translating into more accurate genetic parameters and breeding values (PBV). Genetic correlations ranged from −0.15 to −0.01 between protein/dry matter and milk yield/protein and fat content, while phenotypic correlations were −0.02 for milk/protein and −0.01 for milk/fat or protein content. For males, the broadest range for reliability (RAP) (0.45–0.71) was similar to that of females (0.37–0.86) when the genotype was included. PBV ranges broadened while the maximum remained similar (0.61–0.77) for males and females (0.62–0.81) when the genotype was excluded, respectively. Including the αS1-CN genotype can increase production efficiency, milk profitability, milk yield, fat, protein and dry matter contents in Murciano-Granadina dairy breeding programs.

Hence, the main objective of this study is to comparatively estimate genetic parameters and breeding values of milk yield and components (protein, fat and dry matter in kg), using two models, one including the αS1-CN genotype as a biomarker fixed effect, and the other excluding it. This procedure was carried out to analyze the potentially occurring discrepancies or similarities obtained after both models, assessing whether the inclusion of αS1-CN genotype may result in improved efficiency and profitability of the selection techniques applied in the breeding programs for milk production and quality in Murciano-Granadina goats.

First Phase: Study Sample Selection Process
During this initial phase, the statistical properties of the variables of milk yield, fat, protein, and dry matter contents (kg) from the Murciano-Granadina goat breed's complete historical routine milk record until 2017 were evaluated (n = 2,359,479 registries belonging to 151,997 animals born from 1998 to 2017). This whole routine milk recording set was submitted to a depuration process to rule out those goats for which registries fell outside the Murciano-Granadina goat breed's reference values for milk yield and composition (protein, fat and dry extract). After this depuration process, the pedigree matrix considered in our study for genetic analyses consisted of 29,397 animals (26,993 does and 2404 bucks) born from 1998 to 2017, registered in the historical pedigree, accounting with direct records in the relationship matrix and at least connected via one known ancestor. Then, parametric assumptions were tested (normality, homoskedasticity, sphericity, multicollinearity, and outlier presence) within the raw phenotypes.
After this, parametric assumptions were retested on females and clustered by partition number for every fixed effect in our model (farm, birth year, month and season and birth type). This was performed as a way to adjust phenotypes and determine whether the lack of normality could be attributed to the properties of the sample studied as a normal distribution could be expected for productive data from goats at the same lactation status.
To achieve this objective, SPSS Statistics for Windows', Version 24.0, IBM Corp., Armonk, New York (2016) split file routine was used. Both procedures (complete historical and adjusted data, respectively) reported the violation of the parametric assumptions already reported, a fact that suggested the application of a nonparametric approach. After parametric assumptions had been tested, we selected our sample set comprising 710 goats (with 2090 records collected during the years 2005-2016) and genotyped their αS1-CN background to evaluate the influence of such a genotype following the sample selection criteria described in Pizarro et al. [26] Animals were ranked considering their predicted breeding values (PBVs) for milk yield, fat and protein and the relationships among these traits at a previous genetic evaluation, using the combined selection index (ICO) procedures defined in Pizarro et al. [26]. The 236 females presenting the lowest ICO values in the rank, 238 females with values around percentile 50, and the 236 females presenting the highest ICO values in the rank were chosen, so as to perform an adjusted representative sampling of the genotype distribution in the population). Univariate (to estimate heritabilities) and bivariate analyses (to estimate genetic and phenotypic correlations) were performed separately for the selection of the animals that could be genotyped, aiming at preventing the incorrect estimation of heritabilities as an effect of the epistasis between traits. Then, we applied mixed model procedures using an animal model (BLUP) by restricted maximum likelihood, with the MTDFREML software package [27]. We followed an interaction process seeking the achievement of convergence criteria levels of 10 −12 . The genetic evaluation involved those goats that presented direct records and were at least connected via one known ancestor within the relationship matrix. This matrix consisted of the historical pedigree until 2017 (151,997 females). After we achieved convergence, predicted breeding values were computed with the MTDFREML software.
To rank the goats in the relationship matrix and select the ones to be genotyped out of them, we computed a combination of goat's predicted breeding values for three of the four traits measured (i.e., milk yield, fat and protein contents) using a combined selection index (ICO) procedure as described in Van Vleck [28]. Dry matter was not included in the computation of the combined index (ICO) given the potential redundancy occurring as it comprises an indirect measurement of fat and protein components. The procedure to set the combined selection index used and the premises considered to determine it are described in Pizarro et al. [26].
The model and methods used at the preliminary genetic evaluation were designed after a routine process including commonly reported effects for the evaluation of milk yield, protein, fat and dry matter components, via the common parametric approach addressed by several authors [29,30]. As parametric approaches had been used to test these preliminary model and methods, we may be able to infer a comparison of the results obtained after running nonparametric tests on our data consisting of variables that are commonly studied using parametric approaches.

Study Sample
After the depuration process and sample selection for genotyping had been accomplished, we retained registries collected during the period from 2005 to 2016 from 710 Murciano-Granadina non neutered lactating goats present in the studbook. The age of goat in the sample ranged from a minimum of 14.03 to a maximum of 130.13 months old. As age did not distribute normally (p < 0.001 for Shapiro-Wilk Francia's tests to test for normality in samples up to 5000 cases), we provide Q1 (17.70 months), median age (28.77 months) and Q3 (41.10 months). The properties of parturition age across number of parturitions were assessed through Shapiro-Wilk Francia's W test, Kurtosis and Skewness revealing the data distributed normally (p > 0.05 for Shapiro Wilk Francia's W test, around 3 for Kurtosis and a value ranging from −0.5 to 0.5 for Skewness).

Lactation Standardization
Regular management policies normally followed and applied at Murciano-Granadina farms rely on two kidding seasons per year (polyestric breed), which translates into economically important income derived from no longer than 210-240-day lactation periods [31]. For the analyses carried out in the present study, total milk yield and content were estimated until 210 days of lactation as suggested by Brito et al. [29] and expressed in kg following the procedure described in Pizarro et al. [26].

Milk Composition Evaluation
The amount of milk collected during one milking period was determined for the 710 goats individually after typifying lactation period at 210 days, following seven controls after parturition carried out periodically every 30 days. Samples were then sent to the official Milk Quality Laboratory (Córdoba, Spain) for composition determination (total fat, protein and dry matter) using a MilkoScan™ FT1 analyser. Data was then purged and 2090 registries form 710 goats were considered at the statistical analysis. Supplementary Materials Table S1 describes the individuals and observations and the genotypic frequencies of each CSN1S1 genotype in Murciano-Granadina goat for the sample used in our study and for population frequencies presented at a previous paper [32].

Second Phase: Determining the Effect of Genotype
During this phase, we adjusted phenotypes and checked potential intergenotypic differences, following nonparametric statistical procedures, as described in Pizarro et al. [26].
DNA Bank, αS1-Casein (CNS1) Genotyping and Mutation Identification The procedure of DNA sampling, αS1-casein (CNS1) genotyping and mutation identification was carried out following the premises described in a previous study by Pizarro et al. [26]. DNA information from 117,225 blood samples belonging to Murciano-Granadina breeding males and females stored in the Germplasm Bank held by Animal Breeding Consulting, Inc., Córdoba (Spain) Samples were owned by the National Association of Breeders of Goats of Murciano-Granadina Breed (CAPRIGRAN). When the blood sample for a certain animal registered in the studbook was not stored at the germplasm bank, such animal was not considered to comprise the study sample.
The encoding sequences of the allelic types of the αS1-Casein gene reported by Genebank are AJ504710, AJ504711, AJ504712, and X56462, respectively, corresponding to the alleles identified in our study (F, N, A, and B, respectively) (Supplementary Materials Table S1). Preliminary statistical analyses reported the presence of a 457 bp LINE element determining the presence E allele, involving and excluding 11 bases as described in Supplementary Materials Table S1.
PCR procedures were used to search for the alleles reported in Maga et al. [33] in order to identify the mutations in the samples in our study, as reported by Pizarro et al. [26].

Statistical Analysis of Non-Genetic and Genetic Factors
We analysed the effects of categorical factors (farm, parturition year, month and season, kidding type and genotype) on continuous variables (milk yield, protein, fat and dry matter (expressed in kg). We did not transform data given the analyses were performed on normalized milk yield expressed in kg. After this, the total number of kg of normalized milk yield at 210 days was used to compute the amount of each component expressed in kg by multiplying this total number of kg by the percentage of fat, protein and dry extract determined at the laboratory. Although the additive component could not be considered to be negligible for milk yield and composition, mean inbreeding in the historical pedigree of the breed until 2017 (n = 2,359,479 registries) is as low as 0.29%, thus irrelevant and not included in the model as described in Pizarro et al. [26]. The statistical procedures described by these authors [26] were followed as a way to counteract the obstacles represented by the effects of the properties of the data from milking records as already described in a previous paper [34]. Parametric assumptions were tested on the whole milking record available for Murciano-Granadina goat breed (n = 2,359,479) to determine the statistical approach to follow seeking the best validity for results. As normality assumption was violated (Kolmogorov Smirnov, p < 0.001, highly positively skewed and platykurtic distribution) for milk yield, fat, protein and dry matter content expressed in kg (Supplementary Tables S2 and S3) a non-parametric approach was followed. Homoscedasticity (Levene's test, p < 0.001) and Mauchly's W test of sphericity (Mauchly's W = 0.29), χ 2 (5) = 8,327,656.071, p < 0.001) were neither fulfilled. These assumptions were retested to examine whereas data could presumably follow a normal distribution when the information of goats at the same lactation status is considered. Additionally, the statistical parametric assumptions were also tested on our sample. Normality was only slightly detected (Kolmogorov-Smirnov, p < 0.05), for goats at the fourth to the sixth lactation, which in turn did not influence milk yield, protein, fat and dry matter. Contrastingly, no evidence of normality was found for any of the lactation stages from first to sixth lactation, with evidences of heteroscedasticity and lack of sphericity, hence variance calculations may be distorted, which would result in an inflated F-ratio. Normality tests were carried out with sfrancia routine of StataCorp Stata version 14.2 evidencing a highly statistically significant (p < 0.001) deviation of normality in the traits studied. Given that genotyped goat number was a limiting factor, we also tested for homoscedasticity between genotype variants in this smaller set with Levene's test, evidencing the occurrence of heteroscedasticity as well (p < 0.05). Despite univariate outliers could be the base for the violation of the assumption of normality for the data related to milk yield and composition traits in our population, we decided not to discard them as even when these values are over the values that are commonly reported for the breed in literature, they are found within the limits for empirical data field observations. For instance, the fact that the Murciano-Granadina breed presents a high frequency of goats producing up to 1300 L of milk in 210 days of lactation in routine performance checks. Highly statistically significant (p < 0.01) multicollinearity assumption among variables was evidenced after the results of Spearman rho (ρ) correlations. All these pieces of evidence determined our choice to perform nonparametric tests to statistically assess the milk yield and composition data recorded. Hence, a Kruskal-Wallis H was performed to test for interlevel differences. A summary of the levels per factor considered in the model of study can be consulted in Table 1. After Kruskal-Wallis H, the power of the factors on milk yield and composition traits was computed via the determination of partial eta squared (ηp 2 ) coefficients [35] (Table 1). Then, Dunn's test assessed the differences in the distribution of each of the traits considered across the groups of the same factor affecting them. The Bonferroni correction was applied to compensate for Type error I increase derived from the high number of factors considered. An independent-sample median test was performed to corroborate previous results analysing the median across levels of the same factor. The sfrancia routine of StataCorp Stata version 14.2. was used to perform Shapiro Wilk Francia's tests. All nonparametric tests were carried out using the independent samples package from the non-parametrical task of SPSS Statistics for Windows, Version 24.0, IBM Corp. (2016). Further details for this statistical analysis can be found in Pizarro et al. [26].

Genetic Model Comparison, Phenotypic and Genetic Parameter Estimation
This work was carried out with data made available by the Murciano-Granadina Breeders Association, collected in the framework of its breeding program. Overall, the pedigree matrix included records of 29,397 animals with direct records related through at least one known ancestor. (26,993 does and 2404 bucks), while the phenotype data set included 2090 records of 710 goats evaluated between 2005 and 2016. Therefore, a multitrait animal mixed model with repeated measures was used to estimate (co) variance components, and the corresponding heritability, repeatability, phenotypic and genetic correlations and standard errors of such correlations for the traits under examination. In matrix notation, the following multitrait animal model with repeated measures was used: where Yijklmnopq is the separate score of milk yield and components (fat, protein and dry matter in kg) for a given animal; µ is the overall mean; Fari is the fixed effect of the ith farm/owner (i = 59 farms); Yeaj is the fixed effect of the jth year of parturition (j = 2005-2016); Monk is the fixed effect of the kth month of parturition (k = May to December); Seal is the fixed effect of the lth season of evaluation (j = Autumn, Winter, Summer, and Spring); Typm is the fixed effect of the mth type of birth(m= 1 to 5 kids); Genn is the fixed effect of the nth Genotype (n = AA, BA, BB, AE, BE, and EF); age in months was considered a linear and quadratic covariate, hence b 1 and b 2 2 are the linear and quadratic regression coefficients on the age of evaluation (Ao), Animalp is the random additive genetic effect of the pth goat, PEq is its permanent environmental effect, and eijklmnopq is the random residual effect. At a second stage, genotype was not considered in the model to isolate the possible effects derived from considering its effects on the traits measured. In matrix notation, the multitrait animal model with repeated measures excluding αS1-casein genotype used was: where Yijklmnop is the separate score of milk yield and components (fat, protein and dry matter in kg) traits for a given animal; µ is the overall mean; Fari is the fixed effect of the ith farm/owner (i = 59); Yeaj is the fixed effect of the jth year of parturition (j = 2005-2016); Monk is the fixed effect of the kth month of parturition (k = May to December); Seal is the fixed effect of the lth season of evaluation (j = Autumn, Winter, Summer, and Spring); Typm is the fixed effect of the mth type of birth (m= 1 to 5 kids); age in months was considered a linear and quadratic covariate, hence b 1 and b 2 2 are the linear and quadratic regression coefficients on the age of evaluation (An), Animalo is the random additive genetic effect of the oth goat, PEp is its permanent environmental effect, and eijklmnop is the random residual effect.
MTDFREML software package [27] was used to perform Restricted maximum likelihood approach-based univariate analyses in order to compute heritabilities and variance components. The same software was used to carry out bivariate analyses to estimate covariates and genetic and phenotypic correlation. The iteration process used sought a convergence criterion level of 10 −12 . Link functions can be found in Boldman et al. [28]

Non-Genetic Factors Estimation (BLUES) and Breeding Value Prediction (BLUPS, PBVs)
After convergence was reached, we directly estimated non-genetic factors estimators through best linear unbiased estimators for fixed effects (BLUES) and predicted breeding values through best linear unbiased predictors for random effects (BLUPS, PBVs), their accuracies and reliabilities for milk yield, fat, protein, and dry matter for each animal in the matrix, using the MTDFREML software [27].
Standard error of prediction (SEP), reliability (R AP ) and accuracy (RTi) do not provide basically the same information. They differ from the very definition and equation determination (R AP = RTi 2 = (1-SEP 2 /Va) 2 ), from which Va is genetic additive variance. On the one hand, reliability could be described as the likeliness of someone repeating the experiment and getting the same result (repeatability), whereas accuracy is how close your value is to the real value, hence values should be interpreted differently. According to the international beef recording scheme [36], the following guide may assist as a rule of thumb to interpret accuracy (RTi). Less than 50% accuracies mean PBVs are preliminary, thus calculated basing on little information and hence very prone to change substantially as more direct information on the animal becomes available. Accuracies ranging from 50-74% accuracy (medium) suggest that PBVs may have been calculated based on the animal direct information and some limited indirect pedigree information. Medium/high accuracies are denoted by 75-90% and may be calculated considering the animal's direct information together with the performance of a small number of its offspring. Accuracy values over 90%, report estimates of the animal's true breeding value, as it unlikely that PBVs will change considerably even if additional information from offspring is added. Regarding reliability, the rule of thumb proposed by KWPN (Royal Dutch Sport Horse) [37] is as follows; values less than 30% are generally unreliable, 30-55% poor reliability, 55-65% sufficient reliability, 65-75% more than sufficient reliability, 75-90% good reliability, >90% very reliable and repeatable with values around 60% meaning the information strongly relies of offspring information, what would not be desirable. On the other hand, the standard error of prediction (SEP) measures how large prediction errors (residuals) are for your data set measured in the same unit as your variable, hence provides a direct measure of possible change, that is the risk of the true breeding value of animal (TBV) not to be centered on the PBV. According to Van Vleck [38], possible change is risk in units of the trait and can be 'positive' or 'negative', what means the chance true BV may exceed PBV by a certain amount (possible gain) is the same as the chance true BV is less than PBV by the same amount (possible loss). In this context, confidence ranges are frequently used to determine probabilities of possible change assuming a normal distribution of TBV around the PBV. One-half of TBV would be expected to be greater than the PBV and one-half would be expected to be less than the PBV. The interval from PBV-(1)SEP to PBV + (1)SEP corresponds to 68% of likelihood that the TBV for an animal is centered on the PBV for the animal. The range can be narrowed or widened corresponding to the probability of TBV in the interval. For example, the interval from PBV-(2)SEP to PBV + (2)SEP would be expected to contain 95% of TBV. Units of SEP other than (1) or (2) would correspond to other confidence ranges. With a 68% confidence range, 32% would be half over and half below the ranges' ends, while with the 95% range, the percentage falling out of the range would be 5% (again half over and half below each end, respectively). Ranges for many combinations of PBV and SEP may overlap considerably. Then, by observing which PBV centers the range and comparing ranges, we may obtain a more direct measure of risk than that from accuracy (RTi).

Model Comparison
Descriptive statistics were evaluated to study the distribution properties of predicted breeding values obtained with the two models (including and excluding genotype for αS1-CN as a fixed effect). Then, we computed and compared the linear regression equations described by the predicted breeding values for milk yield and components obtained after both models to determine the possibility of excluding genotype from the routine genetic evaluation of the breed given the costs involved in genotyping). Pearson product-moment correlation analysis between the predicted breeding values (PBV) for milk yield, protein, fat and dry matter (expressed in kg) obtained using both the model including the αS1casein genotype and the one excluding it for all the animals included in the pedigree (n = 29,397) of Murciano-Granadina breed goats was carried out to check for the replicability of the results.

Ethics Committee Statement
The study followed the premises described in the Declaration of Helsinki. The Spanish Ministry of Economy and Competitivity through the Royal Decree-Law 53/2013 and its credited entity the Ethics Committee of Animal Experimentation from the University of Córdoba permitted the application of the protocols present in this study as cited in the fifth section of its second article, as the animals assessed were used for credited zootechnical use. This national Decree follows the European Union Directive 2010/63/UE, from the 22nd of September of 2010. Furthermore, the present study works with records rather than live animals directly, hence no special permission was compulsory.

Statistical Analysis of Non-Genetic and Genetic Factors
A summary of the descriptive statistics for the fixed effects and covariates comprising the model testing for Murciano-Granadina goat milk yield and components is shown in Supplementary Materials Table S3. All traits significantly violated normality assumption as suggested by the results of Kolmogorov-Smirnov Test (p < 0.001), kurtosis deviation (0.871 to 1.191) and skewness values (1.218 to 2.897) (Supplementary Materials Table S2). Percentage of components (µ ± SD) in our milk samples were 5.213% ± 1.021%; 3.544% ± 0.557%; 14.308% ± 2.042%, for fat, protein and dry matter respectively. A summary of the output for the Kruskal Wallis H test and partial eta squared coefficient (ηp 2 ) is reported in Table 1.
MANOVA (as the design followed in our model) involves the use of non-independent or repeated measures while ηp 2 quantifies for intergroup differences in our MANOVA plus their associated error variance expressed as a percentage. Then, univariate follow-up tests must be carried to further isolate the groups for which such significant and mean differences can be found. The literature recommends the use of partial eta square instead of eta square after a multifactorial design has been approached, given partial eta square reports an index of association strength between independent variables and dependent factor excluding the variance produced by the rest of factors considered in the model [39]. This relies on the fact that, as the Kruskal-Wallis test statistic is based on a single independent factor, no other factor accounts for the variance explained by this factor itself in the dependent variable, ηp 2 equals η 2 , which is of remarkable importance when we consider non-possibly overlapping empirical variables, such as the ones in this study.
A complete description of the statistical analyses carried out on the nongenetic and genetic fixed effects included in the genetic model can be found in Pizarro et al. [26].

Genetic Model Comparison, Phenotypic and Genetic Parameters Estimation
Estimates of non-genetic fixed effects (BLUES) obtained from the REML quantitative genetic analysis, including age as a linear and quadratic covariate, the fixed effects of farm/owner, birth year, month and season, and birth type, including and excluding genotype from the model are shown in Supplementary Materials Table S4. The estimates for heritability, genetic, phenotypic and environmental variance obtained through REML methods for both model including and excluding the genotype as a fixed effect are shown in Table 2. The genetic (r G ) and phenotypic correlations (r P ) estimated are shown in Table 3.

Breeding Value Prediction and Comparative Descriptive Analysis
A summary of the descriptive statistics of predicted breeding values (PBV), standard error of prediction (SEP), accuracy (RTi) and reliability (R AP ) for the milk yield and components sorted by sex is shown in Table 4. Maximum and minimum PBVs for all traits were slightly to moderately higher in bucks when genotype was excluded than when it was excluded except for fat. By contrast, maximum and minimum PBVs were slightly higher for milk yield and fat, while they were slightly lower for protein and dry matter when genotype was excluded. The broadest range for R AP was reported for bucks (0.45 to 0.71) when compared to does (0.37 to 0.86) when the genotype was included, while the values for the same range remained similar (0.61 to 0.77) for bucks and does (0.62 to 0.81) when genotype was excluded. The lowest R AP was reported for the PBVs for dry matter when genotype was included while the highest was reported for fat under the same premises. Contrastingly, RTi scores, range from 0.67 to 0.84 when genotype is included and from 0.78 to 0.88 when genotype is excluded for bucks, and from 0.61 to 0.93 when genotype is included and from 0.79 to 0.90 when genotype is excluded for does, respectively. Table 5 shows the results for Pearson product-moment correlation analysis between the predicted breeding values (PBV) for milk yield (kg), fat (kg), protein (kg) and dry matter (kg) obtained using both the model including the αS1casein genotype and the one excluding it for all the animals included in the pedigree (n = 29,397) of Murciano Granadina goat breed. Regression equations for the predicted breeding values for milk yield, and components obtained after both models to determine the possibility of excluding genotype from the routine genetic evaluation of the breed given the costs involved in genotyping, are shown in Figure 1.   Table 5 shows the results for Pearson product-moment correlation analysis between the predicted breeding values (PBV) for milk yield (kg), fat (kg), protein (kg) and dry matter (kg) obtained using both the model including the αS1-casein genotype and the one excluding it for all the animals included in the pedigree (n = 29,397) of Murciano Granadina goat breed. Regression equations for the predicted breeding values for milk yield, and components obtained after both models to determine the possibility of excluding genotype from the routine genetic evaluation of the breed given the costs involved in genotyping, are shown in Figure 1.
. Figure 1. Comparison of predicted breeding values for milk yield or performance, protein, fat and dry matter content in kg, determination coefficients (R 2 ) and linear regression equations for their trends obtained using both a model involving the αS1-casein genotype and one excluding it for all the animals included in the pedigree (n = 29,397) of Murciano Granadina goat breed.

Discussion
The evaluation of potentially occurring discrepancies or similarities between the two models included in our study may outline the benefits derived from the inclusion of αS1-CN genotype, considering that genotypic variants could result in an improved efficiency of the selection techniques applied. Then, conjoining the comparison of genetic parameters and predicted breeding values, with the study of the economic repercussion of the relationship and interaction between milk component traits (linear and nonlinear) [40], especially those which may act as relevant predictors of quality in cheese or yoghurt production, the profitability of dairy goat breeding programs can be optimized [41].
Schmitz et al. [41] explained that the advantage of multivariate analyses is based on taking into account not only the expectations of covariates within the pair, but also the information that relates the traits to each other and that of the different observations for the same individual. Through this additional information, the power increases to detect unrelated genetic deviations in cases where there is shared environmental variation. Shared environmental correlations not only help in the detection of joint environmental variance but also heritability. In the same way, phenotypic correlations also contribute to the detection of genetic or shared environmental variance in the multivariate case, albeit on a smaller scale. The only situation in which there will be no increase in power with respect to a univariate case is when the phenotypic correlation between the variables is non-existent. Thus, Neale et al. [42] pointed out that the power of twin studies to detect variations of the genetic additive and environmental components depends on factors such as the effect considered and its potency on the studied variable.
New theoretical and computational developments allow the use of models that better describe biological processes, and as a result, they report predictions of genetic values and estimations of increasingly reliable variance components [43], as they progressively minimize the estimation errors associated with the sample structure and the estimation methodology used [44].
There is a large amount of variation in the literature with respect to estimates of heritability of milk yield and their components (fat, protein and dry matter) in goats [8]. Heritability values for the production of milk, fat, protein and dry matter content expressed in kg when the genotype factor of αS1-CN was considered as fixed effect were 0.40, 0.29, 0.53 and 0.16, respectively. On the other hand, when this factor was excluded, they were 0.40, 0.31, 0.30 and 0.15, respectively (Table 1).
Analla et al. [45] reported heritabilities which were around half of those reported in our study (0.18, 0.16, and 0.25 with single-trait analysis) for milk yield, fat content, and protein content for the same breed, respectively. However, despite genetic correlations between fat and protein content were similar (above 0.90), phenotypic correlations were moderately lower than those reported in our study. Contrastingly, Genetic correlations between milk yield and fat or protein were negative and moderately high. The same study by Analla et al. [45] reported phenotypic correlations between milk yield and protein and fat content that were negative but moderate as well. These negative correlations have been widely reported to be a result of the procedure followed to quantify milk composition using percentages or fractions (g/kg). Our higher heritabilities and lower correlations match some of those found in the literature and were maintained within the ranges of values estimated in other goat populations for milk yield (0.17 to 0.41) [46][47][48]; fat (0.26 to 0.62) [48], protein (0.14 to 0.67) [43,49], and dry matter (0.16 to 0.36) [50]. Estimates of heritability confirm the existence of a considerable level of genetic diversity that allows the possibility to continue improving the production of milk, fat, protein and dry matter in breeding programs and selection of Murciano Grandina breed goats, even if they are highly selected breeds [26,47,49,51], such as the one in our study.
The inclusion of the genotype of αS1-CN in the model reported slightly higher heritability values for protein and dry matter content while the values of the phenotypic correlation between the same variables were slightly lower when the genotype was considered in the model. Estimates of genetic parameters, and among them of heritabilities, of traits can vary considerably between studies because of the breed involved, the population sampled, environmental conditions of the study, and random and systematic errors in the estimation process or measurement [52].
Heritability standard errors for the production of milk, and fat, protein and dry matter content including the αS1 casein genotype as a fixed effect (0.06; 0.05; 0.02 and 0.04, respectively) and for the production of milk, and fat, protein and dry matter content without the inclusion of the genotype αS1 casein as a fixed effect (0.07; 0.05; 0.03 and 0.05, respectively) ( Table 1) were kept within the ranges of estimated values in other populations of dairy goats (0.001 to 0.07) [51], indicating that the estimation of the parameters studied in our samples are reasonable.
The estimation of the coefficient of genetic and phenotypic correlation is of great importance in the implementation of selection processes since it provides an overview of the possible proportion of genes that may cause variation in a population in two or more independent characters simultaneously [53]. In our study, genetic correlations ( Table 2) including and excluding the genotype of αS1-CN were negative between milk yield and protein content (−0.01), fat and dry matter content (−0.09), milk yield and fat or protein content (−0.01) and protein and dry matter content (−0.15).
Phenotypic correlations including and excluding αS1 casein genotype were −0.02 for milk yield and protein content and −0.01 for milk yield/fat and protein content, respectively. Very similar values were found in Saanen and Alpina breeds [48]. These negative values have been reported for genomic correlations between loci and have been attributed to the negative linkage disequilibrium expected for traits under directional selection [54], as would happen in our study. The negative genetic and phenotypic correlations between milk yield and fat, protein, and dry matter content revealed unfavorable associations, which could adversely affect the quality of various dairy products especially those derived from the cheese industry. Therefore, dairy goats must be selected using indices not only considering the relations among these traits but also including αS1 casein genotype provided its association with cheese yield, and a firmer curd [55], as this inclusion may maximize the economic response from the markets commercializing these goat milk-derived products.
The values for genetic correlations showed a positive and slightly variable value from low and moderate both when the genotype of the αS1-CN was considered in and excluded from the model. It is worth highlighting, the genetic correlation between milk yield and fat content and milk yield and dry matter content of 0.01 when the genotype was included. This correlation was -0.01 and 0.00 when on the contrary the genotype was excluded. Also, the highest genetic correlation was the one between the protein and fat content, 0.93 and 0.97, when the genotype was excluded and included, respectively, followed in descending order by the correlation between protein and dry matter (0.14 and 0.23, when the genotype was included and excluded, respectively). The genetic correlations between milk yield and protein content presented the values of −0.02 and −0.01 when the genotype was included and excluded, respectively.
For their part, phenotypic correlations, including and excluding the genotype of αS1-CN reported similar values between the fat and protein components (0.97 and 0.93, when the genotype was excluded and included, respectively) which were within the ranges estimated by some authors in other dairy goat populations [17,[56][57][58][59][60][61].
However, it is worth noting that, although there is an almost total similarity between the values of genetic correlations in both models, which would be expected since we evaluated the same traits in the same population set, phenotypic correlations showed to be moderately different for the rest of possible combinations of traits when the genotype for αS1-CN was included and excluded. The basis for these differences could be based on the explanatory significant linear effect exposed for the genotype, from 8.3% to 9.2%, for the contents in fat and protein, respectively in the categorical regression models previously analyzed. The positive correlations found for fat and protein production could produce an increase in the economic value for this dairy breed, as has been observed for other breeds whose milk is used primarily for cheese production, since cheese production depends on the total amount of protein and fat produced per year per goat to a large extent [46].
Verdier-Metz et al. [62] studied cheese performance calculating it as fresh performance by dividing the weight of fresh curd by the quantity of milk used for cheesemaking and dry matter performance by multiplying fresh performance by molded curd's dry matter value. A wide range of values (55 to 85 g/kg) was reported by these authors for fat and protein ratio between manufactured kinds of milk. The same ratio linearly accounted for 77% of fresh yield and 87% of dry matter yield variability. This suggests higher fat/protein ratio values may contribute to an improved cheesemaking and curd formation performance. Pizarro et al. [26] suggested that such higher fat/protein ration could be possibly reported in goats presenting AE hybrid allelic combinations, as suggested by their findings regarding fat and protein contents being over the median (21.580, 14.878, and 59.808 kg for fat, protein, and dry matter content, respectively).
The accuracy and reliability parameters for the breeding values for milk yield including the genotype of αS1 casein were slightly lower than those reported by the model excluding the genotype for αS1 casein, for milk yield in males in the range of 0.06-0.10 (RTi, R AP respectively). However, for females in both models, excluding and including the genotype effect, the range was 0.04-0.07 higher for the model excluding the genotype of αS1 casein.
On the other hand, the inverse situation occurs for fat content, for which the reliability for breeding values increased in the range of 0.02-0.04 (RTi, R AP respectively) for males and 0.08-0.14 for females (RTi, R AP respectively) when the genotype is included as a fixed effect. The values for R AP range from poorly reliable to very reliable and repeatable. This is supported by RTi scores, which suggest that PBVs may have been calculated based on the animal direct information and some limited indirect pedigree ancestral information or belonging to a small number of its offspring. For instance, the differences found could be attributed to the fact that although direct observations from females were available, from males they were not, as these are unable to produce milk, what compelled us to evaluate them indirectly through the records from the females that were genealogically related to them. This has been suggested to occur in small samples, as the accuracy of a male's breeding values increase as more of his daughters are measured and when further male genomic reliable information is not available [63]. The same situation is described for the predicted breeding values of protein content as their accuracies and reliability were 0.06-0.10 (RTi, R AP respectively), for the males and 0.13-0.23 (RTi, R AP respectively) for females, when the genotype of the αS1 casein was included as fixed effects in the model. On the contrary, the dry matter describes the same trend as in the production of milk, with accuracies and reliability increasing in the range of 0.11-0.14 (RTi, R AP respectively) for males, and 0.18-0.21 (RTi, R AP respectively) for females when the genotype of αS1 casein was not considered as a fixed effect in the model. Some authors have reported the possibility of accuracies to reach values close to zero or even zero for distant or unrelated animals when genetic analyses are performed on deep pedigrees (that is considering a large number of generations, involving a great number of unknown genetic background animals -founders-) as the one in our study (from 1 to 15 depending on the animal). Clark et al. [64] calculated accuracy as the correlation between estimated and true breeding values for 250 animals with no phenotype ranging from one to eight generations (depending on the individual). These authors did not find significant differences in accuracy between shallow or deep pedigree BLUPS when the animals in the pedigree had a close relationship. However, when a shallow pedigree was used, and animals were distantly related (maximum relationship ranging from 0 to 0.125) or unrelated (no pedigree relationship), all breeding values estimated were zero and accuracies close to zero (average relatedness in our pedigree ranged from 0% to 1.07%). In contrast, when pedigree was deep, breeding values were predicted with a significant accuracy when there was a distant relationship between animals and accuracy reduced to very close to zero when animals were unrelated, what may support our findings (Table 4). Genetic evaluations in our study were performed using BLUP and a deep pedigree of 29,397 animals that ranged from one to fifteen generations in length (depending on the individual). Additionally, the highest the number of offspring per dam considered in the pedigree, the highest the values of accuracy may be as reported by Iraqui et al. [65]. Our pedigree comprised animals whose offspring ranged from 0 to 846, which may account for the wide range of values and non-normal distribution presented by reliability, accuracy or standard prediction error parameters (Table 4).
Regarding standard errors of prediction (SEP), the widest confidence ranges where reported for milk yield regardless the sex of the animal and whether the genotype for αS1-casein was included or excluded. Despite the inclusion of the genotype did not affect the highest end of the confidence range, the lowest end was higher when genotype was included, a common trend followed by the confidence ranges of fat, protein and dry matter components. The confidence ranges for these components varied very slightly, with the exception of dry matter when genotype was included, which doubled the value of the confidence range when genotype was excluded for both bucks and does. Hence, the risk when making decisions on animal selection is lower for milk yield and dry matter than for the rest of the components when genotype is included. Still, when genotype was considered in the model all traits reported a high enough level of confidence that overcame the values for confidence ranges when genotype was excluded.
High maximum breeding values for milk yield ranged from 137.41 to 156.94, and from 249.04 to 265.12 when αS1 casein was included and excluded, respectively. These high values indicate some breeding does and bucks present a very good genetic potential to improve milk yield, which could be considered when selecting the matings to perform (natural mating or artificial insemination).
When the coefficient of determination (R 2 ) for the linear functions described by breeding values for milk yield, protein, fat and dry matter content, were compared between the models including and excluding the genotype of the caseins, they did not exceed the range of 0.0002-0.0052, respectively reported for the production of milk and dry matter when the genotype was included (Figure 1). This finding suggested no significant differences in the breeding values obtained whether the genotype for αS1 casein was included or not. However, these values are not fully consistent with those shown in Table 4 for Pearson correlation coefficients, as despite they hold a highly significant correlation (p < 0.001) for breeding values for milk yield and dry matter, it is not high (around 0.061-0.151, respectively), what accounts for the fact that more than linear, this correlation may present a nonlinear nature. Correlations between milk yield and fat or protein contents are conditioned by the fact of including does belonging from different lactational states, hence the low positive or even negative linear correlations reported for such traits and the deviation of the values already reported by literature. Piccardi et al. [66] reported the inclusion of animal random effects, such as in our model, may improve the predictive potential of nonlinear models, but still may not have a significant effect on production indicators such as milk yield, as it was also suggested by the values for heritability obtained in our study resembling those in literature.
The same finding was reported for the correlations between the breeding values obtained for the fat and protein content (0.049-0.056), respectively. These results, together with those of the determination coefficient (R 2 ), suggest the same linear relationship between the breeding values of the traits measured with independence of the inclusion of αS1 genotype in the model.
The PVBs for milk yield are strongly influenced by non-genetic effects, according to Pallete [67], since the environmental factor influences on average up to 70% of milk yield in dairy breeds. On the contrary, dry matter describes the same situation as in the production of milk increasing the reliability and accuracy in the range of 0.11-0.14 (RTi, R AP respectively) for males and 0.18-0.21 (RTi, R AP respectively) for females when the genotype of αS1 casein is not considered as a fixed effect within the model.
Similarly, Carillier-Jacquin et al. [68] suggested αS1-CN genotype may account for a significant effect on milk yield, and composition (fat and protein content) in French dairy goats. These authors reported the fact that considering αS1-casein genotype in genetic and genomic evaluations of bucks αS1 casein genotypes improves the accuracy of the models used in increases that range from 6 to 27%. Similarly, and although it did not reach the same increase in accuracy levels previously reported for males, in genomic evaluations carried out on female records, as in the present study, the accuracy was slightly higher (1% to 14%) than that reported by genomic models which did not involve the genotype for αS1 casein, as also suggested by the slightly higher heritability, precision, and reliability values of estimated breeding values (Tables 1 and 3).
Genomic selection is a useful tool to complement quantitative genetic evaluation and is currently being implemented and used more and more frequently. Its application needs the cooperation between universities, biotechnology centers, researchers, producers, companies, and breeder associations to work on the development of technologies, and incorporate them into our current evaluation system in order to predict the behavior of future progeny for certain characteristics, especially those of prominent profitability.

Conclusions
Our findings match and complement those results emerging from studies of genetic polymorphisms of milk proteins in Murciano-Granadina goats given the practical applications derived with the purpose of improving the efficiencies of dairy goat production. Given the different polymorphic forms of milk proteins that are controlled by autosomal genes inherited following Mendelian inheritance patterns, selecting and breeding for a desirable variant within a specific protein is possible by including the effect of genotype in breeding models. The present estimates for genetic parameters enable using milk protein genes as markers for increasing milk yield and improving its content to agree with market demands more efficiently. The effects of certain genetic variants on milk yield and its components, as denoted by low values for BLUES, render mere statistical associations what may be the base for the lack of information about such relationships in literature. Genetic and phenotypic relationships between milk yield and genetic variants of αS1-CN are inconsistently reported in literature due to their reduced reliability stemming from studies accounting for a small population size, the use of different breeds, very low frequencies of some variants, methods of determining yields, and most importantly, the appropriate rigour of statistical procedures used to correct for the major factors and variables, which in turn contributes to higher milk yield ratios. Our study supports the generally addressed tendency for the αS1-CN B allele to be associated with higher milk yield, and for the C variant of αS1-CN, which is associated with higher fat and protein in the milk. The knowledge of the different allelic forms present for the αS1-CN at the molecular level can be of great help to improve the calculation of the genetic value of the breeding animals and above all to serve as a guide to conduct the breeding and genetic selection of animals producing milk with proteins with favourable characteristics for its use in milk and its derivatives (fresh milk, curd, fermentation etc.). Given the implication of the ratio between the fat and protein content in the cheese yield and quality of the curd formation, the inclusion of the genotype as a fixed effect in those models that are used to assess production of goat's milk and its components can increase the profitability of milk as suggested by the increases in the reliability and accuracy of breeding values for fat and protein, despite the slight decrease in the accuracy of breeding values for other traits such as milk yield and dry matter. However, the negative genetic and phenotypic correlations between milk yield and the content of fat, protein, and dry matter show unfavourable associations, suggesting that the higher the amount of milk produced, the lower its quality is. In line with these results, the positive correlations between fat and protein production, both of which are of great importance for the quality of various dairy products, especially those derived from the cheese industry, suggest the need for the selection of dairy goats based on selection indices that are adapted to these traits considering the genotype of αS1 casein, as they could lead to an increase in the economic value of dairy breeds. The obtained estimates of heritability confirm the existence of a considerable level of genetic diversity that allows the possibility to continue improving the production of milk, fat, protein, and dry matter in breeding and selection programs of Murcia-Granada breed goats.
Supplementary Materials: The following are available online at http://www.mdpi.com/2076-2615/9/9/679/s1, Table S1: Number of observations and goats sorted per αS1-casein (CSN1S1) genotype; Table S2: Descriptive statistics for milk yield (kg), fat (kg), protein (kg), dry matter (kg), % of fat, % of protein and % of dry matter in Murciano-Granadina goat milk; Table S3: Testing for normality in milk yield and composition traits in the whole population of Murciano-Granadina goats (n = 2,359,479 animals); Table S4: Estimates of non-genetic effects (BLUES) obtained from the REML quantitative genetic analysis, including age as a linear and quadratic covariate, the fixed effects of farm/owner, birth year, birth month, birth season and birth type, including and excluding genotype from the model.