Factor Analysis of Genetic Parameters for Body Conformation Traits in Dual-Purpose Simmental Cattle

Simple Summary
Body conformation traits are closely related to economically important characteristics and should be considered in cattle breeding programs. The large number of body conformation traits recorded by classifiers can complicate the analysis. Factor analysis can reduce the number of variables by combining two or more variables into a single, biologically meaningful factor. The results of this study could be used by breeders to define conformation indexes and implement genetic assessments for conformation traits in dual-purpose breeds.

Abstract
In this study, we estimated the genetic parameters for 6 composite traits and 27 body conformation traits of 1016 dual-purpose Simmental cattle reared in northwestern China from 2010 to 2019 using a linear animal mixed model. To integrate these traits, the following methods were used: (1) estimation of genetic parameters for composite and individual body conformation traits based on the pedigree relationship matrix (A) and the combined genomic-pedigree relationship matrix (H); (2) factor analysis to explore the relationships among body conformation traits; and (3) estimation of genetic parameters of factor scores using A and H, and of the correlations between the estimated breeding values (EBVs) of the factor scores and the EBVs of the composite traits. Heritability estimates of the composite traits using A and H were low to medium (0.07–0.47). The 24 common latent factors explained 96.13% of the total variance. Among factors with eigenvalues ≥ 1, F1 was mainly related to body frame, muscularity, and rump; F2 was related to feet and legs; F3, F4, F5, and F6 were related to teat placement, teat size, udder size, and udder conformation; and F7 was related to body frame. Single-trait analysis of factor scores yielded heritability estimates that were low to moderate (0.008–0.43 based on A and 0.04–0.43 based on H). Spearman and Pearson correlations, derived from the best linear unbiased prediction analysis of composite traits and factor scores, showed a similar pattern. Thus, incorporating factor analysis into the morphological evaluation to simplify the assessment of body conformation traits may support the genetic improvement of dual-purpose Simmental cattle.


Introduction
Dual-purpose Simmental cattle, a popular breed, exhibit high milk and meat production, good fertility and profitability [1,2,3,4]. Dual-purpose Simmental cattle were introduced to Northwest China in the 1950s. Core breeding farms have resulted in improvements in more than 400,000 cattle in China over several years [5]. These core breeding farms record milk production, milk composition, body measurement, and body weight every month. The breeding goals for dual-purpose Simmental cattle in Northwest China are milk production, milk quality, and growth traits. The average 305-day milk yield reached 5469 kg, the average fat content was 4.13%, the protein content was 3.33% [3], and the average daily weight gain from birth to 24 months of age was 0.70 kg·d⁻¹ [6].
Body conformation traits are closely related to economically important traits, such as milk production [7], reproduction [8], health [9], profitability [10], and lifespan [11]. Therefore, from an economic perspective, studying the genetics of body conformation traits is as important as studying production traits. Understanding the genetic parameters of body conformation traits is crucial for implementing breeding programmes. Multiple regression has been used to analyze the relationships among body conformation traits, and the results have shown that the traits are correlated genetically and phenotypically [12]. For instance, Simčič et al. [12] found high genetic and phenotypic correlations between body frame traits in first-parity Rendena cows. In other cattle breeds, VanRaden et al. [13] and Mazza et al. [14] found strong genetic correlations between rear udder height and rear udder width, with values ranging from 0.85 to 0.95. Using a large number of traits that share common information in multiple regression can lead to biased estimates of their relationship with productive traits [15]. Factor analysis is a useful multivariate technique for analyzing correlated traits, and it can remove the redundant information introduced by incorporating multiple variables [16,17]. Mazza et al. [18] and Olasege et al. [19] suggested the use of latent factors in genetic evaluation, which avoids the analysis of highly correlated traits and thereby improves precision and reduces the computational burden for large datasets.
To date, a linear identification of body conformation traits of dual-purpose Simmental cattle in Northwest China has not been conducted, and breeding programmes do not include linear-type traits. This study aims to estimate genetic parameters for body conformation traits in dual-purpose Simmental cattle using factor analysis. The results could be used to develop a national genetic evaluation framework for the improvement of body conformation traits for dual-purpose Simmental cattle in China.

Materials and Methods
Approval from the local Institutional Animal Care and Use Committee was not required for this study because the data were obtained by field measurements, and no animal experiments were conducted.
Conformation traits of 1200 dual-purpose Simmental cattle born from 2010 to 2019 were measured at Xinjiang Hutubi Farm, Kekedala Chuangjin Farm, and Xinjiang Haozi Animal Husbandry Farm in Northwest China. After quality control was applied using a threshold of the mean ± three standard deviations, 1016 conformation records remained and were used for analysis. The pedigree file used for the analysis included data for 1988 animals, with each animal traced back three generations. In the full dataset, one sire had a maximum of 195 offspring with records, whereas 16 sires had only one offspring. More than 62 dams had 2 or more offspring.
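The mean ± three standard deviations rule can be illustrated with a minimal sketch. The function name, the use of the sample standard deviation, and the stature values below are assumptions for illustration only; they are not the study's data or code.

```python
from statistics import mean, stdev

def filter_three_sd(values):
    """Keep values that lie within mean ± 3 sample standard deviations."""
    m = mean(values)
    s = stdev(values)
    lower, upper = m - 3 * s, m + 3 * s
    return [v for v in values if lower <= v <= upper]

# Hypothetical stature records (cm): 30 plausible values plus one recording error.
stature_cm = [140.0 + (i % 5) - 2 for i in range(30)] + [210.0]
kept = filter_three_sd(stature_cm)
print(len(stature_cm), len(kept))
```

In practice the filter would be applied trait by trait, so that an animal is dropped only for the trait in which its record is extreme, or for all traits, depending on the editing rule chosen.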

Genotype Data
The Illumina 100K Bovine BeadChip was used to genotype 516 Simmental cows. For analysis, common SNPs were obtained from 100K bead chips as target files. Quality control of the SNP genotyping was carried out with PLINK 1.07 software (Boston, MA, USA) [20]. All genotyped animals had a call rate greater than 0.90. SNPs were removed if the call rate was less than 0.90 or the minor allele frequency (MAF) was less than 0.01. After quality control, data for 88,913 SNPs from 516 animals remained for analysis.
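The two SNP filters can be sketched as follows. This is an illustration rather than the PLINK implementation: it assumes genotypes coded as allele counts (0, 1, 2) with `None` for missing calls, and it assumes a SNP is discarded when it fails either threshold. The SNP names and genotypes are hypothetical.

```python
def snp_passes_qc(genotypes, min_call_rate=0.90, min_maf=0.01):
    """Return True if a SNP meets the call-rate and MAF thresholds.

    genotypes: list of 0/1/2 allele counts, with None for missing calls.
    """
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes)
    if call_rate < min_call_rate:
        return False
    freq = sum(called) / (2 * len(called))  # frequency of the counted allele
    maf = min(freq, 1 - freq)               # minor allele frequency
    return maf >= min_maf

# Hypothetical genotypes for three SNPs across ten animals.
snps = {
    "snp_a": [0, 1, 2, 1, 0, 1, 2, 0, 1, 1],        # passes both filters
    "snp_b": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],        # monomorphic, MAF = 0
    "snp_c": [0, 1, None, None, 2, 1, 0, 1, 2, 0],  # call rate = 0.80
}
kept = [name for name, g in snps.items() if snp_passes_qc(g)]
print(kept)
```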

Genetic Connectedness
To estimate genetic parameters, the genetic connectedness of the animals in the dataset must be determined. If the breeding process affects the genetic connectedness of cattle at different farms over time, this change would be reflected in a change in the average relatedness in birth-year cohorts and among the populations at different farms. The indirect method of Sargolzaei [21], implemented in the software package CFC, was used to compute the coefficient of relationship among animals.
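As an illustration of the quantity being tracked, the sketch below averages pairwise coefficients of relationship within birth-year cohorts, starting from an additive relationship matrix. It does not reproduce the indirect algorithm used by CFC; the function name and the small matrix are hypothetical, and the coefficient of relationship is taken in Wright's form, a_ij / sqrt(a_ii · a_jj).

```python
from itertools import combinations
from math import sqrt

def mean_relationship_by_cohort(A, birth_year):
    """Average pairwise coefficient of relationship within each birth year.

    A: additive relationship matrix as a list of lists.
    birth_year: list giving the birth year of each animal, in matrix order.
    """
    totals = {}
    for i, j in combinations(range(len(A)), 2):
        if birth_year[i] != birth_year[j]:
            continue
        r = A[i][j] / sqrt(A[i][i] * A[j][j])
        s, c = totals.get(birth_year[i], (0.0, 0))
        totals[birth_year[i]] = (s + r, c + 1)
    return {year: s / c for year, (s, c) in totals.items()}

# Hypothetical example: two full sibs born in 2010, two half sibs born in 2011.
A = [
    [1.0, 0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.25],
    [0.0, 0.0, 0.25, 1.0],
]
cohort_means = mean_relationship_by_cohort(A, [2010, 2010, 2011, 2011])
print(cohort_means)
```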

Variance Component Estimates for Body Conformation Traits
The single-trait animal model was used to estimate genetic and residual variance for the 6 composite traits and 27 individual body conformation traits with the average information restricted maximum likelihood (AI-REML) method. The AIREMLF90 procedure of BLUPF90 1.0.1 software (Athens, GA, USA) was used [22]. The animal linear mixed model for the single-trait analysis was as follows:
$Y = X\beta + Za + e$
where Y is the vector of observations for the trait analyzed (one of the 6 composite traits or 27 individual body conformation traits); β is the vector of the fixed effects, including herd-year of evaluation (5 different levels), days in milk (ten classes: from 10 to 30 days after calving, from 31 to 270 days after calving in 30-day intervals, and >270 days after calving), age at first calving (seven classes: <23 months, from 23 to 34 months in 2-month intervals, and >34 months), and parity (four classes: 1, 2, 3, and ≥4); a is the vector of random animal additive genetic effects; e is the vector of random residual effects; and X and Z are the incidence matrices assigning observations to fixed and random animal effects, respectively. The genetic effect was modeled using two kinds of genetic variance-covariance matrices: the pedigree relationship matrix (A) [23], and the combined genomic-pedigree relationship matrix (H) [24].
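To make the pedigree relationship matrix concrete, the following sketch builds A with the classical tabular method. It is an illustration under stated assumptions (parents listed before their offspring, unknown parents given as `None`, hypothetical animal names), not the routine used by the software cited above.

```python
def relationship_matrix(pedigree):
    """Additive relationship matrix A by the tabular method.

    pedigree: list of (animal, sire, dam) tuples, ordered so that parents
    appear before their offspring; unknown parents are None.
    """
    index = {animal: i for i, (animal, _, _) in enumerate(pedigree)}
    n = len(pedigree)
    A = [[0.0] * n for _ in range(n)]
    for i, (_, sire, dam) in enumerate(pedigree):
        s = index.get(sire)
        d = index.get(dam)
        # Diagonal: 1 + inbreeding (half the relationship between the parents).
        inbreeding = 0.5 * A[s][d] if s is not None and d is not None else 0.0
        A[i][i] = 1.0 + inbreeding
        # Off-diagonal: average of the relationships with the two parents.
        for j in range(i):
            a_ij = 0.0
            if s is not None:
                a_ij += 0.5 * A[j][s]
            if d is not None:
                a_ij += 0.5 * A[j][d]
            A[i][j] = A[j][i] = a_ij
    return A

# Hypothetical pedigree: two founders, two full sibs, and an inbred calf.
pedigree = [
    ("S", None, None),
    ("D", None, None),
    ("P1", "S", "D"),
    ("P2", "S", "D"),
    ("X", "P1", "P2"),
]
A = relationship_matrix(pedigree)
for row in A:
    print(row)
```

The combined matrix H additionally blends genomic relationships for the genotyped animals into A; that step requires matrix inversion and is not shown here.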
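The heritability and its standard error were estimated according to Austin Putz et al. [25]. With only additive genetic and residual variance in the model, heritability is $h^2 = \sigma_a^2 / (\sigma_a^2 + \sigma_e^2)$, where $\sigma_a^2$ is the additive genetic variance and $\sigma_e^2$ is the residual variance.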
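For a model with only additive genetic and residual variance, heritability is the ratio of the additive variance to their sum. The sketch below computes that ratio together with a first-order (delta method) standard error. The delta method is an assumption chosen for illustration and may differ from the cited procedure; the variance components and their sampling (co)variances are hypothetical.

```python
from math import sqrt

def heritability_with_se(var_a, var_e, var_of_var_a, var_of_var_e, cov_ae):
    """Heritability and an approximate standard error (delta method).

    var_a, var_e: additive genetic and residual variance estimates.
    var_of_var_a, var_of_var_e, cov_ae: sampling variances and covariance
    of those two estimates (e.g. from the inverse information matrix).
    """
    var_p = var_a + var_e
    h2 = var_a / var_p
    # Partial derivatives of h2 with respect to var_a and var_e.
    d_a = var_e / var_p ** 2
    d_e = -var_a / var_p ** 2
    var_h2 = (d_a ** 2 * var_of_var_a
              + d_e ** 2 * var_of_var_e
              + 2 * d_a * d_e * cov_ae)
    return h2, sqrt(var_h2)

# Hypothetical variance components and sampling (co)variances.
h2, se = heritability_with_se(2.0, 6.0, 0.64, 0.36, -0.30)
print(round(h2, 3), round(se, 3))
```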

Factor Analysis
Factor analysis was performed using the FACTOR procedure in SAS 8.0 software (Cary, NC, USA). In this analysis, a set of n observed variables ($y_1, \ldots, y_n$) is synthesized into a new set of p (p < n) latent variables ($X_1, \ldots, X_p$), which are referred to as common latent factors. As described by Kaiser (1960), varimax rotation was used to maintain the orthogonality of the extracted factors. Only factors with eigenvalues ≥ 1 were retained for the analyses (i.e., the Kaiser criterion; Russel [26], Mazza et al. [18], Olasege et al. [19]). The factors were interpreted from a biological point of view by examining the loadings of the individual body conformation traits. Factor scores for each animal were calculated from the standardized scoring coefficients. According to Russel [27], the classic factor analysis equation specifies that a measure being factored can be represented by the following equation accounting for n factors:
$X_m = W_{m1}F_1 + W_{m2}F_2 + \cdots + W_{mn}F_n + U_m + e_m$
where $X_m$ is the m-th measure, $F_n$ is the n-th common factor that underlies the m-th measure being analyzed, and $U_m$ is the factor that is unique to the m-th measure. Furthermore, $W_{mn}$ represents the loading coefficient of the m-th measure on the n-th factor, whereas $e_m$ reflects the random measurement error in the m-th measure. Using this equation, we can divide the variance in the measure being factored into three parts. The first part reflects the impact of the common factors, the second part reflects the influence of the unique factor associated with the measure, and the third is the variance of the random error [26].
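The step from standardized scoring coefficients to per-animal factor scores can be sketched as a weighted sum of standardized trait values. The trait names, records, and coefficients below are hypothetical, and the function names are not part of SAS or any other package.

```python
from statistics import mean, stdev

def standardize(columns):
    """Convert each trait's records to z-scores (sample standard deviation)."""
    z = {}
    for trait, values in columns.items():
        m, s = mean(values), stdev(values)
        z[trait] = [(v - m) / s for v in values]
    return z

def factor_scores(columns, scoring_coefficients):
    """Factor score = sum over traits of coefficient x standardized record.

    columns: dict trait -> list of records (same animal order in every list).
    scoring_coefficients: dict factor -> dict trait -> coefficient.
    """
    z = standardize(columns)
    n_animals = len(next(iter(columns.values())))
    return {
        factor: [sum(coefs[t] * z[t][i] for t in coefs) for i in range(n_animals)]
        for factor, coefs in scoring_coefficients.items()
    }

# Hypothetical records for three animals and one factor.
traits = {"stature": [138.0, 140.0, 142.0], "body_depth": [76.0, 78.0, 80.0]}
coefficients = {"F1": {"stature": 0.6, "body_depth": 0.4}}
scores = factor_scores(traits, coefficients)
print(scores["F1"])
```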

Estimation of Genetic Parameters Using Factor Analysis
Genetic parameters of the factor scores were estimated with the single-trait animal model presented above, fitting each factor score as Y. The estimated breeding values (EBVs) of the factor scores were then subjected to correlation analysis with the EBVs of the composite traits using SPSS.
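The two correlation measures reported later for the EBVs can be sketched in plain Python. Spearman's coefficient is computed here as the Pearson correlation of average ranks; the EBV values are hypothetical and the functions are illustrative, not the SPSS implementation.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical EBVs for five animals: same ranking, non-linear relationship.
ebv_factor = [0.1, 0.4, 0.2, 0.9, 0.5]
ebv_composite = [1.0, 2.0, 1.5, 10.0, 3.0]
r = pearson(ebv_factor, ebv_composite)
rs = spearman(ebv_factor, ebv_composite)
print(round(r, 3), round(rs, 3))
```

Because the rank-based coefficient ignores the spacing between EBVs, it indicates whether two evaluations would rank animals in the same order, which is the comparison most relevant to selection decisions.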

Results

Phenotype
The descriptive statistics for the 6 composite traits and 27 individual body conformation traits, including the mean, standard deviation, minimum, maximum, and coefficient of variation are summarized in Table 1. In the composite traits, the coefficient of variation ranged from 2.52% (final score) to 7.73% (rump). For the individual body conformation traits, the coefficient of variation ranged from 3.20% (stature) to 77.16% (udder depth).

Genetic Connectedness
All three farms use artificial insemination for breeding. Figure 1 shows the trend of the average coefficient of relationship by year of birth of Simmental cattle. The relationship coefficient among Simmental cattle born on the three farms from 2010 to 2019 ranged from 0.0179 to 0.0613, with irregular variation. Of the annual changes in the coefficient of relationship, the largest change was from 2017 to 2018.

Heritability of Conformation Traits
The heritability estimates for the 6 composite traits and 27 individual body conformation traits are presented in Table 2. Using the pedigree relationship matrix (A), the estimates for the composite traits ranged from 0.07 (muscularity composite) to 0.43 (body frame composite); using the combined genomic-pedigree matrix (H), the estimates for the composite traits ranged from 0.10 (muscularity composite) to 0.47 (body frame composite). The estimate for the final score was 0.18 from A and 0.14 from H. In general, the highest estimates of heritability were obtained for body frame traits, whereas the estimates for muscularity traits and feet and legs traits were low. For the individual body conformation traits, the heritability estimates from A ranged from 0.05 (HLHC and RAB) to 0.56 (ST), and those from H ranged from 0.03 (rear udder length; RUL) to 0.65 (ST). The standard errors of the heritability estimates were all ≤ 0.10, except for those for ST and RL. In addition, estimates for the other body conformation traits from A and H were very similar. There was little improvement in the accuracy of the estimated heritability, i.e., the standard errors of the two models were similar. However, the average heritability estimated using H was higher than that estimated using A, except for individual muscularity traits.

Factor Analysis
The eigenvalues and the proportion of total and cumulative variance explained by each factor are listed in Table 3. The 24 factors after varimax rotation explained 96.13% of the total variation among the 27 individual body conformation traits. The first factor (F1) accounted for the largest proportion (13.51%) of the total variability. The first 9 factors with eigenvalues ≥ 1 were retained for further analysis. The varimax-rotated factor pattern coefficients and communalities are reported in Table 4. Only loading coefficients with an absolute value ≥ 0.40 [27] were reported for each body conformation trait.
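How communalities, the share of variance explained by each factor, and the 0.40 reporting threshold follow from a loading matrix can be sketched as below. The loadings are hypothetical and the calculation assumes standardized traits, so that the total variance equals the number of traits.

```python
def summarize_loadings(loadings, threshold=0.40):
    """Summaries derived from a matrix of factor loadings.

    loadings: dict trait -> list of loadings, one per retained factor.
    Returns (communality per trait, percent variance per factor,
    loadings whose absolute value reaches the threshold).
    """
    n_traits = len(loadings)
    n_factors = len(next(iter(loadings.values())))
    # Communality: variance of a trait accounted for by the retained factors.
    communality = {t: sum(l * l for l in row) for t, row in loadings.items()}
    # Percent of total variance explained by each factor.
    pct_variance = [
        100.0 * sum(row[f] ** 2 for row in loadings.values()) / n_traits
        for f in range(n_factors)
    ]
    # Loadings large enough to be reported, as (factor number, loading).
    salient = {
        t: [(f + 1, l) for f, l in enumerate(row) if abs(l) >= threshold]
        for t, row in loadings.items()
    }
    return communality, pct_variance, salient

# Hypothetical rotated loadings for three traits on two factors.
loadings = {
    "stature": [0.8, 0.1],
    "body_depth": [0.6, 0.3],
    "foot_angle": [0.1, 0.7],
}
communality, pct_variance, salient = summarize_loadings(loadings)
print(communality, pct_variance, salient)
```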

Heritability of Factor Scores
Variance components for nine different factor scores using the different relationship matrices are shown in Table 5. Heritability estimates for the pedigree relationship matrix had a mean value of 0.18 with a standard error of 0.08, whereas for the combined genomic-pedigree matrix, the mean value of heritability was 0.20 with a standard error of 0.08 for all considered factor scores. In particular, the lowest heritability estimates were for F2 (feet and legs factor score and muscularity factor score) and F4 based on both methods. However, for both matrices, the highest values of heritability observed were for F1, a factor score accounting for the body frame and rump individual body conformation traits. Factor 3 and Factor 5 (i.e., the mammary system factor score) exhibited medium heritability values based on both matrices. In general, there was no significant difference in the heritability estimates of factor scores obtained using the pedigree relationship matrix or the combined genomic-pedigree matrix.

Correlations between EBV of Composite Traits and EBV of Factor Scores
Results of the Spearman and Pearson correlation analyses (only absolute values ≥ 0.20 reported) between EBV of composite traits and EBV of factor scores are reported in Tables 6 and 7. The results of the Spearman and Pearson correlation analyses were very similar. Correlation coefficients exhibited patterns very similar to the loading coefficients of the individual body conformation traits for F1, F2, and F4. EBVs obtained for F1 were highly positively correlated with the EBVs of the body frame, muscularity, and rump traits. In addition, Spearman and Pearson correlations between EBVs of F2 and EBVs of muscularity and feet and legs-related traits were positive, consistent with results previously reported for the loading coefficients between individual traits and the second latent factor. This pattern was also observed for F4; Spearman and Pearson correlations between EBVs of F4 and EBVs of mammary system-related traits were positive. However, similar results were not observed for F3, F5, F6, F7, F8, and F9.

Discussion
In this study, the heritability of 6 composite traits and 27 individual body conformation traits ranged from 0.03 to 0.65. The 24 common latent factors explained 96.13% of the total variation in the 27 individual body conformation traits. Heritability estimates for the factor scores ranged from 0.008 to 0.43. The Spearman and Pearson correlation results revealed that the correlation coefficients between the EBVs of the factor scores and the EBVs of the composite traits exhibited a very similar pattern to that of the loading coefficients of the individual body conformation traits for F1, F2, and F4.

Phenotype
Linear scoring of dual-purpose Simmental cattle has not been conducted in Northwest China. We quantified the body conformation traits that could be reliably measured to develop a linear scoring criterion. However, some individual body conformation traits were difficult to measure, such as rib and bone, rear leg rear view, and fore udder attachment. We scored ten hard-to-measure individual body conformation traits on a 9-point linear scale. The means of all scored traits ranged from 4 to 6. Similar findings were reported by Strapáková et al. [28] and Zavadilová et al. [29]. In addition, similar findings have been reported for Chinese Holstein cattle [19], US Brown Swiss dairy cattle [9], and Rendena and Aosta Red Pied dual-purpose breeds [18].
The mean values for body frame (85.14) and feet and legs scores (86.72) were slightly higher than those for Slovak Simmental dairy cows (81.35 and 81.06, respectively). In contrast, scores for muscularity (80.72) and the mammary system (78.77) were similar for the two populations [28]. For body measurement traits, the mean values of stature (140.65 cm) and body depth (78.41 cm) were slightly lower than those of Slovak Simmental dairy cows (144.31 cm and 82.92 cm, respectively) [28] and slightly higher than those of Czech Fleckvieh cows (137.40 cm and 77.40 cm, respectively) [29]. In addition, the rump length (51.83 cm) of dual-purpose Simmental cattle in Northwest China was lower than that of Slovak Simmental dairy cows (53.31 cm) [28] and Czech Fleckvieh cows (52.80 cm) [29]. In general, the body size of dual-purpose Simmental cattle in Northwest China needs to be improved.

Heritability
In this study, except for muscularity traits, the heritability estimates of the composite traits were consistent with those reported for dual-purpose Rendena cattle [14], dual-purpose autochthonous Valdostana cattle [30], German and French dairy cattle [31], and Czech Fleckvieh cattle [32]. Previous studies [32,33] reported muscularity to be a medium to high heritability trait; this differs from the conclusion based on the results of our study and may be due to different definitions of muscularity. The heritability of individual body frame traits was high (0.11 to 0.65), followed by the rump traits (0.15 to 0.34) and mammary system traits (0.03 to 0.34), and the heritability was low for feet and legs traits (0.07 to 0.16) and muscularity traits (0.04 to 0.09). As reported by Kern et al. [34], Gibson et al. [9], and Spehar et al. [35], RUW and MS are moderately heritable traits (0.12–0.17) in Brazilian Holstein cattle, American Brown Swiss cattle, and Slovenian Brown Swiss cattle, while the results of this study indicated low heritability of these traits (0.04–0.06). Roveglia et al. [36] reported a heritability of 0.07 for RUW in their study of Italian Jersey cattle, which was similar to the results of our study (0.07). The differences in magnitude observed across these studies may be due to the scales used for measurement and scoring, the number of animals, breeds, statistical models, data editing procedures, and consistency among evaluators [37].
In addition, we investigated the heritability of composite traits and individual body conformation traits using the H matrix in dual-purpose Simmental cattle. Comparing the estimates of heritability and their standard errors for each trait using A and H, the estimates were similar for every conformation trait except ST, and, judged against the standard errors, none of the differences was significant. It is possible that the construction of H was primarily determined by the information in A because too few individuals had genotypes; the genotypes therefore had little influence on the genetic parameter estimates. In addition, the small number of individuals with phenotypic and pedigree records and the specific environmental effects on farms may have influenced the results of the genetic parameter estimation in this study. Some researchers have shown that genomic information can improve the accuracy of genetic parameter estimation for breeding target traits. For example, Veerkamp et al. [38] and Wei et al. [39] demonstrated that rescaling H according to the eigenvalues of A slightly changed the genetic variances. However, there are several reasons why estimated genetic variances differ between models using pedigree and genomic relationships. First, A and H use different scales for the diagonal elements, especially when considering the Mendelian sampling component in H. A second reason is related to the genetic structure of cattle populations. The third reason is the accuracy of pedigree records. Naserkheil et al. [40] and Song et al. [41] demonstrated that single-step GBLUP provides a more accurate prediction than traditional BLUP for all the studied traits. In the current study, so few individuals had genotypes that combining data from genotyped and nongenotyped animals was not worthwhile. In future studies, additional individuals with available genotypes will be included in the genomic analysis of conformation traits in dual-purpose Simmental cattle.

Factor Analysis
In multivariate statistical analysis, factor analysis is one of the classical tools [18,19,42,43]. Several studies have shown that a small number of factors can be used to describe the cow's conformation without reducing accuracy [18,19,44,45]. An essential aspect of the present study is the algebraic sign and magnitude of the loading coefficients and the percentage of the total variance explained by each factor. A trait with a high loading coefficient contributes more to the factor than one with a low loading coefficient [44]. Once the loading coefficients are determined with a varimax rotation, it is possible to posit a biological interpretation of the factors [46]. Kaiser introduced the varimax rotation criterion, which maximizes the sum, over factors, of the variances of the squared loadings [47]. Generally, factor analysis can be understood as a data-reduction technique that removes duplicate information from a collection of correlated variables [43]. The most frequently used factor analysis procedure in the literature has been the matrix transformation step, followed by the extraction of all factors with eigenvalues ≥ 1.0 using the principal-factor method and then rotation of these factors by varimax [18,19,45,48]. Finally, the factors can be explained by identifying the traits with the largest loadings. Since the procedure is available in many statistical packages (e.g., SAS and SPSS), it is relatively easy to use.
Phenotypic factor analysis in this study extracted 24 principal components, which accounted for 96.13% of the total variance among the 27 body conformation traits. Chu and Shi [43] found that factors with eigenvalues > 1 explained 49.1% of the total variance in type traits of Holstein cows in the Beijing area. A similar value was found for the first six latent factors in a study of Aosta Red Pied cattle [18]. A similar percentage of the total variance explained was found in a factor analysis of the Rendena breed conducted by Mantovani et al. [48]. The results of the analysis indicated that high values of F1, representing body frame, rump, and muscularity traits, were associated with greater height and buttocks size. High correlations were also noted in previous studies [49,50]. As reported by Manafizar et al. [51], ST, CW, and BD exhibited a strong genetic correlation with residual feed intake, and the work of Dadati et al. [52] indicated that the rump score had the highest genetic correlation with easier calving. Therefore, selection based on F1 may improve the milk production, meat production and reproductive performance of dual-purpose cattle. F2 is characterized by high and positive loading coefficients for heel depth and foot angle, traits that are usually associated with lameness. Thus, the selection of dual-purpose cattle based on high scores for feet and legs, steeper foot angle, straighter legs, and fine bone structure might improve locomotion and lower the risk of claw disorder [53,54]. F3, F4, F5, and F6 were udder trait-related factors, indicating the size and quality of the mammary system. Mazza et al. [18] reported that F3 and F4 reflected the mammary system in dual-purpose autochthonous breeds. High values of F3 were associated with teat placement, high values of F4 were associated with thick and long teats, high values of F5 were related to large udders, and high values of F6 were associated with shallow, strong, and balanced udders.
Researchers have extensively studied the genetic correlation between conformation traits and SCS [9,55,56,57]. The strongest correlations were obtained for FUA, FTP, and UD. F3, F4, F5, and F6 included traits that are usually associated with SCS, and a selection index based on higher udders with tighter attachments and closer teats would be favorable for reducing SCS [58]. Dube et al. [59] found that narrow teat placement and low, shallow udders were strongly correlated with low SCC in the South African Holstein population. Evaluating latent factors instead of the original traits is an interesting approach. However, genetic evaluation based on latent factors is almost never applied routinely in any country.

Correlations between EBV of Composite Traits and EBV of Factor Scores
Consistent with the findings of Mantovani et al. [48], the Spearman correlation (rs) analyses between composite trait EBVs and factor score EBVs showed patterns very similar to the loading coefficients of individual traits in latent factors. For instance, the EBVs obtained for F1 exhibited a high correlation with the EBVs of body frame, muscularity, and rump traits (0.55 < rs < 0.67 and 0.59 < r < 0.69). Additionally, the EBVs of F2 were positively correlated with the EBVs of feet and legs (rs = 0.20 and r = 0.33), and the EBVs of the mammary system and udder conformation factor (i.e., F4) were positively correlated with the EBVs of the mammary system (rs = 0.30 and r = 0.39). The Spearman and Pearson correlation results for F7 and the composite trait EBVs were similar to those of F1, exhibiting high correlations with body frame, muscularity, and rump. Because of the generally high Spearman and Pearson correlations between the EBVs of factor scores and the respective EBVs of body conformation traits associated with those factors, factor scores could be used to guide animal breeding. However, it is crucial to select factors prudently since the random error could attenuate any further analysis based on the newly extracted variable in the factor score [26]. Heritability estimates of the nine factor scores showed that, with both matrices, the most heritable factor was related to body frame, muscularity, and rump traits (F1), whereas the least heritable factor was related to feet and legs traits (F2).
The amount of phenotypic and genotypic data collected was small due to the late start of the linear assessment of body conformation and genomic evaluation of dual-purpose Simmental cattle in Northwest China. In the future, we will establish a protocol for type classification in dual-purpose Simmental cattle and conduct annual body conformation linear identification and incorporate it into the breeding programme.

Conclusions
In this study, the body conformation traits showed a range of heritability from low to high, with stature yielding the highest estimates. The number of animals with recorded body conformation traits is rather low, especially the number of genotyped animals, which led to little difference in the precision of estimating heritability using the pedigree relationship matrix and the combined genomic-pedigree matrix. The factor scores exhibited low to medium heritability, and the generally high Spearman and Pearson correlations between EBVs of Factor 1, Factor 2, and Factor 4 and the corresponding EBVs for composite traits suggested their utility in selection programmes. These analyses suggest that a few factors can describe a variety of body conformation traits without reducing the accuracy of genetic assessments.