Kinematic Sub-Populations in Bull Spermatozoa: A Comparison of Classical and Bayesian Approaches

The ejaculate is heterogenous and sperm sub-populations with different kinematic patterns can be identified in various species. Nevertheless, although these sub-populations are statistically well defined, the statistical differences are not always relevant. The aim of the present study was to characterize kinematic sub-populations in sperm from two bovine species, and diluted with different commercial extenders, and to determine the statistical relevance of sub-populations through Bayesian analysis. Semen from 10 bulls was evaluated after thawing. An ISAS®v1 computer-assisted sperm analysis (CASA)-Mot system was employed with an image acquisition rate of 50 Hz and ISAS®D4C20 counting chambers. Sub-populations of motile spermatozoa were characterized using multivariate procedures such as principal components (PCs) analysis and clustering methods (k-means model). Four different sperm sub-populations were identified from three PCs that involved progressiveness, velocity, and cell undulatory movement. The proportions of the different sperm sub-populations varied with the extender used and in the two species. Despite a statistical difference (p < 0.05) between extenders, the Bayesian analysis confirmed that only one of them (Triladyl®) presented relevant differences in kinematic patterns when compared with Tris-EY and OptiXcell®. Extenders differed in the proportion of sperm cells in each of the kinematic sub-populations. Similar patterns were identified in Bos taurus and Bos indicus. Bayesian results indicate that sub-populations SP1, SP2, and SP3 were different for PC criteria and these differences were relevant. For velocity, linearity, and progressiveness, the SP4 did not show a relevant difference regarding the other sperm sub-populations. The classical approach of clustering or sperm subpopulation thus may not have a direct biological meaning. Therefore, the biological relevance of sperm sub-populations needs to be reevaluated.


Introduction
Fertility in cattle, as in other species, is a key determinant of productivity and, thus, understanding factors that affect fertility in beef and dairy herds is of utmost importance [1]. Most of the artificial insemination performed in dairy cattle is done with frozen-thawed semen, whereas for beef cattle it is mainly used in genetic stations for selection and in developed farms. Although it is well known that sire fertility is related to sperm motility and kinematic patterns [2] the effects of semen quality on reproductive efficiency in cattle are not yet fully understood [3]. Cryopreservation causes damages to spermatozoa resulting in lower fertilizing capacity [4]. Individual variation in semen freezability exists in most domestic species, including bulls [5] and pigs [6], and such variation may be explained, at least in part, by the patterns of motile sperm sub-populations (SPs) in the ejaculate [7].
The intrinsic variability of semen samples, as well as individual variation or differences arising as a result of treatments, can be studied by using computer-assisted sperm analysis (CASA)-Mot systems that allow for the generation of huge datasets consisting of kinematics trajectories from thousands of spermatozoa [8]. CASA systems have evolved rapidly during the last decade due to major innovations in technology [9] such as increases in computational memory [10,11], use of 3D technology [12], considerations of the effect of frame rate capture [13][14][15][16], improvements in camera acquisition [17], or new approaches for sperm cell tracking [11,18,19]. Current CASA-Mot systems can be used to analyze individual sperm kinematics more accurately and this information can be submitted to a multivariate procedure, such as cluster analysis, for an overview of distinct sperm patterns grouped into SPs or clusters [20].
Cluster analysis divides data into groups (clusters) that are meaningful, useful, or both. If the goal is to obtain meaningful groups, then the cluster should capture the natural structure of the data. In some cases, however, cluster analysis is only a useful starting point for other purposes, such as data summarization. Whether for understanding or utility, cluster analysis has long played an important role in a wide variety of fields [21]. In many areas of biology, scientists have devoted considerable effort to generate groups with hierarchical classification criteria. More recently, clustering has been employed to analyze the large amount of information that is obtained by using CASA-Mot systems. For example, clustering has been used to find groups of spermatozoa that have similar patterns. Thus, several studies in many species have identified the existence of different sperm SPs, defined by specific kinematic patterns obtained from CASA-Mot systems [13,[22][23][24][25][26][27][28][29][30][31][32] The aim of the present study was to assess whether statistical differences in kinematic patterns of bull sperm SPs were relevant according to Bayesian analyses. In addition, we examined possible differences in sperm SPs in bull semen of Bos taurus and Bos indicus diluted with different extenders.

Materials and Methods
This study was performed following ethical principles and with the approval of the Committee of Centro de Investigación y Desarrollo de la Agricultura Sostenible para el Trópico Húmedo at the Costa Rica Institute of Technology (CIDASTH-ITCR) according to Section 01/2019, article 1.0, DAGSC-074-2019.

Semen Collection and Processing
Bulls used in this study were of various breeds of Bos taurus and Bos indicus, with an average age of 5.7 ± 2.8 years. Straws of frozen semen were supplied by Avance Genético S.A. (Zapote, Costa Rica). Semen was collected with an artificial vagina, under a collection program of two ejaculates per week, which had no apparent changes in animal health or semen quality throughout the semen collection period. All bulls were routinely used for semen collection for commercial purposes. Data from 20 ejaculates obtained from 10 randomly selected bulls were employed in this study. The bulls had passed a standard breeding soundness evaluation, and had produced sperm with acceptable post-thaw characteristics (progressive motile sperm >60%) and fertility (non-return rate >65%).
Within 5 to 10 min of collection, the semen samples were assessed for volume, by using a conical tube graduated at 0.1 mL, gross motility, by placing 20 µL of fresh raw semen on a prewarmed slide at 37 • C, and concentration, employing a bovine photometer Accucell (IMV, L'Aigle, France) at 530 nm wavelength. The raw semen was diluted with Tris-citric acid-egg yolk extender (Tris-EY) and two commercial egg yolk extenders (OptiXcell ® -IMV, L'Aigle, France; Triladyl ® , Minitube, Tiefenbach, Germany) to give a final live sperm concentration of 25 × 10 6 cells/straw. Diluted semen was cooled slowly to 4 • C at a linear rate of −0.3 • C min −1 in a refrigerator. After cooling, equilibration took place over a period of 4-5 h at the same temperature.
All samples were coded in such a way that the technician who performed the kinematics analysis did not know the number of the bull, the number of the ejaculate, or which ejaculate belonged to a particular bull.

Assessment of Sperm Variables
The cryopreserved semen samples were thawed in a water bath (37 • C, 60 s) and examined immediately after thawing. Disposable counting chambers for the analysis of motility and kinematics variables (ISAS ® D4C20, Proiser R+D, S.L., Paterna, Spain) were used after pre-warming them to 37 • C. After thorough mixing of the diluted semen samples, 2.7 µL of diluted semen were placed in the counting chamber tracks by capillarity. Analyses were conducted with the CASA-Mot system ISAS ® v1 (Integrated Semen Analysis System, Proiser R+D, Paterna, Spain) fitted with a video-camera (Proiser 782M, Proiser R+D), a frame rate of 50 frames per second (fps) and a final resolution of 768 × 576 pixels. The camera was attached to a microscope UB203 (UOP/Proiser R+D) with a 1X eyepiece and a 10× negative-phase contrast objective (AN 0.25), and an integrated heated stage maintained at 37 ± 0.5 • C.

Computerized Kinematic Analysis
CASA analyses were performed recording seven microscope fields with a total of at least 600 cells per sample. The CASA-Mot variables assessed in this study included: straight-line velocity (VSL, µm·s −1 ), corresponding to the straight line from the beginning to the end of the track; curvilinear velocity (VCL, µm·s −1 ), measured over the actual point-to-point track followed by the cell; average path velocity (VAP, µm·s −1 ), the average velocity over the smoothed cell path; amplitude of lateral head displacement (ALH, µm), defined as the maximum of the measured width of the head oscillation as the sperm swims; beat-cross frequency (BCF, Hz), defined as the frequency with which the actual track crosses the smoothed track in either direction; motility (%), the percentage of total motile cells and progressive motility (%), corresponding to spermatozoa swimming rapidly forward in a straight line (assessed as straightness index ≥45%; VAP ≥25 µm·s −1 ). Three progression ratios, expressed as percentages, were calculated from the velocity measurements described above: linearity of forward progression (LIN = VSL/VCL·100), straightness (STR = VSL/VAP·100), and wobble (WOB = VAP/VCL·100).

Statistical Analysis
The data obtained for the analysis of all sperm parameters were first assessed for normality and homoscedasticity by using Shapiro-Wilks and Levene tests. A normal probability plot was used to assess normal distribution. Multivariate procedures were performed to identify sperm SPs from the set of sperm motility data. All the values for kinematic variables were standardized to avoid any scale effect.

Multivariate Analysis
The first process carried out was a principal component analysis (PCA) of these data to derive a small number of linear combinations that still retained as much information as possible from the original variables. The number of principal components (PC) used in the next part of the analysis was determined from the Kaiser criterion, namely selecting only those with an eigenvalue (variance extracted of each PC) >1. Furthermore, Bartlett's sphericity test and the KMO (Kaiser-Meyer-Olkin) were performed [33]. As a rotation method, the varimax method with Kaiser normalization was used [34].
The second process was conducted to perform a non-hierarchical analysis with the k-means model that uses Euclidean distances from the quantitative variables after standardization of these data, so the cluster centers were the means of the observations assigned to each cluster [35]. The multivariate k-means cluster analysis was made to classify the spermatozoa into a reduced number of SPs (clusters) according to their kinematic variables. In the final process, to determine the optimal number of clusters, the final centroids were clustered hierarchically using the Ward method [36]. Thus, the clustering procedure enables for the identification of sperm SPs because each cluster contributed to a final cluster formed by the spermatozoa linked to the centroids. The analysis of variance (ANOVA) and χ 2 -test procedures were applied to evaluate statistical differences in the distributions of observations (individual spermatozoa) within SPs and then a generalized linear model (GLM) procedure was used to determine the effects of the breed and extender type on the mean kinematic variable values defining the different sperm SPs (i.e., the cluster centers). Differences between means were analyzed by Bonferroni test. Results are presented as mean ± standard error of the mean (SEM). Statistical significance was considered at p < 0.05. All data were analyzed using IBM SPSS package, version 23.0 for Windows (SPSS Inc., Chicago, IL, USA).

Bayesian Analysis
Differences in sperm kinematic patterns were estimated with a model including the effect of extender and species as a permanent effect. Male number was included as a random effect. All analyses were performed using Bayesian methodology. The posterior mean of the difference between extenders (D), the highest posterior density region at 95% (HPD 95% ) and the probability of the difference being positive when D > 0 or negative when D < 0 (P 0 ) were calculated. Bounded uniform priors were used for all effects. Residuals were a priori normally distributed with mean 0 and variance σ 2 e . We considered one third of the standard deviation (s.d.) of a trait as a relevant value (R), and we also calculated the probability of relevance (P R ; i.e., the probability of the difference being greater than R when D > 0 or lower than R when D < 0). The priors for the variances were also bounded uniform. Features of the marginal posterior distributions for all unknowns were estimated using Gibbs sampling. Convergence was tested using the Z criterion of Geweke [37] and Monte Carlo sampling errors were computed using time series procedures described in [38]. The Rabbit program, developed by the Institute for Animal Science and Technology (Valencia, Spain), was used for all procedures.

Subpopulation Structure
After multivariate cluster analysis (k-means model), four motile sperm SPs with different kinematic patterns were identified. There was extender effect (p < 0.05) on SPs distribution of the percentages motile-and progressively motile spermatozoa ( Figure 1). Summary data for kinematics of the SPs are presented in Table 1. They can be summarized as follows: subpopulation 1 (SP 1 ) included sperm with medium velocity (intermediate values of VCL, VSL, and VAP) and linear and progressive motility (inferred from intermediate LIN and STR values). This population included 20.02% of the total motile sperm. Subpopulation 2 (SP 2 ) contained active, fast, linear, and progressive sperm, as indicated by the greater value of VCL, VSL, and VAP, together with the lower values of LIN and STR, and intermediate value of BCF. Moreover, the ALH value was the lowest seen in all SPs, indicating movement with few undulatory characteristics. About 28.29% of the total motile sperm were assigned to this subpopulation. Subpopulation 3 (SP 3 ) included 19.14% of the total sperm and was represented by slow motile sperm with the lowest velocity but non-undulatory as revealed by values of VAP, VCL, VSL, and BCF. This population had lesser progressive motility than the other SPs as indicated by LIN and ALH values. Subpopulation 4 (SP 4 ) contained 32.54% of the total motile sperm population, and these cells had the highest motility and progressive motility, but movements were undulatory, as indicated by the high values of VCL, VSL, VAP, ALH, and BCF, together with the highest values of LIN, STR, and WOB. SPs, indicating movement with few undulatory characteristics. About 28.29% of the total motile sperm were assigned to this subpopulation. Subpopulation 3 (SP3) included 19.14% of the total sperm and was represented by slow motile sperm with the lowest velocity but non-undulatory as revealed by values of VAP, VCL, VSL, and BCF. This population had lesser progressive motility than the other SPs as indicated by LIN and ALH values. Subpopulation 4 (SP4) contained 32.54% of the total motile sperm population, and these cells had the highest motility and progressive motility, but movements were undulatory, as indicated by the high values of VCL, VSL, VAP, ALH, and BCF, together with the highest values of LIN, STR, and WOB.

Bayesian Analysis of Sperm Sub-Populations
Bayesian analysis of the data showed that VCL, VAP, LIN and BCF exhibited relevant differences between Tris-EY and Triladyl ® , that VCL and VAP presented relevant differences between Tris-EY and OptiXcell ® , and that only WOB showed relevant differences between Triladyl ® and OptiXcell ® . In the other cases, despite differences being significant (p < 0.05) between all extenders (VSL) or at least two extenders (LIN, WOB), they were not relevant (Table 4).  Bayesian analysis of the data (posterior distribution) when comparing Bos taurus and Bos indicus showed that even when differences (p < 0.05) were identified ensuing frequentist statistics (confidence distribution), these differences between kinematic variables were considered non-relevant (Table 5).  When the relevance of differences between sperm sub-populations was analyzed, the Bayesian analyses of data showed relevant differences between the following comparisons of SPs: SP 1 −SP 2 (for VSL and LIN); SP 1 −SP 3 (all kinematic variables except VCL, WOB and BCF); SP 1 −SP 4 (for VSL); SP 2 −SP 3 (all variables except VSL); SP 2 −SP 4 (for STR, WOB, ALH, and BCF). Thus, the sub-populations SP 1 , SP 2 and SP 3 showed differences (p < 0.05) for most of the kinematic variables when using the classical approach, and these differences were also revealed by the Bayesian approach. On the other hand, the SP 3 −SP 4 comparison did not show relevant differences for any kinematic variable. Even though subpopulation SP 4 showed differences regarding SP 2 or SP 3 , such differences should be regarded as non-relevant for VCL, VSL, VAP, LIN, WOB, or BCF (Table 6).  1 D: mean of the marginal posterior distribution of the difference between sperm subpopulations: SP 1 , SP 2 , SP 3 , and SP 4 ; 2 HPD 95% , highest posterior density region at 95%; 3 P 0 , probability of the difference being greater than zero when D > 0 and probability of the difference being lower than zero when D < 0; 4 P R , probability of the difference being greater than R when D > 0 and less than R when D

Distribution of the Sperm Subpopulation
Analysis of the proportion of sperm cells in each subpopulation for the three extenders revealed differences between extenders and SPs for each extender. The sperm SPs were unevenly distributed for each extender. The sperm sub-populations with highest number of cells were associated with a specific extender [(OptiXcell ® : SP 2 , 36.06%; Triladyl ® : SP 4 , 35.62%; Tris-EY: SP 4 , 34.66%). On the other hand, sub-populations with lower percentages of cells in Tris-EY and OptiXcell ® were associated with SP 1 (12.00%) and SP 3 (13.60%), respectively (Table 7).

Discussion
Cryopreservation is frequently used in animal production and there are several sources of variation in the success of semen survival during this procedure [4]. During freezing and thawing variation relates, for instance, to the effect of extender, which can influence the kinematic patterns of spermatozoa. Breed and inter-individual variation post-thawing can relate to differences in plasma membrane properties in response to freezing and thawing. Cryopreservation can result in alterations in the semen quality due to increases of oxidative stress which in turn can generate changes in sperm motility and kinematic patterns, DNA fragmentation, alterations in timing of capacitation or the acrosome reaction [39]. The evaluation of these changes is very difficult, if not impossible, when employing a classical analysis of semen quality and, thus, the use of CASA technology is important for the improvement of sperm quality assessments [2,[40][41][42]. CASA-Mot systems have demonstrated greater accuracy in motility evaluation compared with subjective methods [10]. Moreover, these systems provide abundant information on many kinematic variables for each spermatozoon [8,43] that can be used as a basis for the analysis of the heterogeneity of sperm populations. However, a strategy used by artificial insemination centers is to compensate damage to spermatozoa by increasing the number of motile sperm cells in the insemination doses, and this can mask the relevance of sperm SPs.
In bulls, it has been suggested that sperm SPs with greater percentages of fast yet nonlinear spermatozoa seem to have greater fertilization capacity [25]. Other work indicates that fast and nonlinear sperm movements are in fact specific of hyperactivated spermatozoa [44]. Results of other studies suggest that the sperm SP containing fast and linear spermatozoa is the one with a higher likelihood of sperm-oocyte interaction [28]. Despite these attempts at linking SP structure and function, including sperm fertility, the abundant data provided by CASA-Mot systems for multivariate procedures, and the resulting SP, are still insufficient to categorize an ejaculate in relation to its quality. Moreover, the classification of SPs through classical clustering procedures remains uncertain and even controversial. This relates to the observation that the multivariate procedures of clustering will find several kinematics patterns in the data set, even if there are no natural clusters in the data [45].
The multivariate approach of PCs and cluster analysis conducted for several species underscores that ejaculates are nevertheless heterogeneous in that they contain spermatozoa with different motility and kinematic patterns [17,22,31,[46][47][48][49][50][51][52][53]. In the present study, we identified four different sperm SPs described from three PCs that represented velocity, progressiveness, and cell undulatory movement. The proportions of the different SPs varied with the extender used and among species. The cluster of fast and linear sperm did not follow a consistent pattern between SPs; in any case, the samples diluted with Triladyl ® showed the greatest kinematic activity. In our study, despite differences (p < 0.05) between extenders, the Bayesian analysis revealed that only Triladyl ® presented relevant differences in comparison to Tris-EY and OptiXcell ® for these kinematic patterns. Our results indicate that subpopulation distribution was also different between and within bovine species. In contrast, other studies have indicated that the sperm subpopulation distribution was not different within species [5]. On the other hand, for the two bull species (Bos taurus, Bos indicus), the overall distribution of sperm subpopulation structure showed little variation among individuals.
The application of biostatistical analysis tools to the study of sperm kinematic variables revealed the existence of sperm SPs in the ejaculate [9,13,23,[54][55][56][57][58][59], but their significance and biological relevance does not appear to be entirely clear [8,20,27,31,60]. Our findings of varying biological relevance of sperm SPs in the ejaculate strengthen the idea that a specific sub-populational structure, depicted by spermatozoa with different kinematic patterns, must be revised. Other authors have suggested some sperm SPs can be affected by a treatment [26] or by specific processes (e.g., freezing-thawing) indicating that some sperm SPs may have biological relevance [25] whereas others, co-existing in the same sample, may not [8,26,61,62]. Our Bayesian results indicated that SP1, SP2, and SP3 were different and that these differences were relevant. The biological relevance of these SPs may thus lead to an invalidation of the classical definition of the sperm cluster or subpopulation. Relevance may relate to fertility. The variability of spermatozoa in an ejaculate can correlate with the variability of fertility among males [63], and lesser motility can also associate with lower fertility rates [2]. If velocity is a principal component that explains the kinematic variance of sperm SPs, and velocity can relate to fertility, we thus need to further clarify the biological relevance of these SPs.
Sperm SPs should then be assessed using a criterion of utility in relation to biological importance. To our knowledge, this is the first study in which an assessment of statistically relevance of ejaculate SPs has been carried out. Distinction between species, and between breeds, should be taken into account and implemented in this type of analysis. Previous studies have considered these distinction and comparisons carried out between species with regards to kinematics [16] and morphology have identified differences [64]. The fact that there are differences between species suggest that new approaches may be required when managing the ejaculates. In any case, these differences must be relevant from the biological point of view and statistical differences may not be sufficient as criterion. There is thus a need to further examine the issue of biological relevance in the analysis of sperm SPs and to develop means to explore the biological meaning of data sets of clustered cells deriving from automated semen analyses.

Conclusions
Current approaches for the analysis of sperm kinematic SPs may not have a direct biological meaning and therefore, the biological relevance of sperm SPs needs to be reevaluated. Funding: This work was supported by Fundación para el Fomento y Promoción de la Investigación y Transferencia de Tecnología Agropecuaria de Costa Rica (FITTACORI) and Costa Rica Institute of Technology (Vice-Chancellor's office of Research and Extension; VIE (Vicerrectoría de investigación y Extensión); Project 5402-2151-1014-). The funders had no role in study design, data collection, and analysis, decision to publish, or preparation of the manuscript.