Article

Investigating the Genetic Background of Spastic Syndrome in North American Holstein Cattle Based on Heritability, Genome-Wide Association, and Functional Genomic Analyses

1 Centre for Genetic Improvement of Livestock (CGIL), Department of Animal Biosciences, University of Guelph, Guelph, ON N1G 2W1, Canada
2 Department of Biomedical Sciences, Ontario Veterinary College, University of Guelph, Guelph, ON N1G 2W1, Canada
3 Department of Animal Sciences, Purdue University, West Lafayette, IN 47907, USA
4 Department of Clinical Studies, Ontario Veterinary College, University of Guelph, Guelph, ON N1G 2W1, Canada
* Author to whom correspondence should be addressed.
Current address: Zane Cohen Centre for Digestive Diseases, Division of Gastroenterology, Mount Sinai Hospital, Toronto, ON M5T 3L9, Canada.
Genes 2023, 14(7), 1479; https://doi.org/10.3390/genes14071479
Submission received: 1 July 2023 / Revised: 12 July 2023 / Accepted: 17 July 2023 / Published: 20 July 2023
(This article belongs to the Special Issue Genetics and Genomics of Cattle)

Abstract

Spastic syndrome is a chronic, progressive disorder of adult cattle characterized by episodes of sudden involuntary muscle contractions or spasms of the extensor and abductor muscles of one or both hind limbs. In this study, a case-control genome-wide association study (GWAS) was performed on an adult Holstein cattle cohort. Based on the 50 K and high-density (HD) SNP panel GWAS, we identified 98 and 522 significant SNPs, respectively. The most significant genomic regions identified are located on BTA9 at approximately 87 megabase pairs (Mb) and BTA7 between 1 and 20 Mb. Functional analyses of significant SNPs identified genes associated with muscle contraction, neuron growth or regulation, and calcium or sodium ion movement. Two candidate genes (FIG4 and FYN) were identified. FIG4 is ubiquitously expressed in skeletal muscle and FYN is involved with processes such as forebrain development, neurogenesis, locomotion, synapse development, neuron migration, and the positive regulation of neuron projection development. The CACNA1A gene, which codes for a calcium channel subunit protein in the calcium signaling pathway, seems the most compelling candidate gene, as many calcium ion channel disorders are non-degenerative and produce spastic phenotypes. These results suggest that spastic syndrome is of polygenic inheritance, with important genomic areas of interest on BTA7 and BTA9.

1. Introduction

Genomic selection has been widely adopted worldwide by the dairy cattle industry and has contributed to great genetic progress in many economically important traits [1]. Intensive selection pressure has been accompanied by high inbreeding levels [1,2,3], creating a situation where the risk of breeding carriers of deleterious traits is higher, due to repeatedly selecting relatively few animals with desirable phenotypes. The identification, isolation, and eradication of genetic disorders in Holstein cattle is pertinent due to strong worldwide pressure for high-producing cows that are minimally affected by welfare issues and cause reduced environmental impact [3].
Spastic syndrome is a chronic, progressive neuromuscular disorder of adult cattle characterized by episodes of sudden involuntary muscle contractions or spasms of the extensor and abductor muscles of one or both hind limbs. These episodes are induced by movement and are particularly noticed when the animal stands up and is adapting to bear weight. No signs of spastic syndrome are exhibited when the animal is lying down. In the later stages, the spasms can involve the epaxial thoracolumbar and neck muscles [4,5,6,7,8,9,10,11,12,13].
Some of the names applied to this syndrome include crampiness, stretches, crampy, barn cramps, periodic spasticity, inherited periodic spasticity, neuromuscular spasticity, bovine biceps femoris neuropathy, and spasmodic extensor hypertonus [5,7,10,11,13,14,15]. The term “standings disease” [10,11,12] has also been used because it appears to be more common in bulls tied by the neck in single stalls. In Europe, the condition has most commonly been called “Krämpfigkeit des Rindes” (i.e., crampiness of cattle) [4,8,16]. Other European designations include “Krampe”, “Stallkrampe”, “Gikt”, “Strek”, “Gliedersucht”, and “Stallkrampf” [16,17]. The diversity of names ascribed to the clinical manifestation of spastic syndrome emphasizes the confusion surrounding its true nature [10].
Spastic syndrome occurs in adult cattle of both sexes [5,6,7,8,9,10,12,13,16,18] and has never been diagnosed clinically in calves [18]. The first signs are usually noticed between 3 and 7 years of age [5,7,8,14,16,18,19] and are most commonly observed in cattle over 6 years of age [5,9]. In a series of 65 cases, 49 (77%) were over 6 years of age, with no cases under 3 years of age [5]. The incidence of spastic syndrome is considerably higher in bulls [5,16,18], and bulls of at least 3 years old are considered to be at greatest risk [9]. The condition is particularly a problem in mature bulls maintained in artificial insemination (AI) breeding units [5,8,11]. In a limited number of AI studs and privately owned bulls, Roberts [8] observed an incidence of 10–30% of mild-to-severe spastic signs in older proven bulls. More recently, the prevalence has been reported to be as high as 30% in older, proven dairy sires [12]. In cows, the condition occurs most commonly in animals between four and eight years of age and the signs are usually milder [5,12]. In dairy cows, the higher producing cows in a herd seem to be the most affected [4,6,16,18]. Spastic syndrome has been reported in most breeds of cattle. However, it primarily occurs in dairy and dual-purpose breeds [5,7,8,9,18,20].
The condition has been reported in the following breeds: Holstein-Friesian [5,6,7,9,10,19,21], Guernsey [5,7,9], Ayrshire [5,6,7,9], Brown Swiss [7,9], Jersey [7,9], Norwegian Red cattle [17,18], Danish Red dairy cattle [18], Simmental [22], Shorthorn [5,7], Angus [7], Hereford [7,12], Bos indicus crossbreds [5], and Holstein crossbreds [19]. Spastic syndrome is a disorder that decreases the welfare and functional lifespan of affected animals. The progression of clinical signs can span from 1 to 10 years, ultimately leading to culling or death [5,6,12]. There is no specific treatment for spastic syndrome. Empirical therapy with muscle relaxants and anti-inflammatory drugs can alleviate pain and discomfort during an acute episode of spasticity [5,6,13,16,20]. Currently, the phenotypic detection of spastic syndrome can be difficult. This is due to the adult onset of the disorder, the mild nature of initial clinical signs, and the possibility of signs being missed during recumbency [12]. Genetic studies on spastic syndrome in cattle are uncommon and there are few estimates of its degree of inheritance in the literature. Therefore, ascertaining preliminary heritability estimates is logical to determine the extent to which spastic syndrome is heritable. Estimating heritability of spastic syndrome as a binary trait (where 0 = unaffected and 1 = affected) assumes an underlying normally distributed disease liability.
Heritability estimates for spastic syndrome were previously reported by Boettcher and Wang [23] for Holstein cattle in Canada, resulting in an estimated heritability of 0.23 in the underlying liability scale and 0.003 in the observed binomial scale, with a population prevalence of 0.11%. No disease criteria were defined by Boettcher and Wang [23], and cattle were relatively young (between the ages of 22 and 34 months) when classified for spastic syndrome, which is not optimal for the detection of a late-onset disorder [5,12]. Most of the evidence suggests that the mode of inheritance of spastic syndrome is autosomal recessive with incomplete penetrance [7,8,14,20], or a polygenic trait [13].
Identifying genomic markers and deriving candidate genes for disorders of unknown inheritance could allow for future genetic testing panels to target known detrimental alleles of specific at-risk populations, such as sires in artificial insemination (AI) stations. Also, creating a candidate gene list in an initial affected population will guide future research to genomic areas of potential interest, and mapping significant SNPs to genes may assist in creating a test panel for spastic syndrome disease risk. A test panel could allow producers to identify which young bulls may be at risk of developing and disseminating spastic syndrome before they are widely used in the population. In this context, a genome-wide association study (GWAS) to screen for areas of interest associated with bovine adult-onset spasticity is an appropriate strategy to uncover genomic regions of interest related to spastic syndrome. Therefore, the main objectives of this study were to: (1) measure the degree of inheritance of spastic syndrome via heritability estimation using all available pedigree information; (2) interrogate both moderate and high-density (HD) single nucleotide polymorphism (SNP) genotypes for the discovery of SNPs associated with spastic syndrome through the GWAS; and (3) perform in silico and functional analyses of significant SNPs obtained from the GWAS of a cohort of affected animals, using HD SNP genotypes to obtain a list of candidate genes for spastic syndrome.

2. Materials and Methods

2.1. Spastic Syndrome Phenotypic Data

Spastic syndrome disease status was previously determined by two veterinarians (Dr. Hanna and Dr. Baird) via differential diagnosis, where other diseases/disorders presenting with similar clinical signs were evaluated and systematically eliminated. Animals that were at least three years of age and displaying episodes of spasms of the extensor and abductor muscles of the pelvic limb(s) were considered to have spastic syndrome, once other causes for hind-limb spasm had been ruled out, such as evident injury, tetanus, or epilepsy [5]. Additionally, adult cattle displaying signs of other neuromuscular diseases were also excluded to provide a clearer distinction between affected and unaffected animals. Because cattle under three years of age were not included in this study, spastic paresis (or “Elso heel”) was eliminated as a possible confounder since it is a congenital disease usually observed in calves three to six months of age [24,25]. The animals identified as unaffected with spastic syndrome had reached at least eight years of age with no evidence of hind limb spasticity.

2.2. Genotypic Data and Quality Control

All animals were genotyped with the Illumina BovineSNP50 BeadChip (Illumina Inc., San Diego, CA, USA). This 50 K SNP panel contained 41,847 SNPs after the quality control following [26], with an average of 51.5 Kb spacing between SNPs. Only autosomal chromosomes and SNPs with known genome positions according to the UMD_3.1 bovine assembly map [27] were used in the present study. The imputed high-density (777 K) SNP panel contained 305,341 SNPs evenly spaced across the genome after imputation from the 50 K SNP panel genotype.
The pedigree data contained 79,613 Holstein cattle from Canada, the USA, Germany, Denmark, Spain, Finland, France, Great Britain, Italy, and the Netherlands. There were 11,227 sires, 51,350 dams, and 9363 founder animals. Within the pedigree, there were 41,264 inbred animals with an average inbreeding coefficient of 0.0441, obtained using the software CFC [28].

2.3. Phenotypic and Genotypic Data for the Study Cohort

An initial study population consisted of 24 Canadian Holstein bulls (15 affected, 9 unaffected) and 16 American Holstein bulls (11 affected, 5 unaffected). In total, there were 26 affected and 14 control animals, as summarized in Table 1.
The phenotypic categories to which animals were assigned were: unaffected, affected mild, affected moderate, affected severe, and uncertain. To add statistical power to the analyses, Lactanet provided phenotypic and genotypic data from additional North American Holsteins diagnosed as “affected” or “unaffected” with spastic syndrome by local veterinarians. This included data for 131 additional affected animals, bringing the total number of affected animals to 157: 19 Canadian cows, 2 American cows, 77 Canadian bulls, and 59 American bulls. Data for 94 more control animals were also provided, bringing the total number of control animals to 108: 87 Canadian cows, 2 American cows, 12 Canadian bulls, and 7 American bulls. Genotypes for these animals were obtained from Lactanet (www.lactanet.ca; Guelph, ON, Canada). The study animals were genotyped with the Illumina BovineSNP50 BeadChip (Illumina Inc., San Diego, CA, USA), and imputed to the HD (777 K) SNP panel. The genome-wide accuracy of imputation was over 99% when using a family- and population-based imputation method (FImpute, [29]).

2.4. Heritability Estimation

To estimate the heritability of spastic syndrome, a binomial model was fitted using the ASReml software [30], including the overall mean and the random animal additive genetic effect. The pedigree information of 42,010 Holstein cattle was used to calculate the additive relationship matrix for fitting the animal genetic effect. The ASReml software uses an approximate likelihood technique, penalized quasi-likelihood (PQL; [30]). The probit link function was used in the generalized linear model [30]. The heritability estimate on the underlying scale was transformed to the observed binary scale using the following transformation [31,32]:
$$\hat{h}_o^2 = \hat{h}_u^2 \, \frac{w^2}{\alpha (1 - \alpha)}$$
where $w = \frac{e^{-0.5 z^2}}{\sqrt{2\pi}}$, $\hat{h}_o^2$ is the heritability estimate on the observed scale, $\hat{h}_u^2$ is the heritability estimate on the underlying scale, $\alpha$ is the proportion of affected individuals within the population of interest, $e$ is the exponential constant, $z$ is the standard normal deviate corresponding to the proportion of affected animals ($\alpha$), and $\pi$ is the mathematical constant pi.
Using this transformation, heritability estimates for different reported prevalence rates within either a general population of classified Holstein cows of 0.55% [33] or a Holstein sire population within AI facilities of 30% [12] were obtained. In addition, the prevalence of spastic syndrome of 59.25%, i.e., 157 affected animals out of 265 animals, was also used.
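The transformation can be illustrated with a minimal Python sketch, using only the standard library. The function name `observed_scale_heritability` is hypothetical, and the example takes the underlying-scale estimate of 0.41 reported in this study together with the three prevalence values named above; it is an illustration of the formula, not the code used for the analyses.

```python
from statistics import NormalDist


def observed_scale_heritability(h2_underlying: float, prevalence: float) -> float:
    """Convert a liability-scale heritability to the observed 0/1 scale.

    Implements h2_o = h2_u * w^2 / (alpha * (1 - alpha)), where w is the
    standard normal density at the deviate z corresponding to alpha.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    std_normal = NormalDist()              # mean 0, standard deviation 1
    z = std_normal.inv_cdf(prevalence)     # standard normal deviate for alpha
    w = std_normal.pdf(z)                  # e^(-0.5 z^2) / sqrt(2 pi)
    return h2_underlying * w ** 2 / (prevalence * (1.0 - prevalence))


# Prevalences considered in the text: general cow population, AI sires, study cohort.
for alpha in (0.0055, 0.30, 157 / 265):
    h2_o = observed_scale_heritability(0.41, alpha)
    print(f"prevalence = {alpha:.4f}, observed-scale heritability = {h2_o:.3f}")
```

Because w depends only on z squared, the sign convention chosen for the deviate does not change the result.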

2.5. Genome-Wide Association Study via Generalized Quasi-Likelihood Score

Single nucleotide polymorphism (SNP) association analysis was carried out using the Sleuth software (Sargolzaei, University of Guelph, Guelph, ON, Canada), which uses a generalized quasi-likelihood score (GQLS) method to analyze genetic associations between SNPs and the trait considered [34]. The GQLS method can be performed on either a quantitative or a binary trait [34]; spastic syndrome was treated as a binary trait, where affected animals were scored ‘1’ and unaffected animals were scored ‘0’. A logistic regression was used to associate the disease status with the genotypes, treated as covariates and response variables, respectively. Analyses were carried out one SNP at a time, where X = (X1, …, Xn)′, with Xi defined as the phenotypic (disease status) observation of the ith animal (0 or 1), and where Y = (Y1, …, Yn)′ represents the genotypes of the animals, with Yi = ½ × (the number of alleles 1 in the genotype of animal i). With genotypes coded as “0”, “1”, and “2”, the allelic proportions were 0, ½, or 1. The expected SNP allele frequency is represented by μ, where μ = (μ1, …, μn)′ = E(Y|X), such that 0 < μi < 1. To associate the SNP allele frequency, μi, with the phenotype, Xi, the following logistic regression model was defined:
$$\mu_i = E(Y_i \mid X_i) = \frac{e^{\beta_0 + \beta_1 X_i}}{1 + e^{\beta_0 + \beta_1 X_i}}$$
where β0 is a constant (intercept) and β1 is the angular coefficient (slope). To verify the association between the marker and the trait, the null hypothesis, H0: β1 = 0, under which the marker is not associated with the trait (spastic syndrome), was tested against the alternative hypothesis, H1: β1 ≠ 0, under which the marker is associated with the trait.
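The genotype coding and the mean model can be sketched in a few lines of Python. This is only an illustration of the logistic mean function under the notation above; it is not the Sleuth implementation of the GQLS test, and the function names and coefficient values are hypothetical.

```python
import math


def allelic_proportion(genotype: int) -> float:
    """Map a genotype coded 0, 1 or 2 (copies of allele 1) to Y_i = 0, 1/2 or 1."""
    if genotype not in (0, 1, 2):
        raise ValueError("genotype must be coded 0, 1 or 2")
    return genotype / 2.0


def expected_allele_frequency(beta0: float, beta1: float, status: int) -> float:
    """Logistic mean model: mu_i = exp(b0 + b1 * X_i) / (1 + exp(b0 + b1 * X_i))."""
    eta = beta0 + beta1 * status
    return 1.0 / (1.0 + math.exp(-eta))  # algebraically equal to e^eta / (1 + e^eta)


# Under H0 (beta1 = 0) the expected allele frequency is the same for both groups.
null_unaffected = expected_allele_frequency(-0.5, 0.0, status=0)
null_affected = expected_allele_frequency(-0.5, 0.0, status=1)

# Under H1 (beta1 != 0) the expected allele frequency differs by disease status.
alt_unaffected = expected_allele_frequency(-0.5, 0.8, status=0)
alt_affected = expected_allele_frequency(-0.5, 0.8, status=1)
print(null_unaffected, null_affected, alt_unaffected, alt_affected)
```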
The 50 K SNP genotypes and the imputed HD genotypes were analyzed with a minor allele frequency (MAF) threshold of less than or equal to 0.00001. This permissive threshold keeps virtually all SNPs of possible interest in the analysis, including rare variants. The minimum call rates for individuals and SNPs were both set to 0.90, and SNPs with a heterozygosity excess greater than or equal to 0.499 were excluded. Significance thresholds were set to 1%, 5%, and 10% chromosome-wise positive false discovery rates (pFDRs). The pFDR was used to increase the sensitivity of detecting SNPs with potentially small effects on the disease, as the pFDR allows for more statistical power than traditional control of family-wise error rates (FWER; [35]).
To narrow down a potentially large number of significant SNPs identified in the GWAS for the HD data, a post-GWAS MAF threshold of 0.2 was applied to obtain SNPs in relative abundance, as spastic syndrome prevalence is approximately 60%. Additionally, a chi-square test was performed to determine which significant SNPs were significantly different when compared with disease status with a threshold of 0.05. The population proportionate distributions of homozygotic SNPs were calculated in the affected and unaffected population, and the differences of the homozygous states were taken. Then, the absolute difference of the differences in homozygotic status was retrieved to identify the potential polarity of homozygous SNP genotypes and disease status using a threshold of 0.3. This analysis highlights SNPs with one homozygotic variant, which is observed more frequently within affected populations, while the alternative homozygotic variant is more frequent in the unaffected population.

2.6. Genome-Wide Association via Random Forest

An alternative GWAS approach based on random forest (RF) regression was used to validate significant SNPs found through the GQLS logistic regression method. Liaw and Wiener [36] developed this non-parametric machine learning method to approach the issue of having a large number of parameters to estimate, i.e., SNP effects, with a small number of observations, i.e., spastic syndrome cases and controls. RF regression is a tree-based ensemble machine learning tool for the classification/regression of multiple variables. Both the trait value and the SNP marker are shuffled to provide robustness against model overfitting [36,37]. The approach is as follows: (a) a random subset of binary observations is selected (i.e., affected or unaffected animals); (b) a random subset of SNP markers is selected; (c) a single tree is created by recursively splitting the subset of SNPs in the subset of samples to form tree nodes; (d) additional animals are sampled with replacement using ‘out-of-bag’ (OOB) data, explained below, and the prediction error of the tree is determined; (e) a forest of trees is generated by repeating steps a–d; and (f) the final SNP variable importance (as VIM%) is obtained by averaging the prediction error values across all trees in the forest [37].
To determine the prediction error of the test set, a process called ‘bagging’ is used, where successive trees do not depend on earlier trees and where all trees are independently constructed using a bootstrap sample of the data [37]. Approximately one third of the observations are left out of each bootstrap sample and used for the OOB error estimate, where a simple majority vote is taken to obtain the error estimate [36,37]. The weighting of the prediction error was even between disease statuses, and training sets showed lower error rates when regressing on the affected status. Lower error rates for identifying SNPs implicated in spastic syndrome were considered adequate, as this is a screening study and its results may include SNPs that are not truly associated with spastic syndrome. To optimize the performance of RF in the HD genotype analyses, the big random forest (bigRF) implementation was also used [38]. BigRF allows trees in the forest to be grown in parallel on a single machine, or to be built in parallel on multiple machines and then merged into one forest to decrease computing time and computational load [38]. Once the RF procedure is completed, all SNPs are ranked based on their VIM%. The VIM% is also known as the “mean decrease in accuracy”, where large positive VIM% values indicate a SNP of interest. Essentially, if removing a SNP with a large VIM% from the forest results in an increase in prediction error, the SNP is considered to be ‘important’ [37].
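The size of the OOB set follows from sampling with replacement: the probability that a given animal is never drawn in a bootstrap sample of size n is (1 − 1/n)^n, which approaches e^(−1) ≈ 0.37 as n grows. The short simulation below illustrates this; the sample size of 40 matches the initial set of animals, while the seed, the number of replicates, and the function name are arbitrary choices made for the sketch.

```python
import random


def out_of_bag_fraction(n_animals: int, rng: random.Random) -> float:
    """Draw one bootstrap sample of size n and return the share of animals never drawn."""
    drawn = {rng.randrange(n_animals) for _ in range(n_animals)}
    return 1.0 - len(drawn) / n_animals


n_animals = 40                       # size of the initial set of animals
rng = random.Random(2023)            # arbitrary seed, for reproducibility only
fractions = [out_of_bag_fraction(n_animals, rng) for _ in range(1000)]
mean_oob = sum(fractions) / len(fractions)

# Theoretical probability that a given animal is left out of one bootstrap sample.
expected_oob = (1.0 - 1.0 / n_animals) ** n_animals
print(f"simulated OOB share: {mean_oob:.3f}; theoretical value: {expected_oob:.3f}")
```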
The RF regression was performed using the initial set of animals to estimate the VIM% of SNPs in the HD SNP genotype panel (305,341 SNPs) as the training dataset. The results were used for prediction purposes for the additional 225 phenotyped animals as a testing dataset. The forest was grown to 75 trees, as preliminary research suggests cumulative estimation and prediction error rates do not decrease with an increase in forest size, up to 1500 trees. The forest was replicated 100 times to limit spurious associations. Identified SNPs with occurrence frequencies in the top 50% were classified as genomic areas of interest. All SNPs with a non-zero VIM% were saved and compared with other forest results. Common SNPs from at least two forests were obtained, and average VIM% values were calculated. These ‘significant’ SNPs were then compared with results obtained from the HD SNP genotype panel GQLS regression for possible common areas of significance. The analysis was performed genome-wide to ascertain which SNPs were important on a genomic scale.
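The aggregation across replicate forests (keep SNPs with a non-zero VIM%, retain those seen in at least two forests, and average their VIM%) can be sketched as follows. The data structure, function name, SNP names, and VIM% values are all made up for illustration; the actual analysis was run with the RF software cited above.

```python
from collections import defaultdict


def summarize_forests(forests, min_forests=2):
    """Return {snp: (number of forests, mean VIM%)} for SNPs recurring across forests.

    Each element of `forests` is a dict mapping SNP name -> VIM% for one replicate.
    """
    vims = defaultdict(list)
    for forest in forests:
        for snp, vim in forest.items():
            if vim != 0.0:               # only SNPs with a non-zero VIM% are saved
                vims[snp].append(vim)
    return {
        snp: (len(values), sum(values) / len(values))
        for snp, values in vims.items()
        if len(values) >= min_forests    # SNP must recur in at least `min_forests` forests
    }


# Toy input: three replicate forests with hypothetical SNP names and VIM% values.
toy_forests = [
    {"snp_a": 0.004, "snp_b": 0.001, "snp_c": 0.0},
    {"snp_a": 0.002, "snp_c": 0.003},
    {"snp_a": 0.003, "snp_b": 0.0},
]
summary = summarize_forests(toy_forests)
for snp, (count, mean_vim) in summary.items():
    print(f"{snp}: found in {count} forests, mean VIM% = {mean_vim:.4f}")
```

In this toy input only `snp_a` has a non-zero VIM% in two or more forests, so it is the only SNP retained.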

2.7. Post-GWAS Analyses of Resultant Significant SNPs

Several post-GWAS significance threshold criteria were applied to decrease the number of SNPs for in silico and enrichment analyses. The first post-GWAS criterion applied to the SNPs identified was a MAF equal to or greater than 20%: because disease prevalence within the study population is approximately 60% (157 of 265 animals), the alleles that contribute to the phenotype observed in spastic syndrome are assumed to be relatively abundant. A chi-square test was then performed to determine which SNPs had genotype frequencies that differed significantly from expected frequencies when compared with disease status, with a threshold of 0.05. The final criterion placed on the remaining significant SNPs was based on the population proportionate distributions of homozygotic SNP genotypes in the affected and control populations. Differences of the homozygotic states were taken per phenotype, and then the absolute difference of these differences was obtained to identify potential polarity of homozygous SNP genotypes and disease status. The homozygous polarity threshold was 0.3.
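One possible reading of the MAF and homozygous polarity criteria is sketched below. The exact computation is not spelled out in the text, so the interpretation (difference between the two homozygote proportions within each phenotype, then the absolute difference between phenotypes), the function names, and the genotype counts are assumptions made for illustration only.

```python
def minor_allele_frequency(n_aa: int, n_ab: int, n_bb: int) -> float:
    """MAF from genotype counts (AA, AB, BB) pooled over all animals."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1.0 - freq_a)


def homozygous_difference(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Proportion of AA homozygotes minus proportion of BB homozygotes in one group."""
    n = n_aa + n_ab + n_bb
    return n_aa / n - n_bb / n


def homozygous_polarity(affected, unaffected) -> float:
    """Absolute difference between the per-phenotype homozygote differences."""
    return abs(homozygous_difference(*affected) - homozygous_difference(*unaffected))


# Hypothetical genotype counts (AA, AB, BB) for 157 affected and 108 control animals.
affected_counts = (90, 50, 17)
unaffected_counts = (20, 48, 40)

pooled = tuple(a + u for a, u in zip(affected_counts, unaffected_counts))
maf = minor_allele_frequency(*pooled)
polarity = homozygous_polarity(affected_counts, unaffected_counts)
passes = maf >= 0.2 and polarity >= 0.3   # thresholds stated in the text
print(f"MAF = {maf:.2f}, polarity = {polarity:.2f}, retained = {passes}")
```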

2.8. Mapping SNPs to Genes

The SNP lists generated by the GWAS were mapped using the ‘next-generation sequencing SNP’ tool (NGS-SNP; [39]). The significant SNPs from the GWAS were mapped to genes up to two Mb to the left and to the right of the SNP to ensure thorough gene attainment. The NGS-SNP script uses a list of positions in the bovine genome to return a description of nearby genes. The gene descriptions include gene position, and, if known, model species orthologs and basic gene function information.
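The window logic behind this mapping step can be illustrated with a small sketch that returns every gene overlapping a ±2 Mb interval around a SNP. This is not the NGS-SNP script itself; the `Gene` record, the function name, and the gene names and coordinates are hypothetical.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chromosome: int
    start: int  # base pairs
    end: int    # base pairs


def genes_near_snp(chromosome: int, position: int, genes, window: int = 2_000_000):
    """Return names of genes overlapping [position - window, position + window]."""
    lower, upper = position - window, position + window
    return [
        gene.name
        for gene in genes
        if gene.chromosome == chromosome and gene.start <= upper and gene.end >= lower
    ]


# Hypothetical annotation: names and coordinates are invented for the example.
toy_genes = [
    Gene("gene_x", 7, 12_500_000, 12_800_000),   # within 2 Mb of the SNP
    Gene("gene_y", 7, 16_500_000, 16_900_000),   # more than 2 Mb away
    Gene("gene_z", 9, 13_000_000, 13_200_000),   # different chromosome
]
nearby = genes_near_snp(chromosome=7, position=14_000_000, genes=toy_genes)
print(nearby)
```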

2.9. Identification of Candidate Genes through Enrichment Analyses

A candidate gene list was created for the HD GWAS. Mapped genes were submitted to different bioinformatic web tools; the primary tool was the Database for Annotation, Visualization, and Integrated Discovery (DAVID; [40,41]). The gene ontology (GO) derived from DAVID was used to obtain an overview of gene function, including biological processes, cellular components, and molecular functions of the genes. Then, using literature information on spastic syndrome, the most relevant biological processes were investigated, including locomotion, muscular and musculoskeletal contraction or movement, neuronal growth/development, cellular/nerve receptors, neuron growth/regulation/transmission, calcium/sodium ion movement, axon and synapse growth/transmission, and synaptic vesicle maturation/fusion/exocytosis [5,8,12], reviewed in Jansen et al. [42].
Tissue types such as muscle (skeletal) and nervous (nerve axons and brain) were emphasized in the search [5,6,8,10], reviewed in Jansen et al. [42]. Molecular functions that were incorporated in the search involved ions and ion channel activity. The gene functions were explored further using the National Center for Biotechnology Information [NCBI; www.ncbi.nlm.nih.gov (accessed on 15 February 2015)] meta-database, which contains a broad collection of genomic and biomedical databases. The NCBI was used for human/mouse ortholog location and verification, gene aliases, and relevant literature regarding genes and gene function.
Subsequently, Nextbio, a data mining bioinformatics tool, was used to obtain information on the human ortholog of a gene of interest (if applicable), and if it was up- or down-regulated in related human neuromuscular disorders [43]. General information on pertinent human orthologous genes was obtained to identify gene function related to neuromuscular/movement disorders, namely Parkinson’s disease, Huntington’s disease, and amyotrophic lateral sclerosis (ALS).

2.10. Linkage Disequilibrium for Significant SNPs and SNPs within Candidate Genes

Large distances may exist between GWAS-derived significant SNPs and SNPs within the candidate genes, and thus up to two Mb to the left and right of each SNP of interest were considered. Linkage disequilibrium (LD) values (r²) were obtained to quantify linkage between SNPs and candidate genes. For a well-established and studied trait, such as mastitis, significant SNPs were mapped to genes up to 100 kilobase pairs (kb) away [44]. However, due to the paucity of genetic knowledge surrounding spastic syndrome, a larger mapping distance was considered to obtain all possible relevant genes. Initially, the exact gene location and length were obtained through the Ensembl database (www.ensembl.org). Haplotype blocks of the candidate genes were created with the HaploView software [45], using the solid spine and r² thresholds to create haplotype blocks within candidate genes. The solid spine method searches for a “spine” of strong LD from one marker to another, creating an ‘LD block’ in which the first and last markers in the haplotype are in relatively strong LD. HaploView was also used to visualize gene length and intra-gene haplotype blocks.
Subsequently, we obtained syntenic LD of GWAS-derived SNPs and SNPs within candidate genes of interest. As the number of study animals was limited, the animals used to obtain pairwise LD were the 1659 HD genotyped Holsteins, also used as the reference population for the imputation analyses. The r² metric was used to find syntenic linkage with a threshold of 0.2 [46,47].
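As an illustration of the pairwise LD step, r² between two SNPs can be approximated from unphased genotypes as the squared Pearson correlation between allele dosages coded 0, 1, and 2. The text does not state how r² was computed, so this genotype-based estimator, the function name, and the toy genotypes are assumptions.

```python
def genotype_r2(snp_a, snp_b) -> float:
    """Squared Pearson correlation between two vectors of allele dosages (0/1/2)."""
    n = len(snp_a)
    if n != len(snp_b) or n < 2:
        raise ValueError("both SNPs need genotypes for the same (at least two) animals")
    mean_a = sum(snp_a) / n
    mean_b = sum(snp_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(snp_a, snp_b))
    var_a = sum((a - mean_a) ** 2 for a in snp_a)
    var_b = sum((b - mean_b) ** 2 for b in snp_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return cov * cov / (var_a * var_b)


# Toy genotypes for six animals (invented values).
gwas_snp = [0, 1, 2, 1, 0, 2]
gene_snp = [0, 1, 2, 2, 0, 1]

r2 = genotype_r2(gwas_snp, gene_snp)
in_ld = r2 >= 0.2   # syntenic linkage threshold used in the text
print(f"r2 = {r2:.4f}, in LD at the 0.2 threshold: {in_ld}")
```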

2.11. Mutant Variant List

All SNPs within candidate genes were investigated through NCBI. Variant information, including mutant types and protein consequences of mutants, was derived from the Cattle Genome Analysis Data Repository [48], as well as from annotations of the HD SNP panel obtained from Iowa State University (Ames, IA, USA).

3. Results

3.1. Heritability Estimates

The underlying heritability estimate for spastic syndrome was 0.41 with a standard error of 0.07. On the observed scale and considering a prevalence of 0.59, the estimated heritability was 0.26. Using a prevalence of 0.30, as in older proven AI bulls reported by Tenszen [12], a heritability estimate of 0.24 was obtained. For a predicted prevalence of 0.6% in the general population of Holstein cows in Canada, the estimated heritability is 0.02.

3.2. Genome-Wide Association Study Results

The chromosome-wise pFDR was used. The q-value obtained from the pFDR is the “posterior Bayesian p-value”. The Manhattan plots in Supplementary Figures S1 and S2 (see Supplementary Figures S1–S9) truncate the y-axis at a −log(q) value of 6 to allow for viewing consistency across all chromosomes. SNPs that exceed this value of 6 are presented as triangles within the Manhattan plots. The 50 K GWAS resulted in 98 significant SNPs across all chromosomes except BTA10, BTA13, BTA23, BTA24, BTA27, and BTA28; the distribution of significant SNPs per chromosome is summarized in Table 2.
Figure S1 presents a genome-wide Manhattan plot for this GWAS; significant SNPs are held within quantitative trait loci (QTL) patterns as well as single-SNP peaks. The following chromosomes had significant SNPs held within QTL patterns, with the approximate location of the QTL pattern indicated in megabase pairs (Mb): BTA7 (20 Mb), BTA8 (65–100 Mb), BTA15 (40 Mb), and BTA22 (40–50 Mb). Table 3 displays the top 10 significant SNPs based on the p-value for the 50 K GWAS, highlighting the SNP location and at what pFDR threshold the SNP was found to be significant.
The HD GWAS resulted in 522 significant SNPs across all chromosomes except BTA25 when using the significance thresholds described above. A genome-wide Manhattan plot of this GWAS is shown in Figure S2. Several chromosomes contained peaks with QTL patterns, located in BTA7 (1–20 Mb), BTA8 (60–80 Mb), BTA14 (80 Mb), BTA16 (75–80 Mb), BTA20 holding two peaks (10–20 Mb), and BTA22 (20–40 Mb). To refine the number of significant SNPs, a MAF threshold of 0.2 or greater reduced the number of SNPs to 118 on BTA1, BTA2, BTA3, BTA5, BTA7, BTA8, BTA9, BTA10, BTA12, BTA14, BTA16, BTA17, BTA18, BTA20, BTA21, and BTA22. A chi-square test identified 114 significantly different SNPs, located on BTA1, BTA2, BTA3, BTA5, BTA7, BTA8, BTA9, BTA10, BTA12, BTA14, BTA16, BTA17, BTA18, BTA20, BTA21, and BTA22. In total, 67 significant SNPs met the homozygosity polarity threshold of 0.3 across BTA1, BTA2, BTA3, BTA5, BTA7, BTA8, BTA9, BTA10, BTA12, and BTA14. A summary of SNPs found significant at each threshold can be found in Table 4.
Table 5 highlights the top 10 significant SNPs based on the p-value obtained for the HD GWAS and various post-GWAS threshold criteria values. A total of 73 SNPs were in common when comparing the 50 K GWAS with the HD GWAS. The common significant SNP frequency per chromosome is summarized in Supplementary Table S1 (Supplementary Tables S1–S7). After the post-GWAS threshold in the HD GWAS, 10 SNPs remained in common with the 50 K GWAS. Information on the common SNPs after all threshold criteria is given in Table 6.

3.3. Genome-Wide Association via Random Forest

The number of SNPs with a non-zero VIM% ranged from 67 to 105 per forest, when 100 forests were grown with 75 trees each, and total OOB estimation and prediction error rates ranged from 25.0% to 47.6%, and 28.4% to 39.1%, respectively. In total, 913 SNPs occurred in at least two forests, up to 19 forests; see Table 7 for a summary of SNPs per chromosome. The following SNPs occurred with relatively high forest frequency: BovineHD0700003818 located on BTA7 at 14 Mb occurring in 19 out of 100 forests, BovineHD0900015016 located on BTA9 at 54 Mb occurring in 9 forests, and BovineHD1600011731 located on BTA16 at 42 Mb occurring in 10 forests.
We identified 16 common SNPs when comparing significant SNPs of the RF regression with the 522 significant SNPs of the GQLS regression, located on BTA7, BTA8, BTA14, and BTA20. These SNPs have a relatively high MAF, ranging from 0.26 to 0.49, and homozygotic polarity scores ranging from 0.20 to 0.47. One common SNP does not have a homozygotic polarity score, as there is only one homozygotic state for all tested animals. Table 8 provides summary information on these common SNPs, including post-GWAS threshold criteria values imposed on SNPs found significant with the GQLS regression. The average maximum VIM% for all significant SNPs was small, ranging from a minimum of 2.69 × 10⁻¹⁹% to a maximum of 0.00614%, and SNPs in common with the GQLS regression had an average VIM% ranging from 0.00067% to 0.00177%. In general, error rates were lower for SNPs regressed on the affected phenotype, ranging from 0.00% to 23.08% for estimation error rates and from 1.53% to 14.50% for prediction error rates. The estimation and prediction error values for SNPs regressed on the unaffected phenotype ranged from 50.00% to 100.00% and from 56.38% to 89.36%, respectively. This suggests that RF was able to identify variants implicated with the disease phenotype more accurately.

3.4. Post-GWAS Analyses and Candidate Genes

The 67 retained significant SNPs from the GWAS mapped to 1048 genes. Table S2 provides a summary of the distribution of genes per chromosome. Candidate gene lists were created for the HD genotype GWAS, as the HD SNP genotype allows for finer mapping to genes of possible interest. To narrow the number of genes for further enrichment analysis, we screened for key biological functions, as described in the Materials and Methods section. Additionally, genes located within SNP peaks and genes containing significant SNPs were closely examined for relevant biological function. Two genes, FIG4 and FYN, were classified as important candidate genes.

3.5. Candidate Gene List

The 67 retained significant SNPs from the GWAS were located on the following chromosomes: BTA1, BTA2, BTA3, BTA5, BTA7, BTA8, BTA9, BTA10, BTA12, BTA14, BTA18, BTA20, BTA21, and BTA22. In total, the SNPs were mapped to 1048 genes; Table 9 provides a summary of the 7 candidate genes based on statistical significance of associated SNPs and gene function.
Below, the candidate genes are described on an individual basis. The within-gene haplotype blocks (Supplementary Figures S3–S9) are a visualization of the size of the genes and the extent of LD found within. In total, seven SNPs mapped to the voltage-dependent P/Q type calcium channel α 1A subunit gene (CACNA1A) on BTA7 at approximately 13 Mb (BovineHD0700003455, which mapped within gene, BovineHD0700004239, ARS-BFGL-NGS-102773, BovineHD0700003962, BovineHD0700004113, and BovineHD0700004235). The CACNA1A gene codes for a calcium channel subunit protein in the calcium signaling pathway. This calcium channel is present in the dendrite, neuronal cell body, cytoplasm, and voltage-gated calcium channel complex, and is part of various pathways such as the calcium signaling pathway, the MAPK signaling pathway, and the synaptic vesicle cycle [40,41]. Biological processes of this gene include cation transport, calcium-ion-dependent exocytosis of neurotransmitters, adult locomotory behavior, cerebellar Purkinje cell differentiation, muscle contraction, regulation of neurotransmitter transport and secretion, and glutamatergic synaptic transmission [40,41]. In the peripheral nervous system, this gene is expressed mainly at the neuromuscular junction and mediates the release of presynaptic acetylcholine [49]. There is a 92% shared identity with the human ortholog of this gene, which is down-regulated in Huntington’s disease and up-regulated in Parkinson’s disease and ALS [43,50]. Mutations in this gene have been reported as causal in spinocerebellar ataxia type 6 in humans [51]. This gene has over 1400 SNPs implicated in Parkinson’s disease, tabulated through meta-analyses [52,53].
Two SNPs were mapped to the Ras-related protein Rab-3A gene (RAB3A) on BTA7, located at approximately 5 Mb (BovineHD0700001946 and Hapmap59438-rs29012637). The RAB3A protein is involved in calcium exocytosis in neurons, axonogenesis, neurogenesis, regulation of neurotransmitter secretion and transport, and synaptic vesicle maturation and exocytosis [40,41]. RAB3A is associated with synaptic vesicles in its GTP-bound form and dissociates from the vesicle upon depolarization of the nerve terminal [54]. There is a protein–protein interaction with the product of DNAJC5, which may be involved in calcium-dependent neurotransmitter release at nerve endings [40,41]. The bovine gene shares a 99% identity with the human gene and is down-regulated in Huntington’s disease and Parkinson’s disease [43,50]. Kapfhamer et al. [55] reported that RAB3A knockout mice have synaptic depression, suggesting that RAB3A has a regulatory role in synaptic vesicle trafficking. Additionally, one SNP is implicated in ALS, and 20 SNPs are implicated in Parkinson’s disease through meta-analyses of the RAB3A gene [52,53].
There were two SNPs mapped to the microtubule-associated protein 1S gene (MAP1S) on BTA7, located at approximately 5 Mb (BovineHD0700001946 and Hapmap59438-rs29012637). MAP1S is involved in processes like brain development, neurogenesis, neuron differentiation, neuron projection development, and the execution phase of apoptosis [40,41]. MAP1S is highly expressed in post-natal developing and mature neurons, as well as skeletal muscle (reviewed in Halpain and Dehmelt [56]). There is an 83% identity with the human gene [50]. This gene is down-regulated in Huntington’s disease and Parkinson’s disease [43]. It has been suggested that defects in MAP1S-regulated programmed cellular destruction (autophagy) could impact neurodegenerative diseases [57]. Additionally, three SNPs associated with the MAP1S gene are also implicated in ALS through meta-analyses [52]. There are 147 SNPs associated with the MAP1S gene and Parkinson’s disease through meta-analyses [52,53].
A significant SNP mapped to the bta-mir-23a-201 gene located on BTA7 at approximately 13 Mb (ARS-BFGL-NGS-102773). This gene does not code for proteins, but rather codes for miRNA. This gene is involved in post-transcriptional regulation of gene expression through translational inhibition or destabilization of target mRNA, and possibly acts as a chaperone [40,41]. This gene shares a 99% identity with the human ortholog [50]. One significant SNP was associated with the uncharacterized gene ENSBTAG00000039130 on BTA7, located at approximately 14.5 Mb (BovineHD0700003737). At this time, it is thought to encode a novel protein, or may encode part of the TCP-1 chaperonin family related to heat shock proteins [40,41]. One significant SNP mapped to the factor-inducing gene 4 (FIG4), located on BTA9 at approximately 41 Mb (BovineHD0900010787). This gene has been described in detail previously. The same SNP (BovineHD0900010787) also mapped to the FYN gene, located on BTA9 at approximately 40 Mb.

3.6. Associated Candidate Gene Variants

SNP mutant variants within the candidate genes analysis are summarized in Table 10. Several variant types are predicted within candidate genes, and Table 10 also quantifies where variants are predicted to be deleterious.

4. Discussion

4.1. Disease Diagnostics for Spastic Syndrome

To obtain accurate genomic analyses for spastic syndrome, a universal diagnostic criterion is necessary, as the GWAS results in particular can be skewed if misdiagnosis is present. A first set of adult bulls with phenotypes assessed by experienced veterinarians with respect to spastic syndrome was available. Experienced veterinary diagnostics add validity to the results, as the likelihood for misdiagnosis is relatively low. To increase the statistical power of the GWAS, 225 additional phenotyped animals were added based on observed spasticity by field veterinarians. If the veterinarian is unfamiliar with the clinical signs presented in spastic syndrome, this could lead to misdiagnosis, considering that other disorders can result in limb spasticity.
The broad set of clinical signs used for the diagnosis of spastic syndrome (episodic, involuntary muscle contractions or spasms involving the pelvic limbs that are associated with postural and locomotor disturbances, as well as spasticity) could potentially encompass undiscovered disorders with similar phenotypes, resulting in misdiagnosis [8,14]. Furthermore, misdiagnosis may be present at large because, in the seminal research of spastic syndrome conducted by Becker et al. [7], the diagnosis included “some diagnosis of rheumatism”. Rheumatism can be observed macroscopically through increased synovial fluid retention, although it is known that spastic syndrome shows very little in the form of muscular and joint lesions [7,10,58].
It is important to note parallels between animals and humans when discussing neuromuscular disease diagnostics, especially the susceptibility for misdiagnosis. This is because various human neuromuscular disorders have been more thoroughly investigated in terms of symptomatic progression and possible causality. One study of Parkinson’s disease (a human disorder similar to spastic syndrome) reported a retrospective diagnostic accuracy (the probability of a positive test given that the individual has the condition) of 82% [59], highlighting the potential risk of misdiagnosis. Diagnostic accuracy is imperative for humans and animals alike, not only for treatment and management decisions but also when screening the genome for areas of interest associated with the disease. Ensuring proper disease definition and training for veterinarians with a universal disease diagnostic criterion for spastic syndrome will allow for greater precision and accuracy in the GWAS and further in silico analyses. In conjunction, reliable diagnostic criteria and industry-supported guidelines for managing the breeding of affected animals would allow facilities that possess affected animals to make effective decisions for the minimization or elimination of spastic syndrome. It has been hypothesized that known affected animals are still being used as sires due to other desirable phenotypes, which could lead to unintentional propagation of the trait if consumers of affected bull semen are not made aware of the condition [6]. The monetary commitment to produce and maintain ideal sires may be a motivating factor in knowingly keeping and continuing to use affected bulls. Some possible industry and veterinary supported options could be culling the affected animals, veterinary-supervised pain management plans while the animals are not severely affected, public awareness of affected sires, and/or developing a breeding plan to attempt to decrease future propagation of the disease.
Although a larger number of animals with phenotypic records (cases and controls) would be desirable for this study, high-quality phenotypes for spastic syndrome are difficult to obtain because there are other diseases with similar signs that could cause confounding, as discussed above. In this case-control study, the animals were examined by veterinarians who were experts on spastic syndrome, and animals that were older than the usual onset were only considered diseased if they showed signs. For the controls, only animals above 8 years old without any signs were included to guarantee they were disease-free. Therefore, despite the reduced sample size, these were highly accurate phenotypes. Furthermore, as this was a case-control study, the power of the GWAS was significantly higher than using an observational random sample of animals for the same sample size. However, future studies with larger datasets are recommended to validate the associations identified in this study.

4.2. Heritability Estimates

The heritability estimate of 0.41 on the underlying scale and 0.26 on the observed scale indicates that spastic syndrome is moderately heritable within this study population. This is similar to the heritability estimate of 0.23 on the observed scale with a herd prevalence of 30%, reported by Tenszen [12]. Thus, when spastic syndrome prevalence is relatively high, heritability estimates on the observed scale are moderate. However, heritability estimates are higher than what would be expected within the general population of cows.
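The relationship between the two scales can be illustrated with a short calculation. The sketch below assumes the classical Dempster-Lerner transformation, h²(underlying) = h²(observed) × K(1 − K)/z², where K is the prevalence and z is the height of the standard normal density at the liability threshold. This section does not state which transformation was used, so the formula, the function name, and the choice of K = 0.60 (the approximate prevalence in the study population) are assumptions made for illustration only.

```python
from statistics import NormalDist


def observed_to_liability(h2_observed, prevalence):
    """Convert an observed-scale heritability to the underlying (liability)
    scale with the Dempster-Lerner transformation (assumed here, not
    confirmed as the authors' method)."""
    std_normal = NormalDist()
    # Liability threshold above which an animal is classed as affected.
    threshold = std_normal.inv_cdf(1.0 - prevalence)
    # Height of the standard normal density at that threshold.
    z = std_normal.pdf(threshold)
    return h2_observed * prevalence * (1.0 - prevalence) / z ** 2


# Observed-scale estimate from the text; prevalence of ~60% is an assumption.
h2_liability = observed_to_liability(0.26, 0.60)
print(round(h2_liability, 2))
```

Under these assumptions the converted value is close to the 0.41 reported on the underlying scale. The transformation depends strongly on K, so the same observed-scale estimate would correspond to a different underlying-scale value at the lower prevalence expected in the general population.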

4.3. Genome-Wide Association Studies

The GWAS performed for spastic syndrome assumed no sex linkage, as virtually all reports state an even prevalence between male and female cattle [5,6,7,8,12]. Additionally, both phenotyped populations had a disease prevalence rate of approximately 60%, which is higher than the reported herd prevalence of approximately 30% in older dairy sires [12].

4.3.1. Generalized Quasi-Likelihood Score

The discussion of the GQLS results will focus primarily on the high-density (777 K) genomic results, as SNPs considered significant are more likely to be closely associated with loci near to or within genes that contribute to spastic syndrome. The threshold for significance that was used was a maximum pFDR of 10%. Although the false discovery rate (controlling the proportion of false positives among the set of rejected hypotheses) is adequate in multiple test screening analyses that need higher power, fixing the chromosome-wise rejection threshold to a certain value, as in the pFDR threshold, contributes to a greater power of detection of genomic areas of possible interest [60]. This threshold was used to detect SNPs with possibly a small phenotypic effect, and in turn inflated the number of significantly associated SNPs. There were a number of ‘peaks’ obtained with both screening populations, which will be discussed further, and these preliminary GWAS results indicate that spastic syndrome inheritance is likely not to be monogenic.
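The idea of applying a false-discovery threshold separately within each chromosome can be sketched as follows. The example uses the Benjamini-Hochberg step-up procedure as a simpler stand-in; it is not the pFDR calculation used in this study, and the SNP names, chromosomes, and p-values are invented for illustration.

```python
from collections import defaultdict


def bh_reject(pvalues, q=0.10):
    """Indices rejected by the Benjamini-Hochberg procedure at level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its step-up bound.
        if pvalues[idx] <= rank / m * q:
            cutoff = rank
    return {order[i] for i in range(cutoff)}


def chromosome_wise_hits(records, q=0.10):
    """Apply the threshold within each chromosome rather than genome-wide."""
    by_chrom = defaultdict(list)
    for snp, chrom, p in records:
        by_chrom[chrom].append((snp, p))
    hits = []
    for items in by_chrom.values():
        rejected = bh_reject([p for _, p in items], q)
        hits.extend(items[i][0] for i in sorted(rejected))
    return hits


# Hypothetical (SNP, chromosome, p-value) records.
records = [
    ("snp_a", "BTA7", 0.001),
    ("snp_b", "BTA7", 0.04),
    ("snp_c", "BTA7", 0.30),
    ("snp_d", "BTA9", 0.02),
    ("snp_e", "BTA9", 0.50),
]
hits = chromosome_wise_hits(records)
print(hits)
```

Because the number of tests per chromosome is smaller than the genome-wide total, the per-chromosome bound is less stringent, which is consistent with the larger number of significant SNPs noted above.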
Given that the affected animals were accurately diagnosed with spastic syndrome and the disease prevalence was considerable within the test populations, any significant SNPs obtained from the GWAS that truly associate with spastic syndrome would likely not be rare in the analyses. However, initial MAF thresholds for the GQLS method were set low, at 0.00001, to observe clearly all significant SNPs that contribute to a possible QTL, even rare allele contributions. Keeping a low initial MAF allows for all significant SNPs to be highlighted if spastic syndrome inheritance is governed by many genes and genomic regions. However, if causal variants for spastic syndrome are very rare, the limited sample size within this study may prohibit detection of these variants.
The GQLS method for the GWAS was primarily used, as the single SNP regression used is not constrained by the amount of genotypic data, and because the method allows for additive genetic relationships [34]. Placing further threshold criteria on SNPs found significant with the HD GWAS was necessary due to the large number of significant SNPs obtained. As this is one of the first GWAS for spastic syndrome, reducing the number of SNPs for further analysis to those with a higher likelihood of association with the disease would be important to limit unnecessary subsequent analyses with these SNPs. An initial threshold of 0.2 for the MAF was appropriate because the disease prevalence reaches over 60%, so SNPs that associate with genetic areas that contribute to the disease would most likely not be rare. This also eliminates SNPs found to be associative by chance. For the HD GWAS, this threshold decreased the number of significant SNPs by 404 (522 to 118 SNPs), indicating that the vast majority of the initial significant SNPs were relatively rare within this study population. A chi-square test was performed to determine which SNPs found significant through the GWAS with a relatively high MAF were significantly different with respect to genotypic frequencies between the affected and the unaffected populations. In total, the number of significant SNPs decreased by 4 (from 118 to 114 significant SNPs), indicating that virtually all relatively abundant significantly associated SNPs are significantly different between phenotypes. Finally, obtaining SNP homozygotic polarity with respect to the disease status helps to identify SNPs that are possibly associated with genes directly involved in the spastic syndrome phenotype. This analysis decreased the number of significant SNPs from 114 to 67; approximately 58% of the 114 SNPs showed homozygotic distributions of interest.
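The first two post-GWAS filters described above can be sketched for a single SNP. The example assumes genotype counts ordered as (AA, AB, BB), a significance level of 0.05 for the chi-square test, and invented counts; none of these are taken from the study. The homozygotic polarity step is omitted because its definition is given in the Materials and Methods rather than in this section.

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """MAF from genotype counts pooled over cases and controls."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_b = (2 * n_bb + n_ab) / total_alleles
    return min(freq_b, 1.0 - freq_b)


def genotype_chi_square(cases, controls):
    """Chi-square test on a 2 x 3 table of genotype counts (df = 2)."""
    col_totals = [a + b for a, b in zip(cases, controls)]
    if min(col_totals) == 0:
        raise ValueError("sketch assumes all three genotypes are observed")
    grand = sum(col_totals)
    stat = 0.0
    for row in (cases, controls):
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand
            stat += (observed - expected) ** 2 / expected
    # With 2 degrees of freedom the upper-tail probability is exp(-stat / 2).
    return stat, math.exp(-stat / 2.0)


# Hypothetical genotype counts (AA, AB, BB) for one SNP.
cases = (30, 20, 10)
controls = (10, 20, 30)

pooled = [a + b for a, b in zip(cases, controls)]
maf = minor_allele_frequency(*pooled)
stat, p_value = genotype_chi_square(cases, controls)

# MAF threshold of 0.2 is from the text; alpha = 0.05 is an assumption.
keep = maf >= 0.2 and p_value < 0.05
print(maf, round(stat, 2), keep)
```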

4.3.2. Comparison of Genotype Densities

Significant SNPs are located on various chromosomes for both genotypic densities. The increase in genetic density allows for the GWAS to be conducted on a finer scale, revealing significant SNPs closer to a possibly causative variant, which in this case appears to be located at approximately 87 Mb. This common peak also adds power to the evidence that this genomic area may be an important contributor to spastic syndrome. When comparing the two different genotypic densities (50 K vs. HD), there were 73 common significant SNPs with an initial threshold of maximum pFDR chromosome-wise at 10%. This indicates that approximately 75% of the significant SNPs found in the 50 K analysis remained significant after imputation (73 out of 98). After narrowing the number of HD significant SNPs using the various post-GWAS threshold criteria described previously, 10 SNPs remain in common when comparing the 50 K and HD GWAS, all of which are within observable SNP peaks within the HD GWAS. This suggests that, after all the post-GWAS threshold criteria imposed on the significant SNPs for the HD analysis, there is still sufficient power of the 50 K panel to detect associative SNPs. In conjunction, comparing the GQLS logistic regression with the RF regression of the HD SNP genotype panel resulted in common genomic areas on BTA7, BTA8, BTA14, and BTA20. These areas reflect significant SNPs that remained with the 50 K analysis, as well as HD analysis after the post-GWAS threshold criteria were imposed (Table 6 and Table 8).
Again, the small initial MAF threshold for all GWAS allows for relatively rare alleles to be identified. The post-GWAS thresholds placed on the HD SNPs decreased the number of significant SNPs per chromosome, and in turn this may decrease possible spurious associations with the spastic syndrome phenotype. In contrast, it may also eliminate rare SNPs that are valid associations in the GWAS. This is one reason why the GWAS findings should be validated with a larger sample size, to possibly discern which SNPs are truly associative with the phenotype observed in spastic syndrome.
If spastic syndrome is governed by a single allele and the causative variant is on the SNP panels, or is in complete or very high linkage with SNPs on either panel, common significant SNPs should arise. The reduced overlap observed across both genomic densities may be a function of a low power-of-detection due to the small sample size or it could be that more than one area in the genome is responsible for the phenotype observed with spastic syndrome.

4.3.3. Random Forest

In total, 75 trees per forest were grown, as similar error rates were obtained for forests of up to 1500 trees. Decreased computing time and computational load for the RF regression were obtained through minimizing the number of trees per forest while maintaining minimal error rates. Additionally, 100 replicates of the RF regression were obtained to avoid spurious associations due to sampling within the forest, as the RF regression can be prone to upward bias due to variable selection being random [37,61]. The RF regression obtained SNPs with a non-zero VIM% for all forests. As this is a screening study, it was necessary to analyze all SNPs that appeared to be significant, but not spuriously so; therefore, the SNPs with a non-zero VIM% that occurred in at least two forests were retained, as their presence in at least two forests may decrease the chance that these significant SNPs are anomalous.
As the average VIM% for all SNPs obtained with a non-zero VIM% was small, the frequency of forest occurrence was an alternative measure of the importance of the SNP. SNPs that occurred within approximately the top 50% regarding forest frequency were highlighted. The SNP that occurs most frequently, BovineHD0700003818, is located on BTA7 at approximately 14 Mb, and is within the bounds of the QTL-like peak at 1–20 Mb obtained from the GQLS regression method. This is validation of the QTL-like peak within BTA7. Additionally, the SNP BovineHD0900015016 occurs in nine forests and is located on BTA9 at approximately 54 Mb. Although the QTL-like pattern located at approximately 87 Mb is relatively far from this significant SNP, there are significant SNPs within the chromosome for all GQLS analyses at approximately 40–50 Mb.
The SNPs with a non-zero VIM% from at least two forests were then compared with the 522 significant SNPs obtained from the GQLS regression with the HD SNP genotype panel. In total, there were 16 common SNPs, and those that were common tended to have higher MAF and homozygous polarity scores (Table 8). The RF regression appears to be relatively effective when combined with the GQLS regression to screen for associative SNPs that are not rare.
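The retention rule and the comparison with the GQLS results amount to a frequency count followed by a set intersection. The sketch below assumes each forest is summarized as a mapping from SNP name to VIM%; the data structure, function names, SNP names, and values are hypothetical and do not reflect the software used in this study.

```python
from collections import Counter


def forest_frequency(forests):
    """Count, for each SNP, the number of forests with a non-zero VIM%."""
    counts = Counter()
    for vim_by_snp in forests:
        counts.update(snp for snp, vim in vim_by_snp.items() if vim != 0)
    return counts


def retained_snps(forests, min_forests=2):
    """Keep SNPs with a non-zero VIM% in at least min_forests forests."""
    counts = forest_frequency(forests)
    return {snp for snp, n in counts.items() if n >= min_forests}


# Three toy forests (the study grew 100); VIM% values are invented.
forests = [
    {"snp_1": 0.004, "snp_2": 0.0, "snp_3": 0.001},
    {"snp_1": 0.002, "snp_4": 0.003},
    {"snp_3": 0.0005, "snp_4": 0.0},
]
gqls_significant = {"snp_1", "snp_4", "snp_9"}

kept = retained_snps(forests)
common = kept & gqls_significant
print(sorted(kept), sorted(common))
```

Counting occurrences across forests, rather than ranking by VIM%, is what allows forest frequency to serve as the importance measure when the VIM% values themselves are very small.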
The generally large error rates for both the prediction and estimation of the RF regression suggest that this method for regression is not sufficient for association analyses, but appears to be sufficient to validate the GQLS regression results with this study population (Table 8). The weighting of the prediction error was even between disease statuses, and the lower error rates for variants regressed to the disease phenotype suggest that RF can more accurately identify SNPs that are associated with spastic syndrome. The first set of animals was used to estimate the SNP VIM%, as diagnostics may be more accurate within this cohort, and an increase in phenotyped animals for RF estimation may decrease error rates for both estimation and prediction values. One common SNP between both regression methods (BovineHD0700006040, located at approximately 21.9 Mb) appears close to a QTL-like pattern on BTA7, located at approximately 1–20 Mb (Figure S2). This SNP has a relatively high MAF (0.381), whereas the homozygotic polarity score of this SNP is relatively low at 0.18. The results from the RF regression may be validating an area of significance also obtained with the GQLS regression within BTA7.

4.4. Mapping SNPs to Genes

Because spastic syndrome is poorly understood, a mapping distance of up to two Mb to the left and right of the SNP of interest was used. If genes are mapped to significant SNPs at a relatively large distance, as long as ‘useful’ linkage (i.e., r² ≥ 0.2) exists, genes of interest will remain relevant. Direct syntenic linkage was obtained between significant SNPs and SNPs within genes of interest (Tables S3–S7), including haplotype pairs that are farther than two Mb apart. This is because some variants within candidate genes exist farther than the mapping threshold of two Mb, but as they were within the candidate gene the direct linkage was still obtained. Additionally, an understanding of the biological nature of spastic syndrome is necessary to screen this list for appropriate genes for further consideration. The 26 common genes obtained for BTA9 at approximately 40 Mb indicate that this area is likely implicated with spastic syndrome. The two common candidate genes, FYN and FIG4, have functions of great interest with regard to spasticity, and will be discussed further. Even at relatively large distances there was moderate LD between haplotype pairs, which adds validity to the following candidate genes, as again some genes exist relatively distant from the significant SNP from the GWAS.
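The window-based mapping step can be illustrated with a minimal sketch. A gene is assigned to a SNP when any part of the gene falls within two Mb on either side of the SNP on the same chromosome. The gene names and coordinates below are placeholders, not annotations from the bovine reference genome, and the LD criterion is not implemented here.

```python
WINDOW_BP = 2_000_000  # two Mb on each side of the SNP, as in the text


def genes_near_snp(snp_chrom, snp_pos, genes, window=WINDOW_BP):
    """Return genes on the SNP's chromosome that overlap the window."""
    lo, hi = snp_pos - window, snp_pos + window
    return [
        name
        for name, chrom, start, end in genes
        if chrom == snp_chrom and start <= hi and end >= lo
    ]


# Hypothetical gene annotations: (name, chromosome, start bp, end bp).
genes = [
    ("gene_A", "BTA9", 39_500_000, 39_600_000),
    ("gene_B", "BTA9", 42_900_000, 43_100_000),  # partly inside the window
    ("gene_C", "BTA9", 45_000_000, 45_050_000),  # outside the window
    ("gene_D", "BTA7", 41_000_000, 41_100_000),  # different chromosome
]

mapped = genes_near_snp("BTA9", 41_000_000, genes)
print(mapped)
```

A window of this width returns many genes per SNP, which is why the resulting list was subsequently screened by biological function and by LD with the significant SNP.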
Table 10 quantifies the number of predicted deleterious mutations within each candidate gene. The variants detected within candidate genes that are predicted as deleterious may guide future research into the genetic and pathophysiologic nature of spastic syndrome.
The population prevalence of spastic syndrome was approximately 60%, and the GWAS results indicate that there are significant SNPs with a moderate-to-high MAF, along with SNPs that have polarized homozygotic distributions when looking at disease status (Table 9). Spastic syndrome is likely not monogenic, due to numerous genomic peaks obtained from the GWAS. The 67 retained significant SNPs map to seven potential candidate genes on BTA7 and BTA9. Several candidate genes listed below are implicated in degenerative neuromuscular disorders in humans and are likely not causal for spastic syndrome. However, these candidate genes may still contribute to the disorder through unknown cellular mechanisms, especially if the inheritance of spastic syndrome is multi-genic. Two genes exist in both candidate gene lists (FIG4 and FYN) and have been discussed previously.
Mitochondrial Function. It is proposed that a MAP1S deficiency contributes to a reduced ability to recognize defective mitochondria and commence cellular destruction [57]. A mouse knockout study of the MAP1S gene reported increases in mitochondrial size and frequent rupture of the outer membrane, followed by a progressive, lethal aggregation of mitochondria [57]. These knockout mice exhibited no obvious abnormalities in development or reproduction. The spastic phenotypes of the knockout mice were progressive in severity, much like what is observed in spastic syndrome. There are no cellular abnormalities reported with spastic syndrome, thus MAP1S is likely not causal for this disorder [10]. However, protein–protein interactions could occur with a mutant MAP1S that could contribute to the progressive, non-degenerative phenotype observed in spastic syndrome through currently unknown mechanisms.
Calcium Channel Function. Calcium channels play an important role in the regulation of a variety of cellular functions such as membrane depolarization, muscle contraction, and synaptic transmission, as they are involved in the control of neurotransmitter release from neurons [49]. Mutations in the CACNA1A gene are implicated as causal for spinocerebellar ataxia type 6 in humans [51]. Spinocerebellar ataxia type 6 in humans is an autosomal dominant disorder that is progressive and degenerative in nature, with an average age of onset of 43 to 52 years [51,62]. Additionally, a mouse knockout study [49] reported that CACNA1A knockout results in ataxia and death through the removal of channel-mediated neurotransmission. A deleterious mutation within this gene could result in altered calcium channels, which may alter presynaptic acetylcholine release, resulting in enhanced transmission in motor pathways and spasticity in the affected limb(s). Defects in many ion channels do not cause histopathological changes, thus the CACNA1A gene is a compelling candidate gene with respect not only to function, but also chromosomal location, as this gene is within a QTL in BTA7.
Axon and Neuron Function. Rab3A constitutive knockout mice (Rab3A(-/-)) are characterized by a deficiency in short and long-term synaptic plasticity. Synapses have the ability to strengthen or weaken over time in response to increases or decreases in activity, and it has been reported that RAB3A regulates the fusion probability of synaptic vesicles [54,55]. In mice, knocking out the RAB3A gene stopped activity-dependent vesicle accumulation close to and in the active zone of the presynaptic membrane when there was a depolarizing pulse in the presence of calcium [54]. These knockout mice are viable and present relatively mild phenotypes, suggesting nonessential functions of RAB3A [54]. Impaired vesicle recruitment was seen with repeated stimulation, and the capacity to secrete neurotransmitters, as well as vesicle docking, was impaired in Rab3A knockout mice [54]. A mutation in the RAB3A gene can result in decreased synapse strength over time due to loss of vesicle recruitment upon repeated stimulation. This could possibly lead to ataxia-related phenotypes, where ataxia is defined as an impairment of control of voluntary muscle movement [54]. Such a possibility was reported by Meilleur et al. [63], where RAB3A was listed as a candidate gene, though not confirmed at this time, for a familial form of spastic paraplegia. Symptoms of the familial spastic paraplegia include lower-limb spasticity and weakness. Although hind-limb weakness is generally only observed in cattle that are severely affected with spastic syndrome due to disuse atrophy of painfully affected limb(s), it is possible that mutations in the RAB3A gene could contribute to spasticity via a mechanism similar to that proposed for the familial spastic paraplegia reported by Meilleur et al. [63].
Additionally, FYN and FIG4, discussed previously, have functions related to either motor neuron degeneration or decreases in dendritic spine density and have been implicated in human spastic phenotypes. At this time, mutations within these genes are not thought to be causal for spastic syndrome; however, they may be indirectly implicated in spastic syndrome through defective protein–protein interactions with other proteins that can produce non-degenerative pelvic-limb spasticity.
Uncertain Functions. It is possible that the chaperone gene bta-mir-23a-201 helps to fold proteins involved in muscle contraction. A mutated chaperone may not be able to assist with the product of a gene like CACNA1A, for example. In such a case, a deleterious phenotype may arise from aberrant calcium-channel folding even though the channel gene is normal. Additionally, the mutated bta-mir-23a-201 product could produce a mild functional calcium channel defect that is compatible with life but causes late-onset spasticity, rather than the complete removal of channel-mediated neurotransmission, which causes death in CACNA1A knockout mice [49]. Additionally, ENSBTAG00000039130 could contribute to the spastic syndrome phenotype in unknown ways. Until more is known about the physiological basis of spastic syndrome, postulation into this gene function may be fruitless, as disruption in many genes can create a spastic phenotype. There is no functional evidence either for or against the involvement of this gene in spastic syndrome at this time.
Based on our findings, the most promising candidate genes are located on BTA7 (CACNA1A and RAB3A) and BTA9 (FIG4). The two genes of interest located on BTA7 are within a QTL at approximately 1–20 Mb. Additionally, the FIG4 gene is a candidate gene and is located at approximately 41 Mb on BTA9. Mutations in CACNA1A could disrupt neuromuscular signaling without causing degenerative changes. Mutations in RAB3A could act to decrease fusion probability of synaptic vesicles in axons supplying skeletal muscles [54]. The FIG4 gene is wholly expressed in skeletal muscle and mutations are implicated with severe neurodegenerative disorders in mice [64]. Spastic syndrome is non-degenerative, so it could be that FIG4 mutations impair synaptic transmission through mechanisms that are non-degenerative.
An investigation into FIG4 functional rescue reports that expression of this gene in neurons is sufficient to correct the development of significant muscle atrophy [64].
Furthermore, the CACNA1A gene and the FIG4 gene are both implicated as causal for progressive neurodegenerative skeletal muscular diseases in humans: spinocerebellar ataxia type 6 and Charcot-Marie-Tooth Neuropathy Type 4J, respectively. RAB3A is listed as a candidate gene for a familial form of spastic paraplegia [63]. Though spastic syndrome is not degenerative, these genes may play a role in the spastic phenotype through biological functions that are currently unknown. These candidate genes share functions that could implicate them in spastic syndrome. Further research regarding the candidate gene list should be conducted to verify function in relation to spastic syndrome.

4.5. Implications and Further Research Recommendations

To begin with, it would be helpful to identify a single name for this disorder that could be used consistently within academia and industry. At this time, “crampy” and “stretches” are commonly used colloquial terms, while “progressive posterior paralysis” and “periodic spasticity” have been used within academia to describe spastic syndrome. The many names for spastic syndrome have been a source of confusion for producers and scientists, a problem that could be minimized if a single term were identified by consensus. Additionally, establishing a universal protocol for the diagnosis of spastic syndrome would enable facilities to make informed decisions regarding affected animals, whether through culling or management practices while the affected bull is still serviceable. In order for consumers of bull semen to make informed decisions about whether to use semen from affected animals for their own herd, information on spastic syndrome status is necessary and should be made available. Universal diagnostic criteria would also facilitate research, for example by minimizing phenotyping errors in a future GWAS. There could be risk factors for spastic syndrome that are not yet known, such as environmental or nutritional factors that could promote the development of the disease. A follow-up study on the SNPs found significant in the GWAS needs to be performed to ascertain the genes holding or surrounding the significant SNPs found in this study. Functional and in silico analyses of these genes, with an emphasis on the location at approximately 87 Mb on BTA9 and 1–20 Mb on BTA7, will enrich knowledge of gene function related to spastic syndrome.
A GWAS using imputed whole-genome sequence data (as carried out by Chen et al. [65]) may highlight possible causal variants in the 87 Mb region of BTA9 or the 1–20 Mb region of BTA7, or reveal new genomic regions of interest associated with spastic syndrome. As more spastic syndrome phenotypic information becomes available, a follow-up study should also be performed in which highly related affected animals are clustered for the GWAS. These results could then be compared with GWAS results from other, unrelated clusters to detect population-specific stratification that may reveal novel associated variants. Additionally, genomic breeding values for spastic syndrome could be predicted. A candidate gene list was created in this study, and additional in silico, functional, and in vivo analyses are needed to validate the genes and gene functions described. The findings of this research may lead to a SNP test that can be used commercially to screen young bulls for spastic syndrome risk variants within the North American Holstein population, or even to genomic selection once a sufficiently large reference population is developed.

5. Conclusions

Spastic syndrome is a heritable disease, based on the heritability estimate derived from this study population. The genome-wide association studies using both the 50 K and the imputed high-density SNP panel indicated that spastic syndrome is likely polygenic. The most notable findings were significant SNP peaks located on BTA9 at approximately 87 Mb for all analyses, and a peak within BTA7 located at 1–20 Mb. Findings within the areas of interest in BTA7 and BTA9 were validated with random forest regression. In total, 1048 genes were mapped to significant SNPs. Functional and in silico analyses focused on genes involved in skeletal muscle and nervous system function. In particular, the GPR126 gene has functions related to the myelination of axons, the CACNA1A gene is expressed at neuromuscular junctions, the RAB3A gene associates with synaptic vesicles until depolarization of the nerve terminal occurs, and the FIG4 gene is only expressed in skeletal muscle. The CACNA1A gene is the most compelling candidate gene, as many calcium ion channel disorders are non-degenerative and produce spastic phenotypes. The candidate genes identified within this study will contribute to a further understanding of the pathophysiology of spastic syndrome. Additionally, mutant variants predicted to be deleterious within candidate genes were quantified. This study has identified regions of the genome and candidate genes that may facilitate further research into the genetic nature of spastic syndrome in North American Holsteins when larger datasets become available.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/genes14071479/s1, Table S1: Common SNP frequencies per chromosome between 50 K and high-density (777 K) genome-wide association studies in 265 North American Holsteins screening for genomic areas of interest for spastic syndrome; Table S2: Number of genes per chromosome mapped from significant SNPs from the genome-wide association study with imputed high-density (777 K) genotypes, using the Next-Generation Sequencing SNP Tool; Table S3: Direct linkage of significant SNPs from GWAS and SNPs within haplotype blocks in genes of interest on chromosome 7; Table S4: Direct linkage of significant SNPs from GWAS and SNPs within haplotype blocks in the CACNA1A gene; Table S5: Direct linkage of significant SNPs from GWAS and SNPs within haplotype blocks in the Uncharacterized gene; Table S6: Direct linkage of significant SNPs from GWAS and SNPs within haplotype blocks in the FYN gene; Table S7: Direct linkage of significant SNPs from GWAS and SNPs within haplotype blocks in the FIG4 gene; Figure S1: Genome-wide Manhattan plot of the genome-wide association study with 50 K genotypes using a generalised quasi-likelihood score via Sleuth software, with significance thresholds of 1%, 5%, and 10% chromosome-wise positive false discovery rate (pFDR); Figure S2: Genome-wide Manhattan plot of the genome-wide association study with high-density (777 K) genotypes using a generalised quasi-likelihood score via Sleuth software, with significance thresholds of 1%, 5%, and 10% chromosome-wise positive false discovery rate (pFDR); Figure S3: Linkage blocks (r²) for SNPs within the CACNA1A gene located on chromosome 7, obtained from Haploview; Figure S4: Linkage blocks (r²) for SNPs within the RAB3A gene located on chromosome 7, obtained from Haploview; Figure S5: Linkage blocks (r²) for SNPs within the MAP1S gene located on chromosome 7, obtained from Haploview; Figure S6: Linkage blocks (r²) for SNPs within the bta-mir-23a-201 gene located on chromosome 7, obtained from Haploview; Figure S7: Linkage blocks (r²) for SNPs within the Uncharacterized gene located on chromosome 7, obtained from Haploview; Figure S8: Linkage blocks (r²) for SNPs within the FIG4 gene located on chromosome 9, obtained from Haploview; Figure S9: Linkage blocks (r²) for SNPs within the FYN gene located on chromosome 9, obtained from Haploview.

Author Contributions

Conceptualization, A.N., W.J.B.H., J.D.B. and F.S.S.; methodology, F.S.S. and A.N.; software, A.N.; validation, A.N., L.F.B. and F.S.S.; formal analysis, A.N.; investigation, A.N., W.J.B.H., J.D.B. and F.S.S.; resources, W.J.B.H., J.D.B. and F.S.S.; data curation, A.N., W.J.B.H., J.D.B. and F.S.S.; writing—original draft preparation, A.N.; writing—review and editing, A.N., L.F.B., W.J.B.H., J.D.B. and F.S.S.; visualization, A.N.; supervision, F.S.S.; project administration, W.J.B.H. and F.S.S.; funding acquisition, W.J.B.H., J.D.B. and F.S.S. All authors have read and agreed to the published version of the manuscript.

Funding

The authors acknowledge the financial support from the Natural Sciences and Engineering Research Council of Canada (NSERC) through a Collaborative Research & Development Grant (CRDPJ 417374-11) in partnership with the Dairy Cattle Genetics Research and Development (DairyGen) Council of Lactanet.

Institutional Review Board Statement

No Animal Care Committee approval was necessary for the purposes of this study, as all information required was obtained from pre-existing databases.

Informed Consent Statement

Not applicable.

Data Availability Statement

All the data needed for the interpretation of the results are provided in the paper and Supplementary Materials. The raw datasets cannot be made publicly available as they are the property of Canadian dairy farmers and commercially sensitive.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Guinan, F.; Wiggans, G.; Norman, H.; Dürr, J.; Cole, J.; Van Tassell, C.; Misztal, I.; Lourenco, D. Changes in genetic trends in US dairy cattle since the implementation of genomic selection. J. Dairy Sci. 2023, 106, 1110–1129. [Google Scholar] [CrossRef] [PubMed]
  2. Baes, C.F.; Makanjuola, B.O.; Miglior, F.; Marras, G.; Howard, J.T.; Fleming, A.; Maltecca, C. Symposium review: The genomic architecture of inbreeding: How homozygosity affects health and performance. J. Dairy Sci. 2019, 102, 2807–2817. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  3. Brito, L.; Bedere, N.; Douhard, F.; Oliveira, H.; Arnal, M.; Peñagaricano, F.; Schinckel, A.; Baes, C.; Miglior, F. Genetic selection of high-yielding dairy cattle toward sustainable farming systems in a rapidly changing world. Animal 2021, 15, 100292. [Google Scholar] [CrossRef] [PubMed]
  4. Frauchiger, E.; Hofmann, W. 4. Die “Krämpfigkeit”. In Die Nervenkrankheiten des Rindes; Frauchiger, E., Hofmann, W., Eds.; Verlag Hans Huber: Bern, Switzerland, 1941; pp. 337–340. [Google Scholar]
  5. Roberts, S.J. A spastic syndrome in cattle. Cornell Vet. 1953, 43, 380–388. [Google Scholar] [PubMed]
  6. Lafortune, J.G. Une Affection Spasmodique Des Bovins. Can. J. Comp. Med. Vet. Sci. 1956, 20, 206–215. [Google Scholar] [PubMed]
  7. Becker, R.; Wilcox, C.; Pritchard, W. Crampy or Progressive Posterior Paralysis in Mature Cattle. J. Dairy Sci. 1961, 44, 542–547. [Google Scholar] [CrossRef]
  8. Roberts, S.J. Hereditary spastic diseases affecting cattle in New York State. Cornell Vet. 1965, 55, 637–644. [Google Scholar]
  9. Sponenberg, D.P.; Vanvleck, L.D.; McEntee, K. The genetics of the spastic syndrome in dairy bulls. Vet. Med. 1985, 80, 92–94. [Google Scholar]
  10. Wells, G.A.H.; Hawkins, S.A.C.; O’Toole, D.T.; Done, S.H.; Duffell, S.J.; Bradley, R.; Hebert, C.N. Spastic Syndrome in a Holstein Bull: A Histologic Study. Vet. Pathol. 1987, 24, 345–353. [Google Scholar] [CrossRef]
  11. Mayhew, I.G. 1. Spastic syndrome of adult cattle (Standings disease, Barn cramps, Crampiness, Stretches, Periodic spasticity). In Large Animal Neurology: A Handbook for Veterinary Clinicians; Mayhew, I.G., Ed.; Lea & Febiger: Philadelphia, PA, USA, 1989; p. 216. [Google Scholar]
  12. Tenszen, A. Spastic syndrome in a Canadian Hereford bull. Can. Vet. J. 1998, 39, 716–717. [Google Scholar]
  13. Goeckmann, V.; Rothammer, S.; Medugorac, I. Bovine spastic syndrome: A review. Vet. Rec. 2018, 182, 693. [Google Scholar] [CrossRef] [PubMed]
  14. de Lahunta, A.; Glass, E. Veterinary Neuroanatomy and Clinical Neurology, 2nd ed.; WB Saunders Co: Philadelphia, PA, USA, 1983; pp. 147–148. [Google Scholar]
  15. Arita, Y. Patho-morphological observations on muscles and nerves of pelvic limbs in cows with spastic syndrome. Jpn. J. Vet. Res. 1988, 36, 147. [Google Scholar]
  16. Dirksen, G. Krämpfigkeit. In Innere Medizin und Chirurgie des Rindes; Dirksen, G., Gründer, H.-D., Stöber, M., Eds.; Parey: Berlin, Germany, 2006; pp. 846–854. [Google Scholar]
  17. Bøhler, N.; Gjestvang, P.; Slagsvold, P. Visse rygglidelser hos storfe som årsak til såkalt »stallkrampe« [Diseases of the vertebral column in cattle as a cause to what is called »stallkrampe«]. In Proceedings of the VI Nordiska Veterinärmötet, Stockholm, Sweden, 10–11 August 1951; pp. 109–115. [Google Scholar]
  18. Smedegaard, H.H. Krämpfigkeit beim Rinde. Eine Übersicht. Nord. Vet. Med. 1964, 16, 1029–1049. [Google Scholar]
  19. Câmara, A.C.L.; Afonso, J.A.B.; Costa, N.A.; Mendonça, C.L.; Souza, M.I. Spastic syndrome in two cows in Northeastern Brazil. Rev. Port. Ciências Veterinárias 2008, 103, 100–102. [Google Scholar]
  20. Gentile, A.; Testoni, S. Inherited disorders of cattle: A selected review. Slov. Vet. Res. 2006, 43, 17–29. [Google Scholar]
  21. Windsor, P.A.; Agerholm, J.S. Inherited diseases of Australian Holstein-Friesian cattle. Aust. Vet. J. 2009, 87, 193–199. [Google Scholar] [CrossRef]
  22. Suter, J. Über die Aetiologie, Symptomatologie und Therapie der “Krämpfigkeit” des Rindes [About the Etiology, Symptomatology and Treatment of “Cramping” of Cattle]. Inaugural-Dissertation, Universität Zürich, Zurich, Switzerland, 1934. [Google Scholar]
  23. Boettcher, P.J.; Wang, Y. Estimates of Heritabilities for Defective Type Characteristics of Canadian Holsteins. Report to the Technical Committee of the Dairy Genetic Evaluation Board. September 2000. Available online: https://animalbiosciences.uoguelph.ca/~fleminga/dcbgc/Agenda0007/defgeb.htm (accessed on 18 July 2023).
  24. Daniel, R.; Goulden, B. Congenital contraction of the gastrocnemius and superficial digital flexor muscles in a friesian calf. N. Z. Vet. J. 1967, 15, 150–151. [Google Scholar] [CrossRef]
  25. Harper, P. Spastic paresis in Brahman crossbred cattle. Aust. Vet. J. 1993, 70, 456–457. [Google Scholar] [CrossRef]
  26. Wiggans, G.; Sonstegard, T.; VanRaden, P.; Matukumalli, L.; Schnabel, R.; Taylor, J.; Schenkel, F.; Van Tassell, C. Selection of single-nucleotide polymorphisms and quality of genotypes used in genomic evaluation of dairy cattle in the United States and Canada. J. Dairy Sci. 2009, 92, 3431–3436. [Google Scholar] [CrossRef] [Green Version]
  27. Zimin, A.V.; Delcher, A.L.; Florea, L.; Kelley, D.R.; Schatz, M.C.; Puiu, D.; Hanrahan, F.; Pertea, G.; Van Tassell, C.P.; Sonstegard, T.S.; et al. A whole-genome assembly of the domestic cow, Bos taurus. Genome Biol. 2009, 10, R42. [Google Scholar] [CrossRef] [Green Version]
  28. Sargolzaei, M.; Iwaisaki, H.; Colleau, J.J. CFC: A tool for monitoring genetic diversity. In Proceedings of the 8th World Congress on Genetics Applied to Livestock Production, Belo Horizonte, Minas Gerais, Brazil, 13–18 August 2006; pp. 13–18. [Google Scholar]
  29. Sargolzaei, M.; Chesnais, J.P.; Schenkel, F.S. A new approach for efficient genotype imputation using information from relatives. BMC Genom. 2014, 15, 478. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  30. Gilmour, A.R.; Gogel, B.J.; Cullis, B.R.; Thompson, R. ASReml User Guide Release 3.0; VSN International Ltd.: Hemel Hempstead, UK, 2009; Available online: https://www.vsni.co.uk (accessed on 18 July 2023).
  31. Dempster, E.R.; Lerner, I.M. Heritability of Threshold Characters. Genetics 1950, 35, 212–236. [Google Scholar] [CrossRef] [PubMed]
  32. Gianola, D. Heritability of Polychotomous Characters. Genetics 1979, 93, 1051–1055. [Google Scholar] [CrossRef] [PubMed]
  33. Van Doormaal, B. Current Perspective on Crampiness in Holsteins. Available online: https://www.cdn.ca/document.php?id=337 (accessed on 18 July 2023).
  34. Feng, Z.; Wong, W.W.L.; Gao, X.; Schenkel, F. Generalized genetic association study with samples of related individuals. Ann. Appl. Stat. 2011, 5, 2109–2130. [Google Scholar] [CrossRef]
  35. Benjamini, Y.; Hochberg, Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J. R. Stat. Soc. Ser. B Methodol. 1995, 57, 289–300. [Google Scholar] [CrossRef]
  36. Liaw, A.; Wiener, M. Classification and Regression by random Forest. R News 2002, 2, 18–22. [Google Scholar]
  37. Li, Y.; Kijas, J.; Henshall, J.M.; Lehnert, S.; McCulloch, R.; Reverter, A. Using Random Forests (RF) to Prescreen Candidate Genes: A New Prospective for GWAS. In Proceedings of the 10th World Congress for Genetics Applied to Livestock Production, Vancouver, BC, Canada, 17–22 August 2014. [Google Scholar]
  38. Aloysius, L.; Breiman, L.; Cutler, A. Bigrf: Big Random Forests: Classification and Regression Forests for Large Data Sets. R Package Version 0.1-11. Available online: https://cran.r-project.org/src/contrib/Archive/bigrf/ (accessed on 18 July 2023).
  39. Grant, J.R.; Arantes, A.S.; Liao, X.; Stothard, P. In-depth annotation of SNPs arising from resequencing projects using NGS-SNP. Bioinformatics 2011, 27, 2300–2301. [Google Scholar] [CrossRef] [Green Version]
  40. Huang, D.W.; Sherman, B.T.; Lempicki, R.A. Bioinformatics enrichment tools: Paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 2009, 37, 1–13. [Google Scholar] [CrossRef] [Green Version]
  41. Huang, D.W.; Sherman, B.T.; Lempicki, R.A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 2009, 4, 44–57. [Google Scholar] [CrossRef]
  42. Jansen, P.H.P.; Lecluse, R.G.M.; Verbeek, A.L.M. Past and Current Understanding of the Pathophysiology of Muscle Cramps: Why Treatment of Varicose Veins Does Not Relieve Leg Cramps. J. Eur. Acad. Dermatol. Venereol. 1999, 12, 222–229. [Google Scholar] [CrossRef]
  43. Kupershmidt, I.; Su, Q.J.; Grewal, A.; Sundaresh, S.; Halperin, I.; Flynn, J.; Shekar, M.; Wang, H.; Park, J.; Cui, W.; et al. Ontology-Based Meta-Analysis of Global Collections of High-Throughput Public Data. PLoS ONE 2010, 5, e13066. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  44. Grossi, D.; Abo-Ismail, M.K.; Koeck, A.; Miller, S.P.; Stothard, P.; Plastow, G.; Miglior, F.; Moore, S.S.M.; Sargolzaei, M.; Schenkel, F. Genome-wide Association Analyses for Mastitis in Canadian Holsteins. In Proceedings of the 10th World Congress on Genetics Applied to Livestock Production, ASAS, Vancouver, BC, Canada, 17–22 August 2014. [Google Scholar]
  45. Barrett, J.C.; Fry, B.; Maller, J.; Daly, M.J. Haploview: Analysis and visualization of LD and haplotype maps. Bioinformatics 2004, 21, 263–265. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  46. Hill, W.G.; Robertson, A. Linkage disequilibrium in finite populations. Theor. Appl. Genet. 1968, 38, 226–231. [Google Scholar] [CrossRef] [PubMed]
  47. Meuwissen, T.H.E.; Hayes, B.J.; Goddard, M.E. Prediction of Total Genetic Value Using Genome-Wide Dense Marker Maps. Genetics 2001, 157, 1819–1829. [Google Scholar] [CrossRef] [PubMed]
  48. McLaren, W.; Pritchard, B.; Rios, D.; Chen, Y.; Flicek, P.; Cunningham, F. Deriving the consequences of genomic variants with the Ensembl API and SNP Effect Predictor. Bioinformatics 2010, 26, 2069–2070. [Google Scholar] [CrossRef] [Green Version]
  49. Todorov, B.; Van De Ven, R.C.; Kaja, S.; Broos, L.A.; Verbeek, S.J.; Plomp, J.J.; Ferrari, M.D.; Frants, R.R.; Maagdenberg, A.M.V.D. Conditional inactivation of the Cacna1a gene in transgenic mice. Genesis 2006, 44, 589–594. [Google Scholar] [CrossRef]
  50. Kasprzyk, A. BioMart: Driving a paradigm change in biological data management. Database 2011, 2011, bar049. [Google Scholar] [CrossRef]
  51. Andrés-Mateos, E.; Cruces, J.; Renart, J.; Solís-Garrido, L.M.; Serantes, R.; de Lucas-Cerrillo, A.M.; Montiel, C. Bovine CACNA1A gene and comparative analysis of the CAG repeats associated to human spinocerebellar ataxia type-6. Gene 2006, 380, 54–61. [Google Scholar] [CrossRef]
  52. Lill, C.M.; Roehr, J.T.; McQueen, M.B.; Kavvoura, F.K.; Bagade, S.; Schjeide, B.M.; Schjeide, L.M.; Meissner, E.; Zauft, U.; Allen, N.C.; et al. Comprehensive Research Synopsis and Systematic Meta-Analyses in Parkinson’s Disease Genetics: The PDGene Database. PLoS Genet. 2012, 8, e1002548. [Google Scholar] [CrossRef] [Green Version]
  53. Nalls, M.A.; Pankratz, N.; Lill, C.M.; Do, C.B.; Hernandez, D.G.; Saad, M.; DeStefano, A.L.; Kara, E.; Bras, J.; Sharma, M.; et al. Large-scale meta-analysis of genome-wide association data identifies six new risk loci for Parkinson’s disease. Nat. Genet. 2014, 46, 989–993. [Google Scholar] [CrossRef] [Green Version]
  54. Leenders, A.M.; da Silva, F.H.L.; Ghijsen, W.E.; Verhage, M. Rab3A Is Involved in Transport of Synaptic Vesicles to the Active Zone in Mouse Brain Nerve Terminals. Mol. Biol. Cell 2001, 12, 3095–3102. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  55. Kapfhamer, D.; Valladares, O.; Sun, Y.; Nolan, P.M.; Rux, J.J.; Arnold, S.E.; Veasey, S.C.; Bućan, M. Mutations in Rab3a alter circadian period and homeostatic response to sleep loss in the mouse. Nat. Genet. 2002, 32, 290–295. [Google Scholar] [CrossRef] [PubMed]
  56. Halpain, S.; Dehmelt, L. The MAP1 family of microtubule-associated proteins. Genome Biol. 2006, 7, 224. [Google Scholar] [CrossRef] [PubMed]
  57. Xie, R.; Nguyen, S.; McKeehan, K.; Wang, F.; McKeehan, W.L.; Liu, L. Microtubule-associated Protein 1S (MAP1S) Bridges Autophagic Components with Microtubules and Mitochondria to Affect Autophagosomal Biogenesis and Degradation. J. Biol. Chem. 2011, 286, 10367–10377. [Google Scholar] [CrossRef] [Green Version]
  58. Østergaard, M.; Stoltenberg, M.; Løvgreen-Nielsen, P.; Volck, B.; Jensen, C.H.; Lorenzen, I. Magnetic resonance imaging-determined synovial membrane and joint effusion volumes in rheumatoid arthritis and osteoarthritis. Comparison with the macroscopic and microscopic appearance of the synovium. Arthritis Rheum. 1997, 40, 1856–1867. [Google Scholar] [CrossRef] [PubMed]
  59. Hughes, A.J.; Daniel, S.E.; Kilford, L.; Lees, A.J. Accuracy of clinical diagnosis of idiopathic Parkinson’s disease: A clinico-pathological study of 100 cases. J. Neurol. Neurosurg. Psychiatry 1992, 55, 181–184. [Google Scholar] [CrossRef] [Green Version]
  60. Storey, J.D. The positive false discovery rate: A Bayesian interpretation and the q-value. Ann. Stat. 2003, 31, 2013–2035. [Google Scholar] [CrossRef]
  61. Strobl, C.; Boulesteix, A.L.; Zeileis, A.; Hothorn, T. Bias in Random Forest Variable Importance Measures: Illustrations, Sources and a Solution. BMC Bioinform. 2007, 8, 25. [Google Scholar] [CrossRef] [Green Version]
  62. Solodkin, A.; Gomez, C.M. Spinocerebellar ataxia type 6. Handb. Clin. Neurol. 2012, 103, 461–473. [Google Scholar]
  63. Meilleur, K.G.; Traoré, M.; Sangaré, M.; Britton, A.; Landouré, G.; Coulibaly, S.; Niaré, B.; Mochel, F.; Pean, A.; Rafferty, I.; et al. Hereditary spastic paraplegia and amyotrophy associated with a novel locus on chromosome 19. Neurogenetics 2010, 11, 313–318. [Google Scholar] [CrossRef] [Green Version]
  64. Reifler, A.; Lenk, G.M.; Li, X.; Groom, L.; Brooks, S.V.; Wilson, D.; Bowerson, M.; Dirksen, R.T.; Meisler, M.H.; Dowling, J.J. Murine Fig4 is dispensable for muscle development but required for muscle function. Skelet. Muscle 2013, 3, 21. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  65. Chen, S.-Y.; Schenkel, F.S.; Melo, A.L.P.; Oliveira, H.R.; Pedrosa, V.B.; Araujo, A.C.; Melka, M.G.; Brito, L.F. Identifying pleiotropic variants and candidate genes for fertility and reproduction traits in Holstein cattle via association studies based on imputed whole-genome sequence genotypes. BMC Genom. 2022, 23, 331. [Google Scholar] [CrossRef] [PubMed]
Table 1. Disease status and population distribution for the genome-wide association study for spastic syndrome with two screening populations of North American Holstein cattle.
| Sex | Country of Origin | Affected | Control |
| --- | --- | --- | --- |
| Male a | Canada | 15 | 9 |
| Male a | USA | 11 | 5 |
| Male b | Canada | 62 | 3 |
| Male b | USA | 48 | 2 |
| Female b | Canada | 19 | 87 |
| Female b | USA | 2 | 2 |
| Total | | 157 | 108 |
a Forty animals obtained for the initial spastic syndrome screening population; and b 225 additional animals for second spastic syndrome screening population.
Table 2. Number of significant SNPs a associated with spastic syndrome including data from 265 animals genotyped with 50 K SNP panels based on the generalized quasi-likelihood score method.
| Chromosome | 50 K SNP Panel | Chromosome | 50 K SNP Panel |
| --- | --- | --- | --- |
| BTA1 | 2 | BTA16 | 1 |
| BTA2 | 1 | BTA17 | 3 |
| BTA3 | 3 | BTA18 | 4 |
| BTA4 | 1 | BTA19 | 1 |
| BTA5 | 5 | BTA20 | 6 |
| BTA6 | 6 | BTA21 | 2 |
| BTA7 | 10 | BTA22 | 13 |
| BTA8 | 17 | BTA23 | - |
| BTA9 | 3 | BTA24 | - |
| BTA10 | - | BTA25 | 2 |
| BTA11 | 11 | BTA26 | 3 |
| BTA12 | 2 | BTA27 | - |
| BTA13 | - | BTA28 | - |
| BTA14 | 5 | BTA29 | 1 |
| BTA15 | 7 | Total | 98 |
a Significance threshold of a maximum chromosome-wise pFDR of 10%.
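The chromosome-wise thresholding used for Table 2 can be illustrated with a minimal sketch. It assumes a Benjamini-Hochberg adjustment [35] applied separately within each chromosome as a simple stand-in; the study reports a positive FDR (pFDR), which is estimated differently, so this is not the procedure used to produce the table. The function names, the tuple layout, and the toy p-values are hypothetical.

```python
from collections import defaultdict


def bh_adjust(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def chromosome_wise_hits(snps, threshold=0.10):
    """Keep SNPs whose within-chromosome adjusted p-value is <= threshold.

    snps: iterable of (name, chromosome, p_value) tuples (hypothetical layout).
    """
    by_chromosome = defaultdict(list)
    for name, chromosome, p_value in snps:
        by_chromosome[chromosome].append((name, p_value))
    hits = []
    for chromosome, items in by_chromosome.items():
        adjusted = bh_adjust([p for _, p in items])
        for (name, _), q in zip(items, adjusted):
            if q <= threshold:
                hits.append((name, chromosome, q))
    return hits


# Toy data only; these are not values from the study.
toy = [("snp1", "BTA1", 0.001), ("snp2", "BTA1", 0.5), ("snp3", "BTA2", 0.2)]
print(chromosome_wise_hits(toy))
```

Adjusting within each chromosome means the number of tests entering the correction is the number of SNPs on that chromosome rather than the genome-wide total, which is why the same p-value can pass on one chromosome and fail on another.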
Table 3. Top 10 significant SNPs based on the p-value and a maximum 10% chromosome-wise pFDR threshold criterion obtained from the 50 K data genome-wide association studies, using the generalized quasi-likelihood score method.
| SNP Name | Chromosome | Location (bp) | p-Value | pFDR Chromosome-Wise Threshold (%) |
| --- | --- | --- | --- | --- |
| BTA-108606-no-rs | BTA7 | 56,290,162 | 1.00 × 10^−16 | 1 |
| ARS-BFGL-NGS-34420 | BTA5 | 115,633,531 | 2.22 × 10^−16 | 1 |
| BTB-00260821 | BTA6 | 65,865,204 | 3.33 × 10^−16 | 1 |
| BTB-01667396 | BTA6 | 65,889,482 | 4.44 × 10^−16 | 1 |
| BTB-01430347 | BTA154,126,600 | | 2.06 × 10^−11 | 1 |
| BTB-00815539 | BTA2134,339,514 | | 2.41 × 10^−11 | 1 |
| BTB-01454431 | BTA154,155,885 | | 2.75 × 10^−11 | 1 |
| Hapmap49058-BTA-121772 | BTA6 | 94,228,599 | 1.09 × 10^−10 | 1 |
| BTB-01093397 | BTA8 | 82,016,661 | 1.42 × 10^−10 | 1 |
| BTB-01518195 | BTA261,344,063 | | 1.09 × 10^−9 | 1 |
Table 4. Number of significant SNPs per chromosome in the genome-wide association study with imputed high-density (777 K) genotypes using the generalised quasi-likelihood score method, alongside SNP distributions per chromosome using various post-GWAS significance thresholds.
Number of Significant SNPs a

| Chromosome | pFDR 10% | MAF ≥ 0.2 | Chi-Square Test | Homozygotic Polarity Score ≥ 0.3 |
| --- | --- | --- | --- | --- |
| BTA1 | 8 | 1 | 1 | 1 |
| BTA2 | 9 | 1 | 1 | 1 |
| BTA3 | 48 | 1 | 1 | 1 |
| BTA4 | 3 | - | - | - |
| BTA5 | 19 | 2 | 2 | 1 |
| BTA6 | 35 | - | - | - |
| BTA7 | 71 | 40 | 40 | 14 |
| BTA8 | 103 | 15 | 14 | 11 |
| BTA9 | 30 | 3 | 2 | 2 |
| BTA10 | 6 | 1 | 1 | 1 |
| BTA11 | 3 | - | - | - |
| BTA12 | 6 | 1 | 1 | 1 |
| BTA13 | 1 | - | - | - |
| BTA14 | 36 | 24 | 24 | 18 |
| BTA15 | 19 | - | - | - |
| BTA16 | 20 | 2 | 2 | - |
| BTA17 | 4 | 2 | 2 | - |
| BTA18 | 12 | 2 | 2 | 2 |
| BTA19 | 1 | - | - | - |
| BTA20 | 35 | 21 | 19 | 14 |
| BTA21 | 15 | 1 | 1 | - |
| BTA22 | 4 | 1 | 1 | - |
| BTA23 | 11 | - | - | - |
| BTA24 | 2 | - | - | - |
| BTA25 | | - | - | - |
| BTA26 | 2 | - | - | - |
| BTA27 | 11 | - | - | - |
| BTA28 | 1 | - | - | - |
| BTA29 | 6 | - | - | - |
| Total | 522 | 118 | 114 | 67 |
a Significance thresholds were cumulative: initial threshold of maximum FDR chromosome-wise value of 10%, second threshold of minimum minor allele frequency (MAF) of 0.2, third threshold of a chi-square test, and fourth threshold of minimum homozygotic SNP difference of 0.3.
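The cumulative thresholds in footnote a can be read as a sequence of filters, each applied only to the SNPs that survived the previous one. The sketch below illustrates that reading under stated assumptions: the `SnpResult` record, its field names, and the toy values are hypothetical, and the chi-square criterion is assumed to be a p-value below 0.05, which the footnote does not specify.

```python
from dataclasses import dataclass


@dataclass
class SnpResult:
    name: str
    pfdr: float          # chromosome-wise pFDR, as a fraction
    maf: float           # minor allele frequency
    chi_square_p: float  # p-value of the chi-square test (assumed criterion)
    polarity: float      # homozygotic polarity score


def cumulative_filter(snps, max_pfdr=0.10, min_maf=0.2,
                      max_chi_p=0.05, min_polarity=0.3):
    """Apply the four thresholds in order; return survivors and per-stage counts."""
    stages = [
        ("pFDR 10%", lambda s: s.pfdr <= max_pfdr),
        ("MAF >= 0.2", lambda s: s.maf >= min_maf),
        ("Chi-square test", lambda s: s.chi_square_p < max_chi_p),
        ("Polarity >= 0.3", lambda s: s.polarity >= min_polarity),
    ]
    counts = {}
    remaining = list(snps)
    for label, keep in stages:
        # Each stage only sees SNPs that passed every earlier stage.
        remaining = [s for s in remaining if keep(s)]
        counts[label] = len(remaining)
    return remaining, counts


# Toy records only; these are not values from the study.
toy = [
    SnpResult("s1", 0.01, 0.40, 0.001, 0.40),
    SnpResult("s2", 0.20, 0.40, 0.001, 0.40),
    SnpResult("s3", 0.05, 0.10, 0.001, 0.40),
    SnpResult("s4", 0.05, 0.30, 0.001, 0.10),
]
kept, counts = cumulative_filter(toy)
print([s.name for s in kept], counts)
```

Because the filters are cumulative, the per-stage counts can only stay the same or decrease from left to right, as they do in each row of Table 4.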
Table 5. ‘Top’ 10 significant SNPs based on the p-value and maximum 10% chromosome-wise pFDR threshold criteria obtained from genome-wide association studies of imputed high-density (777 K) genotypes using the generalized quasi-likelihood score method.
| SNP Name | Chr. | Location (bp) | p-Value | pFDR Chromosome-Wise Threshold (%) | Minor Allele Frequency | Homozygotic Polarity Score |
| --- | --- | --- | --- | --- | --- | --- |
| BovineHD1400022775 | BTA14 | 80,696,596 | 7.81 × 10^−6 | 1 | 0.392 | 0.40 |
| BovineHD0800033791 | BTA8 | 70,398,696 | 8.59 × 10^−6 | 1 | 0.403 | 0.30 |
| BovineHD4100006888 | BTA8 | 70,445,854 | 8.59 × 10^−6 | 1 | 0.403 | 0.30 |
| BovineHD1400022777 | BTA14 | 80,698,839 | 1.19 × 10^−5 | 1 | 0.395 | 0.52 |
| BovineHD0900010787 | BTA9 | 38,889,327 | 1.21 × 10^−5 | 5 | 0.203 | 0.35 |
| BovineHD1400022780 | BTA14 | 80,703,868 | 1.44 × 10^−5 | 5 | 0.489 | 0.45 |
| BovineHD1000030149 | BTA10 | 103,183,804 | 1.51 × 10^−5 | 5 | 0.479 | 0.42 |
| BovineHD1400022806 | BTA14 | 80,744,738 | 1.58 × 10^−5 | 5 | 0.490 | 0.50 |
| BovineHD1400022805 | BTA14 | 80,743,593 | 1.59 × 10^−5 | 5 | 0.464 | 0.44 |
| ARS-BFGL-BAC-26936 | BTA14 | 80,753,881 | 1.79 × 10^−5 | 5 | 0.494 | 0.51 |
Table 6. Information on common significant SNPs a between 50 K and high-density (777 K) genome-wide association studies in 265 North American Holsteins screening for genomic areas of interest for spastic syndrome.
| SNP Name | Chr. | Location (bp) | p-Value | pFDR Chromosome-Wise Threshold (%) | Minor Allele Frequency | Homozygotic Polarity Score |
| --- | --- | --- | --- | --- | --- | --- |
| ARS-BFGL-NGS-67684 | BTA1 | 46,620,571 | 2.98 × 10^−5 | 10 | 0.244 | 0.50 |
| Hapmap59438-rs29012637 | BTA7 | 6,669,157 | 0.000475 | 10 | 0.220 | 0.38 |
| ARS-BFGL-NGS-16633 | BTA7 | 15,893,312 | 0.000217 | 10 | 0.259 | 0.30 |
| Hapmap58018-ss46526014 | BTA8 | 70,401,417 | 0.000127 | 5 | 0.403 | 0.32 |
| ARS-BFGL-BAC-26470 | BTA14 | 49,517,377 | 0.000175 | 10 | 0.269 | 0.74 |
| BTB-01223066 | BTA14 | 50,747,170 | 7.46 × 10^−5 | 5 | 0.451 | 0.31 |
| ARS-BFGL-BAC-26936 | BTA14 | 80,753,881 | 1.79 × 10^−5 | 5 | 0.263 | 0.50 |
| ARS-BFGL-NGS-3276 | BTA20 | 12,017,779 | 0.000332 | 10 | 0.396 | 0.34 |
| ARS-BFGL-NGS-119698 | BTA20 | 16,715,651 | 5.15 × 10^−5 | 5 | 0.479 | 0.30 |
| Hapmap57270-ss46526311 | BTA20 | 16,780,563 | 7.59 × 10^−5 | 5 | 0.479 | 0.48 |
a Significance threshold of a maximum chromosome-wise pFDR of 10%, a minimum minor allele frequency (MAF) of 0.2, third threshold of a chi-square test, and fourth threshold of a minimum homozygotic SNP difference of 0.3.
Table 7. Number of significant SNPs per chromosome in the genome-wide association study with imputed high-density (777 K) genotypes using a random forest regression, alongside SNP distributions per chromosome using various post-GWAS significance thresholds.
| Chromosome | Total Number of Significant SNPs | SNP Frequency Range a | Average VIM% Range b | Region of Interest (Mb) c |
| --- | --- | --- | --- | --- |
| BTA1 | 34 | 2–6 | 0.00054–0.00291 | 100–107 |
| BTA2 | 48 | 2–5 | 0.00057–0.00248 | 80–90 |
| BTA3 | 21 | 2–3 | 0.00068–0.00202 | 23–29 |
| BTA4 | 78 | 2–5 | 0.00060–0.00322 | 110–112 |
| BTA5 | 41 | 2–6 | 0.00058–0.00266 | 90–100 |
| BTA6 | 44 | 2–5 | 0.00034–0.00340 | 12–15 |
| BTA7 | 85 | 2–19 | 0.00056–0.00341 | 104–112 |
| BTA8 | 19 | 2–5 | 0.00034–0.00253 | 60–70 |
| BTA9 | 50 | 2–9 | 0.00056–0.00251 | 40–60 |
| BTA10 | 9 | 2–3 | 0.00059–0.00173 | 101–103 |
| BTA11 | 20 | 2–5 | 0.00028–0.00236 | 60–80 |
| BTA12 | 46 | 2–6 | 0.00060–0.00241 | 30–40 |
| BTA13 | 22 | 2–4 | 7.40 × 10^−19–0.00251 | 27–30 |
| BTA14 | 57 | 2–6 | 0.00063–0.00257 | 2–6 |
| BTA15 | 20 | 2–5 | 0.00065–0.00202 | 16 |
| BTA16 | 32 | 2–10 | 0.00029–0.00275 | 69–71 |
| BTA17 | 22 | 2–4 | 0.00033–0.00220 | 10–11 |
| BTA18 | 13 | 2–5 | 0.00061–0.00223 | 59 |
| BTA19 | 39 | 2–5 | 0.00059–0.00251 | 50–58 |
| BTA20 | 51 | 2–5 | 0.00028–0.00269 | 10–13 |
| BTA21 | 13 | 2–5 | 0.00047–0.00274 | 65–71 |
| BTA22 | 20 | 2–3 | 0.00079–0.00208 | 22–25 |
| BTA23 | 7 | 2 | 0.00065–0.00178 | 38–41 |
| BTA24 | 21 | 2–4 | 0.00066–0.00302 | 11–12 |
| BTA25 | 20 | 2–6 | 0.00061–0.00198 | 21–28 |
| BTA26 | 24 | 2–5 | 0.00065–0.00211 | 23–29 |
| BTA27 | 18 | 2–7 | 0.00063–0.00184 | 31–37 |
| BTA28 | 21 | 2–6 | 0.00059–0.00207 | 26–27 |
| BTA29 | 18 | 2–4 | 0.00062–0.00254 | 48–50 |
| Total | 913 | | | |
a Frequency of significant SNPs occurring in different forests, with a minimum forest frequency of two. b Average range of the VIM% for all significant SNPs. c Chromosomal region in which the majority of significant SNPs are located.
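The forest frequency and average VIM% described in footnotes a and b can be sketched as a simple tally across forests. This is an illustrative reading only: it assumes each forest has already been reduced to a mapping from its selected SNPs to their variable importance (VIM%), and the function name, data layout, and toy values are hypothetical rather than taken from the study.

```python
from collections import defaultdict


def forest_frequency(forests, min_frequency=2):
    """Tally how often each SNP is selected across forests.

    forests: list of dicts mapping SNP name -> VIM% for the SNPs selected
    in that forest (hypothetical layout).
    Returns {snp: (frequency, average VIM%)} for SNPs selected in at
    least `min_frequency` forests.
    """
    vims = defaultdict(list)
    for selected in forests:
        for snp, vim in selected.items():
            vims[snp].append(vim)
    return {
        snp: (len(values), sum(values) / len(values))
        for snp, values in vims.items()
        if len(values) >= min_frequency
    }


# Toy importances only; these are not values from the study.
toy_forests = [
    {"snpA": 0.002, "snpB": 0.001},
    {"snpA": 0.001},
    {"snpA": 0.003, "snpC": 0.003},
]
print(forest_frequency(toy_forests))
```

Requiring a SNP to recur in at least two forests filters out markers that are selected once by chance in a single stochastic run.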
Table 8. Common SNPs between the high-density GWAS via the GQLS method and at least two forests of random forest regression for genomic areas of interest for spastic syndrome.
| SNP Name | Chr. | Base Pair Location | p-Value | pFDR Chromosome-Wise Threshold (%) | MAF | Homozygotic Polarity Score a | Average VIM% | Forest Frequency |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BovineHD0700004329 | BTA7 | 15,774,319 | 0.00029 | 10 | 0.45 | 0.20 | 0.00180 | 5 |
| BovineHD0700003962 | BTA7 | 14,674,895 | 0.00018 | 10 | 0.26 | 0.25 | 0.00087 | 2 |
| BovineHD0700004239 | BTA7 | 15,417,524 | 0.00054 | 10 | 0.49 | 0.28 | 0.00077 | 2 |
| BovineHD0700003455 | BTA7 | 13,282,230 | 0.00012 | 10 | 0.42 | 0.35 | 0.00147 | 4 |
| BovineHD4100006888 | BTA8 | 70,445,854 | 8.59 × 10^−6 | 1 | 0.40 | 0.30 | 0.00201 | 2 |
| BovineHD0800020415 | BTA8 | 68,017,405 | 0.00062 | 10 | 0.48 | 0.30 | 0.00253 | 2 |
| BovineHD0800020414 | BTA8 | 68,016,458 | 0.00062 | 10 | 0.48 | 0.32 | 0.00135 | 2 |
| BovineHD0800033791 | BTA8 | 70,398,696 | 8.59 × 10^−6 | 1 | 0.40 | 0.34 | 0.00098 | 5 |
| BovineHD0800020416 | BTA8 | 68,018,832 | 0.00062 | 10 | 0.48 | 0.38 | 0.00134 | 3 |
| BovineHD1400022775 | BTA14 | 80,696,596 | 7.81 × 10^−6 | 1 | 0.39 | 0.40 | 0.00126 | 2 |
| ARS-BFGL-BAC-23853 | BTA14 | 45,174,382 | 0.00030 | 10 | 0.49 | N/A | 0.00067 | 2 |
| BovineHD1400022812 | BTA14 | 80,764,418 | 6.95 × 10^−5 | 5 | 0.47 | 0.45 | 0.00160 | 3 |
| BovineHD1400022779 | BTA14 | 80,702,943 | 2.43 × 10^−5 | 5 | 0.42 | 0.47 | 0.00162 | 2 |
| BovineHD2000005062 | BTA20 | 16,686,278 | 8.34 × 10^−5 | 5 | 0.42 | 0.28 | 0.00119 | 5 |
| BovineHD2000005856 | BTA20 | 19,580,504 | 0.00010 | 5 | 0.48 | 0.28 | 0.00151 | 3 |
| BovineHD2000003858 | BTA20 | 12,044,497 | 0.00012 | 5 | 0.44 | 0.35 | 0.00130 | 2 |
a All SNP polarity scores are significant by chi-square test.
Table 9. Candidate gene list for animals genotyped with imputed high-density (777 K) SNP panels, phenotyped for spastic syndrome.
| Gene ID (Location, bp) | SNP ID | Chr. | SNP Location (bp) | SNP Location from Gene (bp) | Homozygous Differences 1 | MAF | Protein Significance to Spastic Syndrome |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ENSBTAG00000014828 CACNA1A (13,249,309–13,506,224) | BovineHD0700003455 | BTA7 | 13,282,230 | SNP is within gene | 0.32 | 0.42 | Adult locomotory behavior; calcium transporter for nerve impulse involved in muscle contraction |
| | BovineHD0700004239 | BTA7 | 15,417,524 | 1,911,300 bases left | 0.34 | 0.49 | |
| | ARS-BFGL-NGS-102773 | BTA7 | 12,833,745 | 415,564 bases right | 0.32 | 0.40 | |
| | BovineHD0700003962 | BTA7 | 14,674,895 | 1,168,671 bases left | 0.30 | 0.26 | |
| | BovineHD0700003008 | BTA7 | 11,340,236 | 1,909,073 bases right | 0.37 | 0.41 | |
| | BovineHD0700004113 | BTA7 | 15,046,423 | 1,540,199 bases left | 0.50 | 0.37 | |
| | BovineHD0700004235 | BTA7 | 15,408,372 | 1,902,148 bases left | 0.54 | 0.23 | |
| ENSBTAG00000010635 RAB3A (4,944,325–4,950,010) | BovineHD0700001946 | BTA7 | 6,632,343 | 1,682,333 bases left | 0.42 | 0.47 | Neuron development; synaptic vesicle exocytosis |
| | Hapmap59438-rs29012637 | BTA7 | 6,669,157 | 1,719,147 bases left | 0.33 | 0.46 | |
| ENSBTAG00000020709 MAP1S (5,335,661–5,367,540) | BovineHD0700001946 | BTA7 | 6,632,343 | 1,264,803 bases left | 0.42 | 0.47 | Neuron projection; development and morphogenesis |
| | Hapmap59438-rs29012637 | BTA7 | 6,669,157 | 1,301,617 bases left | 0.33 | 0.46 | |
| ENSBTAG00000029901 Bta-mir-23a-201 (12,981,970–12,982,042) | ARS-BFGL-NGS-102773 | BTA7 | 12,833,745 | 148,225 bases right | 0.32 | 0.40 | Post-translational inhibition; chaperone-like behavior |
| ENSBTAG00000039130 Uncharacterized (14,670,708–14,672,994) | BovineHD0700003737 | BTA7 | 14,129,719 | 540,989 bases right | 0.31 | 0.21 | TCP-1 chaperone family; heat shock protein |
| ENSBTAG00000005827 FIG4 (40,873,047–41,046,003) | BovineHD0900010787 | BTA9 | 38,889,327 | 1,983,720 bases right | 0.57 | 0.20 | Locomotory behavior; expressed in frontal lobe, where muscle movement is partially controlled |
| ENSBTAG00000011851 FYN (39,107,807–39,259,883) | BovineHD0900010787 | BTA9 | 38,889,327 | 218,480 bases right | 0.57 | 0.20 | Locomotion; positive regulation of neuron projection development |
1 Population proportionate distributions of homozygotic SNPs were calculated in the affected and unaffected populations, and the differences of the homozygous states were taken per disease status. Then, the absolute difference of the differences in homozygotic status was retrieved to identify the potential polarity of homozygous SNP genotypes and disease status.
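The two derived columns of Table 9 can be illustrated with a minimal Python sketch. It reflects one reading of footnote 1 (the within-group difference between the two homozygote proportions, compared between affected and unaffected animals), not the authors' actual code. The function names and the 0/1/2 genotype coding are hypothetical, and the "left"/"right" labels follow a convention inferred from the table values: a SNP at a lower coordinate than the gene start is reported as "bases right", and one beyond the gene end as "bases left".

```python
from collections import Counter


def homozygous_difference(case_genotypes, control_genotypes):
    """Absolute difference, between cases and controls, of the within-group
    difference in homozygote proportions for a single SNP.

    Assumed coding (not specified in the source): 0 = homozygous for one
    allele, 1 = heterozygous, 2 = homozygous for the other allele.
    """

    def within_group_difference(genotypes):
        if not genotypes:
            raise ValueError("each group needs at least one genotype")
        counts = Counter(genotypes)
        n = len(genotypes)
        # Proportion of one homozygote minus proportion of the other.
        return counts[0] / n - counts[2] / n

    return abs(
        within_group_difference(case_genotypes)
        - within_group_difference(control_genotypes)
    )


def snp_offset_from_gene(snp_pos, gene_start, gene_end):
    """Distance in bp between a SNP and the nearest gene boundary, with the
    direction label as it appears in Table 9 (convention inferred from the
    tabulated values, not stated in the source)."""
    if gene_start <= snp_pos <= gene_end:
        return 0, "within gene"
    if snp_pos < gene_start:
        return gene_start - snp_pos, "right"
    return snp_pos - gene_end, "left"


if __name__ == "__main__":
    # Illustrative genotypes only; these are not data from the study.
    cases = [0] * 6 + [1] * 3 + [2] * 1
    controls = [0] * 2 + [1] * 4 + [2] * 4
    print(homozygous_difference(cases, controls))

    # CACNA1A boundaries and SNP BovineHD0700004239, taken from Table 9.
    print(snp_offset_from_gene(15_417_524, 13_249_309, 13_506_224))
    # prints (1911300, 'left')
```

Under this reading, a value near 0 means the two homozygous genotypes are distributed similarly in affected and unaffected animals, while larger values indicate that the homozygous classes separate by disease status.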
Table 10. Consequences of SNP variants in genes of interest associated with spastic syndrome.
| Gene | Missense | Synonymous | Intron Variant | Upstream Gene Variant | Downstream Gene Variant | Elongation | Truncation | Splice Region Variant |
Chromosome 7
RAB3A121825 a212--
MAP1S8 b3119123034-
CACNA1A14 c212200204372906
Uncharacterized2 d-169778-21
Chromosome 9
FYN-31115201746541
FIG42141496928063771
a One variant predicted to be deleterious. b Four variants predicted to be deleterious. c Three variants predicted to be deleterious. d One variant predicted to be deleterious.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Neustaeter, A.; Brito, L.F.; Hanna, W.J.B.; Baird, J.D.; Schenkel, F.S. Investigating the Genetic Background of Spastic Syndrome in North American Holstein Cattle Based on Heritability, Genome-Wide Association, and Functional Genomic Analyses. Genes 2023, 14, 1479. https://doi.org/10.3390/genes14071479
