SNP Analysis and Whole Exome Sequencing: Their Application in the Analysis of a Consanguineous Pedigree Segregating Ataxia

Autosomal recessive cerebellar ataxia encompasses a large and heterogeneous group of neurodegenerative disorders. We employed single nucleotide polymorphism (SNP) analysis and whole exome sequencing to investigate a consanguineous Maori pedigree segregating ataxia. We identified a novel mutation in exon 10 of the SACS gene: c.7962T>G p.(Tyr2654*), establishing the diagnosis of autosomal recessive spastic ataxia of Charlevoix-Saguenay (ARSACS). Our findings expand both the genetic and phenotypic spectrum of this rare disorder, and highlight the value of high-density SNP analysis and whole exome sequencing as powerful and cost-effective tools in the diagnosis of genetically heterogeneous disorders such as the hereditary ataxias.


Introduction
Autosomal recessive cerebellar ataxias (ARCAs) are a complex group of disabling neurodegenerative disorders that manifest predominantly in childhood and early adulthood [1]. Despite increasing knowledge regarding the molecular basis of ARCA, a considerable number of patients remain without a specific diagnosis [2]. The diagnosis is challenging due to both genotypic and phenotypic heterogeneity [1]. More than 30 genes/loci have been associated with over 20 different clinical forms of ARCA [3]. Phenotypic variability in the expression of cerebellar impairment, including atypical phenotypes and overlapping clinical features, further complicates the picture [2]. The delineation of the precise clinical features ascribed to each ARCA remains under debate, hence complicating, yet also necessitating, a conclusive molecular diagnosis.
Mutations causative of different forms of ARCA are frequently located in genes of particularly large coding capacity (e.g., SYNE1 (NM_182961) 26 kb in 145 coding exons [4], SACS (NM_014363) 14 kb in 9 exons [5] and ATM (NM_000051) 9 kb in 62 coding exons [6]). Sanger-based (dideoxy) DNA sequencing is the gold standard for detecting mutations at the base pair level, but it is costly on an individual gene basis, and given the number and size of associated genes, its use in the investigation of unidentified ARCA is limited.
Here, we describe the application of high-density single nucleotide polymorphism (SNP) analysis and whole exome sequencing (WES) in the investigation of two Maori siblings presenting with apparent ARCA.

Case Presentation
The two affected siblings (V:11 and V:12, Figure 1), now deceased, were of Maori and English ancestry, and there were twelve unaffected sibs. Their parents were first cousins. The siblings share a similar progressive clinical history, characterised primarily by cerebellar ataxia, peripheral neuropathy, and pyramidal tract signs. The initially suspected diagnosis was Friedreich ataxia, but FXN gene testing returned normal results.
The index case (V:11) experienced onset of lower limb weakness in his early twenties. By his late forties he was able to walk only very short distances with a frame, and relied predominantly on a wheelchair. At this stage, he was also affected by upper limb ataxia, being unable to perform fine motor movements such as fastening buttons.
Clinical examination at the age of fifty-two years revealed dysarthria, nystagmus on lateral gaze, and an intention tremor. He had marked wasting of the intrinsic muscles of the hands and distal quadriceps. His tone was normal throughout. Upper limb power was normal, except in the small muscles of the hands; however, he was now wheelchair-bound with truncal weakness and reduced power of the lower limbs (4/5 at the hips and knees, and minimal movement at the ankles). Tendon reflexes were absent in the lower limbs, except for a flicker at the knees. Plantars were extensor. Sensation was intact, but vibration sense was absent in the lower limbs. By sixty-one years of age, his clinical features had progressed further. Power had also slightly reduced in the upper limbs, with absent reflexes other than a small flicker of the triceps. Tone remained normal throughout. Fundoscopy revealed no abnormalities. He died from pneumonia at the age of sixty-five years.
His sister (V:12) was, at the time of our study, a sixty-four-year-old woman with a similar clinical history. Examination in her mid-to-late thirties had revealed moderate dysarthria, dysdiadochokinesia, and a slight intention tremor. Her gait was ataxic with a positive Romberg sign. She was able to walk unaided; lower limb tone was normal, but there was moderate symmetrical weakness, absent ankle reflexes, and extensor plantars. Upper limb tone, power and sensation were normal, but tendon reflexes were reduced.
By her late fifties, examination revealed a progression in her clinical features. She had become wheelchair-bound, with clear wasting from the mid thighs distally. Lower limb tone was normal, all reflexes were absent, and plantars were extensor. Vibration sense was grossly intact at the lower thigh but absent distally. She had severe Achilles contracture. In the upper limbs there was an impression of loss of bulk in the forearms, and clear wasting of the thenar, first interosseous and intrinsic muscles. All reflexes were absent. Vibration sense remained intact in the upper limbs. Fundal examination was unremarkable, but there was occasional coarse nystagmoid movement on lateral gaze.
At the age of thirty-six years, a CT scan demonstrated widening of cortical sulci over both cerebral hemispheres, compatible with minor cerebral atrophy, some cerebellar atrophy with enlargement of the fourth ventricle and cisterna magna, and widening of subarachnoid spaces over the superior aspect of the cerebellum. Electromyography showed very low amplitude dispersed compound muscle action potentials in the lower limbs, slowing of all motor conduction velocities, and absent sensory nerve action potentials. Neurogenic motor unit potential changes and fibrillation in distal muscles were evident.
Two cousins of the probands (V:1 and V:2, Figure 1), also born of a consanguineous union, were suspected to have been affected, but had died before molecular analysis became available. Clinical records were available for one of these individuals, and described a phenotype remarkably similar to that of the probands, with the exception of an early childhood/congenital onset, including early cognitive impairment.

Molecular Studies
Genomic DNA was isolated from peripheral blood of the two affected siblings, together with an unaffected brother, using the Gentra Puregene blood kit according to the manufacturer's instructions (Qiagen, Gaithersburg, MD, USA).

SNP Analysis
An Affymetrix Cytogenetics Whole-Genome 2.7 M Array was used for SNP analysis in the two affected siblings. This array consists of 2.7 million probes spaced across the genome, enabling high-resolution genome-wide interrogation and allelic discrimination. Genomic DNA (0.1 μg) was labelled using the Affymetrix Cytogenetics Reagent Kit (Affymetrix, Santa Clara, CA, USA), applied to the Affymetrix Cytogenetics 2.7 M array and scanned according to the manufacturer's instructions (Affymetrix Cytogenetics Assay Protocol [7]). Data were analysed using the Affymetrix Chromosome Analysis Suite (ChAS) v1.0.1/na30.1 with the aid of the UCSC genome browser [8]. All genomic coordinates were taken from the human reference sequence hg18 (NCBI build 36.1). Our reporting policy is to omit copy number changes that do not contain genes, are well-established polymorphisms, are losses smaller than 200 kb, or are gains smaller than 400 kb, unless associated with a gene of known clinical significance. Long contiguous stretches of homozygosity (LCSH) greater than 5 Mb are detected. The array will not detect balanced alterations (translocations, inversions and insertions).
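The reporting policy stated above can be expressed as a simple decision rule. The following Python sketch is illustrative only and is not the ChAS implementation: the class and field names are hypothetical, and it assumes that association with a gene of known clinical significance overrides every exclusion criterion, which is one possible reading of the policy.

```python
from dataclasses import dataclass

# Size thresholds taken from the reporting policy in the text.
LOSS_MIN_KB = 200
GAIN_MIN_KB = 400


@dataclass(frozen=True)
class CopyNumberCall:
    """Hypothetical record for one copy number change (not a ChAS data type)."""
    kind: str                          # "loss" or "gain"
    size_kb: float
    contains_genes: bool
    known_polymorphism: bool
    clinically_significant_gene: bool


def is_reportable(call: CopyNumberCall) -> bool:
    """Return True if the call would be reported under the stated policy."""
    # Assumption: a gene of known clinical significance overrides all exclusions.
    if call.clinically_significant_gene:
        return True
    if not call.contains_genes or call.known_polymorphism:
        return False
    threshold = LOSS_MIN_KB if call.kind == "loss" else GAIN_MIN_KB
    return call.size_kb >= threshold


if __name__ == "__main__":
    example = CopyNumberCall("gain", 350, True, False, False)
    print(is_reportable(example))
```

Under this reading, a 350 kb gain containing genes is omitted unless it involves a gene of known clinical significance.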

WES
WES was performed in the affected siblings, V:11 and V:12, and one unaffected sibling, in order to investigate the locus of the undelineated neurodegenerative hereditary ataxia. The Nextera Rapid Capture Exome kit (Illumina, San Diego, CA, USA) was used to target 214,405 coding exons (37 Mb), which comprise 98.3% of all RefSeq genes. Sequencing was performed using an Illumina HiSeq 2500, achieving a mean coverage of over 100×. An in-house bioinformatics pipeline was used to process the data and generate an annotated variant call file. Further annotation and variant filtering were carried out using Variant Studio software (Illumina). Variants were annotated against all gene transcripts and reported against the HGNC recommended transcript.

Primer Design and Sanger-based Sequencing
Primers were designed to flank the relevant exon of the SACS gene, including 50 bp of the flanking intronic regions. The UCSC genome browser was used to obtain the reference transcript of the SACS gene (NM_014363.4) and its protein product (NP_055178.3). This website provides a direct link to ExonPrimer for the design of primers flanking coding exons. The primers were checked for underlying SNPs using the online software tool available from the National Genetic Reference Laboratory, Manchester [9]. After passing in silico tests, the primers were tailed with M13 sequences and synthesised by Integrated DNA Technologies (details available upon request).
Sanger-based sequencing was performed to confirm the research-based WES results and the traces were analysed using Variant Reporter™ Software v1.0 (Thermo Fisher Scientific, Cleveland, OH, USA) as described previously [10]. GenBank NM_014363.4 was used as the reference sequence, with cDNA number +1 corresponding to the A of the translation initiation codon (codon 1). Each sequence trace had a minimum trace score of 35, which corresponds to an average false base call frequency of 0.031%.
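The relationship between the trace score and the quoted false base call frequency can be checked with a short calculation. This sketch assumes that the trace score is on the standard Phred scale, where a quality value Q corresponds to an error probability of 10^(-Q/10); the function name is ours.

```python
def phred_error_probability(quality: float) -> float:
    """Convert a Phred-scaled quality value to a base call error probability."""
    return 10 ** (-quality / 10)


if __name__ == "__main__":
    p = phred_error_probability(35)
    # Expressed as a percentage for comparison with the figure quoted in the text.
    print(f"Q35 error probability: {p * 100:.4f}%")
```

For a score of 35 this gives approximately 0.0316%, in line with the 0.031% stated above.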

Results
SNP analysis revealed nineteen regions of LCSH over 5 Mb in length in individual V:11, five of which were above 20 Mb long, and a total of fourteen regions of LCSH over 5 Mb in individual V:12, including three above 20 Mb (Figure 2). Of the identified LCSH regions, six were common to both affected siblings (Table 1) and were thus likely to harbour the locus for their recessively-inherited cerebellar ataxia. No clinically significant copy number changes were detected in either case.
A search of OMIM genes associated with ataxia and located within the regions of LCSH common to both affected siblings yielded seven candidate genes (Table 2); however, there was no single gene for which full phenotypic concordance could be seen.

Table 2. Candidate ataxia genes located in the regions of LCSH common to both affected siblings, identified by searching OMIM genes for the specific clinical feature "ataxia".

Gene     Phenotype                                                              Inheritance
COQ2     Multiple-system atrophy                                                AR, AD
GRID2    Autosomal recessive spinocerebellar ataxia-18                          AR
TTPA     Ataxia with isolated vitamin E deficiency                              AR
CYP7B1   Autosomal recessive spastic paraplegia-5A                              AR
PEX2     Peroxisome biogenesis disorder-5B                                      AR
SACS     Autosomal recessive spastic ataxia of Charlevoix-Saguenay              AR
ATP8A2   Cerebellar ataxia, mental retardation, and disequilibrium syndrome-4   AR

Rather than screen candidate ataxia genes found in the shared LCSH regions, we undertook a complementary WES study, with variant filtering as described (Figure 3), which yielded three possible variants (Table 3). The variant most concordant with the phenotype, most likely to be clinically significant, and located in one of the LCSH regions common to both affected siblings, was the SACS gene variant c.7962T>G p.(Tyr2654*). Sanger-based sequencing confirmed the apparent homozygosity of this variant in the two affected siblings (Figure 4).
The nonsense variant, c.7962T>G p.(Tyr2654*), has not, to our knowledge, been previously reported in the literature or mutation databases (Leiden Open Variation Database, LOVD, www.lovd.nl/3.0/home; Human Gene Mutation Database Professional, HGMD® Pro) [12]. However, given that it results in premature termination of mRNA translation, it is predicted to be pathogenic. The Mutalyzer 2.0.13 website [13] predicts that this variant produces an amino acid sequence which is 58% of the length of that coded by the reference SACS gene transcript (RefSeq accession number: NM_014363.4).
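The correspondence between the cDNA position, the affected codon, and the predicted truncated length can be reproduced with simple arithmetic. The sketch below is not a substitute for Mutalyzer. It assumes a full-length sacsin of 4,579 amino acids for NP_055178.3, a figure that is not stated in this report and should be verified against the reference record, and the function names are ours.

```python
# Assumption (not stated in this report): length of NP_055178.3 in amino acids.
SACSIN_LENGTH_AA = 4579


def codon_number(cdna_position: int) -> int:
    """Codon containing a coding DNA position, with c.1 as the A of the ATG."""
    return (cdna_position - 1) // 3 + 1


def position_in_codon(cdna_position: int) -> int:
    """Position (1, 2 or 3) of a coding DNA nucleotide within its codon."""
    return (cdna_position - 1) % 3 + 1


def truncated_fraction(stop_codon: int, full_length_aa: int) -> float:
    """Fraction of the protein translated before a premature stop codon."""
    return (stop_codon - 1) / full_length_aa


if __name__ == "__main__":
    codon = codon_number(7962)
    print(f"c.7962 lies at base {position_in_codon(7962)} of codon {codon}")
    fraction = truncated_fraction(codon, SACSIN_LENGTH_AA)
    print(f"Predicted product: {codon - 1} residues ({fraction:.1%} of full length)")
```

Position c.7962 is the third base of codon 2654, so a T>G change in a tyrosine codon with T at that position (TAT) yields the stop codon TAG. Under the assumed protein length, the 2,653 residues translated before the stop account for roughly 58% of the full-length product, consistent with the Mutalyzer prediction.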

Discussion
ARSACS (OMIM 270550) is a complex neurodegenerative disorder caused by mutations in the SACS gene [5]. ARSACS was first described in individuals from the Saguenay-Lac-St-Jean area of north-eastern Quebec [14], where cases are predominantly due to two founder mutations: c.6594delT and c.5254C>T [15]. It is now well recognised that ARSACS is not limited to this region, but occurs worldwide [16][17][18][19][20][21][22][23]. The disorder is likely under-diagnosed and the true incidence remains unknown; after Friedreich ataxia and ataxia-telangiectasia, it may be one of the more frequently seen of the recessively-inherited ataxias [24].
ARSACS is characterised by progressive cerebellar ataxia, pyramidal tract signs, and a peripheral sensorimotor neuropathy predominantly affecting the lower limbs [14]. Progressive lower limb spasticity is considered a core clinical feature, and is associated with the preservation of tendon reflexes, except for ankle jerks [14,16]. The patients we describe here differ from the classical phenotype as their lower limb tone was normal and tendon reflexes were absent. Two other ARSACS families with a spasticity-lacking phenotype have been reported, but the genotypes in each case differed: c.987T>C and c.5988_9delCT [25,26]. As observed in our cases, these individuals displayed bilateral Babinski signs indicating pyramidal involvement. This highlights that spasticity is not a constant feature of ARSACS, and its absence should not exclude ARSACS as a differential diagnosis in cases of early-onset cerebellar ataxia.
Disease onset occurred in adulthood in the probands reported here, whereas in individuals originating from Quebec onset occurs between 12 and 18 months of age. Although cases of adult onset have been described outside of this region, childhood onset is the norm [27]. Early childhood/congenital onset was observed in a cousin of the probands (V:2, Figure 1); however, early cognitive impairment was also present. Her phenotype could be explained by a coincidental birth injury, or by an effect of marked environmental deprivation, upon which the familial ARCA was superimposed. We also note more recent reports of a proposed cognitive/psychological component of ARSACS per se [28,29], although whether this was a factor in her case remains speculative.
Treatment for ARSACS remains largely symptomatic. Postmortem examination has demonstrated atrophy of the anterior cerebellar vermis associated with Purkinje cell death, small corticospinal tracts, and demyelination of both spinal cord corticospinal and posterior spinocerebellar tracts [30,31].
The SACS gene is located on chromosome 13q12.12 [32,33] and encodes an 11.7 kb protein, sacsin, which is highly expressed throughout the central nervous system, as well as in skeletal muscle and skin fibroblasts [5]. The carboxyl-terminus domain contains a "DnaJ" motif that binds the heat shock protein Hsc70 [34]. Its additional ubiquitin-like domain indicates that sacsin could play a role in linking the ubiquitin-proteasome pathway to the heat shock protein 70 machinery [34]. Recent studies in homozygous Sacs knockout (Sacs−/−) mice have revealed that an absence of sacsin leads to abnormal accumulation of non-phosphorylated neurofilament bundles in the somatodendritic regions of vulnerable neuronal populations and in ARSACS brain [35]. Motor neurons cultured from Sacs−/− embryos showed a comparable rearrangement of neurofilament bundles with elongated mitochondria and reduction in mitochondrial motility [35]. These observations suggest that disruption of mitochondrial organisation and dynamics, through alterations in cytoskeletal proteins, underpins the pathophysiological basis of ARSACS [35].
The reference sequence NM_014363.4 has ten exons, of which nine are coding. More than 180 mutations have now been reported in the SACS gene (LOVD, HGMD Pro), 80% of which are in exon 10. Approximately one quarter of all reported mutations are nonsense mutations, a third are missense, and the remainder comprise deletions and insertions of various sizes.
Homozygosity mapping is a valuable tool for the identification of defective loci, especially in patients born from consanguineous relationships [36], and has been successfully used in cases of ARCA, including ARSACS [37]. However, a high percentage of consanguineous ataxia families have been found to segregate with loci not corresponding to known ataxia genes [24,37], suggesting that further unidentified ARCA entities exist and paving the way for the discovery of novel causative genes [37].
By combining the homozygosity data obtained from the SNP array with the WES data, we were able to identify a single variant present in an identified region of LCSH (Figure 3). The 2.7 M array additionally enabled the exclusion of inherited copy number variations above the size thresholds stated above, which would not be detected by WES. We note that the possibility of hemizygosity has not been fully excluded, as parental DNA is not available for analysis and a heterozygous exonic or multi-exonic deletion of the SACS gene has not been excluded. SNP or haplotype analysis of the region could be of value in determining whether the identified mutation originated from English or Maori ancestry; however, unavailability of resources prevented this analysis from being undertaken.
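The logic of combining the two data sets can be sketched as an interval intersection followed by a positional filter. The Python example below is a minimal illustration and not the pipeline used in this study: all coordinates, the placeholder gene names GENE_A and GENE_B, the genotype labels, and the filtering criteria (homozygous in both affected siblings, not homozygous in the unaffected sibling) are invented assumptions.

```python
def shared_regions(regions_a, regions_b):
    """Intersect two lists of (chrom, start, end) homozygous regions."""
    shared = []
    for chrom_a, start_a, end_a in regions_a:
        for chrom_b, start_b, end_b in regions_b:
            if chrom_a != chrom_b:
                continue
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:
                shared.append((chrom_a, start, end))
    return sorted(shared)


def in_regions(chrom, pos, regions):
    """True if a position falls inside any of the given regions."""
    return any(c == chrom and s <= pos <= e for c, s, e in regions)


def candidate_variants(variants, regions):
    """Keep variants homozygous in both affected siblings, not homozygous in
    the unaffected sibling, and located inside a shared region (assumed rule)."""
    return [
        v for v in variants
        if v["v11"] == "hom" and v["v12"] == "hom"
        and v["unaffected"] != "hom"
        and in_regions(v["chrom"], v["pos"], regions)
    ]


# Invented example data: not the coordinates or genotypes from this pedigree.
lcsh_v11 = [("chr2", 1_000_000, 9_000_000), ("chr4", 80_000_000, 95_000_000),
            ("chr13", 20_000_000, 28_000_000)]
lcsh_v12 = [("chr4", 90_000_000, 110_000_000), ("chr13", 21_000_000, 45_000_000)]
variants = [
    {"gene": "GENE_A", "chrom": "chr1", "pos": 5_000_000,
     "v11": "hom", "v12": "hom", "unaffected": "het"},
    {"gene": "SACS", "chrom": "chr13", "pos": 23_900_000,
     "v11": "hom", "v12": "hom", "unaffected": "het"},
    {"gene": "GENE_B", "chrom": "chr4", "pos": 92_000_000,
     "v11": "hom", "v12": "het", "unaffected": "het"},
]

if __name__ == "__main__":
    shared = shared_regions(lcsh_v11, lcsh_v12)
    print([v["gene"] for v in candidate_variants(variants, shared)])
```

The pairwise comparison is quadratic in the number of regions, which is acceptable for the tens of LCSH regions typical of a single array; a sorted sweep would be preferable for larger inputs.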
WES has become an increasingly powerful and affordable tool in the diagnostic setting owing to recent developments in high-throughput sequence capture methods and next-generation sequencing approaches. Aside from identifying novel genes causative of rare disorders, it provides a time- and cost-efficient means of detecting mutations in genes already implicated in disease, especially diseases with great genetic heterogeneity [38]. Obtaining a specific molecular diagnosis facilitates tailored genetic counselling, enables carrier and prenatal testing to be offered, and supports the development of therapeutic strategies.
In conclusion, we demonstrate that high-density SNP genotyping and WES provide an affordable and relatively quick approach for diagnostic laboratories to establish the molecular diagnosis in cases of genetically heterogeneous disorders. Our findings also expand the known phenotypic variability and genetic aetiology of ARSACS.