Homozygosity Haplotype and Whole-Exome Sequencing Analysis to Identify Potentially Functional Rare Variants Involved in Multiple Sclerosis among Sardinian Families

Multiple Sclerosis (MS) is a complex multifactorial autoimmune disease, whose sex- and age-adjusted prevalence in Sardinia (Italy) is among the highest worldwide. To date, 233 loci were associated with MS and almost 20% of risk heritability is attributable to common genetic variants, but many low-frequency and rare variants remain to be discovered. Here, we aimed to contribute to the understanding of the genetic basis of MS by investigating potentially functional rare variants. To this end, we analyzed thirteen multiplex Sardinian families with Immunochip genotyping data. For five families, Whole Exome Sequencing (WES) data were also available. Firstly, we performed a non-parametric Homozygosity Haplotype analysis for identifying the Region from Common Ancestor (RCA). Then, on these potential disease-linked RCA, we searched for the presence of rare variants shared by the affected individuals by analyzing WES data. We found: (i) a variant (43181034 T > G) in the splicing region on exon 27 of CUL9; (ii) a variant (50245517 A > C) in the splicing region on exon 16 of ATP9A; (iii) a non-synonymous variant (43223539 A > C), on exon 9 of TTBK1; (iv) a non-synonymous variant (42976917 A > C) on exon 9 of PPP2R5D; and v) a variant (109859349-109859354) in 3′UTR of MYO16.


Introduction
Multiple Sclerosis (MS) is a complex neurological autoimmune disease, which mainly affects people in early adulthood; for this reason, it is considered as the most common cause of neurologic disability in young adults [1,2].
The prevalence of the disease is different across the different countries: it has a high prevalence in Europe, with a north to south gradient, and a lower prevalence in Asia and Africa [3]. In Italy, we observed a disease prevalence of 176 per 100,000 inhabitants [4], except in the Mediterranean island of Sardinia, where we found an age-and sex-adjusted prevalence of MS of 330 per 100,000 inhabitants, among the highest reported worldwide, ranging from 217 in the Olbia-Tempio district to 425 in the Ogliastra district [5], with the lowest risk areas being closer to the coast.
Although most MS cases occur sporadically, about 20% of the affected individuals are related by family, with first-degree relatives of MS patients at increased risk of disease, thus suggesting that the disease is moderately heritable, with a sibling relative recurrence risk of 6.35 in the Caucasian population [6] and of 31 in the founder population of the Sardinian province of Nuoro [7].
In line with other common, complex disorders, almost 20% of risk heritability is attributable to common genetic variants in the autosomal genome, including 233 unequivocally MS-associated loci identified over the last 15 years by GWAS (genome-wide association studies), comprising 32 loci within the Major Histocompatibility Complex (MHC) [8][9][10][11][12][13], each of which explain only a small fraction of risk [14]. A recent study by the International Multiple Sclerosis Genetics Consortium (IMSGC) [15] provides evidence that 11.34% of risk heritability is explained by low-frequency variants (Minor Allele Frequency (MAF) < 5%); of these rare variants, (MAF < 1%) alone explains 9%. Most lowfrequency variants impact genes that are not detectable by common variants identified by genome-wide association studies (GWAS), and only a small portion of them is in Linkage Disequilibrium (LD) with variants highlighted by GWAS.
Many low frequency and rare variant associations, as important sources of unexplained heritability, remain to be discovered [16]. This investigation would require large sample sizes to reach an appropriate statistical power, or alternatively, the use of multiplex families from a founder population for which both genotyping and sequencing data are available.
Our study aims at understanding the genetic contribution to MS and suggesting new potential causative variants in families, by contributing to the discovery of new exonic and potentially functional low-frequency variants. To this end, we analyzed multiplex families originating from the genetically homogeneous and isolated population of the Nuoro province of Sardinia for which Immunochip genotyping and whole exome-sequencing (WES) data are available.
We followed a two-stage approach. In the first stage, we prioritized candidate regions to be further investigated via a non-parametric Homozygosity Haplotype (HH) analysis, which uses reduced haplotypes composed by homozygous single nucleotide polymorphisms (SNPs) only and deletes all the heterozygous ones. We performed this analysis on thirteen families by exploiting the co-segregation of the disease and genetic variants between affected and unaffected subjects for a genome-wide search of shared autosomal segments. In the second stage, on the promising candidate regions identified in the HH analysis, we searched for the presence of rare variants shared by the affected individuals by analyzing WES data that were available for five families only.

Sample Description
Thirteen multiplex Sardinian pedigrees, containing from three to sixteen MS patients each, were selected for the analysis, for a total of 80 affected (63 with Immunochip genotyping data) and 655 unaffected (220 with Immunochip genotyping data) patients. Table 1 reports the description of the family data available for the HH analysis. We analyzed a total of 129.448 Immunochip QC-filtered SNPs that had assigned dbSNP refIDs. WES data were available for five families only. Specifically, three cases and one control for family 61, 10 cases for family 2360, two cases for family 45, four cases for family 4, and five cases for family 5.

Identification of RCHHs
The HH statistical analysis was performed for all the 13 families using both the genotyped patients (n = 63) and controls (n = 220), so that the algorithm worked to treat the affected and unaffected members of a family as cases and controls, respectively. As we had Immunochip data [10], obtained with an Illumina Infinium HD custom array designed for the fine mapping of 184 established autoimmune loci, and not a high-density array, we used a cutoff value of 7 cM to search the candidate RCHHs, in order to reduce the risk of false positives and to increase specificity, and we used CEUAnnotation1MDuo (http://www.hhanalysis.com/ (accessed on 15 February 2021)) as an annotation file. We selected regions with a significance level of -log 10 (p-value) ≥ 1.2, corresponding to a p-value of 0.06, to establish difference between patient and control pools. The choice of this liberal level of significance was driven by our research strategy in which HH analysis represents a step towards prioritizing candidate regions, and thus, towards reducing false negative probability, in order to enable further investigation in the second stage (WES analysis). Table 1. Description of the family data available for the HH analysis. For each family, the total numbers of affected and unaffected subjects are reported together with the availability of Immunochip genetic data.

Family
Total N. of Affected N. of Affected with Genotyping Data Total N. of Unaffected HH was run to identify Regions with a Conserved Homozygosity Haplotype (RCHHs) that were differently shared among cases and controls (see Table 2). Then, we searched for genes in each significant RCHH, and Figure 1 reports the upset plot in which the number of genes shared between families is graphically represented. The first five bar charts on the top of the figure represent the number of genes in the significant RCHHs found only in each individual family (whose identified number is specified in the left of the figure), e.g., 305 genes that are only in family 2360 but not in the other families, etc. The last three bar charts on the top of the figure report the number of genes shared between families. In particular, families 4 and 44 share one gene, i.e., AL833583, and families 3 and 2360 share DQ596042, while families 9 and 44 share the USP16 gene. Interestingly, increased USP16 levels, a deubiquitinase required for chromosomal segregation in mitosis [17], were observed in peripheral CD4+ T cells from patients with distinct autoimmune diseases and T cell-specific USP16 knockout mice showed a reduced severity of experimental autoimmune encephalitis [18]. Families that were not reported in the upset plot are those who did not share any gene between them.
The Phenolyzer (http://phenolyzer.wglab.org (accessed on 30 March 2021)) and WE-bGestalt (http://www.webgestalt.org (accessed on 30 March 2021)) tools were used to verify if there were genes involved in MS-related diseases due to their involvement in the CNS or immune system. We found that:  Table 2. List of RCHHs grouped by family. The table reports the number of subjects sharing an RCHH in both pools, its coordinate and the -log 10 (p-value) for the difference between the pools. In our study, we set a significance level of -log 10 (p-value) ≥ 1.2, corresponding to a p-value of 0.06.

Family
No

Identification of Pathogenic Variants
WES analysis of available subjects on the regions prioritized by HH analysis was performed to identify putative causative rare variants for MS. Figure 2 reports the pedigree structure of the analyzed families for which we also had WES data, and the identified variants.
From the FASTQ files, we obtained the VCF files, using bphg19 as the reference genome. The VCF files were annotated using wANNOVAR (http://wannovar.wglab.org (accessed on 3 May 2021)), and we searched for variants on the resulting csv file. Calls with synonymous variants, and benign variants were excluded, while calls with MAF reported in the 1000 genomes (https://www.internationalgenome.org/ (accessed on 10 May 2021)) dataset and gnomAD (https://gnomad.broadinstitute.org (accessed on 10 May 2021)), with values of less than 0.001 or unknown, were included. The aim was to search for rare genetic variants shared between the affected within the genes prioritized via HH analysis.
For each family and for each RCHH, the variant, its function and the number of cases and control sharing the variant are reported in Table 3. Interestingly, three variants, i.e., 43181034 T > G and 50245517 A > C, located in splicing region on exon 19 of the CUL9 gene and on exon 16 of the ATAP9A gene, respectively, and a deletion (43557751-43557832) in the splicing site of the UMODL1 gene, were classified as photogenic in VarSome (https://varsome.com/ (accessed on 12 October 2021)). The other variants were classified as of uncertain significance.
In Table S1, for each variant, the frequency in the different populations, such as those in 1000 Genome, gnomAD, and in the Sardinian population [19,20], are reported. ST1 also reports the results obtained using different prediction algorithms, with the grade of pathogenicity for each variant.

Identification of Pathogenic Variants
WES analysis of available subjects on the regions prioritized by HH analysis was performed to identify putative causative rare variants for MS. Figure 2 reports the pedigree structure of the analyzed families for which we also had WES data, and the identified variants.
From the FASTQ files, we obtained the VCF files, using bphg19 as the reference genome. The VCF files were annotated using wANNOVAR (http://wannovar.wglab.org (accessed on 3 May 2021)), and we searched for variants on the resulting csv file. Calls with synonymous variants, and benign variants were excluded, while calls with MAF reported in the 1000 genomes (https://www.internationalgenome.org/ (accessed on 10 May 2021)) dataset and gnomAD (https://gnomad.broadinstitute.org (accessed on 10 May 2021)), with values of less than 0.001 or unknown, were included. The aim was to search for rare genetic variants shared between the affected within the genes prioritized via HH analysis.
For each family and for each RCHH, the variant, its function and the number of cases and control sharing the variant are reported in Table 3. Interestingly, three variants, i.e., 43181034 T > G and 50245517 A > C, located in splicing region on exon 19 of the CUL9 gene and on exon 16 of the ATAP9A gene, respectively, and a deletion (43557751-43557832) in the splicing site of the UMODL1 gene, were classified as photogenic in VarSome (https://varsome.com/ (accessed on 12 October 2021)). The other variants were classified as of uncertain significance.
In Table S1, for each variant, the frequency in the different populations, such as those in 1000 Genome, gnomAD, and in the Sardinian population [19,20], are reported. ST1 also reports the results obtained using different prediction algorithms, with the grade of pathogenicity for each variant.

Discussion
MS is characterized by a complex multi-factorial nature that involves the interplay of a still non-identified environmental exposure and a genetic predisposition. In recent studies, it was observed that extensive grey matter lesions in the cerebral and cerebellar cortex and hippocampus [21][22][23][24] are involved in the pathology of the disease. As with other multifactorial diseases, MS has been predominantly studied assuming the "common disease-common variant" paradigm via GWASs. The importance and success of the GWASs approach to identifying the loci underlying common disease cannot be overlooked [25]. These breakthroughs, along with both statistical and technological advances, have led to the identification and confirmed association of numerous genetic loci for MS susceptibility [10][11][12]26]. In spite of these advances, only a relatively small proportion of the genetic influences in MS have been uncovered, and much is yet to be understood. It is important to point out that there are alternative hypotheses concerning the genetic architecture of common diseases, including the multiple rare variant hypothesis, which may help to elucidate the so-called missing heritability associated with common complex diseases such as MS [27].
Our study aims at contributing to the understanding of the genetic risk component of MS and the contribution of rare variants to disease risk through a combination of HH analysis-with the aim of prioritizing candidate regions-and WES to deeply explore this

Discussion
MS is characterized by a complex multi-factorial nature that involves the interplay of a still non-identified environmental exposure and a genetic predisposition. In recent studies, it was observed that extensive grey matter lesions in the cerebral and cerebellar cortex and hippocampus [21][22][23][24] are involved in the pathology of the disease. As with other multifactorial diseases, MS has been predominantly studied assuming the "common disease-common variant" paradigm via GWASs. The importance and success of the GWASs approach to identifying the loci underlying common disease cannot be overlooked [25]. These breakthroughs, along with both statistical and technological advances, have led to the identification and confirmed association of numerous genetic loci for MS susceptibility [10][11][12]26]. In spite of these advances, only a relatively small proportion of the genetic influences in MS have been uncovered, and much is yet to be understood. It is important to point out that there are alternative hypotheses concerning the genetic architecture of common diseases, including the multiple rare variant hypothesis, which may help to elucidate the so-called missing heritability associated with common complex diseases such as MS [27].
Our study aims at contributing to the understanding of the genetic risk component of MS and the contribution of rare variants to disease risk through a combination of HH analysis-with the aim of prioritizing candidate regions-and WES to deeply explore this candidate region in search of putative causative rare variants in the founder population of Nuoro province (Sardinia). The Sardinian population has a high prevalence of MS and autoimmune disease, and its genetic background is proven to be homogeneous and an outlier in Europe, having been under selective pressure due to malaria, which has been endemic there for centuries.
Exons are regions in the DNA that are particularly vulnerable in the presence of mutations in their sequence: an unexpected mutation may alter the structure and the function of the protein, leading to deleterious biological consequences that may contribute to-or cause-a specific disease. Thus, identifying the presence of an exonic genetic variant in a candidate region, previously highlighted by HH, will help us to better understand the biological processes involved in the disease.
Firstly, HH analysis performed on 13 multiplex Sardinian families made it possible to identify significant RCHHs (-log 10 (p-value) ≥ 1.2). Secondly, in these regions, WES data that were available for 25 subjects belonging to five families were analyzed.
Interesting results were found in: (i) The RCHH region on chr6:42767957-43333769, shared in 5 cases and 7 controls of family 61, where we identified the variant 43181034 T > G in the splicing region on exon 27 of the CUL9 gene. CUL9 is highly expressed in the brain, particularly in the cerebral cortex [28]. A study [29] using a human cell-derived model to characterize CUL9 in human neuronal development showed that the deletion or depletion of the protein causes the aberrant formation of neural rosettes that are related to the early stage of neurodevelopment. Furthermore, the neuronal transcription factors CUX1 and SOX3 were significantly upregulated in CUL9 knockout neuroepithelial progenitor cells. Fisher et al. [30] analyzed the potential molecular pathways of tissue injury in active cortical MS lesions, and by identifying prominent changes in gene expression, they found genes that are involved in different steps of apoptosis, DNA damage, p53 function, and DNA repair, including CUL9. In the same RCHH region, 43106964 A > C, a non-synonymous variant on exon 9, and rs780764712, in PTK7, a gene involved in the Wnt/planar cell polarity pathway, were also found. It is important to note that the PTK7 mutant with a truncated protein perinatally caused severe defects in neural tube closure [31]. In this stidy, 43223539 A > C, a non-synonymous variant on exon 9 in the TTBK1 gene, and 42976917 A > C, a non-synonymous variant on exon 9 in the PPP2R5D gene, were also found in all the available cases. TTBK1 is a brain-specific tau kinase expressed in the entorhinal cortex and hippocampal regions. TTBK1 transgenic mice showed severe axonal degeneration in the perforant path, which is essential for many forms of memory [32]. TTBK1 is highly expressed in the entorhinal cortex and the perforant path region, two specific brain regions involved in the early stage of Alzheimer's disease pathology [33], and thus, has a critical role in axonal degeneration. Collapsin response mediator protein-2 (CRMP2) is a downstream target of TTBK1 [32], whose expression induces the accumulation of phosphorylated CRMP2, and it was shown to be involved in the axonal degeneration pathology in MS [34]. PPP2R5D is a regulatory B subunit of Protein Phosphatase 2A (PP2A) and plays a crucial role in normal neuronal development and functioning. Variants of this gene were found to be associated with intellectual disability, autism, and other neurodevelopmental disorders [35]. Mutations in this gene were found in juvenile-onset parkinsonism [36]. (ii) The RCHH region on chr13:108090996-108968251, shared in 11 cases and 17 controls of family 6, where we identified the variant 109859349-109859354 TGTGTT> in 3 UTR of the MYO16 gene. This variant is also present in 1 case of family 4, in 2 cases of family 45, and in 1 case of family 5. MYO16 is mainly expressed in the central nervous system and seems to be involved in the development and functioning of the nervous system also in adulthood; therefore, alterations in this gene, e.g., SNPs, deletions, or epigenetic modifications, are associated with neurodegenerative and neuropsychiatric disorders [37][38][39]. MYO16 is thus considered as an important regulator of neural cells' functioning even if its specific role and molecular mechanisms remain to be elucidated. Interestingly, not far from MYO16, in the chr13:108090996-108968251 region highlighted by HH analysis, is located the TNFSF13B gene, encoding the cytokine and drug target B-cell activation factor (BAFF) whose overexpression is related to autoimmunity [40]. In particular, in [41], a TNFSF13B variant was found to be associated with MS and systemic lupus erythematosus (SLE) through a mechanism that led to an overexpression of BAFF, which, in turn, upregulated the humoral immunity. (iii) The RCHH region on chr20:49044993-50323395, shared in 5 cases and 7 controls of family 61, where we identified the variant 50245517 A > C in the splicing region on exon 16 of the ATP9A gene. ATP9A is a regulator of endosomal recycling and plays an inhibitory role in the release of extracellular vesicles (EV) [42], and many biological processes, such as the immune response, are modulated by proteins, DNA, miRNA, and mRNAs that could be controlled via EV-instigated intercellular communication [43].
Taken together, these results provide newer insights into the genetics of MS and a more thorough understanding of the disease biology that, in future functional studies of these highlighted specific gene variants, may provide hints towards the creation of new and more effective treatments in MS.

Sample Collection and Genotyping
MS patients were ascertained through the case register established in 1995 in the province of Nuoro, Sardinia, Italy. Cases were diagnosed according to Poser's criteria [44]. Thirteen pedigrees, containing from three to sixteen MS patients each, were selected for the analysis. Genotyping data were obtained using Immunochip results obtained from a previous study [13], where the quality control-filtered dataset included 131.497 SNPs.

HH Analysis
HH analysis, proposed by Miyazawa et al. [45], is an efficient non-parametric tool to detect regions harboring either novel or known mutations; it makes it possible to identify patient's shared chromosomal segments derived from a common ancestor, which are characterized by the distinct identity of their haplotype.
The analysis is based on the concept of identity-by-descent (IBD), in which a DNA segment is defined as IBD in two or more individuals if it is a direct copy of the same ancestral allele; thus, affected subjects who inherit the mutation from a common ancestor share IBD-the genetic portion around the mutation in which there should be no discordant homozygous calls for both dominant and recessive genes. HH are a reduced haplotype obtained by removing all the heterozygous SNPs from the sample dataset and leaving only the homozygous ones. The method works by comparing homozygous segments; in this way, there is no need to reconstruct haplotypes. This results in a reduced computational timing and a simplification of the analysis process, even for large numbers of SNPs. Specifically, the method makes it possible to compare the number of subjects in the patient pool and in the control pool who share an RCHH region. RCHH is a region with a given genetic length, chosen on the basis of the study, containing comparable SNPs (compSNP) in homozygosity in two subjects. The algorithm performs a pairwise comparison of individuals based on the presence of compSNP in the pair. A mismatched comparable SNP, also indicated as a discordant homozygous SNP (dhSNP), has discordant homozygous SNP genotypes in two subjects (e.g., AA vs. BB). The borders of an RCHH are defined by either dhSNPs or by the ends of the chromosome. RCHH regions shared by multiple subjects are used to predict the presence of a Region from a Common Ancestor (RCA) or an IBD [46,47]. Given m and n, the number of generations of two affected subjects descended from a common ancestor RCA is calculated as In consanguineous families or in populations that are geographically isolated, such as our Sardinia multiplex families, patients suffering from a disease (in our study, MS) share a common ancestor, and the RCA represent candidate regions in which disease genes can be looked for. RCA is identified through RCHHs via the comparison of RCHHs shared between the patient and the control pool [45]. As explained in [46], the algorithm works firstly by dividing a genetic autosomal region into smaller ones and selecting, for each of them, the representative RCHHs shared by the largest number of affected subjects within the patient pool. The numbers of individuals sharing each representative RCHH pool are then counted in both the patient and the control groups and compared to each other. The p-value of the null hypothesis of no difference between these two proportions is calculated according to the standard normal distribution.
In the HH approach, genotyping error may lead to be causative gene being excluded mistakenly. In fact, an RCHH will be truncated when a non-dhSNP changes into a dhSNP due to a genotyping error; in this way, the resulting genomic segment will be smaller than the chosen cut-off length, and thus, the RCHH will be not identified. The algorithm, by means of Monte Carlo Chain Simulation (MCMC), makes it possible to calculate the probability of obtaining dhSNPs due to genotyping errors. If the probability of creating dhSNPs in the region is very low (p-value < 0.001), the HH results are reliable.
A critical step in the analysis is represented by the choice of cut-off value since there is not an optimal cut-off value. They need to be chosen on the basis of the population under study, taking into account that the decreases in the average genetic length of the RCAs over generations; therefore, when we analysed distantly related subjects who shared smaller RCA, it was preferable to use a small cut-off (e.g., 3 cM). Another aspect that guides our choice of the cut-off is the array used for genotyping (e.g., low-density vs. high-density). As we only had Immunochip data [10], which were obtained with the Illumina Infinium HD custom array that was designed for the fine mapping of 184 established autoimmune loci, we chose a conservative cutoff value of 7 cm to search for candidate RCHHs that represent our prioritized regions. This step was taken to reduce the risk of false positives and increase the specificity of the results.
In summary, the advantages of HH approach numerous are: (a) there is no need to reconstruct haplotypes since the homozygote haplotype for each chromosome is uniquely determined; (b) the chromosomal segments in which all polymorphic markers are homozygous are considered to be autozygous segments [48]; (c) if the coefficient of consanguinity for a patient is large as a result of belonging to an inbred family, and the disease is rare, then the probability that the disease-causing gene is located in the shared segment is very high; (d) since HH analysis looks for ancestral segments, both dominant and recessive genes can be detected; (e) the HH approach is robust to genotyping errors.
In our analysis, the dataset was prepared using R software [49], and the HH analysis was run using the HH program (http://www.hhanalysis.com (accessed on 15 February 2021)).

Screening of Known Causative Genes
We then scrutinized the segments highlighted in the HH analysis with the help of publicly available databases such as Phenolyzer (http://phenolyzer.wglab.org (accessed on 30 March 2021)) and WEbGestalt (http://www.webgestalt.org (accessed on 30 March 2021)), in order to search for genes located, within the specified region, whose function could be plausibly related to MS.

Whole-Exome Sequencing Data Generation
All samples were sequenced at the Center for Genome Technology within the University of Miami John P. Hussman Institute for Human Genomics.
Library preparation was conducted using the SureSelectXT Human All Exon V4 + UTR (Agilent Technologies Inc., Santa Clara, CA, USA). This protocol targets 99% of coding regions in addition to 5 and 3 -untranslated region sequences. Pre-enrichment libraries were constructed using the SureSelect Low Input reagent kit, and exome enrichment of the DNA library was performed via a hybridization reaction with biotinylated baits from the SureSelect Human All Exon V4 + UTR Enrichment Kit. Sequencing of the prepared DNA libraries was undertaken using the Illumina HiSeq2000 instrument (Illumina Inc., San Diego, CA, USA) with an average coverage of 80× with 2 × 100 bps paired-end reads. Quality controls were applied at the lane and fastq levels. Specifically, the cutoff used for a successful lane was Pass Filter > 90%, with over 250 M reads for the highoutput mode. The fraction of reads in each lane assigned to each sample (no set value) and the fraction of bases with a quality score > Q30 for read 1 and read 2 (above 80% expected for each) were also checked. Raw sequencing reads were demultiplexed using Illumina bcl2fastq. In addition, the FASTQC tool kit (www.bioinformatics.babraham.ac. uk/projects/fastqc/ (accessed on 8 November 2020)) was used to review the base quality distribution, which provided representations of the four nucleotides of particular k-mer sequences (adaptor contamination). We used the Genome Analysis Software Kit (GATK) (version 4.1) best-practice pipeline to analyze our WES data. Reads were aligned with the human reference genome (hg19), using the Maximal Exact Matches algorithm in the Burrows-Wheeler Aligner (BWA) [50]. PCR duplicates were removed using the Picard tool (picard.sourceforge.net/). The GATK base quality score recalibrator was applied to correct the sequencing artifacts. Variants were called using the GATK haplotypeCaller algorithm, visually inspected using the Integrative Genomics Viewer (IGV, Broad Institute), and further annotated with ANNOVAR. Variants were categorized as follows: (1) nonsynonymous; (2) synonymous; (3) frameshift deletion or insertion; (4) splicing; (5) stop gain or loss; or (6) functional intronic or promoter variants.

Conclusions
Our study, which was performed on multiplex Sardinian families and combined the HH approach with WES data analysis, first enabled the identification of diseaselinked regions, and then the identification of specific rare variants located in these regions. Although our study was limited to the use of Immunochip data that, despite making it possible to scrutinize the entire genome, only allowed this within specific autoimmune candidate regions, the obtained results represent an important step in the comprehension of the genetics of MS.