Investigating the Endo-Lysosomal System in Major Neurocognitive Disorders Due to Alzheimer’s Disease, Frontotemporal Lobar Degeneration and Lewy Body Disease: Evidence for SORL1 as a Cross-Disease Gene

Dysfunctions in the endo-lysosomal system have been hypothesized to underlie neurodegeneration in major neurocognitive disorders due to Alzheimer’s disease (AD), Frontotemporal Lobar Degeneration (FTLD), and Lewy body disease (DLB). The aim of this study is to investigate whether these diseases share genetic variability in the endo-lysosomal pathway. In AD, DLB, and FTLD patients and in controls (948 subjects), we performed a targeted sequencing of the top 50 genes belonging to the endo-lysosomal pathway. Genetic analyses revealed (i) four previously reported disease-associated variants in the SORL1 (p.N1246K, p.N371T, p.D2065V) and DNAJC6 genes (p.M133L) in AD, FTLD, and DLB, extending the previous knowledge attesting SORL1 and DNAJC6 as AD- and PD-related genes, respectively; (ii) three predicted null variants in AD patients in the SORL1 (p.R985X in early onset familial AD, p.R1207X) and PPT1 (p.R48X in early onset familial AD) genes, where loss of function is a known disease mechanism. A single variant and gene burden analysis revealed some nominally significant results of potential interest for SORL1 and DNAJC6 genes. Our data highlight that genes controlling key endo-lysosomal processes (i.e., protein sorting/transport, clathrin-coated vesicle uncoating, lysosomal enzymatic activity regulation) might be involved in AD, FTLD and DLB pathogenesis, thus suggesting an etiological link behind these diseases.

The endo-lysosomal pathway is essential in maintaining protein homeostasis in the cell. Growing evidence suggests that endosomal and lysosomal dysfunctions, or dysregulation in protein trafficking, play an important role in neurodegeneration, leading to neurocognitive disorders [10]. Numerous human genetics studies support a critical role of endo-lysosomal dysfunction in AD and FTLD. In AD, mutations in presenilin 1 (the most common genetic cause of AD) may, besides altering amyloid processing, result in defective lysosomal acidification and proteolytic activity [11]. In FTLD, mutations in multiple genes related to lysosome and autophagy function have been described, including in GRN, C9orf72, SQSTM1/p62, UBQLN2, DCTN1, TBK1, OPTN, and VCP [12]. Specifically, a progranulin deficiency due to the presence of GRN pathogenic null mutations, one of the most common genetic causes of FTLD, leads to an upregulation of lysosomal genes as well as profound lysosomal defects [13]. Moreover, GRN null mutations cause a strong alteration of the release and composition of exosomes [14]. Of note, homozygous GRN null mutations cause adult-onset neuronal ceroid lipofuscinosis (NCL), a lysosomal storage disorder [15]. A pathogenic expansion in the C9orf72 gene, a common genetic cause of FTLD, also affects lysosomal function. In C9orf72 null mice, by mimicking the reduced C9orf72 expression observed in expansion carriers, a lysosomal enlargement and accumulation was reported [16]. Recently, it has been demonstrated that NDST3, a potent regulator of lysosomal functions, is downregulated in tissues and cells from FTD patients with C9orf72 haploinsufficiency [17]. The link between AD and FTLD and lysosomal genes was also suggested by genome-wide association studies (GWAS). Regarding AD, a pathway enrichment analysis of three large GWAS provided evidence that genetic variation within the endo-lysosomal system is associated with late-onset AD [18]. In FTLD, a large international GWAS revealed a genetic locus linked to the disease risk encompassing the RAB38 and cathepsin C (CTSC) genes, both involved in the lysosomal pathway [19]. In addition, recent large scale GWAS have provided insights into the genetic risk factors associated with PD, showing that the main contributors to PD etiology are, among others, the molecular processes underlying endo-lysosomal dysfunction [20].
Recently, we provided evidence that an alteration in EV release is common in AD, DLB, and FTLD. Specifically, we found a significant reduction in the plasma concentration of EVs and larger sized EVs in all patient groups: these EV parameters together can distinguish patients from controls with a strong sensitivity and specificity [21]. This study further supports the growing body of evidence that endo-lysosomal dysfunctions may be a converging mechanism in neurodegenerative diseases. The aim of this study is to investigate whether AD, DLB and FTLD share genetic variability in the genes involved in the endo-lysosomal pathway. To reach this goal, we performed, in a large group of AD, DLB and FTLD patients, a targeted deep sequencing of the top 50 genes belonging to the endo-lysosomal pathway. The top 50 genes were selected based on a high intolerance to variation and high expression in two or more brain regions (Tables S1 and S2). The large majority of the selected genes were described to have a potential role in AD, DLB/PD, FTLD pathogenesis, as reported in human genetics and/or human/mouse molecular studies (Table S3).

Identification of Previously Reported Disease-Associated Variants and Predicted Null Variants
Targeted genetic screening for the presence of variants in the coding regions of 50 candidate endo-lysosomal genes was performed on a total of n = 697 patients (n = 282 AD, n = 114 DLB, n = 301 FTLD) and n = 251 controls (Table 1). Thirty-one out of the fifty candidate endo-lysosomal genes were described to have a potential role in AD, DLB/PD, and FTLD pathogenesis, as reported in human genetics and human/mouse molecular studies. Specifically, out of the 31 genes, 12 genes were reported in the literature to present risk alleles or mutations associated with AD, DLB/PD and FTLD (Table S3). In the present study, we detected four previously reported disease-associated variants and three stop-gain heterozygous variants (Tables 2 and 3). Specifically, we found three previously reported variants in the sortilin-related receptor 1 (SORL1) gene: the SORL1 p.N1246K heterozygous variant, previously described as an AD risk factor [22,23], was found in a familial AD patient; the SORL1 p.N371T and p.D2065V heterozygous variants, previously described in two AD patients of North European ancestry [24], were found in an early onset AD and in an FTLD patient and in n = 4 AD, n = 2 DLB and n = 6 FTLD patients, respectively (Table 4). Moreover, in an FTLD and a DLB patient we found a variant in the DnaJ heat shock protein family (Hsp40) member C6/auxilin gene (DNAJC6 p.M133L), previously described in a sporadic early onset PD patient, even if of uncertain significance [25]. The effect of the variants on the protein stability in terms of the ∆∆G was assessed by the Mu-Pro and I-mutant 2.0 tools, and was available for three out of four of the known variants: of note, both computational tools converged in defining these variants as potentially deleterious based on their effect on protein stability. In addition to these variants, we found three stop-gain heterozygous variants in SORL1 and palmitoyl-protein thioesterase 1 (PPT1) genes that are predicted to cause a loss of function (LOF) due to haploinsufficiency (Table 3); for both genes, LOF is a known mechanism of the disease. Specifically, loss of function mutations of SORL1 have been described in AD [26][27][28][29]. The LOF SORL1 R985X carrier was an early onset (46 years) 4 of 15 familial AD patient and the LOF SORL1 R1207X carrier was a 68-year-old AD patient with an unknown family history. Similarly, we found a stop-gain variant in PPT1 (p.R48X) in an early onset (61 years) familial AD patient; PPT1 loss of function mutations have been previously demonstrated to cause an adult form of neuronal ceroid lipofuscinosis (NCL) [30,31].  AA, amino acid; LOF, loss of function; bvFTD, behavioral variant FTD; PPA, primary progressive aphasia; F, positive family history; AS, apparently sporadic; U, unknown; * For a complete list of gene functions reported by Gene Ontology see Table S2.
Considering all the carriers of these variants, the majority of patients (70%) had a disease onset of ≤65 years and/or a positive family history (Table 4). All the previously described disease-associated variants and LOF variants detected in our patients were absent in the controls, and three of the previously described variants were found to be cross-disease variants. Specifically, the SORL1 p.D2065V variant was detected in all three diseases (AD, DLB, FTLD) and was present both in the Italian and Belgian patients; the SORL1 p.N371T variant was found in AD and FTLD; the very rare DNAJC6 p.M133L variant was found in AD and DLB.

Association Analyses of the Endo-Lysosomal Pathway Genes Involved in Neurodegenerative Diseases and Other Neurological Disorders
Considering all the rare variants detected in the selected 50 genes, including the previously reported variants and the LOF variants, we performed a single variant association analysis in the three diagnostic groups separately (AD, DLB, FTLD vs. CTRL) as well as in all the patients as a whole group (AD + DLB + FTLD vs. CTRL). We found five variants associated with the investigated diseases ( Table 5). All these variants were absent in the controls. The effect of the variants on protein stability in terms of the ∆∆G was available for four variants and was assessed by the Mu-Pro or I-mutant 2.0 tool. Both computational tools converged in defining all variants as potentially deleterious based on their effect on protein stability. None of these enrichments were significant after multiple test correction, but we observed some nominally significant results of potential interest, considering the known role of SORL1 in AD [22][23][24]32]. Specifically, the previously AD-associated SORL1 p.D2065V variant was nominally associated with disease, when considering all the patients (p = 0.031) as well as in the DLB (p = 0.048) and FTLD (p = 0.021) groups, separately. In order to explore whether rare variants were enriched in specific genes, we performed a gene burden analysis and found a nominally significant burden of variants in four genes: SORL1, DNAJC6, NEU1 and AP2A2 (Table 6). Once again, even if none of these enrichments were significant after multiple test correction, we observed some nominally significant results of potential interest, both in SORL1 which was consistently reported to be associated with AD, and in the DNAJC6 gene, consistently reported to be associated with PD [25,35,36]. The SORL1 gene showed a nominally significant burden of variants in the all patients group (p value skat = 0.038) and in FTLD patients (p.burden = 0.025 and p.skato = 0.016), and for DNAJC6 there was a variant burden in DLB patients (p.burden = 0.048 and p.skato = 0.048). The localization of variants within the protein sequence and functional domains is reported in Figure 1. In the SORL1 encoded protein (Sortilin-related receptor 1), all patient-specific variants (including the two stop-gain variants) were located in the VPS10, Low-density lipoprotein (LDL) receptor class a (LDL_recept_a), LDL receptor class b (LDL_recept_b) and Fibronectin type III (fn3) domains. Specifically, (i) the known p.N371T, p.N1246K and p.D2065V variants were located in the VPS10, LDL_recept_a and fn3 domains, respectively; (ii) the LOF p.R985X and p.R1207X variants were located in the LDL_recept_b and LDL_recept_a domains, respectively. In the DNAJC6 encoded protein (auxilin) the p.M133L known variant was located in the Tensin-type phosphatase (PTEN) domain, while two additional variants detected in patients were not located in a functional domain.

Multiple Variant Carriers in the Endo-Lysosomal Pathway
Considering previously reported disease associated variants (Table 2) and predicted LOF variants (Table 3), we described four patients carrying an additional rare/very rare variant which was of unknown significance and absent in the controls (Table 7). Specifically (i) the known SORL1 p.D2065V variant was found along with a rare variant in GGA3 in a familial FTLD patient with a disease onset at 68 years; (ii) the LOF SORL1 p.R1207X was found along with a missense variant also in SORL1 (p.D140N) in an AD patient with a disease onset at 68 years and an unknown family history; (iii) the known DNAJC6 p.M133L variant was found along with a rare variant in AGRN (p.A897V) in an AD patient with a disease onset at 65 years and an unknown family history; (iv) the LOF PPT1 p.R48X variant was found along with a rare variant of GNPTG (p.R66Q) in an early onset familial AD patient. In addition, as shown in Table 7, considering the patients with a disease onset of ≤65 years and/or a positive family history, we described 14 more patients carrying two or three rare/very/ultrarare variants which were of unknown significance, and absent in controls. Among these, for example, were three patients carrying the ultrarare ABCA2 p.H1449P variant (found to be nominally associated with disease but this did not survive after multiple test correction) and one early onset (49 years) apparently sporadic FTLD patient who was carrying two compound heterozygous variants in the vacuolar protein sorting-associated protein 52 homolog gene (VPS52 p.Y508C, p.R578W). (p.burden = 0.048 and p.skato = 0.048). The localization of variants within the protein sequence and functional domains is reported in Figure 1. In the SORL1 encoded protein (Sortilin-related receptor 1), all patient-specific variants (including the two stop-gain variants) were located in the VPS10, Low-density lipoprotein (LDL) receptor class a (LDL_re-cept_a), LDL receptor class b (LDL_recept_b) and Fibronectin type III (fn3) domains. Specifically, (i) the known p.N371T, p.N1246K and p.D2065V variants were located in the VPS10, LDL_recept_a and fn3 domains, respectively; (ii) the LOF p.R985X and p.R1207X variants were located in the LDL_recept_b and LDL_recept_a domains, respectively. In the DNAJC6 encoded protein (auxilin) the p.M133L known variant was located in the Tensin-type phosphatase (PTEN) domain, while two additional variants detected in patients were not located in a functional domain.

Discussion
Neurodegenerative diseases are characterized by abnormal intracellular protein inclusions or extracellular protein aggregates [1]. There is strong interest in understanding the common molecular mechanisms that contribute to neurodegenerative disorders, and thus, in exploring the etiological link behind these brain diseases [39].
Alterations in the endo-lysosomal system, leading to the failure of proper protein trafficking and degradation, have been hypothesized to underlie neuronal dysfunction in these diseases. Emerging data argue for an interdependence between the production of exosomes (a specific subtype of EVs of endosomal origin) and endosomal pathway integrity in the brain [40]. We recently described that an alteration in the release of EVs is common across AD, FTLD, and DLB and that plasma EV parameters (EV concentration/size) can distinguish patients from controls with strong sensitivity and specificity [21]. Evidence from monogenic diseases and experimental models suggest that autophagy and the endo-lysosomal system may be mechanistically involved in the neurodegenerative processes leading to AD and FTLD [41]. Genetic variation within the endo-lysosomal system is associated with a late-onset AD risk, as demonstrated by the pathway enrichment analysis of three large GWAS: of note, this aggregate genetic association was unique for the autophagic and endo-lysosomal system, and in the same study, an association signal was also observed in PD [18].
Herein, we investigated genetic variability in the genes involved in the endo-lysosomal pathway in major neurocognitive disorders. In a large group consisting of AD, DLB, FTLD and CTRL subjects (948 in total), we performed targeted deep sequencing of the top 50 genes belonging to the endo-lysosomal pathway, with prioritization based on a high intolerance to variation and high brain expression (Tables S1-S3). Specifically, since we were looking for cross-disease genetic variants, we selected genes highly expressed in at least two brain regions. Genetic analyses revealed, in our patients' dataset, four previously described variants in SORL1 and DNAJC6 genes. The SORL1 p.N1246K variant, previously described as an AD risk factor [22,23], was found in a familial AD patient; the SORL1 p.N371T and the SORL1 p.D2065V variants, previously described in two unrelated AD patients of North European ancestry [24], were found in our dataset in AD, DLB and FTLD patients. SORL1 p.N371T was found both in early onset AD and FTLD, while SORL1 p.D2065V was found in several patients from all three diagnostic groups (four AD, two DLB and six FTLD patients). The DNAJC6 p.M133L variant was found in an AD patient and a DLB patient. This variant was previously described in a sporadic early onset PD patient [25], and even if of uncertain significance, our data also suggest a role of this variant in AD and DLB, an alpha-synuclein associated disease like PD. In addition to known variants, predicted null variants in SORL1 and PPT1 were found in familial early onset AD cases (SORL1 p.R985X, and PPT1 p.R48X) and in an AD patient with an unknown family history (SORL1 p.R1207X). Loss of function variants of SORL1 have already been described in AD [26][27][28][29] and were proposed to cause AD by inducing defects in the endolysosome-autophagy network [42]. PPT1 loss of function variants, in the homozygous or compound heterozygous state, were demonstrated to cause neuronal ceroid lipofuscinosis, an inherited, progressive neurodegenerative disease [30,43].
Interestingly, one of the most common worldwide progranulin loss of function mutations is associated with FTLD in the heterozygous state and with NCL in the homozygous state [15,44]. Similarly, we cannot exclude strictly different clinico-pathological phenotypes (AD versus NCL) also determined by the PPT1 mutation dosage.
Protein palmitoylation is an important process to regulate the physiological function of the brain. A number of studies have reported that defects in the palmitoylation step or in the enzymes for palmitoylation/depalmitoylation are associated with several neurological disorders including AD [45]. Since APP palmitoylation seems to enhance the amyloidogenic pathway, the loss of a depalmitoylating enzyme such as PPT1 might result in an increased amyloid production; this hypothesis is line with the observed clinical phenotype of the carrier, an early onset familial AD patient. Since LOF is a known mechanism of disease for both genes, the evidence that these variants are pathogenic is strong [46].
Single variant association tests further supported a role of the SORL1 p.D2065V variant as a genetic determinant/risk factor for AD, DLB and FTLD. The gene burden analysis showed a nominally significant burden of potential interest in SORL1 in the all patients group and in FTLD patients, and in DNAJC6 in the DLB group. Regarding SORL1, the majority of the variants found in patients were located in the VPS10 protein domain, the Low-density LDL_recept_a/b domains and the fibronectin type III domain. Of note, a recent meta-analysis of burden tests at the protein domain level of SORL1 missense variants showed a significant association of the VPS10, LDL_recept_a and fibronectin type III domains with AD [47]. The LDL_recept_a domain was demonstrated to have a critical role for the function of SORL1 in AD, as it is involved in amyloid precursor protein (APP) binding and APP retrograde endosome transport to the TGN [48]. The VPS10 domain is involved in Aβ peptides binding and targeting to the lysosomes and variants in this domain have been described which impair lysosomal sorting of Aβ and cause familial AD [27], and the fibronectin type III domain interacts with APP [49]. More importantly, we also provided evidence of an involvement of this gene in FTLD and DLB as we described cross-disease variants, specifically, (i) a previously known AD variant in FTLD and AD patients, and (ii) a previously known AD variant in FTLD, DLB and AD patients, that was found to be a nominally significant associated variant. Regarding the DNAJC6 encoded protein auxilin, only the p.M133L variant was located in a functional domain, and specifically, in the PTEN domain, which is important for the recruitment of auxilin onto clathrin-coated vesicles. Auxilin has a well-established role in clathrin uncoating [50]. Since endocytosis and clathrin-uncoating defects at synapses were demonstrated in auxilin knockout mice, a specialized role for this protein in the clathrin-dependent recycling of synaptic vesicles at synapses was suggested [51]. Of note, a splicing variant affecting the PTEN domain was associated with juvenile parkinsonism [52]. The present study also suggests the possible involvement of this gene in DLB and AD. Interestingly, some of the patients carrying known variants and predicted LOF variants also carried additional rare/very rare variants of still unknown significance (but not present in controls) in endo-lysosomal genes. In addition, considering the patients with early onset disease and a positive family history, we described 14 more cases carrying two or three rare/very rare/ultrarare variants of still unknown significance (but not present in controls). Among these, (i) a Belgian patient with a very early onset FTLD (49 years) carrying two potentially damaging compound heterozygous variants in VPS52, a subunit of the Golgi-associated retrograde protein complex (interacting with the PD-associated LRRK2) which is involved in retrograde transport of early and late endosomes to the Golgi [53]; (ii) three patients carrying the ultrarare ABCA2 p.H1449P variant, which was found to be nominally associated with disease, but this did not survive after multiple test correction.
There are limitations in the present study: (i) This being a pilot study, further validation in larger groups is needed. Such large studies will, on one side give a definitive answer on the role played by SORL1, DNAJC6, and PPT1 in neurocognitive disorders and, on the other side, unveil the potential of the AP2A2, ABCA2, NEU1, TOM1, and AGRN genes in contributing to the disease. (ii) Disease segregation studies on families for variants with the strongest evidence of pathogenicity are needed. (iii) We explored only a portion of endo-lysosomal genes and thus a more comprehensive genetic screening (including all genes belonging to this pathway) as well functional studies on identified variants could be of interest.
Based on our data and the literature data on the role played by the endo-lysosomal system in neurocognitive disorders, epigenetic studies evaluating the complex interplay of genetic and environmental factors are warranted to better explore the etiological link behind these diseases. The final goal is to develop new strategies for the development of innovative therapeutic approaches targeting the endo-lysosomal pathway and taking advantage of the current knowledge in this field [54][55][56].
Our data highlight that genes controlling key endo-lysosomal processes such as intracellular protein sorting/transport, the uncoating of clathrin-coated vesicles and the regulation of enzymatic activity in lysosomes, might be involved in AD, DLB and FTLD pathogenesis. Altogether our data further confirm the key role of the endo-lysosomal pathway in these diseases and suggest the existence of cross-diseases mechanisms involved in major neurocognitive disorders. Our data strongly support a critical role for SORL1 in AD and related diseases and highlight SORL1 as a potential therapeutic target for drug development.

Participants
This retrospective study was carried out on DNA from a total of n = 697 patients (n = 282 AD, n = 114 DLB, n = 301 FTLD) and n = 251 subjects with normal cognitive function (CTRL) ( Table 1). Clinical diagnosis for probable AD, DLB and FTLD was made according to international guidelines [57][58][59][60][61][62]. The family history was determined by a family history questionnaire and FTLD pedigrees were classified as previously described [63][64][65]. DNA samples were available from the biological banks of the IRCCS Fatebenefratelli Brescia and IRCCS Besta Milan (Italian cohort), from the Neurodegenerative Brain Diseases Human Biobank of the VIB Center for Molecular Neurology, Antwerp, Belgium (Belgian patient cohort [66,67]), and from Project MinE (Belgian control cohort, [68]). Written informed consent was obtained from all subjects. The study protocol was approved by the local ethics committee (Prot. N. 111/2017).

Gene Selection
The candidate genes were selected employing web resources: KEGG pathway, Gene Ontology (GO), Gene Set Enrichment Analysis (GSEA), and Reactome for selecting genes belonging to the endo-lysosomal pathway. The Genotype-Tissue Expression (GTEx) dataset for gene selection was based on brain expression and the Residual Variation Intolerance Score (RVIS) score was used as a ranking method to prioritize genes with a high intolerance to variation (more negative values express an increasing intolerance to mutations). A list of 314 genes was obtained including genes belonging to pathways connected to lysosomes, endosomes and endo-lysosomes. Genes were then filtered according to brain tissue expression. For each gene, in each region we calculated if it fell above the 3rd quartile of expression for that specific tissue and we filtered out any genes that were not highly expressed in at least two brain regions. Applying this strategy, we selected 93 genes. Fifty genes were further selected from this list, with a priority given to genes with the lowest RVIS score (Tables S1 and S2).

Genetic Analyses
The entire coding regions of the 50 candidate genes were analyzed by amplicon-based target enrichment and Next-Generation Sequencing (

Variant Annotation, Filtering and Bioinformatics
Sequence reads from amplicon-based target enrichment were processed for qualitycontrol purposes with the FastQC tool before alignment to the hg19 human reference sequence using Spliced Transcripts Alignment to a Reference (STAR), (The National Human Genome Research Institute, Bethesda, MD, USA) software. Duplicated reads were removed with Picard tools. Local realignment, recalibration, and variant calling were performed with the Genome Analysis Tool Kit (GATK) [70]. Variants with a QUAL < 20 and QualByDepth < 3 were excluded from the analysis. WES reads were aligned to the reference genome GRCh37 using a Burrows-Wheler Aligner (BWA) implemented using in house Genomecomb software [71] and variants were called using the GATK Haplotype Caller. WGS reads were aligned to the GRCh37 genome using Illumina's (Illumina, San Diego, CA, USA) standard iSAAC aligner and variants were called using the iSAAC variant caller. Variant mapping to the 50 endo-lysosomal pathway genes were extracted from the WES and WGS datasets according to the coordinates of the BED files of the endo-lysosomal gene panel. WGS data were processed for quality control as described before [69]. Variants with a coverage < 20, genotype quality (GQ) < 99 and an allelic ratio >3 were excluded from the WES data. WES and WGS data were annotated using in house Genomecomb software [71]. Gene coordinates and transcripts were annotated using the RefSeq genes. Detected variants were filtered, including all "non synonymous" SNVs and premature stop codons (stop-gain, essential splice, site, frameshift indels), all variants with a Combined Annotation Depen-dent Depletion (CADD) score > 20, all variants annotated as damaging (D) or potentially damaging (P) according to Polymorphism Phenotyping v2 (PolyPhen2_HDIV) and with a frequency in the non-Finnish European exomes cohort reported in the Genome Aggregation Database v2.1.1. (gnomAD_NFE) < 0.01. Localization of the variants within the protein sequence was performed with Elaspic [72] and Interpro [73]. The prediction of the effect on protein stability of single amino acid substitutions was performed using two different computational tools, I-mutant 2.0 (https://folding.biofold.org/i-mutant/i-mutant2.0.html, accessed on 10 August 2021 [74]) and Mu-Pro (http://mupro.proteomics.ics.uci.edu, accessed on 10 August 2021 [75]). The change in the Gibbs free energy (∆∆G) between wild-type and mutant protein was calculated. For both computational tools, a ∆∆G < 0 indicates a decreased protein stability upon substitution.

Statistical Analyses
For testing associations between rare variants and phenotypes, burden tests focusing on the cumulative effects of rare variants in genetic regions was adopted [76]. In addition, to overcome the potential loss of power when the assumption of causal and same direction effects is violated, a test which builds upon the kernel machine regression framework (sequence kernel association test-SKAT) was applied. Finally, the optimal unified test (SKAT-O), obtained as an optimal linear combination of the burden test and SKAT, was used to maximize the power [77]. For single variant tests, the efficient resampling method (ER) for a score statistic was chosen and the minimum achievable p-value (MAP) was provided; while, for multiple variant tests, a hybrid method, based on the total minor allele count (MAC), the number of individuals with minor alleles (m) and the degree of case-control imbalance, was adopted [77]. In addition, multiple testing false discovery rate and Bonferroni corrections were applied.  Informed Consent Statement: Written informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The raw data supporting the conclusions of this article will be made available by the corresponding author upon request.