Multiple Genetic Rare Variants in Autism Spectrum Disorders: A Single-Center Targeted NGS Study

Abstract: Many studies based on chromosomal microarray and next-generation sequencing (NGS) have identified hundreds of genes associated with autism spectrum disorder (ASD) risk, demonstrating that several complex genetic factors contribute to ASD risk. We performed targeted NGS with a panel of 120 selected genes in a clinical population of 40 children with well-characterized ASD. The variants identified were annotated and filtered, focusing on rare variants with a minor allele frequency <1% in GnomAD. We found 147 variants in 39 of the 40 patients. It was possible to perform family segregation analysis in 28 of the 40 patients. We found 4 de novo and 101 inherited variants. Among the inherited variants, paternal and maternal transmission contributed equally. We identified 9 genes that are more frequently mutated than the others, and upon comparing the mutational frequency of these 9 genes in our cohort with that in the GnomAD population, we found significantly increased frequencies of rare variants in our study population. This study supports the hypothesis that ASD is the result of a combination of rare deleterious variants (low contribution) and many low-risk alleles (genetic background), highlighting the importance of MET and SLIT3 and the potentially stronger involvement of FAT1 and VPS13B in ASD. Taken together, our findings reinforce the importance of using gene panels to understand the contribution of the different genes already associated with ASD in the pathogenesis of the disease.


Introduction
Autism spectrum disorder (ASD) is defined as a group of clinically heterogeneous disorders characterized by deficits in social behaviors and communication, restricted interests, and repetitive behaviors, often accompanied by other impairments, such as intelligence and language deficits [1]. ASD typically manifests in the first years of life and persists throughout life. The overall prevalence is estimated to be 16.8 per 1000 (one in 59) children aged 8 years and varies by sex: males are four times more likely than females to be identified with ASD [2]. The heritability of ASD is supported by studies reporting ASD recurrence in 11-19% of infants with at least 1 older sibling with ASD, while the general prevalence in the US population is between 1% and 2% [3,4]. Moreover,

Materials and Methods
We recruited 40 patients with Autism Spectrum Disorder (36 males, 4 females), coming from all over the Italian territory, who were referred to the Neurodevelopment Disorders Unit of C. Besta Institute between January 2017 and July 2018.
The diagnoses of Autism Spectrum Disorder followed the diagnostic criteria of DSM-5 and were confirmed by clinical examination, consisting of an accurate anamnesis, Autism Diagnostic Interview-Revised, ADI-R [16], and the Autism Diagnostic Observation Schedule-II edition, ADOS2 [17].
The cognitive functioning was assessed using the Wechsler Intelligence Scales [18], according to the age of the patients and the Griffiths Mental Developmental Scale for children with chronological or mental age under 4 years [19,20], considering the correlation between Wechsler Preschool Scale Full IQ and General quotient obtained by Griffiths Scales [21].
Developmental delay or intellectual disability was defined by a general quotient (GQ) or an IQ lower than 70. All the patients underwent an accurate clinical evaluation.
All the patients were evaluated by a pediatric neurologist, a clinical geneticist, and a child neuropsychologist and underwent the following instrumental screening:
- Brain magnetic resonance imaging (MRI), 1.5 Tesla, without contrast enhancement (with sagittal, transverse, and coronal sequences, as well as FLAIR sequences of the whole brain and cerebellum);
- Electroencephalogram (EEG);
- A plasma amino acid assay using liquid chromatography-tandem mass spectrometry (ESI/MS/MS) and urinary organic acid assay using gas chromatography-mass spectrometry;
- Standard karyotyping on peripheral blood by means of a lymphocyte culture and QFQ banding, with a resolution of 550 bands (number of metaphases analyzed: 16);
- Molecular analysis for fragile X syndrome, using Southern blotting;
- Array-comparative genomic hybridization.
We excluded patients with (1) preterm birth, known pregnancy complications or a history of perinatal injuries; (2) major neuroradiologic alterations (i.e., cortical dysplasia); (3) epileptic syndromes (e.g., West syndrome); (4) known infectious or metabolic diseases; (5) major facial and somatic peculiar characteristics; (6) major limb or visceral malformations; (7) the presence of undergrowth or overgrowth syndromes; and (8) genetic diseases (high-resolution karyotype and DNA analysis of Fragile-X syndrome) or positive genetic testing, including the presence of any copy number variants in CGH testing.
All the examinations were performed with written informed consent from the subjects' parents. The study was approved by the Ethics Committee of Fondazione IRCCS Istituto Neurologico C. Besta.
The genomic DNA was extracted from the whole blood of all the patients enrolled in the study and, when possible, of their parents.

Statistical Analysis
Statistical analysis was performed using IBM SPSS Statistics 27 software. We compared the mutational frequency of the 9 most mutated genes in our cohort with that in the GnomAD population (Table 3) using the exact binomial test, which verifies whether a proportion from a single dichotomous variable (mutation present or absent) is equal to a value calculated in a control population. All the statistical analyses were one-tailed. To define the GnomAD population frequency, for each gene, we calculated the number of rare variants (present in less than 1% of the population) in a variable number of alleles, considering the coverage data. Synonymous and intronic variants were excluded from the count.
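The one-tailed exact binomial test described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the SPSS procedure used in the study; the function name `binomial_upper_tail` and the example counts are hypothetical and are not taken from Table 3.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """One-tailed exact binomial p-value: P(X >= k) for X ~ Binomial(n, p)."""
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    # Sum the exact probability mass from k up to n.
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical numbers for illustration only (not from the study):
# 12 of 80 cohort alleles carry a rare variant in a gene,
# against an assumed reference rate of 3% rare-variant alleles.
carriers, alleles, reference_rate = 12, 80, 0.03
p_value = binomial_upper_tail(carriers, alleles, reference_rate)
print(f"one-tailed exact p = {p_value:.3g}")
```

The upper tail is used because the question is whether rare variants are more frequent in the cohort than in the reference population, which matches the one-tailed design stated in the text.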

Gene Panel
We found 147 variants, falling into 60 of the 120 genes, in 39 out of 40 patients (36 males, 4 females) analyzed by NGS: only one patient (H16/18) showed no significant variant (Online Supporting Information, Supplementary Table S2 and Table 4).
Table 4. Summary of all variants for each patient, with family segregation, frequency, and in silico prediction information.
The variants taken into consideration have a GnomAD frequency < 1%. Those with a greater frequency were discarded.
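The frequency filter can be expressed as a simple predicate. The sketch below is illustrative only: the record layout, the gene placeholders, and the choice to keep variants absent from GnomAD (allele frequency `None`) as rare are assumptions, the last one based on the fact that several reported variants are described as not present in GnomAD.

```python
from typing import Optional

RARE_AF_THRESHOLD = 0.01  # 1% allele frequency, as stated in the text


def is_rare(gnomad_af: Optional[float], threshold: float = RARE_AF_THRESHOLD) -> bool:
    """Keep a variant if its GnomAD allele frequency is below the threshold.

    Assumption: a variant absent from GnomAD (AF is None) is kept as rare.
    """
    return gnomad_af is None or gnomad_af < threshold


# Hypothetical variant records; gene names are placeholders.
variants = [
    {"gene": "GENE_A", "af": None},   # not present in GnomAD
    {"gene": "GENE_B", "af": 0.004},  # 0.4%, kept
    {"gene": "GENE_C", "af": 0.03},   # 3%, discarded
]
kept = [v for v in variants if is_rare(v["af"])]
print([v["gene"] for v in kept])  # ['GENE_A', 'GENE_B']
```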
We found 137 missense, 2 synonymous (predicted to alter normal splicing), and 8 indel variants, all in a heterozygous state, except for one missense variant in the MET gene, which was carried by both alleles. Two variants of the DMD gene and one in the NRXN3 gene, located on Chromosome X, are present in male patients in a hemizygous state.

De Novo and Inherited Variants
It was possible to perform family segregation analysis for 28 out of 40 patients. We found 4 de novo (Table 5) and 101 inherited variants. Two of the de novo variants were found in the same patient (A14/55), for whom no other candidate variants were identified. The first is located in the ANKRD11 gene, is frequent, and is predicted to be damaging by Polyphen. The second is located in the NF1 gene, is rare, and is predicted to be tolerated by SIFT and benign by Polyphen. In patient L14/95, we found a frequent, tolerated, and benign variant in the SLIT3 gene. This variant has previously been associated with ASD, as described by Cukier et al. (2014) [23]. In patient LDM167, we detected a small duplication of two nucleotides (c.1131_1132dupTA) in the PTEN gene, which caused the formation of a premature stop codon. This variant is not present in the GnomAD database and is predicted to be disease-causing by MutationTaster (Table 6).
Table abbreviations: S, SIFT; P, Polyphen2; (*), 1/5000 < allele frequency (AF) < 1/1000; (f), 1/500 < AF < 1/100.
As for the inherited variants, we observed that the variants identified in our cohort were inherited in roughly equal numbers from the father and from the mother. This equal contribution is also observed if we consider only extremely rare variants (***, **) (23 paternal and 26 maternal) or variants with a deleterious in silico pathogenicity prediction (23 paternal and 28 maternal) (Figure 1).
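One way to check whether such paternal and maternal counts are compatible with equal transmission is a two-sided exact binomial test against a probability of 0.5. The text does not report such a test, so the sketch below is an illustration rather than the authors' analysis; the function name `two_sided_binomial_half` is hypothetical, and only the counts come from the paragraph above.

```python
from math import comb


def two_sided_binomial_half(k: int, n: int) -> float:
    """Exact two-sided p-value for k successes in n trials under p = 0.5.

    Under p = 0.5 the distribution is symmetric, so the two-sided p-value
    is twice the smaller tail, capped at 1.
    """
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    tail = min(k, n - k)
    lower = sum(comb(n, i) for i in range(tail + 1)) / 2**n
    return min(1.0, 2 * lower)


# Counts reported in the text: (paternal, maternal)
groups = {
    "extremely rare variants": (23, 26),
    "deleterious in silico prediction": (23, 28),
}
for label, (paternal, maternal) in groups.items():
    p = two_sided_binomial_half(paternal, paternal + maternal)
    print(f"{label}: paternal={paternal}, maternal={maternal}, p={p:.2f}")
```

A large p-value here would only indicate that the data do not contradict equal transmission; with counts this small, the test has limited power to detect a modest parental bias.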

Variants Already Associated with Autism
We found 8 variants that were already described in the literature and associated with autism (Table 7): 2 variants with a frequency between 0.02% and 0.1% (*) and 6 variants with a frequency between 0.2% and 1.0% (f). None of these variants is extremely rare.
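The two frequency classes used here, (*) and (f), can be written as a small lookup. This sketch covers only the two bins whose boundaries are given in the text (1/5000 < AF < 1/1000 and 1/500 < AF < 1/100); the function name `frequency_label` is hypothetical, and frequencies outside those bins, including the extremely rare classes whose cut-offs are not stated here, return `None` rather than a guessed label.

```python
from typing import Optional


def frequency_label(af: float) -> Optional[str]:
    """Return '*' or 'f' for the two bins defined in the text, else None.

    af is the allele frequency as a fraction (0.004 means 0.4%).
    """
    if 1 / 5000 < af < 1 / 1000:
        return "*"  # between 0.02% and 0.1%
    if 1 / 500 < af < 1 / 100:
        return "f"  # between 0.2% and 1.0%
    return None  # bin not defined in this section


# Example frequencies of the kind reported in the text, as fractions.
for percent in (0.4, 0.07, 0.003):
    print(f"AF {percent}% -> {frequency_label(percent / 100)}")
```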
As described in Table 6, the variant, c.2242C>A (p.Leu748Ile), in the NRXN1 gene was already found in 2 patients with severe autism and paternal transmission [26,27]. In our patient (O12/11), we found this variant, inherited from the mother, together with another frequent variant in the same gene, inherited from the father, plus three additional extremely rare variants in SOX5 (variant not present in GnomAD), CNTNAP2 (AF 0.003%), and CDKL5 (AF 0.013%), the first of which was inherited from the father and the other two from the mother.
The transition, c.236G>A (p.Gly79Glu), in MBD5 was first discovered by Talkowski et al. (2011) [28] at a significantly higher frequency in patients than controls. In our patient (C.L.), we found this variant together with three extremely rare variants in TSC1 (variant not present in GnomAD), FAT1 (AF 0.002%), and SHANK3 (variant not present in GnomAD), as well as another three more frequent in SLIT3 (AF 0.02%), SCN2A (AF 0.4%), and ZEB2 (AF 0.2%). In this case, it was not possible to study the variant segregation in the family.
The variant, c.5156C>T (p.Ser1719Leu), in the RELN gene was first identified by Bonora et al. [29] in two families, whereas in our study, it was found in one patient (A14/06), with a paternal transmission, together with two other frequent variants in VPS13B (AF 0.9%) and RAI1 (AF 0.3%), the first of which was inherited from the mother and the second from the father, plus one rare variant in ZNF804A (AF 0.07%), inherited from the father, and two extremely rare variants in NRXN1 (not present in GnomAD) and NRXN3 (AF 0.005%), both inherited from the mother.
The known Gly56Ala mutation (c.167G>C) in the serotonin transporter, SERT (or 5-HTT), encoded by the SLC6A4 gene, causes an increased serotonin reuptake and has been associated with autism and rigid-compulsive behavior [30,31]. In our patient (D15/81), we found this variant, together with two other frequent variants in VPS13B (AF 0.85%) and CACNA1H (AF 0.5%), all of which were inherited from the father, and one extremely rare variant in MET (AF 0.002%), inherited from the mother.
The variant, c.2941G>A (p.Ala981Thr), in the CREBBP gene was first identified in a study conducted on a cohort of patients affected by Rubinstein-Taybi syndrome (RTS) [32]. In our study, we found this variant in one patient (D10/14), together with two other variants that are not extremely rare in the VPS13B gene (AF 0.17% and 0.045%), but it was not possible to study the family segregation to establish if they are on the same allele or not.
The variant, c.1886G>A (p.Ser629Asn), in the SLIT3 gene was identified in two ASD families in a whole-exome sequencing study, as described by Cukier et al. [23]. In our study, p.Ser629Asn was found in one patient (L14/95) as a de novo variant, together with two not extremely rare variants (in MBD5, AF 0.2%; and VPS13B, AF 0.3%) and two rarer variants (in ANK2, AF 0.07%; and KDM6B, AF 0.02%), all maternally inherited.
The variant, c.12653A>G (p.Asp4218Gly), in the FAT1 gene was first identified in one family in the same study by Cukier et al. [23], previously cited for the SLIT3 variant. In our study, we identified one patient (r.g.74126) carrying p.Asp4218Gly (maternally inherited), together with an additional three frequent variants, two of which were inherited from the father (in ZEB2, AF 0.2%; and RAI1, AF 0.4%) and one from the mother (in CSMD1, AF 0.6%), and one more rare variant inherited from the mother (in AVPR1A, 0.09%).
The variant, c.1960C>G (p.Gln654Glu), in the TSC1 gene was first identified in a study by Kelleher et al. [33]. In our cohort, we found this variant in one patient (H.S.M.), carried by the maternal allele, together with one additional variant in LAMC3 (AF 0.02%). In the same individual, we found, on the paternal allele, two not extremely rare variants in ANKRD11 (AF 0.04%) and a more frequent one in KATNAL2 (AF 0.1%).

Variants Found More Than Once
We identified 5 out of 146 variants shared by two patients and 1 (in RAI1) out of 146 shared by 3 patients. Two patients share two variants, one in ZEB2 and one in RAI1. One variant in NSD1 (shared by F14/70 and B12/69) is extremely rare in GnomAD (0.006%), while the other 5 are more frequent: one in SCN2A (shared by C.L. and IB17A), with a frequency in GnomAD of 0.4%; one in CEP290 (shared by G14/62 and N13/03), with a frequency in GnomAD of 0.7%; one in VPS13B (shared by D15/81 and LDM167), with a frequency in GnomAD of 0.85%; one in the RAI1 gene (shared by C.P.P, r.g.74126 and a.d.81815), with a frequency in GnomAD of 0.4%; and one in the ZEB2 gene (shared by r.g.74126 and a.d.81815), with a frequency in GnomAD of 0.2% (Table 7).

More Variants in the Same Gene
We found 2 variants in the same gene in 6 patients (Table 8). Two different patients (A15/18 and D10/14) each carried two variants in the VPS13B gene, but it was not possible to study the segregation in either family. In patient G14/62, we found 2 variants in ZNF804A, which were not inherited from the mother. Moreover, in patient F12/67, two variants located on the same allele of FAT1 were found to be paternally transmitted. In patient E11/61, we found two mutations, but it was not possible to study the segregation in the family. We also found a homozygous mutation in the MET gene in patient F14/70, which was transmitted by both parents.

Most Frequently Mutated Genes
We identified 9 genes that are more frequently mutated than the others, with at least 5 variants found in each of them.
We detected 12 variants in the FAT1 gene, with two in the same patient (F12/67), both of which were inherited from the father (Tables 3 and 8); 12 variants in the VPS13B gene, with one shared by two patients (Tables 3 and 7), two present in the same patient (A15/18), and another two present in the same patient (D10/14), for whom studying the segregation in the families was not possible (Tables 3 and 8); 6 variants each in ANK2 and RAI1 (in RAI1, three are the same variant) (Tables 3 and 7); and 5 variants each in SHANK3, ANKRD11, CSMD1, DLGAP2, and ZNF804A, with two (in the ZNF804A gene) in the same patient, both of which were maternally transmitted (Tables 3 and 8).
We used a binomial test to compare the mutational frequencies of these 9 genes in our cohort, with the frequencies in the GnomAD population (as shown in Table 3). For all these 9 genes, we noted significantly increased frequencies of rare variants in our study population. The results indicate that the frequency of mutation in all the genes we studied is higher than expected (1-tailed exact p < 0.0001).

Discussion
The diagnostic yield of targeted gene panels in the evaluation of individuals with ASD has not been well defined, but it settles at around 12-14% [34,35]. In this study, our purpose is to evaluate the potential contribution of every single rare variant and not to search for a single variant causing the onset of ASD.
Interestingly, we found only 4 de novo variants, in ANKRD11, NF1, SLIT3, and PTEN, in 3 patients (4/105 variants). This proportion is slightly lower than that published in the literature (according to De Rubeis et al. [36], de novo loss-of-function mutations are found in over 5% of ASD patients), probably because of the small number of samples and genes analyzed in our cohort. Another explanation could derive from the recruitment criteria, namely, the exclusion of patients with syndromic characteristics, which are more often caused by de novo (not inherited) variants in genes already associated with other syndromes.
All four genes are reported in the literature as autism-risk genes or as genes causing syndromes in which subgroups of patients may develop autism [37-39].
Furthermore, in our cohort, the presence of known variants does not seem to contribute to the pathogenic phenotype more than those variants that have never been associated with ASD. In fact, all the variants belong to the (f) group, except two that fall into the (*) group, and only two variants have two deleterious in silico predictions (SIFT and Polyphen). Thus, although they were already associated with ASD, it is difficult to attribute to them a highly pathogenic role, even though there is always more than one variant in other genes. However, it is certain that they contribute to ASD onset. Only the SLIT3 variant seems to play a determinant role in ASD onset in the L14/95 patient, given its de novo origin, unlike the other 4 variants transmitted by the mother. It would be interesting, in this specific case, to widen the search for variants to other autism-related genes in order to clarify their pathogenic role.
Moreover, we noted significantly increased frequencies of rare variants in 9 genes in our study population, with at least 5 variants each. In this regard, we must specify that the analyzed population is composed of patients from all over the Italian territory and that 9 patients have foreign origins: a heterogeneous group comparable to the GnomAD population. The results indicate that the mutational frequency in all the studied genes is higher than expected. Our study highlights the importance of using gene panels to better understand the contribution of the different genes already associated with ASD in the onset of the disease. FAT1 and VPS13B were the most mutated genes in our cohort, with 12 variants each.
The FAT1 gene encodes a member of a small family of vertebrate cadherin-like genes, whose gene products play a role in cell migration, lamellipodia dynamics, cell polarity, and cell-cell adhesions (summary by Gee et al. (2016) [40]). Rare de novo missense variants in FAT1 have been identified in ASD probands by WES in two reports [37,41], while inherited damaging missense variants in FAT1 have been observed in affected individuals from 3 extended multiplex ASD families [23]. Puppo et al. (2015) [42] identified heterozygous missense variants in FAT1 in 10 out of 49 unrelated Japanese patients with a neuromuscular phenotype similar to facioscapulohumeral muscular dystrophy. Puppo et al. chose FAT1 as a candidate gene based on the findings of Caruso et al. (2013) [43], which demonstrated how hypomorphic Fat1 mice show an FSHD-like phenotype. Morris et al. (2013) [44] reported recurrent somatic mutations in FAT1 in glioblastoma, colorectal cancer, and head and neck cancer. In the SFARI database, FAT1 is classified as a gene, with evidence suggesting its implication in ASD (score 3).
The VPS13B gene has been associated with syndromic autism, where a subpopulation of individuals with a given syndrome develop autism. In particular, rare mutations of the VPS13B gene were associated with Cohen syndrome [45]. This gene encodes a potential transmembrane protein that may be involved in the vesicle-mediated transport and sorting of proteins within the cell. This protein may play a role in the development and function of the eye, hematological system, and central nervous system. Multiple splice variants encoding distinct isoforms were identified for this gene, and when we designed the NGS panel, VPS13B was part of score category S (syndromic). Now, VPS13B belongs to gene category 1; hence, several variants in VPS13B have been associated with ASD.
Very interestingly, two patients share 2 variants in 2 different genes (RAI1 and ZEB2). These variants are not extremely rare in the general population, making it necessary to screen a larger cohort of patients to verify the hypothesis of a potential additive effect of these variants on the development of the clinical phenotype, even though RAI1 and ZEB2 are reported to be functionally linked [46]. As for the other repeated variants, they are not very rare in the general population, and for this reason, it is necessary to enlarge our cohort to evaluate their role in autism onset.
In testing the hypothesis that variants can be considered as susceptibility factors in a family genetic background, we should not expect to find variants on both alleles of the same gene. Nonetheless, in our cohort, we found one homozygous variant in the MET gene, which was inherited from both parents, supporting a potential direct pathogenic role of this mutation. Indeed, the MET gene is classified as a strong candidate on the SFARI database (score 2), and positive associations in the Caucasian, Japanese, and Italian populations were found in multiple studies. In addition, biochemical assays showed a reduction in the MET protein levels and a general disruption of MET signaling in ASD patients.
We analyzed the coding regions of 120 ASD candidate genes in 40 autism patients to identify novel mutations, risk genes, and new genotype-phenotype associations. Among the 40 patients screened, only three carried a single candidate variant, while 36 carried more than one. This result is very interesting, especially because the panel used allows for the analysis of only 120 genes, out of the approximately one thousand genes currently associated with ASD. The analysis of the 120 genes included in the panel allowed us to confirm the role of specific genes, such as MET and SLIT3, as genetic factors strongly associated with ASD. It also highlighted the potentially stronger involvement of other genes that were frequently mutated in our cohort, such as FAT1, which is classified in the SFARI database with score category 3, and VPS13B, which was classified in the SFARI database with score category S when we designed the gene panel and only recently moved to gene category 1.
In addition to these data, the analysis of variant segregation showed a generally equal distribution of paternally and maternally transmitted variants. The distribution remains equal even when we consider only extremely rare variants or variants with at least one deleterious in silico pathogenicity prediction, as if the extremely rare or deleterious variants, which are potentially more harmful, were causative of disease onset only when present together. This further supports the hypothesis of a threshold effect model. According to this hypothesis, rare and/or deleterious inherited mutations are not sufficient alone to cause the phenotype but need to co-exist with other factors (genetic, immunologic, or environmental), whose contribution to the pathogenesis of the disease remains unknown. The fact that these variants are inherited from unaffected parents could suggest the presence of a familial mutational burden in these patients. A clinical evaluation of the parents to clearly establish the absence of any symptoms related to autism could strengthen this statement. On the other hand, the role of de novo mutations in the onset of the disease remains unclear.
To our knowledge, this is the first study focused exclusively on rare variants with a possible role in the pathogenesis of autism, without any speculation about a direct causative effect. In the near future, we plan to expand our cohort and analyze all members of the affected families, expanding the collection of detailed clinical descriptions to all of them. In particular, our perspective is focused on performing the phenotypical characterization of probands' parents to detect the presence of any symptoms potentially related to autism and to characterize individuals with clinical or subclinical ASD (the so-called "broader phenotype"). Crucially, this will result in a better interpretation of the contribution of inherited variants. Moreover, we will improve our panel, removing those genes in which no variants were found and adding others suggested in the literature, with the aim of expanding the knowledge on the genetic basis of ASD and improving the genetic counseling for the families.

Conclusions
Our study supports the hypothesis that ASD could be the result of a combination of rare deleterious variants (low contribution) and many low-risk alleles (genetic background), as indicated by Huguet and colleagues in 2017 [47]. Indeed, most patients in our cohort are carriers of more than one rare variant (with or without in silico deleterious prediction), and only three patients are carriers of a unique variant. Considering all identified genetic alterations, we found that 6 patients (LDM160, 81815, C11/79, F14/70, IB1A, and G14/62) have many rare or extremely rare variants, and 5 patients (A14/06, G14/62, IB1A, C15/87, and r.g.74126) have several variants with at least one deleterious prediction. Looking for a possible genotype-phenotype correlation, we did not find overlapping phenotypic characteristics among these patients, or more severe symptoms compared with the other patients, as might have been expected.
Furthermore, the segregation analysis data showed an equal distribution of maternally and paternally inherited variants. Finally, the recurrence of mutations in a small set of genes, despite the restricted cohort analyzed, demonstrated a clear involvement of these genes in genetic liability to autism. In particular, our study highlights the importance of MET and SLIT3 and a potentially stronger involvement of FAT1 and VPS13B in ASD than has previously been reported. Taken together, our findings reinforce the importance of using gene panels to understand the contribution of different genes already associated with ASD in the pathogenesis of the disease.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/app11178096/s1, Table S1: Primer sequences for Sanger validation and variant segregation. Table S2: List of all variants found in our cohort and genetic details.
Author Contributions: C.R. and V.T. wrote the work. C.R. carried out the genetic tests and analyzed the genetic data. S.B. and S.A. analyzed all the clinical and neuropsychological data. A.L. reviewed the genetic data. B.G. defined the materials and methods and reviewed the paper. C.P. analyzed the clinical data and reviewed the paper. D.R. developed the idea of the work and revised the paper. S.D. designed the project, defined the materials and methods, enrolled the patients, and reviewed the paper. All authors have read and agreed to the published version of the manuscript.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: All variants found in this work are publicly available on dbSNP (NCBI). It is possible to find the submitted SNP (ss) number at: https://www.ncbi.nlm.nih.gov/SNP/snp_viewTable.cgi?handle=NEUROGEN_BESTA, accessed on 25 November 2020. All dbSNP IDs are listed in Supplementary Table S2.