Exome Sequencing in 200 Intellectual Disability/Autistic Patients: New Candidates and Atypical Presentations

Intellectual disability (ID) and autism spectrum disorder (ASD) are neurodevelopmental disorders that occur in ~1% of the general population. Due to disease heterogeneity, identifying the etiology of ID and ASD remains challenging. Exome sequencing (ES) offers the opportunity to rapidly identify variants associated with these two entities, which often co-exist. Here, we performed ES in a cohort of 200 patients: 84 with isolated ID and 116 with ID and ASD. We identified 41 pathogenic variants with a detection rate of 22% (43/200): 39% in ID patients (33/84) and 9% in ID/ASD patients (10/116). Most of the causative genes are responsible for well-established genetic syndromes that had not been recognized because of atypical phenotypic presentations. Two genes emerged as new candidates: CACNA2D1 and GPR14. In conclusion, this study reinforces the importance of ES in the diagnosis of ID/ASD and underlines that “reverse phenotyping” is fundamental to enlarge the phenotypic spectra associated with specific genes.


Introduction
Intellectual disability (ID) is characterised by significant limitations in intellectual functioning (reasoning, learning, problem-solving) and adaptive behaviour (conceptual, social, and practical skills) that originate before the age of 18 [1]. Affecting 1-3% of the world's population, ID represents an important socio-economic problem in healthcare [2,3]. The limitations in cognitive function manifest as an intelligence quotient (IQ) below 70. ID may be "isolated", or "syndromic" when patients have peculiar facies, specific physical signs and/or an abnormal growth pattern [4].
Autism spectrum disorder (ASD) is characterized by deficient social interactions, poor or absent communication, repetitive behaviours, and apparently limited interests [5]. ASD generally becomes apparent after the first year of life and has been reported in an increasing number (2.2-2.7%) of children, with boys four times more likely to be affected than girls [6,7]. From a clinical point of view, ASD is classified as "syndromic" when autism co-occurs with dysmorphic features or with other somatic or neurobehavioural abnormalities [8].
ID and ASD often co-exist, and identifying the etiology of the two conditions remains challenging due to disease heterogeneity. Accurate clinical and molecular diagnoses are essential for a deeper understanding of the pathogenesis of these conditions and for devising tailored treatments [4]. Until the advent of the first next generation sequencing (NGS) platforms, a large fraction of cases remained undiagnosed, with many families undergoing a "diagnostic odyssey" [9]. The introduction of exome or genome sequencing (ES/GS) has significantly improved diagnostic rates in individuals with suspected ID/ASD genetic disorders refractory to conventional diagnostic testing [10].
In the present study, ES was performed on a total of 200 individuals (84 ID and 116 ID/ASD patients). Pathogenic or likely pathogenic (P/LP) variants were found in 43 individuals (22%), with 45 variants of uncertain significance in an additional 20% (40/200). Our data strongly support the value of large-scale sequencing, especially ES within proband-parent trios, as an effective first-choice diagnostic tool.

Selection of Patients and DNA Samples' Preparation
Genetic counselling was carried out to evaluate each patient's personal and familial history. Parents provided signed written informed consent at the Medical Genetics department of the University of Siena, Italy, for exome sequencing analysis, clinical data usage, and the use of DNA samples from the tested individuals for both research and diagnostic purposes. We analysed a total of 200 patients affected by ID and ID/ASD (84 with ID and 116 with ID and ASD) collected from January 2019 until the end of March 2021.
Genomic DNA from the parents was isolated from EDTA peripheral blood samples using MagCore HF16 (Diatech Lab Line, Jesi, Ancona, Italy) according to the manufacturer's instructions.

Exome Sequencing
Sample preparation was performed following the Illumina DNA Prep with Enrichment manufacturer protocol. A bead-based transposome complex is used to perform tagmentation, a process that fragments the genomic DNA and tags it with adapter sequences in one step. After saturation with input DNA, the bead-based transposome complex fragments a set number of DNA molecules. This fragmentation provides flexibility to use a wide DNA input range to generate normalized libraries with a consistent, tight fragment size distribution. A limited-cycle PCR then adds adapter sequences to the ends of each DNA fragment, after which a target enrichment workflow is applied. Following pooling, the double-stranded DNA libraries are denatured, and biotinylated Illumina Exome Panel v1.2 (CEX) probes are hybridized to the denatured library fragments. Streptavidin Magnetic Beads (SMB) then capture the targeted library fragments within the regions of interest. Finally, the indexed libraries are eluted from the beads and further amplified before sequencing. Exome sequencing was performed on the Illumina NovaSeq6000 System (Illumina, San Diego, CA, USA) according to the NovaSeq6000 System Guide. Reads were mapped against the hg19 reference genome using the Burrows-Wheeler Aligner (BWA) [11]. Variant calling was performed using an in-house pipeline based on the GATK Best Practices workflow [12].
Variants were prioritized by excluding polymorphisms (i.e., retaining only variants with a minor allele frequency, MAF, <0.01), synonymous variants, and variants classified as benign or likely benign. Frameshift, stopgain, and splice site variants were prioritized as pathogenic. Missense variants were assessed for a predicted damaging effect using the CADD-Phred score. The potential impact of variants on splicing was evaluated using Alamut® Visual software version 2.11-0 (Interactive Biosoftware, Rouen, France), which employs five different algorithms: SpliceSiteFinder-like, MaxEntScan, NNSPLICE, GeneSplicer, and HumanSplicingFinder.
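The prioritization rules described above can be illustrated with a minimal Python sketch. It is not the in-house pipeline used in this study: the `Variant` record, its field names, the tier labels, and the CADD-Phred cut-off of 25 are hypothetical choices made for illustration only; the MAF threshold of 0.01 is the only value taken from the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

MAF_CUTOFF = 0.01         # from the text: MAF >= 0.01 is treated as a polymorphism
CADD_PHRED_CUTOFF = 25.0  # ASSUMPTION: no prioritization cut-off is stated in the text
TRUNCATING = {"frameshift", "stopgain", "splice_site"}
BENIGN_CLASSES = {"benign", "likely_benign"}


@dataclass
class Variant:
    """Hypothetical, simplified annotation record for one variant."""
    gene: str
    consequence: str               # e.g. "missense", "synonymous", "frameshift"
    maf: Optional[float]           # None if absent from population databases
    classification: Optional[str]  # e.g. "benign", "likely_benign", or None
    cadd_phred: Optional[float]    # None if no score is available


def prioritize(variants: List[Variant]) -> List[Tuple[Variant, str]]:
    """Apply the exclusion rules, then label each retained variant with a tier."""
    kept = []
    for v in variants:
        # Exclusion step: polymorphisms, synonymous and (likely) benign variants.
        if v.maf is not None and v.maf >= MAF_CUTOFF:
            continue
        if v.consequence == "synonymous":
            continue
        if v.classification in BENIGN_CLASSES:
            continue
        # Tiering step: truncating variants first, then high-CADD missense.
        if v.consequence in TRUNCATING:
            tier = "truncating"
        elif (v.consequence == "missense"
              and v.cadd_phred is not None
              and v.cadd_phred >= CADD_PHRED_CUTOFF):
            tier = "missense_predicted_damaging"
        else:
            tier = "other"
        kept.append((v, tier))
    return kept
```

Retained variants labelled "other" would still need manual review, for example for a possible effect on splicing, which this sketch does not model.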

Clinical Characteristics of Patients
We enrolled 200 families (574 individuals in total) with at least one proband with an unexplained ASD/ID-related phenotype. In particular, 181 families had one affected proband and 4 families had two affected probands. Exomes were sequenced to an average depth of 100×, with ~94% of bases covered at ≥20×. The study population had a mean age of 15 years and was 64% male (128/200). We divided the cohort into four groups on the basis of age (0-10; 11-18; 19-30; 31-49) and gender (Figure 1). All individuals displayed ID, associated with ASD in 58% (116/200). All patients had undergone genetic testing prior to enrolment in this study. Additional associated clinical findings were present in 90% of the patients (179/200). These included epilepsy (n = 41/200, 20%), hypotonia (n = 10/200, 5%), and MRI abnormalities, observed in 13 patients. Craniofacial dysmorphisms were found in 53% of the cases (n = 106/200). The clinical descriptions of the 200 patients are summarized in Table S1.

Figure 1. Demographic analysis of the cohort, filtered by age and gender.

P/LP Variants Identified by ES
ES was performed in 200 probands with ASD and/or ID, and 41 different pathogenic/likely pathogenic (P/LP) variants were identified, with a detection rate of 22% (Table S2). The clinical features of patients with pathogenic variants in disease genes are described in Table 1.
Two de novo P/LP variants were found in the new candidate genes CACNA2D1 and GPR14 (Table S3), and the corresponding clinical pictures are reported in Table 2.

Uncertain Variants Identified by ES
We further report 45 uncertain variants. This number includes variants that are currently considered to be of uncertain significance in the databases, as well as other missense variants that have not been previously described in the scientific literature (Table S4). The effect on the encoded mutated proteins was predicted using CADD (Combined Annotation Dependent Depletion). In our study, the majority of both uncertain and P/LP variants fall in genes that play a role in axon guidance and in neurodevelopmental processes (https://reactome.org/PathwayBrowser/#/, accessed on 6 July 2021).

Discussion
This study emphasizes the clinical diagnostic relevance of ES in patients with ID and/or autism with additional clinical features. In particular, in a cohort of 200 patients, we reached a diagnostic yield of 22% (43/200), with a higher rate in ID patients (33/84; 39%) than in ID and ASD patients (10/116; 9%). This diagnostic yield is lower than that of other studies employing ES in neurodevelopmental disorders (30-43%) [13]. ES has technological limitations, including the inability to detect noncoding variants, copy number variants (CNVs), epigenetic changes, and trinucleotide repeat expansions [14]. In our cohort, 30% of cases have not been screened for CNVs, and this may have led to an underestimation of other pathogenic genetic alterations, in particular in patients with ASD. Until recently, whole genome chromosomal microarray was recommended as a first-tier clinical genetic test for detecting disease-causing CNVs in individuals with ASD [15][16][17].
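The yield figures quoted at the start of this discussion follow directly from the per-group counts. As a quick arithmetic check, the short Python sketch below recomputes them; the function name `diagnostic_yield` and the dictionary layout are illustrative choices, and only the counts (33/84 and 10/116) come from the text.

```python
from typing import Dict, Tuple


def diagnostic_yield(groups: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Return the percentage of solved cases per group and for the pooled cohort."""
    result = {name: 100.0 * solved / total
              for name, (solved, total) in groups.items()}
    solved_all = sum(solved for solved, _ in groups.values())
    total_all = sum(total for _, total in groups.values())
    result["overall"] = 100.0 * solved_all / total_all
    return result


# (patients with a P/LP variant, patients in the group), as reported in the text
cohort = {"ID": (33, 84), "ID/ASD": (10, 116)}
yields = diagnostic_yield(cohort)

for name, pct in yields.items():
    print(f"{name}: {pct:.1f}%")
```

To one decimal place this gives 39.3% for ID, 8.6% for ID/ASD, and 21.5% overall, in line with the rounded values of 39%, 9%, and 22% reported here.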
Most of the previous diagnostic attempts failed because of the atypical phenotypic presentation of well-established genetic syndromes. The patient with a DDX3X mutation (#11) shows Rett-like spectrum features with typical hand-washing stereotypies and was initially screened for mutations in the MECP2, FOXG1 and CDKL5 genes [18][19][20][21]. In contrast, a hundred patients with DDX3X mutations have been reported in the literature with various clinical features including hypotonia, movement disorder, behavioural problems, corpus callosum hypoplasia and epilepsy [22,23]. Another atypical clinical picture was observed in a patient bearing the mutation c.688C > T (p.(Arg230Cys)) in KCNQ3, who did not suffer from any status epilepticus but showed ID, autism, stereotypies, aggressiveness, and bladder anomalies; he also presented craniofacial dysmorphisms (patient #18). KCNQ3 pathogenic alterations are generally linked to the occurrence of seizures, but patients with no EEG abnormalities have recently been described [24]. The KIF1A gene was found altered in three unrelated cases with different phenotypic presentations (patients #19, #20, #21). KIF1A mutations cause NESCAV syndrome (NESCAVS), a neurodegenerative disorder characterized by global developmental delay, progressive spasticity, ID, speech delay, learning disabilities and/or behavioural abnormalities [25]. The mutation c.37C > T (p.(Arg13Cys)) in KIF1A was found in patient #19, who had spastic paraparesis, behaviour disorder, slight enlargement of the interfolial spaces of the cerebellar hemispheres, hypertonia and no craniofacial dysmorphisms. Notably, the same mutation, c.914C > T (p.(Pro305Leu)), was found in two different patients (#20, #21). One of them presented language delay, cerebellar and vermis atrophy, psychomotor delay, brain abnormalities, hypertrichosis, bilateral clinodactyly, and facial dysmorphisms (patient #20). The other showed ataxia, spastic paraparesis, angioma, nystagmus, and seizures (patient #21).
Mutations in POGZ are associated with White-Sutton syndrome, a neurodevelopmental disorder characterized by delayed psychomotor development and a characteristic constellation of dysmorphic facial features [26]. Additional features may include hypotonia, sensorineural hearing impairment, visual defects, joint laxity, and gastrointestinal difficulties [27]. The pathogenic variant c.1180_1181del (p.(Met394Valfs*9)) was carried by two siblings and their mother (#26, #27, #28). The sister exhibited craniofacial dysmorphisms and ID; she also showed hyperactivity, blepharophimosis, brachydactyly, nail hypoplasia, kidney abnormalities and language delay as additional clinical signs (patient #26). The brother was affected by ID, hypotonia, and obesity and had some craniofacial dysmorphisms (patient #27). Their mother instead displayed ID, microcephaly, brachydactyly, and nail hypoplasia (patient #28). As another example, SHANK3 was found altered in three unrelated patients (#31, #32, #33). Mutations in SHANK3 cause Phelan-McDermid syndrome, a developmental disorder with variable features including neonatal hypotonia, global developmental delay, absent to severely delayed speech, autistic behaviour, and minor dysmorphic features [28][29][30]. One of the three mutated patients showed neither autism nor dysmorphic features and was initially classified as a "non-syndromic" ID case.
The following new candidate genes for ID/ASD have emerged: CACNA2D1 and GPR14. Both show de novo truncating variants (patients #44-#45). CACNA2D1 encodes the alpha-2/delta subunit of skeletal muscle and brain voltage-dependent calcium channels [31]. A genomic aberration affecting the CACNA2D1 gene has been previously characterized in patients with epilepsy and ID, pinpointing the gene as an interesting candidate for these clinical features [32]. Mice bearing point mutations in the CACNA2D1 gene have abnormal central nervous system synaptic transmission [33]. The GPR14 gene, encoding the orphan G protein-coupled receptor 14 for Urotensin II, is widely expressed in the brain and spinal cord [34].
We found 45 variants of unknown significance (VUS) in 40 patients: 22/84 (26%) ID patients and 18/116 (16%) ID/ASD patients. These variants are mostly missense (42/45; 93%), with CADD ≥ 25 in 20/42 (48%) of cases. With increased knowledge over time, exome reanalysis may change the clinical interpretation of a VUS. Thus, it is important to list all the VUS, analyse them periodically, and issue a report in case of changes, to provide a timely response for patients and families. An accurate molecular diagnosis allows for precise genetic counselling and has the potential to change clinical management.

Conclusion
ES spared a significant fraction of families the "diagnostic odyssey" that consists of the step-by-step application of traditional genetic methods. ES revealed atypical phenotypic presentations and new candidate genes for ID/ASD. Further studies are needed to better characterize the contribution of the new candidates and to show how their haploinsufficiency can lead to ID/ASD.