Gene Mutations of the Three Ectodysplasin Pathway Key Players (EDA, EDAR, and EDARADD) Account for More than 60% of Egyptian Ectodermal Dysplasia: A Report of Seven Novel Mutations

Ectodermal dysplasia (ED) is a diverse group of genetic disorders caused by congenital defects of two or more ectodermal-derived body structures, namely, hair, teeth, nails, and some glands, e.g., sweat glands. Molecular pathogenesis of ED involves mutations of genes encoding key proteins of major developmental pathways, including ectodysplasin (EDA) and wingless-type (WNT) pathways. The most common ED phenotype is hypohidrotic/anhidrotic ectodermal dysplasia (HED) featuring hypotrichosis, hypohidrosis/anhidrosis, and hypodontia. Molecular diagnosis is fundamental for disease management and emerging treatments. We used targeted next generation sequencing to study EDA, EDAR, EDARADD, and WNT10A genes in 45 Egyptian ED patients with or without hypohidrosis. We present genotype and phenotype data of 28 molecularly-characterized patients demonstrating genetic heterogeneity, variable expressivity, and intrafamilial phenotypic variability. Thirteen mutations were reported, including four novel EDA mutations, two novel EDARADD, and one novel EDAR mutations. Identified mutations congregated in exons encoding key functional domains. EDA is the most common gene contributing to 85% of the identified Egyptian ED genetic spectrum, followed by EDARADD (10%) and EDAR (5%). Our cohort represents the first and largest cohort from North Africa where more than 60% of ED patients were identified emphasizing the need for exome sequencing to explore unidentified cases.


Introduction
Ectodermal dysplasia (ED) is a heterogenous group of genetic disorders caused by congenital defects affecting the development and/or homeostasis of two or more ectodermalderived body structures, namely, hair, teeth, nails, and certain glandular structures, e.g., sweat, salivary, and sebaceous glands [1]. Seventy-seven genes and nine chromosomal regions are linked to 75 types of EDs [2]. The most common ED phenotype is hypohidrotic/anhidrotic ectodermal dysplasia (HED) which represents about 50% of ED [3,4].

Ethical Considerations
Patients were recruited to the Orodental and Genodermatoses outpatient clinics of the Medical Research Excellence Centre of the National Research Centre (NRC) of Egypt. Written informed consent was obtained from all participants and/or legal guardians. This study was approved by the Medical Ethics Committee of NRC and the institutional review board of the American University in Cairo (AUC) according to the Helsinki declaration.

Clinical Evaluation
Our study included 45 Egyptian ED patients descending from 35 unrelated families. Patients were selected based on the impairment of at least two ectodermal derivatives. Exclusion criteria included patients with non-syndromic traits, i.e., where one ectodermal derivative only was impaired, and patients having syndromic ED features, e.g., cleft lip/palate, hand/foot malformations, and hearing loss. Patients were subjected to full medical history, including sex, age, chief complaint, progression of symptoms, the presence of any birth or pregnancy complications, and history of present illness. Pedigree analysis for at least three generations with emphasis on consanguinity and family members having similar or different genetic abnormalities was done. Patients were evaluated by a multidisciplinary team, including a clinical geneticist, a dentist, a dermatologist, and an otolaryngologist. General clinical examination focused on the skin and other ectodermal elements, including the presence of sparse/brittle hair, dry skin, dystrophic nails, and history of hyperthermia. Patients and parents underwent a thorough oral and dental examination. Dentition was assessed by panoramic radiographs in cooperative patients older than five years. Hypodontia was defined as agenesis of less than six teeth while oligodontia is the agenesis of six teeth or more. Patients younger than three years with delayed eruption of teeth in the presence of conical peg shaped teeth and/or in the presence of other ectodermal manifestations were also included in the study.

Genomic DNA Extraction and Targeted NGS
DNA was extracted from peripheral blood samples of all patients, parents, and available family members using standard salting out procedure. DNA samples of patients were quantified using fluorometric Denovix Qubit™ dsDNA BR Assay Kit (ThermoFisher, Waltham, MA, USA) and diluted to 6-20 pM final loading concentration, as per custom panel recommendation. Ampliseq On Demand NGS custom panel from Illumina (Ther-moFisher, Waltham, MA, USA) was designed, and the targeted genomic regions included the exons, around 100 bp of flanking intronic sequences, and the 5 and 3 un-translated regions of EDA, EDAR, EDARADD, and WNT10A genes. The average coverage depth of the entire panel was 100×, and~85-100% of targeted bases were covered. The NGS library was prepared using Ampliseq Custom Amplicon kit (ThermoFisher, Waltham, MA, USA) according to manufacturer's reference guide (Illumina #1000000036408). Briefly, 100 ng of genomic DNA in 7.5 µL water were hybridized with an oligo pool. Then, unbound oligos were removed, and extension-ligation of bound oligos was followed by PCR amplification. PCR products were cleaned. Prior to sequencing, libraries were normalized with the normalization process of the Ampliseq Custom Amplicon kit. Libraries were paired-end sequenced with 2 × 151 basepairs cycles on a MiSeq device (Illumina, San Diego, CA, USA).

Data Processing and Bioinformatic Analysis
The generated sequence data from enriched libraries were analyzed by MiSeq reporter software. FASTQ files were generated and Burrows Wheeler Aligner (BWA) was used to align the reads against the GRCh37/hg19 reference genome to create BAM files. Genome Analysis Toolkit (GATK) was used for variant analysis for the included target regions of the specified genes. Variant filtration and analysis were carried out using Illumina Variant Studio software. Variants were analyses based on satisfying sequencing depth, quality, and frequency in dbSNP [25], Genome Aggregation Database (GnomAD) [26], and 1000 Genomes Project (1000G) [27] databases. Variants were named according to Human Genome Variation Society (http://www.hgvs.org, accessed on 10 January 2021). Pathogenicity of variants was interpreted based on the recommended standards of the American College of Medical Genetics and Genomics (ACMG) [28]. For novel missense variants, eight in silico prediction algorithms were used to assess the impact of these variants on protein structure/function and evolutionary conservation. For evolutionary conservation, we used SIFT [29], PhD-SNP [30], and Mutation Assessor [31]. Polyphen-2 [32], MutationTaster2 [33], and PROVEAN [34] were used for the impact on both the protein structure/function and evolutionary conservation. SNPs&GO takes gene ontology into account while CADD contrasts annotations of fixed derived alleles in humans with simulated variants of interest [35,36]. Novel variants were deposited in ClinVar database [37].

Segregation Analyses
Sanger sequencing was used for segregation analyses among affected siblings, parents, and available family members of molecularly characterized patients. Primers were designed using Primer3 tool (https://primer3.ut.ee/, accessed on 10 January 2021), Table 1. PCR cycling conditions started by initial denaturation at 96 • C for 5 min followed by 30 three-stage cycles. Each cycle included (1) denaturation at 96 • C for 30 s, (2) annealing at primer pair's specific temperature for 30 s, and (3) extension at 72 • C for 30 min. A final extension at 72 • C for 5 min followed. PCR products were purified using QIAquick PCR purification kit (Qiagen, Dusseldorf, Germany). Forward and reverse DNA strands were sequenced using the Big Dye Termination kit (Applied Biosystems, Waltham, MA, USA), and analyzed on the ABI Prism 3500 Genetic Analyzer (Applied Biosystems, Waltham, MA, USA) according to manufacturer's protocols.

Clinical Data of 28 Molecularly Identified ED Patients
Molecular investigation identified pathogenic variants in 28 patients (62%) descending from 20 unrelated pedigrees out of the 45 participant ED patients. The clinical data of the 28 molecularly identified patients (ED1-ED28) is summarized in Table 2. Positive parental consanguinity was detected in nine (45%) of the 20 pedigrees with characterized molecular pathology, Figure 1. Among the 28 molecularly identified patients whose ages ranged between one year and 59 years, there was a clear male preponderance (20/28, 71.4%). HED triad manifested in 19/28 (68%) of patients; in four patients (ED13, ED18, ED27, and ED28), hypohidrosis and hypotrichosis were detected, but panoramic radiographs could not be obtained owing to their young age, Figure 2. Only five patients (5/28, 17.8%) had normal sweating, i.e., hidrotic ED. Dental abnormalities were present in 24 patients (24/28, 85.7%) in form of oligodontia and/or peg shaped teeth. While almost all patients presented with dry skin and sparse hair, only six patients (6/28, 21.4%) showed dysplastic nails.

Genetic Spectrum of ED in Our Cohort
A total of 13 different mutations were identified and predicted to be pathogenic according to ACMG guidelines: six previously reported EDA mutations, four novel EDA mutations, two novel EDARADD, and one novel EDAR mutations, Figure 3 [28]. The 10 EDA (NM_001399.4) mutations were identified in 25 patients from 17 unrelated families, i.e., 85% of the studied pedigrees, two EDARADD (NM_145861.2) in 10% (two patients from two unrelated families), and one EDAR (NM_022336.3) in 5% (one patient), Table 3. All five hidrotic patients had EDA mutations. No WNT10A mutations were detected.

EDA Mutations
The 25 patients harboring EDA mutations were 18 hemizygous males and seven heterozygous females. The most common mutation was the EDA missense (c.463C > T, p.R155C) mutation found in 10 patients from four unrelated pedigrees, followed by another missense EDA mutation (c.466C > T, p.R156C), which was identified in three patients from three unrelated pedigrees, Figure 1 and Table 3. All of the 10 identified EDA mutations were missense mutations except one splicing mutation (c.707-2A > T), 18 bp in-frame deletion (c.659_676delCAGGTCCTCCTGGTCCTC, p.P220_P225del), and one base pair frameshift deletion (c.492delA, p.G165Efs*115), Figure 3. All 10 EDA mutations were identified in exons number two, four, and seven of EDA gene and their flanking intronic region, Table 3.   according to ACMG guidelines: six previously reported EDA mutations, four novel EDA mutations, two novel EDARADD, and one novel EDAR mutations, Figure 3 [28]. The 10 EDA (NM_001399.4) mutations were identified in 25 patients from 17 unrelated families, i.e., 85% of the studied pedigrees, two EDARADD (NM_145861.2) in 10% (two patients from two unrelated families), and one EDAR (NM_022336.3) in 5% (one patient), Table 3. All five hidrotic patients had EDA mutations. No WNT10A mutations were detected.

Seven Novel Mutations and Their in Silico Analyses
We identified seven novel mutations, which were assigned accession numbers (SCV001480041.1-SCV001480043.1, SCV001480045.1, SCV001480046.1 SCV001762294.1, and SCV001762295.1) in ClinVar database [37]. None of these variants were found in dbSNP, 1000G, gnomAD exome, and ExAC databases or in 100 chromosomes from unaffected/unrelated Egyptian subjects. Two novel frameshift mutations, EDAR (c.204delC, p.Y69Tfs*34) and EDA (c.492delA, p.G165Efs*115), were predicted to be disease causing by MutationTaster2 tool. For novel missense variants, seven in silico algorithms predicted the deleterious, damaging, or disease-causing statuses of the three novel EDA mutations: c.602G > A (p.G201E), c.620G > A (p.G207E), and c.628G > A (p.G210R), Table 4. The same prediction was obtained in the case of the two novel EDARADD variants, (c.85G > A, p.E29K) and (c.570C > A, p.D190E), in six out of seven prediction tools, Table 4. CADD scores for the five novel missense variants were greater than 20, which means that these variants are among the top 1% of deleterious variants of the human genome [36].  Abbreviations: CADD: combined annotation dependent depletion. CADD score ≥ 10 indicates that the variant is predicted to be in the 10% most deleterious substitutions to the human genome, a score of ≥20 indicates the 1% most deleterious, etc. Mutation Assessor predicts functional impact as high (H) or medium (M) or low (L) or neutral (N) at cutoff scores of 3.5, 1.935, and 0.8 between (H&M), (M&L), and (L&N), respectively. MutationTaster scores amino acid changes from 0 to 215 based on evolutionary distance. PhD-SNP: predictor of human deleterious single nucleotide polymorphisms. PolyPhen2 score ranges from 0.0 (benign) to 1.0 (probably damaging). PROVEAN: protein variation effect analyze, score ≤ −2.5 is predicted as damaging; otherwise, it is predicted as neutral. RI: reliability index range from 0 (unreliable) to 10 (reliable), for PhD-SNP and SNPs&GO. SIFT: sort intolerant from tolerant; its threshold for intolerance is 0.05. SNPs&GO: single nucleotide polymorphisms and gene ontology.

Variable Expressivity of Heterozygous Carriers and De Novo Events
Among the 20 molecularly characterized families, X-linked inheritance (XL) was evident through pedigree analyses in 13 families, where mothers of respective probands showed partial manifestations, e.g., four missing teeth of the mother (hypodontia) and maternal grandmother of ED13, microdontia of upper lateral incisor of the mother of ED14 and 15. In two XL ED families, F11 and F15, heterozygous carrier mothers did not show any ED related manifestations, thus, XL inheritance was confirmed only after molecular diagnosis. In another two families, F13 and F16, ED21, and ED24 harbored two de novo EDA mutations, the 18 bp in-frame deletion (c.659_676delCAGGTCCTCCTGGTCCTC, p.P220_P225del) and the missense (c.871G > A, p.G291R) mutation, respectively, Figure 1. Variant segregation analysis confirmed autosomal recessive transmission of the three novel variants: missense EDARADD (c.85G > A, p.E29K) mutation in ED26, missense EDARADD (c.570C > A, p.D190E) mutation in ED27, and frameshift EDAR (c.204delC, p.Y69Tfs*34) mutation in ED28, Figure 3 and Table 3. Both parents of ED28 had mild hypodontia, Figure 2D,E.

Discussions
Molecular diagnosis of more than 60% of Egyptian ED patients classified as HED or hidrotic ED identified EDA to be the most common gene, contributing to 85% of the Egyptian genetic spectrum, followed by EDARADD (10%) and EDAR (5%). No WNT10A mutations were identified. The majority of our cohort were HED patients who were identified to have mutations in EDA (87%), EDARADD (8.7%), and EDAR (4.3%) genes, unlike hidrotic ED patients who represented 17.8% of our cohort and clustered only into EDA mutation spectrum. To our knowledge, our study comprises the first and largest North African cohort to investigate EDA, EDAR, EDARADD, and WNT10A genes in HED. Our genetic spectrum of HED differs from previously studied Mediterranean HED cohorts, which reported WNT10A mutations to account for about 16-25% of their genetic spectra and EDARADD mutations to be the least common [6,[20][21][22]42]. Such discrepancies may be due to pan ethnic variation in populations, different cohort sizes, and/or clinical selection differences. Conclusively, the molecular defect underlying about 20-40% of HED across different studied cohorts, including ours, remains to be elucidated. Our 10 identified EDA mutations congregated into three hotspot exons (E2, E4, and E7) encoding evolutionarily conserved and functionally relevant domains of Eda protein, Figure 4. The two most common EDA mutations in our cohort, (c.463C > T, p.R155C) and (c.466C > T, p.R156C), located in E2, abolish the furin cleavage site, directly impacting Eda function. Both mutations were reported to cause classical HED triad with severe skin manifestations [23,38,[43][44][45][46]. The 13 patients (ED1-ED13) harboring either (c.463C > T, p.R155C) or (c.466C > T, p.R156C) had classical HED features, except for ED3, ED7, and ED9, who carried (c.463C > T, p.R155C) mutation and had hidrotic ED with milder phenotypes than other affected members of their families. To our knowledge, our study has the largest number of EDA patients harboring (c.463C > T, p.R155C) mutation, thus, providing evidence for intrafamilial phenotypic variability of this mutation, which makes genetic counseling, specifically concerning prenatal decision, difficult. We predict that other genetic factors might contribute to the sparing of sweat gland affection in some patients.  Seven novel (red boxes) and six previously reported (blue boxes) mutations are shown with respect to their amino acid (a.a.) position on their affected protein product. EDA has eight exons, which encode a 391 amino acid (a.a.) type II transmembrane protein divided into a small intracellular N terminal region, a transmembrane domain (TM), and a large C-terminal extracellular region. The domain harboring furin cleavage site (position: 153-159 a.a., encoded by exon 2 (E2)) is located at the beginning of the extracellular region, where Eda is cleaved into the secreted form to bind to Edar receptor (TNF receptor) for NF-κB pathway activation. The extracellular region harbors another two crucial functional domains; the 19 collagen-like repeats (Gly-X-Y)19 domain (position 180-235 a.a. encoded by E4) and a TNF homology domain (position 245-391 a.a. encoded by E6, E7, and E8) [39,47]. The EDA gene is shown to illustrate the congregation of identified mutations in E2, E4, and E7. Edar protein (448 a.a.) is a tumor necrosis factor receptor (TNFR) with two predicted domains: TNFR domain (30-148 a.a.) and death domain (358-431 a.a.). Edaradd protein (215 a.a.) is associated with tumor necrosis factor receptor with a death domain predicted at (123-202 a.a.) [48].
EDA mutations are known to cause two overlapping phenotypes: HED and nonsyndromic tooth agenesis (NTA). This variation in phenotype was suggestively attributed to residual Eda receptor binding ability of some NTA-causing EDA mutations versus complete abolishment in the case of HED-causing EDA mutations [41,49]. Such a suggestion was supported by phenotypic rescue of hair and salivary glands at lower Eda Seven novel (red boxes) and six previously reported (blue boxes) mutations are shown with respect to their amino acid (a.a.) position on their affected protein product. EDA has eight exons, which encode a 391 amino acid (a.a.) type II transmembrane protein divided into a small intracellular N terminal region, a transmembrane domain (TM), and a large C-terminal extracellular region. The domain harboring furin cleavage site (position: 153-159 a.a., encoded by exon 2 (E2)) is located at the beginning of the extracellular region, where Eda is cleaved into the secreted form to bind to Edar receptor (TNF receptor) for NF-κB pathway activation. The extracellular region harbors another two crucial functional domains; the 19 collagen-like repeats (Gly-X-Y) 19 domain (position 180-235 a.a. encoded by E4) and a TNF homology domain (position 245-391 a.a. encoded by E6, E7, and E8) [39,47]. The EDA gene is shown to illustrate the congregation of identified mutations in E2, E4, and E7. Edar protein (448 a.a.) is a tumor necrosis factor receptor (TNFR) with two predicted domains: TNFR domain (30-148 a.a.) and death domain (358-431 a.a.). Edaradd protein (215 a.a.) is associated with tumor necrosis factor receptor with a death domain predicted at (123-202 a.a.) [48].
EDA mutations are known to cause two overlapping phenotypes: HED and nonsyndromic tooth agenesis (NTA). This variation in phenotype was suggestively attributed to residual Eda receptor binding ability of some NTA-causing EDA mutations versus complete abolishment in the case of HED-causing EDA mutations [41,49]. Such a suggestion was supported by phenotypic rescue of hair and salivary glands at lower Eda doses than that required for the rescue of the teeth in canine models [50]. Adversely, the same EDA mutation can cause either HED or NTA; for example, we identified c.865C > T (p.R289C) in TNF binding domain of Eda in a 2-year-old hidrotic ED case with sparse hair and eruption of few peg shaped teeth (ED23); this EDA mutation was previously reported in isolated oligodontia cases only, thus, the involvement of other genetic factors could be suggested [41,49,[51][52][53].
Other EDA mutations affecting the TNF domain of Eda are the previously reported c.871G > A (p.G291R) and the novel EDA c.492delA (p.G165Efs*115) frameshift mutation. The c.871G > A (p.G291R) mutation was exclusively reported to cause cardinal HED phenotype, which was the same case in ED24 and ED25, harboring this mutation [22,39,54]. The novel EDA c.492delA (p.G165Efs*115) was identified in two heterozygous sisters (ED14 and ED15); ED14 had HED with dysplastic nails, unlike ED15, who had hidrotic ED with normal nails. Both sisters had peg shaped teeth with delayed teeth eruption, suggesting tooth agenesis, but ED14 had a more severe dental phenotype. Of interest, the mother of ED14 and ED15 was a heterozygous carrier of c.871G > A (p.G291R) mutation, presenting only with microdontia of the upper lateral incisor. This variable expressivity and incomplete penetrance of heterozygous females has been reported before and supports the conclusion of the presence of other genetic factors influencing the HED phenotype [6,22].
The collagenous domain of Eda provides a critical role for protein function, particularly in the oligomerization of TNF domain of Eda into higher assemblies for further downstream Eda-Edar ligand receptor binding [47]. Three novel EDA mutations, c.602G > A (p.G201E) of ED16 and ED17, 620G > A (p.G207E) of ED18, and c.628G > A (p.G210R) of ED19 and ED20, affected the Eda collagenous domain, Figure 4, and were predicted as pathogenic/disease causing/deleterious by eight different in silico algorithms, Table 4. The involvement of nails was observed in the two siblings ED19 and ED20, which questions the contribution of their EDA c.628G > A (p.G210R) mutation;a however, a larger number of patients would be required for affirmation. The previously reported EDA splicing mutation (c.707-2A > T) in intron 5 was predicted to alter gene splicing via abolishment of the canonical (AG) splice acceptor site, which results in either exon skipping or the use of an alternative splice site. The EDA c.707-2A > T has been associated with severe HED as that observed in ED22 [40]. An 18 bp in-frame deletion (c.659_676delCAGGTCCTCCTGGTCCTC, p.P220_P225del) was identified in ED21, which deleted five a.a. and interrupted (G-X-Y) [13][14][15] repeats of the Eda's 19 collagen repeats. This deletion was suggestively attributed to unequal crossing over owing to the flanking GC-rich regions or an error in replication [39,42]. A similar proximal 18 bp deletion c.663-680delTCCTCCTGGTCCTCAAGG (p.P222_G227) was reported in an Egyptian patient with NTA. The overlapping phenotype of HED and NTA caused by the same EDA mutation led to the suggestion that both phenotypes represent the same disease with variable severities [23].
We showed that pedigree analyses can be misleading due to variable expressivity of heterozygous EDA carrier mothers, where some did not show any ED related manifestations, and the occurrence of de novo events, e.g., in ED21 and ED24. In these cases, XL inheritance was excluded, but later, XLHED was molecularly diagnosed. Consequently, EDA should be screened in all ED patients, whatever their prospected pedigrees. Of interest, we identified a novel frameshift EDAR (c.204delC, p.Y69Tfs*34) mutation in ED28 inherited in an autosomal recessive manner. Both carrier parents of ED28 had mild hypodontia, suggesting that carriers of only one affected EDAR allele may show mild phenotype, similar to female carriers of XL EDA mutations [55]. The frameshift EDAR (c.204delC, p.Y69Tfs*34) mutation is predicted to produce mRNA subjected to non-sense mediated decay (NMD). Even if a putative Edar truncated protein (102 a.a.) is produced, it would lack the crucial TNF receptor and death domain of wildtype Edar (448 a.a.), Figure 4 [48]. Two novel EDARADD, (c.85G > A, p.E29K) in ED26 and (c.570C > A, p.D190E) mutation in ED27, were identified and predicted to be pathogenic by different in silico algorithms. The structure of 215 a.a. Edaradd remains to be elucidated; nonetheless, the Uniprot entry of Edaradd showed that the (c.570C > A, p.D190E) mutation is located in the death domain (123-202 a.a.) of Edaradd, while (c.85G > A, p.E29K) is located in a region of charged residues (14-29 a.a.), Figure 4 [48]. Many of the EDARADD mutations reported to date are missense mutations affecting Edaradd's death domain, thus, functional consequences of (c.85G > A, p.E29K) would be of interest in terms of the possible role of Edaradd (14-29 a.a.) region on protein structure and function.

Conclusions
We presented the largest cohort to report on ED in North Africa, expanding the molecular spectrum by presenting seven novel ED mutations that were predicted to target existing and unidentified domains of their corresponding proteins. We recommend the inclusion of EDA in targeted sequencing of ED patients, even if XL inheritance is not suggested by pedigree analysis. Our study emphasized the genetic heterogeneity of ED, variable expressivity, and incomplete penetrance. Intrafamilial phenotypic variability was evident, which affects genetic counseling, particularly when prenatal decisions are concerned. Whole exome sequencing would benefit molecularly unidentified cases and provide an insight for future research into potential key player proteins in ED pathogenesis.