Genetic Spectrum of Syndromic and Non-Syndromic Hearing Loss in Pakistani Families

The current molecular genetic diagnostic rates for hereditary hearing loss (HL) vary considerably according to the population background. Pakistan and other countries with high rates of consanguineous marriages have served as a unique resource for studying rare and novel forms of recessive HL. A combined exome sequencing, bioinformatics analysis, and gene mapping approach for 21 consanguineous Pakistani families revealed 13 pathogenic or likely pathogenic variants in the genes GJB2, MYO7A, FGF3, CDC14A, SLITRK6, CDH23, and MYO15A, with an overall resolve rate of 61.9%. GJB2 and MYO7A were the most frequently involved genes in this cohort. All the identified variants were either homozygous or compound heterozygous, with two of them not previously described in the literature (15.4%). Overall, seven missense variants (53.8%), three nonsense variants (23.1%), two frameshift variants (15.4%), and one splice-site variant (7.7%) were observed. Syndromic HL was identified in five (23.8%) of the 21 families studied. This study reflects the extreme genetic heterogeneity observed in HL and expands the spectrum of variants in deafness-associated genes.


Introduction
In parts of the world where consanguinity is prevalent, it is not uncommon to see a high prevalence of genetic diseases. The consanguineous marriage rates in Pakistan are among the highest worldwide [1]. Approximately 60% of marriages in Pakistan are consanguineous, with roughly 80% of these marriages being between first cousins [2]. The prevalence of autosomal recessive diseases associated with a monogenic background, such as profound hearing loss (HL), is high in countries where consanguineous marriages are common [3].
Hereditary HL is one of the most prevalent sensory disorders that affects 1 to 2 per 1000 live births worldwide. Genetic factors are responsible for over half of all HL [4]. Studies describing genetic variants in Pakistani families with HL show evidence of the extreme clinical and genetic heterogeneity of this sensory disease and support the importance of investigating and characterizing families from this region of the world [5][6][7]. Over 120 genes have been identified as causing non-syndromic hearing loss (NSHL), which comprises approximately 70% of all forms of hereditary HL (http://hereditaryhearingloss.org). Autosomal recessive HL (ARHL) is the most commonly observed inheritance pattern. There are presently over 600 syndromic forms of deafness [8], which appear in approximately 30% of patients with genetic HL. Many of these deafness syndromes mimic non-syndromic deafness at onset [9]. Hearing impairment profoundly complicates speech and language development in prelingual children and can negatively impact education and employment prospects [10].
Exome sequencing (ES) allows for the parallel sequencing of all coding regions of the human genome and has accelerated the process of identifying causally associated variants in patients with HL [11]. In 13 consanguineous families with diverse forms of HL, we identified 13 variants in 7 HL-associated genes using ES and gene mapping approaches in 21 Pakistani families. Two of the 13 variants were not previously described in the literature. The present study underscores the importance of genetically characterizing consanguineous families with HL to expand the spectrum of clinically relevant variants in genetically diverse populations, thus improving our understanding of the alleles involved in ARHL and enhancing genetic counseling.

Genotyping and Quality Control
The Illumina Infinium HumanCore-24 v1.0 Bead Chip array (Illumina, Inc., San Diego, CA, USA) was used for genotyping. From the 306,670 markers on the array, we filtered out indels, MT-and Y-chromosomal SNPs, and variations without physical positions, resulting in 259,460 biallelic SNPs for quality control (QC) and linkage analysis. Data conversion to linkage format files and QC was managed with ALOHOMORA software [13]. The sex of individuals was estimated by counting heterozygous genotypes on the X-chromosome and compared to the upraised pedigree data. The relationships between family members were verified with the program GRR [14]. PedCheck [15] was used to detect Mendelian errors (ME) and SNPs with ME were removed from the data set. Unlikely genotypes-e.g., double recombinants-were identified with Merlin [16] and deleted in the individuals.

Linkage Analysis
Linkage analysis was performed with Merlin using an autosomal recessive mode of inheritance and complete penetrance. We assumed 0.001 as the mutant allele frequency. We executed Merlin twice, once with a full marker set of around 258,000 SNPs after QC. This calculation was used to obtain the best positions for recombination events. The second analysis was conducted with a reduced marker set (~119,000 SNPs), where a minimal distance of 10,000 bases between markers was used. This calculation, where the linkage disequilibrium (LD) between markers is reduced, identifies linkage peaks which were inflated by markers in LD. We removed the linkage regions where the LOD score broke down more than 0.3 in the LD-reduced analysis. In summary, we selected regions where the LOD score reached the maximal LOD score of a family and where the LOD score was stable in the less dense, LD-reduced marker set. Under the given inheritance model (recessive) and the pedigree structure with a consanguinity loop, this linkage analysis is called autozygosity mapping.

Exome Sequencing
Genomic DNA (gDNA) was extracted from peripheral blood lymphocytes using a standard phenol/chloroform [17] and ethanol precipitation [18]. A total of 50 ng of gDNA from the proband from each family was subjected to ES with the Nextera Rapid Capture Exome or the TruSeq Exome Enrichment kits (Illumina, Inc., San Diego, CA, USA) according to the manufacturer's protocol. An additional family member (IV.1) of family 5 was exome sequenced due to the presence of two distinct phenotypes in the family-namely, HL and a suspected bone disorder. A 2 × 76 bp paired-end read sequencing was performed using a v2 high-output reagent kit with the NextSeq500 sequencer (Illumina, Inc., San Diego, CA, USA). Raw bcl sequencing files were converted with the bcl2fastq software (Illumina, Inc., San Diego, CA, USA) and the data were aligned to the human reference genome GRCh37 [19] (hg19).

Variant Analysis and Prioritization
Single-nucleotide variants (SNVs) and small indels (<15 bp) were analyzed with the GensearchNGS software (PhenoSystems SA, Wallonia, Belgium). Analysis was supported using an established in-house bioinformatics pipeline based on the GATK toolkit including Burrows-Wheeler (BWA)-based read alignment, base quality score recalibration, indel realignment, duplicate removal, and SNP and indel discovery, with subsequent score recalibration according to the GATK Best Practice recommendations [19][20][21]. Variant filtering and prioritization were performed using a conservative minor allele frequency <0.01 based on population databases and an alternate allele frequency present at >20% referring to reads. Additional variants not removed by minor allele frequency filtering were subjected to an in-house allele count filter (n = 300) that removed variants appearing >2%, as these are too common in our exome dataset to enter manual analysis. Variant prioritization included the tools PolyPhen-2 (PP) [22], SIFT [23], MutationTaster (MT) [24], fathmm [25], LRT [26], and GERP [27]. The Deafness Variation Database [28] was also integrated into our pipeline to permit the quick assessment of variants in known deafness genes. Frequency-based filtering was performed according to a population-specific manner that includes The Greater Middle East Variome Project [29] and gnomAD [30] to account for varying allele frequencies across ethnicities. CNV analysis was performed for 19 out of 21 families using the eXome Hidden MarkovModel (XHMM, version 1.0) approach [31].

Variant Validation and Segregation Testing
The candidate variants remaining after filtering were amplified by PCR using primers designed from the Primer3 software [32]. The primer sequences are shown in Table S1. PCR products were bidirectionally sequenced with an ABI 3130xl 16-capillary sequencer (Life Technologies, Carlsbad, CA, USA). Sequence reactions were completed with 5× sequencing buffer and big dye terminator (Applied Biosystems, Waltham, MA, USA). DNA sequence analysis was performed using the Gensearch software (Phenosystems SA, Wallonia, Belgium).

Summary of Affected Genes and Genetic Context
Using ES and bioinformatics analysis, 13 different variants in seven HL-associated genes were identified, including two that have not been previously described in the literature (15.4%). In aggregate, these variants are likely causally associated in 13 out of 21 (61.9%) consanguineous Pakistani families. Pathogenic and likely pathogenic variants were identified in the genes GJB2, MYO7A, FGF3, CDC14A, SLITRK6, CDH23, and MYO15A (Table 1). GJB2 and MYO7A were implicated in the genetic diagnosis of almost half of all cases (46.2%). Among the different variant types observed, seven were missense (53.8%), three were nonsense (23.1%), two were frameshift (15.4%), and one was a splice-site variant (7.7%). All the variants were either homozygous or compound heterozygous, showed an autosomal recessive inheritance pattern, and were validated by segregation testing (Figure 1, Figure S1).    Probands who were exome sequenced are marked with an arrow. Deceased individuals are marked with a diagonal line. The mutated and wild type alleles are illustrated with "−" (mutated) and "+" (wild type) symbols, respectively.

Clinical Features and Genetic Spectrum of Patients with Syndromic HL
FGF3, MYO7A and SLITRK6 All the affected individuals were clinically diagnosed with congenital, bilateral HL and have a consanguineous background. Five of 21 families (23.8%) revealed a syndromic form of HL. Two of the 21 families were clinically diagnosed with Usher syndrome, one of the most common forms of syndromic HL [8].
The affected individuals who were available for testing in families 1 and 2 reported severe HL and cupped ears ( Figure 2A, Table 2 Figure S2) in families 1 and 2 that co-segregated with HL in both families (Figure 1). The variant is predicted to be disease causing (MT, PP, SIFT), involves the substitution of a conserved amino acid (aa), and was not previously published in the literature. The variant is classified as likely pathogenic according to the ClinGen HL working group expert specification [41]. Homozygous variants in FGF3 have been associated with deafness, accompanied by inner ear agenesis, microtia, and microdontia [42].
Genes 2020, 11, x FOR PEER REVIEW 8 of 18 IV.2) ( Figure 1) and was absent in population databases. The variant is classified as likely pathogenic according to the ClinGen HL working group expert specification [41]. Homozygous variants in this gene are known to cause sensorineural deafness and high myopia in humans [44].    [34] in MYO7A. Ophthalmological examination of the proband IV.3 revealed high myopia, cataract, and retinitis pigmentosa in both eyes. Visual acuity was reduced to 5/60 (Snellen equivalent, 20/250), with corrective lenses of −12.00/0/0 • in both eyes. Ophthalmological examination of the proband IV.5 revealed high myopia, cataract, and retinitis pigmentosa in both eyes. Visual acuity was reduced to light perception in the right eye, and 1/60 (Snellen equivalent, worse than 20/1000) in the left eye. Thus, the patient was legally blind. Cataract was more pronounced in the right eye. Ultrasonographic findings of the right eye were within normal limits. The aa substitution weakens the 5 donor splice-site predicted by several in silico prediction tools. A previously published minigene assay that was conducted using nasal epithelial cells proved the skipping of exon 5 in the mutant transcript, which likely results in a truncated protein [43] (Figure 1). Two individuals in family 5 (III.4, IV.1) suffer from a bone disorder but have normal hearing in contrast to the affected family members (IV.2, IV.3), who show a distinct Usher syndrome phenotype ( Table 2). The ophthalmological examination of proband IV.2 revealed hyperopia in both eyes (+9/0/0 • in the right eye, and +11/0/0 • in the left eye) and lenticular opacity (cataract) in both eyes. Retinitis pigmentosa was confirmed with indirect ophthalmoscopy with a 20 Diopter power lens. We identified a homozygous pathogenic missense c.3502C>T, p.(Arg1168Trp) variant [35] in the gene MYO7A in affected individuals (IV.2, IV.3) that segregated in family 5. An ES analysis of III.3 and IV.1 did not uncover any variants in genes associated with bone disorders or neuropathies. Both variants that have been identified in family 4 and family 5 are known to cause Usher syndrome type 1 [34,35].
The affected individuals in family 7 (IV.1, IV.2) reported severe-to-profound HL (Table 2, Figure 2B). Ophthalmological examination in IV.1 revealed compound myopic astigmatism and grossly cupped discs diagnosed as glaucoma in both eyes. The macula appeared normal in both eyes. Visual acuity was reduced to 0.5 logMAR in both eyes (Snellen equivalent: 20/63), with corrective lenses of −5.50/−2.50/90 • in the right eye and −6.50/−1.00/60 • in the left eye. Intraocular pressure was 17 mmHg in both eyes. Additionally, Duane retraction syndrome (a congenital neuromuscular dysfunction of the eye movement caused by a failure of the sixth cranial nerve) was diagnosed in the left eye. Other data (e.g., axial length, status of lens, macula/retina, visual field, treatment of glaucoma) were not available. Ophthalmological examination in IV.2 revealed compound myopic astigmatism and myopic alterations with macular degeneration in both eyes. Visual acuity was reduced to count fingers with corrective lenses of −5.00/−2.50/90 • in the right eye and −4.00/−3.00/90 • in the left eye. Intraocular pressure was 15 mmHg in both eyes. Thus, the patient was legally blind. Other data (e.g., axial length, status of lens, macula/retina, visual field, treatment of glaucoma) were not available. Family 7 revealed a segregating novel homozygous nonsense variant c.120_121insT, p.(Asp41*) in the gene SLITRK6 (NM_032229.2; Figure S2) that was present in both affected family members (IV.1, IV.2) ( Figure 1) and was absent in population databases. The variant is classified as likely pathogenic according to the ClinGen HL working group expert specification [41]. Homozygous variants in this gene are known to cause sensorineural deafness and high myopia in humans [44].
In aggregate, MYO7A (NM_000260.3; DFNA11, DFNB2, USH1B) was affected in three families, accounting for 23.1% of the overall diagnostic yield. Abbreviations: HL, hearing loss; y/o, years old available ages for affected individuals; 1 no DNA available for testing, only clinical photographs (Figure 2A) or ophthalmological examination; 2 two distinct phenotypes within the family.

Autosomal Recessive HL Loci
Genome-wide genotyping and autozygosity mapping that were performed for 13 Pakistani families revealed loci for autosomal recessive HL (hg19) ( Table 3).
The most significant linkage intervals in families 3 and 8 did not include the affected genes GJB2 and MYO7A. However, the longest interval in family 3 is located slightly outside of the GJB2 gene coordinates. Abbreviations: LOD, logarithm of the odds. * Family 3 revealed a homozygous interval outside but near the GJB2 gene locus.

Discussion
Geographically or culturally isolated populations that have high rates of consanguinity, such as the Pakistani families included in this study, have proven valuable for novel HL gene identification studies and for contributing to a greater understanding of the alleles implicated in HL [46][47][48]. Although our patient cohort was relatively small, the overall resolve rate in this study of 61.9% is nonetheless comparable to other studies with consanguineous families that have showed a resolve rate of 67% [49]. As expected, most of the families revealed variants that were homozygous (92.3%) and compound heterozygous (7.7%) and were primarily found in ARHL-associated genes.
We identified causative variants in seven HL-associated genes. Many of these variants were missense (53.8%, Figure 3B), which is consistent with the mutational characteristics of deafness genes [50]. Unlike previous studies investigating the genetic spectrum of hearing-impaired Pakistani patients that have described SLC26A4 as a frequent cause of HL in this population, causal variants in this gene were not present in our cohort. This is likely explained due to the restricted geographical region from which our families were recruited and the existence of prevalent founder variants in this gene, especially in the Pakistani population, that were absent in our relatively small cohort [51].  The homozygous c.2968G>A, p.(Asp990Asn) missense variant in CDH23 segregating in family 9 was identified as a recurrent variant in South Indian families with HL [57] and is known to cause ARNSHL [39]. The second pathogenic homozygous missense variant in CDH23 (c.4688T>C, p.(Leu1563Pro)), which is known to cause non-syndromic deafness, has been identified in family 13 [45].
We identified two unrelated families (1, 2) with the same c.166C>T, p.(Leu56Phe) variant in the gene FGF3. Recessive variants in FGF3 have been described in patients diagnosed with LAMM  [6,52,53]. Variants in the genes MYO7A and GJB2 accounted for a combined 46.2% of all diagnoses in the present study which is consistent with previously published rates from Pakistani cohort studies ( Figure 3A) [5][6][7]. We also identified the c.231G>A, p.(Trp77*) variant in one family that has been reported as one of the most common GJB2 alleles in Pakistani HL patients [54].
Interestingly, disease-causing variants in Pakistani patients with NSHL often involved genes that were also associated with syndromic HL, such as MYO7A [55] or CDH23 [39]. For example, patients with a diagnosis of Usher syndrome type 1B (MIM *276900) show profound congenital hearing impairment, retinitis pigmentosa, vestibular dysfunction, and biallelic causal variants in MYO7A [56]. In this study, pathogenic variants of MYO7A were identified in three families (4,5,8). Affected individuals in family 4 and family 5 were confirmed with Usher syndrome, which is characterized by severe auditory and ophthalmic symptoms. Both of the homozygous variants that have been identified in family 4 (c.470G>A, p.(Ser157Asn)) and 5 (c.3502C>T, p.(Arg1168Trp)) are known to cause Usher syndrome type 1 [34,35].
Family 8 revealed three different heterozygous variants in MYO7A, with all three of them exclusively present in both affected siblings (IV.1, IV.2), in whom Usher syndrome was excluded. Interestingly, two of the three identified variants (c.1258A>T, c.4505A>G) were previously described in a Pakistani family reporting ARNSHL [6]. Richard et al. [6] also described a third heterozygous variant (c.3502C>T) that differs from the third variant in the present study (c.1849T>C). In family 8, the two c.1258A>T and c.4505A>G variants were inherited from the maternal allele (III.4), and the c.1849T>C variant was an inferred paternally inherited allele, thus confirming compound heterozygosity in both affected patients (IV.1, IV.2) of c.1258A>T, p.(Lys420*) and c.1849T>C, p.(Ser617Pro) (Figure 1). The absent homozygous interval in a region that contains MYO7A supports a suspected compound heterozygosity for the variants in this family. It remains to be determined if the double mutated maternal allele is either a broadly segregating allele in the Pakistani population or if the two families are possibly distantly related.
The homozygous c.2968G>A, p.(Asp990Asn) missense variant in CDH23 segregating in family 9 was identified as a recurrent variant in South Indian families with HL [57] and is known to cause ARNSHL [39]. The second pathogenic homozygous missense variant in CDH23 (c.4688T>C, p.(Leu1563Pro)), which is known to cause non-syndromic deafness, has been identified in family 13 [45].
We identified two unrelated families (1, 2) with the same c.166C>T, p.(Leu56Phe) variant in the gene FGF3. Recessive variants in FGF3 have been described in patients diagnosed with LAMM syndrome, which is characterized by congenital deafness with labyrinthine aplasia (LA), microtia (M) and microdontia (M) (MIM *610706) [42]. The phenotypic characteristics in patients can vary from fully penetrant LAMM syndrome to milder forms with less severe syndromic features [58]. Probands from families 1 and 2 show HL and cupped ears (Figure 2A), which overlap with milder phenotypic characteristics such as minor dental and external ear phenotypes that were previously described in the literature. We cannot exclude an inner ear malformation in the affected individuals due to absent temporal bone CTs.
Three nonsense variants were found in the genes GJB2, SLITRK6, and MYO7A ( Figure 3). While the c.231G>A, p.(Trp77*) variant in GJB2 and the c.1258A>T, p.(Lys420*) variant in MYO7A were previously described as pathogenic, the nonsense variant c.120_121insT, p.(Asp41*) in SLITRK6, segregating in family 7, was novel. The effect of the induced stop-codon at amino acid position 41 out of 841 amino acids encoding SLITRK6 would truncate 95% of the protein and likely be targeted by nonsense-mediated mRNA decay (NMD) [59]. To date, only five variants are known to cause the associated autosomal recessive deafness and myopia syndrome [44,60,61] that is consistent with the phenotype in family 7. Myopia in deafness and myopia syndrome has been reported to range between −6 and −11 diopters [44,62]. Findings in both siblings (IV.1, IV.2) in family 7 and affected individuals (IV.3, IV.5) in family 4, who had undergone ophthalmological examination were consistent with high myopia. Additionally, IV.3 and IV.5 in family 4, and IV.2 in family 5 showed findings typical for retinitis pigmentosa. Interestingly, all the previously identified variants are also either nonsense or frameshift variants, suggesting loss-of-function as an underlying mechanism. Consistently, Slitrk6 knock-out mice showed distinct reduction in cochlear innervation and a defective auditory brainstem response [44].
Family 12 revealed a homozygous splice-site variant c.9518-2A>G in intron 57 of MYO15A [5]. This variant likely mediates the complete loss of the 3' acceptor site according to several in silico prediction tools, and possibly results in the skipping of exon 58. Variants that occur in important consensus sequences at the exon-intron boundaries, thus disrupting the actual splicing process, are known to be the cause of a variety of human diseases [63].

Conclusions
The present study of 21 Pakistani families identified two novel alleles causing HL and emphasizes the importance of investigating different populations and their heterogeneous genetic background. The fact that 38.1% of the 21 examined families are still considered unresolved highlights a possible area in which the further application of advanced sequencing and analysis methods could uncover currently unrecognized genetic changes due to technical limitations. Candidate genes have been identified in four families that are presently undergoing functional analysis. In families without candidate genes identified, genome sequencing would support uniform copy number variation analysis and the analysis of deep intronic or regulatory variants that are difficult to ascertain by ES. Some of the limitations of the study include potentially insufficient coverage of homopolymeric or GC-rich regions and the existence of mapping difficulties for regions containing pseudogenes or larger deletions, insertions, or structural rearrangements, as listed in previous ES studies, and may diagnose some of the unresolved families [64]. Nevertheless, an ES approach is still the method of choice for elucidating genetic variants in a large portion of heritable human disorders, including HL.