Exploring the Genetic Causality of Discordant Phenotypes in Familial Apparently Balanced Translocation Cases Using Whole Exome Sequencing

Familial apparently balanced translocations (ABTs) are usually not associated with a phenotype; however, rarely, ABTs segregate with discordant phenotypes in family members carrying identical rearrangements. The current study was a follow-up investigation of four familial ABTs, where whole exome sequencing (WES) was implemented as a diagnostic tool to identify the underlying genetic aetiology of the patients’ phenotypes. Data were analysed using an in-house bioinformatics pipeline alongside VarSome Clinical. WES findings were validated with Sanger sequencing, while the impact of splicing and missense variants was assessed by reverse-transcription PCR and in silico tools, respectively. Novel candidate variants were identified in three families. In family 1, it was shown that the de novo pathogenic STXBP1 variant (NM_003165.6:c.1110+2T>G) affected splicing and segregated with the patient’s phenotype. In family 2, a likely pathogenic TUBA1A variant (NM_006009.4:c.875C>T, NP_006000.2:p.(Thr292Ile)) could explain the patient’s symptoms. In family 3, an SCN1A variant of uncertain significance (NM_006920.6:c.5060A>G, NP_008851.3:p.(Glu1687Gly)) required additional evidence to sufficiently support causality. This first report of WES application in familial ABT carriers with discordant phenotypes supported our previous findings describing such rearrangements as coincidental. Thus, WES can be recommended as a complementary test to find the monogenic cause of aberrant phenotypes in familial ABT carriers.


Introduction
The great majority of apparently balanced translocation (ABT) carriers are phenotypically normal; however, they are at risk of experiencing infertility, recurrent miscarriages, and stillbirths, or having affected offspring due to meiotic malsegregation of the normal and derivative chromosomes. Subsequently, this leads to the generation of unbalanced gametes and zygotes with partial aneuploidy of the translocated chromosomes [1,2]. In the case of prenatally detected de novo ABTs, an associated phenotypic risk of~6.7% was initially estimated [3]; this was later revised to a morbidity risk of 27%, according to a long-term follow-up study [4]. In contrast, in couples/families carrying simple or complex ABTs, a generic recurrence risk estimation for unfavourable pregnancy outcomes is challenging, as rearrangements can be individually rare or even unique in each carrier/family [2,5].
A reanalysis was also performed using the VarSome Clinical platform (version 11.0.0) [44]. Reads, in FASTQ format, were loaded on the platform and were aligned to the human reference genome (hg19) using Sentieon (bwa-mem) aligner, while variant calling for SNVs and small indels was performed using Sentieon DNAscope caller. Variants were classified as pathogenic, likely pathogenic, variant of uncertain significance (VUS), likely benign, and benign, according to the American College of Medical Genetics and Genomics (ACMG) guidelines [45,46], the Association for Clinical Genomic Science (ACGS) recommendations [47], and assisted by VarSome's automated ACMG classifier (version 11.4.3). Variants were filtered according to the suspected inheritance pattern (algorithmic filtering), while only variants with a gnomAD frequency lower than 1% and 5% in de novo and recessive analyses, respectively, variants overlapping coding or splicing regions, and variants with coverage >8X were kept for further analysis. Furthermore, custom gene lists were created using patient phenotypes in HPO terms (Table 1), to retain disease-associated gene variants in each family. Additional and updated annotation available within VarSome Clinical, assisted the interpretation and selection of candidate variants for validation. Moreover, a CNV analysis was performed using the ExomeDepth tool [43] available within VarSome Clinical.

Candidate Variant Validation
All candidate variants were validated by conventional polymerase chain reaction (PCR) using Taq DNA Polymerase (QIAGEN, Hilden, Germany), followed by purification using the ExoSAP-IT™ PCR Product Cleanup Reagent (Applied Biosystems, Thermo Fisher Scientific, Waltham, MA, USA), bidirectional Sanger sequencing using the BigDye™ Terminator v1.1 Cycle Sequencing Kit (Applied Biosystems, Thermo Fisher Scientific), clean-up of the cycle sequencing reactions using Performa ® DTR Gel Filtration Cartridges (EdgeBio, Gaithersburg, MD, USA), and capillary electrophoresis on a 3130xl Genetic Analyzer (Applied Biosystems, Thermo Fisher Scientific). Sanger sequencing data was analysed using the ABI Sequencing Analysis Software v5.4 (Applied Biosystems, Thermo Fisher Scientific). The PCR primers (Metabion, Planegg, Germany), flanking the candidate WES variants, were designed using the Primer 3 web tool, version 4.1.0 [48] (Table S3). The specific PCR and Sanger sequencing reaction volumes and conditions used are available upon request.

In Silico Analysis of WES Variants
In silico analysis to assess the impact of SNVs and indels, identified by WES, was performed using VarSome Clinical. The predicted pathogenicity of candidate variants was determined based on the combined evidence from multiple in silico predictors using the BayesDel_addAF meta-score. This ranged from −1.29334 to 0.75731; the higher the score, the more likely the variant was pathogenic. Apart from the automatic predictions by VarSome Clinical, the impact of candidate missense variants on protein structure and function was additionally assessed using Missense3D [49] and/or HOPE [50].

RNA Extraction
In order to further investigate a candidate splice-site variant identified by WES, RNA was extracted from Epstein-Barr virus transformed lymphoblastoid cell lines of the patient and a control sample using the RNeasy Midi Kit (QIAGEN), following the manufacturer's recommendations. On-column DNase digestion was performed using the RNase-Free DNase Set (QIAGEN) to ensure the removal of any residual DNA. The concentration and purity of the extracted RNA samples was measured using a NanoDrop™ 1000 Spectrophotometer (Thermo Fisher Scientific).

Reverse-Transcription PCR
DNase-treated RNA samples (1µg) were reverse transcribed to complementary DNA (cDNA) using the Protoscript ® First Strand cDNA Synthesis Kit (New England Biolabs, Ipswich, MA, USA) and following the manufacturer's protocol. cDNA samples (1µg) from the patient and non-affected father (control sample), as well as the respective RNA samples not reverse transcribed (RT-ve) and a no-template (water) sample, were PCR amplified using HotStarTaq Plus DNA polymerase (QIAGEN) and custom reverse-transcription PCR (RT-PCR) primers (Metabion) ( Table S4). The integrity of the cDNA samples was tested by amplification of a 215 bp fragment from the β-actin gene (ACTB) ( Table S4). All RT-PCR products were loaded on a 2% agarose gel, which was pre-stained with GelRed ® (1× final concentration) (Biotium, Fremont, CA, USA) and run at 120V for~1 h. Selected RT-PCR products of interest were purified and further processed with bidirectional Sanger sequencing as described above. The specific RT-PCR reaction volumes and conditions used are available upon request.

Gene and Variant Nomenclature
Genes were described according to the HUGO Gene Nomenclature Committee (HGNC) guidelines [51]. All candidate variants were described according to the latest Human Genome Variation Society (HGVS) guidelines [52]. All genomic coordinates in the present article were based on the human reference genome hg19.

Results
Apparently balanced translocations were previously identified in all four families, having both affected and non-affected individuals [12]. After extensive workup, the genetic aetiology of the affected individuals remained unknown; therefore, WES was employed next to potentially unravel unidentified patient-specific candidate variants, in addition to the common ABTs in each family, and make genotype-phenotype correlations.
Depending on the availability of parental DNA samples, WES was performed in two patient-parent trios (families 1 and 4), one trio including the affected patient, nonaffected mother, and non-affected sibling (family 3), and one duo including the affected patient and non-affected sibling (family 2). The in silico trio check (Document S1), applied for families 1 and 4, confirmed the child-parental relationships. Overall, the WES run was successful, with 90.06% of clusters passing filter, and a high base call accuracy, with 80.13% of bases having a Q score of 30. The mean target sequencing depth of the samples was~75×, with more than 81% of target regions having at least 20× coverage, while the mean uniformity was 90.3%. Validated candidate variants, commonly identified by both the in-house analysis pipeline and the most recently annotated VarSome Clinical, are presented in Table 2 and described further below for each family.

Family 1
WES trio data analysis revealed a heterozygous splice donor variant NM_003165.6:c.1110+2T>G mapping to intron 13/19 of the syntaxin-binding protein 1 (STXBP1) gene (OMIM *602926) in the patient (sample 1A) ( Figure 1A; Table 2). The specific splice donor variant was classified as pathogenic since it overlapped a gene for which loss-of-function is a known mechanism of disease. It was also absent from the gnomAD, VarSome Clinical and ClinVar databases, as well as our local database, thus making it a novel finding. In addition, multiple in silico tools used within VarSome Clinical predicted a damaging/pathogenic effect (BayesDel_addAF meta-score = 0.625).
Sanger sequencing confirmed that the identified splice donor variant occurred as a de novo event in the patient ( Figure 1A). The impact of the NM_003165.6:c.1110+2T>G variant on STXBP1 pre-mRNA splicing was also demonstrated using RT-PCR. Specifically, cDNA amplification using custom primers (STXBP1-11/12F and STXBP1-15R) flanking the splicing variant ( Figure 1B) resulted in a single band in both the patient and the nonaffected father (control) ( Figure 1C). Bidirectional sequencing confirmed that both PCR products correspond to the wild-type STXBP1 allele. However, cDNA amplification using STXBP1-mut13F, a primer specifically amplifying the mutated allele ( Figure 1B), resulted in a PCR product only in the patient ( Figure 1C), indicating the possible retention of intronic sequences into the mature STXBP1 messenger RNA (mRNA). Indeed, subsequent sequencing of this product confirmed the addition of twenty-two intron 13 nucleotides in the STXBP1 mRNA, followed by exon 14 sequences ( Figure 1D). At the protein level, the splice donor variant was predicted to result in the insertion of fifteen new amino acids at position 371 of the STXBP1 protein followed by a premature termination codon (PTC) NP_003156.1:p.(Asp371GlyfsTer16) ( Figure 1E).

Family 2
WES duo data analysis revealed a heterozygous patient-specific missense variant NM_006009.4:c.875C>T in exon 4/4 of the Tubulin α 1A (TUBA1A) gene (OMIM *602529) ( Figure S2A; Table 2). This novel finding was classified as likely pathogenic, since it was not recorded in the gnomAD, VarSome Clinical, ClinVar or our local databases. Furthermore, it overlapped a mutational hot-spot, while multiple in silico tools used within VarSome Clinical predicted a damaging/pathogenic effect (BayesDel_addAF meta-score = 0.461).   Sanger sequencing confirmed that the affected sibling (sample 2A) was a heterozygous carrier of the mutant allele, while the non-affected sibling (sample 2B) was homozygous for the reference allele ( Figure S2A); however, possible inheritance of this variant could not be established, as parental genomic samples were not available for testing. At the protein level, the identified missense variant was predicted to change the highly-conserved neutral amino acid threonine to the hydrophobic amino acid isoleucine at position 292 of the TUBA1A protein (NP_006000.2:p.(Thr292Ile)). Furthermore, a structural alteration was detected when the impact of the NP_006000.2:p.(Thr292Ile) variant was assessed with Missense3D and HOPE ( Figure S2B). Specifically, the mutant isoleucine residue affects hydrogen bond formation between the wild-type threonine residue and other amino acids in the core of the TUBA1A protein, thus disturbing correct folding.

Family 3
WES multi-sample data analysis was initially focused on homozygous recessive variants in the patient (sample 3A), because of the reported consanguinity in the family (parents are first cousins); however, no significant findings were detected. Screening for heterozygous dominant variants in the patient revealed a heterozygous missense variant NM_006920.6:c.5060A>G in exon 29/29 of the Sodium Channel, Neuronal type I, α subunit (SCN1A) gene (OMIM *182389) ( Figure S3A; Table 2). This was classified as VUS, as it was absent from publicly available databases and our local database, thus making it a novel finding. In addition, the identified SCN1A missense variant overlapped a mutational hot-spot, while a number of in silico tools used within VarSome Clinical predicted a damaging/pathogenic effect (BayesDel_addAF meta-score = 0.298).
Sanger sequencing confirmed the heterozygous SCN1A variant in the affected proband (sample 3A), while the non-affected mother (sample 3B) and male sibling (sample 3C) were both homozygous for the reference allele ( Figure S3B). However, as paternal genomic material was not available for testing, it was not confirmed yet whether this was a de novo variant. At the protein level, the identified SCN1A variant was predicted to change the hydrophilic Glutamic acid to the neutral amino acid Glycine at position 1687 of the SCN1A protein (NP_008851.3:p.(Glu1687Gly)) ( Figure S3C). This could result in loss of hydrogen bonds and/or disturb correct protein folding.

Family 4
WES trio data analysis, with our in-house pipelines and the most recently annotated VarSome Clinical, revealed no pathogenic/likely-pathogenic SNVs or CNVs in the proband of family 4 that could explain the presented polysyndactyly and oral phenotypes.

Discussion
Based on our previous findings, translocations in the four ABT families included in the current study were considered coincidental and unrelated with the observed phenotypes [12]. As a next step, WES was implemented as a diagnostic tool to identify the underlying genetic aetiology of the patients' clinical phenotypes. Patient-specific candidate variants, which were missed by low-coverage WG-MPS and other previous analyses, were identified in three out of four families. These were discussed in further detail, as seen below.
In the present study, RT-PCR results support that the novel de novo splice donor variant NM_003165.6:c.1110+2T>G affects STXBP1 pre-mRNA splicing in the patient. Intron inclusion in the mature mRNA was predicted to result in the introduction of a PTC in exon 14, more than 50 to 55 nucleotides upstream from the last exon-exon junction, and was therefore targeted for nonsense mediated decay [65]. The truncated STXBP1 protein was predicted to be dysfunctional, as NP_003156.1:p.(Asp371GlyfsTer16) disrupts STXBP1 domain 3a which, together with domain 1, forms the central cavity of STXBP1 for syntaxin binding, an important step for neurotransmitter release in the synaptic cleft [66]. Overall, the identified de novo STXBP1 splice donor variant segregated with the patient's phenotype based on the biological function of the gene, previous reports of STXBP1 splicing variants in similar patients, and supporting RT-PCR results from the present study, thus expanding the mutational spectrum of the STXBP1 gene.
In family 2, both siblings were carriers of a familial ABT, t(7;8)(q32;q24.13), which apparently was coincidental and did not contribute to phenotype development [12]. Subsequent WES analysis revealed a novel heterozygous likely pathogenic TUBA1A missense variant in the affected sibling, presenting with severe intellectual disability and microcephaly, which was absent from the non-affected sibling. The TUBA1A gene, highly expressed in brain, encodes for tubulin α 1A, which together with β tubulin, constitute the main components of microtubules as a heterodimer complex [67]. Microtubules play an important role in many cellular processes, including neuronal maturation and migration; when these are impaired, a wide spectrum of brain malformations, known as tubulinopathies, develop [55]. Heterozygous missense variants in the TUBA1A gene have been reported in patients with lissencephaly 3 (OMIM # 611603) [68][69][70]; a cortical malformation caused by neuronal migration defects and characterized by smooth brain surface. Associated clinical phenotypes include microcephaly and severe-to-profound intellectual disability, resembling those seen in our patient, as well as hypotonia, which was not present in our patient. The vast majority of pathogenic or likely pathogenic TUBA1A gene variants reported so far have been de novo heterozygous missense variants [71], with very few cases having TUBA1A nonsense variants, frameshifts, or deletions, indicating that such variants could have higher functional impacts, leading more severe phenotypes or even lethality. Dominant-negative disruption of microtubule formation rather than haploinsufficiency was thus suggested as the mechanism underlying TUBA1A variants [70,72]. Even though the de novo occurrence of the identified TUBA1A variant NM_006009.4:c.875C>T could not be determined and lissencephaly could not be confirmed in our patient due to the unavailability of an MRI scan, there was still enough evidence suggesting that it could be causative for the patient's phenotype. Future TUBA1A expression and/or functional analyses will further support such positive associations.
In family 3, WES analysis in the affected proband, non-affected mother, and nonaffected male sibling, all sharing a common t(4;10)(q35;q11.2) rearrangement [12], identified a novel heterozygous patient-specific SCN1A missense variant of uncertain significance. The SCN1A gene encodes the neuronal sodium voltage-gated channel α subunit 1 (Nav1.1) that is composed of four homologous six-transmembrane segments [73]. SCN1A, together with two β subunits, forms the complete voltage-gated sodium channel, which is important for action potential generation and propagation in neurons [73]. Heterozygous SCN1A SNVs and CNVs have been previously reported in patients with intellectual disability, developmental delay, and a spectrum of epileptic encephalopathies of varied severity [74][75][76], ranging from the milder form of generalized epilepsy with febrile seizures plus type 2 (OMIM # 604403)-to-severe myoclonic epilepsy of infancy (or Dravet Syndrome) (OMIM # 607208) [77,78]. According to the referring doctor, patient 3A was 43 years old at the time of study and did not have epileptic episodes. Clinical severity variability in individuals with SCN1A disruptions depends, not only on the type, but also the location of the specific variants [79,80]. In addition, it has been suggested that mosaicism [81], as well as other environmental and/or genetic factors, such as variants in modifier genes, could influence clinical presentation and thus explain variable phenotypes in individuals with de novo or inherited SCN1A variants [75,82]. It was, therefore, extremely important to investigate the functional impact and mechanism of each identified SCN1A variant to explain the severity of clinical manifestations [83]. Due to consanguinity in the family, identification of a recessive variant would have been expected in the patient; however, no such candidate variant was revealed that could explain the presented phenotype. At this stage, the patientspecific SCN1A variant seemed to be the most promising candidate; however, a strong causative role could not yet be established for this VUS. Revisiting the medical history of the patient and complementary functional studies will be required in order to strengthen such associations. Alternatively, careful reevaluation of the family with other methods, as suggested further below, could be done to possibly identify non-coding SNVs, variants within repetitive regions, or cryptic rearrangements that could have been missed by WES.
The patient in family 4 presented with polysyndactyly and oral anomalies which, upon clinical evaluation, were indicative for an orofaciodigital syndrome (OFDS). OFDS are a group of highly heterogeneous ciliopathy disorders characterized predominantly by abnormalities in the mouth, face, and digits [84]. There are at least fourteen OFDS subtypes, with overlapping clinical manifestations, as well as two rare unclassified ones [85]. The majority of known causal genes for OFDS encode proteins mapping in cilia compartments and regulating ciliogenesis [85]; however, novel genes and variants are continuously discovered with the use of NGS-based methods [86,87]. Furthermore, OFDS genes and clinical features are also implicated in other ciliopathies, such as Joubert syndrome and Bardet-Biedl syndrome [85,86]. In family 4, WES data was filtered to retain variant disrupting genes involved in OFDS, Joubert syndrome and polysyndactyly to cover potential differential diagnoses. However, no candidate variants could be identified, from the inhouse bioinformatics pipeline or VarSome Clinical analysis, that could explain the patient's polysyndactyly and oral anomalies. Therefore, detailed clinical examination of the patient, together with careful reevaluation of the family's WES data, and further investigation using alternative NGS-based methods, as described below, will be necessary to potentially resolve this case.
Most of the disease-causing variants reside in the exome, thus making WES a popular sequencing approach in clinical diagnostics and human genetics research [88,89]. While WES has significant diagnostic potential, it also has certain limitations that could hinder such insightful discoveries [90]. First of all, WES has a limited ability to identify clinicallyrelevant variants outside the targeted capture region; thus, high-coverage whole genome sequencing, assessing both coding and non-coding regions, should be considered to potentially increase the diagnostic yield [91,92]. Furthermore, WES does not adequately cover the entire exome; regions with high GC content are particularly difficult to capture and sequences can result in few or no mappable reads. Meanwhile, short-reads in highly-repetitive regions fail to align unambiguously, leading to low mapping quality issues [93]. Consequently, clinically-relevant gene variants (e.g., repeat expansions) and SVs overlapping such regions could be easily missed using short-read sequencing technologies. Therefore, long-read sequencing approaches offering a more uniform genome-wide coverage, such as nanopore sequencing by Oxford Nanopore Technologies and single molecule real-time sequencing by PacBio, should be considered as alternative approaches [94,95]. Finally, unresolved ABT families with discordant phenotypes could also be studied with Bionano Optical Genome Mapping, a next-generation cytogenomics tool that could potentially detect patient-specific cryptic CNVs and complexities missed by other methods [96].

Conclusions
The current study presented a follow-up investigation of four familial ABTs, where WES was implemented as a diagnostic tool to identify the underlying genetic aetiology of these patients' clinical phenotypes. Novel candidate variants were identified in three out of four families. Regarding the pathogenic STXBP1 splicing variant and likely pathogenic TUBA1A missense variant identified in Families 1 and 2, respectively, there was strong evidence to support the underlying genetic cause of the patients' phenotype. The SCN1A missense variant identified in family 3 was of uncertain significance and will require additional clinical and molecular evidence to sufficiently support causality. Based on our findings, WES can be recommended as a complementary test in familial ABT cases where cryptic complexity has been excluded. We anticipate that careful reevaluation of unsolved families, along with investigation of additional ABT families with discordant phenotypes, will further support our findings and help to decipher the underlying genetic mechanisms-as well as provide more precise phenotypic risk estimations and better genetic counselling.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/genes14010082/s1, Document S1: In-house Bioinformatic Exome Analysis Pipeline and Supplementary Figures; Figure S1: In-house bioinformatics pipeline for variant calling of an exome sequencing paired-end experiment using the Illumina NextSeq500 platform; Figure S2: Family 2 candidate variant investigation. (a) Top panel: IGV screenshot depicting the patient-specific TUBA1A missense variant identified by WES. Lower panel: Sanger sequencing confirmed that the affected sibling (sample 2A) was heterozygous for the mutated allele (red arrow), while the non-affected sibling (sample 2B) was homozygous for the reference allele; (b) Top panel: Schematic structures of the wild-type threonine (left) and the mutant isoleucine (right) amino acids. The backbone, which is the same for each amino acid, is coloured red. The side chain, unique for each amino acid, is coloured black. Lower panel: Structural analysis of the NP_006000.2:p.(Thr292Ile) variant (Protein Data Bank code: 5JCO); the mutant isoleucine (right) affects hydrogen bonds formed by the wild-type threonine (blue), thus disturbing correct protein folding.; Figure S3: Family 3 candidate variant investigation. (a) IGV screenshot depicting the patient-specific SCN1A candidate variant identified by WES; (b) Sanger sequencing confirmed that the proband of family 3 (sample 3A) is heterozygous for the mutated allele (red arrow), while the non-affected mother (sample 3B) and male sibling (sample 3C) are both homozygous for the reference allele; (c) Schematic structures of the wild-type glutamic acid (left) and the mutant glycine (right) amino acids. The backbone, which is the same for each amino acid, is coloured red. The side chain, unique for each amino acid, is coloured black.; Table S1: List of genes included in the Greenwood Genetic Center X-linked intellectual disability gene panel used for WES data filtering in ABT families 1, 2, and 3; Table S2: List of genes associated with polysyndactyly/synpolydactyly, orofaciodigital syndrome, and Joubert syndrome used for WES data filtering in ABT family 4.; Table S3: List of PCR primers used to validate candidate WES variants in each ABT family.; Table S4: List of Reverse-Transcription PCR primers used to investigate the underlying mechanism of the identified STXBP1 splice donor variant in family 1.
Funding: This research was funded by Telethon Cyprus and Norway Grants through the Directorate General for European Programmes, Coordination and Development of the Republic of Cyprus. WES data analysis was supported by the Cy-Tera Project (NEA ΥΠO∆OMH/ΣTPATH/0308/31), which is co-funded by the European Regional Development Fund and the Republic of Cyprus through the Cyprus Research and Innovation Foundation.

Institutional Review Board Statement:
The study was conducted in accordance with the Declaration of Helsinki, and approved by the Cyprus National Bioethics Committee as part of the Translation Facility Application (EEBK/EII/2-13/09).

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: All sequence variants presented in this study are available from the ClinVar database (https://www.ncbi.nlm.nih.gov/clinvar) with accession numbers SCV002757998-SCV002758000. The remaining data generated or analysed during this study are included within the published article and its supplementary files.