Yield of Rare Variants Detected by Targeted Next-Generation Sequencing in a Cohort of Romanian Index Patients with Hypertrophic Cardiomyopathy

Background: The aim of this study was to explore the rare variants in a cohort of Romanian index cases with hypertrophic cardiomyopathy (HCM). Methods: Forty-five unrelated probands with HCM were screened by targeted next generation sequencing (NGS) of 47 core and emerging genes connected with HCM. Results: We identified 95 variants with allele frequency < 0.1% in population databases. MYBPC3 and TTN had the largest number of rare variants (17 variants each). A definite genetic etiology was found in 6 probands (13.3%), while inconclusive results due to either known or novel variants were established in 31 cases (68.9%). All disease-causing variants were detected in sarcomeric genes (MYBPC3 and MYH7 with two cases each, and one case in TNNI3 and TPM1 respectively). Multiple variants were detected in 27 subjects (60%), but no proband carried more than one causal variant. Of note, almost half of the rare variants were novel. Conclusions: Herein we reported for the first time the rare variants identified in core and putative genes associated with HCM in a cohort of Romanian unrelated adult patients. The clinical significance of most detected variants is yet to be established, additional studies based on segregation analysis being required for definite classification.


Introduction
Hypertrophic cardiomyopathy (HCM) is the most common inherited cardiac illness, affecting at least 1 in 500 individuals in the general population [1,2]. It is defined by the presence of left ventricular hypertrophy (LVH) not solely explained by abnormal loading conditions [3].
Due to numerous genetic and non-genetic modifiers yet to be deciphered, clinical expression and outcomes are particularly diverse, varying from asymptomatic to severe forms or even sudden cardiac death [4]. The genetic basis is complex, mainly involving variation in sarcomeric genes, but mutations in other genes can cause mimicking pathologies with isolated HCM or with complex phenotypes comprising LVH [5]. The main causative genes are cardiac myosin binding protein C (MYBPC3) and β-myosin heavy chain (MYH7); together they account for approximately half of all HCM cases and for at least 75% of genotype-positive probands [6]. Among 57 candidate genes recently curated, these 2 genes along with 6 others (listed in bold letters in Table 1) have been designated as having definitive evidence for HCM and therefore should be part of clinical genetic testing [7,8].

Table 1. List of the 47 genes analyzed in our study and the number of rare variants (AF < 0.001) identified per gene (core sarcomeric genes are represented in bold letters). AF, allele frequency.

Gene     Chromosome   Protein                               Rare variants
...      ...          ... of sevenless homolog 1            2
TCAP     17           Telethonin                            1
TNNC1    3            Troponin C                            0
TNNI3    19           Troponin I                            1
TNNT2    1            Troponin T                            4
TPM1     15           Tropomyosin alpha-1 chain             2
TRIM63   1            E3 ubiquitin-protein ligase TRIM63    1
TTN      2            Titin                                 17
VCL      10           Vinculin                              1
Increased use of high-throughput sequencing techniques together with comprehensive gene panels has led to the detection of novel disease-causing variants, but has mainly increased the detection of variants of uncertain significance (VUS), which are difficult to interpret, particularly in the case of "private" mutations unique to a single family.
Notably, the underlying etiology may vary across populations; specifically, the probability of obtaining a positive result is influenced by the existence of preceding studies in the respective population [9]. Compared with the large body of data on the spectrum of HCM variants in Western and Northern Europe [10][11][12][13], information about the genetic basis of HCM in the Romanian adult population is limited; hence, we aimed to investigate the HCM-related rare variants in a cohort of Romanian index cases.

Study Population
The study was approved by the Ethics Committee of the Clinical Emergency Hospital of Bucharest and performed in compliance with the principles of the Declaration of Helsinki. Before enrolment, written informed consent was obtained from all subjects. The study population comprised 45 unrelated HCM probands referred to our center for standard medical care and/or genetic testing between 2017 and 2020. HCM was diagnosed according to the criteria issued by the European Society of Cardiology (ESC), namely increased left ventricular (LV) wall thickness (≥15 mm in adults) not solely explained by abnormal loading conditions [5]. All patients underwent a comprehensive clinical work-up, including personal and family medical history, physical examination, 12-lead electrocardiogram, two-dimensional transthoracic echocardiography, and genetic testing.

Genetic Testing
The genetic testing methodology has been previously reported [14]. Briefly, blood samples were collected at enrolment, and total DNA was isolated using the MagCore Genomic DNA Whole Blood Kit (RBC Bioscience) following the manufacturer's protocol and subsequently quantified using the Qubit dsDNA HS assay kit (Life Technologies). Targeted next generation sequencing (NGS) was performed on an Illumina MiSeq platform using the TruSight Cardio Sequencing Kit (Illumina) according to the manufacturer's instructions. An initial amount of 50 ng of genomic DNA was used for optimal gene enrichment.

Variant Assessment
Data files yielded during sequencing runs were processed by MiSeq Reporter software (Illumina) to generate FASTQ files, and to perform the mapping of reads against the reference human genome (GRCh37) using Burrows-Wheeler Aligner-Maximal Exact Match (BWA-MEM) algorithm [15]. Following alignment, variant calling was done with Genome Analysis Toolkit (GATK) and Variant Call Format (VCF) files were produced as output. VCF files were analyzed with VariantStudio v3.0 software (Illumina).
The following filters were used to select the candidate variants for further analysis: inclusion in the list of 47 genes associated with HCM (Table 1), protein-coding variants, high-quality calling (PASS filter), and allele frequency (AF) < 0.1% in population databases. The cut-off of 0.1% was chosen considering the disease prevalence in the general population (1 in 500 individuals or 1/1000 chromosomes) [1].
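The four filters and the origin of the 0.1% cut-off can be illustrated with a minimal Python sketch. This is not the VariantStudio workflow used in the study: the `Variant` record, the `is_candidate` and `af_cutoff` helpers, the gene subset, the consequence labels, and the example records are all hypothetical, and treating a variant absent from population databases as having a frequency of 0 is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical subset of the panel, for illustration only; the study used 47 genes (Table 1).
PANEL_GENES = {"MYBPC3", "MYH7", "TNNI3", "TNNT2", "TPM1", "TTN"}

# Assumed labels for protein-coding consequences; annotation tools use their own vocabularies.
CODING_CONSEQUENCES = {
    "missense", "stop_gained", "frameshift", "inframe_indel", "splice", "synonymous",
}


def af_cutoff(prevalence: float) -> float:
    """Convert disease prevalence per individual into a frequency per chromosome.

    Each individual carries two copies of an autosomal gene, so 1 affected
    person in 500 corresponds to 1 in 1000 chromosomes.
    """
    return prevalence / 2


@dataclass
class Variant:
    gene: str
    consequence: str
    filter_status: str              # variant caller FILTER field, e.g. "PASS"
    population_af: Optional[float]  # None when absent from population databases


def is_candidate(variant: Variant, cutoff: float) -> bool:
    """Apply the four filters: panel gene, coding consequence, PASS call, rare."""
    # Assumption: a variant not found in any population database counts as AF = 0.
    af = 0.0 if variant.population_af is None else variant.population_af
    return (
        variant.gene in PANEL_GENES
        and variant.consequence in CODING_CONSEQUENCES
        and variant.filter_status == "PASS"
        and af < cutoff
    )


cutoff = af_cutoff(1 / 500)

# Made-up records, not study data.
variants = [
    Variant("MYBPC3", "frameshift", "PASS", None),
    Variant("MYH7", "missense", "PASS", 0.00002),
    Variant("TTN", "missense", "PASS", 0.015),     # too common
    Variant("TPM1", "missense", "LowQual", 0.0),   # fails the quality filter
    Variant("BRCA1", "missense", "PASS", 0.0),     # not on the panel
]
candidates = [v for v in variants if is_candidate(v, cutoff)]
print([v.gene for v in candidates])
```

With these made-up records, only the MYBPC3 and MYH7 entries pass all four filters; the other three each fail exactly one criterion.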
Sequence variants passing the aforesaid filters were analyzed individually and were further reported using Human Genome Variation Society standardized nomenclature [16]. Interpretation of clinical significance followed the joint consensus recommendations of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG/AMP), taking into account evidence such as allele frequency in control populations and predicted effect on the resultant protein [17]. Variant frequency was determined using the allele frequency estimates from the 1000 Genomes Project (GRCh37 reference assembly) and gnomAD (v2.1.1 dataset aligned against the GRCh37 reference) (accessed in August 2020); AF was retrieved from total population frequencies, including controls within gnomAD v2.1.1. For prediction of the functional consequence of missense variants, four freely available online in silico tools were used: Sorting Intolerant from Tolerant (SIFT), Protein Variation Effect Analyzer (PROVEAN), PolyPhen-2, and MutationTaster. The disease-causing potential of stop-gain and stop-loss variants, splicing variants, frameshift variants, and in-frame insertions and deletions was estimated with MutationTaster. Accordingly, a five-tier system was used to classify the variants into one of the following categories: benign (B), likely benign (LB), variant of uncertain significance (VUS), likely pathogenic (LP), or pathogenic (P).
Each variant was subsequently cross-referenced with its classification provided by publicly accessible databases: the NCBI ClinVar database and the Human Gene Mutation Database (HGMD) (accessed in August 2020). In addition, all novel detected variants (irrespective of in silico prediction) were examined using VarSome [18], a human genomic variant search engine (accessed in November 2020), and classified accordingly.

Statistical Analysis
Data were analyzed using SPSS Statistics (version 23.0); results were presented as mean ± standard deviation for continuous variables and n (%) for categorical variables.

Study Population
Forty-five unrelated index patients (33 men and 12 women) with HCM were studied. The mean age at enrolment was 51 years (SD 15.5, range 21 to 87 years). When the HCM cohort was divided into a positive group (those with a definite genetic etiology) and a negative group (those without definitive genetic results), the mean age in the positive group was significantly lower, 34 ± 10.3 years (range 21 to 48), compared with 53 ± 14.7 years (range 25 to 87) in the negative group, p = 0.04. Except for the age difference between the two groups, no other statistically significant differences were found in the clinical presentation or general characteristics of the HCM cohort. Maximal LV wall thickness was 20.8 ± 5.2 mm (range 15 to 38 mm) in the overall cohort, with no differences between those with and without a definitive genetic diagnosis; moreover, no differences were found in the various echocardiographic parameters (Table 2).

[Table of variant counts by molecular consequence. Columns: Missense, Stop-gained, In-frame, Frameshift, Splice, Synonymous, Total.]
Among all variants, 43 (45%) had been neither previously published nor reported in online variant databases. Molecular consequences at the sequence level of novel variants are enumerated in Table 4.
As for the already reported variants (n = 52, 55%), 6 were classified as pathogenic/likely pathogenic, 14 were variants of uncertain significance, and 11 were benign/likely benign according to the ClinVar archive; 8 variants had conflicting interpretations of pathogenicity (CON), either VUS + LP (2 cases) or VUS + LB/B (6 cases). For 13 rare variants, the ClinVar classification was not available. The positive tests were due to P/LP variants in the MYBPC3 and MYH7 genes (2 cases each), with TNNI3 and TPM1 accounting for the remaining 2 cases (Table 5, P/LP variants represented in bold letters). Multiple variants were detected in 27 (60%) patients, with a maximum of 11 variants in a single subject. No proband had more than one LP/P variant.
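The variant counts reported above can be cross-checked with a few lines of Python. This is only an arithmetic sketch using the figures stated in the text (95 rare variants, 43 of them novel, and the ClinVar breakdown of the remaining 52); the variable names and the `percent` helper are illustrative and not part of the study's analysis.

```python
TOTAL_RARE = 95   # rare variants with AF < 0.1%
NOVEL = 43        # absent from the literature and online variant databases
REPORTED = TOTAL_RARE - NOVEL

# ClinVar status of the previously reported variants, as stated in the text.
CLINVAR_COUNTS = {
    "pathogenic/likely pathogenic": 6,
    "uncertain significance": 14,
    "benign/likely benign": 11,
    "conflicting interpretations": 8,
    "no ClinVar classification": 13,
}

# Breakdown of the conflicting interpretations.
CONFLICTING_SPLIT = {"VUS + LP": 2, "VUS + LB/B": 6}


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# The ClinVar categories should add up to the number of reported variants.
assert sum(CLINVAR_COUNTS.values()) == REPORTED
assert sum(CONFLICTING_SPLIT.values()) == CLINVAR_COUNTS["conflicting interpretations"]

print(REPORTED, percent(REPORTED, TOTAL_RARE), percent(NOVEL, TOTAL_RARE))
```

The categories sum to 52, which corresponds to 54.7% of the 95 rare variants, while the 43 novel variants correspond to 45.3%; these match the rounded 55% and 45% given in the text.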

Discussion
In this study, we explored the genetic basis of a small cohort of Romanian adult index patients with HCM. The general characteristics of our study cohort were similar to the data reported by the Romanian Registry of Hypertrophic Cardiomyopathy [19], with an average age at enrolment falling in the fifth decade of life and with male predominance.
In summary, the main finding of our research was the detection of 95 different rare variants in 33 of the 47 genes studied. MYBPC3 and TTN showed the greatest sequence variation. The extensive variation in TTN could have been predicted given the size of the protein and the extensive alternative splicing that the gene undergoes to encode various isoforms. Targeted sequencing revealed a definite genetic etiology (P or LP variant) in 6 subjects (13.3%) and a possible etiology due to known variations (either VUS or CON variants favoring pathogenicity) in an additional 35.6% (n = 16). All P/LP variants were found in genes encoding sarcomere proteins. Almost half of the rare variants identified were novel.
In our study, the detection rate of LP/P variants was lower than that reported by prior studies [20]. There are several plausible explanations for this. First, more stringent criteria for variant classification have been applied lately, including segregation and/or population data as recommended by ACMG [17]. Hence, irrespective of the geographic region of origin, the yield of positive genetic testing progressively declined with time, from 57.7% before 2000 to 38.4% after 2010, as shown in an analysis from a large international registry [21].
The first large-scale systematic screening of genes for causal mutations in HCM revealed disease-causing variants in 63% of unrelated index cases with familial or sporadic disease. A similar detection rate (64%) was obtained by Lopes and colleagues, who used high-throughput sequencing of 41 genes in 223 unrelated patients with HCM [10]. A high prevalence of pathogenic mutations (67%) was also found in a nationwide study of 141 Icelandic patients with a clinical diagnosis of HCM [11], while in more recent studies P/LP variants were found in 21.4% to 38% of cases [12,13,[22][23][24]. Secondly, referral for genetic testing has increased lately and now includes cases with less severe phenotypes and/or a less conclusive diagnosis [22,25].
Thirdly, data regarding the genetic basis of HCM in the Romanian population are scarce, and the limited data available relate mainly to phenocopies [26][27][28][29].
Forty-five percent of the rare variants identified in our study were novel, and all (except MYBPC3 c.1965A>G and MYBPC3 c.1957_1962delGGCCGC) were "private", each found only once in our cohort. Some of them might eventually be proven to be disease-causing, but definitive classification is challenging and the timeline may be indeterminate, requiring additional studies based on informative segregation analysis of comprehensive pedigrees. The proportion of novel variants in our cohort is comparable with literature data indicating a burden of 35-40% attributable to newly identified mutations, half being unique to a family [22].
As for genes harboring LP/P mutations, our data is consistent with extensive prior findings showing that the most frequent causative variants were detected in core sarcomeric genes, predominantly MYBPC3 and MYH7 which together explain approximately half of the cases of familial HCM [30][31][32].
Sixteen probands (35.6%) in our cohort carried a known VUS or CON variant (VUS/LP) without another likely causal variant, a higher rate than that recently published by a Finnish group [12]. Five subjects (11%) harbored previously reported variants for which a ClinVar classification was not available (with or without one or more novel variants), while another 5 patients had only novel variants. Altogether, these inconclusive results accounted for 68.9% of total cases, consistent with published data showing inconclusive or negative test results in 40 to 60% of screened subjects [20,[33][34][35][36].
For the remaining 8 patients (17.8%) from our cohort, no variant (P/LP, VUS, CON or novel) was detected in any of the genes tested, indicating that additional studies might be needed in order to elucidate the underlying molecular substratum.
The failure to identify rare Mendelian variants in a substantial proportion of HCM patients suggests that more complex etiologies are likely to underlie this illness [37]. Recently, several hypotheses addressed this topic.

1. HCM caused by rare variants in unknown genes for HCM. In the quest to identify putative causative variants outside of recognized HCM genes, various groups used extended next-generation sequencing gene panels or even whole exome/genome sequencing (WES/WGS) as a first/second-line genetic test. In a Dutch study including 453 HCM patients, the sensitivity of genetic testing improved only slightly as the number of genes sequenced increased, whereas the main effect was a higher yield of class 3 variants (49%) [13]. Likewise, a considerably increased detection of VUS (99%) was reported by Thomson and colleagues after examining 51 genes in 240 sarcomere gene negative HCM individuals and 6229 controls, with negligible incremental diagnostic yield [38]. In light of these findings, expanded gene panels appear to offer limited additional sensitivity, since most genes included in diagnostic tests lack robust evidence of disease association [7,35].
2. HCM caused by rare variants in regulatory non-coding regions of already recognized causal genes. In a paper published in 2018, Bagnall and colleagues demonstrated that variation within deep intronic regions of MYBPC3 can explain up to 9% of gene-elusive HCM cases [39].
3. HCM caused by rare variants in mitochondrial DNA (mtDNA). Although rare or even private mtDNA mutations are frequently encountered in HCM patients [40], they are only rarely directly associated with the disease [38], more often acting as disease modifiers rather than as a cause [41].
4. Non-Mendelian HCM. A growing body of evidence indicates that genotype-negative HCM cases are most likely to represent non-Mendelian forms of disease, with less severe prognosis and lower risk to relatives [42]. The ability to accurately identify and characterize such candidate variants is encumbered by the necessity to perform genome-wide association studies in large cohorts assessing both variant frequency in the population and phenotypic effect size in patients [37].
In line with evidence reported by Burns and colleagues [23], no proband had multiple LP/P variants; instead, we found various combinations of LP/P and VUS or VUS/VUS, with or without novel detected variants, implying that the actual incidence of multiple LP/P carriers in HCM might be lower than stated in early studies [32,[43][44][45][46]. Indeed, in a study comprising 1411 unrelated index cases, after rigorous variant curation according to current guidelines, the prevalence of multiple LP/P mutations diminished substantially (from 9% to 0.4%).

Strengths and Limitations of the Study
Our study benefits from the following strong points:
• Use of a comprehensive panel including 47 genes associated with HCM.
• Screening for the first time of a cohort of Romanian index cases.

The study is limited by the small number of enrolled patients.

Future perspectives:
• Validation of the identified variants through Sanger sequencing.
• Expanding the study cohort.
• Performing segregation analyses both for known and novel variants.
• Conducting functional studies for novel detected variants.
• Checking for rare variants in the remaining genes of the TruSight Cardio Sequencing panel.

Conclusions
To our knowledge, this is the first study exploring an extensive panel of HCM-related genes in a cohort of Romanian index patients. All disease-causing variants were detected in four genes encoding sarcomere proteins. The clinical significance of most detected variants is yet to be established, additional studies based on segregation analysis being required for a definite classification.
Funding: This research received no external funding.

Conflicts of Interest: The authors declare no conflict of interest.