The Genetic Architecture of Hypertrophic Cardiomyopathy in Hungary: Analysis of 242 Patients with a Panel of 98 Genes

Hypertrophic cardiomyopathy (HCM) is a primary disease of the myocardium most commonly caused by mutations in sarcomeric genes. We aimed to perform a nationwide large-scale genetic analysis of a previously unreported, representative HCM cohort in Hungary. A total of 242 consecutive HCM index patients (127 men, 44 ± 11 years) were studied with next-generation sequencing using a custom-designed gene panel comprising 98 cardiomyopathy-related genes. A total of 90 patients (37%) carried pathogenic/likely pathogenic (P/LP) variants. Of the patients with P/LP variants, 93% carried them in genes with definitive evidence for HCM association. Most of the patients with P/LP variants had mutations in MYBPC3 (55 pts, 61%) and in MYH7 (21 pts, 23%). Double P/LP variants were present in four patients (1.7%). P/LP variants in other genes could be detected in ≤3% of patients. Of the patients without P/LP variants, 46 patients (19%) carried a variant of unknown significance. Non-HCM P/LP variants were identified in six patients (2.5%), with two in RAF1 (p.Leu633Val, p.Ser257Leu) and one each in DES (p.Arg406Trp), FHL1 (p.Glu96Ter), TTN (p.Lys23480fs), and the mitochondrial genome (m.3243A>G). Frameshift, nonsense, and splice variants made up 82% of all P/LP MYBPC3 variants. In all the other genes, missense mutations were the dominant form of variants. The MYBPC3 p.Gln1233Ter, the MYBPC3 p.Pro955ArgfsTer95, and the MYBPC3 p.Ser593ProfsTer11 variants were identified in 12, 7, and 13 patients, respectively. These three variants accounted for 36% of all patients with identified P/LP variants, raising the possibility of a founder effect for these mutations. Similar to other HCM populations, the MYBPC3 and the MYH7 genes seemed to be the most frequently affected genes in Hungarian HCM patients. The high prevalence of three MYBPC3 mutations raises the possibility of a founder effect in our HCM cohort.

Over the last 20 years, candidate gene research studies assessing genes with a hypothetical role in HCM implicated over 40 additional, mainly non-sarcomeric genes in HCM. This process has been accelerated by the advent of novel DNA sequencing technologies (next generation sequencing, NGS), which revolutionized the field of genomics by allowing accurate, rapid, and high-throughput screening but, in turn, generated tremendous amounts of variants of unknown significance (VUS), which were difficult to interpret because of insufficient data. However, genetic diagnosis is important for the clinical management of HCM patients and their families, primarily as it facilitates the identification of mutation carriers in the families. As a positive genetic finding confirms the etiology of the disease and enables mutation-specific family screening, it is recommended by clinical guidelines [1,11].
The majority of data on the genetic background and disease gene distribution in HCM patients come from large referral centers from Western Europe and the U.S., and data are scarce on Central-European populations. Reports from the region assessing small cohorts of HCM patients have been published from Slovakia (8 pts) [12], Romania (54 pts) [13], and Poland (29 pts) [14] and a larger cohort from the Czech Republic (336 pts) [15].
Understanding of the genetic architecture of HCM is of utmost importance in every geographical region as the genetic background of HCM may be variable across different populations. Hungarians possess a unique genetic constellation as they have ancient roots originating from the Volga-Ural/West Siberian region. Although the present-day Hungarian gene pool is highly similar to that of the surrounding European populations, a limited portion of specific Y-chromosomal lineages link modern Hungarians with populations living close to the Ural Mountain range on the border of Europe and Asia [16]. Therefore, we aimed to perform a nationwide large-scale genetic analysis of a previously unreported Hungarian HCM cohort using targeted resequencing of a comprehensive cardiomyopathy gene panel comprising 98 genes.

Patients and Clinical Evaluation
The study cohort comprised unrelated, consecutively evaluated patients with HCM referred to collaborating cardiovascular centers, establishing a nationwide framework, including University of Szeged, University of Pécs, University of Debrecen, and the Military Hospital-State Health Center, Budapest.
Patients underwent 12-lead ECG, echocardiography, and ambulatory ECG monitoring as baseline assessment, and further specialized cardiology examinations when indicated. HCM was diagnosed in probands when the maximum left ventricular wall thickness (MLVWT) measured by two-dimensional echocardiography was 15 mm or more in at least one myocardial segment, in the absence of other diseases that could explain LV hypertrophy. Blood samples were collected at routine clinic visits, and DNA was isolated from peripheral blood lymphocytes.

Gene Panel Selection
Samples were analyzed with a custom-made cardiomyopathy gene panel, including the following sets of genes, categorized as reported previously [17].

Sequencing
Coding sequences and exon-intron boundaries of all included genes were determined by next-generation sequencing using Agilent's SureSelect technology with custom-made 120-mer RNA baits (designed using SureDesign), specific to the target region (Agilent Technologies, Santa Clara, CA, USA). Briefly, DNA was fragmented by the Covaris S2 System (Covaris Inc., Woburn, MA, USA), and the libraries were prepared using a SureSelect XT Reagent Kit (Agilent Technologies, Santa Clara, CA, USA), according to the manufacturer's instructions. All the purification steps were performed using AMPure XP beads (Beckman Coulter Inc., Brea, CA, USA); all quality measurements were performed on a TapeStation 2200 instrument (Agilent Technologies, Santa Clara, CA, USA) and a Qubit (Thermo Fisher Scientific, Grand Island, NY, USA). The concentration of each library was determined using the KAPA Library Quantification Kit for Illumina (KAPA Biosystems Inc., Wilmington, MA, USA). Sequencing was performed on Illumina HiSeq or NextSeq 500/550 instruments (Illumina Inc., San Diego, CA, USA) using single-end reads (1 × 150 bp), generating approximately 3 million reads for each sample. Variants identified by targeted resequencing were validated by standard capillary sequencing using custom-designed primers. Briefly, amplicons were generated using Platinum SuperFi polymerase and were subsequently sequenced using the BigDye Terminator v3.1 Cycle Sequencing Kit on the 3500 Genetic Analyzer platform (all from Thermo Fisher Scientific, Grand Island, NY, USA), following the manufacturer's instructions.

Bioinformatic Analysis
Mapping of the 150 bp Illumina reads was accomplished with Genomic Workbench ver. 11 (CLC Bio, now part of Qiagen), using the human genome assembly GRCh38 as the reference sequence. Variant calling and variant annotation were performed with the same software. The functional impact of amino acid changes caused by missense mutations was predicted by the SIFT and PROVEAN programs. Nucleotide and amino acid changes were reported according to the Ensembl database.

Interpretation of Variants
Identified variants were evaluated according to the standards for the interpretation of sequence variants issued by the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG/AMP) in 2015 [18], with published gene-specific adaptation [19], and they were classified as benign (B), likely benign (LB), variant of unknown significance (VUS), likely pathogenic (LP), or pathogenic (P). Variants were interpreted using CardioClassifier [20], an automated and interactive web tool that supports disease-specific interpretation of genetic variants in genes associated with inherited cardiac conditions, and by assessing ClinVar variant entries (https://www.ncbi.nlm.nih.gov/clinvar/, accessed on 15 January 2022). A score was assigned to every ClinVar entry (benign: 1, likely benign: 2, variant of unknown significance: 3, likely pathogenic: 4, and pathogenic: 5), and the average of the scores was calculated. Variants with an average score of >4.5 were classified as P variants, and variants with an average score between 3.5 and 4.5, between 2.5 and 3.5, between 1.5 and 2.5, and <1.5 were classified as LP, VUS, LB, and B variants, respectively. In case of discrepancy between the CardioClassifier and ClinVar interpretations, a final verdict was reached by assessing clinical evidence for disease causation (especially data on the number of affected individuals with the condition and evidence for co-segregation). For novel variants with no ClinVar entry and not covered by CardioClassifier, the Varsome (https://varsome.com/, accessed on 15 January 2022) online interpretation program was used.
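The ClinVar score-averaging rule described above can be illustrated with a minimal sketch. The function name classify_from_clinvar and the input format (a list of assertion strings) are hypothetical and chosen only for illustration. The text does not state how a mean falling exactly on a cut-off (for example 4.5) was handled, so the inclusive lower bounds used here are an assumption.

```python
from statistics import mean

# Numeric scores assigned to each ClinVar assertion, as described in the text.
CLINVAR_SCORES = {
    "benign": 1,
    "likely benign": 2,
    "variant of unknown significance": 3,
    "likely pathogenic": 4,
    "pathogenic": 5,
}

def classify_from_clinvar(assertions):
    """Average the per-entry scores and map the mean to a five-tier class."""
    avg = mean(CLINVAR_SCORES[a.lower()] for a in assertions)
    # Assumption: only the P tier uses a strict '>' (as worded in the text);
    # the remaining tiers treat their lower cut-off as inclusive.
    if avg > 4.5:
        return "P"
    if avg >= 3.5:
        return "LP"
    if avg >= 2.5:
        return "VUS"
    if avg >= 1.5:
        return "LB"
    return "B"

# Two pathogenic and one likely pathogenic entry average to about 4.67.
print(classify_from_clinvar(["pathogenic", "pathogenic", "likely pathogenic"]))
```

Under this sketch, a variant with one pathogenic and one likely pathogenic entry (mean 4.5) would fall into the LP tier; a different boundary convention would move it to P.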

Study Population
A total of 242 unrelated index patients (127 (52%) men) with a clinical diagnosis of HCM were studied. At the time of diagnosis, the mean age was 44 ± 11 years, and the mean maximal left ventricular wall thickness was 23 ± 7 mm. The main demographic and clinical characteristics of the patients are summarized in Table 1.

Summary of Sequence Data
The median value of the per-sample average read depth (number of reads mapped on a 100-bp target region) across the samples was 392. Only 6 out of 242 samples had an average read depth lower than 50, with a minimum of 13. Overall, 96% of the sequenced target regions were covered to a depth of 40 or more and 98% to a depth of 20 or more.
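As a sketch of how depth statistics of this kind (median of the per-sample average depth, number of low-depth samples, and fraction of target regions at or above a depth threshold) might be derived, the following assumes a hypothetical input: a dictionary mapping each sample ID to a list of per-region read depths. The function name and the toy numbers are illustrative only and are not study data.

```python
from statistics import median

def summarize_coverage(depths_by_sample, low_depth=50):
    """Summarize read depth; depths_by_sample maps sample ID -> per-region depths."""
    # Average depth of each sample across its target regions.
    sample_means = [sum(d) / len(d) for d in depths_by_sample.values()]
    # Pool every region of every sample for the threshold fractions.
    all_regions = [x for d in depths_by_sample.values() for x in d]
    n = len(all_regions)
    return {
        "median_of_sample_means": median(sample_means),
        "samples_below_low_depth": sum(m < low_depth for m in sample_means),
        "min_sample_mean": min(sample_means),
        "frac_regions_ge_40": sum(x >= 40 for x in all_regions) / n,
        "frac_regions_ge_20": sum(x >= 20 for x in all_regions) / n,
    }

# Toy example with three samples and three target regions each.
toy = {"S1": [400, 380, 30], "S2": [60, 45, 10], "S3": [500, 500, 500]}
summary = summarize_coverage(toy)
print(summary)
```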

Patient-Level Variant Analysis
The patient-level variant analysis (one variant per patient) is reported in Table 2.
Table 2. Patient-level variant analysis, with the number (n) and percentage (%) of the identified variants in the different genes. Definitive: genes with definitive evidence for HCM association; moderate: genes with moderate evidence for HCM association; P/LP: pathogenic/likely pathogenic variant; VUS: variant of unknown significance; LVH: left ventricular hypertrophy. * in patients without P/LP variants.

Patient-Level Variant Associations in HCM Genes
Out of the 242 study patients, 90 (37%) carried P/LP variants. The percentage of patients with P/LP variants in genes with definitive evidence for HCM association was 93%. Most of the patients with P/LP variants had a mutation in MYBPC3 (55 pts, 61%), followed by patients with P/LP variants in MYH7 (21 pts, 23%). P/LP variants in other genes were found in less than 3% of the patients.
Double P/LP variants were present in four patients (1.7%): three patients with two P/LP variants in MYBPC3 (compound heterozygotes) and one patient with simultaneous P/LP variants in MYH7 and MYBPC3 (double heterozygote).
Of the patients without P/LP variants, 46 (19%) carried a VUS. Among these patients, a VUS was found most often in MYH7 (11 pts, 24%) and in MYBPC3 (9 pts, 20%). Other genes in which more than 5% of these patients carried a VUS were TPM1 (4 pts, 9%), PRKAG2 (4 pts, 9%), MYL2 (3 pts, 7%), MYL3 (3 pts, 7%), and CSRP3 (3 pts, 7%). The percentage of patients with a VUS in genes with definitive evidence for an HCM association was 76%. Two pathogenic variants (ACADVL p.Arg615Gln, TSFM p.Gln286Ter) were found in a heterozygous state and as a second variant in addition to another P/LP MYBPC3 variant and, therefore, were not considered disease-causing. The SDHA p.Met1? variant was identified in two patients, one in a homozygous and the other in a heterozygous state. Since this alteration has been reported in an individual with Leigh syndrome [21], as well as in individuals with paraganglioma and gastrointestinal stromal tumor (GIST), the importance of this variant in relation to HCM is unclear. The patient-level variant analysis is summarized in Figure 1.
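The patient-level percentages in this section use different denominators (the whole cohort, the P/LP carriers, or the patients with a VUS only). The following sketch recomputes them from the counts given in the text; the helper pct and the dictionary labels are illustrative names. Note that the 19% figure for VUS carriers is reproduced only when the full cohort of 242 is used as the denominator.

```python
def pct(n, total, digits=0):
    """Percentage of n in total, rounded to the given number of digits."""
    return round(100 * n / total, digits)

COHORT = 242    # index patients
PLP = 90        # patients with a P/LP variant
VUS_ONLY = 46   # patients with a VUS and no P/LP variant

checks = {
    "P/LP carriers / cohort": pct(PLP, COHORT),        # reported as 37%
    "MYBPC3 / P/LP carriers": pct(55, PLP),            # reported as 61%
    "MYH7 / P/LP carriers": pct(21, PLP),              # reported as 23%
    "double P/LP / cohort": pct(4, COHORT, 1),         # reported as 1.7%
    "VUS only / cohort": pct(VUS_ONLY, COHORT),        # reported as 19%
    "MYH7 VUS / VUS only": pct(11, VUS_ONLY),          # reported as 24%
    "MYBPC3 VUS / VUS only": pct(9, VUS_ONLY),         # reported as 20%
    "no P/LP / cohort": pct(COHORT - PLP, COHORT),     # complement of 37%
}
for label, value in checks.items():
    print(f"{label}: {value}%")
```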

Gene-Level Variant Analysis
The gene-level variant analysis (each variant counted only once, even if present in more than one patient) is reported in Table 3.
Table 3. Gene-level variant analysis, with the number (n) and the percentage (%) of variants identified in different genes. Definitive: genes with definitive evidence for HCM association; moderate: genes with moderate evidence for HCM association; P/LP: pathogenic/likely pathogenic variant; VUS: variant of unknown significance; B/LB: benign/likely benign variant; LVH: left ventricular hypertrophy.
A total of 82% of all P/LP MYBPC3 variants were frameshift, nonsense, and splice variants. In all the other genes, missense mutations were the dominant form of variants (Supplementary Table S3).
A total of 31 novel variants were identified; six of them were P/LP variants, with four in MYBPC3, one in FHL1, and one in TTN (Supplementary Table S3).

Possible Founder Mutations in the Patient Cohort
Three P/LP variants were identified in more than three patients each. These were MYBPC3 p.Gln1233Ter in 12, MYBPC3 p.Pro955ArgfsTer95 in 7, and MYBPC3 p.Ser593ProfsTer11 in 13 patients, together accounting for 36% of all patients with identified P/LP variants. These patients were seemingly unrelated, raising the possibility of a founder effect for these mutations.
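The share attributed to the three recurrent variants follows directly from the carrier counts. The short calculation below reproduces it, assuming that no patient carries more than one of the three variants (the text does not state this explicitly); the variable names are illustrative.

```python
# Carrier counts for the three recurrent MYBPC3 variants, as reported above.
recurrent = {
    "MYBPC3 p.Gln1233Ter": 12,
    "MYBPC3 p.Pro955ArgfsTer95": 7,
    "MYBPC3 p.Ser593ProfsTer11": 13,
}
plp_carriers = 90   # patients with any P/LP variant
cohort = 242        # all index patients

# Assumption: each carrier is counted once, i.e. no patient has two of these variants.
carriers = sum(recurrent.values())
share_of_plp = 100 * carriers / plp_carriers
share_of_cohort = 100 * carriers / cohort

print(f"{carriers} carriers: {share_of_plp:.1f}% of P/LP carriers, "
      f"{share_of_cohort:.1f}% of the cohort")
# prints "32 carriers: 35.6% of P/LP carriers, 13.2% of the cohort"
```

The 35.6% value rounds to the 36% reported in the text.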

Discussion
In this work, we report the results of the genetic analysis of a previously unpublished Hungarian patient cohort with HCM. The detection rate of P/LP variants in our patient cohort was 37%, with an additional 19% of patients carrying a VUS, among patients who had no P/LP variants. This detection rate is similar to literature data, with detection rates in the range of 21-43% having been reported (in cohorts with more than 100 screened patients) [13,15,22-27]. The two major HCM genes in our cohort proved to be MYBPC3 and MYH7, as the majority of the patients with P/LP variants carried them in MYBPC3 (61%), followed by MYH7 (23%). This finding is also similar to other reported HCM disease-gene distributions, with MYBPC3 and MYH7 being the most frequently affected disease-associated genes [13,15,22-27]. The percentage of patients with P/LP variants in sarcomere genes with definitive evidence for HCM association was 93% in our cohort, which is also similar to reported data. The variant-level analysis indicated that frameshift, nonsense, and splice variants made up 82% of all P/LP MYBPC3 variants, while in all the other genes, missense mutations were the dominant form of variants. This observation is also well in line with published data.
The results of our study also support the observation that sequencing expanded panels with an increasing number of genes offers limited additional sensitivity. Beyond genes with definitive or moderate evidence for HCM association and syndromic genes with isolated LVH, we identified only one patient with a P variant in the TTN gene and another one with a mitochondrial mutation. Data from the Laboratory for Molecular Medicine (Partners Healthcare Personalized Medicine, Boston, MA, USA) indicated that an expanded gene panel encompassing more than 50 genes identified only a very small number of additional pathogenic variants beyond those identifiable in their original panels, which examined 11 genes with a detection rate of approximately 32% among unselected probands [22]. More importantly, Walsh et al. sequenced 31 genes implicated in HCM in a large prospective HCM cohort (n = 804), found no significant excess of rare (minor allele frequency <1:10,000 in ExAC) protein-altering variants over controls for most genes tested, and concluded that novel variants in these genes are rarely interpretable. Indeed, extended gene panels rarely identify P/LP variants outside of the core genes, which can be interpreted with a high degree of certainty [28].
An interesting finding of our study is that three P/LP variants in MYBPC3 (p.Gln1233Ter, p.Pro955ArgfsTer95, and p.Ser593ProfsTer11) accounted for 36% of all patients with identified P/LP variants. These patients were seemingly unrelated, which raises the possibility of a founder effect for these mutations. Although these variants have been reported worldwide [29,30], they were identified mostly in isolation, without an established founder effect, except for the p.Pro955ArgfsTer95 mutation, which has been observed in 1.6% of the Dutch HCM population. As described in Dutch [31] and Finnish [32] populations, MYBPC3 has a higher propensity for founder mutations (due to particular geographical and cultural isolation). Interestingly, when comparing the distribution of variants within our region of Central-Europe, we found very few overlaps (with regard to P/LP variants). Only two variants reported in the Polish cohort [14] were identified in our cohort (MYH7 p.Glu1039Gly and MYBPC3 p.Arg495Trp). Most notably, the variant MYBPC3 p.Tyr847Ter, reported to be a likely founder mutation in the Polish population, was not detected in our patient cohort. With the Slovak cohort [12], only two variants were shared (MYH7 p.Arg663His and MYBPC3 p.Tyr1136del). With regard to the Romanian cohort [13], only the TNNI3 p.Arg186Gln variant was present in our patient group. With the Czech HCM cohort [15], there were eight overlapping variants (MYBPC3 p.Arg495Gln, p.Arg810His, p.Ala1056GlyfsTer9, and p.Gln1233Ter; MYH7 p.Ile736Thr and p.Glu924Lys; MYL3 p.Ala57Gly; and TNNI3 p.Ser166Phe). Of the three Hungarian putative founder variants, only the MYBPC3 p.Gln1233Ter mutation was identified in the Czech cohort, in one patient.
A total of 63% of our HCM cohort proved to be likely genotype-negative cases. The etiology of genotype-negative HCM may include rare variants in yet unknown genes or in regulatory non-coding regions of known causative genes [33]. Such genes, like TRIM63 [34], ALPK3 [35], or FHOD3 [36], have indeed been reported, but as being responsible for only 1-2% of genotype-negative cases. Another possible explanation may be the presence of variants affecting cryptic splice-altering sites, which have been reported in MYBPC3 in 2.2% of gene-negative patients [37]. Recently, data on the polygenic nature of HCM have emerged, suggesting a surprisingly high degree of possible polygenic contribution to the disease [38,39].
In conclusion, similar to other HCM populations, the MYBPC3 and MYH7 genes seem to be the most commonly affected genes in Hungarian HCM patients. The high prevalence of three MYBPC3 mutations raises the possibility of a founder effect in our HCM cohort.

Study Limitations
Although this was a nationwide study, the participating centers are all third-level referral cardiology institutions, with a possible bias towards accumulating more severe or problematic cases. Furthermore, even with considerable development in the evaluation of gene variants, their interpretation remains somewhat subjective. Although we applied a very strict interpretation process, based on the user-independent CardioClassifier program and the ClinVar database, both of these partly depend on the number of already-reported clinical cases associated with the given genetic variant. This dependence might have influenced the variant interpretation, especially that of the 'variant of unknown significance' category. Comparisons of our results with those obtained in other Central-European populations may also be influenced by the relatively small number of patients assessed in the respective populations.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/diagnostics12051132/s1. Table S1. Genes included in the panel for targeted resequencing. Table S2. Complete list and attributes of the variants identified in the study. The mitochondrial mutation is not included. Table S3
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
To ensure independent interpretation of the study results, the authors grant all external authors access to relevant material, including participant-level clinical study data. Study documents and participant clinical study data are available to be shared on request after publication of the primary manuscript. Bona fide, qualified scientific and medical researchers are eligible to request access to the clinical study data with corresponding documentation describing the structure and content of the datasets. Data are shared in a secure data-access system. Prior to providing access, clinical study documents and data will be examined, and, if necessary, redacted and de-identified to protect the personal data of the study participants and personnel, and to respect the boundaries of the informed consent of the study participants.