Dilated cardiomyopathy (DCM) is the underlying cause of >50% of heart transplants. The high morbidity and mortality associated with this disease underscore the need for a better understanding of the underlying molecular defects. Efforts to identify these defects have made great progress and acknowledge the complexity of the genetic architecture of DCM. Historically, however, they have relied on a somewhat circular argument: the genetic cause of DCM is described as predominantly autosomal dominant with reduced penetrance and a high degree of locus (>40 genes) and allelic (>197 variants) heterogeneity [1
], with most mutations being rare or “private” to each affected family. However, proof of variant pathogenicity within a family, at least for publication in the scientific literature, relies largely on fulfilment of these criteria. In total, ~40% of individuals test positive at known DCM loci [2
], and even within families that fulfil these criteria, age at onset, response to treatment and disease progression are variable [4
]. Despite this fact, the search for the “missing” genetic cause and the definition of pathogenicity rely predominantly on a familial-based strategy and criteria, making it particularly difficult to demonstrate pathogenicity in sporadic cases, even when all possible non-genetic causes (coronary artery disease, chemotherapy-induced cardiotoxicity, valvular disease or repair) are ruled out.
Clearly, the traditional Mendelian paradigm gives an incomplete account of the genetic contribution to DCM, and there is a need for alternative and perhaps novel genomic strategies. Genetic association studies of sporadic cases provide a potential alternative. In the first reported genome-wide association study of DCM (1179 cases and 1108 controls), common variants at two loci (BAG3
) were associated at genome-wide significance [5
]. These data were encouraging because one of the top loci, BAG3
, was also identified as a DCM gene in multiply affected families [6
]. These overlapping but independent reports of common genetic variants (allele frequency = 87.5% in sporadic, “idiopathic” DCM cases and 79.2% in controls, OR = 1.89) associated with increased disease risk at a DCM locus that was originally determined by Mendelian rare variant criteria are not hugely surprising, but they lend weight and confidence to further investigation of known DCM genes and common risk variants. The caveat of such an approach is that DCM is a common disease (prevalence now estimated at 1:250) and often of late onset [4
]. Many members of the general population (potentially used as controls) could be asymptomatic, but due to expense, echocardiographic screening of large control populations is unlikely. In this study, we present an alternative approach to identify potential common risk variants, using a large clinical trial of breast cancer patients.
We postulate that phenotypic variability between individuals with the same DCM mutation is the result of cardiac modifying variants (i.e., we hypothesize that the severity of known DCM mutations could be influenced by individual genetic background). Differences in genetic background have been observed in animal models of DCM. For example, conditional knock-out of cardiac Erbb2
in two different lines of mice resulted in a DCM phenotype in both lines, but one showed much later disease onset [7
]. However, the homogeneity within such models can be a disadvantage when extrapolating to the human population, and strategies to tease out the human genetic architecture of DCM are required. For example, ERBB2
itself is not a known DCM gene in the Mendelian sense, but it is the target of the monoclonal antibody and breast cancer drug, trastuzumab (Herceptin), the current standard of care for HER2+ breast cancer patients [8
]. In vitro assays of trastuzumab in human iPSC-derived cardiomyocytes demonstrate complete loss of ERBB2 within 48 h [9
], a close parallel between the mouse conditional knock-out model and use of trastuzumab in human patients. Indeed, in the first clinical trial of trastuzumab in the metastatic setting [10
], the major clinical side-effect was congestive heart failure in up to 27% of patients, although notably, this figure related to patients who received trastuzumab following anthracyclines, already a well-known cause of dose-dependent, irreversible heart failure, often ending in a phenotype of cardiomyopathy [11
]. Nonetheless, the incidence of cardiac events was considerably higher in patients who received both anthracycline and trastuzumab than in patients who received anthracycline alone; hence, subsequent trials of trastuzumab employed serial echocardiographic monitoring of patients. These patients may represent an important population in which to identify cardiac modifying variants because: (1) phase III clinical trials are typically large (thousands of patients); (2) patients receive echocardiography as a standard of care, with left ventricular ejection fraction (LVEF) monitoring at baseline, throughout treatment and on completion of treatment; (3) patients must have baseline LVEF >50% to be eligible for trastuzumab, so they are unlikely to have asymptomatic cardiac dysfunction prior to treatment; (4) the average age of breast cancer patients entered into phase III trials of trastuzumab (Herceptin) was >60 years, hence more representative of the age of DCM patients in the general population; (5) family history of dilated cardiomyopathy is a risk factor for anthracycline-induced cardiomyopathy [12
], suggesting an overlap between disease development following chemotherapy and genetic variants at DCM loci.
In this study, we analyzed the association of genetic variants across 72 known cardiomyopathy genes with decline in LVEF in 800 patients from the N9831 clinical trial [8
]. All patients in this group were treated with doxorubicin and trastuzumab. We report results of single variant associations of common genetic variants (minor allele frequency (MAF) > 0.01) as well as those of gene-based association testing. These analyses highlight genetic variants at OBSCN
as potential cardiac modifying variants that may be relevant to the development or progression of cardiomyopathy.
2. Materials and Methods
N9831 Clinical Trial: N9831 was a pivotal clinical trial that led to the use of trastuzumab as the standard of care for early HER2+ breast cancer. Patients in the N9831 trial were required to have histologically confirmed adenocarcinoma of the breast with 3+ immunohistochemical staining for HER2 or amplification of the HER2 gene by fluorescence in situ hybridization (≥2.0 ratio) and with either lymph node-positive or high-risk lymph node-negative disease to be eligible for the study. The trial compared adjuvant chemotherapy only (Arm A) vs. adjuvant chemotherapy followed by trastuzumab, either sequentially (Arm B) or concurrently (Arm C), in operable HER2+ breast cancer [8
]. Patients received serial echocardiograms (ECHO) or multigated acquisition scans (MUGA) for up to 6 years: at baseline, at 3, 6, and 9 months after registration, and after completion of chemotherapy (Figure 1
). Long-term cardiac safety analysis was completed in 2016 [15
]. The most common cardiac symptom was decline in LVEF by ≥10 points, observed in 26.2% of patients in Arm A (chemotherapy only) and 37.3% of patients who received trastuzumab (Arms B and C). Prevalence of congestive heart failure (CHF) was also significantly higher in patients receiving trastuzumab (3%) compared to those receiving chemotherapy only (0.9%) [15
]. The majority of patients who developed CHF received cardiac medications, which included diuretics, beta-blockers, and angiotensin-converting enzyme inhibitors.
DNA extraction and genotyping: Genomic DNA was available for a total of 1446 patients from the trial. DNA was isolated from peripheral blood with the Flexigene kit (Qiagen Inc, Germantown, MD, USA) as per the manufacturer’s instructions, normalized to 15 ng/μL and shipped to Affymetrix (Affymetrix Inc, Santa Clara, CA, USA) for full-service genotyping. Each 96-well plate contained one duplicate patient sample and two DNA samples routinely used as positive controls by Affymetrix. Genotyping was performed using a customized Axiom genotyping array (Affymetrix Inc, Santa Clara, CA, USA) covering a total of 762,792 single nucleotide polymorphisms (SNPs).
A total of 16 duplicate controls were nested within 1462 DNA samples (1446 unique samples, one duplicate pair per 96-well plate), yielding 100% genotyping concordance across 793,571 SNPs. Primary analyses were confined to White/non-Hispanic patients with complete LVEF data. A total of 188 patients were reported as non-White/Hispanic, principal components analyses identified a further 27 outliers, and 40 patients were missing either baseline or post-treatment LVEF, leaving 1191 patients for analyses (Arm A, N = 391; Arms B + C, N = 800; Supplementary Figure S1). Custom shell and R scripts were used to convert these data to PLINK format, and all quality control (QC) was done using PLINK 1.07.
No samples had a call-rate under 95%. A total of 13,987 SNPs had a call-rate under 95% and were removed from further analyses. Of the remaining 779,584 SNPs, 160,721 had MAF < 1%.
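The SNP-level filtering described above can be illustrated with a minimal sketch. It is not the PLINK 1.07 pipeline used in the study; the function name `snp_qc`, the input layout (one list of 0/1/2 allele counts per sample, `None` for a missing call) and the status labels are assumptions made for illustration. Only the thresholds (call-rate of at least 95%, MAF cut-off of 1%) come from the text.

```python
def snp_qc(genotypes, min_call_rate=0.95, maf_cutoff=0.01):
    """Classify each SNP as 'removed', 'common' or 'rare'.

    genotypes: list of per-sample lists; each entry is an allele count
    (0, 1 or 2) or None for a missing call. Hypothetical layout.
    """
    n_samples = len(genotypes)
    n_snps = len(genotypes[0])
    statuses = []
    for j in range(n_snps):
        calls = [row[j] for row in genotypes if row[j] is not None]
        call_rate = len(calls) / n_samples
        if call_rate < min_call_rate:
            # SNPs with a call-rate under the threshold are dropped
            statuses.append("removed")
            continue
        # Frequency of the counted allele among called genotypes
        freq = sum(calls) / (2 * len(calls)) if calls else 0.0
        maf = min(freq, 1 - freq)
        # Monomorphic SNPs (MAF = 0) are labelled 'rare' in this sketch
        statuses.append("common" if maf > maf_cutoff else "rare")
    return statuses


if __name__ == "__main__":
    toy = [[0, None, 0], [1, None, 0], [2, 0, 0], [1, 1, 0]]
    print(snp_qc(toy))
```

In this toy input the second SNP is missing in half of the samples and is removed, while the third SNP is monomorphic and therefore falls below the MAF cut-off.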
Deviation of the genotype distributions from Hardy–Weinberg equilibrium was tested in those patients whose LVEF did not drop by >10% to below 50%. All SNPs with Fisher’s exact test for Hardy–Weinberg equilibrium p < 1.0 × 10⁻⁴ were excluded.
Principal components were calculated on 277,190 independent SNPs (none within a moving window of 50 SNPs could have a variance inflation factor (VIF) > 2) to assess correlation with self-reported race. The set of independent SNPs was also used to determine relatedness. There was no cryptic relatedness apart from duplicates; in total, 18 non-control pairs of samples were considered identical based on high PI_HAT (a PLINK statistic based on estimated IBD) and concordance values.
Gene and SNP selection: In this study, we focus on known DCM genes from the current literature. We report single marker associations of common genetic variants (MAF > 0.01) at 72 loci (Table 1), together with gene-based analyses that include both common (MAF > 0.01) and rare (MAF < 0.01) SNPs. Of the 72 loci, 71 are listed in the review of DCM genetic architecture [4]; the additional gene, obscurin (OBSCN), was more recently identified as a DCM gene [16]. The Affymetrix Axiom genotyping GWAS platform has the option to include custom SNPs on a GWAS backbone. We included custom SNPs for all 71 genes in the Hershberger DCM review [4
]. We did not include custom SNPs at the OBSCN
locus, as the array was designed prior to publication of [16
]. The study included a total of 15,203 variants at these 72 loci (median SNPs per gene = 68, range 1–3512, interquartile range = 178), of which 7018 had MAF > 0.01. Each gene and the number of SNPs per gene are listed in Supplementary Table S1.
Definition of cardiotoxicity: Several oncology and cardiology organizations provide definitions for cardiotoxicity that encompass overt clinical events and subclinical injury, although there is no universally accepted clinical cut point [17
]. The 2014 consensus of the American Society of Echocardiography and the European Association of Cardiovascular Imaging defined cancer therapeutics-related cardiac dysfunction (CTRCD) as a decrease in LVEF of >10 percentage points to a value <53% [11
]. Reports of cardiotoxicity in the literature range in LVEF from <50% to <55%, in some cases requiring decreases of >15% or 20% [18
]. We aimed to avoid the arbitrary nature of this definition by using as our primary endpoint the maximum decline in LVEF from baseline observed during follow-up until three months after discontinuation of trastuzumab or until two years post-treatment, whichever was earlier.
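As an illustration of this endpoint, the sketch below computes the largest drop from baseline within the follow-up window. The function name, the representation of follow-up as (month, LVEF) pairs, and the reading of "two years post-treatment" as 24 months after the end of treatment are assumptions for the example, not details taken from the trial protocol. The sign convention (lowest LVEF minus baseline, so a decline is negative) follows the statistical analysis description.

```python
def max_lvef_change(baseline_lvef, followups,
                    trastuzumab_stop_month, treatment_end_month):
    """Return (lowest in-window LVEF - baseline), or None if no scans qualify.

    followups: list of (month, lvef) pairs measured after baseline.
    The window closes at the earlier of: 3 months after trastuzumab
    discontinuation, or 24 months after the end of treatment (assumed
    interpretation of 'two years post-treatment').
    """
    cutoff = min(trastuzumab_stop_month + 3, treatment_end_month + 24)
    in_window = [lvef for month, lvef in followups if month <= cutoff]
    if not in_window:
        return None
    return min(in_window) - baseline_lvef


if __name__ == "__main__":
    scans = [(3, 60), (6, 52), (9, 58), (30, 45)]
    # The month-30 scan falls outside the window and is ignored
    print(max_lvef_change(65, scans, trastuzumab_stop_month=12,
                          treatment_end_month=12))
```

Treating the endpoint as a continuous change, rather than thresholding it, is what lets the analysis sidestep the choice of a clinical cut point.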
Statistical analyses: Single SNP statistical analyses were performed for 7033 common variants (MAF ≥ 0.01), using R version 3.1.1 and PLINK version 1.07. Linear regression was used with change in LVEF (lowest recorded LVEF − baseline LVEF) as the outcome variable and the number of copies of the minor allele of the variant of interest as the primary predictor variable. Analyses were adjusted for age, baseline LVEF, anti-hypertensive medications and the first two principal components in the 800 patients in Arms B and C who received chemotherapy (doxorubicin, cyclophosphamide and paclitaxel) and trastuzumab.
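The additive single-SNP model described above can be sketched as an ordinary least squares fit. This is a pure-Python illustration, not the R/PLINK code used in the study; the helper names `solve` and `additive_snp_fit` are hypothetical, and the covariate list stands in for age, baseline LVEF, anti-hypertensive medication and the first two principal components.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def additive_snp_fit(change_lvef, minor_allele_counts, covariates):
    """Fit change_lvef ~ intercept + minor allele count + covariates.

    Returns (beta, standard error) for the minor allele count term.
    covariates: one list of covariate values per patient.
    """
    rows = [[1.0, float(g)] + [float(c) for c in cov]
            for g, cov in zip(minor_allele_counts, covariates)]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
           for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, change_lvef))
           for i in range(p)]
    beta = solve(xtx, xty)
    resid = [y - sum(b * x for b, x in zip(beta, r))
             for r, y in zip(rows, change_lvef)]
    sigma2 = sum(e * e for e in resid) / (len(rows) - p)
    # Diagonal entry of (X'X)^-1 for the SNP term (column index 1)
    unit = [1.0 if i == 1 else 0.0 for i in range(p)]
    var_beta = sigma2 * solve(xtx, unit)[1]
    return beta[1], math.sqrt(max(var_beta, 0.0))
```

With the outcome coded as lowest LVEF minus baseline, a negative beta corresponds to a larger decline per copy of the minor allele, and a positive beta to a smaller decline. A p-value would follow from comparing beta divided by its standard error against a t distribution with n − p degrees of freedom.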
The study included a total of 15,203 variants at 72 genes/SNP-sets (median SNPs per gene = 68, range 1–3512, interquartile range = 178), of which, 7018 had MAF > 0.01. Each gene and the number of SNPs in each gene set are listed in Supplementary Table S1
. Gene-based statistical analyses were performed by aggregation of individual test-score statistics for each of the 72 gene sets to compute gene-based level p
-values, while adjusting for age, baseline LVEF, anti-hypertensive medications and the first two principal components with the SNP-set (Sequence) Kernel Association Test (SKAT). Three variations of this test were performed: SKAT [19
], SKAT-O [20
] and SKAT-common/rare [21
] under: (1) a rare variant non-burden test (more powerful when a large fraction of the variants in a gene are non-causal or the effects of causal variants are in different directions); (2) a rare variant test that optimally combines burden (more powerful when most variants in a region are causal and the effects are in the same direction) and non-burden tests; and (3) a combined test of rare and common variants (weighting rare and common variants equally), respectively.
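For orientation, the relationship between the non-burden and burden statistics can be written compactly. The notation below is a sketch in the general form used in the SKAT and SKAT-O literature and is not taken from this study: the symbols $g_{ij}$ (allele count of variant $j$ in patient $i$), $\hat{\mu}_i$ (fitted change in LVEF from the covariate-only model), $w_j$ (variant weight) and $\rho$ are all assumed here for illustration.

```latex
% Per-variant score: covariance of genotype with covariate-adjusted residuals
S_j = \sum_{i=1}^{n} g_{ij}\,\bigl(y_i - \hat{\mu}_i\bigr)

% Non-burden (variance component) statistic: squared scores are summed,
% so effects in opposite directions do not cancel
Q_{\mathrm{SKAT}} = \sum_{j=1}^{m} w_j^{2}\, S_j^{2}

% Burden statistic: scores are summed before squaring,
% so power is highest when effects share a direction
Q_{\mathrm{burden}} = \Bigl(\sum_{j=1}^{m} w_j\, S_j\Bigr)^{2}

% Optimized combination, with \rho chosen to minimize the p-value
Q_{\rho} = (1-\rho)\, Q_{\mathrm{SKAT}} + \rho\, Q_{\mathrm{burden}},
\qquad 0 \le \rho \le 1
```

Setting $\rho = 0$ recovers the non-burden test and $\rho = 1$ the burden test, which is why the optimized test is expected to behave well under either of the scenarios listed in (1) and (2).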
The genomic architecture of dilated cardiomyopathy is complex, with a high degree of phenotypic variability that could be accounted for by cardiac modifying variants. As an exploratory effort to identify putative modifying variants, we conducted a genetic association study of decline in LVEF following treatment with combination doxorubicin (known to induce cardiomyopathy in animal models and humans) and trastuzumab (a targeted therapy for ERBB2, crucial in prevention of dilated cardiomyopathy in mice [7
] and with known cardiotoxicity in clinical trials [10
]) in 800 patients from a breast cancer clinical trial across 72 genes that are causative of cardiomyopathies.
Perhaps the strongest result from these analyses is the association with obscurin (OBSCN
), a large gene (two giant isoforms, >100 exons, spanning 170 kb). OBSCN was initially screened as a candidate gene for hypertrophic cardiomyopathy (HCM) by Arimura et al. [22], who identified the variant OBSCN Arg4344Gln (within the Ig48–49 domain) in a 19-year-old affected male. Functional analyses demonstrated that the Arg4344Gln variant affected binding of obscurin to the Z9–Z10 domains of titin [22
]. Our own single marker, common variant analyses identified associations between decline in LVEF and two missense variants in this domain: rs56021350/Thr4399Met and rs61825301/His4489Gln. Both variants were present at MAF = 0.18 in the 800 patients treated with doxorubicin and trastuzumab, with the minor allele associated with a larger decline in LVEF, p
= 0.001, following treatment. Our study also observed association with rs3795801/Gly4666Ser, MAF = 0.18, p
= 0.001, again with the minor allele associated with a larger decline in LVEF following treatment. This variant maps to the calmodulin-binding region (Ig51/52 domain) and was also identified in the Arimura study [22] of 144 unrelated HCM patients, but was disregarded because it was present in the SNP database.
OBSCN was also recently identified as causative of dilated cardiomyopathy (DCM) [16
] based on the observation of five potentially disease-causing mutations in four of 30 patients screened by whole exome sequencing. Marston et al. [16
] reported that 15% of the potentially disease-causing variants were in the OBSCN
gene which the authors likened to the frequency of truncating mutations in TTN
, which has been proposed as a major causative gene of DCM, suggesting that mutations in OBSCN
may also be significant contributors to DCM burden. Our single marker analyses of common variants and also our gene-based analyses (including 38 common and 44 rare variants) are in agreement, and we further suggest that common and rare variants in OBSCN
may contribute to DCM burden or perhaps modify disease progression/outcome.
In a study of 312 DCM patients, TTN
truncating variants were reported in 25% of familial and 18% of sporadic cases [23
]. A subsequent study identified TTN
truncating variants in 6/17 DCM families [24
], not all of which segregated with disease, illustrating the difficulty of determining variant pathogenicity. We had hoped that our exploratory study might shed some light on causality at this locus, but we observed only minimal evidence for the association of common variants, despite the large coding region (>300 exons) and the inclusion of 275 common variants in our analyses. The association we did observe appeared to come from variants with a positive value of beta (suggesting a lesser decline in LVEF following treatment), all in high linkage disequilibrium, including 17 non-coding and two missense variants (p = 0.019–0.047). If this signal were real, the predicted effect on LVEF would be protective against doxorubicin and trastuzumab.
In summary, our data are suggestive of genetic modifying variants that may increase the risk of, or protect against, the development and/or progression of cardiomyopathy. Several of the associated variants in our study have been previously identified in sequencing studies of familial cardiomyopathy, but were likely discarded because they were present in public SNP databases, even at low frequency. All associated common variants (MAF > 0.01) in this study are shown in Table 2
. Given the heterogeneity observed within DCM, even within family members carrying the same “causative” variant, a potential strategy would be to ask whether those family members with the worst outcome (earliest onset) were also positive for modifying alleles in the same gene, reported to have negative impact on LVEF.
The limitations of the study are its exploratory nature and the testing of multiple genes under multiple scenarios of rare and common variants. Given that several of the associated “modifying” variants are coding, perhaps the next step is testing in model organisms. This functional testing would also discern whether specific variants are modifiers of the effects of doxorubicin, trastuzumab or combination therapy.