Is Mammographic Breast Density an Endophenotype for Breast Cancer?

Simple Summary
Candidate endophenotypes should be systematically assessed against five criteria: (i) the endophenotype is associated with disease in the population; (ii) the endophenotype is heritable; (iii) within families, endophenotype and disease co-segregate; (iv) the endophenotype found in affected family members is found in non-affected family members at a higher rate than in the general population; and (v) the endophenotype is primarily state independent (manifests in an individual whether or not disease is active). This study assesses the suitability of mammographic breast density as an endophenotype for breast cancer. Formally establishing a trait as a disease endophenotype confirms that the trait and the disease share a biological basis, thereby enabling genetic dissection of the endophenotype to inform disease risk. As breast density can be measured for any woman who has had a mammogram, studies investigating the genetic architecture of breast density could identify breast cancer risk variants that act through effects on this trait.

Abstract
Mammographic breast density (MBD) is a strong and highly heritable predictor of breast cancer risk and a biomarker for the disease. This study systematically assesses MBD as an endophenotype for breast cancer, that is, a quantitative trait that is heritable and genetically correlated with disease risk. Participants from the family-based kConFab Study and the 1994/1995 cross-sectional Busselton Health Study were divided into three status groups: cases, relatives of cases and controls. Participants' mammograms were used to measure absolute dense area (DA) and percentage dense area (PDA). To address each endophenotype criterion, linear mixed models were fitted and heritability analyses were conducted. Both measures of MBD were significantly associated with breast cancer risk in two independent samples. These measures were also highly heritable. Meta-analyses of both studies showed that MBD measures were higher in cases compared to relatives (β = 0.48, 95% CI = 0.10, 0.86 and β = 0.41, 95% CI = 0.06, 0.78 for DA and PDA, respectively) and in relatives compared to controls (β = 0.16, 95% CI = −0.24, 0.56 and β = 0.16, 95% CI = −0.21, 0.53 for DA and PDA, respectively). This study formally demonstrates, for the first time, that MBD is an endophenotype for breast cancer.


Introduction
Breast cancer is the second most commonly diagnosed cancer in Australia and the most commonly diagnosed cancer in females, with a lifetime prevalence of 1 in 7 [1]. Women with an affected first-degree female relative are at approximately two-fold greater risk of developing breast cancer than women from the general population. Rare, highly penetrant mutations in genes such as BRCA1/2, identified by linkage/positional cloning in breast cancer families more than 20 years ago, remain the single largest known genetic risk factor for breast cancer, accounting for ~30% of excess familial risk [2]. In recent years, large genome-wide association studies (GWAS) in unrelated individuals have identified many common, low-risk alleles that together account for an additional ~18% of familial risk [3]. New study designs are needed to identify the remaining "missing heritability" and causal risk variants that also contribute to inter-individual differences in breast cancer susceptibility.
Mammographic breast density (MBD) is a strong [4,5] and highly heritable [6,7] predictor of breast cancer risk and is considered a strong biomarker for the disease. MBD is the white appearance of epithelial and stromal tissue on a mammogram, in contrast to adipose (fatty) breast tissue, which appears dark. Dense breast tissue is quite common, with ~43% of screen-aged women estimated to have heterogeneously or extremely dense breasts [8]. MBD is a modifiable risk factor [9], and reducing MBD by medical interventions such as tamoxifen has been shown to be associated with significantly reduced breast cancer risk [10]. Previous investigations of the associations between known common breast cancer-susceptibility variants and MBD have demonstrated significant evidence of a shared genetic basis between MBD and breast cancer risk, with ~18% overlap of genetic associations [11]. However, MBD has never been formally examined as an endophenotype for breast cancer, that is, a quantitative trait that is heritable and genetically correlated with disease risk. Formally establishing a trait as a disease endophenotype confirms that the trait and the disease share a biological basis, thereby enabling genetic dissection of the endophenotype to inform disease risk. As MBD can be measured for any woman who has had a mammogram, study designs that quantitatively examine the genetic architecture of MBD could help identify risk variants for breast cancer that act through effects on this trait. As MBD is a modifiable risk factor, analysis of the genetic overlap could also help identify other, less invasive interventions that could be targeted at women at high risk of breast cancer and thereby aid prevention of the disease.
Candidate endophenotypes should be systematically assessed against five endophenotype criteria [12]: (i) the endophenotype is associated with disease in the population; (ii) the endophenotype is heritable; (iii) within families, endophenotype and disease co-segregate; (iv) the endophenotype found in affected family members is found in non-affected family members at a higher rate than in the general population; and (v) the endophenotype is primarily state independent (manifests in an individual whether or not illness is active).
This study aims to assess the suitability of MBD measures as endophenotypes for breast cancer. As it is accepted that MBD is state independent of breast cancer (i.e., breasts can be dense both with and without the presence of breast cancer) [13], we investigate the suitability of MBD against the remaining four endophenotype criteria using supporting data from two epidemiological studies: the kConFab Consortium and the Busselton Health Study.

Study Participants
Two study populations were used to assess the suitability of mammographic density measures as endophenotypes for breast cancer: kConFab and the Busselton Health Study (BHS).

kConFab
kConFab (The Kathleen Cuningham Foundation Consortium for research into Familial Breast cancer) has been collecting genetic, epidemiological, medical and psychosocial data from families with a strong history of breast cancer since 1997 and has accumulated data on more than 1400 multigenerational, multicase kindreds [14]. The consortium makes data and biospecimens widely available to researchers for use in peer-reviewed, ethically approved research. Methods for participant recruitment and data collection are described in detail elsewhere [14]. Via the Western Australian Department of Health Data Linkage Branch, we linked all kConFab participants residing in Western Australia (WA) with BreastScreen WA to obtain and measure their mammographic images. This was possible for 426 Western Australian kConFab participants from 197 families with more than 200 cases of breast cancer. For the current study, we selected families who were known not to carry BRCA1/2 mutations and who had members with available mammograms from BreastScreen WA (n = 405 participants from 183 families (plus friends) with 114 cancers).
Ethics approval was obtained from the Western Australian Department of Health Human Research Ethics Committee (#RGS0000002834) and the University of Western Australia Human Research Ethics Office (#RA/4/1/9183).

Busselton Health Study (BHS)
Busselton is a rural, historically stable community ~230 km south of Perth, WA, predominantly of British (Anglo-Saxon) expatriate origin. The BHS is one of the longest-running international epidemiological research programs, with repeated cross-sectional surveys of adults undertaken between 1966 and 2007. The recruitment and data collection of participants from the BHS have previously been described in detail [15]. In 1994/1995, a follow-up survey was conducted of all surviving participants previously surveyed, with approximately 5700 individuals attending. High-density single-nucleotide polymorphism (SNP) genotyping data are available for 4671 of these 1994/1995 BHS participants using either an Illumina 660 W or 610 W genome-wide association chip [16]. Other available data include obesity-related markers (measured at time of appointment by a research nurse), reproductive history, and exogenous hormone use. Via the Western Australian Department of Health Data Linkage Branch, we linked all 1994/1995 BHS participants with BreastScreen WA to obtain and measure their mammographic images, and with the WA Cancer Registry to obtain all breast cancer diagnoses from 1980 onwards.
Informed consent was obtained from all participants in the 1994/1995 survey and ethics approval was obtained from the University of Western Australia Human Research Ethics Committee (#RA4/1/6694). The current study was approved by the Western Australian Department of Health Human Research Ethics Committee (#RGS0000002801).

Status Allocation
The participants for both studies were divided into three status groups: cases, unaffected relatives of cases (henceforward relatives) and controls. For the kConFab participants, information collected about each participant's family was used to create pedigrees and assign each participant to a status group. For BHS participants, allocation to a status group was determined using the genetic relatedness matrix (GRM) generated from genome-wide SNP genotype data. BHS participants were assigned as a case if they had a WA Cancer Registry-confirmed breast cancer; as a relative if they were not a case and the GRM estimated their relatedness to a case to be greater than 0.0875 (a threshold chosen to capture first cousins or closer, where relatedness of 0.125 indicates third-degree relatives and 0.5 indicates first-degree relatives); or as a control if they were neither a case nor related to a case [17]. Within both studies, all relatives were breast cancer free. A case who was related to another case was assigned to the case group; their relationship to the other case was captured through the inclusion of the genetic relatedness matrix in all analyses.
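The BHS allocation rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function name `allocate_status`, the toy matrix and the use of NumPy are assumptions made here; only the 0.0875 threshold and the three group labels come from the text.

```python
import numpy as np

RELATIVE_THRESHOLD = 0.0875  # relatedness cut-off quoted in the text


def allocate_status(grm, is_case, threshold=RELATIVE_THRESHOLD):
    """Label each participant as 'case', 'relative' or 'control'.

    grm     : (n, n) symmetric array of pairwise relatedness estimates
    is_case : length-n boolean sequence (registry-confirmed breast cancer)
    """
    grm = np.asarray(grm, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    status = np.full(len(is_case), "control", dtype=object)
    if is_case.any():
        # Highest relatedness of each participant to any case.
        max_to_case = grm[:, is_case].max(axis=1)
        status[max_to_case > threshold] = "relative"
    # Cases always stay in the case group, even if related to another case.
    status[is_case] = "case"
    return status.tolist()


# Hypothetical toy data: participant 0 is a case; participants 1 and 2 are a
# first-degree relative and a first cousin; participant 3 is below the cut-off.
toy_grm = np.array([
    [1.0,   0.5,  0.125, 0.05],
    [0.5,   1.0,  0.0,   0.0],
    [0.125, 0.0,  1.0,   0.0],
    [0.05,  0.0,  0.0,   1.0],
])
statuses = allocate_status(toy_grm, [True, False, False, False])
```

Taking the maximum relatedness to any case means a single sufficiently close affected relative is enough to move a participant out of the control group, which matches the wording of the rule.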

Mammogram Selection
Cranio-caudal film mammograms were retrieved from BreastScreen WA, digitized and measured by author JS using the Cumulus software (Sunnybrook Health Sciences Centre, Toronto, ON, Canada). Where mammograms from multiple screening visits were available, the pre-diagnosis mammogram closest to the diagnosis date was selected for cases. For relatives and controls, the mammogram closest to the epidemiological data collection date was selected. If a relative or control had no epidemiological data, the earliest mammogram available was selected.
The MBD measurements included absolute dense area and percentage dense area. Percentage dense area could not be measured for six participants' mammograms from the kConFab study, as the images had bad edges and thus total breast size could not be measured. Mammograms were measured twice for 10% of participants to assess reliability. Intraclass correlation coefficients for absolute dense area and percentage dense area were 0.98 in the kConFab Study and 0.99 in the BHS.

Data Analysis
Analyses were conducted in R version 3.6.3 [18] and Genome-wide Complex Trait Analysis (GCTA) [19]. Medians and interquartile ranges (IQR) were used to describe the study populations. Participants missing body mass index (BMI) information were excluded from analyses. Descriptive analyses showed that those missing BMI were similarly aged at the time of their mammogram but had lower breast density than those who had reported BMI. The GRMs were estimated using the pedigrees deduced from the family relationship data collected during interviews for kConFab and from the genome-wide SNP data for BHS. The latter was estimated using Linkage Disequilibrium Adjusted Kinships (LDAK) software [20] as described previously [16]. Relatedness was set to zero for pairs with relatedness below 0.05 in the BHS GRM, as this has been shown to reduce potential bias in heritability and genetic correlation estimates from using both closely and distantly related individuals [21]. All regression analyses were adjusted for age at mammogram, BMI, time between mammogram and when BMI was reported, and the GRM. In addition, all BHS analyses included number of live births and a menopause status variable (defined as 1 if the woman had reported her periods had stopped and was not taking hormone replacement therapy, and 0 otherwise).
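The thresholding of the BHS GRM can be illustrated as follows. This is a minimal sketch, not the LDAK or GCTA procedure itself. It assumes the cut-off applies to off-diagonal (pairwise) entries only and that the diagonal is left untouched; the function name `threshold_grm` is hypothetical.

```python
import numpy as np


def threshold_grm(grm, cutoff=0.05):
    """Return a copy of the GRM with weak pairwise relatedness set to zero.

    Assumption: only off-diagonal entries below `cutoff` are zeroed;
    each participant's relatedness to themselves (the diagonal) is kept.
    """
    grm = np.array(grm, dtype=float)  # np.array copies, so the input is not modified
    off_diagonal = ~np.eye(grm.shape[0], dtype=bool)
    grm[off_diagonal & (grm < cutoff)] = 0.0
    return grm


# Hypothetical 3 x 3 example: one close pair (0.5) and two near-zero pairs.
raw = [[1.0, 0.03, 0.5],
       [0.03, 1.0, -0.01],
       [0.5, -0.01, 1.0]]
cleaned = threshold_grm(raw)
```

SNP-based relatedness estimates can be slightly negative for unrelated pairs; under the assumption above, those entries are also set to zero.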

Test of Criterion (i)-The Endophenotype Is Associated with Illness in the Population
We assessed the association of absolute dense area and percentage dense area with breast cancer by testing for differences between cases and the combined group of relatives and controls. Generalised linear mixed models with a binomial distribution and logit link function (R package GMMAT [22]) were used to compare the status groups, adjusting for age at mammogram, BMI, time between mammogram and BMI collection (plus number of live births and menopause status for BHS), and the GRM as a random effect, using Wald tests. For some models, adjustment for the GRM resulted in the variance estimate lying on the boundary of the parameter space. After additional testing involving the removal of highly influential observations, we determined that the estimates changed minimally when the GRM was removed, and so for these models we report the estimates that do not adjust for the GRM.
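Because a logit-link model reports odds ratios per single unit of the predictor (for example per 1 cm² of dense area), the estimates look small until rescaled. The sketch below shows the arithmetic only. The helper names and the illustrative value of 1.015 are assumptions for demonstration, and the code does not reproduce the GMMAT mixed model.

```python
import math


def odds_ratio_from_beta(beta, se, z=1.96):
    """Convert a log-odds coefficient and its standard error into an
    odds ratio with a Wald-type 95% confidence interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def rescale_odds_ratio(or_per_unit, units):
    """Odds ratio for a `units`-sized increase, given the per-unit odds ratio.

    On the log-odds scale the effect is linear, so the odds ratio
    multiplies: OR(units) = OR(1) ** units.
    """
    return or_per_unit ** units


# Illustrative only: a per-unit odds ratio of 1.015 rescaled to a 10-unit increase.
or_per_10_units = rescale_odds_ratio(1.015, 10)
```

The same rescaling applies to both confidence limits, since they are also exponentiated linear quantities.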

Test of Criterion (ii)-The Endophenotype Is Heritable
Narrow-sense heritability is the proportion of the variability of the phenotype that can be attributed to additive genetic variation. Heritability estimation for absolute dense area ($h^2_{DA}$) and percentage dense area ($h^2_{PDA}$) was performed using restricted maximum likelihood analysis in the Genome-wide Complex Trait Analysis (GCTA) software [19]. Square root transformations were used for both MBD measurements to normalise distributions, and analyses were adjusted for age at mammogram, BMI, time between mammogram and BMI collection, and the GRM, plus number of live births and menopause status for BHS.
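In symbols, the definition above is commonly written as a ratio of variance components. The notation below is an assumption introduced here for illustration and does not appear in the source: $\sigma_A^2$ is the additive genetic variance, $\sigma_e^2$ the residual variance, and $\sigma_P^2$ the phenotypic variance remaining after covariate adjustment.

```latex
h^2 = \frac{\sigma_A^2}{\sigma_P^2}, \qquad \sigma_P^2 = \sigma_A^2 + \sigma_e^2
```

Under this reading, an estimate of $h^2 = 0.5$ would mean that half of the adjusted variance in the density measure is attributed to additive genetic effects.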

Test of Criteria (iii)-Within Families, Endophenotype and Illness Co-Segregate and (iv)-The Endophenotype Found in Affected Family Members Is Found in Non-Affected Family Members at a Higher Rate than in the General Population
Linear mixed models and estimated marginal means were used to assess whether the density measures differed across the three status groups: cases, relatives and controls. This is akin to testing genetic correlation, and given the non-traditional family structures within these samples, comparison of means between the three status groups was considered the most statistically powerful method to assess this. A square root transformation was applied to the MBD measures to normalise the distributions, and models were adjusted for age at mammogram, BMI, time between mammogram and BMI collection, and the GRM, plus number of live births and menopause status for BHS. As before, models with variance estimates on the boundary of the parameter space were assessed and are presented without GRM adjustment. A meta-analysis of the results from both studies was conducted using the R library 'meta'.
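The pooling step can be illustrated with a fixed-effect (inverse-variance) meta-analysis, the kind reported in the footnote to Table 3. The following Python sketch is a stand-in for the R 'meta' package used in the study, not a reproduction of it. The function names are hypothetical, and recovering a standard error from a 95% confidence interval assumes a symmetric, normal-based interval.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Recover a standard error from a symmetric normal-based 95% CI (assumption)."""
    return (upper - lower) / (2 * z)


def fixed_effect_meta(estimates, std_errors, z=1.96):
    """Inverse-variance weighted fixed-effect pooled estimate.

    Each study is weighted by 1 / SE^2, so more precise studies
    contribute more to the pooled effect.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    ci = (pooled - z * pooled_se, pooled + z * pooled_se)
    return pooled, pooled_se, ci


# Hypothetical inputs: two studies with equal precision.
pooled, pooled_se, ci = fixed_effect_meta([1.0, 3.0], [1.0, 1.0])
```

With equal standard errors the pooled estimate is simply the average of the two study estimates, and its standard error shrinks by a factor of the square root of two.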

Results
The final number of participants was 323 within the kConFab study (88 cases, 179 relatives and 56 controls) and 1587 within the BHS (92 cases, 72 relatives and 1423 controls). Descriptive statistics are shown in Table 1. For both study populations, the median age for cases was higher than for controls and relatives. BMI was similar across the groups, with relatives having slightly lower median BMIs than cases and controls in both studies. The mean relatedness of the relatives was 0.33 within the kConFab study and 0.35 within the BHS, with approximately half of relatives in both studies being either sister pairs or mother/daughter pairs.

Table 2 shows the associations between the MBD measures and breast cancer risk for both the kConFab and BHS studies. Within the kConFab study, an increase in dense area of 1 cm² was associated with increased odds of breast cancer (OR = 1.015; 95% CI = 1.002, 1.029), compared to women with no breast cancer (relatives and controls). A smaller effect size (OR = 1.009; 95% CI = 0.998, 1.019) was observed within the BHS. Similarly, an increase of 1% in percent dense area was associated with increased odds of breast cancer within kConFab (OR = 1.019, 95% CI = 1.002, 1.037), with a smaller increase in the BHS (OR = 1.011, 95% CI = 0.998, 1.025).

Table 3 shows the linear regression estimates of the association between MBD measures and case-relative-control status for both the kConFab and BHS studies. Within the kConFab study, cases had higher absolute dense area (β = 6.93, 95% CI = 0.174, 13.68) and percentage dense area (β = 0.818, 95% CI = 0.177, 1.46) than relatives. Similarly, within the BHS, cases also had higher absolute dense area (β = 0.429, 95% CI = 0.009, 0.850) and percentage dense area (β = 0.391, 95% CI = 0.007, 0.790). When compared to controls, however, the effect sizes were slightly smaller. These higher MBD measure estimates among cases compared to controls remained after meta-analysing the results of both studies.

Table 2. Logistic regression estimates (odds ratios (OR) and 95% confidence intervals (CI)) showing the associations between MBD measures and breast cancer risk for both the kConFab and BHS studies.
1 kConFab models adjusted for age, BMI, time between BMI measurement and mammogram, and the GRM.
2 BHS models adjusted for age, BMI, time between BMI measurement and mammogram, number of live births and menopause status. Additional adjustment for the GRM resulted in the variance estimate lying on the boundary of the parameter space. After additional testing involving removal of highly influential observations, we determined that the estimates changed minimally when the GRM was removed, and so have reported the models that do not adjust for the GRM.
3 Six relatives are missing percentage dense area measurements due to poor mammogram quality.
4 p value calculated using Wald test.
Abbreviations: BHS: Busselton Health Study, BMI: body mass index, OR: odds ratio, and CI: confidence interval.

Table 3. Linear regression estimates (β) of the associations between the MBD measures and case-relative-control status. Bold type indicates statistical significance at α = 0.05. Dense area and percentage dense area were square root transformed and all models were adjusted for age, BMI, time between BMI measurement and mammogram, and the GRM unless otherwise stated. In addition, BHS models included adjustment for number of live births and menopause status.
1 The variance estimate was on the boundary of the parameter space, so the model with a square root transformed outcome that included the GRM could not be fitted. This is the effect estimate with dense area not transformed. The effect estimate with dense area square root transformed and no GRM included is: 1.022 (0.327, 1.717) (SE: 0.351).
2 Meta-analysis results presented are from a fixed-effect meta-analysis (all tests of heterogeneity p > 0.10).
Abbreviations: BHS: Busselton Health Study, BMI: body mass index, and CI: confidence interval.

Test of Criterion (ii)-MBD Is Heritable
The estimated heritabilities of absolute dense area and percentage dense area were significant in both studies and were higher in kConFab ($h^2_{DA}$ = 0.587, $p_{DA}$ = 0.002; $h^2_{PDA}$ = 0.658, $p_{PDA}$ = 0.005) than in the BHS ($h^2_{DA}$ = 0.398, $p_{DA}$ < 0.001; $h^2_{PDA}$ = 0.312, $p_{PDA}$ < 0.001).
For both studies, less evidence was seen for higher MBD measures among relatives compared to controls. The meta-analysis of both study results found slightly higher absolute dense area (β = 0.161, 95% CI = −0.236, 0.556) and percentage dense area (β = 0.162, 95% CI = −0.210, 0.534) among relatives compared to controls. However, the evidence was weak. Figures 1 and 2 show the estimated marginal means for case-relative-control status for each MBD measure across both studies. The estimated marginal mean of each MBD measure increases as the status changes from control to relative to case in both studies. However, the confidence intervals are overlapping.

Discussion
We have systematically assessed, for the first time, whether breast density is an endophenotype for breast cancer using five endophenotype criteria. The results of this study show that mammographic breast density (absolute and percent dense area measures) meets most of the criteria for being an endophenotype for breast cancer. Using two independent samples with a combined sample size of 1910, we provide evidence that (i) MBD is associated with breast cancer, (ii) MBD measures are heritable, (iii) within families, MBD and breast cancer co-segregate, and (iv) MBD measures within relatives of breast cancer cases are higher than in the general population.

Criterion 1: MBD Is Associated with Breast Cancer Risk
We found that both MBD measures (dense area and percent dense area) were positively associated with breast cancer risk, independent of age and BMI. These findings replicate the well-established knowledge that MBD is an independent risk factor for breast cancer [4,5], and our estimates are consistent with earlier studies investigating per-unit increases in absolute and percent dense area [23,24].

Criterion 2: MBD Is Heritable
Consistent with the literature, we estimated the heritability of dense area and percent dense area to be 0.59 and 0.68, respectively, within the kConFab study. We have previously reported heritability estimates between 0.6 and 0.67 for percent dense area [6] and 0.65 for absolute dense area [7] within Australian and North American twin studies. The heritability estimates within the BHS were smaller ($h^2$ = 0.39 and $h^2$ = 0.30, respectively). Heritability estimates within kConFab were calculated using known familial relationships, while estimates within the BHS were calculated using the SNP-based relatedness estimates (and therefore represent the variation due only to the SNPs). The lower estimates within the BHS compared to kConFab are therefore likely due to the fact that total heritability (due to all genetic variation) is assessed in kConFab, whereas in the BHS the heritability estimate only reflects genetic variation captured by SNPs.

Criteria 3 and 4: Breast Cancer Segregates with Breast Density within Families, and Non-Affected Family Members Have an Intermediate (between Cases and Unrelated Controls) Breast Density
In our meta-analysis of the kConFab and BHS studies, we identified higher breast density in breast cancer cases, intermediate density in relatives of cases, and lower density in controls for both MBD measures. Evidence for these differences was strong when comparing cases and relatives but was limited when comparing relatives and controls. As the relatives of cases within the kConFab study may be more likely to have genetic variants predisposing them to breast cancer, the estimates involving relatives may be subject to selection bias. The differences in estimates involving relatives between kConFab and BHS might be due to both breast cancer and MBD having a greater genetic contribution (and lower environmental contribution) in the kConFab sample. The pooled marginal mean estimates were not significantly different (potentially due to lack of power), but did show an increase across each category, in line with cases having the highest density, relatives intermediate density and controls the lowest density. These associations suggest that there is a genetic component in common between breast cancer and MBD measures.
This study has a number of strengths. First, we had access to genetic, mammography, and breast cancer case status data from two epidemiological studies, representing 1910 women: 180 cases, 251 relatives of cases, and 1479 controls. The inclusion of family members within these studies allowed us to test the criteria among relatives of cases, who are often not available in population-based or case-control cohorts. Second, the use of two studies allowed us to validate our findings in an independent cohort, which is integral to genetic studies of this type. Third, we assessed the endophenotype criteria using two measures of MBD, percent and absolute dense area, and observed consistent results for both phenotypes. This is consistent with the strong genetic correlation between percent and absolute dense area (kConFab: rhoG = 0.938, p-value = 0.008 and BHS: rhoG = 0.946, p-value < 0.001).
However, this study has some limitations. First, we did not have BRCA1/2 status for the BHS, and therefore were unable to exclude women carrying these mutations. However, the prevalence of these mutations in a population-based cohort is low (<1%; [25]), and therefore the proportion of women with these mutations in our population-based study would be small. Second, our study samples consisted mainly of women with European ancestry, and therefore our results may not be generalisable to other ethnic groups. However, previous studies have shown that MBD measures are strongly associated with breast cancer risk across different ethnic groups [26,27]. Finally, BMI measures for some participants were not available, and as BMI is a critical MBD covariate, these participants had to be excluded from analyses.

Conclusions
In summary, we have shown through a comprehensive assessment of endophenotype criteria that two measures of breast density, dense area and percent dense area, are endophenotypes for breast cancer. As MBD is genetically correlated with breast cancer and can be measured on any woman who has had a mammogram (regardless of breast cancer status), genetic investigations of MBD may identify novel risk variants for breast cancer and help identify novel breast cancer mechanisms. Improved understanding of these genetic associations could also inform future research towards tailored screening programs and prevention strategies.