Comprehensive Analysis of Germline Variants in Mexican Patients with Hereditary Breast and Ovarian Cancer Susceptibility

Hereditary breast and ovarian cancer syndrome (HBOC) represents 5–10% of all patients with breast cancer and is associated with high-risk pathogenic alleles in the BRCA1/2 genes, which account for only 25% of cases. We aimed to find new pathogenic alleles in a panel of 143 cancer-predisposing genes in 300 Mexican cancer patients with suspicion of HBOC and 27 high-risk patients with a severe family history of cancer, using massive parallel sequencing. We found pathogenic variants in 23 genes, including BRCA1/2. In the group of cancer patients, 15% (46/300) had a pathogenic variant, 11% (33/300) harbored variants with unknown clinical significance (VUS) and 74% (221/300) were negative. In the high-risk group, 22% (6/27) of patients had pathogenic variants, 4% (1/27) had VUS and 74% (20/27) were negative. The most recurrent mutations were the Mexican founder deletion of exons 9-12 and the variant p.G228fs in BRCA1, each found in 5 of 17 patients with alterations in this gene. Rare VUS with potential impact at the protein level were found in 21 genes. Our results show for the first time in the Mexican population a higher contribution of pathogenic alleles in other cancer susceptibility genes (54%) than in BRCA1/2 (46%), highlighting the high locus heterogeneity of HBOC and the necessity of expanding genetic tests for this disease to include broader gene panels.


Introduction
Breast cancer (BC, OMIM#114480) is the most prevalent cancer in the world and accounts for 14.7 million of mortality cases [1]. Approximately 10% of BC cases have a genetic, inherited etiology referred to as Hereditary Breast and Ovarian Cancer (HBOC), with an important impact on genetic counseling and cancer prevention interventions [2].
Pathogenic variants in BRCA1 and BRCA2 genes are the most prevalent in HBOC, collectively contributing to 15-25% of the cases [3]. Pathogenic alleles in these genes frequently have high penetrance and have been found in different populations, including countries from Latin America, such as Mexico, Colombia, Argentina, Chile, Brazil, among others [4,5]. However, locus heterogeneity has been found in patients without mutations in BRCA1 and BRCA2 [6] together with additional pathogenic variants at lower frequency and in genes that confer moderate risk including CHEK2, PALB2, ATM, FANCM, ATR, STK11, RAD51C, BRIP1, CDH1, NF1, NBN and ERCC3 [7,8]. The prevalence of these novel, moderate-risk genes in HBOC patients has recently started to be defined by massive parallel sequencing (MPS). Studies indicate that causal variants have very low frequency in most of the populations studied and are spread in a larger array of genes that remain unexplored (Table 1). Until now, the contribution of pathogenic variants in genes other than BRCA1 and BRCA2 has not been entirely defined and studies in Latin American populations are still scarce. In recent years, important efforts to define common susceptibility loci for breast cancer in large cohorts have identified more than 90 SNPs, which predispose to this disease [26]. However, the risk conferred by these common susceptibility loci can explain up to 14% of hereditary breast cancer aggregation in the European population [27]. Additional SNPs remain to be discovered and association studies need to be conducted in other populations to better define the prevalence and clinical relevance of novel pathogenic alleles [28]. The identification of rare or population specific, high/moderate-risk pathogenic alleles could be translated into better molecular diagnosis, personalized risk assessment and treatment [23].
To determine the prevalence of pathogenic variants in cancer predisposing genes in Mexican patients, an understudied mixed population, and the potential benefit for molecular diagnosis with gene panel testing, we performed a germline genetic analysis in 327 patients with a clinical indication of HBOC. We analyzed all cases using a panel of 143 genes associated with different inherited oncologic diseases, by massive parallel sequencing.

Clinical and Epidemiological Description of Breast Cancer Cases
Clinical and pathological characteristics of a total of 300 sequenced cases diagnosed with breast cancer are described in Table 2. Mean age at diagnosis was 41 years (range 23-69, SD: 7.3). Seventy one percent of cases had a family history of cancer, 85% reported at least one pregnancy and the average parity was 3 children (SD: 1.6), 60% never used oral contraceptives and 93% reported not being current alcohol drinkers. Importantly, sixty two percent of all cases were overweight, obese or extremely obese. Mutational status was defined as the presence of a pathogenic or likely pathogenic variant (American College of Medical Genetics and Genomics classification) in any of the 143 genes evaluated [29]. Fifteen percent of this group had a pathogenic or likely pathogenic variant. Age at diagnosis was the only epidemiological characteristic statistically associated with mutational status (p = 0.04). No association was found between stage, histological subtype, hormone receptor status and mutational status in cases. Analysis by individual gene showed no association between presence of a mutation and a clinical or pathological characteristic.
Patients in the older age group (60-69 years) were characterized by presenting with early stage tumors (I/II) and absence of mutations.

Pathogenic Variants in Familial Breast Cancer Risk Patients without Cancer Diagnosis
In the group of 27 patients without cancer and with suspicion of familial breast cancer risk, we found pathogenic variants in 6 individuals (22%) (Figure 2

Description of Variants with Unknown Clinical Significance by Phosphorylation Site Disruption Analysis
We found 38 VUS in 21 genes, 4 of which were found in homozygosity (Table S2). These VUS have MAF < 0.001 in the ExAC and 1000 Genomes databases, and not all of them are classified as VUS in ClinVar. To better define the potential effect of VUS on gene functionality, we evaluated the impact of the amino acid change in the context of phosphorylation sites. There was no enrichment in these sites for the occurrence of VUS. However, changes that potentially affect the phosphorylation regulation were found in the AIP and APC genes (Figure S2). The changes affected the FKBP C domain and APC basic domain for AIP and APC, respectively.

Discussion
In this work, we evaluated genetic alterations in an expanded panel of 143 genes associated with inherited oncologic diseases, including breast, colon and gastric cancer, among others, by MPS in two groups of high-risk HBOC patients. This is the first study in a Latin American population that analyzes a large cancer risk gene panel by MPS. Across all the individuals included in this study, we detected pathogenic variants in 16% (52/327), including 7% (24/327) with variants in BRCA1/2 and 8% (28/327) with variants in genes other than BRCA1/2 (Table 1). These mutations were found in 21 genes previously associated with more than 25 inherited conditions related to cancer (Table 3). Globally, 8% (27/327) of patients had a pathogenic mutation in one of the genes categorized by the American College of Medical Genetics (ACMG) as a secondary finding with clinical validity and utility to improve medical outcome [30]. Interestingly, half of the pathogenic variants, 50% (26/52), have not been reported before in any Latin American population, which highlights the current need to expand the evaluation of the genetic diversity of under-studied, mixed populations such as Mexicans and its association with HBOC. These results also confirm the high level of locus heterogeneity that has been described for HBOC [6,23,31] (Table 1).
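The headline proportions can be re-derived from the counts reported in this paragraph. The following Python sketch is only an arithmetic check; the helper name `pct` is illustrative and not part of the study's analysis.

```python
def pct(numerator: int, denominator: int) -> int:
    """Return a proportion as a percentage rounded to the nearest integer."""
    return round(100 * numerator / denominator)

# Counts taken from the text: 327 individuals, 52 carriers of a pathogenic
# variant, of whom 24 carry BRCA1/2 variants and 28 carry non-BRCA variants.
total_individuals = 327
carriers = 52
brca_carriers = 24
non_brca_carriers = 28

overall_rate = pct(carriers, total_individuals)      # share of all individuals
brca_share = pct(brca_carriers, carriers)            # share among carriers
non_brca_share = pct(non_brca_carriers, carriers)    # share among carriers

print(overall_rate, brca_share, non_brca_share)  # prints "16 46 54"
```

Note that the 46% and 54% figures use the 52 carriers as the denominator, whereas the 16% figure uses all 327 individuals.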
Age at diagnosis was the only epidemiological or clinical variable associated with the presence of a pathogenic mutation in breast cancer cases, supporting the NCCN criteria for HBOC. Several studies have identified additional life style and genetic risk factors modifying the penetrance in BRCA1/2 mutation carriers [32]. A recent meta-analysis evaluated potentially risk-modification factors for BRCA1 and BRCA2 carriers such as age at first pregnancy, parity, breastfeeding, use of oral contraceptives, smoking and radiation exposure [32]. The loss of at least 10 pounds of body weight before the age of 30 was associated with a reduced risk of BC between 30 to 49 years in BRCA1 mutation carriers [32]. Interestingly, in our analysis around 60% of BC cases were overweight or obese at the time of diagnosis but only 15% of patients had a pathogenic mutation in any of the 143 genes evaluated.
In addition, genetic studies of risk-modifiers focused in the BRCA1 and BRCA2 genes have identified 26 and 16 SNPs associated with BC risk in BRCA1/2 mutation carriers, which have small associated effect sizes (1.05-1.26) per copy of the minor allele [10,33,34]. Given the descriptive focus of our study, the possible combined effect of common low-risk alleles with the detected pathogenic variants was not evaluated. However, these genetic risk-modifiers are thought to account for less than 10% of the genetic variance [10,33,35]. The lack of evidence that associates these modifying factors with pathogenic variants in other genes of high-and medium-penetrance that participate in the development of HBOC is an unsolved concern. These potential allelic interactions could act as genetic modifiers of the risk of pathogenic variants present (especially) in low penetrance genes and might account for the clinical differences in disease presentation and outcome [36].
To our knowledge there is no information on modifiable risk factors for HBOC pathogenic variant carriers in Latin America available to compare the findings from our study. Larger prospective studies on HBOC mutation carriers that incorporate information on a variety of environmental exposures, ancestry and lifestyle factors are required to identify modifying risk factors in Latin America. These studies should include index patients and selected families in diverse representative populations to provide (i) reliable estimates of the allelic frequencies of the pathogenic alleles and modifying variants, (ii) the risk they confer and that may ultimately (iii) facilitate genetic counseling for patients carrying pathogenic variants with demonstrated clinical utility.
An interesting finding was that patients in the older age group (60-69 years) were characterized by presenting with early stage tumors (I/II) and absence of mutations even though they fulfilled the NCCN criteria for HBOC. Given the late presentation and the early stage of the disease, it is possible that these patients may have single pathogenic variants in genes or loci of lower risk not included in our analysis. Another possibility is that these patients may carry a combination of different low-risk loci (not identified in this work) that have additive or epistatic effects, as has been observed in other types of cancer [37,38]. Some reports have previously shown a series of low-risk alleles in genes involved in DNA repair, modification and metabolism related pathways, which act in concert to increase the risk of BC [39][40][41][42]. With the further generalized implementation of WES and WGS population-scale studies, these potential multi-allelic interactions will be identified. In addition, a higher frequency of early stage tumors in patients above 60 years with a strong family history of BC might be explained by the increased awareness of this group to comply with current BC screening guidelines [43]. This represents a direct and additional benefit for early detection when identifying high-risk BC cases.
Pathogenic alterations in the HBOC moderate-risk genes ATM, CHEK2 and NBN, were found in a frequency of 0.6%, 0.3%, and 0.3%, respectively. Additionally, we found 7 monoallelic pathogenic variants (2%, 7/327) in 6 genes (besides BRCA1/2) of the interstrand crosslink DNA repair Fanconi anemia pathway (FANCB, FANCC, FANCF, FANCI, FANCL, FANCM) and 1 in RAD51C, a Fanconi-like phenotype gene [44]. The allelic frequency of these variants in the Latin American population spans the 0-0.0015 interval (ExAC). These results confirm findings from other multi-gene panel studies in HBOC patients (Table 1). Although strong evidence regarding the contribution of mutations in some Fanconi anemia genes to HBOC is still limited [45,46], our results provide additional support for this potential association.
Interestingly, we detected pathogenic variants in MSR1, LIG4 and PDE11A, genes not previously associated with HBOC, both in BC patients and high-risk cases. Moreover, the mutation MSR1 p.R293X was found in two unrelated patients. This mutation has been associated with Barrett's esophagus and esophageal adenocarcinoma in European families [47], and with hereditary prostate cancer [48], although contradictory results also exist [49,50]. The mutation p.R505fs in the non-homologous end joining (NHEJ) ligase LIG4, found in one patient, has an allele frequency of 0.0000247 in ExAC. This mutation is located in the ATP-dependent DNA ligase C-terminal region and produces a protein lacking both the BRCT-I and BRCT-II domains that are required for chromatin binding [51], abrogating LIG4 function. Recently, germline mutations in LIG4 have been suggested to predispose to diffuse large B-cell lymphomas [52] and to sensitize cells to ionizing radiation, causing immunodeficiency and delay in growth and development in homozygous or compound heterozygous carriers (OMIM#606593) [53]. Given the biochemical function of LIG4 in NHEJ and the low prevalence of its mutations, germline monoallelic mutations could influence HBOC risk, although additional studies are needed to establish this association. In one patient we found a frameshift pathogenic variant, p.G57fs, in PDE11A, a gene previously associated with different neoplasms including Carney multiple neoplasia complex, prostate cancer and testicular germ cell tumors [54,55].
Overall, 10.8% of patients that were negative for a pathogenic mutation in any of the 143 genes tested had VUS defined following the ACMG criteria. VUS constitute a universal concern in cancer genetics diagnostic settings. The risk conferred by VUS must be addressed by generating more evidence of their allelic frequency in different populations, and by conducting co-segregation analyses, as well as efforts to define their function at protein level using experimental models. Remarkably, we found 2 VUS (AIP p.V49M in homozygosis and APC p.S2535G in heterozygosis) that potentially affect the phosphorylation regulation of the protein. In fact, mutations in the chaperone aryl hydrocarbon receptor-interacting protein (AIP) have been found in familial cases of pituitary adenomas [56]. Experimental in vitro evidence showed that AIP V49M interferes with AIP activity and stability [57]. APC p.S2535G was predicted to disrupt a phosphorylation site in the protein basic domain, which interacts with the microtubules [58]. Neither this amino acid change nor any other change in this position has been reported in COSMIC, OMIM or ClinVar. Consequently, further functional studies are needed to determine the impact of the APC p.S2535G variant.
On the other hand, seventy-four percent of all patients did not harbor alterations in any of the 143 genes studied. Even though we tested the deletion of exons 9-12 in BRCA1, a mutation with a founder effect and the highest frequency reported in the Mexican population [59], additional large rearrangements undetectable by MPS could account for the lack of mutation detection in this group of patients. The frequency of large genomic rearrangements in BRCA1/2 varies considerably among populations, but higher frequencies are related to founder effect variants [60]. In our study, we found that almost one out of three patients (5/17) with BRCA1 pathogenic variants had the Mexican founder mutation (deletion of exons 9-12), which highlights the additional value of evaluating this alteration through a rapid test, such as the one we used. It is also possible that patients who tested negative for all of the genes evaluated may harbor variants in noncoding regions that we did not analyze. Additional mechanisms of pathogenesis that may play a role in susceptibility to BC might include pathogenic variants affecting splicing mechanisms that disrupt RNA-binding protein (RBBSs) and splicing regulatory (SRBSs) binding sites, as well as transcription factor binding site disruption or promoter mutations [31,61].
An additional limitation of this study is the lack of population paired-controls, which could lead to the wrong attribution of common, non-deleterious variants as pathogenic, and which also could eliminate an important amount of common VUS present in this population. It has been described that each person carries up to 100 loss-of-function variants, thirty of which could be in homozygosis, and these are not necessarily disease-causing variants [62,63]. To exclude non-pathogenic natural variation, we used the largest international databases available (ExAC, 1000G, ESP) in our filtering algorithm. These repositories contain whole genome and exome information from a large number (N = 60,706) of sequenced individuals, with broad ancestral diversity, including 5789 Latinos (2254 males and 3535 females). It has been reported that ExAC is not overrepresented for pathogenic variants, which supports its use to estimate normal variation [64]. We estimate that this strategy may have helped to mitigate the effect of the lack of paired-controls on our study.
Future germline analyses of cohort studies and population based case-control studies specifically focused on underrepresented populations such as the Latin American region and including women with BC with and without HBOC susceptibility are necessary to validate our results.
Overall, we found 54% of pathogenic variants in genes other than BRCA1 and BRCA2. Consistently, other studies using a panel sequencing approach have found a proportion of 5-64% of pathogenic variants in non-BRCA genes (Table 1). The rate of pathogenic variant detection in these studies tends to depend on the total number of genes analyzed, rather than the number of individuals studied. For example, the largest study evaluated a panel of 21 genes in 65,057 patients with BC and found pathogenic variants in 8 non-BRCA genes [10], whereas our study, using a panel of 143 genes in 327 individuals, detected pathogenic variants in 21 non-BRCA cancer-associated genes. Therefore, to further elucidate the wider variation in genes with pathogenic variants that influence HBOC, more studies that investigate larger panels, or ultimately whole exomes or genomes, are needed. Likewise, penetrance and polygenic analyses of rare and common variation will help to provide a more accurate assessment of genetic cancer risk in the clinical setting.
The findings of this work have relevant clinical and public health implications. The frequency of mutations we found in high and intermediate penetrance HBOC-associated genes emphasizes the additional advantage of using complete gene panel testing instead of a single selected mutation approach, increasing the number of identified high-risk individuals who might benefit from personalized prevention or clinical intervention programs. In addition, the implementation of extended gene panels constitutes an efficient development to accelerate the detection of a broader number of high-risk mutation carriers. We detected patients with mutations in BRCA1 and BRCA2 but also in the MLH1, SDHB and PTEN genes that are suggested to be reported by the ACMG given their risk to develop other hereditary conditions [30]. The gradual adoption of clinically informative gene panel testing along with genetic counseling programs will eventually be a key component in the prevention of cancer and other genetic diseases. Lastly, our results highlight the current necessity for the establishment of prospective cohorts in understudied populations, such as the Latin American, to better establish factors that modify penetrance and to identify association relationships of new genetic variants with the disease. These studies could provide enough evidence to direct public health guidelines for risk assessment programs in specific populations including genetic testing recommendations, lifestyle and treatment interventions.

After the review of their clinical records and age of onset (<45 years), all patients with suspicion of HBOC were invited to participate in this study. After a complete and detailed explanation of the study and written informed consent, a questionnaire of enrollment was used to evaluate the fulfillment of inclusion criteria.

Sample Preparation and DNA Extraction
For all patients enrolled, 4 mL samples of blood were collected and stored locally at −80 °C. The period between sample collection and freezing never exceeded 36 h. Peripheral blood DNA was extracted with the DNeasy Blood & Tissue Kit (Qiagen, Hilden, Germany) following manufacturer's instructions. DNA concentration was quantified with the Qubit dsDNA HS Assay Kit (Invitrogen, Carlsbad, USA) and the integrity and purity of the material were verified by agarose gel electrophoresis and spectrophotometry, respectively.

Library Preparation and Massive Parallel Sequencing
Peripheral blood DNA was used for library preparation with the GeneRead Cancer Predisposition V2 Kit (Qiagen, Hilden, Germany), which targets 143 genes whose loss of function is a well-known mechanism associated with 88 inherited oncologic diseases based on data from the College of American Pathologists (CAP) guidelines, NCCN guidelines, late-stage clinical trials, The Cancer Genome Atlas (TCGA), and Ingenuity® Knowledge Base. The amplification was divided into 4-pool PCR reactions with a total of 6582 amplicons. Paired-end sequencing was performed with the MiSeq System platform (Illumina, San Diego, USA). Briefly, 40 ± 2.5 ng of DNA was amplified with the GeneRead DNAseq Gene Panel Kit (Qiagen, Hilden, Germany) and purified with Agencourt AMPure XP magnetic beads (Beckman Coulter, Brea, USA). The amplified fragments were end-repaired and dA-tailed, and the adapter GeneRead Adapter 1 Set plex (Qiagen, Hilden, Germany) was ligated using the GeneRead DNA Library I Core Kit. Amplified segments were then size-selected (200-300 bp) using Agencourt AMPure XP magnetic beads (Beckman Coulter, Brea, USA). New England Biolabs (Ipswich, USA) barcodes were incorporated by PCR amplification in 10 PCR cycles and the products were purified. The libraries were diluted to 4.0 nM and were pooled in batches of 60-80 samples. Library quality was evaluated by DNA quantification with Qubit after size-selection, and by Bioanalyzer (Agilent, Santa Clara, USA) profiling with the High Sensitivity DNA Kit after amplification of adaptor-ligated molecules and final library pooling. Pooled barcoded libraries were diluted to 15.0 pM and sequenced with a MiSeq Reagent Kit V2 2 × 150 cycles (Illumina, San Diego, USA) to reach a theoretical average coverage of 100× for each sample.

Pathogenic Variant Detection
Alignment and variant calling were performed with BWA and GATK (Broad Institute, Cambridge, USA). FASTQ files were aligned to the human genome reference hg19 with BWA-MEM; indels were realigned and bases recalibrated. Adaptors were soft-clipped and reads with <20 bp were eliminated. The overall mean sequencing depth of all 327 samples was 70.3× (SD: 21.35) with a range of 30-156×, excluding one sample with depth 20× (Figure S1). Variant calling was done with HaplotypeCaller (Broad Institute, Cambridge, USA). Variants were annotated with ANNOVAR and InterVar [65,66]. Mutation description follows Human Genome Variation Society (HGVS) nomenclature (http://www.hgvs.org/). Variant classification followed the five-tier criteria of the American College of Medical Genetics and Genomics (ACMG) [29] and was manually curated. We excluded variants that were synonymous, with depth <5.0× or with mutant allele fraction <20%, and those present in homopolymeric tracts >8 bp. All splicing and null variants (stop-gain/loss, frameshift indels) and missense variants defined as pathogenic in ClinVar were considered unequivocally pathogenic (https://www.ncbi.nlm.nih.gov/clinvar). Null variants present at the 3′ extreme end of the gene that were reported as conflicting in ClinVar were classified as variants of unknown clinical significance (VUS). Minor allelic frequency <0.001 in either the ExAC database, the 1000 Genomes (1000G) project or the Exome Sequencing Project (ESP6500) was used to capture rare, potentially pathogenic null and missense variants. Low frequency (<0.001) missense variants predicted as deleterious alleles by SIFT or PolyPhen-2 but with no further evidence of pathogenicity in vitro/vivo or clinically were classified as VUS. All filtered variants were manually curated by inspection of the BAM files with the IGV software (Broad Institute). All pathogenic variants were confirmed by two independent assays of Sanger sequencing.
Variants in BRCA1 and BRCA2 were further assessed in the Huntsman Cancer Institute Breast Cancer Genes Prior Probabilities site (http://priors.hci.utah.edu/PRIORS/index.php) to evaluate their potential impact. Variants in MLH1, MSH2, MSH6, PMS2 were also investigated in the Leiden Open Variation Database (http://hci-lovd.hci.utah.edu/home.php).
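The exclusion and triage rules described in this section can be sketched in Python as follows. This is a minimal illustration and not the pipeline used in the study: the `Variant` record, its field names, the label strings and the "not reported" fallback are hypothetical, and the sketch assumes that a variant must have MAF < 0.001 in every population database in which it is observed, which is one possible reading of the frequency criterion. Manual curation and Sanger confirmation are not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Consequence labels are illustrative, not actual ANNOVAR output strings.
NULL_CONSEQUENCES = {"stopgain", "stoploss", "frameshift", "splicing"}

@dataclass
class Variant:
    consequence: str                      # e.g. "missense", "frameshift"
    depth: int                            # read depth at the site
    alt_fraction: float                   # mutant allele fraction (0-1)
    homopolymer_bp: int = 0               # length of surrounding homopolymer
    pop_maf: Dict[str, float] = field(default_factory=dict)  # ExAC/1000G/ESP
    clinvar: Optional[str] = None         # "pathogenic", "conflicting", ...
    at_3prime_end: bool = False           # null variant at extreme 3' end
    predicted_deleterious: bool = False   # by SIFT or PolyPhen-2

def passes_filters(v: Variant) -> bool:
    """Apply the exclusion thresholds quoted in the text."""
    if v.consequence == "synonymous":
        return False
    if v.depth < 5 or v.alt_fraction < 0.20:
        return False
    if v.homopolymer_bp > 8:
        return False
    # Assumption: rare in every database where the variant is observed.
    return all(maf < 0.001 for maf in v.pop_maf.values())

def triage(v: Variant) -> str:
    """Return a provisional label prior to manual curation."""
    if not passes_filters(v):
        return "excluded"
    if v.consequence in NULL_CONSEQUENCES:
        if v.at_3prime_end and v.clinvar == "conflicting":
            return "VUS"
        return "pathogenic"
    if v.consequence == "missense":
        if v.clinvar == "pathogenic":
            return "pathogenic"
        if v.predicted_deleterious:
            return "VUS"
    return "not reported"  # hypothetical fallback, not defined in the text
```

For example, under these assumptions `triage(Variant("frameshift", depth=80, alt_fraction=0.48, pop_maf={"ExAC": 0.0000247}))` yields `"pathogenic"`, while any variant with depth below 5× is excluded regardless of its consequence.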

Detection of Exon 9-12 Deletion in BRCA1
The founder deletion of exons 9-12 was detected by PCR amplification of the mutant and wild-type alleles, using specific primers based on the method of Weitzel et al. [59]. The PCR products were resolved in 1.5% agarose gels to identify the amplification of the truncated allele and sequenced.

Phosphorylation Site Disruption Analysis
To evaluate the impact of missense changes in phosphorylation sites, the protein sequences and amino acid changes of all VUS variants (Table S2) were submitted to the ReKINect portal (http://rekinect.science/home). Only mutations predicted to disrupt the phosphorylation site and those which have previous evidence of functional impact in experimental studies were considered.

Statistical Analyses
Characteristics of cases with confirmed diagnosis of breast cancer were summarized with descriptive statistics. The association of demographic and clinical characteristics with the presence of pathogenic mutations was assessed using univariate analyses (unadjusted logistic regression model). Age at diagnosis and BMI were included as continuous variables, whereas all other factors were considered to be categorical variables. The logistic regression model utilized all available data (complete and missing). p values of less than 0.05 were considered to indicate statistical significance. All the analyses were conducted using STATA 13.0.
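For readers unfamiliar with the unadjusted logistic regression named above, the following Python sketch fits a single-predictor model by Newton-Raphson and reports the odds ratio with a two-sided Wald p value. It is a stand-in written with the standard library only, not the STATA procedure used in the study; the function name and the toy data are hypothetical and do not come from the cohort.

```python
import math

def fit_univariate_logistic(x, y, max_iter=50, tol=1e-10):
    """Fit logit(P(y=1)) = b0 + b1*x; return (odds_ratio, wald_p) for x."""
    mean_x = sum(x) / len(x)
    xc = [xi - mean_x for xi in x]  # centering x leaves the slope b1 unchanged

    def score_and_information(b0, b1):
        # Gradient (g) and Fisher information (h) of the log-likelihood.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(xc, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        return g0, g1, h00, h01, h11

    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0, g1, h00, h01, h11 = score_and_information(b0, b1)
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det   # Newton step: inverse(info) @ score
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break

    _, _, h00, h01, h11 = score_and_information(b0, b1)
    se_b1 = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    z = b1 / se_b1
    wald_p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p value
    return math.exp(b1), wald_p

# Toy data (hypothetical): a binary exposure with 2/10 carriers among the
# unexposed and 5/10 among the exposed, so the odds ratio equals the 2x2
# cross-product (5 * 8) / (5 * 2) = 4.
toy_x = [0] * 10 + [1] * 10
toy_y = [1, 1] + [0] * 8 + [1] * 5 + [0] * 5
toy_or, toy_p = fit_univariate_logistic(toy_x, toy_y)
```

The same function accepts a continuous predictor such as age at diagnosis, in which case the odds ratio is expressed per unit increase of the predictor. The sketch does not handle missing values or perfectly separated data.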

Conclusions
Our results show that 16% of the patients with suspicion of HBOC carried a pathogenic mutation in at least one of the 143 genes tested. Fifty-four percent of all pathogenic alterations were not present in BRCA1 and BRCA2, highlighting the locus heterogeneity of this disease. We found 10% of patients with VUS, which require further studies to establish their significance. The genetic information derived from this study could guide the treatment, appropriate follow-up and prophylactic measures in these families, and our findings emphasize the benefit of a gene panel sequencing service for candidate patients. Although clinical guidelines for patients with the pathogenic mutations detected in several of these genes are currently lacking, the detection of these variants together with a suggestive family history may warrant a post-test management change, including close follow-up and monitoring. Future efforts will collectively provide enough evidence of the clinical impact of these variants and will foster the development of consensus population-specific guidelines for clinical management.