Spectrum and Frequency of Germline FANCM Protein-Truncating Variants in 44,803 European Female Breast Cancer Cases

Simple Summary
Mutations in the FANCM gene may cause a particular type of breast cancer known as ER-negative. In this study, we describe the geographic distribution of 66 different FANCM mutations identified in 44,803 female breast cancer cases from Europe, USA, Canada and Australia. We found that the FANCM:p.Gln1701* mutation is most common in Northern Europe and has lower frequencies in Southern European countries. In contrast, the FANCM:p.Gly1906Alafs*12 mutation is most common in Southern Europe and rarer in Central and Northern Europe. We found that the FANCM:p.Arg658* mutation is most prevalent in Central Europe and that the FANCM:p.Gln498Thrfs*7 mutation originates from Lithuania. Finally, we showed that many and varied FANCM mutations are present in Southwestern and Central Europeans while a much more limited range of mutations is present in Northeastern Europeans. The knowledge of this geographic distribution of FANCM mutations is important to establish more efficient genetic testing strategies in specific populations.

Abstract
FANCM germline protein truncating variants (PTVs) are moderate-risk factors for ER-negative breast cancer. We previously described the spectrum of FANCM PTVs in 114 European breast cancer cases. In the present, larger cohort, we report the spectrum and frequency of four common and 62 rare FANCM PTVs found in 274 carriers detected among 44,803 breast cancer cases. We confirmed that p.Gln1701* was the most common PTV in Northern Europe with lower frequencies in Southern Europe. In contrast, p.Gly1906Alafs*12 was the most common PTV in Southern Europe with decreasing frequencies in Central and Northern Europe. We verified that p.Arg658* was prevalent in Central Europe and had highest frequencies in Eastern Europe. We also confirmed that the fourth most common PTV, p.Gln498Thrfs*7, might be a founder variant from Lithuania. Based on the frequency distribution of the carriers of rare PTVs, we showed that the FANCM PTVs spectra in Southwestern and Central Europe were much more heterogeneous than those from Northeastern Europe. These findings will inform the development of more efficient FANCM genetic testing strategies for breast cancer cases from specific European populations.


Introduction
Breast cancer is a common disease in which up to 25% of the cases are expected to be caused by genetic risk factors [1]. Germline pathogenic variants in the BRCA1 and BRCA2 genes are associated with high risks of developing breast cancer. Specifically, the cumulative risks for the disease by age 80 were estimated to be 72% and 69% in women with a BRCA1 or BRCA2 pathogenic variant, respectively [2]. Since the identification of BRCA1 and BRCA2 thirty years ago, many other genes have been proposed to be associated with moderate to high risk for breast cancer; however, limited and sometimes contradictory findings have impeded a conclusive annotation. In 2020, modified segregation analyses performed in 524 breast cancer families with pathogenic variants in PALB2 confirmed that this gene confers a risk for breast cancer that is comparable to that of BRCA2 [3]. One year later, two very large association studies were conducted, in which several known and putative predisposition genes were sequenced in a total of more than 178,000 female breast cancer cases and controls [4,5]. The unprecedented statistical power of such large datasets enabled confirmation that protein-truncating variants (PTVs) in BRCA1, BRCA2, PALB2 and the Li-Fraumeni syndrome gene TP53 confer high risk for breast cancer. In addition, these data clarified that PTVs in BARD1, RAD51C and RAD51D are associated with moderate risk of estrogen receptor (ER)-negative breast cancer and that PTVs in ATM and CHEK2 are associated with moderate risk of ER-positive breast cancer [4,5].
The above-mentioned breast cancer predisposition genes have been tested worldwide and many founder variants, and variants prevalent in specific ethnic or geographic groups, have been described. This knowledge could be used to inform first-pass genetic screening and more efficient strategies for genetic testing in specific populations. The prevalence and spectrum of BRCA1 and BRCA2 pathogenic variants have been reported in many different populations. Probably the two largest studies of these genes conducted so far are based, respectively, on pathogenic variants found in more than 29,000 families from 49 countries and on pathogenic variants found in families from the Middle East, North Africa and Southern Europe [6,7]. Comprehensive analyses of the mutational spectra of PALB2, BARD1, RAD51C and RAD51D were described in three systematic reviews including 151 [8], 123 [9] and 101 [10] studies. Finally, a study of the mutational spectrum of CHEK2 pathogenic variants was recently conducted, but was limited to the Baltic states [11].
While there is a consensus that the genes to be screened to predict the individual risk for breast cancer in a diagnostic setting should be BRCA1, BRCA2, PALB2, TP53, BARD1, RAD51C, RAD51D, ATM and CHEK2, other predisposing genes, such as FANCM, are yet to be validated [12]. Burden analyses derived from FANCM sequencing, as well as genotyping of the single most common variants, have shown that FANCM PTVs are generally associated with ER-negative or triple-negative breast cancer (TNBC, reviewed in [13]). In particular, the strongest association for these disease subtypes in Europeans is with the common p.Arg658* (c.1972C>T) variant, which truncates the 2048 amino acid FANCM protein at the N-terminus [14]. The risks associated with the other common FANCM PTVs p.Gln1701* (c.5101C>T) and p.Gly1906Alafs*12 (c.5791C>T, also known as p.Arg1931* [15]), which truncate the FANCM protein at the C-terminus, appear, in Europeans, to be of lower magnitude or have not been conclusively assessed. However, the p.Gln1701* and p.Gly1906Alafs*12 PTVs have been associated with risk for ER-negative and TNBC subtypes in Finnish women [16,17], which we speculate might be due to population-specific variants acting as risk modifiers [13].
We previously described the spectrum of 27 different FANCM germline PTVs found in 114 female breast cancer cases ascertained from 13 European countries [18]. In the present study, we analyzed FANCM sequencing data from 44,803 female breast cancer cases from 16 European countries and from USA, Canada and Australia, reported the frequency of PTV carriers, and described the spectrum of the 66 different FANCM germline PTVs that were found in the 274 carriers.

Materials and Methods
The 44,803 breast cancer cases included in the present analysis were originally ascertained by 39 studies from 16 European countries and from USA, Canada and Australia that participated in the BRIDGES study (https://bridges-research.eu/, Supplementary Table S1). All these breast cancer cases were women of European ancestry and were older than 18 years at breast cancer diagnosis. Women carrying a pathogenic variant in the BRCA1 and/or BRCA2 genes were excluded from the study. All the 44,803 breast cancer cases underwent complete sequencing of the FANCM coding region and intron/exon boundaries in the context of the BRIDGES study [4]. Details of the library preparation, sequencing, variant calling and quality control methods have been described elsewhere [4]. Germline FANCM PTVs were defined as frameshift or nonsense variants. As a proxy for the carrier's or PTV's geographical origin, we used the country where the study ascertaining the carrier was conducted. PTV carrier frequencies were compared using Pearson's chi-squared test. All tests were two-sided, and p-values < 0.05 were considered statistically significant.
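To illustrate the PTV definition used here (frameshift or nonsense), the following minimal sketch classifies variants from their protein-level names and reports how much of the native protein is retained. It is not the BRIDGES variant-calling pipeline: the regular expression and the function names `classify_ptv` and `native_fraction` are hypothetical, the pattern only covers names written in the style used in this article, and the 2048 amino acid protein length is taken from the Introduction.

```python
import re

FANCM_LENGTH = 2048  # amino acids, as stated in the Introduction

# Assumed pattern for names such as "p.Arg658*" or "p.Gly1906Alafs*12".
_HGVS_P = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)(?:([A-Z][a-z]{2})fs\*(\d+)|\*)$")


def classify_ptv(name):
    """Return (kind, position) for a nonsense or frameshift name, else None."""
    match = _HGVS_P.match(name)
    if match is None:
        return None  # not recognised as a PTV by this simple pattern
    position = int(match.group(2))
    kind = "frameshift" if match.group(3) else "nonsense"
    return kind, position


def native_fraction(position, length=FANCM_LENGTH):
    """Fraction of native residues lying upstream of the first affected residue."""
    return (position - 1) / length


if __name__ == "__main__":
    for name in ("p.Gln498Thrfs*7", "p.Arg658*", "p.Gln1701*", "p.Gly1906Alafs*12"):
        kind, pos = classify_ptv(name)
        print(f"{name}: {kind}, {native_fraction(pos):.1%} of native residues retained")
```

Under these assumptions, the four common variants fall into two nonsense and two frameshift PTVs, and the retained fraction gives a rough sense of how early or late in the protein each truncation occurs.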

Frequency of Germline FANCM PTVs
Sixty-six different FANCM PTVs were found in 274 PTV carriers that were identified by gene sequencing of 44,803 female breast cancer cases (Figure 1, Supplementary Table S2 and Table 1). A large percentage (65.3%) of the carriers carried either p.Gln1701* or p.Gly1906Alafs*12. Importantly, for these two PTVs the evidence of association with breast cancer risk was previously inconclusive [13]. Thus, we studied the frequencies of the carriers of all PTVs and of the carriers of all PTVs excluding p.Gln1701* and p.Gly1906Alafs*12. The frequencies of carriers of all PTVs in the 19 tested countries were heterogeneous, ranging from 0.20% in Canada to 2.50% in Finland. However, the exclusion of p.Gln1701* and p.Gly1906Alafs*12 carriers resulted in PTV carrier frequencies that were more homogeneous, ranging from 0.11% in France to 0.63% in Belarus (Table 1). We also compared the frequencies of the two groups of PTV carriers with respect to their breast cancer family history and the ER status of their tumors. In these analyses, we observed a significantly higher PTV carrier frequency in familial versus sporadic cases (p-value = 0.032), and in ER-negative versus ER-positive cases (p-value = 0.048). When we excluded carriers of p.Gln1701* and p.Gly1906Alafs*12, these differences became more pronounced both in familial versus sporadic cases (p-value = 0.021) and in ER-negative versus ER-positive cases (p-value = 0.0005, Table 2). The excess of PTV carriers in familial cases with respect to sporadic cases has been shown for other genes established as moderate-risk factors for breast cancer, for example CHEK2. Specifically, the CHEK2:c.1100delC PTV, which accounts for the majority of CHEK2 PTVs, has been shown to have a 2.79-fold higher frequency or to confer a 1.77-fold higher risk in familial versus sporadic breast cancer cases [19,20]. FANCM has been reported as specifically associated with ER-negative breast cancer risk [13]. Hence, the excess of FANCM PTVs that we observed in familial cases and in ER-negative cases reinforces the knowledge that FANCM is a moderate-risk gene for breast cancer. Importantly, the fact that frequency differences increased after the exclusion of p.Gln1701* and p.Gly1906Alafs*12 carriers corroborates the hypothesis that these two PTVs have a lower impact on breast cancer risk.

Spectrum of Common and Rare FANCM PTVs
Among the 66 different variants, four, namely p.Gln498Thrfs*7 (FANCM:c.1491dupA), p.Arg658*, p.Gln1701* and p.Gly1906Alafs*12, were relatively common, each being identified in at least six carriers. The remaining 62 variants were unique or were found in a maximum of three carriers and were classified as "rare FANCM PTVs" (Figure 1, Supplementary Table S2). Of the 274 carriers, 202 (73.7%) carried one of the four common FANCM PTVs. Of these 202 carriers, 6 (3.0%) carried p.Gln498Thrfs*7, 17 (8.4%) carried p.Arg658*, 109 (54.0%) carried p.Gln1701* and 70 (34.6%) carried p.Gly1906Alafs*12. The remaining 72 carriers (26.3% of the total) carried one of the 62 rare PTVs (Figure 2a). Of the 62 rare PTVs, 54 were unique, six were found in two breast cancer cases, and two in three breast cancer cases (Supplementary Table S2). These results were consistent with those of a previous study in which we described the spectrum of 27 different FANCM PTVs identified in 114 European female breast cancer cases [18]. In fact, we observed that p.Gln1701* was the most common PTV in Northern Europe, with highest frequencies in Finland and Sweden and decreasing frequencies along the North-South axis (Figure 2a). Similarly, p.Gly1906Alafs*12 was confirmed to be the most common PTV in Southern Europe, with decreasing frequencies in Central and Northern Europe. We also confirmed that p.Arg658*, the third most common PTV, was prevalent in Central Europe, with higher frequencies in Eastern Europe. Moreover, the geographical origin of the six p.Gln498Thrfs*7 carriers was compatible with our previous findings indicating that this PTV is probably a founder variant from Lithuania [18] (Figure 2a). Furthermore, with respect to the distribution of rare PTVs, it appears that carrier frequencies in Germany and Sweden are higher than those we previously reported [18] (Figure 2a). Finally, we observed heterogeneous spectra in Australia and USA, consistent with the fact that those carriers are of European ancestry.
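The carrier counts reported in this section can be cross-checked with a few lines of arithmetic. The sketch below only uses numbers quoted in the text; the variable names and the `pct` helper are illustrative and not part of the original analysis.

```python
# Carriers of each of the four common PTVs, as reported in the text.
common = {
    "p.Gln498Thrfs*7": 6,
    "p.Arg658*": 17,
    "p.Gln1701*": 109,
    "p.Gly1906Alafs*12": 70,
}

# Rare PTVs grouped by recurrence: {carriers per variant: number of variants}.
rare_by_recurrence = {1: 54, 2: 6, 3: 2}

TOTAL_CARRIERS = 274


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


common_carriers = sum(common.values())
rare_variants = sum(rare_by_recurrence.values())
rare_carriers = sum(n * count for n, count in rare_by_recurrence.items())
two_most_common = common["p.Gln1701*"] + common["p.Gly1906Alafs*12"]

if __name__ == "__main__":
    print("common carriers:", common_carriers, pct(common_carriers, TOTAL_CARRIERS))
    print("rare variants:", rare_variants)
    print("rare carriers:", rare_carriers, pct(rare_carriers, TOTAL_CARRIERS))
    print("two most common PTVs:", two_most_common, pct(two_most_common, TOTAL_CARRIERS))
```

The totals agree with the text: 202 common plus 72 rare carriers give 274, and the 62 rare variants account for exactly 72 carriers.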

Comprehensive Spectrum of FANCM PTVs
We combined the data presented here with those we published previously [18], and with all the other available studies based on FANCM sequencing of European breast cancer cases [24-31]. Figure 2b shows the distribution spectrum of a total of 91 different FANCM PTVs found in 487 breast cancer cases from 23 countries. This map shows the different frequency distributions and the specific prevalence of the p.Gln498Thrfs*7, p.Arg658*, p.Gln1701* and p.Gly1906Alafs*12 PTVs. It can also be observed that the spectra of FANCM PTVs seem to be much more heterogeneous in Southwestern Europe (i.e., Portugal, Spain and France) than in Northeastern Europe (i.e., Sweden, Finland and Norway) (Figure 2b). To investigate this observation further, we divided the tested countries into those from Southwestern or Central Europe (Portugal, Spain, Italy, Greece, Macedonia, Hungary, Czech Republic, Germany, France, the Netherlands, UK and Ireland) and those from Northeastern Europe (Finland, Sweden, Norway, Denmark, Poland, Lithuania, Belarus and Russia). Considering the carriers of rare PTVs, there were 80 (33.1% of 242 total carriers) in Southwestern and Central Europe versus only 13 (6.2% of 209 total carriers) in Northeastern Europe (p-value < 0.0001). Considering specifically the number of different PTVs, we observed that there were 62 (25.6% of 242 carriers) in Southwestern and Central Europe compared with 16 (7.6% of 209 carriers) in Northeastern Europe (p-value < 0.0001).
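The comparison of rare-PTV carriers between the two regions can be reproduced approximately with a Pearson chi-squared test on a 2x2 table. The sketch below is an assumption-laden illustration, not the authors' code: it treats the reported counts as a 2x2 table (80 of 242 versus 13 of 209), applies no continuity correction (the text does not say whether one was used), and computes the p-value for one degree of freedom with the standard-library identity p = erfc(sqrt(chi2 / 2)). The function name `pearson_chi2_2x2` is hypothetical.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic and p-value (1 df, no continuity
    correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rare-PTV carriers versus other carriers, by region (counts from the text).
sw_central_rare, sw_central_total = 80, 242
northeast_rare, northeast_total = 13, 209

stat, p_value = pearson_chi2_2x2(
    sw_central_rare, sw_central_total - sw_central_rare,
    northeast_rare, northeast_total - northeast_rare,
)

if __name__ == "__main__":
    print(f"chi2 = {stat:.2f}, p = {p_value:.2e}")
```

Under these assumptions the statistic is about 49 and the p-value falls well below 0.0001, in line with the reported result.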
For only some of the 87 different rare PTVs was it possible to speculate on the geographic origin. In particular, we considered the eight PTVs that were found in at least three carriers (Table 3). Among these, p.Arg185Glufs*13 and p.Gln498* might be prevalent in Germany and the Netherlands, while p.Glu774* and p.Lys863Ilefs*12 might be specific to the Iberian Peninsula, and to Spain and France, respectively. Finally, p.Tyr1398* could be from the UK. However, since we could not exclude that some of these carriers were members of the same family who were ascertained as different probands, additional data are required to confirm these hypotheses.

Table 3. List of FANCM rare protein truncating variants (PTVs) that, through combining data from the present study with those from previously published studies, were identified in at least three individuals. For each PTV, the total number of detected carriers is reported along with the study/database and the country of origin.

Conclusions
In this study, we report FANCM PTV carrier frequencies among 44,803 breast cancer cases from 19 countries. In addition, our data in combination with data from previous studies allowed us to describe the spectra of 91 FANCM PTVs in breast cancer cases from Europe, USA, Canada and Australia. These data could be used to inform first pass genotyping screening, and for more efficient genetic testing strategies in breast cancer cases from specific populations.