Comprehensive Functional Characterization and Clinical Interpretation of 20 Splice-Site Variants of the RAD51C Gene

Simple Summary
Genetic variants in more than 10 genes are known to confer moderate to high risks to breast and/or ovarian cancers (BC/OC). In the framework of the international project BRIDGES, a panel of 34 known or suspected BC/OC genes has been sequenced in 60,466 breast cancer patients and 53,461 controls. In this work, we focus on BRIDGES variants detected in the RAD51C gene and their impact on the gene expression step known as splicing (intron removal), whose alteration is a relevant disease mechanism. For this purpose, we bioinformatically analyzed 40 RAD51C variants from the intron/exon boundaries, 20 of which were selected. Then, we developed a biotechnological tool, called splicing reporter minigene, containing RAD51C exons 2 to 8 where any variant can be introduced by site-directed mutagenesis and functionally assayed in MCF-7 cells under the splicing perspective. Nineteen variants impaired splicing, 18 of which induced severe splicing anomalies. Finally, they were clinically interpreted according to strict guidelines whereby 15 variants were classified as Pathogenic/Likely Pathogenic, so they are clinically actionable. Therefore, carrier patients and families may benefit from tailored prevention protocols and therapies.

Abstract
Hereditary breast and/or ovarian cancer is a highly heterogeneous disease with more than 10 known disease-associated genes. In the framework of the BRIDGES project (Breast Cancer Risk after Diagnostic Gene Sequencing), the RAD51C gene has been sequenced in 60,466 breast cancer patients and 53,461 controls. We aimed at functionally characterizing all the identified genetic variants that are predicted to disrupt the splicing process. Forty RAD51C variants of the intron-exon boundaries were bioinformatically analyzed, 20 of which were selected for splicing functional assays. To test them, a splicing reporter minigene with exons 2 to 8 was designed and constructed. This minigene generated a full-length transcript of the expected size (1062 nucleotides), sequence, and structure (Vector exon V1- RAD51C exons_2-8- Vector exon V2). The 20 candidate variants were genetically engineered into the wild type minigene and functionally assayed in MCF-7 cells. Nineteen variants (95%) impaired splicing, while 18 of them produced severe splicing anomalies. At least 35 transcripts were generated by the mutant minigenes: 16 protein-truncating, 6 in-frame, and 13 minor uncharacterized isoforms. According to ACMG/AMP-based standards, 15 variants could be classified as pathogenic or likely pathogenic variants: c.404G > A, c.405-6T > A, c.571 + 4A > G, c.571 + 5G > A, c.572-1G > T, c.705G > T, c.706-2A > C, c.706-2A > G, c.837 + 2T > C, c.905-3C > G, c.905-2A > C, c.905-2_905-1del, c.965 + 5G > A, c.1026 + 5_1026 + 7del, and c.1026 + 5G > T.


Introduction
Genetic variants in more than 10 genes are known to confer moderate to high risks of breast and/or ovarian cancers (BC/OC) and explain 5% to 10% of all breast cancers and approximately 20% of all ovarian cancers [1,2]. Most of these genes encode tumor suppressor proteins that play a role in the repair of DNA double-strand breaks (DSB) by homologous recombination (HR). In addition to the main breast cancer genes, BRCA1 [MIM #113705] [ [1,2,5,6].
Loss-of-function variants in RAD51C and RAD51D increase the risk of breast and ovarian cancer, but the same has not been demonstrated for other RAD51 paralogs, or for RAD51 itself that plays a major role in HR repair [7][8][9][10][11]. Likewise, bi-allelic RAD51C (or FANCO) deleterious variants have been found in Fanconi Anemia patients [12]. RAD51C participates in the recruitment of RAD51 to DNA damage sites and the stabilization of RAD51 nucleofilaments as part of the BCDX2 complex (RAD51B, RAD51C, RAD51D, and XRCC2). It is also involved in the resolution of Holliday junctions interacting with XRCC3 resulting in the CX3 complex, and recently, it was demonstrated that RAD51C interacts directly with PALB2, a key protein in HR [13][14][15][16][17]. Furthermore, RAD51C has been reported to facilitate ATM-dependent CHEK2 phosphorylation, allowing the activation of CHEK2, another important regulator of the cellular response to DNA damage [18,19].
The detection of germ-line pathogenic variants in these cancer susceptibility genes can contribute to improve the prevention, therapy, and surveillance of breast/ovarian cancer patients, as well as to a better knowledge of BC/OC genetics. Unfortunately, a large fraction of variants is classified as variants of uncertain clinical significance (VUS). Since the association with cancer risk is unknown for these variants, this complicates genetic counseling and the clinical management of patients. Multifactorial likelihood approaches, together with functional studies of variants, can facilitate their interpretation [20][21][22]. Variants of disease-genes are typically assessed according to their predicted impact on protein translation, so protein truncating variants (frameshift and nonsense) are usually classified as damaging variants. However, variants might also have an impact on RNA expression and, e.g., disrupt transcription initiation, miRNA regulation, or splicing [23][24][25][26][27].
Pre-mRNA splicing is an essential gene expression mechanism, whereby introns are excised and exons are consecutively joined to produce the mature mRNA. The splicing motifs include the core consensus sequences (the 5′ and 3′ splice sites, 5′SS and 3′SS; the polypyrimidine tract; and the branchpoint) and exonic or intronic splicing enhancers and silencers [28]. Variants in these cis-motifs may lead to abnormal events such as exon skipping, intron retention, inclusion of pseudoexons, or the use of alternative splice sites [29]. These generate aberrant transcripts which may be associated with a genetic disorder [21,30,31,32]. According to the Human Gene Mutation Database (accessed on 27 November 2019) around 9% (23354/269419) of reported disease-causing mutations impair splicing, although some authors suggested that up to 50% of all human disease mutations impair splicing [33,34].
Given the low precision of in silico analysis tools that predict the impact of candidate variants on RNA splicing, the exact consequences of these genetic changes must be verified in functional assays [35,36]. The most suitable method to determine whether a particular variant affects splicing is the direct analysis of blood RNA from heterozygous carriers (either patients or healthy relatives), although access to blood RNA samples is not always feasible in the diagnostic routine [37][38][39]. Even if available, the assessment of the transcripts derived from the variant allele is hampered by the presence of the wild type one. One possible alternative strategy is to use minigene assays, which have been proven to represent a robust tool for assessing the pathogenicity of potential spliceogenic variants [40][41][42][43].
Multigene panel testing is a cost- and time-effective option to evaluate genes and genetic variants that may be associated with a risk of cancer, and is becoming widely used in clinical practice. Our study was conducted in the context of the BRIDGES project (Breast Cancer Risk after Diagnostic Gene Sequencing; https://bridges-research.eu/), in which a panel of 34 known or suspected breast cancer susceptibility genes was sequenced in 60,466 cases and 53,461 controls [44]. Here, we bioinformatically analyze 40 variants from the intron/exon boundaries of the RAD51C gene identified in BRIDGES subjects. Twenty variants are selected and functionally tested by minigene assays.

Bioinformatics Analysis
We identified in BRIDGES patients and controls a total of 40 different variants located at RAD51C exon/intron boundaries (see Methods). These variants were bioinformatically analyzed with MaxEntScan (MES) according to the standards indicated in Materials and Methods. Twenty variants were selected for further analysis, based on their predicted impact on splicing (Table 1 and Table S1). Of the 20 selected variants, eleven were predicted to impair the 3′SS, and the other nine were predicted to impair the 5′SS. Six variants (c.405-6T > A, c.571 + 4A > G, c.706-2A > C, c.706-2A > G, c.966-2A > G, and c.966-2A > T) were predicted to impair the SS and simultaneously create a de novo SS. Variants c.146-3C > T, c.1026 + 5_1026 + 7del, and c.1026 + 5G > T did not produce significant MES score changes (≥15%), but they affected the conserved nucleotides of the splice sites. The MES value of the exon 8 donor site (2.0) was below the default threshold (3.0), so NNSplice calculations for variants c.1026 + 5_1026 + 7del and c.1026 + 5G > T were used instead (0.8 → <0.1).

Functional Analysis
Since all candidate variants were located in exons 2 to 8 (Table 1), we designed a 3731-bp insert containing these seven exons (Figure S1) and cloned this insert into the pSAD vector [41], representing the minigene mgR51C_ex2-8 (Figure 1A). This clone produced a full-length transcript in MCF-7 cells of the expected size (1062 nt), sequence, and structure (V1-RAD51C_ex2 to ex8-V2) (Figure 1B), so it was suitable to assess a possible effect of the variants on pre-mRNA splicing. The wild type (wt) construct also generated residual amounts (1.4%) of an unknown 1106-nt transcript that could not be characterized. To identify physiological alternative splicing events, RNA from the host cells (MCF-7) and from the human breast control were analyzed by RT-PCR as well. The expected full-length transcript (957 nt) was detected by fluorescent fragment electrophoresis together with some alternative splicing isoforms, of which exon 7 skipping was the main event (Figure 1C).

Figure 1 (caption fragment). ... and RTpSAD-RV (full-length transcript V1-RAD51C ex2-8-V2 = 1062 nt). The RT-PCR product was run by agarose gel electrophoresis (left) and fluorescent capillary electrophoresis (right), where the full-length transcript is shown as a blue peak and the LIZ1200 size standard as orange/faint peaks. (C) Agarose gel (left) and fluorescent capillary electrophoresis (right) of transcripts produced by MCF-7 cells (above) and human breast RNA (below). cDNAs were amplified with primers RTR51C_ex1-FW and RTR51C_ex9-RV (full-length transcript = 957 nt). FAM-labelled products (blue peaks) were run with LIZ1200 (orange peaks) as the size standard. FL, full-length transcript.

The 20 selected variants were genetically engineered into the wt minigene and then introduced into MCF-7 cells. Nineteen variants (95%) impaired splicing, 18 of which produced no trace or residual amounts of the full-length transcript (Table 1; Figure 2B). Eight variants affected the classical ±1, ...

Transcript Analysis
High sensitivity fluorescent fragment analysis allowed us to detect at least 35 transcripts (from 1 to 5 transcripts per variant), 22 of which could be characterized (Table S2, Table 1). Sixteen transcripts introduced premature termination codons (PTC), including 11 predicted to undergo NMD (PTC-NMD transcripts) and 5 disrupting the reading frame but not predicted to undergo NMD (PTC transcripts). On the other hand, six RNA isoforms kept the open reading frame, but five of them were minor. ∆(E5) was the most abundant in-frame transcript, induced by c.706-2A > G and c.837 + 2T > C (65.4% and 89.3%, respectively). RAD51C exon 5 encodes 44 amino acids, 26 of which are strictly conserved in vertebrates, and contains the Walker-B domain (p.238-242; Figure S2) [45], which plays a relevant role in RAD51C function. ∆(E8q18) was produced by variants c.1026 + 5_1026 + 7del and c.1026 + 5G > T (13.8% and 18.7%, respectively). This transcript encodes a deletion of six amino acids (Val337 to Lys342, of which only Ile341 is strictly conserved in vertebrates), which removes six out of the seven amino acids of the essential β-strand-8, suggesting a plausible protein dysfunction (Figure S2). However, no pathogenic missense mutations have been recorded in the ClinVar database in this region, so we cannot confirm that transcript ∆(E8q18) encodes inactive RAD51C. Of note, variants c.966-3C > A, c.966-2A > G, and c.966-2A > T of the exon 8 3′SS produced three different versions of a 3-nt intronic insertion (acceptor shift; Table 1 5.9%), respectively. These would provoke three different effects on protein translation, i.e., p.Arg322dup, p.Arg322delinsSerGly, and p.Arg322delinsSerTrp, respectively. Arg322 is strictly conserved, indicating that this residue might be important for protein function (Figure S2).
On the other hand, three missense changes have been reported in ClinVar at codon 322 (p.Arg322Lys, p.Arg322Thr, and p.Arg322Ser), all of them classified as VUS, so protein dysfunction caused by any of these three transcripts could not be supported. The remaining in-frame transcript, ∆(E3q114), showed a relative proportion below 5% in variants c.571 + 4A > G and c.571 + 5G > A; 12 out of its 38 deleted amino acids are strictly conserved (Figure S2).
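The distinction drawn above between in-frame and frame-disrupting isoforms follows from whether the number of deleted or inserted nucleotides is a multiple of three. The following is a minimal sketch of that arithmetic, using only length changes stated in the text (the function names are hypothetical). It says nothing about NMD susceptibility or about stop codons introduced by inserted intronic sequence, which need the actual sequence.

```python
def frame_effect(delta_nt: int) -> str:
    """Classify a transcript length change (in nucleotides) by reading frame.

    Negative values are deletions, positive values are insertions.
    """
    if delta_nt == 0:
        return "unchanged"
    return "in-frame" if delta_nt % 3 == 0 else "frameshift"


def codons_affected(delta_nt: int) -> int:
    """Number of whole codons gained or lost; only meaningful when in-frame."""
    return abs(delta_nt) // 3


# Length changes taken from the text; exon 5 encodes 44 codons, i.e. 132 nt.
examples = {
    "∆(E5)": -132,
    "∆(E8q18)": -18,
    "∆(E3q114)": -114,
    "∆(E5p10)": -10,
    "3-nt acceptor shift": +3,
}

for name, delta in examples.items():
    print(name, frame_effect(delta), codons_affected(delta))
```

In this sketch, ∆(E8q18) and ∆(E3q114) come out as in-frame losses of 6 and 38 codons, matching the amino acid counts given above, while the 10-nt deletion of ∆(E5p10) shifts the frame.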

ACMG/AMP-Like Classification of RAD51C Variants Based on PS3/BS3 Functional Evidence
On the basis of the data acquired using minigene analysis, the ACMG/AMP-like classification approach classifies 15 variants as pathogenic/likely pathogenic and 5 variants as of uncertain significance (Table 2; Methods; Figure S3). Incorporating splicing functional data into the ACMG/AMP framework proved to be non-trivial and raised several relevant issues, including the identification of what we think are internal inconsistencies of the framework (see Discussion).

Discussion
Massive parallel sequencing of breast and/or ovarian cancer genes has allowed the genetic testing of thousands of patients in a high throughput and cost-effective strategy. The goal of the BRIDGES initiative was to firmly establish the breast cancer association of genes tested by commercial multigene panels with the narrowest confidence intervals of risk estimates currently available. BRIDGES analyzed 34 known or suspected BC genes that were sequenced in 60,466 patients and 53,461 controls [44]. Nonsense, frameshift, and ±1, 2 splice site variants (sometimes collectively referred to as protein truncating variants or PTVs) are usually assumed to be pathogenic or likely pathogenic. This assumption might work well for certain epidemiological studies but cannot be taken for granted in the clinic (e.g., spliceogenic variants, including ±1, 2 splice site variants, are not necessarily pathogenic, as they may cause in-frame alterations preserving function). Many other variants (e.g., rare missense changes) are considered VUS, due to their unknown impact on gene function and disease risk [48]. In fact, clinical management of VUS carriers (and non-carrier relatives) is complex, since risk evaluation is solely based on family history [49,50].
The RAD51C gene was one of the 34 genes analyzed by BRIDGES given its role in breast and ovarian cancer [6,51]. A statistically significant association for PTVs has been found for ER-negative breast cancer and breast and ovarian cancer [44,52]. In this work, we have carried out the most comprehensive splicing study of germline variants of RAD51C to date. Forty variants located within the intron/exon boundaries were selected and analyzed by MES or NNSplice. In keeping with the standards indicated in Materials and Methods, 20 candidate variants were chosen (Table 1) for subsequent RNA assays.
In the absence of patient RNA, splicing reporter minigenes provide a straightforward and robust method for the initial characterization of putative spliceogenic variants for several reasons. Among other benefits, the assay (i) uses a simple and clean analysis of a single mutant allele; (ii) is performed in a cell type relevant for the disease; (iii) circumvents the NMD interference with the use of an inhibitor; and (iv) uses a single construct for testing multiple variants. Here, we envisioned a construct that contained a synthesized insert with seven (exons 2-8) out of the nine exons of the RAD51C gene, so that all the selected variants (Figure 2A) could be evaluated in one single minigene.
Remarkably, all but one variant disrupted splicing, underlining the specificity of our criteria. MES or NNSplice correctly predicted an effect on RNA splicing (either splice-site disruptions or significant score reductions) in 19 variants (Table 1). Only one variant, c.146-3C > T, did not alter splicing; indeed, the MES score was just slightly reduced (−8.5%) because the most frequent −3 nucleotide (C) is substituted by the second most frequent one (T). However, other −3 non-conservative changes in which the nucleotide substitution was different, such as c.905-3C > G and c.966-3C > A, caused total or almost total splicing disruptions. Likewise, a double effect was precisely predicted by MES for c.405-6T > A: 3′SS disruption and generation of a strong de novo 3′SS 4 nt upstream that, in fact, was mainly used by the splicing machinery (▼(E3p4), 95.2%). MES did not identify the exon 8 donor site, although NNSplice did. In this case, both +5 variants (c.1026 + 5_1026 + 7del and c.1026 + 5G > T) totally disrupted splicing without any trace of the full-length isoform. Conversely, another +5 variant (c.705 + 5G > C) yielded 51.6% of the full-length isoform with a relatively low MES decrease (−20.8%). It is also worth mentioning that c.571 + 4A > G slightly reduced the MES score (−22.5%) but the resultant mutant donor site was still strong (MES = 8.1). However, this change induced almost complete aberrant splicing with a residual amount of the full-length transcript (5.4%). Finally, the different splicing outcomes of the two changes at the same position, c.706-2A > C and c.706-2A > G, should be highlighted (Table 1). Variant c.706-2A > C mainly caused the use of a cryptic 3′SS 10 nt downstream (∆(E5p10); 91.4%), while c.706-2A > G mainly generated ∆(E5) (65.4%) but also ∆(E5p10) (33.5%). However, MES scores of the cryptic 3′SS of both changes (3.3 vs. 3.2) were low and not significantly different. One possible explanation could be that c.706-2A > C is a purine to pyrimidine change that would strengthen the polypyrimidine tract of the internal cryptic acceptor site 10 nt downstream (used in ∆(E5p10)), whereas c.706-2A > G (purine to purine) would not.

Clinical Interpretation of Variants
The clinical interpretation of variants cannot be done solely on the basis of the functional data presented in this manuscript. From a clinical perspective, the data presented here are to assist in classifying genetic variants. Yet, the analysis of spliceogenic variants is an especially challenging and laborious mission. The presence of numerous RAD51C abnormal transcripts and the production of several transcripts by many variants are proofs of this arduous undertaking. From a simple functional viewpoint, the biological indicators of pathogenicity of a particular variant are the strong reduction of the expression of wild type transcript and the presence of severe splicing anomalies that are predicted to result in protein truncation or loss of critical protein domains. On this basis, 18 variants with severe splicing anomalies (Table 1) should be classified as deleterious or likely deleterious.
However, more complex and comprehensive guidelines have been developed for the clinical interpretation of variants, such as those of the ACMG-AMP [66]. Here, we propose a clinical classification of our findings on the basis of these guidelines. Overall, we think that our ACMG/AMP-like classification of 20 RAD51C pre-selected variants based on minigene data is rigorous, with most variants placed in the pathogenic/likely pathogenic category, while also highlighting up to four variants (c.705 + 5G > C, c.966-3C > A, c.966-2A > G, c.966-2A > T) that, despite being spliceogenic, require further studies to be definitively classified.
We would like to highlight as well that, at some points, our classification is based on decisions not necessarily shared by other experts in the field (e.g., replacing in silico predictions by functional evidence rather than combining both, see rationale below and in Supplementary Methods). For that reason, others may propose a different clinical classification. In turn, this highlights a relevant issue in variant classification, namely, the lack of standardization. Accordingly, our minigene-based ACMG/AMP-like classification approach (Table 2) was not intended to produce a definitive (i.e., authoritative) clinical classification of these variants (a prerequisite for that will be the completion of the ClinGen expert panel adaptation of the ACMG/AMP rules to RAD51C), but rather to highlight the complexity of determining the appropriate aggregate strength of combining predictive and functional splicing types of evidence into the ACMG/AMP classification framework without introducing inconsistencies into the system [67].
In the present study, we propose addressing these issues by a somewhat radical approach: replacing in silico predictions by functional evidence (rather than combining both). We think that this approach: (i) avoids the internal inconsistencies already mentioned, and (ii) recognizes the fact that predictive and functional splicing pieces of evidence are not truly independent from each other. Implicitly, the ACMG/AMP classification framework assumes that each piece of evidence is independent [68], an assumption hardly met by the predictive and functional criteria, as most functional analyses, such as the present study, are performed in variants pre-selected on the basis of bioinformatics predictions.
The ClinGen CDH1 expert panel has proposed to use PVS1_Strong (rather than PVS1) for GT-AG ± 1, 2 variants and combine these with RNA (PS3) or association (PS4) data to reach a pathogenic classification [69]. In a second iteration of the rules (www.clinicalgenome.org/affiliation/50014/), the authors refine the approach by stating that for PVS1_Strong variants (GT-AG ± 1, 2), PS3_moderate (rather than PS3) should be applied.
While the suggestion of "downgrading" the loss-of-function prediction for GT-AG ± 1, 2 variants (and encouraging RNA analyses) is appealing to us, the approach does not eliminate internal inconsistencies for GT-AG ± 1, 2 vs. other PTC-NMD variants (PVS1_Strong + PS3_moderate = Likely Pathogenic vs. PVS1 only = uncertain significance) and does not even address the issue for spliceogenic variants other than GT-AG ± 1, 2. Further, nothing is said about the appropriate strength of combining computational and functional splicing data if the evidence codes go in opposite directions.
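The two outcomes contrasted above (PVS1_Strong + PS3_moderate versus PVS1 alone) can be reproduced with the generic ACMG/AMP combining rules for pathogenic evidence. The sketch below encodes those generic 2015 rules as we understand them; it is not the adapted ACMG/AMP-like scheme used for Table 2, it ignores benign evidence entirely, and the function name and strength labels are illustrative assumptions.

```python
from collections import Counter
from typing import Iterable


def classify_pathogenic_side(strengths: Iterable[str]) -> str:
    """Combine pathogenic evidence strengths with the generic ACMG/AMP rules.

    Each item is one of: "very_strong", "strong", "moderate", "supporting".
    Benign evidence is deliberately not modelled in this sketch.
    """
    c = Counter(strengths)
    vs, s, m, p = c["very_strong"], c["strong"], c["moderate"], c["supporting"]

    pathogenic = (
        (vs >= 1 and (s >= 1 or m >= 2 or (m == 1 and p == 1) or p >= 2))
        or s >= 2
        or (s == 1 and (m >= 3 or (m == 2 and p >= 2) or (m == 1 and p >= 4)))
    )
    if pathogenic:
        return "Pathogenic"

    likely_pathogenic = (
        (vs == 1 and m == 1)
        or (s == 1 and 1 <= m <= 2)
        or (s == 1 and p >= 2)
        or m >= 3
        or (m == 2 and p >= 2)
        or (m == 1 and p >= 4)
    )
    if likely_pathogenic:
        return "Likely pathogenic"
    return "Uncertain significance"


# PVS1 downgraded to strong, plus PS3 downgraded to moderate.
print(classify_pathogenic_side(["strong", "moderate"]))
# PVS1 at full strength with no other evidence.
print(classify_pathogenic_side(["very_strong"]))
```

Under these rules a single very strong code never reaches a likely pathogenic call on its own, whereas one strong plus one moderate code does, which is the asymmetry the paragraph above describes.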
In our approach, the computational evidence does not contribute to the final clinical classification of functionally validated spliceogenic variants, but we do acknowledge a fundamental role for these predictions in selecting and prioritizing variants for subsequent splicing analyses. Indeed, we recommend running bioinformatic splicing predictions for all genetic variants regardless of their nature and/or location (i.e., nonsense, in-frame, and frameshift indels and synonymous, non-synonymous, and intronic variants). Further, once a variant is selected for splicing analysis, the predictions have a role in designing and/or validating the corresponding assays. For instance, a negative experimental result (no splicing effect) in a variant with strong computational evidence might point towards a sub-optimal experimental design (e.g., multi-exon skipping is missed due to wrong selection of primers). Conversely, a positive result (splicing alteration) for a variant with no strong computational evidence may suggest that it is not the presumed variant under investigation but another variant in cis (e.g., a deep intronic variant) that is causing the splicing alteration.
The "quality control" role of computational evidence is probably more relevant for assays performed in RNA from carriers than in minigene-based assays (e.g., in the latter approach there is no doubt about the variant under investigation). Yet, we argue that the concordance with computational evidence (as observed in the present study) is also relevant when considering minigene outputs as strong (or very strong) evidence towards pathogenicity.
Ultimately, validation of the pathogenicity will need to be based on the observed risk associated with the variants, either through case-control or family-based studies. It will be extremely challenging to evaluate risk for individual variants, since they are very rare, but it is possible in principle to evaluate the classification system as a whole. Furthermore, in BRIDGES, these spliceogenic variants account for 44.9% of all patients carrying a pathogenic/likely pathogenic variant (data not shown), indicating that a high proportion of RAD51C breast cancer risk-associated alleles displays splicing defects, as previously described for BRCA1 and BRCA2 [21].

Ethics Approval
Ethical approval for this study was obtained from the Ethics Committee of the Spanish National Research Council-CSIC (28/05/2018).

Variant and Transcript Annotations
BRIDGES sequencing data [44] identified a total of 40 different variants located at RAD51C splice sites (SS), defined for the purpose of the present study as: (i) intron/exon (IVS-10_IVS-1/2nt) boundaries (3′SS), and (ii) exon/intron (2nt/IVS + 1_IVS + 10) boundaries (5′SS). Variants and alternative transcripts were annotated according to the Human Genome Variation Society (HGVS) guidelines on the basis of the RAD51C GenBank sequence NM_058216.3. To simplify transcript annotation, we identified them with a shortened code that combines the following symbols [56,70]: ∆ (skipping of exonic sequences), ▼ (inclusion of intronic sequences), E (exon), p (acceptor shift), q (donor shift). When necessary, the number of deleted or inserted nucleotides is indicated. For example, ▼(E2q27) indicates the use of an alternative donor site downstream of exon 2 causing a 27-nt intron insertion.
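The shorthand transcript codes described above are regular enough to be parsed mechanically. The following is a minimal sketch of such a parser, assuming ▼ as the insertion symbol and ∆ as the skipping symbol; the class and function names are hypothetical and not part of any published tool.

```python
import re
from typing import NamedTuple, Optional


class TranscriptCode(NamedTuple):
    event: str                 # "skipping" or "insertion"
    exon: int                  # exon number
    shift: Optional[str]       # "acceptor", "donor", or None for a whole exon
    length_nt: Optional[int]   # nucleotides deleted or inserted, if stated


# Both the increment sign and the Greek capital delta are accepted for skipping.
_EVENTS = {"∆": "skipping", "Δ": "skipping", "▼": "insertion"}
_SHIFTS = {"p": "acceptor", "q": "donor"}
_PATTERN = re.compile(r"^([∆Δ▼])\(E(\d+)(?:([pq])(\d+))?\)$")


def parse_code(code: str) -> TranscriptCode:
    """Parse a shorthand code such as '∆(E5p10)' into its components."""
    match = _PATTERN.match(code.strip())
    if match is None:
        raise ValueError(f"unrecognized transcript code: {code!r}")
    symbol, exon, shift, length = match.groups()
    return TranscriptCode(
        event=_EVENTS[symbol],
        exon=int(exon),
        shift=_SHIFTS[shift] if shift else None,
        length_nt=int(length) if length else None,
    )


print(parse_code("▼(E2q27)"))
print(parse_code("∆(E5)"))
```

A code without a p/q suffix, such as ∆(E5), is read as an event affecting the whole exon, so its length is left unset rather than guessed.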

Bioinformatics Analysis
All 40 RAD51C variants from the intron-exon boundaries were analyzed to identify potential splicing variants using splice site prediction software (Table S1). Mutant and wild type sequences were analyzed with the MaxEntScan (MES) algorithm of Human Splicing Finder 3.1 [71,72], except for exon 8 donor variants, which were analyzed by NNSplice [73] because this site was not detected by MES. Potential spliceogenic variants were selected according to the following criteria: (i) splice site disruption at the AG/GT positions; (ii) important MES score changes (≥15%) [35,74]; (iii) creation of de novo splice sites; (iv) regardless of computer predictions, variants at other conserved positions of the acceptor (Y11NCAG|G) and donor (MAG|GTRAGT) consensus sequences, such as pyrimidine-to-purine changes or deletions at the polypyrimidine tract, nucleotide substitutions of a conserved nt at the intronic positions −3C, +3R, +4A, +5G, +6T, as well as the first (G) and the last three nucleotides of the exon (M, A, G).
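The four selection criteria above amount to a simple disjunction once the predictor outputs are available. The sketch below illustrates that logic under stated assumptions: the scores are supplied by the user (it does not compute MaxEntScan or NNSplice values), the 15% threshold is applied to the magnitude of the relative change, and all names are hypothetical.

```python
from dataclasses import dataclass

MES_CHANGE_THRESHOLD = 15.0  # percent, criterion (ii)


def mes_percent_change(wt_score: float, mut_score: float) -> float:
    """Relative change of the mutant splice-site score versus wild type, in percent."""
    if wt_score == 0:
        raise ValueError("wild type score must be non-zero")
    return (mut_score - wt_score) * 100.0 / abs(wt_score)


@dataclass
class Prediction:
    wt_mes: float
    mut_mes: float
    disrupts_ag_gt: bool = False           # criterion (i)
    creates_de_novo_site: bool = False     # criterion (iii)
    hits_conserved_position: bool = False  # criterion (iv)


def is_candidate(p: Prediction) -> bool:
    """Return True if any of the four selection criteria is met."""
    change = mes_percent_change(p.wt_mes, p.mut_mes)
    return (
        p.disrupts_ag_gt
        or abs(change) >= MES_CHANGE_THRESHOLD
        or p.creates_de_novo_site
        or p.hits_conserved_position
    )


# Illustrative scores only; these are not values from Table S1.
print(mes_percent_change(10.0, 8.0))
print(is_candidate(Prediction(wt_mes=10.0, mut_mes=9.5)))
print(is_candidate(Prediction(wt_mes=10.0, mut_mes=9.5, hits_conserved_position=True)))
```

Criterion (iv) is kept as an independent flag because, as stated above, variants at conserved positions were selected regardless of the computed score change.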

Minigene Construction and Mutagenesis
RAD51C has 9 exons but all the potential spliceogenic variants from BRIDGES subjects were located in exons 2 to 8. Therefore, an insert (3731 bp) with exons 2 to 8 and their respective flanking intronic sequences was designed in our laboratory and then synthesized at the Genewiz facility (Genewiz, South Plainfield, NJ, USA) ( Figure S1). This fragment was cloned into the splicing vector pSAD (Patent P201231427-CSIC) [41,75] between the restriction sites BamHI and EcoRI. The wild type minigene mgR51C_ex2-8 was used as template to generate 20 candidate BRIDGES DNA variants (Table S2) with the QuikChange Lightning kit (Agilent, Santa Clara, CA, USA). All constructs were confirmed by sequencing (Macrogen, Madrid, Spain). The whole protocol is outlined in Figure 3.
In order to quantify all transcripts relative to each other, semi-quantitative fluorescent RT-PCRs were performed in triplicate with primers PSPL3_RT-FW and RTpSAD-RV (FAM-labelled) and Platinum Taq DNA polymerase (Life Technologies, Carlsbad, CA, USA) under the above standard conditions except that 26 cycles were herein applied [31,41]. FAM-labeled products were run with LIZ-1200 Size Standard at the Macrogen facility and analyzed with the Peak Scanner software V1.0. Only peak heights ≥50 RFU (Relative Fluorescence Units) were considered. Furthermore, MCF-7 and Human Breast Total RNAs (Agilent, cat. no. 540045, discontinued) were retrotranscribed with primer RTR51C_ex9-RV (5′-ACATGCAGAAGTAACAACAG-3′) and then amplified with primers RTR51C_ex1-FW (5′-GAACTCCTAGAGGTGAAAC-3′) and again RTR51C_ex9-RV labelled with FAM (amplicon length: 957 bp) in the same above PCR conditions except that the annealing temperature was set at 58 °C. Mean peak areas of three independent experiments of each variant were used to calculate the relative proportions of each transcript and standard deviations.
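The quantification step described above (discard peaks below 50 RFU, then express each transcript as a proportion of the total signal across three experiments) can be sketched as follows. This is one possible reading, in which proportions are computed per replicate from peak areas and then averaged; the input layout and function names are assumptions, and the numbers are invented for illustration.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple

MIN_PEAK_HEIGHT_RFU = 50  # peaks below this height are ignored

# One replicate maps a transcript name to (peak height in RFU, peak area).
Replicate = Dict[str, Tuple[float, float]]


def relative_proportions(peaks: Replicate) -> Dict[str, float]:
    """Percentage of total peak area per transcript, after the height filter."""
    kept = {
        name: area
        for name, (height, area) in peaks.items()
        if height >= MIN_PEAK_HEIGHT_RFU
    }
    total = sum(kept.values())
    if total == 0:
        return {}
    return {name: 100.0 * area / total for name, area in kept.items()}


def summarize(replicates: List[Replicate]) -> Dict[str, Tuple[float, float]]:
    """Mean and standard deviation of each transcript's percentage across replicates."""
    per_replicate = [relative_proportions(r) for r in replicates]
    names = sorted(set().union(*per_replicate))
    summary = {}
    for name in names:
        # A transcript absent from a replicate counts as 0% in that replicate.
        values = [props.get(name, 0.0) for props in per_replicate]
        summary[name] = (mean(values), stdev(values) if len(values) > 1 else 0.0)
    return summary


# Invented peak data for three replicates of one hypothetical variant.
runs = [
    {"FL": (1200, 900.0), "skip": (300, 100.0), "noise": (30, 40.0)},
    {"FL": (1100, 880.0), "skip": (320, 120.0)},
    {"FL": (1250, 920.0), "skip": (280, 80.0)},
]
for transcript, (avg, sd) in summarize(runs).items():
    print(f"{transcript}: {avg:.1f}% +/- {sd:.1f}")
```

Averaging the areas first and then computing proportions would give slightly different numbers; the text does not say which order was used, so the choice here is only a convenience.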

ACMG/AMP-Like Classification of 20 RAD51C Variants Based on PS3/BS3 Functional Evidence
Since no ClinGen RAD51C Expert panel specifications of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG/AMP) variant curation guidelines are currently available (www.clinicalgenome.org/), we performed a tentative classification (ACMG/AMP-like) based on: (i) generic ACMG/AMP guidelines [66]; (ii) specific aspects of the ClinGen Sequence Variant Interpretation Working Group (ClinGen-SVI) recommendations for interpreting the loss-of-function PVS1 and functional PS3/BS3 evidence codes [67,76]; (iii) some non-gene specific approaches developed by the ClinGen CDH1 variant curation expert panel [69], and (iv) expert judgment.
In addition to PS3/BS3 (functional evidence, in this case based on splicing data obtained from minigene analysis), only the rarity code (PM2) made a major contribution to the classification process. In a subset of variants, the codes for association with disease (PS4) and for detection in trans with a pathogenic variant in Fanconi Anemia patients (PM3) also contributed (see Table 2 and Methods for further details). Of note, we excluded the use of predictive evidence codes (i.e., PVS1/PP3) from our classification approach because (i) splicing predictive and functional evidence are not independent from each other, and (ii) incorporating both types of evidence into the framework creates internal inconsistencies (see Discussion).

Conclusions
We have shown that aberrant splicing of RAD51C represents a relevant pathogenic mechanism in breast cancer susceptibility. The functional study of variants provides critical data that may increase the number of families that benefit from preventive or therapeutic measures. In this regard, pSAD-derived minigenes have proven to be robust and high-capacity approaches for the primary characterization of variant-associated defective splicing, since they replicate splicing results of patient RNA, as we have shown in RAD51C and other disease genes [57,77,78,79]. By these means and the application of ACMG-AMP-based criteria, we have classified 15 RAD51C variants as pathogenic or likely pathogenic, which constitutes the largest number of spliceogenic variants of this gene reported so far.