Genomic Mapping of Splicing-Related Genes Identify Amplifications in LSM1, CLNS1A, and ILF2 in Luminal Breast Cancer

Simple Summary The alternative splicing (AS) process is highly relevant, affecting most of the hallmarks of cancer, such as proliferation, angiogenesis, and metastasis. Our study evaluated alterations in 304 splicing-related genes and their prognosis value in breast cancer patients. Amplifications in CLNS1A, LSM1, and ILF2 genes in luminal patients were significantly associated with poor outcome. Downregulation of these genes in luminal cell lines showed an antiproliferative effect. Pharmacological modulation of transcription and RNA regulation is key for the optimal development of therapeutic strategies against key proteins. Administration of a BET inhibitor and BET-PROTAC reduced the expression of these identified genes and displayed a significant antiproliferative effect on these cell models. In conclusion, we describe novel splicing genes amplified in luminal breast tumors that are associated with detrimental prognosis and can be modulated pharmacologically. It opens the door for further studies confirming the effect of these genes in patients treated with BET inhibitors. Abstract Alternative splicing is an essential biological process, which increases the diversity and complexity of the human transcriptome. In our study, 304 splicing pathway-related genes were evaluated in tumors from breast cancer patients (TCGA dataset). A high number of alterations were detected, including mutations and copy number alterations (CNAs), although mutations were less frequently present compared with CNAs. In the four molecular subtypes, 14 common splice genes showed high level amplification in >5% of patients. Certain genes were only amplified in specific breast cancer subtypes. Most altered genes in each molecular subtype clustered to a few chromosomal regions. In the Luminal subtype, amplifications of LSM1, CLNS1A, and ILF2 showed a strong significant association with prognosis. An even more robust association with OS and RFS was observed when expression of these three genes was combined. Inhibition of LSM1, CLNS1A, and ILF2, using siRNA in MCF7 and T47D cells, showed a decrease in cell proliferation. The mRNA expression of these genes was reduced by treatment with BET inhibitors, a family of epigenetic modulators. We map the presence of splicing-related genes in breast cancer, describing three novel genes, LSM1, CLNS1A, and ILF2, that have an oncogenic role and can be modulated with BET inhibitors.


Introduction
The RNA splicing process regulates gene expression in eukaryotic cells through a complex process in which introns are removed from precursor RNAs (pre-mRNAs) and consecutive exons are precisely joined together to produce mature mRNAs, with the final goal of maintaining mature transcripts to guarantee a successful translation process [1]. The alternative splicing process (AS) is the way in which exons or portions of exons or non-coding regions within a pre-mRNA transcript are differentially joined or skipped, resulting in multiple protein isoforms being encoded by a single gene [1]. Alternative splicing (AS) contributes to transcriptome (and proteome) diversity in development-and tissue-regulated pathways, as well as in response to multiple physiological signals [2]. Remarkably, large-scale proteomic studies suggest that many predicted alternative transcripts are not translated into proteins, so the exact contribution of AS to protein diversity is currently under dispute [3,4]. On top of that, some authors have suggested a role for AS in buffering mutational consequences [5], and mounting evidence indicates that AS coupled to nonsense-mediated decay is a major post-transcriptional regulator of gene expression [6,7]. Five major types of AS have been described: exon skipping, mutually exclusive exons, intron retention, and alternative 5 or 3 splice site [8]. The AS process is carried out by the spliceosome and consists of four stages: assembly, activation, catalysis or splicing, and disassembly. In each specific stage of a splicing cycle, different spliceosome subcomplexes are involved (pre-B, B, Bact, B*, C, C*, P, and ILS complexes), which are composed of small nuclear ribonucleoproteins (snRNPs) and non-snRNPs splicing factors [9]. AS is a highly regulated process, with five snRNPs and over 300 non-snRNP proteins identified as recruited to the spliceosome at these specific stages [10].
Changes due to AS can affect the translation rate and the functional role of the mRNA. AS can act on different cellular and biological processes or be involved in tissue specificity, developmental states, or disease conditions, such as cancer [11]. It has a relevant role in cancer development and maintenance, affecting most of the hallmarks of cancer [12,13]. In addition, it can be involved in cancer relapse or resistance to different treatment modalities [12]. Thus, specific isoforms have been identified promoting and supporting neoplastic transformation and tumor development. In a variety of tumor types, AS has been linked to up-regulation of oncogenes, participating in different processes of tumor development, including angiogenesis, cell division, altered metabolism, proliferation, or metastasis [10,14]. In addition, they can also contribute to the deregulation of several non-oncogenic vulnerabilities that are also relevant in the initiation and maintenance of the oncogenic transformation [12].
Alterations in the AS machinery have been identified in different human tumors, and they can affect a network of downstream splicing targets. Using high throughput methodologies, Seiler et al. have described somatic mutations in 119 splicing factors in 33 tumor types, bladder carcinoma and uveal melanoma being those with higher frequencies [15]. Moreover, mutations in splicing factors appear to be mutually exclusive within a tumor, which might indicate that co-occurrence of these mutations may be lethal [15].
Specifically in breast cancer, AS affects major breast cancer-related proteins, such as the estrogen receptor (ER), BRCA1, and BRCA2, among others [16]. Thus, disequilibrium between ERα66 and ERα36 induce abnormal proliferation, and high levels of ERα36 can cause resistance to Tamoxifen [16]. Alterations in components of the regulatory splicing machinery have been described in breast cancer. For example, SF3B1 is involved in the 3 -SS recognition and is one of the most commonly mutated genes with a higher frequency in the metastatic setting [17]. Mutant SF3B1 produces aberrant splicing, inducing metabolic reprogramming [18]. In addition, AS has been described to have a role in drug resistance. For instance, it has been described that, in carriers of BRCA1 exon 11 premature termination codon variants, tumors upregulate exon 11 skipping to produce isoforms that retain residual activity, contributing to PARPi resistance [19]. Overexpression of SF3B1 and SF3B3 are associated with tamoxifen and fulvestrant resistance, and inhibition of another splicing factors, such as ZRANB2 and SYF2, reduces resistance to doxorubicin in breast cancer cells [20,21]. On the other hand, SRSF4 induces apoptosis in cancer cells, in combination with platinum agents [22].
In our study, alterations in 304 splicing factors were evaluated in breast cancer patients using several large datasets. We found high frequency of amplification in CLNS1A, LSM1, and ILF2 in Luminal tumors, with a significant association with poor prognosis. Despite the limited information about these genes, they have been associated with oncogenic processes and resistance to treatments. IFL2 deregulation has been related to an aberrant RNA splicing pattern, mainly deregulated skipped exons in genes involved in DNA repair [23]. LSM1 is included in the heteroheptameric complex LSM1-7, which initiates mRNA decay [24]. CLNS1A acts as a Sm chaperone, recruiting Sm proteins to the PRMT5 complex [25]. In our study, genomic regulation of these genes, with a reduction of their expression, decreased proliferation of luminal tumor cells. In addition, treatment with epigenetic modulators, such as the Bromodomain and extraterminal (BET) family of inhibitors, reduced the expression level of these genes, leading to cell growth reduction.

Data Collection and Processing
Processed TCGA (The Cancer Genome Atlas) PanCancer dataset was downloaded through cBioportal (www.cbioportal.org; accessed on 4 December 2019). This dataset contains whole exome sequencing and RNA-Seq data from patients with breast invasive carcinoma, consisting of 696 Luminal, 78 HER2 positive, and 171 Basal tumors and their matched normal tissues. WES data was used to explore CNAs and mutations in 304 splicing factor genes. Splicing related genes were collected from four sources: HUGO Gene Nomenclature Committee and the studies of Hegele et al. [26], Wan et al. [9], and Koedoot et al. [20]. Only somatic non-silent mutations in splicing factor genes were considered (missense, premature termination codon, and IVS+-1,2). Somatic non-silent mutations in splicing factor genes were only considered. In the PanCancer dataset, identification of somatic single nucleotide variations was performed using Mutect. CNAs were assessed as deviations in the tumor sample from the paired normal tissue sample using GISTIC 2.0. GISTIC 2.0 identifies regions significantly amplified or deleted and lists genes found in each "wide peak" region [27]. Value +/− 2 indicates high-level thresholds for amplifications/deletions, respectively, and those with +/− 1 exceed the low-level thresholds but not the high-level thresholds. In addition, the Metabric dataset (www.cbioportal.org; accessed on 4 December 2019) was used to validate results of identified genes with a high level of amplification in >5% of tumors.

Outcome Analysis
The relationship between gene expression levels and patient clinical prognosis in terms of relapse-free survival (RFS) and overall survival (OS) was evaluated using the Kaplan-Meier Plotter platform, as described previously [28,29] (accessed on 6 June 2020). This tool used gene expression and survival data from 7830 breast tumors (sources include GEO, EGA and TCGA). Samples were split into two groups using the best threshold as the cutoff (auto select best cutoff). When testing multiple genes, the analysis was performed using the mean expression. Patients above the threshold were labelled as "high" expressing, while patients below the threshold were labelled as "low" expressing. The two groups were compared using Cox survival analysis. The prognostic value of the identified signature (containing LSM1, CLSN1A, and ILF2) was validated using the TCGA project.
The correlation between CNAs and patient clinical outcome was analyzed using the Genotype-2-Outcome platform (accessed on 8 January 2021) [30]. This tool links genotype to clinical outcome by utilizing next generation sequencing and gene chip data of 6697 breast cancer patients. It allows the association with prognosis of a specific transcriptomic signature linked to an altered gene, by classifying patients according to the average expression of significant genes designated as a surrogate metagene of its alteration status. The median expression values for different transcripts are used as a cut-off to discriminate "high" and "low" expression cohorts, which are compared using a Cox survival analysis. To identify factors independently associated with OS and RFS, a multivariate analysis (Cox proportional risk regression model) was performed.

Cell Culture and Compounds
MCF7 and T47D cells (American Type Culture Collection, Manassas, VA, USA) were cultured in DMEM (Sigma-Aldrich, Saint Louis, MO, USA) containing 10% fetal bovine serum with 100 U/mL penicillin, 100 µg/mL streptomycin, and 2 mM L-glutamine, and cells were maintained at 37 • C in a 5% CO 2 atmosphere. The BET inhibitor JQ1 was purchased from Tocris Bioscience, and BET-PROTAC MZ1 was purchased from Selleckchem (Houston, TX, USA).

Quantitative Reverse-Transcription PCR
Total RNA was collected from cells using the RNeasy Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. Determination of concentration and purity were measured using a NanoDrop ND-1000 spectrophotometer (Thermo Fisher Scientific, Rockford, IL, USA), and then 1 µg of total RNA was reverse transcribed using the RevertAid H Minus first-strand cDNA synthesis kit (Thermo Fisher Scientific) in a thermal cycler (Bio-Rad, Hercules, CA, USA) under the following reaction conditions: 65 • C for 5 min, 42 • C for 60 min, and 70 • C for 10 min. cDNAs were then subjected to real-time PCR analysis using Fast SYBR Green master mix on the StepOnePlus Real-Time PCR system (Applied Biosystems) according to the manufacturer's instructions. Primer sequences used were as follows: h-LSM1 F: TTCCTCGAGGGATTTTTGTG, h-LSM1 R: TTCTCTGCTTCCAGCTTGGT, h-CLNS1A F: TCGGCACTGGTACCCTTTAC, h-CLNS1A R: AATGGTGGGGTATTCCAGTG, h-ILF2 F: GCTCCAGGGACATTTGAAGT, h-ILF2 R: CAGCCACATTGTGTCCTGTAG, h-18S F: GAGGATGAGGTGGAACGTGT, h-18S R: TCTTCAGTCGCTCCAGGTCT. An initial step was performed at 95 • C for 5 min, followed by 40 cycles of 95 • C for 15 s, and finished at 60 • C for 1 min. Each sample was analyzed in triplicates, and cycle threshold (Ct) values of transcripts were determined using StepOne Software v.2.1. Ct values were calculated using the 18Sas reference. Untreated control cells were used as the control to calculate the Ct value and determine the X-fold mRNA expression.
For antiproliferative drugs validation, MCF7 and T47D cells (5000/well, 48-multiwell plates) were seeded. After, they were treated with increased doses of JQ1 and MZ1 for 72 h. Later medium was replaced with red-phenol free DMEM containing MTT (0.5 mg/mL) and incubated for 45 min at 37 • C. After, medium was removed and dimethylsulfoxide (DMSO) (Thermo Fisher Scientific) was used for dissolved formazan accumulates. Absorbance (A555 nm-A690 nm) was recorded in a multiwell plate reader (BMG labtech, Ortenberg, Germany).

Growth Studies
To compare the growth between siRNAs-transfected cells and siGFP-transfected cells (control), proliferation rate was studied by cell count. Cell lines were cultured at a density of 50,000 cells in 6-well. At the times of 24 and 48 h, cells were trypsinized and counted. Images was performed at 48 h using inverted microscope (10×).

Cell Cycle Assay
siRNAs-transfected cells (MCF7 and T47D) were collected and fixed in ethanol (70%, cold) for 30 min. Cell pellets were washed in PBS+2% BSA and incubated in the dark for 1 h at 4 • C with Propidium iodide/RNAse staining solution (Immunostep).

Statistical Analysis
We used the student's t-test unpaired for independent samples. The level of significance was considered 95%; therefore, p values lower than 0.05 were considered statistically significant: * p < 0.05; ** p < 0.01, and *** p < 0.001. Statistics and representations were made with statistical software GraphPad Prism 7.0. All results (unless indicated) are presented as the mean ± SEM of three independent experiments, each of them performed in triplicate.

Mutations in Splicing-Related Genes
Alterations in 304 splicing-related genes (Supplementary Table S1 Figure S1). When patients were classified based on molecular subtypes, several differences were observed in the distribution of the identified genes. Regarding the 278 genes with presence of mutations, the number of altered genes were 231 (83.1%) for Luminal A, 157 (56.5%) for Luminal B, 175 (62.9%) for Basal, and 150 (54%) for HER2+ subtype ( Figure 1A). Tumors with modifications in any of these genes were observed in a higher percentage in HER2+ (70.5%) and Basal (66.1%) compared with the Luminal A and B subtypes (47.1% and 61.9%, respectively) ( Figure 1B). When the frequency of tumors with alterations in each gene were evaluated, HER2+ and the basal subtype population showed splicing-related genes with a higher percentage of alterations ( Figure 1C; Supplementary Table S2). In the four molecular subtypes, no gene was detected to be mutated in more than 6% of tumors. Splicing genes with mutations in >3% of tumors are displayed in Figure 1C and mainly belonged to the HER2+ and Basal subgroups ( Figure 1D). with alterations in each gene were evaluated, HER2+ and the basal subtype population showed splicing-related genes with a higher percentage of alterations ( Figure 1C; Supplementary Table S2). In the four molecular subtypes, no gene was detected to be mutated in more than 6% of tumors. Splicing genes with mutations in >3% of tumors are displayed in Figure 1C and mainly belonged to the HER2+ and Basal subgroups ( Figure 1D).

Copy Number Alterations (CNAs) in Splicing-Related Genes
The TCGA PanCancer series also includes putative copy-number data [31]. Thus, we evaluated the following changes in splicing-related genes: homozygous or hemizygous deletions, no change, gain, and high level of amplification. In this large series of breast cancer patients, we found information about 301 genes from those identified. High level of amplification (GISTIC thresholded CN value of +2) in any of these genes were detected in a high percentage of tumors (HER2+: 87.2%; Basal: 81.3%; Luminal A: 51.1%; and Luminal B: 67.5%). Considering only those genes in regions with homozygous deletion or a high level of amplification in >5% of patients, we found 61 altered splicing-related genes (58 amplified and 3 loss). Regarding the molecular subtypes, 33 (10.9%) splicing-related genes were altered in the Basal subtype (30 amplified and 3 loss), 41 (13.6%) amplified in HER2+, 30 (9.9%) amplified in Luminal A, and 28 (9.3%) altered in Luminal B (26 amplified and 2 loss) (Figure 2A-D, respectively). Therefore, a large number of genes showed higher frequencies of CNAs versus mutations (only 6 genes with mutations in >5% of patients, Figure 1D). In the four molecular subtypes, 14 common splicing-related genes showed a high level of amplification in >5% of tumors ( Figure 2E). On the other hand, certain genes were amplified only in specific subtypes ( Figure 2F). A complete list of genes is displayed in the Supplementary Table S3.

Copy Number Alterations (CNAs) in Splicing-Related Genes
The TCGA PanCancer series also includes putative copy-number data [31]. Thus, we evaluated the following changes in splicing-related genes: homozygous or hemizygous deletions, no change, gain, and high level of amplification. In this large series of breast cancer patients, we found information about 301 genes from those identified. High level of amplification (GISTIC thresholded CN value of +2) in any of these genes were detected in a high percentage of tumors (HER2+: 87.2%; Basal: 81.3%; Luminal A: 51.1%; and Luminal B: 67.5%). Considering only those genes in regions with homozygous deletion or a high level of amplification in >5% of patients, we found 61 altered splicing-related genes (58 amplified and 3 loss). Regarding the molecular subtypes, 33 (10.9%) splicing-related genes were altered in the Basal subtype (30 amplified and 3 loss), 41 (13.6%) amplified in HER2+, 30 (9.9%) amplified in Luminal A, and 28 (9.3%) altered in Luminal B (26 amplified and 2 loss) (Figure 2A-D, respectively). Therefore, a large number of genes showed higher frequencies of CNAs versus mutations (only 6 genes with mutations in >5% of patients, Figure 1D). In the four molecular subtypes, 14 common splicing-related genes showed a high level of amplification in >5% of tumors ( Figure 2E). On the other hand, certain genes were amplified only in specific subtypes ( Figure 2F). A complete list of genes is displayed in the Supplementary Table S3. With this information, we next aimed to explore the chromosome location of splicingrelated genes with a high level of amplification or homozygous deletion. Interestingly, most altered splicing-related genes in each molecular subtype were distributed in a few chromosome regions: 1q, 8q, and 17q, as shown in Figure 3A. In total, 12 out of 14 common amplified genes were located in 1q and 8q. Different altered regions were specific for each subtype: (a) 10p, 12p 13q, 15q, and 19q for Basal; (b) 6q, 17q, and 3q for HER2+; (c) all genes in the 16p region for Luminal A; and (d) no one for Luminal B (Figure 3A; Supplementary Table S3). Copy-number gain in these regions is represented in Figure 3B. With this information, we next aimed to explore the chromosome location of splicingrelated genes with a high level of amplification or homozygous deletion. Interestingly, most altered splicing-related genes in each molecular subtype were distributed in a few chromosome regions: 1q, 8q, and 17q, as shown in Figure 3A. In total, 12 out of 14 common amplified genes were located in 1q and 8q. Different altered regions were specific for each subtype: (a) 10p, 12p 13q, 15q, and 19q for Basal; (b) 6q, 17q, and 3q for HER2+; (c) all genes in the 16p region for Luminal A; and (d) no one for Luminal B ( Figure 3A; Supplementary  Table S3). Copy-number gain in these regions is represented in Figure 3B.

Associations of Splicing-Related Genes with Clinical Outcome in Patients
To identify which of the identified genes could have a role in cancer progression, we intended to link the described data with patient clinical outcome. To do so, we used published transcriptomic microarray data, as described elsewhere [32]. The prognostic value of the high amplified genes (with a cutoff of >5% of tumors) were analyzed in a dataset of 6234 breast cancer patients ( Figure 4A). CNA frequencies in identified genes were validated in an additional dataset (Supplementary Figure S2). Frequencies of high level of amplified genes were correlated between both datasets. Alterations in splicing-related genes were most frequently observed in the HER2+ and Basal-like subtypes. However, as there were few patients in these subtypes, it was not possible to establish the prognostic value for most amplified genes. Despite the low number of patients in this subgroup, several genes showed association with RFS and OS ( Figure 4B and Supplementary Table S4). We focused on those genes, with a clear impact on survival by using an arbitrary selection based on statistical outcome relevance and low false discovery rate (FDR) (p < 0.002, Hazard Ratio (HR) > 1.5; FDR < 5). For Luminal A, high expression of ESRP1, LSM1, CLNS1A, ILF2, and PPP1CA showed a clear association with detrimental OS and RFS ( Figure 4B). In the same way, high levels of these first four genes were associated with a poor prognostic in the Luminal B subtype. In the Luminal series, CNAs were significant associated with expression levels in these genes (Supplementary Figure S3).
Next, to confirm the prognostic role of the alterations described before, we used a transcriptomic fingerprint of the amplified splicing-related genes by using the genotype-2-outcome ( Figure 4C). With this approach, we can obtain the clinical outcome of a gene signature associated with a specific genomic alteration, including gene amplification, as described in the materials and methods section. Thus, the transcriptomic fingerprint associated with the amplification of LSM1, CLNS1A, and ILF2 showed a strong association with survival (Supplementary Figure S4). The transcripts included in the signatures associated with the CNA gain of LSM1, CLNS1A, and ILF2 are displayed in Supplementary  Table S5.
In Figure 4D, we summarized the prognostic value (RFS and OS), percentage of amplification, subtype, and chromosome location of the identified genes. In addition, a more robust association with OS and RFS was observed when expression of these three genes

Associations of Splicing-Related Genes with Clinical Outcome in Patients
To identify which of the identified genes could have a role in cancer progression, we intended to link the described data with patient clinical outcome. To do so, we used published transcriptomic microarray data, as described elsewhere [32]. The prognostic value of the high amplified genes (with a cutoff of >5% of tumors) were analyzed in a dataset of 6234 breast cancer patients ( Figure 4A). CNA frequencies in identified genes were validated in an additional dataset (Supplementary Figure S2). Frequencies of high level of amplified genes were correlated between both datasets. Alterations in splicing-related genes were most frequently observed in the HER2+ and Basal-like subtypes. However, as there were few patients in these subtypes, it was not possible to establish the prognostic value for most amplified genes. Despite the low number of patients in this subgroup, several genes showed association with RFS and OS ( Figure 4B and Supplementary Table S4). We focused on those genes, with a clear impact on survival by using an arbitrary selection based on statistical outcome relevance and low false discovery rate (FDR) (p < 0.002, Hazard Ratio (HR) > 1.5; FDR < 5). For Luminal A, high expression of ESRP1, LSM1, CLNS1A, ILF2, and PPP1CA showed a clear association with detrimental OS and RFS ( Figure 4B). In the same way, high levels of these first four genes were associated with a poor prognostic in the Luminal B subtype. In the Luminal series, CNAs were significant associated with expression levels in these genes (Supplementary Figure S3).
was combined together ( Figure 4E). Finally, the prognostic role of the identified signature in the Luminal A subtype was confirmed in a validation cohort, confirming the reproducibility of the findings described before (TCGA dataset; Figure 4F). Univariate and multivariate COX regression analysis showed that the combination of LSM1, CLNS1A, and ILF2 was a clear, independent prognostic factor, mainly with OS (Supplementary Table  S6).

Genomic Down-Regulation of LSM1, CLNS1A, and ILF2 Reduces Cell Proliferation
To validate LSM1, CLNS1A, and ILF2 dependency in Luminal breast cancer cells lines, mRNA expression of these genes was downregulated by using small interfering RNA (siRNA). LSM1, CLNS1A, and ILF2 downregulation in MCF7 and T47D efficiently reduced gene expression, as shown in Figure 5A. Cell growth ( Figure 5B,C) and cell proliferation, evaluated as MTT metabolization ( Figure 5D), was reduced after siRNA knockdown of the mentioned splicing genes. Growth reduction was observed clearer with gene interfering of LSM1 in MCF7 cells and CLNS1A in T47D. The antiproliferative effect produced after gene inhibition evaluated as a metabolization of MTT was significantly observed after 120 h, with no differences at a shorter time. To explore how the mechanism for genomic down-regulation of LSM1, CLNS1A, and ILF2 inhibits cell proliferation, we performed cell cycle analysis using propidium iodure. No major changes were observed in cell cycle phases, only CLNS1A down-regulation showed a G0/G1 arrest in T47D cells in accordance with previous findings ( Figure 5E). Next, to confirm the prognostic role of the alterations described before, we used a transcriptomic fingerprint of the amplified splicing-related genes by using the genotype-2-outcome ( Figure 4C). With this approach, we can obtain the clinical outcome of a gene signature associated with a specific genomic alteration, including gene amplification, as described in the materials and methods section. Thus, the transcriptomic fingerprint associated with the amplification of LSM1, CLNS1A, and ILF2 showed a strong association with survival (Supplementary Figure S4). The transcripts included in the signatures associated with the CNA gain of LSM1, CLNS1A, and ILF2 are displayed in Supplementary Table S5.
In Figure 4D, we summarized the prognostic value (RFS and OS), percentage of amplification, subtype, and chromosome location of the identified genes. In addition, a more robust association with OS and RFS was observed when expression of these three genes was combined together ( Figure 4E). Finally, the prognostic role of the identified signature in the Luminal A subtype was confirmed in a validation cohort, confirming the reproducibility of the findings described before (TCGA dataset; Figure 4F). Univariate and multivariate COX regression analysis showed that the combination of LSM1, CLNS1A, and ILF2 was a clear, independent prognostic factor, mainly with OS (Supplementary Table S6).

Genomic Down-Regulation of LSM1, CLNS1A, and ILF2 Reduces Cell Proliferation
To validate LSM1, CLNS1A, and ILF2 dependency in Luminal breast cancer cells lines, mRNA expression of these genes was downregulated by using small interfering RNA (siRNA). LSM1, CLNS1A, and ILF2 downregulation in MCF7 and T47D efficiently reduced gene expression, as shown in Figure 5A. Cell growth ( Figure 5B,C) and cell proliferation, evaluated as MTT metabolization ( Figure 5D), was reduced after siRNA knockdown of the mentioned splicing genes. Growth reduction was observed clearer with gene interfering of LSM1 in MCF7 cells and CLNS1A in T47D. The antiproliferative effect produced after gene inhibition evaluated as a metabolization of MTT was significantly observed after 120 h, with no differences at a shorter time. To explore how the mechanism for genomic down-regulation of LSM1, CLNS1A, and ILF2 inhibits cell proliferation, we performed cell cycle analysis using propidium iodure. No major changes were observed in cell cycle phases, only CLNS1A down-regulation showed a G0/G1 arrest in T47D cells in accordance with previous findings ( Figure 5E).

BET Inhibitors Reduce the Expression of LSM1, CLNS1A, and ILF2
Epigenetic agents can modulate the expression of genes that have a role in transcription and maturation [33]. With this in mind, we explored the effect of Bromo and Extra terminal domain (BET) inhibitors and BET derivatives, such as BET-Proteolysis targeting chimeras (PROTAC), on the expression of the identified genes. Administration of the BET

BET Inhibitors Reduce the Expression of LSM1, CLNS1A, and ILF2
Epigenetic agents can modulate the expression of genes that have a role in transcription and maturation [33]. With this in mind, we explored the effect of Bromo and Extra terminal domain (BET) inhibitors and BET derivatives, such as BET-Proteolysis targeting chimeras (PROTAC), on the expression of the identified genes. Administration of the BET inhibitor (JQ1) and BET-PROTAC (MZ1) produced a reduction of the gene expressions of LSM1, CLNS1A, and ILF2. In MCF7 cells, ILF2 was downregulated with MZ1 treatment after 12 and 24 h of administration. Moreover, LSM1 was downregulated with MZ1 after 12 h ( Figure 6A). In T47D cells, after 12 h of treatment, these three genes were downregulated by both JQ1 and MZ1. This effect was maintained at 24 h of treatment for MZ1, but not for JQ1, suggesting that the PROTAC had a more prolonged effect ( Figure 6B). Following these findings, we explored their effect on cell growth. We observed that JQ1 and MZ1 displayed an antiproliferative effect in Luminal cells lines ( Figure 6C). EC50 values showed that MZ1 PROTAC was more potent than the inhibitor JQ1 ( Figure 6D). In summary, these findings confirm the modulation of the expression of these three genes by JQ1 and MZ1 and the pharmacological effect of these agents on cell proliferation.  Figure 6A). In T47D cells, after 12 h of treatment, these three genes were downregulated by both JQ1 and MZ1. This effect was maintained at 24 h of treatment for MZ1, but not for JQ1, suggesting that the PROTAC had a more prolonged effect ( Figure 6B). Following these findings, we explored their effect on cell growth. We observed that JQ1 and MZ1 displayed an antiproliferative effect in Luminal cells lines ( Figure 6C). EC50 values showed that MZ1 PROTAC was more potent than the inhibitor JQ1 ( Figure 6D). In summary, these findings confirm the modulation of the expression of these three genes by JQ1 and MZ1 and the pharmacological effect of these agents on cell proliferation.

Discussion
In the present article, we characterize the presence and role of genomic alterations in splicing genes in breast cancer. Splicing is a biological process that permits transcriptional diversity and redundancy of molecular functions, allowing the integrity of key cellular activities [34]. Transcriptional regulation by splicing has been involved in the control of different biological tasks from DNA damage, to cell survival, or stemness, among others [13]. In this context, several genes implicated in splicing have been described in cancer, leading to the promotion of different oncogenic properties. For instance, some known factors, such as SRSF1, have been described as overexpressed in cancer, leading to malignant transformation by an alternative splicing of genes involved in proliferation and apoptosis [35]. Other examples include RBM39 in Acute Myeloid Leukemia or RBM11 in glioblastoma cells, among others [36,37]. In breast cancer, mutations in SF3B1 are more frequently observed in the metastatic setting, and its potential role in the regulation of protein deg-

Discussion
In the present article, we characterize the presence and role of genomic alterations in splicing genes in breast cancer. Splicing is a biological process that permits transcriptional diversity and redundancy of molecular functions, allowing the integrity of key cellular activities [34]. Transcriptional regulation by splicing has been involved in the control of different biological tasks from DNA damage, to cell survival, or stemness, among others [13]. In this context, several genes implicated in splicing have been described in cancer, leading to the promotion of different oncogenic properties. For instance, some known factors, such as SRSF1, have been described as overexpressed in cancer, leading to malignant transformation by an alternative splicing of genes involved in proliferation and apoptosis [35]. Other examples include RBM39 in Acute Myeloid Leukemia or RBM11 in glioblastoma cells, among others [36,37]. In breast cancer, mutations in SF3B1 are more frequently observed in the metastatic setting, and its potential role in the regulation of protein degradation or metabolism is known [17,18]. In addition, overexpression of SF3B1 and SF3B3 has been associated with resistance to hormone therapy, and others, such as ZRANB2 and SYF2, to chemotherapy, particularly for doxorubicin [20,21]. Taking this background into consideration, the identification of deregulated genes involved in splicing and the understanding of their role in cancer is a main objective, with the final goal of designing or implementing therapeutic strategies to reduce their presence.
In our study, we analyzed a set of genes involved in splicing in breast cancer. Genomic modifications of splicing proteins were highly presented in breast cancer, the HER2 subtype being the most common tumor (70.5%), with the less frequency presence observed in the Luminal A subtype (47.1%). Although mutations in splicing genes have been widely reported [15], in our study, no specific gene was mutated in more than 6% of the tumors. On the other hand, when CNAs in our splicing-related gene lists were evaluated, 61 of them were altered in >5% of patients. In a similar way to mutations, the HER2 subtype showed a higher number of altered genes compared with the other groups. This really demonstrates that mutations are less frequently observed than other structural alterations, and the splicing pathway is mainly altered in the HER2 subtype compared with the other breast subtypes. Nevertheless, 14 common splicing-related genes showed high-level amplification in >5% of patients in the four molecular groups, most of them located in 1q and 8q chromosome regions.
The next step in our study was to select those altered splicing-related genes with a role in patient clinical outcome. The results were not conclusive for the HER2 subtype due to the small number of patients. Regarding the Luminal subtype, we identified three genes with clear association with poor prognosis: ILF2, LSM1, and CLNS1A. Although prognosis value cannot be evaluated in HER2 and Basal subtypes, these three genes were also detected as amplified in tumors of these molecular subtypes. Moreover, when the presence of CNAs in our selected genes was analyzed in different tumor types (GDC Data Portal; 67 primary sites), breast cancer was one of most frequently amplified for LSM1, CLNS1A, and ILF2 (Supplementary Figure S5). IFL2 has been described as implicated in the RNA splicing regulation of crucial effectors involved in DNA damage response [23]. In addition, overexpression of this gene mediated resistance to DNA damaging agents [23]. Of note, IFL2 is located at the 8p chromosome, where other genes with a particular oncogenic role in breast cancer, such as FGFR1, has been described as amplified [38]. LSM1 is involved in pre-RNA splicing by acting on the removal of the 5 cap structure [24,39,40]. LSM1 has been studied in other tumor types, such as pancreatic cancer, observing a role in cancer progression, metastasis, and resistance to chemotherapies [41]. CLNS1A is involved in both the assembly of spliceosomal snRNPs and the methylation of Sm proteins [25,42]. CLNS1A cooperates with the protein PRMT5 and functions as an epigenetic activator of AR transcription in castration resistance prostate cancer [43]. CLNS1A has also been described in malignant gliomas [44], but data for breast cancer is very limited.
An interesting observation is the fact that the overexpression and amplification of these three genes was associated with detrimental prognosis in two large datasets, particularly in the luminal breast cancer subtype. Furthermore, the genomic knockdown of these transcripts reduced cell proliferation, suggesting an effect on cell growth.
Pharmacological modulation of transcription and RNA regulation is key for the optimal development of therapeutic strategies against key proteins. Spliceosome inhibitors have been developed, particularly those that bind to the HEAT repeats domain of some proteins, such as SF3B1 [45]. A comprehensive description has been recently reviewed elsewhere and beyond the scope of this article [12]. However, another approach to target this family of genes is the use of epigenetic modulators, such as BET inhibitors, to modulate transcriptional regulators or genes involved in RNA maturation. Examples have been provided with BET inhibitors, such as MK-8628 or ZEN003694 [12]. In this context, we observed that administration of the BET inhibitor JQ1 and the BET-PROTAC MZ1 reduced the expression of the three identified genes at different levels in two characteristic estrogen receptor breast cancer cell lines, MCF7 and T47D. In addition, these agents displayed a significant antiproliferative effect on these cell models. Although we agree that the antiproliferative effect of the compound could be multifactorial and the participation of these genes is a part and not a whole, we demonstrate in breast cancer that BETi can modulate the expression of splicing-related agents.

Conclusions
In conclusion, we describe novel splicing genes amplified in luminal breast tumors that are associated with detrimental prognosis and can be modulated pharmacologically. This data opens the door for further studies, confirming the effect of these genes in patients treated with BET inhibitors.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/ 10.3390/cancers13164118/s1, Figure S1: Percentage of tumors with non-silent mutations for each splicing-related gene; Figure S2: Relation between expression level and CNAs for ILF2, CLNS1A, LSM1, and ESRP1 genes; Figure S3: Associations of the transcriptomic fingerprint associated with amplification of LSM1, CLNS1A, and ILF2 genes (genotype-2-outcome) and clinical outcome (OS and RFS); Figure S4: Presence of CNAs in our selected genes (LSM1, CLNS1A, and ILF2) in different tumor types (GDC Data Portal; 67 primary sites; Gain: red and Loss: blue); Table S1: List of splicing related genes evaluated in our study; Table S2: Percentage of tumors with mutation for each gene; Table S3: Common and specific splicing-related genes depending on molecular subtypes. Table shows the percentage of tumors with a high level of amplification for each gene (only included those with high amplification in >5%); Table S4: Prognostic value (RFS and OS) of the amplified genes (with a cutoff of >5% of tumors) in the HER2+ and Basal-like subtypes; Table S5: List of the transcripts included in the signatures associated with the CNA gain of LSM1, CLNS1A, and ILF2. Table S6: Univariate and Multivariate COX regression analysis to assess the potential prognostic value of CLNS1A, ILF2, and LSM1 expression in Luminal breast cancer patients.  Institutional Review Board Statement: Not applicable. All the data corresponding to the breast cancer patient series used in this study are available in the public functional genomics data repository.
Informed Consent Statement: Not applicable. All the data corresponding to the breast cancer patient series used in this study are available in the public functional genomics data repository.
Data Availability Statement: All the data used in this study are available in public functional genomics data repositories (GEO, EGA, cBioportal, and TCGA).