Proteomic Profiling Identifies Co-Regulated Expression of Splicing Factors as a Characteristic Feature of Intravenous Leiomyomatosis

Simple Summary: Intravenous leiomyomatosis is a rare form of smooth muscle tumour that has unique and distinct clinical features including growth within the uterine and pelvic veins. Here we use proteomics by mass spectrometry to show that this disease is distinct from uterine leiomyomas and other benign smooth muscle tumours due to the enrichment of components of the spliceosome machinery. In particular, we find that intravenous leiomyomatosis tumours harbour co-regulated expression of multiple splicing factors that are associated with biological processes including cell signalling. Our study demonstrates that intravenous leiomyomatosis is a distinct disease from other smooth muscle tumours and indicates a possible functional role for alternative splicing in disease initiation and progression.

Abstract: Intravenous leiomyomatosis (IVLM) is a rare benign smooth muscle tumour that is characterised by intravenous growth in the uterine and pelvic veins. Previous DNA copy number and transcriptomic studies have shown that IVLM harbors unique genomic and transcriptomic alterations when compared to uterine leiomyoma (uLM), which may account for their distinct clinical behaviour. Here we undertake the first comparative proteomic analysis of IVLM and other smooth muscle tumours (comprising uLM, soft tissue leiomyoma and benign metastasizing leiomyoma) utilising data-independent acquisition mass spectrometry. We show that, at the protein level, IVLM is defined by the unique co-regulated expression of splicing factors. In particular, IVLM is enriched in two clusters composed of co-regulated proteins from the hnRNP, LSm, SR and Sm classes of the spliceosome complex. One of these clusters (Cluster 3) is associated with key biological processes including nascent protein translocation and cell signalling by small GTPases. Taken together, our study provides evidence of co-regulated expression of splicing factors in IVLM compared to other smooth muscle tumours, which suggests a possible role for alternative splicing in the pathogenesis of IVLM.


Introduction
Intravenous leiomyomatosis (IVLM) is a rare, histologically benign smooth muscle tumour which is characterised by intravenous growth in the uterine and pelvic veins [1,2]. In some instances, it can extend into the inferior vena cava and the right heart, which in rare cases may cause death [3,4]. IVLM usually presents with a concomitant uterine leiomyoma (uLM), and one theory is that it originates from a pre-existing uLM that extends and invades into the vessel wall [4,5]. Given that there are some instances where IVLM arises in the absence of a uLM [2,6], an alternate theory is that this tumour originates from the smooth muscle cells of the vessel wall. In addition to IVLM, there are other rare smooth muscle tumours with unusual quasi-malignant clinical behaviour such as benign metastasizing leiomyoma (BML) and disseminated peritoneal leiomyomatosis [7,8].
Previous studies have undertaken comparative analyses of the molecular features of IVLM versus uLM to gain a better understanding of its underlying biology as well as the relationship between the two entities [9–15]. The system-wide profiling studies that have been reported include array comparative genomic hybridisation (aCGH) and transcriptomic analysis [9,11,13,14]. Collectively, these focused and system-wide studies indicate that IVLM shares some cytogenetic and protein expression features with uLM (e.g., t(12;14) translocations and HMGA2 protein expression) [11,12,14,15], while at the same time harbouring genetic and transcriptomic alterations that are unique. These unique alterations include distinct MED12 mutations and elevated HOXA13 gene expression in IVLM [10,12,13]. Given the rarity of IVLM, all of the published Omics-based IVLM molecular profiling studies, with the exception of a recent study by Ordulu et al. [11], have been limited to a small number of cases (typically < 5).
To date, no proteomic profiling analyses have been undertaken in IVLM. Proteins are the critical drivers of cellular communication in normal cells and dysregulation of protein function is causative of many diseases including cancer [16,17]. We hypothesized that, unlike genomic and transcriptomic analysis, proteomic profiling will provide a more direct readout of the biological pathways and protein complexes that may play a role in the pathogenesis of IVLM [18,19]. Here we undertake a comparative mass spectrometry-based proteomic analysis of IVLM and other smooth muscle tumours (uLM, soft tissue leiomyoma (stLM) and BML), and demonstrate that at the protein level, IVLM is characterised by the unique co-regulated expression of splicing factors that comprise the spliceosome.

Patients and Tumour Specimens
Formalin-fixed paraffin-embedded (FFPE) tumour samples and linked anonymized patient data were used under approval by the Institutional Review Board as part of the PROSPECTUS study, a Royal Marsden-sponsored non-interventional translational protocol (CCR 4371, REC 16/EE/0213). One of the IVLM cases in this series has previously been described in a case report [20]. The departmental database and medical notes were retrospectively reviewed to identify FFPE tissue from surgically resected primary tumours and accompanying clinico-pathological characteristics. Baseline clinico-pathological annotation is summarized in Table 1. The histological diagnosis was confirmed in all cases by experienced soft tissue pathologists (KT, CF). Viable tumour content in each tumour was reviewed by the analysis of haematoxylin and eosin (H&E)-stained sections and a single FFPE tissue block containing representative viable tumour was selected for subsequent processing. Five 20 µm sections were cut from each selected FFPE block and, where indicated, macrodissected to enrich to >75% viable tumour content.

Protein Extraction and Sample Preparation
The samples were processed as previously described [18]. Briefly, tissue sections from each sample were deparaffinized in multiple washes in xylene. Deparaffinized sections were then rehydrated by washes with a decreasing ethanol gradient and dried. A lysis buffer (0.1 M Tris-HCl pH 8.8, 0.50% (w/v) sodium deoxycholate, 0.35% (w/v) sodium lauryl sulphate) was added to each sample at a ratio of 200 µL/mg of dry tissue, and samples were homogenized on ice using a LabGen700 blender (ColeParmer) with 3 × 30 s pulses. The resultant homogenates were sonicated on ice for 10 min to shear DNA, followed by incubation at 95 °C for 1 h to reverse formalin crosslinks and shaking at 750 rpm at 80 °C for 2 h. The homogenate was cleared by 15 min centrifugation at 4 °C and 15,000 rpm. The supernatant was collected and protein concentration was measured by bicinchoninic acid (BCA) assay (Pierce). A filter-aided sample preparation (FASP) protocol was used to digest the extracted proteins as previously described [21]. Briefly, for each sample, an equivalent of 600 µg of protein was placed into an Amicon-Ultra 4 (Merck) centrifugal filter unit and detergents were removed by multiple washes with 8 M urea. The purified, concentrated sample was then transferred to Amicon-Ultra 0.5 (Merck) filters, reduced with 10 mM dithiothreitol (DTT) at 56 °C for 40 min and alkylated with 55 mM iodoacetamide (IAA) at 25 °C for 30 min in the dark. The sample was washed with 100 mM ammonium bicarbonate (ABC) to remove the 8 M urea and digested with trypsin (Promega, trypsin to starting protein ratio 1:100 µg) overnight at 37 °C. Peptides were desalted on C18 SepPak columns (Waters), dried in a SpeedVac concentrator and stored at −80 °C.
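The two ratios quoted above (200 µL of lysis buffer per mg of dry tissue, and trypsin at 1:100 relative to starting protein) translate into simple per-sample amounts. The sketch below is only an illustration of that arithmetic: the function names and the 2.5 mg tissue mass are hypothetical, and the 1:100 ratio is assumed to be a mass-to-mass ratio.

```python
def lysis_buffer_volume_ul(dry_tissue_mg, ul_per_mg=200.0):
    """Lysis buffer volume (µL) for a given dry tissue mass (mg)."""
    return dry_tissue_mg * ul_per_mg


def trypsin_mass_ug(protein_ug=600.0, protein_per_trypsin=100.0):
    """Trypsin mass (µg), assuming the 1:100 ratio is trypsin:protein by mass."""
    return protein_ug / protein_per_trypsin


# Hypothetical sample with 2.5 mg of dry tissue.
print(lysis_buffer_volume_ul(2.5))  # buffer volume in µL
print(trypsin_mass_ug())            # trypsin for the 600 µg protein input, in µg
```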

SWATH-MS Data Acquisition and Processing
Quantitative proteomic profiling was performed by sequential window acquisition of all theoretical fragments mass spectrometry (SWATH-MS), which is also known as data-independent acquisition mass spectrometry. All data were acquired on an Agilent 1260 HPLC system (Agilent Technologies, Santa Clara, CA, USA) coupled to a TripleTOF 5600+ mass spectrometer with NanoSource III (AB SCIEX Ltd., Macclesfield, UK). Dried, desalted peptides were resuspended in buffer A (2% ACN/0.1% formic acid) and spiked with iRT calibration mix (Biognosys AG, Schlieren, Switzerland). For each sample, 1 µg of peptides was loaded onto a self-made trap column packed with 10 µm ReprosilPur C18AQ beads (Dr. Maisch) and washed for 5 min with buffer A. Peptides were then loaded onto a 75 µm × 15 cm analytical column with an integrated manually pulled tip packed with Reprosil Pur C18AQ beads (3 µm, 120 Å particles, Dr. Maisch) and separated over 120 min by a linear gradient of 2-40% buffer B (98% ACN, 0.1% formic acid) at a flow rate of 250 nL/min. Each sample was acquired in two technical replicates. Acquisition parameters were as follows: 60 precursor isolation windows with a fixed size of 13 Da across the mass range of m/z 380-1100 with 1 Da overlap. MS/MS scans were acquired in the mass range of m/z 100-1500. A cycle time of 3.1 s was used, resulting in an average of eight datapoints per elution peak. SWATH-MS spectra were analysed using Spectronaut 15.2 (Biognosys AG, Schlieren, Switzerland) against a published human library [22]. The false discovery rate (FDR) was restricted to 1% at both the protein and peptide-spectrum match (PSM) levels. A peak area of 3 to 6 fragment ions was used for peptide quantification. Protein quantity was calculated as an average value of a maximum of six quantified peptides. Proteins quantified with <2 unique peptides were excluded from the subsequent analysis.
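As a consistency check on the acquisition scheme, 13 Da windows with 1 Da overlap advance by 12 Da, so 60 windows span 60 × 12 = 720 Da, which matches the m/z 380-1100 range. The sketch below enumerates such a scheme; the exact placement of the window edges (here, the first window starting at m/z 380) is an assumption, as is the helper name `swath_windows`. It also derives the average elution peak width implied by eight datapoints at a 3.1 s cycle time.

```python
def swath_windows(mz_start=380.0, mz_end=1100.0, width=13.0, overlap=1.0):
    """Return (lower, upper) m/z bounds of fixed-width overlapping windows.

    Assumes the first window starts exactly at mz_start; the instrument
    method may place the edges slightly differently.
    """
    step = width - overlap                      # effective advance per window
    n_windows = round((mz_end - mz_start) / step)
    return [(mz_start + i * step, mz_start + i * step + width)
            for i in range(n_windows)]


windows = swath_windows()
print(len(windows), windows[0], windows[-1])

# Eight datapoints per peak at a 3.1 s cycle time imply this average peak width.
implied_peak_width_s = 8 * 3.1
print(round(implied_peak_width_s, 1), "s")
```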

Data Processing and Statistical Methods
The proteomics dataset was further processed using R, Perseus 1.5.6 [23,24] and GraphPad 8.2.1. Protein quantities were log2 transformed and quantile normalised at the sample level using the proBatch package [25] in R, followed by protein median centering across the samples. The normalized dataset was then visualized by hierarchical clustering using the ComplexHeatmap package in R [26]. Gene Set Enrichment Analysis (GSEA) was applied using the GenePattern online tool [27] to identify gene sets obtained from the MSigDB (c5.gobp.v7.5) [28] that were significantly enriched in IVLM samples. Similarly, single sample GSEA (ssGSEA) was applied using GenePattern to score sample-specific enrichment of the Spliceosome gene set from the KEGG pathways database [29]. To identify spliceosome components, the list of all identified proteins in this study was cross-referenced with the annotated spliceosome protein interaction dataset published by Hegele et al. [30]. Mutual co-expression of the splicing factors was assessed by Pearson's correlation coefficient, which was calculated in Perseus for all possible combinations of the identified splicing factors. The resulting similarity matrix was analysed and visualised by the ConsensusClusterPlus [31] and ComplexHeatmap packages in R, respectively.
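The normalisation steps described above (log2 transformation, quantile normalisation per sample, then median centering per protein) can be sketched in a few lines. This is a NumPy illustration on simulated data, not the proBatch implementation used in the study; it assumes a complete proteins × samples matrix with no missing values and does not average tied ranks.

```python
import numpy as np


def quantile_normalise(x):
    """Quantile-normalise the columns (samples) of a proteins x samples matrix."""
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)   # rank of each value within its sample
    mean_sorted = np.sort(x, axis=0).mean(axis=1)       # reference distribution
    return mean_sorted[ranks]


def preprocess(raw):
    """log2 transform, quantile-normalise samples, median-centre each protein."""
    logged = np.log2(raw)
    normalised = quantile_normalise(logged)
    return normalised - np.median(normalised, axis=1, keepdims=True)


# Simulated intensities: 50 hypothetical proteins in 6 hypothetical samples.
rng = np.random.default_rng(0)
raw = rng.lognormal(mean=10.0, sigma=1.0, size=(50, 6))
processed = preprocess(raw)
print(processed.shape)
```

After this step every sample shares the same intensity distribution and every protein has a median of zero across samples, which is what makes the heatmaps comparable across cases.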
To study the association of the splicing factors identified in Clusters 1-3 with known biological pathways, Pearson's correlation coefficients between splicing factors and all other proteins in the proteomic dataset (after removal of all proteins annotated in the Spliceosome Database [32]) were calculated in Perseus. The resulting similarity matrices were hierarchically clustered and visualized by the ComplexHeatmap package in R, where rows of each matrix were split into four clusters using k-means partitioning, Euclidean distance and 1000 repetitions. Subsets of proteins from the clusters with the highest and lowest average correlation were then used for over-representation analysis using the DAVID 6.8 Functional analysis online tool [33]. The chord plot was plotted by SRplot (available online: https://www.bioinformatics.com.cn/en, accessed on 4 February 2022). Protein-protein interactions between splicing factors in Cluster 3 and 565 other proteins with the highest positive correlations (excluding other splicing factors) were analysed and visualized in Cytoscape (3.7.1) [34]. To analyse the protein-protein interaction network, the STRING database was searched using a high confidence threshold (STRING score >0.7) by StringApp [35] within Cytoscape. The resulting network was then clustered by MCL clustering using the clusterMaker2 application [36] in Cytoscape with a granularity of 2.5 and the STRING score for edge weighting.
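The correlation step above amounts to computing Pearson's r between every non-spliceosomal protein and every splicing factor in a cluster. The sketch below shows one way to do this with NumPy on simulated data. It is a simplification: instead of the k-means row partitioning used in the study, it simply ranks proteins by their mean correlation with the cluster, and all array names and sizes are hypothetical.

```python
import numpy as np


def cross_correlation(a, b):
    """Pearson r between every row of a and every row of b (same sample columns)."""
    az = (a - a.mean(axis=1, keepdims=True)) / a.std(axis=1, keepdims=True)
    bz = (b - b.mean(axis=1, keepdims=True)) / b.std(axis=1, keepdims=True)
    return az @ bz.T / a.shape[1]


# Simulated data: 200 other proteins and 40 splicing factors across 14 samples.
rng = np.random.default_rng(0)
other_proteins = rng.normal(size=(200, 14))
splicing_factors = rng.normal(size=(40, 14))

r = cross_correlation(other_proteins, splicing_factors)   # shape (200, 40)
mean_r = r.mean(axis=1)                                    # average correlation per protein
top_positive = np.argsort(mean_r)[::-1][:10]               # simplification of the k-means split
print(r.shape, top_positive)
```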

Immunohistochemical Staining and Scoring
SWATH-MS results for SRSF3 were validated by immunohistochemical staining of the same cohort of specimens. From each FFPE block, 6 µm sections were mounted on a microscopic glass slide, deparaffinized in xylene and rehydrated by a decreasing gradient of ethanol in water (once each in 100%, 96% and 80%). The glass slides were then immersed in a Tris-EDTA buffer (pH 6) and heated in a microwave oven for 8 min to retrieve the antigen. Sections cooled to room temperature were then washed once in Tris-buffered saline (TBS) and twice in TBS-Tween (TBST) buffer. Sections were subsequently covered by blocking buffer and incubated at room temperature for 90 min in a humidity chamber. After blocking, sections were incubated with primary antibody for SRSF3 (Abcam, ab198291) in a humidity chamber at 4 °C overnight. The following day, sections were washed once with TBS and twice with TBST and incubated with DAKO Peroxidase blocking solution (DAKO, Agilent Technologies, Santa Clara, CA, USA) for 1 h at room temperature. Sections were washed once each with TBS and TBST and subsequently stained with diaminobenzidine (DAKO, Agilent Technologies, Santa Clara, CA, USA) for 3 min to visualize SRSF3. After rinsing in water, Modified Mayer's haematoxylin (Abcam) was added as a counterstain. Finally, stained sections were dehydrated by washes in increasing gradients of ethanol (once each in 80%, 96% and 100%) and xylene, mounted in Pertex mounting medium (Pioneer) and scanned on a semi-automatic slide scanner (Hamamatsu Nanozoomer XR). Scans were semi-quantitatively scored by two independent examiners using the same scoring system as previously described in Milighetti et al. [18].

Results

Quantitative Proteomic Profiling of Smooth Muscle Tumours
The cohort comprises FFPE tumour material from 14 patients treated at The Royal Marsden Hospital. These specimens were obtained from surgical resections of IVLM (n = 3), uLM (n = 3), stLM (n = 7) and BML (n = 1). The detailed clinico-pathological characteristics of the patient cohort are presented in Table 1. Tumour specimens were subjected to sample preparation and protein extraction as depicted in Figure 1. Digested peptides then underwent proteomic profiling with SWATH-MS in technical duplicates. This analysis resulted in the identification and quantification of 2473 proteins (Table S1). Unsupervised clustering of the full dataset shows that the IVLM cases largely cluster together, separate from the stLM and uLM cases (Figure 2A). Interestingly, the only BML case in the cohort clusters most closely to the IVLM cases.
Assessment of proteins that are significantly different in IVLM cases compared to uLM, stLM and BML cases identified 162 proteins, of which 109 were upregulated and 53 were downregulated in IVLM (more than 2-fold change in either direction) (Figure 2B). Consistent with published immunohistochemical analysis and RNA studies [9,12,14,37–40], expression of the chromatin factor HMGA2, a protein which is highly expressed in a subset of both IVLM and uterine leiomyomas due to the breakpoint on 12q14-15 [11,12,14], was not significantly different between IVLM and the other smooth muscle tumours in the cohort (Figure S1). Interestingly, we find that 29/162 (18%) of the differentially expressed proteins are components of the spliceosome complex (Figure 2B). Four of the 29 splicing factors were downregulated, while 25 splicing factors were upregulated in IVLM when compared to the uLM, stLM and BML cases.
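On log2-transformed data, a 2-fold change corresponds to a difference of 1 between group means, which is how thresholds of this kind are typically applied. The sketch below illustrates only the fold-change part of the comparison on made-up values; the statistical test used to call significance is not reproduced here, and the function name and example numbers are hypothetical.

```python
import numpy as np


def classify_fold_change(log2_ivlm, log2_rest, fold=2.0):
    """Label each protein (row) as 'up', 'down' or 'unchanged' in IVLM.

    Inputs are log2 intensities (proteins x samples) for the two groups.
    Only the fold-change criterion is applied; no significance test.
    """
    diff = log2_ivlm.mean(axis=1) - log2_rest.mean(axis=1)
    cut = np.log2(fold)
    return np.where(diff > cut, "up", np.where(diff < -cut, "down", "unchanged"))


# Made-up log2 intensities for three proteins (3 IVLM samples, 4 other samples).
ivlm = np.array([[12.0, 12.1, 11.9], [9.0, 9.2, 8.8], [10.2, 10.1, 10.3]])
rest = np.array([[10.0, 10.1, 9.9, 10.0], [10.5, 10.4, 10.6, 10.5], [10.0, 10.1, 9.9, 10.0]])
labels = classify_fold_change(ivlm, rest)
print(labels)

# Share of differentially expressed proteins that are spliceosome components.
spliceosome_share = round(100 * 29 / 162)
print(spliceosome_share, "%")
```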

Figure 2. (B) Volcano plot depicting the difference in protein expression between IVLM cases and all the other smooth muscle tumours (rest). Splicing factors with significantly different expression levels (>two-fold or <two-fold) are highlighted in red.

Enrichment of Splicing Processes in IVLM
To further investigate the biological processes that are enriched in IVLM compared to the other smooth muscle tumours, we undertook gene set enrichment analysis (GSEA) of the full proteomic dataset (Figure 3A). We show that the majority of the top 20 ranked enriched gene sets are processes associated with RNA splicing, processing, transport or metabolism. Beyond RNA-related biological processes, other enriched gene sets include protein targeting and localisation to membrane, regulation of gene transcription and translation. In line with the observation that a significant proportion of proteins enriched in IVLM are components of the spliceosome complex (Figure 2B), single sample GSEA (ssGSEA) of the proteomic data for each specimen in the cohort using the KEGG spliceosome gene set showed that the IVLM cases had significantly higher ssGSEA spliceosome scores compared to the other smooth muscle tumours in the cohort (Figure 3B). Taken together, our data indicate that both the spliceosome complex and biological processes involving RNA biology are enriched in IVLM specimens.

Identification of Co-Regulated Expression of Splicing Factors in the Proteomic Profiling Dataset
It is well established that the spliceosome is a highly dynamic macromolecular complex where more than 200 splicing factors are assembled into distinct complexes that vary in their composition in space and time [30,41]. We therefore hypothesized that, despite the overall enrichment of spliceosome components in IVLM (Figure 3B), subsets of co-regulated splicing factors may be responsible for the distinct clinical behaviour of IVLM versus leiomyomas. Indeed, unsupervised hierarchical clustering of 116 spliceosome components in the proteomic dataset showed that the spliceosome complex as a whole was not upregulated in IVLM (Figure 4A). Rather, there appeared to be subsets of splicing factors that were differentially expressed in IVLM, uLM and stLM.
Inspired by a previous study which showed that co-regulation of splicing factors is important in regulating breast cancer progression [42], we performed a Pearson's correlation coefficient analysis of the protein expression levels of all possible combinations of 116 splicing factors in our dataset. Consensus clustering identified three clusters of splicing factors, which are shown in the similarity matrix in Figure 4B (composition of each cluster provided in Table S2). In particular, Clusters 2 (n = 43) and 3 (n = 40) contain splicing factors which are negatively correlated between clusters but are positively correlated within clusters. Cluster 1 (n = 33) is mixed, with both positively and negatively correlated splicing factors.
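The pattern described for Clusters 2 and 3 (positive correlation within a cluster, negative correlation between clusters) can be illustrated with a toy similarity matrix. The sketch below uses simulated expression profiles for six hypothetical proteins across 14 samples and `numpy.corrcoef`; it does not reproduce the consensus clustering step, and the group assignments are set by construction rather than discovered.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples = 14

# Two hypothetical groups of three proteins driven in opposite directions.
driver = rng.normal(size=n_samples)
group_a = driver + 0.1 * rng.normal(size=(3, n_samples))
group_b = -driver + 0.1 * rng.normal(size=(3, n_samples))
expression = np.vstack([group_a, group_b])

# Pairwise Pearson correlation between all proteins (rows).
similarity = np.corrcoef(expression)

within_a = similarity[:3, :3][np.triu_indices(3, k=1)].mean()
between = similarity[:3, 3:].mean()
print(round(within_a, 2), round(between, 2))
```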

Figure 4. (C) Venn diagrams depict spliceosome composition (core versus non-core, and distinct splicing factor classes) in each cluster while plots below show average expression levels of spliceosome components in each sample for a given cluster. The detailed composition of clusters and the identity of individual proteins are listed in Table S2. The line and whiskers in plots represent mean and standard deviation. Statistical significance was calculated by two-sample t-test. ** p < 0.01, **** p < 0.0001.

Distinct Co-Regulated Clusters Are Comprised of Splicing Factors Which Are Differentially Expressed in IVLM versus the Other Smooth Muscle Tumours
An evaluation of the composition of splicing factors showed that each cluster is comprised of different proportions of core and non-core spliceosome proteins, with Cluster 2 having the highest proportion of core proteins (65%) and Cluster 3 the lowest (25%) (Figure 4C). Furthermore, assessment of the splicing factor classes based on the nomenclature defined by Hegele et al. [30] finds that the splicing factor class composition of Clusters 1 and 3 is similar, with the majority of proteins coming from the hnRNP, LSm, SR and Sm protein classes (Figure 4C). In contrast, the composition of Cluster 2 is very different, with U2, U2 rel and U5 protein classes dominating.
Quantitative assessment of the proteomic data showed that when broken down by cluster assignment, the IVLM specimens were significantly enriched in co-regulated splicing factors from Clusters 1 and 3 versus the other smooth muscle tumours in the cohort (Figure 4C). No significant difference between IVLM and the other smooth muscle tumours was seen in co-regulated splicing factors in Cluster 2. Upregulation in IVLM of the Cluster 3 splicing factor SRSF3 was independently confirmed by immunohistochemical staining (IHC) (Figure S2). All three IVLM cases showed strong staining for SRSF3, while 77% of the other soft tissue tumours in the cohort showed weak or no staining. The three non-IVLM cases that showed strong SRSF3 staining included the BML case, which consistently showed the highest levels of spliceosome factors in Clusters 1 and 3 (Figure 4C). Collectively, this analysis indicates that at the protein level, IVLM is characterised by the co-regulated expression of specific classes of splicing factors that comprise the spliceosome.

Co-Regulated Splicing Factors Are Associated with Multiple Biological Pathways, including Protein Translocation and Signal Transduction by Small GTPases
We sought to determine if the expression of splicing factors in each of these clusters was linked to specific biological processes. To do this, Pearson's correlation coefficient was calculated between all the proteins in the dataset (excluding spliceosomal proteins) and the splicing factors in each of the three clusters. Unsupervised hierarchical clustering finds that 537 and 585 proteins were positively or negatively correlated with the splicing factors in Cluster 2, respectively (Figure 5A, clusters C and A). The same analysis in Cluster 3 identified positive and negative correlation in 545 and 738 proteins, respectively (Figure 5B, clusters C and B). Unsurprisingly, since Cluster 1 is comprised of both positively and negatively correlated splicing factors, no significantly correlated proteins were found in our dataset. Given that Clusters 2 and 3 have opposing profiles in co-regulated splicing factors (Figure 4B), it is expected that proteins correlating with these clusters would follow the same trend. Indeed, we demonstrate that there was a substantial overlap of proteins which show opposite co-expression patterns (i.e., positively correlated proteins in Cluster 2 and negatively correlated proteins in Cluster 3), and vice versa (Figure 5C).

Focusing on Cluster 3, which is significantly upregulated in IVLM (Figure 4C), the over-representation analysis finds four ontologies that are enriched in the proteins that are positively correlated with the splicing factors in this cluster (Figure 5D). These ontologies include nascent protein targeting to the endoplasmic reticulum (SRP-dependent cotranslational protein targeting to membrane), signal transduction mediated by small GTPases, the hydrolysis of proteins by peptidases (negative regulation of endopeptidase activity) and proteins involved in viral transcription. The positively correlated proteins in these ontologies are shown in the chord diagram in Figure 5E. The analysis of protein-protein interactions between splicing factors in Cluster 3 and positively correlated proteins (Figure 5F) revealed a number of closely interacting RNA-binding proteins such as RNA-binding protein homolog Musashi-2 (MSI2), heterogeneous nuclear ribonucleoprotein D-like (HNRNPDL) and spermatid perinuclear RNA-binding protein (STRBP), as well as proteins involved in the regulation and transport of RNA such as transcription and mRNA export factor ENY2 (ENY2).

Discussion
IVLM is a rare benign smooth muscle tumour with quasi-malignant clinical behaviour. Previous profiling studies characterising its molecular features have focused on DNA copy number and transcriptomic alterations [9,11,13,14]. Here we performed the first proteome level analysis of IVLM and compared it to other smooth muscle tumours including uLM, stLM and BML. We show that IVLM is characterised by a differential expression of spliceosome complex components. In particular, by utilising a bioinformatics approach to delineate co-regulation of splicing factors, we find that there are two specific clusters of co-regulated splicing factors in the hnRNP, LSm, SR and Sm protein classes that are enriched in IVLM compared to the other smooth muscle tumours in this cohort. Finally, we demonstrate that one of these clusters (Cluster 3) is associated with the high expression of proteins involved in key biological processes such as nascent protein translocation and signalling by small GTPases. To our knowledge, this is the first demonstration that IVLM is characterised by a distinct group of co-regulated splicing factors, which may contribute to its unique clinical behaviour. It highlights the utility of proteomics to provide novel insights into IVLM tumour biology beyond the current state-of-the-art gained from published aCGH and gene expression studies.
Three previous comparative gene expression studies between IVLM and uLM have been reported [9,13,14]. Interestingly, there were no common genes reported across these studies, which could be the result of inter-patient heterogeneity and the small number of cases analysed in each of the reports. None of these studies report the differential expression of splicing factors between IVLM and uLM. One reason for this lack of overlap between our proteomic dataset and previous transcriptomic datasets could be the consistently poor correlation between protein and RNA expression that has been observed across patient samples in multiple proteogenomic studies [43–45]. Protein abundance is governed by multiple factors, and in addition to transcription rates and mRNA half-life, it is also highly dependent on translation rates and protein half-life [46]. In line with this, five genes reported to be upregulated in IVLM versus uLM by Ordulu et al. [14] and found in our proteomic dataset (EFEMP1, CFH, GPX3, HBA1, HBB) were not significantly upregulated at the protein level.
Splicing occurs through a complex series of well-regulated steps mediated by the spliceosome machinery [47]. It has been shown that aberrations in specific splicing factors disrupt the composition of the spliceosome complex and drive carcinogenesis [48,49]. For instance, mutations in the splicing factor SF3B1 in both solid and liquid cancers initiate oncogenic alternative splicing reprogramming that is key to cancer development and progression [50–54]. Furthermore, it has been recently shown that some of these splicing factor mutations may induce new vulnerabilities that can be therapeutically exploited in a synthetic lethal fashion [55–57]. In the same vein, it is possible that the distinct co-regulation of splicing factors observed in IVLM may result in dysregulated alternative splicing that could account for its intravenous growth patterns. Unfortunately, due to the highly fragmented nature of total RNA extracted from FFPE specimens, we were unsuccessful in our efforts to measure alternative splicing profiles by RT-PCR from the cases in this series despite multiple repeated attempts. Future RNASeq or RT-PCR analyses on prospectively collected flash frozen specimens would be key to establishing if differential alternative splicing occurs in IVLM versus uLM. Identifying such alternatively spliced genes could offer a mechanistic explanation for the quasi-malignant behaviour of IVLM.
This study is limited by the small number of IVLM cases that were studied. IVLM is a rare condition and the vast majority of profiling studies to date comprise a small number of cases (typically < 5). Despite the limited numbers, we were able to demonstrate that there was a statistically significant enrichment of co-regulated spliceosome components in IVLM. Interestingly, we show that the sole BML case in our cohort clustered most closely to the IVLM cases (Figure 2A). BML is another rare, unusual variant of leiomyoma that often manifests as multiple nodules in the lungs and other sites [58]. A recent aCGH analysis finds that IVLM and BML share recurrent copy number alterations that are rarely seen in uLM [11]. Consistent with this finding, our data show that at the proteomic level, BML is more similar to IVLM than to uLM. It is, however, important to note that our proteomic analysis was performed on a small case series treated within a single institution and that any findings will need to be independently validated.

Conclusions
In summary, we have undertaken a comparative proteomic profiling study of IVLM and other smooth muscle tumours (uLM, stLM and BML) and describe the selective enrichment of co-regulated splicing factors which are associated with distinct biological pathways. We anticipate that future work integrating proteomics with complementary Omics-based profiling approaches such as RNAseq will shed further light on the possible role of alternative splicing in the pathogenesis of IVLM.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/cancers14122907/s1, Figure S1: Expression levels of HMGA2 protein of each case in the cohort. Figure S2: Comparative analysis of expression levels of SRSF3 protein by SWATH-MS with immunohistochemical staining (IHC) of the FFPE tissue sections from the same samples in the cohort. The line and whiskers in plots for SWATH-MS represent median and interquartile range. Stacked bar charts for IHC represent immunoreactivity of individual samples. Photomicrographs of representative samples with strong, weak and no staining for SRSF3 protein are shown. Scale bar represents 50 µm. Table S1: Full proteomic dataset for the cohort. Table S2

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE [59] partner repository with the dataset identifier PXD031637.