Small Non-Coding RNAs and Their Role in Locoregional Metastasis and Outcomes in Early-Stage Breast Cancer Patients

Deregulation of small non-coding RNAs (sncRNAs) has been associated with the onset of metastasis. We evaluated the expression of sncRNAs in patients with early-stage breast cancer, performing RNA sequencing in 60 patients for whom tumor and sentinel lymph node (SLN) samples were available, and conducting differential expression, gene ontology, enrichment and survival analyses. Sequencing annotation classified most of the sncRNAs as small nucleolar RNAs (snoRNAs, 70%) and small nuclear RNAs (snRNAs, 13%). Our results showed no significant differences in sncRNA expression between tumors and SLNs obtained from the same patient. Differential expression analysis showed down-regulation of 21 sncRNAs and up-regulation of 2 sncRNAs in patients with locoregional metastasis. The expression of SNHG5, SNORD90, SCARNA2 and SNORD78 differentiated luminal A from luminal B tumors, whereas SNORD124 up-regulation was associated with luminal B HER2+ tumors. Discriminant analysis and receiver operating characteristic (ROC) curve analysis revealed a signature of six snoRNAs (SNORD93, SNORA16A, SNORD113-6, SNORA7A, SNORA57 and SNORA18A) that distinguished patients with locoregional metastasis and predicted patient outcome. Gene ontology and Reactome pathway analysis showed an enrichment of biological processes associated with translation initiation, protein targeting to specific cell locations, and positive regulation of Wnt and NOTCH signaling pathways, commonly involved in the promotion of metastases. Our results point to the potential of several sncRNAs as surrogate markers of lymph node metastases and patient outcome in early-stage breast cancer patients. Further preclinical and clinical studies are required to understand the biological significance of the most significant sncRNAs and to validate our results in a larger cohort of patients.


Introduction
Cancer metastases are responsible for most breast cancer deaths. Despite intensive research in this field, our comprehension of the molecular events that drive metastatic progression remains largely incomplete. The identification of predictive and prognostic biomarkers is needed for early diagnosis of cancer and response monitoring of available therapies [1]. Ideally, these biomarkers should be highly specific and sensitive, and several strategies are currently being developed to detect low-expressed cancer-related biomarkers in liquid biopsies and other tumor samples [2]. However, more information on their expression in tumors, and thus their suitability as biomarkers, is still required.
SnoRNAs, which have long been known, represent a class of abundantly expressed sncRNAs that are primarily present in the nucleolus and play pivotal roles in post-transcriptional rRNA processing and modification, thereby contributing significantly to the maintenance of cellular functions related to protein synthesis. SnoRNAs are approximately 30-300 nt long and, based on conserved sequence elements, are classified into C/D box snoRNAs (SNORDs) or H/ACA box snoRNAs (SNORAs), which determine the binding and modification of the RNA target. A third class of snoRNAs contains both sequence motifs and localizes to the nuclear Cajal bodies (SCARNAs) [3]. However, approximately half of human snoRNAs have no predictable rRNA targets, and numerous snoRNAs have been found to influence cell fate and alter disease progression; they therefore hold considerable potential in terms of controlling human diseases, including Prader-Willi syndrome, Duplication15q syndrome and cancer [4]. It has been suggested that snoRNAs are differentially expressed across various cancer types and stages, and in relation to metastasis, treatment response and/or prognosis in patients [5]. This new role of snoRNAs has been addressed by recent studies showing that snoRNAs can regulate pre-mRNA alternative splicing and mRNA abundance, activate enzymes, and be processed into shorter sncRNAs resembling miRNAs and piRNAs [6][7][8]. Furthermore, recent biochemical studies have shown that a given snoRNA can form both methylating and non-methylating ribonucleoprotein complexes, providing clues to the likely physical basis for such diverse new functions [9]. SnoRNAs are evidently more structurally and functionally diverse than previously thought, and their role in gene expression is under-appreciated.
SnRNAs are a class of highly abundant sncRNA molecules with an average size of 150 nt that are present in the cell nucleus and are involved in intron removal from pre-mRNA. SnRNAs form a large particulate complex (the spliceosome) along with ribonucleoprotein particles (snRNPs) and additional proteins, which binds to the primary RNA transcripts to mediate splicing. Additional evidence indicates that snRNPs function in nuclear maturation of primary transcripts into mRNAs, in gene expression regulation, as splice donors in non-canonical systems and in 3′-end processing of replication-dependent histone mRNAs [10]. Accumulating evidence demonstrates that snRNA dysregulation is closely related to the progression of cancer through different mechanisms, such as transcriptional inhibition and post-transcriptional regulation [11].
In this study, we performed RNA sequencing to profile sncRNA expression in 60 patients with early-stage breast cancer for whom tumor tissue and SLN samples were available. We identified most of the sncRNAs as snoRNAs or snRNAs, and we found that, overall, down-regulation of snoRNAs was associated with patient locoregional metastatic status. Furthermore, our classifier model yielded a 6-snoRNA signature that clearly differentiated between negative and positive metastatic SLNs and correlated with patient outcome. Deregulated snoRNAs showed a significant enrichment of biological processes associated with translation initiation, protein targeting to various organelles and regulation of Wnt and NOTCH signaling pathways. Our data highlight the potential use of sncRNAs as surrogate markers of locoregional metastases and patient outcome in breast cancer.

Patients
For each of the 60 female patients who were included in this study, we analyzed paired tumor tissues and SLNs. The main clinicopathological characteristics of the patients are described in Table 1. Of the 60 patients, 40 (67%) had SLN-positive tumors: 20 were diagnosed with micrometastasis and 20 with macrometastasis.

RNA Sequencing
A total of 117 samples (98%) from 59 tumors and 58 SLNs passed the pre- and post-sequencing quality checks, which confirmed average read quality and base quality Q-scores > 30 (99.9% correct) [12]. Three samples (1 tumor and 2 SLNs) were excluded from further analyses. In total, we analyzed 57 patients with paired samples (n = 117). The mean read numbers for tissues and SLNs were 3.8 million and 4.4 million, respectively. Following sequencing and trimming, reads were collapsed into a single read and passed into the analysis pipeline. This allowed for true quantification of the sncRNAs by eliminating library amplification bias and gave a better representation of the RNA molecules in the sample. We obtained an average of 0.96 million and 1 million collapsed reads for tissues and SLNs, respectively, and an average genome mapping rate of 25% and 23% for tissues and SLNs, respectively. The raw counts yielded a total of 4207 sncRNAs, which were reduced to 536 sncRNAs after a filtering step requiring at least 1 CPM in half of the samples. Count data were normalized and log2 transformed using the regularized log (rlog) method from the DESeq2 package (Table S1). The resulting sncRNAs were classified as snoRNAs (69.5%), snRNAs (12.5%), miscellaneous RNAs (7.1%) and rRNAs (9.3%). Within the snoRNA category, the majority were SNORDs (67.8%), followed by SNORAs (29.7%) and SCARNAs (2.6%).
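The abundance filter described above (at least 1 CPM in half of the samples) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `cpm` and `filter_by_cpm`, the list-of-lists count matrix and the toy numbers are assumptions made purely for illustration.

```python
def cpm(counts):
    """Convert one sample's raw counts to counts per million (CPM)."""
    total = sum(counts)
    return [c * 1e6 / total if total else 0.0 for c in counts]


def filter_by_cpm(count_matrix, min_cpm=1.0, min_fraction=0.5):
    """Return the row indices of features that reach `min_cpm` in at least
    `min_fraction` of the samples.

    count_matrix[i][j] is the raw count of feature i in sample j
    (hypothetical layout chosen for this sketch).
    """
    n_features = len(count_matrix)
    n_samples = len(count_matrix[0])
    # CPM is computed per sample, i.e. per column of the matrix.
    cpm_by_sample = [
        cpm([count_matrix[i][j] for i in range(n_features)])
        for j in range(n_samples)
    ]
    needed = min_fraction * n_samples
    kept = []
    for i in range(n_features):
        n_pass = sum(1 for j in range(n_samples) if cpm_by_sample[j][i] >= min_cpm)
        if n_pass >= needed:
            kept.append(i)
    return kept


# Toy matrix: 4 features x 4 samples, each library summing to 1,000,000 reads,
# so raw counts equal CPM values in this example.
toy_counts = [
    [500000, 600000, 400000, 700000],
    [0, 1, 0, 0],
    [5, 0, 3, 0],
    [499995, 399999, 599997, 300000],
]
kept_rows = filter_by_cpm(toy_counts)
```

In the toy matrix, the second feature reaches 1 CPM in only one of four samples and is dropped, whereas the third reaches it in two of four and is retained.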

Correlations and Clustering Analyses
To investigate whether patients were assigned into biological groups based on their sncRNA expression, we performed unsupervised hierarchical clustering using the 50 sncRNAs with the largest coefficient of variation based on rlog-normalized counts. Our data indicated no significant differences between tumors and SLNs (Figure 1A). Similar results were obtained using a principal component analysis (PCA). Although our data showed some differences between the two sample types, those differences were not sufficiently large to cluster samples into different groups (Figure 1B). We also performed a tumor-to-SLN Spearman's correlation analysis (rs). Our results showed that sncRNA expression in tumor and SLN samples from the same patient was highly correlated, with an average value for all patients of rs = 0.955 (0.904-0.975) (Figure 1C, Table S2).
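For readers unfamiliar with the statistic, Spearman's correlation is the Pearson correlation computed on ranks, with tied values receiving their average rank. The following is a minimal pure-Python sketch of that definition; the function names and toy vectors are illustrative assumptions and do not reproduce the R code used in the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical rlog values for the same sncRNAs in a tumor and its paired SLN.
tumor = [8.1, 5.2, 9.7, 3.3, 6.0]
sln = [7.9, 5.5, 9.1, 3.0, 6.4]
rho = spearman_rho(tumor, sln)
```

Applied per patient to the paired tumor and SLN expression vectors, this yields one coefficient per patient, which can then be averaged as reported above.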


Differentially Expressed Tumor snoRNAs
Given the similarities in sncRNA expression between tumors and SLNs, we focused on tumor tissues to investigate whether differentially expressed (DE) sncRNAs were associated with the locoregional metastasis status of the patients. We first analyzed tumor samples according to their positive (n = 39) or negative (n = 20) metastasis status. We found 23 significantly DE sncRNAs (21 down-regulated and 2 up-regulated) after correcting for multiple testing (q < 0.05), with an absolute log2 fold change ≥ 1.5 associated with positive samples (Figure 2A) (Table S3). Similar results were obtained when patients were classified according to the SLN metastatic status, either positive macrometastasis (n = 19) or positive micrometastasis (n = 20) (Figure 2B,C). Interestingly, we observed that up-regulated sncRNAs were associated only with micrometastasis. No DE sncRNAs were found between patients with micrometastasis and macrometastasis. We next investigated DE sncRNAs based on breast cancer molecular subtypes. Our series included mainly patients with luminal A (n = 25) and luminal B (n = 21) tumors, followed by luminal B HER2+ (n = 8) and TN (n = 5) tumors (Table 1). Analyzing the first three subgroups (Figure 2D-F), our results show down-regulation of SNHG5, SNORD90 and SCARNA2 and up-regulation of SNORD78 associated with luminal B compared to luminal A tumors. On the other hand, SNORD124 up-regulation was associated with luminal B HER2+ compared to either luminal A (Figure 2E) or luminal B tumors (Figure 2F).
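The two selection criteria used above (q < 0.05 after multiple-testing correction and an absolute log2 fold change ≥ 1.5) can be sketched as follows. The sketch assumes that the q values are Benjamini-Hochberg false discovery rate-adjusted p values, which is consistent with the Figure 2 legend but is not stated explicitly in the text; the feature names and numbers are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values (q values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q[idx] = running_min
    return q


def select_de(names, log2fc, p_values, q_cut=0.05, lfc_cut=1.5):
    """Keep features with q < q_cut and |log2 fold change| >= lfc_cut."""
    q = benjamini_hochberg(p_values)
    return [
        (name, fc, qv)
        for name, fc, qv in zip(names, log2fc, q)
        if qv < q_cut and abs(fc) >= lfc_cut
    ]


# Hypothetical features; negative fold changes denote down-regulation.
names = ["sncRNA_1", "sncRNA_2", "sncRNA_3", "sncRNA_4"]
log2fc = [-2.0, 0.5, 1.6, -1.0]
p_values = [0.01, 0.04, 0.03, 0.005]
selected = select_de(names, log2fc, p_values)
```

Here all four hypothetical features pass the q-value threshold, but only the two with a sufficiently large absolute fold change are retained.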

Biological Significance and Enriched Analysis of sncRNAs
We performed a biological significance analysis using DE sncRNAs based on patient locoregional metastatic status (p < 0.05 and absolute log fold change > 0.3). In contrast to other sncRNAs such as microRNAs, annotation of snoRNAs in functional databases such as gene ontology (GO), KEGG or Reactome is scarce. Therefore, biological significance was assessed using three different gene lists: the snoRNA host genes and gene targets retrieved from the snoDB database, and genes correlated with selected sncRNA expression in the TCGA-BRCA and SNORic databases (Table S4). Overall, our data show that the host genes of DE sncRNAs in patients with positive locoregional metastasis included GO categories associated with translational initiation, various processes targeting specific proteins to particular regions of the cell during or after translation, and regulation of the Wnt signaling pathway (Figure 3A). Likewise, the top GO categories of target genes associated with DE sncRNAs were also involved in the same processes.
We carried out an enrichment analysis to determine the pathways associated with the DE sncRNAs. GO biological process analysis using snoDB showed that the target genes were associated with Wnt signaling, protein translation, targeting proteins to particular cell locations, histone methylation, neutrophil activation and cell maturation and development (Figure 3B,C). To further understand the signaling pathways involved in the regulation of DE snoRNAs according to locoregional metastasis status, we performed a similar analysis using Reactome, which, in contrast to the GO biological processes, makes extensive use of protein complex interactions in its representation, thus giving a more detailed picture of the pathways involved with a particular set of snoRNAs. Our results show an enrichment of DE sncRNAs associated with the NOTCH processing, resolution of sister chromatids and chemokine binding pathways (Figure 3D, Table 2).

Clinicopathological Correlation with DE sncRNAs
In our series of 60 patients with early-stage breast cancer, we observed recurrence in 11 (18%) patients. Median follow-up time was 9.6 years (range 0.4-12.5 years). At the last follow-up, nine (15%) patients were dead. The univariate analysis showed several sncRNAs associated with tumor grade (n = 134), lymphovascular invasion (n = 20) and tumor focality (n = 12) and, to a lesser degree, with menopausal status (n = 2) and tumor stage (n = 1) (Table 3 and Table S5).
Table 3. Univariate analysis showing the number of significant (q < 0.01) sncRNAs associated with the clinicopathological characteristics of the patients. The names of the top 10 most significant sncRNAs are shown.


Discussion
Breast cancer is one of the most prevalent cancers among women and the leading cause of cancer mortality in women [13]. Currently, lymph node (LN) involvement remains the most important prognostic factor in breast cancer [14,15], and assessment of metastasis in the SLNs is still the recommended procedure for axillary staging of early breast cancer [16].
Our recent research has focused on the involvement of miRNAs in the development of locoregional metastases in patients with early-stage breast cancer [17][18][19]. We did not, however, investigate other classes of sncRNAs that have emerged in recent years as important regulators in cancer development and in the various steps of the metastatic process [20].
In this study, we investigated the expression of sncRNAs in paired primary tumors and SLNs from early-stage breast cancer patients and correlated the results with SLN metastatic status. Our RNA sequencing data show that 83% of the annotated sncRNAs were classified as snoRNAs (70%) or snRNAs (13%), whereas the rest belong to miscellaneous RNAs, including some long non-coding RNAs (lncRNAs). Nonetheless, the full landscape of sncRNAs may not have been revealed in our dataset because (1) we could only use the gene annotation tools currently available, which are constantly improving, (2) known RNA sequencing biases still exist [9] and (3) RNA library preparation limitations may prevent certain sncRNAs from being amplified [20].
Our data show that sncRNA expression in tumors and SLNs is similar, with minor non-significant changes that are likely due to histological differences between the SLN and the tumor: whereas SLNs consist mainly of lymphoid and monocytic cells, tumor tissue is formed mostly of epithelial and mesenchymal cells. Since our data showed high tumor-SLN correlation for patients, we performed further analyses on tumor samples.
We identified 23 DE sncRNAs associated with patient locoregional metastatic status. Overall, and similar to our previously reported data on miRNAs, we found that most DE sncRNAs were down-regulated (adjusted q < 0.05), suggesting that the expression of all sncRNAs follows a similar pattern in early-stage breast cancer patients [17,19]. Interestingly, down-regulated sncRNA expression was similar in patients affected by micro- and macrometastases, but up-regulation of sncRNAs (SNORD93, SNORD114-20 and SNORD116-24) was only significant in the micrometastatic group. However, the number of patients in each group was small (n = 20), and it remains to be elucidated whether loss of the expression of specific sncRNAs is part of the natural history of breast cancer tumors.
Among the DE sncRNAs in our list, four snoRNAs (SNORA47, SNORD94, SNORA70 and SNORD10) have been documented to be involved in various human cancers [5]. SNORA47 has been reported to be up-regulated in human hepatocellular carcinoma and associated with intrahepatic metastasis and lymphatic invasion. In addition, high expression of SNORA47 predicted worse patient outcome [21]. SNORD94, SNORA70 and SNORD10 have been reported to be up-regulated in a p53 oncogenic gain-of-function mutant mouse osteosarcoma model. The authors showed by RNA-seq that a cluster of snoRNAs was highly up-regulated in p53 mutant tumors in association with the Ets2 transcription factor-binding site. Homozygous deletion of Ets2 resulted in down-regulation of these snoRNAs and reversed the pro-metastatic phenotype of p53 mutant tumors [22]. These reports suggest that the four snoRNAs act as oncogenes, which contrasts with our results; in breast cancer, SNORA47, SNORD94, SNORA70 and SNORD10 may instead act as tumor-suppressor genes (TSGs) by as yet unknown mechanisms. In support of this argument, down-regulation of SNORD10 has been associated with epigenetic promoter silencing in stage IV melanoma cell lines [23]. Thirteen sncRNAs in our dataset (some of them with a non-adjusted p < 0.05) (RMRP, RN7SK, SNORA47, SNORA50C, SNORA71A, SNORA73B, SNORA7B, SNORA80E, SNORD10, SNORD112, SNORD12B, SNORD15A and VTRNA2-1) were found annotated in the DisGeNet database [24], a curated database integrating information on human-disease associations from various repositories and inferred associations from literature text mining. RMRP, SNORA7B, SNORD15A, SNORA71A and VTRNA2-1 have been previously reported in breast carcinomas [25][26][27][28]. RMRP, part of the RNase mitochondrial RNA processing (MRP) complex, has been found to be regulated by the oncogenic Wnt/β-catenin and Hippo/YAP pathways [25]. SNORA7B has been reported to be up-regulated in breast tumors compared to normal tissue [26], and SNORD15A and SNORA71A have been found to be up-regulated in brain metastases [27]. VTRNA2-1 has been shown to be involved in the inhibition of protein kinase R (PKR) activity and to act as a TSG in several cancer types. Increased breast cancer risk has been associated with down-regulation of VTRNA2-1 linked to five methylation marks within the VTRNA2-1 promoter region [28].
An interesting picture emerges from our data compared to the aforementioned reported data. First, our dataset included only tumor samples since normal tissue from the same patients was not available for study. We therefore could not address the relative expression of the DE sncRNAs found in our study compared to normal breast tissue, which would have helped to elucidate whether the studied sncRNAs act as oncogenes or TSGs. However, three DE sncRNAs associated with locoregional metastases (SNORA80E, SNORD15B and SNORD114-20) have been previously reported to be deregulated in invasive local BC compared to benign breast tissue [29]. Second, our biological significance and enrichment analyses, based on various databases that included host genes, target genes and interactions between genes and sncRNAs from the TCGA-BRCA atlas, showed similar signaling pathways to those described in the literature [20,30]. For instance, our data identified various GO biological processes as part of the Wnt signaling pathway. WNTs and WNT pathway components are also frequently over- or under-expressed in various cancers, and these changes are correlated with epigenetic regulation of promoter activity. In some contexts, both canonical and non-canonical WNT signaling, which governs processes such as cell polarity and morphogenesis, may also contribute to tumor formation by promoting cell migration, invasiveness and metastasis [31]. In addition to the GO biological processes that focus on the activities of individual genes, we used the Reactome pathway database, which makes extensive use of protein complexes in its pathway representations and describes their formation, dissociation and activities [32]. The main Reactome pathways were associated with the Notch signaling pathway (NSP), a highly conserved pathway for cell-cell communication involved in the regulation of cellular differentiation, proliferation, and specification [33]; chemokine receptors and their interaction with various chemokines that activate integrins for leukocyte adherence on endothelial cells and induce chemotaxis of leukocytes in tissue microenvironments [34]; and various processes associated with mitosis, such as resolution of sister chromatids during mitotic prometaphase, which indicates the involvement of sncRNAs in cell proliferation.
In addition to finding a correlation of DE sncRNAs with patient metastasis status, we found that a large number of sncRNAs correlated with various clinicopathological features, especially tumor grade, lymphovascular invasion, tumor focality and breast cancer molecular subtypes, in agreement with a recent study [29]. Interestingly, we have also described SNORD124 up-regulation in tumors expressing HER2, suggesting that SNORD124 could serve as a diagnostic biomarker for HER2-positive tumors. More importantly, our six-snoRNA signature (SNORD93, SNORA16A, SNORD113-6, SNORA7A, SNORA57 and SNORA18A) accurately distinguished between patients with and without locoregional metastasis. We also found that a low snoRNA discriminant score was associated with better patient outcome.
The main limitation of our study is the small number of samples used to assess the potential of sncRNAs as surrogate biomarkers of lymph node metastasis in breast cancer. Therefore, our results are preliminary and must be interpreted with caution. Nonetheless, in this study, we provide evidence that several sncRNAs are associated with the locoregional metastatic status and patient outcome in early-stage breast cancer. Further studies are required in a larger number of patients to clinically validate our results and to unveil the molecular mechanisms of the sncRNAs described in this study.

Materials and Methods
Patients. We studied 60 patients with early-stage breast cancer treated with surgery. Male patients were excluded from this study. Sample size was determined according to the model developed by Dobin, K et al. [35]. None of the patients had been previously treated with surgery, chemotherapy or radiation. All patients had confirmed diagnoses based on tumor biopsy histopathology and intraoperative SLN tissue evaluated using the OSNA assay [36]. All tumors were invasive ductal carcinomas (IDCs) with or without an in situ component. The following clinical and pathological parameters were recorded: age, menopausal status, personal and family history of disease, clinical follow-up, tumor stage determined according to the UICC system [37], histological grade determined using the Elston-Ellis grading system [38], tumor histology, presence of associated carcinoma in situ, presence of vascular and lymphatic invasion, tumor infiltrating lymphocytes, tumor focality, tumor necrosis and proliferation of non-tumoral tissue. For each patient, we collected paired tumor and SLN samples (n = 120 samples). Samples were classified according to the SLN status as negative (n = 20) or positive (n = 40). Positive samples were sub-classified as macrometastatic (n = 20) or micrometastatic (n = 20) [36].
RNA isolation. Tumors and SLNs were processed as previously described [17]. RNA was isolated from tumor and SLN samples using miRNeasy (Qiagen, Germantown, MD, USA) according to the manufacturer's instructions and eluted in a volume of 30 µL. The RNA integrity number (RIN) was measured for each RNA sample using an Agilent TapeStation (Santa Clara, CA, USA). All samples used in this study had a RIN value > 7. A range of spike-ins was added to all samples prior to RNA isolation. A pre-sequencing quality check by q-PCR was performed on all samples to control for the quality of the RNA and for inhibition in downstream enzymatic reactions, as previously described [18].
RNA sequencing. All steps required to perform next-generation sequencing and genome annotation were carried out as previously described [17,18]. Genome annotation was performed using the QIAGEN CLC Genomics Server v20.0.4 (Qiagen, Germantown, MD, USA). Following sequencing, Cutadapt (1.9.1) [39] was used to trim adaptor sequences. A quality check was performed to ensure Q-scores > 30 (>99.9% correct) for our data [12]. Reads with the correct length were collapsed into FASTQ files. Bowtie2 software (2.2.6) was used to map the reads. The mapping criterion for aligning reads to spike-ins, abundant sequences and databases was for reads to perfectly match the reference sequences. For mapping to the genome, two mismatches were allowed in the sequences. Small insertions and deletions were not allowed. The resulting sequences were annotated using the human assembly GRCh38 (Ensembl) and the snoDB database v1.2.1 [40]. The raw data were filtered to keep only sncRNAs with at least 1 CPM in half of the samples. Count data were then normalized and log2 transformed using the regularized log (rlog) method from the DESeq2 package [41] to eliminate biases in the composition of the sequencing libraries and to stabilize variance-mean dependence in count data.
Correlation, hierarchical clustering and differential expression analyses. Correlation analyses were performed using the rlog-normalized counts matrix. Spearman's rho (rs) statistic and a heatmap plotted with Euclidean distances were used to measure similarities between samples from the same patients. To visualize sample expression profiles, hierarchical clustering was performed using Euclidean distances and scaled and centered rlog-normalized counts. PCA was performed to reduce the rlog-normalized counts to two dimensions. Differential expression analyses were performed on counts normalized with the trimmed mean of M values (TMM) method [42] and converted to log2 scale, using the R statistical software package v3.6.3 and libraries from the Bioconductor Project (www.bioconductor.org, accessed on 26 March 2024) [43].
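The feature selection and scaling steps that precede the clustering heatmap (selecting the sncRNAs with the largest coefficient of variation, then centering and scaling each row) can be sketched as below. This is an illustrative sketch only: the study used R, and the function names, the use of the sample standard deviation and the toy matrix are assumptions.

```python
import statistics


def coefficient_of_variation(row):
    """Sample standard deviation divided by the absolute mean of a row."""
    mean = statistics.fmean(row)
    return statistics.stdev(row) / abs(mean) if mean else 0.0


def top_variable_features(matrix, names, k=50):
    """Return the k (name, row) pairs with the largest coefficient of variation."""
    ranked = sorted(
        zip(names, matrix),
        key=lambda pair: coefficient_of_variation(pair[1]),
        reverse=True,
    )
    return ranked[:k]


def row_zscore(row):
    """Center a row on its mean and scale it by its sample standard deviation."""
    mean = statistics.fmean(row)
    sd = statistics.stdev(row)
    return [(v - mean) / sd if sd else 0.0 for v in row]


# Toy rlog-like matrix: 3 hypothetical features x 4 samples.
names = ["feat_a", "feat_b", "feat_c"]
matrix = [
    [5.0, 5.0, 5.0, 5.0],  # constant, CV = 0
    [1.0, 9.0, 1.0, 9.0],  # most variable
    [4.0, 6.0, 4.0, 6.0],  # moderately variable
]
top2 = top_variable_features(matrix, names, k=2)
scaled = [row_zscore(row) for _, row in top2]
```

The scaled rows have mean 0 and unit standard deviation, which is what the red and blue Z-score coloring in a clustering heatmap typically represents.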
Biological significance and enrichment analyses. Functional annotation of selected sncRNAs with a p < 0.05 was performed using Ensembl, NCBI resources and the snoDB database v1.2.1 [40]. Validated gene targets were searched for using R software v3.6.3 to retrieve sncRNA-target interactions from the DisGeNet v7.0 database [24]. The biological significance analysis was performed using gene target and host gene lists from the snoDB database and the TCGA-BRCA dataset from the SNORic [44] data portal. Enrichment analyses were conducted using GO [45] and the Reactome pathway database [46]. The analyses were performed using the R/Bioconductor clusterProfiler package v3.12.0 [47].
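Over-representation analyses of the kind run with clusterProfiler are generally based on a one-sided hypergeometric test: given a gene universe, a term annotating some of those genes and a selected gene list, the p value is the probability of observing at least the observed overlap by chance. The sketch below illustrates that calculation under this assumption; the function name and the numbers are hypothetical, and it omits the multiple-testing adjustment that a full analysis would apply.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_term, n_selected, n_overlap):
    """One-sided hypergeometric p value P(X >= n_overlap).

    n_universe: number of genes in the background set
    n_term:     number of background genes annotated to the term
    n_selected: number of genes in the list being tested
    n_overlap:  number of selected genes annotated to the term
    """
    denom = comb(n_universe, n_selected)
    upper = min(n_term, n_selected)
    # math.comb returns 0 when k > n, so impossible overlaps contribute nothing.
    tail = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / denom


# Toy example: 10 background genes, 4 annotated to a term,
# 3 genes selected, 2 of which carry the annotation.
p_toy = hypergeom_enrichment_p(10, 4, 3, 2)
```

In the toy example, the tail probability is (6 × 6 + 4 × 1) / 120 = 1/3, so an overlap of two genes would not be considered enriched.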
Classifier model building. Briefly, several iterations were performed during the resampling step in a balanced, random manner. The data were split into training and test cohorts, which were used to build and validate the model, respectively. Within each iteration, biomarker candidates were selected using different methods (t-test, lasso, random forest) and a fixed number of features (3, 5, 10 and 25). Once candidates were selected, classification profiles were created using penalized logistic regression, partial
Int. J. Mol. Sci. 2024, 25, x

Figure 1. Class discovery associated with SLN metastatic status. The analysis was performed using 50 sncRNAs with the largest coefficient of variation based on rlog-normalized values. (A) Heatmap and unsupervised hierarchical clustering. Each row represents one sncRNA and each column represents one sample. The row Z-score scaling method was used to represent expression level above (red) and below (blue) the mean. (B) Principal component analysis shows sample clusters arising naturally based on the sncRNA expression profile. (C) Scatterplots depicting tumor-to-SLN comparison between samples from the same patient show the log expression of sncRNAs for each sample type. The average Spearman's correlation coefficient (rs) for all tumor-SLN comparisons is shown. ** Two-tailed p < 0.01.


Figure 2. Differentially expressed sncRNAs. The volcano plots show differentially expressed sncRNAs in tumor samples according to patient locoregional metastatic status (A-C) and molecular subtype (D-F). The data show the logarithmic relationship between false discovery rate-adjusted p values (q value) (y-axis) and the log2 fold change expression (x-axis). Red, blue and grey dots show q values < 0.05, non-adjusted p values < 0.05 and non-significant p values > 0.05, respectively. Only snoRNAs with an absolute log2 fold change ≥ 1.5 are labeled.


Figure 3. Gene ontology (GO) enrichment analysis for significant biological processes associated with positive SLNs. (A) The dot plot graph shows the 50 most significant biological process GO terms (y-axis) and the ratio between the number of expressed sncRNAs associated with the GO term and the number of significantly differentially expressed genes associated with the GO term (x-axis). The color of the nodes indicates the p value and the size of the nodes the number of sncRNAs associated with a specific GO term. (B) Enrichment map of the top 60 sncRNAs, with pathways grouped by similarity. Node size indicates the number of sncRNAs found in a pathway and node color reflects the significance of the p value. (C,D) The neural plots show the link between genes and terms associated with the most significant GO terms or Reactome pathways, respectively.


Figure 4. Classifier model and association with patient outcome. (A) Boxplot shows the normalized expression of the snoRNA score (y-axis) and the metastatic status of patients. The red line indicates the cutoff value. (B) ROC curve analysis of the snoRNA score for discriminating patients with locoregional metastasis (blue). The reference line for random classification is shown in red. (C,D) Kaplan-Meier survival curves and log-rank tests for disease-free survival and overall survival based on snoRNAs categorized as low or high expression.

Table 1. Basic patient and tumor characteristics.

Table 2. GO and Reactome enrichment analyses associated with Wnt and NOTCH signaling pathways. Data show sncRNAs associated with each biological term and their target genes.