Circulating miRNA Expression Is Inversely Correlated with Tumor Tissue or Sentinel Lymph Nodes in Estrogen Receptor-Positive Early Breast Cancer Patients

The deregulation of microRNAs (miRNAs) is associated with the various steps of the metastatic process. In addition, circulating miRNAs are remarkably stable in peripheral blood, making them ideal noninvasive biomarkers for disease diagnosis. Here, we performed a proof-of-principle study to determine whether tumor-tissue-derived miRNAs are traceable to plasma in ER-positive early breast cancer patients. We performed RNA-sequencing on 30 patients for whom plasma, sentinel lymph nodes (SLNs) and tumor tissue were available. We carried out differential expression, gene ontology and enrichment analyses. Our results show that circulating miRNAs are inversely expressed compared with tumor tissue or SLNs obtained from the same patients. Our differential expression analysis shows the overall downregulation of circulating miRNAs. However, the expression of miR-643a-3p and miR-223 was up-regulated in patients with positive SLNs. Furthermore, gene ontology analysis showed the significant enrichment of biological processes associated with the regulation of epithelial cell proliferation and transcriptional regulation commonly involved in the promotion of metastases. Our results suggest the potential role of several circulating miRNAs as surrogate markers of lymph node metastases in early breast cancer patients. Further preclinical and clinical studies are required to understand the biological significance of the most significant miRNAs and to validate our results in a larger cohort of patients.


Introduction
Cancer metastases are responsible for the majority of breast cancer deaths.Despite intensive research in this field, our understanding of the molecular events that drive metastatic progression remains largely incomplete.In recent years, microRNAs (miRNAs) have emerged as important regulators of many cellular processes, including the various steps of the metastatic process.miRNAs are small (19-25 nt), noncoding RNA (ncRNA) molecules that regulate the expression of target genes by binding to complementary regions of mRNAs to repress their translation or regulate their degradation.Many cellular pathways are affected by the regulatory function of miRNAs, and several human pathologies, including cancers, are associated with the deregulation of miRNAs [1] and their metastases [2].
The development of miRNA signatures in liquid biopsies for the diagnosis of a variety of different cancer types is an active field of research [3].This is in part due to the fact that circulating miRNAs and other small RNA molecules are stable in peripheral blood [4,5], making them ideal noninvasive biomarkers for disease diagnosis.Our recent research has focused on the involvement of miRNAs in the development of locoregional metastases in patients with early breast cancer [6].We showed a good correlation in the miRNA expression profile between sentinel lymph nodes (SLNs) and matched primary tumor samples, in agreement with a previous study [7].The down-regulation of various miRNAs was significantly differentially expressed in patients with locoregional metastasis.In addition, we recently profiled the expression of miRNAs in a small number of plasma samples collected before surgery from patients with early-stage breast cancer who were not previously treated [8].Our results showed an overall downregulation of miRNAs that, in many cases, are direct targets of proteins that promote metastasis.One important caveat in our previous studies was a lack of correlation between miRNA expression in plasma, tumor tissues and SLNs obtained from the same patients.This is of utmost importance in order to develop potential circulating-based biomarkers.First, to elucidate whether circulating miRNAs are derived from tumor-tissue miRNAs and, second, to establish whether plasma miRNAs accurately reflect the heterogeneity of a primary tumor and its subclones [3].
Here, we performed RNA-sequencing to profile the expression of miRNAs in 30 early breast cancer patients for whom plasma, tumor tissue and SLN samples were available.Our results show an inverse relationship between circulating miRNAs and tumor/SLN samples.Nonetheless, several circulating miRNAs were associated with metastatic SLN status via a significant enrichment of the biological processes associated with the regulation of epithelial-mesenchymal transition, cell proliferation and transcriptional regulation.Our data highlight the potential use of circulating miRNAs as surrogate markers of locoregional metastases in breast cancer.

Patients
A total of 30 female patients were included in this study.We analyzed 30 plasma samples, 29 SLNs (2 SLNs were collected from the same patients) and nine tumor tissues.We paired plasma and SLN samples for 27 patients; paired plasma and tumor tissue for 9 patients; and all three samples available for 7 patients (Table S1).The main clinicopathological characteristics of the patients are described in Table 1.A total of 12 (40%) patients had SLN-positive tumors; seven SLNs were diagnosed as micrometastasis (23%); and five were diagnosed as macrometastasis (17%).

RNA-Sequencing
All samples passed pre-and post-sequencing quality checks, which confirmed average read quality and base quality Q-scores of >30 (99.9% correct) [9] and the expected read length distribution for miRNAs (Figure S1).The mean read numbers for plasma and tissues were 18.5 ± 0.9 million and 13.8 ± 0.6 million, respectively.Following sequencing and trimming, all reads containing identical insert sequences and UMI sequences (insert-UMI pair) were collapsed into a single read and passed into the analysis pipeline.This allowed for the true quantification of the miRNAs by eliminating library amplification bias and a better representation of the RNA molecules in the sample.We obtained an average of 1.97 million and 2.24 million collapsed reads for plasma and tissues, respectively, which resulted in good miRNA mapping reads with a very dominant miRNA peak in all samples, indicating good sample/data quality (Figure S2A).Overall, we obtained average genome mapping rates of 65% and 95% for plasma and tissues, respectively (Figure S2B).After mapping and counting relevant entries in mirbase_22, the expression of known miRNAs was calculated using the TPM method (Table S2).We did not identify any sequences identical to those of known miRNAs in miRBase_22 for other organisms.

Unsupervised Clustering Analysis
We investigated whether patients were assigned to biological groups based on their miRNA expression.We performed a supervised two-way hierarchical clustering of miRNAs and samples using the 50 miRNAs with the largest coefficients of variation based on TMM counts.Interestingly, the expression of circulating miRNAs showed an inverse correlation with either tumor tissues or SLNs obtained from the same patients.That is, for all patients studied, we observed that any miRNAs overexpressed in tissues were downregulated in plasma and vice versa (Figures 1A and S3).We obtained similar results using a principal component analysis (PCA), which clearly clustered plasma samples into a different group (Figure 1B).In addition, we performed Spearman's correlation analysis (r s ) between samples obtained from each patient.Our data show that both tumor and SLN samples from the same patient have a high correlation index, with an average value of r s = 0.78 among all patients.Interestingly, we obtained two SLNs from patients 1 and 20.The latter patient had both SLNs diagnosed as macrometastases, whereas the former patient had one macrometastasis and one micrometastasis.In both cases, we showed the highest correlations with r s = 0.875 and r s = 0.864, respectively (Figure 1C).In contrast, both tumors and SLNs showed little correlation when compared with plasma samples from the same patient, with averages of r s = 0.61 and r s = 0.59, respectively (Figures 1C and S4).

Differentially Expressed Circulating miRNAs
We focused on circulating miRNAs to investigate whether differentially expressed plasma miRNAs were associated with the locoregional metastasis status of the patients.First, we analyzed plasma samples according to positive (n = 12) or negative (n = 18) SLN metastasis status.We found 34 miRNAs with a significant differential expression (p < 0.05).However, only the upregulation of hsa-miR-642a-3p remained significant after correcting for multiple testing (q = 0.034) (Figure 2A).Overall, we found four miRNAs upregulated and seven miRNAs downregulated with an absolute log2 fold change of ≥1 (Table S3).Similar results were found when patients were sub-classified into positive macrometastasis (n = 5) or micrometastasis (n = 8) SLNs and compared with patients with negative SLNs (n = 18).We found that hsa-miR-642a-3p remained upregulated in patients with micrometastasis SLNs (p < 0.05) but not in patients with macrometastasis SLNs.In contrast, we found that hsa-miR-223 was upregulated (q = 0.03) in patients with positive macrometastasis (Figure 2B).Since our series included patients with luminal A and luminal B tumors (Table 1), we analyzed the differential expression between these two subgroups.Our results show 12 miRNAs with an absolute log2 fold change of ≥1 when comparing luminal B against luminal A tumors.Ten miRNAs were downregulated and two miRNAs were upregulated in luminal B tumors.However, none of them remained significant after correcting the p-values for multiple testing (q > 0.05) (Figure 2D, Table S3).

Biological Significance and Enriched Analysis of Circulating miRNAs
We performed a biological significance analysis using a subset of miRNAs (p < 0.05 and absolute log fold change >0.3).Table 2 shows a summary of the top targeted genes for each miRNA investigated.A complete list of validated target genes and all evidence types are provided in Table S4.In addition, we carried out an enrichment analysis of the GO and Reactome Pathways databases using the identified set of Homo sapiens target genes to understand how our data are related to biological functions.The results are summarized in Figure 3 and Table 3.Our data show that differentially expressed miRNAs were associated with biological processes (BP) markedly focused on gland development, gene expression regulation via receptor tyrosine kinases, epithelial cell motility and proliferation processes (p < 0.001).A complete list of the analyses for each comparison is included in Table S5.Differentially expressed miRNAs were enriched in cellular component (CC) terms located in membrane fractions, WNT signalosome and transcription factor complexes.Moreover, differentially expressed miRNAs were enriched in molecular function (MF) terms associated with RNA polymerase II, chromatin and ubiquitin-binding (p < 0.001) (Table 3).

Clinicopathological Correlation with Circulating miRNAs
Our series included 30 patients with ER+ early breast cancer, and we reported recurrence in two (7%) patients.The median follow-up time was 5.2 years (range 2.3-7.3 years).At the last follow-up, all were reported to be alive.We investigated whether the differentially expressed miRNAs correlated with the patients' clinicopathological parameters.The expressions of miR-6724-5p, miR-27a-3p, miR-423-3p and miR-140-5p were significantly higher in patients with high Ki67.The expression of miR-16-2 was significantly higher in older patients (>60 years).The expressions of miR-6724-5p, miR-320d and miR-423-3p were significantly higher in luminal B tumors.The increased expression of miR-182-5p was associated with higher-grade tumors, and low miR-3157-5p expression was associated with the presence of tumor multifocality.Given the low number of events, we were unable to perform any survival analysis on our cohort of patients.Table 3. Gene ontology analysis.Gene set enrichment analysis using terms within GO categories (biological process, cellular component, molecular function) was applied to extract biological meaning from the identified differentially expressed transcripts and predicted mRNA targets.The top 10 GOsignificant categories associated with differentially expressed circulating miRNAs in patients with positive SLNs are shown."Counts" refers to the ratio between the number of enriched differentially expressed miRNAs and the total number of miRNAs assigned to these terms.

Discussion
In recent years, miRNAs have emerged as important regulators of the various steps of the metastatic process [10].Currently, lymph node affection remains the most important prognosis factor in breast cancer [11], and the presence of metastasis in SLNs is still the recommended procedure for the axillary staging of early breast cancer.Our recent research focused on the involvement of miRNAs in the development of locoregional metastases in patients with early-stage breast cancer.Our initial study focused on profiling the expression of miRNAs via RNA-seq in paired tumor tissue and lymph nodes from the same patients [6], whereas the second was a small observational study of plasma samples in breast cancer patients [8].Both studies reported an association between several miRNAs and the locoregional metastatic outcomes of patients.However, we could not establish whether the expression of circulating miRNAs resembled that of primary tumors or SLNs since both sample cohorts were obtained from different patients.
Therefore, in this study, we sought to address this issue by analyzing plasma, SLNs and tumor tissues derived from the same patients and correlating the data with the metastatic status of these patients.In total, we RNA-sequenced 30 plasmas, 30 SLNs and nine tumor tissues (n = 68 samples) from 30 patients with ER+ early breast cancer.Overall, our data show that the miRNA expression in tumors and SLNs is similar, with minor changes that are likely due to histological differences between the SLN and the tumor.In contrast, the expression of circulating miRNAs showed an inverse correlation with tumor and SLN samples.For instance, any specific miRNAs that were upregulated in tumor tissues or SLNs were downregulated in plasma samples and vice versa.In fact, when we performed a sample-to-sample comparison on the same patients, the plasma samples showed a low Spearman correlation when compared with both tumor tissues or SLNs.Our data agree with previous publications, suggesting little correlation between the tumor and serum expression of miRNAs in breast cancer using qPCR [12].However, we cannot rule out the possibility that blood sample collection may influence the expression of circulating miRNAs.In this regard, there is evidence showing that, in metastatic colorectal cancer, circulating miRNA expression changes before and after passing through the metastatic site (liver), as if somehow some miRNAs are retained specifically in the liver, while others are not [13].In addition, transcriptome analysis revealed non-identical miRNA profiles between arterial and venous plasma [14].Similarly, comparisons between arterial and venous miRNAs in hypertension-versus-control conditions also revealed a prominent disease association between circulating miRNAs and their target genes in arteries but not in veins [15].The overall picture that emerges from these studies is that the blood collection point affects the miRNA expression profile.Increasing evidence also shows that tumordraining veins constitute a better source of biomarkers than peripheral veins in colon cancer patients [16].For instance, a recent small exploratory study indicated that tumor-proximal venous samples are highly enriched in miRNAs, ctDNA mutations and circulating tumor cells among oncological biomarkers and may allow for more robust molecular analysis than peripheral vein samples [17,18].Whether something similar occurs in breast cancer patients remains to be answered.In our study, all plasma samples were obtained from a peripheral vein, and this could, at least in part, explain the observed discrepancies between circulating miRNAs and paired tumors or SLNs.
Despite all the aforementioned data, we identified several circulating miRNAs associated with the locoregional metastatic statuses of patients.Overall, we found that most differentially expressed miRNAs associated with positive SLNs were downregulated (p < 0.05), in agreement with our previous study, including eight miRNAs (339-5p, miR-29b-3p, 139-5p, 144-3p, 423-3p,584-5p, 99b-3p and 101-3p) that we previously described [8].However, their differential expression did not remain significant after adjusting for a false discovery rate.Interestingly, our results show the upregulation of miR-642a-3p and miR-223 (q < 0.05) in patients with positive locoregional metastases.The upregulation of miR-642a-3p has been reported to mediate the HDAC inhibitor-mediated downregulation of HER2 and apoptosis in breast cancer cell lines [19].Others have associated the upregulation of miR-642a-3p as an invasion/metastasis associated miRNA via the mTOR and PI3K-AKT signaling pathways in gallbladder cancer [20], whereas, in breast cancer, miR-642a-3p expression increases in the breast milk extracellular vesicles of mothers with obesity and is associated with cancer signaling pathways [21].In the case of miR-223, its role in human cancer is controversial, and it has been found recently to act as both a tumor-suppressor gene and an oncomiR [22].This dual role may mainly depend on its target in different locations of tissues and cells, and it has been described in breast [23], pancreatic [24] and prostate [25] human cancers.Our data show an upregulation of miR-223 in early-breast cancer patients with positive macrometasis in the SLNs and, thus, support its role as an oncomiR.Interestingly, our enrichment analysis showed that miR-233 top target genes (i.e., FOXO3, MEF2C and FBXW7) have already been described as mediating miR-233 oncomiR characteristics.For instance, FoxO3a, a well-defined tumor suppressor gene in the forkhead transcription factor O subfamily (FoxO), has been shown to be downregulated by the increased expression of miR-233 in pancreatic cancer cell lines [24].Finally, our GO analysis shows that both miR-642a-3p and miR-223 are involved in GO biological processes related to gland and reproductive development, the regulation of receptor tyrosine kinases and epithelial cell proliferation among the top significant terms, adding further evidence of their role in promoting tumor metastasis.
To conclude, in this study, we provided evidence that circulating miRNAs have an inverse correlation with the expression observed in tumor and SLN samples derived from the same patients.Therefore, we could not fully establish whether the expression of circulating miRNAs resembles that of the primary tumor or the SLNs.This is of utmost importance in developing potential circulating-based biomarkers because (1) the origin of tissue-derived miRNAs traceable in plasma and ( 2) to what extent circulating miRNAs accurately reflect the heterogeneity of a tumor and its subclones (or all tumor lesions in metastatic patients) need to be established [3].Nonetheless, we described several miRNAs associated with the locoregional metastatic status of patients with breast cancer, which have been reported to be direct targets of proteins that promote metastasis.Further studies are required with a larger number of patients to demonstrate whether the sensitivity and specificity of plasma miRNAs may be improved through selective venous sampling in breast cancer patients and to unveil the molecular mechanisms of the miRNAs described in this study.

Patients
We studied 30 patients with early breast cancer treated with surgery.Male patients were excluded from this study.None of the patients had prior treatment with surgery, chemotherapy or radiation.All patients had confirmed diagnoses based on the histopathology of tumor biopsies and intraoperative SLNs evaluated using the OSNA assay [26].All tumors were invasive ductal carcinomas (IDCs) with or without in situ components.All patients were estrogen receptor-positive and HER2-negative.We collected the following clinical and pathological parameters: age, menopausal status, personal and familiar disease precedents, clinical follow-up, tumor stage determined according to the UICC system [27], histological grade determined using the Elston-Ellis grading system [28], tumor histology, presence of associated carcinoma in situ, presence of vascular and lymphatic invasion, tumor-infiltrating lymphocytes, tumor focality, tumor necrosis, proliferation of non-tumoral tissue.

Sample Processing
All samples were collected from early breast cancer patients who had not received any previous treatment.Peripheral blood was withdrawn before surgery.Approximately 10-15 mL of peripheral blood was collected for plasma processing in EDTA tubes.Plasma tubes were processed within 2 h of collection, spun at 1200× g for 10 min and inspected for the absence of hemolysis, as previously described [8,29].Tumors and SLNs were processed as previously described [6].All samples were aliquoted in 1.5 mL tubes and stored at −80 • C until further processing.

RNA Isolation
RNA was isolated from 300 µL of plasma samples using the miRNeasy serum/plasma advanced kit (Qiagen, Germantown, MD, USA) according to the manufacturer's instructions.RNA was eluted in a volume of 14 µL.RNA was isolated from tumor tissues and SLNs using the miRNeasy kit (Qiagen, Germantown, MD, USA) according to the manufacturer's instructions and eluted in a volume of 30 µL.The RNA integrity (RIN) level was measured for each RNA sample using the Agilent TapeStation (Santa Clara, CA, USA).
All samples used in this study had a RIN value > 7. A range of spike-ins was added to all samples prior to RNA isolation.A pre-sequencing quality check using q-PCR was performed on all samples to control for the quality of the RNA and the inhibition of enzymatic reactions in downstream reactions, as previously described [8].

RNA-Sequencing
Next-generation sequencing (NGS) and genome annotation were performed as previously described [6,8].Genome annotation was performed using the QIAGEN CLC Genomics Server v20.0.4 (Qiagen, Germantown, MD, USA).Following sequencing, Cutadapt (1.9.1) [30] was used to trim adaptor sequences.A quality check (QC) was performed to ensure Q-scores > 30 (>99.9% correct) for our data [9].Reads with the correct length were analyzed for the presence of UMIs using Cutadapt v1.9.1 (Dortmund, Germany) and then collapsed by UMIs into FASTQ files.This approach eliminated library amplification bias and allowed for the true identification of miRNAs.The Bowtie2 software v2.2.6 (San Diego, CA, USA) was used to map the reads.The mapping criterion for aligning reads to spikeins, abundant sequences and miRBase_22 was for reads to have perfect matches with the reference sequences.To map the genome, one mismatch was allowed in the first 32 bases of the read.Small insertions and deletions (INDELs) were not allowed.The resulting sequences were annotated using human assembly GRCh38 (Ensembl) and the miRBase_22 database.Transcripts per million (TPM) was used as a normalization procedure to correct for differences in sequencing depth and to quantify each RNA species.

Differential Expression Analysis
Differential expression analyses were performed using the TMM normalization method [31] and the EdgeR v3.17 statistical software package (Bioconductor; Victoria, Australia).We used TMM-normalized quantification R scripts [32] to perform principal component analysis (PCA) and unsupervised hierarchical clustering analysis.We selected the top 50 miRNAs with the largest coefficient of variation across all samples to obtain a cluster of samples.The data were normalized to TMM and converted to log2 scale.

Statistics
Differentially expressed miRNAs from RNA-sequencing data were detected using an exact test based on conditional maximum likelihood (CML), included in the R/Bioconductor package EdgeR (Victoria, Australia) [37].p-values from RNA-sequencing were corrected (q-values) for multiple testing using the Benjamini-Hochberg procedure [38].A false discovery rate (FDR) of q < 0.05 was considered significant.In all group comparisons, missing expression values were treated as zero.Differences in total numbers of miRNAs between groups were analyzed using two-sided parametric t-tests.A clinicopathological analysis was performed using Student's t-test for the comparison of quantitative variables and the X2 or Fisher exact tests for the comparison of qualitative variables.The Cox regression model was used to perform the multivariate analysis.A two-sided p-value ≤ 0.05 was considered significant.

Figure 1 .Figure 1 .
Figure 1.Class discovery associated with SLN metastatic status.The analysis was performed using the 50 miRNAs with the largest coefficients of variation based on the trimmed mean of M-values Figure 1.Class discovery associated with SLN metastatic status.The analysis was performed using the 50 miRNAs with the largest coefficients of variation based on the trimmed mean of M-values (TMM counts).(A) Heat map and unsupervised hierarchical clustering.Each row represents one miRNA, and each column represents one sample.The color represents the relative expression level of a miRNA across all samples.The color scale shows the expression level above (purple) or below (green) the mean.(B) Principal component analysis (PCA) shows sample clusters arising naturally based on the miRNA expression profile.Tumors are shown in light brown, plasma samples in grey and SLNs in yellow.Patients with more than 1 SLN analyzed are depicted as blue.(C) Representative scatterplots of selected patients showing a sample-to-sample comparison between different tissues from the same patient as indicated.The scatterplots show the log expression of miRNA expression and the Spearman correlation coefficient (r s ) for each comparison.

Figure 2 .Figure 2 .
Figure 2. Differentially expressed circulating miRNAs.The volcano plots show differentially expressed miRNAs in plasma samples according to the patients' locoregional metastatic status as indicated (A-C) or the molecular subtype (D).The data show the logarithmic relationship between non-adjusted p-values (y-axis) and the fold change expression (x-axis).Corrected q-values of < 0.05 are shown as red dots; non-adjusted p-values of < 0.05 are shown as blue dots; and non-significant p-values of > 0.05 are shown as gray dots.Only miRNAs with an absolute log2 fold change of ≥1 are labeled.

Figure 3 .
Figure 3. Gene ontology (GO) enrichment analysis for significant biological processes associated with positive SLNs.(A) Dot plot graph shows the 50 most significant biological-process GO terms

Figure 3 .
Figure 3. Gene ontology (GO) enrichment analysis for significant biological processes associated with positive SLNs.(A) Dot plot graph shows the 50 most significant biological-process GO terms (y-axis) and the ratio between the number of expressed miRNAs associated with the GO term and the number of significantly differentially expressed genes associated with the GO term (x-axis).The colors of the nodes indicate the p-value and the size of the nodes denotes the number of miRNAs associated with a specific GO term.(B) Enrichment map of the top 60 miRNAs.Pathways are grouped by similarity.The size of each node indicates the number of miRNAs found in a pathway, whereas the color of the node is based on the significance of the p-value (C) Neural network shows the GO terms for the biological processes associated with patients with positive SLNs.

Table 1 .
Basic patient and tumor characteristics.

Table 2 .
Biological significance analysis.We analyzed a subset of differentially expressed miRNAs, and we show the top 5 validated targets for each miRNA.