MiR-106b-5p: A Master Regulator of Potential Biomarkers for Breast Cancer Aggressiveness and Prognosis

Breast cancer (BCa) is the leading cause of death by cancer in women worldwide. This disease is mainly stratified in four subtypes according to the presence of specific receptors, which is important for BCa aggressiveness, progression and prognosis. MicroRNAs (miRNAs) are small non-coding RNAs that have the capability to modulate several genes. Our aim was to identify a miRNA signature deregulated in preclinical and clinical BCa models for potential biomarker discovery that would be useful for BCa diagnosis and/or prognosis. We identified hsa-miR-21-5p and miR-106b-5p as up-regulated and hsa-miR-205-5p and miR-143-3p as down-regulated in BCa compared to normal breast or normal adjacent (NAT) tissues. We established 51 shared target genes between hsa-miR-21-5p and miR-106b-5p, which negatively correlated with the miRNA expression. Furthermore, we assessed the pathways in which these genes were involved and selected 12 that were associated with cancer and metabolism. Additionally, GAB1, GNG12, HBP1, MEF2A, PAFAH1B1, PPP1R3B, RPS6KA3 and SESN1 were downregulated in BCa compared to NAT. Interestingly, hsa-miR-106b-5p was up-regulated, while GAB1, GNG12, HBP1 and SESN1 were downregulated in aggressive subtypes. Finally, patients with high levels of hsa-miR-106b-5 and low levels of the abovementioned genes had worse relapse free survival and worse overall survival, except for GAB1.


Introduction
Breast cancer (BCa) is the first in incidence and the leading cause of death by cancer among women worldwide [1]. It is a heterogeneous disease, since it has a variety of histological and biological properties in terms of genetics and epigenetics, which influences its diagnosis, treatment and prognosis. Even though there are plenty of treatments for this disease, the need of new therapeutic targets is urgent, since there is no specific treatment for the most aggressive BCa types.
Currently, BCa is stratified according to its staging and to its molecular subtype; this combination will determine the patient's therapy [2]. In terms of staging, it can be stratified from 0 to IV according to the size of the tumor (stages I-III), the presence of secondary tumors in nodes (stages II-III), their distance and the presence of metastasis in other organs (stage IV) [3]. According to its molecular profile, BCa is mostly divided according to the presence/absence of progesterone receptor (PR), estrogen receptor (ER) and/or ErbB2 receptor (Her2) [4].
The main subtypes are: Luminal A (PR + , ER + and Her2 − ), Luminal B (PR + , ER + and Her2 + ), Her2 (PR − , ER − and Her2 + ) and triple negative (PR − , ER − and Her2 − ), represented mostly by the basal-like sub-classification [5]. The luminal subtype is the most prevalent, with a better prognosis and several treatments available [4]. The Her2 subtype also has anti-receptor treatment, but has a worse prognosis compared to the luminal subtype. Finally, triple negative BCa has no specific treatment available thus far; therefore, it has the worse prognosis and is, therefore, the most aggressive subtype [4].
MicroRNAs (miRNAs) are small non-coding RNAs (18)(19)(20)(21)(22) nucleotides) capable of modulating gene-expression [6]. It was found that miRNAs play a crucial role in cancer development and progression including BCa [7,8] and can affect several hallmarks of cancer, such as cell death resistance, invasion and metastasis activation, among others [9]. In conclusion, miRNAs can play a role acting as oncomiRs or tumor suppressor miRNAs, mostly depending on their microenvironment [10].
The aim of this work was to identify a miRNA signature deregulated in breast tumors and their target genes as possible biomarkers for BCa diagnosis and/or prognosis. We compared the expression between normal breast/normal adjacent and BCa tissues of several miRNAs from mice model and human samples. Then, we focused on hsa-miR-21-5p and miR-106b-5p that shared similar profiles between human and mice and identified their target genes. Using bioinformatics approaches, we evaluated the pathways in which these genes are involved, and their relevance in normal and cancerous tissues. We also established the overall survival and relapse free survival analysis in terms of miRNA and gene expression. Finally, the correlation between the expression of miRNAs and genes and BCa aggressiveness was assessed.
To analyze the expression profile of these miRNAs in BCa and normal adjacent tissue (NAT) from patients, available data was obtained from the UCSC Xena bioinformatic tool [16]. The information of the mature miRNAs expression obtained by RNAseq was analyzed using paired BCa-NAT from the same patient ( Figure 1B). We found that hsa-125b-5p, miR-221-3p, miR-143-3p and miR-205-5p were downregulated in BCa tissue compared to NAT, while hsa-miR-21-5p and miR-106b-5p were upregulated. Since mice and human share several miRNA sequences and combining human and murine data, we found that miR-21-5p and miR-106b-5p were both up-regulated in mice and human tumors compared to MG or NAT, respectively, while miR-205-5p and miR-143-3p were both down-regulated. We next focused the analysis on these four miRNAs. Stem-loop RT-qPCR from 4T1 allografts (Tumor) or mammary-glands (MG) Balb/c mice using specific primers for the indicated miRNAs is shown (n = 8 per group; primers were run in duplicates). The data were normalized to mmu-miR-191-5p and MG. Statistical analysis was performed using a T-test, Welch or sing-median test when appropriate. (B) Expression levels of the corresponding miRNAs in BCa tumors paired normal adjacent tissue (NAT) from the TCGA-BRCA dataset are graphed in read per millions. The name of the miRNAs is indicated up each plot in bold. Statistical analysis was performed using a paired T-test or Wilcoxon test when appropriate.

Hsa-miR-21-5p and miR-106b-5p Negatively Correlate with Their Target Genes
In order to determine the relevance of hsa-miR-21-5p and miR-106b-5p in human tissues, we performed principal component analysis (PCA) of the 51 genes using normalized expression data from normal mammary gland (NMG) (GTEX), NAT and BCa samples (TCGA BRCA). The first two components (Dim 1 and 2), which explained 33.4% and 9.3%, respectively, of the total variation, were plotted. We found marked differences in the overall gene expression between NMG and BCa tissues ( Figure 3A, I). As shown in Figure 3A, II, the 51 genes contributed to Dim 1 in similar ways indicating that almost all the genes contributed to the discrimination between tumor and non-tumor tissues. Additionally, we performed single sample gene set enrichment analysis (ssGSEA) to explore whether the 51-gene signature were coordinately up-or down-regulated within the BCa or the NAT samples using TCGA BRCA dataset. We found that the 51 target genes showed a positive ssGSEA-enrichment in BCa and NAT samples ( Figure 3B). In addition, a negative correlation between the ssGSEA-enrichment and the expression of hsa-miR-21-5p and miR-106b-5p was found in BCa tissue (rho = −0.1807 and −0.2876, respectively). These results suggest that hsa-miR-21-5p and miR-106b-5p might play a relevant role in the tumorigenesis, being able to distinguish normal from tumor tissue, based on their gene expression profile.

Hsa-miR-21-5p and miR-106b-5p Modulate Cancer and Metabolic Related Pathways
To further analyze the relevant pathways associated with the miRNA-target genes, we performed a KEGG pathway enrichment analysis (p-value < 0.05) for all the EVT genes of hsa-miR-21-5p, miR-106b-5p, miR-205-5p and miR-143-3p ( Figure 4A, Tables S2 and S3). In particular, this analysis showed 12 common target genes, indicated with an asterisk at Figure 4A, between hsa-miR-21-5p and miR-106b-5p associated with processes related to cancer, including the MAPK signaling pathway, TGF-β signaling pathway, ERBB1 downstream pathway, mTOR signaling pathway and Wnt signaling pathway; and several metabolic processes related to the insulin signaling pathway.
Moreover, analyzing the hsa-miR-205-5p and miR-143-3p target genes, we found several related cancer pathways, including senescence and autophagy, signaling by EGFR in cancer, microRNA regulation of DDR among others, but no genes were shared between these two miRNAs.
We next focused on the 12 genes indicated in Figure 4A: GAB1, GNG12, HBP1, MEF2A, PAFAH1B1, PPP1R3B, RPS6KA3, SESN1, MAP3K2, YY1, FRS2 and STAT3. We analyzed the expression of these genes in BCa and NAT tissue ( Figure 4B). We found that GAB1, GNG12, HBP1, MEF2A, PAFAH1B1, PPP1R3B, RPS6KA3 and SESN1 were downregulated in BCa compared to NAT; meanwhile, MAP3K2 and YY1 were up-regulated. There were no statistical differences between BCa and NAT in FRS2 and STAT3. We continued the analysis with the eight target genes that were down-regulated in BCa, as these could potentially be tumor suppressor genes.

Hsa-miR-106b-5p and miR-21-5p Are Upregulated in More Aggressive BCa Subtypes and Are Predictors of Worse Overall Survival
We evaluated the expression of hsa-miR-21-5p and miR-106b-5p in BCa from the TCGA BRCA dataset and compared them between subtypes and stages ( Figure 5A). We found that hsa-miR-106b-5p was up-regulated in more aggressive subtypes (Basal), while hsa-miR-21-5p was up-regulated in intermediate aggressive subtypes (Luminal B and Her2). No significant differences were found in terms of the stage for these miRNAs.
Additionally, we performed Kaplan-Meier plots to evaluate the overall survival (OS) based on the expression of these miRNAs ( Figure 5B) using a cohort of 1262 BCa patients. We found that patients who had a higher expression of hsa-miR-21 and miR-106b had lower survival and worse prognosis. Altogether these results propose hsa-miR-21-5p and miR-106b-5p as new biomarkers for BCa prognosis.   We further analyzed the relevance of the eight selected miRNA-target genes from Figure 4B in terms of aggressiveness based on their expression in different BCa tissue subtypes ( Figure 5C). We found that GAB1, GNG12, HBP1 and SESN1 expression was reduced in more aggressive compared with less aggressive subtypes, in concordance to the fact that hsa-miR-106b-5p was up-regulated in these tissues and could be repressing these genes. Moreover, we found that MEF2A, PAFAH1B1, PPP1R3B and RPS6KA3 were downregulated in intermediate aggressive tumors (Luminal B and Her2) and up-regulated in Luminal A and Basal subtypes, suggesting that these genes could be regulated mostly by hsa-miR-21-5p.
We also performed Kaplan-Meier plots to determine OS and relapse-free survival (RFS) from the genes that were down-regulated in more aggressive subtypes in a cohort of 1879 BCa patients. As shown in Figure 6, patients with low tumor expression of GNG12, HBP1 and SESN1 genes presented reduced OS ( Figure 6A) and worse RFS ( Figure 6B), compared to patients with high expression of these genes. Moreover, low GAB1 expression showed only worse RFS without changes in the OS. These results reinforce the relevant role for these genes as prognosis biomarkers in advanced BCa patients.
We plotted the expression of hsa-miR-106b-5p and each target gene to determine their correlation based on BCa subtype. As shown in Figure 7B, we found that more aggressive subtypes, such as Basal, had higher expression of hsa-miR-106b-5p and lower expression of the four target genes, in comparison with less aggressive subtypes, such as Luminal A, that had a lower expression of hsa-miR-106b-5p and higher expression of the target genes.
To determine the relevance of these genes in BCa metastasis, we used TNMplot in a cohort of 7893 subjects. We found that the expression of GAB1, GNG12, HBP1 and SESN1 was reduced in BCa compared to normal breast tissues ( Figure 7C), as we previously showed in the TCGA cohort ( Figure 4B). More importantly, these genes' expression was dramatically diminished in metastatic tissues compared to normal breast tissues and BCa ( Figure 7C).

Discussion
In this work, we provide new evidence of miRNAs and genes that could be used in the study of BCa aggressiveness and prognosis. We found that hsa-miR-21-5p and miR-106b-5p were both up-regulated in BCa tissue in mice and human samples. Accordingly, eight miRNA target-genes (GAB1, GNG12, HBP1, MEF2A, PAFAH1B1, PPP1R3B, RPS6KA3 and SESN1) were downregulated in BCa compared to NAT tissue in patients.
We found that hsa-miR-21-5p was up-regulated in BCa subtypes Her2+, correlated with worse OS and could be regulating MEF2A, PAFAH1B1, PPP1R3B and RPS6KA3, since they were mostly downregulated in Her2+ BCa subtypes. Moreover, we found that hsa-miR-106b-5p was up-regulated in more aggressive BCa subtypes and correlated with worse OS, and four of the eight target genes (GAB1, GNG12, HBP1 and SESN1) were downregulated in more aggressive subtypes and correlated with worse OS and RFS. Finally, we found that their expression negatively correlated with hsa-miR-106b-5p expression in BCa tissue, proposing them as promising prognosis biomarkers.
In terms of BCa progression, it was found to be related to BCa pathogenesis, proliferation, invasion, apoptosis, the epithelial to mesenchymal transition (EMT), cell cycle control and metastasis in tumor cells [18][19][20]22,23,25,26]. It was found an association between up-regulation of hsa-miR-21-5p and poor disease-free survival (DFS) in early stage BCa [20]; however, there is contradictory evidence for the correlation between hsa-miR-21-5p expression and OS [17,18]. Thus, these data, together with our findings, strongly support hsa-miR-21-5p as a biomarker for BCa diagnosis and prognosis.
Preclinical studies demonstrated that hsa-miR-106b-5p can predict the presence of metastasis [30,36] since mice with up-regulation of this miRNA in tumors had developed lung metastasis [34] and increased number of metastatic nodules in the liver [35]. Moreover, hsa-miR-106b-5p was found up-regulated in metastatic primary tumors compared to non-metastatic, up-regulated in metastasis compared to primary tumors [35] and even up-regulated in secondary metastasis compared to primary metastasis [36].
In terms of BCa patient survival, it has been reported that those who have lower hsa-miR-106b-5p expression have longer DFS [35]. Moreover, in another study, BCa patients with higher expression of this miRNA had shorter DFS and OS as well as an augmented recurrence risk [30]. Higher levels of hsa-miR-106-5p were also associated to decreased tumor relapse [31]. In summary, all this evidence suggests hsa-miR-106-5p as a good biomarker to predict recurrence and progression and as a prognostic biomarker to identify local or distant recurrence and high-risk BCa.
In this work, we found that hsa-miR-106b-5p and miR-21-5p share several target genes, including GAB1, GNG12, HBP1 and SESN1, that were down-regulated in breast primary tumors and metastases. According to Lee et al. [37], GNG12 was found down-regulated in local recurrent DCIS compared to no recurrent, which strongly supports our data.
Furthermore, SESN1 was found to be mainly involved in metabolism regulation and aging via the AMPK/mTOR pathway [38]. In particular, it interacts with AMPK in the stress response, tumorigenesis suppression and maintenance of genomic integrity [38]. It was found that mutant p53 blockades SESN1/AMPK complex, unleashing oncogenic effects such as inhibition of apoptosis, increasing drug resistance and proliferation [39]. It was found that, in patients with mutant p53, a lower expression of several genes, including SESN1, correlated with lower RFS times and distant-metastasis-free survival [40].
GAB1's role in BCa has been largely studied and, in opposition with our findings, it has been proposed as an oncogene. GAB1 is a key scaffold protein that enhances the downstream signaling of c-Met among others tyrosine-kinase receptors [41,42], it is involved in the formation of invadopodia [41], allows epidermal growth factor receptor (EGFR) dimers to directly signal PI3K [43] and is implicated in EGF-induced activation of MAPK [44].
In opposition with our findings, GAB1 was found up-regulated in breast tumors compared to benign mammary hyperplasia from patients [45]. Moreover, GAB1 overexpression was associated to metastasis in Her2 and triple negative BCa subtypes [45]. However, this study was limited to a low number of subjects, which could be the cause of disagreement with our results.
Several studies have proposed HBP1 as a transcriptional repressor, since it has an HMG box DNA-binding domain that allows it to impair the union between DNA and active transcription factors [46][47][48][49]. HBP1 is located in chromosome 7q31.1, a region that is found usually deleted in several cancer types [46,50] and 30% of BCa patients have mutants or variants of HBP1 [47,48]. In BCa preclinical models, HBP1 depletion increased cell cycle progression, proliferation, invasion, migration, and tumorigenesis [47,49].
Additionally, HBP1 was found down-regulated in breast tumors compared to normal tissue from patients, which was associated with poor prognosis, RFS and relapse [46,47,50]. Furthermore, hsa-miR-17-5p, a member of the 17-92 cluster, paralogous of the 106b-25 cluster, binds to HBP1, lowering its expression [50]. These findings could indicate that hsa-miR-106b-5p could also be involved in HBP1 decrease; however, further studies are needed to confirm our hypothesis.
In summary, all these data suggest that GAB1, GNG12, HBP1 and SESN1 act as tumor suppressor genes by the proposed mechanism indicated in Figure 8. As shown, less aggressive BCa subtypes and NAT display low expression of hsa-miR-106b-5p, which, in turn, maintains GAB1, GNG12, HBP1 and SESN1 high expression. However, more aggressive BCa correlates with hsa-miR-106b-5p up-regulation and the down-regulation of its targets GAB1, GNG12, HPB1 and SESN1, unleashing BCa aggressiveness and, therefore, worsening the prognosis of BCa patients. Figure 8. Hypothetical model. Hsa-miR-106b-5p is down-regulated in less aggressive BCa subtypes and NAT, which keeps GAB1, GNG12, HBP1 and SESN1 up-regulated. When BCa progresses, or in more aggressive BCa subtypes, hsa-miR-106b-5p is then up-regulated and targets GAB1, GNG12, HPB1 and SESN1, diminishing their expression, unleashing BCa aggressiveness and, therefore, resulting in a worse BCa prognosis.

Cell Culture
The 4T1 murine cell line (ATCC: CRL-2539) was cultured in RPMI 1640 (Invitrogen) supplemented with 10% of fetal bovine serum and antibiotics in a 5% CO 2 humidified atmosphere at 37 • C.

BCa Allograft Murine Model
We housed 16-week-old Balb/c female mice (n = 16) under pathogen-free conditions following the IBYME's animal care guidelines. Mice were randomized into two groups: Control and BCa. BCa mice were inoculated in the mammary fat pad with 1 × 10 4 4T1 cells. Control animals were not inoculated at any point of the experiment. Control and BCa were sacrificed both at the same time (4 weeks after the inoculation) and tumor and mammary gland (MG) samples were collected. Tumor measurement was done as previously described [51].

RNA Isolation and RT-qPCR Analysis
The total RNA from allografts and MG were isolated using TriReagent (Molecular Research Center). Stem-loop RT-qPCR method was used to retrotranscribe miRNAs as previously described [52][53][54]. Briefly, 100 ng of total RNA and 0.07 µM of stem-loop specific primer were preheated at 70 • C for 5 min. Then, retrotranscription was performed using M-MLV RT (Promega) and incubated in TC 9639 Thermal Cycler (Benchmark) at 16 • C for 30 min, 42 • C for 60 min and 70 • C for 2 min. qPCRs were performed in StepOne Plus Real Time PCR (Applied Biosystems) using Taq DNA Polymerase (Pegasus) as previously described [51].
All reactions were run in duplicate. The expression levels of miRNAs were calculated using the ∆∆Ct method normalizing to mmu-miR-191-5p and the Control. The primer sequences for Stem-loop RT-qPCR are listed in Table 1. The results are given as the median and interquartile ranges. Data normalization and homogeneity of variances was assessed using the Shapiro-Wilk test, F test or boxplot, respectively. Student's t-test was applied for data that fulfill the requirements. Otherwise, the Welch, Wilcoxon or Median test was performed. We used a significance level of 5%. Table 1. Primer sequences used for stem-loop RT-qPCR.
When miRNAs were analyzed, 491 BCa samples were used, and, when genes were analyzed, 821 BCa samples were used. Data normalization and homogeneity of variances was assessed using the Shapiro-Wilk test, and Levene, F test or boxplot, respectively. A paired sample t-test was applied for data that fulfill the requirements. Otherwise, the Wilcoxon test was performed. When three or more groups were analyzed, one-way ANOVA followed by Tukey was performed for data that fulfill the requirements. Otherwise, the Kruskal-Wallis followed by Dunn's Test was performed. We used a significance level of 5%.

Functional Enrichment Analysis
To identify experimentally validated target (EVT) genes regulated by differentially expressed miRNAs in human BCa and mice allografts selected miRNAs, we employed the DIANA TARBASE v8 resource (threshold score > 0.5) [55]. Functional enrichment analyses of the obtained gene lists were performed using the ClueGo Cytoscpape's plug in and the Enrichr resource. For pathway terms and annotation, we used those provided by KEGG and BioPlanet (http://tripod.nih.gov/bioplanet/, accessed on 1 August 2021; https://www.genome.jp/kegg/pathway.html, accessed on 1 August 2021). For network construction of the interactions miRNA-target-pathways, we used a Sankey diagram.

Principal Component Analysis (PCA)
Publicly available gene expression values for TCGA Breast Cancer (BRCA) and the GTEx project patient samples were downloaded as previously described. PCA was performed to determine samples distribution based on the expression of 51 common target genes from up-regulated miRNAs in both breast human tissue and mice allografts. For PCA plots, the R function "prcomp" from stats package (version 4.0.2) was used.

Single-Sample Gene-Set Enrichment Analysis
A ssGSEA was performed to analyze the coordinated regulation of a defined gene signature and the activation of specific biological processes in BCa samples. Log 2 (RPM + 1) gene expression values of 474 BCa samples were obtained from TCGA-BRCA cohort and loaded into GenePattern web-tool (https://www.genepattern.org, accessed on 30 August 2021). A gene set enrichment profile was obtained for each sample using a 51 gene signature obtained from common target genes from up-regulated miRNAs in both breast human tissue and mice allografts. Then, Spearman correlation analysis was performed between the expression of two up-regulated miRNAs and the enrichment score obtained from the ssGSEA (GraphPad Prism 8.0.1; Prism, San Diego, CA, USA).

Correlation Matrix
Expression levels of miRNA and target genes in breast tumors from patients and their PAM50 classification were obtained from TCGA BRCA data available in UCSC Xena. Only BCa samples were included in the present analysis. Using the Hmisc R package, we generated a correlation matrix for each miRNA, applying the Spearman correlation coefficient.

Kaplan-Meier Plots
Overall Survival (OS) was performed for the selected miRNAs using data from miRpower bioinformatic tool (https://kmplot.com, accessed on 3 September 2021) [56]. OS and Relapse Free Survival (RFS) for the selected genes were performed using data from the Kaplan-Meier plotter bioinformatic tool (https://kmplot.com, accessed on 3 September 2021) [57]. Data was analyzed and plotted using the libraries survminer and survival in R software.

Normal, Tumor and Metastatic Gene Expression
Data and plots comparing normal, tumor and metastatic gene expression were performed using TNMplot bioinformatic tool (https://TNMplot.com, accessed on 1 September 2021) [58], which contains data from 242 normal tissues, 7569 BCa tissues and 82 metastatic tissues. For comparisons among tissues, Kruskal-Wallis followed by Dunn's Test was performed.