In Vitro and in Silico Analysis of miR-125a with rs12976445 Polymorphism in Breast Cancer Patients

Background: Breast cancer affects over 2 million women yearly. Its early detection allows for successful treatment, which motivates research into factors that enable an accurate diagnosis. miR-125a is one of them, correlating with different types of cancer. For example, the miR-125a level decreases in breast cancer tissues; polymorphisms in the miR-125a encoding gene are related to prostate cancer and the risk of radiotherapy-induced pneumonitis. Methods: In this work, we investigated two variants of the rs12976445 polymorphism in the context of breast cancer. We analyzed the data of 175 blood samples from breast cancer patients and compared them with the data from 129 control samples. Results: We observed a tendency for the TT genotype to appear slightly more frequently than the CC and CT genotypes in breast cancer cases (statistically nonsignificant). The TT genotype also appeared to be more frequent among human epidermal growth factor receptor 2 (HER2) positive patients than among HER2 negative patients. In silico modelling showed that the presence of uridine (U) diminished the probability of pri-miR-125a binding to NOVA1 and HNRNPK proteins. We demonstrated that the U- and C-variants could promote different RNA folding patterns and provoke alternative protein binding. Conclusions: The U-variant may imply a lower miR-125a expression in breast cancer.


Introduction
Constant investigation of the genetic background of breast cancer is a crucial endeavour, since only 15-36% of hereditary breast cancers have a known genetic background [1]. Many current studies aim to estimate the association of polymorphisms in miRNA-encoding genes with cancer and to find how single nucleotide polymorphisms (SNPs) modulate miRNA expression [2]. SNPs in genes encoding miRNAs have been reported in several different types of cancer, including breast cancer [3][4][5]. It has been shown that SNPs in miRNA genes affect their expression in breast cancer and modulate the expression of miRNA target genes (e.g., miR-27a, miR-196a2, miR-559) [6][7][8]. SNPs contribute to abnormal expression of miRNAs in breast cancers, either as upregulated oncomirs or as downregulated suppressors [9]. Therefore, understanding the impact of SNPs on miRNA levels might be helpful in the selection of breast cancer diagnostic markers.

Polymerase Chain Reaction, Sequencing, and Restriction Analysis
DNA specimens were amplified using standard PCR protocols. DreamTaq DNA Polymerase (0.02 U/µL; EP0701, Thermo Fisher Scientific, Waltham, MA, USA), dNTP Mix (1 mM; U151A, Promega, Madison, WI, USA), and primers (0.5 µM) were used. The temperature profile was as follows: (1) preheating at 95 °C for 10 min; (2) 35 amplification cycles of 95 °C for 30 s, 55 °C for 30 s, and 72 °C for 30 s; (3) final elongation at 72 °C for 7 min. The PCR primers 5′-TTTTGGTCTTTCTGTCTCTGG-3′ and 5′-TGGAGGAAGGGTATGAGGAGT-3′, which span the sequence of pre-miR-125a (accession number ENSG00000208008), were designed using the Oligo software (Molecular Biology Insights, Inc., DBA Oligo, Inc., Colorado Springs, CO, USA). This amplicon was used for the restriction analysis and MIR125A sequencing. PCR products of 27 samples collected from patients were purified using the gel-out purification kit (A&A Biotechnology, Gdynia, Poland) and sequenced at the DNA Sequencing and Oligonucleotide Synthesis Laboratory of the Institute of Biochemistry and Biophysics of the Polish Academy of Sciences (Warsaw, Poland). The sequencing results were analyzed using the BioEdit Sequence Alignment Editor [25]. The rs12976445 SNP was genotyped using the BaeGI restriction enzyme (R0708S, New England Biolabs, Ipswich, MA, USA). The genotyping required 17.8 µL of the PCR product and 2 U of the restriction enzyme, incubated at 37 °C for 12 h. BaeGI cuts GKGCM|C, where K = G or T and M = A or C (T-variant: TGTGCCTA; C-variant: TGTGCCCA). Based on preliminary sequencing of samples from 27 patients, we assumed that the uncut sequence would contain the T-variant, although G- and A-variants could not be excluded [26]. The enzyme HpyF3I (DdeI) (ER 1881, Thermo Scientific, Waltham, MA, USA), which cuts C|TNAG, was used to analyze the rs143525573 and rs12975333 SNPs in the miR-125a amplicon. The applied procedure was analogous to the genotyping with the BaeGI restriction enzyme.
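The allele discrimination by BaeGI described above can be illustrated with a short sketch. It assumes only the recognition sequence GKGCMC and the two 8-nt sequence contexts quoted in the text; the function name `is_cut_by_baegi` is a hypothetical helper introduced for illustration and is not part of the original protocol.

```python
import re

# BaeGI recognition sequence GKGCM|C written as a regular expression
# (IUPAC codes: K = G or T, M = A or C).
BAEGI_SITE = re.compile(r"G[GT]GC[AC]C")


def is_cut_by_baegi(seq: str) -> bool:
    """Return True if the given strand contains a BaeGI recognition site."""
    return BAEGI_SITE.search(seq.upper()) is not None


# The two 8-nt contexts quoted in the text for rs12976445.
variants = {"T": "TGTGCCTA", "C": "TGTGCCCA"}

for allele, context in variants.items():
    status = "cut" if is_cut_by_baegi(context) else "uncut"
    print(f"{allele}-variant ({context}): {status}")
```

Under these assumptions only the C-variant context matches the site, which is why an undigested amplicon is read as the T-variant (or, as noted above, possibly a G- or A-variant).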

Preparing Sequence for in Silico Analysis
From the amplicon comprising the pre-miR-125a sequence, we selected a fragment consisting of the SNP nucleotide together with the 25 nt upstream and the 25 nt downstream of it. This subsequence was named 51-miR-125a and was analyzed in two variants, C and U.
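The construction of the 51 nt fragment can be sketched as follows. This is a minimal illustration, not the procedure actually used: the input sequence and the SNP index below are hypothetical placeholders (the real amplicon is not reproduced here), and the helper names `snp_window`, `with_allele`, and `to_rna` are introduced only for this example.

```python
def snp_window(seq: str, snp_index: int, flank: int = 25) -> str:
    """Return the SNP nucleotide with `flank` nt on each side (0-based index)."""
    if snp_index - flank < 0 or snp_index + flank >= len(seq):
        raise ValueError("not enough flanking sequence around the SNP")
    return seq[snp_index - flank : snp_index + flank + 1]


def with_allele(window: str, allele: str, flank: int = 25) -> str:
    """Replace the central (SNP) nucleotide of the window with `allele`."""
    return window[:flank] + allele + window[flank + 1 :]


def to_rna(dna: str) -> str:
    """Transcribe a DNA string to RNA notation (T -> U)."""
    return dna.upper().replace("T", "U")


# Hypothetical placeholder sequence and SNP position, NOT the real amplicon.
toy_amplicon = "ACGT" * 20
toy_snp_index = 42

window = snp_window(toy_amplicon, toy_snp_index)
c_variant = to_rna(with_allele(window, "C"))
u_variant = to_rna(with_allele(window, "T"))

print(len(window), c_variant[25], u_variant[25])
```

The two resulting strings differ only at the central position (index 25 of the window), which is the property the downstream comparisons rely on.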

Predicting Protein Interactions
The RBPmap web server [27] and the ATtRACT database [28] were used to find possible protein interactions within 51-miR-125a for both the U- and C-variants. RBPmap was designed in 2014 as a tool for finding potential binding sites of RNA-interacting proteins, especially in the human, mouse, and Drosophila melanogaster genomes. The search for potential binding sites of RNA-interacting proteins was performed using the RBPmap server at three different stringency levels. ATtRACT is a database with an implemented search engine that applies a fast algorithm dedicated to finding motifs corresponding to RNA-interacting proteins. The database contains 370 RBPs and 1583 RBP consensus binding motifs that can be searched across different organisms (including humans).

2D Structure Modelling
The secondary structure of the 51 nt long RNA fragment of the pri-miR-125a sequence (51-miR-125a) was predicted using two bioinformatic tools: RNAstructure [29] and RNAfold [30]. RNAfold is a part of the Vienna RNA package, which collects tools specialised for analysing single-stranded nucleic acid sequences. RNAfold generates 2D models by optimizing either the minimum free energy (MFE mode) or the minimal base-pair distance (centroid mode). RNAstructure is the second program that we used for predicting 2D structures. It can handle both RNA and DNA sequences and allows users to predict biomolecular secondary structures. Based on a given sequence, RNAstructure generates the lowest (minimum) free energy 2D structure model (MFE mode) and a 2D structure composed of highly probable base pairs (MaxExpect mode). The models predicted in this research with both RNAfold and RNAstructure were obtained using default settings.

Statistical Analysis
The expected genotype and allele frequencies for the observed variations were calculated for all 175 positive cases (Ca) and 129 control samples (Co). These frequencies were tested for Hardy-Weinberg equilibrium. Statistical analysis of genotype frequencies was conducted using the online tool SNPstats [31]. The association of rs12976445 in miR-125a with breast cancer was assessed using the odds ratio (OR) with a 95% confidence interval (CI). The association of the genotypes with the receptor status was calculated using the χ² test and the software GraphPad InStat version 7.00 (GraphPad Software, San Diego, California, USA).
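For readers who want to reproduce this kind of calculation by hand, the sketch below shows a Hardy-Weinberg χ² test and an odds ratio with a Wald (Woolf) 95% confidence interval. It is an illustration under stated assumptions: the genotype counts are hypothetical and are not the study data, and the interval is a textbook log-scale approximation that is not claimed to be the method implemented in SNPstats.

```python
import math


def hwe_chi_square(n_tt: int, n_ct: int, n_cc: int):
    """Chi-square test (1 degree of freedom) for Hardy-Weinberg equilibrium."""
    n = n_tt + n_ct + n_cc
    p_t = (2 * n_tt + n_ct) / (2 * n)  # frequency of the T allele
    p_c = 1.0 - p_t
    expected = (n * p_t**2, 2 * n * p_t * p_c, n * p_c**2)
    observed = (n_tt, n_ct, n_cc)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio (a*d)/(b*c) of a 2x2 table with a Wald CI on the log scale."""
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical genotype counts (TT, CT, CC), NOT the data of this study.
chi2, p_hwe = hwe_chi_square(50, 100, 50)

# Hypothetical dominant-model table: (CT + CC, TT) in controls, then in cases.
or_value, ci_low, ci_high = odds_ratio_ci(40, 60, 30, 70)
print(f"HWE chi2 = {chi2:.3f}, p = {p_hwe:.3f}")
print(f"OR = {or_value:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
```

A confidence interval that includes 1, as in this toy table, corresponds to an association that is not significant at the 5% level.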

In Vitro Analysis
rs12976445 SNP in pri-miR-125a occurring in three genotypes TT, CT, and CC has been associated with cancer and other diseases [32][33][34]. We studied the pri-miR-125a amplicon (Figure 1) by sequencing and restriction analysis to determine the genotype frequency and its association with breast cancer.


Figure 1.
Whole amplicon (247 nt) comprising pre-miR-125a sequence, the fragment of the sequence from Ensembl Gene ID: ENSG00000208008. Lower-case corresponds to pri-miR-125a fragments removed in pre-miR-125a processing; grey upper-case is pre-miR-125a; in bold upper-case we marked mature miR-125a 5p and 3p, respectively; bold underlined lower-case nucleotides are the analyzed U-, C-variants; bold italic lower-case is a 51 nt fragment (51-miR-125a) that was analyzed using computational methods. BaeGI restriction site is located directly before rs12976445 SNP, dividing sequence into 42 nt and 205 nt subsequences.
We amplified DNA obtained from 27 breast cancer tumours to assess the frequency of several known SNPs in the pri-miR-125a sequence. Sequencing of the 27 pri-miR-125a amplicons showed that only the rs12976445 polymorphic site was highly variable in our samples. Subsequently, we estimated the association of rs12976445 with breast cancer using DNA from 304 blood samples: 175 positive (cases, Ca) and 129 negative (controls, Co). We digested the pri-miR-125a amplicon with the BaeGI restriction enzyme, which produces 42 nt and 205 nt fragments if the C-variant is present and leaves the amplicon undigested if the T-variant is present. We performed the statistical analysis with the SNPStats online software. The allele and genotype frequencies, together with the odds ratios (OR) and 95% confidence intervals (CI), are presented in Table 1. We found that rs12976445 in pri-miR-125a followed Hardy-Weinberg equilibrium in the breast cancer cases (Ca) set (Table 1). In the co-dominant model (TT vs. CT vs. CC), the heterozygous CT genotype of the rs12976445 SNP was slightly more frequent in the control (Co) set (50.4%) than in cases (Ca) (46.3%), with OR Co/Ca = 1.19, 95% CI = 0.74-1.91, and p = 0.77 (Table 1). In the dominant model (TT vs. CT + CC), we observed a slightly higher frequency of CT + CC in controls than in cases, with OR = 1.17, 95% CI = 0.74-1.85, and p = 0.5. These calculations indicated a tendency for the TT genotype to appear slightly more frequently than the CC and CT genotypes in breast cancer cases.
Additionally, we analyzed two other known SNPs, rs143525573 and rs12975333, located in the mature miR-125a coding sequence. The DdeI enzyme was applied to assess these two SNPs. We did not observe the variability of rs143525573 and rs12975333 in the studied group of patients with breast cancer.
Since an SNP in pri-miR-125a could modulate the level of mature miRNA, we assessed the relationship of rs12976445 variability with the status of the human epidermal growth factor receptor 2 (HER2), ER1α, and PR receptors. The mRNA encoding HER2 (receptor tyrosine-protein kinase erbB-2, encoded by the ERBB2 gene) is a target of miR-125a [33]. HER2, together with the ER1α and PR receptors, is used in the diagnosis of breast cancer. Therefore, we tested the relationship of rs12976445 variation with the status of these receptors. HER2 receptor status 0, 1, and 2 without amplification was classified as the negative group (N), while status 2 with amplification and status 3 were classified as the positive group (P). Patients were also divided into two groups according to ER1α or PR status, 0-10% as the negative group and 10-75% as the positive group. We analyzed the data using the χ² test. The lowest p-value, 0.0606, was obtained in the analysis of the TT, CT, and CC genotypes for the HER2 receptor. Our analysis revealed a tendency for the T allele to predominate in HER2 positive samples (Table 2). These results are consistent with the genotype frequencies, in which the TT genotype slightly predominates in cases over controls. To determine how the C- and T(U)-variants could regulate miR-125a expression, we decided to analyze the 2D structure of pre-miRNA using in silico methods.
Table 2. The association analysis between miR-125a rs12976445 polymorphism and HER2, ER and PR receptors status in breast cancer patients. The association was calculated using the χ² test. The status of receptors was divided into two groups: positive (P) and negative (N).
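The χ² test of independence between genotype and receptor status can be sketched as below. The counts are hypothetical placeholders and not the values of Table 2; the function names are introduced only for this example. For a 3 × 2 table (three genotypes, two receptor groups) the test has 2 degrees of freedom, for which the p-value has the closed form exp(−χ²/2).

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


def p_value_two_dof(chi2: float) -> float:
    """Chi-square survival function, valid only for 2 degrees of freedom."""
    return math.exp(-chi2 / 2)


# Hypothetical counts, NOT the study data.
# Rows: TT, CT, CC genotypes; columns: HER2 positive (P), HER2 negative (N).
toy_table = [[20, 40], [15, 60], [5, 30]]

chi2_toy, dof_toy = chi_square_independence(toy_table)
p_toy = p_value_two_dof(chi2_toy)
print(f"chi2 = {chi2_toy:.3f}, df = {dof_toy}, p = {p_toy:.4f}")
```

The closed-form p-value is used here only to keep the example free of external dependencies; for other table shapes a general χ² distribution function would be needed.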

In Silico Analysis of RNA Binding Proteins (RBPs)
To find proteins potentially binding to pri-miR-125a containing the C- and U-variants, we analyzed the 51-nucleotide fragment of pri-miR-125a using the RBPmap web server and the ATtRACT database [27,28]. Both the C- and U-variants were analyzed using RBPmap at three stringency levels: low, medium, and high. The low stringency level corresponds to a significant p-value < 0.01 and a suboptimal p-value < 0.02; the medium stringency thresholds were a significant p-value < 0.005 and a suboptimal p-value < 0.01; the high stringency level was set to < 0.001 and < 0.01 for the significant and suboptimal p-values, respectively. We searched for all available human motifs stored in RBPmap. At low stringency, the algorithm mapped the SRSF5, PTBP1, and PCBP1 proteins in both the C- and U-variants. BRUNOL4 and BRUNOL5 were mapped only in the U-variant, whereas SRSF3, HNRNPK, and NOVA1 were found only in the C-variant. At the medium and high stringency levels, RBPmap found only the PTBP1 motif in the U-variant. For the C-variant, the medium stringency search found the following potentially binding proteins: SRSF3, HNRNPK, NOVA1, and PTBP1. The high stringency search for the C-variant found the same RBPs as the medium one, except for HNRNPK. All RBPmap-predicted proteins that potentially bind to the region containing the C-/U-variant in the 51-miR-125a sequence are shown in Table 3. We focused on proteins that were variant-specific. The less frequent C-variant potentially interacts with HNRNPK and NOVA1, proteins involved in RNA processing [35,36]. The C-variant was also mapped with the SRSF3 protein, which RBPmap did not find in the U-variant at any stringency level but found at every stringency level in the C-variant. In contrast, BRUNOL4 and BRUNOL5, proteins that regulate alternative splicing of pre-mRNA and are possibly connected with mRNA editing and translation, were mapped only in the U-variant of the analyzed sequence.
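The variant-specific proteins named in this paragraph can be derived with simple set operations. The sketch below transcribes the RBPmap hits from the text above (not from Table 3, which is not reproduced here); the dictionary layout and variable names are assumptions made for illustration.

```python
# RBPmap hits per stringency level, transcribed from the paragraph above.
rbpmap_hits = {
    "low": {
        "U": {"SRSF5", "PTBP1", "PCBP1", "BRUNOL4", "BRUNOL5"},
        "C": {"SRSF5", "PTBP1", "PCBP1", "SRSF3", "HNRNPK", "NOVA1"},
    },
    "medium": {
        "U": {"PTBP1"},
        "C": {"SRSF3", "HNRNPK", "NOVA1", "PTBP1"},
    },
    "high": {
        "U": {"PTBP1"},
        "C": {"SRSF3", "NOVA1", "PTBP1"},
    },
}

# Proteins mapped in one variant but not the other, per stringency level.
c_specific = {lvl: hits["C"] - hits["U"] for lvl, hits in rbpmap_hits.items()}
u_specific = {lvl: hits["U"] - hits["C"] for lvl, hits in rbpmap_hits.items()}

# Proteins that remain C-specific at every stringency level.
c_specific_all_levels = set.intersection(*c_specific.values())

for level in rbpmap_hits:
    print(level, "C-only:", sorted(c_specific[level]),
          "U-only:", sorted(u_specific[level]))
print("C-only at all levels:", sorted(c_specific_all_levels))
```

Laid out this way, SRSF3 and NOVA1 are the hits that stay C-specific across all three levels, while HNRNPK drops out at high stringency and the BRUNOL proteins appear for the U-variant only at low stringency.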
In the next step, we used the ATtRACT database to search for potential RBP motifs in the 51-miR-125a sequence. In the U-variant, ATtRACT predicted a PTBP1 motif, in agreement with the RBPmap results, where PTBP1 was mapped at all three stringency levels. In the C-variant, ATtRACT predicted YBX1 and NOVA1 binding motifs. The YBX1 protein is involved in cellular processes including pre-mRNA splicing and transcriptional and translational regulation. It is also potentially involved in miRNA processing [37]. Of particular interest is the mapping of the NOVA1 protein in the C-variant, which agrees with the RBPmap results presented earlier: as with PTBP1, RBPmap predicted NOVA1 binding to the C-variant of 51-miR-125a at all stringency levels (low, medium, high). The results suggest that the C-variant in 51-miR-125a is associated with a different RBP (NOVA1) than the U-variant (PTBP1). Binding motifs for the C-/U-variants from the ATtRACT database are presented in Table 4.

In Silico Modelling of pri-miR-125a Folding
Computational modelling of the 2D structure of 51-miR-125a was performed using RNAfold (Figure 2) and RNAstructure (Figure 3) [29,30]. The predicted models revealed variant-dependent folding of the 51-nucleotide-long pri-miR-125a region comprising rs12976445. Modelling of the C-variant of 51-miR-125a with RNAfold resulted in two different structures when we applied two modes of analysis: minimizing free energy (MFE) and minimizing base-pair distance (centroid). For the U-variant, the MFE and centroid modes yielded the same model of the RNA structure. In the RNAfold MFE model of the C-variant (Figure 2a), the SNP formed a base pair near the hairpin loop. In the RNAfold centroid model of the C-variant, the SNP was unpaired within the hairpin loop. In both RNAfold modes (MFE and centroid) for the U-variant, the SNP was the first unpaired nucleotide of a long unpaired subsequence located at the 3′ end of 51-miR-125a.
Predicting the 2D structure of 51-miR-125a using RNAstructure resulted in two MFE structures for the C-variant (Figure 3a,b) and a single model for the U-variant. In both the MFE and centroid structures predicted for the C-variant by RNAstructure (Figure 3a,d, respectively), we observed the same location of the SNP as in the model generated by RNAfold in the MFE mode (Figure 2a): the SNP was part of a base pair near the hairpin loop. Moreover, all predicted U-variant foldings (Figure 2b,d and Figure 3c,e) have the same structure, independent of the prediction software or mode used. In the remaining three RNAstructure models, the second MFE structure of the C-variant (Figure 3b) and the MFE and centroid structures of the U-variant (Figure 3c,e), the SNP is part of a long unpaired region that starts at the SNP (26th nucleotide) and extends to the last (51st) nucleotide. The differences observed between the U- and C-variant models in both RNAfold and RNAstructure may suggest SNP-dependent folding of the structure.
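Whether the SNP at position 26 is paired or unpaired in a given model can be read directly from the dot-bracket notation that both RNAfold and RNAstructure can output. The sketch below is a minimal parser for that notation; the two 51-character structures are hypothetical toy examples constructed to mimic the two situations described above and are not the predicted models from Figures 2 and 3.

```python
def pair_table(dot_bracket: str) -> dict:
    """Map each paired position (0-based) to its partner in a dot-bracket string."""
    stack, pairs = [], {}
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: unexpected ')'")
            j = stack.pop()
            pairs[i], pairs[j] = j, i
        elif symbol != ".":
            raise ValueError(f"unsupported symbol: {symbol!r}")
    if stack:
        raise ValueError("unbalanced structure: unclosed '('")
    return pairs


def snp_partner(dot_bracket: str, snp_position: int = 26):
    """Return the 1-based pairing partner of the SNP, or None if it is unpaired."""
    partner = pair_table(dot_bracket).get(snp_position - 1)
    return None if partner is None else partner + 1


# Hypothetical toy structures of length 51, NOT the models from the figures.
paired_toy = "." * 15 + "(" * 6 + "." * 4 + ")" * 6 + "." * 20
unpaired_toy = "." * 5 + "(" * 8 + "." * 4 + ")" * 8 + "." * 26

print("paired toy:", snp_partner(paired_toy))
print("unpaired toy:", snp_partner(unpaired_toy))
```

In the first toy structure position 26 closes a base pair next to a hairpin loop; in the second it is the first nucleotide of an unpaired 3′ tail, which mirrors the qualitative difference reported between the C- and U-variant models.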

Discussion
The analysis of our results concerning the rs12976445 SNP in miR-125a revealed that the TT genotype was slightly more frequent in breast cancer patients and in HER2 positive patients. Our study revealed only a tendency, as the p-values we obtained were above 0.05. One reason for this is that miR-125a participates in the development of breast cancer together with numerous other factors. Correlating such multigenic diseases with a single SNP requires a greater number of cases and controls. Our findings suggested that rs12976445 has the potential to be a predictive biomarker for cancer risk, but a meta-analysis of a greater number of cases is required. Several studies have described the association of the rs12976445 genotypes TT, CT, and CC in miR-125a with cancer and other diseases. In the Chinese study of Jiao et al., the TT genotype was significantly related to an increased risk of mortality in breast cancer patients compared with the CC genotype [19]. miR-125a rs12976445 was significantly associated with survival in the codominant, recessive, and dominant models. However, only the association under the codominant model remained significant after adjustment for lymph node metastasis, TNM stage, estrogen receptor, and progesterone receptor [19]. Previous studies have shown that the miR-125a level is decreased in breast cancer [32][33][34]; however, the level of miR-125a-5p is significantly higher in younger patients than in older ones [38]. It has been shown that SNPs located in miR-125a are associated with breast cancer tumorigenesis [39] and that the rs12976445 SNP in miR-125a may serve as a prognostic biomarker for breast cancer [19]. Although rs12976445 is associated with breast and prostate cancer, the impact of this SNP has rarely been studied in a functional assay. The most prominent evaluation of the effect of rs12976445 on miR-125a expression was performed in the context of recurrent pregnancy loss [15,40]. In HEK293T embryonic kidney cells, the miR-125a expression level of the C haplotype was nearly four-fold higher than that of the T haplotype [15,40]. The question remained how the rs12976445 genotype impacts the miR-125a level.
In previous studies, the rs12976445 SNP in miR-125a has been associated with the risk of pneumonitis [20]. The expression level of miR-125a was significantly downregulated in the CT and TT groups, whereas CC genotype samples demonstrated upregulated miR-125a expression [20,21]. The rs12976445 polymorphism has also been associated with the risk of diabetic nephropathy, where the expression levels of miR-125a were approximately three times lower in patients carrying TT and CT than in those carrying CC [41]. Other studies showed that, like miR-125a, miR-205 is also downregulated in breast cancer due to SNP variations in the miR-205 sequence [42]. Studies in various cell lines identified differential expression of miR-205 in breast cancer cell lines in correlation with the number of missing AGC repeats [42].
Herein, we decided to analyze the association of rs12976445 genotypes with breast cancer. Moreover, we performed computational research to find proteins potentially interacting with the SNP region and to predict and compare the 2D structures of the C-/U-variant sequences. In both cases, we used two independent tools: RBPmap and ATtRACT for the RNA binding protein (RBP) search, and RNAstructure and RNAfold to build 2D models. We observed different RBPs mapped to the C- and U-variants of the sequence. Both programs that we used indicated an interaction between the U-variant and the PTBP1 protein. In the C-variant, the NOVA1 protein appears in the results obtained from both ATtRACT and RBPmap. It should be underlined that NOVA1 was found only in the C-variant sequence, whereas RBPmap mapped PTBP1 in both the U- and C-variants. Another interesting finding was the appearance of the HNRNPK protein in the RBPmap results for the C-variant at the low and medium stringency levels. This protein was not recognized as a possible binding protein for the U-variant. This indicates that RNA processing proteins may have different affinities for the C-variant and the U-variant, potentially modulating the probability of pre-miR-125a maturation.
We also predicted the 2D structures of the 51 nt sequence with the C-/U-variant. We applied the same RNAfold system as Hu et al. [15]. These authors modelled the 2D structure of the rs12976445 allele T and reported that the rare allele T changed neither the predicted secondary structure nor the predicted ∆G [15]. Hu et al. used a 1016 nt fragment of RNA, whereas we used RNAfold with 51 nt, revealing two different 2D structures and different hairpin structures in the U-variant and the C-variant. The shorter sequence used in our study strengthened the obtained modelling results, as in silico methods deal better with shorter sequences. In a later paper, Hu et al. reported a difference in 2D structure between the variants of rs12976445 using the same software as previously and the same 1016 nt RNA fragment [40]. For the RNAfold-predicted structures, we obtained different models for each variant. In the C-variant models predicted by this tool, the SNP was either part of a hairpin loop (as an unpaired nucleotide) or part of a base pair near the hairpin loop. In contrast, in the U-variant models from RNAfold, the SNP was part of a long unpaired region at the 3′ end of the sequence. The RNAstructure program predicted 3 models for the C-variant sequence and 2 for the U-variant. The structures in Figure 3b,c,e were identical (except for the SNP in A1.2). Note that the structures in Figures 2a and 3a are the same, and the models in Figure 3c,d are identical, despite the use of different prediction tools.
Using two computational programs, we revealed potential differences in RNA-binding proteins between the analyzed C- and U-variants. We found that the NOVA1 and HNRNPK RNA-binding proteins may interact with the C-variant and PTBP1 with the U-variant. Polypyrimidine tract binding protein 1 (PTBP1) binds to mRNA and regulates alternative splicing patterns [43]. Previous reports have shown that PTBP1 enhances miR-101-guided AGO2 (Argonaute) interaction with MCL1, thereby regulating miR-101-induced apoptosis and cell survival [44]. NOVA1 stimulates miRNA function by different mechanisms that converge on Argonaute proteins, a core component of the miRNA-induced silencing complex (miRISC). NOVA1 physically interacts with Ago proteins and controls neuronal miRISC function at the level of Ago proteins, with possible implications for the regulation of synapse development and plasticity [35,45]. Heterogeneous nuclear ribonucleoprotein K (hnRNPK), a ubiquitously occurring RNA-binding protein (RBP), can interact with many nucleic acids and various proteins and is involved in several cellular functions, including transcription, translation, splicing, and chromatin remodelling [36].
The main limitation of the current study is the number of cases and controls. In a future study, a larger group of breast cancer patients should be analyzed to confirm the tendency of the TT genotype to be associated with breast cancer and to test the significance of the observed associations.

Conclusions
Understanding miRNA processing in cancer cells is an important step in the global fight against cancer. Among other things, it could explain why the miRNA-125a level is decreased in breast cancer. Moreover, it can help to focus investigation on the upstream cascade of factors participating in the abnormal regulation of miRNA in cancer. We attempted to reveal potential features of the rs12976445 SNP of miRNA-125a by combining in vitro analysis of experimental data with bioinformatics resources. The combination of these two approaches, in vitro and in silico, is more efficient and allows for a broader research perspective. Our analysis showed that the TT genotype of miRNA-125a was slightly more frequent in breast cancer patients and in HER2 positive patients. We also demonstrated that the U-variant of rs12976445 diminished the probability of pri-miR-125a binding to the NOVA1 and HNRNPK proteins. Our in silico analysis revealed that the C- and U-variants could promote different RNA folding patterns that may further affect protein binding. Altogether, these findings may imply a lower miR-125a expression in breast cancer. These results may not only be useful for diagnostic purposes but can also contribute to research into novel therapies for breast cancer. In a wider perspective, our experimental protocol and findings may be extended to different types of cancer or other diseases in which the miRNA expression level is affected. Following this path, in future work we would like to consider other molecular types of cancer to extend the understanding of the impact of rs12976445 on miR-125a expression and to verify the obtained in silico results by in vivo methods.