Searching for the Novel Specific Predictors of Prostate Cancer in Urine: The Analysis of 84 miRNA Expression

The aim of this study was to investigate miRNA profiles of clarified urine supernatant and combined urine vesicle fractions of healthy donors and patients with benign prostatic hyperplasia and prostate cancer (PCa). The comparative analysis of miRNA expression was conducted with a custom miRCURY LNA miRNA qPCR panel. Significant combinations of miRNA pairs were selected by the Random Forest-based feature selection algorithm Boruta; the difference of the medians between the groups was assessed and a 95% confidence interval was built using the bootstrap approach. The asymptotic Wilcoxon-Mann-Whitney test was performed for miRNA combinations to compare different groups of donors. Benjamini-Hochberg correction was used to adjust the statistical significance for multiple comparisons. The most diagnostically significant miRNA pairs were miR-107-miR-26b.5p and miR-375.3p-miR-26b.5p in the urine supernatant fraction, which discriminated healthy donors from PCa patients, as well as miR-31.5p-miR-16.5p, miR-31.5p-miR-200b, miR-31.5p-miR-30e.3p and miR-31.5p-miR-660.5p in the extracellular vesicle fraction, which differed between healthy men and benign prostatic hyperplasia patients. Statistical criteria such as the occurrence of individual significant miRNA pairs in the total number of comparisons, the median ΔCt difference, and the confidence interval can be useful tools for determining reliable markers of PCa.


Introduction
Prostate cancer (PCa) is rightfully considered a global social problem because of its prevalence [1] and high associated mortality in developing countries [2]. According to the American Cancer Society 2017, in developed countries, the relative 5-year survival rate for localised PCa is nearly 100%, but at advanced stages it is only about 29% [3]. Similar to many other cancers, lack of early symptomatic manifestation is a major contributor to the late detection of PCa. Modern PCa diagnostics are based on the detection of prostate specific antigen (PSA) in blood, digital rectal examination of diagnostic molecules, a highly sensitive detection is required [10]. The search for liquid biopsy biomarkers which will allow us to detect PCa with a high degree of reliability still continues.
Extracellular miRNAs found in biological fluids are considered one class of promising PCa markers. It has been shown that these molecules take part in the key processes of carcinogenesis, and that their expression is determined by the tumor status, i.e., growth rate, the tendency of the tumor to metastasize, etc. [1,9,11,15-18]. Moreover, miRNAs can stably persist in biological fluids for extended periods of time [19], which ensures their reliable detection. The analysis of miRNA expression requires a small amount of biological material (obtained in a less-invasive or non-invasive manner) and can be performed with a minimum set of equipment and few personnel by using such well-developed methods as microarray analysis, sequencing and RT-PCR. These properties raise the hope that a set of diagnostically significant miRNAs can be discovered in blood or urine and used to develop a panel of markers for a more sensitive and specific PCa diagnosis, suitable for implementation in routine clinical practice.
In this study, we investigated the miRNA profiles of clarified urine supernatant and combined urine extracellular vesicles (EVs) in groups of healthy men and patients with benign prostatic hyperplasia (BPH) and PCa to identify miRNAs that can allow us to reliably distinguish PCa patients from non-cancer groups, and evaluate the diagnostic potential of selected predictors.

Results
The threshold cycle (Ct) values obtained after miRNA profiling were normalized using the pair ratio method, effectively evaluating the expression of all possible combinations of two miRNAs [20,21]. For the analyses, the following comparison groups were established: healthy donors vs. PCa patients (comparison groups h-c for urine supernatant and hm-cm for urine EVs), healthy donors vs. patients with BPH (comparisons h-b and hm-bm) and patients with BPH vs. PCa patients (comparisons b-c and bm-cm).
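The pair ratio normalization described above can be illustrated with a minimal Python sketch (the study itself used R). The function name `pair_ratios`, the convention of storing an undetected miRNA as `None`, and the Ct values are assumptions made for illustration only; they are not taken from the study.

```python
from itertools import combinations


def pair_ratios(ct):
    """Return the Ct difference for every unordered pair of miRNAs in one sample.

    ct: dict mapping a miRNA name to its threshold cycle (None if not detected).
    """
    ratios = {}
    for a, b in combinations(sorted(ct), 2):
        if ct[a] is None or ct[b] is None:
            continue  # the pair is undefined when either miRNA is missing
        # A difference of Ct values is a log-scale ratio of the two miRNAs.
        ratios[(a, b)] = ct[a] - ct[b]
    return ratios


# Made-up Ct values for a single hypothetical sample.
sample = {"miR-107": 30.5, "miR-26b.5p": 28.0, "miR-375.3p": 32.0}
ratios = pair_ratios(sample)
```

For n miRNAs this yields n(n-1)/2 pairs per sample, so three miRNAs give three pairs; the 3403 combinations reported under "Statistical Analysis" equal the number of pairs among 83 items.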
Next, data on the occurrence of individual significant miRNA pairs in the total number of comparisons (frequency), the median ∆Ct difference (median distance), the confidence interval (CI) and the unadjusted (p) and adjusted (padj) significance were collected using the Boruta feature selection method as described in "Materials and Methods". The best miRNA combinations (Table 1) were selected based on the following criteria: 3 pairs with the highest frequency; 3 pairs with the largest median distance between the groups of donors (but no less than 1 Ct); 3 pairs with the 95% CI of the median distance most removed from 0 (but no less than 0.1); 3 pairs with the lowest value of padj. Using several criteria allows for flexibility in approaching the group comparisons. The frequency criterion is presumably a valuable indicator provided by the Boruta algorithm, since it allows the selection of miRNA pairs that are less sensitive to sampling bias and can be effectively combined with other pairs. The median distance and the 95% CI criteria reflect the "diversity" of values in different groups and are a measure of practical applicability. If the value clusters of two groups lie close to each other, small errors in the miRNA expression measurement (e.g., technical and sampling variation) can dramatically affect the accuracy of group identification and the quality of the analytical system. Pairs with the lowest value of padj were selected as the most differentially expressed from the statistical point of view. Since all of these criteria are aimed at achieving the most accurate separation of groups, the pairs identified by different criteria often overlapped. In this case, any pair suggested by several criteria was included in the final selection only once. For each of the selected miRNA pairs, ROC analysis was performed and the AUC was determined.
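The four selection criteria and the removal of overlapping pairs can be sketched as follows. This is an illustrative Python sketch, not the authors' code: the record layout (`pair`, `frequency`, `median_distance`, `ci_low`, `ci_high`, `p_adj`) is hypothetical, and the reading of "CI most removed from 0" as the distance from zero to the nearest CI bound is an assumption.

```python
def ci_gap(s):
    """Distance from 0 to the nearest CI bound (0 if the interval spans zero).

    Assumption: this is one possible reading of "CI most removed from 0".
    """
    if s["ci_low"] <= 0 <= s["ci_high"]:
        return 0.0
    return min(abs(s["ci_low"]), abs(s["ci_high"]))


def select_best_pairs(stats, top_n=3):
    """Union of the top_n pairs under each criterion, each pair listed once."""
    by_frequency = sorted(stats, key=lambda s: s["frequency"], reverse=True)[:top_n]
    by_distance = sorted(
        (s for s in stats if abs(s["median_distance"]) >= 1.0),  # at least 1 Ct
        key=lambda s: abs(s["median_distance"]),
        reverse=True,
    )[:top_n]
    by_ci = sorted(
        (s for s in stats if ci_gap(s) >= 0.1),  # CI at least 0.1 away from 0
        key=ci_gap,
        reverse=True,
    )[:top_n]
    by_padj = sorted(stats, key=lambda s: s["p_adj"])[:top_n]

    selected = []
    for s in by_frequency + by_distance + by_ci + by_padj:
        if s["pair"] not in selected:  # a pair named by several criteria counts once
            selected.append(s["pair"])
    return selected


# Made-up summary statistics for five hypothetical pairs A to E.
toy_stats = [
    {"pair": "A", "frequency": 900, "median_distance": 2.0, "ci_low": 1.2, "ci_high": 2.8, "p_adj": 0.01},
    {"pair": "B", "frequency": 800, "median_distance": 1.5, "ci_low": 0.5, "ci_high": 2.5, "p_adj": 0.02},
    {"pair": "C", "frequency": 700, "median_distance": -1.2, "ci_low": -2.0, "ci_high": -0.3, "p_adj": 0.03},
    {"pair": "D", "frequency": 600, "median_distance": 0.8, "ci_low": 0.05, "ci_high": 1.5, "p_adj": 0.04},
    {"pair": "E", "frequency": 100, "median_distance": 3.0, "ci_low": 2.0, "ci_high": 4.0, "p_adj": 0.20},
]
best = select_best_pairs(toy_stats)
```

In this toy input, pair E is rarely selected and has a weak adjusted p-value, yet it enters the final list through the median distance and CI criteria, which shows how the criteria complement each other.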
It is worth noting that, despite the uniqueness of predictor pairs for each of the 6 comparisons, the same miRNAs were detected simultaneously in pairs from several comparisons (Table 2). Interestingly, some miRNAs were detected only in the urine supernatant (miR-23b.3p, miR-30a.5p, miR-205.5p), while others were found exclusively in the EV fraction (miR-24.3p, miR-31.5p and miR-200b.3p). In each comparison, certain miRNAs repeatedly occurred in different combinations (Table 3). For example, miR-205.5p and miR-26b.5p were present in 50% and 40% of pairs, respectively, in the h-c comparison. The miR-30c.5p was in each of the three miRNA pairs in the h-b comparison, and miR-31.5p was seen in 6 of the 10 combinations in hm-bm.
The distribution diagrams of normalized expression values of miRNA pairs were plotted to estimate the group separation of the most promising miRNA pairs (Figure 1). As shown in Figure 1A, the values of the miR-107-miR-26b.5p pair in PCa patients and healthy donors were tightly spread and partially overlapped, which can negatively affect group separation, even more so in the case of measurement errors. By comparison, the likelihood of discriminating healthy donors from PCa patients using miR-375.3p-miR-26b.5p expression is considerably higher (Figure 1A). A similar pattern could be seen in the hm-bm comparison, where the values of the miR-31.5p-miR-660.5p and miR-31.5p-miR-30e.3p pairs were sufficiently well spread, while the miR-31.5p-miR-16.5p and miR-31.5p-miR-200c values of BPH patients overlapped with the values for healthy donors, which prevents the efficient separation of these groups (Figure 1B).
Thus, here we compared miRNA expression in groups of healthy donors and patients with BPH and PCa using various statistical criteria to identify the best predictors that could reliably distinguish all three groups. In general, the largest number of such miRNA pairs (padj < 0.05) was found in the comparison of healthy donors vs. PCa patients in the urine supernatant fraction (h-c, 6 miRNA pairs) and in the urine EV fraction (hm-cm, 3 miRNA pairs). Additionally, the urine supernatant fraction had more pairs that reliably distinguish healthy donors and PCa patients with ΔCt from 1 to 2 (Table 1) than the EV fraction. The most diagnostically significant pairs for discriminating between healthy donors and PCa patients were miR-107-miR-26b-5p and miR-375-3p-miR-26b-5p in the urine supernatant fraction (Table 4). When comparing healthy donors vs. BPH patients or BPH patients vs. PCa patients, it was not possible to find miRNA pairs that could effectively distinguish donors from the two groups, due to borderline statistical significance (urine supernatant fraction) or weak statistical criteria (EV fraction) (Table 1). However, when assessing the diagnostic value of predictors in the comparison of healthy donors vs. BPH patients, the largest number of diagnostically significant pairs was found in the urine EV fraction (miR-31-5p-miR-16-5p, miR-31-5p-miR-200b, miR-31-5p-miR-30e-3p and miR-31-5p-miR-660-5p) (Table 4).

Discussion
Currently, there exists a range of unresolved research challenges concerning miRNA investigation in biological fluids, including urine. For example, there are no universal well-defined standards for preanalytical, analytical and postanalytical stages of biomarker investigation, including protocols of the collection, processing and storage of biological samples, miRNA isolation procedures, biomarker quantification routines, data analysis, etc., [9]. Such a lack of clarity leads to a situation where results from different studies are difficult to compare, and reduces the value of meta-analyses [9]. Despite these existing problems, urine is still considered a promising source of PCa biomarkers, due to the non-invasiveness of sample collection and confirmed presence of PCa specific molecules [12].
EVs are one of the most extensively researched sources of cancer biomarkers, containing markers of different nature [37], including tumour-specific miRNA [34-36]. Urine EVs are a valuable source of miRNA because of the better signal to background ratio than in cells of urine sediments [38] and the higher relative miRNA concentration in comparison with cell-free urine [15]. However, EV isolation by standard methods is more time-consuming and laborious than preparation of cell sediments or cell-free urine samples. Moreover, isolation of miRNA from urine EVs is complicated by the presence of uromodulin (Tamm-Horsfall protein). Polymerization of uromodulin molecules can create structures that are capable of entrapping the EVs, reducing their isolation efficiency [39] and also contaminating extracted miRNA samples [40]. That is why in the present work both the urine EVs and the cell-free urine supernatant were investigated in search of diagnostically valuable miRNAs (Table 1). Previously published work suggested that changes in the miRNA signature in different urine fractions can reflect different pathological processes in the prostate [22]. Thus, simultaneous investigation of miRNA expression in several urine fractions could be of considerable scientific and practical interest.
There are various normalization procedures used for miRNA expression analysis [41]. Among them, the ratio normalization method is a simple and robust approach in which all possible combinations of two miRNAs are constructed and screened for links to pathological processes, in our case, PCa. In this approach, some of the frequently occurring miRNAs (Table 3) can be assumed to be oncospecific, whereas the rarely occurring miRNAs paired with them are potential stable normalizers. As such, this approach allows us to select a complete panel of oncospecific miRNAs.
It is known that miRNAs play important roles in the key aspects of PCa carcinogenesis and development, including overexpression of androgen receptor (AR), apoptosis resistance, loss of cell cycle control, cell adhesion and epithelial-mesenchymal transition [16-18,42,43]. These events are some of the major steps in the acquisition of invasive properties by PCa cells, giving them the propensity to proliferate uncontrollably, enhanced cell survival, mobility and the ability to spread to other organs and tissues. Thus, the detection of oncogenic miRNAs, involved in different pathways of PCa carcinogenesis, can be used as a diagnostic and treatment monitoring tool for this disease. For example, previous studies have shown that miR-205 and miR-214 expression in urine correlated with PCa progression [28] and miR-16, miR-21, miR-222 can be used as predictors of aggressive PCa [30].
In order to assess the possible involvement of miRNAs in PCa carcinogenesis, miRNAs from the pairs listed in Table 1 were analysed using OncomiRDB and DIANA-TarBase, which contain experimentally validated oncogenic and oncosuppressor miRNAs. The miR-let-7a and miR-let-7g from the h-b comparison were excluded based on low significance values (padj > 0.05). Most of the remaining miRNAs were identified by both databases. According to OncomiRDB and DIANA-TarBase, 52.8% and 65% of these miRNAs, respectively, were found to be prostate-cancer specific and were previously implicated in PCa development. The results of the miRNA search in OncomiRDB are shown in Table 5. Moreover, the most frequently occurring miRNAs in each comparison could also be associated with PCa development (Table 3), which can be taken into account in developing analytical diagnostic systems. Interestingly, according to these data, miRNA pairs identified in the h-b and hm-bm comparisons contained oncogenic miRNAs (Table 2). Moreover, the differences between the expression of miRNA pairs in urine supernatants of BPH and PCa patients were only borderline significant, suggesting similar miRNA expression profiles (Table 1). This can be due to the mosaic heterogeneity of prostate tumours: non-diagnosed malignant foci can be present in patients with BPH, and areas of benign growth persist in PCa. BPH is also known to atypically progress into intraductal dysplasia (PIN) accompanied by the alteration or loss of characteristic tissue structure. Although no signs of malignant transformation are generally present in PIN, it can be considered a pre-cancerous state [44]. Moreover, according to recent data, up to 25 percent of BPH cases are later discovered to have PCa [5]. This supports older data that latent prostate carcinomas accompany 15.1% of BPH cases [45]. Another study revealed that nearly one third of all cancerous lesions in the transitional zone of the prostate co-exist with BPH [46].
There are also a number of pathophysiological similarities between BPH and PCa [47,48], including the age-dependency profile and the androgen requirement for growth and development. Still, to date no explicit evidence linking the two pathologies has been presented [44]. In this light, it is possible that in both BPH and PCa, changes in the expression of some miRNAs may be tissue-specific, rather than tumor-specific, as is the case with PSA.
One distinctive feature of miRNA expression in biological fluids is that the magnitude of the differences between cancer patients and control groups tends to be lower than or comparable to the within-group variation. This fact calls for strict requirements to be placed on the design of a diagnostic system, including the use of spike-in controls, a defined number of analyzed replicates, and straightforward inclusion and exclusion criteria for miRNA samples isolated from blood and urine. Only after such requirements are formulated and analytical systems that satisfy them are developed will it be possible to start the verification of miRNA markers and determine the effectiveness of miRNA-based diagnostic systems.

Study Population and Sample Collection
Blood (used only to determine the PSA level) and urine samples of 10 healthy male individuals (HD), 10 patients with BPH and 10 previously untreated PCa patients were obtained from the Center of New Medical Technologies of ICBFM SB RAS and Regional Oncology Center (Novosibirsk, Russia).
Clinicopathological and demographic characteristics of donors are presented in Table 6. Biological samples were harvested from 10 healthy donors (HD), 10 patients with benign prostatic hyperplasia (BPH) and 10 prostate cancer patients (PCa with T 2-3 NxMx stage and pathological Gleason score 6-7). None of the patients had undergone surgical treatment or received chemotherapy prior to/at the time of sample collection.
The work was conducted in compliance with the principles of voluntariness and confidentiality in accordance with the "Fundamentals of Legislation on Health Care". The study was approved by the ethics committees of ICBFM SB RAS and Novosibirsk Regional Oncology Center, (minutes of meeting N 10 from 22 December 2008) and written informed consent was provided by all participants.

Urine Fractionation and Isolation of Extracellular Vesicles
To pellet cells, fresh urine was clarified by centrifugation at 400× g, 20 °C, for 20 min within 3 h after collection. Supernatants were then centrifuged at 17,000× g, 20 °C, for 20 min. Aliquots from supernatants were immediately frozen and stored at −20 °C and thawed once immediately before use.
Total extracellular vesicles (EVs) were precipitated from the 17,000× g supernatant by high-speed centrifugation at 100,000× g, 18 °C, for 90 min. The pellets were washed with 10 mL PBS, resuspended in 100-300 µL of PBS, frozen in liquid nitrogen and stored at −80 °C until use. Aliquots were thawed once immediately before use.

miRNA Isolation and Analysis
For miRNA extraction from EVs, a sample of 200 µL resuspended EVs was sequentially mixed with 100 µL of denaturing buffer (1% 2-mercaptoethanol and 0.3 M guanidine isothiocyanate and 20 mM Tris-acetate, pH 4) and 200 µL of precipitation buffer (12 mM OcA and 0.8 M sodium acetate pH 4.0). The mixture was vortexed for 5 s, incubated for 5 min at room temperature and centrifuged at 16,200× g. Supernatant was mixed thoroughly with an equal volume of 95% ethanol and applied to a silica spin column (e.g., BioSilica Ltd., Novosibirsk, Russia). Then, miRNA extraction was performed as described previously [49]. The modified acid phenol-chloroform extraction was used for miRNA isolation from urine supernatant samples [50]. A sample of 2 mL urine was sequentially mixed with 2 mL of denaturing buffer (1% 2-mercaptoethanol and 3 M guanidine isothiocyanate) and 75 µL of 2 M sodium acetate, pH 4. Then, 4 mL of phenol and 700 µL chloroform were sequentially added and mixed thoroughly. The mixture was centrifuged at 9000× g, 4 °C for 20 min. After phase separation, the water phase was collected and the phenol extraction was repeated. The water phase obtained after the second extraction was mixed with an equal volume of 2 M sodium acetate, pH 4, and a double volume of 96% ethanol, and purified using silica columns (BioSilica Ltd., Novosibirsk, Russia). The column was washed with washing buffer 1 (0.3 M guanidine isothiocyanate, 10 mM Tris-acetate, pH 6.5, 50% ethanol, 1% 2-mercaptoethanol) and washing buffer 2 (10 mM Tris-HCl, pH 7.5, 0.1 M NaCl, 75% ethanol). RNA was eluted from the column with BioSilica RNA elution solution (BioSilica Ltd., Novosibirsk, Russia).

Analysis of miRNA Expression
The comparative analysis of miRNA expression was conducted with a custom miRCURY LNA miRNA qPCR panel (Exiqon, Denmark) based on 67 miRNAs from a pre-formed Urine Exosome marker panel and 18 PCa-specific miRNAs selected from an analysis of the available literature and databases.

Statistical Analysis
Statistical analysis was performed using the R environment [51]. Threshold cycle (Ct) values were normalized using the pair ratio method. Significant combinations of miRNA pairs were selected by the Random Forest-based feature selection algorithm Boruta. A total of 3403 combinations were made. A number of miRNAs were missing in some samples. To avoid excluding these miRNAs from the analysis, the following procedure was used. First, 9 ratio values of each miRNA pair were randomly selected from each group without replacement. Then, the significance of these combinations as predictors of PCa, HD and BPH was evaluated using Boruta. This procedure was repeated 1000 times, and significant predictors were recorded at each iteration. For the selected miRNA pairs, the difference of the medians between the groups was assessed and a 95% confidence interval was built using the bootstrap approach [52].
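The bootstrap step can be sketched in Python as below. This is a minimal illustration under stated assumptions: the text does not say which bootstrap variant was used, so a simple percentile interval is shown, and the function name, the number of resamples and the ΔCt values are all hypothetical.

```python
import random
import statistics


def bootstrap_median_diff_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for median(x) - median(y).

    Assumption: a percentile interval; other bootstrap intervals exist.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        xs = rng.choices(x, k=len(x))  # resample each group with replacement
        ys = rng.choices(y, k=len(y))
        diffs.append(statistics.median(xs) - statistics.median(ys))
    diffs.sort()
    k = round(alpha / 2 * n_boot)  # number of resamples cut from each tail
    observed = statistics.median(x) - statistics.median(y)
    return observed, (diffs[k], diffs[n_boot - 1 - k])


# Made-up delta-Ct values of one miRNA pair in two hypothetical groups.
group_c = [5.1, 5.4, 4.9, 5.6, 5.2, 5.0, 5.3, 5.5, 4.8, 5.7]
group_h = [3.0, 3.2, 3.9, 3.5, 3.1, 3.4, 3.6, 3.3, 3.8, 3.7]
observed, (ci_low, ci_high) = bootstrap_median_diff_ci(group_c, group_h)
```

An interval that excludes zero, as it does for these well-separated toy groups, corresponds to a pair that would be kept for the subsequent tests.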
The miRNA combinations with a confidence interval containing zero were excluded from further analysis. The asymptotic Wilcoxon-Mann-Whitney test was performed for the remaining miRNA combinations to compare different groups of donors. A p-value < 0.05 was considered statistically significant. Benjamini-Hochberg correction (padj) was used to adjust the statistical significance for multiple comparisons.
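For readers unfamiliar with the Benjamini-Hochberg adjustment, the standard step-up procedure can be written in a few lines of Python. This is an illustrative sketch, not the authors' R code; the function name and the example p-values are made up.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, keeping the adjusted
    # values monotone by carrying the running minimum downwards.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up unadjusted p-values for four hypothetical miRNA pairs.
p_adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Each raw p-value is scaled by m divided by its rank, so the smallest p-values are penalised most strongly and no adjusted value can fall below its raw value.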
The specificity and sensitivity of the analytical systems were obtained using Receiver Operating Characteristic (ROC) curves. The area under the ROC curve (AUC) was used to assess the diagnostic performance of miRNA combinations.
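The AUC has a direct rank interpretation: it equals the probability that a randomly chosen value from one group exceeds a randomly chosen value from the other, with ties counted as one half. The Python sketch below computes it that way; the function name and the values are hypothetical, and the authors' R workflow may have computed the AUC differently.

```python
def auc(cases, controls):
    """AUC as the share of (case, control) pairs in which the case value is higher.

    Ties contribute 0.5, which matches the area under the empirical ROC curve.
    """
    wins = 0.0
    for c in cases:
        for h in controls:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Made-up delta-Ct values of one miRNA pair in two hypothetical groups.
area = auc([3.0, 4.0, 5.0], [1.0, 2.0, 3.5])
```

Here 8 of the 9 case-control pairs are ordered correctly, so the AUC is 8/9; a value of 0.5 would mean that the pair carries no discriminating information.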

Conclusions
A Random Forest-based feature selection algorithm (Boruta) was used to analyze miRNA expression in the urine supernatant and EVs of healthy men and PCa and BPH patients. Several miRNAs were selected as candidate biomarkers based on criteria such as frequency, median distance and the p-value. The best combinations were chosen and their diagnostic potential was determined. This work represents a preliminary stage. The next step is to validate the selected miRNA markers using an independent group of donors and to investigate the association of miRNA expression with demographic and clinicopathological factors, including age, disease stage, Gleason index, tumor genotype, resistance to chemo- and radiotherapy, patient survival, etc.