A Panel of MicroRNAs as Diagnostic Biomarkers for the Identification of Prostate Cancer

Prostate cancer is the most common non-cutaneous cancer among men; yet, current diagnostic methods are insufficient, and more reliable diagnostic markers need to be developed. One answer that can bridge this gap may lie in microRNAs. These small RNA molecules impact protein expression at the translational level, regulating important cellular pathways, the dysregulation of which can exert tumorigenic effects contributing to cancer. In this study, high throughput sequencing of small RNAs extracted from blood from 28 prostate cancer patients at initial stages of diagnosis and prior to treatment was used to identify microRNAs that could be utilized as diagnostic biomarkers for prostate cancer compared to 12 healthy controls. In addition, a group of four microRNAs (miR-1468-3p, miR-146a-5p, miR-1538 and miR-197-3p) was identified as normalization standards for subsequent qRT-PCR confirmation. qRT-PCR analysis corroborated microRNA sequencing results for the seven top dysregulated microRNAs. The abundance of four microRNAs (miR-127-3p, miR-204-5p, miR-329-3p and miR-487b-3p) was upregulated in blood, whereas the levels of three microRNAs (miR-32-5p, miR-20a-5p and miR-454-3p) were downregulated. Data analysis of the receiver operating curves for these selected microRNAs exhibited a better correlation with prostate cancer than PSA (prostate-specific antigen), the current gold standard for prostate cancer detection. In summary, a panel of seven microRNAs is proposed, many of which have prostate-specific targets, which may represent a significant improvement over current testing methods.


Introduction
Prostate Cancer (PCa) is the most common non-cutaneous cancer among men, yet current diagnostic methods are insufficient at detecting this disease, and more reliable biomarkers need to be developed. Currently, the prostate-specific antigen (PSA) is used as a diagnostic marker for PCa; however, many factors have been found to elevate PSA levels. Age, infection, trauma, ejaculation, urinary retention, instrumentation, certain medications and even bike riding can lead to false positive diagnoses, generating unnecessary concern and over-treatment with dire outcomes for the patient [1][2][3][4]. Even worse are the chances of false negative diagnoses, which result in PCa remaining undetected until its later stages. Therefore, although the use of the PSA level has had its clinical advantages, it has failed to sufficiently bridge the gap to accurately diagnose disease or distinguish indolent from aggressive disease. One answer that might close this gap and enable more efficient diagnoses may lie in microRNAs (miRs) [5].
Small RNAs play an extremely important role in gene regulation. Their function in the suppression of unwanted genetic materials is vital to the proper operation of the cell. Small RNAs fall into three classifications: microRNAs, siRNA and PIWI-interacting RNAs (piRNA), the most dominating of which are microRNAs [6]. MicroRNAs are small non-coding RNA molecules (18-22 nts in length) that are evolutionarily conserved and associated with the Argonaute family of proteins. These microRNAs function at the translational level through silencing mechanisms to regulate gene expression.
MicroRNAs have been shown to be significantly altered throughout the course of disease progression [7]. This is especially true in cancer where abnormal cell growth and angiogenesis are critical for tumorigenesis to occur. The loss of microRNAs that suppress the translation of oncogenes, termed tumor suppressors, has been shown to contribute to the development and progression of many cancers [7]. These microRNAs are primarily responsible for controlling apoptotic pathways and cell cycle checkpoints [7].
Since the discovery of microRNAs, many research groups have analyzed blood in hopes of establishing a correlation to disease. Mitchell et al. first reported that PCa cells released microRNAs into the bloodstream in protective capsules, the content of which could be monitored by PCR-based methods [8]. Schultz et al. studied whole blood for the identification of microRNAs that could be used as biomarkers for the detection of pancreatic cancer [9]. By confirmatory qRT-PCR, they found 38 microRNAs dysregulated and were able to identify two diagnostic microRNA panels that could distinguish between patients with pancreatic cancer from healthy controls [9]. Another study compared microRNA levels between plasma and serum from PCa patients by measuring four microRNAs: hsa-miR-15b, hsa-miR-16, hsa-miR-19b and hsa-miR-24. Interestingly, they found a strong correlation in the microRNA content of these two types of body fluids supporting either serum or plasma as a sufficient source of material for disease studies [8]. Using qRT-PCR, Cochetti et al. suggested a panel of serum microRNAs that could distinguish PCa from benign prostatic hyperplasia in age-matched patients with elevated PSA levels [10]. Thus, the use of blood, serum or plasma as a worthwhile source of material to diagnose disease is well documented [8][9][10].
To date, a number of studies have used PCR technology to identify microRNAs that could be used as relevant biomarkers to diagnose PCa [11,12]. Certainly, these are important studies, but for the most part, they have used preformed panels of microRNA arrays or focused qRT-PCR assays for specific microRNAs suggested from studying a wide range of different cancers and then applied to PCa. By this approach, only predetermined, known microRNAs are being evaluated. In an effort to widen the scope of microRNA candidates, high throughput sequencing (HTS), also referred to as deep sequencing or RNA sequencing, would better evaluate all possible microRNAs, as well as permitting the discovery of new, novel microRNAs. Keller et al. used HTS of whole blood samples collected with PAXgene blood tubes to study microRNA profiles in lung cancer patients [13]. However, in this case, samples were pooled prior to sequencing, thereby preventing an analysis of microRNA dysregulation across individual samples. To our knowledge, only two reports have used HTS to identify microRNAs diagnostic for PCa. In one case, HTS was used to compare the microRNA content of prostate tumors to adjacent tumor-free margins with the discovery of a loss of miR-143 and miR-145 expression in tumor tissues [14]. A second report applied HTS to exosomal material isolated from blood and found miR-1290 and miR-375 as prognostic markers for castration-resistant prostate cancer (CRPC) [15]. However, in this case, these microRNAs would be useful for identifying late stage prostate cancers.
To better define microRNAs that could be used to more accurately predict PCa at early, not later stages of disease, in this pilot study, we have used HTS of blood from PCa patients at initial stages of diagnosis and before undergoing treatment compared to healthy controls. Moreover, samples were analyzed individually rather than as a pool so that variability between patients or control samples could be followed. RNA sequencing results were also analyzed to identify normalization microRNAs that could be used as endogenous controls for subsequent qRT-PCR analyses. Confirmatory qRT-PCR was then used to corroborate HTS results for the top seven dysregulated microRNAs. Data analysis of the area under the curve (AUC) of the receiver operating curves (ROC) for these selected microRNAs exhibited a better correlation with prostate cancer (AUC range = 0.819-0.950) than the reported value for PSA (AUC 0.678 comparing PCa to non-cancer) [16]. In summary, a panel of seven microRNAs is proposed, many of which have prostate-specific targets, which upon follow-up confirmatory studies could represent a significant improvement over current testing methods.

High Throughput Sequencing Results
A summary of the characteristics and pathological data for patients (n = 28) and controls (n = 12) selected for this study is compared in Table 1. Data for each individual can be found in Appendix A Table A1. Blood was retrieved from patients at early stages of diagnosis and prior to treatment. For most cases, age, ethnicity, PSA and Gleason scores obtained from biopsy were reported. The Gleason score was obtained by microscopic analysis by a trained pathologist and is the combined score of the most common and second most abundant cell type based on cell morphology. When the Gleason score or PSA were not available, it is designated as unknown. Although some mix of ethnicity was obtained, Caucasian was most prevalent with no ethnicity or age recorded for nine individuals. Low Gleason scores of G6 and G7 and PSA values ranging from 3.4-22 predominated, since samples were taken from patients at early stages of diagnosis. Although Gleason scores were not reported for four patients, elevated PSA levels including the high of 22 was found within this group supporting their inclusion to analyze as many samples as possible in this pilot study. Every effort was made to select a control group that had no evidence of PCa either for the individual or within the family. PCa being predominately a disease of the elderly, the average age of the patient group did exceed that of the controls, but since all data were analyzed as individuals, we could subsequently evaluate differences within each group. In this case, we did not find notable discrepancies in data within either the patient or control group due to age or the group of four with elevated PSA values, but unknown Gleason scores, further supporting their inclusion in this study. Blood was collected and small RNAs extracted from individual patient and control samples as described in the Materials and Methods. The HTS data revealed that among the 2588 microRNAs present in the miRBase (mature 21 June 2014) [17], about 550 were found at detectable levels in the samples tested. To better refine this list of potential candidates, p-values were adjusted using the Benjamini-Hochberg method to yield a False Detection Rate, or FDR value. The FDR value indicates the possible false detection rate using a generalized linearization model. This method is considered a Type 1 error expansion multiple comparison model that reduces the risk of rejecting a true null hypothesis. In order to include as many positive hits as possible in the HTS screening, a cutoff FDR value of <0.2 was selected. An FDR value of 0.2 would mean that 20% of selected microRNAs may be false positives. Since all HTS results would be subsequently confirmed by qRT-PCR, it was felt that lowering the stringency to include more potential microRNA candidates for future confirmation was acceptable at this initial stage. In fact, lowering the stringency of this selection generated a list of 10 possible dysregulated microRNAs for future study (Table 2). Subsequently, miR-5582-3p and miR-543 were dropped because there were no manufactured primers readily available in the market, and their abundance was low. In addition, miR-500b-3p was also dropped due to its low abundance. Thus, seven microRNAs were chosen for future analysis. During the bioinformatics analysis, the HTS total reads for patients and controls were not significantly different from each other, suggesting that blood from normal and patient groups contained similar amounts of total microRNA ( Figure 1). This similarity increased the confidence of dysregulation, as it could be confirmed that the differential expression of certain microRNAs was not due to differences in library size. Blood was collected and small RNAs extracted from individual patient and control samples as described in the Materials and Methods. The HTS data revealed that among the 2588 microRNAs present in the miRBase (mature 21 June 2014) [17], about 550 were found at detectable levels in the samples tested. To better refine this list of potential candidates, p-values were adjusted using the Benjamini-Hochberg method to yield a False Detection Rate, or FDR value. The FDR value indicates the possible false detection rate using a generalized linearization model. This method is considered a Type 1 error expansion multiple comparison model that reduces the risk of rejecting a true null hypothesis. In order to include as many positive hits as possible in the HTS screening, a cutoff FDR value of <0.2 was selected. An FDR value of 0.2 would mean that 20% of selected microRNAs may be false positives. Since all HTS results would be subsequently confirmed by qRT-PCR, it was felt that lowering the stringency to include more potential microRNA candidates for future confirmation was acceptable at this initial stage. In fact, lowering the stringency of this selection generated a list of 10 possible dysregulated microRNAs for future study (Table 2). Subsequently, miR-5582-3p and miR-543 were dropped because there were no manufactured primers readily available in the market, and their abundance was low. In addition, miR-500b-3p was also dropped due to its low abundance. Thus, seven microRNAs were chosen for future analysis. During the bioinformatics analysis, the HTS total reads for patients and controls were not significantly different from each other, suggesting that blood from normal and patient groups contained similar amounts of total microRNA ( Figure 1). This similarity increased the confidence of dysregulation, as it could be confirmed that the differential expression of certain microRNAs was not due to differences in library size.  Processed raw reads were further normalized using the Trimmed Mean of M-values (TMM) method provided by the Edge R program [18]. Based on the hypothesis that most genes are not differentially expressed, the TMM method generates a scaling factor applied to library sizes, which attempts to minimize the intra-group variation in gene expression. This normalization method can further minimize the effect of technical variations caused by sequencing depth and batch variation. The TMM method is a very powerful method when varying library size, and high-count genes can exist [19]. Compared to the commonly-used normalization methods of Total Counts (TC) or Reads Per Kilobase per Million mapped reads (RPKM), the TMM method is more reliable, because it not only normalizes the library size, but also takes into account the effect of RNA composition [18]. The effectiveness of the TMM method in normalizing the microRNA sequencing results was later confirmed via qRT-PCR.
The seven dysregulated microRNAs indicated by HTS differential expression analysis showed great differences in normalized reads between control and patient groups (Figure 2a-g). According to the HTS data, four microRNAs were upregulated (miR-127-3p, miR-204-5p, miR-329-3p and miR-487b-3p) in patients' blood samples (Figure 2a-d), while three microRNAs (miR-32-5p, miR-20a-5p and miR-454-3p) were downregulated (Figure 2e-g). Processed raw reads were further normalized using the Trimmed Mean of M-values (TMM) method provided by the Edge R program [18]. Based on the hypothesis that most genes are not differentially expressed, the TMM method generates a scaling factor applied to library sizes, which attempts to minimize the intra-group variation in gene expression. This normalization method can further minimize the effect of technical variations caused by sequencing depth and batch variation. The TMM method is a very powerful method when varying library size, and high-count genes can exist [19]. Compared to the commonly-used normalization methods of Total Counts (TC) or Reads Per Kilobase per Million mapped reads (RPKM), the TMM method is more reliable, because it not only normalizes the library size, but also takes into account the effect of RNA composition [18]. The effectiveness of the TMM method in normalizing the microRNA sequencing results was later confirmed via qRT-PCR.
The seven dysregulated microRNAs indicated by HTS differential expression analysis showed great differences in normalized reads between control and patient groups (Figure 2a-g). According to the HTS data, four microRNAs were upregulated (miR-127-3p, miR-204-5p, miR-329-3p and miR-487b-3p) in patients' blood samples (Figure 2a  (g) Figure 2. HTS data show dysregulation of seven microRNAs in blood from PCa patients (red) compared to controls (blue). (a-g) Box plots of the top seven dysregulated microRNAs as indicated on each panel are based on Edge R differential expression analysis. p-Values and FDR values were generated by Edge R using the generalized linear method. Box-and-whiskers graphs were plotted using Prism. The minimum, the 25th percentile, the median, the 75th percentile and the maximum are shown on each box plot as the bottom to the top lines, respectively. An FDR < 0.2 was considered significant.

Identification of MicroRNAs as Normalization Standards for qRT-PCR
In order to confirm HTS data by qRT-PCR, a normalization method needed to be developed to ensure that microRNA dysregulation was due to true biological variation and not technical error. Ideally, microRNA normalizers should exhibit small standard deviations and display similar expression levels to the dysregulated microRNAs under study. To this end, the concentration of small RNAs in each sample was determined from bioanalyzer results, set to a constant amount, and the Cq value determined by qRT-PCR. Results were analyzed using the NormFinder program, which scrutinizes intra-and inter-group variations to determine which microRNA candidates are best suited for normalization using an algorithm to calculate a stability value for each microRNA, i.e., the lower the value, the lower the variation.
The NormFinder program selected eight microRNAs as exhibiting stable expression patterns; miR-146a-5p, miR-1538, miR-197-3p, miR-1468-3p, miR-26b-5p, miR-296-5p, miR-1248 and miR-23a-3p ( Figure 3a; raw data Appendix Table A2). Due to limiting amounts of material, the search for potential microRNA normalizers was initially monitored in a subset of samples, i.e., eight patients and eight controls. Of these, the first four candidates, which exhibited the closest stability values (ranging from 0.009-0.0016) with minimal differences in expression between control and patient samples, were subsequently analyzed in a fuller spectrum of samples (26 patients and 10 controls). Unfortunately, two samples from each group of HTS data had to be dropped from further analysis due to a lack of material ( Figure 3b; raw data Appendix Table A3). NormFinder suggested that the single best candidate was miR-146a-5p. However, when there is no obvious single, outstanding normalization candidate, NormFinder suggests using a combination of microRNAs to increase reliability and produce less intra-and inter-group variability. Since the top four microRNAs (miR-146a-5p, miR-1538, miR-197-3p, miR-1468-3p) all showed very close stability values, the Cq value of each was compared to each other, as well as the geometric mean of the top two candidates (miR-146-5p and miR-1538) or all four top candidates together (Figure 3b). The top two candidates showed decreased intra-group variation, especially for the patient group ( Figure 3b). However, the geometric mean of all four candidates together showed even smaller intra-and inter-variations within and between both control and patient groups ( Figure 3b). Therefore, these four microRNAs were selected as a group of normalizers to be used for downstream qRT-PCR analyses. . HTS data show dysregulation of seven microRNAs in blood from PCa patients (red) compared to controls (blue). (a-g) Box plots of the top seven dysregulated microRNAs as indicated on each panel are based on Edge R differential expression analysis. p-Values and FDR values were generated by Edge R using the generalized linear method. Box-and-whiskers graphs were plotted using Prism. The minimum, the 25th percentile, the median, the 75th percentile and the maximum are shown on each box plot as the bottom to the top lines, respectively. An FDR < 0.2 was considered significant.

Identification of MicroRNAs as Normalization Standards for qRT-PCR
In order to confirm HTS data by qRT-PCR, a normalization method needed to be developed to ensure that microRNA dysregulation was due to true biological variation and not technical error. Ideally, microRNA normalizers should exhibit small standard deviations and display similar expression levels to the dysregulated microRNAs under study. To this end, the concentration of small RNAs in each sample was determined from bioanalyzer results, set to a constant amount, and the Cq value determined by qRT-PCR. Results were analyzed using the NormFinder program, which scrutinizes intra-and inter-group variations to determine which microRNA candidates are best suited for normalization using an algorithm to calculate a stability value for each microRNA, i.e., the lower the value, the lower the variation.
The NormFinder program selected eight microRNAs as exhibiting stable expression patterns; miR-146a-5p, miR-1538, miR-197-3p, miR-1468-3p, miR-26b-5p, miR-296-5p, miR-1248 and miR-23a-3p ( Figure 3a; raw data Appendix A Table A2). Due to limiting amounts of material, the search for potential microRNA normalizers was initially monitored in a subset of samples, i.e., eight patients and eight controls. Of these, the first four candidates, which exhibited the closest stability values (ranging from 0.009-0.0016) with minimal differences in expression between control and patient samples, were subsequently analyzed in a fuller spectrum of samples (26 patients and 10 controls). Unfortunately, two samples from each group of HTS data had to be dropped from further analysis due to a lack of material (Figure 3b; raw data Appendix A Table A3). NormFinder suggested that the single best candidate was miR-146a-5p. However, when there is no obvious single, outstanding normalization candidate, NormFinder suggests using a combination of microRNAs to increase reliability and produce less intra-and inter-group variability. Since the top four microRNAs (miR-146a-5p, miR-1538, miR-197-3p, miR-1468-3p) all showed very close stability values, the Cq value of each was compared to each other, as well as the geometric mean of the top two candidates (miR-146-5p and miR-1538) or all four top candidates together (Figure 3b). The top two candidates showed decreased intra-group variation, especially for the patient group ( Figure 3b). However, the geometric mean of all four candidates together showed even smaller intra-and inter-variations within and between both control and patient groups ( Figure 3b). Therefore, these four microRNAs were selected as a group of normalizers to be used for downstream qRT-PCR analyses.  . Analysis by the NormFinder program identified four microRNAs as the best normalization candidates for qRT-PCR studies. (a) Eight stably-expressed microRNAs (miR-146a-5p, miR-1538, miR-197-3p, miR-1468-3p, miR-26b-5p, miR-296-5p, miR-1248 and miR-23a-3p) suggested by HTS data were confirmed by qRT-PCR in triplicate (eight controls and eight patients). The small RNA concentration for each sample was normalized to roughly 0.012 ng/µL in each reaction. A stability value was generated for each candidate by the NormFinder program, where the lower the value, the better; (b) The Cq value of the top four microRNA candidates (miR-1468-3p, miR-146a-5p, miR-1538, miR-197-3p) was subsequently evaluated in 26 patients and 10 controls and plotted individually as box plots versus the geometric (Geo) mean of two candidates (miR-146a-5p and miR-1538) or four candidates (miR-146a-5p, miR-1538, miR-197-3p and miR-1468-3p ) as analyzed in triplicate by qRT-PCR.

Validation of HTS Data by qRT-PCR Analysis
The elucidation of valid microRNA normalizers permitted further analysis of HTS results via qRT-PCR. The individual dot plots of dCq (∆Cq) values are shown for the seven dysregulated miRs suggested by HTS data (Figure 4). According to the qRT-PCR results, miR-127-3p, miR-204-5p, miR-329-3p and miR-487b-3p were all upregulated in patients compared to controls (Figure 4a Appendix Tables A3 and A4, respectively. Thus, the qRT PCR results agreed with the HTS data, confirming that all seven microRNAs were dysregulated in PCa patients.  . Analysis by the NormFinder program identified four microRNAs as the best normalization candidates for qRT-PCR studies. (a) Eight stably-expressed microRNAs (miR-146a-5p, miR-1538, miR-197-3p, miR-1468-3p, miR-26b-5p, miR-296-5p, miR-1248 and miR-23a-3p) suggested by HTS data were confirmed by qRT-PCR in triplicate (eight controls and eight patients). The small RNA concentration for each sample was normalized to roughly 0.012 ng/µL in each reaction. A stability value was generated for each candidate by the NormFinder program, where the lower the value, the better; (b) The Cq value of the top four microRNA candidates (miR-1468-3p, miR-146a-5p, miR-1538, miR-197-3p) was subsequently evaluated in 26 patients and 10 controls and plotted individually as box plots versus the geometric (Geo) mean of two candidates (miR-146a-5p and miR-1538) or four candidates (miR-146a-5p, miR-1538, miR-197-3p and miR-1468-3p ) as analyzed in triplicate by qRT-PCR.

Validation of HTS Data by qRT-PCR Analysis
The elucidation of valid microRNA normalizers permitted further analysis of HTS results via qRT-PCR. The individual dot plots of dCq (∆Cq) values are shown for the seven dysregulated miRs suggested by HTS data (Figure 4). According to the qRT-PCR results, miR-127-3p, miR-204-5p, miR-329-3p and miR-487b-3p were all upregulated in patients compared to controls (Figure 4a-d), while miR-32-5p, miR-20a-5p and miR-454-3p were downregulated (Figure 4e-g) The differences in expression in control versus patient samples was calculated for each microRNA as the -ddCq (Log2 fold change) and shown in Figure 4h. Raw and normalized Cq values for controls and patients are included in Appendix A Tables A3 and A4, respectively. Thus, the qRT PCR results agreed with the HTS data, confirming that all seven microRNAs were dysregulated in PCa patients.  . Analysis by the NormFinder program identified four microRNAs as the best normalization candidates for qRT-PCR studies. (a) Eight stably-expressed microRNAs (miR-146a-5p, miR-1538, miR-197-3p, miR-1468-3p, miR-26b-5p, miR-296-5p, miR-1248 and miR-23a-3p) suggested by HTS data were confirmed by qRT-PCR in triplicate (eight controls and eight patients). The small RNA concentration for each sample was normalized to roughly 0.012 ng/µL in each reaction. A stability value was generated for each candidate by the NormFinder program, where the lower the value, the better; (b) The Cq value of the top four microRNA candidates (miR-1468-3p, miR-146a-5p, miR-1538, miR-197-3p) was subsequently evaluated in 26 patients and 10 controls and plotted individually as box plots versus the geometric (Geo) mean of two candidates (miR-146a-5p and miR-1538) or four candidates (miR-146a-5p, miR-1538, miR-197-3p and miR-1468-3p ) as analyzed in triplicate by qRT-PCR.

Validation of HTS Data by qRT-PCR Analysis
The elucidation of valid microRNA normalizers permitted further analysis of HTS results via qRT-PCR. The individual dot plots of dCq (∆Cq) values are shown for the seven dysregulated miRs suggested by HTS data (Figure 4). According to the qRT-PCR results, miR-127-3p, miR-204-5p, miR-329-3p and miR-487b-3p were all upregulated in patients compared to controls (Figure 4a Appendix Tables A3 and A4, respectively. Thus, the qRT PCR results agreed with the HTS data, confirming that all seven microRNAs were dysregulated in PCa patients.   In order to further assess whether the seven microRNAs could serve as good biomarkers, Receiver Operator Curves (ROC) were drawn based on the qRT-PCR data. ROC analysis demonstrates the trade-off between sensitivity and specificity where a good biomarker should   In order to further assess whether the seven microRNAs could serve as good biomarkers, Receiver Operator Curves (ROC) were drawn based on the qRT-PCR data. ROC analysis demonstrates the trade-off between sensitivity and specificity where a good biomarker should display both high sensitivity and high specificity [20]. The ROC curve for each microRNA is shown in Figure 5. In ROC analysis, the Area Under the Curve (AUC) quantifies the biomarker potential for each candidate where the higher the AUC value, the better a candidate microRNA is at distinguishing PCa patients from controls. Via ROC analysis, the currently used PCa biomarker, PSA, has a reported AUC value of 0.678 for distinguishing PCa from no cancer [16]. The seven microRNAs identified in our study exhibited a respectable range of AUC values from 0.7538 for miR-127-3p up to 0.9462 for miR-329-3p, all significantly better than that reported for PSA, with p-values ranging from 1.9435 × 10 −6 to 0.0094 (Figure 5a-g). display both high sensitivity and high specificity [20]. The ROC curve for each microRNA is shown in Figure 5. In ROC analysis, the Area Under the Curve (AUC) quantifies the biomarker potential for each candidate where the higher the AUC value, the better a candidate microRNA is at distinguishing PCa patients from controls. Via ROC analysis, the currently used PCa biomarker, PSA, has a reported AUC value of 0.678 for distinguishing PCa from no cancer [16]. The seven microRNAs identified in our study exhibited a respectable range of AUC values from 0.7538 for miR-127-3p up to 0.9462 for miR-329-3p, all significantly better than that reported for PSA, with p-values ranging from 1.9435 × 10 −6 to 0.0094 (Figure 5a-g).

Comparison of Blood Results to TCGA Database
The expression of our panel of blood microRNAs was compared to expression levels in tumor tissue by our analysis of data in The Cancer Genome Atlas (TCGA) database. Although the TCGA microRNA sequencing data were annotated with the stem-loop transcripts instead of the mature strands, all seven microRNAs from our study derive from the major expressed mature strand of their stem-loop precursor based on data in miRBase [17]. Therefore, the expression of these seven mature microRNAs is directly proportional to the abundance of their stem-loop precursors. The mature miR-127-3p, miR-204-5p, miR-487b-3p, miR-32-5p, miR-20a-5p and miR-454-3p are derived from precursors miR-127, miR-204, miR-487b, miR-32, miR-20a and miR-454, respectively. The mature miR-329-3p was derived from two precursors, miR-329-1 and miR-329-2.
The expression of each precursor in PCa tissues compared to their disease-free matched margins showed significant dysregulation, and the direction of dysregulation agreed with the literature results ( Figure 6). However, the pattern of dysregulation for each microRNA in tumor tissue was opposite to that pattern observed in our blood samples. For example, miR-127, miR-204, miR-329-1, miR-329-2 and miR-487b were all upregulated in PCa tissue, which suggested that their major mature strands (miR-127-3p, miR-204-5p, miR-329-3p and miR-487b-3p) were also upregulated. However, these four microRNAs were shown to be downregulated in our blood samples. The inverse correlation was observed for the three microRNAs (miR-32-5p, miR-20a-5p and miR-454-3p) that are downregulated in blood. Again, their precursor transcripts and presumably major, mature microRNA products were upregulated in the TCGA tissue data. A comparison of the fold changes between our HTS blood data versus that from the TCGA database are included in the Appendix (Table A5).

Comparison of Blood Results to TCGA Database
The expression of our panel of blood microRNAs was compared to expression levels in tumor tissue by our analysis of data in The Cancer Genome Atlas (TCGA) database. Although the TCGA microRNA sequencing data were annotated with the stem-loop transcripts instead of the mature strands, all seven microRNAs from our study derive from the major expressed mature strand of their stem-loop precursor based on data in miRBase [17]. Therefore, the expression of these seven mature microRNAs is directly proportional to the abundance of their stem-loop precursors. The mature miR-127-3p, miR-204-5p, miR-487b-3p, miR-32-5p, miR-20a-5p and miR-454-3p are derived from precursors miR-127, miR-204, miR-487b, miR-32, miR-20a and miR-454, respectively. The mature miR-329-3p was derived from two precursors, miR-329-1 and miR-329-2.
The expression of each precursor in PCa tissues compared to their disease-free matched margins showed significant dysregulation, and the direction of dysregulation agreed with the literature results ( Figure 6). However, the pattern of dysregulation for each microRNA in tumor tissue was opposite to that pattern observed in our blood samples. For example, miR-127, miR-204, miR-329-1, miR-329-2 and miR-487b were all upregulated in PCa tissue, which suggested that their major mature strands (miR-127-3p, miR-204-5p, miR-329-3p and miR-487b-3p) were also upregulated. However, these four microRNAs were shown to be downregulated in our blood samples. The inverse correlation was observed for the three microRNAs (miR-32-5p, miR-20a-5p and miR-454-3p) that are downregulated in blood. Again, their precursor transcripts and presumably major, mature microRNA products were upregulated in the TCGA tissue data. A comparison of the fold changes between our HTS blood data versus that from the TCGA database are included in the Appendix A (Table A5).

Comparison of Blood Results to TCGA Database
The expression of our panel of blood microRNAs was compared to expression levels in tumor tissue by our analysis of data in The Cancer Genome Atlas (TCGA) database. Although the TCGA microRNA sequencing data were annotated with the stem-loop transcripts instead of the mature strands, all seven microRNAs from our study derive from the major expressed mature strand of their stem-loop precursor based on data in miRBase [17]. Therefore, the expression of these seven mature microRNAs is directly proportional to the abundance of their stem-loop precursors. The mature miR-127-3p, miR-204-5p, miR-487b-3p, miR-32-5p, miR-20a-5p and miR-454-3p are derived from precursors miR-127, miR-204, miR-487b, miR-32, miR-20a and miR-454, respectively. The mature miR-329-3p was derived from two precursors, miR-329-1 and miR-329-2.
The expression of each precursor in PCa tissues compared to their disease-free matched margins showed significant dysregulation, and the direction of dysregulation agreed with the literature results ( Figure 6). However, the pattern of dysregulation for each microRNA in tumor tissue was opposite to that pattern observed in our blood samples. For example, miR-127, miR-204, miR-329-1, miR-329-2 and miR-487b were all upregulated in PCa tissue, which suggested that their major mature strands (miR-127-3p, miR-204-5p, miR-329-3p and miR-487b-3p) were also upregulated. However, these four microRNAs were shown to be downregulated in our blood samples. The inverse correlation was observed for the three microRNAs (miR-32-5p, miR-20a-5p and miR-454-3p) that are downregulated in blood. Again, their precursor transcripts and presumably major, mature microRNA products were upregulated in the TCGA tissue data. A comparison of the fold changes between our HTS blood data versus that from the TCGA database are included in the Appendix (Table A5). . An analysis of microRNA sequencing results from the TGCA matched tissue database for the seven microRNA candidates. (a-h) Reads for each microRNA candidate as indicated were normalized using the Edge R TMM method and plotted as a dot blot with a line (bottom to top, respectively) representing the minimum, median (or mean) and maximum value for the tumor versus the disease-free matched tissue (free-margin) from PCa patients. A p-value was obtained using the Mann-Whitney nonparametric test assuming that that data do not follow a Gaussian distribution. A p < 0.05 was considered significant. TCGA, The Cancer Genome Atlas.

Discussion
Analysis of HTS sequencing results suggested a panel of seven microRNAs that could be useful in diagnosing PCa in blood. Previously, the lack of reliable microRNA standards for normalization across different samples had been detrimental to subsequent qRT-PCR validation studies. In some studies, snRNAs have been used for this purpose, but since these are not normally secreted and are not produced by pathways that correlate with microRNA synthesis, their use as normalizers for complex body fluids such as blood is questionable. A review of our HTS data selected four representing the minimum, median (or mean) and maximum value for the tumor versus the disease-free matched tissue (free-margin) from PCa patients. A p-value was obtained using the Mann-Whitney nonparametric test assuming that that data do not follow a Gaussian distribution. A p < 0.05 was considered significant. TCGA, The Cancer Genome Atlas.

Discussion
Analysis of HTS sequencing results suggested a panel of seven microRNAs that could be useful in diagnosing PCa in blood. Previously, the lack of reliable microRNA standards for normalization across different samples had been detrimental to subsequent qRT-PCR validation studies. In some studies, snRNAs have been used for this purpose, but since these are not normally secreted and are not produced by pathways that correlate with microRNA synthesis, their use as normalizers for complex body fluids such as blood is questionable. A review of our HTS data selected four microRNAs (miR-197-3p or -5p, miR-1538, miR-1468-3p and miR-146a-5p) that were consistently expressed across all patient and control samples and could be used as reliable normalization standards for future qRT-PCR studies. Kirschner et al. also found miR-146a to be stably expressed in plasma and serum, not affected by hemolysis, in agreement with our results in blood [21]. Moreover, the enhanced geometric mean of these microRNAs was shown to be significantly better than any single microRNA alone.
With a proven group of normalization standards, it was important to confirm HTS results via qRT-PCR. Significantly, results from these two very different methodologies agreed well, further supporting the validity of our approach. All of the microRNAs with low pand FDR-values via HTS data showed significant p-values with qRT-PCR and notable AUC values upon ROC analysis. This agreement was encouraging because thus far, investigators had been determining diagnostic microRNAs by screening of pre-selected microRNA arrays, which represented only a small subset of microRNAs from the database of >2580 total microRNAs [17]. This approach is limited to analyzing only those microRNAs that have already been shown to be dysregulated in some disease and then selected for analyzing PCa. However, HTS permits the identification of all possible diagnostic miRNAs, both known and perhaps novel, expanding the spectrum of microRNA candidates evaluated. Interestingly, a few novel microRNA species were identified, but these always turned out to be a single report; thus, their relevance as a "new" molecule warranting further verification was hard to justify in this pilot study due to their low abundance. More importantly, HTS results were validated by qRT-PCR for all seven candidates generating ROC curves with individual AUC values better than PSA (AUC = 0.678), the current gold standard for diagnosing PCa [16].
Another value of our suggested panel is that four microRNAs are upregulated in blood, whereas three are downregulated. This result means that each group can serve as an additional internal control for each other, thereby further serving to verify the accuracy of results, i.e., they do not all go up or all go down. Constructing a diagnostic panel with only downregulated microRNAs is always hard to justify; however, by pairing the loss of three microRNAs with an increase in the other four allows for greater diagnostic confidence.
In some previous studies, PCa samples were pooled in order to obtain sufficient material for subsequent analysis [22,23]. This approach prevents an analysis of variability across individual samples and blocks any correlation to the stage of disease when Gleason scores are available. Not only is it important to diagnose PCa, but eventually to identify biomarkers that could serve to stage disease and, more importantly, discern indolent from aggressive disease, thereby impacting subsequent treatment options. Thus, the fact that valid HTS data could be acquired from individual samples without requiring pooling might enable a correlation between microRNA and tumor stage in future studies. Samples analyzed here were predominately from lower Gleason-scored patients (Table 1: 9 or 11 patients with a G6 or G7 score respectively with only 2 samples scored as G8 or G9). Thus, our panel is more diagnostic for early detection and, if these patients could be followed, might elucidate microRNAs useful for separating indolent from more aggressive disease. In any case, more patient samples are needed particularly with higher Gleason scores to determine microRNAs that could identify later stages of disease as proposed for miR-1290 and miR-375 in CRPC [15].

Literature Review of Diagnostic Panel of Dysregulated MicroRNAs in Cancer
A brief review of these diagnostic seven microRNAs, their Chromosomal (Chr) location and known targets was carried out to determine if their dysregulation might support a functional role in prostate tumorigenesis.
miR-329-3p (Chr14) is part of an extensive microRNA cluster containing over 40 microRNAs. Yang et al. found miR-329-3p to be downregulated in metastatic, neuroblastoma tumor tissue compared to the primary tumor [30]. One promising target for miR-329-3p is KDM1A, which has been shown to be significantly upregulated in the androgen-dependent LnCaP prostate cell line [30,31]. Upon depletion of KDMA1 using siRNA, VEGF-A expression was also decreased, which in turn blocked androgen-induced VEGF-A, PSA and Tmprss2 expression, suggesting a role for miR-329-3p as a tumor suppressor.
miR-487-3p (Chr 14 within the same microRNA cluster as miR-329) has been found to be downregulated in neuroblastomas and in PCa [32,33]. Moreover, 10 microRNAs from this cluster were found to be significantly downregulated in PCa as Gleason scores increased, thereby playing an important role in regulating proliferation, apoptosis, migration and invasion in metastatic PCa cells [32]. An interesting predicted target for miR-487b-3p is ALDH1A3, aldehyde dehydrogenase 1A3, an enzyme known to be upregulated four-fold in the LnCaP PCa cell lines [34] when exposed to the androgen Dihydrotestosterone (DHT).
3.1.2. miR-32-5p, miR-20a-5p and miR-454-3p as OncomiRs miR-32-5p (Chr 9) has been found to be an androgen-regulated microRNA that targets BTG2 [35]. Its overexpression has been shown to block apoptosis and promote PCa in CRPC. Furthermore, this microRNA was discovered to be regulated by DHT and displays putative upstream androgen receptor-binding sites (ARBS). miR-20a-5p (Chr 13) is part of the miR-17-92 cluster, which plays an important role in cell cycle progression, proliferation, apoptosis and other cellular processes [36]. One of the most studied targets of miR-20a is the E2F family, particularly E2F2 and E2F3 [37]. The overexpression of miR-20a-5p in the PC3 PCa cell line was shown to regulate the cell cycle via targeting of E2F2 and E2F3 mRNAs [36]. In addition, this microRNA also targets several cyclin-dependent kinases, including p21 and p57, which halt cell cycle progression. Finally, another notable target is FasI, which promotes cell death [37]. Thus, the major targets of miR-20a-5p promote tumorigenesis and angiogenesis by blocking cell cycle checkpoints [36]. miR-454-3p is located on Chr 17 in the first intronic region of its host gene SKA2 (Spindle and Kinetochore-Associated Complex Subunit 2). SKA2 is essential for proper chromosome segregation. During the cell cycle, both SKA2 and miR-454-3p have been shown to be upregulated. miR-454-3p targets the tumor suppressor gene, BTG1 (B cell Translocation Gene 1), which plays an important role in cell cycle progression and is involved in the stress response [38]. This anti-proliferative gene is expressed at its highest concentration during the G0/G1 phases of the cell cycle and is then downregulated when the cell progresses through the G1 phase. In renal carcinoma cells, an increase in miR-454-3p displayed a marked decrease in BTG1 via a direct interaction with the 3 -UTR of BTG1 mRNA [38].

Summary of the Literature Review for Targets of Panel MicroRNAs
A summary of these results is shown in Table 3. The four upregulated microRNAs in patient blood (miR-127-3p, miR-329-3p, miR-487b-3p and miR-204-5p) cumulatively target BCL6, TrkB, KDM1A and ALDH1A3, all of which have been shown to be important regulators in PCa [24][25][26][27][28][29][30][31][32][33][34]. Since these proteins exert oncogenic effects in prostate tissue, their regulators are viewed as tumor suppressors, the loss of which could contribute to tumorigenesis. On the other hand, the three downregulated microRNAs in patient blood (miR-20a-5p, miR-32-5p and miR-454-3p) have been shown to target the tumor suppressor proteins E2F2/3, BTG2 and BTG1, respectively [35][36][37][38]. Although they were downregulated in our patient blood samples, they have been shown to be oncomiRs in tumor tissue, the retention of which could promote tumor progression. A review of this literature supports how these microRNAs could play a role in PCa progression. Interestingly, three of the microRNAs in our panel belong to the same mega cluster on Chr 14. A post-review of our data did note differential expression for several additional members from this cluster. However, due to their low abundance, slightly higher FDR values and limited budget, they were not included in subsequent qRT-PCR confirmatory studies. Analysis of the HTS data showed that five mega cluster members (miR-654-5p, miR 654-3p, miR-493-3p, miR-493-5p and 433-5p) were present in the top 50 dysregulated microRNAs ranking 17th-59th from the top (Appendix A Table A6). A review of the TGCA data showed that all were downregulated to different degrees in tumor tissue, fitting with their loss as tumor suppressors. Interestingly, one of these microRNAs, miR-433-3p, has been shown to target CREB (cAMP Response Element Binding protein), a nuclear transcription factor shown to be involved in tumor initiation, progression and metastasis [39]. Sun et al. showed that overexpression of miR-433-3p could counteract the effects of CREB. Studies have shown that the microRNAs in this Chr 14 cluster are downregulated through unknown mechanisms. If increases in these microRNAs are found in blood, it is possible to hypothesize that the expression of these microRNAs is not just being turned off at the transcriptional level, but that they are being shuttled out of the tumor cell and into the blood as a survival and growth mechanism for the developing tumor.

Relevance of Comparing Blood HTS and qRT-PCR Data to the TCGA Database
An unanticipated discovery from this study was the inverse relationship between blood and tumor microRNA expression levels (Figures 2 and 4 compared to Figure 6). Since all of our blood microRNAs displayed an inverse expression level with our analysis of data from the TCGA database, we propose that it is possible that tumors are retaining oncomiRs for the purpose of driving tumorigenesis and angiogenesis, and therefore, less of these oncomiRs are released into the blood (Figure 7). Conversely, tumor suppressors block tumor growth and may need to be disposed of to enhance tumorigenesis and ultimately metastasis; hence, the increase in blood levels of these microRNAs. If this were the case for only one or two members of our diagnostic panel of seven microRNAs, perhaps not, but the recurrence for all seven candidates lends credibility to this hypothesis. Moreover, a review of their known targets and the roles they could play in PCa further supports this idea (Table 3). It has been shown that cancer cells secrete vesicles containing not only mature microRNAs to modify their environment for future metastasis, but the entire processing machinery (dicer, RISC with premiR) to ensure that once taken up by the target cell, the microRNA is efficiently processed and actively moved into the translational silencing mechanism of target mRNAs [40]. It is proposed that if only the mature microRNA were delivered to a secondary site, it might not be as efficient in modifying translation within the target cell. Thus, the preferential cellular export of certain microRNAs as "hormomirs" may function to modulate gene expression at secondary sites, thereby affecting disease pathology [41,42]. With this in mind, it is not much of a stretch to propose a developing tumor wants to dispose of compromising microRNAs that could restrict its growth; thus, excluding tumor suppressor microRNAs. Concomitantly, holding onto an oncomiR to quickly modulate the proteome is fast and efficient, faster than modifying gene expression at the transcriptional level. In support of this hypothesis, Selth et al. found a similar inverse relationship between the loss of miR-146b-3p expression in the prostate tumor with a concomitant increase in circulation [41]. Conversely, in the same study, a direct correlation between increased expression of miR-194 in the tumor and in circulation was noted, suggesting that microRNAs may vary in their expression and patterns of secretion. Since overexpression of miR-194 blocks cell proliferation, induces apoptosis, caspase-3/-9 activities and p53/p21 signaling while suppressing PI3K/AKT/FoxO3a signaling, it is difficult to understand how a tumor could tolerate an increase in the expression of this microRNA, which was not discussed [43]. Another study found miR-194 to be decreased in prostate tumors, befitting its function as a tumor suppressor [44]. On the other hand, miR-1 and miR-133a have been shown to be increased in serum in response to acute myocardial infarction where the levels of both are reduced in the infarcted myocardial tissue [45]. Thus, some evidence for an inverse correlation between tissue expression and microRNAs in circulation exists, but at this time, additional studies in PCa with more patient samples and expansion to other cancers and disease states need to be completed to determine the overall merits of this hypothesis.

Comparison of HTS Data to Screens of MicroRNA Panels and qRT-PCR Analysis
To date, the elucidation of microRNAs to identify PCa has been mostly generated from screens of preformed microRNA panels or qRT-PCR assays for microRNAs already shown to be involved in cancer. In both cases, the decision as to what microRNAs should be surveyed has already been made rather than using a technology like HTS, which uses no preselection and permits the identification of any potential microRNA. Cochetti et al. chose 23 microRNAs from an in silico survey of predicted target genes to analyze serum from PCa patients [10]. A review of a variety of such studies using serum or plasma has suggested miR-141, -21, -200b, -375, -221, -26a, -195, -15b, -16, -19b, -24, -451 or let-7i as biomarkers to distinguish PCa patients from healthy individuals [5,8,14,41,42]. Some of these microRNAs have also been proposed as biomarkers for other cancers not being unique to PCa (miR-141, -21, -16, -451) or are heavily influenced by hemolysis (miR-15b, -16, -451), making their It has been shown that cancer cells secrete vesicles containing not only mature microRNAs to modify their environment for future metastasis, but the entire processing machinery (dicer, RISC with premiR) to ensure that once taken up by the target cell, the microRNA is efficiently processed and actively moved into the translational silencing mechanism of target mRNAs [40]. It is proposed that if only the mature microRNA were delivered to a secondary site, it might not be as efficient in modifying translation within the target cell. Thus, the preferential cellular export of certain microRNAs as "hormomirs" may function to modulate gene expression at secondary sites, thereby affecting disease pathology [41,42]. With this in mind, it is not much of a stretch to propose a developing tumor wants to dispose of compromising microRNAs that could restrict its growth; thus, excluding tumor suppressor microRNAs. Concomitantly, holding onto an oncomiR to quickly modulate the proteome is fast and efficient, faster than modifying gene expression at the transcriptional level. In support of this hypothesis, Selth et al. found a similar inverse relationship between the loss of miR-146b-3p expression in the prostate tumor with a concomitant increase in circulation [41]. Conversely, in the same study, a direct correlation between increased expression of miR-194 in the tumor and in circulation was noted, suggesting that microRNAs may vary in their expression and patterns of secretion. Since overexpression of miR-194 blocks cell proliferation, induces apoptosis, caspase-3/-9 activities and p53/p21 signaling while suppressing PI3K/AKT/FoxO3a signaling, it is difficult to understand how a tumor could tolerate an increase in the expression of this microRNA, which was not discussed [43]. Another study found miR-194 to be decreased in prostate tumors, befitting its function as a tumor suppressor [44]. On the other hand, miR-1 and miR-133a have been shown to be increased in serum in response to acute myocardial infarction where the levels of both are reduced in the infarcted myocardial tissue [45]. Thus, some evidence for an inverse correlation between tissue expression and microRNAs in circulation exists, but at this time, additional studies in PCa with more patient samples and expansion to other cancers and disease states need to be completed to determine the overall merits of this hypothesis.

Comparison of HTS Data to Screens of MicroRNA Panels and qRT-PCR Analysis
To date, the elucidation of microRNAs to identify PCa has been mostly generated from screens of preformed microRNA panels or qRT-PCR assays for microRNAs already shown to be involved in cancer. In both cases, the decision as to what microRNAs should be surveyed has already been made rather than using a technology like HTS, which uses no preselection and permits the identification of any potential microRNA. Cochetti et al. chose 23 microRNAs from an in silico survey of predicted target genes to analyze serum from PCa patients [10]. A review of a variety of such studies using serum or plasma has suggested miR-141, -21, -200b, -375, -221, -26a, -195, -15b, -16, -19b, -24, -451 or let-7i as biomarkers to distinguish PCa patients from healthy individuals [5,8,14,41,42]. Some of these microRNAs have also been proposed as biomarkers for other cancers not being unique to PCa (miR-141, -21, -16, -451) or are heavily influenced by hemolysis (miR-15b, -16, -451), making their utility for PCa diagnosis debatable [8,42]. Interestingly, we did not find any of these microRNAs to be significantly altered in our HTS data. In part, this could be due to differences in analyzing serum or plasma versus blood, a very different source of body fluid, as well as the use of HTS data as the starting point for investigation.
HTS has been applied to a very limited number of studies for identifying microRNAs diagnostic for PCa. Szczyrba et al. looked at microRNA profiles of prostate carcinoma compared to normal tissue and cell lines [14]. Here, the loss of miR-143 and miR-145 targeting myosin VI (MYO6) was suggested as a diagnostic marker for prostate carcinoma. However, we did not find these two microRNAs to be dysregulated in blood. Of more significance to our study was the reported HTS data of blood exosomal material isolated from CRPC patients [15]. Here, as well, a group of normalizing RNAs (miR-301a/e-5p, miR-99a-5p, let-7c, miR-125a-5p, miR-16-5 and RNU6B) was proposed for subsequent qRT-PCR analysis. Since RNU6B is not a secreted microRNA, it is doubtful that it should have been included in this analysis, but discounting this snRNA, the rest would be useful normalizers for exosomal material. Interestingly, we found a completely different group of normalizing microRNAs with no overlap of this group, supporting how different an exosomal pool might be to the blood samples analyzed here. Upon normalization, these authors proposed miR-1290 and miR-375 as prognostic markers for CRPC. Since this was only for CRPC patients, perhaps it is not surprising that we did not find these two microRNAs in our analysis, since we are not focused on CRPC. Perhaps our panel of seven microRNAs would be better suited for identifying early stages of prostate cancer, rather than this later stage. Again, additional studies for CRPC versus patients that have not progressed to later stage disease are needed to clarify this hypothesis.

Limitations to This Pilot Study
This study was meant as a pilot study and, as such, suffers from some limitations, which should be addressed. First, to obtain sufficient material for HTS analysis, blood was used as the initial source of small RNAs. Thus, small RNAs will be contaminated with cellular microRNAs, not just circulating microRNAs. However, circulating microRNAs can come in many forms as exosomes, microvesicles, apoptotic bodies or bound to HDL, argonaute 2 or RNA-binding proteins, such as nucleophosmin 1 [42]. At this time, it is not clear which of these forms or combinations thereof would be the most diagnostic as biomarkers for PCa. Thus, focusing on some particles (exosomes or microvesicles) at the exclusion of the others might not be the most relevant source. Isolation of small RNAs from whole blood rather than purification of a subset of these particles seemed like a more inclusive starting point. More importantly, it was assumed that control samples will contain the same contaminates as patient samples, and thus, contaminating cellular microRNAs should cancel out and not be found amongst the dysregulated microRNAs, if samples are handled consistently. In support of this premise, it was reassuring that microRNAs known to reflect WBC (miRs-15-and -230), RBCs affected by hemolysis (miR-16-485-3p, -532-3p, -15b, -16 and -451), RBCs not-affected by hemolysis (miR-1274b, -142-3p and 146a), myeloid (miR-7a, -223, -197 and 574-3p) or lymphoid (miR-150) cells were not found in our panel of dysregulated microRNAs [21,41,42,46]. Our final panel of seven microRNAs was unique amongst those proposed in the literature, perhaps due to the fact that they were compiled from HTS data rather than screens of predetermined miR array panels or primer sets. Second, PSA values were not reported for control samples, making it impossible to obtain an AUC value for this cohort. Thus, an AUC value for PSA was taken from the literature [16], and it could be higher for our control group with its younger age, a problem encountered by other such studies [42]. However, all seven panel members yielded individual AUC values considerably better than the reported value for PSA, which when used as a panel should be stronger than any single microRNA. In fact, in a review by Selth, it was proposed that "no single analyte is likely to achieve the desired level of diagnostic or prognostic accuracy for PCa . . . requiring a signature of multiple microRNAs rather than a single miR", as proposed here [42]. Third, validation of HTS results by qRT-PCR was on the same cohort of samples analyzed by HTS. In a future study, a third larger cohort should be evaluated independently to better validate panel members as relevant biomarkers. Finally, these samples did not span the spectrum of higher Gleason scores, but reflect earlier stages of PCa. Thus, at this time, they are not useful for staging or separating indolent from aggressive disease, an important future correlate. Nevertheless, it is proposed that despite these limitations, this initial pilot study does present new novel microRNAs that have not been previously suggested, which warrant inclusion in future studies sampling larger cohorts.
Here, we have proposed a panel of seven microRNAs generated from HTS data rather than pre-judged screens of microRNA arrays proposed from other cancers with unknown relevance to PCa. The same criticism exists for data generated by qRT-PCR studies, since again, only a subset of chosen total microRNAs is being investigated. We propose that HTS data confirmed by qRT-PCR analysis are a worthwhile approach for deducing biomarkers for PCa. Certainly, our ROC curves and AUC values appear superior compared to the current PSA gold standard [16]. As a group, the value of a panel of diagnostic microRNAs is substantial. However, additional studies with more extensive patient sampling are required to determine their future usefulness in not only identifying PCa, but ultimately staging prostate cancer and separating indolent versus aggressive disease. This will require sampling, preferably individually, of a vast number of patient and control samples, but our initial study certainly justifies the merits of future investigation.

Sample Extraction and HTS Sequencing
Whole blood samples from patients and controls were obtained from the Nelson Urology Clinic, VCU (Virginia Commonwealth University) Medical Center and Mcguire Veterans Hospital following approval by the ethics committee (IRB Panel D Approval #HM14344). All patients provided written consent. Blood samples were taken prior to treatment, radiotherapy or prostatectomy. In most cases, age, ethnicity, PSA values and Gleason scores from biopsies were provided (Table 1). Controls were carefully selected to not have any history of PCa either as an individual or within the family, and written consent was obtained. A complete analysis of information provided to us for each individual is included in Appendix A Table A1. Samples were collected in PAXgene blood tubes (PreAnalytiX, Qiagen/BD, Franklin Lakes, NJ, USA), which contain a manufacture's additive to stabilize RNA. The total RNA, including microRNAs in each sample, was extracted using a corresponding PAXgene blood miRNA kit following the manufacture's protocol, which removes DNA and results in the purification of pure RNA. The quality and concentration of small RNAs ranging from 10-40 nts were measured using the small RNA Chip Assay (Agilent) based on the manufacturer's instructions. A total number of 40 samples (12 controls and 28 patients) with a small RNA concentration >100 ng/µL was selected for HTS microRNA sequencing using the Illumina ® TruSeq Small RNA Library Preparation kit (New England Biolabs, Ipswich, MA, USA) and HiSeq 2500 system (Illumina, San Diego CA, USA) according to the manufacturer's protocol.

Bioinformatics Analysis
The raw deep sequencing data were processed using Flow ® v 3.0 (Partek Incorporated, St. Louis, MO, USA). The adapter sequence "AGATCGGAAGAGCACACGTCT" (TruSeq Adapter, Index 7), frequently detected from all reads, was removed from both the 5 -and 3 -ends. A second trimming was performed to further eliminate bases at both ends with a Phred quality score lower than the average 35, indicating a probability that every 1 in 5000 bases was incorrect; accuracy of 99.95%. The minimum read length detected by the program was changed from 25 to 16 nts in order to include all possible microRNA reads in a suitable range. The trimmed data were aligned to the human genome (GRCh38) with only 1 seed mismatch allowed. The three best alignments satisfying such criteria were reported for each read using Bowtie 1.0. The parameters used by Bowtie while performing the alignment were as follows: alignment mod = quality limit, seed mismatch limit = 1, seed length = 28 and quality limit = 70, both strand alignment and alignments reported per read = 3. The aligned reads were annotated with miRBase (mature 21, Version 2). The differential expression analysis was conducted on the annotated sequencing reads exported from Partek Flow ® using Edge R (Version 3.12) (Roswell Park Cancer Institute, Buffalo, NY, USA) [18,47]. The reads were normalized using the default "Trimmed Mean of M values" (TMM method) algorithm, which aims at minimizing the effect of sequencing depth and RNA composition [47].

Quantitative Real Time-Polymerase Chain Reaction
The search for microRNA candidates serving as good qRT-PCR endogenous controls was attempted using NormFinder software [48]. The microRNAs showing a relative high abundance and minimal intergroup variation suggested by the HTS data were selected for exploratory qRT-PCR on 16 samples (8 controls, 8 patients) in triplicate (Appendix A Table A2). Each RNA sample (3 ngs) was converted to cDNA (final volume 20 µL) using the qScript™ synthesis kit (Quanta Biosciences Inc., Gaithersburg, MD, USA) following the manufacturer's protocol and qPCR conducted as previously described [49]. Briefly, the cDNA was diluted with RNase-free water 1:1 (v/v), and 2 µL were used for each PCR reaction run in triplicate. Each PCR reaction was scaled down to 6.25 µL SYBR ® Green Master Mix, 0.25 µL primer, 4.0 µL RNase-free H 2 O for the purpose of saving reagents without compromising the results. qRT-PCR was conducted in an Applied Biosystems 7500 real-time PCR instrument (Life Technologies, Foster City, CA, USA) using the following conditions: 50 • C for 2 min, followed by 40 cycles at 95 • C for 15 s, 60 • C for 15 s and 70 • C for 30 s. Data were collected at 70 • C and analyzed using SDS software v1.3.1 (Life Technologies), using automatic threshold and baseline settings. Negative amplification controls and DNase-treated controls were routinely included for each microRNA and did not impact analysis. PCR efficiency for each microRNA primer set was tested and found to be within the acceptable range (80-110%). The Cq values of each candidate microRNA were imported into the NormFinder program, which generated stability values for each candidate after evaluating both intra-and inter-group variations. Lower stability values suggested higher consistency of a microRNA across different samples and groups. A combination of the four microRNAs with the lowest stability values showed a greater stability and consistency. Therefore, a normalization factor for each plate was determined by taking the geometric mean of the Cq values of the four microRNAs.
Next qRT-PCR was performed on dysregulated candidate microRNAs using 10 controls and 26 patients (raw and normalized values in Appendix A Tables A3 and A4, respectively) analyzed in triplicate (SD < 0.2). Unfortunately, material from two patient and control samples evaluated by HTS was found to be insufficient for confirmatory qRT-PCR analysis. For all other samples, the protocol was the same as described above. Raw Cq values were normalized by subtracting the geometric mean Cq value of the top four normalization candidates (miR-146a-5p, miR-1538, miR-197-3p and miR-1468-3p) from individual Cq values to generate dCq. A p-value was obtained using the Mann-Whitney nonparametric test assuming that data do not follow a Gaussian distribution. A p-value <0.05 was considered significant. The −ddCq value for each candidate was obtained by taking the mean of the normalized dCq for all controls minus the normalized dCq of each patient sample. The −ddCq values were equivalent to Log2 fold change, as the fold change was calculated by 2 −ddCq .

The Cancer Genome Atlas Tissue Data Analysis
Illumina HiSeq level 3 miRNA sequencing data of 50 prostate tumor tissue samples and their matched normal margins were selected and downloaded from The Cancer Genome Atlas database (TCGA database) for analysis. The reads were annotated with stem-loop transcripts of each microRNA [50]. The raw reads were normalized in the same way as described above using the Edge R software [18].

Statistical Analysis
Differential expression analysis was conducted on the normalized next generation sequencing reads for both blood samples and TCGA tissue data using EdgeR [18]. As the distribution of microRNA sequencing reads remains unclear, the dispersion of reads was estimated via the Cox-Reid profile adjusted likelihood method default in Edge R [51]. The reads matrix was fitted to a generalized linear model and a likelihood ratio test was performed on the fitted data. The p-value was adjusted to the number of comparisons (equal to total number of microRNAs detected in the sequencing) using the Benjamini-Hochberg method, which yields a False Discovery Rate (FDR) to minimize Type I error [52]. In order to maximize the screening results, a FDR value smaller than 0.2 was considered significant in this experiment.
For dysregulated microRNA PCR results, the normalized dCq values of each candidate between control and patient groups were compared using the Mann-Whitney nonparametric test assuming that the data do not follow a Gaussian distribution on Prism (GraphPad Software Inc., Version 6, 2015). A p-value lower than 0.05 was considered significant. ROC curves were generated for candidate microRNAs showing a statistically-significant difference between two groups [20]. The ROC was obtained by plotting sensitivity against specificity using the pROC package (Version 1.8). An area greater than 0.5 under the curve (AUC) suggests the diagnostic potential of each microRNA candidate.

Conclusions
In summary, we propose a group of four microRNAs (miR-146a-5p, miR-1538, miR197-3p and miR-1468-5p) that could be used as normalization standards for the comparative analysis of blood samples at least for PCa and perhaps other cancers, as well. In addition, a panel of seven microRNAs (miR-127-3p, miR-204-5p, miR-329-3p, miR-487b-3p, miR-32-5p, miR-20a-5p and miR-454-3p) might be useful for diagnosing PCa dependent on further validation. Individual members of this panel display better diagnostic capabilities than PSA alone and as a group are superior.    Table A6. Chromosome 14 q32.31 dysregulated microRNAs in our analysis of blood samples compared to data from the TCGA tissue database. In addition to our proposed panel members (miR-329-3p, miR-487b-3p and miR-127-3p), subsequent analysis of HTS data uncovered five other microRNAs from this locus (miR-654-5p, miR-654-3p, miR-493-3p, miR-493-5p and miR-433-3p) to be upregulated in blood. Since these microRNAs were not within the top ten potential candidates, they were not carried forth for qRT-PCR confirmation. However, analysis of the TCGA database did confirm these microRNAs to be tumor suppressors lost in tumor tissue compared to matched tumor-free margins, fitting with our model that tumors may strive to get rid of tumor suppressors in order to progress. Fold change is shown in a Log2 scale and was calculated with the same method as panel members using the edge R program.