The Potential of MicroRNAs as Non-Invasive Prostate Cancer Biomarkers: A Systematic Literature Review Based on a Machine Learning Approach

Simple Summary: Prostate cancer (PCa) is the most common cancer in men worldwide. Screening and diagnosis are based on prostate-specific antigen (PSA) blood testing and digital rectal examination. Nevertheless, these methods are not specific and carry a high risk of erroneous results. This has led to overtreatment and unnecessary radical therapy; thus, better prognostic tools are urgently needed. In this view, microRNAs (miRs) appear as potential non-invasive biomarkers for PCa diagnosis, prognosis, and therapy. As the scientific literature available in this field is huge and very often controversial, we identified and discussed three topics that characterize the investigated research area by combining the big data from the literature with a novel machine learning approach. By analyzing the papers clustered into these topics, we offer a deeper understanding of current research, which contributes to the advancement of this research field.

Abstract: Background: Prostate cancer (PCa) is the second leading cause of cancer-related deaths in men. Although the prostate-specific antigen (PSA) test is used in clinical practice for screening and/or early detection of PCa, it is not specific, thus resulting in high false-positive rates. MicroRNAs (miRs) provide an opportunity as biomarkers for diagnosis, prognosis, and recurrence of PCa. Because the literature on this subject is growing and often controversial, this study aims to consolidate the state of the art of relevant published research. Methods: A Systematic Literature Review (SLR) approach was applied to analyze a set of 213 scientific publications through a text mining method that makes use of the Latent Dirichlet Allocation (LDA) algorithm. Results and Conclusions: The result of this activity, performed through the MySLR digital platform, allowed us to identify a set of three relevant topics characterizing the investigated research area. We analyzed and discussed all the papers clustered into them. We highlighted that several miRs are associated with PCa progression, and that their detection in patients’ urine seems to be the most reliable and promising non-invasive tool for PCa diagnosis. Finally, we proposed some future research directions to help future scientists advance the field further.


Introduction
Prostate cancer (PCa) is the most commonly diagnosed cancer and the second leading cause of cancer death in men in the developed world [1], with a mortality rate expected to approximately double over the next 20 years [2]. Prostate cancer can be clinically insignificant (low-risk and localized to the prostate) or significant; in the latter case, it is a potentially metastatic and aggressive tumor that requires early detection and is lethal if untreated. The prostate-specific antigen (PSA) test is used as a screening biomarker of PCa, but alone it is not indicative of the disease; therefore, digital rectal examination is also required. The diagnosis of PCa is based only on histopathological analysis of prostate biopsies. Due to the widespread use of the PSA test for PCa detection, the incidence of PCa rapidly increased, even though mortality remained stable. The main reason is that, although this test is still a gold standard, it is not a specific biomarker and is not very helpful in distinguishing aggressive from non-aggressive disease, thus resulting in a high number of false positives; it also fails to detect indolent disease. This has led to overtreatment with radical therapy, resulting in a dramatic impact on men and their quality of life. Distinguishing the aggressive and lethal tumoral form from the indolent one is, therefore, extremely relevant to limit overtreatment and improve patient outcome. This highlights the urgent need for more specific and sensitive diagnostic and prognostic tools.
MicroRNAs (miRs) are small non-coding RNAs that modulate gene expression and play significant roles in almost all biological pathways, influencing cancer-relevant processes, such as proliferation, apoptosis, cell cycle, invasion and migration [3]. Profiling of miRs in human cancer has generated great interest, and several studies have described their critical role in PCa pathogenesis [4,5]. More interesting is their potential use as biomarkers for the early detection, diagnosis, and prognosis of cancer. Indeed, miRs are actively released by different cell types and detectable in all human bio-fluids, especially in plasma, serum, and urine, making them suitable as circulating biomarkers for PCa [6,7]. As a considerable number of papers on both miRs and PCa are available in the literature, a systematic review approach was required to carry out a deep and comprehensive investigation of the whole literature [8]. The systematic review approach is used when the number of contributions to be analyzed is huge and, as in this case, heterogeneous and often controversial. Several studies, for example, reported that upregulated expression of miR-200 family members in PCa facilitates oncogenic activity and promotes metastasis [9], despite the prevailing opinion that under-expression of the miR-200 family promotes EMT and metastasis [10,11].
Machine learning has transformed oncological research in recent years. For instance, it has been used to classify tissue samples as benign or malignant, or for the early and automatic detection of cancer from whole slide images [12]. The technical reason for the massive adoption of machine learning in medicine lies in the rapid progress of classification models, which supports the use of these techniques in many tools, such as image-based ones [13].
This study is the first of its kind in medicine, as the machine learning classification approach has been used to locate existing studies, select and evaluate quality contributions, analyze and synthesize results and, finally, report results that can highlight clear conclusions about what is known and what is not yet known [14]. According to Petticrew [15] "a systematic review is an efficient technique for hypothesis testing, for summarizing the results of existing studies, and for assessing the consistency among previous studies". However, the approach we used is not the traditional one used in medicine, as exemplified by the Cochrane Collaboration [16], but an innovative qualitative approach that provides a comprehensive and integrated analysis of all articles published in prestigious scientific journals [17]. Consequently, we performed a Systematic Literature Review (SLR) using the MySLR digital platform to analyze a large number of scientific publications through a text mining method, which makes use of the Latent Dirichlet Allocation (LDA) algorithm.
Therefore, the main aim of the present study was to consolidate the state of the art of the research published over the last 15 years on the prognostic significance of miR expression in PCa, and on the potential use of miRs as alternative non-invasive biomarkers. This has produced a systematic mapping of the findings and knowledge gaps present in existing research, thus providing useful insights that can contribute to the development of this research field and suggesting promising directions for future research. This study represents the first attempt in which a text mining approach was applied to a sample of original scientific articles in a medical setting.

Methods
We adopted a machine learning approach to deeply analyze a large number of papers present in the scientific literature to extract latent knowledge useful for the aim of this research.
We did not contrast studies conducted with the same medical protocol, as in traditional SLR in medicine. The original research protocol adopted in this study considered scientific papers that, regardless of the adopted medical protocol, somehow dealt with potential connections between miRs and PCa.
Although traditional algorithms are developed around numerical and structured data, the information generated in the scientific literature consists of documents (papers) that are generally unstructured. Consequently, the LDA algorithm was chosen to extend machine learning applications to the extraction of information from unstructured textual data, i.e., scientific journal articles [18]. By implementing the LDA algorithm, the MySLR platform reproduces, as closely as possible, the behavior of a "human-like intelligence": it can process a large amount of data, read the texts, understand their content, extract the required information, and highlight hidden connections among papers. In more detail, this approach is based on the creation of a model that, by analyzing texts, is able to autonomously identify a set of "topics" (or themes) within them, to identify the topic addressed by each text, and subsequently to recognize the presence of the identified topics within the various papers.
Therefore, we carried out an SLR to offer a complete and exhaustive overview of scientific research on the potential value of miRs as non-invasive biomarkers for PCa. This was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [19]. In order to perform this activity, we adopted the MySLR digital platform [17], a semi-automated tool supporting scientists in performing SLRs, which is available at https://myslr.unical.it (accessed on 20 May 2022) upon registration.
The methodological approach is based on three steps, namely: paper location and selection, paper analysis, and results presentation, according to Denyer and Tranfield [14], as discussed below.

Paper Location and Selection
We searched the PubMed, Scopus, and ISI Web of Knowledge online databases up to 20 May 2022, to identify relevant studies published between 2007 (year in which the first paper in the investigated field was published) and 2022. The terms associated with the keywords were: ("microRNA" OR "miRNA" OR "miR") AND ("prostate cancer" OR "prostate carcinoma" OR "prostate tumour" OR "prostate tumor" OR "prostate neoplasm") AND "biomarkers" AND "prognosis". The search string was structured in such a way that the results contained papers with at least one term from each set in the title, abstract and keywords.
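As a hedged illustration, the boolean structure of the search string can be assembled and applied programmatically. The helper names below (`or_group`, `build_query`, `matches`) are hypothetical, and the field syntax actually required by PubMed, Scopus, and ISI Web of Knowledge differs from database to database, so this minimal sketch reproduces only the generic AND/OR logic described above.

```python
# Term sets copied from the search string reported in the text.
MIR_TERMS = ["microRNA", "miRNA", "miR"]
PCA_TERMS = ["prostate cancer", "prostate carcinoma", "prostate tumour",
             "prostate tumor", "prostate neoplasm"]
REQUIRED_TERMS = ["biomarkers", "prognosis"]


def or_group(terms):
    """Join quoted terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query():
    """Hypothetical helper: assemble the generic boolean search string."""
    parts = [or_group(MIR_TERMS), or_group(PCA_TERMS)]
    parts += [f'"{t}"' for t in REQUIRED_TERMS]
    return " AND ".join(parts)


def matches(text):
    """Hypothetical check: True if the text (title, abstract and keywords
    concatenated) contains at least one term from each set.
    Naive case-insensitive substring matching, for illustration only."""
    lowered = text.lower()
    groups = [MIR_TERMS, PCA_TERMS] + [[t] for t in REQUIRED_TERMS]
    return all(any(t.lower() in lowered for t in group) for group in groups)


print(build_query())
```

Substring matching is deliberately simplistic here; a real database query relies on the indexing and field tags of each search engine.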

Paper Analysis
At this stage, after removing duplicates, we examined papers (n = 618) to identify relations and common points among them.
Studies that met the following inclusion criteria were considered eligible:
i. The study was conducted on human cells, tissue or patients with PCa (not xenografts or other animal models).
ii. The study measured the expression of miRs in serum/plasma/urine or cells/tissues.
iii. The study investigated the association between prognosis outcomes and miR expression.
Studies were excluded if:
i. The study tested the prognostic role of target genes instead of the miR itself.
ii. The study involved other non-coding RNAs with as yet unknown functions, such as circular RNAs, long non-coding RNAs and small nucleolar RNAs.
iii. The clinical study lacked key information such as hazard ratio (HR), 95% confidence intervals (CI), p value, and survival curves.
iv. The study was a review, an editorial article, a meta-analysis, a letter to the editor, a short communication, a conference paper, an erratum, a book chapter, a note, a personal opinion and commentary, or a retracted publication.
We independently evaluated pertinent papers by examining titles, abstracts, and full texts matching the appropriate criteria. At the end, 213 journal articles were included in the final set of eligible documents for further topic extraction analysis.
To highlight the main research topics in the context of miRs as potential biomarkers in PCa, we applied a text mining method to the final set of 213 papers. This method is based on LDA, a statistical procedure that assigns each document a distribution over a certain number of topics. The model treats documents as probability distributions over topics and topics as probability distributions over words.
In Natural Language Processing, a topic model is a statistical model whose objective is to find the abstract "topics" (or themes) contained in a set of documents. The topics are not known a priori but are independently identified by the algorithm based on the frequency and number of occurrences of the words in the various texts. By exploiting statistics of this type, the algorithm was able to identify three main topics, each characterized by the keywords extracted by the LDA procedure from the various texts, and to correctly assign each text to its respective semantic topic. This procedure provides as output:
• k sets of relevant keywords (where each set represents a topic).
• The document-topic matrix, i.e., a matrix describing how much each paper is statistically related to a specific topic (namely, the topic proportion).
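To make the role of the topic proportions concrete, the following minimal sketch groups papers by their dominant topic. It rests on two assumptions that are not stated in the text: that each paper is assigned to the topic with the highest proportion, and that the proportions shown are representative. The paper identifiers and values are invented for illustration, and the function names are hypothetical.

```python
def dominant_topic(proportions):
    """Return the 1-based index of the topic with the largest proportion."""
    best = max(range(len(proportions)), key=lambda i: proportions[i])
    return best + 1


def cluster_papers(doc_topic):
    """Group paper identifiers by dominant topic (assumed assignment rule)."""
    clusters = {}
    for paper_id, proportions in doc_topic.items():
        clusters.setdefault(dominant_topic(proportions), []).append(paper_id)
    return clusters


# Made-up topic proportions for four hypothetical papers with k = 3 topics;
# each row sums to 1, as expected for a per-document topic distribution.
doc_topic = {
    "paper_A": [0.70, 0.20, 0.10],
    "paper_B": [0.15, 0.25, 0.60],
    "paper_C": [0.10, 0.80, 0.10],
    "paper_D": [0.05, 0.15, 0.80],
}

clusters = cluster_papers(doc_topic)
for topic in sorted(clusters):
    print(f"topic {topic}: {clusters[topic]}")
```

A hard assignment of this kind discards the information carried by the smaller proportions, so a paper that sits between two topics would still be counted under only one of them.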

Results Presentation
The last step of the methodological approach is elucidated in the sections "Results" and "Discussion". The aim of this step is to clearly describe and discuss the results of the LDA procedure by means of a detailed human-based review of significant papers gathered around the three topics.

Results
A total of 618 unique studies were retrieved from the initial literature search. Of these, 138 papers were removed because they were non-relevant publication types, such as reviews, book chapters, and meta-analyses. Full-text reading and analysis resulted in the removal of a further 267 studies for reasons such as inability to access the full text, unsatisfactory reporting of results, or failure to meet the above inclusion criteria. Ultimately, a total of 213 studies were considered eligible. The flowchart in Figure 1 illustrates the selection process for the final studies included in this systematic review.
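The selection counts reported in this paragraph can be checked with simple arithmetic. The short sketch below uses only the numbers given in the text; the variable names are chosen here for illustration.

```python
# Counts taken from the text; variable names are illustrative.
unique_records = 618          # unique studies after duplicate removal
removed_non_relevant = 138    # reviews, book chapters, meta-analyses, etc.
removed_after_full_text = 267 # no full text, poor reporting, criteria not met

remaining_after_screening = unique_records - removed_non_relevant
eligible = remaining_after_screening - removed_after_full_text

print(remaining_after_screening)  # 480
print(eligible)                   # 213
```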
As shown in Figure 2, the interest of the scientific community in the issue is evident. It is not surprising that this theme has been addressed by a growing number of original articles over time, especially since 2017. In fact, if we do not consider 2022, which is still ongoing, over 60% of the articles have been published in the last 5 years.
According to the indication provided in Blei [18], we set the number of topics to be extracted (k) to 3, which ensured a satisfactory value of topic coherence (−1.09) [20] together with easy interpretation of the results for a human reader. Thanks to the LDA procedure, we identified relevant keywords associated with each of the three topics. In Figure 3, a graphical representation of the most relevant keywords for each topic is provided in the form of a "word cloud". As shown in Figure 4, although the issue under investigation has been increasingly considered by researchers over recent years, the trend over time differs among the topics. First, topic 1, after a fairly slow start, has grown rapidly since 2017 and has recently gained increasing interest (Figure 4, blue). In contrast, topic 2 grew steadily until it peaked in 2018, after which interest decreased (Figure 4, green). A similar trend can be observed for topic 3, which reached its peak of interest around 2018-2019 (Figure 4, yellow). The three topics identified through the LDA algorithm are presented and discussed below. We then performed a human-based review of a subset of relevant papers to infer a meaningful description of each topic. Based on the main concepts of the papers, we developed the discussion starting from topic 3, followed by topic 2 and finally topic 1.

Topic 3-Study of miRs in Human PCa
Looking at the top-30 most relevant terms and their frequency within papers grouped around the selected topic (Figure 5), and then by analyzing the 91 papers clustered into this topic, it was evident that the cornerstone of the topic was the role of miRs in PCa progression and development.
The miRs can act as oncogenes (if upregulated in PCa) or tumor suppressors (if downregulated) and contribute to the development and progression of tumors, thus affecting the prognosis and survival of cancer patients. Most of these papers report results obtained from in vitro experiments that explored the function of candidate miRs. Authors identified miR signatures that were able to differentiate malignant PCa from benign prostate hyperplasia. The expression of miRs was determined mostly by qPCR. Furthermore, the identification of miR target genes and their pathways contributed significantly to a better knowledge of PCa. In this way, miRs influence multiple cancer-related processes, the main ones being cell growth and proliferation, apoptosis, migration, invasion, and metastasis, as well as epithelial to mesenchymal transition (EMT). The most relevant and representative papers clustered in the topic are summarized in Table 1, which lists deregulated miRs in PCa, their putative targets and main regulatory effects on tumors, together with other pathological data.

Topic 2-Potential of miRs as Biomarkers in Translational Research of PCa
The top-30 most relevant terms of this topic (i.e., the most frequent terms within papers grouped in this topic) (Figure 6) are indicative of a research "network" focused on the evaluation of the potential of miRs in translational research. Indeed, the 66 analyzed articles aimed to elucidate the relationship of miR expression with clinicopathological data, and to evaluate the potential of miRs as prognostic and diagnostic biomarkers in PCa. Because of the molecular heterogeneity of PCa, the ideal biomarker for early diagnosis and prognosis should be capable of identifying potentially aggressive tumors at the stage in which they are still treatable, while minimizing the detection of indolent disease. Aberrant expression of miRs in PCa patients could be a prognostic biomarker, associated with aggressive progression or indicative of poor prognosis. A limitation of these studies is that they often report inconsistent and/or controversial results, due to clinical heterogeneity and to differences in study designs and methods of sample collection. Clearly, all these controversial results delay translation from bench to bedside.
Using the MySLR digital platform, we were able to analyze this large and heterogeneous set of papers, and to select and evaluate quality contributions that matched the selected search criteria. In most of the papers, survival was assessed by using the Kaplan-Meier method, differences in survival according to miR expression were compared by using the log-rank test, and the prognostic value of miR expression with respect to clinical outcomes was evaluated by Cox regression analysis. Moreover, many analyses were also performed by using bioinformatics tools, such as the "PANTHER" online tool or a deep learning "autoencoder" model. The most relevant publications in the scientific literature reporting miRs as potential prognostic biomarkers in PCa are clustered in this topic (Table 2).
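For readers less familiar with the survival methods mentioned above, the following is a minimal sketch of the Kaplan-Meier product-limit estimator in plain Python. The follow-up times and event indicators are invented for illustration and do not come from any of the reviewed studies; the function name is hypothetical. In a typical prognostic analysis, one such curve would be computed for each expression group (for example, high versus low miR expression) before applying the log-rank test, which is not shown here.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate.

    times:  follow-up time for each patient
    events: 1 if the event (e.g., recurrence or death) was observed,
            0 if the patient was censored at that time
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set.
        n_at_risk -= n_leaving
    return curve


# Made-up follow-up data (months) for six hypothetical patients.
times = [6, 7, 10, 15, 19, 25]
events = [1, 0, 1, 1, 0, 1]

for t, s in kaplan_meier(times, events):
    print(f"t = {t:>2} months, S(t) = {s:.3f}")
```

Censored patients do not lower the curve themselves, but they reduce the number at risk for later event times, which is why the drop at 10 months is computed over four patients rather than five.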

Topic 1-Use of miRs as Biomarkers for PCa in the Clinical Setting
As shown in Figure 7, some of the most relevant terms in the 56 papers clustered into topic 1 are "urinary", "urine", and "urine_sample". Individual analysis of all the papers clearly revealed that the focus of this topic is the possible use of miRs in the clinical setting as diagnostic or prognostic markers for PCa. Studies clustered in this topic monitored human miRs in PCa patients by both liquid and tissue biopsy approaches. Scientists assessed miR expression profiles in PCa tissues and biofluids, including urine, serum/plasma, semen, and prostate secretion fluids, at various stages of the disease, and examined their potential as prognostic markers in PCa, as can be seen from Table 3, in which the most relevant papers clustered in the topic are summarized. The investigation of circulating miRs as markers is a rapidly developing research area; indeed, the topic is based on a growing body of studies, interest in which has grown especially over the last 5 years (Figure 4, blue and Table 3). Analysis of miRs in prostate tissue is routinely performed on fresh tissue, but also on formalin-fixed paraffin-embedded (FFPE) tissue owing to the stable nature of miRs, by using microarrays, next-generation sequencing (NGS) and qRT-PCR. For many years, biopsies have been the gold standard to determine clinicopathological characteristics of cancer tissues, but the procedure is very aggressive and uncomfortable for patients. In recent years, non-invasive methods have gained relevance, because they could be good indicators for cancer detection at the molecular level. Cancer cells can release miRs, which are stabilized by their incorporation into microvesicles secreted by the prostate; these are detectable in body fluids without requiring invasive biopsies. Several body fluids, such as blood/serum, semen, and urine, have been used. Detection of miRs in blood/serum has some limitations and is often controversial.
Appropriate endogenous controls for miR quantification in serum are under debate because many mRNA and rRNA species are absent in blood/serum due to circulating RNases. Furthermore, changes in circulating miRs can occur because of therapies, diet, or other factors, thus increasing noise in these assays. However, in addition to serum and plasma, miRs have been identified in other body fluids, in particular semen and urine, which makes them even more interesting biomarker candidates for PCa. Most of the papers clustered in topic 1 report urine as an excellent option and a reliable non-invasive tool for identifying PCa, and several diagnostic methods have also been described. These methods detect the presence or absence of miRs involved in the development of the disease by means of a non-invasive urine-based test.

Discussion
Considering the above, advances in early detection are crucial. Scientists have just started to evaluate the role of miRs as clinical biomarkers for PCa detection. Urine-based miR tests seem to be the most useful in PCa diagnosis and prognosis and may help to reduce the number of unnecessary prostate biopsies and guide treatment decisions. Tumor cells release exosomes into biological fluids, including urine, and the molecules inside are protected from degradation by the exosomal lipid bilayer. As exosomes contain tumor-driven molecules (including miRs), urinary exosomes have been considered ideal substrates for the development of non-invasive biomarkers. Furthermore, the number and composition of exosomal miRs differ between healthy and diseased individuals; therefore, the study of novel biomarkers in exosomes is a promising research field for PCa prognostic biomarkers [95]. Zhuo et al. investigated whether exosomal miR-141 is an effective biomarker for human PCa. They showed that miR-141 expression was higher in exosomes than in whole serum, and higher in PCa patients than in patients with benign prostatic hyperplasia. Expression levels were also significantly higher in metastatic PCa than in localized PCa (p < 0.0001) [115]. Recently, levels of specific miRs were also measured in exosomes from urine samples in order to develop a model for the prediction of biochemical recurrence after radical prostatectomy for curative purposes [103]. An alternative method uses field-effect transistor (FET)-based sensors, which allow chemicals and biomolecules to be measured through electrical signals. In particular, a label-free urinary miR sensing system was reported, based on a disposable and switchable graphene-based electrical sensor with high sensitivity and specificity in urine samples, which is useful as a non-invasive method.
This sensor enables rapid and direct detection of target miRs over a wide dynamic range, with a detection limit as low as 10 fM in human urine samples within 20 min, and it also allows for simultaneous quantification of multiple miRs [100]. To identify and validate urinary miRs with the aim of increasing the specificity of PCa diagnosis, several clinicopathological parameters of patients are taken into account. Among these, PSA is used as a clinical biomarker for PCa diagnosis. Guadarrama et al. demonstrated that the model including the miR-100/200b signature significantly outperformed PSA in discriminating between PCa and benign prostatic hyperplasia [114]. Fredsoe et al. measured expression levels of different miRs by qPCR in cell-free urine samples from patients with benign prostatic hyperplasia and from those with clinically localized PCa. Furthermore, they developed a new diagnostic model of three miRs (miR-222-3p*miR-24-3p/miR-30c-5p) which distinguished benign prostatic hyperplasia from PCa [105]. Moreover, urinary levels of miR-21, measured by real-time PCR, also had significant discriminatory power (p = 0.010) to separate benign prostatic hyperplasia from PCa [112]. A different approach is to analyze serial urine samples from patients with localized PCa. Urinary miR validation was carried out on three patient cohorts with different Gleason scores. First, temporally stable miRs were measured, and a predictive biomarker of the Gleason score, used as a clinicopathological parameter, was created by using machine-learning techniques [96]. Although most of the papers clustered in topic 1 agree in considering urine a reliable non-invasive sample for identifying PCa status by testing miRs, the diagnostic methodologies used are numerous and varied, as described above, thus highlighting the limitations of any clinical application of miRs.
Although, as already noted, research in the field is moving in this direction, further studies are needed before an actual clinical test can be developed, including large sample sets with well-supported validation through long-term clinical data. It seems necessary to establish widely accepted guidelines in the near future, which will determine the best urine-related method, sampling and processing, sample storage, miR isolation and quantification, quality control and data analysis, in order to minimize the high inter- and intra-tumoral variability. Finally, prior to any clinical application of miRs, optimization is critical to enhance PCa detection, as well as to use miRs in cancer therapy. Moreover, the use of miRs as prognostic markers in PCa may help to define subpopulations of patients with significantly different expected outcomes, who could benefit from different therapies. Patients with a good prognosis may not require additional treatment beyond the primary surgical resection, while patients with a poor prognosis may derive improved survival from adjuvant therapy. Hence, prognostic markers could potentially be "drivers" of cancer progression. Apart from their well-noted diagnostic and prognostic value, miRs also provide a potential treatment option for PCa. MiR-based therapy has great potential to be a more powerful tool in tumor treatment due to the simultaneous modulation of multiple genes involved in distinct tumor-related signaling networks. In this view, personalized anticancer therapy is the most ambitious challenge of modern medicine, aiming to identify novel patient-tailored treatments based on the unique features of each patient's disease. This approach, based on miR delivery, could represent a potential non-toxic and successful therapy for a large subset of PCa patients, which could not only decrease the socio-economic costs of this disease, but also reduce its burden on patients' lives, thus improving their quality of life.

Conclusions
Prostate cancer has attracted a great deal of interest due to its high rate of mortality among cancers worldwide [1]. Prostate cancer patients are typically asymptomatic in the early stages and are often diagnosed too late, which compromises successful treatment. Although mortality rates have been reduced thanks to early detection and improved treatment strategies, diagnostic methods are very aggressive for patients, and a lot remains to be done to avoid overtreatment.
Extensive studies over the last 15 years have clearly suggested that miRs are critical regulators in PCa progression and development and indicate the use of miRs as promising non-invasive markers of tumoral diagnosis and prognosis. This vast scientific literature highlights how fascinating but also complicated the world around miRs can be. Studies are often controversial: some questions are being answered, while many new ones remain open. Contradictory results between studies can be caused, for example, by differences in the methodology used to isolate and analyze miRs. The use of miRs as markers is promising, but it is not yet a reality in daily clinical practice, and there is still no clear vision of where scientists should turn their attention.
This study aimed to deepen the understanding of existing literature on the role of miRs as potential non-invasive biomarkers in PCa, in terms of major research topics. A machine learning method was used to automatically extract knowledge from scientific literature by means of the LDA algorithm. The innovative concept is based on the ability to independently analyze the texts and identify a certain number of "topics", and subsequently to recognize their presence within the texts themselves. The developed algorithm simulates an intelligence that is as human and complete as possible. Therefore, a systematic review of the literature based on LDA was employed. By analyzing 213 papers, we found three main topics on which the literature has focused, which are also areas for future research. A first observation that arises quite easily from reading the topics is that the literature passes from the basic research level (topic 3) to the translational level (topic 2), to then consider clinical aspects of increasing complexity (topic 1). Our analysis suggests a prevalence of studies (91) aimed at identifying deregulated miRs in PCa, their putative targets and their role in tumoral development and progression. Translational and clinical research studies were fewer (66 and 56 papers, respectively), but interest in them is steadily growing over time, thus pointing the direction for future research. Furthermore, our analysis leads us to conclude that several miRs are associated with PCa development and progression; they are indicative of poor prognosis and aggressiveness, are stable under adverse conditions, and can be easily detected in urine. Hence, urinary miRs are valid and promising candidates as non-invasive biomarkers for PCa, as their presence or absence in urine is correlated with that of matched tumor tissues.
As highlighted in the Discussion, the methodologies used in miR analysis are numerous and varied, making it necessary to determine the best urine-related method along with accepted guidelines for sampling and processing, quality control and data analysis. To date, it is not yet possible to know exactly which miR candidate is the best to be used as a biomarker. Analyzing the papers, we noticed indeed that the number of miRs detected differs among individuals, thus suggesting high variability of miRs among individuals, and high inter- and intra-tumoral variability. Nevertheless, miR-200 family members (including miR-200a, miR-200b, miR-200c, miR-429, and miR-141) were the most frequently reported miRs in our selected papers, and they could represent potential urine-based biomarkers for PCa detection and prognosis because: (i) their expression is necessary for the maintenance of the epithelial phenotype, as they are important negative regulators of epithelial to mesenchymal transition (EMT), an essential developmental process implicated in cancer metastasis [10,11]; (ii) their expression is deregulated in PCa, in tissue as well as in blood [73]; and (iii) they have unusually high stability in biological fluids, which is an important prerequisite for usefulness as a biomarker [107,111,112,115]. Of all candidates, miR-141 showed the greatest differential expression (46-fold overexpressed) in PCa patients compared to healthy controls. In this regard, promising results were recently obtained in PCa detection by analyzing miR-141 in urinary exosomes isolated by differential centrifugation [43,73,86,91,108].
Obviously, as already mentioned, further studies and validation in a large tightly defined patient population are needed to confirm the usefulness of these urinary miRs as PCa biomarkers. Although significant efforts remain to be made, we expect this innovative miR-based technology to drastically change medical practice in the foreseeable future.
In conclusion, this is the first time that a text mining technique, driven by an innovative machine learning approach, has been applied to a sample of original scientific articles in a medical setting. The methodology was used to specifically address the role of miRs in PCa, and their potential as non-invasive biomarkers for early diagnosis and prognosis. This study could pave the way for other studies with larger cohorts, and the approach could be applied and extended to study other cancers or diseases.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement: The data presented in this study are available in this article.

Conflicts of Interest: The authors declare no conflict of interest.