A Tetra-Panel of Serum Circulating miRNAs for the Diagnosis of the Four Most Prevalent Tumor Types.

The purpose of this study is to clinically validate a series of circulating miRNAs that distinguish between the 4 most prevalent tumor types (lung cancer (LC); breast cancer (BC); colorectal cancer (CRC); and prostate cancer (PCa)) and healthy donors (HDs). A total of 18 miRNAs and 3 housekeeping miRNA genes were evaluated by qRT-PCR on RNA extracted from serum of cancer patients (44 LC, 45 BC, 27 CRC, and 40 PCa) and of 45 HDs. The cancer detection performance of the miRNA expression levels was evaluated by studying the area under the curve (AUC) of receiver operating characteristic (ROC) curves at univariate and multivariate levels. miR-21 was significantly overexpressed in all cancer types compared with HDs, with an accuracy of 67.5% (p = 0.001) for all 4 tumor types and of 80.8% (p < 0.0001) when PCa cases were removed from the analysis. For each tumor type, a panel of miRNAs was defined that provided cancer-detection accuracies of 91%, 77%, 89%, and 94% for LC, BC, CRC, and PCa, respectively. In conclusion, we have described a series of circulating miRNAs that identify different tumor types with a very high diagnostic performance. These panels of miRNAs would constitute the basis of different approaches of cancer-detection systems for which clinical utility should be validated in prospective cohorts.


Introduction
Cancer is expected to rank as the leading cause of death in every country of the Western world in the 21st century. The Global Cancer Observatory (GLOBOCAN) 2018 estimated that there would be 18.1 million new cancer cases and 9.6 million cancer deaths worldwide in 2018. Globally, the incidence rate for all cancers combined was about 20% higher in men than in women, with the incidence rates varying across regions in both males and females, indicating that 1 in 8 men and 1 in 10 women will develop cancer during their lifetime. For both sexes combined, lung cancer (LC) is the most commonly diagnosed cancer (11.6% of the total cases), followed by female breast cancer (BC) (11.6%), colorectal cancer (CRC) (10.2%), and prostate cancer (PCa) (7.1%) [1].
Despite the high mortality of this disease, it has been demonstrated that the chance of curing cancer is very high when identified early. In this sense, cancer screening programs (CSPs) play a critical role in identifying cancer before symptoms appear, and their impact on cancer-specific and overall survival has been well documented [2]. Many of these CSPs may involve analytical procedures such as blood or urine tests, other tests, or medical imaging that identifies individuals with a high probability of having cancer and whose diagnosis should be confirmed by means of a histopathological examination.
Many of these analytical tests are based on the identification of blood-based tumor biomarkers, which constitute a readily available, inexpensive, and minimally invasive tool that also allows for repeated sampling [3].
In particular, microRNAs (miRNAs) are 19-25 nucleotide noncoding RNA molecules that regulate a variety of cellular processes including cell differentiation, cell cycle progression, and apoptosis [4]. Several studies have reported that certain circulating miRNAs in serum and plasma can be used as noninvasive biomarkers for defining different stages of disease, including cancer [5].
To take advantage of miRNAs as biomarkers for the early identification of cancer, the H2020 European project SAPHELY (Self-amplified photonic biosensing platform for microRNA-based early diagnosis of diseases) (https://saphely.eu/) is focused on developing a handheld, nanophotonic-based point-of-care device for detecting specific circulating miRNAs for the four major cancer types: LC, BC, CRC, and PCa.
Hence, the purpose of this study is to clinically validate a series of circulating miRNAs that distinguish between the different tumor types and controls and that could constitute the basis of the SAPHELY detection system. To quantify the expression of the selected circulating miRNAs, it was necessary to select a reference housekeeping miRNA or combination of miRNAs [6] with stable expression levels in the serum from both healthy donors (HDs) and cancer patients. Three housekeeping miRNAs were evaluated: U6snRNA, miR-16, and miR-1228 [7][8][9].

Average of Expression
The average Ct value of each housekeeping miRNA and the mean Ct values of the combinations of 2 or 3 of these miRNAs were evaluated to select the combination that showed no differences between the groups under study. A nonparametric test for independent samples based on the median of the Ct values was used to confirm that no significant differences between groups were observed (Figure 1). This analysis shows that the average Ct of the three housekeeping genes constituted the best option for normalizing the expression levels of the circulating miRNAs.
Figure 1. Selection of housekeeping genes: The median Ct values of the housekeeping genes did not differ significantly between groups (p = 0.165), so the null hypothesis was retained and the mean Ct of U6snRNA, miR-16, and miR-1228 was used as a reference. Test used: nonparametric test for independent samples. The median value for each group is shown by the bold, black line within the box.
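The selection step described above can be illustrated with a minimal sketch that enumerates every single, double, and triple combination of the three housekeeping miRNAs and computes the per-group median of the resulting reference Ct. The Ct values, the sample layout, and the helper names (`reference_ct`, `group_medians`, `candidate_references`) are hypothetical and are not data or code from this study; the significance test itself is not reproduced here.

```python
from itertools import combinations
from statistics import mean, median

# Hypothetical Ct values per serum sample (illustrative only, not study data).
samples = [
    {"group": "HD", "U6snRNA": 27.1, "miR-16": 18.4, "miR-1228": 29.0},
    {"group": "HD", "U6snRNA": 26.5, "miR-16": 18.9, "miR-1228": 28.6},
    {"group": "LC", "U6snRNA": 27.8, "miR-16": 17.9, "miR-1228": 29.3},
    {"group": "LC", "U6snRNA": 26.9, "miR-16": 18.6, "miR-1228": 28.8},
]
HOUSEKEEPERS = ("U6snRNA", "miR-16", "miR-1228")


def reference_ct(sample, genes):
    """Mean Ct of the chosen housekeeping miRNAs for one sample."""
    return mean(sample[g] for g in genes)


def group_medians(samples, genes):
    """Median reference Ct per group for one candidate combination."""
    by_group = {}
    for s in samples:
        by_group.setdefault(s["group"], []).append(reference_ct(s, genes))
    return {group: median(values) for group, values in by_group.items()}


def candidate_references():
    """All combinations of 1, 2, or 3 housekeeping miRNAs."""
    return [c for r in (1, 2, 3) for c in combinations(HOUSEKEEPERS, r)]


for genes in candidate_references():
    print(genes, group_medians(samples, genes))
```

A combination whose group medians stay close to each other is a candidate reference; the formal comparison would still rely on the nonparametric test reported in Figure 1.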

Analysis of miR-21
miR-21 was significantly overexpressed in all cancer types compared with HDs (Figure 2), suggesting that this miRNA alone might be indicative of the presence of cancer. To analyze the diagnostic performance of circulating miR-21, a ROC curve was calculated, yielding an AUC of 0.675 (95% CI: 0.578-0.736; p = 0.001) (Figure 3A). Interestingly, when PCa patients were removed from the analysis (because the dispersion of miR-21 expression in this cohort was very high), the AUC increased by about 0.13, to 0.808 (95% CI: 0.738-0.878; p < 0.0001) (Figure 3B).
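As a reminder of what the AUC measures for a single marker, the sketch below computes it as the probability that a randomly chosen cancer sample has a higher expression level than a randomly chosen HD sample, with ties counted as one half. The expression values and the function name `auc_from_scores` are hypothetical; the study itself obtained its ROC curves with SPSS.

```python
def auc_from_scores(cases, controls):
    """AUC as P(case > control) + 0.5 * P(case == control) over all pairs."""
    wins = 0.0
    for c in cases:
        for h in controls:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical relative expression levels (illustrative only).
cancer = [2.1, 3.5, 1.2, 4.0]
healthy = [1.0, 1.5, 2.5, 0.8]
print(auc_from_scores(cancer, healthy))  # 13 of 16 pairs favor cancer: 0.8125
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which is why the value of 0.675 for all four tumor types indicates only moderate performance for miR-21 on its own.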

Multivariate Analysis
To adjust the weight of each miRNA in detecting cancer, a Fisher linear discriminant analysis was performed for each tumor type using only those miRNAs that significantly identified cancer patients (those with an AUC > 0.5 and p-value < 0.05) (Table 1). ROC curves for these models showed that the combinations of the expression levels of circulating miRNAs improved on the diagnostic capability of the individual miRNAs (Table 2 and Figure 5). The diagnostic accuracies of these models in identifying cancer patients are 77%, 89%, 91%, and 94% for BC, CRC, LC, and PCa, respectively.
To validate the power of the statistical model, a cross-validation consisting of 1000 resampling iterations with replacement was performed for each tumor type. The strategy was to repeat the cross-validation at intervals of 10 samples (i.e., 30, 40, 50, ..., n, with n being the sample size of each of the four datasets). Every subset was randomly resampled in each iteration, up to 1000 iterations. These subsamples show that the AUC stabilizes and its standard deviation shrinks before the sample size studied in this work is reached, following an asymptotic pattern and suggesting that increasing the number of cases for each tumor type would not significantly change the observed AUC (Figure S1).
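The resampling strategy can be sketched as follows, under several assumptions: the discriminant scores are simulated rather than taken from the study, subsets are drawn with replacement from the pooled cases and controls, and draws containing only one class are skipped. The function names and the random seeds are illustrative choices, not the authors' implementation.

```python
import random
from statistics import mean, stdev


def auc(cases, controls):
    """AUC as the fraction of case/control pairs ranked correctly (ties = 0.5)."""
    wins = sum(1.0 if c > h else 0.5 if c == h else 0.0
               for c in cases for h in controls)
    return wins / (len(cases) * len(controls))


def bootstrap_auc(cases, controls, subset_size, iterations=1000, seed=0):
    """Mean and SD of the AUC over subsets resampled with replacement."""
    rng = random.Random(seed)
    labelled = [(x, 1) for x in cases] + [(x, 0) for x in controls]
    aucs = []
    while len(aucs) < iterations:
        draw = [rng.choice(labelled) for _ in range(subset_size)]
        pos = [x for x, y in draw if y == 1]
        neg = [x for x, y in draw if y == 0]
        if pos and neg:  # skip degenerate draws that contain a single class
            aucs.append(auc(pos, neg))
    return mean(aucs), stdev(aucs)


# Simulated discriminant scores (hypothetical, not study data):
# 44 cases and 45 controls, mirroring the LC vs. HD group sizes.
_rng = random.Random(42)
cases = [_rng.gauss(1.5, 1.0) for _ in range(44)]
controls = [_rng.gauss(0.0, 1.0) for _ in range(45)]

for size in range(30, len(cases) + len(controls) + 1, 10):
    m, s = bootstrap_auc(cases, controls, size)
    print(f"n={size:3d}  mean AUC={m:.3f}  SD={s:.3f}")
```

With data of this kind, the mean AUC changes little across subset sizes while its standard deviation narrows as the subset grows, which is the asymptotic pattern the paragraph above describes.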

Analysis of Variation of the Tumor-Selected Circulating-miRNA Depending on Tumor Stage
To determine whether the tumor-selected circulating miRNAs are associated with tumor stage, a nonparametric test for independent samples was performed. The results showed a significant increase (p ≤ 0.001) in the probability of being classified as a cancer case as tumor stage rises, corroborating the performance of the selected miRNAs as diagnostic and prognostic tools (Figure S2).
Moreover, a nonparametric test was also performed in PCa samples to analyze whether the probability of being classified as a cancer case, based on the PCa-selected circulating miRNAs, varies with Gleason Score (GS) and prostate specific antigen (PSA) values. In this case, although the median values of the probabilities were higher for GS ≥ 7 and serum PSA ≥ 4.33 ng/mL, the differences were not statistically significant (p > 0.05) (Figure S3), mainly because of the imbalance between the groups analyzed, with only 15% of cases belonging to advanced disease.

Discussion
Cancer represents the set of diseases with the highest incidence and mortality in Western countries, and its numbers are increasing every year, with the consequent social and economic impacts [1]. For that reason, special efforts have been devoted to the design of efficient CSPs for specific tumor types that improve overall survival [10][11][12][13]. Most of these CSPs incorporate screening tests to select individuals at risk of having cancer, to whom the standard diagnostic procedures are then applied to confirm the disease. Unfortunately, most of these tests have demonstrated poor accuracy and efficacy, particularly among the most prevalent cancers [14]. The current cancer-screening tests for the four most prevalent tumor types (CRC, LC, PCa, and BC) are mainly based on radiological images (Computed Axial Tomography (CAT), mammography, and multiparametric magnetic resonance for LC, BC, and PCa, respectively), some biomarkers such as PSA in PCa or fecal occult blood for CRC, and other invasive interventions such as colonoscopy for CRC [11][12][13][14]. These cancer-screening tests are characterized by low sensitivities or specificities (PSA and fecal occult blood tests) or require specially trained personnel (image-related tests), which makes them expensive and limits their use to specific populations and/or referral centers [14]. This is why blood biomarkers have been proposed as effective indicators to distinguish between cancer and normal conditions or among different cancer groups [15].
More than one decade ago, it was reported that miRNAs are also present in blood [16]. Circulating miRNAs were found to be remarkably stable even under conditions as harsh as boiling, low or high pH, long-time storage at room temperature, and multiple freeze-thaw cycles [5]. Furthermore, miRNAs also represent the status of the disease as they are associated with tumor biology and tumor behavior [17,18]. Thus far, distinctive patterns of circulating miRNAs have been found for different tumor types [19] including BC [20], PCa [21], LC [22], and CRC [23].
Taking all these premises into consideration, the H2020-SAPHELY project (https://saphely.eu) intends to break into the field of screening tests for early diagnosis of the four major cancer types, PCa, BC, LC, and CRC. It uses a novel ultrahigh-sensitivity nanophotonic-based sensing technique for the direct detection of circulating miRNA biomarkers: molecular beacon probes are combined with an attached high-index nanoparticle so that hybridization events are translated into the displacement of these nanoparticles from the sensor surface [24,25]. Given the final cost of the test and the expected accuracy in cancer detection, the SAPHELY device could be useful in a cancer screening context. For an explanation of how the SAPHELY device would work, please visit the following URL: https://youtu.be/6ZAuSkfJrB8?list=PL8qM5Jl41EI7iW2a57QEyaCvRFHn-ix95.
With this study, we have evaluated, from the clinical point of view, a series of miRNAs that were already defined in independent studies (Table S2) and that, to some extent, were associated with diagnosis or tumor progression of patients with at least one of the four most prevalent types of tumors (Table S1). For LC and CRC, two sets of 7 miRNAs each were evaluated; 5 miRNAs were evaluated for PCa; and 3 were evaluated for BC. The expression levels of these miRNAs were analyzed by qRT-PCR and their diagnostic capacity by ROC curves using serum samples from cancer patients and HDs in order to define those miRNAs that would finally be incorporated into the SAPHELY detection system. Based on our results, only 12 miRNAs (miR-21, miR-429, miR-200b, miR-125b, miR-141, miR-375, miR-182, miR-29a, miR-210, miR-200c, miR-155, and miR-205) and three housekeeping genes (U6snRNA, miR-16, and miR-1228) would constitute the basis of the detection system of the SAPHELY device. Combinations of the expression levels of these miRNAs provide diagnostic accuracies of 77%, 89%, 91%, and 94% for BC, CRC, LC, and PCa, respectively (Figure 5 and Table 2).
Interestingly, of all validated miRNAs, miR-21 was overexpressed in all four cancer types, particularly in BC, CRC, and LC. In fact, the diagnostic accuracy of miR-21 for cancer was 67.5% when considering the four tumor types, and it increased to 80.8% when PCa patients were excluded from the analysis. Overall, miR-21 is considered to be a typical "onco-miR", which acts by inhibiting the expression of phosphatases that limit the activity of signaling pathways such as AKT and MAPK. miR-21 is associated with a wide variety of cancers, including those of the breast [26], ovaries [27], colorectum [28], lung, and prostate [29], among others. A 2014 meta-analysis of 36 studies evaluated circulating miR-21 as a biomarker of various carcinomas, finding that it has potential as a tool for early diagnosis [30], findings which are in accordance with those reported herein.
The LC miRNA panel consists of miR-21, miR-429, miR-200b, and miR-125b; except for miR-21, these miRNAs were discovered and validated in a previous study in which we collaborated [31]. That work defined a panel of 6 miRNAs with a diagnostic accuracy for LC of 89%, which is very close to the 91% described in our study.
For BC, detection was defined with miR-21 and miR-205, achieving a diagnostic accuracy of 77%. With this study, we have confirmed the diagnostic value of miR-205 reported by other authors, who demonstrated the overexpression of this miRNA in the serum of BC patients [32,33].
For PCa, the diagnostic accuracy defined by the miRNA panel (miR-21, miR-141, miR-375, miR-125b, and miR-182) was 94%, the highest of our series and similar to other clinico-molecular multi-biomarker panels focused on detection of high-grade PCa [34]. Whereas circulating miR-141, miR-375, and miR-125b were already described by other authors as potential diagnostic biomarkers [35,36], we have introduced miR-182 as a novel promising circulating miRNA. Overexpression of this miRNA was reported by our group as the strongest prognostic miRNA in PCa, and their expression was directly proportional to tumor stage and Gleason score [37]. Hence, with this study, we have demonstrated that circulating miR-182 (AUC: 0.895; 95% CI: 0.824-0.967; p < 0.0001) significantly increases the performance of the PCa miRNA panel.
Additionally, owing to the role of miRNAs as key regulators of virtually every biological process, it is interesting to study the impact of miRNAs on phenotypes by studying the role of their effector mRNAs. In this sense, there is a large body of evidence from both functional studies and computational predictions. These data are organized in public databases on the Internet, among which one of the most important is miRBase [40].
An in silico analysis reveals a dense network of gene effectors of the miRNAs in each of the four tumor types studied. Gene Set Enrichment Analysis (GSEA) highlighted the three top overrepresented pathways. In all four datasets, the main Kyoto Encyclopedia of Genes and Genomes (KEGG) category is pathways related to cancer. In LC, this category is accompanied by ErbB signaling and PCa pathways, whereas PCa showed altered pancreatic cancer and PCa pathways. Finally, the CRC and BC series also have overrepresented pancreatic cancer and PCa pathways. Because it is difficult to draw direct conclusions from such a dense network of genes, the network was first pruned according to the degree of the nodes in each disease set. The top overrepresented KEGG pathways are closely related to cancer in all cases, with interesting potential mRNA biomarkers that deserve to be explored (Figure S4).
Moreover, we have demonstrated that the tumor-selected circulating miRNAs are associated with tumor stage but not with GS and serum PSA values in the case of PCa. This observation supports the utility of circulating miRNAs as a diagnostic and prognostic tool in these four cancer types.
However, the present study has some limitations. First, we have analyzed only some of the miRNAs considered tumor specific for each cancer type, and there are other miRNAs, not investigated here, that could be relevant for cancer diagnosis. Moreover, an amplification step was needed to raise the miRNA concentration to detectable levels, but the SAPHELY device is not able to perform this step, a point that could be crucial for the success of the device. Finally, we have defined a panel for each cancer type, but not all miRNAs of a panel are altered in every case, which hinders disease identification.
Because of these limitations, the present study must be continued by analyzing more miRNAs that may be completely tumor specific and by improving the detection system to offset the low concentration of miRNAs in the bloodstream.

Patients' Characteristics
Patients with newly diagnosed LC, BC, CRC, and PCa were considered eligible for this study. Within the period from October 2011 to March 2017, a total of 156 cancer patients (44 LC, 40 PCa, 27 CRC, and 45 BC) were finally selected. Patients' clinical and histopathological information is summarized in Table 3. Serum samples from 45 HDs with no history of cancer or other chronic disease prior to blood collection were also collected. These samples were included as a control group for the miRNA analyses.

MiRNAs Selection for Each Tumor Type
In view of the role of miRNAs as diagnostic biomarkers, a literature search was performed to identify those miRNA candidates to be implemented as biomarkers. A list of potential miRNAs was selected for each tumor type by applying a decision-making algorithm using PubMed as source of information. The search criteria used were ("microRNA" OR "miRNA" OR "miRNAs") AND ("prostate cancer"/"breast cancer"/"lung cancer"/"colorectal cancer") AND ("serum" OR "plasma") AND ("diagnosis"). All the papers selected met the following criteria: expression of miRNAs in serum; papers written in English; and clinical setting: diagnosis, including control samples.
The results of the literature search are summarized in Table S2.

Blood Collection and RNA Extraction
Seven milliliters of peripheral blood from cancer patients and HDs were collected at the time of diagnosis in SARSTEDT Monovette Serum gel tubes (Sarstedt AG & Co, Nümbrecht, Germany) and left to clot at room temperature (RT) for 30 min. The serum was separated by centrifugation at 3000× g for 10 min at RT and stored at the FIVO Biobank at −80 °C in 1-mL aliquots until RNA could be extracted and purified.
Total RNA was isolated using the miRNeasy Serum/Plasma Kit (Qiagen N.V., Hilden, Germany) following the manufacturer's instructions, starting with 200 µL of serum. The concentration of total RNA in each sample was measured using a NanoDrop 1000 spectrophotometer (Thermo Scientific, Wilmington, DE).

Conversion of total RNA into cDNA
One hundred and fifty nanograms of total RNA were reverse-transcribed using the TaqMan MicroRNA Reverse Transcription kit (Applied Biosystems, Foster City, CA) in a total reaction volume of 15 µL containing 2 mM dNTPs, 3.3 U/mL MultiScribe reverse transcriptase, 1× reverse transcription buffer, 0.25 U/mL RNase inhibitor, 0.02× of each specific miRNA primer (Table S1), which were pooled from a 5× solution to a working solution of 0.05× (TaqMan MicroRNA Assays, Applied Biosystems), and nuclease-free water. The reaction was performed using the Veriti Thermal Cycler (Applied Biosystems) at 16 °C for 30 min, 42 °C for 30 min, and 85 °C for 5 min.

cDNA Pre-Amplification
Five µL of cDNA was pre-amplified using the TaqMan PreAmp Master Mix (2×) according to the manufacturer's instructions (Applied Biosystems). For this reaction, 5 µL of cDNA sample was added to a total volume of 25 µL containing 12.5 µL of TaqMan PreAmp Master Mix (2×), 7.25 µL of pooled assay mix (0.2×), and nuclease-free water. The reaction was performed using a Veriti Thermal Cycler (Applied Biosystems) at 95 °C for 10 min, followed by 14 cycles of 95 °C for 15 s and 60 °C for 4 min.

Quantification of Circulating miRNAs by Quantitative Reverse Transcription-Polymerase Chain Reaction (qRT-PCR)
The pre-amplified cDNA was diluted 1:10 in 1× Tris-EDTA (TE) buffer and added to a final qRT-PCR reaction volume of 20 µL, which contained TaqMan MicroRNA assay primers (Applied Biosystems) for each miRNA (Table S1), TaqMan Gene Expression Master Mix (2×) (Applied Biosystems), and nuclease-free water. The reaction was performed using ABI 7500 fast real-time PCR systems (Applied Biosystems) at 95 °C for 10 min, followed by 40 cycles of 95 °C for 15 s and 60 °C for 60 s.
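After validating the mean Ct of the housekeeping genes U6snRNA, miR-16, and miR-1228 as the reference, the relative level of each specific miRNA was calculated using the equation $2^{-\Delta Ct}$, where $\Delta Ct = \overline{Ct}_{\mathrm{miRNA}} - \overline{Ct}_{\mathrm{ref}}$, $\overline{Ct}_{\mathrm{miRNA}}$ is the mean Ct of the miRNA, $\overline{Ct}_{\mathrm{ref}}$ is the mean Ct of U6snRNA, miR-16, and miR-1228, and Ct is the threshold cycle.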
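The relative quantification described above can be illustrated with a short sketch. The Ct values are hypothetical, and the function name `relative_expression` is an illustrative choice; the calculation assumes that the reference is the plain mean of the three housekeeping Ct values, as stated in the text.

```python
from statistics import mean


def relative_expression(ct_mirna_replicates, ct_housekeepers):
    """Relative level 2^(-dCt), with dCt = mean Ct(miRNA) - mean Ct(reference)."""
    delta_ct = mean(ct_mirna_replicates) - mean(ct_housekeepers)
    return 2 ** (-delta_ct)


# Hypothetical values: replicate Ct of the target miRNA, and the Ct of
# U6snRNA, miR-16, and miR-1228 for the same sample (illustrative only).
target_ct = [25.0, 25.0]
reference_ct = [24.0, 22.0, 26.0]
print(relative_expression(target_ct, reference_ct))  # dCt = 1, so 0.5
```

Because Ct is on a log2 scale, a miRNA that crosses the threshold one cycle later than the reference is reported at half the reference level, and one that crosses one cycle earlier at twice that level.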

Statistical Analysis
The differences in circulating miRNA levels and the ROC curves were evaluated using IBM SPSS Statistics V22.0 (SPSS, Chicago, IL). The Kruskal-Wallis nonparametric test was used for the statistical analysis of serum miRNA levels, and the Mann-Whitney U test was used for post hoc pairwise comparisons. The median expression level of each miRNA was compared between groups (cancer vs. HDs).
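For readers unfamiliar with these rank-based tests, the following sketch computes the Kruskal-Wallis H statistic and the Mann-Whitney U statistic from scratch. It is a simplified illustration, not the SPSS procedure used in the study: no tie correction is applied, p-values are not computed, and the function names and toy values are hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """H statistic across several groups (no tie correction)."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


def mann_whitney_u(a, b):
    """U statistic of group a versus group b."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    return rank_sum_a - len(a) * (len(a) + 1) / 2


# Toy expression levels for three groups (illustrative only).
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(kruskal_wallis_h(groups))
print(mann_whitney_u(groups[1], groups[0]))
```

Dividing U by the product of the two group sizes gives the AUC of the corresponding ROC curve, which links the pairwise test to the diagnostic performance measure used throughout this work.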
For the multivariate analysis, Fisher linear discriminant analysis was employed. To establish the feasibility of this kind of analysis, the assumptions of the model regarding the equality of covariances and variances were checked with Box's M and Wilks' lambda tests, respectively. The resulting discriminant function was used to classify the samples as tumor or normal. The performance of the univariate and multivariate analyses was assessed with the AUC of the ROC curves.
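A minimal sketch of a two-class Fisher linear discriminant for two miRNA features is given below. It assumes equal priors (the decision threshold is placed midway between the projected group means), uses hypothetical expression values, and inverts the pooled within-class scatter matrix by hand; it is not the SPSS model fitted in the study, and all function names are illustrative.

```python
from statistics import mean


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def centroid(rows):
    return [mean(r[0] for r in rows), mean(r[1] for r in rows)]


def scatter(rows, mu):
    """Entries (sxx, sxy, syy) of the within-class scatter matrix."""
    sxx = sum((r[0] - mu[0]) ** 2 for r in rows)
    sxy = sum((r[0] - mu[0]) * (r[1] - mu[1]) for r in rows)
    syy = sum((r[1] - mu[1]) ** 2 for r in rows)
    return sxx, sxy, syy


def fit_fisher_lda(cases, controls):
    """Return weights w = Sw^-1 (m_cases - m_controls) and a midpoint threshold."""
    m1, m0 = centroid(cases), centroid(controls)
    a1, b1, c1 = scatter(cases, m1)
    a0, b0, c0 = scatter(controls, m0)
    a, b, c = a1 + a0, b1 + b0, c1 + c0  # pooled scatter [[a, b], [b, c]]
    det = a * c - b * b
    if det == 0:
        raise ValueError("pooled scatter matrix is singular")
    d = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [(c * d[0] - b * d[1]) / det, (a * d[1] - b * d[0]) / det]
    threshold = 0.5 * (dot(w, m1) + dot(w, m0))  # assumes equal priors
    return w, threshold


def classify(x, w, threshold):
    return "tumor" if dot(w, x) > threshold else "normal"


# Hypothetical relative levels of two miRNAs per sample (illustrative only).
cases = [(3.0, 2.0), (4.0, 3.5), (3.5, 2.5), (4.5, 3.0)]
controls = [(1.0, 1.0), (1.5, 0.5), (2.0, 1.5), (0.5, 1.2)]
w, threshold = fit_fisher_lda(cases, controls)
print(w, threshold, classify((3.2, 2.4), w, threshold))
```

The discriminant score `dot(w, x)` is the quantity on which a ROC curve for the combined panel would be built, so each miRNA contributes according to its fitted weight rather than equally.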

miRNA Target-Interactions Analysis
For this analysis, the miRNet tool was used (https://www.mirnet.ca/miRNet/home.xhtml) [41]. Briefly, this resource implements algorithms for differential expression and for mining miRNA databases in order to create a graph representing the relationships between miRNAs and their targets. Networks can be manually expanded and pruned. Finally, it implements hypergeometric tests to perform a gene set enrichment analysis with KEGG pathways [42].
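The hypergeometric enrichment test mentioned above can be sketched as an upper-tail probability: given a gene universe, a pathway of known size, and a list of target genes, it asks how likely it is to see at least the observed overlap by chance. The gene counts below are hypothetical, the function name is illustrative, and the exact options used internally by miRNet are not reproduced here.

```python
from math import comb


def hypergeom_enrichment_p(universe, pathway, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `universe` genes, of which `pathway` belong to the pathway."""
    upper = min(pathway, selected)
    total = comb(universe, selected)
    tail = sum(comb(pathway, k) * comb(universe - pathway, selected - k)
               for k in range(overlap, upper + 1))
    return tail / total


# Hypothetical counts: 20000 genes in the universe, 300 in a KEGG pathway,
# 150 miRNA target genes in the network, 12 of them in the pathway.
print(hypergeom_enrichment_p(20000, 300, 150, 12))
```

In practice, one such p-value is obtained per pathway and the set of p-values is usually corrected for multiple testing before the top overrepresented pathways are reported.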

Conclusions
In conclusion, with this study, we have defined a series of circulating miRNAs that identify different tumor types with a very high diagnostic performance. In addition, these miRNAs would constitute the basis of a multianalyte blood test through the SAPHELY detection system; however, their clinical utility should be demonstrated in a prospective study in the context of a CSP.