Expression of Epithelial and Mesenchymal Markers in Plasmatic Extracellular Vesicles as a Diagnostic Tool for Neoplastic Processes

Tumor-derived extracellular vesicles (TD-EVs) play active roles as enablers of cancer hallmarks. The RNA carried by EVs from epithelial and stromal cells conveys information that facilitates the communication processes contributing to oncological progression. The objective of this work was therefore to validate by RT-PCR the presence of epithelial (KRT19; CEA) and stromal (COL1A2; COL11A1) markers in the RNA of plasmatic EVs from healthy donors and patients with diverse malignancies, with a view to developing a non-invasive cancer diagnosis system based on liquid biopsy. Ten asymptomatic controls and 20 cancer patients were included in the study. Characterization by scanning transmission electron microscopy (STEM) and nanoparticle tracking analysis (NTA) showed that the isolated plasmatic EVs consisted mostly of exosome structures, with a considerable percentage of microvesicles. No differences were found in concentration or size distribution between the two cohorts, but significant differences in the expression of epithelial and mesenchymal markers were found between healthy donors and patients with active oncological disease. Quantitative RT-PCR results were solid and reliable for KRT19, COL1A2, and COL11A1, so the analysis of RNA extracted from TD-EVs could be a suitable approach for developing a diagnostic tool for oncological processes.


Introduction
Tissue biopsies have been the usual procedure once cancer symptoms appear. However, these biopsies are invasive, expensive, and harmful, and sometimes they cannot be performed at all because the tumor location is difficult to access or because of the patient's health condition [1]. In addition, due to the characteristics of neoplastic cells (heterogeneity, plasticity, etc.) and the different tumor genomic profiles, conventional biopsies cannot reflect the complete nature of primary or secondary tumors, since they represent only a small part of the malignancy [2]. This makes correct patient management difficult [3]. Moreover, monitoring the response to therapy requires rebiopsy to identify genetic alterations during and after treatment. Therefore, new, lower-cost, non-invasive sampling techniques were required for screening, early diagnosis, investigation of tumor dynamics, and detection of the risk of recurrence [4]. Liquid biopsies emerged to overcome these drawbacks. This technique can be used as a method of screening/early detection [5], for monitoring [...]. The objective of this work was to validate by RT-PCR the presence of KRT19, CEA, COL11A1, and COL1A2 in the RNA of plasmatic EVs from healthy donors and patients with diverse malignancies, with a view to developing a non-invasive cancer diagnosis system based on liquid biopsy.

Baseline Characteristics of Patients
Ten healthy donors as asymptomatic controls (AC) and 20 cancer patients (CP) were included in this study. All of them underwent blood extraction for healthcare purposes, and samples were analyzed in the Pathological Anatomy service at the University Hospital Complex of A Coruña. Clinical data such as gender, age, tumor histology, and disease stage were obtained from medical records and are summarized in Table 1. The control group consisted of ten healthy individuals aged between 25 and 55 years who had not previously presented any oncological process. They also had no history of chronic pathologies, such as chronic inflammatory bowel disease (Crohn's disease, ulcerative colitis, etc.), pancreatitis, cirrhosis, chronic obstructive pulmonary disease (COPD), or hypothyroidism, since these types of pathologies have been shown to be potentially related to alterations in the basal levels of KRT19 [43,44] and CEA [45-49]. Among these 10 individuals, 30% were men (n = 3) and 70% were women (n = 7).
In the group of cancer patients, we analyzed plasma samples of eight prostate adenocarcinomas (seven stage IV and one stage IIb); three renal cell carcinomas (two stage IV, one stage III); three pancreatic neuroendocrine tumors (stage IV); two melanomas (one stage IV and one in situ melanoma); one urothelial bladder carcinoma (stage IV); one patient with synchronous colorectal tumors, namely, one adenocarcinoma and one small intestine neuroendocrine tumor (stage I and stage IV, respectively); one serous ovarian carcinoma (stage III); and one small-cell lung carcinoma (stage IV). Among these 20 patients, 85.00% were men (n = 17) and 15.00% were women (n = 3). The staging of the different tumors was evaluated according to the eighth edition of the TNM classification of the American Joint Committee on Cancer of 2017 [50].

Characterization of Plasmatic Extracellular Vesicles
Visual and quantitative characterizations of plasmatic EVs were performed to assess purity and concentration. EVs were isolated from patient plasma, and STEM and NTA were used to analyze the morphology, size distribution, and concentration of the samples.

Visual Characterization of Extracellular Vesicles by Electron Microscopy
To confirm the presence, morphology, and size of EVs from the plasma of cancer patients and healthy individuals, samples were observed by scanning transmission electron microscopy (STEM). Vesicles ranging from 25 to 800 nm were analyzed. Micrographs showed mostly exosome and microvesicle structures in both samples (Figure 1). Most EVs were spheroid, with some cup-shaped exosome configurations (Figure 1A.1,A.2). No differences in structure or morphology were observed between AC and CP donors.

Quantitative Characterization of Extracellular Vesicles by Nanoparticle Tracking Analysis (NTA)
Nanoparticle tracking analysis (NTA) is a particle tracking method for measuring EV concentration and size distribution (Figure 2) [51]. NTA showed that the mean vesicle size was 161.6 ± 2.3 nm in AC samples and 160.5 ± 0.7 nm in CP samples (Figure 2A). Mode values were similar (AC: 127.6 ± 10.7 nm; CP: 156.5 ± 13.2 nm). The analysis showed that exosome-sized structures were the majority in both samples. Regarding concentration, AC samples showed a mean value of 1.08 × 10¹¹ ± 5.09 × 10⁹ particles/mL, and plasma of cancer donors registered a mean of 1.02 × 10¹¹ ± 1.11 × 10¹⁰ particles/mL. No statistically significant differences were found in concentration or size distribution between the two cohorts (Figure 2B).
Figure 2. NTA characterization of plasmatic EVs in AC and CP samples. No differences were found between AC and CP plasmatic EV samples. Data are expressed as mean ± SD obtained from three independent experiments. ns denotes no statistically significant differences when comparing CP with the control group.

RNA Analysis of Integrity
To determine the integrity and concentration of total RNA extracted from plasmatic EVs, Bioanalyzer profiles were examined. RNA integrity number (RIN) values were analyzed for all plasma samples with the RNA 6000 Pico Chip. All samples appeared to be partially degraded, with low RIN values ranging from 2.3 to 4.2 (Figure 3). Plasmatic EVs harbored high amounts of small RNAs; this can have a negative impact on RIN values, since the algorithm behind the RIN calculation treats these small RNAs as degradation products [52]. Therefore, samples that appeared to be partially or totally degraded actually contained a variable amount of small RNAs not reflected in the integrity analysis. In this context, the RIN value may not be an adequate quality-control metric for total RNA extracted from EVs. With respect to concentration, the samples showed values ranging from 400 to 1200 pg/µL. These low RNA concentrations were a drawback in terms of reproducibility and repeatability. No differences in integrity or concentration were found between samples from healthy donors and cancer patients (Figure 3A,B).

∆Ct amplification values for AC and CP samples are shown in Tables 2 and 3, respectively. All data were normalized by ACTB relative gene expression. ∆Ct values were expressed as mean ± SEM. Healthy donors showed a ∆Ct of 8.64 ± 0.36 and a ∆Ct of 10.35 ± 0.70 for the KRT19 and CEA markers, respectively. For mesenchymal markers, a ∆Ct of 8.21 ± 0.37 for COL11A1 and a ∆Ct of 8.44 ± 0.37 for COL1A2 were found (Table 2). Regarding CP expression analysis, epithelial marker ∆Ct amplification values were 3.14 ± 0.30 for KRT19 and 5.06 ± 0.75 for CEA. COL11A1 showed a ∆Ct value of 3.01 ± 0.26, and COL1A2, 3.27 ± 0.23 (Table 3). COL1A2 expression was analyzed in only eleven cancer patients because of insufficient sample and the impossibility of running replicates in the remaining oncological patients.
Table 2. ∆Ct amplification values for KRT19, CEA, COL11A1, and COL1A2 genes corresponding to asymptomatic controls (AC). ∆Ct indicates the difference between the mean of two replicates of the Ct of the target gene and the mean of the Ct of the reference control gene (ACTB).
A box plot was used to visualize the distribution of ∆Ct values in AC and CP, and the two groups were compared (Figure 4). The median ∆Ct for KRT19 in AC and CP was 8.66 (IQR: 1.21) and 3.66 (IQR: 1.24), respectively. The median ∆Ct for CEA was 10.32 (IQR: 2.10; AC) and 5.32 (IQR: 6.92; CP). This last result indicated great dispersion and variability in the CEA data of the CP group. Stromal markers presented a median ∆Ct for COL11A1 of 7.97 (IQR: 1.54) in healthy donors and 2.39 (IQR: 2.03) in CP, and a median ∆Ct for COL1A2 of 8.29 (IQR: 1.64; AC) and 3.25 (IQR: 1.47; CP) (Figure 4A). Box plots also showed extreme ∆Ct values for KRT19 (AC and CP samples) and for CEA in AC. Grubbs and Dixon outlier tests were performed on these values and confirmed that, although they lay furthest from the rest of the data, they were not significant outliers (p-value > 0.05; CI: 95%).
Figure 4. Distribution of ∆Ct values in AC and CP samples. Data are expressed as mean ± SE. Asterisks denote significant differences between groups AC and CP for each gene (**** p-value < 0.0001; *** p-value < 0.001; ** p-value < 0.01).
Figure 4A,B also showed that the differences in ∆Ct values between the AC and CP groups were significant for all genes studied. t-Tests yielded a p-value < 0.0001 for KRT19, p-value = 0.0001 for CEA, p-value < 0.0001 for COL11A1, and p-value < 0.0001 for COL1A2.
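The median and interquartile range (IQR) summaries reported for the ∆Ct distributions can be reproduced from per-sample values with a few lines of Python. This is only a minimal sketch: the input values are hypothetical, and the quartile convention (linear interpolation, `method="inclusive"`) is an assumption, since the text does not state which convention the original software used.

```python
import statistics


def median_and_iqr(values):
    """Return (median, IQR) of a list of numbers.

    Quartiles use linear interpolation between data points
    (an assumed convention; other software may differ slightly).
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return statistics.median(values), q3 - q1


# Hypothetical ∆Ct values for illustration only (not study data).
example_dct = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
example_median, example_iqr = median_and_iqr(example_dct)
print(f"median = {example_median:.2f}, IQR = {example_iqr:.2f}")
```

A large IQR relative to the median, as reported for CEA in the CP group, is the kind of dispersion this summary is meant to expose.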
The relative gene expression level was evaluated for each biomarker using a modified comparative Ct method (2^−∆∆Ct), as described previously by Pfaffl [53]. Figure 5 shows expression fold changes in AC and CP samples. KRT19 showed increased expression in CP (66.04 ± 14.29) in comparison to AC (1.36 ± 0.43). CEA levels were also highly increased in cancer patients (245.81 ± 84.63) relative to healthy donors (2.96 ± 1.80). However, a high interindividual variability in expression was found for this gene in both the AC and CP cohorts, even among oncological patients with the same type of cancer. Mesenchymal markers were also highly elevated in the CP group. AC showed values of 1.26 ± 0.25 (COL11A1) and 1.31 ± 0.30 (COL1A2), whereas CP expression was 47.80 ± 7.51-fold for COL11A1 and 40.64 ± 6.00-fold for COL1A2. The two groups were significantly different in a non-parametric test for all epithelial and mesenchymal biomarkers. Mann-Whitney U tests yielded a p-value < 0.0001 for KRT19, p-value = 0.0003 for CEA, p-value < 0.0001 for COL11A1, and p-value = 0.0001 for COL1A2.
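The comparative Ct calculation behind these fold changes can be sketched as follows. The sketch assumes 100% amplification efficiency and uses the mean ∆Ct of the control group as calibrator; both are assumptions, because the exact modification of the method applied here is not detailed in the text. All Ct and ∆Ct values in the example are hypothetical.

```python
from statistics import mean


def delta_ct(ct_target, ct_reference):
    """∆Ct: target-gene Ct minus reference-gene (e.g. ACTB) Ct."""
    return ct_target - ct_reference


def fold_changes(sample_dcts, control_dcts):
    """Per-sample relative expression, 2^-∆∆Ct.

    ∆∆Ct = sample ∆Ct minus the mean ∆Ct of the control group
    (assumed calibrator; assumes 100% PCR efficiency).
    """
    calibrator = mean(control_dcts)
    return [2.0 ** -(dct - calibrator) for dct in sample_dcts]


# Hypothetical ∆Ct values for illustration only (not study data).
control_dcts = [8.0, 9.0, 10.0]
patient_dcts = [4.0, 3.0]

patient_fc = fold_changes(patient_dcts, control_dcts)
control_fc = fold_changes(control_dcts, control_dcts)
print("patients:", patient_fc)
print("controls:", control_fc, "mean:", mean(control_fc))
```

Because 2^−∆∆Ct is computed per sample and then averaged, the control group mean is generally slightly above 1 rather than exactly 1, which would be consistent with AC values such as 1.36 ± 0.43 reported above.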
Figure 5. Relative expression fold changes (2^−∆∆Ct) in AC and CP samples. Mann-Whitney U tests showed that these differences were significant between groups in all biomarkers (KRT19, p-value < 0.0001; CEA, p-value = 0.0003; COL11A1, p-value < 0.0001; COL1A2, p-value = 0.0001). Data are expressed as mean ± SE. Asterisks denote significant differences between groups AC and CP for each gene (**** p-value < 0.0001; *** p-value < 0.001).
Regarding the gene expression results obtained with the 20 cancer patients in total, we separated the patients according to tumor type and analyzed them individually. Only the pooled data of patients with prostate (n = 8), renal (n = 3), and pancreatic cancer (n = 3) are discussed, as they were the only tumor types with at least three cases. COL1A2 expression data in pancreatic cancer patients were not analyzed because of insufficient sample and the impossibility of running replicates.
∆Ct values and relative RNA expression data are collected in Figure 6. In prostate cancer patients (PC), the results were very similar to those obtained with the total number of patients (Figure 6A). Both ∆Ct and 2^−∆∆Ct values indicated a significant increase in the expression of all markers compared to the AC cohort. KRT19 and CEA levels showed increased expression in PC (58.03 ± 24.09; 177.70 ± 104.70, respectively). The relative fold expression values of mesenchymal markers in the PC cohort were also higher (COL11A1 = 62.66 ± 15.50; COL1A2 = 50.58 ± 12.35). In renal cancer patients (RC), we observed the same results (Figure 6B). The differences in expression in epithelial markers compared to healthy donors were even greater (KRT19 = 106.60 ± 67.26; COL1A2 = 718.20 ± 356.70), probably due to the small number of cases analyzed. In pancreatic cancer patients (PaC), differences in CEA levels were not significant (Figure 6C). KRT19 (39.66 ± 3.22) and COL11A1 (52.71 ± 16.85) showed higher expression levels in the PaC cohort compared to AC (KRT19 = 1.36 ± 0.43; COL11A1 = 1.26 ± 0.25). No COL1A2 expression data were available. In summary, analyzing the cases by tumor type, we found expression levels similar to those of the complete CP cohort. However, further studies with a larger number of patients are necessary to assess each type of tumor individually.
Figure 6. ∆Ct values and relative RNA expression by tumor type: (A) prostate, (B) renal, and (C) pancreatic cancer patients. Data are expressed as mean ± SE. Asterisks denote significant differences between groups for each gene (**** p-value < 0.0001; ** p-value < 0.01; * p-value < 0.05; ns = not significant).

Analysis of Specificity of cDNA Amplicons
We analyzed the specificity of the PCR products obtained for all biomarkers by analyzing the melt curves for each gene and subjecting the amplified fragments to 2% agarose gel electrophoresis, in order to verify amplicon size and rule out the presence of dimers and other nonspecific products that could interfere with the results. Results are shown in Figure 7. Data from three representative CP samples showed sigmoidal amplification curves in the log scale for both epithelial and mesenchymal biomarkers (Figure 7A, upper zone). Regarding melt curves (Figure 7A, lower zone), KRT19, COL11A1, and COL1A2 curve analysis revealed a single peak of specific PCR products in all samples. This single peak fell within the same specific temperature range for each pair of primers. However, CEA curves showed two peaks at different temperatures. Specific products of real-time PCR had a higher melting temperature than nonspecific products (primer dimers or artifacts) and a smaller peak. Melt curve analysis was performed in all samples (AC and CP), and CEA results were similar in all runs. Other sets of CEA primer pairs were tested, but no significant differences were found. The negative control for CEA also showed nonspecific peaks in the curve analysis.
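The distinction between a single melt peak and a double melt peak can be illustrated with a small peak-counting sketch on the negative derivative of fluorescence with respect to temperature (−dF/dT). This is not the instrument software's algorithm: the helper names, the `min_fraction` threshold, and the synthetic curves are all hypothetical, and real data would normally need smoothing first.

```python
import math


def negative_derivative(temps, fluorescence):
    """Finite-difference estimate of -dF/dT between consecutive readings."""
    return [
        -(fluorescence[i + 1] - fluorescence[i]) / (temps[i + 1] - temps[i])
        for i in range(len(temps) - 1)
    ]


def count_melt_peaks(temps, fluorescence, min_fraction=0.2):
    """Count local maxima of -dF/dT that reach min_fraction of the tallest peak."""
    d = negative_derivative(temps, fluorescence)
    threshold = min_fraction * max(d)
    peaks = 0
    for i in range(1, len(d) - 1):
        if d[i] >= threshold and d[i - 1] < d[i] >= d[i + 1]:
            peaks += 1
    return peaks


def melting_signal(t, tm, amplitude=1.0, width=0.8):
    """Synthetic fluorescence of one product melting around temperature tm."""
    return amplitude / (1.0 + math.exp((t - tm) / width))


# Synthetic curves for illustration only: 65 to 95 °C in 0.25 °C steps.
temps = [65.0 + 0.25 * i for i in range(121)]
single_product = [melting_signal(t, 84.1) for t in temps]
two_products = [
    melting_signal(t, 84.1) + melting_signal(t, 75.1, amplitude=0.5)
    for t in temps
]

print(count_melt_peaks(temps, single_product))
print(count_melt_peaks(temps, two_products))
```

A curve with one product yields a single −dF/dT maximum, whereas an additional lower-temperature product, such as a primer dimer, adds a second maximum.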

Agarose gel electrophoresis displayed analogous results (Figure 7B). KRT19 and COL1A2 amplicons showed a very specific band without the presence of dimers or artifacts. In contrast, COL11A1 and CEA showed two bands of similar size, which were more evident in CEA samples. All CEA amplicons analyzed by electrophoresis presented two amplified PCR products, whereas a double band appeared in only some of the COL11A1 amplicons. ACTB PCR products were also analyzed by melt curves and agarose gel electrophoresis, which showed specific amplicons for the control gene.

Discussion
Molecular diagnostics is already an integral part of practical medicine in oncological processes, but due to the complex and changing nature of the disease, robust and well-validated cancer biomarkers are increasingly needed [54]. The World Health Organization defined a biomarker as "any substance, structure or process that can be measured in the body or its products and influence or predict the incidence of outcome or disease" [55,56]. In cancer clinical practice, a biomarker may measure the risk of developing cancer, cancer progression, or potential responses to therapy [57]. TD-EVs found in liquid biopsies have the unique potential to capture the dynamic landscape of the disease, and their physiological features make them potential vehicles for cancer biomarkers [58]. In this sense, by analyzing different epithelial and mesenchymal markers extracted from plasmatic EVs, we found significant differences in gene expression between healthy donors and patients with active oncological disease. Our results in 10 AC and 20 CP showed that the isolated EVs, characterized by STEM and NTA, consisted mostly of exosome structures, with a considerable percentage of microvesicles. Surprisingly, no differences were found in concentration or size distribution between the two cohorts [33,59]. However, this is not that unusual, since it depends on the type or physiology of the patient, and other studies have not found these differences between the two groups either [60,61]. Furthermore, several studies showed that age significantly influences the secretion of EVs [62,63]. Plasma concentrations of EVs may decrease with age, and this effect could grow with the difference between age ranges. Our AC group had a mean age of 40.30 years, while the mean age of the CP group was 67.50 years. Therefore, the baseline characteristics of our two cohorts could be decisive in this respect.
Analyzing the total RNA extracted from plasmatic EVs, we observed a mixture of diverse RNA species with a higher percentage of small RNAs than in cells. Total RNA concentrations were low, but the amount was sufficient for replicable qRT-PCR analysis. Moreover, despite the low RIN values observed, probably because RIN is not an adequate measure for total EV RNA, we obtained specific amplicons of the target genes in our CP cohort. Although RNA is more vulnerable and less stable than DNA, it is well preserved in various body fluids when secreted as part of EVs. Furthermore, RNA secretion is a physiological process: RNA is secreted by active living cells rather than apoptotic/necrotic cells, which gives very relevant information on its origin and may be significant in some types of cancer that release a low abundance of tumor-derived plasma DNA [54]. RNA-based assays with validated results have been under development for years [64,65] and can provide a complete overview of the expression profiles occurring at a given point in time in the pathology. Therefore, although low concentration levels may be a limiting factor, the analysis of RNA extracted from TD-EVs could be a suitable approach for developing a diagnostic tool.
Quantitative PCR results showed significant differences in both epithelial and mesenchymal marker expression between healthy donors and cancer patients. KRT19, CEA, COL1A2, and COL11A1 expression levels were much higher in cancer patients (KRT19 = 66.04 ± 14.29-fold; CEA = 245.81 ± 84.63-fold; COL11A1 = 47.80 ± 7.51-fold; COL1A2 = 40.64 ± 6.00-fold). Marker expression in healthy controls was almost residual in all cases. Analyzing the data by tumor type, we observed similar results in prostate, renal, and pancreatic cancer. Our data revealed that KRT19, COL11A1, and COL1A2 could be potential tumor biomarkers in plasmatic EVs, since the results are robust and promising in terms of expression level, specificity, and reproducibility. However, the RNA expression findings obtained for CEA, both in healthy donors and in cancer patients (in the complete cohort and individually by tumor type), were not solid and reliable. This could explain the high variability found in the results, so CEA, under these conditions, does not appear to be a suitable marker for our diagnosis system.
A cancer biomarker test should be an assay that is easily and reproducibly performed and must meet several main requirements. High specificity and sensitivity are needed to discriminate accurately between cancer and other pathological or physiological processes. Ideally, proportionality must also be taken into account, correlating with diverse features of malignancy such as stage or prognosis [65,66]. In our study, all the patients were in an advanced stage of the disease; however, our preliminary results encourage us to carry out further studies with patients with earlier stage disease in order to validate our data. Multicancer screening based on blood analysis may have the potential to be applied to early detection and could also address some limitations of current screening methods. There are currently very few multicancer tests on the market with regulatory approval. CancerSEEK was one of the first multicancer early detection tests reported. It is a blood test that can detect eight different types of cancer through cfDNA and eight protein biomarkers released by tumors [67]. Sensitivity of the test ranged from 69 to 98%, and specificity was greater than 99%. However, it is not yet available to patients and is awaiting FDA approval. The Galleri Test [68] detects abnormalities in the methylation patterns of cfDNA through next-generation sequencing. The test detects over 50 types of cancer and can predict the organ of origin of the cancer signal. Specificity is 99.5%, and overall sensitivity is 51.5%. It is FDA approved but is still only commercialized in America. The Panseer test [69] detects DNA methylation patterns linked to gene silencing that may contribute to cancer development. The test could detect five common types of cancer in 88% of post-diagnosis patients with a specificity of 96% and could detect cancer in 95% of asymptomatic individuals who were later diagnosed. One of the limitations of the test is that it does not detect the tissue of origin, and it is for research use only.
The development of a diagnostic tool is a long process that usually takes place over different phases, namely, initial discovery and assay development, the assessment of clinical validity, and finally market approval. As seen above, very few tests are available for healthcare assistance, so the search for new, fast, and non-invasive multicancer early detection tools seems crucial nowadays. The three markers for which we obtained acceptable results (KRT19, COL1A2, and COL11A1), if used in combination, could increase the discriminatory potential that each one showed separately. Validation studies with a larger number of patients in early stages of the disease are needed to ensure the statistical robustness of the assay. Moreover, samples should reflect the biological variability of the targeted population, with the aim of differentiating subjects according to the different types of tumors.
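The sensitivity and specificity figures quoted for the multicancer tests are simple ratios over a confusion matrix, and a future validation study of a marker panel would report them the same way. The sketch below shows the arithmetic only; the counts are hypothetical and do not come from this study or from any of the cited tests.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of diseased subjects that the test correctly flags."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no diseased subjects")
    return true_positives / total


def specificity(true_negatives, false_positives):
    """Fraction of healthy subjects that the test correctly clears."""
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("no healthy subjects")
    return true_negatives / total


# Hypothetical validation cohort: 100 cancer patients, 200 healthy controls.
sens = sensitivity(true_positives=80, false_negatives=20)
spec = specificity(true_negatives=190, false_positives=10)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

For a screening test applied to mostly healthy populations, specificity dominates the number of false alarms, which is why the tests discussed above report specificities of 96% or higher.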
In conclusion, the role of plasmatic EVs in the progression of oncological diseases, as well as their therapeutic potential [70], together with the specificity of neoplastic transformation indicators such as KRT19 or diverse cancer-associated fibroblast (CAF) markers, leads us to consider that a non-invasive cancer diagnosis system using a combination of epithelial and mesenchymal markers could be a promising approach in the diagnosis of several neoplastic processes.

Patients and Human Samples
This research was conducted as a single-center retrospective study between February 2020 and March 2020. Samples were collected and analyzed in the Pathological Anatomy service at the University Hospital Complex of A Coruña. The Pathological Anatomy laboratory is UNE-EN ISO 9001-2015 certified. Clinical data such as gender, age, tumor histology, and disease stage were obtained from the medical records. Patients belonged to a study approved by the Clinical Research Ethics Committee (approval registration number 2020/010), which was conducted in compliance with the Declaration of Helsinki. Custody of written informed consent and storage of remnant samples were managed by the Biobank of A Coruña.
A total of 30 plasma samples belonging to 30 individuals were included in this study. Ten healthy AC and 20 CP who presented advanced clinical stage (III-IV) at the time of the analysis underwent blood extraction for healthcare purposes. Remnant samples were used for the study.

Blood Sample Collection, Extracellular Vesicle Isolation, and RNA Purification
Peripheral whole blood was collected from each subject in a 10 mL EDTA-K2 tube and processed after centrifugation within 4 h to avoid contamination with genomic DNA released from lysed blood cells. Samples were centrifuged at 2000 × g for 20 min to collect 2 to 4 mL of plasma. The plasma obtained was passed through a 0.8 µm filter and stored at −80 °C. Processing of plasma samples and RNA isolation were carried out using the commercial ExoRNeasy Maxi Kit (QIAGEN, Hilden, Germany), following the manufacturer's protocols. Briefly, a membrane-based affinity binding step was used to isolate EVs from filtered plasma. Subsequently, a phenol/guanidine-based combined lysis and elution step recovered vesicular RNA from the spin columns. Total RNA was purified with a silica-membrane-based column system and eluted in 14 µL of RNase-free water. Purified RNA from each sample was assayed qualitatively and quantitatively using the Agilent RNA 6000 Pico Kit (Agilent Technologies, Santa Clara, CA, USA) on an Agilent 2100 Bioanalyzer (see protocol hereafter).

Extracellular Vesicle Characterization by Scanning Transmission Electron Microscopy
To characterize the ultrastructural morphology of plasmatic EVs obtained from patients, STEM was performed. EVs were isolated as described above. After collection, EVs were resuspended in 500 µL of XE buffer (QIAGEN, Hilden, Germany), and samples were then adsorbed onto 300-mesh carbon-coated copper grids for 1 min in a humidified chamber at room temperature. Grids with adhered EVs were examined with a Zeiss Gemini SEM 500 microscope (Carl Zeiss Microscopy GmbH, Jena, Germany) equipped with a STEM detector at 20-30 kV.

Extracellular Vesicle Characterization by Nanoparticle Tracking Analysis
The size distribution and concentration of plasmatic EVs were determined using a Malvern NanoSight NS300 Analyzer (Malvern Panalytical Ltd., Malvern, UK) with specific parameters according to the manufacturer's protocols. EVs were isolated using the ExoRNeasy Maxi Kit (QIAGEN, Hilden, Germany) following the manufacturer's instructions and resuspended in 500 µL of XE buffer (QIAGEN, Hilden, Germany). Capture and analysis were performed using the built-in NanoSight Software NTA3.3.301 (Malvern Panalytical Ltd., Malvern, UK). The detection threshold for nanoparticles was fixed at 8 for all tests. Samples were diluted in PBS to a final volume of 1 mL. For each measurement, five consecutive 60 s videos were recorded at 25 °C, using a continuous syringe pump at an infusion rate of 40 units. Particles (EVs) were detected using a 488 nm laser (blue) and a scientific CMOS camera.

Bioanalyzer Analysis of Total Purified RNA
Quality, integrity, and the size distribution pattern of total RNA were analyzed by chip-based capillary electrophoresis on an Agilent 2100 Bioanalyzer using the RNA 6000 Pico Chip (Agilent Technologies, Santa Clara, CA, USA), according to the manufacturer's protocol. The RNA 6000 Pico Chip assay is designed for analysis of RNA fragments, and each chip contains an interconnected set of microchannels that is used for separation of nucleic acid fragments based on their size. Quality and quantity measures were collected from the generated Bioanalyzer result reports after evaluation of the reference ladder. The total RNA concentration in the sample had to be between 200 and 5000 pg/µL.

Reverse Transcription and Real-Time PCR (qRT-PCR)

Reverse Transcription
RNA was quantified by an Agilent 2100 Bioanalyzer, and total RNA samples were reverse transcribed into cDNA according to the QuantiNova Reverse Transcription kit (QIAGEN, Hilden, Germany) protocol. The synthesis of cDNA was carried out in a thermal cycler following the manufacturer's instructions: 2 min at 45 °C for the gDNA elimination reaction, 3 min at 25 °C for the annealing step, 10 min at 45 °C for the reverse transcription step, and 5 min at 85 °C to inactivate the reverse transcriptase.

qRT-PCR
Quantitative RT-PCR was performed in a CFX96 C1000 Thermal Cycler (Bio-Rad Laboratories, Hercules, CA, USA). cDNA expression was assessed using a QuantiNova SYBR Green PCR kit (QIAGEN, Hilden, Germany). PCR conditions were set according to the supplier. Briefly, an initial activation at 95 °C for 2 min was followed by 45 cycles of 95 °C for 5 s and 60 °C for 10 s. Melting curves were analyzed with increments of 0.5 °C from 55 °C to 95 °C. PCRs were performed in 20 µL reaction volumes containing 6 µL H2O, 10 µL QuantiNova SYBR Green PCR, 1 µL of each primer (20 µM forward/reverse primers (TIB Molbiol, Berlin, Germany), final concentration: 1 µM), and 2 µL cDNA template. Forward and reverse primers are listed in Table 4. Relative gene expression was calculated using a modified comparative threshold cycle (Ct) method ($2^{-\Delta\Delta Ct}$), as described previously by Pfaffl [53]. The method is a simple formula for calculating the relative fold gene expression: fold gene expression = $2^{-\Delta\Delta Ct}$, where $\Delta Ct$ = average of Ct (gene of interest) − average of Ct (housekeeping gene), and $\Delta\Delta Ct$ = average of $\Delta Ct$ (group of interest) − average of $\Delta Ct$ (control group). Two replicates of each sample were analyzed for each gene. For the housekeeping gene (ACTB), a Ct ≤ 28 was considered positive. ACTB Ct values ≥ 29 were considered negative, and the results were considered invalid. For epithelial and mesenchymal markers, a Ct value ≤ 39 with a sigmoidal curve was accepted as positive. After amplification, a representative sample from each set of amplicons was analyzed by agarose electrophoresis to confirm their specificity. Sixteen microliters of PCR products were separated by electrophoresis on a 2% agarose gel.
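As an illustration only, the relative quantification described above can be sketched in a few lines of Python. This is a minimal sketch, not the authors' analysis pipeline: the function names, constants, and example Ct values are hypothetical. Applying the Ct cut-offs to the replicate mean, and treating any ACTB mean above 28 as invalid, are assumptions made here for simplicity. The sigmoidal-curve criterion is not modelled.

```python
from statistics import mean

# Cut-offs taken from the text; applying them to replicate means is an assumption.
ACTB_MAX_CT = 28.0    # housekeeping gene (ACTB): Ct <= 28 considered positive
MARKER_MAX_CT = 39.0  # epithelial/mesenchymal markers: Ct <= 39 considered positive


def delta_ct(marker_cts, actb_cts):
    """Return dCt = mean Ct(marker) - mean Ct(ACTB) for one sample.

    Returns None when the housekeeping control fails or the marker is
    not detected under the cut-offs above (hypothetical handling).
    """
    actb = mean(actb_cts)
    marker = mean(marker_cts)
    if actb > ACTB_MAX_CT or marker > MARKER_MAX_CT:
        return None
    return marker - actb


def fold_expression(dct_group, dct_control):
    """Return 2^(-ddCt), with ddCt = mean dCt(group) - mean dCt(control)."""
    ddct = mean(dct_group) - mean(dct_control)
    return 2.0 ** (-ddct)


# Hypothetical duplicate Ct readings for one sample (marker, then ACTB).
print(delta_ct([30.0, 32.0], [24.0, 26.0]))       # 6.0
# Hypothetical dCt values for a patient group and a control group.
print(fold_expression([5.0, 7.0], [8.0, 10.0]))   # 8.0
```

A lower ΔCt in the group of interest than in the control group yields a fold expression greater than 1, that is, higher relative expression of the marker.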

Statistical Analysis
Statistical analyses were performed using the IBM SPSS® Statistics v27 program. Descriptive statistics were used to characterize the clinical and pathological data of the patients in the study. A Shapiro-Wilk normality test was performed for each data set. Grubbs and Dixon tests were carried out to detect atypical data (outliers). Box plot diagrams were used to study the distribution of $\Delta Ct$ values between AC and CP groups. t-Tests were performed for comparisons between two groups when the data followed a normal distribution (comparison of $\Delta Ct$ values between AC and CP groups for epithelial and mesenchymal biomarkers). The nonparametric Mann-Whitney U test was used for comparisons of relative fold expression ($2^{-\Delta\Delta Ct}$ values) between the two cohorts of patients for each gene. $\Delta Ct$ and $2^{-\Delta\Delta Ct}$ values were represented as mean ± SEM. Statistical significance was determined at α-limit = 5%.

Funding: This research was funded by Plan de Innovación Sanitaria Codigo100, Servizo Galego de Saúde, Xunta de Galicia, grant number AB-SER1-19-008 L3. The article processing charges were funded by a Fundación Profesor Nova Santos grant.

Institutional Review Board Statement:
The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Clinical Research Ethics Committee CEIm-G (Comité de ética de la investigación con medicamentos de Galicia; approval registration number 2020/010).
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available in the article. Any other related information or documents not included in this article are available upon request.