Protein Cargo of Salivary Small Extracellular Vesicles as Potential Functional Signature of Oral Squamous Cell Carcinoma

The early diagnosis of oral squamous cell carcinoma (OSCC) is still an investigative challenge. Saliva has been proposed as an ideal diagnostic medium for biomarker detection by means of liquid biopsy techniques. The aim of this pilot study was to apply proteomic and bioinformatic strategies to evaluate saliva small extracellular vesicles (S/SEVs) as a potential source of tumor biomarkers. Among the twenty-three enrolled subjects, 5 were free from the disease (OSCC_FREE), 6 had OSCC without lymph node metastasis (OSCC_NLNM), and 12 had OSCC with lymph node metastasis (OSCC_LNM). The S/SEVs from the subjects of each group were pooled and characterized before their quantitative proteomes were compared using the SWATH-MS (Sequential Window Acquisition of all Theoretical Mass Spectra) method. The analysis yielded quantitative information for 365 proteins that differentially characterize the S/SEVs of the analyzed clinical conditions. Bioinformatic analysis of the proteomic data highlighted that each S/SEV group was associated with a specific cluster of enriched functional network terms. Our results indicate that the protein cargo of salivary small extracellular vesicles defines a functional signature, and thus has potential value as a source of novel predictive biomarkers for OSCC.


Introduction
Oral squamous cell carcinoma (OSCC) is one of the most prevalent histotypes of cancer worldwide and is a challenge to public health. Despite the introduction of new diagnostic tools and treatment modalities for the management of OSCC, its prognosis still remains very poor, with a 5-year mortality rate of approximately 60% [1]. Although the accessibility of the oral cavity can render the clinical examination easy, OSCC is usually diagnosed in advanced stages due to diagnostic delay, which obviously decreases the chances of survival [2,3].
To date in current clinical practice, OSCC diagnosis is usually preceded by oral visual examination, including inspection and palpation, by general physicians or dentists. In cases of suspicious neoplastic lesions, the clinical examination is integrated by incisional biopsy followed by histological investigation; however, no specific and reliable molecular markers are yet available [2,4,5]. Thus, more recent research has been focusing on the identification of non-invasive or minimally invasive markers for OSCC screening and longitudinal monitoring of the patients' response to treatment. In this context, liquid biopsy is a promising method for early diagnosis and real-time monitoring based on the analysis of circulating tumor cells (CTCs), circulating tumor DNAs (ctDNAs), circulating cell-free microRNAs (cfmiRNAs), extracellular vesicles (EVs), and other cancer-derived products isolated from blood or other biofluids (e.g., saliva, urine, ascites, pleural effusion, etc.) [5][6][7][8]. Liquid biopsy allows one to obtain a real-time picture at different time points, giving information about tumor and tumor burden as well as early evidence of drug resistance and tumor recurrence [4,9], supporting the development of more highly personalized diagnosis and therapies [7,10]. In recent years, several studies have focused on describing the use of EV-based liquid biopsy as a source of biomarkers for several kinds of cancer [11][12][13][14].
EVs are heterogeneous membranous structures secreted by all living cells, including cancer cells, in the surrounding microenvironment, as well as in proximal and systemic body fluids.
Historically, EVs were classified, based on their biogenesis, into exosomes (of endocytic origin) and microvesicles (directly shed by the plasma membrane); however, since it is not always easy to establish the presence of specific markers of subcellular origin, the International Society for Extracellular Vesicles (ISEV) suggests indicating EV subtypes with reference to physical characteristics of EVs, such as size. Thus, it is now more appropriate to refer to "small EVs" (SEVs, <200 nm) and "medium/large EVs" (M/LEVs) [15].
From a functional point of view, SEVs are described as cell-free messengers playing a crucial role in cell-cell communication, strongly depending on the nature of the transported active biomolecules (proteins, mRNAs, miRNAs, and lipids). A significant body of literature has demonstrated that the SEVs released by tumor cells have an active role in promoting tumor growth and progression [16][17][18] and carry tumor-specific RNAs and proteins that are considered attractive targets for diagnostic application [19][20][21]. Moreover, for their high stability in the circulation and body fluids, SEVs are considered one of the more promising elements characterizing the liquid biopsy. Among the biological fluids, saliva is proposed as an ideal diagnostic medium for biomarker detection. The main advantages of using saliva are its non-invasiveness, ease of collection, and cost-effectiveness, as well as the possibility of detecting low-abundance biomarkers often untraceable in blood or serum samples, which have a more complex molecular composition. In the last 15 years, several studies have widely demonstrated that saliva mirrors the conditions of the oral cavity (as its proximal fluid) but also of the whole body, thus supporting the application of salivary diagnostics for systemic and oral diseases [22][23][24]. Among the components of saliva, SEVs are considered as a specific and stable source of biomarkers, since by reducing the complexity of the whole saliva, they can provide more accurate and clinically relevant information for disease detection and diagnoses [25].
In the last decades, proteomics technologies have represented promising tools for disease-associated biomarker detection, offering the possibility of analyzing the global protein profile of a sample (tumor tissues, body fluids, vesicles). The comparative analysis of protein profiles identified in "normal" and "disease" samples and the following bioinformatic analysis allow one to define a panel of aberrantly expressed proteins that can increase the accuracy of current diagnostic methods.
In this study, we applied proteomic and bioinformatic strategies to determine the potential use of saliva small extracellular vesicles (S/SEVs) derived from OSCC as a potential tumor biomarker source. The proteome profiles of S/SEVs from subjects without OSCC (OSCC_FREE) and from OSCC patients without and with lymph node metastasis (OSCC_NLNM and OSCC_LNM, respectively) were compared using the quantitative proteomic SWATH-MS (Sequential Window Acquisition of all Theoretical Mass Spectra) method. For the first time, this study reveals that the S/SEVs have a specific protein signature differentiating not only healthy controls from OSCC patients but also NLNM patients from LNM ones, showing their potential use as non-invasive liquid biopsies for improving the diagnostic routines and the clinical outcomes of OSCC patients.

Enrolled Subjects and Sample Collections
Among the 23 subjects enrolled in this study, 5 were without OSCC (OSCC_FREE group) and 18 were patients with OSCC, of whom 6 were without lymph node metastases (NLNM) and 12 with lymph node metastases (LNM) (Figure 1A). Demographic and clinical/anamnestic data of each group are summarized in Table 1. For all groups, the mean age was over 60 years; the OSCC_FREE group was closer to being gender-balanced (# males = 3, 60%; # females = 2, 40%), while a female prevalence was observed in the OSCC_NLNM group (# females = 5, 83.3%) and a male prevalence was observed in the OSCC_LNM group (# males = 8; 66.7%). In the OSCC_FREE group, only one subject was a current or former smoker (20%), while the smokers numbered two (33.3%) and seven (58.3%) in the OSCC_NLNM and OSCC_LNM groups, respectively. Finally, most of the enrolled subjects were non-drinkers: 100% (5/5) in the OSCC_FREE group and 83.3% in the OSCC_NLNM and OSCC_LNM groups (5/6 and 10/12, respectively).

S/SEV Isolation and Protein Cargo Characterization
As reported in the flowchart in Figure 1B, the small EVs were isolated from saliva by differential centrifugation and filtration of saliva samples collected from 5 OSCC_FREE subjects and 18 patients with OSCC (6 NLNM and 12 LNM). The EV pellets belonging to the same group were then pooled and used for the analyses summarized in Figure 2. The protein cargo of the isolated SEVs was characterized by evaluating the presence of specific markers. In order to validate the protocol used for S/SEV isolation, we confirmed the presence of the EV markers HSC70 and CD63 in pooled S/SEV OSCC_FREE samples (Figure 3A). Moreover, the obtained reference protein library, formed by 421 proteins identified by ProteinPilot 4.5 at a 1% critical false discovery rate (FDR) against the Homo sapiens UniProt fasta database (Supplementary Table S1-Protein Library and SWATH-MS Data, Sheet "Protein Library" and Table 3), was compared to the Vesiclepedia database by using FunRich software, in order to verify how many TOP10 and TOP100 EV proteins were present within our S/SEV OSCC protein dataset. The Venn diagram in Figure 3B showed that the isolated S/SEVs contained all the TOP10 and more than 50% of the TOP100 EV proteins. Finally, the analysis performed by FunRich within the GO category "Cellular Component" (GO_CC) showed a good overlap between the S/SEV protein dataset and the Vesiclepedia dataset referring to exosomes and nanovesicles (Figure 3C). Indeed, we found that the six most represented terms are the same in the two analyzed datasets, even if there are differences in the percentage of proteins included in each group, probably due to the greater numeric complexity of the Vesiclepedia dataset.

Protein Profile Characterization of S/SEVs
The obtained spectral reference library was then used for developing the SWATH-MS strategy, and 7852 targeted peptides (filtered using an FDR threshold of ≤5% over nine runs) gave a detection rate of 75.3% (47314 of 62816), resulting in quantitative information for 365 proteins (Supplementary Table S1, sheet "SWATH-MS Data"). Among the technical replicates of each group, around 80% of the proteins were quantified with a coefficient of variation (CV) ≤25% (Table 3 and Figure 4A).
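The two summary figures quoted above (the detection rate and the share of proteins with CV ≤25% among technical replicates) can be reproduced with a few lines of Python. The following is only a minimal sketch: the protein names and peak areas are hypothetical toy values, and the use of the sample standard deviation for the CV is an assumption, since the text does not specify it.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Percent CV of replicate measurements (sample standard deviation / mean * 100)."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined for a zero mean")
    return stdev(values) / m * 100.0


def fraction_within_cv(replicates_by_protein, cv_limit=25.0):
    """Return the fraction of proteins whose CV is <= cv_limit, plus the CV of each protein."""
    cvs = {p: coefficient_of_variation(v) for p, v in replicates_by_protein.items()}
    passing = [p for p, cv in cvs.items() if cv <= cv_limit]
    return len(passing) / len(cvs), cvs


# Detection rate as reported in the text: 47314 detected of 62816 possible.
detection_rate = 47314 / 62816 * 100.0

# Hypothetical peak areas for three technical replicates of two proteins.
toy = {
    "PROT_A": [100.0, 110.0, 105.0],
    "PROT_B": [50.0, 90.0, 20.0],
}
fraction, cvs = fraction_within_cv(toy)
```

In this toy case only PROT_A falls under the 25% limit, so `fraction` is 0.5; on the real data the same calculation would be run once per S/SEV group.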

In our analysis, we considered as differentially modulated those proteins showing a fold change (FC) beyond ±1.5 in relative abundance (FC > 1.5 or FC < 0.667, i.e., 1/1.5) and a Benjamini-Yekutieli (BY) corrected p-value ≤ 0.05; these are indicated as yellow (up-represented) and blue (down-represented) dots in the volcano plots in Figure 4B. The numbers of differentially modulated proteins found in each comparison are summarized in Table 4. The high number of regulated proteins we found is due not only to the small sample size, but also to the small fold change cutoff that we set to obtain a wide overview of the differences characterizing each of the analyzed S/SEV groups. This choice served the purpose of highlighting, rather than single proteins, an S/SEV protein profile that could be assigned the value of a biomarker for OSCC. For these reasons, in this study we do not present the analysis of single proteins, even if highly regulated, since such speculation would require a validation step on single S/SEV preparations. Given the kind of samples used, we considered it more useful and valid to perform an analysis aimed at extrapolating a protein signature of OSCC S/SEVs. Further analyses will eventually be needed to propose specific proteins that can have a direct role in clinical practice, but this is not the aim of this study. Details of the performed quantitative analysis are reported in Supplementary Table S2, in sheets "SEV OSCC_FREE vs. SEV OSCC_NLNM", "SEV OSCC_FREE vs. SEV OSCC_LNM", and "SEV OSCC_NLNM vs. SEV OSCC_LNM", respectively.
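The selection criterion described above can be illustrated with a short Python sketch. It assumes that the lower fold-change bound is the reciprocal of the upper one (1/1.5, about 0.667), which is our reading of the symmetric ±1.5 cutoff; the protein names, fold changes, and corrected p-values are hypothetical.

```python
FC_UP = 1.5
FC_DOWN = 1 / FC_UP  # assumption: symmetric cutoff, about 0.667
P_MAX = 0.05         # threshold on the BY-corrected p-value


def classify(fc, adj_p, fc_up=FC_UP, fc_down=FC_DOWN, p_max=P_MAX):
    """Label a protein as 'up', 'down', or 'unchanged' for one pairwise comparison."""
    if adj_p > p_max:
        return "unchanged"  # not significant, whatever the fold change
    if fc > fc_up:
        return "up"
    if fc < fc_down:
        return "down"
    return "unchanged"      # significant, but fold change within the cutoff


# Hypothetical (name, fold change, BY-corrected p-value) triples.
toy = [
    ("PROT_A", 2.3, 0.01),
    ("PROT_B", 0.4, 0.03),
    ("PROT_C", 3.0, 0.20),
    ("PROT_D", 1.2, 0.001),
]
calls = {name: classify(fc, p) for name, fc, p in toy}
```

Here PROT_C is left unchanged despite its large fold change because its corrected p-value exceeds 0.05, and PROT_D is left unchanged because its fold change lies inside the cutoff.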
The modulation of all quantified proteins, shown in the heat map in Figure 4C, highlighted that each S/SEV pool is specifically distinguished from the others by the group of proteins that are up-represented, corresponding to the yellow bars framed by the dotted line. In light of this observation, among the significantly modulated proteins reported in the volcano plots in Figure 4B, we focused on those up-represented in each S/SEV group. The analysis of these up-represented proteins performed using ClueGO allowed us to highlight three different clusters of enriched functional network terms (adjusted p-value < 0.05), one for each of the three S/SEV subtypes (Figure 5A and Supplementary Table S3, sheet "ClueGO Results"), indicated as CLUSTER S/SEV OSCC_FREE, CLUSTER S/SEV OSCC_NLNM, and CLUSTER S/SEV OSCC_LNM. Within the clusters, each node represents a "biological process" term (circle) or a Reactome pathway (hexagon), and the arrows represent direct relations between the nodes. Nodes are specifically related to a cluster when at least 75% of the proteins of the node belong to that cluster (Table S3, sheet "ClueGO Results"). In Figure 5B, nodes/terms with the same color form a GO functional group, as specified in Figure 6 and in Supplementary Table S3 (sheets "CLUSTER OSCC_FREE", "CLUSTER OSCC_NLNM", and "CLUSTER OSCC_LNM"). Interestingly, we found that these GO groups were unique for each cluster and defined a specific functional signature of S/SEV OSCC_FREE, S/SEV OSCC_NLNM, and S/SEV OSCC_LNM (Figure 6). Since it is known that the protein cargo of EVs often reflects that of the originating cells, the functional signature characterizing the three clusters can probably mirror the biological status and activities of the oral mucosa cells in the three analyzed clinical conditions. In particular, the ClueGO analysis highlighted five GO groups specifically associated with CLUSTER S/SEV OSCC_FREE, five with CLUSTER S/SEV OSCC_LNM, and seven with CLUSTER S/SEV OSCC_NLNM.
Among the five GO groups identified in the CLUSTER S/SEV OSCC_FREE (Figure 6 and Supplementary Table S3), it was interesting to find the group "Detoxification of Reactive Oxygen Species" (associated with ERO1A, GSTP1, PRDX1, PRDX6, TXN; see Table 5 for the FC in S/SEV OSCC_FREE), the group "Diseases associated with O-glycosylation of proteins" (associated with MUC5B, MUC7, and MUC16; see Table 5 for the FC in S/SEV OSCC_FREE), and the group "Keratinization" (associated with DSG3, KRT1, KRT10, KRT9, PRSS8, SPRR3; see Table 5 for the FC in S/SEV OSCC_FREE), all activities that can protect the oral mucosa against cancer development [26][27][28][29][30][31]. The last GO group associated with the CLUSTER S/SEV OSCC_FREE was "Immune-response-regulating cell surface receptor signaling pathway" (associated with several immunoglobulin heavy and light chains, RAP1A, CEACAM1, EZR, MUC16, MUC5B, MUC7, PIGR, PRDX1; see Table 5 for the FC in S/SEV OSCC_FREE), indicating that S/SEV OSCC_FREE are enriched in proteins involved in the modulation of the immune response.
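The rule used to relate a network node to a cluster (at least 75% of the node's proteins belonging to that cluster) can be expressed as a small Python function. This is an illustrative sketch of the rule as stated in the text, not the ClueGO implementation; the cluster contents and protein identifiers are hypothetical.

```python
def assign_node(node_proteins, clusters, min_share=0.75):
    """Return the cluster holding at least min_share of the node's proteins, or None.

    With min_share above 0.5, at most one cluster can satisfy the rule,
    so the iteration order over clusters does not affect the result.
    """
    node = set(node_proteins)
    if not node:
        return None
    for name, members in clusters.items():
        share = len(node & set(members)) / len(node)
        if share >= min_share:
            return name
    return None  # node shared between clusters, specific to none


# Hypothetical clusters of up-represented proteins.
clusters = {
    "OSCC_FREE": {"P1", "P2", "P3", "P4"},
    "OSCC_LNM": {"P5", "P6"},
}
specific = assign_node(["P1", "P2", "P3", "P5"], clusters)  # 3 of 4 in OSCC_FREE
shared = assign_node(["P1", "P5"], clusters)                # 1 of 2 in each cluster
```

The first node is assigned to OSCC_FREE because exactly 75% of its proteins belong to that cluster, whereas the second node is split evenly and is therefore not specific to any cluster.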

Discussion
An early and accurate diagnosis of OSCC often provides the best chance of survival and favorable outcomes as compared to diagnoses in advanced stages. To date, the visual inspection of the oral cavity followed by an incisional biopsy is still considered the gold standard diagnostic method for OSCC [2]. However, these approaches require the presence of lesions and visible alterations of the oral mucosa, and often do not allow the early capture of latent or still asymptomatic malignant lesions. Thus, the availability of molecular biomarkers in biological fluids becomes indispensable. In this context, blood and saliva EVs (B/EVs and S/EVs, respectively) represent a valid source for the detection of OSCC biomarkers [34][35][36]. However, although emerging exosome technologies have made available interesting data on the diagnostic and prognostic value of the miRNA and protein profiles of EVs [37], considerable effort is still needed for a deep molecular characterization of EVs, and further studies are required to allow clinical applications of this knowledge.
In this study, in order to provide new insights leading to the development of valid diagnostic and prognostic tools for OSCC, we performed a proteome quantitative SWATH-MS analysis of S/EVs isolated from healthy subjects and patients with NLNM and LNM OSCC.
Unlike the shot-gun proteomic methods used to investigate S/EV proteomes [34], the targeted SWATH-MS strategy employed in this study is a specific variant of data-independent acquisition (DIA) methods emerging as a technology that combines deep proteome coverage capabilities with quantitative consistency and accuracy, making it a valid strategy for biomarker discovery [38,39].
Results showed that the S/SEV OSCC_FREE, S/SEV OSCC_NLNM, and S/SEV OSCC_LNM were characterized by the enrichment of specific proteins belonging to GO groups that defined a unique functional signature of each S/SEV cluster. Since it is known that the protein cargo of EVs often reflects that of the originating cells, the functional signature characterizing the three clusters can probably mirror the biological status and activities of oral mucosa cells in the three analyzed clinical conditions. As reported in the "Results" section, among the GO groups identified in the CLUSTER S/SEV OSCC_FREE (Figure 6 and Supplementary Table S3), we found the group "detoxification of Reactive Oxygen Species" (associated with ERO1A, GSTP1, PRDX1, PRDX6, TXN; see Table 5 for the FC in S/SEV OSCC_FREE), the group "diseases associated with O-glycosylation of proteins" (associated with MUC5B, MUC7, and MUC16; see Table 5 for the FC in S/SEV OSCC_FREE), and the group "keratinization" (associated with DSG3, KRT1, KRT10, KRT9, PRSS8, SPRR3; see Table 5 for the FC in S/SEV OSCC_FREE), all activities that can protect the oral mucosa against cancer development. Indeed, since it is well known that oxidative stress and consequent ROS production are involved in the pathogenesis of oral cancer [30], the higher presence in S/SEV OSCC_FREE of proteins eliciting an anti-oxidative response can mirror the condition of the originating cells, therefore indicating their ability to protect the oral mucosa from pro-tumoral stimuli. Similarly, the higher presence in S/SEV OSCC_FREE of MUC5B, MUC7, and MUC16 may indicate a condition in which the oral mucosa of OSCC_FREE subjects is more protected from bacterial infections that are strictly related to oral carcinogenesis [29]. The mucins are highly O-glycosylated proteins that form the mucus gel layers on several organs with tissue specificity, thus maintaining a continuous defensive barrier against all aggressive external forces [26].
In the oral cavity, the mucosal pellicle is mostly composed of the salivary mucins MUC5B and MUC7 (having antifungal, antibacterial, and antiviral functions) and of secretory IgA (SIgA), which constitutes the main specific immune defense mechanism and plays an important role in the homeostasis of the oral microbiota [28]. Due to this composition, the mucosal pellicle works as a protective layer, ensuring lubrication of the oral epithelia and also protection against excessive bacterial colonization [29]. Moreover, it is also known that, besides their own defensive action, mucins mediate SIgA binding to the mucosal surface, thus influencing the immune activity of the mucosal pellicle [27]. The higher presence of mucins in S/SEV OSCC_FREE can indicate a better predisposition to prevent oral dysbiosis, which emerging evidence suggests is involved in oral cancer development [29]. In addition, the presence in the CLUSTER S/SEV OSCC_FREE of the GO group "keratinization" may reflect a condition of well-being of the oral mucosa of OSCC_FREE subjects. Indeed, it is known that in the oral cavity, the keratinocytes, through a network of desmosomes and keratins, form a strong anatomical barrier that protects from both mechanical and chemical stress, as well as from microbial infections [31]. It was interesting to find within this GO group the Small Proline Rich Protein 3 (SPRR3), recently proposed as a novel diagnostic and prognostic tumor marker of OSCC, since the survival analysis showed that its under-expression was associated with a poor prognosis, and that the decrease of SPRR3 expression corresponded to increased tumor malignancy [40].
Within CLUSTER S/SEV OSCC_NLNM, it was interesting to specifically find the GO groups "regulation of blood coagulation" and "platelet degranulation" (associated with A1BG, A2M, AHSG, ALB, APCS, some apolipoproteins, CLU, ECM1, F2, FGA, FGB, FGG, FN1, HRG, ITIH4, KLKB1, KNG1, ORM1, PLG, PROS1, several SERPINs, TF, VTN) and the GO group "plasma lipoprotein particle remodeling" (associated with A2M, ALB, APOA1, APOA2, APOA4, APOB, APOC1, APOC3, APOE). Hypercoagulability is a recurrent condition in several types of cancer, causing venous thromboembolism (VTE), which is a common complication in patients with cancer [41]. Thus, it was notable that CLUSTER S/SEV OSCC_NLNM was characterized by the presence of proteins associated with the coagulation process, which distinguished this cluster not only from that of S/SEV OSCC_FREE, but also from that of S/SEV OSCC_LNM.
The role of lipid carriers in cancers is widely discussed, and emerging evidence highlights that the functionality and the impact of the apolipoproteins on the tumor microenvironment depend on the specific tissue context [42]. Interestingly, it was reported that stress-induced recruitment of lipoproteins and EVs represents a new mechanism of cancer cell adaptation, and that microenvironment changes induced by tumor cells can promote the formation of EV/lipoprotein complexes affecting the following entry and cargo transfer into recipient cells [43].
Finally, the CLUSTER S/SEV OSCC_LNM was specifically associated with GO groups related to activities against pathogens, such as "metal sequestration by antimicrobial proteins" (associated with LCN2, LTF, S100A9; see Table 6 for the FC in S/SEV OSCC_LNM); "growth of symbiont in host" (associated with ELANE, MPO, PGLYRP1; see Table 6 for the FC in S/SEV OSCC_LNM); and "antimicrobial peptides" (associated with ELANE, GAPDH, LCN2, LTF, MPO, PGLYRP1, PI3, PRTN3, RNASE3, S100A12, S100A9; see Table 6 for the FC in S/SEV OSCC_LNM). Proteins of these groups, such as lactoferrin (LTF), lipocalin-2 (LCN2), S100A9 (forming with S100A8 the heterodimeric complex calprotectin), neutrophil elastase (ELANE), peptidoglycan recognition protein 1 (PGLYRP1), and myeloperoxidase (MPO), are widely described for their antibacterial activity or for their role as inflammatory markers [44][45][46][47][48]. The enrichment of these proteins in the S/SEVs from patients with OSCC_LNM could be indicative of dysbiotic signatures occurring during tumor progression. Evidence accumulated in recent years indicates that alterations of the oral microbiome can have a role in inducing oral cancer progression [49][50][51][52]. Interestingly, some of these proteins (such as S100 proteins and LCN2) are described as diagnostic and prognostic markers for several types of tumors, even though their role in oral cancer is controversial [48,53,54].
Taken together, obtained data support the use of S/SEVs as a promising diagnostic marker source for OSCC. Our approach presented here also has limitations, particularly with regard to the small number of patients enrolled and the numerical non-homogeneity of the groups analyzed, so further analyses must be performed using larger data sets. Furthermore, since this proteomic study was performed on S/SEV pools, the validity of the predictive value of their protein cargo in OSCC will also have to be evaluated on single samples.

Subject Enrolment and Saliva Collection
All participants were recruited from the Unit of Oral Medicine at the "Paolo Giaccone" Policlinico University Hospital in Palermo (Italy). The study protocol, which conformed with the ethical guidelines of the 1964 Declaration of Helsinki and later amendments or comparable ethical standards, was approved by the Institutional Ethics Committee of the "Paolo Giaccone" Policlinico University Hospital in Palermo (approval date: 6 February 2013; approval number 3/2013). All patients signed written informed consent before specimens were collected for the analyses. In total, 18 patients diagnosed with OSCC and 5 OSCC_FREE subjects who were not on any medication and practiced regular oral hygiene were enrolled.
All OSCC patients underwent surgery, including wide tumor excision and neck lymph node dissection. Among them, 6 did not have lymph node metastasis (NLNM) and 12 did (LNM). Finally, three different groups were defined for the following analyses (Figure 1A): the OSCC_FREE group (n = 5), the OSCC_NLNM group (n = 6), and the OSCC_LNM group (n = 12). All subjects were asked to refrain from eating, drinking, or oral hygiene for at least one hour prior to collection. The volunteers were asked to rinse their mouth with 10 mL of water with 0.9% saline to remove food debris and then waited for at least 5 min before collection of about 15 mL of saliva in 50 mL Falcon tubes. Once collected, the saliva samples were immediately kept on dry ice and transported from the hospital to the laboratory for processing. If not immediately processed, the samples were stored at −80 °C until further analyses.

Saliva SEV Isolation
Each saliva sample was diluted 1:1 with phosphate-buffered saline (PBS) prior to SEV isolation, which followed the experimental workflow shown in Figure 1B. As reported, the saliva samples were centrifuged at 300× g for 5 min at 30 °C to eliminate the cells. Then, the supernatant was centrifuged at 3000× g for 15 min at 4 °C, and further at 10,000× g for 30 min at 4 °C to eliminate cell debris, other contaminants, and M/LEVs as well. Finally, the supernatant was filtered through a 0.45 µm VWR® Vacuum Filtration System (VWR International, West Chester, PA, USA), before ultracentrifugation (Ti70 or Ti45 rotor, Beckman Coulter, Brea, CA, USA) at 100,000× g for 70 min at 4 °C to pellet the SEVs, which were finally resuspended in 100 µL PBS.
In order to increase the protein amount and also to minimize individual-to-individual differences, the SEV pellets isolated from saliva samples (S/SEVs) of the same group were pooled. Thus, subsequent analyses were carried out on three types of pooled samples: (a) S/SEV OSCC_FREE; (b) S/SEV OSCC_NLNM; and (c) S/SEV OSCC_LNM.

Western Blot
An aliquot of the S/SEV OSCC_FREE sample was treated with RIPA lysis buffer with protease inhibitor cocktail [55]. Subsequently, lysates were centrifuged at 12,000× g for 1 h on ice, the supernatant was collected, and the protein concentration was determined by Bradford assay (Pierce, Rockford, IL, USA). Proteins were then separated using 4-12% Novex Bis-Tris SDS-acrylamide gels (Invitrogen, Waltham, MA, USA) and immunoblotted with the following primary antibodies: CD63 and HSC70 (Santa Cruz Biotechnology, Inc., Santa Cruz, CA, USA). The secondary antibodies were also obtained from Santa Cruz Biotechnology, Inc. Chemiluminescence was detected using Amersham ECL Western Blotting Detection Reagent (Global Life Sciences Solutions, Amersham Place, Little Chalfont, Buckinghamshire, UK).
Equal amounts of peptides from each of the three samples were mixed to prepare a pool of tryptic peptides, which was subjected to Data-Dependent Acquisition (DDA) analysis. The resulting list of proteins/peptides was used for construction of the SWATH-MS reference spectral library.
The analysis was performed by a Triple TOF 5600 Plus System equipped with an Eksigent Ekspert nano LC 425 system (AB Sciex, Framingham, USA).
Pooled tryptic peptides (4 µg) were loaded onto a C18 reverse-phase trap column (Acclaim PepMap 100 C18 LC Trap Column, Thermo Fisher Scientific) at a flow rate of 5 µL/min, using 0.1% v/v formic acid (FA) in water from a loading pump. Peptides were then separated on the Acclaim™ PepMap™ RSLC column (75 µm × 25 cm nanoViper C18 2 µm 100 Å, Thermo Fisher Scientific), equilibrated at 40 °C with 0.1% FA in water (solvent A), at a flow rate of 300 nL/min using 0.1% FA in ACN (solvent B), according to the following gradient method: a linear increase of solvent B from 10% to 40% over 60 min and from 40% to 70% over 15 min, a further increase to 95% within 1 min, and maintenance at 95% for 5 min to rinse the analytical column; finally, a decrease of solvent B from 95% to 10% within 1 min, and a hold at 10% for the remaining 18 min to re-equilibrate the column.
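The gradient program described above can be written down as a piecewise-linear function of time, which makes the total run length and the solvent B percentage at any time point easy to check. The following Python sketch encodes only the segment durations and end points given in the text; the function and variable names are illustrative, and linear interpolation within every ramp segment is an assumption.

```python
# (duration in min, percent solvent B at the end of the segment), as given in the text.
SEGMENTS = [
    (60, 40.0),  # linear increase from 10% to 40%
    (15, 70.0),  # linear increase from 40% to 70%
    (1, 95.0),   # ramp to 95%
    (5, 95.0),   # hold to rinse the analytical column
    (1, 10.0),   # return to 10%
    (18, 10.0),  # hold to re-equilibrate the column
]
START_B = 10.0


def breakpoints(segments=SEGMENTS, start_b=START_B):
    """Convert segment durations into cumulative (time, percent B) breakpoints."""
    t = 0.0
    points = [(t, start_b)]
    for duration, end_b in segments:
        t += duration
        points.append((t, end_b))
    return points


def percent_b(t, points):
    """Percent solvent B at time t (min), by linear interpolation between breakpoints."""
    if t < points[0][0] or t > points[-1][0]:
        raise ValueError("time outside the gradient program")
    for (t0, b0), (t1, b1) in zip(points, points[1:]):
        if t0 <= t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


points = breakpoints()
total_run_min = points[-1][0]
mid_first_ramp = percent_b(30, points)
during_rinse = percent_b(78, points)
```

Summing the segment durations gives a run of 100 min per injection, with solvent B at 25% halfway through the first ramp and at 95% during the rinse step.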
The mass spectrometer operated in MS scan (400 m/z to 1250 m/z; accumulation time 250 ms) in high resolution mode (>30,000) and in MS/MS scan (230 m/z to 1500 m/z; accumulation time 65 ms) in high sensitivity mode (resolution > 15,000) with rolling collision energy. A maximum of 50 precursors per cycle from each MS spectrum, with charge states from 2 to 5, were fragmented if exceeding a threshold of 100 counts per second (cps), with a dynamic exclusion window of 12 s.
The DDA file was submitted to Protein Pilot™ 4.5 software (AB SCIEX, Toronto, Canada); Uniprot was used as the human protein database (downloaded in May 2020, 149,644 protein sequence entries). The database search was performed with the Paragon algorithm by using the following parameters: iodoacetamide cysteine alkylation, digestion by trypsin, and ID focus on biological modifications.
For SWATH-MS analysis, 2 µg of each sample was analyzed in triplicate, to limit the effect of random variation, using the following SWATH-MS settings: with a cycle time of 2 s, a 50 ms TOF/MS survey scan was performed between 400 and 1250 Da, with 34 precursor isolation windows (swaths) of 25 Da each. SWATH MS/MS acquisition was carried out between 230 and 1500 Da using a 76 ms accumulation time.
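The window scheme can be checked with a short calculation. The Python sketch below is illustrative and assumes contiguous, non-overlapping windows of fixed width, which the text does not state explicitly; the function name `swath_windows` is hypothetical. Under that assumption, 25 Da windows tile the 400 to 1250 Da precursor range in exactly 34 steps.

```python
def swath_windows(start_da=400.0, end_da=1250.0, width_da=25.0):
    """List (lower, upper) isolation windows tiling [start_da, end_da].

    Assumes fixed-width, contiguous windows with no overlap.
    """
    n_windows = round((end_da - start_da) / width_da)
    return [
        (start_da + i * width_da, start_da + (i + 1) * width_da)
        for i in range(n_windows)
    ]


windows = swath_windows()
```

Here `len(windows)` is 34, the first window spans 400 to 425 Da, and the last spans 1225 to 1250 Da.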

Bioinformatic and Statistical Analysis
To evaluate the global protein composition of the S/SEV, the reference protein library obtained by the DDA analysis was compared with the Vesiclepedia database by using the stand-alone enrichment analysis tool FunRich (Functional Enrichment analysis tool; http://www.funrich.org, accessed on 22 September 2021) [57].
Data from SWATH-MS analysis were processed with PeakView v2.2 and MarkerView v1.2.1 (AB SCIEX, Framingham, USA). In PeakView, data were analyzed using the following parameters: 10 peptides, 7 transitions per peptide, 90% peptide confidence threshold, 5% false discovery rate (FDR) threshold, modified peptides excluded, extracted ion chromatogram (XIC) extraction window of 5 min, and XIC width of 0.05 Da. The list of proteins with FDR lower than 5% generated in PeakView was exported to MarkerView for normalization of protein intensity (peak area) using the total area sums algorithm and for t-test analysis [58]. Proteins were considered differentially expressed if the fold change (FC) between the compared groups was >1.5 or <0.667 (i.e., a change of at least ±1.5-fold) with a corrected p-value ≤ 0.05.
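The normalization and filtering logic can be sketched as follows. This Python example is a simplified illustration, not a reproduction of the MarkerView implementation: it assumes that total area sums normalization divides each protein peak area by the summed area of its sample, and that the lower FC bound is the reciprocal of the upper bound (1/1.5). All function names are hypothetical.

```python
def normalize_total_area(areas):
    """Scale protein peak areas so that they sum to 1 within one sample.

    `areas` maps a protein identifier to its raw peak area.
    """
    total = sum(areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {protein: area / total for protein, area in areas.items()}


def fold_change(mean_group_a, mean_group_b):
    """Ratio of the mean normalized intensities of two groups."""
    return mean_group_a / mean_group_b


def is_differential(fc, corrected_p, fc_cutoff=1.5, alpha=0.05):
    """Apply the FC and corrected p-value criteria described in the text."""
    passes_fc = fc > fc_cutoff or fc < 1.0 / fc_cutoff
    return passes_fc and corrected_p <= alpha
```

For instance, a protein with FC = 2.0 and a corrected p-value of 0.01 passes both criteria, whereas FC = 1.2 with the same p-value does not.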
Coefficients of variation, means, and Student's t-tests were calculated in Microsoft Excel 2016. The mean of the replicates was used for the following comparisons: (a) S/SEV OSCC_FREE vs. S/SEV OSCC_NLNM; (b) S/SEV OSCC_FREE vs. S/SEV OSCC_LNM; (c) S/SEV OSCC_NLNM vs. S/SEV OSCC_LNM. GraphPad Prism 9.00 for Windows was used (i) to apply the Benjamini-Yekutieli correction to the p-values (BY p-value) and (ii) to draw volcano plots in which the FC was log2-transformed, so that the data are centered on zero, and the BY-corrected p-value was −log10-transformed [57]. The expression-based heat map was generated with the freely available Heatmapper web server (http://www.heatmapper.ca, accessed on 22 September 2021), applying the following criteria: (a) clustering method: average linkage; (b) distance measurement method: Kendall's tau.
To identify the biological processes and functional pathways specifically correlated with the protein cargo of S/SEV OSCC_FREE, S/SEV OSCC_NLNM, and S/SEV OSCC_LNM, we used the bioinformatic tool ClueGO v2.5.2 + CluePedia v1.5.2, a plug-in for Cytoscape v3.8.0. This analysis allowed us to visualize the non-redundant gene ontology (GO) terms (within the term "biological processes") and functional pathways (searched in the Reactome pathway database) in organized networks reflecting the relations between the biological groups based on the similarity of their linked genes/proteins [59]. To compare the groups and highlight functional differences, the three groups of up-regulated proteins were uploaded in ClueGO as separate clusters using the Cytoscape environment [60]. For the enrichment of biological terms and functional groups, we used the two-sided (enrichment/depletion) test based on the hypergeometric distribution. We set the statistical significance threshold to 0.05 (p ≤ 0.05) and used the Benjamini-Hochberg adjustment to correct the p-values for the terms/groups visualized by ClueGO. We used fusion criteria to reduce the redundancy of terms shared by similar associated proteins. The parameters used were: kappa score threshold set to 0.4; GO tree interval: 3-8; GO Term Fusion: enabled.
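The two multiple-testing corrections mentioned above (Benjamini-Yekutieli for the protein comparisons, Benjamini-Hochberg for the ClueGO terms) and the volcano plot scaling can be illustrated with a minimal Python sketch. It implements the textbook step-up procedures and may differ in detail from the routines in GraphPad Prism or ClueGO; the function names are hypothetical.

```python
import math


def adjust_pvalues(pvals, method="BY"):
    """Return adjusted p-values using the BH or BY step-up procedure."""
    if method not in ("BH", "BY"):
        raise ValueError("method must be 'BH' or 'BY'")
    m = len(pvals)
    # BY multiplies by the harmonic sum c(m) = 1 + 1/2 + ... + 1/m; BH uses 1.
    c_m = sum(1.0 / i for i in range(1, m + 1)) if method == "BY" else 1.0
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that adjusted values
    # are monotone and never exceed 1.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted


def volcano_point(fc, adjusted_p):
    """Map a fold change and adjusted p-value to volcano plot coordinates."""
    return math.log2(fc), -math.log10(adjusted_p)
```

With this scaling, a protein with FC = 2 and an adjusted p-value of 0.01 is plotted at approximately (1, 2), and the FC cutoffs of 1.5 and 1/1.5 fall symmetrically around zero on the x-axis.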

Conclusions
In conclusion, our study provides new evidence of an S/SEV-based protein functional signature specifically associated with the absence of OSCC as well as with NLNM or LNM status, which may therefore have potential value as a source of novel predictive biomarkers for OSCC. A larger sample size and a validation phase based on targeted DIA strategies (such as selected reaction monitoring), immunoassays, and similar approaches will be necessary to validate the proposed S/SEV protein signature and its clinical value.