A Label-Free Proteomic Approach for the Identification of Biomarkers in the Exosome of Endometrial Cancer Serum

Simple Summary
Endometrial cancers (ECs) are mostly adenocarcinomas arising from the inner part of the uterus. The identification of serum biomarkers may be useful for an early diagnosis. This study compared the exosome serum proteins of 12 patients with EC with those of 12 non-cancer subjects, and identified 33 proteins with diagnostic potential. Quantification analysis in 36 patients with endometrial cancer compared to 36 healthy individuals confirmed the upregulation of APOA1, HBB, CA1, HBD, LPA, SAA4, PF4V1, and APOE. We developed a statistical model based on this set of proteins that detects cancer samples with excellent sensitivity and specificity levels, particularly for stage 1 ECs. In our opinion, the combined levels of PF4V1, CA1, HBD, and APOE have great potential to reach the clinical stage, after a validation phase.

Abstract
Endometrial cancers (ECs) are mostly adenocarcinomas arising from the inner part of the uterus. The identification of serum biomarkers, either soluble or carried in the exosome, may be useful in making an early diagnosis. We used label-free quantification mass spectrometry (LFQ-MS)-based proteomics to investigate the proteome of exosomes in the albumin-depleted serum from 12 patients with EC, as compared to 12 healthy controls. After quantification and statistical analysis, we found significant changes in the abundance (p < 0.05) of 33 proteins in EC vs. control samples, with a fold change of ≥1.5 or ≤0.6. Validation using Western blotting analysis in 36 patients with EC as compared to 36 healthy individuals confirmed the upregulation of APOA1, HBB, CA1, HBD, LPA, SAA4, PF4V1, and APOE. A multivariate logistic regression model based on the abundance of these proteins was able to separate the controls from the EC patients with excellent sensitivity levels, particularly for stage 1 ECs. The results show that using LFQ-MS to explore the specific proteome of serum exosomes allows for the identification of biomarkers in EC. These observations suggest that PF4V1, CA1, HBD, and APOE represent biomarkers that are able to reach the clinical stage, after a validation phase.


Introduction
Endometrial cancers (ECs) are tumors of the inner part of the uterus, deriving mostly from the glandular tissue. ECs account for 75-80% of all uterine cancers, and represent one of the most common gynecologic malignancies, affecting ∼3% of women [1]. The risk of EC development is associated with genetic predisposition (e.g., Lynch syndrome), racial background, age, obesity, metabolic syndrome, diabetes, polycystic ovary syndrome, and high estrogen exposure. Metabolic conditions related to lifestyle represent the most

For proteomic analysis and data validation, a total of 72 subjects (36 women with EC and 36 non-EC controls) were recruited at the Institute for Maternal and Child Health-IRCCS "Burlo Garofolo" (Trieste, Italy) between 2019 and 2021. All procedures complied with the Declaration of Helsinki, and were approved by the Institutional Review Board of IRCCS Burlo Garofolo and the Regional Ethics Committee (CEUR-2020-Os-030). All patients signed informed consent forms. The clinical and pathological characteristics of the patients are described in Supplementary Materials Table S1. The median age of the patients was 68 years (interquartile range, IQR: 57-75; Min = 48, Max = 88), while the median age of the controls was 38 years (IQR 28-54; Min = 23, Max = 78). Subjects who were positive for human immunodeficiency virus (HIV) or hepatitis B or C virus (HBV, HCV) were excluded from the study. Oncologic patients were excluded from the control group, as were subjects with benign tumors (myoma, leiomyoma), chronic inflammatory disease (adenomyosis), or viral infections, since these pathologies may affect the abundance of proteins in serum exosome analysis.
Cancers 2022, 14, 6262

Serum Sample Collection and EV Isolation
Serum was obtained by blood centrifugation at 5000 rcf for 5 min, and was collected and stored at −80 °C. To improve the proteomic analysis, 100 µL of crude serum was incubated for 5 min with the Albumin Depletion Kit (Thermo Fisher, Waltham, MA, USA). After column elution, EVs were isolated with the Total Exosome Isolation kit (Thermo Fisher Scientific), as reported in [20]. Then, 100 µL of depleted serum was mixed with 20 µL of reagent and incubated for 30 min at 4 °C. After incubation, the samples were centrifuged at 10,000 rcf for 10 min at room temperature, and resuspended in 30 µL of PBS. Sample characterization was performed as previously reported in [18]. Protein content was determined using Bradford reagent.

Exosome Digestion and MS Analysis
A total of 100 µg of depleted exosomes was digested with the EasyPep™ MS Sample Prep Kit (Thermo Fisher). After digestion, analysis was performed by nanoflow ultra-high-performance liquid chromatography high-resolution mass spectrometry (HRMS), using an Ultimate 3000 nanoLC (Thermo Fisher Scientific, Bremen, Germany) coupled to an Orbitrap Lumos Tribrid mass spectrometer (Thermo Fisher Scientific) equipped with a nanoelectrospray ion source (Thermo Fisher Scientific). A volume of 1 µL of digest was initially trapped on a PepMap trap column (Thermo Fisher) for 1.50 min at a flow rate of 30 µL/min, and the peptides were then loaded and separated on a C18 reversed-phase column (250 mm × 75 µm I.D., 2.6 µm, 100 Å, BioZen, Phenomenex, Bologna, Italy). The flow rate was set to 300 nL/min. The mobile phases were (A) 0.1% HCOOH in water (v/v) and (B) 0.1% HCOOH in ACN/water 80/20 (v/v). A linear 60 min gradient was performed. The samples were run in duplicate. HRMS analysis was performed in data-dependent acquisition (DDA) mode, with an MS1 range of 375-1500 m/z; HCD fragmentation was used with a normalized collision energy of 27. Resolution was set at 120,000 for MS1 and 15,000 for MS/MS. Singly charged and unassigned ions were excluded. The quadrupole isolation window was set to 3 Da. The maximum ion injection times for the MS (OT) and MS/MS (OT) scans were set to auto and 60 ms, respectively, and AGC values were set to standard. The dynamic exclusion was 30 s. For data processing, raw MS data were analyzed using Mascot Distiller 2.8 with the Mascot search engine. The MS/MS scans were matched against the human proteome (UniProt, 03/2022 version). The following parameters were used: trypsin as the enzyme, a maximum of one missed cleavage, and mass tolerances of 10 ppm and 0.6 Da for precursors and fragments, respectively. Carbamidomethylcysteine was set as a fixed modification, and methionine oxidation as a variable modification.
Proteins were considered identified with at least one unique peptide, at a false discovery rate threshold of <1%. Label-free quantification was performed with Mascot Distiller software, based on the Replicate protocol, which measures the relative abundance of a protein from sample to sample. This workflow relies on the relative intensities of high-resolution extracted ion chromatograms (XICs) for precursor ions in multiple data sets, aligned using mass and elution time. Relative quantitation was based on protein ratio calculation, which uses the median of the assigned peptide ratios. The minimum precursor charge was set to 2, and the minimum number of peptides was set to 2.
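The protein ratio rule described above (median of the assigned peptide ratios, with a minimum of two peptides) can be illustrated with a minimal sketch. This is not the Mascot Distiller implementation; the function name `protein_ratio` and the example values are hypothetical and serve only to show why a median is robust to a single aberrant peptide.

```python
from statistics import median


def protein_ratio(peptide_ratios, min_peptides=2):
    """Return the median of the peptide ratios for one protein.

    Returns None when fewer than `min_peptides` usable ratios are
    available, mirroring the "minimum number of peptides = 2" setting.
    """
    # Keep only positive, defined ratios (hypothetical cleaning step).
    usable = [r for r in peptide_ratios if r is not None and r > 0]
    if len(usable) < min_peptides:
        return None
    return median(usable)


# Hypothetical EC/control peptide ratios for a single protein group.
example_ratios = [3.8, 4.4, 4.1, 12.0]
print(protein_ratio(example_ratios))  # the outlier 12.0 barely moves the median
print(protein_ratio([2.0]))           # too few peptides, so no ratio is reported
```

With a mean instead of a median, the single outlying peptide in the example would pull the protein ratio well above the values of the other three peptides.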
SuperSignal West Pico Chemiluminescent Substrate was used to visualize the protein bands. The intensity of the immunostained bands was quantified and normalized to the total protein content, evaluated by Ponceau Red staining of the membrane from the same blot.

Bioinformatic Analysis
Proteins identified through MS were analyzed with gProfiler classification systems, and categorized according to their molecular function involvement, biological processes, and protein class. Pathway analysis was carried out using the Reactome tool. The biofunctions were generated via Ingenuity Pathway Analysis (IPA) [26]. Results from the IPA were considered statistically significant when p < 0.01. For the filter summary, we only considered associations where the confidence was high (predicted), or those that had been observed experimentally.

Statistical Analysis
Differences between patients and controls were considered significant when proteins showed a fold change of ≥1.5 or ≤0.6 and satisfied the Mann-Whitney rank sum test (p < 0.05). All analyses were conducted with Stata/IC 16.1 for Windows (StataCorp LP, College Station, TX, USA). Proteins with a fold change ≥ 3.5 and p < 0.05 were further validated. With this selected set of proteins, we elaborated a predictive model using a multivariate logistic regression approach. The dependent variable was EC patients vs. controls. The independent variables were the exosomal proteins selected as explained above. We adopted a step-down procedure: starting with all proteins, we excluded, one at a time, the protein with the highest p-value, as long as p ≥ 0.05. Consequently, the resulting model comprised only exosomal proteins that were significantly associated with the outcome. For this final model, we reported the coefficients of the predictive probability function, the Area under the Receiver Operating Characteristic (ROC) curve (AUC), and the levels of sensitivity and specificity. Finally, given that the results of the EC patients vs. controls model were not fully satisfactory, we elaborated two different models using the same multivariate logistic regression approach, but considered stage 1 EC patients vs. controls and advanced stage (2, 3, and 4) EC patients vs. controls separately.
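The selection rule above (a fold-change filter combined with a Mann-Whitney rank sum test) can be sketched in plain Python. The analyses in this study were run in Stata; the code below is only an illustrative re-implementation under stated assumptions: the fold change is taken as the ratio of group means, the p-value uses the tie-corrected normal approximation rather than an exact test, and the abundance values are invented for the example.

```python
import math
from statistics import mean


def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney p-value (normal approximation, tie-corrected)."""
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0          # mean of ranks i+1 .. j
        t = j - i                             # size of this tie group
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    var = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return 1.0
    z = (u - n1 * n2 / 2.0) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2))


def is_candidate(ec, ctrl, up=1.5, down=0.6, alpha=0.05):
    """Apply the fold-change and significance thresholds to one protein."""
    fold_change = mean(ec) / mean(ctrl)       # assumption: ratio of group means
    return (fold_change >= up or fold_change <= down) and mann_whitney_p(ec, ctrl) < alpha


# Hypothetical normalized abundances for one protein (12 EC, 12 controls).
ec = [4.1, 3.9, 5.2, 4.8, 6.0, 4.4, 5.5, 3.7, 4.9, 5.1, 4.6, 5.8]
ctrl = [1.0, 1.2, 0.9, 1.1, 1.3, 0.8, 1.0, 1.4, 1.1, 0.95, 1.05, 1.2]
print(is_candidate(ec, ctrl))
```

For groups as small as 12 vs. 12, statistical packages may report an exact p-value instead of the normal approximation, so values close to the 0.05 threshold could differ slightly from this sketch.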

Proteomic Analysis Reveals a Specific Exosomal Proteome in EC Patients
To characterize the exosomal proteome of EC patients' sera, we first isolated exosomes from the albumin-depleted sera of 12 stage 1 EC patients and 12 controls. To verify proper exosome isolation, we performed Western blotting (Figure S1) of the common exosome markers CD63 and CD9 (Figure 1).
Next, to investigate the whole proteomic profile of these ECs as compared to the controls, we subjected the obtained exosomes to LFQ nanoLC-MS/MS-based proteomic analysis. We identified 421 exosomal protein groups, with at least one unique peptide and FDR < 1%. After quantification and statistical analysis, we found that 33 proteins showed a significant alteration (p < 0.05) in their abundance in EC vs. control samples (Table 1). In detail, 31 proteins displayed increased levels in EC samples (fold change ≥ 1.5), while only two showed lower levels (fold change ≤ 0.6). The accession numbers, gene names, scores, and peptide numbers are listed in Supplementary Materials Table S1.

Western Blotting Validation of Eight Exosomal Proteins in EC Patients
To validate the data obtained from the proteomic analysis, we focused on those proteins that showed the strongest differences in abundance between the two groups. Thus, starting from the 33 identified proteins, we selected eight (namely APOA1, HBB, CA1, HBD, LPA, SAA4, PF4V1, and APOE) that showed a fold change ≥ 3.5 and a p-value < 0.05. The abundance of these proteins was then validated with Western blotting in 36 ECs versus 36 controls. For the validation, we analyzed exosomes derived from EC patients at different stages: 18 patients at stage 1 and 18 at advanced stages.
Western blotting (Figure S1) quantitative analysis confirmed the LC-MS/MS results, showing a higher serum abundance in EC patients than in controls, for all proteins. The higher abundance was significant for APOA1 (p = 0.04) […]
[Figure caption, beginning not recoverable: … and APOE in controls (C) and endometrial cancer (EC) patients. The intensity of immunostained bands was normalized against the total protein intensities measured from the same blot stained with Red Ponceau. The graph shows the relative abundance of the proteins in control and endometrial cancer exosomes. Results are shown as a histogram (p < 0.05), with each bar representing mean ± standard deviation.]
Since the validated proteins were mostly plasma proteins, we wondered whether these proteins are also expressed in endometrial cancer. To this end, we inspected the Protein Atlas database to verify the expression of their genes in different tissues, and we found that the mRNAs coding for all of these proteins are expressed in EC tissue. In addition, we performed WB analysis on protein lysates from three EC tissues to verify the expression of these proteins. As shown in Figure 3, APOA1, CA1, HBB, HBD, and LPA were each found as a single band, while SAA4 and PF4V1 were found as a double band. The molecular weight (MW) of these proteins in tissue corresponded to the apparent MW found through Western blotting of the serum exosomes. This experiment confirmed that all the proteins were detectable in the endometrial tumor, and that the proteoforms found in exosomes are probably secreted from tumor cells, suggesting that these could be bona fide EC biomarkers.

Statistical Results
The first step in developing a predictive model was to build a saturated multivariate logistic model with the outcome EC patients vs. controls. Results are reported in Table 2.

Following the step-down procedure, we obtained the model reported in Table 3. The latter yielded a pseudo R-squared = 0.277 and an AUC = 83.5% (95% CI 74.1-92.9%), reaching a sensitivity of 82.86% with a specificity of 71.43% (Figures 4 and 5). For the regression coefficients reported in Table 2, and the predicted probability cut points reported in Tables 3 and 4, the following model identified cases and controls with the specified sensitivity and specificity:
Predicted probability = 1/(1 + exp(−(−5.75242 + 0.7624538 × PF4V1 + 0.0467958 × APOE + 0.0738784 × HBD)))
Note: OR = odds ratio; CI = confidence interval; Coefficient = logistic regression model coefficient.
Considering that this result was not fully satisfactory, we hypothesized that we could obtain better results by separating EC patients into Stage 1 and Advanced Stages (2, 3 and 4), and comparing these two groups separately with the control group.
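For readers less familiar with the performance measures quoted above, the following minimal sketch shows how sensitivity, specificity, and AUC are derived from predicted probabilities. The probabilities, labels, and the 0.5 cut point are invented for illustration; they are not the study data, and the actual cut points are those given in the tables.

```python
def sensitivity_specificity(probs, labels, cutoff):
    """Classify as EC when probability >= cutoff; labels: 1 = EC, 0 = control."""
    pairs = list(zip(probs, labels))
    tp = sum(1 for p, y in pairs if y == 1 and p >= cutoff)
    fn = sum(1 for p, y in pairs if y == 1 and p < cutoff)
    tn = sum(1 for p, y in pairs if y == 0 and p < cutoff)
    fp = sum(1 for p, y in pairs if y == 0 and p >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(probs, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    cases = [p for p, y in zip(probs, labels) if y == 1]
    controls = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical predicted probabilities for 4 cases and 4 controls.
probs = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
print(sensitivity_specificity(probs, labels, cutoff=0.5))  # (0.75, 0.75)
print(auc(probs, labels))                                  # 0.9375
```

Moving the cut point trades sensitivity against specificity, whereas the AUC summarizes performance over all possible cut points, which is why both kinds of figures are reported for each model.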

Stage 1 EC Patients vs. Controls
We adopted the same approach as above. The results of the saturated multivariate logistic regression model with outcome Stage 1 EC patients vs. controls are reported in Table 5. Following the step-down procedure, we obtained the model reported in Table 6, which yielded a pseudo R-squared = 0.717, an AUC = 98.0% (95% CI 95.0-100%), reaching a sensitivity of 100% with a specificity of 86.11% (Figures 6 and 7). For the regression coefficients reported in Table 6 and the predicted probability cut points reported in Table 7, the following model identified cases and controls with the specified sensitivity and specificity:
Predicted probability = 1/(1 + exp(−(−15.43866 + 1.689701 × PF4V1 + 0.1522074 × CA1 + 0.1571125 × HBD)))
Note: OR = odds ratio; CI = confidence interval; Coefficient = logistic regression model coefficient.

Advanced Stage EC Patients vs. Controls
The results of the saturated multivariate logistic regression model with the outcome Advanced Stage EC patients vs. controls are reported in Table 8. Following the step-down procedure, we obtained the model reported in Table 9, with only one independent variable (APOE) remaining. The model yielded a pseudo R-squared = 0.201, and an AUC = 80.9% (95% CI 69.5-92.4%). It reached a sensitivity of 83.33% for a specificity of 68.57% (Figures 8 and 9). For the regression coefficients reported in Table 9, and the predicted probability cut points reported in Table 10, the following model identified cases and controls with the specified sensitivity and specificity:
Predicted probability = 1/(1 + exp(−(−2.929284 + 0.0540449 × APOE)))
Note: OR = odds ratio; CI = confidence interval; Coefficient = logistic regression model coefficient.
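The three predicted probability functions reported in this section share the same logistic form and can be evaluated with a few lines of code. The sketch below uses the coefficients exactly as printed in the text; the dictionary layout, the function name `predicted_probability`, and the input abundances are hypothetical. Inputs would have to be expressed in the same units as the normalized Western blotting values used to fit the models, and no cut point is applied here because the cut points are given only in the tables.

```python
import math

# Coefficients as printed in the text for the three reduced models.
MODELS = {
    "all_stages": {"intercept": -5.75242, "PF4V1": 0.7624538,
                   "APOE": 0.0467958, "HBD": 0.0738784},
    "stage_1": {"intercept": -15.43866, "PF4V1": 1.689701,
                "CA1": 0.1522074, "HBD": 0.1571125},
    "advanced": {"intercept": -2.929284, "APOE": 0.0540449},
}


def predicted_probability(model_name, abundances):
    """Evaluate 1 / (1 + exp(-(b0 + sum(b_i * x_i)))) for the chosen model."""
    coefs = MODELS[model_name]
    linear = coefs["intercept"] + sum(
        beta * abundances[protein]
        for protein, beta in coefs.items()
        if protein != "intercept"
    )
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical abundances for one serum exosome sample (illustrative only).
sample = {"PF4V1": 5.0, "CA1": 30.0, "HBD": 20.0, "APOE": 40.0}
for name in MODELS:
    print(name, round(predicted_probability(name, sample), 3))
```

Because all coefficients are positive, a higher abundance of any retained protein increases the predicted probability of EC in each model.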

Bioinformatic Analysis
For proteomic enrichment analysis, we used the gProfiler tool (Figure 10), which classified the proteins into groups according to their molecular function, biological processes, and protein class. Regarding biological processes, the proteins were categorized into reverse cholesterol transport, cholesterol transport, cholesterol efflux, defense response, sterol transport, plasma lipoprotein particle remodeling, and protein-lipid complex remodeling. Regarding molecular function, the proteins were classified into the following categories: cholesterol transfer activity, sterol transfer activity, phosphatidylcholine-sterol O-acyltransferase activator activity, lipoprotein particle receptor binding, antioxidant activity, sterol transporter activity, and lipid transfer activity. In addition, considering the cell compartment, the proteins were assigned to blood microparticle, extracellular space, extracellular region, plasma lipoprotein particle, lipoprotein particle, protein-lipid complex, extracellular exosome, and extracellular vesicle. Looking at pathways, the Reactome tool analysis indicated that these proteins pertained to the following pathways: plasma lipoprotein assembly; chylomicron assembly; chylomicron remodeling; plasma lipoprotein remodeling; hemostasis; plasma lipoprotein assembly, remodeling, and clearance; and platelet degranulation.

Discussion
Cancer biomarkers help to characterize tumor alterations, and are frequently used for the diagnosis and prognosis of the disease, and for the selection of a personalized treatment [27]. Exosomes are microvesicles that, once released, play key roles in tumor growth and invasion. The proteomic characterization of patients' exosomes is still challenging, but may represent a key step in the discovery of new potential biomarkers, particularly at an early stage [28]. LC-MS/MS has been successfully used to identify the proteomic profile of exosomes and for biomarker identification, as in prostate cancer [29], bladder cancer [30], and ovarian cancer serum [31]. In EC, thanks to proteomic approaches (such as two-dimensional electrophoresis, protein arrays, and mass spectrometry), hundreds of proteins have been reported as potential biomarkers in cancer tissue, blood and its derivatives, and other body fluids [17,18]. Nevertheless, none of them has reached the clinical stage, probably due to a lack of tissue specificity, and also because most of them are involved in broad processes, including metabolic pathways, inflammatory responses, cell adhesion, and hormone-related processes. To find specific non-invasive biomarkers, there is growing interest in exploring exosome-enriched proteins.
Interestingly, Song et al. recently found that plasma exosomes from EC patients are enriched in LGALS3BP, a protein that also promotes endometrial cancer progression.
In this study, we used a larger cohort of EC patients to identify novel exosomal biomarkers, and investigated the exosome proteome from albumin-depleted serum, using LFQ-based proteomics followed by validation with Western blotting analysis.
Starting with the proteomic profile, we obtained 440 proteins that were further selected based on the strongest differences in abundance; then, we applied a multivariate logistic regression analysis for all EC stages (1, 2, 3, and 4). With this analysis, we found that PF4V1, APOE, and HBD allowed us to separate cases from controls with an AUC = 83.5% in 36 EC patients as compared to 36 healthy individuals.
Considering that this result was not fully satisfactory, and attempting to find early EC biomarkers, we decided to separate patients into stage 1 (often asymptomatic patients) and advanced stages (2, 3, and 4).
A multivariate logistic regression model for stage 1 (18 patients) based on PF4V1, CA1, and HBD allowed us to separate cases from controls with an AUC = 98.0%. The last multivariate logistic regression model compared patients with advanced-stage EC (stages 2, 3, or 4; 18 patients) with controls; based on APOE abundance alone, it separated cases from controls with an AUC = 80.9%. The best and only fully satisfactory model was the one that considered stage 1 patients. It is noteworthy that the proteins that discriminate stage 1 ECs well do not satisfactorily discriminate all stages of EC together, nor the more advanced stages. The lower performance of these proteins in the more advanced stages of the disease is interesting, and requires further investigation.
Reactome analysis revealed that EC exosome proteins are involved in dysregulation of plasma lipoprotein assembly and remodeling, hemostasis, and platelet degranulation pathways that may be involved in cancer development.
CA1 is a member of the carbonic anhydrase (CA) family, and its overexpression in osteosarcoma cells leads to calcification with ascorbic acid [32]. In a proteomic study, Wang et al. identified this protein in stage I non-small cell lung cancer and validated its overexpression via Western blotting, suggesting that it is a promising early biomarker for non-small cell lung cancer.
HBB is involved in oxygen transport from the lung to several peripheral tissues [33]. Expression of HBB in lung cancer cells and breast cancer cells is associated with ROS cytotoxicity suppression, leading to cancer cell survival and spread [34]. PF4V1 suppresses chemokine angiogenesis by blocking the protein bFGF, and is closely associated with the growth and metastasis of various cancers [35]. It is known that PF4V1 in prostate cancer leads to suppression of proliferation and invasion, and serves as a potential prognostic biomarker [36].
APOE is a protein associated with lipid particles that mainly functions in lipoprotein-mediated lipid transport between organs via the plasma and interstitial fluid [37]. Studies on several tumors, including glioblastoma [38], EC [39], lung cancer [40], and prostate cancer [41], showed that when APOE is overexpressed, the disease is more aggressive, and the prognosis is poor [42].
Of note, although these proteins are commonly found in plasma upon liver secretion, in this study we found that they are detectable in endometrial tumor tissue; thus, they may represent bona fide EC biomarkers when secreted into exosomes by tumor cells.
Interestingly, through Western blotting we noticed that CA1, PF4V1, and APOE displayed higher molecular weights than the canonical ones. This may be due to the glycosylation of proteins in exosomes, as previously described in the literature [43]. Whether glycosylation affects protein function or exosome distribution is still unknown, and is of great interest.
Importantly, in this study we provided a proteomic profile of EC serum exosomes that suggests new promising non-invasive biomarkers, although we are aware of some limitations. These proteins have not yielded satisfactory results for all stages together, or for the more advanced stages of EC. However, in the discovery phase, we identified proteins that discriminated well between controls and stage 1 patients. This suggests that for advanced stages, separate studies are needed, as biomarkers that work well for stage 1 are not necessarily efficient for advanced stages or all stages together. Moreover, we are aware that broader studies are needed to validate the role of the identified proteins in larger cohorts of patients. Another weakness of our study is that patients of the discovery phase were also included in the validation phase, which, however, was performed in a larger cohort that included 24 patients that did not overlap with the previous ones. Lastly, since levels of some proteins may change with age, we think that validating these results with age-matched cases and controls should be the next step to ascertain the potential of these proteins as biomarkers.

Conclusions
In our opinion, our proteomic data may expand the knowledge of the protein composition of EC serum exosomes, and may contribute to the discovery of new EC biomarkers. Notably, the proteins that we validated were able to discriminate stage 1 EC patients from controls, but failed to satisfactorily identify EC patients at more advanced stages. Further studies with larger cohorts of patients, with cases and controls matched by age, are necessary to develop new algorithms that are able to discriminate patients at more advanced stages. LC-MS/MS represents a powerful technology to explore the serum exosome, allowing for the identification of candidate biomarkers in EC.

Institutional Review Board Statement:
The study was conducted according to the guidelines of the Declaration of Helsinki, and was approved by the Institutional Review Board of IRCCS Burlo Garofolo and the Regional Ethics Committee (protocol code RC18/19 approved in 2019 and CEUR-2020-Os-030).

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to ethical reasons.