1. Introduction
Extracellular vesicles (EVs) encompass a diverse group of lipid-bound nanoparticles, 50–1000 nm in diameter, that are released by most cell types in normal as well as diseased states [
1,
2]. Exosomes are a distinct class of small extracellular vesicles (sEVs), approximately 50–150 nm in diameter, that originate and traffic specifically via the cellular endosomal sorting apparatus [
3]. Large extracellular vesicles (lEVs) include vesicle types commonly referred to as microvesicles, oncosomes and apoptotic bodies. Cells also release non-vesicle lipid structures including lipoproteins and other lipid-protein complexes and varied complexes of proteins, nucleic acids and other biomolecules. These vesicle and non-vesicle entities are actively being categorized and investigated and are currently of considerable research interest [
4,
5].
Emerging evidence indicates that EVs function in cell-to-cell signaling, communication, transfer and exchange in both local and systemic environments [
6,
7]. The molecular cargoes of EVs have been shown to promote cancer progression, invasion and metastasis, remodeling of the tumor microenvironment and angiogenesis [
8,
9]. We previously demonstrated that EVs disseminated from pancreatic cancer cells communicate a considerable repertoire of tumor antigens that are associated with host humoral immune response and circulating anti-tumor autoantibodies in pancreatic cancer patient plasmas [
10].
EVs have been detected in blood, urine, fine needle aspirates, saliva, cerebrospinal fluid and ascites and are considered to convey signature features of the cells from which they originate [
11]. Clinical assessment of EV contents from patient-derived biofluids offers promise of minimally invasive means for early detection of cancer and dynamically informing clinical decision-making. In addition, the assessment of EV constituents may yield improvements in assay classification performance, as relevant molecules could conceivably be enriched in tumor-derived EVs compared to whole plasmas [
12]. Accordingly, there is high interest in developing liquid biopsy assay potential of EVs as markers for early detection of disease, especially in the field of cancer [
13,
14,
15,
16]. EV-associated cancer marker types investigated include oncogenic micro- and mRNAs, mutated DNAs and tumor proteins [
17,
18,
19].
Primary effort towards realizing liquid biopsy utility for EV cargoes has, to date, largely focused on EV-associated microRNAs. In lung cancer, a number of groups have reported diagnostic, prognostic and predictive microRNA markers for various lung cancer subtypes [
20]. Exploration of EV-associated proteins in biofluids for lung cancer biomarker potential has been more limited and generally targeted or array based [
20,
21]. We recently demonstrated a feasible approach for exploration of plasma-derived EV protein signatures for detection of lung adenocarcinoma via untargeted mass spectrometry profiling of plasma-derived EVs from patients and matched cancer free controls [
22].
Here, using a combination of mass-spectrometry and aptamer array-based proteomics [
22,
23], we further these findings and explore protein features conveyed by circulating EVs in the context of lung and pancreatic ductal adenocarcinomas. We profile sEVs isolated from conditioned media of a panel of lung and pancreatic adenocarcinoma cell lines and establish characteristic sEV-associated protein signatures of LUAD and PDAC. We then perform comparative profiling of plasma-derived sEVs and paired whole plasmas in the context of lung and pancreatic ductal adenocarcinomas to identify specific protein features with high performance in the circulating sEV compartment.
3. Discussion
Through comprehensive proteomic profiling of extracellular vesicles derived from cell line conditioned media and patient plasmas, we have identified protein signatures of lung and pancreatic ductal adenocarcinomas that are conveyed by cancer cell-disseminated extracellular vesicles. These protein features provide circulating evidence of wide-ranging oncogenic reprogramming and functional alteration in cancer cells. Aptamer array-based profiling of sEVs enriched from LUAD patient plasmas enabled enumeration of 37 protein features with AUC ≥ 0.7 case:control classifier performance; 34 (92%) of these indicated improved performance in the sEV compartment compared to intact plasma. Correspondingly, profiling of sEVs enriched from PDAC patient plasmas revealed 446 protein features with AUC ≥ 0.7; 413 (93%) of these showed improved performance in the sEV compartment. The larger number of high classifier performance plasma sEV-associated features identified in the PDAC cohort likely derives from the advanced disease stage of these patients, consistent with the typical clinical encounter of pancreatic cancer.
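The compartment comparison summarized above amounts to a simple tally over per-feature AUC values. The following is a minimal sketch in Python (used here for illustration only; the study's analyses were run in R). The function name, the dictionary layout and the toy AUC values are hypothetical and are not taken from the study data.

```python
def compartment_comparison(auc_sev, auc_plasma, threshold=0.7):
    """Count features with sEV AUC >= threshold and how many of them
    score higher in sEVs than in paired whole plasma.

    auc_sev / auc_plasma: dicts mapping feature name -> AUC (hypothetical layout).
    Returns (n_high, n_improved, percent_improved).
    """
    high = [f for f, auc in auc_sev.items() if auc >= threshold]
    improved = [f for f in high if auc_sev[f] > auc_plasma[f]]
    percent = 100.0 * len(improved) / len(high) if high else 0.0
    return len(high), len(improved), percent


# Toy values for illustration only (not study data).
toy_sev = {"feat_1": 0.82, "feat_2": 0.71, "feat_3": 0.55}
toy_plasma = {"feat_1": 0.60, "feat_2": 0.75, "feat_3": 0.50}
n_high, n_improved, pct = compartment_comparison(toy_sev, toy_plasma)
```

The same arithmetic reproduces the reported proportions: 34 of 37 is 91.9% and 413 of 446 is 92.6%, which round to the stated 92% and 93%.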
We note that a novel component of this study in relation to other studies, which have similarly evaluated utility of EV-associated protein cargo for early detection or prognostication [
29,
30], is the direct comparison of protein features in plasma-derived EVs versus unfractionated plasma from the same individual for utility in distinguishing case from control. This approach enables us to address a fundamental question and conclude that there are indeed EV-associated features that exhibit improved classifier performance specifically in the plasma sEV compartment.
Interestingly, the plasma sEV-associated proteins (Figure 3B,C) identified in plasma with AUC ≥ 0.7 case:control classifier performance in both the LUAD and PDAC cohorts denote a pan-adenocarcinoma sEV signature and suggest intriguing implications regarding markers that yield insight into cancer-host interchange within the tumor microenvironment. For example, PDGF (Platelet-derived growth factor subunit A) has been shown to be involved in recruitment and activation of cancer-associated fibroblasts in lung adenocarcinoma [
31] and to also exert a similar effect on pancreatic stellate cells and fibroblasts in PDAC tumors [
32,
33]. Tumor derived VEGFC (Vascular endothelial growth factor C) has been demonstrated to support metastatic expansion of primary tumors into the lymphatic vasculature in many tumor types, challenging the traditional concept of a purely passive role of the lymphatic vessels in cancer metastasis [
34,
35,
36]. PSMD7 (26S proteasome non-ATPase regulatory subunit 7) has been reported as overexpressed in a variety of tumors and is hypothesized to be involved in maintenance of proteasome function in cancer cells through its roles in mediating endosomal trafficking and protein recycling [
37,
38]. This is especially intriguing in the context of our previous discovery of another member of the ubiquitin-proteasome pathway [
39], HUWE1, as a plasma EV-associated marker of lung adenocarcinoma with good performance for distinguishing cases from controls [
22]. NID2 (Nidogen-2) is a basement membrane protein primarily produced by mesenchymal cells [
40] that has been linked to mesenchymal/de-differentiated phenotypes in breast cancer and melanoma [
41]. Elevated serum NID2 has been reported in cases of ovarian and esophageal cancers [
42]. In esophageal, lung, bladder and oral cancers, NID2 methylation and reduced tissue expression are observed [
43], suggesting biomarker context of NID2 may diverge in tumor tissue and EV compartments. SFRP1 (Secreted frizzled-related protein 1) is a glycoprotein modulator of Wnt-signaling that has been observed to play both tumor suppressor and oncogenic roles in a number of human cancers and to also be subject to epigenetic regulation via DNA methylation or microRNA transcriptional silencing [
44]. A recent study demonstrated functional relevance of exosome-associated SFRP family proteins and that human lung cancer cells take up SFRP-containing exosomes, whereby SFRPs co-localize with β-catenin in both the cytoplasm and nucleus to mediate β-catenin/TCF-mediated transcription of Wnt target genes known to affect cancer stem cell properties [
45]. B2M (Beta-2-microglobulin) is a component of the class I major histocompatibility complex (MHC-I) and is a crucial factor required for MHC-I assembly and maintenance of stable surface presentation of antigens to immune system effectors [
46]. We previously found MHCs and presentation pathway proteins to be enriched on the surface of PDAC cell derived EVs [
10] and there are multiple lines of evidence that tumor cells employ EVs to transfer MHCs loaded with tumor-derived peptides to antigen presenting cells, thereby activating anti-tumor CD8 T-cell response [
47]. Interestingly, inactivation of B2M in cancer cells has been associated with downregulation of the MHC-I complex, abnormal immune surveillance that contributes to cancer development and attenuated responses to anti-PD-1/anti-PD-L1 immunotherapies [
48], highlighting the potential for cancer EV cargoes to report on tumor immune status. Notably, serum-derived EV-associated LBP (Lipopolysaccharide-binding protein) has previously been shown to be elevated in PDAC cases as compared to healthy controls. Consistent with this study, we also observed that LBP is elevated in EVs of PDAC cases as compared to controls, thus providing independent validation [
29].
The cell line and plasma sEV-associated proteins uncovered in this study are also known effectors of cancer-associated mechanistic regulators including transcription factors TP53 and MYC; signal transducer KRAS; cell growth and proliferation cytokines TGFB1, HGF, VEGF and EGF; and proinflammatory and/or immunoregulatory cytokines TNF, IL1B, IL6 and IFNG, all with established, fundamental roles in cancer etiology. In addition, evaluation of the identified plasma sEV-associated proteins regarding subcellular compartment of origin revealed prominent representation, interestingly, of proteins manifest in the adenocarcinoma cancer cell line and patient plasma-derived sEVs annotated to nuclear and mitochondrial compartments, in addition to the expected extracellular and plasma membrane cellular compartments. This is indicative of sEV conveyance of multifaceted signatures of cancer and broad utility for interrogation of cancer intercellular dynamics.
Exploration of cancer associated extracellular vesicles has revealed diverse repertoires of proteins, nucleic acids, lipids and metabolites that reflect the altered molecular expression, sorting, trafficking and fates that underlie cancer pathophysiology [
4]. The promise of liquid biopsy interrogation of EV multidimensional cargoes for real time reporting of tumor dynamics would redefine clinical management of cancer [
49]. This is especially relevant in the emerging era of anti-cancer immunotherapy in which understanding cancer immune-infiltrate interaction and varying tumor phenotype would guide personalized therapies and predict treatment outcomes [
10].
As clinical oncology is increasingly guided by molecular insight and deployment of targeted precision therapies, approaches for improved characterization of tumor molecular features are ever more essential. Liquid biopsy based on assay of tumor-derived information disseminated into the peripheral blood or other biofluids complements traditional tissue biopsy and offers advantages of being less invasive, allowing for serial interrogation and potential for overcoming sampling error by integrating information across cancer cell subpopulations and immune and stromal infiltrates within the tumor milieu. Liquid biopsy research efforts to date have concentrated primarily on circulating tumor cells (CTCs), cell-free DNA (cfDNA), circulating tumor DNA (ctDNA) and cell free tumor RNA, with EVs being of more recent interest [
50]. It is likely that no single mode of deriving tumor molecular information will be sufficient to arrive at an optimal biomarker panel for the desired clinical application. Therefore, it is essential to evaluate the strengths and complementary contributions of various approaches to liquid biopsy [
49,
51].
The feasibility of using ctDNA for tracking and monitoring tumor dynamics, drug response and therapy resistance has been demonstrated through numerous studies [
52]. Nevertheless, practical challenges remain as cfDNA is generally highly fragmented and the total amount of ctDNA can be as low as 0.01% of the total cfDNA in circulation [
52]. Strategies to improve this by enriching for ctDNA associated with circulating tumor-derived EVs have been developed [
18]. While ctDNA can yield cancer genotype profiles, it offers limited utility for discerning expression level dynamics that underlie phenotypic plasticity and give rise to tumor stemness, progression or therapeutic escape. The potential of circulating tumor cells in liquid biopsy has been widely investigated in scientific and clinical studies [
50]. CTCs provide direct tumor information including real-time phenotypic profiles that can inform treatment selection and monitor response. CTCs are, however, exceedingly rare in blood samples and are outnumbered by normal leukocytes by 10⁶-fold or more. A clinical blood sample may contain as few as 5–10 CTCs and these cells may be heterogeneous, necessitating that analyses be performed at the single-cell level [
50].
EVs provide an additional source of material for liquid biopsy based profiling of tumors. Similar to CTCs, their molecular cargoes can report tumor genotype and phenotype. Mechanistically, their constitutive expression and physical features seem to yield relatively straightforward conveyance of abundant tumor EVs into blood and biofluids compared to the relatively complex process of CTC intravasation into the peripheral circulation. This makes tumor EVs and their associated cargoes an appealing target for liquid biopsy. Nevertheless, challenges remain, for example in identifying surface protein profiles that would enable capture and enrichment of tumor-derived EVs from uninformative, circulating background populations. Herein, we have identified cancer-associated EV-protein features that may offer utility for early interception of disease. Our findings provide a basis for inclusion of identified EV-resident protein targets in validation studies that encompass other biomarker candidates [
51,
53,
54].
4. Materials and Methods
4.1. Blood Samples
Whole blood samples were collected at MD Anderson Cancer Center (MDACC) through informed consent following institutional review board (IRB) approval (PA14-0552). Healthy control samples were obtained from volunteers in the clinic waiting rooms, who were for the most part relatives of the patients. Plasma was prepared from EDTA-treated whole blood by two successive room temperature centrifugation steps for 12 min at 1200× g, without braking, and subsequently stored at −80 °C until use.
4.2. Exosome Isolation
Cell line sEVs were purified from cell-conditioned media obtained from LUAD and PDAC cell lines (NCI-H23, NCI-H647, NCI-H1573, HCC4019, CFPAC-1, HPAF-II, SU.86.86, Panc 03.27, MIA PaCa-2 and PANC-1) by differential ultracentrifugation after the basic protocol of Théry, et al. [
24]. Briefly, cells were cultured for 48 h in serum-free media, and the conditioned media were subjected to sequential centrifugation steps of 800× g and 2000× g. The resulting supernatant was next filtered through a 0.22 µm PES vacuum filter (Corning) and concentrated ~50-fold by tangential flow filtration (TFF) using a 100 kDa MWCO membrane (Vivaflow 200R, Sartorius AG, Göttingen, Germany). The concentrated TFF retentate was then subjected to ultracentrifugation at 100,000× g for 2 h in a 45Ti fixed-angle rotor (Beckman Coulter Inc., Brea, CA, USA). The supernatant was removed and PBS added to the pellet for an overnight washing step, again at 100,000× g. The resultant sEV pellet was resuspended in PBS and harvested for further analyses.
For the isolation of plasma sEVs we applied a density gradient flotation approach. Cell debris and large EVs were depleted by centrifugation at 2000× g for 20 min followed by 16,500× g for 30 min; the resulting supernatant was additionally filtered through a pre-wetted 0.22 µm vacuum filter (Steriflip SCGP00525, MilliporeSigma, Burlington, MA, USA). The lEV-depleted plasma was mixed with OptiPrep iodixanol solution (MilliporeSigma, St. Louis, MO, USA) to a final density of 1.18 g/mL. This was loaded into the bottom of a polycarbonate ultracentrifuge tube (Seton Scientific Inc., Petaluma, CA, USA) and overlaid with 2–3 mL of 1.14 g/mL iodixanol/PBS (phosphate buffered saline) solution to form a single-step density fractionation gradient. Ultracentrifugation was performed at 100,000× g for 16 h at 8 °C. Vesicles were collected in a single fraction from the top of the tube, proceeding downward to recover 0.6 mL of overlaid gradient volume.
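As a rough illustration of the density adjustment step, the volume of iodixanol stock needed to bring a sample to a target density can be estimated from a two-component mass balance. This is only a sketch under stated assumptions: the protocol does not give starting densities, so the plasma density (1.025 g/mL) and the OptiPrep stock density (1.320 g/mL) used as defaults are assumed values, volumes are treated as additive, and the function name is hypothetical. Python is used for illustration.

```python
def stock_volume_for_density(sample_ml, target_density,
                             sample_density=1.025, stock_density=1.320):
    """Volume (mL) of dense stock to add to `sample_ml` of sample so that the
    mixture reaches `target_density` (g/mL), assuming volumes are additive.

    Default densities are assumptions, not values stated in the protocol.
    """
    if not sample_density < target_density < stock_density:
        raise ValueError("target density must lie between sample and stock densities")
    # Mass balance: (Vs*rho_s + Vx*rho_x) / (Vs + Vx) = rho_target, solved for Vx.
    return sample_ml * (target_density - sample_density) / (stock_density - target_density)


# Example: stock volume for 1.0 mL of lEV-depleted plasma at 1.18 g/mL.
stock_ml = stock_volume_for_density(1.0, 1.18)
```

In practice the density of the final mixture would normally be confirmed empirically (for example by refractometry or weighing a known volume), since real mixtures deviate from ideal additivity.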
4.3. Transmission Electron Microscopy
Extracellular vesicle aliquots were fixed in 2% paraformaldehyde; 5 µL of EV suspension was then applied to each formvar/carbon-coated 200 mesh nickel grid and allowed to adsorb for 20 min. The grids were washed twice on 100 µL drops of PBS, followed by eight washes with deionized water. Uranyl acetate (2%) was used as a counterstain; after 1 min of staining, excess uranyl acetate was blotted from the grid edge with Whatman No. 1 filter paper and the grids were air-dried. EM grids and reagents were from EMS (Hatfield, PA, USA). Imaging was performed using a JEM1010 TEM (JEOL, Peabody, MA, USA) at an accelerating voltage of 80 kV. Digital images were taken with AMT imaging software (Advanced Microscopy Techniques Corp, Danvers, MA, USA).
4.4. Particle Size Distribution and Quantification
EV yields were quantified via Brownian diffusion size analyses using ZetaView Nanoparticle-tracking analysis (NTA) instrumentation (Particle Metrix, Meerbusch, Germany). Sample aliquots were diluted 10²–10⁶-fold to achieve optimal concentration for analysis; 1.0 mL of diluted sample was used for each analysis. Light scattering of individual particles in solution was digitally recorded, particle trajectory and displacement were automatically analyzed by image analysis tracking software and the particle-size distribution was determined from the observed Brownian motion of individual particles according to the Stokes-Einstein relationship.
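The Stokes-Einstein relationship referred to above links a particle's diffusion coefficient D to its hydrodynamic diameter d through d = k_B·T / (3·π·η·D). The sketch below shows the generic calculation in Python, not the instrument vendor's algorithm, which is not described in the text. The temperature (298.15 K) and viscosity (8.9 × 10⁻⁴ Pa·s, approximately water at 25 °C) defaults are assumptions, and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def diffusion_coefficient_2d(track, dt_s):
    """Estimate D (m^2/s) from a 2-D track of (x, y) positions in metres,
    sampled every dt_s seconds, using MSD = 4 * D * dt for 2-D Brownian motion."""
    steps = list(zip(track, track[1:]))
    if not steps:
        raise ValueError("track needs at least two positions")
    msd = sum((x2 - x1) ** 2 + (y2 - y1) ** 2
              for (x1, y1), (x2, y2) in steps) / len(steps)
    return msd / (4.0 * dt_s)


def hydrodynamic_diameter_nm(d_m2_s, temp_k=298.15, viscosity_pa_s=8.9e-4):
    """Stokes-Einstein: d = k_B * T / (3 * pi * eta * D), returned in nm.
    Default temperature and viscosity are assumed values (water, 25 degrees C)."""
    return K_B * temp_k / (3.0 * math.pi * viscosity_pa_s * d_m2_s) * 1e9
```

Under these assumed conditions a 100 nm particle diffuses at roughly 4.9 × 10⁻¹² m²/s, so smaller vesicles show correspondingly larger frame-to-frame displacements.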
4.5. Protein Quantitation, SDS-PAGE and Western Blot Assay
Protein quantification was performed using Pierce BCA (bicinchoninic acid) protein assay (Thermo Fisher Scientific, Waltham, MA, USA) according to the manufacturer’s recommended microplate assay procedure. Absorbance was measured with a SpectraMax M5 multi-mode microplate reader using SoftMax Pro data acquisition and analysis software (Molecular Devices, San Jose, CA, USA). Vesicle isolates were denatured in 4× Laemmli sample buffer at 100 °C for 10 min. Proteins were separated using 4–15% sodium dodecyl sulfate polyacrylamide gel electrophoresis in Tris/Glycine/SDS running buffer and transferred to Immun-Blot PVDF (polyvinylidene difluoride) membrane (all reagents and supplies from Bio-Rad, Hercules, CA, USA). Immunoblotting was performed with the following primary antibodies: CD9 (EXOAB-CD9A-1, System Biosciences, Palo Alto, CA, USA); TSG101 (EXOAB-TSG101-1, System Biosciences). Blots were washed and incubated with appropriate HRP-conjugated secondary antibodies (Amersham ECL, Cytiva, Marlborough, MA, USA) and detected using Pierce ECL western blotting substrate (Thermo Fisher Scientific) with chemiluminescence optimized autoradiography film.
4.6. Aptamer Array
The SOMAscan assay is arrayed from SOMAmer (slow off-rate modified DNA aptamer) reagents with affinity optimized for individual protein targets. Unfractionated plasma (150 µL) was prepared for SOMAscan assay using SomaLogic Plasma Diluent and Assay Buffer. Aliquots of plasma-derived EVs from each sample were normalized to 180 µL plasma input equivalent and prepared for SOMAscan assay by dilution in solubilization buffer containing 120 mM NaCl, 5 mM KCl, 5 mM MgCl2, 40 mM HEPES pH 7.5, 0.05% Tween20, 1% DDM (w/v) and 0.5% sodium deoxycholate (w/v), incubation for 15 min with mild agitation and centrifugation at 14,000× g for 5 min. The final sample volume was standardized to 120 µL. SOMAscan assays were performed by SomaLogic. Briefly, samples were incubated with pools of SOMAmer reagents for equilibration binding followed by two sequential bead-based immobilization and washing steps to eliminate unbound and non-specifically bound proteins and unbound SOMAmer reagents. The remaining SOMAmer reagents were isolated and quantified simultaneously on a custom Agilent hybridization array such that the measured quantity of each SOMAmer is proportional to the corresponding target protein concentration in the original sample.
4.7. Mass Spectrometry Analysis
For cell line derived EVs, protein digestion and identification by LC-MS/MS (Liquid Chromatography-tandem Mass Spectrometry) was performed based on our established protocol [
55]. Intact protein separation was performed using a UPLC (Ultra Performance Liquid Chromatography) system (Waters Corporation, Milford, MA, USA) with reversed-phase column 4.6 mm × 150 mm (Column Technology Inc., Fremont, CA, USA); eluted proteins were subjected to in-solution trypsin digestion followed by LC-MS using a NanoAcquity UPLC system equipped with a Waters Symmetry C18 nanoAcquity trap-column (180 µm × 20 mm, 5 μm) and a C18 analytical column (75 µm × 150 mm, 1.8 µm, Column Technology Inc.) coupled in-line with a Waters SYNAPT G2-Si mass spectrometer. LC-HDMSE (Liquid Chromatography-data independent High Definition Mass Spectrometry) data were acquired in resolution mode with SYNAPT G2-Si using Waters Masslynx (version 4.1, SCN 851). The capillary voltage was set to 2.80 kV, sampling cone voltage to 30 V, source offset to 30 V and source temperature to 100 °C. Mobility utilized high-purity N2 as the drift gas in the IMS TriWave cell. Pressures in the helium cell, Trap cell, IMS TriWave cell and Transfer cell were 4.50 mbar, 2.47 × 10⁻² mbar, 2.90 mbar and 2.53 × 10⁻³ mbar, respectively. IMS wave velocity was 600 m/s, helium cell DC 50 V, Trap DC bias 45 V, IMS TriWave DC bias V and IMS wave delay 1000 µs. The mass spectrometer was operated in V-mode with a typical resolving power of at least 20,000. All analyses were performed using positive mode ESI using a NanoLockSpray source. The lock mass channel was sampled every 60 s. The mass spectrometer was calibrated with a [Glu1] fibrinopeptide solution (300 fmol/µL) delivered through the reference sprayer of the NanoLockSpray source. Accurate mass LC-HDMSE data were collected in an alternating, low energy (MS) and high energy (MSE) mode of acquisition with mass scan range from m/z 50 to 1800. The spectral acquisition time in each mode was 1.0 s with a 0.1-s inter-scan delay. In low energy HDMS (High Definition Mass Spectrometry) mode, data were collected at constant collision energy of 2 eV in both Trap cell and Transfer cell. In high energy HDMSE mode, the collision energy was ramped from 25 to 55 eV in the Transfer cell only. The RF applied to the quadrupole mass analyzer was adjusted such that ions from m/z 300 to 2000 were efficiently transmitted, ensuring that any ions observed in the LC-HDMSE data less than m/z 300 arose from dissociations in the Transfer collision cell. The acquired LC-HDMSE data were processed and searched against protein knowledge database (Uniprot) through ProteinLynx Global Server (PLGS, Waters Corporation) with 4% false discovery rate (FDR). Spectral counts were normalized to total spectral abundance—each identified peptide count was divided by the total count for each analysis and rescaled using a constant factor of 50,000.
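The total-abundance normalization described at the end of this section can be written in a few lines. The following Python sketch (illustrative only; the function name, input layout and toy counts are hypothetical) divides each count by the total for that analysis and rescales by the stated constant of 50,000.

```python
def normalize_spectral_counts(counts, scale=50_000):
    """Normalize one analysis to total spectral abundance.

    counts: dict mapping identifier -> raw spectral count for a single analysis.
    Each count is divided by the analysis total and multiplied by `scale`,
    so the normalized values of one analysis sum to `scale`.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("analysis has no spectral counts")
    return {name: count * scale / total for name, count in counts.items()}


# Toy counts for illustration only (not study data).
normalized = normalize_spectral_counts({"prot_A": 30, "prot_B": 10, "prot_C": 10})
```

Because every analysis is rescaled to the same total, normalized values can be compared across runs that differ in overall sampling depth.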
4.8. Statistical and Bioinformatics Analyses
Raw assay data were log2-transformed, followed by calculation of the median and 95% confidence interval using the bootstrap resampling method. Data were z-score transformed within each group (LUAD and PDAC cases and matched healthy controls) and ROC AUC was determined for 1305 proteins. P values were computed using the two-sided Wilcoxon rank-sum test to compare cases and controls, and the values were subjected to multiple testing correction to determine q-values. Corresponding plots were generated using ggplot2 (version 3.2.1) in the R software environment (version 3.6.1, The R Foundation, https://www.r-project.org).
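The per-protein statistics described above can be sketched as follows. This is an illustrative Python version, not the authors' R code. Two points are assumptions: the rank-sum p-value uses the normal approximation with tie correction (no continuity correction), and the q-values use the Benjamini-Hochberg procedure, since the text does not name the multiple testing correction that was applied. All function names are hypothetical.

```python
import math


def roc_auc(cases, controls):
    """AUC as P(case > control), ties counted as 0.5.
    Equals the Mann-Whitney U statistic divided by n_cases * n_controls."""
    wins = sum((x > y) + 0.5 * (x == y) for x in cases for y in controls)
    return wins / (len(cases) * len(controls))


def ranksum_p_two_sided(cases, controls):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, tie-corrected)."""
    n1, n2 = len(cases), len(controls)
    n = n1 + n2
    u = roc_auc(cases, controls) * n1 * n2
    pooled = sorted(list(cases) + list(controls))
    tie_term, i = 0, 0
    while i < n:  # sum t^3 - t over groups of tied values
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return 1.0
    z = (u - n1 * n2 / 2.0) / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2.0))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg q-values (assumed correction method), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda k: pvalues[k])
    qvalues, running_min = [0.0] * m, 1.0
    for rank in range(m, 0, -1):  # walk from largest to smallest p-value
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues
```

Because AUC depends only on the ranking of cases relative to controls, it is unchanged by any monotonic rescaling applied jointly to both groups; features would then be retained when their AUC meets the threshold used in the study (AUC ≥ 0.7).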
For the proteins that were identified both in protein array with high performance (AUC ≥ 0.7) and in previous cell line exosome proteomics (MS2 ≥ 5 in at least one cell line), the MS2 counts in total exosome extract (TEE), cargo and surface were normalized to the total MS2 count in each cell line. The normalized counts (per 10,000 total count) were represented in heatmaps using gplots (version 3.0.1.1, The Comprehensive R Archive Network, https://github.com/talgalili/gplots) in the R software environment (version 3.6.1) for visualization.
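The selection and normalization feeding these heatmaps can be illustrated with a short Python sketch. It is a simplified, hypothetical version: it handles a single fraction (for example TEE) rather than TEE, cargo and surface separately, and the function name, data layout and toy numbers are not from the study.

```python
def heatmap_values(ms2_counts, auc, auc_min=0.7, ms2_min=5, scale=10_000):
    """Select proteins and normalize their MS2 counts for display.

    ms2_counts: dict cell_line -> dict protein -> raw MS2 count (one fraction).
    auc: dict protein -> plasma sEV AUC from the protein array.
    A protein is kept if AUC >= auc_min and its MS2 count is >= ms2_min in at
    least one cell line. Counts are expressed per `scale` total MS2 counts of
    each cell line (the total includes proteins that are not selected).
    """
    totals = {line: sum(counts.values()) for line, counts in ms2_counts.items()}
    selected = sorted(
        p for p, a in auc.items()
        if a >= auc_min and any(c.get(p, 0) >= ms2_min for c in ms2_counts.values())
    )
    return {
        p: {line: (ms2_counts[line].get(p, 0) * scale / totals[line]
                   if totals[line] else 0.0)
            for line in ms2_counts}
        for p in selected
    }


# Toy input for illustration only (not study data).
toy_ms2 = {
    "line_A": {"P1": 50, "P2": 4, "P3": 946},
    "line_B": {"P1": 0, "P2": 3, "P3": 997},
}
toy_auc = {"P1": 0.8, "P2": 0.9, "P3": 0.6}
matrix = heatmap_values(toy_ms2, toy_auc)
```

In this toy case only P1 passes both filters: P2 never reaches five MS2 counts in any cell line and P3 falls below the AUC threshold.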
Protein subcellular localizations were determined using the COMPARTMENTS database (https://compartments.jensenlab.org/Search). The confidence levels of protein targets associated with 9 major subcellular localizations were first extracted and the heatmaps were generated using gplots (version 3.0.1.1) in the R software environment (version 3.6.1). Unsupervised hierarchical clustering was performed by Ward’s method to aggregate the compartments with similar patterns of protein localization.
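Ward's method merges, at each step, the pair of clusters whose union gives the smallest increase in total within-cluster sum of squares. The sketch below is a small pure-Python illustration of that criterion, suitable for a handful of rows such as nine compartments. It is not the R routine used for the published heatmaps (the Ward variant used there is not stated), and the function name and toy coordinates are hypothetical.

```python
def ward_clustering(rows):
    """Agglomerative clustering with Ward's minimum-variance criterion.

    rows: list of equal-length numeric vectors (e.g. one row per compartment,
    one column per protein confidence score).
    Returns the merge history as (members_a, members_b, cost), where cost is
    the increase in within-cluster sum of squares caused by the merge.
    """
    clusters = {i: ([i], list(r)) for i, r in enumerate(rows)}  # id -> (members, centroid)
    next_id, merges = len(rows), []
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                (m_a, c_a), (m_b, c_b) = clusters[a], clusters[b]
                n_a, n_b = len(m_a), len(m_b)
                dist2 = sum((x - y) ** 2 for x, y in zip(c_a, c_b))
                cost = n_a * n_b / (n_a + n_b) * dist2  # Ward merge cost
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        cost, a, b = best
        m_a, c_a = clusters.pop(a)
        m_b, c_b = clusters.pop(b)
        n_a, n_b = len(m_a), len(m_b)
        centroid = [(n_a * x + n_b * y) / (n_a + n_b) for x, y in zip(c_a, c_b)]
        clusters[next_id] = (m_a + m_b, centroid)
        merges.append((sorted(m_a), sorted(m_b), cost))
        next_id += 1
    return merges


# Toy rows for illustration only: two tight pairs far apart.
history = ward_clustering([[0, 0], [0, 1], [10, 10], [10, 11]])
```

Rows with similar localization profiles merge early at low cost, while dissimilar groups are joined only in the final, high-cost merges, which is what the dendrogram alongside a clustered heatmap displays.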
To identify the canonical pathways that were enriched for the genes with high performance in lung and pancreatic adenocarcinoma, respectively, Ingenuity Pathway Analysis (IPA) was used. Ingenuity Upstream Regulator Analysis was used to further explore feature relationships, independent of established canonical pathways, according to the associations contained within the Ingenuity Knowledge Base.