Marker Identification of the Grade of Dysplasia of Intraductal Papillary Mucinous Neoplasm in Pancreatic Cyst Fluid by Quantitative Proteomic Profiling

The incidence of patients with pancreatic cystic lesions, particularly intraductal papillary mucinous neoplasm (IPMN), is increasing. Current guidelines, which primarily consider radiological features and laboratory data, have had limited success in predicting malignant IPMN. The lack of a definitive diagnostic method has led to low-risk IPMN patients undergoing unnecessary surgeries. To address this issue, we discovered IPMN marker candidates by analyzing pancreatic cystic fluid by mass spectrometry. A total of 30 cyst fluid samples, comprising IPMN dysplasia and other cystic lesions, were evaluated. Mucus was removed by brief sonication, and the resulting supernatant was subjected to filter-aided sample preparation and high-pH peptide fractionation. Subsequently, the samples were analyzed by LC-MS/MS. Using several bioinformatics tools, such as gene ontology and ingenuity pathway analysis, we detailed IPMNs at the molecular level. Among the 5834 proteins identified in our dataset, 364 proteins were differentially expressed between IPMN dysplasia. The 19 final candidates consistently increased or decreased with greater IPMN malignancy. CD55 was validated in an independent cohort by ELISA, Western blot, and IHC, and the results were consistent with the MS data. In summary, we have determined the characteristics of pancreatic cyst fluid proteins and discovered potential biomarkers for IPMN dysplasia.


Cancers 2020, 12, 2383

Introduction
The incidental detection of pancreatic cystic lesions (PCLs) has increased in recent years due to the implementation of various screening methods and the advancement of medical imaging technologies, such as magnetic resonance imaging (MRI), computed tomography (CT), and endoscopic ultrasound (EUS) [1][2][3][4][5]. In response, many studies have attempted to develop screening methods that aid in the [...] fluid that was obtained exclusively from IPMN patients by LC-MS/MS [32]. In the current study, we aimed to discover marker candidates for IPMN dysplasia by mass spectrometry from an expanded cohort that included IPMNs and other PCLs (mucinous cystic neoplasm (MCN) and serous cystic neoplasm (SCN)). This design was chosen to better reflect actual clinical circumstances, to help classify various PCLs, and to avoid unnecessary pancreatic resection for low-risk IPMN patients.

Results

In-Depth Quantitative Proteomics of Pancreatic Cyst Fluid
A mass spectrometry-based method, following our previous study, was used to analyze a cohort of cyst fluid samples to measure the changes in protein expression with respect to the progression of IPMN [32]. The overall procedure for discovering markers of IPMN progression, from sample preparation to the LC-MS/MS analysis, is depicted in Figure 1. The discovery cohort included 30 pancreatic cyst fluid samples from 3 types of IPMN (LGD, HGD, and invasive IPMN) and other PCLs (MCN and SCN). The pooled samples were fractionated and analyzed in parallel to generate a peptide library, which was used to expand the coverage of identified proteins for individual samples. Each fractionated sample was analyzed once, whereas all individual samples were analyzed in triplicate on a Q Exactive mass spectrometer.
Figure 1. The cohort for label-free quantification included 30 pancreatic cyst fluid samples (10 LGD, 5 HGD, 5 invasive IPMN, 5 MCN, and 5 SCN). After mucus removal by sonication, the samples were centrifuged to isolate supernatant. Pooled cyst fluid (comprising equal amounts of 30 individual samples), secreted proteins from PANC1, Mia Paca-2, and BxPC3, and pooled cell lysates from the 3 cell lines were compiled to generate a peptide library. All samples were precipitated using cold acetone to extract the protein. After FASP digestion, only the samples that were used to construct the peptide library were subjected to high-pH reverse-phase peptide fractionation. All peptides were analyzed on a Q Exactive mass spectrometer. CD55, one of the potential markers of IPMN dysplasia, was validated by ELISA. PPT, precipitation; FASP, filter-aided sample preparation; LGD, low-grade dysplasia; HGD, high-grade dysplasia; MCN, mucinous cystic neoplasm; SCN, serous cystic neoplasm; SPNT, supernatant.
Raw MS data were processed in MaxQuant (version 1.6.0.16), and the statistical analysis was performed with Perseus (version 1.6.1.1). The MaxQuant analysis identified 1,314,934 spectral matches, 56,583 peptides, and 5834 protein groups, 5774 of which were quantifiable (Table S1). For label-free quantification, 5578 and 3249 proteins were identified in the peptide library and 30 individual cyst fluid samples, respectively. A total of 2993 proteins (92.1%) that were identified in individual samples overlapped with the peptide library (Figure 2A). The quantified proteins in individual cyst fluids accounted for 86.5% of the 3249 identified proteins (Figure 2B). Notably, the 3 IPMN groups had approximately twice as many quantified proteins (2220-2500) as MCN (1218) and SCN (1346) (Figure S1A). Overall, the number of quantified proteins varied significantly, even within histological groups. Specifically, the identified and quantified proteins in each sample ranged from a minimum of 657 identified (298 quantified) in LGD 10 to a maximum of 2587 identified (2014 quantified) in invasive IPMN 1 (Figure S1B).
To improve the proteome coverage, the "match between runs" feature in MaxQuant was utilized to align the retention times and MS/MS spectra of the individual samples against the peptide library [33]. In total, an additional 773 and 420 proteins were identified and quantified, respectively, across all individual samples. LGD 6 showed the largest increase in the number of identified and quantified proteins, by 457 and 235, respectively. On average, 100 more peptides were identified in each sample (Figure 2C, Table S2). This result demonstrates that the overall proteome coverage of individual cyst fluid samples rose, thereby enlarging the pool of potential biomarker candidates.
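As a quick arithmetic check, the coverage percentages quoted in this subsection can be reproduced from the reported counts. The sketch below is illustrative only: the variable and function names are ours, and the count of 2809 quantified proteins is taken from the filtering description later in the Results.

```python
# Counts quoted in the text (protein groups).
identified_in_individual_samples = 3249
overlap_with_peptide_library = 2993
quantified_in_individual_samples = 2809


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Share of individually identified proteins also found in the peptide library.
overlap_pct = percent(overlap_with_peptide_library, identified_in_individual_samples)
# Share of individually identified proteins that were also quantified.
quantified_pct = percent(quantified_in_individual_samples, identified_in_individual_samples)

print(overlap_pct, quantified_pct)
```

Both values agree with the 92.1% and 86.5% reported above.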
The dynamic range of protein expression levels spanned over 7 orders of magnitude, but most proteins (95%) were expressed within 4 orders of magnitude (Figure S2). Of these proteins, the levels of pancreatic cancer-associated proteins, such as MUC5AC, MUC2, and CEA, were high. Five proteins (PNLIP, CPA1, CPB1, PRSS1, and PRSS2) in a smaller dynamic range, as shown in Figure S2, are known to be significantly expressed in the pancreas compared with other organs [34]. In addition, with the exception of PRSS1, these proteins, denoted in blue, are generally exclusive to the pancreas, per Wilhelm et al. [34]. This result confirms the presence of pancreas-specific proteins in our proteome data.
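The notion of a dynamic range "in orders of magnitude" can be made concrete as the base-10 logarithm of the ratio between the highest and lowest intensity. The following is a minimal sketch under that assumption; the intensity values are hypothetical and the function name is ours, not part of the original analysis.

```python
import math


def orders_of_magnitude(intensities):
    """Span of the positive intensities in log10 units (max over min)."""
    positive = [v for v in intensities if v > 0]
    return math.log10(max(positive) / min(positive))


# Hypothetical protein intensities, not taken from the dataset.
hypothetical_intensities = [2.0e4, 5.0e5, 3.0e7, 8.0e9, 2.0e11]
span = orders_of_magnitude(hypothetical_intensities)
print(f"{span:.1f} orders of magnitude")
```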

Reproducibility of Data and Comparison with Other Proteome Databases
To evaluate the reproducibility between the technical replicates, the coefficient of variation (CV) values and Pearson correlation coefficients of LFQ intensity values between technical replicates were calculated (Figures S3 and S4, Tables S3 and S4). The low median CV values (<20%) and high Pearson correlation coefficients (>0.9) between technical replicates indicated that the label-free quantification of cyst fluid samples was highly reproducible.
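The two reproducibility metrics above can be sketched as follows. This is an illustration rather than the procedure used in the study: the protein names and intensities are hypothetical, the CV is computed here from the sample standard deviation, and whether intensities were log-transformed beforehand is not specified in the text.

```python
import math
import statistics


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical LFQ intensities: one entry per protein, three technical replicates.
lfq = {
    "PROT_A": [1.00e7, 1.10e7, 0.95e7],
    "PROT_B": [5.0e8, 5.4e8, 4.8e8],
    "PROT_C": [2.0e6, 2.3e6, 1.9e6],
    "PROT_D": [8.0e9, 7.6e9, 8.3e9],
}

# Per-protein CV across replicates, then the median over all proteins.
median_cv = statistics.median(coefficient_of_variation(v) for v in lfq.values())

# Correlation between replicate 1 and replicate 2 across proteins.
rep1 = [v[0] for v in lfq.values()]
rep2 = [v[1] for v in lfq.values()]
r12 = pearson(rep1, rep2)

print(f"median CV = {median_cv:.2f}%, Pearson r = {r12:.4f}")
```

With these made-up values, the median CV falls below the 20% threshold and the correlation exceeds 0.9, which is the pattern the text describes for the real data.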
To examine the composition of pancreatic cyst fluid proteins, our data were compared with various proteome databases and data from past studies. These datasets included the following: (1) SecretomeP, SignalP, and TMHMM; (2) the Human Plasma Protein Database; (3) the Human Protein Atlas; (4) the "core" proteome, which referred to the proteins that were common to 5 major proteome databases, in Wilhelm et al. [34]; and (5) our previous study [32] (Figure S5, Table S1). These comparative analyses indicated that the proteins identified in this study had the expected characteristics of pancreatic cyst fluid and that our dataset exceeded the proteome coverage of past cyst fluid proteomes. The expression patterns of 8 marker candidates from our previous report were replicated in the present study but were not statistically significant (Figure S6, Table S5). Results on the reproducibility of the data and the comparative analysis with other proteome databases are detailed in Supplementary Results.

Differentially Expressed Proteins between IPMN Dysplasia
The diagram in Figure S7 details the discovery of potential markers of IPMN dysplasia. Of the 5834 identified proteins, 2809 were quantifiable in individual cyst fluid samples and had LFQ intensity values in at least 2 technical replicates in 1 biological replicate. Of the 2809 quantified proteins, 1019 had more than 70% measurable LFQ intensities in at least 1 histological group and were deemed usable for the statistical analysis. This criterion was established to ensure that a putative marker candidate represented at least 1 histological group.
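The valid-value criterion described above (more than 70% measurable LFQ intensities in at least 1 histological group) can be sketched as a simple filter. This is an illustration under our own assumptions: missing values are represented as `None`, the threshold is applied strictly, and the protein data are hypothetical.

```python
def passes_valid_value_filter(intensities_by_group, min_fraction=0.7):
    """Return True if at least one group has more than min_fraction measured values.

    Missing LFQ intensities are represented as None.
    """
    for values in intensities_by_group.values():
        if not values:
            continue
        measured = sum(v is not None for v in values)
        if measured / len(values) > min_fraction:
            return True
    return False


# Hypothetical proteins with 5 samples per group; None means not quantified.
protein_x = {
    "LGD": [None, None, 21.3, None, None],
    "HGD": [22.0, 22.4, None, 21.9, 22.1],  # 4 of 5 measured (80%)
    "invasive": [None, 23.0, None, None, 22.8],
}
protein_y = {
    "LGD": [None, 20.1, None, 20.5, None],  # 2 of 5 measured (40%)
    "HGD": [None, None, 19.8, None, 20.0],
    "invasive": [21.0, None, None, 20.7, None],
}

keep_x = passes_valid_value_filter(protein_x)
keep_y = passes_valid_value_filter(protein_y)
print(keep_x, keep_y)
```

Here the first protein is retained because one group is well covered, while the second is dropped because no single group reaches the threshold.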
The variance in expression between comparisons 1 to 3 was depicted in volcano plots. The highlighted final marker candidates, including the validation target CD55, showed significantly large fold-changes (Figure 3). In the statistical analysis of comparisons 1 to 3, 364 DEPs remained after overlapping proteins were removed from each comparative group (Figure S8). Of the 364 DEPs, 261 were exclusively upregulated, and 80 were exclusively downregulated. The remaining 23 proteins did not have consistent expression patterns across the 3 comparisons (Table S7).
Gene ontology (GO), KEGG pathway analysis, and Ingenuity Pathway Analysis (IPA) were performed to characterize the 364 DEPs. The GO terms and KEGG pathways were associated with pancreatic cancer and cyst fluid (Figure S9, Table S8). A total of 216 DEPs from comparison 1 (LGD vs. HGD) and 247 DEPs from comparison 3 (LGD vs. invasive IPMN) were analyzed by IPA. The biological functions that were related to "malignancy" and "molecular secretion" were associated with the DEPs in the core analysis (Figure S10A,B). In addition, in the comparative analysis, pancreas-specific diseases and biological functions that were related to malignancy were confirmed to be enriched to a greater extent in comparison 3 than in comparison 1 (Figure S10C,D). All results of the bioinformatics analyses are detailed in Supplementary Results.
Figure 3. Volcano plots of comparisons 1 to 3, including LGD versus invasive intraductal papillary mucinous neoplasm (IPMN) (C), used to discover differentially expressed proteins (DEPs). The DEPs that were significantly expressed in each comparison group are indicated as colored dots (red: upregulated DEPs, blue: downregulated DEPs). Several marker candidates, including the validation target CD55, are highlighted in each comparison. DEP, differentially expressed protein.

Biomarker Candidates of IPMN Dysplasia
A total of 364 DEPs passed the statistical analysis (Figure S8). Following the rationale that proteins with significant differences in expression in more comparisons are more likely to be biomarkers, 179 DEPs were designated as initial marker candidates, because they were present in at least 2 of 3 comparative pairs (1 to 3) [32,33]. Subsequently, 27 DEPs had expression patterns that consistently increased or decreased with greater IPMN malignancy. Of them, 13 DEPs that were preferentially expressed in invasive IPMN were statistically significant in comparisons 6 (SCN versus invasive IPMN) and 7 (MCN versus invasive IPMN). Similarly, the remaining 14 DEPs were expressed in LGD and showed significant differences in comparisons 4 (SCN versus LGD) and 5 (MCN versus LGD). Based on the rationale that tumor-associated proteins are generally secreted from surrounding tumor cells, the 19 DEPs predicted to be secreted by SecretomeP, SignalP, and TMHMM were selected as the final marker candidates [35,36].
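The first two selection steps above (presence in at least 2 of 3 comparisons, then a consistent increase or decrease with malignancy) can be sketched with plain set operations. The protein identifiers, DEP sets, and group means below are hypothetical, and "consistent" is interpreted here as a strictly monotonic trend across LGD, HGD, and invasive IPMN, which is our assumption rather than a stated definition.

```python
from collections import Counter


def in_at_least(comparisons, k=2):
    """Proteins that are DEPs in at least k of the given comparison sets."""
    counts = Counter(p for deps in comparisons for p in deps)
    return {p for p, n in counts.items() if n >= k}


def is_monotonic(means):
    """True if the means strictly increase or strictly decrease in order."""
    pairs = list(zip(means, means[1:]))
    return all(a < b for a, b in pairs) or all(a > b for a, b in pairs)


# Hypothetical DEP sets for the three IPMN comparisons.
comparison_1 = {"P1", "P2", "P3"}  # LGD vs HGD
comparison_2 = {"P2", "P4"}        # HGD vs invasive IPMN
comparison_3 = {"P1", "P2", "P5"}  # LGD vs invasive IPMN

initial = in_at_least([comparison_1, comparison_2, comparison_3], k=2)

# Hypothetical mean intensities ordered as (LGD, HGD, invasive IPMN).
group_means = {"P1": (20.0, 21.5, 23.0), "P2": (22.0, 24.0, 23.1)}
trend = {p for p in initial if is_monotonic(group_means[p])}

print(sorted(initial), sorted(trend))
```

In this toy example two proteins pass the first step, but only one shows a consistent trend across the three grades.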
The heat map in Figure 4 provides an overview of the expression of the 19 final marker candidates of IPMN dysplasia: 7 invasive IPMN-specific marker candidates and 12 LGD-specific marker candidates.
Figure 4. [...] Similarly, the percentage of LGD-specific marker candidates represents the number of LGD samples that showed upregulation for the given protein.
Table S6 details the results of the statistical analysis of the 19 potential markers (7 upregulated proteins and 12 downregulated proteins), including statistical significance, p-values, and fold-changes for each comparative group. A total of 16 proteins, with the exception of RAB11B, KLK1, and CELA2A, were pancreatic tissue-specific proteins, according to the Human Protein Atlas. In addition, 15 proteins, with the exception of DEFA3, MUC13, RAB11B, and LEFTY1, were observed in plasma or serum, per the Plasma Proteome Database (PPD) (Table S1).
The fold-changes of the 19 final marker candidates were assessed in relation to the general distribution of the 1019 proteins that were used for the statistical analysis. To this end, potential markers from comparisons 1 and 3 were displayed in a dynamic range and ordered by fold-change (Figure 6A,B). Excluding CD55, RAB11B, and CPS1 in comparison 1, 16 proteins had p-values below 0.05 and generally lay near the 2 extremes of the dynamic range. Similarly, in comparison 3, all 19 candidates were statistically significant (p < 0.05) and located near the upper and lower extremes of the dynamic range. All potential markers had higher fold-changes than CEA, the most well-established pancreatic cancer-associated marker.
Upstream regulator analysis in IPA was conducted to predict the upstream proteins of the final candidates and their biological functions. This analysis predicted the top 20 likely upstream regulators that modulate the 5 final candidates (MUC13, CD55, CPS1, SOD2, and LEFTY1). A total of 11 regulator proteins were associated with SOD2, 4 proteins correlated with CD55, 4 proteins were linked to LEFTY1, and 1 was associated with CPS1 and MUC13 (Figure 6C, Table S9). The upstream regulators were of the following molecular types: kinase, enzyme, G-protein-coupled receptor, transcription regulator, transporter, and ion channel. According to several publications, the upstream regulators were tumor suppressors, pancreatic mitogens, and other key factors of cancer progression, supporting the credibility of the marker candidates [37][38][39][40][41][42][43][44][45][46][47][48][49][50][51].
The selection of the validation target was based on a protein's predominance in LGD or invasive IPMN and its statistical significance (Figure S7). Accordingly, CD55 was chosen as a validation target of IPMN dysplasia for 3 main reasons: (1) it was highly expressed in all (100%, 5/5) invasive IPMN samples (Figure 4), (2) it had the smallest p-value of all potential markers from all comparative groups that involved invasive IPMN (Figure 5, Table S6) [...] Table S10.
CD55 concentrations in individual cyst fluid samples were calculated and presented under 2 types of IPMN classification (Figure 7). The concentration of CD55 was the highest in invasive IPMN, and its expression patterns generally correlated with LFQ intensity values. CD55 concentrations in invasive IPMN (mean: 1.354 ng/mL, STDEV: 1.532 ng/mL) were significantly higher than in LGD (mean: 0.598 ng/mL, STDEV: 1.045 ng/mL) (p < 0.05). In addition, CD55 concentrations in high-risk IPMN (mean: 1.219 ng/mL, STDEV: 1.567 ng/mL) were significantly higher than in low-risk IPMN (mean: 0.598 ng/mL, STDEV: 1.045 ng/mL). The expression levels of CD55 in invasive IPMN compared with SCN and MCN were statistically significant (p < 0.05). The intraplate and interplate repeatability of the CD55 ELISA was evaluated by measuring 3 replicates of a total of 21 positive and negative control samples. The CV values for the intraplate and interplate repeatability were less than 20% (Figure S12 and Table S11). A detailed description of the precision of the CD55 ELISA can be found in the Supplementary Data.
To reconfirm CD55 expression between the 5 cystic lesions, Western blot was conducted using 30 cyst fluid samples (8 LGD, 4 HGD, 8 invasive IPMN, 5 MCN, and 5 SCN). Ponceau S staining was included as a loading control to confirm that comparable amounts of individual samples were loaded onto each gel. The resulting CV value was 12.53% (Figure S13A and Table S12). The signal intensity of CD55 was the highest in invasive IPMN, and its expression patterns correlated with the MS analysis findings (Figure S13). The signal intensities in invasive IPMN were significantly higher than in LGD (p < 0.001), MCN (p < 0.01), and SCN (p < 0.01) (Figure S13B). In addition, the signal intensities in high-risk IPMN were significantly higher than in low-risk IPMN (p < 0.001) (Figure S13C).

Immunohistochemistry (IHC) of CD55 and Myeloperoxidase (MPO)
Immunohistochemical stains for CD55 and MPO were performed on formalin-fixed paraffin-embedded (FFPE) tissue sections from SCN, LGD, HGD, and invasive IPMN (Figure S14). CD55 expression was observed predominantly in the apical border of the tumor epithelial cells, showing increased distribution and intensity, in accordance with the histological grades of IPMN, whereas strong membranous staining was observed in invasive IPMN (Figure S14A-D). We also examined MPO, a neutrophil marker, to identify neutrophil infiltration in invasive IPMN, based on a previous study that reported that CD55 is responsible for transepithelial migration of neutrophils [52]. As expected, neutrophil infiltration increased from LGD to HGD and invasive IPMN, which had the highest neutrophil counts (Figure S14E-H). CD55 expression and neutrophil infiltration were not observed in SCN.

Discussion
In this study, we discovered reliable marker candidates of IPMN dysplasia using cyst fluid from IPMN, MCN, and SCN patients by LC-MS/MS and investigated their molecular characteristics, based on the advantages of pancreatic cyst fluid and MS-based proteomic approaches [27][28][29][30]. The current diagnostic screens cannot accurately determine the IPMN-associated grade of dysplasia [1,4], leading to unnecessary surgical resections for low-risk IPMN patients [23]. In addition, most studies have focused on discovering diagnostic markers that differentiate IPMNs from other PCLs, rather than the grade of dysplasia in IPMN [53]. Our report is the first study to discover potential markers of IPMN dysplasia using cyst fluid from 3 major types of PCL by LC-MS/MS and to validate them by an orthogonal method.
Our proteome data have 3 notable aspects: (1) increased depth of proteome coverage through the use of a peptide library, (2) high reproducibility, and (3) an abundance of pancreas-associated proteins. We hypothesized that a high proportion of proteins are coexpressed in cyst fluid and pancreatic cancer cell lines. Thus, we expected that the "match between runs" feature in MaxQuant would help identify proteins in cyst fluid that are normally unidentifiable without a peptide library [54]. Consistent with our expectation, approximately twice as many proteins were identified in this dataset as in our previous study (Figure S5F, Table S1), constituting the largest proteomic dataset of pancreatic cyst fluid [32,55,56]. Consequently, we discovered potential markers of the histological grades of IPMN in a larger pool of proteins. Further, the low median CV values and high Pearson correlation coefficients of LFQ intensities between technical replicates indicated that the individual samples were injected into the mass spectrometer without significant variance and that the technical replicates were analyzed reproducibly (Figures S3 and S4, Tables S3 and S4). In previous studies, it was concluded that pancreatic cyst fluid contains secreted proteins from surrounding tumor cells [35,36,53] and plasma proteins that penetrate into the cyst epithelium due to tissue injury or the enhanced permeability and retention (EPR) effect of the surrounding blood vessels [57]. Tumor-promoting mediators, such as immunosuppressive cytokines, that are released from cancer-associated fibroblasts and mast cells in the tumor microenvironment can promote the neoplastic evolution of IPMN [58][59][60]. They induce an aggressive phenotype and drug resistance in premalignant pancreatic lesions.
One possible mechanism of the malignant evolution of IPMN, in light of our findings and existing literature, is that the expression of CD55, promoted by the immunosuppressive cytokine interleukin-4, prevents complement-dependent cytotoxicity in cancer cells, which consequently accelerates the malignant transformation of IPMN dysplasia [61,62]. The results of the comparative analyses with the 3 databases for the secretome analysis and the Human Plasma Protein Database (Figure S5A-C, Table S1) support these findings [35,36,53]. In addition, 15 of the final marker candidates were detected in plasma and serum, supporting their viability in a blood-based assay [63,64]. A high proportion (approximately 90%) of pancreas-associated proteins were identified in our dataset (Figure S5D,E, Table S1) and 5 pancreas tissue-specific proteins, as defined by Wilhelm et al. [34], were in the top 25 proteins (Figure S2), demonstrating that our proteome has sufficient coverage of pancreas-specific proteins.
GO and KEGG pathway analyses were performed to identify key processes of the 364 DEPs in IPMN dysplasia. The results indicated the enrichment of terms that pertained to tumorigenesis: pancreatic secretion, enzymatic activity, and malignancy (Figure S9, Table S8). GO terms that were related to "pancreatic secretion" and "molecular transport" were highly ranked in the GO and KEGG analyses, suggesting that the DEPs in IPMN dysplasia are generally secreted from the surrounding tumor cells. One of the most highly enriched GO terms, "proteolysis," is the most fundamental feature of malignancy [65,66]. Proteolytic degradation of ECM constituents accelerates tumor cell growth, migration, and angiogenesis. The most highly enriched KEGG pathway, "complement and coagulation cascades," is associated with tumor growth and metastasis [67,68]. Complement activation promotes an immunosuppressive microenvironment and thus induces angiogenesis, activating cancer-related signaling pathways. In addition, several studies have reported increases in complement activity in biological fluids from cancer patients [69,70]. Coagulation cascades can be activated directly by cancer procoagulants, which are released by tumor cells.
The 8-step criteria for discovering marker candidates of IPMN dysplasia (Figure S7) and the association of the final marker candidates with pancreas-related diseases (IPMN and PDAC) and malignancy support the credibility of the potential markers. For instance, MUC13 has been studied extensively. According to 2 previous studies with goals similar to ours, MUC13 increased with the histological grade of IPMN [71,72]. In addition, several studies have indicated that MUC13 is highly upregulated in PDAC tissue but not in adjacent normal tissue and is related to PDAC progression [73][74][75]. CD55 is involved in the dedifferentiation, invasiveness, migration, and metastasis of tumors and is associated with pancreatic cancer [76,77]. Further, Iacobuzio-Donahue confirmed that CD55 is highly expressed in pancreatic cancer when measured by microarrays [78]. Two previous studies that aimed to discover protein markers of mucinous and nonmucinous cysts selected AMY2A as a biomarker, consistent with the expression patterns in our study [26,55]. In addition, AMY2A was expressed at higher levels in nontumor versus PDAC tissues [79]. According to the label-free quantification data of another group, CPB1, a member of the carboxypeptidase family, was confirmed to be downregulated in PDAC tissue [80]. The biological functions and expression patterns of our potential markers are consistent with previous studies.
Of the final marker candidates, CD55 was validated by ELISA using 70 individual cyst fluid samples, as it evidently differed in expression between the histological groups of IPMN. Specifically, CD55 had the lowest p-values between all comparative groups and the highest fold-changes in comparisons 2 (HGD versus invasive IPMN) and 3 (LGD versus invasive IPMN) (Figure 3, Figure 5, Table S6). Although previous studies have concluded that CD55 is associated with the dedifferentiation and invasiveness of tumors [76,77], this study is the first to report CD55 as a marker of IPMN dysplasia.
The statistical significance in the ELISA analysis was lower than that in the MS analysis. Further, the CD55 concentrations measured by ELISA were generally low in all sample groups (Figure 7). However, the lower statistical significance and low concentrations do not diminish the value of this potential marker because the expression patterns of CD55 by ELISA were consistent with our MS data. In addition, the Western blot and IHC results for CD55 were consistent with the MS expression data, further supporting CD55 as a potential marker of IPMN dysplasia (Figures S13 and S14).
Several studies have examined CD55 as a potential biomarker and therapeutic target, establishing that CD55 is frequently upregulated in various cancer types and can serve as an indicator of cancer progression [81]. In a preclinical study, Saygin et al. demonstrated that CD55 maintains self-renewal and cisplatin resistance in endometrioid tumors to accelerate tumor development [82]. Other preclinical studies concluded that silencing CD55 enhances the therapeutic efficacy of rituximab [83] and the anti-HER2 monoclonal antibodies trastuzumab and pertuzumab [84]. Taken together, past studies and our findings indicate that CD55 has a significant function in tumor progression. In light of these preclinical studies, CD55 has substantial potential as an indicator of the neoplastic evolution of IPMN.
In our previous study, which applied a similar MS-based approach, we reported potential markers that can differentiate the histological grades of IPMN using only cyst fluid from IPMN patients [32]. However, excluding other PCLs, such as MCNs and SCNs, does not reflect an actual clinical environment. To compensate for this limitation, cyst fluid from other PCLs (MCN and SCN) as well as IPMN was included in the present study to increase the likelihood of discovering more clinically relevant marker candidates. Further, we increased the size of our validation cohort by 4-fold and measured protein levels by ELISA instead of Western blot, due to the higher sensitivity and specificity of the former. In contrast to Western blot, which cannot distinguish proteins with similar molecular weights, ELISA is highly specific to its target epitope and consequently generates credible quantitative concentrations [85].
However, several challenges remain to be addressed. Our study was limited by its single-center design. Using a larger cohort from multiple centers would decrease any bias that might be unique to the cohort in this study. Thus, it will be necessary to examine the clinical potential of CD55 through randomized, controlled, multicenter validation in a large cohort. Although 5 marker candidates, including CD55, were statistically significant (p < 0.05) in the univariate analysis for predicting the neoplastic evolution of IPMN, no significant covariates were identified in the multivariate analysis. One explanation is the relatively small sample size used in the study, which comprised 10 low-risk IPMN and 10 high-risk IPMN samples. Thus, further validation with more samples is necessary to obtain more reliable results. Another limitation is that all of the cyst fluid in this study was collected from the resected tissue of patients who were undergoing surgery rather than by EUS-guided aspiration, which is a closer representation of clinical practice. Thus, the evaluation of CD55 using preoperative EUS-guided cyst fluid is a natural next step in validating the diagnostic value of CD55 as a marker for IPMN dysplasia.

Patients and Cyst Fluid Samples
Cyst fluid samples were collected from 30 patient specimens (20 IPMN, 5 MCN, and 5 SCN) immediately after pancreatectomy at Seoul National University Hospital between April 2013 and June 2017. IPMN samples were classified as low-grade dysplasia (LGD, n = 10), high-grade dysplasia (HGD, n = 5), and invasive IPMN (n = 5). The same samples were also categorized as low-risk IPMN (LGD) and high-risk IPMN (HGD and invasive IPMN). The patient data and characteristics of the cystic lesions are summarized in Table 1 and detailed in Supplementary Methods. At least 200 µL of cyst fluid was aspirated from each specimen to acquire sufficient protein for analysis. The aspirated cyst fluid samples were stored at −80 °C until sample preparation. All contents of this research were approved by the Institutional Review Board (IRB No. 1304-121-486), and all participants provided written informed consent.
Abbreviations (Table 1): LGD, low-grade dysplasia; HGD, high-grade dysplasia; MCN, mucinous cystic neoplasm; SCN, serous cystic neoplasm. * The number of patients with two different gland types is shown in parentheses.

Pancreatic Cyst Fluid Sample Preparation
Cyst fluid was briefly sonicated in a 1.5-mL Eppendorf tube to remove mucus. The sonicated samples were then centrifuged (15,000 rpm, 20 min, 4 °C) to obtain a supernatant [32]. After measuring protein concentration, equal amounts of protein in each sample were precipitated with cold acetone. The proteins were denatured using SDT lysis buffer (4% SDS, 0.1 M DTT, 0.1 M Tris-Cl, pH 7.4). The sample preparation comprised tryptic digestion with filter-aided sample preparation (FASP) and desalting with homemade StageTips [86,87]. StageTip-based, high-pH peptide fractionation was performed only for pooled samples that were used to generate the peptide library. The sample preparation is detailed in Supplementary Methods.

LC-MS/MS and Statistical Analysis
Liquid chromatography-tandem mass spectrometry (LC-MS/MS) analysis was performed on a Q Exactive mass spectrometer that was equipped with an EASY-Spray ion source (Thermo Fisher Scientific, Waltham, MA, USA) and coupled to an Easy-nano LC 1000 (Thermo Fisher Scientific, Waltham, MA, USA) [32,88]. All raw MS files were processed in MaxQuant, version 1.6.0.16 [89], with the built-in Andromeda search engine [90] against the UniProt human database (88,717 entries; version from December 2014). All generated proteomic data have been submitted to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org/) via the PRIDE partner repository, with PXD016127 as the identifier [91,92]. The LC-MS/MS analysis and raw data search are detailed in Supplementary Methods.
The statistical analysis was performed in Perseus (version 1.6.1.1) per our previous studies [32,93]. Student's t-test (p < 0.05) was applied to find significantly changed proteins. The 7 comparative pairs that were subjected to statistical analysis were as follows: LGD

Enzyme-Linked Immunosorbent Assay (ELISA)
CD55 protein was measured using a commercial quantikine ELISA kit (CSB-E05121h, CUSABIO, China) per the manufacturer's instructions. Seventy cyst fluid samples (22 LGD, 5 HGD, 14 invasive IPMN, 13 MCN, and 16 SCN) were centrifuged to isolate the supernatant for the ELISA. Equal amounts of protein (298 µg, as measured by BCA assay) were loaded into each well of a 96-well plate. The protein concentration data were analyzed statistically by Student's t-test.
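As a rough illustration of the two-group comparison described above, the following sketch implements a pooled-variance (equal-variance) Student's t-test and a fold-change of group means using only the Python standard library. It is not the analysis code used in this study: the function names and the example concentrations are hypothetical, and the two-sided p-value is obtained here by numerically integrating the t density, which is an assumption made to keep the example self-contained.

```python
import math
from statistics import mean, variance


def two_sided_p(t: float, df: int, steps: int = 20000) -> float:
    """Two-sided p-value for Student's t, by Simpson integration of the density."""
    t = abs(t)
    if t == 0.0:
        return 1.0
    # Log of the normalising constant of the t density with df degrees of freedom.
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x: float) -> float:
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps  # steps is even, as Simpson's rule requires
    area = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


def student_t_test(group_a, group_b):
    """Return (t statistic, degrees of freedom, two-sided p-value)."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df, two_sided_p(t, df)


def fold_change(group_a, group_b) -> float:
    """Ratio of the mean of group_a to the mean of group_b."""
    return mean(group_a) / mean(group_b)


# Hypothetical concentrations (arbitrary units), NOT data from this study.
low_risk = [1.0, 2.0, 3.0, 4.0, 5.0]
high_risk = [2.0, 3.0, 4.0, 5.0, 6.0]

t_stat, dof, p_value = student_t_test(high_risk, low_risk)
print(f"t = {t_stat:.3f}, df = {dof}, p = {p_value:.4f}, "
      f"fold-change = {fold_change(high_risk, low_risk):.2f}")
```

A pooled-variance test assumes similar variances in the two groups; when group sizes and variances differ markedly, Welch's unequal-variance test may be a more cautious choice.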

Conclusions
In summary, we have generated the largest proteomic dataset of pancreatic cyst fluid to date and discovered potential markers of IPMN dysplasia using cyst fluid from 3 major types of PCLs (IPMN, MCN, and SCN) by LC-MS/MS. We significantly increased the protein coverage of each sample with a peptide library and discovered markers from a larger pool of candidates. By bioinformatics analyses, the DEPs were associated with biological functions that were related to pancreatic cancer, malignancy, and molecular secretion. Our process for discovering potential markers of IPMN dysplasia was logically sound. The agreement in the expression pattern of CD55 between the MS and ELISA data demonstrates that we have discovered reliable marker candidates of IPMN dysplasia. The development of cyst fluid markers can facilitate an accurate assessment of the degree of IPMN dysplasia and effectively guide surgical decision-making. Ultimately, if the developed marker is implemented in clinical practice, the accurate assessment of IPMN dysplasia will prevent unnecessary surgical resection for low-risk IPMN patients.

Supplementary Materials:
The following are available online at http://www.mdpi.com/2072-6694/12/9/2383/s1, Figure S1: The number of identified and quantified proteins in individual samples and the peptide library, Figure S2: Dynamic range of quantified proteins, Figure S3: Coefficient of variation (CV%) values of technical triplicates in each individual sample, Figure S4: Pearson correlation coefficients between technical replicates in each sample group, Figure S5: Comparative analysis with various proteome databases and other proteomic studies, Figure S6: Comparison of the expression patterns of the final marker candidates between our previous and present studies, Figure S7: Flowchart of the discovery of potential markers of IPMN dysplasia, Figure S8: Venn diagram of differentially expressed proteins in 3 comparative groups, Figure S9: Results of Gene Ontology (GO) and KEGG pathway analyses, Figure S10: Ingenuity Pathway Analysis, Figure S11: Thirteen potential biomarkers with expression patterns that were consistent with the degree of IPMN malignancy, Figure S12: Intraplate and interplate repeatability of CD55 by ELISA, Figure S13: Validation of CD55 as a potential marker by Western blot, Figure S14: Immunohistochemical staining of CD55 and MPO, Table S1: List of total identified protein groups, Table S2: Summary of MaxQuant searches in terms of identified and quantified proteins and peptides with and without the peptide library, Table S3: CV values of log2-transformed LFQ intensity sums of each technical replicate in individual samples, Table S4: Pearson correlation coefficients of technical triplicates, Table S5: Expression patterns of final marker candidates in the previous study compared with the present study, Table S6: Statistical analysis results, Table S7: List of exclusively upregulated, exclusively downregulated, and up- or downregulated proteins in comparisons 1 to 3, Table S8: Results of GO and KEGG pathway analyses conducted with the DAVID bioinformatics tool, Table S9: Results of upstream regulator analysis in IPA, Table S10: Demographic and clinical information of the validation cohort used for the ELISA, Table S11: Intraplate and interplate repeatability of CD55 by ELISA, Table S12: Ponceau S staining as an alternative loading control to actin.

Conflicts of Interest:
The authors declare no conflict of interest.