Proteomic Characterization of Colorectal Cancer Cells versus Normal-Derived Colon Mucosa Cells: Approaching Identification of Novel Diagnostic Protein Biomarkers in Colorectal Cancer

In the Western world, colorectal cancer (CRC) is the third most common cause of cancer-related deaths. Survival is closely related to the stage of cancer at diagnosis, underscoring the clinical need for biomarkers capable of early detection. To search for possible biological parameters for early diagnosis of CRC, we evaluated protein expression for three CREC (acronym: Cab45, reticulocalbin, ERC-55, calumenin) proteins: reticulocalbin, calumenin, and ERC-55, in a cellular model consisting of a normal-derived colon mucosa cell line, NCM460, and a primary adenocarcinoma cell line of the colon, SW480. Furthermore, this cellular model was analyzed by a top-down proteomic approach, 2-dimensional polyacrylamide gel electrophoresis (2D-PAGE) and liquid chromatography–tandem mass spectrometry (LC–MS/MS), for novel putative diagnostic markers by identification of proteins differentially expressed between the two cell lines. A different colorectal carcinoma cell line, HCT 116, was used in a bottom-up proteomic approach with label-free quantification (LFQ) LC–MS/MS. The two cellular models gave sets of putative diagnostic CRC biomarkers. Several of these novel putative markers, including reticulocalbin, calumenin, S100A6 and protein SET, were verified to show increased expression in CRC patient neoplastic tissue compared to a non-involved part of the colon. Characterization of these newly identified biological features of CRC may have diagnostic potential and therapeutic relevance in this malignancy, which is characterized by a still unmet clinical need.


Introduction
Colorectal cancer (CRC) is among the leading causes of death in Western countries, and considerable effort is spent on identifying diagnostic biomarkers. Mean overall survival is closely related to the stage of cancer at diagnosis.

Figure 1. Western blotting of three CREC family members. (A) NCM460 and SW480 cells were grown in separate rounds and harvested prior to total protein concentration determination, and equal amounts of protein were loaded on the gels. Pixel intensity was measured in each band and the mean value of NCM460 was used as a reference set to 1. p values were calculated with the Mann-Whitney U-test. Relative pixel intensities are depicted on a log2-transformed scale in the dot plots. (B) Colorectal cancer (CRC) patient tissue was collected during bowel resection for CRC. Proteins were solubilized by homogenization in lysis buffer prior to total protein concentration determination, and equal amounts of protein were loaded on the gels. Pixel intensity was measured in each band; the value of each non-involved (N) band was set to 1, and the peripheral (P) and central (C) samples were calculated relative to the non-involved tissue value in each patient. p values were calculated with the Wilcoxon signed-rank test. Relative pixel intensities are depicted on a log2-transformed scale in the dot plots. ** p < 0.01; *** p < 0.001.
Subsequently, we analyzed the protein expression of reticulocalbin and calumenin in biopsies from 10 patients diagnosed with CRC (Table 1). Since ERC-55 was present at similar levels in the two cell lines, we reserved the patient biopsies for other putative markers. From each patient, a sample was taken from the central part of the tumor (C), the peripheral part of the tumor (P), and a non-involved part of the colon (N). Reticulocalbin was highly expressed in the central part as well as in the peripheral part of the tumor (p < 0.01, p < 0.001) compared to the non-involved part of the colon. A similar pattern was observed for calumenin (p < 0.01, p < 0.01) (Figure 1B).
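The paired tissue quantification described in the Figure 1 caption (each tumor band expressed relative to the patient's own non-involved band, log2-transformed, then tested with a Wilcoxon signed-rank test) can be sketched as follows. This is a minimal illustration only: the pixel intensities are hypothetical, the function names are ours, and the exact test assumes no zero differences and no tied absolute values. It is not the software used in the study.

```python
from itertools import product
from math import log2


def log2_relative(tumor, non_involved):
    """log2 of each tumor band relative to the same patient's N band (N = 1)."""
    return [log2(t / n) for t, n in zip(tumor, non_involved)]


def wilcoxon_signed_rank_exact(diffs):
    """Exact two-sided Wilcoxon signed-rank test.

    Assumes no zero differences and no ties among |diffs|.
    Returns (W+, p) where W+ is the sum of ranks of positive differences.
    """
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0] * len(diffs)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = min(w_plus, total - w_plus)
    # Enumerate every possible assignment of signs to the ranks.
    extreme = 0
    n_assignments = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= observed:
            extreme += 1
        n_assignments += 1
    return w_plus, extreme / n_assignments


# Hypothetical pixel intensities for ten patients (not data from the study).
non_involved = [100, 120, 90, 110, 95, 105, 130, 85, 115, 100]
central = [150, 240, 108, 330, 171, 262.5, 143, 340, 184, 220]

log_ratios = log2_relative(central, non_involved)
w_plus, p_value = wilcoxon_signed_rank_exact(log_ratios)
print(w_plus, p_value)
```

Because every ratio is taken within a patient, between-patient differences in overall band intensity cancel out before the test is applied.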

Top-Down Proteomic Comparison of NCM460 and SW480 Cell Lines
We further analyzed the NCM460/SW480 cellular model system by performing a proteomic comparison using 2-dimensional polyacrylamide gel electrophoresis (2D-PAGE) in order to detect protein spots that were significantly and at least 2-fold differentially expressed between the two cell lines. This analysis revealed 73 spots changing in expression. All spots were excised for protein identification. Twenty-five spots (16 with protein identifications) were upregulated and 48 spots (22 with protein identifications) were downregulated (Figure 2) in the colon cancer cell line (SW480) compared to the normal-derived colon cell line (NCM460). In total, 38 spots were identified with at least one peptide by liquid chromatography-tandem mass spectrometry (LC-MS/MS). Details are given in Table 2.

Figure 2. Proteomic analysis of NCM460 and SW480 using a top-down proteomic strategy. A representative gel from each group is shown: (A) NCM460 and (B) SW480. Molecular mass is given in kDa. NCM460 and SW480 cells were grown in separate rounds and harvested prior to total protein concentration determination, and equal amounts of protein were loaded on the gels. Each group consisted of 6 samples. Comparative analyses of the spots were done with the PDQuest software. All differentially expressed spots with 2-fold or more increased/decreased expression and a significance level of p < 0.05 (Mann-Whitney U-test) are shown on the representative gels in the figure.

Bottom-up Proteomic Comparison of NCM460 and HCT 116 Cell Lines
The NCM460/HCT 116 cellular model system was analyzed by label-free quantification (LFQ) LC-MS/MS proteomics in order to detect proteins that were significantly and at least 2-fold differentially expressed in this cellular model system. We found 901 proteins to be differentially expressed, 465 upregulated and 436 downregulated (Figure 3). The differentially expressed proteins are listed in supplementary Table S1. Retinol-binding protein I was highly upregulated in HCT 116 and was below the detection limit in NCM460, in line with the 2D-PAGE results (Figure 2, spot 1007). None of the CREC proteins (reticulocalbin, calumenin, or ERC-55) was changed in this system. Protein SET and endoplasmin were not changed either. Only one of the S100 proteins, S100A10, was detected, and it was not changed.
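The selection rule used here (significant at p < 0.05 and at least 2-fold changed), together with the proteins reported in the Figure 3 caption as detected in only one of the two cell lines, can be illustrated with a small classifier. This is a sketch under stated assumptions: the function name, the use of group means, and the use of `None` to represent "below detection limit in all samples" are ours and are not taken from the authors' analysis pipeline.

```python
from math import log2


def classify_protein(mean_ncm460, mean_hct116, p_value,
                     min_fold=2.0, alpha=0.05):
    """Classify one protein in the NCM460 versus HCT 116 comparison.

    A mean of None stands for 'below detection limit in all samples'.
    Such proteins have no finite fold change and therefore cannot be
    placed on a volcano plot.
    """
    if mean_ncm460 is None and mean_hct116 is None:
        return "not detected"
    if mean_ncm460 is None:
        return "only in HCT 116"
    if mean_hct116 is None:
        return "only in NCM460"
    log_fold = log2(mean_hct116 / mean_ncm460)
    if p_value < alpha and abs(log_fold) >= log2(min_fold):
        return "up" if log_fold > 0 else "down"
    return "unchanged"


# Hypothetical LFQ intensities (not values from supplementary Table S1).
print(classify_protein(100.0, 400.0, 0.01))   # 4-fold higher in HCT 116
print(classify_protein(100.0, 150.0, 0.01))   # significant but below 2-fold
print(classify_protein(None, 50.0, None))     # detected in HCT 116 only

# Consistency of the reported counts.
assert 465 + 436 == 901
assert 777 + 57 + 67 == 901
```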

Figure 3. [...] Unchanged proteins are indicated with grey circles. In total, 901 proteins were significantly (p < 0.05) and at least 2-fold differentially expressed. Of these, 777 are indicated with black circles; 57 were detected in all HCT 116 samples and were below the detection limit in all NCM460 samples, while 67 proteins were detected in all NCM460 samples and were below the detection limit in all HCT 116 samples. These proteins could not be pictured in the volcano plot but are included in supplementary Table S1.

Immunologic Evaluation of Protein Expression
The expression patterns of a number of the identified differentially expressed proteins obtained by 2D-PAGE and by LFQ LC-MS/MS were further evaluated by 1D western blotting of the cell lines, NCM460 and SW480, as well as of tissue from CRC patients (Figure 4).

S100 Proteins
S100A4 (spot 4001) was strongly downregulated in the 2D-PAGE proteomic analysis (Figure 2), and this was confirmed by the western blot analysis, which also showed strong downregulation of the protein in the SW480 cancer cell line (Figure 4A). This antibody did not show a specific reaction in the patient tissue samples. However, an antibody against another member of the S100 family, S100A6, showed reasonable specificity in both systems. S100A6 was upregulated in the SW480 cancer cell line (p < 0.01). In the tissue analysis, we found the protein to be upregulated in the central part of the tumor compared to the non-involved part of the colon (p < 0.05) (Figure 4A,B). Only one S100 protein, S100A10, was detected with LFQ LC-MS/MS, and it was unchanged in HCT 116.

Figure 4. Western blotting of potential markers. (A) NCM460 and SW480 cells were grown in separate rounds and harvested prior to total protein concentration determination, and equal amounts of protein were loaded on the gels. Pixel intensity was measured in each band and the mean value of NCM460 was used as a reference set to 1. Total pixel intensity from the two bands for protein SET showed no significant difference (p = 0.2). However, quantifying the top band alone showed significant upregulation in NCM460 (p < 0.05), whereas the lower band showed no significant difference (p = 0.9). p values were calculated with the Mann-Whitney U-test. Relative pixel intensities are depicted on a log2-transformed scale in the dot plots. (B) CRC patient tissue was excised during resection for CRC. Proteins were solubilized by homogenization in lysis buffer prior to total protein concentration determination, and equal amounts of protein were loaded on the gels. Pixel intensity was measured in each band; the value of each non-involved (N) band was set to 1, and the peripheral (P) and central (C) samples were calculated relative to the non-involved value in each patient. p values were calculated with the Wilcoxon signed-rank test. Quantification of protein SET was done on the total expression (all bands together), on the top band alone, and on the band just below the top band alone, all of which gave the p values stated in the figure. To save space, the western blot of protein SET has been compressed in the y-direction, as can be inferred from the range of the molecular masses stated by the marker. A few bands recognized by the endoplasmin antibody migrated at approximately 40 and 25 kDa (not shown). Relative pixel intensities are depicted on a log2-transformed scale in the dot plots. * p < 0.05; ** p < 0.01; *** p < 0.001.

Retinol-Binding Protein I
The antibody against retinol-binding protein I (Figure 2, spot 1007) reacted strongly with the cancer cell line, revealing strong upregulation compared with the normal-derived colon cell line (p < 0.01), in agreement with the 2D-PAGE proteomic analysis (Figures 2 and 4A and Table 2). It was not technically possible to analyze the tumor tissue because of unspecific reactions of the antibody (not shown). The LFQ LC-MS/MS proteomic analysis also revealed strong upregulation of this protein in the HCT 116 cell line (supplementary Table S1).

Protein SET
Protein SET (Figure 2, spot 0304) was increased in the 2D-PAGE proteomic analysis of SW480. 1D western blotting of the cell lines revealed two closely migrating bands. The upper band analyzed alone showed a significant decrease in the SW480 cell line (p < 0.05), while the lower band did not show any significant difference (Figure 4A). When the expression pattern of the protein was analyzed in tumor tissue, an even more complex pattern was seen: the antibody reacted with several bands on the western blot, suggesting that the protein is fragmented in the non-involved part of the colon as well as in the tumor tissue (Figure 4B). We found that the central part as well as the peripheral part of the tumor contained significantly more protein/fragments than the non-involved colon (p < 0.001, p < 0.001). The LFQ LC-MS/MS proteomic analysis revealed an unchanged level in the HCT 116 cell line.

Endoplasmin
Endoplasmin (Figure 2, spot 1801), with a molecular mass of approximately 80 kDa, showed a significant decrease in the cancerous SW480 cell line in the 2D-PAGE proteomic analysis, and the 1D western blot of the cell lines confirmed the result (p < 0.01) (Figure 4A). The 2D-PAGE proteomic analysis also revealed a fragment of endoplasmin (Figure 2, spot 0607) with a molecular mass of approximately 55 kDa to be increased in the cancerous cell line. Analysis of the tumor tissue showed a major band of endoplasmin with a molecular mass of approximately 64 kDa, which was significantly increased in the peripheral part of the tumor (p < 0.01) but not in the central part, compared to the non-involved part of the colon. Additional minor bands with molecular masses around 40 kDa and 25 kDa were observed in the tissue western blot analyses (not shown). The LFQ LC-MS/MS proteomic analysis revealed an unchanged level in the HCT 116 cell line.

Bioinformatic Enrichment Analysis of SW480 versus NCM460
The combined set of perturbed proteins from 2D-PAGE and western blot analysis was used for bioinformatic analysis using STRING [28] as given in Figure 5. In order to focus on protein identifications with high reliability we restricted the analysis to proteins identified with at least two peptides. The perturbed proteins have significantly more interactions than expected (p < 1.84 × 10⁻¹¹). Thus, the analysis indicated that the proteins are at least partially biologically related. All the 20 proteins belong to the 'cytoplasmic part' of the Gene Ontology (GO) classification Cellular Component. With respect to Molecular Function, 19 belong to various types of 'binding' such as protein binding, small molecule binding, signaling receptor binding, enzyme binding, purine ribonucleotide binding, calcium ion binding, RNA binding and cytoskeletal binding, as indicated in Figure 5B. Figure 5C reveals the proteins found to be enriched in Reactome pathways. Eight were found to be part of 'signal transduction', eight of 'metabolism of proteins', and seven were part of the 'immune system'. Thus, proteins found to be perturbed in SW480 were mainly cytoplasmic proteins with various binding functions that participate in pathways of signal transduction, protein metabolism, and the immune system.

Discussion
A number of studies have been published on diagnostic/prognostic/predictive markers including hypothesis-generating proteomic analysis of CRC performed with patient samples, animal models and cell lines as reviewed by de Wit et al. [29]. In the present study, the sample material included the total protein content from cellular models established from a primary adenocarcinoma of the colon, SW480, a colonic carcinoma, HCT 116, and a normal derived colon mucosa cell line, NCM460. The two colorectal cancer cell lines show a number of differences [16]. SW480 from a Dukes' B stage tumor shows intermediate growth while HCT 116 from a Dukes' D stage tumor is fast growing. SW480 shows microsatellite stability while HCT 116 possesses microsatellite instability. SW480 and HCT 116 have different CpG island methylator phenotypes and different chromosomal instability status. Additionally, they show a number of differences in mutations of KRAS (G12V in SW480, G13D in HCT 116), PIK3CA (wt in SW480, H1047R in HCT 116) and TP53 (R273H and P309S in SW480, wt in HCT 116) [16]. In spite of these molecular differences as well as differences in specific proteins that are differentially expressed, the bioinformatic analyses showed a number of similarities with respect to enriched pathways among the two cell lines. 2D-PAGE and LFQ LC-MS/MS-based proteomic analyses have not previously been published with these cellular systems. The biological variation within a cell line is minimal as compared to patient samples. However, results from model systems show some differences in expression levels of proteins and must be further verified in patient samples to establish the clinical impact.

Model Systems Versus Patient Tissue
In the present analysis, we identified some differentially expressed proteins in one of the cell models that show similar expression patterns in the patient tissue (reticulocalbin (SW480), calumenin (SW480), S100A6 (SW480)). Other proteins showed a more complex pattern (protein SET, endoplasmin). We were unable to establish the expression pattern in the patient tissue for S100A4 and retinol binding protein I. The latter was highly upregulated in both cell lines, SW480 and HCT 116. The antibodies did not work on tissue, presumably because the protein composition may be different from the cellular model systems. Patient tumor tissue may also be heterogeneous and show varying degrees of central necrosis. Further a tumor may consist of different clones, which may have different protein composition. Therefore, we analyzed central as well as peripheral parts of the tumor. Thus, conclusions from the model systems cannot always be transferred to the clinical setting.

Proteomics Methodology
We have used 2D-PAGE, where quantification is performed directly on intact proteins (top-down proteomics), as well as LFQ LC-MS/MS, where quantification is performed on tryptically generated peptides (bottom-up proteomics). Proteoforms [30] of the proteins can be separated by 2D-PAGE if they vary in pI and/or molecular mass, which may reflect different posttranslational modifications and fragmentations. Since bottom-up proteomics is performed on peptides from digested proteins, this technique does not give detailed information about proteoforms of the proteins. A major advantage of the latter technique is that it can be performed on formalin-fixed paraffin-embedded (FFPE) CRC patient tissue [31]. FFPE in combination with 2D-PAGE is still insufficient [32]. Frozen tumor tissue, in contrast, can be used with 2D-PAGE.

Comparisons of 2-Dimensional Polyacrylamide Gel Electrophoresis (2D-PAGE) with 1D Western Blotting
When 2D-PAGE is compared with 1D western blotting, there are some issues to be considered. In order to obtain consistent results, antibody specificity is crucial. The specificity may depend on the protein composition in the system used, as the combination of proteins may be different from the cell lines to tissue samples. Here, we report results from the antibodies that worked either in the cell lines and/or in the tissue. Another problem is that proteins may consist of several proteoforms that may migrate to different isoelectric points with similar molecular masses using 2D-PAGE. Differential expression of each of these single isoelectric variants may be detected by proteomic analysis using 2D-PAGE while this may not be apparent when all isoelectric variants are analyzed together as a single band using 1D western blotting. As an example, triosephosphate isomerase migrates as at least three proteoforms that vary in isoelectric points but with similar molecular masses that are expressed to varying degrees in NCM460 and SW480 ( Figure 6A). However, the 1D western blot analysis revealed band intensities that were quite similar in the two cell lines ( Figure 6B).
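The masking effect described above can be made concrete with a small numerical illustration. The spot intensities below are hypothetical and chosen only to show the principle; they are not the triosephosphate isomerase values from Figure 6. Three isoelectric variants change in opposite directions between the cell lines, yet their sum, which is what a single 1D band reports, is almost unchanged.

```python
# Hypothetical 2D-PAGE spot intensities for three isoelectric variants of
# one protein (arbitrary units; not measured values from the study).
ncm460 = {"acidic": 300.0, "middle": 500.0, "basic": 200.0}
sw480 = {"acidic": 100.0, "middle": 500.0, "basic": 420.0}


def per_spot_fold(reference, sample):
    """Fold change of each proteoform spot (sample / reference)."""
    return {name: sample[name] / reference[name] for name in reference}


def band_fold(reference, sample):
    """Fold change of the summed intensity, as seen in a single 1D band."""
    return sum(sample.values()) / sum(reference.values())


spot_folds = per_spot_fold(ncm460, sw480)
total_fold = band_fold(ncm460, sw480)

for name, fold in spot_folds.items():
    print(f"{name}: {fold:.2f}-fold")
print(f"summed band: {total_fold:.2f}-fold")
```

In this example two of the three spots pass a 2-fold cut-off in opposite directions, while the summed band differs by only about 2%.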

CREC Proteins
We found two of the three CREC proteins analyzed, reticulocalbin and calumenin, to be increased in the tumor cell line SW480, and both were also expressed to a higher degree in tumor biopsies (peripheral and central part) compared to the non-involved part of the colon. They were not changed in the HCT 116 cell line. ERC-55, another CREC protein, was expressed at similar levels in the cell model systems of normal-derived cells, NCM460, and cancer cells, SW480 and HCT 116, and was not further examined. In the cellular model, the molecular mass of calumenin varies between the two cell lines (SW480 and NCM460), and for reticulocalbin more bands are observed in the tissue biopsies. Both of these expression patterns may be due to glycosylation, as previously shown for CREC proteins [33][34][35]. Thus, the differential expression and the differences in putative glycosylation states of reticulocalbin and calumenin imply some role of these proteins in CRC pathogenesis.

S100 Proteins
The S100 family consists of more than twenty members with a high degree of sequence and structural similarity. So far, almost all of the S100 family members have been implicated in cancer, with different expression patterns. In SW480, S100A4 was downregulated while S100A6 was upregulated. S100A6 was upregulated in the central part of the tumor compared to the non-involved part of the bowel. This pattern is consistent with previous studies [36][37][38][39]. The interpretation of high S100A4 expression is not unequivocal. Some studies have found a positive [40] whereas other studies show a negative [41,42] impact of high S100A6 expression in regard to CRC. The divergent results of S100A4 expression may be based on not taking the cellular localization into consideration. Previously, it has been shown that nuclear S100A4 from adenocarcinomas of the colon or rectum is a negative predictor of disease-free and overall survival, whereas cytoplasmic S100A4 was not associated with patient outcome [43]. In the HCT 116 cell line we only detected S100A10 with LFQ LC-MS/MS, and it was unchanged.

Retinol-Binding Protein I
Retinol-binding protein I (RBP) was highly upregulated in SW480, both in the 2D-PAGE proteomic analysis and by western blotting of the cell lines, and also in the HCT 116 cell line by LFQ LC-MS/MS. However, no specific antibody reaction was obtained on the patient tissue. Previous studies have shown that a low level of RBP is correlated with an increased risk of tumor recurrence [44]. Furthermore, Ali et al. have shown that upregulation of RBP might have a role in TGF-β/Smad4 signaling resulting in cancer progression [45]. Thus, the expression pattern of RBP is not fully established. Still, this protein may be an important player in progression and recurrence in CRC.

Protein SET
Protein SET was differentially expressed in the SW480 cellular model although not in the HCT 116 model. Western blotting of SW480 showed a two-band pattern, likely identifying the α- and β-isoforms (TAF-1α (290 amino acids) and TAF-1β (277 amino acids)). The two isoforms are 98.7% identical and differ only in the N-terminal 1-47 amino acids. The top band in SW480 showed significant downregulation. A complex pattern with several bands was found in CRC tissue. Quantification of all bands together showed a significant upregulation of protein SET in the peripheral and central part of the tumor compared to the non-involved part of the colon. Previously, protein SET was found upregulated in HCT 116 in relation to a putative effect of the COX-2 inhibitor celecoxib [46]. The specific role of protein SET in the pathogenesis of CRC awaits further investigation.

Endoplasmin
Endoplasmin was identified in two spots in the 2D-PAGE proteomic analysis of SW480. Spot 1801 (Figure 2, approximately 80 kDa) is downregulated in SW480, and spot 0607 (Figure 2, approximately 55 kDa) is upregulated. Both spots have lower molecular masses than the deduced mass of endoplasmin (92.4 kDa). Western blots showed a main band at approximately 80 kDa, most likely corresponding to spot 1801, that was downregulated in the SW480 cancer cell line, confirming the 2D-PAGE analysis. No pronounced bands with lower molecular masses were observed in the SW480 cellular model. In the HCT 116 cellular system endoplasmin showed no changes. CRC patient tissue showed a major band with a molecular mass of approximately 64 kDa, also lower than the deduced mass, with a slight but significant upregulation in the peripheral part of the tumor compared to the non-involved colon. Additional bands with even lower molecular masses (approximately 40 kDa and approximately 25 kDa) were also detected in the tumor tissue (not shown). Previously, two studies have shown endoplasmin expression to be correlated with CRC [47]. Thus, our results together with these studies suggest that the expression of endoplasmin is complex and may correlate with CRC.

Conclusions
The CREC proteins reticulocalbin and calumenin seem to have a role in CRC carcinogenesis, whereas ERC-55, another family member, shows no differential regulation, indicating a specific role of reticulocalbin and calumenin and not of the CREC family as such. Both reticulocalbin and calumenin were found with high expression in the tumor compared to the healthy part of the colon and must therefore be seen as potential markers of CRC. Furthermore, in this study, more putative markers have been identified, including the S100 proteins S100A4 and S100A6, RBP, protein SET, and endoplasmin, which are observed to be differentially regulated in one of the cellular model systems and/or in CRC patient tissue. Further studies must be performed to investigate the biological roles of the various identified putative markers of CRC, aiming at elucidating the clinical relevance of these new potential disease markers.

Confluent cells were washed three times in PBS-buffer without Ca²⁺ and Mg²⁺ and scraped off. Pellets were further washed with PBS-buffer and kept at −80 °C until use. Cell pellets were re-suspended in SDS sample buffer (LC2676, Invitrogen) for 1D western blotting, in 2D lysis buffer (9 M urea, 2% (vol/vol) Triton X-100, 2% (wt/vol) DTT, 2% (vol/vol) IPG-buffer (GE Healthcare, Buckinghamshire, UK)) for 2D-PAGE analysis, and in lysis buffer (5% (wt/vol) SDS, 50 mM TEAB, pH 7.55) for LFQ LC-MS/MS analysis. Protein concentration was determined using the Non-Interfering Protein Assay (488250, Calbiochem®, Merck KGaA, Darmstadt, Germany) or by infrared spectrometry (Direct Detect Spectrometer, Merck KGaA, Darmstadt, Germany) [48].

Tissue Biopsies from Patients with CRC
All subjects gave their informed consent for inclusion before they participated in the study. The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by The North Denmark Region Committee on Health Research Ethics (Project identification No. 2-16-4-0001-03) and the Danish data agency. This project was further approved (18 December 2017) by the National Committee on Health Research Ethics (Project identification No. 60819). Biopsies were obtained from ten patients with CRC (Table 1). From each patient, 3 samples were collected: one from the central part of the tumor, one from the peripheral part of the tumor, and one from the resection margin furthest from the tumor. The tissue samples were collected from the tumor and bowel just after the specimen was removed from the abdomen, rinsed in cold saline, frozen in liquid nitrogen, and stored at −80 °C. For protein extraction, the tissue was initially rinsed with PBS-buffer if excess blood was observed and then solubilized in IP 3-10 NL lysis buffer (9 M urea, 2% (vol/vol) Triton X-100, 2% (wt/vol) DTT, 2% (vol/vol) IPG-buffer (GE Healthcare)) using a homogenizer. Protein concentration was determined using the Non-Interfering Protein Assay (488250, Calbiochem®, Merck KGaA, Darmstadt, Germany).

Two-Dimensional Polyacrylamide Gel Electrophoresis (2D-PAGE)
2D-PAGE was done essentially as previously described [49]. Six replicates of each cell line were harvested and from each sample 75 µg of total protein was loaded on the gels. In short, first dimension isoelectric focusing (IEF) was performed using pH 3-10 nonlinear 18 cm strips (GE Healthcare). Separation in the second dimension was performed using polyacrylamide gels (12% T, 3% C). Gels were then silver stained and scanned on a GS-710 Imaging Densitometer (Bio-Rad, Hercules, CA, USA) using Quantity-One. The TIFF files were then imported into PDQuest software (BioRad, Hercules, CA, USA). Protein spots were initially automatically defined and quantified and then manually investigated for proper alignment between the gels. Differentially expressed spots were defined as spots that were 2-fold or more differentially expressed with a significance level at p < 0.05 (Mann-Whitney U-test).
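The spot selection rule stated above (at least 2-fold difference and p < 0.05 by Mann-Whitney U-test, six gels per cell line) can be sketched in a few lines. This is a minimal illustration and not the PDQuest implementation: the spot quantities are hypothetical, the fold change is assumed to be the ratio of group means, and the exact test below assumes no tied values.

```python
from itertools import combinations
from statistics import mean


def mann_whitney_exact(a, b):
    """Exact two-sided Mann-Whitney U-test p value (assumes no tied values)."""
    pooled = sorted(a + b)
    rank_of = {value: i + 1 for i, value in enumerate(pooled)}
    n_a, n_b = len(a), len(b)

    def u_from_rank_sum(rank_sum):
        return rank_sum - n_a * (n_a + 1) / 2

    u_observed = u_from_rank_sum(sum(rank_of[v] for v in a))
    u_expected = n_a * n_b / 2
    # Enumerate every way of assigning n_a of the pooled ranks to group a.
    extreme = 0
    total = 0
    for ranks in combinations(range(1, n_a + n_b + 1), n_a):
        u = u_from_rank_sum(sum(ranks))
        if abs(u - u_expected) >= abs(u_observed - u_expected):
            extreme += 1
        total += 1
    return extreme / total


def is_differential(normal, cancer, min_fold=2.0, alpha=0.05):
    """True if the spot is >= min_fold up or down and significant."""
    fold = mean(cancer) / mean(normal)
    p_value = mann_whitney_exact(normal, cancer)
    return (fold >= min_fold or fold <= 1 / min_fold) and p_value < alpha


# Hypothetical normalized spot quantities from six gels per cell line.
ncm460_spot = [10, 11, 12, 13, 14, 15]
sw480_spot = [30, 31, 33, 35, 36, 40]
print(mann_whitney_exact(ncm460_spot, sw480_spot))
print(is_differential(ncm460_spot, sw480_spot))
```

With six samples per group and complete separation, the smallest attainable two-sided exact p value is 2/924, so both criteria have to be met for a spot to be selected.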

Liquid Chromatography-Tandem Mass Spectrometry (LC-MS/MS) for Protein Identification
Differentially expressed spots were excised from either the rehydrated silver stained gels or for a few spots from Coomassie blue stained gels. The excised spots were subjected to in-gel tryptic digestion and peptides were identified by liquid chromatography-tandem mass spectrometry (LC-MS/MS) with a Q-Tof Premier mass spectrometer (Waters, Milford, MA, USA). Proteins were identified by searching in the SwissProt protein database (various releases from 52.5-57.11) using the online version of the Mascot MS/MS Ions Search facility (Matrix Science, Ltd., London, UK) [50]. The parameters used for searching were doubly and triply charged ions with up to two missed cleavages, a peptide tolerance between 20 and 100 ppm or 1 Da, one variable modification, Carbamidomethyl-C (occasionally together with oxidation of Met) and a MS/MS tolerance of 0.02 or 0.05. Contaminating peptides such as keratins and trypsin were disregarded. Individual ions scores above approximately 34-37 indicated identity or extensive homology giving a less than 5% probability that the observed match was a random event.
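The peptide tolerance quoted above is given in parts per million (ppm). As a reminder of what that setting means, the following sketch converts an observed versus theoretical m/z difference to ppm and checks it against a tolerance. The function names and the example m/z values are ours and purely illustrative; this is not part of the Mascot search engine.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Mass error in parts per million relative to the theoretical value."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tolerance_ppm):
    """True if the absolute mass error does not exceed the tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tolerance_ppm


# Hypothetical precursor ions around m/z 500.
print(round(ppm_error(500.01, 500.0), 3))          # 0.01 / 500 is 20 ppm
print(within_tolerance(500.01, 500.0, 100))        # inside a 100 ppm window
print(within_tolerance(500.06, 500.0, 100))        # 120 ppm, outside
```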

Label-Free Quantification Liquid Chromatography-Tandem Mass Spectrometry (LFQ LC-MS/MS)
Proteomic comparison of the cell lines NCM460 and HCT 116 was performed with LFQ LC-MS/MS using an LC-MS instrument platform from Thermo Fisher Scientific (Waltham, MA, USA). This was performed essentially as previously described [51] with slight modifications. Protein preparation and tryptic digestion for each cell line were performed using the S-Trap Micro Spin Column procedure (Protifi, Farmingdale, NY, USA) as described by the manufacturer. In short, from each of the cell lines six 20 µg protein preparations were alkylated with TCEP and digested with trypsin. Peptide concentration was measured by fluorescence as described [48]. Samples were dried by vacuum centrifugation and dissolved in 0.1% formic acid. Peptides were separated by nano liquid chromatography on an Ultimate 3000 system (Thermo Fisher Scientific, Waltham, MA, USA). Peptides were trapped with a µ-Precolumn (300 µm × 5 mm, C18 PepMap100, 5 µm, 100 Å; Thermo Fisher Scientific, Waltham, MA, USA) and separated on an analytical column (EASY-Spray Column, 500 mm × 75 µm, PepMap RSLC, C18, 2 µm, 100 Å, Thermo Scientific) coupled to an Orbitrap Fusion Tribrid mass spectrometer (Thermo Scientific). Peptides were eluted at a flow rate of 300 nL/min using a gradient formed by mixing buffer A (99.9% (vol/vol) water, 0.1% (vol/vol) formic acid) with buffer B (99.9% (vol/vol) acetonitrile, 0.1% (vol/vol) formic acid). One µg of peptide was injected from each preparation in duplicate. The universal method setting was used with full Orbitrap scans (m/z 375-1500) at a resolution of 120,000 with an automatic gain control (AGC) target of 4 × 10⁵ and a maximum injection time of 50 ms. The most intense precursors were selected with an intensity threshold of 5 × 10³. Charge states 2-7 were included. MS2 scans were performed in the linear ion trap at rapid scan rate with a collision-induced dissociation energy of 35%, an AGC target of 2 × 10³ and a maximum injection time of 300 ms.
Precursor ions were isolated using the quadrupole with an isolation window of 1.6 m/z. Cycle time was 3 s and dynamic exclusion was set to 60 s. Raw data files were searched against the reviewed human database from UniProt (downloaded on 9 February 2020) using MaxQuant version 1.6.6.0 for LFQ analysis [52]. Carbamidomethyl (C) was used as a fixed modification. The false discovery rates for PSM, protein, and site identification were each set at 1%. The LFQ minimum ratio count was set to 1. MS/MS was required for LFQ comparisons. Unique and razor peptides, unmodified and modified with oxidation (M) or acetyl (protein N-terminal), were used for protein quantification. Contaminant sequences were included in the search and revert sequences were used for the decoy search. The generated results file was further analyzed with Perseus version 1.6.6.0 [53]. Proteins identified in all samples in both groups were included, together with proteins that were identified in all samples in one group but were below the detection limit in all samples in the other group, since the latter are of high interest as putative biomarkers. Differentially expressed proteins were defined as proteins that differed 2-fold or more between the groups at a significance level of p < 0.05 (t-test).
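
The inclusion rule described above can be summarized in a minimal sketch. It is not the Perseus workflow itself. The function names and category labels are hypothetical, missing LFQ intensities are assumed to be represented as 0 or None, and the fold change is assumed to be calculated on log2-transformed intensities, which the text does not state. The t-test is omitted to keep the example free of external dependencies.

```python
import math


def _detected(value):
    # Assumption: a missing LFQ intensity is represented as 0 or None.
    return value is not None and value > 0


def classify_protein(group_a, group_b):
    """Return the inclusion category of one protein across two sample groups."""
    a = [_detected(v) for v in group_a]
    b = [_detected(v) for v in group_b]
    if all(a) and all(b):
        return "both"      # quantified in every sample of both groups
    if all(a) and not any(b):
        return "only_a"    # present in all of A, undetected in all of B
    if all(b) and not any(a):
        return "only_b"    # present in all of B, undetected in all of A
    return "excluded"      # partial detection in at least one group


def log2_fold_change(group_a, group_b):
    """Difference of mean log2 intensities (B minus A); needs full detection."""
    mean_a = sum(math.log2(v) for v in group_a) / len(group_a)
    mean_b = sum(math.log2(v) for v in group_b) / len(group_b)
    return mean_b - mean_a


# Hypothetical intensities: a 2-fold cutoff corresponds to |log2 FC| >= 1.
a = [100.0, 120.0, 90.0]
b = [400.0, 450.0, 380.0]
if classify_protein(a, b) == "both":
    print(abs(log2_fold_change(a, b)) >= 1)
```

Proteins in the "only_a" or "only_b" categories have no finite fold change and would be reported separately from the proteins tested for a 2-fold difference.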

Western Blotting
Western blotting was done essentially as previously described [49]. In short, 30 µg of total protein from cell lines or 10 µg of total protein from tissue was loaded in each lane, separated by electrophoresis and blotted onto nitrocellulose Hybond-C Extra membranes. We used the total protein level for normalization of the blots and abstained from using housekeeping proteins. This was based on our previous experience that GAPDH and actin are differentially expressed in lymphoma tumor biopsies [49]. Furthermore, Hu and coworkers [54] have shown that seven commonly used housekeeping proteins change expression level in CRC in particular. Recently, Xu and coworkers [55] showed that 17 out of 21 classical reference genes were upregulated in CRC at the transcript level compared to normal colonic epithelial tissue. Even among 42 novel potential reference genes, some transcripts changed expression level in a subset of the tumors. Thus, the ideal reference gene or protein is still missing, and we found it more reliable to normalize to the total protein as recently suggested [54]. All primary antibodies were incubated overnight at 4 °C or for one hour at room temperature. The blots were then incubated with horseradish peroxidase (HRP)-conjugated secondary antibodies and developed using the enhanced chemiluminescence system. Digital images were taken using Fujifilm LAS-4000 or LAS-4010 and the corresponding ImageQuant LAS 4000 software (Fuji, Tokyo, Japan). The images were analyzed by ImageQuant TL, version 7.0. Pixel intensity for each band was measured and normalized to the background density. The intensities were quantified and

study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.