Lipidomic Signatures for Colorectal Cancer Diagnosis and Progression Using UPLC-QTOF-ESI+MS

Metabolomics coupled with bioinformatics may identify relevant biomolecules, such as putative biomarkers of specific metabolic pathways related to colorectal cancer diagnosis, classification and prognosis. This study performed an integrated metabolomic profiling of blood serum from 25 colorectal cancer (CRC) cases previously classified (Stage I to IV) compared with 16 controls (disease-free, non-CRC patients), using ultra-high-performance liquid chromatography coupled with mass spectrometry (UPLC-QTOF-ESI+ MS). More than 400 metabolites were separated and identified, and all data were then processed with the Metaboanalyst 5.0 online software, using multi- and univariate analysis, including specificity/sensitivity relationships (area under the curve (AUC) values), enrichment and pathway analysis, to identify the specific pathways affected by cancer progression in the different stages. Several sub-classes of lipids, including glycerophospholipids (phosphatidylcholines (PCs), phosphatidylethanolamines (PEs) and phosphatidic acids (PAs)), fatty acids and sterol esters, as well as ceramides, confirmed the “lipogenic phenotype” specific to CRC development, namely the upregulated lipogenesis associated with tumor progression. Both multivariate and univariate bioinformatics confirmed the relevance of some putative lipid biomarkers associated with the altered metabolic pathways in colorectal cancer.


Introduction
Colorectal cancer (CRC) is an important public health issue and ranks among the three leading causes of cancer-related mortality in both men and women, according to recent cancer statistics [1][2][3][4], particularly in Western countries but also in developing countries. It is strongly related to lifestyle, stress, diet and habits.
Early detection and endoscopic resection of adenomatous polyps (premalignant conditions) and screening colonoscopy significantly improve the survival rate, colonoscopy being considered the gold standard for the detection of colorectal neoplasms, alongside sigmoidoscopy, colon capsule endoscopy and magnetic resonance colonography. Biopsy specimens of colorectal mucosa and colonic lesions are also useful for diagnosis; however, all these techniques are invasive. For this reason, there is growing interest in non-invasive techniques with good predictive value and high sensitivity, such as metabolomics. As reviewed recently, the management of colorectal cancer has changed radically with the use of omics technologies for finding diagnosis, stratification and prognosis biomarkers, as well as for treatment monitoring [5].
Metabolomics-based procedures using biofluids (especially blood serum or plasma) provide a systematic screening or fingerprinting of small metabolites (below 2000 Da) related to the metabolic signature and pathway alterations in different stages of CRC. Metabolomics and metabonomics offer a qualitative untargeted signature (fingerprint) or a targeted methodology with quantitative evaluation of putative biomarkers [6][7][8][9][10][11].
Metabolomic investigations mainly rely on gas chromatography or high-performance liquid chromatography coupled with mass spectrometry (GC-MS, HPLC-MS) and on nuclear magnetic resonance (NMR). By far the most widely applied technique is LC-MS, as it has superior detection and identification capability [12][13][14][15]. Recently, the serum fatty acid profiling of colorectal cancer was reported, using either gas chromatography-mass spectrometry [16,17], NMR [18] or Fourier transform ion cyclotron resonance mass spectrometry [19].
Different metabolic alterations are associated with colorectal cancer (CRC), since cancer cells are able to generate energy even in a nutrient-deficient environment and prefer glycolysis over oxidative phosphorylation, as has been demonstrated for years (the Warburg effect). However, this paradigm has recently shifted towards a "reversed Warburg effect": some cancer cell types, including CRC cells, may synthesize ATP by mitochondrial phosphorylation [20], undergoing metabolic remodeling and alterations of mitochondrial respiration [21]. This opens new research directions for the identification of molecular therapeutic targets, such as fatty acid (FA) synthesis and oxidation.
The metabolization of exogenous glutamine represents another dependence of cancer cells, with many oncogenic mutations affecting glutamine metabolism [22]. Meanwhile, alterations of lipid metabolism in CRC lead to structural changes in cell membranes and disruption of energy homeostasis, cell signaling, gene expression and protein distribution, affecting a number of cell functions, such as proliferation, differentiation, apoptosis, autophagy, necrosis, and drug and chemotherapy resistance [23].
The growing interest related to the role of lipids and their metabolism in cancer development has been presented in previous reviews [1,24,25]. The lipid metabolic pathways are affected in CRC cells and include FA synthesis, desaturation, elongation and mitochondrial oxidation. A plasma lipidomic signature reveals perturbed lipid metabolic pathways and potential lipid biomarkers of human colorectal cancer [26]. The complex lipid metabolic changes may be explained by the high proliferation rate of the CRC cells, with high energetic needs and changes in the serum levels of phospholipid components derived from cell membrane degradation, accompanied by inflammation and changes in the arachidonic acid metabolites in serum or tissue. Recently, an integrated multi-omics approach and lipidomic-based characterization of the lipid metabolism in colorectal cancer were reported [27][28][29][30].
The increased levels of choline-related metabolites in tumors are probably the result of the accelerated lipid membrane metabolism involved in ATP generation, due to rapid cell proliferation. Glucose changes were consecutive to glycolysis and upregulated in CRC, while increases in 3-hydroxybutyrate, an end metabolite of fatty acids, suggested the upregulation of fatty acid β-oxidation, needed as energy support for cancer cell proliferation [31]. Increased oxidative stress is usually associated with increased oxidation of fatty acids, which may result in an accumulation of 3-hydroxybutyrate [32]. The predictive value of lipid biomarkers is very important, the stage at diagnosis being considered the main predictive factor [4], which explains the importance of CRC screening and early diagnosis. Different prediction models for CRC patients compared with controls included metabolites such as 2-hydroxybutyrate, aspartic acid, kynurenine and cysteamine [33], or pyruvic, fumaric and glycolic acids; palmitoleic acid; ornithine; lysine; tryptophan; and 3-hydroxyisovaleric acid [17]. When serum fingerprints from CRC patients were compared prior to surgery and one month after, the potential biomarkers belonged to lipid classes (phosphatidylcholines (PCs), lysophosphatidylcholines (LPCs) and diacylglycerols (DGs)), without a significant difference between the pre-operative and post-operative status [15]. For the assessment of the CRC recurrence rate and survival of the patients after surgical intervention or chemotherapy, 15 metabolites, including four lipids (glycerol, myristate, palmitoleate and 2-aminobutyrate), were selected as potential biomarkers [32].
Recently, a relevant serum MS study of lipophilic metabolites, performed under the European Prospective Investigation into Cancer and Nutrition (EPIC), reported nine metabolites related to CRC etiology, which were recommended for further prospective CRC studies [34]. It was concluded that changes in plasma lipid composition preceded the appearance of neoplasia and that tumor changes can induce a global change in LPC metabolism [35]. Another prospective study found 35 metabolites associated with CRC risk, including 12 glycerophospholipids with an important role in the risk of developing colorectal cancer [36].
Our previous reports focused on the selection of some lipidomic biomarkers to diagnose CRC, based on literature surveys [37,38]. In this context, this experimental study aimed to identify specific blood serum biomarkers from patients diagnosed with CRC in four progression stages. The UPLC-QTOF-ESI+ MS results, combined with a succession of multivariate and univariate statistical models, including ANOVA, partial least squares discriminant analysis (PLSDA), cluster analysis, random forest and pathway analysis, showed the predictive value of specific biomolecules to be considered as putative CRC biomarkers.

Patients and Compliance with Ethical Standards
The protocol of this study, including the collection of details about the samples and the individuals and the written consent obtained from all subjects before they entered the study, was approved by the Ethics Committee of the Cluj-Napoca "Iuliu Hatieganu" University of Medicine and Pharmacy. The CRC patient group (25 patients) included 16 men and 9 women operated on in the Surgery Department, Regional Institute of Gastroenterology and Hepatology "Octavian Fodor" Cluj-Napoca, Romania, between May and December 2018, with confirmed CRC, either before surgery (suspicion through colonoscopy) or post-surgery (having preoperative radiological suspicion of CRC). The clinical and pathological features, as well as the stage of the CRC tumor, were established according to histological evaluation and pTNM classification, as presented in Table 1. The control group included 7 males (53.1 ± 6.7 years) and 4 females (56.33 ± 8.98 years) considered to be CRC-free, with a negative colonoscopy for cancer or adenoma during the last 12 months. The controls were operated on in the same hospital for other benign diseases (inguinal or incisional hernia, benign skin lesions, cholelithiasis).
Co-morbidities like morbid obesity, insulin-dependent diabetes mellitus, liver cirrhosis, adenomatous polyps and a history of inflammatory bowel disease were excluded in both groups. The blood samples were collected similarly from all participants, in hospital, prior to surgery, in the morning, after a minimum of 12 h fasting.
In parallel, data about other clinical and preclinical parameters of the patients and controls were registered but not included in this report.

Blood Collection and Processing
Blood serum samples were collected according to standardized procedures in accordance with the ethical standards of the institutional and national research ethical committee and with the 1964 Helsinki Declaration and its later amendments for ethical standards.
The blood was collected in vacutainer tubes without anticoagulant, kept at room temperature for 30 min to allow clotting and centrifuged for 10 min at 3000 rpm (4 °C) to separate clear serum. After separation, the blood serum was stored at −80 °C. To a volume of 0.2 mL serum, 0.8 mL of a mixture of methanol and acetonitrile (1:1) was added to precipitate proteins. The mixture was vortexed for 1 min, kept at −20 °C overnight and then vortexed again for 1 min. After mixing, the vials were centrifuged at 12,500 rpm for 10 min and the supernatant was collected and filtered through PTFE filters of 0.25 µm.

HPLC-ESI(+)-QTOF-MS Analysis of Blood Serum
Aliquots of 3 µL of serum were subjected to ultrahigh-pressure chromatography on a Thermo Scientific HPLC UltiMate 3000 (Waltham, MA, USA) system equipped with a quaternary pump system Dionex UltiMate 3000 (UHPLC) (Thermo Fisher, Waltham, MA, USA), a Dionex UltiMate 3000 photodiode array detector, a column oven and autosampler. Serum metabolites were separated using a Thermo Scientific C18 reverse-phase column (Acquity, UPLC C18 BEH, Waters Corporation, Milford, MA, USA) (5 µm, 2.1 × 75 mm) at 25 °C and a flow rate of 0.3 mL/min. The mobile phase was a gradient of Eluent A (water containing 0.1% formic acid) and Eluent B (methanol:acetonitrile, 1:1, containing 0.1% formic acid). The gradient system consisted of 99% A (Minute 0), 70% A (Minute 1), 40% A (Minute 2), 20% A (Minute 6) and 100% B (Minute 9-10), followed by 5 min with 99% A. The total running time was 15 min. The mass spectrometry was performed on a Bruker Daltonics MaXis Impact QTOF (Bremen, Germany) instrument, operating in positive ion mode (ESI+). The mass range was set between 50 and 1000 m/z. For measurements, the nebulizing gas pressure was set at 2.8 bar, the drying gas flow at 12 L/min and the drying gas temperature at 300 °C. Before each chromatographic run, a calibrant solution of sodium formate was injected. The control of the instrument, the acquisition and data processing were done using Chromeleon, TofControl 3.2, Hystar 3.2 and Data Analysis 4.2 (Bruker Daltonics, Bremen, Germany).
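The gradient program described above can be written down as a small lookup function, which is convenient when relating a retention time to the eluent composition. This is only a sketch: the breakpoints are taken from the text, but linear interpolation between them and an immediate return to 99% A after Minute 10 are assumptions, and the names GRADIENT and percent_a are illustrative.

```python
from bisect import bisect_right

# (time in min, % Eluent A) breakpoints as listed in the text.
# ASSUMPTION: the composition changes linearly between breakpoints.
GRADIENT = [(0.0, 99.0), (1.0, 70.0), (2.0, 40.0), (6.0, 20.0), (9.0, 0.0), (10.0, 0.0)]
RUN_TIME_MIN = 15.0  # total running time stated in the text


def percent_a(t_min: float) -> float:
    """Return the assumed percentage of Eluent A at time t_min (minutes)."""
    if not 0.0 <= t_min <= RUN_TIME_MIN:
        raise ValueError("time outside the 15 min run")
    if t_min > GRADIENT[-1][0]:
        # ASSUMPTION: re-equilibration at 99% A starts right after Minute 10.
        return 99.0
    times = [t for t, _ in GRADIENT]
    i = bisect_right(times, t_min) - 1
    if i == len(GRADIENT) - 1:
        return GRADIENT[-1][1]
    (t0, a0), (t1, a1) = GRADIENT[i], GRADIENT[i + 1]
    return a0 + (a1 - a0) * (t_min - t0) / (t1 - t0)


def percent_b(t_min: float) -> float:
    """Eluent B is the remainder of the binary mixture."""
    return 100.0 - percent_a(t_min)
```

Under these assumptions, a compound eluting at 4 min would leave the column at roughly 30% A and 70% B.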

Statistical Analysis
The Bruker software attached to the instrument, Data Analysis 4.2, was used to process the acquired data. First, base peak chromatograms were obtained from the total ion chromatogram using specific algorithms, and the Find Molecular Features (FMF) algorithm generated an advanced bucket matrix. The matrix released by Data Analysis contained the retention time, the peak areas and intensities and the signal/noise (S/N) ratio for each component, together with its m/z value. Generally, the number of separated compounds ranged between 600 and 800.
In this first step, a matrix for all samples was obtained and stored in an Excel file. In order to eliminate the small signals with S/N values under 10, an initial filtration (1) was made; a second matrix containing m/z values and peak intensities was then saved and filtered in a second step that eliminated the small intensities (2). Generally, 180-220 peaks remained. Only metabolites detected in more than 80% of the samples were included in the statistical analysis; to align the peak m/z values adequately, the online software from bioinformatica.isa.cnr.it/NEAPOLIS was applied. The aligned matrix (3) allowed the calculation of mean intensity values and standard deviations for the control group and for subgroups corresponding to CRC Stages I-IV. The aligned matrix was then converted to a csv file and introduced in the specialized online software Metaboanalyst 5.0.
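The two filtering rules stated above (S/N of at least 10, detection in more than 80% of samples) can be sketched as follows. This is a minimal illustration, not the authors' workflow, which used Excel and the NEAPOLIS alignment tool; the data layout, the function names and the m/z values in the usage note are hypothetical.

```python
def filter_by_snr(peaks, min_snr=10.0):
    """Drop peaks whose signal/noise ratio is under min_snr.

    peaks: list of dicts with keys 'mz', 'intensity' and 'snr'
    (a hypothetical layout for one sample's peak list).
    """
    return [p for p in peaks if p["snr"] >= min_snr]


def filter_by_presence(matrix, min_fraction=0.8):
    """Keep features detected in MORE than min_fraction of the samples.

    matrix: dict mapping an aligned m/z value to a list of intensities,
    one per sample; None or 0 means 'not detected' (an assumption).
    """
    kept = {}
    for mz, values in matrix.items():
        detected = sum(1 for v in values if v)
        if detected / len(values) > min_fraction:
            kept[mz] = values
    return kept
```

For example, with five samples a feature detected in all five is kept, while a feature detected in four of five (exactly 80%) is dropped, because the text requires more than 80%.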
After successive alignment and normalization of the matrix data, the multivariate analysis consisted of the representation of fold change, volcano plot, principal component analysis (PCA), partial least squares discriminant analysis (PLSDA) and random forest, finding correlations between samples and between variables (m/z values), as well as building the heatmap which represents the correlation between variables and samples. Finally, using the biomarker analysis, the receiver operating characteristic (ROC) curves and the values of the areas under the ROC curves (AUCs) were obtained, and the molecules identified were ranked according to their sensitivity/specificity. The enrichment analysis and the MS to pathways algorithm allowed the identification of specific alterations of metabolic pathways induced in CRC.
The identification of molecules which can be considered potential biomarkers was made using the two most relevant databases, LIPID MAPS Lipidomics Gateway and the Human Metabolome Database.

PCA and PLSDA Analysis
By unsupervised PCA, the co-variance for the first five components was evaluated. The explained variance in serum groups (CRC and C) was 25.6% (PC1) and 13.3% (PC2), for a total explained variance of 38.9% (Figure 1a). The discrimination between CRC and C groups was better represented by PLSDA (covariance of 35.1%) (Figure 1b). According to the PCA and PLSDA plots, the C group was less homogeneous than the CRC group, two or three subgroups being visible in this group. This can be explained by the diverse co-morbidities of the patients from this non-CRC group.
The cross-validation algorithm showed a high accuracy (close to 1), high R2 and significantly high Q2 values, its performance increasing from Component 1 to 3, up to 0.7 (Figure S1a, Supplementary Materials). These data indicated very good validation and predictability for this model.
Figure 2 shows the hierarchical clustering of the samples, displayed as a tree diagram (dendrogram). The dendrogram was built with the Euclidean distance measure and the Ward clustering algorithm, one of the options available in Metaboanalyst 5.0. On the distance scale from 1 to 60, good similarities (distances of <35) can be seen between the individual samples from the CRC group. The pathologic CRC group (marked in green) showed a good similarity, while the C group (marked in red) was split into two subgroups.
The correlation maps (Figure S2a,b) show positive (red) or negative (blue) correlations as determined by the t-test. One can discriminate different patterns for samples correlated with variables (blue zones differentiated from red zones). In the heatmap (Figure S2c), the red spots show the molecules which have increased levels in certain samples, while blue spots reflect decreases in certain molecules for specific samples.

The Random Forest Algorithm and Its Predictive Value
This algorithm was able to indicate the predictive value (as potential biomarkers) of some molecules which differentiated the CRC and C groups. Table 2 presents the m/z values of the first 30 molecules considered predictive by the random forest algorithm, ranked by their mean decrease accuracy (MDA). MDA values from 0.012 to 0.002 were considered, and the decrease (D) or increase (I) in the level of these molecules in the CRC vs. C groups is indicated.
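The idea behind the MDA ranking can be illustrated with a permutation test: shuffle one variable at a time and record how much the classification accuracy drops. The sketch below is not the Metaboanalyst implementation. It swaps the random forest for a simple nearest-centroid classifier and scores on the training data rather than on out-of-bag samples, purely to keep the example short; all function names are hypothetical.

```python
import random


def nearest_centroid_predict(train_X, train_y, X):
    """Stand-in classifier (NOT a random forest): assign each row of X
    to the class whose mean vector in the training data is closest."""
    classes = sorted(set(train_y))
    centroids = {}
    for c in classes:
        rows = [x for x, y in zip(train_X, train_y) if y == c]
        centroids[c] = [sum(col) / len(rows) for col in zip(*rows)]

    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return [min(classes, key=lambda c: dist2(x, centroids[c])) for x in X]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_decrease_accuracy(X, y, n_repeats=20, seed=0):
    """Permutation-based mean decrease in accuracy, one value per column.
    X is a list of rows (lists of floats), y a list of class labels."""
    rng = random.Random(seed)
    base = accuracy(y, nearest_centroid_predict(X, y, X))
    mda = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)  # break the link between variable j and the labels
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            permuted = accuracy(y, nearest_centroid_predict(X, y, X_perm))
            drops.append(base - permuted)
        mda.append(sum(drops) / n_repeats)
    return mda
```

A variable whose shuffling leaves the accuracy unchanged gets an MDA of zero, while a variable that separates the CRC and C groups gets a positive value; ranking by this value gives a list analogous in spirit to Table 2.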

Biomarker Analysis
According to the Metaboanalyst software, biomarker analysis includes the receiver operating characteristic (ROC) curve as a useful tool to evaluate diagnostic accuracy. Many biomarker combination methods rely on maximization of the area under the ROC curve (AUC). This parameter allowed the evaluation of the sensitivity versus specificity of each molecule to be considered a relevant biomarker. An AUC value closer to 1 for a given molecule indicates a higher predictive power as a biomarker. Table 3 shows the m/z value and putative identification, AUC value, p-values and log2FC values for each molecule identified, as well as its variation in the CRC group vs. the C group.
Significantly high AUC values above 0.750 showed that 25 molecules might be considered as putative biomarkers; these molecules belong to different lipid classes.
These data confirm that lipid molecules, mainly choline-dependent phospholipids, ceramides and different esters (of fatty acids or cholesterol), can be considered as predictive molecules with high predictive value for CRC diagnosis.
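For a single molecule, the AUC discussed above equals the probability that a randomly chosen CRC sample has a higher intensity than a randomly chosen control sample, with ties counted as one half. The following sketch computes it directly from that definition; it is an illustration of the quantity, not the Metaboanalyst code, and the function name is hypothetical.

```python
def roc_auc(case_values, control_values):
    """AUC for one molecule from pairwise comparisons (Mann-Whitney form).

    Returns the fraction of (case, control) pairs in which the case value
    is higher, counting ties as 0.5.
    """
    wins = 0.0
    for c in case_values:
        for k in control_values:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))
```

A value of 1 means the molecule separates the two groups perfectly, and 0.5 means no separation. For a molecule that decreases in CRC, this function returns a value below 0.5, so 1 minus the result would be the figure comparable with the 0.750 threshold used in the text.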

One-Way ANOVA to Identify Biomarkers for CRC Progression (Stages I to IV)
Figure 3a,b presents the PCA and PLSDA plots obtained after applying the one-way ANOVA univariate analysis included in the Metaboanalyst software. The PCA plot showed a covariance of 38.9%, while PLSDA showed a covariance of 28% that was able to discriminate between the C and CRCI-IV groups. In the PCA plot, the subgroups CRCIV and CRCII are well discriminated as well, in the opposite direction compared with the C group and the CRC I and III subgroups. Interestingly, in the PLSDA plot, the CRCIII subgroup had a significant difference from the other subgroups. Finally, considering the inputs from these plots, we consider that CRCIII and CRCIV are the CRC stages that show significant differences to be considered in chemometric evaluations. The cross-validation graphic (Figure S1b) shows the high accuracy and significance of the PLSDA model: an accuracy value close to 1 indicates a very good description of the data by the model, whereas the high R2 and Q2 values confirm the model's performance, increasing from Component 1 to 5, up to 0.7.
The dendrogram (Figure 3c) shows the clustering of subgroups CRC I-IV and Figure 3d shows the MDA values of the model, as determined by the random forest algorithm, including the first 15 predictive molecules. As can be seen in Figure 3a,b, a clear delimitation of Stages III and IV was observed in both the PCA and PLSDA score plots. The loading analysis and MDA values showed which molecules are considered to be responsible for this strong discrimination. Table 4 includes the first 30 molecules and their MDA values (up to 0.002). The increase (I) or decrease (D) of these molecules in the CRC vs. C group is also included.

Statistical Analysis Based on MS Peak Intensity Values for CRC Subgroups
In order to compare the data obtained by multivariate analysis, we considered the initial matrices (peak intensity tables), taking the mean values for the CRCI-IV and C groups, and calculated the statistical differences between these groups (values from p < 0.1 to p < 0.01 indicate significance). From a total of 93 biomolecules (included in Table S1), those showing significant deviations in the ratios between group means (p < 0.01) were selected. The data released from LC-MS analysis were presented in a matrix representing the average values of MS peak intensity for each molecule separated and selected according to the protocol presented above (n = 45). Table S1 includes the list of common molecules identified in blood serum from C and CRC groups. The ratios between the mean values of CRCIV and C, CRCIV and I, CRCIV and III, and CRCIII and C are presented in Table 5, as well as the tentative identification of the molecules and their codification in the PubChem database.
Table 5. M/z values and tentative identification of molecules which show different ratios between the mean values of CRCIV and C, CRCIV and I, CRCIV and III, and CRCIII and C. The codification in the PubChem database is included for each molecule. Significant increases (p < 0.01) in these ratios are marked with * symbol.
These data suggest general increases in different lipid subclasses such as choline-dependent glycerophospholipids, cholesterol and fatty acid esters, as well as ceramides, especially in Stages III and IV compared with controls. Stearic and palmitic acids are mostly involved in such esters. These data are in good agreement with the multivariate analysis.
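The four ratios reported in Table 5 are ratios of group means, which can be computed as sketched below for one molecule. The group labels follow the text, while the function names and the intensities in the usage note are made up for illustration; the significance testing (p < 0.01) is not reproduced here.

```python
from statistics import mean

# Pairs of groups compared in the text: (numerator, denominator).
RATIO_PAIRS = [("CRCIV", "C"), ("CRCIV", "CRCI"), ("CRCIV", "CRCIII"), ("CRCIII", "C")]


def mean_ratio(intensities, numerator, denominator):
    """Ratio between the mean peak intensities of two groups.

    intensities: dict mapping a group label to the list of peak
    intensities measured for one molecule in that group.
    """
    return mean(intensities[numerator]) / mean(intensities[denominator])


def table5_ratios(intensities):
    """Return the four ratios used in the text for one molecule."""
    return {f"{a}/{b}": mean_ratio(intensities, a, b) for a, b in RATIO_PAIRS}
```

With hypothetical intensities of 100 and 120 in the C group and 300 and 360 in the CRCIV group, the CRCIV/C ratio is 330/110 = 3, that is, a threefold increase in Stage IV relative to controls.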

Enrichment and Pathway Analysis
Using enrichment analysis and pathway analysis by the GSEA algorithm for the matrix including the m/z values presented in Table 5 and decreasing p-values and t-scores, the possible pathways affected by CRC in different stages were obtained. Figure 4 presents a general and detailed overview of the enriched metabolite class sets and their significance. The results plotted in Figure 4 confirm the metabolite sets which can be considered as significant in CRC diagnosis and prognosis, including their classifications. The main class of metabolites is represented by the glycerophosphocholines (mono- and diacyl derivatives), followed by sterols and their esters, ceramides and sphingomyelins.

Discussion
The results of this study may contribute to the current knowledge directed towards identification of the most relevant biomarkers of CRC and progression stage subtypes, compared with controls. These data are mostly in good agreement with previous findings which reflect the key role of lipid-mediated pathways in CRC diagnosis and prognosis.
As reported before, choline-related phospholipids can be considered good biomarkers for CRC [39,40]. Increases in PCs may be followed by decreases in LPCs, associated with body weight loss and activated inflammatory processes in CRC patients [41], but also by accumulation of some LPCs (16:0, 16:1 and 18:0) [42] or an increased degradation rate of some LPCs (20:4 and 22:6) as a result of the accelerated cell proliferation in CRC patients [36,43]. Another lipid metabolic signature represented by palmitic amide, oleamide, octadecanoic, hexadecanedioic, myristic and eicosatrienoic acids and LPCs (16:0, 18:2, 20:4, 22:6) was statistically significant, and these lipid metabolites were considered potential biomarkers to discriminate early-stage patients from healthy controls, superior to the prediction made by carcinoembryonic antigen [40]. Endogenous synthesis of arachidonic and oleic acids was also reported to have an impact on CRC development, as well as the arachidonic acid metabolites (eicosanoids and their oxidized forms), which generate prostaglandin E2, which in turn stimulates tumorigenesis [44]. Meanwhile, no significant differences between normal, polyp and cancer mucosa were noticed for oxidized lipids 12-hydroxyeicosatetraenoic acid (HETE), 15-HETE or leukotriene B4 levels, or decreased 13-hydroxyoctadecadienoic acid (HODE) and HETE levels in cancer and colorectal polyp mucosa [15,27,45]. Benzoic, octanoic and decanoic acids were also found to be upregulated or downregulated through the various stages of CRC, in proportion to CRC stage [35,46,47]. Glyceraldehyde, hippuric and linolenic acids, glycochenodeoxycholate and glycocholate may also discriminate CRC from polyps [12], while β-hydroxybutyrate increased and tryptophan and indoleacrylic acid decreased from Stage I to Stage IV CRC [48][49][50].
By comparison, healthy subjects, patients with polyp adenomas and CRC patients showed different glycerolipid metabolism, reflected by higher levels of lipids and polyunsaturated fatty acids (PUFAs) and lower levels of glycerol [18].
To summarize the findings presented above, in relation to our results we can assume that in CRC, several classes of lipids including glycerophospholipids (PCs, phosphatidylethanolamines (PEs) and phosphatidic acids (PAs)), fatty acids and sterol esters, as well as ceramides, confirm the "lipogenic phenotype" for CRC development, dependent on lipogenesis and lipolysis, upregulated and associated with tumor progression. Both multivariate and univariate bioinformatics confirm these findings and the specificity of these metabolic pathways activated in CRC patients [58].
Further studies are under development using larger cohorts of patients in different CRC stages, with improved characterization and data processing.

Conclusions
Metabolomics has already shown great potential as a high-value technology for producing metabolic signatures that discriminate significantly between healthy controls or patients with benign polyps and those with malignant CRC tumors. Specific classes of lipids involved in cellular signaling and energy provision proved to be good biomarkers for CRC in different stages and can be relevant prognostic factors. The lipid profile alterations presented in this study, many of them also confirmed by similar investigations, showed statistically significant differences and can be considered reliable biomarkers, differentiating between early and advanced stages of this malignancy, or serving as survival predictors. Complementary studies on larger cohorts of patients are needed for the development of clinically useful biomarkers, especially related to the signaling lipids.
Metabolomics has the potential to become a standard technology for future applications in translational cancer research, but further large-scale studies and prospective validation are still needed. Moreover, the bioinformatics tools offered by the online Metaboanalyst 5.0 software significantly helped to refine the selection of key biomolecules which may be considered as putative biomarkers for CRC diagnosis and staging. These biomarkers are not only useful for diagnostics and patient stratification but can also be mapped on a biochemical chart to identify the altered metabolic pathways involved in the initiation and progression of this invasive cancer.
Supplementary Materials: The following are available online at https://www.mdpi.com/2218-273X/11/3/417/s1, Table S1. Matrix representing the m/z values of molecules separated by LC-MS (column 1) and the mean values of mass spectra peak intensity for the C group (column 2) and CRC subgroups representing Stages I, II, III and IV (columns 3-6). Figure S1. (a) Cross-validation of PLSDA analysis for the CRC and C groups. For interpretation, see the main text. (b) Cross-validation of PLSDA analysis for CRC I to IV subgroups. For interpretation, see the main text. Figure S2. Correlation maps between variables (m/z values) (a) and samples (b). The heatmap (c) represents the correlation between samples and variables. All maps were built using the Metaboanalyst 5.0 algorithm (multivariate statistics and t-test).

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Acknowledgments:
The authors are grateful for the technical and logistic support offered by the Research center BIODIATECH (biodiatech.ro) and the Medical Center Medisyn from Cluj-Napoca, Romania.

Conflicts of Interest:
The authors declare no conflict of interest.