A Review of GC-Based Analysis of Non-Invasive Biomarkers of Colorectal Cancer and Related Pathways

Colorectal cancer (CRC) is the third most commonly diagnosed cancer in the world. In Europe, it is the second most common cause of cancer-related deaths. With the advent of metabolomics approaches, studies regarding the investigation of metabolite profiles related to CRC have been conducted, aiming to serve as a tool for early diagnosis. In order to provide further information about the current status of this field of research, 21 studies were systematically reviewed, regarding their main findings and analytical aspects. A special focus was given to the employment of matrices obtained non-invasively and the use of gas chromatography as the analytical platform. The relationship between the reported volatile and non-volatile biomarkers and CRC-related metabolic alterations was also explored, demonstrating that many of these metabolites are connected with biochemical pathways proven to be involved in carcinogenesis. The most commonly reported CRC indicators were hydrocarbons, aldehydes, amino acids and short-chain fatty acids. These potential biomarkers can be associated with both human and bacterial pathways and the analysis based on such species has the potential to be applied in the clinical practice as a low-cost screening method.


Colorectal Cancer Background
According to data regarding cancer burden in 2018 (GLOBOCAN 2018), colorectal cancer (CRC) is currently the third most incident cancer type in the world, with nearly 1.85 million cases and 881 thousand deaths worldwide. In Europe, it occupies the second place in the ranking of cancer occurrence and related deaths, with approximately half a million new cases registered and almost a quarter of a million associated deaths. Moreover, research on cancer progression predicts an increase of 75% in CRC cases over the next 20 years [1]. Over time, the global population has experienced significant changes in habits, notably the prevalence of sedentarism, increased intake of dietary fat and processed food, and exposure to carcinogens, all of which are risk factors for CRC [2]. This context presents a complex perspective on CRC, also from a socioeconomic point of view, emphasizing the need for prevention strategies and promotion of early diagnosis.

General Analytical Platforms Available for Metabolomics Studies
There are many analytical platforms for molecular biomarker analysis, such as nuclear magnetic resonance (NMR), high performance liquid chromatography-mass spectrometry (HPLC-MS), ultrahigh performance liquid chromatography-mass spectrometry (UHPLC-MS), supercritical fluid chromatography-mass spectrometry (SFC-MS), capillary electrophoresis-mass spectrometry (CE-MS) and GC-MS [52,53]. NMR is a highly reproducible technique which requires low amounts of sample, enables quantitation without standards and allows for the identification of both polar and nonpolar compounds. The main disadvantage of NMR compared with MS is its poor sensitivity, which limits the analysis of low-abundance metabolites [52]. UHPLC-MS is more sensitive than HPLC-MS and requires a smaller sample volume for injection. Recent UHPLC-MS systems use columns packed with porous particles smaller than 2 µm in diameter, which provide higher peak capacity, increased specificity and high-throughput capabilities compared to HPLC columns [53]. The difficulty is that electrospray ionization (ESI) mass spectral libraries are not standardized as they are for GC-MS [54]. SFC-MS is a promising tool in the field of metabolomics, next to GC and LC. It can analyze both polar and nonpolar compounds using "green" and rather cheap CO2 as the mobile phase [52]. CE-MS has low sensitivity and is suited to the analysis of polar compounds; it offers high analyte resolution and requires only a small sample volume for injection (1-20 nL) [52,53]. GC-MS is a gold standard for the analysis of volatiles. It is a relatively simpler technique than LC-MS, more cost-effective and less prone to matrix effects, enabling the quantification of compounds at picogram levels and their identification using easily accessible reference libraries [27][28][29]. It is also possible to separate and analyze semi-volatiles, for example when using solvent extraction as the sample preparation technique. However, most compounds require a chemical derivatization step at room or elevated temperatures to provide the necessary volatility and thermal stability [54]. GC-MS is one of the most important methods to analyze VOCs from various matrices like breath, saliva, urine, feces and blood for metabolomic purposes [55].

Search Strategy
On 20 May 2020, a literature search was performed in the electronic database Web of Science Core Collection (from Clarivate Analytics; Philadelphia, PA, USA), as well as Google Scholar. The search terms used were: colorectal cancer, volatile organic compounds, gas chromatography, breath, urine and feces, with a time span from 2010 to 2020. Due to their historical relevance, two older articles involving breath samples (from 1977 and 1984) were also included.
Studies were included if they met the following criteria: (i) at least two patient groups, one group with colon or digestive tract cancer in any stage and another group without cancer; (ii) an analytical platform based on GC coupled with MS or other commonly used detectors; (iii) detection and identification of chemicals or gases in breath, urine or feces.
The following information was gathered from the articles, per type of matrix: author, year of publication, analytical method, method of data analysis, sample preparation technique, type of used GC column, patient groups and number of volunteers, degree of validation of the method and its level of sensitivity, specificity, accuracy and other statistical parameters.
Twenty-one studies on CRC molecular markers were reviewed, all of them employing gas chromatography as the analytical technique and investigating urine, breath or fecal samples as sources of metabolites. Data regarding candidate CRC biomarkers and the biological matrices studied are presented in Table 2. These compounds are considered the most relevant because they were reported as possible markers by more than one study and/or can be linked to previously described biochemical mechanisms.
Table 2 notation: ↑, concentration elevated in comparison to healthy controls; ↓, concentration decreased in comparison to healthy controls; post, index regarding postoperative samples; m, index regarding only male samples; no arrows, changes in concentration of the compound not mentioned by the authors; Code, reference used for further discussion in Section 3.

CRC Biomarkers in Urine
Qiu et al., 2010, profiled urine samples from the same group of patients used in their previous serum experiments (60 individuals with CRC at different stages) and from 63 healthy volunteers. Patients were also examined before and after surgery in order to evaluate changes in volatile profiles. The postoperative urine specimens were collected on the seventh day after surgery. The authors used GC-MS with solvent extraction and derivatization of the samples (using ethyl chloroformate). In a predictive model, 187 volatile metabolites were found in 90% of samples, allowing discrimination of CRC patients from the healthy controls in the predictive component. Sixteen potential biomarkers of disease between preoperative CRC patients and healthy controls were identified, including decreased compounds, e.g., 3-methylhistidine, histidine and citric acid, and increased volatiles such as 5-hydroxytryptophan, 5-oxoproline, p-cresol and phenyl acetate. Four compounds, 5-hydroxytryptophan, 2-hydroxyhippuric acid, succinic acid and phenylacetylglutamine, showed a tendency to return to the levels of healthy controls in the postoperative samples. Different CRC stages could be distinguished by six metabolites with characteristic expression levels. Levels of indoleacetic acid were elevated for stage I patients, p-hydroxyphenylacetic acid for stage II and 5-hydroxyindoleacetic acid for stage III. The highest concentration of 2-methylpropanoic acid was found for stage II, with a sharp decrease for stage IV patients. A continuous increase in glutamic acid levels was observed until stage IV. Finally, stage I was characterized by a proportionally large amount of leucine. Twenty-one VOCs, mostly amino acids and phenyl-containing constituents, allowed for discrimination between the preoperative and postoperative states of patients, and they are likely related to the metabolic changes resulting from the surgical operation [31].
Volatile metabolites from the urine of 54 subjects were investigated in a study by Silva et al., 2011. Thirty-three cancer patients (oncological group: 14 leukaemia, 12 colorectal and 7 lymphoma) and 21 healthy (cancer-free) volunteers were enrolled in the experiment. The authors used an optimized technique relying on dynamic solid-phase microextraction in headspace mode (dHS-SPME), in combination with GC-MS-based metabolomics. Prior optimization concerned extraction time, extraction temperature and the selection of the SPME fiber. Positive rates of 16 volatiles among the 82 detected were found to be statistically different (p < 0.05). Oncological groups were characterized by the predominance of benzene derivatives, terpenoids and phenols. Levels of p-cymene, anisole, γ-terpinene, bornylene, dimethyl disulfide, 4-methylphenol, 1,2-dihydro-1,1,6-trimethylnaphthalene, 1,4,5-trimethylnaphthalene and 2,7-dimethylquinoline were higher in colorectal patients than in lymphoma and leukaemia patients. Decreased concentrations in CRC patients in comparison to healthy controls were observed for, e.g., 1-octanol, heptanal, hexanal and dimethyl disulfide; 4-methyl-2-heptanone was only identified in colorectal patients [32].
Solvent extraction and derivatization combined with GC-time-of-flight mass spectrometry (TOFMS) was used by Cheng et al., 2012, to find metabolite markers of colorectal cancer. A cohort of 103 CRC patients and 101 healthy subjects participated in the study. From the total of 163 volatiles detected, 19 metabolites were selected as potential biomarkers based on uni- and multivariate statistical methods. Using a set of seven metabolites, citric acid, hippuric acid, p-cresol, 2-aminobutyric acid, myristic acid, putrescine, and kynurenate, it was possible to discriminate CRC subjects from healthy volunteers, with an AUROC (area under the receiver operating characteristic curve) of 0.993, a sensitivity of 97.5% and a specificity of 97.6% for the training set, and an area under the curve (AUC) of 0.998, a sensitivity of 97.5% and a specificity of 100% for the testing set [33].
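To make the AUROC figures quoted above easier to interpret, the following minimal sketch computes the area under the ROC curve as the probability that a randomly chosen CRC sample receives a higher classifier score than a randomly chosen healthy sample. The function name `auroc` and the score values are hypothetical and invented for illustration; they are not taken from Cheng et al., 2012, and the study's own model and software are not reproduced here.

```python
def auroc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (case, control) pairs in which the case
    scores higher; tied pairs count as half a win.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical classifier scores, invented for illustration only.
crc_scores = [0.91, 0.84, 0.77, 0.62]
healthy_scores = [0.35, 0.48, 0.66, 0.12]
print(f"AUROC = {auroc(crc_scores, healthy_scores):.4f}")
```

An AUROC of 0.5 corresponds to a classifier no better than chance, while values close to 1, such as those reported above, indicate almost complete separation of the two groups.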
A field asymmetric ion mobility spectrometer (FAIMS) was used by Arasaradnam et al., 2014, for analyses of urine samples from 83 CRC patients and 50 healthy controls. Data analysis for FAIMS results was performed using Fisher discriminant analysis. The sensitivity and specificity of FAIMS for CRC were 88% and 60%, respectively, and this technique allowed for the differentiation between CRC patients and healthy subjects. The authors conducted a parallel in-tube extraction (ITEX)-GC-MS experiment with the same samples. No unique chemical was identified in those with CRC compared with healthy volunteers. According to the table included in the paper, nine different volatiles were associated with colorectal cancer, and 26 NIST library targets could be assigned to these peaks. There was no information on whether the concentrations of these compounds changed; nevertheless, they were included in Table 2 [34].
Liesenfeld et al., 2015, investigated urine samples from a cohort of 170 subjects divided into four groups of CRC patients: pre-surgery (79), post-surgery within a few days (9), after 6 months of follow-up (46) and after 12 months of follow-up (36). A total of 82 VOCs detected by GC-MS were significant and allowed pre-surgery CRC patients to be distinguished from post-surgery ones. However, only 49 metabolites were included in Table 2, since the remaining VOCs were unknown (level three or four identification). Liesenfeld et al., 2015, attributed the origin of many significant metabolites, such as 2,3-butanediol, pyrogallol, hydroquinone and maleamic acid, to alterations of the gut microbiome caused by CRC surgery. A total of 10 compounds (four identified) were highlighted as metabolites significant for discriminating pre-surgery CRC patients by disease stage. Levels of a dipeptide of hydroxyproline  [36].
The FAIMS technique was used once again for the development of a VOC-based screening tool for CRC and adenomas (Mozdiak et al., 2019). Moreover, gas chromatography coupled with ion mobility spectrometry (GC-IMS) was also employed. A total of 163 patients were enrolled in the study. For patients grouped into categories according to diagnosis, FAIMS analysis revealed high sensitivity and specificity for CRC vs. normal controls: 1.0 (95% CI 0.74-1) and 0.92 (95% CI 0.62-1), respectively. For the GC-IMS study, the corresponding values showed a high degree of separation, with a sensitivity of 0.80 (95% CI 0.44-0.97) and a specificity of 0.83 (95% CI 0.63-0.95). However, when CRC cases were grouped with adenomas and compared with other groups, the accuracy dropped significantly. Hence, urinary VOC profiles from CRC patients in combination with other (non-neoplastic) gastrointestinal disorders are not sufficiently distinct to allow correct classification. No unique VOC biomarkers were found using GC-IMS. In summary, VOC signatures enabled correct classification of malignant patients from pre-malignant ones, with higher test uptake and superior sensitivity compared to the fecal occult blood test (FOBT) used for bowel cancer screening [37].

CRC Biomarkers in Feces
In studies regarding fecal samples, profiling of the gut microflora is recurrent, in order to establish correlations between fecal metabolites and the human microbiota. Previous evidence demonstrates differences between the microbial composition found in samples of healthy subjects and CRC patients, supporting the existence of oncogenic bacteria, which potentially promote CRC initiation and tumor development [56]. Hence, the detection of specific bacterial metabolites, especially in feces, has proven relevant in the assessment of CRC risk.
Fecal samples from healthy adults (n = 10) and colorectal cancer patients (n = 11) were investigated by Weir et al., 2013. Using solvent extraction combined with GC-MS, they examined stool samples to obtain overall metabolite profiles and to extract short-chain fatty acids (SCFAs). Microbial diversity in the stool biota of CRC subjects and controls was characterized using amplification of the V4 region of the bacterial 16S rRNA gene and pyrosequencing. Fourteen bacterial species (mostly butyrate-producing) were significantly more abundant in the stool of healthy individuals compared to CRC patients. On the other hand, four bacterial species were significantly over-represented in stool samples from CRC patients. The last of these, the mucin-degrading bacterium Akkermansia muciniphila, was observed in a significantly greater proportion in CRC stool samples. SCFA analysis revealed six bacterially produced fatty acids that differed significantly between the stool of healthy adults and that of subjects with CRC. Levels of acetic acid, propionic acid, valeric acid, and particularly isobutyric acid and isovaleric acid were significantly elevated in CRC samples. Butyric acid concentrations were higher in samples from controls. Independently, Weir et al., 2013, detected 27 global stool metabolites which they proposed as cancer biomarkers. Eleven of them were amino acids showing a 41-80% increase in fecal samples from CRC patients, possibly originating from the degradation of dietary proteins and intestinal mucins, which is consistent with the presence of A. muciniphila. Higher levels of glycerol, several unsaturated fatty acids and ursodeoxycholic acid (UDCA) were observed for healthy adults. Pearson's correlations between groups of metabolites and bacterial genera/species revealed strong correlations between Bacteroides finegoldii, two Dialister spp., and Pseudobutyrivibrio ruminis and increased stool free fatty acids and glycerol, as well as between Ruminococcus spp. and UDCA. A strong correlation also existed between the bacterial genera Phascolarctobacterium and Acidiminobacter (noteworthy for CRC samples) and the amino acids phenylalanine and glutamic acid [38].
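The Pearson correlations between metabolites and bacterial taxa mentioned above can be illustrated with a short sketch. The function `pearson_r` implements the standard sample correlation coefficient; the abundance and metabolite values are hypothetical numbers invented for illustration and do not come from Weir et al., 2013.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-sample values, invented for illustration only:
# relative abundance of a bacterial genus and the level of one metabolite.
genus_abundance = [0.02, 0.05, 0.08, 0.11, 0.15]
metabolite_level = [1.1, 1.9, 3.2, 3.8, 5.1]
print(f"r = {pearson_r(genus_abundance, metabolite_level):.3f}")
```

A coefficient close to +1 or -1 indicates a strong linear association across samples; it does not by itself show that the bacterium produces or consumes the metabolite.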
In the study of Phua et al., 2014, GC-TOFMS was employed for the metabolic profiling of fecal samples from 11 CRC patients and 10 healthy subjects and to find metabolite markers from tumor specimens against their matched normal mucosae from eight CRC patients and 10 controls. The discrimination between CRC patients and healthy controls was evident based on fecal profiles (orthogonal partial least squares discriminant analysis (OPLS-DA), one predictive and three Y-orthogonal components, R²X = 0.373, R²Y = 0.995, Q² (cumulative) = 0.215). The robustness of the OPLS-DA model was demonstrated by an AUROC of 1. GC-TOFMS profiling also enabled the separation of tumor tissue from matched normal mucosae and the assignment of nine potential biomarkers of CRC. Glucose, galactose, 3-phosphoglycerate, citric acid, inosine, and creatinine were lower in CRC while uracil, uridine, and proline were significantly higher in tumor tissue. Three fecal markers were found to be also lower compared to healthy stool samples, namely nicotinic acid, fructose and linoleic acid. The authors pointed out that the study revealed the ability to differentiate CRC subjects from healthy subjects regardless of the presence of samples containing blood beyond 1 mg Hb/g stool [39].
Bond et al., 2016, used SPME-GC-MS to analyze volatiles in stool samples of 137 participants, consisting of 60 controls, 56 patients with adenomatous polyp/s and 21 CRC patients. Four compounds were assigned as biomarkers, however their names were not mentioned due to alleged potential future intellectual property. After a tenfold cross validation, an AUC of 0.82 with a sensitivity of 87.9% and a specificity of 84.6% was measured [40].
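As an illustration of the tenfold cross-validation mentioned above, the sketch below splits sample indices into ten folds so that each sample is used for testing exactly once. It is a plain shuffled split written under stated assumptions: the function name `k_fold_indices` and the seed are hypothetical, and the source does not describe how Bond et al., 2016, built their folds (for example, whether they were stratified by diagnosis).

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    # Slicing with a stride of k gives folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# 137 participants, as in the study described above.
for fold_no, (train, test) in enumerate(k_fold_indices(137, k=10), start=1):
    print(f"fold {fold_no}: {len(train)} training samples, {len(test)} test samples")
```

Performance figures such as the AUC, sensitivity and specificity are then computed on the held-out fold in each round and combined over the ten rounds.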
Wang et al., 2017, applied an approach very similar to that of Weir et al., 2013. A total of 27 subjects (15 CRC patients and 12 controls) were enrolled in the study. The cancer group consisted of four stage II cases, six stage III cases and five stage IV cases. Two parallel experiments involving GC-MS were conducted: global metabolite profiling and SCFA analysis. Bacterial species present in stool samples were identified using pyrosequencing for specific detection of the V4 region of bacterial 16S ribosomal RNA on the isolated genomic DNA. Eighteen bacteria obtained from gut flora differed significantly between the CRC and control groups. VOC profiling revealed 24 volatiles proposed as markers of disease. Among them, the levels of four SCFAs (acetic acid, valeric acid, butyric acid and isovaleric acid) were elevated for CRC patients, while the concentration of isobutyric acid was diminished for the cancer group. Compounds with decreased levels in CRC samples included fatty acids (oleic acid, elaidic acid, linoleic acid and myristic acid), ursodeoxycholic acid and pantothenic acid (vitamin B5). The largest group of increased species were amino acids, represented by, e.g., glutamic acid, leucine, serine, valine and phenylalanine. High positive correlations were observed between bacterial species and volatiles, including: Bacteroides, Dialister and Pseudobutyrivibrio with free fatty acids; Ruminococcus with ursodeoxycholic acid; Phascolarctobacterium and Acidiminobacter with phenylalanine and glutamic acid [41].
Fecal fatty acid profiles of CRC patients and healthy controls were analyzed by Song et al., 2018. A total of 54 subjects (26 CRC patients and 28 controls) participated in the study. Fecal samples were used for two independent experiments, consisting of the profiling of long- and short-chain fatty acids by solvent extraction with GC-MS. The most predominant saturated fatty acids in both gender groups were palmitic acid (C16:0), stearic acid (C18:0), and myristic acid (C14:0). Significant changes between profiles were observed only for the male group; no difference was found between CRC patients and healthy controls in the female group. The levels of total monounsaturated fatty acids (MUFAs) and total omega-6 polyunsaturated fatty acids (PUFAs) were higher in the male CRC group than in healthy controls. The differences were especially significant for two compounds, namely oleic acid (C18:1ω-9) and linoleic acid (C18:2ω-6). The levels of four SCFAs, acetic acid, butyric acid, propionic acid and valeric acid, were not considerably different between the controls and the CRC cohort [42].

CRC Biomarkers in Exhaled Breath
The first studies concerning colorectal cancer biomarkers in exhaled breath involved the determination of methane levels in obtained samples. Haines et al. 1977 enrolled three groups of subjects: 30 patients with CRC (19 with carcinoma of the rectum and 11 with carcinoma of the colon), 64 patients with non-malignant large-bowel disorders, and 208 subjects without known large-bowel disorders. Paired end-expiratory breath-samples were taken using one of two similar methods, by means of either a modified Haldane-Priestley tube or a three-bag collecting system in which one bag contains sample which can then be transferred to a syringe or evacuated aerosol can for later analysis. Determination of methane was carried out using GC. A total of 14 out of 19 patients with rectal carcinoma produced methane, as well as 10 out of 11 patients with colonic carcinoma. In the group of 30 CRC patients, 24 (80%) had significant levels of methane in their breath (mean: 28.8 ± 20.9 ppm), compared with 25 (39%) of 64 patients with non-malignant large-bowel disorders (mean: 16.8 ± 12.7 ppm) and 83 (40%) of 208 subjects without large-bowel affliction (mean: 16.7 ± 13.8 ppm) [44].
Direct gas sampling combined with GC-flame ionization detector (FID) was again used to measure methane concentrations in the breath of 270 subjects (Piqué et al. 1984). End-expiratory breath samples were collected using a modification of the Haldane-Priestley tube and stored in 50-mL plastic syringes. Sixty-seven (42.9%) of the 156 healthy controls were CH4 producers and forty-three (91.4%) of the 47 patients with CRC were CH4 producers (the percentage was significantly higher; p < 0.001). In 36 patients in whom the cancer was resected, the incidence of methane producers fell to 47.2%. The percentage of methane producers in patients operated on, but with unresectable cancer, remained very high (87.7%). A significantly higher proportion of patients with extensive ulcerative colitis and colonic polyposis produced CH4 than patients suffering from ulcerative proctosigmoiditis, benign diseases of the colon, and healthy controls (p < 0.05). Values were expressed only as the percentage of incidence of methane producers [45].
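The comparison of methane-producer proportions described above can be sketched as follows, using the counts quoted in the text (43 of 47 CRC patients and 67 of 156 healthy controls). The Pearson chi-square statistic for a 2x2 table is used here as an assumption; the source does not state which statistical test Piqué et al. applied, and the function names are hypothetical.

```python
def producer_rate(producers, total):
    """Percentage of methane producers in a group."""
    return 100.0 * producers / total


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column of the table sums to zero")
    return n * (a * d - b * c) ** 2 / denominator


# Counts quoted in the text above.
crc_producers, crc_total = 43, 47
control_producers, control_total = 67, 156

statistic = chi_square_2x2(
    crc_producers, crc_total - crc_producers,
    control_producers, control_total - control_producers,
)
print(f"CRC: {producer_rate(crc_producers, crc_total):.1f}% producers")
print(f"Controls: {producer_rate(control_producers, control_total):.1f}% producers")
print(f"chi-square = {statistic:.2f} (1 degree of freedom)")
```

With one degree of freedom, a chi-square statistic above about 10.83 corresponds to p < 0.001, which is the significance level reported for this comparison.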
Depalma et al., 2014, collected breath samples from 20 patients with a colonoscopic diagnosis of colonic polyps, 15 CRC patients and 15 healthy controls (negative at colonoscopy). They used thermal desorption (TD) combined with GC-MS to obtain VOC patterns. A linear discriminant analysis (LDA) achieved a discriminant performance with an accuracy of 96.5% and a sensitivity of 100%. The selected model provided correct classification of 14 patients with polyps over 15 for the group of patients with colorectal cancer, with a sensitivity of 93.3%. The group with polyps was distinguished as markedly pathological [48].
TD-GC-MS was employed again by Altomare et al., 2015, who investigated breath from 48 CRC patients and 55 healthy controls. They found 31 VOCs that discriminated CRC patients from the follow-up group (FU; patients after resection with curative intent) and the FU group from healthy controls. The probabilistic neural network (PNN) model discriminated between the CRC and the FU groups with a sensitivity of 100%, a specificity of 95.83%, an accuracy of 97.50%, and an AUC of 0.993. When comparing the FU and healthy control groups with the set of those 31 biomarkers, a sensitivity of 100%, a specificity of 96.36%, an accuracy of 97.70% and an AUC of 0.992 were noted. A total of 11 VOCs were common to the previous study [47], and the PNN analysis using only them resulted in discrimination of follow-up patients from CRC patients before surgery with a sensitivity of 100%, a specificity of 97.92%, an accuracy of 98.75%, and an AUC of 1. Moreover, the FU group was distinguished from the healthy group with a sensitivity of 100%, a specificity of 90.91%, an accuracy of 94.25%, and an AUC of 0.959 [50].
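Since sensitivity, specificity and accuracy are quoted throughout this section, a minimal sketch of how they follow from a confusion matrix may be useful. The counts below are hypothetical: they were chosen only because they yield percentages of the same form as those quoted above, and the underlying confusion matrices are not given in this review. The function name `diagnostic_metrics` is likewise an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts.

    tp/fn: diseased subjects classified correctly/incorrectly.
    tn/fp: non-diseased subjects classified correctly/incorrectly.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("each group needs at least one subject")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts, invented for illustration only.
metrics = diagnostic_metrics(tp=16, fn=0, tn=23, fp=1)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

Because accuracy pools both groups, it depends on the ratio of diseased to non-diseased subjects in the tested set, which is one reason the three figures are usually reported together.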
Four compounds were assigned as biomarkers of colorectal cancer by Amal et al., 2016. The amounts of acetone and ethyl acetate were elevated in CRC patients, whereas ethanol and 4-methyl octane were lower in the diseased group. A total of 65 patients with CRC, 22 with advanced or nonadvanced adenomas, and 122 healthy controls were enrolled in the study. Two different techniques were employed for the experiments: sensor analysis with a pattern recognition method and TD-GC-MS. Patients with CRC were distinguished from the control group using a sensor with 85% sensitivity, 94% specificity and 91% accuracy. The advanced adenoma group was discriminated from the non-advanced adenoma group with 88% sensitivity, 100% specificity, and 94% accuracy. Finally, the advanced adenoma group was discriminated from healthy controls with 100% sensitivity, 88% specificity, and 94% accuracy. Acetone and ethyl acetate were found at elevated levels in CRC patients (999.6 ± 116.8 ppb and 128.4 ± 4.01 ppb, respectively) compared to the healthy subjects (731.2 ± 63.8 ppb and 41.80 ± 10.00 ppb, respectively). Ethanol and 4-methyl octane were increased in the control group (464.7 ± 61.7 ppb and 19.1 ± 0.8 ppb, respectively) compared with the CRC patients (95.9 ± 48.1 ppb and 16.0 ± 0.63 ppb, respectively) [51].
In total, 205 different compounds (278, considering substances reported more than once) were indicated as potential CRC markers from all three matrices. Figure 1 and Table 3 show the functional group distribution of compounds in the matrices, divided by matrix. For urine specimen the most prevalent group of compounds were amino acids and their derivatives, acids and sugars with their derivatives. These two first groups were also the most predominant in feces, whereas breath samples contained mainly hydrocarbons. Table 3 demonstrates that 23.7% of all identified potential biomarkers of CRC were amino acids and their derivatives, followed by hydrocarbons (20.1%) and acids (19.1%). The compound with the highest incidence of occurrence among matrices was p-cresol (reported by five different studies). The other compounds frequently detected were: hexanal, 2-methylbutane, methylcyclohexane, citric acid, linoleic acid and glutamic acid (all reported at least by four different studies).

Clinical Relevance of Reviewed Articles
Breath, urine and feces can be collected noninvasively from subjects, in contrast to conventionally used matrices (blood, serum, and tissue). Noninvasiveness is an important factor for patient safety and may lead to increased rates of adherence to the test, helping to promote early diagnosis. In addition, simplified sample collection may not require qualified personnel. These biological samples can be taken relatively fast, at little cost, and without extensive sample preparation for analytical instruments. The drawbacks of urine and feces are the possible discomfort experienced by patients and the chemical complexity of these matrices. Although breath has lower chemical complexity, various external factors can influence its composition, such as food remnants, hygiene products, metabolic products of colonizing oral bacteria and ambient air [57][58][59][60], making it challenging to determine which fraction of it represents blood-related biomarkers.
Urine is a more stable specimen and contains mainly water, inorganic salts and organic compounds, such as hormones, proteins, other metabolites and bacteria with their products. Because urine is one of the organism's routes of elimination, metabolized forms of substances prevail over parent molecules. Regarding bioanalyses of urine, there are a few recommendations intended to avoid an excessive contribution of bacterial content in the samples. In fact, the influence of bacterial growth on the levels of metabolites and proteins has been reported [61]. It was demonstrated that the mid-stream portion of urine has a reduced microbiota composition [62]. The time of collection and diet are also significant parameters; in this sense, fasting urine was characterized as the most stable in terms of basal composition [63]. The determination of urine specific gravity is an alternative for the normalization of samples for metabolic investigations [64]. Most of the urine samples studied in this review were collected in the morning after overnight fasting [31][32][33][35]. Further information regarding the detailed sampling protocol was not provided.
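The specific gravity normalization mentioned above can be sketched with one commonly used correction, in which a measured concentration is scaled by the ratio (SG_ref - 1)/(SG_sample - 1). This is an illustrative assumption: the procedure described in [64] may differ, the reference specific gravity of 1.020 is a frequently used but not universal choice, and the function name is hypothetical.

```python
def normalize_by_specific_gravity(concentration, sg_sample, sg_ref=1.020):
    """Scale a urinary concentration to a reference specific gravity.

    Dilute samples (low specific gravity) are scaled up and concentrated
    samples are scaled down, so that subjects with different hydration
    states become comparable.
    """
    if sg_sample <= 1.0:
        raise ValueError("specific gravity of urine must be greater than 1.0")
    return concentration * (sg_ref - 1.0) / (sg_sample - 1.0)


# Hypothetical measurement, invented for illustration only:
# 10 concentration units measured in a dilute sample with SG = 1.010.
print(normalize_by_specific_gravity(10.0, sg_sample=1.010))
```

Creatinine normalization is another widely used option, but it relies on a metabolite whose excretion may itself vary between subjects, so the choice of method should be reported alongside the results.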
Feces are an especially meaningful biological specimen for the evaluation of the colonic system. They are typically solid or semisolid heterogeneous remains of food that was not completely digested or absorbed by the organism. This content is metabolized by intestinal bacteria to smaller waste products and can also contain dead epithelial cells from the lining of the gut. Diet can change the composition of feces, which directly alters metabolite profiles [65]. Moreover, consumed food impacts the metabolic pathways of the gut microbiota and the volatiles produced by them [66]. Factors related to the collection and handling of fecal samples also play a role, such as post-collection metabolite deterioration due to exposure to aerobic conditions and ongoing microbial fermentation [67]. Investigations of VOC profiles in fecal samples revealed no effect when comparing a processed fresh sample (first homogenized and extracted) to a sample thawed after 7 days of storage at −20 °C [68]. Interestingly, lyophilized feces showed a decrease in the number of detected analytes [69].
None of the reviewed studies regarding fecal samples imposed any kind of dietary restriction on the patients. In certain studies, samples from individuals following a vegetarian diet or having any dietary restrictions were not included [38,43], as this factor itself could result in a different profile of molecules extracted from the collected specimen. Interruption of the consumption of tobacco or alcohol was also required in one study [41]. Song et al., 2018, collected samples provided after overnight fasting and participants were instructed to consume solely local traditional food [42]; participants enrolled in another study all had a mild diet [41]. Fecal samples were generally provided before the colonoscopy, without the influence of any kind of procedure for bowel preparation [38][41][42][43]. Another common exclusion criterion concerns the intake of antibiotics and probiotics in the months prior to sample donation, since these agents affect the natural composition of the gut microbiota [41,43]. In most of the studies, patients suffering from chronic inflammatory intestinal diseases were also not enrolled [38,39,41,43], due to the possibility that the metabolic alterations promoted by these conditions may result in a false positive or a confounding factor when CRC cases are evaluated. The composition of feces, urine and breath can also vary according to the physical state of the body, age, and general health. Hence, biomarkers found in these matrices can have multiple origins, which should be considered when establishing their potential metabolic pathways [59,70,71].
The reviewed studies can be critically analyzed from the point of view of their clinical value. In urine metabolome studies, Qiu et al., 2010, reported separation of both CRC patients and rats with precancerous colorectal lesions from their healthy counterparts. This is an interesting result because detecting precancerous lesions is the main limitation of commercially available CRC tests, which do not provide good sensitivity in the first stage of the disease. The main differences observed between the pre- and post-operative groups would be due to the removal of a portion of the microbiota during preoperative preparation of the colon and to the influence of supplement intake. These authors suggested that the overall results point mainly to changes in tryptophan metabolism and the tricarboxylic acid (TCA) cycle due to CRC development. Moreover, the method's reliability was supported by the fact that almost half of the altered substances had their identity confirmed by the analysis of standards [31]. Cheng et al., 2012, similarly to Qiu et al., 2010, suggested the TCA cycle, tryptophan metabolism, and polyamine metabolism as the main affected pathways in the CRC metabolome. Additionally, they obtained receiver operating characteristic (ROC) curves indicating a method accuracy close to 100% [33]. Liesenfeld et al., 2015, monitored the metabolomes of patients over time within the ColoCare project, a study encompassing a series of interventions for the collection of samples and data at different time points over a period of 5 years. A clear distinction was found between the metabolic profiles of the pre- and post-operative cohorts. Their result supported the hypothesis that the gut microbiota plays a more important role in colon cancer than in rectal or rectosigmoidal cancer. Removal of cancerous tissue and parts of the intestine may affect microbial metabolites and possibly the microbiota itself. The effect is intensified by adjuvant chemotherapy. Once again, changes in tryptophan levels indicated that the subjects' metabolism was affected. One limitation of the study was that some metabolites could not be identified using the available spectral library [35]. The study of Delphan et al., 2018, focused primarily on the determination of branched-chain amino acids (BCAAs). However, no correlation was observed between the amounts of BCAAs and some of the evaluated energy balance parameters. On the other hand, two acids were significantly elevated in the CRC group and associated with overall survival [36]. Silva et al., 2011, were able to discriminate colorectal cancer from other types of neoplasm (leukemia and lymphoma). In this case, the authors presented data on the optimization of the sample preparation method, selecting the parameters that gave the greatest recovery of compounds [32]. GC-MS was used as a complementary technique by Arasaradnam et al., 2014, with an ITEX device serving in the preconcentration step. ITEX provides sensitivity similar to purge-and-trap systems while requiring less instrumental effort and being less susceptible to contamination [34,72]. The two methods employed in the described protocol (FAIMS and GC-IMS) did not satisfactorily differentiate between the adenoma and healthy groups; however, individual CRC cases could be correctly distinguished from controls and adenomas using these techniques. The authors highlighted that a panel of biomarkers is preferable to a single biomarker for diagnostic purposes. This study focused on individuals who had tested positive for FOBT, and the results were then compared with the actual diagnostic outcomes. In this respect, the proposed metabolome-based methodology proved superior to the FOBT approach [37].
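The ROC-based accuracy figures quoted above for urine studies can be made concrete with a small calculation. The following is a minimal Python sketch, not the procedure of any reviewed study: it computes the area under the ROC curve (AUC) as the fraction of CRC/control pairs in which the CRC sample receives the higher classifier score. The function name `roc_auc` and all scores are hypothetical and serve only for illustration.

```python
from typing import Sequence


def roc_auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U statistic).

    Counts the fraction of (positive, negative) pairs in which the positive
    sample receives the higher score; ties count as half.
    """
    if len(scores_pos) == 0 or len(scores_neg) == 0:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores (NOT data from any reviewed study).
crc_scores = [0.91, 0.84, 0.77, 0.66, 0.58]
control_scores = [0.62, 0.41, 0.35, 0.22, 0.10]

auc = roc_auc(crc_scores, control_scores)
print(f"AUC = {auc:.2f}")
```

An AUC of 1.0 corresponds to perfect separation of the two groups, while 0.5 corresponds to chance-level discrimination.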
As mentioned before, bacterial content strongly influences the composition and metabolic outcome of fecal samples. Weir et al., 2013, investigated stool samples from a small group of CRC patients and healthy individuals. The authors stated that sequencing combined with metabolic profiling may be a powerful approach for elucidating the mechanisms of interaction between gut microorganisms and metabolites. They cited the hypothesis of the driver-passenger model, in which bacteria are infectious agents in the development of cancer. Bacterial drivers of colorectal neoplasm are intestinal bacteria with pro-carcinogenic features, one of which is the production of DNA-damaging compounds. "Passenger bacteria" are bacteria that may outcompete the drivers and flourish in the tumor environment as the cancer progresses. The findings of Weir et al., 2013, lead to the conclusion that, once the relationship between colonizing bacteria and altered fecal metabolites is known, metabolome analyses could be used to assess indirectly the composition of the microbiota and whether it implies CRC risk [38,73]. Phua et al., 2014, recognized that their sample set was rather small for the intended purposes. They mentioned that a previous study on fecal samples (Weir et al., 2013) was potentially affected by not considering the presence of blood in the stool samples, which makes it difficult to determine whether the metabolites found originated from blood. For this reason, the authors worked with a subset of stool samples without a noticeable amount of blood, aiming to mitigate the interference of this matrix in their experiments [39]. Wang et al., 2017, also focused on the bacterial content of feces and its relationship to secreted metabolites. They pointed out that a larger group of samples should be considered for a better assessment of the correlation between bacterial genera and detected compounds [41]. In contrast to other researchers, Song et al., 2018, pointed out that no significant differences were found in the levels of fecal short-chain fatty acids in CRC samples. Through the monitoring of the diet of a healthy group, it was demonstrated that there is no correlation between the described dietary habits and the levels of the substances of interest, suggesting that other factors (specifically related to CRC) contribute to the differential distribution of fatty acids in feces [42]. Bond et al., 2019, hypothesized that propan-2-ol is a major metabolite of Fusobacterium nucleatum, a species previously connected with CRC tumorigenesis. Although the main candidate biomarkers were identified by analysis of chemical standards, no quantification was performed [43].
The earliest works on exhaled breath from CRC patients concerned the detection of methane. The study of Haines et al., 1977, focused on the evaluation of this gas and involved collecting the last portion of the patients' expiration. Methane levels in the room air were taken as a baseline to define methane producers and non-producers. Apart from a univariate comparison between the two groups, no further statistical analyses were explored. The results revealed poor sensitivity and specificity, reinforcing that approaches relying on a single marker tend to be deficient; however, the approach offered perspectives for CRC breath diagnosis. Nowadays, different analytical instruments are available that provide increased sensitivity and detect a wider range of compounds from different classes. Considering the limitations inherent to the period, this work was the first to suggest the evaluation of bacterial metabolites as a way to assess the presence of CRC tissue [44]. Owing to the coincident prevalence of methane in the studied cohorts, the results of Piqué et al., 1984, suggested the existence of a common substrate for large bowel cancer, extensive ulcerative colitis, and colonic polyposis. Metabolites found in the breath of CRC patients are likely to originate from modified microflora [45]. In the work of Peng et al., 2010, the main goal was to differentiate various types of cancer using nanosensors, with GC-MS used as a validation tool. The employed methodology performed less satisfactorily for CRC than for the other neoplasms studied; it was pointed out that a significant portion of the CRC profile overlapped with the VOC patterns of healthy subjects. This may indicate that the sensed CRC metabolic changes were rather subtle [46]. Altomare et al., 2013, were the first to use TD-GC-MS in this context. The main advantage of the thermal desorption technique using sorbent cartridges is the efficient concentration of analytes, allowing trace analysis of VOCs without any sample preparation steps [74,75]. The researchers used a sample collection protocol based on vital capacity expiration instead of prioritizing alveolar breath. This procedure can increase the influence of exogenous compounds on breath composition; however, it is a more comfortable and feasible approach for patients [47]. Depalma et al., 2014, were able to differentiate between patients with polyps and patients with CRC. Since the majority of CRC cases derive from polyps, the detection of such markers (as achieved by that study) is highly relevant. Detection of polyps may indicate which individuals should be monitored regularly in order to prevent disease development [48]. In the work of Wang et al., 2014, the differentiation between tumor stages was studied; however, no specific patterns could be associated with the degree of tumor growth. The researchers used a collection procedure based on alveolar breath samples, thus favoring the evaluation of compounds more closely related to blood levels (alveoli-blood exchange), which are less influenced by room air composition [49]. Altomare et al., 2015, showed for the first time that breath profiles also differ when patterns from positive groups are compared with those of the same cohort after curative tumor resection. In addition, they highlighted the need for more complex chemometric models to identify specific patterns, leading to a successful breath test [50]. GC-MS was used in a study by Amal et al., 2016, to identify candidate biomarkers related to the results of sensor analyses. The authors quantified the detected potential biomarkers; however, details of the calibration procedure were not presented in the article [51].
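The sensitivity and specificity mentioned for the single-marker methane breath test can be illustrated with a short Python sketch. It assumes a hypothetical rule in which a subject counts as a methane producer when breath methane exceeds the room-air level by a fixed margin; the margin, the function names and all concentrations are illustrative assumptions, not values from Haines et al. or any other reviewed study.

```python
from typing import List, Tuple

# Hypothetical margin (ppm above room air) used to call a subject a methane producer.
PRODUCER_MARGIN_PPM = 1.0


def is_producer(breath_ppm: float, room_ppm: float,
                margin: float = PRODUCER_MARGIN_PPM) -> bool:
    """Return True when breath methane exceeds the room-air baseline by `margin`."""
    return (breath_ppm - room_ppm) >= margin


def sensitivity_specificity(test_positive: List[bool],
                            diseased: List[bool]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(test_positive, diseased))
    tp = sum(t and d for t, d in pairs)
    fn = sum((not t) and d for t, d in pairs)
    tn = sum((not t) and (not d) for t, d in pairs)
    fp = sum(t and (not d) for t, d in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Illustrative values only (NOT data from any reviewed study).
room_ppm = 1.8
breath_ppm = [6.0, 2.0, 9.5, 2.5, 2.1, 7.0, 1.9, 2.2]
diseased = [True, True, True, True, False, False, False, False]

producers = [is_producer(b, room_ppm) for b in breath_ppm]
sens, spec = sensitivity_specificity(producers, diseased)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

With these made-up numbers only half of the diseased subjects are methane producers, which mirrors the general point that a single marker may separate the groups poorly.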

Possible Origin of Potential Molecular Biomarkers of CRC
Several complex mechanisms are involved in carcinogenesis, comprising changes in cell biochemistry and a series of metabolic adaptations characteristic of tumor development. Most of these mechanisms are still not fully elucidated, and their investigation at the level of small metabolites remains insufficient. Considering this, the present section discusses the main hypotheses related to the occurrence and modulation of the compounds most frequently reported as potential CRC biomarkers. For this purpose, the aforementioned candidate markers of CRC were correlated with molecular pathways previously described as related to this disease. Only mechanisms that could be linked to the altered metabolites reported in the previous section are presented. Throughout the following subsections, references are made to the observations presented in Table 2.

Alterations in Cell Energetics
During tumorigenesis, cells require a reprogramming of energy generation. Apart from the favoring of glycolysis (Warburg effect) [76–78], perturbations of the TCA cycle are also documented. The TCA cycle forms intermediates that can be incorporated into biosynthetic pathways; therefore, this cycle can be found to be partially down-regulated, owing to the need of malignant cells for precursors [78,79]. In addition, alterations in genes related to the expression of TCA cycle enzymes are reported [80]. The beta-oxidation of fatty acids seems to be triggered by these metabolic disturbances. It has been demonstrated that this process can be stimulated in parallel to serve as another source of energy, since fatty acids are broken down to give rise to acetyl-CoA, ATP (adenosine triphosphate) and reduced coenzymes [81].
In most of the studies related to CRC, lactic acid (M54) is found to be increased in biological matrices; this can be due to the enhanced secretion of this metabolite during upregulated glycolysis in an oxygen-deprived environment [82]. Partial inactivation of the TCA cycle is evidenced in CRC, with its intermediates citric, isocitric, succinic and fumaric acids (M47, M51, M61 and M63, respectively) showing decreasing trends in urine in comparison with control samples.
Acetic acid (M44) was pointed out as a potential cancer indicator on the basis of fecal samples from individuals with cancer. Acetate is mostly sourced from intestinal fermentation performed by bacteria; however, recent research has shown that acetate can also be produced from pyruvate (yielded at the end of glycolysis), a conversion promoted by reactive oxygen species (ROS) [83,84]. The assimilated acetate can be converted to acetyl-CoA and may serve as a substantial energy source in a nutrient-poor environment and under hypoxic conditions, which are characteristic of tumors [84,85].

Structural Self-Maintenance
The glycerol produced in the cytosol can be phosphorylated and submitted to consecutive reactions, covering the generation of phospholipids and glycerides. In this sense, the glycerolipid pathway comprises a source of building blocks required for cell survival and growth; concurrently, produced lipids have a function as signaling molecules with pro-tumorigenic properties in cancer [86,87]. When compared to control fecal samples, glycerol (M6) was found in lower abundance. Apart from this, monoacylglycerol (M7), an intermediate of the same pathway, also presented lower responses in the positive cohort. These observations can be connected with the greater consumption of glycerol in CRC, due to the enhancement of glycerolipid biosynthesis.
Lipogenesis also appears upregulated in cancers, generating material for cell membrane building and saturation, and for the biosynthesis of lipid-signaling molecules [88,89]. Additionally, the lipolytic pathway seems to be exploited by cancer cells as a way of using exogenous fatty acids to enhance their own growth [89]. In this context, species related to the lipolysis pathway are also expected to exhibit alterations. Linoleic (M55) and oleic (M57) acids, the main constituents of cell membranes, are recurrently reported as decreased in fecal samples of CRC individuals, an observation consistent with the withdrawal of fatty acids. Extending this analysis and considering an excessive breakdown of fatty acids, a greater occurrence of ketone bodies can be expected. Both 3-hydroxybutanoic acid (M53) and acetone (M21) were found at increased levels in the urine and breath of CRC patients; both compounds are ketone bodies derived from acetoacetate (by reduction and decarboxylation, respectively) in the final stage of the lipolysis pathway. Figure 2 presents a scheme of the main pathways possibly altered in CRC.
During carcinogenesis, a series of molecular signals culminates in the activation of the transcription of genes linked to the mevalonate pathway, causing dysfunctions in the levels of intermediate metabolites. This would be a mechanism by which tumor cells restore the levels of molecules that are products of this pathway and have important structural functions [90,91]. Terpenoids indicated as cancer biomarkers may originate from as yet unrevealed mechanisms relying on aberrant activity of the mevalonate pathway, or may even represent intermediates of this pathway that were not properly identified by the spectral library. Besides this, the differential occurrence of such compounds could be associated with deficient metabolization of secondary plant metabolites coming from the diet. p-Cymene (M41) and γ-terpinene (M42) were terpenoids found increased in CRC urine specimens, and β-pinene (M43) in breath samples.

Oxidative Stress
The concentration of ROS is often reported to be elevated in tumors; these species can be produced in the cell as a consequence of enhanced metabolic activity and mitochondrial damage, but they also play an important role in cancer signaling pathways [92,93].
Oxidative stress also favors lipid peroxidation, generating products such as linear alkanes, aldehydes and alcohols (Figure 3) [94]. Nonanal (M15) and decanal (M12) were present in greater amounts in the breath of CRC patients; these compounds can be directly associated with the oxidation of the main lipids constituting the cell membrane. Nonanal can be produced by both α- and β-scission, while decanal may arise from the β-fragmentation of oleic acid [95]. Alkanes such as dodecane (M37), 2-methylbutane (M30), 2-methylpentane (M31), 3-methylpentane (M32) and 4-methyloctane (M35), also elevated in the exhaled air of the positive cohort, are likely other products of lipid oxidation that have not been precisely described yet.

Alterations in Enzyme Catalytic Activity
The activity of cytochrome P450 is reported to be altered in different neoplasms through the differential expression of its isoforms. This factor can influence the bioavailability of substrates and their products, as well as their pattern of excretion in cancer cases [96,97]. Alcohol dehydrogenase and aldehyde dehydrogenase show significantly higher activities in several tumors than in normal tissues [98]. It is reported that the reduced coenzymes produced by the action of these enzymes may support ATP production [99].
Increased activity of aldehyde dehydrogenase in tumors can transform the aldehydes released during lipid peroxidation into the corresponding carboxylic acids. Secondary ketones may then arise from carboxylic acids undergoing β-oxidation, a mechanism boosted in cancer cells, as previously mentioned. Such observations regarding metabolic dynamics in cancer may explain why the n-aldehydes heptanal (M13) and hexanal (M14) were decreased in the urine of individuals belonging to the positive group, while the ketones 4-heptanone (M18) and 2-pentanone (M20) presented a greater response in urine samples from diseased individuals. Because urine is a fluid associated with the elimination route of the organism, it is reasonable to assume that the molecules it contains are in their metabolized form. Compounds such as cyclohexane (M36), methylcyclohexane (M39) and xylene (M27, M28) in the breath were pointed out as CRC markers. These compounds are traditionally regarded as exogenous, coming from the environment. The endogenous synthesis of such VOCs has not yet been described in humans; nevertheless, bacterial shikimate and derived pathways could play a part in the occurrence of similar aromatic compounds. Another hypothesis is that the enzymes responsible for metabolizing these exogenous compounds are impaired during cancer through altered isoform expression, so that increased levels of these cyclic and aromatic VOCs are frequently observed among diseased patients.

Contribution of the Microbiota
One of the most important roles of intestinal bacteria is the fermentation of saccharides that cannot be digested by humans. Dietary fibers can exhibit different levels of susceptibility to fermentation and, when poorly fermentable, appear to be related to colonic diseases [100]. In bacterial fermentation, released monosaccharides are converted into pyruvate, from which short-chain fatty acids are produced; the major species formed are acetate, butyrate and propionate [100,101]. In fecal samples collected from CRC patients, acetic (M44) and propionic (M59) acids were registered with increased responses. The indexed literature reports butyric acid (M46) as a discriminant feature in CRC; however, diverging trends in butyric acid levels are documented.
The minor metabolites of the fermentation pathways are methane, hydrogen sulfide, ethanol and formate [102]. Methane (M25) was reported at higher concentrations in the breath of colon cancer subjects. In intestinal gas samples, the hydrogen sulfide concentration was elevated, while its precursor, hydrogen, was decreased [103]. Ethanol (M5) measured in exhaled air differed between healthy and diseased subjects. Succinate (M61), a propionate precursor, was present in urine at lower abundance; correspondingly, propionic acid (M59) itself was detected in higher amounts in feces. Fermenting microorganisms can convert acetone to 2-propanol (M3) [104], a compound found to be increased in fecal samples in CRC cases. A scheme of microbial fermentation pathways in the gut environment is depicted in Figure 4.
An alternative fermentation pathway through acetoin metabolism may be performed by bacteria. This path involves the formation of 2,3-butanedione (M22) and 2,3-butanediol (M10) [106,107]. The production of 2,3-butanedione is believed to be supported by an acidic pH and a low-oxygen environment, conditions compatible with the tumor micro-region. This substance appeared in elevated amounts in urine samples of CRC subjects.
Based on the above, and in accordance with the reviewed studies, an accumulation of some short-chain fatty acids in CRC is evident. In fact, butyrate (M46) oxidation has been shown to be reduced in CRC tissue through the down-regulated expression of short-chain acyl-CoA dehydrogenase [108,109]. Specifically, butyrate promotes the development of normal cells and inhibits histone deacetylases, which makes it reasonable to assume that malignant colonocytes undergo a metabolic shift to avoid butyrate as an energy source [109].
Mucins are glycosylated proteins frequently overexpressed by different epithelial cancer cells. They play a role in the control of the inflammatory response, and it is believed that their molecular apparatus is employed by tumor cells to stimulate cell growth and survival [110]. In parallel, it has been demonstrated that the cancer environment can favor bacterial genera that perform mucin degradation in the intestine [111,112]. Lower levels of sugars and derivatives (M90-99) in feces and urine from CRC patients are indicative of the recruitment of these species for fermentation. Apart from this, free amino acids (M73, M74, M76, M79, M80, M82, M83, M85, M86, M89) were found to be elevated in fecal samples, suggesting degradation of mucins and other proteins in the colon.
The microbial catabolism of proteins in the gut is another factor to be considered [113]. Proteolytic fermentation is considered unfavorable, since reactive products can be generated, leading to tissue inflammation, a process that can serve both to initiate and to promote colorectal cancer [114]. Neoplastic lesions appear to be induced by the formation of detrimental amino acid metabolites, such as p-cresol (M4) [115] (Figure 5A). Like p-cresol, phenyl acetate (M23) was also found in elevated amounts in the urine of CRC patients. The latter can be derived, for example, from phenylalanine catabolism by the microbial aldoxime-nitrile pathway [116] (Figure 5B).
Hydrogen sulfide and methanethiol are harmful species produced in the bowel by the reduction of sulfate-derived substrates [117,118] (Figure 5C). Stimulated production of sulfur-containing species has been reported in colorectal cancer as an indicator of alterations in gut bacterial activity driven by carcinogenesis [118]. Potentially related to this, the levels of dimethyl disulfide (M65) and 2-methoxythiophene (M64) appeared altered in positive urine samples.
Elevated amounts of free amino acids in feces, a sample directly related to the colon, can also be indicative of poor nutrient absorption in the previous steps of digestion. The observed increase in the concentration of the aforementioned metabolites in CRC specimens indicates that the CRC-associated gut microbiota actively performs amino acid catabolism, which contributes to colon inflammation and may constitute a tumor-promoting mechanism by triggering an inflammatory response that activates cell division.

Conclusions
Considering the reviewed articles, it is possible to observe that varied methodologies have been employed for the study of the CRC metabolome. The investigation of different biological specimens has permitted observation of varied aspects of changes in metabolism, as each matrix involves different physiological mechanisms. Chemometric approaches were prevalent among the reviewed studies and proved indispensable for processing the complex data acquired. Such statistical evaluations make it possible to identify discriminant features and latent patterns, as well as to express the method's performance. The most prevalent altered compounds observed in the investigated profiles were hydrocarbons, short-chain fatty acids, and amino acids and their derivatives. These chemical species could be correlated with general cancer mechanisms and with specific pathways affected during CRC. The indexed candidate biomarkers could be regarded as metabolites of both the human host and the bacteria of the gut microbiota. The inspected studies indicate the promising status of the analysis of small molecules, using non-invasive approaches, for the detection of CRC with an accuracy greater than that of the screening tests already available. Besides that, GC-based tests have great potential to become affordable and standardized methods able to meet broad demand. There is now a growing body of evidence supporting metabolite-based diagnosis of CRC; however, data inconsistency due to the use of diverse protocols is observed, emphasizing the need for validation strategies. Yet the value of independent studies cannot be diminished, because the gathered evidence has great elucidative value for understanding particular mechanisms associated with CRC. In this way, studies on molecular patterns can be applied for diagnostic purposes and can also serve as a powerful tool for the interpretation of disease mechanisms at the molecular level.