1. Introduction
Colorectal cancer (CRC) is the third most prevalent cancer worldwide [1, 2, 3]. It is more prevalent in the West than in other parts of the world. In the US, there are disparities among different ethnic/racial groups [4, 5]. African Americans (AAs) have a high incidence of, and mortality from, this disease [6, 7]. Several contributing factors have been proposed and investigated, including genetics, epigenetics, diet, socio-economic status and access to healthcare [8, 9, 10, 11, 12, 13, 14, 15].
Several publications suggest an essential, mutualistic relationship between the host and their colonic microbiota [
16,
17,
18]. A single commensal,
Bacteroides thetaiotaomicron, was shown to induce colonic mucosal gene expression, angiogenesis and immune responses in mouse models of colon cancer, revealing a broader extent of microbe-mucosal communication and cross-regulation than previously recognized [
19]. Similar findings were also obtained in colorectal cancer mouse models with enterotoxigenic
Bacteroides fragilis [
20,
21]. The human colon harbors a greater number and diversity of organisms, primarily bacteria, than any other organ in the human body [
22,
23]. Molecular analysis of the colonic luminal and mucosal microbiota indicates that individuals harbor unique microbiotas that are fairly stable along the colonic axis. However, the mucosal microbiota is either distinct or contains only a subset of the bacterial phylotypes identified in the luminal fecal samples [
24,
25].
The primary local environment to which the colonic mucosa is exposed is created by the microbiota of the colon and their metabolic products that include beneficial non-carcinogenic components such as short-chain fatty acids as well as harmful compounds including toxins and other proliferation-promoting metabolites [
26,
27,
28]. Two divisions of bacteria (
Bacteroidetes and
Firmicutes) are considered dominant in the cultured colonic microbiota.
Actinobacteria were also reported as prevalent in the intestinal tract but their presence has been underestimated in Polymerase Chain Reaction (PCR)-based approaches [
29].
Several recent studies, using next generation sequencing technologies, have set the framework for metagenomic studies in general and for the gut microbiota in particular [
24,
25,
30,
31,
32,
33,
34,
35,
36]. Huge databases for 16S rRNA genes as well as for gut microbiota functions have been established as a resource for other studies in the field [
37,
38,
39]. We capitalized on such data to run an analysis of stool samples from AAs with colon polyps that showed subtle differences in the microbiota composition at the Operational Taxonomic Units (OTUs) level when compared to healthy individuals [
40]. While that study further supported the presence of oncogenesis-associated changes in the microbiota, it was not able to define bacterial markers with diagnostic potential that might directly affect the colon mucosa and that might serve for screening purposes.
In this study, we performed a microbiome analysis in AAs with colorectal lesions. Bacterial markers of potential diagnostic value were defined and validated in an independent cohort.
4. Discussion
Several studies have recently addressed the issue of microbiota participation and potential roles in colon oncogenic transformation [
35,
36,
37,
38,
39]. Our present study extends those findings with the specific goal of finding gut microbiota markers with diagnostic value, taking into consideration the likely participation of several bacterial actors at once in a process as complex as cancer in the colon that harbors the most diverse microbiome in the human body. We report here the presence of distinct stool metabolomic profiles in patients with colon adenomas when compared to those from healthy subjects. Partial Least Squares Discriminant Analysis (PLS-DA) revealed a close clustering of the adenoma samples’ metabolomes, further confirming different metabolite exposure in the colon mucosa of patients prone to develop colonic lesions. Short-chain fatty acids (SCFAs) were found to be more prevalent in patients with a low risk of colon cancer; other differences were also noted and were assigned to the role the microbiota plays in food processing [
52]. In our samples, butyrate, acetate and propionate were more prevalent in the adenoma samples; however, many amino acids (lysine, glycine, valine and threonine) along with glucose, fatty acids and glycerol were higher in normal samples. While the higher SCFA levels in the adenoma samples run counter to conventional wisdom, the low levels of amino acids, glucose and glycerol might contribute to the disruption of colon homeostasis, as they create an environment with fewer nutrients available to the colonocytes. Dai et al. reported that select amino acids are rapidly utilized by single bacteria such as
Escherichia coli,
Klebsiella sp. and
Streptococcus sp. and bacterial mixtures, in a bacterial species and gut segment-specific manner [
53]. As such, the amino acid abundance in the analyzed samples likely reflects differences in microbiota composition. Based on our results, it seems that the adenoma patients’ microbiota is more efficient at using amino acids than that of healthy subjects, which leads to a colonic environment that is depleted of essential amino acids in adenoma patients. Our finding highlights the potential use of amino acid quantification as a tool for detecting colon cancer presence or predisposition. Indeed, Yatabe et al. [
54] previously reported the development of an Amino acids Index Cancer Screening (AICS) test that allowed the early detection of colon cancer in patients without clinical symptoms. Amino acids among other nutraceuticals have already been used as supplements to modulate the gut microbiota with the goal of reducing inflammation and maintaining colon homeostasis [
55]. Such supplementation might represent a first line of intervention that could reduce the incidence of colon neoplasia in this population.
The analysis of the colon cancer samples revealed that at the phylum level,
Firmicutes and
Fusobacteria were more prevalent in cancer tissues, while
Bacteroides,
Proteobacteria, and
Verrucomicrobia were prevalent in the matched normal tissues (
Figure 2). At the genus level,
Streptococcus,
Prevotella,
Fusobacteria,
Lactobacillus,
Veillonella,
Gemella,
Enterococcus and
Actinomyces were more strongly represented in the cancer samples when compared to their matched normal samples (
Figure 2b,c). Strikingly, the differences between cancers and matched normal samples were noted primarily for
Fusobacteria,
Prevotella and
Streptococcus bacteria. At the individual level,
Fusobacteria 16S rRNA gene sequences were more prevalent in the cancer than in the matched normal tissue in 7 out of 10 patients; the remaining three showed nearly equal amounts in both.
Fusobacteria were reported as prevalent in colon cancer tumors when compared to normal matched tissues [
56]. Since then, several studies attempted to determine whether
Fusobacteria drive the oncogenic process or merely benefit from it. Our 16S rRNA gene data from stool samples do not substantiate an early role of
Fusobacteria as they were barely detectable in the adenoma stool samples. Also, our previous work on polyp patients [
40] did not reveal any
Fusobacteria in stool samples from patients with colon polyps. One might assume that this bacterium is mainly adherent and will therefore be detected in colon biopsies rather than in stool samples. Indeed, McCoy et al. reported an association of
Fusobacteria with colon adenoma tissues [
57]. However, a recent study that analyzed
Fusobacteria presence at different stages of the oncogenic path found that the bacterium becomes more significant in high-grade dysplasia stages, not before [
58]. Moreover, the levels of
Fusobacteria in the stools did not correlate with their levels in the cancer or advanced adenoma tissues of the same individuals [
58], making this bacterium unlikely to be a good marker for stool-based non-invasive CRC screening. However, it was shown that patients with high levels of
Fusobacteria in their colon have lower survival, making it a potentially good prognostic marker [
58].
The second major group of bacteria that showed major differences between cancers and matched normals was
Streptococcus. These strains were more prevalent in the cancer samples vs. matched normals in all pairs of samples as well as when combined. This was true for all detected
Streptococcus OTUs. In the adenoma stool samples, the two major detected
Streptococcus bacteria, namely
Streptococcus mitis et rel. and
Streptococcus bovis et rel., showed higher prevalence in the adenoma samples when compared to the healthy subjects’ stool samples’ microbiota. This finding defines a directional change that seems to be consistent in the two sets of analyzed samples. In our previous study on colon polyp patients’ stool samples [
40],
Streptococcus bovis was detected; however, the difference between the polyp vs. normal was not as pronounced as the one observed between adenomas vs. healthy subjects or cancers vs. matched normals, reported here. As such, the
Streptococcus gradual increase in prevalence across adenoma stool samples and colon cancer tissues seems to be a robust finding of the above results. Indeed, several reports have cited an association of
Streptococcus strains with colon cancer occurrence [
59,
60,
61,
62].
Streptococcus bacterial strains are known as a large and dynamic component of the small intestine [
63]. Would this resurgence in colon cancer tissues and adenoma stool samples correspond to a bacterial translocation with deleterious effects on colon homeostasis? This remains to be further explored. It will also be interesting to see if
Streptococcus strain resurgence in the colon is involved in the amino acids shortage reported in the metabolomic analysis. Indeed,
Streptococcus strains, among other bacteria, have been described as rapid utilizers of amino acids [
53,
64].
While our intent is to develop a stool-based screening test for CRC, a recent publication reported the prevalence of
Fusobacteria,
Prevotella and
Streptococcus bacteria in laryngeal carcinomas from throat microbiota analysis [
65]. These same bacteria were prevalent in our cancer specimens. Their association with laryngeal cancer and colon cancer might open the door for an analysis of upper GI flora (e.g., oral flora) for the assessment of cancer risk along the GI tract, rather than relying on stool samples for such a goal.
Our goal for this study was to define a panel of bacterial markers for non-invasive colorectal cancer screening tests; because of bacterial genomic plasticity, the 16S rRNA gene description alone is not always sufficient to achieve this goal. Mapping the metagenomic stool sequencing reads against the bacterial genes identified in the tissue samples yielded 30 tissue microbiota genes, 6 of which had discriminative power between adenoma vs. healthy subjects on one hand and cancer vs. matched normals on the other. Similarly, mapping the reads from the tissue samples to annotated sequences from the stool sample data identified 4233 genes, only 8 of which displayed discriminative power between adenoma vs. healthy subjects and between cancers vs. matched normals. The 14 combined discriminatory sequences (6 from tissue and 8 from stool metagenomic data) reduced to 9 unique sequences after the removal of 5 duplicates. These sequences showed statistically significant discriminatory potential in adenoma vs. healthy subjects’ stool samples and a marked, although not statistically significant, one in cancer vs. matched normal samples. This finding is highly relevant to the goal of our study, since it lays the foundation for developing a stool-based non-invasive screening test at the preneoplastic stages.
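The bookkeeping behind the combined panel can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function name `combine_marker_panels` and the identifiers `seq_1` to `seq_9` are hypothetical placeholders, chosen only so that the counts match those reported above (6 tissue-derived and 8 stool-derived candidates, 5 of them shared, 9 unique).

```python
def combine_marker_panels(tissue_markers, stool_markers):
    """Merge two candidate lists, keeping first-seen order and dropping repeats.

    Returns (unique, duplicates): the de-duplicated panel and the identifiers
    that were encountered a second time.
    """
    seen = set()
    unique = []
    duplicates = []
    for seq_id in list(tissue_markers) + list(stool_markers):
        if seq_id in seen:
            duplicates.append(seq_id)
        else:
            seen.add(seq_id)
            unique.append(seq_id)
    return unique, duplicates


# Hypothetical placeholder identifiers (not real gene or sequence names),
# arranged so that 5 identifiers occur in both lists.
tissue_markers = [f"seq_{i}" for i in range(1, 7)]   # seq_1 .. seq_6 (6 items)
stool_markers = [f"seq_{i}" for i in range(2, 10)]   # seq_2 .. seq_9 (8 items)

unique, duplicates = combine_marker_panels(tissue_markers, stool_markers)
print(len(tissue_markers) + len(stool_markers), len(duplicates), len(unique))
```

In practice, deciding that two sequences are duplicates would depend on sequence identity or annotation rather than on matching labels; the sketch assumes that step has already assigned a shared identifier to equivalent sequences.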
These sequences mapped to Streptococcus sp. VT_162, Acinetobacter baumannii AC12 and Sphingomonas sp. PM2-P1-29. Neither Acinetobacter nor Sphingomonas showed differences between adenomas vs. healthy subjects or cancers vs. matched normals. However, as stated above, the Streptococcus differences at the 16S rRNA gene level were noted in both sets of data and were associated with the diseased samples (cancers and adenomas). Our findings were further validated through the use of highly specific Streptococcus sp. VT_162 primers and a probe in q-PCR experiments from independent cohorts. 16S rDNA detection was not significant in polyp and adenoma samples but was statistically significant in advanced adenoma (p = 0.041) and cancer (p < 0.00013) samples when compared to normal stool samples. These findings are notable given that the significant p values were obtained with samples from the Hong Kong validation cohort, whereas the bacterium was identified in African American patients.
Streptococcus sp. VT_162 is a bacterium that was first isolated from the saliva of pediatric oncohematology patients [
66]. The fact that this bacterium has been isolated in such a context gives further credibility to our findings. While the rapid amino acid-utilizing
Streptococcus strains were consistently present in the adenoma and cancer samples at the 16S rRNA gene and metagenomics levels, the metagenomic functions that associate with amino acids metabolism in the analyzed samples showed no significant differences in adenoma samples compared with healthy subjects.
It is noteworthy that
Fusobacteria that have been found in our study and many others as associating with colon cancer are agents of periodontal disease [
67]. The fact that both
Streptococcus sp. VT_162 and
Fusobacteria are oral bacteria already known for their involvement in oncohematology and periodontal inflammation might support a possible involvement in colon oncogenic transformation. As reported above, these two groups of bacteria along with
Prevotella have already been associated with laryngeal cancer [
65], and as such, this group might potentially be used for oral microbiota assessment of CRC risk.
These findings will need to be validated in a larger population of patients that includes different stages of the carcinogenic process and different ethnic backgrounds to establish the specificity and sensitivity of the discovered markers. This study also stresses the possible use of oral flora as a surrogate for assessing the risk of colorectal and other gastrointestinal cancers, among other associated health disorders.