Review

From Dysbiosis to Prediction: AI-Powered Microbiome Insights into IBD and CRC

1 Independent Researcher, 966, Sangmu-daero, Seo-gu, Gwangju 61993, Republic of Korea
2 Department of Pharmacy, Yonsei University School, Seoul 03722, Republic of Korea
3 Department of Medicine, CHA University School, Pocheon 487-800, Republic of Korea
4 Department of Medicine, Daegu Catholic University School, Daegu 42472, Republic of Korea
5 Independent Researcher, Seoul 04318, Republic of Korea
6 Department of Medicine, Yonsei WonJu University School, Seoul 26493, Republic of Korea
7 Department of Medicine, Chung-Ang University School, Seoul 06973, Republic of Korea
8 Department of Medicine, Kosin University School, Busan 49267, Republic of Korea
9 Department of Internal Medicine, Kosin University College, Busan 49267, Republic of Korea
* Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Gastroenterol. Insights 2025, 16(3), 34; https://doi.org/10.3390/gastroent16030034
Submission received: 3 August 2025 / Revised: 29 August 2025 / Accepted: 6 September 2025 / Published: 11 September 2025
(This article belongs to the Special Issue Advances in the Management of Gastrointestinal and Liver Diseases)

Abstract

Recent advances in the integration of artificial intelligence (AI) and microbiome analysis have expanded our understanding of gastrointestinal diseases, particularly inflammatory bowel disease (IBD), colitis-associated colorectal cancer (CAC), and sporadic colorectal cancer (CRC). The progression from IBD to CAC is mechanistically linked through chronic inflammation and microbial dysbiosis, whereas recent evidence also implicates distinct dysbiotic patterns in sporadic CRC. In this review, we examine how machine learning (ML) and AI have been applied to microbiome and multi-omics data to enable the discovery of non-invasive microbial biomarkers, refined risk stratification, and prediction of treatment response. We highlight how emerging computational frameworks, including explainable AI (xAI), graph-based models, and integrative multi-omics, are advancing the field from descriptive profiling toward predictive and prescriptive analytics. While emphasizing these innovations, we also critically assess current limitations, including data variability, the lack of methodological standardization, and challenges in clinical translation. Collectively, these developments position AI-powered microbiome research as a driving force for precision medicine in IBD, CAC, and sporadic CRC.

1. Introduction

The gut microbiome has been comprehensively analyzed over the past two decades, notably through the Human Microbiome Project, which provided data resources and characterized host–microbiome interactions across various body sites, laying the groundwork for subsequent microbiome research [1]. To date, extensive studies have identified robust associations between the gut microbiota and a wide range of digestive diseases, including irritable bowel syndrome (IBS), inflammatory bowel disease (IBD), colorectal cancer (CRC), and metabolic or hepatic disorders such as non-alcoholic fatty liver disease, all commonly linked to a state of microbial imbalance referred to as dysbiosis [2,3,4,5].
Among such digestive diseases, IBD and colitis-associated colorectal cancer (CAC) have received particular attention due to their mechanistic link: chronic intestinal inflammation is a well-established risk factor for tumorigenesis via mechanisms such as epithelial proliferation and oxidative stress injury. In addition, IBD and CAC demonstrate partially overlapping patterns of microbial dysbiosis, including reduced diversity and enrichment of pro-inflammatory taxa, though these signatures may differ from those seen in sporadic CRC [6,7,8]. Moreover, microbial metabolites, including bile acids, which reshape gut microbial composition and modulate intestinal immunity, play a crucial role in the development and progression of IBD, CAC, and sporadic CRC [9].
IBD, encompassing Crohn’s disease (CD) and ulcerative colitis (UC), affects about 10 million individuals worldwide and is steadily rising in newly industrialized regions [10]. Similarly, CRC is the third most commonly diagnosed cancer and the second leading cause of cancer-related mortality, accounting for nearly one million deaths annually [11,12]. Given that IBD is a chronic and progressive inflammatory disease—and long-standing colonic inflammation increases the risk of malignant transformation—the need for early, non-invasive diagnosis and microbiome-based prediction of treatment response has grown [13,14,15,16].
Specific microbes, such as Fusobacterium nucleatum, an anaerobic Gram-negative bacterium, have been shown to promote persistent inflammation by stimulating the release of proinflammatory cytokines (e.g., IL-1β, IL-6, and tumor necrosis factor (TNF)-α), to activate oncogenic signaling pathways that encourage the epithelial-to-mesenchymal transition (EMT) in CRC, and to accelerate oncogenesis in inflammation-induced models of CAC [17,18]. Additionally, genotoxic metabolites produced by certain microbes, such as colibactin-producing Escherichia coli, can induce DNA double-strand breaks and a specific mutation in the APC gene [19]. Moreover, inflammation-driven changes in microbial composition have been shown to promote CAC development in colitis-susceptible mouse models, implicating specific taxa such as F. nucleatum and pks+ E. coli [20].
Dysbiosis plays a pivotal role not only in inducing chronic inflammation in IBD and driving CAC but also in modulating metabolism, immune responses, and genotoxicity in sporadic CRC [21,22]. Together, these findings refine the traditional concept of an IBD–CAC continuum by highlighting how the gut microbiome plays distinct yet significant roles in both CAC and sporadic CRC. This recent insight allows for more precise risk prediction, earlier detection, and therapeutic decisions.
Microbiome sequencing methods allow researchers to scrutinize the genomic composition and function of microbial communities. The rapid evolution of sequencing technologies has greatly enhanced our ability to investigate the complexity of the microbiome. Among these, 16S ribosomal ribonucleic acid (rRNA) gene sequencing is commonly used for genus-level classification by examining the hypervariable regions (V1–V9) of the bacterial 16S rRNA gene. For instance, dual-region 16S rRNA analysis identifies rare microbial taxa in patients with UC, improving clarity and specificity compared with reliance on a single 16S rRNA region [23]. Shotgun metagenomic sequencing involves fragmenting total microbial DNA and assembling the resulting sequences to identify community composition and function through gene prediction. A meta-analysis of shotgun sequencing demonstrated that this method allows predictive clustering of microbiomes from CRC patients, and that both taxonomic and functional classification models generalize well across studies [24]. Beyond taxonomy, integrating transcriptomics, proteomics, and metabolomics into multi-omics analyses facilitates mapping of host–microbiome interactions, enabling systems-level insights. For example, omics-based signatures have been shown to serve as diagnostic markers in IBD that differentiate CD from UC [25].
The complexity and high dimensionality of microbiome data, particularly multi-omics data, necessitate advanced computational analysis based on AI and ML. Broadly, these models fall into several categories:
(1) Supervised learning models (e.g., Random Forest (RF), Support Vector Machines (SVMs), and Extreme Gradient Boosting (XGBoost)).
(2) Unsupervised learning models (e.g., Principal Component Analysis (PCA) and k-means clustering).
(3) Deep learning models (e.g., Artificial Neural Networks (ANNs) and Convolutional Neural Networks (CNNs)).
(4) Graph-based models (e.g., Graph Neural Networks (GNNs) and Graph Convolutional Networks (GCNs)).
(5) Explainable AI (xAI) (e.g., SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME)) [26].
Among classical ML methods, RF has shown strong performance on microbiome data. RF aggregates multiple decision trees, each trained on a bootstrap sample of the data using randomly sampled features, and combines their outputs by voting or averaging, thereby improving accuracy and reducing overfitting compared with a single tree model [27].
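The bagging principle described above can be illustrated with a minimal sketch. It is a toy under stated assumptions, not a production RF: each "tree" is a depth-one decision stump, the abundance table is invented, and the function names (fit_stump, fit_forest, predict_forest) are hypothetical.

```python
import random

def fit_stump(X, y, features):
    """Depth-one tree: pick the (feature, threshold, polarity) with fewest training errors."""
    best = None
    for j in features:
        for t in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [1 if polarity * (row[j] - t) > 0 else 0 for row in X]
                errors = sum(p != label for p, label in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, polarity)
    return best[1:]

def predict_stump(stump, row):
    j, t, polarity = stump
    return 1 if polarity * (row[j] - t) > 0 else 0

def fit_forest(X, y, n_trees=25, n_features=2, seed=0):
    """Bagging: every stump sees a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]        # sample rows with replacement
        feats = rng.sample(range(len(X[0])), n_features)  # random subset of taxa
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest

def predict_forest(forest, row):
    """Majority vote across all stumps."""
    votes = sum(predict_stump(s, row) for s in forest)
    return 1 if 2 * votes > len(forest) else 0

# Hypothetical abundances (columns: one enriched taxon, two depleted taxa).
X = [[1.0, 12.0, 11.0], [2.0, 10.0, 13.0], [3.0, 13.0, 10.0], [4.0, 11.0, 12.0],  # controls
     [10.0, 2.0, 3.0], [11.0, 4.0, 1.0], [12.0, 1.0, 4.0], [13.0, 3.0, 2.0]]      # cases
y = [0, 0, 0, 0, 1, 1, 1, 1]

forest = fit_forest(X, y)
print(predict_forest(forest, [20.0, 0.0, 0.0]), predict_forest(forest, [0.0, 20.0, 20.0]))
```

Because each stump is trained on a different resample and a different feature subset, a single unlucky resample has little influence on the majority vote, which is the intuition behind the reduced overfitting noted above. Real microbiome applications would use full-depth trees and far more features.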
xAI refers to a set of methods designed to clarify the decision-making of complex machine learning models. Its primary goal is to provide human-understandable explanations that improve transparency and trust, enabling safer use of black-box models in sensitive domains such as healthcare. However, practical limitations constrain the clinical applicability of these methods. They are computationally expensive, which hinders their scalability and feasibility in real-time settings [28]. Moreover, xAI methods suffer from strong model dependency and instability in the presence of feature collinearity, which may yield inconsistent or misleading feature rankings across different models [29]. In particular, LIME approximates complex relationships with local linear models, causing loss of non-linear dependencies, while SHAP explanations may be distorted when correlated predictors are present. Together, these issues illustrate that although xAI provides tools to interpret “black-box” models, current approaches still struggle to balance interpretability, computational cost, and reliability of outputs, underscoring the need for cautious interpretation in biomedical applications.
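The collinearity problem can be made concrete with a small sketch that computes exact Shapley attributions by enumerating feature orderings. It assumes a simple baseline-replacement value function (absent features are set to a reference value) rather than the approximations used by the SHAP library, and the two models and the function name shapley_values are hypothetical.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley attributions: average marginal contribution over all feature orderings.

    Assumption: a feature that is 'absent' takes its baseline value.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = model(current)
        for i in order:
            current[i] = x[i]             # switch feature i from baseline to observed value
            value = model(current)
            phi[i] += value - previous    # marginal contribution of feature i in this ordering
            previous = value
    return [p / len(orderings) for p in phi]

# Two hypothetical models that agree on every sample where taxon_a == taxon_b
# (perfectly collinear predictors), yet use the features differently.
def model_a(v):
    return 2.0 * v[0]

def model_b(v):
    return v[0] + v[1]

sample, reference = [3.0, 3.0], [0.0, 0.0]
print(shapley_values(model_a, sample, reference))
print(shapley_values(model_b, sample, reference))
```

Both models return the same prediction for this sample, but the first assigns all credit to one feature while the second splits it evenly. On collinear data the two models are indistinguishable by accuracy, so the resulting feature ranking reflects the fitted model rather than the underlying biology.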

2. From Dysbiosis to Carcinogenesis: A Pathophysiological Perspective

Chronic inflammation and intestinal microbial dysbiosis are well-established contributors to the development of CAC and play essential roles in the transition from IBD to CAC, forming a mechanistically connected disease continuum. Recent evidence suggests that the microbiome not only regulates mucosal immunity and disrupts epithelial barrier function but also acts as an active driver of neoplastic transformation through microbial toxins and metabolites such as colibactin, dysregulated bile acids (BAs), and altered short-chain fatty acids (SCFAs). In this section, we cover multi-kingdom ecological disruption involving bacteria, viruses, archaea, and fungi, and its downstream effects across UC, CD, and CAC. We also compare these with sporadic CRC to identify distinct and overlapping mechanisms, providing a mechanistic basis for the AI-guided diagnostic and therapeutic strategies discussed later.

2.1. Etiology and Epidemiologic Relationship

The risk of CRC is increased in both UC and CD, with UC posing a higher risk of CAC due to persistent colonic inflammation [30,31]. CAC occurs without adenomatous precursors and follows a mutation sequence distinct from that of sporadic CRC: an early TP53 mutation followed by APC loss [32,33]. It typically appears earlier, exhibits multifocality, and is triggered by inflammation. Global epidemiological trends show simultaneous increases in IBD and CRC in industrialized regions, which are associated with westernized diets, antibiotic overuse, and loss of microbial diversity [31,32]. Major risk factors for IBD-associated CRC include disease duration (>10 years), pan-colitis, persistent histologic inflammation despite endoscopic remission, coexisting primary sclerosing cholangitis (PSC), early-onset disease, CRC family history, and male sex [30,31,33]. These factors guide surveillance strategies and suggest that microbial disturbance plays an active pathogenic role rather than merely serving as a bystander.

2.2. Pathophysiology vs. Healthy Control (HC)

Intestinal microbiota disruptions precede and potentiate mucosal inflammation, shaping host–microbe dynamics across IBD and CAC. These changes include bacterial shifts, virome remodeling, archaeal enrichment, and fungal overgrowth. Collectively, these changes impair barrier function, fuel immune dysregulation, and activate carcinogenic signaling as depicted in Figure 1.

2.2.1. Microbiome Community Disruption

Bacteria
Healthy individuals possess diverse microbial communities enriched in Faecalibacterium prausnitzii, Roseburia, and Akkermansia muciniphila. These commensals produce SCFAs, which support barrier integrity, regulate immune tolerance, and exert anti-inflammatory effects [34,35]. IBD patients, especially those with CD, show reduced alpha diversity, Firmicutes depletion, and expansion of Proteobacteria. Among the latter, adherent-invasive E. coli (AIEC) plays a pro-inflammatory role: it disrupts barrier function by adhering to and invading epithelial cells and survives within macrophages [36,37]. This bacterial dysbiosis is exacerbated in CAC and, to a lesser extent, in sporadic CRC, where F. nucleatum predominates. F. nucleatum binds to E-cadherin on epithelial cells via its surface adhesin FadA, which activates β-catenin signaling and promotes epithelial proliferation [38]. It also interferes with immune surveillance and induces chemoresistance by activating autophagy through the TLR4-MyD88 axis [39,40]. Accordingly, its presence correlates with advanced tumor stage, poor prognosis, and chemotherapy resistance [39,40]. Parvimonas and Peptostreptococcus are also abundant in both CAC and sporadic CRC, whereas patients with familial adenomatous polyposis (FAP) show milder microbial changes, including depletion of SCFA producers and an increase in opportunists [41].
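Alpha diversity, which is reported as reduced in IBD, is commonly summarized by observed richness and the Shannon index. The following minimal sketch computes both for a single sample; the count vectors are invented for illustration and the function names are hypothetical.

```python
import math

def observed_richness(counts):
    """Number of taxa detected at least once in the sample."""
    return sum(1 for c in counts if c > 0)

def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Hypothetical read counts for four taxa in two samples.
even_community = [25, 25, 25, 25]      # evenly distributed community
dominated_community = [97, 1, 1, 1]    # one taxon dominates, as may occur in dysbiosis

print(observed_richness(even_community), shannon_index(even_community))
print(observed_richness(dominated_community), shannon_index(dominated_community))
```

Both samples have the same richness, but the dominated community has a much lower Shannon index, showing why evenness-aware metrics are often reported alongside richness.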
Archaea
Archaea, especially Methanobrevibacter smithii, are increasingly reported in patients with CRC, showing consistent enrichment across multiple cohorts. AI-driven metagenomic analysis of international cohorts revealed that methanogenesis pathways are upregulated, with enrichment of archaea, along the adenoma–carcinoma sequence [42]. M. smithii consumes hydrogen and produces methane, which modulates intestinal transit and luminal pH. These changes may facilitate the buildup of carcinogenic metabolites and increase oxidative stress in the intestinal mucosa, thereby indirectly favoring carcinogenic signaling. A classifier incorporating archaeal signatures achieved high diagnostic performance (area under the receiver operating characteristic curve (AUC) up to 0.931), which represents excellent discriminatory accuracy for distinguishing CRC cases from controls and positions archaea among the functionally relevant contributors to the tumor-promoting microbiota.
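Since AUC values recur throughout this review, a short sketch of what the metric measures may be helpful. Under the rank-based interpretation, the AUC is the probability that a randomly chosen case receives a higher classifier score than a randomly chosen control, with ties counted as one half. The scores below are invented and the function name roc_auc is hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of case-control pairs ranked correctly (ties count 0.5)."""
    cases = [s for s, label in zip(scores, labels) if label == 1]
    controls = [s for s, label in zip(scores, labels) if label == 0]
    if not cases or not controls:
        raise ValueError("both cases and controls are required")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))

# Hypothetical classifier scores: three CRC cases followed by three controls.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]

# 8 of the 9 case-control pairs are ordered correctly (only 0.4 vs 0.7 is not).
print(roc_auc(scores, labels))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale against which the values reported here should be read.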
Virus/Phages
IBD subtypes exhibit distinct virome expansion, particularly of bacteriophages in the order Caudovirales. The richness of temperate Caudovirales phages (families Siphoviridae and Myoviridae) increases in CD, forming cohort-specific viral profiles [43]. In UC, an elevated but distinct Caudovirales profile is seen. As inflammation progresses to CRC, the virome composition undergoes further shifts: across nine cohorts from France, China, Austria, the USA (including a USA–Canada cohort), Italy, Germany, and Japan, Microviridae were significantly enriched in CRC in five of nine studies, with a positive cross-cohort meta-analytic effect (ℓ = 0.34, q = 0.02). Virome-based models achieved leave-one-cohort-out (LOCO)-validated AUCs of ~0.81 for CRC and ~0.77 for adenoma, values considered acceptable to good in a diagnostic context [44]. Microbial co-occurrence network analysis suggests that phages can regulate bacterial proliferation, inflammation, and horizontal gene transfer along the IBD–CAC continuum.
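Leave-one-cohort-out validation holds out all samples from one cohort for testing while training on the remaining cohorts, and repeats this for every cohort. A minimal sketch of the splitting logic follows; the cohort labels are placeholders and the function name leave_one_cohort_out is hypothetical.

```python
def leave_one_cohort_out(cohort_labels):
    """Yield (held_out_cohort, train_indices, test_indices), one fold per distinct cohort."""
    for held_out in sorted(set(cohort_labels)):
        test_idx = [i for i, c in enumerate(cohort_labels) if c == held_out]
        train_idx = [i for i, c in enumerate(cohort_labels) if c != held_out]
        yield held_out, train_idx, test_idx

# Placeholder cohort label for each sample (one entry per sample).
cohorts = ["cohort_A", "cohort_A", "cohort_B", "cohort_B", "cohort_B", "cohort_C"]

for held_out, train_idx, test_idx in leave_one_cohort_out(cohorts):
    # A model would be fitted on train_idx and scored (e.g., by AUC) on test_idx.
    print(held_out, train_idx, test_idx)
```

Because no sample from the held-out cohort is seen during training, the resulting performance estimate reflects transfer to a new study population, which is typically lower than within-cohort cross-validation.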
Fungi
Fungal dysbiosis, such as the enrichment of Malassezia, Candida, and members of the phylum Basidiomycota, has been observed in both IBD and CRC [45,46]. Fungal components such as mannans, β-glucans, and chitin activate innate immune receptors, including Dectin-1, TLRs, and NLRP3, thus triggering proinflammatory cytokine cascades leading to chronic inflammation [45]. Additionally, fungal dysbiosis is implicated in CRC progression. CRC tumors harbor intra-tumoral fungal heterogeneity, with increasing fungal diversity within different tumor regions from adenoma to advanced CRC [46]. In CD patients, elevated Malassezia restricta exacerbates colitis through CARD9-mediated immunomodulation, which supports the mycobiome–immune axis [47]. Emerging evidence also indicates that fungal dysbiosis may influence inflammation of distal organs such as the skin, thus forming a potential gut–skin axis in conditions such as atopic dermatitis [48]. Therefore, these shifts link mycobiota to gut inflammation and systemic immune crosstalk.
Similar reductions in fungal alpha diversity have also been reported in recent research on Saudi UC cohorts. However, the regional overrepresentation of Candida albicans and M. restricta differed from the fungal composition observed in Western cohorts, where Candida tropicalis or Debaryomyces hansenii may dominate, indicating geographical divergence in the fungal microbiota [49].

2.2.2. Microbial Metabolites and Toxins

Microbial metabolites mediate the transition from inflammation to carcinogenesis. In healthy individuals, compounds such as SCFAs serve as an energy source for colonocytes, regulate immune tolerance, and promote mucosal repair. In dysbiosis, depletion of such beneficial metabolites, accompanied by the accumulation of genotoxins, harmful secondary BA derivatives, and succinate, promotes chronic inflammation, DNA damage, and tumor progression.
Short-Chain Fatty Acids (SCFAs)
SCFAs, primarily acetate, propionate, and butyrate, are produced through fermentation of dietary fiber by symbionts such as F. prausnitzii and Roseburia spp. Acting as signaling molecules via G-protein coupled receptors (GPCRs), they exert effects on intestinal epithelial cells (IECs), immune cells, and enteric neurons. In IECs, they serve as an energy source and maintain epithelial barrier integrity by upregulating tight junction proteins such as claudin-1 and occludin. SCFAs also exert anti-inflammatory effects by acting on IECs and inducing differentiation of regulatory T cells (Tregs) through histone deacetylase (HDAC) inhibition [50]. In IBD and CAC, the loss of SCFA-producing taxa leads to the depletion of SCFAs, thus contributing to mucosal dysfunction and the amplification of inflammatory conditions [51].
Bile Acids and the Bai Operon
Primary BAs synthesized in the liver are converted to secondary BAs by intestinal microbiota through enzymes such as bile salt hydrolases (BSHs) and 7α-dehydroxylases. In dysbiosis, this pathway is enhanced via the bai operon, particularly in Clostridium spp., generating harmful secondary BAs: deoxycholic acid (DCA) and lithocholic acid (LCA) [52]. Acting on the farnesoid X receptor (FXR) and TGR5 (a GPCR for BAs), these secondary BAs promote inflammation and tumorigenesis by enhancing epithelial proliferation [53]. They also induce reactive oxygen species (ROS) and DNA strand breaks and activate NF-κB and Wnt signaling, which fosters senescence and promotes colonic neoplasia. Elevated DCA/LCA levels are observed across the IBD–CAC spectrum with distinct disease-specific profiles [53].
Colibactin and Genotoxic E. coli
E. coli strains harboring the polyketide synthase (pks) island (hereafter denoted as pks+ strains) produce colibactin, a DNA-alkylating agent that induces inter-strand crosslinks and double-strand breaks. Colibactin thereby causes single-nucleotide variants (SNVs) and insertion–deletion mutations with a characteristic mutational signature termed SBS88. These mutations affect tumor suppressor genes such as APC and TP53, promoting carcinogenesis [54]. Detected in both UC and CRC tissues, these pks+ strains resemble extraintestinal pathogenic E. coli and display conserved virulence loci [55]. Their ability to drive oncogenic mutational signatures has been confirmed in organoid and in vivo studies, especially in CAC, where inflammation-induced oxidative stress facilitates genotoxin transmission [54].
Succinate and the “Succinotype”
Succinate accumulation has been proposed to reflect a pro-inflammatory shift in microbial metabolism. In this context, the concept of the “succinotype”, a microbiome stratification tool based on metabolite output, has been introduced. Succinotype 1, characterized by elevated succinate and dominance of Bacteroides and Prevotella, is strongly associated with IBD and CRC [51]. Recent evidence indicates that succinate enhances epithelial proliferation through the succinate receptor SUCNR1 (also known as GPR91), which activates MAPK/ERK signaling [56]. It also suppresses Tregs and enhances inflammation. In CRC, F. nucleatum-derived succinate induces tumor resistance to immune checkpoint blockade therapy. Acting on SUCNR1 on CRC cells, it suppresses the cGAS–interferon-β axis, thereby reducing CD8+ T cell infiltration [57]. In addition, elevated succinate levels are present in FAP and CAC, indicating a conserved metabolic signature.

2.2.3. Carcinogenesis Pathways

Chronic inflammation accelerates neoplastic transformation in IBD. Although CAC and sporadic CRC share end-point mutations such as APC, KRAS, and TP53, the sequence of their occurrence varies. Sporadic CRC follows the adenoma–carcinoma model (APC → KRAS → TP53). On the other hand, CAC begins with TP53 mutation in the inflamed mucosa and acquires APC loss later [33]. This inflammation-driven sequence reflects increased genomic instability. Recent multi-omics classifiers using immune and transcriptomic profiles have demonstrated improved performance in differentiating CAC from sporadic CRC [58].
Bacterial Oncogenesis
Through activation of the FadA–E-cadherin → β-catenin axis, F. nucleatum promotes epithelial proliferation. Additionally, activation of TLR4/MyD88 → cytokine signaling recruits myeloid-derived suppressor cells (MDSCs), contributing to evasion of immune surveillance. Collectively, these effects promote carcinogenesis [38,39,40]. In a mouse model, co-colonization with F. nucleatum and pks+ E. coli significantly increased the tumor burden, demonstrating synergistic oncogenicity [54].
Viral and Phage Contributions
Phages influence carcinogenesis by transferring toxin genes (e.g., colibactin clusters) between microbes and reshaping microbial community structures. An ecological model termed ‘Piggyback-the-winner’ (PtW) proposes that, in environments with high microbial abundance, such as the intestinal mucosa, temperate phages preferentially integrate as prophages (lysogeny), favoring coexistence [59]. This contrasts with the conventional ‘Kill-the-winner’ (KtW) model, in which lytic phages suppress the most abundant and fastest-growing bacterial strains, thereby regulating community composition [59]. Through auxiliary gene exchange and horizontal gene transfer, phages propagate virulence factors and immune modulators, thereby modulating bacterial virulence and toxin pathways. Phage–bacteria co-occurrence network analyses have revealed strong correlations between phage and bacterial community structures, supporting the role of phages in modulating dysbiosis and promoting the proliferation of carcinogenic taxa [60].
Host–Microbiome Interaction and Immune-Metabolic Signaling
Host responses to specific microbial taxa may vary with disease context. Along the disease spectrum, from health to IBD and CRC, individuals differ in underlying genetic susceptibility, epithelial barrier status, and immune tone. For instance, Ruminococcus gnavus and Bacteroides fragilis have emerged as key taxa with distinct immunometabolic consequences in CRC and exhibit stage-specific abundance [61]. Multi-omics integration further reveals that these taxa are significantly correlated with host gene expression changes involved in immune and metabolic pathways, such as iron uptake systems, virulence, and biofilm-related genes [62]. Therefore, integrating metagenomic data with host transcriptomic and immunologic profiles, such as inflammatory gene signatures that include cytokines and chemokines, provides insight into how distinct microbial communities interact with host pathways [62].

2.3. Experimental Validation

Causal links between intestinal dysbiosis, chronic inflammation, and colorectal carcinogenesis have been demonstrated in preclinical models. These experimental systems span chemical induction, fecal transfers, and gene-targeted strains. Together, they demonstrate that specific microbial communities and virulence factors can drive both colitis and tumorigenesis as depicted in Figure 1.

2.3.1. AOM/DSS Model

The inflammation-driven carcinogenic environment in CAC can be simulated by combining azoxymethane (AOM), a chemical carcinogen, with dextran sulfate sodium (DSS), which induces colonic inflammation [39]. This model serves as a baseline for researchers to gain insight into microbial causality in tumorigenesis. For example, pks+ E. coli promotes colonic DNA damage via colibactin, while Δpks mutants do not, confirming genotoxin dependence [20,54]. Additionally, mechanistic analysis can identify molecular pathways, such as how F. nucleatum enhances tumor burden by activating EGFR–AKT–ERK, β-catenin signaling, and the epithelial–mesenchymal transition [40].

2.3.2. Germ-Free Mice + FMT

Transplantation of human microbiota into germ-free (GF) mice demonstrates causality in disease development. GF mice receiving fecal microbiota from IBD donors developed more severe DSS-induced colitis than those receiving healthy donor microbiota. These mice exhibited increased pro-inflammatory cytokines, particularly IL-6 and TNF-α, and more extensive epithelial damage [63]. Similarly, when AOM-treated GF mice received FMT from CRC donors, they showed greater tumor burden, enhanced epithelial proliferation (indicated by elevated Ki-67 staining), enrichment of pro-oncogenic bacteria like F. nucleatum, and depletion of beneficial SCFA-producing bacteria compared to controls [64].

2.3.3. Colitis-Microbiome Transfers

Transplantation of colitis microbiota can induce colitis even in immunologically naïve individuals. Experimental evidence demonstrates that when transferred to wild-type recipients, colitic microbiota from TRUC (T-bet−/− × Rag2−/−) mice with innate immune dysfunction or IL-10−/− mice that develop microbiota-dependent colitis led to histologically confirmed colitis. The causative pathobionts were mainly Proteobacteria, including Helicobacter, and the condition was reversible with antibiotic treatment [65]. This shows that the colitic phenotype is transmissible and that dysbiosis alone is sufficient to trigger disease.

2.3.4. Colibactin-Deficient Strains

Strains of E. coli carrying the pks genomic island (pks+) produce colibactin. In both organoids and AOM/DSS mice, pks+ E. coli induced DNA damage and mutational signatures that resemble those of human CRC. In contrast, Δpks strains lack this effect, establishing a direct genotoxic–oncogenic link [20,54].

2.4. Regional and Demographic Variability in Microbiome Research

It is critical to understand regional and demographic variability in microbiome research regarding IBD and CRC. Microbial composition is shaped by geography, diet, genetics, and environmental exposures. Therefore, such variability influences disease risk, alters biomarker reproducibility, and affects the generalizability of predictive models across populations. Without accounting for these differences, classifiers trained in one cohort may perform poorly when applied to another, which in turn underscores the need for careful consideration of variability in both biological interpretation and clinical translation.
Non-genetic factors, such as geography, diet, and environment, largely shape the composition of intestinal microbiota. Traditional high-fiber diets in Africa and South America favor Prevotella and Xylanibacter dominant enterotypes. Conversely, industrialized populations such as the USA or Italy, with western diets rich in animal fat, are associated with the selection of Bacteroides-dominant profiles [66,67]. These findings collectively suggest that early-life exposures and urbanization exert a more immediate and dominant influence on gut microbiota composition than host genetic background.
These regional differences extend beyond bacteria. In Saudi pediatric UC cohorts, M. restricta was enriched in the gut mucosa, suggesting a potential pathogenic contribution of the mycobiome to disease severity [49]. Virome profiling in the same population revealed an overrepresentation of Salmonella phage SEN4 and crAssphage. These findings were distinct from Western UC cases, forming a Middle Eastern-specific viral signature [68].
Universal (e.g., F. nucleatum) and population-specific taxa were both identified through global CRC microbiome meta-analyses [24,61]. However, AI models trained on Euro-American datasets often failed when applied to Middle Eastern or East Asian cohorts. This performance gap was attributed to regional microbiome divergence and host–microbe variability [69]. Collectively, these limitations underscore the need for regionally stratified microbiome atlases, rigorous external validation, and context-aware modeling, particularly in underrepresented or resource-limited populations.

3. Translating Microbiome Signals into Clinical Action: Diagnosis, Treatment, and Prognosis

3.1. Microbiome-Based Diagnosis and Classification

3.1.1. CRC

Detection Across the Disease Spectrum
Non-invasive detection through ML using taxonomic and functional microbiome data has been studied across the spectrum from HC to adenoma and CRC. Genus-level RF models distinguished CRC from HC (AUC: 0.84) and adenomas (AUC: 0.73), while a 3-genus signature reached 0.87 and 0.67, respectively. The study also underscored reproducible enrichment of lipopolysaccharide (LPS) biosynthesis and sulfur metabolism pathways in CRC, highlighting them as potential risk factors, and validated these signatures across studies [70]. A meta-analysis using amplicon sequence variant (ASV)-level profiles constructed RF models to classify adenoma and CRC separately, validating performance across two independent cohorts (AUC: 0.78, 0.84) and associating CRC with elevated vitamin K2 biosynthesis [71]. Interestingly, training on Western populations and validating in the Eastern cohort yielded better performance, suggesting potential transferability across populations despite the limited number of variables. A diagnostic model combined 20 cross-cohort-stable microbial features from multiple countries with clinical variables and outperformed microbiome-only models (AUC: 0.939 for CRC, 0.925 for adenoma) [72]. While the use of publicly available datasets limited the specificity of the clinical information and prevented the incorporation of time-series data, the study demonstrated that combining microbiome with demographic and clinical features improved stability and generalizability compared with microbiome-only models.
Function-based profiles (KEGG and eggNOG) outperformed taxonomic data in adenoma classification, with eggNOG models achieving full group separation (HC < adenoma < CRC) [73]. These results are promising for CRC prevention because they enable detection at earlier stages, but they were demonstrated only in discovery cohorts. Neural network models applied to 16S rRNA profiles of fecal immunochemical test (FIT)-positive samples stratified CRC and adenoma cases [74], improving screening specificity while maintaining sensitivity above 97%. However, because both training and validation were restricted to FIT-positive cohorts, the models remain dependent on pre-screened populations, raising concerns about spectrum bias and leaving their generalizability to average-risk screening untested. Beyond stool, blood-based microbial cell-free DNA (cfDNA) markers detected CRC (AUC: 0.9824) and adenoma (AUC: 0.8849) but lacked external validation [75]. While promising as a minimally invasive tool, the study was limited to a single-center cohort with only internal train–test splits. Moreover, because microbial cfDNA reads represent less than 1% of total sequences, technical variability remains a barrier to clinical translation.
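The trade-off between sensitivity and specificity noted above can be sketched as a cutoff search: among all candidate score cutoffs, choose the highest one that still meets a target sensitivity, then read off the resulting specificity. This is an illustrative procedure only, not the method of the cited study; the scores and function names are hypothetical.

```python
def metrics_at(scores, labels, cutoff):
    """Sensitivity and specificity when samples scoring >= cutoff are called positive."""
    tp = sum(1 for s, label in zip(scores, labels) if label == 1 and s >= cutoff)
    fn = sum(1 for s, label in zip(scores, labels) if label == 1 and s < cutoff)
    tn = sum(1 for s, label in zip(scores, labels) if label == 0 and s < cutoff)
    fp = sum(1 for s, label in zip(scores, labels) if label == 0 and s >= cutoff)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both cases and controls are required")
    return tp / (tp + fn), tn / (tn + fp)

def cutoff_for_sensitivity(scores, labels, target):
    """Highest cutoff whose sensitivity is still >= target (sensitivity falls as cutoff rises)."""
    for cutoff in sorted(set(scores), reverse=True):
        sensitivity, specificity = metrics_at(scores, labels, cutoff)
        if sensitivity >= target:
            return cutoff, sensitivity, specificity
    raise ValueError("target sensitivity cannot be reached")

# Hypothetical model scores: four cases followed by six controls.
scores = [0.95, 0.90, 0.80, 0.60, 0.70, 0.50, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

print(cutoff_for_sensitivity(scores, labels, target=1.0))   # no case may be missed
print(cutoff_for_sensitivity(scores, labels, target=0.75))  # one missed case tolerated
```

Demanding that every case be detected forces a lower cutoff and admits one false positive, whereas tolerating a single missed case removes all false positives in this toy example. In practice the cutoff should be fixed on training data and evaluated on an independent cohort.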
Viral signatures also showed diagnostic value; a phage-based RF model using 405 vOTUs reached cross-cohort (AUC: 0.830) and was further externally validated (AUC: 0.906) [44]. This study showed virus–bacteria interaction networks, linking reduced CRC-associated viruses with butyrate-producing taxa, and enriched viruses with oral bacteria, thereby suggesting a role of gut virus–bacterial synergy. Meanwhile, microbiome alterations following FOBT highlight the need for confounder-aware models in screening [76].
At the precancerous stage, a model targeting advanced adenoma (AA) (AUC: 0.799) linked disease with decreased diversity, network complexity, and altered tryptophan metabolism [77]. Microbial and SNP comparisons between AA and CRC provided insight into adenoma–carcinoma transition [78]. However, both studies are based on a small cohort recruited from a single region, underscoring the need for further validation in larger, multi-center populations [77,78].
One study combined feature engineering with mediation analysis and incorporated clinical variables such as BMI and smoking status, improving interpretability and increasing AUC by 13% [79]. Specifically, the pipeline applied a taxonomy-aware feature-reduction strategy to merge redundant taxa and an iterative feature selection procedure to retain only the most predictive microbial variables, thereby producing more stable and biologically meaningful representations of the microbiome. Likewise, the combination of multitarget stool DNA (MT-sDNA), six gut genera, and tumor markers (e.g., CEA) showed higher diagnostic accuracy than either MT-sDNA or tumor markers alone. The accuracies were 97.1%, 83.0%, and 94.3%, respectively [80]. While this demonstrates translational potential, the study was limited by its reliance on single-center recruitment, and prospective validation is still needed before adoption into screening programs.
Classification of CRC Subtypes
ML classifiers have been applied to distinguish CRC by tumor location, histologic grade, and microbial enterotypes. Stool RNA-seq-derived microbial features differentiated right- vs. left-sided CRC (AUC: 0.89), identifying region-specific taxa [81]. Metagenomic sequencing of tumor tissue confirmed these signatures, with distinct taxa found only in right-sided CRC [82]. The convergence of stool- and tissue-based observations supports biological plausibility, although the sample size was small and external validation was lacking.
Histologic grading was predicted using microbial profiles, with RF achieving perfect classification between poorly and moderately differentiated CRC, based on differences in genera such as Pseudoramibacter and Bifidobacterium [83]. While the model even reached 100% accuracy, these results were derived from small sample sets, underscoring the risk of overfitting and the need for replication in larger, independent datasets. Enterotype-based stratification (Bacteroides, Blautia, and Streptococcus-dominated clusters) revealed progressive microbial shifts from HC to adenoma to CRC; the Streptococcus-dominated subtype achieved the highest classification accuracy (AUC: 0.78) [84]. This supports the potential of community-level clustering for risk stratification, although the study acknowledged limitations such as sample imbalance across enterotypes, reliance on genus-level data, and the need for further validation before clinical application.
Methodological Advances
A study comparing 16S rRNA with shotgun metagenomics revealed that both can identify common patterns in CRC. However, they demonstrated that shotgun metagenomics often offers a more detailed snapshot than 16S, while 16S tends to give greater weight to dominant bacteria [85]. Processing approaches such as OTU clustering, DADA2, and Deblur differed in detected diversity, yet consistently identified key CRC-associated taxa and achieved similar model performance, underscoring the importance of bioinformatics consistency [86]. An overview of representative CRC models is provided in Table 1.

3.1.2. IBD

IBD vs. HC
An increasing number of studies have applied machine learning to distinguish IBD from HC. An open-source diagnostic tool (LightCUD), trained on 349 samples, achieved high performance (AUC > 0.95) using WGS and 16S data, identifying Ehrlichia canis and Burkholderia gladioli as key features [87]. In a large Korean multicenter study (n > 1800), sparse partial least squares–discriminant analysis (sPLS-DA) achieved excellent accuracy (AUC: 0.992), while also emphasizing that differences in sample collection timing between UC and CD cohorts (e.g., post-diagnosis/pre-treatment in UC vs. various treatment stages in CD) and pre-collection interventions such as discontinuation of antibiotics or probiotics could confound microbial diversity, underscoring the need for careful cohort design in future studies [88]. Moreover, a pilot study, which applied sPLS-DA to classify HC, inactive UC, and active UC, identified potential microbial markers, such as Haemophilus parainfluenzae and Bifidobacterium adolescentis [89].
The integration of microbiota profiles with the fecal biomarkers fecal calprotectin (FCal) and the novel marker HBD2 enhanced classification performance, highlighting the complementary value of both markers [90]. A longitudinal ITS2 sequencing study examined the fungal microbiome in relation to disease activity in UC [91]. Candida abundance increased by approximately 3.5-fold during active disease. Using fungal features alone, RF models achieved moderate classification performance, with an AUC of around 0.8, but remained cohort-bound. Moreover, the lack of dietary intake data, limited clinical metadata, and absence of antifungal usage records further constrained the interpretability of the findings.
Beyond standard approaches, a leveraging scheme that incorporated external samples improved model robustness and generalizability. When over 25% of target samples were included for classifying CD and HC, AUC increased by up to 0.075 across independent microbiome cohorts [92]. In summary, internal resampling often inflates apparent performance, whereas true gains in external validation were observed only when explicit cross-cohort alignment strategies were implemented.
Differentiating UC and CD
Compared to IBD vs. HC classification, differentiating UC from CD remains substantially challenging. A WMS-based ML model using species-level features showed the limits of cross-cohort transferability [93]. The AUC dropped from 0.778 in internal validation to 0.633 in external validation. Still, it identified taxa specific to CD (E. coli, Shigella dysenteriae) and UC (Lactobacillus ruminis, Lachnospira eligens), and found broader functional alterations in CD than in UC.
An RF model based on genus-level microbiome was externally validated in separate UC and CD datasets (AUC: 0.74, 0.76), though its predictive capacity was constrained by cohort heterogeneity, the lack of gender information, and limitations of the American Gut Project training data [94]. The same LightCUD platform built separate classifiers for UC vs. CD using WGS and 16S data (AUC: 0.989 and 0.966, respectively) [87]. A phylotype-level model employing sPLS-DA was evaluated across multiple cohorts and demonstrated excellent internal performance (AUC = 0.988, accuracy = 94.5%), suggesting strong discriminatory power despite limited generalizability [88]. Another model included quantified absolute microbial load in combination with clinical markers such as CRP and FCal as input features. The model achieved an AUC of 0.86, with 90% sensitivity and 82% specificity [95]. Collectively, these findings indicate that UC/CD discrimination remains particularly vulnerable to cohort heterogeneity and technical variation; even when internal metrics appear robust, the absence of standardized sampling and cohort-aware validation renders such models difficult to generalize. Representative AI/ML applications in IBD are summarized in Table 2.
Pediatric IBD (PIBD)
In PIBD, stool-based RF models achieved accurate discrimination from HC and IBS (AUC: 0.88, 0.84), with enrichment of Escherichia-Shigella and Enterococcus, and depletion of SCFA-producing genera, which also correlated with nutritional status [96]. Disease activity classification improved by integrating fecal and serum bile acid profiles with microbial features (AUC: 0.84). Altered bile acid ratios were the most influential variable [97]. Logistic regression using serum amino acids and microbial taxa yielded high performances (AUC: 0.94); key features included leucine and E. coli, distinguishing PIBD from HC [98].
Despite these promising results, several limitations remain. The diagnostic tool excluded patients with extraintestinal manifestations [96]. Cohort sizes were often small, and microbial profiling relied mainly on 16S rRNA rather than metagenomic sequencing, restricting functional resolution [97]. Environmental factors such as diet, age-related differences, and heterogeneous control groups were insufficiently standardized, and single-sample designs precluded the assessment of intra-individual variability over time [98]. Collectively, these constraints highlight the need for larger, longitudinal, and multi-omic pediatric studies that incorporate both dietary and host factors to establish more reliable biomarkers for early diagnosis and disease monitoring.

3.1.3. IBD-Associated CRC (CAC)

Diagnostic Challenges in the IBD Context
In CAC, a specific microbiome that can be distinguished from IBD has not been consistently found. In one study, there was a greater frequency of pks+ E. coli in CAC than IBD, confirmed with Pathobiont-specific qPCR [99]. However, pks+ E. coli is closely related to IBD and sporadic CRC [100]. Furthermore, the study presented the timing of sample collection as a limitation. In most cases, samples were collected after the tumor had already fully developed. This makes it difficult to determine clearly whether microbial changes, such as pathogen proliferation, precede CAC development. Further validation is needed to determine whether the microbiome is a suitable biomarker for monitoring CAC, which progresses through a gradual inflammatory–dysplasia–carcinogenesis pathway [100].

3.2. Prediction of Treatment Response

3.2.1. CRC

Microbiota interactions dictate treatment toxicity and response in CRC. For example, F. nucleatum-mediated KLK10 expression was associated with poor immune checkpoint blockade response and worse survival [101]. However, this analysis relied mainly on TCGA/GEO datasets without independent clinical validation, raising concerns about dataset reuse and generalizability. In addition, chemotherapy alters the intestinal microbiota, influencing the risk of toxicity [102,103]. Microbiome-driven systems could allow for preventive interventions like early dose adjustment or microbiota modulation.
Elevated KLK10 expression induced by F. nucleatum was associated with pro-inflammatory tumor microenvironments and immune resistance, and this gene emerged as a potential microbially influenced biomarker of immunotherapy failure [101]. Capecitabine (CAP) alters gut microbiota, significantly increasing Escherichia species and menaquinol (vitamin K2) biosynthesis in a way that was associated with reduced neuropathy risk; patients enriched for vitamin K2 genes had fewer side effects [102,103]. A KO-based random forest model predicted CAP-related toxicity (AUC: 0.92 internally and 0.72 externally) [103]. Although the modest cohort size and the sharp decline in performance on external validation highlight limited robustness, the significance of this study lies in its attempt to build a model on baseline gut microbiome functional signatures rather than taxonomic abundance.
Gut microbiota also predicted leukopenia risk: patients with hematologic toxicity had lower diversity and depletion of Fusicatenibacter and Cetobacterium [104]. A taxonomic RF model enabled early risk identification, supporting dose adjustment or the use of leukocyte-stimulating agents. Nevertheless, the leukopenia subgroup was very small (n = 13), and no external validation was performed, making overfitting likely.

3.2.2. IBD

Responses to IBD treatment vary widely due to complex factors, but clinical indicators alone (e.g., age, sex, disease extent, and inflammatory markers) have shown poor predictive power. For example, only ~50% of UC patients respond to 5-aminosalicylic acid (5-ASA), and up to 40% exhibit primary resistance to anti-TNF therapy. In pediatric CD, exclusive enteral nutrition (EEN) induces clinical remission in ~80%, but <50% achieve biomarker normalization. These limitations have motivated the integration of biological data into prediction models [105,106].
Microbial features have also shown promise in predicting treatment response. In UC, non-responders to 5-ASA had lower baseline abundance of F. prausnitzii, Blautia massiliensis, and Phascolarctobacterium faecium [106]. An RF model trained on these three taxa achieved an AUC of 0.8 in internal validation and 0.82 in external validation, with a specificity of 88%. Clinical variables showed no significant differences between groups, highlighting the microbiome’s added value. However, this study was limited by a small, single-center cohort. In pediatric CD, a prospective study integrating clinical, dietary, cytokine, proteomic, microbiome, and fecal metabolite data predicted responses to EEN [105]. A multicomponent RF model outperformed the clinical-only model (78% accuracy, 80% sensitivity, and 77% specificity), with key predictors including SCFA-related taxa such as Ruminococcaceae and Bacteroides, as well as fecal metabolites like butyrate and acetate. Yet, the paradoxical finding that lower baseline SCFAs were linked to responses and the modest sample size raise mechanistic and generalizability concerns.
Multi-cohort analyses have reported distinct microbial signatures associated with remission after biologic therapy. An RF model derived from baseline stool microbiota profiles predicted remission of IBD at 22 weeks after biologic treatment (AUC: 0.895 internal, 0.75 external), reporting enrichment of SCFA-producing genera in responders [107]. While this study used multi-cohort data, residual batch effects may persist despite correction. Heterogeneity across cohorts and the relatively limited number of patients in the final analysis (133 internal and 29 external) remain important concerns. In addition, subgroup heterogeneity in responses between anti-TNF and anti-IL-12/23 treatments may act as a confounder, so larger, balanced prospective validation is needed. Withdrawal of anti-TNF in UC was examined by transcriptomic and mucosal microbiome characterization, which identified microbial and gene expression biomarkers of durable remission and could enable biomarker-guided treatment de-escalation [108]. However, this work remained a proof-of-concept approach with only nine patients. Also, the requirement for escalation of IBD therapy was forecast using baseline metagenomic and EHR data (AUC ~0.75), where a lower abundance of SCFA-producing bacteria, such as Roseburia, was related to a higher risk of escalation [109]. However, after FDR correction for multiple testing, no microbial species or pathways remained significant (FDR > 0.1). This lack of reproducibility likely reflects substantial inter-individual variability, as well as methodological differences in sample processing and bioinformatic pipelines.
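To make the FDR threshold mentioned above concrete, the following is a minimal sketch of the Benjamini–Hochberg adjustment, one widely used FDR procedure. Whether the cited study used this exact procedure is an assumption; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-taxon p-values from a differential-abundance test.
p_values = [0.01, 0.04, 0.03, 0.20]
q_values = benjamini_hochberg(p_values)
significant = [q <= 0.1 for q in q_values]  # features passing an FDR cutoff of 0.1
```

With hundreds of species and pathways tested at once, the multiplier `m / rank` becomes large for all but the strongest signals, which is one reason nominally associated taxa can fail to survive correction in modest cohorts.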
For FMT, donor–recipient microbial interactions influenced treatment response [110]. Specifically, enterotype combinations (such as Bacteroides-dominant recipients with Prevotella-dominant donors) showed higher remission rates (69.3% vs. 34.5%). An enterotype-based donor selection (EDS) model achieved an AUC of 0.80, and external validation cohorts demonstrated significantly improved outcomes with EDS-guided donor–recipient matching (response rates 93.3% vs. 37.0%). Nonetheless, the lack of standardized enterotyping methods, restricted donor availability, and uncertain reproducibility across centers constrain clinical translation.
Microbiome-informed models consistently outperform clinical predictors in CRC and IBD. However, translation is limited by small sample sizes, modest external validation, batch effects, and therapy-specific heterogeneity. Reproducibility is further hindered by inter-individual variability, dietary and treatment confounders, and inconsistent analytic workflows.

3.3. Prognosis from Risk Stratification to Surveillance

3.3.1. Prognosis in CRC

Baseline microbiota patterns stratified CRC prognosis. A study using RF and LASSO identified three microbial clusters differing in diversity and survival outcomes. Eight taxa (e.g., Synechococcus and Candidatus nitrosotenuis) were strongly associated with survival based on Cox regression [111]. While encouraging, these associations varied by preprocessing pipeline, underscoring methodological instability.
In a longitudinal analysis, stool samples collected before and one year after curative CRC surgery showed that while F. nucleatum declined, bile acid–producing species like Clostridium scindens and carcinogenic metabolites like DCA increased [112]. A post-surgery RF model revealed that some patients retained high microbiome-derived risk scores, indicating incomplete microbial restoration. However, this study was limited by its single-center design, short one-year follow-up, and lack of external validation, restricting generalizability.
CRC with BRAF V600E mutations possessed distinct microbial signatures, including ten taxa, such as Prevotella enoeca and Ruthenibacterium lactatiformans, which distinguished mutation status (AUC: 0.85), with potential implications for stratifying molecular subtypes [113]. Although biologically plausible, the results were derived from modest sample sizes and may be confounded by tumor sidedness and overlapping molecular features.

3.3.2. Prediction of Flare, Relapse, and Progression in IBD

A metagenome-based RF classifier distinguished relapse in CD (AUC: 0.769) [114]. The signal was linked to specific taxa and to metabolic pathways such as propionate metabolism rather than to community diversity, highlighting functional over taxonomic relevance. A mucosal biopsy-based XGBoost model incorporating multi-omics data predicted CD and UC relapse frequency (AUC: 0.84). Novel stratification by succinotype (Dialister vs. Phascolarctobacterium) reflected relapse rates (0.51 vs. 0.26/year) within 4 years of colonoscopy [115]. While novel, these findings were not replicated externally, leaving uncertainty about reproducibility.
An RF model incorporating stress reactivity (SR), based on fecal and plasma metabolites and mucosal microbes, predicted future flares in UC (AUC up to 0.91). SR was associated with elevated Ruminococcaceae and Lachnospiraceae and metabolites [116]. This underscores the value of psychosocial–microbiome interactions, but the study was also limited to a small single-center cohort with internal validation only.
Long-term progression in CD was stratified using mucosal-luminal microbiota, metabolomics, and host genetics. Worse phenotypes were linked to Faecalibacterium and Parasutterella loss [117]. However, this was derived from a tertiary referral center, raising concerns about referral bias. A multimodal XGBoost model, including magnetic resonance imaging, microbiome, and metabolome data, predicted bowel injury. Erysipelatoclostridium and fecal alanine emerged as key predictors [118]. Despite this methodological advance, the study was geographically confined to Chinese cohorts, potentially limiting applicability to broader populations.
Postoperative recurrence in CD was predicted using mucosal microbiota sampled after ileocecal resection and clinical factors (AUC: 0.78) [119]. Recurrence was associated with recolonization of Fusobacteriaceae and loss of butyrate producers, but perioperative exposures such as antibiotics and diet were potential confounders.

3.3.3. Early Detection and Shared Modeling

A GNN model captured structural patterns across IBD, CRC, and HC, achieving predictions of IBD (AUC: 0.93) and CRC (AUC: 0.90), which emphasized disease-specific microbial connectivity [120]. However, validation was limited to public datasets with internal splits, leaving external reproducibility untested. A logistic regression model with demographics, lifestyle, and microbiome data predicted colorectal adenoma (AUC: 0.84) [121]. The addition of F. nucleatum and pks+ E. coli improved risk estimation. The research suggested screening individuals with high risk at age 42, ten years before the present general practice. However, lifestyle data were collected through questionnaires subject to recall bias, and multicohort replication was geographically restricted, raising questions about cultural specificity.

4. Beyond Feces: Expanding the Microbiome Landscape

Although fecal microbiome profiling has been widely used in gastrointestinal disease research, it provides only a limited perspective of the gut ecosystem. Site-specific microbial communities in the oral cavity, intestinal mucosa, and small intestine have distinct taxonomic and functional profiles linked to inflammation in IBD and tumorigenesis in CRC. These under-explored niches may provide enhanced diagnostic precision, particularly when fecal biomarkers fall short (Figure 2). Key diagnostic metrics, representative taxa, sampling considerations, and current AI applications for each niche are summarized in Table 3.

4.1. Oral Microbiome

4.1.1. Oral–Gut Axis: Microbial Translocation and Systemic Inflammation

The dense microbial ecosystem of the oral cavity affects systemic health via the oral–gut axis. Barrier dysfunction permits the translocation of oral bacteria to distal gastrointestinal locations. Fusobacterium nucleatum binds to intestinal epithelial cells [122], enhances immune evasion [123], and promotes tumorigenesis in CRC models [124]. Strain-level detection using CRISPR–Cas genotyping showed that oral and tumor isolates were identical, providing molecular evidence of oral–gut translocation [125]. In IBD, weakened barrier integrity permits colonization by oral pathobionts such as Veillonella, Streptococcus, Prevotella, and Porphyromonas gingivalis, triggering immune activation and fostering inflammation [126,127,128,129].

4.1.2. Dysbiosis of Oral Microbiota in IBD and CRC

Distinct oral microbial populations have been reported in both IBD and CRC. In IBD, the salivary microbiome often shows increased levels of Prevotella, Veillonella, Atopobium, and Megasphaera [130]. By contrast, CRC-associated profiles frequently include higher detection of Streptococcus anginosus, Peptostreptococcus stomatis, Prevotella intermedia, and Fusobacterium nucleatum [131]. Notably, P. intermedia has been connected to periodontitis severity [132], while F. nucleatum appears more common in advanced cases and associates with overall salivary makeup [133]. Moreover, tongue-coating research has revealed high levels of Streptococcus sanguinis and Prevotella oris in CRC, with Atopobium rimae contributing most to predictive models [134]. These findings may suggest that oral and dental conditions could serve as indicators of IBD and CRC.

4.1.3. Diagnostic Potential of Salivary Microbiome

Saliva serves as a useful non-invasive medium for assessing microbial communities because of its ease of collection. In CRC, Bifidobacterium functions as a bacterial biomarker, whereas Fusobacterium, Dialister, and Catonella distinguish patients from controls [135]. In a study using 16S rRNA sequencing with sPLS-DA, diagnostic AUCs reached 0.966 for IBD versus control and 0.923 for distinguishing CD from UC, with key differential taxa including CD-enriched Fusobacterium and Dialister, and UC-enriched Prevotella and Bacteroides [136]. The findings indicate the potential of oral microbiota in paralleling intestinal pathology.

4.1.4. AI-Driven Approaches

Rather than relying on individual taxa, ML trained on several salivary profiles has shown stronger performance. For instance, Streptococcus infantis and Desulfovibrio desulfuricans achieved modest AUCs (0.74 and 0.653), whereas salivary microbiome-based models yielded an AUC of 0.91 for CRC prediction [137] and a C-index of 0.866 in ML-powered nomograms [138]. xAI promises to transform research by revealing risk mechanisms, building on fecal microbiome successes using RF models with SHAP analysis [26].

4.2. Mucosal-Associated Microbiota (MAM)

Mucosal-associated microbiota differ markedly from luminal communities, and mounting evidence shows they provide superior bacterial “footprints” for phenotype discrimination [139,140]. Along the intestinal segments, mucosal profiles retain 94.7% similarity regardless of clinical status, highlighting their stability and value for the detection and diagnosis of IBD and CRC [141]. Proximity to the epithelium permits direct immune modulation: commensals such as F. prausnitzii and A. muciniphila bolster barrier integrity, whereas adherent-invasive E. coli breach the epithelium and drive CD [142,143]. Endoscopic biopsies now enable high-resolution spatial analyses. In murine colitis, spatial transcriptomics revealed localized downregulation of barrier genes and upregulation of antimicrobial peptides at sites of microbial infiltration [144].
Building on such molecular and spatial data, advanced computational approaches, including ML, are increasingly applied to decipher complex host–microbe interactions. For instance, in large human cohorts integrating mucosal RNA-seq with 16S sequencing, researchers have applied sparse canonical correlation analysis (CCA)-based ML to map co-association networks between microbial taxa and host gene modules linked to IBD inflammation [145]. Complementarily, graph-based models such as MGMLink infer mechanistic host–microbe interaction pathways from knowledge-graph data [146].

4.3. Small Intestine (SI) Microbiome

Compared with the colon, the small intestine has rapid transit, acidic pH, bile exposure, and low microbial density [147,148]. This milieu favors inflammation-prone taxa. In early CD, expansions of E. coli strain 35A1 and R. gnavus are reported in the ileal mucosa, and their metabolites can injure the epithelium [149]. Concomitantly, anti-inflammatory species decline [150]. Genetic mutations that disrupt bile-acid regulation further aggravate these shifts [151]. In CRC, small-intestine–derived bile-acid metabolites may influence tumorigenesis at distal colonic sites, positioning the SI microbiome as an upstream regulator of gut ecology and host responses [53].
Sampling remains difficult because of anatomic inaccessibility. Conventional endoscopy reaches the duodenum and jejunum but carries discomfort and sampling bias. Newer approaches, which include string tests, capsule endoscopy with sampling, and nasoduodenal tubes, increase access yet are not widely adopted [148,152]. Stoma effluent offers a practical window into SI communities, though cohorts are small and confounded by clinical and dietary factors [153]. Analytically, low biomass, higher oxygen tension, and contamination risk complicate profiling; rapid transit and diet-driven fluctuations add temporal variability, weakening longitudinal inference and statistical power [148].

5. Considerations When Applying AI to Microbiome-Based Prediction

5.1. Limitations of Previous Machine Learning Approaches

While ML has been widely applied to predict disease from the microbiome, earlier studies faced methodological difficulties that kept their models from being mature enough for clinical use. A major issue was that training and evaluation workflows were improperly separated. Some models selected features prior to cross-validation, which was reported to cause information leakage and overestimated performance, and in others repeated or non-independent samples were split at random, compromising generalizability [154]. In other cases, models were tested within a single cohort and were therefore susceptible to technical artifacts such as batch effects, showing lower AUC in external validation [69]. Inconsistent preprocessing and handling of metadata also hindered reproducibility [155]. Input data based on OTU or ASV tables sometimes depended on inaccurate or incomplete taxonomic labels, potentially missing biologically meaningful signals [156]. Moreover, the combination of high dimensionality and sparsity in microbiome data made it difficult for conventional ML to extract stable patterns [157]. Ensemble models were found to overlook subgroup-specific factors [158], and the complexity of their decision paths reduced interpretability [159]. Although many models have identified taxa associated with each disease, few have placed these findings into biologically coherent models or mechanistic frameworks [160].

5.2. Considerations for Robust Generalization

Prediction models need to generalize robustly before they can be deployed in clinical practice. One widely used method is leave-one-dataset-out cross-validation (LODOCV), which excludes each dataset in turn from training to mimic model performance on cohorts not encountered before [69]. To prevent information leakage, one study confined preprocessing steps, such as feature selection and normalization, to the training set, and balanced the class distribution by disease type and sample origin so that the model learned clinically relevant patterns. Moreover, normalizing evaluations by study size minimized bias from differences in cohort size. In a distinct approach that took into account that real-world data often lack diagnostic labels, preprocessing relied on batch metadata rather than disease status alone [157]. Metrics such as LISI and PERMANOVA were applied to determine whether batch correction effectively removed noise without discarding relevant variation. Another study nested feature selection within cross-validation iterations, ensuring each selection was made exclusively within the training set [154]. These methodological efforts demonstrate how the design of validation and preprocessing can critically affect a model’s generalizability.
Beyond validation design, control augmentation strategies, such as adding HC samples from an external dataset to the training set, enabled the model to learn baseline variability and reduced overfitting [154]. Another study used a pretrained transformer, freezing the encoder and fine-tuning only the classifier layer, which resulted in stable performance without further adaptation [161]. Ensemble methods, such as stacking, combined distinct models trained on clinical and microbial data into a meta-learner that weighted each model based on its generalizability [162]. However, this approach was criticized because its combination of taxonomic and functional profiles, which were ultimately derived from the same sequencing data, did not improve performance; taxonomic features alone could suffice [157].
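The two safeguards described above, holding out one whole cohort at a time and selecting features inside each training fold, can be sketched as follows. This is a toy illustration under stated assumptions rather than the pipeline of any cited study: the mean-difference feature ranking, the nearest-centroid classifier, the function names, and the three small cohorts are all hypothetical stand-ins.

```python
def class_means(rows, label):
    """Per-feature mean over the training rows that carry the given label."""
    chosen = [x for x, y in rows if y == label]
    return [sum(col) / len(col) for col in zip(*chosen)]


def select_features(train_rows, k):
    """Rank features by the absolute gap between class means (training rows only)."""
    mu_case, mu_ctrl = class_means(train_rows, 1), class_means(train_rows, 0)
    gaps = [abs(a - b) for a, b in zip(mu_case, mu_ctrl)]
    return sorted(range(len(gaps)), key=lambda j: gaps[j], reverse=True)[:k]


def predict(train_rows, features, x):
    """Nearest-centroid prediction restricted to the selected features."""
    mu_case, mu_ctrl = class_means(train_rows, 1), class_means(train_rows, 0)
    d_case = sum((x[j] - mu_case[j]) ** 2 for j in features)
    d_ctrl = sum((x[j] - mu_ctrl[j]) ** 2 for j in features)
    return 1 if d_case < d_ctrl else 0


def lodocv(samples, k=1):
    """samples: list of (cohort, feature_vector, label); returns accuracy per held-out cohort."""
    results = {}
    for held_out in sorted({cohort for cohort, _, _ in samples}):
        train = [(x, y) for c, x, y in samples if c != held_out]
        test = [(x, y) for c, x, y in samples if c == held_out]
        features = select_features(train, k)  # the held-out cohort is never seen here
        correct = sum(predict(train, features, x) == y for x, y in test)
        results[held_out] = correct / len(test)
    return results


# Hypothetical data: feature 0 carries signal, feature 1 is noise.
samples = [
    ("A", [0.9, 0.5], 1), ("A", [0.1, 0.4], 0),
    ("B", [0.8, 0.3], 1), ("B", [0.2, 0.6], 0),
    ("C", [0.7, 0.5], 1), ("C", [0.3, 0.5], 0),
]
results = lodocv(samples, k=1)
```

If `select_features` were instead called once on all samples before the loop, information from the held-out cohort would shape the feature set, which is the leakage pattern this section warns against.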
Another unique and comprehensive attempt to improve robustness was an end-to-end pipeline titled ‘comprehensive data optimization and risk prediction framework’, which incorporated triple optimization imputation, reduced dimensionality by adopting an importance-weighted variational autoencoder, and tuned hyperparameters, achieving superior generalizability compared to traditional ML [163].

5.3. Strategies for Data Optimization and Preprocessing

Preprocessing also plays a critical role in predicting well in the flood of real-world data, by structuring and cleaning the dataset. Unsupervised filtering and normalization, such as introducing prevalence thresholds, minimized noise while restraining information leakage [154]. Another study adjusted data distribution to normality, applied rank transformations to reduce noise, and filtered high-dimensionality before modeling [164].
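A prevalence threshold of the kind mentioned above can be sketched in a few lines. The filter looks only at how often each taxon is detected and never at disease labels, which is what makes it unsupervised. The function name, the 50% cutoff, the placeholder taxon names, and the counts are hypothetical.

```python
def prevalence_filter(counts, taxa, min_prevalence=0.1):
    """Keep taxa detected (count > 0) in at least min_prevalence of samples.

    counts: list of per-sample count rows; taxa: column names. No labels are used.
    """
    n_samples = len(counts)
    kept = [
        j for j in range(len(taxa))
        if sum(1 for row in counts if row[j] > 0) / n_samples >= min_prevalence
    ]
    filtered = [[row[j] for j in kept] for row in counts]
    return filtered, [taxa[j] for j in kept]


# Hypothetical count table: 4 samples x 3 taxa.
taxa = ["taxon_a", "taxon_b", "taxon_c"]
counts = [
    [10, 0, 3],
    [5, 0, 0],
    [8, 1, 2],
    [7, 0, 0],
]
filtered, kept_taxa = prevalence_filter(counts, taxa, min_prevalence=0.5)
```

Here `taxon_b` is detected in only one of four samples and is dropped, while the other two columns are retained.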
In parallel, data optimization aims to enhance the quality of input. The log-ratio transformation was shown to stabilize recursive feature elimination, reducing sensitivity to scaling, and led to the identification of more reproducible predictive taxa [155]. Stability selection to CCA identified microbial and metabolomic variables, which emerged consistently across subsampled datasets [165]. An alternative framework bypassed conventional OTU or ASV altogether, and caught graph-based representations from raw 16S sequences, yielding structured inputs [156].
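To illustrate why a log-ratio transformation reduces sensitivity to scaling, the sketch below uses the centered log-ratio (CLR), one common member of this family; the text above does not specify which log-ratio variant was applied, so the choice of CLR, the pseudocount of 0.5, and the example values are assumptions. For a sample with counts x_1, ..., x_D, the CLR value of taxon i is log(x_i) minus the mean of log(x_j) over all taxa, so multiplying every count in a sample by the same factor leaves the result unchanged.

```python
import numpy as np

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform applied to each sample (row).

    The pseudocount avoids log(0) for taxa that were not detected.
    """
    logged = np.log(np.asarray(counts, dtype=float) + pseudocount)
    return logged - logged.mean(axis=1, keepdims=True)

# The same composition observed at two sequencing depths (no zeros, so the
# pseudocount is set to 0 to show the exact scale invariance).
sample = np.array([[10.0, 20.0, 70.0]])
shallow = clr(sample, pseudocount=0.0)
deep = clr(sample * 10, pseudocount=0.0)

# A sample containing a zero count remains finite thanks to the pseudocount.
with_zero = clr(np.array([[0, 5, 5]]))
print(shallow, deep, with_zero)
```

With a non-zero pseudocount the invariance holds only approximately, which is one reason the handling of zeros is usually reported alongside the transformation.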

5.4. Explainability as a Translational Consideration

Interpretability is also crucial for clinical application. Post hoc explanation tools, such as SHAP, have been widely used to highlight microbial features that are important both globally and for individual patients, in relation to patient-specific factors such as diet and medication use [158,166]. Other researchers have instead built frameworks that are interpretable by design. Logic-based rule extraction derived human-readable rules from decision trees, revealing specific taxa combinations related to disease states [159]. Another framework used sparse generalized CCA filtering to build biologically coherent modules from multi-omic features [160]. Interestingly, a natural language processing approach learned contextual relationships among taxa, revealing key clusters [161]. Collectively, these methods show that explainability can be an integral part of model design.
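The distinction between global and individual explanations can be made concrete with the simplest case in which SHAP values have a closed form: for a linear model with features treated as independent, the SHAP value of feature i for one sample is the weight of that feature multiplied by the deviation of the sample from the background mean. The sketch below applies this formula to a hypothetical linear risk score; the taxa names, weights, and data are invented for illustration, and the cited studies applied SHAP to more complex models.

```python
import numpy as np

def linear_shap(weights, X, background):
    """SHAP values of a linear model under a feature-independence assumption.

    phi[n, i] = weights[i] * (X[n, i] - mean of feature i in the background).
    """
    return weights * (X - background.mean(axis=0))

# Hypothetical fitted linear risk score over three taxa (illustrative values).
taxa = ["taxon_A", "taxon_B", "taxon_C"]
weights = np.array([1.5, -2.0, 0.1])
intercept = 0.3

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))      # stand-in for transformed abundances
risk = intercept + X @ weights

phi = linear_shap(weights, X, background=X)
global_importance = dict(zip(taxa, np.abs(phi).mean(axis=0)))  # cohort level
patient_0 = dict(zip(taxa, phi[0]))                            # one patient

# Efficiency property: contributions sum to the deviation from the mean score.
assert np.allclose(phi.sum(axis=1), risk - risk.mean())
print(global_importance, patient_0)
```

The mean absolute SHAP value ranks taxa across the cohort, whereas the signed values for a single row show which taxa raised or lowered that patient's score.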

6. Conclusions

Recent AI-driven microbiome research has substantially advanced our understanding of IBD and CRC, reinforcing their distinct pathological mechanisms while revealing both shared and unique dysbiotic patterns. This review highlights how AI-powered microbiome analysis has enabled early detection, non-invasive diagnosis, treatment stratification, and prognosis prediction in both IBD and CRC patient populations. Although practical challenges such as generalization and data optimization persist, the most critical barrier remains the limited interpretability of AI models, which arises from their inherent complexity. Moreover, sample acquisition is limited to feces, whereas the microbiome varies across anatomical locations; approaches for collecting data from other sites are therefore emerging. Ultimately, the integration of AI and microbial data could become a standard tool in the management of digestive diseases, translating the complexity of individual diversity into precision medicine approaches tailored to the distinct pathologies of IBD and CRC. As these approaches mature, they are expected to enable more tailored prevention, diagnosis, and treatment strategies that directly improve patient outcomes.

Author Contributions

Conceptualization, M.K., D.G., S.K., H.K. and J.H.K.; methodology, M.K. and D.G.; software, M.K. and T.P.E.; validation, M.K., J.S., T.P.E., S.K., J.Y. and J.H.K.; formal analysis, M.K.; investigation, M.K. and T.P.E.; resources, M.K.; data curation, M.K.; writing—original draft preparation, M.K., D.G., S.K., H.K., J.S., J.Y., S.P., C.J., G.S. and H.K.; writing—review and editing, all authors; visualization, G.S., S.P., J.S., J.Y. and S.K.; supervision, J.H.K.; project administration, M.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The reference data reviewed in this manuscript were systematically collected through PubMed searches based on predefined inclusion criteria (e.g., publication within the past 5 years and relevance to inflammatory bowel disease, colorectal cancer, and microbiome research). To minimize selection bias, references were organized by disease subtype and mechanistic focus, and validated for recency, scientific quality, and PubMed indexation. The complete list of curated articles, along with the structured classification and inclusion criteria, is available from the first or corresponding author upon reasonable request.

Acknowledgments

We utilized ChatGPT (version 4o, OpenAI) during the preparation of this review article for the purposes of workflow optimization, literature synthesis, and collaborative drafting. Specifically, after independently identifying eligible articles via PubMed based on predefined inclusion criteria (e.g., publication within the past 5 years and relevance to inflammatory bowel disease [IBD], colorectal cancer [CRC], ulcerative colitis [UC], or Crohn’s disease [CD]), we employed ChatGPT to assist in the topical classification of articles in alignment with the manuscript’s structure. In addition, ChatGPT was used to generate concise summaries of reference articles to facilitate interdisciplinary understanding and discussion among co-authors, with all summaries cross-validated by the authors through direct review of the original publications. Each author also assumed responsibility for drafting specific sections of the manuscript, while ChatGPT provided initial outlines and suggested summaries of recent studies, enabling real-time and efficient coordination among authors with differing schedules and research environments. Finally, AI support was used to manage in-text citation formatting and reference organization during manuscript development, although all references were independently verified by manually retrieving original articles via PubMed to confirm DOI accuracy, indexing status, and content relevance. At no point did we rely solely on AI-generated content for evidence interpretation or critical analysis. All scientific arguments, conclusions, and interpretations presented in the manuscript were conceived, evaluated, and finalized by the human authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. The Integrative HMP (iHMP) Research Network Consortium; Proctor, L.M.; Creasy, H.H.; Fettweis, J.M.; Lloyd-Price, J.; Mahurkar, A.; Zhou, W.; Buck, G.A.; Snyder, M.P.; Strauss, J.F.; et al. The Integrative Human Microbiome Project. Nature 2019, 569, 641–648. [Google Scholar] [CrossRef]
  2. Raskov, H.; Burcharth, J.; Pommergaard, H.-C.; Rosenberg, J. Irritable Bowel Syndrome, the Microbiota and the Gut-Brain Axis. Gut Microbes 2016, 7, 365–383. [Google Scholar] [CrossRef] [PubMed]
  3. Hsu, C.L.; Schnabl, B. The Gut–Liver Axis and Gut Microbiota in Health and Liver Disease. Nat. Rev. Microbiol. 2023, 21, 719–733. [Google Scholar] [CrossRef] [PubMed]
  4. Haneishi, Y.; Furuya, Y.; Hasegawa, M.; Picarelli, A.; Rossi, M.; Miyamoto, J. Inflammatory Bowel Diseases and Gut Microbiota. Int. J. Mol. Sci. 2023, 24, 3817. [Google Scholar] [CrossRef] [PubMed]
  5. Rebersek, M. Gut Microbiome and Its Role in Colorectal Cancer. BMC Cancer 2021, 21, 1325. [Google Scholar] [CrossRef]
  6. Axelrad, J.E.; Lichtiger, S.; Yajnik, V. Inflammatory Bowel Disease and Cancer: The Role of Inflammation, Immunosuppression, and Cancer Treatment. World J. Gastroenterol. 2016, 22, 4794. [Google Scholar] [CrossRef]
  7. Sato, Y.; Tsujinaka, S.; Miura, T.; Kitamura, Y.; Suzuki, H.; Shibata, C. Inflammatory Bowel Disease and Colorectal Cancer: Epidemiology, Etiology, Surveillance, and Management. Cancers 2023, 15, 4154. [Google Scholar] [CrossRef]
  8. Quaglio, A.E.V.; Grillo, T.G.; Oliveira, E.C.S.D.; Stasi, L.C.D.; Sassaki, L.Y. Gut Microbiota, Inflammatory Bowel Disease and Colorectal Cancer. World J. Gastroenterol. 2022, 28, 4053–4060. [Google Scholar] [CrossRef]
  9. Cai, J.; Sun, L.; Gonzalez, F.J. Gut Microbiota-Derived Bile Acids in Intestinal Immunity, Inflammation, and Tumorigenesis. Cell Host Microbe 2022, 30, 289–300. [Google Scholar] [CrossRef]
  10. Wang, R.; Li, Z.; Liu, S.; Zhang, D. Global, Regional and National Burden of Inflammatory Bowel Disease in 204 Countries and Territories from 1990 to 2019: A Systematic Analysis Based on the Global Burden of Disease Study 2019. BMJ Open 2023, 13, e065186. [Google Scholar] [CrossRef]
  11. Bray, F.; Laversanne, M.; Sung, H.; Ferlay, J.; Siegel, R.L.; Soerjomataram, I.; Jemal, A. Global Cancer Statistics 2022: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA. Cancer J. Clin. 2024, 74, 229–263. [Google Scholar] [CrossRef]
  12. Wagle, N.S.; Nogueira, L.; Devasia, T.P.; Mariotto, A.B.; Yabroff, K.R.; Islami, F.; Jemal, A.; Alteri, R.; Ganz, P.A.; Siegel, R.L. Cancer Treatment and Survivorship Statistics, 2025. CA. Cancer J. Clin. 2025, 75, 308–340. [Google Scholar] [CrossRef] [PubMed]
  13. Kafel, A.J.; Muzalyova, A.; Schnoy, E. Malignancy and Inflammatory Bowel Disease (IBD): Incidence and Prevalence of Malignancy in Correlation to IBD Therapy and Disease Activity—A Retrospective Cohort Analysis over 5 Years. Biomedicines 2025, 13, 1395. [Google Scholar] [CrossRef] [PubMed]
  14. Song, D.; Wang, F.; Ju, Y.; He, Q.; Sun, T.; Deng, W.; Ding, R.; Zhang, C.; Xu, Q.; Qi, C.; et al. Application and Development of Noninvasive Biomarkers for Colorectal Cancer Screening: A Systematic Review. Int. J. Surg. 2023, 109, 925–935. [Google Scholar] [CrossRef] [PubMed]
  15. Massaro, C.A.; Meade, S.; Lemarié, F.L.; Kaur, G.; Bressler, B.; Rosenfeld, G.; Leung, Y.; Williams, A.-J.; Lunken, G. Gut Microbiome Predictors of Advanced Therapy Response in Crohn’s Disease: Protocol for the OPTIMIST Prospective, Longitudinal, Observational Pilot Study in Canada. BMJ Open 2025, 15, e094280. [Google Scholar] [CrossRef]
  16. Zhang, C.; Wang, Y.; Cheng, L.; Cao, X.; Liu, C. Gut Microbiota in Colorectal Cancer: A Review of Its Influence on Tumor Immune Surveillance and Therapeutic Response. Front. Oncol. 2025, 15, 1557959. [Google Scholar] [CrossRef]
  17. Galasso, L.; Termite, F.; Mignini, I.; Esposto, G.; Borriello, R.; Vitale, F.; Nicoletti, A.; Paratore, M.; Ainora, M.E.; Gasbarrini, A.; et al. Unraveling the Role of Fusobacterium Nucleatum in Colorectal Cancer: Molecular Mechanisms and Pathogenic Insights. Cancers 2025, 17, 368. [Google Scholar] [CrossRef]
  18. Yu, M.R.; Kim, H.J.; Park, H.R. Fusobacterium Nucleatum Accelerates the Progression of Colitis-Associated Colorectal Cancer by Promoting EMT. Cancers 2020, 12, 2728. [Google Scholar] [CrossRef]
  19. Rosendahl Huber, A.; Pleguezuelos-Manzano, C.; Puschhof, J.; Ubels, J.; Boot, C.; Saftien, A.; Verheul, M.; Trabut, L.T.; Groenen, N.; Van Roosmalen, M.; et al. Improved Detection of Colibactin-Induced Mutations by Genotoxic E. Coli in Organoids and Colorectal Cancer. Cancer Cell 2024, 42, 487–496. [Google Scholar] [CrossRef]
  20. Arthur, J.C.; Perez-Chanona, E.; Mühlbauer, M.; Tomkovich, S.; Uronis, J.M.; Fan, T.-J.; Campbell, B.J.; Abujamel, T.; Dogan, B.; Rogers, A.B.; et al. Intestinal Inflammation Targets Cancer-Inducing Activity of the Microbiota. Science 2012, 338, 120–123. [Google Scholar] [CrossRef]
  21. Wang, X.; Fang, Y.; Liang, W.; Wong, C.C.; Qin, H.; Gao, Y.; Liang, M.; Song, L.; Zhang, Y.; Fan, M.; et al. Fusobacterium Nucleatum Facilitates Anti-PD-1 Therapy in Microsatellite Stable Colorectal Cancer. Cancer Cell 2024, 42, 1729–1746. [Google Scholar] [CrossRef]
  22. Zhang, H.; Tian, Y.; Xu, C.; Chen, M.; Xiang, Z.; Gu, L.; Xue, H.; Xu, Q. Crosstalk between Gut Microbiotas and Fatty Acid Metabolism in Colorectal Cancer. Cell Death Discov. 2025, 11, 78. [Google Scholar] [CrossRef]
  23. Kim, K.S.; Noh, J.; Kim, B.-S.; Koh, H.; Lee, D.-W. Refining Microbiome Diversity Analysis by Concatenating and Integrating Dual 16S rRNA Amplicon Reads. npj Biofilms Microbiomes 2025, 11, 57. [Google Scholar] [CrossRef] [PubMed]
  24. Wirbel, J.; Pyl, P.T.; Kartal, E.; Zych, K.; Kashani, A.; Milanese, A.; Fleck, J.S.; Voigt, A.Y.; Palleja, A.; Ponnudurai, R.; et al. Meta-Analysis of Fecal Metagenomes Reveals Global Microbial Signatures That Are Specific for Colorectal Cancer. Nat. Med. 2019, 25, 679–689. [Google Scholar] [CrossRef] [PubMed]
  25. Preto, A.J.; Chanana, S.; Ence, D.; Healy, M.D.; Domingo-Fernández, D.; West, K.A. Multi-Omics Data Integration Identifies Novel Biomarkers and Patient Subgroups in Inflammatory Bowel Disease. J. Crohns Colitis 2025, 19, jjae197. [Google Scholar] [CrossRef] [PubMed]
  26. Novielli, P.; Romano, D.; Magarelli, M.; Bitonto, P.D.; Diacono, D.; Chiatante, A.; Lopalco, G.; Sabella, D.; Venerito, V.; Filannino, P.; et al. Explainable Artificial Intelligence for Microbiome Data Analysis in Colorectal Cancer Biomarker Identification. Front. Microbiol. 2024, 15, 1348974. [Google Scholar] [CrossRef]
  27. Hernández Medina, R.; Kutuzova, S.; Nielsen, K.N.; Johansen, J.; Hansen, L.H.; Nielsen, M.; Rasmussen, S. Machine Learning and Deep Learning Applications in Microbiome Research. ISME Commun. 2022, 2, 98. [Google Scholar] [CrossRef]
  28. Aysel, H.I.; Cai, X.; Prugel-Bennett, A. Explainable Artificial Intelligence: Advancements and Limitations. Appl. Sci. 2025, 15, 7261. [Google Scholar] [CrossRef]
  29. Salih, A.M.; Raisi-Estabragh, Z.; Galazzo, I.B.; Radeva, P.; Petersen, S.E.; Lekadir, K.; Menegaz, G. A Perspective on Explainable Artificial Intelligence Methods: SHAP and LIME. Adv. Intell. Syst. 2025, 7, 2400304. [Google Scholar] [CrossRef]
  30. Eaden, J.A. The Risk of Colorectal Cancer in Ulcerative Colitis: A Meta-Analysis. Gut 2001, 48, 526–535. [Google Scholar] [CrossRef]
  31. Jess, T.; Simonsen, J.; Jørgensen, K.T.; Pedersen, B.V.; Nielsen, N.M.; Frisch, M. Decreasing Risk of Colorectal Cancer in Patients With Inflammatory Bowel Disease Over 30 Years. Gastroenterology 2012, 143, 375–381. [Google Scholar] [CrossRef]
  32. Baker, A.-M.; Cross, W.; Curtius, K.; Al Bakir, I.; Choi, C.-H.R.; Davis, H.L.; Temko, D.; Biswas, S.; Martinez, P.; Williams, M.J.; et al. Evolutionary History of Human Colitis-Associated Colorectal Cancer. Gut 2019, 68, 985–995. [Google Scholar] [CrossRef]
  33. Terzić, J.; Grivennikov, S.; Karin, E.; Karin, M. Inflammation and Colon Cancer. Gastroenterology 2010, 138, 2101–2114. [Google Scholar] [CrossRef]
  34. Sokol, H.; Pigneur, B.; Watterlot, L.; Lakhdari, O.; Bermúdez-Humarán, L.G.; Gratadoux, J.-J.; Blugeon, S.; Bridonneau, C.; Furet, J.-P.; Corthier, G.; et al. Faecalibacterium prausnitzii Is an Anti-Inflammatory Commensal Bacterium Identified by Gut Microbiota Analysis of Crohn Disease Patients. Proc. Natl. Acad. Sci. USA 2008, 105, 16731–16736. [Google Scholar] [CrossRef] [PubMed]
  35. Martín, R.; Chain, F.; Miquel, S.; Lu, J.; Gratadoux, J.-J.; Sokol, H.; Verdu, E.F.; Bercik, P.; Bermúdez-Humarán, L.G.; Langella, P. The Commensal Bacterium Faecalibacterium prausnitzii Is Protective in DNBS-Induced Chronic Moderate and Severe Colitis Models. Inflamm. Bowel Dis. 2014, 20, 417–430. [Google Scholar] [CrossRef] [PubMed]
  36. Xu, Z.; Dong, X.; Yang, K.; Chevarin, C.; Zhang, J.; Lin, Y.; Zuo, T.; Chu, L.C.; Sun, Y.; Zhang, F.; et al. Association of Adherent-Invasive Escherichia Coli with Severe Gut Mucosal Dysbiosis in Hong Kong Chinese Population with Crohn’s Disease. Gut Microbes 2021, 13, 1994833. [Google Scholar] [CrossRef]
  37. Darfeuille-Michaud, A.; Boudeau, J.; Bulois, P.; Neut, C.; Glasser, A.-L.; Barnich, N.; Bringer, M.-A.; Swidsinski, A.; Beaugerie, L.; Colombel, J.-F. High Prevalence of Adherent-Invasive Escherichia Coli Associated with Ileal Mucosa in Crohn’s Disease. Gastroenterology 2004, 127, 412–421. [Google Scholar] [CrossRef] [PubMed]
  38. Rubinstein, M.R.; Wang, X.; Liu, W.; Hao, Y.; Cai, G.; Han, Y.W. Fusobacterium Nucleatum Promotes Colorectal Carcinogenesis by Modulating E-Cadherin/β-Catenin Signaling via Its FadA Adhesin. Cell Host Microbe 2013, 14, 195–206. [Google Scholar] [CrossRef]
  39. Kostic, A.D.; Chun, E.; Robertson, L.; Glickman, J.N.; Gallini, C.A.; Michaud, M.; Clancy, T.E.; Chung, D.C.; Lochhead, P.; Hold, G.L.; et al. Fusobacterium Nucleatum Potentiates Intestinal Tumorigenesis and Modulates the Tumor-Immune Microenvironment. Cell Host Microbe 2013, 14, 207–215. [Google Scholar] [CrossRef]
  40. Yu, T.; Guo, F.; Yu, Y.; Sun, T.; Ma, D.; Han, J.; Qian, Y.; Kryczek, I.; Sun, D.; Nagarsheth, N.; et al. Fusobacterium Nucleatum Promotes Chemoresistance to Colorectal Cancer by Modulating Autophagy. Cell 2017, 170, 548–563. [Google Scholar] [CrossRef]
  41. Biondi, A.; Basile, F.; Vacante, M. Familial Adenomatous Polyposis and Changes in the Gut Microbiota: New Insights into Colorectal Cancer Carcinogenesis. World J. Gastrointest. Oncol. 2021, 13, 495–508. [Google Scholar] [CrossRef]
  42. Li, T.; Coker, O.O.; Sun, Y.; Li, S.; Liu, C.; Lin, Y.; Wong, S.H.; Miao, Y.; Sung, J.J.Y.; Yu, J. Multi-Cohort Analysis Reveals Altered Archaea in Colorectal Cancer Fecal Samples Across Populations. Gastroenterology 2025, 168, 525–538. [Google Scholar] [CrossRef]
  43. Norman, J.M.; Handley, S.A.; Baldridge, M.T.; Droit, L.; Liu, C.Y.; Keller, B.C.; Kambal, A.; Monaco, C.L.; Zhao, G.; Fleshner, P.; et al. Disease-Specific Alterations in the Enteric Virome in Inflammatory Bowel Disease. Cell 2015, 160, 447–460. [Google Scholar] [CrossRef]
  44. Chen, F.; Li, S.; Guo, R.; Song, F.; Zhang, Y.; Wang, X.; Huo, X.; Lv, Q.; Ullah, H.; Wang, G.; et al. Meta-Analysis of Fecal Viromes Demonstrates High Diagnostic Potential of the Gut Viral Signatures for Colorectal Cancer and Adenoma Risk Assessment. J. Adv. Res. 2023, 49, 103–114. [Google Scholar] [CrossRef]
  45. Iliev, I.D.; Leonardi, I. Fungal Dysbiosis: Immunity and Interactions at Mucosal Barriers. Nat. Rev. Immunol. 2017, 17, 635–646. [Google Scholar] [CrossRef] [PubMed]
  46. Yuan, K.; Xu, H.; Li, S.; Coker, O.O.; Liu, W.; Wang, L.; Zhang, X.; Yu, J. Intraneoplastic Fungal Dysbiosis Is Associated with Colorectal Cancer Progression and Host Gene Mutation. eBioMedicine 2025, 113, 105608. [Google Scholar] [CrossRef] [PubMed]
  47. Limon, J.J.; Tang, J.; Li, D.; Wolf, A.J.; Michelsen, K.S.; Funari, V.; Gargus, M.; Nguyen, C.; Sharma, P.; Maymi, V.I.; et al. Malassezia Is Associated with Crohn’s Disease and Exacerbates Colitis in Mouse Models. Cell Host Microbe 2019, 25, 377–388. [Google Scholar] [CrossRef] [PubMed]
  48. Mok, K.; Suratanon, N.; Roytrakul, S.; Charoenlappanit, S.; Patumcharoenpol, P.; Chatchatee, P.; Vongsangnak, W.; Nakphaichit, M. ITS2 Sequencing and Targeted Meta-Proteomics of Infant Gut Mycobiome Reveal the Functional Role of Rhodotorula Sp. during Atopic Dermatitis Manifestation. J. Fungi 2021, 7, 748. [Google Scholar] [CrossRef]
  49. El Mouzan, M.; Al Quorain, A.; Assiri, A.; Almasoud, A.; Alsaleem, B.; Aladsani, A.; Al Sarkhy, A. Gut Fungal Profile in New Onset Treatment-Naïve Ulcerative Colitis in Saudi Children. Saudi J. Gastroenterol. 2025, 31, 28–33. [Google Scholar] [CrossRef]
  50. Parada Venegas, D.; De La Fuente, M.K.; Landskron, G.; González, M.J.; Quera, R.; Dijkstra, G.; Harmsen, H.J.M.; Faber, K.N.; Hermoso, M.A. Short Chain Fatty Acids (SCFAs)-Mediated Gut Epithelial and Immune Regulation and Its Relevance for Inflammatory Bowel Diseases. Front. Immunol. 2019, 10, 277. [Google Scholar] [CrossRef]
  51. Anthamatten, L.; Von Bieberstein, P.R.; Menzi, C.; Zünd, J.N.; Lacroix, C.; de Wouters, T.; Leventhal, G.E. Stratification of Human Gut Microbiomes by Succinotype Is Associated with Inflammatory Bowel Disease Status. Microbiome 2024, 12, 186. [Google Scholar] [CrossRef] [PubMed]
  52. Funabashi, M.; Grove, T.L.; Wang, M.; Varma, Y.; McFadden, M.E.; Brown, L.C.; Guo, C.; Higginbottom, S.; Almo, S.C.; Fischbach, M.A. A Metabolic Pathway for Bile Acid Dehydroxylation by the Gut Microbiome. Nature 2020, 582, 566–570. [Google Scholar] [CrossRef] [PubMed]
  53. Jia, W.; Xie, G.; Jia, W. Bile Acid–Microbiota Crosstalk in Gastrointestinal Inflammation and Carcinogenesis. Nat. Rev. Gastroenterol. Hepatol. 2018, 15, 111–128. [Google Scholar] [CrossRef]
  54. Pleguezuelos-Manzano, C.; Puschhof, J.; Rosendahl Huber, A.; Van Hoeck, A.; Wood, H.M.; Nomburg, J.; Gurjao, C.; Manders, F.; Dalmasso, G.; Stege, P.B.; et al. Mutational Signature in Colorectal Cancer Caused by Genotoxic Pks+ E. Coli. Nature 2020, 580, 269–273. [Google Scholar] [CrossRef]
  55. Lv, C.; Abdullah, M.; Su, C.-L.; Chen, W.; Zhou, N.; Cheng, Z.; Chen, Y.; Li, M.; Simpson, K.W.; Elsaadi, A.; et al. Correction: Genomic Characterization of Escherichia Coli with a Polyketide Synthase (Pks) Island Isolated from Ulcerative Colitis Patients. BMC Genom. 2025, 26, 91. [Google Scholar] [CrossRef]
  56. Tian, S.; Paudel, D.; Hao, F.; Neupane, R.; Castro, R.; Patterson, A.D.; Tiwari, A.K.; Prabhu, K.S.; Singh, V. Refined Fiber Inulin Promotes Inflammation-associated Colon Tumorigenesis by Modulating Microbial Succinate Production. Cancer Rep. 2023, 6, e1863. [Google Scholar] [CrossRef]
  57. Jiang, S.-S.; Xie, Y.-L.; Xiao, X.-Y.; Kang, Z.-R.; Lin, X.-L.; Zhang, L.; Li, C.-S.; Qian, Y.; Xu, P.-P.; Leng, X.-X.; et al. Fusobacterium Nucleatum-Derived Succinic Acid Induces Tumor Resistance to Immunotherapy in Colorectal Cancer. Cell Host Microbe 2023, 31, 781–797. [Google Scholar] [CrossRef]
  58. Li, J.; Ji, Y.; Chen, N.; Dai, L.; Deng, H. Colitis-Associated Carcinogenesis: Crosstalk between Tumors, Immune Cells and Gut Microbiota. Cell Biosci. 2023, 13, 194. [Google Scholar] [CrossRef]
  59. Silveira, C.B.; Rohwer, F.L. Piggyback-the-Winner in Host-Associated Microbial Communities. npj Biofilms Microbiomes 2016, 2, 16010. [Google Scholar] [CrossRef]
  60. Hannigan, G.D.; Duhaime, M.B.; Ruffin, M.T.; Koumpouras, C.C.; Schloss, P.D. Diagnostic Potential and Interactive Dynamics of the Colorectal Cancer Virome. mBio 2018, 9, e02248-18. [Google Scholar] [CrossRef]
  61. Zeller, G.; Tap, J.; Voigt, A.Y.; Sunagawa, S.; Kultima, J.R.; Costea, P.I.; Amiot, A.; Böhm, J.; Brunetti, F.; Habermann, N.; et al. Potential of Fecal Microbiota for Early-stage Detection of Colorectal Cancer. Mol. Syst. Biol. 2014, 10, 766. [Google Scholar] [CrossRef]
  62. Qin, Y.; Wang, Q.; Lin, Q.; Liu, F.; Pan, X.; Wei, C.; Chen, J.; Huang, T.; Fang, M.; Yang, W.; et al. Multi-Omics Analysis Reveals Associations between Gut Microbiota and Host Transcriptome in Colon Cancer Patients. mSystems 2025, 10, e00805-24. [Google Scholar] [CrossRef] [PubMed]
  63. Nagao-Kitamoto, H.; Shreiner, A.B.; Gillilland, M.G.; Kitamoto, S.; Ishii, C.; Hirayama, A.; Kuffa, P.; El-Zaatari, M.; Grasberger, H.; Seekatz, A.M.; et al. Functional Characterization of Inflammatory Bowel Disease–Associated Gut Dysbiosis in Gnotobiotic Mice. Cell. Mol. Gastroenterol. Hepatol. 2016, 2, 468–481. [Google Scholar] [CrossRef] [PubMed]
  64. Wong, S.H.; Zhao, L.; Zhang, X.; Nakatsu, G.; Han, J.; Xu, W.; Xiao, X.; Kwong, T.N.Y.; Tsoi, H.; Wu, W.K.K.; et al. Gavage of Fecal Samples From Patients With Colorectal Cancer Promotes Intestinal Carcinogenesis in Germ-Free and Conventional Mice. Gastroenterology 2017, 153, 1621–1633. [Google Scholar] [CrossRef] [PubMed]
  65. Garrett, W.S.; Lord, G.M.; Punit, S.; Lugo-Villarino, G.; Mazmanian, S.K.; Ito, S.; Glickman, J.N.; Glimcher, L.H. Communicable Ulcerative Colitis Induced by T-Bet Deficiency in the Innate Immune System. Cell 2007, 131, 33–45. [Google Scholar] [CrossRef]
  66. Yatsunenko, T.; Rey, F.E.; Manary, M.J.; Trehan, I.; Dominguez-Bello, M.G.; Contreras, M.; Magris, M.; Hidalgo, G.; Baldassano, R.N.; Anokhin, A.P.; et al. Human Gut Microbiome Viewed across Age and Geography. Nature 2012, 486, 222–227. [Google Scholar] [CrossRef]
  67. De Filippo, C.; Cavalieri, D.; Di Paola, M.; Ramazzotti, M.; Poullet, J.B.; Massart, S.; Collini, S.; Pieraccini, G.; Lionetti, P. Impact of Diet in Shaping Gut Microbiota Revealed by a Comparative Study in Children from Europe and Rural Africa. Proc. Natl. Acad. Sci. USA 2010, 107, 14691–14696. [Google Scholar] [CrossRef]
  68. El Mouzan, M.; Savidge, T.C.; Al Sarkhy, A.; Badu, S.; Alsaleem, B.; Al Mofarreh, M.; Almasood, A.; Assiri, A. Gut Virome Profile in New Onset Treatment Naïve Saudi Children with Ulcerative Colitis. Saudi J. Gastroenterol. 2025, 31, 212–218. [Google Scholar] [CrossRef]
  69. Kubinski, R.; Djamen-Kepaou, J.-Y.; Zhanabaev, T.; Hernandez-Garcia, A.; Bauer, S.; Hildebrand, F.; Korcsmaros, T.; Karam, S.; Jantchou, P.; Kafi, K.; et al. Benchmark of Data Processing Methods and Machine Learning Models for Gut Microbiome-Based Diagnosis of Inflammatory Bowel Disease. Front. Genet. 2022, 13, 784397. [Google Scholar] [CrossRef]
  70. Zhang, H.; Wu, J.; Ji, D.; Liu, Y.; Lu, S.; Lin, Z.; Chen, T.; Ao, L. Microbiome Analysis Reveals Universal Diagnostic Biomarkers for Colorectal Cancer across Populations and Technologies. Front. Microbiol. 2022, 13, 1005201. [Google Scholar] [CrossRef]
  71. Wu, Y.; Jiao, N.; Zhu, R.; Zhang, Y.; Wu, D.; Wang, A.-J.; Fang, S.; Tao, L.; Li, Y.; Cheng, S.; et al. Identification of Microbial Markers across Populations in Early Detection of Colorectal Cancer. Nat. Commun. 2021, 12, 3063. [Google Scholar] [CrossRef]
  72. Zhou, D.; Chen, Y.; Wang, Z.; Zhu, S.; Zhang, L.; Song, J.; Bai, T.; Hou, X. Integrating Clinical and Cross-Cohort Metagenomic Features: A Stable and Non-Invasive Colorectal Cancer and Adenoma Diagnostic Model. Front. Mol. Biosci. 2024, 10, 1298679. [Google Scholar] [CrossRef]
  73. Casimiro-Soriguer, C.S.; Loucera, C.; Peña-Chilet, M.; Dopazo, J. Towards a Metagenomics Machine Learning Interpretable Model for Understanding the Transition from Adenoma to Colorectal Cancer. Sci. Rep. 2022, 12, 450. [Google Scholar] [CrossRef]
  74. Khannous-Lleiffe, O.; Willis, J.R.; Saus, E.; Moreno, V.; Castellví-Bel, S.; Gabaldón, T.; on behalf of the CRIPREV Consortium. Microbiome Profiling from Fecal Immunochemical Test Reveals Microbial Signatures with Potential for Colorectal Cancer Screening. Cancers 2022, 15, 120. [Google Scholar] [CrossRef]
  75. Zhou, Z.; Ma, Y.; Zhang, D.; Ji, R.; Wang, Y.; Zhao, J.; Ma, C.; Zhu, H.; Shen, H.; Jiang, X.; et al. Microbiome and Fragmentation Pattern of Blood Cell-Free DNA and Fecal Metagenome Enhance Colorectal Cancer Micro-Dysbiosis and Diagnosis Analysis: A Proof-of-Concept Study. mSystems 2025, 10, e00276-25. [Google Scholar] [CrossRef]
  76. Guodong, W.; Yinhang, W.; Xinyue, W.; Hong, S.; Jian, C.; Zhanbo, Q.; Shuwen, H. Fecal Occult Blood Affects Intestinal Microbial Community Structure in Colorectal Cancer. BMC Microbiol. 2025, 25, 34. [Google Scholar] [CrossRef]
  77. Xiang, J.; Chai, N.; Li, L.; Hao, X.; Linghu, E. Alterations of Gut Microbiome in Patients with Colorectal Advanced Adenoma by Metagenomic Analyses. Turk. J. Gastroenterol. 2024, 35, 859–868. [Google Scholar] [CrossRef] [PubMed]
  78. Han, S.; Zhuang, J.; Pan, Y.; Wu, W.; Ding, K. Different Characteristics in Gut Microbiome between Advanced Adenoma Patients and Colorectal Cancer Patients by Metagenomic Analysis. Microbiol. Spectr. 2022, 10, e01593-22. [Google Scholar] [CrossRef] [PubMed]
  79. Zhou, Y.-H.; Sun, G. Improve the Colorectal Cancer Diagnosis Using Gut Microbiome Data. Front. Mol. Biosci. 2022, 9, 921945. [Google Scholar] [CrossRef] [PubMed]
  80. Fan, J.-Q.; Zhao, W.-F.; Lu, Q.-W.; Zha, F.-R.; Lv, L.-B.; Ye, G.-L.; Gao, H.-L. Fecal Microbial Biomarkers Combined with Multi-Target Stool DNA Test Improve Diagnostic Accuracy for Colorectal Cancer. World J. Gastrointest. Oncol. 2023, 15, 1424–1435. [Google Scholar] [CrossRef]
  81. Kolisnik, T.; Sulit, A.K.; Schmeier, S.; Frizelle, F.; Purcell, R.; Smith, A.; Silander, O. Identifying Important Microbial and Genomic Biomarkers for Differentiating Right- versus Left-Sided Colorectal Cancer Using Random Forest Models. BMC Cancer 2023, 23, 647. [Google Scholar] [CrossRef] [PubMed]
  82. Liu, L.; Shi, J.; Wang, H.; Du, H.; Yang, J.; Wei, K.; Zhou, Z.; Li, M.; Huang, S.; Zhan, L.; et al. The Characteristics of Tissue Microbiota in Different Anatomical Locations and Different Tissue Types of the Colorectum in Patients with Colorectal Cancer. mSystems 2025, 10, e00198-25. [Google Scholar] [CrossRef] [PubMed]
  83. Qi, Z.; Zhibo, Z.; Jing, Z.; Zhanbo, Q.; Shugao, H.; Weili, J.; Jiang, L.; Shuwen, H. Prediction Model of Poorly Differentiated Colorectal Cancer (CRC) Based on Gut Bacteria. BMC Microbiol. 2022, 22, 312. [Google Scholar] [CrossRef] [PubMed]
  84. Qingbo, L.; Jing, Z.; Zhanbo, Q.; Jian, C.; Yifei, S.; Yinhang, W.; Shuwen, H. Identification of Enterotype and Its Predictive Value for Patients with Colorectal Cancer. Gut Pathog. 2024, 16, 12. [Google Scholar] [CrossRef]
  85. Bars-Cortina, D.; Ramon, E.; Rius-Sansalvador, B.; Guinó, E.; Garcia-Serrano, A.; Mach, N.; Khannous-Lleiffe, O.; Saus, E.; Gabaldón, T.; Ibáñez-Sanz, G.; et al. Comparison between 16S rRNA and Shotgun Sequencing in Colorectal Cancer, Advanced Colorectal Lesions, and Healthy Human Gut Microbiota. BMC Genom. 2024, 25, 730. [Google Scholar] [CrossRef]
  86. Liu, G.; Li, T.; Zhu, X.; Zhang, X.; Wang, J. An Independent Evaluation in a CRC Patient Cohort of Microbiome 16S rRNA Sequence Analysis Methods: OTU Clustering, DADA2, and Deblur. Front. Microbiol. 2023, 14, 1178744. [Google Scholar] [CrossRef]
  87. Xu, C.; Zhou, M.; Xie, Z.; Li, M.; Zhu, X.; Zhu, H. LightCUD: A Program for Diagnosing IBD Based on Human Gut Microbiome Data. BioData Min. 2021, 14, 2. [Google Scholar] [CrossRef]
  88. Kim, H.; Na, J.E.; Kim, S.; Kim, T.-O.; Park, S.-K.; Lee, C.-W.; Kim, K.O.; Seo, G.-S.; Kim, M.S.; Cha, J.M.; et al. A Machine Learning-Based Diagnostic Model for Crohn’s Disease and Ulcerative Colitis Utilizing Fecal Microbiome Analysis. Microorganisms 2023, 12, 36. [Google Scholar] [CrossRef]
  89. Barberio, B.; Facchin, S.; Patuzzi, I.; Ford, A.C.; Massimi, D.; Valle, G.; Sattin, E.; Simionati, B.; Bertazzo, E.; Zingone, F.; et al. A Specific Microbiota Signature Is Associated to Various Degrees of Ulcerative Colitis as Assessed by a Machine Learning Approach. Gut Microbes 2022, 14, 2028366. [Google Scholar] [CrossRef]
  90. Gacesa, R.; Vich Vila, A.; Collij, V.; Mujagic, Z.; Kurilshikov, A.; Voskuil, M.D.; Festen, E.A.M.; Wijmenga, C.; Jonkers, D.M.A.E.; Dijkstra, G.; et al. A Combination of Fecal Calprotectin and Human Beta-Defensin 2 Facilitates Diagnosis and Monitoring of Inflammatory Bowel Disease. Gut Microbes 2021, 13, 1943288. [Google Scholar] [CrossRef]
  91. Jangi, S.; Hsia, K.; Zhao, N.; Kumamoto, C.A.; Friedman, S.; Singh, S.; Michaud, D.S. Dynamics of the Gut Mycobiome in Patients With Ulcerative Colitis. Clin. Gastroenterol. Hepatol. 2024, 22, 821–830. [Google Scholar] [CrossRef]
  92. Song, K.; Zhou, Y.-H. Leveraging Scheme for Cross-Study Microbiome Machine Learning Prediction and Feature Evaluations. Bioengineering 2023, 10, 231. [Google Scholar] [CrossRef] [PubMed]
  93. Kang, D.-Y.; Park, J.-L.; Yeo, M.-K.; Kang, S.-B.; Kim, J.-M.; Kim, J.S.; Kim, S.-Y. Diagnosis of Crohn’s Disease and Ulcerative Colitis Using the Microbiome. BMC Microbiol. 2023, 23, 336. [Google Scholar] [CrossRef] [PubMed]
  94. Liñares-Blanco, J.; Fernandez-Lozano, C.; Seoane, J.A.; López-Campos, G. Machine Learning Based Microbiome Signature to Predict Inflammatory Bowel Disease Subtypes. Front. Microbiol. 2022, 13, 872671. [Google Scholar] [CrossRef]
  95. Sarrabayrouse, G.; Elias, A.; Yáñez, F.; Mayorga, L.; Varela, E.; Bartoli, C.; Casellas, F.; Borruel, N.; Herrera De Guise, C.; Machiels, K.; et al. Fungal and Bacterial Loads: Noninvasive Inflammatory Bowel Disease Biomarkers for the Clinical Setting. mSystems 2021, 6, e01277-20. [Google Scholar] [CrossRef] [PubMed]
  96. Wang, X.; Xiao, Y.; Xu, X.; Guo, L.; Yu, Y.; Li, N.; Xu, C. Characteristics of Fecal Microbiota and Machine Learning Strategy for Fecal Invasive Biomarkers in Pediatric Inflammatory Bowel Disease. Front. Cell. Infect. Microbiol. 2021, 11, 711884. [Google Scholar] [CrossRef]
  97. Chen, W.; Wang, D.; Deng, X.; Zhang, H.; Dong, D.; Su, T.; Lu, Q.; Jiang, C.; Ni, Q.; Cui, Y.; et al. Bile Acid Profiling as an Effective Biomarker for Staging in Pediatric Inflammatory Bowel Disease. Gut Microbes 2024, 16, 2323231. [Google Scholar] [CrossRef]
  98. Vermeer, E.; Jagt, J.Z.; Lap, E.M.; Struys, E.A.; Budding, A.E.; Verhoeven-Duif, N.M.; Bosma, M.; Van Limbergen, J.E.; Koot, B.G.P.; De Jonge, R.; et al. Fecal Gut Microbiota and Amino Acids as Noninvasive Diagnostic Biomarkers of Pediatric Inflammatory Bowel Disease. Gut Microbes 2025, 17, 2517828. [Google Scholar] [CrossRef]
  99. Masaadeh, A.H.; Eletrebi, M.; Parajuli, B.; De Jager, N.; Bosch, D.E. Human Colitis-Associated Colorectal Carcinoma Progression Is Accompanied by Dysbiosis with Enriched Pathobionts. Gut Microbes 2025, 17, 2479774. [Google Scholar] [CrossRef]
  100. Liu, K.; Yang, X.; Zeng, M.; Yuan, Y.; Sun, J.; He, P.; Sun, J.; Xie, Q.; Chang, X.; Zhang, S.; et al. The Role of Fecal Fusobacterium Nucleatum and Pks+ Escherichia Coli as Early Diagnostic Markers of Colorectal Cancer. Dis. Markers 2021, 2021, 1–11. [Google Scholar] [CrossRef]
  101. Li, Z.; Liu, Y.; Guo, P.; Wei, Y. Construction and Validation of a Novel Angiogenesis Pattern to Predict Prognosis and Immunotherapy Efficacy in Colorectal Cancer. Aging 2023, 15, 12413–12450. [Google Scholar] [CrossRef] [PubMed]
  102. Hillege, L.E.; Trepka, K.R.; Ziemons, J.; Aarnoutse, R.; Guthrie, B.G.H.; De Vos-Geelen, J.; Iersel, L.V.; Van Hellemond, I.E.G.; Baars, A.; Vestjens, J.H.M.J.; et al. Metagenomic Analysis during Capecitabine Therapy Reveals Microbial Chemoprotective Mechanisms and Predicts Drug Toxicity in Colorectal Cancer Patients. Oncology 2024, 12, 2024.10.11.24315249. [Google Scholar] [CrossRef]
  103. Hillege, L.E.; Trepka, K.R.; Guthrie, B.G.H.; Fu, X.; Aarnoutse, R.; Paymar, M.R.; Olson, C.; Zhang, C.; Ortega, E.; Ramirez, L.; et al. Microbial Vitamin Biosynthesis Links Gut Microbiota Dynamics to Chemotherapy Toxicity. mBio 2025, 16, e00930-25. [Google Scholar] [CrossRef] [PubMed]
  104. Xiaofeng, N.; Jian, C.; Jingjing, W.; Zhanbo, Q.; Yifei, S.; Jing, Z.; Shuwen, H. Correlation of Gut Microbiota with Leukopenia after Chemotherapy in Patients with Colorectal Cancer. BMC Microbiol. 2023, 23, 349. [Google Scholar] [CrossRef]
  105. Nichols, B.; Briola, A.; Logan, M.; Havlik, J.; Mascellani, A.; Gkikas, K.; Milling, S.; Ijaz, U.Z.; Quince, C.; Svolos, V.; et al. Gut Metabolome and Microbiota Signatures Predict Response to Treatment with Exclusive Enteral Nutrition in a Prospective Study in Children with Active Crohn’s Disease. Am. J. Clin. Nutr. 2024, 119, 885–895. [Google Scholar] [CrossRef]
  106. Dang, Y.; Xu, X.; Ma, J.; Zhou, M.; Xu, C.; Huang, X.; Xu, F.; Wang, Z.; Shi, H.; Zhang, S. Gut Microbiome Signatures Predict 5-ASA Efficacy in Ulcerative Colitis. iScience 2025, 28, 112568. [Google Scholar] [CrossRef]
  107. Zheng, Q.; Zhong, Y.; Lian, H.; Zhuang, J.; Wang, L.; Chen, J.; Wang, H.; Wang, H.; Ye, X.; Huang, Z.; et al. Gut Microbial Signatures Associated With Clinical Remission in Inflammatory Bowel Disease Treated With Biologics: A Comprehensive Multi-Cohort Analysis. United Eur. Gastroenterol. J. 2025, ueg2.70064. [Google Scholar] [CrossRef]
  108. Sakurai, T.; Nishiyama, H.; Sakai, K.; De Velasco, M.A.; Nagai, T.; Komeda, Y.; Kashida, H.; Okada, A.; Kawai, I.; Nishio, K.; et al. Mucosal Microbiota and Gene Expression Are Associated with Long-Term Remission after Discontinuation of Adalimumab in Ulcerative Colitis. Sci. Rep. 2020, 10, 19186. [Google Scholar] [CrossRef]
  109. Al Radi, Z.M.A.; Prins, F.M.; Collij, V.; Vich Vila, A.; Festen, E.A.M.; Dijkstra, G.; Weersma, R.K.; Klaassen, M.A.Y.; Gacesa, R. Exploring the Predictive Value of Gut Microbiome Signatures for Therapy Intensification in Patients with Inflammatory Bowel Disease: A 10-Year Follow-up Study. Inflamm. Bowel Dis. 2024, 30, 1642–1653. [Google Scholar] [CrossRef]
  110. He, R.; Li, P.; Wang, J.; Cui, B.; Zhang, F.; Zhao, F. The Interplay of Gut Microbiota between Donors and Recipients Determines the Efficacy of Fecal Microbiota Transplantation. Gut Microbes 2022, 14, 2100197. [Google Scholar] [CrossRef]
  111. Smyth, J.; Godet, J.; Choudhary, A.; Das, A.; Gkoutos, G.V.; Acharjee, A. Microbiome-Based Colon Cancer Patient Stratification and Survival Analysis. Cancer Med. 2024, 13, e70434. [Google Scholar] [CrossRef]
  112. Shiroma, H.; Shiba, S.; Erawijantari, P.P.; Takamaru, H.; Yamada, M.; Sakamoto, T.; Kanemitsu, Y.; Mizutani, S.; Soga, T.; Saito, Y.; et al. Surgical Treatment for Colorectal Cancer Partially Restores Gut Microbiome and Metabolome Traits. mSystems 2022, 7, e00018-22. [Google Scholar] [CrossRef]
  113. Trivieri, N.; Pracella, R.; Cariglia, M.G.; Panebianco, C.; Parrella, P.; Visioli, A.; Giani, F.; Soriano, A.A.; Barile, C.; Canistro, G.; et al. BRAFV600E Mutation Impinges on Gut Microbial Markers Defining Novel Biomarkers for Serrated Colorectal Cancer Effective Therapies. J. Exp. Clin. Cancer Res. 2020, 39, 285. [Google Scholar] [CrossRef]
  114. Serrano-Gómez, G.; Mayorga, L.; Oyarzun, I.; Roca, J.; Borruel, N.; Casellas, F.; Varela, E.; Pozuelo, M.; Machiels, K.; Guarner, F.; et al. Dysbiosis and Relapse-Related Microbiome in Inflammatory Bowel Disease: A Shotgun Metagenomic Approach. Comput. Struct. Biotechnol. J. 2021, 19, 6481–6489. [Google Scholar] [CrossRef]
  115. O’Sullivan, J.; Patel, S.; Leventhal, G.E.; Fitzgerald, R.S.; Laserna-Mendieta, E.J.; Huseyin, C.E.; Konstantinidou, N.; Rutherford, E.; Lavelle, A.; Dabbagh, K.; et al. Host-Microbe Multi-Omics and Succinotype Profiling Have Prognostic Value for Future Relapse in Patients with Inflammatory Bowel Disease. Gut Microbes 2025, 17, 2450207. [Google Scholar] [CrossRef]
  116. Jacobs, J.P.; Sauk, J.S.; Ahdoot, A.I.; Liang, F.; Katzka, W.; Ryu, H.J.; Khandadash, A.; Lagishetty, V.; Labus, J.S.; Naliboff, B.D.; et al. Microbial and Metabolite Signatures of Stress Reactivity in Ulcerative Colitis Patients in Clinical Remission Predict Clinical Flare Risk. Inflamm. Bowel Dis. 2024, 30, 336–346. [Google Scholar] [CrossRef] [PubMed]
  117. Jacobs, J.P.; Goudarzi, M.; Lagishetty, V.; Li, D.; Mak, T.; Tong, M.; Ruegger, P.; Haritunians, T.; Landers, C.; Fleshner, P.; et al. Crohn’s Disease in Endoscopic Remission, Obesity, and Cases of High Genetic Risk Demonstrate Overlapping Shifts in the Colonic Mucosal-Luminal Interface Microbiome. Genome Med. 2022, 14, 91. [Google Scholar] [CrossRef] [PubMed]
  118. Huang, L.; Meng, J.; Lin, S.; Peng, Z.; Zhang, R.; Shen, X.; Zheng, W.; Zheng, Q.; Wu, L.; Wang, X.; et al. Integrating Gut Microbiome and Metabolomics with Magnetic Resonance Enterography to Advance Bowel Damage Prediction in Crohn’s Disease. J. Inflamm. Res. 2025, 18, 7631–7649. [Google Scholar] [CrossRef] [PubMed]
  119. Machiels, K.; Pozuelo Del Río, M.; Martinez-De La Torre, A.; Xie, Z.; Pascal Andreu, V.; Sabino, J.; Santiago, A.; Campos, D.; Wolthuis, A.; D’Hoore, A.; et al. Early Postoperative Endoscopic Recurrence in Crohn’s Disease Is Characterised by Distinct Microbiota Recolonisation. J. Crohns Colitis 2020, 14, 1535–1546. [Google Scholar] [CrossRef]
  120. Syama, K.; Jothi, J.A.A.; Khanna, N. Automatic Disease Prediction from Human Gut Metagenomic Data Using Boosting GraphSAGE. BMC Bioinform. 2023, 24, 126. [Google Scholar] [CrossRef]
  121. Zhou, Y.-L.; Deng, J.-W.; Liu, Z.-H.; Ma, X.-Y.; Zhu, C.-Q.; Xie, Y.-H.; Zhou, C.-B.; Fang, J.-Y. Derivation and Validation of Lifestyle-Based and Microbiota-Based Models for Colorectal Adenoma Risk Evaluation and Self-Prediction. BMJ Open Gastroenterol. 2025, 12, e001597. [Google Scholar] [CrossRef]
  122. Huang, A.; Torres, A.; Patel, R.; Saxena, A.; Patel, M. Fusobacterium Nucleatum as a Marker for Epithelial to Mesenchymal Transition in Colorectal Cancer. FASEB J. 2022, 36, fasebj.2022.36.S1.L7993. [Google Scholar] [CrossRef]
  123. Pignatelli, P.; Nuccio, F.; Piattelli, A.; Curia, M.C. The Role of Fusobacterium Nucleatum in Oral and Colorectal Carcinogenesis. Microorganisms 2023, 11, 2358. [Google Scholar] [CrossRef]
  124. Zhang, S.; Li, C.; Liu, J.; Geng, F.; Shi, X.; Li, Q.; Lu, Z.; Pan, Y. Fusobacterium Nucleatum Promotes Epithelial-mesenchymal Transiton through Regulation of the lncRNA MIR4435-2HG/miR-296-5p/Akt2/SNAI1 Signaling Pathway. FEBS J. 2020, 287, 4032–4047. [Google Scholar] [CrossRef]
  125. Shimomura, Y.; Sugi, Y.; Kume, A.; Tanaka, W.; Yoshihara, T.; Matsuura, T.; Komiya, Y.; Ogata, Y.; Suda, W.; Hattori, M.; et al. Strain-Level Detection of Fusobacterium Nucleatum in Colorectal Cancer Specimens by Targeting the CRISPR–Cas Region. Microbiol. Spectr. 2023, 11, e05123-22. [Google Scholar] [CrossRef]
  126. Yang, C.; Zhao, Y.; Chen, J.; Chen, N. P1350 The Role of Oral Porphyromonas Gingivalis and Its Outer Membrane Vesicles in Inflammatory Bowel Disease. J. Crohns Colitis 2025, 19 (Suppl. S1), i2431. [Google Scholar] [CrossRef]
  127. Abdelbary, M.M.H.; Hatting, M.; Bott, A.; Dahlhausen, A.; Keller, D.; Trautwein, C.; Conrads, G. The Oral-Gut Axis: Salivary and Fecal Microbiome Dysbiosis in Patients with Inflammatory Bowel Disease. Front. Cell. Infect. Microbiol. 2022, 12, 1010853. [Google Scholar] [CrossRef]
  128. Zhou, T.; Xu, W.; Wang, Q.; Jiang, C.; Li, H.; Chao, Y.; Sun, Y.; Lan, A. The Effect of the “Oral-Gut” Axis on Periodontitis in Inflammatory Bowel Disease: A Review of Microbe and Immune Mechanism Associations. Front. Cell. Infect. Microbiol. 2023, 13, 1132420. [Google Scholar] [CrossRef]
  129. Xiang, B.; Runzhi, C.; Min, Z.; Yuanqi, Z.; Xiang, P.; Wei, W.; Qi, Z.; Zhiyin, G.; Jun, H.; Tao, D.; et al. P1214 The Crosstalk between Oral and Intestinal Microbiota in Inflammatory Bowel Disease Patients. J. Crohns Colitis 2024, 18 (Suppl. S1), i2145. [Google Scholar] [CrossRef]
  130. Wands, D.; Whelan, R.; Hansen, R.; Rimmer, P.; Iqbal, T.; Ho, G.T. P1328 Reduced Salivary Alpha-Diversity in Inflammatory Bowel Disease: Systematic Review and Meta-Analysis of 1000 IBD Patients—A Role for Oral Microbiome as a Marker of Downstream Dysbiosis in IBD. J. Crohns Colitis 2025, 19 (Suppl. S1), i2390–i2392. [Google Scholar] [CrossRef]
  131. Camañes-Gonzalvo, S.; Montiel-Company, J.M.; Lobo-de-Mena, M.; Safont-Aguilera, M.J.; Fernández-Diaz, A.; López-Roldán, A.; Paredes-Gallardo, V.; Bellot-Arcís, C. Relationship between Oral Microbiota and Colorectal Cancer: A Systematic Review. J. Periodontal Res. 2024, 59, 1071–1082. [Google Scholar] [CrossRef]
  132. Ji, S.; Kook, J.-K.; Park, S.-N.; Lim, Y.K.; Choi, G.H.; Jung, J.-S. Characteristics of the Salivary Microbiota in Periodontal Diseases and Potential Roles of Individual Bacterial Species To Predict the Severity of Periodontal Disease. Microbiol. Spectr. 2023, 11, e04327-22. [Google Scholar] [CrossRef]
  133. Akase, T.; Inubushi, J.; Hayashi-Okada, Y.; Shimizu, Y. Association of Fusobacterium Nucleatum in Human Saliva with Periodontal Status and Composition of the Salivary Microbiome Including Periodontopathogens. Microbiol. Spectr. 2024, 12, e00855-24. [Google Scholar] [CrossRef]
  134. Chen, Q.; Huang, X.; Zhang, H.; Jiang, X.; Zeng, X.; Li, W.; Su, H.; Chen, Y.; Lin, F.; Li, M.; et al. Characterization of Tongue Coating Microbiome from Patients with Colorectal Cancer. J. Oral Microbiol. 2024, 16, 2344278. [Google Scholar] [CrossRef]
  135. Rezasoltani, S.; Aghdaei, H.A.; Jasemi, S.; Gazouli, M.; Dovrolis, N.; Sadeghi, A.; Schlüter, H.; Zali, M.R.; Sechi, L.A.; Feizabadi, M.M. Oral Microbiota as Novel Biomarkers for Colorectal Cancer Screening. Cancers 2022, 15, 192. [Google Scholar] [CrossRef]
  136. Kang, S.-B.; Kim, H.; Kim, S.; Kim, J.; Park, S.-K.; Lee, C.-W.; Kim, K.O.; Seo, G.-S.; Kim, M.S.; Cha, J.M.; et al. Potential Oral Microbial Markers for Differential Diagnosis of Crohn’s Disease and Ulcerative Colitis Using Machine Learning Models. Microorganisms 2023, 11, 1665. [Google Scholar] [CrossRef]
  137. Rezasoltani, S.; Azizmohammad Looha, M.; Asadzadeh Aghdaei, H.; Jasemi, S.; Sechi, L.A.; Gazouli, M.; Sadeghi, A.; Torkashvand, S.; Baniali, R.; Schlüter, H.; et al. 16S rRNA Sequencing Analysis of the Oral and Fecal Microbiota in Colorectal Cancer Positives versus Colorectal Cancer Negatives in Iranian Population. Gut Pathog. 2024, 16, 9. [Google Scholar] [CrossRef]
  138. Wang, Y.; Zhang, Y.; Wang, Z.; Tang, J.; Cao, D.-X.; Qian, Y.; Xie, Y.-H.; Chen, H.-Y.; Chen, Y.-X.; Chen, Z.-F.; et al. A Clinical Nomogram Incorporating Salivary Desulfovibrio Desulfuricans Level and Oral Hygiene Index for Predicting Colorectal Cancer. Ann. Transl. Med. 2021, 9, 754. [Google Scholar] [CrossRef]
  139. Čipčić Paljetak, H.; Barešić, A.; Panek, M.; Perić, M.; Matijašić, M.; Lojkić, I.; Barišić, A.; Vranešić Bender, D.; Ljubas Kelečić, D.; Brinar, M.; et al. Gut Microbiota in Mucosa and Feces of Newly Diagnosed, Treatment-Naïve Adult Inflammatory Bowel Disease and Irritable Bowel Syndrome Patients. Gut Microbes 2022, 14, 2083419. [Google Scholar] [CrossRef]
  140. Juge, N. Relationship between Mucosa-Associated Gut Microbiota and Human Diseases. Biochem. Soc. Trans. 2022, 50, 1225–1236. [Google Scholar] [CrossRef]
  141. Lepage, P.; Seksik, P.; Sutren, M.; De La Cochetière, M.-F.; Jian, R.; Marteau, P.; Doré, J. Biodiversity of the Mucosa-Associated Microbiota Is Stable Along the Distal Digestive Tract in Healthy Individuals and Patients With IBD. Inflamm. Bowel Dis. 2005, 11, 473–480. [Google Scholar] [CrossRef]
  142. Effendi, R.M.R.A.; Anshory, M.; Kalim, H.; Dwiyana, R.F.; Suwarsa, O.; Pardo, L.M.; Nijsten, T.E.C.; Thio, H.B. Akkermansia Muciniphila and Faecalibacterium Prausnitzii in Immune-Related Diseases. Microorganisms 2022, 10, 2382. [Google Scholar] [CrossRef]
  143. Goldiș, A.; Dragomir, R.; Mercioni, M.A.; Goldiș, C.; Sirca, D.; Enătescu, I.; Olariu, L.; Belei, O. Personalized Microbiome Modulation to Improve Clinical Outcomes in Pediatric Inflammatory Bowel Disease: A Multi-Omics and Interventional Approach. Microorganisms 2025, 13, 1047. [Google Scholar] [CrossRef]
  144. Zhu, B.; Bai, Y.; Yeo, Y.Y.; Lu, X.; Rovira-Clavé, X.; Chen, H.; Yeung, J.; Gerber, G.K.; Angelo, M.; Shalek, A.K.; et al. A Spatial Multi-Modal Dissection of Host-Microbiome Interactions within the Colitis Tissue Microenvironment. Immunology 2024, 2024.03.04.583400. [Google Scholar] [CrossRef]
  145. Hu, S.; Bourgonje, A.R.; Gacesa, R.; Jansen, B.H.; Björk, J.R.; Bangma, A.; Hidding, I.J.; Van Dullemen, H.M.; Visschedijk, M.C.; Faber, K.N.; et al. Mucosal Host-Microbe Interactions Associate with Clinical Phenotypes in Inflammatory Bowel Disease. Nat. Commun. 2024, 15, 1470. [Google Scholar] [CrossRef]
  146. Santangelo, B.E.; Bada, M.; Hunter, L.E.; Lozupone, C. Hypothesizing Mechanistic Links between Microbes and Disease Using Knowledge Graphs. Sci. Rep. 2025, 15, 6905. [Google Scholar] [CrossRef]
  147. Kuang, J.; Zheng, X.; Jia, W. Investigating Regional-Specific Gut Microbial Distribution: An Uncharted Territory in Disease Therapeutics. Protein Cell 2024, 16, 623–640. [Google Scholar] [CrossRef]
  148. Kastl, A.J.; Terry, N.A.; Wu, G.D.; Albenberg, L.G. The Structure and Function of the Human Small Intestinal Microbiota: Current Understanding and Future Directions. Cell. Mol. Gastroenterol. Hepatol. 2020, 9, 33–45. [Google Scholar] [CrossRef]
  149. Nagayama, M.; Yano, T.; Atarashi, K.; Tanoue, T.; Sekiya, M.; Kobayashi, Y.; Sakamoto, H.; Miura, K.; Sunada, K.; Kawaguchi, T.; et al. TH1 Cell-Inducing Escherichia Coli Strain Identified from the Small Intestinal Mucosa of Patients with Crohn’s Disease. Gut Microbes 2020, 12, 1788898. [Google Scholar] [CrossRef]
  150. Ma, X.; Lu, X.; Zhang, W.; Yang, L.; Wang, D.; Xu, J.; Jia, Y.; Wang, X.; Xie, H.; Li, S.; et al. Gut Microbiota in the Early Stage of Crohn’s Disease Has Unique Characteristics. Gut Pathog. 2022, 14, 46. [Google Scholar] [CrossRef]
  151. Briggs, K.; Tomar, V.; Ollberding, N.; Haberman, Y.; Bourgonje, A.R.; Hu, S.; Chaaban, L.; Sunuwar, L.; Weersma, R.K.; Denson, L.A.; et al. Crohn’s Disease–Associated Pathogenic Mutation in the Manganese Transporter ZIP8 Shifts the Ileal and Rectal Mucosal Microbiota Implicating Aberrant Bile Acid Metabolism. Inflamm. Bowel Dis. 2024, 30, 1379–1388. [Google Scholar] [CrossRef]
  152. Hale, M.F. Capsule Endoscopy: Current Practice and Future Directions. World J. Gastroenterol. 2014, 20, 7752. [Google Scholar] [CrossRef]
  153. Matsuzawa, H.; Munakata, S.; Kawai, M.; Sugimoto, K.; Kamiyama, H.; Takahashi, M.; Kojima, Y.; Sakamoto, K. Analysis of Ileostomy Stool Samples Reveals Dysbiosis in Patients with High-Output Stomas. Biosci. Microbiota Food Health 2021, 40, 135–143. [Google Scholar] [CrossRef]
  154. Wirbel, J.; Zych, K.; Essex, M.; Karcher, N.; Kartal, E.; Salazar, G.; Bork, P.; Sunagawa, S.; Zeller, G. Microbiome Meta-Analysis and Cross-Disease Comparison Enabled by the SIAMCAT Machine Learning Toolbox. Genome Biol. 2021, 22, 93. [Google Scholar] [CrossRef]
  155. Lee, Y.; Cappellato, M.; Di Camillo, B. Machine Learning–Based Feature Selection to Search Stable Microbial Biomarkers: Application to Inflammatory Bowel Disease. GigaScience 2022, 12, giad083. [Google Scholar] [CrossRef]
  156. Unal, M.; Bostanci, E.; Ozkul, C.; Acici, K.; Asuroglu, T.; Guzel, M.S. Crohn’s Disease Prediction Using Sequence Based Machine Learning Analysis of Human Microbiome. Diagnostics 2023, 13, 2835. [Google Scholar] [CrossRef]
  157. Lee, S.; Lee, I. Comprehensive Assessment of Machine Learning Methods for Diagnosing Gastrointestinal Diseases through Whole Metagenome Sequencing Data. Gut Microbes 2024, 16, 2375679. [Google Scholar] [CrossRef]
  158. Rynazal, R.; Fujisawa, K.; Shiroma, H.; Salim, F.; Mizutani, S.; Shiba, S.; Yachida, S.; Yamada, T. Leveraging Explainable AI for Gut Microbiome-Based Colorectal Cancer Classification. Genome Biol. 2023, 24, 21. [Google Scholar] [CrossRef]
  159. DiMucci, D.; Kon, M.; Segrè, D. BowSaw: Inferring Higher-Order Trait Interactions Associated With Complex Biological Phenotypes. Front. Mol. Biosci. 2021, 8, 663532. [Google Scholar] [CrossRef]
  160. Muller, E.; Shiryan, I.; Borenstein, E. Multi-Omic Integration of Microbiome Data for Identifying Disease-Associated Modules. Nat. Commun. 2024, 15, 2621. [Google Scholar] [CrossRef]
  161. Pope, Q.; Varma, R.; Tataru, C.; David, M.M.; Fern, X. Learning a Deep Language Model for Microbiomes: The Power of Large Scale Unlabeled Microbiome Data. PLOS Comput. Biol. 2025, 21, e1011353. [Google Scholar] [CrossRef] [PubMed]
  162. Imangaliyev, S.; Schlötterer, J.; Meyer, F.; Seifert, C. Diagnosis of Inflammatory Bowel Disease and Colorectal Cancer through Multi-View Stacked Generalization Applied on Gut Microbiome Data. Diagnostics 2022, 12, 2514. [Google Scholar] [CrossRef] [PubMed]
  163. Peng, Y.; Liu, Y.; Liu, Y.; Wang, J. Comprehensive Data Optimization and Risk Prediction Framework: Machine Learning Methods for Inflammatory Bowel Disease Prediction Based on the Human Gut Microbiome Data. Front. Microbiol. 2024, 15, 1483084. [Google Scholar] [CrossRef] [PubMed]
  164. Mulenga, M.; Rajamanikam, A.; Kumar, S.; Muhammad, S.B.; Bhassu, S.; Samudid, C.; Sabri, A.Q.M.; Seera, M.; Eke, C.I. Revolutionizing Colorectal Cancer Detection: A Breakthrough in Microbiome Data Analysis. PLoS ONE 2025, 20, e0316493. [Google Scholar] [CrossRef]
  165. Pusa, T.; Rousu, J. Stable Biomarker Discovery in Multi-Omics Data via Canonical Correlation Analysis. PLoS ONE 2024, 19, e0309921. [Google Scholar] [CrossRef]
  166. Onwuka, S.; Bravo-Merodio, L.; Gkoutos, G.V.; Acharjee, A. Explainable AI-Prioritized Plasma and Fecal Metabolites in Inflammatory Bowel Disease and Their Dietary Associations. iScience 2024, 27, 110298. [Google Scholar] [CrossRef]
Figure 1. Landscape summary of microbial disruption, barrier dysfunction, and carcinogenesis in IBD and CRC. AIEC, adherent-invasive Escherichia coli; SCFA, short-chain fatty acids; FMT, fecal microbiota transfer; DCA, deoxycholic acid; LCA, lithocholic acid; EMT, epithelial–mesenchymal transition; CAC, colitis-associated cancer; CRC, colorectal cancer; AOM, azoxymethane; DSS, dextran sodium sulfate; TP53, tumor protein 53; APC, adenomatous polyposis coli; SNP, single nucleotide polymorphism; pks+, polyketide synthase-positive strain; cGAS, cyclic GMP–AMP synthase; Treg, regulatory T cell; FAP, familial adenomatous polyposis. Images used in the illustrations were designed via ChatGPT (version 5o).
Figure 2. Conceptual workflow linking site-specific microbiota to intestinal disease and computational analysis. Three anatomical niches—oral cavity, mucosal-associated microbiota (MAM), and small-intestine microbiota—converge to drive dysbiosis and chronic inflammation, which progresses to IBD and CRC. Downstream, an AI/ML Integration layer illustrates current analytics: validated niche-specific classifiers (saliva AUC 0.90–0.97; mucosal AUC 0.85–0.93), explainable AI (SHAP), and emerging multi-omic or graph-based models for cross-site risk prediction. Solid arrows trace this biological-to-computational flow. AI, artificial intelligence; ML, machine learning; IBD, inflammatory bowel disease; CRC, colorectal cancer.
Table 1. CRC models.
| Reference | Objective | Model (Highest Performance) | Input | Output | Publication Date |
|---|---|---|---|---|---|
| [44] | Diagnosis of CRC using gut viral signatures | RF | Stool virome (405 CRC-associated vOTUs) | AUC 0.830 (cross-cohort, CRC vs. HC) | 2 October 2022 |
| [70] | Diagnosis of CRC using microbial and functional profiles | RF | Stool microbiota (genus-level), KEGG functional profiles | Genus-level: AUC 0.84 (CRC vs. HC), 0.73 (CRC vs. CA); 3-genus signature: AUC 0.87 (CRC vs. HC), 0.67 (CRC vs. CA) | 3 November 2022 |
| [71] | Diagnosis of colorectal adenoma and CRC | RF | Stool microbiota (ASV-level), age, sex, and BMI | AUC 0.78 (adenoma vs. control), AUC 0.84 (adenoma vs. CRC, external validation) | 24 May 2021 |
| [72] | CRC and adenoma diagnosis integrating microbiome and clinical data | RF | Large-scale metagenomic data + clinical variables | AUC 0.939 (CRC), 0.925 (adenoma) | 22 January 2024 |
| [73] | Interpretable ML model for CRC and adenoma classification using functional profiles | Explainable Boosting Machine (EBM) | Stool microbiota (WGS), KEGG, eggNOG functional profile | eggNOG profile 100% hit ratio for all the performed tests | 10 January 2022 |
| [74] | Risk stratification of FIT-positive individuals using microbiome signatures | Neural Network | 16S rRNA (V3–V4) 8 selected taxa, age, sex, fecal hemoglobin concentration (FIT) | Sensitivity: 98.98% (CRC), 97.98% (CR lesions) | 25 December 2022 |
| [75] | Diagnosis of adenoma and CRC using blood cfDNA microbiome | RF | Blood cfDNA | AUC 0.8849 (adenoma), AUC 0.9824 (CRC) | 29 April 2025 |
| [76] | To evaluate the impact of FOBT on gut microbiota composition and improve CRC prediction models | SVM | Gut microbiota (16S rRNA, genus-level), FOBT status | Accuracy without FOBT: 89.71% → with FOBT: 92% | 20 January 2025 |
| [77] | Diagnosis of advanced adenoma using gut microbiota | RF | Shotgun metagenomic data (species-level abundance) | AUC 0.799 (adenoma vs. control) | 10 October 2024 |
| [78] | Discriminating advanced adenoma from CRC using metagenomic microbiome and SNP data | RF (SNP model) | Microbial SNPs from fecal metagenomic data | Accuracy 92.31% | 1 December 2022 |
| [79] | Improving CRC diagnosis using the gut microbiome | RF, BART | 16S rRNA/Shotgun metagenomics | AUC 0.867 (RF), 0.882 (BART) | 12 August 2022 |
| [80] | Improve CRC diagnostic accuracy by combining gut microbiota, MT-sDNA, and tumor markers | RF | 16S rRNA (genus-level), MT-sDNA, CEA | Accuracy 97.1%, Sensitivity 98.1%, Specificity 92.3% | 15 August 2023 |
| [81] | Classification of right- vs. left-sided colorectal cancer using tumor-associated microbial features | RF | Tumor-derived microbial RNA-seq expression data | AUC 0.9, 0.76, and 0.89 for the human genomic, microbial, and combined feature sets, respectively | 11 July 2023 |
| [82] | Tissue microbiome analysis by site and type in CRC patients | RF | Tissue-derived metagenomic data (WGS) | Site- and tissue-specific microbial signatures | 27 May 2025 |
| [83] | Classification of poorly vs. moderately differentiated colorectal cancer using gut microbiota | RF | Fecal 16S rRNA data (V1–V4) | Accuracy 100% | 20 December 2022 |
| [84] | Classification of healthy, adenoma, and CRC based on enterotype-specific gut microbiota | RF | Fecal 16S rRNA sequencing data (genus-level), stratified by enterotype | AUC 0.78 (S_E model, CRC vs. non-CRC) | 27 February 2024 |
| [85] | CRC and adenoma diagnosis integrating microbiome and clinical data | RF | Large-scale metagenomic data + clinical variables | AUC 0.939 (CRC), 0.925 (adenoma) | 22 January 2024 |
Abbreviations: CRC, colorectal cancer; CA, colorectal adenoma; HC, healthy control; cfDNA, cell-free DNA; WGS, whole genome shotgun sequencing; ASV, amplicon sequence variant; DL, deep learning; ML, machine learning; EBM, explainable boosting machine; FIT, fecal immunochemical test; vOTU, viral operational taxonomic unit; FOBT, fecal occult blood test; MT-sDNA, multi-target stool DNA test; SNP, single nucleotide polymorphism; BART, Bayesian additive regression trees; RF, random forest.
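The Output column above mixes AUC, accuracy, sensitivity, and specificity, which are not directly comparable: AUC is threshold-free, whereas the other three depend on a chosen decision cutoff. As an illustration only, the following minimal sketch shows how these quantities are commonly computed from classifier scores. The function names, the 0.5 cutoff, and the toy labels and scores are hypothetical and are not taken from any of the cited studies.

```python
from typing import Dict, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win
    (the Mann-Whitney U formulation of the area under the ROC curve).
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for h in controls:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def confusion_metrics(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy at a fixed score cutoff.

    Assumes both classes are present; the cutoff value is an assumption.
    """
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical toy data: 1 = CRC, 0 = healthy control; scores are made up.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
metrics = confusion_metrics(labels, scores, threshold=0.5)
print(auc, metrics)
```

In this toy example the AUC is higher than the accuracy at the 0.5 cutoff, which illustrates why a study reporting only accuracy or sensitivity cannot be ranked against one reporting only AUC without access to the underlying scores.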
Table 2. IBD models.
| Reference | Objective | Model (Highest Performance) | Input | Output | Publication Date |
|---|---|---|---|---|---|
| [87] | Development of LightCUD for IBD and UC/CD diagnosis | LightGBM | WGS (strain-level), 16S rRNA (genus-level) | AUC 0.984 (IBD vs. HC, WGS), AUC 0.989 (CD vs. UC, WGS) | 19 January 2021 |
| [88] | Diagnosis of IBD and subtype differentiation (CD vs. UC) | sPLS-DA | 16S rRNA (phylotype-level, V3–V4) | AUC 0.992 (IBD vs. HC), AUC 0.988 (CD vs. UC) | 24 December 2023 |
| [89] | Identify the microbiota signature by UC disease activity | sPLS-DA | 16S rRNA (V3–V4) | Perfect class prediction (Active UC/Inactive UC/HC) | 28 December 2021 |
| [90] | Non-invasive diagnosis and monitoring of IBD using fecal biomarkers and microbiome | Logistic Regression | Fecal HBD2, FCal, 16S rRNA (genus-level) | AUC 0.93 (IBD vs. IBS) | 7 June 2021 |
| [91] | Classify disease activity in UC based on gut fungal (mycobiome) signatures | RF | ITS2 | AUC ~0.80 (Active vs. Remission UC) | 19 April 2024 |
| [92] | Incorporating external samples to improve model robustness and generalizability | RF, among others | 16S rRNA (genus-level) | AUC increased by up to 0.075 | 8 February 2023 |
| [93] | Diagnosis of UC and CD patients | Regularized Logistic Regression | Fecal WGS (species-level) | AUC 0.873 (train), 0.778 (test), 0.633 (validation) | 11 November 2023 |
| [94] | Diagnosis of UC and CD patients | RF | 16S rRNA (genus-level OTU) | AUC 0.76 (HC vs. CD), 0.74 (HC vs. UC) | 17 May 2022 |
| [95] | Including absolute microbial load and clinical markers | RF | Fungal 18S rDNA copies, bacterial 16S rDNA copies | AUC 0.86 (UC vs. CD) | 27 April 2021 |
| [96] | Diagnosis of PIBD using fecal microbiota | RF | Stool microbiota (11 OTUs) | AUC 0.88 (HC vs. PIBD), 0.84 (IBS vs. PIBD) | 7 December 2021 |
| [97] | Classifying PIBD activity using bile acids | RF | Serum BAs | AUC 0.84 | 21 February 2024 |
| [98] | Noninvasive diagnosis of pediatric IBD using fecal AAs and microbiota | Logistic Regression | Fecal microbiota, AAs | AUC 0.94 (Discovery), 0.84 (Validation) | 4 June 2025 |
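Abbreviations: IBD, inflammatory bowel disease; IBS, irritable bowel syndrome; PIBD, pediatric inflammatory bowel disease; rRNA, ribosomal ribonucleic acid; WGS, whole genome shotgun sequencing; AUC, area under curve; HC, healthy control; CD, Crohn’s disease; UC, ulcerative colitis; ITS2, internal transcribed spacer 2; FCal, fecal calprotectin; HBD2, human beta-defensin 2; BAs, bile acids; AAs, amino acids; OTU, operational taxonomic unit; BMI, body mass index; RF, random forest; sPLS-DA, sparse partial least squares–discriminant analysis.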
Table 3. Comparative summary of gut-associated microbiota niches across five domains. Side-by-side comparison of Oral, MAM, and small-intestine microbiome across five domains: diagnostic utility, key bacterial taxa, sampling methods, challenges, and current AI/ML integration. Quantitative AUCs are drawn from peer-reviewed studies. IBD, inflammatory bowel disease; CRC, colorectal cancer; F. nucleatum, Fusobacterium nucleatum; S. anginosus, Streptococcus anginosus; P. intermedia, Prevotella intermedia; F. prausnitzii, Faecalibacterium prausnitzii; A. muciniphila, Akkermansia muciniphila; AIEC, adherent-invasive Escherichia coli; R. gnavus, Ruminococcus gnavus; xAI, explainable artificial intelligence; AI, artificial intelligence.
| Domain | Oral Microbiome | Mucosal-Associated Microbiota (MAM) | Small Intestine Microbiome |
|---|---|---|---|
| Diagnostic Utility | AUC ≈ 0.90–0.97; fully non-invasive screening | Outperforms fecal profiles in IBD/CRC stratification | Early-stage evidence; limited clinical use |
| Key Bacterial Taxa | F. nucleatum, S. anginosus (CRC); P. intermedia, Veillonella (IBD) | Beneficial: F. prausnitzii, A. muciniphila; Pathogenic: AIEC | E. coli 35A1, R. gnavus (CD) |
| Sampling Methods | Saliva, dental plaque, tongue coating: repeatable, non-invasive | Endoscopic biopsy: high precision, invasive | Endoscopy, capsule/string tests, stoma effluent: technically demanding |
| Challenges | High inter-individual variability; periodontal confounders | Invasiveness, spatial heterogeneity, small sample size | Low biomass, contamination risk, scarce longitudinal data |
| AI/ML Integration | Mature ML/xAI models for CRC and IBD early prediction | Active ML/graph-AI mapping of host-microbe networks | Limited by data sparsity; models under development |

Share and Cite

MDPI and ACS Style

Kim, M.; Gim, D.; Kim, S.; Park, S.; Eom, T.P.; Seol, J.; Yeo, J.; Jo, C.; Seo, G.; Ku, H.; et al. From Dysbiosis to Prediction: AI-Powered Microbiome Insights into IBD and CRC. Gastroenterol. Insights 2025, 16, 34. https://doi.org/10.3390/gastroent16030034

AMA Style

Kim M, Gim D, Kim S, Park S, Eom TP, Seol J, Yeo J, Jo C, Seo G, Ku H, et al. From Dysbiosis to Prediction: AI-Powered Microbiome Insights into IBD and CRC. Gastroenterology Insights. 2025; 16(3):34. https://doi.org/10.3390/gastroent16030034

Chicago/Turabian Style

Kim, Minkwan, Donghyeon Gim, Sunghan Kim, Sungsu Park, Tehyun Phillip Eom, Jaehoon Seol, Junyeong Yeo, Changmin Jo, Gunha Seo, Hyungjune Ku, and et al. 2025. "From Dysbiosis to Prediction: AI-Powered Microbiome Insights into IBD and CRC" Gastroenterology Insights 16, no. 3: 34. https://doi.org/10.3390/gastroent16030034

APA Style

Kim, M., Gim, D., Kim, S., Park, S., Eom, T. P., Seol, J., Yeo, J., Jo, C., Seo, G., Ku, H., & Kim, J. H. (2025). From Dysbiosis to Prediction: AI-Powered Microbiome Insights into IBD and CRC. Gastroenterology Insights, 16(3), 34. https://doi.org/10.3390/gastroent16030034
