Introduction: Early-onset colorectal cancer (EOCRC) is rising rapidly, particularly among Hispanic/Latino (H/L) populations, who face disproportionately poor outcomes. The transforming growth factor-beta (TGF-β) signaling pathway plays a critical role in colorectal cancer (CRC) progression by mediating epithelial-to-mesenchymal transition (EMT), immune evasion, and metastasis. However, integrative analyses linking TGF-β alterations to clinical features remain limited, particularly for diverse populations, which hinders translational research and the development of precision therapies. To address this gap, we developed AI-HOPE-TGFbeta (Artificial Intelligence agent for High-Optimization and Precision Medicine focused on TGF-β), the first conversational artificial intelligence (AI) agent designed to explore TGF-β dysregulation in CRC by integrating harmonized clinical and genomic data via natural language queries. Methods: AI-HOPE-TGFbeta combines a large language model (LLM), Large Language Model Meta AI 3 (LLaMA 3), with a natural language-to-code interpreter and a bioinformatics backend to automate statistical workflows. Tailored for TGF-β pathway analysis, the platform enables real-time cohort stratification and hypothesis testing using harmonized datasets from the cBio Cancer Genomics Portal (cBioPortal). It supports mutation frequency comparisons, odds ratio testing, Kaplan–Meier survival analysis, and subgroup evaluations across race/ethnicity, microsatellite instability (MSI) status, tumor stage, treatment exposure, and age. The platform was validated by replicating findings on SMAD4, TGFBR2, and BMPR1A mutations in EOCRC. Exploratory queries were then conducted to examine novel associations with clinical outcomes in H/L populations. Results: AI-HOPE-TGFbeta successfully recapitulated established associations, including worse survival in SMAD4-mutant EOCRC patients treated with FOLFOX (fluorouracil, leucovorin, and oxaliplatin) (p = 0.0001) and better outcomes in early-stage TGFBR2-mutated CRC patients (p = 0.00001). It revealed potential population-specific enrichment of BMPR1A mutations in H/L patients (OR = 2.63; p = 0.052) and uncovered MSI-specific survival benefits among SMAD4-mutated patients (p = 0.00001). Exploratory analysis showed better outcomes in SMAD2-mutant primary tumors vs. metastatic cases (p = 0.0010) and confirmed the feasibility of disaggregated ethnicity-based queries for TGFBR1 mutations, despite small sample sizes. These findings underscore the platform’s capacity to detect both known and emerging clinical–genomic patterns in CRC. Conclusions: AI-HOPE-TGFbeta introduces a new paradigm in cancer bioinformatics by enabling natural language-driven, real-time integration of genomic and clinical data specific to TGF-β pathway alterations in CRC. The platform democratizes complex analyses, supports disparity-focused investigation, and reveals clinically actionable insights in underserved populations, such as H/L EOCRC patients. As a first-of-its-kind system studying TGF-β, AI-HOPE-TGFbeta holds strong promise for advancing equitable precision oncology and accelerating translational discovery in the CRC TGF-β pathway.
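As an illustration of the odds ratio testing named in the Methods, the following is a minimal Python sketch of how an odds ratio and an approximate 95% confidence interval can be computed from a 2×2 table of mutation status by group. It is not the platform's implementation: the function name `odds_ratio_with_ci`, the use of the Woolf (log-scale Wald) interval, and the cell counts are all assumptions made for the example, and the counts are invented rather than taken from the study.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-scale Wald) interval for a 2x2 table.

    a: mutated cases in group 1      b: wild-type cases in group 1
    c: mutated cases in group 2      d: wild-type cases in group 2
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        # The Woolf interval is undefined when any cell is zero.
        raise ValueError("all four cells must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts for illustration only (NOT data from the study):
# group 1: 10 mutated, 90 wild-type; group 2: 20 mutated, 380 wild-type.
estimate, lower, upper = odds_ratio_with_ci(10, 90, 20, 380)
print(f"OR = {estimate:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

With small subgroups, such as those reported for disaggregated ethnicity queries, an exact method (for example, Fisher's exact test) is generally preferable to this large-sample approximation.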