Advancements in Artificial Intelligence (AI) for Cancer Genomics and Genetics

A special issue of Biomedicines (ISSN 2227-9059). This special issue belongs to the section "Cancer Biology and Oncology".

Deadline for manuscript submissions: closed (31 August 2025) | Viewed by 9406

Special Issue Editor


Guest Editor
Istituto di Biologia e Patologia Molecolari del Consiglio Nazionale delle Ricerche (IBPM-CNR), Dipartimento di Biologia e Biotecnologie, Università Sapienza di Roma, Piazzale Aldo Moro 5, 00185 Rome, Italy
Interests: chromatin structure and function; heterochromatin; Drosophila melanogaster; mitosis and male meiosis; cytokinesis; DNA repair; cancer epigenetics

Special Issue Information

Dear Colleagues,

The progress made through the application of computational approaches to biology and medicine requires broad expertise in the management, processing, analysis, and interpretation of results produced from very heterogeneous data. Interdisciplinary collaborations and resources are key because large amounts of information cannot be handled by single scientists working in their specialized fields. Data need to be collected, reviewed, processed, and analyzed in order to create models aimed at understanding diseases at the molecular level, thereby establishing the basis for personalized medicine. Research on neoplastic transformation will benefit significantly from this progress as well, and the use of Artificial Intelligence (AI) to leverage large datasets is expected to become increasingly important.

Cancer is a complex, multistep disease; its onset is influenced by several factors including, but not limited to, the genetic background of patients, their lifestyle, the environment in which they live, and multiple interactions among these factors. More so than before, cancer characterization cannot be limited to the evaluation of cytological/histological features and the testing of a few biomarkers. An integrated approach and the use of AI appear necessary, involving not only high-throughput data analysis but also its contextualization, in order to identify the best tools for diagnosis, prognosis, and treatment.

We are pleased to invite you to submit your contribution, with the goal of understanding and characterizing human cancer in terms of all the aforementioned aspects. Comprehensive reviews, illustrating the state of the art in the use of AI in cancer genetics/genomics, are equally welcome.

Research areas may include (but are not limited to) the following: (i) the use of AI to collect, filter, and/or analyze large datasets for cancer characterization; (ii) the use of genetics and genomics approaches in cancer; (iii) the response of genes to complex environmental insults; (iv) studies based on cancer transcriptomics; and (v) new approaches for the early diagnosis and personalized treatment of cancer, as well as related topics.

We are looking forward to receiving your contribution.

Dr. Roberto Piergentili
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Biomedicines is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • artificial intelligence
  • cancer
  • genetics
  • epigenetics
  • genomics
  • epigenomics
  • gene-environment interaction
  • cancer diagnosis and prognosis
  • personalized medicine
  • drug discovery

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (7 papers)


Research


17 pages, 1913 KB  
Article
A Machine Learning Framework for Cancer Prognostics: Integrating Temporal and Immune Gene Dynamics via ARIMA-CNN
by Rui-Bin Lin, Linlin Zhou, Yu-Chun Lin, Yu Yu, Hung-Chih Yang and Chen-Wei Yu
Biomedicines 2025, 13(11), 2751; https://doi.org/10.3390/biomedicines13112751 - 11 Nov 2025
Viewed by 378
Abstract
Background: Hepatocellular carcinoma remains a global health challenge with high mortality rates. The tumor immune microenvironment significantly impacts disease progression and survival. However, traditional analyses predominantly focus on single immune genes, overlooking the critical interplay among multiple immune gene signatures. Our study explores the prognostic significance of chemokine (C-C motif) ligand 5 (CCL5) expression and associated immune genes through an innovative combination of Autoregressive Integrated Moving Average (ARIMA) and Convolutional Neural Network (CNN) models. Methods: A time series dataset of CCL5 expression, comprising 230 liver cancer patients, was analyzed using an ARIMA model to capture its temporal dynamics. The residuals from the ARIMA model, combined with immune gene expression data, were utilized as input features for a CNN to predict survival outcomes. Survival analyses were conducted using the Cox proportional hazards model and Kaplan–Meier curves. Furthermore, the ARIMA-CNN framework’s results were systematically compared with traditional median-based stratification methods, establishing a benchmark for evaluating model efficacy and highlighting the enhanced predictive power of the proposed integrative approach. Results: CNN-extracted features demonstrated superior prognostic capability compared to traditional median-split analyses of single-gene datasets. Features derived from CD8+ T cells and effector T cells achieved a hazard ratio (HR) of 0.7324 (p = 0.0008) with a statistically significant log-rank p-value (0.0131), highlighting their critical role in anti-tumor immunity. Hierarchical clustering of immune genes further identified distinct survival associations. Notably, a cluster comprising B cells, Th2 cells, T cells, and NK cells demonstrated a moderate protective effect (HR: 0.8714, p = 0.1093) with a significant log-rank p-value (0.0233). 
Conversely, granulocytes, Tregs, macrophages, and myeloid-derived suppressor cells showed no significant survival association, emphasizing the complex regulatory landscape within the tumor immune microenvironment. Conclusions: Our study provides the first ARIMA-CNN framework for modeling gene expression and survival analysis, marking a significant innovation in integrating temporal dynamics and machine learning for biological data interpretation. This model offers deeper insights into the tumor immune microenvironment and underscores the potential for advancing precision immunotherapy strategies and identifying novel biomarkers, contributing significantly to innovative cancer management solutions. Full article

23 pages, 2002 KB  
Article
Precision Oncology Through Dialogue: AI-HOPE-RTK-RAS Integrates Clinical and Genomic Insights into RTK-RAS Alterations in Colorectal Cancer
by Ei-Wen Yang, Brigette Waldrup and Enrique Velazquez-Villarreal
Biomedicines 2025, 13(8), 1835; https://doi.org/10.3390/biomedicines13081835 - 28 Jul 2025
Cited by 2 | Viewed by 1423
Abstract
Background/Objectives: The RTK-RAS signaling cascade is a central axis in colorectal cancer (CRC) pathogenesis, governing cellular proliferation, survival, and therapeutic resistance. Somatic alterations in key pathway genes—including KRAS, NRAS, BRAF, and EGFR—are pivotal to clinical decision-making in precision oncology. However, the integration of these genomic events with clinical and demographic data remains hindered by fragmented resources and a lack of accessible analytical frameworks. To address this challenge, we developed AI-HOPE-RTK-RAS, a domain-specialized conversational artificial intelligence (AI) system designed to enable natural language-based, integrative analysis of RTK-RAS pathway alterations in CRC. Methods: AI-HOPE-RTK-RAS employs a modular architecture combining large language models (LLMs), a natural language-to-code translation engine, and a backend analytics pipeline operating on harmonized multi-dimensional datasets from cBioPortal. Unlike general-purpose AI platforms, this system is purpose-built for real-time exploration of RTK-RAS biology within CRC cohorts. The platform supports mutation frequency profiling, odds ratio testing, survival modeling, and stratified analyses across clinical, genomic, and demographic parameters. Validation included reproduction of known mutation trends and exploratory evaluation of co-alterations, therapy response, and ancestry-specific mutation patterns. Results: AI-HOPE-RTK-RAS enabled rapid, dialogue-driven interrogation of CRC datasets, confirming established patterns and revealing novel associations with translational relevance. Among early-onset CRC (EOCRC) patients, the prevalence of RTK-RAS alterations was significantly lower compared to late-onset disease (67.97% vs. 79.9%; OR = 0.534, p = 0.014), suggesting the involvement of alternative oncogenic drivers. 
In KRAS-mutant patients receiving Bevacizumab, early-stage disease (Stages I–III) was associated with superior overall survival relative to Stage IV (p = 0.0004). In contrast, BRAF-mutant tumors with microsatellite-stable (MSS) status displayed poorer prognosis despite higher chemotherapy exposure (OR = 7.226, p < 0.001; p = 0.0000). Among EOCRC patients treated with FOLFOX, RTK-RAS alterations were linked to worse outcomes (p = 0.0262). The system also identified ancestry-enriched noncanonical mutations (including CBL, MAPK3, and NF1), with NF1 mutations significantly associated with improved prognosis (p = 1 × 10⁻⁵). Conclusions: AI-HOPE-RTK-RAS exemplifies a new class of conversational AI platforms tailored to precision oncology, enabling integrative, real-time analysis of clinically and biologically complex questions. Its ability to uncover both canonical and ancestry-specific patterns in RTK-RAS dysregulation, especially in EOCRC and populations with disproportionate health burdens, underscores its utility in advancing equitable, personalized cancer care. This work demonstrates the translational potential of domain-optimized AI tools to accelerate biomarker discovery, support therapeutic stratification, and democratize access to multi-omic analysis. Full article
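As a reading aid for the abstract above, the reported odds ratio for RTK-RAS alterations in early-onset versus late-onset CRC can be checked from the two quoted prevalences alone. The sketch below is illustrative only: the function names `odds` and `odds_ratio` are hypothetical and are not part of AI-HOPE-RTK-RAS, and because the abstract gives percentages rather than patient counts, the p-value cannot be reproduced this way.

```python
def odds(proportion: float) -> float:
    """Convert a proportion strictly between 0 and 1 into odds."""
    if not 0.0 < proportion < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    return proportion / (1.0 - proportion)


def odds_ratio(p_group: float, p_reference: float) -> float:
    """Odds of the event in one group divided by the odds in the reference group."""
    return odds(p_group) / odds(p_reference)


# Prevalences of RTK-RAS alterations quoted in the abstract (as proportions).
early_onset = 0.6797  # 67.97% in early-onset CRC
late_onset = 0.799    # 79.9% in late-onset CRC

print(round(odds_ratio(early_onset, late_onset), 3))  # prints 0.534
```

The result agrees with the OR = 0.534 stated in the abstract. An odds ratio below 1 indicates that alterations are less common in the early-onset group than in the late-onset group.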

16 pages, 483 KB  
Article
Learning to Train and to Explain a Deep Survival Model with Large-Scale Ovarian Cancer Transcriptomic Data
by Elena Spirina Menand, Manon De Vries-Brilland, Leslie Tessier, Jonathan Dauvé, Mario Campone, Véronique Verrièle, Nisrine Jrad, Jean-Marie Marion, Pierre Chauvet, Christophe Passot and Alain Morel
Biomedicines 2024, 12(12), 2881; https://doi.org/10.3390/biomedicines12122881 - 18 Dec 2024
Viewed by 1702
Abstract
Background/Objectives: Ovarian cancer is a complex disease with poor outcomes that affects women worldwide. The lack of successful therapeutic options for this malignancy has led to the need to identify novel biomarkers for patient stratification. Here, we aim to develop the outcome predictors based on the gene expression data as they may serve to identify categories of patients who are more likely to respond to certain therapies. Methods: We used The Cancer Genome Atlas (TCGA) ovarian cancer transcriptomic data from 372 patients and approximately 16,600 genes to train and evaluate the deep learning survival models. In addition, we collected an in-house validation dataset of 12 patients to assess the performance of the trained survival models for their direct use in clinical practice. Despite deceptive generalization capabilities, we demonstrated how our model can be interpreted to uncover biological processes associated with survival. We calculated the contributions of the input genes to the output of the best trained model and derived the corresponding molecular pathways. Results: These pathways allowed us to stratify the TCGA patients into high-risk and low-risk groups (p-value 0.025). We validated the stratification ability of the identified pathways on the in-house dataset consisting of 12 patients (p-value 0.229) and on the external clinical and molecular dataset consisting of 274 patients (p-value 0.006). Conclusions: The deep learning-based models for survival prediction with RNA-seq data could be used to detect and interpret the gene-sets associated with survival in ovarian cancer patients and open a new avenue for future research. Full article

16 pages, 2665 KB  
Article
Deep Multiple Instance Learning Model to Predict Outcome of Pancreatic Cancer Following Surgery
by Caroline Truntzer, Dina Ouahbi, Titouan Huppé, David Rageot, Alis Ilie, Chloe Molimard, Françoise Beltjens, Anthony Bergeron, Angelique Vienot, Christophe Borg, Franck Monnien, Frédéric Bibeau, Valentin Derangère and François Ghiringhelli
Biomedicines 2024, 12(12), 2754; https://doi.org/10.3390/biomedicines12122754 - 2 Dec 2024
Cited by 1 | Viewed by 1868
Abstract
Background/Objectives: Pancreatic ductal adenocarcinoma (PDAC) is a cancer with very poor prognosis despite early surgical management. To date, only clinical variables are used to predict outcome for decision-making about adjuvant therapy. We sought to generate a deep learning approach based on hematoxylin and eosin (H&E) or hematoxylin, eosin and saffron (HES) whole slides to predict patients’ outcome, compare these new entities with known molecular subtypes and question their biological significance. Methods: We used as a training set a retrospective private cohort of 206 patients treated by surgery for PDAC and a validation cohort of 166 non-metastatic patients from The Cancer Genome Atlas (TCGA) PDAC project. We estimated a multi-instance learning survival model to predict relapse in the training set and evaluated its performance in the validation set. RNAseq and exome data from the TCGA PDAC database were used to describe the transcriptomic and genomic features associated with deep learning classification. Results: Based on the estimation of an attention-based multi-instance learning survival model, we identified two groups of patients with a distinct prognosis. There was a significant difference in progression-free survival (PFS) between these two groups in the training set (hazard ratio HR = 0.72 [0.54;0.96]; p = 0.03) and in the validation set (HR = 0.63 [0.42;0.94]; p = 0.01). Transcriptomic and genomic features revealed that the poor prognosis group was associated with a squamous phenotype. Conclusions: Our study demonstrates that deep learning could be used to predict PDAC prognosis and offer assistance in better choosing adjuvant treatment. Full article

25 pages, 2816 KB  
Article
GastricAITool: A Clinical Decision Support Tool for the Diagnosis and Prognosis of Gastric Cancer
by Rocío Aznar-Gimeno, María Asunción García-González, Rubén Muñoz-Sierra, Patricia Carrera-Lasfuentes, María de la Vega Rodrigálvarez-Chamarro, Carlos González-Muñoz, Enrique Meléndez-Estrada, Ángel Lanas and Rafael del Hoyo-Alonso
Biomedicines 2024, 12(9), 2162; https://doi.org/10.3390/biomedicines12092162 - 23 Sep 2024
Cited by 5 | Viewed by 2736
Abstract
Background/Objective: Gastric cancer (GC) is a complex disease representing a significant global health concern. Advanced tools for the early diagnosis and prediction of adverse outcomes are crucial. In this context, artificial intelligence (AI) plays a fundamental role. The aim of this work was to develop a diagnostic and prognostic tool for GC, providing support to clinicians in critical decision-making and enabling personalised strategies. Methods: Different machine learning and deep learning techniques were explored to build diagnostic and prognostic models, ensuring model interpretability and transparency through explainable AI methods. These models were developed and cross-validated using data from 590 Spanish Caucasian patients with primary GC and 633 cancer-free individuals. Up to 261 variables were analysed, including demographic, environmental, clinical, tumoral, and genetic data. Variables such as Helicobacter pylori infection, tobacco use, family history of GC, TNM staging, metastasis, tumour location, treatment received, gender, age, and genetic factors (single nucleotide polymorphisms) were selected as inputs due to their association with the risk and progression of the disease. Results: The XGBoost algorithm (version 1.7.4) achieved the best performance for diagnosis, with an AUC value of 0.68 using 5-fold cross-validation. As for prognosis, the Random Survival Forest algorithm achieved a C-index of 0.77. Of interest, the incorporation of genetic data into the clinical–demographics models significantly increased discriminatory ability in both diagnostic and prognostic models. Conclusions: This article presents GastricAITool, a simple and intuitive decision support tool for the diagnosis and prognosis of GC. Full article

Review


16 pages, 766 KB  
Review
Stromal COL11A1: Mechanisms of Stroma-Driven Multidrug Resistance in Breast Cancer and Biomarker Potential
by Andreea Onofrei (Popa), Felicia Mihailuta, Daniela Mihalache, Cristina Chelmu Vodă, Sanda Jurja, Sorin Deacu and Mihaela Cezarina Mehedinți
Biomedicines 2025, 13(12), 2905; https://doi.org/10.3390/biomedicines13122905 - 27 Nov 2025
Abstract
Background/Objectives: Therapeutic resistance remains a major obstacle in breast cancer management, particularly among estrogen receptor-positive (ERα+) tumors that initially respond to endocrine therapy such as tamoxifen. Type XI collagen (COL11A1), a minor fibrillar collagen secreted by cancer-associated fibroblasts, has recently emerged as a stromal biomarker linked to tumor progression, immune modulation, and poor prognosis in several solid malignancies. Methods: We conducted a narrative review of the literature indexed in PubMed, Scopus, and Web of Science between 2011 and 2025, including original research, reviews, and clinical studies addressing COL11A1 expression and function in breast cancer. Mechanistic studies in other cancer types (ovarian, pancreatic, lung) were also evaluated when relevant to breast cancer biology. Results: Across multiple cancer types, COL11A1 overexpression correlates with stromal remodeling, epithelial–mesenchymal transition, and resistance to both hormone therapy and chemotherapy. In breast cancer, emerging data suggest a potential prognostic role and possible involvement in shaping the immune microenvironment. Nevertheless, most evidence derives from retrospective or preclinical studies, and clinical validation remains limited. Conclusions: COL11A1 represents a promising, though still exploratory, biomarker of therapeutic resistance and immune modulation in breast cancer. Future prospective and subtype-specific studies are needed to clarify its diagnostic and therapeutic value and to determine whether its inclusion in immunohistochemical panels could enhance patient stratification and guide personalized treatment. Full article

Other


31 pages, 4232 KB  
Systematic Review
Artificial Intelligence-Driven SELEX Design of Aptamer Panels for Urinary Multi-Biomarker Detection in Prostate Cancer: A Systematic and Bibliometric Review
by Ayoub Slalmi, Nabila Rabbah, Ilham Battas, Ikram Debbarh, Hicham Medromi and Abdelmjid Abourriche
Biomedicines 2025, 13(12), 2877; https://doi.org/10.3390/biomedicines13122877 - 25 Nov 2025
Abstract
Background/Objectives: The limited specificity of prostate-specific antigen (PSA) drives unnecessary biopsies in prostate cancer (PCa). Urinary extracellular vesicles (uEVs) provide a non-invasive reservoir of tumor-derived nucleic acids and proteins. Aptamers selected by SELEX enable highly specific capture, and artificial intelligence (AI) can accelerate their optimization. This systematic review evaluated AI-assisted SELEX for urine-derived and exosome-enriched aptamer panels in PCa detection. Methods: Systematic searches of PubMed, Scopus, and Web of Science (1 January 2010–24 August 2025; no language restrictions) followed PRISMA 2020 and PRISMA-S. The protocol is registered on OSF (osf.io/b2y7u). After deduplication, 1348 records were screened; 129 studies met the eligibility criteria, including 34 (26.4%) integrating AI within SELEX or downstream refinement. Inclusion required at least one quantitative metric (dissociation constant Kd, SELEX cycles, limit of detection [LoD], sensitivity, specificity, or AUC). Risk of bias was appraised with QUADAS-2 (diagnostic accuracy studies) and PROBAST (prediction/machine learning models). Results: AI-assisted SELEX workflows reduced laboratory enrichment cycles from conventional 12–15 to 5–7 (≈40–55% relative reduction) and reported Kd values spanning low picomolar to upper nanomolar ranges; heterogeneity and inconsistent comparators precluded pooled estimates. Multiplex urinary panels (e.g., PCA3, TMPRSS2:ERG, miR-21, miR-375, EN2) yielded single-study AUCs between 0.70 and 0.92 with sensitivities up to 95% and specificities up to 88%; incomplete 2 × 2 contingency reporting prevented bivariate meta-analysis. LoD reporting was sparse and non-standardized despite several ultralow claims (attomolar to low femtomolar) on nanomaterial-enhanced platforms. Pre-analytical variability and absent threshold prespecification contributed to high or unclear risk (QUADAS-2). 
PROBAST frequently indicated high risk in participants and analysis domains. Across the included studies, lower Kd and reduced LoD improved analytical detectability; however, clinical specificity and AUC were predominantly shaped by pre-analytical control (matrix; post-DRE vs. spontaneous urine) and prespecified thresholds, so engineering gains did not consistently translate into higher diagnostic accuracy. Conclusions: AI-assisted SELEX is a promising strategy for accelerating high-affinity aptamer discovery and assembling multiplex urinary panels for PCa, but current evidence is early phase, heterogeneous, and largely single-center. Priorities include standardized uEV processing, complete 2 × 2 diagnostic reporting, multicenter external validation, calibration and decision impact analyses, and harmonized LoD and Kd reporting frameworks. Full article
