Review

Innovative Approaches to EMT-Related Biomarker Identification in Breast Cancer: Multi-Omics and Machine Learning Methods

by Ghazaleh Khalili-Tanha 1 and Alireza Shoari 2,*
1 Department of Medical Genetics and Molecular Medicine, School of Medicine, Mashhad University of Medical Sciences, Mashhad 91388-13944, Iran
2 Department of Cancer Biology, Mayo Clinic Comprehensive Cancer Center, Jacksonville, FL 32224, USA
* Author to whom correspondence should be addressed.
BioTech 2025, 14(3), 75; https://doi.org/10.3390/biotech14030075
Submission received: 14 August 2025 / Revised: 16 September 2025 / Accepted: 18 September 2025 / Published: 22 September 2025
(This article belongs to the Section Medical Biotechnology)

Abstract

Breast cancer is the most prevalent cancer among women and is challenging to diagnose and treat due to its diverse subtypes and stages. Precision medicine aims to improve early detection, prognosis, and treatment planning by identifying new clinical biomarkers. The review emphasizes the importance of using cutting-edge technology and artificial intelligence (AI) to identify new biomarkers associated with epithelial–mesenchymal transition (EMT). During EMT, epithelial cells transform into a mesenchymal state, a process driven by genetic and epigenetic alterations that facilitate cancer progression. The review discusses how statistical analysis and machine learning methods applied to multi-omics data facilitate the discovery of novel EMT-related biomarkers, thereby advancing therapeutic strategies. This conclusion is supported by numerous clinical and preclinical studies on breast cancer.
Key Contribution: This review highlights how integrating multi-omics data with advanced machine learning enables the discovery of reliable EMT-related biomarkers in breast cancer. It shows how combining genomics, transcriptomics, proteomics, and metabolomics with artificial intelligence can reveal diagnostic, prognostic, and predictive markers that support early detection, patient stratification, and development of targeted therapies.

1. Introduction

Breast cancer is one of the most common cancers affecting women worldwide [1]. It arises from the breast tissue, typically from the lining of the milk ducts or lobules, and while it predominantly affects women, men can also develop breast cancer [2]. The incidence rate of breast cancer varies globally but has been increasing over the years due to factors such as lifestyle changes, environmental influences, and improved detection methods [3]. In the United States, for instance, the American Cancer Society estimated about 310,720 new cases of invasive breast cancer in women in 2024 [4]. Breast cancer mortality has been declining in many developed countries, largely due to early detection and improved treatment methods; despite this, it remains the second leading cause of cancer death among women [5].
Breast cancer can be classified into several types, including (i) ductal carcinoma in situ (DCIS), a non-invasive cancer in which abnormal cells are found in the lining of a breast duct but have not spread outside the duct; (ii) invasive ductal carcinoma (IDC), the most common type, in which cancer cells spread beyond the ducts into other parts of the breast tissue; (iii) invasive lobular carcinoma (ILC), in which cancer cells spread from the lobules to surrounding breast tissues [6]; (iv) triple-negative breast cancer (TNBC), which lacks estrogen, progesterone, and HER2 receptors, making it harder to treat [7]; and (v) HER2-positive breast cancer, which is characterized by overexpression of the HER2 protein, a driver of cancer cell growth [8].
Clinical biomarkers help diagnose breast cancer, predict prognosis, and monitor treatment response. Estrogen receptor (ER) and progesterone receptor (PR) positivity indicate that the cancer cells may receive signals from these hormones, promoting their growth [9]. Overexpression of the HER2 protein can lead to more aggressive cancer, and HER2-positive cancers may benefit from targeted therapies like trastuzumab (Herceptin) [10]. Higher Ki-67 expression indicates a higher growth rate of the cancer cells [11]. Mutations in the BRCA1 and BRCA2 genes significantly increase the risk of developing breast cancer [12]. Expression of PD-L1 can be a biomarker for response to immunotherapy in certain breast cancer subtypes [13,14].
Epithelial–mesenchymal transition (EMT) is a cellular program in which epithelial cells lose polarity and adhesion properties, acquiring mesenchymal traits that enhance motility. While essential in development and wound repair, EMT dysregulation contributes to fibrosis and, importantly, cancer progression. In breast cancer, type 3 EMT (the oncogenic EMT of carcinoma cells, as distinct from type 1 developmental EMT and type 2 wound-healing/fibrotic EMT) arises from genetic and epigenetic alterations and from tumor microenvironmental cues, including hypoxia, growth factors, and inflammatory cytokines, that collectively drive invasion and metastasis [15,16].
This review discusses modern methods for detecting EMT biomarkers in breast cancer. To offer a thorough grasp of EMT processes, it explores the integration of multi-omics technologies, including genomics, transcriptomics, proteomics, and metabolomics. It also highlights the application of advanced machine learning algorithms to analyze complex datasets, identify novel biomarkers, and predict disease progression. By combining these approaches, the review aims to offer insights into more accurate diagnostic tools and targeted therapies for breast cancer.
Breast cancer is a highly heterogeneous disease encompassing distinct molecular subtypes—luminal A, luminal B, HER2-enriched, triple-negative, and basal-like—each defined by unique gene-expression profiles and showing different therapeutic responses and prognoses [17]. Recognizing these subtypes is critical because they influence treatment selection, resistance mechanisms, and the relevance of EMT-associated biomarkers discussed in this review. To maintain focus on these clinically meaningful categories, we streamlined demographic statistics and retained only those essential for understanding molecular heterogeneity and its impact on EMT-related multi-omics analyses.

2. Epithelial–Mesenchymal Transition (EMT)

EMT is a biological process where epithelial cells undergo significant morphological changes, adopting a mesenchymal phenotype. This transformation involves a shift from a highly organized, polarized, and adhesive cell structure to a more motile and invasive phenotype [18]. EMT is crucial for various physiological processes, including embryogenesis, wound healing, and tissue regeneration; however, it also plays a significant role in pathological conditions, most notably cancer progression [19]. Characterized by the downregulation of epithelial markers (such as E-cadherin) and the upregulation of mesenchymal markers (such as N-cadherin and vimentin), EMT involves extensive changes in cell morphology, signaling pathways, and gene expression profiles [20]. EMT is driven by several transcription factors, including Snail, Slug, Twist, and Zeb1/2, which orchestrate the reprogramming of the epithelial phenotype [21]. The key characteristics of EMT are therefore the loss of epithelial cell–cell adhesion molecules such as E-cadherin; the gain of mesenchymal markers such as N-cadherin and vimentin; reorganization of the actin cytoskeleton, which changes cell shape and increases motility; and the acquired ability to invade extracellular matrices and migrate [22].
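The marker pattern described above (E-cadherin down, N-cadherin and vimentin up) is often condensed into a single per-sample score. The following is a minimal sketch of one such score, not a method taken from the cited studies: it assumes the score is the mean z-score of the mesenchymal markers minus the mean z-score of the epithelial markers, and the function names and expression values are hypothetical.

```python
from statistics import mean, pstdev

EPITHELIAL = ["CDH1"]            # E-cadherin
MESENCHYMAL = ["CDH2", "VIM"]    # N-cadherin, vimentin


def zscores(values):
    """Standardize values to mean 0 and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def emt_scores(expression):
    """expression: {gene: [value per sample]} -> one EMT score per sample.

    Assumed definition: mean mesenchymal z-score minus mean epithelial z-score,
    so higher values indicate a more mesenchymal-like sample.
    """
    z = {g: zscores(expression[g]) for g in EPITHELIAL + MESENCHYMAL}
    n_samples = len(next(iter(z.values())))
    scores = []
    for i in range(n_samples):
        mes = mean(z[g][i] for g in MESENCHYMAL)
        epi = mean(z[g][i] for g in EPITHELIAL)
        scores.append(mes - epi)
    return scores


# Hypothetical log-expression values for four samples (illustrative only).
toy = {
    "CDH1": [9.0, 8.5, 3.0, 2.5],
    "CDH2": [2.0, 2.5, 7.5, 8.0],
    "VIM": [3.0, 3.5, 9.0, 9.5],
}
scores = emt_scores(toy)
print([round(s, 2) for s in scores])
```

In this toy input the first two samples receive negative (epithelial-like) scores and the last two positive (mesenchymal-like) scores; published EMT scores typically use larger, curated gene sets.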
EMT is implicated in several stages of cancer development and progression. Disruption of epithelial cell junctions and loss of polarity can lead to increased cellular proliferation and survival, contributing to tumor initiation [23]. EMT endows cancer cells with migratory and invasive capabilities, enabling them to breach the basement membrane and invade surrounding tissues and facilitating the entry of cancer cells into the bloodstream or lymphatic system, aiding in the dissemination to distant organs [24]. EMT-induced cells often exhibit resistance to apoptosis, allowing them to survive in the hostile microenvironment of distant metastatic sites; additionally, EMT is associated with increased resistance to conventional chemotherapy and targeted therapies, complicating treatment strategies [25]. Key transcription factors such as Snail, Slug, Twist, and Zeb are upregulated during EMT, driving the repression of epithelial markers and the induction of mesenchymal traits, and various signaling pathways, including TGF-β, Wnt, Notch, and Hedgehog, play critical roles in regulating EMT [26,27].
Proteases drive EMT by degrading the extracellular matrix (ECM), activating signaling pathways, cleaving cell-adhesion molecules, and altering cell-surface receptors to enhance cellular plasticity [28,29]. Matrix metalloproteinases (MMPs) degrade ECM components, a process vital for normal tissue remodeling but also a key driver of cancer metastasis [30]. In metastasis, MMP activity is closely tied to EMT, through which epithelial cells acquire mesenchymal properties that enhance their migratory and invasive abilities [31]. Specific MMPs regulate EMT through various mechanisms: by degrading ECM components, they create pathways for cancer cells to invade surrounding tissues and disseminate to distant sites [32], and they can release bioactive growth factors sequestered in the ECM, such as TGF-β, which further promote EMT and cancer progression [33]. MMP-2 degrades type IV collagen, a major component of basement membranes, facilitating cell invasion and migration, and its activity is often upregulated during EMT, promoting the breakdown of ECM and enabling tumor cells to penetrate surrounding tissues [34]. Like MMP-2, MMP-9 degrades type IV collagen and is involved in ECM remodeling, and its activity is linked to increased cell motility and invasion during EMT [35]. MMP-3 can degrade various ECM components and activate other MMPs, such as MMP-9, and it is implicated in promoting EMT by inducing the expression of mesenchymal markers (e.g., vimentin) and repressing epithelial markers (e.g., E-cadherin) [36]. Furthermore, MMP-7 is involved in the degradation of ECM components and the release of growth factors that can promote EMT, and it plays a role in the cleavage of E-cadherin, leading to the disruption of cell–cell adhesion and enhanced cell motility [37].
Moreover, MMP-14 is a membrane-type MMP that activates pro-MMP-2 and degrades ECM components directly, and it is critical for cell migration and invasion during EMT [38]. These mechanistic insights align with multi-omics and machine-learning studies that highlight MMP-related EMT signatures in breast cancer. For example, XGBoost models have identified MMP3, MMP9, and MT1-MMP (MMP14) transcripts and proteomic features as predictors of invasion and poor prognosis.
Understanding the mechanisms of EMT has significant therapeutic implications, such as developing inhibitors that target key EMT transcription factors or signaling pathways to prevent or reverse EMT in cancer cells and strategies to overcome EMT-associated drug resistance by combining EMT inhibitors with conventional therapies [21]. EMT markers can serve as potential biomarkers for early detection of metastasis and prognosis prediction in breast cancer patients [39]. EMT is a critical biological process that occurs in both normal physiological and pathological situations. In the context of cancer, EMT plays a pivotal role in tumor initiation, invasion, metastasis, and resistance to therapy. Targeting EMT and its regulatory mechanisms holds promise for improving cancer treatment and patient outcomes (Figure 1).

3. Biomarkers and Analyzing Multi-Omics Data

Identifying novel biomarkers, particularly those linked to EMT in breast cancer, is crucial for improving diagnosis, prognosis, and treatment personalization [40]. Prognostic biomarkers help stratify patients based on risk profiles and guide more tailored therapies [41]. In breast cancer, EMT-associated biomarkers are especially valuable because they capture tumor aggressiveness, metastatic potential, and therapy resistance [42].
Multi-omics approaches provide a more complete view of EMT regulation in breast cancer. Genomic data reveal mutations and alterations associated with EMT [43], while transcriptomic profiles capture the expression shifts in EMT-related genes [44]. Proteomic analysis identifies changes in protein abundance and signaling cascades driving EMT [45], and metabolomics reflects metabolic reprogramming that accompanies the transition to a mesenchymal state [46].
Integrating these data layers enables the identification of robust EMT biomarkers and regulatory networks underlying breast cancer progression [39]. Compared with generic big data approaches, omics-driven strategies in breast cancer highlight clinically relevant EMT biomarkers. For instance, multi-omics integration has revealed EMT signatures linked to poor survival and resistance to chemotherapy, offering targets for prognostic tools and therapeutic intervention [47].
Traditional statistical methods such as t-tests, ANOVA, and regression analysis remain useful for analyzing omics data [48], and dimensionality reduction approaches like PCA help identify patterns but may oversimplify complex EMT-related biology [49]. Hierarchical clustering can group EMT gene or protein signatures [50], while pathway enrichment methods such as GSEA and KEGG highlight dysregulated EMT pathways [51]. However, these approaches often struggle with the high dimensionality and cross-omics integration challenges of EMT datasets [52].
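As a minimal illustration of the per-feature testing mentioned above, the sketch below computes Welch's t statistic and its approximate degrees of freedom for one gene measured in two groups. The function name and the expression values are hypothetical; converting the statistic to a p-value requires the t distribution (available in statistical libraries) and is omitted here, as is the multiple-testing correction that genome-wide screens need.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    va, vb = variance(group_a), variance(group_b)   # sample variances
    se2 = va / na + vb / nb                          # squared standard error
    t = (mean(group_a) - mean(group_b)) / sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Hypothetical vimentin log-expression in two groups (illustrative only).
mesenchymal_like = [8.1, 7.9, 8.4, 8.0]
epithelial_like = [5.2, 5.6, 4.9, 5.3]

t_stat, dof = welch_t(mesenchymal_like, epithelial_like)
print(f"t = {t_stat:.2f}, df = {dof:.2f}")
```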
Machine learning methods are increasingly applied to overcome these challenges in breast cancer EMT research. Supervised learning (e.g., support vector machines, random forests) and unsupervised clustering approaches identify EMT-related signatures from high-dimensional omics data [53]. Deep learning architectures, such as convolutional neural networks (CNN), extract multi-level EMT features [54], and AI-based integration frameworks like MOFA enable cross-omics analysis of EMT regulation [55]. Natural language processing (NLP) further accelerates biomarker discovery by mining EMT-related associations from literature [56].
AI methods not only enhance EMT biomarker discovery but also improve prediction of clinical outcomes in breast cancer [57]. By capturing non-linear, multi-layered interactions across omics levels, AI models provide a scalable and powerful framework for identifying EMT-related diagnostic, prognostic, and therapeutic biomarkers.

4. Integrative Multi-Omics Analysis with Artificial Intelligence

Finding novel biomarkers in cancer is critical for early diagnosis, prognosis, and treatment personalization. The integration of big data and artificial intelligence (AI) significantly enhances this process and supports the development of personalized treatment plans based on individual biomarker profiles. Treatments can thus be tailored to the specific genetic and molecular makeup of a patient’s cancer, potentially improving efficacy and reducing side effects. As shown in Figure 2, the process involves five steps of AI application in analyzing biological data.
Data Collection: This initial step involves gathering multi-omics data, such as genomics, epigenomics, transcriptomics, metabolomics, proteomics, single-cell multi-omics, spatial transcriptomics, etc., from various biological databases [58]. Analyzing each type of omics data provides valuable insights into the biological and molecular pathways involved in disease progression. Several databases provide this biological data, with Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) being the most commonly used. These databases offer comprehensive genomic, transcriptomic, epigenomic, and proteomic data across different types of cancer [59,60]. Additionally, some databases focus on specific cancers, such as Breast Cancer Gene Expression Miner (bc-GenExMiner), which specifically provides gene expression data related to breast cancer obtained through RNA sequencing methods and DNA microarrays [61]. METABRIC, the Molecular Taxonomy of Breast Cancer International Consortium, is a comprehensive initiative focused on understanding breast cancer at the molecular level. This extensive study provides detailed genomic, transcriptomic, and clinical data from a large cohort of breast cancer patients. The project was generously funded by Cancer Research UK, the Canadian Breast Cancer Foundation BC/Yukon, and the British Columbia Cancer Foundation [62].
Pre-processing: This essential step for raw data includes data filtering, batch effect removal, systematic normalization, and quality checks. These steps are crucial because they significantly influence the outcomes of integrative analyses. In particular, data filtering is vital for reducing noise and the number of features, which is important given the computational demands of most integrative methods [63]. Regularization techniques such as LASSO, Ridge regression, and Elastic Net are also used at this stage to improve machine learning models: LASSO and Elastic Net can select relevant features by shrinking some coefficients to exactly zero, whereas Ridge regression shrinks coefficients without eliminating them. The pre-processing of various types of multi-omics data requires specialized tools and packages [64].
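The filtering and normalization steps can be sketched on a toy expression matrix as follows. This is a simplified illustration under stated assumptions, not a recommended pipeline: the log2(x + 1) transform, the variance threshold of 0.1, the function names, and the counts are all hypothetical choices, and batch-effect correction is not shown.

```python
from math import log2
from statistics import mean, pstdev, pvariance


def log_transform(matrix):
    """Apply log2(x + 1) to every value; matrix is {gene: [value per sample]}."""
    return {g: [log2(v + 1.0) for v in vals] for g, vals in matrix.items()}


def filter_low_variance(matrix, min_var):
    """Drop genes whose variance across samples is below min_var (noise reduction)."""
    return {g: vals for g, vals in matrix.items() if pvariance(vals) >= min_var}


def standardize(matrix):
    """Z-score each gene across samples (assumes non-zero variance after filtering)."""
    out = {}
    for g, vals in matrix.items():
        mu, sd = mean(vals), pstdev(vals)
        out[g] = [(v - mu) / sd for v in vals]
    return out


# Hypothetical raw counts for three genes in four samples (illustrative only).
counts = {
    "VIM": [10, 200, 15, 400],
    "CDH1": [500, 20, 450, 10],
    "ACTB": [1000, 1010, 990, 1005],   # nearly constant, so it is filtered out
}
processed = standardize(filter_low_variance(log_transform(counts), min_var=0.1))
print(sorted(processed))
```

When these steps feed a predictive model, the filter threshold and scaling parameters should be learned from training data only, as discussed under validation.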
Feature selection and extraction: The third step reduces the number of features in a dataset. Feature selection picks a subset of the original features, removing those that introduce noise, irrelevance, or redundancy, and thereby improves the accuracy of sample classification, whereas feature extraction transforms the original features into a new set [65,66]. Common extraction techniques include Independent Component Analysis (ICA) and Principal Component Analysis (PCA). The benefits of both approaches include reduced dimensionality, a simpler and more effective model, and improved interpretability [67,68].
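To make the idea of feature extraction concrete, the sketch below derives the first principal component of a small expression matrix by power iteration on the covariance matrix. It is a teaching example only: real analyses would use a numerical library, the function name is hypothetical, the starting vector is an arbitrary choice, and the data are invented.

```python
from math import sqrt


def first_principal_component(X, n_iter=100):
    """Return (loadings, scores) of the first principal component via power iteration.

    X is a list of samples, each a list of feature values.
    """
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in X]
    # Sample covariance matrix (p x p).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    v = [1.0] + [0.0] * (p - 1)          # arbitrary starting direction
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centered]
    return v, scores


# Hypothetical log-expression rows: columns are CDH1, CDH2, VIM (illustrative only).
samples = [
    [9.0, 2.0, 3.0],
    [8.5, 2.5, 3.5],
    [3.0, 7.5, 9.0],
    [2.5, 8.0, 9.5],
]
loadings, pc1_scores = first_principal_component(samples)
print([round(x, 2) for x in loadings])
```

In this toy input the epithelial marker loads with the opposite sign to the two mesenchymal markers, so the single extracted feature separates the two sample groups.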
Model Training: The process of model training leverages both supervised and unsupervised learning techniques to teach the AI system [69]. Supervised learning uses labeled data to help the AI learn the relationships between input features (biomarkers) and output labels (conditions or outcomes). Common supervised ML algorithms for classification include Naïve Bayes (NB), Random Forest (RF), Support Vector Machines (SVM), and K-Nearest Neighbors algorithm (k-NN) [70,71,72]. Unsupervised learning, on the other hand, helps the AI find hidden patterns or intrinsic structures in the data without explicit labels. By applying these methods, the AI becomes capable of recognizing and predicting the connections between biomarkers and various medical conditions. Unsupervised learning methods like clustering and dimensionality reduction are frequently used in biomarker discovery [73,74].
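Of the supervised classifiers listed above, k-NN is the simplest to write out in full. The following sketch classifies a sample by majority vote among its nearest labeled neighbors; the two features, the labels, the function name, and all values are hypothetical and serve only to illustrate the mechanics.

```python
from collections import Counter
from math import dist


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training samples."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical (CDH1, VIM) log-expression pairs with phenotype labels (illustrative only).
train_x = [(9.0, 3.0), (8.5, 3.5), (8.8, 2.9), (3.0, 9.0), (2.5, 9.5), (3.2, 8.7)]
train_y = ["epithelial"] * 3 + ["mesenchymal"] * 3

print(knn_predict(train_x, train_y, (8.0, 4.0)))
print(knn_predict(train_x, train_y, (3.5, 8.0)))
```

Because k-NN relies on distances, features should be on comparable scales before it is applied to real omics data.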
Validation and evaluation: These stages are crucial for ensuring the performance and generalizability of AI models. Validation assesses a trained model’s ability to handle new, unseen data. This process involves evaluating metrics such as accuracy, sensitivity, and specificity using data separate from the training set. There are two main types of validation: internal and external. Internal validation, typically performed through cross-validation, involves dividing the data into training and test sets to evaluate the accuracy of the model. External validation relies on independent evidence, such as in vivo and in vitro experiments or in-house cohort studies [75,76]. Successful cross-layer modeling requires careful technical control [77]. Batch effects arising from different sequencing platforms or acquisition dates should be mitigated using tools such as ComBat or Harmony, and missing-layer samples can be handled through imputation methods or model architectures that tolerate incomplete views (matrix factorization, variational autoencoders). Multi-view learning frameworks should be selected according to study goals and sample size; they include early-fusion approaches that merge features before modeling, late-fusion approaches that combine the outputs of per-layer models, and factor-based methods such as MOFA that learn shared latent factors across omics layers. To avoid model overfitting and information leakage, preprocessing and feature selection must be performed inside the training folds during cross-validation, and feature-stability analyses (e.g., bootstrapping or repeated cross-validation) are essential to confirm robustness of selected biomarkers. These practices help ensure that predictive EMT signatures derived from integrated datasets remain reproducible and clinically meaningful.
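The requirement that preprocessing and feature selection stay inside the training folds can be sketched as follows. This is an illustrative example on synthetic data, assuming a simple mean-difference filter and a nearest-centroid classifier; neither is taken from the cited studies, and all names and parameters are hypothetical.

```python
import random
from statistics import mean, pstdev


def make_toy_data(n_per_class=20, n_features=20, n_informative=2, seed=0):
    """Synthetic data: only the first `n_informative` features differ between classes."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        shift = 2.0 if label == 1 else -2.0
        for _ in range(n_per_class):
            X.append([rng.gauss(shift if j < n_informative else 0.0, 1.0)
                      for j in range(n_features)])
            y.append(label)
    return X, y


def fit_fold(X_train, y_train, n_keep):
    """Learn scaling, feature subset, and class centroids from the training fold only."""
    p = len(X_train[0])
    cols = [[row[j] for row in X_train] for j in range(p)]
    mu = [mean(c) for c in cols]
    sd = [pstdev(c) or 1.0 for c in cols]
    Z = [[(row[j] - mu[j]) / sd[j] for j in range(p)] for row in X_train]

    def class_mean(j, label):
        return mean(z[j] for z, lab in zip(Z, y_train) if lab == label)

    gap = [abs(class_mean(j, 1) - class_mean(j, 0)) for j in range(p)]
    keep = sorted(range(p), key=lambda j: gap[j], reverse=True)[:n_keep]
    centroids = {lab: [class_mean(j, lab) for j in keep] for lab in (0, 1)}
    return mu, sd, keep, centroids


def predict(model, row):
    """Scale a held-out sample with training statistics and assign the nearest centroid."""
    mu, sd, keep, centroids = model
    z = [(row[j] - mu[j]) / sd[j] for j in keep]
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2 for a, b in zip(z, centroids[lab])))


def cross_validate(X, y, k=5, n_keep=2, seed=0):
    """k-fold cross-validation; every data-dependent step is refit within each fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    tp = tn = fp = fn = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        model = fit_fold([X[i] for i in train_idx], [y[i] for i in train_idx], n_keep)
        for i in test_idx:
            pred = predict(model, X[i])
            if pred == 1 and y[i] == 1:
                tp += 1
            elif pred == 0 and y[i] == 0:
                tn += 1
            elif pred == 1 and y[i] == 0:
                fp += 1
            else:
                fn += 1
    return {"accuracy": (tp + tn) / len(X),
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp)}


metrics = cross_validate(*make_toy_data())
print(metrics)
```

The key design choice is that `fit_fold` never sees the held-out samples, so the scaling parameters and the selected features cannot leak information from the test fold into the model.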

5. Machine Learning: Revolutionizing Multiomics Data Interpretation

Recent advances in analytical technologies have generated an exponential increase in data from diverse omics platforms. Statistical methods and AI, particularly machine learning (ML) and deep learning (DL) algorithms, have proven highly effective in analyzing these large-scale biological and clinical datasets. By uncovering complex patterns, they have facilitated the identification and validation of EMT-related biomarkers that hold promise for personalized treatment decisions and the development of targeted therapies. In this review, we summarized and compiled novel biomarkers identified through multi-omics integration using ML approaches (Table 1). Relevant studies were identified from PubMed, Web of Science, and Scopus using combinations of “breast cancer,” “epithelial–mesenchymal transition,” “multi-omics,” “biomarkers,” and “machine learning.” Given the heterogeneity of endpoints, algorithms, and datasets, strict inclusion/exclusion criteria were not applied as in a systematic review; rather, studies were selected to illustrate the diversity of approaches and key findings in this emerging field.
While multi-omics and ML approaches have uncovered promising EMT-related biomarkers, their clinical utility ultimately depends on translation into deployable assays. Several avenues are currently being explored. Immunohistochemistry (IHC) panels incorporating recurrent EMT markers such as E-cadherin, N-cadherin, vimentin, and fibronectin offer standardized diagnostic and prognostic assessment, provided staining reproducibility and concordance with molecular EMT signatures are demonstrated [78,79]. Although the trials and multi-omics analyses summarized in Table 1 provide important leads, their biomarker findings vary in sample size, cohort diversity, and analytical rigor. Many rely on retrospective datasets or single-center cohorts, which can limit reproducibility. Independent external validation is often incomplete, and assay platforms differ in sensitivity and specificity across studies. These factors highlight the need for larger prospective trials with standardized protocols, cross-platform benchmarking, and transparent data sharing to confirm the clinical utility of EMT-related biomarkers.
qPCR-based gene signatures, such as those derived from miR-21, miR-222-3p [80], and ESRP1/2, represent rapid and low-cost clinical tests, though they require rigorous cross-cohort validation and benchmarking against existing prognostic models [81,82]. Extracellular vesicle (EV)-based proteomics has shown that proteins such as FAK, MEK1, and fibronectin enriched in circulating EVs can serve as minimally invasive EMT biomarkers, but clinical translation necessitates robust EV isolation protocols, reproducible proteomic quantification, and adherence to laboratory standards [82]. Similarly, radiomics and imaging pipelines applying ML to MRI-based radiomic features have successfully predicted EMT status and correlated it with outcomes; however, their clinical adoption will require harmonization of imaging protocols, standardized feature extraction, and integration into hospital PACS systems [83].
Recent studies have leveraged machine learning (ML) and multi-omics data to identify epithelial–mesenchymal transition (EMT)-related biomarkers in breast cancer, particularly in aggressive subtypes such as triple-negative breast cancer (TNBC). Thalor et al. analyzed GEO gene expression datasets using multiple ML algorithms, including SVM, kNN, RF, DT, LR, and XGBoost. XGBoost achieved the highest accuracy and AUC with the top 25 features (CX-25), including BCHE, ATP7B, PPP4R4, TFF1, PTGFR, TTYH1, and SERPINA6, while the driver dataset (DX-20) highlighted CDKN2A, WIF1, ZNF521, MUC16, WNK4, COL2A1, and S100A7. Kaplan–Meier analysis identified S100B and POU2AF1 as potential EMT-related prognostic genes, linked to metastasis-associated pathways such as PI3K-AKT, Wnt, MAPK, and TGF-β. This study underscores the power of ML to uncover biomarkers that survive external validation and are directly connected to EMT processes [84]. Building on this concept, Rozova et al. explored how the microenvironment influences EMT dynamics. Using ML to analyze mesenchymal breast carcinoma cells on substrates with varying stiffness, they observed a shift from EMT toward mesenchymal–epithelial transition (MET), mediated by changes in E-cadherin localization and vimentin expression, demonstrating the critical role of biomechanical cues in regulating EMT markers. These findings complement gene-based analyses, emphasizing that EMT is not only genetically regulated but also microenvironment-dependent [78]. Further integrating molecular layers, Villemin et al. identified an EMT-related splicing signature in basal-like TNBC. ML analysis revealed that splicing regulators RBM47 and ESRP1/2 control EMT-associated alternatively spliced variants. Low RBM47 expression correlated with poor prognosis, highlighting its potential as a clinically relevant EMT biomarker [85].
Similarly, XGBoost-based models on TCGA datasets identified metastasis marker genes (RGS7, SPPL2C, KRT23) that would have been missed by traditional statistics, illustrating how ML captures non-linear interactions and complex feature relationships relevant to EMT [86].
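Kaplan–Meier analysis, used in several of these studies to relate candidate genes to survival, estimates the probability of remaining event-free over time while accounting for censored patients. The sketch below is a minimal implementation of the estimator on invented follow-up data; the function name and values are hypothetical, and the log-rank test normally used to compare high- and low-expression groups is not shown.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times: follow-up time per patient; events: 1 = event observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed   # events and censored patients both leave the risk set
    return curve


# Hypothetical follow-up times (months) and event indicators (illustrative only).
times = [5, 8, 12, 12, 20, 25]
events = [1, 0, 1, 1, 0, 1]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

In practice the estimator is computed separately for patients with high and low expression of a candidate marker, and the resulting curves are compared.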
Integration of EMT markers with immune profiling was demonstrated by Chen et al., who applied unsupervised clustering to TNBC datasets (TCGA, GEO) to define immune subtypes. Correlation with EMT markers (CDH2, FN1, CDH1, VIM) revealed higher EMT activity in subtypes with poor prognosis. Random forest models further predicted clinical outcomes and potential immunotherapy response, highlighting the translational value of integrating EMT biomarkers with ML-based multi-omics analysis [79].
Circulating biomarkers also provide insight into EMT regulation. Triantafyllou et al. identified miR-21 as a common EMT-related molecule across breast cancer subtypes. LZTFL1, a target of miR-21, regulates EMT via E-cadherin and actin cytoskeleton interactions, illustrating how post-transcriptional mechanisms can serve as prognostic EMT biomarkers [81,87]. In parallel, Gou et al. used ML to assess tumor microenvironment (TME) profiles in 491 TNBC patients. TME-related gene (TRG) scores correlated with EMT markers and predicted immunotherapy responses, emphasizing the integration of multi-omics and ML for both prognostic and predictive insights [88]. Proteomic analyses further highlight EMT biomarkers. Kothari et al. applied ML algorithms (SVM, DT, RF) to TCGA data, identifying MFGE8 [89] and TBC1D9 as EMT-associated markers that differentiate TNBC from non-TNBC and correlate with prognosis. Similarly, proteomic profiling of circulating sEVs in breast cancer patients revealed upregulated FAK, MEK1, and fibronectin, which could serve as noninvasive diagnostic EMT markers [82].
Hypoxia and metabolism-driven EMT have also been studied using multi-omics integration. Li et al. constructed a hypoxia- and lactate metabolism-related prognostic model (HLMRPM) using TCGA and GEO datasets. Genes such as DARS2, ESRP1, TH, and SLC2A1 modulated EMT pathways, offering predictive insights into survival and therapeutic response [90]. Boolean network models also predicted hybrid EMT cellular phenotypes, further illustrating the utility of ML in classifying complex EMT states [91].
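The idea behind Boolean network modeling of EMT can be illustrated with a deliberately small toy circuit. The network below is an assumption made for illustration and is not the model of the cited study: it encodes only a SNAIL input, mutual inhibition between ZEB and miR-200, and repression of E-cadherin (CDH1), and it enumerates the steady states under synchronous updating.

```python
from itertools import product

NODES = ("SNAIL", "ZEB", "miR200", "CDH1")


def step(state):
    """One synchronous update of the toy network (rules are illustrative assumptions)."""
    s = dict(zip(NODES, state))
    emt_tf = s["SNAIL"] or s["ZEB"]
    nxt = {
        "SNAIL": s["SNAIL"],                 # external input, held constant
        "ZEB": emt_tf and not s["miR200"],   # induced by SNAIL or itself, blocked by miR-200
        "miR200": not emt_tf,                # repressed by SNAIL and ZEB
        "CDH1": not emt_tf,                  # E-cadherin repressed by SNAIL and ZEB
    }
    return tuple(int(bool(nxt[n])) for n in NODES)


def fixed_points():
    """Enumerate all states that map to themselves (steady states)."""
    return [state for state in product((0, 1), repeat=len(NODES))
            if step(state) == state]


for state in fixed_points():
    print(dict(zip(NODES, state)))
```

Without SNAIL input, this toy circuit has both an epithelial steady state (miR-200 and CDH1 on) and a mesenchymal one (ZEB on), while SNAIL input leaves only the mesenchymal state. Capturing hybrid epithelial/mesenchymal states requires larger networks than this sketch.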
Finally, deep learning and imaging-based approaches have enabled single-cell quantification of EMT phenotypes. Malik et al. applied deep neural networks integrating multi-omics and clinical features to stratify patients into risk groups and predict drug response. Key EMT-related genes (CDH1, PIK3CA, TP53, EFHD1) were highlighted, demonstrating the value of ML in connecting molecular EMT signatures to clinical decision-making [92].
Table 1. Potential EMT-related biomarkers in breast cancer identified through machine learning-based integrated analysis of multi-omics data.
Biomarker(s) | Role in Breast Cancer | Dataset(s) | ML Method(s) | Validation | Cohort Size | Clinical Context (Assay) | Outcome | Ref.
BCHE, ATP7B, PPP4R4, TFF1, PTGFR, TTYH1, SERPINA6, CDKN2A, WIF1, ZNF521, MUC16, WNK4, COL2A1, S100A7, S100B, POU2AF1 | Prognostic (multi-gene EMT-associated panel) | GEO | XGBoost | Internal CV; Kaplan–Meier survival analysis | n = 623 TNBC and 527 non-TNBC samples (GEO cohorts) | Prognostic; RNA-seq, microarray | Higher expression is associated with better survival | [84]
E-cadherin (CDH1), Vimentin (VIM) | Prognostic (classical EMT markers) | ECM Select Array | Hierarchical clustering | Experimental (Spearman correlation, t-test) | Cell line/tissue assays | Prognostic; IHC, array-based | Worse prognosis due to EMT features | [78]
RBM47, ESRP1/2 | Prognostic (RNA splicing regulators of EMT) | GEO | Random Forest, Cox regression | Internal CV; Log-rank, Wilcoxon tests |  | Prognostic; RNA-seq, microarray | Worse prognosis in basal-like breast cancer | [85]
RGS7, SPPL2C, KRT23 | Prognostic (linked to EMT signaling) | TCGA, DisGeNET, KEGG | XGBoost | Internal CV; Kaplan–Meier survival analysis | TCGA: n = 22 samples (metastasis to other organs) | Prognostic; RNA-seq | Worse prognosis | [86]
CDH2, FN1, CDH1, VIM | Prognostic & Predictive (epithelial–mesenchymal switch signature) | TCGA, GEO, METABRIC | Random Forest, Consensus Clustering | Internal CV (TCGA), External validation (METABRIC) | TCGA: n = 116 TNBC; GEO: 815 TNBC; METABRIC: n = 313 (ER- and HER2-negative BC) | Prognostic/Predictive; RNA-seq, IHC | Response to immune checkpoint blockade (ICB) and better survival | [79]
miR-21, miR-148b, miR-144, miR-203a, miR-140 | Prognostic (EMT-related miRNA) | miRecords, miRTarBase, TarBase | Linear SVM | Internal CV; Fisher’s exact test | n = 66 (primary breast cancer) | Prognostic; qPCR, RNA-seq | miR-21, prognostic marker of worse outcome. miR-148b, miR-144, miR-203a, miR-140, predictive markers for targeted therapy | [81]
Tumor microenvironment-related gene (TRG) score | Prognostic & Predictive (EMT and immune infiltration) | TCGA, GEO, UCSC Xena | LASSO, OCLR, Cox regression | Internal CV; ROC, PCA; External validation in GEO | GEO: multiple cohorts | Prognostic/Predictive; RNA-seq | Low TME-related gene scores are associated with improved prognosis and better response to immunotherapy. | [88]
MFGE8 | Diagnostic & Prognostic (linked to EMT signaling) | TCGA, KM Plotter | SVM, Decision Tree, Random Forest | Experimental (qPCR, LC-MS/MS); Internal validation | TCGA: n = 140 TNBC and 737 non-TNBC | Diagnostic/Prognostic; qPCR, proteomics | MFGE8 overexpression is associated with poor prognosis | [89]
Fibronectin, FAK, MEK1 | Diagnostic (EMT-related adhesion/migration proteins) | RPPA, immunoblotting, EM | k-NN, Logistic Regression | Experimental (t-test, ROC) | Cell lines, patient tissue | Diagnostic; RPPA, IHC | Protein clusters distinguish sample types; some predict relapse and therapy response | [82]
DARS2, SLC2A1, ESRP1, TH, MAFF | Prognostic (EMT-related metabolic & splicing regulators) | TCGA, GEO, UCSC Xena | Cox regression, RSF | Internal CV; ROC, TIDE; External validation GEO | TCGA: n = 1113 (patients with overall survival (OS) time longer than 30 days); GEO: 327 | Prognostic; RNA-seq, bioinformatics | Worse overall survival in patients with high lactate-hypoxia scores | [90]
GATA3, KRT6, ACTA2, CDH1 | Diagnostic & Prognostic (canonical EMT transcription factors) | TCGA, METABRIC | Neural Network (Cox-nnet) | Internal CV; Experimental (IMC imaging) | TCGA: n = 159 (TNBC), n = 599 (Luminal A); METABRIC: n = 299 (TNBC), 1369 (Luminal A) | Diagnostic/Prognostic; RNA-seq, IMC | KRT6 and ACTA2 over-expression and CDH1 under-expression show poor prognosis. | [93]
CDH1, PIK3CA, TP53, EFHD1 | Prognostic & Predictive (partly EMT-related) | TCGA, GDSC | Naïve Bayes, SMO, RF, k-NN | Internal CV; STRING/Cytoscape validation | TCGA: n = 1000 | Prognostic/Predictive; RNA-seq, microarray | Worse survival and potential treatment response | [92]
miR-222-3p | Diagnostic & Prognostic (EMT-associated miRNA) | TCGA, GEO, miRWalk | OCLR | Internal CV; ROC, Kaplan–Meier | TCGA: n = 1103; GEO: multiple | Diagnostic/Prognostic; qPCR, RNA-seq | Higher miR-222-3p expression indicates worse prognosis. | [80]
Quantitative EMT score (epithelial–mesenchymal traits) | Diagnostic (phenotypic EMT scoring) | DHM imaging | AdaBoost, SVM | Experimental (t-test, post hoc analysis) | Cell lines, tissue samples | Diagnostic; digital holographic microscopy |  | [72]
Immune-radiomic models with EMT signatures | Predictive (therapy response prediction) | MRI-based radiomics | ML algorithms (unspecified) | External validation (MRI cohort) | n = 570 (breast MRI) | Predictive; MRI, radiomics | MRI-based model predicts risk of positive margins in BCS. | [83]
Abbreviations: Gene Expression Omnibus (GEO), The Cancer Genome Atlas (TCGA), Extreme Gradient Boosting (XGBoost), Immunohistochemistry (IHC), linear Support Vector Machine (SVM), Quantitative real-time PCR (qPCR), Liquid Chromatography with tandem mass spectrometry (LC-MS/MS), Random forest (RF), Reverse phase protein array (RPPA), Random survival forest (RSF), Triple-negative breast cancer (TNBC), Sequential minimal optimization (SMO), k-nearest neighbors algorithm (k-NN), Genomics of Drug Sensitivity in Cancer (GDSC).
Complementary studies using single-cell imaging and digital holographic microscopy quantified epithelial–mesenchymal scores via SVM and neural networks, providing sensitive assessment of EMT dynamics in TNBC and luminal subtypes [93]. Ma et al. leveraged MRI-based immune-radiomic models to infer EMT status in tumor margins, linking elevated EMT activity to poor surgical outcomes and highlighting ML’s role in bridging imaging and molecular biomarkers [83].
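Many of the studies summarized in this section validated their signatures by comparing Kaplan–Meier survival curves between patient groups. As a minimal sketch of how such a curve is computed, the following pure-Python function implements the standard product-limit estimator. The function name `kaplan_meier` and the follow-up data are hypothetical and are not taken from any of the cited studies.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up time for each patient
    events -- 1 if the event (e.g., death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)          # everyone leaving the risk set at t
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Patients censored at t still count as at risk at t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]
    return curve


# Hypothetical follow-up times (months) for a "high EMT score" group.
high_score_times = [2, 3, 3, 5, 8]
high_score_events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(high_score_times, high_score_events):
    print(f"t = {t}: S(t) = {s:.2f}")
```

In practice, the curves of the high- and low-score groups would then be compared with a log-rank test, as several of the tabulated studies report.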

6. EMT-Related Biomarkers: Predictive Indicators and Therapeutic Targets

The primary therapeutic strategies in cancer therapy focus on suppressing EMT processes by targeting EMT transcription factors (EMT-TFs), EMT-related signaling pathways, and EMT-related proteins. These strategies aim to reduce the risk of metastasis and thereby improve patient outcomes [94]. Despite advances in cancer research, significant challenges persist in treatment. The primary treatments (chemotherapy, radiotherapy, and immunotherapy) often encounter resistance, which makes therapy less effective, and addressing this resistance is a key issue in oncology. Over the past decade, targeted cancer therapies have advanced substantially, with the development and approval of small molecule inhibitors, monoclonal antibodies, non-coding RNA molecules, and natural products, many of which are currently under clinical investigation [95]. To develop effective targeted therapies, it is crucial to understand the mechanisms behind therapy resistance, including epithelial–mesenchymal plasticity (EMP) and the tumor microenvironment. Reversing EMT and maintaining the epithelial traits of cancer cells can enhance their sensitivity to chemotherapy and radiotherapy [96].
Developing new cancer therapies centers on targeting EMT-TFs and their related signaling pathways to prevent or reverse EMT, thereby inhibiting the processes that enable cancer cells to become more invasive and spread to other parts of the body. Akbar et al. combined in silico and in vitro analyses: publicly available datasets from cell lines and patient tumors were used for the in silico work, while in vitro experiments on human breast cancer cell lines assessed phenotypic plasticity and drug responsiveness. Knocking down ZEB1 and SNAI2 decreased sensitivity to midostaurin and increased sensitivity to lapatinib in MDA-MB-157 breast cancer cells. The authors found that the CNCL gene list, which includes genes related to stem-cell and EMT characteristics, is indicative of tumor plasticity and influences cytotoxicity profiles, particularly in response to lapatinib and midostaurin. Although this gene list does not always accurately forecast patient outcomes, it can help determine which patients are likely to respond well to taxane treatment before their main therapy [97]. Imani and colleagues found that targeting specific EMT-related proteins (ZEB1, TWIST1, and NOTCH1) with miR-34a and the natural compound thymoquinone halted the EMT process in breast cancer cells, reducing both cell invasion and metastasis [98].
Addison et al. investigated the expression of EMT-TF networks across various cancer models and human breast tumors and used conditional knockdown to demonstrate that ZEB1 and ZEB2 are crucial regulators of metastasis. They found that ZEB1/2, TCF4, SNAI2, and TWIST1/2 are commonly and cooperatively upregulated during forced EMT in normal mammary epithelial cells. This upregulation likely results from the suppression of epithelial-specific miRNAs, such as miR-200s/203/205, which normally inhibit multiple EMT-TFs. They also discovered that miR-200c can target TCF4 in addition to SNAI2 and ZEB1/2. Activating the EMT program in non-transformed epithelial cells conferred stem cell-like properties, characterized by the CD24−/CD44+ phenotype. ZEB1/2 and TWIST1/2 were significantly upregulated in these CD24−/CD44+ mammary stem-like cells, with ZEB1 playing a major role in maintaining their mesenchymal status. Furthermore, using Dox-inducible shRNAs, they showed that depleting ZEB1/2 and suppressing EMT at the early stages of tumor growth can block spontaneous lung metastasis, which suggests that administering anti-EMT drugs during early tumor stages could be an effective strategy for preventing metastasis [99].
In another study, researchers treated breast cancer cells with cisplatin and measured cell migration and changes in EMT markers by Western blot, migration assays, and immunofluorescent staining. They analyzed RNA expression changes by RNA-seq, confirmed the binding of activating transcription factor 3 (ATF3) to cytoskeleton-related genes by ChIP-seq, and then tested a paclitaxel and cisplatin combination in xenograft mouse models. A lower dose of cisplatin specifically inhibited cancer metastasis without significantly affecting tumor growth. The underlying mechanism was identified as cisplatin disrupting a positive feedback loop between TGFβ and FN1, which is crucial for TGFβ activation; this inhibitory effect was lost when ATF3 was disrupted [100]. To identify natural inhibitors of breast cancer metastasis, Tian et al. used small interfering RNAs (siRNAs) to transiently knock down 591 ERF-coding genes in luminal breast cancer MCF-7 cells. Depletion of AF9 significantly promoted MCF-7 cell invasion and migration, and a metastasis mouse model confirmed AF9’s suppressive role in breast cancer metastasis. RNA profiling showed that AF9 target genes are enriched in the EMT pathway. Tandem mass spectrometry revealed that AF9 interacts with Snail, a master regulator of EMT, inhibiting Snail’s transcriptional activity in basal-like breast cancer (BLBC) cells. AF9 reconstitutes an active state on Snail’s promoter, recruiting GCN5 or CBP to derepress target genes. Additionally, miR-5694 targets and degrades AF9 mRNA in BLBC cells, further enhancing cell migration and invasion, and AF9 and miR-5694 expression levels are inversely correlated in BLBC clinical samples. miR-5694 therefore mediates the downregulation of AF9 and promotes metastasis in BLBC, and restoring expression of the metastasis suppressor AF9 could be a potential therapeutic strategy against metastatic breast cancer [101]. Another study demonstrated that combining doxorubicin with miR-34a, both in vitro and in vivo, synergistically suppresses the progression of doxorubicin-resistant breast cancer. The combination decreases Snail expression by inhibiting several signaling pathways, including RAS/RAF/MEK/ERK and Notch/NF-κB. High Snail expression is known to significantly promote cell migration, invasion, and adhesion, likely through the regulation of E-cadherin and N-cadherin. These findings highlight the importance of miR-34a in regulating Snail and suggest that co-administering miR-34a with doxorubicin could offer a more effective therapeutic strategy against drug-resistant breast cancer in clinical settings [102].
Extensive evidence indicates that targeting EMT-related proteins and signaling pathways can inhibit cancer progression and enhance the sensitivity of cancer cells to therapy. Nelson et al. demonstrated that overexpressing membrane N-cadherin enhances chemo-resistance and invasiveness in TNBC, and that using an antibody to inhibit pro-N-cadherin increased chemo-sensitivity in the BT549 and SUM159 cell lines [103]. Senthebane et al. found that elevated levels of collagens, laminins, and fibronectin in tumor tissue play a significant role in cancer progression. Using siRNA to knock down collagen and fibronectin in combination with chemotherapy drugs increased the sensitivity of esophageal and breast cancer cells to these agents and reduced colony formation and cancer cell migration, underscoring the potential therapeutic benefit of targeting these extracellular matrix components [104]. Ahmad et al. demonstrated that overexpression of phosphoglucose isomerase/autocrine motility factor (PGI/AMF) enhances NF-κB binding to DNA, resulting in the upregulation of ZEB1 and ZEB2. Silencing the PGI pathway reversed EMT and reduced the aggressiveness of breast cancer cells. Specifically, knocking down PGI/AMF expression in breast cancer cells led to the upregulation of miR-200s, which correlated with reversal of the EMT phenotype. This was consistent with changes in the expression of epithelial markers (E-cadherin) and mesenchymal markers (ZEB1, vimentin, ZEB2), as well as decreased cell aggressiveness in motility, clonogenic, and invasion assays. These results suggest that miR-200s play a role in PGI/AMF-induced EMT and that approaches to upregulate miR-200s could represent a novel therapeutic strategy for treating highly invasive breast cancer [105].
Zhu et al. demonstrated that TIMP-1 serves as a prognostic and predictive biomarker in breast cancer: elevated TIMP-1 levels are linked to poorer overall survival and disease-free survival and to resistance to paclitaxel [106]. A different study indicated that antisense oligonucleotides decreased miR-221 and miR-222 levels in breast cancer cells, enhancing cellular sensitivity to tamoxifen by increasing TIMP-3 expression [107]. Thakur et al. found that targeting MT1-MMP could improve the effectiveness of chemotherapy and radiotherapy in breast cancer patients, especially those with TNBC; MT1-MMP activates a DNA damage response that affects breast cancer resistance to doxorubicin both in vitro and in an animal model [108]. A pilot investigation of breast cancer patients undergoing traditional and hypofractionated radiation therapy suggested that MMPs (specifically MMP-3 and MMP-9) and TIMP-4 could be valuable prognostic and predictive biomarkers. MMP-3 levels were associated with tumor size, grade, menopausal status, lymph node involvement, and hormonal receptor status, and MMP levels correlated with radiation-induced side effects [109]. Yuan et al. found that breast cancer patients who underwent radiotherapy and chemotherapy after surgery showed decreased MMP-9 and elevated TIMP-1 levels in serum, which correlated with clinicopathological characteristics [110]. Saxena et al. found that Snail, Foxc2, and Twist increased the expression of multiple ABC transporters in breast cancer cells exposed to doxorubicin [111]. A metastatic breast cancer model resistant to cyclophosphamide exhibited tolerance to apoptosis, and targeting EMT with miR-200 eradicated chemo-resistance [112].
Breast cancer subtypes differ markedly in their response to immune-checkpoint inhibitors. Triple-negative and some basal-like tumors show the highest—but still modest—objective response rates, whereas hormone-receptor–positive and HER2-enriched cancers generally respond poorly [113]. Resistance mechanisms include low tumor mutational burden, immune-suppressive tumor microenvironments, and up-regulation of alternative checkpoints [114]. Multi-omics and ML studies are beginning to identify predictive biomarkers—such as tumor-infiltrating lymphocyte signatures and composite EMT/immune scores—that correlate with ex vivo drug-sensitivity metrics such as IC50 values (the drug concentration that inhibits 50% of a specified cellular response or activity). Incorporating these immune-response data with EMT-focused omics can guide patient stratification and combination-therapy design [115].
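To make the IC50 definition above concrete, the sketch below shows one simple way an IC50 can be read off a dose-response series: log-linear interpolation between the two tested doses whose viability brackets 50%. The Hill-type curve used to simulate viability, the function names, and the dose grid are illustrative assumptions and are not drawn from the cited studies, which typically fit a full sigmoidal model instead.

```python
import math


def hill_viability(conc, ic50, hill=1.0):
    """Fraction of cells surviving at a given dose under a Hill-type model
    (an assumed model, used here only to simulate example data)."""
    return 1.0 / (1.0 + (conc / ic50) ** hill)


def ic50_by_interpolation(concs, viability):
    """Estimate IC50 by interpolating on log10(dose) between the two
    adjacent doses whose viability brackets 0.5.

    concs must be positive; returns None if viability never crosses 0.5.
    """
    pairs = sorted(zip(concs, viability))
    for (c_lo, v_lo), (c_hi, v_hi) in zip(pairs, pairs[1:]):
        if v_lo >= 0.5 >= v_hi and v_lo != v_hi:
            frac = (v_lo - 0.5) / (v_lo - v_hi)
            log_ic50 = math.log10(c_lo) + frac * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10 ** log_ic50
    return None


# Hypothetical ten-fold dilution series (µM) and simulated viabilities.
doses = [0.01, 0.1, 1.0, 10.0, 100.0]
viab = [hill_viability(c, ic50=2.0, hill=1.2) for c in doses]
print(ic50_by_interpolation(doses, viab))
```

Because the dose grid is coarse, the interpolated value is close to, but not identical with, the simulated IC50 of 2.0; a denser dilution series or a fitted curve would narrow that gap.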

7. Conclusions

The current review highlights the essential role of advanced technology and AI techniques in identifying novel EMT-related biomarkers for breast cancer. These biomarkers hold strong potential for improving early detection, prognosis, personalized treatment, and the development of targeted therapies. Nevertheless, several challenges remain, particularly the heterogeneity of multi-omics data and the complexity of its integration. To move the field forward, it is crucial to benchmark EMT signatures across large-scale public datasets such as TCGA and METABRIC and validate them in independent clinical cohorts to ensure reproducibility and generalizability. Future studies should prioritize multi-omics integration models that demonstrate improved calibration and net clinical benefit compared with single-omics approaches. At the same time, assessing feature stability and interpretability and incorporating decision-curve analyses will be essential for ensuring the robustness and clinical relevance of ML predictions. Finally, encouraging open science practices, including the sharing of datasets, code, and model architectures, will enhance reproducibility and foster collaborative refinement of EMT-focused ML pipelines. By aligning biomarker discovery with these practical steps, the integration of omics and AI-driven methods can advance from theoretical promise to clinically actionable strategies that improve patient stratification, therapeutic decision-making, and precision oncology.
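The decision-curve analysis recommended above rests on the net benefit statistic, which at a threshold probability p_t is commonly defined as TP/n − (FP/n) × p_t/(1 − p_t). The following is a minimal sketch of that calculation, compared against the "treat all" reference strategy. The function names, outcome labels, and predicted risks are hypothetical and serve only to illustrate the arithmetic.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold.

    y_true    -- 1 if the outcome occurred, 0 otherwise
    y_prob    -- predicted probability of the outcome
    threshold -- threshold probability p_t, strictly between 0 and 1
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical outcomes and model-predicted risks for ten patients.
outcomes = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
risks = [0.9, 0.8, 0.7, 0.2, 0.6, 0.1, 0.3, 0.4, 0.35, 0.05]

for pt in (0.2, 0.35, 0.5):
    model_nb = net_benefit(outcomes, risks, pt)
    all_nb = net_benefit_treat_all(outcomes, pt)
    print(f"p_t = {pt}: model = {model_nb:.3f}, treat all = {all_nb:.3f}")
```

A model adds clinical value at a given threshold only when its net benefit exceeds both the "treat all" value and zero (the "treat none" strategy); plotting these quantities across a range of thresholds yields the decision curve.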

Author Contributions

G.K.-T. conceived the original idea and wrote the manuscript. A.S. performed supervision and reviewed the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Wilkinson, L.; Gathani, T. Understanding breast cancer as a global health concern. Br. J. Radiol. 2022, 95, 20211033.
  2. Feng, Y.; Spezia, M.; Huang, S.; Yuan, C.; Zeng, Z.; Zhang, L.; Ji, X.; Liu, W.; Huang, B.; Luo, W.; et al. Breast cancer development and progression: Risk factors, cancer stem cells, signaling pathways, genomics, and molecular pathogenesis. Genes Dis. 2018, 5, 77–106.
  3. Trieu, P.D.Y.; Mello-Thoms, C.R.; Barron, M.L.; Lewis, S.J. Look how far we have come: BREAST cancer detection education on the international stage. Front. Oncol. 2022, 12, 1023714.
  4. Siegel, R.L.; Giaquinto, A.N.; Jemal, A. Cancer statistics, 2024. CA Cancer J. Clin. 2024, 74, 12–49.
  5. Arnold, M.; Morgan, E.; Rumgay, H.; Mafra, A.; Singh, D.; Laversanne, M.; Vignat, J.; Gralow, J.R.; Cardoso, F.; Siesling, S.; et al. Current and future burden of breast cancer: Global statistics for 2020 and 2040. Breast 2022, 66, 15–23.
  6. Makki, J. Diversity of Breast Carcinoma: Histological Subtypes and Clinical Relevance. Clin. Med. Insights Pathol. 2015, 8, 23–31.
  7. Aysola, K.; Desai, A.; Welch, C.; Xu, J.; Qin, Y.; Reddy, V.; Matthews, R.; Owens, C.; Okoli, J.; Beech, D.J.; et al. Triple Negative Breast Cancer—An Overview. Hered. Genet. 2013, 2013 (Suppl. S2), 001.
  8. Exman, P.; Tolaney, S.M. HER2-positive metastatic breast cancer: A comprehensive review. Clin. Adv. Hematol. Oncol. 2021, 19, 40–50.
  9. Hacking, S.M.; Yakirevich, E.; Wang, Y. From Immunohistochemistry to New Digital Ecosystems: A State-of-the-Art Biomarker Review for Precision Breast Cancer Medicine. Cancers 2022, 14, 3469.
  10. Swain, S.M.; Shastry, M.; Hamilton, E. Targeting HER2-positive breast cancer: Advances and future directions. Nat. Rev. Drug Discov. 2023, 22, 101–126.
  11. Nassar, A.; Hoskin, T.L.; Stallings-Mann, M.L.; Degnim, A.C.; Radisky, D.C.; Frost, M.H.; Vierkant, R.A.; Hartmann, L.C.; Visscher, D.W. Ki-67 expression in sclerosing adenosis and adjacent normal breast terminal ductal lobular units: A nested case-control study from the Mayo Benign Breast Disease Cohort. Breast Cancer Res. Treat. 2015, 151, 89–97.
  12. Ogony, J.; Hoskin, T.L.; Stallings-Mann, M.; Winham, S.; Brahmbhatt, R.; Arshad, M.A.; Kannan, N.; Pena, A.; Allers, T.; Brown, A.; et al. Immune cells are increased in normal breast tissues of BRCA1/2 mutation carriers. Breast Cancer Res. Treat. 2023, 197, 277–285.
  13. Wang, X.; Collet, L.; Rediti, M.; Debien, V.; De Caluwe, A.; Venet, D.; Romano, E.; Rothe, F.; Sotiriou, C.; Buisseret, L. Predictive Biomarkers for Response to Immunotherapy in Triple Negative Breast Cancer: Promises and Challenges. J. Clin. Med. 2023, 12, 953.
  14. Khalili-Tanha, G.; Sebzari, A.; Moodi, M.; Hajipoor, F.; Naseri, M. Mutations analysis of BRCA1 gene in patients with breast cancer in South Khorasan province, East Iran. Med. J. Islam. Repub. Iran 2019, 33, 105.
  15. Craene, B.D.; Berx, G. Regulatory networks defining EMT during cancer initiation and progression. Nat. Rev. Cancer 2013, 13, 97–110.
  16. Roche, J. The epithelial-to-mesenchymal transition in cancer. Cancers 2018, 10, 52.
  17. Orrantia-Borunda, E.; Anchondo-Nunez, P.; Acuna-Aguilar, L.E.; Gomez-Valles, F.O.; Ramirez-Valdespino, C.A. Subtypes of Breast Cancer. In Breast Cancer; Mayrovitz, H.N., Ed.; Exon Publications: Brisbane, Australia, 2022.
  18. Yang, J.; Antin, P.; Berx, G.; Blanpain, C.; Brabletz, T.; Bronner, M.; Campbell, K.; Cano, A.; Casanova, J.; Christofori, G.; et al. Guidelines and definitions for research on epithelial-mesenchymal transition. Nat. Rev. Mol. Cell Biol. 2020, 21, 341–352.
  19. Nistico, P.; Bissell, M.J.; Radisky, D.C. Epithelial-mesenchymal transition: General principles and pathological relevance with special emphasis on the role of matrix metalloproteinases. Cold Spring Harb. Perspect. Biol. 2012, 4, a011908.
  20. Loh, C.Y.; Chai, J.Y.; Tang, T.F.; Wong, W.F.; Sethi, G.; Shanmugam, M.K.; Chong, P.P.; Looi, C.Y. The E-Cadherin and N-Cadherin Switch in Epithelial-to-Mesenchymal Transition: Signaling, Therapeutic Implications, and Challenges. Cells 2019, 8, 1118.
  21. Huang, Y.H.; Hong, W.Q.; Wei, X.W. The molecular mechanisms and therapeutic strategies of EMT in tumor progression and metastasis. J. Hematol. Oncol. 2022, 15, 129.
  22. Lamouille, S.; Xu, J.; Derynck, R. Molecular mechanisms of epithelial-mesenchymal transition. Nat. Rev. Mol. Cell Biol. 2014, 15, 178–196.
  23. Ribatti, D.; Tamma, R.; Annese, T. Epithelial-Mesenchymal Transition in Cancer: A Historical Overview. Transl. Oncol. 2020, 13, 100773.
  24. Radisky, E.S.; Radisky, D.C. Matrix metalloproteinase-induced epithelial-mesenchymal transition in breast cancer. J. Mammary Gland Biol. Neoplasia 2010, 15, 201–212.
  25. Smith, B.N.; Bhowmick, N.A. Role of EMT in Metastasis and Therapy Resistance. J. Clin. Med. 2016, 5, 17.
  26. Diaz, V.M.; Vinas-Castells, R.; Garcia de Herreros, A. Regulation of the protein stability of EMT transcription factors. Cell Adh. Migr. 2014, 8, 418–428.
  27. Kang, E.; Seo, J.; Yoon, H.; Cho, S. The Post-Translational Regulation of Epithelial-Mesenchymal Transition-Inducing Transcription Factors in Cancer Metastasis. Int. J. Mol. Sci. 2021, 22, 3591.
  28. Mitschke, J.; Burk, U.C.; Reinheckel, T. The role of proteases in epithelial-to-mesenchymal cell transitions in cancer. Cancer Metastasis Rev. 2019, 38, 431–444.
  29. Radisky, E.S. Extracellular proteolysis in cancer: Proteases, substrates, and mechanisms in tumor progression and metastasis. J. Biol. Chem. 2024, 300, 107347.
  30. Shoari, A.; Khalili-Tanha, G.; Coban, M.A.; Radisky, E.S. Structure and computation-guided yeast surface display for the evolution of TIMP-based matrix metalloproteinase inhibitors. Front. Mol. Biosci. 2023, 10, 1321956.
  31. Scheau, C.; Badarau, I.A.; Costache, R.; Caruntu, C.; Mihai, G.L.; Didilescu, A.C.; Constantin, C.; Neagu, M. The Role of Matrix Metalloproteinases in the Epithelial-Mesenchymal Transition of Hepatocellular Carcinoma. Anal. Cell. Pathol. 2019, 2019, 9423907.
  32. Quintero-Fabián, S.; Arreola, R.; Becerril-Villanueva, E.; Torres-Romero, J.C.; Arana-Argáez, V.; Lara-Riegos, J.; Ramírez-Camacho, M.A.; Alvarez-Sánchez, M.E. Role of Matrix Metalloproteinases in Angiogenesis and Cancer. Front. Oncol. 2019, 9, 1370.
  33. Niland, S.; Riscanevo, A.X.; Eble, J.A. Matrix Metalloproteinases Shape the Tumor Microenvironment in Cancer Progression. Int. J. Mol. Sci. 2021, 23, 146.
  34. Li, S.; Luo, W. Matrix metalloproteinase 2 contributes to aggressive phenotype, epithelial-mesenchymal transition and poor outcome in nasopharyngeal carcinoma. Onco Targets Ther. 2019, 12, 5701–5711.
  35. Li, Y.C.; He, J.; Wang, F.; Wang, X.; Yang, F.; Zhao, C.Y.; Feng, C.L.; Li, T.J. Role of MMP-9 in epithelial-mesenchymal transition of thyroid cancer. World J. Surg. Oncol. 2020, 18, 181.
  36. Sternlicht, M.D.; Lochter, A.; Sympson, C.J.; Huey, B.; Rougier, J.P.; Gray, J.W.; Pinkel, D.; Bissell, M.J.; Werb, Z. The stromal proteinase MMP3/stromelysin-1 promotes mammary carcinogenesis. Cell 1999, 98, 137–146.
  37. Lin, Y.X.; Liu, J.J.; Huang, Y.Q.; Liu, D.L.; Zhang, G.W.; Kan, H.P. microRNA-489 Plays an Anti-Metastatic Role in Human Hepatocellular Carcinoma by Targeting Matrix Metalloproteinase-7. Transl. Oncol. 2017, 10, 211–220.
  38. Liu, M.; Qi, Y.; Zhao, L.; Chen, D.; Zhou, Y.; Zhou, H.; Lv, Y.; Zhang, L.; Jin, S.; Li, S.; et al. Matrix metalloproteinase-14 induces epithelial-to-mesenchymal transition in synovial sarcoma. Hum. Pathol. 2018, 80, 201–209.
  39. Liu, F.; Gu, L.N.; Shan, B.E.; Geng, C.Z.; Sang, M.X. Biomarkers for EMT and MET in breast cancer: An update. Oncol. Lett. 2016, 12, 4869–4876.
  40. Das, S.; Dey, M.K.; Devireddy, R.; Gartia, M.R. Biomarkers in Cancer Detection, Diagnosis, and Prognosis. Sensors 2023, 24, 37.
  41. Davis, K.D.; Aghaeepour, N.; Ahn, A.H.; Angst, M.S.; Borsook, D.; Brenton, A.; Burczynski, M.E.; Crean, C.; Edwards, R.; Gaudilliere, B.; et al. Discovery and validation of biomarkers to aid the development of safe and effective pain therapeutics: Challenges and opportunities. Nat. Rev. Neurol. 2020, 16, 381–400.
  42. Hassan, M.; Awan, F.M.; Naz, A.; deAndres-Galiana, E.J.; Alvarez, O.; Cernea, A.; Fernandez-Brillet, L.; Fernandez-Martinez, J.L.; Kloczkowski, A. Innovations in Genomics and Big Data Analytics for Personalized Medicine and Health Care: A Review. Int. J. Mol. Sci. 2022, 23, 4645.
  43. Malagoli Tagliazucchi, G.; Wiecek, A.J.; Withnell, E.; Secrier, M. Genomic and microenvironmental heterogeneity shaping epithelial-to-mesenchymal trajectories in cancer. Nat. Commun. 2023, 14, 789.
  44. Yu, X.; Pan, X.; Zhang, S.; Zhang, Y.H.; Chen, L.; Wan, S.; Huang, T.; Cai, Y.D. Identification of Gene Signatures and Expression Patterns During Epithelial-to-Mesenchymal Transition from Single-Cell Expression Atlas. Front. Genet. 2020, 11, 605012.
  45. Neagu, M.; Constantin, C.; Bostan, M.; Caruntu, C.; Ignat, S.R.; Dinescu, S.; Costache, M. Proteomic Technology “Lens” for Epithelial-Mesenchymal Transition Process Identification in Oncology. Anal. Cell. Pathol. 2019, 2019, 3565970.
  46. Matadamas-Guzman, M.; Zazueta, C.; Rojas, E.; Resendis-Antonio, O. Analysis of Epithelial-Mesenchymal Transition Metabolism Identifies Possible Cancer Biomarkers Useful in Diverse Genetic Backgrounds. Front. Oncol. 2020, 10, 1309.
  47. Orsini, A.; Diquigiovanni, C.; Bonora, E. Omics Technologies Improving Breast Cancer Research and Diagnostics. Int. J. Mol. Sci. 2023, 24, 12690.
  48. Eicher, T.; Kinnebrew, G.; Patt, A.; Spencer, K.; Ying, K.; Ma, Q.; Machiraju, R.; Mathe, A.E.A. Metabolomics and Multi-Omics Integration: A Survey of Computational Methods and Resources. Metabolites 2020, 10, 202.
  49. Jolliffe, I.T.; Cadima, J. Principal component analysis: A review and recent developments. Philos. Trans. A Math. Phys. Eng. Sci. 2016, 374, 20150202.
  50. Agamah, F.E.; Bayjanov, J.R.; Niehues, A.; Njoku, K.F.; Skelton, M.; Mazandu, G.K.; Ederveen, T.H.A.; Mulder, N.; Chimusa, E.R.; ’t Hoen, P.A.C. Computational approaches for network-based integrative multi-omics analysis. Front. Mol. Biosci. 2022, 9, 967205.
  51. Reimand, J.; Isserlin, R.; Voisin, V.; Kucera, M.; Tannus-Lopes, C.; Rostamianfar, A.; Wadi, L.; Meyer, M.; Wong, J.; Xu, C.; et al. Pathway enrichment analysis and visualization of omics data using g:Profiler, GSEA, Cytoscape and EnrichmentMap. Nat. Protoc. 2019, 14, 482–517.
  52. Chen, C.; Wang, J.; Pan, D.; Wang, X.; Xu, Y.; Yan, J.; Wang, L.; Yang, X.; Yang, M.; Liu, G.P. Applications of multi-omics analysis in human diseases. MedComm (2020) 2023, 4, e315.
  53. Mirza, B.; Wang, W.; Wang, J.; Choi, H.; Chung, N.C.; Ping, P. Machine Learning and Integrative Analysis of Biomedical Big Data. Genes 2019, 10, 87.
  54. Miotto, R.; Wang, F.; Wang, S.; Jiang, X.; Dudley, J.T. Deep learning for healthcare: Review, opportunities and challenges. Brief. Bioinform. 2018, 19, 1236–1246.
  55. Gao, F.; Huang, K.; Xing, Y. Artificial Intelligence in Omics. Genom. Proteom. Bioinform. 2022, 20, 811–813.
  56. Tsimenidis, S.; Vrochidou, E.; Papakostas, G.A. Omics Data and Data Representations for Deep Learning-Based Predictive Modeling. Int. J. Mol. Sci. 2022, 23, 12272.
  57. Sharma, A.; Lysenko, A.; Jia, S.R.; Boroevich, K.A.; Tsunoda, T. Advances in AI and machine learning for predictive medicine. J. Hum. Genet. 2024, 69, 487–497.
  58. Das, T.; Andrieux, G.; Ahmed, M.; Chakraborty, S. Integration of online omics-data resources for cancer research. Front. Genet. 2020, 11, 578345.
  59. Tomczak, K.; Czerwińska, P.; Wiznerowicz, M. The Cancer Genome Atlas (TCGA): An immeasurable source of knowledge. Contemp. Oncol. Współczesna Onkol. 2015, 2015, 68–77.
  60. Barrett, T.; Suzek, T.O.; Troup, D.B.; Wilhite, S.E.; Ngau, W.-C.; Ledoux, P.; Rudnev, D.; Lash, A.E.; Fujibuchi, W.; Edgar, R. NCBI GEO: Mining millions of expression profiles—Database and tools. Nucleic Acids Res. 2005, 33, D562–D566.
  61. Jézéquel, P.; Gouraud, W.; Ben Azzouz, F.; Guérin-Charbonnel, C.; Juin, P.P.; Lasla, H.; Campone, M. bc-GenExMiner 4.5: New mining module computes breast cancer differential gene expression analyses. Database 2021, 2021, baab007.
  62. Curtis, C.; Shah, S.P.; Chin, S.-F.; Turashvili, G.; Rueda, O.M.; Dunning, M.J.; Speed, D.; Lynch, A.G.; Samarajiwa, S.; Yuan, Y.; et al. The genomic and transcriptomic architecture of 2000 breast tumours reveals novel subgroups. Nature 2012, 486, 346–352.
  63. Subramanian, I.; Verma, S.; Kumar, S.; Jere, A.; Anamika, K. Multi-omics data integration, interpretation, and its application. Bioinform. Biol. Insights 2020, 14, 1177932219899051.
  64. Torres-Martos, Á.; Bustos-Aibar, M.; Ramírez-Mena, A.; Cámara-Sánchez, S.; Anguita-Ruiz, A.; Alcalá, R.; Aguilera, C.M.; Alcalá-Fdez, J. Omics data preprocessing for machine learning: A case study in childhood obesity. Genes 2023, 14, 248.
  65. Zebari, R.; Abdulazeez, A.; Zeebaree, D.; Zebari, D.; Saeed, J. A comprehensive review of dimensionality reduction techniques for feature selection and feature extraction. J. Appl. Sci. Technol. Trends 2020, 1, 56–70.
  66. Tang, J.; Alelyani, S.; Liu, H. Feature selection for classification: A review. In Data Classification: Algorithms and Applications; CRC Press: Boca Raton, FL, USA, 2014; pp. 37–64.
  67. Koch, I.; Naito, K. Dimension selection for feature selection and dimension reduction with principal and independent component analysis. Neural Comput. 2007, 19, 513–545.
  68. Haury, A.-C.; Gestraud, P.; Vert, J.-P. The influence of feature selection methods on accuracy, stability and interpretability of molecular signatures. PLoS ONE 2011, 6, e28210.
  69. Bhavsar, H.; Ganatra, A. A comparative study of training algorithms for supervised machine learning. Int. J. Soft Comput. Eng. (IJSCE) 2012, 2, 2231–2307.
  70. Sambo, F.; Trifoglio, E.; Di Camillo, B.; Toffolo, G.M.; Cobelli, C. Bag of Naïve Bayes: Biomarker selection and classification from genome-wide SNP data. BMC Bioinform. 2012, 13, S2.
  71. Vijayarani, S.; Muthulakshmi, M. Comparative analysis of bayes and lazy classification algorithms. Int. J. Adv. Res. Comput. Commun. Eng. 2013, 2, 3118–3124.
  72. Lam, V.K.; Nguyen, T.; Bui, V.; Chung, B.M.; Chang, L.-C.; Nehmetallah, G.; Raub, C.B. Quantitative scoring of epithelial and mesenchymal qualities of cancer cells using machine learning and quantitative phase imaging. J. Biomed. Opt. 2020, 25, 026002.
  73. Wang, J.; Biljecki, F. Unsupervised machine learning in urban studies: A systematic review of applications. Cities 2022, 129, 103925.
  74. Lopez, C.; Tucker, S.; Salameh, T.; Tucker, C. An unsupervised machine learning method for discovering patient clusters based on genetic signatures. J. Biomed. Inform. 2018, 85, 30–39.
  75. Myllyaho, L.; Raatikainen, M.; Männistö, T.; Mikkonen, T.; Nurminen, J.K. Systematic literature review of validation methods for AI systems. J. Syst. Softw. 2021, 181, 111050.
  76. Ho, S.Y.; Phua, K.; Wong, L.; Goh, W.W.B. Extensions of the external validation for checking learned model interpretability and generalizability. Patterns 2020, 1, 100129.
  77. Ganji, M.; Bakhshi, S.; Shoari, A.; Ahangari Cohan, R. Discovery of potential FGFR3 inhibitors via QSAR, pharmacophore modeling, virtual screening and molecular docking studies against bladder cancer. J. Transl. Med. 2023, 21, 111.
  78. Rozova, V.S.; Anwer, A.G.; Guller, A.E.; Es, H.A.; Khabir, Z.; Sokolova, A.I.; Gavrilov, M.U.; Goldys, E.M.; Warkiani, M.E.; Thiery, J.P.; et al. Machine learning reveals mesenchymal breast carcinoma cell adaptation in response to matrix stiffness. PLoS Comput. Biol. 2021, 17, e1009193. [Google Scholar] [CrossRef] [PubMed]
  79. Chen, Z.; Wang, M.; De Wilde, R.L.; Feng, R.; Su, M.; Torres-de la Roche, L.A.; Shi, W. A machine learning model to predict the triple negative breast cancer immune subtype. Front. Immunol. 2021, 12, 749459. [Google Scholar] [CrossRef]
  80. Fang, Y.; Zhang, Q.; Chen, C.; Chen, Z.; Zheng, R.; She, C.; Zhang, R.; Wu, J. Identification and comprehensive analysis of epithelial–mesenchymal transition related target genes of miR-222-3p in breast cancer. Front. Oncol. 2023, 13, 1189635. [Google Scholar] [CrossRef]
  81. Triantafyllou, A.; Dovrolis, N.; Zografos, E.; Theodoropoulos, C.; Zografos, G.C.; Michalopoulos, N.V.; Gazouli, M. Circulating miRNA expression profiling in breast cancer molecular subtypes: Applying machine learning analysis in bioinformatics. Cancer Diagn. Progn. 2022, 2, 739. [Google Scholar] [CrossRef]
  82. Vinik, Y.; Ortega, F.G.; Mills, G.B.; Lu, Y.; Jurkowicz, M.; Halperin, S.; Aharoni, M.; Gutman, M.; Lev, S. Proteomic analysis of circulating extracellular vesicles identifies potential markers of breast cancer progression, recurrence, and response. Sci. Adv. 2020, 6, eaba5714. [Google Scholar] [CrossRef]
  83. Ma, J.; Chen, K.; Li, S.; Zhu, L.; Yu, Y.; Li, J.; Ma, J.; Ouyang, J.; Wu, Z.; Tan, Y.; et al. MRI-based radiomic models to predict surgical margin status and infer tumor immune microenvironment in breast cancer patients with breast-conserving surgery: A multicenter validation study. Eur. Radiol. 2024, 34, 1774–1789. [Google Scholar] [CrossRef]
  84. Thalor, A.; Joon, H.K.; Singh, G.; Roy, S.; Gupta, D. Machine learning assisted analysis of breast cancer gene expression profiles reveals novel potential prognostic biomarkers for triple-negative breast cancer. Comput. Struct. Biotechnol. J. 2022, 20, 1618–1631. [Google Scholar] [CrossRef]
  85. Villemin, J.-P.; Lorenzi, C.; Cabrillac, M.-S.; Oldfield, A.; Ritchie, W.; Luco, R.F. A cell-to-patient machine learning transfer approach uncovers novel basal-like breast cancer prognostic markers amongst alternative splice variants. BMC Biol. 2021, 19, 70. [Google Scholar] [CrossRef]
  86. Jung, J.; Yoo, S. Identification of Breast Cancer Metastasis Markers from Gene Expression Profiles Using Machine Learning Approaches. Genes 2023, 14, 1820. [Google Scholar] [CrossRef]
  87. Wang, H.; Tan, Z.; Hu, H.; Liu, H.; Wu, T.; Zheng, C.; Wang, X.; Luo, Z.; Wang, J.; Liu, S.; et al. microRNA-21 promotes breast cancer proliferation and metastasis by targeting LZTFL1. BMC Cancer 2019, 19, 738. [Google Scholar] [CrossRef]
  88. Gou, Q.; Liu, Z.; Xie, Y.; Deng, Y.; Ma, J.; Li, J.; Zheng, H. Systematic evaluation of tumor microenvironment and construction of a machine learning model to predict prognosis and immunotherapy efficacy in triple-negative breast cancer based on data mining and sequencing validation. Front. Pharmacol. 2022, 13, 995555. [Google Scholar] [CrossRef]
  89. Kothari, C.; Osseni, M.A.; Agbo, L.; Ouellette, G.; Déraspe, M.; Laviolette, F.; Corbeil, J.; Lambert, J.-P.; Diorio, C.; Durocher, F. Machine learning analysis identifies genes differentiating triple negative breast cancers. Sci. Rep. 2020, 10, 10464. [Google Scholar] [CrossRef] [PubMed]
  90. Li, J.; Qiao, H.; Wu, F.; Sun, S.; Feng, C.; Li, C.; Yan, W.; Lv, W.; Wu, H.; Liu, M.; et al. A novel hypoxia-and lactate metabolism-related signature to predict prognosis and immunotherapy responses for breast cancer by integrating machine learning and bioinformatic analyses. Front. Immunol. 2022, 13, 998140. [Google Scholar] [CrossRef] [PubMed]
  91. Font-Clos, F.; Zapperi, S.; La Porta, C.A. Classification of triple-negative breast cancers through a Boolean network model of the epithelial-mesenchymal transition. Cell Syst. 2021, 12, 457–462.e4. [Google Scholar] [CrossRef] [PubMed]
  92. Malik, V.; Kalakoti, Y.; Sundar, D. Deep learning assisted multi-omics integration for survival and drug-response prediction in breast cancer. BMC Genom. 2021, 22, 214. [Google Scholar] [CrossRef]
  93. Yadav, S.; Zhou, S.; He, B.; Du, Y.; Garmire, L.X. Deep learning and transfer learning identify breast cancer survival subtypes from single-cell imaging data. Commun. Med. 2023, 3, 187. [Google Scholar] [CrossRef]
  94. Zhong, W.; Sun, T. Epithelial-mesenchymal transition (EMT) as a therapeutic target in cancer, Volume II. Front. Oncol. 2023, 13, 1218855. [Google Scholar] [CrossRef]
  95. Joosse, S.A.; Pantel, K. Biologic challenges in the detection of circulating tumor cells. Cancer Res. 2013, 73, 8–11. [Google Scholar] [CrossRef]
  96. De Las Rivas, J.; Brozovic, A.; Izraely, S.; Casas-Pais, A.; Witz, I.P.; Figueroa, A. Cancer drug resistance induced by EMT: Novel therapeutic strategies. Arch. Toxicol. 2021, 95, 2279–2297. [Google Scholar] [CrossRef]
  97. Akbar, M.W.; Isbilen, M.; Belder, N.; Canli, S.D.; Kucukkaraduman, B.; Turk, C.; Sahin, O.; Gure, A.O. A stemness and EMT based gene expression signature identifies phenotypic plasticity and is a predictive but not prognostic biomarker for breast cancer. J. Cancer 2020, 11, 949. [Google Scholar] [CrossRef] [PubMed]
  98. Imani, S.; Wei, C.; Cheng, J.; Khan, M.A.; Fu, S.; Yang, L.; Tania, M.; Zhang, X.; Xiao, X.; Zhang, X.; et al. MicroRNA-34a targets epithelial to mesenchymal transition-inducing transcription factors (EMT-TFs) and inhibits breast cancer cell migration and invasion. Oncotarget 2017, 8, 21362. [Google Scholar] [CrossRef]
  99. Addison, J.B.; Voronkova, M.A.; Fugett, J.H.; Lin, C.-C.; Linville, N.C.; Trinh, B.; Livengood, R.H.; Smolkin, M.B.; Schaller, M.D.; Ruppert, J.M.; et al. Functional hierarchy and cooperation of EMT master transcription factors in breast cancer metastasis. Mol. Cancer Res. 2021, 19, 784–798. [Google Scholar] [CrossRef] [PubMed]
  100. Wang, H.; Guo, S.; Kim, S.-J.; Shao, F.; Ho, J.W.K.; Wong, K.U.; Miao, Z.; Hao, D.; Zhao, M.; Xu, J.; et al. Cisplatin prevents breast cancer metastasis through blocking early EMT and retards cancer growth together with paclitaxel. Theranostics 2021, 11, 2442. [Google Scholar] [CrossRef] [PubMed]
  101. Tian, X.; Yu, H.; Li, D.; Jin, G.; Dai, S.; Gong, P.; Kong, C.; Wang, X. The miR-5694/AF9/Snail axis provides metastatic advantages and a therapeutic target in basal-like breast cancer. Mol. Ther. 2021, 29, 1239–1257. [Google Scholar] [CrossRef]
  102. Yang, X.; Shang, P.; Yu, B.; Jin, Q.; Liao, J.; Wang, L.; Ji, J.; Guo, X. Combination therapy with miR34a and doxorubicin synergistically inhibits Dox-resistant breast cancer progression via down-regulation of Snail through suppressing Notch/NF-κB and RAS/RAF/MEK/ERK signaling pathway. Acta Pharm. Sin. B 2021, 11, 2819–2834. [Google Scholar] [CrossRef] [PubMed]
  103. Nelson, E.R.; Li, S.; Kennedy, M.; Payne, S.; Kilibarda, K.; Groth, J.; Bowie, M.; Parilla-Castellar, E.; de Ridder, G.; Marcom, P.K.; et al. Chemotherapy enriches for an invasive triple-negative breast tumor cell subpopulation expressing a precursor form of N-cadherin on the cell surface. Oncotarget 2016, 7, 84030. [Google Scholar] [CrossRef]
  104. Senthebane, D.A. The Role of the Tumour Microenvironment Components in Cancer Cell Behaviour and Drug Response. Ph.D. Thesis, University of Cape Town, Cape Town, South Africa, 2022. [Google Scholar]
  105. Ahmad, A.; Aboukameel, A.; Kong, D.; Wang, Z.; Sethi, S.; Chen, W.; Sarkar, F.H.; Raz, A. Phosphoglucose isomerase/autocrine motility factor mediates epithelial-mesenchymal transition regulated by miR-200 in breast cancer cells. Cancer Res. 2011, 71, 3400–3409. [Google Scholar] [CrossRef]
  106. Zhu, D.; Zha, X.; Hu, M.; Tao, A.; Zhou, H.; Zhou, X.; Sun, Y. High expression of TIMP-1 in human breast cancer tissues is a predictive of resistance to paclitaxel-based chemotherapy. Med. Oncol. 2012, 29, 3207–3215. [Google Scholar] [CrossRef]
  107. Gan, R.; Yang, Y.; Yang, X.; Zhao, L.; Lu, J.; Meng, Q. Downregulation of miR-221/222 enhances sensitivity of breast cancer cells to tamoxifen through upregulation of TIMP3. Cancer Gene Ther. 2014, 21, 290–296. [Google Scholar] [CrossRef] [PubMed]
  108. Thakur, V.; Zhang, K.; Savadelis, A.; Zmina, P.; Aguila, B.; Welford, S.M.; Abdul-Karim, F.; Bonk, K.W.; Keri, R.A.; Bedogni, B. The membrane tethered matrix metalloproteinase MT1-MMP triggers an outside-in DNA damage response that impacts chemo-and radiotherapy responses of breast cancer. Cancer Lett. 2019, 443, 115–124. [Google Scholar] [CrossRef]
  109. Olivares-Urbano, M.A.; Griñán-Lisón, C.; Zurita, M.; Del Moral, R.; Ríos-Arrabal, S.; Artacho-Cordón, F.; Arrebola, J.P.; González, A.R.; León, J.; Antonio Marchal, J.; et al. Matrix metalloproteases and TIMPs as prognostic biomarkers in breast cancer patients treated with radiotherapy: A pilot study. J. Cell. Mol. Med. 2020, 24, 139–148. [Google Scholar] [CrossRef]
  110. Yuan, J.; Xiao, C.; Lu, H.; Yu, H.; Hong, H.; Guo, C.; Wu, Z. Effects of various treatment approaches for treatment efficacy for late stage breast cancer and expression level of TIMP-1 and MMP-9. Cancer Biomark. 2018, 23, 1–7. [Google Scholar] [CrossRef]
  111. Saxena, M.; Stephens, M.A.; Pathak, H.; Rangarajan, A. Transcription factors that mediate epithelial–mesenchymal transition lead to multidrug resistance by upregulating ABC transporters. Cell Death Dis. 2011, 2, e179. [Google Scholar] [CrossRef]
  112. Fischer, K.R.; Durrans, A.; Lee, S.; Sheng, J.; Li, F.; Wong, S.T.; Choi, H.; El Rayes, T.; Ryu, S.; Troeger, J. Epithelial-to-mesenchymal transition is not required for lung metastasis but contributes to chemoresistance. Nature 2015, 527, 472–476. [Google Scholar] [CrossRef]
  113. Debien, V.; De Caluwe, A.; Wang, X.; Piccart-Gebhart, M.; Tuohy, V.K.; Romano, E.; Buisseret, L. Immunotherapy in breast cancer: An overview of current strategies and perspectives. NPJ Breast Cancer 2023, 9, 7. [Google Scholar] [CrossRef] [PubMed]
  114. Vathiotis, I.A.; Trontzas, I.; Gavrielatou, N.; Gomatou, G.; Syrigos, N.K.; Kotteas, E.A. Immune Checkpoint Blockade in Hormone Receptor-Positive Breast Cancer: Resistance Mechanisms and Future Perspectives. Clin. Breast Cancer 2022, 22, 642–649. [Google Scholar] [CrossRef] [PubMed]
  115. Hai, L.; Jiang, Z.; Zhang, H.; Sun, Y. From multi-omics to predictive biomarker: AI in tumor microenvironment. Front. Immunol. 2024, 15, 1514977. [Google Scholar] [CrossRef] [PubMed]
Figure 1. The EMT-MET model describes the process of cancer metastasis. Epithelial cancer cells undergo epithelial–mesenchymal transition (EMT), losing cell–cell junctions and gaining invasive abilities. These cells then enter the bloodstream (intravasation) and must survive circulation to reach a target organ. Upon arrival, they exit the bloodstream (extravasation) and invade the tissue. To form detectable and potentially dangerous macro-metastases, these cells must undergo mesenchymal–epithelial transition (MET). Illustration created with BioRender.com (https://app.biorender.com, accessed on 13 August 2025).
Figure 2. Identifying EMT-related biomarkers involves analyzing extensive breast cancer data through an integrated multi-omics approach. This approach spans genetics, epigenetics, transcriptomics, proteomics, metabolomics, single-cell multi-omics, and radiomics, with the data analyzed using statistical and machine learning techniques. Findings from this analysis require validation through further work, such as in vitro and in vivo experiments, cohort studies, and clinical trials. These studies aim to identify potential prognostic, diagnostic, and predictive biomarkers and to develop therapeutic strategies for breast cancer. Illustration created with BioRender.com (https://app.biorender.com, accessed on 13 August 2025).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Khalili-Tanha, G.; Shoari, A. Innovative Approaches to EMT-Related Biomarker Identification in Breast Cancer: Multi-Omics and Machine Learning Methods. BioTech 2025, 14, 75. https://doi.org/10.3390/biotech14030075
