Prognostic Biomarkers in Breast Cancer via Multi-Omics Clustering Analysis

Malighetti, Federica; Villa, Matteo; Villa, Alberto Maria; Pelucchi, Sara; Aroldi, Andrea; Cortinovis, Diego Luigi; Canova, Stefania; Capici, Serena; Cazzaniga, Marina Elena; Mologni, Luca; Ramazzotti, Daniele; Cordani, Nicoletta

doi:10.3390/ijms26051943


by Federica Malighetti ¹,†, Matteo Villa ¹,†, Alberto Maria Villa ¹, Sara Pelucchi ¹, Andrea Aroldi ¹,², Diego Luigi Cortinovis ¹,², Stefania Canova ², Serena Capici ³, Marina Elena Cazzaniga ¹,³, Luca Mologni ¹, Daniele Ramazzotti ¹ and Nicoletta Cordani ¹,*

¹ Department of Medicine and Surgery, University of Milano-Bicocca, 20900 Monza, Italy
² Oncology Unit, Fondazione IRCCS San Gerardo dei Tintori, 20900 Monza, Italy
³ Phase 1 Unit, Fondazione IRCCS San Gerardo dei Tintori, 20900 Monza, Italy
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work as first authors.

Int. J. Mol. Sci. 2025, 26(5), 1943; https://doi.org/10.3390/ijms26051943

Submission received: 27 December 2024 / Revised: 20 February 2025 / Accepted: 21 February 2025 / Published: 24 February 2025

(This article belongs to the Special Issue Molecular Basis and Advances of Targeted Therapy for Breast Cancer: Second Edition)


Abstract

Breast cancer (BC) is a highly heterogeneous disease with diverse molecular subtypes, which complicates prognosis and treatment. In this study, we performed a multi-omics clustering analysis using the Cancer Integration via MultIkernel LeaRning (CIMLR) method on a large BC dataset from The Cancer Genome Atlas (TCGA) to identify key prognostic biomarkers. We identified three genes (LMO1, PRAME, and RSPO2) that were significantly associated with poor prognosis in both the TCGA dataset and an additional dataset comprising 146 metastatic BC patients. Patient stratification based on the expression of these three genes revealed distinct subtypes with markedly different overall survival (OS) outcomes. Further validation using data from almost 2000 BC patients in the METABRIC dataset and RNA sequencing data from therapy-resistant cell lines confirmed the upregulation of LMO1 and PRAME in patients with worse prognosis and in resistant cells, respectively, suggesting a potential role in drug resistance. Our findings highlight LMO1 and PRAME as potential biomarkers for identifying high-risk BC patients and informing targeted treatment strategies. This study provides valuable insights into the multi-omics landscape of BC and underscores the importance of personalized therapeutic approaches based on molecular profiles.

Keywords: breast cancer; multi-omics landscape; prognostic biomarkers

1. Introduction

Cancer is a highly complex disease characterized by molecular, cellular, and environmental interactions [1]. This heterogeneity is reflected in genetic, epigenetic, and phenotypic diversity [2], all of which contribute to different treatment responses, even among patients with similar diagnoses [3]. Such heterogeneity manifests in the coexistence of several cell subtypes that dynamically adapt to their microenvironment. Clonal evolution of tumor cells can enable immune evasion and resistance to therapeutic interventions [4].

Breast cancer (BC) is the most common cancer worldwide and represents a very heterogeneous disease [2]. It is the second leading cause of cancer-related deaths among women, following lung cancer. BC is clinically classified into hormone receptor (HR)-positive, HER2-positive (HER2+), and triple-negative breast cancer (TNBC) subtypes based on the expression of estrogen receptors (ER), progesterone receptors (PR), and HER2 receptors. These molecular profiles guide treatment decisions, with tailored therapies targeting specific receptors or addressing receptor absence in TNBC. Early detection improves outcomes significantly, with a 99% five-year relative survival rate for localized cases. HR+/HER2-negative is the most common subtype, accounting for over 75% of BC cases [5]. TNBC, representing 10–20% of BC cases [6], is highly aggressive, with a peak relapse within three years from diagnosis and predominant hepatic, pulmonary, and central nervous system metastases [7].

Precision medicine addresses BC heterogeneity by utilizing PAM50 subtypes and risk of recurrence (ROR) scores to classify intrinsic molecular subtypes (Luminal A, Luminal B, HER2-enriched, Basal-like, or Normal-like) or recurrence risks [8,9].

The use of advanced high-throughput experimental technologies is becoming increasingly prevalent in generating extensive omics datasets, including those derived from genomics and transcriptomics, across diverse scientific fields. These datasets frequently originate from the same patients, allowing for a comprehensive examination of both molecular and clinical characteristics. The integration of these datasets has significantly advanced efforts to identify and categorize molecular subtypes of cancer, which are closely associated with patient outcomes. Consequently, characterizing these subtypes has emerged as a critical area of focus in oncology research, potentially offering insights that could inform personalized therapeutic strategies [10].

In this study, we aimed to uncover biologically relevant subtypes of BC using multi-omics data to improve our understanding of tumor heterogeneity and its implications for prognosis. To this end, we analyzed BC data from the latest multi-omics dataset released by The Cancer Genome Atlas (TCGA) [11,12], encompassing nearly 1000 primary tumors, and from a validation cohort of 146 metastatic BC cases [13]. We applied a standardized pipeline for raw data preprocessing and employed the Cancer Integration via MultIkernel LeaRning (CIMLR) clustering method to identify subtypes within these multi-omics datasets, focusing in particular on the molecular characteristics significantly linked to prognosis [14].

Multi-omics clustering analysis enabled the identification of molecularly distinct subgroups that may respond differently to treatment and exhibit varying survival outcomes. We then identified genes differentially expressed among these clusters, determining a set of candidate molecular features. To determine their prognostic relevance, we performed survival analysis using regularized Cox regression, ultimately identifying potential biomarkers associated with patient outcomes (Supplementary Figure S1). Our findings were further validated using independent datasets, including almost 2000 BC patients’ data from the METABRIC project and resistant cell lines [15], reinforcing their potential clinical utility.

Our analysis identified LMO1 and PRAME as promising prognostic biomarkers in BC patients, supporting our framework as a potentially generalizable approach for discovering prognostic biomarkers in cancer.

2. Results

2.1. Biomarkers Identification

We conducted multi-omics clustering analysis using CIMLR [14] on the latest dataset from TCGA [11,12], which includes 985 BC patients with multi-omics data. This analysis identified clusters representing distinct cancer subtypes. These clusters were then compared to detect differentially expressed genes, yielding a preliminary gene set (Supplementary Data S1) that characterized the identified subtypes.

To assess the prognostic relevance of these genes, we performed regularized Cox regression analysis, identifying 32 significant genes (Supplementary Data S1). Through refinement and validation using the existing literature, we categorized these genes as oncogenes or tumor suppressors, confirming whether poorer prognosis was associated with higher oncogene expression or lower tumor suppressor expression. This process reduced the list to 20 significant genes (Supplementary Table S1), which were also consistent with the literature.
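
The direction check described above can be illustrated with a minimal sketch. It assumes that a positive Cox coefficient means higher expression goes with higher hazard, so an oncogene is retained when its coefficient is positive and a tumor suppressor when it is negative. The gene names, coefficients, and role labels below are hypothetical placeholders, not values from Supplementary Data S1.

```python
# Illustrative sketch only: gene names, coefficients, and roles are hypothetical.

def consistent_with_role(coefficient, role):
    """True when the sign of a Cox coefficient matches the literature role."""
    if role == "oncogene":
        return coefficient > 0  # higher expression, higher hazard
    if role == "tumor_suppressor":
        return coefficient < 0  # lower expression, higher hazard
    return False  # role not established in the literature


# (gene, Cox coefficient, role reported in the literature)
candidates = [
    ("GENE_A", 0.42, "oncogene"),
    ("GENE_B", -0.31, "tumor_suppressor"),
    ("GENE_C", -0.18, "oncogene"),
    ("GENE_D", 0.27, "unknown"),
]

retained = [gene for gene, coef, role in candidates
            if consistent_with_role(coef, role)]
print(retained)
```

Genes with an unknown role or a sign that contradicts the literature are dropped, which mirrors the reduction from 32 to 20 genes only in spirit.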

For additional validation, we analyzed a second dataset comprising 146 metastatic BC cases [13] and performed regularized regression analysis using the 20 identified genes. From this analysis, three genes—LMO1, PRAME, and RSPO2—emerged as highly significant prognostic biomarkers in both cohorts.

2.2. Clustering Analysis

We performed a clustering analysis considering the three genes selected by the regularized regression analysis. This was performed by computing a risk score for each patient based on the coefficients estimated by regularized regression and then stratifying the patients based on these risk scores using hierarchical clustering. This analysis revealed five distinct subtypes with significantly different overall survival (OS) (p < 0.001) in the TCGA primary BC dataset (Figure 1A). Similarly, two subtypes with significantly different OS (p < 0.001) were identified in the metastatic BC dataset [13] (Figure 1B). In both cases, the three selected genes (LMO1, PRAME, and RSPO2) were differentially expressed.
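
As a rough illustration of stratifying patients by risk score with hierarchical clustering, the sketch below runs a naive agglomerative procedure on one-dimensional scores. The linkage rule (complete linkage), the number of clusters, and the score values are assumptions made for the example; the text does not specify them.

```python
# Illustrative sketch only: complete linkage and the toy scores are assumptions.

def complete_linkage_distance(a, b):
    """Largest pairwise distance between two clusters of scalar scores."""
    return max(abs(x - y) for x in a for y in b)


def hierarchical_clusters(scores, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[s] for s in scores]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = complete_linkage_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    # Sort members and clusters so the output order is deterministic.
    return sorted(sorted(c) for c in clusters)


risk_scores = [-1.2, -1.0, -0.9, 0.1, 0.2, 0.3, 1.8, 2.0]  # hypothetical
groups = hierarchical_clusters(risk_scores, n_clusters=3)
print(groups)
```

In practice, the number of groups would be chosen from the dendrogram or from the survival separation it produces, rather than fixed in advance as here.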

The transcriptomic profile of LMO1 in the TCGA cohort is markedly elevated in Cluster 5 (Figure 2A), which also shows the worst prognosis (Figure 1A). This finding aligns with the existing literature, which identifies LMO1 as an oncogene implicated in T-cell Acute Lymphoblastic Leukemia (T-ALL) [16] and neuroblastoma [17]. Similarly, LMO4, another LIM domain protein, is broadly expressed in human tissues, including over 50% of breast tumors, where it reduces the differentiation of breast cancer epithelial cells and functions as an oncogene [18].

Similarly, gene expression of the tumor antigen Preferentially Expressed Antigen of Melanoma (PRAME), a repressor of the retinoic acid receptor [19], is significantly elevated in the TCGA Cluster 5 (Figure 2B), consistent with previous reports linking high PRAME levels to worse prognoses [20].

Finally, we observed overexpression of RSPO2 mRNA in the TCGA Cluster 5 (Figure 2C). RSPO2 is a secreted glycoprotein that stimulates Wnt/β-catenin signaling and functions as a cancer driver [21,22,23].

Notably, the three genes were also found to be correlated with the TCGA’s annotated PAM50 subtypes, as they were significantly overexpressed in the more aggressive basal subtype, including triple-negative breast cancer (TNBC) (Supplementary Figures S2–S4). However, the two stratifications are not equivalent (Supplementary Figure S5); thus, our analysis may capture additional molecular differences beyond the PAM50 subtype classification. This finding corroborates our results, further emphasizing the prognostic relevance of these genes.

To further validate the prognostic significance of LMO1, PRAME, and RSPO2, we also examined their expression levels in the metastatic BC dataset [13]. All three genes were consistently overexpressed in Cluster 2, which showed the worst survival (Figure 3), reinforcing their roles as key prognostic biomarkers.

2.3. Validation on the METABRIC Dataset

To further validate the three identified genes (LMO1, PRAME, and RSPO2) as prognostic biomarkers in an external cohort, we considered a dataset comprising 1980 patients from the METABRIC database [24,25,26], from which the PAM50 model was delineated. Clustering analysis again revealed distinct subtypes, four in this case, with significantly different overall survival (OS) (p < 0.001) in this dataset of BC (Figure 4A). The findings indicated a robust association of LMO1 and PRAME with patient prognosis (Figure 4B,C), while RSPO2 demonstrated a non-significant association (Figure 4D).

2.4. Expression of the Identified Biomarkers on Resistant Cell Lines and Metastatic BC Samples

Genes whose expression is positively associated with an aggressive phenotype and poor outcome may also be drivers of resistance. Thus, we verified the expression levels of LMO1 and PRAME in the RNA-seq data generated in the study by Cordani and colleagues [15]. Both genes were identified as differentially expressed when comparing parental MCF-7 cells to MCF-7 cells resistant to palbociclib [15]. Notably, LMO1 and PRAME were strongly upregulated in the resistant cells, further highlighting their association with aggressive breast cancer phenotypes. These findings underscore the potential role of these genes in mediating resistance to therapy and their relevance as biomarkers for identifying high-risk and treatment-resistant cases (Figure 5). In addition, we performed RT-qPCR to validate the three targets and confirm the data observed in the RNA-seq experiments. Moreover, we confirmed that all three candidate biomarkers are also upregulated in triple-negative breast cancer cell lines (see Supplementary Figure S6).

Finally, we analyzed an independent dataset from The Metastatic Breast Cancer Project (https://mbcproject.org/) consisting of 150 breast cancer tumors, including 59 from patients without metastasis and 91 from patients with metastasis. This analysis revealed significantly higher expression of PRAME and RSPO2 in metastatic tumors, further supporting their potential role in disease progression. While LMO1 expression was also elevated in metastases, the difference was not statistically significant. These findings provide additional insight into the clinical relevance of our identified biomarkers and are included in Supplementary Data S2.

3. Discussion

This study identified and validated potential prognostic biomarkers in BC by leveraging the latest multi-omics dataset from TCGA. Through multi-omics clustering and survival association analysis, we identified a set of differentially expressed genes significantly linked with BC’s prognosis. Our findings highlight three genes—LMO1, PRAME, and RSPO2—as candidate prognostic biomarkers.

The clustering analysis, performed on both primary and metastatic BC datasets, revealed distinct BC subtypes with significantly different OS, confirming the molecular heterogeneity of this cancer. The identification of five subtypes in the primary BC cohort and two subtypes in metastatic BC, based on the expression of LMO1, PRAME, and RSPO2, underscores the importance of these genes in potentially shaping the clinical outcome of patients. Specifically, Cluster 5 in the primary BC cohort and Cluster 2 in the metastatic cohort were strongly associated with the worst prognosis, aligning with the elevated expression levels of the three genes. These results are consistent with the existing literature that links these genes to cancer progression, aggressiveness, and poor prognosis.

Among the three biomarkers, LMO1 stands out as a key oncogene that is overexpressed in BC subtypes with the worst prognosis. Previous studies have shown that LMO1 is involved in T-cell Acute Lymphoblastic Leukemia (T-ALL) and neuroblastoma, and its role in BC, particularly in the TCGA Cluster 5, supports its potential as a marker for aggressive cancer phenotypes. Likewise, PRAME, a known repressor of the retinoic acid receptor, is implicated in promoting cancer cell growth and resistance to therapy. In our study, its elevated expression in the TCGA Cluster 5 of primary BCs and in Cluster 2 metastatic BC cohorts reinforces its relevance as a prognostic factor. RSPO2, which stimulates Wnt/β-catenin signaling, is another key contributor to tumor progression and metastasis. Our findings of increased RSPO2 expression in the worst-prognosis clusters further support its role as a cancer driver, particularly in BC subtypes with aggressive phenotypes.

Furthermore, these biomarkers were also associated with basal-like tumors from the PAM50 subtyping system, which are known for their aggressive behavior and poor therapeutic response. This observation further highlights the clinical significance of these genes, particularly in identifying high-risk patients who may benefit from more aggressive treatment strategies.

Additionally, our validation on both patients’ data from the METABRIC project and on cell lines, particularly those resistant to the drug palbociclib, demonstrated that LMO1 and PRAME were strongly upregulated in both patients with worse prognosis and in resistant MCF-7 cells, suggesting that these genes may also play a role in mediating resistance to therapy. The strong association of these genes with both aggressive BC phenotypes and drug resistance underscores their potential as biomarkers for identifying patients at high risk for poor prognosis and treatment resistance.

In conclusion, this study identifies LMO1 and PRAME as highly significant biomarkers that can improve our understanding of BC prognosis, particularly in aggressive subtypes such as TNBC. These findings suggest that these biomarkers may not only help identify patients with poor outcomes but could also inform the development of targeted therapies to overcome drug resistance. Future studies should explore the mechanisms by which these genes contribute to BC progression and resistance, as well as the potential for incorporating them into clinical practice for personalized treatment strategies.

This study has limitations. First, further investigation is needed to understand the mechanisms of action and the pathways involved, with the aim of targeting them. Additionally, in future studies, one should seek to confirm these two genes as biomarkers of poorer prognosis and more aggressive or resistant tumors in patient specimens using, e.g., immunohistochemistry.

4. Materials and Methods

4.1. Data Preprocessing

In this study, we used multi-omics data from The Cancer Genome Atlas (TCGA) [11,12], considering 985 patients within the broader PanCancer initiative. We accessed six types of omics data for each BC patient via cBioPortal [27,28], including substitutions and small insertions/deletions, copy number alterations, methylation data, gene expression profiles, microRNA expression, and reverse-phase protein microarray (RPPA) data. The 146 metastatic BC cases [13] were also obtained from cBioPortal, where gene expression data were retrieved. A dataset comprising 1980 patients from the METABRIC database [24,25,26] was used to validate the previous results. Moreover, we also obtained and integrated clinical information such as OS for all the BC cases. Finally, we considered 150 samples from The Metastatic Breast Cancer Project (https://mbcproject.org/, accessed on 1 January 2022), also obtained from cBioPortal (https://www.cbioportal.org/study/summary?id=brca_mbcproject_2022, accessed on 1 January 2022).

4.2. Multi-Omics Integrative Clustering

To robustly identify a set of prognostic biomarkers, we first applied multi-omics clustering to group patients into biologically meaningful subtypes. This preliminary step not only reduces data dimensionality but also mitigates convergence issues—such as multicollinearity and an excessive number of predictors—that can compromise the performance of approaches based on regularized Cox regression.

To this end, we employed the Cancer Integration via MultIkernel LeaRning (CIMLR) [14] algorithm to integrate the six considered omics data types for subtype identification and patient stratification. CIMLR uses kernel-based machine learning to combine different data types, generating an integrated kernel matrix that captures patient similarity based on their molecular profiles. We calculated 385 Gaussian kernels for the seven data types and constructed a patient-to-patient similarity matrix. K-means clustering was then applied to this matrix, and the optimal number of clusters was determined using the elbow method.
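
To make the idea of a kernel-based patient similarity matrix concrete, the following sketch builds Gaussian kernels at several bandwidths for each data type and averages them. This is a deliberate simplification: CIMLR learns the kernel weights and the similarity matrix jointly, whereas uniform weights are assumed here. The toy matrices, the bandwidth values, and the function names are hypothetical.

```python
import math

# Illustrative sketch only: uniform kernel weights are an assumption;
# CIMLR itself learns the weights during optimization.

def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) similarity between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def combined_similarity(data_types, sigmas):
    """Average Gaussian kernels over every data type and bandwidth.

    data_types: list of matrices (patients x features), one per omics layer,
    with patients in the same order in every matrix.
    """
    n = len(data_types[0])
    n_kernels = len(data_types) * len(sigmas)
    sim = [[0.0] * n for _ in range(n)]
    for matrix in data_types:
        for sigma in sigmas:
            for i in range(n):
                for j in range(n):
                    k = gaussian_kernel(matrix[i], matrix[j], sigma)
                    sim[i][j] += k / n_kernels
    return sim


# Hypothetical toy data: 3 patients, two omics layers.
expression = [[0.0, 1.0], [0.1, 0.9], [5.0, 4.0]]
methylation = [[0.2], [0.25], [0.9]]
similarity = combined_similarity([expression, methylation],
                                 sigmas=[0.5, 1.0, 2.0])
for row in similarity:
    print([round(v, 3) for v in row])
```

Patients 1 and 2 have nearly identical profiles in both layers and therefore receive a much higher similarity than either has with patient 3; a clustering step such as k-means would then operate on this matrix or on an embedding derived from it.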

4.3. Differential Analysis and Feature Selection

Continuous data were analyzed via analysis of variance (ANOVA) comparing distinct clusters. We applied the Benjamini–Hochberg correction to account for multiple testing and selected features with an adjusted p-value < 0.05.
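
The multiple-testing step can be sketched as follows. The function implements the standard Benjamini–Hochberg step-up adjustment; the gene labels and raw p-values are hypothetical and stand in for the per-feature ANOVA results.

```python
# Illustrative sketch only: gene labels and raw p-values are hypothetical.

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


raw_p = {"GENE_A": 0.001, "GENE_B": 0.02, "GENE_C": 0.03, "GENE_D": 0.5}
adj = benjamini_hochberg(list(raw_p.values()))
selected = [gene for gene, q in zip(raw_p, adj) if q < 0.05]
print(selected)
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values non-decreasing in the raw p-value.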

4.4. Survival Analysis

We evaluated the prognostic significance of the identified clusters by associating them with overall survival (OS) over a 10-year period. Data points corresponding to patients who either died within one month of diagnosis or were older than 80 years were censored to avoid bias from uncertain observations and from extreme cases that could distort survival estimates. Kaplan–Meier survival analysis with a log-rank test was used to assess the statistical significance of the associations, using a threshold of p < 0.05.
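
For readers unfamiliar with the two procedures named above, the sketch below gives a plain implementation of the Kaplan–Meier estimator and of a two-group log-rank test. The follow-up times (in months, capped at 120) and event indicators are hypothetical, and the p-value uses the chi-square distribution with one degree of freedom, which applies to the two-group case only.

```python
import math

# Illustrative sketch only: follow-up times and events are hypothetical.

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each observed death time.

    events[i] is 1 for a death and 0 for a censored observation.
    """
    curve, surv = [], 1.0
    death_times = sorted({u for u, e in zip(times, events) if e == 1})
    for t in death_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve


def logrank_p(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; p-value from a chi-square with 1 df."""
    all_times = times_a + times_b
    all_events = events_a + events_b
    obs_minus_exp, var = 0.0, 0.0
    death_times = sorted({u for u, e in zip(all_times, all_events) if e == 1})
    for t in death_times:
        n_a = sum(1 for u in times_a if u >= t)
        n = n_a + sum(1 for u in times_b if u >= t)
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e == 1)
        d = d_a + sum(1 for u, e in zip(times_b, events_b) if u == t and e == 1)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            var += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / var
    # Survival function of a chi-square variable with one degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


low_times, low_events = [60, 80, 100, 120, 120], [0, 1, 0, 0, 0]
high_times, high_events = [5, 10, 15, 20, 30], [1, 1, 1, 1, 0]

km_high = kaplan_meier(high_times, high_events)
p_value = logrank_p(low_times, low_events, high_times, high_events)
print(km_high)
print(p_value < 0.05)
```

With more than two clusters, the same observed-minus-expected logic extends to a multi-group statistic with additional degrees of freedom, which this sketch does not cover.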

4.5. Regularized Cox Regression Analysis

We used the Coxnet algorithm [29,30] for regularized Cox regression analysis to identify significant predictors of patient outcomes. This method, a variant of the Cox proportional hazards model, applies a regularization term to shrink regression coefficients and select the most relevant variables. The elastic net method with the LASSO penalty was used to identify predictors with non-zero coefficients, selecting the model that minimized cross-validation error. Risk scores were calculated for each patient, based on a weighted sum of the selected covariates. Patients were stratified into different risk groups based on their risk scores relative to the dataset mean, providing insights into their prognosis.
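
The risk-score step can be illustrated with a short sketch. It assumes the score is the Cox linear predictor, that is, the sum of each selected gene's expression multiplied by its fitted coefficient, and it splits patients at the cohort mean. The coefficients and expression values are hypothetical and are not the estimates obtained in this study.

```python
# Illustrative sketch only: coefficients and expression values are hypothetical.

def risk_score(expression, coefficients):
    """Weighted sum of gene expression values (the Cox linear predictor)."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


coefficients = {"LMO1": 0.30, "PRAME": 0.25, "RSPO2": 0.10}

patients = {
    "P1": {"LMO1": 1.0, "PRAME": 2.0, "RSPO2": 0.5},
    "P2": {"LMO1": 4.0, "PRAME": 6.0, "RSPO2": 2.0},
    "P3": {"LMO1": 0.5, "PRAME": 1.0, "RSPO2": 0.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
mean_score = sum(scores.values()) / len(scores)

# Patients above the cohort mean are labeled high risk.
risk_groups = {pid: ("high" if s > mean_score else "low")
               for pid, s in scores.items()}
print(risk_groups)
```

Only genes with non-zero coefficients after regularization contribute to the score, so the dictionary of coefficients doubles as the list of selected predictors.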

4.6. RNA Extraction, Reverse Transcription and Real-Time Quantitative PCR

MCF-7pS and MCF-7pR cells were seeded in triplicate at a density of 3 × 10⁶ cells in T75 flasks and incubated to reach 70% confluence. Total RNA was extracted from the cells using the RNeasy MiniKit (QIAGEN GmbH, Hilden, Germany) according to the manufacturer’s instructions. Total RNA (2 µg) was reverse-transcribed with PrimeScript RT-Master Mix (Takara BIO Europe, SAS) in a 40 µL reaction volume. Of the resulting first-strand cDNA, 2 µL was amplified with Mastermix 2X (GeneSpin, Milan, Italy) in triplicate using a StepOnePlus™ Real-Time PCR System. Relative expression was normalized to GAPDH using the 2^−ΔCt method. The TaqMan assays used were Hs00231133_m1 (LMO1), Hs01022301_m1 (PRAME), and Hs04400416_m1 (RSPO2) (ThermoFisher Scientific, Milano, Italy).
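
The 2^−ΔCt normalization can be worked through in a few lines, where ΔCt is the Ct of the target gene minus the Ct of GAPDH. The Ct values below are hypothetical triplicates chosen for easy arithmetic, and the condition labels are placeholders for the sensitive and resistant cells.

```python
# Illustrative sketch only: Ct values are hypothetical triplicates.

def mean(values):
    return sum(values) / len(values)


def relative_expression(ct_target, ct_reference):
    """Relative expression by the 2^-dCt method (dCt = Ct_target - Ct_ref)."""
    return 2.0 ** -(ct_target - ct_reference)


ct = {
    "sensitive": {"LMO1": [30.0, 30.2, 29.8], "GAPDH": [18.0, 18.1, 17.9]},
    "resistant": {"LMO1": [27.0, 27.1, 26.9], "GAPDH": [18.0, 18.0, 18.0]},
}

rel = {cond: relative_expression(mean(v["LMO1"]), mean(v["GAPDH"]))
       for cond, v in ct.items()}

# A drop of 3 cycles in dCt corresponds to an 8-fold higher expression.
fold_change = rel["resistant"] / rel["sensitive"]
print(round(fold_change, 2))
```

Because each PCR cycle ideally doubles the product, a lower ΔCt in the resistant cells translates into an exponentially higher relative expression.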

Supplementary Materials

The supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms26051943/s1.

Author Contributions

Conceptualization: D.R. and N.C.; Methodology: D.R. and N.C.; Investigation: F.M., M.V., A.M.V., S.P., A.A., D.R. and N.C.; Visualization: D.R. and N.C.; Funding Acquisition: L.M. and D.R.; Supervision: D.L.C., M.E.C., S.C. (Stefania Canova), S.C. (Serena Capici), L.M., D.R. and N.C.; Writing—Original Draft: F.M., M.V., D.R. and N.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by AIRC IG-24828 to L.M.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

This study utilized multi-omics data from The Cancer Genome Atlas (TCGA), accessed via the cBioPortal and publicly available from https://www.cbioportal.org/study/summary?id=brca_tcga_pan_can_atlas_2018. Metastatic breast cancer cases and validation data from the METABRIC database were also downloaded from the cBioPortal, respectively, from https://www.cbioportal.org/study/summary?id=metastatic_solid_tumors_mich_2017 and https://www.cbioportal.org/study/summary?id=brca_metabric (accessed on 1 December 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Baghban, R.; Roshangar, L.; Jahanban-Esfahlan, R.; Seidi, K.; Ebrahimi-Kalan, A.; Jaymand, M.; Kolahian, S.; Javaheri, T.; Zare, P. Tumor microenvironment complexity and therapeutic implications at a glance. Cell Commun. Signal. 2020, 18, 59.
2. Liu, S.; Tang, Y.; Li, J.; Zhao, W. Global, regional, and national trends in the burden of breast cancer among individuals aged 70 years and older from 1990 to 2021: An analysis based on the global burden of disease study 2021. Arch. Public Health 2024, 82, 170.
3. Ottaiano, A.; Ianniello, M.; Santorsola, M.; Ruggiero, R.; Sirica, R.; Sabbatino, F.; Perri, F.; Cascella, M.; Di Marzo, M.; Berretta, M.; et al. From Chaos to Opportunity: Decoding Cancer Heterogeneity for Enhanced Treatment Strategies. Biology 2023, 12, 1183.
4. Lenz, G.; Onzi, G.R.; Lenz, L.S.; Buss, J.H.; Dos Santos, J.A.; Begnini, K.R. The Origins of Phenotypic Heterogeneity in Cancer. Cancer Res. 2022, 82, 3–11.
5. Cogliati, V.; Capici, S.; Pepe, F.F.; di Mauro, P.; Riva, F.; Cicchiello, F.; Maggioni, C.; Cordani, N.; Cerrito, M.G.; Cazzaniga, M.E. How to Treat HR+/HER2- Metastatic Breast Cancer Patients after CDK4/6 Inhibitors: An Unfinished Story. Life 2022, 12, 378.
6. Wolff, A.C.; Hammond, M.E.H.; Allison, K.H.; Harvey, B.E.; Mangu, P.B.; Bartlett, J.M.S.; Bilous, M.; Ellis, I.O.; Fitzgibbons, P.; Hanna, W.; et al. Human Epidermal Growth Factor Receptor 2 Testing in Breast Cancer: American Society of Clinical Oncology/College of American Pathologists Clinical Practice Guideline Focused Update. J. Clin. Oncol. 2018, 36, 2105–2122.
7. Haffty, B.G.; Yang, Q.; Reiss, M.; Kearney, T.; Higgins, S.A.; Weidhaas, J.; Harris, L.; Hait, W.; Toppmeyer, D. Locoregional relapse and distant metastasis in conservatively managed triple negative early-stage breast cancer. J. Clin. Oncol. 2006, 24, 5652–5657.
8. Schettini, F.; Braso-Maristany, F.; Kuderer, N.M.; Prat, A. A perspective on the development and lack of interchangeability of the breast cancer intrinsic subtypes. NPJ Breast Cancer 2022, 8, 85.
9. Parker, J.S.; Mullins, M.; Cheang, M.C.; Leung, S.; Voduc, D.; Vickery, T.; Davies, S.; Fauron, C.; He, X.; Hu, Z.; et al. Supervised risk predictor of breast cancer based on intrinsic subtypes. J. Clin. Oncol. 2009, 27, 1160–1167.
10. Chakraborty, S.; Sharma, G.; Karmakar, S.; Banerjee, S. Multi-OMICS approaches in cancer biology: New era in cancer therapy. Biochim. Biophys. Acta Mol. Basis Dis. 2024, 1870, 167120.
11. Blum, A.; Wang, P.; Zenklusen, J.C. SnapShot: TCGA-Analyzed Tumors. Cell 2018, 173, 530.
12. Cancer Genome Atlas Network. Comprehensive molecular portraits of human breast tumours. Nature 2012, 490, 61–70.
13. Pleasance, E.; Titmuss, E.; Williamson, L.; Kwan, H.; Culibrk, L.; Zhao, E.Y.; Dixon, K.; Fan, K.; Bowlby, R.; Jones, M.R.; et al. Pan-cancer analysis of advanced patient tumors reveals interactions between therapy and genomic landscapes. Nat. Cancer 2020, 1, 452–468.
14. Ramazzotti, D.; Lal, A.; Wang, B.; Batzoglou, S.; Sidow, A. Multi-omic tumor data reveal diversity of molecular mechanisms that correlate with survival. Nat. Commun. 2018, 9, 4453.
15. Cordani, N.; Mologni, L.; Piazza, R.; Tettamanti, P.; Cogliati, V.; Mauri, M.; Villa, M.; Malighetti, F.; Di Bella, C.; Jaconi, M.; et al. TWIST1 Upregulation Is a Potential Target for Reversing Resistance to the CDK4/6 Inhibitor in Metastatic Luminal Breast Cancer Cells. Int. J. Mol. Sci. 2023, 24, 16294.
16. Valge-Archer, V.; Forster, A.; Rabbitts, T.H. The LMO1 and LDB1 proteins interact in human T cell acute leukaemia with the chromosomal translocation t(11;14)(p15;q11). Oncogene 1998, 17, 3199–3202.
17. Wang, K.; Diskin, S.J.; Zhang, H.; Attiyeh, E.F.; Winter, C.; Hou, C.; Schnepp, R.W.; Diamond, M.; Bosse, K.; Mayes, P.A.; et al. Integrative genomics identifies LMO1 as a neuroblastoma oncogene. Nature 2011, 469, 216–220.
18. Visvader, J.E.; Venter, D.; Hahm, K.; Santamaria, M.; Sum, E.Y.; O’Reilly, L.; White, D.; Williams, R.; Armes, J.; Lindeman, G.J. The LIM domain gene LMO4 inhibits differentiation of mammary epithelial cells in vitro and is overexpressed in breast cancer. Proc. Natl. Acad. Sci. USA 2001, 98, 14452–14457.
19. Korša, L.; Abramović, M.; Kovačević, L.; Milošević, M.; Podolski, P.; Prutki, M.; Marušić, Z. PRAME expression and its prognostic significance in invasive breast carcinoma. Pathol.-Res. Pract. 2024, 254, 155096.
20. Epping, M.T.; Hart, A.A.; Glas, A.M.; Krijgsman, O.; Bernards, R. PRAME expression and clinical outcome of breast cancer. Br. J. Cancer 2008, 99, 398–403.
21. Ter Steege, E.J.; Bakker, E.R.M. The role of R-spondin proteins in cancer biology. Oncogene 2021, 40, 6469–6478.
22. Crippa, V.; Malighetti, F.; Villa, M.; Graudenzi, A.; Piazza, R.; Mologni, L.; Ramazzotti, D. Characterization of cancer subtypes associated with clinical outcomes by multi-omics integrative clustering. Comput. Biol. Med. 2023, 162, 107064.
23. Conboy, C.B.; Velez-Reyes, G.L.; Rathe, S.K.; Abrahante, J.E.; Temiz, N.A.; Burns, M.B.; Harris, R.S.; Starr, T.K.; Largaespada, D.A. R-Spondins 2 and 3 Are Overexpressed in a Subset of Human Colon and Breast Cancers. DNA Cell Biol. 2021, 40, 70–79.
24. Curtis, C.; Shah, S.P.; Chin, S.F.; Turashvili, G.; Rueda, O.M.; Dunning, M.J.; Speed, D.; Lynch, A.G.; Samarajiwa, S.; Yuan, Y.; et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012, 486, 346–352.
25. Pereira, B.; Chin, S.F.; Rueda, O.M.; Vollan, H.K.; Provenzano, E.; Bardwell, H.A.; Pugh, M.; Jones, L.; Russell, R.; Sammut, S.J.; et al. The somatic mutation profiles of 2,433 breast cancers refines their genomic and transcriptomic landscapes. Nat. Commun. 2016, 7, 11479.
26. Rueda, O.M.; Sammut, S.J.; Seoane, J.A.; Chin, S.F.; Caswell-Jin, J.L.; Callari, M.; Batra, R.; Pereira, B.; Bruna, A.; Ali, H.R.; et al. Dynamics of breast-cancer relapse reveal late-recurring ER-positive genomic subgroups. Nature 2019, 567, 399–404.
27. Cerami, E.; Gao, J.; Dogrusoz, U.; Gross, B.E.; Sumer, S.O.; Aksoy, B.A.; Jacobsen, A.; Byrne, C.J.; Heuer, M.L.; Larsson, E.; et al. The cBio cancer genomics portal: An open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2012, 2, 401–404.
28. Gao, J.; Aksoy, B.A.; Dogrusoz, U.; Dresdner, G.; Gross, B.; Sumer, S.O.; Sun, Y.; Jacobsen, A.; Sinha, R.; Larsson, E.; et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci. Signal. 2013, 6, l1.
29. Simon, N.; Friedman, J.; Hastie, T.; Tibshirani, R. Regularization Paths for Cox’s Proportional Hazards Model via Coordinate Descent. J. Stat. Softw. 2011, 39, 1–13.
30. Tibshirani, R.; Bien, J.; Friedman, J.; Hastie, T.; Simon, N.; Taylor, J.; Tibshirani, R.J. Strong rules for discarding predictors in lasso-type problems. J. R. Stat. Soc. Ser. B Stat. Methodol. 2012, 74, 245–266.

Figure 1. Kaplan–Meier survival analysis showing overall survival (OS) differences among subtypes identified via hierarchical clustering based on the three selected genes (LMO1, PRAME, and RSPO2). (A) shows the five subtypes detected in the TCGA primary breast cancer dataset (Cancer Genome Atlas 2012; Blum, Wang et al., 2018) [11,12] with significantly different OS (p < 0.001). (B) illustrates the two subtypes detected in the metastatic breast cancer dataset (Pleasance, Titmuss et al., 2020) [13], also with significantly different OS (p < 0.001).

Figure 2. The expression profile (Log2 expression value) of LMO1 (A), PRAME (B), and RSPO2 (C) in the clusters from TCGA 985 BC [11,12].

Figure 3. The expression profile (Log2 expression value) of LMO1 (A), PRAME (B), and RSPO2 (C) in the clusters from the 146 metastatic BC [13].

Figure 4. Kaplan–Meier survival analysis showing overall survival (OS) differences among subtypes identified via clustering based on the three selected genes (LMO1, PRAME, and RSPO2). (A) depicts four subtypes in the METABRIC primary breast cancer dataset, also with significantly different OS (p < 0.001). The expression profile (Log2 expression value) of LMO1 (B), PRAME (C), and RSPO2 (D) in the clusters from the 1980 patients affected by BC from METABRIC.

Figure 5. Overexpression of LMO1 and PRAME in RNA-seq normalized counts data. Values represent the mean of three internal replicates [15].

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
