From the third issue of 2017, Microarrays has changed its name to High-Throughput.

Open Access Article

Identification of Copy Number Aberrations in Breast Cancer Subtypes Using Persistence Topology

Department of Mathematics, University of California Davis, 1 Shields Avenue, Davis, CA 95616, USA
Department of Molecular and Cellular Biology, University of California Davis, 1 Shields Avenue, Davis, CA 95616, USA
Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
Department of Mathematics, San Francisco State University, 1600 Holloway Avenue, San Francisco, CA 94132, USA
Helen Diller Comprehensive Cancer Center, University of California San Francisco, 1600 Divisadero Street, San Francisco, CA 94143, USA
Author to whom correspondence should be addressed.
Academic Editor: Shu-Kay Ng
Microarrays 2015, 4(3), 339-369
Received: 9 April 2015 / Accepted: 3 August 2015 / Published: 12 August 2015
(This article belongs to the Special Issue Advanced Methods in Microarrays for Cancer Research)


DNA copy number aberrations (CNAs) are of biological and medical interest because they help identify regulatory mechanisms underlying tumor initiation and evolution. Identification of tumor-driving CNAs (driver CNAs), however, remains a challenging task because they are frequently hidden by CNAs that are the product of random events that take place during tumor evolution. Experimental detection of CNAs is commonly accomplished through array comparative genomic hybridization (aCGH) assays, followed by supervised and/or unsupervised statistical methods that combine the segmented profiles of all patients to identify driver CNAs. Here, we extend a previously presented supervised algorithm for the identification of CNAs that is based on a topological representation of the data. Our method associates a two-dimensional (2D) point cloud with each aCGH profile and generates a sequence of simplicial complexes, mathematical objects that generalize the concept of a graph. This representation permits segmenting the data at different resolutions and identifying CNAs by interrogating the topological properties of these simplicial complexes. We tested our approach on a published dataset with the goal of identifying breast cancer CNAs associated with specific molecular subtypes. CNAs associated with each subtype were identified by analyzing each subtype separately, taking the remaining subtypes as the control. We found a new amplification in 11q at the location of the progesterone receptor in the Luminal A subtype. Aberrations in the Luminal B subtype were found only upon removal of the basal-like subtype from the control set. Under those conditions, all regions found in the original publication, except for 17q, were confirmed. In the basal-like subtype, all aberrations except those in chromosome arms 8q and 12q were confirmed; these two chromosome arms were detected only upon removal of three patients with exceedingly large copy number values. More importantly, we detected 10 and 21 additional regions in the Luminal B and basal-like subtypes, respectively. Most of the additional regions were validated on an independent dataset and/or using GISTIC. Furthermore, we found three new CNAs in the basal-like subtype: a combination of gains and losses in 1p, a gain in 2p and a loss in 14q. Based on these results, we suggest that topological approaches that incorporate multiresolution analyses and that interrogate topological properties of the data can help in the identification of copy number changes in cancer.
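The idea of turning a profile into a 2D point cloud and reading off a topological property at several resolutions can be illustrated with a minimal sketch. It is not the authors' TAaCGH implementation: the sliding-window embedding (x_i, x_{i+1}), the function names, the toy profiles and the distance thresholds are all assumptions made for illustration. The property computed is the number of connected components (the zeroth Betti number) of the graph that joins points closer than a threshold eps; this count equals the number of connected components of the Vietoris-Rips complex at that scale, since components depend only on vertices and edges.

```python
from itertools import combinations
from math import dist


def point_cloud(profile):
    """Map a 1D profile to 2D points (x_i, x_{i+1}).

    This sliding-window embedding is an assumption for illustration.
    """
    return [(profile[i], profile[i + 1]) for i in range(len(profile) - 1)]


def betti0(points, eps):
    """Count connected components when points at distance <= eps are joined."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    components = len(points)
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= eps:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[ri] = rj
                components -= 1
    return components


def betti0_curve(profile, eps_values):
    """Number of components at each resolution eps (a multiresolution summary)."""
    pts = point_cloud(profile)
    return [betti0(pts, eps) for eps in eps_values]


# Hypothetical toy log-ratio profiles: one flat, one with a gained segment.
flat = [0.0, 0.1, -0.1, 0.0, 0.1, 0.0]
gain = [0.0, 0.1, 1.0, 1.1, 1.0, 0.0]
eps_values = [0.05, 0.3, 1.5]

print(betti0_curve(flat, eps_values))
print(betti0_curve(gain, eps_values))
```

In this toy setting, the profile with a gained segment stays fragmented into several components at intermediate resolutions, while the flat profile merges into a single component early. Comparing such curves between a test group and a control group is one way a topological summary could flag a region, although the statistical testing described in the abstract is not shown here.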
Keywords: breast cancer subtypes; copy number aberrations; topological data analysis; TAaCGH

Figure 1

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

Supplementary material


Share & Cite This Article

MDPI and ACS Style

Arsuaga, J.; Borrman, T.; Cavalcante, R.; Gonzalez, G.; Park, C. Identification of Copy Number Aberrations in Breast Cancer Subtypes Using Persistence Topology. Microarrays 2015, 4, 339-369.


Microarrays, EISSN 2076-3905, published by MDPI AG, Basel, Switzerland