Open Access Article

A Random Walk Based Cluster Ensemble Approach for Data Integration and Cancer Subtyping

1 College of Computer Science and Technology, Anhui University, Hefei 230601, Anhui, China
2 School of Software Engineering, Qufu Normal University, Qufu 273165, Shandong, China
3 Co-Innovation Center for Information Supply & Assurance Technology, Anhui University, Hefei 230601, Anhui, China
* Author to whom correspondence should be addressed.
Genes 2019, 10(1), 66; https://doi.org/10.3390/genes10010066
Received: 27 November 2018 / Revised: 11 January 2019 / Accepted: 14 January 2019 / Published: 18 January 2019

Abstract

The availability of diverse types of high-throughput data gives researchers more opportunities to develop computational methods that provide a more comprehensive view of the mechanisms and treatment of cancer. One fundamental goal in oncology is to divide patients into subtypes with clinical and biological significance. Cluster ensemble methods fit this task well: they can improve the performance and robustness of clustering by combining multiple base clustering results. However, many existing cluster ensemble methods use a co-association matrix to summarize instance-cluster co-occurrence statistics, so the relationships being integrated are captured only at a coarse level. Moreover, the relationships among clusters are ignored entirely. Recovering these missing associations could greatly expand the ability of cluster ensemble methods to identify cancer subtypes. In this paper, we propose RWCE (Random Walk based Cluster Ensemble), which takes the similarity among clusters into account. We first obtained a refined similarity between clusters by using a random walk and a scaled exponential similarity kernel. Then, a more informative instance-cluster association matrix, filled in with this cluster similarity, was modeled as a bipartite graph and fed into a spectral clustering algorithm to obtain the final clustering result. We applied our method to six cancer types from The Cancer Genome Atlas (TCGA) and to breast cancer data from the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC). Experimental results show that our method is competitive with existing methods. A further case study demonstrates that our method has the potential to find subtypes with clinical and biological significance.
Keywords: cluster ensemble; random walk; refined similarity; cancer subtypes
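
The pipeline outlined in the abstract (cluster-to-cluster similarity from a random walk, a scaled exponential kernel, a refined instance-cluster association matrix, and spectral partitioning of the resulting bipartite graph) can be illustrated with a minimal sketch. This is not the authors' implementation. All function names are hypothetical, and several choices are assumptions made only for illustration: Jaccard overlap as the initial cluster graph, a random walk with restart (probability 0.5), a simplified kernel scale that treats all other clusters as neighbours, an SVD-based bipartite partitioning, and a small deterministic k-means.

```python
import numpy as np


def membership_matrix(base_labels):
    """One binary column per cluster of every base clustering, plus the
    index of the base clustering each column came from."""
    columns, owner = [], []
    for m, labels in enumerate(base_labels):
        labels = np.asarray(labels)
        for c in np.unique(labels):
            columns.append((labels == c).astype(float))
            owner.append(m)
    return np.column_stack(columns), np.array(owner)


def refined_cluster_similarity(H, restart=0.5, mu=0.5):
    """Cluster-cluster similarity from random-walk profiles (assumed design)."""
    overlap = H.T @ H
    sizes = np.diag(overlap)
    W = overlap / (sizes[:, None] + sizes[None, :] - overlap)  # Jaccard overlap
    np.fill_diagonal(W, 0.0)
    row_sums = W.sum(axis=1, keepdims=True)
    P = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)
    c = W.shape[0]
    # Random walk with restart: row i is the visiting profile of a walk
    # that keeps restarting at cluster i.
    R = restart * np.linalg.inv(np.eye(c) - (1.0 - restart) * P)
    dist = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=2)
    # Scaled exponential kernel; the local scale averages over all other
    # clusters instead of k nearest neighbours (a simplification).
    mean_dist = dist.sum(axis=1) / max(c - 1, 1)
    eps = np.maximum((mean_dist[:, None] + mean_dist[None, :] + dist) / 3.0, 1e-12)
    return np.exp(-dist ** 2 / (mu * eps))


def refined_association(H, owner, S):
    """Entry (i, j) is 1 if instance i belongs to cluster j; otherwise it is
    the similarity between j and the cluster holding i in the same base clustering."""
    A = np.zeros_like(H)
    for m in np.unique(owner):
        cols = np.flatnonzero(owner == m)
        home = cols[np.argmax(H[:, cols], axis=1)]
        A[:, cols] = S[np.ix_(home, cols)]
    return A


def bipartite_spectral_labels(A, k, iters=50):
    """Partition the instance side of the bipartite graph via the SVD of the
    degree-normalized association matrix, then a deterministic k-means."""
    An = A / np.sqrt(A.sum(axis=1))[:, None] / np.sqrt(A.sum(axis=0))[None, :]
    U, _, _ = np.linalg.svd(An, full_matrices=False)
    Z = U[:, :k]
    Z = Z / np.maximum(np.linalg.norm(Z, axis=1, keepdims=True), 1e-12)
    centers = [Z[0]]  # farthest-first initialization, no randomness
    for _ in range(1, k):
        d = np.min([((Z - ctr) ** 2).sum(axis=1) for ctr in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        sq = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels


def rwce_sketch(base_labels, k):
    """Chain the four steps; returns one consensus label per instance."""
    H, owner = membership_matrix(base_labels)
    S = refined_cluster_similarity(H)
    A = refined_association(H, owner, S)
    return bipartite_spectral_labels(A, k)


# Toy input: four base clusterings of six samples, one of them noisy.
base = [[0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 0], [2, 2, 2, 5, 5, 5], [0, 0, 0, 0, 1, 1]]
labels = rwce_sketch(base, k=2)
```

In practice the base clusterings would come from running one or more clustering algorithms on each data type (for example expression, methylation, and miRNA profiles), and the number of final clusters `k` would need to be chosen separately.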

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

MDPI and ACS Style

Yang, C.; Wang, Y.-T.; Zheng, C.-H. A Random Walk Based Cluster Ensemble Approach for Data Integration and Cancer Subtyping. Genes 2019, 10, 66.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Genes, EISSN 2073-4425, published by MDPI AG, Basel, Switzerland.