Article

A Random Walk Based Cluster Ensemble Approach for Data Integration and Cancer Subtyping

1 College of Computer Science and Technology, Anhui University, Hefei 230601, Anhui, China
2 School of Software Engineering, Qufu Normal University, Qufu 273165, Shandong, China
3 Co-Innovation Center for Information Supply & Assurance Technology, Anhui University, Hefei 230601, Anhui, China
* Author to whom correspondence should be addressed.
Genes 2019, 10(1), 66; https://doi.org/10.3390/genes10010066
Received: 27 November 2018 / Revised: 11 January 2019 / Accepted: 14 January 2019 / Published: 18 January 2019
The availability of diverse types of high-throughput data gives researchers more opportunities to develop computational methods that provide a more comprehensive view of the mechanisms and therapy of cancer. One fundamental goal in oncology is to divide patients into subtypes with clinical and biological significance. Cluster ensemble methods are well suited to this task: they can improve the performance and robustness of clustering by combining multiple base clustering results. However, many existing cluster ensemble methods use a co-association matrix to summarize instance-cluster co-occurrence statistics, which captures the relationships in the ensemble only at a coarse level. Moreover, the relationships among clusters are completely ignored. Recovering these missing associations could greatly expand the ability of cluster ensemble methods to identify cancer subtypes. In this paper, we propose RWCE (Random Walk based Cluster Ensemble), which takes the similarity among clusters into account. We first obtain a refined similarity between clusters by using a random walk and a scaled exponential similarity kernel. Then, a more informative instance-cluster association matrix, filled with this cluster similarity, is modeled as a bipartite graph and fed into a spectral clustering algorithm to obtain the final clustering result. We applied our method to six cancer types from The Cancer Genome Atlas (TCGA) and to breast cancer from the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC). Experimental results show that our method is competitive with existing methods. A further case study demonstrates that our method has the potential to find subtypes with clinical and biological significance.
Keywords: cluster ensemble; random walk; refined similarity; cancer subtypes
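
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the formulas, so the Jaccard initial similarity, the number of walk steps, the kernel scaling (the mean distance to all other clusters, with a hypothetical parameter `mu`), and the rule for filling the association matrix are all assumptions made for illustration. The sketch stops at the instance-cluster association matrix; the final spectral clustering step on the bipartite graph is omitted.

```python
import math

def clusters_from_partitions(partitions):
    """List every base cluster as (partition_index, set_of_instance_indices)."""
    clusters = []
    for p, labels in enumerate(partitions):
        for lab in sorted(set(labels)):
            clusters.append((p, {i for i, l in enumerate(labels) if l == lab}))
    return clusters

def jaccard_matrix(clusters):
    """Initial cluster-cluster similarity (assumption: Jaccard overlap, zero diagonal)."""
    m = len(clusters)
    sim = [[0.0] * m for _ in range(m)]
    for a in range(m):
        for b in range(m):
            if a != b:
                A, B = clusters[a][1], clusters[b][1]
                sim[a][b] = len(A & B) / len(A | B)
    return sim

def random_walk(sim, steps=3):
    """Return the `steps`-step transition probabilities of a walk on the cluster graph."""
    m = len(sim)
    P = []
    for a in range(m):
        total = sum(sim[a])
        if total > 0:
            P.append([v / total for v in sim[a]])
        else:  # isolated cluster: the walker stays where it is
            P.append([1.0 if b == a else 0.0 for b in range(m)])
    walk = [row[:] for row in P]
    for _ in range(steps - 1):
        walk = [[sum(walk[a][k] * P[k][b] for k in range(m)) for b in range(m)]
                for a in range(m)]
    return walk

def scaled_exp_similarity(walk, mu=0.5):
    """Scaled exponential kernel on distances between walk profiles.

    Assumption: the scale eps_ab averages each cluster's mean distance to all
    other clusters and the pairwise distance itself.
    """
    m = len(walk)
    dist = [[math.dist(walk[a], walk[b]) for b in range(m)] for a in range(m)]
    mean_d = [sum(dist[a]) / (m - 1) for a in range(m)]
    sim = [[1.0] * m for _ in range(m)]
    for a in range(m):
        for b in range(m):
            eps = (mean_d[a] + mean_d[b] + dist[a][b]) / 3
            if eps > 0:
                sim[a][b] = math.exp(-dist[a][b] ** 2 / (mu * eps))
    return sim

def association_matrix(partitions, clusters, refined):
    """Instance-cluster matrix: 1 for membership, otherwise the refined similarity
    between the cluster and the instance's own cluster in the same partition."""
    home = {}  # (instance, partition) -> index of the cluster containing the instance
    for c, (p, members) in enumerate(clusters):
        for i in members:
            home[(i, p)] = c
    n = len(partitions[0])
    return [[1.0 if i in members else refined[home[(i, p)]][c]
             for c, (p, members) in enumerate(clusters)]
            for i in range(n)]

# Toy input: three base clusterings of four patients (made-up labels).
partitions = [[0, 0, 1, 1], [0, 0, 0, 1], [0, 1, 1, 1]]
clusters = clusters_from_partitions(partitions)
refined = scaled_exp_similarity(random_walk(jaccard_matrix(clusters)))
assoc = association_matrix(partitions, clusters, refined)
```

In this sketch, `assoc` has one row per patient and one column per base cluster. Under the approach described in the abstract, a matrix of this kind would then be treated as the biadjacency matrix of a bipartite graph and passed to spectral clustering.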
MDPI and ACS Style

Yang, C.; Wang, Y.-T.; Zheng, C.-H. A Random Walk Based Cluster Ensemble Approach for Data Integration and Cancer Subtyping. Genes 2019, 10, 66. https://doi.org/10.3390/genes10010066

AMA Style

Yang C, Wang Y-T, Zheng C-H. A Random Walk Based Cluster Ensemble Approach for Data Integration and Cancer Subtyping. Genes. 2019; 10(1):66. https://doi.org/10.3390/genes10010066

Chicago/Turabian Style

Yang, Chao, Yu-Tian Wang, and Chun-Hou Zheng. 2019. "A Random Walk Based Cluster Ensemble Approach for Data Integration and Cancer Subtyping" Genes 10, no. 1: 66. https://doi.org/10.3390/genes10010066
