Cancer Subtype Recognition Based on Laplacian Rank Constrained Multiview Clustering

Integrating multigenomic data to recognize cancer subtypes is an important task in bioinformatics. In recent years, several multiview clustering algorithms have been proposed and applied to identify cancer subtypes. However, these algorithms ignore that each data type contributes differently to the clustering result during the fusion process, and they require an additional clustering step to generate the final labels. In this paper, a new one-step method for cancer subtype recognition based on a graph learning framework is designed, called Laplacian Rank Constrained Multiview Clustering (LRCMC). LRCMC first forms a graph for each single biological data type to reveal the relationships between data points and uses an affinity matrix to encode the graph structure. Then, it adds weights to measure the contribution of each graph and finally merges the individual graphs into a consensus graph. In addition, LRCMC constructs adaptive neighbors to adjust the similarity of sample points, and it uses a rank constraint on the Laplacian matrix to ensure that each graph structure has the same connected components. Experiments on several benchmark datasets and The Cancer Genome Atlas (TCGA) datasets have demonstrated the effectiveness of the proposed algorithm compared with state-of-the-art methods.


Introduction
A tumor is a malignant heterogeneous disease caused by changes in cellular components at the expression, epigenetic, transcriptional and proteomic levels. This heterogeneity is reflected in the fact that the same cancer can produce subtypes with different phenotypes, which affects clinical treatment and prognosis [1,2]. With the development and maturity of next-generation sequencing technologies, large amounts of biological data have been collected in public databases that are easily accessible to researchers [3]. For example, The Cancer Genome Atlas (TCGA), a landmark cancer genomics project, stores information on biological processes such as mRNA expression data, DNA methylation data, miRNA expression data and mutation data for more than 30 cancers and thousands of cancer patients [4]. Therefore, to solve the problem of cancer subtype recognition, building a multiview clustering model that makes full use of this biological information plays a significant role.
In order to implement the clustering task, scholars initially focused on dimensionality reduction, matrix decomposition and linear regression technologies. These all use different strategies to project high-dimensional data into a low-dimensional feature space, and then achieve clustering by k-means [5][6][7][8][9][10]. For example, an effective classical method, iCluster [5], builds a Gaussian latent variable model, and its modified version, iClusterPlus [6], considers different variable types following different linear probabilistic relationships to build a regression model. Both of them obtain a low-dimensional space combining the different biological characteristics. Another method, Pattern Fusion Analysis (PFA) [10], first uses an improved Principal Component Analysis (PCA) to find a low-dimensional matrix for each sample, and then uses an adaptive alignment algorithm to build a fused low-dimensional feature space. However, these methods may further dilute the already low signal-to-noise ratio and increase the noise pollution in the results. Considering that the sample (patient) size of biological data is much smaller than the feature (gene) size, some graph-based learning methods for cancer subtype recognition have been designed [11][12][13][14][15][16][17]. These methods use the sample points to quickly construct a similarity graph, which converts the task into a spectral clustering problem. For example, a widely mentioned algorithm, Similarity Network Fusion (SNF) [11], constructs global and local similarity networks for each data type, and then integrates them into a final similarity network based on an information propagation strategy that dilutes low similarities and enhances high similarities. Inspired by SNF, Ma et al. provided Affinity Network Fusion (ANF) [12], which constructs a k-nearest-neighbor patient similarity network for each data type, and then fuses these networks based on a random walk method.
In addition, Yu et al. proposed Multiview Clustering using Manifold Optimization (MVCMO) [17], which solves the spectral clustering optimization problem using a line search method on the Stiefel manifold.
However, most existing graph-based multiview clustering methods separate the data clustering process from the graph learning process [18,19]. In some methods, the construction of the graph is independent of the clustering task, so their performance is highly dependent on the predefined graph. Recently, some adaptive graph learning methods using a rank constraint on the Laplacian matrix have been able to directly reveal the clustering structure, which makes the graph construction closely related to the clustering task [20][21][22][23][24]. In addition, the similarity between sample points commonly behaves differently in different views, which matters during graph fusion. Some existing algorithms simply take the average of the affinity graphs of the multiple views as the fusion result [25,26], so the rich heterogeneous information is not fully utilized.
To sum up, we designed a graph-based multiview clustering algorithm called Laplacian Rank Constrained Multiview Clustering (LRCMC). Firstly, the Laplacian Rank Constraint (LRC) algorithm [27] is used to simultaneously find the affinity graph and the embedding matrix of each view to ensure that the graph structures have the same connected components. Then, based on the method of Nie et al. [24], we use the LRC method to obtain a consistent graph whose connected components are the same as those of the affinity graph of each view. Finally, the clustering structure is obtained. In the graph fusion process, an inverse distance weighting scheme is employed to assign a different weight to each view's affinity graph [24], so as to adjust the structure of the consistent graph more effectively. Moreover, the processes of graph learning, graph fusion and clustering are coupled into one optimization problem to update a more accurate consistent graph and improve the clustering results. In order to evaluate the effectiveness of the proposed method, experiments were carried out on four benchmark datasets and four TCGA datasets, with four state-of-the-art methods used for comparison. The values of Accuracy (ACC), Normalized Mutual Information (NMI) and Purity on the benchmark datasets, which are commonly used metrics in clustering analysis, and the p values obtained from survival analysis on the TCGA datasets all show that the proposed LRCMC approach achieves considerable improvement over the state-of-the-art baseline methods. In the analysis of the Glioblastoma Multiforme (GBM) subtypes, we found that the identified clusters have biological significance, e.g., the Proneural subtype granted by the G-CIMP phenotype has a better survival advantage. The source code and datasets can be found in Supplementary File 1.

Methods
The overall flow of LRCMC is shown in Figure 1. Specifically, given a set of omics data with m views X^1, ..., X^m, a set of affinity graph matrices S^1, ..., S^m are constructed according to X^1, ..., X^m, respectively. It should be emphasized that the process of learning the affinity matrix in LRCMC differs from most multiview clustering algorithms: S^1, ..., S^m are not calculated directly from the original matrices, but are constructed from a set of embedding matrices F^1, ..., F^m by the LRC method. Therefore, each affinity graph matrix is constrained to the same connected components, which ensures that each affinity graph has a similar structure before the fusion process. Then, the proposed fusion method is applied to the affinity graph matrices of all views in order to learn a consistent graph matrix Z. Simultaneously, each view is automatically assigned a different weight w^1, ..., w^m to represent its contribution to Z during the fusion process. Finally, the learned consistent graph matrix Z is used to optimize the affinity graph matrix of each view. The LRC method is also imposed to constrain the number of connected components of Z to equal the required number of clusters c by constructing the fused embedding matrix U. Thus, LRCMC improves the affinity matrix of each view, builds a fused consistent graph matrix and obtains the clustering result simultaneously.



Construction of Affinity Graph Based on LRC
Given a single biological data matrix X^v = [x_1^v, ..., x_n^v] ∈ R^{d_v × n}, which denotes the v-th view with d_v features, where n is the number of data points, S^v ∈ R^{n×n} represents the similarity relationships between the sample points in the graph learning framework. The smaller the distance between a pair of vertices in the graph, the greater their similarity and the greater the corresponding edge weight, and vice versa. Based on the manifold structure of the graph, the most traditional way to build S^v is to generate a k-nearest-neighbor graph: a pair of vertices are considered connected if they are near neighbors. There are other effective strategies to design a more accurate affinity graph S^v, e.g., smooth representation [28] and the Gaussian kernel for similarity learning [29]. For the purpose of clustering, if the sample points can be assigned to c categories, the obtained S^v should contain exactly c connected components. This can be realized based on the following Theorem 1: the multiplicity c of the eigenvalue 0 of the Laplacian matrix L^v is equal to the number of connected components of the graph associated with S^v. When all the elements of S^v satisfy the non-negativity condition, its Laplacian matrix L^v has this property [30,31].
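The k-nearest-neighbor construction mentioned above can be sketched as follows. This is a minimal illustration (not the paper's exact construction), assuming a Gaussian kernel on Euclidean distances with a hypothetical bandwidth parameter `sigma`:

```python
import numpy as np

def knn_affinity(X, k=5, sigma=1.0):
    """Build a k-nearest-neighbor affinity graph S from data X (d x n).

    Each column of X is a sample; S[i, j] is nonzero only when j is among
    the k nearest neighbors of i, weighted by a Gaussian kernel.
    """
    n = X.shape[1]
    # pairwise squared Euclidean distances between samples (columns)
    sq = np.sum(X**2, axis=0)
    D = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    np.fill_diagonal(D, np.inf)          # exclude self-loops
    S = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(D[i])[:k]       # k nearest neighbors of sample i
        S[i, idx] = np.exp(-D[i, idx] / (2.0 * sigma**2))
    S = (S + S.T) / 2.0                  # symmetrize the graph
    return S
```

The symmetrization step matters because the k-NN relation is not symmetric; the Laplacian in Theorem 1 is defined on the symmetric part of the graph.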
If the sample points on S^v have been ideally assigned to c categories [32], the Laplacian rank meets the constraint condition rank(L^v) = n − c. Therefore, based on Ky Fan's theorem [33],

Σ_{i=1}^{c} σ_i(L^v) = min_{F^v ∈ R^{n×c}, (F^v)^T F^v = I} Tr((F^v)^T L^v F^v),    (1)

where F^v ∈ R^{n×c} is formed by the c eigenvectors of L^v corresponding to the c smallest eigenvalues σ_i(L^v), Tr(·) denotes the trace operator, L^v = D^v − ((S^v)^T + S^v)/2 is the Laplacian matrix, and D^v is a diagonal matrix whose elements are the column sums of ((S^v)^T + S^v)/2. However, minimizing Equation (1) over S^v alone leads to a trivial solution. Therefore, an ℓ2-norm regularization term is employed to obtain a smooth S^v, and each column of S^v is constrained to sum to one [21]. Finally, we can obtain the objective function in F^v and S^v simultaneously:

min_{S^v, F^v} 2 Tr((F^v)^T L^v F^v) + α ||S^v||_F²,  s.t. ∀j, (s_j^v)^T 1 = 1, s_{ij}^v ≥ 0, (F^v)^T F^v = I,    (2)

where α is the regularization parameter. A set of affinity graph matrices S^1, ..., S^m and embedding matrices F^1, ..., F^m are obtained through Equation (2) without the participation of the original data. However, these affinity matrices are unrelated to each other; if they are simply stacked together for clustering, the graph structures will be badly damaged and the algorithm performance will degrade. Therefore, we need to introduce a graph fusion strategy to construct a consistent graph matrix with unified connected components.
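The two ingredients used above, the embedding F^v as the c bottom eigenvectors of the Laplacian and the component count of Theorem 1 as the multiplicity of the zero eigenvalue, can both be computed with one symmetric eigendecomposition. A small sketch (the tolerance `tol` for "numerically zero" is an assumption):

```python
import numpy as np

def laplacian_embedding(S, c, tol=1e-10):
    """Return the Ky Fan embedding F (n x c) and the number of connected
    components of the graph S (multiplicity of eigenvalue 0 of L)."""
    W = (S + S.T) / 2.0                       # symmetric part of the graph
    D = np.diag(W.sum(axis=0))                # degree matrix (column sums)
    L = D - W                                 # graph Laplacian
    vals, vecs = np.linalg.eigh(L)            # eigenvalues in ascending order
    F = vecs[:, :c]                           # eigenvectors of the c smallest eigenvalues
    n_components = int(np.sum(vals < tol))    # Theorem 1: #components = #zero eigenvalues
    return F, n_components
```

`np.linalg.eigh` returns eigenvalues in ascending order, so slicing the first c eigenvectors directly gives the minimizer of Tr(F^T L F) under F^T F = I.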

Graph Fusion with LRC
Integrating these basic graphs to form the fused affinity graph Z ∈ R^{n×n}, two intuitive points should be considered: (1) the designed graph S^v of each view can be regarded as the consistent graph Z corrupted by noise and outliers; (2) an S^v closer to Z should be given a greater weight to reduce the perturbation of low-quality graphs on the fusion graph. In this way, Z can accurately capture the true similarity hidden in the multiview data. Therefore, we employ the method proposed by Nie et al. [24] to optimize Z as follows:

min_Z Σ_{v=1}^{m} ||Z − S^v||_F,  s.t. ∀j, z_j^T 1 = 1, z_{ij} ≥ 0,    (3)

where each view is implicitly associated with a weight w^v for its affinity graph S^v, computed by the inverse distance weighting scheme given below. The Lagrange function of Equation (3) can be written as

L(Z, Λ) = Σ_{v=1}^{m} ||Z − S^v||_F + ς(Λ, Z),    (4)

where Λ is the Lagrange multiplier and ς(Λ, Z) is the formal term derived from the constraint conditions. Taking the derivative of Equation (4) w.r.t. Z and setting it to zero, we obtain

Σ_{v=1}^{m} w^v ∂||Z − S^v||_F² / ∂Z + ∂ς(Λ, Z)/∂Z = 0,    (5)

where w^v is given as follows:

w^v = 1 / (2 ||Z − S^v||_F).    (6)

Here, a set of weights w^1, ..., w^m and a consistent graph matrix Z are obtained from Equation (3). In order to make the learned Z also have c connected components for clustering, an LRC term is added to Equation (3) according to Theorem 1 and Ky Fan's theorem. The objective function is as follows:

min_{Z, U} Σ_{v=1}^{m} ||Z − S^v||_F + 2β Tr(U^T L_Z U),  s.t. ∀j, z_j^T 1 = 1, z_{ij} ≥ 0, U^T U = I,    (7)

where U ∈ R^{n×c} is formed by the c eigenvectors of L_Z corresponding to the c smallest eigenvalues, i.e., the embedding matrix corresponding to Z; L_Z = D_Z − (Z^T + Z)/2 is the Laplacian matrix, D_Z is a diagonal matrix whose elements are the column sums of (Z^T + Z)/2, and β is the regularization parameter.
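The self-weighted fusion of Equations (3)-(6) can be sketched as an alternation between the closed-form weights and the weighted average they induce. This simplified version ignores the simplex constraints on the rows of Z and the rank term, so it is an illustration of the weighting scheme only:

```python
import numpy as np

def fuse_graphs(S_list, n_iter=20, eps=1e-10):
    """Self-weighted fusion of affinity graphs (sketch of Equations (3)-(6)).

    Alternates between the inverse-distance weights w_v = 1 / (2 ||Z - S_v||_F)
    and the weighted combination Z that those weights induce.
    """
    Z = np.mean(S_list, axis=0)                       # start from the plain average
    for _ in range(n_iter):
        w = np.array([1.0 / (2.0 * np.linalg.norm(Z - S) + eps) for S in S_list])
        w = w / w.sum()                                # normalize the view weights
        Z = sum(wv * S for wv, S in zip(w, S_list))    # weighted combination
    return Z, w
```

Views whose graphs sit far from the current consensus automatically receive small weights, which is exactly the "low-quality graphs perturb Z less" behavior described above.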

LRCMC Algorithm
As described in Sections 2.1 and 2.2, the LRC operation is used to guarantee the structures of S^1, ..., S^m and Z. Therefore, we can combine Equations (2) and (7) into a final objective function, i.e., the proposed Laplacian Rank Constrained Multiview Clustering (LRCMC):

min_{S^v, F^v, Z, U} Σ_{v=1}^{m} ( 2 Tr((F^v)^T L^v F^v) + α ||S^v||_F² + ||Z − S^v||_F ) + 2β Tr(U^T L_Z U),
s.t. ∀j, (s_j^v)^T 1 = 1, s_{ij}^v ≥ 0, (F^v)^T F^v = I, z_j^T 1 = 1, z_{ij} ≥ 0, U^T U = I.    (8)

Here, we complete the tasks of graph construction, graph fusion and clustering in one step through this integrated model. In this way, the learning of S^1, ..., S^m and Z helps each other within a jointly coupled problem. The objective function in Equation (8) enjoys the following properties:

• Our method effectively learns a set of affinity graph matrices with c connected components, whereas most multiview clustering methods require predefined graphs;
• In the graph fusion process, we assign a weight to each view to represent its contribution to the consistent graph Z, rather than simply superimposing the graphs together;
• We use LRC to constantly adjust the structures of S^1, ..., S^m and Z, and at the same time complete the task of clustering.
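The one-step scheme described above can be tied together in a minimal skeleton: build a graph per view, fuse the graphs with self-tuned weights, and check the connectivity of the consensus through its Laplacian. This is an illustrative sketch under simplifying assumptions (Gaussian k-NN affinities with a median-distance bandwidth, no simplex projection, no rank penalty), not the paper's exact updates:

```python
import numpy as np

def lrcmc_sketch(X_list, k=2, n_iter=10):
    """Illustrative one-step skeleton: per-view graphs, self-weighted fusion,
    and a connectivity check of the consensus graph (Theorem 1)."""
    def affinity(X):
        # Gaussian k-NN graph over the columns (samples) of X
        sq = np.sum(X**2, axis=0)
        D = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
        np.fill_diagonal(D, np.inf)
        S = np.exp(-D / (2.0 * np.median(D[np.isfinite(D)])))
        M = np.zeros_like(S)
        for i in range(S.shape[0]):
            idx = np.argsort(-S[i])[:k]      # keep the k strongest edges per row
            M[i, idx] = S[i, idx]
        return (M + M.T) / 2.0

    S_list = [affinity(X) for X in X_list]
    Z = np.mean(S_list, axis=0)              # initialize the consensus graph
    for _ in range(n_iter):
        # inverse-distance view weights, then the weighted combination
        w = np.array([1.0 / (2.0 * np.linalg.norm(Z - S) + 1e-10) for S in S_list])
        w /= w.sum()
        Z = sum(wv * S for wv, S in zip(w, S_list))
    # connectivity check via the Laplacian of Z (Theorem 1)
    L = np.diag(Z.sum(axis=0)) - Z
    n_comp = int(np.sum(np.linalg.eigvalsh(L) < 1e-8))
    return Z, w, n_comp
```

In the full algorithm, the returned component count would drive the adjustment of β until `n_comp` equals the desired number of clusters c.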

Optimization Algorithm of LRCMC
Since the variables in Equation (8) are coupled to each other, we use an alternating iterative method and the Augmented Lagrange Multiplier (ALM) scheme to solve for S^1, ..., S^m, F^1, ..., F^m, w^1, ..., w^m, Z and U. The specific solution process is as follows.

1. Fix F^1, ..., F^m, w^1, ..., w^m, Z and U, solve S^v. Equation (8) reduces to a subproblem in S^v (Equation (9)), which can be written column-wise in vector form (Equation (10)) and, after completing the square, as a Euclidean projection problem (Equations (11) and (12)). The Lagrangian function of Equation (12) combined with its constraints can be defined as Equation (13), where η is the Lagrangian coefficient scalar and φ is the Lagrangian coefficient vector. Based on the Karush-Kuhn-Tucker (KKT) conditions [34], the optimal solution of s_j^v can be estimated as in Equation (14). The study in [35] found that sparse representation is robust to noise and outliers. In order to obtain a sparse affinity graph S^v, we find the k nonzero adaptive neighbors of s_j^v satisfying s_{jk}^v > 0 and s_{j,k+1}^v = 0. Denoting α + w^v = δ, we arrive at Equation (15). Moreover, according to Equation (15) and the constraint condition 1^T s_j^v − 1 = 0, η is given by Equation (16). Therefore, according to Equations (15) and (16), the range of δ is obtained as in Equation (17), and the parameter δ can be set as in Equation (18). Finally, according to Equations (15), (16) and (18), the optimal solution for the elements s_{ij}^v of s_j^v is given by Equation (19).

2. Fix the other variables, solve F^v. Updating F^v in Equation (8) reduces to Equation (2); therefore, F^v is updated from Equation (1) as in Section 2.1, i.e., as the c eigenvectors of L^v corresponding to the c smallest eigenvalues.

3. Fix the other variables, solve w^1, ..., w^m. Each weight is updated by the inverse distance weighting scheme of Equation (6).
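The sparse closed-form update for one column s_j^v with k nonzero adaptive neighbors can be illustrated as follows. Here the input vector `d` is an assumption standing in for the combined per-neighbor cost from the derivation (the rescaled d_{ij}^v and z_{ij} terms); the construction keeps the k smallest costs nonzero and projects onto the probability simplex:

```python
import numpy as np

def update_row(d, k):
    """Closed-form sparse simplex solution with k nonzero adaptive neighbors.

    Given a cost vector d for one sample, return s >= 0 with sum(s) = 1 and
    exactly k nonzero entries on the k smallest costs, following the
    adaptive-neighbor construction referenced in the text.
    """
    n = d.shape[0]
    idx = np.argsort(d)                  # ascending costs
    d_sorted = d[idx]
    top = d_sorted[:k]
    denom = k * d_sorted[k] - top.sum()  # k * d_{k+1} - sum of the k smallest
    s = np.zeros(n)
    s[idx[:k]] = (d_sorted[k] - top) / max(denom, 1e-12)
    return s
```

By construction the entries sum to one, the (k+1)-th neighbor gets exactly zero weight, and closer neighbors receive larger similarities, which is the sparsity pattern s_{jk}^v > 0, s_{j,k+1}^v = 0 required above.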

4. Fix S^1, ..., S^m, F^1, ..., F^m, w^1, ..., w^m and U, solve Z. Updating Z in Equation (8) reduces to Equation (7). Since Tr(U^T L_Z U) = (1/2) Σ_{i,j} ||u_i − u_j||² z_{ij} (Equation (20)), where u_i and u_j denote the i-th and j-th rows of U and z_{ij} denotes the (i, j)-th element of Z, Equation (7) yields a column-wise problem in z_j (Equation (21)). Based on the Karush-Kuhn-Tucker (KKT) conditions [34], the closed-form solution of z_j can be estimated as in Equation (22), which can be solved by the efficient optimization method proposed in [35].

5. Fix the other variables, solve U. Following the method used to find F^v, U is obtained by minimizing Tr(U^T L_Z U) subject to U^T U = I; the final solution for U is the c eigenvectors of L_Z corresponding to the c smallest eigenvalues.

Experimental Results
In order to verify the effectiveness of LRCMC in cancer subtype recognition, LRCMC was compared with four state-of-the-art clustering algorithms, i.e., ANF [12], SNF [11], PFA [10] and MVCMO [17]. Since biological omics data are not labeled, we first downloaded four widely used benchmark datasets containing real labels, i.e., 3-source, Calt-7, MSRC and WebKB, to verify that the proposed LRCMC achieves a good clustering effect. Furthermore, we applied LRCMC to the datasets downloaded and preprocessed by Wang et al. [11] from TCGA. These datasets contain four types of cancer, i.e., GBM, Breast Invasive Carcinoma (BIC), Lung Squamous Cell Carcinoma (LSCC) and Colon Adenocarcinoma (COAD).

Comparison Experiments on Benchmark Datasets
The benchmark datasets are described as follows:
• 3-source [20]: It contains 169 news stories reported by three news outlets, i.e., BBC, Reuters and The Guardian. Each story carries one of six thematic labels;
• Calt-7 [36]: This object recognition dataset is drawn from the Caltech101 dataset by screening 7 widely used classes, i.e., faces, motorbikes, dollar bill, Garfield, stop sign, etc.;
• WebKB [20]: It collects 203 web pages in 4 classes from a university computer science department. Each page has 3 features, i.e., the content of the page, the anchor text of the hyperlink, and the text description in the title.
Table 1 gives an overview of these datasets, where n, m and c denote the number of samples, views and classes of each dataset, respectively, and d_v denotes the feature dimension of the v-th view. Three commonly used evaluation metrics, i.e., Accuracy (ACC), Normalized Mutual Information (NMI) and Purity, are used to quantitatively measure the clustering performance of the algorithms. These metrics compare the resulting labels with the real labels provided by the dataset; the larger the value, the better the clustering result. To ensure the fairness of the comparison experiments, each algorithm was run 10 times to reduce the impact of randomness, and the mean and standard deviation of the obtained metrics were calculated. In addition, the number of neighbors k required by ANF, SNF, MVCMO and LRCMC was set within the range [5, 50], and the other parameters were set to the default values provided by the authors. Only one parameter, β, needs to be set in our LRCMC algorithm, which is introduced by the LRC term. In order to achieve rapid convergence of Algorithm 1, we adopt the dynamic parameter updating method proposed by Nie et al. [23]: β is initialized in the range [1, 30]; if the number of connected components of Z is greater than c, we shrink β (β = β/2); on the contrary, if it is less than c, we increase β (β = 2β) until Z has the right number of components. Table 2 shows the final evaluation metrics obtained by these algorithms on the four datasets. It is obvious that LRCMC achieves better clustering performance on the multiview clustering task than the other methods.
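The dynamic β update described above is a simple feedback loop on the number of connected components. A sketch, where `count_components` is a hypothetical caller-supplied function that re-solves for Z under the current β and counts its components:

```python
def tune_beta(count_components, c, beta=1.0, max_iter=50):
    """Dynamic update of the rank-constraint parameter, as described above.

    count_components(beta) returns the number of connected components of the
    consensus graph Z obtained with the current beta. beta is halved when Z
    has too many components and doubled when it has too few.
    """
    for _ in range(max_iter):
        comp = count_components(beta)
        if comp == c:
            break
        beta = beta / 2.0 if comp > c else beta * 2.0
    return beta
```

Because β only ever doubles or halves, the search converges quickly in practice, which is why only this single parameter needs tuning.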

Algorithm 1. LRCMC algorithm
Input: Original data X^1, ..., X^m with m views, the number of clusters c, the number of neighbors k, the regularization parameter β.
Output: The learned consensus matrix Z.
Initialize the affinity matrices S^1, ..., S^m for each view by solving the following problem:

Comparison Experiments on TCGA Datasets
To demonstrate the effectiveness of LRCMC in identifying cancer subtypes, the designed LRCMC was applied to four cancer omics datasets, i.e., GBM, BIC, LSCC and COAD. Each cancer dataset contains three types of omics data from different platforms, i.e., mRNA expression data, DNA methylation data and miRNA expression data. Table 3 shows the number of samples (patients) and features (genes) of each cancer dataset. To ensure that the identified cancer labels conform to the true clinical diagnosis, we specified that the number of samples in each cluster should be at least 3. We used the numbers of subtypes of GBM, BIC, LSCC and COAD specified by Wang et al., which were 3, 5, 4 and 3, respectively. Then, the p values based on the Cox log-rank model were used to evaluate the clustering results of these algorithms in survival analysis [38]. The smaller the p value, the more significant the difference in survival between the groups, which means the clusters are more likely to capture the characteristics of the underlying cancer subtypes. Cancer survival curves can also represent the heterogeneity between different subtypes. As shown in Table 4, LRCMC obtained the best p value in BIC, GBM, LSCC and COAD. The other algorithms also achieved good results on specific datasets, but none matched our algorithm. Therefore, we believe that LRCMC is significantly advantageous for the topic of cancer subtype recognition. Figure 2 shows the Kaplan-Meier survival analysis curves of the four cancers. Each curve depicts the trend in survival time of each cancer cluster, and the number of samples in each cluster is also shown in the figure. Table 4. p values of survival analysis in the Cox log-rank model for different clustering methods of four cancers on The Cancer Genome Atlas (TCGA) datasets.

[Table 4 body omitted: rows are the compared methods; columns are GBM, BIC, LSCC and COAD. The best results have been highlighted in bold.]
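The Kaplan-Meier curves in Figure 2 follow the standard product-limit estimator; in practice a survival-analysis package would be used, but a minimal pure-NumPy sketch of the estimator itself is:

```python
import numpy as np

def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate for one cluster of patients.

    times  : survival or censoring time per patient
    events : 1 if death observed, 0 if censored
    Returns the distinct event times and the survival probability after each.
    """
    order = np.argsort(times)
    times, events = np.asarray(times)[order], np.asarray(events)[order]
    surv, out_t, out_s = 1.0, [], []
    for t in np.unique(times[events == 1]):
        at_risk = np.sum(times >= t)            # patients still under observation
        deaths = np.sum((times == t) & (events == 1))
        surv *= 1.0 - deaths / at_risk          # product-limit update
        out_t.append(t)
        out_s.append(surv)
    return out_t, out_s
```

Censored patients (events = 0) leave the risk set without triggering a drop in the curve, which is what distinguishes these curves from a plain empirical survival fraction.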

Analysis on GBM Dataset
GBM is the most malignant glioma among the astrocytomas. It has been studied and analyzed at the genetic level by many scholars, and specific subtypes and treatment protocols have been proposed. For example, according to mRNA expression data, Verhaak et al. [39] reported that GBM is divided into the Mesenchymal, Classical, Neural and Proneural subtypes, and these heterogeneous subtypes were also verified in somatic mutations and copy number variations (CNVs). Another study divided GBM patients into two subtypes, i.e., G-CIMP and non-G-CIMP, based on the difference in the CpG island methylator phenotype (CIMP) [40]. Table 5 shows the distribution of the cluster results obtained by LRCMC over the subtypes identified by these two studies. From Table 5, there are more patients in cluster 1 than in cluster 3, and all of them are assigned to the non-G-CIMP subtype; they also span the four subtypes identified based on mRNA expression. Notably, the Proneural subtype patients in these two clusters belong to the non-G-CIMP subtype. In contrast, cluster 2, with a smaller number of patients, consists almost entirely of the Proneural subtype and also belongs to the G-CIMP subtype.
The values in Table 5 represent the number of patients counted.
To further analyze the identified clusters, we downloaded clinical data, somatic mutation data and CNV data for all patients from the cBio Cancer Genomics Portal database (http://www.cbioportal.org/ accessed on 15 December 2020). The age profiles of the three clusters (Figure 3), the differential gene statistics of CNVs and mutations (Table 6), and the Kaplan-Meier survival curves under Temozolomide (TMZ) treatment (Figure 4) in GBM patients were obtained. Figure 3 shows that the diagnosis age of patients in cluster 2, which has the best survival advantage, is lower than that of patients in clusters 1 and 3. The genetic variant signatures associated with GBM in terms of mutation (IDH1) and CNVs (CDKN2A, CDKN2B, C9orf53, MTAP, EGFR) are significantly different across the three identified clusters. In particular, the IDH1 mutation only occurs in cluster 2, where the EGFR amplification count is 0. Then, we divided the patients within each of the three clusters into two groups, patients treated with TMZ and those not treated with TMZ, and compared the drug response. TMZ is a drug commonly used to treat GBM, but only a subset of patients respond well to it.
The p values of the survival analysis in the Cox log-rank model for the three cluster-wise comparison experiments are 2.0 × 10^-6, 0.76 and 0.01, respectively, which indicates that TMZ treatment has no significant effect on the patients in cluster 2. Therefore, in summary, we can infer that the subtype belonging to both the G-CIMP subtype and the Proneural subtype might be a potentially new subtype. This is also supported by the finding that the Proneural subtype granted by the G-CIMP phenotype has unique properties [41]. In addition, mRNA expression data and DNA methylation data were used to compare the differentially expressed genes in clusters 1 and 3 to look for the heterogeneity between them. We compared the genes in the two clusters using ANOVA (the lower the p value, the higher the ranking). The gene differences in the miRNA expression data were not significant enough (p values were all greater than 0.1) and were omitted from consideration. Figure 5 shows the heatmaps of the top 20 differentially expressed genes in the mRNA expression data and the DNA methylation data, respectively. It is obvious that cluster 1 and cluster 3 differ at the gene expression level, and some of the genes on the heatmaps have been shown to be linked to GBM, e.g., PRKAA1, overexpressed in cluster 3 and also known as AMPK, induces antitumor activity in GBM cells and has become a possible tumor control target [42]; MUC1, overexpressed in cluster 1, is a pathogenic gene that induces GBM and can be used as a target for cellular immunotherapy [43]. Finally, we compared the three clusters with normal samples and screened for differentially expressed genes using ANOVA. We performed Gene Ontology (GO: BP), KEGG pathway and Disease Ontology (DO) enrichment analyses using the top 100 differential genes in the ToppGene Suite database (https://toppgene.cchmc.org/enrichment.jsp accessed on 20 December 2020).
From Table 7, it is clear that the biological processes of cluster 1 are related to "epithelium development" and "cell adhesion", while the biological processes of clusters 2 and 3 are mostly related to "protein targeting" and "protein localization". Moreover, it is interesting to note that all three clusters are associated with anemia in the DO enrichment analysis. A possible explanation is that GBM patients treated with TMZ can develop aplastic anemia [44].
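The ANOVA-based gene ranking used in the analyses above boils down to a one-way F statistic per gene across the patient clusters (in practice `scipy.stats.f_oneway` or similar would be used; the manual version below is an illustrative sketch):

```python
import numpy as np

def anova_f(groups):
    """One-way ANOVA F statistic for one gene across patient clusters.

    groups is a list of 1-D arrays, one array of expression values per
    cluster; a larger F (smaller p) ranks the gene as more differential.
    """
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_vals = np.concatenate(groups)
    grand = all_vals.mean()
    k, n = len(groups), len(all_vals)
    # between-cluster and within-cluster sums of squares
    ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

Ranking genes by descending F (equivalently, ascending p value) and keeping the top entries reproduces the "top 20" and "top 100" selections described in the text.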

Conclusions
In summary, the proposed LRCMC has two key characteristics: (1) it can provide an appropriate weight for each view; (2) the tasks of constructing the affinity matrix of each view, learning the fused matrix and clustering are completed simultaneously in one system. Furthermore, LRCMC has the following two advantages in practice: (1) there is no need to spend a lot of time choosing appropriate parameters; (2) the final consensus graph is already partitioned into the given categories without an additional base clustering algorithm. We demonstrated the power of LRCMC using four benchmark datasets and four cancer datasets. The experiments show that LRCMC achieves good clustering evaluations. The cancer subtype recognition results on the GBM data show that LRCMC can effectively capture cancer subtypes with specific biological characteristics based on omics data.
In addition, we must admit that LRCMC also has shortcomings and limitations. It is not suitable for binary data (somatic mutation) or categorical data (copy number states: loss/normal/gain), and has only limited application to continuous data (mRNA expression) to identify cancer subtype. It also does not have the ability to find the gene modules that affect differences in each subtype. Therefore, we will continue our efforts to improve and extend the LRCMC algorithm to explore cancer heterogeneity.
Author Contributions: S.G. and J.L. conceived and designed the approach; S.G. and J.L. programmed the algorithm; S.G., X.W. and Y.C. analyzed the results of the experiments; S.G., J.L. and X.W. wrote the paper. All authors have read and agreed to the published version of the manuscript.
Institutional Review Board Statement: Ethical review and approval were waived for this study because the samples were from the public TCGA database.

Informed Consent Statement: Not applicable.
Data Availability Statement: Data are contained within the article and supplementary materials.