Graph Theoretic and Pearson Correlation-Based Discovery of Network Biomarkers for Cancer
Abstract
Two graph theoretic concepts, cliques and bipartite graphs, are explored to identify network biomarkers for cancer at the gene network level. The rationale is that a group of genes works together, forming a cluster or clique-like structure, to initiate a cancer. After initiation, the disease signal passes to the next group of genes, which is related to the second stage of the cancer; this hand-off can be represented as a bipartite graph. In other words, bipartite graphs represent the cross-talk among genes between two disease stages. To test this hypothesis, gene expression values for three cancers, breast invasive carcinoma (BRCA), colorectal adenocarcinoma (COAD), and glioblastoma multiforme (GBM), are analyzed. First, a co-expression gene network is generated from highly correlated gene pairs, that is, pairs with a Pearson correlation coefficient ≥ 0.9. Second, clique structures of all sizes are isolated from the co-expression network. Then, by combining these cliques, three types of biomarker modules are developed: maximal clique-like modules, 2-clique-1-bipartite modules, and 3-clique-2-bipartite modules. The list of biomarker genes discovered from these network modules is validated as essential for causing cancer in terms of network properties and survival analysis. This list of biomarker genes will help biologists design wet lab experiments to further elucidate the complex mechanism of cancer.
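The first two steps of the pipeline (thresholding Pearson correlations at 0.9 to build a co-expression network, then isolating cliques) can be illustrated with a minimal sketch. This is not the authors' implementation: the gene names `G1` to `G4`, the expression values, and the helper functions `pearson`, `coexpression_network`, and `maximal_cliques` are all hypothetical, and the basic Bron-Kerbosch recursion is an assumed choice of clique enumeration, since the abstract does not name one.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant profile has no defined correlation; treat it as 0 here.
    return cov / (sx * sy) if sx and sy else 0.0


def coexpression_network(expr, threshold=0.9):
    """Adjacency sets linking gene pairs whose correlation is >= threshold."""
    adj = {gene: set() for gene in expr}
    for g, h in combinations(expr, 2):
        if pearson(expr[g], expr[h]) >= threshold:
            adj[g].add(h)
            adj[h].add(g)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))  # r cannot be extended any further
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Hypothetical toy expression profiles (one value per sample), not real data.
expr = {
    "G1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "G2": [2.0, 4.0, 6.0, 8.0, 10.5],
    "G3": [1.1, 2.0, 3.2, 3.9, 5.1],
    "G4": [5.0, 1.0, 4.0, 2.0, 3.0],
}

network = coexpression_network(expr, threshold=0.9)
cliques = maximal_cliques(network)
largest = max(cliques, key=len)
print(largest)
```

In this toy data the three genes with near-linear profiles form one clique, while `G4` stays isolated. The sketch thresholds the signed coefficient, as the abstract states; whether strongly negative correlations should also count as edges is left open here. For real expression matrices with thousands of genes, an optimized correlation routine and a pivoting clique algorithm would likely be needed.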
Supplementary File 1: ZIP-Document (ZIP, 2018 KB)
Description: Figure S1: Maximal clique-like modules created by combining all the genes in the largest cliques of each cancer and visualized in Cytoscape: (a) BRCA (19 genes and 168 edges); (b) COAD (30 genes and 383 edges); and (c) GBM (14 genes and 87 edges). Figure S2: Top three 3-clique-2-bipartite modules: (a) COAD; (b) GBM. Table S1: Frequency of cliques according to clique size (number of nodes). The last row (yellow) for each cancer gives the size and number of maximal cliques for that cancer. Table S2: List of genes in maximal cliques and maximal clique-like modules for the three cancers: BRCA, COAD, and GBM. Table S3: List of genes for the top three 3-clique-2-bipartite modules. Table S4: List of benchmark genes in top-20 and top-50 metrics. Table S5: List of genes combining the three network modules: BRCA (47 genes), COAD (61 genes), and GBM (38 genes). Table S6: Enriched pathways with key genes. Table S7: Gene Ontology (GO) enrichment analysis of key genes.
Tanvir, R.B.; Aqila, T.; Maharjan, M.; Mamun, A.A.; Mondal, A.M. Graph Theoretic and Pearson Correlation-Based Discovery of Network Biomarkers for Cancer. Data 2019, 4, 81.