Integrated Multi-Omics Analysis Identifies Novel Prognostic and Diagnostic Hub Genes in Colorectal Cancer
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Colorectal cancer certainly is a major killer and there is a need to identify prognostic and discriminatory genes. In this study, the authors performed a 'meta-analysis'-like study of 4 separately published transcriptome data sets from colorectal tumours and healthy controls. This identified 128 genes with prognostic potential and 23 'hub' genes with discriminatory potential. Many of these genes have not been flagged up before.
This study is interesting, the methods are valid, and I believe the results. The major issue I have with the manuscript as it is presented now is that the data presentation and text obscure its potential impact. To put it in a nutshell, the authors should drastically shorten the manuscript, highlight key findings in their figures and streamline the whole. Many data that are described at length in the text of the RESULTS section can be presented in tables and summarised. Many figures could be relegated to supplementary figures or equivalent. Many of the figure segments are too small. The DISCUSSION is far too long. In all, the whole manuscript should be shortened by 50-70%. I will go through some details below:
(1) Much of the analysis is based on the WGCNA but the key reference for this is not provided which is strange. I guess it should be:
Langfelder P, Horvath S. BMC Bioinformatics. 2008 Dec 29;9:559. doi: 10.1186/1471-2105-9-559. PMID: 19114008.
(2) The volcano plots of Figure 1 are not very informative. They could go as supplementary figures. It would be nice to have some of the genes that really stand out in terms of significance and fold change labelled in the plots.
(3) The heat maps of Figure 2 again seem supplementary, and the labels of the genes on the right are too small to read.
(4) Figure 3 is ok as main figure.
(5) The gene ontology (Figures 4, 6) is described excessively in the text, with GO terms listed etc. Much is shown in the figure anyway and could be listed in a table. Figures 5 and 7 should probably be supplementary figures.
(6) The individual Kaplan-Meier plots (scheme 3) are too small, especially the text in them, but they represent key findings. The authors should select some of the key plots as major figures and relegate most as supplementary data.
(7) Figure 10 could be improved, the individual panels are too small.
(8) The presentation and description in the RESULTS text of figure 12 is confusing. I cannot distinguish any miRNA in the figure itself, so what is the point? Maybe the authors should make a scheme of what they describe in the text instead.
(9) The DISCUSSION is far too long; it should be drastically shortened, and some of the info (e.g. function of key hub genes) could be summarised in a table. A large fraction of the DISCUSSION just repeats what has already been described in the RESULTS section; removing this alone would shorten it considerably. As it is, the DISCUSSION obscures what is truly novel, and there is a lack of discussion of next steps and limitations of the study.
Author Response
The authors would like to thank the reviewers for their time and careful evaluation of the manuscript. Their valuable comments will surely help in improving the quality of the manuscript. The point-by-point answers to the reviewers' comments are given below:
Reviewer 1: Colorectal cancer certainly is a major killer and there is a need to identify prognostic and discriminatory genes. In this study, the authors performed a 'meta-analysis'-like study of 4 separately published transcriptome data sets from colorectal tumours and healthy controls. This identified 128 genes with prognostic potential and 23 'hub' genes with discriminatory potential. Many of these genes have not been flagged up before.
This study is interesting, the methods are valid, and I believe the results. The major issue I have with the manuscript as it is presented now is that the data presentation and text obscure its potential impact. To put it in a nutshell, the authors should drastically shorten the manuscript, highlight key findings in their figures and streamline the whole. Many data that are described at length in the text of the RESULTS section can be presented in tables and summarised. Many figures could be relegated to supplementary figures or equivalent. Many of the figure segments are too small. The DISCUSSION is far too long. In all, the whole manuscript should be shortened by 50-70%. I will go through some details below:
(1) Much of the analysis is based on the WGCNA but the key reference for this is not provided which is strange. I guess it should be:
Langfelder P, Horvath S. BMC Bioinformatics. 2008 Dec 29;9:559. doi: 10.1186/1471-2105-9-559. PMID: 19114008.
Author’s response: The correct reference has been added to the text.
(2) The volcano plots of Figure 1 are not very informative. They could go as supplementary figures. It would be nice to have some of the genes that really stand out in terms of significance and fold change labelled in the plots.
Author’s response: As suggested, the figure has been moved to the supplementary figures.
(3) The heat maps of Figure 2 again seem supplementary, and the labels of the genes on the right are too small to read.
Author’s response: As suggested, the figure has been moved to the supplementary figures.
(4) Figure 3 is ok as main figure.
Author’s response: As suggested, the figure has been retained as a main figure.
(5) The gene ontology (Figures 4, 6) is described excessively in the text, with GO terms listed etc. Much is shown in the figure anyway and could be listed in a table. Figures 5 and 7 should probably be supplementary figures.
Author’s response: As suggested, Figures 5 and 7 have been moved to the supplementary figures.
(6) The individual Kaplan-Meier plots (scheme 3) are too small, especially the text in them, but they represent key findings. The authors should select some of the key plots as major figures and relegate most as supplementary data.
Author’s response: All figures represented as schemes are supplementary figures. The editorial office has mistakenly put them in the main text.
(7) Figure 10 could be improved, the individual panels are too small.
Author’s response: The suggested changes have been made in both Figures 10 and 11, which should be presented as landscape images in the main text for better clarity.
(8) The presentation and description in the RESULTS text of figure 12 is confusing. I cannot distinguish any miRNA in the figure itself, so what is the point? Maybe the authors should make a scheme of what they describe in the text instead.
Author’s response: There are too many miRNAs in the figure to label them individually. The figure has been moved to the supplementary figures, and supporting information has been provided as an Excel sheet in the supplementary information.
(9) The DISCUSSION is far too long; it should be drastically shortened, and some of the info (e.g. function of key hub genes) could be summarised in a table. A large fraction of the DISCUSSION just repeats what has already been described in the RESULTS section; removing this alone would shorten it considerably. As it is, the DISCUSSION obscures what is truly novel, and there is a lack of discussion of next steps and limitations of the study.
Author’s response: As suggested by the reviewer, the Discussion has been shortened by almost 50% and the references have been reduced from 272 to 136. To conclude the Discussion, we have included the following lines at its end: “Although this study integrates multi-omic analysis to identify genes with prognostic and diagnostic significance in CRC, it remains an exploratory analysis. The findings from this study must be validated in independent cohorts, and future studies that incorporate patient metadata and functional validation are essential to elucidate the mechanistic roles of these hub genes and their use as clinical biomarkers.”
Reviewer 2 Report
Comments and Suggestions for Authors
The authors conducted an integrated multi-omics analysis to address non-invasive biomarkers in colorectal cancer, by finding differentially expressed genes between CRC and normal tissue using samples in the GEO database. They then identified hub genes from this analysis using WGCNA and PPI, and obtained candidate genes which were validated for prognostic significance via survival analysis. They identify 23 genes with high discriminatory power using ... they also performed an integrative analysis of immune cell infiltration, genetic alterations and promoter methylation, and finally constructed a multiomic hub network.
The authors are able to find a list of genes at the end, but a literature search reveals that none of these are unknown regulators of CRC or other cancers in general; they are known mediators of cancer biology. To truly be able to find biomarkers for CRC, one would need to show that these biomarkers are truly enriched in particular CRC patient populations. Additionally, these genes and the hub genes are not indicative of the corresponding protein being made, and are not indicative of any mechanism fundamental to CRC (which is a broad spectrum of disease). The initial GEO datasets have not been studied for their variation in metadata, and we are not able to conclude this broadly for CRC. Additionally, none of these genes have been functionally validated, nor has their specific role in disease progression been illustrated.
The authors would need to perform a deeper study into these genes and provide a mechanistic interpretation of all their results, rather than using algorithms to shortlist genes. Perhaps they could use TCGA itself, image data, or other kinds of data modalities (single cell, ATAC etc.) to explore the connection between these genes and CRC, and how the population expressing these genes is truly distinct.
Author Response
Reviewer 2: The authors conducted an integrated multi-omics analysis to address non-invasive biomarkers in colorectal cancer, by finding differentially expressed genes between CRC and normal tissue using samples in the GEO database. They then identified hub genes from this analysis using WGCNA and PPI, and obtained candidate genes which were validated for prognostic significance via survival analysis. They identify 23 genes with high discriminatory power using ... they also performed an integrative analysis of immune cell infiltration, genetic alterations and promoter methylation, and finally constructed a multiomic hub network.
The authors are able to find a list of genes at the end, but a literature search reveals that none of these are unknown regulators of CRC or other cancers in general; they are known mediators of cancer biology. To truly be able to find biomarkers for CRC, one would need to show that these biomarkers are truly enriched in particular CRC patient populations. Additionally, these genes and the hub genes are not indicative of the corresponding protein being made, and are not indicative of any mechanism fundamental to CRC (which is a broad spectrum of disease). The initial GEO datasets have not been studied for their variation in metadata, and we are not able to conclude this broadly for CRC. Additionally, none of these genes have been functionally validated, nor has their specific role in disease progression been illustrated.
The authors would need to perform a deeper study into these genes and provide a mechanistic interpretation of all their results, rather than using algorithms to shortlist genes. Perhaps they could use TCGA itself, image data, or other kinds of data modalities (single cell, ATAC etc.) to explore the connection between these genes and CRC, and how the population expressing these genes is truly distinct.
Author’s response: We sincerely thank the reviewer for the detailed comments and valuable insights. Although we agree with the reviewer that functional validation is crucial to establish the role of the identified hub genes in CRC biology, such studies are well beyond the intended scope of the present study. The present study is a bioinformatics and integrative multi-omics analysis to identify potential prognostic and diagnostic biomarkers for CRC. The primary objective of this study was to use publicly available datasets to integrate expression data with immune infiltration, network analysis, mutation data and promoter methylation patterns for identifying key genes that can serve as novel diagnostic and prognostic biomarkers for CRC. The study does not aim to experimentally validate the identified genes or their underlying molecular mechanisms. The identification of several hub genes that have been previously implicated in other cancers further strengthens the reliability of our study and provides new insights into immune associations, co-regulatory networks, mutations and methylation patterns of these genes in CRC.
We also appreciate the reviewer’s comment about variations in GEO datasets. To address this, we have included the following lines at the end of the Discussion: “Although this study integrates multi-omic analysis to identify genes with prognostic and diagnostic significance in CRC, it remains an exploratory analysis. The findings from this study must be validated in independent cohorts, and future studies that incorporate patient metadata and functional validation are essential to elucidate the mechanistic roles of these hub genes and their use as clinical biomarkers.”
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The authors addressed the issues that I raised and improved the manuscript.
