Understanding the Function of a Locus Using the Knowledge Available at Single-Nucleotide Polymorphisms
Round 1
Reviewer 1 Report
In the present manuscript, Nikpay et al. present an algorithm for elucidating the functional consequences of a locus of interest, using SNP-based annotation from multi-omics data for a phenotype of interest.
This is an important study that contributes to addressing the problem of extracting meaningful information across diverse datasets. The methodological approach of attributing gene function to the phenotype using published GWAS data and SNP-based annotation is sound.
Minor comments:
- The many steps of the algorithm are not presented in a structured way, and the implemented workflow is not easy to follow. An updated, more clearly arranged graphical representation of the methods, including the underlying software (or thresholds/parameters) used at each methodological step, would be very helpful to the reader. For example, I suggest updating the Fig. 1 workflow diagram to indicate that the P<5e-8 threshold and SMR were used in the QTL mapping step and that GSMR was used in the statistical analysis step of the algorithm.
- The output format of the algorithm is not clearly described. It would be helpful for readers if an explanation of the generated output were included in the Methods section. Moreover, users of this workflow would benefit from comprehensive documentation (a README) describing the output format on the corresponding GitHub page.
- The paper places little emphasis on the clinical implications and benefits of this algorithm/methodology. The Introduction and Discussion should also address potential clinical implications.
- The abbreviations for single-nucleotide polymorphisms (SNPs) and linkage disequilibrium (LD) are defined at first use, but the abbreviation QTL is not defined.
Author Response
Please see the attachment.
Author Response File: Author Response.docx
Reviewer 2 Report
I think the authors raise an important issue in biological research: a streamlined and reproducible analytical approach can be crucial for reporting new findings. While the authors propose a specific workflow to investigate and address this issue, some major improvements are needed:
- In the Introduction section, you mention the tedious task of referencing annotation tracks, etc., but it is not clear to me how your pipeline can help biologists with these tasks. I think it would be better to specify the scope of your tool here.
- What is your rationale for using only rare variants? How accurate are the haplotypes and tag SNPs identified from these rare variants? Would a set of more common variants be more useful in this case?
- More details should be provided for your pipeline. For example, did you perform harmonization of results from different GWAS? What is the format for the phenotype file? Will the provided phenotype file be used as ‘exposure’ or ‘outcome’ in the Mendelian randomization analysis?
- Based on your examples, especially the second one (APOE), the number of identified SNPs used as instrumental variables seems small. Do you think your approach has enough power to be applicable to other regions of the genome? If not, I think it would be helpful to discuss this and other limitations regarding the power of detection in the manuscript.
Author Response
Please see the attachment.
Author Response File: Author Response.docx