Open Access Article

Peer-Review Record

Joint Representation Learning for Retrieval and Annotation of Genomic Interval Sets

Bioengineering 2024, 11(3), 263; https://doi.org/10.3390/bioengineering11030263

by Erfaneh Gharavi^{1,2}, Nathan J. LeRoy^{1,3}, Guangtao Zheng^{4}, Aidong Zhang^{2,3,4}, Donald E. Brown^{2,5} and Nathan C. Sheffield^{1,2,3,4,6,7,8,*}

Reviewer 1: Jiayin Wang

Reviewer 2: Anonymous

Reviewer 3: Anonymous

Submission received: 14 December 2023 / Revised: 20 February 2024 / Accepted: 22 February 2024 / Published: 8 March 2024

(This article belongs to the Special Issue AI and Big Data Research in Biomedical Engineering)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Summary and Strengths:

The authors propose a fast search method for large-scale genomic interval data.

This method trains numerical embeddings for region sets and their metadata labels, and captures similarity between region sets and their metadata in a low-dimensional space.

This method successfully solves three information retrieval tasks using embedding distance computations: retrieving region sets related to a user query string, suggesting new labels for database region sets, and retrieving database region sets similar to a query region set.
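The three tasks above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the manuscript: the two-dimensional vectors are made-up values, the names `set_A`, `liver`, and so on are hypothetical, and cosine similarity stands in for whatever similarity measure the authors actually use. The sketch only shows how, once region sets and labels share one embedding space, each task reduces to a nearest-neighbor lookup.

```python
import numpy as np


def cosine_similarity(query, matrix):
    """Cosine similarity between one vector and each row of a matrix."""
    query = query / np.linalg.norm(query)
    matrix = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix @ query


def top_k(query, matrix, names, k=1):
    """Return the k names whose embeddings are most similar to the query."""
    scores = cosine_similarity(query, matrix)
    order = np.argsort(-scores)[:k]  # highest similarity first
    return [names[i] for i in order]


# Hypothetical embeddings in a shared 2-D space (made-up values).
label_names = ["liver", "blood"]
label_vecs = np.array([[1.0, 0.0], [0.0, 1.0]])
set_names = ["set_A", "set_B", "set_C"]
set_vecs = np.array([[0.9, 0.1], [0.2, 0.8], [0.1, 0.9]])

# Task 1: query label -> related region sets.
by_label = top_k(label_vecs[0], set_vecs, set_names, k=1)

# Task 2: database region set -> suggested labels.
suggested = top_k(set_vecs[1], label_vecs, label_names, k=1)

# Task 3: query region set -> similar region sets (includes itself).
similar = top_k(set_vecs[1], set_vecs, set_names, k=2)

print(by_label, suggested, similar)
```

With these toy vectors, the label `liver` retrieves `set_A`, the region set `set_B` is assigned the label `blood`, and the sets most similar to `set_B` are `set_B` itself followed by `set_C`.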

The experimental results demonstrate the effectiveness of the proposed method.

Weaknesses:

1. The StarSpace method should be formulated more clearly, e.g., how the original genomic interval data are embedded into a low-dimensional representation and how the similarity is calculated.

2. StarSpace is an older method, presented in 2018. Why was this method chosen for the genomic search task? The authors should either explain theoretically why it is suited to this task or compare it with newer methods in practice.

Author Response

Review response uploaded as PDF.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

Retrieval and annotation of genomic interval sets is a challenging task but has significant value for researchers. In this manuscript, the authors leveraged deep learning approaches to learn the representation of genomic intervals and their annotations. They reported the detailed implementation and used three cases to demonstrate the utility of their search system. This method overcomes the limitations of traditional methods by combining sequence information and metadata labels, and thus has the potential to infer the biological functions of those genomic intervals. The authors also point out future directions for further improvement, in which I am very interested. Overall, this is a very good paper, though the performance may be limited by the currently available training datasets.

Author Response

Review response uploaded as PDF.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The biggest issue with this article is that no code is available. Where is the code? How can researchers reproduce your work? Please put your code on GitHub with a README file immediately.

Author Response

Review response uploaded as PDF.

Author Response File: Author Response.pdf

Round 2

Reviewer 3 Report

Comments and Suggestions for Authors

The code has now been made available on GitHub. Thank you for making this revision.
