Article
Peer-Review Record

Improving ISOMAP Efficiency with RKS: A Comparative Study with t-Distributed Stochastic Neighbor Embedding on Protein Sequences

J 2023, 6(4), 579-591; https://doi.org/10.3390/j6040038
by Sarwan Ali and Murray Patterson *
Submission received: 2 July 2023 / Revised: 27 October 2023 / Accepted: 29 October 2023 / Published: 31 October 2023

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Dear Authors,

 

We recently had the opportunity to read your manuscript titled “Improving Isomap Efficiency with RKS: A Comparative Study with t-SNE on Bioinformatics-based Protein Sequences”, and we would like to share our comments on your work.

 

Your study proposes and evaluates an approximate, more efficient version of the Isomap dimensionality reduction algorithm that uses Random Kitchen Sinks (RKS) to approximate pairwise distances, enabling meaningful low-dimensional embeddings of protein sequence data.
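For readers unfamiliar with the technique summarized above, the following is a minimal sketch of Random Kitchen Sinks (random Fourier features) for an RBF kernel, followed by pairwise distances computed in the resulting feature space. It is not the authors' implementation: the function names, the choice of RBF kernel, and the parameters `n_features`, `gamma`, and `seed` are assumptions made purely for illustration.

```python
import numpy as np


def rks_features(X, n_features=2000, gamma=0.5, seed=0):
    """Map rows of X to random Fourier features whose inner products
    approximate the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2).

    All parameter names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies are drawn from the kernel's spectral density N(0, 2*gamma*I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    # Random phase offsets, uniform on [0, 2*pi).
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)


def approx_kernel_distances(Z):
    """Pairwise Euclidean distances between the rows of the feature matrix Z."""
    sq = np.sum(Z * Z, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.sqrt(np.maximum(d2, 0.0))
```

Under these assumptions, `Z @ Z.T` approximates the kernel matrix without evaluating the kernel for every pair of points, and the distance matrix returned by `approx_kernel_distances` could stand in for the exact one when building a nearest-neighbor graph.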

 

Nevertheless, here are some comments on areas where the quality and readability of the manuscript could be improved:

 

Introduction:

1.      The introduction provides adequate background on dimensionality reduction techniques like Isomap and t-SNE. However, the specific research aims and objectives are not clearly stated. The introduction would benefit from more precisely defining the research questions.

2.     The rationale for approximating Isomap with RKS could be expanded - why is improving efficiency important in this context?

3.     The introduction could cite more recent related work on approximate Isomap methods.

 

Methods:

4.     The proposed approximate Isomap approach using RKS is described clearly. However, details are lacking on parameter selection, such as how the number of nearest neighbors was chosen.

5.     More information is needed on how the various sequence embedding methods (k-mers, minimizers, PWM) were implemented.

6.     The limitations of RKS and the proposed approach should be discussed.
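As an illustration of the kind of implementation detail requested in comment 5, a k-mer spectrum embedding can be sketched as follows. This is a hypothetical example, not the authors' code: the function name, the 20-letter amino acid alphabet, and the default `k=3` are assumptions.

```python
from collections import Counter
from itertools import product

# Standard 20-letter amino acid alphabet (an assumption for this sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_spectrum(seq, k=3, alphabet=AMINO_ACIDS):
    """Return a fixed-length vector of k-mer counts for one sequence.

    The vector has len(alphabet) ** k entries, one per possible k-mer,
    ordered lexicographically with respect to the alphabet string.
    """
    # Slide a window of width k over the sequence and count each substring.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts.get("".join(p), 0) for p in product(alphabet, repeat=k)]
```

With a two-letter alphabet, `kmer_spectrum("ACAC", k=2, alphabet="AC")` counts the windows AC, CA, and AC and returns `[0, 2, 1, 0]` for the ordered 2-mers AA, AC, CA, CC.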

 

Results:

7.     The results focus on comparing ISOMAP and t-SNE but don't directly address the research aims in the introduction. The aims should be stated more explicitly.

8.     The presentation of the results could be streamlined: some metrics, such as MSE and MAE, appear redundant.

9.     More interpretation of the results is needed, rather than just describing the performance metrics.

 

Discussion:

10.  The discussion summarizes the key findings but lacks depth. More interpretation of the results and their significance is needed.

11.  Limitations of the evaluation methodology should be addressed, like issues with using reconstruction error for assessment.

12.  Implications of the findings for protein sequence analysis are not really explored.

 

Conclusions:

13.  The conclusion reiterates the findings but does not sufficiently highlight their implications and significance to the field.

14.  The aims outlined in the introduction are not directly revisited.

15.  Future work is mentioned but quite vague - more specific future research directions could be outlined.

16.  Finally, the manuscript does not mention Quantum Kitchen Sinks (QKS), which, as you may know, is another technique used for dimensionality reduction with some advantages over RKS (it preserves global data structures and relationships, leading to more meaningful low-dimensional embeddings; it can handle larger and more complex datasets; it requires fewer qubits and less circuit depth to achieve a good approximation, so it is more efficient; etc.). Some comments about this and other alternatives should be added to the manuscript, explaining why the authors chose RKS instead of other available options.

 

Regarding the grammar, the manuscript is well-written in clear, concise English. However, there are some areas where the grammar, word choice, and flow could be improved:

 

·      There are several grammatical errors throughout, including issues with subject-verb agreement, article use (a vs an), and tense consistency. Carefully proofreading and editing would help correct these kinds of errors.

·      Some sentences are choppy or awkwardly phrased, disrupting the flow. Smoothing out the sentence structures and transitions would improve readability.

·      Certain terms and phrases are overused, such as "the motivation behind this research is", which appears multiple times (3 times on one page). Using more varied wording for this and other sentences would enhance the writing style.

·      Abbreviations and acronyms are introduced inconsistently. Following a standard format would improve clarity.

·      The tone is generally appropriate for a manuscript, but some informal word choices detract from the scientific style.

 

Once again, thank you very much for your work. We look forward to your responses to our comments.

 

Kindest regards,

 

 


Author Response

We have attached a separate PDF file that includes a point-by-point response to the reviewer's comments.

Reviewer 2 Report

Comments and Suggestions for Authors

Dear authors, congratulations on your work. I think it is necessary to explain some elements, add more information, and improve the description of your algorithm.

 

Algorithm 1 includes a mixture of numpy functions and algorithmic instructions. Please fix it.

Line 266: "we set the value of k to 3, which was determined using a validation set approach". Please include details about this approach.

 

Your results are important. How do you explain them? Why are the differences so big?

 

Table 4 summarizes your results, but there is no further evidence of the process used to compare the two algorithms. Please add more details of your intermediate results.

 

Section 5, Results and Discussion, must be improved with more details about how ISOMAP outperforms t-SNE.

 

Finally, please check for typos.

Author Response

We have attached a separate PDF file that includes a point-by-point response to the reviewer's comments.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Dear Authors,

 

Thank you for the opportunity to review your manuscript titled "Improving ISOMAP Efficiency with RKS: A Comparative Study with t-SNE on Protein Sequences". I appreciate your effort in modifying the manuscript and sharing your work with me.

 

While the study offers some interesting ideas on improving ISOMAP's efficiency for dimensionality reduction, there are several major issues that need to be addressed before I could recommend this manuscript for publication.

 

Most critically, the specific research aims and objectives are unclear in the introduction. The rationale behind approximating ISOMAP with RKS needs more explanation, and recent related work should be cited.

 

Additionally, key details are lacking in the methods section regarding parameter selection, implementation of sequence embeddings, and limitations. The results focus heavily on metric comparisons without interpreting the findings, and the discussion requires more depth and attention to implications and limitations.

 

The conclusion does not sufficiently highlight the significance of the work, revisit the original aims, or outline future directions. And there are opportunities to compare against other approaches like Quantum Kitchen Sinks.

 

Finally, while generally well-written, there are numerous grammatical errors throughout that need to be corrected, along with some awkward phrasing that disrupts the flow. Tightening up the writing style would enhance the quality.

 

In summary, with major revisions to the introduction, methods, results, discussion, conclusion, and language quality, I believe the manuscript could be substantially strengthened. Please feel free to contact me if you would like any clarification or have additional questions on my comments. I'm happy to review a revised version that thoroughly addresses these issues.

 

Thank you again for allowing me to review your work. I look forward to seeing how it develops with future research.

 

Kindest regards,

 

 

Comments on the Quality of English Language

While generally well-written, there are numerous grammatical errors throughout the manuscript that need to be corrected, along with some awkward phrasing that disrupts the flow. Tightening up the writing style would enhance the quality.

Author Response

We have included a PDF document that contains our detailed response to both reviewers' comments.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

Dear authors, your paper has been improved; it is now easier to read and understand.

Regarding the algorithm, I have marked some parts that need modification. There are mistakes in your acronyms, which I have highlighted.

I have a question related to the text on line 223. Is the sequence correct?

I also have a question about equations 4 and 5. Are they equal?

These changes enhance the clarity and correctness of your message.

Please check the attached document.

Comments for author File: Comments.pdf

Comments on the Quality of English Language

Authors, please check for typos.

Author Response

We have included a PDF document that contains our detailed response to both reviewers' comments.

Author Response File: Author Response.pdf

Round 3

Reviewer 1 Report

Comments and Suggestions for Authors

Dear Authors,

 

Thank you for the opportunity to review, once more, your manuscript titled "Improving ISOMAP Efficiency with RKS: A Comparative Study with t-SNE on Protein Sequences". I appreciate your effort in modifying the manuscript and sharing your work with me.

 

While the current version presents several modifications, I would recommend that you discuss the limitations of the study (the comparison between ISOMAP and RKS has been demonstrated with only one dataset) and also the conclusions of the research itself, because the conclusion does not sufficiently highlight the significance of the work.

 

Also, I would recommend that the manuscript be revised by a native speaker, because it contains several typos, grammar mistakes, etc.

 

Thank you again for allowing me to review your work. I look forward to seeing how it develops with future research.

 

Kindest regards,

 

 

Comments on the Quality of English Language

I would recommend that the manuscript be revised by a native speaker, because it contains several typos, grammar mistakes, etc.

Author Response

We thank the reviewer for their valuable comments. We have adjusted the limitations and conclusion sections based on the reviewer's suggestions. We also want to mention that the paper already shows results for three different datasets of protein sequences. Nevertheless, we now note in the limitations section that more diverse biological datasets, such as nucleotide sequences and short reads, could be considered, which we will explore in the future.

We have also proofread the paper and corrected the grammatical issues we found.
