Article
Peer-Review Record

A Method for Calculating Whole-Genome Sequencing Outcomes from Trio Data

Algorithms 2025, 18(10), 610; https://doi.org/10.3390/a18100610
by Nikita Koltunov 1, Egor Guguchkin 1, Oleg Samovarov 1, Liudmila Mikhailova 2,3 and Evgeny Karpulevich 4,5,*
Reviewer 1: Anonymous
Reviewer 2:
Submission received: 23 August 2025 / Revised: 22 September 2025 / Accepted: 26 September 2025 / Published: 29 September 2025
(This article belongs to the Special Issue Advanced Algorithms for Biomedical Data Analysis)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

In "A Method for Calculating Whole-Genome Sequencing Outcomes from Trio Data," the authors develop an algorithm that computes Mendelian-consistency scores for WGS trio data, implement it in C++, and integrate it into a Nextflow workflow. They evaluate the pipeline with two variant callers, DeepVariant and HaplotypeCaller, on two datasets and conclude that DeepVariant outperforms HaplotypeCaller.
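
For readers unfamiliar with the term, a Mendelian-consistency check at a single autosomal site can be sketched as below. This is only an illustration of the general concept, not the authors' implementation: the `Genotype` type and the function names are hypothetical, genotypes are assumed to be fully called diploid allele indices (0 = reference, 1 and above = alternate), and de novo mutations, sex chromosomes, and missing calls are ignored.

```cpp
#include <array>
#include <cassert>

// Hypothetical representation: a diploid genotype as two allele indices.
using Genotype = std::array<int, 2>;

// True if the genotype carries the given allele on either haplotype.
bool has_allele(const Genotype& g, int allele) {
    return g[0] == allele || g[1] == allele;
}

// True if the child could have received one allele from the mother and the
// other from the father. Assumes all alleles are called (non-negative).
bool is_mendelian_consistent(const Genotype& child,
                             const Genotype& mother,
                             const Genotype& father) {
    assert(child[0] >= 0 && child[1] >= 0);
    assert(mother[0] >= 0 && mother[1] >= 0);
    assert(father[0] >= 0 && father[1] >= 0);
    return (has_allele(mother, child[0]) && has_allele(father, child[1])) ||
           (has_allele(mother, child[1]) && has_allele(father, child[0]));
}
```

A per-trio score could then be formed by aggregating this check over all sites; how the manuscript defines its score is not specified in this record.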

Minor issues:

  1. Where can readers access the C++ code that is integrated with the Nextflow workflow? Please provide a link to a public repository and add it to the manuscript.
  2. The test datasets are not available, possibly due to privacy or IRB constraints. Why not use well-known, publicly available trio datasets (e.g., NA12878 with parents NA12891/NA12892, or HG00733 with parents HG00731/HG00732) so that readers can reproduce and compare results? Including more information, such as accession numbers, exact software versions, and parameter settings, would also help ensure reproducibility.

Author Response

Comment 1: Where can readers access the C++ code that is integrated with the Nextflow workflow? Please provide a link to a public repository and add it to the manuscript.

Response 1: Thank you for your comment. A link has been added to the "Data Availability Statement" section of the manuscript: https://github.com/ispras/sarek_trio

Comment 2: The test datasets are not available, possibly due to privacy or IRB constraints. Why not use well-known, publicly available trio datasets (e.g., NA12878 with parents NA12891/NA12892, or HG00733 with parents HG00731/HG00732) so that readers can reproduce and compare results? Including more information, such as accession numbers, exact software versions, and parameter settings, would also help ensure reproducibility.

Response 2: Thank you for your comment. The module was intentionally tested on data provided by UFIC RAS in order to compare the performance of DeepVariant and HaplotypeCaller in under-represented populations. Datasets such as the HG and NA trios are often used to train models (e.g., DeepVariant), so results obtained on them may be biased.


Reviewer 2 Report

Comments and Suggestions for Authors

I am very glad to see a deterministic algorithm proposed in an era when AI bubble papers are rampant. The algorithm is explainable and can be implemented in practice. Here are my comments.

1) Please add an introductory paragraph between the titles of Section 2 and Section 2.1.

2) The font family of the text in Figure 4 is informal. Please change the font family to "Times New Roman", making it aligned with that used in the content.

3) Normally, screenshots should not appear in the research paper. Please consider adjusting the presentation.

4) Some English expressions read like translations from Chinese or Russian. For example, in Lines 220-221, the sentence "Because std::set keeps its elements sorted, these operations run in linear time, allowing scalable analysis of large genomic datasets" is not well written. Please improve the English.

5) Please improve the writing style and academic expression.

Overall, I would like to make a "Minor Revision" decision.

Author Response

Thank you very much for your comments!

Comment 1: Please add an introductory paragraph between the titles of Section 2 and Section 2.1.

Response 1: We have added an introductory paragraph between the headings of Sections 2 and 2.1.

Comment 2: The font family of the text in Figure 4 is informal. Please change the font family to "Times New Roman", making it aligned with that used in the content.

Response 2: All images have been updated.

Comment 3: Normally, screenshots should not appear in the research paper. Please consider adjusting the presentation.

Response 3: Screenshots have been deleted or remade.

Comment 4: Some English expressions read like translations from Chinese or Russian. For example, in Lines 220-221, the sentence "Because std::set keeps its elements sorted, these operations run in linear time, allowing scalable analysis of large genomic datasets" is not well written. Please improve the English.

Response 4: We have revised the sentence as follows: "As std::set keeps its elements sorted, these operations run in linear time. It allows scalable analysis of large genomic datasets."
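
The linear-time claim in the quoted sentence holds for merge-style operations over two sorted containers, such as the standard algorithms `std::set_intersection` and `std::set_difference`. The following is a minimal sketch under the assumption that "these operations" refers to such set operations. The `VariantKey` format and the function names are hypothetical and are not taken from the manuscript or its repository.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical variant key, e.g. "chr1:100:A>G"; the format is an assumption.
using VariantKey = std::string;

// Keys present in both sets. std::set iterates in sorted order, so
// std::set_intersection completes in one merge-style pass whose number of
// comparisons is linear in a.size() + b.size().
std::vector<VariantKey> shared_variants(const std::set<VariantKey>& a,
                                        const std::set<VariantKey>& b) {
    std::vector<VariantKey> out;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(out));
    assert(std::is_sorted(out.begin(), out.end()));  // output stays sorted
    return out;
}

// Keys present in the first set but not in the second, also in linear time.
std::vector<VariantKey> only_in_first(const std::set<VariantKey>& a,
                                      const std::set<VariantKey>& b) {
    std::vector<VariantKey> out;
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::back_inserter(out));
    return out;
}
```

Calling `shared_variants(child, mother)` on two such sets yields the variants both samples carry, and `only_in_first(child, mother)` yields those seen only in the child. Individual lookups or insertions in `std::set` remain logarithmic; only the pairwise traversal is linear.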

 

Comment 5: Please improve the writing style and academic expression.

Response 5: We have corrected the sentences with informal or vague expressions.

