Article
Peer-Review Record

A Comparative Evaluation of Four Bioinformatic Tools for Identifying HIV-1 pol Drug Resistance Mutations Using Illumina MiSeq Data

Biology 2026, 15(5), 438; https://doi.org/10.3390/biology15050438
by Ogestelli Fabia Lee 1 and Chun Kiat Lee 2,*
Submission received: 19 January 2026 / Revised: 22 February 2026 / Accepted: 5 March 2026 / Published: 7 March 2026
(This article belongs to the Section Bioinformatics)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The manuscript addresses a relevant and timely question and is generally well designed and clearly written. Its strengths are a solid methodological framework and the practical value of the comparative evaluation for laboratories implementing HIV-1 NGS drug resistance screening.

Major concern: 

At present, the manuscript does not explore whether non-consensus minority variants above 5% are associated with patients’ prior antiretroviral exposure. The authors could, where feasible, strengthen the interpretation by reviewing clinical records for these cases to assess historical ART regimens that might plausibly explain the emergence of such mutations. This additional analysis would help distinguish biologically meaningful resistant subpopulations from potential sequencing artifacts or algorithm-specific detection biases, thereby reinforcing the validity and clinical relevance of the observed discrepancies.

Although Table 1 summarizes the reported discrepancies across pipelines, the interpretation would benefit from a graphical representation (e.g., bar plot or dot plot) showing the mutation frequencies detected by each algorithm alongside the applied detection thresholds. This would facilitate a clearer visualization of where disagreements occur and highlight that most discrepancies involve low-frequency variants, frequently below 5% and in several cases close to or below the 1% threshold. As currently presented, the text may overemphasize algorithmic discordance without sufficiently contextualizing the biological and clinical relevance of these low-abundance variants. A brief discussion addressing whether such differences are expected to impact clinical decision-making or patient outcomes would substantially strengthen the manuscript.

Minor:

The Discussion would benefit from an explicit limitations paragraph. 

Author Response

Comment 1:

Major concern: 

At present, the manuscript does not explore whether non-consensus minority variants above 5% are associated with patients’ prior antiretroviral exposure. The authors could, where feasible, strengthen the interpretation by reviewing clinical records for these cases to assess historical ART regimens that might plausibly explain the emergence of such mutations. This additional analysis would help distinguish biologically meaningful resistant subpopulations from potential sequencing artifacts or algorithm-specific detection biases, thereby reinforcing the validity and clinical relevance of the observed discrepancies.

 

Response 1: We thank Reviewer 1 for highlighting the importance of clinical correlation in validating the LAMs. We agree that reviewing historical ART regimens would further distinguish biological resistance from potential artifacts or algorithm-specific detection biases.

However, because this study used de-identified datasets to facilitate bioinformatic tool comparison, we do not have access to individual clinical records for these cases. To address this, we have added a new subsection, Section 4.5 (Limitations of the Study), acknowledging this limitation.

 

Comment 2:

Although Table 1 summarizes the reported discrepancies across pipelines, the interpretation would benefit from a graphical representation (e.g., bar plot or dot plot) showing the mutation frequencies detected by each algorithm alongside the applied detection thresholds. This would facilitate a clearer visualization of where disagreements occur and highlight that most discrepancies involve low-frequency variants, frequently below 5% and in several cases close to or below the 1% threshold. As currently presented, the text may overemphasize algorithmic discordance without sufficiently contextualizing the biological and clinical relevance of these low-abundance variants. A brief discussion addressing whether such differences are expected to impact clinical decision-making or patient outcomes would substantially strengthen the manuscript.

Response 2: We appreciate the reviewer’s suggestion to better visualize the distribution of discordant variants. We have added Figure 1, a grouped bar chart illustrating the mutation frequencies reported by each pipeline for the discordant LAMs.

Furthermore, we would like to direct the reviewer to the Discussion section, where we have detailed the clinical significance and potential for virological failure associated with the specific mutations that were undetected or discordant.
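Editorial illustration: the reviewer's point that most disagreements sit in the low-frequency range can be checked with a simple tabulation before any plotting. The sketch below is not the authors' analysis. It assumes a hypothetical input layout (one frequency in percent per pipeline, or None when a pipeline did not report the mutation), uses made-up values, and takes its band edges from the 1% and 5% thresholds mentioned in the comment.

```python
def frequency_band(freq_pct):
    """Place a frequency (in percent) into the bands named in the review."""
    if freq_pct < 1.0:
        return "<1%"
    if freq_pct < 5.0:
        return "1-5%"
    return ">=5%"


def summarize(calls):
    """Summarize each mutation as (name, band, discordant, missing pipelines).

    `calls` maps a mutation label to {pipeline: frequency in percent or None}.
    A mutation is treated as discordant when at least one pipeline did not
    report it. The band is taken from the highest reported frequency.
    """
    rows = []
    for mutation, by_pipeline in calls.items():
        seen = {p: f for p, f in by_pipeline.items() if f is not None}
        if not seen:
            continue  # no pipeline reported it, nothing to compare
        missing = sorted(set(by_pipeline) - set(seen))
        band = frequency_band(max(seen.values()))
        rows.append((mutation, band, bool(missing), missing))
    return rows


# Hypothetical values for illustration only; not data from the manuscript.
example_calls = {
    "mutation_A": {"Exatype": None, "Quasitools": 0.8, "iLunaR": 0.9},
    "mutation_B": {"Exatype": 12.0, "Quasitools": 11.5, "iLunaR": 12.3},
}

for row in summarize(example_calls):
    print(row)
```

Counting rows per band, split by the discordant flag, gives the numbers that a grouped bar chart such as the added Figure 1 would display.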

 

Comment 3: Minor:

The Discussion would benefit from an explicit limitations paragraph. 

Response 3: We agree with the reviewer. We have added a new subsection, Section 4.5 (Limitations of the Study), which details the challenges regarding the lack of patient treatment history.

Reviewer 2 Report

Comments and Suggestions for Authors

In the manuscript by Lee & Lee, the two authors compared four different SNP-calling software solutions in a clinical setting, using HIV as the subject. The manuscript offers limited conceptual novelty, but the technical findings may improve methodological understanding and clarify the consequences for clinicians. The manuscript is well written and accessible to the target readers, who may have limited analytical capabilities.

As mentioned, the limited complexity of the paper is not a concern for this review. However, I suggest that the authors elaborate on the data used. In particular, the amount and quality of the reads are fundamental drivers of bioinformatic results. A table depicting the number of reads and their quality would help readers understand the limitations of the studied SNP callers.

I also encourage the authors to keep the discussion and results sections strictly separated. Attributing the error to Bowtie2 (l.190), from my understanding, is not based on technical investigation but is rather hypothesized by the authors. Similarly, the Results section should not contain statements such as "is likely to represent," since, as the name implies, it should present results rather than interpretations.

Besides this, I have the following minor points:

- l.15: "new method." Even if the clinical HIV field is apparently using outdated methodologies, I would not define next-generation sequencing as a "new" method. Moreover, it is not a single method but rather a broader concept. The actual methods differ substantially across technology platforms.

- l.214: "mutation s" should probably be corrected to "mutations."

- M&M section: It is not clear whether the different software tools allow the specification of arguments and parameters. If so, did the authors use the default settings, or did they modify them to optimize performance? I suggest that the authors clarify these details, as parameter settings can heavily impact the outcome.

Author Response

Comment 1:

As mentioned, the limited complexity of the paper is not a concern for this review. However, I suggest that the authors elaborate on the data used. In particular, the amount and quality of the reads are fundamental drivers of bioinformatic results. A table depicting the number of reads and their quality would help readers understand the limitations of the studied SNP callers.

Response 1: We thank Reviewer 2 for this insightful suggestion. We agree with the reviewer that the number and base quality of the sequencing reads are critical determinants of variant-calling accuracy. We have added a new summary table (Table 1) in Section 3.1 reporting the mean raw read count (83,764; 95% CI: 75,635–91,893) and the mean percentage of bases meeting the ≥ Q30 quality threshold (96.37%; 95% CI: 95.92%–96.82%).
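Editorial illustration: a mean with a 95% confidence interval, as reported above, can be computed from per-sample read counts along the following lines. This is a minimal sketch under stated assumptions: the response does not say how the interval was calculated, so a normal approximation (mean ± z × standard error) is assumed here, and the read counts in the example are invented.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci(values, level=0.95):
    """Return (mean, lower, upper) using a normal-approximation interval.

    Assumption: the interval is mean +/- z * (sample SD / sqrt(n)).
    For small sample sizes a t-based interval would be wider.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    m = mean(values)
    standard_error = stdev(values) / sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return m, m - z * standard_error, m + z * standard_error


# Hypothetical per-sample raw read counts; not data from the manuscript.
read_counts = [78210, 91544, 80332, 85901, 82833]
m, lower, upper = mean_ci(read_counts)
print(f"mean {m:,.0f} (95% CI: {lower:,.0f}-{upper:,.0f})")
```

The same function applies to the per-sample percentage of bases at or above Q30.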

 

Comment 2:

I also encourage the authors to keep the discussion and results sections strictly separated. Attributing the error to Bowtie2 (l.190), from my understanding, is not based on technical investigation but is rather hypothesized by the authors. Similarly, the Results section should not contain statements such as "is likely to represent," since, as the name implies, it should present results rather than interpretations.

Response 2: We thank the reviewer for this constructive feedback regarding the formal structure of the manuscript. We agree that the Results section should remain strictly descriptive. Accordingly, we have moved all technical interpretations, specifically the hypothesis regarding Bowtie2’s gap penalties and the translation logic for IUPAC codes, from Section 3.3 to Sections 4.1 and 4.3. We have also refined the language in the Results section to remove speculative phrasing such as “is likely to represent”, ensuring it focuses solely on the observed data. We believe these changes significantly improve the clarity and professional tone of the article.

 

Comment 3:

Besides this, I have the following minor points:

- l.15: "new method." Even if the clinical HIV field is apparently using outdated methodologies, I would not define next-generation sequencing as a "new" method. Moreover, it is not a single method but rather a broader concept. The actual methods differ substantially across technology platforms.

Response 3: We agree with the reviewer. We have revised the Simple Summary to remove the phrase “a new method called”.

 

Comment 4:

- l.214: "mutation s" should probably be corrected to "mutations."

Response 4: We thank the reviewer for identifying this typographical error. We have corrected “mutation s” to “mutations”.

 

Comment 5:

- M&M section: It is not clear whether the different software tools allow the specification of arguments and parameters. If so, did the authors use the default settings, or did they modify them to optimize performance? I suggest that the authors clarify these details, as parameter settings can heavily impact the outcome.

Response 5: We thank the reviewer for this important observation. We agree that bioinformatic outcomes are heavily influenced by parameter configurations. In response, we have revised Section 2.2 to state where default settings were utilized and where specific parameters were manually optimized to ensure diagnostic sensitivity. Regarding the commercial Exatype platform (Section 2.2.1), we have clarified that the parameters are proprietary and fixed. Therefore, the tool was utilized with its default configuration except for the manually preset 2% reporting threshold. For Quasitools (Section 2.2.3), we indicated that default arguments for Bowtie2 alignment and hypermutation filtering were maintained. Finally, for the iLunaR pipeline (Section 2.2.4), we have provided detailed technical specifications, including the quality filtering and adapter removal parameters for Trimmomatic (ILLUMINACLIP:TruSeq3-PE.fa, LEADING:30, TRAILING:30), the optimization of the MEGAHIT minimum k-mer size to 19 and the adjustment of the BWA-SW Z-best value to 10.
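Editorial illustration: the non-default iLunaR settings listed in Response 5 can be recorded in one place so that a rerun uses the same values. The sketch below is an assumption about how that might be organized, not the pipeline's actual code. It only renders option fragments: `--k-min` is MEGAHIT's minimum k-mer option and `-z` is the Z-best option of `bwa bwasw`. Input and output files, and any settings the response does not mention, are deliberately left out.

```python
# Values exactly as stated in Response 5. The ILLUMINACLIP step is copied
# verbatim; a real Trimmomatic call also expects colon-separated clipping
# thresholds after the adapter file, which the response does not state.
ILUNAR_SETTINGS = {
    "trimmomatic_steps": ["ILLUMINACLIP:TruSeq3-PE.fa", "LEADING:30", "TRAILING:30"],
    "megahit_k_min": 19,
    "bwasw_z_best": 10,
}


def option_fragments(settings):
    """Render the stated settings as per-tool command-line fragments."""
    return {
        "trimmomatic": list(settings["trimmomatic_steps"]),
        "megahit": ["--k-min", str(settings["megahit_k_min"])],
        "bwa bwasw": ["-z", str(settings["bwasw_z_best"])],
    }


for tool, fragment in option_fragments(ILUNAR_SETTINGS).items():
    print(tool, " ".join(fragment))
```

Keeping such a record alongside the results addresses the reviewer's concern directly, since it shows at a glance which values differ from each tool's defaults.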

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

The authors addressed my concerns; I agree with the manuscript in its current form.
