Review Reports
- Yifan Pan 1,2,
- Pingting Zhu 2 and
- Lan Cao 1,2,3,*,†
- et al.
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This study investigates the epidemiology, co‑infection patterns, and genetic evolution of human bocavirus (HBoV) in Guangzhou between 2023 and 2025. Most infections occurred in children under three, with a clear autumn peak followed by a secondary rise shortly afterward. Co‑infections were frequent. Two strains showed strong evidence of genetic recombination, and the accompanying antigenic alterations suggest potential immune escape mechanisms of HBoV. The authors propose that such recombination‑driven antigenic shifts may contribute to the secondary epidemic peak. The study highlights the importance of genomic surveillance during peak seasons and underscores the dynamic evolution of HBoV and its implications for pediatric respiratory infections.
The study is clearly presented, and the sample size is substantial. Although it addresses a topic rooted in basic virology, the epidemiological, co‑infection, and clinical implications are articulated in a way that remains accessible and relevant to clinical pediatrics. The Results section is particularly well developed, with figures that are appropriately designed and easy to interpret. Overall, the manuscript provides an additional contribution to the broader understanding of viral respiratory epidemiology in childhood, even if its findings are not of immediate bedside applicability.
Author Response
Comment:
This study investigates the epidemiology, co‑infection patterns, and genetic evolution of human bocavirus (HBoV) in Guangzhou between 2023 and 2025. Most infections occurred in children under three, with a clear autumn peak followed by a secondary rise shortly afterward. Co‑infections were frequent. Two strains showed strong evidence of genetic recombination, and the accompanying antigenic alterations suggest potential immune escape mechanisms of HBoV. The authors propose that such recombination‑driven antigenic shifts may contribute to the secondary epidemic peak. The study highlights the importance of genomic surveillance during peak seasons and underscores the dynamic evolution of HBoV and its implications for pediatric respiratory infections.
The study is clearly presented, and the sample size is substantial. Although it addresses a topic rooted in basic virology, the epidemiological, co‑infection, and clinical implications are articulated in a way that remains accessible and relevant to clinical pediatrics. The Results section is particularly well developed, with figures that are appropriately designed and easy to interpret. Overall, the manuscript provides an additional contribution to the broader understanding of viral respiratory epidemiology in childhood, even if its findings are not of immediate bedside applicability.
Response:
We sincerely thank the reviewer for the encouraging and constructive comments. We are particularly grateful for your positive remarks.
Reviewer 2 Report
Comments and Suggestions for Authors
Title and Abstract
1. The title and abstract overstate the link between recombination and antigenic change; with the data presented, the study demonstrates recombination detection and in silico predictions of epitopes, but not functional validation of antigenic change.
2. The abstract's conclusion should be toned down, as it states that recombination “led to the alteration of its antigenic characteristics,” while the discussion itself acknowledges that there was no experimental validation and that only speculation about antigenic alterations is possible.
3. The abstract also suggests a causal interpretation of the second epidemic peak that is not supported by the observational design or the analyses performed.
Introduction
4. The introduction needs a review of scientific writing in English; expressions such as “worldwidely” and several grammatical constructions reduce the clarity and credibility of the manuscript.
5. The objective is well stated, but the introduction anticipates “antigenic variation” as an almost certain outcome, when methodologically the study only uses computational predictors of epitopes.
Methods
6. The selection process for the 15 positive specimens that underwent full sequencing is not clearly described; without explicit selection criteria, representativeness and the risk of selection bias cannot be assessed.
7. The pathogen detection section requires greater methodological precision: the manuscript describes the NxTAG panel as “real-time quantitative PCR,” which may lead to confusion regarding the exact technology and positivity criteria used.
8. The phylogenetic methodology is insufficient for reproducibility because it does not specify the evolutionary model, the exact method of inference, how gaps are addressed, or the number of bootstrap replicates.
9. The statistical analysis is too limited for the subsequent inferences: only χ² and partitioned χ² are described, without effect estimates, confidence intervals, adjustments for confounding factors, or temporal analyses to support the interpretation of co-infection or a second epidemic peak.
10. The justification for the chosen parameters for epitope and antigenicity prediction needs further support, especially the VaxiJen threshold and the selection of 27 HLA alleles “covering 90% of the population,” as it is not explained which population it refers to or its relevance to Guangzhou.
Epidemiological Results
11. The age categorization in Table 1 is not interpretable in its current form; the groups “0~”, “3~”, “6~”, “18~”, and “65~” do not define closed intervals and leave significant ambiguity, especially between 6 and 18 years.
12. The phrase “differences in positivity rates among partial age groups were statistically significant” is too vague; it should specify which comparisons were significant and show the post hoc contrasts explicitly, not just with asterisks in the figure.
13. The described seasonality requires caution because the observation period begins in August 2023 and ends in December 2025; this incomplete calendar-year window may distort the visual interpretation of seasonal peaks.
Co-infection
14. The co-infection analysis is based on unadjusted contingency tables and should not be interpreted as evidence of biological association without controlling for age, seasonality, and baseline circulation of each virus.
15. The co-infection table mixes co-detection frequency with “co-infection rate,” but it is unclear whether this percentage is calculated on all HBoV-positive cases or on the total number of positive cases for each other virus; the definition should be standardized.
16. The interpretation of influenza, rhinovirus, and parainfluenza as pathogens “associated with bocavirus infection” should be tempered, because the data presented show co-detection and do not allow for inferences of synergy or a true preference for co-infection.
Phylogeny and Recombination
17. The inference of a “recombination rate of 13.33% (2/15)” is potentially misleading, because the denominator does not correspond to the total number of positive cases but only to the 15 sequenced genomes, whose selection was not described.
18. The identification of putative parental strains from very different geographic regions requires a more critical methodological discussion on unsampled diversity and on the “putative” nature of the parents assigned by RDP4/SimPlot.
19. The presented evidence of recombination is suggestive, but it should be strengthened with more explicit criteria for accepting recombinant events, such as a minimum number of concordant methods, multiplicity correction, or p-values per algorithm.
Antigenicity and Epitopes
20. The epitope section interprets predicted differences as actual changes in antigenicity; this equivalence is not demonstrated and should be reformulated in terms of “predicted epitope differences.”
21. The claim that recombination “has led to an alteration in the antigenicity of the VP2 protein” is too strong for purely computational data, even if there is consistency between linear and conformational data.
22. The comparison with “domestically circulating strains” needs better epidemiological justification, because it is not explained why GZ-2024-15663 and YN-1044 were specifically chosen as circulating reference strains.
Discussion
23. The discussion regarding the second epidemic peak and possible immune escape is clearly speculative and should be presented as a working hypothesis, not as a partially established explanation.
24. The causal argument linking high prevalence, co-infection, recombination, antigenic alteration, and a second peak is not supported by a formal temporal analysis or by sequencing of all positive cases during both peaks.
25. The discussion acknowledges the lack of experimental validation and complete sequencing of all positive cases as a limitation, but this caution is not consistently maintained in the conclusion.
Conclusions
26. The conclusion should be substantially moderated: “These recombination events could have led to alterations in virus antigenicity, which partially explain the emergence of the secondary peak” goes beyond what the data support.
27. It would be more consistent to conclude that recombinant events and in silico differences in epitopes were detected, and that both findings warrant functional validation and expanded genomic surveillance.
28. There are multiple English writing problems throughout the text, including grammatical errors, verb agreement issues, and unnatural phrasing, which require a comprehensive linguistic review.
29. The legend for Figure 2 contains faulty punctuation and syntax, and the age/table notation is inconsistent.
Author Response
Comment 1:
The title and abstract overstate the link between recombination and antigenic change; with the data presented, the study demonstrates recombination detection and in silico predictions of epitopes, but not functional validation of antigenic change.
Response 1:
We agree with the reviewer and have revised the title and abstract to avoid overstating the link between recombination and antigenic change.
Comment 2:
The abstract's conclusion should be toned down, as it states that recombination “led to the alteration of its antigenic characteristics,” while the discussion itself acknowledges that there was no experimental validation and that only speculation about antigenic alterations is possible.
Response 2:
We agree and have toned down the conclusion in the abstract.
Comment 3:
The abstract also suggests a causal interpretation of the second epidemic peak that is not supported by the observational design or the analyses performed.
Response 3:
We thank the reviewer for pointing this out. We did not intentionally include such a causal interpretation in the abstract. Nevertheless, we have carefully reviewed and revised other parts of the manuscript (e.g., the third-to-last paragraph of the Discussion) using hypothetical language.
Comment 4:
The introduction needs a review of scientific writing in English; expressions such as “worldwidely” and several grammatical constructions reduce the clarity and credibility of the manuscript.
Response 4:
We have carefully revised the entire manuscript for English language and grammar.
Comment 5:
The objective is well stated, but the introduction anticipates “antigenic variation” as an almost certain outcome, when methodologically the study only uses computational predictors of epitopes.
Response 5:
We have revised the introduction accordingly. The phrase “antigenic variation” has been replaced with the more general term “molecular characteristic” (line 66).
Comment 6:
The selection process for the 15 positive specimens that underwent full sequencing is not clearly described; without explicit selection criteria, representativeness and the risk of selection bias cannot be assessed.
Response 6:
We have clarified the selection process in section 2.3. All positive specimens were subjected to sequencing, and only those with low-quality data were excluded from the study.
Comment 7:
The pathogen detection section requires greater methodological precision: the manuscript describes the NxTAG panel as “real-time quantitative PCR,” which may lead to confusion regarding the exact technology and positivity criteria used.
Response 7:
We have revised the description in section 2.2, changing “real-time quantitative PCR” to “bead-based multiplex RT-PCR” to more accurately reflect the technology used.
Comment 8:
The phylogenetic methodology is insufficient for reproducibility because it does not specify the evolutionary model, the exact method of inference, how gaps are addressed, or the number of bootstrap replicates.
Response 8:
We have added these details to section 2.4. Specifically, we used ClustalX for multiple sequence alignment (MSA), the neighbor-joining (NJ) method for tree construction, the Maximum Composite Likelihood model, and 1,000 bootstrap replicates. Gaps were treated via pairwise deletion.
Comment 9:
The statistical analysis is too limited for the subsequent inferences: only χ² and partitioned χ² are described, without effect estimates, confidence intervals, adjustments for confounding factors, or temporal analyses to support the interpretation of co-infection or a second epidemic peak.
Response 9:
We have strengthened the statistical analysis accordingly. Odds ratios with 95% confidence intervals have been added to Tables 1 and 2. In addition, a temporal trend analysis has been included in section 3.1 to better support the interpretation of the second epidemic peak.
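To illustrate the kind of effect estimate referred to in Response 9, the following is a minimal sketch of an odds ratio with a Wald 95% confidence interval computed from a 2×2 table. The function name `odds_ratio_ci` and the example counts are hypothetical and are not taken from the manuscript; the authors' actual software and interval method are not stated in this record.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a, b: positives and negatives in the group of interest
    c, d: positives and negatives in the reference group
    z:    normal quantile (1.96 gives an approximate 95% interval)
    All four cells must be non-zero for this simple formula.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only
or_, lo, hi = odds_ratio_ci(10, 90, 5, 95)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

An interval that includes 1 would indicate that the data are compatible with no difference in odds between the two groups.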
Comment 10:
The justification for the chosen parameters for epitope and antigenicity prediction needs further support, especially the VaxiJen threshold and the selection of 27 HLA alleles “covering 90% of the population,” as it is not explained which population it refers to or its relevance to Guangzhou.
Response 10:
The VaxiJen threshold was increased to 0.85 based on the literature to improve specificity. All other prediction parameters were set to default values. The 27 HLA alleles covering 90% of the population refer to the global human population, as described in the IEDB documentation. We have added this clarification in the revised manuscript.
Comment 11:
The age categorization in Table 1 is not interpretable in its current form; the groups “0~”, “3~”, “6~”, “18~”, and “65~” do not define closed intervals and leave significant ambiguity, especially between 6 and 18 years.
Response 11:
We have revised the age categorization in Table 1. The groups are now clearly defined as closed intervals (e.g., 0-2, 3-5, 6-17, 18-64, ≥65) to eliminate any ambiguity.
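The closed intervals listed in Response 11 can be expressed as an unambiguous assignment rule. The sketch below is illustrative only: the function name `age_group` is hypothetical, and it assumes that ages are recorded in completed years, which the response does not state.

```python
from bisect import bisect_right

# Lower bounds and labels of the closed intervals named in Response 11
LOWER_BOUNDS = [0, 3, 6, 18, 65]
LABELS = ["0-2", "3-5", "6-17", "18-64", "≥65"]


def age_group(age_years):
    """Return the age-group label for an age in completed years."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    # bisect_right counts how many lower bounds are <= age_years
    return LABELS[bisect_right(LOWER_BOUNDS, age_years) - 1]


print(age_group(2), age_group(6), age_group(17), age_group(65))
```

With this rule every age falls into exactly one group, which resolves the ambiguity between 6 and 18 years that the reviewer pointed out.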
Comment 12:
The phrase “differences in positivity rates among partial age groups were statistically significant” is too vague; it should specify which comparisons were significant and show the post hoc contrasts explicitly, not just with asterisks in the figure.
Response 12:
We have revised section 3.1 and included the post hoc contrast results in Table 1.
Comment 13:
The described seasonality requires caution because the observation period begins in August 2023 and ends in December 2025; this incomplete calendar-year window may distort the visual interpretation of seasonal peaks.
Response 13:
We agree with the reviewer and have added a limitation statement regarding the incomplete calendar-year window in the second-to-last paragraph of the Discussion section.
Comment 14:
The co-infection analysis is based on unadjusted contingency tables and should not be interpreted as evidence of biological association without controlling for age, seasonality, and baseline circulation of each virus.
Response 14:
We agree and have revised the wording in the third paragraph of the Discussion.
Comment 15:
The co-infection table mixes co-detection frequency with “co-infection rate,” but it is unclear whether this percentage is calculated on all HBoV-positive cases or on the total number of positive cases for each other virus; the definition should be standardized.
Response 15:
We have revised Table 2 by adding fractions to clarify the calculation.
Comment 16:
The interpretation of influenza, rhinovirus, and parainfluenza as pathogens “associated with bocavirus infection” should be tempered, because the data presented show co-detection and do not allow for inferences of synergy or a true preference for co-infection.
Response 16:
We agree and have revised the wording in the third paragraph of the Discussion.
Comment 17:
The inference of a “recombination rate of 13.33% (2/15)” is potentially misleading, because the denominator does not correspond to the total number of positive cases but only to the 15 sequenced genomes, whose selection was not described.
Response 17:
We agree with the reviewer and have removed the statement regarding the “recombination rate of 13.33% (2/15)” from the manuscript to avoid potential misunderstanding.
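The reviewer's concern about the small denominator can be made concrete with a confidence interval for a proportion of 2 out of 15. The sketch below uses the Wilson score interval as one common choice; it is an illustration and not an analysis reported by the authors, and the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# 2 recombinants among 15 sequenced genomes, as quoted in Comment 17
lo, hi = wilson_interval(2, 15)
print(f"2/15 = {2 / 15:.1%}, approximate 95% interval {lo:.1%} to {hi:.1%}")
```

The interval is wide, which is consistent with the reviewer's point that a rate derived from 15 genomes should not be generalized to all positive cases.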
Comment 18:
The identification of putative parental strains from very different geographic regions requires a more critical methodological discussion on unsampled diversity and on the “putative” nature of the parents assigned by RDP4/SimPlot.
Response 18:
We agree and have added a critical discussion regarding the putative parental strains. We note that the distant geographic origins of the assigned parents may be due to unsampled diversity in the sequence databases. Additionally, similar cases where parental and recombinant strains are geographically far apart have been reported in other studies.
Comment 19:
The presented evidence of recombination is suggestive, but it should be strengthened with more explicit criteria for accepting recombinant events, such as a minimum number of concordant methods, multiplicity correction, or p-values per algorithm.
Response 19:
We have strengthened the evidence by adding p-values for each detection method in Table 3.
Comment 20:
The epitope section interprets predicted differences as actual changes in antigenicity; this equivalence is not demonstrated and should be reformulated in terms of “predicted epitope differences.”
Response 20:
We agree and have revised the relevant statements throughout the manuscript.
Comment 21:
The claim that recombination “has led to an alteration in the antigenicity of the VP2 protein” is too strong for purely computational data, even if there is consistency between linear and conformational data.
Response 21:
We agree and have revised the relevant statements throughout the manuscript.
Comment 22:
The comparison with “domestically circulating strains” needs better epidemiological justification, because it is not explained why GZ-2024-15663 and YN-1044 were specifically chosen as circulating reference strains.
Response 22:
We have added the justification in section 3.5. The two strains were selected because: (1) they were circulating during the same period as the recombinant strain, and (2) they provide geographic representation (one from Guangzhou), both being domestically circulating strains.
Comment 23:
The discussion regarding the second epidemic peak and possible immune escape is clearly speculative and should be presented as a working hypothesis, not as a partially established explanation.
Response 23:
We agree and have revised the third-to-last paragraph of the Discussion as well as the Conclusions section.
Comment 24:
The causal argument linking high prevalence, co-infection, recombination, antigenic alteration, and a second peak is not supported by a formal temporal analysis or by sequencing of all positive cases during both peaks.
Response 24:
We agree with the reviewer and have softened the relevant statements in the second paragraph of the Discussion.
Comment 25:
The discussion acknowledges the lack of experimental validation and complete sequencing of all positive cases as a limitation, but this caution is not consistently maintained in the conclusion.
Response 25:
We have now softened the wording in the Conclusions section.
Comment 26:
The conclusion should be substantially moderated: “These recombination events could have led to alterations in virus antigenicity, which partially explain the emergence of the secondary peak” goes beyond what the data support.
Response 26:
We agree and have deleted the relevant statement from the Conclusions section.
Comment 27:
It would be more consistent to conclude that recombinant events and in silico differences in epitopes were detected, and that both findings warrant functional validation and expanded genomic surveillance.
Response 27:
We agree with the reviewer and have revised the Conclusions section accordingly. The conclusion now states that recombinant events and in silico differences in epitopes were detected, and that these findings warrant functional validation and expanded genomic surveillance.
Comment 28:
There are multiple English writing problems throughout the text, including grammatical errors, verb agreement issues, and unnatural phrasing, which require a comprehensive linguistic review.
Response 28:
We have carefully revised the entire manuscript for English language issues.
Comment 29:
The legend for Figure 2 contains faulty punctuation and syntax, and the age/table notation is inconsistent.
Response 29:
We have revised the legend and caption for Figure 2B to correct punctuation and syntax errors.
Reviewer 3 Report
Comments and Suggestions for Authors
The authors analyze bocavirus prevalence in children and perform phylogenetic and recombination analyses, in addition to protein structure determination, and show that this recombination may have led to alteration of antigenic potential. The paper is well written with substantive analysis performed, but a few things can be clarified.
1) Please provide more details on the phylogenetic analysis. What MSA program was used, how many bootstraps, what method was used to construct the trees?
2) The authors should include a future work paragraph in the Conclusions section to show how this work can be extended.
3) I wonder if the authors can make Tables 4,5,6 clearer since there is a lot of information there. Can the authors highlight the important sections there?
4) I did not see how many patients were sampled in the Methods. I see 5316 samples in the Results but this number should probably be put in the Methods too.
5) The authors used a one step RT-PCR kit, but bocavirus is a DNA virus. Can the authors expand on the use of this kit and why it was used?
6) Can the authors provide some more information on read depth, genome coverage, and quality thresholds for accepting complete genomes? This would be useful information to the reader who wants more information on the MiSeq method.
Author Response
Comment 1:
Please provide more details on the phylogenetic analysis. What MSA program was used, how many bootstraps, what method was used to construct the trees?
Response 1:
We thank the reviewer for this valuable suggestion. Accordingly, we have added more detailed information on the phylogenetic analysis in the revised manuscript. Specifically, we used ClustalX for multiple sequence alignment, the neighbor-joining method for tree construction, and the Maximum Composite Likelihood model with 1,000 bootstrap replicates. These details have been incorporated into subsection “2.4 Phylogenetic Analysis” of the revised manuscript.
Comment 2:
The authors should include a future work paragraph in the Conclusions section to show how this work can be extended.
Response 2:
We agree with this. As suggested, we have added a paragraph discussing future work at the end of the Discussion section (the last paragraph).
Comment 3:
I wonder if the authors can make Tables 4,5,6 clearer since there is a lot of information there. Can the authors highlight the important sections there?
Response 3:
We thank the reviewer for this constructive suggestion. To improve the clarity of the tables, we have simplified Tables 4 and 5. Regarding Table 6, its content was found to be largely redundant with Figure 6 in terms of its role in the manuscript; therefore, we have moved Table 6 to the Supplementary Material.
Comment 4:
I did not see how many patients were sampled in the Methods. I see 5316 samples in the Results but this number should probably be put in the Methods too.
Response 4:
We thank the reviewer for pointing this out. We have added the number of patients to the Methods section (line 73) as suggested.
Comment 5:
The authors used a one-step RT-PCR kit, but bocavirus is a DNA virus. Can the authors expand on the use of this kit and why it was used?
Response 5:
We thank the reviewer for raising this important point. Although bocavirus is a DNA virus, in our actual laboratory workflow we also need to amplify certain RNA viruses (e.g., influenza virus) in parallel. For the sake of operational consistency and efficiency, we therefore uniformly adopted the one-step RT-PCR protocol for all viral detection, including bocavirus.
Comment 6:
Can the authors provide some more information on read depth, genome coverage, and quality thresholds for accepting complete genomes? This would be useful information to the reader who wants more information on the MiSeq method.
Response 6:
We appreciate the reviewer’s attention to the technical details of the MiniSeq sequencing. Unfortunately, due to the long duration over which this study was conducted, the relevant information was not fully documented or has since been lost. In future studies, we will ensure that such parameters are systematically recorded and reported.
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
The authors have responded to my comments and suggestions to the best of their ability. I have no further input.
Author Response
Comment: The authors have responded to my comments and suggestions to the best of their ability. I have no further input.
Response: We sincerely thank the reviewer for their time and positive feedback. We are glad that our revisions are satisfactory.