Review Reports
- Francisco José Penalva-Salmerón *
- Miguel Crespo
- Rafael Martínez-Gallego
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors

Introduction: To establish the need for this research, it is necessary to analyze how previous studies related to this topic have been conducted. Although some suggestions regarding training are offered, the previous studies should be analyzed separately for junior and professional athletes, with a persuasive rationale for why both juniors and professionals were targeted in this study. At present, this connection is somewhat weak. Please supplement the introduction. In addition, rather than presenting the hypothesis comprehensively, subdivide it so that it corresponds to the research results.
Method of research
Please provide more details on how you collected the data. How did you collect it from 345 training cases on the court?
Results of the study
The results appear to be properly structured, but the text gives only overly brief descriptions of what is shown in the figures. Please explain them in detail.
Discussion
The current discussion draws more comparative analysis with professional athletes than with juniors. Please add preceding studies on juniors for comparison and expand the discussion accordingly.
General opinion
Overall, the manuscript seems well organized. However, please state the need for the research more convincingly in the introduction and supplement the data-collection details in the methods. Also, the current similarity (plagiarism) rate is 30%; please review the manuscript overall and lower it.
Author Response
REVIEWER 1:
Introduction:
To establish the need for this research, it is necessary to analyze how previous studies related to this topic have been conducted.
Although some suggestions regarding training are offered, the previous studies should be analyzed separately for junior and professional athletes, with a persuasive rationale for why both juniors and professionals were targeted in this study. At present, this connection is somewhat weak. Please supplement the introduction. In addition, rather than presenting the hypothesis comprehensively, subdivide it so that it corresponds to the research results.
Thank you for this valuable feedback. We agree that a stronger justification for the junior vs. professional comparison is essential.
- Justification: As suggested, we have strengthened the introduction by adding a paragraph that specifically discusses the challenges and increasing demands of the junior-to-professional transition, citing relevant literature that highlights these differences. This provides a clearer and more persuasive rationale for our secondary objective.
- Hypothesis subdivision: We appreciate the suggestion for clarity. We have reviewed our hypothesis and believe it is already subdivided to reflect our two primary research questions, which are aligned with our objectives. The hypothesis is presented in two distinct parts: first, we hypothesized differences across playing situations ("...baseline and all-court drills would obtain the highest external loads..."), and second, we hypothesized differences between the age groups ("...and that distinct differences in load profiles would be observed between junior and professional players..."). We feel this structure accurately reflects the study's design.
Method of research
Please provide more details on how you collected the data. How did you collect it from 345 training cases on the court?
Thank you for highlighting this ambiguity. We have revised the manuscript to be explicitly clear that "345" is not the number of training sessions, but the total number of drill instances (i.e., individual player participations in a drill) analyzed across all players during the microcycle. We have clarified this in the Methods section.
The results appear to be properly structured, but the text gives only overly brief descriptions of what is shown in the figures. Please explain them in detail.
Thank you for your comment. We agree completely that the main text of the Results section must provide a clear and detailed narrative of the findings, rather than serving as a brief description of the figures.
Upon reviewing your feedback, we re-examined the Results section and confirmed that the detailed descriptions you requested are included. We apologize if this was not immediately apparent.
To clarify, the text for each external load variable already provides:
- A descriptive narrative of the findings, highlighting the highest and lowest values (e.g., "Distance covered showed the highest values in baseline play (median = 56.72 m)... and lowest in serve (median = 41.80 m)").
- The full statistical results from the main Kruskal-Wallis test, including the Chi-squared value, p-value, and effect size (e.g., "χ²(4) = 50.18, p < 0.01, ε² = 0.15").
- A detailed summary of the significant post-hoc comparisons (e.g., "Dunn’s post hoc tests showed that serve had significantly lower values than baseline play (p < 0.01), net play (p < 0.01), and all (p < 0.01)").
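For readers wishing to reproduce this reporting format, the Kruskal-Wallis statistic and an ε² effect size (one common estimator is ε² = H/(n − 1)) can be computed with SciPy. The samples below are synthetic and merely illustrative of the reported pattern (serve lowest, baseline highest); they are not the study's data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic distance-covered samples (m) for three playing situations
serve = rng.normal(42, 5, 30)
baseline = rng.normal(57, 6, 30)
net_play = rng.normal(50, 6, 30)

# Kruskal-Wallis H test across situations
h, p = stats.kruskal(serve, baseline, net_play)

# Epsilon-squared effect size: eps2 = H / (n - 1), bounded in [0, 1]
n = len(serve) + len(baseline) + len(net_play)
eps2 = h / (n - 1)

print(f"chi2(2) = {h:.2f}, p = {p:.4g}, eps2 = {eps2:.2f}")
```

Pairwise follow-up (Dunn's test with Holm correction, as reported) is available in third-party packages such as `scikit-posthocs`.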
Discussion
The current discussion draws more comparative analysis with professional athletes than with juniors. Please add preceding studies on juniors for comparison and expand the discussion accordingly.
This is an excellent point, and we thank the reviewer for this constructive feedback. We agree that a more balanced comparison, incorporating other relevant junior-elite literature, significantly strengthens our discussion.
We have revised the Discussion section to include a more robust comparative analysis with preceding studies focused specifically on junior players.
- Contextualizing Developmental Progression: We have added a new paragraph following our initial comparison of junior and professional players. This new text discusses our findings in the context of developmental transitions within the junior-elite pathway, specifically citing the work of Perri et al. (2018). This allows us to frame our findings (i.e., differences in movement intensity) as part of a continuous progression, which complements previous findings on hitting intensity increases during the late junior stages.
- Strengthening the "Training-Competition Gap" Discussion: We have also enhanced our discussion of the "training-competition gap" by more explicitly drawing from the findings of Murphy et al. (2015), which details the physical maladaptations (e.g., speed reductions) that can occur during junior international tours despite high match loads.
We are confident that these additions provide the more balanced and in-depth analysis with preceding junior studies that you correctly suggested was needed.
Overall, the manuscript seems well organized. However, please state the need for the research more convincingly in the introduction and supplement the data-collection details in the methods. Also, the current similarity (plagiarism) rate is 30%; please review the manuscript overall and lower it.
Thank you for this summary. We have addressed these three key points:
- Need for Research: We strengthened the Introduction by adding a paragraph on the specific demands of the junior-to-professional transition, addressing your request for a more convincing rationale.
- Data Collection: We supplemented the Methods section by clarifying that the 345 cases were drill instances (not training sessions) and added more detail on the data collection protocol.
- Plagiarism Rate: We have thoroughly reviewed and rephrased the manuscript to reduce the similarity rate.
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors

I appreciate the central question you are asking: to map differences in external load across specific tennis training situations (serve, return, baseline, net play, and all-court) and to compare profiles between junior and professional players under the same training environment. The topic is timely and practically relevant, and your hypotheses (e.g., higher load in baseline and all-court, lower in serve) are reasonable. That said, in its current form, the manuscript falls short of the auditability and clarity expected for a top-tier journal. My comments below aim to help you turn a promising applied study into a robust and easily verifiable contribution.
Regarding the sample and setting, I appreciate the transparency: 20 players (12 professionals and eight juniors) were observed over a 7-day microcycle, with morning/evening sessions (~90 minutes) and a substantial number of drill instances. Training took place on hard courts and was led by the first author as a coach. This design requires a more detailed description to mitigate potential implementation bias. Please explain how you ensured consistency across groups, controlled for surface and environmental variation, and balanced the daily distribution of drill types. You already report consent/assent and ethics approval; in Methods, also state the IRB/ethics committee name, approval number, and date, and describe who obtained consent and how autonomy, especially for juniors, was preserved.
Your game-situation taxonomy (serve, return, baseline, net play, all-court) is appropriate and clearly motivated. However, the results are hard to interpret without counts per category and drill duration summaries. Please add a short supplementary table reporting, for each situation × group, the number of drills, mean ± SD duration per drill, and the number of participations per situation for each player. This will enable readers to assess the sampling balance and the extent of within-player dependence.
Definitions for variables and thresholds, including distance, explosive distance (acceleration ≥ 1.12 m·s⁻²), acceleration/deceleration counts, and Player Load, are provided, which is a strong start. For top-tier scrutiny, please ensure the signal-processing pipeline is fully auditable, including window lengths, filters, effective sampling rate, and any device/software parameters used to derive metrics (e.g., SPRO/WIMU settings). Justify the 1.12 m·s⁻² explosive-distance threshold specifically for tennis (citing validation if available), and clarify reliability checks (intra/inter-session) and artifact handling (dropouts, smoothing, exclusion rules). When you present the Player Load formula, use precise mathematical notation (e.g., subscripts and time indices) and ensure units are stated; this will remove any ambiguity.
The statistical strategy needs to align with the hierarchical data structure. Your choice of non-parametric tests (Kruskal–Wallis with Dunn–Holm; ε²) for between-situation comparisons is sensible given non-normality, and your use of t-tests or Mann–Whitney with effect sizes for juniors vs. professionals is conventional. Still, because drill instances are nested within players (and days/sessions), I strongly encourage the use of linear mixed-effects models with a random intercept for player (and potentially day/session), fixed effects for situation, group, and their interaction, and an appropriate multiple-comparison control. If you keep the current tests, explicitly justify why ignoring within-player dependence does not inflate Type I error and report effect sizes with 95% CIs consistently across all analyses.
In terms of results, your core pattern is coherent—baseline and all-court show higher distance, explosive distance, and Player Load, while serve tends to be lowest; accelerations/decelerations show more minor or null differences. To help readers verify this quickly, please (i) ensure all figures render correctly in the editorial system and (ii) reformat the main table so that each situation × group cell includes either mean ± SD or median [IQR] (as appropriate), the N used, the test statistic, p, and the effect size with 95% CI, with clear table footnotes on assumption checks and multiple-comparison procedures. This reorganization will significantly improve traceability.
Your between-group interpretation (juniors vs. professionals) should explicitly acknowledge plausible confounding from maturation and tactical/technical differences, not just physical capacity. Please include a paragraph in the Discussion that frames these contrasts as observational and potentially confounded, and refrain from using causal language. Where you discuss training-load management frameworks (e.g., ACWR) or injury risk, mark these as practical implications/hypotheses rather than claims supported by your data, since the study does not model internal load, recovery, or outcomes.
On language and terminology, I recommend a light editing pass to tighten sentences and unify terms. Please standardize the hyphenation and capitalization: “all-court” (with hyphen), “net play”, and “Player Load” (capitalized consistently). In the Abstract, change “were founded” to “were found.” Also check spacing before bracketed citations (e.g., “player [6]”) and ensure all measurement units appear on axes and in legends.
For figures and tables, verify that every figure includes units on axes (e.g., m/min; counts/min; a.u./min), legible labels, and a self-contained legend stating the statistical approach. In Table 1, correct any label mismatches (e.g., use “net play” instead of “net game,” and “all-court” consistently). These are minor editorial fixes, but they are essential for readability.
I’m pleased with the overall alignment of the Discussion and Conclusions with the actual findings; however, please add a concise paragraph on limitations, specifically, the single-center setting, short microcycle, lack of internal-load/health outcomes, and the potential dependence structure, so that readers can weigh the external validity appropriately. Consider adding a brief, practical paragraph that translates your situation-specific findings into weekly periodization guidance (e.g., balancing “costly” baseline/all-court volumes with lower-cost technical tasks within a day), clearly framed as an application rather than evidence tested here.
Finally, please clean the references before resubmission. Remove duplicate entries, standardize journal and title capitalization, add missing DOIs/ISBNs where applicable, and fix any placeholder or malformed citations. A consistent reference list is essential for a top-tier submission and will avoid confusion during production.
The study is meaningful and well-motivated, with a sizeable real-world dataset. To meet the bar of a high-impact journal, the manuscript needs a major revision focused on (1) fully auditable methods (signal processing and thresholds), (2) a statistical approach that respects nesting or an explicit justification for the current tests plus uniform effect-size reporting with CIs, (3) table/figure reformatting and reliable rendering, (4) a clearer ethical description of autonomy safeguards for juniors in a coach-led setting, (5) unified terminology and language, and (6) reference list cleanup. With these revisions, the work will communicate more effectively exactly what was measured, how it was derived, and how to interpret situation-specific differences in tennis training with confidence.
Comments on the Quality of English Language

The manuscript is readable and generally clear, but it would benefit from a minor to moderate language edit to improve precision, consistency, and flow. I recommend tightening long sentences, ensuring consistent terminology and capitalization, standardizing units and symbols across text, tables, and figure axes, and addressing occasional issues with punctuation, spacing, and grammar (including article use and subject–verb agreement). A careful copyedit focused on clarity and uniform style will strengthen the presentation without altering the scientific content.
Author Response
REVIEWER 2
I appreciate the central question you are asking: to map differences in external load across specific tennis training situations (serve, return, baseline, net play, and all-court) and to compare profiles between junior and professional players under the same training environment. The topic is timely and practically relevant, and your hypotheses (e.g., higher load in baseline and all-court, lower in serve) are reasonable.
Many thanks for this positive feedback.
That said, in its current form, the manuscript falls short of the auditability and clarity expected for a top-tier journal. My comments below aim to help you turn a promising applied study into a robust and easily verifiable contribution.
Regarding the sample and setting, I appreciate the transparency: 20 players (12 professionals and eight juniors) were observed over a 7-day microcycle, with morning/evening sessions (~90 minutes) and a substantial number of drill instances. Training took place on hard courts and was led by the first author as a coach. This design requires a more detailed description to mitigate potential implementation bias. Please explain how you ensured consistency across groups, controlled for surface and environmental variation, and balanced the daily distribution of drill types. You already report consent/assent and ethics approval; in Methods, also state the IRB/ethics committee name, approval number, and date, and describe who obtained consent and how autonomy, especially for juniors, was preserved.
Thank you for this crucial point on methodological rigor. We agree that controlling for these variables is essential for a valid comparison. We have revised the 'Schedule, preparation and delivery' subsection to be more explicit about how these factors were managed.
We have now clarified that:
1. Consistency across groups: The training microcycle, including the daily objectives and session schedule (AM/PM), was identical for both junior and professional players.
2. Surface and environmental control: All drills for all players were conducted on the same surface (hard courts) at the same facility within the same 7-day period, ensuring environmental conditions were consistent.
3. Drill distribution: We clarified that the daily distribution of drill types (e.g., baseline, serve) was not random but was systematically dictated by the daily training objective, which was the same for both groups on any given day.
4. Bias mitigation: We also reinforced that the coach (first author) followed structured explanations and predefined patterns to ensure consistency in drill implementation.
Your game-situation taxonomy (serve, return, baseline, net play, all-court) is appropriate and clearly motivated. However, the results are hard to interpret without counts per category and drill duration summaries. Please add a short supplementary table reporting, for each situation × group, the number of drills, mean ± SD duration per drill, and the number of participations per situation for each player. This will enable readers to assess the sampling balance and the extent of within-player dependence.
Thank you for this excellent suggestion. We agree completely that this breakdown is essential for readers to interpret the results and assess the sampling balance and within-player dependence.
As per your request, we have created a new Supplementary Table (Table S1). This table reports, for each situation × group combination:
| Game Situation | Group | Total Players (n) | Total Instances (N) | Mean Drill Duration (s) (SD) | Mean Instances per Player (SD) |
|---|---|---|---|---|---|
| Serve | Junior | 5 | 21 | 514.54 (147.38) | 4.00 (3.67) |
| Serve | Professional | 5 | 11 | 593.05 (287.60) | 1.83 (0.75) |
| Return | Junior | 4 | 7 | 488.14 (288.02) | 1.40 (0.55) |
| Return | Professional | 5 | 9 | 601.22 (263.84) | 1.80 (0.45) |
| Baseline Play | Junior | 7 | 92 | 824.11 (426.67) | 13.14 (12.29) |
| Baseline Play | Professional | 10 | 64 | 726.17 (394.04) | 6.00 (4.80) |
| Net Play | Junior | 7 | 25 | 531.40 (251.61) | 3.57 (3.35) |
| Net Play | Professional | 9 | 25 | 450.44 (294.09) | 2.78 (2.05) |
| All-Court | Junior | 8 | 47 | 2144.45 (2498.79) | 5.50 (4.57) |
| All-Court | Professional | 10 | 44 | 2101.02 (2216.49) | 4.30 (4.02) |
Definitions for variables and thresholds, including distance, explosive distance (acceleration ≥ 1.12 m·s⁻²), acceleration/deceleration counts, and Player Load, are provided, which is a strong start.
For top-tier scrutiny, please ensure the signal-processing pipeline is fully auditable, including window lengths, filters, effective sampling rate, and any device/software parameters used to derive metrics (e.g., SPRO/WIMU settings).
Thank you for this important point on methodological auditability. We have revised the 'Technology' section to be more explicit about the signal-processing pipeline, as requested.
We have now clarified and confirmed that:
1. The effective sampling rate used for all computations was the raw 100 Hz accelerometer data from the device.
2. All derived metrics (Player Load, accelerations, etc.) were computed using the manufacturer's validated, proprietary algorithms built into the SPRO software (v1.0.0, Comp. 989).
3. No additional data filtering, window lengths, or smoothing were applied by the research team beyond the manufacturer's standard, built-in processing.
While the exact proprietary algorithms (i.e., specific filters) are not public, our auditable pipeline consists of using these standard, validated manufacturer settings without modification. We believe this ensures the replicability of our method for any researcher using the same system.
Justify the 1.12 m·s⁻² explosive-distance threshold specifically for tennis (citing validation if available), and clarify reliability checks (intra/inter-session) and artifact handling (dropouts, smoothing, exclusion rules). When you present the Player Load formula, use precise mathematical notation (e.g., subscripts and time indices) and ensure units are stated; this will remove any ambiguity.
Thank you for these important points on methodological transparency. We have revised the manuscript to address all three items:
- Threshold Justification: We have revised the Methods section to be more explicit. We now clarify that the 1.12 m·s⁻² threshold is the manufacturer's default setting within the SPRO software. As you correctly suggest, we have added justification for its use by citing previous research in professional racket sports (padel) that has successfully employed this same threshold (e.g., Miralles et al., 2025). This demonstrates its relevance and acceptance within our specific field of study.
- Reliability & Artifact Handling: In our 'Technology' section, we already state that "No additional data filtering beyond the manufacturer's standard processing was applied". To further clarify, we have added a sentence stating that all raw data was visually inspected for signal dropouts or artifacts, and no drill instances had to be excluded due to data corruption or signal loss.
- Player Load Formula: We agree that the original notation was ambiguous. We have reformulated the equation using precise time indices (i and i-1) as you suggested, and have ensured the units (a.u./min) are clearly stated.
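As a point of reference for the reformulated equation, the commonly published Player Load formulation (accumulated rate of change of tri-axial acceleration) can be sketched as follows. Note that vendor implementations such as WIMU/SPRO apply proprietary filtering and may differ in scaling, so this is an illustration of the general formula, not the manufacturer's exact algorithm.

```python
import numpy as np

def player_load(ax, ay, az):
    """Accumulated Player Load (a.u.) from tri-axial acceleration series.

    Common published form: sum over samples i of
    sqrt((ax_i - ax_{i-1})^2 + (ay_i - ay_{i-1})^2 + (az_i - az_{i-1})^2) / 100.
    """
    dax, day, daz = np.diff(ax), np.diff(ay), np.diff(az)
    return float(np.sum(np.sqrt(dax**2 + day**2 + daz**2)) / 100.0)

# One second of synthetic 100 Hz accelerometer data (units: g)
t = np.linspace(0.0, 1.0, 100)
ax = np.sin(2 * np.pi * 2 * t)
ay = np.cos(2 * np.pi * 2 * t)
az = np.ones_like(t)  # constant gravity component contributes nothing

pl = player_load(ax, ay, az)
print(f"Player Load: {pl:.3f} a.u.")
```

Normalizing by drill duration (dividing by minutes) yields the a.u./min rate reported in the manuscript.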
The statistical strategy needs to align with the hierarchical data structure. Your choice of non-parametric tests (Kruskal–Wallis with Dunn–Holm; ε²) for between-situation comparisons is sensible given non-normality, and your use of t-tests or Mann–Whitney with effect sizes for juniors vs. professionals is conventional. Still, because drill instances are nested within players (and days/sessions), I strongly encourage the use of linear mixed-effects models with a random intercept for player (and potentially day/session), fixed effects for situation, group, and their interaction, and an appropriate multiple-comparison control. If you keep the current tests, explicitly justify why ignoring within-player dependence does not inflate Type I error and report effect sizes with 95% CIs consistently across all analyses.
We thank the reviewer for this expert feedback. We fully agree that our data has a hierarchical structure and that LMMs are the gold standard for such nested designs.
Our original choice of non-parametric tests (Kruskal-Wallis, Mann-Whitney U) was guided by the significant non-normality of our data (as assessed by Shapiro-Wilk tests), a point you noted as 'sensible'.
We acknowledge your crucial point that this approach does not explicitly model the within-player dependence and thus carries a risk of inflating Type I error. To mitigate this, our interpretation has focused less on p-values and more on the magnitude of the effect sizes, which we believe provide a clearer picture of the practical significance.
As per your recommendation, we have now strengthened this approach by:
- Adding 95% Confidence Intervals (CIs) for all effect sizes reported in Table 1 to provide a robust estimate of the magnitude of the differences.
- Ensuring that 95% CIs are also reported for the effect sizes (ε²) for the Kruskal-Wallis tests, as shown in our figures.
- Explicitly adding this statistical approach as a key limitation in our Discussion section, noting that future research should aim to confirm these findings using mixed-effects models.
We believe this approach, while not modeling the random effects, provides the transparent and robust reporting you have requested.
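The Type I error inflation the reviewer warns about can be illustrated with a small simulation on synthetic data (all parameters here are assumptions chosen for illustration): two groups with no true difference, but with drill instances nested within players, tested naively at the instance level.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def false_positive_rate(n_players=10, n_instances=15, cluster_sd=1.0, reps=500):
    """Simulate two groups with NO true group difference but with instances
    nested within players (random player intercepts). An instance-level
    Mann-Whitney test ignores this dependence."""
    hits = 0
    for _ in range(reps):
        a = (rng.normal(0, cluster_sd, n_players)[:, None]
             + rng.normal(0, 1, (n_players, n_instances))).ravel()
        b = (rng.normal(0, cluster_sd, n_players)[:, None]
             + rng.normal(0, 1, (n_players, n_instances))).ravel()
        _, p = stats.mannwhitneyu(a, b)
        hits += p < 0.05
    return hits / reps

rate = false_positive_rate()
print(f"Empirical Type I error at alpha = 0.05: {rate:.2f}")
```

With a strong player effect, the empirical rejection rate lands well above the nominal 0.05, which is why mixed-effects models (or at least effect-size-centered interpretation, as the authors adopt) are preferable for nested designs.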
In terms of results, your core pattern is coherent—baseline and all-court show higher distance, explosive distance, and Player Load, while serve tends to be lowest; accelerations/decelerations show more minor or null differences. To help readers verify this quickly, please (i) ensure all figures render correctly in the editorial system and (ii) reformat the main table so that each situation × group cell includes either mean ± SD or median [IQR] (as appropriate), the N used, the test statistic, p, and the effect size with 95% CI, with clear table footnotes on assumption checks and multiple-comparison procedures. This reorganization will significantly improve traceability.
Thank you for this constructive feedback. We agree that traceability is essential and have addressed both points to improve the clarity of our results:
- Figures: We have ensured that all figures are exported as high-resolution files and will upload them according to the editorial system's guidelines so that they render correctly.
- Table Reformatting: We have completely reformatted Table 1 as per your detailed recommendations. The new table now significantly improves traceability. Specifically, it includes:
- Data presented as Mean (SD) or Median [IQR] as appropriate for the statistical test used (t-test or Mann-Whitney U, respectively).
- The N (instances) used for both the Junior and Professional groups in each comparison.
- The statistical test, the exact p-value, and the effect size (ES) complete with its 95% Confidence Interval (CI).
- Clear table footnotes have been added describing the assumption checks and multiple-comparison procedures.
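The effect-size-with-CI reporting described above can be computed along these lines. The rank-biserial correlation shown here (r = 1 − 2U/(n₁n₂)) and the percentile bootstrap are one reasonable choice among several, and the samples are illustrative, not the study's data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Illustrative junior vs. professional samples for one load variable
junior = rng.normal(55, 8, 40)
professional = rng.normal(60, 8, 40)

def rank_biserial(x, y):
    """Rank-biserial correlation derived from the Mann-Whitney U statistic:
    r = 1 - 2U / (n1 * n2), bounded in [-1, 1]."""
    u, _ = stats.mannwhitneyu(x, y)
    return 1 - 2 * u / (len(x) * len(y))

r = rank_biserial(junior, professional)

# Percentile bootstrap 95% CI for the effect size
boots = [rank_biserial(rng.choice(junior, len(junior)),
                       rng.choice(professional, len(professional)))
         for _ in range(2000)]
lo, hi = np.percentile(boots, [2.5, 97.5])
print(f"r = {r:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```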
Your between-group interpretation (juniors vs. professionals) should explicitly acknowledge plausible confounding from maturation and tactical/technical differences, not just physical capacity. Please include a paragraph in the Discussion that frames these contrasts as observational and potentially confounded, and refrain from using causal language. Where you discuss training-load management frameworks (e.g., ACWR) or injury risk, mark these as practical implications/hypotheses rather than claims supported by your data, since the study does not model internal load, recovery, or outcomes.
We thank the reviewer for ensuring our conclusions remain strictly aligned with our observational data. We have addressed this in two ways:
- Acknowledging Confounders: As requested, we have added a new introductory sentence to the paragraph comparing juniors and professionals. This sentence explicitly frames these contrasts as "observational" and acknowledges that they are "plausibly confounded" by maturation, tactical differences, and movement efficiency, not just the physical capacities we measured.
- Reframing Implications: We have performed a careful review of the Discussion and Conclusions to remove any causal language. We have revised sentences to ensure that all mentions of load management frameworks (like ACWR) or injury risk are clearly marked as "practical implications" or "hypotheses" rather than as direct claims supported by our data. For example, we changed language from "necessitates careful management" to "highlights the need for careful management".
On language and terminology, I recommend a light editing pass to tighten sentences and unify terms. Please standardize the hyphenation and capitalization: “all-court” (with hyphen), “net play”, and “Player Load” (capitalized consistently). In the Abstract, change “were founded” to “were found.”
Thank you for this detailed editorial feedback. We have performed a thorough copyedit of the entire manuscript to tighten sentences and unify terminology as requested.
Specifically, we can confirm that:
- The typographical error in the Abstract has been corrected from 'were founded' to 'were found'.
- We have standardized all key terminology. The manuscript now consistently uses 'Player Load' (capitalized), 'all-court' (with hyphen), and 'net play' throughout the text, tables, and figures.
Also check spacing before bracketed citations (e.g., “player [6]”) and ensure all measurement units appear on axes and in legends.
For figures and tables, verify that every figure includes units on axes (e.g., m/min; counts/min; a.u./min), legible labels, and a self-contained legend stating the statistical approach. In Table 1, correct any label mismatches (e.g., use “net play” instead of “net game,” and “all-court” consistently). These are minor editorial fixes, but they are essential for readability.
Thank you for this detailed editorial feedback. We agree that these fixes are essential for readability and have addressed each point:
- We have performed a full pass on the manuscript to correct any improper spacing before bracketed citations (e.g., "player[6]" is now "player [6]").
- We have verified all figures and tables to ensure that units are included on all axes (e.g., m/min, a.u./min) and that all labels are legible.
- We have also reviewed and corrected the labels in Table 1 to ensure consistent terminology, specifically changing 'net game' to 'net play' and ensuring 'all-court' is used uniformly.
I’m pleased with the overall alignment of the Discussion and Conclusions with the actual findings; however, please add a concise paragraph on limitations, specifically, the single-center setting, short microcycle, lack of internal-load/health outcomes, and the potential dependence structure, so that readers can weigh the external validity appropriately. Consider adding a brief, practical paragraph that translates your situation-specific findings into weekly periodization guidance (e.g., balancing “costly” baseline/all-court volumes with lower-cost technical tasks within a day), clearly framed as an application rather than evidence tested here.
Thank you for these two excellent suggestions to improve the impact and transparency of our paper. We have implemented both:
- Limitations Paragraph: We have revised our existing limitations paragraph to ensure it concisely addresses all four points you raised: the single-center setting, the short microcycle, the lack of internal-load/health outcomes, and the potential dependence structure of our statistical approach.
- Practical Periodization Paragraph: We have added a new, brief paragraph to the Discussion exactly as you suggested. This paragraph translates our findings into a concrete weekly periodization example (i.e., balancing 'costly' baseline/all-court days with 'lower-cost' technical tasks). As requested, this paragraph is clearly framed as a practical application and hypothesis for coaches, rather than as evidence directly tested by our study.
Finally, please clean the references before resubmission. Remove duplicate entries, standardize journal and title capitalization, add missing DOIs/ISBNs where applicable, and fix any placeholder or malformed citations. A consistent reference list is essential for a top-tier submission and will avoid confusion during production.
The study is meaningful and well-motivated, with a sizeable real-world dataset. To meet the bar of a high-impact journal, the manuscript needs a major revision focused on (1) fully auditable methods (signal processing and thresholds), (2) a statistical approach that respects nesting or an explicit justification for the current tests plus uniform effect-size reporting with CIs, (3) table/figure reformatting and reliable rendering, (4) a clearer ethical description of autonomy safeguards for juniors in a coach-led setting, (5) unified terminology and language, and (6) reference list cleanup. With these revisions, the work will communicate more effectively exactly what was measured, how it was derived, and how to interpret situation-specific differences in tennis training with confidence.
Comments on the Quality of English Language
The manuscript is readable and generally clear, but it would benefit from a minor to moderate language edit to improve precision, consistency, and flow. I recommend tightening long sentences, ensuring consistent terminology and capitalization, standardizing units and symbols across text, tables, and figure axes, and addressing occasional issues with punctuation, spacing, and grammar (including article use and subject–verb agreement). A careful copyedit focused on clarity and uniform style will strengthen the presentation without altering the scientific content.
As requested, we have performed a thorough cleanup of the reference list, removing all duplicate entries, standardizing formatting, and adding missing DOIs.
We also conducted the language edit you recommended, tightening sentences and ensuring all terminology, capitalization, and units are now fully consistent.
By completing these final revisions, we are confident that we have now addressed all points from your review. We thank you again for your time and expertise, which have been invaluable in improving the quality of our work.
Author Response File:
Author Response.pdf
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
Thank you for the revision. I have carefully reviewed the second version, and I see a promising study with practical value for high-performance tennis; however, there are still essential adjustments needed to solidify the conclusions. My aim here is to be direct and collaborative so the next round moves forward clearly.
I will start with the analytical core. Your observations are repeated drill instances taken from the same athletes and likely grouped by session or day. This calls for a hierarchical model. The current analysis, based on group and situation comparisons using non-hierarchical tests, does not account for within-athlete correlation and tends to inflate Type I error. Please re-run the analysis using mixed-effects models with a random effect for athlete and, if possible, for session or day, and fixed effects for situation, group, and their interaction. Report model-based marginal means with confidence intervals, provide adjusted post hoc contrasts, and present effect sizes coherent with the chosen model. This change raises the robustness of the study.
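For illustration only, the requested hierarchical analysis could be sketched as follows with `statsmodels`; the data are simulated and the column names (`distance_per_min`, `situation`, `group`, `athlete`) are hypothetical placeholders, not the authors' actual variables or pipeline:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated long-format drill data; values and column names are illustrative.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "athlete": rng.integers(0, 12, n).astype(str),
    "situation": rng.choice(["serve", "baseline", "net", "all-court"], n),
    "group": rng.choice(["junior", "professional"], n),
})
# Inject a per-athlete offset so the random-intercept variance is non-trivial.
athlete_eff = {a: rng.normal(0, 5) for a in df["athlete"].unique()}
df["distance_per_min"] = (
    100 + df["athlete"].map(athlete_eff) + rng.normal(0, 10, n)
)

# Mixed-effects model: fixed effects for situation, group, and their
# interaction; a random intercept per athlete absorbs within-athlete
# correlation across repeated drill instances.
model = smf.mixedlm(
    "distance_per_min ~ situation * group",
    data=df,
    groups=df["athlete"],
)
result = model.fit()
print(result.summary())
```

Model-based marginal means and adjusted post hoc contrasts could then be derived from `result`, with a second random term for session or day added if the data support it.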
I also need more precise definitions and thresholds. The “explosive distance” uses an acceleration cut-off that seems too low to isolate truly intense efforts in tennis. I would like to see an explicit justification grounded in tennis-specific literature, as well as a sensitivity analysis with at least two stricter cut-offs. Please also describe the device and software pipeline. Specify the effective sampling rate used in derived calculations, the detection criteria for accelerations and decelerations, and any filtering or smoothing applied. Without this transparency, it is difficult to interpret the magnitude of the reported differences.
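The kind of sensitivity analysis being requested can be sketched as follows; the acceleration trace, sampling rate, and cut-off values here are simulated and purely illustrative, not the study's actual thresholds:

```python
import numpy as np

# Simulated 10 Hz speed (m/s) and acceleration (m/s^2) traces for one drill;
# all values are illustrative only.
rng = np.random.default_rng(1)
accel = rng.normal(0, 2.0, 6000)             # 10 min at 10 Hz
speed = np.abs(rng.normal(2.0, 1.0, 6000))   # m/s
dt = 0.1                                     # sampling interval (s)

def explosive_distance(speed, accel, cutoff, dt):
    """Distance covered while |acceleration| exceeds the cut-off."""
    mask = np.abs(accel) > cutoff
    return float(np.sum(speed[mask] * dt))

# Recompute the metric under progressively stricter cut-offs and compare.
for cutoff in (2.0, 3.0, 4.0):
    d = explosive_distance(speed, accel, cutoff, dt)
    print(f"cut-off {cutoff} m/s^2 -> explosive distance {d:.1f} m")
```

Reporting how the between-situation differences change across such cut-offs would show whether the conclusions depend on the chosen threshold.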
The per-minute normalization needs a better explanation. The protocol includes structured rest. I need to know whether these intervals were included in the denominator and, if they were, how that affects comparability across situations with different pause structures. If appropriate, I recommend recalculating outcomes per effective active minute and discussing the impact of that decision. This point is central for readers to understand what the metrics truly represent on court.
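The two candidate denominators can be made concrete with a minimal sketch; the drill numbers below are invented for illustration:

```python
# Sketch: per-minute normalization with and without structured rest in the
# denominator. All values and function names are illustrative.

def per_total_minute(distance_m, active_s, rest_s):
    """Distance per minute of total drill time (active play + rest)."""
    return distance_m / ((active_s + rest_s) / 60.0)

def per_active_minute(distance_m, active_s):
    """Distance per effective active minute, excluding structured rest."""
    return distance_m / (active_s / 60.0)

# Example drill: 800 m covered in 6 min of active play plus 2 min of rest.
distance, active, rest = 800.0, 360.0, 120.0
print(per_total_minute(distance, active, rest))  # 100.0 m/min
print(per_active_minute(distance, active))       # ~133.3 m/min
```

Because situations with different pause structures dilute the total-time denominator to different degrees, the two definitions can rank situations differently, which is why the choice needs to be stated explicitly.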
I would have appreciated a more granular picture of the sampling design. Please include a simple table showing, for each situation-by-group combination, the number of drill instances analyzed and the mean duration of the exercises. If possible, add a per-athlete version. This helps assess balance, exposure, and the relative influence of each condition on the analysis. Without it, evaluating the distribution of data is difficult.
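The requested sampling-design table is straightforward to produce from a drill log; the rows and column names below are hypothetical examples, not the study's data:

```python
import pandas as pd

# Illustrative drill log; column names and values are hypothetical.
log = pd.DataFrame({
    "situation": ["serve", "baseline", "net", "serve",
                  "baseline", "net", "all-court", "all-court"],
    "group": ["junior", "junior", "junior", "professional",
              "professional", "professional", "junior", "professional"],
    "duration_min": [4.0, 6.5, 3.0, 4.5, 7.0, 3.5, 8.0, 7.5],
})

# One row per situation-by-group cell: number of drill instances analysed
# and mean drill duration, as the reviewer requests.
summary = (
    log.groupby(["situation", "group"])
       .agg(n_drills=("duration_min", "size"),
            mean_duration=("duration_min", "mean"))
       .reset_index()
)
print(summary)
```

Adding `"athlete"` to the `groupby` keys would yield the per-athlete version of the same table.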
Please also unify the presentation of descriptive and inferential statistics. Currently, there is a mix of medians with interquartile ranges and means with standard deviations, along with alternation between parametric and non-parametric tests. After moving to mixed models, use model-based estimates with confidence intervals throughout. Double-check Ns, effect-size directions and definitions, the ordering of confidence intervals, and the consistency between test statistics and p-values. If any table has formatting issues or inconsistencies, revise it carefully and, if possible, provide a supplementary file that allows readers to reproduce the calculations.
In the spirit of open science, I ask for a reproducibility appendix. A concise model description, the analysis code, a data dictionary, and a minimal, anonymized dataset at the athlete-by-situation level would allow others to understand and validate the work. This substantially strengthens the study’s credibility and aligns with current expectations in the field.
Regarding scope, I ask that you temper the conclusions. The findings come from a single center, on hard courts, with men, within a specific microcycle, and with one coach implementing the drills. Please discuss seasonality and the likelihood of different results on other surfaces and in women. This clarification does not diminish the study’s relevance; it clarifies where the results can be applied with confidence.
In the text, I encourage the standardization of terms and units. Use a consistent spelling for all-court, define abbreviations at first occurrence, and maintain consistency between m per minute and its scientific notation across text, figures, and tables. In the discussion, when comparing with prior work, provide orders of magnitude to orient the reader and clearly separate the authors’ interpretation from empirical evidence.
Some targeted improvements can add value without significant cost. It would be helpful if the figures included model estimates with confidence intervals, rather than just descriptive summaries. Even with per-minute normalization, consider showing mean drill duration and testing whether residual duration confounds comparisons. If there are relevant differences between feed-ball drills and live rallies, a sensitivity analysis that separates these cases would be helpful. I also suggest verifying device validity for lateral tennis movements and reporting signal quality in terms of dropouts and clipping. Please take the opportunity to remove duplicate references and complete missing DOIs.
I see a study with direct applicability for weekly periodization and drill design in competitive tennis. If you implement the hierarchical reanalysis, clarify thresholds and time base, and unify the presentation of results with analytical transparency, the manuscript will take a clear step forward. I am available to comment on the next version and, if useful, to look at a draft of the model code.
Comments on the Quality of English Language
The manuscript is clear overall but would benefit from a minor to moderate language edit to sharpen precision and flow. Please tighten long sentences, ensure consistent terminology and capitalization, standardize units and symbols across text, tables, and figures, and resolve occasional issues with punctuation, spacing, and grammar. A focused copyedit will strengthen the presentation without altering the science.
Author Response
Thank you for the revision. I have carefully reviewed the second version, and I see a promising study with practical value for high-performance tennis; however, there are still essential adjustments needed to solidify the conclusions. My aim here is to be direct and collaborative so the next round moves forward clearly.
I will start with the analytical core. Your observations are repeated drill instances taken from the same athletes and likely grouped by session or day. This calls for a hierarchical model. The current analysis, based on group and situation comparisons using non-hierarchical tests, does not account for within-athlete correlation and tends to inflate Type I error. Please re-run the analysis using mixed-effects models with a random effect for athlete and, if possible, for session or day, and fixed effects for situation, group, and their interaction. Report model-based marginal means with confidence intervals, provide adjusted post hoc contrasts, and present effect sizes coherent with the chosen model. This change raises the robustness of the study.
In the first round, you correctly identified the hierarchical structure of our data. You noted that our current tests were "sensible" and explicitly offered an alternative to LMMs: to "explicitly justify why ignoring within-player dependence does not inflate Type I error and report effect sizes with 95% CIs consistently". We chose that second option. We re-ran all statistics to include 95% CIs for all effect sizes (as seen in our revised Table 1), added the justification, and included the statistical approach as a key limitation, exactly as you instructed. Your new feedback now dismisses this solution entirely and mandates the LMMs that were originally presented as the preferred, but not the only, option. This feels like a significant escalation of demands, moving the goalposts after we have already completed a major revision based on your own suggested path. Furthermore, our original statistical approach (using t-tests/Mann-Whitney U with effect sizes and CIs) is a conventional method that has been widely accepted in many Q1 journals, including Applied Sciences, for studies with similar observational designs.
I also need more precise definitions and thresholds. The “explosive distance” uses an acceleration cut-off that seems too low to isolate truly intense efforts in tennis. I would like to see an explicit justification grounded in tennis-specific literature, as well as a sensitivity analysis with at least two stricter cut-offs. Please also describe the device and software pipeline. Specify the effective sampling rate used in derived calculations, the detection criteria for accelerations and decelerations, and any filtering or smoothing applied. Without this transparency, it is difficult to interpret the magnitude of the reported differences.
The per-minute normalization needs a better explanation. The protocol includes structured rest. I need to know whether these intervals were included in the denominator and, if they were, how that affects comparability across situations with different pause structures. If appropriate, I recommend recalculating outcomes per effective active minute and discussing the impact of that decision. This point is central for readers to understand what the metrics truly represent on court.
This information has been added to the manuscript, highlighted in yellow, following the first round of review.
I would have appreciated a more granular picture of the sampling design. Please include a simple table showing, for each situation-by-group combination, the number of drill instances analyzed and the mean duration of the exercises. If possible, add a per-athlete version. This helps assess balance, exposure, and the relative influence of each condition on the analysis. Without it, evaluating the distribution of data is difficult.
This information has been added to the manuscript, highlighted in yellow, following the first round of review.
Please also unify the presentation of descriptive and inferential statistics. Currently, there is a mix of medians with interquartile ranges and means with standard deviations, along with alternation between parametric and non-parametric tests. After moving to mixed models, use model-based estimates with confidence intervals throughout. Double-check Ns, effect-size directions and definitions, the ordering of confidence intervals, and the consistency between test statistics and p-values. If any table has formatting issues or inconsistencies, revise it carefully and, if possible, provide a supplementary file that allows readers to reproduce the calculations.
As indicated previously, the authors would like to maintain the analysis from the first round of review.
In the spirit of open science, I ask for a reproducibility appendix. A concise model description, the analysis code, a data dictionary, and a minimal, anonymized dataset at the athlete-by-situation level would allow others to understand and validate the work. This substantially strengthens the study’s credibility and aligns with current expectations in the field.
Thank you for your comment; however, in our view, this information is already fully explained in the Materials and Methods section.
Regarding scope, I ask that you temper the conclusions. The findings come from a single center, on hard courts, with men, within a specific microcycle, and with one coach implementing the drills. Please discuss seasonality and the likelihood of different results on other surfaces and in women. This clarification does not diminish the study’s relevance; it clarifies where the results can be applied with confidence.
In the text, I encourage the standardization of terms and units. Use a consistent spelling for all-court, define abbreviations at first occurrence, and maintain consistency between m per minute and its scientific notation across text, figures, and tables. In the discussion, when comparing with prior work, provide orders of magnitude to orient the reader and clearly separate the authors’ interpretation from empirical evidence.
This information has been corrected in the manuscript, highlighted in yellow, following the first round of review.
Some targeted improvements can add value without significant cost. It would be helpful if the figures included model estimates with confidence intervals, rather than just descriptive summaries. Even with per-minute normalization, consider showing mean drill duration and testing whether residual duration confounds comparisons. If there are relevant differences between feed-ball drills and live rallies, a sensitivity analysis that separates these cases would be helpful. I also suggest verifying device validity for lateral tennis movements and reporting signal quality in terms of dropouts and clipping. Please take the opportunity to remove duplicate references and complete missing DOIs.
This information has been corrected in the manuscript, highlighted in yellow, following the first round of review.
I see a study with direct applicability for weekly periodization and drill design in competitive tennis. If you implement the hierarchical reanalysis, clarify thresholds and time base, and unify the presentation of results with analytical transparency, the manuscript will take a clear step forward. I am available to comment on the next version and, if useful, to look at a draft of the model code.
Comments on the Quality of English Language
The manuscript is clear overall but would benefit from a minor to moderate language edit to sharpen precision and flow. Please tighten long sentences, ensure consistent terminology and capitalization, standardize units and symbols across text, tables, and figures, and resolve occasional issues with punctuation, spacing, and grammar. A focused copyedit will strengthen the presentation without altering the science.
The manuscript has been revised by a native English speaker.
Author Response File:
Author Response.pdf