Rugby Sevens sRPE Workload Imputation Using Objective Models of Measurement
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Dear authors,
This article has too many similarities (47% and 18% from one article) to be accepted for publication or to enter the review stage of a prestigious journal. The modifications should be made even if it's another one of your articles. Such a percentage of similarities also means that your article lacks elements of originality. The recommendation is to make all changes concerning the similarities and then resubmit the article for review.
Author Response
Reviewer 1
This article has too many similarities (47% and 18% from one article) to be accepted for publication or to enter the review stage of a prestigious journal. The modifications should be made even if it's another one of your articles. Such a percentage of similarities also means that your article lacks elements of originality. The recommendation is to make all changes concerning the similarities and then resubmit the article for review.
Response to Reviewer 1:
Thank you for your detailed feedback regarding the similarity index of our manuscript. We acknowledge the overlap identified with prior work and appreciate the importance of maintaining originality and avoiding self-plagiarism, even when building upon related studies.
This manuscript represents a direct and purposeful extension of earlier research, with the goal of advancing the understanding of athlete workload modeling through the use of imputation. With respect to originality, this manuscript uses two workload models: a Speed-Deceleration-Contact (SDC) model and a mechanical work model, evaluated using an established framework that includes statistical accuracy, RMSE, R², and equivalence testing. The decision to retain consistency with previously published methods and evaluation metrics is intentional and, more crucially, necessary to enable fair and accurate comparisons across imputation strategies.
We acknowledge that portions of the methodology overlap with previously published work. This is because the current study uses the same dataset as prior research in order to extend those findings through the application of the two objective workload models. To ensure consistency and comparability, certain descriptions, such as data collection procedures and sensor-derived variables, were closely aligned with, though not copied from, earlier reporting (lines 119-213). Nonetheless, the analytical approach and research questions are distinct and novel. We have revised the text to rephrase overlapping sections, ensured all material is properly cited, and emphasized the novel aspects of this study more explicitly (lines 115-116: “This novel approach assesses how two objective workload strategies behave when used to impute sRPE-CL data.”), while maintaining the methodological consistency necessary for valid comparison. We appreciate the opportunity to improve the clarity and originality of the submission.
Reviewer 2 Report
Comments and Suggestions for Authors
This study presents an investigation into RPE imputation using objective models (mechanical work and SDC) in women’s rugby sevens. The manuscript falls short of publication standards in its current form. The research question is relevant, but the execution, analysis, and presentation require substantial revisions to address methodological limitations, statistical clarity, and broader implications. Key concerns include inadequate justification for model selection, insufficient discussion of limitations, and unclear practical applications. Below are the major issues requiring attention:
- The Introduction adequately outlines the limitations of sRPE and proprietary ATD metrics but fails to clearly articulate why imputation (rather than alternative objective metrics) is the preferred solution for missing RPE data. Prior studies suggest that imputation is often a last resort due to inherent inaccuracies.
- The SDC model is introduced as a novel alternative but lacks a clear theoretical link to RPE.
- The monitoring of loads in athletes is essential for injury prevention and optimization of athletic performance. However, the introduction section lacks relevant background and it is recommended that more detailed descriptions be added to introduce the topic. To provide more effective evidence, the authors may consider referring to the following relevant studies: Accurately and effectively predict the ACL force: Utilizing biomechanical landing pattern before and after-fatigue (https://doi.org/10.1016/j.cmpb.2023.107761).
- The paper lacks a robust rationale for focusing solely on linear regression and random forest models. Given the modest performance of these models (e.g., SDC-random forest accuracy: 27.2%, R²: 0.338), why were more advanced techniques (e.g., Bayesian networks, ensemble methods) not explored? Compare with prior work to contextualize choices.
- While statistical equivalence was achieved (TOST test), the practical utility of imputed RPE remains questionable due to low accuracy. Discuss whether such models are truly interchangeable with true RPE in real-world settings, especially given the ANOVA results showing significant differences.
- The study assumes missingness is completely at random (MCAR). Please justify this assumption and address potential biases if data are missing not at random (MNAR), which is common in athlete self-reports.
- The comparison with the daily team mean substitution is insufficient. Include benchmarks against other imputation methods (e.g., k-nearest neighbors, MICE) to strengthen claims of superiority.
- The SDC model’s components (e.g., contact counts) may not generalize to other sports or contexts. Discuss its applicability beyond rugby sevens and potential adjustments for sports with different demands.
- The conclusion advocates for objective metrics but overlooks the value of subjective RPE in capturing psychological load (e.g., Saw et al., 2016). Rebalance the discussion to acknowledge both approaches.
- Figure 3’s accuracy trends are unclear. Add error bars or confidence intervals. Table 1 should include p-values for model comparisons.
Author Response
Reviewer 2
- The Introduction adequately outlines the limitations of sRPE and proprietary ATD metrics but fails to clearly articulate why imputation (rather than alternative objective metrics) is the preferred solution for missing RPE data. Prior studies suggest that imputation is often a last resort due to inherent inaccuracies.
We thank the reviewer for this assertion and fully agree: imputation should be a last resort for replacing information. However, as the reviewer recognized in a later point (#9), sRPE data hold value beyond the purely physical space, and practitioners may wish to retain sRPE data for calculating sRPE-CL as a means of preserving a more holistic appreciation of the athlete’s perceived effort. This manuscript intends to provide options, via objective metrics, for the imputation of sRPE to preserve, or to work as an alternative to, sRPE. To that end, we have added more information on the importance of retaining sRPE data (lines 49-53).
Lines 49-53: While the sRPE scale acts as a proxy measure of sessional intensity, it remains a subjective element as it reflects the perception of effort [7]. However, the use of sRPE to inform sRPE-CL enables a holistic approach to understanding an athlete’s response to the stress of training or competition [10]. The imputation of sRPE data, to ensure continuity in sRPE-CL athlete load data, preserves insights into the athlete’s psychological and physiological experiences [10].
- The SDC model is introduced as a novel alternative but lacks a clear theoretical link to RPE.
We thank the reviewer for this concern and have added further detail on the previously published SDC model to enhance the reader’s understanding of the model and its connections to sRPE (lines 93-103).
Lines 93-103: The SDC regression model is presented in Equation 1, where u_athlete represents the random error of each athlete, and the distances in each zone are individualized to each athlete [19]. Epp-Stobbe et al. (2024) provide more detail on the specifics of how these zones are individualized [21]. This model was found to have reasonable explanatory power, with a moderate to strong relationship to sRPE-CL (adjusted R² = 0.487), suggesting it may be used in place of sRPE-CL as a load monitoring tool [21].
sRPE-CL = -0.852 + 53.87(Total High Deceleration Distance) + 0.159(Contact Count) - 53.46(High Speed × High Deceleration Distance) - 26.59(Low Speed × High Deceleration Distance) + u_athlete ± 10.989
(Equation 1)
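As an illustration only (this is not the authors' code, and the input values are invented for the example), Equation 1 can be evaluated directly from its published coefficients; the ±10.989 term is the residual error band around the point prediction, not part of the prediction itself:

```python
# Hypothetical sketch of Equation 1: point prediction of sRPE-CL from the
# published SDC coefficients. Variable names and example inputs are
# illustrative assumptions, not values from the manuscript's dataset.

def sdc_srpe_cl(total_high_decel_dist, contact_count,
                high_speed_high_decel_dist, low_speed_high_decel_dist,
                u_athlete=0.0):
    """Evaluate the SDC regression (Equation 1); u_athlete is the
    athlete-specific random term, zero for a population-level prediction."""
    return (-0.852
            + 53.87 * total_high_decel_dist
            + 0.159 * contact_count
            - 53.46 * high_speed_high_decel_dist
            - 26.59 * low_speed_high_decel_dist
            + u_athlete)

# Illustrative call with made-up inputs:
pred = sdc_srpe_cl(total_high_decel_dist=1.2, contact_count=18,
                   high_speed_high_decel_dist=0.4,
                   low_speed_high_decel_dist=0.6)
```

The ±10.989 band would then be applied around `pred` when reporting prediction uncertainty.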
- The monitoring of loads in athletes is essential for injury prevention and optimization of athletic performance. However, the introduction section lacks relevant background and it is recommended that more detailed descriptions be added to introduce the topic. To provide more effective evidence, the authors may consider referring to the following relevant studies: Accurately and effectively predict the ACL force: Utilizing biomechanical landing pattern before and after-fatigue (https://doi.org/10.1016/j.cmpb.2023.107761).
We appreciate the reviewer’s suggestion to enhance the relevant background around the importance of monitoring athlete load and have added material on this topic (lines 32-36).
Lines 32-36: By monitoring the physical output of athletes in the competition space, practitioners gain an understanding of the demands of performance, allowing for the appropriate design and deployment of training programs to meet these demands and ensuring protective effects against injury when athletes are exposed to future competitions [1]. More immediately, in rugby sevens tournaments, monitoring physical load provides information on tactical substitution strategies to help players manage in-game fatigue and informs optimal recovery strategies [2].
- The paper lacks a robust rationale for focusing solely on linear regression and random forest models. Given the modest performance of these models (e.g., SDC-random forest accuracy: 27.2%, R²: 0.338), why were more advanced techniques (e.g., Bayesian networks, ensemble methods) not explored? Compare with prior work to contextualize choices.
We thank the reviewer for this comment and have included additional contextualization on model choice in both the introduction and methods sections (lines 78-83; 164-168).
Lines 78-83: Similar results were echoed in a study on training sRPE in Australian football athletes, where random forest outperformed even the more complex neural network, C5.0 decision rules, and naïve Bayesian models [14]. This supports an opportunity for other, statistically or theoretically driven, objective models to improve the ability to impute missing sRPE data. Further, as additional models are developed, they will need to be compared using methodology similar to that originally presented for the imputation of sRPE in Epp-Stobbe et al. (2022) [15].
Lines 164-168: Previous work by Epp-Stobbe et al. (2022) demonstrated that, in a rugby sevens population, random forest outperformed other strategies, including neural networks and lasso, ridge, and elastic net regression, for the imputation of competition sRPE scores [15]. Carey et al. (2016) imputed missing training load data in an Australian football program and similarly identified that random forest outperformed even naïve Bayesian models [14].
- While statistical equivalence was achieved (TOST test), the practical utility of imputed RPE remains questionable due to low accuracy. Discuss whether such models are truly interchangeable with true RPE in real-world settings, especially given the ANOVA results showing significant differences.
We thank the reviewer for this comment and have expanded the discussion to focus on “real-world” considerations for the application of the models (lines 278-292).
Lines 278-292: The differences by imputation model type, shown in Figure 2, indicate that while the models may be statistically equivalent, they still produce workload values that are not identical. The differences across levels of missingness and imputation model type, shown in Figure 3, suggest that average accuracy is modest at best, with most models achieving peak accuracy when no more than 15% of the data were missing. With 1002 total rows of data available, 15% missingness would correspond to about 150 missing cases. In more practical terms, assuming six matches per tournament in which 12 athletes provide post-match sRPE values, a per-tournament missingness rate of 15% would leave about 11 of the possible 72 sRPE scores missing. This is almost two missed scores per match, which is quite high given that injuries per team-match hover around 0.2 [43]. This suggests that the level of missing data is usually below 15%, which in some cases may mean imputation strategies are interchangeable; for example, at 10% missingness, linear regression produces the same accuracy whether using the SDC or the mechanical work strategy (Figure 3). It remains critical, however, that practitioners understand the level of missingness in their particular datasets and consider their knowledge and available resources when selecting an imputation model type.
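The missingness figures quoted in that passage follow from simple arithmetic, which can be verified directly (the match and athlete counts are taken from the passage itself):

```python
# Arithmetic check of the missingness figures quoted above: 1002 total
# observations overall, and 6 matches x 12 athletes = 72 sRPE scores per
# tournament, each at 15% missingness.
total_rows = 1002
per_tournament = 6 * 12                            # matches x athletes

missing_overall = round(total_rows * 0.15)         # about 150 cases
missing_tournament = round(per_tournament * 0.15)  # about 11 scores
```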
- The study assumes missingness is completely at random (MCAR). Please justify this assumption and address potential biases if data are missing not at random (MNAR), which is common in athlete self-reports.
We thank the reviewer for this comment and have included additional information on the considerations for MCAR, relative to MAR or MNAR (lines 299-308).
Lines 299-308: In this analysis, missing values in athlete self-reported sRPE were assumed to be MCAR. This assumption was based on the context in which sRPE data were collected, typically following training sessions via standardized electronic or paper-based forms. sRPE entries were not dependent on the intensity, duration, or quality of the training session but were most commonly missing due to incidental, non-systematic factors unrelated to performance, such as when athletes are called to provide a post-match sample for doping control, require extensive medical treatment for an injury, or are asked to attend to media duties. In contrast, if missingness had been related to session difficulty (e.g., athletes avoiding reporting due to the high intensity of a match), this would align more closely with MNAR or MAR. Given the collection environment and the lack of bias in submission patterns, the MCAR assumption is considered reasonable for the purposes of this imputation analysis [45,46].
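Under MCAR, every observation has the same deletion probability regardless of its value, which is what simulation studies of missingness typically exploit. A minimal sketch (names and data are illustrative, not the authors' procedure):

```python
# Hypothetical sketch: simulating MCAR missingness on an sRPE vector. Under
# MCAR, the entries to delete are chosen uniformly at random, independent of
# the scores themselves (unlike MAR/MNAR, where deletion would depend on
# observed or unobserved values).
import random

def mask_mcar(values, missing_rate, seed=100):
    """Return a copy of values with missing_rate of entries set to None,
    chosen uniformly at random."""
    rng = random.Random(seed)
    n_missing = round(len(values) * missing_rate)
    idx = rng.sample(range(len(values)), n_missing)
    out = list(values)
    for i in idx:
        out[i] = None
    return out

masked = mask_mcar(list(range(100)), missing_rate=0.15)
```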
- The comparison with the daily team mean substitution is insufficient. Include benchmarks against other imputation methods (e.g., k-nearest neighbors, MICE) to strengthen claims of superiority.
We thank the reviewer for this comment and have included additional contextualization on model choice in both the introduction and methods sections (lines 78-83; 164-168), as well as consideration for other measures in the discussion (lines 314-317).
Lines 78-83: Similar results were echoed in a study on training sRPE in Australian football athletes, where random forest outperformed even the more complex neural network, C5.0 decision rules, and naïve Bayesian models [14]. This supports an opportunity for other, statistically or theoretically driven, objective models to improve the ability to impute missing sRPE data. Further, as additional models are developed, they will need to be compared using methodology similar to that originally presented for the imputation of sRPE in Epp-Stobbe et al. (2022) [15].
Lines 164-168: Previous work by Epp-Stobbe et al. (2022) demonstrated that, in a rugby sevens population, random forest outperformed other strategies, including neural networks and lasso, ridge, and elastic net regression, for the imputation of competition sRPE scores [15]. Carey et al. (2016) imputed missing training load data in an Australian football program and similarly identified that random forest outperformed even naïve Bayesian models [14].
Lines 314-317: For example, fuzzy clustering, multivariate imputation by chained equations (MICE), or Bayesian approaches may be able to more accurately describe outcomes for sport datasets, as these approaches suit subjectively reported data with overlapping input ranges and a limited set of possible outcomes [44,47,49,50].
- The SDC model’s components (e.g., contact counts) may not generalize to other sports or contexts. Discuss its applicability beyond rugby sevens and potential adjustments for sports with different demands.
We thank the reviewer for identifying this consideration. While we feel that the inclusion of physical contact makes this model applicable across rugby codes, there are certainly considerations for its applicability, as well as potential adjustments, which we have endeavoured to address in the discussion (lines 332-337).
Lines 332-337: While the SDC model was developed using a women’s rugby sevens population, it may be possible to generalize its applicability across rugby codes and to Australian football, where physical contact and decelerations feature heavily [5,10]. Further, the tactical variable included in the model, physical contact, could be replaced with another high-value tactical variable with appropriate investigation. For example, deceleration is a considerable factor in football, so a variable such as the number of ball touches could be included [17].
- The conclusion advocates for objective metrics but overlooks the value of subjective RPE in capturing psychological load (e.g., Saw et al., 2016). Rebalance the discussion to acknowledge both approaches.
We appreciate the reviewer’s observation regarding the importance of recognizing the role of subjective sRPE. We have sought to rebalance the manuscript, emphasizing our particular focus on physical measures of load while recognizing the value of sRPE data (lines 49-53, 357-361, 365-368).
Lines 49-53: While the sRPE scale acts as a proxy measure of sessional intensity, it remains a subjective element as it reflects the perception of effort [7]. However, the use of sRPE to inform sRPE-CL enables a holistic approach to understanding an athlete’s response to the stress of training or competition [10]. The imputation of sRPE data, to ensure continuity in sRPE-CL athlete load data, preserves insights into the athlete’s psychological and physiological experiences [10].
Lines 357-361: The use of objective measures may provide a meaningful strategy for monitoring athletes’ physical output without the burden of reporting. GNSS units provide an “invisible monitoring” strategy that enables practitioners to gather information about the physical demands of competition and to ensure that athletes appropriately meet or exceed these demands in training to ensure a safe and lengthy sporting career [1].
Lines 365-368: To that end, it remains important for practitioners to appreciate the value of both subjective and objective reflections of athlete load, whereby the athlete’s subjective perception of the experience holds value in different ways and should be used alongside measures of the athlete’s physical output [7,10,11].
- Figure 3’s accuracy trends are unclear. Add error bars or confidence intervals. Table 1 should include p-values for model comparisons.
We thank the reviewer for this helpful suggestion and have added error bars to both figures (lines 236 & 242).
Reviewer 3 Report
Comments and Suggestions for Authors
The manuscript explores RPE workload in elite rugby using regression models based on objective GNSS-derived metrics. While the topic is relevant and the experimental setting focused in real-world performance contexts, the methodological transparency, statistical rigor, and interpretative caution are lacking in several parts of the paper.
1 - The description of the SDC (Speed-Distance Class) model lacks full clarity. The formula is abstracted but not operationalized. What thresholds define each class? Were they calibrated per player or fixed? In any case, clarification of this issue is required.
2 - Regression models were implemented in R using both linear and random forest approaches. Yet the hyperparameter tuning procedure is entirely omitted. How many trees were used? Was cross-validation nested or external? Was random seed fixed? I recommend to add discussion about this issue.
3 - The transformation of RPE into model-ready targets is unclear. Is it modeled as a continuous variable, ordinal class, or treated as pseudo-linear?
4 - The use of R² and RMSE is standard, but the use of “accuracy” is misleading in a regression context. If accuracy refers to within ±1 unit of RPE, this must be explicitly stated.
5 - No confidence intervals or error margins are reported anywhere. Single-point metrics are not sufficient in a dataset with this level of variability and missingness. Then, it is difficult to understand the real performance of the proposed approach compared with previous works in the current literature.
6 - The two one-sided test (TOST) is used appropriately in principle, but its justification is weak. The authors adopt Cohen’s d = 0.2 as the equivalence bound without tailoring it to the RPE distribution or contextual variability in elite sport. However, equivalence is claimed across all models, even though R² values remain low (max ~0.33) and RMSE values suggest wide prediction error margins. These findings are not discussed with sufficient nuance. It is necessary to clarify this issue.
7 - How are matches vs. training sessions treated? Were differences in load magnitude controlled for?
8 - Figures 2 and 3 are poorly annotated and insufficiently discussed in the results. For example, Figure 3 shows RMSE distributions, but the text barely references it.
9 - No visualizations of error distributions, residuals, or per-player variance are included, which is a serious omission in a regression-heavy paper.
Comments on the Quality of English Language
1 - The manuscript is generally well-structured, but the writing is overly verbose, redundant, and at times technically imprecise.
2 - Repetitive phrases like “statistically equivalent”, “objective measurement models”, and “team mean substitution” appear too often, diluting clarity.
3 - Transitions between sections (especially Methods → Results → Discussion) are abrupt, and figure interpretations are superficial or absent.
Author Response
Reviewer 3
1 - The description of the SDC (Speed-Distance Class) model lacks full clarity. The formula is abstracted but not operationalized. What thresholds define each class? Were they calibrated per player or fixed? In any case, clarification of this issue is required.
We thank the reviewer for this comment. While the SDC model used was informed by previous works (Epp-Stobbe et al., 2024), we have included additional information about the model to improve clarity (lines 93-103).
Lines 93-103: The SDC regression model is presented in Equation 1, where u_athlete represents the random error of each athlete, and the distances in each zone are individualized to each athlete [19]. Epp-Stobbe et al. (2024) provide more detail on the specifics of how these zones are individualized [21]. This model was found to have reasonable explanatory power, with a moderate to strong relationship to sRPE-CL (adjusted R² = 0.487), suggesting it may be used in place of sRPE-CL as a load monitoring tool [21].
sRPE-CL = -0.852 + 53.87(Total High Deceleration Distance) + 0.159(Contact Count) - 53.46(High Speed × High Deceleration Distance) - 26.59(Low Speed × High Deceleration Distance) + u_athlete ± 10.989
(Equation 1)
Reference: Epp-Stobbe, A., Tsai, M.-C., Klimstra, M. D. (2024a). Predicting athlete workload in women’s rugby sevens using GNSS sensor data, contact count and mass. Sensors, 24, 6699. https://doi.org/10.3390/s24206699.
2 - Regression models were implemented in R using both linear and random forest approaches. Yet the hyperparameter tuning procedure is entirely omitted. How many trees were used? Was cross-validation nested or external? Was random seed fixed? I recommend to add discussion about this issue.
We thank the reviewer for this comment and have added additional information to clarify the hyperparameter tuning for the random forest model in the methods (lines 170-186).
Lines 170-186: This process was repeated 100 times, and the mean values from these iterations were used for further analysis to generate predicted mechanical work scores. Following the process for the imputation of sRPE from mechanical work, the same models were used to impute sRPE from the Speed-Deceleration-Contact (SDC) model using the same datasets, with a fixed random seed value set to 100.
In this random forest model, the primary tuning decision involved adjusting the number of predictor variables considered at each decision point in the trees. This value was set to the square root of the total number of predictors, introducing greater randomness between trees, which can improve generalizability and reduce overfitting, a property especially useful when predicting bounded outcomes like sRPE scores. All other model settings were kept at their default values within the randomForest package [31]. Specifically, the model generated 500 trees, ensuring stability in predictions without excessive computation time. Each tree was built using a bootstrap sample of the training data, supporting ensemble diversity [31]. The trees were allowed to grow fully, with a minimum of five observations required to create a terminal node, which enabled the model to capture subtle patterns in the data [31]. Collectively, these parameters provide a strong balance between predictive power, model interpretability, and computational efficiency, making them suitable for exploratory analysis or imputation tasks in moderately sized datasets [31].
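For readers working outside R, the configuration described above (500 trees, sqrt(p) candidate predictors per split, terminal nodes of at least five observations, bootstrap sampling, fixed seed) can be re-expressed as follows. This is a hedged translation, not the authors' code: the original analysis used the R randomForest package, and scikit-learn is an assumed stand-in whose defaults differ in other respects.

```python
# Hypothetical scikit-learn analogue of the R randomForest settings described
# in the response. Parameter names are scikit-learn's; the mapping to
# randomForest's ntree/mtry/nodesize is approximate.
from sklearn.ensemble import RandomForestRegressor

model = RandomForestRegressor(
    n_estimators=500,      # 500 trees, as stated above
    max_features="sqrt",   # sqrt(p) predictors tried at each split
    min_samples_leaf=5,    # at least five observations per terminal node
    bootstrap=True,        # each tree fit on a bootstrap sample
    random_state=100,      # fixed seed for reproducibility
)
```

The model is then fit and used for prediction in the usual `fit`/`predict` workflow.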
3 - The transformation of RPE into model-ready targets is unclear. Is it modeled as a continuous variable, ordinal class, or treated as pseudo-linear?
We thank the reviewer for this observation and have indicated the type of data for sRPE in the methods (lines 132-133).
Lines 132-133: sRPE data collected by these means are considered to be ordinal data [10].
4 - The use of R² and RMSE is standard, but the use of “accuracy” is misleading in a regression context. If accuracy refers to within ±1 unit of RPE, this must be explicitly stated.
We thank the reviewer for this comment and have added more information about accuracy in the methods (lines 190-193).
Lines 190-193: Accuracy was defined as the proportion of imputed sRPE scores that exactly matched the true sRPE scores. For each observation, an imputed score was considered accurate if it was numerically identical to the corresponding true score. Accuracy was then calculated as the number of exact matches divided by the total number of predictions.
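The exact-match definition quoted above amounts to a few lines of code. A minimal sketch with made-up scores:

```python
# Minimal sketch of the exact-match accuracy definition above: the proportion
# of imputed sRPE scores numerically identical to the true scores. The example
# scores are illustrative, not study data.
def exact_match_accuracy(true_scores, imputed_scores):
    matches = sum(t == p for t, p in zip(true_scores, imputed_scores))
    return matches / len(true_scores)

acc = exact_match_accuracy([7, 5, 8, 6], [7, 4, 8, 9])  # 2 of 4 match
```

Note that this is a strict criterion; a within-±1 tolerance, which the reviewer asked about, would replace `t == p` with `abs(t - p) <= 1`.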
5 - No confidence intervals or error margins are reported anywhere. Single-point metrics are not sufficient in a dataset with this level of variability and missingness. Then, it is difficult to understand the real performance of the proposed approach compared with previous works in the current literature.
We thank the reviewer for this helpful suggestion and have added error bars to both figures (lines 236 & 242).
6 - The two one-sided test (TOST) is used appropriately in principle, but its justification is weak. The authors adopt Cohen’s d = 0.2 as the equivalence bound without tailoring it to the RPE distribution or contextual variability in elite sport. However, equivalence is claimed across all models, even though R² values remain low (max ~0.33) and RMSE values suggest wide prediction error margins. These findings are not discussed with sufficient nuance. It is necessary to clarify this issue.
We thank the reviewer for this observation. We have attempted to enhance the discussion around the statistical equivalence and modest associations in the discussion (lines 278-291) as well as provide improved support for the use of ±0.2 as the equivalence bounds (lines 201-205).
Lines 278-291: The differences by imputation model type, shown in Figure 2, indicate that while the models may be statistically equivalent, they still produce workload values that are not identical. The differences across levels of missingness and imputation model type, shown in Figure 3, suggest that average accuracy is modest at best, with most models achieving peak accuracy when no more than 15% of the data were missing. With 1002 total rows of data available, 15% missingness would correspond to about 150 missing cases. In more practical terms, assuming six matches per tournament in which 12 athletes provide post-match sRPE values, a per-tournament missingness rate of 15% would leave about 11 of the possible 72 sRPE scores missing. This is almost two missed scores per match, which is quite high given that injuries per team-match hover around 0.2 [43]. This suggests that the level of missing data is usually below 15%, which in some cases may mean imputation strategies are interchangeable; for example, at 10% missingness, linear regression produces the same accuracy whether using the SDC or the mechanical work strategy (Figure 3). It remains critical, however, that practitioners understand the level of missingness in their particular datasets and consider their knowledge and available resources when selecting an imputation model type.
Lines 201-205: While it is ideal to define equivalence margins based on domain-specific or normative benchmarks, such reference data are currently lacking or insufficient in this particular sporting population. Cohen’s d = 0.2 provides a conservative and widely accepted standard for identifying negligible differences [40]. This approach ensures that potentially meaningful discrepancies are not dismissed while still enabling a statistically grounded evaluation of equivalence.
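A paired TOST with standardized bounds of ±0.2 can be sketched as below. This is an assumption-laden illustration, not the authors' exact procedure: it standardizes the bound by the standard deviation of the paired differences and uses a large-sample normal approximation in place of the t distribution to stay dependency-free.

```python
# Hypothetical sketch of a paired TOST with equivalence bounds of Cohen's
# d = +/-0.2 (i.e. +/-0.2 standard deviations of the paired differences).
# Normal approximation assumed; real analyses would use a t distribution.
import math
import statistics

def tost_equivalent(true_scores, imputed_scores, d_bound=0.2, alpha=0.05):
    diffs = [t - p for t, p in zip(true_scores, imputed_scores)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    delta = d_bound * sd_diff                # raw-unit equivalence bound
    se = sd_diff / math.sqrt(n)

    def p_upper(z):                          # P(Z >= z), normal approximation
        return 0.5 * math.erfc(z / math.sqrt(2))

    p1 = p_upper((mean_diff + delta) / se)   # H0: mean_diff <= -delta
    p2 = p_upper((delta - mean_diff) / se)   # H0: mean_diff >= +delta
    return max(p1, p2) < alpha               # equivalent if both H0 rejected
```

Equivalence is declared only when both one-sided nulls are rejected, which is why a small mean difference alone is not sufficient when the standard error is large.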
7 - How are matches vs. training sessions treated? Were differences in load magnitude controlled for?
We appreciate the reviewer’s question. As identified in the manuscript’s methods, the data came from competition, or matches, only. We have attempted to clarify this in the purpose statement (lines 112-115).
Lines 112-115: Therefore, the purpose of this investigation is to assess the accuracy of alternative load metrics of mechanical work and SDC, augmented with linear regression and random forest classification to impute competition sRPE as compared to DTMS.
8 - Figures 2 and 3 are poorly annotated and insufficiently discussed in the results. For example, Figure 3 shows RMSE distributions, but the text barely references it.
We thank the reviewer for this observation; we have included additional interpretations for the figures, including Figure 3, which references accuracy, in the discussion (lines 278-291).
Lines 278-291: The differences by imputation model type, shown in Figure 2, indicate that while the models may be statistically equivalent, they still produce workload values that are not identical. The differences across levels of missingness and imputation model type, shown in Figure 3, suggest that average accuracy is modest at best, with most models achieving peak accuracy when no more than 15% of the data were missing. With 1002 total rows of data available, 15% missingness would correspond to about 150 missing cases. In more practical terms, assuming six matches per tournament in which 12 athletes provide post-match sRPE values, a per-tournament missingness rate of 15% would leave about 11 of the possible 72 sRPE scores missing. This is almost two missed scores per match, which is quite high given that injuries per team-match hover around 0.2 [43]. This suggests that the level of missing data is usually below 15%, which in some cases may mean imputation strategies are interchangeable; for example, at 10% missingness, linear regression produces the same accuracy whether using the SDC or the mechanical work strategy (Figure 3). It remains critical, however, that practitioners understand the level of missingness in their particular datasets and consider their knowledge and available resources when selecting an imputation model type.
9 - No visualizations of error distributions, residuals, or per-player variance are included, which is a serious omission in a regression-heavy paper.
We thank the reviewer for this helpful suggestion and have added error bars to both figures (lines 236 & 242). We have also included information on residuals in a Supplementary Materials section (lines 525-535).
Comments on the Quality of English Language
1 - The manuscript is generally well-structured, but the writing is overly verbose, redundant, and at times technically imprecise.
We thank the reviewer for this comment. In attempting to balance all reviewer comments, we have sought to improve precision and minimize redundancies.
2 - Repetitive phrases like “statistically equivalent”, “objective measurement models”, and “team mean substitution” appear too often, diluting clarity.
We thank the reviewer for this comment and have adjusted phrasing where necessary, including the use of the acronym DTMS in place of “daily team mean substitution” throughout to enhance clarity.
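For readers unfamiliar with the DTMS acronym adopted above, daily team mean substitution simply replaces each missing sRPE value with the team's mean sRPE for that day. A minimal sketch, with hypothetical column names rather than the authors' actual schema:

```python
# Minimal sketch of daily team mean substitution (DTMS): each missing
# sRPE value is replaced by the team mean sRPE for the same day.
# Column names ("date", "sRPE") are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "date": ["d1", "d1", "d1", "d2", "d2", "d2"],
    "sRPE": [6.0, 8.0, None, 5.0, None, 7.0],
})
df["sRPE_imputed"] = df["sRPE"].fillna(
    df.groupby("date")["sRPE"].transform("mean")
)
print(df["sRPE_imputed"].tolist())  # -> [6.0, 8.0, 7.0, 5.0, 6.0, 7.0]
```

`transform("mean")` ignores missing values when averaging, so each gap is filled by the mean of that day's observed scores.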
3 - Transitions between sections (especially Methods → Results → Discussion) are abrupt, and figure interpretations are superficial or absent.
We thank the reviewer for this observation. In support of a previous point (#8), we have included additional interpretations for the figures in the discussion (lines 278-291).
Lines 278-291: (this passage is quoted in full in our response to comment #8 above).
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
Following the changes made by the authors in the review process and on the basis of the attached report, it can be stated that it meets the standards of originality required to be published in a journal of the prestige of this one, with a few modifications:
- At the end of the introduction, please emphasise the novel elements/elements of originality with which this study is presented and its place in the scientific literature.
- Ethics approval was provided by the University of Victoria for the use of voluntary data collection with the investigation complying with the principles outlined in the Declaration of Helsinki. The recommendation here is to specify that you have complied with the Declaration of Helsinki (DoH)—Ethical Principles for Medical Research Involving Human Participants (1964) and its latest amendments from the 75th General Assembly of the World Medical Association (WMA) in Finland, on October 19, 2024.
- It is recommended that the first sentence of this discussion be transferred to the introduction so that it can then be expanded in more detail.
- Please start the discussion by stating the study's aims and objectives and whether they have been achieved.
- Given the journal's prestige, please remove/replace the following outdated bibliographical sources from the study: 12, 22, 29, 32, 36-37 and 41.
Comments for author File: Comments.pdf
Minor editing of the English language is needed.
Author Response
- At the end of the introduction, please emphasise the novel elements/elements of originality with which this study is presented and its place in the scientific literature.
We thank the reviewer for this suggestion. In accordance with other suggestions from the same reviewer, we have emphasized the novelty of this investigation in the final paragraph of the introduction (lines 112-118).
Lines 112-118: This novel approach assesses how two objective workload strategies behave when used to impute sRPE-CL data. Outcomes of this investigation could provide important alternatives for the imputation of sRPE, which could better support the use of sRPE-CL to calculate athlete loads when missing data is experienced, supporting continuity of data longitudinally across high-performance sport programs. The use of these objective workload strategies to impute missing data may allow practitioners to better understand and address changes in the training and competition environment to support optimal performance.
- Ethics approval was provided by the University of Victoria for the use of voluntary data collection with the investigation complying with the principles outlined in the Declaration of Helsinki. The recommendation here is to specify that you have complied with the Declaration of Helsinki (DoH)—Ethical Principles for Medical Research Involving Human Participants (1964) and its latest amendments from the 75th General Assembly of the World Medical Association (WMA) in Finland, on October 19, 2024.
We thank the reviewer for the suggestion to improve clarity on the ethics approval process and compliance. We have amended the sentence to include the above detail (lines 123-127).
Lines 123-127: Ethics approval was provided by the University of Victoria for the use of voluntary data collection with the investigation complying with the principles outlined in the Declaration of Helsinki (DoH)—Ethical Principles for Medical Research Involving Human Participants (1964) and its latest amendments from the 75th General Assembly of the World Medical Association (WMA) in Finland, on October 19, 2024.
- It is recommended that the first sentence of this discussion be transferred to the introduction so that it can then be expanded in more detail.
We thank the reviewer for this suggestion. We have removed this sentence from the discussion and adjusted the phrasing to include it in the introduction (lines 112-113). We have also attempted to provide more detail (lines 113-118).
Lines 112-113: This novel approach assesses how two objective workload strategies behave when used to impute sRPE-CL data.
Lines 113-118: Outcomes of this investigation could provide important alternatives for the imputation of sRPE, which could better support the use of sRPE-CL to calculate athlete loads when missing data is experienced, supporting continuity of data longitudinally across high-performance sport programs. This may allow practitioners to better understand and address changes in the training and competition environment to support optimal performance.
- Please start the discussion by stating the study's aims and objectives and whether they have been achieved.
We thank the reviewer for this comment and have restructured the first paragraph of the discussion to include a review of the investigation’s purpose, which helps to more clearly tie into the outcome of the investigation’s aims, also described in that paragraph (lines 261-277).
Lines 261-277: The aim of this investigation was to assess the accuracy of competition sRPE data imputed from mechanical work and the SDC model using linear regression, random forest regression, and DTMS. This investigation is an important follow-up to previous work by Epp-Stobbe et al. (2022), where different imputation methods were used and included only one simple objective imputation method; the current investigation used more theoretically driven models that, while found to be equivalent, still demonstrated modest accuracy and R² with some differences based on imputation strategy. First, it was found that both objective workload measures, mechanical work and the SDC model (using either linear regression or random forest regression), outperformed DTMS (Table 1, Figures 2 & 3). Further, in terms of statistical approaches, for each model compared, random forest performed the best in terms of accuracy and explanatory power in imputing sRPE. Finally, the SDC model using random forest regression resulted in the best accuracy and explanatory power of all strategies and models. However, regardless of the models used, the model accuracy and goodness-of-fit statistics would be considered poor. This finding further substantiates the difficulty in using subjective measures for athlete load calculation when adherence may limit reporting and data missingness is possible. Overall, these results suggest that while objective models can be used to impute missing sRPE data for use in the calculation of sRPE-CL, the true athlete-reported value is far superior; it is recommended that either strategies be developed to minimize missing data or that load metrics be used in place of subjective metrics [8,9,14,44].
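The comparison described in the quoted passage can be sketched in outline. This is not the authors' code or data: the synthetic dataset, the single `mech_work` feature, and all column names are illustrative stand-ins used only to show how DTMS and model-based imputation would each be scored against held-out true sRPE values with RMSE and R².

```python
# Illustrative sketch (not the authors' pipeline): comparing DTMS against
# regression-based imputation of missing sRPE from an objective workload
# feature. All data and column names are synthetic assumptions.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "date": np.repeat(pd.date_range("2024-01-01", periods=20), 10),
    "mech_work": rng.uniform(20, 60, n),  # stand-in objective metric
})
df["sRPE"] = 2.0 * df["mech_work"] + rng.normal(0, 5, n)

# Simulate roughly 15% missingness in the subjective measure.
mask = rng.random(n) < 0.15
truth = df.loc[mask, "sRPE"].to_numpy()

# DTMS: replace each missing value with that day's observed team mean
# (falling back to the overall mean if a whole day is missing).
day_mean = df.loc[~mask].groupby("date")["sRPE"].mean()
dtms_pred = (df.loc[mask, "date"].map(day_mean)
             .fillna(df.loc[~mask, "sRPE"].mean()).to_numpy())
dtms_rmse = mean_squared_error(truth, dtms_pred) ** 0.5

# Model-based imputation: fit on complete rows, predict the missing ones.
for name, model in [("linear", LinearRegression()),
                    ("random_forest", RandomForestRegressor(random_state=0))]:
    model.fit(df.loc[~mask, ["mech_work"]], df.loc[~mask, "sRPE"])
    pred = model.predict(df.loc[mask, ["mech_work"]])
    rmse = mean_squared_error(truth, pred) ** 0.5
    print(name, round(rmse, 2), round(r2_score(truth, pred), 2))

print("DTMS", round(dtms_rmse, 2))
```

On synthetic data with a strong feature-to-sRPE relationship, the regression strategies beat DTMS by construction; the quoted passage's point is that on real competition data even the best model's fit remained modest.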
- Given the journal's prestige, please remove/replace the following outdated bibliographical sources from the study: 12, 22, 29, 32, 36-37 and 41.
We thank the reviewer for this suggestion and have removed the recommended references, replacing with existing references where suitable throughout the manuscript.
Reviewer 2 Report
Comments and Suggestions for Authors
Thank the authors for their efforts in improving the quality of their papers. The quality of the article has already improved a bit with the revisions. It is suggested that the conclusions be further condensed to provide some substantive comments and recommendations. The monitoring of loads in athletes is essential for injury prevention and optimization of athletic performance. It is recommended to refer to the previously mentioned literature. It is recommended to optimize all figures (font size, color scheme, scatterplot to present more raw data, and so on). In addition, it is recommended to typeset all figures and to integrate figures from supplementary materials into the manuscript. The manuscript is not sufficiently informative and is fully capable of incorporating supplementary material figures. Lack of fully credible results. Could the authors provide more and relevant results, including more detailed data, in addition to suggesting that the results be presented in the form of figures to improve the readability of the results. It is recommended that a methodology flowchart be provided with additional methodology details, in order to present the methodology more clearly and to improve the reproducibility of the methodology.
Author Response
Thank the authors for their efforts in improving the quality of their papers. The quality of the article has already improved a bit with the revisions.
It is suggested that the conclusions be further condensed to provide some substantive comments and recommendations.
We thank the reviewer for this suggestion. We have condensed the conclusion to more clearly substantiate the outcome of the investigation (lines 390-399).
Lines 390-399: Practically, this investigation suggests that mechanical work or the SDC model may be reasonable alternatives for the imputation of missing sRPE data, meaning that practitioners are not completely reliant on athlete self-reported data to understand elements of an athlete's performance. However, further investigation into the ability of these models, derived from objective physical data, to appropriately reflect athletic performance as a whole, beyond physical output, is required. Ultimately, practitioners are encouraged to collect and clean data from ATDs directly wherever possible before applying any imputation methods to workload models that include meaningful and relevant factors for their sport.
The monitoring of loads in athletes is essential for injury prevention and optimization of athletic performance. It is recommended to refer to the previously mentioned literature.
We thank the reviewer for emphasizing the importance of load monitoring for injury prevention and performance optimization. While we acknowledge the suggested reference (https://doi.org/10.1016/j.cmpb.2023.107761) provides valuable information on modeling ACL landing biomechanics, an injury with lengthy return-to-sport rehabilitation time and high demand on sport resources, we believe it is less directly applicable to the current investigation’s context of competition load monitoring in elite women’s rugby sevens. Instead, we have cited a more relevant and widely recognized paper for its support of monitoring athlete loads for injury prevention and performance optimization: Gabbett, T.J. (2016). The training-injury prevention paradox: Should athletes be training smarter and harder? British Journal of Sports Medicine, 50(5), 273–280.
It is recommended to optimize all figures (font size, color scheme, scatterplot to present more raw data, and so on). In addition, it is recommended to typeset all figures and to integrate figures from supplementary materials into the manuscript. The manuscript is not sufficiently informative and is fully capable of incorporating supplementary material figures. Lack of fully credible results. Could the authors provide more and relevant results, including more detailed data, in addition to suggesting that the results be presented in the form of figures to improve the readability of the results.
We thank the reviewer for these helpful suggestions for the Results section of the manuscript. We have updated the resolution of Figures 1, 2, 3-6, 8, & 9 (lines 215, 225, 229, 231, 233, 235, 252, & 258). We recognize that the images may be unnecessarily compressed in Word document format and have also attached the images separately to the submission to ensure maximal clarity is available for final assembly of the materials. Notably, we have also included Figure 2 as a new item, depicting the raw data of the predicted and true sRPE values from the dataset (line 225). We have also moved the material previously in the Supplementary Materials section into the results to ensure the manuscript is sufficiently informative (lines 226-236).
It is recommended that a methodology flowchart be provided with additional methodology details, in order to present the methodology more clearly and to improve the reproducibility of the methodology.
We thank the reviewer for this helpful suggestion and have included a methodology flowchart to enhance clarity (line 215).
Reviewer 3 Report
Comments and Suggestions for Authors
The authors have satisfactorily addressed most of the reviewer’s comments and they have properly justified why some of them have not been taken into account. I have no further remarks about the current version of the manuscript.
Author Response
The authors have satisfactorily addressed most of the reviewer’s comments and they have properly justified why some of them have not been taken into account. I have no further remarks about the current version of the manuscript.
Response to Reviewer 3
We thank the reviewer for their feedback in both rounds which has enhanced the quality of this manuscript.