Open Access Systematic Review

Peer-Review Record

Modifiable and Non-Modifiable Predictors of Exercise Capacity in Stroke Survivors: A Systematic Review

Healthcare 2026, 14(3), 382; https://doi.org/10.3390/healthcare14030382

by Klaske van Kammen ¹, Lotte A. J. Verkuijlen ¹,², Ana B. Nasser ¹,², Rienk Dekker ¹, Leonie A. Krops ¹ and Bregje L. Seves ¹,*

Reviewer 1: Anonymous

Reviewer 2: Monira Aldhahi

Reviewer 3: Bekir Erhan Orhan


Submission received: 18 December 2025 / Revised: 27 January 2026 / Accepted: 30 January 2026 / Published: 3 February 2026

(This article belongs to the Special Issue Physical Activity Intervention for Non-Communicable Diseases)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This manuscript addresses an important and relevant topic in examining predictors of exercise capacity after stroke. The authors present a large amount of complex data that can sometimes be difficult to follow and is not always consistently reported. With substantial revisions, the precision and clarity of the findings could be improved.

Abstract

Line 23: The study aim may benefit from being more narrowly stated (i.e., identifying modifiable and non-modifiable predictors of exercise capacity). Optimization of rehabilitation and improving long-term health outcomes would be better characterized as longer-term implications of the research, rather than as specific aims of the present study.
If categorizing predictors as modifiable or non-modifiable is an aim, more details regarding the methodology/process for categorizing these variables should be provided within the methods section.
Line 32: “Data extraction categorized predictors into modifiable and non-modifiable predictors.” This is not clear as written. Do you mean, during the data extraction phase, predictors were categorized…?
Line 36: Presentation of findings could be strengthened by acknowledging that the predictive abilities of certain factors were not consistent across studies. For example, you could say that 6/11 studies found age to be an independent predictor and 1/8 studies found sex to be an independent predictor, so as not to bias or overstate a potential role for these factors.
Consider using more specific terms where possible (e.g., BMI instead of body composition, or diabetes instead of comorbidities).
There is some minor inconsistency in spacing after periods (e.g., lines 38, 85, 92, 94…).
Line 39 states “This systematic review highlights the significant role of modifiable predictors…” I would add “non-modifiable predictors” to this sentence since both were examined in your review?
Line 41: You state “In addition, considering non-modifiable predictors allows for more personalized treatment planning.” Similar to above, I would add “modifiable predictors” to this sentence, as they are also important to treatment planning.

Introduction

Lines 56-61: The reference provided (Khan 2022) does not support the sentence you have written; the reference links the “Berg Balance Scale” and the “Timed Up and Go” test with mobility (not ADLs or QOL). I was unable to find the van de Port et al., 2006 reference in your reference list. New references are needed here.
I agree that it is important to better understand predictors of exercise capacity in stroke survivors, however a better case needs to be made for this, including more specific examples of how improved exercise capacity affects QOL, as well as risks of cardiovascular disease.
In the second paragraph of the introduction, I would acknowledge that low VO2 can be both a consequence of stroke and a risk factor for stroke itself. Both may contribute to the lower VO2 levels observed in stroke populations.
Line 78: The references (Pang et al., 2006, Brazzelli et al., 2011) here are 15+ years old. Given the availability of more recent clinical trials and meta-analyses, updated references should be provided.
Line 80: Please provide updated references for this statement as well: “Beyond its impact on aerobic capacity, aerobic exercise training contributes to enhanced balance, gait speed, walking endurance, cognitive performance, and quality of life. (Pang et al., 2013, Billinger et al., 2012, Saunders et al., 2016).”

Methods

Search Strategy (2.1) What were the earliest publication dates included in the searches?
In line 129 you mention that VO2 peak will be used throughout the review for consistency. Figure 1 (Prisma diagram) is still using VO2 max term, and should be updated.
Quality assessment (2.4). Line 154 – An incorrect reference is provided here (Letts et al., 2007). This is the reference for the qualitative form: Letts, L., Wilkins, S., Law, M., Stewart, D., Bosch, J., & Westmorland, M. (2007). Critical review form - qualitative studies (version 2.0). https://www.unisa.edu.au/contentassets/72bf75606a2b4abcaf7f17404af374ad/7b-mcmasters_qualreview_version2-01.pdf .

However, the Critical review form for quantitative studies is needed.

Can you provide details as to the range of possible scores on the McMaster assessment, and how study quality is inferred from scores?

Results

Line 166 – It is stated that 86 articles were excluded due to reasons 2 and 3. However, the Prisma diagram indicates 82 articles were excluded for these reasons.
Table 1: Column entitled “Predicting factors of Physical Capacity”. This sounds like all factors listed in this column are predictors. Is this true? Or do you mean these were all the factors evaluated in the studies? If so, consider changing the title of the column to “Evaluated predicting factors…”
Line 201: Should this say “study design” rather than “design of the articles”?
Quality Assessment (3.3): Please clarify the study design, so that the descriptors in the text and Table 2 match. For example, in the text, 6 articles are listed as a cross-sectional design, however, only 5 articles appear to be cross sectional according to table 2. Similarly, 3 are listed in text as “retrospective cohort”, while only 2 are listed as such in table. It is also not clear which study has the “experimental design”.
Quality Assessment (3.3)/Table 2: Are there total scores that are calculated for each article to summarize overall quality?
Table 3 – define abbreviations (e.g., np, Svend, COend, DTNPL, USERm)
Table 3 – What do the superscripts B and C indicate, and should there be an A?
In line 220, higher BMI is said to be a predictor of lower VO2peak in five articles. However, only 4 studies are listed (Baert et al., 2012, 2; Blokland et al., 2023; Liu et al., 2022; and Woodward et al., 2019).
Section 3.5.2. Lower limb characteristics. This section was very hard to follow. As you did for body composition, I recommend you begin by including the total number of studies assessing the lower limb characteristics as predictors of VO2peak, followed by a description of the studies finding significant results.
Results 3.5.2. Lower limb characteristics: The presentation of the muscle strength results is a bit misleading. In all three studies examining muscle strength (Baert 2012, 2; Kim 2020 and Wang 2014), isometric/isokinetic and paretic/nonparetic measurements were all correlated with VO2 peak. In the multivariable models, a single measurement often emerged as the “best predictor”, but that doesn’t mean that other measurements were not also related to some degree. These significant correlations with VO2 max should be acknowledged, in addition to the main predictor determined by multivariable modeling, to give a complete picture of how these variables may relate to VO2.
In the Wang study, according to Table 3, only 90-degree torque of the non-paretic leg was predictive of VO2 peak in multivariable models, not “both the paretic and non-paretic legs” as you have stated (Lines 235-238). Make sure the correct result is presented in both places.
Should the 90-degree torque measurements be classified as isokinetic?
When describing the results of Wang’s study, I think for most accurate interpretation it needs to be stated that lean tissue mass of the arms, legs, paretic and non-paretic legs were also correlated with VO2, with lean tissue mass of thighs emerging as the significant independent predictor in the multivariable model.
I can appreciate that it may be hard to categorize variables under subheadings (when they could fit under multiple categories). However, do you think the Chedoke-McMaster Stroke Assessment might fit better under the “Stroke Specific Predictors”?
Prior to presenting each predictive cardiorespiratory parameter, it would be helpful to state the number of studies evaluating the measure (e.g, cardiac output, baseline VO2peak, 6MWT).
It would be helpful to describe the conditions under which predictive cardiorespiratory parameters were measured. For example, were resting CO measurements predictive of VO2 peak? Or were these measurements obtained during maximal or submaximal exercise testing conditions? Were heart rate and stroke volume measured under resting or exercise conditions?
Again, I think it would be helpful to note that many of the cardiovascular parameters described as “non-significant predictors” of VO2 (e.g, minute ventilation, tidal volume), were found to be significantly correlated with the outcomes. Even though these were ultimately found to play non-significant roles in multivariable models, it is still useful to know that they are related.
Line 251- The ability of baseline VO2peak to predict VO2peak in follow up is described for Tang et al., 2013, Linder et al., 2020 and Linder et al., 2024. Should the results from Macko’s study be here as well, since initial VO2 peak is listed among the predictive parameters in Table 1? Also sentence stating “higher baseline VO2peak was consistently reported as a significant predictor of VO2 peak measured during followup…” is a bit misleading, as only 2/4 studies examining baseline VO2 peak as a predictor found a significant relationship.
Line 253 – Similarly, the ability of 6MWT to predict VO2peak is presented for Woodward et al., 2019; Tang et al., 2013; and Linder et al., 2020, but not for Kim et al., 2020 and Liu 2022.
Line 253 – Missing “.,” after Linder et al
Line 256 – extra comma after Linder et al., 2020
Line 260 – Clarify how moderate to vigorous physical activity was defined, and when it was measured (pre-stroke? Post stroke?)
Line 264: “Linder et al. (2020), Tang et al. (2013), and Baert et al. (2012, 2) reported that various other training parameters were not predictive of VO2peak (table 2)”. Specify what training parameters were examined in these studies.

Line 278 – “Eleven articles evaluated age as predictor of exercise capacity (table 3).” Should this reference table 1, since only significant predictors are reported in table 3?
Line 278: It is stated that six articles concluded that exercise capacity decreased significantly with age, however only 3 studies are referenced (Baert et al., 2012, 1; Blokland et al., 2023; Liu et al., 2022).
Line 285: Sex is reported as a non-significant predictor in 7 articles, but only 6 are referenced (Baert et al., 2012, 2; Kim et al., 2020; Lam et al., 2010; Linder et al., 2020; Linder et al., 2024; Liu et al., 2022).
Line 289 – What are the four articles that investigated comorbidities? Are the studies that investigated beta-blockers included as part of these 4? Beta-blockers shouldn’t be considered a comorbidity.
Line 303 – Higher scores on FAC are mentioned as being associated with VO2peak in Baert 2012, 2. However, table 1 indicates that FAC was also assessed in Baert 2012, 1 and Blokland 2023. These studies should be mentioned, even if FAC was a non-significant predictor, to give a complete picture of findings. The same applies to other measures (time post stroke, right side hemiparesis/side of lesion, walking velocity, etc.).
As a general comment, I wonder if anything can be said about effect sizes. It should be noted in table 3 whether the B coefficients are standardized or not. If possible, a discussion of effect size could indicate the relative importance of predictive factors, and potentially give a greater indication of clinical meaningfulness (versus statistical significance alone). If this is not possible, due to lack of reporting or inconsistency in assessments, discuss it as a limitation/future direction.
In general, the results would benefit from a more careful explanation of how variables are related to VO2peak versus change in VO2peak. In the results section, studies evaluating change in VO2peak are not always distinguished from studies evaluating absolute VO2 peak. Not only might there be differences in the way factors affect change versus absolute values, but also in how the findings are ultimately interpreted.

Discussion

47. Many of the modifiable factors listed in the first paragraph of the discussion were only found to be predictive in a subset of studies, or in 1 or 2 studies only. Heterogeneity of findings, and scarcity of studies evaluating certain factors should be noted, to give appropriate context for conclusions.

48. The discussion of higher training intensity predicting improved exercise capacity may be further bolstered by citations of randomized controlled trials or meta-analyses examining the effect of higher intensity training protocols on VO2 max.

49. When discussing lower limb strength (Line 324), why focus on the study by Wang et al., 2024 only? Other studies you reviewed also found relationships between lower limb strength and VO2peak (e.g., Baert 2012, 2; Kim 2010; Ryan 2000).

50. Are there any conclusions that can be drawn about paretic vs non paretic limb? Kinetic versus isometric strength measurements? What additional research is needed?

51. Line 324: Should the intext citation (Wang et al., 2024) be 2014?

52. Line 331 – You indicate that higher fat mass was associated with reduced VO2 peak in several studies. However, the majority of studies looked at BMI (not fat mass). I would be clear you are referring to BMI.

53. It's important to acknowledge (as you have) that many of the cardiorespiratory parameters you looked at have known associations with VO2peak. However, I would take a little more time with this, discussing the individual factors you are referring to (cardiac output, 6MWT distances, baseline VO2, sit to stand), and how the results of your review compare to what is known for the general population.

54. I think it could be important to discuss how 6MWT and 30 second sit to stand test scores were predictive of higher VO2 peak. These tests are relatively simple to administer, and do not require a lot of equipment, and so may be useful for predicting VO2 peak in facilities that do not have the ability to assess VO2peak using CPET.

55. Line 335 – clarify what factors fall under psychosocial parameters.

56. The stroke specific factors are really what set your paper apart from articles that look at predictors of VO2 in the general population. When discussing the stroke specific predictors, I would highlight some of the factors that were identified as having a potential relationship with the outcome, while acknowledging the challenges. Is there anything to take away from the research that has been done?

57. Lines 364-369. Most studies did not find sex to be an independent predictor of VO2. Therefore I would tailor the discussion to this finding.

58. Strengths and Limitations – This section should come after the Clinical Implications section, just prior to the conclusions.

59. Lines 394-401. First paragraph of clinical implications seems very redundant with last paragraph of Strengths and Limitations (381-392). I would combine the content of these two paragraphs, and remove from clinical implications section

60. Lines 408-409: Provide more up-to-date references on how aerobic exercise can improve exercise capacity.

61. Lines 415-421 – I don’t disagree with anything in this paragraph, but I also don’t see how it fits in with the rest of the paper.

62. Line 598 – The Mith reference seems out of place alphabetically

63. Line 601 – Should this reference be Tang 2013? Instead of 2023?

Author Response

Reviewer 1:

Reply: We sincerely thank the reviewer for the thoughtful and constructive comments. We appreciate the recognition of the relevance of our work and have carefully considered all suggestions. In the revised manuscript, we have made substantial efforts to improve the clarity, consistency, and precision of the data presentation and to address each point raised. Below, we provide detailed responses to all comments and describe the corresponding revisions. Note that the line numbers correspond with line numbers in the tracked changes version of our revised manuscript.

Abstract

Line 23: The study aim may benefit from being more narrowly stated (i.e., identifying modifiable and non-modifiable predictors of exercise capacity). Optimization of rehabilitation and improving long-term health outcomes would be better characterized as longer-term implications of the research, rather than as specific aims of the present study.
Reply: We agree that the study aim is more precise when focused strictly on identifying and categorizing predictors of exercise capacity. In accordance with this recommendation, we have revised the aim to emphasize the identification of modifiable and non‑modifiable predictors. The broader implications for rehabilitation and long‑term health are now described in the Introduction and Discussion as potential applications of the findings, rather than as formal study aims. Please see lines 24-27 and lines 100-106.
If categorizing predictors as modifiable or non-modifiable is an aim, more details regarding the methodology/process for categorizing these variables should be provided within the methods section.
Reply: To improve transparency, we have added a detailed description of the process used to categorize predictors as modifiable or non‑modifiable in the Methods section. This includes the criteria applied, the independent classification by two reviewers, and the procedure for resolving disagreements. See lines 295-302.
Line 32: “Data extraction categorized predictors into modifiable and non-modifiable predictors.” This is not clear as written. Do you mean, during the data extraction phase, predictors were categorized…?
Reply: Yes, we clarified this accordingly in lines 33-34.
Line 36: Presentation of findings could be strengthened by acknowledging that the predictive abilities of certain factors were not consistent across studies. For example, you could say that 6/11 studies found age to be an independent predictor and 1/8 studies found sex to be an independent predictor, so as not to bias or overstate a potential role for these factors.
Reply: We agree that this is of added value and added the details to the abstract. Note that, as a result, some sentences were slightly rewritten to adhere to the maximum word count for the abstract (see lines 24-93).
Consider using more specific terms where possible (e.g., BMI instead of body composition, or diabetes instead of comorbidities).
Reply: We agree and changed this accordingly in the abstract and conclusions. Please see updated abstract, lines 24-93.
There is some minor inconsistency in spacing after periods (e.g., lines 38, 85, 92, 94…).
Reply: We have checked the double spacing in the manuscript and changed it accordingly.
Line 39 states “This systematic review highlights the significant role of modifiable predictors…” I would add “non-modifiable predictors” to this sentence since both were examined in your review?
Reply: We agree and implemented this suggestion in the manuscript (see lines 88-90).
Line 41: You state “In addition, considering non-modifiable predictors allows for more personalized treatment planning.” Similar to above, I would add “modifiable predictors” to this sentence, as they are also important to treatment planning.
Reply: We agree and implemented this suggestion in the manuscript (see line 90).

Introduction

Lines 56-61: The reference provided (Khan 2022) does not support the sentence you have written; the reference links the “Berg Balance Scale” and the “Timed Up and Go” test with mobility (not ADLs or QOL). I was unable to find the van de Port et al., 2006 reference in your reference list. New references are needed here.
Reply: We agree that the references used here concern reduced mobility in stroke survivors, not the association of reduced exercise capacity with a person's ability to perform activities of daily living or with overall quality of life. New references were added to the manuscript.
I agree that it is important to better understand predictors of exercise capacity in stroke survivors, however a better case needs to be made for this, including more specific examples of how improved exercise capacity affects QOL, as well as risks of cardiovascular disease.
Reply: Thank you for this suggestion. We have revised the introduction to more clearly articulate the importance of identifying predictors of exercise capacity, including specific examples of how improved exercise capacity relates to quality of life and cardiovascular risk reduction. Please see lines 107-113.
In the second paragraph of the introduction, I would acknowledge that low VO2 can be both a consequence of stroke and a risk factor for stroke itself. Both may contribute to the lower VO2 levels observed in stroke populations.
Reply: We have added a sentence in the second paragraph of the introduction to explicitly acknowledge that low VO2 can be both a consequence of stroke and an independent risk factor for stroke (see lines 125-128).
Line 78: The references (Pang et al., 2006, Brazzelli et al., 2011) here are 15+ years old. Given the availability of more recent clinical trials and meta-analyses, updated references should be provided.
Reply: We agree and have cited two reviews in this context, one already added in our original manuscript and one new reference (Li, Z., Guo, H., Yuan, Y., & Liu, X. (2024). The effect of moderate and vigorous aerobic exercise training on the cognitive and walking ability among stroke patients during different periods: A systematic review and meta-analysis. PLOS ONE, 19. https://doi.org/10.1371/journal.pone.0298339).
Line 80: Please provide updated references for this statement as well: “Beyond its impact on aerobic capacity, aerobic exercise training contributes to enhanced balance, gait speed, walking endurance, cognitive performance, and quality of life. (Pang et al., 2013, Billinger et al., 2012, Saunders et al., 2016).”
Reply: The majority of these measures were also considered in the review by Li et al., so this reference was added here as well.

Methods

Search Strategy (2.1) What were the earliest publication dates included in the searches?
Reply: Thank you for this question. We have added a sentence to this paragraph, see lines 205-207.
In line 129 you mention that VO2 peak will be used throughout the review for consistency. Figure 1 (Prisma diagram) is still using VO2 max term, and should be updated.
Reply: Thank you for your sharp observation regarding the figure; we have updated the figure accordingly.
Quality assessment (2.4). Line 154 – An incorrect reference is provided here (Letts et al., 2007). This is the reference for the qualitative form: Letts, L., Wilkins, S., Law, M., Stewart, D., Bosch, J., & Westmorland, M. (2007). Critical review form - qualitative studies (version 2.0). https://www.unisa.edu.au/contentassets/72bf75606a2b4abcaf7f17404af374ad/7b-mcmasters_qualreview_version2-01.pdf . However, the Critical review form for quantitative studies is needed.
Reply: Thank you again for your careful review; we have updated the reference accordingly.
Can you provide details as to the range of possible scores on the McMaster assessment, and how study quality is inferred from scores?
Reply: Thank you for your question regarding the McMaster assessment tool. The McMaster tool consists of a series of items assessing different aspects of methodological quality in quantitative studies. While each item can be scored individually, the tool does not provide a validated weighting system for the relative importance of these items. For this reason, we deliberately chose not to calculate or report a total score, as such a score could imply an overall quality judgment that does not reflect the nuanced contribution of each criterion. Instead, we present the item-level assessments to allow readers to interpret strengths and limitations in context.

Results

Line 166 – It is stated that 86 articles were excluded due to reasons 2 and 3. However, the Prisma diagram indicates 82 articles were excluded for these reasons.
Reply: Apologies for this discrepancy. There were four articles that could not be retrieved, so indeed 82 articles were excluded during full-text screening. We updated the manuscript accordingly (see lines 339-341).
Table 1: Column entitled “Predicting factors of Physical Capacity”. This sounds like all factors listed in this column are predictors. Is this true? Or do you mean these were all the factors evaluated in the studies? If so, consider changing the title of the column to “Evaluated predicting factors…”
Reply: We agree this should be nuanced, and updated the table accordingly (see Table 1)
Line 201: Should this say “study design” rather than “design of the articles”.
Reply: Yes, updated accordingly.
Quality Assessment (3.3): Please clarify the study design, so that the descriptors in the text and Table 2 match. For example, in the text, 6 articles are listed as a cross-sectional design, however, only 5 articles appear to be cross sectional according to table 2. Similarly, 3 are listed in text as “retrospective cohort”, while only 2 are listed as such in table. It is also not clear which study has the “experimental design”.
Reply: Thank you for carefully checking the match between the text and tables. We agree that this requires clarification and decided to mention only the designs that were most common and to refer to the table for the remaining designs (see lines 441-444).
Quality Assessment (3.3)/Table 2: Are there total scores that are calculated for each article to summarize overall quality?
Reply: See our response to comment #17; no total scores were determined.
Table 3 – define abbreviations (e.g., np, Svend, COend, DTNPL, USERm)
Reply: In our original submission, these abbreviations were explained in the footnotes underneath the table. We noticed that when our manuscript was transferred to the journal template, these footnotes were inadvertently lost. We are thankful the reviewer noticed this and have added the footnotes to the edited manuscript (see lines 554-557).
Table 3 – What do the superscripts B and C indicate, and should there be an A?
Reply: In our original submission, these superscripts were explained in the footnotes underneath the table. We noticed that when our manuscript was transferred to the journal template, these footnotes were lost. We are thankful the reviewer noticed this and have added the footnotes to the edited manuscript (see lines 554-557).
In line 220, higher BMI is said to be a predictor of lower VO2peak in five articles. However, only 4 studies are listed (Baert et al., 2012, 2; Blokland et al., 2023; Liu et al., 2022; and Woodward et al., 2019).
Reply: Four is the correct number, updated accordingly.
Section 3.5.2. Lower limb characteristics. This section was very hard to follow. As you did for body composition, I recommend you begin by including the total number of studies assessing the lower limb characteristics as predictors of VO2peak, followed by a description of the studies finding significant results.
Reply: We agree this section requires some rewriting and amended the section accordingly (see lines 567-579)
Results 3.5.2. Lower limb characteristics: The presentation of the muscle strength results is a bit misleading. In all three studies examining muscle strength (Baert 2012, 2; Kim 2020 and Wang 2014), isometric/isokinetic and paretic/nonparetic measurements were all correlated with VO2 peak. In the multivariable models, a single measurement often emerged as the “best predictor”, but that doesn’t mean that other measurements were not also related to some degree. These significant correlations with VO2 max should be acknowledged, in addition to the main predictor determined by multivariable modeling, to give a complete picture of how these variables may relate to VO2.
Reply: Thank you for this thoughtful comment. We fully agree that several studies reported significant univariate correlations between various isometric/isokinetic and paretic/non‑paretic strength measures and VO2peak. However, for the purpose of this review we deliberately chose to extract and report only multivariable results, in line with our predefined protocol and our aim of identifying predictors of aerobic capacity based on multivariate regression models that predict VO2peak. For clarity, we updated inclusion criterion 4 to explicitly mention this (see lines 215-216).
Across categories, not only for muscle strength, many variables showed univariate associations that did not remain significant when adjusted for other covariates. Including univariate results for muscle strength alone would introduce imbalance in reporting and may unintentionally overstate the importance of these predictors outside multivariable contexts.
To ensure consistency and avoid selective emphasis, we therefore chose to present the multivariable results only. We acknowledge that the univariate results may give valuable insight as well, and note this in our discussion of the limitations of using the separate multivariate models to generate one comprehensive model (see lines 1215-1221).
In the Wang study, according to Table 3, only 90-degree torque of the non-paretic leg was predictive of VO2 peak in multivariable models, not “both the paretic and non-paretic legs” as you have stated (Lines 235-238). Make sure the correct result is presented in both places.
Reply: Thank you for pointing this out; we amended the text as the information in the table is correct (see lines 567-579).
Should the 90-degree torque measurements be classified as isokinetic?
Reply: Indeed, this should be isokinetic instead of isometric, based on the methodology used to determine the 90-degree torque. We updated the results accordingly (see lines 567-579).
When describing the results of Wang’s study, I think for most accurate interpretation it needs to be stated that lean tissue mass of the arms, legs, paretic and non-paretic legs were also correlated with VO2, with lean tissue mass of thighs emerging as the significant independent predictor in the multivariable model.
Reply: We assume the reviewer refers to the interpretation of the Ryan et al. study. As we only include the final multivariable prediction model in our results, we believe our original interpretation is correct.
I can appreciate that it may be hard to categorize variables under subheadings (when they could fit under multiple categories). However, do you think the Chedoke-McMaster Stroke Assessment might fit better under the “Stroke Specific Predictors”?
Reply: This was indeed one of the items that was difficult to categorize, as the measure is stroke-specific. However, we classified it under “lower limb characteristics” because the subscale that was added to the prediction model focuses specifically on lower extremity impairment and functional deficit, and hence directly evaluates motor recovery and impairment of the lower limb. For this reason, we believe this categorization best reflects its purpose in our analysis.
Prior to presenting each predictive cardiorespiratory parameter, it would be helpful to state the number of studies evaluating the measure (e.g, cardiac output, baseline VO2peak, 6MWT).
Reply: Updated accordingly; see the updated section 3.5.3 (lines 643-661).
It would be helpful to describe the conditions under which predictive cardiorespiratory parameters were measured. For example, were resting CO measurements predictive of VO2 peak? Or were these measurements obtained during maximal or submaximal exercise testing conditions? Were heart rate and stroke volume measured under resting or exercise conditions?
Reply: These details are available in Table 1; for example, HRpeak indicates HR at peak exercise during the maximal test. Only studies that examined maximal exercise testing were included (see Methods section, inclusion criterion 2: ‘outcome had to be a measurement of maximum exercise capacity (VO2peak) measured with CPET (or equivalent)’).
Again, I think it would be helpful to note that many of the cardiovascular parameters described as “non-significant predictors” of VO2 (e.g., minute ventilation, tidal volume) were found to be significantly correlated with the outcomes. Even though these were ultimately found to play non-significant roles in multivariable models, it is still useful to know that they are related.
Reply: In line with our aim and methodology, we decided to address only the final prediction models of each study. While we acknowledge that this additional information could be highly valuable for clinical practice, it falls outside the scope of the current review. See also our response to comment #27.
Line 251 – The ability of baseline VO2peak to predict VO2peak at follow-up is described for Tang et al., 2013, Linder et al., 2020 and Linder et al., 2024. Should the results from Macko’s study be here as well, since initial VO2 peak is listed among the predictive parameters in Table 1? Also, the sentence stating “higher baseline VO2peak was consistently reported as a significant predictor of VO2 peak measured during followup…” is a bit misleading, as only 2/4 studies examining baseline VO2 peak as a predictor found a significant relationship.
Reply: Yes, indeed. Added accordingly (see lines 567-579).
Line 253 – Similarly, the ability of 6MWT to predict VO2peak is presented for Woodward et al., 2019; Tang et al., 2013; and Linder et al., 2020, but not for Kim et al., 2020 and Liu 2022.
Reply: Yes, indeed. Added accordingly (see lines 567-579).
Line 253 – Missing “.,” after Linder et al
Reply: Thank you, we checked the manuscript and updated where needed.
Line 256 – extra comma after Linder et al., 2020
Reply: Thank you, we checked the manuscript and updated where needed.
Line 260 – Clarify how moderate to vigorous physical activity was defined, and when it was measured (pre-stroke? Post stroke?)
Reply: The authors determined this with use of accelerometer data, at 3 months post-stroke (see amended Table 1).
Line 264 – “Linder et al. (2020), Tang et al. (2013), and Baert et al. (2012, 2) reported that various other training parameters were not predictive of VO2peak (table 2)”. Specify what training parameters were examined in these studies.
Reply: Reference was made to Table 1 to reduce word count.
Line 278 – “Eleven articles evaluated age as predictor of exercise capacity (table 3).” Should this reference table 1, since only significant predictors are reported in table 3?
Reply: Yes, indeed. Changed accordingly.
Line 278: It is stated that six articles concluded that exercise capacity decreased significantly with age; however, only 3 studies are referenced (Baert et al., 2012,1; Blokland et al., 2023; Liu et al., 2022).
Reply: Three is the correct number, changed accordingly.
Line 285: Sex is reported as a non-significant predictor in 7 articles, but only 6 are referenced (Baert et al., 2012,2; Kim et al., 2020; Lam et al., 2010; Linder et al., 2020; Linder et al., 2024; Liu et al., 2022).
Reply: Baert et al., 2012,1 was inadvertently missing and has now been added to the list.
Line 289 – What are the four articles that investigated comorbidities? Are the studies that investigated beta-blockers included as part of these 4? Beta-blockers shouldn’t be considered a comorbidity.
Reply: Yes, these were included as they examined beta-blockers as indicator of cardiovascular disease in their model – but note that they also examined other comorbidities (see table 1).
Line 303 – Higher scores on FAC are mentioned as being associated with VO2peak in Baert 2012,2. However, table 1 indicates that FAC was also assessed in Baert 2012,1 and Blokland 2023. These studies should be mentioned, even if FAC was a non-significant predictor, to give a complete picture of findings. Same for other measures (time post stroke, right side hemiparesis/side of lesion, walking velocity, etc.).
Reply: Agreed, we added this where needed; see amended Section 3.6.3 (lines 884-896).
As a general comment, I wonder if anything can be said about effect sizes. It should be noted in table 3 if the B coefficients are standardized or not. If possible, a discussion of effect size could indicate the relative importance of predictive factors, and potentially greater indication of clinical meaningfulness, (versus statistical significance alone). If not possible, due to lack of reporting, or inconsistency in assessments, discuss as a limitation/future direction.
Reply: Details on the coefficients have been added to the footnotes of Table 3 in the revised manuscript. We agree that evaluating effect sizes could provide additional insight into the relative importance of predictive factors and their clinical relevance. However, this was not possible due to lack of reporting. Therefore, we have acknowledged this in the limitations section (see lines 1229-1231).

Discussion

Many of the modifiable factors listed in the first paragraph of the discussion were only found to be predictive in a subset of studies, or in 1 or 2 studies only. Heterogeneity of findings, and scarcity of studies evaluating certain factors, should be noted to give appropriate context for conclusions.
Reply: We agree and added accordingly, see lines 898-912.
The discussion of higher training intensity predicting improved exercise capacity may be further bolstered by citations of randomized controlled trials or meta-analyses examining the effect of higher intensity training protocols on VO2 max.
Reply: We thank the reviewer for this valuable suggestion. We have now incorporated multiple recent randomized controlled trials and meta-analyses specifically in stroke populations. A large Bayesian network meta-analysis demonstrated that high-intensity interval training (HIIT) produces the greatest improvements in VO2peak post-stroke compared with moderate or low-intensity exercise. Additional meta-analyses confirm that moderate‑to‑high intensity aerobic training increased VO2peak in chronic stroke survivors. These references support our finding that training intensity is an important predictor of improvements in exercise capacity.
Please see lines 980-984.
When discussing lower limb strength (Line 324), why focus on the study by Wang et al., 2024 only? Other studies you reviewed also found relationships between lower limb strength and VO2peak (e.g., Baert 2012,2; Kim 2010; Ryan 2000).
Reply: In line with the amended results section, this sentence was updated (see line 987).
Are there any conclusions that can be drawn about paretic vs non paretic limb? Kinetic versus isometric strength measurements? What additional research is needed?
Reply: Current evidence does not allow firm conclusions regarding differences between paretic and non-paretic limbs or between isometric and isokinetic strength measurements; further research is indeed needed to clarify these aspects. The section was updated accordingly (see lines 988-989).
Line 324: Should the intext citation (Wang et al., 2024) be 2014?
Reply: Yes, we corrected the typo.
Line 331 – You indicate that higher fat mass was associated with reduced VO2 peak in several studies. However, the majority of studies looked at BMI (not fat mass). I would be clear you are referring to BMI.
Reply: Yes, indeed. Changed accordingly.
It's important to acknowledge (as you have) that many of the cardiorespiratory parameters you looked at have known associations with VO2peak. However, I would take a little more time with this, discussing the individual factors you are referring to (cardiac output, 6MWT distances, baseline VO2, sit to stand), and how the results of your review compare to what is known for the general population.
Reply: We appreciate the suggestion to elaborate on cardiorespiratory parameters with general-population comparisons. Because neurological impairment and gait asymmetry fundamentally alter these relationships in stroke survivors, we decided not to expand this section further, in order to preserve its focus and clinical relevance.
I think it could be important to discuss how 6MWT and 30 second sit to stand test scores were predictive of higher VO2 peak. These tests are relatively simple to administer, and do not require a lot of equipment, and so may be useful for predicting VO2 peak in facilities that do not have the ability to assess VO2peak using CPET.
Reply: Thank you for raising this important clinical implication. We have added this to the discussion, please see lines 995-1003.
Line 335 – clarify what factors fall under psychosocial parameters.
Reply: This was meant to be an example of missing point of view, changed accordingly in the sentence.
The stroke specific factors are really what set apart your paper from articles that look at predictors of VO2 in the general population. When discussing the stroke specific predictors, I would highlight some of the factors that were identified as having a potential relationship with the outcome, while acknowledging the challenges. Is there anything to take away from the research that has been done?
Reply: Thank you for highlighting this strength of our review. We have added a sentence in the discussion that mentions the added value of stroke specific factors, besides general determinants of exercise capacity, see lines 1047-1049.
Lines 364-369. Most studies did not find sex to be an independent predictor of VO2. Therefore I would tailor the discussion to this finding.
Reply: We agree and have amended this section of the discussion accordingly (see lines 1060-1064).
Strengths and Limitations – This section should come after the Clinical Implications Section, Just prior to conclusions.
Reply: We agree and re-arranged accordingly. Note that we decided to arrange as – Strengths and Limitations; Future Research; Conclusions.
Lines 394-401. First paragraph of clinical implications seems very redundant with last paragraph of Strengths and Limitations (381-392). I would combine the content of these two paragraphs, and remove from clinical implications section
Reply: Thank you for this comment. We agree that there is some overlap between the clinical implications section and the Strengths and Limitations section. However, we believe it is important to retain a concise statement in the clinical implications section, as these implications are directly shaped by the limitations and are highly relevant for practice. To address your concern, we have streamlined the text in the clinical implications section to reduce redundancy while keeping the main message.
Lines 408-409 provide more up to date references on how aerobic exercise can improve exercise capacity.
Reply: Agreed, we added the systematic reviews that were already referred in other context.
Lines 415-421 – I don’t disagree with anything in this paragraph, but I also don’t see how it fits in with the rest of the paper.
Reply: Thank you, we amended the text to relate more to the context of VO2peak here, and we also refer to the lifestyle factors in other sections of the discussion (see lines 1197-1203).
Line 598 – The Mith reference seems out of place alphabetically
Reply: This should be Smith – the typo was corrected accordingly.
Line 601 – Should this reference be Tang 2013? Instead of 2023.
Reply: Yes, indeed, updated accordingly.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This manuscript addresses a clinically relevant question: which factors predict peak aerobic exercise capacity (VO₂peak) in people after stroke, and which of those predictors might be modifiable in rehabilitation? The attempt to separate “modifiable” from “non-modifiable” predictors is potentially useful for clinical decision-making and individualized rehabilitation planning. Several issues currently limit interpretability and scientific rigor, primarily around how “prediction” is defined and synthesized, the use of a risk-of-bias assessment appropriate for prediction model studies, and reporting (Table 3 in particular). Addressing the major comments below would substantially strengthen validity and publication readiness.

The work focuses on predictors (rather than intervention effects), the modifiable/non-modifiable categorization is a meaningful framing, and the authors state this is the first review to explicitly do so. The review currently blends prognostic factor studies, baseline determinants, and predictors of change in VO₂peak after training. These are related but not interchangeable. Without stratification, the clinical “so what” becomes muddled, and the reader may overinterpret determinants as predictors of response. I suggest that you recast the review explicitly as one (or two) of the following: predictors/determinants of VO₂peak at a given timepoint and, separately, predictors of change in VO₂peak (training response).
Present these as distinct analyses throughout the abstract, results, and discussion.

In the last paragraph of the introduction, clarify whether the review aims to identify predictors of VO₂peak level, predictors of VO₂peak change, or both—and why both matter.

(In lines 111-119) – The “Prediction model” inclusion criteria are underspecified. Add explicit criteria: e.g., multivariable regression intended to predict VO₂peak (level or change), with stated model specification and at least one of:

regression coefficients (or sufficient statistics), and/or
model performance (R², calibration, discrimination), and/or

(In lines 111-119) – The inclusion criteria state that “primary research” was included; however, it is unclear whether this refers specifically to randomized controlled trials (RCTs), observational studies, or both. Given that prediction models are often derived from observational designs, the authors should clearly specify which study designs were eligible (e.g., RCTs, cohort studies, cross-sectional studies).

(In Table 3) You used the McMaster Critical Review Form for Quantitative Studies. Your risk-of-bias conclusions may not reflect prediction-specific threats to validity. This checklist is not tailored to bias in prognostic factor studies (as QUIPS is) or in prediction model studies (as PROBAST is). Replace or supplement it with PROBAST (if you treat included studies as prediction model studies) and/or QUIPS (if treated as prognostic factor studies). If you keep McMaster, justify why it is appropriate for prediction models and explicitly acknowledge what prediction-specific domains it does not cover.

(In the method section) – Add a “Data synthesis” subsection describing: how predictors were grouped, how “significance” was interpreted (p-value thresholds?), whether direction of effect was required, and how you handled multiple models/timepoints within a study.

(Lines 12, 37, 310) The manuscript lists cardiorespiratory fitness as a modifiable predictor, measured using VO₂peak. However, VO₂peak is a physiological outcome of cardiorespiratory performance rather than a predictor unless baseline VO₂peak is explicitly used to predict follow-up VO₂peak. The authors should clarify whether VO₂peak is being treated as a predictor of future exercise capacity, or the outcome itself. If cardiorespiratory fitness is intended to encompass broader constructs (e.g., muscular endurance, flexibility), these should be explicitly defined and distinguished.

(Table 3) lists significant predictors but frequently reports NR for coefficients/SE/CI and uses mixed notation (“B”, “C”, F-statistics) without clear definitions. Standardize Table 3 with the following minimum fields for every predictor:

outcome type (absolute VO₂peak vs change),
timepoint,
model type,
adjusted covariates (or at least count + key covariates),
effect estimate (β or OR, with units),
SE or 95% CI,
p-value,
direction of effect (↑/↓ VO₂peak),
sample size used in the model (not just study N).

Add footnotes defining all symbols (e.g., “B”, “C”, “F”) and units (ml/kg/min vs L/min).
If original studies did not report coefficients, say so explicitly and consider contacting authors or extracting from supplementary materials where possible.

(In lines 226-241) Lower-limb characteristics include predictors such as lean mass, which are more appropriately classified under body composition (e.g., BMI-related variables). Similarly, the inclusion of the 6-minute walk test (6MWT) under cardiorespiratory parameters is questionable: 6MWT is primarily a measure of functional endurance and peripheral limitations rather than central cardiorespiratory or ventilatory function. The same applies to the 30-second sit-to-stand test, which reflects neuromuscular and peripheral performance. Predictor domains should be restructured to reflect physiological

(In Fig. 1) The PRISMA diagram “Reports excluded… Reason 1: Exercise capacity not measured as VO2max” appears inconsistent with the inclusion criterion requiring VO₂peak/VO₂max equivalence and the narrative stating VO₂peak focus. Please verify this label (on page 5).

(Lines 217-306) The manuscript emphasizes modifiable factors as targets for rehabilitation, but many “modifiable predictors” are also proxies for severity, baseline function, or selection into training intensity. For example, baseline VO₂peak predicting follow-up VO₂peak may reflect regression to the mean, ceiling effects, or disease severity rather than a “modifiable target.” Training intensity predictors may be confounded by ability/mobility (those who can train harder are often less impaired).

I suggest that you add a paragraph explicitly distinguishing predictors that are causal targets (modifiable with interventions), vs predictors that are markers (modifiable in principle but not necessarily causal drivers).

(Discussion) needs to be reorganized into subsections by:

predictors of VO₂peak level, and
predictors of VO₂peak change/response,
and, within each, discuss subacute vs chronic contexts

Minor issues

Consistent spelling/terminology: “crossectional” → “cross-sectional” appears multiple times.
In Table 2, items refer to “anaerobic test” despite focusing on aerobic capacity/CPET; likely a template error.
Some claims would benefit from tighter phrasing and fewer broad generalizations (e.g., clinical implications section).
Clarify how VO₂max values were harmonized to VO₂peak (units; treadmill vs cycle; relative vs absolute).
Report how multiple timepoints/models per study were handled (e.g., Baert 3/6/12 months).
In Table 1, consider abbreviations legend closer to the table and ensure consistent naming (HIIT vs HITT appears).
Standardize terminology: “predictors,” “determinants,” “associations”: choose one and define it.
Throughout the manuscript, the term “stroke sufferers” (or similar phrasing) is used. This terminology is not scientifically appropriate and should be replaced with “individuals diagnosed with stroke” or “stroke survivors”, in line with current clinical and ethical reporting standards.
In line 292 – The citation (Baert et al., 2012,1) appears to contain a formatting or numbering error and should be corrected according to the journal’s reference style.

Author Response

Reviewer 2:

Reply: Thank you for your thoughtful and constructive feedback. We appreciate your recognition of the clinical relevance of our review and the potential utility of distinguishing modifiable from non-modifiable predictors. Your comments provide valuable guidance for improving clarity and scientific rigor. We have carefully addressed the major points raised, and believe the revisions substantially improve the manuscript. Below, we provide detailed responses to all comments and describe the corresponding revisions. Note that the line numbers correspond with line numbers in the tracked changes version of our revised manuscript.

The work focuses on predictors (rather than intervention effects), the modifiable/non-modifiable categorization is a meaningful framing, and the authors state this is the first review to explicitly do so. The review currently blends prognostic factor studies, baseline determinants, and predictors of change in VO₂peak after training. These are related but not interchangeable. Without stratification, the clinical “so what” becomes muddled, and the reader may overinterpret determinants as predictors of response. I suggest that you recast the review explicitly as one (or two) of the following: predictors/determinants of VO₂peak at a given timepoint and, separately, predictors of change in VO₂peak (training response).
Present these as distinct analyses throughout the abstract, results, and discussion.
Reply: Thank you for this thoughtful comment. We agree that prognostic factors at a given timepoint, baseline determinants, and predictors of change are related but not interchangeable. In the original submission, we explicitly reported study timepoints and predictor types in Table 1 and noted their variability and implications in the Discussion to guide interpretation. We considered stratifying the review into separate analyses but opted against this to preserve readability and avoid fragmenting the synthesis. To maintain clarity, we consistently indicate in the text, where needed, whether an association pertains to VO2peak level or to change following training (see amended manuscript). We believe this approach balances interpretability and focus while ensuring that readers are alerted to the differences in predictor type and timing.
In the last paragraph of the introduction, clarify whether the review aims to identify predictors of VO₂peak level, predictors of VO₂peak change, or both—and why both matter.
Reply: Thank you for the suggestion; in line with our response to the previous remark, this was added.
(In lines 111-119) – The “Prediction model” inclusion criteria are underspecified. Add explicit criteria: e.g., multivariable regression intended to predict VO2peak (level or change), with stated model specification and at least one of:

regression coefficients (or sufficient statistics), and/or
model performance (R², calibration, discrimination)

Reply: Thank you for this suggestion. We have added a sentence to clarify the inclusion criteria for prediction models as recommended. However, as some studies reported prediction models as secondary analyses rather than as their primary aim, full details (e.g., model performance metrics) were not always available. Given the scarcity of literature, we chose not to exclude these studies but reflected on the lack of detail in our risk-of-bias assessment and in the limitations section of the review (see lines 1224-1227).

(In lines 111-119) – The inclusion criteria state that “primary research” was included; however, it is unclear whether this refers specifically to randomized controlled trials (RCTs), observational studies, or both. Given that prediction models are often derived from observational designs, the authors should clearly specify which study designs were eligible (e.g., RCTs, cohort studies, cross-sectional studies).
Reply: We thank the reviewer for this comment. Given the scarcity of the literature, we did not exclude any study design, and we clarified this in the inclusion/exclusion criteria section accordingly (see lines 265-266).

(In Table 3) You used the McMaster Critical Review Form for Quantitative Studies. Your risk-of-bias conclusions may not reflect prediction-specific threats to validity. This checklist is not tailored to bias in prognostic factor studies (as QUIPS is) or in prediction model studies (as PROBAST is). Replace or supplement it with PROBAST (if you treat included studies as prediction model studies) and/or QUIPS (if treated as prognostic factor studies). If you keep McMaster, justify why it is appropriate for prediction models and explicitly acknowledge what prediction-specific domains it does not cover.
Reply: Thank you for this important comment. We acknowledge that PROBAST and QUIPS are specifically designed for prediction model and prognostic factor studies. We chose to use the McMaster Critical Review Form because it provides a comprehensive evaluation of methodological quality across quantitative designs, which was appropriate given the heterogeneity of the designs of the included studies. To address prediction-specific aspects, we interpreted item 13 of the McMaster tool to capture key characteristics of the prediction models. We also note that some studies did not report all details (e.g., model performance metrics); this was not considered in our methodological assessment but is explicitly discussed as a limitation in the manuscript, also in response to the earlier comment. We have adjusted our Methods section to clarify how the McMaster form was used to assess prediction-specific aspects (see lines 310-332).

(In the method section) – Add a “Data synthesis” subsection describing: how predictors were grouped, how “significance” was interpreted (p-value thresholds?), whether direction of effect was required, and how you handled multiple models/timepoints within a study.
Reply: Thank you. This comment was also raised by Reviewer 1, and we addressed how predictors were grouped in the data extraction section (see lines 295-303). Note that the data extraction section already states that information from all multivariate models in an article was extracted, and specifies which data were extracted from the models (e.g., direction of effect, confidence intervals, and p-values, in order to determine significance). Therefore, no further updates were made.

(Lines 12, 37, 310) The manuscript lists cardiorespiratory fitness as a modifiable predictor, measured using VO₂peak. However, VO₂peak is a physiological outcome of cardiorespiratory performance rather than a predictor unless baseline VO₂peak is explicitly used to predict follow-up VO₂peak. The authors should clarify whether VO₂peak is being treated as a predictor of future exercise capacity, or the outcome itself. If cardiorespiratory fitness is intended to encompass broader constructs (e.g., muscular endurance, flexibility), these should be explicitly defined and distinguished.
Reply: Thank you for addressing this relevant distinction. We indeed intended to refer to baseline VO2peak as predictor and have amended the sentences accordingly throughout the manuscript (see amended manuscript).
(Table 3) lists significant predictors but frequently reports NR for coefficients/SE/CI and uses mixed notation (“B”, “C”, F-statistics) without clear definitions. Standardize Table 3 with the following minimum fields for every predictor:

outcome type (absolute VO₂peak vs change),
timepoint,
model type,
adjusted covariates (or at least count + key covariates),
effect estimate (β or OR, with units),
SE or 95% CI,
p-value,
direction of effect (↑/↓ VO₂peak),
sample size used in the model (not just study N).

Add footnotes defining all symbols (e.g., “B”, “C”, “F”) and units (ml/kg/min vs L/min). If original studies did not report coefficients, say so explicitly and consider contacting authors or extracting from supplementary materials where possible.
Reply: Thank you for this suggestion. We agree that standardized reporting would improve clarity. However, most of the requested details (e.g., coefficients, SE, CI, model performance) were not provided in the original studies, and despite contacting authors, we did not receive additional information. Given the scarcity of literature, we chose to retain these studies to provide a comprehensive overview. In our original submission, abbreviations and symbols were explained in footnotes, which were inadvertently lost during template transfer; these have now been restored. Because the requested information is largely unavailable, amending the Table would result in multiple NR or empty columns. For this reason, we decided to keep the Table unchanged, and explicitly acknowledge the lack of detailed reporting as a methodological limitation in the manuscript (see line 1224-1227).

(In lines 226-241) Lower-limb characteristics include predictors such as lean mass, which are more appropriately classified under body composition (e.g., BMI-related variables). Similarly, the inclusion of the 6-minute walk test (6MWT) under cardiorespiratory parameters is questionable: 6MWT is primarily a measure of functional endurance and peripheral limitations rather than central cardiorespiratory or ventilatory function. The same applies to the 30-second sit-to-stand test, which reflects neuromuscular and peripheral performance. Predictor domains should be restructured to reflect physiological
Reply: Thank you for this comment. We acknowledge that some measures, such as lean mass, 6MWT, and sit-to-stand, could be classified differently based on their underlying physiological constructs. We considered this carefully and decided to retain the current structure because our categorization aimed to group predictors by their primary functional domain as it relates specifically to stroke rehabilitation. In this context, lower‑limb characteristics were grouped to represent musculoskeletal and neuromuscular aspects typically affected after stroke, while the 6MWT was included under cardiorespiratory parameters given its widespread use as a proxy for aerobic capacity within stroke and rehabilitation research. Our aim was to create categories that aligned as closely as possible with the functional consequences of stroke and the domains targeted in rehabilitation, while remaining consistent with how predictors were reported in the included studies.
(In Fig. 1) The PRISMA diagram “Reports excluded… Reason 1: Exercise capacity not measured as VO2max” appears inconsistent with the inclusion criterion requiring VO₂peak/VO₂max equivalence and the narrative stating VO₂peak focus. Please verify this label (on page 5).
Reply: Thank you for raising the discrepancy; the Figure has been updated.
(Lines 217-306) The manuscript emphasizes modifiable factors as targets for rehabilitation, but many “modifiable predictors” are also proxies for severity, baseline function, or selection into training intensity. For example, baseline VO₂peak predicting follow-up VO₂peak may reflect regression to the mean, ceiling effects, or disease severity rather than a “modifiable target.” Training intensity predictors may be confounded by ability/mobility (those who can train harder are often less impaired).
I suggest that you add a paragraph explicitly distinguishing predictors that are causal targets (modifiable with interventions), vs predictors that are markers (modifiable in principle but not necessarily causal drivers).
Reply: Thank you for this valuable suggestion. We agree that this nuance is important for interpreting our findings and have added a paragraph reflecting on this point in the discussion of modifiable predictors, see lines 1014-1022.
(Discussion) needs to be reorganized into subsections by:
predictors of VO₂peak level, and
predictors of VO₂peak change/response,
and, within each, discuss subacute vs chronic contexts
Reply: Thank you for this suggestion. We agree that distinguishing predictors of VO2peak level from predictors of VO2peak change, and considering subacute versus chronic contexts, is an important nuance for interpretation. As noted in our response to Comment 1, we decided not to explicitly reorganize the review into separate subsections because this would fragment the synthesis and reduce readability. Instead, we ensured that timepoints and prediction targets (absolute VO₂peak vs. change) are clearly indicated in Table 1 and explicitly discussed in the manuscript, so readers can interpret findings in context. Where relevant, we also specify whether associations relate to VO2peak at a given timepoint or to change following training. We believe this approach maintains clarity while preserving the comprehensive nature of the review.

Minor issues

Consistent spelling/terminology: “crossectional” → “cross-sectional” appears multiple times.
Reply: Updated accordingly throughout the article.
In Table 2, items refer to “anaerobic test” despite focusing on aerobic capacity/CPET; likely a template error.
Reply: Typo indeed, updated accordingly.
Some claims would benefit from tighter phrasing and fewer broad generalizations (e.g., clinical implications section).
Reply: Thank you. We checked our manuscript for grammar and phrasing and updated it where needed. Please refer to the amended manuscript.
Clarify how VO₂max values were harmonized to VO₂peak (units; treadmill vs cycle; relative vs absolute).
Reply: Only the terms were harmonized; we updated the Methods section to clarify (see lines 275-276).
Report how multiple timepoints/models per study were handled (e.g., Baert 3/6/12 months).
Reply: To ensure consistency, when studies reported multiple timepoints or separate multivariable models, results from each timepoint/model were extracted and presented separately, with the corresponding timepoint or model specified in the narrative synthesis, see amended results section.
In Table 1, consider abbreviations legend closer to the table and ensure consistent naming (HIIT vs HITT appears).
Reply: Edited accordingly.
Standardize terminology: “predictors,” “determinants,” “associations”—choose one and define it.
Reply: Thank you for this comment, we thoroughly checked our manuscript and consistently used the term ‘predictors’, in line with our aim (see amended manuscript).
Throughout the manuscript, the term “stroke sufferers” (or similar phrasing) is used. This terminology is not scientifically appropriate and should be replaced with “individuals diagnosed with stroke” or “stroke survivors”, in line with current clinical and ethical reporting standards.
Reply: Updated accordingly.
In line 292, the citation (Baert et al., 2012,1) appears to contain a formatting or numbering error and should be corrected according to the journal’s reference style.
Reply: Updated accordingly.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

This systematic review addresses a clinically relevant question: which factors predict exercise capacity in stroke survivors. The protocol is preregistered and methods follow PRISMA, with independent screening and extraction described.

The main findings, modifiable and non-modifiable predictors, are clearly framed, and the discussion appropriately notes that predictors come from separate multivariable models and should be interpreted cautiously.

The McMaster Critical Review Form is used, but the review question concerns predictors from multivariable prediction models. Please consider adding a prediction-model–appropriate risk-of-bias/applicability tool and summarise risk-of-bias and applicability alongside findings to support the credibility of conclusions. The header mixes “F Coefficient/Beta (SE)” and uses unexplained suffixes/labels and terms such as “USERm,” with many “NR” entries. Please:

define every abbreviation/label in footnotes,
specify the model type, outcome target, and units,
separate predictors of absolute VO2peak vs predictors of change in VO2peak,
report comparable effect estimates where possible and clearly indicate when only p-values are available.

The manuscript treats VO2max as equivalent to VO2peak “for consistency.” Please add a brief justification and note any implications for interpretability across studies (e.g., whether maximal criteria were met), potentially as a sensitivity/limitation statement.

Items 7–8 refer to an “anaerobic test,” which is inconsistent with VO2peak/CPET focus; please correct wording.

Example: NIHSS appears as NIHHS in Table 1; please standardise.

Some predictors can be partially modifiable; provide the decision rule or examples in Methods to improve reproducibility.

Comments on the Quality of English Language

English is understandable, but revisions are needed to improve clarity, consistency of terminology/abbreviations, and sentence structure in the Methods/Results and tables.

Author Response

Reviewer 3:

Thank you for your thoughtful and encouraging feedback. Your comments reinforce the value of this work, and we are confident that the revisions further strengthen clarity and applicability for clinical practice. Below, we provide detailed responses to all comments and describe the corresponding revisions. Note that the line numbers correspond with line numbers in the tracked changes version of our revised manuscript.

The McMaster Critical Review Form is used, but the review question concerns predictors from multivariable prediction models. Please consider adding a prediction-model–appropriate risk-of-bias/applicability tool and summarise risk-of-bias and applicability alongside findings to support the credibility of conclusions. The header mixes “F Coefficient/Beta (SE)” and uses unexplained suffixes/labels and terms such as “USERm,” with many “NR” entries.
Reply: Thank you for this important comment. We acknowledge that other tools, such as PROBAST and QUIPS, are specifically designed for prediction model and prognostic factor studies. We chose to use the McMaster Critical Review Form because it provides a comprehensive evaluation of methodological quality across quantitative designs, which was appropriate given the heterogeneity of the designs of the included studies. To address prediction-specific aspects, we interpreted item 13 of the McMaster tool to capture key characteristics of the prediction models. We have adjusted our Methods section to clarify how the McMaster form was used to assess prediction-specific aspects (see lines 310-332).
For Table 3, please note that in our original submission, abbreviations and symbols were explained in footnotes, which were inadvertently lost during template transfer; these have now been restored. We acknowledge that many entries in the table are NR, which is due to the original studies not reporting important outcomes; despite contacting the authors, we did not receive additional information. Given the scarcity of literature, we chose to retain these studies to provide a comprehensive overview. We explicitly discussed the lack of detailed reporting as a methodological limitation in the manuscript (see lines 1224-1233).
define every abbreviation/label in footnotes,
Reply: In our original submission, these abbreviations were explained in the footnotes underneath the table. We noticed that these footnotes were lost when transferring our manuscript to the journal template. We are thankful that the reviewer noticed this and have added the footnotes to the edited manuscript.

specify the model type, outcome target, and units,
Reply: Thank you for this suggestion. As mentioned in our response to Comment 1, we agree that standardized reporting would improve clarity, but we were unable to specify further due to lack of reporting in the original studies. We explicitly discussed the lack of detailed reporting as a methodological limitation in the manuscript (see lines 1224-1233).

separate predictors of absolute VO2peak vs predictors of change in VO2peak,
Reply: Thank you for this thoughtful comment. We considered stratifying the review into separate analyses but opted against this to preserve readability and avoid fragmenting the synthesis. To maintain clarity, we consistently indicate in the text, where needed, whether an association pertains to VO2peak level or to change following training (see amended manuscript). We believe this approach balances interpretability and focus while ensuring that readers are alerted to the differences in predictor type and timing.

report comparable effect estimates where possible and clearly indicate when only p-values are available.
Reply: We agree that evaluating effect sizes could provide additional insight into the relative importance of predictive factors and their clinical relevance. However, this was not possible due to lack of reporting. Therefore, we have acknowledged this in the limitations section (see lines 1229-1233). P-values, where reported, can be found in Table 3.

The manuscript treats VO2max as equivalent to VO2peak “for consistency.” Please add a brief justification and note any implications for interpretability across studies (e.g., whether maximal criteria were met), potentially as a sensitivity/limitation statement.
Reply: Only the terms were harmonized; we updated the Methods section to clarify (see lines 275-276). Only studies that examined maximal exercise testing were included (see Methods section, inclusion criterion 2: ‘outcome had to be a measurement of maximum exercise capacity (VO2peak) measured with CPET (or equivalent)’); hence, no further updates were made in this regard.

Items 7–8 refer to an “anaerobic test,” which is inconsistent with VO2peak/CPET focus; please correct wording.
Reply: Thank you for addressing the typo, we corrected accordingly.
Example: NIHSS appears as NIHHS in Table 1; please standardise.

Reply: Thank you for the careful review, we corrected the typo accordingly.

Some predictors can be partially modifiable; provide the decision rule or examples in Methods to improve reproducibility.
Reply: To improve transparency, we have added a detailed description of the process used to categorize predictors as modifiable or non‑modifiable in the Methods section. This includes the criteria applied, the independent classification by two reviewers, and the procedure for resolving disagreements (see lines 295-303).

Comments on the Quality of English Language
English is understandable, but revisions are needed to improve clarity, consistency of terminology/abbreviations, and sentence structure in the Methods/Results and tables.
Reply: Thank you for addressing this, we have carefully reread our manuscript and updated grammar/phrasing where necessary (see amended manuscript).

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Thank you for your revisions. My comments have been addressed, and the manuscript is much improved. I just have a few small suggestions.

Line 840 – remove “consistently” from this sentence, as baseline VO2 was only predictive in 2/4 studies.

Lines 1257-1260 – “However, it should be noted…" Suggest incorporating this sentence to limitations section to maintain the flow of the 1st paragraph of discussion.

Comments on the Quality of English Language

The new revision contains a few spelling errors /typos. I have highlighted a few below, but I recommend that manuscript be further proofread for grammatical errors and sentence structure.

Line 513: "therapie" should be therapy

Lines 597-598: improper sentence structure here:

The study design varied widely across the articles, but a cross-sectional design (n=5). Four articles and a (secondary analysis of) RCT design (n=4) were most common

Line 598 – Instead of “Other study designs are presented…” say “All study designs are presented… “

Line 730: predictor should be predictors?

Line 734: This sentence should be rewritten for clarity: “Similarly, greater isokinetic muscle strength of the knee extensors at 90 degree flexion the non-paretic legs was a significant predictor of improved VO₂peak level [45].”

Line 742: no should be not?

Line 843: indicated should be indicating?

Line 1374: Unnecessary comma after “related”

Line 1384: “predictos should be predictors

Author Response

Reviewer 1:

Thank you for your revisions. My comments have been addressed, and the manuscript is much improved. I just have a few small suggestions.

Reply: We would like to sincerely thank the reviewers for their thoughtful evaluation of our revised manuscript. We greatly appreciate their positive reception of the changes implemented during the first round of revisions, and we are grateful for the additional minor remarks provided in this second round. We have updated our manuscript accordingly; see our point-by-point responses below.

Line 840 – remove “consistently” from this sentence, as baseline VO2 was only predictive in 2/4 studies.

Reply: We agree and have updated accordingly.

Lines 1257-1260 – “However, it should be noted…" Suggest incorporating this sentence to limitations section to maintain the flow of the 1st paragraph of discussion.

Reply: We agree and have updated accordingly; see lines 498-500.

Comments on the Quality of English Language

The new revision contains a few spelling errors /typos. I have highlighted a few below, but I recommend that manuscript be further proofread for grammatical errors and sentence structure.

Line 513: "therapie" should be therapy

Lines 597-598: improper sentence structure here:

The study design varied widely across the articles, but a cross-sectional design (n=5). Four articles and a (secondary analysis of) RCT design (n=4) were most common

Line 598 – Instead of “Other study designs are presented…” say “All study designs are presented… “

Line 730: predictor should be predictors?

Line 742: no should be not?

Line 843: indicated should be indicating?

Line 1374: Unnecessary comma after “related”

Line 1384: “predictos should be predictors

Reply: We appreciate your careful reading of the manuscript. In response, we have conducted a thorough proofreading of the full text, focusing on correcting grammatical inaccuracies, improving sentence structure, and ensuring overall clarity. All identified errors have been corrected, and we have revised several additional sentences to improve readability. We are confident that the manuscript now reflects a higher level of linguistic precision. We kindly refer to the revised manuscript.

Reviewer 2 Report

Comments and Suggestions for Authors

I appreciate the authors’ thoughtful responses to the comments raised and the evident effort made to improve the manuscript. However, one comment still requires further attention to ensure full clarity and methodological consistency.

Lean mass is a body composition variable and should be classified accordingly. It should not be listed under lower-limb characteristics. Please revise the manuscript and move lean mass to the body composition domain, ensuring that predictor domains are structured based on accepted physiological classification rather than functional interpretation.

Author Response

Reviewer 2:

I appreciate the authors’ thoughtful responses to the comments raised and the evident effort made to improve the manuscript. However, one comment still requires further attention to ensure full clarity and methodological consistency.

Reply: Thank you very much for your kind remarks regarding the revisions and the effort invested in improving the manuscript. We appreciate your careful reading and that you revisited this point with clear guidance. We acknowledge your concern regarding the classification of lean mass and agree that it should be categorized within the body composition domain rather than under lower-limb characteristics. In response, we have revised the manuscript accordingly (see lines 253-257).

Reviewer 3 Report

Comments and Suggestions for Authors

Thank you for submitting the revised manuscript. The topic is clinically relevant and the review question is clear. While the revision resolves some minor issues, major methodological and presentation concerns remain, particularly the suitability of the risk-of-bias approach for a predictor/modelling-focused review and the clarity of the main synthesis table (Table 3). Addressing the points below is necessary to improve reproducibility and interpretability.

The review focuses on predictors/modelling, yet RoB is assessed with the McMaster form. Please use an appropriate prediction/prognostic tool or justify McMaster and clearly acknowledge modelling-specific limitations.
Still difficult to interpret. Please separate predictors of VO₂peak level vs VO₂peak change, and report model type, outcome target/timepoint, units, and comparable effect estimates where feasible.
Add a brief limitation on maximal criteria and comparability when VO₂max is reported versus VO₂peak.
Provide a clear decision rule and 1–2 examples for borderline/partially modifiable predictors.
Minor typos/terminology issues remain; careful copy-editing is required.

Comments on the Quality of English Language

English is understandable, but revisions are needed to improve clarity, consistency of terminology/abbreviations, and sentence structure in the Methods/Results and tables.

Author Response

Reviewer 3:

Reply: Thank you very much for your thoughtful second-round review and for acknowledging the clinical relevance and clarity of the review question. We appreciate your careful assessment of the revision and your guidance on the remaining methodological and presentation issues. We have carefully considered each of your points and have made substantial additional revisions to improve reproducibility, transparency, and interpretability.

The review focuses on predictors/modelling, yet RoB is assessed with the McMaster form. Please use an appropriate prediction/prognostic tool or justify McMaster and clearly acknowledge modelling-specific limitations.

Reply: We agree that tools such as PROBAST or QUIPS are specifically designed for prediction and prognostic factor studies. Given the heterogeneity of study designs in our review (cross-sectional, RCTs, secondary analyses), and the fact that in many studies the prediction modeling was not the primary aim, we selected the McMaster Critical Review Form. This form allows consistent evaluation across diverse quantitative designs. However, we acknowledge that it does not fully capture modelling‑specific sources of bias. To address this we have added an explicit justification of the use of the McMaster tool in the methods section (see lines 175-177) and reflected on the modelling specific limitations of the McMaster tool in the limitations section (see lines 515-523).

Still difficult to interpret. Please separate predictors of VO₂peak level vs VO₂peak change, and report model type, outcome target/timepoint, units, and comparable effect estimates where feasible.

Reply: Thank you for highlighting the need for further refinement. In response, the Results section has been reorganized into two separate subsections distinguishing predictors of VO₂peak level from predictors of VO₂peak change (see amended sections 3.5 – 3.8). Table 3 has been amended accordingly as well, and now includes additional details on model type, units, outcome timepoints (change vs level), and comparable effect estimates (both unstandardized and standardized B were extracted where available). As a result of this enhanced reporting, several entries were listed as NR, reflecting limited information in the original publications; this issue is also addressed in the quality assessment.

To maintain clarity and readability in the main narrative, the Abstract and Discussion sections were deliberately not restructured into separate parts for VO₂peak level and VO₂peak change. Instead, timepoints and distinctions between these outcome types continue to be explicitly referenced within the text where relevant, ensuring interpretability without fragmenting the overall discussion.

Add a brief limitation on maximal criteria and comparability when VO₂max is reported versus VO₂peak.

Reply: Thank you for raising this point. We carefully reviewed the included articles and concluded that none of the included articles measured VO2max, so in our opinion this limitation does not need to be mentioned in the discussion.

Provide a clear decision rule and 1–2 examples for borderline/partially modifiable predictors.
Reply: Thank you for this helpful suggestion. To maintain clarity for readers, the review employs two categories only: modifiable and non‑modifiable predictors. Variables that could be considered “partially modifiable” (e.g., lean mass, habitual physical activity) were classified as modifiable, because they can change with training or rehabilitation, albeit to varying degrees. The Methods section now states this decision rule explicitly and provides examples (lines 180-183).
Minor typos/terminology issues remain; careful copy-editing is required.

Comments on the Quality of English Language

Reply: Thank you for noting this. The manuscript and tables have undergone careful copy‑editing.
