Review Reports - Pathological Predictors of Limited Salvage Radiotherapy Efficacy After Radical Prostatectomy: Central Review of JCOG0401

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Taken together, I recommend that the authors revise the wording of the conclusions to avoid implying definitive predictive value and instead clearly state that the findings are exploratory and hypothesis-generating. The distinction between prognostic and predictive effects should be clarified throughout the manuscript, particularly in the Abstract and Discussion. The authors are encouraged to provide formal interaction analyses with appropriate statistical thresholds (p < 0.05) and clearly report which findings remain statistically significant. Given the large number of subgroup analyses, consideration should be given to adjusting for multiple testing or explicitly acknowledging the increased risk of false-positive results. A comparison of baseline characteristics between included (n=167) and excluded patients (n=43) should be added to assess potential selection bias. The authors should consider adding a sensitivity analysis to test the robustness of the main findings (e.g., excluding small subgroups or using alternative model specifications). The use of stepwise Cox regression should be justified, or alternatively, a more robust modeling strategy should be considered. The manuscript would benefit from a clearer explanation of why time to treatment failure (TTF) was selected as the primary endpoint and a discussion of its limitations compared to standard endpoints such as PFS or metastasis-free survival. The authors should acknowledge more explicitly that treatment decisions (e.g., initiation of SHT) may influence TTF and introduce bias. Additional clinically relevant variables, such as PSA kinetics (e.g., PSA doubling time), should be included if available or discussed as an important limitation. The absence of modern imaging (e.g., PSMA PET) should be highlighted as a limitation affecting applicability to current clinical practice.
The authors should consider simplifying or focusing the subgroup analyses on pre-specified key variables (e.g., IDC-P and tGP5) to improve clarity and reduce multiplicity concerns. Figures (Kaplan–Meier curves and forest plots) should be improved for readability, including clearer labeling, higher resolution, and complete numbers at risk. The terminology “treatment-resistant factors” should be replaced with a more neutral phrasing such as “features associated with reduced relative benefit”. The Discussion should be revised to better contextualize findings within existing literature and avoid overinterpretation, emphasizing the need for prospective validation in independent cohorts.

You should also consider adding future direction - need for prospective validation, integration with molecular markers, modern imaging...

Author Response

I commend the authors for doing this subanalysis of an important trial. Actually, this trial is the only one showing how critical radiotherapy is for curing patients with biochemical recurrence, while HT looks like a palliative modality. However, to get this published they must improve the methodology and make this report more scientifically sound and robust.
We thank the reviewers for their thoughtful and constructive comments, which have significantly strengthened the manuscript. We appreciate the time and effort spent reviewing the work and have addressed each point carefully.

The authors are encouraged to provide formal interaction analyses with appropriate statistical thresholds (p < 0.05) and clearly report which findings remain statistically significant.
Thank you for this important comment. We have already provided a formal interaction analysis in Figure 2.
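For readers unfamiliar with what a formal interaction test involves, the following is a minimal sketch of one common approach: comparing two subgroup hazard ratios on the log scale using their reported 95% confidence intervals. It is not the analysis behind Figure 2, which may well have used an interaction term within the Cox model instead. The function names and all numeric inputs are hypothetical and are not taken from the manuscript.

```python
import math


def log_hr_se(lower, upper, z=1.96):
    """Approximate the standard error of log(HR) from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def interaction_test(hr_a, ci_a, hr_b, ci_b):
    """Compare two independent subgroup hazard ratios on the log scale.

    Returns (ratio of hazard ratios, z statistic, two-sided p value).
    """
    diff = math.log(hr_a) - math.log(hr_b)
    se = math.sqrt(log_hr_se(*ci_a) ** 2 + log_hr_se(*ci_b) ** 2)
    z_stat = diff / se
    # Two-sided p value from the standard normal distribution.
    p_value = math.erfc(abs(z_stat) / math.sqrt(2))
    return math.exp(diff), z_stat, p_value


# Hypothetical subgroup estimates, for illustration only.
ratio, z_stat, p_value = interaction_test(0.40, (0.20, 0.80), 0.90, (0.50, 1.62))
print(f"ratio of HRs = {ratio:.2f}, z = {z_stat:.2f}, p = {p_value:.3f}")
```

This approach assumes the two subgroups are independent and that log(HR) is approximately normally distributed, which becomes questionable when subgroups are very small.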

Given the large number of subgroup analyses, consideration should be given to adjusting for multiple testing or explicitly acknowledging the increased risk of false-positive results. A comparison of baseline characteristics between included (n=167) and excluded patients (n=43) should be added to assess potential selection bias.
Thank you for this important comment. To assess the potential for selection bias, we compared baseline clinicopathological characteristics between patients included in the ancillary analysis (n=167) and those not included (n=43). As shown in the newly added supplementary table, no major differences were observed between the two groups, suggesting that the analyzed cohort was broadly representative of the overall JCOG0401 population.

The authors should consider adding a sensitivity analysis to test the robustness of the main findings (e.g., excluding small subgroups or using alternative model specifications).
We thank the reviewer for the valuable suggestion regarding sensitivity analyses.
To evaluate the robustness of the main findings, we conducted several additional multivariable Cox regression analyses using alternative model specifications. Specifically, we performed: (1) a multivariable analysis including all candidate covariates; (2) analyses excluding variables that were considered potentially strongly correlated with other clinicopathological factors, including Gleason score, Gleason pattern 5, tertiary Gleason pattern 5, lymphovascular invasion, surgical margin distance, Gleason score at the positive surgical margin (among patients with positive margins only), and PSA doubling time; and (3) several additional models using alternative combinations and categorizations of Gleason score-related variables.
Across these sensitivity analyses, the overall conclusions of the study remained materially unchanged, supporting the robustness of the primary findings.

The use of stepwise Cox regression should be justified, or alternatively, a more robust modeling strategy should be considered.
We appreciate the reviewer’s comment regarding the variable selection procedure used in the multivariable Cox regression analysis.
Because the number of candidate covariates was relatively large compared with the number of observed events, inclusion of all candidate variables in a fully adjusted Cox model was considered likely to result in model overfitting and unstable hazard ratio estimates. Therefore, we used a stepwise variable selection procedure to construct a more parsimonious and stable multivariable model.
We acknowledge the known limitations of stepwise selection methods, including potential instability in variable selection and biased coefficient estimates. Accordingly, the results of the multivariable analysis should be interpreted as exploratory rather than strictly confirmatory.
To clarify this rationale, we have added an explanation in the Methods section of the revised manuscript.

The manuscript would benefit from a clearer explanation of why time to treatment failure (TTF) was selected as the primary endpoint and a discussion of its limitations compared to standard endpoints such as PFS or metastasis-free survival. The authors should acknowledge more explicitly that treatment decisions (e.g., initiation of SHT) may influence TTF and introduce bias. Additional clinically relevant variables, such as PSA kinetics (e.g., PSA doubling time), should be included if available or discussed as an important limitation.
We appreciate this important comment. In response, we clarified the rationale for the use of bicalutamide-based salvage hormonal therapy and the selection of TTF of bicalutamide as the primary endpoint in the original JCOG0401 trial. We additionally expanded the Discussion to address the potential limitations of TTF compared with contemporary endpoints such as progression-free survival and metastasis-free survival, including the possibility that physician-dependent treatment decisions may influence TTF and introduce bias. Furthermore, we acknowledged that clinically relevant variables such as PSA kinetics and PSA doubling time were not incorporated into the present analysis. Because the primary objective of this study was exploratory pathological assessment focusing on centrally reviewed histopathological features, we intentionally focused the present analysis on pathological variables. We additionally noted that PSA doubling time had been evaluated separately in a previous report (Reference 5).

The absence of modern imaging (e.g., PSMA PET) should be highlighted as a limitation affecting applicability to current clinical practice. The authors should consider simplifying or focusing the subgroup analyses on pre-specified key variables (e.g., IDC-P and tGP5) to improve clarity and reduce multiplicity concerns.
We appreciate this important comment. We understand the reviewer’s concern that subgroup analyses should ideally be limited to prespecified key variables. However, we also believe that selectively reporting only a subset of the subgroup analyses that were actually conducted could create the misleading impression that only those specific subgroup analyses had been performed from the outset, which may itself raise concerns regarding transparency and reporting bias.
Therefore, rather than omitting the existing subgroup analyses, we believe it is more appropriate to present the currently reported subgroup analyses while clearly stating that no adjustment for multiple comparisons was performed. We have additionally clarified in the Methods section that these subgroup analyses were exploratory in nature and should be interpreted with caution.
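Although no adjustment for multiple comparisons was performed in the manuscript, the following minimal sketch shows what one standard adjustment, the Holm step-down procedure, would look like when applied to a set of subgroup p values. The function name and the p values are hypothetical and are not taken from the manuscript.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p values in the original order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical unadjusted p values from four subgroup comparisons.
raw = [0.01, 0.04, 0.03, 0.20]
for p, adj in zip(raw, holm_adjust(raw)):
    print(f"raw p = {p:.2f}, Holm-adjusted p = {adj:.2f}")
```

Holm's procedure controls the family-wise error rate and is never less powerful than a plain Bonferroni correction, which is why it is used here as the illustration.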

Figures (Kaplan–Meier curves and forest plots) should be improved for readability, including clearer labeling, higher resolution, and complete numbers at risk.
We appreciate this helpful comment. In response, we replaced the Kaplan–Meier curves and forest plots with revised higher-resolution figures to improve readability. We also revised the labeling and added complete numbers at risk to the Kaplan–Meier plots in the updated figures.

The terminology “treatment-resistant factors” should be replaced with a more neutral phrasing such as “features associated with reduced relative benefit”. The Discussion should be revised to better contextualize findings within existing literature and avoid overinterpretation, emphasizing the need for prospective validation in independent cohorts.
We appreciate your critical comment. We replaced definitive expressions such as “treatment-resistant factors” or statements implying that certain patients should not receive SRT with more cautious phrasing, including “features potentially associated with reduced relative benefit from SRT.” Furthermore, we explicitly acknowledged the increased risk of false-positive findings related to multiple subgroup analyses and multiple testing in the limitations section. We also added statements highlighting the need for prospective validation in larger independent cohorts incorporating modern imaging and molecular biomarkers.

They should also tone down the ERG and PTEN findings: either remove them from the main results or clearly label them as underpowered and exploratory. Please expand the limitations section to add the following issues: multiple testing, small subgroup sizes, selection bias, and outdated imaging (no PSMA PET). Add clinical implications carefully. Instead of “these patients should not receive SRT”, use “these findings may help identify patients who could benefit from treatment intensification or alternative strategies”.
We appreciate this important comment. In response, we revised the manuscript to further tone down the ERG and PTEN findings and explicitly clarified in the main text that these results should be interpreted as underpowered and exploratory. We also expanded the limitations section to address the issues you raised, including multiple testing, small subgroup sizes, potential selection bias, and the absence of modern imaging modalities such as PSMA PET during the study period. In addition, we revised the clinical implication statements and modified the Conclusions to avoid suggesting that certain patients should not receive SRT. Specifically, the Conclusions were revised to state that “these findings may help identify patients who could benefit from treatment intensification or alternative strategies.” All revised portions have been highlighted in red in the revised manuscript.

You should also consider adding future direction - need for prospective validation, integration with molecular markers, modern imaging...
We appreciate this important and constructive comment. In response, we added sentences regarding future directions to the Discussion and Conclusion.

Reviewer 2 Report

Comments and Suggestions for Authors

Thank you for asking me to review this interesting article. Salvage radiation makes up a significant portion of any GU radiation practice, such that guidance is always appreciated. This is a retrospective review of pathological factors which may influence outcomes in patients previously enrolled in the JCOG0401 study.

As identified by the authors, the study is limited by sample size. The initial study contained 210 patients, of which, 167 are missing from this analysis.
In addition to the above, the authors have chosen to perform multiple comparisons, even when division of factors has resulted in small bin sizes, such as seen with ERG positive patients. This may not be valid for statistical analysis.
A more detailed description of the initial study may be helpful in the Introduction, especially regarding the treatments and length of follow up. Was radiation given to the pelvis or prostate bed alone?
PTEN and ERG should be defined at their first instances.
Define “Not applicable” for Tertiary Gleason Pattern 5. How is this different from “absent”, in Table 1?
Table 1 includes Seminal Vesicle Invasion, but also includes stage 3b under Pathological T stage. Those are the same. Since this is redundant, there is no value in including Seminal Vesicle invasion.
“Not Done” seems to account for a large number of patients for N stage, PTEN and ERG. This needs to be stressed when listing study limitations.
In Table 2, the HR for absent tGP5 is listed as 0.099. Just by looking at the raw numbers (10/15 = 0.667 and 5/17 = 0.294), a HR of <0.1 seems very unlikely. Please check this.
As mentioned above, the bin sizes for ERG are very small: 7 versus 2, in the positive set. It would be better to report that you looked at ERG but didn’t feel it was reasonable to report any statistics on this group.
It is not clear to me how factors were chosen for multivariable analysis. I can’t really see a trend looking at either HR or p values. Were these pre-specified?
Table A1 – the title suggests multivariate analysis but the table states multivariable. This is not consistent.
Is Supplementary Table 1 the same as Appendix A Table A1? If so, please label it consistently. If not, please provide whichever is missing.
It is not clear why bicalutamide was used in the SHT group when LHRH agonists were standard of care when this study was accruing patients. Please explain this.
As per above, since bicalutamide is not used as monotherapy in prostate cancer, please explain in the Discussion how this is a study limitation and might limit the external validity of your findings.
How do your findings of Hazard Ratios compare with those of larger, similar studies (eg. GETUG, RTOG 9601) on salvage radiation?

Author Response

We thank the reviewers for their thoughtful and constructive comments, which have significantly strengthened the manuscript. We appreciate the time and effort spent reviewing the work and have addressed each point carefully.

As identified by the authors, the study is limited by sample size. The initial study contained 210 patients, of which, 167 are missing from this analysis.

We appreciate this important comment. We agree that potential selection bias due to patient exclusion should be carefully addressed. To evaluate this issue, we additionally performed a comparison of baseline clinicopathological characteristics between patients included in the present analysis and those excluded from the ancillary pathological review. The results of this comparison have been added to the revised manuscript and are discussed as part of the study limitations.

In addition to the above, the authors have chosen to perform multiple comparisons, even when division of factors has resulted in small bin sizes, such as seen with ERG positive patients. This may not be valid for statistical analysis.

We appreciate this important comment. We agree that multiple subgroup comparisons with small sample sizes may limit the reliability of statistical analyses and increase the risk of false-positive findings. In response to this concern, we revised the manuscript to explicitly acknowledge this limitation. In particular, because the number of ERG-positive cases was very small, we considered formal interaction analysis for ERG to be insufficiently reliable, and we additionally clarified that the ERG-related findings should be interpreted as exploratory and underpowered.

A more detailed description of the initial study may be helpful in the Introduction, especially regarding the treatments and length of follow up. Was radiation given to the pelvis or prostate bed alone?

We appreciate this helpful comment. In response, we expanded the Introduction to provide a more detailed description of the original JCOG0401 trial, including the study design, treatment strategy, eligibility criteria, and follow-up duration. Specifically, we clarified that patients with PSA recurrence after radical prostatectomy were randomized to salvage hormonal therapy alone or salvage radiotherapy followed by the same hormonal therapy strategy in cases of treatment failure. We also added the median follow-up duration of 5.5 years reported in the original study. In addition, we clarified the radiotherapy protocol used in JCOG0401. Salvage radiotherapy consisted of irradiation to the prostate bed alone with a total dose of 64.8 Gy, and elective whole-pelvis irradiation was not performed.

PTEN and ERG should be defined at their first instances.

We appreciate the request for clarification regarding immunohistochemistry for PTEN and ERG. We have added the following sentences to the Trial design section in the Materials and Methods:

“Monoclonal antibodies against ERG (EP111, dilution 1:100) and PTEN (SP218, dilution 1:100) were used for immunohistochemistry. ERG expression was considered positive when a cluster of tumor nuclei demonstrated positive staining. PTEN expression was considered “PTEN loss” when at least 90% of tumor cells showed absence of staining.”

Define “Not applicable” for Tertiary Gleason Pattern 5. How is this different from “absent”, in Table 1?

We appreciate this comment. “Not applicable” refers to cases in which Gleason pattern 5 was already included as the primary or secondary Gleason pattern, such as GS 3+5, 5+3, 4+5, 5+4, or 5+5; therefore, tertiary Gleason pattern 5 could not be assigned. In contrast, “absent” refers to cases without tertiary Gleason pattern 5 among tumors with GS 3+3, 3+4, 4+3, or 4+4. We added these sentences to the Methods.

Table 1 includes Seminal Vesicle Invasion, but also includes stage 3b under Pathological T stage. Those are the same. Since this is redundant, there is no value in including Seminal Vesicle invasion.

We appreciate this helpful comment. We agree that seminal vesicle invasion is redundant because it is already represented by pathological stage pT3b. Accordingly, we removed seminal vesicle invasion from Table 1 to avoid duplication and improve clarity.

“Not Done” seems to account for a large number of patients for N stage, PTEN and ERG. This needs to be stressed when listing study limitations.

We appreciate your critical comments. We added the following sentence to the limitations: “In addition, a substantial proportion of patients had ‘Not Done’ status for N stage, PTEN, and ERG assessments, which may have further reduced statistical power and introduced potential bias.”

In Table 2, the HR for absent tGP5 is listed as 0.099. Just by looking at the raw numbers (10/15 = 0.667 and 5/17 = 0.294), a HR of <0.1 seems very unlikely. Please check this.

We thank the reviewer for carefully checking the values presented in Table 2.

We re-examined the analysis and confirmed that the reported hazard ratio is correct. The hazard ratio was estimated using a Cox proportional hazards regression model, which takes into account not only the number of events but also the timing of events and censoring information. Therefore, the estimated hazard ratio does not necessarily correspond directly to the simple ratio of event proportions calculated from the raw event counts (e.g., 10/15 vs. 5/17).

Accordingly, the value reported in Table 2 reflects the results of the Cox regression analysis and does not represent an error in calculation.
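The general point that event timing affects a hazard-based estimate can be illustrated with a deliberately simplified calculation. The sketch below is not a Cox model and does not verify the value in Table 2: it assumes a constant hazard in each group, so that the hazard is estimated as events divided by total follow-up time. The event counts are those quoted by the reviewer, whereas the follow-up totals and function names are hypothetical.

```python
def proportion_ratio(events_ref, n_ref, events_cmp, n_cmp):
    """Ratio of crude event proportions (comparison group over reference group)."""
    return (events_cmp / n_cmp) / (events_ref / n_ref)


def rate_ratio(events_ref, time_ref, events_cmp, time_cmp):
    """Ratio of event rates per unit of follow-up time, assuming constant hazards."""
    return (events_cmp / time_cmp) / (events_ref / time_ref)


# Event counts quoted in the reviewer's comment: 10/15 and 5/17.
crude = proportion_ratio(10, 15, 5, 17)

# Hypothetical total follow-up (person-years): events in the reference group
# are assumed to occur early, so that group accrues far less follow-up time.
timed = rate_ratio(10, 20.0, 5, 100.0)

print(f"ratio of event proportions: {crude:.3f}")
print(f"ratio of event rates:       {timed:.3f}")
```

With identical event counts, the two ratios diverge as soon as follow-up time differs between groups, which is why a crude ratio of proportions cannot by itself confirm or refute a reported hazard ratio.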

As mentioned above, the bin sizes for ERG are very small: 7 versus 2, in the positive set. It would be better to report that you looked at ERG but didn’t feel it was reasonable to report any statistics on this group.

We appreciate this important comment. We agree that the number of ERG-positive cases was too small to support reliable statistical interpretation. Accordingly, we revised the manuscript to clearly state that the ERG-related findings should be interpreted cautiously as exploratory and underpowered. We also clarified in the Discussion that the very limited subgroup size resulted in unstable estimates and wide confidence intervals, limiting the reliability of the statistical analyses for this cohort.

It is not clear to me how factors were chosen for multivariable analysis. I can’t really see a trend looking at either HR or p values. Were these pre-specified?

We appreciate this important comment. In the present exploratory analysis, we intentionally focused on pathological variables evaluated by central pathological review, because the primary objective of this study was to investigate the potential pathological factors associated with differential relative benefit from SRT. Therefore, the variables included in the multivariable analyses were selected based on pathological relevance and prior clinical interest rather than solely on univariable HRs or p values. In addition, we performed supplementary analyses excluding age from the multivariable model, and the overall trends and interpretation of the results remained essentially unchanged.

Table A1 – the title suggests multivariate analysis but the table states multivariable. This is not consistent.

We thank the reviewer for pointing out this inconsistency. We have revised the terminology throughout the manuscript and standardized the expression to “multivariable analysis,” including in Appendix A Table A1. Please also see the 12.

Is Supplementary Table 1 the same as Appendix A Table A1? If so, please label it consistently. If not, please provide whichever is missing.

We apologize for our mistake, which caused confusion for the reviewer. Yes, Supplementary Table 1 and Appendix A Table A1 are the same. We have deleted Supplementary Table 1.

It is not clear why bicalutamide was used in the SHT group when LHRH agonists were standard of care when this study was accruing patients. Please explain this.

We appreciate this important comment. In the original JCOG0401 trial, bicalutamide-based salvage hormonal therapy was selected because the investigators considered overall survival to be an impractical primary endpoint in this postoperative PSA recurrence setting, where many elderly patients were expected to experience competing mortality unrelated to prostate cancer. In addition, time to treatment failure (TTF) of LH-RH analogue therapy was considered less suitable because, in the SRT+SHT arm, PSA failure would need to occur multiple times before initiation of LH-RH analogue therapy, potentially introducing bias in endpoint evaluation. Therefore, TTF of bicalutamide was adopted as an earlier and more feasible surrogate endpoint for treatment evaluation.

As per above, since bicalutamide is not used as monotherapy in prostate cancer, please explain in the Discussion how this is a study limitation and might limit the external validity of your findings.

We appreciate this important comment. We agree that the use of bicalutamide monotherapy differs from current standard treatment strategies for recurrent prostate cancer and may limit the external validity of the present findings. We therefore added this point to the Discussion as a study limitation. Specifically, we clarified that treatment approaches for biochemical recurrence have evolved substantially since the accrual period of JCOG0401, including broader use of LHRH-based androgen deprivation therapy, androgen receptor pathway inhibitors, and modern imaging-guided management. Accordingly, caution is required when extrapolating the present results directly to contemporary clinical practice.

How do your findings of Hazard Ratios compare with those of larger, similar studies (eg. GETUG, RTOG 9601) on salvage radiation?

We appreciate this valuable comment. We added a brief comparison with major randomized salvage radiotherapy trials to better contextualize our findings. In JCOG0401, initial SRT before SHT significantly prolonged TTF of bicalutamide compared with SHT alone (HR 0.555, 95% CI 0.376–0.818). Similarly, RTOG 9601 demonstrated improved overall survival with the addition of bicalutamide to salvage radiotherapy (HR 0.77, 95% CI 0.59–0.99), while GETUG-AFU 16 showed improved progression-free survival with radiotherapy plus goserelin compared with radiotherapy alone (HR 0.50, 95% CI 0.38–0.66). Although direct comparison is limited by differences in study design and endpoints, these findings consistently support the clinical benefit of combining radiotherapy and hormonal therapy in patients with biochemical recurrence after radical prostatectomy.

Reviewer 3 Report

Comments and Suggestions for Authors

This is an excellently written manuscript. I have no further comments besides that I would like to see more discussion and proposals for the future. What are the future implications? For example, integration of the new findings in prospective trials, e.g., as a stratification factor.

Author Response

We appreciate your positive and constructive comment. In response, we expanded the Discussion to further address the potential future implications of our findings. Specifically, we added statements on the need for prospective validation in independent cohorts and on the potential integration of pathological features, such as IDC-P and tGP5, into future prospective clinical trials, including their possible use as stratification factors. We also discussed the importance of combining these pathological findings with modern imaging modalities and molecular biomarkers to improve postoperative risk stratification and treatment personalization.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Thank you to the authors for the additional work they did on my suggestions, which improved the overall quality of this paper.