Review Reports
- Emily Belker 1,*,
- Katrin Hornemann 1 and
- Waldemar Schreiner 1
- et al.
Reviewer 1: Anonymous
Reviewer 2: Kaibo Guo
Reviewer 3: Anonymous
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This manuscript presents a retrospective single-center analysis of the prognostic role of spread through air spaces (STAS) in resected lung adenocarcinoma. While the topic is clinically relevant, the study suffers from substantial methodological limitations, internal inconsistencies, and over-interpretation of exploratory findings. Most importantly, the main conclusions are not supported by the multivariable analyses, and the data do not justify the proposed clinical implications. In its current form, the manuscript does not provide sufficiently robust or novel evidence to merit publication.
Major Concerns
- Conclusions Are Not Supported by Multivariable Analysis: Although STAS is associated with worse overall survival in univariate analysis, it is not an independent prognostic factor in multivariable Cox regression (HR 0.65, p = 0.227) in the study. Despite this, the authors repeatedly suggest clinical relevance and potential guidance for surgical decision-making. This interpretation is not supported by the data and represents a clear overstatement of the findings.
- Severe Selection Bias and Limited Generalizability: Only 100 of 366 surgically treated NSCLC patients were included after extensive exclusions, resulting in a highly selected cohort characterized by advanced age, very high smoking prevalence, and a substantial proportion of stage III–IV disease. These features severely limit external validity, particularly for early-stage adenocarcinoma populations in which STAS-guided surgical decisions would be most relevant.
- Methodological Weakness in STAS Assessment: STAS was assessed using a heterogeneous mix of H&E slides, frozen sections, and immunohistochemical preparations. No interobserver agreement or sensitivity analysis is provided. This heterogeneity introduces significant risk of misclassification and undermines reproducibility of the primary pathological endpoint.
- Over-Interpretation of Subgroup Analyses: Claims regarding differential prognostic impact of STAS by extent of resection are based on small subgroups and lack formal interaction testing. The proposed “floor effect” explanation is speculative and unsupported. These findings should be considered exploratory only and do not justify clinical inference.
- Inclusion of stage IV patients in a resected cohort is insufficiently justified.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
1. Retrospective Design and Selection Bias
The study’s retrospective nature introduces significant limitations, including selection bias (e.g., exclusion of unresectable tumors, incomplete recurrence data). The cohort size (n=100) is insufficient to detect small but clinically meaningful effects, particularly in multivariable analyses where STAS lost significance. Furthermore, the exclusion of 45 patients due to missing data raises concerns about generalizability. Sensitivity analyses or multiple imputation methods should address missing data. A prospective multicenter design would strengthen validity.
2. Confounding Variables and Lack of Adjuvant Therapy Adjustment
The analysis does not account for adjuvant therapy (chemotherapy/radiation), which likely influenced survival outcomes, especially in advanced-stage patients (e.g., 21% stage IV). Adjuvant treatment protocols and molecular-targeted therapies (e.g., EGFR/ALK inhibitors) are critical confounders that may attenuate the observed association between STAS and survival. Subgroup analyses stratified by adjuvant therapy status are necessary to isolate STAS’s independent prognostic role.
3. Methodological Limitations in STAS Assessment
While STAS evaluation followed WHO criteria, discrepancies arise from using non-H&E slides (e.g., frozen sections, immunohistochemistry) in 23% of cases. This introduces variability, as STAS identification in non-H&E preparations lacks standardization. Interobserver reliability was not reported, despite prior studies noting moderate variability. Including a second blinded pathologist for validation would improve robustness.
4. Overinterpretation of Subgroup Analyses
The claim that STAS-guided radical resection improves survival (p=0.034) is problematic. The subgroup analysis comparing STAS+/− in radical resection lacks adjustment for confounding factors (e.g., nodal status, tumor size). The observed survival difference may reflect confounding rather than STAS causality. Furthermore, limited resection outcomes (n=35) lacked statistical power to detect differences, undermining conclusions about STAS neutrality in sublobar resections.
5. Clinical Relevance and Practical Implications
The manuscript overstates STAS’s utility in surgical decision-making. While STAS correlates with adverse features, its inability to independently predict survival in multivariable models (HR=0.648, p=0.227) limits clinical applicability. The discussion lacks actionable recommendations (e.g., margin thresholds or adjuvant therapy adjustments). Additionally, the absence of recurrence-free survival (RFS) data due to incomplete documentation weakens the prognostic narrative.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
- How was the cutoff between low STAS and high STAS determined?
- Fig2b - Please provide the STAS classification for this representative microscopic sample.
- Since sample size is 100, % values are only valid to integer accuracy. Please revise throughout the manuscript.
- Table 1. The statistical test used should be part of the footnote. It should stand independent of information presented elsewhere in the manuscript.
- Section 3.3, paragraph 2, and Table 2: Is reporting data to 3 decimal places justified? I suggest that reporting to 2 decimal places is a better choice.
- In the Discussion section, don't repeat data from the Results section, only interpret your findings already presented in the Results section relative to previously published findings.
- My general conclusion is that this study is confirmatory, and does not inform the reader of any clinically significant novelty to previous reports. If you disagree, you need to tell the reader what your findings have added to what is already known from previous studies reported in the scientific literature.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The authors have made a substantial effort to address the concerns raised in the previous round of review. The revised manuscript presents a more balanced interpretation of the data, particularly in the Discussion and Conclusion sections, by acknowledging that Spread Through Air Spaces (STAS) was not found to be an independent prognostic factor in this cohort. The inclusion of additional limitations regarding the retrospective design and the "exploratory" nature of the subgroup analyses is appreciated.
However, significant issues remain that prevent the manuscript from being accepted in its current form. There is a critical contradiction between the Abstract and the Conclusion regarding the clinical utility of STAS. Furthermore, the statistical reporting in Section 3.2 (regarding age) involves unconventional metrics that appear to contradict standard hypothesis testing, and the multivariable model (Table 2) demonstrates signs of instability that cast doubt on the robustness of the negative findings.
Major concerns
- While the authors have commendably revised the Conclusion to state that STAS "cannot be regarded as an independent prognostic factor" (Lines 536–537), the Abstract remains contradictory. The Abstract concludes that STAS "may guide surgical decision-making in early ADCL" (Lines 46–47). This statement is not supported by the study’s primary multivariable analysis (p=0.227). If STAS is not an independent predictor, it cannot be recommended to guide surgical decision-making based on this data alone. The Abstract must be revised to align strictly with the cautious tone of the new conclusion.
- In Section 3.2, the statistical reporting regarding the association between age and STAS is highly unconventional for clinical research and appears contradictory.
The standard hypothesis test (Chi-square/Fisher’s) shows no significant association between age and STAS (p=0.789) (Table 1). However, the text immediately following introduces "symmetric lambda," "Goodman and Kruskal’s tau," and "uncertainty coefficients" to suggest a "moderate relationship" (Lines 299–309). Introducing predictive error reduction metrics (Lambda) when the primary association test is non-significant (p>0.05) is confusing and appears to be an attempt to force a correlation where none exists statistically. Unless the authors can provide a compelling statistical justification for why these specific metrics are standard for this study design, this section should be removed to avoid misleading the reader.
- There are concerns regarding the stability of the Cox proportional hazards model used in Table 2. The Hazard Ratio (HR) for Nodal Status (N0 vs. N+) is reported as 25.797 with a 95% Confidence Interval (CI) of 6.474 – 102.794. Such an extremely wide confidence interval usually indicates model instability, likely due to the small sample size (n=100) relative to the number of events, or quasi-complete separation of data points. This instability raises the question of whether the study is sufficiently powered to conclude that STAS is not a prognostic factor (potential Type II error). The authors should address this instability in the limitations or verify the proportional hazards assumption.
- In Section 2.3 and Response 4, the authors state that a "limited subset" or "minority" of cases were assessed using digitized frozen sections or IHC because original H&E slides were missing. Frozen sections are known to be prone to artifacts (e.g., inflation artifacts) that can mimic STAS. The term "minority" is vague. Please explicitly state the exact number (n) and percentage of cases where STAS determination was made using non-H&E slides. If this number is greater than 5-10%, a sensitivity analysis excluding these cases would be appropriate to ensure they are not skewing the prevalence data.
- In the Discussion (Lines 397–399), the authors attribute the lack of STAS prognostic value in the limited resection group to a "floor effect" caused by uniformly poor outcomes. Limited resections are often performed on older patients with significant comorbidities. If the "poor outcomes" are driven by non-cancer mortality (e.g., cardiovascular death), STAS would naturally show no effect. If cause-of-death data is available, please clarify if the deaths in the limited resection group were cancer-related or due to competing risks. This would validate whether the "floor effect" is oncological or related to patient frailty.
Minor Concerns
- Line 126: The inclusion of Stage IV patients is now justified in the text as "oligometastatic" (Line 21), which is acceptable, but the authors should ensure this definition is consistent with the "curative intent" claim throughout the manuscript.
- Typos: Please check the manuscript for minor typographical errors (e.g., formatting of p-values in tables).
Author Response
Response to Reviewer 1 Comments, Round 2 Review
1. Summary
We thank you very much for the thorough and constructive second-round evaluation of our manuscript. The comments were carefully considered and led to further refinement of the manuscript, focusing on statistical reporting and consistency between the Abstract and Conclusions.
2. Questions for General Evaluation
Reviewer’s Evaluation (options offered for each question: Yes / Can be improved / Must be improved / Not applicable)
- Does the introduction provide sufficient background and include all relevant references?
- Are all the cited references relevant to the research?
- Is the research design appropriate?
- Are the methods adequately described?
- Are the results clearly presented?
- Are the conclusions supported by the results?
3. Point-by-point response to Comments and Suggestions for Authors
Comment 1: [While the authors have commendably revised the Conclusion to state that STAS "cannot be regarded as an independent prognostic factor" (Lines 536–537), the Abstract remains contradictory. The Abstract concludes that STAS "may guide surgical decision-making in early ADCL" (Lines 46–47). This statement is not supported by the study’s primary multivariable analysis (p=0.227). If STAS is not an independent predictor, it cannot be recommended to guide surgical decision-making based on this data alone. The Abstract must be revised to align strictly with the cautious tone of the new conclusion.]
Response 1: [We thank the reviewer for highlighting the inconsistency between the Abstract and the Conclusions in the previous version of the manuscript. We fully agree that statements implying a role of STAS in guiding surgical decision-making are not supported by the results of the primary multivariable analysis. The Abstract has therefore been revised accordingly. The Conclusions of the Abstract now strictly reflect the results of the multivariable Cox regression analysis, stating that STAS did not retain independent prognostic significance and should be interpreted as a marker of aggressive tumor biology rather than an independent determinant of prognosis or surgical strategy. The Abstract is now fully aligned with the revised Conclusions and the cautious interpretation presented in the Discussion. (Section Abstract; line 28-56)]
Comment 2: [In Section 3.2, the statistical reporting regarding the association between age and STAS is highly unconventional for clinical research and appears contradictory. The standard hypothesis test (Chi-square/Fisher’s) shows no significant association between age and STAS (p=0.789) (Table 1). However, the text immediately following introduces "symmetric lambda," "Goodman and Kruskal’s tau," and "uncertainty coefficients" to suggest a "moderate relationship" (Lines 299–309). Introducing predictive error reduction metrics (Lambda) when the primary association test is non-significant (p>0.05) is confusing and appears to be an attempt to force a correlation where none exists statistically. Unless the authors can provide a compelling statistical justification for why these specific metrics are standard for this study design, this section should be removed to avoid misleading the reader.]
Response 2: [We thank the reviewer for raising this important point regarding the statistical reporting of the association between age and STAS. We agree that the presentation of directional association measures following a non-significant primary hypothesis test was unconventional and potentially misleading in the context of clinical research. We have revised Section 3.2 and removed all predictive error reduction and directional association metrics, including symmetric lambda, Goodman and Kruskal’s tau, and uncertainty coefficients. The Results section now consistently reports only the non-significant association between age and STAS based on the chi-square test (p = 0.789).]
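The distinction at issue in Comment 2 can be illustrated with a minimal sketch. A chi-square test asks whether an association is statistically distinguishable from chance, whereas symmetric lambda is a descriptive proportional-reduction-in-error measure with no built-in significance statement. The 2×2 counts below are hypothetical and invented purely for illustration; they are not taken from the manuscript, and the function names are assumptions of this sketch.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value (1 df, no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))

def symmetric_lambda(table):
    """Goodman and Kruskal's symmetric lambda for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    sum_row_max = sum(max(row) for row in table)
    sum_col_max = sum(max(col) for col in zip(*table))
    numerator = sum_row_max + sum_col_max - max(row_totals) - max(col_totals)
    denominator = 2 * n - max(row_totals) - max(col_totals)
    return numerator / denominator

# HYPOTHETICAL counts: rows = two age groups, columns = STAS negative / positive.
hypothetical_table = [[30, 20], [28, 22]]

stat, p_value = chi_square_2x2(hypothetical_table)
lam = symmetric_lambda(hypothetical_table)
print(f"chi-square = {stat:.3f}, p = {p_value:.3f}, symmetric lambda = {lam:.3f}")
```

The two outputs answer different questions, which is why reporting an association-strength measure alongside a non-significant test can read as contradictory unless the descriptive nature of the measure is made explicit.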
Comment 3: [There are concerns regarding the stability of the Cox proportional hazards model used in Table 2. The Hazard Ratio (HR) for Nodal Status (N0 vs. N+) is reported as 25.797 with a 95% Confidence Interval (CI) of 6.474 – 102.794. Such an extremely wide confidence interval usually indicates model instability, likely due to the small sample size (n=100) relative to the number of events, or quasi-complete separation of data points. This instability raises the question of whether the study is sufficiently powered to conclude that STAS is not a prognostic factor (potential Type II error). The authors should address this instability in the limitations or verify the proportional hazards assumption.]
Response 3: [We thank the reviewer for this important comment. The wide confidence interval observed for nodal status reflects limited precision and is most likely attributable to the relatively small sample size in relation to the number of events and potential quasi-complete separation. This issue has now been explicitly addressed in the Discussion as a methodological limitation. We further clarify that the absence of independent prognostic significance for STAS in the multivariable model should not be interpreted as evidence of biological irrelevance, but rather as a consequence of limited statistical power to detect smaller independent effects in this cohort, acknowledging the potential risk of a type II error. In addition, we have clarified in the Statistical Analysis section that the proportional hazards assumption was formally assessed using standard diagnostic procedures, and no relevant violations were identified. (Discussion, Section 4.3; lines 490-496)]
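The precision concern in Comment 3 can be made concrete with a short back-calculation. The sketch below assumes the reported intervals and p-values are Wald-type (symmetric on the log hazard scale), which the review record does not state explicitly; the function names are illustrative only. It recovers the standard error of log(HR) implied by the nodal-status interval and shows what kind of interval the STAS estimate (HR 0.648, p = 0.227) would imply under the same assumption.

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value, about 1.96

def log_se_from_ci(lower, upper):
    """Standard error of log(HR) implied by a symmetric 95% Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * Z95)

def wald_ci_from_p(hr, p_value):
    """Approximate 95% CI for an HR given only its two-sided Wald p-value."""
    z = NormalDist().inv_cdf(1 - p_value / 2)
    se = abs(math.log(hr)) / z
    log_hr = math.log(hr)
    return math.exp(log_hr - Z95 * se), math.exp(log_hr + Z95 * se)

# Values quoted in the reports above (Table 2 of the manuscript under review).
se_nodal = log_se_from_ci(6.474, 102.794)
ci_ratio_nodal = 102.794 / 6.474

# Assumption: the STAS p-value comes from a Wald test on log(HR).
stas_low, stas_high = wald_ci_from_p(0.648, 0.227)

print(f"nodal status: SE of log(HR) = {se_nodal:.3f}, upper/lower = {ci_ratio_nodal:.1f}")
print(f"STAS: approximate 95% CI = ({stas_low:.2f}, {stas_high:.2f})")
```

Under these assumptions, the back-calculated STAS interval necessarily includes 1 (because p > 0.05) while remaining compatible with both a protective and a harmful effect, which is the sense in which a non-significant result in a small cohort cannot rule out a type II error.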
Comment 4: [In Section 2.3 and Response 4, the authors state that a "limited subset" or "minority" of cases were assessed using digitized frozen sections or IHC because original H&E slides were missing. Frozen sections are known to be prone to artifacts (e.g., inflation artifacts) that can mimic STAS. The term "minority" is vague. Please explicitly state the exact number (n) and percentage of cases where STAS determination was made using non-H&E slides. If this number is greater than 5-10%, a sensitivity analysis excluding these cases would be appropriate to ensure they are not skewing the prevalence data.]
Response 4: [The manuscript has been revised accordingly. We now explicitly state in Section 2.3 that STAS assessment was based on non-H&E material in 7 of 100 cases (7%), due to the unavailability of original H&E slides in a small subset of archival specimens. In these cases, STAS was evaluated only when alveolar architecture and detached tumor clusters were unequivocally preserved, and all cases were handled conservatively by an experienced pulmonary pathologist to minimize the risk of misclassification. Given the low proportion of affected cases (<10%), we consider the risk of relevant bias to be limited. No indication was found that inclusion of these cases skewed STAS prevalence or associated analyses. (This limitation is now explicitly stated in Section 2.3, Histopathological Evaluation of STAS; lines 200-201)]
Comment 5: [In the Discussion (Lines 397–399), the authors attribute the lack of STAS prognostic value in the limited resection group to a "floor effect" caused by uniformly poor outcomes. Limited resections are often performed on older patients with significant comorbidities. If the "poor outcomes" are driven by non-cancer mortality (e.g., cardiovascular death), STAS would naturally show no effect. If cause-of-death data is available, please clarify if the deaths in the limited resection group were cancer-related or due to competing risks. This would validate whether the "floor effect" is oncological or related to patient frailty.]
Response 5: [We thank the reviewer for this important comment regarding the interpretation of the “floor effect” observed in the limited resection subgroup. We agree that competing risks related to patient frailty and comorbidities may influence survival outcomes in this population. Reliable cause-of-death data were not consistently available in this retrospective cohort, precluding a robust distinction between cancer-related mortality and non-cancer-related competing risks. The Discussion has therefore been revised to explicitly acknowledge this limitation and to clarify that the observed lack of prognostic impact of STAS in the limited resection group may be influenced by competing risks rather than reflecting a purely oncological floor effect. (Discussion 4.1 Prognostic Role of STAS in Resected NSCLC and Implications for Surgical Strategy line 422-431)].
Comment 6: [Minor Concerns · Line 126: The inclusion of Stage IV patients is now justified in the text as "oligometastatic" (Line 21), which is acceptable, but the authors should ensure this definition is consistent with the "curative intent" claim throughout the manuscript. · Typos: Please check the manuscript for minor typographical errors (e.g., formatting of p-values in tables).]
Response 6: [We thank the reviewer for these helpful minor comments. Regarding the inclusion of stage IV patients, we agree that consistency with the “curative intent” terminology is essential. The manuscript has been revised accordingly. We now explicitly state that stage IV patients were included only in cases of oligometastatic disease treated within a curative surgical concept, thereby ensuring consistency with the curative intent claim throughout the manuscript. In addition, the manuscript has been carefully proofread, and minor typographical issues, including formatting of p-values in tables and text, have been corrected. (Section 2.1 Study Design and Setting, lines 137-139)]
Author Response File:
Author Response.docx
Reviewer 2 Report
Comments and Suggestions for Authors
This manuscript has been well revised and is recommended for publication.
Author Response
Comment: This manuscript has been well revised and is recommended for publication.
Response: We sincerely thank the reviewer for the careful evaluation of our manuscript and for the positive and supportive recommendation for publication.
Reviewer 3 Report
Comments and Suggestions for Authors
Thank you for your detailed response to my suggestions for revision.
Author Response
Comment: Thank you for your detailed response to my suggestions for revision.
Response: We sincerely thank the reviewer for the positive assessment.
Round 3
Reviewer 3 Report
Comments and Suggestions for Authors
No comments on the revision for the submitting authors.