Review Reports - Segmentation of 71 Anatomical Structures Necessary for the Evaluation of Guideline-Conforming Clinical Target Volumes in Head and Neck Cancers

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This paper addresses an auto-segmentation task. All 71 structures were manually delineated and used to train nnU-Net models, which automatically segment 71 anatomical structures in the head and neck area relevant for CTV delineation. The predictions for 18 unseen data sets are evaluated against the manual labels as well as segmentations generated by the TotalSegmentator, and compared to previously reported segmentation results. For this paper, I would like to make the following recommendations:

(1) The abstract is not clear enough about the model proposed in this study.

(2) The motivation is not clear. Please specify the importance of the proposed solution.

(3) Please highlight the contributions of this paper in introduction.

(4) It is advisable to compare your method in more depth with the previously reported results of 129 segments, including performance on different anatomical structures. Highlight the innovations of your approach and discuss potential areas for improvement.

(5) Detail the method for subdividing the labels of all 71 anatomical structures into three non-overlapping, disjoint subsets. Particular attention should be paid to the impact of this subdivision on model performance, compared to other possible label processing methods.

(6) Why was 2 mm chosen as the margin in the sDice?

(7) Discuss in the discussion the potential explanations for the DICE_m values you have observed, as well as the factors that may lead to lower similarity. This helps to provide a more comprehensive understanding of the results.

(8) At the end of this paper, the shortcomings and application limitations of this study are elaborated, but the direction of improvement is not proposed, so as to lay a foundation for further research.

Comments on the Quality of English Language

Author Response

Response to Reviewer 1 Comments

1. Summary

Thank you very much for taking the time to review this manuscript. Please find the detailed responses below and the corresponding revisions indicated by red text color in the re-submitted files.

2. Point-by-point response to Comments and Suggestions for Authors

Comments 1: The abstract is not clear enough about the model proposed in this study.

Response 1: Thank you for pointing this out. We agree with this comment and have added the 3D nature of the model, the data subsets, and the five-fold cross-validation scheme to the abstract (lines 11-14).

Comments 2: The motivation is not clear. Please specify the importance of the proposed solution.

Response 2: Thank you for drawing our attention to this aspect. We have added a paragraph to the introduction (lines 126-134) and to Section 4.5 (lines 538-546). In the paragraph in the introduction, we describe the current status quo of using manual labels for CTV delineation, note that these labels are often unreliable, and point out that the metrics used usually do not measure the quantity of interest (i.e., they measure spatial similarity instead of treatment outcome). The novelty of our approach is that it uses the international consensus expert guidelines as ground truth instead.

In the paragraph in Section 4.5, we give an outlook on the advantages and usage of the automatic segmentations of the anatomical structures presented in the manuscript.

Comments 3: Please highlight the contributions of this paper in introduction.

Response 3: Thank you for bringing this to our attention. We have added a paragraph summarizing the contribution of this paper to the end of the introduction (lines 144-150). This paragraph names the 48 anatomical structures segmented for the first time and the improved results for other structures, as well as the analysis we made with respect to the application of the expert guidelines and individual deviations in accuracy metrics.

Comments 4: It is advisable to compare your method in more depth with the previously reported results of 129 segments, including performance on different anatomical structures. Highlight the innovations of your approach and discuss potential areas for improvement.

Response 4: Thank you for highlighting this detail. We have added the very recent results published in Podobnik et al. (2023) to Table 1 and Table A1 to compare our results with those of other researchers. We are very interested in presenting a complete comparison of head and neck segmentations on CT scans because of its clinical relevance to the current radiation therapy workflow. If you know of further previously published results in line with our study that we have not yet found, we kindly ask you to provide a clear reference to these paper(s).

Comments 5: Detail the method for subdividing the labels of all 71 anatomical structures into three non-overlapping, disjoint subsets. Particular attention should be paid to the impact of this subdivision on model performance, compared to other possible label processing methods.

Response 5: We are grateful for your discerning eye on this. We have modified Section 2.3 (lines 202, 203) accordingly. The division of the labels into non-overlapping subsets is required by the applied state-of-the-art framework for medical image segmentation, the nnU-Net Version 1. One subset contains only the skin contour, since it surrounds all other anatomical structures. The second subset contains the hypopharynx, left and right nasal cavity, nasopharynx, oral cavity, and oropharynx. The remaining 64 structures form the third subset.

Although dense annotations were found to have advantages for image segmentation tasks, as described in our introduction, the labels we present are not dense. We see the potential of our work to bring the segmentation community closer to dense annotations by assembling our labels with other existing labels.

In this research, we did not aim to deduce the division scheme that is most favorable for this type of task.
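
As an illustration of the subdivision described in Response 5, the following minimal sketch checks that three label subsets are pairwise disjoint and together cover all 71 structures. It is not the authors' code. The identifiers for the seven named structures are assumed spellings, the 64 remaining structures are not listed in the response and are represented here by hypothetical placeholders, and `check_partition` and `to_label_map` are helper names introduced only for this example.

```python
from itertools import combinations

# Structures named in the response (identifier spellings are assumptions).
SUBSET_SKIN = {"skin_contour"}
SUBSET_CAVITIES = {
    "hypopharynx",
    "nasal_cavity_left",
    "nasal_cavity_right",
    "nasopharynx",
    "oral_cavity",
    "oropharynx",
}
# The other 64 structures are not named in the response;
# these placeholder names are hypothetical.
SUBSET_REMAINING = {f"structure_{i:02d}" for i in range(1, 65)}


def check_partition(subsets, expected_total):
    """Return True if the subsets are pairwise disjoint and their sizes sum to expected_total."""
    for a, b in combinations(subsets, 2):
        if a & b:  # a shared label means the subsets are not disjoint
            return False
    return sum(len(s) for s in subsets) == expected_total


def to_label_map(subset):
    """Assign consecutive integer labels within one subset (0 is reserved for background)."""
    return {name: idx for idx, name in enumerate(sorted(subset), start=1)}


if __name__ == "__main__":
    subsets = [SUBSET_SKIN, SUBSET_CAVITIES, SUBSET_REMAINING]
    print(check_partition(subsets, expected_total=71))
    print(to_label_map(SUBSET_CAVITIES))
```

Each subset gets its own integer label map, so a voxel that lies inside both the skin contour and an inner structure can carry one label per subset without conflict.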

Comments 6: Why was 2 mm chosen as the margin in the sDice?

Response 6: Thank you for highlighting this detail. We have added an explanation of our choice of the 2 mm margin in Section 2.4 (lines 227-229). This number originates from clinical practice in photon radiation therapy, in which outlines are only corrected if they deviate by more than 2 mm. Based on this clinical practice, we consider all deviations smaller than this margin as sufficiently accurate.
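
To make the role of the 2 mm margin concrete, the sketch below computes a voxel-based surface Dice: the fraction of surface voxels of both masks that lie within the tolerance of the other mask's surface. This is a simplified, brute-force illustration in pure Python, not the implementation used in the manuscript. Masks are assumed to be sets of (z, y, x) voxel indices, surface voxels are defined via 6-connectivity, and the function names are chosen for this example only.

```python
import math

# 6-connected neighbourhood used to decide whether a voxel lies on the surface.
_NEIGHBOUR_OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def surface_voxels(mask):
    """Return voxels of `mask` (a set of (z, y, x) tuples) with at least one neighbour outside."""
    return {
        v
        for v in mask
        if any((v[0] + dz, v[1] + dy, v[2] + dx) not in mask for dz, dy, dx in _NEIGHBOUR_OFFSETS)
    }


def surface_dice(mask_a, mask_b, spacing_mm, tolerance_mm=2.0):
    """Voxel-based surface Dice of two masks at the given tolerance (in mm)."""
    surf_a = surface_voxels(mask_a)
    surf_b = surface_voxels(mask_b)
    if not surf_a and not surf_b:
        return 1.0  # both masks empty: treated as perfect agreement
    if not surf_a or not surf_b:
        return 0.0  # only one mask is empty

    def distance_mm(p, q):
        # Euclidean distance between voxel centres, scaled by the voxel spacing.
        return math.sqrt(sum(((pi - qi) * s) ** 2 for pi, qi, s in zip(p, q, spacing_mm)))

    def count_within_tolerance(source, target):
        # Number of source surface voxels whose nearest target surface voxel is close enough.
        return sum(1 for p in source if min(distance_mm(p, q) for q in target) <= tolerance_mm)

    close = count_within_tolerance(surf_a, surf_b) + count_within_tolerance(surf_b, surf_a)
    return close / (len(surf_a) + len(surf_b))


if __name__ == "__main__":
    cube = {(z, y, x) for z in range(4) for y in range(4) for x in range(4)}
    shifted = {(z, y, x + 1) for (z, y, x) in cube}
    print(surface_dice(cube, shifted, spacing_mm=(1.0, 1.0, 1.0)))
```

With 1 mm voxels, a cube shifted by one voxel stays entirely within the 2 mm tolerance and scores 1.0, which mirrors the clinical reasoning that deviations below the margin do not require correction.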

Comments 7: Discuss in the discussion the potential explanations for the DICEm values you have observed, as well as the factors that may lead to lower similarity. This helps to provide a more comprehensive understanding of the results.

Response 7: We thank you for pointing out this detail. We have added a summary of reasons for the DICE values we observed in Section 4.1 (lines 356-367). Common reasons are difficulties with the transition between related structures, the definition of the beginning or end of elongated structures, the lack of tolerance of the DICE for small deviations in thin structures, and inconsistent manual labels due to metal artefacts or insufficient soft tissue contrast.
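
The point about thin structures can be illustrated with a small numerical sketch: the same one-voxel shift costs a thin structure far more volumetric DICE than a thick one. The example uses synthetic box-shaped masks and helper names (`dice`, `box`) chosen for illustration only. It does not reproduce any data from the manuscript.

```python
def dice(mask_a, mask_b):
    """Volumetric Dice coefficient of two masks given as sets of voxel indices."""
    if not mask_a and not mask_b:
        return 1.0  # both empty: treated as perfect agreement
    return 2 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))


def box(z_range, y_range, x_range):
    """Build a box-shaped mask as a set of (z, y, x) voxel indices."""
    return {(z, y, x) for z in z_range for y in y_range for x in x_range}


if __name__ == "__main__":
    # Thin structure: only 2 voxels thick along x.
    thin = box(range(20), range(20), range(0, 2))
    thin_shifted = box(range(20), range(20), range(1, 3))

    # Thick structure: 20 voxels thick along x, shifted by the same single voxel.
    thick = box(range(20), range(20), range(0, 20))
    thick_shifted = box(range(20), range(20), range(1, 21))

    print(dice(thin, thin_shifted))    # 0.5
    print(dice(thick, thick_shifted))  # 0.95
```

A deviation of one voxel halves the DICE of the thin structure while barely affecting the thick one, which is why a surface-based metric with a tolerance is the more informative measure for thin structures.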

Comments 8: At the end of this paper, the shortcomings and application limitations of this study are elaborated, but the direction of improvement is not proposed, so as to lay a foundation for further research.

Response 8: Thank you for bringing this to our attention. We have modified Section 4.5 accordingly. For each limitation, we have now added ideas on how to improve the current situation in the future, e.g., by bringing different segmentation models together for dense annotations, adding MRI scans for better soft tissue contrast, or increasing the number of training data sets with underrepresented image features.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

In this manuscript, the authors segmented 71 anatomical structures that are relevant for Clinical Target Volumes (CTV) delineation in head and neck cancer patients, according to the expert guidelines, on 104 CT scans to assess the possibility of automating their segmentation with state-of-the-art deep learning methods. They have trained nnU-Net models to automatically segment those 71 structures on planning CT scans. They present the DICE, HD and sDICE for the 71 (+ 5) anatomical structures, for most of which no previous segmentation accuracies have been reported. The accuracy the authors achieved with their segmentation models matched or exceeded the previously reported values. Additionally, the predictions from their models were always better than those of the TotalSegmentator, and the sDICE with a 2 mm margin was larger than 80% for almost all structures.

Overall, the manuscript is very well written and organized. There are some minor English typos and editing issues that should be corrected; please refer to the attached commented PDF document where some of the needed corrections are highlighted.

Comments for author File: Comments.pdf

Comments on the Quality of English Language

Please refer to the comments above.

Author Response

Response to Reviewer 2 Comments
1. Summary
Thank you very much for taking the time to review this manuscript. We very much appreciate your detailed review, which helped to improve the quality of this manuscript. Please find the overall revisions indicated by red text color in the re-submitted files. We have added a sentence about model details to the abstract, a paragraph to the introduction and to Section 4.5 clarifying the importance of the proposed solution, a paragraph summarizing our contribution to the introduction, the recently published results from Podobnik et al. (2023) to Table 1, details about the subdivision of our labels to Section 2.3, a paragraph summarizing typical reasons for impaired segmentation performance, and potential future research to Section 4.5. Additionally, small orthographic corrections were made but are not indicated in red text color.
2. Point-by-point response to Comments and Suggestions for Authors
Comments 1: There are some minor English typos and editing issues that should be corrected; please refer to the attached commented PDF document where some of the needed corrections are highlighted.
Response 1: Thank you for pointing this out. We agree with this comment and have included all corrections as proposed.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The authors propose an interesting segmentation method for medical CT. The article is well structured and documented.

To increase the value of the article, I ask the authors to make a comparison of the results obtained with other results reported in the specialized literature on the same topic.

Author Response

Response to Reviewer 3 Comments
1. Summary
Thank you very much for taking the time to review this manuscript. Please find the overall revisions indicated by red text color in the re-submitted files. We have added a sentence about model details to the abstract, a paragraph to the introduction and to Section 4.5 clarifying the importance of the proposed solution, a paragraph summarizing our contribution to the introduction, the recently published results from Podobnik et al. (2023) to Table 1, details about the subdivision of our labels to Section 2.3, a paragraph summarizing typical reasons for impaired segmentation performance, and potential future research to Section 4.5. Additionally, small orthographic corrections were made but are not indicated in red text color.

2. Point-by-point response to Comments and Suggestions for Authors
Comments 1: To increase the value of the article, I ask the authors to make a comparison of the results obtained with other results reported in the specialized literature on the same topic.
Response 1: Thank you for highlighting this detail. We have added the very recent results published in Podobnik et al. (2023) to Table 1 and Table A1 in which we list all previously published results we found for comparison with our results.

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

This study evaluated a new method of normal structure delineation on CT images. Even though the authors developed this method to delineate the CTV in head and neck cancer, the CTV could not be evaluated without gold standard measurements.

The abstract should not include abbreviations.

Author Response

Response to Reviewer 4 Comments
1. Summary
Thank you very much for taking the time to review this manuscript. Please find the overall revisions indicated by red text color in the re-submitted files. We have added a sentence about model details to the abstract, a paragraph to the introduction and to Section 4.5 clarifying the importance of the proposed solution, a paragraph summarizing our contribution to the introduction, the recently published results from Podobnik et al. (2023) to Table 1, details about the subdivision of our labels to Section 2.3, a paragraph summarizing typical reasons for impaired segmentation performance, and potential future research to Section 4.5. Additionally, small orthographic corrections were made but are not indicated in red text color.
2. Point-by-point response to Comments and Suggestions for Authors
Comments 1: The abstract should not include abbreviations.
Response 1: Thank you for pointing this out. We agree with this comment and have eliminated all abbreviations except the one for clinical target volume (CTV), because in our opinion keeping this abbreviation helps the reading flow.