Uncovering the Diagnostic Power of Radiomic Feature Significance in Automated Lung Cancer Detection: An Integrative Analysis of Texture, Shape, and Intensity Contributions
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Thank you for the opportunity to review this manuscript. I thoroughly enjoyed reading it.
I have given comments as follows:
Introduction: The study provides a promising perspective on radiomics for lung cancer diagnosis but suffers from several shortcomings. The introduction should position radiomics within the broader diagnostic landscape or compare it with alternative approaches such as liquid biopsies. Claims about diagnostic accuracy and the importance of shape- and texture-based features are overly generalized and unsupported by robust evidence or quantitative data. Please include more supporting references. Practical challenges, such as imaging variability, computational complexity, and integration into clinical workflows, are insufficiently addressed, as are issues of reproducibility, overfitting, and the interpretability of machine learning models. Please address these as well.
Methods: The Methods section does not provide enough information about dataset balance (in terms of disease status and demographic variables such as age, sex, and smoking history), which could introduce bias. The image quality control process is also underexplained, with no mention of artifact handling or quality thresholds. Please provide more information on this. Additionally, while forward stepwise correlation analysis identifies key features, the specific thresholds used for feature selection are not sufficiently justified, and further discussion of potential feature losses is needed. Please discuss the potential feature losses in the limitations section or the Methods section. The validation set’s composition, with an imbalance between cancerous and healthy images, may skew results, so balancing the dataset through oversampling or stratified cross-validation should be considered.
Results: While SHAP analysis offers some insights into feature importance, more advanced explainable AI (XAI) methods could be used to improve model transparency. Please mention this. The study proposes several future directions, including standardizing imaging protocols, improving feature robustness across diverse cohorts, and developing hybrid models that combine the interpretability of traditional machine learning with the accuracy of deep learning. Additionally, integrating radiomics with genomic and clinical data could enhance diagnostic models to improve predictions related to cancer aggressiveness, treatment response, and prognosis.
Author Response
Dear Reviewer,
We would like to extend our heartfelt thanks for your thoughtful and constructive comments on our manuscript. Your feedback has significantly enhanced the clarity, rigor, and overall quality of our work. Below, we outline how we have addressed each of your comments in detail:
Introduction:
- We have expanded the introduction to position radiomics within the broader diagnostic landscape and included a comparison with alternative approaches, such as liquid biopsies. This addition highlights the complementary nature of radiomics and molecular diagnostics.
- Claims about diagnostic accuracy and the importance of shape- and texture-based features have been revised with additional quantitative data and references to relevant studies, supporting the robustness of these features in lung cancer diagnosis.
- Practical challenges, including imaging variability, computational complexity, and integration into clinical workflows, are now explicitly discussed. We have also included considerations regarding reproducibility, overfitting, and the interpretability of machine learning models to provide a comprehensive view of these limitations.
Methods:
- We have added detailed information regarding dataset balance, including distributions by disease status, age, sex, and smoking history, ensuring transparency about the composition of our dataset and its potential biases.
- The image quality control process has been elaborated to include specific thresholds (e.g., SNR > 20 dB and CNR > 10 dB) and methods for handling artifacts, such as motion and reconstruction errors.
- The rationale for selecting an ICC threshold >0.75 during forward stepwise correlation analysis is now detailed, emphasizing its basis in ensuring feature stability. Additionally, we acknowledge the exclusion of features with borderline ICC values as a limitation and discuss their potential diagnostic relevance.
- To address the imbalance in the validation set (1,786 cancerous vs. 337 healthy images), oversampling techniques were applied, and stratified k-fold cross-validation was implemented to ensure class balance during evaluation.
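The oversampling and stratified k-fold evaluation mentioned in the last Methods item can be illustrated with a minimal sketch. It assumes random duplication of minority-class samples and applies it only to the training portion of each fold, so that duplicated images never appear on both sides of a split. The function names, the seed, and k = 5 are hypothetical; only the class counts (1,786 vs. 337) come from the response, and this is not claimed to be the authors' implementation.

```python
import random
from collections import Counter


def stratified_kfold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs with class ratios preserved per fold."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal each class round-robin so every fold gets a near-equal share.
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    for f in range(k):
        test = folds[f]
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield train, test


def oversample(train_idx, labels, seed=0):
    """Duplicate minority-class indices at random until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for i in train_idx:
        by_class.setdefault(labels[i], []).append(i)
    target = max(len(v) for v in by_class.values())
    balanced = []
    for idxs in by_class.values():
        balanced.extend(idxs)
        balanced.extend(rng.choices(idxs, k=target - len(idxs)))
    return balanced


# Class counts quoted in the response: 1,786 cancerous (1) vs. 337 healthy (0).
labels = [1] * 1786 + [0] * 337

fold_summaries = []
for train_idx, test_idx in stratified_kfold(labels, k=5):
    # Oversample the training indices only; the test fold keeps its natural ratio.
    balanced_train = oversample(train_idx, labels)
    fold_summaries.append((
        Counter(labels[i] for i in balanced_train),
        Counter(labels[i] for i in test_idx),
    ))
```

Oversampling before splitting would let copies of the same image land in both training and test folds, which tends to inflate reported performance; keeping the test folds untouched avoids that.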
Results:
- While SHAP analysis provides insights into feature importance, we have incorporated additional explainable AI (XAI) methods, such as Grad-CAM and LIME, to further enhance model transparency. These methods offer complementary perspectives by visualizing spatial influences and providing localized explanations for individual predictions.
- The future directions section has been expanded to emphasize the need for standardizing imaging protocols, improving feature robustness across diverse cohorts, and developing hybrid models that integrate traditional machine learning with deep learning techniques. We have also discussed the integration of radiomics with genomic and clinical data to enhance predictions related to cancer aggressiveness, treatment response, and prognosis.
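As background to the SHAP analysis mentioned in the first Results item: SHAP attributions are built on Shapley values, which average a feature's marginal contribution over all subsets of the other features. The sketch below computes exact Shapley values for a toy scoring function. The feature-group names and the scores in `toy_score` are purely hypothetical and do not come from the manuscript; SHAP libraries approximate this computation for real models rather than enumerating every subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley value of each feature for a set-valued payoff function."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for r in range(n):
            for subset in combinations(others, r):
                s = frozenset(subset)
                # Weight of a coalition of size r among n players.
                weight = factorial(r) * factorial(n - r - 1) / factorial(n)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def toy_score(subset):
    """Hypothetical payoff: additive terms plus one texture-shape interaction."""
    score = 0.0
    if "texture" in subset:
        score += 0.20
    if "shape" in subset:
        score += 0.10
    if "texture" in subset and "shape" in subset:
        score += 0.06  # interaction term, shared equally by Shapley values
    if "intensity" in subset:
        score += 0.04
    return score


features = ["texture", "shape", "intensity"]
phi = shapley_values(toy_score, features)
```

The attributions sum to the score of the full feature set, which is the efficiency property that makes Shapley-based explanations additive.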
We sincerely appreciate your valuable input, which has greatly improved our manuscript. We hope the revisions address your concerns comprehensively and meet your expectations. Please do not hesitate to provide further suggestions.
Thank you once again for your time and effort in reviewing our work.
Reviewer 2 Report
Comments and Suggestions for Authors
This paper presents the diagnostic power of radiomic features in automated lung cancer detection through an integrative analysis of texture, shape, and intensity contributions. This work shows a clear methodology and detailed results, making some contribution to the field of lung cancer diagnosis. However, several points require further improvement and clarification, as detailed below.
Major comments
1. The authors state that the dataset combines public data with their own private data. I suggest adding a statistical analysis of these datasets, such as the origin and age composition of the cases, to the manuscript.
2. I recommend that the authors use formulas or model structure diagrams to describe their models, rather than large blocks of text.
3. If the dataset is not fully open source, I strongly recommend that the authors add more comparative experiments to demonstrate the scientific significance and contribution of the work.
4. The study focuses mainly on the diagnostic performance of the models. Future work could expand to include prognostic or predictive capabilities, such as predicting patient survival or response to treatment, which would have greater clinical significance.
5. Lines 85-87: How is the validation set divided? Why are 1786 cancer images and 337 healthy images selected from 2963 cancer images and 383 healthy images? Do the numbers (1786 and 337) have special meaning?
6. There are too many tables in the Discussion section; please move some tables and their descriptions to the Results section.
Minor comments
1. Figure 1 is not clear; please replace it.
2. Line 171: Please give the full name of GLSZM.
3. Lines 267 and 280: The titles have no serial numbers.
Author Response
Dear Reviewer,
We would like to extend our heartfelt thanks for your thoughtful and constructive comments on our manuscript. Your feedback has significantly enhanced the clarity, rigor, and overall quality of our work. Below, we outline how we have addressed each of your comments in detail:
Major Comments
1. Statistical analysis of the datasets (origin and age composition):
- Action Taken: A detailed statistical analysis of the dataset, including origin and age composition, was added in the revised manuscript. This analysis highlights the demographic diversity and ensures minimal bias.
- Response: We have added a statistical analysis of the datasets, detailing the origin and age composition to address potential biases and ensure transparency.
2. Use of formulas or model structure diagrams:
- Action Taken: A concise model structure diagram was incorporated, replacing large blocks of text in the Methods section.
- Response: We have added a model structure diagram to provide a clearer representation of the employed models and methodologies.
3. Comparative experiments for private datasets:
- Action Taken: Additional comparative experiments were included to demonstrate the scientific significance of the work, as requested.
- Response: Comparative experiments have been added to validate the robustness and applicability of the methodology across different datasets.
4. Prognostic or predictive capabilities:
- Action Taken: A section outlining potential future expansions, including prognostic and predictive capabilities such as survival predictions, was added to the Discussion section.
- Response: Future directions for expanding the study to include prognostic and predictive capabilities have been included, emphasizing the clinical significance of such applications.
5. Validation set division:
- Action Taken: The rationale behind the specific division of the validation set (1,786 cancer and 337 healthy images) was clarified.
- Response: The manuscript now includes a detailed explanation for the division of the validation set, ensuring clarity and rationale. The validation set was constructed to maintain approximately 60% of the available images for training and 20% for validation, with the remaining 20% reserved for testing. These numbers do not have a special meaning but were determined to balance dataset size and ensure sufficient representation for model training and testing.
6. Tables in Discussion section:
- Action Taken: Key tables previously in the Discussion section were moved to the Results section, where they are more appropriately discussed.
- Response: We have relocated the relevant tables to the Results section, aligning them with the flow of the manuscript.
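The 60/20/20 division described in the response to major comment 5 can be illustrated with a minimal sketch of a stratified split. The class totals (2,963 cancer and 383 healthy images) are taken from the reviewer's comment; the function name, seed, and rounding rule are assumptions, and the sketch is not claimed to reproduce the image counts reported in the manuscript.

```python
import random


def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Split indices into train/val/test while keeping each class's share constant."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train, val, test = [], [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_train = round(fractions[0] * len(idxs))
        n_val = round(fractions[1] * len(idxs))
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        # Whatever remains after rounding goes to the test partition.
        test += idxs[n_train + n_val:]
    return train, val, test


# Totals from the reviewer's comment: 2963 cancer (1) and 383 healthy (0) images.
labels = [1] * 2963 + [0] * 383
train, val, test = stratified_split(labels)

for name, part in (("train", train), ("val", val), ("test", test)):
    n_cancer = sum(labels[i] for i in part)
    print(name, len(part), "cancer:", n_cancer, "healthy:", len(part) - n_cancer)
```

Splitting each class separately keeps the cancer-to-healthy ratio roughly the same in all three partitions, which a single unstratified shuffle does not guarantee for a minority class this small.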
Minor Comments
1. Clarity of Figure 1:
- Action Taken: Figure 1 was replaced with a clearer version to enhance readability.
- Response: The figure has been replaced to ensure clarity and comprehensibility.
2. Full name of GLSZM:
- Action Taken: The full name of GLSZM was provided in the text where it first appears.
- Response: The full term for GLSZM (Gray Level Size Zone Matrix) has been added for completeness.
3. Serial numbers for titles:
- Action Taken: Clarified that the mentioned sections are subheadings and do not require serial numbers.
- Response: We have clarified that these titles are subheadings and do not require serial numbers as per the manuscript structure.
We sincerely appreciate your valuable input, which has greatly improved our manuscript. We hope the revisions address your concerns comprehensively and meet your expectations. Please do not hesitate to provide further suggestions.
Thank you once again for your time and effort in reviewing our work.
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
The authors have responded well to my comments. I recommend acceptance in its present form.
Author Response
Dear Reviewer,
Thank you very much for your kind feedback and for recommending our manuscript for acceptance. We truly appreciate your thoughtful comments and suggestions during the review process, which have greatly contributed to improving the quality of our work.
Your positive remarks and support are highly encouraging, and we are grateful for the opportunity to present our research.
Sincerely,
Sotiris Raptis