Identifying a Critical Blind Spot: How Commercial AI (CAD) Systems Fail to Detect Faint Ground-Glass Opacities at −730 HU on Low-Dose CT
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
1- The data used in the study has not been externally validated, and only the data from 100 patients was used. This single-center and limited approach raises questions about the reliability of the results. The authors may consider performing external validation or increasing the number of analyses.
2- We understand that nodule density measurements (HU) were performed using manual ROI. You should discuss how this approach performed across all data.
3- You should detail your ROI algorithm. You should consider how you could transition to segmentation or another autonomous format.
4- Is Figure 4 entirely a visual created by you? If not, you should cite it or mention this in the text.
5- Why is the study based solely on False Negative success? I recommend that you create complete confusion matrices and draw a conclusion based on the overall system performance. You may consider adding confusion matrices to the conclusion section of your study.
6- For example, in the “Exclusion Criteria:” section, you have provided content in bullet points. I am not opposed to writing content in bullet points. However, the bullet points you have provided in your manuscript without any explanation fall short of clarifying the topic. You should explain these bullet points.
7- Figure 1 is already legible and is just an algorithm diagram. Does it need to be this large? It should be reduced in size.
8- Why didn't you reference the programs you wrote about in the heading “2.4. CAD Systems and Nodule Detection”? You should reference them, even if they are a commercial website or a user document.
9- The difference between CAD1 and CAD2 is very superficial and is given with a brief explanation. These explanations should be collected in one place and expanded upon.
Author Response
Response to Reviewers
Dear Editor and Reviewers,
We sincerely thank you for the constructive feedback on our manuscript entitled "Identifying a Critical Blind Spot: How Commercial AI (CAD) Systems Fail to Detect Early Lung Adenocarcinoma at -730 HU on Low-Dose CT". We have carefully addressed all comments, incorporated external validation data, detailed our methodologies, and anonymized the CAD systems as requested. All modifications in the manuscript have been highlighted in red.
Reviewer 1:
1- The data used in the study has not been externally validated, and only the data from 100 patients was used. This single-center and limited approach raises questions about the reliability of the results. The authors may consider performing external validation or increasing the number of analyses.
Response: Thank you for this highly valuable suggestion. We completely agree that external validation is crucial. We have now acquired and analyzed an additional external validation cohort consisting of 50 patients from a separate institution. The results from this external cohort are highly consistent with our primary findings, demonstrating similar false-negative rates for low-density GGOs. Furthermore, we specifically measured the CT attenuation values of the false-negative GGOs in this external cohort. Crucially, the mean CT values (-741 ± 48.20 HU for Vendor A, and -733 ± 62.50 HU for Vendor B) perfectly corroborated the "-730 HU density blind spot" identified in our primary cohort.
We have updated Table 2 (now Table 3) to present the quantitative CT attenuation analysis for the primary and external validation cohorts side by side, directly demonstrating the stability of this threshold across different centers. We have added these data to the Methods (Section 2.1) and Results (Sections 3.2 and 3.4), and have comprehensively updated Table 1, Table 2, and Table 3 to reflect the integrated external validation findings.
2- We understand that nodule density measurements (HU) were performed using manual ROI. You should discuss how this approach performed across all data.
Response: Thank you for pointing this out. We have added a discussion on the reliability of the manual ROI approach. Inter-observer reliability was excellent, with an intraclass correlation coefficient (ICC) of 0.92. This has been detailed in Section 2.5: "The manual ROI measurements demonstrated excellent inter-observer reliability across all data, with an intraclass correlation coefficient (ICC) of 0.92..."
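For readers who want to reproduce an inter-observer reliability figure of this kind, a minimal sketch follows. The response does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1), is assumed here. The function name `icc_2_1` and the HU readings are hypothetical and are not study data.

```python
import numpy as np


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is an (n_subjects, n_raters) array, e.g. one row per nodule
    and one column per reader. Assumed form; the source does not specify it.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()

    # Sums of squares for subjects (rows), raters (columns), and residual.
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_total = ((x - grand) ** 2).sum()
    ss_err = ss_total - ss_rows - ss_cols

    # Mean squares from the two-way ANOVA decomposition.
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical mean-HU readings for five nodules by two readers.
readings = [
    [-735.0, -729.0],
    [-702.0, -710.0],
    [-760.0, -751.0],
    [-688.0, -695.0],
    [-741.0, -744.0],
]
print(f"ICC(2,1) = {icc_2_1(readings):.3f}")
```

The absolute-agreement form penalizes a systematic offset between readers, which seems appropriate when the HU values themselves, not only their ranking, are compared against a threshold.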
3- You should detail your ROI algorithm. You should consider how you could transition to segmentation or another autonomous format.
Response: We have expanded the description of our manual ROI technique and discussed the future transition to automated segmentation in Section 2.5 and the Discussion: "Measurements were obtained using a 3D spherical tool, strictly maintaining a 1-2 mm distance from the nodule margin..."
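As a rough illustration of how the spherical ROI rule quoted above could be automated, the sketch below computes the mean and standard deviation of HU inside a sphere whose radius is the nodule radius minus a safety margin. This is not the authors' tool: the measurements in the study were manual, and the function names, the 1.5 mm default margin (chosen from within the stated 1-2 mm range), the voxel spacing, and the synthetic phantom are all assumptions made for illustration.

```python
import numpy as np


def spherical_roi_stats(volume_hu, spacing_mm, center_mm, nodule_radius_mm, margin_mm=1.5):
    """Mean HU, sample SD, and voxel count inside a spherical ROI.

    The ROI radius is the nodule radius minus `margin_mm`, so the sphere
    stays inside the nodule margin. Axis order is (z, y, x) throughout.
    """
    roi_radius = nodule_radius_mm - margin_mm
    if roi_radius <= 0:
        raise ValueError("nodule is too small for the requested margin")

    vol = np.asarray(volume_hu, dtype=float)
    # Physical coordinate (mm) of every voxel centre along each axis.
    axes = [np.arange(n) * s for n, s in zip(vol.shape, spacing_mm)]
    zz, yy, xx = np.meshgrid(*axes, indexing="ij")
    dist2 = (
        (zz - center_mm[0]) ** 2
        + (yy - center_mm[1]) ** 2
        + (xx - center_mm[2]) ** 2
    )
    mask = dist2 <= roi_radius ** 2
    if not mask.any():
        raise ValueError("ROI contains no voxels")

    values = vol[mask]
    sd = float(values.std(ddof=1)) if values.size > 1 else 0.0
    return float(values.mean()), sd, int(mask.sum())


def make_phantom(shape=(21, 21, 21), center_vox=(10, 10, 10), radius_vox=5,
                 nodule_hu=-730.0, background_hu=-850.0):
    """Synthetic volume: a uniform sphere on a uniform background (hypothetical)."""
    idx = np.indices(shape)
    dist2 = sum((idx[i] - center_vox[i]) ** 2 for i in range(3))
    vol = np.full(shape, background_hu, dtype=float)
    vol[dist2 <= radius_vox ** 2] = nodule_hu
    return vol


phantom = make_phantom()
mean_hu, sd_hu, n_vox = spherical_roi_stats(
    phantom, spacing_mm=(1.0, 1.0, 1.0), center_mm=(10.0, 10.0, 10.0),
    nodule_radius_mm=5.0,
)
print(f"mean = {mean_hu:.1f} HU, SD = {sd_hu:.2f}, voxels = {n_vox}")
```

Keeping the ROI inside the margin avoids averaging in surrounding lung parenchyma, which would bias a faint GGO toward lower attenuation. A segmentation-based workflow would replace the fixed sphere with a per-nodule mask.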
4- Is Figure 4 entirely a visual created by you? If not, you should cite it or mention this in the text.
Response: We confirm that Figure 4 was entirely generated by us.
5- Why is the study based solely on False Negative success? I recommend that you create complete confusion matrices and draw a conclusion based on the overall system performance. You may consider adding confusion matrices to the conclusion section of your study.
Response: This is a very insightful comment. We have now included a complete performance evaluation, including True Positives (TP) and False Positives (FP), to construct a confusion matrix for both systems across all 150 cases. We have added a new Table 2 and provided an overall performance summary in Section 3.2. We have also generated a new statistical bar chart (Supplementary Figure S1) to visually represent these data.
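To make the requested summary concrete, the following sketch derives the usual detection metrics from confusion-matrix counts. The counts are hypothetical placeholders, not values from the manuscript, and the names `DetectionCounts` and `summarize` are illustrative. True negatives are treated as optional because per-nodule CAD evaluations often have no well-defined negative count; that is an assumption about the study design, not something the response states.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DetectionCounts:
    tp: int                    # nodules correctly marked by the CAD system
    fp: int                    # CAD marks with no corresponding nodule
    fn: int                    # nodules the CAD system missed
    tn: Optional[int] = None   # only defined in per-case (not per-lesion) analyses


def _ratio(num, den):
    """Return num/den, or NaN when the denominator is zero."""
    return num / den if den else math.nan


def summarize(c):
    """Compute standard metrics from confusion-matrix counts."""
    metrics = {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "false_negative_rate": _ratio(c.fn, c.tp + c.fn),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "f1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
        "specificity": None,
    }
    if c.tn is not None:
        metrics["specificity"] = _ratio(c.tn, c.tn + c.fp)
    return metrics


# Hypothetical counts for two anonymized systems.
for name, counts in {
    "Vendor A": DetectionCounts(tp=80, fp=20, fn=20),
    "Vendor B": DetectionCounts(tp=70, fp=15, fn=30),
}.items():
    print(name, summarize(counts))
```

Reporting sensitivity and PPV side by side shows whether a lower false-negative rate was bought at the cost of more false-positive marks.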
6- For example, in the “Exclusion Criteria:” section, you have provided content in bullet points. I am not opposed to writing content in bullet points. However, the bullet points you have provided in your manuscript without any explanation fall short of clarifying the topic. You should explain these bullet points.
Response: We have converted the bullet points in the Exclusion Criteria into fully explained sentences to provide clearer clinical justification for each exclusion, as detailed in Section 2.1. "For instance, significant image artifacts... were excluded because they fundamentally corrupt the density analysis..."
7- Figure 1 is already legible and is just an algorithm diagram. Does it need to be this large? It should be reduced in size.
Response: We have adjusted the dimensions and resolution of Figure 1 in the final submission files to ensure it occupies an appropriate, reduced space while maintaining legibility.
8- Why didn't you reference the programs you wrote about in the heading “2.4. CAD Systems and Nodule Detection”? You should reference them, even if they are a commercial website or a user document.
Response: We have now added proper literature and technical references for both AI/CAD algorithms in Section 2.4 and the reference list [16, 17].
9- The difference between CAD1 and CAD2 is very superficial and is given with a brief explanation. These explanations should be collected in one place and expanded upon.
Response: We have expanded Section 2.4 to detail the fundamental algorithmic differences between the two systems (e.g., deep learning 3D-CNN vs. hybrid traditional/machine learning approaches). "Vendor A utilizes a modern 3D Convolutional Neural Network (3D-CNN)... whereas Vendor B employs a hybrid architecture..."
Reviewer 2 Report
Comments and Suggestions for Authors
Comments to the Authors
Title: Identifying a Critical Blind Spot: How Commercial AI (CAD) Systems Fail to Detect Early Lung Adenocarcinoma at -730 HU on Low-Dose CT
General Comments:
Overall, the established limitations of computer-aided diagnosis (CAD) systems in detecting subtle ground-glass opacities (GGOs)—which remain perceptible to the human eye—as well as nodules abutting the pleura or vasculature, are well documented in the literature. The findings presented by the authors effectively corroborate these known paradigms.
Abstract:
No specific comments.
Introduction:
Whilst slightly tangential to the primary focus of the manuscript, a critical clinical question arises: what is the actual risk that the minute GGOs missed by this AI system will progress into life-threatening or health-compromising lung carcinomas? In the context of screening, the mere detection of malignant or pre-malignant lesions is not unequivocally beneficial; rather, the paramount consideration is whether such detection ultimately enhances the patient’s future quality of life. It may be conceptually vital to consider whether lesions typically overlooked by AI are, in fact, those that do not warrant active identification.
Striking a balance to avoid unnecessary surgical interventions (overdiagnosis and overtreatment) whilst implementing therapeutic measures at the optimal juncture remains a central tenet of contemporary respiratory medicine and thoracic surgery. It is highly probable that patients harbouring GGOs measuring 5 mm or less will complete their natural lifespan without experiencing any substantial volumetric progression of these lesions. The authors are strongly encouraged to contextualise their premise by discussing these paradigms within the Introduction or Discussion, incorporating references to landmark guidelines and studies, such as the Guidelines for Management of Incidental Pulmonary Nodules Detected on CT Images: From the Fleischner Society 2017 and the prospective multicentre study by Kakinuma R, et al. (Journal of Thoracic Oncology, 2015).
Materials and Methods:
No specific comments.
Results:
Figure 2: The explicit inclusion of corporate names for CAD1 and CAD2 is highly inadvisable due to potential commercial conflicts of interest and litigation risks. I request that these systems be strictly anonymised.
Figure 3: To optimally elucidate the morphological characteristics and attenuation of the lesions, please provide magnified views of the regions of interest or present high-resolution computed tomography (HRCT) images of the unilateral lung.
Figure 5: Please provide representative CT images, preferably utilising HRCT.
Figure 6: If feasible, please present the corresponding CT scans as magnified images or HRCT.
Discussion:
To reiterate the fundamental point raised previously, the authors assert that: "This finding is pivotal because persistent GGOs have a significantly higher malignancy rate (up to 34%) compared to solid nodules (7%)[15], often representing pre-invasive or minimally invasive adenocarcinomas (AIS/MIA) that offer excellent prognosis if resected early." Notwithstanding the aggressive behaviour of solid carcinomas, one must critically evaluate the true biological trajectory of these specific GGOs: what is their genuine potential to become deleterious to the patient's health? The ultimate objective of screening is not merely the indiscriminate identification of neoplasms, but the meaningful improvement of the patient's quality of life. Mitigating overdiagnosis and overtreatment is paramount. Furthermore, the unwarranted identification of indolent nodules may precipitate severe psychological distress for the patient and impose an unjustifiable economic burden upon the healthcare system through excessive surveillance.
As noted, GGOs measuring 5 mm or less frequently exhibit negligible growth, allowing patients to complete their natural lifespan unaffected by the disease. I mandate that the authors address these critical clinical realities, referencing the aforementioned Fleischner Society Guidelines (2017) and the Kakinuma et al. study (2015).
Whilst this perspective does not inherently vindicate the false-negative rates of CAD systems—and acknowledging that radiologists must read scans with the caveat of potential CAD oversights—it would significantly enrich the manuscript to critically appraise the overarching issue: the disproportionate expenditure of clinicians' time and cognitive effort in detecting diminutive or faint GGOs whose true clinical significance remains highly equivocal.
Conclusion:
No specific comments.
Author Response
Reviewer 2:
Title: Identifying a Critical Blind Spot... Introduction: what is the actual risk that the minute GGOs missed by this AI system will progress into life-threatening... Striking a balance to avoid unnecessary surgical interventions... incorporate references to Fleischner Society 2017 and Kakinuma R, et al. (2015).
Response: We deeply appreciate this crucial clinical perspective. We have incorporated a substantial discussion regarding the indolent nature of many small GGOs, the risk of overdiagnosis, and the importance of clinical guidelines. We have explicitly cited the Fleischner Society Guidelines (2017) and Kakinuma et al. (2015) in both the Introduction (Section 1) and Discussion (Section 4.4). "Crucially, detecting every minute lesion is not unequivocally beneficial. As emphasized by the Fleischner Society Guidelines (2017)[12] and prospective studies by Kakinuma et al.[13]..."
Results - Figure 2: The explicit inclusion of corporate names for CAD1 and CAD2 is highly inadvisable due to potential commercial conflicts of interest and litigation risks. I request that these systems be strictly anonymised.
Response: We have completely anonymized the systems. "United Imaging" and "GE Healthcare" have been replaced with "Vendor A" and "Vendor B" throughout the text, figures, and tables to avoid any conflict of interest.
Figures 3, 5, and 6: Provide magnified views or HRCT.
Response: We have prepared magnified high-resolution computed tomography (HRCT) images of the unilateral lung as requested. The figure legends have been updated to reflect the use of HRCT images.
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
I thank the researchers for their answers to my concerns. I believe the study can be published in its current version.
The authors have excessively enlarged some figures in the manuscript. They should quickly revise this.
Also, there are still problems with the reference page. The MDPI guidelines can be reviewed.
If possible, the number of patients in the study should be increased slightly.
