Open Access Article

Peer-Review Record

Segmentation-Guided Preprocessing Improves Deep Learning Diagnostic Accuracy and Confidence of Ameloblastoma and Odontogenic Keratocyst in Cone Beam CT Images—A Preliminary Study

Diagnostics 2026, 16(3), 416; https://doi.org/10.3390/diagnostics16030416

by Xinyue Zhang¹, Yuxuan Yang², Chen Zhong², Jupeng Li² and Gang Li¹,*

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Reviewer 3: Piero Antonio Zecca

Submission received: 31 October 2025 / Revised: 19 January 2026 / Accepted: 22 January 2026 / Published: 1 February 2026

(This article belongs to the Special Issue Application of Artificial Intelligence to Oral Diseases)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Introduction

The literature review is not structured chronologically or thematically, and the research gap addressed by the current study is not clearly defined.
Previous studies on preprocessing strategies are not synthesized sufficiently; therefore, the manuscript’s novelty is not strongly emphasized.
The comparison of panoramic and CBCT studies broadens the discussion unnecessarily and weakens the focus of the introduction.
The clinical motivation for using “segmentation-guided preprocessing” is not clearly framed; the specific clinical problem is not well articulated.

Materials and Methods

The sample size is not justified statistically; no power analysis is provided.
Although the segmentation and annotation steps are described, inter-observer or intra-observer reliability is not reported.
The criterion of selecting slices with “>50% lesion area” is subjective, and the rationale for choosing this threshold is not explained.
The exclusive use of axial slices is methodologically restrictive but not sufficiently justified.
Selecting every 5th slice may introduce information loss; however, no ablation study or analysis is provided to evaluate this effect.
The MAPS sampling strategy is described but not validated with supporting experiments or references.
Data augmentation parameters are given, but the impact of augmentation on performance is not experimentally demonstrated.
Only one model architecture and fixed hyperparameters were used; therefore, comparisons across preprocessing methods may be limited.
The distribution of folds in the 5-fold cross-validation (number of patients/slices per fold) is not detailed.
The ethics approval numbers in the manuscript appear inconsistent.
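
To illustrate the inter-observer reliability point above, agreement between two annotators is often summarized with the Dice coefficient of their segmentation masks. The following is a minimal sketch, not the authors' procedure: the function name `dice_coefficient` and the two toy masks are hypothetical, and the masks are plain nested lists of 0/1 values.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap between two binary masks given as nested lists of 0/1."""
    flat_a = [v for row in mask_a for v in row]
    flat_b = [v for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same shape")
    # Pixels labelled as lesion by both observers.
    intersection = sum(1 for a, b in zip(flat_a, flat_b) if a and b)
    total = sum(1 for a in flat_a if a) + sum(1 for b in flat_b if b)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical toy masks from two observers (not study data).
observer_1 = [[0, 1, 1],
              [0, 1, 1],
              [0, 0, 0]]
observer_2 = [[0, 1, 1],
              [0, 1, 0],
              [0, 0, 0]]

score = dice_coefficient(observer_1, observer_2)
print(f"Dice = {score:.3f}")
```

Reporting the mean and range of such a score over a re-annotated subset of scans would be one way to answer the comment; intra-observer reliability can be computed the same way from repeated annotations by a single observer.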

Results

Although ROC curves are reported, statistical comparisons between AUCs (e.g., DeLong test) are not performed.
Variability in accuracy, AUC, and other metrics is high, yet the clinical implications of this variance are not discussed.
Radiographic characteristics in Table 1 are not correlated with model performance; no feature-performance analysis is provided.
The decision rule for aggregating slice-level predictions into patient-level classification is not explained.
No qualitative error analysis is presented for incorrectly classified cases.
Confidence curve analysis remains descriptive and lacks quantitative metrics or statistical evaluation.
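
The comment on aggregating slice-level predictions matters because different decision rules can give different patient-level labels from the same slices. The sketch below compares two common rules, mean probability and majority vote. It is an illustration only, not the rule used in the manuscript: the function `aggregate_by_patient`, the patient IDs, the probabilities, and the 0.5 threshold are all assumptions.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_patient(slice_probs, threshold=0.5):
    """Aggregate (patient_id, p_AME) pairs into one label per patient.

    p_AME is the slice-level predicted probability of ameloblastoma;
    anything below the threshold is read as OKC (assumed binary task).
    """
    per_patient = defaultdict(list)
    for patient_id, prob in slice_probs:
        per_patient[patient_id].append(prob)

    results = {}
    for patient_id, probs in per_patient.items():
        mean_prob = mean(probs)
        votes_for_ame = sum(1 for p in probs if p >= threshold)
        results[patient_id] = {
            "mean_prob": mean_prob,
            # Rule 1: threshold the mean slice probability.
            "mean_rule": "AME" if mean_prob >= threshold else "OKC",
            # Rule 2: strict majority of slice-level votes.
            "vote_rule": "AME" if votes_for_ame * 2 > len(probs) else "OKC",
        }
    return results


# Hypothetical slice-level outputs (not study data).
predictions = [
    ("P01", 0.90), ("P01", 0.45), ("P01", 0.40),
    ("P02", 0.20), ("P02", 0.30),
]
summary = aggregate_by_patient(predictions)
for patient_id, row in summary.items():
    print(patient_id, row)
```

In this toy input, patient P01 is labelled AME by the mean rule but OKC by the vote rule, since one highly confident slice outweighs two weakly negative ones. That disagreement is why the rule needs to be stated explicitly.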

Discussion

Comparisons with prior studies are superficial; methodological differences are not critically analyzed.
Explanations for why the moderately expanded ROI performed best are speculative, without empirical validation.
The limitations of using only axial slices are mentioned but not explored in sufficient depth.
The proposed clinical value of the “confidence curve” approach is insufficiently supported; relevant references are limited.
Parts of the discussion merely repeat results rather than providing deeper interpretation.
Future research directions remain general and lack specific methodological guidance.

General / Structural Issues

Terminology is occasionally inconsistent (e.g., AME naming variations, ROI descriptions).
Some figures (e.g., radar charts) offer limited informational value and add redundancy.
Figures 5 and 6 include only a few examples, insufficient to demonstrate generalizability.
Key dataset parameters (number of slices per patient, distribution across folds) are not clearly reported.
Claims of “clinical relevance” are not strongly substantiated; no clinical workflow simulation is provided.
Differences among the three CBCT scanners (scanner-based domain shift) are not analyzed.
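
The request for patients and slices per fold can be met with a short report generated at split time. The following is a minimal sketch, assuming a patient-level, label-stratified 5-fold split: the helper names `assign_patient_folds` and `fold_report`, the patient IDs, and the slice counts are hypothetical and are not taken from the manuscript.

```python
import random
from collections import Counter


def assign_patient_folds(patient_labels, n_folds=5, seed=0):
    """Map each patient to one fold, stratified by diagnosis.

    Patients of each class are shuffled and dealt round-robin, so all
    slices of a patient stay in a single fold (no patient-level leakage).
    """
    rng = random.Random(seed)
    by_label = {}
    for patient_id, label in sorted(patient_labels.items()):
        by_label.setdefault(label, []).append(patient_id)

    folds = {}
    offset = 0  # continue dealing where the previous class stopped
    for label in sorted(by_label):
        patient_ids = by_label[label]
        rng.shuffle(patient_ids)
        for i, patient_id in enumerate(patient_ids):
            folds[patient_id] = (i + offset) % n_folds
        offset += len(patient_ids)
    return folds


def fold_report(folds, slices_per_patient, n_folds=5):
    """Return (fold, n_patients, n_slices) for every fold."""
    patients = Counter(folds.values())
    slices = Counter()
    for patient_id, fold in folds.items():
        slices[fold] += slices_per_patient[patient_id]
    return [(k, patients[k], slices[k]) for k in range(n_folds)]


# Hypothetical cohort: 10 AME and 10 OKC patients with made-up slice counts.
patient_labels = {f"AME{i:02d}": "AME" for i in range(10)}
patient_labels.update({f"OKC{i:02d}": "OKC" for i in range(10)})
slices_per_patient = {
    pid: 3 + (n % 4) for n, pid in enumerate(sorted(patient_labels))
}

folds = assign_patient_folds(patient_labels)
report = fold_report(folds, slices_per_patient)
for fold, n_patients, n_slices in report:
    print(f"fold {fold}: {n_patients} patients, {n_slices} slices")
```

A table built from such a report, with one row per fold and scanner model as an extra column, would also make scanner-related imbalance across folds visible.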

Comments on the Quality of English Language

Overall, the manuscript is understandable; however, the English language requires substantial improvement to meet publication standards. There are multiple issues involving grammar, sentence structure, redundancy, and clarity throughout the text. Several paragraphs are overly long and contain ambiguous phrasing, and some terminology is used inconsistently. I recommend thorough professional language editing to improve readability, coherence, and academic tone.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

Thank you for the opportunity to review this manuscript. Overall, I found the study to be well written, technically solid, and clearly the result of substantial effort from the authors. The topic is timely and relevant, and the paper presents a thoughtful and rigorous evaluation of segmentation-guided preprocessing strategies for AI-based diagnosis of AME and OKC on CBCT images. The manuscript is generally well organized, and the results are convincingly presented.

That said, there are a few essential aspects that would benefit from clarification or revision before the manuscript can be considered for publication.

Strengths

The introduction and discussion are written with clarity and depth, showing a strong understanding of both the clinical background and the technical challenges.
The methodological section is very detailed, transparent, and carefully structured, something that is not always seen in AI diagnostic studies.
The use of Grad-CAM and confidence curves adds valuable interpretability and clinical relevance.
The figures are informative and well designed.
The results are compelling and consistently support the narrative of the study.
Overall, the paper is of high quality and clearly prepared with care.

Points that require revision

The study aim is not explicitly stated in the Introduction

Although the Introduction is rich and comprehensive, it does not include a clear, direct statement of purpose (e.g., “Therefore, this study aims to…”).

Inclusion criteria are missing

The authors present the exclusion criteria, but the inclusion criteria are not described. For a retrospective study, readers expect clarity regarding which patients were included and on what basis, whether cases were consecutive, whether histopathological confirmation was required, whether incomplete or low-quality scans were excluded at the outset, and how the final dataset was assembled.

Methods section is excellent but somewhat too technical

The level of detail (optimizer configuration, two-phase learning-rate schedule, GPU specifications, sampling weights, etc.) is impressive, but perhaps more than necessary for a medical journal. Consider slightly condensing some of these technical details without removing their scientific value.

Figure formatting

Some figures, especially those with many small sub-graphs (e.g., the confidence curve plots in Figure 6), are visually dense. A slightly cleaner layout or grouping would improve readability.

This is a strong and valuable manuscript that presents a well-designed study with meaningful clinical implications. The authors demonstrate both technical mastery and thoughtful consideration of how AI models behave in real diagnostic settings. With a few targeted revisions, the paper has the potential to make a significant contribution to the field.

My recommendation is Major Revisions, not because the manuscript is weak (it is, in fact, very strong), but because a few key elements are essential for completeness and methodological transparency. I look forward to seeing a revised version.

Best regards!

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

This manuscript addresses a highly relevant topic in oral and maxillofacial radiology: the application of deep learning to the differential diagnosis of ameloblastoma and odontogenic keratocyst on CBCT images. The authors propose a segmentation-guided preprocessing strategy and demonstrate that this approach significantly improves both diagnostic performance and model interpretability compared to using raw axial slices. The work is methodologically careful and introduces thoughtful concepts such as patient-centric data partitioning and confidence curve analysis, which represent notable strengths.
However, despite these positive aspects, I have significant concerns about the current scientific and clinical robustness of the study, which necessitate a major revision before this work can be considered for publication.
The primary limitation lies in the reliance on isolated axial 2D slices extracted from inherently 3D datasets. While the authors acknowledge this issue, the clinical reality of CBCT interpretation is intrinsically multiplanar and volumetric. Several critical diagnostic features that differentiate AME from OKC, including root resorption, tooth displacement, cortical expansion, and spatial relationships with adjacent structures, are not adequately captured when restricting the analysis to single-plane axial slices. As a result, the proposed approach risks oversimplifying a complex diagnostic process and limiting its translational relevance.
Furthermore, the study heavily depends on manually generated segmentation masks for ROI extraction. Although this improves model performance, it also introduces a substantial gap between experimental conditions and real-world applicability. Without an automated, validated segmentation pipeline, the workflow remains difficult to reproduce at scale and limits its feasibility in routine clinical settings.
The dataset size, although balanced, remains relatively small for deep learning purposes (128 CBCT scans), and no external validation is provided. This raises concerns about generalizability, especially considering the variability in CBCT acquisition parameters and lesion morphology across institutions and populations. The current results, while promising, should therefore be interpreted as preliminary.
Finally, while the interpretability analysis using Grad-CAM and confidence curves is an interesting addition, the manuscript at times adopts an overly enthusiastic narrative that may not be fully supported by the demonstrated performance, which remains moderate at the patient level. A more cautious and clinically grounded discussion would strengthen the scientific credibility of the work.
To substantially improve the manuscript, I strongly encourage the authors to:
Explore or discuss more concretely the integration of multi-planar or 2.5D/3D approaches to better reflect real diagnostic workflows.
Address the segmentation bottleneck by proposing or testing an automated segmentation framework, even as a proof-of-concept.
Strengthen the clinical interpretation of the results, clarifying the role of this system as a support tool rather than as a near-autonomous diagnostic solution.
Provide a more balanced discussion of the limitations, avoiding overly optimistic statements that are not fully supported by the data.
In conclusion, this study represents an interesting and valuable methodological contribution to AI-based CBCT analysis. Still, it currently lacks sufficient clinical realism and validation to justify acceptance in its present form. With substantial methodological and conceptual refinement, however, it has the potential to evolve into a meaningful reference for segmentation-guided AI workflows in dental radiology.
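
For readers unfamiliar with the 2.5D approach raised in this report, the usual idea is to feed a 2D network a small stack of neighbouring slices as input channels, so that it sees some through-plane context. The sketch below shows only the stacking step, under stated assumptions: the function `stack_2p5d`, the `context` parameter, and the toy volume are hypothetical, the volume is a plain nested list, and edge slices are repeated at the volume boundary.

```python
def stack_2p5d(volume, index, context=1):
    """Return the 2*context+1 slices centred on `index` as a channel stack.

    `volume` is a list of 2D slices ordered along the scan axis. Indices
    that fall outside the volume are clamped, so the first and last slices
    are repeated at the boundaries.
    """
    last = len(volume) - 1
    neighbours = range(index - context, index + context + 1)
    return [volume[min(max(k, 0), last)] for k in neighbours]


# Toy volume: 5 slices of 2x2 pixels, each filled with its own slice index.
volume = [[[z, z], [z, z]] for z in range(5)]

first = stack_2p5d(volume, 0)   # slices 0, 0, 1 (lower edge clamped)
middle = stack_2p5d(volume, 2)  # slices 1, 2, 3
print([channel[0][0] for channel in first])
print([channel[0][0] for channel in middle])
```

With `context=1` the stack has three channels, which matches the input shape of networks pretrained on RGB images. Whether that choice suits this task is an open design question, not a result from the study.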

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The revised manuscript demonstrates clear improvement in methodological transparency, clinical contextualization, interpretability, and analytical rigor. Reviewer concerns regarding terminology, data structure, statistical comparison, and clinical relevance have been appropriately resolved. Congratulations on a well-executed revision.

Author Response

Dear Reviewer 1,

Thank you very much for taking the time to review our revised manuscript and for your positive and encouraging feedback. We are especially grateful for the insightful comments and suggestions you provided during the previous round of review. They have been invaluable in helping us enhance the clarity, rigor, and clinical relevance of our work.

Your recognition of our efforts is highly motivating, and we are delighted that the revised version meets your expectations. Thank you again for your time and expertise. We look forward to the possibility of our work contributing to the field with your support.

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have addressed the previously recommended revisions in the manuscript entitled “Segmentation-guided preprocessing improves deep learning diagnostic accuracy and confidence of ameloblastoma and odontogenic keratocyst in cone-beam CT images – A preliminary study.”

From my perspective, the requested modifications have been adequately implemented, and the manuscript has improved accordingly. In its current form, I consider the article suitable for acceptance.

Kind regards!

Author Response

Dear Reviewer 2,

Thank you very much for your thoughtful review of our revised manuscript and for your positive assessment. We are particularly grateful for your constructive feedback during the review process. It is encouraging to know that, in your view, the article in its current form is suitable for acceptance.

Thank you again for your time, expertise, and kind support throughout this process.

Reviewer 3 Report

Comments and Suggestions for Authors

While the methodological framework is clearly described and the results are internally consistent, the study currently lacks a critical element for scientific transparency and reproducibility. All conclusions rely on a custom-developed preprocessing and training pipeline implemented in Python; however, no source code, executable software, or public repository is provided.
In the absence of code availability, both reviewers and readers are required to rely solely on the authors’ descriptions and reported outcomes, without the possibility of independently verifying, testing, or benchmarking the proposed approach. This substantially limits reproducibility and weakens the robustness of the conclusions, particularly for a study that aims to propose a methodological framework rather than a purely clinical observation.
I therefore strongly encourage the authors to make their code publicly available or, at a minimum, to provide a well-documented executable version that the community can test.
Without code accessibility, the work remains difficult to validate, and its impact is necessarily reduced, as readers are asked to trust the results on authority rather than on verifiable evidence.

Author Response

Dear Reviewer 3,

Thank you very much for your thorough review of our manuscript and for raising the critical issue of scientific transparency and reproducibility. In response to your valuable suggestion, we have made the source code publicly available in a public repository (GitHub) and included the corresponding link in the final version of the manuscript (see lines 441-442).

We hope these measures adequately address your concerns.
