1. Introduction
Artificial intelligence (AI), and convolutional neural networks (CNNs) in particular, are revolutionizing the analysis of imaging studies across numerous fields of medicine owing to their ability to detect pathological changes efficiently and repeatably [1].
Their use in the evaluation of screening examinations has been shown to increase diagnostic accuracy in the early detection of breast cancer [2], to enable the identification of skin cancers at a level comparable to that of dermatologists [3], and to support the detection of pneumonia on radiographic images [4] and of diabetic retinopathy [5].
The implementation of artificial intelligence in dental diagnostics enhances efficiency and improves treatment effectiveness, thereby optimizing clinical workflows. Driven by the potential for reliable and objective diagnostics, advances in this field have significantly improved diagnostic accuracy, particularly in complex cases such as periodontal diseases, tooth fractures, oral infections, and dental caries. AI algorithms have demonstrated the ability to detect changes in bone density in imaging studies with accuracy that often exceeds human capabilities, which is important in the early detection of periodontal diseases and in implant treatment planning. Realizing this potential, however, requires further research focused on integrating AI with various imaging modalities and on assembling more heterogeneous imaging datasets, drawn from different devices and from patients of diverse ethnic backgrounds [6,7,8].
Given that most studies discussed in the literature are conducted under isolated conditions (without patient involvement), the implementation of AI models in routine dental practice will require collaboration between clinicians and software developers, the establishment of standardized protocols [9], and further clinical studies to validate findings and address the limitations of previous research [7,10,11]. It is anticipated that artificial intelligence in dental practice will lead to more personalized treatment plans and improved patient outcomes.
Imaging studies play a crucial role in contemporary dentistry. Intraoral radiographs and orthopantomograms have long been essential diagnostic tools; however, they are subject to the limitations inherent to two-dimensional (2D) imaging. The advent of cone beam computed tomography (CBCT) has enabled greater diagnostic precision and accuracy in dentistry by providing high-resolution three-dimensional cross-sectional images [12]. CBCT devices emit an X-ray beam in a cone-shaped configuration, as opposed to the fan-shaped beam used in conventional computed tomography scanners, resulting in significantly lower radiation doses. Conventional computed tomography still provides superior image quality, particularly when information regarding soft tissues is required, and allows for the use of intravenously administered contrast agents, which is especially important in cases of suspected head and neck malignancies. However, for routine use in daily clinical practice, CBCT is justified by its lower radiation dose and the greater availability of CBCT devices in dental offices [13]. An undeniable advantage of this method over standard 2D radiographs is the possibility of reconstruction, as well as the absence of distortion and superimposition of anatomical structures [14,15]. Over the years, dentists have used CBCT increasingly often in daily clinical practice [16], including in the assessment of temporomandibular joints [17], planning of orthognathic surgeries [18], dental implant placement [19], maxillary sinus floor elevation procedures [20], and endodontics [21], thereby enhancing its significance as a diagnostic tool. A study conducted by Kumar et al. [22] demonstrated a systematic increase in CBCT utilization, particularly in implant planning. This is consistent with the guidelines of the American Academy of Oral and Maxillofacial Radiology, which recommend CBCT as the method of choice for imaging implant sites, as well as for the evaluation of impacted teeth and the diagnosis of pathological lesions. The authors emphasize that CBCT is becoming increasingly integrated into routine clinical practice, reflecting the growing trend toward the use of 3D imaging in dentistry [22,23].
Despite the higher radiation dose associated with CBCT compared to 2D imaging, continuous advancements in low-dose protocols and artifact reduction algorithms have expanded its clinical applications. Integration with digital tools, such as intraoral scanners and Computer-Aided Design (CAD) and Computer-Aided Manufacturing (CAM) systems, has further increased its utility in implant surgery and orthodontic appliance design [24]. CBCT is particularly useful when clinical symptoms do not correspond with findings on radiovisiographic (RVG) or panoramic (OPG) images, or when those images are unclear or ambiguous. Additionally, CBCT performed for implant planning may simultaneously allow caries to be assessed. It should be noted that the radiological appearance of caries may change significantly depending on the projection geometry [14,25].
Interpretation of CBCT examinations is time-consuming and largely dependent on the clinician’s experience. Studies have shown that the amount of training devoted to the interpretation of three-dimensional imaging among dentists is insufficient [26,27]. Therefore, support in the form of artificial intelligence–based tools is important for optimizing treatment [25]. Among 305 surveyed dentists, a substantial majority expressed willingness to refer CBCT scans for external review if such an option were available. Diagnostic confidence regarding CBCT interpretation was reported by only 31.5% of the participants [28].
The accuracy of CBCT analysis is further constrained by the specific clinical conditions under which diagnostics are performed. These include time constraints and the necessity to focus on the primary clinical indication. Consequently, there is a risk of oversight regarding incidental findings that fall outside the immediate area of diagnostic interest [29,30].
The implications of these limitations are particularly pronounced in the diagnosis of prevalent conditions that are frequently overlooked during three-dimensional imaging analysis, such as dental caries. To address these constraints, the methodology of the present study incorporates an artificial intelligence system operating under real-world clinical conditions, an approach that represents a novel element compared to prior investigations [30].
Oral diseases constitute a major public health problem on a global scale [31]. According to data presented in the literature, dental caries was the 10th most prevalent disease worldwide, affecting 2.4 billion people in 2015 [32]. It disproportionately affects populations of low socioeconomic status and, despite being largely preventable, its prevalence has remained stable over the past three decades [33].
Given the enormous potential of artificial intelligence in medicine to optimize workflow, improve treatment outcomes, and enable personalized treatment planning, there is a strong need for studies analyzing its use in clinical settings. In response to this need, and building on our previous analyses, we conducted a comparative study aimed at evaluating the effectiveness of the Diagnocat program in detecting dental caries on CBCT images, in comparison with clinical assessments performed by three general dentists without specialization, each with comparable professional experience of at least five years [30].
4. Discussion
The results obtained from the CBCT analysis confirm previous findings based on 2D imaging, which highlighted the influence of patient age, gender, tooth type, and inter-examiner variability on diagnostic concordance. Despite the system’s high specificity, its relatively low sensitivity points to a potentially high false-negative rate. The system may therefore overlook a substantial number of pathologies, which limits its clinical utility as a standalone diagnostic tool.
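The relationship between sensitivity, specificity, and the false-negative rate can be illustrated with a minimal Python sketch. The counts below are hypothetical and are not taken from this study, and the function name is an assumption introduced only for illustration. Because the reference in the present design is the clinicians' assessment rather than an independent ground truth, values computed this way describe concordance with that assessment.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity, false-negative rate) from 2x2 counts.

    tp/fn: reference-positive teeth flagged / missed by the system
    tn/fp: reference-negative teeth cleared / flagged by the system
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one reference-positive and one reference-negative case")
    sensitivity = tp / positives
    specificity = tn / negatives
    false_negative_rate = fn / positives  # equals 1 - sensitivity
    return sensitivity, specificity, false_negative_rate


# Hypothetical counts (NOT study data): 100 carious and 900 sound teeth.
sens, spec, fnr = sensitivity_specificity(tp=40, fn=60, tn=880, fp=20)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, false-negative rate={fnr:.2f}")
```

With these illustrative counts, specificity is high while more than half of the reference-positive teeth are missed, which is the pattern described above.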
Current research indicates that artificial intelligence-based systems can achieve high diagnostic accuracy. Esmaeilyfard et al. [14] reported a diagnostic accuracy of approximately 95%, with high sensitivity and specificity for caries detection in molars. However, these superior results may be attributed to differences in study methodology: their analysis was conducted on isolated teeth rather than on entire dental arches within CBCT examinations. Restricting the diagnostic field to isolated structures significantly reduces diagnostic complexity and the prevalence of imaging artifacts, which likely leads to inflated performance metrics compared to full-arch clinical scenarios.
In a study by Kaźmierczak et al. [39], which evaluated dentition at both the individual tooth and full-arch levels, the Diagnocat system was used to identify missing teeth, restorations, endodontic treatments, bridge pontics, orthodontic appliances, crowns, and implants in CBCT scans. While the system achieved nearly perfect precision (over 99%) in the analysis of individual teeth, its performance declined significantly in a comprehensive evaluation of the entire oral cavity.
Similarly, in a study by Ezhov et al. [15], the Diagnocat system was used to analyze full-volume CBCT scans, automatically segmenting anatomical structures and identifying pathologies. The authors demonstrated high system specificity and reported improved diagnostic performance among clinicians utilizing AI support. These findings confirm the potential value of such tools as supplements to the diagnostic process rather than full replacements for clinical judgment.
The analysis revealed differences in diagnostic agreement between the AI tool and clinical assessment based on patient gender. Specifically, a higher overall agreement rate for all teeth combined was observed in female patients.
While artificial intelligence has been extensively studied in CBCT-based diagnostics, the current literature does not address potential gender-related differences in AI performance or its concordance with clinical evaluation. Consequently, a direct comparison of these results with existing data is currently not possible. Nevertheless, sexual dimorphism in odontometric parameters, such as crown dimensions and pulp chamber volume, may influence detection efficacy. This underscores the need to consider gender as a factor in the training and validation of AI systems [40].
Our findings demonstrate a clear age-dependent trend in the diagnostic performance of the Diagnocat system, with generally higher agreement in the youngest cohort. This pattern was particularly pronounced in the aggregate analysis of all teeth and in the molar region, where agreement was noticeably higher than in the older cohorts. An exception emerged in the premolar region, where the oldest cohort achieved the highest level of agreement; this may be attributed to more advanced pathological changes in older individuals, which potentially present more distinct features that facilitate identification by the system.
Available literature does not directly address age-related differences in the diagnostic agreement of AI systems in CBCT imaging, precluding direct comparison. However, our findings suggest a distinct age-dependent pattern, particularly regarding tooth type. The notably higher variability observed in incisors among the oldest age group, compared to the youngest, stands in contrast to the stability of canine assessments, which remained consistent across all cohorts. Furthermore, the high level of agreement across specific teeth (such as 12, 15, 28, and 35) was most prominent in the youngest group.
This higher agreement observed in the younger population may be associated with age-dependent anatomical and structural factors, such as a lower prevalence of prosthetic restorations, fewer metallic artifacts, and reduced complexity of degenerative changes. However, these findings warrant further verification in studies involving larger and more diverse age cohorts.
The analysis revealed variations in agreement levels among clinicians. These findings highlight the presence of inter-observer variability, a well-documented phenomenon in radiology arising from differences in clinical experience, image interpretation techniques, and individual diagnostic strategies.
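Inter-observer variability of this kind is often quantified with a chance-corrected agreement statistic. The sketch below computes Cohen's kappa for two raters in plain Python. It is offered only as an illustration: this section does not state which agreement statistic was used, and the rater labels are hypothetical per-tooth decisions (1 = caries, 0 = no caries), not study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement by chance, from each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[label] * count_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical per-tooth decisions for ten teeth (NOT study data).
clinician_1 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
clinician_2 = [1, 0, 0, 0, 1, 0, 1, 0, 1, 0]
kappa = cohens_kappa(clinician_1, clinician_2)
print(f"kappa = {kappa:.3f}")
```

In this example the raters agree on 8 of 10 teeth, but part of that agreement is expected by chance, so kappa is lower than the raw agreement rate of 0.8.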
A notable strength of this study is the demographic homogeneity across the clinician groups. The lack of significant associations between clinician characteristics and patient demographics (age and gender) indicates that the study groups were well-balanced. This uniformity is crucial as it minimizes the risk of selection bias, ensuring that the observed differences in AI diagnostic agreement are likely attributable to inherent anatomical and structural factors rather than confounding variables in the study population.
This study extends previous 2D-based research by evaluating AI performance in the interpretation of full-arch CBCT scans, allowing for an assessment of diagnostic parameters within a three-dimensional imaging environment.
Our findings provide critical insights into the system’s efficacy across specific population subgroups, which carries significant clinical implications. Understanding these variations is important for identifying areas in which the algorithm may require further refinement and validation, particularly in patient groups or clinical conditions associated with lower diagnostic accuracy. At the same time, these findings emphasize that AI systems should currently be used primarily as decision-support tools, providing a valuable “second opinion,” rather than as autonomous diagnostic authorities.
This study has several limitations that should be considered when interpreting the results. First, the analysis was conducted on a relatively small patient sample within a single clinical center, which may limit the generalizability of the findings. Additionally, all CBCT scans were acquired using a single imaging device; this may have influenced the specific image characteristics and potentially limited comparability with scans obtained from other imaging systems.
Furthermore, the retrospective nature of the study and the lack of case randomization constitute additional limitations. The evaluation was also restricted to a single artificial intelligence platform, precluding a direct comparison of efficacy across different AI systems.
Finally, a significant limitation was the absence of a definitive ‘ground truth,’ such as a reference assessment performed by an expert panel, which could serve as the ultimate benchmark for evaluating diagnostic accuracy. Consequently, this limits the study to assessing concordance between the system and clinical evaluations rather than determining true diagnostic accuracy.
Future research should incorporate larger and more diverse patient populations, preferably within multicenter study designs, to enhance the reliability and generalizability of the findings.
Investigating gender-related differences in the performance of artificial intelligence systems should become a standard component of future investigations.
Furthermore, subsequent studies are required to determine whether the observed age-related variations reflect true biological variability, imbalances in training datasets, or inherent algorithmic bias.
Additionally, it is pertinent to compare different AI platforms used for CBCT image analysis and to conduct a more detailed assessment of technical factors, such as imaging artifacts and dental materials, which may impede algorithmic accuracy in radiographic interpretation.
A pivotal direction for future AI-driven caries detection models should be the integration of data from radiographic imaging, intraoral scans, and clinical photography. Combining these three modalities could potentially overcome current limitations in detecting early-stage demineralization and incipient caries, which often elude radiographic detection due to minimal hard-tissue loss.
Given the constraints of routine clinical practice—often characterized by time pressure and a focus on the patient’s chief complaint—future research should implement independent expert verification to establish a definitive ground truth. Conducting evaluations under conditions free from the demands of daily practice is essential for an objective assessment of AI system efficacy in detecting high-complexity lesions. Moreover, future studies could benefit from a modified and more adaptive methodological framework tailored to the dynamic and continuously evolving landscape of AI applications in dentistry. Such an approach would allow subsequent research to better reflect ongoing technological developments, changing clinical requirements, and emerging standards for the validation of AI-based diagnostic tools.