1. Introduction
Dental radiographs are indispensable tools in modern dentistry, providing clinicians with critical insights into the anatomical structures and pathological conditions of the oral and maxillofacial regions [1]. They play a key role in diagnosis, treatment planning, and the monitoring of therapeutic outcomes, especially in managing conditions such as caries, periodontal disease, and bone resorption. Among the most commonly employed radiographic modalities are orthopantomograms, bitewings, and periapical images, which collectively enable comprehensive evaluation of dental hard tissues and disease states [2].
Despite their importance, interpreting radiographs is a time-consuming and cognitively demanding task. Manual interpretation requires significant focus and precision, leaving clinicians vulnerable to errors arising from fatigue or lapses in concentration, particularly in high-volume settings. These challenges can delay the diagnostic process and, in some cases, compromise accuracy, underlining the need for tools that can enhance efficiency without sacrificing reliability [3].
Advances in artificial intelligence (AI) have introduced automated segmentation systems capable of identifying and delineating structures in radiographic images [4,5,6]. Early methods based on traditional computer vision techniques, such as thresholding and region-growing algorithms, demonstrated some utility but struggled to handle variability in image quality and patient anatomy [7]. Deep learning has since emerged as a transformative approach, leveraging convolutional neural networks (CNNs) to address these limitations. By learning features directly from data, CNN-based models have achieved significant success in tasks such as caries detection [8,9,10,11], bone loss assessment [12,13,14], and anatomical labeling [15,16,17].
ORTHOSEG (ORTHOpanoramic and intraoral SEGmentation) is a deep learning-based AI system integrated into the Neowise software (version 1.5, CEFLA S.C., Imola, Italy). The core of the system consists of a suite of specialized deep learning models belonging to the CNN family, with each model dedicated to a specific predictive task. In particular, these models are based on the Mask R-CNN (Region-based Convolutional Neural Network) architecture, which enables instance segmentation by performing pixel-level detection and classification of individual anatomical and pathological structures within radiographic images [18]. ORTHOSEG offers automated segmentation of anatomical, pathological, and non-pathological elements in radiographs. Designed to improve diagnostic workflows, ORTHOSEG (CEFLA S.C., Imola, Italy) classifies each pixel to identify and delineate structures across various radiographic modalities, including orthopantomograms, bitewings, and periapicals. Its primary goal is to assist clinicians by providing a rapid and detailed visualization of radiographic findings, allowing them to focus on interpretation and decision-making while reducing the cognitive load associated with manual analysis.
This study aims to validate the performance of ORTHOSEG (CEFLA S.C., Imola, Italy) by assessing its accuracy and reliability in segmenting anatomical, pathological, and non-pathological elements in dental radiographs. The evaluation compares the system’s output against manual segmentations performed by experienced clinicians, which serve as the ground truth. By carrying out this assessment across different radiographic types, this research seeks to determine whether ORTHOSEG (CEFLA S.C., Imola, Italy) meets the standards required for integration into routine dental workflows. The research hypothesis is that ORTHOSEG can achieve segmentation performance with quantitative accuracy metrics sufficient to enable reliable identification of anatomical, pathological, and non-pathological structures in orthopantomograms, bitewings, and periapical radiographs, supporting its applicability in routine clinical practice.
3. Results
The model was evaluated on a diverse dataset comprising 50 orthopantomograms, 50 bitewings, and 50 periapical radiographs, totaling 150 images. Segmentation results for all supported radiograph types are presented in Table 5, Table 6, Table 7, Table 8 and Table 9, with an overall mDSC of 0.635 ± 0.222 across the three radiograph types. Table 6 provides the mean inference time for each type. Table 7, Table 8 and Table 9 detail the results for orthopantomograms, bitewings, and periapicals, respectively.
The system achieved an mDSC of 0.756 ± 0.174 for orthopantomograms with a mean processing time of 19.745 ± 3.625 s, an mDSC of 0.705 ± 0.154 for bitewings with a mean processing time of 8.467 ± 0.903 s, and an mDSC of 0.445 ± 0.197 for periapicals with a mean processing time of 5.653 ± 0.897 s.
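For readers less familiar with the metrics reported above, the following minimal sketch shows how a Dice similarity coefficient (DSC) and an intersection over union (IoU) can be computed for a single pair of binary masks, and how per-mask scores can be summarized as mean ± standard deviation. It assumes the standard definitions of both metrics. The function names, the toy masks, the empty-mask convention, and the use of the sample standard deviation are illustrative assumptions, and the level at which the study aggregated scores (per element, per image, or otherwise) is not specified in this section.

```python
from statistics import mean, stdev


def dice_and_iou(pred, truth):
    """Return (DSC, IoU) for two equally sized binary masks given as nested lists of 0/1."""
    intersection = sum(p & t
                       for pred_row, truth_row in zip(pred, truth)
                       for p, t in zip(pred_row, truth_row))
    pred_area = sum(map(sum, pred))
    truth_area = sum(map(sum, truth))
    union = pred_area + truth_area - intersection
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0
    return 2 * intersection / (pred_area + truth_area), intersection / union


def summarize(scores):
    """Mean and sample standard deviation of per-mask scores (needs >= 2 values)."""
    return mean(scores), stdev(scores)


# Toy 2x3 masks, invented for illustration (not study data).
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
dsc, iou = dice_and_iou(pred, truth)
print(f"DSC={dsc:.3f}, IoU={iou:.3f}")  # DSC=0.667, IoU=0.500
```

In this toy case two pixels overlap, each mask covers three pixels, and the union covers four, which gives DSC = 4/6 and IoU = 2/4.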
To investigate potential performance variations based on demographic factors, the data were analyzed by sex and age group. Table 10 presents the metrics analyzed by sex, with p-values obtained from a two-tailed independent t-test. Table 11 summarizes the results grouped by age, with p-values calculated using ANOVA to assess differences among age groups.
The analysis revealed no statistically significant differences in performance metrics between sexes. However, a statistically significant difference was observed among age groups. This may be attributed to the increased complexity of radiographs from elderly individuals, which often feature a greater abundance of both pathological and non-pathological elements compared to those of children, adolescents, and adults.
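As a rough illustration of the two tests named above, the sketch below computes the pooled-variance (Student) t statistic for two independent groups and the one-way ANOVA F statistic for several groups, using only the Python standard library. The scores are invented for illustration and are not study data, and the assumption of equal variances in the t-test is ours, since the variant used in the study is not stated here. Converting these statistics to p-values requires the t and F distributions, which statistics packages provide (for example, SciPy's `ttest_ind` and `f_oneway`).

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two independent samples."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


def one_way_anova_f(groups):
    """F statistic with (between-group, within-group) degrees of freedom."""
    values = [x for g in groups for x in g]
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    f = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f, (k - 1, n - k)


# Hypothetical per-image Dice scores for two groups (illustration only).
group_a = [0.70, 0.75, 0.80]
group_b = [0.60, 0.65, 0.70]
t, df = student_t(group_a, group_b)
f, dfs = one_way_anova_f([group_a, group_b])
print(f"t={t:.3f} (df={df}), F={f:.3f} (df={dfs})")
```

With exactly two groups the two tests agree, since F equals t squared; ANOVA becomes necessary once more than two groups, such as several age bands, are compared at once.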
Figure 1, Figure 2 and Figure 3 show the output of ORTHOSEG for the three supported types of radiographs: orthopantomogram, bitewing, and periapical, respectively.
4. Discussion
A rigorous evaluation of ORTHOSEG was conducted to assess its ability to segment anatomical, pathological, and non-pathological structures from dental radiographs. The study employed a dataset prepared by two experienced clinicians, who followed a dual-review protocol. One clinician performed manual annotations using CVAT, while the second clinician conducted a comprehensive review and quality assurance process. This approach, coupled with stringent checks to address segmentation consistency and image resolution, ensured that the dataset accurately represented real-world clinical scenarios.
ORTHOSEG showed strong performance, particularly for orthopantomograms and bitewings, achieving mDSC values of 0.756 ± 0.174 and 0.705 ± 0.154 and mIoU values of 0.684 ± 0.172 and 0.652 ± 0.151, respectively. Notably, these results were achieved while identifying and segmenting 70 distinct elements, including anatomical, pathological, and non-pathological structures, a breadth of recognition that surpasses many existing systems [8]. This capability highlights ORTHOSEG’s potential to manage complex datasets effectively, ensuring a comprehensive analysis of radiographic images.
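When reading the mDSC and mIoU values side by side, it may help to recall how the two metrics relate for a single mask. Assuming the standard definitions, with P the set of predicted pixels and G the set of ground-truth pixels (symbols introduced here for illustration):

```latex
\[
\mathrm{DSC}(P,G) = \frac{2\,|P \cap G|}{|P| + |G|}, \qquad
\mathrm{IoU}(P,G) = \frac{|P \cap G|}{|P \cup G|}, \qquad
\mathrm{DSC} = \frac{2\,\mathrm{IoU}}{1 + \mathrm{IoU}} .
\]
```

The conversion holds for each individual mask but not for averages, because the map from IoU to DSC is nonlinear. Since that map is concave, a mean DSC taken over the same masks cannot exceed the converted mean IoU. For orthopantomograms, an mIoU of 0.684 converts to about 0.812, which is consistent with the reported mDSC of 0.756.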
When compared to existing state-of-the-art solutions, ORTHOSEG demonstrated promising results. For example, U-Net and Faster R-CNN models have achieved an mIoU of 0.501 and an mDSC of 0.569 in dental segmentation tasks [5]. ORTHOSEG’s results for orthopantomograms and bitewings exceed these benchmarks in terms of both segmentation quality and the number of elements detected, suggesting its capability to deliver competitive performance in dental radiograph analysis while addressing a broader range of identifiable features. The results of the present study are consistent with previous research demonstrating the effectiveness of deep learning models in dental radiograph segmentation. Previous studies have reported Dice coefficients up to 0.569 and mIoU values up to 0.501 for periapical radiograph segmentation using U-Net-based architectures [5,15]. Similarly, high segmentation performance has been reported in periapical radiographs, with F1-scores up to 0.88, confirming the potential of deep learning systems for detailed radiographic analysis [22]. In bitewing radiographs, CNN-based segmentation models have achieved F1-scores of approximately 0.84 and diagnostic accuracies up to 0.87, supporting the reliability of artificial intelligence in dental radiographic analysis [9,10]. These findings are consistent with the performance observed for ORTHOSEG, particularly in orthopantomograms and bitewing radiographs. Furthermore, systematic and scoping reviews have emphasized the growing role of convolutional neural networks in improving diagnostic accuracy and supporting clinical decision-making in dentistry [4,11,21].
In addition to its segmentation accuracy, ORTHOSEG achieved efficient computational performance, with average processing times of 19.745 ± 3.625 s for orthopantomograms, 8.467 ± 0.903 s for bitewings, and 5.653 ± 0.897 s for periapical radiographs. These speeds were achieved on hardware commonly available in clinical settings, which suggests that the system could be easily integrated into routine workflows without requiring specialized computational infrastructure.
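For context on how per-image processing times of this kind can be collected, the sketch below times repeated calls to a segmentation function and reports the mean and sample standard deviation in seconds. It is only an illustration under stated assumptions: `segment` is a hypothetical stand-in for whatever inference call is being measured, the warm-up step is our choice, and the study's actual timing protocol is not described in this section.

```python
import time
from statistics import mean, stdev


def time_inference(segment, images, warmup=1):
    """Return (mean, sample SD) of per-image wall-clock time in seconds.

    `segment` is a hypothetical callable taking one image; at least two
    images are needed for the standard deviation.
    """
    # Untimed warm-up calls so one-off start-up costs do not skew the result.
    for image in images[:warmup]:
        segment(image)

    durations = []
    for image in images:
        start = time.perf_counter()
        segment(image)
        durations.append(time.perf_counter() - start)
    return mean(durations), stdev(durations)


# Dummy workload standing in for a real model call (illustration only).
def fake_segment(image):
    return sum(range(10_000))


avg, sd = time_inference(fake_segment, images=list(range(5)))
print(f"{avg:.6f} ± {sd:.6f} s")
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock intended for measuring short durations.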
Taken together, the segmentation results show that the model performed best on orthopantomograms, whereas the lower performance on periapical radiographs is attributed to the inherent complexity of these images, particularly in classifying individual teeth.
The lower segmentation performance observed in periapical radiographs may be attributed to several modality-specific factors. Compared to orthopantomograms and bitewings, periapical radiographs provide a more limited field of view and fewer surrounding anatomical reference structures, reducing the contextual information available for accurate segmentation. In addition, periapical radiographs often exhibit greater variability in tooth positioning, angulation, and anatomical overlap, which can increase segmentation complexity. Furthermore, in the present dataset, periapical radiographs included a higher proportion of elderly patients, who more frequently present restorations, implants, and structural alterations, potentially increasing image heterogeneity. These findings suggest that segmentation of periapical radiographs represents a more challenging task for AI systems and highlight the importance of continued optimization and validation.
The system’s segmentation results were consistent across sexes and across the child and adult age groups, reflecting its robustness and generalizability. However, a slight decrease in accuracy was observed when the system was applied to radiographs of older patients (65+ years), likely due to the increased complexity of their radiographs, which often feature a greater number of overlapping anatomical and pathological elements [36,37,38,39,40,41,42]. These findings underline the system’s adaptability while emphasizing the continued need for clinician oversight in managing complex cases.
By automating routine segmentation tasks, ORTHOSEG can reduce the time required for image interpretation, alleviating the cognitive burden on clinicians and enabling them to focus on evaluating complex cases and making informed clinical decisions.
This study has several limitations that should be acknowledged. First, the dataset used for evaluation was derived exclusively from a European population, without differentiating between ethnic groups. Therefore, the findings may not be fully generalizable to patients from other geographic regions, which may have different demographic and epidemiological characteristics. Second, the performance of AI systems depends on the variability of the data they encounter: although the dataset was carefully curated and reviewed by experienced clinicians, it may not capture the full variability encountered in everyday practice, particularly in cases with rare pathologies or unusual anatomical conditions. Finally, the study focused on the system’s performance without analyzing its usability in the Neowise software, its clinical interaction, or its impact on diagnostic workflows.
Future work should expand the validation dataset to cover a broader and more representative range of patient demographics, clinical scenarios, and imaging devices, in order to confirm the system’s generalizability beyond the European sample examined in this study and to further refine its performance. This will help ensure that ORTHOSEG continues to meet the evolving needs of modern dentistry and supports its role as a valuable tool in routine clinical practice.
Moreover, future studies could include prospective clinical trials in which clinicians would use the ORTHOSEG system on new patients, record its outputs in real time, and assess how its segmentation affects diagnostic decisions, reporting times, and workflow efficiency [43,44,45,46,47,48]. Finally, studies on user experience and usability among clinicians will be fundamental to ensure the effective integration of this AI-assisted tool into routine dental practice.