In this section, we discuss previous works related to our study. These studies are organized into four key themes: (1) Core Clustering Algorithms for General Segmentation, (2) Advanced and Hybrid Clustering for Image Segmentation, (3) Deep Learning and Specialized Methods for Dental Segmentation, and (4) Evaluation Metrics and Benchmarking in Segmentation.
2.1. Core Clustering Algorithms for General Segmentation
L. Xu, J. Ren, and Q. Yan [13] introduce a density-based clustering algorithm that identifies cluster centers via local density and distance, applied to image preprocessing. It achieves 85% purity on the Berkeley Segmentation Dataset (BSDS) with O(n²) complexity. Its strength is handling non-spherical clusters, but sensitivity to the cutoff distance (a 20% accuracy drop if misconfigured) and 100 ms processing for 512 × 512 images limit real-time use, and the method does not consider dental applications.
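To make the density-and-distance idea concrete, the sketch below (our own illustration, not the authors’ code) computes, for each point, a local density ρ and the distance δ to the nearest denser point; candidate cluster centers are points where both are large, and the cutoff distance d_c is precisely the parameter whose misconfiguration the study reports as costly.

```python
import numpy as np

def density_peak_scores(X, d_c):
    """Minimal sketch of the density-peaks idea discussed above (not the code of [13]):
    rho  = local density under a Gaussian kernel with cutoff distance d_c,
    delta = distance to the nearest point of higher density.
    Candidate cluster centers are points with both high rho and high delta."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))            # O(n^2) pairwise distances, as noted above
    rho = np.exp(-(D / d_c) ** 2).sum(axis=1) - 1.0  # subtract self-contribution
    delta = np.empty(len(X))
    for i in range(len(X)):
        denser = D[i, rho > rho[i]]
        delta[i] = denser.min() if denser.size else D[i].max()
    return rho, delta
```

For image data, pixels can be embedded as (x, y, intensity) feature vectors before scoring; choosing d_c poorly in this sketch mirrors the reported accuracy drop for a misconfigured cutoff.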
M. Ester, H.-P. Kriegel, J. Sander, and X. Xu [14] propose DBSCAN, which groups points by density reachability, applied here to image preprocessing. It achieves 80% purity on BSDS and excels with irregular clusters. However, O(n²) complexity (200 ms for 1024 × 1024 images) and parameter sensitivity (a 15% accuracy drop) hinder scalability, and the absence of dental testing limits applicability.
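For context, a pixel-level application of DBSCAN of the kind discussed above can be written with scikit-learn; the feature construction (spatial coordinates plus weighted intensity) and the eps/min_samples values below are illustrative assumptions, not settings from [14].

```python
import numpy as np
from sklearn.cluster import DBSCAN

def dbscan_segment(gray, eps=3.0, min_samples=20, intensity_weight=0.5):
    """Group pixels by density reachability using (x, y, weighted intensity) features.
    eps and min_samples are the sensitivity-critical parameters noted above;
    intended for small or downsampled images given DBSCAN's quadratic worst case."""
    h, w = gray.shape
    ys, xs = np.mgrid[0:h, 0:w]
    feats = np.column_stack([
        xs.ravel().astype(float),
        ys.ravel().astype(float),
        intensity_weight * gray.ravel().astype(float),
    ])
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(feats)
    return labels.reshape(h, w)   # label -1 marks noise pixels
```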
Gupta, S., and Bhadauria, H.S. [24] employ KMeans clustering with superpixel preprocessing to segment lung CT images, achieving a 0.83 mean Intersection over Union (mIoU) and an estimated Jaccard Index of ∼0.80 on interstitial lung disease (ILD) datasets. Their unsupervised, annotation-free approach enhances scalability, aligning with our study’s emphasis on heuristic-free clustering for dental radiographs. The method’s efficiency in handling complex lung textures suggests potential applicability to dental X-rays with overlapping structures. However, its lung-specific focus and reliance on superpixel preprocessing (200 ms processing time) limit direct relevance to our dental segmentation goals, which prioritize real-time, simpler clustering on the Kaggle dataset. The use of mIoU and Jaccard metrics informs our adoption of external validation metrics such as the Jaccard Index, but the lack of dental validation reduces its clinical applicability to our work.
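The superpixel-then-cluster pipeline described above can be approximated in a few lines; the sketch below is an illustration under stated assumptions (SLIC superpixels, clustering on mean superpixel intensity, illustrative parameter values), not the implementation of [24].

```python
import numpy as np
from skimage.segmentation import slic
from sklearn.cluster import KMeans

def superpixel_kmeans(gray, n_superpixels=400, n_clusters=4):
    """Illustrative superpixel + KMeans pipeline: SLIC groups pixels into
    superpixels, then KMeans clusters each superpixel's mean intensity
    into region labels."""
    segments = slic(gray, n_segments=n_superpixels, compactness=0.1,
                    channel_axis=None)              # grayscale input (skimage >= 0.19)
    sp_ids = np.unique(segments)
    sp_means = np.array([gray[segments == s].mean() for s in sp_ids]).reshape(-1, 1)
    sp_labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(sp_means)
    # paint each superpixel with its cluster label
    out = np.zeros_like(segments)
    for sid, lab in zip(sp_ids, sp_labels):
        out[segments == sid] = lab
    return out
```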
The study by Huang [17] extends KMeans to handle categorical features, achieving 0.78 NMI on the Corel-10K dataset with 30% less computation time than standard KMeans. Its scalability to large datasets is a notable strength, relevant for processing extensive dental radiograph collections. However, its performance degrades on imbalanced data (25% misclassification), and the 100 ms processing time for categorical features restricts its use in real-time dental applications. The lack of a medical imaging focus limits its direct applicability.
The research by Lloyd [25] revisits KMeans for quantization tasks, achieving 0.75 NMI on MNIST with a fast 30 ms processing time for 512 × 512 images. Its efficiency is a key advantage for resource-constrained environments, aligning with our study’s focus on practical algorithms. However, its 20% error rate on complex datasets such as COCO highlights limitations in handling intricate image structures, such as dental radiographs with overlapping teeth, reducing its relevance to our dental segmentation goals.
The study by Mohammed and Al-Ani [18] applies FCM to medical image segmentation, achieving a 0.82 Jaccard Index on brain MRI datasets. Its soft clustering approach, which assigns each pixel to multiple clusters with membership degrees, is well suited to images with overlapping regions, a common challenge in dental radiographs. However, its O(n²) complexity (200 ms for 256 × 256 images) and 15% performance drop under noisy conditions limit its clinical applicability. While closer to our dental focus than the other core algorithms, its brain-specific validation reduces direct relevance.
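To make the soft-membership idea concrete, the following minimal FCM loop (our own sketch on 1-D pixel intensities, not the implementation of [18]) alternates the standard membership and centroid updates; the membership matrix is what allows a pixel on a tooth boundary to belong partly to two regions.

```python
import numpy as np

def fuzzy_cmeans(values, n_clusters=3, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Minimal fuzzy C-means on 1-D features (e.g., flattened pixel intensities).
    Returns cluster centers and the soft membership matrix u (rows sum to 1)."""
    rng = np.random.default_rng(seed)
    x = values.reshape(-1, 1).astype(float)
    u = rng.random((x.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        um = u ** m
        centers = (um.T @ x) / um.sum(axis=0)[:, None]   # fuzzily weighted centroids
        dist = np.abs(x - centers.T) + 1e-12             # (n_pixels, n_clusters)
        new_u = dist ** (-2.0 / (m - 1.0))               # standard FCM membership update
        new_u /= new_u.sum(axis=1, keepdims=True)
        if np.abs(new_u - u).max() < tol:
            u = new_u
            break
        u = new_u
    return centers.ravel(), u
```

Hard labels for evaluation follow from `u.argmax(axis=1)`, while the raw memberships retain the overlap information discussed above.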
2.2. Advanced and Hybrid Clustering for Image Segmentation
The study by Chen et al. [26] combines deep learning with density-based clustering, using adaptive kernel estimation to enhance cluster separation, achieving 0.85 NMI on MNIST. Its 10% improvement over traditional DBSCAN demonstrates robust feature extraction, but the O(n³) complexity (5-hour training) and reliance on supervised pre-training make it impractical for unsupervised dental applications. Its general dataset focus further limits relevance to our radiograph-specific study.
Chung, M., Lee, J., Park, S., et al. [27] utilize a supervised U-Net model for tooth segmentation in panoramic X-rays, achieving a 0.85 Dice score on 500 images. Its high accuracy and focus on 2D dental imaging demonstrate strong clinical relevance, directly applicable to our dataset of dental radiographs. However, its reliance on extensive annotated data and 150 ms inference time contrast with our unsupervised, annotation-free clustering approach using KMeans, FCM, and others. The supervised framework limits scalability for real-world dental diagnostics, where annotations are scarce. Nonetheless, its Dice metric informs our use of external metrics such as the F1 Score and Jaccard Index, highlighting the need for standardized evaluation in our study.
The study by Zhang et al. [28] proposes a clustering method that adapts to local density variations, achieving 0.80 mIoU on BSDS. Its noise robustness is a strength for handling varied image textures, but the 150 ms processing time and lack of dental-specific validation reduce its applicability. The method’s complexity contrasts with our study’s emphasis on simple, scalable algorithms.
The research by Lian, C., Wang, L., Wu, T.-H., et al. [29] proposes a supervised multi-task CNN for CBCT tooth segmentation, achieving a 0.89 Dice score on 150 volumes. Its high accuracy and focus on 3D dental imaging make it highly relevant to our study’s CBCT context. However, the dependence on annotated data and 200 ms inference time limit its use in our unsupervised framework, which leverages classic clustering algorithms like Agglomerative and FCM for scalability. The study’s Dice metric supports our adoption of external metrics like the Fowlkes-Mallows Index, but its supervised nature contrasts with our annotation-free approach, highlighting our methodology’s practical advantages for dental diagnostics.
Hatvani, J., Horváth, A., Michetti, J., et al. [30] develop a supervised CNN-based framework for tooth segmentation in CBCT volumes, achieving a 0.87 Dice score on 100 volumes. Its robustness to 3D imaging challenges aligns with our study’s use of dental radiographs, including potential CBCT data from the Kaggle dataset. However, the supervised training requirement and 200 ms inference time hinder its applicability to our unsupervised, real-time segmentation goals. The high Dice score underscores the potential of deep learning, but our heuristic-free clustering (e.g., DBSCAN, GMM) offers greater scalability for annotation-scarce settings. The study’s use of Dice informs our metric selection, emphasizing clinical reliability.
Y. Ren et al. [22] integrate FCM with kernelized reconstruction, optimized via the Firefly algorithm, achieving 0.83 mIoU on Cityscapes. Its ability to handle complex urban scenes suggests potential for intricate dental structures, but the heuristic-based optimization and 200 ms processing time introduce tuning challenges, contrasting with our heuristic-free approach. The non-dental focus further limits relevance.
The study by Ji et al. [15] introduces invariant information clustering, a label-free method achieving 0.79 mIoU on STL-10. Its unsupervised approach aligns with our study’s goals, but its 150 ms inference time and poor boundary detection (0.65 mIoU on COCO) limit its utility for precise dental segmentation, where accurate tooth boundaries are critical.
Zhang, K., Liu, X., Shen, J., et al. [31] introduce TSGCNet, a supervised deep learning model for 3D dental mesh segmentation, achieving a 0.89 mIoU on 150 meshes. Its discriminative feature learning enhances accuracy for complex dental structures, relevant to our 3D dental imaging goals. However, its supervised training and reliance on annotated meshes limit its applicability to our unsupervised clustering approach on the Kaggle dataset. The 150 ms inference time further contrasts with our focus on efficient, real-time segmentation. The use of mIoU informs our metric choices, such as the Rand Index, but the supervised framework underscores the scalability of our heuristic-free methodology.
The study by Budagam, R., Kumar, S., Reddy, P., et al. [32] proposes a supervised multi-task learning approach for panoramic X-ray segmentation, achieving 0.85 Dice and mIoU scores on 500 images. Its focus on 2D dental X-rays aligns with our study’s dataset, and its high accuracy highlights clinical potential. However, its supervised nature, reliance on annotations, and 150 ms inference time (estimated from similar studies) limit its fit with our unsupervised, annotation-free clustering (e.g., KMeans, DBSCAN). The preprint status adds uncertainty, but its use of Dice and mIoU supports our adoption of external metrics like the Jaccard Index, emphasizing standardized evaluation. Our classic clustering approach offers greater scalability for dental diagnostics.
The study by Hoang and Kang, 2022 [16] proposes a pixel-level clustering network, achieving 0.77 mIoU on STL-10 without annotations. Its label-free approach is relevant to our unsupervised focus, but the 100 ms inference time and 2D image focus limit its applicability to 3D dental radiographs, such as CBCT scans, reducing its clinical relevance.
Xu et al., 2022 [23] combine contrastive learning and graph convolutional networks for clustering, achieving 0.83 NMI on MNIST. Their robust feature extraction is notable, but the 10-hour training time and 150 ms inference time make the method impractical for clinical dental settings. The general dataset focus further reduces its relevance to our study.
The study by Gupta and Bhadauria, 2022 [19] combines superpixel processing with KMeans for lung disease segmentation using ILD datasets. Its multi-level approach enhances segmentation accuracy, but the lung-specific focus and 200 ms processing time limit its relevance to dental radiographs. Its evaluation insights, however, inform our study’s metric considerations.
Chen and Zhao, 2024 [20] introduce a nonparametric KMeans variant for unsupervised color image segmentation, achieving a 0.80 mIoU on BSDS. Their method dynamically determines the number of clusters by estimating local density and color distributions, eliminating the need for a predefined k, and employs a heuristic initialization to enhance convergence stability, reducing misclassification by 8 percent compared to standard KMeans. The heuristic initialization and added density estimation, however, limit the method’s direct applicability to our unsupervised dental segmentation framework, which prioritizes heuristic-free clustering for efficiency and scalability.
The study by Wen, 2020 [33] introduces neutrosophic fuzzy clustering for handling uncertainty in image segmentation, achieving 0.80 mIoU on BSDS. Its innovative approach to ambiguity is promising, but the 250 ms processing time and 2D image focus limit its utility for 3D dental imaging, such as CBCT scans.
2.3. Deep Learning and Specialized Methods for Dental Segmentation
Chung et al., 2021 [4] employ a deep learning model for tooth segmentation in panoramic X-rays, achieving a 0.85 Dice score on 500 images. Its high accuracy and clinical relevance are strengths, directly applicable to dental diagnostics. However, its reliance on supervised training with extensive annotations and 150 ms inference time limit its use in unsupervised settings, contrasting with our study’s annotation-free approach.
The study by Hatvani et al., 2020 [5] develops a deep learning framework for CBCT tooth segmentation, achieving 0.87 Dice on 100 volumes. Its robustness to 3D imaging challenges is notable, but the supervised training requirement and 200 ms inference time hinder its scalability for unsupervised dental applications. The CBCT focus aligns with our dataset but highlights the annotation gap our study addresses.
The research by Lian et al., 2020 [10] proposes a deep learning model for CBCT segmentation, achieving 0.89 Dice on 150 volumes. Its high accuracy is a strength, but the supervised training requirement and 200 ms inference time restrict its use in unsupervised, real-time dental applications. The CBCT focus aligns with our dataset but underscores the need for unsupervised methods.
The studies by Chung et al., 2021 [4], and Budagam et al., 2024 [11], propose deep learning models for panoramic X-ray segmentation, achieving Dice and mIoU scores of 0.85 on 500 images by integrating tooth identification and instance segmentation. Chung’s supervised U-Net and Budagam’s multi-task learning approach leverage annotated X-rays to capture tooth boundaries, improving segmentation accuracy by 12 percent over baseline methods. Their focus on panoramic X-rays and high clinical relevance are strengths, aligning with our study’s dataset. However, their dependence on supervised training and extensive annotations, coupled with 150 ms inference times (Chung) and preprint status (Budagam), limits their applicability to unsupervised dental segmentation. The reliance on internal metrics like Dice and mIoU, without external metrics like the Fowlkes-Mallows Index, further reduces relevance to our study. Nevertheless, their use of overlap metrics informs our adoption of external metrics like the Jaccard Index, underscoring the need for standardized evaluation. This contrast with our annotation-free, classic clustering approach emphasizes the scalability of our methodology for dental diagnostics.
The study by Zhang et al., 2021 [34], proposes TSGCNet for 3D dental model segmentation, achieving 0.89 mIoU on 150 meshes with discriminative feature learning, improving mIoU by 10 percent. Its supervised training limits applicability to our unsupervised framework. Its use of mIoU informs our adoption of external metrics, contrasting with our methodology.
2.4. Evaluation Metrics and Benchmarking in Segmentation
The studies by Kim et al., 2020 [21], and Saraswat et al., 2013 [35], propose unsupervised clustering methods with a focus on evaluation, achieving 0.75–0.80 mIoU/accuracy on BSDS and tissue images through differentiable clustering (Kim) and differential evolution (Saraswat). Kim’s approach optimizes clustering via gradient-based methods, while Saraswat’s uses heuristic optimization for leukocyte segmentation, both improving segmentation quality by 10 percent over baseline clustering. Their unsupervised frameworks are relevant to our study’s annotation-free goals, but Kim’s 150 ms inference time and weak boundary performance (0.60 mIoU on COCO), alongside Saraswat’s reliance on internal metrics and tissue-specific focus, limit their applicability to dental radiographs. The absence of external metrics like the Rand Index in both studies highlights a gap our study addresses. Nonetheless, their use of mIoU and accuracy metrics informs our adoption of external metrics like the Fowlkes-Mallows Index, emphasizing the need for clinically reliable evaluation. This contrast with our classic, heuristic-free clustering approach highlights the simplicity and reproducibility of our methodology for dental diagnostics.
Gupta and Bhadauria, 2022 [19] evaluate KMeans with superpixel processing for lung disease segmentation, achieving 0.83 mIoU using internal metrics like the silhouette score. Its multi-level evaluation approach is insightful, but the reliance on internal metrics and lung-specific focus limit its applicability to dental radiographs. Its KMeans evaluation informs our study’s metric considerations.
As summarised in Table 1, many previous works focus on generic or single-domain datasets, limiting their applicability to dental imaging, where anatomical complexity requires specialized evaluation [4,13,14,19,22]. They often rely on internal metrics (e.g., mIoU, NMI) or qualitative discussion, neglecting the external validation metrics critical for clinical reliability, and many employ optimization heuristics (e.g., PSO, Firefly), introducing complexity and tuning burdens [22,23]. Moreover, dental-specific studies have mostly depended on annotated data, which restricts scalability [4,5]. Our study evaluates five classic clustering algorithms (KMeans, FCM, GMM, DBSCAN, Agglomerative) on paediatric and adult dental radiographs, using the Kaggle dataset [8] with expert-annotated ground truths. By employing six external validation metrics (Rand Index, F1 Score, Precision, Recall, Fowlkes-Mallows Index, Jaccard Index) and avoiding heuristics, our unsupervised, scalable approach addresses annotation scarcity and ensures reproducible, clinically relevant segmentation. We report these metrics on real-world X-ray images, in contrast to prior works’ annotation-dependent, computationally intensive methods.
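For completeness, all six external validation metrics can be computed with scikit-learn once a predicted segmentation has been mapped to the binary convention of the ground-truth mask; the sketch below is a minimal illustration under that assumption rather than our full evaluation pipeline.

```python
import numpy as np
from sklearn.metrics import (rand_score, f1_score, precision_score,
                             recall_score, fowlkes_mallows_score, jaccard_score)

def external_validation(gt_mask, pred_mask):
    """Six external validation metrics on flattened binary masks
    (tooth = 1, background = 0); rand_score requires scikit-learn >= 0.24."""
    y_true = np.asarray(gt_mask).ravel().astype(int)
    y_pred = np.asarray(pred_mask).ravel().astype(int)
    return {
        "Rand Index": rand_score(y_true, y_pred),
        "F1 Score": f1_score(y_true, y_pred),
        "Precision": precision_score(y_true, y_pred),
        "Recall": recall_score(y_true, y_pred),
        "Fowlkes-Mallows Index": fowlkes_mallows_score(y_true, y_pred),
        "Jaccard Index": jaccard_score(y_true, y_pred),
    }
```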