Review

Deep Learning Applications for Dental-Disease Classification Using Intraoral Photographic Images: Current Status and Future Perspectives

by A. M. Mutawa 1,*, Yacoub Yousef Altarakemah 2 and Karthiga Thirupathy 1

1 Department of Computer Engineering, College of Engineering and Petroleum, Kuwait University, Sabah Al Salem University City, P.O. Box 5969, Safat 13060, Shadadiya, Kuwait
2 Department of Restorative Science, Faculty of Dentistry, Kuwait University, P.O. Box 24923, Safat 13110, Kuwait City, Kuwait
* Author to whom correspondence should be addressed.
Submission received: 30 December 2025 / Revised: 11 February 2026 / Accepted: 12 February 2026 / Published: 2 March 2026

Abstract

Dental conditions, including caries, periodontal disease, plaque accumulation, malocclusion, and oral mucosal abnormalities, remain highly prevalent worldwide. Early detection is crucial for preventing disease progression, simplifying treatment, and improving patient outcomes. Conventional diagnostic methods rely on subjective visual and tactile examinations, which are often inconsistent. Recent advances in deep learning (DL), particularly convolutional neural networks and vision transformers, enable automated, accurate detection of dental diseases from intraoral images captured via smartphones or dedicated imaging devices. DL-driven systems facilitate cost-effective virtual consultations, community screenings, and remote oral health monitoring. This narrative review was conducted following a structured search of PubMed, Scopus, Web of Science, Embase, and Google Scholar (October 2020–October 2025), which identified 74 eligible studies on intraoral photographic imaging-based DL systems, encompassing caries, gingival inflammation, plaque, malocclusion, and soft-tissue lesions. Most studies focused on caries, plaque, and periodontal disease using CNN and U-Net-based models, often reporting accuracies above 85% but with substantial performance drops in external validation. Despite promising results, clinical integration remains limited by challenges such as class imbalance, limited external validation, heterogeneous imaging protocols, and insufficient model interpretability. Emerging approaches, including self-supervised and federated learning, explainable artificial intelligence, multimodal data fusion, and smartphone-based diagnostics, offer potential solutions. Standardized imaging workflows, high-quality annotations, and robust clinical trials are essential to translate DL-based dental diagnostic systems into real-world practice. 
This narrative review aims to guide the development of reliable, equitable, and clinically deployable DL solutions for oral health assessment.

1. Introduction

Dental diseases constitute a major global public health burden, affecting an estimated 3.5 billion people worldwide. Dental caries is the most prevalent non-communicable condition, while periodontal diseases remain a leading cause of tooth loss [1,2]. Early and accurate diagnosis is essential for preventing disease progression, reducing treatment costs, and improving patients’ quality of life. However, conventional diagnostic approaches rely predominantly on visual–tactile examination, which is often insufficiently sensitive and poorly reproducible, particularly in early disease stages [3,4].
Recent advances in artificial intelligence (AI), especially deep learning (DL), have transformed image-based diagnostics across several medical specialties and now achieve expert-level performance in fields such as dermatology [5], radiology [6], and ophthalmology [7]. In dentistry, AI research has primarily concentrated on radiographic modalities such as bitewing radiographs, panoramic imaging, and cone-beam computed tomography. Although these imaging techniques offer valuable structural information, they necessitate specialized equipment and involve ionizing radiation. In contrast, intraoral photographic images (IOPIs), captured using smartphones or intraoral cameras, represent a non-ionizing, cost-effective, and accessible alternative that supports tele-dentistry [8], community-based screening, and remote oral-health monitoring, particularly in low-resource settings [9].
Despite these advantages, the direct translation of DL techniques developed for other medical imaging tasks to dental intraoral photography is non-trivial. Unlike standardized acquisition protocols commonly used in radiology, intraoral photography is characterized by substantial variability in illumination, camera type, viewing angle, saliva presence, reflections, and occlusions. Moreover, many dental pathologies manifest as small, visually subtle lesions, often accompanied by pronounced class imbalance and a lack of large, well-annotated, multicenter datasets. These domain-specific characteristics introduce challenges related to robustness, generalizability, and reproducibility that are not adequately addressed by DL models designed for more standardized imaging environments.
Several reviews have explored artificial intelligence (AI) and deep learning (DL) approaches for dental diagnosis, including systematic and umbrella reviews that primarily focus on caries detection and AI-assisted interpretation of dental images. Schwendicke et al. examined deep learning–based methods for caries detection across various imaging modalities [10], whereas Kumar et al. conducted an umbrella review on AI-assisted caries examination using digital dental photography [11]. Additionally, Noor Uddin et al. summarized DL models developed for caries detection using intraoral images [12]. However, these reviews vary significantly in scope, methodological approach, and reporting depth, often combining radiographic and photographic modalities or concentrating on a single disease entity [13]. As a result, a focused synthesis dedicated exclusively to deep learning applications based on intraoral photographic images across multiple dental conditions remains limited.
Despite promising performance, deep learning (DL) models using intraoral photographic images (IOPIs) continue to face significant challenges that impede their clinical application. These challenges encompass class imbalance across disease categories, reliance on single-center datasets, sensitivity to domain shifts due to heterogeneous acquisition conditions, and limited model explainability, all of which collectively hinder generalizability and clinician trust [14]. Methodologically, research has advanced beyond initial convolutional neural networks (CNNs) to include encoder–decoder architectures such as U-Net for segmentation tasks [15], vision transformers (ViTs) for global context modeling [16], generative adversarial networks (GANs) for data augmentation and imbalance mitigation [17,18], self-supervised learning (SSL) for label-efficient training [19], and federated learning (FL) for privacy-preserving multi-center collaboration [20,21]. However, these methodologies have not yet been systematically integrated within the specific context of IOPI-based dental disease analysis. In this narrative review, deep learning approaches are examined by their primary analytical task: image-level classification, lesion or region detection, pixel-level segmentation, and multi-task learning frameworks that combine two or more of these objectives, applied to intraoral photographic images. Figure 1 depicts the DL workflow for IOPI analysis.
To address this gap, the present review evaluates deep learning architectures applied to intraoral photographic diagnosis, summarizes model performance across major dental pathologies, and examines how preprocessing and augmentation strategies affect model robustness and generalizability. By focusing exclusively on intraoral photographic imaging—rather than radiographic modalities—this review complements existing AI-in-dentistry literature and fills a significant gap in current evaluations. Specifically, this review addresses the following questions:
  • Which deep learning architectures and learning paradigms (e.g., CNNs, encoder–decoder networks, vision transformers, self-supervised learning, and federated learning) demonstrate robust performance for dental disease detection and segmentation using intraoral photographic images?
  • How do preprocessing, data augmentation, and image enhancement strategies influence diagnostic accuracy, robustness, and generalizability across primary dental conditions?
  • What methodological and translational challenges, including class imbalance, single-center dataset bias, domain shift, and limited explainability, most strongly constrain clinical adoption?
  • What emerging research directions and implementation strategies are most promising for improving fairness, interpretability, privacy preservation, and real-world deployment of AI-assisted intraoral photographic diagnostics?
These questions are addressed through: (i) task- and disease-level framing (Section 4); (ii) synthesis of deep learning architectures (Section 5); (iii) review of preprocessing and augmentation strategies (Section 6); (iv) task-wise synthesis of diagnostic performance (Section 7); (v) evaluation metrics and generalizability considerations (Section 8); (vi) analysis of translational and clinical challenges (Section 9); and (vii) identification of emerging research directions (Section 10).

2. Methodology

2.1. Search Strategy

A structured literature search was conducted across major academic databases (PubMed, Scopus, Web of Science, Embase, and Google Scholar) from October 2020 to October 2025. The search strategy combined terms related to AI and deep learning (“deep learning,” “DL,” “convolutional neural network,” “CNN,” “vision transformer,” “machine learning,” “artificial intelligence”) with dental and oral health terms (dental, dentistry, oral, tooth, teeth, “dental caries,” gingivitis, “dental calculus,” periodontitis, “oral lesion,” “oral potentially malignant disorder,” OPMD, “oral cancer”) and with intraoral imaging terms (“intraoral photograph,” “intraoral images,” “intraoral photo,” “oral photograph,” “smartphone image,” “intraoral camera,” “photographic images”). The PRISMA-inspired flow diagram (Figure 2) illustrates the process of study identification, screening, eligibility assessment, and final inclusion in this narrative review.

2.2. Inclusion and Exclusion Criteria

Inclusion was limited to peer-reviewed articles published in English. All retrieved records were exported to EndNote and deduplicated. Studies were then screened through a four-stage process conducted independently to minimize bias. Three reviewers independently assessed all titles, abstracts, and full texts against predefined inclusion and exclusion criteria, removing records that violated any of the following six eligibility requirements: (1) non-English publications, (2) duplicate records, (3) content unrelated to dentistry or AI, (4) conference abstracts, (5) retracted papers, and (6) non-empirical materials, including editorials, case reports, and commentaries.

2.3. Study Selection

The initial database search yielded 194 articles. After duplicates were removed, 172 unique studies remained and were screened by title and abstract. Of these, 112 articles proceeded to full-text review to determine eligibility. The complete manuscripts were analyzed to extract methodologies, dataset characteristics, validation strategies, performance metrics, and reported limitations, and a structured quality assessment and risk-of-bias evaluation were conducted using predefined criteria, ensuring that conclusions rested on complete texts rather than abstract-level information. Applying the inclusion and exclusion criteria yielded 74 articles for the final review. Study selection was strictly guided by the predefined eligibility criteria, with a focus on methodological rigor and relevance; independent screening at both the title/abstract and full-text stages separated preliminary relevance assessment from detailed methodological evaluation, thereby enhancing transparency and reducing selection bias.

2.4. Data Extraction and Synthesis

The following information was extracted from each eligible study:
  • Study information: author, publication year, and journal.
  • Dataset characteristics: type of intraoral images, dataset source, and preprocessing methods.
  • DL model details: architecture (e.g., CNN, ResNet, ViT), transfer-learning methods, and training/validation approaches.
  • Performance metrics: accuracy, sensitivity, specificity, F1-score, and area under the curve (AUC) of the receiver operating characteristic (ROC).
  • External validation: presence or absence of independent testing.
  • Key findings and limitations.
Two reviewers independently conducted data extraction and resolved disagreements through discussion. The extracted data were then qualitatively synthesized to identify the emerging trends, methodological patterns, and gaps in the existing literature.
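To make the extracted performance metrics concrete, the following minimal Python sketch computes them from binary confusion-matrix counts. The counts used in the example are hypothetical and are not drawn from any reviewed study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Common diagnostic metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # recall / true-positive rate
    specificity = tn / (tn + fp)                 # true-negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "accuracy": accuracy, "f1": f1}

# Hypothetical counts for a caries classifier evaluated on 200 images
m = binary_metrics(tp=80, fp=10, tn=95, fn=15)
```

Reporting sensitivity and specificity alongside accuracy matters here because dental datasets are often imbalanced: a model that labels every image "healthy" can score high accuracy while detecting no disease.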

2.5. Quality and Risk of Bias Assessment Criteria

Following data extraction and synthesis, the methodological quality and risk of bias of the included studies were assessed using a predefined seven-domain framework tailored to dental photographic image analysis. Each domain was evaluated based on full-text review and rated as low, moderate, or high risk of bias. An overall risk-of-bias classification was subsequently assigned using predefined, non-overlapping thresholds.
Q1. Is the dataset source of the dental photographic image dataset clearly described (e.g., public/private, single- or multi-center)?
Q2. Is an appropriate strategy reported for splitting the dataset into training, validation, and test sets, with explicit measures to prevent data leakage (e.g., patient-level separation or avoidance of duplicate images across splits)?
Q3. Was the developed model evaluated using external validation or an independent test dataset to assess its generalizability to unseen data?
Q4. Are clinically meaningful performance metrics reported beyond overall accuracy, such as sensitivity, specificity, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC)?
Q5. Are ground-truth labels derived from reliable clinical reference standards, such as expert dental annotation, consensus labeling, or validated diagnostic criteria?
Q6. Does the study explicitly discuss key methodological limitations, potential sources of bias, and constraints affecting the interpretation or generalizability of the results?
Q7. Is image acquisition standardized or quality-controlled, including reporting of camera type, lighting conditions, acquisition protocols, or exclusion of poor-quality images?

Risk of Bias (ROB) Scoring Criteria and Thresholds

Each study was evaluated across seven domains (Q1–Q7) using explicit criteria to ensure transparency and replicability in the risk-of-bias assessment. Studies were categorized as having low, moderate, or high risk of bias based on the thoroughness and rigor of their methodologies. For Q1, which pertains to data source and representativeness, studies were classified as low risk if the dataset source and population characteristics were comprehensively documented, and as higher risk if they relied on single-center data without adequate justification. Q2, which focuses on data splitting and leakage prevention, necessitated transparent reporting of training, validation, and test splits, along with measures to prevent data leakage. Q3, concerning the validation strategy and generalizability, was considered high risk in the absence of external or multicenter validation. Q4, regarding performance reporting transparency, was rated low risk when studies reported multiple clinically relevant metrics beyond mere accuracy. Q5, addressing labeling quality and reference standards, required expert or consensus annotations. Q6, on the reporting of limitations, was based on a clear discussion of methodological constraints. Q7, which evaluates image acquisition standardization and quality control, was assessed using documentation of imaging protocols and quality assurance procedures. Each study was independently evaluated using established decision criteria, and any discrepancies in opinion were resolved through consensus.
An overall risk-of-bias rating for each study was derived from the proportion of domains categorized as low, moderate, or high risk. A study was classified as having minimal risk of bias (high quality) if 70% or more of the evaluated domains were rated low risk, with no more than one high-risk rating in a critical domain. A study was classified as having moderate risk of bias (lower quality) if fewer than 70% of the domains were rated low risk or if high-risk ratings occurred in two or more critical domains. A high-risk-of-bias classification was assigned when half or more of the domains were rated high risk, resulting in the study's exclusion from the qualitative synthesis. The use of explicit criteria and quantitative thresholds was intended to minimize subjective judgment and improve reproducibility.
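The threshold logic above can be expressed as a short function. This is an illustrative sketch, not the authors' exact implementation: the choice of which domains count as critical and the tie-breaking order are assumptions.

```python
def overall_rob(ratings, critical=("Q2", "Q3")):
    """Classify overall risk of bias from per-domain ratings.

    ratings: dict mapping a domain ("Q1".."Q7") to "low" | "moderate" | "high".
    critical: domains treated as critical (illustrative choice, an assumption).
    """
    n = len(ratings)
    low_share = sum(r == "low" for r in ratings.values()) / n
    high_share = sum(r == "high" for r in ratings.values()) / n
    high_critical = sum(ratings.get(d) == "high" for d in critical)

    if high_share >= 0.5:
        return "high"        # excluded from the qualitative synthesis
    if low_share >= 0.7 and high_critical <= 1:
        return "minimal"     # high quality
    return "moderate"        # lower quality

# Example profile: strong reporting overall, but no external validation (Q3)
example = {"Q1": "low", "Q2": "low", "Q3": "high", "Q4": "low",
           "Q5": "low", "Q6": "low", "Q7": "moderate"}
```

Encoding the decision rules this way also makes the non-overlapping nature of the thresholds easy to verify: every combination of ratings maps to exactly one class.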

3. Results

The results of the study selection and subsequent analyses are presented below. The characteristics of the included studies, the quality and risk-of-bias assessment outcomes, and the distribution of evidence across dental applications are summarized in the following sections.

3.1. Summary of Included Studies

The reviewed literature encompasses a wide array of dental applications, including caries detection, assessment of periodontal and gingival diseases, analysis of dental plaque and calculus, evaluation of malocclusion and crowding, screening for oral lesions and potentially malignant disorders, and support for orthodontic diagnostics. The studies use datasets from private clinical collections, institutional repositories, public datasets, and images acquired via community or smartphone cameras, with dataset sizes ranging from a few hundred to over 10,000 images. Imaging modalities primarily consist of professional intraoral cameras, digital clinical photography, and smartphone-based imaging. A variety of deep learning architectures are employed, including convolutional neural networks, U-Net-based segmentation models, object detection frameworks like YOLO and Mask R-CNN, and transformer-based models such as Swin Transformer and SegFormer. Reported outcomes primarily focus on classification, detection, and segmentation performance, using metrics such as accuracy, sensitivity, specificity, F1-score, AUC, Dice coefficient, and mean intersection-over-union. Commonly noted limitations include small or single-center datasets, limited population diversity, variability in image quality, lack of external validation, and restricted generalizability. Table 1 summarizes the key characteristics of studies that explore deep learning-based analysis of intraoral dental photographic images.

3.2. Distribution of Study Characteristics and Evidence

Figure 3 presents a sunburst diagram illustrating the hierarchical distribution of the included studies, categorized by dental application, contributing authors, deep learning model families, and reported performance outcomes. The innermost ring represents the authors and year, while the subsequent concentric rings depict the primary dental conditions investigated using intraoral photographic images, the model architectures, and the key outcome metrics. This visualization provides an integrated overview of how deep learning approaches have been applied across various dental conditions, highlights the diversity of model choices, and summarizes reported performance trends within the current literature.

3.3. Quality and Risk of Bias Assessment

The quality and risk of bias in the included studies were assessed using the predefined seven-domain framework outlined in Section 2.5. Table 2 presents a summary of the overall risk-of-bias classification for each study, assessed across seven predefined methodological domains (Q1–Q7), using explicit decision criteria and quantitative scoring thresholds.
Across the studies reviewed, the overall risk-of-bias profile revealed consistent methodological strengths alongside recurring weaknesses. Most studies clearly identified their data sources and demonstrated satisfactory representativeness (Q1), although some were limited to data from a single center. Strategies for data splitting and preventing leakage (Q2) were generally reported with low to moderate concern. The most significant source of bias was linked to validation strategy and generalizability (Q3), as many studies lacked external or multi-center validation, resulting in high-risk ratings in this area. Conversely, transparency in performance reporting (Q4) was consistently strong, with most studies providing multiple clinically relevant metrics beyond just accuracy. Labeling quality and reference standards (Q5) were primarily rated as low risk, indicating reliance on expert dental annotations or consensus labeling. Limitations (Q6) were often acknowledged, although a small number of studies reported higher risks due to insufficient discussion of methodological constraints. Reporting on image acquisition standardization and quality control (Q7) was variable, with several studies rated moderate to high risk due to limited documentation of imaging protocols or quality assurance procedures. Overall, the main contributors to the elevated risk of bias across the reviewed literature were deficiencies in external validation and inconsistent reporting of image acquisition.

4. Discussion

IOPIs are widely used for documentation, patient monitoring, tele-consultations, and AI-assisted diagnostic applications. In dentistry, IOPIs capture enamel, gingiva, plaque, oral mucosa, and occlusal surfaces. However, image quality is affected by variations in lighting, saliva reflection, camera angle, and device type. Consistency is often improved through noise-reduction and image-enhancement techniques such as contrast-limited adaptive histogram equalization (CLAHE), color-constancy correction, and region-of-interest (ROI) extraction.
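As an illustration of one such consistency step, the sketch below applies a simple gray-world color-constancy correction to an RGB image array. It is a minimal NumPy-only stand-in for the correction pipelines used in the reviewed studies, and the synthetic "yellow cast" image is fabricated for the example.

```python
import numpy as np

def gray_world_correct(rgb):
    """Gray-world color constancy: scale each channel so its mean
    matches the global mean, neutralizing a uniform color cast."""
    rgb = rgb.astype(np.float64)
    channel_means = rgb.reshape(-1, 3).mean(axis=0)   # per-channel means
    gain = channel_means.mean() / channel_means       # per-channel scale factors
    corrected = np.clip(rgb * gain, 0, 255)
    return corrected.astype(np.uint8)

# Synthetic image with a strong yellow cast (high R and G, low B)
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[..., 0], img[..., 1], img[..., 2] = 200, 180, 60
balanced = gray_world_correct(img)
```

The gray-world assumption (that the scene averages to neutral gray) is crude for close-up intraoral scenes dominated by red gingiva, which is why many studies pair it with CLAHE or learned normalization rather than relying on it alone.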

4.1. Detectable Dental Diseases in Intraoral Photographs

4.1.1. Dental Caries

Dental caries is a primary global oral health concern, affecting more than 2 billion individuals and imposing significant clinical and economic burdens [1,2]. Carious lesions exhibit characteristic visual features. Early, non-cavitated lesions typically appear as chalky white spots due to subsurface enamel demineralization and loss of translucency. As the disease advances, lesions may darken to brown or black, develop surface roughness, and ultimately form cavities, primarily in occlusal pits, fissures, and proximal surfaces. IOPIs clearly capture these changes, presenting DL models with diagnostic cues such as localized color variation, texture inconsistencies, and morphological disruption [23]. Although the contrast between healthy enamel and demineralized areas enhances lesion visibility, early lesions may be obscured by lighting variability, saliva reflection, or device-dependent color differences, making them difficult for AI systems to distinguish [47]. Despite these limitations, numerous studies have reported that CNNs and transformer-based architectures detect both cavitated and non-cavitated lesions with high diagnostic performance.

4.1.2. Gingivitis and Periodontal Diseases

Gingivitis and periodontitis produce characteristic soft tissue changes that are readily visible in IOPIs. Gingivitis typically manifests as erythematous, edematous gingiva with reduced stippling and increased bleeding tendency. In a more advanced periodontal disease [25], gingival recession and altered gingival margins may be observed. These color and contour changes serve as reliable visual markers for DL models to classify inflammation severity and estimate periodontal indices [24].
Detection accuracy can be affected by factors such as natural pigmentation differences, variations in gingival biotypes, and inconsistent lighting across the vestibular region. Nevertheless, standardized preprocessing techniques, including color-constancy correction and image normalization, enable DL systems to accurately distinguish healthy from inflamed tissues across diverse populations [26].

4.1.3. Dental Plaque

Dental plaque, a biofilm of bacteria and extracellular matrix, is a primary etiological factor in both caries and periodontal disease. In IOPIs, plaque typically appears as a yellowish-white or opaque film along the gingival margin, within fissures, or in interproximal areas. High-resolution images provide sufficient contrast for DL algorithms to recognize plaque-distribution patterns, even without disclosing agents [27].
Pixel-level segmentation models, such as U-Net [3] and DeepLabV3+, are widely used to quantify plaque coverage and assess oral hygiene. These models demonstrate robust performance despite variations in lighting and plaque translucency. By detecting subtle textural differences between plaque and enamel, they are particularly suited for automated plaque assessment from IOPIs [28]. Recent deep learning methods have also been utilized in dental plaque analysis, including a self-attention-based segmentation framework that combines local-to-global feature representations to accurately and automatically segment plaque from IOPIs [40].
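The plaque-index estimation these models support reduces, at its simplest, to a coverage ratio over the predicted masks. The NumPy sketch below illustrates this with toy masks rather than real segmentation output.

```python
import numpy as np

def plaque_coverage(plaque_mask, tooth_mask):
    """Percentage of tooth-surface pixels labeled as plaque.

    Both inputs are boolean arrays, as produced by thresholding a
    segmentation model's per-pixel predictions.
    """
    tooth_pixels = tooth_mask.sum()
    if tooth_pixels == 0:
        return 0.0
    overlap = np.logical_and(plaque_mask, tooth_mask).sum()
    return 100.0 * overlap / tooth_pixels

# Toy 4x4 example: 8 tooth pixels, 2 of them covered by plaque
tooth = np.zeros((4, 4), dtype=bool); tooth[:2, :] = True
plaque = np.zeros((4, 4), dtype=bool); plaque[0, :2] = True
coverage = plaque_coverage(plaque, tooth)   # 2 of 8 pixels -> 25%
```

Restricting the ratio to the tooth mask matters in practice: without it, background, lips, and gingiva inflate the denominator and systematically understate plaque coverage.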

4.1.4. Orthodontic Conditions

IOPIs capture dental alignment, occlusal relationships, and arch forms, enabling assessment of common orthodontic abnormalities such as crowding, spacing, anterior crossbite, deep bite, and open bite. High-resolution photographs accurately depict tooth angulation, incisal relationships, and midline deviations, allowing DL models to extract both global structural patterns and local tooth-level features.
Transformer-based models, which capture long-range spatial relationships across the dental arch, achieve remarkably accurate multi-landmark detection and occlusal pattern recognition. IOPI-based remote monitoring systems further demonstrate the practical utility of this approach, especially in aligner therapy [29,30]. Machine learning-based methods have also been employed for the automatic identification and analysis of photometric landmarks on two-dimensional facial images, enabling objective facial measurements that support orthodontic diagnosis and treatment planning [48].

4.1.5. Soft-Tissue Lesions and Potentially Malignant Disorders

Soft-tissue lesions, including oral leukoplakia, erythroplakia, candidiasis, aphthous ulcers, and lichen planus, exhibit visually distinctive features in clinical photographs. These may present as white keratotic plaques, bright red velvety patches, ulcerated areas, or reticular patterns, which DL models can effectively leverage for classification. Photographic analysis of potentially malignant disorders (PMDs) [33,34] is particularly valuable given the critical importance of early detection. Although variations in mucosal color, lighting conditions, and saliva can complicate interpretation, CNNs and attention-based models have recently identified suspicious mucosal changes with promising sensitivity [31]. Collectively, these studies indicate that while IOPIs are effective for detecting visually prominent conditions such as dental plaque and gingivitis, diagnostic performance for early-stage caries and subtle mucosal lesions remains constrained by class imbalance and variability in image acquisition, underscoring the need for robust preprocessing strategies and multi-task learning frameworks.
Table 3 synthesizes the relationships among major dental disease categories, their characteristic visual manifestations in intraoral photographs, and the corresponding deep learning tasks. As summarized in Table 3, conditions characterized by intense color or texture contrasts—such as dental plaque and gingival inflammation—are well suited to image-level classification and segmentation tasks, whereas diseases presenting subtle visual cues, including early-stage caries and mucosal lesions, pose greater challenges for automated detection. This task–disease mapping highlights the importance of preprocessing, region-of-interest extraction, and multi-task learning for improving diagnostic performance across heterogeneous dental conditions. Figure 4 provides a visual reference for the diagnostic cues described across disease categories, thereby supporting the interpretability of deep learning feature representations derived from intraoral photographs.

5. Deep Learning Architectures and Applications for Intraoral Photographic Image Analysis

This section integrates the discussion of deep learning architectures with their corresponding applications in IOPI analysis. DL has achieved remarkable success in various image-based medical diagnostic applications. Several model families from mainstream computer-vision research have been adapted to dental imaging: CNNs such as the Visual Geometry Group network (VGG), residual network (ResNet), DenseNet, and Inception [49]; encoder–decoder architectures; ViTs; generative adversarial networks (GANs) [17]; and hybrid models [37]. These early successes in radiographic interpretation provide a foundation for extending DL techniques to photographic intraoral images.

5.1. Convolutional Neural Networks (CNNs)

CNNs remain central to IOPI analysis due to their ability to learn hierarchical spatial representations. Initial convolutional layers capture low-level features such as edges, contrast variations, and textures, while deeper layers encode higher-level semantic patterns, including lesion morphology, plaque distribution, gingival contours, and occlusal relationships [50,51].
Several CNN architectures have been effectively applied in dental diagnostics:
  • ResNet employs skip connections to mitigate vanishing gradients, enabling more profound and more discriminative models [22].
  • DenseNet connects each layer to all subsequent layers, promoting feature reuse and improving gradient flow; it is particularly advantageous for small or imbalanced dental datasets.
  • EfficientNet uses a compound scaling strategy that jointly optimizes network width, depth, and resolution, achieving high accuracy with reduced computational cost, making it suitable for smartphone-based inference.
  • MobileNetV2/V3 is optimized for lightweight deployment and real-time tele-dentistry workflows, facilitating rapid community-based screening [39,48].
CNN-based methods demonstrate robust performance across diverse IOPI tasks, including caries detection [52], caries classification, gingival inflammation classification [45], plaque segmentation [40], OPMD screening, orthodontic assessment [35], and oral lesion identification. Additionally, CNNs form the backbone of hybrid architectures integrating attention modules, multi-task learning, and explainability tools such as Gradient-weighted Class Activation Mapping (Grad-CAM), whose visualizations were confirmed to accurately depict the regions most significant for the model's prediction, reflecting model focus rather than conclusive diagnostic evidence [37]. While CNN classifiers demonstrate high performance on controlled datasets, their accuracy may diminish under variable lighting, diverse acquisition devices, and class imbalance. To mitigate these limitations, robust preprocessing strategies, such as multi-scale feature extraction and contrast-limited adaptive histogram equalization (CLAHE), have been implemented, thereby enhancing the detection of subtle white-spot lesions.
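The hierarchical feature extraction described above begins with simple spatial filters. The NumPy sketch below applies a Sobel-style kernel via the same sliding-window operation a convolutional layer performs, highlighting a vertical intensity edge of the kind an initial CNN layer typically learns to detect (illustrative; not taken from any reviewed model).

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation, as computed by a CNN layer
    (no padding, stride 1, single channel)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (image[i:i + kh, j:j + kw] * kernel).sum()
    return out

# Sobel kernel: responds strongly at vertical intensity edges,
# e.g., the boundary between bright enamel and a darker lesion.
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
img = np.zeros((5, 6)); img[:, 3:] = 1.0   # step edge at column 3
response = conv2d(img, sobel_x)            # peaks at the edge, zero elsewhere
```

Deeper layers stack many such learned filters with nonlinearities, which is how the networks progress from edges and textures to lesion-level morphology.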

5.2. Encoder–Decoder Networks for Segmentation (U-Net and Derivatives)

Precise pixel-level discrimination is essential for tasks such as plaque-region extraction, gingival-margin delineation, mucosal-lesion isolation, and teeth cropping prior to classification. The U-Net architecture remains the benchmark in biomedical image segmentation, as its symmetric encoder–decoder design with skip connections effectively integrates contextual and fine-grained spatial information [53].
Extensions of U-Net have further enhanced performance on challenging intraoral images affected by saliva reflections, uneven illumination, and soft-tissue variability. Variants such as U-Net++, Attention U-Net, DeepLabV3+, and hybrid convolution–transformer models improve boundary precision and robustness in plaque segmentation [43], gingival mapping, lesion delineation, and ROI extraction. These models support automated plaque-index estimation and enable remote monitoring for periodontal health programs [44]. Most state-of-the-art segmentation pipelines for IOPI analysis are based on these encoder–decoder architectures. Attention-based extensions enhance robustness by effectively suppressing background artifacts, such as saliva reflections and non-diagnostic soft tissue.

5.3. Vision Transformers

Leveraging global self-attention mechanisms that model long-range spatial dependencies in dental images, transformer-based models have emerged as powerful alternatives to CNNs, which operate only within localized receptive fields. Vision transformers (ViTs, Swin transformer, DeiT) process images as sequences of patches, capturing both tooth-level and mouth-level contextual relationships [54]. This global modeling capability is particularly advantageous for tasks requiring multi-tooth contextual understanding, such as orthodontic landmark detection, occlusal pattern assessment, and multi-disease classification, in which inter-tooth spatial relationships are diagnostically meaningful. In dental imaging, ViTs have demonstrated high performance in classifying oral potentially malignant disorders (OPMDs) [34] and in multi-view fusion tasks, often capturing global shape and texture patterns more accurately than conventional CNNs. Hybrid CNN–transformer models leverage CNNs for local texture extraction and transformers for contextual reasoning, achieving state-of-the-art results across diverse IOPI applications [55]. However, their primary limitation remains the increased data and computational demands, which are frequently mitigated through pretraining and hybrid model designs.

5.4. Self-Supervised Learning

Self-supervised learning (SSL) techniques, including SimCLR, BYOL, and MoCo, have emerged as highly effective for dental imaging, particularly in the context of limited annotated datasets [19]. SSL leverages contrastive learning and latent-space bootstrapping to extract invariant and discriminative features, enhancing performance in downstream tasks such as lesion detection, classification, and plaque segmentation without extensive manual annotation [56].
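The contrastive objective behind SimCLR-style pretraining can be illustrated with a small, framework-free sketch of the NT-Xent loss (toy two-dimensional embeddings; the function names and values are hypothetical, not a reference implementation):

```python
import math

# Minimal sketch of the NT-Xent (normalized temperature-scaled
# cross-entropy) loss used in SimCLR-style self-supervised learning.
# Each image yields two augmented "views"; the loss pulls a view toward
# its partner and pushes it away from all other views in the batch.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nt_xent(embeddings, temperature=0.5):
    """embeddings: 2N vectors; views 2k and 2k+1 form a positive pair."""
    n = len(embeddings)
    total = 0.0
    for i in range(n):
        j = i + 1 if i % 2 == 0 else i - 1  # index of the positive partner
        sims = [math.exp(cosine(embeddings[i], embeddings[k]) / temperature)
                for k in range(n) if k != i]
        pos = math.exp(cosine(embeddings[i], embeddings[j]) / temperature)
        total += -math.log(pos / sum(sims))
    return total / n

# Two images, two views each: matched views aligned, others orthogonal.
views = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_aligned = nt_xent(views)
# Shuffling the pairing (positives no longer match) raises the loss.
loss_mismatched = nt_xent([views[0], views[2], views[1], views[3]])
```

Minimizing this loss over many unlabeled intraoral images is what yields the invariant, annotation-free representations described above.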
Federated learning (FL) enables multi-center model training without sharing patient data, addressing privacy and data-governance concerns while improving generalizability across devices, demographics, and clinical settings. Integrating SSL and FL provides a synergistic approach that accelerates the development of AI-driven dental diagnostic tools. These advances facilitate scalable deployment of IOPI-based classification systems, supporting clinical implementation, community health initiatives, remote oral health monitoring, and ultimately improving the accessibility and equity of dental care [57].

5.5. Federated Learning

Dental image datasets are often confined to institutional silos, limiting data diversity and raising concerns regarding privacy, governance, and regulation. FL is a decentralized training paradigm in which multiple dental clinics or institutions collaboratively train a shared model without exchanging raw patient data. FL offers several advantages in dentistry, notably mitigating domain shifts arising from variations in imaging devices, patient demographics, and acquisition protocols. Early FL applications in IOPI analysis demonstrated superior external generalization compared with single-center models, highlighting its value for multi-population robustness. Additionally, FL supports continuous model updates within real clinical workflows, enabling models to evolve with newly captured patient images while maintaining strict privacy safeguards. With its scalability, fairness, and privacy-preserving properties, FL is poised to play a pivotal role in deploying real-world dental AI systems [20,21].
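The aggregation step at the core of FL, federated averaging (FedAvg), can be sketched in a few lines; the clinic count, parameter vectors, and cohort sizes below are hypothetical:

```python
# Minimal sketch of federated averaging (FedAvg): each clinic trains
# locally and sends only model weights; the server combines them,
# weighted by local dataset size. Weights are flat lists here.

def fed_avg(client_weights, client_sizes):
    """Size-weighted average of per-client parameter vectors."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[p] * s for w, s in zip(client_weights, client_sizes)) / total
        for p in range(n_params)
    ]

# Three clinics with different amounts of intraoral images.
weights = [[0.2, 1.0], [0.4, 2.0], [0.6, 3.0]]
sizes = [100, 300, 600]  # larger cohorts pull the average harder
global_weights = fed_avg(weights, sizes)
```

In a real deployment the averaged model is broadcast back to the clinics for another local training round, so raw patient images never leave the institution.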

5.6. Generative Adversarial Networks

GANs are increasingly applied in dental imaging for tasks including synthetic image generation, class-imbalance correction, data augmentation, image denoising, color correction, and cross-modal translation [18]. By generating realistic synthetic samples, GANs can augment underrepresented disease categories such as early caries, rare mucosal lesions, and specific orthodontic conditions. Beyond augmentation, GANs enhance image quality by correcting illumination inconsistencies, reducing noise, and standardizing color profiles, thereby improving feature extraction from intraoral photographs. GAN-based cross-modal translation can convert low-quality smartphone images into standardized formats, enhancing diagnostic consistency across imaging sources. Empirical evidence indicates that GAN-augmented datasets increase diversity, mitigate bias toward common presentations, and improve performance in caries detection, lesion segmentation, and plaque analysis. Consequently, GANs have become integral to modern dental AI workflows, particularly when training is limited by data scarcity or heterogeneity. Table 3 presents a study-level synthesis of deep learning applications for intraoral photographic images, mapping diagnostic tasks to their corresponding model architectures. Most classification tasks—such as the identification of dental caries, gingivitis, and soft-tissue lesions—employ convolutional neural network (CNN) backbones, reflecting their effectiveness in capturing localized color and texture features. Segmentation-oriented studies, particularly those focused on dental plaque and gingival regions, predominantly adopt encoder–decoder architectures such as U-Net and its variants, which enable accurate boundary delineation through multi-scale feature integration.
Transformer-based models are more frequently reported in orthodontic assessments and multi-disease classification settings, where modeling global spatial dependencies across the dental arch is diagnostically relevant.
While Table 3 provides a study-level overview, Table 4 presents a consolidated comparison of major deep learning architecture families applied to intraoral photographic image analysis, summarizing their typical use cases, strengths, and limitations. CNN-based models deliver strong baseline performance for disease classification but can struggle to capture long-range contextual dependencies. Encoder–decoder architectures excel in pixel-level segmentation tasks—such as plaque quantification and gingival mapping—due to their ability to preserve spatial detail and integrate multi-scale features. Vision transformers offer enhanced global context modeling and multi-region reasoning, although they often require larger datasets or extensive pretraining. Emerging paradigms, including generative adversarial networks, self-supervised learning, and federated learning, address challenges such as class imbalance, limited annotations, and data privacy constraints. Together, these developments highlight a shift toward more robust, scalable, and clinically applicable AI systems for intraoral photographic diagnostics. Table 5 summarizes the DL architectures for analyzing IOPIs based on dental diseases.

6. Image Preprocessing and Augmentation

Preprocessing is critical for improving the quality, consistency, and diagnostic value of IOPIs, which are often affected by illumination variations, saliva reflections, color distortions, and occlusal angulation. Effective preprocessing minimizes irrelevant variability and highlights diagnostically relevant structures.

6.1. Color Constancy and Normalization

IOPIs captured using smartphones or consumer-grade cameras often exhibit uneven illumination, shadows, and color shifts. Techniques such as gray-world, shades-of-gray, Retinex-based algorithms, and learning-based color-constancy models standardize color appearance across sessions and devices, improving the visibility of white-spot lesions, plaque films, gingival erythema, mucosal abnormalities, and other subtle features [58].
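As a minimal sketch of the gray-world assumption mentioned above (each channel is rescaled so the average scene color becomes neutral gray; pixel values are hypothetical, and production pipelines add gamma handling and more careful clipping):

```python
# Minimal sketch of gray-world color constancy: assume the average
# scene color is neutral gray, so each RGB channel is rescaled by
# (overall mean / channel mean). Pixels are (R, G, B) tuples in [0, 255].

def gray_world(pixels):
    n = len(pixels)
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    gray = sum(means) / 3.0
    gains = [gray / m for m in means]
    return [tuple(min(255.0, p[c] * gains[c]) for c in range(3))
            for p in pixels]

# A warm (reddish) cast typical of incandescent operatory lighting.
warm = [(200, 100, 50), (220, 120, 60), (180, 90, 45)]
balanced = gray_world(warm)
channel_means = [sum(p[c] for p in balanced) / len(balanced)
                 for c in range(3)]
# After correction the three channel means coincide (absent clipping).
```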

6.2. Contrast Enhancement

CLAHE is commonly applied to enhance local contrast, particularly for early caries, plaque boundaries, and soft-tissue textures. CLAHE increases the discriminability of low-contrast regions without excessively amplifying noise, enabling CNNs and transformers to extract more informative features from enamel surfaces and gingival tissues [59].
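The contrast-limiting idea behind CLAHE can be sketched with a clipped global histogram equalization; full CLAHE additionally operates per tile with bilinear blending, as provided by libraries such as OpenCV. The patch values and clip limit below are hypothetical:

```python
# Minimal sketch of the contrast-limiting step at the heart of CLAHE:
# the histogram is clipped at a limit and the excess redistributed
# before equalization, so noise in flat regions is not over-amplified.
# Real CLAHE applies this per tile with bilinear blending.

def clipped_equalize(pixels, clip_limit=40, levels=256):
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Clip the histogram and redistribute the excess uniformly.
    excess = sum(max(0, h - clip_limit) for h in hist)
    hist = [min(h, clip_limit) for h in hist]
    bonus = excess // levels
    hist = [h + bonus for h in hist]
    # Standard histogram equalization via the cumulative distribution.
    total = sum(hist)
    cdf, running = [], 0
    for h in hist:
        running += h
        cdf.append(running / total)
    return [round(cdf[p] * (levels - 1)) for p in pixels]

# A low-contrast enamel patch: values crowded into a narrow band.
flat_patch = [100] * 90 + [101] * 5 + [102] * 5
enhanced = clipped_equalize(flat_patch, clip_limit=40)
# The narrow 100-102 band is stretched across the output range.
```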

6.3. Region-of-Interest Extraction

Automatic ROI extraction reduces background noise and directs the model’s focus to clinically relevant areas. Prior to classification, teeth, gingiva, or oral lesions can be cropped using detection models such as YOLOv5, Faster R-CNN, or SSD. ROI-based pipelines enhance classification accuracy by excluding extraneous regions such as lips, tongue, cheeks, and specular highlights [42].

6.4. Image Augmentation

Image augmentation strategies—including rotation, scaling, horizontal flipping, color jittering, Gaussian noise, elastic deformation, and Random Erasing—expand dataset diversity and mitigate overfitting [60]. Advanced techniques such as GAN-based synthetic data generation and MixUp/CutMix produce realistic variations that improve model generalization, particularly for underrepresented categories such as early lesions or rare soft-tissue abnormalities [18].
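Two of these strategies, horizontal flipping and Random Erasing, can be sketched without any imaging library (toy grayscale grids; real pipelines typically use torchvision or Albumentations):

```python
import random

# Minimal sketch of two common augmentations: horizontal flipping and
# Random Erasing. Images are 2-D lists of grayscale values.

def hflip(image):
    return [row[::-1] for row in image]

def random_erase(image, h, w, fill=0, rng=None):
    """Blank a random h x w rectangle, simulating occlusion."""
    rng = rng or random.Random()
    top = rng.randrange(len(image) - h + 1)
    left = rng.randrange(len(image[0]) - w + 1)
    out = [row[:] for row in image]  # leave the original untouched
    for i in range(top, top + h):
        for j in range(left, left + w):
            out[i][j] = fill
    return out

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
flipped = hflip(img)  # mirrors each row left-to-right
erased = random_erase(img, 2, 2, rng=random.Random(0))
```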

6.5. Standardized Photography Protocols

Despite advances in preprocessing, variability in image acquisition remains a major challenge. Studies have emphasized the importance of standardized protocols that specify a fixed camera distance, controlled lighting, use of retractors, tooth-surface drying, and consistent framing [61]. Clinical guidelines for IOPI capture are essential for enhancing model robustness and supporting future multi-center deployments [62].
Table 6 summarizes the reported performance of representative deep learning models across major intraoral photographic image-based dental diagnostic tasks, including caries, gingivitis, plaque, orthodontic conditions, and soft-tissue lesions. It enables concise comparison of evaluation metrics and highlights persistent performance gaps under external validation.

7. Evaluation Metrics

When evaluating DL systems for IOPI analysis, the metrics must be tailored to classification, detection, and segmentation tasks. Rigorous and consistent evaluation is crucial for translating these systems into clinical practice.

7.1. Classification Metrics

7.1.1. Accuracy

Accuracy, defined as the proportion of correctly classified images among all predictions, is commonly reported but may be insufficient for evaluating imbalanced dental datasets, where conditions like caries are far more prevalent than soft-tissue lesions. For screening applications, metrics such as sensitivity and specificity often provide more clinically meaningful insights than accuracy alone.

7.1.2. Sensitivity and Specificity

Sensitivity (true positive rate) and specificity (true negative rate) measure a model’s ability to correctly identify diseased and healthy cases, respectively. These metrics are critical for detecting caries and gingivitis, as false negatives can lead to missed diagnoses with serious clinical implications.

7.1.3. Precision, Recall, and F1-Score

Precision is the proportion of true positives among all positive predictions, while recall (also called sensitivity) is the proportion of true positives among all actual positive cases. The F1-score, calculated as the harmonic mean of precision and recall, provides a balanced measure that is especially useful for imbalanced datasets. The F1-score is commonly reported in IOPI research, particularly for tasks such as mucosal-lesion classification and multi-disease detection.
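These confusion-matrix-derived metrics can be computed directly; the labels and predictions below are toy values (1 = diseased, 0 = healthy):

```python
# Minimal sketch of the binary classification metrics discussed above,
# computed from a confusion matrix. Illustrative only.

def confusion_counts(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

def classification_metrics(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)       # recall / true positive rate
    specificity = tn / (tn + fp)       # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return dict(accuracy=accuracy, sensitivity=sensitivity,
                specificity=specificity, precision=precision, f1=f1)

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
m = classification_metrics(y_true, y_pred)
# One missed lesion (fn) and two false alarms (fp): accuracy 0.7,
# sensitivity 0.75, yet precision only 0.6.
```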

7.1.4. Area Under the Receiver Operating Characteristic Curve

The AUC evaluates classification performance across varying probability thresholds and is robust to class imbalance. High AUC values (typically 0.85–0.96) are commonly reported in DL-based caries detection, gingivitis classification, and multi-disease screening tasks [10,63].
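The AUC admits a simple rank-based reading: the probability that a randomly chosen diseased case is scored higher than a randomly chosen healthy one (the Mann–Whitney U interpretation). A minimal sketch with hypothetical scores:

```python
# Minimal sketch of AUC as the fraction of diseased/healthy pairs the
# model orders correctly, with ties counted as one half. Illustrative only.

def auc(y_true, scores):
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]  # predicted probabilities
auc_value = auc(y_true, scores)
# 8 of the 9 diseased/healthy pairs are ordered correctly.
```

Because only the ranking of scores matters, this measure is unchanged by monotone rescaling of the probabilities, which is what makes it robust to class imbalance and threshold choice.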

7.2. Segmentation Metrics

Robust evaluation metrics are essential for assessing the spatial accuracy of pixel-level tasks, such as dental plaque and lesion segmentation. The Dice coefficient measures the overlap between predicted segmentation masks and ground-truth annotations, providing sensitivity to small structures. Similarly, Intersection over Union (IoU) evaluates boundary alignment and region matching, reflecting segmentation precision. State-of-the-art architectures, including U-Net and DeepLab, consistently achieve Dice scores above 0.85 on well-preprocessed intraoral datasets, highlighting their effectiveness in fine-grained dental image analysis [22].
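Both overlap metrics can be sketched on flattened binary masks (toy values; for binary masks IoU = Dice / (2 − Dice), so the two always move together):

```python
# Minimal sketch of the Dice coefficient and IoU on binary masks
# (1 = plaque/lesion pixel). Masks are flattened lists; illustrative only.

def dice(pred, truth):
    inter = sum(p and t for p, t in zip(pred, truth))
    return 2 * inter / (sum(pred) + sum(truth))

def iou(pred, truth):
    inter = sum(p and t for p, t in zip(pred, truth))
    union = sum(p or t for p, t in zip(pred, truth))
    return inter / union

truth = [1, 1, 1, 1, 0, 0, 0, 0]
pred  = [1, 1, 1, 0, 1, 0, 0, 0]  # one missed pixel, one false positive
d = dice(pred, truth)   # 2*3 / (4+4) = 0.75
j = iou(pred, truth)    # 3 / 5 = 0.6
```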

7.3. Calibration, Robustness, and Reliability Assessments

Calibration metrics such as the Brier score or expected calibration error are rarely reported but are important for assessing prediction confidence. Robustness testing under varied lighting conditions, low-resolution inputs, and real-world smartphone imagery is increasingly recommended in model reliability assessments [64].
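The two calibration measures named above can be sketched as follows (toy probabilities; the ECE bin count is a free parameter):

```python
# Minimal sketch of the Brier score (mean squared error of predicted
# probabilities) and expected calibration error (ECE, the bin-weighted
# gap between average confidence and observed positive rate).

def brier(y_true, probs):
    n = len(y_true)
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / n

def ece(y_true, probs, n_bins=5):
    bins = [[] for _ in range(n_bins)]
    for t, p in zip(y_true, probs):
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((t, p))
    n = len(y_true)
    total = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(p for _, p in b) / len(b)
        frac_pos = sum(t for t, _ in b) / len(b)
        total += (len(b) / n) * abs(avg_conf - frac_pos)
    return total

y_true = [1, 0, 1, 0]
probs  = [0.9, 0.1, 0.8, 0.3]
b = brier(y_true, probs)
e = ece(y_true, probs)
```

A well-calibrated model drives both values toward zero; a model that is accurate but systematically over-confident can still show a large ECE, which is why accuracy alone does not capture reliability.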

7.4. External Validation and Cross-Domain Generalization

Large-scale external validation remains conspicuously lacking in the current research. The performance of many models declines by 10–30% on data from different institutions or imaging devices, underscoring the need for domain adaptation and FL. External validation on geographically and demographically diverse datasets remains the gold standard for establishing clinical readiness [13].

7.5. Human–Artificial Intelligence Comparison and Clinical Utility

Multiple studies have shown that AI systems can match or exceed human performance in tasks such as caries detection, plaque assessment, and lesion identification. However, clinical utility depends not only on accuracy but also on explainability, robustness, and seamless integration into real-world workflows, including tele-dentistry and mobile health applications [65].

7.6. Interpretation of Evaluation Metrics and Acceptable Performance Ranges

In IOPI-based dental AI, evaluation metrics such as accuracy, sensitivity, specificity, AUC, Dice, and IoU should be interpreted in a task-specific context and with consideration of dataset imbalance; general guidance on metric reporting and pitfalls in medical imaging AI is provided in recent methodological recommendations. For classification tasks, accuracy and AUC values ≥ 0.85 are commonly reported as acceptable for assistive screening under controlled settings, as demonstrated in caries and periodontal or gingival classification studies [25,26]. For segmentation tasks, including plaque and gingival-region delineation, Dice and IoU values ≥ 0.80 are generally considered acceptable and ≥ 0.85 strong, consistent with recent intraoral segmentation studies [11,43]. For photographic soft-tissue and oral potentially malignant disorder (OPMD) screening, high sensitivity (approximately ≥ 0.85) is typically prioritized to reduce missed lesions [33,35]. Across all tasks, reliance on a single metric is insufficient, and complementary metrics together with explicit reporting of internal versus external validation are required for robust and clinically meaningful evaluation [64]. Figure 5 presents a conceptual overview of the evaluation metrics frequently employed in deep learning-based intraoral photographic image (IOPI) analysis. It systematically maps each metric to its respective task type and underscores key interpretational limitations.

8. Challenges and Limitations

Despite notable advancements, the widespread adoption of DL for IOPI analysis is limited by several key challenges, including data quality constraints, limited model robustness, restricted interpretability, privacy and governance concerns, and workflow integration barriers. Addressing these limitations is critical for developing reliable, clinically deployable AI systems in dentistry.

8.1. Dataset Limitations and Class Imbalance

A major challenge in AI-based dental imaging is the limited availability of large, diverse, and well-annotated datasets. Most studies rely on single-center cohorts with narrow demographic representation and inconsistent imaging conditions. Rare conditions, such as early enamel lesions or OPMDs, remain underrepresented, leading to model overfitting on common disease patterns. While augmentation strategies and GAN-based synthetic image generation partially mitigate class imbalance, real-world variability remains a significant obstacle [66].

8.2. Domain Shift and Variability in Image Acquisition

Intraoral photographs exhibit substantial variations in lighting, camera type, occlusal angle, saliva reflections, and operator technique. These factors cause domain shift, wherein models trained on one dataset perform poorly on another. Even minor changes in imaging conditions can markedly reduce segmentation or classification performance. Techniques such as color-constancy correction, standardized acquisition protocols, and domain adaptation provide partial solutions, but multi-center validation remains limited [67].

8.3. Lack of Explainability and Clinical Transparency

CNNs, U-Nets, and vision transformers achieve strong predictive performance, but often operate as opaque “black-box” systems. Most studies rely on Grad-CAM or attention-based visualizations, which remain clinically unverified and frequently emphasize regions unrelated to diagnostic decision-making. To earn the trust of clinicians, models must provide clear, reproducible reasoning pathways and demonstrate consistent interpretability of their automated outputs across diverse disease categories [68].

8.4. Limited External and Prospective Validation

Many models achieve high accuracy on internal test sets resembling their training data, yet performance often declines by 10–30% on independent external datasets. Few studies report prospective validations or evaluate system performance in real-world tele-dentistry environments. Without robust cross-institutional evaluation, these models may lack generalizability across diverse patient demographics, imaging devices, and clinical workflows [46].

8.5. Privacy, Ethics, and Regulatory Constraints

IOPIs contain identifiable biometric features, including tooth morphology and surrounding soft tissues. Data sharing is restricted by regulations such as the GDPR, HIPAA, and institutional policies, limiting the creation of large multi-center datasets. While FL partially addresses these challenges, regulatory oversight of AI-driven dental diagnostic systems remains underdeveloped, and no DL tool based on IOPI analysis has yet received FDA or CE certification [20,69].

8.6. Integration into Clinical Workflow

Despite strong experimental performance, most DL systems have not been adopted in routine dental practice. Barriers include limited interoperability with electronic dental records, unstable internet access in remote settings, hardware constraints in low-resource clinics, and the absence of intuitive, clinician-centered interfaces. Effective integration will require robust deployment frameworks, informative visualization tools, and targeted clinician training programs [70].

9. Future Research Directions

To overcome current limitations and accelerate clinical translation, the following research avenues should be prioritized:

9.1. Development of Large, Multi-Center, Standardized Datasets

Collaborative international datasets with standardized acquisition protocols can enhance model generalizability. Evidence indicates that cross-center training and multi-center imaging repositories substantially reduce bias and strengthen the external validity of dental AI models [66].

9.2. Advanced Learning Paradigms: Self-Supervised Learning, FL, and Multi-Task Models

SSL leverages unlabeled dental images, reducing annotation burden and improving model performance. FL enables privacy-preserving collaboration across clinics, mitigating domain shifts and promoting fairness. Multi-task architectures integrating detection of caries, plaque, gingivitis, orthodontic findings, and soft-tissue lesions have shown promise for simultaneous classification tasks [56,71].

9.3. Improved Explainability and Clinical Interpretability

Clinically validated explainability tools, including hierarchical attention maps, Shapley additive explanation-based attributions, and rule-based overlays linked to diagnostic criteria, are emerging in healthcare AI. Human–AI collaborative studies are essential to evaluate how explainability affects clinical trust and diagnostic confidence [14].

9.4. Robustness, Calibration, and Continual Learning

Models must maintain performance under real-world imaging challenges, including poor lighting, motion blur, and artifacts. Recent robustness-testing frameworks and motion-correction techniques in medical imaging support reliable performance. Calibration methods and continual learning strategies enable models to adapt over time, mitigating domain drift in diagnostic applications [72].

9.5. Integration with Mobile and Tele-Dentistry Platforms

Lightweight architectures, such as MobileNetV3 and EfficientNet-Lite, can be deployed on mobile platforms to expand access to underserved populations. Early studies of mobile AI for plaque detection report only moderate accuracy, emphasizing the need for improved usability and seamless workflow integration [39,73].

9.6. Regulatory Approval Pathways and Ethical Frameworks

Regulatory guidelines for AI applications are evolving rapidly. The FDA’s credibility framework and CE guidelines prioritize safety, transparency, and reliability. Dentistry-specific ethical checklists, emphasizing autonomy, fairness, and data protection, provide essential guidance for the responsible deployment of AI in clinical practice [74]. Figure 6 presents a radial diagram illustrating the key future directions for deep learning-based IOPI analysis. The radial representation highlights the translational advances expected to collectively address current limitations related to data availability, privacy, interpretability, and clinical deployment. Table 7 highlights the key challenges and prospective directions for advancing IOPI-based dental disease detection.

10. Conclusions

DL has transformed IOPI analysis, enabling precise detection of dental caries, gingival disease, plaque accumulation, orthodontic abnormalities, and soft-tissue lesions. CNNs, U-Net variants, and transformer-based architectures have demonstrated strong diagnostic performance, while emerging strategies such as SSL, multi-task learning, and federated collaboration offer further improvements in accuracy, generalizability, and scalability.
Despite these advances, key challenges remain, including limited dataset diversity, domain adaptation, model interpretability, privacy compliance, and clinical integration. Rigorous external validation, adherence to regulatory standards, and seamless workflow incorporation are essential for widespread adoption in dental practice. With ongoing progress in data quality, model architecture, privacy-preserving learning, and mobile platform deployment, AI-driven IOPI analysis is poised to enhance diagnostic precision, expand tele-dentistry services, reduce healthcare disparities, and enable early detection of oral diseases worldwide.

Author Contributions

Conceptualization, A.M.M. and Y.Y.A.; methodology, A.M.M., Y.Y.A. and K.T.; writing—original draft preparation, A.M.M. and K.T.; writing—review and editing, A.M.M., Y.Y.A. and K.T.; project administration, resource management, supervision, and funding acquisition, A.M.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Kuwait Foundation for the Advancement of Sciences (KFAS) (Research Grant No PN2313NR2019).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

The authors would like to thank Kuwait University and Kuwait Foundation for the Advancement of Sciences (KFAS) for providing the support and resources necessary for the completion of this research.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
IOPI: Intraoral Photographic Image
DL: Deep Learning
CNN: Convolutional Neural Network
DSLR: Digital Single-Lens Reflex
HIPAA: Health Insurance Portability and Accountability Act

References

  1. Tonetti, M.S.; Jepsen, S.; Jin, L.; Otomo-Corgel, J. Impact of the global burden of periodontal diseases on health, nutrition and wellbeing of mankind: A call for global action. J. Clin. Periodontol. 2017, 44, 456–462. [Google Scholar] [CrossRef] [PubMed]
  2. Lang, N.P.; Bartold, P.M. Periodontal health. J. Periodontol. 2018, 89, S9–S16. [Google Scholar] [CrossRef] [PubMed]
  3. Pretty, I.A.; Ekstrand, K.R. Detection and monitoring of early caries lesions: A review. Eur. Arch. Paediatr. Dent. 2015, 17, 13–25. [Google Scholar] [CrossRef] [PubMed]
  4. Ismail, A.I.; Sohn, W.; Tellez, M.; Amaya, A.; Sen, A.; Hasson, H.; Pitts, N.B. The International Caries Detection and Assessment System (ICDAS): An integrated system for measuring dental caries. Community Dent. Oral Epidemiol. 2007, 35, 170–178. [Google Scholar] [CrossRef]
  5. Jeong, H.K.; Park, C.; Henao, R.; Kheterpal, M. Deep Learning in Dermatology: A Systematic Review of Current Approaches, Outcomes, and Limitations. JID Innov. 2022, 3, 100150. [Google Scholar] [CrossRef]
  6. Rajpurkar, P.; Irvin, J.; Ball, R.L.; Zhu, K.; Yang, B.; Mehta, H.; Duan, T.; Ding, D.; Bagul, A.; Langlotz, C.P.; et al. Deep learning for chest radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practicing radiologists. PLoS Med. 2018, 15, e1002686. [Google Scholar] [CrossRef]
  7. Ruamviboonsuk, P.; Cheung, C.Y.; Zhang, X.; Raman, R.; Park, S.J.; Ting, D.S.W. Artificial Intelligence in Ophthalmology: Evolutions in Asia. Asia-Pac. J. Ophthalmol. 2020, 9, 78–84. [Google Scholar] [CrossRef]
  8. Estai, M.; Bunt, S.; Kanagasingam, Y.; Kruger, E.; Tennant, M. Diagnostic accuracy of teledentistry in the detection of dental caries: A systematic review. J. Evid. Based Dent. Pract. 2016, 16, 161–172. [Google Scholar] [CrossRef]
  9. Estai, M.; Kanagasingam, Y.; Mehdizadeh, M.; Vignarajan, J.; Norman, R.; Huang, B.; Spallek, H.; Irving, M.; Arora, A.; Kruger, E.; et al. Teledentistry as a novel pathway to improve dental health in school children: A research protocol for a randomised controlled trial. BMC Oral Health 2020, 20, 11. [Google Scholar] [CrossRef]
  10. Schwendicke, F.; Samek, W.; Krois, J. Artificial Intelligence in Dentistry: Chances and Challenges. J. Dent. Res. 2020, 99, 769–774. [Google Scholar] [CrossRef]
  11. Kumar, P.D.M.; Sivakumar, S.; Rajeshwari, S.; Lavanya, C.; Ranganathan, K. Diagnostic efficiency of digital photography and AI-assisted image interpretation in dental caries examination: An umbrella review. J. Oral Biol. Craniofacial Res. 2026, 16, 1–7. [Google Scholar] [CrossRef] [PubMed]
  12. Noor Uddin, A.; Ali, S.A.; Lal, A.; Adnan, N.; Ahmed, S.M.F.; Umer, F. Applications of AI-based deep learning models for detecting dental caries on intraoral images—A systematic review. Evid.-Based Dent. 2025, 26, 71–72. [Google Scholar] [CrossRef] [PubMed]
  13. Yu, A.C.; Mohajer, B.; Eng, J. External Validation of Deep Learning Algorithms for Radiologic Diagnosis: A Systematic Review. Radiol. Artif. Intell. 2022, 4, e210064. [Google Scholar] [CrossRef]
  14. Eke, C.I.; Shuib, L. The role of explainability and transparency in fostering trust in AI healthcare systems: A systematic literature review, open issues and potential solutions. Neural Comput. Appl. 2024, 37, 1999–2034. [Google Scholar] [CrossRef]
  15. Du, G.; Cao, X.; Liang, J.; Chen, X.; Zhan, Y. Medical Image Segmentation based on U-Net: A Review. J. Imaging Sci. Technol. 2020, 64, jist0710. [Google Scholar] [CrossRef]
  16. Khan, S.; Naseer, M.; Hayat, M.; Zamir, S.W.; Khan, F.S.; Shah, M. Transformers in Vision: A Survey. ACM Comput. Surv. (CSUR) 2022, 54, 200. [Google Scholar] [CrossRef]
  17. Singh, N.K.; Raza, K. Medical Image Generation Using Generative Adversarial Networks: A Review. In Studies in Computational Intelligence; Springer: Berlin/Heidelberg, Germany, 2021. [Google Scholar] [CrossRef]
  18. Johnson, J.W. Generative adversarial networks in medical imaging. In State of the Art in Neural Networks and Their Applications; Elsevier: Amsterdam, The Netherlands, 2021. [Google Scholar] [CrossRef]
  19. Qayyum, A.; Tahir, A.; Butt, M.A.; Luke, A.; Abbas, H.T.; Qadir, J.; Arshad, K.; Assaleh, K.; Imran, M.A.; Abbasi, Q.H.; et al. Dental caries detection using a semi-supervised learning approach. Sci. Rep. 2023, 13, 749. [Google Scholar] [CrossRef]
  20. Kaissis, G.A.; Makowski, M.R.; Rückert, D.; Braren, R.F. Secure, privacy-preserving and federated machine learning in medical imaging. Nat. Mach. Intell. 2020, 2, 305–311. [Google Scholar] [CrossRef]
  21. Rischke, R.; Schneider, L.; Müller, K.; Samek, W.; Schwendicke, F.; Krois, J. Federated Learning in Dentistry: Chances and Challenges. J. Dent. Res. 2022, 101, 1269–1273. [Google Scholar] [CrossRef]
  22. Park, E.Y.; Cho, H.; Kang, S.; Jeong, S.; Kim, E.K. Caries detection with tooth surface segmentation on intraoral photographic images using deep learning. BMC Oral Health 2022, 22, 573. [Google Scholar] [CrossRef]
  23. Mehta, L.R.; Borse, M.S.; Tepan, M.; Shah, J. Identifying Suitable Deep Learning Approaches for Dental Caries Detection Using Smartphone Imaging. Int. J. Comput. Methods Exp. Meas. 2024, 12, 251–267. [Google Scholar] [CrossRef]
  24. Sabri, R.K.; Abdulkadir, L.Y.; Khidhir, A.M.; Saleh, H.A. Diagnosing Gingiva Disease Using Artificial Intelligence Techniques. Diyala J. Eng. Sci. 2025, 18, 179–190. [Google Scholar] [CrossRef]
  25. Park, S.; Erkinov, H.; Hasan, M.A.M.; Nam, S.-H.; Kim, Y.-R.; Shin, J.; Chang, W.-D. Periodontal Disease Classification with Color Teeth Images Using Convolutional Neural Networks. Electronics 2023, 12, 1518. [Google Scholar] [CrossRef]
  26. Wen, C.; Bai, X.; Yang, J.; Li, S.; Wang, X.; Yang, D. Deep learning based approach: Automated gingival inflammation grading model using gingival removal strategy. Sci. Rep. 2024, 14, 19780. [Google Scholar] [CrossRef]
  27. Garg, A.; Lu, J.; Maji, A. Towards Earlier Detection of Oral Diseases On Smartphones Using Oral and Dental RGB Images. arXiv 2023, arXiv:2308.15705. [Google Scholar] [CrossRef]
  28. Nantakeeratipat, T.; Apisaksirikul, N.; Boonrojsaree, B.; Boonkijkullatat, S.; Simaphichet, A. Automated machine learning for image-based detection of dental plaque on permanent teeth. Front. Dent. Med. 2024, 5, 1507705. [Google Scholar] [CrossRef]
  29. Zhang, R.; Zhang, L.; Zhang, D.; Wang, Y.; Huang, Y.; Wang, D.; Xu, L. Development and evaluation of a deep learning model for occlusion classification in intraoral photographs. PeerJ 2025, 13, e20140. [Google Scholar] [CrossRef]
  30. Ryu, J.; Lee, Y.-S.; Mo, S.-P.; Lim, K.; Jung, S.-K.; Kim, T.-W. Application of deep learning artificial intelligence technique to the classification of clinical orthodontic photos. BMC Oral Health 2022, 22, 454. [Google Scholar] [CrossRef]
  31. Su, A.-Y.; Wu, M.-L.; Wu, Y.-H. Deep learning system for the differential diagnosis of oral mucosal lesions through clinical photographic imaging. J. Dent. Sci. 2025, 20, 54–60. [Google Scholar] [CrossRef]
  32. Zhang, R.; Lu, M.; Zhang, J.; Chen, X.; Zhu, F.; Tian, X.; Chen, Y.; Cao, Y. Research and Application of Deep Learning Models with Multi-Scale Feature Fusion for Lesion Segmentation in Oral Mucosal Diseases. Bioengineering 2024, 11, 1107. [Google Scholar] [CrossRef]
  33. Tanriver, G.; Tekkesin, M.S.; Ergen, O. Automated Detection and Classification of Oral Lesions Using Deep Learning to Detect Oral Potentially Malignant Disorders. Cancers 2021, 13, 2766. [Google Scholar] [CrossRef]
  34. Vinayahalingam, S.; van Nistelrooij, N.; Rothweiler, R.; Tel, A.; Verhoeven, T.; Tröltzsch, D.; Kesting, M.; Bergé, S.; Xi, T.; Heiland, M.; et al. Advancements in diagnosing oral potentially malignant disorders: Leveraging Vision transformers for multi-class detection. Clin. Oral Investig. 2024, 28, 364. [Google Scholar] [CrossRef] [PubMed]
  35. Warin, K.; Limprasert, W.; Suebnukarn, S.; Jinaporntham, S.; Jantana, P. Performance of deep convolutional neural network for classification and detection of oral potentially malignant disorders in photographic images. Int. J. Oral Maxillofac. Surg. 2022, 51, 699–704. [Google Scholar] [CrossRef] [PubMed]
  36. Talwar, V.; Singh, P.; Mukhia, N.; Shetty, A.; Birur, P.; Desai, K.M.; Sunkavalli, C.; Varma, K.S.; Sethuraman, R.; Jawahar, C.V.; et al. AI-Assisted Screening of Oral Potentially Malignant Disorders Using Smartphone-Based Photographic Images. Cancers 2023, 15, 4120. [Google Scholar] [CrossRef] [PubMed]
  37. Rashid, U.; Javid, A.; Khan, A.R.; Liu, L.; Ahmed, A.; Khalid, O.; Saleem, K.; Meraj, S.; Iqbal, U.; Nawaz, R. A hybrid mask RCNN-based tool to localize dental cavities from real-time mixed photographic images. PeerJ Comput. Sci. 2022, 8, e888. [Google Scholar] [CrossRef]
  38. Ali, D.A.; Sadeeq, H.T. An Interpretable Deep Learning Framework for Multi-Class Dental Disease Classification from Intraoral RGB Images. Stat. Optim. Inf. Comput. 2025, 14, 3380–3397. [Google Scholar] [CrossRef]
  39. Boy, A.F.; Akhyar, A.; Arif, T.Y.; Syahrial, S. Development of an artificial intelligence model based on MobileNetV3 for early detection of dental caries using smartphone images: A preliminary study. Adv. Sci. Technol. Res. J. 2025, 19, 109–116. [Google Scholar] [CrossRef]
  40. Li, S.; Guo, Y.; Pang, Z.; Song, W.; Hao, A.; Xia, B. Automatic Dental Plaque Segmentation Based on Local-to-Global Features Fused Self-Attention Network. IEEE J. Biomed. Health Inform. 2022, 26, 2240–2251. [Google Scholar] [CrossRef]
  41. Patel, A.; Besombes, C.; Dillibabu, T.; Sharma, M.; Tamimi, F.; Ducret, M.; Madathil, S. Attention-guided convolutional network for bias-mitigated and interpretable oral lesion classification. Sci. Rep. 2024, 14, 31700. [Google Scholar] [CrossRef]
  42. Ryu, J.; Kim, Y.-H.; Kim, T.-W.; Jung, S.-K. Evaluation of artificial intelligence model for crowding categorization and extraction diagnosis using intraoral photographs. Sci. Rep. 2023, 13, 5177. [Google Scholar] [CrossRef]
  43. Liu, Y.; Cheng, Y.; Song, Y.; Cai, D.; Zhang, N. Oral screening of dental calculus, gingivitis and dental caries through segmentation on intraoral photographic images using deep learning. BMC Oral Health 2024, 24, 1287. [Google Scholar] [CrossRef]
  44. Jeong, J.-S.; Kim, K.-S.; Gu, Y.; Yoon, D.-H.; Zhang, M.; Wang, L.; Kim, J.-H. Deep learning for automated dental plaque index assessment: Validation against expert evaluations. BMC Oral Health 2025, 25, 1000. [Google Scholar] [CrossRef]
  45. Li, W.; Liang, Y.; Zhang, X.; Liu, C.; He, L.; Miao, L.; Sun, W. A deep learning approach to automatic gingivitis screening based on classification and localization in RGB photos. Sci. Rep. 2021, 11, 16831. [Google Scholar] [CrossRef]
  46. Neumayr, J.; Frenkel, E.; Schwarzmaier, J.; Ammar, N.; Kessler, A.; Schwendicke, F.; Kühnisch, J.; Dujic, H. External validation of an artificial intelligence-based method for the detection and classification of molar incisor hypomineralisation in dental photographs. J. Dent. 2024, 148, 105228. [Google Scholar] [CrossRef]
  47. Duong, D.L.; Kabir, M.H.; Kuo, R.F. Automated caries detection with smartphone color photography using machine learning. Health Inform. J. 2021, 27, 14604582211007530. [Google Scholar] [CrossRef] [PubMed]
  48. Rao, G.K.L.; Srinivasa, A.C.; Iskandar, Y.H.P.; Mokhtar, N. Identification and analysis of photometric points on 2D facial images: A machine learning approach in orthodontics. Health Technol. 2019, 9, 715–724. [Google Scholar] [CrossRef]
  49. Abdulwahhab, A.H.; Mahmood, N.T.; Mohammed, A.A.; Myderrizi, I.; Al-Jumaili, M.H. A Review on Medical Image Applications Based on Deep Learning Techniques. J. Image Graph. 2024, 12, 215–227. [Google Scholar] [CrossRef]
  50. Mienye, I.D.; Swart, T.G.; Obaido, G.; Jordan, M.; Ilono, P. Deep Convolutional Neural Networks in Medical Image Analysis: A Review. Information 2025, 16, 195. [Google Scholar] [CrossRef]
  51. Dai, L.; Zhou, M.; Liu, H. Recent Applications of Convolutional Neural Networks in Medical Data Analysis. In Federated Learning and AI for Healthcare; IGI Global Scientific Publishing: Hershey, PA, USA, 2024. [Google Scholar]
  52. Kühnisch, J.; Meyer, O.; Hesenius, M.; Hickel, R.; Gruhn, V. Caries Detection on Intraoral Images Using Artificial Intelligence. J. Dent. Res. 2022, 101, 158–165. [Google Scholar] [CrossRef]
  53. Srinivasan, S.; Durairaju, K.; Deeba, K.; Mathivanan, S.K.; Karthikeyan, P.; Shah, M.A. Multimodal Biomedical Image Segmentation using Multi-Dimensional U-Convolutional Neural Network. BMC Med. Imaging 2024, 24, 38. [Google Scholar] [CrossRef]
  54. Zhou, Z.; Zhu, J.; Zhang, Y.; Guan, X.; Wang, P.; Li, T. Deep Learning in Dental Image Analysis: A Systematic Review of Datasets, Methodologies, and Emerging Challenges. arXiv 2025, arXiv:2510.20634. [Google Scholar] [CrossRef]
  55. He, K.; Gan, C.; Li, Z.; Rekik, I.; Yin, Z.; Ji, W.; Gao, Y.; Wang, Q.; Zhang, J.; Shen, D. Transformers in medical image analysis. Intell. Med. 2023, 3, 59–78. [Google Scholar] [CrossRef]
  56. Tran, Q.V.; Byeon, H. The Promise of Self-Supervised Learning for Dental Caries. Int. J. Adv. Comput. Sci. Appl. 2023, 14, 57–61. [Google Scholar] [CrossRef]
  57. Taleb, A.; Rohrer, C.; Bergner, B.; Leon, G.D.; Rodrigues, J.A.; Schwendicke, F.; Lippert, C.; Krois, J. Self-Supervised Learning Methods for Label-Efficient Dental Caries Classification. Diagnostics 2022, 12, 1237. [Google Scholar] [CrossRef] [PubMed]
  58. Badano, A.; Revie, C.; Casertano, A.; Cheng, W.-C.; Green, P.; Kimpe, T.; Krupinski, E.; Sisson, C.; Skrøvseth, S.; Treanor, D.; et al. Consistency and Standardization of Color in Medical Imaging: A Consensus Report. J. Digit. Imaging 2014, 28, 41–52. [Google Scholar] [CrossRef]
  59. Yoshimi, Y.; Mine, Y.; Ito, S.; Takeda, S.; Okazaki, S.; Nakamoto, T.; Nagasaki, T.; Kakimoto, N.; Murayama, T.; Tanimoto, K. Image preprocessing with contrast-limited adaptive histogram equalization improves the segmentation performance of deep learning for the articular disk of the temporomandibular joint on magnetic resonance images. Oral Surg. Oral Med. Oral Pathol. Oral Radiol. 2024, 138, 128–141. [Google Scholar] [CrossRef]
  60. Xu, M.; Yoon, S.; Fuentes, A.; Park, D.S. A Comprehensive Survey of Image Augmentation Techniques for Deep Learning. Pattern Recognit. 2023, 137, 109347. [Google Scholar] [CrossRef]
  61. Saincher, R.; Kumar, S.; Gopalkrishna, P.; Maithri, M. Comparison of color accuracy and picture quality of digital SLR, point and shoot and mobile cameras used for dental intraoral photography—A pilot study. Heliyon 2022, 8, e09262. [Google Scholar] [CrossRef]
  62. Lamas-Lara, V.F.; Mattos-Vela, M.A.; Evaristo-Chiyong, T.A.; Guerrero, M.E.; Jiménez-Yano, J.F.; Gómez-Meza, D.N. Validity and reliability of a smartphone-based photographic method for detection of dental caries in adults for use in teledentistry. Front. Oral Health 2025, 6, 1470706. [Google Scholar] [CrossRef]
  63. Li, X.; Zhao, D.; Xie, J.; Wen, H.; Liu, C.; Li, Y.; Li, W.; Wang, S. Deep learning for classifying the stages of periodontitis on dental images: A systematic review and meta-analysis. BMC Oral Health 2023, 23, 1017. [Google Scholar] [CrossRef]
  64. Kocak, B.; Klontzas, M.E.; Stanzione, A.; Meddeb, A.; Demircioğlu, A.; Bluethgen, C.; Bressem, K.K.; Ugga, L.; Mercaldo, N.; Díaz, O.; et al. Evaluation metrics in medical imaging AI: Fundamentals, pitfalls, misapplications, and recommendations. Eur. J. Radiol. Artif. Intell. 2025, 3, 100030. [Google Scholar] [CrossRef]
  65. Adeniran, A.A.; Onebunne, A.P.; William, P. Explainable AI (XAI) in healthcare: Enhancing trust and transparency in critical decision-making. World J. Adv. Res. Rev. 2024, 23, 2647–2658. [Google Scholar] [CrossRef]
  66. Krois, J.; Garcia Cantu, A.; Chaurasia, A.; Patil, R.; Chaudhari, P.K.; Gaudin, R.; Gehrung, S.; Schwendicke, F. Generalizability of deep learning models for dental image analysis. Sci. Rep. 2021, 11, 6102. [Google Scholar] [CrossRef] [PubMed]
  67. Guan, H.; Liu, M. Domain Adaptation for Medical Image Analysis: A Survey. IEEE Trans. Biomed. Eng. 2022, 69, 1173–1185. [Google Scholar] [CrossRef] [PubMed]
  68. Das, A.; Rad, P. Opportunities and Challenges in Explainable Artificial Intelligence (XAI): A Survey. arXiv 2020, arXiv:2006.11371. [Google Scholar] [CrossRef]
  69. Liu, T.-Y.; Lee, K.-H.; Mukundan, A.; Karmakar, R.; Dhiman, H.; Wang, H.-C. AI in Dentistry: Innovations, Ethical Considerations, and Integration Barriers. Bioengineering 2025, 12, 928. [Google Scholar] [CrossRef]
  70. Rajkumar, N.M.R.; Muzoora, M.R.; Thun, S. Dentistry and Interoperability. J. Dent. Res. 2022, 101, 1258–1262. [Google Scholar] [CrossRef]
  71. Haripriya, R.; Khare, N.; Pandey, M. Privacy-preserving federated learning for collaborative medical data mining in multi-institutional settings. Sci. Rep. 2025, 15, 12482. [Google Scholar] [CrossRef]
  72. Kumari, P.; Chauhan, J.; Bozorgpour, A.; Huang, B.; Azad, R.; Merhof, D. Continual learning in medical image analysis: A comprehensive review of recent advancements and future prospects. Med. Image Anal. 2025, 106, 103730. [Google Scholar] [CrossRef]
  73. Al-Zubaidy, D.; Innes, N.; Galloway, J.; Al-Yaseen, W. Evaluating user perceptions and usability of an AI-powered smartphone application for at-home dental plaque screening. Br. Dent. J. 2025, 239, 46–52. [Google Scholar] [CrossRef]
  74. Rokhshad, R.; Ducret, M.; Chaurasia, A.; Karteva, T.; Radenkovic, M.; Roganovic, J.; Hamdan, M.; Mohammad-Rahimi, H.; Krois, J.; Lahoud, P.; et al. Ethical considerations on artificial intelligence in dentistry: A framework and checklist. J. Dent. 2023, 135, 104593. [Google Scholar] [CrossRef]
Figure 1. DL workflow of intraoral photographic image analysis.
Figure 2. PRISMA-inspired flow diagram of the study identification, screening, eligibility assessment, and inclusion process.
Figure 3. Sunburst diagram of dental study types, models, and outcomes.
Figure 4. Visual diagnostic cues across dental disease categories in IOPIs.
Figure 5. Evaluation metrics landscape for Deep Learning intra-oral photographic image analysis.
Figure 6. Future Research Directions in Deep Learning-Based Intraoral Photographic Image (IOPI) Analysis.
Table 1. Performance summary of DL using intraoral photographic images across dental-disease domains.

| Author (Year) | Dental Study | Dataset Source & Size | Imaging Modality | Model Used | Outcomes | Limitations |
|---|---|---|---|---|---|---|
| Park et al., 2022 [22] | Caries detection | KNUDH, 2348 images | Intraoral camera | ResNet18, Faster R-CNN | Accuracy: 0.813; AUC: 0.837; Sensitivity: 0.890 | Limited internal visibility and occult lesions |
| Mehta et al., 2024 [23] | Dental caries | Bharati Vidyapeeth's Dental College, Pune, 1164 images | Intraoral digital RGB images | DenseNet201 | Accuracy: 0.93 | Dataset scarcity, generalizability, and risk of overfitting |
| Sabri et al., 2025 [24] | Gingival diseases | Multihospital, Karnataka, 2270 images | X-ray and intraoral images | MobileNet | Accuracy: 92.7% | Data scarcity, poor interpretability, and clinical limits |
| Park et al., 2023 [25] | Periodontal diseases | GitHub public, 220 images | Optical camera images | YOLOv5s | F1-score: 99.9% | Dataset expansion, synthetic bias, and low applicability |
| Wen et al., 2024 [26] | Gingival inflammation grading | School and Hospital of Stomatology, Wuhan, 8214 images | Digital camera | U-Net with DenseNet encoder | Accuracy: 79.22%; AUC: 0.837; Sensitivity: 83.75%; Specificity: 69.33%; Precision: 0.867 | Limited dataset, regional gap, and image bias |
| Garg et al., 2023 [27] | Dental calculus | Public dataset, 220 images | RGB intraoral images | ResNet34 | Accuracy: 81.82%; Recall: 75.00%; F1-score: 81.82%; Precision: 90.00% | Data demand, training cost, and manual processing |
| Nantakeeratipat et al., 2024 [28] | Dental plaque | Srinakharinwirot University, Bangkok, 299 images | Smartphone camera images | Google Cloud Vertex AI AutoML | Precision: 0.964; F1-score: 0.931; AUPRC: 0.964 | Data limitation, weak generalization, and manual cropping risk |
| Zhang et al., 2025 [29] | Dental occlusion classification | Private dataset, 7200 images | Digital camera IOPIs | Swin Transformer | F1-score: 0.90 (molar occlusion); 0.87 (canine occlusion) | Quality flaws, source dependence, and validation gap |
| Ryu et al., 2022 [30] | Orthodontic photo classification | Seoul National University Dental Hospital, 4448 images | IOPIs | Multi-domain CNN | Accuracy: 99.3% (facial); 99.9% (intraoral photos) | Single dataset, no flip handling |
| Su et al., 2025 [31] | Oral mucosal lesions | National Cheng Kung University Hospital, 506 images | Clinical photographic imaging | CNN | Specificity: 97.0%; Kappa: 0.851; AUC: 0.985 | Dataset scarcity, class imbalance, cross-validation |
| Zhang et al., 2024 [32] | Oral lesion segmentation | Private dataset, 838 images | Intraoral lesion images | SegFormer-B2 Transformer | Dice: 0.710; Precision: 0.886 | Data scarcity, low diversity, weak generalization |
| Tanriver et al., 2021 [33] | OPMD disorders | Combined public dataset, 652 images | White-light photographic images | YOLOv5l, U-Net | Dice: 0.929 (U-Net); AP: 0.855 (YOLOv5l) | Data scarcity, low diversity, and lesion challenge |
| Vinayahalingam et al., 2024 [34] | OPMD detection | Private dataset, 4161 images | Clinical photographs | Mask R-CNN + Swin | F1-score: 0.852, AUC: 0.974; F1-score: 0.796, AUC: 0.938 | Site limitation, low diversity, and label inconsistency |
| Warin et al., 2022 [35] | OPMD detection | Private dataset, 600 images | Digital dental camera | DenseNet-121, ResNet-50, Faster R-CNN | AUC: 95% (DenseNet-121); AUC: 95% (ResNet-50); F1-score: 0.743 (Faster R-CNN) | Data scarcity and risk of overfitting |
| Talwar et al., 2023 [36] | OPMDs | Indian Dental Institute, 2178 images | Intraoral photographic images | DenseNet-201 | F1-score: 0.84 | Inconsistent quality, focus, and angle variation |
| Rashid et al., 2022 [37] | Dental caries | Public dataset | Mixed dental images | Hybrid Mask R-CNN | Accuracy: 78–92% | No explicit limitations stated; annotated datasets |
| Ali & Sadeeq, 2025 [38] | Dental classification | Kaggle multi-dataset | Clinically obtained RGB intraoral images | EfficientNet-B3 | Accuracy: 95.4% (oral diseases); 89.9% (oral infection); 99.3% (teeth dataset) | Class imbalance and low recall in hypodontia |
| Boy et al., 2025 [39] | Dental caries | Private Indonesian clinical dataset, 1200 images | Smartphone images | MobileNetV3 | Accuracy: 90%; Precision: 90%; Sensitivity: 90%; Specificity: 90% | Quality flaws, device variability, and low resolution |
| Li et al., 2022 [40] | Dental plaque | Private dataset, 2884 images | Raw oral endoscope RGB images | ResNet101 | Accuracy: 83.86% | Device variability, imaging inconsistency, equipment variation |
| Patel et al., 2024 [41] | Oral lesions | Private OCPP data, 2765 images | Intraoral images | GAIN + ASP | Accuracy: 75.45%; AUC: 99.7% | No limitations stated |
| Ryu et al., 2023 [42] | Dental crowding severity | Seoul National University Dental Hospital, 2248 images | Intraoral photographs | VGG19 | Accuracy: 0.922 (maxilla); 0.898 (mandible) | Single-center data, weak generalization, quality flaws |
| Liu et al., 2024 [43] | Dental caries, calculus, gingivitis | Private dataset, 3365 images | Intraoral photographic images | Oral-Mamba CNN | Accuracy: 0.83 (gingivitis); 0.83 (caries); 0.81 (calculus) | No explicit limitations stated |
| Jeong et al., 2025 [44] | Dental plaque accumulation | Private dataset, 1094 images | Camera IOPIs | U-Net | Precision: 76.34%; Recall: 65.15%; F1-score: 66.15% | Single dataset, imaging and visualization limits |
| Li et al., 2021 [45] | Gingivitis | Private dataset, 10,000 images | RGB photos | ResNet-50, YOLOv3 | Accuracy: 92.1%; Sensitivity: 91.3%; Specificity: 92.9% | Single-center data, subjective diagnosis, data scarcity |
| Neumayr et al., 2024 [46] | Molar incisor hypomineralisation | Open-source web images, 455 images | IOPIs | AI-based model | Accuracy: 94.3%; Sensitivity: 94.4%; Specificity: 94.2%; AUC: 0.89–0.94 | Heterogeneous images, subjective quality rating, no standard criteria |
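The accuracy, sensitivity, specificity, precision, and F1 values reported throughout Table 1 all derive from the binary confusion matrix. The following is a minimal sketch of those definitions; the counts are illustrative only and are not taken from any reviewed study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn)      # recall / true-positive rate
    specificity = tn / (tn + fp)      # true-negative rate
    precision = tp / (tp + fp)        # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "precision": precision, "f1": f1}

# Hypothetical counts for a caries-detection test set of 200 images
m = binary_metrics(tp=89, fp=11, fn=10, tn=90)
print({k: round(v, 3) for k, v in m.items()})
```

Note that with imbalanced classes (common for rare lesions) accuracy alone can be misleading, which is why many of the reviewed studies also report AUC or F1.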
Table 2. Risk of bias assessment.

| Study | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7 | Overall Bias |
|---|---|---|---|---|---|---|---|---|
| Park et al., 2022 [22] | Low | Moderate | Low | Low | Low | Low | Low | Low |
| Mehta et al., 2024 [23] | Low | Low | High | Moderate | Low | Low | Low | Low |
| Sabri et al., 2025 [24] | Low | Low | Moderate | Low | Moderate | Low | Moderate | Moderate |
| Park et al., 2023 [25] | High | Moderate | Moderate | Moderate | High | Low | High | High |
| Wen et al., 2024 [26] | Moderate | Moderate | Low | Low | Low | Low | Low | Low |
| Garg et al., 2023 [27] | Moderate | Moderate | Low | Low | Low | Low | Low | Low |
| Nantakeeratipat et al., 2024 [28] | Low | Moderate | Low | Low | Low | Low | Low | Low |
| Zhang et al., 2025 [29] | Moderate | Low | High | Low | Low | Low | Low | Low |
| Ryu et al., 2022 [30] | Low | Moderate | Low | High | Low | Low | High | High |
| Su et al., 2025 [31] | Moderate | Moderate | Low | Low | Low | Low | High | Moderate |
| Zhang et al., 2024 [32] | Low | Moderate | High | Low | Low | Low | Low | Low |
| Tanriver et al., 2021 [33] | Moderate | Low | High | Low | Low | Low | Low | Low |
| Vinayahalingam et al., 2024 [34] | Low | Moderate | High | Low | Low | Low | Moderate | Low |
| Warin et al., 2022 [35] | Low | Moderate | High | Low | Low | Low | Low | Low |
| Talwar et al., 2023 [36] | Low | Moderate | High | Low | Low | High | Low | Moderate |
| Rashid et al., 2022 [37] | Low | Moderate | High | Low | Low | Low | Low | Low |
| Ali & Sadeeq, 2025 [38] | Low | Low | Low | Low | Moderate | Low | Moderate | Low |
| Boy et al., 2025 [39] | Low | Moderate | High | Low | Low | Low | Low | Low |
| Li et al., 2022 [40] | Moderate | Moderate | Low | Low | Low | Low | Low | Low |
| Patel et al., 2024 [41] | Low | Moderate | High | Low | Low | Low | Low | Low |
| Ryu et al., 2023 [42] | Moderate | Low | High | Low | Low | Low | Moderate | Low |
| Liu et al., 2024 [43] | Low | Moderate | High | Low | Low | High | Moderate | High |
| Jeong et al., 2025 [44] | Low | Moderate | High | Low | Low | Low | Moderate | Low |
| Li et al., 2021 [45] | Low | Moderate | High | Low | Moderate | Low | Low | Moderate |
| Neumayr et al., 2024 [46] | Low | Low | Low | Low | Low | Low | High | Low |

Overall bias: Low = low risk of bias (high quality); Moderate = moderate risk of bias (low quality); High = high risk of bias (poor quality).
Table 3. Mapping dental diseases to visual cues and DL tasks.

| Disease Category | Key Visual Indicators in IOPIs | Clinical Relevance | Typical DL Tasks | Representative Studies |
|---|---|---|---|---|
| Dental caries | White-spot lesions, cavitation, discoloration | Early prevention of progression | Classification, localization | [3] |
| Gingivitis/periodontitis | Gingival redness, swelling, bleeding | Prevents progression and tooth loss | Classification, grading | [26] |
| Dental plaque | Yellowish biofilm at the gingival margin | Risk factor for caries and gingivitis | Segmentation, quantification | [40] |
| Orthodontic conditions | Crowding, spacing, occlusal imbalance | Treatment planning | Classification, landmark detection | [48] |
| Soft-tissue lesions/OPMDs | White/red patches, ulcers | Early oral cancer screening | Classification, lesion segmentation | [33] |
Table 4. DL architectures used in dental IOPI studies.

| Study (Author, Year) | Imaging Task | DL Architecture | Key Methodological Focus | Ref. |
|---|---|---|---|---|
| Park, 2022 | Tooth-surface caries detection and segmentation | ResNet-based segmentation pipeline | Tooth-surface segmentation before classification | [22] |
| Duong, 2021 | Caries screening | Classical ML/CNN prototype | Feasibility of smartphone photographic ML | [47] |
| Kühnisch, 2022 | Caries detection | CNN ensembles, transfer learning | High-performance caries benchmarking | [52] |
| Li, 2022 | Plaque segmentation | Local-to-global attention U-Net variant | Improved plaque boundary delineation | [40] |
| Nantakeeratipat, 2024 | Plaque detection | Automated ML frameworks | AutoML-based model selection for plaque detection on permanent teeth | [28] |
| Ryu, 2022 | Orthodontic diagnosis | CNN (IOPI classification) | Automated classification of intraoral photos | [30] |
| Vinayahalingam, 2024 | OPMD multi-class detection | Vision Transformer | Multi-class OPMD detection using ViTs | [34] |
| Tanriver, 2021 | Oral lesion detection | CNN-based classifiers | Early automated OPMD detection | [33] |
| Kaissis, 2020; Rischke, 2022 | Federated training context | FL frameworks (FedAvg/FedProx) | Privacy-preserving multi-center training in medical imaging/dentistry | [20,21] |
| Taleb, 2022 | Label-efficient learning | SSL paradigms (SimCLR, BYOL, MoCo) | Label-efficient dental caries classification via self-supervised pretraining | [57] |
Table 5. Summary of DL architectures for analyzing IOPI-based dental disease.

| Model/Architecture | Typical Dental Roles (IOPI) | Strengths | Limitations | Representative Benchmark Studies |
|---|---|---|---|---|
| ResNet | Caries and lesion classification | Strong feature extraction | Limited global context; needs augmentation | [22] |
| EfficientNet/MobileNet | Smartphone screening | Good accuracy-compute tradeoff | Sensitive to IOPI variability | [39] |
| U-Net/DeepLabV3+/Attention U-Net | Plaque, gingiva, lesion segmentation | Accurate boundary localization | Limited global reasoning | [15,53] |
| Vision Transformers | Multi-disease and orthodontic analysis | Strong global context modeling | Data hungry | [34,55] |
| GANs | Data augmentation, color correction | Mitigate class imbalance | Risk of unrealistic samples | [17,18] |
| Self-supervised learning | Label-efficient pretraining | Reduces annotation burden | Sensitive to augmentation | [56,57] |
| Federated learning | Multi-center training | Privacy-preserving, robust | Communication overhead | [20,21] |
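Federated learning, listed above as a privacy-preserving route to multi-center training [20,21], hinges on one simple aggregation step: the server combines locally trained model parameters as a sample-count-weighted average (the FedAvg update), so raw patient images never leave a clinic. A minimal dependency-free sketch, assuming models are represented as flat parameter lists and the clinic names and sizes are hypothetical:

```python
def fedavg(client_weights, client_sizes):
    """FedAvg aggregation: weight each client's parameter vector by its
    local sample count; only parameters, never images, reach the server."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    aggregated = [0.0] * dim
    for weights, n in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            aggregated[i] += (n / total) * w
    return aggregated

# Two hypothetical clinics: the larger one (300 images) dominates the update
global_w = fedavg([[1.0, 2.0], [3.0, 4.0]], client_sizes=[100, 300])
print(global_w)  # → [2.5, 3.5]
```

In practice this loop runs for many communication rounds, with each clinic fine-tuning the returned global model locally before the next aggregation; the communication overhead noted in the table comes from repeatedly exchanging these parameter vectors.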
Table 6. Performance summary of state-of-the-art DL approaches for key IOPI-based dental disease tasks.

| Task | Best Reported Model(s) | Dataset Source | Key Techniques | Reported Performance | Key Reviewed Studies |
|---|---|---|---|---|---|
| Caries detection | ResNet/CNN models | Smartphone IOPIs | Tooth-surface segmentation, CLAHE, ROI cropping | Accuracy ~85–93% | [22,47,52] |
| Gingivitis/periodontal grading | CNN classifiers | Clinical RGB photos | Gingival ROI, color normalization | Accuracy ~85–92% | [25,26,45] |
| Plaque segmentation | Attention U-Net variants | Masked datasets | Color normalization, morphological cleaning | Dice ~0.82–0.95 | [28,40,44] |
| Orthodontic classification | CNNs, ViT/Swin variants | Clinical IOPIs | Alignment/ROI extraction | Accuracy ~90–99% | [29,42] |
| OPMD detection | CNN ensembles, ViTs | Clinical photos | Contrast enhancement | AUC ~0.85–0.96 | [33,35] |
| Multi-center robustness | Federated learning | Multi-center datasets | FL training, domain adaptation | Improved external robustness | [20,21] |
Table 7. Future directions of DL approaches for IOPI-based dental-disease detection.

| Challenge Area | Evidence from Literature | Future Research Directions |
|---|---|---|
| Dataset heterogeneity | Many studies rely on single-center datasets with limited demographic and clinical diversity [10,26], affecting generalizability [60]. | Development of multi-center IOPI datasets with standardized acquisition protocols. |
| Image acquisition variability | Variations in lighting, device type, viewing angle, and saliva artifacts influence model performance [26,47]. | Color normalization, illumination correction, and guided image-capture strategies. |
| Class imbalance and rare conditions | Rare conditions such as early caries and OPMDs are underrepresented in available datasets [26,29]. | Targeted data collection, synthetic augmentation, and self-supervised pretraining. |
| Domain shift and external validation | Performance degradation is commonly reported on external datasets due to domain shift [60]. | Domain adaptation techniques, external validation, and federated learning. |
| Limited explainability | Saliency-based explanations are not always aligned with clinical reasoning [64,71]. | Clinically interpretable explanation frameworks and human-AI studies. |
| Annotation burden | Manual labeling is time-consuming and subject to inter-observer variability [21,31]. | Self-supervised, weakly supervised, and consensus-based annotation methods. |
| Privacy and regulatory constraints | Data-sharing restrictions limit large-scale multi-institutional collaboration [74]. | Privacy-preserving learning and regulatory-aligned AI development. |
| Model reliability and calibration | Confidence estimates are often poorly calibrated for clinical decision support [68]. | Uncertainty-aware modeling and calibration strategies. |
| Clinical workflow integration | Limited deployment due to interoperability and hardware constraints [8,50]. | Lightweight models, interoperable deployment frameworks, and user-centered design. |
| Lack of prospective evaluation | Most studies rely on retrospective analysis. | Prospective and longitudinal clinical validation studies. |
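The calibration challenge in Table 7 can be made concrete with the expected calibration error (ECE), a common diagnostic that bins predictions by confidence and compares each bin's mean confidence with its empirical accuracy. A minimal sketch under the assumption of equal-width bins; the confidence values are hypothetical, not drawn from any reviewed model:

```python
def expected_calibration_error(confidences, correct, n_bins=5):
    """ECE: average per-bin gap between mean confidence and empirical
    accuracy, weighted by the fraction of predictions in each bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # equal-width bin index
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(ok for _, ok in members) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece

# A hypothetical overconfident screening model
confs = [0.95, 0.9, 0.85, 0.6, 0.55]
hits  = [True, True, False, True, False]
print(round(expected_calibration_error(confs, hits), 3))
```

A well-calibrated model yields an ECE near zero; persistent gaps motivate the uncertainty-aware modeling and calibration strategies (e.g., temperature scaling) listed in the table.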
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Mutawa, A.M.; Altarakemah, Y.Y.; Thirupathy, K. Deep Learning Applications for Dental-Disease Classification Using Intraoral Photographic Images: Current Status and Future Perspectives. AI 2026, 7, 85. https://doi.org/10.3390/ai7030085