Review

Deep Learning Applications for Dental-Disease Classification Using Intraoral Photographic Images: Current Status and Future Perspectives

by A. M. Mutawa 1,*, Yacoub Yousef Altarakemah 2 and Karthiga Thirupathy 1

1 Department of Computer Engineering, College of Engineering and Petroleum, Kuwait University, Sabah Al Salem University City, P.O. Box 5969, Safat 13060, Shadadiya, Kuwait
2 Department of Restorative Science, Faculty of Dentistry, Kuwait University, P.O. Box 24923, Safat 13110, Kuwait City, Kuwait
* Author to whom correspondence should be addressed.
Submission received: 30 December 2025 / Revised: 11 February 2026 / Accepted: 12 February 2026 / Published: 2 March 2026

Abstract

Dental conditions, including caries, periodontal disease, plaque accumulation, malocclusion, and oral mucosal abnormalities, remain highly prevalent worldwide. Early detection is crucial for preventing disease progression, simplifying treatment, and improving patient outcomes. Conventional diagnostic methods rely on subjective visual and tactile examinations, which are often inconsistent. Recent advances in deep learning (DL), particularly convolutional neural networks and vision transformers, enable automated, accurate detection of dental diseases from intraoral images captured via smartphones or dedicated imaging devices. DL-driven systems facilitate cost-effective virtual consultations, community screenings, and remote oral health monitoring. This narrative review was conducted following a structured search of PubMed, Scopus, Web of Science, Embase, and Google Scholar (October 2020–October 2025), which identified 74 eligible studies on intraoral photographic imaging-based DL systems, encompassing caries, gingival inflammation, plaque, malocclusion, and soft-tissue lesions. Most studies focused on caries, plaque, and periodontal disease using CNN and U-Net-based models, often reporting accuracies above 85% but with substantial performance drops in external validation. Despite promising results, clinical integration remains limited by challenges such as class imbalance, limited external validation, heterogeneous imaging protocols, and insufficient model interpretability. Emerging approaches, including self-supervised and federated learning, explainable artificial intelligence, multimodal data fusion, and smartphone-based diagnostics, offer potential solutions. Standardized imaging workflows, high-quality annotations, and robust clinical trials are essential to translate DL-based dental diagnostic systems into real-world practice. 
This narrative review aims to guide the development of reliable, equitable, and clinically deployable DL solutions for oral health assessment.

1. Introduction

Dental diseases constitute a major global public health burden, affecting an estimated 3.5 billion people worldwide. Dental caries is the most prevalent non-communicable condition, while periodontal diseases remain a leading cause of tooth loss [1,2]. Early and accurate diagnosis is essential for preventing disease progression, reducing treatment costs, and improving patients’ quality of life. However, conventional diagnostic approaches rely predominantly on visual–tactile examination, which is often insufficiently sensitive and poorly reproducible, particularly in early disease stages [3,4].
Recent advances in artificial intelligence (AI), especially deep learning (DL), have transformed image-based diagnostics across several medical specialties and now achieve expert-level performance in fields such as dermatology [5], radiology [6], and ophthalmology [7]. In dentistry, AI research has primarily concentrated on radiographic modalities such as bitewing radiographs, panoramic imaging, and cone-beam computed tomography. Although these imaging techniques offer valuable structural information, they necessitate specialized equipment and involve ionizing radiation. In contrast, intraoral photographic images (IOPIs), captured using smartphones or intraoral cameras, represent a non-ionizing, cost-effective, and accessible alternative that supports tele-dentistry [8], community-based screening, and remote oral-health monitoring, particularly in low-resource settings [9].
Despite these advantages, the direct translation of DL techniques developed for other medical imaging tasks to dental intraoral photography is non-trivial. Unlike standardized acquisition protocols commonly used in radiology, intraoral photography is characterized by substantial variability in illumination, camera type, viewing angle, saliva presence, reflections, and occlusions. Moreover, many dental pathologies manifest as small, visually subtle lesions, often accompanied by pronounced class imbalance and a lack of large, well-annotated, multicenter datasets. These domain-specific characteristics introduce challenges related to robustness, generalizability, and reproducibility that are not adequately addressed by DL models designed for more standardized imaging environments.
Several reviews have explored artificial intelligence (AI) and deep learning (DL) approaches for dental diagnosis, including systematic and umbrella reviews that primarily focus on caries detection and AI-assisted interpretation of dental images. Schwendicke et al. examined deep learning–based methods for caries detection across various imaging modalities [10], whereas Kumar et al. conducted an umbrella review on AI-assisted caries examination using digital dental photography [11]. Additionally, Noor Uddin et al. summarized DL models developed for caries detection using intraoral images [12]. However, these reviews vary significantly in scope, methodological approach, and reporting depth, often combining radiographic and photographic modalities or concentrating on a single disease entity [13]. As a result, a focused synthesis dedicated exclusively to deep learning applications based on intraoral photographic images across multiple dental conditions remains limited.
Despite promising performance, deep learning (DL) models using intraoral photographic images (IOPIs) continue to face significant challenges that impede their clinical application. These challenges encompass class imbalance across disease categories, reliance on single-center datasets, sensitivity to domain shifts due to heterogeneous acquisition conditions, and limited model explainability, all of which collectively hinder generalizability and clinician trust [14]. Methodologically, research has advanced beyond initial convolutional neural networks (CNNs) to include encoder–decoder architectures such as U-Net for segmentation tasks [15], vision transformers (ViTs) for global context modeling [16], generative adversarial networks (GANs) for data augmentation and imbalance mitigation [17,18], self-supervised learning (SSL) for label-efficient training [19], and federated learning (FL) for privacy-preserving multi-center collaboration [20,21]. However, these methodologies have not yet been systematically integrated within the specific context of IOPI-based dental disease analysis. In this narrative review, deep learning approaches are examined by their primary analytical task: image-level classification, lesion or region detection, pixel-level segmentation, and multi-task learning frameworks that combine two or more of these objectives, applied to intraoral photographic images. Figure 1 depicts the DL workflow for IOPI analysis.
To address this gap, the present review evaluates deep learning architectures applied to intraoral photographic diagnosis, summarizes model performance across major dental pathologies, and examines how preprocessing and augmentation strategies affect model robustness and generalizability. By focusing exclusively on intraoral photographic imaging—rather than radiographic modalities—this review complements existing AI-in-dentistry literature and fills a significant gap in current evaluations. Specifically, this review addresses the following questions:
  • Which deep learning architectures and learning paradigms (e.g., CNNs, encoder–decoder networks, vision transformers, self-supervised learning, and federated learning) demonstrate robust performance for dental disease detection and segmentation using intraoral photographic images?
  • How do preprocessing, data augmentation, and image enhancement strategies influence diagnostic accuracy, robustness, and generalizability across primary dental conditions?
  • What methodological and translational challenges, including class imbalance, single-center dataset bias, domain shift, and limited explainability, most strongly constrain clinical adoption?
  • What emerging research directions and implementation strategies are most promising for improving fairness, interpretability, privacy preservation, and real-world deployment of AI-assisted intraoral photographic diagnostics?
These questions are addressed through: (i) task- and disease-level framing (Section 4); (ii) synthesis of deep learning architectures (Section 5); (iii) review of preprocessing and augmentation strategies (Section 6); (iv) task-wise synthesis of diagnostic performance (Section 7); (v) evaluation metrics and generalizability considerations (Section 8); (vi) analysis of translational and clinical challenges (Section 9); and (vii) identification of emerging research directions (Section 10).

2. Methodology

2.1. Search Strategy

A structured literature search was conducted across major academic databases (PubMed, Scopus, Web of Science, Embase, and Google Scholar) from October 2020 to October 2025. The search strategy combined terms related to AI and deep learning (“deep learning,” “DL,” “convolutional neural network,” “CNN,” “vision transformer,” “machine learning,” “artificial intelligence”) with dental and oral health terms (dental, dentistry, oral, tooth, teeth, “dental caries,” gingivitis, “dental calculus,” periodontitis, “oral lesion,” “oral potentially malignant disorder,” OPMD, “oral cancer”) and with intraoral imaging terms (“intraoral photograph,” “intraoral images,” “intraoral photo,” “oral photograph,” “smartphone image,” “intraoral camera,” “photographic images”). The PRISMA-inspired flow diagram (Figure 2) illustrates the process of study identification, screening, eligibility assessment, and final inclusion in this narrative review.

2.2. Inclusion and Exclusion Criteria

Inclusion was limited to peer-reviewed articles published in English. All retrieved records were exported to EndNote and deduplicated. Studies were then screened through a four-stage process conducted independently to minimize bias. Three reviewers independently assessed all titles, abstracts, and full texts against predefined inclusion and exclusion criteria, removing records that violated any of the following six eligibility requirements: (1) non-English publications, (2) duplicate records, (3) content unrelated to dentistry or AI, (4) conference abstracts, (5) retracted papers, and (6) non-empirical materials, including editorials, case reports, and commentaries.

2.3. Study Selection

The initial database search yielded 194 articles. After duplicates were removed, 172 unique studies remained and were screened by title and abstract. Of these, 112 articles proceeded to full-text review to determine eligibility. The complete manuscripts were analyzed to extract methodologies, dataset characteristics, validation strategies, performance metrics, and reported limitations, and a structured quality assessment and risk-of-bias evaluation were conducted using predefined criteria, ensuring that conclusions rested on complete texts rather than abstract-level information. Applying the inclusion and exclusion criteria yielded 74 articles for the final review. Study selection was strictly guided by the predefined eligibility criteria, with a focus on methodological rigor and relevance; independent screening at both the title/abstract and full-text stages separated preliminary relevance assessment from detailed methodological evaluation, thereby enhancing transparency and reducing selection bias.

2.4. Data Extraction and Synthesis

The following information was extracted from each eligible study:
  • Study information: author, publication year, and journal.
  • Dataset characteristics: type of intraoral images, dataset source, and preprocessing methods.
  • DL model details: architecture (e.g., CNN, ResNet, ViT), transfer-learning methods, and training/validation approaches.
  • Performance metrics: accuracy, sensitivity, specificity, F1-score, and area under the curve (AUC) of the receiver operating characteristic (ROC).
  • External validation: presence or absence of independent testing.
  • Key findings and limitations.
Two reviewers independently conducted data extraction and resolved disagreements through discussion. The extracted data were then qualitatively synthesized to identify the emerging trends, methodological patterns, and gaps in the existing literature.
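To make the extracted performance metrics concrete, the following minimal Python sketch computes them from binary confusion-matrix counts. The counts used in the example are hypothetical and are not drawn from any reviewed study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Common diagnostic metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # recall / true-positive rate
    specificity = tn / (tn + fp)                 # true-negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "accuracy": accuracy, "f1": f1}

# Hypothetical counts for a caries classifier evaluated on 200 images
m = binary_metrics(tp=80, fp=10, tn=95, fn=15)
```

Reporting sensitivity and specificity alongside accuracy matters here because dental datasets are often imbalanced: a model that labels every image "healthy" can score high accuracy while detecting no disease.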

2.5. Quality and Risk of Bias Assessment Criteria

Following data extraction and synthesis, the methodological quality and risk of bias of the included studies were assessed using a predefined seven-domain framework tailored to dental photographic image analysis. Each domain was evaluated based on full-text review and rated as low, moderate, or high risk of bias. An overall risk-of-bias classification was subsequently assigned using predefined, non-overlapping thresholds.
Q1. Is the dataset source of the dental photographic image dataset clearly described (e.g., public/private, single- or multi-center)?
Q2. Is an appropriate strategy reported for splitting the dataset into training, validation, and test sets, with explicit measures to prevent data leakage (e.g., patient-level separation or avoidance of duplicate images across splits)?
Q3. Was the developed model evaluated using external validation or an independent test dataset to assess its generalizability to unseen data?
Q4. Are clinically meaningful performance metrics reported beyond overall accuracy, such as sensitivity, specificity, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC)?
Q5. Are ground-truth labels derived from reliable clinical reference standards, such as expert dental annotation, consensus labeling, or validated diagnostic criteria?
Q6. Does the study explicitly discuss key methodological limitations, potential sources of bias, and constraints affecting the interpretation or generalizability of the results?
Q7. Is image acquisition standardized or quality-controlled, including reporting of camera type, lighting conditions, acquisition protocols, or exclusion of poor-quality images?

Risk of Bias (ROB) Scoring Criteria and Thresholds

Each study was evaluated across seven domains (Q1–Q7) using explicit criteria to ensure transparency and replicability in the risk-of-bias assessment. Studies were categorized as having low, moderate, or high risk of bias based on the thoroughness and rigor of their methodologies. For Q1, which pertains to data source and representativeness, studies were classified as low risk if the dataset source and population characteristics were comprehensively documented, and as higher risk if they relied on single-center data without adequate justification. Q2, which focuses on data splitting and leakage prevention, necessitated transparent reporting of training, validation, and test splits, along with measures to prevent data leakage. Q3, concerning the validation strategy and generalizability, was considered high risk in the absence of external or multicenter validation. Q4, regarding performance reporting transparency, was rated low risk when studies reported multiple clinically relevant metrics beyond mere accuracy. Q5, addressing labeling quality and reference standards, required expert or consensus annotations. Q6, on the reporting of limitations, was based on a clear discussion of methodological constraints. Q7, which evaluates image acquisition standardization and quality control, was assessed using documentation of imaging protocols and quality assurance procedures. Each study was independently evaluated using established decision criteria, and any discrepancies in opinion were resolved through consensus.
An overall risk-of-bias rating for each study was derived from the proportion of domains categorized as low, moderate, or high risk. A study was classified as having minimal risk of bias (high quality) if 70% or more of the evaluated domains were rated low risk, with no more than one high-risk rating in a critical domain. A study was classified as having moderate risk of bias (lower quality) if fewer than 70% of the domains were rated low risk or if high-risk ratings occurred in two or more critical domains. A high-risk-of-bias classification was assigned when half or more of the domains were rated high risk, resulting in the study's exclusion from the qualitative synthesis. The use of explicit criteria and quantitative thresholds was intended to minimize subjective judgment and improve reproducibility.
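The threshold logic above can be expressed as a short function. This is an illustrative sketch, not the authors' exact implementation: the choice of which domains count as critical and the tie-breaking order are assumptions.

```python
def overall_rob(ratings, critical=("Q2", "Q3")):
    """Classify overall risk of bias from per-domain ratings.

    ratings: dict mapping a domain ("Q1".."Q7") to "low" | "moderate" | "high".
    critical: domains treated as critical (illustrative choice, an assumption).
    """
    n = len(ratings)
    low_share = sum(r == "low" for r in ratings.values()) / n
    high_share = sum(r == "high" for r in ratings.values()) / n
    high_critical = sum(ratings.get(d) == "high" for d in critical)

    if high_share >= 0.5:
        return "high"        # excluded from the qualitative synthesis
    if low_share >= 0.7 and high_critical <= 1:
        return "minimal"     # high quality
    return "moderate"        # lower quality

# Example profile: strong reporting overall, but no external validation (Q3)
example = {"Q1": "low", "Q2": "low", "Q3": "high", "Q4": "low",
           "Q5": "low", "Q6": "low", "Q7": "moderate"}
```

Encoding the decision rules this way also makes the non-overlapping nature of the thresholds easy to verify: every combination of ratings maps to exactly one class.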

3. Results

The results of the study selection and subsequent analyses are presented below. The characteristics of the included studies, the quality and risk-of-bias assessment outcomes, and the distribution of evidence across dental applications are summarized in the following sections.

3.1. Summary of Included Studies

The reviewed literature encompasses a wide array of dental applications, including caries detection, assessment of periodontal and gingival diseases, analysis of dental plaque and calculus, evaluation of malocclusion and crowding, screening for oral lesions and potentially malignant disorders, and support for orthodontic diagnostics. The studies use datasets from private clinical collections, institutional repositories, public datasets, and images acquired via community or smartphone cameras, with dataset sizes ranging from a few hundred to over 10,000 images. Imaging modalities primarily consist of professional intraoral cameras, digital clinical photography, and smartphone-based imaging. A variety of deep learning architectures are employed, including convolutional neural networks, U-Net-based segmentation models, object detection frameworks like YOLO and Mask R-CNN, and transformer-based models such as Swin Transformer and SegFormer. Reported outcomes primarily focus on classification, detection, and segmentation performance, using metrics such as accuracy, sensitivity, specificity, F1-score, AUC, Dice coefficient, and mean intersection-over-union. Commonly noted limitations include small or single-center datasets, limited population diversity, variability in image quality, lack of external validation, and restricted generalizability. Table 1 summarizes the key characteristics of studies that explore deep learning-based analysis of intraoral dental photographic images.

3.2. Distribution of Study Characteristics and Evidence

Figure 3 presents a sunburst diagram illustrating the hierarchical distribution of the included studies, categorized by dental application, contributing authors, deep learning model families, and reported performance outcomes. The innermost ring represents the authors and year, while the subsequent concentric rings depict the primary dental conditions investigated using intraoral photographic images, the model architectures, and the key outcome metrics. This visualization provides an integrated overview of how deep learning approaches have been applied across various dental conditions, highlights the diversity of model choices, and summarizes reported performance trends within the current literature.

3.3. Quality and Risk of Bias Assessment

The quality and risk of bias in the included studies were assessed using the predefined seven-domain framework outlined in Section 2.5. Table 2 presents a summary of the overall risk-of-bias classification for each study, assessed across seven predefined methodological domains (Q1–Q7), using explicit decision criteria and quantitative scoring thresholds.
Across the studies reviewed, the overall risk-of-bias profile revealed consistent methodological strengths alongside recurring weaknesses. Most studies clearly identified their data sources and demonstrated satisfactory representativeness (Q1), although some were limited to data from a single center. Strategies for data splitting and preventing leakage (Q2) were generally reported with low to moderate concern. The most significant source of bias was linked to validation strategy and generalizability (Q3), as many studies lacked external or multi-center validation, resulting in high-risk ratings in this area. Conversely, transparency in performance reporting (Q4) was consistently strong, with most studies providing multiple clinically relevant metrics beyond just accuracy. Labeling quality and reference standards (Q5) were primarily rated as low risk, indicating reliance on expert dental annotations or consensus labeling. Limitations (Q6) were often acknowledged, although a small number of studies reported higher risks due to insufficient discussion of methodological constraints. Reporting on image acquisition standardization and quality control (Q7) was variable, with several studies rated moderate to high risk due to limited documentation of imaging protocols or quality assurance procedures. Overall, the main contributors to the elevated risk of bias across the reviewed literature were deficiencies in external validation and inconsistent reporting of image acquisition.

4. Discussion

IOPIs are widely used for documentation, patient monitoring, tele-consultations, and AI-assisted diagnostic applications. In dentistry, IOPIs capture enamel, gingiva, plaque, oral mucosa, and occlusal surfaces. However, image quality is affected by variations in lighting, saliva reflection, camera angle, and device type. Consistency is often improved through noise-reduction and image-enhancement techniques such as contrast-limited adaptive histogram equalization (CLAHE), color-constancy correction, and region-of-interest (ROI) extraction.
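As an illustration of one such consistency step, the sketch below applies a simple gray-world color-constancy correction to an RGB image array. It is a minimal NumPy-only stand-in for the correction pipelines used in the reviewed studies, and the synthetic "yellow cast" image is fabricated for the example.

```python
import numpy as np

def gray_world_correct(rgb):
    """Gray-world color constancy: scale each channel so its mean
    matches the global mean, neutralizing a uniform color cast."""
    rgb = rgb.astype(np.float64)
    channel_means = rgb.reshape(-1, 3).mean(axis=0)   # per-channel means
    gain = channel_means.mean() / channel_means       # per-channel scale factors
    corrected = np.clip(rgb * gain, 0, 255)
    return corrected.astype(np.uint8)

# Synthetic image with a strong yellow cast (high R and G, low B)
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[..., 0], img[..., 1], img[..., 2] = 200, 180, 60
balanced = gray_world_correct(img)
```

The gray-world assumption (that the scene averages to neutral gray) is crude for close-up intraoral scenes dominated by red gingiva, which is why many studies pair it with CLAHE or learned normalization rather than relying on it alone.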

4.1. Detectable Dental Diseases in Intraoral Photographs

4.1.1. Dental Caries

Dental caries is a primary global oral health concern, affecting more than 2 billion individuals and imposing significant clinical and economic burdens [1,2]. Carious lesions exhibit characteristic visual features. Early, non-cavitated lesions typically appear as chalky white spots due to subsurface enamel demineralization and loss of translucency. As the disease advances, lesions may darken to brown or black, develop surface roughness, and ultimately form cavities, primarily in occlusal pits, fissures, and proximal surfaces. IOPIs clearly capture these changes, presenting DL models with diagnostic cues such as localized color variation, texture inconsistencies, and morphological disruption [23]. Although the contrast between healthy enamel and demineralized areas enhances lesion visibility, early lesions may be obscured by lighting variability, saliva reflection, or device-dependent color differences, making them difficult for AI systems to distinguish [47]. Despite these limitations, numerous studies have reported that CNNs and transformer-based architectures detect both cavitated and non-cavitated lesions with high diagnostic performance.

4.1.2. Gingivitis and Periodontal Diseases

Gingivitis and periodontitis produce characteristic soft tissue changes that are readily visible in IOPIs. Gingivitis typically manifests as erythematous, edematous gingiva with reduced stippling and increased bleeding tendency. In a more advanced periodontal disease [25], gingival recession and altered gingival margins may be observed. These color and contour changes serve as reliable visual markers for DL models to classify inflammation severity and estimate periodontal indices [24].
Detection accuracy can be affected by factors such as natural pigmentation differences, variations in gingival biotypes, and inconsistent lighting across the vestibular region. Nevertheless, standardized preprocessing techniques, including color-constancy correction and image normalization, enable DL systems to accurately distinguish healthy from inflamed tissues across diverse populations [26].

4.1.3. Dental Plaque

Dental plaque, a biofilm of bacteria and extracellular matrix, is a primary etiological factor in both caries and periodontal disease. In IOPIs, plaque typically appears as a yellowish-white or opaque film along the gingival margin, within fissures, or in interproximal areas. High-resolution images provide sufficient contrast for DL algorithms to recognize plaque-distribution patterns, even without disclosing agents [27].
Pixel-level segmentation models, such as U-Net [3] and DeepLabV3+, are widely used to quantify plaque coverage and assess oral hygiene. These models demonstrate robust performance despite variations in lighting and plaque translucency. By detecting subtle textural differences between plaque and enamel, they are particularly suited for automated plaque assessment from IOPIs [28]. Recent deep learning methods have also been utilized in dental plaque analysis, including a self-attention-based segmentation framework that combines local-to-global feature representations to accurately and automatically segment plaque from IOPIs [40].
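The plaque-index estimation these models support reduces, at its simplest, to a coverage ratio over the predicted masks. The NumPy sketch below illustrates this with toy masks rather than real segmentation output.

```python
import numpy as np

def plaque_coverage(plaque_mask, tooth_mask):
    """Percentage of tooth-surface pixels labeled as plaque.

    Both inputs are boolean arrays, as produced by thresholding a
    segmentation model's per-pixel predictions.
    """
    tooth_pixels = tooth_mask.sum()
    if tooth_pixels == 0:
        return 0.0
    overlap = np.logical_and(plaque_mask, tooth_mask).sum()
    return 100.0 * overlap / tooth_pixels

# Toy 4x4 example: 8 tooth pixels, 2 of them covered by plaque
tooth = np.zeros((4, 4), dtype=bool); tooth[:2, :] = True
plaque = np.zeros((4, 4), dtype=bool); plaque[0, :2] = True
coverage = plaque_coverage(plaque, tooth)   # 2 of 8 pixels -> 25%
```

Restricting the ratio to the tooth mask matters in practice: without it, background, lips, and gingiva inflate the denominator and systematically understate plaque coverage.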

4.1.4. Orthodontic Conditions

IOPIs capture dental alignment, occlusal relationships, and arch forms, enabling assessment of common orthodontic abnormalities such as crowding, spacing, anterior crossbite, deep bite, and open bite. High-resolution photographs accurately depict tooth angulation, incisal relationships, and midline deviations, allowing DL models to extract both global structural patterns and local tooth-level features.
Transformer-based models, which capture long-range spatial relationships across the dental arch, achieve remarkably accurate multi-landmark detection and occlusal pattern recognition. IOPI-based remote monitoring systems further demonstrate the practical utility of this approach, especially in aligner therapy [29,30]. Machine learning-based methods have also been employed for the automatic identification and analysis of photometric landmarks on two-dimensional facial images, enabling objective facial measurements that support orthodontic diagnosis and treatment planning [48].

4.1.5. Soft-Tissue Lesions and Potentially Malignant Disorders

Soft-tissue lesions, including oral leukoplakia, erythroplakia, candidiasis, aphthous ulcers, and lichen planus, exhibit visually distinctive features in clinical photographs. These may present as white keratotic plaques, bright red velvety patches, ulcerated areas, or reticular patterns, which DL models can effectively leverage for classification. Photographic analysis of potentially malignant disorders (PMDs) [33,34] is particularly valuable given the critical importance of early detection. Although variations in mucosal color, lighting conditions, and saliva can complicate interpretation, CNNs and attention-based models have recently identified suspicious mucosal changes with promising sensitivity [31]. Collectively, these studies indicate that while IOPIs are effective for detecting visually prominent conditions such as dental plaque and gingivitis, diagnostic performance for early-stage caries and subtle mucosal lesions remains constrained by class imbalance and variability in image acquisition, underscoring the need for robust preprocessing strategies and multi-task learning frameworks.
Table 3 synthesizes the relationships among major dental disease categories, their characteristic visual manifestations in intraoral photographs, and the corresponding deep learning tasks. As summarized in Table 3, conditions characterized by intense color or texture contrasts—such as dental plaque and gingival inflammation—are well suited to image-level classification and segmentation tasks, whereas diseases presenting subtle visual cues, including early-stage caries and mucosal lesions, pose greater challenges for automated detection. This task–disease mapping highlights the importance of preprocessing, region-of-interest extraction, and multi-task learning for improving diagnostic performance across heterogeneous dental conditions. Figure 4 provides a visual reference for the diagnostic cues described across disease categories, thereby supporting the interpretability of deep learning feature representations derived from intraoral photographs.

5. Deep Learning Architectures and Applications for Intraoral Photographic Image Analysis

This section integrates the discussion of deep learning architectures with their corresponding applications in IOPI analysis. DL has achieved remarkable success in various image-based medical diagnostic applications. Several model families from mainstream computer-vision research have been adapted to dental imaging: CNNs such as the Visual Geometry Group network (VGG), residual network (ResNet), DenseNet, and Inception [49]; encoder–decoder architectures; ViTs; generative adversarial networks (GANs) [17]; and hybrid models [37]. These early successes in radiographic interpretation provide a foundation for extending DL techniques to photographic intraoral images.

5.1. Convolutional Neural Networks (CNNs)

CNNs remain central to IOPI analysis due to their ability to learn hierarchical spatial representations. Initial convolutional layers capture low-level features such as edges, contrast variations, and textures, while deeper layers encode higher-level semantic patterns, including lesion morphology, plaque distribution, gingival contours, and occlusal relationships [50,51].
Several CNN architectures have been effectively applied in dental diagnostics:
  • ResNet employs skip connections to mitigate vanishing gradients, enabling more profound and more discriminative models [22].
  • DenseNet connects each layer to all subsequent layers, promoting feature reuse and improving gradient flow; it is particularly advantageous for small or imbalanced dental datasets.
  • EfficientNet uses a compound scaling strategy that jointly optimizes network width, depth, and resolution, achieving high accuracy with reduced computational cost, making it suitable for smartphone-based inference.
  • MobileNetV2/V3 is optimized for lightweight deployment and real-time tele-dentistry workflows, facilitating rapid community-based screening [39,48].
CNN-based methods demonstrate robust performance across diverse IOPI tasks, including caries detection [52], caries classification, gingival inflammation classification [45], plaque segmentation [40], OPMD screening, orthodontic assessment [35], and oral lesion identification. Additionally, CNNs form the backbone of hybrid architectures integrating attention modules, multi-task learning, and explainability tools such as Gradient-weighted Class Activation Mapping (Grad-CAM), whose visualizations were confirmed to accurately depict the regions most significant for the model's prediction, reflecting model focus rather than conclusive diagnostic evidence [37]. While CNN classifiers demonstrate high performance on controlled datasets, their accuracy may diminish under variable lighting, diverse acquisition devices, and class imbalance. To mitigate these limitations, robust preprocessing strategies, such as multi-scale feature extraction and contrast-limited adaptive histogram equalization (CLAHE), have been implemented, thereby enhancing the detection of subtle white-spot lesions.
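The hierarchical feature extraction described above begins with simple spatial filters. The NumPy sketch below applies a Sobel-style kernel via the same sliding-window operation a convolutional layer performs, highlighting a vertical intensity edge of the kind an initial CNN layer typically learns to detect (illustrative; not taken from any reviewed model).

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation, as computed by a CNN layer
    (no padding, stride 1, single channel)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (image[i:i + kh, j:j + kw] * kernel).sum()
    return out

# Sobel kernel: responds strongly at vertical intensity edges,
# e.g., the boundary between bright enamel and a darker lesion.
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
img = np.zeros((5, 6)); img[:, 3:] = 1.0   # step edge at column 3
response = conv2d(img, sobel_x)            # peaks at the edge, zero elsewhere
```

Deeper layers stack many such learned filters with nonlinearities, which is how the networks progress from edges and textures to lesion-level morphology.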

5.2. Encoder–Decoder Networks for Segmentation (U-Net and Derivatives)

Precise pixel-level discrimination is essential for tasks such as plaque-region extraction, gingival-margin delineation, mucosal-lesion isolation, and teeth cropping prior to classification. The U-Net architecture remains the benchmark in biomedical image segmentation, as its symmetric encoder–decoder design with skip connections effectively integrates contextual and fine-grained spatial information [53].
Extensions of U-Net have further enhanced performance on challenging intraoral images affected by saliva reflections, uneven illumination, and soft-tissue variability. Variants such as U-Net++, Attention U-Net, DeepLabV3+, and hybrid convolution–transformer models improve boundary precision and robustness in plaque segmentation [43], gingival mapping, lesion delineation, and ROI extraction. These models support automated plaque-index estimation and enable remote monitoring for periodontal health programs [44]. Most state-of-the-art segmentation pipelines for IOPI analysis are based on these encoder–decoder architectures. Attention-based extensions enhance robustness by effectively suppressing background artifacts, such as saliva reflections and non-diagnostic soft tissue.

5.3. Vision Transformers

Leveraging global self-attention mechanisms that model long-range spatial dependencies in dental images, transformer-based models have emerged as powerful alternatives to CNNs, which operate only within localized receptive fields. Vision transformers (ViTs, Swin transformer, DeiT) process images as sequences of patches, capturing both tooth-level and mouth-level contextual relationships [54]. This global modeling capability is particularly advantageous for tasks requiring multi-tooth contextual understanding, such as orthodontic landmark detection, occlusal pattern assessment, and multi-disease classification, in which inter-tooth spatial relationships are diagnostically meaningful. In dental imaging, ViTs have demonstrated high performance in classifying oral potentially malignant disorders (OPMDs) [34] and in multi-view fusion tasks, often capturing global shape and texture patterns more accurately than conventional CNNs. Hybrid CNN–transformer models leverage CNNs for local texture extraction and transformers for contextual reasoning, achieving state-of-the-art results across diverse IOPI applications [55]. However, their primary limitation remains the increased data and computational demands, which are frequently mitigated through pretraining and hybrid model designs.

5.4. Self-Supervised Learning

Self-supervised learning (SSL) techniques, including SimCLR, BYOL, and MoCo, have emerged as highly effective for dental imaging, particularly in the context of limited annotated datasets [19]. SSL leverages contrastive learning and latent-space bootstrapping to extract invariant and discriminative features, enhancing performance in downstream tasks such as lesion detection, classification, and plaque segmentation without extensive manual annotation [56].
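The contrastive objective behind SimCLR-style pretraining can be illustrated with a small, framework-free sketch of the NT-Xent loss (toy two-dimensional embeddings; the function names and values are hypothetical, not a reference implementation):

```python
import math

# Minimal sketch of the NT-Xent (normalized temperature-scaled
# cross-entropy) loss used in SimCLR-style self-supervised learning.
# Each image yields two augmented "views"; the loss pulls a view toward
# its partner and pushes it away from all other views in the batch.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nt_xent(embeddings, temperature=0.5):
    """embeddings: 2N vectors; views 2k and 2k+1 form a positive pair."""
    n = len(embeddings)
    total = 0.0
    for i in range(n):
        j = i + 1 if i % 2 == 0 else i - 1  # index of the positive partner
        sims = [math.exp(cosine(embeddings[i], embeddings[k]) / temperature)
                for k in range(n) if k != i]
        pos = math.exp(cosine(embeddings[i], embeddings[j]) / temperature)
        total += -math.log(pos / sum(sims))
    return total / n

# Two images, two views each: matched views aligned, others orthogonal.
views = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_aligned = nt_xent(views)
# Shuffling the pairing (positives no longer match) raises the loss.
loss_mismatched = nt_xent([views[0], views[2], views[1], views[3]])
```

Minimizing this loss over many unlabeled intraoral images is what yields the invariant, annotation-free representations described above.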
Federated learning (FL) enables multi-center model training without sharing patient data, addressing privacy and data-governance concerns while improving generalizability across devices, demographics, and clinical settings. Integrating SSL and FL provides a synergistic approach that accelerates the development of AI-driven dental diagnostic tools. These advances facilitate scalable deployment of IOPI-based classification systems, supporting clinical implementation, community health initiatives, remote oral health monitoring, and ultimately improving the accessibility and equity of dental care [57].

5.5. Federated Learning

Dental image datasets are often confined to institutional silos, limiting data diversity and raising concerns regarding privacy, governance, and regulation. FL is a decentralized training paradigm in which multiple dental clinics or institutions collaboratively train a shared model without exchanging raw patient data. FL offers several advantages in dentistry, notably mitigating domain shifts arising from variations in imaging devices, patient demographics, and acquisition protocols. Early FL applications in IOPI analysis demonstrated superior external generalization compared with single-center models, highlighting its value for multi-population robustness. Additionally, FL supports continuous model updates within real clinical workflows, enabling models to evolve with newly captured patient images while maintaining strict privacy safeguards. With its scalability, fairness, and privacy-preserving properties, FL is poised to play a pivotal role in deploying real-world dental AI systems [20,21].
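The aggregation step at the core of FL, federated averaging (FedAvg), can be sketched in a few lines; the clinic count, parameter vectors, and cohort sizes below are hypothetical:

```python
# Minimal sketch of federated averaging (FedAvg): each clinic trains
# locally and sends only model weights; the server combines them,
# weighted by local dataset size. Weights are flat lists here.

def fed_avg(client_weights, client_sizes):
    """Size-weighted average of per-client parameter vectors."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[p] * s for w, s in zip(client_weights, client_sizes)) / total
        for p in range(n_params)
    ]

# Three clinics with different amounts of intraoral images.
weights = [[0.2, 1.0], [0.4, 2.0], [0.6, 3.0]]
sizes = [100, 300, 600]  # larger cohorts pull the average harder
global_weights = fed_avg(weights, sizes)
```

In a real deployment the averaged model is broadcast back to the clinics for another local training round, so raw patient images never leave the institution.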

5.6. Generative Adversarial Networks

GANs are increasingly applied in dental imaging for tasks including synthetic image generation, class-imbalance correction, data augmentation, image denoising, color correction, and cross-modal translation [18]. By generating realistic synthetic samples, GANs can augment underrepresented disease categories such as early caries, rare mucosal lesions, and specific orthodontic conditions. Beyond augmentation, GANs enhance image quality by correcting illumination inconsistencies, reducing noise, and standardizing color profiles, thereby improving feature extraction from intraoral photographs. GAN-based cross-modal translation can convert low-quality smartphone images into standardized formats, enhancing diagnostic consistency across imaging sources. Empirical evidence indicates that GAN-augmented datasets increase diversity, mitigate bias toward common presentations, and improve performance in caries detection, lesion segmentation, and plaque analysis. Consequently, GANs have become integral to modern dental AI workflows, particularly when training is limited by data scarcity or heterogeneity. Table 3 presents a study-level synthesis of deep learning applications for intraoral photographic images, mapping diagnostic tasks to their corresponding model architectures. Most classification tasks—such as the identification of dental caries, gingivitis, and soft-tissue lesions—employ convolutional neural network (CNN) backbones, reflecting their effectiveness in capturing localized color and texture features. Segmentation-oriented studies, particularly those focused on dental plaque and gingival regions, predominantly adopt encoder–decoder architectures such as U-Net and its variants, which enable accurate boundary delineation through multi-scale feature integration.
Transformer-based models are more frequently reported in orthodontic assessments and multi-disease classification settings, where modeling global spatial dependencies across the dental arch is diagnostically relevant.
While Table 3 provides a study-level overview, Table 4 presents a consolidated comparison of major deep learning architecture families applied to intraoral photographic image analysis, summarizing their typical use cases, strengths, and limitations. CNN-based models deliver strong baseline performance for disease classification but can struggle to capture long-range contextual dependencies. Encoder–decoder architectures excel in pixel-level segmentation tasks—such as plaque quantification and gingival mapping—due to their ability to preserve spatial detail and integrate multi-scale features. Vision transformers offer enhanced global context modeling and multi-region reasoning, although they often require larger datasets or extensive pretraining. Emerging paradigms, including generative adversarial networks, self-supervised learning, and federated learning, address challenges such as class imbalance, limited annotations, and data privacy constraints. Together, these developments highlight a shift toward more robust, scalable, and clinically applicable AI systems for intraoral photographic diagnostics. Table 5 summarizes the DL architectures for analyzing IOPIs based on dental diseases.

6. Image Preprocessing and Augmentation

Preprocessing is critical for improving the quality, consistency, and diagnostic value of IOPIs, which are often affected by illumination variations, saliva reflections, color distortions, and occlusal angulation. Effective preprocessing minimizes irrelevant variability and highlights diagnostically relevant structures.

6.1. Color Constancy and Normalization

IOPIs captured using smartphones or consumer-grade cameras often exhibit uneven illumination, shadows, and color shifts. Techniques such as gray-world, shades-of-gray, Retinex-based algorithms, and learning-based color-constancy models standardize color appearance across sessions and devices, improving the visibility of white-spot lesions, plaque films, gingival erythema, mucosal abnormalities, and other subtle features [58].
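As a minimal sketch of the gray-world assumption mentioned above (each channel is rescaled so the average scene color becomes neutral gray; pixel values are hypothetical, and production pipelines add gamma handling and more careful clipping):

```python
# Minimal sketch of gray-world color constancy: assume the average
# scene color is neutral gray, so each RGB channel is rescaled by
# (overall mean / channel mean). Pixels are (R, G, B) tuples in [0, 255].

def gray_world(pixels):
    n = len(pixels)
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    gray = sum(means) / 3.0
    gains = [gray / m for m in means]
    return [tuple(min(255.0, p[c] * gains[c]) for c in range(3))
            for p in pixels]

# A warm (reddish) cast typical of incandescent operatory lighting.
warm = [(200, 100, 50), (220, 120, 60), (180, 90, 45)]
balanced = gray_world(warm)
channel_means = [sum(p[c] for p in balanced) / len(balanced)
                 for c in range(3)]
# After correction the three channel means coincide (absent clipping).
```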

6.2. Contrast Enhancement

CLAHE is commonly applied to enhance local contrast, particularly for early caries, plaque boundaries, and soft-tissue textures. CLAHE increases the discriminability of low-contrast regions without excessively amplifying noise, enabling CNNs and transformers to extract more informative features from enamel surfaces and gingival tissues [59].
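The contrast-limiting idea behind CLAHE can be sketched with a clipped global histogram equalization; full CLAHE additionally operates per tile with bilinear blending, as provided by libraries such as OpenCV. The patch values and clip limit below are hypothetical:

```python
# Minimal sketch of the contrast-limiting step at the heart of CLAHE:
# the histogram is clipped at a limit and the excess redistributed
# before equalization, so noise in flat regions is not over-amplified.
# Real CLAHE applies this per tile with bilinear blending.

def clipped_equalize(pixels, clip_limit=40, levels=256):
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Clip the histogram and redistribute the excess uniformly.
    excess = sum(max(0, h - clip_limit) for h in hist)
    hist = [min(h, clip_limit) for h in hist]
    bonus = excess // levels
    hist = [h + bonus for h in hist]
    # Standard histogram equalization via the cumulative distribution.
    total = sum(hist)
    cdf, running = [], 0
    for h in hist:
        running += h
        cdf.append(running / total)
    return [round(cdf[p] * (levels - 1)) for p in pixels]

# A low-contrast enamel patch: values crowded into a narrow band.
flat_patch = [100] * 90 + [101] * 5 + [102] * 5
enhanced = clipped_equalize(flat_patch, clip_limit=40)
# The narrow 100-102 band is stretched across the output range.
```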

6.3. Region-of-Interest Extraction

Automatic ROI extraction reduces background noise and directs the model’s focus to clinically relevant areas. Prior to classification, teeth, gingiva, or oral lesions can be cropped using detection models such as YOLOv5, Faster R-CNN, or SSD. ROI-based pipelines enhance classification accuracy by excluding extraneous regions such as lips, tongue, cheeks, and specular highlights [42].

6.4. Image Augmentation

Image augmentation strategies—including rotation, scaling, horizontal flipping, color jittering, Gaussian noise, elastic deformation, and Random Erasing—expand dataset diversity and mitigate overfitting [60]. Advanced techniques such as GAN-based synthetic data generation and MixUp/CutMix produce realistic variations that improve model generalization, particularly for underrepresented categories such as early lesions or rare soft-tissue abnormalities [18].
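Two of these strategies, horizontal flipping and Random Erasing, can be sketched without any imaging library (toy grayscale grids; real pipelines typically use torchvision or Albumentations):

```python
import random

# Minimal sketch of two common augmentations: horizontal flipping and
# Random Erasing. Images are 2-D lists of grayscale values.

def hflip(image):
    return [row[::-1] for row in image]

def random_erase(image, h, w, fill=0, rng=None):
    """Blank a random h x w rectangle, simulating occlusion."""
    rng = rng or random.Random()
    top = rng.randrange(len(image) - h + 1)
    left = rng.randrange(len(image[0]) - w + 1)
    out = [row[:] for row in image]  # leave the original untouched
    for i in range(top, top + h):
        for j in range(left, left + w):
            out[i][j] = fill
    return out

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
flipped = hflip(img)  # mirrors each row left-to-right
erased = random_erase(img, 2, 2, rng=random.Random(0))
```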

6.5. Standardized Photography Protocols

Despite advances in preprocessing, variability in image acquisition remains a major challenge. Studies have emphasized the importance of standardized protocols that specify a fixed camera distance, controlled lighting, use of retractors, tooth-surface drying, and consistent framing [61]. Clinical guidelines for IOPI capture are essential for enhancing model robustness and supporting future multi-center deployments [62].
Table 6 summarizes the reported performance of representative deep learning models across major intraoral photographic image-based dental diagnostic tasks, including caries, gingivitis, plaque, orthodontic conditions, and soft-tissue lesions. It enables concise comparison of evaluation metrics and highlights persistent performance gaps under external validation.

7. Evaluation Metrics

When evaluating DL systems for IOPI analysis, the metrics must be tailored to classification, detection, and segmentation tasks. Rigorous and consistent evaluation is crucial for translating these systems into clinical practice.

7.1. Classification Metrics

7.1.1. Accuracy

Accuracy, defined as the proportion of correctly classified images among all predictions, is commonly reported but may be insufficient for evaluating imbalanced dental datasets, where conditions like caries are far more prevalent than soft-tissue lesions. For screening applications, metrics such as sensitivity and specificity often provide more clinically meaningful insights than accuracy alone.

7.1.2. Sensitivity and Specificity

Sensitivity (true positive rate) and specificity (true negative rate) measure a model’s ability to correctly identify diseased and healthy cases, respectively. These metrics are critical for detecting caries and gingivitis, as false negatives can lead to missed diagnoses with serious clinical implications.

7.1.3. Precision, Recall, and F1-Score

Precision is the proportion of true positives among all positive predictions, while recall (also called sensitivity) is the proportion of true positives among all actual positive cases. The F1-score, calculated as the harmonic mean of precision and recall, provides a balanced measure that is especially useful for imbalanced datasets. The F1-score is commonly reported in IOPI research, particularly for tasks such as mucosal-lesion classification and multi-disease detection.
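These confusion-matrix-derived metrics can be computed directly; the labels and predictions below are toy values (1 = diseased, 0 = healthy):

```python
# Minimal sketch of the binary classification metrics discussed above,
# computed from a confusion matrix. Illustrative only.

def confusion_counts(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

def classification_metrics(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)       # recall / true positive rate
    specificity = tn / (tn + fp)       # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return dict(accuracy=accuracy, sensitivity=sensitivity,
                specificity=specificity, precision=precision, f1=f1)

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
m = classification_metrics(y_true, y_pred)
# One missed lesion (fn) and two false alarms (fp): accuracy 0.7,
# sensitivity 0.75, yet precision only 0.6.
```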

7.1.4. Area Under the Receiver Operating Characteristic Curve

The AUC evaluates classification performance across varying probability thresholds and is robust to class imbalance. High AUC values (typically 0.85–0.96) are commonly reported in DL-based caries detection, gingivitis classification, and multi-disease screening tasks [10,63].
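The AUC admits a simple rank-based reading: the probability that a randomly chosen diseased case is scored higher than a randomly chosen healthy one (the Mann–Whitney U interpretation). A minimal sketch with hypothetical scores:

```python
# Minimal sketch of AUC as the fraction of diseased/healthy pairs the
# model orders correctly, with ties counted as one half. Illustrative only.

def auc(y_true, scores):
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]  # predicted probabilities
auc_value = auc(y_true, scores)
# 8 of the 9 diseased/healthy pairs are ordered correctly.
```

Because only the ranking of scores matters, this measure is unchanged by monotone rescaling of the probabilities, which is what makes it robust to class imbalance and threshold choice.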

7.2. Segmentation Metrics

Robust evaluation metrics are essential for assessing the spatial accuracy of pixel-level tasks, such as dental plaque and lesion segmentation. The Dice coefficient measures the overlap between predicted segmentation masks and ground-truth annotations, providing sensitivity to small structures. Similarly, Intersection over Union (IoU) evaluates boundary alignment and region matching, reflecting segmentation precision. State-of-the-art architectures, including U-Net and DeepLab, consistently achieve Dice scores above 0.85 on well-preprocessed intraoral datasets, highlighting their effectiveness in fine-grained dental image analysis [22].
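Both overlap metrics can be sketched on flattened binary masks (toy values; for binary masks IoU = Dice / (2 − Dice), so the two always move together):

```python
# Minimal sketch of the Dice coefficient and IoU on binary masks
# (1 = plaque/lesion pixel). Masks are flattened lists; illustrative only.

def dice(pred, truth):
    inter = sum(p and t for p, t in zip(pred, truth))
    return 2 * inter / (sum(pred) + sum(truth))

def iou(pred, truth):
    inter = sum(p and t for p, t in zip(pred, truth))
    union = sum(p or t for p, t in zip(pred, truth))
    return inter / union

truth = [1, 1, 1, 1, 0, 0, 0, 0]
pred  = [1, 1, 1, 0, 1, 0, 0, 0]  # one missed pixel, one false positive
d = dice(pred, truth)   # 2*3 / (4+4) = 0.75
j = iou(pred, truth)    # 3 / 5 = 0.6
```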

7.3. Calibration, Robustness, and Reliability Assessments

Calibration metrics such as the Brier score or expected calibration error are rarely reported but are important for assessing prediction confidence. Robustness testing under varied lighting conditions, low-resolution inputs, and real-world smartphone imagery is increasingly recommended in model reliability assessments [64].
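The two calibration measures named above can be sketched as follows (toy probabilities; the ECE bin count is a free parameter):

```python
# Minimal sketch of the Brier score (mean squared error of predicted
# probabilities) and expected calibration error (ECE, the bin-weighted
# gap between average confidence and observed positive rate).

def brier(y_true, probs):
    n = len(y_true)
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / n

def ece(y_true, probs, n_bins=5):
    bins = [[] for _ in range(n_bins)]
    for t, p in zip(y_true, probs):
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((t, p))
    n = len(y_true)
    total = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(p for _, p in b) / len(b)
        frac_pos = sum(t for t, _ in b) / len(b)
        total += (len(b) / n) * abs(avg_conf - frac_pos)
    return total

y_true = [1, 0, 1, 0]
probs  = [0.9, 0.1, 0.8, 0.3]
b = brier(y_true, probs)
e = ece(y_true, probs)
```

A well-calibrated model drives both values toward zero; a model that is accurate but systematically over-confident can still show a large ECE, which is why accuracy alone does not capture reliability.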

7.4. External Validation and Cross-Domain Generalization

Large-scale external validation remains conspicuously lacking in the current research. The performance of many models declines by 10–30% on data from different institutions or imaging devices, underscoring the need for domain adaptation and FL. External validation on geographically and demographically diverse datasets remains the gold standard for establishing clinical readiness [13].

7.5. Human–Artificial Intelligence Comparison and Clinical Utility

Multiple studies have shown that AI systems can match or exceed human performance in tasks such as caries detection, plaque assessment, and lesion identification. However, clinical utility depends not only on accuracy but also on explainability, robustness, and seamless integration into real-world workflows, including tele-dentistry and mobile health applications [65].

7.6. Interpretation of Evaluation Metrics and Acceptable Performance Ranges

In IOPI-based dental AI, evaluation metrics such as accuracy, sensitivity, specificity, AUC, Dice, and IoU should be interpreted in a task-specific context and with consideration of dataset imbalance; general guidance on metric reporting and pitfalls in medical imaging AI is provided in recent methodological recommendations. For classification tasks, accuracy and AUC values ≥ 0.85 are commonly reported as acceptable for assistive screening under controlled settings, as demonstrated in caries and periodontal or gingival classification studies [25,26]. For segmentation tasks, including plaque and gingival-region delineation, Dice and IoU values ≥ 0.80 are generally considered acceptable and ≥ 0.85 strong, consistent with recent intraoral segmentation studies [11,43]. For photographic soft-tissue and oral potentially malignant disorder (OPMD) screening, high sensitivity (approximately ≥ 0.85) is typically prioritized to reduce missed lesions [33,35]. Across all tasks, reliance on a single metric is insufficient, and complementary metrics together with explicit reporting of internal versus external validation are required for robust and clinically meaningful evaluation [64]. Figure 5 presents a conceptual overview of the evaluation metrics frequently employed in deep learning-based intraoral photographic image (IOPI) analysis. It systematically maps each metric to its respective task type and underscores key interpretational limitations.

8. Challenges and Limitations

Despite notable advancements, the widespread adoption of DL for IOPI analysis is limited by several key challenges, including data quality constraints, limited model robustness, restricted interpretability, privacy and governance concerns, and workflow integration barriers. Addressing these limitations is critical for developing reliable, clinically deployable AI systems in dentistry.

8.1. Dataset Limitations and Class Imbalance

A major challenge in AI-based dental imaging is the limited availability of large, diverse, and well-annotated datasets. Most studies rely on single-center cohorts with narrow demographic representation and inconsistent imaging conditions. Rare conditions, such as early enamel lesions or OPMDs, remain underrepresented, leading to model overfitting on common disease patterns. While augmentation strategies and GAN-based synthetic image generation partially mitigate class imbalance, real-world variability remains a significant obstacle [66].

8.2. Domain Shift and Variability in Image Acquisition

Intraoral photographs exhibit substantial variations in lighting, camera type, occlusal angle, saliva reflections, and operator technique. These factors cause domain shift, wherein models trained on one dataset perform poorly on another. Even minor changes in imaging conditions can markedly reduce segmentation or classification performance. Techniques such as color-constancy correction, standardized acquisition protocols, and domain adaptation provide partial solutions, but multi-center validation remains limited [67].

8.3. Lack of Explainability and Clinical Transparency

CNNs, U-Nets, and vision transformers achieve strong predictive performance, but often operate as opaque “black-box” systems. Most studies rely on Grad-CAM or attention-based visualizations, which remain clinically unverified and frequently emphasize regions unrelated to diagnostic decision-making. To earn the trust of clinicians, models must provide clear, reproducible reasoning pathways and demonstrate consistent interpretability of their automated outputs across diverse disease categories [68].

8.4. Limited External and Prospective Validation

Many models achieve high accuracy on internal test sets resembling their training data, yet performance often declines by 10–30% on independent external datasets. Few studies report prospective validations or evaluate system performance in real-world tele-dentistry environments. Without robust cross-institutional evaluation, these models may lack generalizability across diverse patient demographics, imaging devices, and clinical workflows [46].

8.5. Privacy, Ethics, and Regulatory Constraints

IOPIs contain identifiable biometric features, including tooth morphology and surrounding soft tissues. Data sharing is restricted by regulations such as the GDPR, HIPAA, and institutional policies, limiting the creation of large multi-center datasets. While FL partially addresses these challenges, regulatory oversight of AI-driven dental diagnostic systems remains underdeveloped, and no DL tool based on IOPI analysis has yet received FDA or CE certification [20,69].

8.6. Integration into Clinical Workflow

Despite strong experimental performance, most DL systems have not been adopted in routine dental practice. Barriers include limited interoperability with electronic dental records, unstable internet access in remote settings, hardware constraints in low-resource clinics, and the absence of intuitive, clinician-centered interfaces. Effective integration will require robust deployment frameworks, informative visualization tools, and targeted clinician training programs [70].

9. Future Research Directions

To overcome current limitations and accelerate clinical translation, the following research avenues should be prioritized:

9.1. Development of Large, Multi-Center, Standardized Datasets

Collaborative international datasets with standardized acquisition protocols can enhance model generalizability. Evidence indicates that cross-center training and multi-center imaging repositories substantially reduce bias and strengthen the external validity of dental AI models [66].

9.2. Advanced Learning Paradigms: Self-Supervised Learning, FL, and Multi-Task Models

SSL leverages unlabeled dental images, reducing annotation burden and improving model performance. FL enables privacy-preserving collaboration across clinics, mitigating domain shifts and promoting fairness. Multi-task architectures integrating detection of caries, plaque, gingivitis, orthodontic findings, and soft-tissue lesions have shown promise for simultaneous classification tasks [56,71].

9.3. Improved Explainability and Clinical Interpretability

Clinically validated explainability tools, including hierarchical attention maps, Shapley additive explanation-based attributions, and rule-based overlays linked to diagnostic criteria, are emerging in healthcare AI. Human–AI collaborative studies are essential to evaluate how explainability affects clinical trust and diagnostic confidence [14].

9.4. Robustness, Calibration, and Continual Learning

Models must maintain performance under real-world imaging challenges, including poor lighting, motion blur, and artifacts. Recent robustness-testing frameworks and motion-correction techniques in medical imaging support reliable performance. Calibration methods and continual learning strategies enable models to adapt over time, mitigating domain drift in diagnostic applications [72].

9.5. Integration with Mobile and Tele-Dentistry Platforms

Lightweight architectures, such as MobileNetV3 and EfficientNet-Lite, can be deployed on mobile platforms to expand access to underserved populations. Early studies of mobile AI for plaque detection report only moderate accuracy, emphasizing the need for improved usability and seamless workflow integration [39,73].

9.6. Regulatory Approval Pathways and Ethical Frameworks

Regulatory guidelines for AI applications are evolving rapidly. The FDA’s credibility framework and CE guidelines prioritize safety, transparency, and reliability. Dentistry-specific ethical checklists, emphasizing autonomy, fairness, and data protection, provide essential guidance for the responsible deployment of AI in clinical practice [74]. Figure 6 presents a radial diagram illustrating the key future directions for deep learning-based IOPI analysis. The radial representation highlights the translational advances expected to collectively address current limitations related to data availability, privacy, interpretability, and clinical deployment. Table 7 highlights the key challenges and prospective directions for advancing IOPI-based dental disease detection.

10. Conclusions

DL has transformed IOPI analysis, enabling precise detection of dental caries, gingival disease, plaque accumulation, orthodontic abnormalities, and soft-tissue lesions. CNNs, U-Net variants, and transformer-based architectures have demonstrated strong diagnostic performance, while emerging strategies such as SSL, multi-task learning, and federated collaboration offer further improvements in accuracy, generalizability, and scalability.
Despite these advances, key challenges remain, including limited dataset diversity, domain adaptation, model interpretability, privacy compliance, and clinical integration. Rigorous external validation, adherence to regulatory standards, and seamless workflow incorporation are essential for widespread adoption in dental practice. With ongoing progress in data quality, model architecture, privacy-preserving learning, and mobile platform deployment, AI-driven IOPI analysis is poised to enhance diagnostic precision, expand tele-dentistry services, reduce healthcare disparities, and enable early detection of oral diseases worldwide.

Author Contributions

Conceptualization, A.M.M. and Y.Y.A.; methodology, A.M.M., Y.Y.A. and K.T.; writing—original draft preparation, A.M.M. and K.T.; writing—review and editing, A.M.M., Y.Y.A. and K.T.; project administration, resource management, supervision, and funding acquisition, A.M.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Kuwait Foundation for the Advancement of Sciences (KFAS) (Research Grant No PN2313NR2019).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

The authors would like to thank Kuwait University and Kuwait Foundation for the Advancement of Sciences (KFAS) for providing the support and resources necessary for the completion of this research.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
IOPI: Intraoral Photographic Image
DL: Deep Learning
CNN: Convolutional Neural Network
DSLR: Digital Single-Lens Reflex
HIPAA: Health Insurance Portability and Accountability Act

References

  1. Tonetti, M.S.; Jepsen, S.; Jin, L.; Otomo-Corgel, J. Impact of the global burden of periodontal diseases on health, nutrition and wellbeing of mankind: A call for global action. J. Clin. Periodontol. 2017, 44, 456–462. [Google Scholar] [CrossRef] [PubMed]
  2. Lang, N.P.; Bartold, P.M. Periodontal health. J. Periodontol. 2018, 89, S9–S16. [Google Scholar] [CrossRef] [PubMed]
  3. Pretty, I.A.; Ekstrand, K.R. Detection and monitoring of early caries lesions: A review. Eur. Arch. Paediatr. Dent. 2015, 17, 13–25. [Google Scholar] [CrossRef] [PubMed]
  4. Ismail, A.I.; Sohn, W.; Tellez, M.; Amaya, A.; Sen, A.; Hasson, H.; Pitts, N.B. The International Caries Detection and Assessment System (ICDAS): An integrated system for measuring dental caries. Community Dent. Oral Epidemiol. 2007, 35, 170–178. [Google Scholar] [CrossRef]
  5. Jeong, H.K.; Park, C.; Henao, R.; Kheterpal, M. Deep Learning in Dermatology: A Systematic Review of Current Approaches, Outcomes, and Limitations. JID Innov. 2022, 3, 100150. [Google Scholar] [CrossRef]
  6. Rajpurkar, P.; Irvin, J.; Ball, R.L.; Zhu, K.; Yang, B.; Mehta, H.; Duan, T.; Ding, D.; Bagul, A.; Langlotz, C.P.; et al. Deep learning for chest radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practicing radiologists. PLoS Med. 2018, 15, e1002686. [Google Scholar] [CrossRef]
  7. Ruamviboonsuk, P.; Cheung, C.Y.; Zhang, X.; Raman, R.; Park, S.J.; Ting, D.S.W. Artificial Intelligence in Ophthalmology: Evolutions in Asia. Asia-Pac. J. Ophthalmol. 2020, 9, 78–84. [Google Scholar] [CrossRef]
  8. Estai, M.; Bunt, S.; Kanagasingam, Y.; Kruger, E.; Tennant, M. Diagnostic accuracy of teledentistry in the detection of dental caries: A systematic review. J. Evid. Based Dent. Pract. 2016, 16, 161–172. [Google Scholar] [CrossRef]
  9. Estai, M.; Kanagasingam, Y.; Mehdizadeh, M.; Vignarajan, J.; Norman, R.; Huang, B.; Spallek, H.; Irving, M.; Arora, A.; Kruger, E.; et al. Teledentistry as a novel pathway to improve dental health in school children: A research protocol for a randomised controlled trial. BMC Oral Health 2020, 20, 11. [Google Scholar] [CrossRef]
  10. Schwendicke, F.; Samek, W.; Krois, J. Artificial Intelligence in Dentistry: Chances and Challenges. J. Dent. Res. 2020, 99, 769–774. [Google Scholar] [CrossRef]
  11. Kumar, P.D.M.; Sivakumar, S.; Rajeshwari, S.; Lavanya, C.; Ranganathan, K. Diagnostic efficiency of digital photography and AI-assisted image interpretation in dental caries examination: An umbrella review. J. Oral Biol. Craniofacial Res. 2026, 16, 1–7. [Google Scholar] [CrossRef] [PubMed]
  12. Noor Uddin, A.; Ali, S.A.; Lal, A.; Adnan, N.; Ahmed, S.M.F.; Umer, F. Applications of AI-based deep learning models for detecting dental caries on intraoral images—A systematic review. Evid.-Based Dent. 2025, 26, 71–72. [Google Scholar] [CrossRef] [PubMed]
  13. Yu, A.C.; Mohajer, B.; Eng, J. External Validation of Deep Learning Algorithms for Radiologic Diagnosis: A Systematic Review. Radiol. Artif. Intell. 2022, 4, e210064. [Google Scholar] [CrossRef]
  14. Eke, C.I.; Shuib, L. The role of explainability and transparency in fostering trust in AI healthcare systems: A systematic literature review, open issues and potential solutions. Neural Comput. Appl. 2024, 37, 1999–2034. [Google Scholar] [CrossRef]
  15. Du, G.; Cao, X.; Liang, J.; Chen, X.; Zhan, Y. Medical Image Segmentation based on U-Net: A Review. J. Imaging Sci. Technol. 2020, 64, jist0710. [Google Scholar] [CrossRef]
  16. Khan, S.; Naseer, M.; Hayat, M.; Zamir, S.W.; Khan, F.S.; Shah, M. Transformers in Vision: A Survey. ACM Comput. Surv. (CSUR) 2022, 54, 200. [Google Scholar] [CrossRef]
  17. Singh, N.K.; Raza, K. Medical Image Generation Using Generative Adversarial Networks: A Review. In Studies in Computational Intelligence; Springer: Berlin/Heidelberg, Germany, 2021. [Google Scholar] [CrossRef]
  18. Johnson, J.W. Generative adversarial networks in medical imaging. In State of the Art in Neural Networks and Their Applications; Elsevier: Amsterdam, The Netherlands, 2021. [Google Scholar] [CrossRef]
  19. Qayyum, A.; Tahir, A.; Butt, M.A.; Luke, A.; Abbas, H.T.; Qadir, J.; Arshad, K.; Assaleh, K.; Imran, M.A.; Abbasi, Q.H.; et al. Dental caries detection using a semi-supervised learning approach. Sci. Rep. 2023, 13, 749. [Google Scholar] [CrossRef]
  20. Kaissis, G.A.; Makowski, M.R.; Rückert, D.; Braren, R.F. Secure, privacy-preserving and federated machine learning in medical imaging. Nat. Mach. Intell. 2020, 2, 305–311. [Google Scholar] [CrossRef]
  21. Rischke, R.; Schneider, L.; Müller, K.; Samek, W.; Schwendicke, F.; Krois, J. Federated Learning in Dentistry: Chances and Challenges. J. Dent. Res. 2022, 101, 1269–1273. [Google Scholar] [CrossRef]
  22. Park, E.Y.; Cho, H.; Kang, S.; Jeong, S.; Kim, E.K. Caries detection with tooth surface segmentation on intraoral photographic images using deep learning. BMC Oral Health 2022, 22, 573. [Google Scholar] [CrossRef]
  23. Mehta, L.R.; Borse, M.S.; Tepan, M.; Shah, J. Identifying Suitable Deep Learning Approaches for Dental Caries Detection Using Smartphone Imaging. Int. J. Comput. Methods Exp. Meas. 2024, 12, 251–267. [Google Scholar] [CrossRef]
  24. Sabri, R.K.; Abdulkadir, L.Y.; Khidhir, A.M.; Saleh, H.A. Diagnosing Gingiva Disease Using Artificial Intelligence Techniques. Diyala J. Eng. Sci. 2025, 18, 179–190. [Google Scholar] [CrossRef]
  25. Park, S.; Erkinov, H.; Hasan, M.A.M.; Nam, S.-H.; Kim, Y.-R.; Shin, J.; Chang, W.-D. Periodontal Disease Classification with Color Teeth Images Using Convolutional Neural Networks. Electronics 2023, 12, 1518. [Google Scholar] [CrossRef]
  26. Wen, C.; Bai, X.; Yang, J.; Li, S.; Wang, X.; Yang, D. Deep learning based approach: Automated gingival inflammation grading model using gingival removal strategy. Sci. Rep. 2024, 14, 19780. [Google Scholar] [CrossRef]
  27. Garg, A.; Lu, J.; Maji, A. Towards Earlier Detection of Oral Diseases On Smartphones Using Oral and Dental RGB Images. arXiv 2023, arXiv:2308.15705. [Google Scholar] [CrossRef]
  28. Nantakeeratipat, T.; Apisaksirikul, N.; Boonrojsaree, B.; Boonkijkullatat, S.; Simaphichet, A. Automated machine learning for image-based detection of dental plaque on permanent teeth. Front. Dent. Med. 2024, 5, 1507705. [Google Scholar] [CrossRef]
  29. Zhang, R.; Zhang, L.; Zhang, D.; Wang, Y.; Huang, Y.; Wang, D.; Xu, L. Development and evaluation of a deep learning model for occlusion classification in intraoral photographs. PeerJ 2025, 13, e20140. [Google Scholar] [CrossRef]
  30. Ryu, J.; Lee, Y.-S.; Mo, S.-P.; Lim, K.; Jung, S.-K.; Kim, T.-W. Application of deep learning artificial intelligence technique to the classification of clinical orthodontic photos. BMC Oral Health 2022, 22, 454. [Google Scholar] [CrossRef]
  31. Su, A.-Y.; Wu, M.-L.; Wu, Y.-H. Deep learning system for the differential diagnosis of oral mucosal lesions through clinical photographic imaging. J. Dent. Sci. 2025, 20, 54–60. [Google Scholar] [CrossRef]
  32. Zhang, R.; Lu, M.; Zhang, J.; Chen, X.; Zhu, F.; Tian, X.; Chen, Y.; Cao, Y. Research and Application of Deep Learning Models with Multi-Scale Feature Fusion for Lesion Segmentation in Oral Mucosal Diseases. Bioengineering 2024, 11, 1107. [Google Scholar] [CrossRef]
  33. Tanriver, G.; Tekkesin, M.S.; Ergen, O. Automated Detection and Classification of Oral Lesions Using Deep Learning to Detect Oral Potentially Malignant Disorders. Cancers 2021, 13, 2766. [Google Scholar] [CrossRef]
  34. Vinayahalingam, S.; van Nistelrooij, N.; Rothweiler, R.; Tel, A.; Verhoeven, T.; Tröltzsch, D.; Kesting, M.; Bergé, S.; Xi, T.; Heiland, M.; et al. Advancements in diagnosing oral potentially malignant disorders: Leveraging Vision transformers for multi-class detection. Clin. Oral Investig. 2024, 28, 364. [Google Scholar] [CrossRef] [PubMed]
  35. Warin, K.; Limprasert, W.; Suebnukarn, S.; Jinaporntham, S.; Jantana, P. Performance of deep convolutional neural network for classification and detection of oral potentially malignant disorders in photographic images. Int. J. Oral Maxillofac. Surg. 2022, 51, 699–704. [Google Scholar] [CrossRef] [PubMed]
  36. Talwar, V.; Singh, P.; Mukhia, N.; Shetty, A.; Birur, P.; Desai, K.M.; Sunkavalli, C.; Varma, K.S.; Sethuraman, R.; Jawahar, C.V.; et al. AI-Assisted Screening of Oral Potentially Malignant Disorders Using Smartphone-Based Photographic Images. Cancers 2023, 15, 4120. [Google Scholar] [CrossRef] [PubMed]
  37. Rashid, U.; Javid, A.; Khan, A.R.; Liu, L.; Ahmed, A.; Khalid, O.; Saleem, K.; Meraj, S.; Iqbal, U.; Nawaz, R. A hybrid mask RCNN-based tool to localize dental cavities from real-time mixed photographic images. PeerJ Comput. Sci. 2022, 8, e888. [Google Scholar] [CrossRef]
  38. Ali, D.A.; Sadeeq, H.T. An Interpretable Deep Learning Framework for Multi-Class Dental Disease Classification from Intraoral RGB Images. Stat. Optim. Inf. Comput. 2025, 14, 3380–3397. [Google Scholar] [CrossRef]
  39. Boy, A.F.; Akhyar, A.; Arif, T.Y.; Syahrial, S. Development of an artificial intelligence model based on MobileNetV3 for early detection of dental caries using smartphone images: A preliminary study. Adv. Sci. Technol. Res. J. 2025, 19, 109–116. [Google Scholar] [CrossRef]
  40. Li, S.; Guo, Y.; Pang, Z.; Song, W.; Hao, A.; Xia, B. Automatic Dental Plaque Segmentation Based on Local-to-Global Features Fused Self-Attention Network. IEEE J. Biomed. Health Inform. 2022, 26, 2240–2251. [Google Scholar] [CrossRef]
  41. Patel, A.; Besombes, C.; Dillibabu, T.; Sharma, M.; Tamimi, F.; Ducret, M.; Madathil, S. Attention-guided convolutional network for bias-mitigated and interpretable oral lesion classification. Sci. Rep. 2024, 14, 31700. [Google Scholar] [CrossRef]
  42. Ryu, J.; Kim, Y.-H.; Kim, T.-W.; Jung, S.-K. Evaluation of artificial intelligence model for crowding categorization and extraction diagnosis using intraoral photographs. Sci. Rep. 2023, 13, 5177. [Google Scholar] [CrossRef]
  43. Liu, Y.; Cheng, Y.; Song, Y.; Cai, D.; Zhang, N. Oral screening of dental calculus, gingivitis and dental caries through segmentation on intraoral photographic images using deep learning. BMC Oral Health 2024, 24, 1287. [Google Scholar] [CrossRef]
  44. Jeong, J.-S.; Kim, K.-S.; Gu, Y.; Yoon, D.-H.; Zhang, M.; Wang, L.; Kim, J.-H. Deep learning for automated dental plaque index assessment: Validation against expert evaluations. BMC Oral Health 2025, 25, 1000. [Google Scholar] [CrossRef]
  45. Li, W.; Liang, Y.; Zhang, X.; Liu, C.; He, L.; Miao, L.; Sun, W. A deep learning approach to automatic gingivitis screening based on classification and localization in RGB photos. Sci. Rep. 2021, 11, 16831. [Google Scholar] [CrossRef]
  46. Neumayr, J.; Frenkel, E.; Schwarzmaier, J.; Ammar, N.; Kessler, A.; Schwendicke, F.; Kühnisch, J.; Dujic, H. External validation of an artificial intelligence-based method for the detection and classification of molar incisor hypomineralisation in dental photographs. J. Dent. 2024, 148, 105228. [Google Scholar] [CrossRef]
  47. Duong, D.L.; Kabir, M.H.; Kuo, R.F. Automated caries detection with smartphone color photography using machine learning. Health Inform. J. 2021, 27, 14604582211007530. [Google Scholar] [CrossRef] [PubMed]
  48. Rao, G.K.L.; Srinivasa, A.C.; Iskandar, Y.H.P.; Mokhtar, N. Identification and analysis of photometric points on 2D facial images: A machine learning approach in orthodontics. Health Technol. 2019, 9, 715–724. [Google Scholar] [CrossRef]
  49. Abdulwahhab, A.H.; Mahmood, N.T.; Mohammed, A.A.; Myderrizi, I.; Al-Jumaili, M.H. A Review on Medical Image Applications Based on Deep Learning Techniques. J. Image Graph. 2024, 12, 215–227. [Google Scholar] [CrossRef]
  50. Mienye, I.D.; Swart, T.G.; Obaido, G.; Jordan, M.; Ilono, P. Deep Convolutional Neural Networks in Medical Image Analysis: A Review. Information 2025, 16, 195. [Google Scholar] [CrossRef]
  51. Dai, L.; Zhou, M.; Liu, H. Recent Applications of Convolutional Neural Networks in Medical Data Analysis. In Federated Learning and AI for Healthcare; IGI Global Scientific Publishing: Hershey, PA, USA, 2024. [Google Scholar]
  52. Kühnisch, J.; Meyer, O.; Hesenius, M.; Hickel, R.; Gruhn, V. Caries Detection on Intraoral Images Using Artificial Intelligence. J. Dent. Res. 2022, 101, 158–165. [Google Scholar] [CrossRef]
  53. Srinivasan, S.; Durairaju, K.; Deeba, K.; Mathivanan, S.K.; Karthikeyan, P.; Shah, M.A. Multimodal Biomedical Image Segmentation using Multi-Dimensional U-Convolutional Neural Network. BMC Med. Imaging 2024, 24, 38. [Google Scholar] [CrossRef]
  54. Zhou, Z.; Zhu, J.; Zhang, Y.; Guan, X.; Wang, P.; Li, T. Deep Learning in Dental Image Analysis: A Systematic Review of Datasets, Methodologies, and Emerging Challenges. arXiv 2025, arXiv:2510.20634. [Google Scholar] [CrossRef]
  55. He, K.; Gan, C.; Li, Z.; Rekik, I.; Yin, Z.; Ji, W.; Gao, Y.; Wang, Q.; Zhang, J.; Shen, D. Transformers in medical image analysis. Intell. Med. 2023, 3, 59–78. [Google Scholar] [CrossRef]
  56. Tran, Q.V.; Byeon, H. The Promise of Self-Supervised Learning for Dental Caries. Int. J. Adv. Comput. Sci. Appl. 2023, 14, 57–61. [Google Scholar] [CrossRef]
  57. Taleb, A.; Rohrer, C.; Bergner, B.; Leon, G.D.; Rodrigues, J.A.; Schwendicke, F.; Lippert, C.; Krois, J. Self-Supervised Learning Methods for Label-Efficient Dental Caries Classification. Diagnostics 2022, 12, 1237. [Google Scholar] [CrossRef] [PubMed]
  58. Badano, A.; Revie, C.; Casertano, A.; Cheng, W.-C.; Green, P.; Kimpe, T.; Krupinski, E.; Sisson, C.; Skrøvseth, S.; Treanor, D.; et al. Consistency and Standardization of Color in Medical Imaging: A Consensus Report. J. Digit. Imaging 2014, 28, 41–52. [Google Scholar] [CrossRef]
  59. Yoshimi, Y.; Mine, Y.; Ito, S.; Takeda, S.; Okazaki, S.; Nakamoto, T.; Nagasaki, T.; Kakimoto, N.; Murayama, T.; Tanimoto, K. Image preprocessing with contrast-limited adaptive histogram equalization improves the segmentation performance of deep learning for the articular disk of the temporomandibular joint on magnetic resonance images. Oral Surg. Oral Med. Oral Pathol. Oral Radiol. 2024, 138, 128–141. [Google Scholar] [CrossRef]
  60. Xu, M.; Yoon, S.; Fuentes, A.; Park, D.S. A Comprehensive Survey of Image Augmentation Techniques for Deep Learning. Pattern Recognit. 2023, 137, 109347. [Google Scholar] [CrossRef]
  61. Saincher, R.; Kumar, S.; Gopalkrishna, P.; Maithri, M. Comparison of color accuracy and picture quality of digital SLR, point and shoot and mobile cameras used for dental intraoral photography—A pilot study. Heliyon 2022, 8, e09262. [Google Scholar] [CrossRef]
  62. Lamas-Lara, V.F.; Mattos-Vela, M.A.; Evaristo-Chiyong, T.A.; Guerrero, M.E.; Jiménez-Yano, J.F.; Gómez-Meza, D.N. Validity and reliability of a smartphone-based photographic method for detection of dental caries in adults for use in teledentistry. Front. Oral Health 2025, 6, 1470706. [Google Scholar] [CrossRef]
  63. Li, X.; Zhao, D.; Xie, J.; Wen, H.; Liu, C.; Li, Y.; Li, W.; Wang, S. Deep learning for classifying the stages of periodontitis on dental images: A systematic review and meta-analysis. BMC Oral Health 2023, 23, 1017. [Google Scholar] [CrossRef]
  64. Kocak, B.; Klontzas, M.E.; Stanzione, A.; Meddeb, A.; Demircioğlu, A.; Bluethgen, C.; Bressem, K.K.; Ugga, L.; Mercaldo, N.; Díaz, O.; et al. Evaluation metrics in medical imaging AI: Fundamentals, pitfalls, misapplications, and recommendations. Eur. J. Radiol. Artif. Intell. 2025, 3, 100030. [Google Scholar] [CrossRef]
  65. Adeniran, A.A.; Onebunne, A.P.; William, P. Explainable AI (XAI) in healthcare: Enhancing trust and transparency in critical decision-making. World J. Adv. Res. Rev. 2024, 23, 2647–2658. [Google Scholar] [CrossRef]
  66. Krois, J.; Garcia Cantu, A.; Chaurasia, A.; Patil, R.; Chaudhari, P.K.; Gaudin, R.; Gehrung, S.; Schwendicke, F. Generalizability of deep learning models for dental image analysis. Sci. Rep. 2021, 11, 6102. [Google Scholar] [CrossRef] [PubMed]
  67. Guan, H.; Liu, M. Domain Adaptation for Medical Image Analysis: A Survey. IEEE Trans. Biomed. Eng. 2022, 69, 1173–1185. [Google Scholar] [CrossRef] [PubMed]
  68. Das, A.; Rad, P. Opportunities and Challenges in Explainable Artificial Intelligence (XAI): A Survey. arXiv 2020, arXiv:2006.11371. [Google Scholar] [CrossRef]
  69. Liu, T.-Y.; Lee, K.-H.; Mukundan, A.; Karmakar, R.; Dhiman, H.; Wang, H.-C. AI in Dentistry: Innovations, Ethical Considerations, and Integration Barriers. Bioengineering 2025, 12, 928. [Google Scholar] [CrossRef]
  70. Rajkumar, N.M.R.; Muzoora, M.R.; Thun, S. Dentistry and Interoperability. J. Dent. Res. 2022, 101, 1258–1262. [Google Scholar] [CrossRef]
  71. Haripriya, R.; Khare, N.; Pandey, M. Privacy-preserving federated learning for collaborative medical data mining in multi-institutional settings. Sci. Rep. 2025, 15, 12482. [Google Scholar] [CrossRef]
  72. Kumari, P.; Chauhan, J.; Bozorgpour, A.; Huang, B.; Azad, R.; Merhof, D. Continual learning in medical image analysis: A comprehensive review of recent advancements and future prospects. Med. Image Anal. 2025, 106, 103730. [Google Scholar] [CrossRef]
  73. Al-Zubaidy, D.; Innes, N.; Galloway, J.; Al-Yaseen, W. Evaluating user perceptions and usability of an AI-powered smartphone application for at-home dental plaque screening. Br. Dent. J. 2025, 239, 46–52. [Google Scholar] [CrossRef]
  74. Rokhshad, R.; Ducret, M.; Chaurasia, A.; Karteva, T.; Radenkovic, M.; Roganovic, J.; Hamdan, M.; Mohammad-Rahimi, H.; Krois, J.; Lahoud, P.; et al. Ethical considerations on artificial intelligence in dentistry: A framework and checklist. J. Dent. 2023, 135, 104593. [Google Scholar] [CrossRef]
Figure 1. DL workflow of intraoral photographic image analysis.
Figure 2. PRISMA-inspired flow diagram of the study identification, screening, eligibility assessment, and inclusion process.
Figure 3. Sunburst diagram of dental study types, models, and outcomes.
Figure 4. Visual diagnostic cues across dental disease categories in IOPIs.
Figure 5. Evaluation metrics landscape for Deep Learning intra-oral photographic image analysis.
Figure 6. Future Research Directions in Deep Learning-Based Intraoral Photographic Image (IOPI) Analysis.
Table 1. Performance summary of DL using intraoral photographic images across dental-disease domains.

| Author (Year) | Dental Study | Dataset Source & Size | Imaging Modality | Model Used | Outcomes | Limitations |
|---|---|---|---|---|---|---|
| Park et al., 2022 [22] | Caries detection | KNUDH, 2348 images | Intraoral camera | ResNet18, Faster R-CNN | Accuracy: 0.813; AUC: 0.837; Sensitivity: 0.890 | Limited internal visibility and occult lesions |
| Mehta et al., 2024 [23] | Dental caries | Bharati Vidyapeeth's Dental College, Pune, 1164 images | Intraoral digital RGB images | DenseNet201 | Accuracy: 0.93 | Dataset scarcity, generalizability, and risk of overfitting |
| Sabri et al., 2025 [24] | Gingival diseases | Multihospital, Karnataka, 2270 images | X-ray and intraoral images | MobileNet | Accuracy: 92.7% | Data scarcity, poor interpretability, and clinical limits |
| Park et al., 2023 [25] | Periodontal diseases | GitHub public, 220 images | Optical camera images | YOLOv5s | F1-score: 99.9% | Dataset expansion, synthetic bias, and low applicability |
| Wen et al., 2024 [26] | Gingival inflammation grading | School and Hospital of Stomatology, Wuhan, 8214 images | Digital camera | U-Net with DenseNet encoder | Accuracy: 79.22%; AUC: 0.837; Sensitivity: 83.75%; Specificity: 69.33%; Precision: 0.867 | Limited dataset, regional gap, and image bias |
| Garg et al., 2023 [27] | Dental calculus | Public dataset, 220 images | RGB intraoral images | ResNet34 | Accuracy: 81.82%; Recall: 75.00%; F1-score: 81.82%; Precision: 90.00% | Data demand, training cost, and manual processing |
| Nantakeeratipat et al., 2024 [28] | Dental plaque | Srinakharinwirot University, Bangkok, 299 images | Smartphone camera images | Google Cloud Vertex AI AutoML | Precision: 0.964; F1-score: 0.931; AUPRC: 0.964 | Data limitation, weak generalization, and manual cropping risk |
| Zhang et al., 2025 [29] | Dental occlusion classification | Private dataset, 7200 images | Digital camera IOPIs | Swin Transformer | F1-score: 0.90 (molar occlusion); 0.87 (canine occlusion) | Quality flaws, source dependence, and validation gap |
| Ryu et al., 2022 [30] | Orthodontic photo classification | Seoul National University Dental Hospital, 4448 images | IOPIs | Multi-domain CNN | Accuracy: 99.3% (facial); 99.9% (intraoral photos) | Single dataset, no flip handling |
| Su et al., 2025 [31] | Oral mucosal lesions | National Cheng Kung University Hospital, 506 images | Clinical photographic imaging | CNN | Specificity: 97.0%; Kappa: 0.851; AUC: 0.985 | Dataset scarcity, class imbalance, cross-validation |
| Zhang et al., 2024 [32] | Oral lesion segmentation | Private dataset, 838 images | Intraoral lesion images | SegFormer-B2 Transformer | Dice: 0.710; Precision: 0.886 | Data scarcity, low diversity, weak generalization |
| Tanriver et al., 2021 [33] | OPMD disorders | Combined public dataset, 652 images | White-light photographic images | YOLOv5l, U-Net | Dice: 0.929 (U-Net); AP: 0.855 (YOLOv5l) | Data scarcity, low diversity, and lesion challenge |
| Vinayahalingam et al., 2024 [34] | OPMD detection | Private dataset, 4161 images | Clinical photographs | Mask R-CNN + Swin | F1-score: 0.852, AUC: 0.974; F1-score: 0.796, AUC: 0.938 | Site limitation, low diversity, and label inconsistency |
| Warin et al., 2022 [35] | OPMD detection | Private dataset, 600 images | Digital dental camera | DenseNet-121, ResNet-50, Faster R-CNN | AUC: 95% (DenseNet-121); AUC: 95% (ResNet-50); F1-score: 0.743 (Faster R-CNN) | Data scarcity and risk of overfitting |
| Talwar et al., 2023 [36] | OPMDs | Indian Dental Institute, 2178 images | Intraoral photographic images | DenseNet-201 | F1-score: 0.84 | Inconsistent quality, focus, and angle variation |
| Rashid et al., 2022 [37] | Dental caries | Public dataset | Mixed dental images | Hybrid Mask R-CNN | Accuracy: 78–92% | No explicit limitations stated; annotated datasets |
| Ali & Sadeeq, 2025 [38] | Dental classification | Kaggle multi-dataset | Clinically obtained RGB intraoral images | EfficientNet-B3 | Accuracy: 95.4% (oral diseases); 89.9% (oral infection); 99.3% (teeth dataset) | Class imbalance and low recall in hypodontia |
| Boy et al., 2025 [39] | Dental caries | Private Indonesian clinical dataset, 1200 images | Smartphone images | MobileNetV3 | Accuracy: 90%; Precision: 90%; Sensitivity: 90%; Specificity: 90% | Quality flaws, device variability, and low resolution |
| Li et al., 2022 [40] | Dental plaque | Private dataset, 2884 images | Raw oral endoscope RGB images | ResNet101 | Accuracy: 83.86% | Device variability, imaging inconsistency, equipment variation |
| Patel et al., 2024 [41] | Oral lesions | Private OCPP data, 2765 images | Intraoral images | GAIN + ASP | Accuracy: 75.45%; AUC: 99.7% | No limitations stated |
| Ryu et al., 2023 [42] | Dental crowding severity | Seoul National University Dental Hospital, 2248 images | Intraoral photographs | VGG19 | Accuracy: 0.922 (maxilla); 0.898 (mandible) | Single-center data, weak generalization, quality flaws |
| Liu et al., 2024 [43] | Dental caries, calculus, gingivitis | Private dataset, 3365 images | Intraoral photographic images | Oral-Mamba CNN | Accuracy: 0.83 (gingivitis); 0.83 (caries); 0.81 (calculus) | No explicit limitations stated |
| Jeong et al., 2025 [44] | Dental plaque accumulation | Private dataset, 1094 images | Camera IOPIs | U-Net | Precision: 76.34%; Recall: 65.15%; F1-score: 66.15% | Single dataset, imaging and visualization limits |
| Li et al., 2021 [45] | Gingivitis | Private dataset, 10,000 images | RGB photos | ResNet-50, YOLOv3 | Accuracy: 92.1%; Sensitivity: 91.3%; Specificity: 92.9% | Single-center data, subjective diagnosis, data scarcity |
| Neumayr et al., 2024 [46] | Molar incisor hypomineralisation | Open-source web images, 455 images | IOPIs | AI-based model | Accuracy: 94.3%; Sensitivity: 94.4%; Specificity: 94.2%; AUC: 0.89–0.94 | Heterogeneous images, subjective quality rating, no standard criteria |
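The accuracy, sensitivity, specificity, precision, and F1 values reported throughout Table 1 all derive from the binary confusion matrix. The following is a minimal sketch of those definitions; the counts are illustrative only and are not taken from any reviewed study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn)      # recall / true-positive rate
    specificity = tn / (tn + fp)      # true-negative rate
    precision = tp / (tp + fp)        # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "precision": precision, "f1": f1}

# Hypothetical counts for a caries-detection test set of 200 images
m = binary_metrics(tp=89, fp=11, fn=10, tn=90)
print({k: round(v, 3) for k, v in m.items()})
```

Note that with imbalanced classes (common for rare lesions) accuracy alone can be misleading, which is why many of the reviewed studies also report AUC or F1.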
Table 2. Risk of bias assessment.

| Study | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7 | Overall Bias |
|---|---|---|---|---|---|---|---|---|
| Park et al., 2022 [22] | Low | Moderate | Low | Low | Low | Low | Low | Low |
| Mehta et al., 2024 [23] | Low | Low | High | Moderate | Low | Low | Low | Low |
| Sabri et al., 2025 [24] | Low | Low | Moderate | Low | Moderate | Low | Moderate | Moderate |
| Park et al., 2023 [25] | High | Moderate | Moderate | Moderate | High | Low | High | High |
| Wen et al., 2024 [26] | Moderate | Moderate | Low | Low | Low | Low | Low | Low |
| Garg et al., 2023 [27] | Moderate | Moderate | Low | Low | Low | Low | Low | Low |
| Nantakeeratipat et al., 2024 [28] | Low | Moderate | Low | Low | Low | Low | Low | Low |
| Zhang et al., 2025 [29] | Moderate | Low | High | Low | Low | Low | Low | Low |
| Ryu et al., 2022 [30] | Low | Moderate | Low | High | Low | Low | High | High |
| Su et al., 2025 [31] | Moderate | Moderate | Low | Low | Low | Low | High | Moderate |
| Zhang et al., 2024 [32] | Low | Moderate | High | Low | Low | Low | Low | Low |
| Tanriver et al., 2021 [33] | Moderate | Low | High | Low | Low | Low | Low | Low |
| Vinayahalingam et al., 2024 [34] | Low | Moderate | High | Low | Low | Low | Moderate | Low |
| Warin et al., 2022 [35] | Low | Moderate | High | Low | Low | Low | Low | Low |
| Talwar et al., 2023 [36] | Low | Moderate | High | Low | Low | High | Low | Moderate |
| Rashid et al., 2022 [37] | Low | Moderate | High | Low | Low | Low | Low | Low |
| Ali & Sadeeq, 2025 [38] | Low | Low | Low | Low | Moderate | Low | Moderate | Low |
| Boy et al., 2025 [39] | Low | Moderate | High | Low | Low | Low | Low | Low |
| Li et al., 2022 [40] | Moderate | Moderate | Low | Low | Low | Low | Low | Low |
| Patel et al., 2024 [41] | Low | Moderate | High | Low | Low | Low | Low | Low |
| Ryu et al., 2023 [42] | Moderate | Low | High | Low | Low | Low | Moderate | Low |
| Liu et al., 2024 [43] | Low | Moderate | High | Low | Low | High | Moderate | High |
| Jeong et al., 2025 [44] | Low | Moderate | High | Low | Low | Low | Moderate | Low |
| Li et al., 2021 [45] | Low | Moderate | High | Low | Moderate | Low | Low | Moderate |
| Neumayr et al., 2024 [46] | Low | Low | Low | Low | Low | Low | High | Low |

Overall bias: Low = low risk of bias (high quality); Moderate = moderate risk of bias (low quality); High = high risk of bias (poor quality).
Table 3. Mapping dental diseases to visual cues and DL tasks.

| Disease Category | Key Visual Indicators in IOPIs | Clinical Relevance | Typical DL Tasks | Representative Studies |
|---|---|---|---|---|
| Dental caries | White-spot lesions, cavitation, discoloration | Early prevention of progression | Classification, localization | [3] |
| Gingivitis/periodontitis | Gingival redness, swelling, bleeding | Prevents progression and tooth loss | Classification, grading | [26] |
| Dental plaque | Yellowish biofilm at the gingival margin | Risk factor for caries and gingivitis | Segmentation, quantification | [40] |
| Orthodontic conditions | Crowding, spacing, occlusal imbalance | Treatment planning | Classification, landmark detection | [48] |
| Soft-tissue lesions/OPMDs | White/red patches, ulcers | Early oral cancer screening | Classification, lesion segmentation | [33] |
Table 4. DL architectures used in dental IOPI studies.

| Study (Author, Year) | Imaging Task | DL Architecture | Key Methodological Focus | Ref. |
|---|---|---|---|---|
| Park, 2022 | Tooth-surface caries detection and segmentation | ResNet-based segmentation pipeline | Tooth-surface segmentation before classification | [22] |
| Duong, 2021 | Caries screening | Classical ML/CNN prototype | Feasibility of smartphone photographic ML | [47] |
| Kühnisch, 2022 | Caries detection | CNN ensembles, transfer learning | High-performance caries benchmarking | [52] |
| Li, 2022 | Plaque segmentation | Local-to-global attention U-Net variant | Improved plaque boundary delineation | [40] |
| Nantakeeratipat, 2024 | Plaque detection | Automated ML frameworks | AutoML-based model selection for plaque detection on permanent teeth | [28] |
| Ryu, 2022 | Orthodontic diagnosis | CNN (IOPI classification) | Automated classification of intraoral photos | [30] |
| Vinayahalingam, 2024 | OPMD multi-class detection | Vision Transformer | Multi-class OPMD detection using ViTs | [34] |
| Tanriver, 2021 | Oral lesion detection | CNN-based classifiers | Early automated OPMD detection | [33] |
| Kaissis, 2020; Rischke, 2022 | Federated training context | FL frameworks (FedAvg/FedProx) | Privacy-preserving multi-center training in medical imaging/dentistry | [20,21] |
| Taleb, 2022 | Label-efficient learning | SSL paradigms (SimCLR, BYOL, MoCo) | Label-efficient dental caries classification via self-supervised pretraining | [57] |
Table 5. Summary of DL architectures for analyzing IOPI-based dental disease.

| Model/Architecture | Typical Dental Roles (IOPI) | Strengths | Limitations | Representative Benchmark Studies |
|---|---|---|---|---|
| ResNet | Caries and lesion classification | Strong feature extraction | Limited global context; needs augmentation | [22] |
| EfficientNet/MobileNet | Smartphone screening | Good accuracy-compute tradeoff | Sensitive to IOPI variability | [39] |
| U-Net/DeepLabV3+/Attention U-Net | Plaque, gingiva, lesion segmentation | Accurate boundary localization | Limited global reasoning | [15,53] |
| Vision Transformers | Multi-disease and orthodontic analysis | Strong global context modeling | Data hungry | [34,55] |
| GANs | Data augmentation, color correction | Mitigate class imbalance | Risk of unrealistic samples | [17,18] |
| Self-supervised learning | Label-efficient pretraining | Reduces annotation burden | Sensitive to augmentation | [56,57] |
| Federated learning | Multi-center training | Privacy-preserving, robust | Communication overhead | [20,21] |
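Federated learning, listed above as a privacy-preserving route to multi-center training [20,21], hinges on one simple aggregation step: the server combines locally trained model parameters as a sample-count-weighted average (the FedAvg update), so raw patient images never leave a clinic. A minimal dependency-free sketch, assuming models are represented as flat parameter lists and the clinic names and sizes are hypothetical:

```python
def fedavg(client_weights, client_sizes):
    """FedAvg aggregation: weight each client's parameter vector by its
    local sample count; only parameters, never images, reach the server."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    aggregated = [0.0] * dim
    for weights, n in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            aggregated[i] += (n / total) * w
    return aggregated

# Two hypothetical clinics: the larger one (300 images) dominates the update
global_w = fedavg([[1.0, 2.0], [3.0, 4.0]], client_sizes=[100, 300])
print(global_w)  # → [2.5, 3.5]
```

In practice this loop runs for many communication rounds, with each clinic fine-tuning the returned global model locally before the next aggregation; the communication overhead noted in the table comes from repeatedly exchanging these parameter vectors.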
Table 6. Performance summary of state-of-the-art DL approaches for key IOPI-based dental disease tasks.

| Task | Best Reported Model(s) | Dataset Source | Key Techniques | Reported Performance | Key Reviewed Studies |
|---|---|---|---|---|---|
| Caries detection | ResNet/CNN models | Smartphone IOPIs | Tooth-surface segmentation, CLAHE, ROI cropping | Accuracy ~85–93% | [22,47,52] |
| Gingivitis/periodontal grading | CNN classifiers | Clinical RGB photos | Gingival ROI, color normalization | Accuracy ~85–92% | [25,26,45] |
| Plaque segmentation | Attention U-Net variants | Masked datasets | Color normalization, morphological cleaning | Dice ~0.82–0.95 | [28,40,44] |
| Orthodontic classification | CNNs, ViT/Swin variants | Clinical IOPIs | Alignment/ROI extraction | Accuracy ~90–99% | [29,42] |
| OPMD detection | CNN ensembles, ViTs | Clinical photos | Contrast enhancement | AUC ~0.85–0.96 | [33,35] |
| Multi-center robustness | Federated learning | Multi-center datasets | FL training, domain adaptation | Improved external robustness | [20,21] |
Table 7. Future directions of DL approaches for IOPI-based dental-disease detection.

| Challenge Area | Evidence from Literature | Future Research Directions |
|---|---|---|
| Dataset heterogeneity | Many studies rely on single-center datasets with limited demographic and clinical diversity [10,26], affecting generalizability [60]. | Development of multi-center IOPI datasets with standardized acquisition protocols. |
| Image acquisition variability | Variations in lighting, device type, viewing angle, and saliva artifacts influence model performance [26,47]. | Color normalization, illumination correction, and guided image-capture strategies. |
| Class imbalance and rare conditions | Rare conditions such as early caries and OPMDs are underrepresented in available datasets [26,29]. | Targeted data collection, synthetic augmentation, and self-supervised pretraining. |
| Domain shift and external validation | Performance degradation is commonly reported on external datasets due to domain shift [60]. | Domain adaptation techniques, external validation, and federated learning. |
| Limited explainability | Saliency-based explanations are not always aligned with clinical reasoning [64,71]. | Clinically interpretable explanation frameworks and human-AI studies. |
| Annotation burden | Manual labeling is time-consuming and subject to inter-observer variability [21,31]. | Self-supervised, weakly supervised, and consensus-based annotation methods. |
| Privacy and regulatory constraints | Data-sharing restrictions limit large-scale multi-institutional collaboration [74]. | Privacy-preserving learning and regulatory-aligned AI development. |
| Model reliability and calibration | Confidence estimates are often poorly calibrated for clinical decision support [68]. | Uncertainty-aware modeling and calibration strategies. |
| Clinical workflow integration | Limited deployment due to interoperability and hardware constraints [8,50]. | Lightweight models, interoperable deployment frameworks, and user-centered design. |
| Lack of prospective evaluation | Most studies rely on retrospective analysis. | Prospective and longitudinal clinical validation studies. |
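The calibration challenge in Table 7 can be made concrete with the expected calibration error (ECE), a common diagnostic that bins predictions by confidence and compares each bin's mean confidence with its empirical accuracy. A minimal sketch under the assumption of equal-width bins; the confidence values are hypothetical, not drawn from any reviewed model:

```python
def expected_calibration_error(confidences, correct, n_bins=5):
    """ECE: average per-bin gap between mean confidence and empirical
    accuracy, weighted by the fraction of predictions in each bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # equal-width bin index
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(ok for _, ok in members) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece

# A hypothetical overconfident screening model
confs = [0.95, 0.9, 0.85, 0.6, 0.55]
hits  = [True, True, False, True, False]
print(round(expected_calibration_error(confs, hits), 3))
```

A well-calibrated model yields an ECE near zero; persistent gaps motivate the uncertainty-aware modeling and calibration strategies (e.g., temperature scaling) listed in the table.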
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Mutawa, A.M.; Altarakemah, Y.Y.; Thirupathy, K. Deep Learning Applications for Dental-Disease Classification Using Intraoral Photographic Images: Current Status and Future Perspectives. AI 2026, 7, 85. https://doi.org/10.3390/ai7030085