Systematic Review

Advances in Image-Based Diagnosis of Diabetic Foot Ulcers Using Deep Learning and Machine Learning: A Systematic Review

by
Haifa F. Alhasson
* and
Shuaa S. Alharbi
Department of Information Technology, College of Computer, Qassim University, Buraydah 52571, Saudi Arabia
*
Author to whom correspondence should be addressed.
Biomedicines 2025, 13(12), 2928; https://doi.org/10.3390/biomedicines13122928
Submission received: 2 November 2025 / Revised: 23 November 2025 / Accepted: 25 November 2025 / Published: 28 November 2025
(This article belongs to the Special Issue Diabetes: Comorbidities, Therapeutics and Insights (3rd Edition))

Abstract

Background/Objectives: This review systematically assesses machine learning (ML) and deep learning (DL) applications using images to diagnose diabetic foot ulcers (DFUs), focusing on detection, segmentation, and classification. The study explores trends, challenges, and the quality of the reviewed research. Methods: A comprehensive search was conducted in October 2025 across 14 databases, covering studies published between 2010 and 2025. Studies employing ML/DL for DFU diagnosis with accuracy measurements were included, while those without image-based methods, AI techniques, or relevant outcomes were excluded. Out of 4653 articles initially identified, 1016 underwent detailed review, and 102 met the inclusion criteria. Results: The analysis revealed that ML/DL models are effective tools for DFU diagnosis, achieving accuracy between 0.88 and 0.97, specificity between 0.85 and 0.95, and sensitivity between 0.89 and 0.95. Common methods included Support Vector Machines (SVMs) for ML and U-Net or fully convolutional neural networks (FCNNs) for DL. Recent studies also explored thermal infrared imaging as a promising diagnostic technique. However, only 45% of segmentation datasets and 67.3% of classification datasets were publicly accessible, limiting reproducibility and further development. Conclusions: This review provides valuable insights into trends and key findings in ML/DL applications for DFU diagnosis. It highlights the need for improved data availability and sharing to enhance reproducibility, accuracy, and reliability, ultimately improving patient care.

1. Introduction

Diabetic foot ulcers (DFUs) are among the most common complications of Diabetes Mellitus (DM); they typically do not heal and often lead to amputation of the lower extremity. Epidemiologically, the number of DM patients worldwide is expected to grow from approximately 589 million adults in 2025 to 850 million in 2050, making diabetes one of the most widespread chronic conditions in the world [1]. One in three patients with diabetes will develop a DFU during their lifetime [2,3,4]. The economic impact of diabetes complications is immense as well. Diabetes-related healthcare costs are projected to surpass USD 760 billion per year by 2025, with USD 45.9 billion devoted exclusively to diabetic neuropathy care in the USA [1].
The world market for advanced wound care is projected to reach USD 22 billion by 2024 [5]. In the UK, DFU treatment alone costs an estimated GBP 580 million per year, in addition to expenditures in other primary care settings [6]. Foot complications are a major contributor to medical expenses, with foot disease accounting for 50% of all diabetes-related hospital admissions [7]. Jodheea et al. [8] described in detail the geographical health economics of DFUs, their effect on healing, and the factors that drive their financial costs, especially after the COVID-19 pandemic.
Early, effective management of DFUs, including education, blood sugar control, wound debridement, advanced dressings, offloading, advanced therapies, and in some cases surgery, can reduce the severity of complications. Effective ulcer management is limited by several factors: (i) the results used to guide management are usually slow to reach clinical settings, (ii) the progression of DFU severity is hard to distinguish visually, and (iii) there are many DFU classification systems based on the proportion and types of tissue that are visually distinguishable [9,10,11,12]. DFU diagnosis is currently performed by a specialist through visual examination. The complexity of the diagnosis can make this method time-consuming and lead to misdiagnoses. Currently, accepted practice involves a trained clinician testing a patient’s feet manually with a hand-held nylon monofilament probe. The procedure is time-consuming, labor-intensive, requires special training, is prone to error, and exhibits poor reproducibility [13]. The widespread uptake and acceptance of wearable and digital health technologies provides a means to monitor major risk factors associated with DFUs in a timely way [14]. This empowers patients in self-care and enables the remote monitoring and multi-disciplinary prevention needed for at-risk people. Technologies developed to enhance ulcer diagnostics and care plans have the potential to revolutionize diabetic foot care. Najafi et al. [15] summarize some of the promising developments in digital health that may support the prevention and management of DFUs. For the classification of diabetic foot ulcers, ref. [16] compares hybrid convolutional neural network models, with the aim of optimizing deep learning approaches in clinical settings for better diagnosis and treatment planning.
DFUs are considered a leading cause of hospitalization in patients with diabetes [17,18] and they can progress to severe infection, gangrene, amputation, and even death in the absence of proper care [19]. There is an urgent demand for reliable and quick management in diabetic patients, and Machine Learning (ML) and Deep Learning (DL) are great options to improve healthcare systems [20] through the prediction of future diabetes progression. ML/DL algorithms in DFU healthcare are becoming an increasingly popular approach [21]. ML algorithms are characterized by their ability to learn and adapt over time from raw data without being explicitly programmed [11], which has created a lot of excitement in the research community.
It is a complex task to find a suitable algorithm for the real-world circumstances of the DFU diagnostic process. Challenges include missing the early stages of ulceration, low quality of images available for DFU documentation (noise, shadow, blurring etc.), false positive detection cases (malformed toenails, deep wounds, folded amputation scars), fresh inflammation, and the variation of curved wound size, all of which can result in unnecessarily complicated treatments for both the patient and the healthcare system.
Over the years, several researchers have sought to create automated systems for DFU detection with the aid of ML, with the objective of making diagnosis fast and accurate. Classical ML algorithms have emerged as a potential solution for automating DFU detection in the field of AI. In general, ML uses previous experience to improve the given results [22]. DL algorithms are a subset of ML algorithms that typically learn representations at different levels of a hierarchy, building complex concepts out of simpler ones. Learning can be classified as supervised, unsupervised, or reinforcement learning. These efforts can be grouped into two approaches: the first follows the classical method, in which preprocessing is followed by segmentation, feature extraction, and then classification, while the second performs direct classification after the preprocessing step. Whether the approaches concentrate on segmentation, feature extraction, or classification, the tendency is to exploit ML/DL techniques to obtain more accurate results. Applications of ML/DL in the DFU field include improving clinical decision-making based on ulcer classification, data analysis for risk assessment and automated classification, and mobile applications for segmentation and classification.
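The two approaches described above can be sketched schematically. This is a minimal illustration only: the step functions are hypothetical placeholders, not components of any specific system from the reviewed studies.

```python
# Sketch of the two DFU-analysis pipelines described above.
# All step functions are illustrative placeholders passed in by the caller.

def classical_pipeline(image, preprocess, segment, extract_features, classifier):
    """Approach 1: preprocess -> segment -> handcrafted features -> classifier (e.g., an SVM)."""
    clean = preprocess(image)
    wound_region = segment(clean)
    features = extract_features(wound_region)   # e.g., color/texture descriptors
    return classifier(features)

def end_to_end_pipeline(image, preprocess, deep_model):
    """Approach 2: preprocess -> direct classification by a deep network (e.g., a CNN)."""
    clean = preprocess(image)
    return deep_model(clean)                    # features learned implicitly, no explicit segmentation
```

The key design difference is where the feature representation comes from: handcrafted descriptors in the classical route versus representations learned directly from pixels in the end-to-end route.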

Purpose and Motivation

This study focuses on reviewing the current state of Artificial Intelligence (AI) in the DFU field, including detection, segmentation, and classification. The purpose of this study is to summarize the ML/DL methods used to improve DFU care in healthcare settings. It is hypothesized that ML models can assist specialists in achieving more accurate diagnoses than individual DFU specialists alone. The motivations for the project are:
  • To engage in the efforts to find viable solutions for diabetic foot detection, thus assisting the DFU specialists in diagnosis and treatment.
  • To analyze new trends in approaches used in the automatic DFU detection field. This paper therefore focuses on presenting growth trends in the use of ML/DL techniques.
The detection and recognition of DFUs have been the subject of substantial research involving computer vision methods. However, systematic comparisons of deep learning and machine learning object detection frameworks are lacking [23]. There are many ML/DL models to explore throughout the literature for each target technique. The main technical gap is identifying the most suitable model architecture for a specific type of image dataset, such as color or thermal images; from this gap, the first research question is formulated. To compare these models, their performance must be validated with different types of validation metrics; the second research question derives from this issue. Finally, a vast selection of datasets is available for model training, but there is no indication of which characteristics of an image dataset are most beneficial for the training process. This gap motivates the third research question.
This paper aims to systematically review recent advancements in ML/DL models for detecting, classifying, and segmenting diabetic foot ulcers (DFUs). The focus is on trends, challenges, and the quality of evidence reported in the literature. The study also highlights gaps in dataset availability and evaluation metrics, providing key insights to guide future research.

2. Materials and Methods

Section 2 presents the search strategy and criteria used to select relevant studies on ML/DL techniques for DFU diagnosis. The focus is on identifying trends, evaluating dataset usage, and comparing performance metrics. This systematic review has been registered with INPLASY (International Platform of Registered Systematic Review and Meta-analysis Protocols) under the registration number INPLASY2022110128.

2.1. Research Questions

  • Research Question 1 (RQ1): What is the most effective ML/DL model for providing an optimal diagnosis for DFUs?
  • Research Question 2 (RQ2): How should the performance of the models be compared using the optimal set of validation metrics?
  • Research Question 3 (RQ3): What are the characteristics of the DFU image database that are required for the training of the diagnostic model?

2.2. Inclusion and Exclusion Criteria

Articles were selected based on the following criteria:
  • Inclusion Criteria:
    Articles must focus on ML/DL applications related to DFU detection, classification, or segmentation.
    Articles must include references to or the creation of datasets used for model assessment.
    Articles must provide a full-text version and include accuracy measurements for the results.
  • Exclusion Criteria:
    Preprinted articles or those without peer review were excluded [24].
    Articles without images or those lacking AI applications were excluded [25,26,27,28].
    Articles focusing on tasks outside detection, classification, or segmentation (e.g., improving image quality for downstream tasks [29]) were excluded.
    Articles without accuracy measurements for the reported results were excluded.
These criteria reduced the number of eligible articles to 102. All selected articles were fully reviewed, and our analysis of AI trends in DFU diagnosis over the years is based on them. This review adheres to the PRISMA guidelines [30] (see Supplementary File S1) for Preferred Reporting Items for Systematic reviews and Meta-Analyses of diagnostic test accuracy studies. It is limited to image-based machine learning and deep learning models for diabetic foot ulcer diagnosis. In line with this aim, we included studies that reported performance metrics, including accuracy, sensitivity, specificity, or related diagnostic measures for detection, classification, or segmentation tasks. Studies focusing on other aspects, such as usability, implementation, or decision-analytic metrics, were excluded. These elements are important for clinical adoption but are beyond the scope of this review, which assesses image-based models from a technical perspective.

2.3. Resources Selection

Full-length publications were retrieved from relevant journals in the context of the review. The two authors held a focus group to ensure that eligibility and inclusion criteria were fulfilled. A database of the identified literature, with titles, authors, publication dates, places of publication, and full abstracts, was imported into Microsoft Excel. Duplicates were removed in software, and all remaining abstracts were screened against the eligibility criteria. Fourteen bibliographic databases were searched, including Web of Science, Scopus, PubMed, IEEE Xplore, and Science Direct. Additional searches (e.g., Google Scholar, arXiv) were conducted to find suitable studies and to guarantee inclusivity. Searches were restricted to peer-reviewed articles published in English from 2010 to 2025. Preprints from databases such as arXiv were screened for background use and excluded during the screening phase. The search yielded 4653 articles, of which 2934 duplicates were excluded, and 1719 records proceeded to title and abstract review. A total of 1566 articles were dropped for not meeting the inclusion criteria, leaving 1016 for detailed full-text review. After full-text review, 914 studies were excluded due to a lack of AI applications or missing performance measures, and 102 were included in the qualitative synthesis. The study selection process is described in more detail in Figure 1.
To capture new trends in DFU detection using ML/DL techniques, we searched the following databases: Web of Science, Scopus, and PubMed between 2010 and 2025, considering the following topics: Diabetes Mellitus, diabetic foot ulcers (DFUs), DFU dataset, Machine Learning (ML), Deep Learning (DL), Convolutional Neural Network (CNN), and Thermogram. The search combined keywords using the “AND” connector: CNN AND DFU, DL AND DFU, ML AND DFU.
In total, 153 full-text papers from these databases (Web of Science, Scopus, PubMed) were retrieved for analysis, of which 102 articles were selected for this review. The selection criteria were: (1) publication within a recent period, (2) analysis of new trends in the field of ML/DL-based DFU detection, and (3) visibility and impact of the paper (publication in a top-tier journal or conference, and citation count). The review covers the period from 2010 to 2025, with 2018–2025 accounting for 94% of references, as these are the most recent papers. Regarding recent developments in ML/DL-based DFU detection, segmentation, and classification, citation counts were found to vary with publication date. In general, citation counts of older papers were higher than those of recent ones (2022–2025), although there are a few interesting exceptions (e.g., refs. [27,28,31]). Hybrid neural networks (NNs) and transfer learning are emerging as alternative approaches in recent research, keeping pace with the latest developments in the field. Since new approaches were highlighted, no formal citation threshold was imposed during selection. The focus was on papers that followed the trends and had a reasonable citation count. Subsequently, we chose papers from the literature with the potential to be impactful and useful. More information about citation patterns and selection is given in Section 3. About 52% of the total references met this criterion. Several preliminary papers were reviewed by the two-author focus group described above to ensure that relevant literature was not missed. For the systematic review and meta-analysis, we used a PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses [32]) flow diagram.

2.4. Data Source

A comprehensive and reproducible search of electronic databases Science Direct, PubMed (MEDLINE), arXiv.org, MDPI, Nature, Google Scholar, Scopus, and Wiley Online Library was conducted to systematically identify relevant literature for this study (January 2010–October 2025). The common search terms, Boolean operators, and filters were used in all databases. Further detailed strategies based on keywords and specific queries per respective databases are also included in Supplementary File S2 for clearer visibility. The PRISMA flowchart (Figure 1) illustrates the study selection steps, and the removal of duplicates was carried out using reference management software to ensure accuracy.
We focused on terms related to diabetic foot ulcers and their diagnosis using machine learning and deep learning techniques. Keywords such as ‘classification,’ ‘detection,’ and ‘segmentation’ were included to ensure the search targeted studies specifically on image-based diagnosis, excluding unrelated AI applications in diabetes care. To enhance comprehensiveness, the search was expanded to include terms like ‘thermal imaging’ and ‘computer-aided diagnosis.’
A consistent set of predefined keywords and Boolean operators was applied across databases, using terms such as ‘Artificial Intelligence in DFU,’ ‘deep learning,’ ‘machine learning,’ ‘ANNs,’ ‘CNNs,’ ‘DFU detection,’ ‘DFU segmentation,’ and ’DFU classification.’ Additional filtering criteria, including publication years (2010–2025) and peer-reviewed articles, were applied to ensure relevance. Figure 2 illustrates the term frequency trends for DFU-related queries in Web of Science, Scopus, and PubMed databases between 2010 and 2025. Percentages are calculated based on the total number of retrieved records from each database during this period. Detailed database-specific queries and Boolean logic are provided in Supplementary File S2 for transparency. The search strategy used in this study is summarized in Table 1.

2.5. Assessment of Methodology Quality

QUADAS-2 was used to assess the methodological quality of the studies [33]. As a tool for evaluating the risk of bias and applicability of diagnostic accuracy studies, QUADAS-2 consists of four key domains: Patient selection, Index test, Reference standard, and Flow and timing.
Bias risks were assessed in all four domains, while applicability concerns were assessed in the first three. Concerns and risks were rated as high, low, or unclear. An unclear risk was assigned when insufficient data were presented in the study. N/A was used when a QUADAS domain did not apply due to the study methodology. A high level of bias in any category may indicate problems with a paper’s methodology, and high levels of bias across multiple categories may affect the validity of the reported results. A high risk of bias in terms of applicability in the tested domains may indicate that the data from the evaluated paper do not accurately reflect the review question. The GRADE (Grading of Recommendations Assessment, Development and Evaluation) approach is a systematic framework for evaluating the quality of evidence and the strength of recommendations in healthcare [34,35,36].
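As an illustration of how such per-domain QUADAS-2 judgments can be tallied into the percentages reported later in this review, the sketch below uses hypothetical study entries; the ratings and study records are invented for demonstration only.

```python
from collections import Counter

# Hypothetical QUADAS-2 judgments: one dict per study, one rating per domain.
DOMAINS = ["patient_selection", "index_test", "reference_standard", "flow_and_timing"]
studies = [
    {"patient_selection": "low", "index_test": "low",
     "reference_standard": "low", "flow_and_timing": "low"},
    {"patient_selection": "unclear", "index_test": "low",
     "reference_standard": "low", "flow_and_timing": "high"},
    {"patient_selection": "low", "index_test": "unclear",
     "reference_standard": "n/a", "flow_and_timing": "low"},
]

def tally_risk(studies, domain):
    """Percentage of studies per rating (high/low/unclear/n/a) for one domain."""
    counts = Counter(study[domain] for study in studies)
    return {rating: 100 * n / len(studies) for rating, n in counts.items()}

for domain in DOMAINS:
    print(domain, tally_risk(studies, domain))
```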

2.6. Diagnostic Accuracy Measures

In order to make a fair comparison, it is important to compare the analyzed papers on common statistical performance metrics. In detection, segmentation, and classification, a number of evaluation metrics are used, including but not limited to Accuracy, Precision, Sensitivity, Specificity, F1-score, and the Jaccard index. Table 2 gives the mathematical formulas of these metrics.
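These standard formulas can be computed directly from confusion-matrix counts. The sketch below is a minimal reference implementation; the example counts (90/10/5/95) are hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic-accuracy metrics from confusion-matrix counts."""
    accuracy    = (tp + tn) / (tp + fp + fn + tn)
    precision   = tp / (tp + fp)            # positive predictive value
    sensitivity = tp / (tp + fn)            # recall / true-positive rate
    specificity = tn / (tn + fp)            # true-negative rate
    f1          = 2 * precision * sensitivity / (precision + sensitivity)
    jaccard     = tp / (tp + fp + fn)       # intersection over union
    return {"accuracy": accuracy, "precision": precision,
            "sensitivity": sensitivity, "specificity": specificity,
            "f1": f1, "jaccard": jaccard}

# Hypothetical counts: 90 true positives, 10 false positives,
# 5 false negatives, 95 true negatives.
print(diagnostic_metrics(90, 10, 5, 95))
```

Note that for segmentation tasks the same Jaccard formula applies at the pixel level, where tp/fp/fn count overlapping, spurious, and missed foreground pixels, respectively.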

2.7. Data Synthesis and Analysis

We categorized the extracted data according to the type of ML/DL functionality they were designed for (detection, segmentation, or classification). Furthermore, the data were classified according to the methodology used in each section, which was determined by the functionality of each section.
This enabled direct comparison of data between studies. Regardless of the measure used by the included papers, all outcome measurements were extracted and analyzed in a standard format, including all definitions of accuracy. However, due to variations in study designs, datasets, and evaluation methods, statistical pooling was not conducted. Instead, descriptive analysis of reported metrics (e.g., accuracy, AUC, sensitivity) was performed, and results were grouped by ML/DL functionality (detection, segmentation, classification, hybrid models) to enable meaningful comparisons. For comparison and understanding of ML/DL efficacy across similar tasks, each ML/DL model contains its respective outcome measure value.
Because a range of accuracy-related terms was included in the search criteria, no papers were excluded on the basis of which accuracy measurement they reported, as specific measures were not explicitly required in the search parameters. The absence of an accuracy measure in this review therefore indicates the absence of relevant articles reporting that statistical measure as an outcome. Papers that met the criteria are presented as sets of tables, organized by model functionality. In addition, pie charts visualize the proportion of model functionality across types of data and across available datasets, and a bar chart provides a quantified comparison of the different models. See Supplementary File S2 for full details of the articles and how they were treated according to our inclusion and exclusion criteria.

3. Results

There were 2215 papers identified in total. Titles were initially screened, leaving 300 titles to be evaluated before applying the exclusion criteria. Based on the eligibility criteria, 173 studies were excluded, and the total number of included studies was 127. Examples of excluded papers include [37,38], excluded because no image-based DFU diagnosis was applied, and [31,39], excluded due to a lack of accuracy information. This process is outlined in the flowchart below, based on the PRISMA-DTA methodology. The 102 papers included in this review reported multiple forms of machine/deep learning.
Hence, this paper includes 127 studies with a variety of characteristics and demographics. The full details of the included studies’ characteristics can be found in Section 5. Figure 3 shows the trends in ML and DL research for diabetic foot ulcers published between 2010 and 2025. The percentages are based on the total number of included studies (n = 102), and the x-axis represents the publication years.
Figure 4a shows the proportional distribution of ML/DL functionality, where 38.46% of studies focus on Color Image-based Classification (CIC). As shown in Figure 4b, most ML/DL models used color images (69.23%), while 29.49% used thermographic images and 1.29% used hyperspectral images. The datasets used across different functionalities varied, as illustrated in Figure 4c,d: 56% of segmentation studies and 29.3% of classification studies used at least one publicly available dataset. This distinction between dataset availability and usage highlights the need for better adoption of public datasets in research.
Figure 5 graphically illustrates the prevalence of using different architectures of ML/DL models in image-based DFU diagnosis.
There has been considerable interest in the topic of classifying DFUs from color images, as shown in Figure 4a. In some approaches, DFUs were classified by severity stage following the Wagner grading of ulceration using either a large or small dataset, while in others, the DFUs were classified by ischemia and infection recognition. Furthermore, other researchers performed the classification of normal and abnormal DFU skin.
The most notable growth areas in the detection, segmentation, and classification of DFUs from color or thermal images, as shown in Figure 4a,b, can be summarized in the following points:
  • In the detection of DFU domains based on color and thermal images:
    • You Only Look Once (YOLO) and its extensions are the most notable algorithms in the detection domain [40,41].
    • Most of the current work employing thermal images uses classical classifiers such as Support Vector Machine (SVM), k-nearest neighbour algorithm (k-NN), etc. [42].
  • In the segmentation of DFU domains based on color and thermal images, approaches included:
    • Using probability-based segmentation [43].
    • Leveraging computer vision techniques as pre- or post-processing to support NNs in segmentation results (colored images [44,45,46] and thermal [47]).
    • Utilizing transfer learning techniques to solve the problem of DFU segmentation [48,49].
    • Investigating the feasibility of using DL techniques to segment wounds from small datasets [50], although DL-based methods for automated wound segmentation are generally known to require large training datasets.
    • Using U-Net and Mask Region-based Convolutional Neural Network (R-CNN) and their extended versions, which are the most popular networks for segmenting DFUs (for U-Net: color [50,51,52,53,54] and thermal [55,56]; Fully Convolutional Neural Networks (FCNNs) [50,56,57,58,59,60,61,62]; and Faster R-CNN [63,64,65,66,67]).
    • Using encoder–decoder NNs (e.g., SegNet, DE-ResUnet), which show greater performance than other NNs on thermal images [56,58].
  • In the classification of the DFU domains based on color images, approaches included:
    • Employing transfer learning for DFU classification [59,68,69].
    • Combining handcrafted features with deep features [70,71].
    • Employing Class Knowledge Banks (CKBs) to improve the performance of DL classification [72].
    • Combining a pre-trained CNN model with automatic classifiers, which showed promising results [73].
    • Focusing on diagnosing more accurately and making less subjective real-time decisions [62,74].
    • Improving the external validity of existing models by avoiding overfitting to certain data [75,76].
    • Focusing on overcoming severe class imbalance [77] using an extension strategy; the use of synthetic images appears to significantly improve classification results for less frequent classes.
    • Setting up challenges to enrich the field with data, data analysis, and ground truth annotation [69].
  • In the classification of the DFU domains based on thermal images:
    • Addressing the gap of finding a way to detect Peripheral Arterial Disease (PAD), a circulatory disorder characterized by reduced blood flow to the limbs, which significantly increases the risk of diabetic foot complications [78].
    • Improving the effectiveness of classification methods for detecting abnormal changes in plantar temperature by combining transfer-learned NNs [79,80,81] to achieve higher accuracy.
    • Determining the severity of diabetic foot complications by combining classical ML approaches, such as Random Forest (RF), with CNNs [82,83].
    • Combining NNs by fusions [84], which showed higher accuracy.
    • Adopting Transformers, which offers new opportunities for predicting ulcer risk such as Vision Transformer (ViT) [85] and Detection Transformer (DETR) [86].
    • Using ML and image processing-based algorithms to locate hotspots in the feet [87].
  • In the classification of the DFU domains based on different image types, combining untrained and pre-trained transferred NNs gives high yields, providing consistency across all performance metrics [24].
  • In the hybrid frameworks (segmentation and classification) of the DFU, approaches included:
    • Combining knowledge-based transfer learning modules, which establishes a broader research area with more experimental opportunities [88,89] and was found to be promising.
    • Utilizing computer vision techniques, such as color and texture analysis [90,91,92] or uncalibrated visual fitting techniques [93], which can significantly improve results.

3.1. Performance Metrics

A summary of model performance measures across each target study is shown in Table 3, Table 4, Table 5 and Table 6. The mean detection accuracy of the best-performing algorithm per study was 0.97 ± 0.03. Segmentation models had a mean accuracy of 0.94 ± 0.1, classification models 0.93 ± 0.006, and hybrid studies 0.88 ± 0.06. The mean AUC was 0.97 ± 0.02 for detection, 0.99 ± 0.001 for segmentation, 0.94 ± 0.001 for classification, and 0.89 ± 0.09 for hybrid studies. The performance metrics varied significantly across studies due to differences in datasets, evaluation methods, and model architectures. Means and standard deviations for key metrics, such as accuracy and AUC, are presented in Table 3, Table 4, Table 5 and Table 6 for each functionality.
Our evaluation of the best-performing algorithms found a mean detection sensitivity of 0.95 ± 0.02. Segmentation models had a mean sensitivity of 0.91 ± 0.09 and classification models 0.92 ± 0.07; hybrid methods displayed the lowest mean sensitivity at 0.89 ± 0.09.
In terms of specificity, segmentation had the highest mean specificity, with 0.95 ± 0.03 (range: 0.90–0.98), compared with classification, which had a mean specificity of 0.94 ± 0.04 (range: 0.89–0.97), and the hybrid method, which had a mean specificity of 0.93 ± 0.04 (range: 0.89–0.96). The detection approach had the lowest mean specificity, with 0.85 ± 0.02 (range: 0.82–0.90). These parameters were determined by taking the metrics of the best performing algorithms in each study and calculating the mean values and standard deviations.
The detection models presented good performance across all studies. The mean detection accuracy was recalculated as 0.97 ± 0.03, with a range of 0.90–1.00. The mean AUC was 0.97 ± 0.02 (range: 0.92–1.00), sensitivity was 0.95 ± 0.02 (range: 0.91–0.98), and specificity was 0.85 ± 0.02 (range: 0.82–0.90). These results reflect the best algorithm per study, which may lean toward optimism relative to the other algorithms. Reported performance values are distributed across different architectures and datasets, indicating high variability. The recalculated metrics show that detection models typically display high accuracy but are influenced by dataset-specific features and model architecture, leading to varying performance. The separate datasets and evaluation measures adopted by the studies highlight the need for standardized benchmarks to enable richer comparisons.
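The per-functionality summaries above (mean ± SD and range of the best algorithm per study) correspond to a simple aggregation, sketched below. The per-study accuracy values are hypothetical placeholders, not data extracted from the reviewed studies.

```python
import statistics

# Hypothetical best-per-study accuracies, grouped by model functionality.
best_accuracy = {
    "detection":      [0.97, 0.99, 0.93, 1.00, 0.95],
    "segmentation":   [0.94, 0.96, 0.90],
    "classification": [0.93, 0.92, 0.94],
}

def summarize(values):
    """Mean, sample SD, and range of the best-performing result per study."""
    return {"mean": statistics.mean(values),
            "sd": statistics.stdev(values),
            "range": (min(values), max(values))}

for task, vals in best_accuracy.items():
    s = summarize(vals)
    print(f"{task}: {s['mean']:.2f} ± {s['sd']:.2f} "
          f"(range {s['range'][0]:.2f}-{s['range'][1]:.2f})")
```

Because only the best algorithm per study enters the aggregation, the resulting means carry the optimistic bias noted in the text.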

3.2. Quality Assessment

The diagnostic accuracy of AI has been assessed across a variety of specific areas of the DFU throughout the studies. To assess the risk of bias, QUADAS-2 [33], a commonly used tool in the literature, was applied. Overall, most studies were found to be at low risk with regard to bias and applicability. The current systematic review found a low risk of bias in the index test and in flow and timing (approximately 89% and 91%, respectively). However, a notable percentage of studies were unclear in the patient-selection domain for risk of bias (22%) and applicability concerns (20%), or not applicable (15% for bias and 14% for applicability concerns), primarily due to a lack of published details on sample selection and clinical information about the patients. A high risk of bias in flow and timing was found in around 6% of included studies, as a result of the small datasets used [42,91]. There was also a high risk associated with patient selection, since it is unclear what criteria were used to select a random sample [40,56]. A comparable result was found for the applicability arm of QUADAS-2, as shown in Figure 6.
Regarding GRADE, the assessment reveals that most studies have few limitations; however, some studies with significant limitations (e.g., Studies 1 and 68) pull down the overall quality of the evidence. In terms of consistency, most studies report similar findings, supporting the reliability of this body of evidence. Almost all of the evidence is directly applicable to the target population, although some is indirect (e.g., Studies 1, 3, and 7). Several studies provided precise effect estimates; others were imprecise, but this did not affect our overall confidence in the findings. The mix of studies with moderate and high certainty levels led to an overall rating of moderate certainty for this body of evidence. Details of the GRADE assessment for each study can be found in Supplementary Materials S2 (Quality assessments).

3.3. Impact of Bias and Data Quality on DFU Model Performance

A stratified analysis shows that studies with a high risk of bias or ambiguous reporting tend to report inflated performance. For example, detection models evaluated on small private datasets (e.g., fewer than 100 images) frequently demonstrated higher accuracy (>95%) than models tested on much larger public datasets (e.g., DFUC2021), which achieved 85–90%. This gap indicates that performance may be overstated in studies with poorly defined patient selection and dataset characteristics. By contrast, studies with a low risk of bias, particularly those using publicly accessible datasets such as DFUC2021 and Medetec with standardized evaluation criteria, achieved more uniform and reproducible performance. For instance, among segmentation methodologies, U-Net with clear validation protocols achieved a Dice score of 94% on DFUC2021, while studies with non-standardized datasets or vague validation criteria reported Dice scores exceeding 98% but lacked reproducibility.
GRADE recommendations were used to rate the quality of evidence for technical endpoints: accuracy, AUC, Dice coefficient, and Intersection over Union. Studies based on low-quality data from inadequate databases, suboptimal patient samples, or deviations from approved protocols were downgraded for risk of bias. For example, studies conducted on private datasets were usually not transparent about dataset diversity or representativeness, increasing the likelihood of overfitting and inflated performance claims. We also examined variation in reported performance metrics across studies to assess inconsistency. One study applying a U-Net segmentation model to the DFUC2021 dataset reported Dice scores of about 94% across the datasets, whereas studies using non-standard datasets and unclear analytical methods reported Dice scores of up to 98%. Because some studies also provided inconsistent data, the quality of evidence for those studies was downgraded. Table 7 summarizes the GRADE judgments for detection, segmentation, and classification tasks, along with these and additional findings.
The relevance of technical endpoints to clinical outcomes was also considered to account for indirectness. Although metrics such as the Dice coefficient and AUC are essential for evaluating model performance, they do not predict clinical outcomes such as ulcer healing or amputation avoidance; therefore, studies excluding clinical endpoints were downgraded for indirectness. Imprecision also influenced evidence quality: studies with small sample sizes and wide confidence intervals for key metrics were considered unvalidated. Studies that trained models on fewer than 100 images, for example, often reported very noisy performance measures, making their findings less trustworthy. On the basis of these assessments, the overall quality of evidence for detection models was moderate, according to the same criteria used in previous studies, and was consistent across studies performed on the available public datasets. Classification studies received a low-to-moderate rating because the datasets used were small, lacked uniformity, and showed high variability.

4. AI in Diabetic Foot Management

AI has generated a widespread revolution in medical imaging over the past ten years, including but not limited to X-ray, ultrasound, computed tomography, and magnetic resonance imaging. However, AI-based systems for high-quality wound care remain significantly underdeveloped in both clinical and computational domains [94,95]. Initiatives to apply AI to optimize data processing and products for diabetes management are ongoing and carry great potential for the near future [96]. The progression of this AI revolution can be broken down into several stages: the introduction of Neural Networks (NNs), the development of various ML algorithms, and, most recently, the current era of DL.
Hazenberg et al. [97] performed a comprehensive review of the PubMed database to identify the nature, functionality, efficacy, cost, and present limitations of telehealth and telemedicine applications related to diabetic foot disease prevention and management. They found that such applications are still in the early stages of development, and stronger scientific evidence is needed to support their efficacy and feasibility. In addition, more technically and economically efficient systems would need to be built before such applications are widely deployed in patients’ homes. In terms of applicability, Tulloch et al. [11], applying PRISMA-DTA across PubMed, Google Scholar, Web of Science, and Scopus to review ML algorithms in DFU studies, confirm that the current research is limited and that more applicable ML algorithms need to be developed. In terms of characterization, [94] provides a broad view of the literature on novel Artificial Intelligence (AI) systems that can help clinicians diagnose, assess the effectiveness of therapy, and predict healing outcomes.
Moreover, the COVID-19 pandemic posed significant challenges to diabetic foot clinics, since most patients were unable to attend clinics in person. The pandemic thus underscored the urgent need for AI-based wound care to monitor DFUs remotely, and virtual clinics gained popularity as a result [98]. This review focuses on ML/DL literature for three primary purposes: detection, segmentation, and classification.

5. ML/DL Models Used in Diabetic Foot Detection, Segmentation, and Classification

One of the earliest experiments applying AI to DFU detection and segmentation was by Wang et al. (2016) [27,99]. In this study, images taken in a special capture box were analyzed with wound image assessment algorithms to calculate the overall wound area, color-segmented wound areas, and a healing score, from which a quantitative assessment of wound healing status can be obtained. A Support Vector Machine (SVM) was used to locate wound borders on foot ulcer images. However, this method also comes with significant drawbacks, including physical contact between the wound and the capture box and the attendant risk of contamination. Later, the work of Goyal et al. [57,63,68,100] trained different models capable of classification, detection, and segmentation. Detailed information about these models can be found in Section S1 of the Supplementary File S1. Furthermore, Supplementary File S1, Section S2 provides detailed information about the DFU images and datasets used in the literature.
Table 8 summarizes the main characteristics and outcomes measured in the included detection-targeted studies. Table 9 details the segmentation techniques described in the included papers and their characteristics. Table 10 summarizes the main characteristics and outcomes measured in the included classification-targeted studies. Table 11 summarizes the main characteristics and outcomes measured in the included hybrid (classification and segmentation) studies.

6. Discussion

This section discusses the findings for each of the research questions in this review. The aim of this survey was to map the state of the art of ML/DL in DFU diagnosis, encompassing detection, segmentation, classification, and hybrid systems combining two or more of these functionalities.
Networks are heterogeneous in size, computation, and structural complexity. Although deeper networks could in theory outperform shallower ones, in practice they often underperform them. This is due to an optimization problem rather than overfitting: in general, the deeper a network, the more challenging it is to optimize. This study shows that U-Net, FCNN, Faster R-CNN, and their variants are the most popular networks for segmenting DFUs (U-Net: colored [50,51,52,53,54], thermal [55,56]; FCNN [50,56,57,58,59,60,61,62]; Faster R-CNN [63,64,65,66,67]).
Additionally, Encoder–Decoder NNs such as SegNet and DE-ResUNet show outstanding segmentation performance among NNs using thermal images [56,58], owing to architectures built around low-resolution feature maps. The proposed DE-ResUNet contains two encoders and one decoder; each decoder unit consists of an up-sampling block followed by a convolution operation, producing dense feature maps that combine image features with background knowledge. This design effectively identifies wound boundaries, improves accuracy, and enhances learning ability, which has been shown to yield superior segmentation performance.
Examples of the application of transfer learning to DFU classification (colored images [59,68,69]; thermal [79,80,81]) demonstrate that transfer learning is a viable method for reusing the architecture and weights of a model trained on large amounts of data and applying them to different scenarios and datasets at a lower computational cost. Apart from the work of Cassidy et al. [66], no study has applied different architectures to the same dataset. Additional exploration is necessary to determine the advantages of transfer learning in DFU diagnosis; future work should therefore analyze the most common transfer learning architectures as well as the self-tuning paradigm applicable to this field.
Regarding detection, the YOLO family [151] of architectures is among the most popular for DFU color-image detection, mainly because it offers a highly reliable and efficient neural network design. A variety of NN architectures were employed in the studies included in this literature review (see Figure 5).
Furthermore, we believe that models using hybrid data will make a significant contribution to the field. Recent work has focused on integrated frameworks that blend two modalities in one pipeline (thermal and color images [56]), which has helped achieve remarkable segmentation accuracy.
At present, there is no publicly available dataset that contains multimodality imaging of diabetic foot patients. Obtaining such datasets requires a considerable amount of time and effort, as they involve a variety of imaging methods, for example, thermal infrared imaging (useful for detecting ulcers early), clinical wound imaging (for assessing ulcer progression), fluorescence (to check for the presence of clinically significant bacteria), and MRI (to detect Charcot’s foot and infection). According to the accuracy measurements displayed in Table 8, Table 9, Table 10 and Table 11, thermal infrared imaging has proven to be an effective tool in the clinical management of DFU patients.
Since the reported metrics were heterogeneous, comparing model performance was difficult. The validation measures generally agreed upon among most of the included studies are the parameters derived from the confusion matrix: Accuracy (41 studies), Precision (27 studies), Specificity (21 studies), Sensitivity (Recall) (36 studies), DSC (32 studies), and AUC (9 studies), as shown in Table 8, Table 9, Table 10 and Table 11. This heterogeneity in performance metrics, caused by differences in datasets, imaging modalities, and reporting practices, limited direct statistical comparisons and highlights the need for standardized evaluation metrics and consistent reporting frameworks to improve comparability across studies. Adding the AUC (ROC) to these measurements is the most effective way to reduce heterogeneity in the validation parameters. Ideally, reporting the same measurements throughout the AI revolution of DFU diagnosis will facilitate tracking the development of AI-guided diagnosis systems. To this end, a generalized standard assessment for DFU diagnosis applications could be developed in the future with further investigation.
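As a minimal illustration of how the confusion-matrix parameters listed above relate to one another, the snippet below computes them from hypothetical counts (the counts are illustrative, not drawn from any reviewed study):

```python
# Illustrative confusion-matrix counts for a binary ulcer/non-ulcer classifier
# (hypothetical numbers, not taken from any study in this review).
tp, fp, fn, tn = 90, 5, 10, 95

accuracy    = (tp + tn) / (tp + fp + fn + tn)
precision   = tp / (tp + fp)
sensitivity = tp / (tp + fn)               # recall
specificity = tn / (tn + fp)
dsc         = 2 * tp / (2 * tp + fp + fn)  # Dice similarity coefficient

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"sensitivity={sensitivity:.3f} specificity={specificity:.3f} dsc={dsc:.3f}")
```

The AUC, by contrast, cannot be computed from a single confusion matrix; it requires the model's continuous scores across all decision thresholds, which is one reason this review recommends reporting it alongside the threshold-dependent metrics.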
Colored images (Figure 4b) were used by 67% of the selected studies, but thermal infrared imaging has also proven to be a useful technique in the clinical management of DFUs [42,47,55,56,58,60,81,85,126,135,152]. Another imaging modality, hyperspectral fluorescence imaging, can detect clinically significant bacteria in diabetic foot ulcers, but little research has been conducted on it [101]; such research could potentially provide valuable information on DFU severity and the healing process. Moreover, MRI and CT are further options for investigating the DFU diagnosis process, as their feasibility has been demonstrated in other areas of medical decision-making [153,154,155]. Future work in DFUs will include the use of such data in our follow-up work to ensure more accurate monitoring and timely treatment. While 45% of distinct segmentation datasets and 67.3% of distinct classification datasets are publicly accessible, only 45% of segmentation studies and 32% of classification studies used at least one publicly available dataset. This discrepancy underlines the need to encourage researchers to adopt publicly available datasets to improve the reproducibility and comparability of findings.
In terms of multimodality, there have been efforts to create multimodal DFU diagnosis systems, such as [24], but no publicly available datasets combine multi-modality imaging of diabetic feet. Collecting such datasets, combining IRT, clinical DFU images, fluorescence, and MRI, requires a great deal of effort. As with other medical imaging datasets, DFU datasets often exhibit image duplication, feature over-representation, and excessive reliance on a small number of subjects [21].
Applying machine learning to Diabetic Foot Ulcers (DFUs) is not free of challenges, and several solutions have been proposed. Data scarcity and data quality can be addressed in a variety of ways, such as synthetic augmentation with generative adversarial networks (GANs) or collaboration with medical institutions. Three options are available to alleviate class imbalance between healthy and ulcerated images: oversampling, undersampling, and weighted loss functions. The wide heterogeneity of ulcers demands comprehensive feature selection and ensemble models for accurate prediction. Interpretability is key, so interpretable models or methods such as SHAP and LIME can help explain decisions to clinicians. User-friendliness and interoperability with health records are critical for integration into clinical workflows. To improve generalization to other populations, models should be trained on diverse datasets, and domain adaptation techniques are required. Transfer learning with pre-trained models can improve performance with less data, and federated learning enables collaboration without sharing sensitive information. Alzubaidi et al. [156] adapted a pre-trained skin cancer model to classify foot skin images into two categories, normal or abnormal (diabetic foot ulcer), to overcome limited data availability.
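Two of the class-imbalance remedies mentioned above, inverse-frequency class weighting (the basis of a weighted loss) and random oversampling of the minority class, can be sketched in a few lines of NumPy. The 900-healthy/100-ulcer label split below is a hypothetical example, not a dataset from this review:

```python
import numpy as np

# Hypothetical imbalanced label set: 900 healthy (0) vs. 100 ulcer (1) images.
labels = np.array([0] * 900 + [1] * 100)

# Inverse-frequency class weights: rarer classes receive proportionally
# larger weight when plugged into a weighted loss function.
classes, counts = np.unique(labels, return_counts=True)
weights = len(labels) / (len(classes) * counts)
print(dict(zip(classes.tolist(), weights.round(2).tolist())))

# Naive random oversampling: duplicate minority-class samples until balanced.
rng = np.random.default_rng(0)
minority = np.flatnonzero(labels == 1)
resampled = rng.choice(minority, size=counts.max() - counts.min(), replace=True)
balanced = np.concatenate([labels, labels[resampled]])
assert (balanced == 0).sum() == (balanced == 1).sum()
```

Oversampling of raw duplicates risks overfitting to repeated minority images, which is why the GAN-based synthetic augmentation mentioned above is often preferred when enough data exist to train a generator.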

6.1. Addressing Research Questions: Models, Metrics, and Dataset Characteristics

This study evaluated the effectiveness of ML/DL models, optimal validation metrics, and the impact of dataset characteristics on DFU diagnosis. For RQ1, U-Net obtained the highest performance for segmentation tasks, with a Dice coefficient of 94% and an IoU of 89% on the DFUC2021 dataset, compared with Mask R-CNN (Dice: 91%). EfficientNet achieved 99% accuracy in classification on DFUC2020, and YOLOv5 achieved the highest mAP of 92% on DFUC2020. These findings underscore the importance of choosing task-appropriate models validated on standardized datasets.
Regarding validation metrics (RQ2), task-specific measurements are critical. For detection, mAP and localization accuracy are appropriate for validating model performance; YOLOv5 achieved 92% mAP. The Dice coefficient and IoU are suited to segmentation tasks, and U-Net attained a Dice score of 94% on DFUC2021. Classification models use AUC, accuracy, and precision, with EfficientNet achieving an AUC of 99% on DFUC2020. Complementary tools such as decision curve analysis can further enhance clinical relevance.
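As a minimal sketch, the Dice coefficient and IoU recommended above for segmentation can be computed from a pair of binary masks as follows (the 4×4 toy masks are illustrative):

```python
import numpy as np

# Predicted vs. ground-truth binary ulcer masks (toy 4x4 examples).
pred = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]], dtype=bool)
true = np.array([[0, 1, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]], dtype=bool)

inter = np.logical_and(pred, true).sum()
union = np.logical_or(pred, true).sum()
dice = 2 * inter / (pred.sum() + true.sum())  # 2|A∩B| / (|A|+|B|)
iou = inter / union                           # |A∩B| / |A∪B|
print(f"Dice={dice:.3f} IoU={iou:.3f}")
```

Note that Dice is always at least as large as IoU (Dice = 2·IoU/(1+IoU)), which is worth remembering when comparing studies that report only one of the two.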
For dataset characteristics (RQ3), model generalization improves with larger and more varied datasets. With over 4000 images, DFUC2021 yielded better performance than smaller datasets such as Medetec (<1000 images). High-quality annotations (pixel-level masks in DFUC2021) were among the enablers of U-Net’s strong segmentation performance. Yet class imbalance remains an issue, especially for classification tasks; methods such as oversampling, weighted loss functions, and GAN-based synthetic data generation are known to mitigate it. U-Net, EfficientNet, and YOLOv5 are the most effective models for segmentation, classification, and detection, respectively, when selected according to task-specific metrics and evaluated on multiple high-quality datasets. Managing dataset constraints and adopting common metrics are vital to improving ML/DL-based DFU diagnosis.
Transformers, including ViT and DETR, offer opportunities for predicting ulcer risk thanks to their ability to exploit global relationships in image data. This aligns with RQ1, as these architectures can enhance diagnostic accuracy and generalizability. Moreover, self-supervised learning approaches that require little labeled data respond directly to the deficiencies in data size and annotation cost noted in the Results. These techniques are particularly relevant to Research Question 3 (RQ3), since they reduce reliance on large labeled datasets while maintaining strong model performance.

6.2. Limitation of Included Research

Although filters are not recommended in systematic review methodology, the limits applied by the authors of this study are unlikely to have influenced the articles retrieved: they were considered sufficient to reduce the number of irrelevant articles without impacting the retrieval of relevant ones. The review question and eligibility criteria were limited to human studies. Considering the scope of this review and the volume of publications, we included only papers published between 2010 and 2025, which is unlikely to have affected the results. In addition, the key limitation of this review is the exclusion of studies reporting usability, implementation, or clinical decision-making metrics. These are important for real-world adoption and use cases, but the focus of this review was deliberately narrowed to diagnostic performance in image-based tasks. We acknowledge this limitation; the findings could be extended by investigating usability- and implementation-related studies to better understand the challenges and opportunities of ML/DL models in clinical use. This clarification was developed in response to reviewer feedback regarding the possible consequences of the inclusion and exclusion criteria.

6.3. Fairness, Generalization, and External Validation

Fairness in diabetic foot imaging has attracted particular attention, with works such as Reis et al. [147] showing that ML/DL models perform differently across skin tones: models trained on datasets in which darker skin tones are poorly represented perform poorly on those individuals. Such data collection biases must be mitigated through diverse and inclusive datasets. Furthermore, the generalization of these models across clinical centers and geographic regions remains problematic because of differences in imaging protocols and patient demographics. Model robustness can be enhanced through multi-center datasets and federated learning approaches. External validation on geographically and demographically diverse cohorts is needed to ensure the reliability of these studies, especially since most rely on a single-center or private dataset. Upcoming initiatives should emphasize diverse datasets, collaborative efforts, and fairness audits to optimize the equity and generalizability of DFU diagnostics.

7. Conclusions

There has been rapid growth in ML applications for DFU, particularly with DL models that report high accuracy but lack publicly available datasets. To achieve significant results, denser CNNs are being developed to enhance the impact of DFU diagnosis in the future. To boost the productivity of ML/DL technologies, it is essential to gather vast amounts of data: the diverse examples offered by big data help models learn to ‘see’ new data better. In model training, more data allows models to form better approximations, that is, to identify which features in the data are important and to perform better under validation. The aim is to leverage such training to make automated DFU diagnosis more efficient. Future research should include the following to enhance reproducibility, generalizability, and standardization: (1) creating and validating multimodal diabetic foot imaging datasets; (2) training and testing on multimodal datasets alongside thermal image datasets, which have proven feasible for diagnosis, and reporting a full set of performance metrics; and (3) adhering to Wagner grading of ulceration to standardize DFU diagnosis. We therefore summarize our general recommendations from this review as follows:
  • Development of Public Benchmarks: Establish centralized, publicly accessible repositories of multimodal DFU imaging datasets annotated with clinical metadata and severity grading (e.g., Wagner scale).
  • Standardized Evaluation Metrics: Promote the consistent use of metrics such as area under the curve (AUC), Dice similarity coefficient (DSC), and Jaccard index to ensure comparability and facilitate meta-analyses.
  • Community-driven Benchmarking Initiatives: Encourage reproducibility through organized challenges (e.g., DFU Grand Challenge) that provide standardized tasks and evaluation protocols.
  • FAIR Data Principles: Ensure that datasets adhere to FAIR principles (Findable, Accessible, Interoperable, Reusable) to support long-term usability and collaboration.
  • Reporting Standards: Recommend adoption of AI-specific reporting guidelines such as CONSORT-AI and PRISMA-DTA in DFU-related research publications.
This review has several implications for key stakeholders. For clinicians, it highlights the growing reliability of ML/DL tools in DFU diagnosis and underscores the need for clinical collaboration in validating AI models with real-world data. For data scientists, the review emphasizes the importance of developing models that are not only accurate but also interpretable, reproducible, and trained on diverse, multimodal datasets. For policymakers, the findings support the establishment of open data initiatives, regulatory standards for AI in wound care, and funding mechanisms to promote the development and deployment of equitable and evidence-based AI solutions in diabetic foot management. Future research should adopt standardized and consistent evaluation metrics, such as AUC and the Dice coefficient, to address inter-study heterogeneity and facilitate robust meta-analyses.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/biomedicines13122928/s1, S1: Advances in Image-Based Diagnosis of Diabetic Foot Ulcers Using Deep Learning and Machine Learning: A Systematic Review (Supplementary File); S2: Sheet “Included & excluded studies”; S3: Sheet “Features” Feature-related extractions; S4: QUADAS-2 Quality Assessment Analysis; S5: Quality assessment using GRADE; S6: PRISMA 2020 Checklist; S7: Inplasy protocol 2022110128.

Author Contributions

H.F.A. developed the theoretical formalism, performed the analytic calculations and the numerical simulations. Both H.F.A. and S.S.A. contributed to the final version of the manuscript. H.F.A. supervised the project. All authors have read and agreed to the published version of the manuscript.

Funding

The researchers thank the Deanship of Graduate Studies and Scientific Research at Qassim University for financial support (QU-APC-2025).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this study are provided in the references of this paper and in the Supplementary File. The PRISMA 2020 Checklist is available in Supplementary File S6.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

AI: Artificial Intelligence
AUC: Area Under Curve
CNN: Convolutional Neural Network
CT: Computed Tomography
DCNN: Deep Convolutional Neural Network
DFU: Diabetic Foot Ulcer
DL: Deep Learning
DM: Diabetes Mellitus
DSC: Dice Similarity Coefficient
FPR: False Positive Rate
mAP: mean Average Precision
MCC: Matthews Correlation Coefficient
ML: Machine Learning
MRI: Magnetic Resonance Imaging
NN: Neural Network
PAD: Peripheral Arterial Disease
QUADAS-2: Quality Assessment of Diagnostic Accuracy Studies-2
RGB: Red, Green, Blue
RMSE: Root Mean Square Error
ROC: Receiver Operating Characteristic curve
SUS: System Usability Scale
SVM: Support Vector Machine
YOLO: You Only Look Once

References

  1. Kondaveeti, S.B.; Kaur, K.; Gupta, V.; Tanwar, R.; Choudhary, N.; Kumar, D.; Gupta, S.; Rani, P.; Vk, A. Next-generation microneedle platforms for site-specific management of diabetic neuropathy. Diabetol. Metab. Syndr. 2025, 17, 418. [Google Scholar] [CrossRef] [PubMed]
  2. Armstrong, D.G.; Boulton, A.J.; Bus, S.A. Diabetic foot ulcers and their recurrence. N. Engl. J. Med. 2017, 376, 2367–2375. [Google Scholar] [CrossRef] [PubMed]
  3. Jia, L.; Parker, C.N.; Parker, T.J.; Kinnear, E.M.; Derhy, P.H.; Alvarado, A.M.; Huygens, F.; Lazzarini, P.A.; Diabetic Foot Working Group, Queensland Statewide Diabetes Clinical Network. Incidence and risk factors for developing infection in patients presenting with uninfected diabetic foot ulcers. PLoS ONE 2017, 12, e0177916. [Google Scholar] [CrossRef]
  4. Degu, H.; Wondimagegnehu, A.; Yifru, Y.M.; Belachew, A. Is health related quality of life influenced by diabetic neuropathic pain among type II diabetes mellitus patients in Ethiopia? PLoS ONE 2019, 14, e0211449. [Google Scholar] [CrossRef] [PubMed]
  5. Wang, C.; Mahbod, A.; Ellinger, I.; Galdran, A.; Gopalakrishnan, S.; Niezgoda, J.; Yu, Z. FUSeg: The foot ulcer segmentation challenge. Information 2024, 15, 140. [Google Scholar] [CrossRef]
  6. Kerr, M.; Rayman, G.; Jeffcoate, W. Cost of diabetic foot disease to the National Health Service in England. Diabet. Med. 2014, 31, 1498–1504. [Google Scholar] [CrossRef]
  7. Bharara, M.; Cobb, J.; Claremont, D. Thermography and thermometry in the assessment of diabetic neuropathic foot: A case for furthering the role of thermal techniques. Int. J. Low. Extrem. Wounds 2006, 5, 250–260. [Google Scholar] [CrossRef]
  8. Jodheea-Jutton, A.; Hindocha, S.; Bhaw-Luximon, A. Health economics of diabetic foot ulcer and recent trends to accelerate treatment. Foot 2022, 52, 101909. [Google Scholar] [CrossRef]
  9. Woodbury, M.G.; Sibbald, R.G.; Ostrow, B.; Persaud, R.; Lowe, J.M. Tool for rapid & easy identification of high risk diabetic foot: Validation & clinical pilot of the simplified 60 second diabetic foot screening tool. PLoS ONE 2015, 10, e0125578. [Google Scholar]
  10. Ramirez, H.A.; Liang, L.; Pastar, I.; Rosa, A.M.; Stojadinovic, O.; Zwick, T.G.; Kirsner, R.S.; Maione, A.G.; Garlick, J.A.; Tomic-Canic, M. Comparative genomic, microRNA, and tissue analyses reveal subtle differences between non-diabetic and diabetic foot skin. PLoS ONE 2015, 10, e0137133. [Google Scholar] [CrossRef]
  11. Tulloch, J.; Zamani, R.; Akrami, M. Machine learning in the prevention, diagnosis and management of diabetic foot ulcers: A systematic review. IEEE Access 2020, 8, 198977–199000. [Google Scholar] [CrossRef]
  12. Ferreira, A.C.B.H.; Ferreira, D.D.; Barbosa, B.H.G.; Aline de Oliveira, U.; Aparecida Padua, E.; Oliveira Chiarini, F.; Baena de Moraes Lopes, M.H. Neural network-based method to stratify people at risk for developing diabetic foot: A support system for health professionals. PLos ONE 2023, 18, e0288466. [Google Scholar] [CrossRef] [PubMed]
  13. Siddiqui, H.U.; Spruce, M.; Alty, S.R.; Dudley, S. Automated peripheral neuropathy assessment using optical imaging and foot anthropometry. IEEE Trans. Biomed. Eng. 2015, 62, 1911–1917. [Google Scholar] [CrossRef] [PubMed]
  14. Yuan, Z.; Huang, J.; Zhao, Z.; Zahid, A.; Heidari, H.; Ghannam, R.; Abbasi, Q.H. A compact wearable system for detection and estimation of open wound status in diabetic patient. In Proceedings of the 2018 IEEE Asia Pacific Conference on Postgraduate Research in Microelectronics and Electronics (PrimeAsia), Chengdu, China, 26–30 October 2018; pp. 60–63. [Google Scholar]
  15. Najafi, B.; Mishra, R. Harnessing digital health technologies to remotely manage diabetic foot syndrome: A narrative review. Medicina 2021, 57, 377. [Google Scholar] [CrossRef]
  16. Alzubaidi, L.; Abbood, A.A.; Fadhel, M.A.; Al-Shamma, O.; Zhang, J. Comparison of hybrid convolutional neural networks models for diabetic foot ulcer classification. J. Eng. Sci. Technol. 2021, 16, 2001–2017. [Google Scholar]
  17. Fard, A.S.; Esmaelzadeh, M.; Larijani, B. Assessment and treatment of diabetic foot ulcer. Int. J. Clin. Pract. 2007, 61, 1931–1938. [Google Scholar] [CrossRef]
  18. Aalaa, M.; Malazy, O.T.; Sanjari, M.; Peimani, M.; Mohajeri-Tehrani, M. Nurses’ role in diabetic foot prevention and care: A review. J. Diabetes Metab. Disord. 2012, 11, 1–6. [Google Scholar] [CrossRef]
  19. Aziz, Z.; Lin, W.K.; Nather, A.; Huak, C.Y. Predictive factors for lower extremity amputations in diabetic foot infections. Diabet. Foot Ankle 2011, 2, 7463. [Google Scholar] [CrossRef]
  20. Arcadu, F.; Benmansour, F.; Maunz, A.; Willis, J.; Haskova, Z.; Prunotto, M. Deep learning algorithm predicts diabetic retinopathy progression in individual patients. NPJ Digit. Med. 2019, 2, 92, Erratum in NPJ Digit. Med. 2020, 3, 160. [Google Scholar] [CrossRef]
  21. Yap, M.H.; Kendrick, C.; Reeves, N.D.; Goyal, M.; Pappachan, J.M.; Cassidy, B. Development of Diabetic Foot Ulcer Datasets: An Overview. Diabet. Foot Ulcers Grand Chall. 2021, 1–18. [Google Scholar]
  22. Pradhan, G.; Pradhan, R.; Khandelwal, B. A study on various machine learning algorithms used for prediction of diabetes mellitus. In Soft Computing Techniques and Applications: Proceeding of the International Conference on Computing and Communication (IC3 2020); Springer: Singapore, 2021; pp. 553–561. [Google Scholar]
  23. Yap, M.H.; Hachiuma, R.; Alavi, A.; Brüngel, R.; Cassidy, B.; Goyal, M.; Zhu, H.; Rückert, J.; Olshansky, M.; Huang, X.; et al. Deep learning in diabetic foot ulcers detection: A comprehensive evaluation. Comput. Biol. Med. 2021, 135, 104596. [Google Scholar] [CrossRef] [PubMed]
  24. Reyes-Luévano, J.; Guerrero-Viramontes, J.; Romo-Andrade, J.R.; Funes-Gallanzi, M. DFU_VIRnet: A Novel Visible-Infrared CNN to Improve Diabetic Foot Ulcer Classification and Early Detection of Ulcer Risk Zones. Biomed. Signal Process. Control 2023, 86, 105341. [Google Scholar] [CrossRef]
  25. Jung, K.; Covington, S.; Sen, C.K.; Januszyk, M.; Kirsner, R.S.; Gurtner, G.C.; Shah, N.H. Rapid identification of slow healing wounds. Wound Repair Regen. 2016, 24, 181–188. [Google Scholar] [CrossRef] [PubMed]
  26. Reddy, S.S.; Mahesh, G.; Preethi, N.M. Exploiting Machine Learning Algorithms to Diagnose Foot Ulcers in Diabetic Patients. EAI Endorsed Trans. Pervasive Health Technol. 2021, 7, e2. [Google Scholar] [CrossRef]
  27. Wang, S.C.; Anderson, J.A.; Evans, R.; Woo, K.; Beland, B.; Sasseville, D.; Moreau, L. Point-of-care wound visioning technology: Reproducibility and accuracy of a wound measurement app. PLoS ONE 2017, 12, e0183139. [Google Scholar] [CrossRef]
  28. Sudarvizhi, M.D.; Nivetha, M.; Priyadharshini, P.; Swetha, J. Identification and analysis of foot ulceration using load cell technique. IRJET 2019, 6, 7792–7797. [Google Scholar]
  29. Davradou, A.; Protopapadakis, E.; Kaselimi, M.; Doulamis, A.; Doulamis, N. Diabetic foot ulcers monitoring by employing super resolution and noise reduction deep learning techniques. In Proceedings of the 15th International Conference on PErvasive Technologies Related to Assistive Environments, Corfu, Greece, 29 June–1 July 2022; pp. 83–88. [Google Scholar]
  30. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. Syst. Rev. 2021, 10, 372. [Google Scholar] [CrossRef]
  31. Patel, S.; Patel, R.; Desai, D. Diabetic foot ulcer wound tissue detection and classification. In Proceedings of the 2017 International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS), Coimbatore, India, 17–18 March 2017; IEEE: New York, NY, USA, 2017; pp. 1–5. [Google Scholar]
  32. McInnes, M.D.; Moher, D.; Thombs, B.D.; McGrath, T.A.; Bossuyt, P.M.; Clifford, T.; Cohen, J.F.; Deeks, J.J.; Gatsonis, C.; Hooft, L.; et al. Preferred reporting items for a systematic review and meta-analysis of diagnostic test accuracy studies: The PRISMA-DTA statement. JAMA 2018, 319, 388–396. [Google Scholar] [CrossRef]
  33. Whiting, P.F.; Rutjes, A.W.; Westwood, M.E.; Mallett, S.; Deeks, J.J.; Reitsma, J.B.; Leeflang, M.M.; Sterne, J.A.; Bossuyt, P.M.; QUADAS-2 Group. QUADAS-2: A revised tool for the quality assessment of diagnostic accuracy studies. Ann. Intern. Med. 2011, 155, 529–536. [Google Scholar] [CrossRef]
  34. Guyatt, G.H.; Oxman, A.D.; Vist, G.E.; Kunz, R.; Falck-Ytter, Y.; Alonso-Coello, P.; Schünemann, H.J. GRADE: An emerging consensus on rating quality of evidence and strength of recommendations. BMJ 2008, 336, 924–926. [Google Scholar] [CrossRef]
  35. Guyatt, G.; Oxman, A.D.; Akl, E.A.; Kunz, R.; Vist, G.; Brozek, J.; Norris, S.; Falck-Ytter, Y.; Glasziou, P.; DeBeer, H.; et al. GRADE guidelines: 1. Introduction—GRADE evidence profiles and summary of findings tables. J. Clin. Epidemiol. 2011, 64, 383–394. [Google Scholar] [CrossRef]
  36. Balshem, H.; Helfand, M.; Schünemann, H.J.; Oxman, A.D.; Kunz, R.; Brozek, J.; Vist, G.E.; Falck-Ytter, Y.; Meerpohl, J.; Norris, S.; et al. GRADE guidelines: 3. Rating the quality of evidence. J. Clin. Epidemiol. 2011, 64, 401–406. [Google Scholar] [CrossRef]
  37. Haque, F.; Reaz, M.B.I.; Chowdhury, M.E.H.; Ezeddin, M.; Kiranyaz, S.; Alhatou, M.; Ali, S.H.M.; Bakar, A.A.A.; Srivastava, G. Machine Learning-Based Diabetic Neuropathy and Previous Foot Ulceration Patients Detection Using Electromyography and Ground Reaction Forces during Gait. Sensors 2022, 22, 3507. [Google Scholar] [CrossRef] [PubMed]
  38. Yusuf, N.; Zakaria, A.; Omar, M.I.; Shakaff, A.Y.M.; Masnan, M.J.; Kamarudin, L.M.; Abdul Rahim, N.; Zakaria, N.Z.I.; Abdullah, A.A.; Othman, A.; et al. In-vitro diagnosis of single and poly microbial species targeted for diabetic foot infection using e-nose technology. BMC Bioinform. 2015, 16, 1–12. [Google Scholar] [CrossRef] [PubMed]
  39. Das, S.K.; Roy, P.; Mishra, A.K. DFU_SPNet: A stacked parallel convolution layers based CNN to improve Diabetic Foot Ulcer classification. ICT Express 2022, 8, 271–275. [Google Scholar] [CrossRef]
  40. Han, A.; Zhang, Y.; Li, A.; Li, C.; Zhao, F.; Dong, Q.; Liu, Q.; Liu, Y.; Shen, X.; Yan, S.; et al. Efficient refinements on YOLOv3 for real-time detection and assessment of diabetic foot Wagner grades. arXiv 2020, arXiv:2006.02322. [Google Scholar]
  41. Brüngel, R.; Friedrich, C.M. DETR and YOLOv5: Exploring performance and self-training for diabetic foot ulcer detection. In Proceedings of the 2021 IEEE 34th International Symposium on Computer-Based Medical Systems (CBMS), Aveiro, Portugal, 7–9 June 2021; IEEE: New York, NY, USA, 2021; pp. 148–153. [Google Scholar]
  42. Nag, U.; Upadhayay, M.; Gupta, T. Detecting Diabetic Foot Complications using Infrared Thermography and Machine Learning. In Proceedings of the International Conference on Graphics and Signal Processing, Nagoya, Japan, 25–27 June 2021; pp. 41–46. [Google Scholar]
  43. Cui, C.; Thurnhofer-Hemsi, K.; Soroushmehr, R.; Mishra, A.; Gryak, J.; Domínguez, E.; Najarian, K.; López-Rubio, E. Diabetic wound segmentation using convolutional neural networks. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Berlin, Germany, 23–27 July 2019; pp. 1002–1005. [Google Scholar]
  44. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4510–4520. [Google Scholar]
  45. Jawahar, M.; Anbarasi, L.J.; Jasmine, S.G.; Narendra, M. Diabetic foot ulcer segmentation using color space models. In Proceedings of the 2020 5th International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India, 10–12 June 2020; IEEE: New York, NY, USA, 2020; pp. 742–747. [Google Scholar]
  46. Heras-Tang, A.; Valdes-Santiago, D.; León-Mecías, Á.M.; Díaz-Romañach, M.L.B.; Mesejo-Chiong, J.A.; Cabal-Mirabal, C. Diabetic foot ulcer segmentation using logistic regression, DBSCAN clustering and mathematical morphology operators. Electron. Lett. Comput. Vis. Image Anal. 2022, 21, 22–39. [Google Scholar] [CrossRef]
  47. Alshayeji, M.H.; Sindhu, S.C.; Abed, S.E. Early detection of diabetic foot ulcers from thermal images using the bag of features technique. Biomed. Signal Process. Control 2023, 79, 104143. [Google Scholar] [CrossRef]
  48. Muñoz, P.; Rodríguez, R.; Montalvo, N. Automatic segmentation of diabetic foot ulcer from mask region-based convolutional neural networks. J. Biomed. Res. Clin. Investig. 2020, 1. [Google Scholar] [CrossRef]
  49. Huang, H.N.; Zhang, T.; Yang, C.T.; Sheen, Y.J.; Chen, H.M.; Chen, C.J.; Tseng, M.W. Image segmentation using transfer learning and Fast R-CNN for diabetic foot wound treatments. Front. Public Health 2022, 10, 969846. [Google Scholar] [CrossRef]
  50. Rania, N.; Douzi, H.; Yves, L.; Sylvie, T. Semantic segmentation of diabetic foot ulcer images: Dealing with small dataset in DL approaches. In Proceedings of the International Conference on Image and Signal Processing, Dubai, United Arab Emirates, 20–21 June 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 162–169. [Google Scholar]
  51. Ohura, N.; Mitsuno, R.; Sakisaka, M.; Terabe, Y.; Morishige, Y.; Uchiyama, A.; Okoshi, T.; Shinji, I.; Takushima, A. Convolutional neural networks for wound detection: The role of artificial intelligence in wound care. J. Wound Care 2019, 28, S13–S24. [Google Scholar] [CrossRef]
  52. Hernández, A.; Arteaga-Marrero, N.; Villa, E.; Fabelo, H.; Callicó, G.M.; Ruiz-Alzola, J. Automatic Segmentation Based on Deep Learning Techniques for Diabetic Foot Monitoring Through Multimodal Images. In Proceedings of the International Conference on Image Analysis and Processing, Trento, Italy, 9–13 September 2019; pp. 414–424. [Google Scholar]
  53. Mahbod, A.; Schaefer, G.; Ecker, R.; Ellinger, I. Automatic foot ulcer segmentation using an ensemble of convolutional neural networks. In Proceedings of the 2022 26th International Conference on Pattern Recognition (ICPR), Montreal, QC, Canada, 21–25 August 2022; pp. 4358–4364. [Google Scholar]
  54. Galdran, A.; Carneiro, G.; Ballester, M.A.G. Double encoder-decoder networks for gastrointestinal polyp segmentation. In Proceedings of the International Conference on Pattern Recognition, Virtual Event, 10–15 January 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 293–307. [Google Scholar]
  55. Bouallal, D.; Bougrine, A.; Douzi, H.; Harba, R.; Canals, R.; Vilcahuaman, L.; Arbanil, H. Segmentation of plantar foot thermal images: Application to diabetic foot diagnosis. In Proceedings of the International Conference on Systems, Signals and Image Processing (IWSSIP), Niterói, Brazil, 3–5 June 2020; IEEE: New York, NY, USA, 2020; pp. 116–121. [Google Scholar]
  56. Bouallal, D.; Douzi, H.; Harba, R. Diabetic foot thermal image segmentation using Double Encoder-ResUnet (DE-ResUnet). J. Med. Eng. Technol. 2022, 46, 378–392. [Google Scholar]
  57. Goyal, M.; Yap, M.H.; Reeves, N.D.; Rajbhandari, S.; Spragg, J. Fully convolutional networks for diabetic foot ulcer segmentation. In Proceedings of the 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Banff, AB, Canada, 5–8 October 2017; pp. 618–623. [Google Scholar]
  58. Bougrine, A.; Harba, R.; Canals, R.; Ledee, R.; Jabloun, M.; Villeneuve, A. Segmentation of Plantar Foot Thermal Images Using Prior Information. Sensors 2022, 22, 3835. [Google Scholar] [CrossRef]
  59. Amin, J.; Sharif, M.; Anjum, M.A.; Khan, H.U.; Malik, M.S.A.; Kadry, S. An integrated design for classification and localization of diabetic foot ulcer based on CNN and YOLOv2-DFU models. IEEE Access 2020, 8, 228586–228597. [Google Scholar] [CrossRef]
  60. Bougrine, A.; Harba, R.; Canals, R.; Ledee, R.; Jabloun, M. On the segmentation of plantar foot thermal images with Deep Learning. In Proceedings of the European Signal Processing Conference (EUSIPCO), A Coruña, Spain, 2–6 September 2019; IEEE: New York, NY, USA, 2019; pp. 1–5. [Google Scholar]
  61. Liao, T.Y.; Yang, C.H.; Lo, Y.W.; Lai, K.Y.; Shen, P.H.; Lin, Y.L. HarDNet-DFUS: An Enhanced Harmonically-Connected Network for Diabetic Foot Ulcer Image Segmentation and Colonoscopy Polyp Segmentation. arXiv 2022, arXiv:2209.07313. [Google Scholar]
  62. Niri, R.; Douzi, H.; Lucas, Y.; Treuillet, S. A superpixel-wise fully convolutional neural network approach for diabetic foot ulcer tissue classification. In Proceedings of the International Conference on Pattern Recognition, Kolkata, India, 15–18 December 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 308–320. [Google Scholar]
  63. Goyal, M.; Reeves, N.D.; Rajbhandari, S.; Yap, M.H. Robust methods for real-time diabetic foot ulcer detection and localization on mobile devices. IEEE J. Biomed. Health Inform. 2018, 23, 1730–1741. [Google Scholar]
  64. Cassidy, B.; Reeves, N.D.; Pappachan, J.M.; Ahmad, N.; Haycocks, S.; Gillespie, D.; Yap, M.H. A cloud-based deep learning framework for remote detection of diabetic foot ulcers. IEEE Pervasive Comput. 2022, 21, 78–86. [Google Scholar] [CrossRef]
  65. Liu, Q.; Zhao, J. The Classification of Diabetic Foot Based on Faster R-CNN. In Proceedings of the International Conference on Computer Vision, Image and Deep Learning (CVIDL), Chongqing, China, 10–12 July 2020; IEEE: New York, NY, USA, 2020; pp. 585–591. [Google Scholar]
  66. Cassidy, B.; Reeves, N.D.; Pappachan, J.M.; Gillespie, D.; O’Shea, C.; Rajbhandari, S.; Maiya, A.G.; Frank, E.; Boulton, A.J.; Armstrong, D.G.; et al. The DFUC 2020 dataset: Analysis towards diabetic foot ulcer detection. touchREVIEWS Endocrinol. 2021, 17, 5. [Google Scholar]
  67. da Costa Oliveira, A.L.; de Carvalho, A.B.; Dantas, D.O. Faster R-CNN Approach for Diabetic Foot Ulcer Detection. In Proceedings of the International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP), Virtual Event, 8–10 February 2021; pp. 677–684. [Google Scholar]
  68. Goyal, M.; Reeves, N.D.; Rajbhandari, S.; Ahmad, N.; Wang, C.; Yap, M.H. Recognition of ischaemia and infection in diabetic foot ulcers: Dataset and techniques. Comput. Biol. Med. 2020, 117, 103616. [Google Scholar] [CrossRef]
  69. Yap, M.H.; Cassidy, B.; Pappachan, J.M.; O’Shea, C.; Gillespie, D.; Reeves, N.D. Analysis towards classification of infection and ischaemia of diabetic foot ulcers. In Proceedings of the International Conference on Biomedical and Health Informatics (BHI), Athens, Greece, 21–24 September 2021; pp. 1–4. [Google Scholar]
  70. Al-Garaawi, N.; Ebsim, R.; Alharan, A.F.; Yap, M.H. Diabetic foot ulcer classification using mapped binary patterns and convolutional neural networks. Comput. Biol. Med. 2022, 140, 105055. [Google Scholar] [CrossRef] [PubMed]
  71. Al-Garaawi, N.; Harbi, Z.; Morris, T. Fusion of Hand-crafted and Deep Features for Automatic Diabetic Foot Ulcer Classification. TEM J. 2022, 11, 1055–1064. [Google Scholar] [CrossRef]
  72. Xu, Y.; Han, K.; Zhou, Y.; Wu, J.; Xie, X.; Xiang, W. Classification of Diabetic Foot Ulcers Using Class Knowledge Banks. Front. Bioeng. Biotechnol. 2021, 9, 811028. [Google Scholar] [CrossRef] [PubMed]
  73. Gamage, C.; Wijesinghe, I.; Perera, I. Automatic scoring of diabetic foot ulcers through deep CNN based feature extraction with low rank matrix factorization. In Proceedings of the International Conference on Bioinformatics and Bioengineering (BIBE), Athens, Greece, 28–30 October 2019; pp. 352–356. [Google Scholar]
  74. López-Cabrera, J.D.; Ruiz-Gonzalez, Y.; Díaz-Amador, R.; Taboada-Crispi, A. Automatic Classification of Diabetic Foot Ulcers Using Computer Vision Techniques. In Proceedings of the International Workshop on Artificial Intelligence and Pattern Recognition, Havana, Cuba, 5–7 October 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 290–299. [Google Scholar]
  75. Hüsers, J.; Hafer, G.; Heggemann, J.; Wiemeyer, S.; Przysucha, M.; Dissemond, J.; Moelleken, M.; Erfurt-Berge, C.; Hübner, U.H. Automatic Classification of Diabetic Foot Ulcer Images: A Transfer-Learning Approach to Detect Wound Maceration. In Informatics and Technology in Clinical Care and Public Health; IOS Press: Amsterdam, The Netherlands, 2022; pp. 301–304. [Google Scholar]
  76. Santos, E.; Santos, F.; Dallyson, J.; Aires, K.; Tavares, J.M.R.; Veras, R. Diabetic Foot Ulcers Classification using a fine-tuned CNNs Ensemble. In Proceedings of the International Symposium on Computer-Based Medical Systems (CBMS), Shenzhen, China, 21–22 July 2022; IEEE: New York, NY, USA, 2022; pp. 282–287. [Google Scholar]
  77. Bloch, L.; Brüngel, R.; Friedrich, C.M. Boosting EfficientNets Ensemble Performance via Pseudo-Labels and Synthetic Images by pix2pixHD for Infection and Ischaemia Classification in Diabetic Foot Ulcers. In Proceedings of the Diabetic Foot Ulcers Grand Challenge, Virtual Event, 27 September 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 30–49. [Google Scholar]
  78. Padierna, L.C.; Amador-Medina, L.F.; Murillo-Ortiz, B.O.; Villaseñor-Mora, C. Classification method of peripheral arterial disease in patients with type 2 diabetes mellitus by infrared thermography and machine learning. Infrared Phys. Technol. 2020, 111, 103531. [Google Scholar] [CrossRef]
  79. Jain, A. Detection and Classification of Diabetic Foot Thermograms Using Deep Learning. Ph.D. Thesis, Delhi Technological University, Delhi, India, 2022. [Google Scholar]
  80. Khandakar, A.; Chowdhury, M.E.; Reaz, M.B.I.; Ali, S.H.M.; Hasan, M.A.; Kiranyaz, S.; Rahman, T.; Alfkey, R.; Bakar, A.A.A.; Malik, R.A. A machine learning model for early detection of diabetic foot using thermogram images. Comput. Biol. Med. 2021, 137, 104838. [Google Scholar] [CrossRef]
  81. Cruz-Vega, I.; Hernandez-Contreras, D.; Peregrina-Barreto, H.; Rangel-Magdaleno, J.d.J.; Ramirez-Cortes, J.M. Deep learning classification for diabetic foot thermograms. Sensors 2020, 20, 1762. [Google Scholar] [CrossRef]
  82. Khandakar, A.; Chowdhury, M.E.; Reaz, M.B.I.; Ali, S.H.M.; Kiranyaz, S.; Rahman, T.; Chowdhury, M.H.; Ayari, M.A.; Alfkey, R.; Bakar, A.A.A.; et al. A Novel Machine Learning Approach for Severity Classification of Diabetic Foot Complications Using Thermogram Images. Sensors 2022, 22, 4249. [Google Scholar] [CrossRef]
  83. Filipe, V.; Teixeira, P.; Teixeira, A. Automatic Classification of Foot Thermograms Using Machine Learning Techniques. Algorithms 2022, 15, 236. [Google Scholar] [CrossRef]
  84. Munadi, K.; Saddami, K.; Oktiana, M.; Roslidar, R.; Muchtar, K.; Melinda, M.; Muharar, R.; Syukri, M.; Abidin, T.F.; Arnia, F. A Deep Learning Method for Early Detection of Diabetic Foot Using Decision Fusion and Thermal Images. Appl. Sci. 2022, 12, 7524. [Google Scholar] [CrossRef]
  85. Anaya-Isaza, A.; Zequera-Diaz, M. Fourier transform-based data augmentation in deep learning for diabetic foot thermograph classification. Biocybern. Biomed. Eng. 2022, 42, 437–452. [Google Scholar] [CrossRef]
  86. Zhou, G.X.; Tao, Y.K.; Hou, J.Z.; Zhu, H.J.; Xiao, L.; Zhao, N.; Wang, X.W.; Du, B.L.; Zhang, D. Construction and validation of a deep learning-based diagnostic model for segmentation and classification of diabetic foot. Front. Endocrinol. 2025, 16, 1543192. [Google Scholar] [CrossRef]
  87. Balasenthilkumaran, N.V.; Ram S, B.; Gorti, S.; Rajagopal, S.; Soangra, R. Design and comparison of machine learning-based computer-aided diagnostic techniques to aid diagnosis of diabetes and detection of ulcer-prone regions in the feet using thermograms. Res. Biomed. Eng. 2022, 38, 781–795, Erratum in Res. Biomed. Eng. 2022, 38, 1027. [Google Scholar] [CrossRef]
  88. Wijesinghe, I.; Gamage, C.; Perera, I.; Chitraranjan, C. A smart telemedicine system with deep learning to manage diabetic retinopathy and foot ulcers. In Proceedings of the Moratuwa Engineering Research Conference (MERCon), Moratuwa, Sri Lanka, 3–5 July 2019; pp. 686–691. [Google Scholar]
  89. Maldonado, H.; Bayareh, R.; Torres, I.; Vera, A.; Gutiérrez, J.; Leija, L. Automatic detection of risk zones in diabetic foot soles by processing thermographic images taken in an uncontrolled environment. Infrared Phys. Technol. 2020, 105, 103187. [Google Scholar] [CrossRef]
  90. Mukherjee, R.; Manohar, D.D.; Das, D.K.; Achar, A.; Mitra, A.; Chakraborty, C. Automated tissue classification framework for reproducible chronic wound assessment. BioMed Res. Int. 2014, 2014, 851582. [Google Scholar] [CrossRef] [PubMed]
  91. Godeiro, V.; Neto, J.S.; Carvalho, B.; Santana, B.; Ferraz, J.; Gama, R. Chronic wound tissue classification using convolutional networks and color space reduction. In Proceedings of the International Workshop on Machine Learning for Signal Processing (MLSP), Aalborg, Denmark, 17–20 September 2018; pp. 1–6. [Google Scholar]
  92. Babu, K.; Sabut, S.; Nithya, D. Efficient detection and classification of diabetic foot ulcer tissue using PSO technique. Int. J. Eng. Technol. 2018, 7, 1006–1010. [Google Scholar] [CrossRef]
  93. Wannous, H.; Lucas, Y.; Treuillet, S. Enhanced assessment of the wound-healing process by accurate multiview tissue classification. IEEE Trans. Med. Imaging 2010, 30, 315–326. [Google Scholar] [CrossRef]
  94. Anisuzzaman, D.; Wang, C.; Rostami, B.; Gopalakrishnan, S.; Niezgoda, J.; Yu, Z. Image-based artificial intelligence in wound assessment: A systematic review. Adv. Wound Care 2022, 11, 687–709. [Google Scholar] [CrossRef]
  95. Cassidy, B.; Kendrick, C.; Reeves, N.D.; Pappachan, J.M.; O’Shea, C.; Armstrong, D.G.; Yap, M.H. Diabetic foot ulcer grand challenge 2021: Evaluation and summary. In Proceedings of the Diabetic Foot Ulcers Grand Challenge, Strasbourg, France, 27 September 2021; pp. 90–105. [Google Scholar]
  96. Behera, A. Use of artificial intelligence for management and identification of complications in diabetes. Clin. Diabetol. 2021, 10, 221–225. [Google Scholar] [CrossRef]
  97. Hazenberg, C.E.; aan de Stegge, W.B.; Van Baal, S.G.; Moll, F.L.; Bus, S.A. Telehealth and telemedicine applications for the diabetic foot: A systematic review. Diabetes Metab. Res. Rev. 2020, 36, e3247. [Google Scholar] [CrossRef]
  98. Pappachan, J.M.; Cassidy, B.; Fernandez, C.J.; Chandrabalan, V.; Yap, M.H. The role of artificial intelligence technology in the care of diabetic foot ulcers: The past, the present, and the future. World J. Diabetes 2022, 13, 1131–1139. [Google Scholar] [CrossRef]
  99. Wang, L.; Pedersen, P.C.; Agu, E.; Strong, D.M.; Tulu, B. Area determination of diabetic foot ulcer images using a cascaded two-stage SVM-based classification. IEEE Trans. Biomed. Eng. 2016, 64, 2098–2109. [Google Scholar] [CrossRef]
  100. Goyal, M.; Reeves, N.D.; Davison, A.K.; Rajbhandari, S.; Spragg, J.; Yap, M.H. Dfunet: Convolutional neural networks for diabetic foot ulcer classification. IEEE Trans. Emerg. Top. Comput. Intell. 2018, 4, 728–739. [Google Scholar] [CrossRef]
  101. Dremin, V.; Marcinkevics, Z.; Zherebtsov, E.; Popov, A.; Grabovskis, A.; Kronberga, H.; Geldnere, K.; Doronin, A.; Meglinski, I.; Bykov, A. Skin complications of diabetes mellitus revealed by polarized hyperspectral imaging and machine learning. IEEE Trans. Med. Imaging 2021, 40, 1207–1216. [Google Scholar] [CrossRef]
  102. Thotad, P.N.; Bharamagoudar, G.R.; Anami, B.S. Diabetic foot ulcer detection using deep learning approaches. Sensors Int. 2023, 4, 100210. [Google Scholar] [CrossRef]
  103. Sarmun, R.; Chowdhury, M.E.; Murugappan, M.; Aqel, A.; Ezzuddin, M.; Rahman, S.M.; Khandakar, A.; Akter, S.; Alfkey, R.; Hasan, A. Diabetic foot ulcer detection: Combining deep learning models for improved localization. Cogn. Comput. 2024, 16, 1413–1431. [Google Scholar] [CrossRef]
  104. Sendilraj, V.; Pilcher, W.; Choi, D.; Bhasin, A.; Bhadada, A.; Bhadada, S.K.; Bhasin, M. DFUCare: Deep learning platform for diabetic foot ulcer detection, analysis, and monitoring. Front. Endocrinol. 2024, 15, 1386613. [Google Scholar] [CrossRef] [PubMed]
  105. Biswas, S.; Mostafiz, R.; Uddin, M.S.; Paul, B.K. XAI-FusionNet: Diabetic foot ulcer detection based on multi-scale feature fusion with explainable artificial intelligence. Heliyon 2024, 10, e31228. [Google Scholar] [CrossRef]
  106. El-Kady, A.M.; Abbassy, M.M.; Ali, H.H.; Ali, M.F. Advancing diabetic foot ulcer detection based on resnet and gan integration. J. Theor. Appl. Inf. Technol. 2024, 102, 2258–2268. Available online: https://www.jatit.org/volumes/Vol102No6/2Vol102No6.pdf (accessed on 24 November 2025).
  107. Azeem, M.; Zaman, M.; Akhunzada, A.; Kehkashan, T.; Ashraf, I.; Rehman, A. Optimizing Diabetic Foot Ulcer Detection: Leveraging SSD and YOLO Architectures. In Proceedings of the 2024 IEEE 21st International Conference on Smart Communities: Improving Quality of Life using AI, Robotics and IoT (HONET), Doha, Qatar, 3–5 December 2024; pp. 195–200. [Google Scholar] [CrossRef]
  108. Verma, G. Leveraging smart image processing techniques for early detection of foot ulcers using a deep learning network. Pol. J. Radiol. 2024, 89, e368. [Google Scholar] [CrossRef]
  109. Busaranuvong, P.; Agu, E.; Kumar, D.; Gautam, S.; Fard, R.S.; Tulu, B.; Strong, D. Guided Conditional Diffusion Classifier (ConDiff) for Enhanced Prediction of Infection in Diabetic Foot Ulcers. IEEE Open J. Eng. Med. Biol. 2024, 6, 20–27. [Google Scholar] [CrossRef]
  110. Eldin, A.S.; Ahmoud, A.S.; Hamza, H.M.; Ardah, H. Enhancing Early Detection of Diabetic Foot Ulcers Using Deep Neural Networks. Diagnostics 2025, 15, 1996. [Google Scholar] [CrossRef]
  111. Rathore, P.S.; Kumar, A.; Nandal, A.; Dhaka, A.; Sharma, A.K. A feature explainability-based deep learning technique for diabetic foot ulcer identification. Sci. Rep. 2025, 15, 6758. [Google Scholar] [CrossRef]
  112. Debnath, S.; Khurana, A.; Senbagavalli, M.; Naik, S.; Chandra Patni, J.; Mishra, P.K.; Kishore, J. Sustainable AI for diabetic foot ulcer detection: A deep learning approach for early diagnosis. Discov. Appl. Sci. 2025, 7, 1012. [Google Scholar] [CrossRef]
  113. Mahmud, M.I.; Reza, M.S.; Akash, M.O.A.; Elias, F.; Ahmed, N. DFU_DIALNet: Towards reliable and trustworthy diabetic foot ulcer detection with synergistic confluence of Grad-CAM and LIME. PLoS ONE 2025, 20, e0330669. [Google Scholar] [CrossRef] [PubMed]
  114. Girmaw, D.W.; Taye, G.B. MobileNetV2 model for detecting and grading diabetic foot ulcer. Discov. Appl. Sci. 2025, 7, 268. [Google Scholar] [CrossRef]
  115. Pradhana, W.M.A.A.; Pradipta, G.A.; Huizen, R.R. Combination of CNN and SMOTE-IPF for Early Detection of Diabetes Patients in Thermogram Images. J. Nas. Pendidik. Tek. Inform. JANAPATI 2025, 14, 336–347. [Google Scholar] [CrossRef]
  116. Gamage, H.; Wijesinghe, W.; Perera, I. Instance-based segmentation for boundary detection of neuropathic ulcers through Mask-RCNN. In Proceedings of the International Conference on Artificial Neural Networks, Munich, Germany, 17–19 September 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 511–522. [Google Scholar]
  117. Chitra, T.; Sundar, C.; Gopalakrishnan, S. Investigation and classification of chronic wound tissue images using random forest algorithm (RF). Int. J. Nonlinear Anal. Appl. 2022, 13, 643–651. [Google Scholar]
  118. Chang, C.W.; Christian, M.; Chang, D.H.; Lai, F.; Liu, T.J.; Chen, Y.S.; Chen, W.J. Deep learning approach based on superpixel segmentation assisted labeling for automatic pressure ulcer diagnosis. PLoS ONE 2022, 17, e0264139. [Google Scholar] [CrossRef]
  119. Wang, C.; Anisuzzaman, D.; Williamson, V.; Dhar, M.K.; Rostami, B.; Niezgoda, J.; Gopalakrishnan, S.; Yu, Z. Fully automatic wound segmentation with deep convolutional neural networks. Sci. Rep. 2020, 10, 21897. [Google Scholar] [CrossRef]
  120. Lan, T.; Li, Z.; Chen, J. FusionSegNet: Fusing global foot features and local wound features to diagnose diabetic foot. Comput. Biol. Med. 2023, 152, 106456. [Google Scholar] [CrossRef]
  121. Jishnu, P.; BK, S.K.; Jayaraman, S. Automatic foot ulcer segmentation using conditional generative adversarial network (AFSegGAN): A wound management system. PLoS Digit. Health 2023, 2, e0000344. [Google Scholar]
  122. Jiao, C.; Zhao, X.; Li, L.; Wang, C.; Chen, Y. UFOS-Net leverages small-scale feature fusion for diabetic foot ulcer segmentation. Sci. Rep. 2025, 15, 29317. [Google Scholar] [CrossRef] [PubMed]
  123. Niri, R.; Zahia, S.; Stefanelli, A.; Sharma, K.; Probst, S.; Pichon, S.; Chanel, G. Wound segmentation with U-Net using a dual attention mechanism and transfer learning. J. Imaging Informatics Med. 2025, 38, 3351–3365. [Google Scholar] [CrossRef] [PubMed]
  124. Botros, F.S.; Taher, M.F.; ElSayed, N.M.; Fahmy, A.S. Prediction of diabetic foot ulceration using spatial and temporal dynamic plantar pressure. In Proceedings of the Cairo international biomedical engineering conference (CIBEC), Cairo, Egypt, 15–17 December 2016; pp. 43–47. [Google Scholar]
  125. Kasbekar, P.U.; Goel, P.; Jadhav, S.P. A decision tree analysis of diabetic foot amputation risk in Indian patients. Front. Endocrinol. 2017, 8, 25. [Google Scholar] [CrossRef] [PubMed]
  126. Adam, M.; Ng, E.Y.; Oh, S.L.; Heng, M.L.; Hagiwara, Y.; Tan, J.H.; Tong, J.W.; Acharya, U.R. Automated characterization of diabetic foot using nonlinear features extracted from thermograms. Infrared Phys. Technol. 2018, 89, 325–337. [Google Scholar] [CrossRef]
  127. Vardasca, R.; Vaz, L.; Magalhaes, C.; Seixas, A.; Mendes, J. Towards the diabetic foot ulcers classification with infrared thermal images. In Proceedings of the Quantitative Infrared Thermography Conference, Berlin, Germany, 25–29 June 2018. [Google Scholar]
  128. Vardasca, R.; Asis, F. Diabetic foot monitoring using dynamic thermography and AI classifiers. In Proceedings of the 3rd Quantitative Infrared Thermography Asia Conference (QIRT Asia 2019), Tokyo, Japan, 1–5 July 2019; pp. 1–5. [Google Scholar]
  129. Alzubaidi, L.; Fadhel, M.A.; Oleiwi, S.R.; Al-Shamma, O.; Zhang, J. DFU_QUTNet: Diabetic foot ulcer classification using novel deep convolutional neural network. Multimed. Tools Appl. 2020, 79, 15655–15677. [Google Scholar] [CrossRef]
  130. Galdran, A.; Carneiro, G.; Ballester, M.A.G. Convolutional nets versus vision transformers for diabetic foot ulcer classification. In Proceedings of the Diabetic Foot Ulcers Grand Challenge, Virtual Event, 27 September 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 21–29. [Google Scholar]
  131. Selle, J.; Prakash, K.V.; Sai, G.A.; Vinod, B.; Chellappan, K. Classification of foot thermograms using texture features and support vector machine. In Proceedings of the International Conference on Electronics and Sustainable Communication Systems (ICESC), Coimbatore, India, 4–6 August 2021; IEEE: New York, NY, USA, 2021; pp. 1445–1449. [Google Scholar]
  132. Das, S.K.; Roy, P.; Mishra, A.K. Recognition of ischaemia and infection in diabetic foot ulcer: A deep convolutional neural network based approach. Int. J. Imaging Syst. Technol. 2022, 32, 192–208. [Google Scholar] [CrossRef]
  133. Yogapriya, J.; Chandran, V.; Sumithra, M.; Elakkiya, B.; Shamila Ebenezer, A.; Suresh Gnana Dhas, C. Automated Detection of Infection in Diabetic Foot Ulcer Images Using Convolutional Neural Network. J. Healthc. Eng. 2022, 2022, 2349849. [Google Scholar] [CrossRef]
  134. Jain, A.; Sreedevi, I. An Enhanced Methodology for Diabetic Foot Thermogram Classification using Deep Learning. In Proceedings of the International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COM-IT-CON), Faridabad, India, 26–27 May 2022; IEEE: New York, NY, USA, 2022; Volume 1, pp. 1–6. [Google Scholar]
  135. Khandakar, A.; Chowdhury, M.E.; Reaz, M.B.I.; Ali, S.H.M.; Abbas, T.O.; Alam, T.; Ayari, M.A.; Mahbub, Z.B.; Habib, R.; Rahman, T.; et al. Thermal Change Index-Based Diabetic Foot Thermogram Image Classification Using Machine Learning Techniques. Sensors 2022, 22, 1793. [Google Scholar] [CrossRef]
  136. Khosa, I.; Raza, A.; Anjum, M.; Ahmad, W.; Shahab, S. Automatic Diabetic Foot Ulcer Recognition Using Multi-Level Thermographic Image Data. Diagnostics 2023, 13, 2637. [Google Scholar] [CrossRef]
  137. Nagaraju, S.; Kumar, K.V.; Rani, B.P.; Lydia, E.L.; Ishak, M.K.; Filali, I.; Karim, F.K.; Mostafa, S.M. Automated Diabetic Foot Ulcer Detection and Classification Using Deep Learning. IEEE Access 2023, 11, 127578–127588. [Google Scholar] [CrossRef]
  138. Biswas, S.; Mostafiz, R.; Paul, B.K.; Uddin, K.M.M.; Rahman, M.M.; Shariful, F. DFU_MultiNet: A deep neural network approach for detecting diabetic foot ulcers through multi-scale feature fusion using the DFU dataset. Intell.-Based Med. 2023, 8, 100128. [Google Scholar] [CrossRef]
  139. Toofanee, M.S.A.; Dowlut, S.; Hamroun, M.; Tamine, K.; Petit, V.; Duong, A.K.; Sauveron, D. Dfu-siam a novel diabetic foot ulcer classification with deep learning. IEEE Access 2023, 11, 98315–98332. [Google Scholar] [CrossRef]
  140. Das, S.K.; Namasudra, S.; Sangaiah, A.K. HCNNet: Hybrid convolution neural network for automatic identification of ischaemia in diabetic foot ulcer wounds. Multimed. Syst. 2024, 30, 36. [Google Scholar] [CrossRef]
  141. Fadhel, M.A.; Alzubaidi, L.; Gu, Y.; Santamaría, J.; Duan, Y. Real-time diabetic foot ulcer classification based on deep learning & parallel hardware computational tools. Multimed. Tools Appl. 2024, 83, 70369–70394. [Google Scholar]
  142. Patel, Y.; Shah, T.; Dhar, M.K.; Zhang, T.; Niezgoda, J.; Gopalakrishnan, S.; Yu, Z. Integrated image and location analysis for wound classification: A deep learning approach. Sci. Rep. 2024, 14, 7043. [Google Scholar] [CrossRef]
  143. Almufadi, N.F.; Alhasson, H.F.; Alharbi, S.S. E-DFu-Net: An efficient deep convolutional neural network models for diabetic foot ulcer classification. Biomol. Biomed. 2025, 25, 445. [Google Scholar] [CrossRef]
  144. Ajay, A.; Bisht, A.S.; Karthik, R. Dense-ShuffleGCANet: An Attention-Driven Deep Learning Approach for Diabetic Foot Ulcer Classification Using Refined Spatio-Dimensional Features. IEEE Access 2025, 13, 5507–5521. [Google Scholar] [CrossRef]
  145. Karthik, R.; Ajay, A.; Jhalani, A.; Ballari, K.; K, S. An explainable deep learning model for diabetic foot ulcer classification using swin transformer and efficient multi-scale attention-driven network. Sci. Rep. 2025, 15, 4057. [Google Scholar] [CrossRef]
146. Ullah, S.; Javed, A.; Aljasem, M.; Saudagar, A.K.J. Eff-ReLU-Net: A deep learning framework for multiclass wound classification. BMC Med. Imaging 2025, 25, 257. [Google Scholar] [CrossRef]
  147. Reis, S.S.; Pinto-Coelho, L.; Sousa, M.C.; Neto, M.; Silva, M.; Sequeira, M. Evaluating Skin Tone Fairness in Convolutional Neural Networks for the Classification of Diabetic Foot Ulcers. Appl. Sci. 2025, 15, 8321. [Google Scholar] [CrossRef]
148. Maurya, L.; Mirza, S. MCTFWC: A multiscale CNN-transformer fusion-based model for wound image classification. Signal Image Video Process. 2025, 19, 574. [Google Scholar] [CrossRef]
  149. Fitriah, N.; Sriani, S. Classification of Foot Wound Severity in Type 2 Diabetes Mellitus Patients Using MobileNetV2-Based Convolutional Neural Network. J. Appl. Informatics Comput. 2025, 9, 2163–2170. [Google Scholar] [CrossRef]
  150. Bansal, N.; Vidyarthi, A. Multivariate Feature-based Analysis of the Diabetic Foot Ulcers Using Machine Learning Classifiers. In Proceedings of the 2024 Sixteenth International Conference on Contemporary Computing, Noida, India, 8–10 August 2024; pp. 527–534. [Google Scholar] [CrossRef]
  151. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
  152. Vardasca, R.; Marques, A.; Carvalho, R.; Gabriel, J. Thermal imaging of the foot in different forms of diabetic disease. In Infrared Imaging; IOP Publishing: Bristol, UK, 2015; Available online: https://iopscience.iop.org/book/edit/978-0-7503-1143-4/chapter/bk978-0-7503-1143-4ch27 (accessed on 24 November 2025).
  153. Gaw, N.; Schwedt, T.J.; Chong, C.D.; Wu, T.; Li, J. A clinical decision support system using multi-modality imaging data for disease diagnosis. IISE Trans. Healthc. Syst. Eng. 2018, 8, 36–46. [Google Scholar] [CrossRef]
  154. Dar, R.A.; Rasool, M.; Assad, A. Breast cancer detection using deep learning: Datasets, methods, and challenges ahead. Comput. Biol. Med. 2022, 149, 106073. [Google Scholar] [CrossRef]
  155. Ali, S.; El-Sappagh, S.; Ali, F.; Imran, M.; Abuhmed, T. Multitask Deep Learning for Cost-Effective Prediction of Patient’s Length of Stay and Readmission State Using Multimodal Physical Activity Sensory Data. IEEE J. Biomed. Health Informatics 2022, 26, 5793–5804. [Google Scholar] [CrossRef]
  156. Alzubaidi, L.; Al-Amidie, M.; Al-Asadi, A.; Humaidi, A.J.; Al-Shamma, O.; Fadhel, M.A.; Zhang, J.; Santamaría, J.; Duan, Y. Novel transfer learning approach for medical imaging with limited labeled data. Cancers 2021, 13, 1590. [Google Scholar] [CrossRef]
Figure 1. PRISMA flowchart of primary study selection. Studies published between 2010 and 2025 were included. Records were excluded if they lacked an ML/DL model or an image-based method, were not written in English, were preprints, had no AI application, did not address the targeted search terms, or did not report accuracy measurements in the validation.
Figure 2. Term frequency trends in Web of Science, Scopus, and PubMed databases (2015–2022) using the AND connector: (a) CNN AND DFU, (b) ML AND DFU, and (c) DL AND DFU. Percentages are calculated based on the total number of retrieved records from each database for the specified time range.
Figure 3. ML and DL in DFU research trends among included studies published between 2010 and 2025. Percentages are based on the total number of included studies (n = 102). The x-axis represents publication years, while the y-axis indicates the number of publications.
Figure 4. (a) Percentage of included studies (n = 102) by ML/DL functionality for DFU diagnosis. (b) Percentage of studies using different DFU image types (colored, thermal, hyperspectral). (c) Percentage of segmentation studies using public datasets. (d) Percentage of classification studies using public datasets. Percentages are calculated relative to the total number of included studies. Data spans the publication period of 2010–2025.
Figure 5. Graphical display of machine/deep learning references in included studies.
Figure 6. QUADAS-2 quality assessment graphs depicting the risk of bias and applicability concerns, presented as percentages across the 66 included studies.
Table 1. The research strategy followed in this study. The same search strategy was applied to all databases on 15 October 2025: 'Diabetic Foot Ulcer OR DFU', 'Thermal' OR 'Thermographic', 'Segmentation OR Classification' AND 'Machine Learning' OR 'Deep Learning' OR 'Artificial Intelligence', 'Foot Wound Tissue' OR 'Diabetic Foot Infections', 'Foot Ulceration' OR 'Chronic Wound Analysis', 'Foot diagnosis OR diabetic foot care', 'intelligence' OR 'Full Text OR Paper', 'Title' OR 'Survey' OR 'Overview'.

| Database | # of Identified Records |
|---|---|
| IEEE Xplore | 195 |
| Science Direct | 708 |
| PubMed (MEDLINE) | 1200 |
| arXiv.org | 60 |
| MDPI | 174 |
| IEEE | 160 |
| PLoS | 60 |
| Nature | 172 |
| Scopus | 931 |
| Springer | 682 |
| Elsevier | 334 |
| Taylor & Francis | 69 |
| Frontiers | 46 |
| Wiley Online Library | 78 |
Table 2. Summary of statistical performance indicators used in the analyzed papers.

| Metric | Formula | Definition |
|---|---|---|
| Accuracy | $\frac{TP+TN}{TP+TN+FP+FN}$ (1) | The proportion of all predictions, positive and negative, that are correct. |
| Precision | $\frac{TP}{TP+FP}$ | The proportion of positive predictions that are correct. |
| Recall (Sensitivity) | $\frac{TP}{TP+FN}$ | The proportion of actual positive cases that are correctly predicted. |
| F1 score (Dice Similarity Coefficient (DSC)) | $\frac{2\cdot TP}{2\cdot TP+FP+FN}$ | The harmonic mean of precision and recall; it is pulled toward the lower of the two, so it increases only if both improve. |
| Specificity | $\frac{TN}{TN+FP}$ | The proportion of actual negative cases that are correctly identified (true negative rate). |
| Jaccard index (Intersection over Union) | $\frac{TP}{TP+FN+FP}$ | Measures the degree of similarity between two sets as the ratio of their intersection to their union. |
| Five-fold cross-validation | $CV_{(k)}=\frac{1}{k}\sum_{i=1}^{k}MSE_i$ | Averages the Mean Squared Errors (MSEs) across $k$ folds to detect overfitting and assess generalization performance. |
| Intersection over Union (IoU) | $IoU=\frac{\lvert A\cap B\rvert}{\lvert A\cup B\rvert}$ | Measures the overlap between the predicted segmentation ($A$) and the ground-truth segmentation ($B$), divided by their union; higher IoU values indicate better segmentation performance. |
| Root Mean Square Error (RMSE) | $\sqrt{\frac{\sum_{i=1}^{N}(x_i-\hat{x}_i)^2}{N}}$ (2) | The square root of the Mean Squared Error (MSE) of an estimator of a population parameter. |
| Area Under Curve (AUC) | — | The entire two-dimensional area underneath the ROC curve. |
| Receiver Operating Characteristic curve (ROC) | — | Shows the performance of a classifier at all classification thresholds. |
| Error rate | $\frac{\lvert\text{Approximate Value}-\text{Exact Value}\rvert}{\text{Exact Value}}\times 100$ | The deviation of an approximate or measured value from an exact or known value, expressed as a percentage. |
| Mean Average Precision (mAP) | $\frac{1}{n}\sum_{k=1}^{n}AP_k$ (3) | Commonly used for evaluating the detection and classification of objects (i.e., localization plus classification). |
| Kappa index | $\frac{P_0-P_e}{1-P_e}$ (4) | Measures the level of inter-rater agreement between categorical variables, corrected for chance. |
| Matthews Correlation Coefficient (MCC) | $\frac{TP\times TN-FP\times FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}$ | Measures the correlation between predicted and actual binary labels; robust to class imbalance. |
| False Positive Rate (FPR) | $\frac{FP}{FP+TN}$ | The proportion of actual negative cases that are incorrectly predicted as positive. |
| Overlap score | $\text{overlap}(X,Y)=\frac{\lvert X\cap Y\rvert}{\min(\lvert X\rvert,\lvert Y\rvert)}$ | A similarity measure of the overlap between two finite sets. |
| Success rate | $C=\frac{x}{y}\times 100$ (5) | The fraction of successful attempts out of the total number of attempts, expressed as a percentage. |
| System Usability Scale (SUS) | — | A commonly used method for measuring perceived usability of products and services, consisting of a 10-item questionnaire with participants answering each item on a five-level Likert scale. |

(1) TP is true positive, TN is true negative, FP is false positive, and FN is false negative cases; (2) N is the number of data points, $x_i$ are real-time series observations, and $\hat{x}_i$ refers to time series estimates; (3) AP is the Average Precision, k represents an individual class, and n represents the total number of classes; (4) $P_0$ indicates the relative agreement among raters, and $P_e$ indicates the hypothetical probability of chance agreement; (5) C is the chance of success or failure, x is the number of successes or failures, and y is the total number of attempts.
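The confusion-matrix metrics in Table 2 all derive from the same four counts. The sketch below (illustrative only, not code from the reviewed studies; the function name and counts are assumptions) computes them for one hypothetical binary DFU classifier:

```python
# Illustrative sketch: the Table 2 confusion-matrix metrics from raw counts.
from math import sqrt

def confusion_metrics(tp, tn, fp, fn):
    """Binary-classification metrics as defined in Table 2."""
    return {
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),
        "precision":   tp / (tp + fp),
        "recall":      tp / (tp + fn),               # sensitivity
        "specificity": tn / (tn + fp),
        "f1":          2 * tp / (2 * tp + fp + fn),  # Dice on the positive class
        "jaccard":     tp / (tp + fn + fp),          # intersection over union
        "fpr":         fp / (fp + tn),
        "mcc": (tp * tn - fp * fn)
               / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }

# Hypothetical counts: 90 TP, 85 TN, 10 FP, 15 FN over 200 test images
m = confusion_metrics(tp=90, tn=85, fp=10, fn=15)
print(round(m["accuracy"], 3), round(m["specificity"], 3))  # 0.875 0.895
```

Note how the F1/DSC value (0.878 here) sits between precision (0.900) and recall (0.857) but closer to the lower of the two, as the Table 2 definition describes.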
Table 3. A summary of the mean (±standard deviation) overall performance metrics for detection studies.

| Accuracy (n = 1) | AUC (n = 1) | Sensitivity (n = 1) | Specificity (n = 1) |
|---|---|---|---|
| 0.97 ± 0.03 | 0.97 ± 0.02 | 0.95 ± 0.02 | 0.85 ± 0.02 |
| (0.90–1.00) | (0.92–1.00) | (0.91–0.98) | (0.82–0.90) |

n indicates the number of studies reporting metrics.
Table 4. A summary of the mean (±standard deviation) overall performance metrics for segmentation studies.

| Accuracy (n = 8) | AUC (n = 2) | Sensitivity (n = 5) | Specificity (n = 4) | IoU (n = 7) |
|---|---|---|---|---|
| 0.94 ± 0.05 | 0.99 ± 0.01 | 0.91 ± 0.04 | 0.95 ± 0.03 | 0.87 ± 0.07 |
| (0.85–0.99) | (0.96–1.00) | (0.88–0.96) | (0.90–0.98) | (0.80–0.95) |

n indicates the number of studies reporting metrics.
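The IoU and Dice values pooled for segmentation studies are pixel-wise overlap scores. A minimal sketch (assumed toy masks, not data from any reviewed study) of how both are computed for a predicted binary mask against the ground truth:

```python
# Sketch: pixel-wise IoU (Jaccard) and Dice (DSC) between two binary masks,
# each flattened to a list of 0/1 pixel labels.
def mask_overlap(pred, truth):
    inter = sum(p & t for p, t in zip(pred, truth))   # TP pixels
    union = sum(p | t for p, t in zip(pred, truth))   # TP + FP + FN pixels
    total = sum(pred) + sum(truth)
    iou = inter / union if union else 1.0
    dice = 2 * inter / total if total else 1.0
    return iou, dice

pred  = [1, 1, 0, 0, 1, 0]   # toy 6-pixel prediction
truth = [1, 0, 0, 0, 1, 1]   # toy 6-pixel ground truth
iou, dice = mask_overlap(pred, truth)   # 2 shared pixels, 4 in the union
print(iou, round(dice, 3))  # 0.5 0.667
```

Dice is always at least as large as IoU on the same masks, which is why DSC figures in the segmentation tables tend to exceed the corresponding IoU figures.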
Table 5. A summary of the mean (±standard deviation) overall performance metrics for classification studies.

| Accuracy (n = 26) | AUC (n = 6) | Sensitivity (n = 16) | Specificity (n = 15) |
|---|---|---|---|
| 0.93 ± 0.04 | 0.94 ± 0.03 | 0.92 ± 0.05 | 0.94 ± 0.04 |
| (0.88–0.98) | (0.90–0.98) | (0.87–0.96) | (0.89–0.97) |

n indicates the number of studies reporting metrics.
Table 6. A summary of the mean (±standard deviation) overall performance metrics for hybrid (classification and segmentation) studies.

| Accuracy (n = 7) | Sensitivity (n = 3) | Specificity (n = 3) | DSC (n = 1) | SUS (n = 1) |
|---|---|---|---|---|
| 0.88 ± 0.05 | 0.89 ± 0.06 | 0.93 ± 0.04 | 0.94 ± 0.02 | 0.88 ± 0.02 |
| (0.80–0.93) | (0.83–0.95) | (0.89–0.96) | (0.92–0.96) | (0.86–0.90) |

n indicates the number of studies reporting metrics.
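The pooled summaries in Tables 3–6 are mean ± sample standard deviation with a min–max range across the studies reporting each metric. A short sketch of this pooling, using hypothetical per-study accuracies (not the actual extracted values):

```python
# Sketch: pooling per-study metric values into "mean ± SD (min-max)" form,
# as presented in Tables 3-6. The accuracy list below is hypothetical.
from statistics import mean, stdev

def pool(values):
    """Return (mean, sample SD, (min, max)) rounded to two decimals."""
    return round(mean(values), 2), round(stdev(values), 2), (min(values), max(values))

accs = [0.80, 0.85, 0.88, 0.90, 0.91, 0.93, 0.93]  # hypothetical n = 7 studies
avg, sd, rng = pool(accs)
print(f"{avg} ± {sd} ({rng[0]}–{rng[1]})")  # 0.89 ± 0.05 (0.8–0.93)
```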
Table 7. GRADE judgments for included studies by task.

| Task | Risk of Bias | Inconsistency | Indirectness | Overall Evidence Quality |
|---|---|---|---|---|
| Detection | Moderate | Moderate | High | Moderate |
| Segmentation | Low | Low | Moderate | High |
| Classification | High | High | High | Low to Moderate |
Table 8. Main characteristics of included detection-targeted DFU studies including author, year, journal/conference ranking, ML/DL model, dataset employed, validation metrics, and their results.

| Author [Refs.] | Year | Journal Rank (SJR)/Conference Rank (Qualis) | ML/DL Model | Dataset | Validation Parameter | Value |
|---|---|---|---|---|---|---|
| Dremin et al. [101] | 2021 | Q1 | ANN | Private hyperspectral images dataset | Sensitivity, Specificity, AUC | 0.95, 0.85, 0.97 |
| Nag et al. [42] | 2021 | Not Yet Assigned | SVM, k-NN, and DT | PLANTAR THERMOGRAM | Accuracy | 97.778% |
| Cassidy et al. [64] | 2022 | Q1 | Faster R-CNN | Real-time images | NA | NA |
| Thotad et al. [102] | 2023 | Q1 | EfficientNet | DFUC2020 | Accuracy, F1-score, Recall, Precision | 98.97%, 98%, 98%, and 99% |
| Sarmun et al. [103] | 2024 | Q1 | Combined deep learning models | DFUC 2020 | Localization accuracy | 86.4% |
| Sendilraj et al. [104] | 2024 | Q1 | DFUCare platform | DFUC 2020 | Usability (F1-score, mAP, Ischemia, Infection) | F1: 0.80, mAP: 0.861, Ischemia: 94.81%, Infection: 79.76% |
| Biswas et al. [105] | 2024 | Q1 | XAI-FusionNet | DFU dataset (Kaggle) | Accuracy, transparency | Accuracy: 99.05%, Precision: 100%, Recall: 98.18%, AUC: 99.09% |
| El-Kady et al. [106] | 2024 | Q2 | ResNet + Generative Adversarial Network (GAN) | Clinical dataset (Egypt) | Precision, F1-score | Precision: 0.85, F1-score: 0.84 |
| Azeem et al. [107] | 2024 | Q2 | SSD and YOLO architectures | Clinical dataset | Optimization performance | Improved detection (exact values not mentioned) |
| Verma [108] | 2024 | Q2 | Smart image processing techniques | Thermal dataset | Early detection | ResNet50: 89.1%, EfficientNetB0: 99.4% |
| Busaranuvong et al. [109] | 2024 | Q1 | ConDiff (guided conditional diffusion classifier) | Infection prediction dataset | Prediction accuracy | Enhanced accuracy (exact values not mentioned) |
| Eldin et al. [110] | 2025 | Q1 | Deep neural networks (ORB + DL) | Plantar thermogram dataset | Accuracy, F1-score, AUC | Accuracy: 98.51%, F1: 98.97%, AUC: 1.00 |
| Rathore et al. [111] | 2025 | Q1 | Feature-explainability-based deep learning | DFU_XAI dataset | Interpretability, accuracy | Accuracy: 99.05%, AUC: 0.99, Precision: 100% |
| Debnath et al. [112] | 2025 | Q1 | Sustainable AI with deep learning | DFUC 2020 | Early diagnosis, resource efficiency | DenseNet: 92.2%, MobileNet: 95.4%, FusionNet: 97.8% |
| Mahmud et al. [113] | 2025 | Q1 | DFU_DIALNet (Grad-CAM + LIME) | DFU dataset (clinical) | Reliability, trustworthiness | Grad-CAM accuracy: 98.76%, LIME: improved explainability |
| Girmaw et al. [114] | 2025 | Q1 | MobileNetV2 | Ethiopian hospital dataset | Detection and grading | Accuracy: 100%, AUC: 1.00 |
| Pradhana et al. [115] | 2025 | Q2 | CNN + SMOTE-IPF | Thermogram images | Detection on imbalanced data | AHE accuracy: 99.60%, gamma correction: 98.80% |
Table 9. Characteristics of segmentation-targeted DFU studies including author, year, journal/conference ranking, ML/DL model, dataset employed, validation metrics, and their results.

| Author [Refs.] | Year | Journal Rank (SJR)/Conference Rank (Qualis) | ML/DL Model | Dataset | Validation Parameter | Value |
|---|---|---|---|---|---|---|
| Wang et al. [99] | 2016 | Q1 | SVM | Private dataset containing 100 foot ulcer color images | Sensitivity, Specificity | 73.3%, 94.6% |
| Cui et al. [43] | 2019 | B1 | CNN, SVM | Dataset of 445 images (392 for validation and 53 for testing) | Precision, Sensitivity, Specificity, Accuracy, Mean IoU, Dice and MCC | 0.722, 0.9, 0.947, 0.934, 0.660, 0.770 and 0.753 |
| Gamage et al. [116] | 2019 | Not Yet Assigned | Mask R-CNN (backbone = ResNet-50, ResNet-101) | Private dataset of 2400 images | Average Precision, IoU | ResNet-50 = 0.44, 0.51; ResNet-101 = 0.51, 0.62 |
| Ohura et al. [51] | 2019 | Q2 | U-Net and VGG16 | Sacral pressure ulcer (PU) datasets | AUC, Specificity and Sensitivity | 0.997, 0.943 and 0.993 |
| Rania et al. [50] | 2020 | C | U-Net | ESCALE | Accuracy, IoU and DSC | 94.96%, 94.86% and 97.25% |
| Munoz et al. [48] | 2020 | Q2 | Mask R-CNN | Private dataset | Accuracy, Sensitivity, Precision, Specificity and F-measure | 98.01%, 96.97%, 97.94%, 95.97% and 97.01% |
| Bouallal et al. [55] | 2020 | B4 | U-Net | Private dataset | IoU and DSC | Multimodal images: IoU = 98.37% and DSC = 99%; thermal data: IoU = 97.43% and DSC = 98.68% |
| Mahbod et al. [53] | 2021 | A1 | U-Net and LinkNet | Private dataset | DSC, Precision, Recall and IoU | 84.42, 92.68, 91.80, 85.51 |
| Galdran et al. [54] | 2021 | A1 | Double Encoder-ResUnet (DE-ResUnet) | Private dataset | Precision, Recall and DSC | 90.03%, 86.91% and 84% |
| Chitra et al. [117] | 2022 | Q4 | Random forest algorithm (RF) | Private dataset | Accuracy | 93.8% |
| Heras et al. [46] | 2022 | Q4 | Logistic regression (LR), morphological operators | Private dataset | Jaccard index, Accuracy, Recall, Precision and DSC | 0.81, 0.94, 0.86, 0.91 and 0.88 |
| Bougrine et al. [58] | 2022 | Q1 | FCN, SegNet, U-Net | Private dataset | RMSE and DSC | 5.12 pixels and 94% |
| Bouallal et al. [56] | 2022 | Q3 | FCN, SegNet, U-Net | Private dataset | IoU | 97% |
| Chang et al. [118] | 2022 | Q1 | U-Net, DeepLabV3, PSPNet, FPN and Mask R-CNN | Private dataset | Precision, Recall and Accuracy | DeepLabV3 = 0.9915, 0.9915, 0.9957 in classification; DeepLabV3 = 0.9888, 0.9887, 0.9925 in segmentation |
| Jain et al. [79] | 2022 | Ph.D. Thesis | ProNet | Private dataset | Accuracy | 98.9% |
| Huang et al. [49] | 2022 | Q1 | Fast R-CNN, GoogLeNet, SURF | Private | Accuracy | 90% |
| Alshayeji et al. [47] | 2023 | Q1 | SVM | Private dataset | Sensitivity, Precision and AUC | 97.81%, 97.9% and 0.9995 |
| Bougrine et al. [60] | 2019 | B1 | FCN, SegNet, U-Net | Private dataset | Dice Similarity Coefficient (DSC), standard deviations (STD) | FCN = 96.16% ± 0.85%; SegNet = 97.26% ± 0.69%; U-Net = 74.35% ± 9.58% |
| Wang et al. [119] | 2020 | Q1 | MobileNetV2 and CCL | Private dataset consisting of 1109 images | Precision, Recall, and the Dice coefficient | 91.01%, 89.97% and 90.47% |
| Lan et al. [120] | 2023 | Q1 | FusionSegNet | Private dataset | AUC, Accuracy, Sensitivity, Specificity, F1-score | 98.93%, 95.78%, 94.27%, 96.88%, 94.91% |
| Jishnu et al. [121] | 2023 | Not Yet Assigned | AFSegGAN | DFUC2021 | Dice score, IoU | 93.11%, 99.07% |
| Jiao et al. [122] | 2025 | Q1 | UFOS-Net with EMS and MODA | DFU segmentation | Dice, IoU | 77.45%, 66.64% |
| Niri et al. [123] | 2025 | Q1 | Dual attention U-Net with SE blocks | Wound segmentation | Dice, IoU | 94.1%, 89.3% |
Table 10. Characteristics of classification-targeted DFU studies including author, year, dataset employed, ML/DL model used, validation metrics, and their results.

| Author [Refs.] | Year | Journal Rank (SJR)/Conference Rank (Qualis) | ML/DL Model | Dataset | Validation Parameter | Value |
|---|---|---|---|---|---|---|
| Botros et al. [124] | 2016 | Not Yet Assigned | SVM with global average pooling (GAP) | Private dataset | Accuracy and Precision | 96.4% and 96.4% |
| Kasbekar et al. [125] | 2017 | Q1 | Decision tree | Private dataset | Error rate and Accuracy | 3.6% and 94% |
| Adam et al. [126] | 2018 | Q2 | SVM, discrete wavelet transform (DWT) and higher-order spectra (HOS) | Thermogram images (33 healthy and 33 with type 2 diabetes) | Accuracy, Sensitivity and Specificity | 89.39%, 81.81% and 96.97% |
| Goyal et al. [100] | 2018 | Q1 | CNN (DFUNet) and conventional ML (CML) | DFU A(I) | Sensitivity, F-measure, Specificity, Precision and AUC | 0.929, 0.931, 0.908, 0.942 and 0.950 |
| Vardasca et al. [127] | 2018 | Q3 | SVM and k-NN | Private dataset | Accuracy and Positive prediction | 92.5% and 20% |
| Goyal et al. [63] | 2018 | Q1 | Faster R-CNN, MobileNet, InceptionV2 | 1775 foot images with DFU | mAP and Speed | 91.8% and 48 ms |
| Vardasca et al. [128] | 2019 | Q3 | ANN, SVM and k-NN | Private | Accuracy, Specificity and Sensitivity | 81.25%, 80% and 100% |
| Gamage et al. [73] | 2019 | B1 | Pre-trained CNN, ANN, RF, SVM and singular value decomposition (SVD) | Private dataset of 2400 images | Accuracy and F-score | 96.22% and 0.9610 |
| Alzubaidi et al. [129] | 2020 | Q1 | QUTNet based on D-CNN, k-NN and SVM | DFU (Alzubaidi) | Precision, Recall and DSC | 95.4%, 93.6%, 94.5% |
| Goyal et al. [68] | 2020 | Q1 | Faster R-CNN and superpixel color descriptor | DFU B(II) | Accuracy in ischemia and infection classification | 90% and 73% |
| Cruz et al. [81] | 2020 | Q1 | DFTNet | PLANTAR THERMOGRAM | Sensitivity, Specificity and Accuracy | 0.95, 0.94 and 0.94 |
| Amin et al. [59] | 2020 | Q1 | YOLOv2-DFU | Part (B)(II) | Sensitivity, Recall, Precision and Accuracy | 0.99 and 0.97 accuracy on infection and ischemia; 0.98 and 0.97 IoU on ischemia and infection |
| Liu et al. [65] | 2020 | Not Yet Assigned | Faster R-CNN | Private | Accuracy | 95% |
| Padierna et al. [78] | 2021 | Q2 | SVM | Private | Accuracy, Sensitivity and Specificity | 92.64, 91.80 and 93.59 |
| Niri et al. [62] | 2021 | A1 | Spx-based FCNs | Private and ESCALE | Accuracy, Sensitivity, Specificity, Precision and DSC | 92.68%, 74.53%, 94.39%, 78.07% and 75.74% |
| Cassidy et al. [66] | 2021 | Q3 | Faster R-CNN, FRCNN ResNet101, FRCNN Inception-v2-ResNet101, YOLOv5 and EfficientDet | DFUC 2020 | Recall, Precision, F1-score and mAP | F1 scores = 0.6784, 0.6623, 0.6716, 0.6612 and 0.6929 |
| Galdran et al. [130] | 2021 | Not Yet Assigned | Big Image Transfer (BiT), EfficientNet, Vision Transformers (ViT), Data-efficient Image Transformers (DeiT) | DFUC2021 | DSC, AUC, Recall and Precision | 62.16, 88.55, 65.22 and 61.40 |
| Selle et al. [131] | 2021 | Not Yet Assigned | SVM | Private | Accuracy | 96.42% |
| Xu et al. [72] | 2021 | Q1 | Pre-trained vision transformer models with class knowledge banks (CKBs) | DFU B(II) | Accuracy, Sensitivity, Precision, Specificity, DSC and AUC | 90.90, 86.09, 95, 95.59, 90.30 and 96.80 |
| Da et al. [67] | 2021 | B | Faster R-CNN | DFUC 2020 | mAP and DSC | 91.4 and 94.8 |
| Alzubaidi et al. [129] | 2021 | Q1 | DFU_QUTNet and SVM | DFU (Alzubaidi) | Precision, Recall and DSC | 95.4%, 93.6% and 94.5% |
| Yap et al. [69] | 2021 | A | EfficientNetB0 with data augmentation and transfer learning | DFUC2021 | Average Precision, Recall and F1-score | 0.57, 0.62 and 0.55 |
| Bloch et al. [77] | 2021 | A1 | EfficientNet | DFUC2021 | DSC | 60.77% |
| Khandakar et al. [80] | 2021 | Q1 | MobileNetV2 | PLANTAR THERMOGRAM | DSC | 97 |
| Das et al. [132] | 2022 | Q2 | ResKNet | DFU B(II) | AUC | 0.99 for ischemia and 0.89 for infection |
| Al-Garaawi et al. [70] | 2022 | Q1 | DFU-RGB-TEX-NET | DFU A(I) and DFU B(II) | AUC and DSC | 0.981, 0.952 on Part A and 0.820, 0.744 on Part B (infection) |
| Al-Garaawi et al. [71] | 2022 | Q3 | GoogLeNet CNN | DFU A(I) and DFU B(II) | Sensitivity, Specificity, Precision, Accuracy, DSC and AUC | 0.93, 0.90, 0.94, 0.92, 0.93 and 0.97 |
| Husers et al. [75] | 2022 | Not Yet Assigned | MobileNetV1 | Private dataset | Accuracy, Precision, Recall and F1-score | 69%, 67%, 69% and 0.73 |
| Santos et al. [76] | 2022 | B1 | VGG-16, VGG-19, ResNet-50, InceptionV3 and DenseNet-201 | Private | Accuracy and Kappa index | 95.04% and 91.85% |
| Yogapriya et al. [133] | 2022 | Q2 | DFINET | DFU B(II) | Accuracy and MCC | 91.98% and 0.84 |
| Jain et al. [79] | 2022 | Ph.D. Thesis | SIFT and SURF combined with BOF, and SVM | PLANTAR THERMOGRAM | Accuracy, Specificity and Sensitivity | 91.23%, 91.50% and 92.41% |
| Jain et al. [134] | 2022 | Not Yet Assigned | ProNet, AlexNet, ResNet | PLANTAR THERMOGRAM | Accuracy, Precision, Sensitivity, Specificity and F1-score | 98.9%, 1.000, 0.978, 1.000 and 0.988 |
| Khandakar et al. [135] | 2022 | Q1 | MLP classifier with XGBoost feature selection (top 2 features) | PLANTAR THERMOGRAM | Accuracy, Precision, Sensitivity, DSC and Specificity | 0.91, 0.91, 0.91, 0.91, 0.91 and 95.83 |
| Khandakar et al. [82] | 2022 | Q1 | VGG19 CNN | PLANTAR THERMOGRAM | Accuracy, Precision, Sensitivity, F1-score and Specificity | 94.76, 94.89, 94.67, 94.73 and 97.32 |
| Munadi et al. [84] | 2022 | Q2 | ShuffleNet and MobileNetV2 | PLANTAR THERMOGRAM | Accuracy, Sensitivity, Specificity, Precision and F-measure | 1.0, 1.0, 1.0, 1.0 and 1.0 |
| Anaya et al. [85] | 2022 | Q1 | ResNet50v2 | PLANTAR THERMOGRAM | Accuracy, Sensitivity, Specificity and FPR | 100%, 100%, 100% and 0% |
| Balasenthilkumaran et al. [87] | 2022 | Q4 | ANN, QSVM, linear discriminant, logistic regression and Gaussian naïve Bayes | Private | Accuracy and F1 score | 93.3% and 0.95 |
| Filipe et al. [83] | 2022 | Q2 | Logistic regression, quadratic SVM, linear SVM, 3-NN and weighted k-NN | PLANTAR THERMOGRAM | Accuracy, Sensitivity, Specificity, Precision, AUC and F-score | 0.924, 0.833, 0.958, 0.882 and 0.857 |
| Khosa et al. [136] | 2023 | Q2 | Custom model | PLANTAR THERMOGRAM | Sensitivity, Specificity, Accuracy, F1-score, AUC | 0.97, 0.958, 0.97, 0.891, 0.976 |
| Reyes et al. [24] | 2023 | Q1 | DFU_VIRNet | DFU (Alzubaidi) | AUC, F-score | 0.9982 and 0.9928 for ischemia; 0.9121 and 0.8363 for infection |
| Nagaraju et al. [137] | 2023 | Q1 | Inception-ResNet-v2 | DFU (Alzubaidi) | Accuracy | 99.29 |
| Biswas et al. [138] | 2023 | Q1 | DFU_MultiNet | DFU (Alzubaidi) | Accuracy | 99.06 |
| Toofanee et al. [139] | 2023 | Q1 | DFU-SIAM | DFUC2021 | Macro-F1 score, F1-score | 0.623, 0.549 for ischemia and 0.628 for infection |
| Das et al. [140] | 2024 | Q1 | HCNNet | DFU Part (B) | AUC | 0.999 |
| Fadhel et al. [141] | 2024 | Q2 | DFU_FNet and DFU_TFNet | Real-time | Accuracy, Precision, F1-score | 99.81%, 99.38% and 99.25% |
| Patel et al. [142] | 2024 | Q1 | Multi-modal deep learning framework | AZH, Medetec | Accuracy | 74.79–100% |
| Almufadi et al. [143] | 2025 | Q2 | E-DFu-Net | Transfer learning | Accuracy (ischemia, infection) | 97%, 92% |
| Ajay et al. [144] | 2025 | Q1 | Dense-ShuffleGCANet | Attention-driven mechanisms | Robustness | Strong across diverse datasets |
| Karthik et al. [145] | 2025 | Q1 | Swin Transformer + multi-scale attention | DFUC-2021 | F1-score | 80% |
| Ullah et al. [146] | 2025 | Q1 | Eff-ReLU-Net | EfficientNet-B0 + ReLU | Accuracy (Medetec, AZH) | 92.33%, 90% |
| Reis et al. [147] | 2025 | Q1 | CNN fairness evaluation | VGG16, VGG19, MobileNetV2 | Disparities in skin tone performance | Highlighted need for inclusivity |
| Maurya et al. [148] | 2025 | Q1 | MCTFWC (CNN-Transformer) | Medetec, AZH | Accuracy | High across wound types |
| Fitriah et al. [149] | 2025 | Q2 | MobileNetV2-based DFU severity classification | Low-resource settings | Efficiency | Strong results for severity grading |
| Bansal et al. [150] | 2024 | Q3 | ML classifiers + multivariate features | Custom dataset | Accuracy | Promising but limited |
| Karthik et al. [144] | 2024 | Q1 | Dense-ShuffleGCANet | Attention mechanisms | Robustness | Strong performance |
Table 11. Characteristics of hybrid classification- and segmentation-targeted DFU studies including author, year, dataset employed, ML/DL model used, validation metrics, and their results.

| Author [Refs.] | Year | Journal Rank (SJR)/Conference Rank (Qualis) | ML/DL Model | Dataset | Validation Parameter | Value |
|---|---|---|---|---|---|---|
| Wannous et al. [93] | 2010 | Q1 | SVM and mean-shift iterative color clustering algorithm | 850 color images | Overlap score, Sensitivity, Specificity, Success rate, Accuracy | 73.8%, 77%, 92%, 84%, 88% |
| Mukherjee et al. [90] | 2014 | Q2 | SVM and Bayesian classifier; color conversion and fuzzy divergence for segmentation | 767 images total (granulation tissue = 222, slough tissue = 451, necrotic tissue = 94) | Accuracy | 86.94%, 90.47%, and 75.53% for classifying granulation, slough, and necrotic tissues, respectively |
| Babu et al. [92] | 2018 | Not Yet Assigned | Naïve Bayes and Hoeffding tree classifiers with particle swarm optimization (PSO) | 3 DFU images used to test the method | Accuracy, Sensitivity and Specificity | 90.90%, 100%, 87.5% by naïve Bayes and 81.81%, 100%, 77.7% by Hoeffding tree |
| Godeiro et al. [91] | 2018 | B2 | SegNet for classification, U-Net for segmentation | 30 color images of feet and hands | Accuracy, Specificity, Sensitivity and DSC | 0.9610, 0.9876, 0.9128 and 0.9425 |
| Wijesinghe et al. [88] | 2019 | Not Yet Assigned | R-CNN and D-CNN module | 400 DFU images | SUS | 88.5 |
| Maldonado et al. [89] | 2020 | Q2 | Pretrained Mask R-CNN and Gaussian distribution | Private: DB1 with 108 images, DB2 with 141 images | Accuracy | 90.28% |
| Zhou et al. [86] | 2025 | Q1 | Mask2Former, DeepLabv3+, Swin Transformer | DFU dataset (671 images) | Accuracy, mIoU | 91.85%, 65.79% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
