A Deep Learning Approach for Diabetic Foot Ulcer Classification and Recognition

Abstract: Diabetic foot ulcer (DFU) is one of the major complications of diabetes and can result in lower-limb amputation if not treated promptly and properly. Alongside the traditional clinical approaches used in DFU classification, automatic methods based on deep learning frameworks show promising results. In this paper, we present several end-to-end CNN-based deep learning architectures, i.e., AlexNet, VGG16/19, GoogLeNet, ResNet50/101, MobileNet, SqueezeNet, and DenseNet, for infection and ischaemia categorization using the benchmark dataset DFU2020. We fine-tune the weights to overcome a lack of data and reduce the computational cost. Affine transform techniques are used for the augmentation of input data. The results indicate that ResNet50 achieves the highest accuracy of 99.49% and 84.76% for ischaemia and infection, respectively.


Introduction
Diabetes is a chronic disease that has a huge negative influence on people's lives, families, and society all over the world [1,2]. A serious consequence of diabetes is amputation of the foot or leg as a result of diabetic foot ulcers (DFUs). Recognizing infection and ischaemia is critical for determining factors that predict DFU healing progress and amputation risk. A good grasp of the vascular architecture of the leg, particularly ischaemia, enables medical professionals to make better decisions when predicting the potential of DFU healing based on available blood supply [3]. According to the International Diabetes Federation [4], roughly 463 million adults globally had diabetes in 2019. By 2045, this number is predicted to rise to 700 million.
Lower limb amputation may occur as a result of insufficient microvascular and macrovascular tissue perfusion, and infection. A diabetic patient with a "high-risk" foot needs routine doctor visits, ongoing costly medication, and hygienic personal care to prevent further complications [5]. This places a heavy financial burden on patients and their families, particularly in underdeveloped nations, where the expense of treating this illness can be as high as 5.7 years of annual income [6]. Early detection and better classification of foot problems may make it possible to intervene quickly and provide adequate therapy, either to heal foot ulcers or to prevent amputation. Early surveillance through self-diagnosis at home may help to stop the onset and progression of DFU.
The simplest monitoring method, visual inspection, has certain drawbacks, such as the inability of those with obesity or visual impairment to accurately identify subtle changes. Recent studies show that a home temperature-monitoring system could identify 97% of DFUs in their early stages [7]. Patients whose foot temperatures are monitored continuously are at a lower risk of developing foot problems. In current clinical practice, the evaluation of DFU involves several important tasks in early diagnosis and in monitoring progress, and a number of time-consuming steps need to be completed to ensure the treatment and care of DFU for each individual case. First, the patient's medical background is examined; then, diabetic foot experts extensively analyze the DFU [8]. Further testing, such as CT scans, MRIs, and X-rays, can also help doctors to analyze DFUs. A DFU typically has an irregular shape and uncertain outer boundaries. The visual characteristics of a DFU and the skin around it, such as redness, substantial callus development, and blisters, depend on its stage.
DFU has recently drawn the attention of many researchers because it is a major global issue for diabetic patients. In severe cases, the patient's survival rate is reduced due to the removal of all or a portion of a limb. Since 2020, the DFU challenge has attracted researchers to work on the identification and detection of DFU using machine learning and deep learning approaches. To detect infection and ischaemia in DFU, a new dataset and a computer vision approach were introduced in [9], using a superpixel color descriptor along with a customized machine learning approach. The authors then applied an ensemble convolutional neural network (CNN) model to identify ischaemia and infection more accurately. Their ensemble CNN deep learning algorithms outperformed handcrafted machine learning algorithms for classification tasks, achieving 90% accuracy in ischaemia classification and 73% accuracy in infection classification.
When the results of DFUC2020 were compared (deep learning-based algorithms proposed by the winning teams, including Faster R-CNN, three variations of Faster R-CNN, an ensemble approach, YOLOv3, and YOLOv5 [10]), deformable convolution, a Faster R-CNN variation, performed best, with an F1-score of 0.7434 and a mean average precision of 0.6940. In addition, a new deep convolutional neural network called DFU_QUTNet was created to automatically distinguish between the classes of normal (healthy) skin and abnormal skin (DFU) [11]. The F1-score of the DFU_QUTNet network was 94.5%. To classify DFU images, ref. [12] suggests an ensemble strategy made up of five modified convolutional neural networks, i.e., VGG-16, VGG-19, ResNet-50, InceptionV3, and DenseNet-201. The combination of the five CNNs was found to greatly improve the classification rates. After five-fold cross-validation, an average accuracy of 95.04% and a Kappa index of over 91.85% were achieved.
Xie et al. [13] developed a reliable model to predict the probability of in-hospital amputation in DFU patients. A multi-class classification model was created using the light gradient boosting machine (LightGBM) to forecast the three outcomes. In addition, they utilized the SHapley Additive exPlanations (SHAP) method to explain the model's predictions and obtained AUCs of 85%, 90%, and 73.86% for minor amputation, non-amputation, and major amputation outcomes, respectively. On foot thermogram images, ref. [14] compared numerous state-of-the-art convolutional neural networks (CNNs) with a machine learning-based scoring technique using feature selection and optimization techniques, as well as learning classifiers, and provided a reliable solution to diagnose the diabetic foot. They concluded that the AdaBoost classifier, using 10 features, obtained an F1 score of 97%, whereas MobileNetV2 produced an F1 score of only 95% for a two-foot thermogram image-based classification.
To help medical professionals make an early diagnosis, deep learning algorithms are becoming more and more popular and are achieving promising performance in different fields of bioinformatics, medical imaging, and biomedicine [9,15–20]. For DFU classification, deep learning models rely on the precise evaluation of visual cues such as texture and color descriptors. This paper presents the performance of a number of CNN architectures using pre-trained weights, outcome evaluation using various metrics, and a comparison of the top deep learning model with state-of-the-art methods.
The major contributions of this study are as follows:
• To use several end-to-end CNN-based deep learning architectures to transfer learnt knowledge and to update and analyze visual features for infection and ischaemia categorization using the DFU2020 dataset.
• To use fine-tuned weights to overcome a lack of data and avoid computational costs.
• To investigate whether affine transform techniques for the augmentation of input data affect the performance of transfer learning based on a fine-tuned approach.
• To investigate and select the best CNN model for DFU classification.
The remainder of the paper is organized as follows. Section 2 covers the materials and methods utilized in the study; Section 3 contains the results; and Section 4 presents the conclusion.

Materials and Methods
In the first step of the pre-processing stage of this study, the Diabetic Foot Ulcer 2020 (DFU2020) dataset (https://www.touchendocrinology.com/diabetes/journal-articles/the-dfuc-2020-dataset-analysis-towards-diabetic-foot-ulcer-detection/, accessed on 13 June 2020) is subjected to an augmentation process. The proposed methods for DFU classification and recognition rely on transfer learning with fine-tuned weights across source and target domains. ImageNet is a large benchmark image dataset that can be used for image categorization in the source domain. It has 1000 classes, 1.28 million training pictures, and 50,000 validation pictures in total. The dataset was created with the intention of serving as a research and development tool for better computer vision systems. We retrained a number of pretrained models, including AlexNet, VGG16/19, ResNet50/101, GoogLeNet, MobileNet, SqueezeNet, and DenseNet, and we used the DFU2020 dataset to assess the efficacy of the proposed approaches.
Figure 1 depicts the proposed framework of this study. Images are pre-processed using different data-augmentation strategies, including rotation, flipping, scaling, translation, mirroring, and shearing, applied to patches to increase the number of samples in the target domain. The sample images of the augmented dataset are fed into each of the CNN architectures separately. Then, the features learnt on the ImageNet dataset of the source domain are fine-tuned by retraining the CNN models on the DFU2020 dataset of the target domain. Fine-tuned features are extracted, and classification is carried out for the two cases of ischaemia and infection. Details of the dataset, data augmentation, and classification models are given in the following subsections.
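The geometric augmentations listed above (rotation, flipping, scaling, translation, and shearing) are all affine maps, so each can be written as a 3 × 3 matrix in homogeneous coordinates and chained by matrix multiplication. The following is a minimal sketch of that idea on 2D points only; the function names and the parameter values are illustrative assumptions, not the implementation or settings used in this study.

```python
import math

def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation(degrees):
    """Counter-clockwise rotation about the origin."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def scaling(sx, sy):
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]]

def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]

def shear_x(k):
    """Horizontal shear: x' = x + k * y."""
    return [[1.0, k, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def horizontal_flip():
    """Mirror about the vertical axis (a scaling by -1 in x)."""
    return scaling(-1.0, 1.0)

def compose(*transforms):
    """Chain transforms; the first argument is applied first."""
    result = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for m in transforms:
        result = matmul3(m, result)
    return result

def apply(matrix, point):
    """Map a 2D point through an affine matrix."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])

# Illustrative chain: scale by 2, shear by 0.5, then shift one unit right.
augment = compose(scaling(2.0, 2.0), shear_x(0.5), translation(1.0, 0.0))
corner = apply(augment, (1.0, 1.0))
```

In an image pipeline the same matrix would be used to resample pixel coordinates; only the coordinate mapping is shown here.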

DFU Dataset and Preprocessing
The diabetic foot ulcer (DFU) 2020 dataset [9] contains two cases: ischaemia vs. all and infection vs. all. These are two binary classification tasks, one for ischaemia and the other for infection. Infection denotes the presence of microorganisms in the wound, and ischaemia denotes insufficient blood flow. The initial release of this dataset included 1459 photos with sizes ranging from 1600 × 1200 to 3648 × 2736. The ischaemia "positive" and "negative" classes had 1431 and 235 cases, respectively, indicating an imbalance in the dataset. By contrast, the infection "negative" and "positive" groups had 628 and 831 cases, respectively, so this part of the dataset was roughly balanced. Different data-augmentation strategies (rotation, flipping, scaling, translation, mirroring, salt-and-pepper noise, Gaussian noise, and shearing) were then applied to balance the dataset. The augmented dataset contains 4935 patches for ischaemia and 2945 patches for infection. Figure 2 illustrates sample images of all cases.

Features Learning and Classification
In medicine and medical imaging, data are always scarce, and verifying the ground truth or having medical experts label the data is a persistent problem. A convolutional neural network requires a large amount of data for feature extraction and classification, as well as substantial computational resources. Transfer learning with a fine-tuning approach has been deployed to overcome these limitations. In this section, we utilize a number of pre-trained, fine-tuned deep learning models: AlexNet, VGG16/19, GoogLeNet, ResNet50/101, MobileNet, SqueezeNet, and DenseNet. These convolutional neural networks (CNNs) have been pretrained on ImageNet [21]. In general, a pretrained model is a model created by someone else to tackle a similar problem, so it does not have to be developed from scratch. The only change required in a pretrained model is to replace the last three layers with ones suited to the new task. For automatic feature extraction, we employed nine deep learning models, i.e., AlexNet, GoogLeNet, VGG16, VGG19, MobileNet, ResNet50, ResNet101, SqueezeNet, and DenseNet.
Krizhevsky et al. proposed AlexNet in 2012 [21]. It won first place in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012. It has eight layers: five convolutional layers followed by three fully connected layers. Max-pooling layers are placed after the first two convolutional layers. The third, fourth, and fifth convolutional layers are directly connected. The fifth convolutional layer is followed by a max-pooling layer, whose output is passed to the fully connected layers. The softmax classifier is used in the final fully connected layer for classification.
In 2014, Simonyan and Zisserman proposed VGG [22]. The VGG16 variant has 16 weight layers: 13 convolutional layers with 3 × 3 kernels followed by three fully connected layers. The VGG19 variant has 19 weight layers: 16 such convolutional layers followed by three fully connected layers.
GoogLeNet was created in a research project at Google in 2014 [23]. In contrast to other models, the GoogLeNet architecture contains inception blocks instead of a basic sequential structure. It includes nine inception modules, along with convolution and max-pooling layers. Additionally, the outputs of many layers are concatenated to boost the model's capacity for learning.
ResNet first appeared in 2015. It supports hundreds or even thousands of layers with encouraging results. ResNet popularized the notion of skip connections, which address the vanishing gradient issue. Both ResNet50 and ResNet101 are used in this study. ResNet50 contains 50 layers. It has five residual blocks, each of which has an identity block and a convolution block. ResNet101 has 101 layers. It includes three convolution and identity blocks, in addition to three residual blocks.
In 2016, SqueezeNet was released [24]. It begins with convolution and max-pooling layers. Following the initial layers, there are eight fire modules, and convolutional and average-pooling layers complete the network.
DenseNet [25] can also be trained with hundreds or even thousands of layers. However, it differs from ResNet in that it uses concatenation rather than addition. Each layer in a dense block is linked to all of the layers before it.
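The difference between the two merge rules can be illustrated on plain feature vectors. In this minimal sketch, `residual_merge` and `dense_merge` are hypothetical helper names chosen for illustration, and flat Python lists stand in for feature maps: the ResNet-style merge adds element-wise and keeps the width fixed, whereas the DenseNet-style merge concatenates the outputs of all earlier layers, so the width grows with depth.

```python
def residual_merge(x, fx):
    """ResNet-style skip connection: add the block input x to the block
    output fx element-wise. Both must have the same width."""
    if len(x) != len(fx):
        raise ValueError("addition requires equal widths")
    return [a + b for a, b in zip(x, fx)]

def dense_merge(previous_outputs):
    """DenseNet-style connectivity: concatenate the outputs of all
    earlier layers into one wider feature vector."""
    merged = []
    for features in previous_outputs:
        merged.extend(features)
    return merged

added = residual_merge([1.0, 2.0], [10.0, 20.0])            # width stays 2
stacked = dense_merge([[1.0, 2.0], [3.0], [4.0, 5.0]])      # width grows to 5
```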
A key component of MobileNet [26] is a structure known as depth-wise separable convolutions. Linear bottlenecks between the layers and skip connections between the bottlenecks are two further characteristics of this architecture.
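A depth-wise separable convolution factors a standard convolution into a per-channel (depth-wise) spatial filter followed by a 1 × 1 (point-wise) convolution that mixes channels, which reduces the number of weights considerably. The sketch below counts weights for both forms, ignoring biases; the layer sizes in the example are arbitrary assumptions and are not taken from the MobileNet configuration used in this study.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out

def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depth-wise k x k convolution (one filter per input
    channel) plus a 1 x 1 point-wise convolution (bias ignored)."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise

# Illustrative layer: 3 x 3 kernel, 32 input channels, 64 output channels.
standard = standard_conv_params(3, 32, 64)
separable = depthwise_separable_params(3, 32, 64)
```

For this illustrative layer the standard form needs 18,432 weights and the separable form 2,336.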

Experimental Results, Analysis, and Comparison
This section presents the experimental setup and the analysis of results for identifying DFU using convolutional neural network-based architectures. Using the knowledge pre-learnt on the ImageNet dataset, we retrain and assess the CNNs on various patterns of ischaemia and infection using a transfer-learning approach in order to obtain the best parameter values for our system.

Results and Analysis
To analyze the performance of our proposed systems for ischaemia and infection classification, we split the data into training, test, and validation sets with an 80:10:10 split. Prior to training, we configured the learning parameters to maximize accuracy while preserving learning stability. We used a momentum of around 0.8 and an initial learning rate of 0.001. Each model was trained for 30 epochs with a batch size of 32.
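To make these settings concrete, the sketch below performs an 80:10:10 split by index and applies one classical momentum update with the stated learning rate and momentum. It is only an illustration under assumptions: the shuffling seed, the helper names, and the use of the classical update rule (velocity = momentum × velocity − learning rate × gradient) are not specified in this study, and the patch count of 4935 is taken from the augmented ischaemia set purely as an example.

```python
import random

def split_indices(n, train_pct=80, val_pct=10, seed=0):
    """Shuffle indices 0..n-1 and split them by integer percentages.
    Whatever remains after the training and validation parts is the test set."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test

def batches_per_epoch(n_samples, batch_size=32):
    """Number of mini-batches needed to see every sample once."""
    return -(-n_samples // batch_size)  # ceiling division

def sgd_momentum_step(weight, grad, velocity, lr=0.001, momentum=0.8):
    """One classical momentum update for a single scalar weight."""
    velocity = momentum * velocity - lr * grad
    return weight + velocity, velocity

train, val, test = split_indices(4935)
steps = batches_per_epoch(len(train))
w, v = sgd_momentum_step(1.0, 2.0, 0.0)   # first step
w, v = sgd_momentum_step(w, 2.0, v)       # second step reuses the velocity
```

With these assumptions the split yields 3948 training, 493 validation, and 494 test indices, and 124 mini-batches per epoch.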
The evaluation measures employed in this study to assess the models' effectiveness are sensitivity, specificity, precision, F-measure, accuracy, area under the curve (AUC), and the Matthews correlation coefficient (MCC). Precision, or positive predictive value (PPV), is derived as in Equation (1), where P denotes precision, TP denotes true positives, FP denotes false positives, and FN denotes false negatives.
Recall, also known as the true positive rate or sensitivity, is the ratio of correctly predicted positive observations to all observations in the actual positive class. The formula for recall is shown in Equation (2). The F1 score (Equation (3)) is the harmonic mean of recall and precision and therefore accounts for both false positives and false negatives; it is used to strike a balance between recall and precision.
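The displayed forms of Equations (1)–(3) do not appear in this text. Assuming the standard confusion-matrix definitions, with TP, FP, and FN the counts of true positives, false positives, and false negatives, and with R denoting recall (a symbol introduced here for illustration), they would read:

```latex
\begin{align}
P   &= \frac{TP}{TP + FP} \tag{1} \\
R   &= \frac{TP}{TP + FN} \tag{2} \\
F_1 &= \frac{2 \, P \, R}{P + R} \tag{3}
\end{align}
```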
Specificity is the proportion of actual negatives that the model correctly predicted (true negatives, TN), and sensitivity is the corresponding proportion of actual positives. They can be written as shown in Equations (4) and (5), respectively.
AUC is defined as the area under the receiver operating characteristic (ROC) curve. The Matthews correlation coefficient (MCC), sometimes known as the phi coefficient, measures the quality of binary classifications. The formula used to compute MCC is shown in Equation (6).
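As a worked illustration of these measures, the sketch below computes them from the four confusion-matrix counts using the standard textbook definitions. The function name, the convention of returning 0.0 when a denominator is zero, and the example counts are assumptions made for illustration; the counts are not results from this study. AUC is omitted because it requires ranked scores rather than a single confusion matrix.

```python
import math

def safe_div(numerator, denominator):
    """Return 0.0 instead of failing when a denominator is zero."""
    return numerator / denominator if denominator else 0.0

def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification measures from confusion-matrix counts."""
    precision = safe_div(tp, tp + fp)
    sensitivity = safe_div(tp, tp + fn)          # recall / true positive rate
    specificity = safe_div(tn, tn + fp)          # true negative rate
    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    f1 = safe_div(2 * precision * sensitivity, precision + sensitivity)
    mcc = safe_div(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
        "mcc": mcc,
    }

# Made-up counts for a 100-sample test set.
scores = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

Unlike accuracy, MCC uses all four counts symmetrically, which is why it is often preferred when the classes are imbalanced.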
It can be seen from Table 1 that, for ischaemia, ResNet50 outperforms the AlexNet, VGG16/19, GoogLeNet, ResNet101, MobileNet, SqueezeNet, and DenseNet models, achieving 99.49% accuracy, 99.59% sensitivity, 99.39% specificity, 99.39% precision, 99.49% F-score, 99.96% AUC, and 98.99% MCC. Similarly, in the case of infection, ResNet50 outperforms the other models and produces 84.76% accuracy, 89.80% sensitivity, 85.71% specificity, 83.27% precision, 85.00% F-score, 94.16% AUC, and 75.57% MCC. The biggest obstacles in the field of DFU detection are the asymmetrical forms of skin lesions, the diversity of skin colors, and locating the area of interest in each image. Identifying minute changes in the skin requires expertise in this field, and such minor variations may still be overlooked during a visual examination. Deep learning approaches can assist doctors in this respect, potentially saving lives [27]. With this goal, we attempted to classify ischaemia and infection in DFU images. Detecting DFU is a difficult task, and providing the data to the model also involves some pre-processing.
It is evident from the results that the performance of ResNet50 is more promising on DFU ischaemia than on DFU infection. As the DFU infection set contains far fewer samples than the DFU ischaemia set, we can deduce that a larger dataset may allow deep learning models to perform better, since the classifier could be trained on more representative class distributions. The small amount of data currently available to the scientific community is a significant obstacle to DFU detection studies.
Dermatologists may take photos of skin lesions, but these can only be used inside the clinic, possibly due to privacy or commercial considerations. Much larger datasets must be gathered for training and testing these decision-support systems to achieve strong models and statistical validity.

Comparison
In the past, a variety of conventional approaches were used for the identification and classification of various diseases, including DFU, but the results were not promising. With the advent of deep learning, researchers have implemented deep learning methods for identification, recognition, detection, and semantic segmentation in different fields, and deep learning has proved its significance over traditional machine learning techniques. Numerous strategies for classifying DFU have been put forward by various researchers. We compare our proposed approach with recently published articles in this area. The results are compared on the basis of accuracy, AUC, and MCC (Table 2).

Conclusions and Future Work
DFU is becoming more common and is affecting an increasing number of people every day. If detected in its early stages, it can be properly treated. Early detection and treatment can result in a higher survival rate and, ultimately, a lower mortality rate. However, existing clinical approaches for the diagnosis of DFU are sensitive to human error due to subjectivity and inexperienced clinicians. As a result, there is a need for more dependable and precise solutions that may benefit both experienced and inexperienced physicians. The goal of our deep learning approaches was to detect DFU. The effectiveness of pre-trained, fine-tuned models such as AlexNet, VGG16/19, GoogLeNet, ResNet50/101, MobileNetV2, SqueezeNet, and DenseNet201 has been examined. ResNet50 outperformed all of the other models, with an accuracy of 99.49% on the DFU ischaemia dataset and 84.76% on the DFU infection dataset. Although only about 84% accuracy was achieved for DFU infection, we believe that a higher rate is attainable, and we will attempt to increase the classification rate for infection in future work.

Figure 1. Proposed framework of transfer learning using a fine-tuning approach for the classification of the ischaemia and infection classes of DFU.

Figure 2. DFU2020 dataset: Sample images of infection (negative and positive) and ischaemia (negative and positive).

Table 2. Comparative analysis of the proposed system with other systems for DFU identification.