Automated Detection and Diagnosis of Diabetic Retinopathy: A Comprehensive Survey

Diabetic Retinopathy (DR) is a leading cause of vision loss worldwide. In the past few years, artificial intelligence (AI) based approaches have been used to detect and grade DR. Early detection enables appropriate treatment and thus prevents vision loss. For this purpose, both fundus and optical coherence tomography (OCT) images are used to image the retina. Deep learning (DL) and machine learning (ML) based approaches then make it possible to extract features from the images, detect the presence of DR, grade its severity, and segment associated lesions. This review covers the literature dealing with AI approaches to DR, such as ML and DL for classification and segmentation, published in the open literature within the six-year period 2016–2021. In addition, a comprehensive list of available DR datasets is reported. This list was constructed using both the PICO (P-Patient, I-Intervention, C-Control, O-Outcome) and Preferred Reporting Items for Systematic Review and Meta-analysis (PRISMA) 2009 search strategies. We summarize a total of 114 published articles that conformed to the scope of the review and present a list of 43 major datasets.


Introduction
Diabetic retinopathy (DR) is a major cause of irreversible visual impairment and blindness worldwide [1]. The etiology of DR lies in chronic high blood glucose levels, which cause retinal capillary damage; the disease mainly affects the working-age population. DR begins at a mild level with no apparent visual symptoms, but it can progress to severe and proliferative stages, and progression of the disease can lead to blindness. Thus, early diagnosis and regular screening can decrease the risk of visual loss by 57.0% as well as decreasing the cost of treatment [2].
DR is clinically diagnosed through observation of the retinal fundus, either directly or through imaging techniques such as fundus photography or optical coherence tomography. There are several standard DR grading systems, such as the Early Treatment Diabetic Retinopathy Study (ETDRS) [3]. ETDRS separates fine-detailed DR characteristics using multiple levels; this grading is performed on all seven retinal fundus Fields of View (FOV). Although ETDRS [4] is the gold standard, due to implementation complexity and technical limitations [5], alternative grading systems are also used, such as the International Clinical Diabetic Retinopathy (ICDR) [6] scale, which is accepted in both clinical and Computer-Aided Diagnosis (CAD) settings [7]. The ICDR scale defines 5 severity levels for DR and 4 levels for Diabetic Macular Edema (DME) and requires fewer FOVs [6]. The ICDR levels are discussed below and illustrated in Figure 1.
• Mild Non-Proliferative Diabetic Retinopathy (NPDR): The first stage of diabetic retinopathy, characterized by tiny areas of swelling in retinal blood vessels known as Microaneurysms (MA) [8]. There is an absence of profuse bleeding in the retinal vessels, and if DR is detected at this stage, proper medical treatment can help save the patient's eyesight (Figure 1A).
• Moderate NPDR: When left unchecked, mild NPDR progresses to a moderate stage in which blood leaks from the blocked retinal vessels. Additionally, Hard Exudates (Ex) may be present at this stage (Figure 1B). Furthermore, the dilation and constriction of venules in the retina cause Venous Beading (VB), which is visible ophthalmoscopically [8].
• Severe NPDR: A larger number of retinal blood vessels are blocked in this stage, causing over 20 Intra-retinal Hemorrhages (IHE; Figure 1C) across all 4 fundus quadrants, or there are Intra-Retinal Microvascular Abnormalities (IRMA), which can be seen as bulges of thin vessels. IRMA appears as small, sharp-bordered red spots in at least one quadrant. Furthermore, there can be definite evidence of VB in over 2 quadrants [8].
• Proliferative Diabetic Retinopathy (PDR): An advanced stage of the disease that occurs when the condition is left unchecked for an extended period of time. New blood vessels form in the retina, a condition termed Neovascularization (NV). These vessels are often fragile, with a consequent risk of fluid leakage and proliferation of fibrous tissue [8]. Different functional visual problems occur in PDR, such as blurriness, reduced field of vision, and even complete blindness in some cases (Figure 1D).
DR detection has two main steps: screening and diagnosis. Fine pathognomonic DR signs in the initial stages are normally determined after dilating the pupils (mydriasis). DR screening is then performed through slit-lamp biomicroscopy with a +90.0 D lens, and direct [9] or indirect ophthalmoscopy [10]. The next step is to diagnose DR by finding DR-associated lesions and comparing them with the criteria of a standard grading system. Currently, the diagnosis step is performed manually. This procedure is costly, time consuming, and requires highly trained clinicians with considerable experience and diagnostic precision. Even when all these resources are available, there is still the possibility of misdiagnosis [11]. This dependency on manual evaluation makes the situation challenging. Moreover, in 2020, the number of adults worldwide with DR and with vision-threatening DR was estimated to be 103.12 million and 28.54 million, respectively; by 2045, these numbers are projected to increase to 160.50 million and 44.82 million [12]. The situation is aggravated in developing countries, where there is a shortage of ophthalmologists [13,14] as well as limited access to standard clinical facilities. This problem also exists in underserved areas of the developed world.
Recent developments in CAD techniques, which fall within the scope of artificial intelligence (AI), are becoming more prominent in modern ophthalmology [15], as they can save time, cost, and human resources in routine DR screening and involve lower diagnostic error rates [15]. CAD can also efficiently manage the increasing number of afflicted DR patients [16] and diagnose DR in early stages when fewer sight-threatening effects are present. AI-based approaches are divided between Machine Learning-based (ML) and Deep Learning-based (DL) solutions. These techniques vary depending on the imaging system and disease severity. For instance, in early levels of DR, super-resolution ultrasound imaging of microvessels [17] is used to visualize the deep ocular vasculature; on this imaging system, a successful CAD method applied a DL model for segmenting lesions on ultrasound images [18]. The widely applied imaging methods, namely Optical Coherence Tomography (OCT), OCT Angiography (OCTA), Ultrawide-field fundus (UWF), and standard 45° fundus photography, are covered in this review. In addition to these imaging methods, Majumder et al. [15] reported a real-time DR screening procedure using a smartphone camera.
The main purpose of this review is to analyze 114 articles published within the last 6 years that focus on the detection of DR using CAD techniques. These techniques have made considerable progress in performance with the use of ML and DL schemes that employ the latest developments in Deep Convolutional Neural Network (DCNN) architectures for DR severity grading, progression analysis, abnormality detection, and semantic segmentation. An overview of ophthalmic applications of convolutional neural networks is presented in [19–22].

Literature Search Details
For this review, literature from 5 publicly accessible databases was surveyed. The databases were chosen based on their depth, ease of access, and popularity. These 5 databases are PubMed, PUBLONS, IEEE Xplore, the SPIE Digital Library, and Google Scholar. Google Scholar was chosen to fill gaps in the search strategy by identifying literature from multiple sources, along with articles that might be missed in manual selection from the other four databases. Publications on this topic within the latest six-year period show that advances in AI-enabled DR detection have increased considerably. Figure 2 visualizes the articles matching this topic; this figure was generated using the PubMed results. At the time of writing this review, a total of 10,635 search results were listed in the PubMed database for this time period when just the term "diabetic retinopathy" was used. The MEDLINE database is arguably the largest for biomedical research. In addition, some resources from the National Library of Medicine, which is part of the U.S. National Institutes of Health, were employed in this review.
A search of the IEEE Xplore library and the SPIE Digital Library for the given time period reports a total of 812 and 332 search results, respectively. The IEEE Xplore and SPIE libraries contain only publications of these two professional societies. Further sources were added to this list by collecting papers from non-traditional venues such as the arXiv preprint server. In Figure 3, using data from all sources, we plot the number of papers published as a function of the year.
The scope of this review is limited to "automated detection and grading of diabetic retinopathy using fundus & OCT images". Therefore, to make the search more manageable, a combination of relevant keywords was applied using the PICO (P-Patient, I-Intervention, C-Control, O-Outcome) search strategy [23]. Keywords used in the PICO search were predetermined. A combination of ("DR" and ("DL" or "ML" or "AI")) and (fundus or OCT) was used, which reduced the initial 10,635 search results in PubMed to just 217 for the period under consideration. A manual process of eliminating duplicate search results across all the databases resulted in a total of 114 papers. Overall, the search strategy for identifying relevant research involved three main steps:
1. Using the predefined set of keywords and logical operators, a small set of papers was identified in the time range (2016–2021).
2. Using a manual search strategy, papers falling outside the scope of this review were eliminated.
3. Duplicate articles (i.e., papers occurring in multiple databases) were eliminated to obtain the set of unique articles.
The search strategy followed by this review abides by the Preferred Reporting Items for Systematic Review and Meta-analysis (PRISMA) 2009 checklist [24]; the detailed search and identification pipeline is shown in Figure 4 (a flowchart summarizing the literature search and dataset identification using the PRISMA review strategy).

Dataset Search Details
The backbone of any automated detection model, whether ML-based, DL-based, or multi-model-based, is the dataset. High-quality data with correct annotations are extremely important for image feature extraction and for properly training the DR detection model. In this review, a comprehensive list of datasets has been created and discussed. A previously published paper by Khan et al. [25] also gives a list of ophthalmic image datasets that can be used for training DR detection and grading models, highlighting 33 of the 43 datasets presented in Table 1. However, some popular and publicly accessible databases are not listed by Khan et al. [25], e.g., UoA-DR [26], the longitudinal DR screening data [27], and FGADR [28]. In this review, we identified these additional available datasets. The search strategy for identifying relevant DR detection datasets was as follows:
1. Appropriate results from all 5 of the selected databases (PubMed, PUBLONS, etc.) were searched manually.
2. Information about the names of datasets for DR detection and grading was gathered.
3. The original papers and websites associated with each dataset were analyzed, and a systematic, tabular representation of all available information was created.
4. The Google dataset search and various forums were checked for missing dataset entries, and step 2 was repeated for each original dataset found.
5. A final comprehensive list of datasets and their details was generated and is presented in Table 1.
A total of 43 datasets were identified using the search strategy given above. Upon further inspection, 30 datasets were identified as open access (OA), i.e., they can be accessed easily without any permission or payment. Of the remainder, 6 have restricted access but can be accessed with the permission of the author or institution, and 7 are private and cannot be accessed. Because of the diversity of their images (multi-national and multi-ethnic groups), these datasets can be used to create generalized models.

Dataset Search Results
This section provides a high-level overview of the search results that were obtained from the databases, as well as from different review articles on datasets in the domain of ophthalmology, e.g., Khan et al. [25]. Moreover, different leads obtained from GitHub and other online forums were also employed. Thus, 43 datasets were identified, and a general overview of them is systematically presented in this section. The datasets reviewed in this article are not limited to 2016–2021; some were released before that period. The list of datasets and their characteristics is shown in Table 1 below. Depending on the restrictions and formalities required for accessing the datasets, the list has been divided into 3 classes:

• Public open access (OA) datasets with high-quality DR grades.
• DR datasets that can be accessed upon request, i.e., by completing the necessary agreements and forms for fair usage; they are a sub-type of OA databases and are termed Access Upon Request (AUR) in the table.
• Private datasets from different institutions that are not publicly accessible or that require explicit permission for access; these are termed Not Open Access (NOA).

Diabetic Retinopathy Classification
This section discusses the classification approaches used for DR detection. The classification may target the detection of DR [68], referable DR (RDR) [66,69], or vision-threatening DR (vtDR) [66], or analyze the proliferation level of DR using the ICDR system. Some studies also considered Diabetic Macular Edema (DME) [69,70]. Recent ML and DL methods have produced promising results in automated DR diagnosis.
Multiple performance metrics, such as accuracy (ACC), sensitivity (SE, also called recall), specificity (SP), precision, area under the curve (AUC), and F1 and Kappa scores, are used to evaluate classification performance. Tables 2 and 3 present a brief overview of articles that used fundus images for DR classification and of articles that classify DR on fundus images using novel preprocessing techniques, respectively. Table 4 lists the recent DR classification studies that used OCT and OCTA images. In the following subsections, we provide the details of the ML and DL aspects and evaluate the performance of prior studies in terms of quantitative metrics.
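To make the metrics above concrete, the following minimal NumPy sketch computes ACC, SE, and SP from binary labels; the label vectors here are illustrative, not taken from any surveyed study:

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall), and specificity from binary labels,
    as used to evaluate DR vs. no-DR classifiers."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    acc = (tp + tn) / len(y_true)
    se = tp / (tp + fn)   # sensitivity / recall
    sp = tn / (tn + fp)   # specificity
    return acc, se, sp

acc, se, sp = binary_metrics([1, 1, 1, 0, 0, 0, 1, 0],
                             [1, 1, 0, 0, 0, 1, 1, 0])
```

Precision, F1, and Kappa scores follow analogously from the same confusion-matrix counts.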

Machine Learning Approaches
In this review, 9 out of 93 classification-based studies employed machine learning approaches, and 1 article used an un-ML (non-learning) method for detecting and grading DR. Hence, in this section, we present an evaluation of the various ML-based feature extraction and decision-making techniques that have been employed in the selected primary studies to construct DR detection models. In general, six major distinct ML algorithms were used in these studies: principal component analysis (PCA) [70,71], linear discriminant analysis (LDA)-based feature selection [71], the scale-invariant feature transform (SIFT) [71], support vector machines (SVM) [16,71–73], k-nearest neighbors (KNN) [72], and random forests (RF) [74].
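As an illustration of how such pipelines fit together, the sketch below reduces a feature matrix with PCA and then classifies it. To stay dependency-free it uses NumPy's SVD for PCA and a nearest-centroid stand-in where the surveyed studies would use an SVM, KNN, or RF; the feature matrix is synthetic, not real fundus data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for hand-crafted image features (texture/intensity statistics);
# real studies extract these from fundus images.
X = rng.normal(size=(200, 50))
X[:, 0] *= 3.0                       # give one feature dominant variance
y = (X[:, 0] > 0).astype(int)        # synthetic "DR present" labels

# PCA via SVD: project onto the top-10 principal components.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:10].T

# Minimal stand-in classifier: assign each sample to the nearest class centroid.
c0, c1 = Z[y == 0].mean(axis=0), Z[y == 1].mean(axis=0)
pred = (np.linalg.norm(Z - c1, axis=1) < np.linalg.norm(Z - c0, axis=1)).astype(int)
acc = (pred == y).mean()
```

The dimensionality-reduction step is what PCA/LDA contribute in the cited studies; the final classifier is interchangeable.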
In addition to the widely used ML methods, some studies such as [75] presented a pure ML model with an accuracy of over 80%, using t-distributed Stochastic Neighbor Embedding (t-SNE) for image dimensionality reduction in combination with an ML Bagging Ensemble Classifier (ML-BEC). ML-BEC improves classification performance by using the feature-bagging technique with a low computational time. Ali et al. [57] focused on five fundamental ML models, namely sequential minimal optimization (SMO), logistic (Lg), multi-layer perceptron (MLP), logistic model tree (LMT), and simple logistic (SLg), at the classification level. This study proposed a novel preprocessing method in which the Region of Interest (ROI) of lesions is segmented with a clustering-based method (K-Means); Ali et al. [57] then extracted histogram, wavelet, grey-level co-occurrence, and run-length matrix (GLCM and GLRLM) features from the segmented ROIs. This method outperformed previous models with an average accuracy of 98.83% across the five ML models. However, although an ML model such as SLg performs well, the required classification time is 0.38 s on an Intel Core i3 1.9 gigahertz (GHz) CPU with 8 gigabytes (GB) of memory running 64-bit Windows 10; this processing time is higher than in previous studies.
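A minimal NumPy sketch of the GLCM texture features mentioned above (contrast, energy, and homogeneity for a single pixel offset); pipelines like Ali et al.'s would compute such statistics over segmented ROIs and several offsets, so this is an illustration rather than their implementation:

```python
import numpy as np

def glcm(img, levels=8, dx=1, dy=0):
    """Grey-level co-occurrence matrix for one pixel offset (dx, dy),
    computed on an image with values in [0, 1]."""
    q = np.minimum((img * levels).astype(int), levels - 1)  # quantize grey levels
    m = np.zeros((levels, levels))
    h, w = q.shape
    for i in range(h - dy):
        for j in range(w - dx):
            m[q[i, j], q[i + dy, j + dx]] += 1   # count co-occurring level pairs
    return m / m.sum()

def glcm_features(m):
    """Classic Haralick-style statistics derived from a normalized GLCM."""
    i, j = np.indices(m.shape)
    contrast = float(np.sum(m * (i - j) ** 2))
    energy = float(np.sum(m ** 2))
    homogeneity = float(np.sum(m / (1.0 + np.abs(i - j))))
    return contrast, energy, homogeneity
```

A perfectly uniform region yields zero contrast and maximal energy, which is why these features discriminate smooth retinal background from textured lesion areas.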
ML methods can also be applied to OCT and OCTA images for DR detection. Recently, Liu et al. [76] deployed four ML models, namely logistic regression (LR), logistic regression regularized with the elastic net penalty (LR-EN), support vector machine (SVM), and the gradient boosting tree XGBoost, on over 246 OCTA wavelet features and obtained ACC, SE, and SP of 82%, 71%, and 77%, respectively. This study, despite its modest results, has the potential to reach higher scores through model optimization and fine-tuning of hyperparameters. Overall, these studies show lower performance when a small number of feature types and simple ML models are used. Dimensionality reduction is a further application of ML models, which can be added in the decision layer of CAD systems [77,78].
ML methods in combination with DL networks can achieve performance comparable to pure DL models. Narayanan et al. [78] applied an SVM to classify features obtained from state-of-the-art DNNs and optimized with PCA [78]. This provided an accuracy of 85.7% on preprocessed images; in comparison with methods such as AlexNet, VGG, ResNet, and Inception-v3, the authors report an ACC of 99.5%. They also found this technique more practical, with considerably lower computational cost.

Deep Learning Approaches
This section gives an overview of the DL algorithms that have been used. The methods vary depending on the imaging system, image resolution, noise level, and contrast, as well as the size of the dataset. Some studies propose customized networks, such as the work done by Gulshan et al. [69], Gargeya et al. [68], Rajalakshmi et al. [79], and Riaz et al. [80]. These networks have lower performance than state-of-the-art networks such as VGG, ResNet, Inception, and DenseNet, but their fewer layers make them more generalizable, suitable for training with small datasets, and computationally efficient. Quellec et al. [81] applied L2 regularization to the best-performing DCNN in the Kaggle DR detection competition, named o_O. Another example of a customized network is the model proposed by Sayres et al. [82], which showed 88.4%, 91.5%, and 94.8% for ACC, SE, and SP, respectively, over a small subset of 2000 images obtained from the EyePACS database. However, the performance of this network is lower than the results obtained by Mansour et al. [72], who used a larger subset of the EyePACS images (35,126 images). Mansour et al. [72] deployed more complex architectures, such as AlexNet, on features extracted with LDA and PCA, generating better results than Sayres et al. [82] with 97.93%, 100%, and 93% for ACC, SE, and SP, respectively. Such DCNNs should be used with large datasets, since the large number of training images reduces errors. If a deep architecture is applied to a small number of observations, it may overfit, in which case the performance on the test data is not as good as on the training data. On the other hand, the depth of a network does not always guarantee higher performance: deep networks may face problems such as vanishing or exploding gradients, which have to be addressed by redesigning the network with a simpler architecture. Furthermore, deep networks extract several low- and high-level features.
As these image features get more complicated, they become more difficult to interpret, and sometimes high-level attributes are not clinically meaningful. For instance, high-level attributes may reflect an existing bias shared by all images belonging to a certain class, such as light intensity or similar vessel patterns; these are not signs of DR, but the DCNN will treat them as critical features. Consequently, such attributes can make the output predictions erroneous.
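The vanishing-gradient problem mentioned above can be illustrated numerically: back-propagating through many saturated sigmoid activations multiplies the gradient by a derivative of at most 0.25 per layer, so it shrinks geometrically. This is a toy calculation, not any surveyed network:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Gradient magnitude after back-propagating through 20 stacked sigmoid
# activations at a mildly saturated input (x = 2): each layer multiplies
# the gradient by sigma'(x) = sigma(x) * (1 - sigma(x)) <= 0.25.
x, grad = 2.0, 1.0
for _ in range(20):
    grad *= sigmoid(x) * (1.0 - sigmoid(x))

print(f"gradient after 20 layers: {grad:.3e}")
```

Skip connections and careful initialization are the usual remedies, alongside the simpler-architecture redesign mentioned above.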
In the scope of DL-based classification, Hua et al. [83] designed a DL model named Trilogy of Skip-connection Deep Networks (Tri-SDN) over the pretrained base model ResNet50. It applies skip-connection blocks to make tuning faster, yielding ACC and SP of 90.6% and 82.1%, respectively, considerably better than the values of 83.3% and 64.1% obtained when skip-connection blocks are not used.
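The idea behind such skip-connection blocks can be sketched in a few lines: the block computes output = F(x) + x, so the identity path preserves the signal (and gradients) even when the learned transform F is small. This is an illustrative NumPy toy, not the Tri-SDN implementation:

```python
import numpy as np

def residual_block(x, w1, w2):
    """Minimal skip-connection (residual) block: output = F(x) + x,
    where F is a small two-layer ReLU transform."""
    h = np.maximum(0.0, x @ w1)          # ReLU(x W1)
    return np.maximum(0.0, h @ w2) + x   # F(x) plus the identity shortcut

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w1 = 0.01 * rng.normal(size=(8, 8))      # near-zero weights: F(x) ~ 0
w2 = 0.01 * rng.normal(size=(8, 8))
y = residual_block(x, w1, w2)
```

With near-zero weights the block reduces to the identity, which is why residual networks remain trainable at depths where plain stacks suffer vanishing gradients.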
There are additional studies that do not propose new network architectures but instead enhance the preprocessing step. The study by Pao et al. [84] presents a bi-channel customized CNN in which an image enhancement technique known as unsharp masking is used. The enhanced images and entropy images are used as the inputs of a CNN with 4 convolutional layers, with results of 87.83%, 77.81%, and 93.88% for ACC, SE, and SP, respectively. These results are all higher than those obtained without preprocessing (81.80%, 68.36%, and 89.87%, respectively).
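A minimal NumPy sketch of unsharp masking, the enhancement step described above; the box blur and step image are illustrative simplifications (Pao et al. additionally feed entropy images as a second channel):

```python
import numpy as np

def box_blur(img, k=3):
    """Mean filter: the low-pass step of unsharp masking."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img)
    for di in range(k):
        for dj in range(k):
            out += p[di:di + img.shape[0], dj:dj + img.shape[1]]
    return out / (k * k)

def unsharp_mask(img, amount=1.0):
    """Sharpen by adding back the high-frequency residual (img - blurred)."""
    return np.clip(img + amount * (img - box_blur(img)), 0.0, 1.0)

# Toy image with a vertical intensity step: sharpening widens the
# contrast at the edge while leaving flat regions unchanged.
step = np.where(np.arange(16) < 8, 0.2, 0.8) * np.ones((16, 1))
sharp = unsharp_mask(step)
```

The same overshoot/undershoot behavior at the edge is what makes small bright lesions stand out more clearly against the retinal background.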
Shankar et al. [85] proposed another approach to preprocessing, using histogram-based segmentation to extract regions containing lesions on fundus images. For the classification step, this article utilized the Synergic DL (SDL) model, and the results indicated that the presented SDL model offers better results than popular DCNNs on the MESSIDOR-1 database in terms of ACC, SE, and SP.
Furthermore, classification is not limited to DR detection: DCNNs can also be applied to detect the presence of DR-related lesions, as in the study reported by Wang et al. They cover twelve lesions: MA, IHE, superficial retinal hemorrhages (SRH), Ex, CWS, venous abnormalities (VAN), IRMA, NV at the disc (NVD), NV elsewhere (NVE), pre-retinal FIP, VPHE, and tractional retinal detachment (TRD), with an average precision and AUC of 0.67 and 0.95, respectively; however, features such as VAN have low individual detection accuracy. This study provides essential steps for DR detection based on the presence of lesions, which could be more interpretable than DCNNs acting as black boxes [86–88].
There are explainable backpropagation-based methods that produce heatmaps of the lesions associated with DR, such as the study by Keel et al. [89], which highlights Ex, HE, and vascular abnormalities in DR-diagnosed images. These methods have limited performance, providing generic explanations that may not be clinically reliable. Tables 2–4 briefly summarize previous studies on DR classification with DL methods.
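A related, model-agnostic way to obtain such heatmaps is occlusion sensitivity: mask each image patch and measure the drop in the classifier's score. The sketch below is illustrative, with a toy brightness-sum "classifier" standing in for a trained DCNN:

```python
import numpy as np

def occlusion_map(img, score_fn, patch=4):
    """Occlusion-sensitivity heatmap: zero out each patch and record how
    much the classifier score drops. High values mark regions the model
    relies on (lesion-like areas in the DR setting)."""
    base = score_fn(img)
    h, w = img.shape
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            masked = img.copy()
            masked[i:i + patch, j:j + patch] = 0.0
            heat[i // patch, j // patch] = base - score_fn(masked)
    return heat

# Toy stand-in classifier: score is total brightness, so the heatmap
# should light up only where the bright "lesion" sits.
img = np.zeros((16, 16))
img[4:8, 8:12] = 1.0
heat = occlusion_map(img, lambda im: im.sum())
```

Unlike gradient-based heatmaps, this probe needs no access to the network internals, at the cost of one forward pass per patch.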

Diabetic Retinopathy Lesion Segmentation
The state-of-the-art DR classification machines [68,69] identify referable DR without directly taking lesion information into account. Therefore, their predictions lack clinical interpretation, despite their high accuracy. This black-box nature of DCNNs is the major problem that makes them unsuitable for clinical application [86,152,153] and has made the topic of eXplainable AI (XAI) of major importance [153]. Recently, visualization techniques such as gradient-based XAI have been widely used for evaluating networks. However, these methods produce generic heatmaps that only highlight the major contributing lesions and hence are not suitable for detecting DR with multiple lesions and severities. Thus, some studies have focused on lesion-based DR detection instead. In general, we found 20 papers that perform segmentation of lesions such as MA (10 articles), Ex (9 articles), and IHE, VHE, PHE, IRMA, NV, and CWS. In the following sections, we discuss the general segmentation approaches. The implementation details of each article are given in Tables 5 and 6, organized by imaging type.

Machine Learning and Un-Machine Learning Approaches
In general, ML methods, with their high processing speed, low computational cost, and interpretable decisions, are preferred to DCNNs. However, automatic detection of subtle lesions such as MA has not reached acceptable performance with these methods. In this review, we collected 2 purely ML-based models and 6 un-ML (non-learning) methods. Ali Shah et al. [154] detected MA using color, Hessian, and curvelet-based feature extraction and achieved an SE of 48.2%. Huang et al. [155] focused on localizing NV using the Extreme Learning Machine (ELM). This study applied standard-deviation, Gabor, differential-invariant, and anisotropic filters, with ELM as the final classifier. This network performed as well as an SVM with lower computational time (6111 s vs. 6877 s) on a PC running Microsoft Windows 7 with a Pentium Dual-Core E5500 CPU and 4 GB of memory. For the segmentation task, the preprocessing step had a fundamental role with a direct effect on the outputs; the preprocessing techniques varied depending on the lesion type and the dataset properties. Orlando et al. applied a combination of DCNN-extracted features and manually designed features, using image illumination correction, CLAHE contrast enhancement, and color equalization. This high-dimensional feature vector was then fed into an RF classifier to detect lesions, achieving an AUC score of 0.93, which is comparable with some DCNN models [81,137,141].
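As a rough illustration of the contrast-enhancement step, the sketch below implements plain global histogram equalization in NumPy; CLAHE, used by Orlando et al., additionally operates on local tiles and clips the histogram, so this is a simplified stand-in rather than their method:

```python
import numpy as np

def hist_equalize(img, bins=256):
    """Global histogram equalization for an image with values in [0, 1]:
    map each intensity through the empirical CDF so the output histogram
    is approximately flat. A simplified stand-in for CLAHE."""
    hist, edges = np.histogram(img.ravel(), bins=bins, range=(0.0, 1.0))
    cdf = hist.cumsum() / hist.sum()
    return np.interp(img.ravel(), edges[:-1], cdf).reshape(img.shape)

rng = np.random.default_rng(0)
low_contrast = rng.random((64, 64)) ** 3      # skewed, predominantly dark image
enhanced = hist_equalize(low_contrast)
```

Spreading the intensities this way makes faint vessels and lesions easier for a downstream classifier to pick up; CLAHE's tiling and clip limit additionally avoid over-amplifying noise in flat regions.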
Some studies used un-ML methods for the detection of exudates, such as that of Kaur et al. [156], who proposed a pipeline consisting of a vessel and optic-disk removal step followed by a dynamic thresholding method for the detection of CWS and Ex. Prior to this study, Imani et al. [157] performed a similar process focused on Ex on a smaller dataset; they employed additional morphological processing and smooth-edge removal to reduce the misdetection of CWS as Ex. Their article reported SE and SP of 89.1% and 99.9%, broadly similar to Kaur's results of 94.8% and 99.8%, respectively. Further descriptions of recent studies on lesion segmentation with ML approaches can be found in Tables 5 and 6.
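One concrete way to realize such a dynamic threshold is Otsu's method, which picks the intensity cut that maximizes between-class variance; the sketch below applies it to a toy image with bright, exudate-like blobs. This is an illustration of the thresholding idea, not the exact method of Kaur et al. or Imani et al.:

```python
import numpy as np

def otsu_threshold(img, bins=256):
    """Automatic global threshold (Otsu's method) for an image in [0, 1]:
    choose the cut maximizing between-class intensity variance. Pixels
    strictly above the returned value form the bright (foreground) class."""
    hist, edges = np.histogram(img.ravel(), bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    omega = np.cumsum(p)                 # probability of the "dark" class
    mu = np.cumsum(p * edges[:-1])       # cumulative class mean
    mu_t = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    k = np.nanargmax(sigma_b)
    return edges[k + 1]                  # right edge of the optimal bin

# Bimodal toy image: dark background (0.2) with a bright 8x8 blob (0.8).
img = np.full((32, 32), 0.2)
img[8:16, 8:16] = 0.8
thr = otsu_threshold(img)
mask = img > thr
```

In a real pipeline the vessel tree and optic disk would be removed first, since both are bright structures that would otherwise survive the threshold.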

Deep Learning Approaches
Recent works show that DCNNs can produce promising results in automated DR lesion segmentation, which is mainly performed on fundus imaging; however, some studies apply a combination of fundus and OCT. Holmberg et al. [158] proposed a retinal-layer extraction pipeline to measure retinal thickness with a U-Net. Furthermore, Guo et al. [159] applied DCNNs to avascular-zone segmentation of OCTA images and achieved an accuracy of 87.0% for mild-to-moderate DR and 76.0% for severe DR.
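For evaluating segmentations like these, overlap metrics are standard; the sketch below computes the Dice coefficient between two binary masks (the masks are illustrative, not data from the surveyed studies):

```python
import numpy as np

def dice_score(pred, truth):
    """Dice coefficient: 2|A intersect B| / (|A| + |B|), a standard overlap
    metric for pixel-wise lesion/region segmentation masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum())

a = np.zeros((8, 8)); a[2:6, 2:6] = 1   # predicted lesion mask
b = np.zeros((8, 8)); b[3:7, 3:7] = 1   # ground-truth mask
d = dice_score(a, b)
```

Dice (and the closely related IoU) penalizes both missed lesion pixels and false detections, which is why it complements the per-pixel SE/SP figures reported in Tables 5 and 6.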
Other studies mainly focus on DCNNs applied to fundus images, which give a clear view of lesions on the surface of the retina. For example, Lam et al. [160] deployed state-of-the-art DCNNs (AlexNet, ResNet, GoogleNet, VGG16, and Inception-v3) to detect the existence of DR lesions in image patches, achieving 98.0% accuracy on a subset of 243 fundus images obtained from EyePACS. Wang et al. [28] reported results comparable to [140] and considerably better MA detection than Wang et al. [141]; on the other hand, Wang et al. [141] performed better in HE detection. Further details of these and other articles can be found in Tables 5 and 6.

Conclusions
Recent studies on DR detection mainly focus on automated methods known as CAD systems. Within the scope of CAD systems for DR, there are two major approaches: first, classification and staging of DR severity, and second, segmentation of DR-associated lesions such as MA, HE, Ex, and CWS.
The DR databases are categorized into public databases (36 out of 43) and private databases (7 out of 43). These databases contain fundus and OCT retinal images; between these two imaging modalities, fundus photos are used in 86.0% of the published studies. Several large public fundus datasets are available online. Their images may have been taken with different systems, which affects image quality, and some of the image-wise DR labels can be erroneous. Databases that provide lesion annotations constitute only a small portion of the total, since pixel-wise annotation requires considerable resources; hence, some of them contain fewer images than image-wise labeled databases. Furthermore, lesion annotation requires inter-annotator agreement and high annotation precision. These factors make such datasets error-sensitive, and their quality evaluation can become complicated.
DR classification needs a standard grading system validated by clinicians. ETDRS is the gold-standard grading system for DR progression. Since this grading requires fine-detail evaluation and access to all 7 fundus FOVs, the use of ETDRS is limited. Thus, the less fine-grained ICDR scale is applicable to single-FOV images for detecting DR severity levels.
The classification and grading of DR can be divided into two main approaches, namely ML-based and DL-based classification. ML/DL-based DR detection generally performs better than ML/DL-based DR grading on the ICDR scale, which requires extracting higher-level features associated with each level of DR [57,71]. The evaluation results show that DCNN architectures can achieve higher performance scores when large databases are used [72]. There is a trade-off between performance on one side and architecture complexity, processing time, and the lack of interpretability of the network's decisions and extracted features on the other. Thus, some recent works have proposed semi-DCNN models containing both DL-based and ML-based components acting as classifier or feature extractor [71,72]. The use of regularization techniques is another way to reduce the complexity of DCNN models [81].
The second approach in CAD-related studies of DR is pixel-wise lesion segmentation or image-wise lesion detection. The main lesions of DR are MA, Ex, HE, and CWS. These lesions differ in detection difficulty, which directly affects the performance of the proposed pipelines. Among these lesions, the annotation of MA is the most challenging [28,167]. Since this lesion is difficult to detect and is the main sign of DR in early stages, some studies focused on its pixel-wise segmentation with DCNNs and achieved sufficiently high scores [166]. Although some of the recent DCNN-based works exhibit high performance in terms of the standard metrics, their lack of interpretability may make them insufficiently trustworthy for real-life clinical applications. This need for interpretability brings in the concept of XAI; explainability studies aim to show the features that most influence a model's decision, and Singh et al. [87] have reviewed the currently used explainability methods. There is also a need for a large fundus database with high-precision annotation of all associated DR lesions to help in designing more robust, high-performance pipelines.
Informed Consent Statement: The study did not involve humans or animals; therefore, informed consent is not applicable.
Data Availability Statement: This is a review of published articles, and all the articles are publicly available.