Indian Diabetic Retinopathy Image Dataset (IDRiD): A Database for Diabetic Retinopathy Screening Research

Abstract: Diabetic retinopathy is the most prevalent cause of avoidable vision impairment, mainly affecting the working-age population worldwide. Recent research has clarified the need in clinical eye care practice for better and cheaper ways to identify, manage, diagnose, and treat retinal disease. The importance of diabetic retinopathy screening programs, and the difficulty of achieving reliable early diagnosis at a reasonable cost, call for the development of computer-aided diagnosis tools. Computer-aided diagnosis based on retinal image analysis could ease mass screening of populations with diabetes mellitus and help clinicians use their time more efficiently. Recent advances in computing power, communication systems, and machine learning techniques give biomedical engineers and computer scientists the opportunity to meet these clinical requirements. Diverse and representative retinal image sets are essential for developing and testing digital screening programs and the automated algorithms at their core. To the best of our knowledge, IDRiD (Indian Diabetic Retinopathy Image Dataset) is the first database representative of an Indian population. It comprises typical diabetic retinopathy lesions and normal retinal structures annotated at the pixel level. The dataset also provides the disease severity of diabetic retinopathy and diabetic macular edema for each image. This makes it well suited to the development and evaluation of image analysis algorithms for early detection of diabetic retinopathy.

Diabetic Retinopathy (DR) results from microvascular retinal changes triggered by diabetes, and it is the leading cause of preventable blindness in the working-age population worldwide [1,2]. Diabetic Macular Edema (DME) is a complication associated with DR, characterized by the accumulation of fluid or retinal thickening, that can occur at any stage of DR [3,4]. A report by the International Council of Ophthalmology (ICO) [5] indicates that 1 in 3 individuals with diabetes had some form of DR and that 1 in 10 had vision-threatening DR. In India, it is the sixth most common cause of blindness [6].
DR is a clinical diagnosis, characterized by the presence (see Figure 1) of one or more retinal lesions such as microaneurysms, hemorrhages, hard exudates, and soft exudates [7]. Early diagnosis and treatment of DR can prevent vision loss [8]. Hence, diabetic patients are advised to attend regular biannual or annual follow-ups and frequent consultations for retinal screening [9]. The elimination of preventable visual impairment depends mainly on the pool of expert clinicians and on the basic health care infrastructure essential for eye treatment [10,11]. In the Indian subcontinent, against a national ratio of eye care experts to population of 1:107,000, the ratio is 1:9000 in some regions, whereas in others there is only one eye care expert per 608,000 people [12,13]. Given the large number of people who require continuous follow-up and the shortage of ophthalmologists, the management of DR calls for the development of computer-aided diagnosis tools [14,15]. Recent advances in computing power, communication systems, and machine learning techniques give biomedical engineers and computer scientists the opportunity to meet the requirements of clinical practice [16,17].
Raw images with ground truths enable the scientific community to develop, validate, compare, and further improve the DR lesion detection algorithms used in clinical applications [18]. Precise pixel-level annotation of abnormalities associated with DR, such as microaneurysms, soft exudates, hard exudates, and hemorrhages, is an invaluable resource for evaluating the performance of individual lesion segmentation techniques. Reliable information about the disease severity level of DR and DME, in turn, is useful for the development and evaluation of image analysis and retrieval algorithms for early detection of the disease [19].
This dataset was made available as part of the "Diabetic Retinopathy: Segmentation and Grading Challenge" (http://biomedicalimaging.org/2018/challenges/), organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI-2018), Washington D.C. The data challenge was hosted on the Grand Challenges in Biomedical Imaging platform [20]. Information about specifications and data accessibility is provided in Table 1.

Experimental factors
Mydriasis with one drop of tropicamide at 0.5% concentration

Experimental features
Retinal images of humans affected by diabetes were captured with a 39 mm distance between the lenses and the examined eye, using a non-invasive fundus camera with a xenon flash lamp.

Data Description
The IDRiD dataset is a new, publicly available retinal fundus image database consisting of 516 images categorized in two parts:

• Retinal images with signs of DR and/or DME.

• Normal retinal images (without signs of DR and/or DME).

The dataset provides the following ground truths associated with the signs of Diabetic Retinopathy (DR) and Diabetic Macular Edema (DME) and with normal retinal structures:

• Pixel-level annotations of typical diabetic retinopathy lesions and the optic disc.

• Image-level disease severity grading of diabetic retinopathy and diabetic macular edema.

• Optic disc and fovea center coordinates.

Image Level Disease Grading
The medical experts graded the full set of 516 images, which cover a variety of pathological conditions of DR and DME. The dataset is divided into a training set of 413 images (80%) and a testing set of 103 images (20%), maintaining an appropriate mixture of disease stratification. The expert labels of DR and DME severity level are provided in two CSV files: (a) IDRiD_DiseaseGrading_TrainingLabels.CSV and (b) IDRiD_DiseaseGrading_TestingLabels.CSV.
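The procedure used to form the 80/20 split is not detailed here. As an illustration only, a split that preserves the proportion of each severity grade could be sketched as follows. The function name, the input format (a mapping from image name to grade), and the seed are assumptions made for this example, and per-grade rounding means the totals need not match 413 and 103 exactly.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split image ids into (train, test) lists, grade by grade.

    labels: dict mapping image id -> severity grade (hypothetical format).
    Each grade contributes roughly `test_fraction` of its images to the test set.
    """
    by_grade = defaultdict(list)
    for image_id, grade in labels.items():
        by_grade[grade].append(image_id)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for grade in sorted(by_grade):
        ids = sorted(by_grade[grade])
        rng.shuffle(ids)
        n_test = round(len(ids) * test_fraction)
        test.extend(ids[:n_test])
        train.extend(ids[n_test:])
    return train, test


# Toy labels (made up): 50 images, 10 per grade 0..4.
toy_labels = {f"img_{i:03d}": i % 5 for i in range(50)}
train_ids, test_ids = stratified_split(toy_labels)
print(len(train_ids), len(test_ids))
```

Splitting within each grade, rather than over the pooled images, keeps rare severity levels represented in both sets.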

Optic Disc and Fovea Center Location
Along with the annotations presented above, the dataset provides the center pixel locations of the optic disc [OD_x, OD_y] and fovea [F_x, F_y] for all 516 images, as shown in Figure 4. Table 2 summarizes the available data with its description, quantity, and file types.
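One possible use of these center coordinates is to score a localization algorithm by the Euclidean distance, in pixels, between its predicted center and the annotated one. The sketch below assumes points are given as (x, y) pixel pairs; the function name and the coordinates in the usage lines are made up for illustration.

```python
import math


def center_error(predicted, reference):
    """Euclidean distance in pixels between two (x, y) points."""
    return math.hypot(predicted[0] - reference[0], predicted[1] - reference[1])


# Hypothetical predicted and annotated optic disc centers.
annotated_od = (1200, 1400)
predicted_od = (1203, 1404)
print(center_error(predicted_od, annotated_od))
```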

Ethics Statement
Informed consent was obtained from the patients in this study. Appropriate care has been taken to protect patient privacy, as per the guidelines of the local ethics committee and the ethics of clinical practice and medical research. The dataset has also received approval from the local research ethics committee of Shri Guru Gobind Singhji Institute of Engineering and Technology, Nanded (M.S.), India.
Details regarding the data acquisition and annotation are as follows:

Data Acquisition
Retinal fundus imaging is a non-invasive and painless means of screening the retina [9,21]. The fundus images in the IDRiD database were acquired at an eye clinic located in Nanded (M.S.), India. Retinal images of humans affected by diabetes were captured with a 39 mm distance between the lenses and the examined eye, using a non-invasive fundus camera with a xenon flash lamp. The details of sample pretreatment and camera specifications are as follows:

• Pretreatment of Samples: All subjects in the dataset underwent mydriasis before image capture. Mydriasis is the process of pupil dilation, which was done with one drop of tropicamide at 0.5% concentration.

• Fundus Camera Specifications: Images were acquired using a Kowa VX-10α digital fundus camera with a 50° field of view (FOV). The images have a resolution of 4288 × 2848 pixels and are stored in jpg file format. The size of each image is about 800 KB.

• Data Quality: The dataset was formed by extracting 516 images from the thousands of examinations done during the period 2009-2017. Experts verified that all images are of adequate quality and clinically relevant, that no image is duplicated, and that a reasonable mixture of disease stratification representative of diabetic retinopathy (DR) and diabetic macular edema (DME) is present.

Annotation of Images
This dataset provides three types of annotations, namely pixel-level annotations of lesions, image-level DR and DME grading, and center markups for the OD and fovea. Details of the ground truths for each of the three types are as follows:

• Pixel Level Annotation: Initially, all observers were trained by expert ophthalmologists to identify individual lesions. An image processing expert chose 81 images with contextual data comprising soft exudates, hard exudates, microaneurysms, and hemorrhages. The pixel-level annotation was done by a master's student using special software developed by ADCIS [22] specifically for annotation purposes. Figure 5 shows a sample image from the database and manually drawn contours. The markings on each of these images were later reviewed by two retinal specialists and finalized once the necessary consensus was reached. The final ground truth images for all lesions and the optic disc are shown in Figure 6. Similar pixel-level lesion annotations are available in the E-Ophtha dataset [23].

• DR and DME Grading: The medical experts graded the full set of 516 images, which cover a variety of pathological conditions of DR and DME. The diabetic retinal images were classified into separate groups ranging from 0 (No apparent DR) to 4 (Severe DR) according to the International Clinical Diabetic Retinopathy Scale [24], similar to the existing Kaggle DR dataset [25]. The risk of macular edema can be determined by the presence of exudates [26]. Accordingly, the severity of DME was graded based on occurrences of hard exudates near the macula center, as per the definitions provided by the Messidor database [27].
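To make the DME grading rule concrete, the sketch below encodes the Messidor risk-of-macular-edema definition as we understand it: grade 0 when no hard exudates are visible, grade 1 when the nearest hard exudate lies farther than one papilla (optic disc) diameter from the macula center, and grade 2 when it lies within one papilla diameter. The function name, the point-list input, and the pixel values are assumptions for illustration; in IDRiD the grades were assigned by medical experts, not by such a rule applied automatically.

```python
import math


def dme_risk_grade(exudate_points, macula_center, papilla_diameter):
    """Return a DME risk grade (0, 1 or 2) from hard exudate locations.

    exudate_points: list of (x, y) hard exudate positions in pixels (hypothetical input).
    macula_center: (x, y) position of the macula center in pixels.
    papilla_diameter: optic disc diameter in pixels.
    """
    if not exudate_points:
        return 0  # no visible hard exudates
    mx, my = macula_center
    nearest = min(math.hypot(x - mx, y - my) for x, y in exudate_points)
    # Within one papilla diameter of the macula center -> highest risk grade.
    return 2 if nearest <= papilla_diameter else 1


# Made-up coordinates, for illustration only.
print(dme_risk_grade([(900, 1400), (2100, 1500)], (2000, 1450), 400))
```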

Figure 1. Color fundus photograph containing different retinal lesions associated with diabetic retinopathy. The enlarged parts illustrate the presence of microaneurysms, soft exudates, hemorrhages, and hard exudates.

Figure 2. Color fundus photograph containing different retinal lesions associated with diabetic retinopathy. The enlarged parts illustrate sample annotations of microaneurysms, soft exudates, hemorrhages, and hard exudates.

Figure 3 illustrates the information available in both CSV files, with each column described as follows: A. Image No: Name (serial number) of the deidentified and renamed patient image. B. DR Grade: DR severity level in the range 0 (No apparent DR) to 4 (Severe DR). C. Risk of DME: Macular edema severity level in the range 0 (No DME) to 2 (Severe DME).
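A label file with these three columns could be read and range-checked along the following lines. This is a minimal sketch: the header strings "Image No", "DR Grade", and "Risk of DME" are taken from the column descriptions above and may not match the exact headers in the distributed CSV files, and the sample rows are invented.

```python
import csv
import io


def read_grades(csv_file):
    """Parse rows of (image name, DR grade, DME risk) from an open CSV file.

    Column headers are assumed to be "Image No", "DR Grade", "Risk of DME".
    Raises ValueError if a grade falls outside its documented range.
    """
    rows = []
    for record in csv.DictReader(csv_file):
        name = record["Image No"].strip()
        dr = int(record["DR Grade"])
        dme = int(record["Risk of DME"])
        if not 0 <= dr <= 4:
            raise ValueError(f"{name}: DR grade {dr} outside 0..4")
        if not 0 <= dme <= 2:
            raise ValueError(f"{name}: DME risk {dme} outside 0..2")
        rows.append((name, dr, dme))
    return rows


# Invented sample content standing in for one of the label files.
sample = io.StringIO("Image No,DR Grade,Risk of DME\nimg_a,3,2\nimg_b,0,0\n")
for name, dr, dme in read_grades(sample):
    print(name, dr, dme)
```

Validating the ranges while reading surfaces malformed rows early, before they reach training or evaluation code.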

Figure 3. Sample DR and DME expert labels in CSV file.

Figure 4. Sample cropped image from the IDRiD database illustrating the OD and fovea center location.

Figure 5. Enlarged part of fundus image containing hard exudates from our database: (a) presence of hard exudates; (b) the manual mark-ups for hard exudates; and (c) markup contour from annotator display.

Table 1. Specifications table. More specific subject area: Retinal image analysis for detection of DR and DME.

Table 2. List of data available in the created dataset.