Article

Deep Ensemble Learning Based Objective Grading of Macular Edema by Extracting Clinically Significant Findings from Fused Retinal Imaging Modalities

1 School of Automation Science and Electrical Engineering, Beihang University (BUAA), Beijing 100191, China
2 Department of Electrical Engineering, Bahria University (BU), Islamabad 44000, Pakistan
3 School of Computer Science and Engineering, Beihang University (BUAA), Beijing 100191, China
4 School of Computer and Communication Engineering, University of Science & Technology Beijing (USTB), Beijing 100083, China
5 Department of Electrical and Computer Engineering, Sir Syed CASE Institute of Technology (SSCIT), Islamabad 44000, Pakistan
* Author to whom correspondence should be addressed.
These authors contributed equally to this research.
Sensors 2019, 19(13), 2970; https://doi.org/10.3390/s19132970
Submission received: 26 May 2019 / Revised: 21 June 2019 / Accepted: 26 June 2019 / Published: 5 July 2019
(This article belongs to the Special Issue Biomedical Imaging and Sensing)

Abstract

Macular edema (ME) is a retinal condition that affects the central vision of a patient. In ME, fluid accumulates in and around the macular region, resulting in a swollen macula. Optical coherence tomography (OCT) and fundus photography are the two widely used retinal examination techniques that can effectively detect ME. Many researchers have utilized retinal fundus and OCT imaging for detecting ME. However, to the best of our knowledge, no work in the literature fuses the findings from both retinal imaging modalities for an effective and more reliable diagnosis of ME. In this paper, we propose an automated framework for the classification of ME and healthy eyes using retinal fundus and OCT scans. The proposed framework is based on deep ensemble learning, where the input fundus and OCT scans are recognized through a deep convolutional neural network (CNN) and are processed accordingly. The processed scans are further passed to the second layer of the deep CNN model, which extracts the required feature descriptors from both images. The extracted descriptors are then concatenated together and passed to a supervised hybrid classifier built through an ensemble of artificial neural networks, support vector machines and naïve Bayes. The proposed framework has been trained on 73,791 retinal scans and is validated on 5100 scans from the publicly available Zhang and Rabbani datasets. The proposed framework achieved an accuracy of 94.33% for distinguishing ME from healthy subjects and achieved mean dice coefficients of 0.9019 ± 0.04 for extracting retinal fluids, 0.7069 ± 0.11 for extracting hard exudates and 0.8203 ± 0.03 for extracting retinal blood vessels against clinical markings.


1. Introduction

Visual impairments severely degrade the quality of life and have an adverse effect on people suffering from other chronic health issues. Currently, blindness is considered a major health problem worldwide. According to the Global Burden of Disease (GBD) 2017 report (released on 8 November 2018), loss of vision is the third leading form of impairment in humans, and 48.2 million people are suffering from eye diseases all over the world. In addition to this, 39.6 million people have severe visual impairments, whereas 279 million and 969 million people have moderate and low visual impairments, respectively [1,2]. Moreover, most of the reported visual impairments are due to retinopathy.
The prime cause of retinopathy is diabetes mellitus (DM). DM is caused by the destruction of pancreatic beta cells (β-cells), affecting the glucose metabolism of the candidate subject. DM is graded into two types: Type I DM specifies a deficiency of insulin, whereas Type II DM is associated with insulin resistance [3,4,5]. Apart from this, DM also affects other vital organs of the human body including the eyes, kidneys, heart, etc. [6]. The macula produces the central vision and is the most critical part of the retina; any damage to the macula results in the loss of central vision. The retinal diseases that affect the central vision of a person are collectively known as maculopathy. The most common form of maculopathy is ME, which is caused by the leakage of extracellular fluid from hyper-permeable capillaries in the macula of the retina. ME is clinically graded into different stages depending upon the affected area of macular thickening. However, early detection and laser photocoagulation can prevent sudden blindness in most cases. Moreover, many retinal complications are often treatable and, according to the initiative of “VISION 2020: The Right to Sight”, different measures are being taken to eradicate avoidable blindness by the year 2020 [7]. At the same time, it is equally important to equip ophthalmologists with state-of-the-art retinal computer aided diagnostic systems for efficient detection and grading of retinopathy.
The two non-invasive imaging modalities that are clinically in practice for retinal examination are OCT and fundus imagery [8]. OCT captures the tissue reflection through light coherency. For retinal examination, a light beam is directed onto the fundus of the retina, yielding a cross-sectional axial scan (A-scan) [9,10]. The A-scans are joined together to produce a brightness scan (B-scan). Since OCT captures retinal cross-sections, early progression of retinopathy can be easily visualized, and early identification of retinopathy leads toward better treatment. Retinal OCT imagery has revolutionized clinical examination and eye treatment [11,12]. Figure 1a shows the basic OCT scan acquisition schematics, which is based on the Michelson interferometer (MI). In the MI, a monochromatic coherent light source is used to penetrate the human eye to produce a cross-sectional retinal scan. A beam splitter at the center splits the light source into two separate beams, where one beam is directed towards a reference mirror and the other travels to the subject’s eye. Upon reflection, these two beams are recombined into a single beam, producing an axial scan at the detector.
On the other hand, fundus photography also captures the central and peripheral retinal regions [13]. Figure 1b shows the acquisition principle of fundus imagery, where a specialized microscope is attached to a charge coupled device (CCD) camera to capture fundus photographs. Fundus scans should ideally be taken in dim conditions. In certain circumstances, it becomes vital to consider all of the retinal examination techniques to fully analyze the pathological conditions of the human retina. The optical principle of the fundus camera is the same as that of ophthalmoscopy, which acquires an inverted fundus scan enlarged about two to five times [14,15]. The light passes through a series of biconvex lenses, which focus the light to pass through the central aperture, forming an annulus. After that, the light passes through the cornea and falls on the fundus, and hence the fundus scan appears on the display device, where it can then be saved. The advantages of fundus photography are that it does not require pupil dilation, it is easy to use, it does not require a skilled user and it captures images that can easily be examined by specialists at any time and anywhere. However, apart from the high cost of the equipment and its non-portability, a major limitation of fundus photography is that it obtains a 2D representation of 3D semi-transparent retinal tissues projected onto the imaging plane, a limitation that is addressed by OCT imagery. Figure 2 shows ME visualization in both OCT and fundus scans.

2. Related Work

In the past, many researchers have conducted clinical studies on analyzing ME using fundus and OCT scans [16,17,18] and concluded that OCT imaging provides better visualization of ME in comparison to fundus photography, especially in early stages where the symptoms of ME are relatively less prominent. In addition to this, many studies have been conducted on devising automated algorithms for detecting ME from fundus or OCT scans individually. Most of the methods that use fundus images for the automated detection of ME are based on component segmentation, lesion detection and extraction of hard exudates (HE). Since the contrast between HE and other retinal structures in digital fundus scans is relatively high, the most common approaches for detecting HE include marker-controlled watershed transformation [19], a particle swarm optimization (PSO) based algorithm [20] and the use of the local standard variation in a sliding window, morphological closing of the luminance channel and the watershed transform [21]. However, illumination variations, which arise because of changes in tissue pigmentation and imaging conditions, greatly affect these methods. Additionally, methods based on extracting edge and color features have also been proposed in the past for the segmentation of HE [22,23,24,25]. In general, such algorithms produce unsatisfactory results without complex pre- and post-processing steps.
Different researchers have developed automated frameworks for the extraction of retinal layers and retinal fluids for analyzing ME affected pathologies [26,27,28,29]. Kernel regression and graph theory dynamic programming (KR + GTDP) [30] and software development life cycle (SDLC) [31] frameworks have also been developed for segmenting retinal layers and retinal fluids in ME affected OCT scans. Srinivasan et al. [32] proposed a maculopathy detection framework using histograms of oriented gradients. Apart from this, deep learning frameworks [33,34,35] have also been proposed recently for the automated extraction of retinal information from maculopathy affected OCT scans.
However, to the best of our knowledge, no method has been proposed in the past that fuses multiple retinal imaging modalities for the objective evaluation of ME pathology. In this paper, we propose a deep ensemble learning based framework that gives an objective grading of ME pathology. The main contributions of our paper are as follows:
  • A novel method was presented in this paper that extracted the ME pathological symptoms from retinal fundus and OCT scans.
  • Instead of extracting handcrafted features, the proposed framework employed a deep convolutional neural network (CNN) model that gives the most relevant and useful features from retinal fundus and OCT scans for the objective evaluation of ME pathology irrespective of the scan acquisition machinery.
  • Many frameworks that have been proposed in the past were tested on a single dataset or on scans acquired through single OCT machinery. However, the proposed framework could give objective grading of ME pathology irrespective of OCT acquisition machinery and was rigorously tested on scans from different publicly available datasets.
  • The proposed framework employed an ensemble of artificial neural networks (ANN), support vector machines (SVM) and naïve Bayes (NB) for the in-depth grading of ME using both fundus and OCT retinal imaging modalities.
  • The proposed framework is adaptive and gives more weight to the clinical findings such as foveal swelling, fluid filled spaces and hard exudates while evaluating ME. This is achieved by fine-tuning the proposed CNN model on observing the critical ME symptoms from both fundus and OCT imagery.
The rest of the paper is organized as follows: Section 3 reports the details of the datasets used in this study, Section 4 explains the proposed methodology, Section 5 presents the results and Section 6 provides a detailed discussion of the proposed framework. Section 7 concludes the paper and highlights future directions.

3. Datasets

The proposed framework has been tested on retinal fundus and OCT B-scans from the publicly available Rabbani and Zhang datasets. The Zhang dataset only consists of OCT scans of various retinal pathologies, while the Rabbani datasets contain scans from fundus, fluorescein angiography (FA) and OCT retinal imaging modalities. We excluded scans of retinal pathologies other than healthy and ME from these datasets. A detailed description of the datasets used for training and evaluation is listed in Table 1. All the scans within the datasets were marked by expert clinicians, and we used these markings as ground truth when evaluating the performance of the proposed framework.

4. Proposed Methodology

The proposed framework fuses retinal fundus and OCT imagery for the automated recognition and classification of ME and healthy subjects. The block diagram of the proposed framework is shown in Figure 3 where it can be observed that the proposed framework consisted of five major stages:
  • Retinal imaging modality recognition;
  • Preprocessing retinal scans;
  • Extraction of clinically significant ME pathological symptoms;
  • CNN for feature extraction;
  • Retinal diagnosis.
At first, the input retinal scans were categorized as fundus or OCT through the first layer of the deep CNN model. Afterwards, different acquisition artifacts and unwanted noise content from both types of imagery were removed through the preprocessing stage. After enhancing the scans, the information about retinal layers, retinal fluids and hard exudate regions was automatically extracted through a set of coherent tensors, which highlight the clinically significant pathological features of the ME retinal syndrome. The extracted retinal information was then mapped onto the original scan, from which the distinct features were extracted through deep CNN models. The extracted features from both fundus and OCT imagery were concatenated together to form a feature vector upon which the candidate subject was graded. A detailed description of each stage is presented in the subsections below.

4.1. Retinal Imaging Modality Recognition

The first stage of the proposed framework was related to the automated recognition of retinal fundus and OCT scans. For this purpose, we utilized the pre-trained AlexNet model [42]. AlexNet is a 25-layered CNN architecture trained on the ImageNet dataset. We modified the classification layer of the AlexNet network and retrained it on the local image modality recognition training dataset through transfer learning. The transfer learning phase is shown in Figure 4 and a detailed description of the respective training dataset is presented in Table 1. The pretrained weights of the AlexNet model converged quickly for the recognition of retinal imaging modalities, which reduced the training and fine-tuning time. The optimization during the training phase was performed through stochastic gradient descent (SGD) [43], and two 50% dropout layers were employed to reduce overfitting. The main reason for employing the AlexNet model instead of designing a CNN architecture from scratch was to achieve greater accuracy with a small training dataset in less time. Apart from this, the softmax function was used in the modified AlexNet architecture to compute the final output probabilities. The softmax function is mathematically expressed in Equation (1) and the architectural description of the AlexNet layers is presented in Table 2.
$$\sigma(X)_i = \frac{e^{x_i}}{\sum_{j=1}^{N} e^{x_j}}, \tag{1}$$
where X = {x_1, x_2, …, x_N} is the input vector. After each convolution layer, a rectified linear unit (ReLU) layer is employed, which ensures that only positive values are retained in the feature map (the negative values reflect regions where the input is dissimilar to the convolutional kernel). After the ReLU layer, a max pooling layer is added, which keeps only the maximum values within each neighborhood and thereby shrinks the resultant feature map.
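As a concrete illustration of this transfer learning step, the sketch below adapts a pre-trained AlexNet to the two-class modality recognition task (OCT vs. fundus). The authors' implementation used MATLAB (see Table 10); this PyTorch version, along with its learning rate and momentum values, is only an assumed, minimal equivalent.

```python
# Hedged sketch (not the authors' MATLAB code): AlexNet transfer learning for
# recognizing the retinal imaging modality, as described in Section 4.1.
import torch
import torch.nn as nn
from torchvision import models

model = models.alexnet(pretrained=True)      # weights pre-trained on ImageNet
for p in model.features.parameters():
    p.requires_grad = False                  # reuse the convolutional features as-is

# Replace the final classification layer: 1000 ImageNet classes -> 2 modalities.
model.classifier[6] = nn.Linear(4096, 2)

criterion = nn.CrossEntropyLoss()            # softmax (Eq. 1) + cross-entropy loss
optimizer = torch.optim.SGD(model.classifier.parameters(), lr=1e-3, momentum=0.9)

def train_step(images, labels):
    """One SGD update on a mini-batch of labeled fundus/OCT scans."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```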

4.2. Preprocessing Retinal Scans

The acquisition of retinal scans is highly sensitive to the subject’s head and eye movements, which often leads to scan degradation. Apart from this, the acquisition machines add different kinds of scan annotations, which greatly affect the automated retinal analysis. In order to handle such noisy artifacts, a preprocessing stage was added, which effectively removes the noisy content while enhancing the retina. Since the annotations are mostly added in the top and bottom rows of the respective B-scan, they are automatically removed by setting the first and last fifty rows to zero. This threshold was empirically selected by analyzing the scans within all the datasets. Apart from this, the degraded scan areas shown in Figure 5 were automatically removed by searching for the first and last highly sharp transitions in each column of the respective scan and then setting the values in the identified noisy regions to the mean of the background pixels.
The preprocessing stage further enhances the retinal portions by increasing their contrast with the background and also by removing noisy outliers. This is accomplished through an adaptive low pass Wiener filter, which uses a localized neighborhood of a candidate pixel for denoising [34]. The response of the Wiener filter is expressed in Equations (2)–(4):
$$m = \frac{1}{w_h w_v} \sum_{x_i \in w_h} \sum_{y_i \in w_v} O(x_i, y_i), \tag{2}$$

$$s^2 = \frac{1}{w_h w_v} \sum_{x_i \in w_h} \sum_{y_i \in w_v} O^2(x_i, y_i) - m^2, \tag{3}$$

$$D(x_i, y_i) = m + \frac{s^2 - a^2}{s^2} \bigl( O(x_i, y_i) - m \bigr), \tag{4}$$
where O(x_i, y_i) and D(x_i, y_i) represent the pixels of the original and denoised retinal scans, respectively, w_h is the horizontal extent of the denoising window and w_v is its vertical extent. The locally estimated mean and variance are represented by m and s², respectively, and a² is the average of all the locally estimated variances, which serves as the noise estimate.
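A minimal sketch of this preprocessing stage is given below, assuming a 5 × 5 denoising window (the paper does not state the neighborhood size) and the fifty-row annotation threshold described above; scipy.signal.wiener implements the local-statistics filter of Equations (2)–(4). The authors' implementation was in MATLAB, so this Python version is only illustrative.

```python
# Hedged sketch of Section 4.2: annotation removal plus adaptive Wiener denoising.
import numpy as np
from scipy.signal import wiener

def preprocess_bscan(bscan: np.ndarray, annot_rows: int = 50, win: int = 5) -> np.ndarray:
    """Zero out scanner annotations and denoise a single retinal B-scan."""
    img = bscan.astype(float).copy()
    img[:annot_rows, :] = 0          # annotations occupy the top rows
    img[-annot_rows:, :] = 0         # ... and the bottom rows (empirical threshold)
    return wiener(img, (win, win))   # adaptive low-pass Wiener filter, Eqs. (2)-(4)
```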

4.3. Extraction of Clinically Significant ME Pathological Symptoms

ME is clinically graded into different categories as defined by the Early Treatment Diabetic Retinopathy Study (ETDRS). ME due to the presence of hard exudates and retinal fluids within a foveal diameter of 500 micrometers was considered clinically significant, whereas ME outside this region was considered non-clinically significant. Therefore, the accurate extraction of hard exudates and retinal fluid regions and the localization of the fovea were very critical for effectively grading ME. Retinal fluids could be accurately observed through OCT scans, while hard exudates were effectively visualized through fundus images. Therefore, the proposed framework, rather than relying on either fundus or OCT imagery, used both of them to effectively extract the retinal information for the reliable and objective grading of ME. In order to localize the fovea, the proposed framework extracted the retinal layers from the OCT volume and measured the deepest inner limiting membrane (ILM) point within the foveal B-scan.
The extraction of retinal information from both types of imagery was performed through the structure coherence matrix, also known as the structure tensor. The structure tensor has gained tremendous popularity in medical image processing because it provides low-level feature analysis and is very useful for detecting corners, edges and boundaries [44]. The structure tensor, also known as the Förstner interest operator, is a second moment matrix that computes the gradients of an image by using Gaussian derivative filters, as expressed in Equations (5)–(8):
$$\mathcal{S}_T = \begin{bmatrix} T_{xx}^2 & T_{xy} \\ T_{yx} & T_{yy}^2 \end{bmatrix}, \tag{5}$$

$$T_{xx}^2 = \sum_{x_i \in g_x} \sum_{y_j \in g_y} g(x_i, y_j)\,\bigl[\varphi_x D(x - x_i, y - y_j)\bigr]^2, \tag{6}$$

$$T_{yy}^2 = \sum_{x_i \in g_x} \sum_{y_j \in g_y} g(x_i, y_j)\,\bigl[\varphi_y D(x - x_i, y - y_j)\bigr]^2, \tag{7}$$

$$T_{xy} = T_{yx} = \sum_{x_i \in g_x} \sum_{y_j \in g_y} g(x_i, y_j)\,\bigl[\varphi_{xy} D(x - x_i, y - y_j)\bigr], \tag{8}$$
where S_T is the second order structure tensor matrix, T_xx² is the horizontally computed tensor, T_yy² is the vertically computed tensor and T_xy, T_yx are the horizontally and vertically oriented tensors. φ_x, φ_y and φ_xy are the partial derivatives of the denoised image within the pixel neighborhood with respect to x, y and both x and y orientations. g(x, y) is the Gaussian window and D(x, y) is the denoised retinal scan. Figure 6 shows the structure tensor computation stage. The structure tensor uses a set of eigenvalues to measure the degree of coherency, and the tensor with maximum coherency is automatically selected for extracting retinal information [34].
After preprocessing the retinal fundus and OCT scans, the second moment matrix was automatically computed by the proposed framework for further analysis. S_T of the retinal fundus scan was computed for the extraction of the blood vascular patterns. Afterwards, the optic disc region was automatically localized by analyzing the high intensity retinal regions. The extraction of blood vessels and the localization of the optic disc region were performed in order to improve the segmentation of the hard exudate regions. Since blood vessels contain high frequency components, the tensors present their detailed visualization while suppressing all other content, as evident from Figure 6b. After computing S_T of the candidate fundus scan, four coherent tensors were obtained. The best tensor (T_MAX) was then obtained by fusing the T_XX and T_YY tensors, which together give the maximum information about the blood vessels. Blood vessel segmentation in the proposed framework is quite robust, as it can easily extract small blood capillaries that are not even visible to the naked eye, as shown in Figure 6e.
The structure coherence matrix of the retinal OCT scans was computed for the extraction of up to nine retinal layers [34]. Since most of the retinal layers are horizontally oriented, T_YY is the most coherent tensor in S_T for extracting layer information, as evident from Figure 6b. After extracting the nine retinal layers, the ILM and retinal pigment epithelium (RPE) layers were used to generate a retinal mask, which was then multiplied by the candidate OCT B-scan for the extraction of retinal fluids [34]. The extracted retinal information was then overlaid onto the respective scan for the extraction of the clinically significant feature set by the proposed CNN model.
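The following sketch illustrates a structure tensor computation in the spirit of Equations (5)–(8), with Gaussian derivative filters and an eigenvalue-based coherency measure. The Gaussian scales are assumptions (the paper does not report them), and the off-diagonal term here uses the product of the first derivatives, the classical structure tensor form, rather than the mixed derivative φ_xy written above.

```python
# Hedged sketch of a structure tensor built from Gaussian derivative filters.
import numpy as np
from scipy.ndimage import gaussian_filter

def structure_tensor(D: np.ndarray, sigma_d: float = 1.0, sigma_w: float = 2.0):
    """Return smoothed tensor components (Txx, Txy, Tyy) and a coherency map."""
    dx = gaussian_filter(D, sigma_d, order=(0, 1))   # Gaussian derivative along x
    dy = gaussian_filter(D, sigma_d, order=(1, 0))   # Gaussian derivative along y
    # Window each product with the Gaussian g(x, y).
    Txx = gaussian_filter(dx * dx, sigma_w)
    Tyy = gaussian_filter(dy * dy, sigma_w)
    Txy = gaussian_filter(dx * dy, sigma_w)
    # Eigenvalues of the 2x2 tensor give the local degree of coherency.
    tmp = np.sqrt((Txx - Tyy) ** 2 + 4.0 * Txy ** 2)
    lam1, lam2 = (Txx + Tyy + tmp) / 2.0, (Txx + Tyy - tmp) / 2.0
    coherency = (lam1 - lam2) / (lam1 + lam2 + 1e-12)
    return Txx, Txy, Tyy, coherency
```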

4.4. CNN for Feature Extraction

After extracting the hard exudates, retinal layers and retinal fluids, they were marked on the respective fundus and OCT scans for computing the distinct features used to discriminate between healthy and ME affected subjects. These features were extracted through the proposed CNN architecture. We designed a 14-layer, structure tensor influenced CNN architecture containing one input layer, three convolution layers, three batch normalization layers, three ReLUs, two max pooling layers, one dropout layer with a 50% threshold and one fully connected layer. The kernels within the convolution layers of the proposed CNN architecture contained weights that retain the structure tensor-based features while suppressing other content. This gave significant variability between ME and healthy subjects. The proposed CNN model for feature extraction was designed from scratch and was trained on more than 0.07 million scans, where the optimization was performed through SGD. In the proposed CNN model, the negative convolution sum values were removed through ReLU and the max pooling layers shrank the feature map to avoid unnecessary calculations. Since retinal fundus and OCT scans show different clinically significant ME findings, the proposed CNN architecture extracted distinct features from both imaging modalities (i.e., eight distinct features from the retinal fundus scan and eight distinct features from the OCT scan), which were then concatenated together to generate a 16-D feature vector. These sixteen features were then used to grade healthy and ME subjects. The proposed CNN model showed promising feature extraction results after being trained on the dataset mentioned in Table 1. This was due to the robust extraction of retinal information, which was mapped onto the retinal scans, from which the proposed CNN model generated the most meaningful and distinctive features, as shown in Figure 7. The detailed configuration of the proposed CNN model for feature extraction is presented in Table 3, while Table 4 contains the sixteen extracted features from some of the healthy and ME affected scans. Figure 7 shows the detailed CNN model for feature extraction from both imaging modalities.
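A minimal PyTorch sketch of a 14-layer network with this layer budget (one input, three convolution, three batch normalization, three ReLU, two max pooling, one 50% dropout and one fully connected layer) is shown below. Kernel sizes, filter counts and the final global pooling are assumptions; the actual configuration is the one reported in Table 3.

```python
# Hedged sketch of the structure tensor influenced feature-extraction CNN.
import torch
import torch.nn as nn

class FeatureExtractorCNN(nn.Module):
    def __init__(self, n_features: int = 8):      # eight descriptors per modality
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
        )
        self.dropout = nn.Dropout(p=0.5)
        self.fc = nn.Linear(64, n_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        x = torch.amax(x, dim=(2, 3))              # collapse the spatial dimensions
        return self.fc(self.dropout(x))

# The fundus and OCT descriptors are then concatenated into the 16-D feature vector:
# feature_vector = torch.cat([cnn_oct(oct_scan), cnn_fundus(fundus_scan)], dim=1)
```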

4.5. Retinal Diagnosis

After extracting the sixteen clinically significant features from retinal fundus and OCT imagery, they were concatenated together and were utilized by the hybrid classification system for grading ME. The hybrid classification model in the proposed framework consisted of an ensemble of ANN, SVM and NB. The final decision was computed by measuring the majority votes of all three classification models. The description of each classification model is presented below.

4.5.1. Artificial Neural Networks

In this study, we used a feed forward artificial neural network classifier with one input layer, one output layer and two hidden layers. The input layer consisted of 16 nodes as per the extracted features. For hidden layers, we experimented with two to 40 nodes to find the optimum architecture (12 for the 1st hidden layer and nine for the 2nd hidden layer) of the neural network. A single output layer node gave the final classification probability. The sigmoid function was used for activation in each hidden layer whereas the final output layer contained softmax as the activation function. The weights during training were updated through gradient descent. Figure 8 shows the architecture of ANN used in the proposed study.
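As a sketch of this architecture (16 inputs, hidden layers of 12 and 9 nodes with sigmoid activations), the snippet below uses two softmax output units for the ME/healthy probabilities; this differs slightly from the single output node described above and is only an illustrative assumption.

```python
# Hedged sketch of the feed-forward ANN member of the ensemble.
import torch.nn as nn

ann = nn.Sequential(
    nn.Linear(16, 12), nn.Sigmoid(),        # 1st hidden layer: 12 nodes
    nn.Linear(12, 9), nn.Sigmoid(),         # 2nd hidden layer: 9 nodes
    nn.Linear(9, 2), nn.Softmax(dim=1),     # class probabilities (ME vs. healthy)
)
```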

4.5.2. Support Vector Machines

We also used an SVM classifier in the proposed classification model. SVM is among the most extensively used classifiers [34], and in this research a non-linear decision boundary was computed through Gaussian radial basis function (RBF) and multilayer perceptron (MLP) hyperplanes for predicting ME and healthy subjects based on the extracted feature vector (F_V).

4.5.3. Naïve Bayes

NB is a probabilistic classifier, which makes a decision based on the maximum a posteriori (MAP) rule. In this study, we used the NB classifier to determine the probability of ME and healthy classes through a 16-D feature vector. The category with the maximum probability was then automatically chosen as a diagnosis for the respective feature vector. The probabilities were computed through Bayes Rule as expressed in Equations (9) and (10):
$$P(c_i \mid F_v) = \frac{P(F_v \mid c_i)\, P(c_i)}{P(F_v)}, \tag{9}$$

$$Y = \underset{c_i}{\arg\max}\; P(c_i \mid F_v), \tag{10}$$
where c_i represents the healthy or ME class, F_v is the 16-D test feature vector formulated during the feature extraction stage and Y represents the class assigned to the unlabeled scan, i.e., the class with the largest probability given F_v. F_v contains eight distinct features from the retinal fundus scan and eight distinct features from the OCT scan. We used a Gaussian distribution to calculate the likelihood P(F_v | c_i).
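The three classifiers and their majority vote can be sketched with scikit-learn stand-ins as below. The authors' implementation was in MATLAB and the hyperparameters shown here are assumptions; X_train denotes the N × 16 feature matrix and y_train the ME/healthy labels.

```python
# Hedged sketch of the hybrid ANN + SVM + NB majority-voting classifier.
from sklearn.ensemble import VotingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB

hybrid = VotingClassifier(
    estimators=[
        ("ann", MLPClassifier(hidden_layer_sizes=(12, 9), activation="logistic")),
        ("svm", SVC(kernel="rbf")),   # Gaussian RBF decision boundary
        ("nb", GaussianNB()),         # Gaussian likelihoods with MAP decision, Eqs. (9)-(10)
    ],
    voting="hard",                    # final label = majority vote of the three models
)
# hybrid.fit(X_train, y_train)
# y_pred = hybrid.predict(X_test)
```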
The detailed block diagram of the classifier training stage is shown in Figure 9. We used around 0.07 million retinal scans for training the hybrid classifier; details of the training dataset are mentioned in Table 1. At first, sixteen distinct features were extracted from the labeled training scans to form a 16-D feature vector, which was then passed to all three classifiers separately, and their decisions were finalized through majority voting. The performance of the proposed hybrid classifier during training was measured through K-fold cross validation, as shown in Table 5 for different values of k. Once the classifiers achieved the desired accuracy, they were used for the retinal diagnosis of unlabeled scans during the classification stage, as shown in Figure 9. Algorithm 1 summarizes the working flow of our proposed framework.
Algorithm 1: Proposed Framework

5. Results

We tested the proposed framework on an unlabeled dataset consisting of 5000 OCT B-scans, of which 2500 were of ME affected eyes and 2500 were of healthy eyes, along with 100 fundus scans with the same ratio of ME and healthy eyes. Since the feature vector is generated by concatenating the extracted features from both fundus and OCT scans, we individually computed the mean dice coefficient to measure the performance of extracting hard exudates, blood vessels and retinal fluids.
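For reference, the dice coefficient used throughout this section compares a binary segmentation mask with the clinical marking; a minimal sketch is given below.

```python
# Hedged sketch of the dice similarity coefficient: 2|A ∩ B| / (|A| + |B|).
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice score between a predicted mask and the expert-marked ground truth."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    return 2.0 * np.logical_and(pred, truth).sum() / denom if denom else 1.0
```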
The dataset in [37] consists of 24 diabetic macular edema eyes with seven diffuse, 10 focal and seven mixed patterns of fluid leakage. It also provides three different expert markings of hard exudates for all these cases, which we used to validate the performance of the proposed system in extracting hard exudate regions, as shown in Table 6. It can be observed from Table 6 that the proposed framework achieved an overall mean dice coefficient of 0.7069 ± 0.11 in extracting hard exudates. Figure 10 shows a visual comparison of the hard exudates extracted by the proposed framework against the three different expert markings.
Since annotations for the blood vessels in the fundus/FA scans and the retinal fluids in the OCT scans were not available in the datasets used in this study, we arranged these annotations through a local expert clinician for comparative analysis. We evaluated the efficiency of the proposed system for blood vessel extraction through the mean dice coefficient computed against the manual markings made by the local clinician, as shown in Table 7. We obtained an overall mean dice coefficient of 0.8589 ± 0.04 for blood vessel segmentation in the case of healthy eyes and 0.8012 ± 0.03 in the case of ME affected eyes, whereas for both retinal conditions together we achieved an overall mean dice coefficient of 0.8203 ± 0.03. These results validate the accuracy of the proposed system in blood vessel segmentation across retinal pathologies, even in the presence of hard exudates, hemorrhages and micro-aneurysms in ME fundus/FA scans, and show the effectiveness of the proposed method in the detailed extraction of blood vessels. Figure 11 shows the blood vessels extracted by the proposed system in healthy and ME affected fundus/FA scans.
Similarly, we evaluated the performance of the proposed system for the extraction of retinal fluids through the mean dice coefficient computed against the manual markings made by the local clinician, as shown in Table 8. We obtained an overall mean dice coefficient of 0.9026 ± 0.03 for retinal fluid extraction on the Rabbani dataset [36] and 0.9012 ± 0.04 on the Zhang dataset [35], whereas for both datasets together we achieved an overall mean dice coefficient of 0.9019 ± 0.04. These results show that the proposed method performed well in retinal fluid extraction irrespective of the dataset and the OCT acquisition equipment. Figure 12 shows the retinal fluid extracted by the proposed system in healthy and ME affected OCT scans.
Moreover, we performed the classification of healthy and ME scans based on the 16-D feature vector extracted through the proposed CNN model. The 16-D feature vector extracted from the retinal fundus and OCT imagery was passed to the proposed hybrid classifier, the ensemble of ANN, SVM and NB whose final decision is computed by majority voting, for grading ME. The hybrid classifier correctly classified 94.33% of all the unlabeled dataset scans, while the individual performance of each classifier, along with other methods reported in the literature, is listed in Table 9. We used sensitivity (SE), specificity (SP), positive predictive value (PPV), negative predictive value (NPV) and diagnostic accuracy (A) as the five metrics to evaluate the hybrid classifier, as expressed in Equations (11)–(15):
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}, \tag{11}$$

$$\mathrm{Specificity} = \frac{TN}{TN + FP}, \tag{12}$$

$$\mathrm{PPV} = \frac{TP}{TP + FP}, \tag{13}$$

$$\mathrm{NPV} = \frac{TN}{TN + FN}, \tag{14}$$

$$\mathrm{Diagnostic\ Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}. \tag{15}$$
TP and TN are the true positives and true negatives, respectively, which specify the correctly classified cases. In this study, TP counts the scans of macular edema that were also classified as macular edema, while TN counts the scans of healthy eyes that were also classified as healthy. FP and FN stand for false positives and false negatives, respectively, and indicate misclassifications: FP cases are those in which the actual input scan was of a healthy eye but the classifier labeled it as ME, while FN is the reverse of FP.
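These five metrics follow directly from the confusion counts; a small sketch is shown below.

```python
# Hedged sketch of the diagnostic metrics of Eqs. (11)-(15).
def diagnostic_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute SE, SP, PPV, NPV and accuracy from the confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }
```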
Figure 13 and Figure 14 show some healthy and ME OCT cases from both the Rabbani [36] and Zhang [35] datasets that were correctly processed by the proposed framework, whereas Figure 15 shows some of the healthy and ME fundus scans that were correctly processed by the proposed system.
Figure 16 and Figure 17 show the training performance of AlexNet and the proposed CNN model for modality recognition and feature extraction, respectively. The training of AlexNet was conducted for 40 epochs, where each epoch was completed in 35 iterations. The training of the proposed CNN model for feature extraction was conducted for 30 epochs, where each epoch contained 50 iterations. The performance of AlexNet and the proposed CNN model during the training phase was measured through the accuracy and the cross-entropy loss function, as expressed in Equation (16):
$$C_L = -\sum_{w} I_{F_V, w} \log\bigl(P_{F_V, w}\bigr), \tag{16}$$
where I_{F_V, w} is an indicator that w is the correct class for the feature vector F_V, P_{F_V, w} is the probability computed for F_V that it belongs to class w and C_L is the cross-entropy loss. The summation in Equation (16) runs over the total number of classes.
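A tiny sketch of this loss for a single example is given below; the leading minus sign is the conventional form of the cross-entropy loss.

```python
# Hedged sketch of the cross-entropy loss of Eq. (16) for one feature vector.
import numpy as np

def cross_entropy(probs: np.ndarray, true_class: int) -> float:
    """probs: predicted class probabilities (summing to 1); true_class: index of w."""
    one_hot = np.zeros_like(probs)
    one_hot[true_class] = 1.0                 # indicator I_{Fv, w}
    return float(-np.sum(one_hot * np.log(probs + 1e-12)))
```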
Apart from this, validation was performed every 100 iterations, where the validation performance was also measured through the accuracy and cross-entropy loss function. The validation was performed in order to get an unbiased evaluation of the candidate model during the training phase, as evident from Figure 16 and Figure 17. Furthermore, we employed 50% dropout layers within each model to reduce overfitting on the dataset. The proposed CNN model achieved an accuracy of 99.23% in 1500 iterations during the training phase, while the AlexNet model achieved an accuracy of 98.79%. These results were obtained through MATLAB R2018a, and Table 10 shows the details of the systems and software along with the average time required for computing the results by each classifier. Although the average time of the hybrid classifier was a few seconds more than that of the individual classifiers, the accuracy achieved by the proposed classification model was 94.33%.

6. Discussion

A deep retinal diagnostic framework was proposed here that combines retinal fundus and OCT imagery for the extraction of clinically significant ME findings and uses the extracted information for the reliable and accurate grading of ME. According to the ETDRS, ME is clinically graded based upon the locality of the edema with respect to the fovea, i.e., if the retinal fluids or hard exudates are observed within a foveal diameter of 500 micrometers, then ME is graded as clinically significant; otherwise it is graded as non-clinically significant. Clinically significant macular edema is more critical than non-clinically significant macular edema, as it produces retinal thickening near the fovea, which causes non-recoverable visual impairments (or even blindness). Retinal fundus and OCT imagery are the most common non-invasive retinal examination techniques that depict the prominent symptoms of retinopathy. OCT imagery shows the early symptoms of retinopathy due to its ability to present retinal cross-sectional regions; therefore, retinal blood vessel leakage and retinal fluid accumulation can be easily visualized through OCT scans. However, accurate visualization of hard exudates from OCT imagery is a very cumbersome task, so retinal fundus scans are clinically used for this purpose. To the best of our knowledge, all the retinal diagnostic frameworks that have been proposed in the past for ME diagnosis are based on a single retinal imaging modality, which does not completely depict the retinal abnormalities. The proposed framework is unique as it fuses the findings from both retinal fundus and OCT imagery for the effective, reliable and objective diagnosis as well as grading of ME subjects (especially those with a diabetic history). The proposed framework first recognizes the type of imagery through the pre-trained AlexNet CNN model. Retinal imaging modality recognition is one of the crucial steps of the proposed framework, since both images do not contain any metadata that could convey their unique information or description. Therefore, in order to develop a generalized framework that can perform automated analysis and can automatically mass screen retinal patients, the respective imagery has to be automatically recognized first. After recognizing the retinal images, the proposed framework extracts the retinal layers and retinal fluids from the candidate OCT scans and also extracts the hard exudate regions from the fundus scans. The extracted retinal information is then overlaid onto the respective scans, and the annotated scans are passed to the proposed CNN model, which extracts eight distinct features from the annotated fundus scan and eight distinct features from the annotated OCT scan. These features are fused together to form a 16-D feature vector, which is passed to the proposed hybrid classifier formed through the ensemble of ANN, SVM and NB. One of the major aims of the proposed framework was to accurately diagnose and grade ME pathologies. Since ME is clinically graded into different categories depending upon the disease severity levels, the hybrid classification was proposed to obtain a reliable and accurate diagnosis, giving a decision based upon the majority votes obtained through all three supervised classifiers. This increases the diagnostic performance of the proposed framework without compromising the time performance, as evident from Table 10.
Apart from this, the proposed framework was extensively tested on multiple publicly available datasets and was compared with state-of-the-art solutions against different metrics and ground truths (provided by expert clinicians) as evident from the results section. Table 9 depicts the detailed diagnostic comparison with other existing solutions where it can be seen that the proposed framework was the only generic framework that was validated on multiple publicly available datasets containing both retinal fundus and OCT imagery and achieved the diagnostic accuracy of 94.33%.

7. Conclusions and Future Work

In this paper, we proposed a computer aided diagnostic method for the segmentation of retinal pathological symptoms and the classification of macular edema using two retinal imaging modalities (OCT and fundus imaging). The proposed framework was based on a hybrid classification model in which 16 unique features are extracted for distinguishing macular edema cases from healthy ones. The dataset used for conducting this study consisted of 78,891 retinal scans in total, of which we used 73,791 scans for training and 5100 for evaluation. The proposed classification model correctly classified 4811 retinal scans, achieving 94.33% accuracy. The proposed system was quite robust in general, insensitive to OCT B-scan orientation and performed extremely well on noisy and degraded scans, as shown in Figure 5. Moreover, the proposed technique could be optimized for detecting other ocular diseases such as age-related macular degeneration (ARMD), idiopathic central serous chorioretinopathy (CSCR), glaucoma, diabetic retinopathy, etc., as well as for segmenting other retinal layers. It could also be extended for the 3D modeling of the human retina.

Author Contributions

Conceptualization, validation, formal analysis, data curation, visualization and writing—review and editing, B.H., T.H., R.A., B.L., O.H.; methodology, writing—original draft preparation, B.H., T.H., R.A., O.H.; software, B.H., T.H., R.A.; supervision, project administration and funding acquisition, B.L.

Funding

This research and its APC were funded by the National Key R&D Program of China, grant number 2017YFB0202601.

Acknowledgments

We are extremely thankful to Rabbani et al. and Zhang et al. for making their datasets publicly available, which enabled us to conduct this study. We are also thankful to the local clinician for helping us with the annotation of the datasets.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. GBD 2017 Disease and Injury Incidence and Prevalence Collaborators. Global, regional, and national incidence, prevalence, and years lived with disability for 354 diseases and injuries for 195 countries and territories, 1990–2017: A systematic analysis for the Global Burden of Disease Study 2017. Lancet 2018, 392, 1789–1858. [Google Scholar] [CrossRef]
  2. Hassan, B.; Ahmed, R.; Li, B.; Noor, A.; Hassan, Z.U. A comprehensive study capturing vision loss burden in Pakistan (1990–2025): Findings from the Global Burden of Disease (GBD) 2017 study. PLoS ONE 2019, 14, e0216492. [Google Scholar] [CrossRef] [PubMed]
  3. Harney, F. Diabetic retinopathy. Medicine 2006, 34, 95–98. [Google Scholar] [CrossRef]
  4. Alghadyan, A.A. Diabetic retinopathy—An update. Saudi J. Ophthalmol. 2011, 25, 99–111. [Google Scholar] [CrossRef] [PubMed]
  5. Acharya, U.R.; Vidya, K.S.; Ghista, D.N.; Lim, W.J.E.; Molinari, F.; Sankaranarayanan, M. Computer-aided diagnosis of diabetic subjects by heart rate variability signals using discrete wavelet transform method. Knowl. Based Syst. 2015, 81, 56–64. [Google Scholar] [CrossRef]
  6. Verma, L.; Prakash, G.; Tewari, H.K. Diabetic retinopathy: Time for action. No complacency please! Bull. World Health Organ. 2002, 80, 419. [Google Scholar] [PubMed]
  7. Mingguang, H.; Wanjiku, M.; Susan, L.; Paul, C. Global Efforts to Generate Evidence for Vision 2020. Ophthalmic Epidemiol. 2015, 22, 237–238. [Google Scholar] [Green Version]
  8. Hassan, T.; Akram, M.U.; Hassan, B.; Nasim, A.; Bazaz, S.A. Review of OCT and fundus images for detection of Macular Edema. In Proceedings of the 2015 IEEE International Conference on Imaging Systems and Techniques (IST), Macau, China, 16–18 September 2015; pp. 1–4. [Google Scholar]
  9. Huang, D.; Swanson, E.A.; Lin, C.P.; Schuman, J.S.; Stinson, W.G.; Chang, W.; Puliafito, C.A. Optical coherence tomography. Science 1991, 254, 1178–1181. [Google Scholar] [CrossRef]
  10. Swanson, E.A.; Izatt, J.A.; Hee, M.R.; Huang, D.; Lin, C.P.; Schuman, J.S.; Fujimoto, J.G. In vivo retinal imaging by optical coherence tomography. Opt. Lett. 1993, 18, 1864–1866. [Google Scholar] [CrossRef]
  11. De Carlo, T.E.; Romano, A.; Waheed, N.K.; Duker, J.S. A review of optical coherence tomography angiography (OCTA). Int. J. Retin. Vitr. 2015, 1, 5. [Google Scholar] [CrossRef]
  12. Fercher, F.; Drexler, W.; Hitzenberger, C.K.; Lasser, T. Optical coherence tomography-principles and applications. Rep. Prog. Phys. 2003, 66, 239. [Google Scholar] [CrossRef]
  13. Schmitz-Valckenberg, S.; Holz, F.G.; Bird, A.C.; Spaide, R.F. Fundus autofluorescence imaging: Review and perspectives. Retina 2008, 28, 385–409. [Google Scholar] [CrossRef] [PubMed]
  14. Sepah, Y.J.; Akhtar, A.; Sadiq, M.A.; Hafeez, Y.; Nasir, H.; Perez, B.; Nguyen, Q.D. Fundus autofluorescence imaging: Fundamentals and clinical relevance. Saudi J. Ophthalmol. 2014, 28, 111–116. [Google Scholar] [CrossRef]
  15. Chee, K.L.; Santiago, P.A.; Lingam, G.; Singh, M.S.; Naing, T.; Mangunkusumo, A.E.; Naser, M.N. Application of Ocular Fundus Photography and Angiography. In Ophthalmological Imaging and Applications; CRC Press: Boca Raton, FL, USA, 2014; pp. 135–156. [Google Scholar]
  16. Virgili, G.; Menchini, F.; Dimastrogiovanni, A.F.; Rapizzi, E.; Menchini, U.; Bandello, F.; Chiodini, R.G. Optical coherence tomography versus stereoscopic fundus photography or biomicroscopy for diagnosing diabetic macular edema: A systematic review. Investig. Ophthalmol. Vis. Sci. 2007, 48, 4963–4973. [Google Scholar] [CrossRef] [PubMed]
  17. Browning, J.; McOwen, M.D.; Bowen, R.M., Jr.; Tisha, L.O. Comparison of the clinical diagnosis of diabetic macular edema with diagnosis by optical coherence tomography. Ophthalmology 2004, 111, 712–715. [Google Scholar] [CrossRef] [PubMed]
  18. Strøm, C.; Sander, B.; Larsen, N.; Larsen, M.; Lund-Andersen, H. Diabetic macular edema assessed with optical coherence tomography and stereo fundus photography. Investig. Ophthalmol. Vis. Sci. 2002, 43, 241–245. [Google Scholar]
  19. Reza, A.W.; Eswaran, C.; Hati, S. Automatic tracing of optic disc and exudates from color fundus images using fixed and variable thresholds. J. Med. Syst. 2009, 33, 73–80. [Google Scholar] [CrossRef]
  20. Sreejini, K.S.; Govindan, V.K. Automatic grading of severity of diabetic macular edema using color fundus images. In Proceedings of the 2013 Third International Conference on Advances in Computing and Communications (ICACC), Cochin, India, 29–31 August 2013; pp. 177–180. [Google Scholar]
  21. Walter, T.; Klein, J.C.; Massin, P.; Erginay, A.A. Contribution of image processing to the diagnosis of diabetic retinopathy-detection of exudates in color fundus images of the human retina. IEEE Trans. Med. Imaging 2002, 21, 1236–1243. [Google Scholar] [CrossRef]
  22. Giancardo, L.; Meriaudeau, F.; Karnowski, T.P.; Li, Y.; Garg, S.; Tobin, K.W., Jr.; Chaum, E. Exudate-based diabetic macular edema detection in fundus images using publicly available datasets. Med. Image Anal. 2012, 16, 216–226. [Google Scholar] [CrossRef]
  23. Osareh, A.; Shadgar, B.; Markham, R. A computational-intelligence-based approach for detection of exudates in diabetic retinopathy images. IEEE Trans. Inf. Technol. Biomed. 2009, 13, 535–545. [Google Scholar] [CrossRef]
  24. Deepak, K.S.; Sivaswamy, J. Automatic assessment of macular edema from color retinal images. IEEE Trans. Med. Imaging 2012, 31, 766–776. [Google Scholar] [CrossRef] [PubMed]
  25. Yazid, H.; Arof, H.; Isa, H.M. Automated identification of exudates and optic disc based on inverse surface thresholding. J. Med. Syst. 2012, 36, 1997–2004. [Google Scholar] [CrossRef] [PubMed]
  26. Hassan, B.; Raja, G. Fully Automated Assessment of Macular Edema using Optical Coherence Tomography (OCT) Images. In Proceedings of the 2016 International Conference on Intelligent Systems Engineering (ICISE), Islamabad, Pakistan, 15–17 January 2016; pp. 5–9. [Google Scholar]
  27. Wilkins, G.R.; Houghton, O.M.; Oldenburg, A.L. Automated segmentation of intraretinal cystoid fluid in optical coherence tomography. IEEE Trans. Biomed. Eng. 2012, 59, 1109–1114. [Google Scholar] [CrossRef] [PubMed]
  28. Sugmk, J.; Kiattisin, S.; Leelasantitham, A. Automated classification between age-related macular degeneration and diabetic macular edema in OCT image using image segmentation. In Proceedings of the 7th 2014 Biomedical Engineering International Conference (BMEiCON), Fukuoka, Japan, 26–28 November 2014; pp. 1–4. [Google Scholar]
  29. Hassan, B.; Raja, G.; Hassan, T.; Akram, M.U. Structure tensor based automated detection of macular edema and central serous retinopathy using optical coherence tomography images. J. Opt. Soc. Am. A Opt. Image Sci. Vis. 2016, 33, 455–463. [Google Scholar] [CrossRef] [PubMed]
  30. Chiu, S.J.; Allingham, M.J.; Mettu, P.S.; Cousins, S.W.; Izatt, J.A.; Farsiu, S. Kernel regression based segmentation of optical coherence tomography images with diabetic macular edema. Biomed. Opt. Express 2015, 6, 1172–1194. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  31. Sahar, S.; Ayaz, S.; Akram, M.U.; Basit, I. A Case Study Approach: Iterative Prototyping Model Based Detection of Macular Edema in Retinal OCT Images. In Proceedings of the 27th International Conference on Software Engineering and Knowledge Engineering (SEKE), Pittsburgh, PA, USA, 6–8 July 2015; pp. 266–271. [Google Scholar]
  32. Srinivasan, P.P.; Kim, L.A.; Mettu, P.S.; Cousins, S.W.; Comer, G.M.; Izatt, J.A.; Farsiu, S. Fully automated detection of diabetic macular edema and dry age-related macular degeneration from optical coherence tomography images. Biomed. Opt. Express 2014, 5, 3568–3577. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  33. Lee, C.S.; Tyring, A.J.; Deruyter, N.P.; Wu, Y.; Rokem, A.; Lee, A.Y. Deep-learning based, automated segmentation of macular edema in optical coherence tomography. Biomed. Opt. Express 2017, 8, 3440–3448. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  34. Hassan, T.; Akram, M.U.; Masood, M.F.; Yasin, U. Deep structure tensor graph search framework for automated extraction and characterization of retinal layers and fluid pathology in retinal SD-OCT scans. Comput. Biol. Med. 2018, 105, 112–124. [Google Scholar] [CrossRef] [PubMed]
  35. Kermany, D.S.; Goldbaum, M.; Cai, W.; Valentim, C.C.; Liang, H.; Baxter, S.L.; McKeown, A.; Yang, G.; Wu, X.; Yan, F.; et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell 2018, 172, 1122–1131. [Google Scholar] [CrossRef]
  36. Rasti, R.; Rabbani, H.; Mehridehnavi, A.; Hajizadeh, F. Macular OCT classification using a multi-scale convolutional neural network ensemble. IEEE Trans. Med. Imaging 2017, 37, 1024–1034. [Google Scholar] [CrossRef]
  37. Rabbani, H.; Allingham, M.J.; Mettu, P.S.; Cousins, S.W.; Farsiu, S. Fully automatic segmentation of fluorescein leakage in subjects with diabetic macular edema. Investig. Ophthalmol. Vis. Sci. 2015, 56, 1482–1492. [Google Scholar] [CrossRef] [PubMed]
  38. Mahmudi, T.; Kafieh, R.; Rabbani, H.; Akhlagi, M. Comparison of macular OCTs in right and left eyes of normal people. In Proceedings of the Medical Imaging 2014: Biomedical Applications in Molecular, Structural, and Functional Imaging, San Diego, CA, USA, 17–21 August 2014; Volume 9038, p. 90381W. [Google Scholar]
  39. Alipour, S.H.M.; Rabbani, H.; Akhlaghi, M.R. Diabetic retinopathy grading by digital curvelet transform. Comput. Math. Methods Med. 2012, 2012, 761901. [Google Scholar] [CrossRef]
  40. Esmaeili, M.; Rabbani, H.; Dehnavi, A.M.; Dehghani, A. Automatic detection of exudates and optic disk in retinal images using curvelet transform. IET Image Process. 2012, 6, 1005–1013. [Google Scholar] [CrossRef]
  41. Alipour, S.H.M.; Rabbani, H.; Akhlaghi, M.; Dehnavi, A.M.; Javanmard, S.H. Analysis of foveal avascular zone for grading of diabetic retinopathy severity based on curvelet transform. Graefe’s Arch. Clin. Exp. Ophthalmol. 2012, 250, 1607–1614. [Google Scholar] [CrossRef] [PubMed]
  42. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of the 5th International Conference on Neural Information Processing Systems, Lake Tahoe, CA, USA, 3–6 December 2012. [Google Scholar]
  43. Bengio, Y. Practical recommendations for gradient based training of deep architectures. In Neural Networks: Tricks of the Trade; Springer: Berlin, Germany, 2012; pp. 437–478. [Google Scholar]
  44. Köthe, U. Edge and junction detection with an improved structure tensor. Jt. Pattern Recognit. Symp. 2003, 25–32. [Google Scholar] [CrossRef]
Figure 1. Working principle of retinal imaging modalities. (a) Optical coherence tomography (OCT) scan acquisition principle and (b) the fundus scan acquisition principle.
Figure 2. Appearance of macular edema symptoms in (a) an OCT B-scan and (b) fundus scan.
Figure 3. Proposed methodology for automated recognition and classification of healthy and macular edema (ME) affected eyes. (a) Retinal imaging modality recognition; (b) preprocessing retinal scans; (c) extraction of clinically significant ME pathological symptoms; (d) convolutional neural network (CNN) for feature extraction and (e) retinal diagnosis.
Figure 4. Retinal imaging modality recognition stage. (a) Input retinal scans; (b) transfer learning approach using AlexNet CNN architecture; (c) recognized OCT scan and (d) recognized fundus scan.
Figure 5. Degraded retinal scans. (a) OCT scans and (b) fundus/fluorescein angiography (FA) scans.
Figure 6. Extraction of clinically significant ME pathological symptoms. (a) Denoised retinal scans; (b) structure tensor computation stage; (c) detection of edges using the Canny edge detection method; (d) processing steps for OCT scans and (e) processing steps for fundus scans.
Figure 7. CNN for feature extraction. (a) Overlaid retinal scans with ME pathological symptoms; (b) CNN model for feature extraction from OCT retinal scans; (c) CNN model for feature extraction from retinal fundus scans and (d) 16-D feature vector containing eight OCT ( f 1 , f 2 , , f 8 ) and eight fundus features ( f 9 , f 10 , , f 16 ).
Figure 8. Architecture of an ANN model.
Figure 9. Block diagram of a hybrid classifier training and classification.
Figure 10. Segmented hard exudates region in the [37] dataset. (a) Original FA scans; (b) overlaid hard exudates region on original FA scans; (c) extracted hard exudates region through the proposed system; (d) Grader 1 markings of hard exudates region; (e) Grader 2 markings of hard exudates region; (f) Grader 3 markings of hard exudates region; (g) hard exudates extraction against expert markings: red, yellow and green represent the expert markings of Graders 1, 2 and 3, respectively, cyan represents the hard exudates region extracted through the proposed system and magenta represents the overlapped region of hard exudates.
Figure 11. Extracted blood vessels through the proposed system in the [37,41] datasets. (a) Original fundus/FA scans; (b) extracted blood vessels through the proposed system and (c) overlaid blood vessels on original fundus/FA scans.
Figure 12. Extracted retinal fluid regions in the [35,36] datasets. (a) Original OCT scans; (b) overlaid retinal fluid regions on original OCT scans; (c) extracted retinal fluid regions through the proposed system; (d) local clinician markings of retinal fluid regions and (e) retinal fluid extraction against expert markings—cyan represents the expert markings of the local clinician, yellow represents the retinal fluid regions extracted through the proposed system and magenta represents the overlapped region of retinal fluids.
Figure 13. Healthy and ME OCT scans in the Rabbani dataset [36], which are processed by the proposed system. (a) Original healthy OCT B-scans; (b) original ME OCT B-scans; (c) classified as healthy by the proposed system and (d) classified as ME by the proposed system.
Figure 14. Healthy and ME OCT scans in the Zhang dataset [35], which are processed by the proposed system. (a) Original healthy OCT B-scans; (b) original ME OCT B-scans; (c) classified as healthy by the proposed system and (d) classified as ME by the proposed system.
Figure 15. Healthy and ME fundus scans in the Rabbani dataset [39,40], which are processed by the proposed system. (a) Original healthy fundus scans; (b) original ME fundus scans; (c) classified as healthy by the proposed system and (d) classified as ME by the proposed system.
Figure 16. Training performance of the AlexNet model. The blue line shows the training accuracy, the red line shows the training loss and the dashed black lines show the validation accuracy and validation loss.
Figure 17. Training performance of the proposed CNN model. The blue line shows the training accuracy, the red line shows the training loss and the dashed black lines show the validation accuracy and validation loss.
Table 1. Details of the dataset used for training and testing purposes.
| Dataset a | OCT Scans b | Fundus/FA Scans b | Healthy b | ME b | OCT Scan Dimension(s) | Fundus/FA Scan Dimension(s) |
|---|---|---|---|---|---|---|
| 1 [36] | 2764 | - | 1628 | 1136 | 496 × 512 | - |
| 2 [37] | - | 24 | - | 24 | - | 512 × 612, 768 × 868 |
| 3 [38] | 12,800 | 100 | 12,900 | - | 512 × 650 | 1612 × 1536 |
| 4 [39] | - | 120 | 60 | 60 | - | 720 × 576 |
| 5 [40] | - | 35 | - | 35 | - | 720 × 576 |
| 6 [41] | - | 60 | 25 | 35 | - | 720 × 576 |
| 7 [35] | 62,988 | - | 51,390 | 11,598 | 512 × 496, 512 × 512, 768 × 496, 1024 × 496, 1536 × 496 | - |
| Total | 78,552 | 339 | 66,003 | 12,888 | - | - |

| Split | Training: OCT | Training: Fundus/FA | Validation: OCT | Validation: Fundus/FA |
|---|---|---|---|---|
| Total Scans | 73,552 | 239 | 5000 | 100 |
| Healthy | 63,318 | 135 | 2500 | 50 |
| ME | 10,234 | 104 | 2500 | 50 |
a We only considered OCT and fundus imaging modalities consisting of healthy and ME retinal pathologies from these datasets; b the count shows the total number of scans in these datasets (including all the B-scans in OCT volumes).
Table 2. Function wise details of the AlexNet architecture.
| Function | Layer(s) | Description |
|---|---|---|
| Input Image | 1 | 227 × 227 × 3 images |
| Convolution | 2 | 96 11 × 11 × 3 convolutions |
| | 6 | 256 5 × 5 × 48 convolutions |
| | 10 | 384 3 × 3 × 256 convolutions |
| | 12 | 384 3 × 3 × 192 convolutions |
| | 14 | 256 3 × 3 × 192 convolutions |
| ReLU | 3, 7, 11, 13, 15, 18, 21 | Assigns ‘0’ to non-positive values |
| Max Pooling | 5, 9, 16 | 3 × 3 max pooling |
| Dropout | 19, 22 | 50% dropout |
| Normalization | 4, 8 | Five channels per element |
| Fully Connected | 17, 20, 23 | 4096 fully connected layer |
| Softmax | 24 | Softmax activation function |
| Output | 25 | Two classes (OCT, Fundus) |
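Table 2 lists the AlexNet layers fine-tuned for two-class (OCT versus fundus) modality recognition. The study was implemented in MATLAB R2018a (Table 10); the PyTorch sketch below is only an approximate illustration of the same transfer-learning idea, in which the final 1000-way layer is replaced by a two-way classifier. The optimizer settings and the freezing of the convolutional layers are assumptions, not details taken from the paper.

```python
# Sketch of AlexNet transfer learning for OCT-vs-fundus modality recognition (illustrative only).
# Assumptions: PyTorch/torchvision stand in for the MATLAB implementation; the learning rate
# and the freezing of the convolutional layers are not taken from the paper.
import torch
import torch.nn as nn
from torchvision import models

model = models.alexnet(pretrained=True)          # ImageNet-pretrained AlexNet
for param in model.features.parameters():        # keep the convolutional filters fixed
    param.requires_grad = False

model.classifier[6] = nn.Linear(4096, 2)         # replace the 1000-way output with OCT/fundus

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.classifier.parameters(), lr=1e-3, momentum=0.9)

dummy_batch = torch.randn(4, 3, 227, 227)        # 227 x 227 x 3 inputs, as in Table 2
logits = model(dummy_batch)                      # shape (4, 2): per-class scores
```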
Table 3. Proposed structure tensor influenced CNN architecture.
| Layer | Function | Description |
|---|---|---|
| 1 | Input Layer | 227 × 227 × 3 images with ‘zerocenter’ normalization |
| 2 | Convolution | 8 9 × 9 × 3 convolutions with stride [1,1] and padding ‘same’ |
| 3 | Batch Normalization | Batch normalization with eight channels |
| 4 | ReLU | Rectified Linear Units |
| 5 | Max Pooling | 2 × 2 max pooling with stride [2,2] and padding [0,0,0,0] |
| 6 | Convolution | 16 9 × 9 × 8 convolutions with stride [1,1] and padding ‘same’ |
| 7 | Batch Normalization | Batch normalization with 16 channels |
| 8 | ReLU | Rectified Linear Units |
| 9 | Dropout | 50% dropout |
| 10 | Max Pooling | 2 × 2 max pooling with stride [2,2] and padding [0,0,0,0] |
| 11 | Convolution | 32 9 × 9 × 16 convolutions with stride [1,1] and padding ‘same’ |
| 12 | Batch Normalization | Batch normalization with 32 channels |
| 13 | ReLU | Rectified Linear Units |
| 14 | Fully Connected | Fully connected layer giving the significant eight features |
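Table 3 translates almost line by line into code. The PyTorch sketch below mirrors the listed layers and is an illustrative reimplementation rather than the authors' MATLAB model; the last lines also show how the eight OCT and eight fundus features can be concatenated into the 16-D descriptor of Figure 7.

```python
# Illustrative PyTorch port of the CNN in Table 3 (not the authors' MATLAB implementation).
# Assumption: 227 x 227 x 3 inputs, as stated in the table.
import torch
import torch.nn as nn

class StructureTensorInfluencedCNN(nn.Module):
    def __init__(self, out_features=8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=9, stride=1, padding=4),    # 8 x (9 x 9 x 3), 'same'
            nn.BatchNorm2d(8),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),                  # 227 -> 113
            nn.Conv2d(8, 16, kernel_size=9, stride=1, padding=4),   # 16 x (9 x 9 x 8), 'same'
            nn.BatchNorm2d(16),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.MaxPool2d(kernel_size=2, stride=2),                  # 113 -> 56
            nn.Conv2d(16, 32, kernel_size=9, stride=1, padding=4),  # 32 x (9 x 9 x 16), 'same'
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
        )
        self.fc = nn.Linear(32 * 56 * 56, out_features)             # eight significant features

    def forward(self, x):
        return self.fc(torch.flatten(self.features(x), start_dim=1))

oct_branch, fundus_branch = StructureTensorInfluencedCNN(), StructureTensorInfluencedCNN()
oct_feats = oct_branch(torch.randn(1, 3, 227, 227))                 # f1 ... f8
fundus_feats = fundus_branch(torch.randn(1, 3, 227, 227))           # f9 ... f16
fused_descriptor = torch.cat([oct_feats, fundus_feats], dim=1)      # 16-D vector (Figure 7)
```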
Table 4. Selected fundus and OCT features from healthy and ME subjects.
| | Feature | Healthy: Case 1 | Healthy: Case 2 | Healthy: Case 3 | Healthy: Mean a | ME: Case 1 | ME: Case 2 | ME: Case 3 | ME: Mean a |
|---|---|---|---|---|---|---|---|---|---|
| OCT | F1 | 2.28 | 1.53 | 1.69 | 1.51 | 1.97 | −0.79 | −2.62 | 1.17 |
| | F2 | 2.13 | 4.35 | 1.58 | 2.18 | 0.58 | 0.63 | 1.29 | 0.86 |
| | F3 | −3.13 | −1.92 | −6.4 | −3.57 | −0.85 | 2.73 | 1.73 | −1.56 |
| | F4 | 4.83 | 0.84 | 1.48 | 2.87 | −0.29 | 1.3 | −3.38 | 0.19 |
| | F5 | 0.59 | 0.28 | 0.44 | 0.29 | 1.9 | 1.35 | 0.76 | 0.68 |
| | F6 | 4.82 | 1.87 | 7.59 | 4.46 | 2.73 | 3.29 | 2.59 | 2.93 |
| | F7 | 1.03 | 0.37 | 2 | 0.71 | −0.07 | 0.64 | 0.19 | 0.26 |
| | F8 | −0.65 | 0.94 | 0.58 | −0.69 | 1.61 | 1.34 | 1.4 | 1.38 |
| Fundus | F9 | −1.85 | 0.61 | −1.62 | −0.95 | 0.3 | 2.3 | 0.1 | 0.27 |
| | F10 | 2.67 | 1.4 | 2.51 | 2.93 | 1.28 | 1.42 | 1.17 | 1.09 |
| | F11 | −1.77 | −7.83 | −1.7 | −1.14 | −2.54 | −2.11 | −3.06 | −2.71 |
| | F12 | 1.29 | 0.18 | 0.01 | 0.26 | 1.44 | 1.13 | 1.46 | 1.43 |
| | F13 | 1.61 | −0.09 | 0.66 | 0.25 | 0.02 | −2.49 | 0.81 | −0.11 |
| | F14 | 2.37 | 1.65 | 2.44 | 0.81 | 2.75 | 4.61 | 2.24 | 3.12 |
| | F15 | 1.7 | 0.14 | 0.68 | 0.72 | −1.04 | 1.46 | −0.26 | 0.07 |
| | F16 | 1.26 | 0.82 | 1.25 | 1.17 | 0.8 | 0.62 | 0.98 | 0.57 |
a Mean value was computed using all the scans in a validation dataset.
Table 5. Classifiers’ K-fold cross validation performance.
| K | ANN (Max Accuracy) | SVM (Max Accuracy) | NB (Max Accuracy) |
|---|---|---|---|
| 2 | 0.816 | 0.809 | 0.794 |
| 3 | 0.893 | 0.841 | 0.829 |
| 4 | 0.914 | 0.874 | 0.864 |
| 6 | 0.948 | 0.907 | 0.917 |
| 8 | 0.972 | 0.925 | 0.942 |
| 10 | 0.991 | 0.966 | 0.980 |
| 11 | 0.985 | 0.948 | 0.961 |
| 12 | 0.973 | 0.929 | 0.944 |
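Table 5 reports the maximum accuracy of each base classifier under K-fold cross validation. A sketch of how such an experiment can be run with scikit-learn is given below; the feature matrix and classifier hyperparameters are placeholders, not the settings behind the reported numbers.

```python
# Sketch of the K-fold cross-validation protocol behind Table 5 (illustrative only).
# Assumptions: random placeholder data and illustrative classifier hyperparameters.
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X = np.random.randn(500, 16)                  # fused 16-D descriptors (placeholder)
y = np.random.randint(0, 2, 500)              # healthy (0) vs. ME (1) labels (placeholder)

classifiers = {
    "ANN": MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000),
    "SVM": SVC(kernel="rbf"),
    "NB": GaussianNB(),
}

for k in (2, 3, 4, 6, 8, 10, 11, 12):         # the K values evaluated in Table 5
    folds = KFold(n_splits=k, shuffle=True, random_state=0)
    for name, clf in classifiers.items():
        scores = cross_val_score(clf, X, y, cv=folds, scoring="accuracy")
        print(f"K = {k:2d}  {name}: max fold accuracy = {scores.max():.3f}")
```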
Table 6. Mean dice coefficient for hard exudates segmentation against the expert markings [37].
| Leakage Pattern | Scans | Against Grader 1 | Against Grader 2 | Against Grader 3 |
|---|---|---|---|---|
| Diffuse | 1 | 0.7529 | 0.8931 | 0.8372 |
| | 2 | 0.3573 | 0.5973 | 0.6698 |
| | 3 | 0.6838 | 0.6727 | 0.6744 |
| | 4 | 0.6718 | 0.7391 | 0.7739 |
| Focal | 1 | 0.5871 | 0.6397 | 0.6928 |
| | 2 | 0.2339 | 0.7288 | 0.694 |
| | 3 | 0.7169 | 0.8275 | 0.8631 |
| | 4 | 0.3887 | 0.6349 | 0.634 |
| Mixed | 1 | 0.6035 | 0.882 | 0.875 |
| | 2 | 0.5941 | 0.6691 | 0.7329 |
| | 3 | 0.6551 | 0.7938 | 0.8339 |
| | 4 | 0.5582 | 0.7661 | 0.7851 |
| Mean ± STD (All Dataset) | | 0.5726 ± 0.16 | 0.7669 ± 0.10 | 0.7813 ± 0.08 |
| Mean ± STD (Overall) | | 0.7069 ± 0.11 | | |
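Table 6, together with Tables 7 and 8 below, evaluates the extracted pathological regions with the Dice similarity coefficient against expert annotations. For reference, a short sketch of the Dice computation between two binary masks follows; it is illustrative code, not the evaluation script used in this work.

```python
# Dice similarity coefficient between a segmented mask and an expert-marked mask.
import numpy as np

def dice_coefficient(pred, truth, eps=1e-8):
    """Dice = 2|A ∩ B| / (|A| + |B|) for binary masks A (prediction) and B (ground truth)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum() + eps)

# Toy example: two partially overlapping 4 x 4 masks give a Dice score of 0.5.
pred = np.array([[0, 1, 1, 0]] * 4)
truth = np.array([[0, 0, 1, 1]] * 4)
print(dice_coefficient(pred, truth))
```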
Table 7. Mean dice coefficient for blood vessels segmentation against a local clinician’s annotations.
| Scans | Rabbani Dataset 1 [41]: Healthy | Rabbani Dataset 1 [41]: ME | Rabbani Dataset 2 [37]: ME |
|---|---|---|---|
| 1 | 0.7817 | 0.7871 | 0.7914 |
| 2 | 0.8808 | 0.8315 | 0.8135 |
| 3 | 0.8747 | 0.8197 | 0.7861 |
| 4 | 0.7983 | 0.7890 | 0.8047 |
| 5 | 0.8669 | 0.8444 | 0.7971 |
| 6 | 0.8518 | 0.8071 | 0.8168 |
| 7 | 0.8811 | 0.8004 | 0.7896 |
| 8 | 0.8190 | 0.7927 | 0.8238 |
| 9 | 0.8468 | 0.8058 | 0.8275 |
| 10 | 0.8617 | 0.7871 | 0.7914 |
| Mean ± STD (All Dataset) | 0.8589 ± 0.04 | 0.8185 ± 0.03 | 0.7839 ± 0.02 |
| Mean ± STD (Overall) | 0.8203 ± 0.03 | | |
Table 8. Mean dice coefficient for retinal fluids extraction against a local clinician’s annotations.
| Scans | Rabbani Dataset [36] | Zhang Dataset [35] |
|---|---|---|
| 1 | 0.9194 | 0.9152 |
| 2 | 0.8689 | 0.8560 |
| 3 | 0.9082 | 0.9351 |
| 4 | 0.8551 | 0.9145 |
| 5 | 0.8726 | 0.9243 |
| 6 | 0.9322 | 0.8796 |
| 7 | 0.9238 | 0.8986 |
| 8 | 0.8887 | 0.8731 |
| 9 | 0.9162 | 0.8766 |
| 10 | 0.8724 | 0.9259 |
| Mean ± STD (All Dataset) | 0.9026 ± 0.03 | 0.9012 ± 0.04 |
| Mean ± STD (Overall) | 0.9019 ± 0.04 | |
Table 9. Measured outcomes of the proposed ensemble hybrid classifier in comparison to other state-of-the-art techniques.
| Methods | Validation Dataset | CC | TP | TN | FP | FN | SE | SP | PPV | NPV | A |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Proposed (ANN) | 5000 R, Z (OCT); 100 R (Fundus) | 4653 (OCT); 94 (Fundus) | 2457 | 2291 | 259 | 93 | 0.96 | 0.90 | 0.90 | 0.96 | 0.93 |
| Proposed (SVM) | 5000 R, Z (OCT); 100 R (Fundus) | 4648 (OCT); 93 (Fundus) | 2407 | 2322 | 228 | 143 | 0.94 | 0.91 | 0.91 | 0.94 | 0.92 |
| Proposed (NB) | 5000 R, Z (OCT); 100 R (Fundus) | 4559 (OCT); 92 (Fundus) | 2374 | 2289 | 261 | 176 | 0.93 | 0.90 | 0.90 | 0.93 | 0.91 |
| Proposed (Hybrid) | 5000 R, Z (OCT); 100 R (Fundus) | 4716 (OCT); 95 (Fundus) | 2473 | 2338 | 212 | 77 | 0.97 | 0.92 | 0.92 | 0.97 | 0.94 |
| [19] | 20 Ψ, ψ | - | - | - | - | - | 0.943 * | 1 * | 0.92 * | - | - |
| | | | | | | | 0.967 ^ | 1 ^ | 0.949 ^ | - | - |
| [20] | 100 ϕ | 93 | 33 | 60 | 0 | 7 | 0.825 | 1 | - | - | 0.93 |
| [21] | 30 | - | - | 13 | 2 | - | 0.928 | - | 0.924 | - | - |
| [23] | 150 ζ | - | 72 | 71 | - | - | 0.96 | 0.946 | - | - | - |
| [24] | 400 ϕ | - | - | - | - | - | 0.95 | 0.9 | - | - | - |
| | 104 ξ | - | - | - | - | - | 1 | 0.74 | - | - | - |
| [25] | 15 ψ | - | - | - | - | - | 0.978 | 0.99 | 0.833 | - | - |
| | 15 C | - | - | - | - | - | 0.907 | 0.994 | 0.74 | - | - |
| [26] | 30 D | 28 | 15 | 13 | - | - | 1 | 0.933 | - | - | - |
| [27] | 19 C | - | - | - | - | - | 0.91 | 0.96 | - | - | - |
| [28] | 16 | - | - | - | - | - | - | - | - | - | 0.875 |
| [29] | 90 B | 88 | 60 | 28 | - | - | 1 | 0.933 | - | - | 0.977 |
| [31] | 50 B | 42 | 28 | 14 | - | - | 0.93 | 0.8 | - | - | 0.84 |
| [32] | 45 D | 43 | 30 | 13 | - | - | 1 | 0.866 | - | - | - |
| [34] | 42,281 D; 4260 B | - | - | - | - | - | 0.991 | 0.986 | - | - | 0.985 |
| [35] | 500 Z | 483 | 237 | 246 | - | - | 0.968 | 0.996 | - | - | 0.982 |
Ψ Digital Retinal Images for Vessel Extraction (DRIVE) Dataset, ψ Structured Analysis of the Retina (STARE) Dataset, ϕ Methods to Evaluate Segmentation and Indexing Techniques in the field of Retinal Ophthalmology (MESSIDOR) Dataset, ζ Bristol Eye Hospital Dataset, ξ Hamilton Eye Institute Macular Edema (HEI-DMED) Dataset, C Custom Dataset used by the authors, D DUKE Dataset, B Biomedical Image and Signal Analysis (BIOMISA) Dataset, Z Zhang Dataset, R Rabbani Dataset, * Results with fixed threshold, ^ Results with variable threshold; ANN = Artificial Neural Network, SVM = Support Vector Machine, NB = Naïve Bayes Classifier, CC = Correctly Classified, TP = True Positive, TN = True Negative, FP = False Positive, FN = False Negative, SE = Sensitivity, SP = Specificity, PPV = Positive Predicted Values, NPV = Negative Predicted Values and A = Diagnostic Accuracy.
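The SE, SP, PPV, NPV and A columns of Table 9 follow directly from the TP, TN, FP and FN counts. The sketch below reproduces the proposed hybrid classifier's row as a worked example.

```python
# Deriving the Table 9 metrics from raw confusion-matrix counts,
# using the proposed hybrid classifier's row as a worked example.
def diagnostic_metrics(tp, tn, fp, fn):
    return {
        "SE": tp / (tp + fn),                   # sensitivity (true positive rate)
        "SP": tn / (tn + fp),                   # specificity (true negative rate)
        "PPV": tp / (tp + fp),                  # positive predictive value
        "NPV": tn / (tn + fn),                  # negative predictive value
        "A": (tp + tn) / (tp + tn + fp + fn),   # diagnostic accuracy
    }

# Hybrid classifier counts from Table 9: TP = 2473, TN = 2338, FP = 212, FN = 77.
print(diagnostic_metrics(2473, 2338, 212, 77))
# -> SE ≈ 0.97, SP ≈ 0.92, PPV ≈ 0.92, NPV ≈ 0.97, A ≈ 0.94
```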
Table 10. Details of the system and software used for conducting this research.
| Item | Detail |
|---|---|
| System (Made) | DELL |
| Processor | i7-4500U @ 2.4GHz |
| RAM | 8GB DDR2 |
| Graphics | AMD HD 8670M |
| Software | Windows 10 Pro 64-bit, MATLAB R2018a |

| Classifier | Average Time for Single Classification (seconds): OCT | Average Time for Single Classification (seconds): Fundus |
|---|---|---|
| ANN | 4.6 | 3.3 |
| SVM | 5.7 | 4.2 |
| NB | 3.2 | 1.8 |
| Hybrid | 6.8 | 5.1 |
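Table 10 lists the average time for a single classification on the stated hardware. Such per-scan timings can be obtained by averaging wall-clock measurements over repeated predictions, as in the sketch below; the fitted classifier and the feature vector are placeholders standing in for the full OCT/fundus pipeline.

```python
# Sketch: averaging wall-clock time of single-scan classifications (cf. Table 10).
# Assumptions: a simple fitted classifier and a random 16-D descriptor stand in for
# the full pipeline that was timed in the paper.
import time
import numpy as np
from sklearn.naive_bayes import GaussianNB

clf = GaussianNB().fit(np.random.randn(100, 16), np.random.randint(0, 2, 100))
descriptor = np.random.randn(1, 16)

runs = 50
start = time.perf_counter()
for _ in range(runs):
    clf.predict(descriptor)
elapsed = time.perf_counter() - start
print(f"average time per classification: {elapsed / runs:.4f} s")
```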
