Automated Detection of Corneal Ulcer Using Combination Image Processing and Deep Learning

Corneal ulcers are among the most common eye diseases. They arise from various infections, such as bacterial, viral, or parasitic infections, and may lead to ocular morbidity and visual disability. Early detection can therefore reduce the probability of visual impairment. One of the most common techniques for corneal ulcer screening is slit-lamp imaging. This paper proposes two highly accurate automated systems to localize the corneal ulcer region: image processing techniques with the Hough transform, and a deep learning approach. The two methods are validated and tested on the publicly available SUSTech-SYSU database, and their accuracy is evaluated and compared. Both systems achieve an accuracy of more than 90%. The deep learning approach is more accurate than the traditional image processing techniques, reaching 98.9% accuracy and a Dice similarity of 99.3%. However, the first method does not require training a model or optimizing its parameters. Both approaches can perform well in the medical field. Moreover, the first method has an advantage over the deep learning model, because the latter needs a large training dataset to yield reliable software for clinics. Both proposed methods help physicians assess the level of corneal ulcers and improve treatment efficiency.


Introduction
A corneal ulcer is a disease of the cornea; it results from infection or injury and leads to ocular morbidity [1,2]. Early identification and differentiation of the various ulcer conditions decreases the likelihood of vision impairment. Slit-lamp imaging techniques used in conventional clinical procedures can be tedious, costly, and time-consuming. Several issues make it challenging to segment corneal ulcers accurately: significant differences in the pathological morphologies of point-flaky and flaky corneal ulcers, hazy borders, noise interference, and a dearth of reliable ground-truth slit-lamp images. Various segmentation procedures are needed to recognize and quantify corneal ulcers in ocular staining images. Because point-flaky mixed corneal ulcers and flaky corneal ulcers vary in size and shape, they are difficult to segment in a slit-lamp image. The lack of high-quality datasets containing both corneal ulcer images and their ground-truth segmentations has hampered the development of such systems, particularly supervised learning-based segmentation algorithms [3,4]. Corneal segmentation is the first step in diagnosing and assessing ocular surface damage. Therefore, extracting information from fluorescein images is a

Materials and Methods
This paper proposes two methods for the automatic segmentation of corneal ulcers. The first is based on image processing techniques, and the second on semantic segmentation. The dataset used is the publicly available SUSTech-SYSU database [13][14][15]. It consists of 712 fluorescein-stained images of the ocular surface, acquired from patients with corneal ulcers of different severity levels. Of these, 354 images are labeled with the localized corneal ulcer region. The labeled images are used to evaluate both methods and to build the deep learning model in the semantic segmentation procedure. The following sections describe the proposed methods.

Image Processing with Hough Transform
The first method combines image processing techniques with the Hough transform to segment the corneal ulcer region. The designed method is shown in Figure 1. The proposed corneal ulcer segmentation system is fully automated, and segmenting the corneal ulcer regions from the whole RGB eye image involves several stages. First, the image undergoes a preprocessing stage that excludes most unwanted details, particularly the specular reflection region. This is done by taking the blue channel of the image, squaring its pixel values, and binarizing the output. Next, the morphological closing operation is applied and the complement of the result is calculated, as illustrated in Figure 2 for one image from the corneal ulcer dataset.
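The first preprocessing step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the binarization threshold of 0.5, the 3 × 3 structuring element, and the RGB channel order are all assumptions, since the paper does not state them.

```python
import numpy as np


def binary_dilate(mask: np.ndarray, radius: int = 1) -> np.ndarray:
    """Dilate a boolean mask with a (2*radius+1)-square structuring element."""
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask, dtype=bool)
    h, w = mask.shape
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def binary_erode(mask: np.ndarray, radius: int = 1) -> np.ndarray:
    """Erode a boolean mask; pixels outside the image are treated as foreground."""
    padded = np.pad(mask, radius, mode="constant", constant_values=True)
    out = np.ones_like(mask, dtype=bool)
    h, w = mask.shape
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out &= padded[dy:dy + h, dx:dx + w]
    return out


def specular_exclusion_mask(rgb: np.ndarray, threshold: float = 0.5,
                            radius: int = 1) -> np.ndarray:
    """Blue channel -> square -> binarize -> closing -> complement.

    `threshold` and `radius` are hypothetical values chosen for illustration.
    """
    blue = rgb[..., 2].astype(np.float64) / 255.0  # assumes RGB channel order
    bright = blue ** 2 > threshold                 # squaring emphasizes highlights
    closed = binary_erode(binary_dilate(bright, radius), radius)
    return ~closed                                 # True where pixels are kept
```

The returned mask is False over the bright specular region (with small holes filled by the closing) and True elsewhere, so multiplying it with another channel suppresses the reflection.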
The binary image shown in Figure 2b was then multiplied by the green channel of the original color image after smoothing with a Gaussian filter, which gives the output shown in Figure 3a. The pixel values are then squared and binarized to give the image shown in Figure 3b. Next, an ellipse mask is designed whose semi-minor axis, semi-major axis, and centroid coordinates match those of the binary image shown in Figure 3b. The mask shown in Figure 3a is used to exclude most of the unwanted details by multiplying it with the binary image shown in Figure 2b, which gives the image shown in Figure 4b. The final step of the preprocessing stage is a thinning operation on the image shown in Figure 4b, which gives the image shown in Figure 5.
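One way to build an ellipse mask that follows a binary image, as described above, is sketched here. The paper does not say how the centroid and semi-axes are estimated; this sketch assumes the centroid is the mean foreground coordinate and the semi-axes are half the foreground extent along each image axis. The function name `ellipse_mask_like` is hypothetical.

```python
import numpy as np


def ellipse_mask_like(binary: np.ndarray) -> np.ndarray:
    """Filled axis-aligned ellipse matching the foreground of `binary`."""
    ys, xs = np.nonzero(binary)
    if ys.size == 0:
        return np.zeros(binary.shape, dtype=bool)
    cy, cx = ys.mean(), xs.mean()
    # Assumption: semi-axes are half the foreground extent along each axis.
    semi_y = max((ys.max() - ys.min()) / 2.0, 0.5)
    semi_x = max((xs.max() - xs.min()) / 2.0, 0.5)
    yy, xx = np.mgrid[:binary.shape[0], :binary.shape[1]]
    return ((yy - cy) / semi_y) ** 2 + ((xx - cx) / semi_x) ** 2 <= 1.0
```

Multiplying a binary image by this mask (for boolean arrays, `binary & ellipse_mask_like(binary)`) keeps only the details that fall inside the ellipse.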
In general, the eye contour extraction shown in Figure 5 is insufficiently accurate because of the many details in the eye image. To delineate the eye border better, we performed a second stage: eye border recognition using a suitable mathematical model of the eye border together with a suitable recognition algorithm. The Hough transform was used as the parametric shape recognition algorithm, and the parametric eye border shape was generated with a closed-form formula introduced by Johan Gielis, namely the Superformula [16]. It models curves called Gielis curves, described in polar coordinates by r(φ), where r is the radial distance to the origin, φ is the polar angle, and the rational number m sets the rotational symmetry. The exponents n1, n2, and n3, together with the m parameter, allow a greater degree of freedom and enable the Superformula to represent many useful shapes. The parameters chosen to mimic the eye border are 1, 1, 1, and 2 for n1, n2, n3, and m, respectively, which gives the shape shown in Figure 6a. To determine the iris region, given that the cornea lies directly in front of the iris and pupil, a disk is designed with diameter and centroid equal to the semi-minor axis and center of the recognized eye shape, respectively, as shown in Figure 5b. Applying this concept to the eye border and cornea region of the adopted corneal ulcer image sample gives the outputs shown in Figures 6 and 7, respectively.
Next, the ulcer region of interest is separated by multiplying the mask shown in Figure 2b with the image shown in Figure 8a to get the image shown in Figure 8b.
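The eye-border model described above can be sampled numerically. The sketch below uses the commonly cited form of the Gielis superformula, $r(\varphi) = \left( \left|\cos(m\varphi/4)/a\right|^{n_2} + \left|\sin(m\varphi/4)/b\right|^{n_3} \right)^{-1/n_1}$. The scale parameters a and b are not mentioned in the text and are assumed to be 1 here; the helper `eye_border_points` is hypothetical.

```python
import math


def superformula(phi: float, m: float = 2.0, n1: float = 1.0,
                 n2: float = 1.0, n3: float = 1.0,
                 a: float = 1.0, b: float = 1.0) -> float:
    """Radius r(phi) of a Gielis curve; defaults follow the paper's eye-border choice."""
    t = m * phi / 4.0
    term = abs(math.cos(t) / a) ** n2 + abs(math.sin(t) / b) ** n3
    return term ** (-1.0 / n1)


def eye_border_points(num: int = 360, scale: float = 1.0):
    """Sample the closed curve as (x, y) points centered on the origin."""
    points = []
    for k in range(num):
        phi = 2.0 * math.pi * k / num
        r = scale * superformula(phi)
        points.append((r * math.cos(phi), r * math.sin(phi)))
    return points
```

With n1 = n2 = n3 = 1 and m = 2, the radius equals 1 along the horizontal axis and about 0.707 along the vertical axis, which gives a curve that is wider than it is tall. A Hough-style search would then vary the scale and center of this template to find the best match with the thinned contour.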
The pixel values of the green channel of the image shown in Figure 9b are squared and binarized, yielding the image shown in Figure 10a. The resulting mask segments are then tested. If a segment is connected to the eye border and its semi-major to semi-minor axis ratio exceeds a certain threshold, it is considered an accumulation of fluorescein stain at the eyelids and is excluded from the final ulcer regions, as shown in Figure 10b. Finally, the original image is masked with the remaining mask segments, giving the result shown in Figure 11.
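The eyelid-stain test described above can be sketched as follows. The paper does not give the threshold or say how the axis ratio is measured; this sketch assumes the ratio is estimated from the second moments of the segment's pixel coordinates, and the default `ratio_threshold=3.0` is a hypothetical value.

```python
import numpy as np


def axis_ratio(segment: np.ndarray) -> float:
    """Semi-major to semi-minor axis ratio estimated from coordinate covariance."""
    ys, xs = np.nonzero(segment)
    if ys.size < 2:
        return 1.0
    cov = np.cov(np.stack([xs, ys]).astype(np.float64))
    eigvals = np.linalg.eigvalsh(cov)  # returned in ascending order
    minor = max(eigvals[0], 1e-12)     # guard against degenerate line segments
    return float(np.sqrt(eigvals[1] / minor))


def is_eyelid_stain(segment: np.ndarray, eye_border: np.ndarray,
                    ratio_threshold: float = 3.0) -> bool:
    """True if the segment touches the eye border and is strongly elongated."""
    touches_border = bool(np.any(segment & eye_border))
    return touches_border and axis_ratio(segment) > ratio_threshold
```

Segments for which `is_eyelid_stain` returns True would be dropped before the original image is masked with the remaining segments.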

Semantic Segmentation
The second proposed method is semantic segmentation. Figure 12 shows the steps for automated segmentation using a deep learning model. As shown in Figure 12, the system splits the dataset (images and their labels) into training and test partitions. The pre-trained convolutional network used in this paper is ResNet-18 [15,17]. The pre-trained CNN model was trained on the training partition and evaluated on the test data.
Semantic segmentation divides image pixels into one or more semantically interpretable classes rather than real-world objects. Region proposal and annotation is the process of categorizing pixels into distinct groups using a CNN. Candidate object patches (COMPs), small groups of pixels that most likely belong to the same object, serve as region proposals.
The semantic segmentation procedure starts with an encoder network, typically a pre-trained network such as a ResNet, followed by a decoder network. The network used in this paper is ResNet-18, an 18-layer member of the ResNet family, which won the 2015 ImageNet competition (ILSVRC 2015). ResNets are well known for their depth and use of residual blocks [18]. These blocks ease the training of deep networks by introducing identity skip connections, which allow a layer's input to be carried forward and added to its output [19].
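The identity skip connection mentioned above can be illustrated with a deliberately simplified residual block. This is only a sketch: real ResNet-18 blocks use convolutions and batch normalization, whereas dense matrices `w1` and `w2` stand in for them here, and the function names are hypothetical.

```python
import numpy as np


def relu(x: np.ndarray) -> np.ndarray:
    return np.maximum(x, 0.0)


def residual_block(x: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """Compute relu(F(x) + x), where F is a small two-layer transform."""
    fx = relu(x @ w1) @ w2  # the learned residual F(x)
    return relu(fx + x)     # identity skip connection adds the input back
```

When the learned residual is zero, the block simply passes `relu(x)` through, which is why stacking many such blocks does not make the input signal harder to propagate.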
To create a segmentation map, the encoder may be a convolutional neural network and the decoder may be based on deconvolutional (transposed convolution) layers [20,21]. Figure 13 describes the semantic segmentation procedure, which is based mainly on the deep learning approach [22]. As the figure illustrates, the input image passes through a trained deep learning model that outputs the localized ulcer region.
The pre-trained ResNet-18 was used, and the data were divided into 70% for training and 30% for testing. The images were resized to 224 × 224 × 3 to match the input requirements of the first layer of ResNet-18. The model was trained in MATLAB® on a single CPU. The hyperparameters were the Adam optimizer with an initial learning rate of 0.0001, a mini-batch size of 32, and a maximum of 50 epochs.
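The training setup reported above can be collected in a small configuration object together with a 70/30 split. This sketch is written in Python for illustration only; the authors trained in MATLAB. It assumes the split is made over the 354 labeled images, which is consistent with the 107 test images reported in the results; the names `TrainingConfig` and `split_dataset` and the random seed are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters as reported in the text."""
    input_size: tuple = (224, 224, 3)
    optimizer: str = "adam"
    initial_learning_rate: float = 1e-4
    mini_batch_size: int = 32
    max_epochs: int = 50
    train_fraction: float = 0.7


def split_dataset(items, train_fraction: float, seed: int = 0):
    """Shuffle and split items into (train, test) partitions."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary choice
    n_train = int(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]
```

For example, `split_dataset(range(354), TrainingConfig().train_fraction)` yields 247 training items and 107 test items.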

Results and Discussion
Both methods are applied to the whole dataset, trained, validated, and tested to localize ulcer regions in the cornea.

Image Processing and Hough Transform
The method is applied to whole images. Figures 14-18 depict some of the results obtained for different shapes of ulcer regions. Each figure shows the original image, the segmentation output, and the corresponding ground truth.

Figures 14-18 illustrate the output of the first proposed method. They show the ability of the method to localize the ulcer region with high similarity to the ground truth. Similarity indices, namely the Jaccard similarity index and intersection over union (IOU), are calculated for each case. The similarity indices are almost 100% for all presented images except the one in Figure 16. As shown in Figure 16, the method was sensitive to the bottom region of the eye and detected an ulcer region that is not present in the ground truth; in this case, the Jaccard and IOU indices are low. However, the proposed method may be able to distinguish ulcer regions from other eye regions better than manual segmentation.

Semantic Segmentation
After training the model on 70% of the whole dataset, accuracy, sensitivity, and specificity were calculated for the training and test stages. Accuracy is the percentage of correctly classified pixels among all pixels. Table 1 reports the sensitivity, accuracy, and specificity of the semantic deep learning segmentation for both the training and test stages [23][24][25][26][27][28][29][30][31].
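The pixel-wise accuracy, sensitivity, and specificity mentioned above can be computed from a predicted mask and its ground truth as in the following sketch. It uses the standard confusion-matrix definitions, with ulcer pixels as the positive class; the function name `pixel_metrics` is hypothetical.

```python
import numpy as np


def pixel_metrics(pred, truth) -> dict:
    """Accuracy, sensitivity, and specificity over all pixels of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.sum(pred & truth))    # ulcer pixels correctly detected
    tn = int(np.sum(~pred & ~truth))  # background pixels correctly rejected
    fp = int(np.sum(pred & ~truth))   # background labeled as ulcer
    fn = int(np.sum(~pred & truth))   # ulcer pixels missed
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }
```

Because background pixels usually dominate a slit-lamp image, accuracy and specificity can be high even when sensitivity is modest, which is why all three are reported together.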

The proposed method is applied to the dataset. Figures 19-23 illustrate the output of the deep learning model. Each figure shows the original image and the corresponding ulcer region localized by the model. Figures 19-23 illustrate how sensitive the model is to the ulcer region. In addition, the time required for each test image is less than 1 s, implying that the second proposed method is accurate, sensitive, and fast once the AI model has been built.
The two methods are compared in terms of sensitivity, accuracy, specificity, Jaccard index, and Dice similarity. The Jaccard index is the number of correctly classified ulcer pixels (the intersection of the predicted and ground-truth regions) divided by the number of pixels in the union of the two regions. It is also known as intersection over union (IOU) [31]:
$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$
The Dice similarity is twice the area of the intersection divided by the sum of the number of predicted pixels and the number of ground-truth pixels, and it is equivalent to the F1 score [31]:
$$\mathrm{Dice}(A, B) = \frac{2\,|A \cap B|}{|A| + |B|}$$
where A is the set of predicted ulcer pixels and B is the set of ground-truth ulcer pixels. All evaluation metrics are computed on the same test data, which comprise 30% of the whole dataset, i.e., 107 images. Table 2 reports the performance of each method on the same images. Table 2 summarizes the results for both methods and shows the benefit of deep learning techniques over traditional image processing tools. In terms of accuracy, specificity, and Jaccard similarity, the second approach scores higher than the first. However, it is less sensitive than the first method, and its IOU is lower than that of the proposed image processing method. This is because the deep learning approach needs a large dataset to yield a robust and highly sensitive model through optimization of its training parameters. On the other hand, the second approach is faster: the first method requires almost 30 s to detect the ulcer region, whereas the second takes just 1 s for a single test image. Therefore, the second method is a promising approach for ulcer segmentation in the medical field, although building a sensitive and reliable model requires training the semantic model on a large dataset. Figure 24 shows the performance of each method. Both methods are effective, as shown in the corresponding figures.
Their IOU and Dice similarity are almost the same. In the experiments carried out in this paper, segmenting the ulcers in a single image takes just 1 s with the AI model, whereas image processing needs 30 s. This study is also compared with the literature that used the same dataset. Table 3 reports the performance of both methods in terms of accuracy, sensitivity, specificity, and Dice index. As illustrated in Table 3, both methods are effective for ulcer detection.
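The overlap measures used in this comparison can be computed from two binary masks as in the sketch below. It follows the standard set-based definitions of the Jaccard index (intersection over union) and the Dice similarity; the function name and the convention of returning 1.0 when both masks are empty are assumptions.

```python
import numpy as np


def jaccard_and_dice(pred, truth):
    """Return (Jaccard index, Dice similarity) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    inter = int(np.sum(pred & truth))
    union = int(np.sum(pred | truth))
    total = int(pred.sum()) + int(truth.sum())
    jaccard = inter / union if union else 1.0   # assumed convention for empty masks
    dice = 2 * inter / total if total else 1.0
    return jaccard, dice
```

The two measures are monotonically related by Dice = 2J / (1 + J), so under these definitions a method that ranks higher on one also ranks higher on the other for the same image.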

Conclusions
A corneal ulcer is a common corneal disease. It causes ocular morbidity and results from injury or infection by bacteria, viruses, or parasites. Early diagnosis of ulcers decreases the chance of vision impairment. Employing slit-lamp imaging techniques in clinics can be tedious, expensive, and time-consuming. Localizing ulcer regions in slit-lamp images affects the quality of the diagnosis.
Manual detection requires highly experienced physicians and is not always accurate. Automated segmentation of the corneal ulcer region improves the assessment procedure and supports accurate diagnosis.
This paper proposed two methods to extract the ulcer region automatically. The first approach uses image processing techniques with the Hough transform to localize the segment affected by the corneal ulcer. The second approach is based on deep learning algorithms. The two methods are evaluated in terms of performance metrics: accuracy, sensitivity, specificity, Jaccard similarity, Dice similarity, and IOU. The results show that both methods are effective in terms of accuracy, with deep learning being more accurate than image processing. However, image processing is more sensitive to ulcer regions, whereas the deep learning method has higher specificity. This study recommends exploiting the properties of image processing algorithms and artificial intelligence (AI) to guide residents in extracting the affected ulcer region.
The sensitivity of the AI model can be improved by training on a larger dataset, yielding a more sensitive, reliable, and robust model. Both approaches support the selection of appropriate treatment based on the assessment report, which decreases the probability of visual impairment.