Comparison between Deep-Learning-Based Ultra-Wide-Field Fundus Imaging and True-Colour Confocal Scanning for Diagnosing Glaucoma

In this retrospective, comparative study, we evaluated and compared the performance of two confocal imaging modalities in detecting glaucoma based on a deep learning (DL) classifier: ultra-wide-field (UWF) fundus imaging and true-colour confocal scanning. A total of 777 eyes, including 273 normal control eyes and 504 glaucomatous eyes, were tested. A convolutional neural network was used for each true-colour confocal scan (Eidon AF™, CenterVue, Padova, Italy) and UWF fundus image (Optomap™, Optos PLC, Dunfermline, UK) to detect glaucoma. The diagnostic model was trained on 545 images and tested on 232 images. The presence of glaucoma was determined, and the accuracy and area under the receiver operating characteristic curve (AUC) metrics were assessed for diagnostic power comparison. DL-based UWF fundus imaging achieved an AUC of 0.904 (95% confidence interval (CI): 0.861–0.937) and accuracy of 83.62%. In contrast, DL-based true-colour confocal scanning achieved an AUC of 0.868 (95% CI: 0.824–0.912) and accuracy of 81.46%. Both DL-based confocal imaging modalities showed no significant differences in their ability to diagnose glaucoma (p = 0.135) and were comparable to the traditional optical coherence tomography parameter-based methods (all p > 0.05). Therefore, using a DL-based algorithm on true-colour confocal scanning and UWF fundus imaging, we confirmed that both confocal fundus imaging techniques had high value in diagnosing glaucoma.


Introduction
Glaucoma is a chronic, progressive optic nerve disease with characteristic visual field (VF) impairment resulting from the loss of the retinal nerve fibre layer (RNFL). The early diagnosis of the disease is difficult because the symptoms are not clear until it has progressed to the end stage. Nevertheless, early diagnosis is important for preventing visual impairment caused by glaucomatous damage. The visualisation of RNFL defects is a useful indicator for the early detection of glaucomatous damage [1][2][3][4][5]. In the very early stages of glaucoma, detecting RNFL defects is critical for diagnosing it; the RNFL has a relatively high diagnostic value because damage to it can appear before VF damage [6,7].
Several imaging devices are used to diagnose glaucoma. Because many imaging methods are available, physicians usually integrate the results obtained from multiple methods and base the diagnosis on the combined findings. Therefore, efforts to improve diagnosis using various imaging methods have been made in the fields of ophthalmology and glaucoma [8].
Recently, high-resolution fundus imaging via confocal scanning has been developed and has begun to be widely used in clinical practice. Ultra-wide-field (UWF) fundus imaging and true-colour confocal scanning are widely performed. High-resolution fundus imaging equipment uses a confocal scanning laser method. UWF fundus imaging (Optomap™, Optos PLC, Dunfermline, UK) combines data captured using red and green laser sources. It is advantageous in that it is possible to capture the data within a short time period without mydriasis and observe a wide range of peripheral retinas [9,10]. True-colour confocal scanning (Eidon AF™, CenterVue, Padova, Italy) allows the emission of light from a white light-emitting diode (LED) at a particular optical wavelength to acquire a high-resolution image of 14 megapixels [11]. It can also be used to capture data within a short time period without mydriasis. These two confocal fundus imaging modalities have been introduced relatively recently and are widely used for health screening purposes in medical check-up centres as well as for glaucoma diagnosis in ophthalmology clinics. In UWF fundus imaging, red-free fundus images can be obtained by only using one laser source and a blue-reflectance image, a part of which is similar to the conventional red-free fundus images obtained using a red-free filter. In true-colour confocal scanning, red-free images can be obtained by only capturing the image of a specific wavelength region using software that processes the image obtained using an LED light source [12].
Some studies have shown that UWF fundus imaging is useful for diagnosing glaucoma [30], whereas others have shown that accurate glaucoma diagnosis can be achieved by combining UWF fundus imaging with DL methods [31][32][33][34]. However, no studies have shown the results of true-colour confocal scanning in the field of glaucoma, and this modality has only been used for the diagnosis of diabetic retinopathy and retinal diseases [35][36][37].
In a previous study conducted by our team, true-colour confocal scanning was found to be superior to UWF fundus imaging in detecting localised RNFL defects (early changes in glaucoma) when eye physicians manually evaluated images taken using both modalities [12,30,31]. It is widely known that a high false-positive rate is obtained when UWF fundus imaging is used to manually determine localised RNFL defects. Therefore, by applying DL to true-colour confocal scanning and UWF fundus imaging, we compared the diagnostic power of these test methods to determine the kind of change that occurs therein. Furthermore, we evaluated the diagnostic power of both relatively recent confocal fundus imaging techniques using a DL-based algorithm and compared them with conventional OCT parameter-based methods in terms of glaucoma diagnosis.

Participants
The study protocol was approved by the institutional review board of Hanyang University Hospital. This study design followed the tenets of the Declaration of Helsinki for biomedical research.
The definition of a normal VF, glaucomatous VF defects, and the reliability of the VF test was determined according to the classical method used in many previous studies in the literature [38].
Patients with best-corrected visual acuity ≥ 20/40 and refractive errors within ±6.0 D spherical equivalent and ±3.0 D astigmatism were included in this study. Exclusion criteria were a history of surgical therapy (e.g., glaucoma-filtering surgery; however, patients who only underwent cataract surgery were not excluded), any other ocular disease that could affect visual function, any media opacity that would significantly hinder OCT image acquisition, and an inability to acquire high-quality images (i.e., image quality scores of <50). When both eyes met all the eligibility criteria, one eye was randomly selected for inclusion [27].
Patients with glaucoma were identified on the basis of several clinical signs. Specifically, the presence of a characteristic optic disc (i.e., neuroretinal rim thinning, notching, excavation, or a cup-to-disc ratio difference of >0.2 between the eyes) on stereo disc images was used as a major clinical sign for glaucoma diagnosis. The presence of RNFL defects on red-free fundus imaging is an alternative sign, regardless of the presence or absence of glaucomatous VF defects [27].
Healthy participants were defined as those satisfying all of the following criteria: (1) no history of intraocular surgery, (2) IOP of ≤21 mmHg with no history of increased IOP, (3) no glaucomatous disc appearance, and (4) normal ophthalmologic findings. Two glaucoma specialists (M.S. and W.J.L.), who were masked to all other patient information, independently judged the images. To avoid ambiguity, cases in which the two examiners reached different conclusions were excluded from the training and test datasets [27].

Glaucoma Diagnosis Using UWF Fundus Imaging and True-Colour Confocal Scanning
Convolutional neural networks (CNNs) are widely used in image-recognition tasks. They effectively extract spatial features by conducting two-dimensional convolution with kernels whose weights are determined by training the model with the dataset. Herein, a CNN was applied to detect glaucoma using true-colour confocal scanning and UWF fundus images separately. Among the various CNN architectures, the VGG-19 network was selected owing to its popularity and computational efficiency. The VGG-19 architecture is illustrated in Figure 1.
The VGG-19 network consisted of 16 convolution layers and 5 pooling layers. The last layer was a fully connected layer, followed by the softmax function that produced the probability values for each category. The network was pretrained on ImageNet. Using the collected dataset, we trained the VGG-19 network to perform a glaucoma diagnosis task. A batch size of 4, a learning rate of 0.001, and 40 epochs were used on the basis of empirical parameter tuning.
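As a rough illustration of the layer counts and training settings quoted above, the following sketch tallies the layers of the standard VGG-19 layout and the number of weight updates per epoch. The 224 × 224 input size is an assumption (the usual ImageNet resolution; the text does not state the input size used here), and the helper names `summarise` and `updates_per_epoch` are hypothetical.

```python
# Layer layout of the standard VGG-19 (configuration "E"): numbers are the
# output channels of a 3x3 convolution, "M" marks a 2x2 max-pooling layer.
VGG19_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
             512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]


def summarise(cfg, input_size=224):
    """Count layers and track the spatial size of the feature map.

    input_size=224 is an assumed value, not taken from the study.
    """
    n_conv = sum(1 for layer in cfg if layer != "M")
    n_pool = sum(1 for layer in cfg if layer == "M")
    size = input_size
    for layer in cfg:
        if layer == "M":
            size //= 2  # each pooling layer halves height and width
    return n_conv, n_pool, size


def updates_per_epoch(n_train, batch_size):
    """Weight updates per epoch, assuming the last partial batch is kept."""
    return -(-n_train // batch_size)  # ceiling division


n_conv, n_pool, final_size = summarise(VGG19_CFG)
steps = updates_per_epoch(n_train=545, batch_size=4)
print(n_conv, n_pool, final_size)  # 16 5 7
print(steps, steps * 40)           # 137 5480
```

Under these assumptions, 545 training images with a batch size of 4 give 137 updates per epoch, or 5480 updates over 40 epochs.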

Statistical Analyses
To evaluate the diagnostic ability for glaucoma, we calculated the area under the receiver operating characteristic curve (AUC) and accuracy. The cut-off value for glaucoma probability was changed using the AUC of the 95% CI, and the accuracy was used as a measure of precision when classifying the stages of glaucoma. The precision-recall curve is a tool for evaluating a classifier and is particularly informative when the distribution of data labels is uneven. Its x-axis represents the recall, and its y-axis represents the precision. The area under this curve is used to evaluate the performance of the binary classifier.
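The metrics named above can be sketched in a few lines. The example below is illustrative only: the labels and scores are made-up toy values, the 0.5 cut-off is an assumption, and the function names are hypothetical rather than taken from the study's code.

```python
def auc(labels, scores):
    """AUC as the probability that a glaucomatous eye (label 1) receives a
    higher score than a normal eye (label 0); ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, cutoff=0.5):
    """Share of eyes classified correctly at the given probability cut-off."""
    preds = [1 if s >= cutoff else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def precision_recall(labels, scores, cutoff=0.5):
    """One point of the precision-recall curve at the given cut-off."""
    preds = [1 if s >= cutoff else 0 for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data: 4 glaucomatous eyes (1) and 4 normal eyes (0).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

print(auc(labels, scores))               # 0.875
print(accuracy(labels, scores))          # 0.75
print(precision_recall(labels, scores))  # (0.75, 0.75)
```

Sweeping the cut-off from 1 down to 0 and collecting the resulting (recall, precision) pairs traces the full precision-recall curve.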

Results
We split the entire dataset into training and test datasets at a ratio of 7:3. The demographic and clinical characteristics of the participants are summarised in Table 1.
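The text gives only the 7:3 ratio and the resulting totals (545 training and 232 test images out of 777 eyes); it does not say how the split was drawn. The sketch below shows one hypothetical procedure, a per-class split with training counts rounded up, that happens to reproduce those totals. The per-class counts it prints are illustrative and are not reported in the study.

```python
def split_counts(n_per_class, train_tenths=7):
    """Per-class (train, test) counts, rounding the training share up.

    Assumption: the split is done separately within each class.
    """
    counts = {}
    for name, n in n_per_class.items():
        n_train = (n * train_tenths + 9) // 10  # integer ceiling of 0.7 * n
        counts[name] = (n_train, n - n_train)
    return counts


eyes = {"normal": 273, "glaucoma": 504}  # group sizes from the abstract
counts = split_counts(eyes)
n_train = sum(train for train, _ in counts.values())
n_test = sum(test for _, test in counts.values())

print(counts)           # {'normal': (192, 81), 'glaucoma': (353, 151)}
print(n_train, n_test)  # 545 232
```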

Healthy Group versus Glaucoma Group
As shown in Table 2, the accuracy of UWF fundus imaging based on DL was 83.62%, and that of true-colour confocal scanning was 81.46%. We also evaluated the AUC of the two imaging methods. In the AUC (95% CI) analysis shown in Table 2, the DL-based method achieved an AUC of 0.904 (0.861–0.937) for UWF fundus imaging and 0.868 (0.824–0.912) for true-colour confocal scanning. In addition, as shown in the AUC and precision-recall curve in Figure 2, the glaucoma detection performance of the methods based on DL was similar to or better than that of the existing methods based on the RNFL, ganglion cell complex (GCC), or ganglion cell-inner plexiform layer (GCIPL). Table 3 shows that the p-value between the two confocal imaging modalities was 0.135, thereby indicating no significant difference. The analysis shows that the AUC of the DL-based imaging methods did not significantly differ from that of the traditional OCT parameter-based methods (RNFL, GCIPL, and GCC, all p > 0.05).
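The text does not name the statistical test behind these p-values. As one possible way to compare two AUCs measured on the same eyes, the sketch below uses a paired percentile bootstrap; this is a swapped-in illustrative technique, not necessarily the method used in the study, and the scores are synthetic.

```python
import random


def auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def paired_bootstrap(labels, scores_a, scores_b, n_boot=500, seed=0):
    """Resample eyes with replacement and compare the two AUCs on each draw.

    Returns the observed AUC difference, a 95% percentile interval for it,
    and a two-sided bootstrap p-value for "no difference".
    """
    rng = random.Random(seed)
    n = len(labels)
    observed = auc(labels, scores_a) - auc(labels, scores_b)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # a resample needs both classes for the AUC to exist
        a = [scores_a[i] for i in idx]
        b = [scores_b[i] for i in idx]
        diffs.append(auc(y, a) - auc(y, b))
    diffs.sort()
    lo = diffs[int(0.025 * n_boot)]
    hi = diffs[int(0.975 * n_boot) - 1]
    # Share of resampled differences on either side of zero, doubled.
    below = sum(d <= 0 for d in diffs) / n_boot
    above = sum(d >= 0 for d in diffs) / n_boot
    p_value = min(1.0, 2 * min(below, above))
    return observed, (lo, hi), p_value


# Synthetic scores for two hypothetical classifiers on the same 100 eyes.
gen = random.Random(1)
labels = [1] * 60 + [0] * 40
scores_a = [0.4 * y + 0.6 * gen.random() for y in labels]
scores_b = [0.3 * y + 0.7 * gen.random() for y in labels]

observed, (lo, hi), p_value = paired_bootstrap(labels, scores_a, scores_b)
print(round(observed, 3), (round(lo, 3), round(hi, 3)), p_value)
```

Resampling whole eyes, rather than each set of scores separately, keeps the pairing between the two modalities intact, which matters because both classifiers are evaluated on the same test eyes.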

Discussion
In this study, we investigated the performance of DL-based UWF fundus imaging and true-colour confocal scanning in diagnosing glaucoma. We applied DL algorithms to these two relatively recent confocal fundus imaging modalities and found that they exhibited good glaucoma diagnostic abilities compared to those of conventional OCT parameter-based methods.
Although DL has been used for diagnosing glaucoma in several studies [18][19][20][21][22][23][24][25][27][28][29][30][31], only a few have applied DL to UWF fundus images [31]. To the best of our knowledge, our study is the first to apply DL to true-colour confocal scanning images for diagnosing glaucoma. We also compared UWF fundus imaging and true-colour confocal scanning, which have not been extensively investigated in previous studies. Although a previous study compared these two imaging modalities, the images were evaluated by glaucoma specialists and not based on DL, as was performed in our study [12]. In a previous study, a large amount of UWF fundus imaging data was used to diagnose glaucoma using DL, and a high diagnostic accuracy was reported [31]. The diagnostic ability reported in that study (AUC = 0.983) was higher than that in our study (DL-based UWF fundus imaging: AUC = 0.904). This might be attributed to the fact that the previous study used a larger amount of data than that used in this study. If various methods are applied to overcome the limited data size, for example, using data augmentation, in a future study, a higher detection accuracy could be achieved with our method.
Because of the high resolution of the images, true-colour confocal scanning has been used to evaluate some retinal diseases. Some studies used true-colour confocal scanning using the EyeArt software to diagnose diabetic retinopathy [36,37]. These previously published studies that performed true-colour confocal scanning investigated the detection of retinal disease rather than glaucoma and used the EyeArt software rather than a DL algorithm. To the best of our knowledge, our study is the first to use DL-based true-colour confocal scanning to diagnose glaucoma.
Herein, we used VGGNet among the various available architectures for imaging analysis (AlexNet, VGGNet, and ResNet) [40][41][42]. VGGNet has a larger capacity than AlexNet [40,41]. Although ResNet has a stronger expression capability than VGGNet, it can easily overfit if the amount of training data is not sufficient to train the model [42]. Therefore, VGGNet was considered a reasonable option for this study. For confirmation, the performance data obtained using AlexNet and ResNet are also provided in Table S1.
We compared the diagnostic ability of the methods based on DL with that of conventional methods based on the peripapillary RNFL, macular GCIPL, and macular GCC thickness values. The conventional methods employ widely used OCT parameters for diagnosing glaucoma. Our analyses showed that the diagnostic ability of the DL-based methods was not significantly different from that of existing OCT parameter-based methods. Our study suggests that the accuracy of glaucoma diagnosis is comparable between OCT parameter-based and DL-based methods.
Many studies have applied DL to fundus images [11,13,26,[30][31][32]34,35]; however, our work is different from existing studies in that DL was designed to be applied to the two new confocal fundus imaging techniques with high resolution, and the diagnostic power of the two methods was compared. We believe that our analysis will be useful to the community, because these two methods are often used for screening in health check-up centres and primary clinics.
Our team compared UWF fundus imaging and true-colour confocal scanning for detecting localised RNFL defects in a previous study [12], in which the two types of images were evaluated by glaucoma specialists, not based on DL. A previous study has shown that true-colour confocal scanning could have advantages over UWF fundus imaging in evaluating localised RNFL defects [12]. UWF fundus imaging showed a significantly lower specificity than true-colour confocal scanning, which indicates that UWF fundus imaging yields a high false-positive rate. One study by another group that compared the detection of RNFL defects between conventional red-free fundus imaging and UWF fundus imaging also reported that UWF fundus imaging showed a comparable sensitivity but a lower specificity than conventional imaging [30].
Unlike the comparison made by glaucoma specialists [12], our analysis showed no significant difference in glaucoma diagnosis between the two modalities, and both yielded sufficiently good results when we adopted the DL algorithm.
The technical differences between UWF fundus imaging and true-colour confocal scanning are as follows: (1) differences in light source: a two-wavelength laser light versus an LED and a difference between false colour and true colour; (2) the confocal aperture of UWF fundus imaging is circular, whereas true-colour confocal scanning uses a confocal slit; and (3) there is a difference in the distortion owing to the different capture ranges of the two devices used.
These factors can cause differences in the interpretation of images by physicians (humans). These are possible disadvantages of UWF fundus imaging when DL is used to detect glaucoma instead of physician observations. Interestingly, the relatively distorted UWF fundus image showed a comparable diagnostic performance to the true-colour confocal scanning image (after adopting AI). Imaging not only the optic nerve but also RNFL defects is very important in diagnosing glaucoma, although the primary origin of glaucomatous damage is the optic nerve.
This study has several limitations. First, a direct comparison between the DL algorithm and judgement by physicians was not performed. Instead, we compared DL with OCT parameters that are often used. Second, because conventional fundus images or red-free images were not available, a direct comparison was difficult. When physicians discriminate glaucoma and set the gold standard for comparison, it would be a major disadvantage to use a true-colour confocal scanning image instead of a red-free image. However, true-colour confocal scanning images have been used for the diagnosis of glaucoma in many previous studies instead of red-free images [27,43,44], and glaucoma specialists have set the gold standard (glaucoma or normal) using several other tests (stereo disc imaging or the VF test). It is expected that this limitation can be overcome to some extent because the diagnostic decisions were made by specialists based on images using the gold standard, and all other analyses were only performed via artificial intelligence. Third, age differences were observed between the glaucoma and normal groups. A large amount of data was collected in actual clinical settings; therefore, the high prevalence of glaucoma in elderly populations could affect our results. However, because the analysis only used images based on DL and no difference was found between the training and test datasets, it did not have a significant effect on the main results of this study. Finally, to check the results of the limited dataset (small sample size), we performed cross-validation using different samples for training by dividing the images into five test groups and evaluating the accuracy for each group. Tables S2 and S3 show the accuracy obtained for each test group.
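The five-group cross-validation mentioned above can be sketched as follows. This is a minimal illustration that assumes all 777 images are shuffled once and dealt into five groups; the text does not describe the exact partitioning, and `k_fold_indices` is a hypothetical helper.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # deal indices into k groups
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


sizes = [(len(train), len(test)) for train, test in k_fold_indices(777)]
print(sizes)
# [(621, 156), (621, 156), (622, 155), (622, 155), (622, 155)]
```

In each round, a model would be trained on the `train` indices and its accuracy recorded on the held-out `test` group, giving one accuracy value per group.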

Conclusions
Our study results confirmed that the ability of DL-based UWF fundus imaging and true-colour confocal scanning to diagnose glaucoma was comparable to that of OCT parameter-based methods. By adopting DL, these two recently developed confocal fundus imaging systems are expected to be useful for glaucoma diagnosis.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/jcm11113168/s1, Table S1: CNN accuracy; accuracy when comparing UWF fundus images and true-colour confocal scanning images through various algorithms; Table S2: K-fold cross validation verification (k = 5); k-fold cross-validation conducted to overcome the small sample size and to solve the overfitting problem in the test for UWF fundus images; Table S3: K-fold cross-validation verification (k = 5); k-fold cross-validation conducted to overcome the small sample size and to solve the overfitting problem in the test for true-colour confocal scanning images.