Deep Learning-Based Glaucoma Screening Using Regional RNFL Thickness in Fundus Photography

Since glaucoma is a progressive and irreversible optic neuropathy, accurate screening and/or early diagnosis is critical in preventing permanent vision loss. Recently, optical coherence tomography (OCT) has become an accurate diagnostic tool to observe and extract the thickness of the retinal nerve fiber layer (RNFL), which closely reflects the nerve damage caused by glaucoma. However, OCT is less accessible than fundus photography due to the higher cost and expertise required for operation. Though widely used, fundus photography is effective for early glaucoma detection only when used by experts with extensive training. Here, we introduce a deep learning-based approach to predict the RNFL thickness around optic disc regions in fundus photography for glaucoma screening. The proposed deep learning model is based on a convolutional neural network (CNN) and utilizes fundus photographs paired with RNFL thicknesses measured with OCT for model training and validation. Using a dataset acquired from normal tension glaucoma (NTG) patients, the trained model can estimate RNFL thicknesses in 12 optic disc regions from fundus photos. Using intuitive thickness labels to identify localized damage of the optic nerve head and then estimating regional RNFL thicknesses from fundus images, we determine that screening for glaucoma could achieve 92% sensitivity and 86.9% specificity. Receiver operating characteristic (ROC) analysis at a specificity of 80% demonstrates that the localized mean over superior and inferior regions reaches 90.7% sensitivity, whereas the global RNFL thickness reaches 71.2% sensitivity. This demonstrates that the new approach of using regional RNFL thicknesses in fundus images holds good promise as a potential screening technique for early-stage glaucoma.


Introduction
Glaucoma is a chronic and irreversible disease caused by damage to the optic nerve that can lead to vision loss and blindness [1][2][3]. Glaucoma is typically characterized by the progressive degeneration of retinal ganglion cells, resulting in morphological changes in both the optic nerve and retinal nerve fiber layer (RNFL) [4][5][6][7][8]. As no optimal cure for glaucoma currently exists, typical treatments only aim to interrupt the progression of functional damage and visual impairment. Thus, early detection of glaucoma is critical to prevent significant vision loss.
Retinal fundus photography monitors the retinal structure and is also widely used for initial glaucoma screening. Fundus photography provides the two-dimensional surface of categorized into three thinning levels by comparison with the normative reference and utilized for a highly sensitive screening protocol for glaucoma detection.

Data Preparation
We acquired data from 303 patients diagnosed with normal tension glaucoma (NTG) at the Ulsan University Hospital ophthalmic clinic to build and validate our DL model. The enrolled NTG patients had a peak intraocular pressure consistently ≤21 mmHg without glaucoma medication, a normal open angle, typical glaucomatous optic nerve and visual field changes, and the absence of an ocular or systemic disorder responsible for the optic nerve damage. Before the diagnosis, at least two reliable and reproducible visual field examinations were obtained using the Humphrey field analyzer and the Swedish Interactive Threshold Algorithm 24-2 test program (HFA 24-2; Carl Zeiss Meditec, Inc., Dublin, CA, USA). A total of 557 eyes from 303 patients were acquired for the dataset. The average mean deviation (MD) of the collected eyes is −4.69 dB, and about 72.6% of cases were in the early stage (MD > −6 dB). The average visual field index (VFI) of the collected eyes is 88.39%. The dataset comprised color fundus photographs (TRC-NW8, Topcon, Tokyo, Japan) and spectral-domain OCT (SD-OCT) scans (Triton and Atlantis, Topcon, Tokyo, Japan) from the collected eyes.
Collected raw data were preprocessed to prepare the training set as follows. The color fundus photographs of size 3216 × 2136 were processed in four steps to isolate the optic disc region: (1) converting the color fundus image to a grayscale image, (2) applying Gaussian filtering with a kernel size of 65 × 65, (3) identifying the brightest spot by scanning for the maximum-intensity pixel, since the center of the optic disc is generally the brightest in fundus photos, and (4) cropping a circular region of 320 × 320 pixels around the center of the optic disc. The cropped fundus photographs were manually reviewed by an expert to ensure the centering of the optic disc. Manually filtered preprocessed images were sorted into an image dataset without artifacts such as whitening, blurring, dusk, camera artifacts, dust and darkness. OCT measurements were acquired from the report generated by analysis software (IMAGEnet6, version 1.24, Topcon, Japan). From a given SD-OCT scan, the software averages data over 360° for the global RNFL thickness and over 30° for the corresponding regional RNFL thicknesses along the circle with a diameter of 3.45 mm centered at the optic disc. We assumed that a preprocessed fundus photograph and the corresponding OCT scan were rotationally aligned. A total of 940 pairs of color fundus photographs and SD-OCT scans acquired within 180 days of each other (mean 31.4 days and standard deviation 62.9 days) were collected. Fundus photographs were sectioned at 30° intervals to produce a fan-shaped optic disc image toward each sub-regional direction and matched with the corresponding RNFL thickness, yielding 11,280 data pairs. A summary of collected raw data is given in Table 1. Collected raw data were split into training (80%) and test (20%) sets. The random sampling process was performed at the patient level to avoid biased estimation of test performance. For the training process, data from 242 patients with 730 pairs for global RNFL and 8760 pairs for regional RNFL were used. The remaining 210 pairs for global RNFL and 2520 regional pairs from 61 patients were used as the holdout test set for model performance evaluation and the subsequent screening protocol test.
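The four optic disc localization steps described above can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: the grayscale weights, the Gaussian sigma (derived from the kernel size by a common heuristic), the clamped border handling, and all function names are hypothetical, and the square crop omits the circular masking.

```python
import math

def to_gray(rgb):
    """Convert an H x W grid of (r, g, b) tuples to grayscale floats."""
    # Standard luma weights (assumed; the conversion used is not stated).
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]

def gaussian_kernel(size):
    """Normalized 1-D Gaussian kernel of odd length `size`."""
    sigma = 0.3 * ((size - 1) * 0.5 - 1) + 0.8  # assumed heuristic for sigma
    half = size // 2
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(img, kernel):
    """Convolve every row with `kernel`, clamping indices at the border."""
    half = len(kernel) // 2
    width = len(img[0])
    return [
        [sum(k * row[min(max(x + j - half, 0), width - 1)] for j, k in enumerate(kernel))
         for x in range(width)]
        for row in img
    ]

def transpose(img):
    return [list(col) for col in zip(*img)]

def gaussian_blur(img, size=65):
    """Separable Gaussian filter: blur along rows, then along columns."""
    kernel = gaussian_kernel(size)
    return transpose(blur_rows(transpose(blur_rows(img, kernel)), kernel))

def brightest_pixel(img):
    """Return (row, col) of the maximum-intensity pixel."""
    _, y, x = max((v, y, x) for y, row in enumerate(img) for x, v in enumerate(row))
    return y, x

def crop_around(img, center, size=320):
    """Crop a size x size window around `center`, shifted to stay in bounds."""
    cy, cx = center
    top = min(max(cy - size // 2, 0), len(img) - size)
    left = min(max(cx - size // 2, 0), len(img[0]) - size)
    return [row[left:left + size] for row in img[top:top + size]]

def locate_and_crop(rgb, kernel_size=65, crop_size=320):
    """Steps (1) to (4): grayscale, blur, brightest spot, crop."""
    center = brightest_pixel(gaussian_blur(to_gray(rgb), kernel_size))
    return center, crop_around(rgb, center, crop_size)
```

The blur is what makes step (3) robust: a single saturated pixel can be the raw maximum, but after smoothing the broad bright disc dominates. On full-resolution photographs an optimized image library would be used instead of nested Python lists.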
In addition, we acquired 583 fundus photographs from 522 normal patients who visited the Ulsan University Hospital HealthCare Center. Among the fundus photographs from normal patients, 103 suspicious fundus photos from 93 patients were selected by a glaucoma expert; however, OCT measurements were not performed for normal patients. The additional fundus images from normal patients were used to verify the applicability of our approach. This study for clinical use was approved by the Ethics Committee of Ulsan University Hospital (Approval No. 2020-09-001).

CNN Architecture for RNFL Prediction and Training
To predict RNFL thickness from fundus photographs, we utilized a convolutional neural network (CNN) [41]. Figure 1b shows that the proposed model architecture consists of four convolution blocks and fully connected layers. Each convolution block comprises two convolutional layers with a kernel size of 3 × 3 and a stride of 2. After the convolutional layers, batch normalization and max-pooling layers are used for stable and fast learning [42]. The depth of the convolutional layers increases from 16 to 128 to extract enough image features for accurate estimation of RNFL thickness. A fully connected layer with 128 nodes is placed after the final convolution block and gathers the high-dimensional features extracted by the convolution blocks. The final node returns the estimated RNFL thickness as a single numeric value computed from the merged and weighted features of the input fundus photograph. Nonlinearity is introduced by the rectified linear unit (ReLU) activation at each convolution and fully connected layer [43]. At each training step, 32 randomly sampled pairs formed a minibatch. To increase the diversity and heterogeneity of the training set, the fundus photographs were augmented by applying random contrast, brightness, hue, and saturation. The mean squared error (MSE) was used as the loss function, and model parameters were updated by the Adam optimizer [44]. We additionally trained a model with an identical method to predict global RNFL thicknesses from a whole optic disc image for comparison with former studies. The proposed CNN models were trained and validated based on the five-fold cross-validation method using the training set from 242 patients. In total, five models were trained for each of the global and regional RNFL predictions, and the final models were then selected based on performance evaluation using the validation sets.
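As an illustration of the validation scheme, the sketch below builds five cross-validation folds and evaluates the MSE loss. Grouping the folds by patient is an assumption made here by analogy with the patient-level train/test split; the text does not state how the folds were formed. The function names and the round-robin assignment are likewise hypothetical.

```python
import random

def patient_level_folds(patient_ids, n_folds=5, seed=0):
    """Yield (train_idx, val_idx) pairs in which no patient spans both sides.

    `patient_ids[i]` is the patient that sample i belongs to.
    """
    unique = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(unique)
    # Deal the shuffled patients round-robin into n_folds groups.
    fold_of = {pid: i % n_folds for i, pid in enumerate(unique)}
    for k in range(n_folds):
        val = [i for i, pid in enumerate(patient_ids) if fold_of[pid] == k]
        train = [i for i, pid in enumerate(patient_ids) if fold_of[pid] != k]
        yield train, val

def mse(pred, target):
    """Mean squared error between predicted and measured thicknesses."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
```

Keeping both eyes and all repeated visits of a patient in the same fold prevents a model from being validated on images that are nearly identical to its training images.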

Estimation and Categorization of RNFL Thickness for Glaucoma Screening
Two trained CNN models predicted the RNFL thickness value from the whole and sectioned optic disc images. For a given fundus image input, CNN outputs RNFL thickness in µm scale. For each eye, we predicted 13 RNFL values: the global RNFL (360 degrees) and 12 regional RNFL corresponding to each direction (30 degrees).
Figure 1. Overview of the proposed study. (a) Fundus images and OCT images were segmented into 12 subsections and then the segmented images were paired with the RNFL thickness measurements from OCT to train the model. The trained CNN model can predict RNFL thickness from a subdivided retinal image. The label of the thinning level for an estimated RNFL of a given region is determined by comparison with the normative reference data. (b) The details of the CNN architecture for RNFL thickness estimation from segmented retinal fundus images are presented. Note that the model for estimation of global RNFL has an input size of 320 × 320 × 3 and thus the sizes of the following convolution blocks were adjusted.
The thinning level of the RNFL is categorized into three levels (green, yellow, and red) by comparing the OCT-measured thickness with the normative database [45]. Green indicates the normal range, between the 95th and 5th percentiles of the reference data; yellow indicates the borderline range, between the 5th and 1st percentiles of the reference data; and red indicates the abnormal range, below the 1st percentile of the normative reference data [46].
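The three-level categorization can be expressed as a simple threshold rule. The sketch below is illustrative only: the cutoff values are hypothetical placeholders rather than entries from any normative database, and treating every value at or above the 5th percentile as green is an assumption.

```python
# Hypothetical (5th percentile, 1st percentile) cutoffs in micrometres.
# Placeholder numbers for illustration, not taken from a normative database.
EXAMPLE_CUTOFFS = {"global": (80.0, 72.0), "IT": (100.0, 85.0)}

def thinning_level(thickness_um, p5_um, p1_um):
    """Map an RNFL thickness to a green/yellow/red thinning label."""
    if thickness_um < p1_um:
        return "red"      # below the 1st percentile
    if thickness_um < p5_um:
        return "yellow"   # between the 1st and 5th percentiles
    return "green"        # at or above the 5th percentile

def label_eye(predicted_um, cutoffs):
    """Label every region of one eye from its predicted thicknesses."""
    return {region: thinning_level(value, *cutoffs[region])
            for region, value in predicted_um.items()}
```

With these placeholder cutoffs, `label_eye({"global": 84.3, "IT": 90.0}, EXAMPLE_CUTOFFS)` labels the global average green and the IT sector yellow, the kind of case in which a regional label flags an eye that the global label would miss.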
We used the predicted thinning level of the RNFL as the screening criterion, as determined by a comparison between the CNN-predicted RNFL thickness and the normative database. In a previous study, the screening capability of RNFL measurement using SD-OCT was investigated [14]. A positive diagnosis of glaucoma was made if any abnormal or borderline level was included in the regions of interest of a given fundus image. Various combinations of subregions were tested to demonstrate that screening performance strongly depends on the choice of subregions.
Receiver operating characteristic (ROC) analyses were performed for global and regional predicted RNFL thicknesses. The ROC curves reveal the tradeoff between the true positive rate (TPR or sensitivity) and the false positive rate (FPR or 1 − specificity). The area under the ROC curve (AUC), the sensitivity at fixed specificities of 80% and 95%, and the corresponding thresholds of RNFL thickness were investigated. The research workflow and how the acquired datasets were used at each step are summarized in Figure 2.
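The quantities named above can be computed directly from two lists of predicted thicknesses. The following sketch assumes that a thinner RNFL indicates glaucoma, so an eye is called positive when its thickness is at or below a threshold; the function names and the rank-based AUC formulation are illustrative choices, not the procedure used in the study.

```python
def auc_thinner_is_positive(glaucoma_um, normal_um):
    """AUC as the probability that a glaucomatous eye is thinner than a normal eye.

    Ties count as one half (the Mann-Whitney formulation of the AUC).
    """
    wins = 0.0
    for g in glaucoma_um:
        for n in normal_um:
            if g < n:
                wins += 1.0
            elif g == n:
                wins += 0.5
    return wins / (len(glaucoma_um) * len(normal_um))

def sensitivity_at_specificity(glaucoma_um, normal_um, target_spec):
    """Best sensitivity among thresholds whose specificity meets the target.

    Returns (sensitivity, threshold_um); the threshold is None if no
    candidate threshold yields a nonzero sensitivity at that specificity.
    """
    best = (0.0, None)
    for t in sorted(set(glaucoma_um) | set(normal_um)):
        spec = sum(n > t for n in normal_um) / len(normal_um)
        sens = sum(g <= t for g in glaucoma_um) / len(glaucoma_um)
        if spec >= target_spec and sens > best[0]:
            best = (sens, t)
    return best
```

Only observed thickness values need to be tried as thresholds, because sensitivity and specificity change only when the threshold crosses a data point.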
Figure 3 shows the relationship between the RNFL thickness estimated from optic disc photographs by the trained CNNs and the OCT measurements in the test set for the global and the regional RNFL. To verify the model performance, we computed the mean absolute error (MAE), the R-squared value, and Pearson's correlation coefficient between the predicted and OCT-measured RNFL thicknesses, as shown in Table 2.
Table 2. A summary of performance metrics: the mean absolute error (MAE), the R-squared value, and Pearson's correlation coefficient between the RNFL predictions from the trained models and the OCT measurements.

(Table 2 columns: Model, Predict, MAE, R-Squared, Pearson's Correlation; the first row is the previous study [33], global prediction.)
The MAE for the global RNFL prediction from the proposed architecture was 9.38 µm, which is in a similar error range to that of a previous study [33], which reported 7.39 µm. The MAE for the regional RNFL thickness prediction is larger than that for the global RNFL prediction. The higher MAE in regional RNFL prediction is attributable to the reduced amount of information in a sectioned optic disc image, compared with the whole image, as input to the DL model.
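For reference, the three metrics reported in Table 2 can be computed as in the sketch below. The R-squared here is the coefficient of determination, 1 − SS_res/SS_tot, which is an assumption because the exact definition used is not stated; the function names are illustrative.

```python
import math

def mae(pred, true):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)

def r_squared(pred, true):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(true) / len(true)
    ss_res = sum((t - p) ** 2 for p, t in zip(pred, true))
    ss_tot = sum((t - mean_true) ** 2 for t in true)
    return 1.0 - ss_res / ss_tot

def pearson_r(pred, true):
    """Pearson's correlation coefficient."""
    mp = sum(pred) / len(pred)
    mt = sum(true) / len(true)
    cov = sum((p - mp) * (t - mt) for p, t in zip(pred, true))
    var_p = sum((p - mp) ** 2 for p in pred)
    var_t = sum((t - mt) ** 2 for t in true)
    return cov / math.sqrt(var_p * var_t)
```

The MAE is expressed in micrometres, whereas R-squared and Pearson's coefficient are unitless, so the MAE is the metric that can be compared directly with clinical thickness cutoffs.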

Model Evaluation and Regional RNFL Thinning Level
The predicted values strongly correlate with the ground truths, as shown in Figure 3 by the linear trend along a wide range of RNFL thicknesses in both regional and global averages. The correlation between predicted and OCT-measured RNFL confirms that the suggested CNN architecture was trained as desired and works properly for RNFL thickness estimation from the optic disc image. A summarized comparison between the averaged OCT measurements and the trained CNN predictions of RNFL in the test dataset along regions is given in Table 3.
Table 3. The mean and standard deviation of the OCT-measured and CNN-predicted RNFL thickness (µm) in the test set along the global and regional sections.
In Figure 4, representative examples of predictions from the test set are shown for various stages of RNFL thinning progression. Figure 4a shows an example with the true and predicted RNFL thinning level for each sub-region. Figure 4b shows examples in which all regions are normal, Figure 4c shows globally normal eyes with partially damaged RNFL, and Figure 4d shows severe cases in which the progression of RNFL thinning is acute.

In the example, the prediction of the RNFL thinning level is not always correct because there is an error in the prediction of RNFL thickness. However, the capability to infer regional RNFL thinning levels from a fundus photograph could be useful for screening glaucoma when OCT measurement is not available. It also helps to quickly distinguish among different levels of progression in RNFL defects.

Regional RNFL Thinning and Glaucoma Screening
We applied our prediction protocol to three groups to reveal different RNFL thinning levels among distinct patient groups. In Figure 5, prediction results are shown for 30 randomly selected cases for each of the following groups of patients: (a) glaucoma (NTG), (b) suspicious, and (c) normal patients. The colored grid represents the prediction of the RNFL thinning level using the trained CNN for the global and 12 regional parts. The scatter points indicate predicted RNFL values with the corresponding colors of the thinning level. From the results, we confirmed that categorizing the RNFL thinning level based exclusively on the global RNFL can often lead to false negative outcomes. As shown in Figure 5a,b, the averaged RNFL of many glaucomatous eyes and most suspicious eyes is categorized in the normal range. However, a clear distinction appears in the categorized regional RNFL thinning levels. Notably, directional thinning of the RNFL at the superior and inferior regions is clearly shown in glaucomatous eyes. More specifically, directional thinning is predicted in suspicious patients in the superior-temporal (ST), inferior (II), and inferior-temporal (IT) regions.
Regional thinning of the RNFL around the optic disc was previously reported: thinning of the superior and inferior regions is strongly correlated with the early stage of glaucoma progression, and these regions become thinner much more quickly than other regions [47][48][49]. Our results for RNFL thickness estimated from fundus photographs using the trained CNN showed a similar tendency to these previous findings. Therefore, prediction of the RNFL thinning level along 12 regions from an optic disc image with a trained CNN could be useful for screening glaucomatous patients with OCT-based RNFL measurements. A summary of estimated RNFL from three groups as the average and the standard deviation for the global and 12 regions is given in Table 4.
In Figure 6, the results of the screening test are presented. For the test, we composed the screening test set with 313 glaucoma cases (NTG and suspicious cases) and 313 normal cases, as described in Figure 2b. The screening rule for distinguishing glaucomatous eyes simply checks the predicted categorization of averaged and regional RNFL thickness. A given eye was judged as glaucomatous by the following criteria: (1) based on global RNFL alone, a given eye is glaucomatous if the thinning level is abnormal or borderline; (2) based on regional RNFL, a given eye is glaucomatous if one of the thinning levels is abnormal or borderline and is located in the superior or inferior regions (i.e., ST, SS, SN, IN, II, and IT). The resulting confusion matrices are shown in Figure 6(a1,a2). Screening based on global RNFL thickness showed 14.4% sensitivity and 100% specificity. Screening with regional information showed 92% sensitivity and 86.9% specificity. Comparison between the two confusion matrices clearly shows that screening based on the regional RNFL performs much better.
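The two screening criteria can be written as short rules over the predicted thinning labels. The sketch below is a minimal illustration: the dictionary layout (a "global" key plus one key per sector) and the function names are assumptions, and any input labels are toy values rather than study data.

```python
SUPERIOR_INFERIOR = ("ST", "SS", "SN", "IN", "II", "IT")
FLAGGED = {"yellow", "red"}  # borderline or abnormal thinning levels

def screen_global(levels):
    """Criterion (1): positive if the global thinning level is flagged."""
    return levels["global"] in FLAGGED

def screen_regional(levels, regions=SUPERIOR_INFERIOR):
    """Criterion (2): positive if any superior or inferior sector is flagged."""
    return any(levels[r] in FLAGGED for r in regions)

def sensitivity_specificity(predicted, actual):
    """Sensitivity and specificity from paired boolean calls and labels."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    positives = sum(actual)
    negatives = len(actual) - positives
    return tp / positives, tn / negatives
```

Because criterion (2) fires when any one of six sectors is flagged, it is expected to trade some specificity for sensitivity relative to criterion (1), which matches the direction of the reported results.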

We further investigated which subregions are crucial for screening performance in terms of sensitivity and specificity. We performed the screening test using combinations of two regional subsections among the 12 regions. The criterion for judging a glaucomatous eye is the same as before: if at least one thinning level is abnormal or borderline, we judge the eye as glaucomatous. Figure 5b shows sensitivity and specificity for screening results from all possible combinations. The results show that screening sensitivity is relatively higher if a superior or inferior region is included. Thus, for glaucoma screening, at least in the case of normal tension glaucoma, observation of RNFL defects in the superior and inferior regions could be critical for correct diagnosis. Furthermore, when the inferior-temporal (IT) subregion is included, the screening sensitivity becomes higher than 75%, whereas it is lower than 75% for the other combinations. Interestingly, if these regions are included, the specificity becomes lower than 76%, which is worse than that of other combinations. This is because of the relatively wider distribution of RNFL thickness in these regions.
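The pairwise test above amounts to enumerating every pair of sectors (66 pairs for 12 sectors) and scoring each pair with the same rule. A minimal sketch, assuming each eye is a dictionary that maps a sector name to its predicted thinning label; the function name and data layout are hypothetical.

```python
from itertools import combinations

FLAGGED = {"yellow", "red"}  # borderline or abnormal thinning levels

def pairwise_screening(eyes, is_glaucoma, regions):
    """Return {(r1, r2): (sensitivity, specificity)} for every pair of regions.

    An eye is called glaucomatous if at least one sector of the pair is flagged.
    """
    n_pos = sum(is_glaucoma)
    n_neg = len(is_glaucoma) - n_pos
    results = {}
    for pair in combinations(regions, 2):
        calls = [any(eye[r] in FLAGGED for r in pair) for eye in eyes]
        tp = sum(c and g for c, g in zip(calls, is_glaucoma))
        tn = sum((not c) and (not g) for c, g in zip(calls, is_glaucoma))
        results[pair] = (tp / n_pos, tn / n_neg)
    return results
```

Sorting the resulting dictionary by sensitivity or by specificity then shows which sectors contribute most to each.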
In addition, we performed an ROC analysis based on predicted global and regional RNFL thicknesses. For global RNFL thicknesses, sensitivities were 34.5% and 71.2% for specificities at 95% and 80%, respectively, as shown in Figure 7a. These values are lower than those reported in Ref. [33] (76% and 90%, respectively). We attribute this lower performance to the fact that our data were drawn from normal tension glaucoma (NTG) patients. We suspect that our data distribution, relative to that of Ref. [33], is closer to normal cases with thicker RNFL. For example, the mean global RNFL thickness of the NTG patients is 84.3 µm, whereas the mean global RNFL thickness of the glaucoma patients in the dataset of the previous study is 68.8 µm. NTG patients show less thinning of the RNFL relative to other types of glaucoma, and thus we observed a thicker global value.
Diagnostics 2022, 12, x FOR PEER REVIEW 11 of 17
Furthermore, we performed ROC analysis with predictions of RNFL thickness from 12 regions and localized mean RNFL thicknesses over superior (S), nasal (N), inferior (I), temporal (T), superior plus inferior (S + I), and nasal plus temporal (N + T) regions to confirm diagnostically meaningful regions. Each local mean RNFL thickness was averaged over the corresponding sections; for example, RNFL thicknesses from the ST, SS, and SN sections were averaged for the local mean of the superior (S) region. Among the 12 subsections, the two highest sensitivities for specificity at 80% were 88.2% and 70.6%, from the inferior-temporal (IT) and the superior-nasal (SN) regions, respectively. The best score was obtained using the localized mean RNFL over the S + I regions, with an AUC of 0.913 and 90.7% sensitivity for specificity at 80%. The sensitivity of 90.7% is higher than that based on global RNFL and comparable to the previously reported values. For a fair comparison with the screening result (Figure 6(a2)), we also report sensitivities for specificity at 86.9%. In Table 5, full results of the ROC analysis are summarized.
Table 5. Summary of ROC analysis based on section-wise predicted RNFL thicknesses and localized mean over superior (S), nasal (N), inferior (I), temporal (T), superior plus inferior (S + I), and nasal plus temporal (N + T). * Refers to the highest sensitivity and ** to the second highest sensitivity for specificity at 80% among 12 subsections. † Refers to the best overall results.
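The localized means are plain averages over groups of sectors. In the sketch below, the superior grouping follows the example given in the text; taking IN, II, and IT as the inferior group and averaging all six sectors with equal weight for S + I are assumptions. The nasal and temporal groups are omitted because their sector labels are not listed here.

```python
# Sector groups; only the superior grouping is stated explicitly in the text.
GROUPS = {
    "S": ("ST", "SS", "SN"),
    "I": ("IN", "II", "IT"),                      # assumed inferior sectors
    "S+I": ("ST", "SS", "SN", "IN", "II", "IT"),  # assumed equal weighting
}

def localized_means(thickness_um, groups=GROUPS):
    """Average the predicted sector thicknesses within each group."""
    return {name: sum(thickness_um[s] for s in sectors) / len(sectors)
            for name, sectors in groups.items()}
```

Each localized mean can then be passed to the same ROC procedure as a single sector thickness.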

Discussion
We have demonstrated a DL model-based screening protocol that infers RNFL information from color fundus photographs. The CNN models are trained on data from two different ophthalmic imaging modalities: color fundus photography and OCT. Experimental results show that, with the DL technique, a color fundus photograph can delineate quantitative morphological changes in the retina in terms of RNFL thickness. This integration, through DL, of the core advantages of OCT with the relatively simple imaging device of fundus photography would provide low-cost, easy, and simple assessment and quantitative analysis for glaucoma screening and diagnosis. Moreover, intuitive screening information enables the classification of the normal and diseased groups by examining the regional RNFL defect around the optic disc, which could be utilized as a new screening protocol in real-world testing.
For precise diagnosis of glaucoma, periodic observation of RNFL thinning is required because of the wide spectrum of thicknesses observed in normal patients [50]. Thus, a follow-up medical examination every 6 months may be necessary. However, it is very hard to perform such in-depth medical checkups in developing or underdeveloped countries with limited access to modern imaging systems such as OCT. On the other hand, qualitative examinations based on fundus photographs are limited in detecting the early stage of progressive optic neuropathy due to the extensive training required.
Recent studies suggested utilizing DL techniques to provide effective and inexpensive glaucoma screening to populations in limited medical environments. These studies developed DL algorithms to predict global RNFL [33] or Bruch membrane opening-minimum rim width (BMO-MRW) [34] from a photograph of the whole optic disc. They showed that the predicted quantitative structural deformation correlated with visual loss and demonstrated screening capabilities based on ROC analysis. In Ref. [34], regional BMO-MRW along six different sectors was also investigated. The proposed approach has in common with these previous studies that quantitative retinal neuropathy is predicted by a DL technique from optic disc photographs. However, here detailed localization was performed for precise estimation by pairing compartmentalized optic disc images with the corresponding regional RNFL thicknesses. Screening capabilities were checked along each section and regional combinations based on ROC analysis. From the results, we found that the localized mean of RNFL thickness over the superior plus inferior regions showed the best screening performance. Though the ROC analysis was performed based on predicted RNFL, the results demonstrate that the global RNFL thickness alone may be inadequate for accurate screening of NTG and early-stage glaucoma patients. The results support our hypothesis that image analysis based on regional segmentation improves the outcome drastically.
The proposed simple and intuitive screening protocol showed highly sensitive performance in a single-shot examination with the capability of localized RNFL estimation. The criteria of our screening protocol based on thinning level predictions could be useful to tell whether the patient's eye is glaucomatous or normal. However, the thinning level is not sufficient to reveal quantitatively how far the RNFL defect has progressed. In contrast, the quantitative prediction of RNFL thickness directly confirms how far the retinal deformation has progressed. In a previous study, the potential of DL for predicting the progression of glaucomatous RNFL defects was investigated based on the averaged level [50]. The prediction of averaged RNFL is useful to confirm the trend of progression. However, regional RNFL thickness prediction can provide more detailed information. Thus, periodic observation for checking progressive glaucoma without OCT imaging can be an effective screening technique if localized retinal deformation can be tracked from fundus photographs.
Our approach can significantly extend the limits of the qualitative and subjective information available from fundus photographs alone, but it still needs further improvement before adoption in clinical practice. We believe the accuracy of RNFL thickness prediction can be improved further if additional data are used to train the model. However, even if a trained CNN can predict quantitative retinal deformation precisely, the gap between OCT measurement and CNN-based estimation depends critically on the imaging quality of the fundus photograph. Figure 8 shows examples of prediction failures. If the fundus photograph is blurry, low in contrast, or insufficiently bright, the prediction of RNFL thickness becomes inaccurate and the thinning label is consequently unreliable. Thus, a clear fundus photo is an essential prerequisite for accurate prediction. In addition, the screening capability of the proposed technique could be further enhanced if the DL model included other risk factors such as intraocular pressure, history of diabetes, hypertension, and high myopia. The current approach could also incorporate additional quantitative values such as rim thinning, notching, and cup-to-disc ratio to improve the credibility of glaucoma screening [33,34,51].
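Because prediction quality depends on image quality, a simple check could reject unsuitable photographs before thickness estimation. The sketch below computes three basic measures on a grayscale image: mean intensity for brightness, standard deviation for contrast, and variance of a discrete Laplacian as a common blur indicator. It is a minimal illustration, not the rejection procedure used in this study: the function names and all three thresholds are hypothetical and would need calibration on real fundus images. The image is represented as a plain list of rows to keep the sketch dependency-free.

```python
from statistics import fmean, pstdev, pvariance


def quality_metrics(gray):
    """Brightness, contrast, and sharpness of a grayscale image.

    gray: 2D list of pixel intensities (equal-length rows, at least 3x3).
    """
    h, w = len(gray), len(gray[0])
    pixels = [p for row in gray for p in row]
    # 4-neighbour discrete Laplacian on interior pixels; low variance suggests blur.
    lap = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1] - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return {
        "brightness": fmean(pixels),
        "contrast": pstdev(pixels),
        "sharpness": pvariance(lap),
    }


def passes_quality(gray, min_brightness=40.0, min_contrast=20.0, min_sharpness=50.0):
    """True if all three measures meet their (hypothetical) thresholds."""
    m = quality_metrics(gray)
    return (
        m["brightness"] >= min_brightness
        and m["contrast"] >= min_contrast
        and m["sharpness"] >= min_sharpness
    )
```

Photographs that fail such a gate could be retaken rather than passed to the model, which avoids producing an unreliable thinning label from an unclear image.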
As glaucoma is known to be age-related, its global incidence is expected to rise as the population ages. In particular, glaucoma is a severe problem in many developing countries because a large proportion of cases remain undiagnosed or sub-optimally managed due to a lack of diagnostic and screening tools. Thus, the availability of low-cost imaging devices for quick and easy screening would greatly benefit glaucoma care in low-resource settings. In future work, a fully developed CNN model will be tested with a low-cost, handheld fundus device integrated in an ophthalmoscope [9,10,52]. Our technique based on fundus photography and DL may be further extended to detect other retinal conditions, such as diabetic retinopathy and age-related macular degeneration, while enabling accurate diagnosis even in low-resource settings.
Figure 8. Examples of prediction failure. The model fails to estimate RNFL thickness accurately when the quality of the fundus photograph is poor in terms of contrast, brightness, and blurriness, and consequently arrives at an incorrect thinning level prediction. Data are based on rejected samples.

Conclusions
We present a deep learning approach that exploits the advantages of color fundus photography for glaucoma screening. A fundus photograph is augmented with estimated RNFL thickness based on OCT data, revealing morphological changes in the retina that cannot be obtained from color fundus photography alone. Our approach for detecting glaucoma, developed using data from NTG patients, gives a comprehensive and informative outcome and potentially offers improved screening capability over existing methods.