Automated Cone Cell Identification on Adaptive Optics Scanning Laser Ophthalmoscope Images Based on TV-L1 Optical Flow Registration and K-Means Clustering

Abstract: Cone cell identification is essential for diagnosing and studying eye diseases. In this paper, we propose an automated cone cell identification method that involves TV-L1 optical flow estimation and K-means clustering. The proposed algorithm consists of the following steps: image denoising based on TV-L1 optical flow registration, bias field correction, cone cell identification based on K-means clustering, duplicate identification removal, identification based on threshold segmentation, and merging of closely spaced identified cone cells. Compared with manually labeled ground-truth images, the proposed method shows high effectiveness with precision, recall, and F1 scores of 93.10%, 94.97%, and 94.03%, respectively. The method performance is further evaluated on adaptive optics scanning laser ophthalmoscope images obtained from a healthy subject with low cone cell density and subjects with either diabetic retinopathy or acute zonal occult outer retinopathy. The evaluation results demonstrate that the proposed method can accurately identify cone cells in subjects with healthy retinas and retinal diseases.

Automated cone cell identification mainly comprises image denoising and target identification. Reliable denoising of AO-SLO images often relies on averaging multiple AO-SLO images. However, a directly averaged AO-SLO image is blurred by eye motion, so the AO-SLO images must be registered before averaging to correct motion artifacts. Although hardware-based registration is effective [3,4,26–30], software-based registration is more commonly used because of its lower cost [31–36]. For software-based registration, optical flow estimation provides high performance [36]. However, this registration method requires preprocessing that increases its complexity. To avoid preprocessing, we adopt TV-L1 optical flow [37] to achieve direct software-based AO-SLO image registration. Then, for target identification, we use K-means clustering [38], which, despite being a common unsupervised learning algorithm, has not previously been applied to cone cell identification.
To verify the effectiveness of the proposed method, we compared it with manual labeling to obtain various evaluation measures, namely, recall, precision, and F1 score. We further evaluated the identification performance of the proposed method on AO-SLO images from a healthy subject with low cone cell density and from subjects with either diabetic retinopathy or acute zonal occult outer retinopathy, to the best of our knowledge for the first time.
Figure 1 shows the flowchart of the proposed cone cell identification method, which comprises six steps: (1) image denoising, (2) bias field correction, (3) cone cell identification based on K-means clustering [38], (4) duplicate identification removal, (5) identification based on threshold segmentation, and (6) merging of closely spaced identified cone cells. First, AO-SLO images are denoised by averaging multiple images registered by TV-L1 optical flow estimation [37]. Second, bias field correction [39] is applied to the denoised image, and the corrected image is isotropically magnified four times via bicubic interpolation. Third, cone cells are roughly identified by K-means clustering [38]. Fourth, duplicate identification results are removed. Fifth, a threshold is calculated from the identification results and used to segment the corrected image; the segmentation results are used to generate the cone cell identification results. Finally, the identification outcome is obtained by merging closely spaced identified cone cells [16].
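Putting the six steps together, the pipeline can be summarized by the short sketch below. This is not the authors' implementation: every helper it calls is a hypothetical function, and simplified versions of each are sketched in the corresponding subsections below; details such as patch-wise processing, masking, and data-type conversions are omitted.

```python
def identify_cone_cells(frames):
    """End-to-end sketch of the six-step pipeline. All helper functions are
    hypothetical; simplified versions are sketched in the sections below."""
    denoised = denoise_by_registration(frames)                    # (1) TV-L1 registration + averaging
    corrected = correct_bias_field(denoised)                      # (2) bias field correction + 4x bicubic
    rough = identify_cones_kmeans(corrected)                      # (3) rough identification (per patch in the paper)
    deduplicated = remove_duplicates(rough, corrected)            # (4) duplicate identification removal
    refined = identify_by_threshold(corrected, deduplicated)      # (5) threshold segmentation
    return merge_close_identifications(refined, corrected.shape)  # (6) merge closely spaced identifications
```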

AO-SLO Image Denoising
The human eye has limited resilience to light exposure. Therefore, to prevent optical damage, the illumination power used for AO-SLO imaging should be kept low. However, low illumination can lead to a low signal-to-noise ratio in AO-SLO images. To increase this ratio, multiple registered AO-SLO images can be averaged. To this end, we adopt TV-L1 optical flow estimation [37] for image registration and then average the registered images after screening. The flowchart of image denoising based on registered images is shown in Figure 2. First, the middle image in the time sequence is selected as the reference for the registration of the other images. Second, TV-L1 optical flow registration is applied to each image to be registered. Third, the registration is evaluated: registered images whose structural similarity index [40] with respect to the reference image is below 0.5 under masks are discarded. If the number of remaining registered images is below one-fifth of the number of acquired images, the algorithm selects the image next closest to the middle of the time sequence as the new reference image. When the number of successfully registered images is sufficient, they are averaged together with the reference image under masks to obtain the denoised AO-SLO image. Figure 3 shows an example of image denoising on a representative AO-SLO image patch using the proposed method. Note that image noise is markedly reduced after denoising, indicating the high effectiveness of the method.
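The sketch below illustrates this screening-and-averaging scheme using scikit-image's TV-L1 optical flow and SSIM implementations. It is a minimal illustration rather than the authors' code: the mask handling mentioned above is omitted, the function name and defaults are ours, and only the SSIM cutoff of 0.5 and the one-fifth frame requirement follow the text.

```python
import numpy as np
from skimage.metrics import structural_similarity
from skimage.registration import optical_flow_tvl1
from skimage.transform import warp

def denoise_by_registration(frames, ssim_threshold=0.5, min_fraction=0.2):
    """Register every frame to a reference with TV-L1 optical flow, keep the
    frames whose SSIM to the reference is at least ssim_threshold, and average
    them. Mask handling is omitted in this sketch."""
    frames = [np.asarray(f, dtype=np.float64) for f in frames]
    nr, nc = frames[0].shape
    rows, cols = np.meshgrid(np.arange(nr), np.arange(nc), indexing="ij")
    # Try the middle frame first, then the frames closest to the middle.
    order = np.argsort(np.abs(np.arange(len(frames)) - len(frames) // 2))
    for ref_idx in order:
        reference = frames[ref_idx]
        kept = [reference]
        for i, moving in enumerate(frames):
            if i == ref_idx:
                continue
            v, u = optical_flow_tvl1(reference, moving)        # dense TV-L1 flow
            registered = warp(moving, np.array([rows + v, cols + u]),
                              mode="edge", preserve_range=True)
            ssim = structural_similarity(
                reference, registered,
                data_range=reference.max() - reference.min())
            if ssim >= ssim_threshold:
                kept.append(registered)
        if len(kept) - 1 >= min_fraction * len(frames):        # enough frames registered
            return np.mean(kept, axis=0)
    return np.mean(kept, axis=0)  # fallback: average of the last attempt
```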

Bias Field Correction
To mitigate intensity differences across the cone cells in an AO-SLO image and improve the identification accuracy, we apply bias field correction [39] to the denoised AO-SLO images. First, an intensity bias field image is created by applying a Gaussian filter to the denoised image:
Bias field image(x, y) = Gaussian filter(Denoised image)(x, y) (1)
The denoised image is then corrected by dividing out the bias field image [39]:
Bias field corrected image(x, y) = Mean(Bias field image) × Denoised image(x, y) / Bias field image(x, y) (2)
Figure 4 illustrates bias field correction performed on a representative AO-SLO image. The intensity of the cone cells is more uniform after bias field correction. To accurately identify the cone cells, we also isotropically magnify the corrected image four times via bicubic interpolation after bias field correction and before identification.
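A compact sketch of Equations (1) and (2) followed by the 4× bicubic magnification is given below, using OpenCV. The Gaussian sigma is an illustrative value; the paper does not state the filter width.

```python
import cv2
import numpy as np

def correct_bias_field(denoised, sigma=20, upscale=4):
    """Bias field correction per Equations (1) and (2), followed by 4x
    bicubic magnification. sigma is an illustrative value."""
    denoised = denoised.astype(np.float64)
    bias = cv2.GaussianBlur(denoised, (0, 0), sigmaX=sigma)      # Eq. (1)
    corrected = bias.mean() * denoised / np.maximum(bias, 1e-6)  # Eq. (2)
    h, w = corrected.shape
    return cv2.resize(corrected, (w * upscale, h * upscale),
                      interpolation=cv2.INTER_CUBIC)
```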


Identification Based on K-Means Clustering
K-means clustering [38] is the most time-consuming operation in the proposed cone cell identification method. To reduce its computational time, we divide the bias-field-corrected image into image patches before identification, identify the cone cells on each patch separately, and join the identification results afterward. To accurately identify the cone cells near the edges of the image patches, we add extra borders of 10% of the patch side length before identification. These borders only support identification within the patches; cone cells identified inside the borders are discarded from the final identification results.
For identification on each image patch, we first apply histogram equalization. Second, the pixels are divided into three clusters by K-means clustering according to their intensities. Third, we generate a mask for the cluster that contains the largest number of pixels because this cluster represents either the cone cells or the background. Fourth, we extract the contours of this mask using the findContours function of the OpenCV Python library. Then, we take the centroid of the area inside each contour to identify the cone cells. Figure 5 shows an example of cone cell identification based on K-means clustering on a representative image patch. The cone cells are roughly identified. Nevertheless, overidentification errors occur, as some cone cells are identified more than once.
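A minimal sketch of this per-patch identification is shown below, assuming an 8-bit grayscale patch. The K-means termination criteria and number of attempts are illustrative choices, and deciding whether the largest cluster corresponds to cone cells or to background is left out of the sketch.

```python
import cv2
import numpy as np

def identify_cones_kmeans(patch, k=3):
    """Rough cone identification on one 8-bit image patch: histogram
    equalization, K-means clustering of pixel intensities (k = 3 as in the
    text), a mask from the most populated cluster, and contour centroids."""
    patch = cv2.equalizeHist(patch.astype(np.uint8))
    samples = patch.reshape(-1, 1).astype(np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 0.2)
    _, labels, _ = cv2.kmeans(samples, k, None, criteria, 10,
                              cv2.KMEANS_PP_CENTERS)
    largest = np.bincount(labels.ravel()).argmax()           # most populated cluster
    mask = (labels.reshape(patch.shape) == largest).astype(np.uint8) * 255
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    centroids = []
    for c in contours:
        m = cv2.moments(c)
        if m["m00"] > 0:                                     # skip degenerate contours
            centroids.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
    return centroids
```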


Duplicate Identification Removal
To correct most overidentification errors, we remove duplicate identification results with low intensities. Specifically, we first divide the identification results into groups such that the maximum pairwise distance within each group is below a threshold. Second, we keep only the identification result with the highest intensity in each group and remove the likely duplicates. Thus, only one identification result is preserved per group. Figure 6 shows an example of duplicate identification removal using the image patch shown in Figure 5. The overidentification visible in Figure 5 is corrected after duplicate identification removal.
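One way to realize this grouping is sketched below. The paper does not name its grouping procedure; complete-linkage clustering is used here because cutting its dendrogram at a distance threshold bounds the maximum pairwise distance within each group. The function name and max_dist value are illustrative.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def remove_duplicates(points, image, max_dist=5.0):
    """Group (x, y) identifications so that the maximum pairwise distance
    within a group stays below max_dist, then keep only the brightest
    identification per group."""
    points = np.asarray(points, dtype=float)
    if len(points) < 2:
        return points
    clusters = fcluster(linkage(points, method="complete"),
                        t=max_dist, criterion="distance")
    kept = []
    for c in np.unique(clusters):
        members = points[clusters == c]
        intensities = [image[int(round(y)), int(round(x))] for x, y in members]
        kept.append(members[int(np.argmax(intensities))])
    return np.array(kept)
```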


Identification Based on Threshold Segmentation
To further improve the accuracy of cone cell identification at high-intensity locations, we apply threshold segmentation to the bias-field-corrected image (Section 2.2). The threshold is determined as follows: (1) a mask is generated that includes the identification results obtained in Section 2.4 and their surroundings within 4 pixels of them; (2) the mean value of the bias-field-corrected image under the mask is calculated; (3) this mean value is set as the segmentation threshold. After segmenting the bias-field-corrected image with this threshold, we extract the contours of the segmentation results using the findContours function of OpenCV and take the centroid of the area inside each contour to identify the cone cells. Figure 7 shows an example of threshold segmentation applied to a representative image patch. The cone cells are identified more accurately after threshold segmentation.
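The steps above can be sketched as follows. This is a hedged illustration rather than the authors' exact implementation; the 4-pixel neighborhood follows the text, while the circular mask shape is our choice.

```python
import cv2
import numpy as np

def identify_by_threshold(corrected, points, radius=4):
    """Mask the current identifications and their surroundings within
    `radius` pixels, threshold the bias-field-corrected image at the mean
    intensity under that mask, and return centroids of segmented regions."""
    mask = np.zeros(corrected.shape, dtype=np.uint8)
    for x, y in points:
        cv2.circle(mask, (int(round(x)), int(round(y))), radius, 255, -1)
    threshold = corrected[mask > 0].mean()
    binary = (corrected > threshold).astype(np.uint8) * 255
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    centroids = []
    for c in contours:
        m = cv2.moments(c)
        if m["m00"] > 0:
            centroids.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
    return centroids
```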


Merging of Closely Spaced Identification Results
After threshold segmentation, some cone cells are still identified more than once. To mitigate this duplicate identification, we merge closely spaced identified cone cells using the method in [16], which combines several closely spaced identification results into one. Specifically, morphological dilation is applied to the identification results (the morphologyEx function of OpenCV), and the centroid of each resulting region is obtained. Then, closely spaced identified cone cells are combined, yielding a refined identification near the middle of the previously identified cone cells. Figure 8 shows an example of merging closely spaced identified cone cells on a representative image patch. Closely spaced cone cells are merged, thus improving the identification results.
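A minimal sketch of this merging step is given below, in the spirit of [16]: each identified point is drawn on a blank canvas, dilation joins nearby points into one connected region, and each region's centroid becomes the merged identification. The structuring-element size is an illustrative choice not stated in the text.

```python
import cv2
import numpy as np

def merge_close_identifications(points, shape, kernel_size=5):
    """Merge closely spaced (x, y) identifications by dilating point marks
    and taking the centroid of each connected region."""
    canvas = np.zeros(shape, dtype=np.uint8)
    for x, y in points:
        canvas[int(round(y)), int(round(x))] = 255
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,
                                       (kernel_size, kernel_size))
    dilated = cv2.morphologyEx(canvas, cv2.MORPH_DILATE, kernel)
    _, _, _, centroids = cv2.connectedComponentsWithStats(dilated)
    return centroids[1:]  # centroid 0 is the background component
```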

Results
An AO-SLO system with an acquisition rate of 30 Hz was used to image the posterior part of the eye of human subjects. The field of view and frame size are 1.5° and 512 × 449 pixels, respectively. Thus, a transverse region of 445 × 445 µm was scanned using an effective focal length of 17 mm for the eye. More details of the AO-SLO system can be found in [41]. Before imaging, the pupil was dilated with 1% tropicamide and 2.5% phenylephrine hydrochloride to increase its diameter to 6–8 mm. Throughout the procedure, light exposure was maintained in accordance with the maximum permissible exposure limits specified by the American National Standards Institute [42].
The automated processing of a 100 × 100-pixel image required 8.72 s for denoising based on TV-L1 optical flow registration, 0.01 s for bias field correction, 0.52 s for identification based on K-means clustering, 0.01 s for duplicate identification removal, 1.56 s for identification based on threshold segmentation, and 1.65 s for merging of closely spaced identification results. These computational times were obtained on a system equipped with an Intel Core i5-9400 CPU at 2.90 GHz, an NVIDIA GeForce GTX 1660 Ti graphics card, and 16.0 GB of memory. The algorithms were implemented in 64-bit Python: numerical calculations mainly used the NumPy and SciPy libraries, and the computer vision algorithms, including K-means, mainly used the OpenCV library.
To verify the effectiveness of the proposed method for cone cell identification, we acquired five retinal images around the foveola of healthy subjects (n = 5). The proposed method successfully identified the cone cells of all five subjects. The identification results on three representative images are shown in Figure 9. Regarding manual cone cell identification as the ground truth, the overall precision, recall, and F1 score of the identification are listed in Table 1. The proposed method provides high identification accuracy near the foveola of the healthy subjects.

We further evaluated the performance of the proposed method on different types of AO-SLO images. Figure 10 shows three examples of AO-SLO images [43,44] and their corresponding cone cell identification results. The first example (Figure 10(a1,b1)) shows an AO-SLO image acquired at a different retinal location of the same healthy eye, with a cone cell density below that of the examples shown in Figure 9. The examples in Figure 10(a2,b2) and Figure 10(a3,b3) show AO-SLO images of an eye with diabetic retinopathy [43] and an eye with acute zonal occult outer retinopathy [44], respectively, along with the corresponding cone cell identification results. The proposed method accurately identifies cone cells in AO-SLO images of eyes with low cone cell density, diabetic retinopathy, and acute zonal occult outer retinopathy.

Discussion
One future direction for denoising AO-SLO images is to apply deep learning-based image registration and then average the registered images. The deep learning registration model could be trained on natural images and applied to AO-SLO images. For automated cone cell identification, other unsupervised clustering methods, which are known for their high accuracy in image segmentation, could be applied to AO-SLO images to obtain highly accurate cone cell identification results in some pathological cases.

Conclusions
We propose an automated cone cell identification method that uses TV-L1 optical flow registration and K-means clustering on AO-SLO images. The proposed method successfully achieves denoising based on TV-L1 optical flow registration, bias field correction, identification based on K-means clustering, and merging of closely spaced identified cone cells, as verified experimentally. To evaluate the performance of the proposed method, we compared its automated cone cell identification with manual labeling. The proposed method achieves precision, recall, and F1 scores of 93.10%, 94.97%, and 94.03%, respectively. Furthermore, the proposed method exhibits high-performance cone cell identification on AO-SLO images of eyes with low cone cell density, diabetic retinopathy, and acute zonal occult outer retinopathy.
Data Availability Statement: The data are not publicly available because they contain information that could compromise research participant privacy.