Article

Novel Application of Long Short-Term Memory Network for 3D to 2D Retinal Vessel Segmentation in Adaptive Optics—Optical Coherence Tomography Volumes

1 Department of Ophthalmology and Visual Sciences, University of Maryland, Baltimore, MD 21201, USA
2 Biological and Agricultural Engineering Department, University of Arkansas, Fayetteville, AR 72701, USA
3 Bioengineering Department, University of Maryland, College Park, MD 20742, USA
4 Center for Devices and Radiological Health (CDRH), U.S. Food and Drug Administration, Silver Spring, MD 20993, USA
* Author to whom correspondence should be addressed.
Appl. Sci. 2021, 11(20), 9475; https://doi.org/10.3390/app11209475
Submission received: 10 August 2021 / Revised: 6 October 2021 / Accepted: 6 October 2021 / Published: 12 October 2021
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅲ)


Abstract

Adaptive optics—optical coherence tomography (AO-OCT) is a non-invasive technique for imaging retinal vascular and structural features at cellular-level resolution. Although retinal blood vessel density is an important biomarker for ocular diseases, particularly glaucoma, automated blood vessel segmentation tools for AO-OCT have not yet been explored. One reason for this is that AO-OCT allows for variable input axial dimensions, which are not well accommodated by 2D-2D or 3D-3D segmentation tools. We propose a novel bidirectional long short-term memory (LSTM)-based network for 3D-2D segmentation of blood vessels within AO-OCT volumes. This technique incorporates inter-slice connectivity and allows for a variable number of input slices. We compared the proposed model to a standard 2D UNet segmentation network that considers only volume projections. Furthermore, we expanded the proposed LSTM-based network with an additional UNet to evaluate whether it refines network performance. We trained, validated, and tested these architectures on 177 AO-OCT volumes collected from 18 control and glaucoma subjects. The LSTM-UNet showed statistically significant improvements (p < 0.05) in AUC (0.88) and recall (0.80) compared to the UNet alone (0.83 and 0.70, respectively). The LSTM-based approaches had longer evaluation times than the UNet alone. This study shows that a bidirectional convolutional LSTM module improves standard automated vessel segmentation in AO-OCT volumes, albeit at a higher time cost.

1. Introduction

Adaptive optics optical coherence tomography (AO-OCT) imaging is a non-invasive technique that provides improved lateral resolution compared to traditional optical coherence tomography (OCT) by using adaptive optics (AO) to correct for ocular aberrations. With AO-OCT, it is now possible to obtain three-dimensional (3D), cellular-level resolution images of the retina and optic nerve head to study ocular physiology and diseases [1,2,3,4,5,6,7,8]. As an example of these cellular resolution capabilities, AO-OCT-based methods can reliably quantify retinal ganglion cell (RGC) soma morphology [8,9,10,11] and distinguish individual retinal vessels [12,13,14]. OCT and AO-OCT collect and register sequences of cross-sectional scans of the retina that are combined to form 3D volumes. These volumes can be probed to observe vasculature changes in their natural depth-layered plexuses in the en face plane. The anatomic relationships of these plexuses have been well characterized [15]. In the parafoveal and perifoveal macula, retinal vessels separate into three distinct vascular plexuses: the superficial vascular plexus (SVP), intermediate capillary plexus (ICP), and deep capillary plexus (DCP) [15]. The SVP primarily nourishes retinal ganglion cells in the ganglion cell layer (GCL) and may also be connected to the radial peripapillary capillary complex (RPCP), which supplies the retinal nerve fiber layer (RNFL) in the peripapillary region. The ICP lies deep to the SVP and supplies the dendritic synapses in the GCL as well as the cells in the inner nuclear layer (INL). Finally, the DCP is at the base of the INL and mainly supplies the bipolar and horizontal cells and their connections to the photoreceptor outer nuclear layer [16].
Glaucoma is an ocular disease that significantly affects the inner retina, specifically the RGC somas and the vasculature that supplies them [17]. It is a leading cause of irreversible blindness, with a projected global disease prevalence of greater than 111.8 million by 2040 [18]. Reduction of intraocular pressure (IOP) is the only current treatment for the disease, but glaucoma can worsen even with adequate IOP control, and up to one-third of patients develop glaucoma with an IOP in the normal range [19]. This indicates the need to understand non-IOP-related factors that contribute to the disease, including retinal vascular dysfunction [17,20]. Previous studies have shown that glaucoma, and specifically thinning of the GCL, is associated with lower retinal vascular density in OCT images [21]. However, GCL thickness is only an approximate surrogate for RGC density. AO-OCT provides the capability to simultaneously measure RGC and vessel characteristics, and we have previously leveraged it to determine the association between RGC density and vessel density using a laborious, semi-automated process [22].
To further explore the relationship between RGC damage and vascular dysfunction characterized by vessel drop-out, automated quantification methods that extract these metrics from AO-OCT volumes are needed. A weakly supervised deep learning method has been investigated for automatically quantifying individual ganglion cell layer somas in AO-OCT volumes [9]. However, automated vessel segmentation in AO-OCT volumes, which resolve vessels down to the capillary scale, has not yet been explored. Automated retinal vessel segmentation tools for this modality will become increasingly useful and relevant as AO-OCT approaches clinical translation.
Retinal blood vessel segmentation using deep learning is an area of active research in other retinal imaging modalities, such as traditional OCT and OCT angiography (OCTA) [23,24]. While OCTA collects multiple axial cross-sections, commonly referred to as B-scans, which can be used to form a 3D volume, this volume is typically projected onto an en face 2D plane for vessel segmentation and interpretation. 2D-2D segmentation loses the inter-slice connectivity between the en face plane slices of the 3D volume, which can be valuable context for automated segmentation. However, training end-to-end 3D-3D segmentation models, which receive the 3D volume and output 3D vessel labels, is also challenging. 3D-3D convolutional techniques require uniform input sizes [25,26], which may be an obstacle given the variability in retinal layer thickness across patients [27,28]. Furthermore, acquiring 3D labeled data for deep learning algorithm training is costly, requiring up to 50 h per scan to label, even for an expert grader [29]. Such models are also computationally expensive for high-resolution data and can be more difficult to interpret for providers and researchers accustomed to a 2D en face view of vascular networks. 3D-2D segmentation, which receives the entire 3D volume as input and generates a 2D label, is a possible tool to leverage inter-slice connectivity while using relatively low-cost labels from 2D segmentation maps. Recurrent neural networks (RNNs) are deep learning architectures originally designed to model sequential information with dependence on previous states, such as language processing or time-series forecasting. Notably, these tools do not require a uniform input size, which offers a unique advantage for segmenting AO-OCT vessel images of variable axial thickness. The primary purpose of this study is to investigate the performance of a novel RNN, specifically a bidirectional convolutional long short-term memory (LSTM) network, for 3D-2D vessel segmentation of AO-OCT volumes compared to traditional 2D-2D segmentation.

2. Materials and Methods

2.1. In-Vivo Adaptive Optics Imaging

This study used AO-OCT data collected from a study of ganglion cell layer soma quantification in 18 subjects (six glaucoma and 12 control subjects). The data from the glaucoma and control subjects were combined for this study, and no analysis differentiated disease state. The full details of subject clinical assessment and AO imaging are found in previously published papers [8,22]. Pupillary dilation and cycloplegia were achieved with 1% tropicamide, and subjects were subsequently imaged using the FDA multimodal adaptive optics (mAO) device previously described [8,30]. We examined 1.5° × 1.5° regions located symmetrically 2.5° superior and inferior to the horizontal midline at eccentricities of 3°, 6°, and 12° in the temporal retina (Figure 1A). The AO focus was set approximately to the ganglion cell layer, and 300 AO-OCT volumes were collected, registered, and averaged at each location.
In the AO-OCT volumes, the SVP, ICP, and DCP were segmented separately by creating en face average intensity projections across the axial pixels in which each plexus resides (Figure 2) [15]. After ImageJ automatic contrast enhancement, all vessels in each en face projection were manually labeled by a single expert grader (co-author R.V.) using a uniform brush size for each capillary segment; for each capillary branch, the brush size was readjusted according to the grader's visual estimate of that segment's vessel width. These tracings were binarized in ImageJ, reviewed for quality by the principal investigator, and used as the ground-truth standard for our models (Figure 2) [31].
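For illustration, the projection and contrast steps can be sketched in a few lines of Python. This is a minimal sketch, not the authors' pipeline: the percentile-based stretch only approximates ImageJ's default auto-contrast behavior (0.35% saturated pixels per tail), and the plexus slice bounds (z_start, z_end) and array sizes are hypothetical placeholders.

```python
import numpy as np

def en_face_projection(volume: np.ndarray, z_start: int, z_end: int) -> np.ndarray:
    """Average-intensity en face projection over axial slices [z_start, z_end)."""
    proj = volume[z_start:z_end].mean(axis=0)
    # Percentile-based histogram stretch, approximating ImageJ auto-contrast.
    lo, hi = np.percentile(proj, (0.35, 99.65))
    stretched = (proj - lo) / max(hi - lo, 1e-8) * 255.0
    return np.clip(stretched, 0, 255).astype(np.uint8)

# Hypothetical example: project an SVP band of a (slices, H, W) AO-OCT volume.
volume = np.random.rand(40, 512, 512).astype(np.float32)  # placeholder data
svp_projection = en_face_projection(volume, z_start=0, z_end=15)
```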

2.2. Nested Model Architectures

Three models, referred to as UNet, LSTM-UNet, and UNet-LSTM-UNet, were designed and examined. To sequentially evaluate the benefit conferred by each architectural addition, the models were nested such that the LSTM-UNet incorporated the UNet architecture and the UNet-LSTM-UNet incorporated the LSTM-UNet architecture.
The UNet architecture was selected as the base architecture, as is common for medical image segmentation, including previous OCT and retinal vascular imaging applications [23,24,32]. Our standard UNet received a 2D image as input and output a 2D image through an encoder, skip connections, and a decoder. The encoder used convolutional layers to extract features at variable input resolutions. Max-pooling during encoding shrank the input resolution, further allowing identification of segmentation features at different scales. At each resolution, skip connections concatenated the convolution layer output to the corresponding resolution in the decoding pathway. The decoder used these convolution outputs along with the up-sampled images as inputs to deconvolution layers to generate the segmentation output following sigmoid activation. The resulting output has two channels, one for vessel activation and one for background activation. Our base model had a network depth of 3, utilizing skip connections at each layer (Figure 1A). We also performed batch normalization and transformation with a leaky rectified linear unit (slope = 0.01) at each convolutional layer.
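A minimal PyTorch sketch of a depth-3 UNet of this kind is shown below. It is an illustrative reconstruction from the description above, not the authors' code; the channel widths (base = 16) and single-channel input are assumptions made for the sketch.

```python
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Two 3x3 convolutions, each followed by batch normalization and a
    leaky ReLU with slope 0.01, as described in the text."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.01),
        nn.Conv2d(out_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.01),
    )

class UNet(nn.Module):
    """Depth-3 UNet: encoder, skip connections at each resolution, decoder,
    and a two-channel (vessel/background) sigmoid output."""
    def __init__(self, in_ch: int = 1, base: int = 16):
        super().__init__()
        self.enc1 = conv_block(in_ch, base)
        self.enc2 = conv_block(base, base * 2)
        self.enc3 = conv_block(base * 2, base * 4)
        self.pool = nn.MaxPool2d(2)
        self.up2 = nn.ConvTranspose2d(base * 4, base * 2, 2, stride=2)
        self.dec2 = conv_block(base * 4, base * 2)
        self.up1 = nn.ConvTranspose2d(base * 2, base, 2, stride=2)
        self.dec1 = conv_block(base * 2, base)
        self.head = nn.Conv2d(base, 2, 1)  # vessel and background channels

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        e3 = self.enc3(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(e3), e2], dim=1))  # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))  # skip connection
        return torch.sigmoid(self.head(d1))

# seg = UNet()(torch.randn(1, 1, 64, 64))  # -> (1, 2, 64, 64)
```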
The LSTM-UNet is our proposed 3D-2D deep learning framework that uses bidirectional convolutional LSTM networks to incorporate 3D context (i.e., inter-slice connectivity) from 3D input stacks of unfixed depth and output 2D segmentation maps. LSTM networks are a form of RNN that extends the base RNN's ability to model sequential data by updating a hidden state representation through input-to-state and state-to-state operations (namely, an input gate, output gate, forget gate, and cell state), which better capture long-term sequential dependencies [33]. Importantly, an LSTM model allows for varying numbers of input slices, which is necessary for our task, as the volumes containing each vessel plexus can vary in size given retinal layer thickness variability across patient populations [27,28]. While the standard LSTM classically uses fully connected layers for language processing applications, we employed a convolutional LSTM, which uses convolutional structures in the input-to-state and state-to-state transitions to more efficiently handle sequential spatial data (Figure 1B) [34]. Furthermore, to incorporate inter-slice context from both the previous and following slices, we presented slices to two distinct LSTM units in forward (top-to-bottom) and reverse (bottom-to-top) order and concatenated the outputs to form a bidirectional convolutional LSTM (Figure 1C) [35]. In our LSTM-UNet architecture, we used bidirectional LSTM units at three input resolutions during image encoding (Figure 3). The encoded outputs were concatenated with up-sampled images during decoding, eventually resulting in a single-channel 2D output that was fed into a UNet unit.
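The gating and bidirectional slice traversal can be sketched as follows. This is a schematic reconstruction, assuming each direction's final hidden state summarizes the slice stack; the published architecture instead embeds Bi-cLSTM units at three encoder resolutions, which this simplified sketch omits.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Convolutional LSTM cell: all four gates are computed by one convolution
    over the concatenated [input, hidden] feature maps [34]."""
    def __init__(self, in_ch: int, hid_ch: int, kernel: int = 3):
        super().__init__()
        self.hid_ch = hid_ch
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, kernel, padding=kernel // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)  # cell state
        h = torch.sigmoid(o) * torch.tanh(c)                          # hidden state
        return h, c

class BiConvLSTM(nn.Module):
    """Runs one ConvLSTM top-to-bottom and another bottom-to-top over the slice
    axis, then concatenates the two final hidden states, yielding a fixed-depth
    2D feature map regardless of the number of input slices."""
    def __init__(self, in_ch: int, hid_ch: int):
        super().__init__()
        self.fwd = ConvLSTMCell(in_ch, hid_ch)
        self.bwd = ConvLSTMCell(in_ch, hid_ch)

    @staticmethod
    def _run(cell, slices):
        b, n, _, h, w = slices.shape  # (batch, n_slices, channels, H, W)
        state = (slices.new_zeros(b, cell.hid_ch, h, w),
                 slices.new_zeros(b, cell.hid_ch, h, w))
        for t in range(n):
            state = cell(slices[:, t], state)
        return state[0]  # final hidden state

    def forward(self, slices):
        return torch.cat([self._run(self.fwd, slices),
                          self._run(self.bwd, slices.flip(1))], dim=1)

# out = BiConvLSTM(in_ch=1, hid_ch=8)(torch.randn(1, 30, 1, 64, 64))  # (1, 16, 64, 64)
```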
The UNet-LSTM-UNet appends another UNet as an additional image pre-processing step before each slice is fed into the LSTM-UNet. This design, inspired by cascading architectures for brain tumor segmentation [36], likewise allows for volumes with variable slice numbers and 3D-2D segmentation, and it provides a greater number of trainable convolutions and parameters earlier in the architecture, potentially allowing for improved labeling of vessels within each slice.
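Under the same caveats, the nesting of the three architectures can be expressed compositionally, reusing the UNet and BiConvLSTM sketches above. All class names and channel counts are illustrative; in particular, taking only the vessel channel of the pre-processing UNet as the slice representation is an assumption of this sketch.

```python
import torch
import torch.nn as nn

class LSTMUNet(nn.Module):
    """Bi-cLSTM collapses the slice stack to one 2D feature map, which a UNet
    then refines into the final 2D vessel probability map."""
    def __init__(self):
        super().__init__()
        self.encoder = BiConvLSTM(in_ch=1, hid_ch=8)
        self.unet = UNet(in_ch=16)  # 2 * hid_ch channels after concatenation

    def forward(self, volume):      # volume: (batch, n_slices, 1, H, W)
        return self.unet(self.encoder(volume))

class UNetLSTMUNet(nn.Module):
    """A shared per-slice UNet pre-processes every slice before the LSTM-UNet."""
    def __init__(self):
        super().__init__()
        self.pre = UNet(in_ch=1)
        self.lstm_unet = LSTMUNet()

    def forward(self, volume):
        b, n = volume.shape[:2]
        # Pre-process all slices in one batch; keep the vessel channel only
        # (an assumption made for this sketch).
        slices = self.pre(volume.flatten(0, 1))[:, :1]
        return self.lstm_unet(slices.reshape(b, n, 1, *slices.shape[-2:]))
```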

2.3. Model Training and Performance Evaluation

AO-OCT volumes (n = 177) were randomly split into training, validation, and testing datasets in a 60%–20%–20% split, respectively, ensuring an equal composition of SVP, ICP, and DCP volumes in each split. The characteristics of these splits are shown in Table 1. Each slice of each volume underwent automatic contrast adjustment following the ImageJ automatic contrast function, which is based on histogram stretching [31]. To improve model robustness, we performed image augmentation [37] on the training dataset with random horizontal flips, vertical flips, affine transformations with translation and scaling, and random cropping to patches of 64 × 64 pixels. All models were trained using a binary cross-entropy loss function. Convolutional and deconvolutional layer weights were initialized as described by He et al. [38]. All models were trained for 600 epochs using an Adam optimizer with a learning rate of 0.0001. We also employed early stopping criteria based on validation set performance with a patience of 180 epochs. During evaluation, the model with the lowest binary cross-entropy loss on the validation set for each architecture was selected as the "best model" and used to segment volumes from the testing set in 64 × 64 × N_i pixel patches, where N_i was the number of slices for volume i. The 64 × 64 segmented masks were reassembled to form the final 2D mask output.
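A condensed sketch of this training configuration, using the hyperparameters reported above, might look as follows. The data loading and per-epoch passes are elided, and the checkpoint filename is hypothetical.

```python
import torch
from torch import nn, optim

model = UNet()  # or LSTMUNet() / UNetLSTMUNet() from the sketches above

# Kaiming initialization for (de)convolutional layers, per He et al. [38],
# matching the leaky ReLU slope of 0.01.
for m in model.modules():
    if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
        nn.init.kaiming_normal_(m.weight, a=0.01)

criterion = nn.BCELoss()  # binary cross-entropy on the sigmoid outputs
optimizer = optim.Adam(model.parameters(), lr=1e-4)

best_val, patience, stagnant = float("inf"), 180, 0
for epoch in range(600):
    # ... one pass over augmented 64 x 64 training patches using
    #     criterion and optimizer, then a validation pass ...
    val_loss = 0.0  # placeholder: replace with BCE over the validation set
    if val_loss < best_val:
        best_val, stagnant = val_loss, 0
        torch.save(model.state_dict(), "best_model.pt")  # hypothetical filename
    else:
        stagnant += 1
        if stagnant >= patience:  # early stopping after 180 stagnant epochs
            break
```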
We evaluated each model's performance on the testing set with the average Dice coefficient, area under the receiver operating characteristic curve (AUC), precision, recall, and accuracy [39,40,41]. The metrics were compared between models using a one-way ANOVA and a follow-up Tukey test, with statistical significance determined at p < 0.05. We also recorded the time each model took to generate a segmentation for a single 30-slice volume, selected because its slice count was closest to the average number of slices per volume in our testing set. All algorithms were implemented on a computer with an NVIDIA GeForce RTX 2070 (8 GB) GPU and an AMD Ryzen 5 3600 6-core processor @ 3.6 GHz (16 GB RAM). All image processing, model training, and model evaluation were performed in ImageJ and PyTorch 1.7.1 using Python 3 [31,42].
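The reported pixel-wise metrics can be computed per volume along these lines; the use of scikit-learn for the AUC is our assumption, not a detail from the paper.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def segmentation_metrics(prob: np.ndarray, truth: np.ndarray, thr: float = 0.5) -> dict:
    """Pixel-wise Dice, precision, recall, accuracy, and AUC for one 2D mask.
    Assumes both vessel and background pixels are present in `truth`."""
    pred = prob.ravel() >= thr
    gt = truth.ravel().astype(bool)
    tp = np.sum(pred & gt)    # vessel pixels correctly labeled
    fp = np.sum(pred & ~gt)   # background labeled as vessel
    fn = np.sum(~pred & gt)   # vessel labeled as background
    tn = np.sum(~pred & ~gt)  # background correctly labeled
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "accuracy": (tp + tn) / gt.size,
        "auc": roc_auc_score(gt, prob.ravel()),
    }
```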

3. Results

The UNet, LSTM-UNet, and UNet-LSTM-UNet were each trained on the same 109 training AO-OCT volumes and evaluated on the same 34 testing volumes. We evaluated each model's performance on the testing set with the average Dice coefficient, area under the receiver operating characteristic curve (AUC), precision, recall, and accuracy, and we report their relative performance in Figure 4 and Table 2. Of the three models, the LSTM-UNet had the greatest average Dice coefficient (0.69), recall (0.80), and AUC (0.88), while the UNet-LSTM-UNet had the best precision (0.65). The LSTM-UNet and UNet-LSTM-UNet had similar average pixel-wise classification accuracy (0.92). Both the LSTM-UNet and UNet-LSTM-UNet had significantly better AUC (p < 0.001 for both) and recall (p < 0.001 and p = 0.001, respectively) on the testing set than the UNet alone.
We show representative examples of all three models' qualitative performance in Figure 5. Visually, all three models were affected by shadowing artifacts in the deeper layers (ICP and DCP), but the LSTM-UNet and UNet-LSTM-UNet were generally less affected than the UNet model.
Table 3 shows the number of learnable parameters and the evaluation time on a sample 30-slice volume for each architecture, demonstrating that model size was correlated with evaluation time. However, the relationship between the number of parameters and evaluation time was not linear: the LSTM-UNet and UNet-LSTM-UNet segmented the sample volume in 184.8 and 241.7 s, respectively, while the UNet alone, which had 20–25% of the parameters of the LSTM-based models, segmented a single 2D projection of the 30 slices in 1.6 s, including image loading.

4. Discussion

This is the first study to explore automated vessel segmentation in AO-OCT volumes. In this work, we found that augmenting a UNet with an LSTM significantly improved vessel segmentation performance with respect to AUC and recall on our held-out testing set compared to a UNet alone.
Previous studies have examined alternative 3D-2D vessel segmentation approaches in OCTA retinal imaging and other imaging modalities. Li et al. developed a novel image projection network (IPN) that uses a unidirectional pooling layer to effectively learn weights for each slice within the projection step [43]. This unidirectional pooling layer necessitates consistent pixel volumes as input, which would require interpolation or compression of non-uniform input for individual retinal layer blood vessel segmentation. In a large dataset of 500 OCTA volumes, their most recent iteration showed a best Dice coefficient of 0.93, a 0.03 improvement over the baseline 2D-2D UNet, which had a Dice coefficient of 0.90 on their dataset [44]. As an imaging modality, OCTA uses motion-based processing to improve vessel contrast [45], so segmentation performance on OCTA data would be expected to exceed that on AO-OCT volumes for both the base UNet and the IPN, as AO-OCT in its current form does not perform any additional processing to improve vessel contrast. In comparison to the benefit demonstrated by the 3D-2D architecture of Li et al. over the baseline UNet, we achieved a similar improvement of 0.04, with a Dice coefficient of 0.69 for our LSTM-UNet relative to 0.65 for the UNet alone. Lee et al. developed a Spider U-Net, which similarly uses a bidirectional convolutional LSTM to capture inter-slice connectivity but employs the LSTM between the encoding and decoding paths of several UNet modules, one per slice [46]. This architecture was trained and evaluated on multiple modalities, specifically brain MRA, abdominal CT, and cardiac MRI, for 3D-3D segmentation of blood vessels, with Dice coefficients for the Spider U-Net improving over the 2D UNet by 0.05, 0.13, and 0.06, respectively, for each dataset. While this architecture differs from ours in that it requires annotations for each slice within a 3D volume for training and was evaluated on (non-retinal) vessel and organ segmentation tasks, Lee et al. found that incorporating an LSTM for inter-slice connectivity produced fewer false-negative pixels. Our results are consistent with this finding, as the recall for our LSTM-based models was significantly improved over the UNet alone.
Our study also found that vessels in the ICP and DCP that are subject to shadowing artifacts evident in the raw image are more likely to be partially or fully segmented by the LSTM-based models than by the UNet alone. This finding is expected: an averaged projection of such a vessel has lower pixel intensity and is more difficult to distinguish in the 2D-2D approach, whereas the inter-slice context gained from the LSTM could assist with identification of these vessels, suggesting that deeper plexuses may particularly benefit from LSTM-based segmentation methods. When comparing the two LSTM-based architectures, we found that the UNet-LSTM-UNet had performance similar to that of the LSTM-UNet alone, with no significant differences in precision (p = 0.96), Dice coefficient (p = 0.99), recall (p = 0.61), accuracy (p = 0.99), or AUC (p = 0.65), indicating that increased parameters alone are not guaranteed to significantly improve performance. In fact, when factoring in the increased evaluation time of the UNet-LSTM-UNet model, our work suggests that the LSTM-UNet approach is superior to the higher-capacity model. Additionally, while the LSTM-based models demonstrated greater segmentation performance than the UNet alone, their evaluation time for a 30-slice volume (184.8 or 241.7 s) was 100–200 times longer than that of the projection segmentation (1.6 s). These differences indicate a trade-off between segmentation performance and speed inherent in the two approaches. Whether the time cost of LSTM-based models is a significant barrier for real-world segmentation and outweighs the benefit of higher-fidelity performance will be an important consideration when implementing these models for research or clinical use in the future.
This work is not without limitations. Our sample comprises 177 volumes collected from a limited cohort of 18 patients. We note that this is a substantial number of volumes, on a scale similar to or greater than previously published OCTA vessel segmentation datasets [47], and that, in principle, the computer vision task of classifying pixels in grayscale images should be agnostic to individual patient identity or disease state. However, more studies will be needed to ensure that these results generalize to a greater subject population and perform consistently across glaucoma and control eyes separately. Designing studies to confirm that model segmentation performance does not differ between glaucoma and control eyes is especially important if these tools are intended to quantify biomarkers that distinguish the two disease states. Additionally, as our ground-truth labels were derived from annotating the 2D projection of each volume, rather than from 3D annotations with masks for each slice, there is the possibility of human judgment affecting our models' training and performance. However, any error in manual labeling resulting from tracing the 2D projection would more likely bias our results toward the 2D-2D UNet alone, yet our 3D-2D approach remained significantly superior.

5. Conclusions

The results of this study demonstrate that augmenting traditional UNet approaches with an LSTM enables improved automated vessel segmentation in AO-OCT volumes. This 3D-2D approach would enable researchers to continue using lower-cost 2D labels on readily available 3D AO-OCT data to train deep learning tools for AO-OCT vessel segmentation.

6. Disclaimer

The mention of commercial products, their sources, or their use in connection with material reported herein is not to be construed as either an actual or implied endorsement of such products by the U.S. Department of Health and Human Services.

Author Contributions

Conceptualization, C.T.L., D.W. and O.J.S.; methodology, C.T.L. and D.W.; software, C.T.L. and D.W.; validation, C.T.L. and D.W.; formal analysis, C.T.L., D.W., Z.L. and D.X.H.; investigation, C.T.L. and D.W.; data curation, C.T.L., D.W., Z.L., D.X.H. and R.V.; writing—original draft preparation, C.T.L., D.W. and O.J.S.; writing—review and editing, C.T.L., D.W., Z.L., D.X.H., Y.T. and O.J.S.; visualization, C.T.L.; supervision, O.J.S.; project administration, O.J.S.; funding acquisition, Y.T. and O.J.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received funding support from NIH/NEI award (R01EY031731) and the UMD BIOE Fischell Fellowship program. We acknowledge the support of the University of Maryland, Baltimore, Institute for Clinical & Translational Research (ICTR) and the National Center for Advancing Translational Sciences (NCATS) Clinical Translational Science Award (CTSA) grant number 1UL1TR003098.

Institutional Review Board Statement

This study was conducted according to the Declaration of Helsinki and was approved by the Institutional Review Boards of the Food and Drug Administration (FDA) (protocol code 17-062R, Approved: 11 January 2018) and the University of Maryland (protocol code HP-00078023, Approved 17 January 2018).

Informed Consent Statement

Informed consent for the collection and analysis of data was obtained from all subjects involved in the study.

Data Availability Statement

Computer code and best performance states for computational models can be found at https://github.com/ctnle/AO-OCT-Vessel-Segmentation (accessed on 10 October 2021).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Nadler, Z.; Wang, B.; Wollstein, G.; Nevins, J.E.; Ishikawa, H.; Bilonick, R.; Kagemann, L.; Sigal, I.A.; Ferguson, R.D.; Patel, A.; et al. Repeatability of in vivo 3D lamina cribrosa microarchitecture using adaptive optics spectral domain optical coherence tomography. Biomed. Opt. Express 2014, 5, 1114–1123. [Google Scholar] [CrossRef] [Green Version]
  2. Akagi, T.; Hangai, M.; Takayama, K.; Nonaka, A.; Ooto, S.; Yoshimura, N. In Vivo Imaging of Lamina Cribrosa Pores by Adaptive Optics Scanning Laser Ophthalmoscopy. Investig. Ophthalmol. Vis. Sci. 2012, 53, 4111–4119. [Google Scholar] [CrossRef] [PubMed]
  3. Hood, D.C.; Lee, N.; Jarukasetphon, R.; Nunez, J.; Mavrommatis, M.A.; Rosen, R.B.; Ritch, R.; Dubra, A.; Chui, T.Y.P. Progression of Local Glaucomatous Damage Near Fixation as Seen with Adaptive Optics Imaging. Transl. Vis. Sci. Technol. 2017, 6, 6. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  4. Takayama, K.; Ooto, S.; Hangai, M.; Ueda-Arakawa, N.; Yoshida, S.; Akagi, T.; Ikeda, H.; Nonaka, A.; Hanebuchi, M.; Inoue, T.; et al. High-Resolution Imaging of Retinal Nerve Fiber Bundles in Glaucoma Using Adaptive Optics Scanning Laser Ophthalmoscopy. Am. J. Ophthalmol. 2013, 155, 870–881.e3. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  5. Chen, M.F.; Chui, T.Y.P.; Alhadeff, P.; Rosen, R.B.; Ritch, R.; Dubra, A.; Hood, D.C. Adaptive Optics Imaging of Healthy and Abnormal Regions of Retinal Nerve Fiber Bundles of Patients with Glaucoma. Investig. Ophthalmol. Vis. Sci. 2015, 56, 674–681. [Google Scholar] [CrossRef]
  6. Huang, G.; Luo, T.; Gast, T.J.; Burns, S.A.; Malinovsky, V.E.; Swanson, W.H. Imaging Glaucomatous Damage Across the Temporal Raphe. Investig. Ophthalmol. Vis. Sci. 2015, 56, 3496–3504. [Google Scholar] [CrossRef] [Green Version]
  7. Liu, Z.; Kurokawa, K.; Zhang, F.; Lee, J.J.; Miller, D.T. Imaging and quantifying ganglion cells and other transparent neurons in the living human retina. Proc. Natl. Acad. Sci. USA 2017, 114, 12803–12808. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  8. Liu, Z.; Saeedi, O.; Zhang, F.; Villanueva, R.; Asanad, S.; Agrawal, A.; Hammer, D.X. Quantification of Retinal Ganglion Cell Morphology in Human Glaucomatous Eyes. Investig. Ophthalmol. Vis. Sci. 2021, 62, 34. [Google Scholar] [CrossRef] [PubMed]
  9. Soltanian-Zadeh, S.; Kurokawa, K.; Liu, Z.; Zhang, F.; Saeedi, O.; Hammer, D.X.; Miller, D.T.; Farsiu, S. Weakly supervised individual ganglion cell segmentation from adaptive optics OCT images for glaucomatous damage assessment. Optica 2021, 8, 642. [Google Scholar] [CrossRef]
  10. Miller, D.T.; Kurokawa, K. Cellular-Scale Imaging of Transparent Retinal Structures and Processes Using Adaptive Optics Optical Coherence Tomography. Annu. Rev. Vis. Sci. 2020, 6, 115–148. [Google Scholar] [CrossRef]
  11. Kurokawa, K.; Crowell, J.A.; Zhang, F.; Miller, D.T. Suite of methods for assessing inner retinal temporal dynamics across spatial and temporal scales in the living human eye. Neurophotonics 2020, 7, 015013. [Google Scholar] [CrossRef] [Green Version]
  12. Karst, S.G.; Salas, M.; Hafner, J.; Scholda, C.; Vogl, W.-D.; Drexler, W.; Pircher, M.; Schmidt-Erfurth, U. Three-dimensional analysis of retinal microaneurysms with adaptive optics optical coherence tomography. Retina 2019, 39, 465–472. [Google Scholar] [CrossRef]
  13. Iwasaki, M.; Inomata, H. Relation between superficial capillaries and foveal structures in the human retina. Investig. Ophthalmol. Vis. Sci. 1986, 27, 1698–1705. [Google Scholar]
  14. Felberer, F.; Rechenmacher, M.; Haindl, R.; Baumann, B.; Hitzenberger, C.; Pircher, M. Imaging of retinal vasculature using adaptive optics SLO/OCT. Biomed. Opt. Express 2015, 6, 1407–1418. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  15. Detailed Vascular Anatomy of the Human Retina by Projection-Resolved Optical Coherence Tomography Angiography. Sci. Rep. Available online: https://www.nature.com/articles/srep42201 (accessed on 15 June 2021).
  16. Spaide, R.F.; Klancnik, J.M.; Cooney, M.J. Retinal Vascular Layers Imaged by Fluorescein Angiography and Optical Coherence Tomography Angiography. JAMA Ophthalmol. 2015, 133, 45–50. [Google Scholar] [CrossRef] [PubMed]
  17. Jones, A.; Kaplowitz, K.; Saeedi, O. Autoregulation of optic nerve head blood flow and its role in open-angle glaucoma. Expert Rev. Ophthalmol. 2014, 9, 487–501. [Google Scholar] [CrossRef]
  18. Tham, Y.-C.; Li, X.; Wong, T.Y.; Quigley, H.A.; Aung, T.; Cheng, C.-Y. Global Prevalence of Glaucoma and Projections of Glaucoma Burden through 2040. Ophthalmology 2014, 121, 2081–2090. [Google Scholar] [CrossRef]
  19. Leske, M.C.; Heijl, A.; Hussein, M.; Bengtsson, B.; Hyman, L.; Komaroff, E. Factors for Glaucoma Progression and the Effect of Treatment. Arch. Ophthalmol. 2003, 121, 48–56. [Google Scholar] [CrossRef]
  20. Weinreb, R.N.; Aung, T.; Medeiros, F.A. The Pathophysiology and Treatment of Glaucoma. JAMA 2014, 311, 1901–1911. [Google Scholar] [CrossRef] [Green Version]
  21. Richter, G.M.; Madi, I.; Chu, Z.; Burkemper, B.; Chang, R.; Zaman, A.; Sylvester, B.; Reznik, A.; Kashani, A.; Wang, R.; et al. Structural and Functional Associations of Macular Microcirculation in the Ganglion Cell-Inner Plexiform Layer in Glaucoma Using Optical Coherence Tomography Angiography. J. Glaucoma 2018, 27, 281–290. [Google Scholar] [CrossRef]
  22. Villanueva, R.; Le, C.; Liu, Z.; Zhang, F.; Magder, L.; Hammer, D.X.; Saeedi, O. Cell-Vessel Mismatch in Glaucoma: Correlation of Ganglion Cell Layer Soma and Capillary Densities. Investig. Ophthalmol. Vis. Sci. 2021, 62, 2. [Google Scholar] [CrossRef] [PubMed]
  23. Guo, C.; Szemenyei, M.; Hu, Y.; Wang, W.; Zhou, W.; Yi, Y. Channel Attention Residual U-Net for Retinal Vessel Segmentation. In Proceedings of the ICASSP 2021—2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; pp. 1185–1189. [Google Scholar] [CrossRef]
  24. Zhang, J.; Zhang, Y.; Xu, X. Pyramid U-Net for Retinal Vessel Segmentation. arXiv 2021, arXiv:2104.02333, 1125–1129. [Google Scholar] [CrossRef]
  25. Çiçek, Ö.; Abdulkadir, A.; Lienkamp, S.S.; Brox, T.; Ronneberger, O. 3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), Athens, Greece, 17–21 October 2016; Available online: http://arxiv.org/abs/1606.06650 (accessed on 25 July 2021).
  26. Milletari, F.; Navab, N.; Ahmadi, S.-A. V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation. In Proceedings of the 2016 Fourth International Conference on 3D Vision (3DV), Stanford, CA, USA, 25–28 October 2016; Available online: http://arxiv.org/abs/1606.04797 (accessed on 25 July 2021).
  27. Demirkaya, N.; Van Dijk, H.W.; Van Schuppen, S.M.; Abramoff, M.; Garvin, M.K.; Sonka, M.; Schlingemann, R.O.; Verbraak, F.D. Effect of Age on Individual Retinal Layer Thickness in Normal Eyes as Measured with Spectral-Domain Optical Coherence Tomography. Investig. Ophthalmol. Vis. Sci. 2013, 54, 4934–4940. [Google Scholar] [CrossRef] [PubMed]
  28. Kim, J.H.; Lee, S.H.; Han, J.Y.; Kang, H.G.; Byeon, S.H.; Kim, S.S.; Koh, H.J.; Kim, M. Comparison of Individual Retinal Layer Thicknesses between Highly Myopic Eyes and Normal Control Eyes Using Retinal Layer Segmentation Analysis. Sci. Rep. 2019, 9, 1–11. [Google Scholar] [CrossRef] [PubMed]
  29. Wilson, M.; Chopra, R.; Wilson, M.Z.; Cooper, C.; MacWilliams, P.; Liu, Y.; Wulczyn, E.; Florea, D.; Hughes, C.O.; Karthikesalingam, A.; et al. Validation and Clinical Applicability of Whole-Volume Automated Segmentation of Optical Coherence Tomography in Retinal Disease Using Deep Learning. JAMA Ophthalmol. 2021, 139, 964. [Google Scholar] [CrossRef]
  30. Liu, Z.; Tam, J.; Saeedi, O.; Hammer, D.X. Trans-retinal cellular imaging with multimodal adaptive optics. Biomed. Opt. Express 2018, 9, 4246–4262. [Google Scholar] [CrossRef]
  31. Schneider, C.A.; Rasband, W.S.; Eliceiri, K.W. NIH Image to ImageJ: 25 years of image analysis. Nat. Methods 2012, 9, 671–675. Available online: https://www.nature.com/articles/nmeth.2089 (accessed on 15 June 2021).
  32. Wang, D.; Haytham, A.; Pottenburgh, J.; Saeedi, O.; Tao, Y. Hard Attention Net for Automatic Retinal Vessel Segmentation. IEEE J. Biomed. Health Inform. 2020, 24, 3384–3396. [Google Scholar] [CrossRef]
  33. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
  34. Shi, X.; Chen, Z.; Wang, H.; Yeung, D.-Y.; Wong, W.; Woo, W. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting. In Proceedings of Advances in Neural Information Processing Systems 28 (NIPS 2015), Montreal, QC, Canada, 7–12 December 2015; Available online: https://papers.nips.cc/paper/2015/hash/07563a3fe3bbe7e3ba84431ad9d055af-Abstract.html (accessed on 15 June 2021).
  35. Liu, Q.; Zhou, F.; Hang, R.; Yuan, X. Bidirectional-Convolutional LSTM Based Spectral-Spatial Feature Learning for Hyperspectral Image Classification. Remote Sens. 2017, 9, 1330. [Google Scholar] [CrossRef] [Green Version]
  36. Liu, H.; Shen, X.; Shang, F.; Ge, F.; Wang, F. CU-Net: Cascaded U-Net with Loss Weighted Sampling for Brain Tumor Segmentation; Springer: Berlin, Germany, 2019; pp. 102–111. [Google Scholar] [CrossRef] [Green Version]
  37. Shorten, C.; Khoshgoftaar, T.M. A survey on Image Data Augmentation for Deep Learning. J. Big Data 2019, 6, 60. [Google Scholar] [CrossRef]
  38. He, K.; Zhang, X.; Ren, S.; Sun, J. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 13–16 December 2015; pp. 1026–1034. [Google Scholar] [CrossRef] [Green Version]
  39. Zou, K.H.; Warfield, S.; Bharatha, A.; Tempany, C.M.; Kaus, M.R.; Haker, S.J.; Wells, W.M.; Jolesz, F.A.; Kikinis, R. Statistical validation of image segmentation quality based on a spatial overlap index. Acad. Radiol. 2004, 11, 178–189. [Google Scholar] [CrossRef] [Green Version]
  40. Zou, K.H.; Wells, W.M.; Kikinis, R.; Warfield, S.K. Three validation metrics for automated probabilistic image segmentation of brain tumours. Stat. Med. 2004, 23, 1259–1282. [Google Scholar] [CrossRef]
  41. Al-Faris, A.Q.; Ngah, U.K.; Isa, N.A.M.; Shuaib, I.L. MRI Breast Skin-line Segmentation and Removal using Integration Method of Level Set Active Contour and Morphological Thinning Algorithms. J. Med. Sci. 2012, 12, 286–291. [Google Scholar] [CrossRef] [Green Version]
  42. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Proceedings of Advances in Neural Information Processing Systems 32 (NeurIPS 2019), Vancouver, BC, Canada, 8–14 December 2019; Available online: http://arxiv.org/abs/1912.01703 (accessed on 15 June 2021).
  43. Li, M.; Chen, Y.; Ji, Z.; Xie, K.; Yuan, S.; Chen, Q.; Li, S. Image Projection Network: 3D to 2D Image Segmentation in OCTA Images. IEEE Trans. Med. Imag. 2020, 39, 3343–3354. [Google Scholar] [CrossRef]
  44. Li, M.; Zhang, Y.; Ji, Z.; Xie, K.; Yuan, S.; Liu, Q.; Chen, Q. IPN-V2 and OCTA-500: Methodology and Dataset for Retinal Image Segmentation. arXiv Prepr. 2020, arXiv:2012.07261. [Google Scholar]
  45. De Carlo, T.E.; Romano, A.; Waheed, N.K.; Duker, J.S. A review of optical coherence tomography angiography (OCTA). Int. J. Retin. Vitr. 2015, 1, 1–15. [Google Scholar] [CrossRef] [Green Version]
  46. Lee, K.; Sunwoo, L.; Kim, T.; Lee, K. Spider U-Net: Incorporating Inter-Slice Connectivity Using LSTM for 3D Blood Vessel Segmentation. Appl. Sci. 2021, 11, 2014. [Google Scholar] [CrossRef]
  47. Ma, Y.; Hao, H.; Xie, J.; Fu, H.; Zhang, J.; Yang, J.; Wang, Z.; Liu, J.; Zheng, Y.; Zhao, Y. ROSE: A Retinal OCT-Angiography Vessel Segmentation Dataset and New Model. IEEE Trans. Med. Imag. 2020, 40, 928–939. [Google Scholar] [CrossRef]
Figure 1. Architecture schematics. (A) UNet architecture with a depth of three layers (Conv2d = two-dimensional convolutional operation; BatchNorm = batch normalization; ReLU = rectified linear activation function). (B) Convolutional LSTM (cLSTM) unit schematic demonstrating the incorporation of the previous cell state memory and output in the generation of the current cell state and output. (C) Bidirectional cLSTM (Bi-cLSTM) architecture schematic with forward and reverse cLSTM as distinct units.
Figure 2. Sample Z-projection image and manual segmentation for each plexus (superficial vascular plexus (SVP), intermediate capillary plexus (ICP), deep capillary plexus (DCP)).
Figure 3. Schematic outlining the nested architectures compared in the study: (A) UNet Only, (B) Long Short-Term Memory (LSTM)-UNet, and (C) UNet-LSTM-UNet (two-dimensional convolutional operation (Conv2d), batch normalization (BatchNorm), rectified linear activation function (ReLU), bidirectional convolutional long short-term memory network (Bi-cLSTM)).
Figure 4. Performance metrics (unitless) on the testing set for the three segmentation architectures (AUC = area under receiver operating characteristic curve). Error bars indicate standard deviation. * indicates a significant difference (p < 0.05 by one-way analysis of variance).
Figure 5. Sample segmentation for each architecture for each retinal layer (superficial vascular plexus (SVP), intermediate capillary plexus (ICP), deep capillary plexus (DCP)).
Table 1. Descriptive characteristics for adaptive optics optical coherence tomography volumes and image slices used for training, validation, and testing sets.

| Plexus | Train | Validation | Testing | Total Volumes | Mean ± SD # Slices per Volume (Range) |
|---|---|---|---|---|---|
| Superficial Vascular Plexus | 43 | 13 | 13 | 69 | 34 ± 20 (5–76) |
| Intermediate Capillary Plexus | 40 | 13 | 13 | 66 | 32 ± 10 (14–57) |
| Deep Capillary Plexus | 26 | 8 | 8 | 42 | 33 ± 11 (12–60) |
| Total Volumes | 109 | 34 | 34 | 177 | 33 ± 15 (5–76) |
| Mean ± SD # Slices per Volume (Range) | 33 ± 15 (5–76) | 32 ± 13 (10–71) | 31 ± 15 (5–60) | 33 ± 15 (5–76) | |
Table 2. Performance metrics (with standard deviation) on the testing set for the three segmentation architectures. AUC = area under receiver operating characteristic curve. * indicates a significant difference (p < 0.05 by one-way analysis of variance).

| Model | Dice Coefficient | Precision | Recall * | Accuracy | AUC * |
|---|---|---|---|---|---|
| UNet Only | 0.645 (0.114) | 0.629 (0.161) | 0.703 (0.109) | 0.914 (0.027) | 0.830 (0.053) |
| LSTM-UNet | 0.687 (0.140) | 0.635 (0.122) | 0.799 (0.070) | 0.924 (0.023) | 0.880 (0.036) |
| UNet-LSTM-UNet | 0.684 (0.141) | 0.645 (0.127) | 0.779 (0.072) | 0.924 (0.024) | 0.870 (0.036) |
Table 3. Number of parameters and evaluation time for a 30-slice volume for each architecture.

| Model | # of Parameters (Million) | Evaluation Time on 30-Slice Volume (Seconds) |
|---|---|---|
| UNet Only | 0.52 | 1.56 |
| LSTM-UNet | 1.82 | 184.79 |
| UNet-LSTM-UNet | 2.34 | 241.69 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
