Automatic Detection and Distinction of Retinal Vessel Bifurcations and Crossings in Colour Fundus Photography †

: The analysis of retinal blood vessels present in fundus images, and the addressing of problems such as blood clot location, is important to undertake accurate and appropriate treatment of the vessels. Such tasks are hampered by the challenge of accurately tracing back problems along vessels to their source. This is due to the unresolved issue of distinguishing automatically between vessel bifurcations and vessel crossings in colour fundus photographs. In this paper, we present a new technique for addressing this problem using a convolutional neural network approach to ﬁrstly locate vessel bifurcations and crossings and then to classifying them as either bifurcations or crossings. Our method achieves high accuracies for junction detection and classiﬁcation on the DRIVE dataset and we show further validation on an unseen dataset from which no data has been used for training. Combined with work in automated segmentation, this method has the potential to facilitate: reconstruction of vessel topography, classiﬁcation of veins and arteries and automated localisation of blood clots and other disease symptoms leading to improved management of eye disease.


Introduction
Vascular conditions present a challenging public health problem as they become more common due to global ageing [1].Vascular conditions are often life-threatening and blood vessel damage caused from common health issues such as diabetes, hypertension and strokes can lead to significant health complications.It is, therefore, of great importance to better understand and be able to manage such conditions.The retina is the only inner organ which can be directly imaged, using a fundus camera, and also serve as a "window" for the diagnosis of systematic diseases such as: cerebral malaria, stroke, dementia and cardiovascular diseases [2].It is also significant that pathologies often affect veins and arteries differently.For example, in diabetic retinopathy, abnormalities typically occur in veins such as venous beading which is a significant predictor to the sight-damaging proliferative stage of the condition.With the availability of imaging techniques such as colour fundus photography, fundus angiography and recent optical coherence tomography angiography, there is a significant need for automated vessel analysis techniques [3,4].
There has been a considerable amount of work, in recent years, aimed at the effective segmentation of retinal blood vessels in fundus photography, which is a prerequisite step for blood vessel analysis.Work such as [3][4][5] has been able to achieve increasingly improved segmentation of retinal vessels.However, a significant remaining challenge is to distinguish between vessel bifurcations and vessel crossings.A vessel bifurcation is where a mother vessel branches into two daughter vessels, while a vessel crossing is where one vessel passes over another but does not connect to it.This is important for tracking vessels, separating veins from arteries and providing for quantitative analysis of vasculature.
For example, we must be able to trace back along the vessel when a blood clot has been identified.The current inability to accurately identify vessel crossings after or during vessel segmentations hinders this.It is also important to monitor progress of a vessel after vein and artery occlusions; being able to identify and distinguish vessel crossings and bifurcations facilitates this.Automating the detection and classification of vessel bifurcations and crossings also allows us to aid clinicians in detecting vascular abnormalities.The vast amount of vessels and vessel junctions within the retina make this task a laborious one for clinicians; by automating the process we can save time for treatment while maintaining accuracy.The vasculature can be obtained through vessel tracking or pixel-based classification.Detecting bifurcations and crossings are critical to either of these vasculature reconstruction methods.The detected and classified vessel junctions can be used in combination wth vessel segmentation, or used in vessel tracking methods to detect the source of irregular vasculature.
The previous work on vessel bifurcations and junctions has involved using orientation scores to detect bifurcations and junctions in retinal images [6].In contrast to the method we proposed, which is a fully automated system that uses only the image to determine the diagnosis, the work in [6] required 24 orientation processes for each image before training.However, the results in the paper show that the features within the image are extractable.There has also been similar orientation-based work in [7,8].
For applications in image analysis and classification, Convolutional Neural Networks (CNNs), a branch of deep learning, has achieved state of the art results for many problems.The 1970s saw the introduction of network architectures being used to analyse image data [9].These had useful applications and allowed challenging tasks, such as handwritten character recognition [10], to be achieved.Decades later, there were several breakthroughs in neural networks that lead to vast improvements in their implementation, such as the introduction of dropout [11] and rectified linear units [12].These theoretical enhancements and the accompanying increase in computing power through graphical processor units (GPUs) meant that CNNs became viable for more complex image recognition problems.Presently, large CNNs are used to successfully tackle highly complex image recognition tasks with many object classes to an impressive standard [13,14].The recent improvements in image recognition problems mentioned present an opportunity for more efficient and accurate methods of our vessel problem.CNNs are used in many of the current state-of-the-art image classification tasks including medical imaging.Hence, we use this method combined with expert segmented fundus images and skeletonisation [15,16] to detect and classify vessel bifurcations and crossings within fundus images.
There are many different architectures for neural networks.Recently residual networks have achieved impressive results on the highly competitive competition of ImageNet detection, ImageNet localisation, COCO detection, and COCO segmentation [17].They were then widely used in the following 2016 ImageNet competition due to their impressive performance on general large data sets of small images; such as the MNIST [18] dataset for handwritten digits 0-9 and CIFAR-10 [14], a dataset of 10 classes of colour images.This network learns from the residual of the identity of the previous layer of a new layer in order to learn features more effectively.This makes the network ideal for our patch-based method as the higher level features can distinguish between the background of the retina and the bifurcations and crossings.Hence, the Res18 network structure, containing 18 residual layers, was used in the CNNs throughout this paper.
In this paper, we present a new hierarchical approach, that utilises deep learning, to first automatically determine the locations of blood vessel bifurcations and crossings in colour fundus images, and then to distinguish between vessel bifurcation and crossings.We employ an available segmentation of the vessel structure, although an automatic segmentation procedure could be incorporated, to identify points along blood vessels.Annotated image datasets with identified vessel bifurcations and crossings aid our deep learning framework as we use a supervised learning method to solve this image recognition problem.From the original fundus images of the DRIVE dataset we created small patches of images using a skeletonisation of the vessels.We use a convolutional neural network approach which is trained on some of the patches of the fundus images using the expert ground truth for optimization.A matching network architecture is then used and trained to learn new convolution filters to distinguish between vessel bifurcations and crossings.The results is a novel method which is capable of identifying and classifying vessel bifurcations and crossings without user intervention.
The rest of this paper is organised as follows.In Section 2, we present our new automatic approach for locating and identifying crossings and bifurcations of retinal vessels, in Section 3 we demonstrate that proposed method yields robust, state-of-the-art results and in Sections 4 and 5 we present our conclusions and discuss future work.This paper is an extention of the paper [19] extended and more substantial results and a refined method for accuracy.The figures used are cited throughout.

Methods
Firstly we identifying patches of fundus images z(x).During our experiments we found that the optimal size for both performance and collection of the patches was 21 by 21 pixels.All of the patches used throughout this method were of this size.We make use of available vessel segmentations given as binary functions defined on the domain, and perform a skeletonisation process on this domain.The patches are then produced along the skeleton so that each contains some of vessel structure.Furthermore, after creating the patches we train our Res18 convolutional neural network to identify the patches which include either a bifurcation or a crossings.The res18 neural network contains 18 convolutional layers learned using the residual of the previous convolutional layer as in [17].The network contains 11,181,570 trainable weight parameters for optimization and the architecture layout can be found in the supplementary material.Another network with the same architecture is then trained on the patches that have been graded to have bifurcations and crossings to distinguishing the type of vessel junction located.
We tested the ability of our algorithm using 40 images from the DRIVE database with manual segmentations [4].We also studied the variability between grading and how this relates to the trained network for each grader.The data split was 30 images for training the neural networks, leaving 10 for testing.While this may seem a small number for a machine learning approach, our patch-based method means that the images generated for training numbered more than 100,000 providing sufficient data.Ground truth annotations of vessel crossings and bifurcations were provided by two graders (G1 and G2).

Datasets
The images used to implement our framework are from the Digital Retinal Images for Vessel Extraction (DRIVE) database with manual segmentations [4].The images in the DRIVE dataset were obtained from a diabetic retinopathy screening program in The Netherlands.The images were acquired using a Canon CR5 non-mydriatic 3CCD camera with a 45 degree field of view (FOV) using 8 bits per colour plane at 768 by 584 pixels.Moreover, we use the IOSTAR dataset [6,20] for testing the robustness of the method.Our networks are trained on the DRIVE Dataset, but all of them are tested on the unseen IOSTAR dataset for further validation.The IOSTAR dataset is made of the images taken with EasyScan camera (provided by i-Optics B.V., The Hague, The Netherlands).The original images have a resolution of 1024 by 1024 (14 µm/px), and a 45 degree field of view.For the moment, vessels, bifurcations and crossings of 24 images have been annotated and corrected by two different experts, the same experts that graded the DRIVE dataset.For testing on the IOSTAR dataset, which has the same field of view as DRIVE, the images were resized using bilinear interpolation to the dimensions of the DRIVE images.Patches were then extracted in the same way with both datasets to allow for fair comparison purposes.This process can be used to compare with any dataset of varying image size.
The datasets were graded separately by 3 expert graders to compare variability between the networks and between the graders themselves.The graders labelled bifurcations by clicking as close to the centre of the junction as possible.This allowed for it to be a simple operation so that the graders focus could remain on image.

Skeletonisation and Patch Extraction
We consider patches of the fundus images centred along the segmented vessels.In order to restrict the number of patches for training to a manageable number, and reduce bias, we aim to reduce to segmentation of the vessels to a skeleton and consider regions centred only on these points.We achieve this by performing a skeletonisation of the level set function φ(x) for each image.
We convolve the level set function with the kernels in Equation (1) where r j denotes rotation of the matrix by a multiple j of π/2 radians and . These kernals are shown in Figure 1 We thin the segmentation of the vessels by removing the points which are centred on regions matching the above filters.That is, we set such points as background points.
We achieve this by iterating as in Equation ( 2), beginning with l 1 0 = 0 and cycling through i ∈ {1, 2}, j ∈ {0, 1, 2, 3}.Following this, we extract the patches by cropping the image z(x) to 21 × 21 pixel windows Θ p centred on points p in the set Υ of points considered the foreground of the skeletonised vessel map.The patch size was selected so that bifurcations, and crossings and branches, in the vessel would fit within one patch.The patches are given by: In the training stage, the set of patches (Θ) of the images in the training set are used to train the neural network to identify whether a bifurcation or crossing is contained in the image patch.In the test stage, the trained CNN classifies the patches accordingly.This step is described below.

Junction Distinction-CNN C 1
To identify the vessel bifurcations and crossings within the patches created we train our CNN on a high-end Graphics Processor Unit (GPU).The large random access memory of the Nvidia K40c means that we were able to train on the whole dataset of patches at once.The Nvidia K40c contains 2880 CUDA cores and comes with the Nvidia CUDA Deep Neural Network library (cuDNN) for GPU learning.The deep learning package Keras [21] was used alongside the Theano machine learning back end to implement the network.After training, the feed forward process of the CNN can classify the patches produced from a single image in under a second.We used the Res18 network architecture [17] as deep levels of convolution were required to distinguish the vessel junction type in our small patches.The residual layers incorporate activation, batch normalisation, convolutional, dense and maxpooling layers.We also use L 2 regularisation to improve weight training.There were approximately 100,000 patches for training and 30,000 for testing in the junction distinction problem.The classes were weighted as a ratio of junction to background due to the fact that bifurcations and crossings in the training and testing patches were sparse at a ratio of 1:39.The network was optomised using Adam stochastic optimisation for backpropegation [22].The network was trained to classify the patches to give a binary classification of either vessel junction or background.Gaussian initialisation was used within the network to reduce initial training time.The loss function used for the optimisation was the widely used categorical cross-entropy function.Training was undertaken until reduction of the loss plateaued to obtain optimal results.These results can be seen in Figure 2.

Locate the Centres
Following the neural network classification, which tell us if a bifurcation or crossing is contained within a patch, we aim to find the locations of the points.
We achieve this by forming the cumulative sum image shown in Equation ( 3) and taking the local maxima r ∈ Υ as points of interest.We then aim to determine whether points are at crossings or bifurcations.

Junction Classification C 2
We extract the patches Θ(r) and use these to train a neural network to distinguish between crossings and bifurcations as shown in Figure 3.The second neural network was trained with the Res18 architecture, like the first.Using a relatively small training set of patches, as from our images the majority of patches did not contain bifurcations and crossings, we trained our network in similar fashion to that used in the previous step.Weighted classes were introduced again to cater for the imbalance, in that images from the bifurcation class were substantially more prominent than that of the cross class.
Depending on the patch method there were around 800-2500 patches containing a junction that was used for training which can be seen in Figure 4.In all methods there were approximately twice as many junction patches containing bifurcation vessels compared to patches containing vessels crossing.Training was performed until a plateau in the reduction of the loss function was reached indicating no further improvement.bifurcations) and their enhanced counterparts for presentation.The neural networks were able to achieve good results using the patches without enhancement.Reproduced with permission from [19].

Results
We present our results on a patch by patch basis as well as in the fundus image form with vessel bifurcations and splittings labeled.The patch detection and classification information is then used to in a probability map type reconstruction of the fundus image to produce the final appropriate vessel bifurcations and splittings, as demonstrated in Figures 5 and 6.Here we present both the patch accuracy results and the final classified and vessel type distinguished images.We measure sensitivity, specificity and accuracy of the final image-based result as follows.Since the region of a junction is not restricted to a single point, we allow a region r(x, y) of 10 pixels either side of an annotated point at (x, y) to be considered the correct region.That is, we split each image domain Ω into two sets where j denotes a junction point location.Ω 1 is considered the true (junction) set and Ω 2 is considered the background set.We then calculate error measures based on this and report the mean measures.
For validation we used the test images from the DRIVE dataset.Furthermore, we used the separate IOSTAR dataset, and another expert grader (G3), for testing and comparison of the patch detection and classification method.We show in Tables 1 and 2 the results of training our neural networks on the data provided by graders 1, 2 and 3.In each case, we use patches extracted from 30 images from the DRIVE dataset to train our network.This relates to 101,416 patches of vessel junctions and 216,756 other patches.From the 101,416 patches there are 67,650 patches of vessel crossings and 33,766 patches of vessel bifurcations.This network is then tested by using our network to classify patches from the remaining ten images of the DRIVE dataset.This relates to 72,980 non-vessel patches and 31,026 vessel patches of which 9176 are vessel crossings and 21,850 are vessel bifurcations.We compare the results with the annotations provided by the grader in question, achieving high accuracies of 0.76, 0.76 and 0.77 for graders 1, 2 and 3 respectively.Furthermore, we use the trained network to classify images from the IOSTAR dataset, comparing with the annotations provided.With each trained network, the accuracy is lower for this unseen dataset, but the sensitivity is retained.The IOSTAR dataset gave us 132,064 non-vessel junction patches and 52,878 vessel junction patches with 14,228 vessel crossings and 38,650 vessel bifurcations.Following the detection, we resolve the   Reproduced with permission from [19].

Discussion
We have produced a method that can learn to detect and classify vessel bifurcations and crossings using a very small dataset of 40 fundus images images that had been manually classified for bifurcations and crossings and their type.Using the CNN C 1 , we managed to detect the bifurcations and crossings to an impressive detection accuracy of over 90% due in part to the relatively large amount of patches containing bifurcations and crossings.Along with the skeletonisation, our deep learning classification C 2 for vessel type gave us a high accuracy.The classifications statistics are similar to that of the intravariability between graders and hence the method has more chance of reaching higher results should a consistent grading be given within the patches.This can be seen by the higher testing results on the third grader.Increasing the size of our dataset would allow better distinction in the classification of the vessel bifurcations and crossings.It is worth noting that junction type training was undertaken on a couple of thousand patches and tested on around 800. Through training on more images the model could be fine tuned to refine the filters and increase distinction accuracy.
The current algorithm works well for images which have been manually segmented but this time-consuming task could be further extended to incorporate automatic segmentation techniques [3].A further very useful extension would be to automatically determine whether the artery or vein is in front given arteriovenous crossings, along with consideration of intra and inter-observer variability.In order to better identify and classify bifurcations and crossings with other nearby bifurcations and crossings, it would be useful to consider extending our method to a multi-scale approach.Furthermore, while we have included images with different vessel pathologies due to diseases such as retinopathy, using datasets containing other retinal diseases this could be studied further to investigate how this affects the detection and classification of vessel junctions.

Conclusions
The challenging task of detecting and classifying vessel bifurcations and crossings in fundus images is achieved to a high level of accuracy using our method.The variability over datasets without training on the dataset represents a robust algorithm for unseen images.The ability to expand on this method to make the detection both quicker and more accurate than manual classification is possible.These preliminary results demonstrate that the overall framework, including the deep learning approach proposed, is a viable technique to accurately find and identifying vessel bifurcations and crossings with little training data.More extensive testing of this framework could be undertaken to assess the transferability of these results and patch sizes to different size images from different datasets.However, there is no reason why this framework would not be directly applicable to another dataset.

Figure 3 .
Figure 3. Example outcomes of second part of algorithm: classifying bifurcations and crossings as bifurcations and crossings.(a) Identified bifurcations and crossings; (b) bifurcation Points; (c) Vessel Crossings.Reproduced with permission from [19].

Figure 4 .
Figure 4. Example of C 2 input.Rows 1 and 2 (resp.3 and 4): training patches with crossings (resp.bifurcations)and their enhanced counterparts for presentation.The neural networks were able to achieve good results using the patches without enhancement.Reproduced with permission from[19].

Figure 8 .
Figure 8. Example of distinguishing between crossings and bifurcations in fundus images.Rows one shows the classified bifurcations in (a-c) and row two the respective crossings in (d-f).Likewise, row three shows the classified birfucations in (g-i) and row four the respective crossings in (j-l).Reproduced with permission from[19].

Table 4 .
Confusion Matrices for CNN-based Classification results.BF denotes bifurcations, CR denotes crossings.True labels are along rows, predicted along columns.