Tortuosity Index Calculations in Retinal Images: Some Criticalities Arising from Commonly Used Approaches

: A growing body of research in retinal imaging is recently considering vascular tortuosity measures or indexes, with deﬁnitions and methods mostly derived from cardiovascular research. However, retinal microvasculature has its own peculiarities that must be considered in order to produce reliable measurements. This study analyzed and compared various derived metrics (e.g., TI, TI_avg, TI*CV) across four existing computational workﬂows. Speciﬁcally, the implementation of the models on two critical OCT images highlighted main pitfalls of the methods, which may fail in reliably differentiating a highly tortuous image from a normal one. A tentative, encouraging approach to mitigate the issue on the same OCT exempliﬁcative images is described in the paper, based on the suggested index TI*CV.


Introduction
Since the introduction of Optical Coherence Tomography as a retinal morphofunctional investigational technique, an exceptional amount of diagnostic information is overwhelming clinicians, and this trend is constantly increasing with the evolution of the technology and the features of the associated software [1]. Researchers detected at least a couple of relevant weaknesses within this trend:

1.
A certain number of studies recognized that this bulk of clinical information still needs to be "standardized" both for the nomenclature used and for the way to report information [2][3][4][5]; 2.
The reliability of certain measurements needs to be carefully assessed, especially in view of artificial intelligence applications [6][7][8].
As a paradigm of these weaknesses, retinal vascular tortuosity is one of the features that are recently gaining increasing attention.
Vascular tortuosity is a striking physiological feature that has been studied since Leonardo Da Vinci 's works [9][10][11]. This 16th century scientist was the first to observe that superficial vessels of an elderly's arm were more tortuous than the corresponding vessels of a young individual. Since then, vascular tortuosity has been a subject of study in a wide range of body districts; this phenomenon, in fact, affects both small size vessels, such as distal finger capillaries and retinal arteries, and middle and large size arteries, such as the aorta and the coronary, cerebrovascular or iliac vessels.
While mild vessel tortuosity is usually asymptomatic, increased or severe vessel tortuosity can lead to ischemic manifestations in the affected organ. Clinical studies have linked vessel tortuosity to aging, atherosclerosis, hypertension, genetic defects and diabetes mellitus [12][13][14][15][16].
The mechanism of progression, if any, is still poorly understood.
Retinal tortuosity can be described as the presence of curliness in vessels; however, there is not a standardized agreement on how to clinically define a tortuous retina: vessel tortuosity is mostly assessed using the clinician's experience in identifying variations in vessel normal course, mainly highlighting dissimilarities with respect to normal healthy vessels. This subjective assessment of vessel tortuosity can lead to a high inter-expert variability, and has been a promoting factor for several studies aimed at delivering more formal mathematical definitions of tortuosity [30][31][32].
In the Artificial Intelligence and Big Data Era, some questions may arise observing the diverse applications of more or less established computer-based analysis techniques, together with the need to critically review their application, especially in the context of scientific repeatability of results [33]. This is a relevant topic, especially in those cases where technological advancements render complex tools, whose limitations are less evident than their capabilities, available to a broader and broader number of researchers.
Main aim of this paper is to discuss some criticalities that frequently used tortuosity computation workflows may encounter in analyzing retinal tortuosity. In particular, Authors' efforts focus on finding and commenting potential pitfalls identified in the Lee's method [34], an AI-based computational workflow easy to implement and exhaustively described in the literature. To better understand those criticalities, tortuosity indexes of two exemplificative, critical OCT images are computed by using Lee's method and other published methods, either AI-based or traditional. Further, a preliminary exploration of possible mitigation actions is described in the paper, and a potentially better performing tortuosity index is proposed and used on the same tricky images.

Materials and Methods
The method from Lee [34] was identified as a potential example of a computational workflow easily available to researchers. A PubMed search was carried on to identify a subset of papers citing Lee's work and dealing with tortuosity indexes. The full text of those papers was reviewed in order to identify whether and how they actually used Lee's method.
Tortuosity indexes and other image derived statistics were then computed by using the Lee's method (Method A), the deep learning method described by Zhao [35] (Method B) and two classical image processing methods (Methods C and D) previously used by our group. In more detail: Method A: machine learning classification task was performed using Fiji's Trainable Weka Segmentation plugin [36] following the previously published procedure [34,37]. The classifier's output consists of a segmentation probability map highlighting the retinal structures detected as vessels. The probability map was then converted into a binarized image.
Method B: the skeletonization results coming from the deep learning method described by Zhao [35] and available in the ROSE-2 subset were used for this analysis.
Methods C and D: image binarization was performed with classical image processing methods (no artificial intelligence support) previously used and published by our group ( [38] and [39], respectively). Briefly, Method C uses some preprocessing techniques (Niblack's algorithm, histogram equalization) followed by image binarization by the method of Ridler and Calvard [40], while Method D uses Otsu's segmentation [41] method followed by active contours (snakes) region growing technique [42].
In Methods A, C and D the binarization was followed by a skeletonization (each white object in the binary image was converted to a single pixel line) in Fiji; in all the four methods the skeleton features (branch length (BL), vertices positions, branch Euclidean distance (ED)) were calculated in Fiji using the available AnalyzeSkeleton plugin [43]. The previously published procedure setup [34] was implemented for skeletonization and skeleton feature extraction. Figure 1 reports examples for method A.    Starting from the skeleton features, tortuosity index was computed as previously proposed [34], namely: Several additional branches measurement-related metrics were calculated as well, and reported in Table 1. Conversion from measurements expressed in pixels to metric lengths and areas were performed considering a pixel transverse size of 9.37 µm [44]. All image processing was performed using a combination of Mathwork's Matlab [45], Fiji [46] and some Fiji plugins [43]. A more detailed insight of the implemented methods is available in Supplementary Material S1.
OCT en face angiograms were obtained from the Rose Dataset (ROSE-2 subset) [35], containing OCT-A images acquired by Heidelberg OCT2 system with Spectralis software (Heidelberg Engineering, Heidelberg, Germany), within a 3 × 3 mm 2 area centered at the fovea. In this database, two retinal angiograms (N and T) were selected which resulted clearly distinguishable as for vessel tortuosity, the latter (T) being characterized by evident vessel tortuosity ( Figure 1, panel A, left N, right T).

Results
In the PubMed search 33 papers citing the Lee's 2018 publication [34] were found. Full text review identified 10 papers which actually used Lee's method, eight of which were from groups different from Lee's one. Incidentally, in other, side searches, some papers were found [21] which use the same (or similar method) without citing the original reference. This result is consistent with the good acceptance of Lee's work, and the relatively high number of groups who have been using this method in the last three years; this shows the diffusion of the method and, consequently, the importance of getting a better insight on the used methodologies.
Segmentation and skeletonization results are reported in the exemplificative Figure 1 (Method A). Regarding the computed metrics, Table 1 contains three different tortuosity indexes and several branch-related features.
All methods identified a smaller number of branches in the more tortuous retina (T), and AI-based methods (methods A and B) resulted in a generally smaller number of branches with respect to conventional image processing methods. Regarding the distribution of branch lengths, several potential criticalities should be noted. As a general observation, computed branch lengths are not normally distributed (Shapiro-Wilk, p < 0.05 for all methods, see histogram in Figure 2 for method A). Mean branch length for image N varies from 12.97 pixel to 6.58 pixel (121 to 61 µm), while mean branch length for image T varies from 13.28 pixel to 6.56 pixel (124 to 61 µm). A high percentage of computed branches (from 47.23% up to 80.85%) are less than 10 pixels length (93 µm), and at least 80% are less than 21 pixels, corresponding to a metric length close to 200µm. A certain number of computed branches have unitary length (and consequently, ED = 1). While it is not clear which impact this computed length has on TI, conceptually this can be classified as an unwanted effect of the skeletonization procedure.
Additionally, a relevant number of computed Euclidean distances are equal to zero. This last observation deserves additional comments. First, ED can be zero if and only if the starting point of the branch and the ending point coincide. This can happen, and actually does happen, when the skeletonization process identifies a loop (examples are visible in Figure 3). This means that the presence of loops in skeletons silently affects TI computation, since they have positive branch lengths (numerator) and zero ED (denominator). For this reason, in the present work zero ED branches were excluded from the computation of TI avg.  Regarding tortuosity index TI, (sum branch lengths/sum Euclidean distance) it must be noted that in two out of four imaging processing workflows (denoted * in Table 1), the smallest computed index is associated with the most tortuous retina. As a collateral observation, we hypothesized that the contribution of very small branches may adversely affect TI behavior. However, this hypothesis was refuted by calculating TI while iteratively removing increasing length branches. As a matter of fact, TI dependence on branch lengths has a similar behavior in both cases (N and T) and with all considered methods. As an exemplificative case, in Figure 4, TI is reported as a function of minimal branch length (for methods A and B). In Figure 3, the output of the branch detection computations is shown for upper right corner of T image. The branch detection algorithm classifies each pixel into three different categories depending on its neighbors: pixels having less than two neighbors are classified as end points (blue); pixels having more than two neighbors are classified as junctions (green), while pixels having exactly two neighbors are classified as slabs (branches, in orange).
In the figure, it can also be easily observed that most branches are relatively short (their length is limited by the presence of junctions), and this may affect TI computations (Figure 2).
Regarding tortuosity index TI, (sum branch lengths/sum Euclidean distance) it must be noted that in two out of four imaging processing workflows (denoted * in Table 1), the smallest computed index is associated with the most tortuous retina. As a collateral observation, we hypothesized that the contribution of very small branches may adversely affect TI behavior. However, this hypothesis was refuted by calculating TI while iteratively removing increasing length branches. As a matter of fact, TI dependence on branch lengths has a similar behavior in both cases (N and T) and with all considered methods. As an exemplificative case, in Figure 4, TI is reported as a function of minimal branch length (for methods A and B).
smallest computed index is associated with the most tortuous retina. As a collateral observation, we hypothesized that the contribution of very small branches may adversely affect TI behavior. However, this hypothesis was refuted by calculating TI while iteratively removing increasing length branches. As a matter of fact, TI dependence on branch lengths has a similar behavior in both cases (N and T) and with all considered methods. As an exemplificative case, in Figure 4, TI is reported as a function of minimal branch length (for methods A and B). In all considered combinations, TI initially increases-i.e., when removing very short branches-, reaches a maximum at branch lengths between 10 and 30 pixels, and gradually decreases until the relatively small number of included branches generates a certain numerical instability. All methods deliver results with similar behavior. The removal of small branches from TI computation does not alter the relationship between TI (N) and TI(T), which remains inverted with respect to what expected -namely, higher TI for more tortuous retinas.
TI_avg was computed as i.e., the average of the ratios between branch length and Euclidean distance for each branch. This index, introduced to avoid the influence of the length of loop branches and to weigh the contribution of each individual branch to the tortuosity metric, seems to perform even worse than TI ( Table 1). The last index, TI*CV, was computed multiplying TI by the coefficient of variation (CV) of the branch length (sd/mean for all branches in the image). Since branch length CV is a measure of mean branch length precision, this index represented an attempt to compensate mean branch length inaccuracy due to the excessive segmentation (and the possible underestimation of tortuosity) of longer branches.
At least for the analyzed images, TI*CV seems to correlate better than the other proposed indexes with the tortuosity of the examined angiograms, in all selected imaging workflows ( Table 1).
The dependence of TI and TI*CV on branch length was finally investigated, by calculating both indexes while iteratively removing branches of increasing length. Figure 5 shows this dependence for method A (similar results were obtained for the other methods); the plot clearly shows the inversion of the two indexes between N and T angiograms (TI*CV higher for T than for N, as expected). In all considered combinations, TI initially increases-i.e., when removing very short branches-, reaches a maximum at branch lengths between 10 and 30 pixels, and gradually decreases until the relatively small number of included branches generates a certain numerical instability. All methods deliver results with similar behavior. The removal of small branches from TI computation does not alter the relationship between TI (N) and TI(T), which remains inverted with respect to what expected -namely, higher TI for more tortuous retinas.
TI_avg was computed as i.e., the average of the ratios between branch length and Euclidean distance for each branch. This index, introduced to avoid the influence of the length of loop branches and to weigh the contribution of each individual branch to the tortuosity metric, seems to perform even worse than TI ( Table 1). The last index, TI*CV, was computed multiplying TI by the coefficient of variation (CV) of the branch length (sd/mean for all branches in the image). Since branch length CV is a measure of mean branch length precision, this index represented an attempt to compensate mean branch length inaccuracy due to the excessive segmentation (and the possible underestimation of tortuosity) of longer branches.
At least for the analyzed images, TI*CV seems to correlate better than the other proposed indexes with the tortuosity of the examined angiograms, in all selected imaging workflows ( Table 1).
The dependence of TI and TI*CV on branch length was finally investigated, by calculating both indexes while iteratively removing branches of increasing length. Figure 5 shows this dependence for method A (similar results were obtained for the other methods); Information 2021, 12, 466 7 of 11 the plot clearly shows the inversion of the two indexes between N and T angiograms (TI*CV higher for T than for N, as expected).

Discussion
Objective quantification of vessels tortuosity in retinal images may be of great clinical relevance both in the presence of retinal pathologies and in the investigation of correlated systemic diseases. To the Authors' knowledge, however, this work is the first to highlight relevant criticalities emerging from the adaptation of tortuosity computational methods used in cardiovascular research to the quantitative examination of retinal vessels. Usually, in fact, papers dealing with the analysis of retinal vessels in retinal images apply published computational methods to address clinical questions, without systematically investigating technical or methodological criticalities.
One of the most used and novel approaches, the Lee method has been here analyzed and tested with respect to two exemplificative retinal images, purposely selected because it referred to two clearly distinguishable levels of vessels tortuosity. The performance of the method, which resulted in a higher tortuosity index (TI) for the less tortuous retina (1.151 against 1.146 for the more tortuous retina) suggested a deeper, step-by-step investigation: each step (and the corresponding outcomes) of the method workflow was then analyzed, as well as other relevant branch-related features, the comparison with similar approaches, and the search for possible mitigation actions.
The first observed criticality concerns a potential limitation of all the four analyzed computational methods in accurately deriving feature images and numerical measures. This limitation may entail possible underestimation of the index: tortuosity of long vessels, in fact, is likely underestimated by the skeleton analysis process which, in turn, depends on the preceding image segmentation and skeletonization process. The difficulty of the method to identify branches whose length is comparable with the original, longer vessels may in fact entail a sort of "piecewise linearization" of those vessels: as a result, the computation of apparently longer Euclidean distances of the "pieces" of branches contributes to lower the tortuosity index. Focusing on the results summarized in Table 1, Method B (a previously published deep learning model [35]) seems to define a bit longer branches than the other examined methods, with a smaller percentage of very short branches; however, also in this case, it is worth noticing that more than 80% of branches is shorter than 20 pixels (less than 200 μm). This effect, at the same length cut-off, has been reported by

Discussion
Objective quantification of vessels tortuosity in retinal images may be of great clinical relevance both in the presence of retinal pathologies and in the investigation of correlated systemic diseases. To the Authors' knowledge, however, this work is the first to highlight relevant criticalities emerging from the adaptation of tortuosity computational methods used in cardiovascular research to the quantitative examination of retinal vessels. Usually, in fact, papers dealing with the analysis of retinal vessels in retinal images apply published computational methods to address clinical questions, without systematically investigating technical or methodological criticalities.
One of the most used and novel approaches, the Lee method has been here analyzed and tested with respect to two exemplificative retinal images, purposely selected because it referred to two clearly distinguishable levels of vessels tortuosity. The performance of the method, which resulted in a higher tortuosity index (TI) for the less tortuous retina (1.151 against 1.146 for the more tortuous retina) suggested a deeper, step-by-step investigation: each step (and the corresponding outcomes) of the method workflow was then analyzed, as well as other relevant branch-related features, the comparison with similar approaches, and the search for possible mitigation actions.
The first observed criticality concerns a potential limitation of all the four analyzed computational methods in accurately deriving feature images and numerical measures. This limitation may entail possible underestimation of the index: tortuosity of long vessels, in fact, is likely underestimated by the skeleton analysis process which, in turn, depends on the preceding image segmentation and skeletonization process. The difficulty of the method to identify branches whose length is comparable with the original, longer vessels may in fact entail a sort of "piecewise linearization" of those vessels: as a result, the computation of apparently longer Euclidean distances of the "pieces" of branches contributes to lower the tortuosity index. Focusing on the results summarized in Table 1, Method B (a previously published deep learning model [35]) seems to define a bit longer branches than the other examined methods, with a smaller percentage of very short branches; however, also in this case, it is worth noticing that more than 80% of branches is shorter than 20 pixels (less than 200 µm). This effect, at the same length cut-off, has been reported by other authors [47], and is a common problem well known even in other related fields [48]. Ideally, the impact of shorter (and smaller) capillary vessels on TI should be carefully assessed, since their presence and morphology are highly dependent on the image processing workflow and on the image quality.
Another relevant criticality, as already mentioned in the Results section, is the presence of loops: they have Euclidean distance equal to zero-thus having no weight in the denominator of the tortuosity index-but their branch lengths do contribute and affect the index numerator. From this point of view, and with reference to the two specific images analyzed in this study, Method A (which implements a machine learning classification task) seems the most robust, with a significantly lower number of loops.
Methods B and C (classical image processing methods), for the specific image comparison, show a suitable behavior and deliver a higher tortuosity index for the more tortuous retina.
In the attempt to mitigate the impact of segmentation and skeletonization processes on the clinical relevance of the derived metrics, a different implementation of the TI was investigated: namely, the TI_avg was calculated as the mean value of the distribution of individual TI calculated for each branch. This new index, however, seems to perform even worse than the original TI.
The best performing adaptation of the index came out by introducing a correction factor which accounts for the different branch length variability within each image. Specifically, the coefficient of variation of branch lengths was calculated for each image; this relevant indicator simultaneously accounts not only for the branch lengths distribution but also for their numerosity, and magnifies the TI of those images with higher uncertainty of the mean. The use of this correction factor brought to the TI*CV index which resulted higher for the more tortuous image irrespective of the method (Table 1). These results remain valid also in the case a high-pass threshold is introduced to remove the shortest branches from the computation. Figure 5 shows the TI and the TI*CV indexes as function of the fixed threshold for the branch length inclusion/exclusion in Method A (similar results were found for the other examined methods). With respect to TI index, which is alternatively-i.e., positively or negatively-affected by short branches removal, TI*CV remains relatively stable over a wide branch length interval. While working satisfactorily on the OCT exemplificative images analyzed in this study, this promising index needs of course a wider validation on an adequate dataset of retinal images. The proposed mitigation approach, however, may result useful and applicable to other retinal vessel structural indexes.

Conclusions
Vessel's tortuosity in retinal images has a great clinical relevance. Its objective quantification, currently relying on the adaptation of existing tortuosity computational methods mostly used in cardiovascular research, shows some criticalities. The exploratory investigation conducted in this study, based on two exemplificative retinal images with very different tortuosity, highlighted potential limitations of some computational methods, either AI or not AI-based-in accurately deriving feature images and numerical measures. Those limitations may in turn entail possible underestimation of the currently used metrics associated with the tortuosity index. The study also suggests that attempts may be conducted to optimize those metrics so as to mitigate the impact of segmentation and skeletonization processes; in the exploratory attempt conducted by the authors on the two exemplificative images, the best performing adaptation of the tortuosity index was shown by introducing a correction factor which accounts for the different branch length variability. Irrespective of the specific solution hereby proposed, the mitigation approach may result useful and applicable to other retinal vessel structural indices.
Author Contributions: Conceptualization, F.M.; methodology, F.M. and C.G.; software, F.M.; writingoriginal draft preparation, F.M. and C.G.; writing-review and editing, F.M. and C.G.; All authors have read and agreed to the published version of the manuscript.
Funding: This research received funding from Istituto Superiore di Sanità, Intramural Support for the "Ricerca e innovazione tecnologica nella gestione della funzione visiva" Project.
Data Availability Statement: All OCT images used in this works are available in the publicly available dataset described in [35]. Matlab codes are made available by the authors on reasonable requests.