4.1. Robustness against Parameter Variations and Speckle Noise
An important aspect to consider is the robustness of the method against variations in its parameters, in particular the size of the structuring elements. The morphological filters (mainly openings and closings) proposed for the coarse detection of the RNFL boundaries focus on the shape and size of the objects and on their relative intensity (a bright object on a dark background or vice versa). For instance, a circular structuring element with a radius greater than 3 is required in Equation (4) to detect the upper boundary #1-UB. Given the radius used in the proposed method, there is a wide range of values for which this morphological filtering provides the desired results. For the sake of clarity,
Figure 16 shows the results of the steps of Figure 7b,d, corresponding to Equations (4) and (5), using circular structuring elements with radii 5 and 9. As can be seen, the effect of the morphological operation is visible in the image, but the desired result, i.e., the delineation of the upper boundary, is practically identical in all cases. Likewise, we analyzed the steps shown in
Figure 8h–k, corresponding to Equations (12)–(14), where an opening, a closing, and a thresholding of the positive vertical gradient are performed. Variations are made on the proposed structuring element, in this case using radii 1 and 7, as shown in Figure 17. The only noticeable effect between the images appears in the areas shaded by the vessels, which is subsequently compensated by the application of active contours with suppression of external forces by means of the vein mask. It is important to note that applying the proposed method to a different dataset might require a readjustment of the filtering parameters. Since morphological operators filter shape (and size) according to the structuring element, the expected maximum and minimum width of the retinal layer under study, measured in pixels, influences the setting of these parameters.
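To illustrate this insensitivity to the structuring-element radius, the opening/closing step can be sketched with standard grey-level morphology (a minimal sketch using SciPy; the function name, test image, and radii are illustrative, not the exact pipeline of the paper):

```python
import numpy as np
from scipy import ndimage

def disk(radius):
    # Binary circular structuring element of the given radius.
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return (x ** 2 + y ** 2) <= radius ** 2

def coarse_boundary_filter(img, radius):
    # Grey-level opening (removes bright structures smaller than the disk)
    # followed by a closing (fills dark gaps), as in a generic coarse
    # boundary pre-filtering step.
    se = disk(radius)
    opened = ndimage.grey_opening(img, footprint=se)
    return ndimage.grey_closing(opened, footprint=se)

# The filtered result is stable over a range of radii (e.g. 5 vs. 9):
rng = np.random.default_rng(0)
img = rng.random((64, 64))
out5 = coarse_boundary_filter(img, 5)
out9 = coarse_boundary_filter(img, 9)
```

Because the filter acts on object size relative to the disk, any radius between the expected minimum and maximum feature size yields essentially the same delineation.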
Regarding the influence of speckle noise on the segmentation results, it is negligible for most OCT images of the database due to their correct acquisition and limited number of artifacts. Nevertheless, bypassing these filters could cause problems in OCT images with a higher level of noise or number of artifacts. The proposed filtering suppresses the typical bright spots of speckle noise with morphological openings in Equations (4), (10) and (23), and suppresses the dark spots with morphological closings in Equations (7) and (23). As indicated above, the parameter of this filtering, i.e., the size of the structuring element, is not critical, and there is a wide range of values for which the performance of the filtering is optimal. The standard deviation of the speckle noise within the dataset used in this work, estimated using the NOLSE estimator [71], lies within the range σ = [0.0030–0.0292]. To evaluate the robustness of the proposed method to speckle noise, we tested it on the OCT images with synthetically added speckle noise over a range of standard deviations (and corresponding variances) above the levels estimated in the dataset.
Figure 18 illustrates the results on an image with synthetically added speckle noise. The performance of the method is appropriate below a certain noise level; as can be seen in Figure 18d, once this level is exceeded, the #3-LB is not correctly detected, resulting in an incorrect adjustment to the GCL + IPL layer. It should be noted that, given the visual degradation of the OCT image at that point, such a noise level is outside any realistic practical situation.
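The synthetic-noise experiment can be reproduced with a standard multiplicative speckle model (a sketch under the common assumption J = I + I·n, with n zero-mean Gaussian of variance σ²; the exact noise generator used in the paper is not specified):

```python
import numpy as np

def add_speckle(img, sigma, seed=None):
    # Multiplicative speckle model: J = I + I * n, n ~ N(0, sigma^2).
    # The result is clipped back to the valid intensity range [0, 1].
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=img.shape)
    return np.clip(img + img * noise, 0.0, 1.0)

# Example: corrupt a uniform mid-grey image with sigma = 0.1.
img = np.full((32, 32), 0.5)
noisy = add_speckle(img, sigma=0.1, seed=0)
```

Sweeping `sigma` over increasing values and re-running the segmentation reproduces the kind of robustness curve discussed above.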
4.2. Analysis of the Performance of the Methods
By first applying a qualitative analysis to the results depicted in Figure 15, we can clearly see that the three methods handle the delineation of the upper boundary of the RNFL with adequate reliability. However, the H-DLpNet method tends to generate excessive smoothing in uneven areas with peaks and valleys, causing certain inaccuracies. By contrast, both the proposed method and Spectralis tend to produce results closer to the manual segmentation defined by the experts.
The analysis of the accuracy of the lower boundary of the RNFL is, on the other hand, notoriously difficult. As described in Section 2.1, the speckle noise and the artifacts due to the shadows cast by blood vessels generate inaccuracies in all the segmentation algorithms. However, this lack of information in the shaded areas is also a challenge for the manual segmentation carried out by the experts. This makes it unclear which method shows a higher precision in certain images of the database.
Analyzing the manual segmentations in Figure 15, it can be observed that the H-DLpNet method tends to project the layer delineation upward inside the areas shaded by the vessels, while Spectralis tends to project the layer downward. The proposed method, on the other hand, tends to be a compromise solution, maintaining greater continuity of the layer in these areas. In manual segmentation, experts maintain or lower the location of the lower edge in the shaded areas, roughly as performed by the proposed method and Spectralis, respectively. Note that although only three OCT images are shown in these figures, similar results can be observed in most of the images in the database. This inaccuracy in the shaded areas, even for the experts, complicates the quantitative analysis, whose metrics are detailed in Section 3.
In addition, it is important to determine the highest accuracy in layer thickness measurement allowed by the resolution of the images. Although the proposed method has subpixel precision, the thickness measurement performed by the neural network is carried out in pixels. Considering a minimum error of one pixel in the delineation of each of the two RNFL boundaries and a vertical resolution of 3.87 µm/pixel, a difference of less than 7.74 µm between the layer thicknesses provided by either method should not be considered significant given the image resolution.
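The resolution bound above reduces to simple arithmetic, sketched here for clarity (constant names are illustrative):

```python
# One-pixel delineation error per boundary, two RNFL boundaries,
# and a vertical resolution of 3.87 µm per pixel.
UM_PER_PIXEL = 3.87
ERROR_PIXELS_PER_BOUNDARY = 1
N_BOUNDARIES = 2

# Smallest thickness difference that the image resolution can resolve.
min_significant_um = N_BOUNDARIES * ERROR_PIXELS_PER_BOUNDARY * UM_PER_PIXEL
print(min_significant_um)  # 7.74
```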
Table 4 shows the mean value of the RNFL thickness in each sector of the eye calculated by the three methods. It can be seen that the mean values of the layers lie between a minimum of 67.3 µm and a maximum of 136.2 µm, which correspond to 8.7 and 17.6 times the 7.74 µm limit imposed by the image resolution. Therefore, the layer thicknesses measured by each method are very close to each other, considering the limitation of the image resolution. Note that the proposed method tends to provide slightly greater thicknesses than the other two methods in most sectors.
Focusing on Table 5 with the thickness errors, it can be observed that the smallest and largest mean values correspond to 6.9 and 17.6 µm, respectively, which, again, are acceptable values given the image resolution. Excluding the overall thickness (G), the temporal sector (T) shows the lowest error of all sectors, both in mean value and in standard deviation. This is mainly because the presence of blood vessels in this sector is reduced, and therefore the three methods provide similar measurements. By contrast, sectors such as the temporal superior (TS), temporal inferior (TI), and nasal inferior (NI), with a considerable number of artifacts and shadows, show much greater differences. One sector of particular interest is the nasal sector (N). In the comparison between Spectralis and the proposed method, the mean error is quite high in both eyes, about 17 µm. As can be seen in Figure 15, although some vessels appear in this sector, the mean error is relatively high, due in part to the fact that Spectralis tends to lower the upper edge. Thus, Spectralis provides a lower mean thickness than the proposed method, with an error even higher than in sectors with more artifacts.
The two right-hand columns of Table 5 show the relative error of the proposed method with respect to the other two methods. Analogously to the absolute error, the lowest value corresponds to the overall mean (G), with 0.07 for the right eye, and the worst case corresponds to the nasal sector (N), with 0.19, also for the right eye. It is also noteworthy that the error between the proposed method and the H-DLpNet method has a smaller magnitude than that obtained with Spectralis. The similarity between the results of the proposed method and the H-DLpNet method can also be measured with the Dice similarity coefficient detailed in Table 4. As can be seen, the concordance between the segmented layers is high, since the Dice coefficient is higher than 0.893 in all sectors.
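For reference, the Dice similarity coefficient can be computed from two binary layer masks as follows (a minimal sketch; the mask construction is illustrative):

```python
import numpy as np

def dice_coefficient(mask_a, mask_b):
    # Dice similarity: 2 * |A ∩ B| / (|A| + |B|) for binary masks.
    a = mask_a.astype(bool)
    b = mask_b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: define as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom

# Two toy layer masks of 8 pixels each, overlapping in 4 pixels.
a = np.zeros((4, 4), bool); a[1:3, :] = True
b = np.zeros((4, 4), bool); b[2:4, :] = True
print(dice_coefficient(a, b))  # 0.5
```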
We emphasize that both the proposed method and the H-DLpNet method use the same set of images provided by the Spectralis device. However, we cannot be sure whether the Spectralis software uses only the information from the OCT images or makes use of other data available to the device software but not present in the image set. The details of this built-in software are not available and, in general, the processing is specific to each particular piece of equipment. This fact makes it difficult to compare measurements between equipment from different brands, as well as with implementations based on recent research. For example, in [25], the authors highlight the differences in manual and automated segmentation of drusen volume between a Heidelberg spectral-domain (SD) and a Zeiss swept-source (SS) PlexElite optical coherence tomography (OCT) device.
At this point, it is important to remark that the main purpose of the segmentation of the RNFL in peripapillary B-scan OCTs is the assessment of glaucoma status or progression. To this end, the main objective is the automatic measurement of the evolution of the RNFL thickness over time, in order to determine its relationship with the progression of the disease. For this reason, although the results provided by these three methods differ slightly, what is really important for application in glaucoma screening is the consistency of the chosen method over time. Since the image characteristics producing these small differences between methods are constant over time, the methods studied, including the proposed approach, can be expected to provide this consistency. Nevertheless, this hypothesis needs to be verified as the method is used in research on progression in glaucoma patients.
An important distinguishing feature between the proposed method and the H-DLpNet method is the ability to apply constraints in the segmentation process. In the H-DLpNet method, training is first performed with a set of labeled images, and the convolutional neural network is then applied to automatically predict the segmentation in the target images. By contrast, the proposed method can operate directly on the images in the database and is able to impose certain restrictions in the segmentation process, such as the grey levels of the layers, minimum and maximum thicknesses, periodicity, smoothness or maximum curvature, and layer continuity, among others. Applying such constraints is more challenging in neural-network-based methods, and in the proposed method they provide greater robustness in layer detection. In particular, the H-DLpNet method fails in the segmentation of seven images of the database, which are instead correctly segmented by the proposed method.
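As an illustration of the kind of constraints mentioned above, a minimum/maximum-thickness restriction between two detected boundaries could be imposed as follows (a hypothetical sketch; the thickness limits, smoothing window, and function names are not from the paper):

```python
import numpy as np

def constrain_lower_boundary(upper, lower, t_min, t_max, smooth_win=5):
    # Enforce per-column thickness limits (in pixels) between the
    # detected upper and lower boundaries, then smooth the result.
    thickness = np.clip(lower - upper, t_min, t_max)
    lower_c = upper + thickness
    # Simple moving-median smoothing to impose layer continuity.
    pad = smooth_win // 2
    padded = np.pad(lower_c, pad, mode='edge')
    return np.array([np.median(padded[i:i + smooth_win])
                     for i in range(len(lower_c))])

# Toy boundaries (row index per image column); column 1 has an outlier
# and column 3 an impossible negative thickness.
upper = np.array([10, 10, 10, 10, 10], float)
lower = np.array([18, 40, 19, 5, 20], float)
out = constrain_lower_boundary(upper, lower, t_min=6, t_max=15)
```

This kind of rule-based post-processing is straightforward in a classical pipeline, whereas a purely learned predictor would need such priors encoded in the training data or loss function.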
Although the proposed method has proven robust to the image artifacts produced by vessel shadowing and to the speckle noise of usual practical scenarios (variance less than 0.1), some issues may hinder the correct segmentation of the RNFL. Neither the proposed method nor the H-DLpNet method provides satisfactory results in four images of the dataset, in which the retinal layers lack continuity at the edges of the image or part of the RNFL is cut off by the image area, as exemplified in Figure 19. It should be noted, however, that in clinical practice the expert repeats the acquisition of such images, as they are considered invalid.
In addition, approaches to segmenting the retinal layers in OCT images based on machine learning usually require a huge amount of labeled data, which are difficult to obtain, even more so if the data are derived from manual segmentations. Furthermore, the contours provided by expert ophthalmologists possess a certain variability due to the subjective knowledge that varies between observers. Therefore, an automatic method for the segmentation of the retinal layers based on "classical" image processing is useful, since it is usually faster, it can be used as a standalone technique, or it can provide a large amount of labeled data to be used in deep learning methods under human supervision.
Finally, the computational load of the methods was also compared. The proposed method requires an average processing time of 1.818 s per image, substantially less than the 5.169 s per image required by the H-DLpNet method. Note that, as indicated above, the latter additionally requires a network training process prior to prediction. Thus, the proposed method shows a higher efficiency in the segmentation process.
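The per-image timing comparison can be reproduced with a simple wall-clock average (a sketch; `segment_fn` stands in for any of the compared segmentation routines):

```python
import time

def mean_runtime_s(segment_fn, images):
    # Average wall-clock processing time per image, in seconds.
    t0 = time.perf_counter()
    for img in images:
        segment_fn(img)
    return (time.perf_counter() - t0) / len(images)

# Usage with a dummy segmentation routine:
avg = mean_runtime_s(lambda img: img, range(100))
```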