2. Materials and Methods
2.1. Study Design and Population
We planned an analytical, observational, prospective, cross-sectional and comparative study including healthy subjects, ocular hypertension patients and POAG patients. They were consecutively recruited at Policlínica Baza (Baza, Spain) and Clínica Vistacamacho (Almería, Spain) between May 2016 and June 2018.
The patients were informed about the nature of the study, agreed to participate, and provided informed consent according to the data protection law currently in force. The study adhered to the tenets of the Declaration of Helsinki (sixth revision, 2008) and was approved by the ethics committees of both participating centres (Policlínica Baza, Baza, Spain and Clínica Vistacamacho, Almería, Spain).
The inclusion criteria were: (A) cognitive ability to accept and understand the proposed procedures; (B) for healthy subjects, a clinical record of intraocular pressure (IOP) < 23 mmHg with no perimetric damage, a normal optic disc appearance in the fundus examination and all sector values assessed by SD-OCT (BMO-MRW and peripapillary measurements) above the 5th percentile; (C) for ocular hypertension (OH), a clinical record of IOP > 23 mmHg with no perimetric damage, a normal optic disc appearance and all sector values assessed by SD-OCT above the 5th percentile; (D) for POAG, historical clinical records of an IOP > 23 mmHg at least once, together with perimetric damage or a glaucomatous optic disc appearance; such patients had to have received anti-hypertensive treatment; (E) for all patients, a minimum of two VF records with <20% false-negative responses, <20% false-positive responses and <20% fixation losses, as well as glaucomatous defects according to the Hodapp classification. VF tests were performed using the Swedish interactive threshold algorithm (SITA) standard strategy with a Humphrey Field Analyzer II (software version 4, programme 24-2, Goldmann size III stimulus, stimulus duration of 200 ms, model HFA 740, Humphrey Instruments, Inc., Dublin, CA, USA); (F) Caucasian race; (G) open-angle verification by gonioscopy and the absence of signs of pseudoexfoliation and pigmentary glaucoma.
The exclusion criteria were as follows. Ocular hypertensive patients or POAG patients were excluded from the study if they (A) did not agree to participate or did not meet the inclusion criteria, (B) had a direct family history of cognitive neurodegenerative diseases or exhibited suspicious signs, (C) had diseases affecting the retina and optic nerve before or at the time of recruitment, (D) had undergone intraocular surgeries in the previous 6 months, except for successful cataract removal or surgeries related to POAG, (E) exhibited non-transparent media at any level, (F) had severe systemic diseases, (G) had refractive errors above a spherical equivalent of three dioptres, (H) had SD-OCT tests with signal strengths below 20, or (I) were initially considered hypertensive and did not have high IOP values after 3 months of anti-hypertensive treatment and discontinuation of this treatment.
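The VF reliability limits in the inclusion criteria can be expressed as a simple screening check. The following is a minimal sketch, not the procedure used in the study: the record type `VisualFieldTest`, its field names and the helper functions are hypothetical, and reading "a minimum of two clinical records" as "at least two reliable VF tests" is an assumption.

```python
from dataclasses import dataclass

# Limit quoted in the inclusion criteria: each reliability index must stay below 20%.
RELIABILITY_LIMIT = 0.20


@dataclass
class VisualFieldTest:
    """Hypothetical record of one VF examination (field names are illustrative)."""
    false_positive_rate: float  # fraction of false-positive responses, 0..1
    false_negative_rate: float  # fraction of false-negative responses, 0..1
    fixation_loss_rate: float   # fraction of fixation losses, 0..1


def is_reliable(test: VisualFieldTest, limit: float = RELIABILITY_LIMIT) -> bool:
    """A test counts as reliable only if all three indices are below the limit."""
    return (
        test.false_positive_rate < limit
        and test.false_negative_rate < limit
        and test.fixation_loss_rate < limit
    )


def has_enough_reliable_tests(tests, minimum: int = 2) -> bool:
    """Assumed reading of criterion (E): at least `minimum` reliable VF tests."""
    return sum(is_reliable(t) for t in tests) >= minimum
```

A patient with two reliable tests and one unreliable test would pass this check, whereas a patient with only one reliable test would not.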
2.2. Ophthalmological Assessment
A complete ophthalmological examination was performed during the inclusion session, which included best-corrected visual acuity, cycloplegic refraction (tropicamide 1%), measurement of central corneal thickness by ultrasound pachymetry, slit-lamp examination and gonioscopy, and retinal and optic nerve funduscopy with a 78-dioptre hand lens.
2.3. Spectral Domain Optical Coherence Tomography
SD-OCT was performed on only one eye of each patient to avoid problems related to the correlation between the two eyes of the same patient. OCT was performed with the Heidelberg Spectralis device (Heidelberg Engineering GmbH, Heidelberg, Germany; Heidelberg Eye Explorer software, version 6.7c).
The glaucoma program provided by the manufacturer was used, which has a patented anatomical positioning system (APS) with a series of patterns for scanning of the optic nerve head, the RNFL and the macular ganglion cell layer. The program compares the eyes of patients with normalised baseline values for normal eyes. All the participants underwent an examination of the thicknesses of peripapillary RNFL rings with diameters of 3.0, 4.1, and 4.7 mm centred on the optic nerve and BMO-MRW.
No segmentations were performed manually for the RNFL values. Only small corrections were made by the same experienced operator (A.P.B.) to readjust the Bruch's membrane endings during the BMO-MRW examination.
The absolute values for healthy, glaucomatous and hypertensive eyes, prior to the normalisation indicated by Heidelberg Engineering, were obtained from the seven sectors (G, TS, T, TI, NS, N and NI) of the BMO-MRW and of the three peripapillary rings, and were exported to a Microsoft Excel (version 2016; Microsoft Corporation, Redmond, WA, USA) table. These absolute values were adjusted for age and the area of the papillary cup, and were normalised using the following formula provided by Heidelberg Engineering:
where:
$x_i$ is the value of individual i in variable x.
$\bar{x}$ is the mean of variable x.
$s_x$ is the standard deviation of variable x.
$\mathrm{age}_i$ is the age of individual i.
$\overline{\mathrm{age}}$ is the mean age of healthy individuals at baseline.
$\beta_{\mathrm{age}}$ is the slope of the regression line of variable x on age.
$\mathrm{area}_i$ is the area of the BMO-MRW of individual i.
$\overline{\mathrm{area}}$ is the mean area of the BMO-MRW of healthy individuals at baseline.
$\beta_{\mathrm{area}}$ is the slope of the regression line of variable x over the area of the BMO-MRW.
$z_i$ is the standardized value of individual i in variable x.
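The quantities defined above are those of a z-score computed after a linear adjustment for age and area. As a sketch only, the normalisation could take the form below. Two things here are assumptions, not taken from the manufacturer's documentation: the symbol names, which are chosen for illustration, and the subtraction of both adjustment terms, which is the usual form of a covariate-adjusted z-score. Here $x_i$, $\bar{x}$ and $s_x$ are the value, mean and standard deviation of variable x; $\mathrm{age}_i$ and $\mathrm{area}_i$ are the age and BMO area of individual i; the barred quantities are the corresponding means in healthy individuals at baseline; and $\beta_{\mathrm{age}}$ and $\beta_{\mathrm{area}}$ are the regression slopes of x on age and on area.

```latex
z_i = \frac{x_i - \bar{x}
            - \beta_{\mathrm{age}}\,\bigl(\mathrm{age}_i - \overline{\mathrm{age}}\bigr)
            - \beta_{\mathrm{area}}\,\bigl(\mathrm{area}_i - \overline{\mathrm{area}}\bigr)}
           {s_x}
```

Under this form, an eye whose measurement equals the value expected for its age and area would obtain $z_i = 0$, and negative values would indicate thinner-than-expected tissue.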
2.4. Statistical Analysis
Data processing and analysis were performed with R (version 3.5), RKWard (version 7.0) and the rk.Teaching package (version 1.3) [13].
A descriptive summary of the main variables in the study was given by groups (healthy and glaucomatous eyes), including the mean and standard deviation.
The normality of the RNFL variables was tested with the Kolmogorov–Smirnov test, and the homogeneity of variances was tested with Box’s M test, at a significance level α.
Pearson’s correlation coefficient was used for the correlation analysis of the variables. A correlation was considered strong for r > 0.75 and very strong for r > 0.9. A principal components analysis (PCA) was performed to reduce data dimensionality and to determine which combination of variables explained most of the variability of the data. PCA is a statistical method that reduces data dimensionality by transforming the original variables (usually correlated) into a new set of linearly uncorrelated variables. The first principal component is the linear combination of variables that explains the greatest variability in the data (maximum variance). The second principal component is the linear combination of variables with maximum variance among those orthogonal to the first component [14].
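The correlation thresholds and the idea of a first principal component can be illustrated with a short, dependency-free sketch. It is not the R code used in the study: the function names are hypothetical, the toy data are invented, the component is extracted from the covariance matrix by power iteration (an assumed, simple technique that only recovers the first component), and the label "weaker" for correlations at or below 0.75 is a placeholder.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_label(r):
    """Thresholds quoted in the text: r > 0.75 strong, r > 0.9 very strong."""
    if r > 0.9:
        return "very strong"
    if r > 0.75:
        return "strong"
    return "weaker"  # placeholder label, not from the paper


def covariance_matrix(rows):
    """Sample covariance matrix of observations given as rows of variables."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
            for i in range(p)]


def first_principal_component(rows, iterations=200):
    """Return (loadings, explained variance ratio) of the first component.

    Power iteration converges to the dominant eigenvector of the covariance
    matrix provided the start vector is not orthogonal to it.
    """
    cov = covariance_matrix(rows)
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    total_variance = sum(cov[i][i] for i in range(p))
    return v, eigenvalue / total_variance


# Invented toy data: two strongly correlated variables.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson_r(x, y)
loadings, explained = first_principal_component(list(zip(x, y)))
print(correlation_label(r), loadings, explained)
```

When two variables are this strongly correlated, almost all of the variance falls along a single direction, which is why a few principal components can summarise many correlated sector measurements.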
The k-means method was used to cluster eyes into glaucoma stages [14]. This algorithm groups eyes into k clusters or classes according to the Euclidean distance from the eye to the cluster centroids in the n-dimensional space of the variables. Every eye is assigned to the cluster with the closest centroid. The initial choice of cluster centroids was random. The number of clusters k was decided according to the elbow criterion [15] by observing the reduction in intra-group variability when increasing the number of clusters. This criterion chooses the number of clusters beyond which the reduction in intra-group variability becomes markedly smaller.
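The clustering step and the elbow curve can be sketched in plain Python as follows. This is an illustrative version under stated assumptions, not the study's R implementation: the function names, the invented two-dimensional toy points, the fixed seed and the use of several random restarts per k (`n_init`) are all choices made here for the example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean (centroid) of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def k_means(points, k, rng, max_iter=100):
    """Lloyd's algorithm with random initial centroids drawn from the data."""
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assign every point to the cluster with the closest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Move each centroid to the mean of its members (keep it if empty).
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = mean_point(members)
    wcss = sum(squared_distance(p, centroids[lab])
               for p, lab in zip(points, labels))
    return labels, centroids, wcss


def elbow_curve(points, max_k, n_init=20, seed=0):
    """Best within-cluster sum of squares for k = 1..max_k over n_init restarts."""
    rng = random.Random(seed)
    return {
        k: min(k_means(points, k, rng)[2] for _ in range(n_init))
        for k in range(1, max_k + 1)
    }


# Invented toy data with three visually separate groups.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
    (9.0, 1.0), (9.1, 1.2), (8.9, 0.9),
]
curve = elbow_curve(points, max_k=5)
for k, wcss in curve.items():
    print(k, round(wcss, 3))
```

In data like these, the intra-group variability drops sharply up to the true number of groups and only marginally afterwards; that bend is what the elbow criterion looks for.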
A linear discriminant analysis (LDA) [14] was used to classify eyes into the clusters previously defined by the k-means method. LDA is a statistical method used to identify a set of discriminant functions, all of which are uncorrelated linear combinations of the variables, that maximize the differences between classes and thus best separate them [14]. The number of discriminant functions created is the number of classes minus one. The classification power of the classifiers was assessed by cross-validation, and by computing the sensitivity, specificity and overall accuracy from the confusion matrix.
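The evaluation step, deriving sensitivity, specificity and overall accuracy from a confusion matrix, can be sketched as follows. The function name is hypothetical and the counts are invented for illustration; they are not the study's results. Each class is scored one-versus-rest, and the balanced accuracy (the mean of sensitivity and specificity) is included as a convenience.

```python
def per_class_metrics(confusion, labels):
    """One-vs-rest metrics from a square confusion matrix.

    confusion[i][j] is the number of eyes whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(labels)))
    results = {}
    for i, label in enumerate(labels):
        tp = confusion[i][i]
        fn = sum(confusion[i]) - tp                 # true class i, predicted other
        fp = sum(row[i] for row in confusion) - tp  # other class, predicted i
        tn = total - tp - fn - fp
        sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
        specificity = tn / (tn + fp) if (tn + fp) else float("nan")
        results[label] = {
            "sensitivity": sensitivity,
            "specificity": specificity,
            "balanced_accuracy": (sensitivity + specificity) / 2,
        }
    return results, correct / total


# Invented counts for four stages (rows: true stage, columns: predicted stage).
stage_labels = ["I", "II", "III", "IV"]
confusion = [
    [8, 2, 0, 0],
    [1, 7, 2, 0],
    [0, 1, 9, 0],
    [0, 0, 1, 9],
]
metrics, accuracy = per_class_metrics(confusion, stage_labels)
for stage in stage_labels:
    print(stage, metrics[stage])
print("overall accuracy:", accuracy)
```

In a cross-validated setting, the confusion matrix would be filled with the predictions made on held-out eyes, so that these measures are not inflated by evaluating on the training data.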
4. Discussion
Even after the emergence of OCT, one constant in attempts to classify POAG has been the VF. Our study did not use the VF as a classification element given its subjective nature, which stems from arbitrary exclusion limits for fixation losses and false-positives, cognitive impairment and difficult collaboration. VF-based classifications have been proposed, and many reviews of these classifications have been published [16,17,18]. Taken together, these reviews lead to one conclusion: the presence of glaucomatous optic neuropathy increases with staging severity for all systems. However, different systems led to different severity stages [18].
Other proposals for the diagnosis and classification of POAG based on OCT angiography have recently been presented [19,20,21,22,23]. Whether based on retrospective reviews of articles [23] or on the authors' own work, all of these proposals reveal the potential of this technique, but they also introduce uncertainty due to inconsistent criteria for glaucoma diagnosis and, especially, glaucoma classification. We believe that OCT angiography introduces elements such as anatomical anomalies, exudates, haemorrhages, and the distorting effect of the vessels on the nasal side of the papilla, which can lead to misinterpretation. Therefore, we considered the actual anatomical shape of the papilla limits, BMO-MRW [24], and the normalized values of the BMO-MRW and RNFL to be objective, accessible, and reproducible variables on which to base a clinical-evolutionary glaucoma classification system.
The diagnostic accuracy of the sectoral and total analysis of RNFL and/or BMO-MRW has also been determined in glaucoma. Danthurebandara et al. [25] used an independent OCT normative database and classified eyes as glaucomatous if their BMO-MRW or RNFL values fell below the 1%, 5% or 10% normative limits of that database, using either all the measurements (total analysis) or the sectoral means (sectoral analysis). They reported that at a normative limit of 1%, the sectoral analysis of BMO-MRW gave 87% sensitivity and 92% specificity, while the total analysis yielded 88% sensitivity at the same specificity (92%). For RNFLT, the sectoral analysis yielded 85% sensitivity and 95% specificity, while the total analysis gave 83% sensitivity at the same specificity (95%). The results for the 5% and 10% normative limits yielded lower specificity, but higher sensitivity. The authors concluded that both analyses, the sectoral and total, had similar diagnostic accuracies.
Fan et al. [26] compared the diagnostic ability of three-dimensional (3D) neuroretinal rim parameters (including BMO-MRW) and two-dimensional (2D) parameters (including RNFL thickness) using SD-OCT. They concluded that 3D parameters have the same or better diagnostic ability than 2D parameters. Among the three parameters of the 3D rim (Minimum Distance Band (MDB), BMO-MRW-MRW), no significant differences were found in diagnostic capacity (false detection rate >0.05 with 95% specificity).
Authors like Zheng et al. [27] have also worked with the inferotemporal and superotemporal sectors of the RNFL and compared their sensitivity and specificity with those of the same sectors of the BMO-MRW in perimetric POAG, using the 5th percentile as the limit of abnormality. Abnormal superotemporal and/or inferotemporal RNFL thicknesses achieved a higher sensitivity than abnormal superotemporal and/or inferotemporal BMO-MRWs in detecting mild glaucoma (mean VF MD: −3.32 ± 1.59 dB) (97.9% and 88.4%, respectively, p = 0.006), and glaucoma (mean VF MD: −9.36 ± 8.31 dB) (98.4% and 93.6%, respectively, p = 0.006), with the same specificity (96.1%). We believe that below the 5th percentile, the best definitions are of little significance when more specific sectors like G and TI are not studied.
In our study, we relied on artificial intelligence (AI) for its potential to detect, diagnose and classify glaucoma through the automated processing of large data sets and the early detection of new disease patterns [28]. We considered the BMO-MRW parameter to be important when defining the stages of the disease, as shown in Table 4. When the diagnostic capacities of the parameters were assessed, BMO-MRW yielded better diagnostic performance than the other parameters. At 95% specificity, the sensitivity of RNFLT, BMO-HRW, and BMO-MRW was 70%, 51%, and 81%, respectively. More studies have compared the diagnostic ability of BMO-MRW and peripapillary RNFL. Gardiner [29] found that peripapillary RNFL outperformed BMO-MRW during the follow-up assessment of glaucoma. In contrast, Chauhan et al. [30], Enders et al. [31] and Malik et al. [32] showed that BMO-MRW had higher diagnostic ability than peripapillary RNFL. Bambo et al. [33] did not find any differences between the diagnostic ability of peripapillary RNFL and that of BMO-MRW in glaucoma. In the present study, we integrated both strategies to achieve a better model.
In a recent study, Brusini [34] proposed a classification with similarities to our own, but also with some clear differences. Brusini’s classification uses the upper and lower RNFL thickness values plotted on a Cartesian plane to classify glaucoma OCT results. The RNFL defects are classified into six stages of increasing severity and three classes of defect location (upper, lower, or diffuse defects). The diagram was created based on 302 OCT tests with 94 healthy controls and 284 patients affected by perimetric POAG.
Our study provides a more practical and simpler proposal and incorporates healthy, OHT and glaucomatous patients. We noted some substantial differences with the study by Brusini. Firstly, we included more variables that may be affected by glaucoma (28 variables of BMO-MRW and RNFL sectors) than Brusini did (only two RNFL sectors). Brusini did not consider BMO-MRW assessment. We corroborated the importance of BMO-MRW for determining glaucoma severity, as two of its variables (BMO-MRW.G and BMO-MRW.TI) appear in the simplified classification model (Table 4).
Brusini considered the mean thicknesses of the upper and lower quadrants of the RNFL, but ignored other variables on the grounds that they are not reliable for determining structural damage, although no justification for excluding these variables was provided. We propose considering all the RNFL and BMO-MRW sectors (NI, N, NS, TS, T, and TI), as well as a global measure (G) that averages 728 points in each rim. We selected those sectors with greater discriminatory power in the final classification model.
Brusini proposed six groups of increasing severity to classify POAG, but this number is arbitrary and does not follow objective or statistical criteria. Our work established four classes because this number is optimal for reducing intra-group variability according to the k-means algorithm. A larger number of classes does not substantially reduce the total intra-group variability, as shown in Figure 5.
The groups that Brusini established are separated on the Cartesian plane in the direction of the two variables considered by taking intervals of equal length in the x- and y-axes. However, these axes do not correspond to the directions of the maximum variability of the data, which would be the directions of the principal components.
Our groups are very well separated in the direction of the first principal component, as shown in Figure 6. When using the k-means algorithm, the resulting groups were not the same size, but emerged more naturally by agglutination around the centroids of each group. As for the validation of the classification model, Brusini only studied sensitivity (0.85–0.95) and specificity (0.92–0.98) for distinguishing between patients with perimetric damage and healthy subjects, but not for distinguishing between different stages. Our results, which include ocular hypertensive patients (without perimetric damage), validate the system with similar balanced accuracies (Table 6) when healthy eyes are not included. When healthy eyes are included in the classification system, the sensitivity of stages I and II decreases. The reason for this is that the cluster of stage I, and partially the cluster of stage II, overlap the cluster of healthy eyes, as we can see in Figure 6b. Thus most of the eyes in stage I are wrongly classified as healthy, as are some of the eyes in stage II. However, the sensitivity and specificity of stages III and IV remain similar to those of Brusini’s study.
Our classification algorithm not only works within a group of known glaucomatous patients but can also be useful for the diagnosis of POAG in general population surveys. Although its performance in early glaucoma stages (I and II) shows low sensitivity and specificity, significant values are reached in more advanced stages (III and IV), thus enabling diagnosis with a mean global precision of 0.88. Given the robustness that our study gains from the verification and evaluation of 1001 patients, 235 of whom had glaucoma at various disease stages, we consider it acceptable that only a single tomographic scan (the last scan performed) was considered for each patient. This scan was taken as the examination that defines the current state of the disease process.
This study has some limitations. As it is a cross-sectional study, its conclusions cannot be strictly applied to POAG progression over time and follow-up. More studies are needed in this respect. Moreover, this work was performed with Caucasian participants, so its results should not be extended to other races. In addition, individuals with refractive errors above a spherical equivalent of three dioptres were excluded from this study in order to avoid artifactual OCT results. Hence this system cannot be applied to eyes with moderate to severe refractive errors, and the external validity of the current study may be restricted by ethnic or refractive factors. Finally, we only considered VF results as inclusion/exclusion criteria in this work. Thus, we did not compare our proposed OCT clinical-evolutionary staging system with VF parameters. The reasons for this were the subjective nature and poor reliability of VF, as explained above. However, we consider that this issue should be addressed in further studies.
In conclusion, we present a new objective method to classify glaucoma patients according to both BMO-MRW and RNFL measurements that could be useful in clinical practice.