1. Introduction
Diabetic retinopathy (DR) is a major complication of diabetes mellitus, a chronic condition that affects a large and growing population. Diabetic macular edema (DME) remains the leading cause of vision loss in diabetic individuals. According to the International Diabetes Federation (IDF) Diabetes Atlas (11th edition), 1 in 9 adults aged 20–79 has diabetes, and about 40% of those affected are unaware that they have the illness. The IDF forecasts that 853 million people, or 1 in 8 adults, will have the disease by 2050, a rise of 46% in prevalence [1]. Diagnosis, monitoring, and treatment of DR and DME have therefore become vital in recent years. Imaging modalities such as OCT provide high-resolution imaging of retinal tissue, allowing clear, non-invasive visualization of changes in retinal morphology [2]. These imaging methods have become essential for guiding therapy and predicting patient outcomes, along with improving diagnostic efficiency.
Retinal health and structural abnormalities can be assessed noninvasively using OCT imaging biomarkers. Subretinal fluid (SRF), intraretinal cystic spaces, disorganization of the retinal inner layers (DRIL), and retinal thickness are all crucial OCT biomarkers, designated as important identifiers of disease severity, treatment response, and progression. DRIL, apparent in OCT images, is a vital imaging biomarker for DR and particularly for DME. Sun et al. defined DRIL as follows: "Disorganization of the retinal inner layers was defined as the horizontal extent in microns for which any boundaries between the ganglion cell–inner plexiform layer complex, inner nuclear layer, and the outer plexiform layer could not be identified" [3].
Figure 1 illustrates the presence and absence of DRIL in OCT images.
Extensive clinical research has demonstrated the importance of DRIL as a reliable biomarker across multiple dimensions. Deep learning (DL) has opened novel possibilities in healthcare imaging for biomarker identification and computer-based disease categorization. DL architectures have achieved exceptional results in retinal image classification tasks, often matching or even surpassing clinicians in identifying a variety of OCT characteristics, including hyperreflective retinal foci (HRF), SRF, and structural changes in the ellipsoid zone (EZ).
Although current treatment methods for DME and DR have become more successful, early detection and timely care remain essential. Automated disease-identification tools that triage patients to a suitable ophthalmologist are therefore very important. Individuals with DME are routinely scanned with OCT in clinical practice. The OCT biomarker DRIL has been shown to be associated with retinal nerve fiber layer (RNFL) thickening, decreased visual acuity (VA), EZ disruption, and retinal function impairment. DRIL is difficult to detect because the OCT variations are subtle, especially in the earlier manifestations of DME and DR [5]. Recent developments in deep learning and artificial intelligence have shown enhanced potential for faster, automated, and accurate health evaluation in the ophthalmic domain.
OCT biomarkers such as DRIL are crucial because the literature offers no alternative means of understanding when people with DME may lose or regain their eyesight over time. Potential biomarkers like DRIL, the location of fluid accumulation, and variations in retinal morphology can all be identified with OCT. Identifying DRIL at baseline is therefore important for categorizing people by disease prognosis in retinal health research. Manual grading of DRIL by human experts is tedious, whereas automated methods can speed up the work and enable more extensive research into the connection between DRIL development and treatment response. A few researchers have applied DL to OCT to diagnose macular edema (ME) associated with other disorders. This paper details the implementation method, the uniqueness of DRIL, and the idea of integrating DRIL detection into computer-based diagnostic methods.
2. Related Work
This section reviews state-of-the-art methods for DRIL detection and studies establishing the association of DRIL with other OCT biomarkers. Sun et al. [3] introduced DRIL as a novel biomarker, showing that greater DRIL extent at baseline correlates with decreased VA. Importantly, change in DRIL over the first four months was a stronger predictor of VA outcomes at eight months than central subfield thickness (CST). These findings gave useful insights into the characteristics of DRIL and its associations, establishing DRIL as a noninvasive, reliable predictor of vision. Radwan et al. [6] revealed that, in the presence of DME, changes in the resolution of DRIL are strongly correlated with future VA: severe DRIL was associated with very poor VA outcomes, and VA improved as DRIL cleared. This research confirmed DRIL as a distinctive predictive biomarker in the context of diabetic abnormalities. An extended follow-up study by Sun et al. [7] assessed 80 eyes with resolved or ongoing DME and showed that, in both groups, DRIL spanning more than 50% of the central 1 mm foveal region was consistently linked to lower VA. Grewal and Jaffe [4] defined DRIL as the "inability to distinguish between the outer plexiform layer, inner nuclear layer, and ganglion cell layer–inner plexiform layer complex." They found DRIL to be a reliable predictive biomarker for evaluating VA in both uveitic and diabetic ME and highlighted its value in patient counselling and therapy planning. Various studies have documented the use of DRIL as a DME biomarker. Acon and Wu [8] note that although OCT offers the best means of managing DME, more work is needed on reliable biomarkers such as DRIL, and machine learning built on multiple imaging modalities can support treatment planning and diagnosis.
Das et al. [9] analyzed the association of DRIL with the integrity of the outer retina. Using SD-OCT, they showed that a larger horizontal extent of DRIL is significantly associated with misalignment of the EZ and the external limiting membrane (ELM), both of which contributed to lower best-corrected VA (BCVA). DRIL is thus not only a surrogate marker of VA but also a predictor of increased morphological degeneration in the outer retina, supporting its role as a pathophysiological link between inner and outer retinal disruption. Joltikov et al. [10] demonstrated a correlation between DRIL and decreased VA in individuals with early-stage diabetic retinopathy without ME, underscoring the therapeutic significance of DRIL and suggesting that it may be an early cellular consequence of diabetes. Nakano et al. [11] highlighted DRIL as an important tool for assessing visual function in detail, showing that, irrespective of ME and VA status, DRIL strongly correlates with the degree of metamorphopsia in individuals with DME. These findings were supported by Nadri et al. [12], who demonstrated the association of DRIL with retinal nerve fiber layer thinning and EZ disruption. Di-Luciano et al. [13] performed a systematic assessment of seven research publications evaluating DRIL as an important biomarker of DME.
Most of the reviewed articles employed SD-OCT scans with retrospective cohort and cross-sectional designs. Across these reports, DRIL resolution was linked to improvement of vision, while the presence and increasing extent of DRIL were consistently linked to decreased VA. Recent technical advances in DL have opened new possibilities for automated DRIL detection. Singh et al. [5] used OCT images to categorize DRIL and outlined the advantage of DL in therapeutic healthcare decisions by designing an initial DL model for OCT biomarker categorization. Singh et al. [14] developed a DL-based convolutional neural network (CNN) achieving an accuracy of 88.3% in DRIL classification; however, although it was one of the first attempts to classify DRIL, the work lacked explainability. Tripathi et al. [15] employed a fuzzy-logic-based system leveraging DRIL, HRF, and cystoid spaces to determine DME severity from OCT scans, aiming to acquire and report quantitative information defining classes of DME severity.
Table 1 details the datasets used and the results obtained for DRIL classification in previous studies.
By showing that easily interpretable rule-based methods incorporating biomarkers such as DRIL can automate the assessment of DME severity, this work bridged the gap between computer-based decision support systems and clinical decision-making strategies. Toto et al. [17] offered a DL-based system to identify hard exudates (HE) and classify DRIL in patients with DME. They report that DL-supported DRIL detection is viable and provided one of the first extensive AI-based methods combining classification pipelines with object-detection strategies for DME biomarker assessment. An extensive literature review of DRIL-related articles by Tripathi et al. [19] found that, irrespective of the treatments given, multiple studies relate the extent of DRIL to both baseline VA and long-term outcomes; the authors call for additional research to improve DRIL identification and to apply the results to individualized patient care. Singuri et al. [16] demonstrate that, despite the subjective nature of DRIL grading, DRIL has a substantial association with VA status and DR. Ruiz-Medrano et al. [18] identified DRIL as a vital OCT biomarker of DME in a study on predicting treatment options, showing that the presence of DRIL, along with other biomarkers, predicts failure of anti-VEGF treatment and the need for additional therapeutic approaches.
3. Materials and Methods
This section details the method, the datasets, and the training pipeline (Figure 2) used for DRIL classification under limited-annotation conditions. We employ the self-supervised Bootstrap Your Own Latent (BYOL) [20] learning framework, in which a ResNet-50 backbone pretrained on ImageNet serves as the encoder. This encoder is refined on 109,309 unlabeled OCT images to learn domain-specific retinal features in a class-agnostic manner. DRIL classification is then achieved by adapting the pretrained ResNet-50 encoder via transfer learning using only 823 labeled images.
3.1. Mendeley Dataset
A large labelled OCT and chest X-ray collection made available by Kermany et al. [21] is utilized in this research. The dataset contains 109,309 OCT images in four categories: DME, drusen, normal, and choroidal neovascularization (CNV). These retinal OCT images were obtained from retinal scans performed at UC San Diego Health and the Shiley Eye Institute. The dataset is subdivided into training, validation, and testing groups to facilitate self-supervised learning. Because it is publicly available, diverse, and comprehensive, this dataset has become a common benchmark for evaluating DL models in retinal image categorization.
3.2. KMC Dataset
This private, labeled dataset was obtained from the Department of Ophthalmology, Kasturba Medical College (KMC), Manipal, MAHE. It contains anonymized, fovea-centered, original OCT B-scans used for evaluating the DRIL classification model. Ethical clearance was obtained from the Kasturba Medical College and Kasturba Hospital Institutional Ethics Committee (approval code IEC1-287/2022). The retrospective dataset, collected between January 2019 and August 2022, contains horizontal, high-quality, fovea-centered B-scans with a signal strength greater than 7, all captured using a certified Zeiss Cirrus HD-OCT 5000 imaging device. The dataset comprises 429 OCT images with DRIL present and 394 images without DRIL. Expert-validated consensus annotations were created by two experienced ophthalmologists from KMC Manipal, each with more than twenty-three years of professional experience treating patients with a multitude of eye disorders, including DR and DME. After initial labelling, discrepancies were resolved through a consensus protocol. The resulting dataset is therefore highly reliable, with clinical accuracy and relevance, providing a strong basis for training and validating DL models while limiting bias and promoting model generalization.
Inter-Observer Agreement Statistics
Inter-observer variability metrics quantify the consistency of annotations among observers. Two experienced doctors, each with more than 23 years of clinical practice, labelled the 823 OCT images for the presence or absence of DRIL. Cohen's kappa coefficient (κ), which measures chance-corrected agreement between the two sets of annotations, is used to quantify inter-observer variability. Of the 823 OCT scans, 796 (96.7%) were annotated with full agreement: both doctors marked DRIL present in 409 (49.7%) scans and DRIL absent in 387 (47.0%). For the remaining 27 scans, classifications were discordant: 16 images (1.9%) were labelled DRIL-present by Observer 1 and DRIL-absent by Observer 2, and 11 images (1.3%) were labelled DRIL-absent by Observer 1 and DRIL-present by Observer 2. Statistical assessment revealed excellent inter-observer agreement, with Cohen's κ = 0.933 (95% CI: 0.906–0.960, p < 0.001), well above the threshold for excellent reliability (κ > 0.81). All 27 discordant cases were discussed by both doctors to resolve the discrepancies and produce final annotations through consensus labelling. The detailed inter-observer agreement analysis is shown in Figure 3. The confusion matrix (Panel A) shows strong concordance, indicating excellent agreement beyond chance, with 96.7% overall agreement (796/823 images) and very few discordant cases (27 cases, 3.3%): Observer 1 alone marked DRIL present in 16 of these (59.3% of disagreements), and Observer 2 alone in 11 (40.7%). As seen on the kappa scale in Panel C, the result lies firmly in the "Excellent" range.
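As a check on the statistics above, Cohen's κ can be recomputed directly from the reported agreement counts. The following pure-Python sketch (function name ours) reproduces the statistic to within rounding of the reported 0.933:

```python
# Cohen's kappa for the two observers' DRIL annotations, recomputed from
# the agreement counts reported above (a pure-Python sketch).

def cohens_kappa(both_present, both_absent, o1_only, o2_only):
    """Kappa from a 2x2 annotation confusion matrix."""
    n = both_present + both_absent + o1_only + o2_only
    p_observed = (both_present + both_absent) / n
    # Marginal "present" rates for each observer.
    o1_present = (both_present + o1_only) / n
    o2_present = (both_present + o2_only) / n
    p_chance = o1_present * o2_present + (1 - o1_present) * (1 - o2_present)
    return (p_observed - p_chance) / (1 - p_chance)

kappa = cohens_kappa(both_present=409, both_absent=387, o1_only=16, o2_only=11)
print(round(kappa, 3))  # 0.934, consistent with the reported 0.933
```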
3.3. Representation Learning in Anatomical Context
Standard BYOL applies global average pooling to the encoder's final feature map to obtain a single global representation of the image. This pooled vector is then passed through a projection MLP and a prediction MLP. While this approach works well for natural-image representation learning, it discards all spatial structure. In medical imaging, and specifically in OCT, spatial information is essential: pathologies are localized, and retinal layers follow a fixed geometric structure. To address this, we extend BYOL with spatially aware feature learning and introduce a spatial self-supervision loss aimed at preserving structural information, as represented in Equation (1). Instead of relying solely on the deepest feature map, we extract latent representations at multiple spatial resolutions from the ResNet-50 backbone:

$$\{F_1, F_2, F_3, F_4\} = f_\theta(x), \qquad F_i \in \mathbb{R}^{C_i \times H_i \times W_i} \quad (1)$$

Here, $F_1, \ldots, F_4$ correspond to the feature maps at increasing network depth and decreasing spatial resolution. The final latent representation $F_4$ is used for adaptive average pooling, as in the standard BYOL implementation; this branch keeps the standard BYOL objective and maintains compatibility with the global downstream task. For spatial learning, we select $F_3 \in \mathbb{R}^{1024 \times 14 \times 14}$ as the input to our Spatial Branch. This layer provides an ideal balance between spatial granularity (196 spatial locations) and rich semantic content (1024 channels). We process $F_3$ through our custom Spatial Projection Head and Spatial Prediction Head, both of which preserve the spatial grid. The Spatial Projection Head applies a convolution followed by batch normalization and a non-linearity, then a second convolution with batch normalization, producing a lower-dimensional spatial representation without destroying location information. The Spatial Prediction Head takes this spatial representation as input and applies two convolutional blocks, yielding a prediction map of the same shape that plays the role of the BYOL predictor for the spatial branch. This setup avoids collapsing the feature map into a single vector and instead learns a dense field of spatial predictions, allowing the model to retain anatomical structure.
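To make the shape bookkeeping of the Spatial Branch concrete, the following NumPy sketch models each projection convolution as a per-location 1 × 1 linear map, an assumption consistent with the preserved 14 × 14 grid; the hidden width, output width, and random weights are illustrative, not the paper's exact configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, weight):
    """A 1x1 convolution: a per-location linear map that leaves the
    spatial grid untouched. x: (C_in, H, W), weight: (C_out, C_in)."""
    return np.einsum('oc,chw->ohw', weight, x)

def spatial_projection_head(x, hidden=512, out_dim=128):
    """Sketch of the Spatial Projection Head: conv -> ReLU -> conv.
    (Batch normalization omitted for brevity; widths are assumptions.)"""
    c_in = x.shape[0]
    w1 = rng.standard_normal((hidden, c_in)) * 0.01
    w2 = rng.standard_normal((out_dim, hidden)) * 0.01
    h = np.maximum(conv1x1(x, w1), 0.0)   # non-linearity
    return conv1x1(h, w2)                 # (out_dim, 14, 14)

f3 = rng.standard_normal((1024, 14, 14))  # stage-3 ResNet-50 features
z = spatial_projection_head(f3)
print(z.shape)  # (128, 14, 14): 196 locations preserved, channels reduced
```

The key design point is visible in the shapes: channels shrink, but every one of the 196 grid locations keeps its own vector.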
To capture both global semantics and anatomical structure, we introduce a hybrid loss that integrates the standard BYOL global objective with our spatial self-supervision objective. The global branch operates on the deepest feature map $F_4$. Let $q_1$ and $q_2$ denote the global predictions for the two augmented views, and let $z'_1$ and $z'_2$ denote the corresponding target projections obtained from $F_4$ of the target network; the global BYOL loss is given by Equation (2):

$$\mathcal{L}_{\mathrm{global}} = \left(2 - 2\,\frac{\langle q_1, z'_2\rangle}{\lVert q_1\rVert\,\lVert z'_2\rVert}\right) + \left(2 - 2\,\frac{\langle q_2, z'_1\rangle}{\lVert q_2\rVert\,\lVert z'_1\rVert}\right) \quad (2)$$
In parallel, the spatial branch operates on $F_3$, preserving its $14 \times 14$ resolution. After passing $F_3$ through the Spatial Projection and Prediction Heads, we obtain spatial prediction maps $p_1$ and $p_2$ for the two augmented views. The target network produces corresponding spatial projections $t_1$ and $t_2$ with the same dimensions. We compute the cosine similarity between the 128-dimensional vectors at each of the $14 \times 14$ locations and take the mean across all locations, as formally given by Equation (3),

$$\mathcal{L}_{\mathrm{spatial}} = \frac{1}{196}\sum_{h=1}^{14}\sum_{w=1}^{14}\left(2 - 2\,\frac{\langle p_1^{(h,w)}, t_2^{(h,w)}\rangle}{\lVert p_1^{(h,w)}\rVert\,\lVert t_2^{(h,w)}\rVert}\right) \quad (3)$$

and apply the same symmetric formulation as in the global branch, summing the loss over both view orderings.
Finally, we combine the global and spatial objectives into the hybrid loss:

$$\mathcal{L}_{\mathrm{hybrid}} = \lambda_g\,\mathcal{L}_{\mathrm{global}} + \lambda_s\,\mathcal{L}_{\mathrm{spatial}}$$

This hybrid loss enables the model to learn global semantics through $\mathcal{L}_{\mathrm{global}}$ and local anatomical representations through the spatial supervision applied to $F_3$. The resulting encoder captures both high-level disease context and fine-grained retinal morphology, properties that are critical for downstream OCT lesion localization and segmentation.
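The hybrid objective, a BYOL-style cosine loss on the pooled global vectors plus a per-location cosine loss averaged over the 14 × 14 grid, can be sketched in NumPy. This sketch shows one view ordering only; the symmetric second term and all array shapes are illustrative:

```python
import numpy as np

def cos_loss(a, b):
    """BYOL-style loss 2 - 2*cos(a, b) for batched vectors of shape (N, D)."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return float(np.mean(2.0 - 2.0 * np.sum(a * b, axis=-1)))

def spatial_loss(p, t):
    """Mean per-location BYOL loss over the spatial grid.
    p, t: (C, H, W) spatial prediction / target projection maps."""
    c, h, w = p.shape
    p_vecs = p.reshape(c, h * w).T  # (H*W, C): one vector per location
    t_vecs = t.reshape(c, h * w).T
    return cos_loss(p_vecs, t_vecs)

def hybrid_loss(q_global, z_global, p_spatial, t_spatial, lam_g=1.0, lam_s=1.0):
    return lam_g * cos_loss(q_global, z_global) + lam_s * spatial_loss(p_spatial, t_spatial)

rng = np.random.default_rng(0)
p = rng.standard_normal((128, 14, 14))
g1, g2 = rng.standard_normal((1, 256)), rng.standard_normal((1, 256))
print(spatial_loss(p, p) < 1e-9)        # True: identical maps incur no spatial loss
print(hybrid_loss(g1, g2, p, p) > 0.0)  # True: mismatched global views are penalized
```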
We trained the spatial BYOL encoder for up to 125 epochs with early stopping (patience of 25 epochs), a batch size of 64, and gradient accumulation over 4 consecutive mini-batches, yielding an effective batch size of 256 per optimizer update. This configuration allowed us to maintain a large effective batch size. We used the AdamW optimizer with weight decay, together with a cosine annealing learning-rate schedule over 100 epochs. We employed mixed-precision training and gradient clipping; all executions were carried out on NVIDIA A100 GPUs. The online and target encoders were coupled via an exponential moving average (EMA) update. The total training loss combined the global BYOL loss and the spatial BYOL loss with equal weighting ($\lambda_g = \lambda_s = 1$); the best checkpoint, observed at the 100th epoch, was selected as the backbone based on the minimum total BYOL loss on the training set, as depicted in Figure 4, Figure 5 and Figure 6. Since the framework is fully self-supervised and does not use class labels, all four disease categories (CNV, DME, DRUSEN, and NORMAL) contribute equally to representation learning. This class-independent pretraining encourages the learned features to capture a broad range of OCT characteristics across disease types, improving the generalizability of the downstream models.
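Two of the training mechanics above, gradient accumulation and the EMA coupling of online and target encoders, reduce to a few lines. The decay value tau = 0.996 below is BYOL's commonly cited default and is an illustrative assumption, not a value restated from our configuration:

```python
# Sketch of the target-network EMA update and the effective-batch arithmetic
# used in pretraining. tau = 0.996 is an assumed, commonly used BYOL default.

def ema_update(target, online, tau=0.996):
    """theta_target <- tau * theta_target + (1 - tau) * theta_online."""
    return [tau * t + (1.0 - tau) * o for t, o in zip(target, online)]

def effective_batch(batch_size=64, accumulation_steps=4):
    """Gradients from 4 consecutive mini-batches are summed before each
    optimizer step, giving an effective batch of 256."""
    return batch_size * accumulation_steps

target = [0.0, 0.0]
online = [1.0, 2.0]
target = ema_update(target, online)
print(target)              # roughly [0.004, 0.008]: target drifts slowly toward online
print(effective_batch())   # 256
```

Because tau is close to 1, the target network changes slowly, which is what stabilizes the bootstrap targets.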
3.4. Fine-Tuning for DRIL Identification
After self-supervised pretraining, we employed the trained encoder for binary classification of DRIL. The classification model consists of the pretrained BYOL encoder and a lightweight classification head comprising two fully connected layers with dropout regularization and batch normalization. The encoder extracts high-level features for DRIL detection, while the classification head learns the application-specific decision boundary. The linear layers of the classification head were randomly initialized with Kaiming initialization.
To enable application-specific adaptation while avoiding catastrophic forgetting, we employed a two-phase fine-tuning approach. In the first phase, the pretrained encoder was frozen and only the classification head was trainable. This lets the randomly initialized classifier learn patterns on top of the pretrained features without disturbing the learned representations; only about 2.6 M of the model's 73 M parameters were trainable during this phase. In the second phase, end-to-end fine-tuning was performed by unfreezing the encoder, with the learning rate lowered by 90% to allow stable adaptation of the pretrained features to characteristics specific to DRIL detection. This phase permits fine-grained adaptation of both high- and low-level features for optimized DRIL detection outcomes.
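The two-phase schedule can be summarized as a small configuration function; the base learning rate here is illustrative, only the 90% reduction between phases is taken from the text:

```python
# Sketch of the two-phase fine-tuning schedule. base_lr is a placeholder;
# the 90% learning-rate cut between phases follows the protocol described above.

def phase_settings(phase, base_lr=1e-3):
    """Phase 1: frozen encoder, trainable head. Phase 2: all trainable, LR cut 90%."""
    if phase == 1:
        return {"encoder_trainable": False, "head_trainable": True, "lr": base_lr}
    return {"encoder_trainable": True, "head_trainable": True, "lr": base_lr * 0.1}

p1, p2 = phase_settings(1), phase_settings(2)
print(p1["encoder_trainable"], round(p2["lr"], 6))  # False 0.0001
```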
We used a weighted cross-entropy loss with inverse-frequency class weights and the AdamW optimizer with a batch size of 32. The ReduceLROnPlateau scheduler dynamically reduced the learning rate when validation accuracy plateaued, enabling gentler optimization in the later training stages. Gradient clipping with a max norm of 1.0 provided training stability. Owing to the high-quality pretrained representations, training converged within an average of 20–25 epochs, as shown in Figure 7 and Figure 8, demonstrating the efficiency of the self-supervised method.
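With 429 DRIL and 394 no-DRIL images, the inverse-frequency class weights can be computed as follows; the normalization w_c = N / (K * n_c) is one common convention and is assumed here rather than taken from the text:

```python
# Inverse-frequency class weights for the weighted cross-entropy loss,
# computed from the KMC class counts (429 DRIL, 394 no-DRIL). The exact
# normalization convention is an assumption.

def inverse_frequency_weights(counts):
    """w_c = N / (K * n_c): rarer classes get proportionally larger weights."""
    n_total = sum(counts.values())
    k = len(counts)
    return {c: n_total / (k * n) for c, n in counts.items()}

weights = inverse_frequency_weights({"DRIL": 429, "no-DRIL": 394})
print({c: round(w, 3) for c, w in weights.items()})
# The slightly under-represented no-DRIL class receives the larger weight.
```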
To finalize the best possible model for classifying DRIL in OCT images, we also compared the BYOL-based classifier with state-of-the-art CNN models. Six CNN models pretrained on ImageNet (EfficientNetB3, ResNet50, InceptionResNetV2, MobileNetV2, DenseNet169, and VGG16) were fine-tuned to classify no-DRIL and DRIL OCT images. A nearly identical training protocol was used for all six models; the input image size and the augmentations were adjusted per model to optimize outcomes, as listed in Table 2. A stratified dataset split was used: 80% for training (with 10% of the training data held out for validation) and 20% for testing. Each model was first evaluated for baseline performance with the backbone frozen, trained for 15 epochs, and then completely fine-tuned across all layers for another 15 epochs. All models were trained with the Adam optimizer, with learning rates set separately for the frozen stage and the fine-tuning phase. Across all architectures, a batch size of 32 was used with binary cross-entropy loss. The checkpoint with minimum validation loss was saved for further evaluation. An independently held-out test set of 165 samples (20% of the entire dataset of 823 images) was used to assess the performance of all fine-tuned models.
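The stratified split can be sketched as follows; the per-class rounding and seed are illustrative choices, but with 429 DRIL and 394 no-DRIL images a 20% test fraction yields the 165-image held-out set described above:

```python
import random

# Minimal stratified 80/20 split mirroring the protocol above. Index handling
# and the seed are illustrative; the study's exact tooling is not specified.

def stratified_split(labels, test_frac=0.2, seed=42):
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    train, test = [], []
    for y, idxs in by_class.items():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_frac)  # per-class proportion preserved
        test += idxs[:n_test]
        train += idxs[n_test:]
    return train, test

labels = ["DRIL"] * 429 + ["no-DRIL"] * 394
train, test = stratified_split(labels)
print(len(train), len(test))  # 658 165
```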
4. Results
Our experiments show that standard transfer learning baselines achieve strong performance on the DRIL classification task, with models such as ResNet50 and VGG16 reaching 98.18% accuracy.
Table 3 provides a detailed comparison between our method and several pretrained state-of-the-art architectures. However, our proposed approach, Spatial BYOL with Hybrid loss, is able to outperform all baselines, achieving 99.39% accuracy with only a single misclassification on the test set.
Table 4 further contextualizes this improvement by comparing model sizes and convergence behavior. With 73 million parameters, of which only 27 million are updated during fine-tuning, our model outperforms VGG-16 (over 138 million parameters) and converges faster. Together, these results demonstrate that the proposed Spatial BYOL learns more expressive representations than existing pretrained models, enabling both faster convergence and improved final performance with substantially fewer trainable parameters.
The confusion matrix in
Figure 9 reveals the model's error pattern. Out of 78 No-DRIL cases, 77 were correctly classified, giving a specificity of 98.72%. Our approach identified all 86 DRIL cases correctly, yielding 100% sensitivity with zero false negatives. This performance is particularly significant for clinical deployment, as failing to detect DRIL (a false negative) has greater clinical consequences than a false alarm, which can be resolved through secondary review. Explainability is crucial for an AI model to be clinically accepted for DRIL detection. Gradient-weighted Class Activation Mapping (Grad-CAM) heatmaps indicate which regions of the OCT images contributed most to the DRIL prediction.
Figure 10 and
Figure 11 provide the Grad-CAM heatmaps obtained for various pretrained models along with the proposed spatial BYOL implementation.
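The Grad-CAM computation behind such heatmaps reduces to a gradient-weighted sum of activation maps. This NumPy sketch uses placeholder arrays in place of the activations and gradients that would be captured from a model's last convolutional layer:

```python
import numpy as np

# Grad-CAM in a nutshell: channel weights are the spatially averaged
# gradients, and the heatmap is the ReLU of the weighted activation sum.

def grad_cam(activations, gradients):
    """activations, gradients: (C, H, W) captured from the target conv layer."""
    weights = gradients.mean(axis=(1, 2))               # (C,): one alpha_k per channel
    cam = np.einsum('c,chw->hw', weights, activations)  # weighted sum of feature maps
    cam = np.maximum(cam, 0.0)                          # keep only positive evidence
    if cam.max() > 0:
        cam = cam / cam.max()                           # normalize to [0, 1]
    return cam

rng = np.random.default_rng(0)
cam = grad_cam(rng.standard_normal((2048, 7, 7)), rng.standard_normal((2048, 7, 7)))
print(cam.shape, float(cam.min()) >= 0.0, float(cam.max()) <= 1.0)  # (7, 7) True True
```

In practice the low-resolution map is upsampled to the B-scan size and overlaid as a heatmap.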
5. Discussion
DRIL remains a difficult target for reliable detection in a clinical setting. In contrast to more overt retinal pathology, such as large hemorrhages or exudates, DRIL appears as a very subtle irregularity within the inner retinal layers and often requires careful, targeted inspection for even experienced graders to identify confidently. These biomarkers and their associated structural changes alter the scan appearance only slightly yet remain clinically relevant, which makes algorithmic detection challenging. The problem is further amplified by the limited availability of high-quality, expert-annotated DRIL cases.
Supervised deep neural networks typically require large numbers of labeled examples to learn fine-grained and discriminative visual cues of this type. However, obtaining reliable labels for DRIL is costly, labor intensive, and slow, and this limits the achievable dataset scale in practice. Conventional transfer learning from natural image benchmarks also provides only marginal benefit because of the substantial domain mismatch. Feature extractors trained on everyday photographs tend to emphasize cues that are characteristic of object-centric scenes, such as sharp object boundaries, natural color statistics, and common surface textures. These learned inductive biases do not align well with retinal OCT, which is defined by layered biological structure, speckle characteristics, and subtle disease-related architectural changes.
To address the challenges of subtle pathological features, limited annotations, and domain shift, we propose leveraging self-supervised learning on a large corpus of unlabeled OCT data. Our approach lets the model learn domain-specific representations before being fine-tuned for pathology-specific tasks. This class-agnostic pretraining allows the model to grasp basic OCT image characteristics, such as retinal layer structures, tissue reflectivity patterns, speckle noise features, and anatomical spatial relationships, without needing pathology labels. By training on diverse pathological conditions in an unsupervised manner, the model learns to encode retinal layer boundaries, variations in tissue texture, and structural organization that are vital for generalizability. This contrasts with supervised pretraining on a single pathology, which would skew the learned features toward task-specific patterns and limit their transferability.
Contrastive methods appear less well suited to OCT B-scans and to medical imaging in general. We hypothesize that this is because individual scans share a highly similar global retinal structure, while disease-related changes often manifest as subtle, localized variations. This makes it difficult to construct truly "negative" examples: images from different patients, or even different disease categories, can still be very similar at the global level, increasing the risk of false negatives and degrading representation quality. At the same time, our goal is to operate in relatively low-data and moderate-compute regimes, where transformer-based SSL approaches and masked-image-modeling variants, including DINO with ViT backbones, are typically data- and computation-intensive and therefore harder to deploy at full scale. We also observe in our study that DINO with CNN-based backbones does not perform as well. In this context, self-distillation methods such as BYOL offer a promising solution: BYOL does not rely on negative sampling, is compatible with convolutional backbones that naturally capture local retinal structure, and has been shown to perform well with smaller batch sizes and limited data. These properties make BYOL a natural choice for learning OCT-specific representations that remain sensitive to subtle pathology while respecting our computational constraints.
Our results establish self-supervised learning on domain-specific data as a viable strategy for creating foundation models in medical imaging. The pretrained BYOL encoder acts as a general feature extractor for OCT images, capable of rapid adaptation to downstream tasks with minimal labelled data. The foundation-model approach for specific imaging areas holds particular promise for rare pathologies and new biomarkers, where expert annotations are often limited. A single self-supervised pretraining phase on varied unlabeled OCT data can support multiple downstream applications, such as DRIL detection, drusen quantification, CNV classification, and layer segmentation, through lightweight task-specific heads. This amortizes the computational cost of representation learning across multiple clinical applications.
6. Conclusions
We demonstrate a self-supervised learning framework that addresses the critical issue of automated DRIL detection with limited labeled data. Our research shows that domain-specific self-supervised learning can improve over traditional transfer learning strategies; the improved results and faster convergence under limited supervision indicate the potential of this approach to adapt to complex and rare pathologies where labelled data are scarce. We acknowledge that the limited dataset size may constrain the generalizability of our findings. Our future research will explore adapting the proposed self-supervised OCT pretraining to other retinal imaging biomarkers and lesions. Further, self-supervised learning with other imaging modalities, such as fundus photography and OCT angiography, incorporating the identification of different retinal disorders, may be explored to move toward foundation models in ophthalmic diagnosis.
Author Contributions
Conceptualization, P.K.C., A.T., P.K., S.V.B. and S.F.; methodology, P.K.C., A.T., P.K. and G.M.; software, P.K.C., A.T. and P.K.; validation, P.K.C., A.T., P.K., G.M. and S.V.B.; formal analysis, P.K., G.M., S.V.B. and S.F.; investigation, P.K.C., A.T., P.K. and G.M.; resources, P.K.C. and S.V.B.; data curation, P.K.C. and S.V.B.; writing—original draft, P.K.C. and A.T.; writing—review & editing, P.K.C., A.T., P.K., G.M., S.V.B. and S.F.; visualization, P.K.C., A.T. and P.K.; supervision, P.K., G.M., S.V.B. and S.F.; project administration, P.K.C., A.T., P.K., G.M., S.V.B. and S.F. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
The study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Ethics Committee of Kasturba Medical College and Kasturba Hospital (Approval number: IEC1-287/2022 and date of approval: 3 February 2023).
Informed Consent Statement
Informed consent was waived as this is a retrospective study.
Data Availability Statement
The public data presented in this study are openly available in the Mendeley database at https://doi.org/10.17632/rscbjbr9sj.3. The private dataset generated and analyzed during this study is not publicly available due to ethical considerations; however, it can be obtained from the corresponding author upon reasonable request. The implementation code is available at https://github.com/Tulsani/Spatial-Byol (accessed on 2 November 2025).
Acknowledgments
The authors acknowledge the usage of retinal OCT images from the department of Ophthalmology, Kasturba Medical College Manipal, Manipal Academy of Higher Education, Manipal, Karnataka.
Conflicts of Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References
- International Diabetes Federation. Diabetes Facts & Figures. Available online: https://idf.org/about-diabetes/diabetes-facts-figures/ (accessed on 7 August 2025).
- Ruia, S.; Saxena, S.; Cheung, C.; Gilhotra, J.; Lai, T. Spectral domain optical coherence tomography features and classification systems for diabetic macular edema: A review. Asia-Pac. J. Ophthalmol. 2016, 5, 360–367.
- Sun, J.; Lin, M.M.; Lammer, J.; Prager, S.; Sarangi, R.; Silva, P.S.; Aiello, L.P. Disorganization of the retinal inner layers as a predictor of visual acuity in eyes with center-involved diabetic macular edema. JAMA Ophthalmol. 2014, 132, 1309.
- Grewal, D.; Jaffe, G.; Hariprasad, S. Role of disorganization of retinal inner layers as an optical coherence tomography biomarker in diabetic and uveitic macular edema. Ophthalmic Surg. Lasers Imaging Retin. 2017, 48, 282–288.
- Singh, R.; Luo, S.; Hatipoglu, D.; Yuan, A.; Anand-Apte, B. Deep learning based tool for routine and rapid DRIL identification for patient with diabetes. Investig. Ophthalmol. Vis. Sci. 2020, 61, PB00144.
- Radwan, S.; Soliman, A.; Tokarev, J.; Zhang, L.; van Kuijk, F.; Koozekanani, D. Association of disorganization of retinal inner layers with vision after resolution of center-involved diabetic macular edema. JAMA Ophthalmol. 2015, 133, 820–825.
- Sun, J.; Radwan, S.H.; Soliman, A.Z.; Lammer, J.; Lin, M.M.; Prager, S.G.; Silva, P.S.; Aiello, L.B.; Aiello, L.P. Neural retinal disorganization as a robust marker of visual acuity in current and resolved diabetic macular edema. Diabetes 2015, 64, 2560–2570.
- Acón, D.; Wu, L. Multimodal imaging in diabetic macular edema. Asia-Pac. J. Ophthalmol. 2018, 7, 22–27.
- Das, R.; Spence, G.; Hogg, R.; Stevenson, M.; Chakravarthy, U. Disorganization of inner retina and outer retinal morphology in diabetic macular edema. JAMA Ophthalmol. 2018, 136, 202.
- Joltikov, K.; Sesi, C.A.; de Castro, V.M.; Davila, J.R.; Anand, R.; Khan, S.M.; Farbman, N.; Jackson, G.R.; Johnson, C.A.; Gardner, T.W. Disorganization of retinal inner layers (DRIL) and neuroretinal dysfunction in early diabetic retinopathy. Investig. Ophthalmol. Vis. Sci. 2018, 59, 5481.
- Nakano, E.; Ota, T.; Jingami, Y.; Nakata, I.; Hayashi, H.; Yamashiro, K. Correlation between metamorphopsia and disorganization of the retinal inner layers in eyes with diabetic macular edema. Graefe’s Arch. Clin. Exp. Ophthalmol. 2019, 257, 1873–1878.
- Nadri, G.; Saxena, S.; Stefanickova, J.; Ziak, P.; Benacka, J.; Gilhotra, J.S.; Kruzliak, P. Disorganization of retinal inner layers correlates with ellipsoid zone disruption and retinal nerve fiber layer thinning in diabetic retinopathy. J. Diabetes Its Complicat. 2019, 33, 550–553.
- Di-Luciano, A.; Lam, W.C.; Velasque, L.; Kenstelman, E.; Torres, R.M.; Alvarado-Villacorta, R.; Nagpal, M. Disorganization of the inner retinal layers in diabetic macular edema: Systematic review. Rev. Bras. Oftalmol. 2022, 81, e0027.
- Singh, R.; Singuri, S.; Batoki, J.; Lin, K.; Luo, S.; Hatipoglu, D.; Anand-Apte, B.; Yuan, A. Deep learning algorithm detects presence of disorganization of retinal inner layers (DRIL)—An early imaging biomarker in diabetic retinopathy. Transl. Vis. Sci. Technol. 2023, 12, 6.
- Tripathi, A.; Kumar, P.; Tulsani, A.; Chakrapani, P.K.; Maiya, G.; Bhandary, S.V.; Mayya, V.; Pathan, S.; Achar, R.; Acharya, U.R. Fuzzy logic-based system for identifying the severity of diabetic macular edema from OCT B-scan images using DRIL, HRF, and cystoids. Diagnostics 2023, 13, 2550.
- Singuri, S.; Luo, S.; Hatipoglu, D.; Nowacki, A.S.; Patel, R.; Schachat, A.P.; Ehlers, J.P.; Singh, R.P.; Anand-Apte, B.; Yuan, A. Clinical utility of spectral-domain optical coherence tomography marker disorganization of retinal inner layers in diabetic retinopathy. Ophthalmic Surg. Lasers Imaging Retin. 2023, 54, 692–700.
- Toto, L.; Romano, A.; Pavan, M.; Degl’Innocenti, D.; Olivotto, V.; Formenti, F.; Viggiano, P.; Midena, E.; Mastropasqua, R. A deep learning approach to hard exudates detection and disorganization of retinal inner layers identification on OCT images. Sci. Rep. 2024, 14, 16652.
- Ruiz-Medrano, J.; Udaondo Mirete, P.; Fernández-Jiménez, M.; Asencio-Duran, M.; Fernández-Vigo, J.I.; Medina-Baena, M.; Flores-Moreno, I.; Pareja-Esteban, J.; Touhami, S.; Giocanti-Aurégan, A.; et al. Biomarkers of risk of switching to dexamethasone implant for the treatment of diabetic macular oedema in real clinical practice: A multicentric study. Br. J. Ophthalmol. 2025, 109, 1155–1160.
- Tripathi, A.; Gaur, S.; Agarwal, R.; Singh, N.; Singh, A.; Parveen, S.; Singh, N.; Rima, N. Disorganization of retinal inner layers as an optical coherence tomography biomarker in diabetic retinopathy: A review. Indian J. Ophthalmol. 2025, 73, 1245–1250.
- Grill, J.B.; Strub, F.; Altché, F.; Tallec, C.; Richemond, P.; Buchatskaya, E.; Doersch, C.; Avila Pires, B.; Guo, Z.; Gheshlaghi Azar, M.; et al. Bootstrap your own latent—A new approach to self-supervised learning. Adv. Neural Inf. Process. Syst. 2020, 33, 21271–21284.
- Kermany, D.S.; Goldbaum, M.; Cai, W.; Valentim, C.C.; Liang, H.; Baxter, S.L.; McKeown, A.; Yang, G.; Wu, X.; Yan, F.; et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell 2018, 172, 1122–1131.e9.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).