Previous Issue
Volume 12, May
 
 

Tomography, Volume 12, Issue 6 (June 2026) – 1 article

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers published in both, html and pdf forms. To view the papers in pdf format, click on the "PDF Full-text" link, and use the free Adobe Reader to open them.
Order results
Result details
Section
Select all
Export citation of selected articles as:
29 pages, 4755 KB  
Article
DenseViT-OCT: A Hybrid CNN-Transformer Architecture with Multi-Scale Dense Feature Aggregation for Automated Epiretinal Membrane Severity Classification
by Elif Yusufoğlu, Salih Taha Alperen Özçelik, Orhan Atila, Numan Halit Guldemir and Abdulkadir Sengur
Tomography 2026, 12(6), 76; https://doi.org/10.3390/tomography12060076 - 22 May 2026
Abstract
Background/Objectives: Epiretinal membrane (ERM) is a common vitreoretinal disorder characterized by fibrocellular proliferation on the inner retinal surface, often leading to progressive visual impairment. Accurate grading of ERM severity using optical coherence tomography (OCT) is critical for treatment planning and surgical decision-making; however, [...] Read more.
Background/Objectives: Epiretinal membrane (ERM) is a common vitreoretinal disorder characterized by fibrocellular proliferation on the inner retinal surface, often leading to progressive visual impairment. Accurate grading of ERM severity using optical coherence tomography (OCT) is critical for treatment planning and surgical decision-making; however, manual grading is labor-intensive and subjective. This study aims to develop an automated and reliable deep learning-based method for ERM severity classification. Methods: We propose DenseViT-OCT, a hybrid deep learning model that integrates dense convolutional neural networks (CNN) and vision transformers (ViT). The model introduces three key modules: Multi-Scale Dense Feature Aggregation (MDFA) for capturing hierarchical features across multiple spatial scales, Adaptive Feature Calibration (AFC) for enhancing feature discrimination through channel and spatial attention, and Cross-Attention Feature Fusion (CAFF) for enabling bidirectional interaction between convolutional and transformer representations. The model was trained and evaluated on 2195 OCT B-scan images obtained from 397 patients. Results: DenseViT-OCT achieved an overall accuracy of 94.76% on the internal four-class test set, outperforming 19 benchmark models, including ConvNeXt, EfficientNet, ViT, and Swin Transformers. The model demonstrated balanced performance with a macro-averaged precision of 93.76%, recall of 93.22%, F1-score of 93.47%, Cohen’s kappa of 92.62%, and macro-Area Under the Curve (AUC) of 98.95%. Ablation experiments confirmed the contribution of the proposed MDFA, AFC, CAFF, and deep supervision components, with the full model consistently outperforming reduced variants and standalone DenseNet121 and ViT-B/16 backbones. In repeated experiments across five random seeds, DenseViT-OCT also achieved the best mean accuracy (0.9399 ± 0.0052). External validation on the public multicenter OCTDL dataset, performed as binary ERM-versus-normal classification because of label availability, yielded 90.76% accuracy and 97.61% AUC, indicating promising generalization beyond the development cohort. Conclusions: DenseViT-OCT provides a robust framework for automated ERM severity classification from OCT B-scans. The combination of local CNN features, global transformer context, and dedicated fusion modules improves classification performance and yields clinically meaningful error patterns. Although further stage-wise multicenter validation, volumetric OCT analysis, and prospective clinical assessment are required, the proposed method shows promise as a research-oriented decision-support framework for B-scan-level ERM assessment. Full article
(This article belongs to the Special Issue Medical Image Analysis in CT Imaging)
Show Figures

Figure 1

Previous Issue
Back to TopTop