
Journal of Imaging

Journal of Imaging is an international, multi- and interdisciplinary, peer-reviewed, open access journal of imaging techniques, published online monthly by MDPI.

Indexed in PubMed | Quartile Ranking JCR - Q2 (Imaging Science and Photographic Technology)

All Articles (2,243)

Deep learning models for three-dimensional (3D) data are increasingly used in domains such as medical imaging, object recognition, and robotics. Because these models are largely black boxes, the need for explainability has grown significantly. However, the lack of standardized and quantitative benchmarks for explainable artificial intelligence (XAI) in 3D data limits the reliable comparison of explanation quality. In this paper, we present a unified benchmarking framework to evaluate both intrinsic and post hoc XAI methods across three representative 3D datasets: volumetric CT scans (MosMed), voxelized CAD models (ModelNet40), and real-world point clouds (ScanObjectNN). The evaluated methods include Grad-CAM, Integrated Gradients, Saliency, Occlusion, and the intrinsic ResAttNet-3D model. We quantitatively assess explanations using the Correctness (AOPC), Completeness (AUPC), and Compactness metrics, applied consistently across all datasets. Our results show that explanation quality varies significantly across methods and domains: Grad-CAM and intrinsic attention performed best on medical CT scans, while gradient-based methods excelled on voxelized and point-based data. Statistical tests (Kruskal–Wallis and Mann–Whitney U) confirmed significant performance differences between methods. No single approach achieved superior results across all domains, highlighting the importance of multi-metric evaluation. This work provides a reproducible framework for standardized assessment of 3D explainability and comparative insights to guide future XAI method selection.

30 January 2026

Overview of the cross-domain 3D XAI benchmarking framework.
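The Correctness (AOPC) metric named in the abstract above scores an attribution map by how quickly the model's confidence drops as the most relevant voxels are occluded. A minimal PyTorch sketch, assuming a 3D classifier `model` that returns class logits and an attribution tensor shaped like the input volume; the step count, occlusion fraction, and zero-fill baseline are illustrative assumptions, not the authors' exact protocol.

    import torch

    def aopc(model, volume, attribution, target, steps=20, frac=0.01):
        """Area Over the Perturbation Curve for one 3D volume.

        Voxels are occluded in order of decreasing attribution; the average
        drop in the target-class probability over all steps is returned.
        Higher values indicate a more faithful explanation.
        """
        model.eval()
        with torch.no_grad():
            base = torch.softmax(model(volume.unsqueeze(0)), dim=1)[0, target]

            # Rank voxels by attribution, most relevant first (attribution
            # is assumed to have the same shape as the input volume).
            order = attribution.flatten().argsort(descending=True)
            n_per_step = max(1, int(frac * order.numel()))

            perturbed = volume.clone().flatten()
            drops = []
            for k in range(steps):
                idx = order[k * n_per_step:(k + 1) * n_per_step]
                perturbed[idx] = 0.0  # occlude the next batch of relevant voxels
                prob = torch.softmax(
                    model(perturbed.view_as(volume).unsqueeze(0)), dim=1
                )[0, target]
                drops.append((base - prob).item())
        return sum(drops) / len(drops)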

Lung cancer remains a leading cause of cancer-related mortality. Although reliable multiclass classification of lung lesions from CT imaging is essential for early diagnosis, it remains challenging due to subtle inter-class differences, limited sample sizes, and class imbalance. We propose an Adaptive Attention-Augmented Convolutional Neural Network with Vision Transformer (AACNN-ViT), a hybrid framework that integrates local convolutional representations with global transformer embeddings through an adaptive attention-based fusion module. The CNN branch captures fine-grained spatial patterns, the ViT branch encodes long-range contextual dependencies, and the adaptive fusion mechanism learns to weight cross-representation interactions to improve discriminability. To reduce the impact of imbalance, a hybrid objective that combines focal loss with categorical cross-entropy is incorporated during training. Experiments on the IQ-OTH/NCCD dataset (benign, malignant, and normal) show consistent performance progression in an ablation-style evaluation: CNN-only, ViT-only, CNN-ViT concatenation, and AACNN-ViT. The proposed AACNN-ViT achieved 96.97% accuracy on the validation set with macro-averaged precision/recall/F1 of 0.9588/0.9352/0.9458 and weighted F1 of 0.9693, substantially improving minority-class recognition (Benign recall 0.8333) compared with CNN-ViT (accuracy 89.09%, macro-F1 0.7680). One-vs.-rest ROC analysis further indicates strong separability across all classes (micro-average AUC 0.992). These results suggest that adaptive attention-based fusion offers a robust and clinically relevant approach for computer-aided lung cancer screening and decision support.

30 January 2026

AACNN-ViT workflow for lung cancer classification, combining CNN and ViT features using adaptive attention.
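As a concrete reading of the hybrid objective described above, a short sketch that combines focal loss with categorical cross-entropy; the balance weight `alpha` and focusing parameter `gamma` are assumed defaults, not values reported by the authors.

    import torch
    import torch.nn.functional as F

    def hybrid_loss(logits, targets, gamma=2.0, alpha=0.5):
        """Weighted sum of focal loss and categorical cross-entropy.

        The focal term down-weights well-classified examples so that minority
        classes contribute more to the gradient; `alpha` balances the two
        terms (both hyperparameters here are illustrative).
        """
        ce = F.cross_entropy(logits, targets, reduction="none")
        p_t = torch.exp(-ce)                  # probability assigned to the true class
        focal = ((1.0 - p_t) ** gamma) * ce   # focal modulation of the CE term
        return alpha * focal.mean() + (1.0 - alpha) * ce.mean()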

Multiscale RGB-Guided Fusion for Hyperspectral Image Super-Resolution

Matteo Kolyszko, Marco Buzzelli, Raimondo Schettini + 1 author

Hyperspectral imaging (HSI) enables fine spectral analysis but is often limited by low spatial resolution due to sensor constraints. To address this, we propose CGNet, a color-guided hyperspectral super-resolution network that leverages complementary information from low-resolution hyperspectral inputs and high-resolution RGB images. CGNet adopts a dual-encoder design: the RGB encoder extracts hierarchical spatial features, while the HSI encoder progressively upsamples spectral features. A multi-scale fusion decoder then combines both modalities in a coarse-to-fine manner to reconstruct the high-resolution HSI. Training is driven by a hybrid loss that balances L1 and Spectral Angle Mapper (SAM), which ablation studies confirm as the most effective formulation. Experiments on two benchmarks, ARAD1K and StereoMSI, at ×4 and ×6 upscaling factors demonstrate that CGNet consistently outperforms state-of-the-art baselines. CGNet achieves higher PSNR and SSIM, lower SAM, and reduced ΔE00, confirming its ability to recover sharp spatial structures while preserving spectral fidelity.

28 January 2026

Overview of the proposed CGNet architecture. The network takes as input a low-resolution hyperspectral image and a high-resolution RGB image. It employs two parallel encoders to extract multiscale spectral and spatial features, which are progressively fused in a coarse-to-fine manner by the fusion decoder. The final output is a super-resolved hyperspectral image with full spatial and spectral fidelity.
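The hybrid loss mentioned above balances an L1 reconstruction term with the Spectral Angle Mapper (SAM). A minimal PyTorch sketch for (B, C, H, W) hyperspectral tensors; the weight `lam` is an assumed value, not the one used by CGNet.

    import torch
    import torch.nn.functional as F

    def sam_loss(pred, target, eps=1e-8):
        """Mean spectral angle (radians) between predicted and reference spectra.

        The angle is computed per pixel along the spectral (channel) axis;
        smaller angles mean better spectral fidelity.
        """
        dot = (pred * target).sum(dim=1)
        norm = pred.norm(dim=1) * target.norm(dim=1) + eps
        cos = (dot / norm).clamp(-1.0 + eps, 1.0 - eps)
        return torch.acos(cos).mean()

    def hybrid_sr_loss(pred, target, lam=0.1):
        """L1 reconstruction loss plus a weighted SAM term (`lam` is assumed)."""
        return F.l1_loss(pred, target) + lam * sam_loss(pred, target)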

Radiographic imaging remains a cornerstone of diagnostic practice. However, accurate interpretation faces challenges from subtle visual signatures, anatomical variability, and inter-observer inconsistency. Conventional deep learning approaches, such as convolutional neural networks and vision transformers, deliver strong predictive performance but often lack anatomical grounding and interpretability, limiting their trustworthiness in imaging applications. To address these challenges, we present SpineNeuroSym, a neuro-geometric imaging framework that unifies geometry-aware learning and symbolic reasoning for explainable medical image analysis. The framework integrates weakly supervised keypoint and region-of-interest discovery, a dual-stream graph–transformer backbone, and a Differentiable Radiographic Geometry Module (dRGM) that computes clinically relevant indices (e.g., slip ratio, disc asymmetry, sacroiliac spacing, and curvature measures). A Neuro-Symbolic Constraint Layer (NSCL) enforces monotonic logic in image-derived predictions, while a Counterfactual Geometry Diffusion (CGD) module generates rare imaging phenotypes and provides diagnostic auditing through counterfactual validation. Evaluated on a comprehensive dataset of 1613 spinal radiographs from Sunpasitthiprasong Hospital encompassing six diagnostic categories—spondylolisthesis (n = 496), infection (n = 322), spondyloarthropathy (n = 275), normal cervical (n = 192), normal thoracic (n = 70), and normal lumbar spine (n = 258)—SpineNeuroSym achieved 89.4% classification accuracy, a macro-F1 of 0.872, and an AUROC of 0.941, outperforming eight state-of-the-art imaging baselines. These results highlight how integrating neuro-geometric modeling, symbolic constraints, and counterfactual validation advances explainable, trustworthy, and reproducible medical imaging AI, establishing a pathway toward transparent image analysis systems.

28 January 2026

Stepwise workflow of the proposed methodology.
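To illustrate the kind of clinically relevant index the Differentiable Radiographic Geometry Module computes, a sketch of a slip-ratio estimate from predicted 2D keypoints; the keypoint convention and normalisation are assumptions for illustration, not the authors' exact dRGM formulation.

    import torch

    def slip_ratio(upper_post, lower_post, lower_ant, eps=1e-6):
        """Differentiable slip-ratio estimate from three (x, y) keypoints.

        The forward displacement of the upper vertebra's posterior corner is
        projected onto the lower endplate axis and expressed as a fraction of
        the lower endplate width, so gradients flow back to the keypoints.
        """
        endplate = lower_ant - lower_post            # lower endplate direction
        width = endplate.norm() + eps
        direction = endplate / width
        slip = torch.dot(upper_post - lower_post, direction)
        return slip / width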

Reprints of Collections

Advances in Retinal Image Processing (Reprint)

Editors: P. Jidesh, Vasudevan (Vengu) Lakshminarayanan

J. Imaging - ISSN 2313-433X