- A Comparative Survey of Vision Transformers for Feature Extraction in Texture Analysis
- Next-Generation Advances in Prostate Cancer Imaging and Artificial Intelligence Applications
- Classifying Sex from MSCT-Derived 3D Mandibular Models Using an Adapted PointNet++ Deep Learning Approach in a Croatian Population
- AIGD Era: From Fragment to One Piece
Journal Description
Journal of Imaging is an international, multi/interdisciplinary, peer-reviewed, open access journal of imaging techniques, published online monthly by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within Scopus, ESCI (Web of Science), PubMed, PMC, dblp, Inspec, Ei Compendex, and other databases.
- Journal Rank: JCR - Q2 (Imaging Science and Photographic Technology) / CiteScore - Q1 (Radiology, Nuclear Medicine and Imaging)
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 18 days after submission; acceptance to publication is undertaken in 3.6 days (median values for papers published in this journal in the second half of 2025).
- Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
Impact Factor: 3.3 (2024); 5-Year Impact Factor: 3.3 (2024)
Latest Articles
Accelerating Point Cloud Computation via Memory in Embedded Structured Light Cameras
J. Imaging 2026, 12(2), 91; https://doi.org/10.3390/jimaging12020091 (registering DOI) - 21 Feb 2026
Abstract
Embedded structured light cameras have been widely applied in various fields. However, due to constraints such as insufficient computing resources, it remains difficult to achieve high-speed structured light point cloud computation. To address this issue, this study proposes a memory-driven computational framework for accelerating point cloud computation. Specifically, the point cloud computation process is precomputed as much as possible and stored in memory in the form of parameters, thereby significantly reducing the computational load during actual point cloud computation. The framework is instantiated in two forms: a low-memory method that minimizes memory footprint at the expense of point cloud stability, and a high-memory method that preserves the nonlinear phase–distance relation via an extensive lookup table. Experimental evaluations demonstrate that the proposed methods achieve comparable accuracy to the conventional method while delivering substantial speedups, and data-format optimizations further reduce required bandwidth. This framework offers a generalizable paradigm for optimizing structured light pipelines, paving the way for enhanced real-time 3D sensing in embedded applications.
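The high-memory variant described above can be pictured as a per-pixel lookup table that stores the nonlinear phase-to-distance relation, so runtime point cloud computation reduces to quantization plus table indexing. The sketch below is only an illustration under assumed, reduced dimensions and an invented placeholder calibration function (`phase_to_distance`), not the authors' implementation.

```python
import numpy as np

# Assumed, reduced dimensions so the sketch runs quickly; a real embedded camera
# would use the full sensor resolution and a finer phase quantization.
H, W, PHASE_BINS = 120, 160, 256

def phase_to_distance(phase, px, py):
    """Placeholder for the camera's nonlinear, per-pixel phase-distance calibration."""
    return 500.0 / (1.0 + 0.1 * phase) + 0.01 * (px + py)

# Offline precomputation: one distance value per (pixel, quantized phase) pair.
phases = np.linspace(0.0, 2 * np.pi, PHASE_BINS, dtype=np.float32)
ys, xs = np.mgrid[0:H, 0:W]
lut = phase_to_distance(phases[None, None, :], xs[..., None], ys[..., None])  # (H, W, PHASE_BINS)

# Online step: point cloud computation becomes quantization plus memory lookup.
phase_map = np.random.uniform(0.0, 2 * np.pi, (H, W)).astype(np.float32)
bins = np.clip((phase_map / (2 * np.pi) * (PHASE_BINS - 1)).astype(np.int64), 0, PHASE_BINS - 1)
distance_map = np.take_along_axis(lut, bins[..., None], axis=2)[..., 0]
print(distance_map.shape)  # (120, 160)
```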
Full article
(This article belongs to the Special Issue Intelligent 3D Vision: Reconstruction, Understanding, Generative Modeling, and Applications)
Open Access Article
MDF2Former: Multi-Scale Dual-Domain Feature Fusion Transformer for Hyperspectral Image Classification of Bacteria in Murine Wounds
by
Decheng Wu, Wendan Liu, Rui Li, Xudong Fu, Lin Tao, Yinli Tian, Anqiang Zhang, Zhen Wang and Hao Tang
J. Imaging 2026, 12(2), 90; https://doi.org/10.3390/jimaging12020090 - 19 Feb 2026
Abstract
Bacterial wound infection poses a major challenge in trauma care and can lead to severe complications such as sepsis and organ failure. Therefore, rapid and accurate identification of the pathogen, along with targeted intervention, is of vital importance for improving treatment outcomes and reducing risks. However, current detection methods are still constrained by procedural complexity and long processing times. In this study, a hyperspectral imaging (HSI) acquisition system for bacterial analysis and a multi-scale dual-domain feature fusion transformer (MDF2Former) were developed for classifying wound bacteria. MDF2Former integrates three modules: a multi-scale feature enhancement and fusion module that generates tokens with multi-scale discriminative representations, a spatial–spectral dual-branch attention module that strengthens joint feature modeling, and a frequency and spatial–spectral domain encoding module that captures global and local interactions among tokens through a hierarchical stacking structure, thereby enabling more efficient feature learning. Extensive experiments on our self-constructed HSI dataset of typical wound bacteria demonstrate that MDF2Former achieved outstanding performance across five metrics: Accuracy (91.94%), Precision (92.26%), Recall (91.94%), F1-score (92.01%), and Kappa coefficient (90.73%), surpassing all comparative models. These results have verified the effectiveness of combining HSI with deep learning for bacterial identification, and have highlighted its potential in assisting in the identification of bacterial species and making personalized treatment decisions for wound infections.
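For readers who want to reproduce the style of evaluation reported above, the five metrics can be computed with scikit-learn as in this small, hedged sketch; the labels and predictions below are random placeholders, and the weighted averaging choice is an assumption rather than the paper's stated protocol.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, cohen_kappa_score)

rng = np.random.default_rng(0)
y_true = rng.integers(0, 6, size=500)   # assumed six bacterial classes (placeholder)
y_pred = rng.integers(0, 6, size=500)

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred, average="weighted", zero_division=0))
print("Recall   :", recall_score(y_true, y_pred, average="weighted", zero_division=0))
print("F1-score :", f1_score(y_true, y_pred, average="weighted"))
print("Kappa    :", cohen_kappa_score(y_true, y_pred))
```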
Full article
(This article belongs to the Section Color, Multi-spectral, and Hyperspectral Imaging)
Open Access Article
Classification of the Surrounding Rock Based on Image Processing Analysis and Transfer Learning
by
Yanyun Fan, Jiaqi Zhu, Hua Luo, Yaxi Shen, Shuanglong Wang, Xiaoning Liu, Dong Li and Chuhan Deng
J. Imaging 2026, 12(2), 89; https://doi.org/10.3390/jimaging12020089 - 19 Feb 2026
Abstract
Standardized classification methods for surrounding rock are currently insufficient: classification relies mainly on the subjective judgment of technicians, which leads to inconsistent evaluation results. This study investigates feature extraction and classification methods for surrounding rock images from a tunnel of the Central Yunnan Water Diversion Project using image processing analysis and transfer learning. A rich set of surrounding rock images and water conservancy tunnel data is collected, and the surrounding rock is then classified according to the relevant code and expert guidance. By introducing fractal theory, the complexity and irregularity of the spatial distribution of weak layers and joints on the surrounding rock surface are characterized effectively, and a classification method based on changes in the fractal dimension is proposed. Combining the quantified image parameters with strength data collected by rebound meters, a method for correcting surrounding rock strength based on image analysis is proposed, which effectively reduces the error caused by the uneven distribution of rock masses in traditional rebound strength readings. After correction, more accurate strength characteristics are obtained, supporting the standardized classification of the surrounding rock. Finally, transfer learning is applied to the recognition of tunnel surrounding rock images, and a model is constructed for rapid classification of tunnel surrounding rock. This research supports the standardized classification of tunnel surrounding rock.
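As a concrete illustration of the fractal-dimension idea, a standard box-counting estimate on a binary image of joints or weak layers looks roughly like the sketch below; this is a generic estimator under assumed box sizes, not the paper's exact procedure.

```python
import numpy as np

def box_counting_dimension(binary_img, sizes=(2, 4, 8, 16, 32, 64)):
    """Estimate the fractal (box-counting) dimension of a binary image."""
    counts = []
    for s in sizes:
        h, w = binary_img.shape
        # Trim so the image tiles exactly into s x s boxes.
        trimmed = binary_img[: h - h % s, : w - w % s]
        blocks = trimmed.reshape(trimmed.shape[0] // s, s, trimmed.shape[1] // s, s)
        occupied = blocks.any(axis=(1, 3)).sum()
        counts.append(max(occupied, 1))
    # Slope of log N(s) against log(1/s) gives the dimension.
    coeffs = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return coeffs[0]

# Toy example: a sparse random pattern as a placeholder for a joint/weak-layer mask.
rng = np.random.default_rng(1)
mask = rng.random((256, 256)) > 0.97
print(round(box_counting_dimension(mask), 3))
```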
Full article
(This article belongs to the Section Image and Video Processing)
Open Access Review
Analysis of Biological Images and Quantitative Monitoring Using Deep Learning and Computer Vision
by
Aaron Gálvez-Salido, Francisca Robles, Rodrigo J. Gonçalves, Roberto de la Herrán, Carmelo Ruiz Rejón and Rafael Navajas-Pérez
J. Imaging 2026, 12(2), 88; https://doi.org/10.3390/jimaging12020088 - 18 Feb 2026
Abstract
Automated biological counting is essential for scaling wildlife monitoring and biodiversity assessments, as manual processing currently limits analytical effort and scalability. This review evaluates the integration of deep learning and computer vision across diverse acquisition platforms, including camera traps, unmanned aerial vehicles (UAVs), and remote sensing. Methodological paradigms ranging from Convolutional Neural Networks (CNNs) and one-stage detectors like You Only Look Once (YOLO) to recent transformer-based architectures and hybrid models are examined. The literature shows that these methods consistently achieve high accuracy—often exceeding 95%—across various taxa, including insect pests, aquatic organisms, terrestrial vegetation, and forest ecosystems. However, persistent challenges such as object occlusion, cryptic species differentiation, and the scarcity of high-quality, labeled datasets continue to hinder fully automated workflows. We conclude that while automated counting has fundamentally increased data throughput, future advancements must focus on enhancing model generalization through self-supervised learning and improved data augmentation techniques. These developments are critical for transitioning from experimental models to robust, operational tools for global ecological monitoring and conservation efforts.
Full article
(This article belongs to the Special Issue Computer Vision and Deep Learning: Trends and Applications (3rd Edition))
Open Access Article
Automated Compactness Quantitative Metrics for Wrist Bone on Conventional Radiography in Rheumatoid Arthritis: A Clinical Evaluation Study
by
Jiajing Zhou, Junmu Peng, Haolin Wang, Hiroshi Kataoka, Masaya Mukai, Tunlada Wiriyanukhroh and Tamotsu Kamishima
J. Imaging 2026, 12(2), 87; https://doi.org/10.3390/jimaging12020087 - 18 Feb 2026
Abstract
Rheumatoid arthritis (RA) frequently affects the joints of the hands, with joint space narrowing (JSN) representing an important early marker of structural damage. The semi-quantitative Sharp/van der Heijde (SvdH) scoring system is widely used in clinical practice but is inherently subjective and susceptible to observer variability. Moreover, the complex anatomy of the wrist and substantial overlap of carpal bones pose challenges for automated quantitative assessment of wrist JSN on routine radiographs. This study aimed to introduce a novel quantitative assessment perspective and to clinically validate an automated, compactness-related quantification framework for evaluating wrist JSN in RA. This study initially enrolled 51 patients with RA. After excluding one case with severe carpal fusion that precluded anatomical differentiation, 50 patients (44 females and 6 males) were included in the final analysis. The cohort had a mean age of 61 years (range: 21–82), a median symptom duration of 9 years (IQR: 1–32), and a median follow-up interval for bilateral hand radiographs of 1.06 years (IQR: 0.82–1.30). To quantify global wrist JSN, 10 compactness-related metrics were computed based on the spatial distribution of bone centroids extracted from carpal segmentation masks. These metrics were validated against the wrist JSN subscore of the SvdH score (SvdH-JSN_wrist) and the total Sharp score (TSS) as gold standards. Several of the distance-based compactness metrics showed significant negative correlations with SvdH-JSN_wrist. Specifically, the mean pairwise distance, root-mean-square radius, and median radius showed moderate to strong correlations (r = −0.52 to −0.63) that were consistent at baseline (BL) and follow-up (FU). Correlations with TSS were weaker overall, with only one of the distance metrics and its normalized form showing stable negative correlations (r = −0.40 to −0.43, p < 0.01). Longitudinal analyses showed limited correlations between metric changes and clinical score changes. The proposed automated compactness quantification framework enables objective and reliable assessment of wrist JSN on standard radiographs and complements conventional scoring systems by supporting automated and standardized evaluation of RA-related wrist structural changes.
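The distance-based metrics named above can be illustrated directly from centroid coordinates; the sketch below uses assumed function and variable names and synthetic centroids, not the study's segmentation outputs.

```python
import numpy as np
from scipy.spatial.distance import pdist

def compactness_metrics(centroids):
    """centroids: (N, 2) array of carpal bone centroid coordinates in pixels."""
    center = centroids.mean(axis=0)
    radii = np.linalg.norm(centroids - center, axis=1)
    return {
        "mean_pairwise_distance": pdist(centroids).mean(),
        "rms_radius": np.sqrt((radii ** 2).mean()),
        "median_radius": np.median(radii),
    }

# Toy example with eight synthetic carpal centroids.
rng = np.random.default_rng(2)
print(compactness_metrics(rng.uniform(0, 100, size=(8, 2))))
```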
Full article
(This article belongs to the Section Medical Imaging)
Open Access Article
Print Quality Assessment of QR Code Elements Achieved by the Digital Thermal Transfer Process
by
Igor Majnarić, Marija Jelkić, Marko Morić and Krunoslav Hajdek
J. Imaging 2026, 12(2), 86; https://doi.org/10.3390/jimaging12020086 - 18 Feb 2026
Abstract
The new European Regulation (EU) 2025/40 includes provisions on modern packaging and packaging waste. It defines the use of image QR codes on packaging (items 71 and 161) and in personal documents, making line barcodes a thing of the past. The definition of a QR code is precisely specified in ISO/IEC 18004:2024; however, its implementation in printing systems is not specified and remains an important factor for future application. Digital foil printing is a completely new hybrid printing process for applying information to highly precise applications such as QR codes, security printing, and packaging printing. The technique is characterized by a combination of two printing techniques: drop-on-demand UV inkjet followed by thermal transfer of black foil. Using a matte-coated printing substrate (Garda Matt, 300 g/m²), Konica Minolta KM1024 LHE Inkjet head settings, and a transfer temperature of 100 °C, the size of the square printing elements in QR codes plays a decisive role in the quality of the decoded information. The aim of this work is to investigate the possibility of realizing the basic elements of the QR code image (the profile of square elements and the success of realizing a precisely defined surface) with a variation in the thickness of the UV varnish coating (7, 14 and 21 µm), realized using the MGI JETvarnish 3DS digital machine. Square elements with surface areas of 0.01 cm², 0.06 cm², 0.25 cm², 1 cm², 4 cm², and 16 cm² were tested. The results showed that the imprint quality is uneven for the smallest elements (square elements with base lengths of 0.1 cm and 0.25 cm). The effect is especially visible with a minimum UV varnish application of 7 µm (1 drop). By increasing the amount of UV varnish and the application thickness to 14 µm (2 drops) and 21 µm (3 drops), respectively, a significantly more stable, even reproduction of the achromatic image is achieved. The highest technical precision was achieved with a UV varnish thickness of 21 µm.
Full article
(This article belongs to the Topic New Challenges in Image Processing and Pattern Recognition)
Open Access Article
SREF: Semantics-Refined Feature Extraction for Long-Term Visual Localization
by
Danfeng Wu, Kaifeng Zhu, Heng Shi, Fenfen Zhou and Minchi Kuang
J. Imaging 2026, 12(2), 85; https://doi.org/10.3390/jimaging12020085 - 18 Feb 2026
Abstract
Accurate and robust visual localization under changing environments remains a fundamental challenge in autonomous driving and mobile robotics. Traditional handcrafted features often degrade under long-term illumination and viewpoint variations, while recent CNN-based methods, although more robust, typically rely on coarse semantic cues and remain vulnerable to dynamic objects. In this paper, we propose a fine-grained semantics-guided feature extraction framework that adaptively selects stable keypoints while suppressing dynamic disturbances. A fine-grained semantic refinement module subdivides coarse semantic categories into stability-homogeneous sub-classes, and a dual-attention mechanism enhances local repeatability and semantic consistency. By integrating physical priors with self-supervised clustering, the proposed framework learns discriminative and reliable feature representations. Extensive experiments on the Aachen and RobotCar-Seasons benchmarks demonstrate that the proposed approach achieves state-of-the-art accuracy and robustness while maintaining real-time efficiency, effectively bridging coarse semantic guidance with fine-grained stability estimation. Quantitatively, our method achieves strong localization performance on Aachen (up to 88.1% at night under the threshold) and on RobotCar-Seasons (up to 57.2%/28.4% under the same threshold for day/night), demonstrating improved robustness to seasonal and illumination changes.
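The idea of suppressing keypoints on dynamic content can be pictured as a simple mask-based filter over detected keypoints; the class IDs, threshold, and function names below are assumptions for illustration and stand in for the paper's learned fine-grained stability sub-classes.

```python
import numpy as np

# Hypothetical IDs for classes treated as dynamic or unstable (e.g., person, rider, car).
DYNAMIC_CLASS_IDS = {11, 12, 13}

def filter_keypoints(keypoints, scores, semantic_map, min_score=0.2):
    """keypoints: (N, 2) integer (x, y); scores: (N,); semantic_map: (H, W) class IDs."""
    xs, ys = keypoints[:, 0], keypoints[:, 1]
    on_stable = ~np.isin(semantic_map[ys, xs], list(DYNAMIC_CLASS_IDS))
    keep = on_stable & (scores >= min_score)
    return keypoints[keep], scores[keep]

# Toy usage with random detections and a random label map.
rng = np.random.default_rng(3)
kpts = rng.integers(0, 480, size=(200, 2))
sem = rng.integers(0, 20, size=(480, 480))
kept, _ = filter_keypoints(kpts, rng.random(200), sem)
print(len(kept), "keypoints kept")
```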
Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
Open Access Article
LEGS: Visual Localization Enhanced by 3D Gaussian Splatting
by
Daewoon Kim and I-gil Kim
J. Imaging 2026, 12(2), 84; https://doi.org/10.3390/jimaging12020084 - 16 Feb 2026
Abstract
Accurate six-degree-of-freedom (6-DoF) visual localization is a fundamental component for modern mapping and navigation. While recent data-centric approaches have leveraged Novel View Synthesis (NVS) to augment training datasets, these methods typically rely on uniform grid-based sampling of virtual cameras. Such naive placement often yields redundant or weakly informative views, failing to effectively bridge the gap between sparse, unordered captures and dense scene geometry. To address these challenges, we present LEGS (Visual Localization Enhanced by 3D Gaussian Splatting), a trajectory-agnostic synthetic-view augmentation framework. LEGS constructs a joint set of 6-DoF camera pose proposals by integrating a coarse 3D lattice with the Structure-from-Motion (SfM) camera graph, followed by a visibility-aware, coverage-driven selection strategy. By utilizing 3D Gaussian Splatting (3DGS), our framework enables high-throughput, scene-specific synthesis within practical computational budgets. Experiments on standard benchmarks and an in-house dataset demonstrate that LEGS consistently improves pose accuracy and robustness, particularly in scenarios characterized by sparse sampling and co-located viewpoints.
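A coverage-driven selection over candidate poses can be sketched as a greedy set-cover loop: each candidate sees a set of scene points, and poses that add the most not-yet-covered points are picked first. This is only an illustration of the general idea under an assumed boolean visibility matrix; the paper's actual criterion may weight visibility and coverage differently.

```python
import numpy as np

def greedy_coverage_selection(visibility, budget):
    """visibility: (num_candidates, num_points) boolean matrix; returns selected indices."""
    covered = np.zeros(visibility.shape[1], dtype=bool)
    selected = []
    for _ in range(budget):
        gains = (visibility & ~covered).sum(axis=1)   # newly covered points per candidate
        best = int(gains.argmax())
        if gains[best] == 0:
            break                                     # nothing new left to cover
        selected.append(best)
        covered |= visibility[best]
    return selected

rng = np.random.default_rng(4)
vis = rng.random((50, 1000)) > 0.9                    # 50 candidate poses, 1000 scene points
print(greedy_coverage_selection(vis, budget=10))
```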
Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
Open Access Article
3D Road Defect Mapping via Differentiable Neural Rendering and Multi-Frame Semantic Fusion in Bird’s-Eye-View Space
by
Hongjia Xing and Feng Yang
J. Imaging 2026, 12(2), 83; https://doi.org/10.3390/jimaging12020083 - 15 Feb 2026
Abstract
Road defect detection is essential for traffic safety and infrastructure maintenance. Existing automated methods based on 2D image analysis lack spatial context and cannot provide the accurate 3D localization required for maintenance planning. We propose a novel framework for road defect mapping from monocular video sequences by integrating differentiable Bird’s-Eye-View (BEV) mesh representation, semantic filtering, and multi-frame temporal fusion. Our differentiable mesh-based BEV representation enables efficient scene reconstruction from sparse observations through MLP-based optimization. The semantic filtering strategy leverages road surface segmentation to eliminate off-road false positives, reducing detection errors by 33.7%. Multi-frame fusion with ray-casting projection and exponential moving average update accumulates defect observations across frames while maintaining 3D geometric consistency. Experimental results demonstrate that our framework produces geometrically consistent BEV defect maps with superior accuracy compared to single-frame 2D methods, effectively handling occlusions, motion blur, and varying illumination conditions.
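The exponential-moving-average accumulation mentioned above can be sketched as a per-cell update applied only where the current frame contributes observations; the grid size and smoothing factor below are assumed values, not the paper's settings.

```python
import numpy as np

def ema_update(bev_map, frame_obs, valid_mask, alpha=0.2):
    """bev_map, frame_obs: (H, W) defect scores; valid_mask: cells observed this frame."""
    updated = bev_map.copy()
    updated[valid_mask] = (1 - alpha) * bev_map[valid_mask] + alpha * frame_obs[valid_mask]
    return updated

bev = np.zeros((200, 200), dtype=np.float32)
rng = np.random.default_rng(5)
for _ in range(10):                            # fuse ten synthetic frames
    obs = rng.random((200, 200)).astype(np.float32)
    seen = rng.random((200, 200)) > 0.5        # cells hit by ray casting in this frame
    bev = ema_update(bev, obs, seen)
print(float(bev.mean()))
```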
Full article
(This article belongs to the Special Issue Intelligent 3D Vision: Reconstruction, Understanding, Generative Modeling, and Applications)
Open Access Review
Research Progress on the Application of Radiomics and Deep Learning in Liver Fibrosis
by
Yi Dang, Wenjing Li, Zhao Liu and Junqiang Lei
J. Imaging 2026, 12(2), 82; https://doi.org/10.3390/jimaging12020082 - 15 Feb 2026
Abstract
Liver fibrosis (LF) represents a crucial intermediate stage in the pathological progression from chronic liver disease to cirrhosis and hepatocellular carcinoma. Early and accurate diagnosis is vital for timely intervention and improved prognosis. Traditional liver biopsy, long regarded as the diagnostic gold standard, remains associated with several notable limitations such as invasiveness, sampling errors and inter-observer variability. Lately, as artificial intelligence (AI) technology progresses swiftly, radiomics and deep learning (DL) have risen to prominence as non-invasive diagnostic tools, showing significant potential in LF diagnostic evaluation. This review summarizes the latest advancements in radiomics and DL for LF diagnosis, staging, prognosis prediction and etiological differentiation. It also analyzes the application value of multimodal imaging modalities, including magnetic resonance imaging (MRI), computed tomography (CT) and ultrasound, in this field. Despite ongoing challenges in model generalization, standardization, interpretability, technological integration and multimodal fusion, the continuous advancement of radiomics and DL technologies holds promise for AI-driven imaging analysis strategies. These approaches aim to integrate multiple clinical monitoring methods, overcome obstacles in early LF diagnosis and treatment, and provide new perspectives for precision medicine in this disease.
Full article
(This article belongs to the Section Medical Imaging)
Open Access Article
Automatic Childhood Pneumonia Diagnosis Based on Multi-Model Feature Fusion Using Chi-Square Feature Selection
by
Amira Ouerhani, Tareq Hadidi, Hanene Sahli and Halima Mahjoubi
J. Imaging 2026, 12(2), 81; https://doi.org/10.3390/jimaging12020081 - 14 Feb 2026
Abstract
Pneumonia is one of the leading causes of child mortality, and chest radiography (CXR) is essential for its diagnosis. However, the low radiation exposure in pediatric analysis complicates the accurate detection of pneumonia, making traditional examination ineffective. Progress in medical imaging with convolutional neural networks (CNNs) has considerably improved performance, and these models have gained widespread recognition for their effectiveness. This paper proposes an accurate pneumonia detection method based on different deep CNN architectures that combine optimal feature fusion. Enhanced VGG-19, ResNet-50, and MobileNet-V2 are trained on the most widely used pneumonia dataset, applying appropriate transfer learning and fine-tuning strategies. To create an effective feature input, the Chi-Square technique removes inappropriate features from every enhanced CNN. The resulting subsets are subsequently fused horizontally to generate a more diverse and robust feature representation for binary classification. By combining the 1000 best features from the VGG-19 and MobileNet-V2 models, the proposed approach records the best accuracy (97.59%), recall (98.33%), and F1-score (98.19%) on the test set with a supervised support vector machine (SVM) classifier. The achieved results demonstrate that our approach provides a significant performance improvement over previous studies using various ensemble fusion techniques while ensuring computational efficiency. We project this fused-feature system to significantly aid timely detection of childhood pneumonia, especially within constrained healthcare systems.
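The selection-then-fusion idea can be sketched with scikit-learn as below; the feature matrices are random placeholders standing in for VGG-19 and MobileNet-V2 embeddings, and the dimensions, scaling step, and kernel choice are assumptions rather than the paper's exact pipeline.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

rng = np.random.default_rng(6)
n = 400
feat_vgg = rng.random((n, 4096))              # placeholder VGG-19 features
feat_mob = rng.random((n, 1280))              # placeholder MobileNet-V2 features
y = rng.integers(0, 2, size=n)                # binary labels: normal vs pneumonia

def select_top_k(features, labels, k):
    # chi2 requires non-negative inputs, so scale into [0, 1] first.
    scaled = MinMaxScaler().fit_transform(features)
    return SelectKBest(chi2, k=k).fit_transform(scaled, labels)

# Keep the 1000 best features from each model, then fuse horizontally.
fused = np.hstack([select_top_k(feat_vgg, y, 1000), select_top_k(feat_mob, y, 1000)])
clf = SVC(kernel="rbf").fit(fused[:300], y[:300])
print("toy accuracy:", clf.score(fused[300:], y[300:]))
```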
Full article
(This article belongs to the Section Medical Imaging)
Open Access Article
Confidence-Guided Adaptive Diffusion Network for Medical Image Classification
by
Yang Yan, Zhuo Xie and Wenbo Huang
J. Imaging 2026, 12(2), 80; https://doi.org/10.3390/jimaging12020080 - 14 Feb 2026
Abstract
Medical image classification is a fundamental task in medical image analysis and underpins a wide range of clinical applications, including dermatological screening, retinal disease assessment, and malignant tissue detection. In recent years, diffusion models have demonstrated promising potential for medical image classification owing to their strong representation learning capability. However, existing diffusion-based classification methods often rely on oversimplified prior modeling strategies, which fail to adequately capture the intrinsic multi-scale semantic information and contextual dependencies inherent in medical images. As a result, the discriminative power and stability of feature representations are constrained in complex scenarios. In addition, fixed noise injection strategies neglect variations in sample-level prediction confidence, leading to uniform perturbations being imposed on samples with different levels of semantic reliability during the diffusion process, which in turn limits the model’s discriminative performance and generalization ability. To address these challenges, this paper proposes a Confidence-Guided Adaptive Diffusion Network (CGAD-Net) for medical image classification. Specifically, a hybrid prior modeling framework is introduced, consisting of a Hierarchical Pyramid Context Modeling (HPCM) module and an Intra-Scale Dilated Convolution Refinement (IDCR) module. These two components jointly enable the diffusion-based feature modeling process to effectively capture fine-grained structural details and global contextual semantic information. Furthermore, a Confidence-Guided Adaptive Noise Injection (CG-ANI) strategy is designed to dynamically regulate noise intensity during the diffusion process according to sample-level prediction confidence. Without altering the underlying discriminative objective, CG-ANI stabilizes model training and enhances robust representation learning for semantically ambiguous samples. Experimental results on multiple public medical image classification benchmarks, including HAM10000, APTOS2019, and Chaoyang, demonstrate that CGAD-Net achieves competitive performance in terms of classification accuracy, robustness, and training stability. These results validate the effectiveness and application potential of confidence-guided diffusion modeling for two-dimensional medical image classification tasks, and provide valuable insights for further research on diffusion models in the field of medical image analysis.
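One way to picture confidence-guided noise injection is to scale the per-sample noise level by the prediction confidence. The direction and form of the schedule below are assumptions for illustration only, since the abstract does not specify CG-ANI's exact rule; the function and parameter names are likewise hypothetical.

```python
import torch

def confidence_guided_noise(features, confidence, base_sigma=1.0):
    """features: (B, D); confidence: (B,) in [0, 1]; returns noised features.
    Here low-confidence samples receive weaker perturbation; the actual schedule
    in CG-ANI may differ."""
    sigma = base_sigma * (1.0 - confidence).clamp(min=0.1)   # per-sample noise level
    noise = torch.randn_like(features) * sigma.unsqueeze(1)
    return features + noise

feats = torch.randn(8, 256)
conf = torch.sigmoid(torch.randn(8))          # stand-in for per-sample confidence
print(confidence_guided_noise(feats, conf).shape)
```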
Full article
(This article belongs to the Section Medical Imaging)
Open Access Article
Progressive Upsampling Generative Adversarial Network with Collaborative Attention for Single-Image Super-Resolution
by
Haoxiang Lu, Jing Zhang, Mengyuan Jing, Ziming Wang and Wenhao Wang
J. Imaging 2026, 12(2), 79; https://doi.org/10.3390/jimaging12020079 - 11 Feb 2026
Abstract
Single-image super-resolution (SISR) is an essential low-level visual task that aims to produce high-resolution images from low-resolution inputs. However, most existing SISR methods heavily rely on ideal degradation kernels and rarely consider the actual noise distribution. To tackle these issues, this paper presents a progressive upsampling generative adversarial network with a collaborative attention mechanism, called PUGAN. Specifically, residual multiscale blocks (RMBs) based on stacked mixed-pooling multiscale structures (MPMSs) are designed to make full use of multiscale global–local hierarchical features, and a frequency collaborative attention mechanism (CAM) is used to fully exploit high- and low-frequency characteristics. Meanwhile, we design a progressive upsampling strategy to better guide the model’s learning while reducing model complexity. Finally, the discriminator is also used to evaluate the reconstructed high-resolution images for balancing super-resolution reconstruction and detail enhancement. PUGAN yields comparable PSNR/SSIM/LPIPS values on the NTIRE 2020, Urban 100, and B100 datasets: 33.987/0.9673/0.1210, 32.966/0.9483/0.1431, and 33.627/0.9546/0.1354 for one tested scale factor, and 26.349/0.8721/0.1975, 26.110/0.8614/0.1983, and 26.306/0.8803/0.1978 for the other, respectively. Extensive experiments demonstrate that our PUGAN outperforms state-of-the-art SISR methods in qualitative and quantitative assessments for the SISR task. Additionally, PUGAN shows potential benefits for pathological image super-resolution.
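Two of the reported metrics, PSNR and SSIM, can be computed with scikit-image as in the toy sketch below; LPIPS requires a learned network and is omitted, and the image pair here is synthetic.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(7)
hr = rng.random((128, 128, 3))                                    # stand-in ground truth
sr = np.clip(hr + 0.05 * rng.standard_normal(hr.shape), 0, 1)     # stand-in reconstruction

print("PSNR:", peak_signal_noise_ratio(hr, sr, data_range=1.0))
print("SSIM:", structural_similarity(hr, sr, data_range=1.0, channel_axis=-1))
```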
Full article
(This article belongs to the Section Image and Video Processing)
Open Access Article
Age Prediction of Hematoma from Hyperspectral Images Using Convolutional Neural Networks
by
Arash Keshavarz, Gerald Bieber, Daniel Wulff, Carsten Babian and Stefan Lüdtke
J. Imaging 2026, 12(2), 78; https://doi.org/10.3390/jimaging12020078 - 11 Feb 2026
Abstract
Accurate estimation of hematoma age remains a major challenge in forensic practice, as current assessments rely heavily on subjective visual interpretation. Hyperspectral imaging (HSI) captures rich spectral signatures that may reflect the biochemical evolution of hematomas over time. This study evaluates whether a convolutional neural network (CNN) integrating both spectral and spatial information improves hematoma age estimation accuracy. Additionally, we investigate whether performance can be maintained using a reduced, physiologically motivated subset of wavelengths. Using a dataset of forearm hematomas from 25 participants, we applied radiometric normalization and SAM-based segmentation to extract hyperspectral patches. In leave-one-subject-out cross-validation, the CNN outperformed a spectral-only Lasso baseline, reducing the mean absolute error (MAE) from 3.24 days to 2.29 days. Band-importance analysis combining SmoothGrad and occlusion sensitivity identified 20 highly informative wavelengths; using only these bands matched or exceeded the accuracy of the full 204-band model across early, middle, and late hematoma stages. These results demonstrate that spectral–spatial modeling and physiologically grounded band selection can enhance estimation accuracy while significantly reducing data dimensionality. This approach supports the development of compact multispectral systems for objective clinical and forensic evaluation.
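The leave-one-subject-out protocol with the spectral-only Lasso baseline mentioned above can be sketched as below; the CNN itself is omitted, all data are synthetic, and the regularization strength is an assumed value.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(8)
n_patches, n_bands = 500, 204                    # 204 bands, as stated in the abstract
X = rng.random((n_patches, n_bands))             # toy mean spectra of hyperspectral patches
y = rng.uniform(0, 14, size=n_patches)           # toy hematoma age in days
subjects = rng.integers(0, 25, size=n_patches)   # 25 participants

errors = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=subjects):
    model = Lasso(alpha=0.01).fit(X[train_idx], y[train_idx])
    errors.append(mean_absolute_error(y[test_idx], model.predict(X[test_idx])))
print("LOSO MAE (days):", float(np.mean(errors)))
```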
Full article
(This article belongs to the Special Issue Multispectral and Hyperspectral Imaging: Progress and Challenges)
Open Access Correction
Correction: Jiang et al. Double-Gated Mamba Multi-Scale Adaptive Feature Learning Network for Unsupervised Single RGB Image Hyperspectral Image Reconstruction. J. Imaging 2026, 12, 19
by
Zhongmin Jiang, Zhen Wang, Wenju Wang and Jifan Zhu
J. Imaging 2026, 12(2), 77; https://doi.org/10.3390/jimaging12020077 - 11 Feb 2026
Abstract
There were two errors in the original publication [...]
Full article
(This article belongs to the Special Issue Multispectral and Hyperspectral Imaging: Progress and Challenges)
Open Access Article
A Multiphase CT-Based Integrated Deep Learning Framework for Rectal Cancer Detection, Segmentation, and Staging: Performance Comparison with Radiologist Assessment
by
Tzu-Hsueh Tsai, Jia-Hui Lin, Yen-Te Liu, Jhing-Fa Wang, Chien-Hung Lee and Chiao-Yun Chen
J. Imaging 2026, 12(2), 76; https://doi.org/10.3390/jimaging12020076 - 10 Feb 2026
Abstract
Accurate staging of rectal cancer is crucial for treatment planning; however, computed tomography (CT) interpretation remains challenging and highly dependent on radiologist expertise. This study aimed to develop and evaluate an AI-assisted system for rectal cancer detection and staging using CT images. The proposed framework integrates three components—a convolutional neural network (RCD-CNN) for lesion detection, a U-Net model for rectal contour delineation and tumor localization, and a 3D convolutional network (RCS-3DCNN) for staging prediction. CT scans from 223 rectal cancer patients at Kaohsiung Medical University Chung-Ho Memorial Hospital were retrospectively analyzed, including both non-contrast and contrast-enhanced studies. RCD-CNN achieved an accuracy of 0.976, recall of 0.975, and precision of 0.976. U-Net yielded Dice scores of 0.897 (rectal contours) and 0.856 (tumor localization). Radiologist-based clinical staging had 82.6% concordance with pathology, while AI-based staging achieved 80.4%. McNemar’s test showed no significant difference between the AI and radiologist staging results (p = 1.0). The proposed AI-assisted system achieved staging accuracy comparable to that of radiologists and demonstrated feasibility as a decision-support tool in rectal cancer management. This study introduces a novel three-stage, dual-phase CT-based AI framework that integrates lesion detection, segmentation, and staging within a unified workflow.
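The paired comparison reported above can be illustrated with McNemar's test on a 2x2 table of cases where AI staging and radiologist staging are each correct or incorrect against pathology; the counts below are invented placeholders, not study data, and statsmodels is assumed to be available.

```python
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

# Placeholder per-case correctness indicators for the two raters.
ai_correct = np.array([1, 1, 0, 1, 0, 1, 1, 0, 1, 1], dtype=bool)
rad_correct = np.array([1, 0, 0, 1, 1, 1, 1, 0, 1, 1], dtype=bool)

# 2x2 contingency table: rows = AI correct/incorrect, columns = radiologist correct/incorrect.
table = np.array([
    [np.sum(ai_correct & rad_correct),  np.sum(ai_correct & ~rad_correct)],
    [np.sum(~ai_correct & rad_correct), np.sum(~ai_correct & ~rad_correct)],
])
print(mcnemar(table, exact=True))   # reports the test statistic and p-value
```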
Full article
(This article belongs to the Topic Machine Learning and Deep Learning in Medical Imaging)
Open Access Article
Robust Detection and Localization of Image Copy-Move Forgery Using Multi-Feature Fusion
by
Kaiqi Lu and Qiuyu Zhang
J. Imaging 2026, 12(2), 75; https://doi.org/10.3390/jimaging12020075 - 10 Feb 2026
Abstract
Copy-move forgery detection (CMFD) is a crucial image forensics analysis technique. The rapid development of deep learning algorithms has led to impressive advancements in CMFD. However, existing models suffer from two key limitations: Their feature fusion modules insufficiently exploit the complementary nature of features from the RGB domain and noise domain, resulting in suboptimal feature representations. During decoding, they simply classify pixels as authentic or forged, without aggregating cross-layer information or integrating local and global attention mechanisms, leading to unsatisfactory detection precision. To overcome these limitations, a robust detection and localization approach to image copy-move forgery using multi-feature fusion is proposed. Firstly, a Multi-Feature Fusion Network (MFFNet) was designed. Within its feature fusion module, features from both the RGB domain and noise domain were fused to enable mutual complementarity between distinct characteristics, yielding richer feature information. Then, a Lightweight Multi-layer Perceptron Decoder (LMPD) was developed for image reconstruction and forgery localization map generation. Finally, by aggregating information from different layers and combining local and global attention mechanisms, more accurate prediction masks were obtained. The experimental results demonstrate that the proposed MFFNet model exhibits enhanced robustness and superior detection and localization performance compared to existing methods when faced with JPEG compression, noise addition, and resizing operations.
Full article
(This article belongs to the Section Image and Video Processing)
Open Access Feature Paper Article
LDFSAM: Localization Distillation-Enhanced Feature Prompting SAM for Medical Image Segmentation
by
Xuanbo Zhao, Cheng Wang, Huaxing Xu, Hong Zhou, Zekuan Yu, Tao Chen, Xiaoling Wei and Rongjun Zhang
J. Imaging 2026, 12(2), 74; https://doi.org/10.3390/jimaging12020074 - 10 Feb 2026
Abstract
Standard SAM-based approaches in medical imaging typically rely on explicit geometric prompts, such as bounding boxes or points. However, these rigid spatial constraints are often insufficient for capturing the complex, deformable boundaries of medical structures, where localization noise easily propagates into segmentation errors. To overcome this, we propose the Localization Distillation-Enhanced Feature Prompting SAM (LDFSAM), a novel framework that shifts from discrete coordinate inputs to a latent feature prompting paradigm. We employ a lightweight prompt generator, refined via Localization Distillation (LD), to inject multi-scale features into the SAM decoder as complementary Dense Feature Prompts (DFPs) and Sparse Feature Prompts (SFPs). This effectively guides segmentation without explicit box constraints. Extensive experiments on four public benchmarks (3D CBCT Tooth, ISIC 2018, MMOTU, and Kvasir-SEG) demonstrate that LDFSAM outperforms both prior SAM-based baselines and conventional networks, achieving Dice scores exceeding 0.91. Further validation on an in-house cohort demonstrates its robust generalization capabilities. Overall, our method outperforms both prior SAM-based baselines and conventional networks, with particularly strong gains in low-data regimes, providing a reliable solution for automated medical image segmentation.
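The Dice score used to report segmentation quality above is computed as twice the overlap divided by the total mask size; a minimal sketch on synthetic boolean masks:

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """pred, target: boolean masks of the same shape."""
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

rng = np.random.default_rng(9)
gt = rng.random((256, 256)) > 0.5
pred = gt.copy()
pred[:10] = ~pred[:10]                     # perturb a few rows to simulate errors
print(round(float(dice_score(pred, gt)), 4))
```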
Full article
(This article belongs to the Section Medical Imaging)
Open Access Article
Assessing Impact of Data Quality in Early Post-Operative Glioblastoma Segmentation
by
Ragnhild Holden Helland, David Bouget, Asgeir Store Jakola, Sébastien Muller, Ole Solheim and Ingerid Reinertsen
J. Imaging 2026, 12(2), 73; https://doi.org/10.3390/jimaging12020073 - 10 Feb 2026
Abstract
Quantification of the residual tumor from early post-operative magnetic resonance imaging (MRI) is essential in follow-up and treatment planning for glioblastoma patients. Residual tumor segmentation from early post-operative MRI is particularly challenging compared to the closely related task of pre-operative segmentation, as the tumor lesions are small, fragmented, and easily confounded with noise in the resection cavity. Recently, several studies successfully trained deep learning models for early post-operative segmentation, yet with subpar performances compared to the analogous task pre-operatively. In this study, the impact of image and annotation quality on model training and performance in early post-operative glioblastoma segmentation was assessed. A dataset consisting of early post-operative MRI scans from 423 patients and two hospitals in Norway and Sweden was assembled, for which image and annotation qualities were evaluated by expert neurosurgeons. The Attention U-Net architecture was trained with five-fold cross-validation on different quality-based subsets of the dataset in order to evaluate the impact of training data quality on model performance. Including low-quality images in the training set did not deteriorate performance on high-quality images. However, models trained on exclusively high-quality images did not generalize to low-quality images. Models trained on exclusively high-quality annotations reached the same performance level as the models trained on the entire dataset, using only two-thirds of the dataset. Both image and annotation quality had a significant impact on model performance. In dataset curation, images should ideally be representative of the quality variations in the real-world clinical scenario, and efforts should be made to ensure exact ground truth annotations of high quality.
Full article
(This article belongs to the Special Issue Progress and Challenges in Biomedical Image Analysis—2nd Edition)
Open Access Article
GreenViT: A Vision Transformer with Single-Path Progressive Upsampling for Urban Green-Space Segmentation and Auditable Area Estimation
by
Ziqiang Xu, Young Choi, Changyong Yi, Chanjeong Park, Jinyoung Park, Hyungkeun Park and Sujeen Song
J. Imaging 2026, 12(2), 72; https://doi.org/10.3390/jimaging12020072 - 10 Feb 2026
Abstract
Urban green-space monitoring in dense cityscapes remains limited by accuracy–efficiency trade-offs and the absence of integrated, auditable area estimation. We introduce GreenViT, a Vision Transformer (ViT) based framework for precise segmentation and transparent quantification of urban green space. GreenViT couples a ViT-L/14 backbone with a lightweight single-path, progressive upsampling decoder (Green Head), preserving global context while recovering thin structures. Experiments were conducted on a manually annotated dataset of 20 high-resolution satellite images collected from Satellites.Pro, covering five land-cover classes (background, green space, building, road, and water). Using a 224 × 224 sliding window sampling scheme, the 20 images yield 62,650 training/validation patches. Under five-fold evaluation, it attains 0.9200 ± 0.0243 mIoU, 0.9580 ± 0.0135 Dice, and 0.9570 PA, and the calibrated estimator achieves 1.10% relative area error. Overall, GreenViT strikes a strong balance between accuracy and efficiency, making it particularly well-suited for thin or boundary-rich classes. It can be used to support planning evaluations, green-space statistics, urban renewal assessments, and ecological red-line verification, while providing reliable green-area metrics to support urban heat mitigation and pollution control efforts. This makes it highly suitable for decision-oriented long-term monitoring and management assessments.
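The 224 × 224 sliding-window sampling and the step from a predicted green-space mask to an area figure can be sketched as below; the stride, the ground sampling distance (metres per pixel), and the function names are assumptions for illustration, not values from the paper.

```python
import numpy as np

def sliding_windows(image, size=224, stride=224):
    """Yield square patches covering the image (edge remainders are dropped here)."""
    h, w = image.shape[:2]
    for top in range(0, h - size + 1, stride):
        for left in range(0, w - size + 1, stride):
            yield image[top:top + size, left:left + size]

def green_area_m2(mask, gsd_m=0.5):
    """mask: boolean green-space mask; gsd_m: assumed metres per pixel."""
    return float(mask.sum()) * gsd_m ** 2

rng = np.random.default_rng(10)
img = rng.random((1120, 1120, 3))
patches = list(sliding_windows(img))
print(len(patches), "patches of 224 x 224")     # 5 x 5 = 25 for this toy image
mask = rng.random((1120, 1120)) > 0.7
print("estimated green area (m^2):", green_area_m2(mask))
```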
Full article
(This article belongs to the Section AI in Imaging)
Topics
Topic in
AI, Applied Sciences, Bioengineering, Healthcare, IJERPH, JCM, Clinics and Practice, J. Imaging
Artificial Intelligence in Public Health: Current Trends and Future Possibilities, 2nd Edition
Topic Editors: Daniele Giansanti, Giovanni Costantini
Deadline: 15 March 2026
Topic in
Applied Sciences, Computers, Electronics, Information, J. Imaging
Visual Computing and Understanding: New Developments and Trends
Topic Editors: Wei Zhou, Guanghui Yue, Wenhan Yang
Deadline: 31 March 2026
Topic in
Applied Sciences, Electronics, J. Imaging, MAKE, Information, BDCC, Signals
Applications of Image and Video Processing in Medical Imaging
Topic Editors: Jyh-Cheng Chen, Kuangyu Shi
Deadline: 30 April 2026
Topic in
Diagnostics, Electronics, J. Imaging, Mathematics, Sensors
Transformer and Deep Learning Applications in Image Processing
Topic Editors: Fengping An, Haitao Xu, Chuyang Ye
Deadline: 31 May 2026
Special Issues
Special Issue in
J. Imaging
Emerging Technologies for Less Invasive Diagnostic Imaging
Guest Editors: Francesca Angelone, Noemi Pisani, Armando Ricciardi
Deadline: 28 February 2026
Special Issue in
J. Imaging
3D Image Processing: Progress and Challenges
Guest Editor: Chinthaka Dinesh
Deadline: 28 February 2026
Special Issue in
J. Imaging
Translational Preclinical Imaging: Techniques, Applications and Perspectives
Guest Editors: Sara Gargiulo, Sandra Albanese
Deadline: 31 March 2026
Special Issue in
J. Imaging
Artificial Intelligence for Medical Imaging and Applications
Guest Editors: Wen Tang, Jinhua Liu
Deadline: 31 March 2026