Journal Description
Journal of Imaging
Journal of Imaging is an international, multi/interdisciplinary, peer-reviewed, open access journal of imaging techniques, published online monthly by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within Scopus, ESCI (Web of Science), PubMed, PMC, dblp, Inspec, Ei Compendex, and other databases.
- Journal Rank: JCR - Q2 (Imaging Science and Photographic Technology) / CiteScore - Q1 (Radiology, Nuclear Medicine and Imaging)
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 18 days after submission; acceptance to publication is undertaken in 3.6 days (median values for papers published in this journal in the second half of 2025).
- Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
Impact Factor: 3.3 (2024); 5-Year Impact Factor: 3.3 (2024)
Latest Articles
Feasibility of High-Frequency Ultrasound and Magnetic Resonance Imaging to Assess the In Ovo Development of Chicken Embryos
J. Imaging 2026, 12(5), 217; https://doi.org/10.3390/jimaging12050217 - 20 May 2026
Abstract
Preclinical multimodal imaging is widely applied in small animal models for longitudinal studies of human diseases. Beyond murine systems, cost-effective and ethically sustainable models such as the chicken embryo and its chorioallantoic membrane are gaining increasing interest in accordance with the 3Rs principles. This study evaluated the feasibility of using both high-frequency ultrasound and magnetic resonance imaging for the non-invasive longitudinal monitoring of chicken embryo development in ovo. Fifty fertilized eggs were incubated under controlled conditions and examined up to embryonic day 14. High-frequency ultrasound (15–71 MHz) enabled real-time imaging and quantitative assessment of superficial structures, including cranial biometry and limb growth, while magnetic resonance imaging (7T) provided high-resolution three-dimensional visualization of internal organs and extraembryonic compartments. Together, these modalities allowed the progressive identification of key anatomical structures from ED5 onward, with HFUS enabling earlier linear measurements and MRI facilitating detailed anatomical and volumetric evaluation. The integration of these techniques allowed the generation of a developmental imaging timeline and quantitative reference dataset of normal embryogenesis. This multimodal approach represents a promising strategy for in vivo developmental studies, offering a robust baseline to characterize structural alterations induced by experimental conditions. Moreover, the use of the chicken embryo model provides significant ethical and economic advantages, supporting its application in preclinical research and imaging-based studies.
(This article belongs to the Special Issue Translational Preclinical Imaging: Techniques, Applications and Perspectives)
Open Access Article
IPSM-UNet: An Inverted Pyramid-Shaped U-Net++ Architecture with Multi-Resolution Information Interaction for Coronary Artery Segmentation
by
Yinong Liao, Wei Li, Guopeng Liu, Rong Wang and Nan Zheng
J. Imaging 2026, 12(5), 216; https://doi.org/10.3390/jimaging12050216 - 20 May 2026
Abstract
Accurate coronary artery segmentation is essential for diagnosis and interventional planning, but conventional U-shaped networks often miss thin, low-contrast vessels and break vessel continuity. We propose Inverted Pyramid-Shaped Multi-resolution U-Net (IPSM-UNet), a dual U-Net++ architecture with multi-resolution feature interaction, feature aggregation, and layer-wise deep supervision. The method is evaluated on DRIVE, CHASE_DB1, DCA1, and an internal coronary angiography dataset. IPSM-UNet achieves competitive or better performance across datasets, including F1 = 0.8310 and Acc = 0.9707 on DRIVE, Se = 0.8792 and Acc = 0.9745 on CHASE_DB1, F1 = 0.8043 and Acc = 0.9793 on DCA1, and Se = 0.8741, F1 = 0.8590, and Acc = 0.9879 on the internal dataset. IPSM-UNet improves vessel continuity and overall segmentation quality, particularly for small-caliber vessels, and supports downstream coronary analysis.
(This article belongs to the Section Medical Imaging)
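The Se, F1, and Acc values quoted in this abstract are standard pixel-wise measures. As a minimal sketch, assuming binary masks stored as nested lists of 0/1, they could be computed as below. The helper names `confusion_counts` and `segmentation_metrics` and the 4x4 masks are hypothetical illustrations, not taken from the paper.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN over two equally sized binary masks (nested lists of 0/1)."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, truth):
    """Return sensitivity, precision, F1, and accuracy for the foreground (vessel) class."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"Se": sensitivity, "Pr": precision, "F1": f1, "Acc": accuracy}


# Hypothetical 4x4 masks: 1 = vessel pixel, 0 = background.
truth = [[0, 1, 0, 0],
         [0, 1, 0, 0],
         [0, 1, 1, 0],
         [0, 0, 1, 0]]
pred = [[0, 1, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 1, 1]]
metrics = segmentation_metrics(pred, truth)
```

Because background pixels dominate vessel images, accuracy tends to sit well above F1, which is consistent with the pattern of values reported above.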
Open Access Article
Cell Structure Segmentation in TEM Images of Murine Skin Melanoma Cells by Deep Learning Model
by
Mikhail A. Genaev, Izabella S. Gogaeva, Iuliia S. Taskaeva, Nataliya P. Bgatova, Mikhail V. Kozhekin, Evgeniy G. Komyshev and Dmitry A. Afonnikov
J. Imaging 2026, 12(5), 215; https://doi.org/10.3390/jimaging12050215 - 18 May 2026
Abstract
Mitochondria–endoplasmic reticulum contact sites (MERCs) are specialized areas involved in many intracellular signaling pathways that regulate Ca²⁺ homeostasis, lipid transport, mitochondrial dynamics, cell death, and autophagy. Understanding MERC dynamics has important therapeutic implications in cancer, as these contacts regulate fundamental cellular processes and represent promising targets for therapeutic interventions aimed at improving cancer treatment outcomes. Despite the accumulated data, the role of MERCs in carcinogenesis remains unknown, which motivates the search for new tools that facilitate the study of MERCs in tumor cells. The structure of MERCs can be examined in great detail using transmission electron microscopy (TEM). Currently, several hundred TEM images are required to obtain reliable data on these contacts. Data processing can be accelerated significantly by fast and accurate image analysis techniques based on deep learning models. In this study, five U-Net models with a ResNet34 encoder network were evaluated for the segmentation of mitochondria and the endoplasmic reticulum (ER), including the basic U-Net-Vanilla architecture as well as models incorporating various attention blocks and blocks capturing multilevel image structure. The U-Net-scSE network performed best on the test dataset, with F1 scores of 0.872 for mitochondria and 0.744 for the ER. Two models were tested for their ability to leverage pre-training on external datasets (Lucchi++, Kasthuri++, and DeepPi-EM). Additionally, models pre-trained on the CEM500K dataset were evaluated after the parameters had been tuned on the data. The results showed that neither pre-training nor the use of pre-trained networks improved the IoU and F1 metrics on the test dataset. Subsequent image analysis was conducted to assess two types of MERCs in the segmented images. Finally, the free and user-friendly UltraNet web server was developed for automated analysis of mitochondria, ER, and MERCs using TEM images.
(This article belongs to the Special Issue Deep Learning in Biomedical Image Segmentation and Classification: Advancements, Challenges and Applications, 2nd Edition)
Open Access Article
Robust Point Cloud Registration via Rotation-Equivariant Geometric Encoding and State Space Models
by
Junjie Li, Jiajun Liu, Anqi Chen, Huifang Shen and Jianya Yuan
J. Imaging 2026, 12(5), 214; https://doi.org/10.3390/jimaging12050214 - 18 May 2026
Abstract
Point cloud registration in environments lacking rich textures or containing repetitive structures remains highly susceptible to misalignments. The core challenge lies in balancing the demand for extracting highly distinctive local features with the computational cost of global context modeling. In this paper, we propose a robust registration framework that efficiently combines rotation-equivariant geometric representations with state space models of linear complexity to mitigate feature ambiguity and mismatch. First, a multivariate geometric encoding mechanism is embedded within convolutional layers, enhancing local feature distinctiveness under strict rotation equivariance by explicitly leveraging surface properties. Second, to efficiently establish long-range spatial dependencies, we replace standard dense attention with a hybrid geometry-state aggregation module. This module integrates local geometric self-attention with the Mamba architecture, strengthening focus on overlapping regions without the quadratic computational burden. Finally, we optimize the generated correspondences through a physically consistent hypothesis generator to compute reliable rigid transformation results. On standard benchmarks, our framework demonstrates exceptional robustness to ambiguous matches, achieving a 96.3% registration recall on the 3DMatch dataset and outstanding accuracy on the KITTI dataset.
(This article belongs to the Section Computer Vision and Pattern Recognition)
Open Access Article
MRI-Derived Biomarkers and Radiomic Signatures for Early, Dose-Dependent Evaluation of Prostate Cancer Radiotherapy: An Exploratory Study
by
Eleni Bekou, Admir Mulita, Ioannis M. Koukourakis, Nikolaos Courcoutsakis, Athanasia Kotini, Evlampia Psatha, Georgios Tsakaldimis, Ioannis Seimenis, Michael I. Koukourakis and Efstratios Karavasilis
J. Imaging 2026, 12(5), 213; https://doi.org/10.3390/jimaging12050213 - 17 May 2026
Abstract
This study provides an accurate assessment of radiotherapy-induced tissue changes in prostate cancer when relying solely on serum prostate-specific antigen kinetics. The current study aims to explore the role of quantitative magnetic resonance imaging and radiomic analyses. In this exploratory prospective study, 22 patients with histologically confirmed prostate cancer underwent multiparametric magnetic resonance imaging at three time points: pre-treatment, mid-treatment, and two months post-radiotherapy. Quantitative imaging analysis included total prostate volume, T2, apparent diffusion coefficient (ADC), and T2* mapping, alongside T2-weighted and diffusion-weighted radiomic feature extraction. Longitudinal changes and dose correlations were analyzed using repeated-measures ANOVA and linear mixed-effects models. Prostate volume increased from 44.22 ± 21.26 cm³ at baseline to 51.11 ± 22.36 cm³ mid-treatment (p < 0.001) and decreased to 37.98 ± 15.5626 cm³ post-treatment (p = 0.034), indicative of temporary radiation-induced glandular edema. T2 relaxation times decreased from 106.00 ± 23.74 ms to 93.33 ± 9.50 ms after therapy (p = 0.023), with androgen deprivation therapy influencing overall values (partial η² = 0.228, p = 0.028), while ADC and T2* remained largely stable (p > 0.05). Radiomic features, particularly from diffusion-weighted imaging (DWI), exhibited subtle time- and dose-dependent variations. Radiation dose was significantly associated with volume and T2, but not with ADC or T2*. These findings suggest that quantitative MRI biomarkers combined with radiomic analysis may provide objective, non-invasive measures of early prostate cancer radiotherapy-induced changes. These imaging-derived metrics may capture early treatment-related tissue alterations and could provide exploratory signals for early treatment evaluation in prostate cancer, although their relationship with biochemical markers requires further validation.
(This article belongs to the Section Medical Imaging)
Open Access Article
GCA-Trans: Global Context-Aware Transformer for Robust Transparent Object Segmentation in Robotic Environments
by
Deping Li, Zujian Dong, Zilong Yang, Ka-Kui Li and Yushen Huang
J. Imaging 2026, 12(5), 212; https://doi.org/10.3390/jimaging12050212 - 16 May 2026
Abstract
Transparent object segmentation plays a critical role in indoor and outdoor scene understanding, particularly driven by the rapid advancements in autonomous driving and robotics. However, this task presents significant challenges due to the lack of distinct texture and chromatic features in transparent objects, causing their appearance to blend into the background. Existing methods face inherent architectural limitations: CNNs are restricted by limited receptive fields, while Transformer-based methods may inadvertently suppress the weak feature details of transparent surfaces due to the inherent low-pass filtering property of self-attention mechanisms, treating them as background noise. Consequently, these approaches struggle to consistently segment transparent objects across diverse scales, failing to preserve both fine details and large-scale structures. To address these limitations, we propose the Global Context-Aware Transformer (GCA-Trans). Specifically, we design a Multi-scale Context Mining (MCM) module that leverages parallel dilated convolutions with varying receptive fields to simultaneously extract features at multiple scales. This design allows the model to capture and fuse fine-grained local details (e.g., edges and textures) with coarse-grained global spatial context (e.g., overall object shapes), ensuring robust segmentation performance for transparent objects of varying scales. Extensive experiments on four benchmark datasets demonstrate that GCA-Trans sets a new state of the art, achieving significant improvements of 2.53% mIoU on Trans10K-v2, 2.1% IoU on RGB-D GSD, 2.2% IoU on GDD, and 1.9% IoU on GSD, validating the effectiveness and robustness of our approach.
(This article belongs to the Special Issue AI-Driven Robot Vision: Progress, Challenges, and Perspectives)
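The idea of parallel dilated convolutions with varying receptive fields, described in this abstract, can be illustrated in one dimension. The sketch below is not the MCM module itself: the dilation rates (1, 2, 4), the 3-tap kernel, and fusion by element-wise averaging are all assumptions chosen for illustration, and the function names are hypothetical.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Same-length 1D cross-correlation with zero padding and the given dilation."""
    k = len(kernel)
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - k // 2) * dilation  # taps are spaced `dilation` samples apart
            if 0 <= idx < len(signal):         # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilation):
    """Number of input samples spanned by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def multi_scale_features(signal, kernel, dilations=(1, 2, 4)):
    """Run parallel dilated branches and fuse them by averaging (an assumed fusion rule)."""
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [sum(vals) / len(vals) for vals in zip(*branches)]


signal = [0, 0, 0, 0, 1, 0, 0, 0, 0]  # unit impulse at index 4
kernel = [1.0, 1.0, 1.0]
fields = [receptive_field(len(kernel), d) for d in (1, 2, 4)]
fused = multi_scale_features(signal, kernel)
```

With the same three weights, the branches span 3, 5, and 9 input samples, which is how a larger dilation widens context without adding parameters.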
Open Access Article
Clinician-Centered Evaluation Framework for Explainable AI Heatmaps in OCT-Based Retinal Disease Classification
by
Eirini Maliagkani, Ilias Georgalas, Ioannis Datseris, Elpiniki Papageorgiou and Ioannis D. Apostolopoulos
J. Imaging 2026, 12(5), 211; https://doi.org/10.3390/jimaging12050211 - 16 May 2026
Abstract
This study presents a two-phase framework for selecting clinically plausible explainable artificial intelligence (XAI) heatmaps for retinal optical coherence tomography (OCT) classification. A six-class Swin Transformer model was trained and validated using a combined dataset consisting of a subset of the public OCT-C8 dataset and private data from a Greek tertiary hospital and externally evaluated on an independent dataset from a private ophthalmological institute. Diagnostic performance was high, achieving 97% accuracy in cross-validation and 91.82% on external evaluation. In Phase 1, one ophthalmologist and one artificial intelligence (AI) specialist independently assessed 100 heatmaps per method based on visual quality and anatomical plausibility, reducing the candidate methods to three. In Phase 2, 21 specialists evaluated the selected methods across multiple cases using a five-point Likert scale reflecting agreement between highlighted regions and the model diagnosis. The proposed Token contRAST map (TRAST) achieved the highest ratings, followed by Gradient-weighted Class Activation Mapping (Grad-CAM++), while Cosine-Grad Fusion Map (CGFM) showed the lowest performance. These findings reflect clinical plausibility rather than direct model interpretability and indicate that effective XAI in OCT imaging requires not only technical performance but also structured expert evaluation. The proposed framework provides a practical approach for selecting explanation methods suitable for clinical use in ophthalmology.
(This article belongs to the Special Issue From Code to Clinic: Trustworthy AI for Medical Imaging)
Open Access Article
Beyond GLM: Inter-Subject Variability as a Complementary Approach to Detect Longitudinal Changes in Emotion Processing in Multiple Sclerosis
by
Alice Pirastru, Valeria Blasi, Diego Michael Cacciatore, Marco Rovaris, Elena Toselli, Francesco Pagnini, Cesare Cavalera, Fabrizio Esposito, Giuseppe Baselli and Francesca Baglio
J. Imaging 2026, 12(5), 210; https://doi.org/10.3390/jimaging12050210 - 15 May 2026
Abstract
Understanding how to reliably capture neural changes induced by treatments in neurological patients remains a major methodological challenge. This issue is particularly evident in the emotional domain—frequently impaired in conditions such as multiple sclerosis (MS) and a key target of rehabilitation—yet not limited to it. Longitudinal neuroimaging studies predominantly rely on group-level analyses (e.g., General Linear Model, GLM), which assume inter-subject homogeneity and treat inter-subject variability (ISV) as noise. Such assumptions may obscure treatment-related neuroplastic changes, especially in domains like emotion processing, where neural responses are intrinsically variable and highly individualized in clinical populations. This study investigates whether modeling ISV can better capture treatment-related neural changes, using emotion-focused rehabilitation as a representative case. We compared GLM with threshold-weighted overlap maps (OMth-w), which quantify spatial consistency across individuals. Thirty healthy controls (HCs) and thirteen people with MS (pwMS) undergoing EMDR for depression performed an emotional fMRI task (pwMS pre/post-treatment). GLM revealed no longitudinal effects, whereas OMth-w showed reduced variability in pwMS after treatment, alongside decreased depressive symptoms (p < 0.001). These findings highlight the value of variability-based approaches as a complementary framework to conventional GLM analyses for detecting treatment-related neuroplasticity in neurological populations.
(This article belongs to the Special Issue Advances in Neuroimaging for Human Cognition, Behavior, Brain Modulation and Prediction)
Open Access Article
LDSNet: A Lightweight Detail-Sensitive Network for Small Object Detection in Low-Altitude UAV Scenarios
by
Tong Tan, Xianrong Peng, Jianlin Zhang, Haorui Zuo, Yao Zhang, Yunhao Wu and Hui Li
J. Imaging 2026, 12(5), 209; https://doi.org/10.3390/jimaging12050209 - 14 May 2026
Abstract
Object detection in Unmanned Aerial Vehicle (UAV) imagery faces significant challenges due to the unique aerial perspective. A major bottleneck is the weak feature representation of small objects, which limits both detection accuracy and computational efficiency. To address this issue, we propose a Lightweight Detail-Sensitive Network (LDSNet). Specifically, LDSNet consists of three key components: (1) Lightweight Detail-Sensitive Downsampling (LDSDown), which combines anti-aliasing smoothing with dual-path feature extraction to preserve the spatial details of small objects during downsampling; (2) Shared Recursive Dilated Convolution (SRDC), which uses weight-shared multi-rate dilated convolutions to capture multi-scale context and enlarge the receptive field without introducing extra parameters; and (3) Deeply Decoupled Grouped Head (DGHead), which employs high-ratio grouped convolutions to significantly reduce the computational cost of processing high-resolution inputs. Extensive experiments on the VisDrone2019 and HIT-UAV datasets demonstrate that LDSNet achieves an excellent trade-off between accuracy and efficiency. Compared to the YOLOv11n baseline, LDSNet reduces parameters by 84.6% (from 2.6 M to 0.4 M) and FLOPs by 29.2% (from 6.5 G to 4.6 G), while improving mAP50 by 2.2% on VisDrone2019 and achieving 94.5% on HIT-UAV.
(This article belongs to the Special Issue AI-Driven Remote Sensing Image Processing and Pattern Recognition)
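The relative reductions quoted in this abstract can be checked directly from the absolute figures it gives. A small sketch follows; the function name `relative_reduction` is a hypothetical helper, not something from the paper.

```python
def relative_reduction(baseline, new):
    """Percentage reduction of `new` relative to `baseline`."""
    return 100.0 * (baseline - new) / baseline


# Figures as stated in the abstract: YOLOv11n baseline vs. LDSNet.
param_cut = relative_reduction(2.6, 0.4)  # parameters, in millions
flops_cut = relative_reduction(6.5, 4.6)  # FLOPs, in G

print(f"parameters: {param_cut:.1f}% fewer, FLOPs: {flops_cut:.1f}% fewer")
```

Both results round to the stated 84.6% and 29.2%, so the percentages are consistent with the absolute numbers.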
Open Access Article
Spatial–Temporal EEG Imaging for Dual-Loop Neuro-Adaptive Simulation: Cognitive-State Decoding and Communication Gating in Critical Human–Machine Teams
by
Rubén Juárez, Antonio Hernández-Fernández, Claudia Barros Camargo and David Molero
J. Imaging 2026, 12(5), 208; https://doi.org/10.3390/jimaging12050208 - 12 May 2026
Abstract
Human performance in critical environments is frequently degraded by mistimed communication delivered during periods of visual–cognitive saturation. In such settings, failures arise not only from individual limitations but also from poor coordination between operators under rapidly changing workload conditions. We present a dual-loop neuro-adaptive simulation framework based on real-time spectral–topographic EEG representations, in which multichannel cortical activity is transformed into dynamic spatial maps and decoded to regulate both operator assistance and team communication. The system integrates 14-channel wireless EEG (Emotiv EPOC X, 256 Hz), gaze tracking, telemetry, and communication events through an LSL-based multimodal synchronization pipeline. A hybrid CNN–LSTM model processes sequences of spectral-topographic EEG maps to classify three operationally actionable neurocognitive states—Channelized Attention, Diverted Attention, and Surprise/Startle—while also estimating a continuous Cognitive Load Index (CLI). These representation-derived features are then used by a multi-agent proximal policy optimization (MAPPO) controller to generate two coordinated outputs: (i) adaptive haptic guidance for the pilot, designed to reduce reliance on overloaded visual and auditory channels, and (ii) a traffic-light communication gate for the telemetry engineer, regulating whether radio intervention should proceed, be delayed, or be withheld. In a high-fidelity dual-station simulation with 25 pilot–engineer pairs, the proposed framework was associated with a reduction of more than 30% in communication breakdown errors relative to open-loop telemetry, with the strongest effects observed during peak-load windows, while preserving realistic task progression. It also improved pilot reaction time to time-critical warnings and reduced engineer decision load under the tested conditions. 
These findings support the use of spectral-topographic EEG representations as a practical basis for combining multimodal neurophysiological sensing, spatiotemporal pattern decoding, and adaptive coordination in high-pressure human–machine teams. At the same time, the study should be interpreted as evidence of controlled feasibility in a simulated setting rather than as definitive proof of field-level generalization. We further discuss deployment constraints and propose privacy-by-design safeguards to ensure that neurocognitive signals are used exclusively for operational adaptation rather than employability assessment or performance scoring.
(This article belongs to the Section AI in Imaging)
Open Access Article
A Dual-Branch Deep Learning Framework with Explainability for Dental Caries Classification Using Intra-Oral Photographs and Radiographs
by
Lijuan Ren and Jinjing Chen
J. Imaging 2026, 12(5), 207; https://doi.org/10.3390/jimaging12050207 - 12 May 2026
Abstract
The accurate detection of dental caries is often hindered by modality-specific imaging challenges, such as illumination artifacts in intra-oral photographs and low lesion contrast in radiographs. This study proposes a comprehensive framework comprising three key components: (1) HybridAugment+, an entropy-guided adaptive augmentation strategy that applies stronger transformations to low-information images; (2) DBAttNet, a dual-branch attention network featuring illumination–reflection aware attention (IRAA) for photographs and contrast–frequency-aware attention (CFA) for radiographs; and (3) a CAM-based explainability method, selected through a systematic evaluation of five advanced techniques. This study utilized two datasets derived from public sources, comprising 639 intra-oral photographs (481 caries, 158 healthy) and 456 radiographs (268 caries, 188 healthy). These were annotated by two dentists, with established inter-rater reliability (κ = 0.82 for photographs, κ = 0.79 for radiographs). The experimental results demonstrate that HybridAugment+ improved performance over conventional augmentation by up to 8.72% on photographs and 7.67% on radiographs. Furthermore, DBAttNet achieved F1-scores of 97.90% on photographs and 95.72% on radiographs, outperforming ResNet50, InceptionV3, MSDNet, DCANet, and ARM-Net. A comparative evaluation identified XGrad-CAM as the most suitable explainability method, with optimal visualization thresholds of 30% for photographs and 20% for radiographs. Generalization experiments on ophthalmology (APTOS 2019, Messidor-2) and chest radiography datasets (Kermany CXR, NIH ChestX-ray14) demonstrated consistent performance gains over domain-specific methods (DT-Net, ConvNeXt-Tiny). These results confirm that the core design principles effectively transfer to other modalities facing analogous imaging challenges.
(This article belongs to the Special Issue Artificial Intelligence for Medical Imaging and Applications)
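The inter-rater reliability values in this abstract are reported as κ, presumably Cohen's kappa for the two annotating dentists. A minimal sketch of that statistic is shown below. The ten example labels are invented for illustration and are not data from the study, and `cohens_kappa` is a hypothetical helper name.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: both raters independently pick the same label.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Invented labels for ten images (not study data).
rater_a = ["caries", "caries", "healthy", "caries", "healthy",
           "healthy", "caries", "caries", "healthy", "caries"]
rater_b = ["caries", "caries", "healthy", "healthy", "healthy",
           "healthy", "caries", "caries", "caries", "caries"]
kappa = cohens_kappa(rater_a, rater_b)
```

In this toy case the raters agree on 8 of 10 images, yet kappa is only about 0.58, because roughly half of that agreement would be expected by chance.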
Open Access Article
Quantification of Costal Cartilage Calcification Using 18F-NaF-PET/CT
by
Vanessa Shehu, Om H. Gandhi, Patrick Glennan, Jaskeerat Gujral, Shashi B. Singh, Amir A. Amanullah, Shiv Patil, Khushi Gujral, William Y. Raynor, Peter Sang Uk Park, Eric M. Teichner, Robert C. Subtirelu, Talha Khan, Thomas J. Werner, Poul Flemming Høilund-Carlsen, Ali Gholamrezanezhad, Mona-Elisabeth Revheim and Abass Alavi
J. Imaging 2026, 12(5), 206; https://doi.org/10.3390/jimaging12050206 - 12 May 2026
Abstract
A quantification technique for costal cartilage calcification using 18F-sodium fluoride–positron emission tomography/computed tomography (18F-NaF-PET/CT) has yet to be established, and the effects of aging and other demographic variables on costal cartilage calcification remain understudied. This study aims to introduce a quantification methodology for assessing costal cartilage calcification using 18F-NaF-PET/CT, assess age-related changes in its 18F-NaF uptake in females and males, and examine the relationship between its 18F-NaF uptake and CT attenuation as well as 18F-NaF uptake and coronary artery calcification. In this retrospective study, we analyzed subjects from the Cardiovascular Molecular Calcification Assessed by 18F-NaF PET/CT (CAMONA) clinical trial. This study evaluated 130 subjects (mean age 48.7 ± 14.5 years; n = 67 females). We manually generated regions of interest overlying the costal cartilages from ribs 8 to 10 on the left side, carefully avoiding osseous uptake from adjacent ribs and sternum, to measure cartilaginous 18F-NaF uptake. Non-parametric statistical analyses (Spearman correlations, Mann–Whitney U tests, Kruskal–Wallis tests) and receiver operating characteristic analysis were performed to evaluate sex-specific age-related changes in uptake, correlations between imaging parameters, and associations with coronary artery calcium (CAC) score. In females, the mean 18F-NaF uptake (as assessed by average SUVmean) was 0.69 ± 0.38 while the corresponding mean Hounsfield Unit (HU) was 108.0 ± 40.0. In males, the mean 18F-NaF uptake (as assessed by average SUVmean) was 0.63 ± 0.22, and the mean HU was 104.0 ± 24.0. There was a significant correlation between 18F-NaF uptake and age in both females (p = 0.003, r = 0.36) and males (p < 0.0001, r = 0.63). The correlation was significantly stronger in males than females (Fisher’s z-test, p = 0.040). 
There was a significant correlation between CAC score and costal cartilage SUVmean in both females (r = 0.26, p = 0.036) and males (r = 0.51, p < 0.0001). This study introduces a quantification technique to assess costal cartilage calcification using 18F-NaF-PET/CT and demonstrates that the calcification increases with age, more strongly in males than in females, and 18F-NaF uptake is correlated with CAC score. This technique can be applied to other cartilages of interest, in both physiological and pathological conditions, to assess the effects of aging and various demographic variables on cartilage calcification.
(This article belongs to the Section Medical Imaging)
Open Access Article
Federated Learning with Differential Privacy for Ultrasound Breast Cancer Classification: An Empirical Study
by
Nursultan Makhanov, Beibit Abdikenov, Tomiris Zhaksylyk and Temirlan Karibekov
J. Imaging 2026, 12(5), 205; https://doi.org/10.3390/jimaging12050205 - 11 May 2026
Abstract
Breast cancer is a critical global health challenge, and deep learning shows transformative potential for medical image classification. However, privacy regulations such as HIPAA and GDPR create barriers to centralized data aggregation across institutions. This paper presents an empirical evaluation of federated learning (FL) for breast cancer classification in ultrasound images, systematically comparing seven deep learning architectures (ResNet-50, VGG16, VGG19, DenseNet-121, MobileNetV2, Vision Transformer, CoAtNet) across three FL algorithms (FedAvg, FedProx, FedOpt) with client-side differential privacy (DP). Using a simulated federation of eight institutions, we evaluate three clinically relevant classification scenarios. Federated models achieve performance comparable to centralized baselines—98.52% accuracy for normal/abnormal screening, 89.53% for three-class classification—with ViT-small and DenseNet-121 exceeding their centralized counterparts in several configurations. Under strong DP constraints (noise multiplier , yielding conservative privacy budget estimates of with ), screening accuracy remains above 82%, though diagnostic tasks incur substantial degradation (best 68.42%). Our findings provide empirical guidance on architecture selection, FL algorithm choice, and privacy-utility trade-offs for privacy-preserving breast cancer diagnosis, while identifying key challenges for clinical deployment.
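The abstract above mentions client-side differential privacy controlled by a noise multiplier but does not describe the mechanism. A common construction, assumed here for illustration only, clips each client update to a fixed L2 norm and then adds Gaussian noise whose standard deviation is the noise multiplier times the clipping norm. The function name, the clipping norm, and the example values below are hypothetical and are not taken from the paper.

```python
import math
import random


def privatize_update(update, clip_norm, noise_multiplier, rng=None):
    """Clip a client update to an L2 norm bound, then add Gaussian noise.

    This is a generic Gaussian-mechanism sketch, not the paper's implementation.
    """
    rng = rng or random.Random()
    norm = math.sqrt(sum(v * v for v in update))
    # Scale the update down only if it exceeds the clipping norm.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    # Noise standard deviation is proportional to the clipping norm.
    sigma = noise_multiplier * clip_norm
    return [v * scale + rng.gauss(0.0, sigma) for v in update]


# Hypothetical values: a two-parameter update with L2 norm 5.0.
raw_update = [3.0, 4.0]
clipped_only = privatize_update(raw_update, clip_norm=1.0, noise_multiplier=0.0)
noisy = privatize_update(raw_update, clip_norm=1.0, noise_multiplier=1.0,
                         rng=random.Random(0))
print(clipped_only, noisy)
```

With the noise multiplier set to zero the function reduces to pure clipping, which makes the two steps easy to check separately.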
(This article belongs to the Section Medical Imaging)
Open Access Article
Self-Supervised Text-Driven Point Cloud Upsampling via Semantic Text Guidance
by
Zhiyong Zhang, Meiling Qiu, Shuo Chen, Ruyu Liu, Jianhua Zhang and Shengyong Chen
J. Imaging 2026, 12(5), 204; https://doi.org/10.3390/jimaging12050204 - 11 May 2026
Abstract
Point cloud upsampling is a fundamental task in 3D vision, yet most existing methods adopt a global and uniform strategy, which is computationally inefficient and fails to address the need for region-specific refinement. To address this challenge, we propose PartSPUNet, a novel self-supervised, text-driven point cloud upsampling framework designed to enhance robotic perception through task-oriented local refinement. Inspired by the human cognitive process where high-level language instructions guide visual attention to specific regions of interest, our method allows an operator to use intuitive natural language prompts to direct the upsampling process. Specifically, PartSPUNet leverages a pretrained vision–language model to zero-shot localize the user-specified semantic part within a sparse point cloud. It then performs geometry-aware densification exclusively on this target region, recovering rich geometric details while preserving the global structure. Experimental results demonstrate that our approach significantly outperforms existing methods in reconstructing specified areas, offering a powerful and intuitive tool for enhancing the 3D perception pipeline in intelligent robotic systems.
(This article belongs to the Special Issue 3D Image Processing: Progress and Challenges)
Open Access Article
Characterization of RGB-Polarization Sensor-Based Cameras
by
Andreas Karge, Maximilian Klammer, Bernhard Eberhardt and Andreas Schilling
J. Imaging 2026, 12(5), 203; https://doi.org/10.3390/jimaging12050203 - 7 May 2026
Abstract
This work presents a characterization method for cameras with trichromatic RGB color filter array and polarization layer (RGB-P) sensor-based imaging devices. Such sensors enable the reconstruction of color and polarization of registered scene elements, which is an important requirement in computer vision. We will present spectral responsivity measurements, which reveal different sensitivities for various color and polarization channels. Furthermore, we will discuss and model an observed chromaticity shift in registered camera signals for polarized irradiance. Both lead to inaccurate estimation of color and polarization features. In order to overcome these issues, we will present a neural-network-based model for color and polarization feature reconstruction. Essentially, it considers spectral sensitivity for polarized irradiance. Furthermore, the model takes into account that, for visualization, the color signals have to be a linear combination of polarization channels. Models were trained for selected natural and synthetic reflectance sets, as well as commonly used lighting. We evaluated the resulting performance, which yielded robust results. The method can be employed for an estimation of color and polarization features for RGB-P imaging devices. Applications can be found in photography, as well as machine and computer vision, in which object surface color rendering plays a major role.
(This article belongs to the Section Color, Multi-spectral, and Hyperspectral Imaging)
Open Access Review
FFR-CT: Technical Advances and Implementation in Clinical Practice
by
Kamil Stankowski, Amedeo Pellizzon, Luca Signorelli, Andrea Baggiano, Nicola Cosentino, Alberico Del Torto, Fabio Fazzari, Daniele Junod, Maria Elisabetta Mancini, Riccardo Maragna, Manuela Muratori, Luigi Tassetti, Alessandra Volpe, Saima Mushtaq and Gianluca Pontone
J. Imaging 2026, 12(5), 202; https://doi.org/10.3390/jimaging12050202 - 5 May 2026
Abstract
Fractional flow reserve derived from coronary computed tomography angiography (FFR-CT) has emerged as a non-invasive modality for the functional assessment of coronary artery disease. By using computational fluid dynamics, particularly in its most extensively validated off-site implementation, FFR-CT enables lesion-specific estimation of pressure gradients across coronary stenoses without the need for invasive catheterization. This narrative review summarizes the technical foundations of FFR-CT as well as the evidence demonstrating that FFR-CT enhances the diagnostic accuracy of coronary CT angiography alone by improving specificity for hemodynamically significant stenoses when compared with invasive fractional flow reserve. Beyond diagnosis, FFR-CT provides incremental prognostic information, supporting risk stratification and guiding revascularization decisions. Suggestions for clinical implementation of FFR-CT and guidance on interpreting results within the appropriate clinical context are provided. Despite these advantages, limitations remain, including dependence on image quality, reduced performance in heavily calcified vessels, assumptions regarding hyperemic flow conditions, and limited validation in certain populations. While computational fluid dynamics-based FFR-CT remains the most commonly adopted approach in clinical settings, machine learning-based on-site FFR-CT is rapidly evolving and is expected to become a reliable alternative. As technical refinements continue, FFR-CT is poised to play an expanding role in precision-guided management of coronary artery disease.
(This article belongs to the Special Issue Advances and Challenges in Cardiovascular Imaging)
Open Access Article
Beyond Single Descriptors: Complementary Feature Learning for Image Matching
by
Xianguo Yu, Yulong Feng and Xi Li
J. Imaging 2026, 12(5), 201; https://doi.org/10.3390/jimaging12050201 - 5 May 2026
Abstract
Sparse local feature matching has served as the cornerstone of numerous visual geometry tasks and attracted extensive attention. Although significant progress has been made in this area, improving the discriminative power of descriptors remains a key challenge. As far as we know, existing sparse feature matching methods only predict a single descriptor map for keypoints, which might restrict their potential in solving complex scenarios. This issue is particularly pronounced in real-time applications where most methods only learn descriptor maps at a reduced spatial resolution compared to the input image. Consequently, they must interpolate from the low-resolution map to obtain per-keypoint descriptors, which introduces background contamination and reduces the discriminability of the final descriptors. To address these issues, we propose a novel and efficient complementary local feature description model. Specifically, the model simultaneously learns two descriptor maps using different loss functions within a single Convolutional Neural Network (CNN). An orthogonal loss is introduced to effectively coordinate the learning of the two branches, aiming to obtain decoupled and complementary descriptors. Extensive experiments across various visual geometry tasks, such as homography estimation, indoor and outdoor pose estimation, and visual localization, have demonstrated the superior performance of the proposed method.
(This article belongs to the Section Computer Vision and Pattern Recognition)
Open Access Article
A Scene-Adaptive Super-Resolution Framework for Video Compression
by
Qiyu Zha and Jiangling Guo
J. Imaging 2026, 12(5), 200; https://doi.org/10.3390/jimaging12050200 - 5 May 2026
Abstract
Video compression is central to large-scale video delivery, where better rate–distortion efficiency directly reduces bandwidth and storage cost. A practical way to improve efficiency is to encode a low-resolution video stream with a standard codec and restore high-resolution details with a learned super-resolution model at the decoder. However, prior SR-assisted compression methods usually update the reconstruction model at fixed temporal intervals, which can waste bitrate when those update boundaries do not match actual scene changes. In this paper, we present SASVC, a scene-adaptive super-resolution video compression framework for offline codec-augmented compression. SASVC detects scene changes using frame-wise grayscale differences, updates only compact adapter modules when a content transition is observed, and compresses the resulting model updates with chained differencing, quantization, and entropy coding. In this way, the method reduces unnecessary model-stream overhead while preserving scene-specific reconstruction fidelity. Experimental results on both long-form and short-form datasets show that SASVC consistently outperforms SRVC-style baselines and conventional codec-based alternatives under the Bjontegaard delta rate based on peak signal-to-noise ratio (BD-rate/PSNR) criterion. Complementary rate–distortion (RD) comparisons in terms of structural similarity index measure (SSIM) and Video Multi-Method Assessment Fusion (VMAF) show the same overall trend, indicating that the gain is not limited to a single distortion metric. Specifically, SASVC achieves BD-rate gains of and on Vimeo and Xiph, respectively, and further reaches and on UVG and MCL-JCV. The decoder also maintains real-time 1080p reconstruction at 125 frames per second (FPS) on an NVIDIA RTX 3080 Ti GPU, indicating that scene-aligned model updates can improve compression efficiency while keeping decoder-side deployment practical.
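The abstract above states that scene changes are detected from frame-wise grayscale differences. A minimal sketch of that idea follows, assuming the mean absolute difference between consecutive grayscale frames is compared against a fixed threshold. The function names, the threshold value, and the toy frames are hypothetical; the paper's actual criterion may differ.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute pixel difference between two equally sized grayscale frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


def detect_scene_changes(frames, threshold):
    """Return indices of frames that differ strongly from their predecessor."""
    return [
        i
        for i in range(1, len(frames))
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold
    ]


# Toy 2x2 grayscale frames: two dark, two bright, one dark again.
dark = [[10, 10], [10, 10]]
bright = [[200, 200], [200, 200]]
frames = [dark, dark, bright, bright, dark]

changes = detect_scene_changes(frames, threshold=30.0)
print(changes)  # → [2, 4]
```

In a pipeline of the kind described, each returned index would mark a point at which a model update is sent, instead of updating at fixed temporal intervals.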
(This article belongs to the Section Image and Video Processing)
Open Access Article
A Cost-Effective and Rapidly Manufacturable Infrared–Visible High-Contrast Calibration Board Based on Structural Parametrization
by
Yuandong Shao and Aleksandr S. Vasilev
J. Imaging 2026, 12(5), 199; https://doi.org/10.3390/jimaging12050199 - 2 May 2026
Abstract
The infrared (IR)–visible light (VIS) dual-camera system provides complementary cues for image fusion, but issues such as geometric mismatch caused by different imaging methods, inconsistent resolution/field-of-view, and installation offsets often lead to ghosting and artifacts. This study aims to develop a fast-deployable and repeatable calibration workflow based on a cost-effective calibration board. We designed an infrared-visible high-contrast checkerboard plate that can be generated through structural parameterization and efficiently manufactured using Python/OpenSCAD. We also established a corner-based registration pipeline that estimates a global homography to align the visible-light images onto the infrared pixel grid for fusion and quantitative evaluation. Experiments conducted in a controlled indoor environment demonstrated stable sub-pixel performance within a range of 1.5–2.5 m, with an average re-projection error of 0.47–0.50 pixels per frame and a 95th percentile lower than 0.51 pixels. The corner position re-projection error test further confirmed stability near image boundaries, with a median value of 0.53–0.63 pixels and a 95th percentile of 0.54–0.64 pixels. Overall, the proposed target design and workflow achieve practical infrared-visible calibration with repeatable accuracy under typical deployment constraints, providing geometrically consistent input for subsequent fusion and dataset construction.
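The re-projection error reported in the abstract above can be illustrated with a short sketch. Assuming a 3x3 homography that maps visible-light corner coordinates onto the infrared pixel grid, the error for each corner is the Euclidean distance between the mapped corner and the detected infrared corner. The matrix, the corner coordinates, and the nearest-rank percentile rule below are illustrative assumptions and do not come from the paper.

```python
import math


def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return u, v


def reprojection_errors(H, src_points, dst_points):
    """Euclidean distance between each mapped source point and its target."""
    errors = []
    for src, dst in zip(src_points, dst_points):
        u, v = apply_homography(H, src)
        errors.append(math.hypot(u - dst[0], v - dst[1]))
    return errors


def percentile_nearest_rank(values, q):
    """q-th percentile using the nearest-rank definition."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100.0 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical homography: a pure translation by (5, -2) pixels.
H = [[1.0, 0.0, 5.0], [0.0, 1.0, -2.0], [0.0, 0.0, 1.0]]
vis_corners = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
# Detected infrared corners, each offset by (0.3, 0.4) from the ideal mapping.
ir_corners = [(x + 5.3, y - 1.6) for x, y in vis_corners]

errors = reprojection_errors(H, vis_corners, ir_corners)
mean_error = sum(errors) / len(errors)
p95_error = percentile_nearest_rank(errors, 95)
print(mean_error, p95_error)
```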
(This article belongs to the Section Computer Vision and Pattern Recognition)
Open Access Article
DAER-YOLO: Defect-Aware and Edge-Reconstruction Enhanced YOLO for Surface Defect Detection of Varistors
by
Wu Xie, Shushuo Yao, Tao Zhang, Gaoxue Qiu, Dong Li, Fuxian Luo and Yong Fan
J. Imaging 2026, 12(5), 198; https://doi.org/10.3390/jimaging12050198 - 2 May 2026
Abstract
Varistors are critical overvoltage protection components in modern power electronic systems. They effectively absorb and dissipate surge energy to ensure the safe and stable operation of electrical equipment. However, surface defects can lead to substandard performance or even trigger equipment failure, compromising overall system stability. Therefore, high-precision surface defect detection is essential for quality assurance. To address these challenges, we propose a lightweight model termed Defect-Aware and Edge-Reconstruction Enhanced YOLO (DAER-YOLO) for efficient varistor inspection. First, we construct a C3k2-based defect-aware enhancement module (C3k2-iEMA). This module tackles the difficulty of extracting features from small or morphologically complex defects. By integrating multi-scale feature extraction, an attention mechanism, and efficient nonlinear mapping, it strengthens the perception of defect details. Second, to enhance the reconstruction capability for edge damage and small-object defects, we introduce the Efficient Up-Convolution Block (EUCB). This block improves multi-level feature fusion and generates clearer enhanced feature maps. Based on these improvements, DAER-YOLO outperforms the YOLOv11n baseline on a custom varistor dataset, with mAP@50 and mAP@50:95 increasing by 1.6% and 2.3%, respectively. Experimental results demonstrate that the model effectively improves detection accuracy while exhibiting significant potential for real-time industrial applications.
(This article belongs to the Section Computer Vision and Pattern Recognition)
News
9 October 2025
Meet Us at the 3rd International Conference on AI Sensors and Transducers, 2–7 August 2026, Jeju, South Korea
19 May 2026
Meet Us Virtually at the 1st International Online Conference on Tomography (IOCTG2026), 10–11 September 2026
Topics
Topic in
Diagnostics, Electronics, J. Imaging, Mathematics, Sensors
Transformer and Deep Learning Applications in Image Processing
Topic Editors: Fengping An, Haitao Xu, Chuyang Ye
Deadline: 31 May 2026
Topic in
AI, Applied Sciences, Electronics, J. Imaging, Sensors, IJGI
State-of-the-Art Object Detection, Tracking, and Recognition Techniques
Topic Editors: Mang Ye, Jingwen Ye, Cuiqun Chen
Deadline: 30 June 2026
Topic in
Applied Sciences, Information, Remote Sensing, Signals, Symmetry, J. Imaging
Image Processing, Signal Processing and Their Applications
Topic Editors: Jun Xu, Lianbo Ma
Deadline: 16 July 2026
Topic in
Applied Sciences, Bioengineering, Diagnostics, J. Imaging, Signals
Signal Analysis and Biomedical Imaging for Precision Medicine
Topic Editors: Surbhi Bhatia Khan, Mo Saraee
Deadline: 31 August 2026
Special Issues
Special Issue in
J. Imaging
Towards Deeper Understanding of Image and Video Processing and Analysis
Guest Editors: Zixiang Zhao, Haotong Qin, Tao Feng
Deadline: 31 May 2026
Special Issue in
J. Imaging
AI-Driven Image and Video Understanding
Guest Editors: Gongyang Li, Xiaofei Zhou, Yong Wu
Deadline: 31 May 2026
Special Issue in
J. Imaging
Image and Video Forensics: Progress and Challenges
Guest Editors: Daniele Baracchi, Simone Magistri
Deadline: 31 May 2026
Special Issue in
J. Imaging
Computer Vision and Deep Learning: Trends and Applications (3rd Edition)
Guest Editors: Pier Luigi Mazzeo, Alessandro Bruno
Deadline: 30 June 2026




