
J. Imaging, Volume 11, Issue 11 (November 2025) – 48 articles

Cover Story: The cover illustrates a multimodal AI workflow for MRI-based skull segmentation. High-resolution bone contours extracted from CT data are transferred onto corresponding MRI scans through precise registration, creating anatomically enriched training datasets. A deep neural network learns to segment cranial structures directly from routine MRI sequences, achieving high accuracy without radiation exposure. This framework redefines cranio-maxillofacial imaging by enabling radiation-free virtual surgical planning and patient-specific implant design.
17 pages, 3217 KB  
Article
Optimization of Neural Network Models of Computer Vision for Biometric Identification on Edge IoT Devices
by Bauyrzhan Belgibayev, Madina Mansurova, Ganibet Ablay, Talshyn Sarsembayeva and Zere Armankyzy
J. Imaging 2025, 11(11), 419; https://doi.org/10.3390/jimaging11110419 - 20 Nov 2025
Abstract
This research is dedicated to the development of an intelligent biometric system based on the synergy of Internet of Things (IoT) technologies and Artificial Intelligence (AI). The primary goal of this research is to explore the possibilities of personal identification using two distinct biometric traits: facial images and the venous pattern of the palm. These methods are treated as independent approaches, each relying on unique anatomical features of the human body. This study analyzes state-of-the-art methods in computer vision and neural network architectures and presents experimental results related to the extraction and comparison of biometric features. For each biometric modality, specific approaches to data collection, preprocessing, and analysis are proposed. We frame optimization in practical terms: selecting an edge-suitable backbone (ResNet-50) and employing metric learning (Triplet Loss) to improve convergence and generalization while adapting the stack for edge IoT deployment (Dockerized FastAPI with JWT). This clarifies that “optimization” in our title refers to model selection, loss design, and deployment efficiency on constrained devices. Additionally, the system’s architectural principles are described, including the design of the web interface and server infrastructure. The proposed solution demonstrates the potential of intelligent biometric technologies in applications such as automated access control systems, educational institutions, smart buildings, and other areas where high reliability and resistance to spoofing are essential. Full article
(This article belongs to the Special Issue Techniques and Applications in Face Image Analysis)
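The metric-learning objective named in the abstract (Triplet Loss) can be sketched as follows; the embeddings, margin value, and function name are illustrative and not taken from the paper:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet margin loss on L2-normalized embeddings.

    Pulls the anchor toward the positive (same identity) and pushes it
    away from the negative (different identity) by at least `margin`.
    """
    a, p, n = (v / np.linalg.norm(v) for v in (anchor, positive, negative))
    d_ap = np.sum((a - p) ** 2)   # squared distance to same-identity sample
    d_an = np.sum((a - n) ** 2)   # squared distance to impostor sample
    return max(d_ap - d_an + margin, 0.0)

# A well-separated triplet incurs no loss; a confusable one does.
emb = np.array([1.0, 0.0])
loss_easy = triplet_loss(emb, np.array([0.9, 0.1]), np.array([0.0, 1.0]))
loss_hard = triplet_loss(emb, np.array([0.0, 1.0]), np.array([0.9, 0.1]))
```

In training, such a loss would be applied over mini-batches of face or palm-vein embeddings produced by the ResNet-50 backbone.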

14 pages, 2751 KB  
Article
Deep Learning and Atlas-Based MRI Segmentation Enable Longitudinal Characterization of Healthy Mouse Brain
by Edoardo Micotti, Liviu Soltuzu, Elisa Bianchi, Sebastiano La Ferla, Lorenzo Carnevale and Gianluigi Forloni
J. Imaging 2025, 11(11), 418; https://doi.org/10.3390/jimaging11110418 - 19 Nov 2025
Abstract
We compared the results of brain magnetic resonance imaging (MRI) segmentation across a longitudinal dataset spanning mouse adulthood using an atlas-based approach and deep learning. Our results demonstrate that deep learning performs comparably to, and faster than, more established segmentation methods, even when computational resources are limited. Both methods enabled the large-scale analysis of a cohort of C57Bl6/J healthy mice, revealing sex-dependent morphological differences in the aging brain. These findings highlight the potential use of deep learning for high-throughput, longitudinal neuroimaging studies and underscore the importance of considering sex as a biological variable in preclinical brain research. Full article

13 pages, 2033 KB  
Article
Explainable Radiomics-Based Model for Automatic Image Quality Assessment in Breast Cancer DCE MRI Data
by Georgios S. Ioannidis, Katerina Nikiforaki, Aikaterini Dovrou, Vassilis Kilintzis, Grigorios Kalliatakis, Oliver Diaz, Karim Lekadir and Kostas Marias
J. Imaging 2025, 11(11), 417; https://doi.org/10.3390/jimaging11110417 - 19 Nov 2025
Abstract
This study aims to develop an explainable radiomics-based model for the automatic assessment of image quality in breast cancer Dynamic Contrast-Enhanced Magnetic Resonance Imaging (DCE-MRI) data. A cohort of 280 images obtained from a public database was annotated by two clinical experts, resulting in 110 high-quality and 110 low-quality images. The proposed methodology involved the extraction of 819 radiomic features and 2 No-Reference image quality metrics per patient, using both the whole image and the background as regions of interest. Feature extraction was performed under two scenarios: (i) from a sample of 12 slices per patient, and (ii) from the middle slice of each patient. Following model training, a range of machine learning classifiers were applied with explainability assessed through SHapley Additive Explanations (SHAP). The best performance was achieved in the second scenario, where combining features from the whole image and background with a support vector machine classifier yielded sensitivity, specificity, accuracy, and AUC values of 85.51%, 80.01%, 82.76%, and 89.37%, respectively. This proposed model demonstrates potential for integration into clinical practice and may also serve as a valuable resource for large-scale repositories and subgroup analyses aimed at ensuring fairness and explainability. Full article
(This article belongs to the Special Issue Celebrating the 10th Anniversary of the Journal of Imaging)
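The reported sensitivity, specificity, and accuracy figures follow directly from confusion-matrix counts; a minimal sketch, where the counts and the convention that high-quality images form the positive class are assumptions, not details from the paper:

```python
def quality_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts,
    treating high-quality images as the positive class (an assumption)."""
    sensitivity = tp / (tp + fn)            # recall on high-quality images
    specificity = tn / (tn + fp)            # recall on low-quality images
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts for a held-out test fold.
sens, spec, acc = quality_metrics(tp=9, fn=1, tn=8, fp=2)
```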

14 pages, 2365 KB  
Article
Seam Carving Forgery Detection Through Multi-Perspective Explainable AI
by Miguel José das Neves, Felipe Rodrigues Perche Mahlow, Renato Dias de Souza, Paulo Roberto G. Hernandes, Jr., José Remo Ferreira Brega and Kelton Augusto Pontara da Costa
J. Imaging 2025, 11(11), 416; https://doi.org/10.3390/jimaging11110416 - 18 Nov 2025
Abstract
This paper addresses the critical challenge of detecting content-aware image manipulations, specifically focusing on seam carving forgery. While deep learning models, particularly Convolutional Neural Networks (CNNs), have shown promise in this area, their black-box nature limits their trustworthiness in high-stakes domains like digital forensics. To address this gap, we propose and validate a framework for interpretable forgery detection, termed E-XAI (Ensemble Explainable AI). Conceptually inspired by Ensemble Learning, our framework’s novelty lies not in combining predictive models, but in integrating a multi-perspective ensemble of explainability techniques. Specifically, we combine SHAP for fine-grained, pixel-level feature attribution with Grad-CAM for region-level localization to create a more robust and holistic interpretation of a single, custom-trained CNN’s decisions. Our approach is validated on a purpose-built, balanced, binary-class dataset of 10,300 images. The results demonstrate high classification performance on an unseen test set, with a 95% accuracy and a 99% precision for the forged class. Furthermore, we analyze the model’s robustness against JPEG compression, a common real-world perturbation. More importantly, the application of the E-XAI framework reveals how the model identifies subtle forgery artifacts, providing transparent, visual evidence for its decisions. This work contributes a robust end-to-end pipeline for interpretable image forgery detection, enhancing the trust and reliability of AI systems in information security. Full article
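The abstract names SHAP and Grad-CAM as the two explanation views but not how they are combined; one plausible fusion, normalizing each attribution map and blending them, can be sketched as follows (the blending rule and `alpha` weight are assumptions):

```python
import numpy as np

def normalize(m):
    """Min-max normalize an attribution map to [0, 1]."""
    m = m.astype(float)
    rng = m.max() - m.min()
    return (m - m.min()) / rng if rng else np.zeros_like(m)

def fuse_explanations(shap_map, gradcam_map, alpha=0.5):
    # Weighted blend of a pixel-level (SHAP) and a region-level
    # (Grad-CAM) attribution map over the same image.
    return alpha * normalize(shap_map) + (1 - alpha) * normalize(gradcam_map)

# Toy 2x2 maps: both attributions peak at the same pixel.
shap_map = np.array([[0.0, 1.0], [0.0, 0.0]])
gradcam_map = np.array([[0.0, 2.0], [0.0, 0.0]])
fused = fuse_explanations(shap_map, gradcam_map)
```

Regions where both techniques agree retain high fused attribution, which is the intuition behind a multi-perspective ensemble of explanations.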

18 pages, 1489 KB  
Article
Few-Shot Adaptation of Foundation Vision Models for PCB Defect Inspection
by Sang-Jeong Lee
J. Imaging 2025, 11(11), 415; https://doi.org/10.3390/jimaging11110415 - 17 Nov 2025
Abstract
Automated Optical Inspection (AOI) of Printed Circuit Boards (PCBs) suffers from scarce labeled data and frequent domain shifts caused by variations in camera optics, illumination, and product design. These limitations hinder the development of accurate and reliable deep-learning models in manufacturing settings. To address this challenge, this study systematically benchmarks three Parameter-Efficient Fine-Tuning (PEFT) strategies—Linear Probe, Low-Rank Adaptation (LoRA), and Visual Prompt Tuning (VPT)—applied to two representative foundation vision models: the Contrastive Language–Image Pretraining Vision Transformer (CLIP-ViT-B/16) and the Self-Distillation with No Labels Vision Transformer (DINOv2-S/14). The models are evaluated on six-class PCB defect classification tasks under few-shot (k = 5, 10, 20) and full-data regimes, analyzing both performance and reliability. Experiments show that VPT achieves 0.99 ± 0.01 accuracy and 0.998 ± 0.001 macro–Area Under the Precision–Recall Curve (macro-AUPRC), reducing classification error by approximately 65% compared with Linear and LoRA while tuning fewer than 1.5% of backbone parameters. Reliability, assessed by the stability of precision–recall behavior across different decision thresholds, improved as the number of labeled samples increased. Furthermore, class-wise and few-shot analyses revealed that VPT adapts more effectively to rare defect types such as Spur and Spurious Copper while maintaining near-ceiling performance on simpler categories (Short, Pinhole). These findings collectively demonstrate that prompt-based adaptation offers a quantitatively favorable trade-off between accuracy, efficiency, and reliability. Practically, this positions VPT as a scalable strategy for factory-level AOI, enabling the rapid deployment of robust defect inspection models even when labeled data is scarce. Full article
(This article belongs to the Section AI in Imaging)
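The core idea of Visual Prompt Tuning, prepending a small set of trainable prompt tokens to a frozen backbone's token sequence, can be sketched shape-wise as follows; token counts and dimensions are illustrative (loosely ViT-S-like), not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
num_patches, dim, num_prompts = 196, 384, 10  # illustrative ViT-S-like shapes

patch_tokens = rng.normal(size=(num_patches, dim))   # frozen backbone activations
cls_token = rng.normal(size=(1, dim))
prompt_tokens = rng.normal(size=(num_prompts, dim))  # the only trainable tensor in VPT

# VPT-style input sequence: [CLS; prompts; patches]. The backbone weights
# stay frozen; only `prompt_tokens` would receive gradients.
sequence = np.concatenate([cls_token, prompt_tokens, patch_tokens], axis=0)

trainable = prompt_tokens.size
total = sequence.size  # rough proxy for activations touched per forward pass
```

Even this toy example shows why VPT is parameter-efficient: the trainable prompt tensor is a small fraction of the sequence it steers.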

24 pages, 4018 KB  
Article
Toward Smarter Orthopedic Care: Classifying Plantar Footprints from RGB Images Using Vision Transformers and CNNs
by Lidia Yolanda Ramírez-Rios, Jesús Everardo Olguín-Tiznado, Edgar Rene Ramos-Acosta, Everardo Inzunza-Gonzalez, Julio César Cano-Gutiérrez, Enrique Efrén García-Guerrero and Claudia Camargo-Wilson
J. Imaging 2025, 11(11), 414; https://doi.org/10.3390/jimaging11110414 - 16 Nov 2025
Abstract
The anatomical structure of the foot can be assessed by examining the plantar footprint for orthopedic intervention. There is a relationship between specific foot types and multiple musculoskeletal disorders, which are among the main ailments affecting the lower extremities; accurate classification of the footprint is therefore essential for early diagnosis. This work aims to develop a method for accurately classifying the plantar footprint and hindfoot, specifically concerning the sagittal plane. A custom image dataset was created, comprising 603 RGB plantar images that were modified and augmented. Six state-of-the-art models were trained and evaluated: swin_tiny_patch4_window7_224, convnextv2_tiny, deit3_base_patch16_224, xception41, inception-v4, and efficientnet_b0. Among them, the swin_tiny_patch4_window7_224 model achieved 98.013% accuracy, demonstrating its potential as a reliable and low-cost tool for clinical screening and diagnosis of foot-related conditions. Full article
(This article belongs to the Section Medical Imaging)

23 pages, 19620 KB  
Article
Sentinel-2-Based Forest Health Survey of ICP Forests Level I and II Plots in Hungary
by Tamás Molnár, Bence Bolla, Orsolya Szabó and András Koltay
J. Imaging 2025, 11(11), 413; https://doi.org/10.3390/jimaging11110413 - 14 Nov 2025
Abstract
Forest damage has been increasingly recorded over the past decade in both Europe and Hungary, primarily due to prolonged droughts, causing a decline in forest health. Within the ICP Forests framework, forest damage has been monitored for decades; however, ground-based surveying is labour-intensive and time-consuming. Satellite-based remote sensing offers a rapid and efficient method for assessing large-scale damage events, complementing the ground-based ICP Forests datasets. This study utilised cloud computing and Sentinel-2 satellite imagery to monitor forest health and detect anomalies. Standardised NDVI (Z NDVI) maps were produced for the period from 2017 to 2023 to identify disturbances in the forest. The research focused on seven active ICP Forests Level II and 78 Level I plots in Hungary. Z NDVI values were divided into five categories based on damage severity, and there was agreement between Level II field data and satellite imagery. In 2017, severe damage was caused by late frost and wind; however, the forests recovered by 2018. Another decline was observed in 2021 due to wind and in 2022 due to drought. Data from the ICP Forests Level I plots, which represent forest condition in Hungary, indicated that 80% of the monitored stands were damaged, with 30% suffering moderate damage and 15% experiencing severe damage. Z NDVI classifications aligned with the field data, showing widespread forest damage across the country. Full article
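The standardized NDVI (Z NDVI) used here is a per-pixel z-score of the current value against the multi-year baseline; a minimal sketch, with hypothetical severity thresholds, since the paper's exact class boundaries are not given in the abstract:

```python
import numpy as np

def z_ndvi(ndvi_current, ndvi_history):
    """Standardized NDVI: deviation of the current value from the
    multi-year mean, in units of the multi-year standard deviation."""
    mu, sigma = np.mean(ndvi_history), np.std(ndvi_history)
    return (ndvi_current - mu) / sigma

def damage_class(z):
    # Hypothetical five-class severity scale; illustrative thresholds only.
    if z < -2.0: return "severe damage"
    if z < -1.5: return "moderate damage"
    if z < -1.0: return "slight damage"
    if z < 1.0:  return "no change"
    return "improvement"

history = [0.70, 0.72, 0.68, 0.71, 0.69]  # multi-year NDVI for one pixel
z = z_ndvi(0.50, history)                 # a sharp drop in the current year
label = damage_class(z)
```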

15 pages, 3988 KB  
Article
Boundary-Guided Differential Attention: Enhancing Camouflaged Object Detection Accuracy
by Hongliang Zhang, Bolin Xu and Sanxin Jiang
J. Imaging 2025, 11(11), 412; https://doi.org/10.3390/jimaging11110412 - 14 Nov 2025
Abstract
Camouflaged Object Detection (COD) is a challenging computer vision task aimed at accurately identifying and segmenting objects seamlessly blended into their backgrounds. This task has broad applications across medical image segmentation, defect detection, agricultural image detection, security monitoring, and scientific research. Traditional COD methods often struggle with precise segmentation due to the high similarity between camouflaged objects and their surroundings. In this study, we introduce a Boundary-Guided Differential Attention Network (BDA-Net) to address these challenges. BDA-Net first extracts boundary features by fusing multi-scale image features and applying channel attention. Subsequently, it employs a differential attention mechanism, guided by these boundary features, to highlight camouflaged objects and suppress background information. The weighted features are then progressively fused to generate accurate camouflage object masks. Experimental results on the COD10K, NC4K, and CAMO datasets demonstrate that BDA-Net outperforms most state-of-the-art COD methods, achieving higher accuracy. Here we show that our approach improves detection accuracy by up to 3.6% on key metrics, offering a robust solution for precise camouflaged object segmentation. Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)

21 pages, 1479 KB  
Article
Neural Radiance Fields-Driven Exploration of Visual Communication and Spatial Interaction Design for Immersive Digital Installations
by Wanshu Li and Yuanhui Hu
J. Imaging 2025, 11(11), 411; https://doi.org/10.3390/jimaging11110411 - 13 Nov 2025
Abstract
In immersive digital devices, high environmental complexity can lead to rendering delays and loss of interactive details, resulting in a fragmented experience. This paper proposes a lightweight NeRF (Neural Radiance Fields) modeling and multimodal perception fusion method. First, a sparse hash code is constructed based on Instant-NGP (Instant Neural Graphics Primitives) to accelerate scene radiance field generation. Second, parameter distillation and channel pruning are used to reduce the model’s size and reduce computational overheads. Next, multimodal data from a depth camera and an IMU (Inertial Measurement Unit) is fused, and Kalman filtering is used to improve pose tracking accuracy. Finally, the optimized NeRF model is integrated into the Unity engine, utilizing custom shaders and asynchronous rendering to achieve low-latency viewpoint responsiveness. Experiments show that the file size of this method in high-complexity scenes is only 79.5 MB ± 5.3 MB, and the first loading time is only 2.9 s ± 0.4 s, effectively reducing rendering latency. The SSIM is 0.951 ± 0.016 at 1.5 m/s, and the GME is 7.68 ± 0.15 at 1.5 m/s. It can stably restore texture details and edge sharpness under dynamic viewing angles. In scenarios that support 3–5 people interacting simultaneously, the average interaction response delay is only 16.3 ms, and the average jitter error is controlled at 0.12°, significantly improving spatial interaction performance. In conclusion, this study provides effective technical solutions for high-quality immersive interaction in complex public scenarios. Future work will explore the framework’s adaptability in larger-scale dynamic environments and further optimize the network synchronization mechanism for multi-user concurrency. Full article
(This article belongs to the Section Image and Video Processing)
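The Kalman-filter step that fuses depth-camera and IMU pose estimates reduces, in the scalar case, to a gain-weighted blend of prediction and measurement; a minimal sketch with illustrative variances (the paper's actual filter operates on full pose states):

```python
def kalman_update(x, p, z, r):
    """One scalar Kalman measurement update: predicted state x with
    variance p, corrected by measurement z with noise variance r."""
    k = p / (p + r)          # Kalman gain: trust measurement more when p >> r
    x_new = x + k * (z - x)  # blend prediction and measurement
    p_new = (1 - k) * p      # posterior variance shrinks after the update
    return x_new, p_new

# Fuse a drifting IMU yaw prediction with a depth-camera observation.
x, p = 10.0, 4.0   # predicted yaw (deg) and its variance
z, r = 12.0, 1.0   # camera measurement and its (smaller) variance
x, p = kalman_update(x, p, z, r)
```

Because the camera variance is smaller, the update lands closer to the measurement, which is how the fusion suppresses IMU drift.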

17 pages, 1121 KB  
Article
TASA: Text-Anchored State–Space Alignment for Long-Tailed Image Classification
by Long Li, Tinglei Jia, Huaizhi Yue, Huize Cheng, Yongfeng Bu and Zhaoyang Zhang
J. Imaging 2025, 11(11), 410; https://doi.org/10.3390/jimaging11110410 - 13 Nov 2025
Abstract
Long-tailed image classification remains challenging for vision–language models. Head classes dominate training while tail classes are underrepresented and noisy, and short prompts with weak text supervision further amplify head bias. This paper presents TASA, an end-to-end framework that stabilizes textual supervision and enhances cross-modal fusion. A Semantic Distribution Modulation (SDM) module constructs class-specific text prototypes by cosine-weighted fusion of multiple LLM-generated descriptions with a canonical template, providing stable and diverse semantic anchors without training text parameters. Dual-Space Cross-Modal Fusion (DCF) module incorporates selective-scan state–space blocks into both image and text branches, enabling bidirectional conditioning and efficient feature fusion through a lightweight multilayer perceptron. Together with a margin-aware alignment loss, TASA aligns images with class prototypes for classification without requiring paired image–text data or per-class prompt tuning. Experiments on CIFAR-10/100-LT, ImageNet-LT, and Places-LT demonstrate consistent improvements across many-, medium-, and few-shot groups. Ablation studies confirm that DCF yields the largest single-module gain, while SDM and DCF combined provide the most robust and balanced performance. These results highlight the effectiveness of integrating text-driven prototypes with state–space fusion for long-tailed classification. Full article
(This article belongs to the Section Image and Video Processing)
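One plausible reading of SDM's cosine-weighted fusion, weighting each LLM-generated description embedding by its similarity to the canonical template before combining, can be sketched as follows; the exact formula is an assumption, not taken from the paper:

```python
import numpy as np

def fuse_prototypes(template_emb, description_embs):
    """Build a class text prototype: weight each description embedding by
    its cosine similarity to the canonical template, sum them onto the
    template, and re-normalize. Negative-similarity descriptions are
    ignored (an assumed design choice)."""
    t = template_emb / np.linalg.norm(template_emb)
    fused = t.copy()
    for d in description_embs:
        d = d / np.linalg.norm(d)
        w = max(float(t @ d), 0.0)  # cosine weight, clipped at zero
        fused = fused + w * d
    return fused / np.linalg.norm(fused)

# Degenerate check: fusing the template with itself keeps its direction.
template = np.array([1.0, 0.0])
proto = fuse_prototypes(template, [np.array([1.0, 0.0])])
```

No text parameters are trained here, matching the abstract's claim that the semantic anchors are constructed rather than learned.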

24 pages, 248126 KB  
Article
Image Matching for UAV Geolocation: Classical and Deep Learning Approaches
by Fatih Baykal, Mehmet İrfan Gedik, Constantino Carlos Reyes-Aldasoro and Cefa Karabağ
J. Imaging 2025, 11(11), 409; https://doi.org/10.3390/jimaging11110409 - 12 Nov 2025
Abstract
Today, unmanned aerial vehicles (UAVs) are heavily dependent on Global Navigation Satellite Systems (GNSSs) for positioning and navigation. However, GNSS signals are vulnerable to jamming and spoofing attacks, posing serious security risks, especially for military operations and critical civilian missions. To solve this problem, an image-based geolocation system has been developed that eliminates GNSS dependency. The proposed system estimates the geographical location of the UAV by matching aerial images taken by the UAV against previously georeferenced high-resolution satellite images. Common visual features were identified between satellite and UAV images, and matching was carried out using methods based on the homography matrix. Through image matching, the UAV's field of view is related to geographical coordinates, ensuring reliable positioning even when GNSS signals cannot be used. Within the scope of the study, traditional methods such as SIFT, AKAZE, and Multiple Template Matching were compared with learning-based methods including SuperPoint, SuperGlue, and LoFTR. The results showed that deep learning-based approaches can make successful matches, especially at high altitudes. Full article
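Once a homography between the UAV frame and the georeferenced satellite image has been estimated (in practice from SIFT or SuperGlue correspondences), localizing a pixel is a single projective transform; a minimal sketch using an illustrative translation-only homography:

```python
import numpy as np

def pixel_to_geo(H, px, py):
    """Map a UAV-image pixel through homography H into the coordinate
    frame of a georeferenced satellite image (homogeneous coordinates)."""
    v = H @ np.array([px, py, 1.0])
    return v[0] / v[2], v[1] / v[2]

# Toy homography: a pure translation of 100 px along x. A real H would be
# estimated from feature matches, e.g. with RANSAC.
H = np.array([[1.0, 0.0, 100.0],
              [0.0, 1.0,   0.0],
              [0.0, 0.0,   1.0]])
gx, gy = pixel_to_geo(H, 20.0, 30.0)
```

Since the satellite image is georeferenced, the resulting coordinates convert directly to latitude and longitude, replacing the GNSS fix.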

16 pages, 2880 KB  
Article
Wafer Defect Detection Technology Based on CTM-IYOLOv10 Network
by Pengcheng Ji, Zhenzhi He, Weiwei Yang, Jiawei Du, Guo Ye and Xiangning Lu
J. Imaging 2025, 11(11), 408; https://doi.org/10.3390/jimaging11110408 - 12 Nov 2025
Abstract
The continuous scaling of semiconductor devices has increased the density and complexity of wafer dies, making precise and efficient defect detection a critical task for intelligent manufacturing. Traditional manual or semi-automated inspection approaches are often inefficient, error-prone, and susceptible to missed or false detections, particularly for small or irregular defects. This study presents a wafer defect detection framework that integrates clustering–template matching (CTM) with an improved YOLOv10 network (CTM-IYOLOv10). The CTM strategy enhances die segmentation efficiency and mitigates redundant matching in multi-die fields of view, while the introduction of a modified GhostConv module and an enhanced BiFPN structure strengthens feature representation, reduces computational redundancy, and improves small-object detection. Furthermore, data augmentation strategies are employed to improve robustness and generalization. Experimental evaluations demonstrate that CTM-IYOLOv10 achieves a detection accuracy of 98.1%, reduces inference time by 23.2%, and compresses model size by 52.3% compared with baseline YOLOv10, and consistently outperforms representative detectors such as YOLOv5 and YOLOv8. These results highlight both the methodological contributions of the proposed architecture and its practical significance for real-time wafer defect inspection in semiconductor manufacturing. Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)

15 pages, 2484 KB  
Article
Fully Automated AI-Based Digital Workflow for Mirroring of Healthy and Defective Craniofacial Models
by Michel Beyer, Julian Grossi, Alexandru Burde, Sead Abazi, Lukas Seifert, Joachim Polligkeit, Neha Umakant Chodankar and Florian M. Thieringer
J. Imaging 2025, 11(11), 407; https://doi.org/10.3390/jimaging11110407 - 12 Nov 2025
Abstract
The accurate reconstruction of craniofacial defects requires the precise segmentation and mirroring of healthy anatomy. Conventional workflows rely on manual interaction, making them time-consuming and subject to operator variability. This study developed and validated a fully automated digital pipeline that integrates deep learning–based segmentation with algorithmic mirroring for craniofacial reconstruction. A total of 388 cranial CT scans were used to train a three-dimensional nnU-Net model for skull and mandible segmentation. A Principal Component Analysis–Iterative Closest Point (PCA–ICP) algorithm was then applied to compute the sagittal symmetry plane and perform mirroring. Automated results were compared with expert-generated segmentations and manually defined symmetry planes using Dice Similarity Coefficient (DSC), Mean Surface Distance (MSD), Hausdorff Distance (HD), and angular deviation. The nnU-Net achieved high segmentation accuracy for both the mandible (mean DSC 0.956) and the skull (mean DSC 0.965). Mirroring results showed minimal angular deviation from expert reference planes (mandible: 1.32° ± 0.71° in defect cases, 1.58° ± 1.12° in intact cases; skull: 1.75° ± 0.84° in defect cases, 1.15° ± 0.81° in intact cases). The presence of defects did not significantly affect accuracy. This automated workflow demonstrated robust performance and clinical applicability, offering standardized, reproducible, and time-efficient planning for craniofacial reconstruction. Full article
(This article belongs to the Section AI in Imaging)
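After the PCA-ICP fit yields the sagittal symmetry plane, mirroring the segmented anatomy is a point-wise reflection across that plane; a minimal sketch in which the plane parameters and point cloud are illustrative:

```python
import numpy as np

def mirror_across_plane(points, plane_point, plane_normal):
    """Reflect an (N, 3) point cloud across the plane defined by a point
    on the plane and its normal -- the mirroring step applied once the
    symmetry plane has been computed."""
    n = plane_normal / np.linalg.norm(plane_normal)
    d = (points - plane_point) @ n        # signed distance of each point
    return points - 2.0 * d[:, None] * n  # reflect each point through the plane

# Toy cloud mirrored across the x = 0 plane.
pts = np.array([[1.0, 2.0, 3.0], [-1.0, 0.0, 0.0]])
mirrored = mirror_across_plane(pts, np.zeros(3), np.array([1.0, 0.0, 0.0]))
```

Applied to the healthy hemiskull, this reflection produces the template used to plan reconstruction of the defective side.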

11 pages, 914 KB  
Communication
High-Resolution Peripheral Quantitative Computed Tomography (HR-pQCT) for Assessment of Avascular Necrosis of the Lunate
by Esin Rothenfluh, Georg F. Erbach, Léna G. Dietrich, Laura De Pellegrin, Daniela A. Frauchiger and Rainer J. Egli
J. Imaging 2025, 11(11), 406; https://doi.org/10.3390/jimaging11110406 - 12 Nov 2025
Abstract
This exploratory study investigates the feasibility and diagnostic value of high-resolution peripheral quantitative computed tomography (HR-pQCT) in detecting structural and microarchitectural changes in lunate avascular necrosis (AVN), or Kienböck’s disease. Five adult patients with unilateral AVN underwent either MRI or CT, alongside HR-pQCT of both wrists. Imaging features such as subchondral remodeling, joint space narrowing, and bone fragmentation were assessed across modalities. HR-pQCT detected at least one additional pathological feature not seen on MRI or CT in four of five patients and revealed early subchondral changes in two contralateral asymptomatic wrists. Quantitative measurements of bone volume fraction (BV/TV) further indicated altered trabecular structure correlating with disease stage. These findings suggest that HR-pQCT may offer enhanced sensitivity for early-stage AVN and better delineation of disease extent, which is critical for informed surgical planning. While limited by small sample size, this study provides preliminary evidence supporting HR-pQCT as a complementary imaging tool in the assessment of lunate AVN, with potential to improve early detection, staging accuracy, and individualized treatment strategies. Full article
(This article belongs to the Section Medical Imaging)
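The bone volume fraction (BV/TV) reported here is, at its simplest, the fraction of region-of-interest voxels classified as bone; a minimal sketch with a toy volume and an assumed intensity threshold (real HR-pQCT analysis uses calibrated density thresholds):

```python
import numpy as np

def bone_volume_fraction(volume, threshold):
    """BV/TV: fraction of voxels at or above the bone threshold within
    the region of interest (threshold units are scanner-dependent)."""
    bone = volume >= threshold
    return bone.sum() / volume.size

# Toy 2x2x2 volume: four of eight voxels exceed the assumed threshold.
vol = np.array([[[100, 400], [500, 50]],
                [[450, 30], [20, 600]]])
bvtv = bone_volume_fraction(vol, threshold=300)
```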

27 pages, 16752 KB  
Article
Unified-Removal: A Semi-Supervised Framework for Simultaneously Addressing Multiple Degradations in Real-World Images
by Yongheng Zhang
J. Imaging 2025, 11(11), 405; https://doi.org/10.3390/jimaging11110405 - 11 Nov 2025
Abstract
This work introduces Uni-Removal, an innovative two-stage framework that effectively addresses the critical challenge of domain adaptation in unified image restoration. Contemporary approaches often face significant performance degradation when transitioning from synthetic training environments to complex real-world scenarios due to the substantial domain discrepancy. Our proposed solution establishes a comprehensive pipeline that systematically bridges this gap through dual-phase representation learning. In the first stage, we implement a structured multi-teacher knowledge distillation mechanism that enables a unified student architecture to assimilate and integrate specialized expertise from multiple pre-trained degradation-specific networks. This knowledge transfer is rigorously regularized by our novel Instance-Grained Contrastive Learning (IGCL) objective, which explicitly enforces representation consistency across both feature hierarchies and image spaces. The second stage introduces a groundbreaking output distribution calibration methodology that employs Cluster-Grained Contrastive Learning (CGCL) to adversarially align the restored outputs with authentic real-world image characteristics, effectively embedding the student model within the natural image manifold without requiring paired supervision. Comprehensive experimental validation demonstrates Uni-Removal’s superior performance across multiple real-world degradation tasks including dehazing, deraining, and deblurring, where it consistently surpasses existing state-of-the-art methods. The framework’s exceptional generalization capability is further evidenced by its competitive denoising performance on the SIDD benchmark and, more significantly, by delivering a substantial 4.36 mAP improvement in downstream object detection tasks, unequivocally establishing its practical utility as a robust pre-processing component for advanced computer vision systems. Full article

22 pages, 683 KB  
Article
LatAtk: A Medical Image Attack Method Focused on Lesion Areas with High Transferability
by Long Li, Yibo Huang, Chong Li, Fei Zhou, Jingjing Li and Kamarul Hawari Ghazali
J. Imaging 2025, 11(11), 404; https://doi.org/10.3390/jimaging11110404 - 11 Nov 2025
Viewed by 284
Abstract
The rise of trusted machine learning has prompted concerns about the security, reliability, and controllability of deep learning, especially when it is applied to sensitive areas involving life and health safety. To thoroughly analyze potential attacks and promote innovation in security technologies for DNNs, this paper studies adversarial attacks against medical images and proposes a medical image attack method that focuses on lesion areas and has good transferability, named LatAtk. First, based on an image segmentation algorithm, LatAtk divides the target image into an attackable area (the lesion area) and a non-attackable area and injects perturbations into the attackable area to disrupt the attention of DNNs. Second, a class activation loss function based on gradient-weighted class activation mapping is proposed. By obtaining the importance of features in images, the features that play a positive role in model decision-making are further disturbed, making LatAtk highly transferable. Third, a texture feature loss function based on local binary patterns is proposed as a constraint to reduce damage to non-semantic features, effectively preserving the texture features of target images and improving the concealment of adversarial samples. Experimental results show that LatAtk achieves superior aggressiveness, transferability, and concealment compared to advanced baselines. Full article
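The local binary pattern descriptor that LatAtk's texture loss constrains compares each pixel with its eight neighbors to form a bit code; a minimal sketch (the paper's exact LBP variant and loss are not reproduced here):

```python
import numpy as np

def lbp_8(img):
    """Minimal 8-neighbor local binary pattern: for each interior pixel,
    set bit k when neighbor k is >= the center. Returns one uint8 code
    per interior pixel."""
    img = np.asarray(img, float)
    c = img[1:-1, 1:-1]  # interior pixels (centers)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.uint8) << bit
    return code
```

On a constant image every neighbor ties with the center, so every bit is set and all codes are 255; a texture-preservation loss would compare such codes before and after perturbation.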
(This article belongs to the Section Medical Imaging)

21 pages, 4155 KB  
Article
Integrating Deep Learning and Radiogenomics: A Novel Approach to Glioblastoma Segmentation and MGMT Methylation Prediction
by Nabil M. Abdelaziz, Emad Abdel-Aziz Dawood and Alshaimaa A. Tantawy
J. Imaging 2025, 11(11), 403; https://doi.org/10.3390/jimaging11110403 - 11 Nov 2025
Viewed by 605
Abstract
Radiogenomics, which integrates imaging phenotypes with genomic profiles, enhances diagnosis, prognosis, and treatment planning for glioblastomas. This study specifically establishes a correlation between radiomic features and MGMT promoter methylation status, advancing towards a non-invasive, integrated diagnostic paradigm. Conventional genetic analysis requires invasive biopsies, which cause delays in obtaining results and necessitate further surgeries. Our methodology is twofold: First, an enhanced U-Net model segments brain tumor regions with high precision (Dice coefficient: 0.889). Second, a hybrid classifier, leveraging the complementary features of EfficientNetB0 and ResNet50, predicts MGMT promoter methylation status from the segmented volumes. The proposed framework demonstrated superior performance in predicting MGMT promoter methylation status in glioblastoma patients compared to conventional methods, achieving a classification accuracy of 95% and an AUC of 0.96. These results underscore the model’s potential to enhance patient stratification and guide treatment selection. The accurate prediction of MGMT promoter methylation status via non-invasive imaging provides a reliable criterion for anticipating patient responsiveness to alkylating chemotherapy. This capability equips clinicians with a tool to inform personalized treatment strategies, optimizing therapeutic efficacy from the outset. Full article
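The Dice coefficient reported for the segmentation stage (0.889) is twice the mask overlap divided by the total mask size; a minimal sketch:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice overlap between two binary masks; eps guards against the
    degenerate case where both masks are empty."""
    pred = np.asarray(pred, bool)
    target = np.asarray(target, bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
```

Identical masks score 1.0 and disjoint masks score near 0, which is why Dice is the standard headline metric for tumor segmentation.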
(This article belongs to the Topic Intelligent Image Processing Technology)

22 pages, 1770 KB  
Article
Key-Frame-Aware Hierarchical Learning for Robust Gait Recognition
by Ke Wang and Hua Huo
J. Imaging 2025, 11(11), 402; https://doi.org/10.3390/jimaging11110402 - 10 Nov 2025
Viewed by 415
Abstract
Gait recognition in unconstrained environments is severely hampered by variations in view, clothing, and carrying conditions. To address this, we introduce HierarchGait, a key-frame-aware hierarchical learning framework. Our approach uniquely integrates three complementary modules: a TemplateBlock-based Motion Extraction (TBME) for coarse-to-fine anatomical feature learning, a Sequence-Level Spatio-temporal Feature Aggregator (SSFA) to identify and prioritize discriminative key-frames, and a Frame-level Feature Re-segmentation Extractor (FFRE) to capture fine-grained motion details. This synergistic design yields a robust and comprehensive gait representation. We demonstrate the superiority of our method through extensive experiments. On the highly challenging CASIA-B dataset, HierarchGait achieves new state-of-the-art average Rank-1 accuracies of 98.1% under Normal (NM), 95.9% under Bag (BG), and 87.5% under Coat (CL) conditions. Furthermore, on the large-scale OU-MVLP dataset, our model attains a 91.5% average accuracy. These results validate the significant advantage of explicitly modeling anatomical hierarchies and temporal key-moments for robust gait recognition. Full article
(This article belongs to the Section Biometrics, Forensics, and Security)

11 pages, 1793 KB  
Article
Knee Cartilage Quantification: Performance of Low-Field MR in Detecting Low Grades of Chondropathy
by Francesco Pucciarelli, Antonio Marino, Maria Carla Faugno, Giuseppe Argento, Edoardo Monaco, Andrea Redler, Nicola Maffulli, Pierfrancesco Orlandi, Marta Zerunian, Domenico De Santis, Michela Polici, Damiano Caruso, Marco Francone and Andrea Laghi
J. Imaging 2025, 11(11), 401; https://doi.org/10.3390/jimaging11110401 - 8 Nov 2025
Viewed by 426
Abstract
This study aimed to evaluate the diagnostic accuracy of T2 mapping on low-field (0.31 T) MRI for detecting low-grade knee chondropathy, using arthroscopy as the reference standard. Fifty-two patients (mean age 48.1 ± 17.2 years) undergoing arthroscopy for anterior cruciate ligament or meniscal tears were prospectively enrolled, excluding those with previous surgery, infection, or high-grade chondropathy (Outerbridge III–IV). MRI was performed with a 0.31 T scanner using a 3D SHARC sequence, and T2 relaxometric maps were generated for 14 cartilage regions per knee according to the WORMS classification. Arthroscopy, performed within one month by two blinded surgeons, served as the gold standard. A total of 728 regions were analyzed. T2 mapping differentiated healthy cartilage (grade 0) from early chondropathy (grades I–II) with an optimal cut-off of 45 ms and moderate discriminative accuracy (AUC = 0.714 for Reader 1 and 0.709 for Reader 2). Agreement with arthroscopy was good (κ = 0.731), with excellent intra-reader (ICC = 0.998) and good inter-reader reproducibility (ICC = 0.753). Most degenerative changes were located at the femoral condyles (59%). Low-field T2 mapping showed good diagnostic performance and reproducibility in detecting early cartilage degeneration, supporting its potential as a cost-effective and accessible quantitative biomarker for the assessment of cartilage integrity in clinical practice. Full article
(This article belongs to the Section Medical Imaging)

23 pages, 2512 KB  
Article
Benchmarking Compact VLMs for Clip-Level Surveillance Anomaly Detection Under Weak Supervision
by Kirill Borodin, Kirill Kondrashov, Nikita Vasiliev, Ksenia Gladkova, Inna Larina, Mikhail Gorodnichev and Grach Mkrtchian
J. Imaging 2025, 11(11), 400; https://doi.org/10.3390/jimaging11110400 - 8 Nov 2025
Viewed by 807
Abstract
CCTV safety monitoring demands that anomaly detectors combine reliable clip-level accuracy with predictable per-clip latency despite weak supervision. This work investigates compact vision–language models (VLMs) as practical detectors for this regime. A unified evaluation protocol standardizes preprocessing, prompting, dataset splits, metrics, and runtime settings to compare parameter-efficiently adapted compact VLMs against training-free VLM pipelines and weakly supervised baselines. Evaluation spans accuracy, precision, recall, F1, ROC-AUC, and average per-clip latency to jointly quantify detection quality and efficiency. With parameter-efficient adaptation, compact VLMs achieve performance on par with, and in several cases exceeding, established approaches while retaining competitive per-clip latency. Adaptation further reduces prompt sensitivity, producing more consistent behavior across prompt regimes under the shared protocol. These results show that parameter-efficient fine-tuning enables compact VLMs to serve as dependable clip-level anomaly detectors, yielding a favorable accuracy–efficiency trade-off within a transparent and consistent experimental setup. Full article
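The clip-level metrics listed in the protocol (accuracy, precision, recall, F1) reduce to confusion-matrix counts over binary normal/anomalous labels; a plain-Python sketch:

```python
def clip_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 from binary clip labels
    (1 = anomalous, 0 = normal), via the four confusion-matrix counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    acc = (tp + tn) / len(y_true)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc, prec, rec, f1
```

Reporting all four together matters in surveillance because anomalies are rare: accuracy alone can look strong while recall on anomalous clips is poor.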
(This article belongs to the Special Issue Object Detection in Video Surveillance Systems)

21 pages, 5390 KB  
Article
HitoMi-Cam: A Shape-Agnostic Person Detection Method Using the Spectral Characteristics of Clothing
by Shuji Ono
J. Imaging 2025, 11(11), 399; https://doi.org/10.3390/jimaging11110399 - 7 Nov 2025
Viewed by 827
Abstract
While convolutional neural network (CNN)-based object detection is widely used, it exhibits a shape dependency that degrades performance for postures not included in the training data. Building upon our previous simulation study published in this journal, this study implements and evaluates the spectral-based approach on physical hardware to address this limitation. Specifically, this paper introduces HitoMi-Cam, a lightweight and shape-agnostic person detection method that uses the spectral reflectance properties of clothing. The author implemented the system on a resource-constrained edge device without a GPU to assess its practical viability. The results indicate that a processing speed of 23.2 frames per second (fps) (253 × 190 pixels) is achievable, suggesting that the method can be used for real-time applications. In a simulated search and rescue scenario where the performance of CNNs declines, HitoMi-Cam achieved an average precision (AP) of 93.5%, surpassing that of the compared CNN models (best AP of 53.8%). Throughout all evaluation scenarios, the occurrence of false positives remained minimal. This study positions the HitoMi-Cam method not as a replacement for CNN-based detectors but as a complementary tool under specific conditions. The results indicate that spectral-based person detection can be a viable option for real-time operation on edge devices in real-world environments where shapes are unpredictable, such as disaster rescue. Full article
(This article belongs to the Section Color, Multi-spectral, and Hyperspectral Imaging)

17 pages, 4840 KB  
Article
A Deep Learning-Based Approach for Explainable Microsatellite Instability Detection in Gastrointestinal Malignancies
by Ludovica Ciardiello, Patrizia Agnello, Marta Petyx, Fabio Martinelli, Mario Cesarelli, Antonella Santone and Francesco Mercaldo
J. Imaging 2025, 11(11), 398; https://doi.org/10.3390/jimaging11110398 - 7 Nov 2025
Viewed by 435
Abstract
Microsatellite instability represents a key biomarker in gastrointestinal cancers with significant diagnostic and therapeutic implications. Traditional molecular assays for microsatellite instability detection, while effective, are costly, time-consuming, and require specialized infrastructure. In this paper, we propose an explainable deep learning-based method for microsatellite instability detection based on the analysis of histopathological images. We consider a set of convolutional neural network architectures, i.e., MobileNet, Inception, VGG16, and VGG19, as well as a Vision Transformer model, and we provide clinical explainability for model predictions through three Class Activation Mapping techniques. To further strengthen trustworthiness in predictions, we introduce a set of robustness metrics that quantify the consistency of highlighted discriminative regions across different Class Activation Mapping methods. Experimental results on a real-world dataset demonstrate that the VGG16 and VGG19 models achieve the best performance in terms of accuracy: VGG16 obtains an accuracy of 0.926, while VGG19 reaches 0.917. Furthermore, Class Activation Mapping techniques confirmed that the developed models consistently focus on similar tissue regions, and robustness analysis highlighted high agreement between the different techniques. These results indicate that the proposed method not only achieves strong predictive accuracy but also provides explainable predictions, supporting the integration of deep learning into real-world clinical practice. Full article
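One plausible way to quantify agreement between Class Activation Mapping methods, in the spirit of the robustness metrics described above, is the IoU of their most-activated regions. The paper's exact metrics are not specified here, so this is a hypothetical instantiation:

```python
import numpy as np

def cam_agreement(cam_a, cam_b, top_frac=0.2):
    """Binarize two activation maps at their own top-`top_frac` quantile
    and return the IoU of the highlighted regions (1.0 = same regions)."""
    def top_mask(cam):
        cam = np.asarray(cam, float)
        thr = np.quantile(cam, 1.0 - top_frac)
        return cam >= thr
    a, b = top_mask(cam_a), top_mask(cam_b)
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum() / union) if union else 1.0
```

Averaging this score over all pairs of CAM methods gives a single consistency number per image: high values mean the explanation does not depend on which CAM technique was chosen.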
(This article belongs to the Special Issue Progress and Challenges in Biomedical Image Analysis—2nd Edition)

20 pages, 2086 KB  
Article
Real-Time Colorimetric Imaging System for Automated Quality Classification of Natural Rubber Using Yellowness Index Analysis
by Suphatchakorn Limhengha and Supattarachai Sudsawat
J. Imaging 2025, 11(11), 397; https://doi.org/10.3390/jimaging11110397 - 7 Nov 2025
Viewed by 361
Abstract
Natural rubber quality assessment traditionally relies on subjective visual inspection, leading to inconsistent grading and processing inefficiencies. This study presents a colorimetric imaging system integrating 48-megapixel image acquisition with automated colorimetric analysis for objective rubber classification. Five rubber grades—white crepe, STR5, STR5L, RSS3, and RSS5—were analyzed using standardized 25 × 25 mm2 specimens under controlled environmental conditions (25 ± 2 °C, 50 ± 5% relative humidity, 3200 K illumination). The image processing pipeline employed color space transformations from RGB through CIE1931 XYZ to CIELAB coordinates, with yellowness index calculation following ASTM E313-20 standards. The classification algorithm achieved 100% accuracy across 100 validation specimens under controlled laboratory conditions, with a processing time of 1.01 ± 0.09 s per specimen. Statistical validation via one-way ANOVA confirmed measurement reliability (p > 0.05) with yellowness index values ranging from 8.52 ± 0.52 for white crepe to 72.15 ± 7.47 for RSS3. Image quality metrics demonstrated a signal-to-noise ratio exceeding 35 dB and a spatial uniformity coefficient of variation below 5%. The system provides 12-fold throughput improvement over manual inspection, offering objective quality assessment suitable for industrial implementation, though field validation under diverse conditions remains necessary. Full article
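The color-space chain the abstract describes (RGB through XYZ to a yellowness index) can be sketched as follows. This is an illustrative Python sketch using the sRGB/D65 transform and the ASTM E313 D65/2° coefficients (Cx = 1.2985, Cz = 1.1335); the paper's calibrated 3200 K pipeline and CIELAB step are not reproduced:

```python
import numpy as np

# sRGB -> linear RGB -> CIE XYZ matrix (sRGB primaries, D65 white point).
M = np.array([[0.4124564, 0.3575761, 0.1804375],
              [0.2126729, 0.7151522, 0.0721750],
              [0.0193339, 0.1191920, 0.9503041]])

def srgb_to_xyz(rgb):
    """Convert an sRGB triple in [0, 1] to XYZ scaled so white has Y ~ 100."""
    rgb = np.asarray(rgb, dtype=float)
    lin = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    return M @ lin * 100.0

def yellowness_index(rgb, cx=1.2985, cz=1.1335):
    """ASTM E313 yellowness index, YI = 100 * (cx*X - cz*Z) / Y."""
    X, Y, Z = srgb_to_xyz(rgb)
    return 100.0 * (cx * X - cz * Z) / Y
```

By construction the coefficients make a neutral white score near zero, so increasingly yellow specimens (white crepe up to RSS3 in the study) map to increasingly positive index values.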
(This article belongs to the Section Color, Multi-spectral, and Hyperspectral Imaging)

22 pages, 16214 KB  
Article
Self-Tuned Two-Stage Point Cloud Reconstruction Framework Combining TPDn and PU-Net
by Zhiping Ying and Dayuan Lv
J. Imaging 2025, 11(11), 396; https://doi.org/10.3390/jimaging11110396 - 6 Nov 2025
Viewed by 457
Abstract
This paper presents a self-tuned two-stage framework for point cloud reconstruction. A parameter-free denoising module (TPDn) automatically selects thresholds through polynomial model fitting to remove noise and outliers without manual tuning. The denoised cloud is then upsampled by PU-Net to recover fine-grained geometry. This synergy enhances structural consistency and demonstrates qualitative robustness under various noise conditions. Experiments on synthetic datasets and real industrial scans show that the proposed method improves geometric accuracy and uniformity while maintaining low computational cost. The framework is simple, efficient, and easily scalable to large-scale point clouds. Full article
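TPDn's parameter-free thresholding fits a model to the distance curve and derives the cut-off automatically. As one plausible stand-in for the paper's polynomial-fit rule (which is not reproduced here), a classic knee-point heuristic on the sorted neighbor-distance curve looks like:

```python
import numpy as np

def knee_threshold(dists):
    """Pick a denoising cut-off at the 'knee' of the sorted distance curve:
    the point farthest from the chord joining the curve's endpoints.
    Points with distances above the returned value would be treated as
    noise/outliers. Illustrative heuristic, not TPDn itself."""
    d = np.sort(np.asarray(dists, float))
    n = d.size
    x = np.arange(n, dtype=float)
    # Perpendicular distance from each (x, d) point to the chord through
    # (0, d[0]) and (n-1, d[-1]).
    dx, dy = n - 1.0, d[-1] - d[0]
    dist = np.abs(dy * x - dx * (d - d[0])) / np.hypot(dx, dy)
    return float(d[np.argmax(dist)])
```

On a curve that is flat for inliers and rises sharply for outliers, the knee lands where the rise begins, so no manual threshold is needed.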
(This article belongs to the Section Computer Vision and Pattern Recognition)

36 pages, 163603 KB  
Article
Multi-Weather DomainShifter: A Comprehensive Multi-Weather Transfer LLM Agent for Handling Domain Shift in Aerial Image Processing
by Yubo Wang, Ruijia Wen, Hiroyuki Ishii and Jun Ohya
J. Imaging 2025, 11(11), 395; https://doi.org/10.3390/jimaging11110395 - 6 Nov 2025
Viewed by 490
Abstract
Recent deep learning-based remote sensing analysis models often struggle with performance degradation due to domain shifts caused by illumination variations (clear to overcast), changing atmospheric conditions (clear to foggy, dusty), and physical scene changes (clear to snowy). Addressing domain shift in aerial image segmentation is challenging due to limited training data availability, including costly data collection and annotation. We propose Multi-Weather DomainShifter, a comprehensive multi-weather domain transfer system that augments single-domain images into various weather conditions without additional laborious annotation, coordinated by a large language model (LLM) agent. Specifically, we utilize Unreal Engine to construct a synthetic dataset featuring images captured under diverse conditions such as overcast, foggy, and dusty settings. We then propose a latent space style transfer model that generates alternate domain versions based on real aerial datasets. Additionally, we present a multi-modal snowy scene diffusion model with LLM-assisted scene descriptors to add snowy elements into scenes. Multi-Weather DomainShifter integrates these two approaches into a tool library and leverages the agent for tool selection and execution. Extensive experiments on the ISPRS Vaihingen and Potsdam datasets demonstrate that domain shift caused by weather change in aerial images leads to significant performance drops, and verify our proposal’s capacity to adapt models to perform well in shifted domains while maintaining their effectiveness in the original domain. Full article
(This article belongs to the Special Issue Celebrating the 10th Anniversary of the Journal of Imaging)

20 pages, 5440 KB  
Article
RepSAU-Net: Semantic Segmentation of Barcodes in Complex Backgrounds via Fused Self-Attention and Reparameterization Methods
by Yanfei Sun, Junyu Wang and Rui Yin
J. Imaging 2025, 11(11), 394; https://doi.org/10.3390/jimaging11110394 - 6 Nov 2025
Viewed by 374
Abstract
In the digital era, commodity barcodes serve as a bridge between the physical and digital worlds and are widely used in retail checkout systems. To meet broader application demands for product identification, this paper proposes a method for locating and semantically segmenting barcodes in complex backgrounds, decoding their hidden information, and recovering them in wide field-of-view images. This method integrates self-attention mechanisms and reparameterization techniques to construct the RepSAU-Net model. Specifically, this paper first introduces a barcode image dataset synthesis strategy adapted for deep learning models, constructing the SBS (Screen Stego Barcodes) dataset, which comprises 2000 wide field-of-view background images (Type A) and 400 information-hidden barcode images (Type B), totaling 30,000 images. Based on this, a network architecture (RepSAU-Net) combining a self-attention mechanism and RepVGG reparameterization technology was designed, with a parameter count of 32.88 M. Experimental results demonstrate that the network performs well in barcode segmentation tasks, achieving an inference speed of 4.88 frames/s, a Mean Intersection over Union (MIoU) of 98.36%, and an Accuracy (Acc) of 94.96%. This research effectively enhances global information capture and feature extraction capabilities without significantly increasing computational load, providing technical support for the application of data-embedded barcodes. Full article
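The reported MIoU averages per-class intersection over union across the segmentation labels; a minimal sketch (classes absent from both masks are skipped):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union across class labels for two integer
    label maps of the same shape."""
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union:  # skip classes present in neither map
            ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))
```

For a two-class barcode/background task, this averages the barcode-region IoU with the background IoU, so both over- and under-segmentation lower the score.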
(This article belongs to the Section Image and Video Processing)

22 pages, 1641 KB  
Article
PGRF: Physics-Guided Rectified Flow for Low-Light RAW Image Enhancement
by Juntai Zeng and Qingyun Yang
J. Imaging 2025, 11(11), 393; https://doi.org/10.3390/jimaging11110393 - 6 Nov 2025
Viewed by 801
Abstract
Enhancing RAW images acquired under low-light conditions remains a fundamental yet challenging problem in computational photography and image signal processing. Recent deep learning-based approaches have shifted from real paired datasets toward synthetic data generation, where sensor noise is typically simulated through physical modeling. However, most existing methods primarily account for additive noise, neglect multiplicative noise components, and rely on global calibration procedures that fail to capture pixel-level manufacturing variability. Consequently, these methods struggle to faithfully reproduce the complex statistics of real sensor noise. To overcome these limitations, this paper introduces a physically grounded composite noise model that jointly incorporates additive and multiplicative noise components. We further propose a per-pixel noise simulation and calibration strategy, which estimates and synthesizes noise individually for each pixel. This physics-based calibration not only circumvents the constraints of global noise modeling but also captures spatial noise variations arising from microscopic CMOS sensor fabrication differences. Inspired by the recent success of rectified-flow methods in image generation, we integrate our physics-based noise synthesis into a rectified-flow generative framework and present PGRF (Physics-Guided Rectified Flow): a physics-guided rectified-flow framework for low-light RAW image enhancement. PGRF leverages the expressive capacity of rectified flows to model complex data distributions, while physical guidance constrains the generation process toward the desired clean image manifold. To evaluate our method, we constructed the LLID, a dedicated indoor low-light RAW benchmark captured using the Sony A7S II camera. Extensive experiments demonstrate that the proposed framework achieves substantial improvements over state-of-the-art methods in low-light RAW image enhancement. Full article
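A composite sensor-noise model of the kind the abstract describes, combining a per-pixel multiplicative gain term with signal-dependent shot noise and additive read noise, can be sketched roughly as follows. Parameter names and values are illustrative assumptions, not the paper's calibrated per-pixel model:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_raw_noise(clean, gain=2.0, prnu_sigma=0.01, read_sigma=2.0):
    """Toy composite noise synthesis for a clean RAW signal (in DN):
    per-pixel gain variation (multiplicative), Poisson shot noise
    (signal-dependent), and Gaussian read noise (additive)."""
    clean = np.asarray(clean, float)
    # Per-pixel gain map models fabrication variability (multiplicative term).
    k = 1.0 + rng.normal(0.0, prnu_sigma, size=clean.shape)
    electrons = rng.poisson(np.clip(clean, 0, None) / gain)  # shot noise
    signal = electrons * gain * k
    return signal + rng.normal(0.0, read_sigma, size=clean.shape)
```

Because each pixel draws its own gain perturbation, the synthesized noise varies spatially rather than following a single global calibration, which is the limitation the abstract argues against.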
(This article belongs to the Section Image and Video Processing)

27 pages, 3005 KB  
Systematic Review
Prognostic Value of Enterography Findings in Crohn’s Disease: A Systematic Review and Meta-Analysis
by Felipe Montevechi-Luz, Adrieli Heloísa Campardo Pansani, Juliana Delgado Campos Mello, Ana Emilia Carvalho de Paula, Lívia Moreira Genaro, Marcia Carolina Mazzaro, Daniel Lahan-Martins and Raquel Franco Leal
J. Imaging 2025, 11(11), 392; https://doi.org/10.3390/jimaging11110392 - 5 Nov 2025
Viewed by 601
Abstract
Crohn’s disease is a chronic inflammatory disorder with variable progression that often leads to hospitalization, treatment escalation, or surgery. While clinical and endoscopic indices guide disease monitoring, cross-sectional enterography provides unique visualization of transmural and extramural inflammation, offering valuable prognostic information. This systematic review and meta-analysis examined the prognostic significance of magnetic resonance enterography (MRE) and computed tomography enterography (CTE) in Crohn’s disease. Following PRISMA guidelines and a registered protocol, eight databases were systematically searched through August 2024. Two reviewers independently conducted data extraction, risk-of-bias assessment (QUADAS-2), and certainty grading (GRADE). Random-effects models were applied for pooled analyses. Eleven studies, including more than 1500 patients, met eligibility criteria. Across cohorts, transmural healing on enterography was consistently associated with favorable long-term outcomes, including a markedly lower need for surgery and hospitalization. Conversely, stenosis and persistent inflammatory activity identified patients at substantially higher risk of surgery, treatment intensification, or disease-related hospitalization. The certainty of evidence was high for surgical outcomes and moderate to low for other endpoints. Conventional enterography provides meaningful prognostic insight into Crohn’s disease and should be considered a complementary tool for risk stratification and treatment planning. Transmural healing represents a protective marker of a favorable disease course, whereas structural and inflammatory findings indicate patients who may benefit from closer monitoring or earlier therapeutic intervention. Full article
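The random-effects pooling applied in such meta-analyses is commonly the DerSimonian–Laird estimator; a sketch of pooling study effects with between-study variance tau² (illustrative; the review's exact model settings are not reproduced here):

```python
import numpy as np

def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate: compute the fixed-effect mean,
    Cochran's Q, the DerSimonian-Laird between-study variance tau^2,
    then re-weight by 1 / (v_i + tau^2). Returns (pooled, tau2)."""
    y = np.asarray(effects, float)
    v = np.asarray(variances, float)
    w = 1.0 / v
    fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - fixed) ** 2)          # Cochran's Q heterogeneity
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (y.size - 1)) / c)   # truncated at zero
    w_star = 1.0 / (v + tau2)
    return float(np.sum(w_star * y) / np.sum(w_star)), float(tau2)
```

When the studies agree, tau² collapses to zero and the estimate reduces to the fixed-effect mean; heterogeneous studies inflate tau² and equalize the weights.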

24 pages, 59247 KB  
Article
Pursuing Better Representations: Balancing Discriminability and Transferability for Few-Shot Class-Incremental Learning
by Qi Li, Wei Wang, Hui Fan, Bingwei Hui and Fei Wen
J. Imaging 2025, 11(11), 391; https://doi.org/10.3390/jimaging11110391 - 4 Nov 2025
Viewed by 519
Abstract
Few-Shot Class-Incremental Learning (FSCIL) aims to continually learn novel classes from limited data while retaining knowledge of previously learned classes. To mitigate catastrophic forgetting, most approaches pre-train a powerful backbone on the base session and keep it frozen during incremental sessions. Within this framework, existing studies primarily focus on representation learning in FSCIL, particularly Self-Supervised Contrastive Learning (SSCL), to enhance the transferability of representations and thereby boost model generalization to novel classes. However, they face a trade-off dilemma: improving transferability comes at the expense of discriminability, precluding simultaneous high performance on both base and novel classes. To address this issue, we propose BR-FSCIL, a representation learning framework for the FSCIL scenario. In the pre-training stage, we first design a Hierarchical Contrastive Learning (HierCon) algorithm. HierCon leverages label information to model hierarchical relationships among features. In contrast to SSCL, it maintains strong discriminability when promoting transferability. Second, to further improve the model’s performance on novel classes, an Alignment Modulation (AM) loss is proposed that explicitly facilitates learning of knowledge shared across classes from an inter-class perspective. Building upon the hierarchical discriminative structure established by HierCon, it additionally improves the model’s adaptability to novel classes. Through optimization at both intra-class and inter-class levels, the representations learned by BR-FSCIL achieve a balance between discriminability and transferability. Extensive experiments on mini-ImageNet, CIFAR100, and CUB200 demonstrate the effectiveness of our method, which achieves final session accuracies of 53.83%, 53.04%, and 62.60%, respectively. Full article
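Hierarchical contrastive learning builds on supervised contrastive objectives that pull same-class features together. A minimal sketch of that base objective, assuming L2-normalized features (HierCon's hierarchical label weighting is not reproduced here):

```python
import numpy as np

def supcon_loss(feats, labels, tau=0.1):
    """Toy supervised contrastive loss: for each anchor, average the
    negative log-probability of its same-class positives under a
    temperature-scaled softmax over all other samples."""
    f = np.asarray(feats, float)
    sim = f @ f.T / tau
    np.fill_diagonal(sim, -np.inf)  # exclude self-similarity
    logp = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    labels = np.asarray(labels)
    loss, n = 0.0, 0
    for i in range(len(labels)):
        pos = (labels == labels[i]) & (np.arange(len(labels)) != i)
        if pos.any():
            loss -= logp[i, pos].mean()
            n += 1
    return loss / max(n, 1)
```

Tight, well-separated class clusters drive the loss toward zero; the trade-off the abstract discusses is that optimizing only this discriminability can hurt transfer to novel classes.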
(This article belongs to the Section Computer Vision and Pattern Recognition)

21 pages, 3119 KB  
Review
Next-Generation Advances in Prostate Cancer Imaging and Artificial Intelligence Applications
by Kathleen H. Miao, Julia H. Miao, Mark Finkelstein, Aritrick Chatterjee and Aytekin Oto
J. Imaging 2025, 11(11), 390; https://doi.org/10.3390/jimaging11110390 - 3 Nov 2025
Viewed by 1454
Abstract
Prostate cancer is one of the leading causes of cancer-related morbidity and mortality worldwide, and imaging plays a critical role in its detection, localization, staging, treatment, and management. The advent of artificial intelligence (AI) has introduced transformative possibilities in prostate imaging, offering enhanced accuracy, efficiency, and consistency. This review explores the integration of AI in prostate cancer diagnostics across key imaging modalities, including multiparametric MRI (mpMRI), PSMA PET/CT, and transrectal ultrasound (TRUS). Advanced AI technologies, such as machine learning, deep learning, and radiomics, are being applied for lesion detection, risk stratification, segmentation, biopsy targeting, and treatment planning. AI-augmented systems have demonstrated the ability to support PI-RADS scoring, automate prostate and tumor segmentation, guide targeted biopsies, and optimize radiation therapy. Despite promising performance, challenges persist regarding data heterogeneity, algorithm generalizability, ethical considerations, and clinical implementation. Looking ahead, multimodal AI models integrating imaging, genomics, and clinical data hold promise for advancing precision medicine in prostate cancer care and assisting clinicians, particularly in underserved regions with limited access to specialists. Continued multidisciplinary collaboration will be essential to translate these innovations into evidence-based practice. This article explores current AI applications and future directions that are transforming prostate imaging and patient care. Full article
(This article belongs to the Special Issue Celebrating the 10th Anniversary of the Journal of Imaging)
