J. Imaging, Volume 11, Issue 10 (October 2025) – 43 articles

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view a paper in PDF format, click the "PDF Full-text" link and open it with the free Adobe Reader.
25 pages, 34242 KB  
Article
ImbDef-GAN: Defect Image-Generation Method Based on Sample Imbalance
by Dengbiao Jiang, Nian Tao, Kelong Zhu, Yiming Wang and Haijian Shao
J. Imaging 2025, 11(10), 367; https://doi.org/10.3390/jimaging11100367 - 16 Oct 2025
Abstract
In industrial settings, defect detection using deep learning typically requires large numbers of defective samples. However, defective products are rare on production lines, creating a scarcity of defect samples and an overabundance of samples that contain only background. We introduce ImbDef-GAN, a sample-imbalance generative framework, to address three persistent limitations in defect image generation: unnatural transitions at defect–background boundaries, misalignment between defects and their masks, and out-of-bounds defect placement. The framework operates in two stages: (i) background image generation and (ii) defect image generation conditioned on the generated background. In the background image-generation stage, a lightweight StyleGAN3 variant jointly generates the background image and its segmentation mask. A Progress-coupled Gated Detail Injection module uses global scheduling driven by training progress and per-pixel gating to inject high-frequency information in a controlled manner, thereby enhancing background detail while preserving training stability. In the defect image-generation stage, the design augments the background generator with a residual branch that extracts defect features. By blending defect features with a smoothing coefficient, the resulting defect boundaries transition more naturally and gradually. A mask-aware matching discriminator enforces consistency between each defect image and its mask. In addition, an Edge Structure Loss and a Region Consistency Loss strengthen morphological fidelity and spatial constraints within the valid mask region. Extensive experiments on the MVTec AD dataset demonstrate that ImbDef-GAN surpasses existing methods in both the realism and diversity of generated defects. When the generated data are used to train a downstream detector, YOLOv11 achieves a 5.4% improvement in mAP@0.5, indicating that the proposed approach effectively improves detection accuracy under sample imbalance. Full article
(This article belongs to the Section Image and Video Processing)
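The boundary smoothing described in this abstract can be illustrated in pixel space. The sketch below is not the paper's mechanism, which blends features inside the generator; it is a hypothetical composite in which a Gaussian-blurred mask and a smoothing coefficient alpha soften the defect–background transition.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blend_defect(background: np.ndarray, defect: np.ndarray,
                 mask: np.ndarray, alpha: float = 0.7) -> np.ndarray:
    """Composite a defect patch onto a background with a softened boundary.

    Pixel-space illustration only: ImbDef-GAN blends *features* inside the
    generator; here a Gaussian-blurred binary mask stands in for that
    smoothing, so the transition is gradual rather than a hard cut.
    """
    soft = gaussian_filter(mask.astype(float), sigma=2.0) * alpha  # soft edges
    if background.ndim == 3:            # broadcast over color channels
        soft = soft[..., None]
    return (1.0 - soft) * background + soft * defect
```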
4 pages, 170 KB  
Editorial
Editorial on the Special Issue: “Advances in Retinal Image Processing”
by P. Jidesh and Vasudevan Lakshminarayanan
J. Imaging 2025, 11(10), 366; https://doi.org/10.3390/jimaging11100366 - 16 Oct 2025
Abstract
Retinal disorders are one of the major causes of visual impairment [...] Full article
(This article belongs to the Special Issue Advances in Retinal Image Processing)
13 pages, 1736 KB  
Article
Automatic Brain Tumor Segmentation in 2D Intra-Operative Ultrasound Images Using Magnetic Resonance Imaging Tumor Annotations
by Mathilde Gajda Faanes, Ragnhild Holden Helland, Ole Solheim, Sébastien Muller and Ingerid Reinertsen
J. Imaging 2025, 11(10), 365; https://doi.org/10.3390/jimaging11100365 - 16 Oct 2025
Abstract
Automatic segmentation of brain tumors in intra-operative ultrasound (iUS) images could facilitate localization of tumor tissue during resection surgery. The lack of large annotated datasets limits the performance of current models. In this paper, we investigated the use of tumor annotations in magnetic resonance imaging (MRI) scans, which are more accessible than annotations in iUS images, for training deep learning models for iUS brain tumor segmentation. We used 180 annotated MRI scans with corresponding unannotated iUS images, and 29 annotated iUS images. Image registration was performed to transfer the MRI annotations to the corresponding iUS images before training the nnU-Net model with different configurations of the data and label origins. A model trained with only MRI-derived tumor annotations performed similarly to models trained with only iUS annotations or with both, and to expert annotations, indicating that MRI tumor annotations can substitute for iUS tumor annotations when training a deep learning model for automatic brain tumor segmentation in iUS images. The best model obtained an average Dice score of 0.62 ± 0.31, compared to 0.67 ± 0.25 for an expert neurosurgeon; performance was similar on larger tumors but lower for the models on smaller tumors. In addition, removing smaller tumors from the training sets improved the results. Full article
(This article belongs to the Special Issue Progress and Challenges in Biomedical Image Analysis—2nd Edition)
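The Dice scores quoted above (0.62 ± 0.31 vs. 0.67 ± 0.25) measure overlap between a predicted mask and a reference mask, Dice = 2|A∩B| / (|A| + |B|); a minimal NumPy version:

```python
import numpy as np

def dice_score(pred: np.ndarray, ref: np.ndarray) -> float:
    """Dice similarity coefficient, 2|A∩B| / (|A| + |B|), for binary masks."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    denom = pred.sum() + ref.sum()
    if denom == 0:            # both masks empty: define as perfect overlap
        return 1.0
    return 2.0 * np.logical_and(pred, ref).sum() / denom
```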

11 pages, 1100 KB  
Communication
Surgical Instrument Segmentation via Segment-then-Classify Framework with Instance-Level Spatiotemporal Consistency Modeling
by Tiyao Zhang, Xue Yuan and Hongze Xu
J. Imaging 2025, 11(10), 364; https://doi.org/10.3390/jimaging11100364 - 15 Oct 2025
Abstract
Accurate segmentation of surgical instruments in endoscopic videos is crucial for robot-assisted surgery and intraoperative analysis. This paper presents a Segment-then-Classify framework that decouples mask generation from semantic classification to enhance spatial completeness and temporal stability. First, a Mask2Former-based segmentation backbone generates class-agnostic instance masks and region features. Then, a bounding box-guided instance-level spatiotemporal modeling module fuses geometric priors and temporal consistency through a lightweight transformer encoder. This design improves interpretability and robustness under occlusion and motion blur. Experiments on the EndoVis 2017 and 2018 datasets demonstrate that our framework achieves mIoU improvements of 3.06%, 2.99%, and 1.67% and mcIoU gains of 2.36%, 2.85%, and 6.06%, respectively, over previous state-of-the-art methods, while maintaining computational efficiency. Full article
(This article belongs to the Section Image and Video Processing)
11 pages, 1676 KB  
Article
Radiographic Markers of Hip Dysplasia and Femoroacetabular Impingement Are Associated with Deterioration in Acetabular and Femoral Cartilage Quality: Insights from T2 MRI Mapping
by Adam Peszek, Kyle S. J. Jamar, Catherine C. Alder, Trevor J. Wait, Caleb J. Wipf, Carson L. Keeter, Stephanie W. Mayer, Charles P. Ho and James W. Genuario
J. Imaging 2025, 11(10), 363; https://doi.org/10.3390/jimaging11100363 - 14 Oct 2025
Abstract
Femoroacetabular impingement (FAI) and hip dysplasia have been shown to increase the risk of hip osteoarthritis in affected individuals. MRI with T2 mapping provides an objective measure of femoral and acetabular articular cartilage tissue quality. This study aims to evaluate the relationship between hip morphology measurements collected from three-dimensional (3D) reconstructed computed tomography (CT) scans and the T2 mapping values of hip articular cartilage assessed by three independent, blinded reviewers on the optimal sagittal cut. Hip morphology measures, including lateral center edge angle (LCEA), acetabular version, Tönnis angle, acetabular coverage, alpha angle, femoral torsion, femoral neck-shaft angle (FNSA), and combined version, were recorded from preoperative CT scans. The relationship between T2 values and hip morphology was assessed using univariate linear mixed models with random effects for individual patients. Significant associations were observed between femoral and acetabular articular cartilage T2 values and all hip morphology measures except femoral torsion. Hip morphology measurements consistent with dysplastic anatomy, including decreased LCEA, increased Tönnis angle, and decreased acetabular coverage, were associated with increased cartilage damage (p < 0.001 for all). Articular cartilage T2 values were strongly associated with the radiographic markers of hip dysplasia, suggesting hip microinstability significantly contributes to cartilage damage. The relationships between hip morphology measurements and T2 values were similar for the femoral and acetabular sides, indicating that damage to both surfaces is comparable rather than preferentially affecting one side. Full article
(This article belongs to the Section Medical Imaging)
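The statistical model named here, a univariate linear mixed model with a per-patient random effect, maps directly onto statsmodels. The sketch below uses synthetic data with hypothetical column names (t2, lcea, patient_id) standing in for the study's variables:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per cartilage region per patient,
# with a T2 value (ms) and one morphology measure (LCEA, in degrees).
rng = np.random.default_rng(0)
n_patients, per_patient = 30, 4
patient_id = np.repeat(np.arange(n_patients), per_patient)
lcea = rng.normal(28, 6, n_patients)[patient_id]
t2 = (45 - 0.3 * lcea + rng.normal(0, 2, n_patients)[patient_id]
      + rng.normal(0, 1.5, n_patients * per_patient))
df = pd.DataFrame({"patient_id": patient_id, "lcea": lcea, "t2": t2})

# Univariate linear mixed model: fixed effect for the morphology measure,
# random intercept per patient to account for repeated measures.
result = smf.mixedlm("t2 ~ lcea", data=df, groups=df["patient_id"]).fit()
print(result.summary())
```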

13 pages, 1871 KB  
Article
CT Imaging Biomarkers in Rhinogenic Contact Point Headache: Quantitative Phenotyping and Diagnostic Correlations
by Salvatore Lavalle, Salvatore Ferlito, Jerome Rene Lechien, Mario Lentini, Placido Romeo, Alberto Maria Saibene, Gian Luca Fadda and Antonino Maniaci
J. Imaging 2025, 11(10), 362; https://doi.org/10.3390/jimaging11100362 - 14 Oct 2025
Abstract
Rhinogenic contact point headache (RCPH) represents a diagnostic challenge due to different anatomical presentations and unstandardized imaging markers. This prospective multicenter study involving 120 patients aimed to develop and validate a CT-based imaging framework for RCPH diagnosis. High-resolution CT scans were systematically assessed for seven parameters: contact point (CP) type, contact intensity (CI), septal deviation, concha bullosa (CB) morphology, mucosal edema (ME), turbinate hypertrophy (TH), and associated anatomical variants. Results revealed CP-I (37.5%) and CP-II (22.5%) as predominant patterns, with moderate CI (45.8%) and septal deviation > 15° (71.7%) commonly observed. CB was found in 54.2% of patients, primarily bulbous type (26.7%). Interestingly, focal ME at CP was independently associated with greater pain severity in the multivariate model (p = 0.003). The framework demonstrated substantial to excellent interobserver reliability (κ = 0.76–0.91), with multivariate analysis identifying moderate–severe CI, focal ME, and specific septal deviation patterns as independent predictors of higher pain scores. Our imaging classification system highlights key radiological biomarkers associated with symptom severity and may facilitate future applications in quantitative imaging, automated phenotyping, and personalized treatment approaches. Full article
(This article belongs to the Section Medical Imaging)
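The interobserver reliability reported here (κ = 0.76–0.91) is the kappa statistic, κ = (p_o − p_e)/(1 − p_e), which corrects observed agreement p_o for chance agreement p_e. A sketch for two raters with scikit-learn, using hypothetical gradings:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical contact-point (CP) type gradings of the same 10 scans
# by two radiologists.
rater_a = ["CP-I", "CP-II", "CP-I", "CP-III", "CP-I",
           "CP-II", "CP-I", "CP-I", "CP-IV", "CP-II"]
rater_b = ["CP-I", "CP-II", "CP-II", "CP-III", "CP-I",
           "CP-II", "CP-I", "CP-I", "CP-IV", "CP-I"]

kappa = cohen_kappa_score(rater_a, rater_b)
print(f"Cohen's kappa: {kappa:.2f}")  # 1.0 = perfect, 0 = chance-level
```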

19 pages, 2435 KB  
Article
A Lesion-Aware Patch Sampling Approach with EfficientNet3D-UNet for Robust Multiple Sclerosis Lesion Segmentation
by Hind Almaaz and Samia Dardouri
J. Imaging 2025, 11(10), 361; https://doi.org/10.3390/jimaging11100361 - 13 Oct 2025
Abstract
Accurate segmentation of multiple sclerosis (MS) lesions from 3D MRI scans is essential for diagnosis, disease monitoring, and treatment planning. However, this task remains challenging due to the sparsity, heterogeneity, and subtle appearance of lesions, as well as the difficulty in obtaining high-quality annotations. In this study, we propose EfficientNet3D-UNet, a deep learning framework that combines compound-scaled MBConv3D blocks with a lesion-aware patch sampling strategy to improve volumetric segmentation performance across multi-modal MRI sequences (FLAIR, T1, and T2). The model was evaluated against a conventional 3D U-Net baseline using standard metrics including Dice similarity coefficient, precision, recall, accuracy, and specificity. On a held-out test set, EfficientNet3D-UNet achieved a Dice score of 48.39%, precision of 49.76%, and recall of 55.41%, outperforming the baseline 3D U-Net, which obtained a Dice score of 31.28%, precision of 32.48%, and recall of 43.04%. Both models reached an overall accuracy of 99.14%. Notably, EfficientNet3D-UNet also demonstrated faster convergence and reduced overfitting during training. These results highlight the potential of EfficientNet3D-UNet as a robust and computationally efficient solution for automated MS lesion segmentation, offering promising applicability in real-world clinical settings. Full article
(This article belongs to the Section Medical Imaging)

23 pages, 4523 KB  
Article
Lung Nodule Malignancy Classification Integrating Deep and Radiomic Features in a Three-Way Attention-Based Fusion Module
by Sadaf Khademi, Shahin Heidarian, Parnian Afshar, Arash Mohammadi, Abdul Sidiqi, Elsie T. Nguyen, Balaji Ganeshan and Anastasia Oikonomou
J. Imaging 2025, 11(10), 360; https://doi.org/10.3390/jimaging11100360 - 13 Oct 2025
Abstract
In this study, we propose a novel hybrid framework for assessing the invasiveness of an in-house dataset of 114 pathologically proven lung adenocarcinomas presenting as subsolid nodules on Computed Tomography (CT). Nodules were classified into group 1 (G1), which included atypical adenomatous hyperplasia, adenocarcinoma in situ, and minimally invasive adenocarcinomas, and group 2 (G2), which included invasive adenocarcinomas. Our approach includes a three-way Integration of Visual, Spatial, and Temporal features with Attention, referred to as I-VISTA, obtained from three processing algorithms designed based on Deep Learning (DL) and radiomic models, leading to a more comprehensive analysis of nodule variations. The aforementioned processing algorithms are arranged in the following three parallel paths: (i) the Shifted Window (SWin) Transformer path, a hierarchical vision Transformer that extracts nodule-related spatial features; (ii) the Convolutional Auto-Encoder (CAE) Transformer path, which captures informative features related to inter-slice relations via a modified Transformer encoder architecture; and (iii) a 3D Radiomic-based path that collects quantitative features based on texture analysis of each nodule. The extracted feature sets are then passed through a Criss-Cross attention fusion module to discover the most informative feature patterns and classify nodule type. The experiments were evaluated based on a ten-fold cross-validation scheme. The I-VISTA framework achieved the best performance, with overall accuracy, sensitivity, and specificity (mean ± std) of 93.93 ± 6.80%, 92.66 ± 9.04%, and 94.99 ± 7.63% and an Area under the ROC Curve (AUC) of 0.93 ± 0.08 for lung nodule classification across the ten folds. The hybrid framework integrating DL and hand-crafted 3D radiomic features outperformed the standalone DL and hand-crafted 3D radiomic models in differentiating G1 from G2 subsolid nodules identified on CT. Full article
(This article belongs to the Special Issue Progress and Challenges in Biomedical Image Analysis—2nd Edition)

29 pages, 2757 KB  
Article
Non-Contrast Brain CT Images Segmentation Enhancement: Lightweight Pre-Processing Model for Ultra-Early Ischemic Lesion Recognition and Segmentation
by Aleksei Samarin, Alexander Savelev, Aleksei Toropov, Aleksandra Dozortseva, Egor Kotenko, Artem Nazarenko, Alexander Motyko, Galiya Narova, Elena Mikhailova and Valentin Malykh
J. Imaging 2025, 11(10), 359; https://doi.org/10.3390/jimaging11100359 - 13 Oct 2025
Abstract
Timely identification and accurate delineation of ultra-early ischemic stroke lesions in non-contrast computed tomography (CT) scans of the human brain are of paramount importance for prompt medical intervention and improved patient outcomes. In this study, we propose a deep learning-driven methodology specifically designed for segmenting ultra-early ischemic regions, with a particular emphasis on both the ischemic core and the surrounding penumbra during the initial stages of stroke progression. We introduce a lightweight preprocessing model based on convolutional filtering techniques, which enhances image clarity while preserving the structural integrity of medical scans, a critical factor when detecting subtle signs of ultra-early ischemic strokes. Unlike conventional preprocessing methods that directly modify the image and may introduce artifacts or distortions, our approach ensures the absence of neural network-induced artifacts, which is especially crucial for accurate diagnosis and segmentation of ultra-early ischemic lesions. The model employs predefined differentiable filters with trainable parameters, allowing for artifact-free and precision-enhanced image refinement tailored to the challenges of ultra-early stroke detection. In addition, we incorporated into the combined preprocessing pipeline a newly proposed trainable linear combination of pretrained image filters, a concept first introduced in this study. For model training and evaluation, we utilize a publicly available dataset of acute ischemic stroke cases, focusing on the subset relevant to ultra-early stroke manifestations, which contains annotated non-contrast CT brain scans from 112 patients. The proposed model demonstrates high segmentation accuracy for ultra-early ischemic regions, surpassing existing methodologies across key performance metrics. The results have been rigorously validated on test subsets from the dataset, confirming the effectiveness of our approach in supporting the early-stage diagnosis and treatment planning for ultra-early ischemic strokes. Full article
(This article belongs to the Section Medical Imaging)
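The "trainable linear combination of pretrained image filters" mentioned above lends itself to a compact sketch: the kernels are predefined and frozen, and only the mixing weights are learned, so the output stays a weighted sum of classical filter responses. This is an illustrative guess at the construction, not the authors' code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FilterBlend(nn.Module):
    """Trainable linear combination of predefined, differentiable filters.

    Sketch of the idea only: the 3x3 kernels are classical filters and stay
    frozen; only the mixing weights are learned, so the output remains a
    weighted sum of known filter responses and cannot hallucinate structure.
    """
    def __init__(self):
        super().__init__()
        identity = torch.tensor([[0., 0., 0.], [0., 1., 0.], [0., 0., 0.]])
        box_blur = torch.full((3, 3), 1.0 / 9.0)
        laplacian = torch.tensor([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]])
        kernels = torch.stack([identity, box_blur, laplacian]).unsqueeze(1)
        self.register_buffer("kernels", kernels)          # (3, 1, 3, 3), frozen
        self.weights = nn.Parameter(torch.ones(3) / 3.0)  # trainable mixture

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, 1, H, W) CT slice; one response map per fixed filter.
        responses = F.conv2d(x, self.kernels, padding=1)  # (B, 3, H, W)
        w = self.weights.view(1, -1, 1, 1)
        return (responses * w).sum(dim=1, keepdim=True)   # linear combination
```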

16 pages, 571 KB  
Article
Lightweight Statistical and Texture Feature Approach for Breast Thermogram Analysis
by Ana P. Romero-Carmona, Jose J. Rangel-Magdaleno, Francisco J. Renero-Carrillo, Juan M. Ramirez-Cortes and Hayde Peregrina-Barreto
J. Imaging 2025, 11(10), 358; https://doi.org/10.3390/jimaging11100358 - 13 Oct 2025
Abstract
Breast cancer is the most commonly diagnosed cancer in women globally and represents the leading cause of mortality related to malignant tumors. Currently, healthcare professionals are focused on developing and implementing innovative techniques to improve the early detection of this disease. Thermography, studied as a complementary method to traditional approaches, captures infrared radiation emitted by tissues and converts it into data about skin surface temperature. During tumor development, angiogenesis occurs, increasing blood flow to support tumor growth, which raises the surface temperature in the affected area. Automatic classification techniques have been explored to analyze thermographic images and develop an optimal classification tool to identify thermal anomalies. This study aims to design a concise description using statistical and texture features to accurately classify thermograms as control or highly probable cancer (showing thermal anomalies). The importance of employing a short description lies in facilitating interpretation by medical professionals. In contrast, a characterization based on a large number of variables could make it more challenging to identify which values differentiate the thermograms between groups, thereby complicating the explanation of results to patients. A maximum accuracy of 91.97% was achieved using only seven features and a Coarse Decision Tree (DT) classifier, a robust Machine Learning (ML) model that demonstrated competitive performance compared with previously reported studies. Full article
(This article belongs to the Section Medical Imaging)

18 pages, 3321 KB  
Article
New Solution for Segmental Assessment of Left Ventricular Wall Thickness, Using Anatomically Accurate and Highly Reproducible Automated Cardiac MRI Software
by Balázs Mester, Kristóf Attila Farkas-Sütő, Júlia Magdolna Tardy, Kinga Grebur, Márton Horváth, Flóra Klára Gyulánczi, Hajnalka Vágó, Béla Merkely and Andrea Szűcs
J. Imaging 2025, 11(10), 357; https://doi.org/10.3390/jimaging11100357 - 11 Oct 2025
Abstract
Introduction: Changes in left ventricular (LV) wall thickness serve as important diagnostic and prognostic indicators in various cardiovascular diseases. To date, no automated software exists for the measurement of myocardial segmental wall thickness in cardiac MRI (CMR), which leads to reliance on manual caliper measurements that carry risks of inaccuracy. Aims: This paper presents a new automated segmental wall thickness measurement software, OptiLayer, developed to address this issue, and compares it with the conventional manual measurement method. Methods: In our pilot study, the OptiLayer algorithm was tested on 50 healthy individuals (HEALTHY) and 50 subjects with excessively trabeculated noncompaction (LVET) and preserved LV function, whose morphology, often accompanied by myocardial thinning, makes left ventricular wall thickness more challenging to measure. Measurements were performed by two independent investigators who assessed LV wall thicknesses in 16 segments, both manually using the Medis Suite QMass program and automatically with the new OptiLayer method, which enables high-density sampling across the distance between the epicardial and endocardial contours. Results: The segmental wall thickness values from the OptiLayer algorithm were significantly higher than those from the manual caliper. In comparisons of the HEALTHY and LVET subgroups, OptiLayer measurements revealed differences at several points that manual measurements did not. Between the investigators, manual measurements showed low intraclass correlations (ICC below 0.6 on average), while measurements with OptiLayer gave excellent agreement (ICC above 0.9 in 75% of segments). Conclusions: Our study suggests that OptiLayer, a new automated wall thickness measurement software based on high-precision anatomical segmentation, offers a faster, more accurate, and more reproducible alternative to manual measurements. Full article
(This article belongs to the Section Medical Imaging)

20 pages, 5063 KB  
Article
AI Diffusion Models Generate Realistic Synthetic Dental Radiographs Using a Limited Dataset
by Brian Kirkwood, Byeong Yeob Choi, James Bynum and Jose Salinas
J. Imaging 2025, 11(10), 356; https://doi.org/10.3390/jimaging11100356 - 11 Oct 2025
Abstract
Generative Artificial Intelligence (AI) has the potential to address the limited availability of dental radiographs for the development of Dental AI systems by creating clinically realistic synthetic dental radiographs (SDRs). Evaluation of artificially generated images requires both expert review and objective measures of fidelity. A stepwise approach was used to process 10,000 dental radiographs. First, a single dentist screened images to determine whether a specific image selection criterion was met; this identified 225 images. From these, 200 images were randomly selected for training an AI image generation model. Second, 100 images were randomly selected from the previous training dataset and evaluated by four dentists; this expert review identified 57 images that met the image selection criteria, which were used to refine training for two additional AI models. The three models were used to generate 500 SDRs each, and the clinical realism of the SDRs was assessed through expert review. In addition, the SDRs generated by each model were objectively evaluated using quantitative metrics: Fréchet Inception Distance (FID) and Kernel Inception Distance (KID). Evaluation of the SDRs by a dentist determined that expert-informed curation improved SDR realism, and refinement of the model architecture produced further gains. FID and KID analysis confirmed that expert input and technical refinement improve image fidelity. The convergence of subjective and objective assessments strengthens confidence that the refined model architecture can serve as a foundation for SDR image generation, while highlighting the importance of expert-informed data curation and domain-specific evaluation metrics. Full article
(This article belongs to the Topic Machine Learning and Deep Learning in Medical Imaging)
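Of the two fidelity metrics used here, FID compares Gaussian fits to real and synthetic Inception features: FID = ||μ_r − μ_f||² + Tr(Σ_r + Σ_f − 2(Σ_r Σ_f)^{1/2}). A NumPy/SciPy sketch, assuming feature extraction has already been done elsewhere:

```python
import numpy as np
from scipy import linalg

def fid(feats_real: np.ndarray, feats_fake: np.ndarray) -> float:
    """Frechet Inception Distance between two (N, D) feature arrays."""
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    covmean = linalg.sqrtm(cov_r @ cov_f)   # matrix square root
    if np.iscomplexobj(covmean):            # numerical noise can leave a tiny
        covmean = covmean.real              # imaginary part; discard it
    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r + cov_f - 2.0 * covmean))
```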

28 pages, 2961 KB  
Article
An Improved Capsule Network for Image Classification Using Multi-Scale Feature Extraction
by Wenjie Huang, Ruiqing Kang, Lingyan Li and Wenhui Feng
J. Imaging 2025, 11(10), 355; https://doi.org/10.3390/jimaging11100355 - 10 Oct 2025
Abstract
In image classification, a capsule network is a network topology that packs extracted features into many capsules, screens them through a dynamic routing mechanism, and treats each capsule as corresponding to a category feature. Compared with earlier topologies, capsule networks perform more sophisticated operations, use large numbers of parameter matrices and vectors to express image attributes, and offer stronger classification capability. In practice, however, capsule networks have been constrained by the computational cost of their complicated structure: on simple datasets they are prone to overfitting and poor generalization, while on complex datasets their computational overhead is often prohibitive. To address these problems, this research proposes a novel enhanced capsule network topology. The upgraded network boosts feature extraction by incorporating into the standard capsule network a multi-scale feature extraction module based on a custom star-structure convolution. Other structural components of the capsule network are also modified, combining optimization approaches such as dense connections, attention mechanisms, and low-rank matrix operations. Image classification experiments on several datasets show that the proposed structure performs well on CIFAR-10, CIFAR-100, and CUB, and achieves 98.21% and 95.38% classification accuracy on two more demanding datasets, an ISIC-derived skin cancer dataset and Forged Face EXP. Full article
(This article belongs to the Section Image and Video Processing)

19 pages, 3418 KB  
Article
WSVAD-CLIP: Temporally Aware and Prompt Learning with CLIP for Weakly Supervised Video Anomaly Detection
by Min Li, Jing Sang, Yuanyao Lu and Lina Du
J. Imaging 2025, 11(10), 354; https://doi.org/10.3390/jimaging11100354 - 10 Oct 2025
Abstract
Weakly Supervised Video Anomaly Detection (WSVAD) is a critical task in computer vision. It aims to localize and recognize abnormal behaviors using only video-level labels. Without frame-level annotations, it becomes significantly challenging to model temporal dependencies. Given the diversity of abnormal events, it is also difficult to model semantic representations. Recently, the cross-modal pre-trained model Contrastive Language-Image Pretraining (CLIP) has shown a strong ability to align visual and textual information. This provides new opportunities for video anomaly detection. Inspired by CLIP, WSVAD-CLIP is proposed as a framework that uses its cross-modal knowledge to bridge the semantic gap between text and vision. First, the Axial-Graph (AG) Module is introduced. It combines an Axial Transformer and Lite Graph Attention Networks (LiteGAT) to capture global temporal structures and local abnormal correlations. Second, a Text Prompt mechanism is designed. It fuses a learnable prompt with a knowledge-enhanced prompt to improve the semantic expressiveness of category embeddings. Third, the Abnormal Visual-Guided Text Prompt (AVGTP) mechanism is proposed to aggregate anomalous visual context for adaptively refining textual representations. Extensive experiments on UCF-Crime and XD-Violence datasets show that WSVAD-CLIP notably outperforms existing methods in coarse-grained anomaly detection. It also achieves superior performance in fine-grained anomaly recognition tasks, validating its effectiveness and generalizability. Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)

20 pages, 34236 KB  
Article
ILD-Slider: A Parameter-Efficient Model for Identifying Progressive Fibrosing Interstitial Lung Disease from Chest CT Slices
by Jiahao Zhang, Shoya Wada, Kento Sugimoto, Takayuki Niitsu, Kiyoharu Fukushima, Hiroshi Kida, Bowen Wang, Shozo Konishi, Katsuki Okada, Yuta Nakashima and Toshihiro Takeda
J. Imaging 2025, 11(10), 353; https://doi.org/10.3390/jimaging11100353 - 9 Oct 2025
Abstract
Progressive Fibrosing Interstitial Lung Disease (PF-ILD) is a severe phenotype of Interstitial Lung Disease (ILD) with a poor prognosis, typically requiring prolonged clinical observation and multiple CT examinations for diagnosis. Such requirements delay early detection and treatment initiation. To enable earlier identification of PF-ILD, we propose ILD-Slider, a parameter-efficient and lightweight deep learning framework that enables accurate PF-ILD identification from a limited number of CT slices. ILD-Slider introduces anatomy-based position markers (PMs) to guide the selection of representative slices (RSs). A PM extractor, trained via a multi-class classification model, achieves high PM detection accuracy despite severe class imbalance by leveraging a peak slice mining (PSM)-based strategy. Using the PM extractor, we automatically select three, five, or nine RSs per case, substantially reducing computational cost while maintaining diagnostic accuracy. The selected RSs are then processed by a slice-level 3D Adapter (Slider) for PF-ILD identification. Experiments on 613 cases from The University of Osaka Hospital (UOH) and the National Hospital Organization Osaka Toneyama Medical Center (OTMC) demonstrate the effectiveness of ILD-Slider, achieving an AUPRC of 0.790 (AUROC 0.847) using only five automatically extracted RSs. ILD-Slider further validates the feasibility of diagnosing PF-ILD from non-contiguous slices, which is particularly valuable for real-world and public datasets where contiguous volumes are often unavailable. These results highlight ILD-Slider as a practical and efficient solution for early PF-ILD identification. Full article
(This article belongs to the Special Issue Advances in Medical Imaging and Machine Learning)

17 pages, 4166 KB  
Article
Non-Destructive Volume Estimation of Oranges for Factory Quality Control Using Computer Vision and Ensemble Machine Learning
by Wattanapong Kurdthongmee and Arsanchai Sukkuea
J. Imaging 2025, 11(10), 352; https://doi.org/10.3390/jimaging11100352 - 9 Oct 2025
Abstract
A crucial task in industrial quality control, especially in the food and agriculture sectors, is the quick and precise estimation of an object’s volume. This study combines cutting-edge machine learning and computer vision techniques to provide a comprehensive, non-destructive method for predicting orange volume. We created a reliable pipeline that employs top and side views of every orange to estimate four important dimensions using a calibrated marker. These dimensions are then fed into a machine learning model that has been fine-tuned. Our method uses a range of engineered features, such as complex surface-area-to-volume ratios and new shape-based descriptors, to go beyond basic geometric formulas. Based on a dataset of 150 unique oranges, we show that the Stacking Regressor performs significantly better than other single-model benchmarks, including the highly tuned LightGBM model, achieving an R2 score of 0.971. Because of its reliance on basic physical characteristics, the method is extremely resilient to the inherent variability in fruit and may be used with a variety of produce types. Because it allows for the real-time calculation of density (mass over volume) for automated defect detection and quality grading, this solution is directly applicable to a factory sorting environment. Full article
(This article belongs to the Topic Nondestructive Testing and Evaluation)
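A Stacking Regressor of the kind credited with the R² of 0.971 trains a meta-learner on out-of-fold predictions of several base models. A minimal scikit-learn sketch; the base models and synthetic data below are illustrative, not the paper's exact configuration:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (GradientBoostingRegressor,
                              RandomForestRegressor, StackingRegressor)
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

# Stand-in for the study's data: 150 oranges x 8 engineered shape features
# (diameters, surface-area-to-volume ratios, ...); target = reference volume.
X, y = make_regression(n_samples=150, n_features=8, noise=5.0, random_state=0)

stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=300, random_state=0)),
        ("gbr", GradientBoostingRegressor(random_state=0)),
    ],
    final_estimator=RidgeCV(),  # meta-learner fit on out-of-fold predictions
    cv=5,
)
print(cross_val_score(stack, X, y, scoring="r2", cv=5).mean())
```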

19 pages, 3520 KB  
Article
Multifactorial Imaging Analysis as a Platform for Studying Cellular Senescence Phenotypes
by Shatalova Rimma, Larin Ilya and Shevyrev Daniil
J. Imaging 2025, 11(10), 351; https://doi.org/10.3390/jimaging11100351 - 8 Oct 2025
Abstract
Cellular senescence is a heterogeneous and dynamic state characterised by stable proliferation arrest, macromolecular damage and metabolic remodelling. Although markers such as SA-β-galactosidase staining, γH2AX foci and p53 activation are widely used as de facto standards, they are imperfect and differ in terms of sensitivity, specificity and dependence on context. We present a multifactorial imaging platform integrating scanning electron microscopy, flow cytometry and high-resolution confocal microscopy. This allows us to identify senescence phenotypes in three in vitro models: replicative ageing via serial passaging; dose-graded genotoxic stress under serum deprivation; and primary fibroblasts from young and elderly donors. Within this multimodal framework, LysoTracker and MitoTracker microscopy and SA-β-gal/FACS characterise senescence-associated phenotypes, while p16INK4a immunostaining provides independent confirmation of proliferative arrest. Combined nutrient deprivation and genotoxic challenge elicited the most pronounced and concordant organelle alterations relative to single stressors, aligning with donor-age differences. Our approach integrates structural and functional readouts across modalities, reducing the impact of phenotypic heterogeneity and providing reproducible multiparametric endpoints. Although the framework focuses on a robustly validated panel of phenotypes, it is extensible by nature and sensitive to distributional shifts. This allows both drug-specific redistribution of established markers and the emergence of atypical or transient phenotypes to be detected. This flexibility renders the platform suitable for comparative studies and the screening of senolytics and geroprotectors, as well as for refining the evolving landscape of senescence-associated states. Full article
(This article belongs to the Section Image and Video Processing)

30 pages, 10084 KB  
Article
Automatic Visual Inspection for Industrial Application
by António Gouveia Ribeiro, Luís Vilaça, Carlos Costa, Tiago Soares da Costa and Pedro Miguel Carvalho
J. Imaging 2025, 11(10), 350; https://doi.org/10.3390/jimaging11100350 - 8 Oct 2025
Abstract
Quality control represents a critical function in industrial environments, ensuring that manufactured products meet strict standards and remain free from defects. In highly regulated sectors such as the pharmaceutical industry, traditional manual inspection methods remain widely used. However, these are time-consuming and prone to human error, and they lack the reliability required for large-scale operations, highlighting the urgent need for automated solutions. This is crucial for industrial applications, where environments evolve and new defect types can arise unpredictably. This work proposes an automated visual defect detection system specifically designed for pharmaceutical bottles, with potential applicability in other manufacturing domains. Various methods were integrated to create robust tools capable of real-world deployment. A key strategy is the use of incremental learning, which enables machine learning models to incorporate new, unseen data without full retraining, thus enabling adaptation to new defects as they appear, allowing models to handle rare cases while maintaining stability and performance. The proposed solution incorporates a multi-view inspection setup to capture images from multiple angles, enhancing accuracy and robustness. Evaluations in real-world industrial conditions demonstrated high defect detection rates, confirming the effectiveness of the proposed approach. Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
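Incremental learning, as used here, means updating a deployed model on new defect examples without retraining from scratch. One minimal pattern is scikit-learn's partial_fit interface; the sketch below is a generic illustration with random stand-in features, not the paper's model:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier(loss="log_loss", random_state=0)

# Initial batch: hypothetical 64-d image embeddings, 0 = good, 1 = defect.
X0 = np.random.rand(100, 64)
y0 = np.random.randint(0, 2, 100)
clf.partial_fit(X0, y0, classes=[0, 1])  # first call must fix the label set

# Later, as images of a previously unseen defect are collected:
X_new = np.random.rand(8, 64)
y_new = np.ones(8, dtype=int)
clf.partial_fit(X_new, y_new)            # update weights; no full retraining
```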

16 pages, 5738 KB  
Article
Image-Processing-Driven Modeling and Reconstruction of Traditional Patterns via Dual-Channel Detection and B-Spline Analysis
by Xuemei He, Siyi Chen, Yin Kuang and Xinyue Yang
J. Imaging 2025, 11(10), 349; https://doi.org/10.3390/jimaging11100349 - 7 Oct 2025
Abstract
This study aims to address the research gap in the digital analysis of traditional patterns by proposing an image-processing-driven parametric modeling method that combines graphic primitive function modeling with topological reconstruction. The image is processed using a dual-channel image processing algorithm (Canny edge detection and grayscale mapping) to extract and vectorize graphic primitives. These primitives are uniformly represented using B-spline curves, with variations generated through parametric control. A topological reconstruction approach is introduced, incorporating mapped geometric parameters, topological combination rules, and geometric adjustments to output topological configurations. The generated patterns are evaluated using fractal dimension analysis for complexity quantification and applied in cultural heritage imaging practice. The proposed image processing pipeline enables flexible parametric control and continuous structural integration of the graphic primitives and demonstrates high reproducibility and expandability. This study establishes a novel computational framework for traditional patterns, offering a replicable technical pathway that integrates image processing, parametric modeling, and topological reconstruction for digital expression, stylistic innovation, and heritage conservation. Full article
(This article belongs to the Section Computational Imaging and Computational Photography)
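The extract-and-vectorize step described here, edge detection followed by B-spline fitting of a primitive's contour, can be sketched with OpenCV and SciPy; the file name, thresholds, and smoothing factor are illustrative:

```python
import cv2
import numpy as np
from scipy.interpolate import splev, splprep

img = cv2.imread("pattern.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input
edges = cv2.Canny(img, 100, 200)                       # edge channel

contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_NONE)
pts = max(contours, key=cv2.contourArea).squeeze().T.astype(float)  # (2, N)

# Fit a closed B-spline to the primitive's contour; the knot vector and
# coefficients in `tck` become the parametric form that can then be varied.
tck, _ = splprep(pts, s=5.0, per=True)
x, y = splev(np.linspace(0, 1, 400), tck)  # resampled smooth curve
```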

23 pages, 5437 KB  
Article
Hierarchical Deep Learning for Abnormality Classification in Mouse Skeleton Using Multiview X-Ray Images: Convolutional Autoencoders Versus ConvNeXt
by Muhammad M. Jawaid, Rasneer S. Bains, Sara Wells and James M. Brown
J. Imaging 2025, 11(10), 348; https://doi.org/10.3390/jimaging11100348 - 7 Oct 2025
Abstract
Single-view-based anomaly detection approaches present challenges due to the lack of context, particularly for multi-label problems. In this work, we demonstrate the efficacy of using multiview image data for improved classification using a hierarchical learning approach. Using 170,958 images from the International Mouse Phenotyping Consortium (IMPC) repository, a specimen-wise multiview dataset comprising 54,046 specimens was curated. Next, two hierarchical classification frameworks were developed by customizing ConvNeXT and a convolutional autoencoder (CAE) as CNN backbones, respectively. The customized architectures were trained at three hierarchy levels with increasing anatomical granularity, enabling specialized layers to learn progressively more detailed features. At the top level (L1), multiview (MV) classification performed about the same as single views, with a high mean AUC of 0.95. However, using MV images in the hierarchical model greatly improved classification at levels 2 and 3. The model showed consistently higher average AUC scores with MV compared to single views such as dorsoventral or lateral. For example, at Level 2 (L2), the model divided abnormal cases into three subclasses, achieving AUCs of 0.65 for DV, 0.76 for LV, and 0.87 for MV. Then, at Level 3 (L3), it further divided these into ten specific abnormalities, with AUCs of 0.54 for DV, 0.59 for LV, and 0.82 for MV. A similar performance was achieved by the CAE-driven architecture, with mean AUCs of 0.87, 0.88, and 0.89 at Level 2 (L2) and 0.74, 0.78, and 0.81 at Level 3 (L3), respectively, for DV, LV, and MV views. The overall results demonstrate the advantage of multiview image data coupled with hierarchical learning for skeletal abnormality detection in a multi-label context. Full article
(This article belongs to the Section Medical Imaging)

12 pages, 9239 KB  
Article
Effects of Motion in Ultrashort Echo Time Quantitative Susceptibility Mapping for Musculoskeletal Imaging
by Sam Sedaghat, Jinil Park, Eddie Fu, Fang Liu, Youngkyoo Jung and Hyungseok Jang
J. Imaging 2025, 11(10), 347; https://doi.org/10.3390/jimaging11100347 - 6 Oct 2025
Abstract
Quantitative susceptibility mapping (QSM) is a powerful magnetic resonance imaging (MRI) technique for assessing tissue composition in the human body. For imaging short-T2 tissues in the musculoskeletal (MSK) system, ultrashort echo time (UTE) imaging plays a key role. However, UTE-based QSM (UTE-QSM) often involves repeated acquisitions, making it vulnerable to inter-scan motion. In this study, we investigate the effects of motion on UTE-QSM and introduce strategies to reduce motion-induced artifacts. Eight healthy male volunteers underwent UTE-QSM imaging of the knee joint, while an additional seven participated in imaging of the ankle joint. UTE-QSM was conducted using multiple echo acquisitions, including both UTE and gradient-recalled echoes, and processed using the iterative decomposition of water and fat with echo asymmetry and least-squares estimation (IDEAL) and morphology-enabled dipole inversion (MEDI) algorithms. To assess the impact of motion, datasets were reconstructed both with and without motion correction. Furthermore, we evaluated a two-step UTE-QSM approach that incorporates tissue boundary information. This method applies edge detection, excludes pixels near detected edges, and performs a two-step QSM reconstruction to reduce motion-induced streaking artifacts. In participants exhibiting substantial inter-scan motion, prominent streaking artifacts were evident. Applying motion registration markedly reduced these artifacts in both knee and ankle UTE-QSM. Additionally, the two-step UTE-QSM approach, which integrates tissue boundary information, further enhanced image quality by mitigating residual streaking artifacts. These results indicate that motion-induced errors near tissue boundaries play a key role in generating streaking artifacts in UTE-QSM. Inter-scan motion poses a fundamental challenge in UTE-QSM due to the need for multiple acquisitions. However, applying motion registration along with a two-step QSM approach that excludes tissue boundaries can effectively suppress motion-induced streaking artifacts, thereby improving the accuracy of musculoskeletal tissue characterization. Full article
(This article belongs to the Section Medical Imaging)

15 pages, 2364 KB  
Article
Optimized Lung Nodule Classification Using CLAHE-Enhanced CT Imaging and Swin Transformer-Based Deep Feature Extraction
by Dorsaf Hrizi, Khaoula Tbarki and Sadok Elasmi
J. Imaging 2025, 11(10), 346; https://doi.org/10.3390/jimaging11100346 - 4 Oct 2025
Abstract
Lung cancer remains one of the most lethal cancers globally. Its early detection is vital to improving survival rates. In this work, we propose a hybrid computer-aided diagnosis (CAD) pipeline for lung cancer classification using Computed Tomography (CT) scan images. The proposed CAD pipeline integrates ten image preprocessing techniques, ten pretrained deep learning models for feature extraction (including convolutional neural networks and transformer-based architectures), and four classical machine learning classifiers. Unlike traditional end-to-end deep learning systems, our approach decouples feature extraction from classification, enhancing interpretability and reducing the risk of overfitting. A total of 400 model configurations were evaluated to identify the optimal combination. The proposed approach was evaluated on the publicly available Lung Image Database Consortium and Image Database Resource Initiative dataset, which comprises 1018 thoracic CT scans annotated by four thoracic radiologists. For the classification task, the dataset included a total of 6568 images labeled as malignant and 4849 images labeled as benign. Experimental results show that the best-performing pipeline, combining Contrast Limited Adaptive Histogram Equalization, Swin Transformer feature extraction, and eXtreme Gradient Boosting, achieved an accuracy of 95.8%. Full article
(This article belongs to the Special Issue Advancements in Imaging Techniques for Detection of Cancer)
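The winning preprocessing step, CLAHE, equalizes histograms within small tiles while capping contrast amplification with a clip limit. In OpenCV, with illustrative parameter values rather than the paper's tuned settings:

```python
import cv2

ct_slice = cv2.imread("ct_slice.png", cv2.IMREAD_GRAYSCALE)  # uint8 input

# Contrast Limited Adaptive Histogram Equalization: per-tile equalization
# with amplification capped by clipLimit to avoid over-amplifying noise.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(ct_slice)
cv2.imwrite("ct_slice_clahe.png", enhanced)
```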

24 pages, 14242 KB  
Article
DBA-YOLO: A Dense Target Detection Model Based on Lightweight Neural Networks
by Zhiyong He, Jiahong Yang, Hongtian Ning, Chengxuan Li and Qiang Tang
J. Imaging 2025, 11(10), 345; https://doi.org/10.3390/jimaging11100345 - 4 Oct 2025
Abstract
Current deep learning-based dense target detection models face dual challenges in industrial scenarios: high computational complexity leading to insufficient inference efficiency on mobile devices, and missed/false detections caused by dense small targets, high inter-class similarity, and complex background interference. To address these issues, this paper proposes DBA-YOLO, a lightweight model based on YOLOv10, which significantly reduces computational complexity through model compression and algorithm optimization while maintaining high accuracy. Key improvements include the following: (1) a C2f PA module for enhanced feature extraction, (2) a parameter-refined BIMAFPN neck structure to improve small target detection, and (3) a DyDHead module integrating scale, space, and task awareness for spatial feature weighting. To validate DBA-YOLO, we constructed a real-world dataset from cigarette package images. Experiments on SKU-110K and our dataset show that DBA-YOLO achieves 91.3% detection accuracy (1.4% higher than baseline), with mAP and mAP75 improvements of 2–3%. Additionally, the model reduces parameters by 3.6%, balancing efficiency and performance for resource-constrained devices. Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)

36 pages, 462 KB  
Article
No Reproducibility, No Progress: Rethinking CT Benchmarking
by Dmitry Polevoy, Danil Kazimirov, Marat Gilmanov and Dmitry Nikolaev
J. Imaging 2025, 11(10), 344; https://doi.org/10.3390/jimaging11100344 - 2 Oct 2025
Abstract
Reproducibility is a cornerstone of scientific progress, yet in X-ray computed tomography (CT) reconstruction, it remains a critical and unresolved challenge. Current benchmarking practices in CT are hampered by the scarcity of openly available datasets, the incomplete or task-specific nature of existing resources, and the lack of transparent implementations of widely used methods and evaluation metrics. As a result, even the fundamental property of reproducibility is frequently violated, undermining objective comparison and slowing methodological progress. In this work, we analyze the systemic limitations of current CT benchmarking, drawing parallels with broader reproducibility issues across scientific domains. We propose an extended data model and formalized schemes for data preparation and quality assessment, designed to improve reproducibility and broaden the applicability of CT datasets across multiple tasks. Building on these schemes, we introduce checklists for dataset construction and quality assessment, offering a foundation for reliable and reproducible benchmarking pipelines. A key aspect of our recommendations is the integration of virtual CT (vCT), which provides highly realistic data and analytically computable phantoms, yet remains underutilized despite its potential to overcome many current barriers. Our work represents a first step toward a methodological framework for reproducible benchmarking in CT. This framework aims to enable transparent, rigorous, and comparable evaluation of reconstruction methods, ultimately supporting their reliable adoption in clinical and industrial applications. Full article
(This article belongs to the Special Issue Tools and Techniques for Improving Radiological Imaging Applications)

22 pages, 4682 KB  
Article
Development of a Fully Optimized Convolutional Neural Network for Astrocytoma Classification in MRI Using Explainable Artificial Intelligence
by Christos Ch. Andrianos, Spiros A. Kostopoulos, Ioannis K. Kalatzis, Dimitris Th. Glotsos, Pantelis A. Asvestas, Dionisis A. Cavouras and Emmanouil I. Athanasiadis
J. Imaging 2025, 11(10), 343; https://doi.org/10.3390/jimaging11100343 - 2 Oct 2025
Abstract
Astrocytoma is the most common type of brain glioma and is classified by the World Health Organization into four grades, providing prognostic insights and guiding treatment decisions. The accurate determination of astrocytoma grade is critical for patient management, especially in high-malignancy-grade cases. This study proposes a fully optimized Convolutional Neural Network (CNN) for the classification of astrocytoma MRI slices across the three malignant grades (G2–4). The training dataset consisted of 1284 pre-operative axial 2D MRI slices from T1-weighted contrast-enhanced and FLAIR sequences derived from 69 patients. To obtain the best possible model performance, extensive hyperparameter tuning was carried out with the Hyperband method, a variant of Successive Halving. Training was conducted using Repeated Hold-Out Validation across four randomized data splits, achieving a mean classification accuracy of 98.05%, low loss values, and an AUC of 0.997. Comparative evaluation against state-of-the-art pre-trained models using transfer learning demonstrated superior performance. For validation purposes, the proposed CNN trained on an altered version of the training set yielded 93.34% accuracy on unmodified slices, which confirms the model's robustness and potential for clinical deployment. Model interpretability was ensured through the application of two Explainable AI (XAI) techniques, SHAP and LIME, which highlighted the regions of the slices contributing to the decision-making process. Full article
(This article belongs to the Section Medical Imaging)
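As a hedged sketch of the Hyperband search mentioned above, the following Python code uses the KerasTuner library; the architecture, search space, and input shape are illustrative assumptions, not the study's actual configuration.

# Minimal Hyperband tuning sketch with KerasTuner; every hyperparameter
# range below is an assumption for demonstration purposes.
import keras
import keras_tuner

def build_model(hp):
    model = keras.Sequential([
        keras.layers.Input(shape=(224, 224, 1)),  # assumed slice size
        keras.layers.Conv2D(hp.Int("filters", 16, 64, step=16), 3, activation="relu"),
        keras.layers.MaxPooling2D(),
        keras.layers.Flatten(),
        keras.layers.Dense(hp.Int("units", 64, 256, step=64), activation="relu"),
        keras.layers.Dropout(hp.Float("dropout", 0.2, 0.5, step=0.1)),
        keras.layers.Dense(3, activation="softmax"),  # three malignant grades (G2-4)
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(hp.Choice("lr", [1e-2, 1e-3, 1e-4])),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

# Hyperband allocates training epochs adaptively, discarding weak
# configurations early (the Successive Halving idea the abstract refers to).
tuner = keras_tuner.Hyperband(build_model, objective="val_accuracy",
                              max_epochs=30, factor=3)
# tuner.search(x_train, y_train, validation_data=(x_val, y_val))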
15 pages, 2112 KB  
Article
Radiomics-Based Preoperative Assessment of Muscle-Invasive Bladder Cancer Using Combined T2 and ADC MRI: A Multicohort Validation Study
by Dmitry Kabanov, Natalia Rubtsova, Aleksandra Golbits, Andrey Kaprin, Valentin Sinitsyn and Mikhail Potievskiy
J. Imaging 2025, 11(10), 342; https://doi.org/10.3390/jimaging11100342 - 1 Oct 2025
Abstract
Accurate preoperative staging of bladder cancer on MRI remains challenging because visual reads vary across observers. We investigated a multiparametric MRI (mpMRI) radiomics approach to predict muscle invasion (≥T2) and prospectively tested it on a validation cohort. Eighty-four patients with urothelial carcinoma underwent [...] Read more.
Accurate preoperative staging of bladder cancer on MRI remains challenging because visual reads vary across observers. We investigated a multiparametric MRI (mpMRI) radiomics approach to predict muscle invasion (≥T2) and prospectively tested it on a validation cohort. Eighty-four patients with urothelial carcinoma underwent 1.5-T mpMRI per VI-RADS (T2-weighted imaging and DWI-derived ADC maps). Two blinded radiologists performed 3D tumor segmentation; 37 features per sequence were extracted (LifeX) using absolute resampling. In the training cohort (n = 40), features that differed between non-muscle-invasive and muscle-invasive tumors (Mann–Whitney p < 0.05) underwent ROC analysis, with cut-offs defined by the Youden index. A compact descriptor combining GLRLM-LRLGE from T2 and GLRLM-SRLGE from ADC was then fixed and applied without re-selection to a prospective validation cohort (n = 44). Histopathology within 6 weeks (TURBT or cystectomy) served as the reference standard. Eleven T2-based and fifteen ADC-based features were associated with muscle invasion; DWI texture features were not informative. The descriptor yielded AUCs of 0.934 (training) and 0.871 (validation), with 85.7% sensitivity and 96.2% specificity in validation. Collectively, these findings indicate that combined T2/ADC radiomics can provide high diagnostic accuracy and may serve as a useful decision-support tool, pending multicenter, multi-vendor validation. Full article
(This article belongs to the Topic Machine Learning and Deep Learning in Medical Imaging)
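The Youden-index cut-off selection described above can be sketched with scikit-learn as follows; the labels and feature values are synthetic placeholders, not study data.

# ROC analysis with a Youden-index cut-off, as applied per radiomic feature;
# synthetic data stand in for the actual GLRLM features.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, 100)             # 0 = non-muscle-invasive, 1 = muscle-invasive
feature = labels + rng.normal(0, 0.8, 100)   # placeholder radiomic feature values

fpr, tpr, thresholds = roc_curve(labels, feature)
youden_j = tpr - fpr                         # J = sensitivity + specificity - 1
cutoff = thresholds[np.argmax(youden_j)]
print(f"AUC = {roc_auc_score(labels, feature):.3f}, cut-off = {cutoff:.3f}")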
19 pages, 7222 KB  
Article
Multi-Channel Spectro-Temporal Representations for Speech-Based Parkinson’s Disease Detection
by Hadi Sedigh Malekroodi, Nuwan Madusanka, Byeong-il Lee and Myunggi Yi
J. Imaging 2025, 11(10), 341; https://doi.org/10.3390/jimaging11100341 - 1 Oct 2025
Abstract
Early, non-invasive detection of Parkinson’s Disease (PD) using speech analysis offers promise for scalable screening. In this work, we propose a multi-channel spectro-temporal deep-learning approach for PD detection from sentence-level speech, a clinically relevant yet underexplored modality. We extract and fuse three complementary [...] Read more.
Early, non-invasive detection of Parkinson’s Disease (PD) using speech analysis offers promise for scalable screening. In this work, we propose a multi-channel spectro-temporal deep-learning approach for PD detection from sentence-level speech, a clinically relevant yet underexplored modality. We extract and fuse three complementary time–frequency representations—mel spectrogram, constant-Q transform (CQT), and gammatone spectrogram—into a three-channel input analogous to an RGB image. This fused representation is evaluated across CNNs (ResNet, DenseNet, and EfficientNet) and a Vision Transformer on the PC-GITA dataset, under 10-fold subject-independent cross-validation for robust assessment. Results show that fusion consistently improves performance over single representations across all architectures. EfficientNet-B2 achieves the highest accuracy (84.39% ± 5.19%) and F1-score (84.35% ± 5.52%), outperforming recent methods based on handcrafted features or pretrained models (e.g., Wav2Vec2.0, HuBERT) on the same task and dataset. Performance varies with sentence type: emotionally salient and prosodically emphasized utterances yield higher AUC, suggesting that richer prosody enhances discriminability. Our findings indicate that multi-channel fusion enhances sensitivity to subtle speech impairments in PD by integrating complementary spectral information, offering a robust framework for detecting discriminative acoustic biomarkers, though further validation is needed before clinical application. Full article
(This article belongs to the Special Issue Celebrating the 10th Anniversary of the Journal of Imaging)
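A minimal sketch of the three-channel fusion idea follows, assuming librosa for the mel and CQT channels; the gammatone channel is stubbed with a copy of the mel spectrogram, and none of the parameters reproduce the study's configuration.

# Fuse two librosa time-frequency representations plus a stand-in third
# channel into a 3-channel array analogous to an RGB image.
import numpy as np
import librosa

sr = 22050
y = librosa.tone(440, sr=sr, duration=3.0)   # placeholder audio instead of speech

mel = librosa.power_to_db(librosa.feature.melspectrogram(y=y, sr=sr, n_mels=64))
cqt = librosa.amplitude_to_db(np.abs(librosa.cqt(y, sr=sr, n_bins=64)))
gamma = mel.copy()                           # stand-in for a gammatone spectrogram

def norm_fix(spec, shape=(64, 128)):
    # Min-max normalize, then crop/zero-pad to a common shape.
    spec = (spec - spec.min()) / (spec.max() - spec.min() + 1e-8)
    out = np.zeros(shape, dtype=np.float32)
    h, w = min(shape[0], spec.shape[0]), min(shape[1], spec.shape[1])
    out[:h, :w] = spec[:h, :w]
    return out

fused = np.stack([norm_fix(s) for s in (mel, cqt, gamma)], axis=-1)
print(fused.shape)                           # (64, 128, 3)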
32 pages, 9105 KB  
Article
Development of Semi-Automatic Dental Image Segmentation Workflows with Root Canal Recognition for Faster Ground Tooth Acquisition
by Yousef Abo El Ela and Mohamed Badran
J. Imaging 2025, 11(10), 340; https://doi.org/10.3390/jimaging11100340 - 1 Oct 2025
Abstract
This paper investigates the application of image segmentation techniques in endodontics, focusing on improving diagnostic accuracy and achieving faster segmentation by delineating specific dental regions such as teeth and root canals. Deep learning architectures, notably 3D U-Net and GANs, have advanced the image [...] Read more.
This paper investigates the application of image segmentation techniques in endodontics, focusing on improving diagnostic accuracy and achieving faster segmentation by delineating specific dental regions such as teeth and root canals. Deep learning architectures, notably 3D U-Net and GANs, have advanced the image segmentation process for dental structures, supporting more precise dental procedures. However, challenges such as the demand for extensive labeled datasets and the need to ensure model generalizability remain. Two semi-automatic segmentation workflows, Grow From Seeds (GFS) and Watershed (WS), were developed in 3D Slicer (version 5.8.1) to speed up the acquisition of ground-truth training data for deep learning models. These workflows were evaluated against a manual segmentation benchmark and a recent automated dental segmentation tool on three separate datasets. Evaluations were performed on the overall shapes of a maxillary central incisor and a maxillary second molar, and on the root canal regions of both teeth. Results from Kruskal–Wallis and Nemenyi tests indicated that, in most cases, the semi-automated workflows were not statistically different from the manual benchmark in terms of Dice similarity coefficients, whereas the automated method consistently produced 3D models that differed significantly from their manual counterparts. The study also examines the labor reduction and time savings achieved by the semi-automated methods. Full article
(This article belongs to the Section Image and Video Processing)
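The reported statistical comparison can be sketched as follows, assuming SciPy for the Kruskal-Wallis test and the scikit-posthocs package for the Nemenyi post hoc test; the Dice scores below are synthetic placeholders.

# Kruskal-Wallis across segmentation methods, followed by a Nemenyi
# post hoc test; all Dice scores are fabricated for illustration.
import numpy as np
from scipy.stats import kruskal
import scikit_posthocs as sp

rng = np.random.default_rng(1)
manual = rng.normal(0.95, 0.02, 15)   # manual benchmark
gfs    = rng.normal(0.94, 0.02, 15)   # Grow From Seeds workflow
ws     = rng.normal(0.93, 0.03, 15)   # Watershed workflow
auto   = rng.normal(0.85, 0.04, 15)   # automated tool

h_stat, p_value = kruskal(manual, gfs, ws, auto)
print(f"Kruskal-Wallis: H = {h_stat:.2f}, p = {p_value:.4f}")
print(sp.posthoc_nemenyi([manual, gfs, ws, auto]))   # pairwise p-value matrix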
3 pages, 136 KB  
Editorial
Editorial on the Special Issue “Geometry Reconstruction from Images (2nd Edition)”
by Daniel Meneveaux
J. Imaging 2025, 11(10), 339; https://doi.org/10.3390/jimaging11100339 - 30 Sep 2025
Abstract
In recent decades, research has produced impressive methods for recovering geometric information from real objects [...] Full article
(This article belongs to the Special Issue Geometry Reconstruction from Images (2nd Edition))
14 pages, 3002 KB  
Communication
Interpretability of Deep High-Frequency Residuals: A Case Study on SAR Splicing Localization
by Edoardo Daniele Cannas, Sara Mandelli, Paolo Bestagini and Stefano Tubaro
J. Imaging 2025, 11(10), 338; https://doi.org/10.3390/jimaging11100338 - 28 Sep 2025
Abstract
Multimedia Forensics (MMF) investigates techniques to automatically assess the integrity of multimedia content, e.g., images, videos, or audio clips. Data-driven methodologies like Neural Networks (NNs) represent the state of the art in the field. Despite their efficacy, NNs are often considered “black boxes” [...] Read more.
Multimedia Forensics (MMF) investigates techniques to automatically assess the integrity of multimedia content, e.g., images, videos, or audio clips. Data-driven methodologies like Neural Networks (NNs) represent the state of the art in the field. Despite their efficacy, NNs are often considered “black boxes” due to their lack of transparency, which limits their usage in critical applications. In this work, we assess the interpretability properties of Deep High-Frequency Residuals (DHFRs), i.e., noise residuals extracted from images by NNs for forensic purposes, which nowadays represent a powerful tool for image splicing localization. Our research demonstrates that DHFRs not only serve as a visual aid in identifying manipulated regions of an image but also reveal the nature of the editing techniques used to tamper with the sample under analysis. Through extensive experimentation on spliced amplitude Synthetic Aperture Radar (SAR) images, we establish a correlation between the appearance of DHFRs in the tampered regions and their high-frequency energy content. Our findings suggest that, despite their deep-learning nature, DHFRs possess significant interpretability properties, encouraging further exploration in other forensic applications. Full article
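As an illustrative sketch of the high-frequency energy measurement that the correlation above relies on, one can compute the fraction of a residual's spectral energy beyond a radial frequency cut-off; the residual and the cut-off value are placeholders, not the authors' exact procedure.

# Fraction of 2D spectral energy above a normalized radial frequency;
# a random array stands in for an actual DHFR noise residual.
import numpy as np

def high_freq_energy_ratio(residual, cutoff=0.25):
    spectrum = np.fft.fftshift(np.fft.fft2(residual))
    energy = np.abs(spectrum) ** 2
    h, w = residual.shape
    yy, xx = np.mgrid[-(h // 2):h - h // 2, -(w // 2):w - w // 2]
    radius = np.sqrt((yy / (h / 2)) ** 2 + (xx / (w / 2)) ** 2)
    return float(energy[radius > cutoff].sum() / energy.sum())

residual = np.random.default_rng(2).normal(size=(128, 128))  # placeholder
print(f"high-frequency energy ratio: {high_freq_energy_ratio(residual):.3f}")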