J. Imaging, Volume 11, Issue 6 (June 2025) – 21 articles

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the tables of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view the papers in PDF format, click on the "PDF Full-text" link, and use the free Adobe Reader to open them.
21 pages, 1578 KiB  
Article
SADiff: Coronary Artery Segmentation in CT Angiography Using Spatial Attention and Diffusion Model
by Ruoxuan Xu, Longhui Dai, Jianru Wang, Lei Zhang and Yuanquan Wang
J. Imaging 2025, 11(6), 192; https://doi.org/10.3390/jimaging11060192 - 11 Jun 2025
Abstract
Coronary artery disease (CAD) is a highly prevalent cardiovascular disease and one of the leading causes of death worldwide. The accurate segmentation of coronary arteries from CT angiography (CTA) images is essential for the diagnosis and treatment of CAD. However, due to small vessel diameters, large morphological variations, low contrast, and motion artifacts, conventional segmentation methods, including classical image processing (such as region growing and level sets) and early deep learning models with limited receptive fields, are unsatisfactory. We propose SADiff, a hybrid framework that integrates a dilated attention network (DAN) for ROI extraction, a diffusion-based subnet for noise suppression in low-contrast regions, and a striped attention network (SAN) to refine tubular structures affected by morphological variations. Experiments on the public ImageCAS dataset show that SADiff achieves a Dice score of 83.48% (6.57% higher than U-Net3D) and a Hausdorff distance of 19.43 mm. Cross-dataset validation on the private ImageLaPP dataset verifies its generalizability with a Dice score of 79.42%. This comprehensive evaluation demonstrates that SADiff provides a more efficient and versatile method for coronary segmentation and shows great potential for improving the diagnosis and treatment of CAD.
(This article belongs to the Section Computer Vision and Pattern Recognition)
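For orientation, the two headline metrics can be computed as below; this is a minimal NumPy/SciPy sketch for binary masks, not the authors' implementation (spacing-aware surface distances would be needed to report millimetres exactly).

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def dice_score(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for boolean masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    return 2.0 * np.logical_and(pred, gt).sum() / (pred.sum() + gt.sum())

def hausdorff(pred: np.ndarray, gt: np.ndarray) -> float:
    """Symmetric Hausdorff distance between foreground point sets, in voxels;
    multiply by the voxel spacing to convert to millimetres."""
    p, g = np.argwhere(pred), np.argwhere(gt)
    return max(directed_hausdorff(p, g)[0], directed_hausdorff(g, p)[0])
```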

17 pages, 1267 KiB  
Article
Prediction of PD-L1 and CD68 in Clear Cell Renal Cell Carcinoma with Green Learning
by Yixing Wu, Alexander Shieh, Steven Cen, Darryl Hwang, Xiaomeng Lei, S. J. Pawan, Manju Aron, Inderbir Gill, William D. Wallace, C.-C. Jay Kuo and Vinay Duddalwar
J. Imaging 2025, 11(6), 191; https://doi.org/10.3390/jimaging11060191 - 10 Jun 2025
Abstract
Clear cell renal cell carcinoma (ccRCC) is the most common type of renal cancer. Extensive efforts have been made to utilize radiomics from computed tomography (CT) imaging to predict tumor immune microenvironment (TIME) measurements. This study proposes a Green Learning (GL) framework for approximating tissue-based biomarkers from CT scans, focusing on the PD-L1 expression and CD68 tumor-associated macrophages (TAMs) in ccRCC. Our approach includes radiomic feature extraction, redundancy removal, and supervised feature selection through a discriminant feature test (DFT), a relevant feature test (RFT), and least-squares normal transform (LNT) for robust feature generation. For the PD-L1 expression in 52 ccRCC patients, treated as a regression problem, our GL model achieved a 5-fold cross-validated mean squared error (MSE) of 0.0041 and a Mean Absolute Error (MAE) of 0.0346. For the TAM population (CD68+/PanCK+), analyzed in 78 ccRCC patients as a binary classification task (at a 0.4 threshold), the model reached a 10-fold cross-validated Area Under the Receiver Operating Characteristic (AUROC) of 0.85 (95% CI [0.76, 0.93]) using 10 LNT-derived features, improving upon the previous benchmark of 0.81. This study demonstrates the potential of GL in radiomic analyses, offering a scalable, efficient, and interpretable framework for the non-invasive approximation of key biomarkers.
(This article belongs to the Special Issue Imaging in Healthcare: Progress and Challenges)
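The evaluation protocol (k-fold cross-validated MSE/MAE on a small radiomic cohort) looks roughly like this scikit-learn sketch; the Green Learning feature pipeline itself is not reproduced here, and the Ridge regressor and random features are placeholders.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_validate

X = np.random.rand(52, 30)   # placeholder radiomic features for 52 patients
y = np.random.rand(52)       # placeholder PD-L1 expression values

scores = cross_validate(Ridge(alpha=1.0), X, y, cv=5,
                        scoring=("neg_mean_squared_error",
                                 "neg_mean_absolute_error"))
print("CV MSE:", -scores["test_neg_mean_squared_error"].mean())
print("CV MAE:", -scores["test_neg_mean_absolute_error"].mean())
```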

23 pages, 4896 KiB  
Article
Insulator Surface Defect Detection Method Based on Graph Feature Diffusion Distillation
by Shucai Li, Na Zhang, Gang Yang, Yannong Hou and Xingzhong Zhang
J. Imaging 2025, 11(6), 190; https://doi.org/10.3390/jimaging11060190 - 10 Jun 2025
Abstract
Power insulator surfaces present three difficulties for defect detection: defect samples are scarce, defect morphology is irregular, and pixel-level localization accuracy is often insufficient. This paper proposes a defect detection method based on graph feature diffusion distillation, named GFDD. The feature bias problem is alleviated by constructing a dual-division teacher architecture with graph feature consistency constraints, while a cross-layer feature fusion module dynamically aggregates multi-scale information to reduce redundancy. A diffusion distillation mechanism breaks through the traditional single-layer feature transfer limitation, fusing deep semantics and shallow details through channel attention to enhance global context modeling. On a self-built dataset, GFDD achieves 96.6% pixel-level AUROC (Pi.AUROC), 97.7% image-level AUROC (Im.AUROC), and a 95.1% F1-score, 2.4–3.2% higher than the best existing methods; it also maintains strong generalization and robustness across multiple public datasets. The method provides a high-precision solution for the automated inspection of insulator surface defects and has practical engineering value.
(This article belongs to the Special Issue Self-Supervised Learning for Image Processing and Analysis)
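The two AUROC figures are standard anomaly detection metrics; assuming per-pixel anomaly maps, they can be computed as in this sketch (max-pooling pixel scores into an image score), which is illustrative rather than the paper's code.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
anomaly_maps = rng.random((8, 64, 64))             # per-pixel anomaly scores
gt_masks = np.zeros((8, 64, 64), dtype=int)
gt_masks[:4] = (rng.random((4, 64, 64)) > 0.95).astype(int)  # 4 defective images

pi_auroc = roc_auc_score(gt_masks.ravel(), anomaly_maps.ravel())  # pixel level
img_labels = gt_masks.reshape(8, -1).max(axis=1)   # does the image have any defect?
img_scores = anomaly_maps.reshape(8, -1).max(axis=1)
im_auroc = roc_auc_score(img_labels, img_scores)   # image level
```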

31 pages, 55513 KiB  
Article
SAM for Road Object Segmentation: Promising but Challenging
by Alaa Atallah Almazroey, Salma Kammoun Jarraya and Reem Alnanih
J. Imaging 2025, 11(6), 189; https://doi.org/10.3390/jimaging11060189 - 10 Jun 2025
Abstract
Road object segmentation is crucial for autonomous driving, as it enables vehicles to perceive their surroundings. While deep learning models show promise, their generalization across diverse road conditions, weather variations, and lighting changes remains challenging. Different approaches have been proposed to address this limitation. However, these models often struggle with the varying appearance of road objects under diverse environmental conditions. Foundation models such as the Segment Anything Model (SAM) offer a potential avenue for improved generalization in complex visual tasks. Thus, this study presents a pioneering comprehensive evaluation of the SAM for zero-shot road object segmentation, without explicit prompts. This study aimed to determine the inherent capabilities and limitations of the SAM in accurately segmenting a variety of road objects under the diverse and challenging environmental conditions encountered in real-world autonomous driving scenarios. We assessed the SAM’s performance on the KITTI, BDD100K, and Mapillary Vistas datasets, encompassing a wide range of environmental conditions. Using a variety of established evaluation metrics, our analysis revealed the SAM’s capabilities and limitations in accurately segmenting various road objects, particularly highlighting challenges posed by dynamic environments, illumination changes, and occlusions. These findings provide valuable insights for researchers and developers seeking to enhance the robustness of foundation models such as the SAM in complex road environments, guiding future efforts to improve perception systems for autonomous driving.
(This article belongs to the Section Computer Vision and Pattern Recognition)
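Zero-shot masks such as the SAM's are typically scored against ground truth with intersection-over-union; a minimal sketch:

```python
import numpy as np

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """IoU = |A ∩ B| / |A ∪ B| for boolean masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    return np.logical_and(pred, gt).sum() / union if union else 1.0

def best_match(pred_masks, gt) -> float:
    """Score a ground-truth object by its best-matching predicted mask."""
    return max((iou(p, gt) for p in pred_masks), default=0.0)
```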

25 pages, 3449 KiB  
Article
CSANet: Context–Spatial Awareness Network for RGB-T Urban Scene Understanding
by Ruixiang Li, Zhen Wang, Jianxin Guo and Chuanlei Zhang
J. Imaging 2025, 11(6), 188; https://doi.org/10.3390/jimaging11060188 - 9 Jun 2025
Abstract
Semantic segmentation plays a critical role in understanding complex urban environments, particularly for autonomous driving applications. However, existing approaches face significant challenges under low-light and adverse weather conditions. To address these limitations, we propose CSANet (Context Spatial Awareness Network), a novel framework that effectively integrates RGB and thermal infrared (TIR) modalities. CSANet employs an efficient encoder to extract complementary local and global features, while a hierarchical fusion strategy is adopted to selectively integrate visual and semantic information. Notably, the Channel–Spatial Cross-Fusion Module (CSCFM) enhances local details by fusing multi-modal features, and the Multi-Head Fusion Module (MHFM) captures global dependencies and calibrates multi-modal information. Furthermore, the Spatial Coordinate Attention Mechanism (SCAM) improves object localization accuracy in complex urban scenes. Evaluations on benchmark datasets (MFNet and PST900) demonstrate that CSANet achieves state-of-the-art performance, significantly advancing RGB-T semantic segmentation.
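A generic channel-attention fusion of RGB and thermal feature maps, sketched in PyTorch to illustrate the pattern; this is a simplified stand-in, not the paper's CSCFM/MHFM modules.

```python
import torch
import torch.nn as nn

class ChannelFusion(nn.Module):
    """Squeeze-and-excitation-style gating over concatenated RGB-T features."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(2 * channels, 2 * channels // reduction), nn.ReLU(),
            nn.Linear(2 * channels // reduction, 2 * channels), nn.Sigmoid())
        self.proj = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, rgb: torch.Tensor, tir: torch.Tensor) -> torch.Tensor:
        x = torch.cat([rgb, tir], dim=1)               # (B, 2C, H, W)
        w = self.gate(x)[:, :, None, None]             # per-channel weights
        return self.proj(x * w)                        # fused (B, C, H, W)

fused = ChannelFusion(64)(torch.rand(1, 64, 30, 40), torch.rand(1, 64, 30, 40))
```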

19 pages, 5272 KiB  
Article
Biomechanics of Spiral Fractures: Investigating Periosteal Effects Using Digital Image Correlation
by Ghaidaa A. Khalid, Ali Al-Naji and Javaan Chahl
J. Imaging 2025, 11(6), 187; https://doi.org/10.3390/jimaging11060187 - 7 Jun 2025
Abstract
Spiral fractures are a frequent clinical manifestation of child abuse, particularly in non-ambulatory infants. Approximately 50% of fractures in children under one year of age are non-accidental, yet differentiating between accidental and abusive injuries remains challenging, as no single fracture type is diagnostic in isolation. The objective of this study is to investigate the biomechanics of spiral fractures in immature long bones and the role of the periosteum in modulating fracture behavior under torsional loading. Methods: Paired metatarsal bone specimens from immature sheep were tested using controlled torsional loading at two angular velocities (90°/s and 180°/s). Specimens were prepared through potting, application of a base coat, and painting of a speckle pattern suitable for high-speed digital image correlation (HS-DIC) analysis. Both periosteum-intact and periosteum-removed groups were included. Results: Spiral fractures were successfully induced in over 85% of specimens. Digital image correlation revealed localized diagonal tensile strain at the fracture initiation site, with opposing compressive zones. Notably, bones with intact periosteum exhibited broader tensile stress regions before and after failure, suggesting a biomechanical role in constraining deformation. Conclusion: This study presents a novel integration of high-speed digital image correlation (DIC) with paired biomechanical testing to evaluate the periosteum’s role in spiral fracture formation—an area that remains underexplored. The findings offer new insight into the strain distribution dynamics in immature long bones and highlight the periosteum’s potential protective contribution under torsional stress.
(This article belongs to the Section Medical Imaging)
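The core of DIC is locating speckle subsets of the reference image within the deformed image; a toy sketch with OpenCV's normalized cross-correlation (real DIC repeats this over a grid with subpixel refinement and differentiates the displacement field to obtain strain):

```python
import cv2
import numpy as np

ref = np.random.randint(0, 255, (480, 640), np.uint8)  # synthetic speckle image
deformed = np.roll(ref, (3, 5), axis=(0, 1))           # toy 'deformation': a shift

y, x, s = 200, 300, 31                                 # subset origin and size
subset = ref[y:y + s, x:x + s]
ncc = cv2.matchTemplate(deformed, subset, cv2.TM_CCOEFF_NORMED)
_, _, _, (px, py) = cv2.minMaxLoc(ncc)                 # best-match top-left corner
print("displacement (dx, dy):", px - x, py - y)        # -> (5, 3) here
```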

27 pages, 3997 KiB  
Article
NCT-CXR: Enhancing Pulmonary Abnormality Segmentation on Chest X-Rays Using Improved Coordinate Geometric Transformations
by Abu Salam, Pulung Nurtantio Andono, Purwanto, Moch Arief Soeleman, Mohamad Sidiq, Farrikh Alzami, Ika Novita Dewi, Suryanti, Eko Adhi Pangarsa, Daniel Rizky, Budi Setiawan, Damai Santosa, Antonius Gunawan Santoso, Farid Che Ghazali and Eko Supriyanto
J. Imaging 2025, 11(6), 186; https://doi.org/10.3390/jimaging11060186 - 5 Jun 2025
Abstract
Medical image segmentation, especially in chest X-ray (CXR) analysis, encounters substantial problems such as class imbalance, annotation inconsistencies, and the necessity for accurate pathological region identification. This research aims to improve the precision and clinical reliability of pulmonary abnormality segmentation by developing NCT-CXR, a framework that combines anatomically constrained data augmentation with expert-guided annotation refinement. NCT-CXR applies carefully calibrated discrete-angle rotations (±5°, ±10°) and intensity-based augmentations to enrich training data while preserving spatial and anatomical integrity. To address label noise in the NIH Chest X-ray dataset, we further introduce a clinically validated annotation refinement pipeline using the OncoDocAI platform, resulting in multi-label pixel-level segmentation masks for nine thoracic conditions. YOLOv8 was selected as the segmentation backbone due to its architectural efficiency, speed, and high spatial accuracy. Experimental results show that NCT-CXR significantly improves segmentation precision, especially for pneumothorax (0.829 and 0.804 for ±5° and ±10°, respectively). Non-parametric statistical testing (Kruskal–Wallis, H = 14.874, p = 0.0019) and post hoc Nemenyi analysis (p = 0.0138 and p = 0.0056) confirm the superiority of discrete-angle augmentation over mixed strategies. These findings underscore the importance of clinically constrained augmentation and high-quality annotation in building robust segmentation models. NCT-CXR offers a practical, high-performance solution for integrating deep learning into radiological workflows.
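The anatomically constrained augmentation amounts to rotating image and mask in lock-step at a few discrete angles; a sketch with SciPy (nearest-neighbour interpolation keeps the mask's labels intact):

```python
import numpy as np
from scipy.ndimage import rotate

def rotate_pair(image: np.ndarray, mask: np.ndarray, angle: float):
    img_r = rotate(image, angle, reshape=False, order=1)  # bilinear for pixels
    msk_r = rotate(mask, angle, reshape=False, order=0)   # nearest for labels
    return img_r, msk_r

image = np.random.rand(512, 512)                          # placeholder CXR
mask = (np.random.rand(512, 512) > 0.99).astype(np.uint8)
augmented = [rotate_pair(image, mask, a) for a in (-10, -5, 5, 10)]
```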

14 pages, 5492 KiB  
Article
Comparison of Imaging Modalities for Left Ventricular Noncompaction Morphology
by Márton Horváth, Dorottya Kiss, István Márkusz, Márton Tokodi, Anna Réka Kiss, Zsófia Gregor, Kinga Grebur, Kristóf Farkas-Sütő, Balázs Mester, Flóra Gyulánczi, Attila Kovács, Béla Merkely, Hajnalka Vágó and Andrea Szűcs
J. Imaging 2025, 11(6), 185; https://doi.org/10.3390/jimaging11060185 - 4 Jun 2025
Abstract
Left ventricular noncompaction (LVNC) is characterized by excessive trabeculation, which may impair left ventricular function over time. While cardiac magnetic resonance imaging (CMR) is considered the gold standard for evaluating LV morphology, the optimal modality for follow-up remains uncertain. This study aimed to assess the correlation and agreement among two-dimensional transthoracic echocardiography (2D_TTE), three-dimensional transthoracic echocardiography (3D_TTE), and CMR by comparing volumetric and strain parameters in LVNC patients and healthy individuals. Thirty-eight LVNC subjects with preserved ejection fraction and thirty-four healthy controls underwent all three imaging modalities. Indexed end-diastolic, end-systolic, and stroke volumes, ejection fraction, and global longitudinal and circumferential strains were evaluated using Pearson correlation and Bland–Altman analysis. In the healthy group, volumetric parameters showed strong correlation and good agreement across modalities, particularly between 3D_TTE and CMR. In contrast, agreement in the LVNC group was moderate, with lower correlation and higher percentage errors, especially for strain parameters. Functional data exhibited weak or no correlation, regardless of group. These findings suggest that while echocardiography may be suitable for volumetric follow-up in LVNC after baseline CMR, deformation parameters are not interchangeable between modalities, likely due to trabecular interference. Further studies are warranted to validate modality-specific strain assessment in hypertrabeculated hearts.
(This article belongs to the Section Medical Imaging)
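The agreement statistics used here are Pearson correlation plus Bland–Altman bias and 95% limits of agreement; a NumPy sketch with toy values:

```python
import numpy as np

def bland_altman(a: np.ndarray, b: np.ndarray):
    diff = a - b
    bias = diff.mean()
    loa = 1.96 * diff.std(ddof=1)             # 95% limits of agreement
    return bias, bias - loa, bias + loa

cmr = np.array([140.0, 152.0, 131.0, 160.0])  # toy indexed volumes (mL/m²)
tte = np.array([136.0, 149.0, 135.0, 158.0])
print("Pearson r:", np.corrcoef(cmr, tte)[0, 1])
print("bias, lower LoA, upper LoA:", bland_altman(cmr, tte))
```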

20 pages, 8445 KiB  
Article
COSMICA: A Novel Dataset for Astronomical Object Detection with Evaluation Across Diverse Detection Architectures
by Evgenii Piratinskii and Irina Rabaev
J. Imaging 2025, 11(6), 184; https://doi.org/10.3390/jimaging11060184 - 4 Jun 2025
Abstract
Accurate and efficient detection of celestial objects in telescope imagery is a fundamental challenge in both professional and amateur astronomy. Traditional methods often struggle with noise, varying brightness, and object morphology. This paper introduces COSMICA, a novel, curated dataset of manually annotated astronomical images collected from amateur observations. COSMICA enables the development and evaluation of real-time object detection systems intended for practical deployment in observational pipelines. We investigate three modern YOLO architectures, YOLOv8, YOLOv9, and YOLOv11, and two additional object detection models, EfficientDet-Lite0 and MobileNetV3-FasterRCNN-FPN, to assess their performance in detecting comets, galaxies, nebulae, and globular clusters. All models are evaluated using consistent experimental conditions across multiple metrics, including mAP, precision, recall, and inference speed. YOLOv11 demonstrated the highest overall accuracy and computational efficiency, making it a promising candidate for real-world astronomical applications. These results support the feasibility of integrating deep learning-based detection systems into observational astronomy workflows and highlight the importance of domain-specific datasets for training robust AI models.
(This article belongs to the Section Computer Vision and Pattern Recognition)
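Such model comparisons are commonly run through the Ultralytics API, as in the hedged sketch below; "cosmica.yaml" is a hypothetical dataset config (COSMICA is not bundled with the library), and the pretrained weights shown are generic.

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                  # or YOLOv9/YOLO11 checkpoints
metrics = model.val(data="cosmica.yaml")    # precision, recall, mAP per class
print(metrics.box.map50, metrics.box.map)   # mAP@0.5 and mAP@0.5:0.95
```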

12 pages, 2782 KiB  
Article
Platelets Image Classification Through Data Augmentation: A Comparative Study of Traditional Imaging Augmentation and GAN-Based Synthetic Data Generation Techniques Using CNNs
by Itunuoluwa Abidoye, Frances Ikeji, Charlie A. Coupland, Simon D. J. Calaminus, Nick Sander and Eva Sousa
J. Imaging 2025, 11(6), 183; https://doi.org/10.3390/jimaging11060183 - 4 Jun 2025
Abstract
Platelets play a crucial role in diagnosing and detecting various diseases, influencing the progression of conditions and guiding treatment options. Accurate identification and classification of platelets are essential for these purposes. The present study aims to create a synthetic database of platelet images using Generative Adversarial Networks (GANs) and validate its effectiveness by comparing it with datasets of increasing sizes generated through traditional augmentation techniques. Starting from an initial dataset of 71 platelet images, the dataset was expanded to 141 images (Level 1) using random oversampling and basic transformations and further to 1463 images (Level 2) through extensive augmentation (rotation, shear, zoom). Additionally, a synthetic dataset of 300 images was generated using a Wasserstein GAN with Gradient Penalty (WGAN-GP). Eight pre-trained deep learning models (DenseNet121, DenseNet169, DenseNet201, VGG16, VGG19, InceptionV3, InceptionResNetV2, and AlexNet) and two custom CNNs were evaluated across these datasets. Performance was measured using accuracy, precision, recall, and F1-score. On the extensively augmented dataset (Level 2), InceptionV3 and InceptionResNetV2 reached 99% accuracy and 99% precision/recall/F1-score, while DenseNet201 closely followed, with 98% accuracy, precision, recall and F1-score. GAN-augmented data further improved DenseNet’s performance, demonstrating the potential of GAN-generated images in enhancing platelet classification, especially where data are limited. These findings highlight the benefits of combining traditional and GAN-based augmentation techniques to improve classification performance in medical imaging tasks.
(This article belongs to the Topic Machine Learning and Deep Learning in Medical Imaging)
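The gradient penalty that distinguishes WGAN-GP, in its standard PyTorch form (the critic network itself is whatever architecture the study trained):

```python
import torch

def gradient_penalty(critic, real: torch.Tensor, fake: torch.Tensor) -> torch.Tensor:
    """Penalize the critic's gradient norm on real/fake interpolates."""
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    mixed = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(mixed)
    grads, = torch.autograd.grad(scores, mixed,
                                 grad_outputs=torch.ones_like(scores),
                                 create_graph=True)
    return ((grads.reshape(grads.size(0), -1).norm(2, dim=1) - 1) ** 2).mean()
```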

27 pages, 4299 KiB  
Article
A Structured and Methodological Review on Multi-View Human Activity Recognition for Ambient Assisted Living
by Fahmid Al Farid, Ahsanul Bari, Abu Saleh Musa Miah, Sarina Mansor, Jia Uddin and S. Prabha Kumaresan
J. Imaging 2025, 11(6), 182; https://doi.org/10.3390/jimaging11060182 - 3 Jun 2025
Abstract
Ambient Assisted Living (AAL) leverages technology to support the elderly and individuals with disabilities. A key challenge in these systems is efficient Human Activity Recognition (HAR). However, no study has systematically compared single-view (SV) and multi-view (MV) Human Activity Recognition approaches. This review addresses this gap by analyzing the evolution from single-view to multi-view recognition systems, covering benchmark datasets, feature extraction methods, and classification techniques. We examine how activity recognition systems have transitioned to multi-view architectures using advanced deep learning models optimized for Ambient Assisted Living, thereby improving accuracy and robustness. Furthermore, we explore a wide range of machine learning and deep learning models—including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM) networks, Temporal Convolutional Networks (TCNs), and Graph Convolutional Networks (GCNs)—along with lightweight transfer learning methods suitable for environments with limited computational resources. Key challenges such as data remediation, privacy, and generalization are discussed, alongside potential solutions such as sensor fusion and advanced learning strategies. This study offers comprehensive insights into recent advancements and future directions, guiding the development of intelligent, efficient, and privacy-compliant Human Activity Recognition systems for Ambient Assisted Living applications.
(This article belongs to the Section Computer Vision and Pattern Recognition)

11 pages, 1895 KiB  
Article
3D Echocardiographic Assessment of Right Ventricular Involvement of Left Ventricular Hypertrabeculation from a New Perspective
by Márton Horváth, Kristóf Farkas-Sütő, Flóra Klára Gyulánczi, Alexandra Fábián, Bálint Lakatos, Anna Réka Kiss, Kinga Grebur, Zsófia Gregor, Balázs Mester, Attila Kovács, Béla Merkely and Andrea Szűcs
J. Imaging 2025, 11(6), 181; https://doi.org/10.3390/jimaging11060181 - 3 Jun 2025
Abstract
Right ventricular (RV) involvement in left ventricular hypertrabeculation (LVNC) remains under investigation. Due to its complex anatomy, assessing RV function is challenging, but 3D transthoracic echocardiography (3D_TTE) offers valuable insights. We aimed to evaluate volumetric, functional, and strain parameters of both ventricles in LVNC patients with preserved left ventricular ejection fraction (EF) and compare findings to a control group. This study included 37 LVNC patients and 37 age- and sex-matched controls. 3D_TTE recordings were analyzed using TomTec Image Arena (v. 4.7) and reVISION software to assess volumes, EF, and global/segmental strains. RV EF was further divided into longitudinal (LEF), radial (REF), and antero-posterior (AEF) components. LV volumes were significantly higher in the LVNC group, while RV volumes were comparable. EF and strain values were lower in both ventricles in LVNC patients. RV movement analysis showed significantly reduced LEF and REF, whereas AEF remained normal. These findings suggest subclinical RV dysfunction in LVNC, emphasizing the need for follow-up, even with preserved EF.
(This article belongs to the Section Medical Imaging)

7 pages, 1286 KiB  
Brief Report
Photon-Counting Detector CT Scan of Dinosaur Fossils: Initial Experience
by Tasuku Wakabayashi, Kenji Takata, Soichiro Kawabe, Masato Shimada, Takeshi Mugitani, Takuya Yachida, Rikiya Maruyama, Satomi Kanai, Kiyotaka Takeuchi, Tomohiro Kotsuji, Toshiki Tateishi, Hideki Hyodoh and Tetsuya Tsujikawa
J. Imaging 2025, 11(6), 180; https://doi.org/10.3390/jimaging11060180 - 2 Jun 2025
Abstract
Beyond clinical areas, photon-counting detector (PCD) CT is innovatively applied to study paleontological specimens. This study presents a preliminary investigation into the application of PCD-CT for imaging large dinosaur fossils, comparing it with standard energy-integrating detector (EID) CT. The left dentary of Tyrannosaurus and the skull of Camarasaurus were imaged using PCD-CT in ultra-high-resolution mode and EID-CT. The PCD-CT and EID-CT image quality of the dinosaurs was visually assessed. Compared with EID-CT, PCD-CT yielded higher-resolution anatomical images free of image deterioration, achieving a better definition of the Tyrannosaurus mandibular canal and the three semicircular canals of Camarasaurus. PCD-CT clearly depicts the internal structure and morphology of large dinosaur fossils without damaging them and also provides spectral information, thus allowing researchers to gain insights into fossil mineral composition and the preservation state in the future.
(This article belongs to the Section Computational Imaging and Computational Photography)

28 pages, 3438 KiB  
Article
Optimizing Remote Sensing Image Retrieval Through a Hybrid Methodology
by Sujata Alegavi and Raghvendra Sedamkar
J. Imaging 2025, 11(6), 179; https://doi.org/10.3390/jimaging11060179 - 28 May 2025
Abstract
The contemporary challenge in remote sensing lies in the precise retrieval of increasingly abundant and high-resolution remotely sensed images (RS image) stored in expansive data warehouses. The heightened spatial and spectral resolutions, coupled with accelerated image acquisition rates, necessitate advanced tools for effective data management, retrieval, and exploitation. The classification of large-sized images at the pixel level generates substantial data, escalating the workload and search space for similarity measurement. Semantic-based image retrieval remains an open problem due to limitations in current artificial intelligence techniques. Furthermore, on-board storage constraints compel the application of numerous compression algorithms to reduce storage space, intensifying the difficulty of retrieving substantial, sensitive, and target-specific data. This research proposes an innovative hybrid approach to enhance the retrieval of remotely sensed images. The approach leverages multilevel classification and multiscale feature extraction strategies to enhance performance. The retrieval system comprises two primary phases: database building and retrieval. Initially, the proposed Multiscale Multiangle Mean-shift with Breaking Ties (MSMA-MSBT) algorithm selects informative unlabeled samples for hyperspectral and synthetic aperture radar images through an active learning strategy. Addressing the scaling and rotation variations in image capture, a flexible and dynamic algorithm, modified Deep Image Registration using Dynamic Inlier (IRDI), is introduced for image registration. Given the complexity of remote sensing images, feature extraction occurs at two levels. Low-level features are extracted using the modified Multiscale Multiangle Completed Local Binary Pattern (MSMA-CLBP) algorithm to capture local texture features, while high-level features are obtained through a hybrid CNN structure combining pretrained networks (Alexnet, Caffenet, VGG-S, VGG-M, VGG-F, VGG-VDD-16, VGG-VDD-19) and a fully connected dense network. Fusion of low- and high-level features facilitates final class distinction, with soft thresholding mitigating misclassification issues. A region-based similarity measurement enhances matching percentages. Results, evaluated on high-resolution remote sensing datasets, demonstrate the effectiveness of the proposed method, outperforming traditional algorithms with an average accuracy of 86.66%. The hybrid retrieval system exhibits substantial improvements in classification accuracy, similarity measurement, and computational efficiency compared to state-of-the-art scene classification and retrieval methods.
(This article belongs to the Topic Computational Intelligence in Remote Sensing: 2nd Edition)
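Stripped to its core, the retrieval phase fuses low- and high-level feature vectors and ranks the database by similarity to the query; a NumPy sketch with random placeholder features:

```python
import numpy as np

def fuse(low: np.ndarray, high: np.ndarray) -> np.ndarray:
    v = np.concatenate([low, high])           # low-level + CNN features
    return v / np.linalg.norm(v)              # L2-normalize for cosine similarity

rng = np.random.default_rng(0)
database = np.stack([fuse(rng.random(64), rng.random(512)) for _ in range(1000)])
query = fuse(rng.random(64), rng.random(512))
ranking = np.argsort(database @ query)[::-1]  # most similar images first
print("top-5 matches:", ranking[:5])
```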

22 pages, 20735 KiB  
Article
High-Throughput ORB Feature Extraction on Zynq SoC for Real-Time Structure-from-Motion Pipelines
by Panteleimon Stamatakis and John Vourvoulakis
J. Imaging 2025, 11(6), 178; https://doi.org/10.3390/jimaging11060178 - 28 May 2025
Abstract
This paper presents a real-time system for feature detection and description, the first stage in a structure-from-motion (SfM) pipeline. The proposed system leverages an optimized version of the ORB algorithm (oriented FAST and rotated BRIEF) implemented on the Digilent Zybo Z7020 FPGA board equipped with the Xilinx Zynq-7000 SoC. The system accepts real-time video input (60 fps, 1920 × 1080 resolution, 24-bit color) via HDMI or a camera module. In order to support high frame rates for full-HD images, a double-data-rate pipeline scheme was adopted for Harris functions. Gray-scale video with features identified in red is exported through a separate HDMI port. Feature descriptors are calculated inside the FPGA by Zynq’s programmable logic and verified using Xilinx’s ILA IP block on a connected computer running Vivado. The implemented system achieves a latency of 192.7 microseconds, which is suitable for real-time applications. The proposed architecture is evaluated in terms of repeatability, matching retention and matching accuracy under several image transformations. It achieves satisfactory accuracy and performance, given that changes between successive frames are slight. This work paves the way for future research on the implementation of the remaining stages of a real-time SfM pipeline on the proposed hardware platform.
(This article belongs to the Special Issue Recent Techniques in Image Feature Extraction)
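For comparison, the same ORB stage (oriented FAST keypoints + rotated BRIEF descriptors) in its software form with OpenCV; the paper's contribution is running this in FPGA logic at 60 fps full HD. The file names below are placeholders.

```python
import cv2

img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Hamming distance is the natural metric for binary BRIEF descriptors.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
print(len(matches), "cross-checked matches")
```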

19 pages, 3903 KiB  
Article
CFANet: The Cross-Modal Fusion Attention Network for Indoor RGB-D Semantic Segmentation
by Long-Fei Wu, Dan Wei and Chang-An Xu
J. Imaging 2025, 11(6), 177; https://doi.org/10.3390/jimaging11060177 - 27 May 2025
Abstract
Indoor image semantic segmentation technology is applied to fields such as smart homes and indoor security. The challenges faced by semantic segmentation techniques using RGB images and depth maps as data sources include the semantic gap between RGB images and depth maps and the loss of detailed information. To address these issues, a multi-head self-attention mechanism is adopted to adaptively align features of the two modalities and perform feature fusion in both spatial and channel dimensions. Appropriate feature extraction methods are designed according to the different characteristics of RGB images and depth maps. For RGB images, asymmetric convolution is introduced to capture features in the horizontal and vertical directions, enhance short-range information dependence, mitigate the gridding effect of dilated convolution, and introduce criss-cross attention to obtain contextual information from global dependency relationships. On the depth map, a strategy of extracting significant unimodal features from the channel and spatial dimensions is used. A lightweight skip connection module is designed to fuse low-level and high-level features. In addition, since the first layer contains the richest detailed information and the last layer contains rich semantic information, a feature refinement head is designed to fuse the two. The method achieves an mIoU of 53.86% and 51.85% on the NYUDv2 and SUN-RGBD datasets, which is superior to mainstream methods.
(This article belongs to the Section Computer Vision and Pattern Recognition)
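The cross-modal alignment idea can be sketched with PyTorch's built-in multi-head attention, with queries from one modality and keys/values from the other; this is the generic pattern, not CFANet's exact module.

```python
import torch
import torch.nn as nn

B, HW, C = 2, 400, 128                      # batch, flattened H*W, channels
rgb = torch.rand(B, HW, C)                  # flattened RGB feature map
depth = torch.rand(B, HW, C)                # flattened depth feature map

attn = nn.MultiheadAttention(embed_dim=C, num_heads=8, batch_first=True)
aligned, _ = attn(query=rgb, key=depth, value=depth)  # depth aligned to RGB
fused = rgb + aligned                       # residual cross-modal fusion
```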

15 pages, 2957 KiB  
Article
Four-Wavelength Thermal Imaging for High-Energy-Density Industrial Processes
by Alexey Bykov, Anastasia Zolotukhina, Mikhail Poliakov, Andrey Belykh, Roman Asyutin, Anastasiia Korneeva, Vladislav Batshev and Demid Khokhlov
J. Imaging 2025, 11(6), 176; https://doi.org/10.3390/jimaging11060176 - 27 May 2025
Abstract
Multispectral imaging technology holds significant promise in the field of thermal imaging applications, primarily due to its unique ability to provide comprehensive two-dimensional spectral data distributions without the need for any form of scanning. This paper focuses on the development of an accessible basic design concept and a method for estimating temperature maps using a four-channel spectral imaging system. The research examines key design considerations and establishes a workflow for data correction and processing. It involves preliminary camera calibration procedures, which are essential for accurately assessing and compensating for the characteristic properties of optical elements and image sensors. The developed method is validated through testing using a blackbody source, demonstrating a mean relative temperature error of 1%. Practical application of the method is demonstrated through temperature mapping of a tungsten lamp filament. Experiments demonstrated the capability of the developed multispectral camera to detect and visualize non-uniform temperature distributions and localized temperature deviations with sufficient spatial resolution.
(This article belongs to the Section Color, Multi-spectral, and Hyperspectral Imaging)
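The physics behind multi-wavelength temperature mapping reduces, for two channels, to ratio pyrometry under Wien's approximation and a gray-body assumption; the four-channel system generalizes this idea. A self-checking sketch:

```python
import numpy as np

C2 = 1.4388e-2                               # second radiation constant, m*K

def ratio_temperature(s1, s2, lam1, lam2):
    """Temperature (K) from calibrated signals s1, s2 at wavelengths lam1, lam2 (m)."""
    return C2 * (1 / lam2 - 1 / lam1) / (np.log(s1 / s2) - 5 * np.log(lam2 / lam1))

lam1, lam2, T = 600e-9, 800e-9, 3000.0       # toy check against a 3000 K source
wien = lambda lam: lam ** -5 * np.exp(-C2 / (lam * T))
print(ratio_temperature(wien(lam1), wien(lam2), lam1, lam2))  # ~3000.0
```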

12 pages, 1156 KiB  
Article
Performance of a Statistical-Based Automatic Contrast-to-Noise Ratio Measurement on Images of the ACR CT Phantom
by Choirul Anam, Riska Amilia, Ariij Naufal, Heri Sutanto, Wahyu S. Budi and Geoff Dougherty
J. Imaging 2025, 11(6), 175; https://doi.org/10.3390/jimaging11060175 - 26 May 2025
Abstract
This study evaluates the performance of a statistical-based automatic contrast-to-noise ratio (CNR) measurement method on images of the ACR CT phantom under varying imaging parameters. A statistical automatic method for segmenting low-contrast objects and for measuring CNR was recently introduced. The method employs a 25 mm region of interest (ROI), rotated in 2° clockwise steps, to identify the low-contrast object by locating the maximum CT value. The CNR was measured on images acquired with different parameters: tube voltage (80–140 kVp), tube current (80–200 mA), slice thickness (1.25–10 mm), field of view (190–230 mm), and convolution kernel (edge, ultra, lung, bone, chest, standard). The automatic results were compared to manual measurements. The automatic method accurately identified the largest low-contrast object. The CNR values from the automatic and manual methods showed no significant difference (p > 0.05). The CNR increased with higher tube voltage and current, and with thinner slice thickness. Chest and standard kernels yielded higher CNRs, while edge, ultra, lung, and bone kernels yielded lower ones. The CNR remained stable with minor FOV changes. The statistical-based automatic method provided accurate and consistent CNR measurements across a range of imaging settings for the ACR CT phantom.
(This article belongs to the Section Medical Imaging)
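The underlying definition is simple; a sketch given object and background ROI masks (the paper's contribution is finding the ROI automatically and testing stability across protocols):

```python
import numpy as np

def cnr(image: np.ndarray, obj_mask: np.ndarray, bg_mask: np.ndarray) -> float:
    """CNR = |mean(object) - mean(background)| / std(background)."""
    obj = image[obj_mask.astype(bool)]
    bg = image[bg_mask.astype(bool)]
    return abs(obj.mean() - bg.mean()) / bg.std()
```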

19 pages, 1536 KiB  
Article
A Study on Energy Consumption in AI-Driven Medical Image Segmentation
by R. Prajwal, S. J. Pawan, Shahin Nazarian, Nicholas Heller, Christopher J. Weight, Vinay Duddalwar and C.-C. Jay Kuo
J. Imaging 2025, 11(6), 174; https://doi.org/10.3390/jimaging11060174 - 26 May 2025
Abstract
As artificial intelligence advances in medical image analysis, its environmental impact remains largely overlooked. This study analyzes the energy demands of AI workflows for medical image segmentation using the popular Kidney Tumor Segmentation-2019 (KiTS-19) dataset. It examines how training and inference differ in energy consumption, focusing on factors that influence resource usage, such as computational complexity, memory access, and I/O operations. To address these aspects, we evaluated three variants of convolution—Standard Convolution, Depthwise Convolution, and Group Convolution—combined with optimization techniques such as Mixed Precision and Gradient Accumulation. While training is energy-intensive, the recurring nature of inference often results in significantly higher cumulative energy consumption over a model’s life cycle. Depthwise Convolution with Mixed Precision achieves the lowest energy consumption during training while maintaining strong performance, making it the most energy-efficient configuration among those tested. In contrast, Group Convolution fails to achieve energy efficiency due to significant input/output overhead. These findings emphasize the need for GPU-centric strategies and energy-conscious AI practices, offering actionable guidance for designing scalable, sustainable innovation in medical image analysis.
(This article belongs to the Special Issue Imaging in Healthcare: Progress and Challenges)
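The most efficient configuration reported, depthwise convolution with mixed precision, looks like this in PyTorch (a sketch of the ingredients, not the study's training code; requires a CUDA device):

```python
import torch
import torch.nn as nn

x = torch.rand(2, 32, 64, 64, device="cuda")
depthwise = nn.Conv2d(32, 32, 3, padding=1, groups=32).cuda()  # one filter per channel
pointwise = nn.Conv2d(32, 64, 1).cuda()                        # 1x1 channel mixing

opt = torch.optim.SGD(list(depthwise.parameters()) + list(pointwise.parameters()), lr=0.01)
scaler = torch.cuda.amp.GradScaler()

with torch.cuda.amp.autocast():             # mixed-precision forward pass
    loss = pointwise(depthwise(x)).float().mean()
scaler.scale(loss).backward()               # scaled backward to avoid fp16 underflow
scaler.step(opt)
scaler.update()
```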

19 pages, 2322 KiB  
Article
CAS-SFCM: Content-Aware Image Smoothing Based on Fuzzy Clustering with Spatial Information
by Felipe Antunes-Santos, Carlos Lopez-Molina, Maite Mendioroz and Bernard De Baets
J. Imaging 2025, 11(6), 173; https://doi.org/10.3390/jimaging11060173 - 22 May 2025
Abstract
Image smoothing is a low-level image processing task mainly aimed at homogenizing an image, mitigating noise, or improving the visibility of certain image areas. There exist two main strategies for image smoothing. The first strategy is content-unaware image smoothing. This strategy replicates identical smoothing behavior at every region in the image, hence ignoring any local or semi-local properties of the image. The second strategy is content-aware image smoothing, which takes into account the local properties of the image in order to adapt the smoothing behavior. Such adaptation to local image conditions is intended to avoid the blurring of relevant structures (such as ridges, edges, and blobs) in the image. While the former strategy was ubiquitous in the early years of image processing, the last 20 years have seen an ever-increasing use of the latter, fueled by a combination of greater computational capability and more refined mathematical models. In this work, we propose a novel content-aware image smoothing method based on soft (fuzzy) clustering. Our proposal capitalizes on the strengths of soft clustering to produce content-aware smoothing and allows for the direct configuration of the most relevant parameters for the task: the number of distinctive regions in the image and the relative relevance of spatial and tonal information in the smoothing. The proposed method is put to the test on both artificial and real-world images, combining both qualitative and quantitative analyses. We also propose the use of a local homogeneity measure for the quantitative analysis of image smoothing results. We show that the proposed method is not sensitive to centroid initialization and can be used for both artificial and real-world images.
(This article belongs to the Section Image and Video Processing)
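At the method's core is fuzzy c-means; the classic membership/centroid updates for pixel intensities are sketched below with NumPy (the paper's spatial extension of these updates is omitted here).

```python
import numpy as np

def fcm(x: np.ndarray, c: int = 3, m: float = 2.0, iters: int = 50):
    """Fuzzy c-means on a flat array of pixel values.
    Returns memberships u (N, c) and centroids v (c,)."""
    rng = np.random.default_rng(0)
    v = rng.choice(x, c)                                 # initial centroids
    for _ in range(iters):
        d = np.abs(x[:, None] - v[None, :]) + 1e-12      # distances (N, c)
        p = 2.0 / (m - 1.0)
        u = 1.0 / (d ** p * np.sum(d ** -p, axis=1, keepdims=True))
        v = (u ** m).T @ x / (u ** m).sum(axis=0)        # weighted centroid update
    return u, v

u, v = fcm(np.random.rand(10000))   # e.g. the flattened gray levels of an image
```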

14 pages, 7842 KiB  
Article
Unsupervised Class Generation to Expand Semantic Segmentation Datasets
by Javier Montalvo, Álvaro García-Martín, Pablo Carballeira and Juan C. SanMiguel
J. Imaging 2025, 11(6), 172; https://doi.org/10.3390/jimaging11060172 - 22 May 2025
Abstract
Semantic segmentation is a computer vision task where classification is performed at the pixel level. Due to this, the process of labeling images for semantic segmentation is time-consuming and expensive. To mitigate this cost, there has been a surge in the use of synthetically generated data—usually created using simulators or videogames—from which, in combination with domain adaptation methods, models can effectively learn how to segment real data. Still, these datasets have a particular limitation: due to their closed-set nature, it is not possible to include novel classes without modifying the tool used to generate them, which is often not public. Concurrently, generative models have made remarkable progress, particularly with the introduction of diffusion models, enabling the creation of high-quality images from text prompts without additional supervision. In this work, we propose an unsupervised pipeline that leverages Stable Diffusion and the Segment Anything Model (SAM) to generate class examples with an associated segmentation mask, and a method to integrate generated cutouts for novel classes in semantic segmentation datasets, all with minimal user input. Our approach aims to improve the performance of unsupervised domain adaptation methods by introducing novel samples into the training data without modifications to the underlying algorithms. With our methods, we show how models can not only effectively learn how to segment novel classes, with an average performance of 51% intersection over union for novel classes, but also reduce errors for other, already existing classes, reaching a higher performance level overall.
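The integration step, compositing a generated cutout and its mask into an existing training pair, is sketched below with NumPy; producing the cutout itself (a Stable Diffusion sample segmented with SAM) is omitted, and all names here are illustrative.

```python
import numpy as np

def paste_cutout(image: np.ndarray, labels: np.ndarray,
                 cutout: np.ndarray, cutout_mask: np.ndarray,
                 top: int, left: int, class_id: int):
    """Composite an (h, w, 3) cutout into an (H, W, 3) image and (H, W) label map."""
    h, w = cutout_mask.shape
    region = (slice(top, top + h), slice(left, left + w))
    m = cutout_mask.astype(bool)
    image[region][m] = cutout[m]      # overwrite pixels under the mask
    labels[region][m] = class_id      # assign the novel class label
    return image, labels
```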