J. Imaging, Volume 11, Issue 6 (June 2025) – 34 articles

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the table of contents of newly released issues.
  • Papers are published in both HTML and PDF forms; PDF is the official format. To view a paper in PDF form, click the "PDF Full-text" link and open it with the free Adobe Reader.
27 pages, 963 KiB  
Review
Inferring Body Measurements from 2D Images: A Comprehensive Review
by Hezha Mohammedkhan, Hein Fleuren, Çiçek Güven and Eric Postma
J. Imaging 2025, 11(6), 205; https://doi.org/10.3390/jimaging11060205 - 19 Jun 2025
Abstract
The prediction of anthropometric measurements from 2D body images, particularly for children, remains an under-explored area despite its potential applications in healthcare, fashion, and fitness. While pose estimation and body shape classification have garnered extensive attention, estimating body measurements and body mass index (BMI) from images presents unique challenges and opportunities. This paper provides a comprehensive review of the current methodologies, focusing on deep-learning approaches, both standalone and in combination with traditional machine-learning techniques, for inferring body measurements from facial and full-body images. We discuss the strengths and limitations of commonly used datasets, proposing the need for more inclusive and diverse collections to improve model performance. Our findings indicate that deep-learning models, especially when combined with traditional machine-learning techniques, offer the most accurate predictions. We further highlight the promise of vision transformers in advancing the field while stressing the importance of addressing model explainability. Finally, we evaluate the current state of the field, comparing recent results and focusing on the deviations from ground truth, ultimately providing recommendations for future research directions. Full article
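As a point of reference for the target quantity discussed in the review, BMI is simply weight divided by squared height; a minimal sketch (the example values are illustrative, not from the paper):

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight / height^2, the scalar target an
    image-based regressor would predict."""
    return weight_kg / height_m ** 2

print(round(bmi(70.0, 1.75), 1))  # 22.9
```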
(This article belongs to the Section AI in Imaging)

28 pages, 4483 KiB  
Article
Historical Manuscripts Analysis: A Deep Learning System for Writer Identification Using Intelligent Feature Selection with Vision Transformers
by Merouane Boudraa, Akram Bennour, Mouaaz Nahas, Rashiq Rafiq Marie and Mohammed Al-Sarem
J. Imaging 2025, 11(6), 204; https://doi.org/10.3390/jimaging11060204 - 19 Jun 2025
Abstract
Identifying the scriptwriter in historical manuscripts is crucial for historians, providing valuable insights into historical contexts and aiding in solving historical mysteries. This research presents a robust deep learning system designed for classifying historical manuscripts by writer, employing intelligent feature selection and vision transformers. Our methodology meticulously investigates the efficacy of both handcrafted techniques for feature identification and deep learning architectures for classification tasks in writer identification. The initial preprocessing phase involves thorough document refinement using bilateral filtering for denoising and Otsu thresholding for binarization, ensuring document clarity and consistency for subsequent feature detection. We utilize the FAST detector for feature detection, extracting keypoints representing handwriting styles, followed by clustering with the k-means algorithm to obtain meaningful patches of uniform size. This strategic clustering minimizes redundancy and creates a comprehensive dataset ideal for deep learning classification tasks. Leveraging vision transformer models, our methodology effectively learns complex patterns and features from extracted patches, enabling precise identification of writers across historical manuscripts. This study pioneers the application of vision transformers in historical document analysis, showcasing superior performance on the “ICDAR 2017” dataset compared to state-of-the-art methods and affirming our approach as a robust tool for historical manuscript analysis. Full article
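The binarization step the abstract mentions, Otsu thresholding, can be sketched in a few lines of plain Python; the toy pixel values below are illustrative, not from the paper:

```python
def otsu_threshold(pixels):
    """Return the 8-bit threshold that maximizes between-class variance
    (Otsu's method): class 0 holds pixels <= t, class 1 the rest."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var, w0, sum0 = 0, -1.0, 0, 0.0
    for t in range(256):
        w0 += hist[t]
        if w0 == 0:
            continue          # no pixels in class 0 yet
        w1 = total - w0
        if w1 == 0:
            break             # class 1 empty: no valid split remains
        sum0 += t * hist[t]
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# A tiny bimodal example: dark ink pixels (~40) vs. light parchment (~200)
pixels = [38, 40, 42, 41, 39] * 20 + [198, 200, 202, 201, 199] * 20
print(otsu_threshold(pixels))  # lands between the two modes
```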
(This article belongs to the Section Document Analysis and Processing)

13 pages, 3615 KiB  
Article
Performance Calibration of the Wavefront Sensor’s EMCCD Detector for the Cool Planets Imaging Coronagraph Aboard CSST
by Jiangpei Dou, Bingli Niu, Gang Zhao, Xi Zhang, Gang Wang, Baoning Yuan, Di Wang and Xingguang Qian
J. Imaging 2025, 11(6), 203; https://doi.org/10.3390/jimaging11060203 - 18 Jun 2025
Abstract
The wavefront sensor (WFS), equipped with an electron-multiplying charge-coupled device (EMCCD) detector, is a critical component of the Cool Planets Imaging Coronagraph (CPI-C) on the Chinese Space Station Telescope (CSST). Precise calibration of the WFS’s EMCCD detector is essential to meet the stringent requirements for high-contrast exoplanet imaging. This study comprehensively characterizes key performance parameters of the detector to ensure its suitability for astronomical observations. Through a multi-stage screening protocol, we identified an EMCCD chip exhibiting high resolution and low noise. The electron-multiplying gain (EM Gain) of the EMCCD was analyzed to determine its impact on signal amplification and noise characteristics, identifying the optimal operational range. Additionally, noise properties such as readout noise were investigated. Experimental results demonstrate that the optimized detector meets CPI-C’s initial application requirements, achieving high resolution and low noise. This study provides theoretical and experimental foundations for the use of EMCCD-based WFS in adaptive optics and astronomical observations, ensuring their reliability for advanced space-based imaging applications. Full article
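The trade-off behind an optimal EM-gain range can be illustrated with the textbook EMCCD signal-to-noise model (a common approximation with excess noise factor F ≈ √2; this is not the paper's calibration procedure, and all numbers are illustrative):

```python
import math

def emccd_snr(signal_e, em_gain, read_noise_e, excess_factor=math.sqrt(2.0)):
    """Approximate EMCCD SNR: multiplication suppresses the effective read
    noise (read_noise / gain) at the cost of the excess noise factor F."""
    variance = excess_factor ** 2 * signal_e + (read_noise_e / em_gain) ** 2
    return signal_e / math.sqrt(variance)

# High EM gain recovers a faint 10 e- signal buried in 10 e- read noise
print(round(emccd_snr(10.0, 1.0, 10.0), 2), round(emccd_snr(10.0, 300.0, 10.0), 2))
```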

17 pages, 1863 KiB  
Article
MedSAM/MedSAM2 Feature Fusion: Enhancing nnUNet for 2D TOF-MRA Brain Vessel Segmentation
by Han Zhong, Jiatian Zhang and Lingxiao Zhao
J. Imaging 2025, 11(6), 202; https://doi.org/10.3390/jimaging11060202 - 18 Jun 2025
Abstract
Accurate segmentation of brain vessels is critical for diagnosing cerebral stroke, yet existing AI-based methods struggle with challenges such as small vessel segmentation and class imbalance. To address this, our study proposes a novel 2D segmentation method based on the nnUNet framework, enhanced with MedSAM/MedSAM2 features, for arterial vessel segmentation in time-of-flight magnetic resonance angiography (TOF-MRA) brain slices. The approach first constructs a baseline segmentation network using nnUNet, then incorporates MedSAM/MedSAM2’s feature extraction module to enhance feature representation. Additionally, focal loss is introduced to address class imbalance. Experimental results on the CAS2023 dataset demonstrate that the MedSAM2-enhanced model achieves a 0.72% relative improvement in Dice coefficient and reduces HD95 (mm) and ASD (mm) from 48.20 mm to 46.30 mm and from 5.33 mm to 4.97 mm, respectively, compared to the baseline nnUNet, showing significant enhancements in boundary localization and segmentation accuracy. This approach addresses the critical challenge of small vessel segmentation in TOF-MRA, with the potential to improve cerebrovascular disease diagnosis in clinical practice. Full article
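The focal loss the authors introduce for class imbalance has a simple closed form; a minimal single-prediction sketch (the α and γ defaults are the values common in the literature, not necessarily the paper's):

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one predicted probability p in (0,1) and label
    y in {0,1}: the (1 - p_t)^gamma factor down-weights easy examples,
    so rare vessel pixels dominate the gradient."""
    p_t = p if y == 1 else 1.0 - p
    a_t = alpha if y == 1 else 1.0 - alpha
    return -a_t * (1.0 - p_t) ** gamma * math.log(p_t)

# A confidently correct background pixel contributes far less than a missed vessel pixel
easy = focal_loss(0.05, 0)   # correct, easy background
hard = focal_loss(0.05, 1)   # badly missed vessel
print(easy, hard)
```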
(This article belongs to the Section AI in Imaging)

20 pages, 14779 KiB  
Article
Automation of Multi-Class Microscopy Image Classification Based on the Microorganisms Taxonomic Features Extraction
by Aleksei Samarin, Alexander Savelev, Aleksei Toropov, Aleksandra Dozortseva, Egor Kotenko, Artem Nazarenko, Alexander Motyko, Galiya Narova, Elena Mikhailova and Valentin Malykh
J. Imaging 2025, 11(6), 201; https://doi.org/10.3390/jimaging11060201 - 18 Jun 2025
Abstract
This study presents a unified low-parameter approach to multi-class classification of microorganisms (micrococci, diplococci, streptococci, and bacilli) based on automated machine learning. The method is designed to produce interpretable taxonomic descriptors through analysis of the external geometric characteristics of microorganisms, including cell shape, colony organization, and dynamic behavior in unfixed microscopic scenes. A key advantage of the proposed approach is its lightweight nature: the resulting models have significantly fewer parameters than deep learning-based alternatives, enabling fast inference even on standard CPU hardware. An annotated dataset containing images of four bacterial types obtained under conditions simulating real clinical trials has been developed and published to validate the method. The results (Precision = 0.910, Recall = 0.901, and F1-score = 0.905) confirm the effectiveness of the proposed method for biomedical diagnostic tasks, especially in settings with limited computational resources and a need for feature interpretability. Our approach demonstrates performance comparable to state-of-the-art methods while offering superior efficiency and lightweight design due to its significantly reduced number of parameters. Full article
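The reported metrics follow the usual definitions from confusion counts; a quick sketch (the counts below are chosen only to land near the reported values, not taken from the paper):

```python
def prf1(tp, fp, fn):
    """Precision, recall, and F1 from true-positive, false-positive,
    and false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

p, r, f = prf1(tp=91, fp=9, fn=10)
print(round(p, 3), round(r, 3), round(f, 3))
```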
(This article belongs to the Section Medical Imaging)

18 pages, 6011 KiB  
Article
Comparative Analysis of Ultrasonography and MicroCT Imaging for Organ Size Evaluation in Mice
by Juan Jose Jimenez Catalan, Marina Ferrer Clotas and Juan Antonio Camara Serrano
J. Imaging 2025, 11(6), 200; https://doi.org/10.3390/jimaging11060200 - 18 Jun 2025
Abstract
In this work, the authors compared microCT and in vivo ultrasonography in terms of accuracy and efficacy for measuring the volume of various organs in mice. Two quantification protocols were applied: ellipsoidal volume measuring maximum diameters in all three axes in both imaging systems and manual delineation of organ borders in microCT studies. The results were compared with ex vivo volumes. In general, both imaging techniques and quantification protocols are accurate, but ultrasound is faster in both acquisition and analysis. The only accurate method for heart volume measurement is manual segmentation on microCT. For the ovary, none of the techniques and protocols had a positive correlation with ex vivo volume. The three-diameter method can be used for ellipsoid organs because of its rapidity, but for more irregular structures, manual segmentation is recommended, although it is time-consuming. Full article
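The three-diameter (ellipsoidal) protocol reduces to the standard ellipsoid volume formula; a minimal sketch with made-up organ diameters:

```python
import math

def ellipsoid_volume(d1, d2, d3):
    """Ellipsoid volume from the three maximum axis diameters:
    V = (pi/6) * d1 * d2 * d3."""
    return math.pi / 6.0 * d1 * d2 * d3

# e.g. a roughly ellipsoidal organ measuring 10 x 6 x 5 mm
print(round(ellipsoid_volume(10, 6, 5), 1))  # ~157.1 mm^3
```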

25 pages, 4277 KiB  
Article
Decolorization with Warmth–Coolness Adjustment in an Opponent and Complementary Color System
by Oscar Sanchez-Cesteros and Mariano Rincon
J. Imaging 2025, 11(6), 199; https://doi.org/10.3390/jimaging11060199 - 18 Jun 2025
Abstract
Creating grayscale images from a color reality has been an inherent human practice since ancient times, but it became a technological challenge with the advent of the first black-and-white televisions and digital image processing. Decolorization is a process that projects visual information from a three-dimensional feature space to a one-dimensional space, thus reducing the dimensionality of the image while minimizing the loss of information. To achieve this, various strategies have been developed, including the application of color channel weights and the analysis of local and global image contrast, but there is no universal solution. In this paper, we propose a bio-inspired approach that combines findings from neuroscience on the architecture of the visual system and color coding with evidence from studies in the psychology of art. The goal is to simplify the decolorization process and facilitate its control through color-related concepts that are easily understandable to humans. This new method organizes colors in a scale that links activity on the retina with a system of opponent and complementary channels, thus allowing the adjustment of the perception of warmth and coolness in the image. The results show an improvement in chromatic contrast, especially in the warmth and coolness categories, as well as an enhanced ability to preserve subtle contrasts, outperforming other approaches in the Ishihara test used in color blindness detection. In addition, the method offers a computational advantage by reducing the process through direct pixel-level operation. Full article
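For contrast with the proposed opponent-color method, the classic weighted-channel baseline projects each RGB pixel to one value with fixed weights; a sketch using the Rec. 601 luma weights (an illustrative baseline, not the paper's mapping):

```python
def to_gray(r, g, b, weights=(0.299, 0.587, 0.114)):
    """Project an RGB pixel to a single luminance value with fixed channel
    weights: the 3D-to-1D reduction the abstract describes."""
    wr, wg, wb = weights
    return wr * r + wg * g + wb * b

# With fixed weights, a pure red and a pure blue of equal intensity map
# to different grays, and chromatic contrast between them can collapse.
print(to_gray(255, 0, 0), to_gray(0, 0, 255))
```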
(This article belongs to the Special Issue Color in Image Processing and Computer Vision)

12 pages, 1514 KiB  
Article
Quantitative Ultrashort Echo Time Magnetization Transfer Imaging of the Osteochondral Junction: An In Vivo Knee Osteoarthritis Study
by Dina Moazamian, Mahyar Daskareh, Jiyo S. Athertya, Arya A. Suprana, Saeed Jerban and Yajun Ma
J. Imaging 2025, 11(6), 198; https://doi.org/10.3390/jimaging11060198 - 16 Jun 2025
Abstract
Osteoarthritis (OA) is the most prevalent degenerative joint disorder worldwide, causing significant declines in quality of life. The osteochondral junction (OCJ), a critical structural interface between deep cartilage and subchondral bone, plays an essential role in OA progression but is challenging to assess using conventional magnetic resonance imaging (MRI) due to its short T2 relaxation times. This study aimed to evaluate the utility of ultrashort echo time (UTE) MRI biomarkers, including macromolecular fraction (MMF), magnetization transfer ratio (MTR), and T2*, for in vivo quantification of OCJ changes in knee OA for the first time. Forty-five patients (mean age: 53.8 ± 17.0 years, 50% female) were imaged using 3D UTE-MRI sequences on a 3T clinical MRI scanner. Patients were stratified into two OA groups based on radiographic Kellgren–Lawrence (KL) scores: normal/subtle (KL = 0–1) (n = 21) and mild to moderate (KL = 2–3) (n = 24). Quantitative analysis revealed significantly lower MMF (15.8  ±  1.4% vs. 13.6 ± 1.2%, p < 0.001) and MTR (42.5 ± 2.5% vs. 38.2  ±  2.3%, p < 0.001) in the higher KL 2–3 group, alongside a higher trend in T2* values (19.7  ±  2.6 ms vs. 21.6  ±  3.8 ms, p = 0.06). Moreover, MMF and MTR were significantly negatively correlated with KL grades (r = −0.66 and −0.59; p < 0.001, respectively), while T2* showed a weaker positive correlation (r = 0.26, p = 0.08). Receiver operating characteristic (ROC) analysis demonstrated superior diagnostic accuracy for MMF (AUC = 0.88) and MTR (AUC = 0.86) compared to T2* (AUC = 0.64). These findings highlight UTE-MT techniques (i.e., MMF and MTR) as promising imaging tools for detecting OCJ degeneration in knee OA, with potential implications for earlier and more accurate diagnosis and disease monitoring. Full article
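MTR, one of the reported biomarkers, is computed directly from the signals with and without the magnetization transfer saturation pulse; a minimal sketch with illustrative signal values:

```python
def mtr_percent(s0, s_sat):
    """Magnetization transfer ratio: MTR = 100 * (S0 - Ssat) / S0,
    where S0 and Ssat are the signals without and with MT saturation."""
    return 100.0 * (s0 - s_sat) / s0

# Illustrative values chosen to land near the KL 0-1 group mean of 42.5%
print(round(mtr_percent(1000.0, 575.0), 1))
```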
(This article belongs to the Section Medical Imaging)

29 pages, 4199 KiB  
Article
Dose Reduction in Scintigraphic Imaging Through Enhanced Convolutional Autoencoder-Based Denoising
by Nikolaos Bouzianis, Ioannis Stathopoulos, Pipitsa Valsamaki, Efthymia Rapti, Ekaterini Trikopani, Vasiliki Apostolidou, Athanasia Kotini, Athanasios Zissimopoulos, Adam Adamopoulos and Efstratios Karavasilis
J. Imaging 2025, 11(6), 197; https://doi.org/10.3390/jimaging11060197 - 14 Jun 2025
Abstract
Objective: This study proposes a novel deep learning approach for enhancing low-dose bone scintigraphy images using an Enhanced Convolutional Autoencoder (ECAE), aiming to reduce patient radiation exposure while preserving diagnostic quality, as assessed by both expert-based quantitative image metrics and qualitative evaluation. Methods: A supervised learning framework was developed using real-world paired low- and full-dose images from 105 patients. Data were acquired using standard clinical gamma cameras at the Nuclear Medicine Department of the University General Hospital of Alexandroupolis. The ECAE architecture integrates multiscale feature extraction, channel attention mechanisms, and efficient residual blocks to reconstruct high-quality images from low-dose inputs. The model was trained and validated using quantitative metrics—Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM)—alongside qualitative assessments by nuclear medicine experts. Results: The model achieved significant improvements in both PSNR and SSIM across all tested dose levels, particularly between 30% and 70% of the full dose. Expert evaluation confirmed enhanced visibility of anatomical structures, noise reduction, and preservation of diagnostic detail in denoised images. In blinded evaluations, denoised images were preferred over the original full-dose scans in 66% of all cases, and in 61% of cases within the 30–70% dose range. Conclusion: The proposed ECAE model effectively reconstructs high-quality bone scintigraphy images from substantially reduced-dose acquisitions. This approach supports dose reduction in nuclear medicine imaging while maintaining—or even enhancing—diagnostic confidence, offering practical benefits in patient safety, workflow efficiency, and environmental impact. Full article
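PSNR, one of the two quantitative metrics used for training and validation, follows directly from its definition; a sketch on toy four-pixel images with an assumed 8-bit range:

```python
import math

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio between two equal-length pixel lists
    (higher is better; infinite for identical images)."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(max_val ** 2 / mse)

full_dose = [120, 130, 125, 140]
denoised  = [121, 129, 126, 139]
print(round(psnr(full_dose, denoised), 2))
```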

22 pages, 2000 KiB  
Article
Generation of Synthetic Non-Homogeneous Fog by Discretized Radiative Transfer Equation
by Marcell Beregi-Kovacs, Balazs Harangi, Andras Hajdu and Gyorgy Gat
J. Imaging 2025, 11(6), 196; https://doi.org/10.3390/jimaging11060196 - 13 Jun 2025
Abstract
The synthesis of realistic fog in images is critical for applications such as autonomous navigation, augmented reality, and visual effects. Traditional methods based on Koschmieder’s law or GAN-based image translation typically assume homogeneous fog distributions and rely on oversimplified scattering models, limiting their physical realism. In this paper, we propose a physics-driven approach to fog synthesis by discretizing the Radiative Transfer Equation (RTE). Our method models spatially inhomogeneous fog and anisotropic multi-scattering, enabling the generation of structurally consistent and perceptually plausible fog effects. To evaluate performance, we construct a dataset of real-world foggy, cloudy, and sunny images and compare our results against both Koschmieder-based and GAN-based baselines. Experimental results show that our method achieves a lower Fréchet Inception Distance (10% vs. Koschmieder, 42% vs. CycleGAN) and a higher Pearson correlation (+4% and +21%, respectively), highlighting its superiority in both feature space and structural fidelity. These findings highlight the potential of RTE-based fog synthesis for physically consistent image augmentation under challenging visibility conditions. However, the method’s practical deployment may be constrained by high memory requirements due to tensor-based computations, which must be addressed for large-scale or real-time applications. Full article
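The Koschmieder baseline the paper compares against has a one-line form; a sketch with an assumed extinction coefficient and airlight (illustrative values):

```python
import math

def koschmieder(j, depth, beta=0.05, airlight=255.0):
    """Homogeneous-fog baseline: I = J*t + A*(1 - t) with transmission
    t = exp(-beta * depth). This single-scattering model is what the
    RTE-based approach generalizes."""
    t = math.exp(-beta * depth)
    return j * t + airlight * (1.0 - t)

# The same scene radiance fades toward the airlight with distance
print(round(koschmieder(100.0, 10.0), 1), round(koschmieder(100.0, 100.0), 1))
```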
(This article belongs to the Section Image and Video Processing)

14 pages, 822 KiB  
Article
Optical Coherence Tomography (OCT) Findings in Post-COVID-19 Healthcare Workers
by Sanela Sanja Burgić, Mirko Resan, Milka Mavija, Saša Smoljanović Skočić, Sanja Grgić, Daliborka Tadić and Bojan Pajic
J. Imaging 2025, 11(6), 195; https://doi.org/10.3390/jimaging11060195 - 12 Jun 2025
Abstract
Recent evidence suggests that SARS-CoV-2 may induce subtle anatomical changes in the retina, detectable through advanced imaging techniques. This retrospective case–control study utilized optical coherence tomography (OCT) to assess medium-term retinal alterations in 55 healthcare workers, including 25 individuals with PCR-confirmed COVID-19 and 30 non-COVID-19 controls, all of whom had worked in COVID-19 clinical settings. Comprehensive ophthalmological examinations, including OCT imaging, were conducted six months after infection. The analysis considered demographic variables, comorbidities, COVID-19 severity, risk factors, and treatments received. Central macular thickness (CMT) was significantly increased in the post-COVID-19 group (p < 0.05), with a weak but statistically significant positive correlation between CMT and disease severity (r = 0.245, p < 0.05), suggesting potential post-inflammatory retinal responses. No significant differences were observed in retinal nerve fiber layer (RNFL) or ganglion cell complex (GCL + IPL) thickness. However, mild negative trends in inferior RNFL and average GCL+IPL thickness may indicate early neurodegenerative changes. Notably, patients with comorbidities exhibited a significant reduction in superior and inferior RNFL thickness, pointing to possible long-term neurovascular impairment. These findings underscore the value of OCT imaging in identifying subclinical retinal alterations following COVID-19 and highlight the need for continued surveillance in recovered patients, particularly those with pre-existing systemic conditions. Full article
(This article belongs to the Special Issue Learning and Optimization for Medical Imaging)

28 pages, 3384 KiB  
Article
Evaluating Features and Variations in Deepfake Videos Using the CoAtNet Model
by Eman Alattas, John Clark, Arwa Al-Aama and Salma Kammoun Jarraya
J. Imaging 2025, 11(6), 194; https://doi.org/10.3390/jimaging11060194 - 12 Jun 2025
Abstract
Deepfake video detection has emerged as a critical challenge in the realm of artificial intelligence, given its implications for misinformation and digital security. This study evaluates the generalisation capabilities of the CoAtNet model—a hybrid convolution–transformer architecture—for deepfake detection across diverse datasets. Although CoAtNet has shown exceptional performance in several computer vision tasks, its potential for generalisation in cross-dataset scenarios remains underexplored. Thus, in this study, we explore CoAtNet’s generalisation ability by conducting an extensive series of experiments with a focus on discovering features and variations in deepfake videos. These experiments involve training the model using various input and processing configurations, followed by evaluating its performance on widely recognised public datasets. To the best of our knowledge, our proposed approach outperforms state-of-the-art models in terms of intra-dataset performance, with an AUC between 81.4% and 99.9%. Our model also achieves outstanding results in cross-dataset evaluations, with an AUC equal to 78%. This study demonstrates that CoAtNet achieves the best AUC for both intra-dataset and cross-dataset deepfake video detection, particularly on Celeb-DF, while also showing strong performance on DFDC. Full article
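AUC, the metric reported throughout, is the probability that a randomly chosen positive outranks a randomly chosen negative; a minimal rank-based sketch with made-up scores:

```python
def auc(scores_pos, scores_neg):
    """AUC as the normalized Mann-Whitney U statistic: the fraction of
    (positive, negative) pairs where the positive scores higher,
    counting ties as half."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Made-up detector scores: deepfakes (positives) vs. real videos (negatives)
print(auc([0.9, 0.8, 0.4], [0.3, 0.5, 0.2]))
```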

25 pages, 29384 KiB  
Article
Efficient Multi-Material Volume Rendering for Realistic Visualization with Complex Transfer Functions
by Chunxiao Xu, Xinran Xu, Jiatian Zhang, Yiheng Cao and Lingxiao Zhao
J. Imaging 2025, 11(6), 193; https://doi.org/10.3390/jimaging11060193 - 11 Jun 2025
Abstract
Physically based realistic direct volume rendering (DVR) is a critical area of research in scientific data visualization. The prevailing realistic DVR methods are primarily rooted in outdated theories of participating media rendering and often lack comprehensive analyses of their applicability to realistic DVR scenarios. As a result, the fidelity of material representation in the rendered output is frequently limited. To address these challenges, we present a novel multi-material radiative transfer model (MM-RTM) designed for realistic DVR, grounded in recent advancements in light transport theories. Additionally, we standardize various transfer function techniques and propose five distinct forms of transfer functions along with proxy volumes. This comprehensive approach enables our DVR framework to accommodate a wide range of complex transfer function techniques, which we illustrate through several visualizations. Furthermore, to enhance sampling efficiency, we develop a new multi-hierarchical volumetric acceleration method that supports multi-level searches and volume traversal. Our volumetric accelerator also facilitates real-time structural updates when applying complex transfer functions in DVR. Our MM-RTM, the unified representation of complex transfer functions, and the acceleration structure for real-time updates are complementary components that collectively establish a comprehensive framework for realistic multi-material DVR. Evaluation from a user study indicates that the rendering results produced by our method demonstrate the most realistic effects among various publicly available state-of-the-art techniques. Full article
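The emission-absorption compositing underlying classical DVR (the kind of baseline the paper's MM-RTM generalizes) can be sketched for a single ray; the sample colors and opacities are illustrative:

```python
def composite(samples):
    """Front-to-back alpha compositing along one ray.
    samples: (color, opacity) pairs ordered from the eye into the volume,
    as produced by applying a transfer function to ray samples."""
    color, alpha = 0.0, 0.0
    for c, a in samples:
        color += (1.0 - alpha) * a * c   # attenuate by accumulated opacity
        alpha += (1.0 - alpha) * a
        if alpha > 0.99:                 # early ray termination
            break
    return color, alpha

print(composite([(1.0, 0.5), (0.5, 0.5)]))
```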
(This article belongs to the Section Visualization and Computer Graphics)

21 pages, 1578 KiB  
Article
SADiff: Coronary Artery Segmentation in CT Angiography Using Spatial Attention and Diffusion Model
by Ruoxuan Xu, Longhui Dai, Jianru Wang, Lei Zhang and Yuanquan Wang
J. Imaging 2025, 11(6), 192; https://doi.org/10.3390/jimaging11060192 - 11 Jun 2025
Abstract
Coronary artery disease (CAD) is a highly prevalent cardiovascular disease and one of the leading causes of death worldwide. The accurate segmentation of coronary arteries from CT angiography (CTA) images is essential for the diagnosis and treatment of coronary artery disease. However, due to small vessel diameters, large morphological variations, low contrast, and motion artifacts, conventional segmentation methods, including classical image processing (such as region growing and level sets) and early deep learning models with limited receptive fields, are unsatisfactory. We propose SADiff, a hybrid framework that integrates a dilated attention network (DAN) for ROI extraction, a diffusion-based subnet for noise suppression in low-contrast regions, and a striped attention network (SAN) to refine tubular structures affected by morphological variations. Experiments on the public ImageCAS dataset show that it has a Dice score of 83.48% and a Hausdorff distance of 19.43 mm, which is 6.57% higher than U-Net3D in terms of Dice. The cross-dataset validation on the private ImageLaPP dataset verifies its generalizability with a Dice score of 79.42%. This comprehensive evaluation demonstrates that SADiff provides a more efficient and versatile method for coronary segmentation and shows great potential for improving the diagnosis and treatment of CAD. Full article
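The Dice score used for evaluation follows the standard overlap definition; a minimal sketch on toy binary masks:

```python
def dice(pred, gt):
    """Dice coefficient between two binary masks (flattened lists of 0/1):
    2 * |intersection| / (|pred| + |gt|)."""
    inter = sum(p * g for p, g in zip(pred, gt))
    denom = sum(pred) + sum(gt)
    return 2.0 * inter / denom if denom else 1.0

pred = [1, 1, 0, 0, 1]
gt   = [1, 0, 0, 1, 1]
print(dice(pred, gt))  # 2*2 / (3+3)
```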
(This article belongs to the Section Computer Vision and Pattern Recognition)

17 pages, 1267 KiB  
Article
Prediction of PD-L1 and CD68 in Clear Cell Renal Cell Carcinoma with Green Learning
by Yixing Wu, Alexander Shieh, Steven Cen, Darryl Hwang, Xiaomeng Lei, S. J. Pawan, Manju Aron, Inderbir Gill, William D. Wallace, C.-C. Jay Kuo and Vinay Duddalwar
J. Imaging 2025, 11(6), 191; https://doi.org/10.3390/jimaging11060191 - 10 Jun 2025
Abstract
Clear cell renal cell carcinoma (ccRCC) is the most common type of renal cancer. Extensive efforts have been made to utilize radiomics from computed tomography (CT) imaging to predict tumor immune microenvironment (TIME) measurements. This study proposes a Green Learning (GL) framework for approximating tissue-based biomarkers from CT scans, focusing on the PD-L1 expression and CD68 tumor-associated macrophages (TAMs) in ccRCC. Our approach includes radiomic feature extraction, redundancy removal, and supervised feature selection through a discriminant feature test (DFT), a relevant feature test (RFT), and least-squares normal transform (LNT) for robust feature generation. For the PD-L1 expression in 52 ccRCC patients, treated as a regression problem, our GL model achieved a 5-fold cross-validated mean squared error (MSE) of 0.0041 and a Mean Absolute Error (MAE) of 0.0346. For the TAM population (CD68+/PanCK+), analyzed in 78 ccRCC patients as a binary classification task (at a 0.4 threshold), the model reached a 10-fold cross-validated Area Under the Receiver Operating Characteristic (AUROC) of 0.85 (95% CI [0.76, 0.93]) using 10 LNT-derived features, improving upon the previous benchmark of 0.81. This study demonstrates the potential of GL in radiomic analyses, offering a scalable, efficient, and interpretable framework for the non-invasive approximation of key biomarkers. Full article
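The regression metrics reported for PD-L1 follow their standard definitions; a minimal sketch (the toy expression values are illustrative, not from the study):

```python
def mse_mae(y_true, y_pred):
    """Mean squared error and mean absolute error for a regression target
    such as a predicted PD-L1 expression fraction."""
    errs = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errs) / len(errs)
    mae = sum(abs(e) for e in errs) / len(errs)
    return mse, mae

mse, mae = mse_mae([0.10, 0.20, 0.05], [0.12, 0.15, 0.06])
print(round(mse, 5), round(mae, 5))
```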
(This article belongs to the Special Issue Imaging in Healthcare: Progress and Challenges)
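The abstract's regression evaluation (5-fold cross-validated MSE and MAE) can be sketched as follows; the synthetic features, the ordinary-least-squares stand-in predictor, and the function name `kfold_mse_mae` are illustrative assumptions, not the paper's Green Learning pipeline.

```python
import numpy as np

def kfold_mse_mae(X, y, k=5, seed=0):
    """Estimate regression error with k-fold cross-validation,
    using ordinary least squares as a stand-in predictor."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    sq_errs, abs_errs = [], []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # Fit OLS on the training fold (with an intercept column).
        Xtr = np.c_[np.ones(len(train)), X[train]]
        Xte = np.c_[np.ones(len(test)), X[test]]
        w, *_ = np.linalg.lstsq(Xtr, y[train], rcond=None)
        pred = Xte @ w
        sq_errs.append((pred - y[test]) ** 2)
        abs_errs.append(np.abs(pred - y[test]))
    return (np.mean(np.concatenate(sq_errs)),
            np.mean(np.concatenate(abs_errs)))

# Synthetic stand-in for radiomic features vs. a continuous biomarker.
rng = np.random.default_rng(1)
X = rng.normal(size=(52, 4))
y = X @ np.array([0.2, -0.1, 0.05, 0.0]) + rng.normal(scale=0.05, size=52)
mse, mae = kfold_mse_mae(X, y, k=5)
```

Pooling errors over all folds before averaging, as above, weights every sample equally even when fold sizes differ.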
23 pages, 4896 KiB  
Article
Insulator Surface Defect Detection Method Based on Graph Feature Diffusion Distillation
by Shucai Li, Na Zhang, Gang Yang, Yannong Hou and Xingzhong Zhang
J. Imaging 2025, 11(6), 190; https://doi.org/10.3390/jimaging11060190 - 10 Jun 2025
Abstract
To address the scarcity of defect samples on power insulator surfaces, their irregular morphology, and insufficient pixel-level localization accuracy, this paper proposes a defect detection method based on graph feature diffusion distillation, named GFDD. The feature bias problem is alleviated [...] Read more.
To address the scarcity of defect samples on power insulator surfaces, their irregular morphology, and insufficient pixel-level localization accuracy, this paper proposes a defect detection method based on graph feature diffusion distillation, named GFDD. The feature bias problem is alleviated by constructing a dual-division teacher architecture with graph feature consistency constraints, while a cross-layer feature fusion module dynamically aggregates multi-scale information to reduce redundancy. The diffusion distillation mechanism moves beyond the traditional single-layer feature transfer limitation, and global context modeling is strengthened by fusing deep semantics and shallow details through channel attention. On a self-built dataset, GFDD achieves 96.6% Pi.AUROC, 97.7% Im.AUROC, and 95.1% F1-score, 2.4–3.2% higher than the best existing methods; it also maintains strong generalization and robustness across multiple public datasets. The method provides a high-precision solution for the automated inspection of insulator surface defects and has practical engineering value. Full article
(This article belongs to the Special Issue Self-Supervised Learning for Image Processing and Analysis)
31 pages, 55513 KiB  
Article
SAM for Road Object Segmentation: Promising but Challenging
by Alaa Atallah Almazroey, Salma Kammoun Jarraya and Reem Alnanih
J. Imaging 2025, 11(6), 189; https://doi.org/10.3390/jimaging11060189 - 10 Jun 2025
Abstract
Road object segmentation is crucial for autonomous driving, as it enables vehicles to perceive their surroundings. While deep learning models show promise, their generalization across diverse road conditions, weather variations, and lighting changes remains challenging. Different approaches have been proposed to address this [...] Read more.
Road object segmentation is crucial for autonomous driving, as it enables vehicles to perceive their surroundings. While deep learning models show promise, their generalization across diverse road conditions, weather variations, and lighting changes remains challenging. Different approaches have been proposed to address this limitation. However, these models often struggle with the varying appearance of road objects under diverse environmental conditions. Foundation models such as the Segment Anything Model (SAM) offer a potential avenue for improved generalization in complex visual tasks. Thus, this study presents a pioneering comprehensive evaluation of the SAM for zero-shot road object segmentation, without explicit prompts. This study aimed to determine the inherent capabilities and limitations of the SAM in accurately segmenting a variety of road objects under the diverse and challenging environmental conditions encountered in real-world autonomous driving scenarios. We assessed the SAM’s performance on the KITTI, BDD100K, and Mapillary Vistas datasets, encompassing a wide range of environmental conditions. Using a variety of established evaluation metrics, our analysis revealed the SAM’s capabilities and limitations in accurately segmenting various road objects, particularly highlighting challenges posed by dynamic environments, illumination changes, and occlusions. These findings provide valuable insights for researchers and developers seeking to enhance the robustness of foundation models such as the SAM in complex road environments, guiding future efforts to improve perception systems for autonomous driving. Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
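Segmentation quality in evaluations like this one is commonly scored with per-mask intersection-over-union (IoU); a minimal sketch, using small synthetic masks rather than anything from the SAM study:

```python
import numpy as np

def mask_iou(pred, gt):
    """Intersection-over-Union between two boolean segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: perfect agreement by convention
    return np.logical_and(pred, gt).sum() / union

# A 4x4 ground-truth square and a prediction shifted by one pixel.
gt = np.zeros((8, 8), dtype=bool); gt[2:6, 2:6] = True
pred = np.zeros((8, 8), dtype=bool); pred[3:7, 3:7] = True
iou = mask_iou(pred, gt)  # intersection 9 px, union 23 px
```

Averaging this score per object class over a dataset gives the mean IoU figures typically reported alongside precision and recall.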
25 pages, 3449 KiB  
Article
CSANet: Context–Spatial Awareness Network for RGB-T Urban Scene Understanding
by Ruixiang Li, Zhen Wang, Jianxin Guo and Chuanlei Zhang
J. Imaging 2025, 11(6), 188; https://doi.org/10.3390/jimaging11060188 - 9 Jun 2025
Abstract
Semantic segmentation plays a critical role in understanding complex urban environments, particularly for autonomous driving applications. However, existing approaches face significant challenges under low-light and adverse weather conditions. To address these limitations, we propose CSANet (Context Spatial Awareness Network), a novel framework that [...] Read more.
Semantic segmentation plays a critical role in understanding complex urban environments, particularly for autonomous driving applications. However, existing approaches face significant challenges under low-light and adverse weather conditions. To address these limitations, we propose CSANet (Context Spatial Awareness Network), a novel framework that effectively integrates RGB and thermal infrared (TIR) modalities. CSANet employs an efficient encoder to extract complementary local and global features, while a hierarchical fusion strategy is adopted to selectively integrate visual and semantic information. Notably, the Channel–Spatial Cross-Fusion Module (CSCFM) enhances local details by fusing multi-modal features, and the Multi-Head Fusion Module (MHFM) captures global dependencies and calibrates multi-modal information. Furthermore, the Spatial Coordinate Attention Mechanism (SCAM) improves object localization accuracy in complex urban scenes. Evaluations on benchmark datasets (MFNet and PST900) demonstrate that CSANet achieves state-of-the-art performance, significantly advancing RGB-T semantic segmentation. Full article
19 pages, 5272 KiB  
Article
Biomechanics of Spiral Fractures: Investigating Periosteal Effects Using Digital Image Correlation
by Ghaidaa A. Khalid, Ali Al-Naji and Javaan Chahl
J. Imaging 2025, 11(6), 187; https://doi.org/10.3390/jimaging11060187 - 7 Jun 2025
Abstract
Spiral fractures are a frequent clinical manifestation of child abuse, particularly in non-ambulatory infants. Approximately 50% of fractures in children under one year of age are non-accidental, yet differentiating between accidental and abusive injuries remains challenging, as no single fracture type is diagnostic [...] Read more.
Spiral fractures are a frequent clinical manifestation of child abuse, particularly in non-ambulatory infants. Approximately 50% of fractures in children under one year of age are non-accidental, yet differentiating between accidental and abusive injuries remains challenging, as no single fracture type is diagnostic in isolation. The objective of this study is to investigate the biomechanics of spiral fractures in immature long bones and the role of the periosteum in modulating fracture behavior under torsional loading. Methods: Paired metatarsal bone specimens from immature sheep were tested using controlled torsional loading at two angular velocities (90°/s and 180°/s). Specimens were prepared through potting, application of a base coat, and painting of a speckle pattern suitable for high-speed digital image correlation (HS-DIC) analysis. Both periosteum-intact and periosteum-removed groups were included. Results: Spiral fractures were successfully induced in over 85% of specimens. Digital image correlation revealed localized diagonal tensile strain at the fracture initiation site, with opposing compressive zones. Notably, bones with intact periosteum exhibited broader tensile stress regions before and after failure, suggesting a biomechanical role in constraining deformation. Conclusion: This study presents a novel integration of high-speed digital image correlation (DIC) with paired biomechanical testing to evaluate the periosteum’s role in spiral fracture formation—an area that remains underexplored. The findings offer new insight into the strain distribution dynamics in immature long bones and highlight the periosteum’s potential protective contribution under torsional stress. Full article
(This article belongs to the Section Medical Imaging)
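Digital image correlation recovers strain by differentiating a measured displacement field; a one-dimensional sketch of that step, using a synthetic linear displacement (the grid spacing and the value of `alpha` are assumptions for illustration, not the study's data):

```python
import numpy as np

# Synthetic displacement field on a regular grid: u = alpha * x gives
# a uniform normal strain e_xx = du/dx = alpha everywhere.
alpha = 0.02                       # 2% stretch
x = np.linspace(0.0, 10.0, 101)    # positions along the specimen (mm)
u = alpha * x                      # measured horizontal displacement (mm)

# Small-strain approximation: numerically differentiate u with respect to x.
e_xx = np.gradient(u, x)
```

In practice the DIC software produces a 2D displacement field from the speckle pattern, and shear and principal strains follow from the same finite-difference idea applied in both directions.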
27 pages, 3997 KiB  
Article
NCT-CXR: Enhancing Pulmonary Abnormality Segmentation on Chest X-Rays Using Improved Coordinate Geometric Transformations
by Abu Salam, Pulung Nurtantio Andono, Purwanto, Moch Arief Soeleman, Mohamad Sidiq, Farrikh Alzami, Ika Novita Dewi, Suryanti, Eko Adhi Pangarsa, Daniel Rizky, Budi Setiawan, Damai Santosa, Antonius Gunawan Santoso, Farid Che Ghazali and Eko Supriyanto
J. Imaging 2025, 11(6), 186; https://doi.org/10.3390/jimaging11060186 - 5 Jun 2025
Abstract
Medical image segmentation, especially in chest X-ray (CXR) analysis, encounters substantial problems such as class imbalance, annotation inconsistencies, and the necessity for accurate pathological region identification. This research aims to improve the precision and clinical reliability of pulmonary abnormality segmentation by developing NCT-CXR, [...] Read more.
Medical image segmentation, especially in chest X-ray (CXR) analysis, encounters substantial problems such as class imbalance, annotation inconsistencies, and the necessity for accurate pathological region identification. This research aims to improve the precision and clinical reliability of pulmonary abnormality segmentation by developing NCT-CXR, a framework that combines anatomically constrained data augmentation with expert-guided annotation refinement. NCT-CXR applies carefully calibrated discrete-angle rotations (±5°, ±10°) and intensity-based augmentations to enrich training data while preserving spatial and anatomical integrity. To address label noise in the NIH Chest X-ray dataset, we further introduce a clinically validated annotation refinement pipeline using the OncoDocAI platform, resulting in multi-label pixel-level segmentation masks for nine thoracic conditions. YOLOv8 was selected as the segmentation backbone due to its architectural efficiency, speed, and high spatial accuracy. Experimental results show that NCT-CXR significantly improves segmentation precision, especially for pneumothorax (0.829 and 0.804 for ±5° and ±10°, respectively). Non-parametric statistical testing (Kruskal–Wallis, H = 14.874, p = 0.0019) and post hoc Nemenyi analysis (p = 0.0138 and p = 0.0056) confirm the superiority of discrete-angle augmentation over mixed strategies. These findings underscore the importance of clinically constrained augmentation and high-quality annotation in building robust segmentation models. NCT-CXR offers a practical, high-performance solution for integrating deep learning into radiological workflows. Full article
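The discrete-angle rotations (±5°, ±10°) must be applied consistently to both the image and its annotation coordinates; a sketch of the coordinate side under a standard 2D rotation matrix (the polygon, image center, and helper name `rotate_coords` are hypothetical, not NCT-CXR's implementation):

```python
import numpy as np

def rotate_coords(points, angle_deg, center):
    """Rotate (x, y) annotation coordinates about `center` by a
    discrete angle, mirroring the rotation applied to the image."""
    theta = np.deg2rad(angle_deg)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return (np.asarray(points) - center) @ R.T + center

center = np.array([128.0, 128.0])          # image center of a 256x256 CXR
poly = np.array([[100.0, 100.0],           # hypothetical lesion polygon
                 [156.0, 100.0],
                 [156.0, 140.0]])
augmented = [rotate_coords(poly, a, center) for a in (-10, -5, 5, 10)]
```

Because the rotation is rigid, distances from the center are preserved, so the augmented masks keep their anatomical proportions.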
14 pages, 5492 KiB  
Article
Comparison of Imaging Modalities for Left Ventricular Noncompaction Morphology
by Márton Horváth, Dorottya Kiss, István Márkusz, Márton Tokodi, Anna Réka Kiss, Zsófia Gregor, Kinga Grebur, Kristóf Farkas-Sütő, Balázs Mester, Flóra Gyulánczi, Attila Kovács, Béla Merkely, Hajnalka Vágó and Andrea Szűcs
J. Imaging 2025, 11(6), 185; https://doi.org/10.3390/jimaging11060185 - 4 Jun 2025
Abstract
Left ventricular noncompaction (LVNC) is characterized by excessive trabeculation, which may impair left ventricular function over time. While cardiac magnetic resonance imaging (CMR) is considered the gold standard for evaluating LV morphology, the optimal modality for follow-up remains uncertain. This study aimed to [...] Read more.
Left ventricular noncompaction (LVNC) is characterized by excessive trabeculation, which may impair left ventricular function over time. While cardiac magnetic resonance imaging (CMR) is considered the gold standard for evaluating LV morphology, the optimal modality for follow-up remains uncertain. This study aimed to assess the correlation and agreement among two-dimensional transthoracic echocardiography (2D_TTE), three-dimensional transthoracic echocardiography (3D_TTE), and CMR by comparing volumetric and strain parameters in LVNC patients and healthy individuals. Thirty-eight LVNC subjects with preserved ejection fraction and thirty-four healthy controls underwent all three imaging modalities. Indexed end-diastolic, end-systolic, and stroke volumes, ejection fraction, and global longitudinal and circumferential strains were evaluated using Pearson correlation and Bland–Altman analysis. In the healthy group, volumetric parameters showed strong correlation and good agreement across modalities, particularly between 3D_TTE and CMR. In contrast, agreement in the LVNC group was moderate, with lower correlation and higher percentage errors, especially for strain parameters. Functional data exhibited weak or no correlation, regardless of group. These findings suggest that while echocardiography may be suitable for volumetric follow-up in LVNC after baseline CMR, deformation parameters are not interchangeable between modalities, likely due to trabecular interference. Further studies are warranted to validate modality-specific strain assessment in hypertrabeculated hearts. Full article
(This article belongs to the Section Medical Imaging)
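Bland–Altman analysis, used here to judge inter-modality agreement, reduces to a mean bias and 95% limits of agreement over paired differences; a minimal sketch with hypothetical paired volumes (not the study's measurements):

```python
import numpy as np

def bland_altman(a, b):
    """Bland-Altman agreement statistics for paired measurements from
    two modalities: mean bias and 95% limits of agreement."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)          # sample standard deviation of differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired end-diastolic volumes (mL) from CMR and 3D TTE.
cmr = np.array([150., 160., 145., 170., 155., 165.])
tte = np.array([148., 157., 146., 166., 152., 162.])
bias, lo, hi = bland_altman(cmr, tte)
```

A nonzero bias with narrow limits indicates a systematic but consistent offset between modalities; wide limits, as reported for the LVNC strain parameters, mean the two methods are not interchangeable.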
20 pages, 8445 KiB  
Article
COSMICA: A Novel Dataset for Astronomical Object Detection with Evaluation Across Diverse Detection Architectures
by Evgenii Piratinskii and Irina Rabaev
J. Imaging 2025, 11(6), 184; https://doi.org/10.3390/jimaging11060184 - 4 Jun 2025
Abstract
Accurate and efficient detection of celestial objects in telescope imagery is a fundamental challenge in both professional and amateur astronomy. Traditional methods often struggle with noise, varying brightness, and object morphology. This paper introduces COSMICA, a novel, curated dataset of manually annotated astronomical [...] Read more.
Accurate and efficient detection of celestial objects in telescope imagery is a fundamental challenge in both professional and amateur astronomy. Traditional methods often struggle with noise, varying brightness, and object morphology. This paper introduces COSMICA, a novel, curated dataset of manually annotated astronomical images collected from amateur observations. COSMICA enables the development and evaluation of real-time object detection systems intended for practical deployment in observational pipelines. We investigate three modern YOLO architectures, YOLOv8, YOLOv9, and YOLOv11, and two additional object detection models, EfficientDet-Lite0 and MobileNetV3-FasterRCNN-FPN, to assess their performance in detecting comets, galaxies, nebulae, and globular clusters. All models are evaluated using consistent experimental conditions across multiple metrics, including mAP, precision, recall, and inference speed. YOLOv11 demonstrated the highest overall accuracy and computational efficiency, making it a promising candidate for real-world astronomical applications. These results support the feasibility of integrating deep learning-based detection systems into observational astronomy workflows and highlight the importance of domain-specific datasets for training robust AI models. Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
12 pages, 2782 KiB  
Article
Platelets Image Classification Through Data Augmentation: A Comparative Study of Traditional Imaging Augmentation and GAN-Based Synthetic Data Generation Techniques Using CNNs
by Itunuoluwa Abidoye, Frances Ikeji, Charlie A. Coupland, Simon D. J. Calaminus, Nick Sander and Eva Sousa
J. Imaging 2025, 11(6), 183; https://doi.org/10.3390/jimaging11060183 - 4 Jun 2025
Abstract
Platelets play a crucial role in diagnosing and detecting various diseases, influencing the progression of conditions and guiding treatment options. Accurate identification and classification of platelets are essential for these purposes. The present study aims to create a synthetic database of platelet images [...] Read more.
Platelets play a crucial role in diagnosing and detecting various diseases, influencing the progression of conditions and guiding treatment options. Accurate identification and classification of platelets are essential for these purposes. The present study aims to create a synthetic database of platelet images using Generative Adversarial Networks (GANs) and validate its effectiveness by comparing it with datasets of increasing sizes generated through traditional augmentation techniques. Starting from an initial dataset of 71 platelet images, the dataset was expanded to 141 images (Level 1) using random oversampling and basic transformations and further to 1463 images (Level 2) through extensive augmentation (rotation, shear, zoom). Additionally, a synthetic dataset of 300 images was generated using a Wasserstein GAN with Gradient Penalty (WGAN-GP). Eight pre-trained deep learning models (DenseNet121, DenseNet169, DenseNet201, VGG16, VGG19, InceptionV3, InceptionResNetV2, and AlexNet) and two custom CNNs were evaluated across these datasets. Performance was measured using accuracy, precision, recall, and F1-score. On the extensively augmented dataset (Level 2), InceptionV3 and InceptionResNetV2 reached 99% accuracy and 99% precision/recall/F1-score, while DenseNet201 closely followed, with 98% accuracy, precision, recall and F1-score. GAN-augmented data further improved DenseNet’s performance, demonstrating the potential of GAN-generated images in enhancing platelet classification, especially where data are limited. These findings highlight the benefits of combining traditional and GAN-based augmentation techniques to improve classification performance in medical imaging tasks. Full article
(This article belongs to the Topic Machine Learning and Deep Learning in Medical Imaging)
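The Level 1 expansion (71 to 141 images) combines random oversampling with basic transformations; a sketch of one plausible variant using flips (the flip choices and toy array "images" are assumptions, not the authors' exact pipeline):

```python
import numpy as np

def oversample_with_flips(images, target_n, seed=0):
    """Grow a small image set to `target_n` by resampling existing
    images and applying simple horizontal/vertical flips."""
    rng = np.random.default_rng(seed)
    out = list(images)
    while len(out) < target_n:
        img = images[rng.integers(len(images))]   # random oversampling
        if rng.random() < 0.5:
            img = img[:, ::-1]                    # horizontal flip
        if rng.random() < 0.5:
            img = img[::-1, :]                    # vertical flip
        out.append(img)
    return out

# 71 toy grayscale "platelet images" expanded to the Level 1 size of 141.
base = [np.arange(16.0).reshape(4, 4) + i for i in range(71)]
level1 = oversample_with_flips(base, 141)
```

Level 2 in the paper extends the same idea with rotation, shear, and zoom; the GAN route instead synthesizes new samples rather than recombining existing ones.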
27 pages, 4299 KiB  
Article
A Structured and Methodological Review on Multi-View Human Activity Recognition for Ambient Assisted Living
by Fahmid Al Farid, Ahsanul Bari, Abu Saleh Musa Miah, Sarina Mansor, Jia Uddin and S. Prabha Kumaresan
J. Imaging 2025, 11(6), 182; https://doi.org/10.3390/jimaging11060182 - 3 Jun 2025
Abstract
Ambient Assisted Living (AAL) leverages technology to support the elderly and individuals with disabilities. A key challenge in these systems is efficient Human Activity Recognition (HAR). However, no study has systematically compared single-view (SV) and multi-view (MV) Human Activity Recognition approaches. This review [...] Read more.
Ambient Assisted Living (AAL) leverages technology to support the elderly and individuals with disabilities. A key challenge in these systems is efficient Human Activity Recognition (HAR). However, no study has systematically compared single-view (SV) and multi-view (MV) Human Activity Recognition approaches. This review addresses this gap by analyzing the evolution from single-view to multi-view recognition systems, covering benchmark datasets, feature extraction methods, and classification techniques. We examine how activity recognition systems have transitioned to multi-view architectures using advanced deep learning models optimized for Ambient Assisted Living, thereby improving accuracy and robustness. Furthermore, we explore a wide range of machine learning and deep learning models—including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM) networks, Temporal Convolutional Networks (TCNs), and Graph Convolutional Networks (GCNs)—along with lightweight transfer learning methods suitable for environments with limited computational resources. Key challenges such as data remediation, privacy, and generalization are discussed, alongside potential solutions such as sensor fusion and advanced learning strategies. This study offers comprehensive insights into recent advancements and future directions, guiding the development of intelligent, efficient, and privacy-compliant Human Activity Recognition systems for Ambient Assisted Living applications. Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
11 pages, 1895 KiB  
Article
3D Echocardiographic Assessment of Right Ventricular Involvement of Left Ventricular Hypertrabecularization from a New Perspective
by Márton Horváth, Kristóf Farkas-Sütő, Flóra Klára Gyulánczi, Alexandra Fábián, Bálint Lakatos, Anna Réka Kiss, Kinga Grebur, Zsófia Gregor, Balázs Mester, Attila Kovács, Béla Merkely and Andrea Szűcs
J. Imaging 2025, 11(6), 181; https://doi.org/10.3390/jimaging11060181 - 3 Jun 2025
Abstract
Right ventricular (RV) involvement in left ventricular hypertrabeculation (LVNC) remains under investigation. Due to its complex anatomy, assessing RV function is challenging, but 3D transthoracic echocardiography (3D_TTE) offers valuable insights. We aimed to evaluate volumetric, functional, and strain parameters of both ventricles in [...] Read more.
Right ventricular (RV) involvement in left ventricular hypertrabeculation (LVNC) remains under investigation. Due to its complex anatomy, assessing RV function is challenging, but 3D transthoracic echocardiography (3D_TTE) offers valuable insights. We aimed to evaluate volumetric, functional, and strain parameters of both ventricles in LVNC patients with preserved left ventricular ejection fraction (EF) and compare findings to a control group. This study included 37 LVNC patients and 37 age- and sex-matched controls. 3D_TTE recordings were analyzed using TomTec Image Arena (v. 4.7) and reVISION software to assess volumes, EF, and global/segmental strains. RV EF was further divided into longitudinal (LEF), radial (REF), and antero-posterior (AEF) components. LV volumes were significantly higher in the LVNC group, while RV volumes were comparable. EF and strain values were lower in both ventricles in LVNC patients. RV movement analysis showed significantly reduced LEF and REF, whereas AEF remained normal. These findings suggest subclinical RV dysfunction in LVNC, emphasizing the need for follow-up, even with preserved EF. Full article
(This article belongs to the Section Medical Imaging)
7 pages, 1286 KiB  
Brief Report
Photon-Counting Detector CT Scan of Dinosaur Fossils: Initial Experience
by Tasuku Wakabayashi, Kenji Takata, Soichiro Kawabe, Masato Shimada, Takeshi Mugitani, Takuya Yachida, Rikiya Maruyama, Satomi Kanai, Kiyotaka Takeuchi, Tomohiro Kotsuji, Toshiki Tateishi, Hideki Hyodoh and Tetsuya Tsujikawa
J. Imaging 2025, 11(6), 180; https://doi.org/10.3390/jimaging11060180 - 2 Jun 2025
Abstract
Beyond clinical areas, photon-counting detector (PCD) CT is innovatively applied to study paleontological specimens. This study presents a preliminary investigation into the application of PCD-CT for imaging large dinosaur fossils, comparing it with standard energy-integrating detector (EID) CT. The left dentary of Tyrannosaurus [...] Read more.
Beyond clinical areas, photon-counting detector (PCD) CT is innovatively applied to study paleontological specimens. This study presents a preliminary investigation into the application of PCD-CT for imaging large dinosaur fossils, comparing it with standard energy-integrating detector (EID) CT. The left dentary of Tyrannosaurus and the skull of Camarasaurus were imaged using PCD-CT in ultra-high-resolution mode and EID-CT. The image quality of the PCD-CT and EID-CT scans was visually assessed. Compared with EID-CT, PCD-CT yielded higher-resolution anatomical images free of image deterioration, achieving a better definition of the Tyrannosaurus mandibular canal and the three semicircular canals of Camarasaurus. PCD-CT clearly depicts the internal structure and morphology of large dinosaur fossils without damaging them and also provides spectral information, thus allowing researchers to gain insights into fossil mineral composition and preservation state in the future. Full article
(This article belongs to the Section Computational Imaging and Computational Photography)
28 pages, 3438 KiB  
Article
Optimizing Remote Sensing Image Retrieval Through a Hybrid Methodology
by Sujata Alegavi and Raghvendra Sedamkar
J. Imaging 2025, 11(6), 179; https://doi.org/10.3390/jimaging11060179 - 28 May 2025
Abstract
The contemporary challenge in remote sensing lies in the precise retrieval of increasingly abundant and high-resolution remotely sensed images (RS image) stored in expansive data warehouses. The heightened spatial and spectral resolutions, coupled with accelerated image acquisition rates, necessitate advanced tools for effective [...] Read more.
The contemporary challenge in remote sensing lies in the precise retrieval of increasingly abundant and high-resolution remotely sensed (RS) images stored in expansive data warehouses. The heightened spatial and spectral resolutions, coupled with accelerated image acquisition rates, necessitate advanced tools for effective data management, retrieval, and exploitation. The classification of large-sized images at the pixel level generates substantial data, escalating the workload and search space for similarity measurement. Semantic-based image retrieval remains an open problem due to limitations in current artificial intelligence techniques. Furthermore, on-board storage constraints compel the application of numerous compression algorithms to reduce storage space, intensifying the difficulty of retrieving substantial, sensitive, and target-specific data. This research proposes an innovative hybrid approach to enhance the retrieval of remotely sensed images. The approach leverages multilevel classification and multiscale feature extraction strategies to enhance performance. The retrieval system comprises two primary phases: database building and retrieval. Initially, the proposed Multiscale Multiangle Mean-shift with Breaking Ties (MSMA-MSBT) algorithm selects informative unlabeled samples for hyperspectral and synthetic aperture radar images through an active learning strategy. Addressing the scaling and rotation variations in image capture, a flexible and dynamic algorithm, modified Deep Image Registration using Dynamic Inlier (IRDI), is introduced for image registration. Given the complexity of remote sensing images, feature extraction occurs at two levels: low-level features are extracted using the modified Multiscale Multiangle Completed Local Binary Pattern (MSMA-CLBP) algorithm to capture local texture features, while high-level features are obtained through a hybrid CNN structure combining pretrained networks (AlexNet, CaffeNet, VGG-S, VGG-M, VGG-F, VGG-VD-16, VGG-VD-19) and a fully connected dense network. Fusion of low- and high-level features facilitates final class distinction, with soft thresholding mitigating misclassification issues. A region-based similarity measurement enhances matching percentages. Results, evaluated on high-resolution remote sensing datasets, demonstrate the effectiveness of the proposed method, outperforming traditional algorithms with an average accuracy of 86.66%. The hybrid retrieval system exhibits substantial improvements in classification accuracy, similarity measurement, and computational efficiency compared to state-of-the-art scene classification and retrieval methods. Full article
(This article belongs to the Topic Computational Intelligence in Remote Sensing: 2nd Edition)
22 pages, 20735 KiB  
Article
High-Throughput ORB Feature Extraction on Zynq SoC for Real-Time Structure-from-Motion Pipelines
by Panteleimon Stamatakis and John Vourvoulakis
J. Imaging 2025, 11(6), 178; https://doi.org/10.3390/jimaging11060178 - 28 May 2025
Abstract
This paper presents a real-time system for feature detection and description, the first stage in a structure-from-motion (SfM) pipeline. The proposed system leverages an optimized version of the ORB algorithm (oriented FAST and rotated BRIEF) implemented on the Digilent Zybo Z7020 FPGA board [...] Read more.
This paper presents a real-time system for feature detection and description, the first stage in a structure-from-motion (SfM) pipeline. The proposed system leverages an optimized version of the ORB algorithm (oriented FAST and rotated BRIEF) implemented on the Digilent Zybo Z7020 FPGA board equipped with the Xilinx Zynq-7000 SoC. The system accepts real-time video input (60 fps, 1920 × 1080 resolution, 24-bit color) via HDMI or a camera module. In order to support high frame rates for full-HD images, a double-data-rate pipeline scheme was adopted for the Harris functions. Gray-scale video with features highlighted in red is exported through a separate HDMI port. Feature descriptors are calculated inside the FPGA by the Zynq's programmable logic and verified using Xilinx's ILA IP block on a connected computer running Vivado. The implemented system achieves a latency of 192.7 microseconds, which is suitable for real-time applications. The proposed architecture is evaluated in terms of repeatability, matching retention, and matching accuracy under several image transformations, and it achieves satisfactory accuracy and performance given that changes between successive frames are slight. This work paves the way for future research on implementing the remaining stages of a real-time SfM pipeline on the proposed hardware platform. Full article
(This article belongs to the Special Issue Recent Techniques in Image Feature Extraction)
19 pages, 3903 KiB  
Article
CFANet: The Cross-Modal Fusion Attention Network for Indoor RGB-D Semantic Segmentation
by Long-Fei Wu, Dan Wei and Chang-An Xu
J. Imaging 2025, 11(6), 177; https://doi.org/10.3390/jimaging11060177 - 27 May 2025
Abstract
Indoor image semantic segmentation technology is applied to fields such as smart homes and indoor security. The challenges faced by semantic segmentation techniques using RGB images and depth maps as data sources include the semantic gap between RGB images and depth maps and [...] Read more.
Indoor image semantic segmentation technology is applied to fields such as smart homes and indoor security. The challenges faced by semantic segmentation techniques using RGB images and depth maps as data sources include the semantic gap between RGB images and depth maps and the loss of detailed information. To address these issues, a multi-head self-attention mechanism is adopted to adaptively align features of the two modalities and perform feature fusion in both spatial and channel dimensions. Appropriate feature extraction methods are designed according to the different characteristics of RGB images and depth maps. For RGB images, asymmetric convolution is introduced to capture features in the horizontal and vertical directions, enhance short-range information dependence, mitigate the gridding effect of dilated convolution, and introduce criss-cross attention to obtain contextual information from global dependency relationships. On the depth map, a strategy of extracting significant unimodal features from the channel and spatial dimensions is used. A lightweight skip connection module is designed to fuse low-level and high-level features. In addition, since the first layer contains the richest detailed information and the last layer contains rich semantic information, a feature refinement head is designed to fuse the two. The method achieves an mIoU of 53.86% and 51.85% on the NYUDv2 and SUN-RGBD datasets, which is superior to mainstream methods. Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
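As a rough illustration of the cross-modal attention fusion described in the abstract (not the authors' CFANet implementation), the sketch below uses NumPy multi-head attention in which queries come from the RGB stream and keys/values from the depth stream, so depth features are aligned to RGB positions before a simple residual fusion. All shapes, the head count, and the additive fusion are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(rgb, depth, n_heads=4):
    """Align depth features to RGB features via multi-head attention.

    rgb, depth: (tokens, dim) feature maps flattened over spatial positions.
    Queries come from RGB; keys/values from depth, so the output is depth
    information re-expressed in the RGB token layout.
    """
    tokens, dim = rgb.shape
    assert dim % n_heads == 0
    d_h = dim // n_heads
    out = np.empty_like(rgb)
    for h in range(n_heads):
        s = slice(h * d_h, (h + 1) * d_h)
        q, k, v = rgb[:, s], depth[:, s], depth[:, s]
        attn = softmax(q @ k.T / np.sqrt(d_h))  # (tokens, tokens) weights
        out[:, s] = attn @ v
    # Fuse the aligned depth features back into the RGB stream
    # (a plain residual sum here, purely for illustration).
    return rgb + out

rng = np.random.default_rng(0)
rgb = rng.standard_normal((16, 32))    # 16 spatial tokens, 32 channels
depth = rng.standard_normal((16, 32))
fused = cross_modal_attention(rgb, depth)
print(fused.shape)  # (16, 32)
```

In a real network the queries, keys, and values would pass through learned projections, but the query/key asymmetry above is what lets one modality attend across the semantic gap to the other.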
15 pages, 2957 KiB  
Article
Four-Wavelength Thermal Imaging for High-Energy-Density Industrial Processes
by Alexey Bykov, Anastasia Zolotukhina, Mikhail Poliakov, Andrey Belykh, Roman Asyutin, Anastasiia Korneeva, Vladislav Batshev and Demid Khokhlov
J. Imaging 2025, 11(6), 176; https://doi.org/10.3390/jimaging11060176 - 27 May 2025
Abstract
Multispectral imaging technology holds significant promise in the field of thermal imaging applications, primarily due to its unique ability to provide comprehensive two-dimensional spectral data distributions without the need for any form of scanning. This paper focuses on the development of an accessible [...] Read more.
Multispectral imaging technology holds significant promise in the field of thermal imaging applications, primarily due to its unique ability to provide comprehensive two-dimensional spectral data distributions without the need for any form of scanning. This paper focuses on the development of an accessible basic design concept and a method for estimating temperature maps using a four-channel spectral imaging system. The research examines key design considerations and establishes a workflow for data correction and processing, including preliminary camera calibration procedures, which are essential for accurately assessing and compensating for the characteristic properties of the optical elements and image sensors. The developed method is validated against a blackbody source, yielding a mean relative temperature error of 1%. Its practical application is illustrated by temperature mapping of a tungsten lamp filament: experiments show that the developed multispectral camera can detect and visualize non-uniform temperature distributions and localized temperature deviations with sufficient spatial resolution. Full article
(This article belongs to the Section Color, Multi-spectral, and Hyperspectral Imaging)
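The principle behind estimating temperature from a few spectral channels can be sketched with standard multi-wavelength pyrometry (a generic technique, not necessarily the authors' exact method): under the Wien approximation, ln(L·λ⁵) is linear in 1/λ with slope −c₂/T, so a least-squares line through the channel readings recovers the temperature. The channel wavelengths and the test temperature below are illustrative assumptions.

```python
import numpy as np

C2 = 1.4388e-2  # second radiation constant, m*K

def planck_radiance(lam, T):
    """Blackbody spectral radiance (full Planck law, used here to
    synthesize the four channel signals)."""
    C1 = 1.191e-16  # first radiation constant for spectral radiance
    return C1 / lam**5 / (np.exp(C2 / (lam * T)) - 1.0)

def fit_temperature(lam, signal):
    """Multi-wavelength pyrometry via the Wien approximation:
    ln(L * lam^5) = const - (C2/T) * (1/lam), so the slope of a
    least-squares line through the channels gives the temperature."""
    y = np.log(signal * lam**5)
    x = 1.0 / lam
    slope, _ = np.polyfit(x, y, 1)
    return -C2 / slope

# Four illustrative channel wavelengths in the visible/near-IR range.
lam = np.array([600e-9, 700e-9, 800e-9, 900e-9])
T_true = 2800.0  # a tungsten-filament-scale temperature, K
signal = planck_radiance(lam, T_true)
T_est = fit_temperature(lam, signal)
print(T_est)
```

The small bias between the fit and the true temperature comes from the Wien approximation neglecting the −1 in Planck's law; at these wavelengths and temperatures it stays well below the 1% error level the paper reports.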