J. Imaging, Volume 11, Issue 7 (July 2025) – 41 articles

Cover Story: The study examines the influence of flashing light at the critical flicker fusion frequency on cortical excitability in the human brain. EEG recordings revealed that selective chromatic flicker stimulation, combined with light scattering, enhances magnocellular stimulation and parvocellular pathway inhibition. This resulted in increased high-frequency brain oscillations (i.e., beta and gamma waves), indicating neuroplastic modulation. Furthermore, the article introduces a new non-invasive way to obtain the E/I ratio and a new metric to quantify the peak shift of the main frequency ranges. These results suggest that non-invasive visual flicker stimulation is a promising tool for rebalancing cortical excitation/inhibition dynamics, with therapeutic potential for neurological and psychiatric diseases such as Alzheimer's disease, epilepsy, depression, and schizophrenia.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view the papers in PDF format, click on the "PDF Full-text" link and open them with the free Adobe Reader.
14 pages, 2370 KiB  
Article
DP-AMF: Depth-Prior–Guided Adaptive Multi-Modal and Global–Local Fusion for Single-View 3D Reconstruction
by Luoxi Zhang, Chun Xie and Itaru Kitahara
J. Imaging 2025, 11(7), 246; https://doi.org/10.3390/jimaging11070246 - 21 Jul 2025
Viewed by 283
Abstract
Single-view 3D reconstruction remains fundamentally ill-posed, as a single RGB image lacks scale and depth cues, often yielding ambiguous results under occlusion or in texture-poor regions. We propose DP-AMF, a novel Depth-Prior–Guided Adaptive Multi-Modal and Global–Local Fusion framework that integrates high-fidelity depth priors—generated offline by the MARIGOLD diffusion-based estimator and cached to avoid extra training cost—with hierarchical local features from ResNet-32/ResNet-18 and semantic global features from DINO-ViT. A learnable fusion module dynamically adjusts per-channel weights to balance these modalities according to local texture and occlusion, and an implicit signed-distance field decoder reconstructs the final mesh. Extensive experiments on 3D-FRONT and Pix3D demonstrate that DP-AMF reduces Chamfer Distance by 7.64%, increases F-Score by 2.81%, and boosts Normal Consistency by 5.88% compared to strong baselines, while qualitative results show sharper edges and more complete geometry in challenging scenes. DP-AMF achieves these gains without substantially increasing model size or inference time, offering a robust and effective solution for complex single-view reconstruction tasks. Full article
(This article belongs to the Section AI in Imaging)
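As a rough illustration of the learnable per-channel fusion described in the abstract, the sketch below weights depth, local, and global feature maps with a softmax-normalized gate (the module name, channel count, and gating design are assumptions, not the authors' implementation):

```python
import torch
import torch.nn as nn

class ChannelWeightedFusion(nn.Module):
    """Fuse depth, local, and global feature maps with learnable per-channel weights."""
    def __init__(self, channels: int):
        super().__init__()
        # One gating vector per modality; softmax over modalities keeps weights normalized.
        self.gate = nn.Parameter(torch.zeros(3, channels))

    def forward(self, depth_feat, local_feat, global_feat):
        # Each input tensor: (batch, channels, H, W)
        w = torch.softmax(self.gate, dim=0)          # (3, C); weights sum to 1 per channel
        w = w.view(3, 1, -1, 1, 1)                   # broadcast over batch and spatial dims
        stacked = torch.stack([depth_feat, local_feat, global_feat], dim=0)  # (3, B, C, H, W)
        return (w * stacked).sum(dim=0)              # (B, C, H, W)

# Dummy usage with random feature maps
fusion = ChannelWeightedFusion(channels=64)
d, l, g = (torch.randn(2, 64, 32, 32) for _ in range(3))
print(fusion(d, l, g).shape)                         # torch.Size([2, 64, 32, 32])
```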

11 pages, 1106 KiB  
Review
Three-Dimensional Ultraviolet Fluorescence Imaging in Cultural Heritage: A Review of Applications in Multi-Material Artworks
by Luca Lanteri, Claudia Pelosi and Paola Pogliani
J. Imaging 2025, 11(7), 245; https://doi.org/10.3390/jimaging11070245 - 21 Jul 2025
Viewed by 346
Abstract
Ultraviolet-induced fluorescence (UVF) imaging represents a simple but powerful technique in cultural heritage studies. It is a nondestructive and non-invasive imaging technique which can supply useful and relevant information to define the state of conservation of an artifact. UVF imaging also helps to establish the value of an artwork by indicating inpainting, repaired areas, grouting, etc. In general, ultraviolet fluorescence imaging output takes the form of 2D photographs in the case of both paintings and sculptures. For this reason, a few years ago the idea of applying the photogrammetric method to create 3D digital twins under ultraviolet fluorescence was developed to address the requirements of restorers who need daily documentation tools for their work that are simple to use and can display the entire 3D object in a single file. This review explores recent applications of this innovative method of ultraviolet fluorescence imaging with reference to the wider literature on the UVF technique to make evident the practical importance of its application in cultural heritage. Full article
(This article belongs to the Section Color, Multi-spectral, and Hyperspectral Imaging)

27 pages, 3888 KiB  
Article
Deep Learning-Based Algorithm for the Classification of Left Ventricle Segments by Hypertrophy Severity
by Wafa Baccouch, Bilel Hasnaoui, Narjes Benameur, Abderrazak Jemai, Dhaker Lahidheb and Salam Labidi
J. Imaging 2025, 11(7), 244; https://doi.org/10.3390/jimaging11070244 - 20 Jul 2025
Viewed by 337
Abstract
In clinical practice, left ventricle hypertrophy (LVH) continues to pose a considerable challenge, highlighting the need for more reliable diagnostic approaches. This study aims to propose an automated framework for the quantification of LVH extent and the classification of myocardial segments according to hypertrophy severity using a deep learning-based algorithm. The proposed method was validated on 133 subjects, including both healthy individuals and patients with LVH. The process starts with automatic LV segmentation using U-Net and the segmentation of the left ventricle cavity based on the American Heart Association (AHA) standards, followed by the division of each segment into three equal sub-segments. Then, an automated quantification of regional wall thickness (RWT) was performed. Finally, a convolutional neural network (CNN) was developed to classify each myocardial sub-segment according to hypertrophy severity. The proposed approach demonstrates strong performance in contour segmentation, achieving a Dice Similarity Coefficient (DSC) of 98.47% and a Hausdorff Distance (HD) of 6.345 ± 3.5 mm. For thickness quantification, it reaches a minimal mean absolute error (MAE) of 1.01 ± 1.16. Regarding segment classification, it achieves competitive performance metrics compared to state-of-the-art methods with an accuracy of 98.19%, a precision of 98.27%, a recall of 99.13%, and an F1-score of 98.7%. The obtained results confirm the high performance of the proposed method and highlight its clinical utility in accurately assessing and classifying cardiac hypertrophy. This approach provides valuable insights that can guide clinical decision-making and improve patient management strategies. Full article
(This article belongs to the Section Medical Imaging)
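For readers who want to reproduce the reported segmentation metrics, the Dice Similarity Coefficient can be computed directly from binary masks; a minimal NumPy sketch (the toy masks are placeholders, not study data):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice Similarity Coefficient (DSC) between two binary segmentation masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    return 2.0 * intersection / total if total > 0 else 1.0

# Toy example with two overlapping square masks
a = np.zeros((8, 8), dtype=bool); a[2:6, 2:6] = True
b = np.zeros((8, 8), dtype=bool); b[3:7, 3:7] = True
print(round(dice_coefficient(a, b), 3))  # 0.562
```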

15 pages, 4874 KiB  
Article
A Novel 3D Convolutional Neural Network-Based Deep Learning Model for Spatiotemporal Feature Mapping for Video Analysis: Feasibility Study for Gastrointestinal Endoscopic Video Classification
by Mrinal Kanti Dhar, Mou Deb, Poonguzhali Elangovan, Keerthy Gopalakrishnan, Divyanshi Sood, Avneet Kaur, Charmy Parikh, Swetha Rapolu, Gianeshwaree Alias Rachna Panjwani, Rabiah Aslam Ansari, Naghmeh Asadimanesh, Shiva Sankari Karuppiah, Scott A. Helgeson, Venkata S. Akshintala and Shivaram P. Arunachalam
J. Imaging 2025, 11(7), 243; https://doi.org/10.3390/jimaging11070243 - 18 Jul 2025
Viewed by 410
Abstract
Accurate analysis of medical videos remains a major challenge in deep learning (DL) due to the need for effective spatiotemporal feature mapping that captures both spatial detail and temporal dynamics. Despite advances in DL, most existing models in medical AI focus on static images, overlooking critical temporal cues present in video data. To bridge this gap, a novel DL-based framework is proposed for spatiotemporal feature extraction from medical video sequences. As a feasibility use case, this study focuses on gastrointestinal (GI) endoscopic video classification. A 3D convolutional neural network (CNN) is developed to classify upper and lower GI endoscopic videos using the hyperKvasir dataset, which contains 314 lower and 60 upper GI videos. To address data imbalance, 60 matched pairs of videos are randomly selected across 20 experimental runs. Videos are resized to 224 × 224, and the 3D CNN captures spatiotemporal information. A 3D version of the parallel spatial and channel squeeze-and-excitation (P-scSE) is implemented, and a new block called the residual with parallel attention (RPA) block is proposed by combining P-scSE3D with a residual block. To reduce computational complexity, a (2 + 1)D convolution is used in place of full 3D convolution. The model achieves an average accuracy of 0.933, precision of 0.932, recall of 0.944, F1-score of 0.935, and AUC of 0.933. It is also observed that the integration of P-scSE3D increased the F1-score by 7%. This preliminary work opens avenues for exploring various GI endoscopic video-based prospective studies. Full article
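The (2+1)D convolution mentioned in the abstract factorizes a full 3D kernel into a spatial 2D convolution followed by a temporal 1D convolution, cutting parameters and computation; a minimal PyTorch sketch (channel sizes and clip shape are placeholders, not the paper's configuration):

```python
import torch
import torch.nn as nn

class Conv2Plus1D(nn.Module):
    """Factorized (2+1)D convolution: spatial (1x3x3) conv followed by temporal (3x1x1) conv."""
    def __init__(self, in_ch: int, out_ch: int, mid_ch: int = None):
        super().__init__()
        mid_ch = mid_ch or out_ch
        self.spatial = nn.Conv3d(in_ch, mid_ch, kernel_size=(1, 3, 3), padding=(0, 1, 1))
        self.temporal = nn.Conv3d(mid_ch, out_ch, kernel_size=(3, 1, 1), padding=(1, 0, 0))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):  # x: (batch, channels, frames, H, W)
        return self.temporal(self.act(self.spatial(x)))

clip = torch.randn(1, 3, 16, 112, 112)      # a 16-frame RGB clip
print(Conv2Plus1D(3, 32)(clip).shape)        # torch.Size([1, 32, 16, 112, 112])
```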

14 pages, 2426 KiB  
Article
FakeMusicCaps: A Dataset for Detection and Attribution of Synthetic Music Generated via Text-to-Music Models
by Luca Comanducci, Paolo Bestagini and Stefano Tubaro
J. Imaging 2025, 11(7), 242; https://doi.org/10.3390/jimaging11070242 - 18 Jul 2025
Viewed by 362
Abstract
Text-to-music (TTM) models have recently revolutionized the automatic music generation research field, specifically by being able to generate music that sounds more plausible than all previous state-of-the-art models and by lowering the technical proficiency needed to use them. For these reasons, they have readily started to be adopted for commercial uses and music production practices. This widespread diffusion of TTMs poses several concerns regarding copyright violation and rightful attribution, posing the need of serious consideration of them by the audio forensics community. In this paper, we tackle the problem of detection and attribution of TTM-generated data. We propose a dataset, FakeMusicCaps, that contains several versions of the music-caption pairs dataset MusicCaps regenerated via several state-of-the-art TTM techniques. We evaluate the proposed dataset by performing initial experiments regarding the detection and attribution of TTM-generated audio considering both closed-set and open-set classification. Full article

18 pages, 2200 KiB  
Article
A Self-Supervised Adversarial Deblurring Face Recognition Network for Edge Devices
by Hanwen Zhang, Myun Kim, Baitong Li and Yanping Lu
J. Imaging 2025, 11(7), 241; https://doi.org/10.3390/jimaging11070241 - 15 Jul 2025
Viewed by 331
Abstract
With the advancement of information technology, human activity recognition (HAR) has been widely applied in fields such as intelligent surveillance, health monitoring, and human–computer interaction. As a crucial component of HAR, facial recognition plays a key role, especially in vision-based activity recognition. However, current facial recognition models on the market perform poorly in handling blurry images and dynamic scenarios, limiting their effectiveness in real-world HAR applications. This study aims to construct a fast and accurate facial recognition model based on novel adversarial learning and deblurring theory to enhance its performance in human activity recognition. The model employs a generative adversarial network (GAN) as the core algorithm, optimizing its generation and recognition modules by decomposing the global loss function and incorporating a feature pyramid, thereby solving the balance challenge in GAN training. Additionally, deblurring techniques are introduced to improve the model’s ability to handle blurry and dynamic images. Experimental results show that the proposed model achieves high accuracy and recall rates across multiple facial recognition datasets, with an average recall rate of 87.40% and accuracy rates of 81.06% and 79.77% on the YTF, IMDB-WIKI, and WiderFace datasets, respectively. These findings confirm that the model effectively addresses the challenges of recognizing faces in dynamic and blurry conditions in human activity recognition, demonstrating significant application potential. Full article
(This article belongs to the Special Issue Techniques and Applications in Face Image Analysis)

22 pages, 5106 KiB  
Article
Predicting Very Early-Stage Breast Cancer in BI-RADS 3 Lesions of Large Population with Deep Learning
by Congyu Wang, Changzhen Li and Gengxiao Lin
J. Imaging 2025, 11(7), 240; https://doi.org/10.3390/jimaging11070240 - 15 Jul 2025
Viewed by 337
Abstract
Breast cancer accounts for one in four new malignant tumors in women, and misdiagnosis can lead to severe consequences, including delayed treatment. Among patients classified with a BI-RADS 3 rating, the risk of very early-stage malignancy remains over 2%. However, due to the benign imaging characteristics of these lesions, radiologists often recommend follow-up rather than immediate biopsy, potentially missing critical early interventions. This study aims to develop a deep learning (DL) model to accurately identify very early-stage malignancies in BI-RADS 3 lesions using ultrasound (US) images, thereby improving diagnostic precision and clinical decision-making. A total of 852 lesions (256 malignant and 596 benign) from 685 patients who underwent biopsies or 3-year follow-up were collected by Southwest Hospital (SW) and Tangshan People’s Hospital (TS) to develop and validate a deep learning model based on a novel transfer learning method. To further evaluate the performance of the model, six radiologists independently reviewed the external testing set on a web-based rating platform. The proposed model achieved an area under the receiver operating characteristic curve (AUC), sensitivity, and specificity of 0.880, 0.786, and 0.833 in predicting BI-RADS 3 malignant lesions in the internal testing set. The proposed transfer learning method improves the clinical AUC of predicting BI-RADS 3 malignancy from 0.721 to 0.880. In the external testing set, the model achieved AUC, sensitivity, and specificity of 0.910, 0.875, and 0.786 and outperformed the radiologists with an average AUC of 0.653 (p = 0.021). The DL model could detect very early-stage malignancy of BI-RADS 3 lesions in US images and had higher diagnostic capability compared with experienced radiologists. Full article
(This article belongs to the Section Medical Imaging)

21 pages, 5493 KiB  
Article
Estimating Snow-Related Daily Change Events in the Canadian Winter Season: A Deep Learning-Based Approach
by Karim Malik, Isteyak Isteyak and Colin Robertson
J. Imaging 2025, 11(7), 239; https://doi.org/10.3390/jimaging11070239 - 14 Jul 2025
Viewed by 218
Abstract
Snow water equivalent (SWE), an essential parameter of snow, is largely studied to understand the impact of climate regime effects on snowmelt patterns. This study developed a Siamese Attention U-Net (Si-Att-UNet) model to detect daily change events in the winter season. The daily SWE change event detection task is treated as an image content comparison problem in which the Si-Att-UNet compares a pair of SWE maps sampled at two temporal windows. The model detected SWE similarity and dissimilarity with an F1 score of 99.3% at a 50% confidence threshold. The change events were derived from the model’s prediction of SWE similarity using the 50% threshold. Daily SWE change events increased between 1979 and 2018. However, the SWE change events were significant in March and April, with a positive Mann–Kendall test statistic (tau = 0.25 and 0.38, respectively). The highest frequency of zero-change events occurred in February. A comparison of the SWE change events and mean change segments with those of the northern hemisphere’s climate anomalies revealed that low temperature and low precipitation anomalies reduced the frequency of SWE change events. The findings highlight the influence of climate variables on daily changes in snow-related water storage in March and April. Full article
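The Mann–Kendall trend statistics quoted above are, in essence, Kendall's tau computed against time; the snippet below shows the call on a synthetic series (the data are made up for illustration, not the study's SWE record):

```python
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(0)
years = np.arange(1979, 2019)
# Synthetic annual counts of SWE change events with a mild upward trend
events = 50 + 0.4 * (years - 1979) + rng.normal(0, 5, size=years.size)

tau, p_value = kendalltau(years, events)   # Mann-Kendall trend test reduces to tau vs. time
print(f"tau = {tau:.2f}, p = {p_value:.4f}")
```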

17 pages, 464 KiB  
Article
Detection of Major Depressive Disorder from Functional Magnetic Resonance Imaging Using Regional Homogeneity and Feature/Sample Selective Evolving Voting Ensemble Approaches
by Bindiya A. R., B. S. Mahanand, Vasily Sachnev and DIRECT Consortium
J. Imaging 2025, 11(7), 238; https://doi.org/10.3390/jimaging11070238 - 14 Jul 2025
Viewed by 327
Abstract
Major depressive disorder is a mental illness characterized by persistent sadness or loss of interest that affects a person's daily life. Early detection of this disorder is crucial for providing timely and effective treatment. Neuroimaging modalities, namely, functional magnetic resonance imaging, can be used to identify changes in brain regions related to major depressive disorder. In this study, regional homogeneity images, one of the derivatives of functional magnetic resonance imaging, are employed to detect major depressive disorder using the proposed feature/sample selective evolving voting ensemble approach. A total of 2380 subjects consisting of 1104 healthy controls and 1276 patients with major depressive disorder from the REST-meta-MDD consortium are studied. Regional homogeneity features from 90 regions are extracted using the automated anatomical labeling template. These regional homogeneity features are then fed as input to the proposed feature/sample selective evolving voting ensemble for classification. The proposed approach achieves an accuracy of 91.93%, and discriminative features obtained from the classifier are used to identify brain regions which may be responsible for major depressive disorder. A total of nine brain regions, namely, left superior temporal gyrus, left postcentral gyrus, left anterior cingulate gyrus, right inferior parietal lobule, right superior medial frontal gyrus, left lingual gyrus, right putamen, left fusiform gyrus, and left middle temporal gyrus, are identified. This study clearly indicates that these brain regions play a critical role in detecting major depressive disorder. Full article
(This article belongs to the Section Medical Imaging)
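A plain soft-voting ensemble over the 90 ReHo features can be assembled with scikit-learn as a baseline; note this is a generic VotingClassifier, not the authors' feature/sample selective evolving scheme, and the data below are random placeholders:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X = np.random.rand(200, 90)                  # 90 ReHo features per subject (random stand-ins)
y = np.random.randint(0, 2, size=200)        # 0 = healthy control, 1 = MDD
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("svm", SVC(probability=True)),
        ("rf", RandomForestClassifier(n_estimators=100)),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    voting="soft",                           # average predicted class probabilities
)
ensemble.fit(X_tr, y_tr)
print("held-out accuracy:", ensemble.score(X_te, y_te))
```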

18 pages, 2182 KiB  
Article
Visual Neuroplasticity: Modulating Cortical Excitability with Flickering Light Stimulation
by Francisco J. Ávila
J. Imaging 2025, 11(7), 237; https://doi.org/10.3390/jimaging11070237 - 14 Jul 2025
Viewed by 620
Abstract
The balance between cortical excitation and inhibition (E/I balance) in the cerebral cortex is critical for cognitive processing and neuroplasticity. Modulation of this balance has been linked to a wide range of neuropsychiatric and neurodegenerative disorders. The human visual system has well-differentiated magnocellular (M) and parvocellular (P) pathways, which provide a useful model to study cortical excitability using non-invasive visual flicker stimulation. We present an Arduino-driven non-image-forming system to deliver controlled flickering light stimuli at different frequencies and wavelengths. By stimulating at the critical flicker fusion (CFF) frequency, we attempt to modulate M-pathway activity and attenuate P-pathway responses, in parallel with induced optical scattering. EEG recordings were used to monitor cortical excitability and oscillatory dynamics during visual stimulation. Visual stimulation at the CFF, combined with induced optical scattering, selectively enhanced magnocellular activity and suppressed parvocellular input. EEG analysis showed a modulation of cortical oscillations, especially in the high-frequency beta and gamma range. Our results support the hypothesis that visual flicker at the CFF, in addition to spatial degradation, initiates detectable neuroplasticity and regulates cortical excitation and inhibition. These findings suggest new avenues for therapeutic manipulation through visual pathways in diseases such as Alzheimer's disease, epilepsy, severe depression, and schizophrenia. Full article
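Band-limited power of the kind used to track the beta and gamma modulation can be estimated with Welch's method; the sketch below uses a random signal and a simple gamma/beta power ratio as a stand-in, which is not the paper's E/I metric:

```python
import numpy as np
from scipy.signal import welch

fs = 500.0                                   # assumed sampling rate in Hz
eeg = np.random.randn(int(10 * fs))          # stand-in for one 10-s EEG channel

freqs, psd = welch(eeg, fs=fs, nperseg=1024)
df = freqs[1] - freqs[0]

def band_power(lo: float, hi: float) -> float:
    mask = (freqs >= lo) & (freqs < hi)
    return float(psd[mask].sum() * df)       # approximate integral of the PSD over the band

beta, gamma = band_power(13, 30), band_power(30, 80)
print(f"beta = {beta:.4f}, gamma = {gamma:.4f}, gamma/beta = {gamma / beta:.3f}")
```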

11 pages, 1628 KiB  
Article
Bone Mineral Density (BMD) Assessment Using Dual-Energy CT with Different Base Material Pairs (BMPs)
by Stefano Piscone, Sara Saccone, Paola Milillo, Giorgia Schiraldi, Roberta Vinci, Luca Macarini and Luca Pio Stoppino
J. Imaging 2025, 11(7), 236; https://doi.org/10.3390/jimaging11070236 - 13 Jul 2025
Viewed by 296
Abstract
The assessment of bone mineral density (BMD) is essential for osteoporosis diagnosis. Dual-energy X-ray Absorptiometry (DXA) is the current gold standard, but it has limitations in evaluating trabecular bone and is susceptible to different artifacts. In this study we evaluate whether Dual-Energy Computed Tomography (DECT) can serve as an alternative method for the assessment of BMD in a sample of postmenopausal patients undergoing oncological follow-up. A retrospective analysis was conducted on 41 patients who had both DECT and DXA within six months. BMD values were extracted from DECT using five different base material pairs (BMPs) and compared with DXA measurements at the femoral neck. The calcium–fat pairing showed the strongest correlation with DXA-derived BMD (Spearman's ρ = 0.797) and excellent reproducibility (ICC = 0.983). There was a strong and significant association between the DXA results and the various BMP measurements. These findings support the potential of DECT for precise, opportunistic evaluation of BMD changes when employing particular BMPs. This technique can therefore be a useful and effective substitute for conventional DXA, particularly for patients already undergoing DECT as part of oncological follow-up, minimizing additional radiation exposure. Full article
(This article belongs to the Section Medical Imaging)
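The Spearman correlation between DECT-derived and DXA-derived BMD values is a one-line call once both measurement vectors are available; the values below are synthetic placeholders used only to show the computation:

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(42)
dxa_bmd = rng.normal(0.85, 0.12, size=41)            # g/cm^2, synthetic stand-ins
dect_bmd = dxa_bmd + rng.normal(0.0, 0.05, size=41)  # correlated synthetic DECT values

rho, p = spearmanr(dect_bmd, dxa_bmd)
print(f"Spearman rho = {rho:.3f} (p = {p:.2e})")
```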

23 pages, 3404 KiB  
Article
MST-AI: Skin Color Estimation in Skin Cancer Datasets
by Vahid Khalkhali, Hayan Lee, Joseph Nguyen, Sergio Zamora-Erazo, Camille Ragin, Abhishek Aphale, Alfonso Bellacosa, Ellis P. Monk and Saroj K. Biswas
J. Imaging 2025, 11(7), 235; https://doi.org/10.3390/jimaging11070235 - 13 Jul 2025
Viewed by 307
Abstract
The absence of skin color information in skin cancer datasets poses a significant challenge for accurate diagnosis using artificial intelligence models, particularly for non-white populations. In this paper, based on the Monk Skin Tone (MST) scale, which is less biased than the Fitzpatrick scale, we propose MST-AI, a novel method for detecting skin color in images of large datasets, such as the International Skin Imaging Collaboration (ISIC) archive. The approach includes automatic frame and lesion removal, lesion segmentation using convolutional neural networks, and modeling of normal skin tones with a Variational Bayesian Gaussian Mixture Model (VB-GMM). The distribution of skin color predictions was compared with MST scale probability distribution functions (PDFs) using the Kullback-Leibler Divergence (KLD) metric. Validation against manual annotations and comparison with K-means clustering of image and skin mean RGBs demonstrated the superior performance of MST-AI, with Kendall's Tau, Spearman's Rho, and Normalized Discounted Cumulative Gain (NDCG) of 0.68, 0.69, and 1.00, respectively. This research lays the groundwork for developing unbiased AI models for early skin cancer diagnosis by addressing skin color imbalances in large datasets. Full article
(This article belongs to the Section AI in Imaging)
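Comparing a predicted skin-tone distribution against an MST reference with the Kullback–Leibler divergence can be sketched as follows (the two 10-bin histograms are toy placeholders, not values from the paper):

```python
import numpy as np

def kl_divergence(p: np.ndarray, q: np.ndarray, eps: float = 1e-12) -> float:
    """KL divergence D(P || Q) between two discrete distributions."""
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# Toy histograms over the 10 Monk Skin Tone bins
predicted = np.array([2, 5, 9, 14, 20, 18, 13, 10, 6, 3], dtype=float)
reference = np.array([3, 6, 10, 15, 18, 17, 12, 9, 7, 3], dtype=float)
print(f"KLD = {kl_divergence(predicted, reference):.4f}")
```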

14 pages, 2707 KiB  
Article
Implementation of an Artificial Intelligence Denoising Algorithm Using SubtlePET™ with Various Radiotracers: 18F-FDG, 68Ga-PSMA-11 and 18F-FDOPA, Impact on the Technologist Radiation Doses
by Jules Zhang-Yin, Octavian Dragusin, Paul Jonard, Christian Picard, Justine Grangeret, Christopher Bonnier, Philippe P. Leveque, Joel Aerts and Olivier Schaeffer
J. Imaging 2025, 11(7), 234; https://doi.org/10.3390/jimaging11070234 - 11 Jul 2025
Viewed by 274
Abstract
This study assesses the clinical deployment of SubtlePET™, a commercial AI-based denoising algorithm, across three radiotracers—18F-FDG, 68Ga-PSMA-11, and 18F-FDOPA—with the goal of improving image quality while reducing injected activity, technologist radiation exposure, and scan time. A retrospective analysis on a digital PET/CT system showed that SubtlePET™ enabled dose reductions exceeding 33% and time savings of over 25%. AI-enhanced images were rated interpretable in 100% of cases versus 65% for standard low-dose reconstructions. Notably, 85% of AI-enhanced scans received the maximum Likert quality score (5/5), indicating excellent diagnostic confidence and noise suppression, compared to only 50% with conventional reconstruction. The quantitative image quality improved significantly across all tracers, with SNR and CNR gains of 50–70%. Radiotracer dose reductions were particularly substantial in low-BMI patients (up to 41% for FDG), and the technologist exposure decreased for high-exposure roles. The daily patient throughput increased by an average of 4.84 cases. These findings support the robust integration of SubtlePET™ into routine clinical PET practice, offering improved efficiency, safety, and image quality without compromising lesion detectability. Full article
(This article belongs to the Section Medical Imaging)
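The SNR and CNR gains reported above are typically computed from lesion and background regions of interest; a generic NumPy sketch (the ROIs here are random arrays, not PET data):

```python
import numpy as np

def snr_cnr(lesion_roi: np.ndarray, background_roi: np.ndarray):
    """Simple ROI-based signal-to-noise and contrast-to-noise ratios."""
    mu_lesion, mu_bg = lesion_roi.mean(), background_roi.mean()
    sigma_bg = background_roi.std()
    return mu_lesion / sigma_bg, (mu_lesion - mu_bg) / sigma_bg

rng = np.random.default_rng(1)
lesion = rng.normal(12.0, 1.0, size=(9, 9))       # synthetic uptake values in a lesion ROI
background = rng.normal(2.0, 0.8, size=(30, 30))  # synthetic background ROI
snr, cnr = snr_cnr(lesion, background)
print(f"SNR = {snr:.1f}, CNR = {cnr:.1f}")
```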

2 pages, 153 KiB  
Correction
Correction: Pegoraro et al. Cardiac Magnetic Resonance in the Assessment of Atrial Cardiomyopathy and Pulmonary Vein Isolation Planning for Atrial Fibrillation. J. Imaging 2025, 11, 143
by Nicola Pegoraro, Serena Chiarello, Riccardo Bisi, Giuseppe Muscogiuri, Matteo Bertini, Aldo Carnevale, Melchiore Giganti and Alberto Cossu
J. Imaging 2025, 11(7), 233; https://doi.org/10.3390/jimaging11070233 - 11 Jul 2025
Viewed by 160
Abstract
In the original publication [...] Full article
20 pages, 2750 KiB  
Article
E-InMeMo: Enhanced Prompting for Visual In-Context Learning
by Jiahao Zhang, Bowen Wang, Hong Liu, Liangzhi Li, Yuta Nakashima and Hajime Nagahara
J. Imaging 2025, 11(7), 232; https://doi.org/10.3390/jimaging11070232 - 11 Jul 2025
Viewed by 302
Abstract
Large-scale models trained on extensive datasets have become the standard due to their strong generalizability across diverse tasks. In-context learning (ICL), widely used in natural language processing, leverages these models by providing task-specific prompts without modifying their parameters. This paradigm is increasingly being adapted for computer vision, where models receive an input–output image pair, known as an in-context pair, alongside a query image to illustrate the desired output. However, the success of visual ICL largely hinges on the quality of these prompts. To address this, we propose Enhanced Instruct Me More (E-InMeMo), a novel approach that incorporates learnable perturbations into in-context pairs to optimize prompting. Through extensive experiments on standard vision tasks, E-InMeMo demonstrates superior performance over existing state-of-the-art methods. Notably, it improves mIoU scores by 7.99 for foreground segmentation and by 17.04 for single object detection when compared to the baseline without learnable prompts. These results highlight E-InMeMo as a lightweight yet effective strategy for enhancing visual ICL. Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
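The idea of a learnable perturbation on the in-context pair can be pictured as a small trainable tensor added to the prompt images while the large vision model stays frozen; the sketch below is schematic and is not the released E-InMeMo code:

```python
import torch
import torch.nn as nn

class LearnablePrompt(nn.Module):
    """Adds a bounded, trainable perturbation to an in-context (input, label) image pair."""
    def __init__(self, channels=3, height=224, width=224, epsilon=0.1):
        super().__init__()
        self.delta = nn.Parameter(torch.zeros(2, channels, height, width))
        self.epsilon = epsilon                      # keeps the perturbation small

    def forward(self, in_context_pair):             # (2, C, H, W)
        perturbed = in_context_pair + self.epsilon * torch.tanh(self.delta)
        return perturbed.clamp(0.0, 1.0)

pair = torch.rand(2, 3, 224, 224)                   # example image and its label image
prompt = LearnablePrompt()
print(prompt(pair).shape)                            # only prompt.delta would receive gradients
```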

12 pages, 4368 KiB  
Article
A Dual-Branch Fusion Model for Deepfake Detection Using Video Frames and Microexpression Features
by Georgios Petmezas, Vazgken Vanian, Manuel Pastor Rufete, Eleana E. I. Almaloglou and Dimitris Zarpalas
J. Imaging 2025, 11(7), 231; https://doi.org/10.3390/jimaging11070231 - 11 Jul 2025
Viewed by 421
Abstract
Deepfake detection has become a critical issue due to the rise of synthetic media and its potential for misuse. In this paper, we propose a novel approach to deepfake detection by combining video frame analysis with facial microexpression features. The dual-branch fusion model utilizes a 3D ResNet18 for spatiotemporal feature extraction and a transformer model to capture microexpression patterns, which are difficult to replicate in manipulated content. We evaluate the model on the widely used FaceForensics++ (FF++) dataset and demonstrate that our approach outperforms existing state-of-the-art methods, achieving 99.81% accuracy and a perfect ROC-AUC score of 100%. The proposed method highlights the importance of integrating diverse data sources for deepfake detection, addressing some of the current limitations of existing systems. Full article

22 pages, 3354 KiB  
Article
PS-YOLO-seg: A Lightweight Instance Segmentation Method for Lithium Mineral Microscopic Images Based on Improved YOLOv12-seg
by Zeyang Qiu, Xueyu Huang, Zhicheng Deng, Xiangyu Xu and Zhenzhong Qiu
J. Imaging 2025, 11(7), 230; https://doi.org/10.3390/jimaging11070230 - 10 Jul 2025
Viewed by 485
Abstract
Microscopic image automatic recognition is a core technology for mineral composition analysis and plays a crucial role in advancing the intelligent development of smart mining systems. To overcome the limitations of traditional lithium ore analysis and meet the challenges of deployment on edge devices, we propose PS-YOLO-seg, a lightweight segmentation model specifically designed for lithium mineral analysis under visible light microscopy. The network is compressed by adjusting the width factor to reduce global channel redundancy. A PSConv-based downsampling strategy enhances the network’s ability to capture dim mineral textures under microscopic conditions. In addition, the improved C3k2-PS module strengthens feature extraction, while the streamlined Segment-Efficient head minimizes redundant computation, further reducing the overall model complexity. PS-YOLO-seg achieves a slightly improved segmentation performance compared to the baseline YOLOv12n model on a self-constructed lithium ore microscopic dataset, while reducing FLOPs by 20%, parameter count by 33%, and model size by 32%. Additionally, it achieves a faster inference speed, highlighting its potential for practical deployment. This work demonstrates how architectural optimization and targeted enhancements can significantly improve instance segmentation performance while maintaining speed and compactness, offering strong potential for real-time deployment in industrial settings and edge computing scenarios. Full article
(This article belongs to the Special Issue Advances in Machine Learning for Computer Vision Applications)

14 pages, 5045 KiB  
Article
Depth-Dependent Variability in Ultrasound Attenuation Imaging for Hepatic Steatosis: A Pilot Study of ATI and HRI in Healthy Volunteers
by Alexander Martin, Oliver Hurni, Catherine Paverd, Olivia Hänni, Lisa Ruby, Thomas Frauenfelder and Florian A. Huber
J. Imaging 2025, 11(7), 229; https://doi.org/10.3390/jimaging11070229 - 9 Jul 2025
Viewed by 344
Abstract
Ultrasound attenuation imaging (ATI) is a non-invasive method for quantifying hepatic steatosis, offering advantages over the hepatorenal index (HRI). However, its reliability can be influenced by factors such as measurement depth, ROI size, and subcutaneous fat. This paper examines the impact of these confounders on ATI measurements and discusses diagnostic considerations. In this study, 33 healthy adults underwent liver ultrasound with ATI and HRI protocols. ATI measurements were taken at depths of 2–5 cm below the liver capsule using small and large ROIs. Two operators performed the measurements, and inter-operator variability was assessed. Subcutaneous fat thickness was measured to evaluate its influence on attenuation values. The ATI measurements showed a consistent decrease in attenuation coefficient values with increasing depth, approximately 0.05 dB/cm/MHz. Larger ROI sizes increased measurement variability due to greater anatomical heterogeneity. HRI values correlated weakly with ATI and were influenced by operator technique and subcutaneous fat, the latter accounting for roughly 2.5% of variability. ATI provides a quantitative assessment of hepatic steatosis compared to HRI, although its accuracy can vary depending on the depth and ROI selection. Standardised imaging protocols and AI tools may improve reproducibility and clinical utility, supporting advancements in ultrasound-based liver diagnostics for better patient care. Full article

11 pages, 3292 KiB  
Article
Essential Multi-Secret Image Sharing for Sensor Images
by Shang-Kuan Chen
J. Imaging 2025, 11(7), 228; https://doi.org/10.3390/jimaging11070228 - 8 Jul 2025
Viewed by 215
Abstract
In this paper, we propose an innovative essential multi-secret image sharing (EMSIS) scheme that integrates sensor data to securely and efficiently share multiple secret images of varying importance. Secret images are categorized into hierarchical levels and encoded into essential shadows and fault-tolerant non-essential shares, with access to higher-level secrets requiring higher-level essential shadows. By incorporating sensor data, such as location, time, or biometric input, into the encoding and access process, the scheme enables the context-aware and adaptive reconstruction of secrets based on real-world conditions. Experimental results demonstrate that the proposed method not only strengthens hierarchical access control, but also enhances robustness, flexibility, and situational awareness in secure image distribution systems. Full article
(This article belongs to the Section Computational Imaging and Computational Photography)

16 pages, 759 KiB  
Article
Interpretation of AI-Generated vs. Human-Made Images
by Daniela Velásquez-Salamanca, Miguel Ángel Martín-Pascual and Celia Andreu-Sánchez
J. Imaging 2025, 11(7), 227; https://doi.org/10.3390/jimaging11070227 - 7 Jul 2025
Viewed by 602
Abstract
AI-generated content has grown significantly in recent years. Today, AI-generated and human-made images coexist across various settings, including news media, social platforms, and beyond. However, we still know relatively little about how audiences interpret and evaluate these different types of images. The goal of this study was to examine whether image interpretation is influenced by the origin of the image (AI-generated vs. human-made). Additionally, we aimed to explore whether visual professionalization influences how images are interpreted. To this end, we presented 24 AI-generated images (produced using Midjourney, DALL·E, and Firefly) and 8 human-made images to 161 participants—71 visual professionals and 90 non-professionals. Participants were asked to evaluate each image based on the following: (1) the source they believed the image originated from, (2) the level of realism, and (3) the level of credibility they attributed to it. A total of 5152 responses were collected for each question. Our results reveal that human-made images are more readily recognized as such, whereas AI-generated images are frequently misclassified as human-made. We also find that human-made images are perceived as both more realistic and more credible than AI-generated ones. We conclude that individuals are generally unable to accurately determine the source of an image, which in turn affects their assessment of its credibility. Full article

16 pages, 1347 KiB  
Article
Detection of Helicobacter pylori Infection in Histopathological Gastric Biopsies Using Deep Learning Models
by Rafael Parra-Medina, Carlos Zambrano-Betancourt, Sergio Peña-Rojas, Lina Quintero-Ortiz, Maria Victoria Caro, Ivan Romero, Javier Hernan Gil-Gómez, John Jaime Sprockel, Sandra Cancino and Andres Mosquera-Zamudio
J. Imaging 2025, 11(7), 226; https://doi.org/10.3390/jimaging11070226 - 7 Jul 2025
Viewed by 707
Abstract
Traditionally, Helicobacter pylori (HP) gastritis has been diagnosed by pathologists through the examination of gastric biopsies using optical microscopy with standard hematoxylin and eosin (H&E) staining. However, with the adoption of digital pathology, the identification of HP faces certain limitations, particularly due to insufficient resolution in some scanned images. Moreover, interobserver variability has been well documented in the traditional diagnostic approach, which may further complicate consistent interpretation. In this context, deep convolutional neural network (DCNN) models are showing promising results in the automated detection of this infection in whole-slide images (WSIs). The aim of the present article is to detect the presence of HP infection from our own institutional dataset of histopathological gastric biopsy samples using different pretrained and recognized DCNN and AutoML approaches. The dataset comprises 100 H&E-stained WSIs of gastric biopsies. HP infection had previously been confirmed by immunohistochemistry. A total of 45,795 patches were selected for model development. InceptionV3, ResNet50, and VGG16 achieved AUC (area under the curve) values of 1. However, InceptionV3 showed superior metrics such as accuracy (97%), recall (100%), F1 score (97%), and MCC (93%). BoostedNet and AutoKeras achieved accuracy, precision, recall, specificity, and F1 scores less than 85%. The InceptionV3 model was used for external validation, and the predictions across all patches yielded a global accuracy of 78%. In conclusion, DCNN models showed stronger potential for diagnosing HP in gastric biopsies compared with the AutoML approaches. However, due to variability across pathology applications, no single model is universally optimal. A problem-specific approach is essential. With growing WSI adoption, DL can improve diagnostic accuracy, reduce variability, and streamline pathology workflows through automation. Full article
(This article belongs to the Section Medical Imaging)

24 pages, 5942 KiB  
Article
Leveraging Achromatic Component for Trichromat-Friendly Daltonization
by Dmitry Sidorchuk, Almir Nurmukhametov, Paul Maximov, Valentina Bozhkova, Anastasia Sarycheva, Maria Pavlova, Anna Kazakova, Maria Gracheva and Dmitry Nikolaev
J. Imaging 2025, 11(7), 225; https://doi.org/10.3390/jimaging11070225 - 7 Jul 2025
Viewed by 448
Abstract
Color vision deficiency (CVD) affects around 300 million people globally due to issues with cone cells, highlighting the need for effective daltonization methods. These methods modify color palettes to enhance detail visibility for individuals with CVD. However, they can also distort the natural appearance of images. This study presents a novel daltonization method that focuses on preserving image naturalness for both normal trichromats and individuals with CVD. Our approach modifies only the achromatic component while enhancing detail visibility for individuals with CVD. To compare our approach with the previously known anisotropic daltonization method, we utilize objective and subjective evaluations that separately assess visibility enhancement and naturalness preservation. Our findings indicate that the proposed method outperforms the anisotropic method in naturalness by over 10 times according to objective criteria. Subjective evaluations revealed that more than 90% of CVD individuals and 95% of trichromats preferred our method for its natural appearance. Although objective contrast metrics suggest inferior visibility enhancement, subjective evaluation indicates comparable performance: contrast improvement was observed in 65% of protan cases for our method versus 70% for the anisotropic method, with contrast deterioration in 18% versus 7%, respectively. Overall, our method offers superior naturalness while maintaining comparable detail discrimination. Full article
(This article belongs to the Special Issue Image and Video Processing for Blind and Visually Impaired)
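Editing only the achromatic component can be illustrated with a luminance/chroma decomposition such as YCbCr: the chroma channels are left untouched, so the trichromat's color percept stays close to the original. In the paper the signal added to the achromatic channel is derived from chromatic detail lost to the CVD observer; in this toy sketch a plain unsharp mask stands in for it:

```python
import numpy as np
from skimage import color, filters

def enhance_achromatic(rgb: np.ndarray, amount: float = 1.5) -> np.ndarray:
    """Boost local contrast in the luminance channel only, leaving chroma untouched."""
    ycbcr = color.rgb2ycbcr(rgb)
    y = ycbcr[..., 0]
    blurred = filters.gaussian(y, sigma=2, preserve_range=True)
    ycbcr[..., 0] = np.clip(y + amount * (y - blurred), 16, 235)  # unsharp mask on Y
    return np.clip(color.ycbcr2rgb(ycbcr), 0.0, 1.0)

img = np.random.rand(64, 64, 3)          # stand-in RGB image in [0, 1]
out = enhance_achromatic(img)
print(out.shape, float(out.min()), float(out.max()))
```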

17 pages, 2591 KiB  
Article
Sex Determination Using Linear Anthropometric Measurements Relative to the Mandibular Reference Plane on CBCT 3D Images
by Nikolaos Christoloukas, Anastasia Mitsea, Leda Kovatsi and Christos Angelopoulos
J. Imaging 2025, 11(7), 224; https://doi.org/10.3390/jimaging11070224 - 5 Jul 2025
Viewed by 349
Abstract
Sex determination is a fundamental component of forensic identification and medicolegal investigations. Several studies have investigated sexual dimorphism through mandibular osteometric measurements, including the position of anatomical foramina such as the mandibular and mental foramen (MF), reporting population-specific discrepancies. This study assessed the reliability and predictive ability of specific anthropometric mandibular measurements for sex estimation using three-dimensional (3D) cone beam computed tomography (CBCT) surface reconstructions. Methods: CBCT scans from 204 Greek individuals (18–70 years) were analyzed. Records were categorized by sex and age. Five linear measurements were traced on 3D reconstructions using ViewBox 4 software: projections of the inferior points of the right and left mental and mandibular foramina and the linear distance between the mental foramina projections. A binary logistic regression (BLR) model was employed. All measurements showed statistically significant sex differences, with males presenting higher mean values. The final model achieved an accuracy of 66.7% in sex prediction, with two vertical measurements—distances from the right mandibular foramen and the left mental foramen—identified as the strongest predictors of sex. The positions of the mandibular and mental foramina demonstrate sex-related dimorphism in this Greek sample, supporting their forensic relevance in population-specific applications. Full article
(This article belongs to the Section Medical Imaging)
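The binary logistic regression step can be reproduced generically with scikit-learn; the five measurements below are random stand-ins for the CBCT distances, so the resulting accuracy is meaningless and serves only to show the workflow:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
n_subjects = 204
# Five linear measurements (mm) per subject; synthetic placeholders
X = rng.normal(loc=[14.0, 14.2, 22.5, 22.8, 46.0], scale=2.0, size=(n_subjects, 5))
y = rng.integers(0, 2, size=n_subjects)          # 0 = female, 1 = male (random labels here)

model = LogisticRegression(max_iter=1000)
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print("cross-validated accuracy:", scores.mean())
```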

22 pages, 10233 KiB  
Article
Artificial Intelligence Dystocia Algorithm (AIDA) as a Decision Support System in Transverse Fetal Head Position
by Antonio Malvasi, Lorenzo E. Malgieri, Tommaso Difonzo, Reuven Achiron, Andrea Tinelli, Giorgio Maria Baldini, Lorenzo Vasciaveo, Renata Beck, Ilenia Mappa and Giuseppe Rizzo
J. Imaging 2025, 11(7), 223; https://doi.org/10.3390/jimaging11070223 - 5 Jul 2025
Viewed by 310
Abstract
Transverse fetal head position during labor is associated with increased rates of operative deliveries and cesarean sections. Traditional assessment methods rely on digital examination, which can be inaccurate in cases of prolonged labor. Intrapartum ultrasound offers improved diagnostic capabilities, but standardized interpretation frameworks are needed. This study aimed to evaluate the significance of appropriate assessment and management of transverse fetal head position during labor, with particular emphasis on the correlation between geometric parameters and delivery outcomes. Additionally, the investigation analyzed the potential role of Artificial Intelligence Dystocia Algorithm (AIDA) as an innovative decision support system in standardizing diagnostic approaches and optimizing clinical decision-making in cases of fetal malposition. This investigation was conducted as a focused secondary analysis of data originally collected for the development and validation of the Artificial Intelligence Dystocia Algorithm (AIDA). The study examined 66 cases of transverse fetal head position from a cohort of 135 nulliparous women with prolonged second-stage labor across three Italian hospitals. Cases were stratified by Midline Angle (MLA) measurements into classic transverse (≥75°), near-transverse (70–74°), and transitional (60–69°) positions. Four geometric parameters (Angle of Progression, Head–Symphysis Distance, Midline Angle, and Asynclitism Degree) were evaluated using the AIDA classification system. The predictive capabilities of three machine learning algorithms (Support Vector Machine, Random Forest, and Multilayer Perceptron) were assessed, and delivery outcomes were analyzed. The AIDA system successfully categorized labor dystocia into five distinct classes, with strong predictive value for delivery outcomes. A clear gradient of cesarean delivery risk was observed across the spectrum of transverse positions (100%, 93.1%, and 85.7% for near-transverse, classic transverse, and transitional positions, respectively). All cases classified as AIDA Class 4 required cesarean delivery regardless of the specific MLA value. Machine learning algorithms demonstrated high predictive accuracy, with Random Forest achieving 95.5% overall accuracy across the study cohort. The presence of concurrent asynclitism with transverse position was associated with particularly high rates of cesarean delivery. Among the seven cases that achieved vaginal delivery despite transverse positioning, none belonged to the classic transverse positions group, and five (71.4%) exhibited at least one parameter classified as favorable. The integration of artificial intelligence through AIDA as a decision support system, combined with intrapartum ultrasound, offered a promising approach for objective assessment and management of transverse fetal head position. The AIDA classification system’s integration of multiple geometric parameters, with particular emphasis on precise Midline Angle (MLA) measurement in degrees, provided superior predictive capability for delivery outcomes compared to qualitative position assessment alone. This multidimensional approach enabled more personalized and evidence-based management of malpositions during labor, potentially reducing unnecessary interventions while identifying cases where expectant management might be futile. 
Further prospective studies are needed to validate the predictive capability of this decision support system and its impact on clinical decision-making in real-time labor management. Full article

14 pages, 2571 KiB  
Article
Development of Deep Learning Models for Real-Time Thoracic Ultrasound Image Interpretation
by Austin J. Ruiz, Sofia I. Hernández Torres and Eric J. Snider
J. Imaging 2025, 11(7), 222; https://doi.org/10.3390/jimaging11070222 - 5 Jul 2025
Viewed by 391
Abstract
Thoracic injuries account for a high percentage of combat casualty mortalities, with 80% of preventable deaths resulting from abdominal or thoracic hemorrhage. An effective method for detecting and triaging thoracic injuries is point-of-care ultrasound (POCUS), as it is a cheap and portable noninvasive imaging method. POCUS image interpretation of pneumothorax (PTX) or hemothorax (HTX) injuries requires a skilled radiologist, which will likely not be available in austere situations where injury detection and triage are most critical. With the recent growth in artificial intelligence (AI) for healthcare, the hypothesis for this study is that deep learning (DL) models for classifying images as showing HTX or PTX injury, or being negative for injury can be developed for lowering the skill threshold for POCUS diagnostics on the future battlefield. Three-class deep learning classification AI models were developed using a motion-mode ultrasound dataset captured in animal study experiments from more than 25 swine subjects. Cluster analysis was used to define the “population” based on brightness, contrast, and kurtosis properties. A MobileNetV3 DL model architecture was tuned across a variety of hyperparameters, with the results ultimately being evaluated using images captured in real-time. Different hyperparameter configurations were blind-tested, resulting in models trained on filtered data having a real-time accuracy from 89 to 96%, as opposed to 78–95% when trained without filtering and optimization. The best model achieved a blind accuracy of 85% when inferencing on data collected in real-time, surpassing previous YOLOv8 models by 17%. AI models can be developed that are suitable for high performance in real-time for thoracic injury determination and are suitable for potentially addressing challenges with responding to emergency casualty situations and reducing the skill threshold for using and interpreting POCUS. Full article
(This article belongs to the Special Issue Learning and Optimization for Medical Imaging)
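The brightness/contrast/kurtosis filtering used to define the training "population" can be sketched as computing those statistics per frame and keeping frames close to the cluster center; the frames and threshold below are arbitrary placeholders:

```python
import numpy as np
from scipy.stats import kurtosis

def frame_stats(frame: np.ndarray):
    """Brightness (mean), contrast (std), and kurtosis of one grayscale ultrasound frame."""
    pixels = frame.astype(float).ravel()
    return pixels.mean(), pixels.std(), kurtosis(pixels)

rng = np.random.default_rng(3)
frames = rng.integers(0, 256, size=(100, 128, 128))        # synthetic M-mode frames
stats = np.array([frame_stats(f) for f in frames])          # shape (100, 3)

center = stats.mean(axis=0)
z_dist = np.linalg.norm((stats - center) / stats.std(axis=0), axis=1)
kept = frames[z_dist < 1.5]                                  # keep frames near the population center
print(f"kept {len(kept)} of {len(frames)} frames")
```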

12 pages, 1337 KiB  
Review
Diagnostic Accuracy of Sonoelastography for Breast Lesions: A Meta-Analysis Comparing Strain and Shear Wave Elastography
by Youssef Ahmed Youssef Selim, Hussein Sabit, Borros Arneth and Marwa A. Shaaban
J. Imaging 2025, 11(7), 221; https://doi.org/10.3390/jimaging11070221 - 4 Jul 2025
Viewed by 346
Abstract
This meta-analysis evaluated the diagnostic accuracy of sonoelastography for distinguishing benign and malignant breast lesions, comparing strain elastography and shear wave elastography (SWE). We systematically reviewed 825 publications, selecting 30 studies (6200 lesions: 45% benign, 55% malignant). The pooled sensitivity and specificity for overall sonoelastography were 88% (95% CI: 85–91%) and 84% (95% CI: 81–87%), respectively. Strain elastography showed sensitivity and specificity of 85% and 80%, respectively, while SWE demonstrated superior performance with 90% sensitivity, 86% specificity, and an AUC of 0.92. Moderate heterogeneity (I² = 55%) was attributed to study variation. SWE showed the potential to reduce unnecessary biopsies by 30–40% by increasing specificity. AI-assisted image analysis and standardized protocols may enhance accuracy and reduce variability. These findings support the integration of SWE into breast imaging protocols. Full article
(This article belongs to the Section Medical Imaging)
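For context on how pooled figures such as the sensitivity reported above are typically obtained, the sketch below applies fixed-effect inverse-variance pooling of per-study proportions on the logit scale with a normal-approximation 95% CI. This is a generic textbook illustration, not the meta-analytic model used in the study (which would usually be a bivariate or hierarchical model), and the study counts shown are hypothetical.

```python
import numpy as np

def pooled_proportion(successes, totals):
    """Fixed-effect inverse-variance pooling of proportions on the logit scale,
    with a normal-approximation 95% CI. A simplified stand-in for bivariate models."""
    successes = np.asarray(successes, dtype=float)
    totals = np.asarray(totals, dtype=float)
    p = (successes + 0.5) / (totals + 1.0)          # continuity correction
    logit = np.log(p / (1 - p))
    var = 1.0 / (totals * p * (1 - p))              # approximate variance of the logit
    w = 1.0 / var
    pooled_logit = np.sum(w * logit) / np.sum(w)
    se = np.sqrt(1.0 / np.sum(w))
    lo, hi = pooled_logit - 1.96 * se, pooled_logit + 1.96 * se
    inv = lambda x: 1.0 / (1.0 + np.exp(-x))        # back-transform to a proportion
    return inv(pooled_logit), (inv(lo), inv(hi))

# Hypothetical per-study true positives and numbers of malignant lesions
sens, ci = pooled_proportion([90, 170, 260], [100, 190, 300])
print(f"pooled sensitivity = {sens:.2%}, 95% CI = ({ci[0]:.2%}, {ci[1]:.2%})")
```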

23 pages, 1945 KiB  
Article
Spectro-Image Analysis with Vision Graph Neural Networks and Contrastive Learning for Parkinson’s Disease Detection
by Nuwan Madusanka, Hadi Sedigh Malekroodi, H. M. K. K. M. B. Herath, Chaminda Hewage, Myunggi Yi and Byeong-Il Lee
J. Imaging 2025, 11(7), 220; https://doi.org/10.3390/jimaging11070220 - 2 Jul 2025
Viewed by 349
Abstract
This study presents a novel framework that integrates Vision Graph Neural Networks (ViGs) with supervised contrastive learning for enhanced spectro-temporal image analysis of speech signals in Parkinson’s disease (PD) detection. The approach introduces a frequency band decomposition strategy that transforms raw audio into three complementary spectral representations, capturing distinct PD-specific characteristics across low-frequency (0–2 kHz), mid-frequency (2–6 kHz), and high-frequency (6 kHz+) bands. The framework processes mel multi-band spectro-temporal representations through a ViG architecture that models complex graph-based relationships between spectral and temporal components, trained using a supervised contrastive objective that learns discriminative representations distinguishing PD-affected from healthy speech patterns. Comprehensive experimental validation on multi-institutional datasets from Italy, Colombia, and Spain demonstrates that the proposed ViG-contrastive framework achieves superior classification performance, with the ViG-M-GELU architecture achieving 91.78% test accuracy. The integration of graph neural networks with contrastive learning enables effective learning from limited labeled data while capturing complex spectro-temporal relationships that traditional Convolutional Neural Network (CNN) approaches miss, representing a promising direction for developing more accurate and clinically viable speech-based diagnostic tools for PD. Full article
(This article belongs to the Section Medical Imaging)
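A minimal sketch of the three-band mel spectro-temporal representation described above, assuming librosa and the stated 0–2 kHz, 2–6 kHz, and 6 kHz+ splits; the mel resolution, sampling rate, and file name are assumptions, and the authors' exact preprocessing may differ.

```python
import numpy as np
import librosa

def band_mel_spectrograms(path, sr=16000, n_mels=128,
                          bands=((0, 2000), (2000, 6000), (6000, None))):
    """Compute one mel spectrogram per frequency band (low / mid / high), loosely
    following the three-band decomposition described in the abstract."""
    y, sr = librosa.load(path, sr=sr)
    specs = []
    for fmin, fmax in bands:
        m = librosa.feature.melspectrogram(
            y=y, sr=sr, n_mels=n_mels,
            fmin=fmin, fmax=fmax if fmax is not None else sr / 2)
        specs.append(librosa.power_to_db(m, ref=np.max))
    return specs  # three arrays, e.g. stacked as channels of an image-like input

# low, mid, high = band_mel_spectrograms("speech_sample.wav")  # hypothetical file
```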

18 pages, 13103 KiB  
Article
ILViT: An Inception-Linear Attention-Based Lightweight Vision Transformer for Microscopic Cell Classification
by Zhangda Liu, Panpan Wu, Ziping Zhao and Hengyong Yu
J. Imaging 2025, 11(7), 219; https://doi.org/10.3390/jimaging11070219 - 1 Jul 2025
Viewed by 356
Abstract
Microscopic cell classification is a fundamental challenge in both clinical diagnosis and biological research. However, existing methods still struggle with the complexity and morphological diversity of cellular images, leading to limited accuracy or high computational costs. To overcome these constraints, we propose an efficient classification method that balances strong feature representation with a lightweight design. Specifically, an Inception-Linear Attention-based Lightweight Vision Transformer (ILViT) model is developed for microscopic cell classification. The ILViT integrates two innovative modules: Dynamic Inception Convolution (DIC) and Contrastive Omni-Kolmogorov Attention (COKA). DIC combines dynamic and Inception-style convolutions to replace large kernels with fewer parameters. COKA integrates Omni-Dimensional Dynamic Convolution (ODC), linear attention, and a Kolmogorov-Arnold Network (KAN) structure to enhance feature learning and model interpretability. With only 1.91 GFLOPs and 8.98 million parameters, ILViT achieves high efficiency. Extensive experiments on four public datasets are conducted to validate the effectiveness of the proposed method. It achieves an accuracy of 97.185% on the BioMediTech dataset for classifying retinal pigment epithelial cells, 97.436% on the ICPR-HEp-2 dataset for diagnosing autoimmune disorders via HEp-2 cell classification, 90.528% on the Hematological Malignancy Bone Marrow Cytology Expert Annotation dataset for categorizing bone marrow cells, and 99.758% on a white blood cell dataset for distinguishing leukocyte subtypes. These results show that ILViT outperforms state-of-the-art models in both accuracy and efficiency, demonstrating strong generalizability and practical potential for cell image classification. Full article
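To make concrete why Inception-style branches of small kernels use far fewer parameters than a single large-kernel convolution, the toy PyTorch comparison below counts parameters for both. It illustrates the general principle only and is not the DIC module from the paper (which additionally uses dynamic convolution).

```python
import torch
import torch.nn as nn

class InceptionStyleConv(nn.Module):
    """Parallel small-kernel branches whose outputs are concatenated,
    a toy stand-in for replacing a single large-kernel convolution."""
    def __init__(self, in_ch=64, out_ch=64):
        super().__init__()
        branch_ch = out_ch // 4
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, branch_ch, kernel_size=1),
            nn.Conv2d(in_ch, branch_ch, kernel_size=3, padding=1),
            nn.Conv2d(in_ch, branch_ch, kernel_size=5, padding=2),
            nn.Conv2d(in_ch, branch_ch, kernel_size=7, padding=3),
        ])

    def forward(self, x):
        return torch.cat([b(x) for b in self.branches], dim=1)

def n_params(m):
    return sum(p.numel() for p in m.parameters())

large = nn.Conv2d(64, 64, kernel_size=11, padding=5)   # one large kernel
small = InceptionStyleConv(64, 64)                      # several small kernels
print(n_params(large), "vs", n_params(small))           # ~496k vs ~86k parameters
x = torch.randn(1, 64, 32, 32)
assert large(x).shape == small(x).shape                 # same output shape
```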

17 pages, 1609 KiB  
Article
Parallel Multi-Scale Semantic-Depth Interactive Fusion Network for Depth Estimation
by Chenchen Fu, Sujunjie Sun, Ning Wei, Vincent Chau, Xueyong Xu and Weiwei Wu
J. Imaging 2025, 11(7), 218; https://doi.org/10.3390/jimaging11070218 - 1 Jul 2025
Viewed by 332
Abstract
Self-supervised depth estimation from monocular image sequences provides depth information without costly sensors like LiDAR, offering significant value for autonomous driving. Although self-supervised algorithms reduce the dependence on labeled data, their performance is still affected by scene occlusions, lighting differences, and sparse textures. Existing methods do not fully consider feature enhancement and interactive feature fusion. In this paper, we propose a novel parallel multi-scale semantic-depth interactive fusion network. First, we adopt a multi-stage feature attention network for feature extraction, and a parallel semantic-depth interactive fusion module is introduced to refine edges. Furthermore, we employ a metric loss based on semantic edges to take full advantage of semantic geometric information. Our network is trained and evaluated on the KITTI dataset. Experimental results show that the proposed method achieves satisfactory performance compared with existing methods. Full article
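For readers unfamiliar with how self-supervised depth networks are trained without LiDAR labels, the sketch below shows the weighted SSIM + L1 photometric reconstruction loss commonly used in this setting (e.g., in Monodepth2-style pipelines). It is a generic formulation, not the paper's exact objective, which additionally includes a semantic-edge metric loss.

```python
import torch
import torch.nn.functional as F

def ssim(x, y):
    """Simplified SSIM over 3x3 neighbourhoods, as commonly used in
    self-supervised depth estimation photometric losses."""
    c1, c2 = 0.01 ** 2, 0.03 ** 2
    mu_x = F.avg_pool2d(x, 3, 1, 1)
    mu_y = F.avg_pool2d(y, 3, 1, 1)
    sigma_x = F.avg_pool2d(x * x, 3, 1, 1) - mu_x ** 2
    sigma_y = F.avg_pool2d(y * y, 3, 1, 1) - mu_y ** 2
    sigma_xy = F.avg_pool2d(x * y, 3, 1, 1) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * sigma_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (sigma_x + sigma_y + c2)
    return ((1 - num / den) / 2).clamp(0, 1)

def photometric_loss(pred, target, alpha=0.85):
    """Weighted SSIM + L1 photometric reconstruction loss between a view
    synthesised from the predicted depth and the target frame."""
    l1 = (pred - target).abs()
    return (alpha * ssim(pred, target) + (1 - alpha) * l1).mean()

# pred = warped source frame, target = reference frame, both (B, 3, H, W)
loss = photometric_loss(torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64))
print(loss.item())
```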

12 pages, 6032 KiB  
Review
Imaging Evaluation of Periarticular Soft Tissue Masses in the Appendicular Skeleton: A Pictorial Review
by Francesco Pucciarelli, Maria Carla Faugno, Daniela Valanzuolo, Edoardo Massaro, Lorenzo Maria De Sanctis, Elisa Zaccaria, Marta Zerunian, Domenico De Santis, Michela Polici, Tiziano Polidori, Andrea Laghi and Damiano Caruso
J. Imaging 2025, 11(7), 217; https://doi.org/10.3390/jimaging11070217 - 30 Jun 2025
Viewed by 300
Abstract
Soft tissue masses are predominantly benign, with a benign-to-malignant ratio exceeding 100:1, and are often located around joints. They may be contiguous or adjacent to joints or may reflect systemic diseases or distant organ involvement. Clinically, they typically present as palpable swellings. Evaluation should consider duration, size, depth, mobility, consistency, growth rate, associated symptoms, and any history of trauma, infection, or malignancy. Laboratory tests are generally of limited diagnostic value. The primary clinical goal is to avoid unnecessary investigations or procedures for benign lesions while ensuring timely diagnosis and treatment of malignant ones. Imaging plays a central role: it confirms the presence of the lesion; assesses its location, size, and composition; differentiates cystic from solid and benign from malignant features; and can sometimes provide a definitive diagnosis. Imaging is also crucial for biopsy planning, treatment strategy, identification of involved structures, and follow-up. Ultrasound (US) is the first-line imaging modality for palpable soft tissue masses due to its low cost, wide availability, and lack of ionizing radiation. If findings are inconclusive, magnetic resonance imaging (MRI) or computed tomography (CT) is recommended. This review discusses the most common causes of periarticular soft tissue masses in the appendicular skeleton, focusing on clinical presentation and radiologic features. Full article
