Search Results (105)

Search Parameters:
Keywords = synthetic medical image generation

18 pages, 1253 KiB  
Article
Leveraging Synthetic Degradation for Effective Training of Super-Resolution Models in Dermatological Images
by Francesco Branciforti, Kristen M. Meiburger, Elisa Zavattaro, Paola Savoia and Massimo Salvi
Electronics 2025, 14(15), 3138; https://doi.org/10.3390/electronics14153138 (registering DOI) - 6 Aug 2025
Abstract
Teledermatology relies on digital transfer of dermatological images, but compression and resolution differences compromise diagnostic quality. Image enhancement techniques are crucial to compensate for these differences and improve quality for both clinical assessment and AI-based analysis. We developed a customized image degradation pipeline simulating common artifacts in dermatological images, including blur, noise, downsampling, and compression. This synthetic degradation approach enabled effective training of DermaSR-GAN, a super-resolution generative adversarial network tailored for dermoscopic images. The model was trained on 30,000 high-quality ISIC images and evaluated on three independent datasets (ISIC Test, Novara Dermoscopic, PH2) using structural similarity and no-reference quality metrics. DermaSR-GAN achieved statistically significant improvements in quality scores across all datasets, with up to 23% enhancement in perceptual quality metrics (MANIQA). The model preserved diagnostic details while doubling resolution and surpassed existing approaches, including traditional interpolation methods and state-of-the-art deep learning techniques. Integration with downstream classification systems demonstrated up to 14.6% improvement in class-specific accuracy for keratosis-like lesions compared to original images. Synthetic degradation represents a promising approach for training effective super-resolution models in medical imaging, with significant potential for enhancing teledermatology applications and computer-aided diagnosis systems. Full article
(This article belongs to the Section Computer Science & Engineering)
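The degradation-based training idea above can be sketched in a few lines: apply blur, downsampling, noise, and a crude compression surrogate to a clean image to produce a synthetic low-quality counterpart. This is a minimal illustration only; the blur, noise, and quantization parameters below are assumptions, not the paper's settings, and coarse quantization merely stands in for JPEG compression.

```python
import numpy as np

def degrade(img, blur_sigma=1.0, noise_std=0.02, scale=2, quant_levels=32, seed=0):
    """Toy degradation pipeline: blur -> downsample -> noise -> quantization.

    `img` is a 2-D float array in [0, 1]. All parameter values here are
    illustrative assumptions; the paper's exact pipeline is not given.
    """
    rng = np.random.default_rng(seed)
    # Separable Gaussian blur via two 1-D convolutions.
    radius = int(3 * blur_sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * blur_sigma**2))
    k /= k.sum()
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    blurred = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, blurred)
    # Downsample by striding (the resolution loss the SR model must undo).
    small = blurred[::scale, ::scale]
    # Additive sensor/transmission noise, clipped back to the valid range.
    noisy = np.clip(small + rng.normal(0.0, noise_std, small.shape), 0.0, 1.0)
    # Coarse quantization as a crude stand-in for JPEG compression loss.
    return np.round(noisy * (quant_levels - 1)) / (quant_levels - 1)

lr = degrade(np.random.default_rng(1).random((64, 64), dtype=np.float32))
```

Pairs of `(lr, img)` produced this way are what a super-resolution GAN such as the one described would train on.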
24 pages, 23817 KiB  
Article
Dual-Path Adversarial Denoising Network Based on UNet
by Jinchi Yu, Yu Zhou, Mingchen Sun and Dadong Wang
Sensors 2025, 25(15), 4751; https://doi.org/10.3390/s25154751 - 1 Aug 2025
Viewed by 220
Abstract
Digital image quality is crucial for reliable analysis in applications such as medical imaging, satellite remote sensing, and video surveillance. However, traditional denoising methods struggle to balance noise removal with detail preservation and lack adaptability to various types of noise. We propose a novel three-module architecture for image denoising, comprising a generator, a dual-path-UNet-based denoiser, and a discriminator. The generator creates synthetic noise patterns to augment training data, while the dual-path-UNet denoiser uses multiple receptive field modules to preserve fine details and dense feature fusion to maintain global structural integrity. The discriminator provides adversarial feedback to enhance denoising performance. This dual-path adversarial training mechanism addresses the limitations of traditional methods by simultaneously capturing both local details and global structures. Experiments on the SIDD, DND, and PolyU datasets demonstrate superior performance. We compare our architecture with the latest state-of-the-art GAN variants through comprehensive qualitative and quantitative evaluations. These results confirm the effectiveness of noise removal with minimal loss of critical image details. The proposed architecture enhances image denoising capabilities in complex noise scenarios, providing a robust solution for applications that require high image fidelity. By enhancing adaptability to various types of noise while maintaining structural integrity, this method provides a versatile tool for image processing tasks that require preserving detail. Full article
(This article belongs to the Section Sensing and Imaging)

13 pages, 3685 KiB  
Article
A Controlled Variation Approach for Example-Based Explainable AI in Colorectal Polyp Classification
by Miguel Filipe Fontes, Alexandre Henrique Neto, João Dallyson Almeida and António Trigueiros Cunha
Appl. Sci. 2025, 15(15), 8467; https://doi.org/10.3390/app15158467 (registering DOI) - 30 Jul 2025
Viewed by 189
Abstract
Medical imaging is vital for diagnosing and treating colorectal cancer (CRC), a leading cause of mortality. Classifying colorectal polyps and CRC precursors remains challenging due to operator variability and expertise dependence. Deep learning (DL) models show promise in polyp classification but face adoption barriers due to their ‘black box’ nature, limiting interpretability. This study presents an example-based explainable artificial intelligence (XAI) approach using Pix2Pix to generate synthetic polyp images with controlled size variations and LIME to explain classifier predictions visually. EfficientNet and Vision Transformer (ViT) were trained on datasets of real and synthetic images, achieving strong baseline accuracies of 94% and 96%, respectively. Image quality was assessed using PSNR (18.04), SSIM (0.64), and FID (123.32), while classifier robustness was evaluated across polyp sizes. Results show that Pix2Pix effectively controls image attributes like polyp size despite limitations in visual fidelity. LIME integration revealed classifier vulnerabilities, underscoring the value of complementary XAI techniques. This enhances the interpretability of DL models and deepens understanding of their behaviour. The findings contribute to developing explainable AI tools for polyp classification and CRC diagnosis. Future work will improve synthetic image quality and refine XAI methodologies for broader clinical use. Full article
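Of the quality metrics reported above, PSNR is the simplest to reproduce: it is a log-scaled inverse of the mean squared error between a reference image and its synthetic counterpart. A minimal sketch (SSIM and FID omitted, as they require windowed statistics and a pretrained network, respectively):

```python
import numpy as np

def psnr(ref, test, peak=1.0):
    """Peak signal-to-noise ratio in dB between two images in [0, peak]."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak**2 / mse)

a = np.zeros((8, 8))
b = np.full((8, 8), 0.1)      # uniform error of 0.1 -> MSE = 0.01
print(round(psnr(a, b), 1))   # → 20.0
```

A PSNR of 18.04, as reported for the Pix2Pix outputs, thus corresponds to a per-pixel error somewhat larger than in this toy example.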

35 pages, 4256 KiB  
Article
Automated Segmentation and Morphometric Analysis of Thioflavin-S-Stained Amyloid Deposits in Alzheimer’s Disease Brains and Age-Matched Controls Using Weakly Supervised Deep Learning
by Gábor Barczánfalvi, Tibor Nyári, József Tolnai, László Tiszlavicz, Balázs Gulyás and Karoly Gulya
Int. J. Mol. Sci. 2025, 26(15), 7134; https://doi.org/10.3390/ijms26157134 - 24 Jul 2025
Viewed by 408
Abstract
Alzheimer’s disease (AD) involves the accumulation of amyloid-β (Aβ) plaques, whose quantification plays a central role in understanding disease progression. Automated segmentation of Aβ deposits in histopathological micrographs enables large-scale analyses but is hindered by the high cost of detailed pixel-level annotations. Weakly supervised learning offers a promising alternative by leveraging coarse or indirect labels to reduce the annotation burden. We evaluated a weakly supervised approach to segment and analyze thioflavin-S-positive parenchymal amyloid pathology in AD and age-matched brains. Our pipeline integrates three key components, each designed to operate under weak supervision. First, robust preprocessing (including retrospective multi-image illumination correction and gradient-based background estimation) was applied to enhance image fidelity and support training, since the models rely primarily on image features. Second, class activation maps (CAMs), generated by a compact deep classifier (SqueezeNet), were used to identify and coarsely localize amyloid-rich parenchymal regions from patch-wise image labels, serving as spatial priors for subsequent refinement without requiring dense pixel-level annotations. Third, a patch-based convolutional neural network, U-Net, was trained on synthetic data generated from micrographs based on CAM-derived pseudo-labels via an extensive object-level augmentation strategy, enabling refined whole-image semantic segmentation and generalization across diverse spatial configurations. To ensure robustness and unbiased evaluation, we assessed the segmentation performance of the entire framework using patient-wise group k-fold cross-validation, explicitly modeling generalization across unseen individuals, which is critical in clinical scenarios.
Despite relying on weak labels, the integrated pipeline achieved strong segmentation performance with an average Dice similarity coefficient (≈0.763) and Jaccard index (≈0.639), widely accepted metrics for assessing segmentation quality in medical image analysis. The resulting segmentations were also visually coherent, demonstrating that weakly supervised segmentation is a viable alternative in histopathology, where acquiring dense annotations is prohibitively labor-intensive and time-consuming. Subsequent morphometric analyses on automatically segmented Aβ deposits revealed size-, structural complexity-, and global geometry-related differences across brain regions and cognitive status. These findings confirm that deposit architecture exhibits region-specific patterns and reflects underlying neurodegenerative processes, thereby highlighting the biological relevance and practical applicability of the proposed image-processing pipeline for morphometric analysis. Full article
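The Dice coefficient and Jaccard index reported above are the standard overlap measures for binary segmentation masks, and for a single mask pair they are related by J = D / (2 − D). A minimal sketch on a toy mask pair (the masks here are illustrative, not the paper's data):

```python
import numpy as np

def dice_jaccard(pred, gt):
    """Dice coefficient and Jaccard index for two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    dice = 2.0 * inter / (pred.sum() + gt.sum())
    jaccard = inter / union
    return dice, jaccard

gt = np.zeros((10, 10), dtype=int); gt[2:8, 2:8] = 1   # 36-pixel "deposit"
pr = np.zeros((10, 10), dtype=int); pr[3:9, 3:9] = 1   # slightly shifted prediction
d, j = dice_jaccard(pr, gt)
print(round(d, 3), round(j, 3))  # → 0.694 0.532
```

Note that the paper's averaged values (Dice ≈ 0.763, Jaccard ≈ 0.639) need not satisfy the single-pair identity exactly, since each metric is averaged over images separately.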

23 pages, 3645 KiB  
Article
Color-Guided Mixture-of-Experts Conditional GAN for Realistic Biomedical Image Synthesis in Data-Scarce Diagnostics
by Patrycja Kwiek, Filip Ciepiela and Małgorzata Jakubowska
Electronics 2025, 14(14), 2773; https://doi.org/10.3390/electronics14142773 - 10 Jul 2025
Viewed by 265
Abstract
Background: Limited availability of high-quality labeled biomedical image datasets presents a significant challenge for training deep learning models in medical diagnostics. This study proposes a novel image generation framework combining conditional generative adversarial networks (cGANs) with a Mixture-of-Experts (MoE) architecture and color histogram-aware loss functions to enhance synthetic blood cell image quality. Methods: RGB microscopic images from the BloodMNIST dataset (eight blood cell types, resolution 3 × 128 × 128) underwent preprocessing with k-means clustering to extract the dominant colors and UMAP for visualizing class similarity. Spearman correlation-based distance matrices were used to evaluate the discriminative power of each RGB channel. A MoE–cGAN architecture was developed with residual blocks and LeakyReLU activations. Expert generators were conditioned on cell type, and the generator’s loss was augmented with a Wasserstein distance-based term comparing red and green channel histograms, which were found most relevant for class separation. Results: The red and green channels contributed most to class discrimination; the blue channel had minimal impact. The proposed model achieved 0.97 classification accuracy on generated images (ResNet50), with 0.96 precision, 0.97 recall, and a 0.96 F1-score. The best Fréchet Inception Distance (FID) was 52.1. Misclassifications occurred mainly among visually similar cell types. Conclusions: Integrating histogram alignment into the MoE–cGAN training significantly improves the realism and class-specific variability of synthetic images, supporting robust model development under data scarcity in hematological imaging. Full article
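The histogram-alignment term described above can be illustrated with a 1-D Wasserstein distance between per-channel intensity histograms; for distributions on a shared grid this reduces to the L1 distance between their CDFs. This sketch shows only that term for one channel, under assumed bin settings; how the paper weights it against the adversarial loss is not stated in the abstract.

```python
import numpy as np

def hist_w1(a, b, bins=32):
    """Wasserstein-1 distance between the intensity histograms of two
    single-channel images in [0, 1]: L1 distance between the two CDFs."""
    ha, _ = np.histogram(a, bins=bins, range=(0.0, 1.0))
    hb, _ = np.histogram(b, bins=bins, range=(0.0, 1.0))
    ca = np.cumsum(ha / ha.sum())
    cb = np.cumsum(hb / hb.sum())
    return np.abs(ca - cb).sum() / bins   # bin width = 1/bins

rng = np.random.default_rng(0)
real_red = rng.random((64, 64))                 # red channel of a real image
fake_red = np.clip(real_red + 0.25, 0.0, 1.0)   # generator output, too bright
print(hist_w1(real_red, real_red))              # → 0.0
print(hist_w1(real_red, fake_red) > 0.1)        # → True
```

Adding such a term for the red and green channels penalizes generated cells whose color distribution drifts from the real class-conditional one.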

17 pages, 4622 KiB  
Article
Dual Focus-3D: A Hybrid Deep Learning Approach for Robust 3D Gaze Estimation
by Abderrahmen Bendimered, Rabah Iguernaissi, Mohamad Motasem Nawaf, Rim Cherif, Séverine Dubuisson and Djamal Merad
Sensors 2025, 25(13), 4086; https://doi.org/10.3390/s25134086 - 30 Jun 2025
Viewed by 395
Abstract
Estimating gaze direction is a key task in computer vision, especially for understanding where a person is focusing their attention. It is essential for applications in assistive technology, medical diagnostics, virtual environments, and human–computer interaction. In this work, we introduce Dual Focus-3D, a novel hybrid deep learning architecture that combines appearance-based features from eye images with 3D head orientation data. This fusion enhances the model’s prediction accuracy and robustness, particularly in challenging natural environments. To support training and evaluation, we present EyeLis, a new dataset containing 5206 annotated samples with corresponding 3D gaze and head pose information. Our model achieves state-of-the-art performance, with a MAE of 1.64° on EyeLis, demonstrating its ability to generalize effectively across both synthetic and real datasets. Key innovations include a multimodal feature fusion strategy, an angular loss function optimized for 3D gaze prediction, and regularization techniques to mitigate overfitting. Our results show that including 3D spatial information directly in the learning process significantly improves accuracy. Full article
(This article belongs to the Special Issue Advances in Optical Sensing, Instrumentation and Systems: 2nd Edition)
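The MAE of 1.64° quoted above is an angular error between predicted and ground-truth 3-D gaze vectors. The standard computation (arc-cosine of the dot product of the normalized vectors) can be sketched as follows; whether the paper's angular loss uses the angle itself or a cosine-based surrogate is not stated in the abstract.

```python
import numpy as np

def angular_error_deg(pred, true):
    """Angle in degrees between predicted and ground-truth 3-D gaze vectors."""
    pred = pred / np.linalg.norm(pred)
    true = true / np.linalg.norm(true)
    cos = np.clip(np.dot(pred, true), -1.0, 1.0)  # clip guards the acos domain
    return np.degrees(np.arccos(cos))

print(round(angular_error_deg(np.array([0.0, 0.0, 1.0]),
                              np.array([0.0, 1.0, 1.0])), 1))  # → 45.0
```

Averaging this quantity over a test set gives the reported mean angular error (MAE).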

12 pages, 2782 KiB  
Article
Platelets Image Classification Through Data Augmentation: A Comparative Study of Traditional Imaging Augmentation and GAN-Based Synthetic Data Generation Techniques Using CNNs
by Itunuoluwa Abidoye, Frances Ikeji, Charlie A. Coupland, Simon D. J. Calaminus, Nick Sander and Eva Sousa
J. Imaging 2025, 11(6), 183; https://doi.org/10.3390/jimaging11060183 - 4 Jun 2025
Viewed by 980
Abstract
Platelets play a crucial role in diagnosing and detecting various diseases, influencing the progression of conditions and guiding treatment options. Accurate identification and classification of platelets are essential for these purposes. The present study aims to create a synthetic database of platelet images using Generative Adversarial Networks (GANs) and validate its effectiveness by comparing it with datasets of increasing sizes generated through traditional augmentation techniques. Starting from an initial dataset of 71 platelet images, the dataset was expanded to 141 images (Level 1) using random oversampling and basic transformations and further to 1463 images (Level 2) through extensive augmentation (rotation, shear, zoom). Additionally, a synthetic dataset of 300 images was generated using a Wasserstein GAN with Gradient Penalty (WGAN-GP). Eight pre-trained deep learning models (DenseNet121, DenseNet169, DenseNet201, VGG16, VGG19, InceptionV3, InceptionResNetV2, and AlexNet) and two custom CNNs were evaluated across these datasets. Performance was measured using accuracy, precision, recall, and F1-score. On the extensively augmented dataset (Level 2), InceptionV3 and InceptionResNetV2 reached 99% accuracy and 99% precision/recall/F1-score, while DenseNet201 closely followed, with 98% accuracy, precision, recall and F1-score. GAN-augmented data further improved DenseNet’s performance, demonstrating the potential of GAN-generated images in enhancing platelet classification, especially where data are limited. These findings highlight the benefits of combining traditional and GAN-based augmentation techniques to improve classification performance in medical imaging tasks. Full article
(This article belongs to the Topic Machine Learning and Deep Learning in Medical Imaging)
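The traditional-augmentation side of the comparison above multiplies a small dataset through geometric transforms. A simplified sketch using only right-angle rotations and flips (the paper's pipeline also uses continuous rotation, shear, and zoom, which need interpolation and are omitted here):

```python
import numpy as np

def augment(images):
    """Expand a small image set with simple geometric transforms.

    A simplified stand-in for a rotation/shear/zoom pipeline: right-angle
    rotations and horizontal flips only, so no interpolation is needed.
    """
    out = []
    for img in images:
        for k in range(4):               # 0/90/180/270 degree rotations
            r = np.rot90(img, k)
            out.append(r)
            out.append(np.fliplr(r))     # mirror of each rotation
    return out

base = [np.arange(16).reshape(4, 4)] * 71   # stand-in for 71 platelet images
print(len(augment(base)))                   # → 568  (71 * 8 transforms)
```

With continuous-parameter transforms, the same 71 images can be expanded to arbitrary sizes, such as the 1463-image Level 2 set in the study.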

24 pages, 2716 KiB  
Article
Synthetic Data-Based Algorithm Selection for Medical Image Classification Under Limited Data Availability
by Maxim Zhabinets, Benjamin Tyler, Martin Lukac, Shinobu Nagayama, Ferdinand Molnár and Michitaka Kameyama
Algorithms 2025, 18(6), 310; https://doi.org/10.3390/a18060310 - 25 May 2025
Viewed by 347
Abstract
Algorithm selection improves performance by dynamically choosing the optimal algorithm for each input instance. While this selection strategy has been extensively studied, the amount of data and its nature have not yet been investigated with respect to meta-learning, particularly in scenarios with limited data availability. This paper addresses a critical challenge: when additional data are not available for training an algorithm selector, data must be generated to implement the selection mechanism. Focusing on medical image classification, we investigate whether synthetic data can effectively train an algorithm selector when real training data are scarce. Our methodology involves data generation using a Generative Adversarial Network. To determine whether an algorithm selector trained on synthetically generated data can achieve the same accuracy as one trained on real-world natural data, we systematically evaluate the generative model using the smallest amount of data needed to choose the right algorithm and to achieve the expected level of accuracy. Our experimental results demonstrate that a small number of real samples can provide enough information for a Generative Adversarial Network to synthesize a new dataset that, when used to train the algorithm selector, improves image classification in some cases. Full article
(This article belongs to the Special Issue Advanced Machine Learning Algorithms for Image Processing)
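Per-instance algorithm selection can be sketched with the simplest possible meta-selector: record which algorithm performed best on each (possibly synthetic) training instance, then route a new input to the algorithm that won on its nearest neighbour in feature space. This 1-NN scheme is a generic illustration of the idea, not the selector architecture used in the paper.

```python
import numpy as np

def select_algorithm(x, val_feats, val_best):
    """Per-instance algorithm selection via a 1-NN meta-selector:
    return the algorithm that was best on the nearest known instance."""
    dists = np.linalg.norm(val_feats - x, axis=1)
    return val_best[int(np.argmin(dists))]

# Toy meta-data: which of two classifiers ("cnn", "svm") won on each instance.
val_feats = np.array([[0.0, 0.0], [1.0, 1.0], [0.9, 1.1]])
val_best = np.array(["svm", "cnn", "cnn"])
print(select_algorithm(np.array([0.1, 0.0]), val_feats, val_best))  # → svm
print(select_algorithm(np.array([1.0, 0.9]), val_feats, val_best))  # → cnn
```

The paper's question is then whether `val_feats`/`val_best` built from GAN-synthesized images train a selector as accurate as one built from real data.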

24 pages, 3487 KiB  
Article
A Convolutional Mixer-Based Deep Learning Network for Alzheimer’s Disease Classification from Structural Magnetic Resonance Imaging
by M. Krithika Alias Anbu Devi and K. Suganthi
Diagnostics 2025, 15(11), 1318; https://doi.org/10.3390/diagnostics15111318 - 23 May 2025
Viewed by 621
Abstract
Objective: Alzheimer’s disease (AD) is a neurodegenerative disorder that severely impairs cognitive function across various age groups, ranging from the early to late sixties. It progresses from mild to severe stages, so an accurate diagnostic tool is necessary for effective intervention and treatment planning. Methods: This work proposes a novel AD classification architecture that integrates depthwise separable convolutional layers with traditional convolutional layers to efficiently extract features from structural magnetic resonance imaging (sMRI) scans. The model benefits from excellent feature extraction and lightweight operation, which reduces the number of parameters without compromising accuracy. The model learns from scratch with optimized weight initialization, resulting in faster convergence and improved generalization. However, class imbalance is a major challenge in medical imaging datasets, often resulting in biased models with poor generalization to underrepresented disease stages. A hybrid sampling approach combining SMOTE (synthetic minority oversampling technique) with ENN (edited nearest neighbors) effectively handles the class imbalance inherent in these datasets. An explainable activation space occlusion sensitivity map (ASOP) pixel attribution method is employed to highlight the critical regions of input images that influence classification decisions across different stages of AD. Results and Conclusions: The proposed model outperformed several state-of-the-art transfer learning architectures, including VGG19, DenseNet201, EfficientNetV2S, MobileNet, ResNet152, InceptionV3, and Xception. It achieves noteworthy results in disease stage classification, with an accuracy of 98.87%, an F1 score of 98.86%, a precision of 98.80%, and a recall of 98.69%. These results demonstrate the effectiveness of the proposed model for classifying stages of AD progression. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
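The SMOTE half of the hybrid sampling scheme above generates synthetic minority-class samples by interpolating between a minority point and one of its nearest minority-class neighbours. A minimal sketch (the ENN cleaning step, which then removes samples misclassified by their neighbours, is omitted; `k` and the toy data are assumptions):

```python
import numpy as np

def smote_samples(minority, n_new, k=2, seed=0):
    """Generate `n_new` synthetic minority samples by SMOTE-style
    interpolation between a point and one of its k nearest neighbours."""
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(minority))
        d = np.linalg.norm(minority - minority[i], axis=1)
        neighbours = np.argsort(d)[1:k + 1]   # skip the point itself
        j = rng.choice(neighbours)
        lam = rng.random()                    # interpolation factor in [0, 1]
        out.append(minority[i] + lam * (minority[j] - minority[i]))
    return np.array(out)

minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # toy minority class
synth = smote_samples(minority, n_new=5)
print(synth.shape)  # → (5, 2)
```

Because each synthetic point lies on a segment between real minority samples, oversampling stays inside the observed class region rather than inventing outliers.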

22 pages, 4959 KiB  
Article
Predicting Post-Liposuction Body Shape Using RGB Image-to-Image Translation
by Minji Kim, Jiseong Byeon, Jihun Chang and Sekyoung Youm
Appl. Sci. 2025, 15(9), 4787; https://doi.org/10.3390/app15094787 - 25 Apr 2025
Viewed by 457
Abstract
The growing interest in weight management has elevated the popularity of liposuction. Individuals deciding whether to undergo liposuction must rely on a doctor’s subjective projections or surgical outcomes for other people to gauge how their own body shape will change. However, such predictions may not be accurate. Although deep learning technology has recently achieved breakthroughs in analyzing medical images and rendering diagnoses, predicting surgical outcomes based on medical images outside clinical settings remains challenging. Hence, this study aimed to develop a method for predicting body shape changes after liposuction using only images of the subject’s own body. To achieve this, we utilize data augmentation based on a conditional continuous Generative Adversarial Network (CcGAN), which generates realistic synthetic data conditioned on continuous variables. Additionally, we modify the loss function of Pix2Pix—a supervised image-to-image translation technique based on Generative Adversarial Networks (GANs)—to enhance prediction quality. Our approach quantitatively and qualitatively demonstrates that accurate, intuitive predictions before liposuction are possible. Full article

29 pages, 6518 KiB  
Article
Generative AI Models (2018–2024): Advancements and Applications in Kidney Care
by Fnu Neha, Deepshikha Bhati and Deepak Kumar Shukla
BioMedInformatics 2025, 5(2), 18; https://doi.org/10.3390/biomedinformatics5020018 - 3 Apr 2025
Cited by 1 | Viewed by 2609
Abstract
Kidney disease poses a significant global health challenge, affecting millions and straining healthcare systems due to limited nephrology resources. This paper examines the transformative potential of Generative AI (GenAI), Large Language Models (LLMs), and Large Vision Models (LVMs) in addressing critical challenges in kidney care. GenAI supports research and early interventions through the generation of synthetic medical data. LLMs enhance clinical decision-making by analyzing medical texts and electronic health records, while LVMs improve diagnostic accuracy through advanced medical image analysis. Together, these technologies show promise for advancing patient education, risk stratification, disease diagnosis, and personalized treatment strategies. This paper highlights key advancements in GenAI, LLMs, and LVMs from 2018 to 2024, focusing on their applications in kidney care and presenting common use cases. It also discusses their limitations, including knowledge cutoffs, hallucinations, contextual understanding challenges, data representation biases, computational demands, and ethical concerns. By providing a comprehensive analysis, this paper outlines a roadmap for integrating these AI advancements into nephrology, emphasizing the need for further research and real-world validation to fully realize their transformative potential. Full article

28 pages, 957 KiB  
Systematic Review
Advancing Diabetic Foot Ulcer Care: AI and Generative AI Approaches for Classification, Prediction, Segmentation, and Detection
by Suhaylah Alkhalefah, Isra AlTuraiki and Najwa Altwaijry
Healthcare 2025, 13(6), 648; https://doi.org/10.3390/healthcare13060648 - 16 Mar 2025
Cited by 3 | Viewed by 2914
Abstract
Background: Diabetic foot ulcers (DFUs) represent a significant challenge in managing diabetes, leading to higher patient complications and increased healthcare costs. Traditional approaches, such as manual wound assessment and diagnostic tool usage, often require significant resources, including skilled clinicians, specialized equipment, and extensive time. Artificial intelligence (AI) and generative AI offer promising solutions for improving DFU management. This study systematically reviews the role of AI in DFU classification, prediction, segmentation, and detection. Furthermore, it highlights the role of generative AI in overcoming data scarcity and the potential of AI-based smartphone applications for remote monitoring and diagnosis. Methods: A systematic literature review was conducted following the PRISMA guidelines. Relevant studies published between 2020 and 2025 were identified from databases including PubMed, IEEE Xplore, Scopus, and Web of Science. The review focused on AI and generative AI applications in DFU and excluded non-DFU-related medical imaging articles. Results: This study indicates that AI-powered models have significantly improved DFU classification accuracy, early detection, and predictive modeling. Generative AI techniques, such as GANs and diffusion models, have demonstrated potential in addressing dataset limitations by generating synthetic DFU images. Additionally, AI-powered smartphone applications provide cost-effective solutions for DFU monitoring, potentially improving diagnosis. Conclusions: AI and generative AI are transforming DFU management by enhancing diagnostic accuracy and predictive capabilities. Future research should prioritize explainable AI frameworks and diverse datasets for AI-driven healthcare solutions to facilitate broader clinical adoption. Full article
(This article belongs to the Special Issue Artificial Intelligence in Healthcare: Opportunities and Challenges)

20 pages, 31492 KiB  
Article
The Bright Feature Transform for Prominent Point Scatterer Detection and Tone Mapping
by Gregory D. Vetaw and Suren Jayasuriya
Remote Sens. 2025, 17(6), 1037; https://doi.org/10.3390/rs17061037 - 15 Mar 2025
Viewed by 535
Abstract
Detecting bright point scatterers plays an important role in assessing the quality of many sonar, radar, and medical ultrasound imaging systems, especially for characterizing the resolution. Traditionally, prominent scatterers, also known as coherent scatterers, are usually detected by employing thresholding techniques alongside statistical measures in the detection processing chain. However, these methods can perform poorly in detecting point-like scatterers in relatively high levels of speckle background and can distort the structure of the scatterer when visualized. This paper introduces a fast image-processing method to visually identify and detect point scatterers in synthetic aperture imagery using the bright feature transform (BFT). The BFT is analytic, computationally inexpensive, and requires no thresholding or parameter tuning. We derive this method by analyzing an ideal point scatterer’s response with respect to pixel intensity and contrast around neighboring pixels and non-adjacent pixels. We show that this method preserves the general structure and the width of the bright scatterer while performing tone mapping, which can then be used for downstream image characterization and analysis. We then modify the BFT to use a difference of trigonometric functions to mitigate speckle scatterers and other random noise sources found in the imagery. We evaluate the performance of our methods on simulated and real synthetic aperture sonar and radar images, and show qualitative results on how the methods perform tone mapping on reconstructed input imagery in a way that highlights the bright scatterer, making it insensitive to seafloor textures and high speckle noise levels. Full article

30 pages, 34873 KiB  
Article
Text-Guided Synthesis in Medical Multimedia Retrieval: A Framework for Enhanced Colonoscopy Image Classification and Segmentation
by Ojonugwa Oluwafemi Ejiga Peter, Opeyemi Taiwo Adeniran, Adetokunbo MacGregor John-Otumu, Fahmi Khalifa and Md Mahmudur Rahman
Algorithms 2025, 18(3), 155; https://doi.org/10.3390/a18030155 - 9 Mar 2025
Cited by 1 | Viewed by 1393
Abstract
The lack of extensive, varied, and thoroughly annotated datasets impedes the advancement of artificial intelligence (AI) for medical applications, especially colorectal cancer detection. Models trained with limited diversity often display biases, especially when applied to disadvantaged groups. Generative models (e.g., DALL-E 2 and the Vector-Quantized Generative Adversarial Network (VQ-GAN)) have been used to generate images, but not colonoscopy data, for intelligent data augmentation. This study developed an effective method for producing synthetic colonoscopy image data that can be used to train advanced medical diagnostic models for robust colorectal cancer detection and treatment. Text-to-image synthesis was performed using fine-tuned visual large language models (LLMs): Stable Diffusion with DreamBooth Low-Rank Adaptation produced authentic-looking images, with an average Inception Score of 2.36 across three datasets. The validation accuracies of three classification models, Big Transfer (BiT), Fixed Resolution Residual Next Generation Network (FixResNeXt), and Efficient Neural Network (EfficientNet), were 92%, 91%, and 86%, respectively, while Vision Transformer (ViT) and Data-Efficient Image Transformers (DeiT) reached 93%. For polyp segmentation, ground-truth masks were generated using the Segment Anything Model (SAM), and five segmentation models (U-Net, Pyramid Scene Parsing Network (PSPNet), Feature Pyramid Network (FPN), Link Network (LinkNet), and Multi-scale Attention Network (MANet)) were adopted. FPN produced excellent results, with an Intersection over Union (IoU) of 0.64, an F1 score of 0.78, a recall of 0.75, and a Dice coefficient of 0.77, demonstrating strong segmentation accuracy and overlap, with particularly balanced detection capability as shown by the high F1 score and Dice coefficient. This highlights how AI-generated medical images can improve colonoscopy analysis, which is critical for early colorectal cancer detection. Full article
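The overlap metrics quoted in this abstract (IoU and Dice coefficient) have simple set-based definitions. A minimal sketch using the standard formulas, not the authors' evaluation code:

```python
import numpy as np

def overlap_metrics(pred, gt):
    """Compute IoU and Dice for two binary segmentation masks.

    Standard definitions: IoU = |P & G| / |P | G| and
    Dice = 2|P & G| / (|P| + |G|). For binary masks, Dice equals
    the F1 score of the positive class.
    """
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    total = pred.sum() + gt.sum()
    # Convention: two empty masks count as a perfect match.
    iou = inter / union if union else 1.0
    dice = 2.0 * inter / total if total else 1.0
    return float(iou), float(dice)
```

Because Dice weights the intersection twice, it is always at least as large as IoU on the same pair of masks, consistent with the reported 0.77 Dice versus 0.64 IoU.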

16 pages, 48485 KiB  
Article
Detection of Surgical Instruments Based on Synthetic Training Data
by Leon Wiese, Lennart Hinz, Eduard Reithmeier, Philippe Korn and Michael Neuhaus
Computers 2025, 14(2), 69; https://doi.org/10.3390/computers14020069 - 15 Feb 2025
Viewed by 1083
Abstract
Due to a significant shortage of healthcare staff, medical facilities are increasingly challenged by the need to deploy current staff more intensively, which can lead to significant complications for patients and staff. Digital surgical assistance systems that track all instruments used in procedures [...] Read more.
Due to a significant shortage of healthcare staff, medical facilities are increasingly challenged by the need to deploy current staff more intensively, which can lead to serious complications for patients and staff. Digital surgical assistance systems that track all instruments used in procedures can make a significant contribution to relieving the load on staff, increasing efficiency, avoiding errors, and improving hygiene. Given data privacy concerns, laborious data annotation, and the complexity of the scenes, as well as the potential to increase prediction accuracy, the provision of synthetic data is key to enabling the wide use of artificial intelligence for object recognition and tracking in operating room (OR) settings. In this study, a synthetic data generation pipeline is introduced for the detection of eight surgical instruments during open surgery. Using 3D models of the instruments, synthetic datasets consisting of color images and annotations were created. These datasets were used to train a common object detection network (YOLOv8) and compared against networks trained solely on real data. The comparison, conducted on two real image datasets of varying complexity, revealed that networks trained on synthetic data generalize better. A sensitivity analysis showed that synthetic-data-trained networks could detect surgical instruments at higher occlusion levels than real-data-trained networks. Additionally, 1920 datasets were generated using different parameter combinations to evaluate the impact of various settings on detection performance. Key findings include the importance of object visibility, occlusion, and the inclusion of occluding objects in improving detection accuracy. The results highlight the potential of synthetic datasets to simulate real-world conditions, enhance network generalization, and address data shortages in specialized domains such as surgical instrument detection. Full article
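A pipeline that renders color images with annotations for YOLOv8 training ultimately has to emit labels in YOLO's plain-text format. The abstract does not describe the authors' annotation code; the helper below is an assumed sketch of the standard conversion from a pixel-space bounding box to a YOLO label line:

```python
def to_yolo_annotation(class_id, box, img_w, img_h):
    """Convert a pixel-space box (x_min, y_min, x_max, y_max) into a
    YOLO-format label line: "class x_center y_center width height",
    with all four geometry values normalized to [0, 1] by the image
    dimensions. One such line is written per object per image.
    """
    x_min, y_min, x_max, y_max = box
    xc = (x_min + x_max) / 2.0 / img_w
    yc = (y_min + y_max) / 2.0 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"
```

In a rendering pipeline, the box coordinates come for free from the projection of each 3D instrument model, which is what makes synthetic annotation essentially cost-free compared with manual labeling.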
(This article belongs to the Special Issue AI in Its Ecosystem)