Search Results (45)

Search Parameters:
Keywords = ISIC 2018

22 pages, 1280 KB  
Article
Enhancing Early Skin Cancer Detection: A Deep Learning Approach with Multi-Scale Feature Refinement and Fusion
by Siyuan Wu, Pengfei Zhao, Huafu Xu and Zimin Wang
Symmetry 2026, 18(4), 612; https://doi.org/10.3390/sym18040612 - 5 Apr 2026
Viewed by 368
Abstract
The global incidence of skin cancer is rising, making it an increasingly critical public health issue. Malignant skin tumors such as melanoma originate from pathological alterations in skin cells, and their accurate early-stage segmentation is crucial for quantitative analysis, early diagnosis, and effective treatment. However, achieving precise and efficient segmentation remains a major challenge, as existing methods often struggle to capture complex lesion characteristics. To address this challenge, we propose a novel deep learning framework that integrates the PVT v2 backbone with two key modules: the Spatial-Aware Feature Enhancement (SAFE) module and the Multiscale Dual Cross-attention Fusion (MDCF) module. The SAFE module enhances multi-scale encoder features through a dual-branch architecture, which adaptively extracts offset information to integrate fine-grained shallow details with deep semantic information, thereby bridging the feature gap across network depths. The MDCF module establishes bidirectional cross-attention between decoder and encoder features, followed by multi-scale deformable convolutions that capture lesion boundaries and small fragments across heterogeneous receptive fields, thereby enriching semantic details while suppressing background interference. The proposed model was evaluated on two public benchmark datasets (ISIC 2016 and ISIC 2018), achieving Intersection over Union (IoU) scores of 87.33% and 83.67%, respectively. These results demonstrate superior performance compared to current state-of-the-art methods and indicate that our framework significantly enhances skin lesion image analysis, offering a promising tool for improving early detection of skin cancer. Full article
(This article belongs to the Special Issue Symmetric/Asymmetric Study in Medical Imaging)
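The headline numbers in this abstract are Intersection over Union scores. A minimal NumPy sketch of IoU for binary segmentation masks (illustrative, not the authors' code):

```python
import numpy as np

def iou(pred: np.ndarray, target: np.ndarray) -> float:
    """Intersection over Union for binary masks (1 = lesion, 0 = background)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return float(inter / union) if union else 1.0  # two empty masks agree

# Toy 4x4 masks: the prediction overlaps the target on 1 of 3 labeled pixels
pred = np.zeros((4, 4)); pred[0, :2] = 1
gt = np.zeros((4, 4)); gt[0, 1:3] = 1
print(iou(pred, gt))  # 1 overlap / 3 union ≈ 0.333
```

Reported IoU percentages such as 87.33% are this ratio averaged over a test set.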

15 pages, 2052 KB  
Article
A Dual-Branch Multi-Scale Network for Skin Lesion Classification
by Ying Liu, Xinyu Feng, Yuchai Wan, Huifu Li, Xun Zhang and Abdureyim Raxidin
Electronics 2026, 15(5), 1118; https://doi.org/10.3390/electronics15051118 - 8 Mar 2026
Viewed by 359
Abstract
Dermoscopic images are widely used for diagnosing skin diseases, and automatic classification of lesion types using deep learning can significantly enhance diagnostic efficiency. However, challenges such as variations in imaging conditions, subtle differences between classes, high variability within classes, and severe class imbalance complicate skin lesion analysis. This paper introduces a dual-branch deep learning model where two branches independently process high-frequency and low-frequency image features to generate multi-scale fused representations. To address class imbalance, the model employs cosine similarity to strengthen inter-class discrimination and incorporates a bias term to improve recognition of minority lesion classes. Experiments conducted on the ISIC 2017 and ISIC 2018 datasets demonstrate that the proposed method surpasses state-of-the-art approaches, achieving accuracies of 97.0% and 91.9%, respectively, with sensitivity and specificity both exceeding 90% on the two datasets. Full article
(This article belongs to the Special Issue Deep Learning for Computer Vision Application: Second Edition)
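The class-imbalance handling described above (cosine-similarity logits plus a bias term that lifts minority classes) can be sketched as follows; the `scale` and `bias` values are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def cosine_logits(features, weights, bias, scale=10.0):
    """Cosine-similarity logits with a per-class additive bias.

    features: (N, D) embeddings; weights: (C, D) class prototypes;
    bias: (C,) term that can boost under-represented lesion classes.
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    return scale * f @ w.T + bias

feats = np.array([[1.0, 0.0]])
protos = np.array([[1.0, 0.0], [0.0, 1.0]])
bias = np.array([0.0, 0.5])          # small illustrative boost for the rarer class
logits = cosine_logits(feats, protos, bias)
print(logits)  # [[10.   0.5]] -> still predicts class 0 here
```

Normalizing both embeddings and prototypes makes the logit depend on direction rather than magnitude, which is what sharpens inter-class discrimination.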

18 pages, 12952 KB  
Article
Synthetic Melanoma Image Generation and Evaluation Using Generative Adversarial Networks
by Pei-Yu Lin, Yidan Shen, Neville Mathew, Renjie Hu, Siyu Huang, Courtney M. Queen, Cameron E. West, Ana Ciurea and George Zouridakis
Bioengineering 2026, 13(2), 245; https://doi.org/10.3390/bioengineering13020245 - 20 Feb 2026
Viewed by 856
Abstract
Melanoma is the most lethal form of skin cancer, and early detection is critical for improving patient outcomes. Although dermoscopy combined with deep learning has advanced automated skin-lesion analysis, progress is hindered by limited access to large, well-annotated datasets and by severe class imbalance, where melanoma images are substantially underrepresented. To address these challenges, we present the first systematic benchmarking study comparing four GAN architectures—DCGAN, StyleGAN2, and two StyleGAN3 variants (T and R)—for high-resolution (512×512) melanoma-specific synthesis. We train and optimize all models on two expert-annotated benchmarks (ISIC 2018 and ISIC 2020) under unified preprocessing and hyperparameter exploration, with particular attention to R1 regularization tuning. Image quality is assessed through a multi-faceted protocol combining distribution-level metrics (FID), sample-level representativeness (FMD), qualitative dermoscopic inspection, downstream classification with a frozen EfficientNet-based melanoma detector, and independent evaluation by two board-certified dermatologists. StyleGAN2 achieves the best balance of quantitative performance and perceptual quality, attaining FID scores of 24.8 (ISIC 2018) and 7.96 (ISIC 2020) at γ=0.8. The frozen classifier recognizes 83% of StyleGAN2-generated images as melanoma, while dermatologists distinguish synthetic from real images at only 66.5% accuracy (chance = 50%), with low inter-rater agreement (κ=0.17). In a controlled augmentation experiment, adding synthetic melanoma images to address class imbalance improved melanoma detection AUC from 0.925 to 0.945 on a held-out real-image test set. These findings demonstrate that StyleGAN2-generated melanoma images preserve diagnostically relevant features and can provide a measurable benefit for mitigating class imbalance in melanoma-focused machine learning pipelines. Full article
(This article belongs to the Special Issue AI and Data Science in Bioengineering: Innovations and Applications)
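The FID scores quoted above measure the Fréchet distance between Gaussian fits of real and synthetic feature distributions. A toy sketch of the formula under a simplifying diagonal-covariance assumption (real pipelines use Inception embeddings and full covariance matrices with a matrix square root):

```python
import numpy as np

def fid_diagonal(mu1, var1, mu2, var2):
    """Fréchet distance with diagonal covariances.

    Full FID is ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^{1/2}); with
    diagonal S the matrix square root reduces to an elementwise sqrt.
    """
    diff = np.sum((mu1 - mu2) ** 2)
    trace = np.sum(var1 + var2 - 2.0 * np.sqrt(var1 * var2))
    return float(diff + trace)

# Hypothetical 2-D feature statistics for real vs. generated images
mu_real, var_real = np.array([0.0, 0.0]), np.array([1.0, 1.0])
mu_fake, var_fake = np.array([1.0, 0.0]), np.array([1.0, 4.0])
print(fid_diagonal(mu_real, var_real, mu_fake, var_fake))  # 2.0
```

Identical distributions score 0; lower is better, which is why StyleGAN2's 7.96 on ISIC 2020 indicates closer matching than 24.8 on ISIC 2018.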

20 pages, 3102 KB  
Article
LDFSAM: Localization Distillation-Enhanced Feature Prompting SAM for Medical Image Segmentation
by Xuanbo Zhao, Cheng Wang, Huaxing Xu, Hong Zhou, Zekuan Yu, Tao Chen, Xiaoling Wei and Rongjun Zhang
J. Imaging 2026, 12(2), 74; https://doi.org/10.3390/jimaging12020074 - 10 Feb 2026
Viewed by 697
Abstract
Standard SAM-based approaches in medical imaging typically rely on explicit geometric prompts, such as bounding boxes or points. However, these rigid spatial constraints are often insufficient for capturing the complex, deformable boundaries of medical structures, where localization noise easily propagates into segmentation errors. To overcome this, we propose the Localization Distillation-Enhanced Feature Prompting SAM (LDFSAM), a novel framework that shifts from discrete coordinate inputs to a latent feature prompting paradigm. We employ a lightweight prompt generator, refined via Localization Distillation (LD), to inject multi-scale features into the SAM decoder as complementary Dense Feature Prompts (DFPs) and Sparse Feature Prompts (SFPs). This effectively guides segmentation without explicit box constraints. Extensive experiments on four public benchmarks (3D CBCT Tooth, ISIC 2018, MMOTU, and Kvasir-SEG) demonstrate that LDFSAM outperforms both prior SAM-based baselines and conventional networks, achieving Dice scores exceeding 0.91. Further validation on an in-house cohort demonstrates its robust generalization capabilities. Overall, our method outperforms both prior SAM-based baselines and conventional networks, with particularly strong gains in low-data regimes, providing a reliable solution for automated medical image segmentation. Full article
(This article belongs to the Section Medical Imaging)
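The Dice scores above come from the standard overlap metric, 2|A∩B| / (|A| + |B|); a minimal NumPy sketch (illustrative, not the authors' code):

```python
import numpy as np

def dice(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice coefficient for binary masks; eps avoids division by zero."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return float((2.0 * inter + eps) / (pred.sum() + target.sum() + eps))

pred = np.zeros((4, 4)); pred[1, :3] = 1   # 3 predicted pixels
gt = np.zeros((4, 4)); gt[1, 1:4] = 1      # 3 ground-truth pixels
print(round(dice(pred, gt), 3))  # 2*2 / (3+3) ≈ 0.667
```

Dice weights the overlap twice, so it is more forgiving of small boundary errors than IoU on the same masks.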

31 pages, 1633 KB  
Article
Foundation-Model-Driven Skin Lesion Segmentation and Classification Using SAM-Adapters and Vision Transformers
by Faisal Binzagr and Majed Hariri
Diagnostics 2026, 16(3), 468; https://doi.org/10.3390/diagnostics16030468 - 3 Feb 2026
Cited by 1 | Viewed by 883
Abstract
Background: The precise segmentation and classification of dermoscopic images remain prominent obstacles in automated skin cancer evaluation due, in part, to variability in lesions, low-contrast borders, and additional artifacts in the background. There have been recent developments in foundation models, with a particular emphasis on the Segment Anything Model (SAM)—these models exhibit strong generalization potential but require domain-specific adaptation to function effectively in medical imaging. The advent of new architectures, particularly Vision Transformers (ViTs), expands the means of implementing robust lesion identification; however, their strengths are limited without spatial priors. Methods: The proposed study lays out an integrated foundation-model-based framework that utilizes SAM-Adapter-fine-tuning for lesion segmentation and a ViT-based classifier that incorporates lesion-specific cropping derived from segmentation and cross-attention fusion. The SAM encoder is kept frozen while lightweight adapters are fine-tuned only, to introduce skin surface-specific capacity. Segmentation priors are incorporated during the classification stage through fusion with patch-embeddings from the images, creating lesion-centric reasoning. The entire pipeline is trained using a joint multi-task approach using data from the ISIC 2018, HAM10000, and PH2 datasets. Results: From extensive experimentation, the proposed method outperforms the state-of-the-art segmentation and classification across the dataset. On the ISIC 2018 dataset, it achieves a Dice score of 94.27% for segmentation and an accuracy of 95.88% for classification performance. On PH2, a Dice score of 95.62% is achieved, and for HAM10000, an accuracy of 96.37% is achieved. Several ablation analyses confirm that both the SAM-Adapters and lesion-specific cropping and cross-attention fusion contribute substantially to performance. 
Paired t-tests confirm the statistical significance of the reported improvements over strong baselines, with p < 0.01 for most comparisons and large effect sizes. Conclusions: The results indicate that combining segmentation priors from foundation models with transformer-based classification consistently and reliably improves lesion boundary quality and diagnostic accuracy. The proposed SAM-ViT framework thus demonstrates robust, generalizable, lesion-centric automated dermoscopic analysis and represents a promising initial step toward a clinically deployable skin cancer decision-support system. Next steps will include model compression, improved pseudo-mask refinement, and evaluation on real-world multi-center clinical cohorts. Full article
(This article belongs to the Special Issue Medical Image Analysis and Machine Learning)
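The paired t-tests reported above compare the same test items under two models. A sketch of the paired t-statistic over hypothetical per-fold Dice scores (in practice `scipy.stats.ttest_rel` also returns the p-value):

```python
import numpy as np

def paired_t(scores_a, scores_b) -> float:
    """Paired t-statistic: mean of per-item differences over its standard error."""
    d = np.asarray(scores_a, dtype=float) - np.asarray(scores_b, dtype=float)
    n = d.size
    return float(d.mean() / (d.std(ddof=1) / np.sqrt(n)))

# Hypothetical per-fold Dice scores: proposed model vs. a baseline
ours = [0.942, 0.939, 0.945, 0.941, 0.944]
base = [0.925, 0.921, 0.930, 0.926, 0.928]
t = paired_t(ours, base)
print(t)  # a large t with n-1 = 4 dof corresponds to p well below 0.01
```

Pairing removes the per-item variance shared by both models, which is why it is the standard test for comparing segmenters on a fixed test set.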

31 pages, 4397 KB  
Article
Transformer-Based Foundation Learning for Robust and Data-Efficient Skin Disease Imaging
by Inzamam Mashood Nasir, Hend Alshaya, Sara Tehsin and Wided Bouchelligua
Diagnostics 2026, 16(3), 440; https://doi.org/10.3390/diagnostics16030440 - 1 Feb 2026
Cited by 2 | Viewed by 610
Abstract
Background/Objectives: Accurate and reliable automated dermoscopic lesion classification remains challenging due to pronounced dataset bias, limited expert-annotated data, and poor cross-dataset generalization of conventional supervised deep learning models. In clinical dermatology, these limitations restrict the deployment of data-driven diagnostic systems across diverse acquisition settings and patient populations. Methods: Motivated by these challenges, this study proposes a transformer-based, dermatology-specific foundation model that learns transferable visual representations from large collections of unlabeled dermoscopic images via self-supervised pretraining. It integrates large-scale dermatology-oriented self-supervised learning with a hierarchical vision transformer backbone, enabling effective capture of both fine-grained lesion textures and global morphological patterns. The evaluation spans three publicly available dermoscopic datasets (ISIC 2018, HAM10000, and PH2) and covers in-dataset, cross-dataset, limited-label, ablation, and computational-efficiency settings. Results: The proposed approach achieves in-dataset classification accuracies of 94.87%, 97.32%, and 98.17% on ISIC 2018, HAM10000, and PH2, respectively, outperforming strong transformer and hybrid baselines. Cross-dataset transfer experiments show consistent performance gains of 3.5–5.8% over supervised counterparts, indicating improved robustness to domain shift. Furthermore, when fine-tuned with only 10% of the labeled training data, the model achieves performance comparable to fully supervised baselines, highlighting strong data efficiency. Conclusions: These results demonstrate that dermatology-specific foundation learning offers a principled and practical solution for robust dermoscopic lesion classification under realistic clinical constraints. Full article
(This article belongs to the Special Issue Advanced Imaging in the Diagnosis and Management of Skin Diseases)

32 pages, 7593 KB  
Review
Advancing Medical Decision-Making with AI: A Comprehensive Exploration of the Evolution from Convolutional Neural Networks to Capsule Networks
by Ichrak Khoulqi and Zakariae El Ouazzani
J. Imaging 2026, 12(1), 17; https://doi.org/10.3390/jimaging12010017 - 30 Dec 2025
Viewed by 899
Abstract
In this paper, we present a literature review of two deep learning architectures, namely Convolutional Neural Networks (CNNs) and Capsule Networks (CapsNets), applied to medical images, and analyze their role in medical decision support. CNNs have demonstrated their capability in the medical diagnostic field; however, their reliability decreases under slight spatial variability, which can affect diagnosis, especially since the anatomical structure of the human body differs from one patient to another. In contrast, CapsNets encode not only feature activations but also spatial relationships, hence improving the reliability and stability of model generalization. This paper offers a structured comparison by reviewing studies published from 2018 to 2025 across major databases, including IEEE Xplore, ScienceDirect, SpringerLink, and MDPI. The applications in the reviewed papers are based on the benchmark datasets BraTS, INbreast, ISIC, and COVIDx. The review compares the core architectural principles, performance, and interpretability of both architectures. To conclude, we underline the complementary roles of these two architectures in medical decision-making and propose future directions toward hybrid, explainable, and computationally efficient deep learning systems for real clinical environments, supporting early disease detection and improved survival rates. Full article

26 pages, 6899 KB  
Article
When RNN Meets CNN and ViT: The Development of a Hybrid U-Net for Medical Image Segmentation
by Ziru Wang and Ziyang Wang
Fractal Fract. 2026, 10(1), 18; https://doi.org/10.3390/fractalfract10010018 - 28 Dec 2025
Cited by 3 | Viewed by 2574
Abstract
Deep learning for semantic segmentation has made significant advances in recent years, achieving state-of-the-art performance. Medical image segmentation, as a key component of healthcare systems, plays a vital role in the diagnosis and treatment planning of diseases. Due to the fractal and scale-invariant nature of biological structures, effective medical image segmentation requires models capable of capturing hierarchical and self-similar representations across multiple spatial scales. In this paper, a Recurrent Neural Network (RNN) is explored within the Convolutional Neural Network (CNN) and Vision Transformer (ViT)-based hybrid U-shape network, named RCV-UNet. First, the ViT-based layer was developed in the bottleneck to effectively capture the global context of an image and establish long-range dependencies through the self-attention mechanism. Second, recurrent residual convolutional blocks (RRCBs) were introduced in both the encoder and decoder to enhance the ability to capture local features and preserve fine details. Third, by integrating the global feature extraction capability of ViT with the local feature enhancement strength of RRCBs, RCV-UNet achieved promising global consistency and boundary refinement, addressing key challenges in medical image segmentation. From a fractal–fractional perspective, the multi-scale encoder–decoder hierarchy and attention-driven aggregation in RCV-UNet naturally accommodate fractal-like, scale-invariant regularity, while the recurrent and residual connections approximate fractional-order dynamics in feature propagation, enabling continuous and memory-aware representation learning. The proposed RCV-UNet was evaluated on four different modalities of images, including CT, MRI, Dermoscopy, and ultrasound, using the Synapse, ACDC, ISIC 2018, and BUSI datasets. Experimental results demonstrate that RCV-UNet outperforms other popular baseline methods, achieving strong performance across different segmentation tasks. 
The code of the proposed method will be made publicly available. Full article

17 pages, 2692 KB  
Article
MSDTCN-Net: A Multi-Scale Dual-Encoder Network for Skin Lesion Segmentation
by Da Li, Xinyang Wu and Qin Wei
Diagnostics 2025, 15(22), 2924; https://doi.org/10.3390/diagnostics15222924 - 19 Nov 2025
Viewed by 835
Abstract
Background/Objectives: Accurate segmentation of skin lesions is essential for early skin cancer detection. However, traditional CNNs are limited in modeling long-range dependencies, leading to poor performance on lesions with complex shapes. Methods: We propose MSDTCN-Net, a dual-encoder network that integrates ConvNeXt and Deformable Transformer to extract both local details and global semantic information. A Squeeze-and-Excitation (SE) mechanism is introduced to adaptively emphasize important channels. To address scale variation in lesions, we design a Multi-Scale Receptive Field (MSRF) module combining multi-branch and dilated convolutions. Furthermore, a Hierarchical Feature Transfer (HFT) mechanism is employed to guide high-level semantics progressively to shallow layers, enhancing boundary reconstruction in the decoder. Results: Extensive experiments on the ISIC 2016, ISIC 2017, ISIC 2018, and PH2 datasets show that MSDTCN-Net achieves competitive performance across metrics including IoU, Dice, and ACC, validating its effectiveness and generalization in skin lesion segmentation. Conclusions: MSDTCN-Net effectively combines local and global feature extraction, multi-scale adaptability, and semantic guidance to achieve high-accuracy skin lesion segmentation, demonstrating its potential in clinical diagnostic applications. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
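The Squeeze-and-Excitation (SE) mechanism mentioned above re-weights feature channels by their global context; a minimal NumPy sketch with hypothetical weight shapes (reduction ratio r = 4 here, chosen for illustration):

```python
import numpy as np

def squeeze_excite(x, w1, w2):
    """Squeeze-and-Excitation: global-average-pool the channels, pass the
    result through a small bottleneck MLP, and rescale each channel by a
    sigmoid gate. x: (C, H, W); w1: (C, C//r); w2: (C//r, C)."""
    z = x.mean(axis=(1, 2))                  # squeeze: per-channel statistic (C,)
    s = np.maximum(z @ w1, 0.0) @ w2         # excitation MLP with ReLU
    gate = 1.0 / (1.0 + np.exp(-s))          # sigmoid gate in (0, 1), shape (C,)
    return x * gate[:, None, None]           # channel-wise rescale

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))
w1 = rng.standard_normal((8, 2)) * 0.1
w2 = rng.standard_normal((2, 8)) * 0.1
y = squeeze_excite(x, w1, w2)
print(y.shape)  # (8, 4, 4)
```

Because the gate lies in (0, 1), SE can only attenuate channels, letting the network emphasize informative ones relative to the rest.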

26 pages, 5268 KB  
Article
Blurred Lesion Image Segmentation via an Adaptive Scale Thresholding Network
by Qi Chen, Wenmin Wang, Zhibing Wang, Haomei Jia and Minglu Zhao
Appl. Sci. 2025, 15(17), 9259; https://doi.org/10.3390/app15179259 - 22 Aug 2025
Viewed by 1379
Abstract
Medical image segmentation is crucial for disease diagnosis, as precise results help clinicians locate lesion regions. However, lesions often have blurred boundaries and complex shapes, which challenge traditional methods to capture clear edges and hamper accurate localization and complete excision. Small lesions are also clinically important but prone to detail loss during downsampling, reducing segmentation accuracy. To address these issues, we propose a novel adaptive scale thresholding network (AdSTNet) that acts as a lightweight post-processing network, enhancing sensitivity to lesion edges and cores through a dual-threshold adaptive mechanism. This mechanism is the key architectural component and comprises a main threshold map for core localization and an edge threshold map for more precise boundary detection. AdSTNet is compatible with any segmentation network and introduces only a small computational and parameter cost. Additionally, Spatial Attention and Channel Attention (SACA), the Laplacian operator, and a Fusion Enhancement module are introduced to improve feature processing. SACA strengthens spatial and channel attention for core localization; the Laplacian operator retains edge details without added complexity; and the Fusion Enhancement module combines a concatenation operation with a Convolutional Gated Linear Unit (ConvGLU) to strengthen feature intensities, improving edge and small-lesion segmentation. Experiments show that AdSTNet achieves notable performance gains on the ISIC 2018, BUSI, and Kvasir-SEG datasets. Compared with the original U-Net, our method attains mIoU/mDice of 83.40%/90.24% on ISIC, 71.66%/80.32% on BUSI, and 73.08%/81.91% on Kvasir-SEG. Similar improvements are observed with the other networks tested. Full article
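The dual-threshold idea (a main threshold for the lesion core, a lower one for an edge band) can be illustrated with fixed constants; AdSTNet itself predicts these threshold maps adaptively per image, so the values below are assumptions:

```python
import numpy as np

def dual_threshold(prob, t_core=0.7, t_edge=0.4):
    """Split a probability map into a confident core region and a
    surrounding edge band using two thresholds."""
    core = prob >= t_core               # high-confidence lesion core
    edge = (prob >= t_edge) & ~core     # uncertain band around the core
    return core, edge

prob = np.array([[0.90, 0.60, 0.20],
                 [0.80, 0.50, 0.10],
                 [0.30, 0.45, 0.05]])
core, edge = dual_threshold(prob)
print(core.sum(), edge.sum())  # 2 core pixels, 3 edge-band pixels
```

Treating the two regions separately lets a post-processor refine only the ambiguous edge band instead of re-deciding the whole mask.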

17 pages, 6870 KB  
Article
Edge- and Color–Texture-Aware Bag-of-Local-Features Model for Accurate and Interpretable Skin Lesion Diagnosis
by Dichao Liu and Kenji Suzuki
Diagnostics 2025, 15(15), 1883; https://doi.org/10.3390/diagnostics15151883 - 27 Jul 2025
Cited by 1 | Viewed by 1276
Abstract
Background/Objectives: Deep models have achieved remarkable progress in the diagnosis of skin lesions but face two significant drawbacks. First, they cannot effectively explain the basis of their predictions. Although attention visualization tools like Grad-CAM can create heatmaps using deep features, these features often have large receptive fields, resulting in poor spatial alignment with the input image. Second, the design of most deep models neglects interpretable traditional visual features inspired by clinical experience, such as color–texture and edge features. This study aims to propose a novel approach integrating deep learning with traditional visual features to handle these limitations. Methods: We introduce the edge- and color–texture-aware bag-of-local-features model (ECT-BoFM), which limits the receptive field of deep features to a small size and incorporates edge and color–texture information from traditional features. A non-rigid reconstruction strategy ensures that traditional features enhance rather than constrain the model’s performance. Results: Experiments on the ISIC 2018 and 2019 datasets demonstrated that ECT-BoFM yields precise heatmaps and achieves high diagnostic performance, outperforming state-of-the-art methods. Furthermore, training models using only a small number of the most predictive patches identified by ECT-BoFM achieved diagnostic performance comparable to that obtained using full images, demonstrating its efficiency in exploring key clues. Conclusions: ECT-BoFM successfully combines deep learning and traditional visual features, addressing the interpretability and diagnostic accuracy challenges of existing methods. ECT-BoFM provides an interpretable and accurate framework for skin lesion diagnosis, advancing the integration of AI in dermatological research and clinical applications. Full article

18 pages, 1995 KB  
Article
A U-Shaped Architecture Based on Hybrid CNN and Mamba for Medical Image Segmentation
by Xiaoxuan Ma, Yingao Du and Dong Sui
Appl. Sci. 2025, 15(14), 7821; https://doi.org/10.3390/app15147821 - 11 Jul 2025
Cited by 4 | Viewed by 3007
Abstract
Accurate medical image segmentation plays a critical role in clinical diagnosis, treatment planning, and a wide range of healthcare applications. Although U-shaped CNNs and Transformer-based architectures have shown promise, CNNs struggle to capture long-range dependencies, whereas Transformers suffer from quadratic growth in computational cost as image resolution increases. To address these issues, we propose HCMUNet, a novel medical image segmentation model that innovatively combines the local feature extraction capabilities of CNNs with the efficient long-range dependency modeling of Mamba, enhancing feature representation while reducing computational cost. In addition, HCMUNet features a redesigned skip connection and a novel attention module that integrates multi-scale features to recover spatial details lost during down-sampling and to promote richer cross-dimensional interactions. HCMUNet achieves Dice Similarity Coefficients (DSC) of 90.32%, 81.52%, and 92.11% on the ISIC 2018, Synapse multi-organ, and ACDC datasets, respectively, outperforming baseline methods by 0.65%, 1.05%, and 1.39%. Furthermore, HCMUNet consistently outperforms U-Net and Swin-UNet, achieving average Dice score improvements of approximately 5% and 2% across the evaluated datasets. These results collectively affirm the effectiveness and reliability of the proposed model across different segmentation tasks. Full article

24 pages, 5169 KB  
Article
A Dual-Headed Teacher–Student Framework with an Uncertainty-Guided Mechanism for Semi-Supervised Skin Lesion Segmentation
by Changman Zou, Wang-Su Jeon, Hye-Rim Ju and Sang-Yong Rhee
Electronics 2025, 14(5), 984; https://doi.org/10.3390/electronics14050984 - 28 Feb 2025
Cited by 4 | Viewed by 3081
Abstract
Medical image segmentation is a challenging task due to limited annotated data, complex lesion boundaries, and the inherent variability in medical images. These challenges make accurate and robust segmentation crucial for clinical applications. In this study, we propose the Uncertainty-Driven Auxiliary Mean Teacher (UDAMT) model, a novel semi-supervised framework specifically designed for skin lesion segmentation. Our approach employs a dual-headed teacher–student architecture with an uncertainty-guided mechanism, enhancing feature learning and boundary precision. Extensive experiments on the ISIC 2016, ISIC 2017, and ISIC 2018 datasets demonstrate that UDAMT achieves significant improvements over state-of-the-art methods, with increases of 1.17 percentage points in the Dice coefficient and 1.31 percentage points in mean Intersection over Union (mIoU) under low-label settings (5% labeled data). Furthermore, UDAMT requires 12.9 M parameters, which is slightly higher than the baseline model (12.5 M) but significantly lower than MT (14.8 M) and UAMT (15.2 M). It also achieves an inference time of 25.7 ms per image, ensuring computational efficiency. Ablation studies validate the contributions of each component, and cross-dataset evaluations on the PH2 benchmark confirm robustness to small lesions. This work provides a scalable and efficient solution for semi-supervised medical image segmentation, balancing accuracy, efficiency, and clinical applicability. Full article
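The teacher-student scheme above follows the mean-teacher pattern, in which the teacher's weights track an exponential moving average (EMA) of the student's; a minimal sketch (the decay value is illustrative):

```python
import numpy as np

def ema_update(teacher, student, alpha=0.99):
    """Mean-teacher update: after each training step the teacher becomes
    alpha * teacher + (1 - alpha) * student, per weight tensor."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]

teacher = [np.zeros(3)]   # toy single-tensor "model"
student = [np.ones(3)]    # student weights held fixed for the demo
for _ in range(3):
    teacher = ema_update(teacher, student, alpha=0.5)
print(teacher[0])  # [0.875 0.875 0.875]
```

The averaged teacher produces smoother, lower-variance targets than the raw student, which is what stabilizes training on the unlabeled portion of the data.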

20 pages, 1885 KB  
Article
Highlighting the Advanced Capabilities and the Computational Efficiency of DeepLabV3+ in Medical Image Segmentation: An Ablation Study
by Ioannis Prokopiou and Panagiota Spyridonos
BioMedInformatics 2025, 5(1), 10; https://doi.org/10.3390/biomedinformatics5010010 - 14 Feb 2025
Cited by 6 | Viewed by 4899
Abstract
Background: In clinical practice, identifying the location and extent of tumors and lesions is crucial for disease diagnosis and treatment. Artificial intelligence, particularly deep neural networks, offers precise and automated segmentation, yet limited data and high computational demands often hinder its application. Transfer learning helps mitigate these challenges by significantly reducing computational costs, although applying these models can still be resource intensive. This study aims to present a flexible and computationally efficient architecture that leverages transfer learning and delivers highly accurate results across various medical imaging problems. Methods: We evaluated three datasets with varying similarities to ImageNet: ISIC 2018 (skin lesions), CBIS-DDSM (breast masses), and the Shenzhen and Montgomery CXR Set (lung segmentation). An ablation study on ISIC 2018 tested various pre-trained backbones, architectures, and loss functions. Results: The optimal configuration—DeepLabV3+ with a pre-trained ResNet50 backbone and Log-Cosh Dice loss—was validated on the remaining datasets, achieving state-of-the-art results. Conclusions: Computationally simpler architectures can deliver robust performance without extensive resources, establishing DeepLabV3+ with the ResNet50 backbone as a baseline for future studies. In the medical domain, enhancing data quality is more critical for improving segmentation accuracy than increasing model complexity. Full article
(This article belongs to the Section Applied Biomedical Data Science)
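The Log-Cosh Dice loss named in the abstract above can be sketched in a few lines (an illustrative NumPy version, not the study's training code; the smoothing constant is an assumed value):

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    # Soft Dice loss: 1 - 2|P∩T| / (|P| + |T|), with eps for stability.
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def log_cosh_dice_loss(pred, target):
    # log(cosh(x)) wraps the Dice loss: roughly x^2/2 near zero and
    # roughly |x| - log 2 for large x, smoothing the loss landscape.
    return float(np.log(np.cosh(dice_loss(pred, target))))
```

A perfect prediction drives the loss to zero, while a fully disjoint prediction yields log(cosh(1)) ≈ 0.434, so the wrapper preserves the Dice ordering while tempering its gradients.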

26 pages, 21880 KB  
Article
Explainable AI-Based Skin Cancer Detection Using CNN, Particle Swarm Optimization and Machine Learning
by Syed Adil Hussain Shah, Syed Taimoor Hussain Shah, Roa’a Khaled, Andrea Buccoliero, Syed Baqir Hussain Shah, Angelo Di Terlizzi, Giacomo Di Benedetto and Marco Agostino Deriu
J. Imaging 2024, 10(12), 332; https://doi.org/10.3390/jimaging10120332 - 22 Dec 2024
Cited by 31 | Viewed by 8500
Abstract
Skin cancer is among the most prevalent cancers globally, emphasizing the need for early detection and accurate diagnosis to improve outcomes. Traditional diagnostic methods, based on visual examination, are subjective, time-intensive, and require specialized expertise. Current artificial intelligence (AI) approaches for skin cancer detection face challenges such as computational inefficiency, lack of interpretability, and reliance on standalone CNN architectures. To address these limitations, this study proposes a comprehensive pipeline combining transfer learning, feature selection, and machine-learning algorithms to improve detection accuracy. Multiple pretrained CNN models were evaluated, with Xception emerging as the optimal choice for its balance of computational efficiency and performance. An ablation study further validated the effectiveness of freezing task-specific layers within the Xception architecture. Feature dimensionality was optimized using Particle Swarm Optimization, reducing dimensions from 1024 to 508, significantly enhancing computational efficiency. Machine-learning classifiers, including Subspace KNN and Medium Gaussian SVM, further improved classification accuracy. Evaluated on the ISIC 2018 and HAM10000 datasets, the proposed pipeline achieved impressive accuracies of 98.5% and 86.1%, respectively. Moreover, Explainable-AI (XAI) techniques, such as Grad-CAM, LIME, and Occlusion Sensitivity, enhanced interpretability. This approach provides a robust, efficient, and interpretable solution for automated skin cancer diagnosis in clinical applications. Full article
(This article belongs to the Special Issue Deep Learning in Image Analysis: Progress and Challenges)
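The Particle Swarm Optimization step described above (selecting a compact feature subset from CNN embeddings) can be sketched as a binary PSO. This is a toy illustration, not the authors' pipeline: the fitness function below is a stand-in for a classifier's cross-validation score, and the swarm hyperparameters are assumed values:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

def pso_feature_select(relevance, n_particles=20, iters=50,
                       w=0.7, c1=1.5, c2=1.5):
    """Binary PSO: each particle is a feature mask. The toy fitness rewards
    relevant features and penalizes subset size (proxy for a CV score)."""
    d = len(relevance)

    def fitness(mask):
        return relevance[mask.astype(bool)].sum() - 0.1 * mask.sum()

    pos = rng.integers(0, 2, size=(n_particles, d)).astype(float)
    vel = rng.normal(size=(n_particles, d)) * 0.1
    pbest, pbest_f = pos.copy(), np.array([fitness(p) for p in pos])
    gbest = pbest[pbest_f.argmax()].copy()
    gbest_f = pbest_f.max()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, d))
        # Standard velocity update: inertia + cognitive + social terms.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        # Sigmoid transfer maps velocity to a bit-flip probability.
        prob = 1.0 / (1.0 + np.exp(-vel))
        pos = (rng.random((n_particles, d)) < prob).astype(float)
        f = np.array([fitness(p) for p in pos])
        better = f > pbest_f
        pbest[better], pbest_f[better] = pos[better], f[better]
        if f.max() > gbest_f:
            gbest, gbest_f = pos[f.argmax()].copy(), f.max()
    return gbest.astype(bool)
```

In the paper's setting the mask would index the 1024-dimensional Xception features, with the classifier's accuracy as the true fitness, yielding the reported reduction to 508 dimensions.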
