Deep Learning in Biomedical Image Segmentation and Classification: Advancements, Challenges and Applications

A special issue of Journal of Imaging (ISSN 2313-433X). This special issue belongs to the section "Medical Imaging".

Deadline for manuscript submissions: closed (31 March 2025) | Viewed by 28,505

Special Issue Editor


Dr. Ebrahim Karami
Guest Editor
Department of Engineering and Applied Sciences, Memorial University, St. John’s, NL A1B 3X5, Canada
Interests: machine learning; computer vision; biomedical engineering; wireless communications and networks; remote sensing

Special Issue Information

Dear Colleagues,

This Special Issue aims to explore the latest advancements, challenges, and applications of deep learning techniques in the field of biomedical image segmentation and classification. Biomedical image analysis plays a crucial role in various medical domains, enabling the accurate identification, segmentation, and classification of structures, organs, and anomalies. Deep learning, with its ability to learn complex features and patterns from large-scale datasets, has revolutionized biomedical image analysis, offering significant improvements in segmentation and classification accuracy and efficiency. This Special Issue welcomes original research papers, review articles, and case studies that present novel deep learning methodologies, architectures, and algorithms, as well as their practical applications and implications in biomedical image segmentation and classification. The collection of contributions will provide a comprehensive overview of the current state-of-the-art techniques, identify challenges and limitations, and pave the way for future research directions in this rapidly evolving field.

Dr. Ebrahim Karami
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, authors can proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Journal of Imaging is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and written in good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • biomedical image segmentation
  • biomedical image classification
  • computer-aided diagnosis
  • deep learning for ultrasound imaging
  • deep learning for infrared imaging
  • deep learning for MRI and FMRI imaging
  • deep learning for X-ray imaging
  • lesion detection with deep learning
  • cancer detection and classification with deep learning
  • biomedical image denoising and enhancement using deep learning
  • deep learning for biomedical object localization

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (12 papers)

Research

15 pages, 2054 KiB  
Article
Deep-Learning Approaches for Cervical Cytology Nuclei Segmentation in Whole Slide Images
by Andrés Mosquera-Zamudio, Sandra Cancino, Guillermo Cárdenas-Montoya, Juan D. Garcia-Arteaga, Carlos Zambrano-Betancourt and Rafael Parra-Medina
J. Imaging 2025, 11(5), 137; https://doi.org/10.3390/jimaging11050137 - 29 Apr 2025
Viewed by 121
Abstract
Whole-slide imaging (WSI) in cytopathology poses challenges related to segmentation accuracy, computational efficiency, and image acquisition artifacts. This study aims to evaluate the performance of deep-learning models for instance segmentation in cervical cytology, benchmarking them against state-of-the-art methods on both public and institutional datasets. We tested three architectures—U-Net, vision transformer (ViT), and Detectron2—and evaluated their performance on the ISBI 2014 and CNseg datasets using panoptic quality (PQ), the Dice similarity coefficient (DSC), and intersection over union (IoU). All models were trained on CNseg and tested on an independent institutional dataset. Data preprocessing involved manual annotation using QuPath, patch extraction guided by GeoJSON files, and exclusion of regions containing less than 60% cytologic material. Our models achieved superior segmentation performance on public datasets, reaching up to 98% PQ. Performance decreased on the institutional dataset, likely due to differences in image acquisition and the presence of blurred nuclei. Nevertheless, the models were able to detect blurred nuclei, highlighting their robustness in suboptimal imaging conditions. In conclusion, the proposed models offer an accurate and efficient solution for instance segmentation in cytology WSI. These results support the development of reliable AI-powered tools for digital cytology, with potential applications in automated screening and diagnostic workflows. Full article
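
For reference, the overlap metrics reported above (DSC and IoU) can be computed from a pair of binary masks as in the minimal NumPy sketch below; the function and variable names are ours, not the authors'.

    import numpy as np

    def dice_and_iou(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7):
        """Dice similarity coefficient and intersection-over-union for binary masks."""
        pred, target = pred.astype(bool), target.astype(bool)
        inter = np.logical_and(pred, target).sum()
        union = np.logical_or(pred, target).sum()
        dice = (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
        iou = (inter + eps) / (union + eps)
        return dice, iou

    # Two partially overlapping 4x4 masks: Dice = 0.5, IoU ~ 0.33
    a = np.zeros((4, 4)); a[1:3, 1:3] = 1
    b = np.zeros((4, 4)); b[1:3, 2:4] = 1
    print(dice_and_iou(a, b))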

14 pages, 3375 KiB  
Article
YOLO-Tryppa: A Novel YOLO-Based Approach for Rapid and Accurate Detection of Small Trypanosoma Parasites
by Davide Antonio Mura, Luca Zedda, Andrea Loddo and Cecilia Di Ruberto
J. Imaging 2025, 11(4), 117; https://doi.org/10.3390/jimaging11040117 - 15 Apr 2025
Viewed by 271
Abstract
Early detection of Trypanosoma parasites is critical for the prompt treatment of trypanosomiasis, a neglected tropical disease that poses severe health and socioeconomic challenges in affected regions. To address the limitations of traditional manual microscopy and prior automated methods, we propose YOLO-Tryppa, a novel YOLO-based framework specifically engineered for the rapid and accurate detection of small Trypanosoma parasites in microscopy images. YOLO-Tryppa incorporates ghost convolutions to reduce computational complexity while maintaining robust feature extraction and introduces a dedicated P2 prediction head to improve the localization of small objects. By eliminating the redundant P5 prediction head, the proposed approach achieves a significantly lower parameter count and reduced GFLOPs. Experimental results on the public Tryp dataset demonstrate that YOLO-Tryppa outperforms the previous state of the art by achieving an AP50 of 71.3%, thereby setting a new benchmark for both accuracy and efficiency. These improvements make YOLO-Tryppa particularly well-suited for deployment in resource-constrained settings, facilitating more rapid and reliable diagnostic practices. Full article
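
A sketch of the ghost-convolution idea (from GhostNet) that the abstract refers to: a cheap primary convolution produces half the output channels, and inexpensive depthwise operations synthesize the remaining "ghost" maps. The channel split, kernel sizes, and activations below are our assumptions, not YOLO-Tryppa's exact configuration.

    import torch
    import torch.nn as nn

    class GhostConv(nn.Module):
        def __init__(self, c_in: int, c_out: int, k: int = 1):
            super().__init__()
            c_half = c_out // 2
            self.primary = nn.Sequential(  # ordinary conv for half the channels
                nn.Conv2d(c_in, c_half, k, padding=k // 2, bias=False),
                nn.BatchNorm2d(c_half), nn.SiLU())
            self.cheap = nn.Sequential(    # depthwise 3x3 generates "ghost" maps
                nn.Conv2d(c_half, c_half, 3, padding=1, groups=c_half, bias=False),
                nn.BatchNorm2d(c_half), nn.SiLU())
        def forward(self, x):
            y = self.primary(x)
            return torch.cat([y, self.cheap(y)], dim=1)

    x = torch.randn(1, 64, 80, 80)
    print(GhostConv(64, 128)(x).shape)  # torch.Size([1, 128, 80, 80])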

15 pages, 3743 KiB  
Article
Blink Detection Using 3D Convolutional Neural Architectures and Analysis of Accumulated Frame Predictions
by George Nousias, Konstantinos K. Delibasis and Georgios Labiris
J. Imaging 2025, 11(1), 27; https://doi.org/10.3390/jimaging11010027 - 19 Jan 2025
Viewed by 1176
Abstract
Blink detection is considered a useful indicator of both clinical conditions and drowsiness state. In this work, we propose and compare deep learning architectures for the task of detecting blinks in video frame sequences. The first step is the training and application of an eye detector that extracts the eye regions from each video frame. The cropped eye regions are organized as three-dimensional (3D) input whose third dimension spans 300 ms of time. Two different 3D convolutional neural networks are utilized (a simple 3D CNN and a 3D ResNet), as well as a 3D autoencoder combined with a classifier coupled to the latent space. Finally, we propose the use of a frame prediction accumulator combined with morphological processing and watershed segmentation to detect blinks and determine their start and stop frames in previously unseen videos. The proposed framework was trained on nine (9) different participants and tested on eight (8) different ones, with a total of 162,400 frames and 1172 blinks for each eye. The start and end frame of each blink in the dataset was annotated by a specialized ophthalmologist. Quantitative comparison with state-of-the-art blink detection methodologies provides favorable results for the proposed neural architectures coupled with the prediction accumulator, with the 3D ResNet being the best as well as the fastest performer. Full article
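
To make the 3D input organization concrete, the toy PyTorch classifier below takes a short stack of cropped eye frames, shaped (batch, 1, T, H, W), and emits blink/no-blink logits; the layer sizes are illustrative and not the authors' architectures.

    import torch
    import torch.nn as nn

    class Blink3DCNN(nn.Module):
        def __init__(self, num_classes: int = 2):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool3d((1, 2, 2)),            # pool space, keep time
                nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool3d(1))
            self.classifier = nn.Linear(32, num_classes)
        def forward(self, x):
            return self.classifier(self.features(x).flatten(1))

    clip = torch.randn(4, 1, 8, 64, 64)  # 4 clips of 8 eye crops (~300 ms)
    print(Blink3DCNN()(clip).shape)      # torch.Size([4, 2])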

23 pages, 7813 KiB  
Article
The Use of Hybrid CNN-RNN Deep Learning Models to Discriminate Tumor Tissue in Dynamic Breast Thermography
by Andrés Munguía-Siu, Irene Vergara and Juan Horacio Espinoza-Rodríguez
J. Imaging 2024, 10(12), 329; https://doi.org/10.3390/jimaging10120329 - 21 Dec 2024
Viewed by 1717
Abstract
Breast cancer is one of the leading causes of death for women worldwide, and early detection can help reduce the death rate. Infrared thermography has gained popularity as a non-invasive and rapid method for detecting this pathology and can be further enhanced by applying neural networks to extract spatial, and even temporal, information from breast thermographic images when they are acquired sequentially. In this study, we evaluated hybrid convolutional-recurrent neural network (CNN-RNN) models based on five state-of-the-art pre-trained CNN architectures coupled with three RNNs to discern tumor abnormalities in dynamic breast thermographic images. The hybrid architecture that achieved the best performance for detecting breast cancer was VGG16-LSTM, which showed accuracy (ACC), sensitivity (SENS), and specificity (SPEC) of 95.72%, 92.76%, and 98.68%, respectively, with a CPU runtime of 3.9 s. The hybrid architecture with the fastest CPU runtime was AlexNet-RNN at 0.61 s; its performance was lower (ACC: 80.59%, SENS: 68.52%, SPEC: 92.76%) but still superior to that of stand-alone AlexNet (ACC: 69.41%, SENS: 52.63%, SPEC: 86.18%) at 0.44 s. Our findings show that hybrid CNN-RNN models outperform stand-alone CNN models, indicating that temporal data recovery from dynamic breast thermographs is possible without significantly compromising classifier runtime. Full article
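
A minimal PyTorch sketch of the hybrid CNN-RNN pattern, assuming per-frame VGG16 features pooled to a 512-vector and fed to an LSTM whose last hidden state is classified; the hidden size and classification head are our choices, not the reported configuration.

    import torch
    import torch.nn as nn
    from torchvision.models import vgg16

    class VGG16LSTM(nn.Module):
        def __init__(self, hidden: int = 128, num_classes: int = 2):
            super().__init__()
            backbone = vgg16(weights=None)  # load pretrained weights in practice
            self.cnn = nn.Sequential(backbone.features,
                                     nn.AdaptiveAvgPool2d(1), nn.Flatten())
            self.rnn = nn.LSTM(input_size=512, hidden_size=hidden, batch_first=True)
            self.head = nn.Linear(hidden, num_classes)
        def forward(self, x):                # x: (batch, time, 3, H, W)
            b, t = x.shape[:2]
            feats = self.cnn(x.flatten(0, 1)).view(b, t, -1)  # per-frame features
            _, (h, _) = self.rnn(feats)
            return self.head(h[-1])          # classify the last hidden state

    seq = torch.randn(2, 5, 3, 224, 224)     # 2 sequences of 5 thermograms
    print(VGG16LSTM()(seq).shape)            # torch.Size([2, 2])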

22 pages, 838 KiB  
Article
MediScan: A Framework of U-Health and Prognostic AI Assessment on Medical Imaging
by Sibtain Syed, Rehan Ahmed, Arshad Iqbal, Naveed Ahmad and Mohammed Ali Alshara
J. Imaging 2024, 10(12), 322; https://doi.org/10.3390/jimaging10120322 - 13 Dec 2024
Viewed by 2132
Abstract
With technological advancements, remarkable progress has been made in the convergence of health sciences and Artificial Intelligence (AI). Modern health systems have been proposed to ease patient diagnostics. However, the challenge is to provide AI-based precautions to patients and doctors for more accurate risk assessment. The proposed healthcare system aims to integrate the use cases and primary functions of patients, doctors, laboratories, pharmacies, and administrative personnel onto a single platform. The proposed framework can also process microscopic images, CT scans, X-rays, and MRI to classify malignancy and give doctors a set of AI precautions for patient risk assessment. The framework incorporates various DCNN models for identifying different forms of tumors and fractures in the human body (brain, bones, lungs, kidneys, and skin) and generates precautions with the help of a fine-tuned Large Language Model (LLM), Generative Pretrained Transformer 4 (GPT-4). With enough training data, DCNNs can learn highly representative, data-driven, hierarchical image features. The GPT-4 model was selected for generating precautions due to its explanation, reasoning, memory, and accuracy on prior medical assessments and research studies. Classification models are evaluated by classification report (i.e., recall, precision, F1 score, support, accuracy, and macro and weighted averages) and confusion matrix, and have shown robust performance compared to conventional schemes. Full article
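
For reference, the classification report and confusion matrix named above are standard scikit-learn outputs; the class names and labels below are hypothetical, purely to show the call.

    import numpy as np
    from sklearn.metrics import classification_report, confusion_matrix

    classes = ["glioma", "meningioma", "no_tumor"]      # illustrative classes
    y_true = np.array([0, 0, 1, 1, 2, 2, 2, 1])
    y_pred = np.array([0, 1, 1, 1, 2, 2, 0, 1])

    print(classification_report(y_true, y_pred, target_names=classes))
    print(confusion_matrix(y_true, y_pred))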

17 pages, 6612 KiB  
Article
Semi-Supervised Medical Image Segmentation Based on Deep Consistent Collaborative Learning
by Xin Zhao and Wenqi Wang
J. Imaging 2024, 10(5), 118; https://doi.org/10.3390/jimaging10050118 - 14 May 2024
Cited by 2 | Viewed by 2549
Abstract
In the realm of medical image analysis, the cost associated with acquiring accurately labeled data is prohibitively high. To address the issue of label scarcity, semi-supervised learning methods are employed, utilizing unlabeled data alongside a limited set of labeled data. This paper presents a novel semi-supervised medical segmentation framework, DCCLNet (deep consistency collaborative learning UNet), grounded in deep consistent co-learning. The framework synergistically integrates consistency learning from feature and input perturbations, coupled with collaborative training between a CNN (convolutional neural network) and a ViT (vision transformer), to capitalize on the learning advantages offered by these two distinct paradigms. Feature perturbation involves the application of auxiliary decoders with varied feature disturbances to the main CNN backbone, enhancing the robustness of the CNN backbone through consistency constraints generated by the auxiliary and main decoders. Input perturbation employs an MT (mean teacher) architecture wherein the main network serves as the student model guided by a teacher model subjected to input perturbations. Collaborative training aims to improve the accuracy of the main networks by encouraging mutual learning between the CNN and ViT. Experiments conducted on the publicly available ACDC (Automated Cardiac Diagnosis Challenge) and Prostate datasets yielded Dice coefficients of 0.890 and 0.812, respectively. Additionally, comprehensive ablation studies were performed to demonstrate the effectiveness of each methodological contribution in this study. Full article
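
A minimal sketch of the mean-teacher ingredient described above: the teacher is an exponential moving average (EMA) of the student, and a consistency term penalizes disagreement on perturbed inputs. The decay rate and softmax-MSE form are generic defaults, not necessarily DCCLNet's.

    import torch
    import torch.nn.functional as F

    @torch.no_grad()
    def ema_update(teacher: torch.nn.Module, student: torch.nn.Module,
                   alpha: float = 0.99) -> None:
        """Teacher weights <- EMA of student weights (mean-teacher update)."""
        for t_p, s_p in zip(teacher.parameters(), student.parameters()):
            t_p.mul_(alpha).add_(s_p, alpha=1.0 - alpha)

    def consistency_loss(student_logits, teacher_logits):
        """MSE between softened predictions on (perturbed) unlabeled inputs."""
        return F.mse_loss(student_logits.softmax(dim=1),
                          teacher_logits.softmax(dim=1))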

14 pages, 4516 KiB  
Article
Elevating Chest X-ray Image Super-Resolution with Residual Network Enhancement
by Anudari Khishigdelger, Ahmed Salem and Hyun-Soo Kang
J. Imaging 2024, 10(3), 64; https://doi.org/10.3390/jimaging10030064 - 4 Mar 2024
Cited by 2 | Viewed by 3176
Abstract
Chest X-ray (CXR) imaging plays a pivotal role in diagnosing various pulmonary diseases, which account for a significant portion of the global mortality rate, as recognized by the World Health Organization (WHO). Medical practitioners routinely depend on CXR images to identify anomalies and make critical clinical decisions. Dramatic improvements in super-resolution (SR) have been achieved by applying deep learning techniques. However, some SR methods are difficult to apply when, as in X-ray image super-resolution, the low-resolution inputs and features contain abundant low-frequency information. In this paper, we introduce an advanced deep learning-based SR approach that incorporates the innovative residual-in-residual (RIR) structure to augment the diagnostic potential of CXR imaging. Specifically, we propose a light network consisting of residual groups built from residual blocks, in which multiple skip connections allow abundant low-frequency information to bypass the trunk efficiently, letting the main network concentrate on learning high-frequency information. In addition, we adopted dense feature fusion within residual groups and designed highly parallel residual blocks for better feature extraction. Our proposed methods exhibit superior performance compared to existing state-of-the-art (SOTA) SR methods, delivering enhanced accuracy and notable visual improvements, as evidenced by our results. Full article
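
The residual-in-residual structure amounts to nested skip connections; the toy PyTorch trunk below (depths and widths are our assumptions, and upsampling is omitted) shows short, group-level, and long skips that let low-frequency content bypass the main path.

    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        def __init__(self, ch: int):
            super().__init__()
            self.body = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
                                      nn.Conv2d(ch, ch, 3, padding=1))
        def forward(self, x):
            return x + self.body(x)          # short skip

    class ResidualGroup(nn.Module):
        def __init__(self, ch: int, n_blocks: int = 4):
            super().__init__()
            self.blocks = nn.Sequential(*[ResidualBlock(ch) for _ in range(n_blocks)])
        def forward(self, x):
            return x + self.blocks(x)        # group-level skip

    class RIRNet(nn.Module):
        def __init__(self, ch: int = 64, n_groups: int = 3):
            super().__init__()
            self.head = nn.Conv2d(1, ch, 3, padding=1)
            self.trunk = nn.Sequential(*[ResidualGroup(ch) for _ in range(n_groups)])
            self.tail = nn.Conv2d(ch, 1, 3, padding=1)
        def forward(self, x):
            f = self.head(x)
            return self.tail(f + self.trunk(f))  # long skip

    print(RIRNet()(torch.randn(1, 1, 64, 64)).shape)  # torch.Size([1, 1, 64, 64])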

17 pages, 1236 KiB  
Article
A CNN Hyperparameters Optimization Based on Particle Swarm Optimization for Mammography Breast Cancer Classification
by Khadija Aguerchi, Younes Jabrane, Maryam Habba and Amir Hajjam El Hassani
J. Imaging 2024, 10(2), 30; https://doi.org/10.3390/jimaging10020030 - 24 Jan 2024
Cited by 19 | Viewed by 4498
Abstract
Breast cancer is considered one of the most common types of cancer among females worldwide, with a high mortality rate. Medical imaging is still one of the most reliable tools for detecting breast cancer. Unfortunately, manual image interpretation is time-consuming. This paper proposes a new deep learning method based on Convolutional Neural Networks (CNNs). Convolutional Neural Networks are widely used for image classification. However, determining accurate hyperparameters and architectures is still a challenging task. In this work, a highly accurate CNN model for detecting breast cancer by mammography was developed. The proposed method uses the Particle Swarm Optimization (PSO) algorithm to search for suitable hyperparameters and an architecture for the CNN model. The CNN model using PSO achieved success rates of 98.23% and 97.98% on the DDSM and MIAS datasets, respectively. The experimental results showed that the proposed CNN model gave the best accuracy values in comparison with other studies in the field. As a result, CNN models for mammography classification can now be created automatically. The proposed method can be considered a powerful technique for breast cancer prediction. Full article
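
A generic PSO loop over two CNN hyperparameters (learning rate and log2 of a filter count). The fitness function here is a toy surrogate standing in for "train the CNN and return its validation error", and every constant is illustrative rather than the paper's setting.

    import numpy as np

    rng = np.random.default_rng(0)

    def fitness(position):
        """Surrogate objective; replace with CNN training + validation error."""
        lr, log2_filters = position
        return (np.log10(lr) + 3) ** 2 + (log2_filters - 5) ** 2

    n_particles, n_iters, w, c1, c2 = 10, 30, 0.7, 1.5, 1.5
    lo, hi = np.array([1e-5, 3.0]), np.array([1e-1, 8.0])  # lr, log2(#filters)
    pos = rng.uniform(lo, hi, size=(n_particles, 2))
    vel = np.zeros_like(pos)
    pbest, pbest_f = pos.copy(), np.array([fitness(p) for p in pos])
    gbest = pbest[pbest_f.argmin()]

    for _ in range(n_iters):
        r1, r2 = rng.random((2, n_particles, 1))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        f = np.array([fitness(p) for p in pos])
        better = f < pbest_f
        pbest[better], pbest_f[better] = pos[better], f[better]
        gbest = pbest[pbest_f.argmin()]

    print("best (lr, log2 filters):", gbest)  # converges near (1e-3, 5)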

15 pages, 1596 KiB  
Article
Lesion Detection in Optical Coherence Tomography with Transformer-Enhanced Detector
by Hanya Ahmed, Qianni Zhang, Ferranti Wong, Robert Donnan and Akram Alomainy
J. Imaging 2023, 9(11), 244; https://doi.org/10.3390/jimaging9110244 - 7 Nov 2023
Viewed by 2365
Abstract
Optical coherence tomography (OCT) is an emerging imaging tool in healthcare, with common applications in ophthalmology for the detection of retinal diseases and in dentistry for the early detection of tooth decay. Speckle noise is ubiquitous in OCT images and can hinder diagnosis by clinicians. In this paper, a region-based, deep learning framework for the detection of anomalies in OCT-acquired images is proposed. The core of the framework is Transformer-Enhanced Detection (TED), which includes attention gates (AGs) to ensure focus is placed on the foreground while identifying and removing noise artifacts as anomalies. TED was designed to detect the different types of anomalies commonly present in OCT images for diagnostic purposes and thus aid clinical interpretation. Extensive quantitative evaluations were performed to measure the performance of TED against current, widely known deep learning detection algorithms. Three different datasets were tested: two dental and one CT (containing scans of lung nodules, livers, etc.). The results showed that the approach verifiably detected tooth decay and numerous lesions across two modalities, achieving superior performance compared to several well-known algorithms. The proposed method improved the detection accuracy by 16–22% and the Intersection over Union (IoU) by 10% for both dentistry datasets. For the CT dataset, the performance metrics were similarly improved by 9% and 20%, respectively. Full article
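
The attention gates (AGs) mentioned above follow the additive-attention pattern popularized by Attention U-Net; the sketch below is that generic pattern, with our channel sizes, rather than TED's exact gate.

    import torch
    import torch.nn as nn

    class AttentionGate(nn.Module):
        """Gating signal g re-weights skip features x, emphasizing foreground."""
        def __init__(self, c_x: int, c_g: int, c_mid: int):
            super().__init__()
            self.wx = nn.Conv2d(c_x, c_mid, 1)
            self.wg = nn.Conv2d(c_g, c_mid, 1)
            self.psi = nn.Sequential(nn.Conv2d(c_mid, 1, 1), nn.Sigmoid())
        def forward(self, x, g):
            att = self.psi(torch.relu(self.wx(x) + self.wg(g)))  # (B,1,H,W)
            return x * att

    x = torch.randn(1, 64, 32, 32)   # skip-connection features
    g = torch.randn(1, 128, 32, 32)  # gating signal (already resized to match)
    print(AttentionGate(64, 128, 32)(x, g).shape)  # torch.Size([1, 64, 32, 32])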

Other

13 pages, 1782 KiB  
Brief Report
Brain Age Prediction Using 2D Projections Based on Higher-Order Statistical Moments and Eigenslices from 3D Magnetic Resonance Imaging Volumes
by Johan Jönemo and Anders Eklund
J. Imaging 2023, 9(12), 271; https://doi.org/10.3390/jimaging9120271 - 6 Dec 2023
Cited by 1 | Viewed by 2178
Abstract
Brain age prediction from 3D MRI volumes using deep learning has recently become a popular research topic, as brain age has been shown to be an important biomarker. Training deep networks can be very computationally demanding for large datasets like the U.K. Biobank (currently 29,035 subjects). In our previous work, it was demonstrated that using a few 2D projections (mean and standard deviation along three axes) instead of each full 3D volume leads to much faster training at the cost of a reduction in prediction accuracy. Here, we investigated if another set of 2D projections, based on higher-order statistical central moments and eigenslices, leads to a higher accuracy. Our results show that higher-order moments do not lead to a higher accuracy, but that eigenslices provide a small improvement. We also show that an ensemble of such models provides further improvement. Full article
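
A sketch of the projection idea: per-axis statistics collapse the 3D volume to 2D maps. The moment maps below use NumPy/SciPy; the final lines show one plausible SVD-based reading of "eigenslices", which is our assumption rather than the authors' exact construction.

    import numpy as np
    from scipy import stats

    def projections_2d(vol: np.ndarray, axis: int = 0):
        """Mean/std (earlier work) plus higher-order central-moment maps."""
        return {"mean": vol.mean(axis=axis),
                "std": vol.std(axis=axis),
                "skew": stats.skew(vol, axis=axis),
                "kurtosis": stats.kurtosis(vol, axis=axis)}

    vol = np.random.rand(96, 96, 96)   # stand-in for a registered MRI volume
    maps = projections_2d(vol, axis=2)
    print({k: v.shape for k, v in maps.items()})  # all (96, 96)

    # One reading of "eigenslices": principal components of the volume
    # unfolded along the chosen axis (our assumption).
    u, s, vt = np.linalg.svd(vol.reshape(vol.shape[0], -1), full_matrices=False)
    eigenslice0 = vt[0].reshape(vol.shape[1:])    # first eigenslice, (96, 96)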

9 pages, 2411 KiB  
Brief Report
WindowNet: Learnable Windows for Chest X-ray Classification
by Alessandro Wollek, Sardi Hyska, Bastian Sabel, Michael Ingrisch and Tobias Lasser
J. Imaging 2023, 9(12), 270; https://doi.org/10.3390/jimaging9120270 - 6 Dec 2023
Viewed by 4401
Abstract
Public chest X-ray (CXR) data sets are commonly compressed to a lower bit depth to reduce their size, potentially hiding subtle diagnostic features. In contrast, radiologists apply a windowing operation to the uncompressed image to enhance such subtle features. While it has been shown that windowing improves classification performance on computed tomography (CT) images, the impact of such an operation on CXR classification performance remains unclear. In this study, we show that windowing strongly improves the CXR classification performance of machine learning models and propose WindowNet, a model that learns multiple optimal window settings. Our model achieved an average AUC score of 0.812 compared with the 0.759 score of a commonly used architecture without windowing capabilities on the MIMIC data set. Full article
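
A single learnable window can be written as a soft (sigmoid) version of the radiologist's clip to [center - width/2, center + width/2]; the sketch below is our minimal rendering of one such window, whereas WindowNet itself learns multiple window settings.

    import torch
    import torch.nn as nn

    class LearnableWindow(nn.Module):
        def __init__(self, center: float = 0.5, width: float = 0.25):
            super().__init__()
            self.center = nn.Parameter(torch.tensor(center))
            self.width = nn.Parameter(torch.tensor(width))
        def forward(self, x):  # x: intensities scaled to [0, 1]
            # Steep sigmoid approximates hard clipping but stays differentiable
            return torch.sigmoid((x - self.center) / (self.width.abs() + 1e-6))

    x = torch.rand(1, 1, 8, 8)
    print(LearnableWindow()(x).shape)  # torch.Size([1, 1, 8, 8])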

9 pages, 2417 KiB  
Brief Report
Placental Vessel Segmentation Using Pix2pix Compared to U-Net
by Anouk van der Schot, Esther Sikkel, Marèll Niekolaas, Marc Spaanderman and Guido de Jong
J. Imaging 2023, 9(10), 226; https://doi.org/10.3390/jimaging9100226 - 16 Oct 2023
Cited by 4 | Viewed by 2194
Abstract
Computer-assisted technologies have made significant progress in fetoscopic laser surgery, including placental vessel segmentation. However, the intra- and inter-procedure variabilities of state-of-the-art segmentation methods remain a significant hurdle. To address this, we investigated the use of conditional generative adversarial networks (cGANs) for fetoscopic image segmentation and compared their performance with the benchmark U-Net technique for placental vessel segmentation. Two deep-learning models, U-Net and pix2pix (a popular cGAN model), were trained and evaluated using a publicly available dataset and an internal validation set. The overall results showed that the pix2pix model outperformed the U-Net model, with a Dice score of 0.80 [0.70; 0.86] versus 0.75 [0.60; 0.84] (p-value < 0.01) and an Intersection over Union (IoU) score of 0.70 [0.61; 0.77] compared to 0.66 [0.53; 0.75] (p-value < 0.01), respectively. The internal validation dataset further confirmed the superiority of the pix2pix model, which achieved Dice and IoU scores of 0.68 [0.53; 0.79] and 0.59 [0.49; 0.69] (p-value < 0.01), respectively, while the U-Net model obtained scores of 0.53 [0.49; 0.64] and 0.49 [0.17; 0.56], respectively. This study successfully compared U-Net and pix2pix models for placental vessel segmentation in fetoscopic images, demonstrating improved results with the cGAN-based approach. However, the challenge of achieving generalizability still needs to be addressed. Full article
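
For reference, the pix2pix generator objective pairs a conditional-GAN term with an L1 term (weight λ = 100 in the original pix2pix paper); the sketch below assumes patch-discriminator logits, and all shapes are illustrative.

    import torch
    import torch.nn.functional as F

    def pix2pix_generator_loss(disc_fake_logits: torch.Tensor,
                               fake: torch.Tensor, target: torch.Tensor,
                               lam: float = 100.0) -> torch.Tensor:
        """Adversarial term (discriminator should call the fake 'real')
        plus L1 keeping the predicted vessel map close to the annotation."""
        adv = F.binary_cross_entropy_with_logits(
            disc_fake_logits, torch.ones_like(disc_fake_logits))
        return adv + lam * F.l1_loss(fake, target)

    d_out = torch.randn(2, 1, 30, 30)  # e.g., a 70x70 PatchGAN's logit grid
    fake, real = torch.rand(2, 1, 256, 256), torch.rand(2, 1, 256, 256)
    print(pix2pix_generator_loss(d_out, fake, real).item())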