Deep Learning Techniques for Medical Image Analysis

A special issue of Diagnostics (ISSN 2075-4418). This special issue belongs to the section "Machine Learning and Artificial Intelligence in Diagnostics".

Deadline for manuscript submissions: 15 March 2026

Special Issue Editor


Dr. Zhuhuang Zhou
Guest Editor
Department of Biomedical Engineering, Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, China
Interests: biomedical ultrasonics; quantitative ultrasound for biological tissue characterization; ultrasound wave propagation in biological tissues; medical signal/image processing; artificial intelligence in medicine

Special Issue Information

Dear Colleagues,

In recent years, deep learning techniques have been widely used in medical image analysis. These techniques employ deep neural networks to automatically extract rich multi-level, multi-scale information (features) from image data, which is difficult for conventional machine learning techniques that rely on hand-crafted features. Deep learning approaches include supervised learning (with task-driven models), unsupervised or generative learning (with data-driven models), semi-supervised learning (with hybrid task-driven and data-driven models), reinforcement learning (with environment-driven models), and physics-informed learning (with hybrid task-driven and physics-driven models). The imaging modalities analyzed include structural imaging such as X-ray imaging, computed tomography (CT), magnetic resonance imaging (MRI), ultrasound imaging, and ultrasound computed tomography, as well as functional imaging such as functional MRI, positron emission tomography (PET), single-photon emission computed tomography (SPECT), and functional ultrasound imaging, whether two-dimensional, three-dimensional, or even four-dimensional (three-dimensional plus temporal). Applications of deep learning techniques in medical image analysis cover lesion detection and segmentation, disease diagnosis, treatment monitoring, efficacy evaluation, prognostic prediction, and even biomechanical analysis. In addition to medical image post-processing, deep learning techniques can also be applied at the front end (e.g., image reconstruction) to enhance the quality of medical imaging.

Given the high level of research interest and clinical application prospects, deep learning techniques have continued to develop, especially in the field of medical image analysis. This Special Issue aims to report on state-of-the-art deep learning techniques applied to medical image analysis. Contributions related to deep learning techniques in medical image analysis are welcome.

Dr. Zhuhuang Zhou
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Diagnostics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning
  • supervised learning
  • unsupervised learning
  • semi-supervised learning
  • self-supervised learning
  • generative learning
  • deep neural networks
  • convolutional neural networks
  • physics-informed neural networks
  • X-ray imaging
  • computed tomography (CT)
  • magnetic resonance imaging (MRI)
  • ultrasound imaging
  • ultrasound computed tomography
  • functional MRI
  • positron emission tomography (PET)
  • single-photon emission computed tomography (SPECT)
  • functional ultrasound imaging
  • image reconstruction

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (9 papers)


Research

24 pages, 5218 KiB  
Article
Convolutional Neural Network-Based Approach for Cobb Angle Measurement Using Mask R-CNN
by Marcos Villar García, José-Benito Bouza-Rodríguez and Alberto Comesaña-Campos
Diagnostics 2025, 15(9), 1066; https://doi.org/10.3390/diagnostics15091066 - 23 Apr 2025
Viewed by 196
Abstract
Background: Scoliosis is a disorder characterized by an abnormal spinal curvature, which can lead to negative effects on patients, affecting their quality of life. Given its progressive nature, the classification of scoliosis severity requires an accurate diagnosis and effective monitoring. The Cobb angle measurement method has been widely considered the gold standard for scoliosis assessment. Commonly, an expert assesses scoliosis severity manually by identifying the most tilted vertebrae of the spine. However, this method requires time and effort and presents limitations in measurement accuracy, such as intra- and inter-observer variability. Artificial intelligence provides more objective tools that are less sensitive to manual intervention, with the aim of transforming the diagnosis of scoliosis. Objectives: The objective of this study was to address three key research questions regarding automated Cobb angle quantification: “Where is the spine in this radiograph?”, “What is its exact shape?”, and “Is the proposed method accurate?”. We propose the use of the Mask R-CNN architecture for spine detection and segmentation in response to the first two questions, and a set of algorithms to tackle the third. Methods: The network’s detection and segmentation performance was evaluated through various metrics. An automated workflow for Cobb angle quantification and severity classification was developed. Finally, statistical methods were used to assess the agreement between manual and automated measurements. Results: High segmentation accuracy was achieved, with a mean Intersection over Union (mIoU) of 0.8012 and a mean precision of 0.9145. The mean absolute error (MAE) was 2.96° ± 2.60°, demonstrating high agreement. Conclusions: The results obtained in this study demonstrate the potential of the proposed automated approach in clinical scenarios, which provides experts with a clear visualization of each stage in the scoliosis assessment by overlaying the results onto the X-ray image.
(This article belongs to the Special Issue Deep Learning Techniques for Medical Image Analysis)
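The abstract above does not spell out how the Cobb angle is derived once the spine has been segmented. As a rough illustration only, the sketch below uses the textbook definition: the angle between the endplate lines of the two most oppositely tilted vertebrae. The function names, the endplate corner coordinates, and the simplification of taking the difference between the largest and smallest tilt are all assumptions made for this example and are not the authors' algorithm.

```python
import math

def endplate_tilt_deg(p_left, p_right):
    """Tilt of an endplate line relative to the horizontal, in degrees.

    p_left and p_right are hypothetical (x, y) corner points of one endplate.
    """
    dx = p_right[0] - p_left[0]
    dy = p_right[1] - p_left[1]
    return math.degrees(math.atan2(dy, dx))

def cobb_angle_deg(tilts):
    """Simplified Cobb angle: spread between the two most opposed tilts."""
    if len(tilts) < 2:
        raise ValueError("need at least two endplate tilts")
    return max(tilts) - min(tilts)

# Made-up endplates for three vertebrae (illustrative values only).
endplates = [
    ((0.0, 0.0), (10.0, 2.0)),   # tilted one way
    ((0.0, 0.0), (10.0, 0.0)),   # level
    ((0.0, 0.0), (10.0, -3.0)),  # tilted the other way
]
tilts = [endplate_tilt_deg(left, right) for left, right in endplates]
cobb = cobb_angle_deg(tilts)
print(f"Cobb angle: {cobb:.2f} degrees")
```

In a real pipeline the endplate points would come from the segmentation masks, and the end vertebrae of each curve would need to be selected per curve rather than over the whole spine.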

26 pages, 4277 KiB  
Article
Fractal-Based Architectures with Skip Connections and Attention Mechanism for Improved Segmentation of MS Lesions in Cervical Spinal Cord
by Rukiye Polattimur, Mehmet Süleyman Yıldırım and Emre Dandıl
Diagnostics 2025, 15(8), 1041; https://doi.org/10.3390/diagnostics15081041 - 19 Apr 2025
Viewed by 228
Abstract
Background/Objectives: Multiple sclerosis (MS) is an autoimmune disease that damages the myelin sheath of the central nervous system, which includes the brain and spinal cord. Although MS lesions in the brain are more frequently investigated, MS lesions in the cervical spinal cord (CSC) can be much more specific for the diagnosis of the disease. Furthermore, as lesion burden in the CSC is directly related to disease progression, the presence of lesions in the CSC may help to differentiate MS from other neurological diseases. Methods: In this study, two novel deep learning models based on fractal architectures are proposed for the automatic detection and segmentation of MS lesions in the CSC by improving the convolutional and connection structures used in the layers of the U-Net architecture. In our previous study, we introduced the FractalSpiNet architecture by incorporating fractal convolutional block structures into the U-Net framework to develop a deeper network for segmenting MS lesions in the CSC. In this study, to improve the detection of smaller structures and finer details in the images, an attention mechanism is integrated into the FractalSpiNet architecture, resulting in the Att-FractalSpiNet model. In addition, in the second hybrid model, a fractal convolutional block is incorporated into the skip connection structure of the U-Net architecture, resulting in the Con-FractalU-Net model. Results: Experimental studies were conducted using the U-Net, FractalSpiNet, Con-FractalU-Net, and Att-FractalSpiNet architectures to detect the CSC region and the MS lesions within its boundaries. In segmenting the CSC region, the proposed Con-FractalU-Net architecture achieved the highest Dice Similarity Coefficient (DSC) score of 98.89%. Similarly, in detecting MS lesions within the CSC region, the Con-FractalU-Net model again achieved the best performance with a DSC score of 91.48%. Conclusions: For segmentation of the CSC region and detection of MS lesions, the proposed fractal-based Con-FractalU-Net and Att-FractalSpiNet architectures achieved higher scores than the baseline U-Net architecture, particularly in segmenting small and complex structures.
(This article belongs to the Special Issue Deep Learning Techniques for Medical Image Analysis)

22 pages, 1334 KiB  
Article
A Robust YOLOv8-Based Framework for Real-Time Melanoma Detection and Segmentation with Multi-Dataset Training
by Saleh Albahli
Diagnostics 2025, 15(6), 691; https://doi.org/10.3390/diagnostics15060691 - 11 Mar 2025
Viewed by 731
Abstract
Background: Melanoma, the deadliest form of skin cancer, demands accurate and timely diagnosis to improve patient survival rates. However, traditional diagnostic approaches rely heavily on subjective clinical interpretations, leading to inconsistencies and diagnostic errors. Methods: This study proposes a robust YOLOv8-based deep learning framework for real-time melanoma detection and segmentation. A multi-dataset training strategy integrating the ISIC 2020, HAM10000, and PH2 datasets was employed to enhance generalizability across diverse clinical conditions. Preprocessing techniques, including adaptive contrast enhancement and artifact removal, were utilized, while advanced augmentation strategies such as CutMix and Mosaic were applied to enhance lesion diversity. The YOLOv8 architecture unified lesion detection and segmentation tasks into a single inference pass, significantly enhancing computational efficiency. Results: Experimental evaluation demonstrated state-of-the-art performance, achieving a mean Average Precision (mAP@0.5) of 98.6%, a Dice Coefficient of 0.92, and an Intersection over Union (IoU) score of 0.88. These results surpass conventional segmentation models including U-Net, DeepLabV3+, Mask R-CNN, SwinUNet, and the Segment Anything Model (SAM). Moreover, the proposed framework demonstrated real-time inference speeds of 12.5 ms per image, making it highly suitable for clinical deployment and mobile health applications. Conclusions: The YOLOv8-based framework effectively addresses the limitations of existing diagnostic methods by integrating detection and segmentation tasks, achieving high accuracy and computational efficiency. This study highlights the importance of multi-dataset training for robust generalization and recommends the integration of explainable AI techniques to enhance clinical trust and interpretability.
(This article belongs to the Special Issue Deep Learning Techniques for Medical Image Analysis)

22 pages, 2713 KiB  
Article
An Efficient 3D Convolutional Neural Network for Dose Prediction in Cancer Radiotherapy from CT Images
by Lam Thanh Hien, Pham Trung Hieu and Do Nang Toan
Diagnostics 2025, 15(2), 177; https://doi.org/10.3390/diagnostics15020177 - 14 Jan 2025
Viewed by 889
Abstract
Introduction: Cancer is a disease with a high mortality rate. One of the most commonly used methods for treatment is radiation therapy. However, cancer treatment using radiotherapy is a time-consuming process that requires significant manual work from planners and doctors. In radiation therapy treatment planning, determining the dose distribution for each region of the patient’s body is one of the most difficult and important tasks. Nowadays, artificial intelligence has shown promising results in improving the quality of disease treatment, particularly in cancer radiation therapy. Objectives: The main objective of this study is to build a high-performance deep learning model for predicting radiation therapy doses for cancer and to develop software that makes this model easy to operate and use. Materials and Methods: In this paper, we propose a custom 3D convolutional neural network model with a U-Net-based architecture to automatically predict radiation doses during cancer radiation therapy from CT images. To ensure that the predicted doses do not have negative values, which are not valid for radiation doses, a rectified linear unit (ReLU) function is applied to the output to convert negative values to zero. Additionally, a proposed loss function based on a dose–volume histogram (DVH) is used to train the model, ensuring that the predicted dose concentrations are highly meaningful in terms of radiation therapy. The model is developed using the OpenKBP challenge dataset, which consists of 200, 100, and 40 head and neck cancer patients for training, testing, and validation, respectively. Before the training phase, preprocessing and augmentation techniques, such as standardization, translation, and flipping, are applied to the training set. During the training phase, a cosine annealing scheduler is applied to update the learning rate. Results and Conclusions: Our model achieved strong performance, with a good DVH score (1.444 Gy) on the test dataset, compared to previous studies and state-of-the-art models. In addition, we developed software to display the dose maps predicted by the proposed model for each 2D slice in order to facilitate usage and observation. These results may help doctors in treating cancer with radiation therapy in terms of both time and effectiveness.
(This article belongs to the Special Issue Deep Learning Techniques for Medical Image Analysis)
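Two training details mentioned in the abstract above, the ReLU applied to the network output and the cosine annealing learning-rate schedule, can be illustrated in a few lines. The sketch below is a plain-Python illustration of the standard definitions only. The learning-rate bounds, step counts, and sample dose values are hypothetical, and nothing here reproduces the authors' model or hyperparameters.

```python
import math

def cosine_annealing_lr(step, total_steps, lr_max=1e-3, lr_min=0.0):
    """Standard cosine annealing: decay from lr_max to lr_min over total_steps.

    lr_max and lr_min are assumed values chosen for illustration.
    """
    cos_term = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return lr_min + (lr_max - lr_min) * cos_term

def relu(values):
    """Clamp negative predictions to zero, since a dose cannot be negative."""
    return [max(0.0, v) for v in values]

# Learning rate at the start, middle, and end of a hypothetical 100-step run.
for step in (0, 50, 100):
    print(step, cosine_annealing_lr(step, total_steps=100))

# Hypothetical raw network outputs in Gy, before and after the output ReLU.
raw_doses = [-0.4, 0.0, 12.5, 68.2]
print(relu(raw_doses))
```

In a deep learning framework both pieces would normally come from the library (for example, a built-in scheduler and activation layer) rather than being written by hand.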

12 pages, 5034 KiB  
Article
YOLOv8-Based System for Nail Capillary Detection on a Single-Board Computer
by Seda Arslan Tuncer, Muhammed Yildirim, Taner Tuncer and Mehmet Kamil Mülayim
Diagnostics 2024, 14(17), 1843; https://doi.org/10.3390/diagnostics14171843 - 23 Aug 2024
Cited by 1 | Viewed by 1108
Abstract
Nail capillaroscopic examination is an inexpensive and easily applicable method to identify capillary morphological changes in patients with conditions such as systemic sclerosis and Raynaud’s. The detection of changes in capillaries makes an important contribution to diagnosing these diseases. Capillary morphology is important in the symptoms of these diseases, and capillary diameter, visibility, distribution, length, microbleeds, blood flow, and density are important indicators in capillaroscopic evaluation. Manual examination to determine these parameters is subjective, labor-intensive, and time-consuming, and it produces inconsistent results. To overcome these problems, a YOLOv8s-based system was proposed in this paper to detect the number, thickness, and density of capillaries in the nail bed. The system’s components include database systems that store the analysis results, artificial intelligence-based software that runs on the SBC (Single-Board Computer), and recorded microscope images. The mAP and F1-score metrics were used to evaluate the system’s performance, and values of 0.882 and 0.83, respectively, were obtained. The proposed system is promising for improving the diagnosis of diseases such as systemic sclerosis and Raynaud’s by providing objective measurements and supporting early diagnosis and monitoring.
(This article belongs to the Special Issue Deep Learning Techniques for Medical Image Analysis)

15 pages, 6316 KiB  
Article
Deep Learning Based Automatic Left Ventricle Segmentation from the Transgastric Short-Axis View on Transesophageal Echocardiography: A Feasibility Study
by Yuan Tian, Wenting Qin, Zihang Zhao, Chunrong Wang, Yajie Tian, Yuelun Zhang, Kai He, Yuguan Zhang, Le Shen, Zhuhuang Zhou and Chunhua Yu
Diagnostics 2024, 14(15), 1655; https://doi.org/10.3390/diagnostics14151655 - 31 Jul 2024
Viewed by 1242
Abstract
Segmenting the left ventricle from the transgastric short-axis views (TSVs) on transesophageal echocardiography (TEE) is the cornerstone for cardiovascular assessment during perioperative management. Even for seasoned professionals, the procedure remains time-consuming and experience-dependent. The current study aims to evaluate the feasibility of deep learning for automatic segmentation by assessing the validity of different U-Net algorithms. A large dataset containing 1388 TSV acquisitions was retrospectively collected from 451 patients (32% women, average age 53.42 years) who underwent perioperative TEE between July 2015 and October 2023. With image preprocessing and data augmentation, 3336 images were included in the training set, 138 images in the validation set, and 138 images in the test set. Four deep neural networks (U-Net, Attention U-Net, UNet++, and UNeXt) were employed for left ventricle segmentation and compared in terms of the Jaccard similarity coefficient (JSC) and Dice similarity coefficient (DSC) on the test set, as well as the number of network parameters, training time, and inference time. The Attention U-Net and UNet++ models performed better in terms of JSC (the highest average JSC: 86.02%) and DSC (the highest average DSC: 92.00%), the UNeXt model had the fewest network parameters (1.47 million), and the U-Net model had the shortest training time (6428.65 s) and inference time for a single image (101.75 ms). The Attention U-Net model outperformed the other three models in challenging cases, including the impaired boundary of the left ventricle and the artifact of the papillary muscle. This pioneering exploration demonstrated the feasibility of deep learning for the segmentation of the left ventricle from TSV on TEE, which will facilitate an accelerated and objective alternative for cardiovascular assessment in perioperative management.
(This article belongs to the Special Issue Deep Learning Techniques for Medical Image Analysis)
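Several abstracts in this list report the Jaccard similarity coefficient (JSC, also called IoU) and the Dice similarity coefficient (DSC). For readers less familiar with these overlap metrics, the following is a minimal sketch of how they are commonly computed for a pair of binary masks. The function name and the tiny flattened masks are made up for illustration, and treating two empty masks as perfect agreement is an assumed convention rather than something taken from any of the papers.

```python
from typing import Sequence, Tuple

def overlap_scores(pred: Sequence[int], truth: Sequence[int]) -> Tuple[float, float]:
    """Return (jaccard, dice) for two flattened binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_area = sum(1 for p in pred if p)
    truth_area = sum(1 for t in truth if t)
    union = pred_area + truth_area - intersection
    if union == 0:
        # Both masks are empty; counted as perfect agreement (assumed convention).
        return 1.0, 1.0
    jaccard = intersection / union
    dice = 2 * intersection / (pred_area + truth_area)
    return jaccard, dice

# Hypothetical six-pixel masks: predicted versus manually annotated.
pred_mask = [1, 1, 1, 0, 0, 0]
true_mask = [0, 1, 1, 1, 0, 0]
jsc, dsc = overlap_scores(pred_mask, true_mask)
print(f"JSC = {jsc:.3f}, DSC = {dsc:.3f}")
```

For a single pair of masks the two metrics are tied by DSC = 2·JSC / (1 + JSC), so they rank individual segmentations identically. Averages over a test set need not satisfy this identity exactly.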

22 pages, 5659 KiB  
Article
Exploring the Impact of Noise and Image Quality on Deep Learning Performance in DXA Images
by Dildar Hussain and Yeong Hyeon Gu
Diagnostics 2024, 14(13), 1328; https://doi.org/10.3390/diagnostics14131328 - 22 Jun 2024
Cited by 3 | Viewed by 3020
Abstract
Background and Objective: Segmentation of the femur in dual-energy X-ray absorptiometry (DXA) images poses challenges due to reduced contrast, noise, bone shape variations, and inconsistent X-ray beam penetration. In this study, we investigate the relationship between noise and certain deep learning (DL) techniques for semantic segmentation of the femur to enhance segmentation and bone mineral density (BMD) accuracy by incorporating noise reduction methods into DL models. Methods: Convolutional neural network (CNN)-based models were employed to segment femurs in DXA images and evaluate the effects of noise reduction filters on segmentation accuracy and on BMD calculation. Various noise reduction techniques were integrated into DL-based models to enhance image quality before training. We assessed the performance of the fully convolutional neural network (FCNN) in comparison to noise reduction algorithms and manual segmentation methods. Results: Our study demonstrated that the FCNN outperformed noise reduction algorithms in enhancing segmentation accuracy and enabling precise calculation of BMD. The FCNN-based segmentation approach achieved a segmentation accuracy of 98.84% and a correlation coefficient of 0.9928 for BMD measurements, indicating its effectiveness in the clinical diagnosis of osteoporosis. Conclusions: Integrating noise reduction techniques into DL-based models significantly improves femur segmentation accuracy in DXA images. The FCNN model, in particular, shows promising results in enhancing BMD calculation and clinical diagnosis of osteoporosis. These findings highlight the potential of DL techniques in addressing segmentation challenges and improving diagnostic accuracy in medical imaging.
(This article belongs to the Special Issue Deep Learning Techniques for Medical Image Analysis)

14 pages, 3521 KiB  
Article
Performance Comparison of Convolutional Neural Network-Based Hearing Loss Classification Model Using Auditory Brainstem Response Data
by Jun Ma, Seong Jun Choi, Sungyeup Kim and Min Hong
Diagnostics 2024, 14(12), 1232; https://doi.org/10.3390/diagnostics14121232 - 12 Jun 2024
Viewed by 1402
Abstract
This study evaluates the efficacy of several Convolutional Neural Network (CNN) models for the classification of hearing loss in patients using preprocessed auditory brainstem response (ABR) image data. Specifically, we employed six CNN architectures (VGG16, VGG19, DenseNet121, DenseNet201, AlexNet, and InceptionV3) to differentiate between patients with hearing loss and those with normal hearing. A dataset comprising 7990 preprocessed ABR images was utilized to assess the performance and accuracy of these models. Each model was systematically tested to determine its capability to accurately classify hearing loss. A comparative analysis of the models focused on metrics of accuracy and computational efficiency. The results indicated that the AlexNet model exhibited superior performance, achieving an accuracy of 95.93%. The findings from this research suggest that deep learning models, particularly AlexNet in this instance, hold significant potential for automating the diagnosis of hearing loss using ABR graph data. Future work will aim to refine these models to enhance their diagnostic accuracy and efficiency, fostering their practical application in clinical settings.
(This article belongs to the Special Issue Deep Learning Techniques for Medical Image Analysis)

12 pages, 3205 KiB  
Article
Deep Learning Detection and Segmentation of Facet Joints in Ultrasound Images Based on Convolutional Neural Networks and Enhanced Data Annotation
by Lingeer Wu, Di Xia, Jin Wang, Si Chen, Xulei Cui, Le Shen and Yuguang Huang
Diagnostics 2024, 14(7), 755; https://doi.org/10.3390/diagnostics14070755 - 2 Apr 2024
Viewed by 1601
Abstract
The facet joint injection is the most common procedure used to relieve lower back pain. In this paper, we proposed a deep learning method for detecting and segmenting facet joints in ultrasound images based on convolutional neural networks (CNNs) and enhanced data annotation. In the enhanced data annotation, a facet joint was considered as the first target and the ventral complex as the second target to improve the capability of CNNs in recognizing the facet joint. A total of 300 cases of patients undergoing pain treatment were included. The ultrasound images were captured and labeled by two professional anesthesiologists, and then augmented to train a deep learning model based on the Mask Region-based CNN (Mask R-CNN). The performance of the deep learning model was evaluated using the average precision (AP) on the testing sets. The data augmentation and data annotation methods were found to improve the AP. The AP50 for facet joint detection and segmentation was 90.4% and 85.0%, respectively, demonstrating the satisfactory performance of the deep learning model. We presented a deep learning method for facet joint detection and segmentation in ultrasound images based on enhanced data annotation and the Mask R-CNN. The feasibility and potential of deep learning techniques in facet joint ultrasound image analysis have been demonstrated.
(This article belongs to the Special Issue Deep Learning Techniques for Medical Image Analysis)
