Search Results (96)

Search Parameters:
Keywords = medical image explanations

37 pages, 13828 KB  
Article
XIMED: A Dual-Loop Evaluation Framework Integrating Predictive Model and Human-Centered Approaches for Explainable AI in Medical Imaging
by Gizem Karagoz, Tanir Ozcelebi and Nirvana Meratnia
Mach. Learn. Knowl. Extr. 2025, 7(4), 168; https://doi.org/10.3390/make7040168 - 17 Dec 2025
Abstract
In this study, a structured and methodological evaluation approach for eXplainable Artificial Intelligence (XAI) methods in medical image classification is proposed and implemented using LIME and SHAP explanations for chest X-ray interpretations. The evaluation framework integrates two critical perspectives: predictive model-centered and human-centered evaluations. Predictive model-centered evaluations examine the explanations’ ability to reflect changes in input and output data and in the internal model structure. Human-centered evaluations, conducted with 97 medical experts, assess trust, confidence, and agreement with the AI’s indicative and contra-indicative reasoning, as well as how these change before and after explanations are provided. Key findings of our study include the sensitivity of LIME and SHAP explanations to model changes, their effectiveness in identifying critical features, and SHAP’s significant impact on diagnosis changes. Our results show that both LIME and SHAP negatively affected contra-indicative agreement. Case-based analysis revealed that AI explanations reinforce trust and agreement when participants’ initial diagnoses are correct; in these cases, SHAP effectively facilitated correct diagnostic changes. This study establishes a benchmark for future research in XAI for medical image analysis, providing a robust foundation for evaluating and comparing different XAI methods.
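As a minimal illustration of the LIME step this abstract evaluates (a sketch, not the authors’ code), the snippet below uses the open-source lime package to highlight the superpixels that most support a chest X-ray classifier’s top prediction; predict_proba is a hypothetical wrapper that maps a batch of RGB image arrays to class probabilities.

```python
# Illustrative sketch only: generate a LIME superpixel explanation for one
# chest X-ray, assuming `predict_proba(batch)` wraps a trained classifier
# and returns an (N, num_classes) array of probabilities.
import numpy as np
from lime import lime_image
from skimage.segmentation import mark_boundaries

def explain_xray(image_rgb: np.ndarray, predict_proba):
    explainer = lime_image.LimeImageExplainer()
    explanation = explainer.explain_instance(
        image_rgb.astype(np.double),  # (H, W, 3) array
        predict_proba,                # hypothetical model wrapper
        top_labels=1,
        num_samples=1000,             # perturbed samples used to fit the local surrogate
    )
    label = explanation.top_labels[0]
    # Keep only the superpixels that push the prediction toward `label`.
    img, mask = explanation.get_image_and_mask(
        label, positive_only=True, num_features=5, hide_rest=False
    )
    return mark_boundaries(img / img.max(), mask), label  # normalise for display
```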

32 pages, 3384 KB  
Review
A Survey of the Application of Explainable Artificial Intelligence in Biomedical Informatics
by Hassan Eshkiki, Farinaz Tanhaei, Fabio Caraffini and Benjamin Mora
Appl. Sci. 2025, 15(24), 12934; https://doi.org/10.3390/app152412934 - 8 Dec 2025
Viewed by 388
Abstract
This review investigates the application of Explainable Artificial Intelligence (XAI) in biomedical informatics, encompassing domains such as medical imaging, genomics, and electronic health records. Through a systematic analysis of 43 peer-reviewed articles, we examine current trends, as well as the strengths and limitations of methodologies currently used in real-world healthcare settings. Our findings highlight a growing interest in XAI, particularly in medical imaging, yet reveal persistent challenges in clinical adoption, including issues of trust, interpretability, and integration into decision-making workflows. We identify critical gaps in existing approaches and underscore the need for more robust, human-centred, and intrinsically interpretable models, with only 44% of the papers studied proposing human-centred validations. Furthermore, we argue that fairness and accountability, which are key to the acceptance of AI in clinical practice, can be supported by the use of post hoc tools for identifying potential biases but ultimately require the implementation of complementary fairness-aware or causal approaches alongside evaluation frameworks that prioritise clinical relevance and user trust. This review provides a foundation for advancing XAI research on the development of more transparent, equitable, and clinically meaningful AI systems for use in healthcare.
(This article belongs to the Special Issue Application of Artificial Intelligence in Biomedical Informatics)

16 pages, 571 KB  
Article
Lightweight Statistical and Texture Feature Approach for Breast Thermogram Analysis
by Ana P. Romero-Carmona, Jose J. Rangel-Magdaleno, Francisco J. Renero-Carrillo, Juan M. Ramirez-Cortes and Hayde Peregrina-Barreto
J. Imaging 2025, 11(10), 358; https://doi.org/10.3390/jimaging11100358 - 13 Oct 2025
Viewed by 523
Abstract
Breast cancer is the most commonly diagnosed cancer in women globally and represents the leading cause of mortality related to malignant tumors. Healthcare professionals are therefore focused on developing and implementing innovative techniques to improve the early detection of this disease. Thermography, studied as a complement to traditional approaches, captures infrared radiation emitted by tissues and converts it into data about skin surface temperature. During tumor development, angiogenesis increases blood flow to support tumor growth, which raises the surface temperature in the affected area. Automatic classification techniques have been explored to analyze thermographic images and develop an optimal tool for identifying thermal anomalies. This study aims to design a concise description based on statistical and texture features that accurately classifies thermograms as control or highly probable for cancer (i.e., showing thermal anomalies). A short description facilitates interpretation by medical professionals, whereas a characterization based on a large number of variables makes it harder to identify which values differentiate the thermograms between groups and complicates the explanation of results to patients. A maximum accuracy of 91.97% was achieved using only seven features and a Coarse Decision Tree (DT) classifier, a robust Machine Learning (ML) model that demonstrated competitive performance compared with previously reported studies.
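For readers unfamiliar with this kind of compact descriptor, the sketch below computes a plausible mix of GLCM texture and first-order statistical features for a thermogram region and feeds them to a shallow decision tree; the exact seven features and tree settings used in the paper are not reproduced here.

```python
# Illustrative sketch: a compact statistical + GLCM texture descriptor for a
# thermogram ROI and a shallow ("coarse") decision tree. The feature set is an
# assumption in the spirit of the paper's seven-feature description.
import numpy as np
from scipy import stats
from skimage.feature import graycomatrix, graycoprops
from sklearn.tree import DecisionTreeClassifier

def thermogram_features(roi_8bit: np.ndarray) -> np.ndarray:
    """roi_8bit: 2-D uint8 temperature map of the breast region."""
    glcm = graycomatrix(roi_8bit, distances=[1], angles=[0], levels=256,
                        symmetric=True, normed=True)
    texture = [graycoprops(glcm, p)[0, 0]
               for p in ("contrast", "homogeneity", "energy", "correlation")]
    flat = roi_8bit.ravel().astype(float)
    statistical = [flat.mean(), flat.std(), stats.skew(flat)]
    return np.array(texture + statistical)   # 7 descriptors total

# clf = DecisionTreeClassifier(max_depth=4)  # shallow tree keeps the model readable
# clf.fit(np.vstack([thermogram_features(r) for r in rois]), labels)
```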
(This article belongs to the Section Medical Imaging)

21 pages, 806 KB  
Review
Application of Explainable Artificial Intelligence Based on Visual Explanation in Digestive Endoscopy
by Xiaohan Cai, Zexin Zhang, Siqi Zhao, Wentian Liu and Xiaofei Fan
Bioengineering 2025, 12(10), 1058; https://doi.org/10.3390/bioengineering12101058 - 30 Sep 2025
Viewed by 1310
Abstract
At present, artificial intelligence (AI) has shown significant potential in digestive endoscopy image analysis, serving as a powerful auxiliary tool for the accurate diagnosis and treatment of gastrointestinal diseases. However, mainstream models represented by deep learning are often characterized as complex “black boxes,” with decision-making processes that are difficult for humans to interpret. This lack of interpretability undermines physicians’ trust in model results and hinders the broader use of models in clinical practice. To address this core challenge, Explainable AI (XAI) has emerged to enhance the transparency of decision-making, thereby establishing a foundation of trust for human–machine collaboration. This review systematically examines 34 articles (7 in esophagogastroduodenoscopy, 13 in colonoscopy, 9 in endoscopic ultrasonography, and 5 in wireless capsule endoscopy), focusing on the research progress and applications of XAI in digestive endoscopic image analysis, with particular emphasis on visual explanation-based methods. We first clarify the definition and mainstream classification of XAI, then introduce the principles and characteristics of key XAI methods based on visual explanation. Subsequently, we review the applications of these methods in digestive endoscopy image analysis. Lastly, we discuss the obstacles currently faced in this domain and future directions. This study provides a theoretical basis for constructing a trustworthy and transparent AI-assisted digestive endoscopy diagnosis and treatment system and promotes the implementation and application of XAI in clinical practice.

18 pages, 1752 KB  
Systematic Review
Beyond Post hoc Explanations: A Comprehensive Framework for Accountable AI in Medical Imaging Through Transparency, Interpretability, and Explainability
by Yashbir Singh, Quincy A. Hathaway, Varekan Keishing, Sara Salehi, Yujia Wei, Natally Horvat, Diana V. Vera-Garcia, Ashok Choudhary, Almurtadha Mula Kh, Emilio Quaia and Jesper B Andersen
Bioengineering 2025, 12(8), 879; https://doi.org/10.3390/bioengineering12080879 - 15 Aug 2025
Cited by 5 | Viewed by 3470
Abstract
The integration of artificial intelligence (AI) in medical imaging has revolutionized diagnostic capabilities, yet the black-box nature of deep learning models poses significant challenges for clinical adoption. Current explainable AI (XAI) approaches, including SHAP, LIME, and Grad-CAM, predominantly focus on post hoc explanations that may inadvertently undermine clinical decision-making by providing misleading confidence in AI outputs. This paper presents a systematic review and meta-analysis of 67 studies (covering 23 radiology, 19 pathology, and 25 ophthalmology applications) evaluating XAI fidelity, stability, and performance trade-offs across medical imaging modalities. Our meta-analysis of 847 initially identified studies reveals that LIME achieves superior fidelity (0.81, 95% CI: 0.78–0.84) compared to SHAP (0.38, 95% CI: 0.35–0.41) and Grad-CAM (0.54, 95% CI: 0.51–0.57) across all modalities. Post hoc explanations demonstrated poor stability under noise perturbation, with SHAP showing 53% degradation in ophthalmology applications (ρ = 0.42 at 10% noise) compared to 11% in radiology (ρ = 0.89). We demonstrate a consistent 5–7% AUC performance penalty for interpretable models but identify modality-specific stability patterns suggesting that tailored XAI approaches are necessary. Based on these empirical findings, we propose a comprehensive three-pillar accountability framework that prioritizes transparency in model development, interpretability in architecture design, and a cautious deployment of post hoc explanations with explicit uncertainty quantification. This approach offers a pathway toward genuinely accountable AI systems that enhance rather than compromise clinical decision-making quality and patient safety.
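The stability analysis summarized above can be illustrated with a simple check: recompute an attribution map after perturbing the input with noise and measure how well the pixel ranking is preserved. The sketch below assumes a hypothetical explain(image) function returning a 2-D attribution map (e.g., from SHAP or Grad-CAM) and uses Spearman correlation as the stability score.

```python
# Illustrative sketch of a noise-perturbation stability check for an
# attribution method. `explain` is a hypothetical callable, not a specific API.
import numpy as np
from scipy.stats import spearmanr

def stability_under_noise(image: np.ndarray, explain, noise_level: float = 0.10,
                          seed: int = 0) -> float:
    rng = np.random.default_rng(seed)
    noisy = image + noise_level * image.std() * rng.standard_normal(image.shape)
    clean_map = explain(image).ravel()
    noisy_map = explain(noisy).ravel()
    rho, _ = spearmanr(clean_map, noisy_map)  # 1.0 = perfectly stable ranking
    return float(rho)
```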
(This article belongs to the Special Issue Explainable Artificial Intelligence (XAI) in Medical Imaging)

20 pages, 6359 KB  
Article
Symmetry in Explainable AI: A Morphometric Deep Learning Analysis for Skin Lesion Classification
by Rafael Fernandez, Angélica Guzmán-Ponce, Ruben Fernandez-Beltran and Ginés García-Mateos
Symmetry 2025, 17(8), 1264; https://doi.org/10.3390/sym17081264 - 7 Aug 2025
Viewed by 907
Abstract
Deep learning has achieved remarkable performance in skin lesion classification, but its lack of interpretability often remains a critical barrier to clinical adoption. In this study, we investigate the spatial properties of saliency-based model explanations, focusing on symmetry and other morphometric features. We benchmark five deep learning architectures (ResNet-50, EfficientNetV2-S, ConvNeXt-Tiny, Swin-Tiny, and MaxViT-Tiny) on a nine-class skin lesion dataset from the International Skin Imaging Collaboration (ISIC) archive, generating saliency maps with Grad-CAM++ and LayerCAM. The best-performing model, Swin-Tiny, achieved an accuracy of 78.2% and a macro-F1 score of 71.2%. Our morphometric analysis reveals statistically significant differences in the explanation maps between correct and incorrect predictions. Notably, the transformer-based models exhibit highly significant differences (p<0.001) in metrics related to attentional focus (Entropy and Gini), indicating that their correct predictions are associated with more concentrated saliency maps. In contrast, convolutional models show less consistent differences, and only at a standard significance level (p<0.05). These findings suggest that the quantitative morphometric properties of saliency maps could serve as valuable indicators of predictive reliability in medical AI.
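Two of the attentional-focus metrics named in this abstract, entropy and the Gini coefficient, can be computed directly from a non-negative saliency map; the sketch below shows one plausible formulation (the paper’s exact definitions may differ).

```python
# Illustrative sketch: Shannon entropy and Gini coefficient of a non-negative
# saliency map, both measuring how concentrated the model's attention is.
import numpy as np

def saliency_entropy(saliency: np.ndarray) -> float:
    p = saliency.ravel().astype(float)
    p = p / p.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def saliency_gini(saliency: np.ndarray) -> float:
    x = np.sort(saliency.ravel().astype(float))
    n = x.size
    cum = np.cumsum(x)
    # Standard Gini formula on sorted values: 0 = uniform attention, ~1 = one hot pixel.
    return float((n + 1 - 2 * (cum / cum[-1]).sum()) / n)
```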

15 pages, 1758 KB  
Article
Eye-Guided Multimodal Fusion: Toward an Adaptive Learning Framework Using Explainable Artificial Intelligence
by Sahar Moradizeyveh, Ambreen Hanif, Sidong Liu, Yuankai Qi, Amin Beheshti and Antonio Di Ieva
Sensors 2025, 25(15), 4575; https://doi.org/10.3390/s25154575 - 24 Jul 2025
Viewed by 1187
Abstract
Interpreting diagnostic imaging and identifying clinically relevant features remain challenging tasks, particularly for novice radiologists who often lack structured guidance and expert feedback. To bridge this gap, we propose an Eye-Gaze Guided Multimodal Fusion framework that leverages expert eye-tracking data to enhance learning and decision-making in medical image interpretation. By integrating chest X-ray (CXR) images with expert fixation maps, our approach captures radiologists’ visual attention patterns and highlights regions of interest (ROIs) critical for accurate diagnosis. The fusion model utilizes a shared backbone architecture to jointly process image and gaze modalities, thereby minimizing the impact of noise in fixation data. We validate the system’s interpretability using Gradient-weighted Class Activation Mapping (Grad-CAM) and assess both classification performance and explanation alignment with expert annotations. Comprehensive evaluations, including robustness under gaze noise and expert clinical review, demonstrate the framework’s effectiveness in improving model reliability and interpretability. This work offers a promising pathway toward intelligent, human-centered AI systems that support both diagnostic accuracy and medical training.
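Grad-CAM, used here to probe the fusion model’s interpretability, can be sketched in a few lines of PyTorch; model and target_layer below are placeholders for any trained classifier and one of its convolutional layers, not the authors’ architecture.

```python
# Illustrative Grad-CAM sketch: compute a class activation heat map from a
# convolutional layer of a PyTorch classifier. `model` and `target_layer`
# are placeholders, not the paper's fusion network.
import torch
import torch.nn.functional as F

def grad_cam(model, target_layer, image: torch.Tensor, class_idx: int) -> torch.Tensor:
    """image: (1, C, H, W) tensor; returns an (H, W) heat map in [0, 1]."""
    feats, grads = {}, {}
    h1 = target_layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))
    model.eval()
    score = model(image)[0, class_idx]
    model.zero_grad()
    score.backward()
    h1.remove(); h2.remove()
    weights = grads["g"].mean(dim=(2, 3), keepdim=True)            # GAP over spatial dims
    cam = F.relu((weights * feats["a"].detach()).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear", align_corners=False)[0, 0]
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
```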
(This article belongs to the Section Sensing and Imaging)

16 pages, 2557 KB  
Article
Explainable AI for Oral Cancer Diagnosis: Multiclass Classification of Histopathology Images and Grad-CAM Visualization
by Jelena Štifanić, Daniel Štifanić, Nikola Anđelić and Zlatan Car
Biology 2025, 14(8), 909; https://doi.org/10.3390/biology14080909 - 22 Jul 2025
Cited by 1 | Viewed by 2009
Abstract
Oral cancer is typically diagnosed through histological examination; however, the primary issue with this procedure is tumor heterogeneity, where the subjective aspect of the examination may directly affect a patient’s treatment plan. To reduce inter- and intra-observer variability, artificial intelligence algorithms are often used as computational aids in tumor classification and diagnosis. This research proposes a two-step approach to assist clinicians in diagnosing oral squamous cell carcinoma: automatic multiclass grading of oral histopathology images (the first step) and Grad-CAM visualization (the second step). The Xception architecture achieved the highest classification performance, with a macro-averaged AUC of 0.929 (σ = 0.087) and a micro-averaged AUC of 0.942 (σ = 0.074). Additionally, Grad-CAM provided visual explanations of the model’s predictions by highlighting the precise areas of the histopathology images that influenced the model’s decision. These results emphasize the potential of integrated AI algorithms in medical diagnostics, offering a more precise, dependable, and effective method for disease analysis.

35 pages, 7934 KB  
Article
Analyzing Diagnostic Reasoning of Vision–Language Models via Zero-Shot Chain-of-Thought Prompting in Medical Visual Question Answering
by Fatema Tuj Johora Faria, Laith H. Baniata, Ahyoung Choi and Sangwoo Kang
Mathematics 2025, 13(14), 2322; https://doi.org/10.3390/math13142322 - 21 Jul 2025
Viewed by 4807
Abstract
Medical Visual Question Answering (MedVQA) lies at the intersection of computer vision, natural language processing, and clinical decision-making, aiming to generate accurate responses from medical images paired with complex inquiries. Despite recent advances in vision–language models (VLMs), their use in healthcare remains limited by a lack of interpretability and a tendency to produce direct, unexplainable outputs. This opacity undermines their reliability in medical settings, where transparency and justification are critically important. To address this limitation, we propose a zero-shot chain-of-thought prompting framework that guides VLMs to perform multi-step reasoning before arriving at an answer. By encouraging the model to break down the problem, analyze both visual and contextual cues, and construct a stepwise explanation, the approach makes the reasoning process explicit and clinically meaningful. We evaluate the framework on the PMC-VQA benchmark, which includes authentic radiological images and expert-level prompts. In a comparative analysis of three leading VLMs, Gemini 2.5 Pro achieved the highest accuracy (72.48%), followed by Claude 3.5 Sonnet (69.00%) and GPT-4o Mini (67.33%). The results demonstrate that chain-of-thought prompting significantly improves both reasoning transparency and performance in MedVQA tasks.
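A zero-shot chain-of-thought prompt of the kind described above can be as simple as the template below; the wording is illustrative rather than the paper’s, and ask_vlm stands in for whichever vision-language model API is used.

```python
# Illustrative sketch of a zero-shot chain-of-thought prompt for medical VQA.
# `ask_vlm` is a hypothetical placeholder for a vision-language model call.

def build_cot_prompt(question: str, options: list[str]) -> str:
    choices = "\n".join(f"{chr(65 + i)}. {opt}" for i, opt in enumerate(options))
    return (
        "You are assisting with a medical visual question.\n"
        f"Question: {question}\n"
        f"Options:\n{choices}\n\n"
        "Let's think step by step: first describe the relevant findings in the image, "
        "then relate them to each option, and only then state the single best answer "
        "as 'Final answer: <letter>'."
    )

# answer = ask_vlm(image_bytes, build_cot_prompt(q, opts))   # hypothetical call
```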
(This article belongs to the Special Issue Mathematical Foundations in NLP: Applications and Challenges)

18 pages, 5017 KB  
Article
A CECT-Based Radiomics Nomogram Predicts the Overall Survival of Patients with Hepatocellular Carcinoma After Surgical Resection
by Peng Zhang, Yue Shi, Maoting Zhou, Qi Mao, Yunyun Tao, Lin Yang and Xiaoming Zhang
Biomedicines 2025, 13(5), 1237; https://doi.org/10.3390/biomedicines13051237 - 19 May 2025
Cited by 1 | Viewed by 1251
Abstract
Objective: The primary objective of this study was to develop and validate a predictive nomogram that integrates radiomic features derived from contrast-enhanced computed tomography (CECT) images with clinical variables to predict overall survival (OS) in patients with hepatocellular carcinoma (HCC) after surgical resection. Methods: This retrospective study analyzed the preoperative enhanced CT images and clinical data of 202 patients with HCC who underwent surgical resection at the Affiliated Hospital of North Sichuan Medical College (Institution 1) from June 2017 to June 2021 and at Nanchong Central Hospital (Institution 2) from June 2020 to June 2022. Among these patients, 162 patients from Institution 1 were randomly divided into a training cohort (112 patients) and an internal validation cohort (50 patients) at a 7:3 ratio, whereas 40 patients from Institution 2 were assigned as an independent external validation cohort. Univariate and multivariate Cox proportional hazards regression analyses were performed to identify clinical risk factors associated with OS after HCC resection. Using 3D-Slicer software, tumor lesions were manually delineated slice by slice on preoperative non-contrast-enhanced (NCE) CT, arterial phase (AP), and portal venous phase (PVP) images to generate volumetric regions of interest (VOIs). Radiomic features were subsequently extracted from these VOIs. LASSO Cox regression analysis was employed for dimensionality reduction and feature selection, culminating in the construction of a radiomic signature (Radscore). Cox proportional hazards regression models, including a clinical model, a radiomic model, and a radiomic–clinical model, were subsequently developed for OS prediction. The predictive performance of these models was assessed via the concordance index (C-index) and time–ROC curves. The optimal performance model was further visualized as a nomogram, and its predictive accuracy was evaluated via calibration curves and decision curve analysis (DCA). Finally, the risk factors in the optimal performance model were interpreted via Shapley additive explanations (SHAP). Results: Univariate and multivariate Cox regression analyses revealed that BCLC stage, the albumin–bilirubin index (ALBI), and the NLR–PLR score were independent predictors of OS after HCC resection. Among these three models, the radiomic–clinical model exhibited the highest predictive performance, with C-indices of 0.789, 0.726, and 0.764 in the training, internal and external validation cohorts, respectively. Furthermore, the time–ROC curves for the radiomic–clinical model showed 1-year and 3-year AUCs of 0.837 and 0.845 in the training cohort, 0.801 and 0.880 in the internal validation cohort, and 0.773 and 0.840 in the external validation cohort. Calibration curves and DCA demonstrated the model’s excellent calibration and clinical applicability. Conclusions: The nomogram combining CECT radiomic features and clinical variables provides an accurate prediction of OS after HCC resection. This model is beneficial for clinicians in developing individualized treatment strategies for patients with HCC.
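The core modelling step, an L1-penalised Cox model scored with the concordance index, can be sketched with the lifelines package as below; column names and the penalty strength are assumptions, not values from the paper.

```python
# Illustrative sketch of an L1-penalised (LASSO-style) Cox model on a table of
# radiomic + clinical features, scored with the C-index. Column names are hypothetical.
import pandas as pd
from lifelines import CoxPHFitter
from lifelines.utils import concordance_index

def fit_radiomic_clinical_cox(train: pd.DataFrame, test: pd.DataFrame):
    """Both frames contain feature columns plus 'os_months' and 'event' (1 = death)."""
    cph = CoxPHFitter(penalizer=0.1, l1_ratio=1.0)  # pure L1 penalty for feature selection
    cph.fit(train, duration_col="os_months", event_col="event")
    risk = cph.predict_partial_hazard(test)
    # Higher hazard should mean shorter survival, hence the negation.
    c_index = concordance_index(test["os_months"], -risk, test["event"])
    return cph, c_index
```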

11 pages, 3732 KB  
Case Report
Involvement of Pruritus, Gut Dysbiosis and Histamine-Producing Bacteria in Paraneoplastic Syndromes
by Doina Georgescu, Daniel Lighezan, Mihai Ionita, Paul Ciubotaru, Gabriel Cozma, Alexandra Faur, Ioana Suceava, Oana Elena Ancusa and Roxana Buzas
Biomedicines 2025, 13(5), 1036; https://doi.org/10.3390/biomedicines13051036 - 25 Apr 2025
Viewed by 3679
Abstract
Background/Objectives: Paraneoplastic syndromes (PNS), characterized by a large diversity of symptoms, may sometimes be the first clinical feature of a severe underlying disorder such as cancer. Methods: We report the case of a middle-aged male patient with no significant previous medical history, neither a smoker nor a heavy alcohol drinker, complaining of generalized, recent-onset itch. Because dermatological consultation provided no reasonable explanation for the pruritus and the response to treatment was unsatisfactory, the patient was referred to gastroenterology with suspicion of a cholestatic liver disease. Results: Abdominal ultrasound examination revealed gallstones and no dilation of the biliary tree. Numerous tests came back negative, except for a slight elevation of C-reactive protein, mild dyslipidemia, and positivity for H. pylori antigen. The gut microbiota displayed marked dysbiosis with a significant increase in histamine-producing bacteria. As the chronic pruritus became suspicious, thoracic and abdominal CT were recommended and performed soon after, revealing a large right mid-thoracic tumor. Bronchoscopy was negative for a tumor. After CT-guided biopsy, the tumor turned out to be not a lymphoma but a non-small cell lung carcinoma (NSCLC). Conclusions: In this patient with gallstone disease, chronic pruritus was not associated with cholestasis but rather with a PNS, the first clinical manifestation of NSCLC, posing many diagnostic and therapeutic challenges.

30 pages, 3620 KB  
Article
Stroke Detection in Brain CT Images Using Convolutional Neural Networks: Model Development, Optimization and Interpretability
by Hassan Abdi, Mian Usman Sattar, Raza Hasan, Vishal Dattana and Salman Mahmood
Information 2025, 16(5), 345; https://doi.org/10.3390/info16050345 - 24 Apr 2025
Cited by 4 | Viewed by 4219
Abstract
Stroke detection using medical imaging plays a crucial role in early diagnosis and treatment planning. In this study, we propose a Convolutional Neural Network (CNN)-based model for detecting strokes from brain Computed Tomography (CT) images. The model is trained on a dataset consisting of 2501 images, including both normal and stroke cases, and employs a series of preprocessing steps, including resizing, normalization, data augmentation, and splitting into training, validation, and test sets. The CNN architecture comprises three convolutional blocks followed by dense layers optimized through hyperparameter tuning to maximize performance. Our model achieved a validation accuracy of 97.2%, with precision and recall values of 96%, demonstrating high efficacy in stroke classification. Additionally, interpretability techniques such as Local Interpretable Model-agnostic Explanations (LIME), occlusion sensitivity, and saliency maps were used to visualize the model’s decision-making process, enhancing transparency and trust for clinical use. The results suggest that deep learning models, particularly CNNs, can provide valuable support for medical professionals in detecting strokes, offering both high performance and interpretability. The model demonstrates moderate generalizability, achieving 89.73% accuracy on an external, patient-independent dataset of 9900 CT images, underscoring the need for further optimization in diverse clinical settings.
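A CNN with the overall shape described here (three convolutional blocks followed by dense layers) might look like the PyTorch sketch below; the filter counts, dropout rate, and single-channel input are assumptions rather than the paper’s tuned hyperparameters.

```python
# Illustrative sketch of a three-block CNN for binary stroke / normal CT
# classification. Layer sizes are assumptions, not the paper's architecture.
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )

class StrokeCNN(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            conv_block(1, 32), conv_block(32, 64), conv_block(64, 128)
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, 64), nn.ReLU(inplace=True), nn.Dropout(0.3),
            nn.Linear(64, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (N, 1, H, W) CT slices
        return self.classifier(self.features(x))
```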
(This article belongs to the Special Issue Real-World Applications of Machine Learning Techniques)

19 pages, 7498 KB  
Article
An Efficient Explainability of Deep Models on Medical Images
by Salim Khiat, Sidi Ahmed Mahmoudi, Sédrick Stassin, Lillia Boukerroui, Besma Senaï and Saïd Mahmoudi
Algorithms 2025, 18(4), 210; https://doi.org/10.3390/a18040210 - 9 Apr 2025
Viewed by 1456
Abstract
Artificial Intelligence (AI) has revolutionized many fields, and medicine is no exception. Thanks to technological advancements and the emergence of Deep Learning (DL) techniques, AI has brought new possibilities and significant improvements to medical practice. Despite the excellent accuracy and performance of DL models, they remain black boxes that do not provide meaningful insights into their internal functioning. This is where the field of Explainable AI (XAI) comes in, aiming to provide insights into the underlying workings of these black-box models. In this paper, the visual explainability of deep models on chest radiography images is addressed. This research uses two datasets: the first covers COVID-19, viral pneumonia, and normal (healthy) cases, and the second covers pulmonary opacities. Pretrained CNN models (VGG16, VGG19, ResNet50, MobileNetV2, Mixnet, and EfficientNetB7) are first used to classify the chest radiography images. Visual explainability methods (GradCAM, LIME, Vanilla Gradient, Integrated Gradients, and SmoothGrad) are then applied to understand and explain the decisions made by these models. The results show that MobileNetV2 and VGG16 are the best models for the first and second datasets, respectively. The explanations were presented to doctors and validated by calculating the mean opinion score; the doctors deemed GradCAM, LIME, and Vanilla Gradient the most effective methods, providing understandable and accurate explanations.
(This article belongs to the Special Issue Machine Learning in Medical Signal and Image Processing (3rd Edition))

16 pages, 2439 KB  
Article
Ultrasound-Based Deep Learning Radiomics Models for Predicting Primary and Secondary Salivary Gland Malignancies: A Multicenter Retrospective Study
by Zhen Xia, Xiao-Chen Huang, Xin-Yu Xu, Qing Miao, Ming Wang, Meng-Jie Wu, Hao Zhang, Qi Jiang, Jing Zhuang, Qiang Wei and Wei Zhang
Bioengineering 2025, 12(4), 391; https://doi.org/10.3390/bioengineering12040391 - 5 Apr 2025
Viewed by 1138
Abstract
Background: Primary and secondary salivary gland malignancies differ significantly in treatment and prognosis. However, conventional ultrasonography often struggles to differentiate between these malignancies due to overlapping imaging features. We aimed to develop and evaluate noninvasive diagnostic models based on traditional ultrasound features, radiomics, and deep learning—independently or in combination—for distinguishing between primary and secondary salivary gland malignancies. Methods: This retrospective study included a total of 140 patients, comprising 68 with primary and 72 with secondary salivary gland malignancies, all pathologically confirmed, from four medical centers. Ultrasound features of salivary gland tumors were analyzed, and a radiomics model was established. Transfer learning with multiple pre-trained models was used to create deep learning (DL) models from which features were extracted and combined with radiomics features to construct a radiomics-deep learning (RadiomicsDL) model. A combined model was further developed by integrating ultrasound features. Least absolute shrinkage and selection operator (LASSO) regression and various machine learning algorithms were employed for feature selection and modeling. The optimal model was determined based on the area under the receiver operating characteristic curve (AUC), and interpretability was assessed using SHapley Additive exPlanations (SHAP). Results: The RadiomicsDL model, which combines radiomics and deep learning features using the Multi-Layer Perceptron (MLP), demonstrated the best performance on the test set with an AUC of 0.807. This surpassed the performances of the ultrasound (US), radiomics, DL, and combined models, which achieved AUCs of 0.421, 0.636, 0.763, and 0.711, respectively. SHAP analysis revealed that the radiomic feature Wavelet_LHH_glcm_SumEntropy contributed most significantly to the model. Conclusions: The RadiomicsDL model based on ultrasound images provides an efficient and non-invasive method to differentiate between primary and secondary salivary gland malignancies.
(This article belongs to the Special Issue Diagnostic Imaging and Radiation Therapy in Biomedical Engineering)

19 pages, 4910 KB  
Article
A Novel SHAP-GAN Network for Interpretable Ovarian Cancer Diagnosis
by Jingxun Cai, Zne-Jung Lee, Zhihxian Lin and Ming-Ren Yang
Mathematics 2025, 13(5), 882; https://doi.org/10.3390/math13050882 - 6 Mar 2025
Cited by 1 | Viewed by 1531
Abstract
Ovarian cancer stands out as one of the most formidable adversaries in women’s health, largely due to its typically subtle and nonspecific early symptoms, which pose significant challenges to early detection and diagnosis. Although existing diagnostic methods, such as biomarker testing and imaging, can help with early diagnosis to some extent, these methods still have limitations in sensitivity and accuracy, often leading to misdiagnosis or missed diagnosis. Ovarian cancer’s high heterogeneity and complexity increase diagnostic challenges, especially in disease progression prediction and patient classification. Machine learning (ML) has outperformed traditional methods in cancer detection by processing large datasets to identify patterns missed by conventional techniques. However, existing AI models still struggle with accuracy in handling imbalanced and high-dimensional data, and their “black-box” nature limits clinical interpretability. To address these issues, this study proposes SHAP-GAN, an innovative diagnostic model for ovarian cancer that integrates Shapley Additive exPlanations (SHAP) with Generative Adversarial Networks (GANs). The SHAP module quantifies each biomarker’s contribution to the diagnosis, while the GAN component optimizes medical data generation. This approach tackles three key challenges in medical diagnosis: data scarcity, model interpretability, and diagnostic accuracy. Results show that SHAP-GAN outperforms traditional methods in sensitivity, accuracy, and interpretability, particularly with high-dimensional and imbalanced ovarian cancer datasets. The top three influential features identified are PRR11, CIAO1, and SMPD3, which exhibit wide SHAP value distributions, highlighting their significant impact on model predictions. The SHAP-GAN network has demonstrated an impressive accuracy rate of 99.34% on the ovarian cancer dataset, significantly outperforming baseline algorithms, including Support Vector Machines (SVM), Logistic Regression (LR), and XGBoost. Specifically, SVM achieved an accuracy of 72.78%, LR achieved 86.09%, and XGBoost achieved 96.69%. These results highlight the superior performance of SHAP-GAN in handling high-dimensional and imbalanced datasets. Furthermore, SHAP-GAN significantly alleviates the challenges associated with intricate genetic data analysis, empowering medical professionals to tailor personalized treatment strategies for individual patients.
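The SHAP side of such a pipeline, ranking biomarkers by their average contribution to predictions, can be sketched as below; for simplicity the classifier is a gradient-boosted baseline (one of the paper’s comparison models), not the SHAP-GAN architecture itself.

```python
# Illustrative sketch: rank biomarkers by mean absolute SHAP value for a
# gradient-boosted baseline classifier (not the SHAP-GAN model).
import numpy as np
import shap
from xgboost import XGBClassifier

def rank_biomarkers(X: np.ndarray, y: np.ndarray, feature_names: list[str], top_k: int = 3):
    model = XGBClassifier(n_estimators=200, max_depth=3, eval_metric="logloss")
    model.fit(X, y)
    shap_values = shap.TreeExplainer(model).shap_values(X)  # (n_samples, n_features)
    mean_abs = np.abs(shap_values).mean(axis=0)             # global importance per feature
    order = np.argsort(mean_abs)[::-1][:top_k]
    return [(feature_names[i], float(mean_abs[i])) for i in order]
```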
