Breast Lesion Detection Using Weakly Dependent Customized Features and Machine Learning Models with Explainable Artificial Intelligence
Abstract
1. Introduction
- To propose new weakly dependent features, namely bounded histogram features (CHi) and grayscale density features (Ci). The dependence between features is rarely analyzed; this study investigates how that dependence affects individual ML predictions and searches for explanatory reasons;
- To integrate XAI techniques such as LIME and SHAP into ML models to understand the models' predictions. XAI assists the classification process and enhances trust in distinguishing between benign and malignant breast lesions;
- To investigate possible discrepancies between the ML and XAI models in identifying the most important features and in explaining their contributions to decision-making.
2. Related Work
3. Experiment Setup and Weakly Dependent Feature Generation
3.1. Study Design
3.2. Weakly Dependent Features
3.2.1. Bounded Histogram Features CHi
3.2.2. Grayscale Density Features Ci
3.2.3. Algorithm for Extracting Bounded Histogram and Grayscale Density Features
- (i) The input consists of raw breast ultrasound (US) images (denoted as A) and the corresponding binary mask of each lesion (denoted as B), which serves as the ground truth image. The output consists of two feature classes, one computed from the image histograms and the other from the segmented ROIs (B superimposed over A).
- (ii) To determine the CHi and Ci feature values, the sums of the intensity values of all pixels are computed as sum(sum(B(:))) and sum(sum(ROI(:))).
- (iii) The areas of each bounded histogram partition and of the full ROI histogram are divided, which allows the eight CHi features to be computed using sum(H(:)).
Algorithm 1. Computation of CHi and Ci features
Input: A ← grey-level US images; B ← ground truth binary images
ROI ← overlap(A, B)
S ← sum(sum(B(:)))
H ← histogram(ROI)
p ← 0
j ← 31
for i := 1 to 8 do
    if ROI >= p AND ROI <= j then
        Ci ← sum(sum(ROI(:)))/S
    end if
    if H >= p AND H <= j then
        CHi ← sum(H(:))
    end if
    p ← p + 32
    j ← j + 32
end for
Output: eight Ci features; eight CHi features
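The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal, dependency-free reading of the pseudocode, not the authors' MATLAB implementation. It assumes 8-bit grey levels split into eight bands of width 32, a mask whose non-zero entries mark the lesion, and that only pixels inside the mask enter the ROI. The function name `weakly_dependent_features` and its parameters are hypothetical.

```python
def weakly_dependent_features(image, mask, n_bins=8, bin_width=32):
    """Return (ch, c) for one image.

    ch[i]: number of ROI pixels whose grey level lies in band i (CHi, raw count).
    c[i]:  sum of ROI grey levels in band i divided by the lesion area S (Ci).
    """
    # ROI = image restricted to the lesion mask (assumption: background excluded).
    roi = [px
           for img_row, mask_row in zip(image, mask)
           for px, m in zip(img_row, mask_row) if m]
    area = len(roi)  # S in Algorithm 1, assuming a 0/1 mask
    if area == 0:
        raise ValueError("mask contains no lesion pixels")

    ch = [0] * n_bins
    c = [0.0] * n_bins
    for px in roi:
        # Band i covers grey levels [32*i, 32*i + 31]; clamp as a safety net.
        i = min(int(px) // bin_width, n_bins - 1)
        ch[i] += 1
        c[i] += px / area
    return ch, c


# Tiny illustrative input (made-up values): 2x2 image, three lesion pixels.
image = [[10, 40], [200, 255]]
mask = [[1, 1], [1, 0]]
ch, c = weakly_dependent_features(image, mask)
```

If normalized CHi values are wanted, each `ch[i]` can be divided by `sum(ch)`; whether the original features are normalized this way is not clear from the pseudocode.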
4. AI Tools
4.1. ML Algorithms and Weakly Dependent Features—Importance
4.2. The Interpretation XAI Framework
4.2.1. Local Interpretable Model-Agnostic Explanations (LIME)
4.2.2. SHapley Additive exPlanations (SHAP)
5. Results
5.1. Classification Results
5.2. XAI Interpretability Results
- i. RF classifier: CH8 and C8 were the most important contributing features in the prediction of malignancy; CH3, CH1, C3, and C5 were the most important contributing features in the prediction of benign conditions;
- ii. GBC classifier: CH7, CH8, CH4, C8, C1, and C4 were the most important contributing features in the prediction of malignancy; for benign conditions, they were CH3, CH2, CH5, C3, and C5;
- iii. XGB classifier: CH7, CH8, CH4, C8, and C1 were the most important contributing features in the prediction of malignancy; for benign conditions, they were CH3, CH2, CH5, and C6.
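To make concrete what a per-feature contribution means in a SHAP-style explanation, the sketch below computes exact Shapley values for a toy scoring function. It is a simplified stand-in and not the `shap` library: absent features are replaced by a single baseline value, and the function `toy_score`, its coefficients, and the three feature slots (labelled CH8, CH3, and C8 only for illustration) are made up.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline input."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their real value, the rest the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                s = set(s)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_score(z):
    # Hypothetical model: two linear terms plus one interaction term.
    ch8, ch3, c8 = z
    return 0.5 * ch8 - 0.3 * ch3 + 0.2 * ch8 * c8


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, x, baseline)
```

The contributions sum to `toy_score(x) - toy_score(baseline)`, and the interaction term is split equally between the two features involved. Positive and negative values push the score in opposite directions, which is how features end up listed under one class or the other.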
6. Discussion
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- [1] World Cancer Research Fund International. Breast Cancer Statistics. Available online: https://www.wcrf.org/cancer-trends/ (accessed on 15 July 2024).
- [2] American Cancer Society. How Common Is Breast Cancer? Available online: https://www.cancer.org/cancer/types/breast-cancer (accessed on 15 July 2024).
- [3] World Health Organization. Breast Cancer. Available online: https://www.who.int/news-room/fact-sheets/detail/breast-cancer (accessed on 15 July 2024).
- [4] Kerlikowske, K. Epidemiology of Ductal Carcinoma In Situ. JNCI Monogr. 2010, 2010, 139–141.
- [5] Mahdavi, M.; Nassiri, M.; Kooshyar, M.M.; Vakili-Azghandi, M.; Avan, A.; Sandry, R.; Pillai, S.; Lam, A.K.; Gopalan, V. Hereditary Breast Cancer; Genetic Penetrance and Current Status with BRCA. J. Cell. Physiol. 2019, 234, 5741–5750.
- [6] Heer, E.; Ruan, Y.; Mealey, N.; Quan, M.L.; Brenner, D.R. The Incidence of Breast Cancer in Canada 1971–2015: Trends in Screening-Eligible and Young-Onset Age Groups. Can. J. Public Health 2020, 111, 787–793.
- [7] Venkatachalam, N.; Shanmugam, L.; Heltin Genitha, C.; Kumar, S. Automated Breast Boundary Segmentation to Improve the Accuracy of Identifying Abnormalities in Breast Thermograms. IETE J. Res. 2024, 70, 1462–1471.
- [8] Shi, H.-Y.; Li, C.-H.; Chen, Y.-C.; Chiu, C.-C.; Lee, H.-H.; Hou, M.-F. Quality of Life and Cost-Effectiveness of Different Breast Cancer Surgery Procedures: A Markov Decision Tree-Based Approach in the Framework of Predictive, Preventive, and Personalized Medicine. EPMA J. 2023, 14, 457–475.
- [9] Raaj, R.S. Breast Cancer Detection and Diagnosis Using Hybrid Deep Learning Architecture. Biomed. Signal Process. Control 2023, 82, 104558.
- [10] Anghelache Nastase, I.-N.; Moldovanu, S.; Moraru, L. Image Moment-Based Features for Mass Detection in Breast US Images via Machine Learning and Neural Network Classification Models. Inventions 2022, 7, 42.
- [11] Moldovanu, S.; Miron, M.; Rusu, C.-G.; Biswas, K.C.; Moraru, L. Refining Skin Lesions Classification Performance Using Geometric Features of Superpixels. Sci. Rep. 2023, 13, 11463.
- [12] Singh, H.; Sharma, V.; Singh, D. Comparative Analysis of Proficiencies of Various Textures and Geometric Features in Breast Mass Classification Using K-Nearest Neighbor. Vis. Comput. Ind. Biomed. Art 2022, 5, 3.
- [13] Miron, M.; Moldovanu, S.; Culea-Florescu, A.L. A Multi-Layer Feed Forward Neural Network for Breast Cancer Diagnosis from Ultrasound Images. In Proceedings of the 2022 26th International Conference on System Theory, Control and Computing (ICSTCC), Sinaia, Romania, 19–21 October 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 421–425.
- [14] Atban, F.; Ekinci, E.; Garip, Z. Traditional Machine Learning Algorithms for Breast Cancer Image Classification with Optimized Deep Features. Biomed. Signal Process. Control 2023, 81, 104534.
- [15] Uddin, K.M.M.; Biswas, N.; Rikta, S.T.; Dey, S.K. Machine Learning-Based Diagnosis of Breast Cancer Utilizing Feature Optimization Technique. Comput. Methods Programs Biomed. Update 2023, 3, 100098.
- [16] Al Tawil, A.; Almazaydeh, L.; Alqudah, B.; Abualkishik, A.Z.; Alwan, A.A. Predictive Modeling for Breast Cancer Based on Machine Learning Algorithms and Features Selection Methods. Int. J. Electr. Comput. Eng. 2024, 14, 1937–1947.
- [17] Li, D.; Zhang, S.; Ma, X. Dynamic Module Detection in Temporal Attributed Networks of Cancers. IEEE/ACM Trans. Comput. Biol. Bioinform. 2021, 19, 2219–2230.
- [18] Sovrano, F.; Sapienza, S.; Palmirani, M.; Vitali, F. A Survey on Methods and Metrics for the Assessment of Explainability Under the Proposed AI Act. In Frontiers in Artificial Intelligence and Applications; Schweighofer, E., Ed.; IOS Press: Amsterdam, Netherlands, 2021; ISBN 978-1-64368-252-5.
- [19] Ribeiro, M.T.; Singh, S.; Guestrin, C. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; ACM: New York, NY, USA, 2016; pp. 1135–1144.
- [20] Lundberg, S.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. arXiv 2017, arXiv:1705.07874.
- [21] Nastase, I.-N.A.; Moldovanu, S.; Biswas, K.C.; Moraru, L. Role of Inter- and Extra-Lesion Tissue, Transfer Learning, and Fine-Tuning in the Robust Classification of Breast Lesions. Sci. Rep. 2024, 14, 22754.
- [22] Akram, A. Recognizing Breast Cancer Using Edge-Weighted Texture Features of Histopathology Images. Comput. Mater. Contin. 2023, 77, 1081–1101.
- [23] Hoque, R.; Das, S.; Hoque, M.; Haque, E. Breast Cancer Classification Using XGBoost. World J. Adv. Res. Rev. 2024, 21, 1985–1994.
- [24] Zhang, H.; Zhang, M.; Xu, T.; Wang, X.; Qi, J.; Wang, Y.; Liu, W.; Zhu, L.; Yuan, Z.; Si, C. Ions and Electrons Dual Transport Channels Regulated by Nanocellulose for Mitigating Dendrite Growth of Zinc-Ion Batteries. Chem. Eng. J. 2025, 505, 159476.
- [25] Ghnemat, R.; Alodibat, S.; Abu Al-Haija, Q. Explainable Artificial Intelligence (XAI) for Deep Learning Based Medical Imaging Classification. J. Imaging 2023, 9, 177.
- [26] Jean-Quartier, C.; Bein, K.; Hejny, L.; Hofer, E.; Holzinger, A.; Jeanquartier, F. The Cost of Understanding—XAI Algorithms towards Sustainable ML in the View of Computational Cost. Computation 2023, 11, 92.
- [27] Silva-Aravena, F.; Núñez Delafuente, H.; Gutiérrez-Bahamondes, J.H.; Morales, J. A Hybrid Algorithm of ML and XAI to Prevent Breast Cancer: A Strategy to Support Decision Making. Cancers 2023, 15, 2443.
- [28] Munshi, R.M.; Cascone, L.; Alturki, N.; Saidani, O.; Alshardan, A.; Umer, M. A Novel Approach for Breast Cancer Detection Using Optimized Ensemble Learning Framework and XAI. Image Vis. Comput. 2024, 142, 104910.
- [29] Rezazadeh, A.; Jafarian, Y.; Kord, A. Explainable Ensemble Machine Learning for Breast Cancer Diagnosis Based on Ultrasound Image Texture Features. Forecasting 2022, 4, 262–274.
- [30] Adelodun, A.B.; Ogundokun, R.O.; Yekini, A.O.; Awotunde, J.B.; Timothy, C.C. Explainable Artificial Intelligence with Scaling Techniques to Classify Breast Cancer Images. In Explainable Machine Learning for Multimedia Based Healthcare Applications; Hossain, M.S., Kose, U., Gupta, D., Eds.; Springer International Publishing: Cham, Switzerland, 2023; pp. 99–137. ISBN 978-3-031-38035-8.
- [31] Sathyan, A.; Weinberg, A.I.; Cohen, K. Interpretable AI for Bio-Medical Applications. Complex Eng. Syst. 2022, 2, 18.
- [32] Zhang, Y.; Weng, Y.; Lund, J. Applications of Explainable Artificial Intelligence in Diagnosis and Surgery. Diagnostics 2022, 12, 237.
- [33] Yagin, B.; Yagin, F.; Colak, C.; Inceoglu, F.; Kadry, S.; Kim, J. Cancer Metastasis Prediction and Genomic Biomarker Identification through Machine Learning and eXplainable Artificial Intelligence in Breast Cancer Research. Diagnostics 2023, 13, 3314.
- [34] Lee, Y.-W.; Huang, C.-S.; Shih, C.-C.; Chang, R.-F. Axillary Lymph Node Metastasis Status Prediction of Early-Stage Breast Cancer Using Convolutional Neural Networks. Comput. Biol. Med. 2021, 130, 104206.
- [35] Al-Dhabyani, W.; Gomaa, M.; Khaled, H.; Fahmy, A. Dataset of Breast Ultrasound Images. Data Brief 2020, 28, 104863.
- [36] Tăbăcaru, G.; Moldovanu, S.; Răducan, E.; Barbu, M. A Robust Machine Learning Model for Diabetic Retinopathy Classification. J. Imaging 2023, 10, 8.
- [37] Saarela, M.; Jauhiainen, S. Comparison of Feature Importance Measures as Explanations for Classification Models. SN Appl. Sci. 2021, 3, 272.
- [38] Lundberg, S.M.; Erion, G.; Chen, H.; DeGrave, A.; Prutkin, J.M.; Nair, B.; Katz, R.; Himmelfarb, J.; Bansal, N.; Lee, S.-I. From Local Explanations to Global Understanding with Explainable AI for Trees. Nat. Mach. Intell. 2020, 2, 56–67.
- [39] Westfall, P.H.; Troendle, J.F.; Pennello, G. Multiple McNemar tests. Biometrics 2010, 66, 1185–1191.
- [40] Dayal, S.; Krishna, M.; Kannaujia, S.K.; Singh, S. Gray Lesions of the Breast and Its Diagnostic Significance: A Retrospective Study from Rural India. J. Microsc. Ultrastruct. 2021, 9, 119–124.
- [41] Rafferty, A.; Nenutil, R.; Rajan, A. Explainable Artificial Intelligence for Breast Tumour Classification: Helpful or Harmful. In Interpretability of Machine Intelligence in Medical Image Computing; Reyes, M., Henriques Abreu, P., Cardoso, J., Eds.; Lecture Notes in Computer Science; Springer Nature Switzerland: Cham, Switzerland, 2022; Volume 13611, pp. 104–123. ISBN 978-3-031-17975-4.
- [42] Mohi Uddin, K.M.; Biswas, N.; Rikta, S.T.; Dey, S.K.; Qazi, A. XML-LIGHTGBMDROID: A Self-driven Interactive Mobile Application Utilizing Explainable Machine Learning for Breast Cancer Diagnosis. Eng. Rep. 2023, 5, e12666.
- [43] Suresh, T.; Assegie, T.A.; Ganesan, S.; Tulasi, R.L.; Mothukuri, R.; Salau, A.O. Explainable Extreme Boosting Model for Breast Cancer Diagnosis. Int. J. Electr. Comput. Eng. 2023, 13, 5764.
- [44] Maheswari, B.U.; Aaditi, A.; Avvaru, A.; Tandon, A.; De Prado, R.P. Interpretable Machine Learning Model for Breast Cancer Prediction Using LIME and SHAP. In Proceedings of the 2024 IEEE 9th International Conference for Convergence in Technology (I2CT), Pune, India, 5–7 April 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–6.
- [45] Kaushik, A.; Madhuranath, B.; Rao, D.; Dey, S.R.; Sampatrao, G.S. Interpreting Breast Cancer Recurrence Prediction Models: Exploring Feature Importance with Explainable AI. In Proceedings of the 2024 3rd International Conference on Artificial Intelligence For Internet of Things (AIIoT), Vellore, India, 3–4 May 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–6.
- [46] Wani, N.A.; Kumar, R.; Bedi, J. Harnessing Fusion Modeling for Enhanced Breast Cancer Classification through Interpretable Artificial Intelligence and In-Depth Explanations. Eng. Appl. Artif. Intell. 2024, 136, 108939.
Reference/Year | Models | Dataset | Accuracy |
---|---|---|---|
[14], 2023 | SVM | BreakHis | 97.75% |
[15], 2023 | Logistic regression + SVM | Wisconsin Breast Cancer Dataset | 96.50% |
[16], 2024 | LightGBM | Wisconsin Breast Cancer Dataset | 95.00% |
[18], 2021 | SVM | Wisconsin Breast Cancer Dataset | 96.50% |
[21], 2020 | SVM | Wisconsin Breast Cancer Diagnostic | 98.00% |
[22], 2023 | XGBoost | BreakHis | 99.27% |
[23], 2024 | XGBoost | Wisconsin Breast Cancer Dataset | 94.74% |
[24], 2024 | RF | GSE9893 | 93.60% |
[27], 2023 | XGBoost | Mendeley | 81.00% |
[28], 2024 | RF + SVM | Wisconsin Breast Cancer Dataset | 99.99% |
[29], 2022 | LightGBM | Ultrasound breast images dataset | 91.00% |
[30], 2023 | RF | Wisconsin Breast Cancer Dataset | 98.60% |
[31], 2022 | Deep neural network (DNN) | Wisconsin Breast Cancer Dataset | 97.00% |
Hardware/Software | Specification | Value |
---|---|---|
Hardware | Model | MacBook Pro |
Hardware | Chip | Apple M1 Pro |
Hardware | Memory | 16 GB |
Software | PyCharm 2023.3.3 | Integrated development environment |
Software | Python 3.10 | Scikit-learn 1.4.1 |
Software | Python 3.10 | Shap 0.46.0 & Lime 0.2.0.1 |
Software | MATLAB R2021a | Image Processing Toolbox |
Feature Class | Classifier | Accuracy | F1 Score | AUC | | Confusion Matrix |
---|---|---|---|---|---|---|
CHi | XGB | 0.851 | 0.739 | 0.728 | 2.66 | [[34 16] [8 104]] |
CHi | RF | 0.969 | 0.950 | 0.965 | 0.20 | [[48 2] [3 109]] |
CHi | GBC | 0.919 | 0.863 | 0.872 | 1.92 | [[41 9] [4 109]] |
CHi | LASSO | 0.776 | 0.834 | 0.798 | 0.482 | [[28 14] [15 73]] |
Ci | XGB | 0.882 | 0.822 | 0.816 | 4.26 | [[44 14] [5 99]] |
Ci | RF | 0.969 | 0.957 | 0.967 | 0.20 | [[57 2] [3 100]] |
Ci | GBC | 0.938 | 0.901 | 0.916 | 0.40 | [[46 6] [4 106]] |
Ci | LASSO | 0.761 | 0.818 | 0.812 | 0.429 | [[29 13] [18 70]] |
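The reported scores can be cross-checked against the confusion matrices with a few lines of arithmetic. The sketch below is an illustration rather than the authors' evaluation code. It assumes that row 0 and column 0 correspond to the malignant class and that F1 is computed with malignant as the positive class. The numeric column that precedes each confusion matrix (its header did not survive) appears, for the tree-based classifiers, to match the off-diagonal ratio (b − c)²/(b + c), which has the form of an uncorrected McNemar statistic; that reading is an inference. The helper name `metrics_from_confusion` is hypothetical.

```python
def metrics_from_confusion(cm):
    """Accuracy, F1 (class 0 as positive), and off-diagonal ratio of a 2x2 matrix."""
    (tp, b), (c, tn) = cm  # b and c are the two off-diagonal (error) counts
    total = tp + b + c + tn
    accuracy = (tp + tn) / total
    f1 = 2 * tp / (2 * tp + b + c)
    # (b - c)^2 / (b + c); defined as 0.0 when there are no errors at all.
    off_diag = (b - c) ** 2 / (b + c) if (b + c) else 0.0
    return accuracy, f1, off_diag


# RF on CHi features, matrix taken from the table above.
acc, f1, chi2 = metrics_from_confusion([[48, 2], [3, 109]])
print(round(acc, 3), round(f1, 3), round(chi2, 2))  # 0.969 0.95 0.2
```

The printed values agree with the RF row for CHi (accuracy 0.969, F1 0.950, 0.20), which supports the class-0-as-positive reading of the F1 column.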
Features | Classifier | Malignant (0) (%) | Benign (1) (%) |
---|---|---|---|
CHi | XGB | 13 | 87 |
CHi | RF | 23 | 77 |
CHi | GBC | 4 | 96 |
Ci | XGB | 48 | 52 |
Ci | RF | 24 | 76 |
Ci | GBC | 40 | 60 |
Classifier | Figure 3 | Relevant Features: Malignant (0) (%) | Relevant Features: Benign (1) (%) |
---|---|---|---|
RF | (a1) | CH7, CH8, CH6 | CH4, CH5, CH3, CH2, CH1 |
RF | (a2) | C7, C1, C6, C8 | C3, C2, C5, C4 |
GBC | (b1) | CH7, CH6, CH4, CH8 | CH5, CH2, CH3, CH1 |
GBC | (b2) | C7, C1, C4, C8 | C5, C6, C3, C2 |
XGB | (c1) | CH7, CH6, CH8, CH4 | CH3, CH2, CH5, CH1 |
XGB | (c2) | C7, C2, C3, C1, C8 | C5, C4, C6 |
Classifier | Figure 4 | Relevant Features: Malignant (Blue) | Relevant Features: Benign (Red) |
---|---|---|---|
RF | (a1) | CH8, CH7, CH4 | CH3, CH2, CH1 |
RF | (a2) | C8, C4, C2 | C3, C7, C5 |
GBC | (b1) | CH8, CH7, CH4 | CH3, CH2, CH5 |
GBC | (b2) | C8, C1, C4 | C3, C7, C5 |
XGB | (c1) | CH8, CH7, CH4 | CH3, CH2, CH5 |
XGB | (c2) | C8, C6, C1 | C3, C6, C1 |
Reference, Year | AI/Accuracy | XAI | Dataset |
---|---|---|---|
Sathyan et al. [31], 2022 | DNN/97% | SHAP, LIME | Wisconsin Diagnostic Breast Cancer |
Uddin et al. [42], 2023 | LightGBM/99% | SHAP | Kaggle Breast Cancer Dataset |
Suresh et al. [43], 2023 | XGB/98.42% | SHAP | Wisconsin Diagnostic Breast Cancer |
Maheswari et al. [44], 2024 | RF/95.9% | SHAP, LIME | Kaggle Breast Cancer Dataset |
Munshi et al. [28], 2024 | RF+SVM/99.99% | SHAP | Wisconsin Diagnostic Breast Cancer |
Kaushik et al. [45], 2024 | SVM/97.90% | SHAP, LIME | Breast Cancer Dataset |
Wani et al. [46], 2024 | Light Gradient Boosting Model/98.71% | SHAP | SEER Breast Cancer dataset |
Proposed | RF/100% | SHAP, LIME | BUSI dataset |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Moldovanu, S.; Munteanu, D.; Biswas, K.C.; Moraru, L. Breast Lesion Detection Using Weakly Dependent Customized Features and Machine Learning Models with Explainable Artificial Intelligence. J. Imaging 2025, 11, 135. https://doi.org/10.3390/jimaging11050135