A New Computer-Aided Diagnosis System for Breast Cancer Detection from Thermograms Using Metaheuristic Algorithms and Explainable AI
Abstract
1. Introduction
- Our primary goal is to address dynamic optimization problems comprehensively. We develop Hybrid PSO (HPSO) and Hybrid SMO (HSMO) methods that handle continuous and discrete variables together, so that the optimizer can find effective solutions across a varied search space.
- Using multi-objective optimization, the proposed optimizers aim to give a deeper understanding of model behavior across varying problem formulations. The multi-objective ML approach optimizes several aspects of model performance simultaneously: hyperparameter settings, prediction performance, sparseness, and interpretability.
- We automate the search for the textures and features that best represent breast cancer classification patterns. Handcrafted features are extracted from breast thermogram images using LBP, HOG, Gabor filters, and Canny edge detection, selected through machine learning-based optimization functions, and classified with an SVM.
- Visual explanation techniques that generate interpretable graphical representations let healthcare practitioners intuitively grasp the rationale behind each classification, making the complex patterns in breast thermograms more comprehensible.
- The proposed HSMO and HPSO optimizers incorporate the SHapley Additive exPlanations (SHAP) framework into the evaluation process, so that the importance of individual features is taken into account. This adds another layer of complexity and variation to the solutions being explored.
- A further goal is to evaluate the convergence behavior of distinct metaheuristic algorithms (HPSO, HSMO, the Binary Particle Swarm Optimization algorithm (BPSO), and the Binary Spider Monkey Optimization algorithm (BSMO)), first to identify similarities and differences in their solutions and second to verify that results are consistent across repeated runs.
2. Related Work
3. Methodology
3.1. Dataset Description
3.2. Data Pre-Processing
3.3. Feature Extraction
3.4. The Proposed Metaheuristic-Based Feature and Hyperparameter Optimization
3.4.1. HPSO Optimizer
3.4.2. HSMO Optimizer
Algorithm 1. Proposed HPSO
1. Initialize:
   - Initialize the particle positions and velocities.
   - Set parameters such as the inertia weight (w), cognitive parameter (c1), social parameter (c2), and the number of particles (N).
   - Randomly generate initial positions and velocities for the particles.
   - Create a DataFrame to store particle information: particle ID, objective function value, global best position g, and parameters (analogic and continuous).
   - Calculate the initial objective function values for the particles.
   - Set the best position of each particle and the global best position.
2. Iterative Optimization:
   - For a specified number of iterations or until a stopping criterion is met:
     a. Update Velocities:
        - Calculate new velocities for each particle using the formula in Equation (8).
     b. Update Positions:
        - Update the positions of the particles using the new velocities.
        - For continuous parameters, add the velocity to the position, as defined in Equation (2).
        - For binary parameters, use a sigmoid function to determine whether to flip the bit based on the velocity, as defined in Equation (6).
     c. Evaluate Objective Function:
        - Calculate the objective function values for the updated positions of the particles.
        - Update the DataFrame with the new positions and objective function values.
        - Update the best position of each particle if the new position is better.
     d. Update Global Best:
        - Determine whether any particle's current position is better than the global best position.
        - Update the global best position if necessary, as defined in Equation (8).
3. Final Output:
   - Return the best solution found and its corresponding objective function value.
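The steps of Algorithm 1 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `hybrid_pso`, the default values of `w`, `c1`, and `c2`, the clipping of continuous variables to their bounds, and the choice of minimisation are all hypothetical. The binary update follows the standard sigmoid rule of the discrete binary PSO of Kennedy and Eberhart, where a bit is set to 1 with probability sigmoid(velocity).

```python
import numpy as np

def sigmoid(x):
    # Clip the argument to avoid overflow in exp for large velocities.
    return 1.0 / (1.0 + np.exp(-np.clip(x, -50.0, 50.0)))

def hybrid_pso(objective, lower, upper, n_bits, n_particles=20, n_iter=50,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise objective(cont, bits) over continuous and binary variables.

    All parameter defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    n_cont = lower.size

    # 1. Initialise positions (continuous part, then binary part) and velocities.
    x_cont = rng.uniform(lower, upper, size=(n_particles, n_cont))
    x_bin = rng.integers(0, 2, size=(n_particles, n_bits))
    pos = np.hstack([x_cont, x_bin]).astype(float)
    vel = rng.uniform(-1.0, 1.0, size=pos.shape)

    cost = np.array([objective(p[:n_cont], p[n_cont:]) for p in pos])
    pbest, pbest_cost = pos.copy(), cost.copy()
    best = int(np.argmin(pbest_cost))
    g, g_cost = pbest[best].copy(), pbest_cost[best]

    # 2. Iterative optimisation.
    for _ in range(n_iter):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # a. Velocity update (inertia + cognitive + social terms).
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        # b. Continuous variables: add the velocity, then clip to the bounds.
        pos[:, :n_cont] = np.clip(pos[:, :n_cont] + vel[:, :n_cont], lower, upper)
        #    Binary variables: sample each bit with probability sigmoid(velocity).
        prob = sigmoid(vel[:, n_cont:])
        pos[:, n_cont:] = (rng.random(prob.shape) < prob).astype(float)
        # c. Evaluate and update personal bests.
        cost = np.array([objective(p[:n_cont], p[n_cont:]) for p in pos])
        improved = cost < pbest_cost
        pbest[improved] = pos[improved]
        pbest_cost[improved] = cost[improved]
        # d. Update the global best if any personal best beats it.
        best = int(np.argmin(pbest_cost))
        if pbest_cost[best] < g_cost:
            g, g_cost = pbest[best].copy(), pbest_cost[best]

    # 3. Return the best continuous part, binary mask, and objective value.
    return g[:n_cont], g[n_cont:].astype(int), g_cost
```

In a CAD setting like the one described here, the continuous part would carry the extractor and SVM hyperparameters and the binary part the feature mask; the objective passed in would train and score the classifier for each candidate.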
Algorithm 2. Proposed HSMO
1. Initialize:
   - Set parameters such as population, parameters (analogic and continuous), groups, max groups, pr, acc_err_delta_threshold, global_lim_thresh, local_lim_thresh, target_value, debug_mode, and pr increment.
   - Initialize the spider monkeys' positions and IDs.
   - Initialize a DataFrame to store positions, fitness, probabilities, etc.
   - Calculate the initial objective function values and fitness values.
   - Assign initial probabilities.
   - Create initial groups.
   - Identify the local leaders and the global leader.
2. Iterative Optimization:
   - For each iteration:
     a. Local Leader Phase:
        - Randomly select spider monkeys based on pr.
        - Randomly select group members for the value update.
        - Update positions using the local leader and the selected group members.
        - Update fitness values.
        - Swap positions if fitness improves.
     b. Global Leader Phase:
        - Identify the global leader.
        - For each group, update positions using the global leader and selected group members.
        - Update fitness values.
        - Swap positions if fitness improves.
     c. Local Decision Phase:
        - Check whether local leader performance improves.
        - Increment the limit count if there is no improvement.
        - Reset the group if the limit count exceeds the threshold.
     d. Global Decision Phase:
        - Check whether global leader performance improves.
        - Increment the global limit count if there is no improvement.
        - Increase the number of groups or reset the groups if the limit count exceeds the threshold.
     e. Update pr.
3. Final Output:
   - Return the best solution found and its corresponding objective function value.
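The Local Leader Phase of Algorithm 2 can be illustrated for the continuous variables with the position update from the original Spider Monkey Optimization algorithm of Bansal et al. This is a hedged sketch, not the authors' HSMO code: the function name `local_leader_phase`, the bound clipping, and the minimisation convention are assumptions, and the binary variables are left out.

```python
import numpy as np

def local_leader_phase(group, costs, objective, pr, rng, lower, upper):
    """One Local Leader Phase pass over a single group (minimisation).

    group: (n_members, n_dims) array of positions
    costs: (n_members,) array of objective values for those positions
    pr:    perturbation rate; a dimension is updated when U(0, 1) >= pr
    """
    group = group.copy()
    costs = costs.copy()
    n_members, n_dims = group.shape
    # The local leader is the best member of the group at the start of the phase.
    leader = group[int(np.argmin(costs))].copy()

    for i in range(n_members):
        # Pick a random group member other than i.
        r = int(rng.choice([k for k in range(n_members) if k != i]))
        trial = group[i].copy()
        for j in range(n_dims):
            if rng.random() >= pr:
                # Move toward the local leader plus a random pull toward/away from r.
                trial[j] = (group[i, j]
                            + rng.random() * (leader[j] - group[i, j])
                            + rng.uniform(-1.0, 1.0) * (group[r, j] - group[i, j]))
        trial = np.clip(trial, lower, upper)  # assumption: keep values in bounds
        trial_cost = objective(trial)
        # Greedy selection: keep the new position only if it improves.
        if trial_cost < costs[i]:
            group[i] = trial
            costs[i] = trial_cost
    return group, costs
```

The Global Leader Phase has the same shape, with the global leader in place of the local leader; the decision phases then only track how long each leader has gone without improving.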
3.5. XAI Models through Metaheuristic Optimization
4. Experimental Setup
5. Results and Discussion
6. Comparison with State-of-the-Art Methodologies
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Ciatto, S.; Rosselli Del Turco, M.; Zappa, M. The detectability of breast cancer by screening mammography. Br. J. Cancer 1995, 71, 337–339.
2. Foxcroft, L.M.; Evans, E.B.; Joshua, H.K.; Hirst, C. Breast cancers invisible on mammography. ANZ J. Surg. 2000, 70, 162–167.
3. Pataky, R.; Phillips, N.; Peacock, S.; Coldman, A.J. Cost-effectiveness of population-based mammography screening strategies by age range and frequency. J. Cancer Policy 2014, 2, 97–102.
4. Pauwels, E.K.J.; Foray, N.; Bourguignon, M.H. Breast Cancer Induced by X-Ray Mammography Screening? A Review Based on Recent Understanding of Low-Dose Radiobiology. Med. Princ. Pract. 2015, 25, 101–109.
5. Dabbous, F.M.; Dolecek, T.A.; Berbaum, M.L.; Friedewald, S.M.; Summerfelt, W.T.; Hoskins, K.; Rauscher, G.H. Impact of a False-Positive Screening Mammogram on Subsequent Screening Behavior and Stage at Breast Cancer Diagnosis. Cancer Epidemiol. Biomark. Prev. 2017, 26, 397–403.
6. Bansal, R.; Collison, S.; Krishnan, L.; Aggarwal, B.; Vidyasagar, M.; Kakileti, S.T.; Manjunath, G. A prospective evaluation of breast thermography enhanced by a novel machine learning technique for screening breast abnormalities in a general population of women presenting to a secondary care hospital. Front. Artif. Intell. 2023, 5, 1050803. Available online: https://www.frontiersin.org/articles/10.3389/frai.2022.1050803 (accessed on 25 February 2023).
7. Da Luz, T.G.R.; Coninck, J.C.; Ulbricht, L. Comparison of the Sensitivity and Specificity Between Mammography and Thermography in Breast Cancer Detection. In XXVII Brazilian Congress on Biomedical Engineering, Proceedings of the CBEB 2020, Vitória, Brazil, 26–30 October 2020; Bastos-Filho, T.F., de Oliveira Caldeira, E.M., Frizera-Neto, A., Eds.; IFMBE Proceedings; Springer International Publishing: Cham, Switzerland, 2022; Volume 83, pp. 2163–2168. ISBN 978-3-030-70600-5.
8. Head, J.F.; Elliott, R.L. Infrared imaging: Making progress in fulfilling its medical promise. IEEE Eng. Med. Biol. Mag. 2002, 21, 80–85.
9. Sarigoz, T.; Ertan, T.; Topuz, Ö.; Sevim, Y.; Cihan, Y. Role of Digital Infrared Thermal Imaging in the Diagnosis of Breast Mass: A Pilot Study. Infrared Phys. Technol. 2018, 91, 214–219.
10. Guetari, R.; Ayari, H.; Sakly, H. Computer-aided diagnosis systems: A comparative study of classical machine learning versus deep learning-based approaches. Knowl. Inf. Syst. 2023, 65, 3881–3921.
11. Retson, T.A.; Eghtedari, M. Expanding Horizons: The Realities of CAD, the Promise of Artificial Intelligence, and Machine Learning’s Role in Breast Imaging beyond Screening Mammography. Diagnostics 2023, 13, 2133.
12. Skjong, R.; Wentworth, B.H. Expert Judgment and Risk Perception. In Proceedings of the Eleventh International Offshore and Polar Engineering Conference, Stavanger, Norway, 17–22 June 2001; OnePetro: Richardson, TX, USA, 2001.
13. Park, H.; Megahed, A.; Yin, P.; Ong, Y.; Mahajan, P.; Guo, P. Incorporating Experts’ Judgment into Machine Learning Models. Expert. Syst. Appl. 2023, 228, 120118.
14. Ali, S.; Abuhmed, T.; El-Sappagh, S.; Muhammad, K.; Alonso-Moral, J.M.; Confalonieri, R.; Guidotti, R.; Del Ser, J.; Díaz-Rodríguez, N.; Herrera, F. Explainable Artificial Intelligence (XAI): What we know and what is left to attain Trustworthy Artificial Intelligence. Inf. Fusion. 2023, 99, 101805.
15. Yang, G.; Ye, Q.; Xia, J. Unbox the black-box for the medical explainable AI via multi-modal and multi-centre data fusion: A mini-review, two showcases and beyond. Inf. Fusion. 2022, 77, 29–52.
16. Tsai, C.-W.; Chiang, M.-C.; Ksentini, A.; Chen, M. Metaheuristic Algorithms for Healthcare: Open Issues and Challenges. Comput. Electr. Eng. 2016, 53, 421–434.
17. Dihmani, H.; Bousselham, A.; Bouattane, O. A Review of Feature Selection and Hyperparameter Optimization Techniques for Breast Cancer Detection on thermograms Images. In Proceedings of the 2023 IEEE 6th International Conference on Cloud Computing and Artificial Intelligence: Technologies and Applications (CloudTech), Marrakech, Morocco, 21–23 November 2023; pp. 01–08.
18. Del Ser, J.; Osaba, E.; Molina, D.; Yang, X.-S.; Salcedo-Sanz, S.; Camacho, D.; Das, S.; Suganthan, P.N.; Coello Coello, C.A.; Herrera, F. Bio-inspired computation: Where we stand and what’s next. Swarm Evol. Comput. 2019, 48, 220–250.
19. Abdel-Nasser, M.; Moreno, A.; Puig, D. Breast Cancer Detection in Thermal Infrared Images Using Representation Learning and Texture Analysis Methods. Electronics 2019, 8, 100.
20. Resmini, R.; Silva, L.; Araujo, A.S.; Medeiros, P.; Muchaluat-Saade, D.; Conci, A. Combining Genetic Algorithms and SVM for Breast Cancer Diagnosis Using Infrared Thermography. Sensors 2021, 21, 4802.
21. Gonçalves, C.B.; Souza, J.R.; Fernandes, H. CNN architecture optimization using bio-inspired algorithms for breast cancer detection in infrared images. Comput. Biol. Med. 2022, 142, 105205.
22. Pramanik, R.; Pramanik, P.; Sarkar, R. Breast cancer detection in thermograms using a hybrid of GA and GWO based deep feature selection method. Expert. Syst. Appl. 2023, 219, 119643.
23. Ensafi, M.; Keyvanpour, M.R.; Shojaedini, S.V. A New method for promote the performance of deep learning paradigm in diagnosing breast cancer: Improving role of fusing multiple views of thermography images. Health Technol. 2022, 12, 1097–1107.
24. Aidossov, N.; Zarikas, V.; Zhao, Y.; Mashekova, A.; Ng, E.Y.K.; Mukhmetov, O.; Mirasbekov, Y.; Omirbayev, A. An Integrated Intelligent System for Breast Cancer Detection at Early Stages Using IR Images and Machine Learning Methods with Explainability. SN Comput. Sci. 2023, 4, 184.
25. Aidossov, N.; Zarikas, V.; Mashekova, A.; Zhao, Y.; Ng, E.Y.K.; Midlenko, A.; Mukhmetov, O. Evaluation of Integrated CNN, Transfer Learning, and BN with Thermography for Breast Cancer Detection. Appl. Sci. 2023, 13, 600.
26. Nicandro, C.-R.; Efrén, M.-M.; María Yaneli, A.-A.; Enrique, M.-D.-C.-M.; Héctor Gabriel, A.-M.; Nancy, P.-C.; Alejandro, G.-H.; Guillermo de Jesús, H.-R.; Rocío Erandi, B.-M. Evaluation of the Diagnostic Power of Thermography in Breast Cancer Using Bayesian Network Classifiers. Comput. Math. Methods Med. 2013, 2013, 1–10.
27. Dey, S.; Roychoudhury, R.; Malakar, S.; Sarkar, R. Screening of breast cancer from thermogram images by edge detection aided deep transfer learning model. Multimed. Tools Appl. 2022, 81, 9331–9349.
28. Silva, L.; Saade, D.; Sequeiros Olivera, G.; Silva, A.; Paiva, A.; Bravo, R.; Conci, A. A New Database for Breast Research with Infrared Image. J. Med. Imaging Health Inform. 2014, 4, 92–100.
29. Zuluaga-Gomez, J.; Masry, Z.A.; Benaggoune, K.; Meraghni, S.; Zerhouni, N. A CNN-based methodology for breast cancer diagnosis using thermal images. Comput. Methods Biomech. Biomed. Eng. Imaging Vis. 2020, 9, 1–15.
30. Dalal, N.; Triggs, B. Histograms of oriented gradients for human detection. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–25 June 2005; Volume 1, pp. 886–893.
31. Ojala, T.; Pietikainen, M.; Maenpaa, T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 971–987.
32. Daugman, J.G. Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. J. Opt. Soc. Am. A JOSAA 1985, 2, 1160–1169.
33. Canny, J. A Computational Approach to Edge Detection. IEEE Trans. Pattern Anal. Mach. Intell. 1986, PAMI-8, 679–698.
34. Bansal, J.C.; Sharma, H.; Jadon, S.S.; Clerc, M. Spider Monkey Optimization algorithm for numerical optimization. Memetic Comp. 2014, 6, 31–47.
35. Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of ICNN’95—International Conference on Neural Networks, Perth, WA, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948.
36. Singh, U.; Salgotra, R.; Rattan, M. A Novel Binary Spider Monkey Optimization Algorithm for Thinning of Concentric Circular Antenna Arrays. IETE J. Res. 2016, 62, 736–744.
37. Kennedy, J.; Eberhart, R.C. A discrete binary version of the particle swarm algorithm. In Proceedings of the Computational Cybernetics and Simulation 1997 IEEE International Conference on Systems, Man, and Cybernetics, Orlando, FL, USA, 12–15 October 1997; Volume 5, pp. 4104–4108.
38. Gupta, K.; Deep, K. Investigation of Suitable Perturbation Rate Scheme for Spider Monkey Optimization Algorithm. In Proceedings of Fifth International Conference on Soft Computing for Problem Solving; Pant, M., Deep, K., Bansal, J.C., Nagar, A., Das, K.N., Eds.; Advances in Intelligent Systems and Computing; Springer: Singapore, 2016; pp. 839–850.
1522 Thermogram Images (56 Subjects: 37 Sick IDs, 19 Healthy IDs) | | | | | |
---|---|---|---|---|---|
Methodology Development | | | 12 New Test Cases | | |
Total | Sick | Healthy | Total | Sick | Healthy |
1282 | 640 | 642 | 240 | 120 | 120 |
CAD Steps | Methods | Feature Variables | Parameter Values * |
---|---|---|---|
Feature Extraction | LBP | Radius | 8; [2, 24]: 2 |
Feature Extraction | LBP | Points | 60; [100, 550]: 10 |
Feature Extraction | HOG | Orientations | 9; [9, 18]: 1 |
Feature Extraction | HOG | Pixels per cell | 8; [2, 16]: 1 |
Feature Extraction | HOG | Cells per block | 4; [2, 5]: 1 |
Feature Extraction | Gabor Filter | K-size: Size of the Gabor kernel. Larger values capture more spatial frequencies but also increase computational complexity. | 80; [20, 190]: 20 |
Feature Extraction | Gabor Filter | Sigma: Standard deviation of the Gaussian envelope. It controls the spread of the filter. | 2.5; [0.10, 7]: 0.10 |
Feature Extraction | Gabor Filter | Theta: Orientation of the normal to the parallel stripes of a Gabor function. It determines the orientation of the features to be detected. | 40; [0.1, 46]: 2 |
Feature Extraction | Gabor Filter | Lambda: Wavelength of the sinusoidal factor. It affects the frequency of the feature to be detected. | 1.6; [0.2, 8]: 0.5 |
Feature Extraction | Gabor Filter | Gamma: Spatial aspect ratio. It controls the ellipticity of the filter. | 90; [8, 250]: 0.5 |
Feature Extraction | Canny Edge | Aperture size: The size of the Sobel kernel used for gradient computation. | 4; [7, 80]: 1 |
Feature Extraction | Canny Edge | Aperture transition: The transition range for adjusting the lower threshold during edge detection. | 2; [2, 10]: 2 |
Feature Extraction | Canny Edge | Aperture transition steps: The number of steps or iterations used for adjusting the aperture transition range. It determines the granularity of the optimization process for the aperture transition. | 60; [10, 120]: 1 |
Feature Selection | The number of selected features by optimizer | 128 Binary variables | True for included/False for not-included feature |
Classification | SVM | C: Regularization parameter. Controls the trade-off between training error and margin. | 100; [10, 7500]: 10 |
Classification | SVM | Coef0: Independent term in the kernel function. | 4; [1 × 10⁻⁶, 4]: 0.5 |
Classification | SVM | Tolerance: Determines the convergence criterion for the model training. | 1 × 10⁻⁶; [1 × 10⁻⁶, 4 × 10⁻²]: 9 × 10⁻⁴ |
Classification | SVM | Degree: The degree of the polynomial function used in the kernel. | 3; [2, 10]: 1 |
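A metaheuristic moves particles through a continuous space, so a position has to be mapped back onto the discrete grid of allowed parameter values before it is evaluated. The sketch below shows one way to do that, assuming that each entry in the table above reads as "value; [lower, upper]: step". That reading is an assumption, and the helper name `snap_to_grid` is hypothetical.

```python
def snap_to_grid(value, lower, upper, step):
    """Clip value to [lower, upper] and round it to the nearest grid point.

    Grid points are lower, lower + step, lower + 2 * step, ... up to upper.
    """
    clipped = min(max(value, lower), upper)
    k = round((clipped - lower) / step)   # index of the nearest grid point
    snapped = lower + k * step
    # Guard against stepping past the upper bound when the range is not
    # an exact multiple of the step.
    return min(snapped, upper)

# Example with the HOG "Orientations" range [9, 18] and step 1.
orientations = snap_to_grid(12.4, 9, 18, 1)
```

For fractional steps such as 0.10, the result may carry ordinary floating-point rounding noise and can be rounded to a fixed number of decimals before use.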
Feature Extractor | Optimizer Approach | SVM C | SVM Degree | SVM Coef0 | SVM Tolerance | Feature Parameters | Remaining Features (%) | Accuracy (%) | F1-Score (%) | Geometric Mean |
---|---|---|---|---|---|---|---|---|---|---|
LBP | BSMO | Default | Default | Default | Default | Default | 28.91 | 93.94 | 93.33 | 0.8172022148770768 |
LBP | BPSO | Default | Default | Default | Default | Default | 24.22 | 92.42 | 91.65 | 0.836874399178275 |
LBP | HSMO | 4050.0 | 4.0 | 3.0 | 0.012 | Radius: 14.0; Points: 240.0 | 26.56 | 97.62 | 97.49 | 0.7169212799999999 |
LBP | HPSO | 3880.0 | 7.0 | 3.0 | 0.002 | Radius: 14.0; Points: 330.0 | 23.44 | 95.89 | 95.67 | 0.73413384 |
Gabor Filters | BSMO | Default | Default | Default | Default | Default | 35.16 | 64.94 | 64.63 | 0.6488998073662836 |
Gabor Filters | BPSO | Default | Default | Default | Default | Default | 32.81 | 63.64 | 0.63 | 0.6539091374189535 |
Gabor Filters | HSMO | 1820.0 | 6.0 | 1.4 | 0.0004 | Size: 180.0; Sigma: 6.9; Theta: 2.0; Lambda: 8.0; Gamma: 249.0 | 39.06 | 85.5 | 83.78 | 0.7218289271011518 |
Gabor Filters | HPSO | 7110.0 | 6.0 | 1.5 | 0.011 | Size: 40.0; Sigma: 5.2; Theta: 2.0; Lambda: 4.5; Gamma: 348.5 | 33.59 | 83.33 | 81.97 | 0.7439049199998613 |
HOG | BSMO | Default | Default | Default | Default | Default | 31.25 | 93.29 | 92.91 | 0.8008550118467137 |
HOG | BPSO | Default | Default | Default | Default | Default | 32.81 | 91.34 | 90.99 | 0.7833986596873905 |
HOG | HSMO | 410.0 | 5.0 | 2.0 | 0.022 | Orientations: 12.0; Pixels per cell: 2.0; Cells per block: 3.0 | 25.78 | 98.27 | 98.15 | 0.8540257256078414 |
HOG | HPSO | 730.0 | 13.0 | 1.5 | 0.033 | Orientations: 14.0; Pixels per cell: 3.0; Cells per block: 5.0 | 25.78 | 95.02 | 94.71 | 0.8397847581374646 |
Canny Edge | BSMO | Default | Default | Default | Default | Default | 32.03 | 88.31 | 88.31 | 0.7747535543642249 |
Canny Edge | BPSO | Default | Default | Default | Default | Default | 39.06 | 89.39 | 89.28 | 0.7380668397916275 |
Canny Edge | HSMO | 2380.0 | 6.0 | 1.5 | 0.009 | Aperture size: 37.0; Aperture transition: 2.0; Aperture transition steps: 10.0 | 37.5 | 91.99 | 91.04 | 0.7582463320056353 |
Canny Edge | HPSO | 1770.0 | 4.0 | 4.5 | 0.006 | Aperture size: 16.0; Aperture transition: 4.0; Aperture transition steps: 12.0 | 32.03 | 85.71 | 83.33 | 0.7632633031922863 |
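The Geometric Mean column can be related to the other columns with a short calculation. For most rows, the reported value is consistent with the geometric mean of accuracy and the fraction of discarded features, sqrt(accuracy × (1 − remaining fraction)). This relationship is inferred from the numbers in the table and is not a definition stated in this section, so the sketch below should be read as an assumption; the function names are hypothetical.

```python
import math

def accuracy_sparsity_gmean(accuracy_pct, remaining_pct):
    """Geometric mean of accuracy and the fraction of discarded features.

    Both inputs are percentages as printed in the results table.
    """
    accuracy = accuracy_pct / 100.0
    discarded = 1.0 - remaining_pct / 100.0
    return math.sqrt(accuracy * discarded)

def selected_feature_count(remaining_pct, total_features=128):
    """Number of features kept, given the 128 binary selection variables."""
    return round(total_features * remaining_pct / 100.0)

# HOG features with the HSMO optimizer: 98.27% accuracy, 25.78% features kept.
hog_hsmo_gmean = accuracy_sparsity_gmean(98.27, 25.78)
hog_hsmo_features = selected_feature_count(25.78)
```

Under this reading, the HOG/HSMO row gives a value that agrees with the tabulated 0.8540 and a feature count of 33. The two LBP rows for HSMO and HPSO instead match the plain product of the two terms, without the square root.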
Work Ref. | Methodology | No. of Features | Instances/Split Method | Accuracy (%) | F1-Score (%) |
---|---|---|---|---|---|
[29] | Optimizing CNN by Bayesian optimization | - | 1120 Data augmentation | 92 | 92 |
[22] | Deep Feature and a Hybrid GA and GWO | 29 | 1126 (70% train, 10% validation, 20% test) | 100 | - |
[20] | Textural features and GA | 42 | 300 (70% train, 30% test –test splitting with fourfold cross-validation) | 96.15 | 95.89 |
[23] | Deep learning | - | 461 (fourfold cross-validation) | 93 | - |
[19] | HOG features and LTR | 44 | 1120 (70% train, 15% validation, 15% test) | 95.8 | 95.4 |
Proposed methodology | HSMO-based HOG features and hyperparameter optimization | 33 | 1522 (70% train, 30% test) | 98.27 | 98.15 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Dihmani, H.; Bousselham, A.; Bouattane, O. A New Computer-Aided Diagnosis System for Breast Cancer Detection from Thermograms Using Metaheuristic Algorithms and Explainable AI. Algorithms 2024, 17, 462. https://doi.org/10.3390/a17100462