Current Challenges and Future Opportunities for XAI in Machine Learning-Based Clinical Decision Support Systems: A Systematic Review
Abstract
1. Introduction
2. Fundamental Concepts and Background
2.1. Clinical Decision Support Systems
2.2. Explainable AI (XAI)
2.2.1. The Need for XAI: Fair and Ethical Decision-Making
2.3. XAI in Medicine
2.4. Types of Explanations
2.4.1. Ante-Hoc Methods
2.4.2. Post-Hoc Methods
2.5. Trade-off between Interpretability and Performance
2.6. What Do Clinicians Want?
- Clinicians view explainability as a means of justifying their clinical decision-making (e.g., to patients and colleagues) in the context of a model’s decision.
- The implemented system/model needs to provide information about the context within which it operates and promote awareness of situations where it may fall short (e.g., the model did not use specific history, or lacked information about certain aspects of a patient's context). Models with imperfect accuracy were deemed acceptable provided there is clarity about why the model under-performs.
- Familiar metrics such as reliability, specificity, and sensitivity were important for the initial uptake of an AI tool. However, a critical factor for continuing use was whether the tool was repeatedly successful in prognosticating their patient’s condition in their personal experience. Real-world application was crucial to developing “a sense of when it’s working and when it’s limited” which meant “alignment with expectations and clinical presentation”.
- Clinical thought processes for acting on the predictions of any assistive tool appear to consist of two primary steps following presentation of the model's prediction: (i) understanding and (ii) rationalizing the predictions. Classes of explanations for clinical ML models should therefore be designed to facilitate this understanding and rationalization process. Clinicians believe that carefully designed visualization and presentation can further aid understanding of the model.
- A well-designed explanation should augment or supplement clinical ML systems to (a) recalibrate clinician (stakeholder) trust in model predictions, (b) provide a level of transparency that allows users to validate model outputs against domain knowledge, (c) reliably communicate model predictions using task-specific representations (e.g., confidence scores), and (d) provide parsimonious, actionable steps clinicians can take. A minimal sketch of such a presentation follows this list.
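To make points (a)–(d) concrete, the sketch below shows one way a clinical ML system might surface a prediction alongside a confidence score and the features the model relies on. This is a minimal illustration on synthetic data: the model choice, the feature names (`age`, `systolic_bp`, etc.), and the use of permutation importance are our assumptions, not the design of any system reviewed here.

```python
# Minimal sketch: presenting a prediction with a confidence score and the
# features the model leans on, so a clinician can check the output against
# domain knowledge. Synthetic data; all feature names are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
features = ["age", "systolic_bp", "creatinine", "heart_rate"]
X = rng.normal(size=(500, len(features)))
y = (X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=500) > 0).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Global sanity check: which inputs drive the model overall.
imp = permutation_importance(model, X, y, n_repeats=10, random_state=0)
ranked = sorted(zip(features, imp.importances_mean), key=lambda t: -t[1])

patient = X[:1]                             # one incoming patient record
risk = model.predict_proba(patient)[0, 1]   # a calibrated score is preferable
print(f"Predicted risk: {risk:.2f}")
for name, score in ranked[:3]:
    print(f"  model relies on '{name}' (permutation importance {score:.3f})")
```

In practice, per-patient attributions (e.g., SHAP values, sketched after the tables in Section 4) and a calibrated probability would replace the global importances shown here.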
3. Materials and Methods
- Specifying research questions
- Conducting searches of specified databases
- Selecting studies by criterion
- Filtering studies by evaluating their pertinence
- Extracting data
- Synthesizing results.
3.1. Research Questions
- RQ1: What AI-based CDSS have been developed that incorporate XAI?
- RQ2: What aspects/methods of the use of XAI in CDSS have been the focus of the literature?
- RQ3: What benefits have been reported when addressing different aspects of the use of XAI in CDSS?
- RQ4: What open problems, challenges, and needs of explainable CDSS are expressed in the literature?
3.2. Conducting Searches
- S1 “clinical decision support system” XAI (35)
- S2 “clinical decision support system” explainable AI (165)
- S3 “clinical decision support system” explainable ML (181)
- S4 CDSS XAI (41)
- S5 CDSS explainable AI (122)
- S6 CDSS explainable ML (124)
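The hit counts above overlap substantially across the six searches, so merged records must be deduplicated before title and abstract screening. Below is a minimal sketch of that step; the record fields and the title-normalization rule are illustrative assumptions, not the actual tooling used for this review.

```python
# Minimal sketch: merging hits from searches S1-S6 and removing duplicates
# before screening. Record fields (title, doi) are illustrative assumptions.
def normalize(title: str) -> str:
    """Case-fold and strip punctuation so near-identical titles match."""
    return "".join(ch for ch in title.lower() if ch.isalnum())

def deduplicate(searches: dict[str, list[dict]]) -> list[dict]:
    seen, unique = set(), []
    for label, hits in searches.items():
        for rec in hits:
            # Prefer the DOI as a key; fall back to the normalized title.
            key = rec.get("doi") or normalize(rec["title"])
            if key not in seen:
                seen.add(key)
                unique.append({**rec, "found_by": label})
    return unique

searches = {
    "S1": [{"title": "An Explainable CDSS", "doi": None}],
    "S2": [{"title": "An explainable CDSS.", "doi": None}],
}
print(len(deduplicate(searches)))  # 1: the same paper, found by two searches
```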
3.3. Paper Selection and Filtering
4. Results
4.1. RQ1: What AI-Based CDSS Have Been Developed That Incorporate XAI?
4.2. RQ2: What Aspects/Methods of the Use of XAI in CDSS Have Been the Focus of the Literature?
4.3. RQ3: What Benefits Have Been Reported When Addressing Different Aspects of the Use of XAI in CDSS?
4.4. RQ4: What Open Problems, Challenges, and Needs of Explainable CDSS Are Expressed in the Literature?
5. Discussion
Guidelines for Implementing Explainable Models in CDSS: Opportunities, Challenges, and Future Research Needs
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Meaning |
|---|---|
| AI | Artificial Intelligence |
| ML | Machine Learning |
| CDSS | Clinical Decision Support Systems |
| XAI | Explainable AI |
| GDPR | General Data Protection Regulation |
References
- Falcone, P.; Borrelli, F.; Asgari, J.; Tseng, H.E.; Hrovat, D. Predictive active steering control for autonomous vehicle systems. IEEE Trans. Control Syst. Technol. 2007, 15, 566–580.
- Silver, D.; Huang, A.; Maddison, C.J.; Guez, A.; Sifre, L.; Van Den Driessche, G.; Schrittwieser, J.; Antonoglou, I.; Panneershelvam, V.; Lanctot, M.; et al. Mastering the game of Go with deep neural networks and tree search. Nature 2016, 529, 484–489.
- Holzinger, A.; Langs, G.; Denk, H.; Zatloukal, K.; Müller, H. Causability and explainability of artificial intelligence in medicine. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2019, 9, e1312.
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
- Marcus, G. Deep learning: A critical appraisal. arXiv 2018, arXiv:1801.00631.
- Goodman, B.; Flaxman, S. European Union regulations on algorithmic decision-making and a “right to explanation”. AI Mag. 2017, 38, 50–57.
- Holzinger, A.; Biemann, C.; Pattichis, C.S.; Kell, D.B. What do we need to build explainable AI systems for the medical domain? arXiv 2017, arXiv:1712.09923.
- Birhane, A. Algorithmic injustice: A relational ethics approach. Patterns 2021, 2, 100205.
- Li, T.; Wang, S.; Lillis, D.; Yang, Z. Combining Machine Learning and Logical Reasoning to Improve Requirements Traceability Recovery. Appl. Sci. 2020, 10, 7253.
- Becker, B.A. Artificial Intelligence in Education: What is it, Where is it Now, Where is it Going? In Ireland’s Yearbook of Education 2017–2018; Mooney, B., Ed.; Education Matters: Dublin, Ireland, 2017; Volume 1, pp. 42–48. ISBN 978-0-9956987-1-0.
- Du, X.; Hargreaves, C.; Sheppard, J.; Anda, F.; Sayakkara, A.; Le-Khac, N.A.; Scanlon, M. SoK: Exploring the State of the Art and the Future Potential of Artificial Intelligence in Digital Forensic Investigation. In Proceedings of the 13th International Workshop on Digital Forensics (WSDF) and 15th International Conference on Availability, Reliability and Security (ARES’20), Virtually, 25–28 August 2020; ACM: New York, NY, USA, 2020.
- Topol, E.J. High-performance medicine: The convergence of human and artificial intelligence. Nat. Med. 2019, 25, 44–56.
- Hwang, E.J.; Park, S.; Jin, K.N.; Im Kim, J.; Choi, S.Y.; Lee, J.H.; Goo, J.M.; Aum, J.; Yim, J.J.; Cohen, J.G.; et al. Development and validation of a deep learning–based automated detection algorithm for major thoracic diseases on chest radiographs. JAMA Netw. Open 2019, 2, e191095.
- Geras, K.J.; Wolfson, S.; Shen, Y.; Wu, N.; Kim, S.; Kim, E.; Heacock, L.; Parikh, U.; Moy, L.; Cho, K. High-resolution breast cancer screening with multi-view deep convolutional neural networks. arXiv 2017, arXiv:1703.07047.
- Chilamkurthy, S.; Ghosh, R.; Tanamala, S.; Biviji, M.; Campeau, N.G.; Venugopal, V.K.; Mahajan, V.; Rao, P.; Warier, P. Deep learning algorithms for detection of critical findings in head CT scans: A retrospective study. Lancet 2018, 392, 2388–2396.
- Burbidge, R.; Trotter, M.; Buxton, B.; Holden, S. Drug design by machine learning: Support vector machines for pharmaceutical data analysis. Comput. Chem. 2001, 26, 5–14.
- Kourou, K.; Exarchos, T.P.; Exarchos, K.P.; Karamouzis, M.V.; Fotiadis, D.I. Machine learning applications in cancer prognosis and prediction. Comput. Struct. Biotechnol. J. 2015, 13, 8–17.
- Arrieta, A.B.; Díaz-Rodríguez, N.; Del Ser, J.; Bennetot, A.; Tabik, S.; Barbado, A.; García, S.; Gil-López, S.; Molina, D.; Benjamins, R.; et al. Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fusion 2020, 58, 82–115.
- Adadi, A.; Berrada, M. Peeking inside the black-box: A survey on Explainable Artificial Intelligence (XAI). IEEE Access 2018, 6, 52138–52160.
- Vellido, A. The importance of interpretability and visualization in machine learning for applications in medicine and health care. Neural Comput. Appl. 2019, 32, 18069–18083.
- Gilpin, L.H.; Bau, D.; Yuan, B.Z.; Bajwa, A.; Specter, M.; Kagal, L. Explaining explanations: An overview of interpretability of machine learning. In Proceedings of the 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA), Turin, Italy, 1–3 October 2018; pp. 80–89.
- Osheroff, J.A.; Teich, J.M.; Middleton, B.; Steen, E.B.; Wright, A.; Detmer, D.E. A roadmap for national action on clinical decision support. J. Am. Med. Inform. Assoc. 2007, 14, 141–145.
- Coiera, E. Clinical decision support systems. Guide Health Inform. 2003, 2, 331–345.
- Shahsavarani, A.M.; Azad Marz Abadi, E.; Hakimi Kalkhoran, M.; Jafari, S.; Qaranli, S. Clinical decision support systems (CDSSs): State of the art review of literature. Int. J. Med. Rev. 2015, 2, 299–308.
- Sutton, R.T.; Pincock, D.; Baumgart, D.C.; Sadowski, D.C.; Fedorak, R.N.; Kroeker, K.I. An overview of clinical decision support systems: Benefits, risks, and strategies for success. NPJ Digit. Med. 2020, 3, 17.
- Belard, A.; Buchman, T.; Forsberg, J.; Potter, B.K.; Dente, C.J.; Kirk, A.; Elster, E. Precision diagnosis: A view of the clinical decision support systems (CDSS) landscape through the lens of critical care. J. Clin. Monit. Comput. 2017, 31, 261–271.
- Abbasi, M.; Kashiyarndi, S. Clinical Decision Support Systems: A Discussion on Different Methodologies Used in Health Care; Mälardalen University: Västerås, Sweden, 2006.
- Obermeyer, Z.; Emanuel, E.J. Predicting the future—Big data, machine learning, and clinical medicine. N. Engl. J. Med. 2016, 375, 1216–1219.
- IBM Watson Health. Available online: https://www.ibm.com/watson-health (accessed on 25 April 2021).
- Strickland, E. IBM Watson, heal thyself: How IBM overpromised and underdelivered on AI health care. IEEE Spectr. 2019, 56, 24–31.
- ClinicalPath. Available online: https://www.elsevier.com/solutions/clinicalpath (accessed on 25 April 2021).
- ClinicalKey. Available online: https://www.clinicalkey.com (accessed on 25 April 2021).
- Symptomate. Available online: https://symptomate.com/ (accessed on 25 April 2021).
- Hanover Project. Available online: https://www.microsoft.com/en-us/research/project/project-hanover/ (accessed on 25 April 2021).
- Schaaf, J.; Sedlmayr, M.; Schaefer, J.; Storf, H. Diagnosis of Rare Diseases: A scoping review of clinical decision support systems. Orphanet J. Rare Dis. 2020, 15, 1–14.
- Walsh, S.; de Jong, E.E.; van Timmeren, J.E.; Ibrahim, A.; Compter, I.; Peerlings, J.; Sanduleanu, S.; Refaee, T.; Keek, S.; Larue, R.T.; et al. Decision Support Systems in Oncology. JCO Clin. Cancer Inform. 2019, 3, 1–9.
- Mazo, C.; Kearns, C.; Mooney, C.; Gallagher, W.M. Clinical decision support systems in breast cancer: A systematic review. Cancers 2020, 12, 369.
- Velickovski, F.; Ceccaroni, L.; Roca, J.; Burgos, F.; Galdiz, J.B.; Marina, N.; Lluch-Ariet, M. Clinical Decision Support Systems (CDSS) for preventive management of COPD patients. J. Transl. Med. 2014, 12.
- Durieux, P.; Nizard, R.; Ravaud, P.; Mounier, N.; Lepage, E. A Clinical Decision Support System for Prevention of Venous Thromboembolism: Effect on Physician Behavior. JAMA 2000, 283, 2816–2821.
- Lakshmanaprabu, S.; Mohanty, S.N.; Sheeba, R.S.; Krishnamoorthy, S.; Uthayakumar, J.; Shankar, K. Online clinical decision support system using optimal deep neural networks. Appl. Soft Comput. 2019, 81, 105487.
- Mattila, J.; Koikkalainen, J.; Virkki, A.; van Gils, M.; Lötjönen, J. Design and Application of a Generic Clinical Decision Support System for Multiscale Data. IEEE Trans. Biomed. Eng. 2012, 59, 234–240.
- Sim, L.L.W.; Ban, K.H.K.; Tan, T.W.; Sethi, S.K.; Loh, T.P. Development of a clinical decision support system for diabetes care: A pilot study. PLoS ONE 2017, 12, e0173021.
- Anooj, P. Clinical decision support system: Risk level prediction of heart disease using weighted fuzzy rules. J. King Saud Univ. Comput. Inf. Sci. 2012, 24, 27–40.
- Prahl, A.; Van Swol, L. Out with the Humans, in with the Machines?: Investigating the Behavioral and Psychological Effects of Replacing Human Advisors with a Machine. Hum.-Mach. Commun. 2021, 2, 11.
- Van Lent, M.; Fisher, W.; Mancuso, M. An explainable artificial intelligence system for small-unit tactical behavior. In Proceedings of the National Conference on Artificial Intelligence, San Jose, CA, USA, 25–29 July 2004; pp. 900–907.
- Lipton, Z.C. The mythos of model interpretability. Queue 2018, 16, 31–57.
- Bhatt, U.; Xiang, A.; Sharma, S.; Weller, A.; Taly, A.; Jia, Y.; Ghosh, J.; Puri, R.; Moura, J.M.; Eckersley, P. Explainable machine learning in deployment. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, Barcelona, Spain, 27–30 January 2020; pp. 648–657.
- Richard, A.; Mayag, B.; Talbot, F.; Tsoukias, A.; Meinard, Y. Transparency of Classification Systems for Clinical Decision Support. In Information Processing and Management of Uncertainty in Knowledge-Based Systems; Communications in Computer and Information Science (CCIS); Springer: Cham, Switzerland, 2020; Volume 1239, pp. 99–113.
- Bhatt, U.; Andrus, M.; Weller, A.; Xiang, A. Machine learning explainability for external stakeholders. arXiv 2020, arXiv:2007.05408.
- Guidotti, R.; Monreale, A.; Ruggieri, S.; Turini, F.; Giannotti, F.; Pedreschi, D. A survey of methods for explaining black box models. ACM Comput. Surv. (CSUR) 2018, 51, 1–42.
- Angwin, J.; Larson, J.; Mattu, S.; Kirchner, L. Machine bias. ProPublica May 2016, 23, 139–159.
- Dressel, J.; Farid, H. The accuracy, fairness, and limits of predicting recidivism. Sci. Adv. 2018, 4, eaao5580.
- Richardson, R.; Schultz, J.M.; Crawford, K. Dirty data, bad predictions: How civil rights violations impact police data, predictive policing systems, and justice. NYUL Rev. Online 2019, 94, 15.
- Introna, L.D.; Nissenbaum, H. Shaping the Web: Why the politics of search engines matters. Inf. Soc. 2000, 16, 169–185.
- Ajunwa, I. The Auditing Imperative for Automated Hiring (15 March 2019). 34 Harv. J.L. & Tech. (forthcoming 2021). Available online: https://ssrn.com/abstract=3437631 (accessed on 24 July 2020).
- Lambrecht, A.; Tucker, C. Algorithmic bias? An empirical study of apparent gender-based discrimination in the display of STEM career ads. Manag. Sci. 2019, 65, 2966–2981.
- Imana, B.; Korolova, A.; Heidemann, J. Auditing for Discrimination in Algorithms Delivering Job Ads. arXiv 2021, arXiv:2104.04502.
- Wilson, B.; Hoffman, J.; Morgenstern, J. Predictive inequity in object detection. arXiv 2019, arXiv:1902.11097.
- O’Neil, C. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy; Penguin Random House: New York, NY, USA, 2016.
- Ferryman, K.; Pitcan, M. Fairness in precision medicine. Data Soc. 2018, 1. Available online: https://datasociety.net/library/fairness-in-precision-medicine/ (accessed on 24 July 2020).
- Landry, L.G.; Ali, N.; Williams, D.R.; Rehm, H.L.; Bonham, V.L. Lack of diversity in genomic databases is a barrier to translating precision medicine research into practice. Health Aff. 2018, 37, 780–785.
- Hense, H.W.; Schulte, H.; Löwel, H.; Assmann, G.; Keil, U. Framingham risk function overestimates risk of coronary heart disease in men and women from Germany—Results from the MONICA Augsburg and the PROCAM cohorts. Eur. Heart J. 2003, 24, 937–945.
- Slack, D.; Hilgard, S.; Jia, E.; Singh, S.; Lakkaraju, H. Fooling LIME and SHAP: Adversarial attacks on post hoc explanation methods. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, New York, NY, USA, 7–8 February 2020; pp. 180–186.
- Miller, T.; Howe, P.; Sonenberg, L. Explainable AI: Beware of inmates running the asylum or: How I learnt to stop worrying and love the social and behavioural sciences. arXiv 2017, arXiv:1712.00547.
- Aïvodji, U.; Arai, H.; Fortineau, O.; Gambs, S.; Hara, S.; Tapp, A. Fairwashing: The risk of rationalization. In Proceedings of the International Conference on Machine Learning, PMLR, Long Beach, CA, USA, 9–15 June 2019; pp. 161–170.
- Doshi-Velez, F.; Kim, B. Towards a rigorous science of interpretable machine learning. arXiv 2017, arXiv:1702.08608.
- Molnar, C.; Casalicchio, G.; Bischl, B. Interpretable Machine Learning—A Brief History, State-of-the-Art and Challenges. arXiv 2020, arXiv:2010.09337.
- Caruana, R.; Lou, Y.; Gehrke, J.; Koch, P.; Sturm, M.; Elhadad, N. Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, Australia, 10–13 August 2015; pp. 1721–1730.
- Tonekaboni, S.; Joshi, S.; McCradden, M.D.; Goldenberg, A. What Clinicians Want: Contextualizing Explainable Machine Learning for Clinical End Use. In Proceedings of the Machine Learning for Healthcare Conference, Boston, MA, USA, 13–14 June 2019; pp. 359–380.
- Monteath, I.; Sheh, R. Assisted and incremental medical diagnosis using explainable artificial intelligence. In Proceedings of the 2nd Workshop on Explainable Artificial Intelligence, Stockholm, Sweden, 13–19 July 2018; pp. 104–108.
- Wu, J.; Peck, D.; Hsieh, S.; Dialani, V.; Lehman, C.D.; Zhou, B.; Syrgkanis, V.; Mackey, L.; Patterson, G. Expert identification of visual primitives used by CNNs during mammogram classification. In Medical Imaging 2018: Computer-Aided Diagnosis; International Society for Optics and Photonics: Bellingham, WA, USA, 2018; Volume 10575, p. 105752T.
- Zheng, Q.; Delingette, H.; Ayache, N. Explainable cardiac pathology classification on cine MRI with motion characterization by semi-supervised learning of apparent flow. Med. Image Anal. 2019, 56, 80–95.
- Tosun, A.B.; Pullara, F.; Becich, M.J.; Taylor, D.; Fine, J.L.; Chennubhotla, S.C. Explainable AI (xAI) for Anatomic Pathology. Adv. Anat. Pathol. 2020, 27, 241–250.
- Hicks, S.A.; Eskeland, S.; Lux, M.; de Lange, T.; Randel, K.R.; Jeppsson, M.; Pogorelov, K.; Halvorsen, P.; Riegler, M. Mimir: An automatic reporting and reasoning system for deep learning based analysis in the medical domain. In Proceedings of the 9th ACM Multimedia Systems Conference, Amsterdam, The Netherlands, 12–15 June 2018; pp. 369–374.
- Bussone, A.; Stumpf, S.; O’Sullivan, D. The role of explanations on trust and reliance in clinical decision support systems. In Proceedings of the 2015 International Conference on Healthcare Informatics, Dallas, TX, USA, 21–23 October 2015; pp. 160–169.
- Lakkaraju, H.; Kamar, E.; Caruana, R.; Leskovec, J. Interpretable & explorable approximations of black box models. arXiv 2017, arXiv:1707.01154.
- Ibrahim, M.; Louie, M.; Modarres, C.; Paisley, J. Global explanations of neural networks: Mapping the landscape of predictions. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, Honolulu, HI, USA, 27–28 January 2019; pp. 279–287.
- Ribeiro, M.T.; Singh, S.; Guestrin, C. “Why should I trust you?” Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1135–1144.
- Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 4765–4774.
- Ribeiro, M.T.; Singh, S.; Guestrin, C. Anchors: High-Precision Model-Agnostic Explanations. In Proceedings of the AAAI, New Orleans, LA, USA, 2–7 February 2018; Volume 18, pp. 1527–1535.
- White, A.; Garcez, A.D. Measurable counterfactual local explanations for any classifier. arXiv 2019, arXiv:1908.03020.
- Sharma, S.; Henderson, J.; Ghosh, J. CERTIFAI: A Common Framework to Provide Explanations and Analyse the Fairness and Robustness of Black-box Models. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, New York, NY, USA, 7–8 February 2020; pp. 166–172.
- Simonyan, K.; Vedaldi, A.; Zisserman, A. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv 2013, arXiv:1312.6034.
- Sundararajan, M.; Taly, A.; Yan, Q. Axiomatic attribution for deep networks. In Proceedings of the International Conference on Machine Learning, PMLR, Sydney, Australia, 6–11 August 2017; pp. 3319–3328.
- Shrikumar, A.; Greenside, P.; Kundaje, A. Learning important features through propagating activation differences. In Proceedings of the International Conference on Machine Learning, PMLR, Sydney, Australia, 6–11 August 2017; pp. 3145–3153.
- Zeiler, M.D.; Fergus, R. Visualizing and understanding convolutional networks. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; pp. 818–833.
- Zeiler, M.D.; Taylor, G.W.; Fergus, R. Adaptive deconvolutional networks for mid and high level feature learning. In Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2018–2025.
- Zhou, B.; Khosla, A.; Lapedriza, A.; Oliva, A.; Torralba, A. Learning deep features for discriminative localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2921–2929.
- Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 618–626.
- Garreau, D.; von Luxburg, U. Explaining the explainer: A first theoretical analysis of LIME. arXiv 2020, arXiv:2001.03447.
- Fidel, G.; Bitton, R.; Shabtai, A. When explainability meets adversarial learning: Detecting adversarial examples using SHAP signatures. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 19–24 July 2020; pp. 1–8.
- Holzinger, A. Explainable AI and Multi-Modal Causability in Medicine. i-com 2020, 19, 171–179.
- Amann, J.; Blasimme, A.; Vayena, E.; Frey, D.; Madai, V.I. Explainability for artificial intelligence in healthcare: A multidisciplinary perspective. BMC Med. Inform. Decis. Mak. 2020, 20, 310.
- Kitchenham, B.A.; Charters, S. Guidelines for Performing Systematic Literature Reviews in Software Engineering; Technical Report EBSE 2007-001; Keele University and Durham University Joint Report, 2007. Available online: http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=2BE22FED09591B99D6A7ACF8FE2258D5? (accessed on 24 July 2020).
- Martín-Martín, A.; Orduna-Malea, E.; Thelwall, M.; López-Cózar, E.D. Google Scholar, Web of Science, and Scopus: A systematic comparison of citations in 252 subject categories. J. Inf. 2018, 12, 1160–1177.
- Gusenbauer, M. Google Scholar to overshadow them all? Comparing the sizes of 12 academic search engines and bibliographic databases. Scientometrics 2019, 118, 177–214.
- Luz, C.; Vollmer, M.; Decruyenaere, J.; Nijsten, M.; Glasner, C.; Sinha, B. Machine learning in infection management using routine electronic health records: Tools, techniques, and reporting of future technologies. Clin. Microbiol. Infect. 2020, 26, 1291–1299.
- Zucco, C.; Liang, H.; Fatta, G.D.; Cannataro, M. Explainable Sentiment Analysis with Applications in Medicine. In Proceedings of the 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2018), Madrid, Spain, 3–6 December 2018; pp. 1740–1747.
- Jin, W.; Fatehi, M.; Abhishek, K.; Mallya, M.; Toyota, B.; Hamarneh, G. Artificial intelligence in glioma imaging: Challenges and advances. J. Neural Eng. 2020, 17, 021002.
- Wulff, A.; Montag, S.; Marschollek, M.; Jack, T. Clinical Decision-Support Systems for Detection of Systemic Inflammatory Response Syndrome, Sepsis, and Septic Shock in Critically Ill Patients: A Systematic Review. Methods Inf. Med. 2019, 58, e43–e57.
- Rundo, L.; Pirrone, R.; Vitabile, S.; Sala, E.; Gambino, O. Recent advances of HCI in decision-making tasks for optimized clinical workflows and precision medicine. J. Biomed. Inform. 2020, 108, 103479.
- Fu, L.H.; Schwartz, J.; Moy, A.; Knaplund, C.; Kang, M.J.; Schnock, K.O.; Garcia, J.P.; Jia, H.; Dykes, P.C.; Cato, K.; et al. Development and validation of early warning score system: A systematic literature review. J. Biomed. Inform. 2020, 105, 103410.
- Angehrn, Z.; Haldna, L.; Zandvliet, A.S.; Gil Berglund, E.; Zeeuw, J.; Amzal, B.; Cheung, S.Y.A.; Polasek, T.M.; Pfister, M.; Kerbusch, T.; et al. Artificial Intelligence and Machine Learning Applied at the Point of Care. Front. Pharmacol. 2020, 11, 759.
- Ibrahim, A.; Primakov, S.; Beuque, M.; Woodruff, H.; Halilaj, I.; Wu, G.; Refaee, T.; Granzier, R.; Widaatalla, Y.; Hustinx, R.; et al. Radiomics for precision medicine: Current challenges, future prospects, and the proposal of a new framework. Methods 2020, 188, 20–29.
- Mahadevaiah, G.; RV, P.; Bermejo, I.; Jaffray, D.; Dekker, A.; Wee, L. Artificial intelligence-based clinical decision support in modern medical physics: Selection, acceptance, commissioning, and quality assurance. Med. Phys. 2020, 47, e228–e235.
- Vorm, E.S. Assessing Demand for Transparency in Intelligent Systems Using Machine Learning. In Proceedings of the 2018 Innovations in Intelligent Systems and Applications (INISTA), Thessaloniki, Greece, 3–5 July 2018; pp. 1–7.
- Jamieson, T.; Goldfarb, A. Clinical considerations when applying machine learning to decision-support tasks versus automation. BMJ Qual. Saf. 2019, 28, 778–781.
- Choudhury, A.; Asan, O.; Mansouri, M. Role of Artificial Intelligence, Clinicians & Policymakers in Clinical Decision Making: A Systems Viewpoint. In Proceedings of the 2019 International Symposium on Systems Engineering (ISSE), Edinburgh, UK, 1–3 October 2019; pp. 1–8.
- Cánovas-Segura, B.; Morales, A.; Martínez-Carrasco, A.L.; Campos, M.; Juarez, J.M.; Rodríguez, L.L.; Palacios, F. Exploring Antimicrobial Resistance Prediction Using Post-hoc Interpretable Methods. In Artificial Intelligence in Medicine: Knowledge Representation and Transparent and Explainable Systems; Lecture Notes in Computer Science (LNAI); Springer Nature: Cham, Switzerland, 2019; Volume 11979, pp. 93–107.
- Zihni, E.; Madai, V.I.; Livne, M.; Galinovic, I.; Khalil, A.A.; Fiebach, J.B.; Frey, D. Opening the black box of artificial intelligence for clinical decision support: A study predicting stroke outcome. PLoS ONE 2020, 15, e0231166.
- Liao, Q.V.; Gruen, D.; Miller, S. Questioning the AI: Informing Design Practices for Explainable AI User Experiences. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; ACM: New York, NY, USA, 2020; pp. 1–15.
- Johnson, S.L.J. AI, Machine Learning, and Ethics in Health Care. J. Leg. Med. 2019, 39, 427–441.
- Timotijevic, L.; Hodgkins, C.E.; Banks, A.; Rusconi, P.; Egan, B.; Peacock, M.; Seiss, E.; Touray, M.M.L.; Gage, H.; Pellicano, C.; et al. Designing a mHealth clinical decision support system for Parkinson’s disease: A theoretically grounded user needs approach. BMC Med. Inform. Decis. Mak. 2020, 20, 34.
- Ben Souissi, S.; Abed, M.; El Hiki, L.; Fortemps, P.; Pirlot, M. PARS, a system combining semantic technologies with multiple criteria decision aiding for supporting antibiotic prescriptions. J. Biomed. Inform. 2019, 99, 103304.
- Gangavarapu, T.; S Krishnan, G.; Kamath S, S.; Jeganathan, J. FarSight: Long-Term Disease Prediction Using Unstructured Clinical Nursing Notes. IEEE Trans. Emerg. Top. Comput. 2020, 1–16.
- Xie, Y.; Chen, M.; Kao, D.; Gao, G.; Chen, X.A. CheXplain: Enabling Physicians to Explore and Understand Data-Driven, AI-Enabled Medical Imaging Analysis. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; ACM: New York, NY, USA, 2020; pp. 1–13.
- Sadeghi, R.; Banerjee, T.; Hughes, J.C.; Lawhorne, L.W. Sleep quality prediction in caregivers using physiological signals. Comput. Biol. Med. 2019, 110, 276–288.
- Wang, D.; Yang, Q.; Abdul, A.; Lim, B.Y. Designing theory-driven user-centric explainable AI. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Glasgow, Scotland, UK, 4–9 May 2019; pp. 1–15.
- Lee, E.; Choi, J.S.; Kim, M.; Suk, H.I. Toward an interpretable Alzheimer’s disease diagnostic model with regional abnormality representation via deep learning. NeuroImage 2019, 202, 116113.
- Hu, C.A.; Chen, C.M.; Fang, Y.C.; Liang, S.J.; Wang, H.C.; Fang, W.F.; Sheu, C.C.; Perng, W.C.; Yang, K.Y.; Kao, K.C.; et al. Using a machine learning approach to predict mortality in critically ill influenza patients: A cross-sectional retrospective multicentre study in Taiwan. BMJ Open 2020, 10, e033898.
- Militello, C.; Rundo, L.; Toia, P.; Conti, V.; Russo, G.; Filorizzo, C.; Maffei, E.; Cademartiri, F.; La Grutta, L.; Midiri, M.; et al. A semi-automatic approach for epicardial adipose tissue segmentation and quantification on cardiac CT scans. Comput. Biol. Med. 2019, 114, 103424.
- Blanco, A.; Perez, A.; Casillas, A.; Cobos, D. Extracting Cause of Death from Verbal Autopsy with Deep Learning interpretable methods. IEEE J. Biomed. Health Inform. 2020, 25, 1315–1325.
- Lamy, J.B.; Sedki, K.; Tsopra, R. Explainable decision support through the learning and visualization of preferences from a formal ontology of antibiotic treatments. J. Biomed. Inform. 2020, 104, 103407.
- Tan, T.Z.; Ng, G.S.; Quek, C. Improving tractability of Clinical Decision Support system. In Proceedings of the 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), Hong Kong, China, 1–6 June 2008; pp. 1997–2002.
- El-Sappagh, S.; Alonso, J.M.; Ali, F.; Ali, A.; Jang, J.H.; Kwak, K.S. An Ontology-Based Interpretable Fuzzy Decision Support System for Diabetes Diagnosis. IEEE Access 2018, 6, 37371–37394.
- Lamy, J.B.; Sekar, B.; Guezennec, G.; Bouaud, J.; Séroussi, B. Explainable artificial intelligence for breast cancer: A visual case-based reasoning approach. Artif. Intell. Med. 2019, 94, 42–53.
- Cai, C.J.; Winter, S.; Steiner, D.; Wilcox, L.; Terry, M. “Hello AI”: Uncovering the Onboarding Needs of Medical Practitioners for Human-AI Collaborative Decision-Making. In Proceedings of the ACM on Human-Computer Interaction; ACM: New York, NY, USA, 2019; Volume 3, pp. 1–24.
- Kunapuli, G.; Varghese, B.A.; Ganapathy, P.; Desai, B.; Cen, S.; Aron, M.; Gill, I.; Duddalwar, V. A Decision-Support Tool for Renal Mass Classification. J. Digit. Imaging 2018, 31, 929–939.
- Guidotti, R.; Monreale, A.; Ruggieri, S.; Pedreschi, D.; Turini, F.; Giannotti, F. Local rule-based explanations of black box decision systems. arXiv 2018, arXiv:1805.10820.
- Zhang, Q.; Li, H. MOEA/D: A multiobjective evolutionary algorithm based on decomposition. IEEE Trans. Evol. Comput. 2007, 11, 712–731.
- Lamy, J.B.; Berthelot, H.; Capron, C.; Favre, M. Rainbow boxes: A new technique for overlapping set visualization and two applications in the biomedical domain. J. Vis. Lang. Comput. 2017, 43, 71–82.
- Linardatos, P.; Papastefanopoulos, V.; Kotsiantis, S. Explainable AI: A Review of Machine Learning Interpretability Methods. Entropy 2021, 23, 18.
- Gomolin, A.; Netchiporouk, E.; Gniadecki, R.; Litvinov, I.V. Artificial intelligence applications in dermatology: Where do we stand? Front. Med. 2020, 7, 100.
- London, A.J. Artificial intelligence and black-box medical decisions: Accuracy versus explainability. Hastings Cent. Rep. 2019, 49, 15–21.
- Baldi, P. Deep learning in biomedical data science. Annu. Rev. Biomed. Data Sci. 2018, 1, 181–205.
- Sullivan, E. Understanding from machine learning models. Br. J. Philos. Sci. 2020.
- Bruckert, S.; Finzel, B.; Schmid, U. The Next Generation of Medical Decision Support: A Roadmap Toward Transparent Expert Companions. Front. Artif. Intell. 2020, 3, 75.
- Carvalho, D.V.; Pereira, E.M.; Cardoso, J.S. Machine learning interpretability: A survey on methods and metrics. Electronics 2019, 8, 832.
- Miller, T. Explanation in artificial intelligence: Insights from the social sciences. Artif. Intell. 2019, 267, 1–38.
- Antoniadi, A.M.; Galvin, M.; Heverin, M.; Hardiman, O.; Mooney, C. Development of an explainable clinical decision support system for the prediction of patient quality of life in amyotrophic lateral sclerosis. In Proceedings of the 36th Annual ACM Symposium on Applied Computing, Gwangju, Korea, 22–26 March 2021; pp. 594–602.
- Zhou, J.; Gandomi, A.H.; Chen, F.; Holzinger, A. Evaluating the quality of machine learning explanations: A survey on methods and metrics. Electronics 2021, 10, 593.
- Holzinger, A.; Carrington, A.; Müller, H. Measuring the quality of explanations: The system causability scale (SCS). KI-Künstliche Intell. 2020, 34, 193–198.
- Kenny, E.M.; Ford, C.; Quinn, M.; Keane, M.T. Explaining Black-Box classifiers using Post-Hoc explanations-by-example: The effect of explanations and error-rates in XAI user studies. Artif. Intell. 2021, 294, 103459.
- Jacobs, M.; Pradier, M.F.; McCoy, T.H.; Perlis, R.H.; Doshi-Velez, F.; Gajos, K.Z. How machine-learning recommendations influence clinician treatment selections: The example of antidepressant selection. Transl. Psychiatry 2021, 11, 108.
| Paper | Subject Area | Main Contribution | Data Processed |
|---|---|---|---|
| Sadeghi et al. [117] | Sleep quality prediction | Use of time domain features for transparency and explainability | Tabular |
| Wang et al. [118] | Intensive care phenotyping | Used SHAP [79] for attribution, LORE [129] for counterfactual rules, and a multiobjective evolutionary algorithm based on decomposition (MOEA/D) [130] for sensitivity analysis | Tabular |
| Lee et al. [119] | Alzheimer's Disease (AD) | Regional abnormalities in the brain are visualized as a "regional abnormality map", used to interpret regional statuses based on the probability that a region represents a later stage of AD progression for a target task, and to draw potential relationships between symptomatic observations | Image |
| Hu et al. [120] | Critically ill influenza | SHAP [79] is used to illustrate individual feature-level impacts on 30-day mortality | Tabular |
| Militello et al. [121] | Epicardial fat volume | A user-centred Graphical User Interface (GUI) design allows safe interaction by the physician and effective integration into the existing clinical workflow | Image |
| Blanco et al. [122] | Cause of death | A bidirectional Gated Recurrent Unit (GRU) network with an attention mechanism allows exploration of how much each fragment of the text contributed to the prediction | Text |
| Lamy et al. [123] | Antibiotic treatment | The CDSS uses rainbow boxes [131], a visualization technique that displays all the antibiotics present in the ontology in columns and their properties in colored boxes, using labels and icons | Tabular |
| Tan et al. [124] | Breast cancer | Implemented a novel method: the Complementary Learning Fuzzy Neural Network (CLFNN) | Tabular |
| El-Sappagh et al. [125] | Diabetes | Implemented a novel Fuzzy Rule-Based System (FRBS) for diagnosis | Tabular |
| Lamy et al. [126] | Breast cancer | Implemented a visual case-based reasoning approach for breast cancer management | Tabular |
| Cai et al. [127] | Prostate cancer | Algorithmic predictions (benign, grade 3, 4, and 5) were displayed as visual overlays on the image | Image |
| Kunapuli et al. [128] | Renal mass classification | XAI based on Relational Functional Gradient Boosting (RFGB), a statistical relational learning method that provides explanations in terms of tumor shape, size, and texture metrics, as well as clinical, demographic, and other factors when available | Image |
| Paper | XAI Method | Model-Agnostic/Specific | Ante-Hoc/Post-Hoc | Local/Global |
|---|---|---|---|---|
| Wang et al. [118] | SHAP [79] for attribution, LORE [129] for counterfactual rules, MOEA/D [130] for sensitivity analysis | agnostic | post-hoc | local |
| Lee et al. [119] | Pre-processing to obtain regions, application of randomised Deep Neural Networks on each region, and extraction of regional abnormality representations in the form of a map | specific | post-hoc | local |
| Hu et al. [120] | SHAP [79] for summary plot and partial dependence plot | agnostic | post-hoc | global |
| Blanco et al. [122] | BiGRU with attention mechanism to show the contribution of each fragment of text to the prediction | specific | ante-hoc | local |
| Lamy et al. [123] | Visualised the created preference model using rainbow boxes [131] | agnostic | post-hoc | local |
| Tan et al. [124] | CLFNN, which autonomously generates fuzzy rules to provide human-like reasoning | specific | ante-hoc | local |
| El-Sappagh et al. [125] | Semantically interpretable FRBS with the integration of semantic ontology-based reasoning | specific | ante-hoc | local |
| Lamy et al. [126] | Visual (using rainbow boxes [131] and a polar multidimensional scaling scatter plot) case-based reasoning approach | specific | ante-hoc | local |
| Kunapuli et al. [128] | RFGB, a statistical relational learning method which uses tree models and provides explanations in terms of features of interest | specific | ante-hoc | local |
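Several of the systems above (e.g., Wang et al. [118] and Hu et al. [120]) rely on SHAP [79] for post-hoc attribution. The sketch below illustrates the general pattern on a synthetic tabular task; the model, the clinical feature names, and the data are assumptions for illustration only, and `shap` API details can vary between releases.

```python
# Minimal sketch of post-hoc, local attribution with SHAP on tabular data.
# Synthetic data; the clinical feature names are hypothetical.
import numpy as np
import shap  # pip install shap
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(42)
features = ["age", "lactate", "platelets", "gcs_score"]
X = rng.normal(size=(400, len(features)))
y = (0.8 * X[:, 1] - 0.6 * X[:, 2] + rng.normal(scale=0.4, size=400) > 0).astype(int)

model = GradientBoostingClassifier(random_state=0).fit(X, y)

# TreeExplainer computes exact Shapley values for tree ensembles; for a
# binary GradientBoostingClassifier the attributions are in log-odds space.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:1])  # local explanation, one patient

for name, contrib in sorted(zip(features, shap_values[0]),
                            key=lambda t: -abs(t[1])):
    direction = "raises" if contrib > 0 else "lowers"
    print(f"{name}: {direction} predicted risk by {abs(contrib):.3f} (log-odds)")
```

The global views reported by Hu et al. [120], a summary plot and partial dependence plots, follow the same pattern over the whole cohort, e.g., `shap.summary_plot(explainer.shap_values(X), X)`.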
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).