Defining an Ethical Explainability Metric for Measuring AI Trustworthiness in Connected Healthcare Systems
Abstract
1. Introduction
2. Key Ethical Issues and Countermeasures: Related Literature
2.1. Review Methodology
2.2. Positioning Ethical Explainability Within XAI Evaluation Frameworks
3. Conceptualizing Ethical Explainability: Metric Basis and Formulation
3.1. Component 1: Human Agreement Ratio (HAR)
3.2. Component 2: Entropy Reduction Index (ERI)
4. Operationalizing the Ethical Explainability Metric in Healthcare IoT Workflows
Blueprint for an Empirical Evaluation Design
5. Linking HAR and ERI to the Five Ethical Domains
6. Benefits of Leveraging the Ethical Explainability Metric in Healthcare IoT
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References


| Domain | Key Indicator |
|---|---|
| Global AI Investment | USD 252 billion (2024) |
| AI Adoption in Business Processes | 78% of firms used AI in ≥1 function in 2024 (up from 55% in 2023) |
| Organizations with No Drafted AI Policy | 54% of organizations |
| Healthcare Organizations Breached | 400 companies; average loss ≈ USD 3.5 million |
| Unsecured Medical IoT Devices Online | >1,000,000 exposed devices |

| Category | Representative Approach | What It Measures | Strengths | Limitations | Relation to $E_e$ |
|---|---|---|---|---|---|
| Model-grounded | Perturbation tests (e.g., deletion/insertion) | Faithfulness of feature attributions to model behavior | Links explanation to model output changes | Sensitive to perturbation design; may not reflect human utility | Complementary: $E_e$ adds human clarity and consensus alignment |
| Model-grounded | Sufficiency/comprehensiveness | Whether the explanation subset alone is sufficient for the prediction, and whether removing it changes the prediction | Intuitive faithfulness logic | Can be affected by feature correlation | Used as baseline in planned validation; $E_e$ captures governance relevance |
| Model-grounded | Sensitivity/stability | Robustness of explanations under small input changes | Captures explanation reliability | Not a measure of human usefulness | $E_e$ targets human uncertainty reduction |
| Functionally grounded | Complexity/sparsity proxies | Proxy interpretability (simplicity of explanation) | Cheap and scalable | Proxy ≠ human clarity; can oversimplify | $E_e$ uses human-centered outcomes rather than proxies |
| Human-grounded | Simulatability/forward prediction tasks | Whether users can predict model behavior with explanation | Direct comprehension measure | Requires user studies | ERI provides a quantitative uncertainty analog |
| Human-grounded | Trust/usability scales, time, workload | Perceived trust/usefulness, reliance behavior, cognitive burden | Workflow-relevant | Subjective; may diverge from correctness | $E_e$ reduces reliance on perception-only measures via HAR + ERI |
| Composite (this work) | $E_e = w_H \times \mathrm{HAR} + w_E \times \mathrm{ERI}$ | Decision alignment + uncertainty reduction | Governance-ready scalar with interpretable components | Requires expert elicitation | Integrates decision and explanation utility under ethical domains |
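
The composite in the last row can be illustrated with a minimal Python sketch. Only the weighted sum of HAR and ERI and the description of ERI as a proportional uncertainty reduction come from the tables in this article; everything else is an assumption made for illustration. In particular, the sketch assumes that HAR is the fraction of cases in which the model decision matches the expert consensus, that ERI is the relative drop in Shannon entropy of a reviewer's belief distribution before and after seeing the explanation (clipped at zero), and that the weights sum to one. All function names are hypothetical.

```python
import math
from typing import Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def human_agreement_ratio(model_decisions: Sequence[str],
                          expert_consensus: Sequence[str]) -> float:
    """Assumed HAR: share of cases where the model matches expert consensus."""
    if not model_decisions or len(model_decisions) != len(expert_consensus):
        raise ValueError("need two non-empty sequences of equal length")
    matches = sum(m == e for m, e in zip(model_decisions, expert_consensus))
    return matches / len(model_decisions)


def entropy_reduction_index(before: Sequence[float],
                            after: Sequence[float]) -> float:
    """Assumed ERI: proportional entropy reduction, clipped at zero."""
    h_before = shannon_entropy(before)
    if h_before == 0:
        return 0.0  # no initial uncertainty, so nothing to reduce
    h_after = shannon_entropy(after)
    # Clipping (an assumption) keeps ERI in [0, 1] when an explanation
    # increases uncertainty instead of reducing it.
    return max(0.0, (h_before - h_after) / h_before)


def ethical_explainability(har: float, eri: float,
                           w_h: float = 0.5, w_e: float = 0.5) -> float:
    """Weighted sum of HAR and ERI; weights are assumed to sum to one."""
    if not math.isclose(w_h + w_e, 1.0):
        raise ValueError("weights are assumed to sum to 1")
    return w_h * har + w_e * eri
```

Under these assumptions, HAR = 0.75 and ERI = 1.0 with equal weights give a composite of 0.875. The equal default weights are placeholders, not values taken from the article.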

| Ethical Domain (from Review) | How It Is Operationalized in $E_e$ | What to Measure and Report | Practical Implementation/Governance Cues |
|---|---|---|---|
| Justice and fairness | Fairness is assessed by examining $\mathrm{HAR}_{\text{outcome}}$, $\mathrm{HAR}_{\text{rationale}}$, and ERI across subgroups to identify disparities in agreement or uncertainty reduction. | Report $\mathrm{HAR}_{\text{outcome}}$, $\mathrm{HAR}_{\text{rationale}}$, ERI, and $E_e$ stratified by relevant groups (e.g., device type, unit/ward, patient demographic groups), plus disparity metrics with confidence intervals. | If subgroup gaps exceed governance thresholds, trigger retraining, recalibration of explanations, or policy review; include subgroup auditing in periodic monitoring. |
| Transparency and explainability | ERI captures explanation utility as proportional uncertainty reduction; $\mathrm{HAR}_{\text{rationale}}$ captures whether explanations are judged coherent and actionable. | Report the ERI distribution (% of cases with ERI ≥ target); report $\mathrm{HAR}_{\text{rationale}}$ with a criteria breakdown (e.g., coherence, actionability, non-misleadingness). | Compare multiple XAI methods; select the explanation modality that maximizes ERI without harming $\mathrm{HAR}_{\text{outcome}}$; use ERI to detect opaque explanations. |
| Consent and confidentiality | Incorporated via $\mathrm{HAR}_{\text{rationale}}$ by penalizing explanations that expose unnecessary sensitive information or violate minimum-necessary disclosure. | Report the proportion of explanations failing privacy/consent criteria; document explanation content controls. | Establish explanation redaction rules; ensure explanations do not reveal identifiers; align with local consent policies. |
| Accountability | $\mathrm{HAR}_{\text{outcome}}$ provides auditable alignment with expert consensus; ERI demonstrates that explanations reduce uncertainty; together they support governance thresholds ($E_0$). | Report inter-rater reliability (κ/ICC), consensus method, and calibration protocol; report $E_e$ relative to $E_0$ (pass/fail rates); provide audit logs of cases where $E_e < E_0$. | Define an automation policy: if $E_e \geq E_0$, allow higher automation; otherwise require human oversight; incorporate periodic audits and drift monitoring. |
| Patient-centered design | Ensures explanations improve clinician/analyst decision-making and fit workflow constraints: ERI measures clarity gains; $\mathrm{HAR}_{\text{rationale}}$ includes usability/actionability items; $\mathrm{HAR}_{\text{outcome}}$ supports safety. | Report ERI alongside user-centered outcomes (optional but recommended): perceived usefulness, time-to-decision, workload; include actionability items in $\mathrm{HAR}_{\text{rationale}}$. | Tailor explanation format to user roles; adopt human factors evaluation; ensure explanations support safe patient-impacting decisions and do not increase cognitive burden. |
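
Two of the reporting items in this table, inter-rater reliability (κ) and subgroup-stratified agreement, can be sketched in a few lines of Python. This is an illustrative sketch and not the authors' protocol: it uses the standard two-rater Cohen's kappa, assumes agreement is an exact label match, and summarizes disparity as the largest gap between subgroup agreement ratios. Confidence intervals are omitted for brevity. The function names and the `(subgroup, model_decision, expert_consensus)` record layout are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Iterable, Sequence, Tuple


def cohens_kappa(rater_a: Sequence[Hashable],
                 rater_b: Sequence[Hashable]) -> float:
    """Two-rater Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("need two non-empty sequences of equal length")
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def agreement_by_subgroup(
    records: Iterable[Tuple[str, Hashable, Hashable]],
) -> Dict[str, float]:
    """Agreement ratio per subgroup from (subgroup, model, consensus) rows."""
    totals: Dict[str, int] = defaultdict(int)
    matches: Dict[str, int] = defaultdict(int)
    for subgroup, model_decision, consensus in records:
        totals[subgroup] += 1
        matches[subgroup] += int(model_decision == consensus)
    return {g: matches[g] / totals[g] for g in totals}


def max_subgroup_gap(ratios: Dict[str, float]) -> float:
    """Largest difference between any two subgroup ratios."""
    if not ratios:
        raise ValueError("no subgroups to compare")
    return max(ratios.values()) - min(ratios.values())
```

A governance process could compare the value returned by `max_subgroup_gap` against a locally chosen disparity threshold; the article does not specify such a threshold, so any value used would be a local policy decision.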

| Application Area | AI Scenario/Alert | How $E_e$ Is Used | Outcome When $E_e$ Is Low |
|---|---|---|---|
| Remote patient monitoring | Arrhythmia alert on wearable | Clinician and nurse review the alert and its explanation | System triggers manual validation rather than automatic intervention |
| Device security monitoring | Network anomaly on patient monitor | Engineers and experts review SHAP explanation | Device isolated or flagged for further investigation |
| Population health/fairness auditing | Risk score disparity in different groups | Medical teams review scores per group | Bias mitigation and retraining initiated |
| Digital health apps and consent | Lifestyle/intervention recommendation | User feedback on explanation comprehensibility | Algorithm revised, content adapted, or user given more context |
| Regulatory compliance audit | Annual hospital AI systems review | Aggregate $E_e$ statistics reported | Noncompliant systems suspended or remediated |
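
The threshold logic described in the tables (more automation when the composite score reaches the governance threshold, human oversight and an audit entry otherwise) can be sketched as follows. This is a hedged illustration only: the `CaseScore` type, the routing labels, and the function names are hypothetical, and the threshold is left as a parameter because the tables do not fix its value.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Union


@dataclass(frozen=True)
class CaseScore:
    """One reviewed case with its composite ethical explainability score."""
    case_id: str
    e_e: float


def route_case(score: CaseScore, e_0: float) -> str:
    """Allow higher automation at or above the threshold, else escalate."""
    if score.e_e >= e_0:
        return "automation_allowed"
    return "human_oversight_required"


def audit_summary(scores: Sequence[CaseScore],
                  e_0: float) -> Dict[str, Union[float, List[str]]]:
    """Pass rate against the threshold plus a log of failing case ids."""
    if not scores:
        raise ValueError("no cases to summarize")
    failed = [s.case_id for s in scores if s.e_e < e_0]
    passed = len(scores) - len(failed)
    return {"pass_rate": passed / len(scores), "audit_log": failed}
```

An aggregate report for a compliance audit could then be built by calling `audit_summary` once per system; how often to run it and which systems to include remain local governance choices.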

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Naib, P.; Park, J.; Abedin, P.; King, C.; Gurupur, V. Defining an Ethical Explainability Metric for Measuring AI Trustworthiness in Connected Healthcare Systems. Information 2026, 17, 438. https://doi.org/10.3390/info17050438

