Article

Balancing Privacy and Utility in Artificial Intelligence-Based Clinical Decision Support: Empirical Evaluation Using De-Identified Electronic Health Record Data

Jungwoo Lee and Kyu Hee Lee
1 Artificial Intelligence Big Data Medical Center, College of Medicine, Yonsei University, Wonju 26417, Republic of Korea
2 Department of Precision Medicine, College of Medicine, Yonsei University, Wonju 26417, Republic of Korea
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(19), 10857; https://doi.org/10.3390/app151910857
Submission received: 7 September 2025 / Revised: 6 October 2025 / Accepted: 8 October 2025 / Published: 9 October 2025
(This article belongs to the Special Issue Digital Innovations in Healthcare)

Abstract

The secondary use of electronic health records is essential for developing artificial intelligence-based clinical decision support systems. However, even after direct identifiers are removed, de-identified electronic health records remain vulnerable to re-identification, membership inference attacks, and model extraction attacks. This study examined the balance between privacy protection and model utility by evaluating de-identification strategies and differentially private learning in large-scale electronic health records. De-identified records from a tertiary medical center were analyzed under three de-identification strategies (baseline generalization, enhanced generalization, and enhanced generalization with suppression), together with differentially private stochastic gradient descent. Privacy risks were assessed through k-anonymity distributions, membership inference attacks, and model extraction attacks. Model performance was evaluated using standard predictive metrics, and privacy budgets were estimated for differentially private stochastic gradient descent. Enhanced generalization with suppression consistently improved k-anonymity distributions by reducing small, high-risk classes. Membership inference attacks remained at the chance level under all conditions, indicating that patient participation could not be inferred. Model extraction attacks closely replicated victim model outputs under baseline training but were substantially curtailed once differentially private stochastic gradient descent was applied. Notably, privacy-preserving learning maintained clinically relevant performance while mitigating privacy risks. Combining suppression with differentially private stochastic gradient descent reduced re-identification risk and markedly limited model extraction while sustaining predictive accuracy. These findings provide empirical evidence that a privacy–utility balance is achievable in clinical applications.
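As background to the k-anonymity assessment and suppression strategy described above, the sketch below (an illustration assuming a pandas table, not the authors' implementation) shows how the distribution of equivalence-class sizes over a set of quasi-identifiers can be computed and how records in small, high-risk classes can be suppressed; the quasi-identifier column names are hypothetical.

```python
# Minimal sketch (hypothetical column names; not the authors' code):
# k-anonymity distribution and suppression of small equivalence classes.
import pandas as pd

# Assumed quasi-identifiers; the attributes used in the study may differ.
QUASI_IDENTIFIERS = ["age_group", "sex", "admission_year", "diagnosis_code"]

def k_anonymity_distribution(df: pd.DataFrame) -> pd.Series:
    """Return the number of equivalence classes of each size k."""
    # Records sharing all quasi-identifier values form one equivalence class.
    class_sizes = df.groupby(QUASI_IDENTIFIERS, dropna=False).size()
    # Small classes (k = 1 or 2) carry the highest re-identification risk.
    return class_sizes.value_counts().sort_index()

def suppress_small_classes(df: pd.DataFrame, k_min: int = 5) -> pd.DataFrame:
    """Drop records whose equivalence class has fewer than k_min members."""
    sizes = df.groupby(QUASI_IDENTIFIERS, dropna=False)[QUASI_IDENTIFIERS[0]].transform("size")
    return df[sizes >= k_min]
```

In this reading, "enhanced generalization with suppression" corresponds to coarsening quasi-identifier values (e.g., age groups rather than exact ages) before dropping the residual small classes.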
Keywords: electronic health records; de-identification; membership inference attack; model extraction attack; differentially private stochastic gradient descent; clinical decision support systems
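For the differentially private training referenced in the abstract and keywords, the following is a minimal sketch (assuming PyTorch with the Opacus library, which may differ from the authors' setup) of differentially private stochastic gradient descent with per-sample gradient clipping, Gaussian noise, and a reported privacy budget.

```python
# Minimal sketch (placeholder data and hyperparameters; not the authors' pipeline):
# training a small clinical prediction model with DP-SGD via Opacus.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Hypothetical tabular EHR features and binary outcome labels.
features = torch.randn(1000, 32)
labels = torch.randint(0, 2, (1000,))
loader = DataLoader(TensorDataset(features, labels), batch_size=64, shuffle=True)

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

# Wrap model, optimizer, and loader so per-sample gradients are clipped and noised.
privacy_engine = PrivacyEngine()
model, optimizer, loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=loader,
    noise_multiplier=1.1,  # Gaussian noise added to clipped gradients
    max_grad_norm=1.0,     # per-sample gradient clipping bound
)

for epoch in range(3):
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()

# Privacy budget spent so far for a chosen delta.
print("epsilon =", privacy_engine.get_epsilon(delta=1e-5))
```

The noise multiplier and clipping norm shown are placeholder values; the epsilon reported by get_epsilon depends on them and on the number of training steps.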

