This is an early access version, the complete PDF, HTML, and XML versions will be available soon.
Article

Data-Leakage-Aware Preoperative Prediction of Postoperative Complications from Structured Data and Preoperative Clinical Notes

1 Department of Osteopathic Manipulative Medicine, College of Osteopathic Medicine, New York Institute of Technology, Old Westbury, NY 11568, USA
2 Department of Surgery, Icahn School of Medicine at Mount Sinai, 1428 Madison Avenue, Atran Berg Building, 8th Floor, New York, NY 10029, USA
* Author to whom correspondence should be addressed.
Surgeries 2025, 6(4), 87; https://doi.org/10.3390/surgeries6040087 (registering DOI)
Submission received: 29 August 2025 / Revised: 4 October 2025 / Accepted: 8 October 2025 / Published: 9 October 2025

Abstract

Background/Objectives: Machine learning has been proposed as a way to improve the prediction of anesthesia-related complications after surgery. However, many studies report overly optimistic results because of data leakage and because they make limited use of the information in clinical notes. This study provides a transparent comparison of machine learning models that use both structured data and preoperative notes, with a focus on avoiding data leakage and on involving clinicians throughout. We show how high metrics reported in the literature can result from methodological pitfalls and may not be clinically meaningful. Methods: We used a dataset containing structured patient and surgery information together with preoperative clinical notes. To avoid data leakage, we excluded any variables that could directly reveal the outcome. The data were cleaned and processed, and information from the clinical notes was summarized into features suitable for modeling. We tested a range of machine learning methods, including simple, tree-based, and modern language-based models. Models were evaluated using a standard split of the data and cross-validation, and class imbalance was addressed with sampling techniques. Results: All models showed only modest ability to distinguish between patients with and without complications. The best performance was achieved by a simple model using both structured and summarized text features, with an area under the curve (AUC) of 0.644 and an accuracy of 60%. Other models, including those using advanced language techniques, performed similarly or slightly worse. Adding information from clinical notes gave small improvements, but no single type of data dominated. Overall, the results did not reach the high levels reported in some previous studies. Conclusions: In this analysis, machine learning models using both structured and unstructured preoperative data achieved only modest predictive performance for postoperative complications. These findings highlight the importance of transparent methodology and clinical oversight in avoiding data leakage and inflated results. Future progress will require better control of data leakage, richer data sources, and external validation to develop clinically useful prediction tools.
Keywords: data leakage; machine learning; anesthesia; clinical notes; complication prediction
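
The leakage safeguards summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code, and it rests on several assumptions: the field names (`icu_admission`, `complication`, `age`, `id`) are hypothetical, random oversampling stands in for the unspecified "sampling techniques", and the AUC is computed by direct pairwise comparison rather than with any particular library. The sketch shows three ideas: removing outcome-revealing variables before modeling, splitting the data before any resampling so that duplicated records cannot reach the test set, and scoring discrimination with the AUC.

```python
import random


def drop_leaky(record, leaky_fields):
    """Return a copy of the record without fields that could reveal the outcome."""
    return {k: v for k, v in record.items() if k not in leaky_fields}


def split_then_oversample(records, label_key, test_fraction=0.2, seed=0):
    """Split first, then randomly oversample the minority class in the training part only.

    Resampling before the split would let copies of one patient appear on
    both sides, which is one common form of data leakage.
    """
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]

    pos = [r for r in train if r[label_key] == 1]
    neg = [r for r in train if r[label_key] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = []
    if minority:  # nothing to duplicate if one class is absent from training
        extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return train + extra, test


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy records: 'icu_admission' stands in for a postoperative
# variable that would not be known at prediction time.
records = [
    {"id": i, "age": 50 + i, "icu_admission": i % 2, "complication": 1 if i < 4 else 0}
    for i in range(20)
]
clean = [drop_leaky(r, {"icu_admission"}) for r in records]
train, test = split_then_oversample(clean, "complication")
```

On this reading, an AUC of 0.644 means that a randomly chosen patient with a complication receives a higher score than a randomly chosen patient without one in roughly 64% of such pairs, compared with 50% for chance.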

Share and Cite

MDPI and ACS Style

Amanatidis, A.; Egan, K.; Nio, K.; Toma, M. Data-Leakage-Aware Preoperative Prediction of Postoperative Complications from Structured Data and Preoperative Clinical Notes. Surgeries 2025, 6, 87. https://doi.org/10.3390/surgeries6040087

AMA Style

Amanatidis A, Egan K, Nio K, Toma M. Data-Leakage-Aware Preoperative Prediction of Postoperative Complications from Structured Data and Preoperative Clinical Notes. Surgeries. 2025; 6(4):87. https://doi.org/10.3390/surgeries6040087

Chicago/Turabian Style

Amanatidis, Anastasia, Kyle Egan, Kusuma Nio, and Milan Toma. 2025. "Data-Leakage-Aware Preoperative Prediction of Postoperative Complications from Structured Data and Preoperative Clinical Notes" Surgeries 6, no. 4: 87. https://doi.org/10.3390/surgeries6040087

APA Style

Amanatidis, A., Egan, K., Nio, K., & Toma, M. (2025). Data-Leakage-Aware Preoperative Prediction of Postoperative Complications from Structured Data and Preoperative Clinical Notes. Surgeries, 6(4), 87. https://doi.org/10.3390/surgeries6040087
