This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

Ensemble Transfer Learning for Gastric Cancer Prediction Using Electronic Health Records in a Data-Scarce Single-Hospital Setting

1 Department of Statistics and Information Science, Dongduk Women’s University, Seoul 02748, Republic of Korea
2 Department of Internal Medicine, Kangdong Sacred Heart Hospital, College of Medicine, Hallym University, Seoul 05355, Republic of Korea
3 Department of Medical Informatics and Statistics, Kangdong Sacred Heart Hospital, Seoul 05355, Republic of Korea
* Authors to whom correspondence should be addressed.
Appl. Sci. 2025, 15(23), 12428; https://doi.org/10.3390/app152312428 (registering DOI)
Submission received: 28 September 2025 / Revised: 9 November 2025 / Accepted: 20 November 2025 / Published: 23 November 2025
(This article belongs to the Special Issue Advances in Machine Learning for Healthcare Applications)

Abstract

Gastric cancer is a significant health concern in East Asia, where early risk prediction is critical for prevention. However, the scarcity of electronic health record (EHR) data at a single hospital limits the applicability and generalizability of machine learning models. To address this challenge, we propose an ensemble transfer learning framework for gastric cancer prediction using structured EHRs in a data-scarce single-hospital setting. Three base models, a Support Vector Machine (SVM), a Random Forest, and a Deep Neural Network (DNN), were pretrained on a large-scale national dataset from the Korean National Health Insurance Service (NHIS) and fine-tuned on a smaller institutional dataset from Kangdong Sacred Heart Hospital (KSHH). The fine-tuned models were then combined via stacking ensemble learning with logistic regression as the meta-learner. The proposed model achieved a precision of 0.78, a recall of 0.92, an F1-score of 0.83, an accuracy of 0.91, and an AUC of 0.93. For interpretability, permutation feature importance and SHapley Additive exPlanations (SHAP) were applied. Smoking status, gender, and hypertensive disorder were identified as key predictors, consistent with previous studies. This study demonstrates that transfer learning can be applied successfully to overcome data scarcity in single-hospital structured EHRs. Furthermore, the stacking ensemble outperformed the individual fine-tuned models, offering a generalizable framework for gastric cancer prediction in data-scarce clinical settings.
Keywords: gastric cancer prediction; stacking ensemble; transfer learning; permutation feature importance; SHAP; electronic health records; eXplainable AI; medical AI
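
The pipeline summarized in the abstract (pretrain three base models on a large source dataset, fine-tune them on a small target dataset, stack them with a logistic regression meta-learner, then inspect permutation feature importance) can be illustrated with a minimal sketch. This is not the authors' implementation. The data are synthetic stand-ins for NHIS and KSHH, and every variable and function name is hypothetical. The abstract does not say how each model was fine-tuned, so the sketch assumes scikit-learn's `partial_fit` for a linear SVM and a small neural network, and `warm_start` for the random forest.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-ins (assumption): a large "source" set and a small "target" set.
X, y = make_classification(n_samples=6000, n_features=12, n_informative=6,
                           random_state=0)
X_src, y_src = X[:5000], y[:5000]
X_tgt, y_tgt = X[5000:], y[5000:]

# Split the target data into fine-tuning, meta-learner, and test parts.
X_ft, X_rest, y_ft, y_rest = train_test_split(
    X_tgt, y_tgt, test_size=0.6, stratify=y_tgt, random_state=0)
X_meta, X_test, y_meta, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=0)

# Step 1: pretrain the three base models on the source data.
svm = SGDClassifier(loss="hinge", random_state=0).fit(X_src, y_src)  # linear SVM
rf = RandomForestClassifier(n_estimators=100, warm_start=True,
                            random_state=0).fit(X_src, y_src)
dnn = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=200,
                    random_state=0).fit(X_src, y_src)

# Step 2: fine-tune on the small target set (assumed mechanism, see lead-in).
for _ in range(5):
    svm.partial_fit(X_ft, y_ft)   # continue from the pretrained weights
    dnn.partial_fit(X_ft, y_ft)
rf.n_estimators += 50             # keep the 100 source trees, add 50 target trees
rf.fit(X_ft, y_ft)

# Step 3: stack the base-model scores with a logistic regression meta-learner.
def meta_features(X_in):
    return np.column_stack([
        svm.decision_function(X_in),
        rf.predict_proba(X_in)[:, 1],
        dnn.predict_proba(X_in)[:, 1],
    ])

meta = LogisticRegression().fit(meta_features(X_meta), y_meta)

def ensemble_auc(X_in, y_in):
    return roc_auc_score(y_in, meta.predict_proba(meta_features(X_in))[:, 1])

auc = ensemble_auc(X_test, y_test)

# Step 4: permutation feature importance as the mean drop in AUC
# when a single input column is shuffled.
rng = np.random.default_rng(0)

def permutation_importance_auc(X_in, y_in, n_repeats=5):
    drops = np.zeros(X_in.shape[1])
    for j in range(X_in.shape[1]):
        for _ in range(n_repeats):
            X_perm = X_in.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops[j] += auc - ensemble_auc(X_perm, y_in)
    return drops / n_repeats

importances = permutation_importance_auc(X_test, y_test)
print(f"ensemble AUC on held-out target data: {auc:.3f}")
print("most important feature index:", int(np.argmax(importances)))
```

Fitting the meta-learner on a target split that the base models never saw during fine-tuning keeps the stacked scores from being optimistically biased. Cross-validated out-of-fold predictions would be a common alternative when the target set is too small to split three ways.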

Share and Cite

MDPI and ACS Style

Kim, H.H.; Han, J.Y.; Lim, Y.B.; Lim, Y.S.; Seo, S.-I.; Lee, K.J.; Shin, W.G. Ensemble Transfer Learning for Gastric Cancer Prediction Using Electronic Health Records in a Data-Scarce Single-Hospital Setting. Appl. Sci. 2025, 15, 12428. https://doi.org/10.3390/app152312428

Chicago/Turabian Style

Kim, Hyon Hee, Ji Yeon Han, Yae Bin Lim, Young Seo Lim, Seung-In Seo, Kyung Joo Lee, and Woon Geon Shin. 2025. "Ensemble Transfer Learning for Gastric Cancer Prediction Using Electronic Health Records in a Data-Scarce Single-Hospital Setting" Applied Sciences 15, no. 23: 12428. https://doi.org/10.3390/app152312428
