Prediction of Out-of-Hospital Cardiac Arrest Survival Outcomes Using a Hybrid Agnostic Explanation TabNet Model
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Source
2.2. Data Preprocessing
2.3. Development of a TabNet-Based OHCA Survival Outcome Prediction Model
2.4. Hyperparameter Fine-Tuning Method: Bayesian Optimization
- Phase 1: By sampling a few data points at random, the BO algorithm fits a surrogate function across the search space. A Gaussian process (GP) is used to update the surrogate function and generate the posterior distribution over the objective function, because of its versatility, robustness, accuracy, and analytic tractability.
- Phase 2: Using the posterior distribution created in Phase 1, an acquisition function is constructed. This function balances exploring new parts of the search space with exploiting regions that have already produced good results. Exploration and exploitation continue, and the surrogate model is updated with each new observation, until a stopping threshold is reached. The next sampling point is chosen by maximizing the acquisition function (a minimal sketch of this loop over the TabNet search space is given after the hyperparameter list below).
- “n_d” (Number of features in the output of each decision step): This hyperparameter controls the size of the output space for each decision step, which can affect the model’s ability to capture complex patterns in the data. A larger value for “n_d” can increase the capacity of the model, but may also increase the risk of overfitting. A typical range for “n_d” is between 4 and 128.
- “n_a” (Number of features in the attention function for each decision step): This hyperparameter controls the size of the attention network in each decision step, which can affect the model’s ability to focus on relevant features for prediction. A larger value for “n_a” can improve the quality of the attention mechanism, but may also increase the computational cost. A typical range for n_a is between 4 and 128.
- “n_steps” (Number of decision steps in the network): This hyperparameter controls the depth of the TabNet model, which can affect its ability to capture hierarchical relationships in the data. A larger value for “n_steps” can increase the capacity of the model, but may also increase the risk of overfitting. A typical range for “n_steps” is between 1 and 10.
- “gamma” (Sparsity regularization coefficient): This hyperparameter controls the degree of sparsity in the feature selection process. A larger value for “gamma” can increase the sparsity of the model, but may also decrease its ability to capture important features. A typical range for gamma is between 0.8 and 2.0.
- “n_independent” (Number of independent GLU (gated linear unit) activations per feature): This hyperparameter controls the number of independent nonlinear transformations for each feature. A larger value for “n_independent” can increase the complexity of the model, but may also increase the risk of overfitting. A typical range for n_independent is between 1 and 10.
- “n_shared” (Number of shared GLU activations for all features): This hyperparameter controls the number of shared nonlinear transformations for all features. A larger value for “n_shared” can increase the capacity of the model, but may also increase the computational cost. A typical range for “n_shared” is between 1 and 10.
- “momentum” (Batch normalization momentum): This hyperparameter controls the momentum for the batch normalization layer, which can affect the stability and convergence of the optimization process. A larger value for “momentum” can improve the stability of the model, but may also increase the risk of overfitting. A typical range for “momentum” is between 0.02 and 0.3.
- “batch_size” (The size of each batch used for training the model): A larger value for “batch_size” can improve the efficiency of the optimization process, but may also increase the risk of overfitting. A typical range for “batch_size” is between 16 and 256.
- “virtual_batch_size” (The size of the virtual batch used for training the model): This hyperparameter can be used to simulate larger batch sizes without increasing memory usage. A larger value for “virtual_batch_size” can improve the stability of the optimization process, but may also increase the computational cost. A typical range for “virtual_batch_size” is between 16 and 256.
- “patience” (Number of epochs to wait for improvement in the validation loss before early stopping): This hyperparameter can be used to prevent overfitting and to improve the efficiency of the optimization process. A larger value for “patience” can improve stability but lengthens training. A typical range for “patience” is between 5 and 25.
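The listing below is a minimal sketch of how such a Bayesian optimization loop over the TabNet hyperparameters could be wired up with scikit-optimize (`gp_minimize`) and pytorch-tabnet. The variable names (`X_train`, `y_train`, `X_val`, `y_val`), the fixed training settings, and the exact search ranges are illustrative assumptions, not the authors' exact tuning script.

```python
# Illustrative sketch only: GP-based Bayesian optimization of TabNet hyperparameters.
# Assumes scikit-optimize and pytorch-tabnet are installed and that
# X_train, y_train, X_val, y_val are preprocessed NumPy arrays (hypothetical names).
from skopt import gp_minimize
from skopt.space import Integer, Real
from skopt.utils import use_named_args
from sklearn.metrics import roc_auc_score
from pytorch_tabnet.tab_model import TabNetClassifier

search_space = [
    Integer(4, 128, name="n_d"),
    Integer(4, 128, name="n_a"),
    Integer(1, 10, name="n_steps"),
    Real(0.8, 2.0, name="gamma"),
    Integer(1, 10, name="n_independent"),
    Integer(1, 10, name="n_shared"),
    Real(0.02, 0.3, name="momentum"),
]

@use_named_args(search_space)
def objective(**params):
    # batch_size, virtual_batch_size, and patience could also be added to the search space;
    # they are fixed here to keep the sketch short.
    clf = TabNetClassifier(verbose=0, seed=0, **params)
    clf.fit(
        X_train, y_train,
        eval_set=[(X_val, y_val)],
        eval_metric=["auc"],
        max_epochs=100,
        patience=15,
        batch_size=256,
        virtual_batch_size=128,
    )
    val_auc = roc_auc_score(y_val, clf.predict_proba(X_val)[:, 1])
    return -val_auc  # gp_minimize minimizes, so negate the validation AUC

# GP surrogate (Phase 1) plus acquisition-driven sampling (Phase 2)
result = gp_minimize(objective, search_space, n_calls=30, random_state=0)
print("Best validation AUC:", -result.fun)
print("Best hyperparameters:", result.x)
```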
2.5. Performance Metrics
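For reference, the metrics reported in the Results follow the standard confusion-matrix definitions, assuming the usual conventions (TP, TN, FP, FN denote true/false positives/negatives):

```latex
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN},
```

```latex
F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}, \qquad
\text{ROC AUC} = \int_{0}^{1} \text{TPR}\, d(\text{FPR}).
```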
2.6. Prediction Performance Comparison
- XGBoost (XGB): XGB is a popular gradient-boosting algorithm widely used in regression and classification tasks. It combines multiple decision trees to make predictions and is known for its high performance and accuracy. XGB is particularly good at handling large datasets and missing data. Notable features include its ability to handle both categorical and numerical data and to perform feature selection to identify the most important features for prediction. (A brief sketch of how these baselines can be trained and scored is given after this list.)
- K-nearest neighbors (KNN): KNN is a simple and effective algorithm for both regression and classification problems. It works by finding the k-nearest data points to a new observation and using their values to predict the outcome of that observation. KNN is often used for problems with non-linear decision boundaries and can be effective in low-dimensional datasets. One of the key features of KNN is that it is easy to understand and implement, but it can be computationally expensive for large datasets.
- Random forest (RF): RF is an ensemble learning algorithm that uses multiple decision trees to make predictions. It is particularly good at handling high-dimensional datasets and can handle both categorical and continuous data. RF is known for its ability to reduce overfitting and handle missing data, and it can be used for both regression and classification tasks.
- Decision tree (DT): DT is a simple yet powerful algorithm for both regression and classification problems. It works by recursively splitting the data based on the most informative feature at each node until a stopping criterion is reached. DT is easy to interpret and can handle both categorical and continuous data.
- Logistic regression (LR): LR is a popular algorithm for binary classification problems. It works by estimating the probability of the outcome variable given the input variables, and it uses a logistic function to transform this probability into a binary outcome. LR is easy to interpret and can handle both categorical and continuous data.
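As noted above, the sketch below shows one way the baseline classifiers could be trained and scored with scikit-learn and XGBoost under a common loop. The dataset variables (`X_train`, `y_train`, `X_test`, `y_test`) are hypothetical placeholders, and the models use default hyperparameters rather than the tuned values reported later.

```python
# Illustrative sketch: fitting and scoring the baseline models with a common loop.
# X_train, y_train, X_test, y_test are assumed preprocessed arrays (hypothetical names).
from xgboost import XGBClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

baselines = {
    "XGB": XGBClassifier(),
    "KNN": KNeighborsClassifier(),
    "RF": RandomForestClassifier(random_state=0),
    "DT": DecisionTreeClassifier(random_state=0),
    "LR": LogisticRegression(max_iter=1000),
}

for name, model in baselines.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    y_prob = model.predict_proba(X_test)[:, 1]
    print(f"{name}: "
          f"acc={accuracy_score(y_test, y_pred):.4f}, "
          f"prec={precision_score(y_test, y_pred):.4f}, "
          f"rec={recall_score(y_test, y_pred):.4f}, "
          f"f1={f1_score(y_test, y_pred):.4f}, "
          f"auc={roc_auc_score(y_test, y_prob):.4f}")
```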
2.7. Local Interpretable Model-Agnostic Explanations (LIMEs)
3. Results
3.1. Performances of the HAE-TabNet Model
3.2. Evaluation of HAE-TabNet’s Feature Masks
3.3. Evaluation of LIME-HAE-TabNet
- Electrocardiogram findings of sudden cardiac arrest in the emergency room = 0 (rhythm after recovery of spontaneous circulation (state of recovery of spontaneous circulation at the time of visit));
- Recovery of spontaneous circulation before hospital arrival = 1 (recovery of spontaneous circulation);
- Whether emergency room defibrillation is performed = 1 (not conducted);
- Smoking history = 8 (does not exist);
- Gender = 2 (female);
- Type of insurance = 1 (national health insurance);
- Whether defibrillation was performed prior to hospital arrival = 9 (unknown);
- Detector or witness types of sudden cardiac arrest patients = 9 (unknown);
- Causes of sudden cardiac arrest = 1 (disease);
- Reasons for non-implementation of ambulance automatic defibrillator = 1 (indication).
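The feature values listed above correspond to a single explained record. The following is a minimal sketch, assuming the `lime` package, a trained classifier `clf` exposing `predict_proba`, a training matrix `X_train`, a test matrix `X_test`, and a `feature_names` list matching Table 1 (all hypothetical names), of how such a local explanation could be generated; the study's exact pipeline may differ.

```python
# Illustrative sketch: generating a LIME explanation for a single OHCA record.
import numpy as np
from lime.lime_tabular import LimeTabularExplainer

explainer = LimeTabularExplainer(
    training_data=X_train,
    feature_names=feature_names,
    class_names=["dead", "survival"],
    mode="classification",
    discretize_continuous=True,
)

instance = X_test[0]  # the single record to be explained (hypothetical index)
explanation = explainer.explain_instance(
    data_row=instance,
    predict_fn=lambda x: clf.predict_proba(x.astype(np.float32)),
    num_features=10,  # report the ten most influential features, as in the list above
)

for feature_rule, weight in explanation.as_list():
    print(f"{feature_rule}: {weight:+.4f}")
```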
4. Discussion
5. Limitations
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Variables | Field Type |
---|---|
Gender | Categorical: male (1), female (2) |
Age (only age) | Continuous: age in years |
Type of insurance | Categorical: no (0), yes (1) |
Whether CPR was performed before arriving at the hospital | Categorical: CPR continuous transfer (1), transfer without CPR (2) |
Recovery of spontaneous circulation before hospital arrival | Categorical: recovery of spontaneous circulation (1), no recovery of spontaneous circulation (2) |
Sudden cardiac arrest witnessed prior to hospital arrival | Categorical: not seen (1), sighted (2), unknown (9) |
Detector or witness types of sudden cardiac arrest patients | Categorical: working in the listed occupations (1), occupations not belonging to (1) or non-working (2), unknown (9) |
Whether CPR is performed by the general public | Categorical: not enforced (1), enforced (2), N/A (if paramedics and medical personnel on duty are witnesses) (3), unknown (9) |
Place of sudden cardiac arrest | Categorical: public places (1), non-public place (2), etc. (8), unknown (9) |
Activities at the time of sudden cardiac arrest | Categorical: during athletics (1), during leisure activities (2), working for pay (3), working without pay (4), in training (5), on the go (6), in daily life (7), in treatment (8), drinking (9), etc. (88), unknown (99) |
Causes of sudden cardiac arrest | Categorical: disease (1), other than disease (2), unknown (9) |
Electrocardiographic findings of sudden cardiac arrest before hospital arrival | Categorical: not watching (10), ventricular fibrillation (VF) (20), pulseless VT (30), pulseless electrical activity (PEA) (40), asystole (50), bradycardia (60), indistinct defibrillable rhythm (81), indistinct non-shockable rhythm (82), etc. (88), unknown (99) |
Whether defibrillation was performed prior to hospital arrival | Categorical: not conducted (1), carried out (2), unknown (9) |
Past history_hypertension | Categorical: has existed (1), does not exist (2), unknown (9) |
Past history_diabetes | Categorical: has existed (1), does not exist (2), unknown (9) |
Past history_heart disease | Categorical: has existed (1), does not exist (2), unknown (9) |
Past history_chronic kidney disease | Categorical: has existed (1), does not exist (2), unknown (9) |
Past history_respiratory disease | Categorical: has existed (1), does not exist (2), unknown (9) |
Past history_stroke | Categorical: has existed (1), does not exist (2), unknown (9) |
Past history_dyslipidemia | Categorical: has existed (1), does not exist (2), unknown (9) |
Drinking history | Categorical: current drinking (1), past drinking (2), does not exist (8), unknown (9) |
Smoking history | Categorical: current smoking (1), past smoking (2), e-cigarette (3), does not exist (8), unknown (9) |
CPR in the emergency room | Categorical: not enforced (1), tried but stopped within 20 min (2), enforced (3) |
Electrocardiogram findings of sudden cardiac arrest in the emergency room | Categorical: rhythm after recovery of spontaneous circulation (state of recovery of spontaneous circulation at the time of visit) (0), not watching (1), ventricular fibrillation (VF) (2), pulseless VT (3), pulseless electrical activity (PEA) (4), asystole (5), bradycardia (6), etc. (8), unknown (9) |
Place of first electrocardiogram | Categorical: pre-hospital stage (1), other hospital (2), research hospital (3), not enforced (4), unknown (9) |
Whether emergency room defibrillation is performed | Categorical: not conducted (1), carried out (2), unknown (9) |
ECG findings using an automated defibrillator | Categorical: no initial ECG monitoring (10), ventricular fibrillation (VF) (20), pulseless VT (30), pulseless electrical activity (PEA) (40), asystole (50), bradycardia (60), indistinct defibrillable rhythm (81), indistinct non-shockable rhythm (82), etc. (88), unknown (99) |
Whether ambulances implement automatic defibrillators | Categorical: not enforced (1), enforced (2), unknown (9) |
Reasons for non-implementation of ambulance automatic defibrillator | Categorical: indication (1), AED device condition is bad (2), family rejection (3), etc. (8), unknown (9) |
First aid guidance before the ambulance arrives | Categorical: does not exist (1), paramedic (3), fire control room (3), etc. (8) |
Survival | Categorical: dead (0), survival (1) |
Model | Hyperparameters |
---|---|
HAE-TabNet | TabNetClassifier(n_d = 80, n_a = 80, n_steps = 6, gamma = 1.8, cat_emb_dim = 1, n_independent = 2, n_shared = 2, epsilon = 1 × 10−15, momentum = 0.02, lambda_sparse = 7.85 × 10−8, seed = 0, clip_value = 1, verbose = 0, optimizer_fn = <class 'torch.optim.adam.Adam'>, optimizer_params = {'lr': 0.02, 'weight_decay': 1 × 10−5}, scheduler_fn = <class 'torch.optim.lr_scheduler.ReduceLROnPlateau'>, scheduler_params = {'mode': 'min', 'patience': 5, 'min_lr': 1 × 10−5, 'factor': 0.5}, mask_type = 'entmax', device_name = 'auto', n_shared_decoder = 1, n_indep_decoder = 1) |
XGB | XGBClassifier(colsample_bytree = 0.5293, max_depth = 2, n_estimators = 295, reg_alpha = 3.1064 × 10−5, reg_lambda = 0.0005322, scale_pos_weight = 21.6672, subsample = 0.9932) |
KNN | KNNClassifier (leaf_size = 30, metric = minkowski, n_neighbors = 41, p = 2) |
RF | RandomForestClassifier(max_depth = 7, max_features = 0.7594, min_impurity_decrease = 1.4295 × 10−6, min_samples_leaf = 3, min_samples_split = 7, n_estimators = 104) |
DT | DecisionTreeClassifier(max_depth = 6, max_features = 0.6509, min_impurity_decrease = 8.8038 × 10−5, min_samples_leaf = 5, min_samples_split = 3) |
LR | LogisticRegression(C = 1.4870) |
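Given the tuned HAE-TabNet settings listed above, the sketch below shows how such a configuration could be instantiated with pytorch-tabnet and how the per-step feature masks discussed in Section 3.2 could be retrieved. The data variables and the fixed training settings (epochs, batch sizes, patience) are hypothetical placeholders, not the authors' exact training script.

```python
# Illustrative sketch: building the tuned TabNet configuration and extracting feature masks.
# X_train, y_train, X_val, y_val, X_test are assumed preprocessed arrays (hypothetical names).
import torch
from pytorch_tabnet.tab_model import TabNetClassifier

clf = TabNetClassifier(
    n_d=80, n_a=80, n_steps=6, gamma=1.8,
    n_independent=2, n_shared=2,
    momentum=0.02, lambda_sparse=7.85e-8,
    seed=0, clip_value=1, verbose=0,
    optimizer_fn=torch.optim.Adam,
    optimizer_params={"lr": 0.02, "weight_decay": 1e-5},
    scheduler_fn=torch.optim.lr_scheduler.ReduceLROnPlateau,
    scheduler_params={"mode": "min", "patience": 5, "min_lr": 1e-5, "factor": 0.5},
    mask_type="entmax",
)

clf.fit(
    X_train, y_train,
    eval_set=[(X_val, y_val)],
    eval_metric=["auc"],
    max_epochs=200,      # assumed training budget
    patience=20,
    batch_size=256,
    virtual_batch_size=128,
)

# Interpretability: aggregated importances and per-step attention masks (Section 3.2).
explain_matrix, masks = clf.explain(X_test)
print("Aggregated feature importances:", clf.feature_importances_)
print("Mask for decision step 0 has shape:", masks[0].shape)
```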
Dataset | Model | Accuracy | Precision | Recall | F1-Score | ROC AUC |
---|---|---|---|---|---|---|
Original imbalanced training set | HAE-TabNet | 0.9572 ± 0.003 | 0.9031 ± 0.004 | 0.8339 ± 0.005 | 0.8669 ± 0.005 | 0.9858 ± 0.002 |
 | XGB | 0.9444 ± 0.003 | 0.8793 ± 0.004 | 0.8380 ± 0.005 | 0.8579 ± 0.005 | 0.9717 ± 0.002 |
 | KNN | 0.9362 ± 0.003 | 0.8715 ± 0.004 | 0.7730 ± 0.006 | 0.8190 ± 0.005 | 0.9615 ± 0.003 |
 | RF | 0.9483 ± 0.003 | 0.9193 ± 0.004 | 0.8254 ± 0.005 | 0.8695 ± 0.005 | 0.9711 ± 0.002 |
 | DT | 0.9344 ± 0.003 | 0.8167 ± 0.005 | 0.8306 ± 0.005 | 0.8234 ± 0.005 | 0.9016 ± 0.004 |
 | LR | 0.8750 ± 0.005 | 0.8785 ± 0.004 | 0.8446 ± 0.005 | 0.8608 ± 0.005 | 0.8875 ± 0.003 |
Rebalanced training set | HAE-TabNet | 0.9611 ± 0.002 | 0.9681 ± 0.002 | 0.9623 ± 0.002 | 0.9652 ± 0.002 | 0.9934 ± 0.001 |
 | XGB | 0.9539 ± 0.002 | 0.9692 ± 0.002 | 0.9480 ± 0.003 | 0.9585 ± 0.002 | 0.9909 ± 0.001 |
 | KNN | 0.9441 ± 0.003 | 0.9296 ± 0.003 | 0.9742 ± 0.002 | 0.9514 ± 0.002 | 0.9854 ± 0.001 |
 | RF | 0.9608 ± 0.002 | 0.9889 ± 0.001 | 0.9906 ± 0.001 | 0.9897 ± 0.001 | 0.9894 ± 0.001 |
 | DT | 0.9296 ± 0.003 | 0.9252 ± 0.003 | 0.9380 ± 0.003 | 0.9315 ± 0.003 | 0.9205 ± 0.003 |
 | LR | 0.8613 ± 0.004 | 0.8704 ± 0.004 | 0.8845 ± 0.004 | 0.8774 ± 0.004 | 0.9469 ± 0.003 |
Test set | HAE-TabNet | 0.9605 ± 0.003 | 0.9598 ± 0.003 | 0.9600 ± 0.003 | 0.9599 ± 0.003 | 0.9930 ± 0.001 |
 | XGB | 0.9543 ± 0.004 | 0.9532 ± 0.004 | 0.9542 ± 0.004 | 0.9537 ± 0.004 | 0.9905 ± 0.002 |
 | KNN | 0.9248 ± 0.005 | 0.9232 ± 0.005 | 0.9242 ± 0.005 | 0.9237 ± 0.005 | 0.9807 ± 0.002 |
 | RF | 0.9514 ± 0.004 | 0.9499 ± 0.004 | 0.9517 ± 0.004 | 0.9507 ± 0.004 | 0.9868 ± 0.002 |
 | DT | 0.9191 ± 0.005 | 0.9137 ± 0.005 | 0.9453 ± 0.004 | 0.9293 ± 0.004 | 0.9205 ± 0.005 |
 | LR | 0.8632 ± 0.006 | 0.8715 ± 0.006 | 0.8875 ± 0.005 | 0.8794 ± 0.006 | 0.9469 ± 0.004 |