Cascade-Based Input-Doubling Classifier for Predicting Survival in Allogeneic Bone Marrow Transplants: Small Data Case
Abstract
1. Introduction
- We enhanced the input-doubling classifier for predicting survival in allogeneic bone marrow transplants with limited data by combining two machine learning methods according to cascading principles.
- We developed four algorithmic implementations of the enhanced cascade-based input-doubling classifier, combining a training-free artificial neural network classifier (the PNN) with existing machine learning algorithms, specifically for analyzing small datasets.
- We optimized the performance of the improved classifiers using the Dual Annealing method and demonstrated a significant increase in accuracy compared to several existing methods, based on six different performance metrics.
2. State of the Art
- Calculating the chosen distance from the current sample to all samples in the support dataset;
- Calculating Gaussian functions from the distances computed in the previous step;
- Determining the probabilities of belonging to each of the predefined classes;
- Assigning the class label with the highest probability from the set defined in the previous step; a minimal code sketch of these four steps follows the list below.
- PNNs estimate the probability that the input data belongs to a specific class or outcome. This is useful for tasks where accounting for uncertainty or data incompleteness is important, which is very characteristic of small medical data samples, most of which are collected manually by doctors.
- PNNs often use special activation functions, such as kernel functions, to calculate probabilities for each class. One of the most well-known variants is the kernel-based network, where Gaussian functions are used to estimate probabilities. This is the type that will be used in this work.
- Probabilistic neural networks train faster than many other types of neural networks, as their training reduces to storing the support samples and modeling probabilities, without the complex error backpropagation typical of other artificial neural networks.
- Probabilistic neural networks can model complex dependencies and relationships in data. They are capable of integrating additional information about uncertainty and data incompleteness, which can be important for small datasets.
- By using probabilistic approaches, this neural network can be less prone to overfitting, which is a major issue during the intelligent analysis of small datasets.
- Due to the explicit probability estimation, PNNs can be easier to interpret, as they provide information on how confident the algorithm is in its predictions.
- PNNs are characterized by high generalization properties and have only one main parameter (smooth factor), the optimal value of which should be determined experimentally for effective functioning of the artificial neural network. In this paper, we will use an optimization method to determine the value of the Gaussian function’s smoothing parameter, or smooth factor (sigma).
- PNNs can use various distances between input data and reference (training) samples to assess the probability of belonging to each class. This paper will investigate the effectiveness of using a range of such functions, including Euclidean, Manhattan, cosine, Chebyshev, Minkowski, and Canberra distances.
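As a hedged illustration of the four-step procedure and the parameters described above, the following minimal Python sketch implements a Gaussian-kernel PNN with a selectable distance metric. It is an assumption-laden sketch, not the authors' implementation.

```python
# Minimal PNN sketch (steps 1-4 above); illustrative, not the authors' code.
import numpy as np
from scipy.spatial.distance import cdist

def pnn_predict(x, X_support, y_support, sigma=1.0, metric="euclidean"):
    """Classify one sample against a labelled support set."""
    # Step 1: chosen distance from the current sample to all support samples
    # (metric can be "euclidean", "cityblock" (Manhattan), "cosine",
    # "chebyshev", "minkowski", or "canberra", matching the distances
    # investigated in this paper).
    d = cdist(np.atleast_2d(x), X_support, metric=metric).ravel()
    # Step 2: Gaussian function of each distance (sigma = smooth factor).
    g = np.exp(-(d ** 2) / (2.0 * sigma ** 2))
    # Step 3: probability of belonging to each predefined class.
    classes = np.unique(y_support)
    scores = np.array([g[y_support == c].sum() for c in classes])
    probs = scores / scores.sum()
    # Step 4: assign the label with the highest probability.
    return classes[np.argmax(probs)], probs
```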
3. Materials and Methods
- Apply PNN to assign class labels to each vector in both the training and test samples. Note that when classifying the training (support) dataset, the vector being predicted is temporarily removed from the sample, i.e., in a leave-one-out manner.
- Generate extended vectors by concatenating all possible pairs of vectors from the extended reference (training) sample. This yields a quadratic increase in the number of vectors, each with doubled input features. Unlike the method described in Ref. [15], the dimensionality of these vectors is increased by two positions due to the additional step 1.
- Use the selected classifier (ML-2) to carry out its training procedure, if required. The combined execution of steps 1 and 2 creates the reference sample for ML-2, which can be any classifier chosen by the user; a hedged sketch of this augmentation follows below.
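The following Python sketch illustrates steps 1 and 2 under one explicit assumption: that the target for each concatenated pair is the difference between the known class labels, in line with the input-doubling scheme of Ref. [15]. The function name is illustrative, not taken from the paper.

```python
# Hedged sketch of the training-stage augmentation; the pair target
# (difference of known labels) is an assumption based on Ref. [15].
import numpy as np

def build_ml2_training_set(X, y, pnn_labels):
    """Pairwise concatenation of extended vectors (quadratic growth).

    X          : (n, m) training features
    y          : (n,)   known class labels
    pnn_labels : (n,)   leave-one-out PNN predictions from step 1
    """
    # Step 1 result: extend each vector with its temporary PNN prediction.
    X_ext = np.column_stack([X, pnn_labels])              # (n, m + 1)
    pairs, targets = [], []
    for i in range(len(X_ext)):
        for j in range(len(X_ext)):
            # Step 2: doubled features, plus the two extra positions
            # contributed by step 1.
            pairs.append(np.concatenate([X_ext[i], X_ext[j]]))
            targets.append(y[j] - y[i])   # assumed difference target z
    return np.asarray(pairs), np.asarray(targets)

# Step 3: fit the user-chosen ML-2 on the augmented sample, e.g.
# ml2.fit(*build_ml2_training_set(X_train, y_train, pnn_labels))
```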
- As shown in Ref. [15], the existing input-doubling classifier outperforms the baseline nonlinear machine learning method used in its training procedure. In some cases, however, when the data are small, high-dimensional, and complex, the method gives unsatisfactory results. This paper therefore proposes a modification of the method from Ref. [15] to improve the accuracy of analysis for high-dimensional small datasets. The modification is based on cascading principles, incorporating temporary predictions as an additional step.
- Assign the class label to the current observation with an unknown outcome using the PNN and the initial dataset as the supporting dataset.
- Create a temporary dataset for the current observation by its stepwise concatenation with each vector from the initial support (training) sample, incorporating the previously predicted outcomes (highlighted in green). Unlike the method described in Ref. [15], each vector in the temporary dataset has two additional dimensions due to the inclusion of step 1.
- Apply the selected classifier (ML-2) to process this dataset and compute the differences zi (as shown in Figure 3). Here, a different ML method or ANN can be used, since it operates on the quadratically augmented dataset.
- Calculate sums of two elements: the known output value yi and the corresponding prediction zi from the previous step. Use the plurality voting principle to determine the final class label for the current vector with an unknown label (Figure 3).
- Repeat steps 1–4 for all subsequent vectors with unknown class labels (from the test sample); a hedged sketch of this procedure is given below.
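A minimal sketch of the application procedure, reusing pnn_predict and the difference-target assumption from the sketches above; ml2 is any fitted ML-2 estimator, and the rounding-plus-voting step is one illustrative reading of the plurality voting principle.

```python
# Hedged sketch of steps 1-5; builds on pnn_predict() and the assumed
# difference target z_i from the previous sketches.
import numpy as np
from collections import Counter

def cascade_predict(x, X, y, pnn_labels, ml2, sigma=1.0, metric="euclidean"):
    # Step 1: temporary PNN label for the unknown observation,
    # using the initial dataset as the supporting dataset.
    x_label, _ = pnn_predict(x, X, y, sigma=sigma, metric=metric)
    x_ext = np.append(x, x_label)
    X_ext = np.column_stack([X, pnn_labels])
    # Step 2: temporary dataset - each support vector concatenated with x.
    temp = np.array([np.concatenate([row, x_ext]) for row in X_ext])
    # Step 3: ML-2 predicts the differences z_i.
    z = ml2.predict(temp)
    # Step 4: sums y_i + z_i, then plurality voting for the final label.
    votes = np.rint(y + z).astype(int)
    return Counter(votes.tolist()).most_common(1)[0][0]

# Step 5 is simply a loop of cascade_predict over every test vector.
```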
4. Modeling and Results
4.1. Dataset Descriptions
4.2. Performance Indicators
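All six indicators reported in the tables below are standard metrics available in scikit-learn; the helper function name here is illustrative, not from the paper.

```python
# The six performance indicators used in this paper, via scikit-learn.
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, matthews_corrcoef, cohen_kappa_score)

def evaluate(y_true, y_pred):
    return {
        "Accuracy": accuracy_score(y_true, y_pred),
        "Precision": precision_score(y_true, y_pred),
        "Recall": recall_score(y_true, y_pred),
        "F1-Score": f1_score(y_true, y_pred),
        "Matthews Correlation Coefficient": matthews_corrcoef(y_true, y_pred),
        "Cohen's Kappa Score": cohen_kappa_score(y_true, y_pred),
    }
```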
4.3. Cascade Method’s Parameters Optimization
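As noted in the highlights, the classifiers' parameters (including the PNN smooth factor sigma) were optimized with the Dual Annealing method. A hedged sketch using SciPy's dual_annealing is shown below; the objective, bounds, and cross-validation setup are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch: tune the PNN smooth factor by minimizing negative
# cross-validated accuracy. make_classifier is a hypothetical factory
# wrapping the PNN sketch above in a scikit-learn-compatible estimator.
from scipy.optimize import dual_annealing
from sklearn.model_selection import cross_val_score

def tune_sigma(X, y, make_classifier, bounds=((1e-3, 10.0),)):
    def objective(params):
        clf = make_classifier(sigma=params[0])
        return -cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
    result = dual_annealing(objective, bounds=bounds, maxiter=100, seed=42)
    return result.x[0]   # optimized sigma
```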
4.4. Results
- Algorithm 1—Cascade-based input-doubling method using PNN with RandomForest;
- Algorithm 2—Cascade-based input-doubling method using PNN with XGBoost;
- Algorithm 3—Cascade-based input-doubling method using PNN with HistGradientBoosting;
- Algorithm 4—Cascade-based input-doubling method using PNN-1 with PNN-2.
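The four variants differ only in the choice of the ML-2 stage. A hedged sketch of how the ML-2 stages might be instantiated, using the optimized values from the parameters table in Appendix A (the wiring into the cascade is assumed, not shown in the paper):

```python
# ML-2 stages for Algorithms 1-3, with parameter values from Appendix A.
from sklearn.ensemble import (RandomForestClassifier,
                              HistGradientBoostingClassifier)
from xgboost import XGBClassifier

ml2_variants = {
    "Algorithm 1": RandomForestClassifier(n_estimators=165, max_depth=14),
    "Algorithm 2": XGBClassifier(n_estimators=179,
                                 learning_rate=0.8668009441186528,
                                 max_depth=3),
    "Algorithm 3": HistGradientBoostingClassifier(
        max_iter=52, learning_rate=0.5598098557162532, max_depth=5),
}
# Algorithm 4 reuses a second PNN (sigma2 = 4.11678) as its ML-2 stage.
```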
5. Comparison and Discussion
- Investigate the effectiveness of using simpler optimization methods [4] to reduce the execution time of the algorithm while maintaining high accuracy.
- Explore alternative methods for analyzing small datasets to potentially replace ML-1 or ML-2 of the proposed method with other models for the enhanced method implementation described in this paper. For instance, substituting ML-2 with RBF ANN [48] could further improve the method’s overall accuracy and decrease its training time.
- Refine the data augmentation procedure by incorporating clustering techniques [31] to significantly reduce the size of the augmented dataset while preserving high accuracy.
- Enhance the proposed method by expanding the input data space using a set of probabilities of class membership [13], rather than class labels as done in this study. This approach may improve the method’s accuracy.
6. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A
| Distance | Accuracy | Precision | Recall | F1-Score | Matthews Correlation Coefficient | Cohen’s Kappa Score |
|---|---|---|---|---|---|---|
| Euclidean | 0.895 | 0.904 | 0.91 | 0.905 | 0.79 | 0.79 |
| Manhattan | 0.763 | 0.667 | 0.875 | 0.757 | 0.553 | 0.534 |
| Cosine | 0.842 | 0.952 | 0.800 | 0.87 | 0.69 | 0.67 |
| Chebyshev | 0.789 | 0.857 | 0.783 | 0.818 | 0.573 | 0.569 |
| Minkowski | 0.842 | 0.86 | 0.857 | 0.858 | 0.681 | 0.681 |
| Canberra | 0.789 | 0.905 | 0.76 | 0.826 | 0.578 | 0.564 |
| Algorithm | Parameters’ Values |
|---|---|
| PNN with RandomForest | Chebyshev distance; sigma = 0.508605399; n_estimators: 165; max_depth: 14 |
| PNN with XGBoost | Euclidean distance; sigma = 2.69521; n_estimators: 179; learning_rate: 0.8668009441186528; max_depth: 3 |
| PNN with HistGradientBoosting | Cosine distance; sigma = 5.167324462; max_iter: 52; learning_rate: 0.5598098557162532; max_depth: 5 |
| PNN-1 with PNN-2 | Euclidean distance; sigma1 = 2.69521; sigma2 = 4.11678 |
References
- Tolstyak, Y.; Chopyak, V.; Havryliuk, M. An Investigation of the Primary Immunosuppressive Therapy’s Influence on Kidney Transplant Survival at One Month after Transplantation. Transpl. Immunol. 2023, 78, 101832.
- Tolstyak, Y.; Zhuk, R.; Yakovlev, I.; Shakhovska, N.; Gregus Ml, M.; Chopyak, V.; Melnykova, N. The Ensembles of Machine Learning Methods for Survival Predicting after Kidney Transplantation. Appl. Sci. 2021, 11, 10380.
- Bhat, M.; Rabindranath, M.; Chara, B.S.; Simonetto, D.A. Artificial Intelligence, Machine Learning, and Deep Learning in Liver Transplantation. J. Hepatol. 2023, 78, 1216–1233.
- Havryliuk, M.; Hovdysh, N.; Tolstyak, Y.; Chopyak, V.; Kustra, N. Investigation of PNN Optimization Methods to Improve Classification Performance in Transplantation Medicine. In Proceedings of the IDDM’2023: 6th International Conference on Informatics & Data-Driven Medicine, CEUR-WS.org 3609, Bratislava, Slovakia, 17–19 November 2023; pp. 338–345.
- Huang, S.; Deng, H. Data Analytics: A Small Data Approach, 1st ed.; Chapman & Hall/CRC Data Science Series; CRC Press: Boca Raton, FL, USA, 2021; ISBN 978-0-367-60950-4.
- Krak, I.; Kuznetsov, V.; Kondratiuk, S.; Azarova, L.; Barmak, O.; Padiuk, P. Analysis of Deep Learning Methods in Adaptation to the Small Data Problem Solving. In Lecture Notes in Data Engineering, Computational Intelligence, and Decision Making; Babichev, S., Lytvynenko, V., Eds.; Lecture Notes on Data Engineering and Communications Technologies; Springer International Publishing: Cham, Switzerland, 2023; Volume 149, pp. 333–352. ISBN 978-3-031-16202-2.
- Medykovskvi, M.; Tsmots, I.; Skorokhoda, O. Spectrum Neural Network Filtration Technology for Improving the Forecast Accuracy of Dynamic Processes in Economics. Actual Probl. Econ. 2014, 162, 410–416.
- Izonin, I.; Tkachenko, R.; Berezsky, O.; Krak, I.; Kováč, M.; Fedorchuk, M. Improvement of the ANN-Based Prediction Technology for Extremely Small Biomedical Data Analysis. Technologies 2024, 12, 112.
- Chumachenko, D.; Butkevych, M.; Lode, D.; Frohme, M.; Schmailzl, K.J.G.; Nechyporenko, A. Machine Learning Methods in Predicting Patients with Suspected Myocardial Infarction Based on Short-Time HRV Data. Sensors 2022, 22, 7033.
- Chumachenko, D.; Piletskiy, P.; Sukhorukova, M.; Chumachenko, T. Predictive Model of Lyme Disease Epidemic Process Using Machine Learning Approach. Appl. Sci. 2022, 12, 4282.
- Shaikhina, T.; Lowe, D.; Daga, S.; Briggs, D.; Higgins, R.; Khovanova, N. Machine Learning for Predictive Modelling Based on Small Data in Biomedical Engineering. IFAC-PapersOnLine 2015, 48, 469–474.
- Specht, D.F. Probabilistic Neural Networks. Neural Netw. 1990, 3, 109–118.
- Zub, K.; Zhezhnych, P.; Strauss, C. Two-Stage PNN–SVM Ensemble for Higher Education Admission Prediction. BDCC 2023, 7, 83.
- Izonin, I.; Tkachenko, R.; Ryvak, L.; Zub, K.; Rashkevych, M.; Pavliuk, O. Addressing Medical Diagnostics Issues: Essential Aspects of the PNN-Based Approach. In Proceedings of the 3rd International Conference on Informatics & Data-Driven Medicine, CEUR-WS.org 2753, Växjö, Sweden, 19–21 November 2020; pp. 209–218.
- Izonin, I.; Tkachenko, R.; Havryliuk, M.; Gregus, M.; Yendyk, P.; Tolstyak, Y. An Adaptation of the Input Doubling Method for Solving Classification Tasks in Case of Small Data Processing. Procedia Comput. Sci. 2024, 241, 171–178.
- Snow, D. DeltaPy: A Framework for Tabular Data Augmentation in Python; Social Science Research Network: Rochester, NY, USA, 2020.
- Xu, L.; Skoularidou, M.; Cuesta-Infante, A.; Veeramachaneni, K. Modeling Tabular Data Using Conditional GAN. arXiv 2019, arXiv:1907.00503.
- Deep Learning for Tabular Data Augmentation. Available online: https://lschmiddey.github.io/fastpages_/2021/04/10/DeepLearning_TabularDataAugmentation.html (accessed on 16 May 2021).
- Izonin, I.; Tkachenko, R.; Pidkostelnyi, R.; Pavliuk, O.; Khavalko, V.; Batyuk, A. Experimental Evaluation of the Effectiveness of ANN-Based Numerical Data Augmentation Methods for Diagnostics Tasks. In Proceedings of the 4th International Conference on Informatics & Data-Driven Medicine, Valencia, Spain, 19 November 2021; Volume 3038, pp. 223–232.
- Nanni, L.; Brahnam, S.; Loreggia, A.; Barcellona, L. Heterogeneous Ensemble for Medical Data Classification. Analytics 2023, 2, 676–693.
- Subbotin, S. Radial-Basis Function Neural Network Synthesis on the Basis of Decision Tree. Opt. Mem. Neural Netw. 2020, 29, 7–18.
- Rokach, L. Taxonomy for Characterizing Ensemble Methods in Classification Tasks: A Review and Annotated Bibliography. Comput. Stat. Data Anal. 2009, 53, 4046–4072.
- Yaman, M.A.; Rattay, F.; Subasi, A. Comparison of Bagging and Boosting Ensemble Machine Learning Methods for Face Recognition. Procedia Comput. Sci. 2021, 194, 202–209.
- Lee, S.-J.; Tseng, C.-H.; Yang, H.-Y.; Jin, X.; Jiang, Q.; Pu, B.; Hu, W.-H.; Liu, D.-R.; Huang, Y.; Zhao, N. Random RotBoost: An Ensemble Classification Method Based on Rotation Forest and AdaBoost in Random Subsets and Its Application to Clinical Decision Support. Entropy 2022, 24, 617.
- Bagnall, A.; Flynn, M.; Large, J.; Line, J.; Bostrom, A.; Cawley, G. Is Rotation Forest the Best Classifier for Problems with Continuous Features? arXiv 2018, arXiv:1809.06705.
- Bodyanskiy, Y.; Zaychenko, Y.; Pliss, I.; Chala, O. Matrix Neural Network with Kernel Activation Function and Its Online Combined Learning. In Proceedings of the 2022 IEEE 3rd International Conference on System Analysis & Intelligent Computing (SAIC), Kyiv, Ukraine, 4 October 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–4.
- Kandel, I.; Castelli, M.; Popovič, A. Comparing Stacking Ensemble Techniques to Improve Musculoskeletal Fracture Image Classification. J. Imaging 2021, 7, 100.
- Swaroop, K.; Cheruku, R.; Edla, D.R. Cascading of RBFN, PNN and SVM for Improved Type-2 Diabetes Prediction Accuracy. Aust. J. Wirel. Technol. Mobil. Secur. 2019, 1, 4.
- Paul, S. Ensemble Learning—Bagging, Boosting, Stacking and Cascading Classifiers in Machine Learning. Available online: https://medium.com/@saugata.paul1010/ensemble-learning-bagging-boosting-stacking-and-cascading-classifiers-in-machine-learning-9c66cb271674 (accessed on 2 October 2022).
- Fernández-Alemán, J.L.; Carrillo-de-Gea, J.M.; Hosni, M.; Idri, A.; García-Mateos, G. Homogeneous and Heterogeneous Ensemble Classification Methods in Diabetes Disease: A Review. In Proceedings of the 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Berlin, Germany, 23–27 July 2019; pp. 3956–3959.
- Izonin, I.; Tkachenko, R.; Yemets, K.; Gregus, M.; Tomashy, Y.; Pliss, I. An Approach Towards Reducing Training Time of the Input Doubling Method via Clustering for Middle-Sized Data Analysis. Procedia Comput. Sci. 2024, 241, 32–39.
- Bodyanskiy, Y.V.; Tyshchenko, O.K. A Hybrid Cascade Neural Network with Ensembles of Extended Neo-Fuzzy Neurons and Its Deep Learning. In Proceedings of the Information Technology, Systems Research, and Computational Physics, Cracow, Poland, 2–5 July 2018; Springer: Cham, Switzerland, 2018; pp. 164–174.
- García-Pedrajas, N.; Ortiz-Boyer, D.; del Castillo-Gomariz, R.; Hervás-Martínez, C. Cascade Ensembles. In Proceedings of the Computational Intelligence and Bioinspired Systems, Barcelona, Spain, 8–10 June 2005; Springer: Berlin/Heidelberg, Germany, 2005; pp. 598–603.
- Izonin, I.; Kazantzi, A.K.; Tkachenko, R.; Mitoulis, S.-A. GRNN-Based Cascade Ensemble Model for Non-Destructive Damage State Identification: Small Data Approach. Eng. J. 2024; under review.
- Samuelson, F.; Brown, D.G. Application of Cover’s Theorem to the Evaluation of the Performance of CI Observers. In Proceedings of The 2011 International Joint Conference on Neural Networks, San Jose, CA, USA, 31 July–5 August 2011; IEEE: San Jose, CA, USA, 2011; pp. 1020–1026.
- Bone Marrow Transplant: Children—UCI Machine Learning Repository. Available online: https://archive.ics.uci.edu/dataset/565/bone+marrow+transplant+children (accessed on 25 November 2023).
- Gudyś, A.; Sikora, M.; Wróbel, Ł. RuleKit: A Comprehensive Suite for Rule-Based Learning. Knowl. Based Syst. 2020, 194, 105480.
- Sikora, M.; Wróbel, Ł.; Gudyś, A. GuideR: A Guided Separate-and-Conquer Rule Learning in Classification, Regression, and Survival Settings. Knowl. Based Syst. 2019, 173, 1–14.
- Wróbel, Ł.; Gudyś, A.; Sikora, M. Learning Rule Sets from Survival Data. BMC Bioinform. 2017, 18, 285.
- Kałwak, K.; Porwolik, J.; Mielcarek, M.; Gorczyńska, E.; Owoc-Lempach, J.; Ussowicz, M.; Dyla, A.; Musiał, J.; Paździor, D.; Turkiewicz, D.; et al. Higher CD34+ and CD3+ Cell Doses in the Graft Promote Long-Term Survival, and Have No Impact on the Incidence of Severe Acute or Chronic Graft-versus-Host Disease after In Vivo T Cell-Depleted Unrelated Donor Hematopoietic Stem Cell Transplantation in Children. Biol. Blood Marrow Transplant. 2010, 16, 1388–1401.
- Manna, S. Small Sample Estimation of Classification Metrics. In Proceedings of the 2022 Interdisciplinary Research in Technology and Management (IRTM), Kolkata, India, 24–26 February 2022; IEEE: Kolkata, India, 2022; pp. 1–3.
- Berezsky, O.; Pitsun, O.; Liashchynskyi, P.; Derysh, B.; Batryn, N. Computational Intelligence in Medicine. In Lecture Notes in Data Engineering, Computational Intelligence, and Decision Making; Babichev, S., Lytvynenko, V., Eds.; Lecture Notes on Data Engineering and Communications Technologies; Springer International Publishing: Cham, Switzerland, 2023; Volume 149, pp. 488–510. ISBN 978-3-031-16202-2.
- Ryczkowski, A.; Piotrowski, T.; Staszczak, M.; Wiktorowicz, M.; Adrich, P. Optimization of the Regularization Parameter in the Dual Annealing Method Used for the Reconstruction of Energy Spectrum of Electron Beam Generated by the AQURE Mobile Accelerator. Z. Med. Phys. 2023, 34, 510–520.
- Chadaga, K.; Prabhu, S.; Sampathila, N.; Chadaga, R. A Machine Learning and Explainable Artificial Intelligence Approach for Predicting the Efficacy of Hematopoietic Stem Cell Transplant in Pediatric Patients. Healthc. Anal. 2023, 3, 100170.
- Gross, M.-P.; Taormina, R.; Cominola, A. A Machine Learning-Based Framework and Open-Source Software for Non Intrusive Water Monitoring. Environ. Model. Softw. 2025, 183, 106247.
- Yu, T.-C.; Yang, C.-K.; Hsu, W.-H.; Hsu, C.-A.; Wang, H.-C.; Hsiao, H.-J.; Chao, H.-L.; Hsieh, H.-P.; Wu, J.-R.; Tsai, Y.-C.; et al. A Machine-Learning-Based Algorithm for Bone Marrow Cell Differential Counting. Int. J. Med. Inform. 2025, 194, 105692.
- Buturovic, L.; Shelton, J.; Spellman, S.R.; Wang, T.; Friedman, L.; Loftus, D.; Hesterberg, L.; Woodring, T.; Fleischhauer, K.; Hsu, K.C.; et al. Evaluation of a Machine Learning-Based Prognostic Model for Unrelated Hematopoietic Cell Transplantation Donor Selection. Biol. Blood Marrow Transplant. 2018, 24, 1299–1306.
- Shaikhina, T.; Khovanova, N.A. Handling Limited Datasets with Neural Networks in Medical Applications: A Small-Data Approach. Artif. Intell. Med. 2017, 75, 51–63.
| Method | Parameters |
|---|---|
| Probabilistic Neural Network | distance type; smooth factor (sigma) |
| RandomForest | n_estimators; max_depth |
| XGBoost | n_estimators; learning_rate; max_depth |
| HistGradientBoosting | max_iter; learning_rate; max_depth |
| Algorithm / Metric | Accuracy | Precision | Recall | F1-Score | Matthews Correlation Coefficient | Cohen’s Kappa | Training Time, Seconds * |
|---|---|---|---|---|---|---|---|
| Algorithm 1 (PNN with RandomForest) | 0.97 ± 0.021 | 1.00 ± 0.03 | 0.95 ± 0.019 | 0.98 ± 0.025 | 0.95 ± 0.022 | 0.95 ± 0.020 | 883.45 |
| Algorithm 2 (PNN with XGBoost) | 0.95 ± 0.029 | 1.00 ± 0.025 | 0.90 ± 0.032 | 0.95 ± 0.029 | 0.90 ± 0.026 | 0.89 ± 0.029 | 516.8 |
| Algorithm 3 (PNN with HistGradientBoosting) | 0.95 ± 0.022 | 1.00 ± 0.02 | 0.90 ± 0.018 | 0.95 ± 0.019 | 0.90 ± 0.019 | 0.89 ± 0.019 | 38,462.5 |
| Algorithm 4 (PNN-1 with PNN-2) | 0.89 ± 0.034 | 0.90 ± 0.036 | 0.90 ± 0.027 | 0.90 ± 0.032 | 0.79 ± 0.029 | 0.79 ± 0.030 | 99,100.49 |
| Algorithm / Metric | Accuracy | Precision | Recall | F1-Score | Matthews Correlation Coefficient | Cohen’s Kappa | Training Time, Seconds |
|---|---|---|---|---|---|---|---|
| Algorithm 1 (PNN with RandomForest) | 0.97 | 1.00 | 0.95 | 0.98 | 0.95 | 0.95 | 883.45 |
| Algorithm 2 (PNN with XGBoost) | 0.95 | 1.00 | 0.90 | 0.95 | 0.90 | 0.89 | 516.8 |
| Algorithm 3 (PNN with HistGradientBoosting) | 0.95 | 1.00 | 0.90 | 0.95 | 0.90 | 0.89 | 38,462.5 |
| Algorithm 4 (PNN-1 with PNN-2) | 0.89 | 0.90 | 0.90 | 0.90 | 0.79 | 0.79 | 99,100.49 |
| MLP | 0.92 | 1.00 | 0.86 | 0.92 | 0.85 | 0.84 | 1260.97 |
| HistGradientBoosting | 0.92 | 1.00 | 0.86 | 0.92 | 0.85 | 0.84 | 43.39 |
| RandomForest | 0.92 | 1.00 | 0.86 | 0.92 | 0.85 | 0.84 | 28.51 |
| SVM | 0.89 | 0.95 | 0.86 | 0.90 | 0.79 | 0.79 | 5.565 |
| LightGBM | 0.89 | 1.00 | 0.81 | 0.89 | 0.81 | 0.79 | 41.69 |
| XGBoost | 0.89 | 1.00 | 0.81 | 0.89 | 0.81 | 0.79 | 41.42 |
| Input-doubling method (PNN) | 0.82 | 0.95 | 0.77 | 0.85 | 0.64 | 0.62 | 44,522.88 |
| Classical PNN | 0.71 | 0.81 | 0.62 | 0.70 | 0.45 | 0.43 | 6.336 |
| KNN | 0.66 | 0.90 | 0.43 | 0.58 | 0.42 | 0.35 | 1.263 |