Integrating Data Mining Techniques for Naïve Bayes Classification: Applications to Medical Datasets
Abstract
1. Introduction
2. Materials and Methods
2.1. Basic Concepts
2.1.1. Classification Tree
2.1.2. Association Rules Analysis (ASA)
2.2. Proposed Method: Naïve Bayes Classifier Framework
3. Illustrated Examples
3.1. Thyroid Dataset
3.2. Diabetes Dataset
3.3. Appendicitis Dataset
4. Performance Comparison via Medical Datasets
5. Discussion and Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Variable | N | Mean | Standard Deviation | Minimum | Median | Maximum |
---|---|---|---|---|---|---|
T3 RESIN | 215 | 109.60 | 13.15 | 65.00 | 110.00 | 144.00 |
THYROXIN | 215 | 9.81 | 4.70 | 0.50 | 9.20 | 25.30 |
THYRONINE | 215 | 2.05 | 1.42 | 0.20 | 1.70 | 10.00 |
THYROID | 215 | 2.88 | 6.12 | 0.10 | 1.30 | 56.40 |
TSH_VALUE | 215 | 4.20 | 8.07 | -0.70 | 2.00 | 56.30 |
Original Variable | Discretized Variable | Detail |
---|---|---|
T3 resin | X1 | X1 = 1 if T3 resin < 99.5; X1 = 2 if 99.5 ≤ T3 resin < 117.5; X1 = 3 if T3 resin ≥ 117.5 |
Thyroxine | X2 | X2 = 1 if thyroxine < 5.65; X2 = 2 if 5.65 ≤ thyroxine < 12.65; X2 = 3 if thyroxine ≥ 12.65 |
Thyronine | X3 | X3 = 1 if thyronine < 1.15; X3 = 2 if 1.15 ≤ thyronine < 2.65; X3 = 3 if thyronine ≥ 2.65 |
Thyroid | X4 | X4 = 1 if thyroid < 0.75; X4 = 2 if 0.75 ≤ thyroid < 1.05; X4 = 3 if 1.05 ≤ thyroid < 1.15; X4 = 4 if 1.15 ≤ thyroid < 1.45; X4 = 5 if 1.45 ≤ thyroid < 1.65; X4 = 6 if 1.65 ≤ thyroid < 1.75; X4 = 7 if 1.75 ≤ thyroid < 1.85; X4 = 8 if 1.85 ≤ thyroid < 4; X4 = 9 if thyroid ≥ 4 |
TSH-value | X5 | X5 = 1 if TSH-value < 0.65; X5 = 2 if 0.65 ≤ TSH-value < 4.45; X5 = 3 if TSH-value ≥ 4.45 |
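
The cut points in the table above define half-open intervals (lower bound inclusive, upper bound exclusive), so a binary search over the sorted thresholds is enough to map a raw value to its level. The following is a minimal sketch of that mapping, not the authors' implementation; the names `CUTS` and `discretize` are hypothetical, and the example record is assembled from the medians in the thyroid descriptive table rather than taken from an actual patient.

```python
from bisect import bisect_right

# Thresholds copied from the thyroid discretization table.
# The dictionary name and layout are illustrative assumptions.
CUTS = {
    "X1": [99.5, 117.5],                                     # T3 resin
    "X2": [5.65, 12.65],                                     # thyroxine
    "X3": [1.15, 2.65],                                      # thyronine
    "X4": [0.75, 1.05, 1.15, 1.45, 1.65, 1.75, 1.85, 4.0],   # thyroid
    "X5": [0.65, 4.45],                                      # TSH-value
}


def discretize(value, cuts):
    """Return the 1-based level of `value` for sorted thresholds `cuts`.

    bisect_right counts the thresholds that are <= value, so a value equal
    to a threshold falls into the higher level, matching "a <= x < b".
    """
    return bisect_right(cuts, value) + 1


# Hypothetical record built from the table medians.
raw = {"X1": 110.00, "X2": 9.20, "X3": 1.70, "X4": 1.30, "X5": 2.00}
levels = {name: discretize(raw[name], CUTS[name]) for name in CUTS}
print(levels)  # {'X1': 2, 'X2': 2, 'X3': 2, 'X4': 4, 'X5': 2}
```

Using `bisect_right` rather than `bisect_left` is what keeps boundary values such as 99.5 in the upper interval.
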
No. | Rules | Generated Interactions |
---|---|---|
1 | If X5 = 2 and X2 = 2, then Y = 1 | X5(2)X2(2) = 1 if X5 = 2 and X2 = 2; 0 otherwise |
2 | If X5 = 2 and X3 = 2, then Y = 1 | X5(2)X3(2) = 1 if X5 = 2 and X3 = 2; 0 otherwise |
3 | If X4 = 2 and X2 = 2, then Y = 1 | X4(2)X2(2) = 1 if X4 = 2 and X2 = 2; 0 otherwise |
4 | If X5 = 2 and X4 = 2, then Y = 1 | X5(2)X4(2) = 1 if X5 = 2 and X4 = 2; 0 otherwise |
5 | If X4 = 4 and X2 = 2, then Y = 1 | X4(4)X2(2) = 1 if X4 = 4 and X2 = 2; 0 otherwise |
6 | If X3 = 3 and X2 = 3, then Y = 2 | X3(3)X2(3) = 1 if X3 = 3 and X2 = 3; 0 otherwise |
7 | If X4 = 9, then Y = 3 | X4(9) = 1 if X4 = 9; 0 otherwise |
8 | If X5 = 3 and X2 = 1, then Y = 3 | X5(3)X2(1) = 1 if X5 = 3 and X2 = 1; 0 otherwise |
9 | If X2 = 3 and X1 = 1, then Y = 2 | X2(3)X1(1) = 1 if X2 = 3 and X1 = 1; 0 otherwise |
10 | If X4 = 5 and X3 = 2, then Y = 1 | X4(5)X3(2) = 1 if X4 = 5 and X3 = 2; 0 otherwise |
11 | If X3 = 3 and X1 = 1, then Y = 2 | X3(3)X1(1) = 1 if X3 = 3 and X1 = 1; 0 otherwise |
12 | If X4 = 8 and X3 = 2, then Y = 1 | X4(8)X3(2) = 1 if X4 = 8 and X3 = 2; 0 otherwise |
13 | If X3 = 1 and X2 = 1, then Y = 3 | X3(1)X2(1) = 1 if X3 = 1 and X2 = 1; 0 otherwise |
14 | If X4 = 8 and X1 = 2, then Y = 1 | X4(8)X1(2) = 1 if X4 = 8 and X1 = 2; 0 otherwise |
15 | If X4 = 1 and X2 = 3, then Y = 2 | X4(1)X2(3) = 1 if X4 = 1 and X2 = 3; 0 otherwise |
16 | If X4 = 1 and X1 = 1, then Y = 2 | X4(1)X1(1) = 1 if X4 = 1 and X1 = 1; 0 otherwise |
17 | If X4 = 3 and X3 = 3, then Y = 2 | X4(3)X3(3) = 1 if X4 = 3 and X3 = 3; 0 otherwise |
18 | If X3 = 2, X2 = 2, and X1 = 2, then Y = 1 | X3(2)X2(2)X1(2) = 1 if X3 = 2, X2 = 2, and X1 = 2; 0 otherwise |
19 | If X5 = 2 and X1 = 2, then Y = 1 | X5(2)X1(2) = 1 if X5 = 2 and X1 = 2; 0 otherwise |
20 | If X2 = 2 and X1 = 2, then Y = 1 | X2(2)X1(2) = 1 if X2 = 2 and X1 = 2; 0 otherwise |
21 | If X5 = 1 and X2 = 3, then Y = 2 | X5(1)X2(3) = 1 if X5 = 1 and X2 = 3; 0 otherwise |
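
Each generated interaction in the table above is a binary indicator that equals 1 only when every condition in the rule's antecedent holds. As a hedged sketch of how such indicators could be computed from a discretized record, the snippet below encodes three of the listed rules (1, 7, and 18); the names `RULES` and `interaction_features` and the sample record are illustrative assumptions, not part of the source.

```python
# Antecedents of rules 1, 7, and 18 from the thyroid rule table,
# keyed by the interaction name used in that table.
RULES = {
    "X5(2)X2(2)": {"X5": 2, "X2": 2},                  # rule 1
    "X4(9)": {"X4": 9},                                # rule 7
    "X3(2)X2(2)X1(2)": {"X3": 2, "X2": 2, "X1": 2},    # rule 18
}


def interaction_features(levels, rules=RULES):
    """Map a discretized record to 0/1 indicators, one per rule.

    An indicator is 1 only if every variable named in the antecedent
    takes the required level; otherwise it is 0.
    """
    return {
        name: int(all(levels[var] == lvl for var, lvl in cond.items()))
        for name, cond in rules.items()
    }


# Hypothetical discretized record.
record = {"X1": 2, "X2": 2, "X3": 2, "X4": 2, "X5": 2}
print(interaction_features(record))
# {'X5(2)X2(2)': 1, 'X4(9)': 0, 'X3(2)X2(2)X1(2)': 1}
```
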
Variable | N | Mean | Standard Deviation | Minimum | Median | Maximum |
---|---|---|---|---|---|---|
Pregnancies | 768 | 3.85 | 3.37 | 0.00 | 3.00 | 17.00 |
Glucose | 768 | 120.89 | 31.97 | 0.00 | 117.00 | 199.00 |
Blood pressure | 768 | 69.11 | 19.36 | 0.00 | 72.00 | 122.00 |
Skin thickness | 768 | 20.54 | 15.95 | 0.00 | 23.00 | 99.00 |
Insulin | 768 | 79.80 | 115.24 | 0.00 | 30.50 | 846.00 |
BMI | 768 | 31.99 | 7.88 | 0.00 | 32.00 | 67.10 |
Diabetes pedigree function | 768 | 0.47 | 0.33 | 0.08 | 0.37 | 2.42 |
Age | 768 | 33.24 | 11.76 | 21.00 | 29.00 | 81.00 |
Original Variable | Discretized Variable | Detail |
---|---|---|
Pregnancies | X1 | X1 = 1 if pregnancies < 6.5; X1 = 2 if pregnancies ≥ 6.5 |
Glucose | X2 | X2 = 1 if glucose < 99.5; X2 = 2 if 99.5 ≤ glucose < 111.5; X2 = 3 if 111.5 ≤ glucose < 114.5; X2 = 4 if 114.5 ≤ glucose < 115.5; X2 = 5 if 115.5 ≤ glucose < 123.5; X2 = 6 if 123.5 ≤ glucose < 125.5; X2 = 7 if 125.5 ≤ glucose < 126.5; X2 = 8 if 126.5 ≤ glucose < 127.5; X2 = 9 if 127.5 ≤ glucose < 152.5; X2 = 10 if 152.5 ≤ glucose < 154.5; X2 = 11 if glucose ≥ 154.5 |
Blood pressure | X3 | X3 = 1 if blood pressure < 42; X3 = 2 if 42 ≤ blood pressure < 69; X3 = 3 if 69 ≤ blood pressure < 71; X3 = 4 if 71 ≤ blood pressure < 73; X3 = 5 if 73 ≤ blood pressure < 74.5; X3 = 6 if 74.5 ≤ blood pressure < 75.5; X3 = 7 if 75.5 ≤ blood pressure < 79; X3 = 8 if 79 ≤ blood pressure < 81; X3 = 9 if blood pressure ≥ 81 |
Skin thickness | X4 | X4 = 1 if skin thickness < 7.5; X4 = 2 if 7.5 ≤ skin thickness < 31.5; X4 = 3 if skin thickness ≥ 31.5 |
Insulin | X5 | X5 = 1 if insulin < 14.5; X5 = 2 if 14.5 ≤ insulin < 87.5; X5 = 3 if 87.5 ≤ insulin < 91.5; X5 = 4 if 91.5 ≤ insulin < 95.5; X5 = 5 if 95.5 ≤ insulin < 99.5; X5 = 6 if 99.5 ≤ insulin < 121; X5 = 7 if insulin ≥ 121 |
BMI | X6 | X6 = 1 if BMI < 27.85; X6 = 2 if 27.85 ≤ BMI < 29.85; X6 = 3 if 29.85 ≤ BMI < 40.05; X6 = 4 if 40.05 ≤ BMI < 40.85; X6 = 5 if BMI ≥ 40.85 |
Diabetes pedigree function | X7 | X7 = 1 if diabetes pedigree function < 0.21; X7 = 2 if 0.21 ≤ diabetes pedigree function < 0.28; X7 = 3 if 0.28 ≤ diabetes pedigree function < 0.32; X7 = 4 if 0.32 ≤ diabetes pedigree function < 0.38; X7 = 5 if 0.38 ≤ diabetes pedigree function < 0.52; X7 = 6 if 0.52 ≤ diabetes pedigree function < 0.53; X7 = 7 if diabetes pedigree function ≥ 0.53 |
Age | X8 | X8 = 1 if age < 28.5; X8 = 2 if 28.5 ≤ age < 62.5; X8 = 3 if age ≥ 62.5 |
No. | Rules | Generated Interactions |
---|---|---|
1 | If X6 = 1, X2 = 1, and X1 = 1, then Y = 0 | X6(1)X2(1)X1(1) = 1 if X6 = 1, X2 = 1, and X1 = 1; 0 otherwise |
2 | If X4 = 2, X3 = 2, and X2 = 1, then Y = 0 | X4(2)X3(2)X2(1) = 1 if X4 = 2, X3 = 2, and X2 = 1; 0 otherwise |
3 | If X6 = 1, X4 = 2, and X2 = 1, then Y = 0 | X6(1)X4(2)X2(1) = 1 if X6 = 1, X4 = 2, and X2 = 1; 0 otherwise |
4 | If X5 = 2, X3 = 2, and X2 = 1, then Y = 0 | X5(2)X3(2)X2(1) = 1 if X5 = 2, X3 = 2, and X2 = 1; 0 otherwise |
5 | If X8 = 1, X7 = 1, and X6 = 1, then Y = 0 | X8(1)X7(1)X6(1) = 1 if X8 = 1, X7 = 1, and X6 = 1; 0 otherwise |
6 | If X8 = 1, X5 = 1, and X2 = 2, then Y = 0 | X8(1)X5(1)X2(2) = 1 if X8 = 1, X5 = 1, and X2 = 2; 0 otherwise |
7 | If X8 = 2, X7 = 7, X2 = 11, and X1 = 2, then Y = 1 | X8(2)X7(7)X2(11)X1(2) = 1 if X8 = 2, X7 = 7, X2 = 11, and X1 = 2; 0 otherwise |
8 | If X5 = 7, X2 = 11, and X1 = 2, then Y = 1 | X5(7)X2(11)X1(2) = 1 if X5 = 7, X2 = 11, and X1 = 2; 0 otherwise |
9 | If X6 = 3, X5 = 1, X2 = 11, and X1 = 1, then Y = 1 | X6(3)X5(1)X2(11)X1(1) = 1 if X6 = 3, X5 = 1, X2 = 11, and X1 = 1; 0 otherwise |
10 | If X6 = 1 and X2 = 1, then Y = 0 | X6(1)X2(1) = 1 if X6 = 1 and X2 = 1; 0 otherwise |
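
The column label CT+ASA+NB in the performance table suggests that discretized variables and generated interactions are both supplied to a Naïve Bayes classifier. The snippet below is a minimal sketch of a categorical Naïve Bayes with Laplace smoothing under that assumption; it is not the authors' framework. The functions `fit_nb` and `predict_nb`, the smoothing constant `alpha`, and the six training rows are all hypothetical. Only the interaction X6(1)X2(1) (rule 10 in the table above) comes from the source.

```python
import math
from collections import Counter, defaultdict


def add_rule10(row):
    """Append the X6(1)X2(1) indicator from rule 10 to a discretized row."""
    out = dict(row)
    out["X6(1)X2(1)"] = int(row["X6"] == 1 and row["X2"] == 1)
    return out


def fit_nb(rows, labels, alpha=1.0):
    """Count class and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (feature, class) -> value -> count
    levels = defaultdict(set)             # feature -> observed values
    for row, y in zip(rows, labels):
        for feat, val in row.items():
            value_counts[(feat, y)][val] += 1
            levels[feat].add(val)
    return class_counts, value_counts, levels, alpha


def predict_nb(model, row):
    """Return the class with the highest smoothed log-posterior."""
    class_counts, value_counts, levels, alpha = model
    n = sum(class_counts.values())
    best, best_score = None, -math.inf
    for y, cy in class_counts.items():
        score = math.log(cy / n)
        for feat, val in row.items():
            num = value_counts[(feat, y)][val] + alpha
            den = cy + alpha * len(levels[feat])
            score += math.log(num / den)
        if score > best_score:
            best, best_score = y, score
    return best


# Made-up toy data for illustration only (not taken from the diabetes dataset).
train = [{"X2": 1, "X6": 1}, {"X2": 1, "X6": 1}, {"X2": 1, "X6": 3},
         {"X2": 11, "X6": 3}, {"X2": 11, "X6": 1}, {"X2": 9, "X6": 3}]
y = [0, 0, 0, 1, 1, 1]
model = fit_nb([add_rule10(r) for r in train], y)
print(predict_nb(model, add_rule10({"X2": 1, "X6": 1})))    # 0
print(predict_nb(model, add_rule10({"X2": 11, "X6": 3})))   # 1
```

Treating the interaction as one more categorical feature leaves the Naïve Bayes machinery unchanged, which is why the sketch needs no special handling for it.
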
Variable | N | Mean | Standard Deviation | Minimum | Median | Maximum |
---|---|---|---|---|---|---|
WBC1 | 106 | 0.40 | 0.19 | 0.00 | 0.41 | 1.00 |
MNEP | 106 | 0.68 | 0.21 | 0.00 | 0.75 | 1.00 |
MNEA | 106 | 0.42 | 0.21 | 0.00 | 0.44 | 1.00 |
MBAP | 106 | 0.21 | 0.20 | 0.00 | 0.15 | 1.00 |
MBAA | 106 | 0.17 | 0.18 | 0.00 | 0.11 | 1.00 |
HNEP | 106 | 0.68 | 0.22 | 0.00 | 0.74 | 1.00 |
HNEA | 106 | 0.38 | 0.20 | 0.00 | 0.40 | 1.00 |
Original Variable | Discretized Variable | Detail |
---|---|---|
WBC1 | X1 | X1 = 1 if WBC1 < 0.2155; X1 = 2 if 0.2155 ≤ WBC1 < 0.362; X1 = 3 if 0.362 ≤ WBC1 < 0.3845; X1 = 4 if 0.3845 ≤ WBC1 < 0.942; X1 = 5 if WBC1 ≥ 0.942 |
MNEP | X2 | X2 = 1 if MNEP < 0.42; X2 = 2 if 0.42 ≤ MNEP < 0.509; X2 = 3 if 0.509 ≤ MNEP < 0.5625; X2 = 4 if 0.5625 ≤ MNEP < 0.598; X2 = 5 if 0.598 ≤ MNEP < 0.616; X2 = 6 if 0.616 ≤ MNEP < 0.652; X2 = 7 if 0.652 ≤ MNEP < 0.741; X2 = 8 if 0.741 ≤ MNEP < 0.8125; X2 = 9 if MNEP ≥ 0.8125 |
MNEA | X3 | X3 = 1 if MNEA < 0.2315; X3 = 2 if MNEA ≥ 0.2315 |
MBAP | X4 | X4 = 1 if MBAP < 0.007; X4 = 2 if 0.007 ≤ MBAP < 0.021; X4 = 3 if 0.021 ≤ MBAP < 0.035; X4 = 4 if 0.035 ≤ MBAP < 0.049; X4 = 5 if 0.049 ≤ MBAP < 0.0625; X4 = 6 if 0.0625 ≤ MBAP < 0.104; X4 = 7 if 0.104 ≤ MBAP < 0.132; X4 = 8 if 0.132 ≤ MBAP < 0.16; X4 = 9 if 0.16 ≤ MBAP < 0.3125; X4 = 10 if 0.3125 ≤ MBAP < 0.34; X4 = 11 if 0.34 ≤ MBAP < 0.5695; X4 = 12 if 0.5695 ≤ MBAP < 0.59; X4 = 13 if MBAP ≥ 0.59 |
MBAA | X5 | X5 = 1 if MBAA < 0.0535; X5 = 2 if MBAA ≥ 0.0535 |
HNEP | X6 | X6 = 1 if HNEP < 0.509; X6 = 2 if 0.509 ≤ HNEP < 0.6685; X6 = 3 if 0.6685 ≤ HNEP < 0.757; X6 = 4 if HNEP ≥ 0.757 |
HNEA | X7 | X7 = 1 if HNEA < 0.1475; X7 = 2 if 0.1475 ≤ HNEA < 0.215; X7 = 3 if 0.215 ≤ HNEA < 0.2435; X7 = 4 if 0.2435 ≤ HNEA < 0.343; X7 = 5 if 0.343 ≤ HNEA < 0.365; X7 = 6 if 0.365 ≤ HNEA < 0.432; X7 = 7 if 0.432 ≤ HNEA < 0.4365; X7 = 8 if 0.4365 ≤ HNEA < 0.9185; X7 = 9 if HNEA ≥ 0.9185 |
No. | Rules | Generated Interactions |
---|---|---|
1 | If X6 = 4 and X5 = 2, then Y = 0 | X6(4)X5(2) = 1 if X6 = 4 and X5 = 2; 0 otherwise |
2 | If X6 = 2 and X3 = 2, then Y = 0 | X6(2)X3(2) = 1 if X6 = 2 and X3 = 2; 0 otherwise |
3 | If X4 = 5 and X1 = 1, then Y = 1 | X4(5)X1(1) = 1 if X4 = 5 and X1 = 1; 0 otherwise |
4 | If X7 = 3 and X5 = 2, then Y = 1 | X7(3)X5(2) = 1 if X7 = 3 and X5 = 2; 0 otherwise |
5 | If X7 = 3 and X6 = 3, then Y = 1 | X7(3)X6(3) = 1 if X7 = 3 and X6 = 3; 0 otherwise |
6 | If X7 = 7 and X2 = 7, then Y = 1 | X7(7)X2(7) = 1 if X7 = 7 and X2 = 7; 0 otherwise |
7 | If X4 = 3 and X2 = 8, then Y = 1 | X4(3)X2(8) = 1 if X4 = 3 and X2 = 8; 0 otherwise |
8 | If X5 = 1 and X1 = 1, then Y = 1 | X5(1)X1(1) = 1 if X5 = 1 and X1 = 1; 0 otherwise |
9 | If X6 = 1 and X1 = 1, then Y = 1 | X6(1)X1(1) = 1 if X6 = 1 and X1 = 1; 0 otherwise |
10 | If X7 = 1 and X1 = 1, then Y = 1 | X7(1)X1(1) = 1 if X7 = 1 and X1 = 1; 0 otherwise |
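
Rules of the form "if antecedent, then Y = class" are conventionally ranked in association rules analysis by support (the share of all records that satisfy both the antecedent and the class) and confidence (the share of antecedent-matching records that have the class). The sketch below computes both for rule 1 of the table above. The function name `support_confidence` and the five records are made up for illustration, and the thresholds the authors actually applied are not given in this section.

```python
def support_confidence(records, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> Y = consequent.

    records    : list of dicts holding discretized levels plus the class "Y"
    antecedent : dict of variable -> required level
    consequent : class label predicted by the rule
    """
    matches = [r for r in records
               if all(r[var] == lvl for var, lvl in antecedent.items())]
    hits = [r for r in matches if r["Y"] == consequent]
    support = len(hits) / len(records)
    confidence = len(hits) / len(matches) if matches else 0.0
    return support, confidence


# Hypothetical discretized records (not from the appendicitis dataset).
records = [
    {"X5": 2, "X6": 4, "Y": 0},
    {"X5": 2, "X6": 4, "Y": 0},
    {"X5": 2, "X6": 4, "Y": 1},
    {"X5": 1, "X6": 4, "Y": 0},
    {"X5": 2, "X6": 2, "Y": 0},
]

# Rule 1: if X6 = 4 and X5 = 2, then Y = 0.
sup, conf = support_confidence(records, {"X6": 4, "X5": 2}, 0)
print(sup, round(conf, 3))  # 0.4 0.667
```
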
Medical Dataset | Random Forest: Ntree | Random Forest: Accuracy (%) | SVM: Kernel | SVM: Accuracy (%) | kNN: Accuracy (%) | Classification Tree: Accuracy (%) | CT+ASA+NB: Accuracy (%) |
---|---|---|---|---|---|---|---|
Thyroid | 100 | 96.74 | sigmoid | 93.95 | 96.28 | 93.95 | 99.53 |
Thyroid | 200 | 97.21 | linear | 96.27 | | | |
Thyroid | 500 | 96.28 | poly | 91.16 | | | |
Thyroid | 1000 | 96.28 | radial | 95.81 | | | |
Diabetes | 100 | 76.30 | sigmoid | 69.66 | 74.87 | 78.12 | 81.25 |
Diabetes | 200 | 77.08 | linear | 77.08 | | | |
Diabetes | 500 | 76.95 | poly | 74.74 | | | |
Diabetes | 1000 | 76.69 | radial | 75.78 | | | |
Appendicitis | 100 | 87.74 | sigmoid | 78.30 | 87.74 | 84.91 | 95.28 |
Appendicitis | 200 | 87.74 | linear | 87.74 | | | |
Appendicitis | 500 | 86.79 | poly | 86.79 | | | |
Appendicitis | 1000 | 86.79 | radial | 86.79 | | | |
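
The accuracy figures can be read as counts of correctly classified records. The short check below assumes that each accuracy is computed over all N records listed in the corresponding descriptive table; the evaluation protocol is not stated in this section, so that is an assumption, and the helper name `implied_correct` is hypothetical. Under that assumption, the CT+ASA+NB accuracies correspond to whole numbers of correct predictions.

```python
# (dataset, N from the descriptive tables, CT+ASA+NB accuracy in %)
REPORTED = [
    ("Thyroid", 215, 99.53),
    ("Diabetes", 768, 81.25),
    ("Appendicitis", 106, 95.28),
]


def implied_correct(n, accuracy_pct):
    """Nearest whole number of correct predictions for a given accuracy."""
    return round(n * accuracy_pct / 100)


for name, n, acc in REPORTED:
    k = implied_correct(n, acc)
    # Recompute the percentage from the count to confirm it matches.
    print(name, k, n - k, round(100 * k / n, 2))
# Thyroid 214 1 99.53
# Diabetes 624 144 81.25
# Appendicitis 101 5 95.28
```

Read this way, the reported figures would imply 1, 144, and 5 misclassified records respectively.
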
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Changpetch, P.; Pitpeng, A.; Hiriote, S.; Yuangyai, C. Integrating Data Mining Techniques for Naïve Bayes Classification: Applications to Medical Datasets. Computation 2021, 9, 99. https://doi.org/10.3390/computation9090099