The Combinations of Fuzzy Membership Functions on Discretization in the Decision Tree-ID3 to Predict Degenerative Disease Status
Abstract
1. Introduction
2. Materials and Methods
2.1. Research Dataset
2.1.1. Coronary Heart Disease (CHD) Dataset
2.1.2. Diabetes Mellitus Dataset (DMD)
2.2. Research Method
2.2.1. Data Exploration and Processing
2.2.2. Building the DTID3 Prediction Model Based on Discretization
Five-Fold Cross Validation
2.2.3. Prediction Evaluation Metric
3. Results and Discussion
3.1. Coronary Heart Disease Dataset
3.1.1. Dataset Exploration and Preprocessing
3.1.2. Discretization
3.1.3. Five-Fold Cross Validation
3.1.4. FDTID3 Modeling
| If CP is Typical Angina and Thal is Normal, then the decision is 0. | |
| If CP is Typical Angina and Thal is Permanent disability, then the decision is 0. | |
| If CP is Typical Angina, Thal is Temporary disability, and Cholesterol is High Limit, then the decision is 1. | |
| If CP is Typical Angina, Thal is Temporary disability, Cholesterol is High Limit, and Trestbps is Normal, then the decision is 1. | |
| If CP is Typical Angina, Thal is Temporary disability, Cholesterol is High Limit, Trestbps is Hypertension, FBS is False, and Oldpeak is No, then the decision is 1. | |
| If CP is Typical Angina, Thal is Temporary disability, Cholesterol is High Limit, Trestbps is Hypertension, FBS is False, and Oldpeak is Yes, then the decision is 1. | |
| If CP is Typical Angina, Thal is Temporary disability, Cholesterol is High Limit, Trestbps is Hypertension, and FBS is True, then the decision is 0. | |
| If CP is Atypical Angina, Thal is Normal, and Restecg is Normal, then the decision is 0. | |
| If CP is Atypical Angina, Thal is Normal, Restecg is Ventricular hypertrophy, and Ca is 0, then the decision is 0. | |
| If CP is Atypical Angina, Thal is Normal, Restecg is Ventricular hypertrophy, Ca is 1, and Age is Middle, then the decision is 1. | |
| If CP is Atypical Angina, Thal is Normal, Restecg is Ventricular hypertrophy, Ca is 1, and Age is Old, then the decision is 0. | |
| If CP is Atypical Angina, Thal is Normal, Restecg is Ventricular hypertrophy, and Ca is 1, then the decision is 1. | |
| If CP is Atypical Angina and Thal is Permanent disability, then the decision is 1. | |
| If CP is Atypical Angina, Thal is Temporary disability, Slope is Leaning up, and Trestbps is Hypertension, then the decision is 1. | |
| If CP is Atypical Angina, Thal is Temporary disability, Slope is Leaning up, and Trestbps is Normal, then the decision is 0. | |
| If CP is Atypical Angina, Thal is Temporary disability, Slope is Leaning up, Trestbps is Prehypertension, and Cholesterol is High, then the decision is 1. | |
| If CP is Atypical Angina, Thal is Temporary disability, Slope is Leaning up, Trestbps is Prehypertension, and Cholesterol is High Limit, then the decision is 0. | |
| If CP is Atypical Angina, Thal is Temporary disability, and Slope is Flat, then the decision is 1. | |
| If CP is Atypical Angina, Thal is Temporary disability, and Slope is Slightly Sloping, then the decision is 1. | |
| If CP is Nonanginal pain, Thal is Normal, Ca is 0, Trestbps is Hypertension, Oldpeak is No, and Age is Middle, then the decision is 0. | |
| If CP is Nonanginal pain, Thal is Normal, Ca is 0, Trestbps is Hypertension, Oldpeak is No, Age is Old, and Sex is Male, then the decision is 0. | |
| If CP is Nonanginal pain, Thal is Normal, Ca is 0, Trestbps is Hypertension, Oldpeak is No, Age is Middle, and Sex is Female, then the decision is 1. | |
| If CP is Nonanginal pain, Thal is Normal, Ca is 0, Trestbps is Hypertension, Oldpeak is No, and Age is Young, then the decision is 0. | |
| If CP is Nonanginal pain, Thal is Normal, Ca is 0, Trestbps is Hypertension, and Oldpeak is Yes, then the decision is 1. | |
| If CP is Nonanginal pain, Thal is Normal, Ca is 0, Trestbps is Normal, and Sex is Male, then the decision is 0. | |
| If CP is Nonanginal pain, Thal is Normal, Ca is 0, Trestbps is Normal, Sex is Female, Cholesterol is High, Age is Middle, and Restecg is Normal, then the decision is 1. | |
| If CP is Nonanginal pain, Thal is Normal, Ca is 0, Trestbps is Normal, Sex is Female, Cholesterol is High, Age is Middle, and Restecg is Ventricular hypertrophy, then the decision is 0. | |
| If CP is Nonanginal pain, Thal is Normal, Ca is 0, Trestbps is Normal, Sex is Female, Cholesterol is High, and Age is Young, then the decision is 0. | |
| If CP is Nonanginal pain, Thal is Normal, Ca is 0, Trestbps is Normal, Sex is Female, and Cholesterol is High Limit, then the decision is 0. | |
| If CP is Nonanginal pain, Thal is Normal, Ca is 0, Trestbps is Normal, Sex is Female, and Cholesterol is Normal, then the decision is 0. | |
| If CP is Nonanginal pain, Thal is Normal, and Ca is 1, then the decision is 0. | |
| If CP is Nonanginal pain, Thal is Normal, and Ca is 2, then the decision is 0. | |
| If CP is Nonanginal pain, Thal is Normal, and Ca is 3, then the decision is 1. | |
| If CP is Nonanginal pain, Thal is Temporary disability, Slope is Leaning up, and Trestbps is Hypertension, then the decision is 1. | |
| If CP is Nonanginal pain, Thal is Temporary disability, Slope is Leaning up, and Trestbps is Hypertension, then the decision is 0. | |
| If CP is Nonanginal pain and Thal is Temporary disability, then the decision is 1. | |
| If CP is Nonanginal pain, Thal is Temporary disability, Slope is Leaning up, and Trestbps is Prehypertension, then the decision is 0. | |
| If CP is Nonanginal pain, Thal is Temporary disability, Slope is Flat, and Trestbps is Hypertension, then the decision is 1. | |
| If CP is Nonanginal pain, Thal is Temporary disability, Slope is Flat, Trestbps is Normal, and Ca is 0, then the decision is 0. | |
| If CP is Nonanginal pain, Thal is Temporary disability, Slope is Flat, Trestbps is Normal, and Ca is 1, then the decision is 1. | |
| If CP is Nonanginal pain, Thal is Temporary disability, Slope is Flat, Trestbps is Normal, and Ca is 3, then the decision is 1. | |
| If CP is Nonanginal pain, Thal is Temporary disability, Slope is Flat, Trestbps is Prehypertension, and Ca is 1, then the decision is 1. | |
| If CP is Nonanginal pain, Thal is Temporary disability, Slope is Flat, Trestbps is Prehypertension, and Ca is 3, then the decision is 0. | |
| If CP is Nonanginal pain, Thal is Temporary disability, and the Slope is Slightly Sloping, then the decision is 0. | |
| If CP is Asymptomatic, Ca is 0, Thal is Normal, Trestbps is Hypertension, Sex is Male, Age is Middle, and Restecg is Normal, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 0, Thal is Normal, Trestbps is Hypertension, Sex is Male, Age is Middle, and Restecg is ST-T wave abnormalities, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 0, Thal is Normal, Trestbps is Hypertension, Sex is Male, Age is Middle, Restecg is Ventricular hypertrophy, and Slope is Leaning up, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 0, Thal is Normal, Trestbps is Hypertension, Sex is Male, Age is Middle, Restecg is Ventricular hypertrophy, and Slope is Flat, then the decision is 0. | |
| If CP is Asymptomatic, Ca is 0, Thal is Normal, Trestbps is Hypertension, Sex is Male, and Age is Old, then the decision is 0. | |
| If CP is Asymptomatic, Ca is 0, Thal is Normal, Trestbps is Hypertension, and Sex is Female, then the decision is 0. | |
| If CP is Asymptomatic, Ca is 0, Thal is Normal, and Trestbps is Normal, then the decision is 0. | |
| If CP is Asymptomatic, Ca is 0, Thal is Normal, Trestbps is Prehypertension, Age is Middle, and Restecg is Normal, then the decision is 0. | |
| If CP is Asymptomatic, Ca is 0, Thal is Normal, Trestbps is Prehypertension, Age is Middle, Restecg is Ventricular Hypertrophy, and Slope is Leaning up, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 0, Thal is Normal, Trestbps is Prehypertension, Age is Middle, Restecg is Ventricular Hypertrophy, and Slope is Flat, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 0, Thal is Normal, Trestbps is Prehypertension, and Age is Young, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 0, Thal is Permanent Disability, and Trestbps is Hypertension, then the decision is 0. | |
| If CP is Asymptomatic, Ca is 0, Thal is Permanent Disability, Trestbps is Normal, and Cholesterol is High Limit, then the decision is 0. | |
| If CP is Asymptomatic, Ca is 0, Thal is Permanent Disability, Trestbps is Normal, and Cholesterol is Normal, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 0, Thal is Permanent Disability, and Trestbps is Prehypertension, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 0, Thal is Temporary Disability, and Cholesterol is High, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 0, Thal is Temporary Disability, Cholesterol is High Limit, Age is Middle, and Fbs is False, then the decision is 0. | |
| If CP is Asymptomatic, Ca is 0, Thal is Temporary Disability, Cholesterol is High Limit, Age is Middle, and Fbs is True, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 0, Thal is Temporary Disability, Cholesterol is High Limit, and Age is Young, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 0, Thal is Temporary Disability, Cholesterol is Normal, and Age is Middle, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 0, Thal is Temporary Disability, Cholesterol is High Limit, Age is Old, and Trestbps is Hypertension, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 0, Thal is Temporary Disability, Cholesterol is High Limit, Age is Old, and Trestbps is Normal, then the decision is 0. | |
| If CP is Asymptomatic, Ca is 0, Thal is Temporary Disability, Cholesterol is Normal, and Age is Young, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 1, and Sex is Male, then the decision is 0. | |
| If CP is Asymptomatic, Ca is 1, Sex is Female, and Age is Middle, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 1, Sex is Female, and Age is Old, then the decision is 0. | |
| If CP is Asymptomatic, Ca is 1, Sex is Male, Age is Old, and Restecg is Ventricular Hypertrophy, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 1, Sex is Male, and Age is Young, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 2, Thal is Normal, and Cholesterol is High, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 2, Thal is Normal, and Cholesterol is High Limit, then the decision is 0. | |
| If CP is Asymptomatic, Ca is 2, and Thal is Permanent disability, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 2, and Thal is Temporary disability, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 3, and Trestbps is Hypertension, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 3, Trestbps is Normal, and Cholesterol is High, then the decision is 1. | |
| If CP is Asymptomatic, Ca is 3, Trestbps is Normal, and Cholesterol is High Limit, then the decision is 0. | |
| If CP is Asymptomatic, Ca is 3, and Trestbps is Prehypertension, then the decision is 1. | |
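
Each rule listed above corresponds to one root-to-leaf path of the decision tree. As a minimal sketch (not the authors' implementation), three of the listed rules are encoded below as condition dictionaries and matched against an already discretized record. The names `RULES` and `predict` are hypothetical, and a record that matches none of the encoded rules returns `None`.

```python
# Illustrative only: three rules copied from the table above, encoded as
# (conditions, decision) pairs. RULES and predict are hypothetical names.
RULES = [
    ({"CP": "Typical Angina", "Thal": "Normal"}, 0),
    ({"CP": "Atypical Angina", "Thal": "Permanent disability"}, 1),
    ({"CP": "Asymptomatic", "Ca": 3, "Trestbps": "Hypertension"}, 1),
]


def predict(record, rules=RULES):
    """Return the decision of the first rule whose conditions all hold."""
    for conditions, decision in rules:
        if all(record.get(attr) == value for attr, value in conditions.items()):
            return decision
    return None  # no encoded rule covers this record


# Example record (attribute values already discretized into labels).
patient = {"CP": "Asymptomatic", "Ca": 3, "Trestbps": "Hypertension", "Sex": "Male"}
print(predict(patient))
```

Attributes that a rule does not mention (here `Sex`) are simply ignored, which mirrors how a tree path only tests the attributes along that path.
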
3.2. Diabetes Mellitus Disease Dataset
3.2.1. Dataset Exploration and Preprocessing
3.2.2. Discretization
| If BMI is Normal, Glucose is High, Pregnancy is High, and Skin Thickness is Normal, then the decision is 0. | |
| If BMI is Normal, Glucose is High, Pregnancy is High, and Skin Thickness is Thick, then the decision is 1. | |
| If BMI is Normal, Glucose is High, Pregnancy is Low, and Skin Thickness is Normal, then the decision is 0. | |
| If BMI is Normal, Glucose is High, Pregnancy is Low, and Skin Thickness is Thin, then the decision is 1. | |
| If BMI is Normal, Glucose is High, Pregnancy is Low, and Skin Thickness is Very Thin, then the decision is 0. | |
| If BMI is Normal, Glucose is High, and Pregnancy is Normal, then the decision is 1. | |
| If BMI is Normal, Glucose is Low, and Pregnancy is Low, then the decision is 0. | |
| If BMI is Normal, Glucose is Low, Pregnancy is Normal, and Skin Thickness is Thick, then the decision is 1. | |
| If BMI is Normal, Glucose is Low, Pregnancy is Normal, and Skin Thickness is Very Thin, then the decision is 0. | |
| If BMI is Normal and Glucose is Low, then the decision is 0. | |
| If BMI is Obesity, Glucose is High, and Age is Middle, then the decision is 1. | |
| If BMI is Obesity, Glucose is High, Age is Young, Skin Thickness is Normal, and Blood Pressure is Low, then the decision is 1. | |
| If BMI is Obesity, Glucose is High, Age is Young, Skin Thickness is Normal, and Blood Pressure is Prehypertension, then the decision is 0. | |
| If BMI is Obesity, Glucose is High, Age is Young, Skin Thickness is Thick, and Blood Pressure is Hypertension, then the decision is 0. | |
| If BMI is Obesity, Glucose is High, Age is Young, Skin Thickness is Thick, Blood Pressure is Prehypertension, and Pregnancy is High, then the decision is 0. | |
| If BMI is Obesity, Glucose is High, Age is Young, Skin Thickness is Thick, Blood Pressure is Prehypertension, and Pregnancy is Low, then the decision is 0. | |
| If BMI is Obesity, Glucose is High, Age is Young, Skin Thickness is Thick, Blood Pressure is Prehypertension, and Pregnancy is Normal, then the decision is 1. | |
| If BMI is Obesity, Glucose is High, Age is Young, and Skin Thickness is Very Thin, then the decision is 1. | |
| If BMI is Obesity, Glucose is Low, and Blood Pressure is Hypertension, then the decision is 1. | |
| If BMI is Obesity, Glucose is Low, and Blood Pressure is Low, then the decision is 1. | |
| If BMI is Obesity, Glucose is Low, and Blood Pressure is Prehypertension, then the decision is 1. | |
| If BMI is Obesity, Glucose is Low, Blood Pressure is Normal, and Skin Thickness is Normal, then the decision is 1. | |
| If BMI is Obesity, Glucose is Low, Blood Pressure is Normal, Skin Thickness is Thick, Insulin is Low, Pregnancy is Low, and Age is Middle, then the decision is 0. | |
| If BMI is Obesity, Glucose is Low, Blood Pressure is Normal, Skin Thickness is Thick, Insulin is Low, Pregnancy is Low, and Age is Young, then the decision is 1. | |
| If BMI is Obesity, Glucose is Low, Blood Pressure is Normal, Skin Thickness is Thick, Insulin is Low, Pregnancy is Normal, and Age is Middle, then the decision is 1. | |
| If BMI is Obesity, Glucose is Low, Blood Pressure is Normal, Skin Thickness is Thick, Insulin is Low, Pregnancy is Normal, and Age is Young, then the decision is 0. | |
| If BMI is Obesity, Glucose is Low, Blood Pressure is Normal, Skin Thickness is Thick, and Insulin is Normal, then the decision is 1. | |
| If BMI is Obesity, Glucose is Low, Blood Pressure is Normal, Skin Thickness is Very Thin, Pregnancy is Low, and Age is Middle, then the decision is 0. | |
| If BMI is Obesity, Glucose is Low, Blood Pressure is Normal, Skin Thickness is Very Thin, Pregnancy is Low, and Age is Young, then the decision is 1. | |
| If BMI is Obesity, Glucose is Low, Blood Pressure is Normal, Skin Thickness is Very Thin, Pregnancy is Normal, and Age is Middle, then the decision is 1. | |
| If BMI is Obesity, Glucose is Low, Blood Pressure is Normal, Skin Thickness is Very Thin, Pregnancy is Normal, and Age is Young, then the decision is 0. | |
| If BMI is Obesity, Glucose is Normal, and Blood Pressure is Hypertension, then the decision is 1. | |
| If BMI is Obesity, Glucose is Normal, Blood Pressure is Low, and Pregnancy is High, then the decision is 1. | |
| If BMI is Obesity, Glucose is Normal, Blood Pressure is Low, and Pregnancy is Normal, then the decision is 0. | |
| If BMI is Obesity, Glucose is Normal, Blood Pressure is Low, Pregnancy is Low, and Age is Middle, then the decision is 0. | |
| If BMI is Obesity, Glucose is Normal, Blood Pressure is Low, Pregnancy is Low, and Age is Young, then the decision is 1. | |
| If BMI is Obesity, Glucose is Normal, Blood Pressure is Normal, Pregnancy is High, and Skin Thickness is Normal, then the decision is 1. | |
| If BMI is Obesity, Glucose is Normal, Blood Pressure is Normal, Pregnancy is High, and Skin Thickness is Thick, then the decision is 0. | |
| If BMI is Obesity, Glucose is Normal, Blood Pressure is Normal, Pregnancy is Low, Age is Middle, and Skin Thickness is Thick, then the decision is 0. | |
| If BMI is Obesity, Glucose is Normal, Blood Pressure is Normal, Pregnancy is Low, Age is Middle, and Skin Thickness is Very Thin, then the decision is 1. | |
| If BMI is Obesity, Glucose is Normal, Blood Pressure is Normal, Pregnancy is Low, and Age is Young, then the decision is 1. | |
| If BMI is Obesity, Glucose is Normal, Blood Pressure is Normal, Pregnancy is Normal, and Insulin is Normal, then the decision is 0. | |
| If BMI is Obesity, Glucose is Normal, Blood Pressure is Normal, Pregnancy is Normal, Insulin is Low, and Skin Thickness is Normal, then the decision is 1. | |
| If BMI is Obesity, Glucose is Normal, Blood Pressure is Normal, Pregnancy is Normal, Insulin is Low, and Skin Thickness is Thick, then the decision is 0. | |
| If BMI is Obesity, Glucose is Normal, Blood Pressure is Normal, Pregnancy is Normal, Insulin is Low, and Skin Thickness is Very Thin, then the decision is 0. | |
| If BMI is Obesity, Glucose is Normal, Blood Pressure is Prehypertension, Pregnancy is High, Skin Thickness is Thick, and Insulin is Low, then the decision is 0. | |
| If BMI is Obesity, Glucose is Normal, Blood Pressure is Prehypertension, Pregnancy is High, Skin Thickness is Thick, and Insulin is Normal, then the decision is 1. | |
| If BMI is Obesity, Glucose is Normal, Blood Pressure is Prehypertension, Pregnancy is High, and Skin Thickness is Very Thin, then the decision is 1. | |
| If BMI is Obesity, Glucose is Normal, Blood Pressure is Prehypertension, Pregnancy is Low, Skin Thickness is Normal, and Insulin is Low, then the decision is 0. | |
| If BMI is Obesity, Glucose is Normal, Blood Pressure is Prehypertension, Pregnancy is Low, and Skin Thickness is Thick, then the decision is 1. | |
| If BMI is Obesity, Glucose is Normal, Blood Pressure is Prehypertension, Pregnancy is Low, and Skin Thickness is Very Thin, then the decision is 1. | |
| If BMI is Obesity, Glucose is Normal, Blood Pressure is Prehypertension, and Pregnancy is Normal, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Normal, Age is Middle, and Insulin is High, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Normal, Age is Middle, and Insulin is Normal, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Normal, Age is Middle, Insulin is Low, and Glucose is High, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Normal, Age is Middle, Insulin is Low, Glucose is Low, and Pregnancy is Low, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Normal, Age is Middle, Insulin is Low, Glucose is Low, and Pregnancy is Normal, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Normal, Age is Middle, Insulin is Low, Glucose is Normal, and Pregnancy is Low, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Normal, Age is Middle, Insulin is Low, Glucose is Normal, and Pregnancy is Normal, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Normal, Age is Middle, Insulin is Low, Glucose is Normal, Pregnancy is Normal, and Blood Pressure is Prehypertension, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Normal, and Age is Old, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Normal, Age is Young, and Blood Pressure is Low, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Normal, Age is Young, and Blood Pressure is Normal, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Normal, Age is Young, Blood Pressure is Prehypertension, and Pregnancy is Low, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Normal, Age is Young, Blood Pressure is Prehypertension, and Pregnancy is Normal, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Thick, Glucose is High, Insulin is Low, and Age is Middle, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Thick, Glucose is High, Insulin is Low, and Age is Young, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Thick, Glucose is High, and Insulin is Normal, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Thick, Glucose is Low, and Blood Pressure is Hypertension, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Thick, Glucose is Low, and Blood Pressure is Low, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Thick, Glucose is Low, and Blood Pressure is Normal, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Thick, Glucose is Low, Blood Pressure is Prehypertension, and Pregnancy is Low, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Thick, Glucose is Low, Blood Pressure is Hypertension, and Pregnancy is Normal, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Thick, Glucose is Normal, and Blood Pressure is Hypertension, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Thick, Glucose is Normal, and Blood Pressure is Low, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Thick, Glucose is Normal, Blood Pressure is Normal, and Pregnancy is High, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Thick, Glucose is Normal, Blood Pressure is Normal, and Pregnancy is Low, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Thick, Glucose is Normal, Blood Pressure is Normal, and Pregnancy is Normal, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Thick, Glucose is Normal, Blood Pressure is Prehypertension, and Pregnancy is High, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Thick, Glucose is Normal, Blood Pressure is Prehypertension, and Pregnancy is Normal, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Thick, Glucose is Normal, Blood Pressure is Prehypertension, Pregnancy is Low, and Age is Middle, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Thick, Glucose is Normal, Blood Pressure is Prehypertension, Pregnancy is Low, and Age is Young, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Thin, and Glucose is High, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Thin, and Glucose is Normal, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Thin, Glucose is Low, and Blood Pressure is Low, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Thin, Glucose is Low, and Blood Pressure is Normal, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Thin, Glucose is Low, and Blood Pressure is Prehypertension, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Very Thin, Blood Pressure is Hypertension, and Glucose is High, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Very Thin, Blood Pressure is Hypertension, and Glucose is Normal, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Very Thin, Blood Pressure is Low, and Age is Middle, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Very Thin, Blood Pressure is Low, and Age is Young, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Very Thin, Blood Pressure is Normal, and Pregnancy is High, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Very Thin, Blood Pressure is Normal, Pregnancy is High, and Age is Young, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Very Thin, Blood Pressure is Normal, Pregnancy is High, and Age is Middle, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Very Thin, Blood Pressure is Normal, Pregnancy is Low, and Age is Middle, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Very Thin, Blood Pressure is Normal, Pregnancy is Low, Age is Young, and Glucose is High, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Very Thin, Blood Pressure is Normal, Pregnancy is Low, Age is Young, and Glucose is Low, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Very Thin, Blood Pressure is Normal, Pregnancy is Low, Age is Young, and Glucose is Normal, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Very Thin, Blood Pressure is Normal, Pregnancy is Normal, Age is Middle, and Glucose is High, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Very Thin, Blood Pressure is Normal, Pregnancy is Normal, Age is Middle, and Glucose is Low, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Very Thin, Blood Pressure is Normal, Pregnancy is Normal, Age is Middle, and Glucose is Normal, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Very Thin, Blood Pressure is Prehypertension, and Glucose is High, then the decision is 1. | |
| If BMI is Overweight, Skin Thickness is Very Thin, Blood Pressure is Prehypertension, and Glucose is Low, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Very Thin, Blood Pressure is Prehypertension, Glucose is Normal, and Age is Middle, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Very Thin, Blood Pressure is Prehypertension, Glucose is Normal, and Age is Young, then the decision is 0. | |
| If BMI is Overweight, Skin Thickness is Very Thin, Blood Pressure is Prehypertension, Glucose is Normal, Age is Young, and Pregnancy is Normal, then the decision is 1. | |
| If BMI is Underweight and Blood Pressure is Hypertension, then the decision is 1. | |
| If BMI is Underweight and Blood Pressure is Prehypertension, then the decision is 0. | |
| If BMI is Underweight and Blood Pressure is Low, then the decision is 0. | |
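
The linguistic labels in the rules above (for example, BMI as Underweight, Normal, Overweight, or Obesity) are the output of discretizing numeric attributes with fuzzy membership functions. The following is a minimal sketch of that idea, assuming triangular and shoulder-shaped functions and assigning the label with the largest membership degree. The breakpoints in `BMI_SETS` are assumptions chosen purely for illustration and are not taken from the study, whose membership functions and parameters may differ; all function names are hypothetical.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 outside (a, c), rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def left_shoulder(x, b, c):
    """Membership 1 up to b, then falling linearly to 0 at c."""
    if x <= b:
        return 1.0
    if x >= c:
        return 0.0
    return (c - x) / (c - b)


def right_shoulder(x, a, b):
    """Membership 0 up to a, then rising linearly to 1 at b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


# ASSUMED breakpoints (kg/m^2), for illustration only; not from the paper.
BMI_SETS = {
    "Underweight": lambda x: left_shoulder(x, 17.0, 20.0),
    "Normal": lambda x: triangular(x, 17.0, 21.5, 26.0),
    "Overweight": lambda x: triangular(x, 24.0, 27.5, 31.0),
    "Obesity": lambda x: right_shoulder(x, 29.0, 32.0),
}


def discretize(x, fuzzy_sets=BMI_SETS):
    """Return the label whose membership degree for x is largest."""
    degrees = {label: mu(x) for label, mu in fuzzy_sets.items()}
    return max(degrees, key=degrees.get)


for bmi in (16.0, 22.0, 28.0, 35.0):
    print(bmi, discretize(bmi))
```

Because neighbouring sets overlap, a value near a boundary has nonzero membership in two labels; taking the maximum is one simple way to turn those degrees into the single category that a rule such as "BMI is Overweight" tests.
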
3.3. Model Performance Comparison with Other Research
3.3.1. Coronary Heart Disease
| No. | Research | The Best Prediction Method | Validation Method | Accuracy (%) | Recall (%) | Precision (%) | F1-Score (%) | AUC (%) |
|---|---|---|---|---|---|---|---|---|
| 1 | Chowdary et al. [46] | Ensemble of LR, RF, GNB, NNR, KNN | Hold out 67:33 | 87.00 | 94.00 | 91.60 | 88.00 | - |
| 2 | Kresnawati et al. [33] | DTID3 | 10-fold CV | 99.63 | 100.00 | 99.23 | 99.61 | 99.67 |
| 3 | Hassan et al. [21] | RF | Hold out 70:30 | 96.28 | 95.37 | 96.28 | 96.28 | - |
| 4 | Hossen [19] | LR | Hold out 80:20 | 95.00 | 95.00 | - | - | - |
| 5 | Kanwal et al. [24] | SVM with LASSO | Hold out 80:20 | 85.19 | 80.77 | - | - | - |
| 6 | Chandrasekhar and Peddakrishna, 2023 [56] | Ensemble of RF, KNN, LR, NB, GB, AB, SVE | 5-fold CV | 90.00 | 89.00 | - | - | - |
| 7 | Patil and Bhosale, 2023 [23] | FCM-based NN with feature scaling | Hold out 70:30 | 98.78 | - | - | - | - |
| 8 | Karthikeyini et al., 2023 [26] | DGRU with LCHB | - | 95.15 | 91.48 | 92.26 | 92.21 | - |
| 9 | Femina and Sudheep, 2020 [36] | Linguistic Fuzzy NB Classifier (LFNBC) | Hold out 90:10 | 91.30 | 92.68 | - | - | 91.44 |
| 10 | Proposed Method | FDTID3-4 | 5-fold CV | 99.67 | 100.00 | 99.29 | 99.64 | 99.70 |
3.3.2. Diabetes Mellitus Disease
| No. | Research | Zero-Value Data | Balance Class | The Best Prediction Method | Validation Method | Recall (%) | F1-Score (%) | AUC (%) | Accuracy (%) | Precision (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Maniruzzaman et al., 2017 [59] | impute by median | no special treatment | GP | 10-fold CV | 91.79 | 88.22 | - | 81.97 | 84.91 | 
| 2 | Shanmugapriya et al., 2017 [37] | no special treatment | no special treatment | SVM | Hold out 75:25 | 58.90 | - | - | 73.82 | - | 
| 3 | Tigga and Garg, 2020 [57] | no special treatment | no special treatment | SVM | 10-fold CV | 77.50 | 81.30 | 77.10 | 74.40 | 85.60 | 
| 4 | Resti et al., 2021 [46] | no special treatment | no special treatment | NB | 5-fold CV | 94.48 | 94.15 | - | 95.83 | 93.82 | 
| 5 | Tasin et al., 2022 [22] | impute by mean (for skin thickness and BMI) and impute by XGB (for others) | balanced using ADASYN | XGBoost | Hold out 80:20 | 80.00 | 81.00 | - | 88.50 | 82.00 | 
| 6 | Kresnawati et al., 2023 [58] | no special treatment | no special treatment | QDA | Hold out 70:30 | 69.23 | 81.82 | 84.62 | 98.27 | 100.00 | 
| 7 | Binerbia, 2022 [28] | impute by mean | no special treatment | SVM | Hold out 80:20 | 86.00 | - | - | 80.00 | 75.00 | 
| 8 | Palanivinayagam and Damasevicius, 2023 [27] | impute by SVM | no special treatment | SVM | 10-fold CV | 88.23 | 85.71 | - | 94.89 | 83.33 | 
| 9 | Proposed Method | no special treatment | no special treatment | FDTID3-5 | 5-fold CV | 96.99 | 94.89 | 89.64 | 93.23 | 92.95 | 
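
Because the F1-score is the harmonic mean of precision and recall, the F1 columns in the comparison tables can be sanity-checked from the other two columns. The sketch below does this for two rows; the function name `f1_from_pr` is hypothetical, and inputs are taken in percent exactly as printed in the tables.

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs (in %) copied from the comparison tables.
print(round(f1_from_pr(99.29, 100.00), 2))  # proposed FDTID3-4, CHD dataset
print(round(f1_from_pr(100.00, 69.23), 2))  # Kresnawati et al., QDA, diabetes dataset
```

Both rows reproduce the tabulated F1-scores (99.64 and 81.82). For the proposed FDTID3-5 row, the same formula gives about 94.93 against a reported 94.89; a small gap of this kind can arise when each metric is averaged over folds separately rather than computed from the averaged precision and recall.
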
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Kuo, N.I.; Jorm, L.; Barbieri, S. Synthetic health-related longitudinal data with mixed-type variables generated using diffusion models. arXiv 2023.
- Nezhad, S.N.; Zahedi, M.H.; Farahani, E. Detecting diseases in medical prescriptions using data mining methods. BioData Min. 2022, 15, 29.
- Kee, O.T.; Harun, H.; Mustafa, N.; Murad, N.A.A.; Chin, S.F.; Jaafar, R.; Abdullah, N. Cardiovascular complications in a diabetes prediction model using machine learning: A systematic review. Cardiovasc. Diabetol. 2023, 22, 13.
- Abdalrada, A.S.; Abawajy, J.; Al-Quraishi, T.; Islam, S.M.S. Machine learning models for prediction of co-occurrence of diabetes and cardiovascular diseases: A retrospective cohort study. J. Diabetes Metab. Disord. 2022, 21, 251–261.
- Eadie, M.J. Degenerative disease affecting the nervous system. Aust. J. Physiother. 1974, 20, 20–22.
- Batista, P.; Pereira, A. Quality of life in patients with neurodegenerative diseases. J. Neurol. Neurosci. 2016, 7, 74.
- Harahap, J.; Andayani, L.S. Screening of Degenerative Diseases and Quality of Life among Elderly People in Posyandu Lansia Medan Amplas. In Proceedings of the 5th Annual International Conference Syiah Kuala University, Banda Aceh, Indonesia, 9 September 2015.
- Barendregt, J.J.M. Degenerative Disease in an Aging Population Models and Conjectures. Ph.D. Thesis, The Department of Public Health of Erasmus Universiteit, Rotterdam, The Netherlands, 1998.
- Di Renzo, L.; Gualtieri, P.; Frank, G.; De Lorenzo, A. Nutrition for prevention and control of chronic degenerative diseases and COVID-19. Nutrients 2023, 15, 2253.
- Livingston, K.A.; Freeman, K.J.; Friedman, S.M.; Stout, R.W.; Lianov, L.S.; Drozek, D.; Shallow, J.; Shurney, D.; Patel, P.M.; Campbell, T.M.; et al. Lifestyle medicine and economics: A proposal for research priorities informed by a case series of disease reversal. Int. J. Environ. Res. Public Health 2021, 18, 11364.
- Nelwan, E.J.; Widjajanto, E.; Andarini, S.; Djati, M.S. Modified risk factors for coronary heart disease (CHD) in Minahasa ethnic group from Manado city Indonesia. J. Exp. Life Sci. 2016, 6, 88–94. [Google Scholar] [CrossRef][Green Version]
- Di Cesare, M.; Bixby, H.; Gaziano, T.; Hadeed, L.; Kabudula, C.; McGhie, D.V.; Mwangi, J.; Pervan, B.; Perel, P.; Piñeiro, D.; et al. World Heart Report 2023 Confronting the World’s Number One Killer; World Heart Federation: Geneva, Switzerland, 2023. [Google Scholar]
- Antini, C.; Caixeta, R.; Luciani, S.; Hennis, A.J.M. Diabetes mortality: Trends and multi-country analysis of the Americas from 2000 to 2019. Int. J. Epidemiol. 2024, 53, dyad182. [Google Scholar] [CrossRef]
- WHO. Global Report on Diabetes; WHO Library Cataloguing in Publication Data: Lyon, France, 2016. [Google Scholar]
- IDF. Diabetes Voice; IDF: Brussels, Belgium, 2017; Volume 64. [Google Scholar]
- Abdollahi, J.; Moghaddam, B.N.; Parvar, E. Improving diabetes diagnosis in smart health using genetic-based ensemble learning algorithm approach to IoT infrastructure. Future Gener. Distrib. Syst. J. 2019, 1, 26–33. [Google Scholar]
- Cavan, D.; Makaroff, L.; Fernandes, J.D.R. Cost-Effective Solutions for the Prevention of Type 2 Diabetes; IDF: Brussels, Belgium, 2016. [Google Scholar]
- WHO. World Health Statistics Overview 2019; WHO: Geneva, Switzerland, 2019.
- Hossen, M.K. Heart disease prediction using machine learning techniques. Am. J. Comput. Sci. Technol. 2022, 5, 146–154.
- Chowdary, G.J.; Suganya, G.; Mariappan, P. Predicting the presence of coronary heart disease using machine learning classifiers. J. Crit. Rev. 2020, 7, 1865–1875.
- Hassan, C.A.U.; Iqbal, J.; Irfan, R.; Hussain, S.; Algami, A.D.; Bukhari, S.S.H.; Alturki, N.; Ullah, S.S. Effectively predicting the presence of coronary heart disease using machine learning classifiers. Sensors 2022, 22, 7227.
- Tasin, I.; Nabil, T.U.; Islam, S.; Khan, R. Diabetes prediction using machine learning and explainable AI. Healthc. Technol. Lett. 2023, 10, 1–10.
- Patil, S.; Bhosale, S. Improving cardiovascular disease prognosis using outlier detection and hyperparameter optimization of machine learning models. Rev. d’Intell. Artif. 2023, 37, 1069–1080.
- Kanwal, A.; Ahmad, K.T.; Abid, M.K.; Aslam, N. Detection of heart disease using supervised machine learning. Vfast Trans. Softw. Eng. 2022, 6246, 58–70.
- Selvan, S.; Varadhaganapathy, S. Deep learning based cardiovascular disease risk factor prediction among type 2 diabetes mellitus patients. Inf. Technol. Control 2023, 52, 215–227.
- Karthikeyini, S.; Vidhya, G.; Vetriselvi, T.; Deepa, K. Heart disease prognosis using D-GRU with logistic chaos honey badger optimization in IOMT framework. Inf. Technol. Control 2023, 52, 367–380.
- Palanivinayagam, A.; Damaševičius, R. Effective handling of missing values in datasets for classification using machine learning methods. Information 2023, 14, 92.
- Benarbia, M. A Machine Learning Approach to Predicting the Onset of Type II Diabetes in a Sample of Pima Indian Women. Master’s Thesis, City University of New York, NY, USA, 2022.
- Dougherty, J.; Kohavi, R.; Sahami, M. Supervised and Unsupervised Discretization of Continuous Features. In Proceedings of the Twelfth International Conference on Machine Learning, Tahoe, CA, USA, 9–12 July 1995.
- García, S.; Luengo, J.; Herrera, F. Data Preprocessing in Data Mining; Springer: Cham, Switzerland, 2015; Volume 72.
- Roy, A.; Pal, S.K. Fuzzy discretization of feature space for a rough set classifier. Pattern Recognit. Lett. 2003, 24, 895–902.
- Resti, Y. Credit Risk-Type Classification using Statistical Learning. In Proceedings of the 3rd Conference on Fundamental and Applied Science for Advanced Technology Universitas Ahmad Dahlan, Yogyakarta, Indonesia, 22 January 2022.
- Kresnawati, E.S.; Resti, Y.; Suprihatin, B.; Kurniawan, M.R.; Amanda, W.A. Coronary artery disease prediction using decision trees and multinomial naïve bayes with k-fold cross validation. Inomatika 2021, 3, 174–189.
- Resti, Y.; Irsan, C.; Amini, M.; Yani, I.; Passarella, R. Performance improvement of decision tree model using fuzzy membership function for classification of corn plant diseases and pests. Sci. Technol. Indones. 2022, 7, 284–290.
- Resti, Y.; Irsan, C.; Neardiaty, A.; Annabila, C.; Yani, I. Fuzzy discretization on the multinomial naïve Bayes method for modeling multiclass classification of corn plant diseases and pests. Mathematics 2023, 11, 1761.
- Femina, B.T.; Sudheep, E.M. A novel fuzzy linguistic fusion approach to naive Bayes classifier for decision-making applications. Int. J. Adv. Sci. Eng. Inf. Technol. 2020, 10, 1889–1897.
- Shanmugapriya, M.; Nehemiah, H.K.; Bhuvaneswaran, R.S.; Arputharaj, K.; Sweetlin, J.D. Fuzzy discretization based classification of medical data. Res. J. Appl. Sci. Eng. Technol. 2017, 14, 291–298.
- Tutuncu, G.Y.; Kayaalp, N. An aggregated fuzzy naive Bayes data classifier. J. Comput. Appl. Math. 2019, 286, 17–27.
- Algehyne, E.A.; Jibril, M.L.; Algehainy, N.A.; Alamri, O.A.; Alzahrani, A.K. Fuzzy neural network expert system with an improved gini index random forest-based feature importance measure algorithm for early diagnosis of breast cancer in Saudi Arabia. Big Data Cogn. Comput. 2022, 6, 13.
- Altay, A.; Cinar, D. Fuzzy decision trees. In Fuzzy Statistical Decision-Making; Springer International Publisher: Cham, Switzerland, 2016.
- Araniba, L.A.Q. Learning Fuzzy Logic from Examples. Master’s Thesis, Ohio University, Athens, OH, USA, 1994.
- Resti, Y.; Burlian, F.; Yani, I.; Zayanti, D.A.; Sari, I.M. Improved the cans waste classification rate of naive Bayes using fuzzy approach. Sci. Technol. Indones. 2020, 5, 75–78.
- Fernandez, S.; Ito, T.; Cruz-Piris, L.; Marsa-Maestre, I. Fuzzy ontology-based system for driver behavior classification. Sensors 2022, 22, 7954.
- Kaggle. Available online: https://www.kaggle.com/datasets/aavigan/cleveland-clinic-heart-disease-dataset/data (accessed on 17 January 2024).
- Kaggle. Available online: https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database (accessed on 17 January 2024).
- Resti, Y.; Kresnawati, E.S.; Dewi, N.R.; Zayanti, D.A.; Eliyati, N. Diagnosis of diabetes mellitus in women of reproductive age using the prediction methods of naive Bayes, discriminant analysis, and logistic regression. Sci. Technol. Indones. 2021, 6, 96–104.
- Lee, C.F.; Tzeng, G.H.; Wang, S.Y. A new application of fuzzy set theory to the Black-Scholes option pricing model. Expert Syst. Appl. 2005, 29, 330–342.
- Resti, Y.; Irsan, C.; Putri, M.T.; Yani, I.; Ansyori, A.; Suprihatin, B. Identification of corn plant diseases and pests based on digital images using multinomial naïve Bayes and k-nearest neighbor. Sci. Technol. Indones. 2022, 7, 29–35.
- Bhattacharyya, R.; Mukherjee, S. Fuzzy membership function evaluation by non-linear regression: An algorithmic approach. Fuzzy Inf. Eng. 2021, 12, 412–434.
- Alzoman, R.M.; Alenazi, M.J.F. A comparative study of traffic classification techniques for smart city networks. Sensors 2021, 21, 4677.
- Rutkowski, L. Flexible Neuro-Fuzzy Systems; Kluwer Academic Publisher: Boston, MA, USA, 2004.
- Medasani, S.; Kim, J.; Krishnapuram, R. An overview of membership function generation techniques for pattern recognition. Int. J. Approx. Reason. 1998, 19, 391–417.
- Lantz, B. Machine Learning with R; Packt Publishing: Birmingham, UK, 2013; pp. 315–348.
- Rodríguez, J.D.; Perez, A.; Lozano, J.A. Sensitivity analysis of k-fold cross validation in prediction error estimation. IEEE Trans. Pattern Anal. Mach. Intell. 2009, 32, 569–575.
- Ramasubramanian, K.; Singh, A. Machine Learning Using R, 2nd ed.; Apress: Berkeley, CA, USA, 2019.
- Chandrasekhar, N.; Peddakrishna, S. Enhancing heart disease prediction accuracy through machine learning techniques and optimization. Processes 2023, 11, 1210.
- Tigga, N.P.; Garg, S. Prediction of type 2 diabetes using machine learning classification methods. Procedia Comput. Sci. 2020, 167, 706–716.
- Kresnawati, E.S.; Suprihatin, B.; Resti, Y. Diabetes Mellitus Diagnosis Using The Prediction Model of Discriminant Analysis. In Proceedings of the AIP Conference Proceedings of Annual Conference on Science and Technology Research, Palembang, Indonesia, 24 August 2021.
- Maniruzzaman, M.; Kumar, N.; Abedin, M.M.; Islam, M.S.; Suri, H.S.; El-Baz, A.S.; Suri, J.S. Comparative approaches for classification of diabetes mellitus data: Machine learning paradigm. Comput. Methods Programs Biomed. 2017, 152, 23–34.














| Variable | Description | Type | Information |
|---|---|---|---|
| Age | Age in years | Continuous | 29–77 years |
| Sex | Sex | Categorical | 0: male; 1: female |
| CP | Chest pain type | Categorical | 1: typical angina; 2: atypical angina; 3: non-anginal pain; 4: asymptomatic |
| Trestbps | Resting blood pressure | Continuous | 94–200 mmHg |
| Chol | Serum cholesterol | Continuous | 126–564 mg/dL |
| FBS | Fasting blood sugar > 120 mg/dL | Categorical | 0: false; 1: true |
| Restecg | Resting electrocardiographic results | Categorical | 0: normal; 1: ST-T wave abnormality (>0.05 mV); 2: probable or definite left ventricular hypertrophy by Estes’ criteria |
| Thalach | Maximum heart rate achieved | Continuous | 71–202 bpm |
| Exang | Exercise-induced angina | Categorical | 0: no; 1: yes |
| Oldpeak | ST depression induced by exercise relative to rest | Continuous | 0–6.2 mV |
| Slope | Slope of the peak exercise ST segment | Categorical | 1: upsloping; 2: flat; 3: downsloping |
| Ca | Number of significant vessels colored by fluoroscopy | Discrete | 0–3 |
| Thal | Thalassemia (types of blood disorder) | Categorical | 3: normal; 6: fixed defect; 7: reversible defect |

| Variable | Description | Information |
|---|---|---|
| Glucose (ratio) | Plasma glucose concentration 2 h in an oral glucose tolerance test | 0–199 mg/dL |
| Blood Pressure | Diastolic blood pressure (blood pressure when the heart relaxes) | 0–122 mmHg |
| Skin Thickness | Triceps skin fold thickness | 0–99 mm |
| Insulin | 2-hour serum insulin | 0–846 μU/mL |
| BMI | Body mass index (an approximation of total body fat) | 0–67.1 kg/m² |
| Diabetes Pedigree Function | A function that scores the probability of diabetes based on family history | 0.08–2.42 |
| Age | Age in years | 21–81 years |
| Pregnancies | Number of times pregnant | 0–17 times |

| Status of Coronary Heart Disease | Statistic | Age | Trestbps | Cholesterol | Thalach | Oldpeak |
|---|---|---|---|---|---|---|
| No | Min | 29 | 94 | 126 | 96 | 0 |
| No | Q1 | 45 | 120 | 209 | 148.5 | 0 |
| No | Mean | 52.67 | 129.20 | 243.06 | 158.29 | 0.59 |
| No | Mode | 54 | 130 | 204 | 162 | 0 |
| No | Q3 | 59 | 140 | 267.5 | 172 | 1.05 |
| No | Max | 76 | 180 | 564 | 202 | 4.2 |
| Yes | Min | 35 | 100 | 131 | 71 | 0 |
| Yes | Q1 | 52 | 120 | 217.5 | 125 | 0.55 |
| Yes | Mean | 56.63 | 134.57 | 251.47 | 139.26 | 1.57 |
| Yes | Mode | 58 | 140 | 254 | 132 | 0 |
| Yes | Q3 | 62 | 145 | 283.5 | 156.5 | 2.5 |
| Yes | Max | 77 | 200 | 409 | 195 | 6.2 |

| Variable | Crisp Discretization | Source of Prior Information |
|---|---|---|
| Age | <40 years; 40–64 years; ≥65 years | Woodward et al., 2012 in [33] |
| Trestbps | 90–119 mmHg (Normal); 120–139 mmHg (Prehypertension); ≥140 mmHg (Hypertension) | Borghi et al., 2003 in [33] |
| Chol | <200 (Normal); 200–239 (High Limit); ≥240 (High) | Third Report of the National Cholesterol Education Program (NCEP), 2001 in [33] |
| Thalach | ≤100 (Normal); >100 (Tachycardia) | Palatini, 1999 in [33] |
| Oldpeak | <3.2 (No/Normal); ≥3.2 (Yes/Risk) | Riani et al., 2019 in [33] |

| Continuous Variable | Discretization Term | Interval (FDTID3-1) | Interval (FDTID3-2) | Interval (FDTID3-3) | Interval (FDTID3-4) | Interval (FDTID3-5) | Interval (FDTID3-6) |
|---|---|---|---|---|---|---|---|
| Age | Young | [29, 41] | [29, 42] | [29, 45] | [29, 55] | [29, 41] | [29, 41] |
| Age | Middle | [39, 65] | [38, 66] | [38, 66] | [45, 69] | [40, 64] | [41, 63] |
| Age | Old | [63, 77] | [62, 77] | [63, 77] | [61, 77] | [63, 77] | [63, 77] |
| Trestbps | Normal | [90, 121] | [90, 122] | [90, 124] | [90, 130] | [90, 120] | [90, 120] |
| Trestbps | Pre-Hypertension | [119, 141] | [119, 142] | [120, 144] | [110, 150] | [119, 149] | [120, 138] |
| Trestbps | Hypertension | [139, 200] | [139, 200] | [140, 200] | [130, 200] | [138, 200] | [138, 200] |
| Cholesterol | Normal | [126, 202] | [126, 202] | [126, 210] | [126, 220] | [126, 201] | [126, 200] |
| Cholesterol | High limit | [198, 242] | [200, 242] | [200, 252] | [200, 260] | [200, 240] | [200, 240] |
| Cholesterol | High | [238, 564] | [240, 564] | [240, 564] | [240, 564] | [239, 564] | [240, 564] |
| Thalach | Normal | [71, 102] | [71, 105] | [71, 121] | [71, 203] | [71, 101] | [71, 101] |
| Thalach | Tachycardia | [98, 202] | [100, 202] | [100, 202] | [71, 203] | [99, 202] | [100, 202] |
| Oldpeak | No | [0, 4] | [0, 4] | [0, 6] | [0, 6] | [0, 4] | [0, 4] |
| Oldpeak | Yes | [2, 6] | [2, 6] | [0, 6] | [0, 6] | [2, 6] | [2, 6] |
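
The overlapping intervals above can be read as the supports of fuzzy terms. As a minimal sketch, the FDTID3-1 Age intervals are turned into trapezoidal membership functions below, and a value is assigned to the term with the highest membership degree. The trapezoidal shape, the choice of the overlap regions as slopes, and the names `trapezoid`, `AGE_TERMS`, `memberships`, and `discretize` are assumptions made for illustration; the membership functions actually combined in each FDTID3 variant are defined in the methods section, not in this table.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], equals 1 on [b, c], falls on [c, d]."""
    if x < a or x > d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


# Hypothetical breakpoints derived from the FDTID3-1 Age intervals:
# Young [29, 41], Middle [39, 65], Old [63, 77]; each overlap becomes a slope.
AGE_TERMS = {
    "Young": (29, 29, 39, 41),
    "Middle": (39, 41, 63, 65),
    "Old": (63, 65, 77, 77),
}


def memberships(x, terms):
    """Membership degree of x in every linguistic term."""
    return {name: trapezoid(x, *params) for name, params in terms.items()}


def discretize(x, terms):
    """Return the term with the highest membership degree (first term wins ties)."""
    degrees = memberships(x, terms)
    return max(degrees, key=degrees.get)


if __name__ == "__main__":
    for age in (30, 40.5, 50, 64.5, 70):
        print(age, memberships(age, AGE_TERMS), discretize(age, AGE_TERMS))
```

Under these assumed breakpoints, an age of 40.5 belongs to Young with degree 0.25 and to Middle with degree 0.75, so it is labeled Middle; a crisp cut at 40 would give the same label but would not record how close the value is to the boundary.
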

| Data | Status | Iteration 1 | Iteration 2 | Iteration 3 | Iteration 4 | Iteration 5 |
|---|---|---|---|---|---|---|
| Learning | No | 128 | 128 | 125 | 127 | 131 |
| Learning | Yes | 110 | 109 | 113 | 110 | 107 |
| Learning | Sum | 238 | 237 | 238 | 237 | 238 |
| Testing | No | 32 | 32 | 35 | 33 | 29 |
| Testing | Yes | 27 | 28 | 24 | 27 | 30 |
| Testing | Sum | 59 | 60 | 59 | 60 | 59 |
| Total | | 297 | 297 | 297 | 297 | 297 |
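
The learning and testing sizes in this table are what a plain five-fold partition of 297 records produces. The sketch below shows one way to build such a partition with the standard library only. The function names `k_fold_indices` and `splits` and the fixed seed are hypothetical, and the split is not stratified by class, so it reproduces the fold sizes but not the per-class counts reported above.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and cut them into k folds whose sizes differ by at most 1."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # the first `extra` folds get one more item
        folds.append(idx[start:start + size])
        start += size
    return folds


def splits(n, k, seed=0):
    """Yield (learning, testing) index lists, using each fold once as the test set."""
    folds = k_fold_indices(n, k, seed)
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


if __name__ == "__main__":
    for train, test in splits(297, 5):
        print(len(train), len(test))
```

For 297 records this gives two iterations with 237 learning and 60 testing records and three with 238 and 59, the same set of sizes as in the table, although the order of the iterations depends on how the folds are numbered.
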

| Actual CHD Status | Predicted Yes | Predicted No | Sum |
|---|---|---|---|
| Yes | 27 | 0 | 27 |
| No | 0 | 32 | 32 |
| Sum | 27 | 32 | 59 |

| Fuzzy Membership Functions Combination | Accuracy (%) | Recall (%) | Precision (%) | F1-Score (%) | AUC (%) |
|---|---|---|---|---|---|
| DTID3 | 98.99 | 99.33 | 98.62 | 98.97 | 99.02 | 
| FDTID3-1 | 99.00 | 99.00 | 99.31 | 99.15 | 99.17 | 
| FDTID3-2 | 99.33 | 98.64 | 100 | 99.31 | 99.02 | 
| FDTID3-3 | 98.99 | 99.33 | 98.62 | 98.97 | 99.02 | 
| FDTID3-4 | 99.67 | 100 | 99.29 | 99.64 | 99.70 | 
| FDTID3-5 | 99.33 | 99.29 | 99.29 | 99.27 | 99.34 | 
| FDTID3-6 | 99.67 | 99.26 | 100 | 99.62 | 99.63 | 

| Metric | Source of Variation | Sum of Squares | Mean Square | F | p-Value | F-Critical |
|---|---|---|---|---|---|---|
| Accuracy | between | 256.06 | 42.68 | 645.90 | 6.0 × 10⁻¹⁸⁶ | 2.12 |
| Accuracy | within | 23.13 | 0.07 | | | |
| Recall | between | 181.26 | 30.21 | 165.11 | 7.3 × 10⁻⁹⁹ | 2.12 |
| Recall | within | 64.04 | 0.18 | | | |
| Precision | between | 356.08 | 59.35 | 455.94 | 4.6 × 10⁻¹⁶² | 2.12 |
| Precision | within | 45.56 | 0.13 | | | |
| F1-score | between | 208.24 | 34.71 | 351.64 | 7.3 × 10⁻¹⁴⁵ | 2.12 |
| F1-score | within | 34.55 | 0.10 | | | |
| AUC | between | 248.89 | 41.48 | 387.31 | 3.4 × 10⁻¹⁵¹ | 2.12 |
| AUC | within | 37.49 | 0.11 | | | |
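
Each row pair in this table follows the usual one-way ANOVA decomposition: the mean square is the sum of squares divided by its degrees of freedom, and F is the ratio of the between-group to the within-group mean square. The ratio 256.06 / 42.68 ≈ 6 for accuracy is consistent with seven compared models (six degrees of freedom). The sketch below computes these quantities in pure Python; the function name `one_way_anova` and the toy scores are illustrative assumptions and are not the paper's data.

```python
def one_way_anova(groups):
    """One-way ANOVA table entries for a list of groups (each a list of numbers)."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-group variation: group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group variation: observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


if __name__ == "__main__":
    # Toy scores for three hypothetical models (not taken from the paper).
    toy = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
    print(one_way_anova(toy))
```

The null hypothesis of equal means is rejected when the computed F exceeds the critical value, which is the comparison made in the last column of the table.
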

| Comparison Model (absolute mean difference) | Accuracy | Recall | Precision | F1-Score | AUC |
|---|---|---|---|---|---|
| FDTID3-1 vs. FDTID3-2 | 1.47 | 1.12 | 1.60 | 1.36 | 1.50 | 
| FDTID3-1 vs. FDTID3-3 | 1.91 | 1.12 | 2.41 | 1.77 | 1.97 | 
| FDTID3-1 vs. FDTID3-4 | 1.47 | 1.93 | 0.82 | 1.37 | 1.43 | 
| FDTID3-1 vs. FDTID3-5 | 0.52 | 1.12 | 0.13 | 0.49 | 0.56 | 
| FDTID3-1 vs. FDTID3-6 | 2.25 | 1.83 | 2.15 | 1.99 | 2.17 | 
| FDTID3-1 vs. DTID3 | 0.00 | 0.02 | 0.00 | 0.01 | 0.01 | 
| FDTID3-2 vs. FDTID3-3 | 0.43 | 0.00 | 0.81 | 0.40 | 0.47 | 
| FDTID3-2 vs. FDTID3-4 | 0.00 | 0.81 | 0.78 | 0.01 | 0.07 | 
| FDTID3-2 vs. FDTID3-5 | 0.96 | 0.00 | 1.73 | 0.87 | 0.94 | 
| FDTID3-2 vs. FDTID3-6 | 0.78 | 0.71 | 0.55 | 0.63 | 0.67 | 
| FDTID3-2 vs. DTID3 | 1.47 | 1.10 | 1.60 | 1.36 | 1.49 | 
| FDTID3-3 vs. FDTID3-4 | 0.43 | 0.80 | 1.59 | 0.40 | 0.54 | 
| FDTID3-3 vs. FDTID3-5 | 1.39 | 0.00 | 2.54 | 1.28 | 1.41 | 
| FDTID3-3 vs. FDTID3-6 | 0.35 | 0.71 | 0.25 | 0.23 | 0.20 | 
| FDTID3-3 vs. DTID3 | 1.91 | 1.11 | 2.41 | 1.76 | 1.97 | 
| FDTID3-4 vs. FDTID3-5 | 0.96 | 0.81 | 0.95 | 0.88 | 0.87 | 
| FDTID3-4 vs. FDTID3-6 | 0.78 | 0.10 | 1.33 | 0.62 | 0.74 | 
| FDTID3-4 vs. DTID3 | 1.47 | 1.91 | 0.82 | 1.36 | 1.43 | 
| FDTID3-5 vs. FDTID3-6 | 1.74 | 0.71 | 2.28 | 1.50 | 1.61 | 
| FDTID3-5 vs. DTID3 | 0.52 | 1.10 | 0.13 | 0.48 | 0.56 | 
| FDTID3-6 vs. DTID3 | 1.74 | 0.71 | 2.28 | 1.50 | 1.61 | 

| Status of Diabetes | Statistic | Glucose (mg/dL) | Blood Pressure (mmHg) | Skin Thickness (mm) | Insulin (μU/mL) | BMI (kg/m²) | Diabetes Pedigree Function (unit) | Age (years) | Pregnancies |
|---|---|---|---|---|---|---|---|---|---|
| No | Min | 0 | 0 | 0 | 0 | 0 | 0.08 | 21 | 0 |
| No | Q1 | 93 | 62 | 0 | 0 | 25.4 | 0.23 | 23 | 1 |
| No | Mean | 109.98 | 68.18 | 19.66 | 68.79 | 30.30 | 54.73 | 31.19 | 3.30 |
| No | Mode | 99 | 74 | 0 | 0 | 0 | 0.207 | 22 | 1 |
| No | Q3 | 125 | 78 | 31 | 105 | 35.3 | 0.56 | 37 | 5 |
| No | Max | 197 | 122 | 60 | 744 | 57.3 | 2329.00 | 81 | 13 |
| Yes | Min | 0 | 0 | 0 | 0 | 0 | 0.09 | 21 | 0 |
| Yes | Q1 | 119 | 66 | 0 | 0 | 30.8 | 0.26 | 28 | 1.75 |
| Yes | Mean | 141.26 | 70.82 | 22.16 | 100.34 | 35.14 | 131.80 | 37.07 | 4.87 |
| Yes | Mode | 125 | 70 | 0 | 0 | 32.9 | 0.254 | 25 | 0 |
| Yes | Q3 | 167 | 82 | 36 | 167.25 | 38.775 | 0.73 | 44 | 8 |
| Yes | Max | 199 | 114 | 99 | 846 | 67.1 | 2288.00 | 70 | 17 |

| Variable | Crisp Discretization | Source of Prior Information |
|---|---|---|
| Glucose | Two classes, split at 140 mg/dL | Araki et al., 2020 in [46] |
| Blood Pressure | 60–80 mmHg; 81–89 mmHg; ≥90 mmHg | Tsujimoto and Kajio, 2018 in [46] |
| Skin Thickness | Two classes, split at 30 mm | Marrodan et al., 2015 and Khadilkar et al., 2015 in [46] |
| Insulin Level | 1–283 U/mL; 284–565 U/mL; 566–846 U/mL | Equation (1) |
| BMI | Two classes, split at 30 kg/m² | Nuttall, 2015 in [46] |
| Diabetes Pedigree Function | <0.4; 0.4–0.8; >0.8 | Survey, 2017 in [46] |
| Age | Two classes, split at 35 years | Lampinen et al., 2009 in [46] |
| Pregnancies | Two classes, split at 4 times | Karegowda et al., 2012 in [46] |

| Continuous Variable | Discretization Term | Interval (FDTID3-1) | Interval (FDTID3-2) | Interval (FDTID3-3) | Interval (FDTID3-4) | Interval (FDTID3-5) | Interval (FDTID3-6) |
|---|---|---|---|---|---|---|---|
| Glucose | Low | [44, 60] | [44, 60] | [44, 62] | [44, 62] | [44, 64] | [44, 66] |
| Glucose | Normal | [60, 140] | [60, 140] | [59, 141] | [59, 141] | [58, 142] | [57, 143] |
| Glucose | High | [140, 200] | [140, 200] | [138, 200] | [138, 200] | [136, 200] | [134, 200] |
| Blood Pressure | Normal | [24, 80] | [24, 80] | [24, 82] | [24, 82] | [24, 84] | [24, 84] |
| Blood Pressure | Pre-Hypertension | [79, 91] | [80, 90] | [79, 91] | [79, 91] | [77, 93] | [77, 93] |
| Blood Pressure | Hypertension | [90, 122] | [90, 122] | [88, 122] | [88, 122] | [86, 122] | [86, 122] |
| Skin Thickness | Normal | [7, 30] | [7, 31] | [7, 31] | [7, 33] | [7, 32] | [7, 35] |
| Skin Thickness | Thick | [30, 99] | [29, 99] | [28, 100] | [28, 99] | [28, 99] | [24, 99] |
| Insulin Level | Normal | [0, 166] | [0, 167] | [0, 166] | [0, 168] | [0, 168] | [0, 170] |
| Insulin Level | High | [166, 846] | [159, 846] | [166, 846] | [164, 846] | [164, 846] | [162, 846] |
| BMI | Normal | [18, 30] | [18, 31] | [18, 30] | [18, 32] | [18, 32] | [18, 36] |
| BMI | Obesity | [30, 68] | [29, 68] | [30, 68] | [28, 68] | [28, 68] | [24, 68] |
| Diabetes Pedigree Function | Low | [0, 0.4] | [0, 0.4] | [0, 0.4] | [0, 0.4] | [0, 0.5] | [0, 0.6] |
| Diabetes Pedigree Function | Normal | [0.4, 0.8] | [0.4, 0.8] | [0.4, 0.8] | [0.4, 0.8] | [0.2, 1] | [0.3, 0.9] |
| Diabetes Pedigree Function | High | [0.8, 2329] | [0.8, 2329] | [0.8, 2329] | [0.8, 2329] | [0.7, 2329] | [0.6, 2329] |
| Age | Young | [21, 35] | [21, 36] | [21, 35] | [21, 37] | [21, 39] | [21, 39] |
| Age | Old | [35, 81] | [34, 81] | [35, 81] | [33, 81] | [31, 81] | [31, 81] |
| Pregnancies | Normal | [1, 4] | [1, 5] | [1, 5] | [1, 5] | [1, 6] | [1, 7] |
| Pregnancies | High | [4, 17] | [3, 17] | [3, 17] | [3, 17] | [2, 17] | [3, 17] |

| Data | Status | Iteration 1 | Iteration 2 | Iteration 3 | Iteration 4 | Iteration 5 |
|---|---|---|---|---|---|---|
| Learning | No | 400 | 400 | 397 | 401 | 402 |
| Learning | Yes | 214 | 214 | 218 | 213 | 213 |
| Learning | Sum | 614 | 614 | 615 | 614 | 615 |
| Testing | No | 100 | 100 | 103 | 99 | 98 |
| Testing | Yes | 54 | 54 | 50 | 55 | 55 |
| Testing | Sum | 154 | 154 | 153 | 154 | 153 |
| Total | | 768 | 768 | 768 | 768 | 768 |

| Actual DM Status | Predicted Yes | Predicted No | Sum |
|---|---|---|---|
| Yes | 22 | 32 | 54 |
| No | 5 | 95 | 100 |
| Sum | 27 | 127 | 154 |
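
Accuracy, recall, precision, and F1-score follow directly from the four cells of a confusion matrix such as the one above. The sketch below applies the standard definitions to this matrix and to the CHD matrix shown earlier, treating "Yes" as the positive class and reading the first count column as the predicted "Yes" column (an assumption based on the row and column sums). The function name `confusion_metrics` is hypothetical. The results describe one test fold only, so they are not expected to match values averaged over all folds.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "f1": f1}


# Counts read from the two confusion matrices (rows: actual status, columns: predicted status).
chd = confusion_metrics(tp=27, fn=0, fp=0, tn=32)
dm = confusion_metrics(tp=22, fn=32, fp=5, tn=95)

if __name__ == "__main__":
    print({k: round(v, 4) for k, v in chd.items()})  # every metric equals 1.0
    print({k: round(v, 4) for k, v in dm.items()})
    # accuracy 0.7597, recall 0.4074, precision 0.8148, f1 0.5432
```

For the diabetes fold, precision (22 of 27 predicted positives are correct) is much higher than recall (22 of 54 actual positives are found), which is why the F1-score sits well below the accuracy.
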

| Fuzzy Membership Functions Combination | Accuracy (%) | Recall (%) | Precision (%) | F1-Score (%) | AUC (%) |
|---|---|---|---|---|---|
| DTID3 | 91.54 | 92.91 | 91.19 | 91.95 | 87.68 | 
| FDTID3-1 | 92.19 | 95.76 | 92.46 | 94.06 | 90.61 | 
| FDTID3-2 | 92.58 | 95.97 | 92.90 | 94.38 | 91.10 | 
| FDTID3-3 | 91.59 | 94.58 | 91.81 | 93.13 | 89.29 | 
| FDTID3-4 | 92.19 | 95.78 | 93.37 | 94.54 | 91.46 | 
| FDTID3-5 | 93.23 | 96.99 | 92.95 | 94.89 | 89.64 | 
| FDTID3-6 | 93.11 | 92.30 | 97.60 | 94.86 | 88.54 | 

| Metric | Source of Variation | Sum of Squares | Mean Square | F | p-Value | F-Critical |
|---|---|---|---|---|---|---|
| Accuracy | between | 177.41 | 29.57 | 1107.70 | 3.23 × 10⁻²²⁴ | 2.12 |
| Accuracy | within | 9.34 | 0.03 | | | |
| Recall | between | 488.61 | 81.43 | 258.53 | 2.51 × 10⁻¹²⁵ | 2.12 |
| Recall | within | 110.25 | 0.31 | | | |
| Precision | between | 212.58 | 35.43 | 385.79 | 6.21 × 10⁻¹⁵¹ | 2.12 |
| Precision | within | 32.14 | 0.09 | | | |
| F1-score | between | 97.40 | 16.23 | 391.47 | 6.74 × 10⁻¹⁵² | 2.12 |
| F1-score | within | 14.51 | 0.04 | | | |
| AUC | between | 251.12 | 41.85 | 418.52 | 2.50 × 10⁻¹⁵⁶ | 2.12 |
| AUC | within | 35.00 | 0.10 | | | |

| Comparison Model (absolute mean difference) | Accuracy | Recall | Precision | F1-Score | AUC |
|---|---|---|---|---|---|
| FDTID3-1 vs. FDTID3-2 | 1.37 | 4.00 | 1.03 | 1.30 | 1.86 | 
| FDTID3-1 vs. FDTID3-3 | 1.71 | 1.32 | 1.12 | 1.21 | 1.22 | 
| FDTID3-1 vs. FDTID3-4 | 1.32 | 1.42 | 0.61 | 0.98 | 2.19 | 
| FDTID3-1 vs. FDTID3-5 | 2.09 | 1.73 | 1.32 | 1.51 | 0.40 | 
| FDTID3-1 vs. FDTID3-6 | 0.37 | 0.51 | 0.17 | 0.33 | 0.05 | 
| FDTID3-1 vs. DTID3 | 0.58 | 1.58 | 0.42 | 0.49 | 1.64 | 
| FDTID3-2 vs. FDTID3-3 | 0.34 | 2.68 | 2.14 | 0.08 | 1.00 | 
| FDTID3-2 vs. FDTID3-4 | 0.05 | 2.58 | 1.64 | 0.32 | 1.98 | 
| FDTID3-2 vs. FDTID3-5 | 0.72 | 2.27 | 2.35 | 0.22 | 0.19 | 
| FDTID3-2 vs. FDTID3-6 | 1.00 | 3.49 | 1.20 | 0.97 | 0.17 | 
| FDTID3-2 vs. DTID3 | 0.79 | 2.42 | 0.61 | 0.81 | 0.64 | 
| FDTID3-3 vs. FDTID3-4 | 0.39 | 0.10 | 0.50 | 0.23 | 0.34 | 
| FDTID3-3 vs. FDTID3-5 | 0.37 | 0.41 | 0.20 | 0.30 | 1.45 | 
| FDTID3-3 vs. FDTID3-6 | 1.34 | 0.82 | 0.94 | 0.89 | 1.81 | 
| FDTID3-3 vs. DTID3 | 1.14 | 0.26 | 1.53 | 0.72 | 0.98 | 
| FDTID3-4 vs. FDTID3-5 | 0.76 | 0.31 | 0.71 | 0.53 | 0.81 | 
| FDTID3-4 vs. FDTID3-6 | 0.95 | 0.92 | 0.44 | 0.65 | 1.17 | 
| FDTID3-4 vs. DTID3 | 0.75 | 0.16 | 1.03 | 0.49 | 1.79 | 
| FDTID3-5 vs. FDTID3-6 | 1.72 | 1.23 | 1.15 | 1.19 | 2.15 | 
| FDTID3-5 vs. DTID3 | 1.51 | 0.15 | 1.74 | 1.02 | 1.79 | 
| FDTID3-6 vs. DTID3 | 1.72 | 1.23 | 1.15 | 1.19 | 0.22 | 

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kresnawati, E.S.; Suprihatin, B.; Resti, Y. The Combinations of Fuzzy Membership Functions on Discretization in the Decision Tree-ID3 to Predict Degenerative Disease Status. Symmetry 2024, 16, 1560. https://doi.org/10.3390/sym16121560