Heuristic Weight Initialization for Diagnosing Heart Diseases Using Feature Ranking
Abstract
1. Introduction
- the parameters of the linear model (feature weights) are computed by heuristics instead of gradient descent;
- higher values are achieved on several evaluation metrics than in previous research;
- feature ranking is determined by heuristics, which guarantees a single solution, unlike Random Forest, and does not suffer from the class imbalance problem.
2. Related Work
3. The Proposed Methodology
3.1. Heart Disease Data and the Proposed Model
3.2. The Proposed ML Model
Algorithm 1. Feature Ranking Algorithm
4. Experiments
4.1. Data Preprocessing
4.2. Evaluation Metrics
4.3. Results of Existing Algorithms
4.4. Results of the Proposed Method
- It does not require normalized data, i.e., it is scale invariant;
- It determines the applicable border even in datasets that are class-imbalanced;
- It can replace the three metrics Precision, Recall, and F1 score, i.e., it is more stable;
- It is unaffected by class locations in the two intervals;
- It can be used to determine a fine-tuned threshold in logistic regression.
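The scale-invariance property listed above can be illustrated with any criterion that depends only on the ordering of the feature values. The sketch below is a minimal illustration under stated assumptions: it uses Youden's J statistic (TPR minus FPR) as a stand-in criterion, not the paper's Equation (3), and the function name `best_split` and the toy data are hypothetical.

```python
from typing import Sequence, Tuple


def best_split(values: Sequence[float], targets: Sequence[int]) -> Tuple[int, float]:
    """Return (k, j) for the best border between two intervals.

    Objects are sorted by feature value; the first k objects fall into the
    "negative" interval and the remaining ones into the "positive" interval.
    The criterion j is Youden's J (TPR - FPR), a stand-in chosen for this
    illustration only; it is NOT the paper's Equation (3).
    Assumes both classes are present in `targets`.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    v = [values[i] for i in order]
    y = [targets[i] for i in order]
    n_pos = sum(y)
    n_neg = len(y) - n_pos

    best_k, best_j = 0, float("-inf")
    for k in range(1, len(v)):
        if v[k] == v[k - 1]:
            continue  # a border cannot separate equal feature values
        tp = sum(y[k:])            # positives in the upper interval
        fp = (len(y) - k) - tp     # negatives in the upper interval
        j = tp / n_pos - fp / n_neg
        if j > best_j:
            best_k, best_j = k, j
    return best_k, best_j


# Hypothetical toy feature and binary target.
values = [3.0, 1.0, 4.0, 1.5, 5.0, 9.0, 2.6, 5.3]
targets = [0, 0, 1, 0, 1, 1, 0, 1]

k_raw, j_raw = best_split(values, targets)
k_scaled, j_scaled = best_split([1000.0 * x for x in values], targets)
print(k_raw, k_scaled)  # the same border position for both scales
```

Because the search only looks at the sorted order, multiplying the feature by a positive constant leaves the chosen border position unchanged. The same argument would apply to any other rank-based criterion.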
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
# | Name of Feature | Type of Feature |
---|---|---|
1 | Age of the patient (Age) | Integer |
2 | The patient has or does not have anemia (Anemia) | Yes/no |
3 | Level of creatinine phosphokinase enzyme in the blood (mcg/L) (Creatinine phosphokinase) | Real |
4 | The patient has or does not have diabetes (Diabetes) | Yes/no |
5 | The patient has or does not have hypertension (High blood pressure) | Yes/no |
6 | Percentage of blood leaving the heart at each contraction (Ejection fraction) | Real |
7 | Platelets in the blood (kiloplatelets/mL) (Platelets) | Real |
8 | Male or female (Sex) | Binary |
9 | Level of serum creatinine in the blood (mg/dL) (Serum creatinine) | Real |
10 | Level of serum sodium in the blood (mEq/L) (Serum sodium) | Real |
11 | The patient smokes or does not smoke (Smoking) | Yes/no |
12 | The follow-up period in days (Period) | Integer |
13 | The patient died or survived during the follow-up period | Target (Death/Survived) |
The raw objects:
Time periods | 41 | 58 | 85 | 65 | 69 | 60 | 70 | 42 | 75 | 55 | 70 | 67 | 60 | 79 | 59 | 51 | 55 | 65 | 44 | 57 |
Target values | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
The sorted objects:
Time periods | 41 | 42 | 44 | 51 | 55 | 55 | 57 | 58 | 59 | 60 | 60 | 65 | 65 | 67 | 69 | 70 | 70 | 75 | 79 | 85 |
Target values | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
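The sorting step shown in the table above can be reproduced in a few lines. This is a minimal sketch that only covers the reordering, not the ranking criterion itself; the variable names are illustrative.

```python
# The 20 raw objects from the table: one feature value and one target each.
periods = [41, 58, 85, 65, 69, 60, 70, 42, 75, 55,
           70, 67, 60, 79, 59, 51, 55, 65, 44, 57]
targets = [0, 0, 1, 0, 1, 1, 0, 0, 0, 0,
           0, 0, 1, 0, 1, 0, 0, 0, 0, 0]

# Sort the objects by feature value; each target travels with its object.
pairs = sorted(zip(periods, targets), key=lambda pair: pair[0])
sorted_periods = [p for p, _ in pairs]
sorted_targets = [t for _, t in pairs]

# Positions (0-based) of class-1 objects in the sorted order.
positive_positions = [i for i, t in enumerate(sorted_targets) if t == 1]

print(sorted_periods)
print(sorted_targets)
print(positive_positions)
```

Once the objects are ordered, candidate borders between the two intervals lie between neighbouring distinct values, and the class composition on each side can be read directly from `sorted_targets`.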
Feature | Rank by Equation (3) | t | Precision | Recall | F1 Score | Accuracy |
---|---|---|---|---|---|---|
Period | 0.52 | −1 | 0.84/0.83 | 0.65/0.64 | 0.73/0.72 | 0.85/0.84 |
Serum creatinine | 0.34 | 1 | 0.63/0.60 | 0.48/0.46 | 0.54/0.51 | 0.74/0.72 |
Ejection fraction | 0.33 | −1 | 0.57/0.55 | 0.52/0.50 | 0.54/0.52 | 0.71/0.71 |
Serum sodium | 0.30 | −1 | 0.50/0.48 | 0.48/0.46 | 0.49/0.46 | 0.68/0.67 |
Age | 0.30 | 1 | 0.50/0.48 | 0.43/0.42 | 0.46/0.44 | 0.68/0.67 |
Platelets in the blood | 0.26 | −1 | 0.40/0.35 | 0.38/0.33 | 0.39/0.33 | 0.62/0.59 |
High blood pressure | 0.26 | 1 | 0.37/0.37 | 0.41/0.40 | 0.39/0.38 | 0.59/0.58 |
Anemia | 0.25 | −1 | 0.36/0.35 | 0.48/0.47 | 0.41/0.40 | 0.55/0.55 |
Creatinine phosphokinase | 0.25 | 1 | 0.35/0.28 | 0.46/0.37 | 0.40/0.32 | 0.56/0.50 |
Diabetes | 0.25 | −1 | 0.33/0.28 | 0.52/0.45 | 0.40/0.34 | 0.51/0.46 |
Sex | 0.25 | 1 | 0.33/0.28 | 0.50/0.43 | 0.39/0.33 | 0.52/0.47 |
Smoking | 0.24 | 1 | 0.33/0.29 | 0.58/0.51 | 0.41/0.36 | 0.49/0.45 |
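The metric columns in the table above follow the standard definitions, and the score column is numerically consistent with the harmonic mean of precision and recall (for example, 0.84 and 0.65 give 0.73). The sketch below computes the four metrics from confusion-matrix counts; the function names and the counts in the usage example are hypothetical and are not taken from the paper.

```python
from typing import Dict


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Precision, recall, F1, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=13, fp=3, fn=7, tn=37)
print({name: round(value, 2) for name, value in metrics.items()})

# Consistency check against two rows of the table (first value of each pair).
print(round(f1_from(0.84, 0.65), 2))
print(round(f1_from(0.63, 0.48), 2))
```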
Model | Precision | Recall | F1 Score | Accuracy |
---|---|---|---|---|
SVM | 0.80/0.82 | 0.65/0.65 | 0.72/0.72 | 0.80/0.84 |
MLP | 1.00/0.61 | 1.00/0.58 | 1.00/0.59 | 1.00/0.74 |
kNN (3-neighbors) | 0.88/0.64 | 0.51/0.32 | 0.65/0.41 | 0.82/0.72 |
Nearest neighbor | 1.00/0.48 | 1.00/0.38 | 1.00/0.42 | 1.00/0.66 |
Random Forest | 0.93/0.85 | 0.74/0.65 | 0.83/0.73 | 0.90/0.85 |
Gradient Boosting Classifier | 1.00/0.76 | 1.00/0.72 | 1.00/0.73 | 1.00/0.83 |
Feature Group | Rank by Equation (3) | Precision | Recall | F1 Score | Accuracy |
---|---|---|---|---|---|
Period, Serum creatinine | 0.52 | 0.80/0.76 | 0.68/0.65 | 0.74/0.70 | 0.84/0.82 |
Period, Serum creatinine, Ejection fraction | 0.53 | 0.80/0.79 | 0.70/0.69 | 0.75/0.73 | 0.85/0.84 |
Period, Serum creatinine, Ejection fraction, Serum sodium | 0.53 | 0.81/0.78 | 0.69/0.67 | 0.74/0.72 | 0.85/0.83 |
Period, Serum creatinine, Ejection fraction, Serum sodium, Age | 0.55 | 0.77/0.75 | 0.76/0.74 | 0.77/0.74 | 0.85/0.83 |
Period, Serum creatinine, Ejection fraction, Serum sodium, Age, Platelets in the blood | 0.54 | 0.81/0.78 | 0.70/0.68 | 0.75/0.73 | 0.85/0.84 |
Period, Age, High blood pressure | 0.48 | 0.72/0.53 | 0.71/0.76 | 0.71/0.62 | 0.82/0.71 |
Period, Age, High blood pressure, Sex, Smoking | 0.48 | 0.74/0.42 | 0.69/0.89 | 0.71/0.57 | 0.82/0.58 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lolaev, M.; Naik, S.M.; Paul, A.; Chehri, A. Heuristic Weight Initialization for Diagnosing Heart Diseases Using Feature Ranking. Technologies 2023, 11, 138. https://doi.org/10.3390/technologies11050138