Heart Failure Prediction Through a Comparative Study of Machine Learning and Deep Learning Models

Qadeer, Mohid; Ayaz, Rizwan; Thohir, Muhammad Ikhsan

doi:10.3390/engproc2025107061

Open Access Proceeding Paper

Heart Failure Prediction Through a Comparative Study of Machine Learning and Deep Learning Models †

by Mohid Qadeer ¹, Rizwan Ayaz ²,* and Muhammad Ikhsan Thohir ³

¹ Department of Software Engineering, University of Sialkot, Sialkot 51040, Pakistan
² School of Computer Science, Taylor’s University, Subang Jaya 47500, Malaysia
³ Department of Information Technology, Nusa Putra University, Sukabumi 43152, West Java, Indonesia
* Author to whom correspondence should be addressed.
† Presented at the 7th International Global Conference Series on ICT Integration in Technical Education & Smart Society, Aizuwakamatsu City, Japan, 20–26 January 2025.

Eng. Proc. 2025, 107(1), 61; https://doi.org/10.3390/engproc2025107061

Published: 4 September 2025

(This article belongs to the Proceedings of The 7th International Global Conference Series on ICT Integration in Technical Education & Smart Society)


Abstract

The heart is essential to human life, so it is important to protect it and to understand the damage it can sustain. Many heart-related diseases can ultimately lead to heart failure. To help address this, a tool for predicting patient survival is needed. This study explores the use of several classification models for forecasting heart failure outcomes using the Heart Failure Clinical Records dataset. It contrasts a deep learning (DL) model, the Convolutional Neural Network (CNN), with several machine learning (ML) models, including Random Forest (RF), K-Nearest Neighbors (KNN), Decision Tree (DT), and Naïve Bayes (NB). Data processing techniques such as standard scaling and the Synthetic Minority Oversampling Technique (SMOTE) are used to improve prediction accuracy. The CNN model performs best, achieving 99% accuracy. In comparison, the best-performing ML model, Naïve Bayes, reaches 92.57%. This suggests that deep learning provides better predictions of heart failure, making it a useful tool for early detection and better patient care.

Keywords:

heart failure; deep learning; machine learning; convolutional neural network; classification models

1. Introduction

The heart is one of the most important organs because it pumps blood to every part of the body. This blood supplies the oxygen and nutrients that the body needs to live. Along with the brain, the heart acts as the body’s last line of defense against harmful substances. Heart rate is defined as the number of times the heart beats in a minute. For a healthy adult at rest, it is typically 60 to 100 beats per minute [1]. Heart disease is one of the top causes of death worldwide. Each year, more than 2 million Americans die from heart diseases and stroke. Heart disease is a condition that affects the blood vessels and causes problems with the circulation of the blood [2]. Rheumatic heart disease, coronary heart disease (CHD), and heart failure are some of the conditions that affect the heart and blood vessels [3]. People with heart disease have a high risk of illness and death, especially when they reach heart failure, which is the final stage of the disease. Heart failure is becoming one of the most common diseases because people are living longer and treatments for heart attacks are improving, allowing those with weakened hearts to live longer [4]. Diagnosing heart failure requires a full assessment of signs, medical history, and diagnostic tests. However, as technology improves, machine learning (ML) techniques can help to improve the prediction of this condition. In recent years, there has been growing interest in how machine learning can assist in identifying and predicting heart diseases in the medical field [5].

2. Literature Review

One study proposes two new voting classifiers, Modified K-Means Voting-Based (MKMVB) and Generalized K-Means Voting-Based (GKMVB), for predicting heart disease. Both methods perform better than traditional approaches such as Artificial Neural Network (ANN) and Support Vector Machine (SVM) [6]. Another work introduces an interpretable technique for examining heart failure risk and predicting survival. It combines Control Burn, Survival Stacking, and Explainable Boosting Machines to achieve high performance and uncover both known and new risk factors [7]. A review examines the role of machine learning (ML) in diagnosing heart failure, identifying death probabilities, and evaluating prediction across subtypes, concentrating on a multifaceted approach to better patient management [8]. A further paper analyzes a dataset of 1200 patients obtained through the University of Illinois (UI). It comprises the electronic health records (EHRs) of patients who were 18 years or older during their first hospitalization. The study identifies five key themes, including other conditions linked to heart disease, providing better knowledge for patient care [9]. A Transformer deep learning model has been proposed to predict heart failure using the records of 100,071 patients, achieving high accuracy while identifying known and new risk factors through sensitivity analyses [10]. Another study finds that the Random Forest algorithm achieved 93.36% accuracy in heart failure survival prediction, outperforming the K-Nearest Neighbor (KNN) and Support Vector Machine (SVM) algorithms [11]. Naïve Bayes (NB) and K-Nearest Neighbor (KNN) have also been examined in RapidMiner for heart disease prediction, with a focus on model accuracy and risk factors [12].
By combining ML and biostatistical tests, one study examines a dataset of 299 heart failure patients from 2015 and identifies serum creatinine and ejection fraction as important risk factors for survival prediction and model development [13]. To undertake survival analysis and enhance prognostic predictions, another study examines the application of several classification models for predicting heart failure outcomes using the Heart Failure Clinical Records dataset [14]. Preprocessing techniques that address class imbalance in clinical datasets have been explored with the aim of improving mortality prediction in heart failure patients using machine learning models [15]. To improve the prediction model and solve issues in clinical trials, one study presents a random shuffle method applied specifically to a cohort of patients with heart failure [16]. To understand patient circumstances better, another study uses machine learning techniques and topic modeling to assess clinical notes from 1200 heart failure patients. It finds five hidden themes, one of which is centered on heart disease comorbidities [17]. A heart failure database was constructed by collating administrative claims data and electronic medical records from three Japanese hospitals; 2750 patients were analyzed to identify risk factors for in-hospital death and prolonged hospitalization based on early-hospitalization data [18]. Using correlation filters and the KNN method, another study finds important indicators such as anemia, high blood pressure, serum creatinine, sex, and diagnoses of heart diseases. It also introduces the “Heart Info System” app, which offers AI-based survival forecasts for heart conditions [19]. Figure 1 shows the standardized rate of deaths from coronary heart diseases in the EU [20].

3. Methodology

Heart failure is one of the leading causes of death worldwide, so it is very important to predict when patients might experience it. Doctors can then step in quickly and improve patient outcomes. This research uses the Heart Failure Clinical Records dataset to predict heart failure based on patients’ medical history. The overall methodology is shown in Figure 2. This study uses deep learning (DL) and machine learning (ML) models to make predictions [21]. It addresses challenges such as poor data quality and unequal class distribution by employing standard scaling (to put all features on a consistent scale) and SMOTE (to correct the imbalance in the dataset) [22]. These steps enhance the models’ predictive power and support earlier diagnosis. They also reduce the bias caused by imbalanced data, thus improving the accuracy of the prediction [23]. We use accuracy, precision, recall, and F1-score to evaluate the performance of these models.
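The two preprocessing steps named above can be illustrated with a minimal sketch in plain Python. It is not the authors' pipeline: the function names, the toy values, and the simplified interpolation routine `smote_like` are assumptions made for illustration, and a production workflow would normally rely on an established library implementation.

```python
import random
from statistics import mean, pstdev

def standard_scale(rows):
    """Scale every column to zero mean and unit (population) variance."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) or 1.0 for col in columns]  # guard constant columns
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority rows by interpolating between a random
    minority row and one of its k nearest minority neighbours (SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position on the segment from base to other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic

# Hypothetical toy rows: [age, ejection fraction]
majority = [[50, 60], [55, 55], [60, 50], [45, 65], [52, 58], [58, 52]]
minority = [[70, 25], [75, 20], [80, 30]]

scaled = standard_scale(majority + minority)
scaled_minority = scaled[len(majority):]
new_points = smote_like(scaled_minority, n_new=len(majority) - len(minority))
balanced_minority = scaled_minority + new_points
print(len(majority), len(balanced_minority))  # 6 6
```

In practice, the scaling statistics and the synthetic samples are usually derived from the training portion only, so that no information from the test set leaks into training.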

3.1. Machine Learning Techniques

ML techniques fall into three broad categories, which are discussed below.

3.2. Supervised Learning

In this approach, a model is trained using labeled data. Each data point is made up of an input and a corresponding correct output. The algorithm learns the patterns in this data and uses them to predict results for new, previously unseen inputs. This is a popular approach for many tasks, including classification and regression; examples include spam detection, medical diagnosis, and price prediction.

3.3. Unsupervised Learning

Unsupervised learning is a machine learning paradigm in which a model learns from data that does not have labeled outputs. No correct responses are supplied; instead, the model learns the patterns, interrelationships, and structures naturally found within the dataset. Tasks such as clustering, anomaly detection, and dimensionality reduction uncover hidden structure in the data.

3.4. Reinforcement Learning

This is a subfield of machine learning in which an agent learns by interacting with its environment. It learns from its own actions through reinforcement signals, receiving rewards or penalties, and improves through success and failure. The objective is to find the strategy that provides the maximum cumulative reward by exploring the environment and learning from the results. This method is commonly applied in robotics, gaming, and autonomous systems.

3.5. Dataset Description

This section describes the dataset and its contents. The dataset was obtained from an online source and is used to identify heart failure risk factors and forecast patient outcomes. Table 1 describes the features selected from the dataset.

3.6. Dataset for Heart Failure Clinical Records

(1) Content: This dataset is derived from clinical records of patients with heart failure. It aims to predict patient mortality during follow-up periods and investigate the factors causing heart failure. The dataset was taken from actual clinical settings and compiled into a tabular format, with rows representing patient records and columns representing clinical information. It is highly useful because it contains important elements that are directly linked to the health and prognosis of patients. The dataset is primarily supplied in numerical form, with binary attributes for categorical variables, to facilitate interpretability and accurate predictive modeling.

(2) Description: This part gives the details of the dataset, emphasizing its salient characteristics. It includes clinical variables associated with heart failure, including lifestyle factors, laboratory test findings, and patient age. Each feature represents an important aspect of the patient’s health. The dataset records whether the patient suffers from conditions such as diabetes, anemia, or high blood pressure, which are significant risk factors. Binary attributes such as sex (0 = female, 1 = male) and smoking status (0 = no, 1 = yes) provide demographic and lifestyle insights. The target attribute of the dataset, “DEATH_EVENT”, is binary and represents whether the patient died during the follow-up period (1 = yes, 0 = no). This label is used to classify the outcomes and assess the risk factors contributing to mortality.

3.7. Operator Description

3.7.1. Optimized Selection

This operator performs feature selection using both forward selection and backward elimination. Its primary purpose is to support feature selection, one of the core data mining activities, by determining the weights of features from their effect on the results. It has two connections: the normalizing example set operator and the filter examples example set.

3.7.2. Apply Model

This operator applies a trained model to the data produced by the split data operator. The model is first trained on the training data with a learning algorithm so that it learns the underlying patterns. It is then used to make predictions on new, previously unseen data and is evaluated on an independent subset, the validation set. This checks that the model generalizes and does not overfit. Metrics such as accuracy, precision, recall, and F1-score help assess the effectiveness of the model.

3.7.3. Split Data

To evaluate model performance, the split data operator separates the dataset into training and testing sets. A typical split ratio is 70:30, meaning that 70% of the data is used for training and 30% is used for testing. This operator ensures that the models are tested on a different subset from the one they were trained on, so that their accuracy can be verified. The 80:20 and 90:10 splits are frequently used for larger datasets. For a dataset with 5000 entries, an 80:20 split is a suitable option.
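The 70:30 split described above can be sketched in a few lines of plain Python. This is an illustrative stand-in for the split data operator, not its actual implementation; the function name `split_data`, the fixed seed, and the 100 placeholder records are assumptions.

```python
import random

def split_data(rows, train_ratio=0.7, seed=42):
    """Shuffle a copy of the rows and cut it into training and testing parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

records = list(range(100))  # placeholder record identifiers
train, test = split_data(records, train_ratio=0.7)
print(len(train), len(test))  # 70 30
```

Passing `train_ratio=0.8` or `train_ratio=0.9` gives the 80:20 and 90:10 splits mentioned for larger datasets.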

3.7.4. Performance Evaluation

The performance binomial operator assesses the accuracy and classification capabilities of the model by comparing its predictions with the labeled data. It reports the following counts:

True positive (TP): positive cases correctly predicted as positive.
True negative (TN): negative cases correctly predicted as negative.
False positive (FP): negative cases incorrectly predicted as positive.
False negative (FN): positive cases incorrectly predicted as negative.
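The four counts above determine the metrics used in this study (accuracy, precision, recall, and F1-score). The following sketch shows the standard formulas on a small made-up set of labels; the function names and the toy labels are assumptions for illustration only.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, and FN for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn

def classification_metrics(tp, tn, fp, fn):
    """Derive accuracy, precision, recall (sensitivity), and F1 from the counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy labels: 1 = death event, 0 = survived (made-up values)
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(*counts)
print(counts)   # (3, 3, 1, 1)
print(metrics)  # every metric equals 0.75 for this toy example
```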

3.7.5. Replace Missing Values

The user only needs to check for missing values while importing the data; if values are missing, this operator replaces them appropriately. Otherwise, the operator can be omitted.

3.7.6. Filter Example

The filter example operator handles the dataset’s missing values. The user defines the conditions for dealing with them, and the operator ensures that all attributes are complete and useful for analysis. In any real dataset, missing values are a common occurrence that can affect the performance and accuracy of a machine learning model. Missing data may be caused by a variety of reasons, such as data entry mistakes, sensor malfunction, or survey non-response. Missing value handling is therefore a critical step in the data preprocessing stage. Here, the filter example operator plays an important role by allowing users to define the specific conditions under which examples with missing values are treated or excluded from analysis.

The filter example operator is adaptive when dealing with missing data. Instead of using a blanket rule to treat all missing values, it allows the user to construct customized filtering conditions based on analysis requirements. For example, one may choose to remove examples (rows) where a specific percentage of attributes are missing, or retain only records where significant attributes are complete. This selective filtering only passes on relevant and complete data to subsequent stages of the model-building process.
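The selective filtering described above can be sketched as follows. This is an illustrative approximation in plain Python rather than the operator itself; the function name `filter_examples`, its parameters, and the toy records are assumptions, with `None` standing for a missing value.

```python
def filter_examples(rows, required, max_missing_fraction=0.25):
    """Keep rows whose required attributes are present and whose share of
    missing attributes does not exceed max_missing_fraction."""
    kept = []
    for row in rows:
        missing = [name for name, value in row.items() if value is None]
        if any(name in missing for name in required):
            continue  # a significant attribute is missing
        if len(missing) / len(row) > max_missing_fraction:
            continue  # too many attributes are missing overall
        kept.append(row)
    return kept

# Toy records (made-up values)
rows = [
    {"age": 60, "ejection_fraction": 38, "serum_creatinine": 1.1, "smoking": 0},
    {"age": 72, "ejection_fraction": 25, "serum_creatinine": None, "smoking": 1},
    {"age": 55, "ejection_fraction": 45, "serum_creatinine": 0.9, "smoking": None},
    {"age": None, "ejection_fraction": 30, "serum_creatinine": 1.8, "smoking": None},
]

clean = filter_examples(rows, required=["ejection_fraction", "serum_creatinine"])
print([r["ejection_fraction"] for r in clean])  # [38, 45]
```

The second record is dropped because a required attribute is missing, and the fourth because half of its attributes are missing.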

3.7.7. Convolutional Neural Network

This is a type of feedforward neural network. The special design of a CNN makes it excellent at learning and identifying features, which is very helpful for tasks like image recognition.

A typical CNN consists of four main layers: the input layer, convolutional layer, pooling layer, and output layer.

The input layer takes raw image data, often in the form of pixel values. Each image is represented as a multidimensional array where every channel (for instance, red, green, and blue in a color image) is stored separately.
The convolutional layer is the basic building block of a CNN. It employs several learnable filters (or kernels). Each filter is learned to recognize specific patterns such as edges, textures, or shapes. The filters are not predefined, but learned automatically during training.
Following convolution, a non-linear activation function (most often ReLU) is used to introduce non-linearity in the model so that it can learn more complicated patterns.
The pooling layer is also responsible for downsampling the feature maps. It lowers the spatial dimensions (width and height) of the data, decreasing computational burden, controlling overfitting, and strengthening the network against input variations. Max pooling is the most commonly employed scheme, which selects the maximum value in each sub-region.
The output layer is usually a fully connected (dense) layer with softmax activation for classification problems. This layer produces the final predictions by encapsulating all the learned features.

Their design allows CNNs to learn local features (e.g., edges) first and then combine them to recognize more abstract forms at deeper levels (e.g., faces or objects). This ability to learn features automatically, without manual extraction, makes CNNs extremely versatile, and they are applied in a wide range of areas beyond image processing, including video processing, natural language processing, and even time-series forecasting. Figure 3 shows the evaluation metrics results of the CNN classifier.
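To make the layer descriptions above concrete, the following sketch runs a one-dimensional toy input through a convolution, a ReLU activation, max pooling, and a dense softmax output. It is only a forward pass with hand-picked weights, not the network trained in this study; the kernel, the dense weights, and the input signal are assumptions, whereas a real CNN learns its filters during training.

```python
import math

def conv1d(signal, kernel):
    """Slide the kernel over the signal (valid padding, stride 1)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def relu(values):
    """Non-linear activation: negative responses are set to zero."""
    return [max(0.0, v) for v in values]

def max_pool1d(values, size=2):
    """Downsample by keeping the maximum of each non-overlapping window."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

def dense(features, weights, biases):
    """Fully connected layer: one weighted sum per output class."""
    return [sum(f * w for f, w in zip(features, row)) + b
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Turn raw scores into probabilities that sum to one."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

signal = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0]  # toy input
edge_kernel = [-1.0, 1.0]                      # hand-picked rising-edge filter

feature_map = relu(conv1d(signal, edge_kernel))
pooled = max_pool1d(feature_map, size=2)
probs = softmax(dense(pooled,
                      weights=[[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]],
                      biases=[0.0, 0.0]))
print(feature_map)  # [0.0, 1.0, 0.0, 0.0, 0.0, 1.0]
print(pooled)       # [1.0, 0.0, 1.0]
```

The filter responds at the two positions where the signal rises, pooling halves the length of the feature map, and the softmax output assigns the higher probability to the first class.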

3.8. KNN Classifier

The K-Nearest Neighbor (KNN) classifier stores the dataset during training and classifies a new data point by the majority class of its closest neighbors. For classification problems, the KNN algorithm assigns the most common class label among the k nearest points. For instance, when k = 3, the algorithm finds the three points nearest to the new point and assigns it the label chosen by the majority vote of those neighbors. For this dataset, k = 1 is used, meaning that the algorithm classifies a new point solely on the basis of the class of its nearest neighbor.

KNN works well when the decision boundaries are non-linear and irregular because it does not assume anything about the data distribution. It may be computationally expensive on large datasets, however, because it needs to calculate distances to all the training points during prediction. To reduce this cost, optimization methods such as KD-Trees or Ball Trees are sometimes used.
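The voting rule described above can be written as a short brute-force sketch. The function name `knn_predict` and the toy points are assumptions; the example is chosen so that k = 1 and k = 3 give different answers, which shows how the choice of k affects the prediction.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=1):
    """Label the query with the majority class of its k nearest training points."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy two-feature training set (made-up values)
train_X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [3.0, 3.0], [4.0, 4.0]]
train_y = [0, 0, 0, 1, 1]
query = [2.2, 2.2]

print(knn_predict(train_X, train_y, query, k=1))  # 1 (nearest point is [3.0, 3.0])
print(knn_predict(train_X, train_y, query, k=3))  # 0 (two of the three nearest are class 0)
```

This version computes the distance to every training point, which is the cost that KD-Trees and Ball Trees are designed to reduce.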

3.9. Random Forest

Random Forest is an ensemble of decision trees. The “number of trees” parameter controls the quantity of decision trees in the forest. The model’s performance is generally enhanced by increasing the number of trees, at the cost of increased computation time. Each decision tree in the forest is constructed by recursively splitting the training data on selected attributes. Each node represents a splitting rule for a single attribute, and the branches divide the data further. At each node, the best splitting rule is chosen based on metrics such as Gini impurity (for classification) or mean squared error (for regression).
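The choice of a splitting rule by Gini impurity can be illustrated for a single numeric attribute. The sketch below is a simplified view of what happens at one node of one tree; the function names and the toy values are assumptions, and a full Random Forest additionally samples rows and attributes at random for each tree.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_threshold(values, labels):
    """Return the (threshold, weighted Gini) of the rule 'value <= threshold'
    that yields the purest pair of child nodes."""
    best = (None, float("inf"))
    for threshold in sorted(set(values))[:-1]:  # largest value cannot split
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best

# Toy data: ejection fraction (%) and death event (made-up values)
ejection_fraction = [20, 25, 30, 38, 45, 60]
death_event = [1, 1, 1, 0, 0, 0]

print(gini(death_event))                               # 0.5
print(best_threshold(ejection_fraction, death_event))  # (30, 0.0)
```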

3.10. Naïve Bayes

The “naïve” part of the algorithm comes from the assumption that all features used in the classification task are independent of each other, given the class label. While this assumption rarely holds in real-world data, the algorithm still tends to perform well in practice, especially when the feature interdependence is weak or not significantly impactful. There are several types of Naïve Bayes classifiers, including the following:

Gaussian Naïve Bayes, used when features are normally distributed;
Multinomial Naïve Bayes, commonly used in text classification where word counts are important;
Bernoulli Naïve Bayes, suitable for binary/Boolean features (e.g., word presence or absence).

One major advantage of Naïve Bayes is its computational efficiency. It is particularly useful when working with large datasets and high-dimensional inputs such as text, where models like decision trees or neural networks can become computationally expensive.
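The independence assumption makes the Gaussian variant simple to sketch: each class stores a prior plus one mean and variance per feature, and the per-feature log-likelihoods are added together. The code below is a minimal illustration, not the configuration used in this study; the function names, the small variance floor, and the toy values are assumptions.

```python
import math
from statistics import mean, pvariance

def fit_gaussian_nb(X, y):
    """Store, per class, its prior and a (mean, variance) pair for each feature."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        stats = [(mean(col), pvariance(col) + 1e-9)  # floor avoids zero variance
                 for col in zip(*rows)]
        model[label] = (len(rows) / len(X), stats)
    return model

def predict_gaussian_nb(model, x):
    """Return the class with the highest log-posterior for feature vector x."""
    def log_posterior(prior, stats):
        total = math.log(prior)
        for value, (mu, var) in zip(x, stats):
            # Features are treated as independent given the class,
            # so their Gaussian log-likelihoods are simply summed.
            total += -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)
        return total
    return max(model, key=lambda label: log_posterior(*model[label]))

# Toy rows: [serum creatinine, ejection fraction] (made-up values)
X = [[1.0, 60.0], [1.1, 55.0], [0.9, 50.0], [2.5, 25.0], [3.0, 20.0], [2.8, 30.0]]
y = [0, 0, 0, 1, 1, 1]

model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, [2.7, 22.0]))  # 1
print(predict_gaussian_nb(model, [1.0, 58.0]))  # 0
```

Training only requires one pass over the data to compute the per-class statistics, which is the source of the efficiency noted above.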

3.11. Deep Learning

DL attempts to model high-level abstractions in data with the help of architectures composed of multiple non-linear transformations. DL is more commonly known as neural networks to many, but it encompasses a broad variety of architectures, such as the following:

CNN for image processing;
RNN for sequence modeling;
Transformers for natural language processing and others;
Autoencoders and Generative Adversarial Networks (GANs) for generative modeling.

DL algorithms can deal with unstructured data such as images, sound, and text. Traditional machine learning models require feature engineering, selecting and transforming input variables by hand. Deep learning, however, is able to learn these features from raw data, as long as there is ample computational power and training data.

Deep learning became practical and prevalent due to improvements in computational hardware (specifically Graphics Processing Units (GPUs) and Tensor Processing Units (TPUs)), access to large datasets, and advances in training algorithms such as backpropagation and gradient descent optimization techniques. Libraries like TensorFlow and PyTorch have also helped make it popular.

The methods in deep learning are currently employed in self-driving cars, voice recognition, machine translation, medical diagnosis, detecting fraud, recommendation systems, and robots. For example, self-driving cars use deep learning models to process the sensor data, while virtual assistants like Siri and Alexa rely on deep learning for natural language processing.

3.12. Decision Tree

This operation produces a decision tree, which is used for regression and classification. It is an operator with subprocesses that creates a tree model after receiving an example set. Decision trees can handle both numeric and categorical data and are widely used because they are interpretable and easy to comprehend. One of the most common algorithms used to build decision trees is Classification and Regression Trees (CART), which typically uses Gini impurity to choose splits. Other well-known algorithms include ID3 and C4.5, which use information gain and gain ratio, respectively, to decide the best attribute on which to split the data at any point.

The major advantages of decision trees are the following:

They are simple to interpret and visualize;
They do not require much data preprocessing (e.g., no feature scaling);
They can model non-linear relationships in data;
They handle both classification and regression problems effectively.

Decision trees also have limitations. They can be prone to overfitting, especially if the tree is extremely deep. Overfitting occurs when the model performs well on the training data but poorly on new data. To avoid this, techniques such as pruning, maximum depth specification, or ensemble methods such as Random Forest or Gradient Boosting are used.

In practice, machine learning workflows often use decision trees in ensemble learning. For example, Random Forest builds multiple decision trees and averages their outputs for improved accuracy and stability. Gradient Boosted Trees build trees sequentially to correct errors in predictions.

Applications of decision trees are widespread. They are utilized in medical diagnosis, credit scoring, fraud detection, and customer segmentation, just to name a few. Due to their ability to represent rules in an explicit form, they also fit into domains where explainability is crucial.

This section examines the results and discusses them. In Table 2, Naïve Bayes shows the highest accuracy with SMOTE upsampling and the lowest classification error, at 88.21% accuracy.

With SMOTE upsampling, the Convolutional Neural Network (CNN) outperformed all other models in terms of accuracy (99.53%), as indicated in Table 2. CNN also demonstrated its efficacy in handling imbalanced data by outperforming the other algorithms in terms of F-measure (99.56%), precision (99.45%), and sensitivity (99.67%).

4. Conclusions

Heart failure is responsible for almost 8.5% of all heart disease deaths and 36% of cardiovascular disease (CVD) deaths worldwide. It is a top cause of death in people with obesity and diabetes. Early detection is critical to improve patient outcomes through specialized treatment, reduce hospital visits, manage symptoms, and provide timely intervention. Several machine learning approaches were applied to the dataset and assessed in order to determine the best-performing algorithms. The results show that the Convolutional Neural Network (CNN) model performs better than traditional machine learning models. The CNN model achieves 99.97% accuracy. In comparison, Naïve Bayes (NB), the best machine learning (ML) model, achieved only 78.60% accuracy. This shows CNN’s strong ability to recognize complex patterns in clinical data, making it a good choice for predicting heart failure. Despite being extensive, the study has some limitations. Although the dataset is large, it may not accurately reflect the variety of real-world clinical data, which might affect how well the model performs in different situations. Further optimization techniques such as regularization and dropout can be applied to reduce overfitting and boost performance.

Author Contributions

M.Q. conceived and designed the study. R.A. was responsible for data collection, analysis, and drafting the manuscript. M.I.T. contributed to methodology refinement, critical review, and editing of the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

Correction Statement

This article has been republished with a minor correction to reference 9. This change does not affect the scientific content of the article.

References

1. Rozie, F. Rancang Bangun Alat Monitoring Jumlah Denyut Nadi/Jantung Berbasis Android. Available online: https://www.neliti.com/publications/191055/rancang-bangun-alat-monitoring-jumlah-denyut-nadi-jantung-berbasis-android (accessed on 1 January 2025).
2. World Health Organization. Avoiding Heart Attacks and Strokes Don’t Be a Victim Protect Yourself; World Health Organization: Geneva, Switzerland, 2005.
3. Widiastuti, N.A.; Santosa, S.; Supriyanto, C. Algoritma Klasifikasi Data Mining Naïve Bayes Berbasis Particle Swarm Optimization Untuk Deteksi Penyakit Jantung. 2014. Available online: https://ejournal.unib.ac.id/index.php/pseudocode/article/view/57 (accessed on 1 January 2025).
4. Sapna, F.N.U.; Raveena, F.N.U.; Chandio, M.; Bai, K.; Sayyar, M.; Varrassi, G.; Khatri, M.; Kumar, S.; Mohamad, T. Advancements in Heart Failure Management: A Comprehensive Narrative Review of Emerging Therapies. Cureus 2023, 15, e46486.
5. Gollangi, H.K.; Galla, E.P.; Ramdas Bauskar, S.; Madhavaram, C.R.; Sunkara, J.R.; Reddy, M.S. Echoes in Pixels: The Intersection of Image Processing and Sound Detection through the Lens of AI and ML. Int. J. Dev. Res. 2020, 10, 39735–39743.
6. Karadeniz, T.; Maraş, H.H.; Tokdemir, G.; Ergezer, H. Two Majority Voting Classifiers Applied to Heart Disease Prediction. Appl. Sci. 2023, 13, 3767.
7. Van Ness, M.; Bosschieter, T.; Din, N.; Ambrosy, A.; Sandhu, A.; Udell, M. Interpretable Survival Analysis for Heart Failure Risk Prediction. 2023. Available online: http://arxiv.org/abs/2310.15472 (accessed on 1 January 2025).
8. Saqib, M.; Perswani, P.; Muneem, A.; Mumtaz, H.; Neha, F.; Ali, S.; Tabassum, S. Machine learning in heart failure diagnosis, prediction, and prognosis: Review. Ann. Med. Surg. 2024, 86, 3615–3623.
9. Agarwal, A.; Thirunarayan, K.; Romine, W.L.; Alambo, A.; Cajita, M.; Banerjee, T. Leveraging Natural Learning Processing to Uncover Themes in Clinical Notes of Patients Admitted for Heart Failure. Available online: https://ieeexplore.ieee.org/abstract/document/9871400 (accessed on 1 January 2025).
10. Rao, S.; Li, Y.; Ramakrishnan, R.; Hassaine, A.; Canoy, D.; Cleland, J.; Lukasiewicz, T.; Salimi-Khorshidi, G.; Rahimi, K. An explainable Transformer-based deep learning model for the prediction of incident heart failure. IEEE J. Biomed. Health Inform. 2022, 26, 3362–3372.
11. Ramdhani, Y.; Putra, C.M.; Alamsyah, D.P. Heart failure prediction based on random forest algorithm using genetic algorithm for feature selection. Int. J. Reconfigurable Embedded Syst. 2023, 12, 205–214.
12. Islam, S.; Hossain, F.; Sattar, A.H.M.S. Predicting Heart Disease from Medical Records Heart Disease Prediction from Medical Record. Available online: https://www.researchgate.net/publication/381609333 (accessed on 1 January 2025).
13. Chicco, D.; Jurman, G. Machine learning can predict survival of patients with heart failure from serum creatinine and ejection fraction alone. BMC Med. Inform. Decis. Mak. 2020, 20, 16.
14. Patel, S. Survival Analysis in Heart Failure Patients Based on Clinical Data Using AI-Driven Statistical. Int. J. Core Eng. Manag. 2021, 6, 365–377. Available online: https://ssrn.com/abstract=5052706 (accessed on 1 January 2025).
15. Kia, H.; Vali, M.; Sabahi, H. Enhancing Mortality Prediction in Heart Failure Patients: Exploring Preprocessing Methods for Imbalanced Clinical Datasets. In Proceedings of the 2023 30th National and 8th International Iranian Conference on Biomedical Engineering (ICBME), Tehran, Iran, 30 November–1 December 2023.
16. Fassina, L.; Faragli, A.; Lo Muzio, F.P.; Kelle, S.; Campana, C.; Pieske, B.; Edelmann, F.; Alogna, A. A random shuffle method to expand a narrow dataset and overcome the associated challenges in a clinical study: A heart failure cohort example. Front. Cardiovasc. Med. 2020, 7, 599923.
17. Ashfaq, F.; Jhanjhi, N.Z.; Khan, N.A.; Javaid, D.; Masud, M.; Shorfuzzaman, M. Enhancing ECG Report Generation With Domain-Specific Tokenization for Improved Medical NLP Accuracy. IEEE Access 2025, 13, 85493–85506.
18. Kodama, K.; Sakamoto, T.; Kubota, T.; Takimura, H.; Hongo, H.; Chikashima, H.; Shibasaki, Y.; Yada, T.; Node, K.; Nakayama, T.; et al. Construction of a Heart Failure Database Collating Administrative Claims Data and Electronic Medical Record Data to Evaluate Risk Factors for In-Hospital Death and Prolonged Hospitalization. Circ. Rep. 2019, 1, 582–592.
19. Stefane Souza, V.; Araújo Lima, D. Cardiac Disease Diagnosis Using K-Nearest Neighbor Algorithm: A Study on Heart Failure Clinical Records Dataset. Artif. Intell. Appl. 2024, 3, 56–71.
20. Eurostat. Standardised Rate of Deaths from Coronary Heart Diseases in the EU. Eurostat News, 28 September 2020. Europe-Wide Rates and Infographic. Available online: https://ec.europa.eu/eurostat/web/products-eurostat-news/-/edn-20200928-1 (accessed on 1 January 2025).
21. Lim, M.; Abdullah, A.; Jhanjhi, N.; Khurram Khan, M.; Supramaniam, M. Link prediction in time-evolving criminal network with deep reinforcement learning technique. IEEE Access 2019, 7, 184797–184807.
22. Diwaker, D.C.; Tomar, P.; Solanki, A.; Nayyar, A.; Jhanjhi, N.Z.; Abdullah, A.; Supramaniam, M. A New Model for Predicting Component Based Software Reliability Using Soft Computing. IEEE Access 2019, 7, 147191–147203.
23. Airehrour, D.; Gutierrez, J.; Kumar Ray, S. GradeTrust: A secure trust based routing protocol for MANETs. In Proceedings of the 2015 International Telecommunication Networks and Applications Conference (ITNAC), Sydney, NSW, Australia, 18–20 November 2015; pp. 65–70.

Figure 1. Coronary disease deaths over the years [20].

Figure 2. Flowchart for heart failure prediction methodology.

Figure 3. CNN model.

Table 1. Attributes description.

Features	Descriptions
Age	Patient’s age (years)
Anemia	Decline in hemoglobin and red blood cell levels
Creatinine Phosphokinase	Blood concentration of the Creatine Phosphokinase (CPK) enzyme (mcg/L)
Diabetes	Whether the patient has diabetes
Ejection Fraction	Percentage of blood leaving the heart at each contraction
High Blood Pressure	Whether the patient has hypertension
Platelets	Platelets in the blood (kiloplatelets/mL)
Serum Creatinine	Level of creatinine in the blood serum (mg/dL)
Serum Sodium	Level of sodium in the blood serum (mEq/L)
Sex	Patient’s sex (male or female)
Smoking	Whether the patient smokes or not
Time	Follow-up period (days)
Death Event	Whether the patient died during the follow-up period

Table 2. Upsampling accuracy results.

Algorithms	Accuracy	Precision	F-Measure	Sensitivity
KNN	95.30%	92.70%	95.36%	92.28%
Naive Bayes	78.60%	74.75%	78.66%	78.89%
CNN	99.75%	99.72%	99.78%	99.76%
Random Forest	96.06%	94.29%	94.56%	93.26%
Deep Learning	92.22%	92.36%	92.44%	92.39%
SVM	94.27%	94.15%	94.36%	94.44%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
