Machine Learning and Ensemble Methods for Cardiovascular Disease Prediction: A Systematic Review of Approaches, Performance Trends, and Research Challenges
Abstract
1. Introduction
2. Literature Review
3. Methodology
3.1. Search Strategy
3.2. Inclusion and Exclusion Criteria
3.2.1. Inclusion Criteria
3.2.2. Exclusion Criteria
3.3. Data Extraction and Screening Process
Data Extraction
4. Discussion
- A larger, more diverse dataset. Larger datasets improve results [23,24,25]. Modeling risk in a large population requires many risk variables. References [26,27,28] state that such studies require API or cloud-based datasets. Cloud computing can handle large patient datasets, and references [29,55] state that IoT devices can collect clinical parameters in real time, improving existing systems. Collaborating with medical practitioners to update patient descriptions and obtain more data to refine the model is difficult. References [56,57] suggest training models on datasets from different hospitals to improve results. Validating models is difficult; however, laboratory test data helps verify predictions [3]. Medical record data analysis can improve heart CT scan models, according to [58,59], which recommend real-world datasets over simulations and theoretical models.
- More comprehensive, larger, and more diverse datasets: Comprehensive, large, and diverse datasets play a vital role in enhancing the accuracy and robustness of predictive models in healthcare [23,24,25]. To model the health risks of a large population effectively, a comprehensive set of risk variables should be incorporated into the model [26,27]. Such studies can use API or cloud-based datasets, as cloud computing can manage large volumes of patient data in a scalable and flexible way [28]. IoT devices also play an important role by collecting clinical parameters in real time, which enhances existing healthcare systems [29,56]. Nevertheless, working with medical professionals to update patient descriptions and continually acquire additional data is very challenging [56,57]. As mentioned in [57], training models on data from different hospitals may improve performance and generalization. Although validating these models remains complicated, laboratory test data can be very useful for verifying predictions [23]. Moreover, analysis of medical record data can help improve heart CT scan models and their diagnostic accuracy [58]. For maximum reliability, ref. [59] suggests relying on real-world datasets rather than simulations and hypothetical models, because actual clinical environments best reflect real-world complexity.
- Electrocardiogram (ECG) Data: Evaluating ECG data still poses significant challenges. One of the largest is correctly segmenting the individual waves before assigning a rhythm label and detecting isolated beats [60]. Segmentation quality directly affects classification accuracy, so it is a key requirement for reliable and accurate automated ECG analysis. Advances in signal processing techniques and ML models are required to overcome this challenge and to provide more robust and precise interpretation of ECG data in real-time clinical settings. Enhanced segmentation algorithms have the potential to significantly improve diagnostic accuracy, particularly in identifying arrhythmias and other cardiac abnormalities.
- Generalized Models: Applying different feature selection techniques can considerably improve existing models when the data contain many missing values [61]. In addition, combining ensemble classifiers with additional features can help build more accurate illness severity models [15], as suggested by ref. [62]. High-dimensional, high-volume data need an appropriate dimensionality reduction strategy to be handled efficiently. To reduce redundant features, treat missing values, and remove noise, ref. [63] proposed a more comprehensive strategy that may yield better predictions. The next step should be to develop new feature selection techniques that choose the most informative features as model inputs, improving predictive performance and robustness [64]. Such innovations will aid the deployment of ML models on real healthcare data streams and support better healthcare delivery.
- Some models are not publicly accessible, which calls for open-source solutions so that predictive models can be more widely adopted and shared [65]. Another significant challenge is deploying these systems in real-world clinical settings without continuous medical supervision, which makes it difficult to assess their efficacy on real-time data [66]. Without open-source solutions, healthcare professionals diagnose and treat conditions such as coronary artery disease (CAD) less efficiently because they lack the right tools to make accurate decisions [27,67]. In addition, medical diagnostic tools are seldom available in large clinical environments, so they cannot yet meaningfully improve patient outcomes [68]. Developing and disseminating open-source models that address these gaps would give health workers better diagnostic support at lower cost and in more accessible forms, particularly in resource-limited settings, ultimately improving the quality of healthcare services.
- To enhance prediction precision, ref. [27] compared the linear SVM kernel with other SVM kernels. The linear kernel stood out for the uniqueness and interpretability of its operating properties. Building on this, ref. [67] added further parameters, whereas ref. [69] suggested integrating a DL model into the system to enhance it further. Such processing is intended to improve feature extraction, data classification, and precision [15]. Another study [70] recommended probabilistic methods and CAD projections for producing reliable and robust predictions. When a single model yields low accuracy, a hybrid framework is a reasonable choice for raising it and maximizing the prediction outcome.
- Existing diagnostic methods are not limited to CVD prediction; they are also applied to cancer, diabetes, various neurological disorders, and kidney disease [29,66,71]. Earlier detection through these advanced predictive models, together with their successful integration into healthcare, would improve patient outcomes and disease management.
- Developing new models will require advances in multimodal data integration, real-time analytics, and machine learning algorithms. These advances will give clinicians greater efficiency and precision, allowing them to address a wider variety of chronic diseases.
- These techniques, which analyze vast visual datasets, offer great opportunities for improving diagnostic accuracy in cardiovascular healthcare. As ref. [72] noted, further studies have been undertaken to enrich the models, and additional machine learning-based refinements may make these prediction systems more capable, as shown in Tables 5, 6, 7, and 8.
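
The ECG segmentation challenge raised in the discussion can be made concrete with a minimal sketch. It is not the method of ref. [60]: it assumes a clean, single-lead signal and uses a simple amplitude threshold with a refractory period to locate R peaks, then cuts a fixed window around each peak. The function names (`synthetic_ecg`, `detect_r_peaks`, `segment_beats`), the threshold ratio, the window lengths, and the synthetic signal are all hypothetical choices made for illustration.

```python
import math

def synthetic_ecg(n_beats=5, fs=250, heart_rate_bpm=60):
    """Toy signal (assumption): one narrow Gaussian 'R wave' per beat on a slow baseline."""
    samples_per_beat = int(fs * 60 / heart_rate_bpm)
    center = samples_per_beat // 2
    signal = []
    for i in range(n_beats * samples_per_beat):
        phase = i % samples_per_beat
        r_wave = math.exp(-((phase - center) ** 2) / (2 * 3.0 ** 2))
        baseline = 0.05 * math.sin(2 * math.pi * i / (fs * 4))  # slow baseline wander
        signal.append(r_wave + baseline)
    return signal

def detect_r_peaks(signal, fs, threshold_ratio=0.6, refractory_s=0.2):
    """Return indices of local maxima above a threshold, at least refractory_s apart."""
    threshold = threshold_ratio * max(signal)
    refractory = int(refractory_s * fs)
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_local_max and signal[i] >= threshold:
            if not peaks or i - peaks[-1] > refractory:
                peaks.append(i)
    return peaks

def segment_beats(signal, peaks, fs, before_s=0.25, after_s=0.4):
    """Cut a fixed window around each R peak; peaks too close to the edges are skipped."""
    before, after = int(before_s * fs), int(after_s * fs)
    beats = []
    for p in peaks:
        if p - before >= 0 and p + after <= len(signal):
            beats.append(signal[p - before:p + after])
    return beats

fs = 250  # sampling rate in Hz (assumed)
ecg = synthetic_ecg(n_beats=5, fs=fs)
r_peaks = detect_r_peaks(ecg, fs)
beats = segment_beats(ecg, r_peaks, fs)
rr_intervals = [(b - a) / fs for a, b in zip(r_peaks, r_peaks[1:])]  # seconds between beats
print("R peaks:", r_peaks)
print("RR intervals (s):", rr_intervals)
print("Beats extracted:", len(beats))
```

Real recordings contain noise, baseline drift, and ectopic beats, so a fixed threshold of this kind would likely fail on them; that gap is one reason segmentation remains an open problem. The extracted beat windows and RR intervals are the kind of inputs a downstream rhythm or beat classifier would consume.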
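
The point about feature selection on data with many missing values can be illustrated with a small filter-style sketch. It is a generic illustration, not the approach of refs. [61,62,63,64]: it assumes missing entries are marked with `None`, fills them with the column median, and then ranks features by the absolute Pearson correlation with a binary label. The feature names, the toy records, and the helper names (`impute_median`, `pearson`, `rank_features`) are hypothetical.

```python
import math
from statistics import median

def impute_median(rows):
    """Replace None in each column with the median of that column's observed values."""
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        medians.append(median(observed))
    return [[medians[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def rank_features(rows, labels, names):
    """Impute, then sort features by |correlation with the label|, strongest first."""
    filled = impute_median(rows)
    scores = {}
    for j, name in enumerate(names):
        column = [r[j] for r in filled]
        scores[name] = abs(pearson(column, labels))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy records: [age, resting_bp, cholesterol]; None marks a missing value.
names = ["age", "resting_bp", "cholesterol"]
rows = [
    [45, 120, 200],
    [61, None, 210],
    [52, 130, None],
    [70, 150, 205],
    [38, 118, 198],
    [66, None, 207],
]
labels = [0, 1, 0, 1, 0, 1]  # 1 = disease present (made-up labels)

ranking = rank_features(rows, labels, names)
top_k = [name for name, _ in ranking[:2]]  # keep the two highest-ranked features
print(ranking)
print("Selected:", top_k)
```

A univariate filter like this ignores interactions and redundancy between features, which is part of why the reviewed studies call for more comprehensive selection strategies. Median imputation is likewise only one simple option among many.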
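
The interpretability of the linear SVM kernel noted in the discussion can be shown with a short sketch. It assumes the standard kernel SVM decision function, f(x) = Σ cᵢ K(sᵢ, x) + b, where cᵢ is the signed dual coefficient of support vector sᵢ. With a linear kernel this sum collapses into one explicit weight per feature, which is not possible for the polynomial or RBF kernels. The support vectors, coefficients, and bias below are made-up values, not a fitted model and not results from ref. [27].

```python
import math

def linear_kernel(u, v):
    """Plain dot product."""
    return sum(a * b for a, b in zip(u, v))

def polynomial_kernel(u, v, degree=3, coef0=1.0):
    """(u . v + coef0) ** degree"""
    return (linear_kernel(u, v) + coef0) ** degree

def rbf_kernel(u, v, gamma=0.5):
    """exp(-gamma * ||u - v||^2)"""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

def decision_value(x, support_vectors, dual_coefs, bias, kernel):
    """Kernel SVM decision function: sum_i c_i * K(s_i, x) + bias."""
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, dual_coefs)) + bias

def linear_weights(support_vectors, dual_coefs):
    """For the linear kernel only: w = sum_i c_i * s_i gives one weight per feature."""
    n_features = len(support_vectors[0])
    return [sum(c * sv[j] for sv, c in zip(support_vectors, dual_coefs))
            for j in range(n_features)]

# Hypothetical values for illustration (not obtained by training).
support_vectors = [[1.0, 2.0], [2.0, 0.5], [0.0, 1.0]]
dual_coefs = [0.8, -0.5, -0.3]  # signed coefficients alpha_i * y_i
bias = 0.1
x = [1.5, 1.0]  # a new sample with two features

w = linear_weights(support_vectors, dual_coefs)
f_kernel = decision_value(x, support_vectors, dual_coefs, bias, linear_kernel)
f_weights = linear_kernel(w, x) + bias  # same value, computed from explicit weights

print("Per-feature weights (linear kernel):", w)
print("Linear decision value:", f_kernel, "==", f_weights)
for name, kernel in [("polynomial", polynomial_kernel), ("rbf", rbf_kernel)]:
    print(name, "decision value:", decision_value(x, support_vectors, dual_coefs, bias, kernel))
```

The sign and magnitude of each entry in `w` indicate how a feature pushes the prediction, which is what makes the linear kernel easy to inspect. Nonlinear kernels can fit more complex boundaries but offer no such direct per-feature reading.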
5. Comparative Analysis of Ensemble Learning Approaches for CVD Prediction
5.1. Bagging-Based Ensembles
5.2. Boosting-Based Ensembles
5.3. Stacking and Hybrid Ensembles
5.4. Deep Ensembles
5.5. Bayesian Ensembles
5.6. Federated Ensemble Learning
5.7. XAI for Ensemble Models in CVD Prediction
5.8. Summary of Ensemble Trade-Offs
5.9. Interpretation and Implications
6. Open Challenges in ML, DL, and Ensemble Learning for CVD Prediction
6.1. Data-Related Challenges
6.1.1. Scarcity and Imbalance of High-Quality CVD Data
6.1.2. Multimodal Data Integration Challenges
6.2. Algorithmic Challenges
6.2.1. Lack of Model Interpretability and Clinical Explainability
6.2.2. Generalization and Robustness Across Populations
6.3. Challenges in Clinical Integration
6.3.1. Workflow Compatibility and Real-World Deployment
6.3.2. Lack of External Validation and Prospective Studies
6.4. Regulatory, Ethical, and Governance Challenges
6.4.1. Data Privacy, Security, and Compliance
6.4.2. Fairness, Bias, and Ethical Decision-Making
6.5. Summary of Open Challenges
7. Limitations
7.1. Variability in Reporting Standards Across Studies
7.2. Limited Reliability of High Accuracy Claims
7.3. Dataset Bias and Restricted Demographic Representation
7.4. Methodological Inconsistencies and Lack of Reproducibility
7.5. Limited Consideration of Clinical Integration
7.6. Lack of External Validation and Prospective Trials
7.7. Constraints of the Survey Itself
8. Future Work
8.1. Development of Large, Diverse, and Multimodal CVD Datasets
8.2. Standardization of Evaluation Protocols
8.3. Advancing Interpretability and Clinician-in-the-Loop Modeling
8.4. Exploration of Uncertainty-Aware and Safety-Critical AI Models
8.5. External Validation and Prospective Clinical Trials
8.6. Federated and Privacy-Preserving Learning Frameworks
8.7. Integration of Models into Clinical Information Systems
8.8. Addressing Bias, Fairness, and Ethical Considerations
9. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Jindal, H.; Agrawal, S.; Khera, R.; Jain, R.; Nagrath, P. Heart disease prediction using machine learning algorithms. IOP Conf. Ser. Mater. Sci. Eng. 2021, 1022, 012072.
- Bhushan, M.; Pandit, A.; Garg, A. Machine Learning and Deep Learning Techniques for the Analysis of Heart Disease: A Systematic Literature Review, Open Challenges and Future Directions; Springer: Dordrecht, The Netherlands, 2023.
- Elgendy, M.S.; Moustafa, H.E.-D.; Nafea, H.B.; Shaban, W.M. Utilizing voting classifiers for enhanced analysis and diagnosis of cardiac conditions. Results Eng. 2025, 26, 104636.
- Boukhatem, C.; Youssef, H.Y.; Nassif, A.B. Heart Disease Prediction Using Machine Learning. In Proceedings of the 2022 Advances in Science and Engineering Technology International Conferences (ASET), Dubai, United Arab Emirates, 21–24 February 2022; pp. 1–6.
- Hashemi, A.; Dowlatshahi, M.B.; Nezamabadi-Pour, H. Ensemble of feature selection algorithms: A multi-criteria decision-making approach. Int. J. Mach. Learn. Cybern. 2022, 13, 49–69.
- Kumar, Y.; Kaur, G.K.; Singh, R. Comprehensive review of machine learning applications in heart disease prediction. Int. J. Innov. Sci. Res. Technol. 2024, 9, 2805–2812.
- Ganie, S.M.; Pramanik, P.K.D.; Zhao, Z. Ensemble learning with explainable AI for improved heart disease prediction based on multiple datasets. Sci. Rep. 2025, 15, 13912.
- Eini, P.; Rezayee, M.; Kassulke, M.; Tremblay, J. Efficacy and Comparative Performance of Machine Learning Models for Stroke Risk Prediction in Hypertensive Patients: A Systematic Review and Meta-Analysis. Int. J. Cardiol. Cardiovasc. Risk Prev. 2025, 200564.
- Khan, H.; Bilal, A.; Aslam, M.A.; Mustafa, H. Heart Disease Detection: A Comprehensive Analysis of Machine Learning, Ensemble Learning, and Deep Learning Algorithms. Nano Biomed. Eng. 2024, 16, 677–690.
- Liu, T.; Krentz, A.; Lu, L.; Curcin, V. Machine learning based prediction models for cardiovascular disease risk using electronic health records data: Systematic review and meta-analysis. Eur. Heart J. Digit. Health 2024, 6, 7–22.
- Liu, Y.; Wang, H.; Deng, L.; He, X. Development and validation of an ultrasound-based AI-radiomics model for diagnosing and risk-stratifying gastrointestional stromal tumors: A retrospective diagnostic study. BMC Med. Imaging 2025, 25, 493.
- Sharma, D.; Aujla, G.S.; Bajaj, R. Deep neuro-fuzzy approach for risk and severity prediction using recommendation systems in connected health care. Trans. Emerg. Telecommun. Technol. 2021, 32, e4159.
- Marjit, S.; Bhattacharyya, T.; Chatterjee, B.; Sarkar, R. Simulated annealing aided genetic algorithm for gene selection from microarray data. Comput. Biol. Med. 2023, 158, 106854.
- Sodhar, I.N.; Jalbani, A.H.; Buller, A.H.; Channa, M.I.; Hakro, D.N. Sentiment analysis of Romanized Sindhi text. J. Intell. Fuzzy Syst. 2020, 38, 5877–5883.
- Zhang, J.; Zhu, H.; Chen, Y.; Yang, C.; Cheng, H.; Li, Y.; Zhong, W.; Wang, F. Ensemble machine learning approach for screening of coronary heart disease based on echocardiography and risk factors. BMC Med. Inform. Decis. Mak. 2021, 21, 187.
- Bihri, H.; Charaf, L.A.; Azzouzi, S.; Charaf, M.E.H. A Robust Stacking-Based Ensemble Model for Predicting Cardiovascular Diseases. AI 2025, 6, 160.
- Goyal, P.; Rani, R. Comparative Analysis of Machine Learning, Ensemble Learning and Deep Learning Classifiers for Parkinson’s Disease Detection. SN Comput. Sci. 2024, 5, 66.
- Tiwari, A.; Chugh, A.; Sharma, A. Ensemble framework for cardiovascular disease prediction. Comput. Biol. Med. 2022, 146, 105624.
- Babar, M. A hybrid approach to financial big data analysis using extended ensemble learning and optimized spark streaming. J. Open Innov. Technol. Mark. Complex. 2025, 11, 100602.
- Ganie, S.M.; Pramanik, P.K.D.; Malik, M.B.; Nayyar, A.; Kwak, K.S. An Improved Ensemble Learning Approach for Heart Disease Prediction Using Boosting Algorithms. Comput. Syst. Sci. Eng. 2023, 46, 3993–4006.
- Osman, A.F. Radiation Oncology in the Era of Big Data and Machine Learning for Precision Medicine. In Artificial Intelligence—Applications in Medicine and Biology; Books on Demand: Norderstedt, Germany, 2019.
- Navita; Mittal, P.; Sharma, Y.K.; Lilhore, U.K.; Simaiya, S.; Saleem, K.; Ghith, E.S. Advanced Hybrid Machine Learning Model for Accurate Detection of Cardiovascular Disease. Int. J. Comput. Intell. Syst. 2025, 18, 51.
- Abdollahi, J.; Nouri-Moghaddam, B. A hybrid method for heart disease diagnosis utilizing feature selection based ensemble classifier model generation. Iran J. Comput. Sci. 2022, 5, 229–246.
- Fang, Y.; Wu, Y.; Gao, L. Machine learning-based myocardial infarction bibliometric analysis. Front. Med. 2025, 12, 1477351.
- Maulani, A.A.; Winarno, S.; Zeniarja, J.; Putri, R.T.E.; Cahyani, A.N. Comparison of Hyperparameter Optimization Techniques in Hybrid CNN-LSTM Model for Heart Disease Classification. Sink. J. Dan Penelit. Tek. Inform. 2024, 8, 455–465.
- Torthi, R.; Marapatla, A.D.K.; Mande, S.; Gadiraju, H.K.V.; Kanumuri, C. Heart Disease Prediction Using Random Forest Based Hybrid Optimization Algorithms. Int. J. Intell. Eng. Syst. 2024, 17, 134–144.
- Sreekumari, S.; Bhalla, R.; Singh, G. Feature Selection and Model Evaluation for Heart Disease Prediction Using Ensemble Methods. Procedia Comput. Sci. 2025, 259, 1282–1295.
- Vincent, A.C.S.R.; Sengan, S. Edge computing-based ensemble learning model for health care decision systems. Sci. Rep. 2024, 14, 26997.
- Mohamed, Y.A.; Khanan, A.; Bashir, M.; Hakro, D.N.; Babar, M. A survey on health spending and comprehensive eGuide for healthcare: Challenges, implementation and future directions. Sustain. Futures 2026, 11, 101584.
- Teja, M.D.; Rayalu, G.M. Optimizing heart disease diagnosis with advanced machine learning models: A comparison of predictive performance. BMC Cardiovasc. Disord. 2025, 25, 4627.
- Yang, Y.; Lv, H.; Chen, N. A survey on ensemble learning under the era of deep learning. Artif. Intell. Rev. 2023, 56, 5545–5589.
- Mienye, I.D.; Sun, Y. A survey of ensemble learning: Concepts, algorithms, applications, and prospects. IEEE Access 2022, 10, 99129–99149.
- Akella, A.; Akella, S. Machine learning algorithms for predicting coronary artery disease: Efforts toward an open-source solution. Future Sci. OA 2021, 7, FSO698.
- Ganaie, M.A.; Hu, M.; Malik, A.K.; Tanveer, M.; Suganthan, P.N. Ensemble deep learning: A review. Eng. Appl. Artif. Intell. 2022, 115, 105151.
- Dalvi, J.J.; Khole, S.M.; Kudale, B. A Survey on Heart Disease Prediction Using Machine Learning Techniques. Algorithms 2018, 11, 1–12.
- Ramalingam, V.V.; Dandapath, A.; Raja, M.K. Heart disease prediction using machine learning techniques: A survey. Int. J. Eng. Technol. 2018, 7, 684–687.
- Kieu, S.T.H.; Bade, A.; Hijazi, M.H.A.; Kolivand, H. A survey of deep learning for lung disease detection on medical images: State-of-the-art, taxonomy, issues and future directions. J. Imaging 2020, 6, 131.
- Ghosh, P.; Azam, S.; Jonkman, M.; Karim, A.; Shamrat, F.M.J.; Ignatious, E.; Shultana, S.; Beeravolu, A.R.; De Boer, F. Efficient prediction of cardiovascular disease using machine learning algorithms with relief and lasso feature selection techniques. IEEE Access 2021, 9, 19304–19326.
- Jan, M.; Awan, A.A.; Khalid, M.S.; Nisar, S. Ensemble approach for developing a smart heart disease prediction system using classification algorithms. Res. Rep. Clin. Cardiol. 2018, 9, 33–45.
- Ghosh, S.; Ghosh, R.; Das, D. An ensemble approach to stabilize the features for multi-domain sentiment analysis using supervised machine learning. J. Intell. Fuzzy Syst. 2017, 32, 3543–3554.
- Asif, D.; Bibi, M.; Arif, M.S.; Mukheimer, A. Enhancing heart disease prediction through ensemble learning techniques with hyperparameter optimization. Algorithms 2023, 16, 308.
- El-Shafiey, M.G.; Hagag, A.; El-Dahshan, E.S.A.; Ismail, M.A. A hybrid GA and PSO optimized approach for heart-disease prediction based on Random Forest. Multimed. Tools Appl. 2022, 81, 18155–18179.
- Dong, X.; Yu, Z.; Cao, W.; Shi, Y.; Ma, Q. A survey on ensemble learning. Front. Comput. Sci. 2020, 14, 241–258.
- Gao, X.Y.; Ali, A.A.; Hassan, H.S.; Anwar, E.M. Improving the accuracy for analyzing heart disease prediction based on the ensemble method. Complexity 2021, 2021, 6663455.
- Alqahtani, A.; Alsubai, S.; Sha, M.; Vilcekova, L.; Javed, T. Cardiovascular disease detection using ensemble learning. Comput. Intell. Neurosci. 2022, 2022, 5267498.
- Gupta, P.; Seth, D. Comparative analysis and feature importance of machine learning and deep learning for heart disease prediction. Indones. J. Electr. Eng. Comput. Sci. 2022, 29, 451–459.
- Limbitote, T.W.; Lavanya, D.; Vetrivel, S. A survey on prediction techniques of heart disease using machine learning. Int. J. Sci. Technol. Res. 2020, 9, 2083–2088.
- Mahajan, P.; Uddin, S.; Hajati, F.; Moni, M.A. Ensemble learning for disease prediction: A review. Healthcare 2023, 11, 1808.
- Natarajan, K.; Kumar, V.V.; Mahesh, T.R.; Abbas, M.; Kathamuthu, N.; Mohan, E.; Annand, J.R. Efficient Heart Disease Classification Through Stacked Ensemble with Optimized Firefly Feature Selection. Int. J. Comput. Intell. Syst. 2024, 17, 174.
- Al-Fatlawi, A.; Al-Shammaa, S.S.; Taha, Z.A. Prediction and classification models of heart disease using machine learning algorithms: A review. J. Inf. Sci. Eng. 2020, 36, 595–614.
- Fitriyani, N.L.; Syafrudin, M.; Alfian, G.; Rhee, J. Development of disease prediction model based on ensemble learning approach for diabetes and hypertension. IEEE Access 2019, 7, 144777–144789.
- Al-Sayed, A.; Khayyat, M.M.; Zamzami, N. Predicting Heart Disease Using Collaborative Clustering and Ensemble Learning Techniques. Appl. Sci. 2023, 13, 13278.
- Nguyen, T.T.; Yue, X.; Mane, H.; Seelman, K.; Mullaputi, P.S.P.; Dennard, E.; Alibilli, A.S.; Merchant, J.S.; Criss, S.; Hswen, Y.; et al. Decoding Digital Discourse Through Multimodal Text and Image Machine Learning Models to Classify Sentiment and Detect Hate Speech in Race- and Lesbian, Gay, Bisexual, Transgender, Queer, Intersex, and Asexual Community–Related Posts on Social Media: Quantitative Study. J. Med. Internet Res. 2025, 27, e72822.
- Mathur, P.; Srivastava, S.; Xu, X.; Mehta, J.L. Artificial intelligence, machine learning, and cardiovascular disease. Clin. Med. Insights: Cardiol. 2020, 14, 1179546820927404.
- Hamad, A.; Jasim, A. Heart disease diagnosis based on deep learning network. Open J. Sci. Technol. 2021, 4, 1–9.
- Li, P.; Hu, Y.; Liu, Z.-P. Prediction of cardiovascular diseases by integrating multi-modal features with machine learning methods. Biomed. Signal Process. Control 2021, 66, 102474.
- Biswas, R.; Beeravolu, A.R.; Karim, A.; Azam, S.; Hasan, T.; Alam, M.; Ghosh, P. A robust deep learning-based prediction system of heart disease using a combination of five datasets. In Proceedings of the 31st International Conference on Computer Theory and Applications (ICCTA), Alexandria, Egypt, 11–13 December 2021; pp. 223–228.
- Tomov, S.; Tomov, S. A novel deep learning approach to improving heart disease diagnosis. Biomed. Signal Process. Control 2021, 66, 10274.
- Rajdhan, A.; Agarwal, A.; Sai, M.; Ravi, D.; Ghuli, P. Heart disease prediction using machine learning. Int. J. Res. Technol. 2020, 9, 659–662.
- Darmawahyuni, A.; Nurmaini, S.; Rachmatullah, M.N.; Tutuko, B.; Sapitri, A.I.; Firdaus, F.; Fansyuri, A.; Predyansyah, A. Deep learning-based electrocardiogram rhythm and beat features for heart abnormality classification. PeerJ Comput. Sci. 2022, 8, e825.
- Sherly, S.I. An ensemble-based heart disease prediction using gradient boosting decision tree. Turk. J. Comput. Math. Educ. 2021, 12, 3648–3660.
- Kavitha, M.; Gnaneswar, G.; Dinesh, R.; Sai, Y.R.; Suraj, R.S. Heart disease prediction using hybrid machine learning model. In Proceedings of the 2021 6th International Conference on Inventive Computation Technologies (ICICT), Coimbatore, India, 20–22 January 2021; pp. 1329–1333.
- Ali, F.; El-Sappagh, S.; Islam, S.R.; Kwak, D.; Ali, A.; Imran, M.; Kwak, K.-S. A smart healthcare monitoring system for heart disease prediction based on ensemble deep learning and feature fusion. Inf. Fusion 2020, 63, 208–222.
- Singh, A.; Kumar, R. Heart disease prediction using machine learning algorithms. In Proceedings of the 2020 International Conference on Electrical and Electronics Engineering (ICE3), Gorakhpur, India, 14–15 February 2020; pp. 452–457.
- Patil, A.H.; Sonawane, O.S.; Sopan, V. Risk prediction of cardiovascular disease using logistic regression machine learning algorithm. Int. Res. J. Mod. Eng. Technol. Sci. 2022, 4, 1–7.
- Rani, P.; Kumar, R.; Ahmed, N.M.O.S.; Jain, A. A decision support system for heart disease prediction based upon machine learning. J. Reliab. Intell. Environ. 2021, 7, 263–275.
- Mienye, I.D.; Sun, Y.; Wang, Z. An improved ensemble learning approach for the prediction of heart disease risk. Inform. Med. Unlocked 2020, 20, 100402.
- Mohan, S.; Thirumalai, C.; Srivastava, G. Effective heart disease prediction using hybrid machine learning techniques. IEEE Access 2019, 7, 81542–81554.
- Krishnan, S.; Magalingam, P.; Ibrahim, R. Hybrid deep learning model using recurrent neural network and gated recurrent unit for heart disease prediction. Int. J. Electr. Comput. Eng. 2021, 11, 5467–5476.
- Chen, J.I.Z.; Hengjinda, P. Early prediction of coronary artery disease (CAD) by machine learning method—A comparative study. J. Artif. Intell. 2021, 3, 17–33.
- Mehmood, A.; Iqbal, M.; Mehmood, Z.; Irtaza, A.; Nawaz, M.; Nazir, T.; Masood, M. Prediction of heart disease using deep convolutional neural networks. Arab. J. Sci. Eng. 2021, 46, 3409–3422.
- Shah, S.; Patil, S.; Kulkarni, R. A Comparative Study of Machine Learning Algorithms for Predicting Cardiovascular Disease Risk. Int. J. Healthcare Inf. Syst. Inform. 2022, 17, 1–19.
- Subramanian, M.; Sathiskumar, V.E.; Deepalakshmi, G.; Cho, J.; Manikandan, G. A survey on hate speech detection and sentiment analysis using machine learning and deep learning models. Alex. Eng. J. 2023, 80, 110–121.
- Niño-Adan, I.; Landa-Torres, I.; Portillo, E.; Manjarres, D. Influence of statistical feature normalisation methods on K-Nearest Neighbours and K-Means in the context of industry 4.0. Eng. Appl. Artif. Intell. 2022, 111, 104807.
- Azmi, J.; Arif, M.; Nafis, M.T.; Alam, M.A.; Tanweer, S.; Wang, G. A systematic review on machine learning approaches for cardiovascular disease prediction using medical big data. Med. Eng. Phys. 2022, 105, 103825.
- Haupt, M.; Maurer, M.H.; Thomas, R.P. Explainable Artificial Intelligence in Radiological Cardiovascular Imaging—A Systematic Review. Diagnostics 2025, 15, 1399.
- Saberi-Karimian, M.; Khorasanchi, Z.; Ghazizadeh, H.; Tayefi, M.; Saffar, S.; Ferns, G.A.; Ghayour-Mobarhan, M. Potential value and impact of data mining and machine learning in clinical diagnostics. Crit. Rev. Clin. Lab. Sci. 2021, 58, 275–296.
- Razzak, M.I.; Naz, S.; Zaib, A. Deep learning for medical image processing: Overview, challenges and the future. In Classification in BioApps; Springer: Cham, Switzerland, 2018; pp. 323–350.
- Ting, D.S.W.; Cheung, C.Y.-L.; Lim, G.; Tan, G.S.W.; Quang, N.D.; Gan, A.; Hamzah, H.; Garcia-Franco, R.; Yeo, I.Y.S.; Lee, S.Y.; et al. Development and Validation of a Deep Learning System for Diabetic Retinopathy and Related Eye Diseases Using Retinal Images from Multiethnic Populations With Diabetes. JAMA 2017, 318, 2211–2223.
- Esteva, A.; Kuprel, B.; Novoa, R.A.; Ko, J.; Swetter, S.M.; Blau, H.M.; Thrun, S. Dermatologist-level classification of skin cancer with deep neural networks. Nat. Med. 2020, 25, 115–118.
- Rezaei, M.; Yang, H.; Meinel, C. Recurrent generative adversarial network for learning imbalanced medical image semantic segmentation. Multimed. Tools Appl. 2020, 79, 15329–15348.
- Apostolopoulos, I.D.; Mpesiana, T.A. COVID-19: Automatic detection from X-ray images utilizing transfer learning with convolutional neural networks. Phys. Eng. Sci. Med. 2020, 43, 635–640.
- Rafael-Palou, X.; Jimenez-Pastor, A.; Martí-Bonmatí, L.; Muñoz-Nuñez, C.F.; Laudazi, M.; Alberich-Bayarri, Á. Advancing deep learning-based segmentation for multiple lung cancer lesions in real-world multicenter CT scans. Eur. Radiol. Exp. 2025, 9, 78.
- Deshmukh, R. Reinforcement learning in healthcare: Applications for personalized treatment planning and clinical decision support. Shodh Sagar J. Artif. Intell. Mach. Learn. 2024, 1, 19–24.
- Komura, D.; Ishikawa, S. Machine learning methods for histopathological image analysis. Comput. Struct. Biotechnol. J. 2018, 16, 34–42.
- Zhang, F.; Li, Z.; Zhang, B.; Du, H.; Wang, B.; Zhang, X. Multi-modal deep learning model for auxiliary diagnosis of Alzheimer’s disease. Neurocomputing 2019, 361, 185–195.
- Thakkar, H.K.; Shukla, H.; Patil, S. A Comparative Analysis of Machine Learning Classifiers for Robust Heart Disease Prediction. In Proceedings of the 2020 IEEE 17th India Council International Conference (INDICON), New Delhi, India, 10–13 December 2020; pp. 1–6.
- Nashwan, A.J.; Gharib, S.; Alhadidi, M.; El-Ashry, A.M.; Alamgir, A.; Al-Hassan, M.; Khedr, M.A.; Dawood, S.; Abufarsakh, B. Harnessing Artificial Intelligence: Strategies for Mental Health Nurses in Optimizing Psychiatric Patient Care. Issues Ment. Health Nurs. 2023, 44, 1020–1034.
- Du, Z.; Yang, Y.; Zheng, J.; Li, Q.; Lin, D.; Li, Y.; Fan, J.; Cheng, W.; Chen, X.-H.; Cai, Y. Accurate prediction of coronary heart disease for patients with hypertension from electronic health records with big data and machine-learning methods: Model development and performance evaluation. JMIR Med. Inform. 2020, 8, e17257.
- Daharwal, U.; Singh, I.; Khekare, G. Comparison of Machine Learning Algorithms for Heart Disease Prediction. Procedia Comput. Sci. 2025, 260, 12–21.
- Akinola, S.; Leelakrishna, R.; Varadarajan, V. Enhancing cardiovascular disease prediction: A hybrid machine learning approach integrating oversampling and adaptive boosting techniques. ACS Med. Sci. 2024, 11, 58–71.
- Hasan, M.N.; Hossain, M.A.; Rahman, M.A. An ensemble based lightweight deep learning model for the prediction of cardiovascular diseases from electrocardiogram images. Eng. Appl. Artif. Intell. 2025, 141, 109782.
- Tang, K.; Ma, S.; Sun, X.; Guo, D. Optimizing machine learning for enhanced automated ECG analysis in cardiovascular healthcare. Egypt. Inform. J. 2024, 28, 100578.
- Abdelhameed, A.; Abdelghani, W. A hybrid deep learning model for cardiovascular disease prediction based on multimodal data. Front. Cardiovasc. Med. 2021, 8, 619926.
- Li, Q.; Wang, C. Predicting Cardiovascular Disease Risks with Long Short-Term Memory Networks. J. Med. Syst. 2021, 45, 1–10.
- Salama, S.R.; Alshahrani, A. Detection and prediction of cardiovascular diseases using machine learning techniques and health informatics. Health Inform. Sci. Syst. 2020, 8, 1–15.
- Abdeljaber, T.; Rehman, S. A novel deep learning model for cardiovascular disease prediction using genetic and clinical data. Front. Genet. 2020, 11, 980.
- Ahmed, S.M.; Rahman, M.M. Heart disease prediction system using machine learning and soft computing techniques. Soft Comput. 2019, 23, 3027–3042.
- Ribeiro, A.H. A deep learning algorithm to optimize cardiovascular risk assessment from electronic health records. JACC Cardiovasc. Imaging 2021, 14, 736–748.
- Almustafa, K. Prediction of heart disease and classifiers’ sensitivity analysis. BMC Bioinform. 2020, 21, 78.
- Al-Mahdi, I.S.; Darwish, S.M.; Madbouly, M.M. Heart Disease Prediction Model Using Feature Selection and Ensemble Deep Learning with Optimized Weight. CMES Comput. Model. Eng. Sci. 2025, 143, 875–909.
- Syed, M.G.; Trucco, E.; Mookiah, M.R.K.; Lang, C.C.; McCrimmon, R.J.; Palmer, C.N.A.; Pearson, E.R.; Doney, A.S.F.; Mordi, I.R. Deep-learning prediction of cardiovascular outcomes from routine retinal images in individuals with type 2 diabetes. Cardiovasc. Diabetol. 2025, 24, 25.
- Kumar, A.; Dhanka, S.; Sharma, A.; Bansal, R.; Fahlevi, M.; Rabby, F.; Aljuaid, M. A hybrid framework for heart disease prediction using classical and quantum-inspired machine learning techniques. Sci. Rep. 2025, 15, 9957.
- Patil, S.; Kirange, D. Ensemble of Deep Learning Models for Brain Tumor Detection. Procedia Comput. Sci. 2023, 218, 2468–2479.
- Shehzad, K.; Zhenhua, T.; Shoukat, S.; Saeed, A.; Ahmad, I.; Bhatti, S.S.; Chelloug, S.A. A Deep-Ensemble-Learning-Based Approach for Skin Cancer Diagnosis. Electronics 2023, 12, 1342.
- Rosenzveig, A.; Jha, A.; Abdi, N.; Multani, A.; Massad, F.; Sleem, M.; Modumudi, S.; Dixit, S.; Brown, C.; Nikita, M.; et al. TCT-590 Single-Center Real world use of the Paradise Renal Denervation Catheter. Early experience from the Cleveland Clinic. JACC 2025, 86, B256–B257.
- Hategeka, C.; Benjamin, E.J.; Preis, S.R. Association of Lipoprotein(a) With Atrial Fibrillation in the Framingham Heart Study. JACC Adv. 2025, 4, 102343.
- Simegn, G.L.; Gebeyehu, W.B.; Degu, M.Z. Computer-aided decision support system for diagnosis of heart diseases. Res. Rep. Clin. Cardiol. 2022, 13, 39–54.

| Reference | Algorithms | Accuracy |
|---|---|---|
| Yang et al. [31] | ANN, SVM, stacking ensemble, majority voting | 96% |
| Mienye et al. [32] | AdaBoost, XGBoost, KNN | 99.3% / 95.03% / 94.73% |
| Akella et al. [33] | Logistic regression, neural network, random forest, SVM, k-nearest neighbor | 0.8764 / 0.7978 / 0.8764 / 0.9303 / 0.8427 |
| Ganaie et al. [34] | Ensemble techniques (bagging, random forest) | 98% |
| Dalvi et al. [35] | SVM, random forest, logistic regression | 97% (SVM, best) |
| Ramalingam et al. [36] | SVM, Naive Bayes, KNN | 99% |
| Kieu et al. [37] | CNN, ANN, Ensemble Technique | 96% |

| Reference | Dataset(s) Used | Accuracy (%) |
|---|---|---|
| Zhang et al. [15] | 2D-STE + 7 clinical features | 87.7 |
| Ghosh et al. [38] | Cleveland, Hungarian, Statlog, etc. | 46–95.19 |
| Jan et al. [39] | Not explicitly stated | 93.22–98.17 |
| Ghosh et al. [40] | IMDb, Electronics & Kitchen reviews | 81.71 |
| Asif et al. [41] | Kaggle: 297 + 1025 + 303 merged | 97.23–98.15 |
| El-Shafiey et al. [42] | Cleveland, Statlog datasets | 87.8–95.6 |
| Dong et al. [43] | N/A (Survey) | N/A |
| Gao et al. [44] | Cleveland Heart Disease dataset | 83.7–98.6 |
| Louridi et al. [45] | UCI Heart Disease | 85.25–95.83 |
| Gupta et al. [46] | Post-COVID Jammu patient dataset | 93.23 |
| Limbitote et al. [47] | WEKA-processed datasets | Up to 91 |
| Mahajan et al. [48] | UCI CKD, CHD, Dermatology, etc. | Up to 100 |
| Natarajan et al. [49] | Z-Alizadeh Sani dataset from UCI ML Repository | 86.79 |

| Year | Authors | Research Paper Title | ML Algorithm |
|---|---|---|---|
| 2021 | Zhang et al. [15] | Ensemble ML approach for screening of coronary heart disease based on echocardiography and risk factors | Ensemble (stacked classifiers) |
| 2021 | Ghosh et al. [38] | An Effective Method to Predict Heart Disease, Particularly Coronary Artery Disease or Coronary Heart Disease, Using A Combination of Five Datasets and Various Classifiers and Hybrid Approaches | Bagging, Boosting, RF, KNN, GBT |
| 2021 | Jan et al. [39] | Ensemble approach for developing a smart heart disease prediction system using classification algorithms | RF, Others (Ensemble) |
| 2018 | Ghosh et al. [40] | An ensemble approach to stabilize the features for multi-domain sentiment analysis using supervised machine learning | SVM + IG |
| 2023 | Asif et al. [41] | Enhancing Heart Disease Prediction through Ensemble Learning Techniques with Hyperparameter Optimization | SVM, DT, RF, CatBoost, etc. |
| 2022 | El-Shafiey et al. [42] | A hybrid GA and PSO optimized approach for heart-disease prediction based on random forest | RF + GA/PSO |
| 2020 | Dong et al. [43] | A survey on ensemble learning | SVM, HMM, Clustering (Review) |
| 2021 | Gao et al. [44] | Improving the Accuracy for Analyzing Heart Diseases Prediction Based on the Ensemble Method | RF, DT, NB, KNN, SVM |
| 2021 | Louridi et al. [45] | Machine learning-based identification of patients with a cardiovascular defect | Stacking, XGBoost, LGBM, etc. |
| 2021 | Gupta et al. [46] | Stacking Ensemble-Based Intelligent ML Model for Predicting Post-COVID-19 Complications | SVM, RF, DT, ANN |
| 2020 | Limbitote et al. [47] | A Survey on Prediction Techniques of Heart Disease using Machine Learning | DT, SVM, NB, RF, etc. |
| 2023 | Mahajan et al. [48] | Ensemble Learning for Disease Prediction: A Review | Multiple incl. ANN, RF, SVC |
| 2024 | Natarajan et al. [49] | Efficient Heart Disease Classification Through Stacked Ensemble with Optimized Firefly Feature Selection | Ensemble methods (stacking, voting) |

| Database | Search Terms/Query Strings Used | Boolean Operators | Time Range | Results Retrieved |
|---|---|---|---|---|
| IEEE Xplore | “Ensemble learning for CVD detection”; “heart disease prediction machine learning”; “CVD diagnosis using DL”; “stacking OR bagging OR boosting heart disease” | AND, OR, phrase search | 2018–2025 | 187 |
| PubMed | “Cardiovascular disease” AND “machine learning”; “DL CVD classification”; “CVD prediction using multimodal data”; “medical diagnosis ensemble classifier” | AND, OR | 2018–2025 | 189 |
| Scopus | “Hybrid ML model” AND “cardiac disease prediction”; “ECG DL heart disease”; “ensemble classifier for CVD”; “clinical dataset heart disease ML” | AND, OR, truncation | 2018–2025 | 268 |

| Research Title | Journal Name | Year |
|---|---|---|
| A survey on hate speech detection and sentiment analysis using ML and DL models [73] | Alexandria Engineering Journal | 2023 |
| Ensemble DL: A review [74] | Engineering Applications of Artificial Intelligence | 2022 |
| ML approaches for CVD diagnosis: A systematic review [75] | Computers in Biology and Medicine | 2022 |
| Explainable artificial intelligence for CVD detection: A review [76] | Biomedical Signal Processing and Control | 2021 |
| CVD diagnosis using ML and data mining: A systematic review [77] | Applied Sciences | 2021 |

| Research Title | Journal Name | Year |
|---|---|---|
| DL for Medical Image Processing: Overview & Challenges [78] | Classification in BioApps | 2018 |
| Deep Learning for Diabetic Retinopathy Detection [79] | IEEE Transactions on Medical Imaging | 2019 |
| Automated Skin Cancer Classification Using CNNs [80] | Nature Medicine | 2020 |
| Deep Learning Approaches for Brain Tumor MRI Segmentation [81] | Computers in Biology and Medicine | 2021 |
| COVID-19 Detection from Chest X-Ray Images Using Deep CNNs [82] | IEEE Access | 2020 |
| Deep Learning for Lung Nodule Detection and Classification [83] | Medical Image Analysis | 2019 |
| Transformer-Based Models for Medical Text Analysis [84] | Journal of Biomedical Informatics | 2022 |
| Deep Learning in Histopathology: Cancer Grading and Diagnosis [85] | Patterns | 2021 |
| Multi-Modal DL Models for Alzheimer’s Disease Prediction [86] | NeuroImage | 2020 |
| Deep Reinforcement Learning in Personalized Treatment Planning [87] | Artificial Intelligence in Medicine | 2024 |

| Research Title | Journal Name | Year |
|---|---|---|
| Artificial intelligence, ML, and cardiovascular disease [88] | Clinical Medicine Insights: Cardiology | 2020 |
| Ensemble learning for CVD prediction using EHR data [89] | IEEE Journal of Biomedical and Health Informatics | 2023 |
| Comparative study of ML algorithms for heart disease prediction [90] | Journal of Healthcare Engineering | 2020 |
| Hybrid DL model for CVD prediction using physiological & lifestyle data [91] | Computer Methods and Programs in Biomedicine | 2024 |
| CVD prediction using ML with ECG data [92] | International Journal of Computer Applications | 2022 |
| DL model for early detection of CVD using wearable devices [93] | Journal of Ambient Intelligence and Humanized Computing | 2022 |
| Hybrid DL model for CVD prediction using multimodal data [94] | Frontiers in Cardiovascular Medicine | 2021 |
| Predicting CVD risks with LSTM networks [95] | Journal of Medical Systems | 2021 |
| Detection and prediction of CVD using ML techniques [96] | Health Information Science and Systems | 2020 |
| DL model for CVD prediction using genetic and clinical data [97] | Frontiers in Genetics | 2020 |
| Heart disease prediction using ML and soft computing [98] | Soft Computing | 2019 |
| DL algorithm to optimize CVD risk assessment from EHR [99] | JACC: Cardiovascular Imaging | 2021 |
| Hybrid DL model for CVD detection using optimized features [100] | Computers in Biology and Medicine | 2020 |
| Investigating DL models in CVD prediction [101] | International Conference on Data Analytics | 2020 |
| Deep-learning prediction of cardiovascular outcomes from routine retinal images in individuals with type 2 diabetes [102] | Cardiovascular Diabetology | 2025 |
| Optimizing heart disease diagnosis with advanced ML models: a comparison of predictive performance [30] | BMC Cardiovascular Disorders | 2025 |
| Advanced Hybrid ML Model for Accurate CVD Detection [22] | Applied Intelligence/Springer-link (or similar hybrid ML journal) | 2025 |
| A hybrid framework for heart disease prediction using classical and quantum-inspired ML techniques [103] | Scientific Reports | 2025 |

| Reference | Year | Ensemble Techniques | Application | Accuracy |
|---|---|---|---|---|
| Alqahtani et al. [45] | 2022 | Ensemble-based approach that uses ML and DL | Heart Disease | 88.70% |
| Patil and Kirange [104] | 2023 | Deep ensemble model combining a shallow convolutional neural network (SCNN) and a VGG16 network, built on T1C-modality MRI images | Brain Tumor | 97.77% |
| Shehzad et al. [105] | 2023 | Ensemble model that combines EfficientNetV2S and Swin-Transformer models to detect the early focal zone of skin cancer | Skin Cancer | 99.80% |

| Ensemble Type | Accuracy | Interpretability | Computational Cost | Robustness | Clinical Suitability |
|---|---|---|---|---|---|
| Random Forest | High | Moderate | Low | High | Very High |
| Boosting Models | Very High | Moderate | Medium | High | High |
| Stacking Ensembles | Very High | Low | High | High | Moderate |
| Deep Ensembles | Very High | Low | Very High | Very High | High (ECG) |
| Bayesian Ensembles | High | Low–Moderate | Very High | High | High-risk applications |
| Federated Ensembles | High | Moderate | Medium–High | High | Multi-hospital deployments |
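
To make the first row of the comparison above more concrete, the following is a minimal, standard-library-only sketch of bagging with hard majority voting, the aggregation mechanism that random-forest-style ensembles build on. It is an illustration under stated assumptions, not the method of any reviewed study: the toy data, the feature choice, the helper names (`fit_stump`, `fit_bagged_stumps`, `bagged_predict`), and the parameter values are all hypothetical. A real random forest additionally grows full trees and subsamples features at each split.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a decision stump: one feature, one threshold, chosen by exhaustive search."""
    best = None  # (training errors, feature index, threshold, label predicted above threshold)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for above in (0, 1):
                preds = [above if row[f] > t else 1 - above for row in X]
                errors = sum(p != label for p, label in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, f, t, above)
    _, f, t, above = best
    return lambda row: above if row[f] > t else 1 - above


def fit_bagged_stumps(X, y, n_estimators=25, seed=0):
    """Bagging: train each stump on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # draw n rows with replacement
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagged_predict(models, row):
    """Aggregate the base learners by hard majority vote."""
    votes = Counter(model(row) for model in models)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: [age, resting systolic blood pressure] -> 1 = disease present
X = [[35, 118], [42, 121], [39, 125], [47, 130],
     [58, 145], [63, 150], [61, 160], [55, 148]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

models = fit_bagged_stumps(X, y)
accuracy = sum(bagged_predict(models, r) == label for r, label in zip(X, y)) / len(X)
print(f"training accuracy of the bagged ensemble: {accuracy:.3f}")
```

An odd `n_estimators` avoids tied votes in the binary case. The accuracy printed here is measured on the training rows only, so it says nothing about generalization; a held-out split or cross-validation would be needed for that.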

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Gul, G.; Korejo, I.A.; Hakro, D.N.; Alqahtani, H.; Abbasi, A.; Babar, M.; Al Rahbi, O.; Ali, N.I. Machine Learning and Ensemble Methods for Cardiovascular Disease Prediction: A Systematic Review of Approaches, Performance Trends, and Research Challenges. Computers 2026, 15, 25. https://doi.org/10.3390/computers15010025

