An ML Framework for the Early Detection and Prediction of Hypertension: Enhancing Diagnostic Accuracy

Areeb, Muhammad; Rehman, Attique Ur; Sujjada, Alun

doi:10.3390/engproc2025107018

Open Access Proceeding Paper

by Muhammad Areeb ¹, Attique Ur Rehman ¹,* and Alun Sujjada ²

¹ Department of Software Engineering, University of Sialkot, Sialkot 51310, Pakistan

² Informatic Engineering, Nusa Putra University, Sukabumi 43152, West Java, Indonesia

* Author to whom correspondence should be addressed.

† Presented at the 7th International Global Conference Series on ICT Integration in Technical Education & Smart Society, Aizuwakamatsu City, Japan, 20–26 January 2025.

Eng. Proc. 2025, 107(1), 18; https://doi.org/10.3390/engproc2025107018

Published: 25 August 2025

(This article belongs to the Proceedings of The 7th International Global Conference Series on ICT Integration in Technical Education & Smart Society)

Abstract

Hypertension is a major worldwide health problem that can lead to serious consequences such as stroke, renal failure, and cardiovascular disease if it is not identified and treated promptly. Early identification is essential for reducing death rates and enabling prompt therapy, and timely, accurate diagnosis remains one of the most important preventive goals. This research examines whether machine learning (ML) can improve the early identification of hypertension. The study proposes a novel ML framework that reaches an accuracy of 99.92%, exceeding the best previously documented accuracy of 99.5%. This result highlights the robustness and trustworthiness of the proposed framework as a means of ensuring timely treatment and improving patients’ quality of life.

Keywords:

hypertension; early detection; machine learning; diagnostic accuracy; predictive modeling

1. Introduction

Hypertension, or high blood pressure, is one of the leading global health concerns and affects millions of people. It is a leading cause of serious conditions including stroke, heart attack, and renal failure. The preventable consequences of hypertension, such as high mortality, can only be reduced through early detection and treatment. Despite improved diagnostic techniques, diagnosis is still commonly delayed, which harms patients’ wellbeing and often leads to worse outcomes. Traditional diagnosis relies mainly on routine clinical approaches, including blood pressure measurements and biochemical tests. Although these techniques are effective, they are constrained by patient compliance, monitoring frequency, and availability. This underlines the importance of further research into more accurate and less burdensome means of detecting hypertension early. ML approaches offer two critical benefits: they can process a very large number of predictive factors, and they can identify patterns that are not apparent to a human observer. Hypertension diagnosis depends on several factors, including age, BMI, cholesterol, smoking status, and medical history.

Because these data are complex, conventional methods may not be effective at isolating the patterns that indicate early hypertension, and specially designed machine learning algorithms are needed. With such algorithms, large numbers of patient records can be processed within a reasonable time, relevant features can be identified, and patients can be grouped into risk categories to which effective treatments and preventive measures can be applied. Several studies have pointed out how ML can enhance the efficiency of diagnosing hypertension. Algorithms that can be applied to predict the probability of hypertension include Random Forest, Decision Tree, and Naïve Bayes, as well as ensemble methods, and the reported results are promising.

These models help enhance diagnostic accuracy because they integrate clinical and demographic information. Furthermore, using multiple datasets in a study can improve model reliability, reduce sample selection bias, and extend the utility of the diagnostic tools to different groups. The supervised classification algorithms in our proposed system include K-Nearest Neighbors (KNN), Naïve Bayes (NB), Decision Tree (DT), Random Forest (RF), and the ensemble classifiers AdaBoost and Voting Ensemble. After these models are evaluated, the best method for diagnosing hypertension is determined. To ensure high accuracy and reliability, the framework also involves strict steps for data preprocessing, feature selection, model training, and model validation. Diagnosing high blood pressure is complex because patterns in patient data are influenced by lifestyle, coexisting diseases, genetic makeup, and other factors. To address this issue, our method uses deep learning to minimize diagnostic errors and detect valuable features. We outperform past findings in similar experiments by achieving an accuracy of 99.92%. By providing a scalable and flexible machine learning framework that can be used across a range of populations, this research advances the area of hypertension detection. As the data show, ML has the potential to transform the diagnosis of hypertension by facilitating early identification, prompt therapy, and better patient outcomes. Future research will concentrate on integrating other data sources, such as genetic data, to further improve the model’s accuracy and scalability [1].

2. Literature Review

Eberhard Ritz, Bryan Williams, and Franz H. Messerli used epidemiological research to forecast cardiovascular risk and suggested conventional antihypertensive therapy approaches. Because theirs was a clinical evaluation, no accuracy metrics were given. The study recommends more research into contemporary strategies to improve the treatment of hypertension [2]. Clea du Toit et al. presented a number of machine learning models, such as Random Forests, Logistic Regression, and Neural Networks, for hypertension classification, blood pressure prediction, and real-time estimation, with SVM (AUC 0.8977) and Random Forest (99.5%) among the metrics attained. Their future research will focus on enhancing recall and accuracy and on adapting machine learning methods for practical use [3]. To identify hypertension and estimate blood pressure, Erick Martinez-Ríos, Luis Montesinos, Mariel Alfaro-Ponce, and Leandro Pecchia investigated machine learning models such as SVM, CNN, LSTM, and regression approaches. The metrics attained include LSTM (97.33%), an F1-score of 93.93%, and CNN (89.95%). To improve performance in clinical settings, their future research will concentrate on using different datasets and more sophisticated models [4].

Thomas Mroz et al. used XGBoost to predict hypertension control, achieving an AUC of 0.756 on retrospective data from the Cleveland Clinic (350,008 patients). They suggest that future work could refine predictive capabilities for broader applications [5]. Yukina Hirata et al. used data from 885 patients to classify pulmonary hypertension with Logistic Regression, SVM, Random Forest, and XGBoost, obtaining AUCs ranging from 0.742 to 0.789. Their future research aims to increase classifier accuracy for clinical applications [6]. Using health data from 699 individuals, Sohrab Effati et al. employed Random Forest, Logistic Regression, and SVM to predict cardiovascular disease and hypertension in mine workers with 99% and 97% accuracy, respectively. Expanding the datasets to improve generalization may be a future project [7]. To diagnose hypertension, Sara Montagna et al. used Random Forest, Logistic Regression, Decision Tree, SVM, and XGBoost. On World Hypertension Day data (20,206 people), Random Forest achieved an AUC of 0.816. Improving sensitivity and specificity may be the main focus of future research [8].

Valeria Visco et al. investigated ML methods, such as deep learning and neural networks, for the treatment of hypertension, emphasizing uses in personalized medicine with omics-based data and electronic health records. Future research could concentrate on developing precision medicine techniques [9]. Majid Nour and Kemal Polat used personal characteristics including sex, age, and blood pressure to categorize different forms of hypertension using C4.5 Decision Tree [10], Random Forest, LDA, and LSVM. The accuracies obtained were 92.7% for LSVM, 96.3% for LDA, and 99.5% for C4.5 and Random Forest. Improving generalizability across various groups may be their main goal in future research [11]. Xiaolin Diao et al. optimized XGBoost using grid search and 10-fold cross-validation to predict secondary hypertension etiologies from EMRs [12]. The AUC was 0.924 for the composite model, 0.965 for primary aldosteronism, and 0.959 for thyroid dysfunction. The model may be extended to additional etiologies and clinical contexts in future research [13]. Yuki Sakai et al. proposed a novel intervention, CARTO-II combined with partial splenic embolization, to treat gastric varices caused by left-sided portal hypertension. Their outcomes showed significant variceal reduction with no complications during a 6-month follow-up. Future work could assess long-term efficacy and broader clinical applications [10,12,14].

3. Machine Learning Models

3.1. KNN

Using a distance measure, KNN classifies an individual by comparing their health indicators (such as blood pressure, age, and weight) with those of the closest neighbors in the dataset. It predicts hypertension by comparing a person’s characteristics to those of known hypertensive patients. Accuracy: 99.87%.
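The neighbor comparison described above can be illustrated with a minimal sketch. It assumes Euclidean distance and k = 3, neither of which is stated in the paper, and the six training rows are hypothetical values invented for illustration, not records from the study dataset.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows."""
    # Euclidean distance from the query to every training row (assumed metric).
    distances = [
        (math.dist(row, query), label)
        for row, label in zip(train_X, train_y)
    ]
    # Keep the k closest rows and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical rows: (systolic blood pressure in mmHg, age in years).
train_X = [(118, 34), (122, 41), (125, 38), (150, 58), (162, 63), (148, 55)]
train_y = [0, 0, 0, 1, 1, 1]  # 1 = hypertensive, 0 = not hypertensive

print(knn_predict(train_X, train_y, (155, 60)))  # neighbors are all hypertensive rows
print(knn_predict(train_X, train_y, (120, 36)))  # neighbors are all non-hypertensive rows
```

In practice the features would usually be rescaled first, because a raw distance is dominated by whichever indicator has the largest numeric range.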

3.2. Naive Bayes

Using probabilistic reasoning, Naive Bayes estimates the probability of hypertension from characteristics such as age, BMI, and blood pressure, which it treats as independent of one another. This independence assumption makes its predictions fast and efficient. Accuracy: 82.74%.
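As a rough illustration of the Naive Bayes calculation in this subsection, the sketch below uses the Gaussian variant, which is an assumption because the paper does not say which variant was used. The feature values are hypothetical, and the function names are introduced here only for the example.

```python
import math

def fit_gaussian_nb(X, y):
    """Estimate a prior plus per-feature mean and variance for each class."""
    model = {}
    for label in set(y):
        rows = [row for row, target in zip(X, y) if target == label]
        columns = list(zip(*rows))
        means = [sum(col) / len(col) for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / len(col) + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (len(rows) / len(X), means, variances)
    return model

def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior, treating features as independent."""
    def log_posterior(label):
        prior, means, variances = model[label]
        total = math.log(prior)
        for value, mean, var in zip(row, means, variances):
            # Log of the Gaussian density, added per feature (independence assumption).
            total += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)

# Hypothetical rows: (systolic blood pressure in mmHg, BMI).
X = [(118, 22.0), (121, 23.5), (125, 24.1), (149, 29.8), (155, 31.2), (160, 30.5)]
y = [0, 0, 0, 1, 1, 1]  # 1 = hypertensive, 0 = not hypertensive

model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, (152, 30.0)))
print(predict_gaussian_nb(model, (120, 23.0)))
```

Summing per-feature log densities is what the independence assumption buys: no joint distribution over the features has to be estimated.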

3.3. Decision Tree

To determine whether a person has hypertension, a Decision Tree splits the data at key thresholds (such as systolic blood pressure > 140) and produces a flowchart-like structure. Each decision node tests a feature, which simplifies the diagnostic process. Accuracy: 91.55%.
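How a tree might arrive at a threshold such as systolic blood pressure > 140 can be sketched as follows. The example assumes the Gini impurity criterion, which the paper does not confirm, and uses hypothetical readings chosen so that the classes separate cleanly.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_threshold(values, labels):
    """Find the split `value > threshold` with the lowest weighted Gini impurity."""
    best = (None, float("inf"))
    candidates = sorted(set(values))
    # Candidate thresholds lie midway between consecutive distinct values.
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(values)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical systolic readings (mmHg) and labels (1 = hypertensive).
systolic = [118, 126, 132, 138, 142, 150, 158, 165]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

threshold, impurity = best_threshold(systolic, labels)
print(threshold, impurity)  # 140.0 0.0
```

A full tree repeats this search over every feature at every node and recurses on the two resulting subsets.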

3.4. Random Forest

To increase prediction accuracy, Random Forest combines many decision trees. Each tree casts a vote on whether a person has hypertension, and the final classification is decided by majority vote. Accuracy: 95.86%.
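The voting step can be shown in isolation with a short sketch. The three "trees" below are hypothetical hand-written rules standing in for trained trees, and the thresholds are illustrative only; a real forest would train each tree on a bootstrap sample with a random subset of features.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical stand-ins for trained trees: each maps a patient record to 0 or 1.
trees = [
    lambda p: int(p["trestbps"] > 140),
    lambda p: int(p["age"] > 55 and p["chol"] > 240),
    lambda p: int(p["trestbps"] > 135 or p["chol"] > 260),
]

def forest_predict(trees, patient):
    """Collect one vote per tree and return the majority label."""
    return majority_vote([tree(patient) for tree in trees])

patient_a = {"age": 61, "trestbps": 146, "chol": 230}  # votes: 1, 0, 1
patient_b = {"age": 40, "trestbps": 120, "chol": 200}  # votes: 0, 0, 0

print(forest_predict(trees, patient_a))  # 1
print(forest_predict(trees, patient_b))  # 0
```

Using an odd number of trees avoids ties in a two-class problem.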

3.5. Logistic Regression Model

Logistic Regression models the log-odds of hypertension as a linear function of the input features. It produces a probability score, and individuals are classified against a chosen cut-off, commonly 50%. Accuracy: 85.85%.
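The probability score and cut-off can be sketched as below. The weights and bias are hypothetical values chosen for illustration, not coefficients fitted in the study.

```python
import math

def sigmoid(z):
    """Map a log-odds value to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, features):
    """Probability of hypertension from a linear combination of the features."""
    log_odds = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(log_odds)

def classify(probability, cutoff=0.5):
    """Label a patient as hypertensive (1) when the probability exceeds the cut-off."""
    return int(probability > cutoff)

# Hypothetical coefficients for (systolic blood pressure in mmHg, age in years).
weights = [0.08, 0.03]
bias = -13.0

p_high = predict_proba(weights, bias, (150, 60))
p_low = predict_proba(weights, bias, (120, 35))
print(round(p_high, 3), classify(p_high))
print(round(p_low, 3), classify(p_low))

# exp(weight) is the odds ratio for a one-unit increase in that feature.
print(round(math.exp(weights[0]), 3))
```

Lowering the cut-off below 0.5 trades more false positives for fewer missed cases, which may be preferable in a screening setting.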

3.6. Deep Learning Algorithms

Deep learning applies neural networks to identify complex relationships among characteristics such as age, blood pressure, and cholesterol. Using these patterns, individuals can be classified as hypertensive or non-hypertensive. Accuracy: 99.92%.
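The paper does not describe the network architecture, so the following is only a sketch of how a small feed-forward network turns features into a class probability. The layer sizes, ReLU activation, weights, and standardized inputs are all assumptions made for illustration.

```python
import math

def relu(values):
    """Rectified linear activation applied element-wise."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One fully connected layer; `weights` holds one row per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

def forward(features, params):
    """Forward pass: hidden ReLU layer, then a single sigmoid output unit."""
    hidden = relu(dense(features, params["W1"], params["b1"]))
    logit = dense(hidden, params["W2"], params["b2"])[0]
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical parameters for 3 inputs, 2 hidden units, and 1 output.
params = {
    "W1": [[0.5, 0.2, 0.1], [-0.3, 0.4, 0.2]],
    "b1": [0.0, 0.1],
    "W2": [[1.2, -0.7]],
    "b2": [-0.5],
}

# Hypothetical standardized (age, blood pressure, cholesterol) values.
p_above = forward([1.5, 0.8, 0.3], params)
p_below = forward([-1.5, -0.8, -0.3], params)
print(round(p_above, 3), round(p_below, 3))
```

Training would adjust the weights by gradient descent on a loss such as binary cross-entropy; only inference is shown here.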

4. Proposed Framework

Figure 1 illustrates the proposed framework, the overall architecture for predictive modeling and for investigating cardiovascular health trends and hypertension risk factors. The dataset contains 26,083 records and 14 features associated with cardiovascular health and hypertension.

The attributes include age, sex, and clinical characteristics such as heart rate (thalach), cholesterol, fasting blood sugar, and resting blood pressure (trestbps). Additional characteristics are the type of chest pain (cp), electrocardiogram data (restecg), exercise-induced angina (exang), and metrics such as oldpeak, slope, and ca. The thal attribute relates to thalassemia, and the target variable (target) denotes the health outcome, that is, the hypertension classification. This dataset is well suited to predictive modeling and to investigating cardiovascular health trends and risk factors for hypertension.

The dataset was divided into training, validation, and test sets comprising 70%, 15%, and 15% of the data, respectively. We applied the following models to this dataset: KNN (accuracy 99.87%); Naïve Bayes (82.74%); Decision Tree (91.55%); Random Forest (95.86%); Gradient Boosted Tree (93.65%); Logistic Regression (85.85%); and deep learning algorithms (99.92%).
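For the 26,083 records reported above, a 70/15/15 split can be sketched as follows. The shuffling, the seed, and the choice to assign the rounding remainder to the test set are assumptions; the paper does not say how the split was drawn or whether it was stratified.

```python
import random

def split_indices(n_records, train_pct=70, val_pct=15, seed=42):
    """Shuffle record indices and cut them into train, validation, and test parts."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)  # fixed seed for a reproducible split
    n_train = n_records * train_pct // 100
    n_val = n_records * val_pct // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test

train, val, test = split_indices(26083)
print(len(train), len(val), len(test))  # 18258 3912 3913
```

The three parts are disjoint and together cover every record, so no patient appears in both the training and the evaluation data.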

5. Results

A number of works have addressed hypertension diagnosis using a variety of datasets and machine learning methods. In comparison with that literature, our proposed model uses deep learning techniques and achieves an accuracy of up to 99.92% on the hypertension dataset, as shown in Table 1. Its flexibility and robustness increase the probability of accurate hypertension identification. The model outperforms its predecessors in detecting important patterns in the data, which increases its applicability to complex medical cases. It can complement conventional ML models, which are susceptible to noise and bias, by providing a deep learning-based diagnostic system that is less prone to these issues. Moreover, the approach includes feature engineering and selection steps that allow the model to concentrate on the essential predictors of hypertension. The use of deep learning thus leads the model to identify key features and lessen potential bias. This comprehensive feature analysis and integration supports high-quality diagnostic results and eases the diagnosis and treatment of hypertension.
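Because every result in this paper is reported as accuracy, a short sketch of how that figure is derived from predictions may be useful. The label vectors below are hypothetical and unrelated to the study data; the confusion counts are included only to show what accuracy summarizes.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = hypertensive)."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(1 for t, p in pairs if t == 1 and p == 1),
        "tn": sum(1 for t, p in pairs if t == 0 and p == 0),
        "fp": sum(1 for t, p in pairs if t == 0 and p == 1),
        "fn": sum(1 for t, p in pairs if t == 1 and p == 0),
    }

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    counts = confusion_counts(y_true, y_pred)
    return (counts["tp"] + counts["tn"]) / len(y_true)

# Hypothetical true labels and model predictions for eight patients.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 1]

print(confusion_counts(y_true, y_pred))  # {'tp': 3, 'tn': 3, 'fp': 1, 'fn': 1}
print(accuracy(y_true, y_pred))          # 0.75
```

Accuracy alone does not distinguish missed cases (false negatives) from false alarms (false positives), which is why sensitivity and specificity are often reported alongside it.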

6. Conclusions and Future Work

Hypertension remains a global public health problem, and its timely and accurate diagnosis is one of the most important preventive goals. The present study proposes a novel ML framework whose 99.92% accuracy exceeds the best previously documented accuracy of 99.5%. This result highlights the robustness and trustworthiness of the proposed framework as a means of ensuring timely treatment and improving patients’ quality of life. Future research will enrich the framework with other data sources, such as lifestyle trends, genetic biomarkers, environmental factors, and patient records logged over specific periods, with the aim of building a more comprehensive forecasting model. To forecast and control hypertension, we will also investigate the integration of real-time monitoring technologies, such as wearable medical devices and Internet of Things-enabled sensors, which can provide continuous and dynamic data. These developments will allow for proactive and individualized healthcare in addition to increasing diagnostic accuracy. To facilitate clinical decision-making, efforts will also be directed at improving the model’s interpretability and explainability. By providing clear insight into the variables that drive its predictions, the framework can assist in identifying possible therapies tailored to the needs of specific patients.

Author Contributions

M.A. and A.U.R. contributed to the conceptualization, methodology, and validation of the study. A.S. assisted with formal analysis, investigation, and review of the manuscript. A.U.R. supervised the overall work and finalized the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Ahmed, S.; Hossain, M.A.; Bhuiyan, M.M.I.; Ray, S.K.A. A Comparative Study of Machine Learning Algorithms to Predict Road Accident Severity. In Proceedings of the 2021 20th International Conference on Ubiquitous Computing and Communications (IUCC/CIT/DSCI/SmartCNS), Virtual, 20–22 December 2021; pp. 390–397.
2. Visco, V.; Izzo, C.; Mancusi, C.; Rispoli, A.; Tedeschi, M.; Virtuoso, N.; Giano, A.; Gioia, R.; Melfi, A.; Serio, B.; et al. Artificial Intelligence in Hypertension Management: An Ace up Your Sleeve. J. Cardiovasc. Dev. Dis. 2023, 10, 74.
3. Diao, X.; Huo, Y.; Yan, Z.; Wang, H.; Yuan, J.; Wang, Y.; Cai, J.; Zhao, W. An application of machine learning to etiological diagnosis of secondary hypertension: Retrospective study using electronic medical records. JMIR Med. Inform. 2021, 9, e19739.
4. Effati, S.; Kamarzardi-Torghabe, A.; Azizi-Froutaghe, F.; Atighi, I.; Ghiasi-Hafez, S. Web application using machine learning to predict cardiovascular disease and hypertension in mine workers. Sci. Rep. 2024, 14, 1.
5. Martinez-Ríos, E.; Montesinos, L.; Alfaro-Ponce, M.; Pecchia, L. A review of machine learning in hypertension detection and blood pressure estimation based on clinical and physiological data. Biomed. Signal Process. Control 2021, 70, 102813.
6. Sakai, Y.; Yamamoto, A.; Jogo, A.; Kita, R.; Hirose, H.; Ikeda, K.; Terayama, E.; Ozaki, M.; Murai, K.; Kageyama, K.; et al. A Case of Successful Treatment of Gastric Varices Due to Left-sided Portal Hypertension with Multidisciplinary Treatment Including Transportal Coil-assisted Balloon-occluded Retrograde Transvenous Obliteration II and Partial Splenic Embolization. Interv. Radiol. 2025, 10, e2023-0025.
7. Nour, M.; Polat, K. Automatic Classification of Hypertension Types Based on Personal Features by Machine Learning Algorithms. Math. Probl. Eng. 2020, 2020, 2742781.
8. Pengo, M.; Montagna, S.; Ferretti, S.; Bilo, G.; Borghi, C.; Ferri, C.; Parati, G. Machine Learning in Hypertension Detection: A Study on World Hypertension Day Data. J. Med. Syst. 2023, 47, 1.
9. Mroz, T.; Griffin, M.; Cartabuke, R.; Laffin, L.; Russo-Alvarez, G.; Thomas, G.; Smedira, N.; Meese, T.; Shost, M.; Habboub, G.; et al. Predicting hypertension control using machine learning. PLoS ONE 2024, 19, e0299932.
10. Diwaker, C.; Tomar, P.; Solanki, A.; Nayyar, A.; Jhanjhi, N.Z.; Abdullah, A.; Supramaniam, M.A. A New Model for Predicting Component-Based Software Reliability Using Soft Computing. IEEE Access 2019, 7, 147191–147203.
11. du Toit, C.; Tran, T.Q.B.; Deo, N.; Aryal, S.; Lip, S.; Sykes, R.; Manandhar, I.; Sionakidis, A.; Stevenson, L.; Pattnaik, H.; et al. Survey and Evaluation of Hypertension Machine Learning Research. J. Am. Heart Assoc. 2023, 12, e027896.
12. Kok, S.H.; Abdullah, A.; Jhanjhi, N.Z.; Supramaniam, M. A review of intrusion detection system using machine learning approach. Int. J. Eng. Res. Technol. 2019, 12, 8–15.
13. Hirata, Y.; Tsuji, T.; Kotoku, J.; Sata, M.; Kusunose, K. Echocardiographic artificial intelligence for pulmonary hypertension classification. Heart 2024, 110, 586–593.
14. Messerli, F.H.; Williams, B.; Ritz, E. Essential hypertension. Lancet 2007, 361, 1629–1641. Available online: https://www.thelancet.com (accessed on 1 January 2025).

Figure 1. Framework.

Table 1. Results of classifier accuracy.

Classifier	Accuracy
KNN	99.87%
Naive Bayes	82.74%
Decision Tree	91.55%
Random Forest	95.86%
Gradient Boosted Tree	93.65%
Logistic Regression Model	85.85%
Deep Learning Algorithms	99.92%

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
