Article

Enhancing Cardiovascular Disease Classification with Routine Blood Tests Using an Explainable AI Approach

by Nurdaulet Tasmurzayev 1, Bibars Amangeldy 1,*, Zhanel Baigarayeva 1,2, Assiya Boltaboyeva 1,2, Baglan Imanbek 1,*, Naoya Maeda-Nishino 3,4, Sarsenbek Zhussupbekov 5 and Aliya Baidauletova 6
1 Faculty of Information Technologies and Artificial Intelligence, Al-Farabi Kazakh National University, Almaty 050040, Kazakhstan
2 LLP Kazakhstan R&D Solutions, Almaty 050056, Kazakhstan
3 Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Palo Alto, CA 94305, USA
4 HAKUAI Medical Corporation, Osaka 573-1010, Japan
5 Department of Automation and Control, Energo University, Almaty 050013, Kazakhstan
6 Neurology and Sleep Medicine Center, Almaty 050008, Kazakhstan
* Authors to whom correspondence should be addressed.
Algorithms 2025, 18(11), 708; https://doi.org/10.3390/a18110708
Submission received: 7 October 2025 / Revised: 31 October 2025 / Accepted: 5 November 2025 / Published: 7 November 2025

Abstract

Background: While machine learning (ML) is widely applied in cardiology, a critical research gap persists. The incremental diagnostic value of routine blood tests for classifying cardiovascular disease (CVD) remains largely unquantified, and many models operate as non-interpretable “black boxes,” limiting their clinical adoption. This study aims to address these gaps by quantifying the contribution of readily available laboratory panels and demonstrating the utility of transparent diagnostic modeling within a real-world clinical cohort. Methods: We conducted a retrospective study on the clinical data of 896 adult patients from a hospital database. A baseline feature set (demographics, vital signs) was compared against an enhanced set that additionally included results from routine hematology and biochemistry panels. Five machine learning classifiers were trained and evaluated. To ensure transparency, SHAP (SHapley Additive exPlanations) analysis, a key component of explainable AI (XAI), was used to interpret the predictions of the top-performing model. Results: The inclusion of routine blood tests consistently and significantly improved the performance of all classifiers. The XGBoost model demonstrated the best performance (accuracy 91.62%, precision 95.00%, recall 87.36%). Critically, SHAP analysis identified aspartate aminotransferase (AST), glucose, and creatinine as the most significant biomarkers, providing clear, interpretable insights into the biochemical drivers of the model’s predictions. Conclusion: Routine laboratory markers contain a strong, interpretable signal indicative of CVD that is crucial for accurate risk stratification. These findings underscore the diagnostic relevance of common blood biomarkers and demonstrate how explainable AI can transform routine clinical data into transparent and actionable cardiovascular insights. Further validation in larger and demographically diverse cohorts is warranted.

1. Introduction

Cardiovascular diseases (CVDs) are a serious global health problem and remain the leading cause of death worldwide, with 17.9 million deaths recorded in 2019 [1,2]. The burden spans regions and age groups: among young people (ages 15–39), mortality rates fell between 1990 and 2019, yet standardized incidence and prevalence rates rose moderately or significantly [3]. Key risk factors include high systolic blood pressure, high body mass index (BMI), and high LDL cholesterol, as well as lifestyle determinants and environmental factors [3,4].
In Kazakhstan, as globally, CVDs are the main cause of death [4]. Although preventable mortality from CVDs declined overall between 2011 and 2021, it rose in 2020 and 2021 for conditions such as ischemic heart disease and cerebrovascular disease, and mortality among young people remains high [5]. The country is experiencing a significant increase in the incidence of arterial hypertension, and predictive modeling forecasts a further increase in cases of acute myocardial infarction and cerebrovascular diseases by 2030 [6].
CVDs also impose a substantial social and economic burden on individuals, families, communities, and societies as a whole. Their high prevalence and incidence make CVD the most common cause of death in European Society of Cardiology (ESC) member states. Although age-standardised CVD mortality rates have declined overall during the past 30 years, some regions have seen increases since 2010, and the global decline has slowed over the past 5 years [7]. Illness and treatment also cost patients time: the disease burden associated with CVD results in fewer healthy life years and a worse quality of life [8]. CVDs account for more than one third of all potential years of life lost (PYLL). Beyond their health impact, CVDs are an economic challenge for health systems, and this challenge is expected to grow substantially. CVD accounts for the largest share of health expenditure, reaching around 16% of total expenditure in some countries. The European Union alone spent 210 billion euros on CVD in 2015 [9]. The economic impact includes not only the direct cost of care but also lost labor productivity.
Although Kazakhstan has made significant efforts to improve tertiary CVD prevention and covers costly interventional procedures such as percutaneous coronary interventions, the projected increase in CVD complications could overwhelm the healthcare system in the coming decade and lead to significant economic pressure [6].
Machine learning (ML) techniques are increasingly applied in CVD diagnosis, where they enable automated interpretation of electrocardiograms (ECGs) and imaging data. They support tasks such as arrhythmia detection, identification of structural and functional abnormalities (left ventricular hypertrophy, contractile dysfunction), and direct diagnosis of conditions including hypertrophic cardiomyopathy and heart failure, in many cases surpassing human interpretation [10,11]. They are also applied to ejection fraction calculation, as well as to enhanced coronary computed tomography angiography (CCTA) and coronary artery calcium (CAC) analysis for assessing atherosclerosis and stenosis [12]. ECGs are well suited to deep learning (DL) because they are widely accessible and natively digital, and convolutional neural networks (CNNs) have achieved diagnostic accuracy comparable to specialists in detecting arrhythmias, valvulopathies, heart failure, cardiomyopathies, myocardial infarction, and pulmonary hypertension. AI-enabled wearables using photoplethysmography further expand screening potential [13].
A study focused on improving early detection of heart diseases, particularly myocardial infarction (MI), using machine learning (ML) and deep learning while addressing imbalanced datasets. Models such as KNN, SVM, Logistic Regression, CNN, Gradient Boost, XGBoost, and Random Forest were evaluated on two datasets, with a fine-tuned XGBoost achieving the best performance (98.50% accuracy, 99.14% precision, 98.29% recall, 98.71% F1) [14]. Another study developed a model for predicting CVDs by applying k-modes clustering with Huang initialization to preprocess data. Continuous attributes such as age, height, weight, and blood pressure were converted into categorical values, and new features like Body Mass Index (BMI) and Mean Arterial Pressure (MAP) were generated to enhance accuracy and interpretability. These steps improved performance across multiple ML models, with the MLP achieving the best results (87.28% accuracy, 0.95 AUC) [15].
Biomarkers are essential tools in modern medicine, serving as objective, quantifiable indicators of biological processes and therapeutic responses. However, despite significant advancements and the use of “omics” technologies, their clinical application is often limited by a fragmented approach. Many biomarkers are analyzed in isolation using traditional statistical methods that rely on mean thresholds, which fail to capture the complexity of disease dynamics, inter-individual variability, and treatment responses [16,17]. This limitation is compounded by a frequent lack of comprehensive validation in large, independent cohorts, which hinders their prognostic utility across diverse patient populations [18,19,20]. In contrast, studies in cardiovascular disease demonstrate the potential of integrating even simple, accessible markers into more sophisticated models. Hematological markers from a complete blood count (CBC) and biochemical markers like lipid profiles or creatinine offer significant prognostic value by reflecting systemic inflammation, metabolic dysfunction, and thrombotic risk [21,22,23]. Notably, machine learning (ML) models utilizing these simple hematological markers have achieved high concordance indices (0.60 to 0.80) for predicting major outcomes like heart failure and all-cause mortality, highlighting the clinical utility of combining accessible data with advanced analytical frameworks [24].
Despite the widespread application of machine learning in cardiology, a critical blind spot persists. The incremental diagnostic value of adding routine hematological and biochemical blood panels to baseline clinical data for classifying current cardiovascular disease remains largely unquantified. Most existing studies focus on long-term risk prediction or isolated biomarkers, failing to provide rigorous comparisons that measure the precise gain in accuracy from a standard blood test [25,26]. This gap is deepened by a second, equally important problem: the lack of interpretability. Many high-performing models operate as “black boxes,” leaving clinicians unable to understand why a particular prediction was made [27,28,29]. This opacity erodes trust and creates a major barrier to real-world clinical adoption, as doctors are hesitant to act on recommendations they cannot verify or understand through their own medical knowledge.
This study aims to address the two challenges of model quantification and interpretability. We employ a methodologically sound, model-agnostic, paired design that allows for an assessment of the impact of routine laboratory tests on diagnostic quality and provides a data-driven understanding of their value. A key focus of our work is on interpretability, which helps transform the model from a “black box” into an understandable tool for clinicians. The use of SHAP analysis has allowed us to not only confirm the better performance of the model with blood test data but also to identify the predictive biomarkers that underlie its decisions, particularly AST, glucose, and creatinine. By linking these markers to known pathophysiological mechanisms, we provide physicians with practically relevant information, turning the algorithm into a tool capable of supporting and complementing their clinical reasoning.

2. Materials and Methods

2.1. Sample and Data Description

This study utilized a retrospective cohort design based on anonymized patient data from the City Clinical Hospital, a major multidisciplinary healthcare provider under the Public Health Department of Almaty, Kazakhstan. The use of a localized, contemporary cohort is a key aspect of this research, as it ensures the relevance of the findings to the specific demographic and epidemiological context of the region, which may differ from populations described in foundational international studies. The overall process, from data acquisition and patient selection to the final formulation of the experimental datasets, is illustrated in Figure 1.
The dataset covers records of patients hospitalized between January 2023 and February 2025. In accordance with data protection and ethical principles, the data was received in a fully anonymized format.
All clinical and laboratory variables used in this analysis were collected at a single time point corresponding to each patient’s initial assessment upon hospital admission. This means that every recorded parameter—such as blood pressure, heart rate, and biochemical markers—represents the baseline measurement obtained before any therapeutic intervention was initiated. Using admission-time data ensures a standardized baseline across all participants, thereby minimizing confounding effects related to treatment progression, length of hospital stay, or measurement variability over time. This design choice was made deliberately to maintain comparability among patients and to align with the study’s predictive modeling objective, which focuses on classifying cardiovascular disease based on initial clinical presentation rather than treatment response [30,31].
The study population consisted of adult patients aged 18 and older. The key inclusion criterion was the availability of a complete clinical and laboratory record, which mandatorily included the following data: age, sex, systolic and diastolic blood pressure, complete blood count parameters (PLT, HGB, WBC, RBC, HCT), and glucose level. Furthermore, a clearly documented primary diagnosis in the medical record was a prerequisite.
To minimize the influence of confounding factors, a series of strict exclusion criteria were applied. Patients with an uncertain, preliminary, or symptomatic diagnosis (e.g., “chest pain” instead of “angina pectoris”) were excluded from the analysis. Also excluded were patients whose data were collected during acute non-cardiac conditions, such as severe trauma, sepsis, or acute hemorrhage, as this could lead to temporary distortions in hematological parameters. The exclusion criterion was extended to patients with terminal conditions, including stage 4 oncological diseases and severe renal or hepatic failure. Additionally, pregnant women were excluded from the sample due to physiological changes in blood counts and hemodynamics. Records containing physically impossible or obviously erroneous values (e.g., systolic BP > 300 mmHg or age < 18 years) were also removed from the dataset. The case group consisted of 1198 individuals with a clinically confirmed primary diagnosis of a major cardiovascular disease. The mean age in this group was 63.8 ± 11.5 years (age range: 28–91 years), and it was composed of 52% males (n = 623) and 48% females (n = 575).
The control group consisted of 900 individuals. To ensure the study’s validity and create a cardiovascular health reference group, participants underwent a multi-level selection process. The primary inclusion criterion was the complete absence of cardiac complaints and any established diagnosis of cardiovascular disease (CVD), both past and present. The exclusion criteria were as follows: all patients with ischemic heart disease (IHD), a history of myocardial infarction or stroke, chronic heart failure (CHF), atrial fibrillation, significant valvular defects, or previous heart or vascular surgery were excluded. Furthermore, individuals with key risk factors such as type 1 or type 2 diabetes mellitus, chronic kidney disease (CKD) stage 3 or higher, and uncontrolled arterial hypertension (BP > 160/100 mmHg) were excluded from the control group. The selection also considered laboratory markers of systemic inflammation (ESR > 20 mm/h) and the regular use of cardioprotective medications (statins, antiplatelet agents), as their prescription already indicates an elevated baseline risk. As a result of this thorough selection, a control group was formed with a mean age of 59.3 ± 14.8 years (range: 28 to 89 years); the group consisted of 55% females and 45% males.

2.2. Formulation of Experimental Datasets

At the initial stage of data acquisition, the records exhibited a rich and multidimensional structure. The raw tables comprised 45 columns, encompassing a wide range of attributes. These included demographic and physiological variables (age, gender, height, weight, body temperature, blood pressure, heart rate, respiratory rate, blood oxygen saturation, blood glucose levels, etc.), clinical and diagnostic details (preliminary and preoperative diagnoses, symptoms, medical history, diagnostic methods, and prescribed medications), as well as extensive laboratory blood test results (platelets, hemoglobin, white blood cells, glucose, bilirubin, ALT, AST, and others). The completeness and diversity of these variables enabled an in-depth and comprehensive analytical approach throughout the study.
Table 1 and Table 2 present real excerpts from the two resulting datasets. These samples illustrate the actual structure of the data used in the study, including the presence of missing values in certain features (e.g., oxygen saturation or hematocrit), and serve to transparently reflect the original form in which the data were processed.
The first dataset, referred to as Dataset 1, includes essential baseline clinical parameters that are commonly available in routine hospital records. Specifically, it comprises the patient’s age, gender (both in Cyrillic and binary encoded formats), height, weight, blood pressure (both combined and as separate systolic and diastolic values), body mass index (BMI), heart rate (HR), and peripheral oxygen saturation (SpO2). As illustrated in Table 1, the dataset preserves the original structure of the medical records, including character-based gender markers (e.g., “F”, “M”) and real-world imperfections such as missing values in some entries (e.g., SpO2 for patient ID 4). These features provide a realistic foundation for downstream analysis, allowing models to be developed and validated under conditions that closely resemble practical clinical scenarios.
In addition to the fundamental clinical attributes already included in Dataset 1—such as age, gender, weight, height, blood pressure, body mass index (BMI), heart rate, and blood oxygen saturation—Dataset 2 incorporates an extended panel of blood-based laboratory biomarkers. These variables were selected to enrich the dataset with physiologically relevant indicators that are frequently used in the diagnosis and management of cardiovascular disease (CVD), thereby enabling more granular modeling and improved predictive capacity.
As shown in Table 2, the added biomarkers include platelet count (PLT), hemoglobin (HGB), white blood cell count (WBC), red blood cell count (RBC), hematocrit (HCT), serum creatinine (Creat), alanine aminotransferase (ALT), aspartate aminotransferase (AST), total bilirubin (Bili), and fasting blood glucose (Glucose). These parameters provide insight into systemic inflammation, oxygen transport, liver and kidney function, and metabolic state—factors known to influence cardiovascular health. Notably, as observed in the table, some values such as hematocrit are missing in a few entries, reflecting the real-world incompleteness and variability typically encountered in clinical datasets.

2.3. Data Preprocessing

The data preparation stage followed a complete-case principle to ensure data quality and model comparability. Our initial cohort consisted of patients for whom baseline clinical variables (age, sex, and other covariates) were available, resulting in a sample of 1557 patients. To assess the incremental value of laboratory tests, we then applied an inclusion criterion: from this group, we selected only those who also had a complete set of all required routine laboratory results specified in the study. This filtering process yielded a final analytical sample of 896 patients.
During preprocessing, records with missing or corrupted values in key clinical or laboratory variables were excluded. The overall proportion of incomplete records was relatively small (<10%), so their exclusion did not materially affect statistical power or population representativeness. Although missing values could be imputed using median or mode replacement, such imputation was avoided because it may artificially reduce variance and distort clinically meaningful relationships among biomarkers. Given that the objective of this study was to evaluate diagnostic signal quality under realistic but controlled conditions, a complete-case analysis was considered the most appropriate strategy to preserve methodological integrity and avoid potential imputation bias.
To ensure a fair and direct comparison, this final cohort of 896 patients was used to construct both experimental datasets. Dataset 2 (Enhanced) was created using all features (baseline + laboratory) for these 896 individuals. Dataset 1 (Baseline) was created using data from the exact same 896 individuals, but with the laboratory features excluded. This paired-design approach guarantees that the models are trained and evaluated on the identical set of patients, ensuring that any performance differences are directly attributable to the inclusion of the blood biomarkers. The final harmonized datasets used for all subsequent modeling consisted of 459 CVD cases and 437 non-CVD cases and were ready for subsequent modeling tasks.
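The paired design described above can be sketched in a few lines. This is a minimal illustration, not the study's actual code: the column names, the `make_row` helper, and the toy records are hypothetical, and only the logic (filter to complete cases once, then derive both datasets from the same patients) follows the text.

```python
# Hypothetical column names; the study's real schema may differ.
BASELINE = ["age", "gender", "sbp", "dbp", "bmi", "hr", "spo2"]
LABS = ["plt", "hgb", "wbc", "rbc", "hct", "creat", "alt", "ast", "bili", "glucose"]
TARGET = "cvd"


def complete_cases(records, columns):
    """Keep only records that have a non-missing value in every listed column."""
    return [r for r in records if all(r.get(c) is not None for c in columns)]


def build_paired_datasets(records):
    """Return (baseline, enhanced) row lists that cover exactly the same patients."""
    cohort = complete_cases(records, BASELINE + LABS + [TARGET])
    enhanced = [{c: r[c] for c in BASELINE + LABS + [TARGET]} for r in cohort]
    baseline = [{c: r[c] for c in BASELINE + [TARGET]} for r in cohort]
    return baseline, enhanced


def make_row(**overrides):
    """Build a toy patient record with placeholder values (illustration only)."""
    row = {c: 1.0 for c in BASELINE + LABS}
    row[TARGET] = 0
    row.update(overrides)
    return row


# Three toy patients; the second lacks a hematocrit value and is dropped
# from BOTH datasets, so the comparison stays paired.
toy = [make_row(age=64, cvd=1), make_row(hct=None), make_row(age=51)]
baseline, enhanced = build_paired_datasets(toy)
print(len(baseline), len(enhanced))
```

Because both datasets are derived from one filtered cohort, row i of the baseline set and row i of the enhanced set always refer to the same patient.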
To enable model ingestion, all non-numeric categorical features were encoded into numeric format. Specifically, the “Gender” column, originally represented in Cyrillic characters (“F” for female and “M” for male), was converted using Label Encoding into binary values [32].
As shown in Table 3, the “Gender” column now contains binary values, with 0 indicating female and 1 indicating male. Additionally, a new target column “CVD” was introduced to label the presence of cardiovascular disease, where 1 denotes a positive diagnosis and 0 indicates no disease. These transformations rendered the datasets fully numeric and suitable for training classification models.
Before initiating the model training phase, mutual information (MI) analysis was conducted to assess the relationship between each feature and the target variable, CVD [33]. We chose MI because it can capture any statistical dependencies, including complex nonlinear relationships. This makes it a more robust tool for investigating the relevance of features in complex biological datasets [34]. This statistical method quantifies the amount of information shared between input features and the target, helping identify the most informative predictors.
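To make the quantity concrete, the following sketch computes a plug-in estimate of mutual information, I(X; Y) = Σ p(x, y) log[p(x, y) / (p(x) p(y))], for discrete values. It is an illustration under stated assumptions: the text does not specify which MI estimator was used, the equal-width binning of continuous features is a simplification chosen here, and the function names are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats for two equally long discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p(x, y) / (p(x) p(y)) simplifies to count * n / (count_x * count_y).
        mi += p_xy * math.log(count * n / (px[x] * py[y]))
    return mi


def equal_width_bins(values, n_bins=4):
    """Discretize a continuous feature into n_bins equal-width bins (a simplification)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Toy example: a feature that fully determines the label carries log(2) nats,
# while a feature that is independent of the label carries none.
labels = [0, 0, 1, 1]
informative = equal_width_bins([100, 110, 150, 160], n_bins=2)
uninformative = [0, 1, 0, 1]
print(mutual_information(informative, labels))
print(mutual_information(uninformative, labels))
```

A score of zero, as reported for hematocrit, means the estimator found no dependency in this sample; it does not by itself prove that the feature is irrelevant in combination with others.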
Figure 2 illustrates the MI values for the baseline clinical features included in Dataset 1. The highest scores were observed for systolic blood pressure (0.258), heart rate (0.231), and diastolic blood pressure (0.170), indicating a strong association with CVD status. In contrast, features such as gender, age, and weight exhibited lower MI values, suggesting limited individual predictive power.
Figure 3 presents the MI scores for Dataset 2, limited to blood-based laboratory features. Among these, AST (0.089) and glucose (0.083) demonstrated the highest levels of mutual information with the CVD target. Other features, including bilirubin, erythrocytes, creatinine, and hemoglobin, showed weaker associations, while hematocrit (HCT) exhibited no mutual information at all, indicating no detectable dependency with CVD diagnosis in this dataset.
Thus, the preprocessing phase not only improved the overall quality of the datasets but also structured them appropriately for modeling and facilitated the identification of the most relevant predictive features.

2.4. Machine Learning Models

The performance of machine learning systems built on clinical data largely depends on the quality of data preprocessing. Clinical features often vary in scale and units of measurement, which may lead certain models—especially those based on gradient methods or kernel functions—to give undue weight to specific variables, thereby distorting the overall diagnostic classification results. To address this, all numerical features in the dataset were standardized to a common scale.
This standardization was achieved using z-score normalization, a technique that transforms each feature based on its mean and standard deviation [35]. As a result, all variables are centered around zero with a standard deviation of one.
For model training and evaluation, the final 896-patient dataset was partitioned into a training set (80% of the data) and a hold-out test set (20%). This strict separation ensures that the models are evaluated on data they have never seen during training or hyperparameter tuning. All subsequent steps, including model fitting and tuning, were performed exclusively on the training set, while the test set was reserved for the final, unbiased performance assessment.
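The z-score transform, z = (x − mean) / std, and the 80/20 partition can be sketched as follows. Two points are assumptions of this sketch rather than statements about the study's pipeline: the scaling statistics are estimated on the training portion only and then reused for the test portion (a common precaution against leakage), and the data are synthetic. The function names are hypothetical.

```python
import random
import statistics


def split_indices(n_samples, test_fraction=0.2, seed=42):
    """Shuffle row indices and return (train_indices, test_indices)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def fit_zscore(values):
    """Estimate the mean and population standard deviation of one feature."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return mean, (std if std > 0 else 1.0)


def apply_zscore(values, mean, std):
    """Transform values to z = (x - mean) / std using already fitted statistics."""
    return [(v - mean) / std for v in values]


# Synthetic systolic blood pressure readings (mmHg), for illustration only.
rng = random.Random(0)
sbp = [rng.gauss(130, 15) for _ in range(896)]

train_idx, test_idx = split_indices(len(sbp))
mean, std = fit_zscore([sbp[i] for i in train_idx])
sbp_train = apply_zscore([sbp[i] for i in train_idx], mean, std)
sbp_test = apply_zscore([sbp[i] for i in test_idx], mean, std)
```

After this step the training values have mean 0 and standard deviation 1 by construction, whereas the test values are only approximately standardized, which is expected when the statistics come from the training set.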
Model selection was guided by the specific nature of clinical tasks, which require not only high predictive accuracy but also interpretability. For this purpose, five supervised machine learning algorithms were employed: Light Gradient Boosting Machine (LightGBM), XGBoost, Support Vector Machine (SVM), CatBoost, and Decision Tree.
LightGBM is a fast and efficient gradient boosting model. It significantly reduces training time and memory usage when working with large datasets. By using a histogram-based approach, it speeds up the learning process and reduces overfitting. Additionally, it provides feature importance scores, which makes it useful for clinical decision-making [36].
XGBoost is a high-performance implementation of gradient boosting. It works effectively with large structured datasets and is known for its resistance to overfitting, which enhances model stability and the reliability of its outcomes. Furthermore, it can handle missing values and offers feature importance scores for better interpretability [37].
Support Vector Machine (SVM) is a powerful algorithm commonly used for binary classification tasks. It performs well on high-dimensional data and aims to find the optimal hyperplane that best separates the classes. With kernel functions, it can define complex nonlinear decision boundaries. It is often used in medical diagnostics where precise decision boundaries are required [38].
CatBoost is a gradient boosting model specifically designed to work with categorical features. It can process categorical values directly without the need for prior encoding and is less sensitive to hyperparameters. It uses techniques that reduce overfitting and speed up the training process, making it well-suited for clinical datasets [39].
Decision Tree is a model that learns decision rules by splitting data based on feature thresholds. Each internal node represents a decision, while the leaves represent outcomes. It is simple and easy to interpret, but prone to overfitting on small datasets. This limitation can be mitigated by using trees within ensemble models [40].
To ensure a fair comparison, each model was trained on two separate datasets: Dataset 1, which included only basic clinical parameters, and Dataset 2, which was extended with blood test biomarkers. This dual-dataset approach allowed for a clear evaluation of the contribution of laboratory features to model performance.
Hyperparameter tuning for each model was conducted on the training set using GridSearchCV. This technique exhaustively explores combinations of parameters and uses a five-fold cross-validation scheme internally on the training data to identify the most effective hyperparameter configuration without causing data leakage [41]. For instance, parameter tuning for XGBoost and CatBoost included n_estimators, learning_rate, max_depth, and subsample; for SVM, C and gamma were varied; for Decision Tree, max_depth and min_samples_split were optimized; and for LightGBM, key parameters such as n_estimators, learning_rate, max_depth, subsample, and reg_lambda were tuned. Additionally, class_weight = ‘balanced’ was applied to address class imbalance. This process ensured that all models were tuned under the same protocol before comparison.
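The tuning protocol can be illustrated with scikit-learn's GridSearchCV applied to a Decision Tree. This is a sketch under stated assumptions: the data are synthetic, the candidate values in `param_grid` and the `scoring="f1"` choice are placeholders picked for the example, and only the overall pattern (grid search with five-fold cross-validation on the training split, one final evaluation on the hold-out split) follows the text.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the clinical feature matrix (illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Hold out 20% of the rows; the search below never sees them.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hypothetical candidate values; the study does not list its grids here.
param_grid = {"max_depth": [2, 4, 6], "min_samples_split": [2, 10, 20]}

search = GridSearchCV(
    DecisionTreeClassifier(class_weight="balanced", random_state=0),
    param_grid,
    cv=5,           # five-fold cross-validation inside the training set
    scoring="f1",   # assumed selection metric
)
search.fit(X_train, y_train)

# score() uses the same metric as the search, so this is the hold-out F1.
test_f1 = search.score(X_test, y_test)
print(search.best_params_, round(test_f1, 3))
```

By default GridSearchCV refits the best configuration on the full training split, so `search` can be used directly as the final model.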
Model performance was evaluated using four key classification metrics: accuracy, precision, recall, and F1 score [42].
Accuracy reflects the proportion of correctly classified instances among all predictions:
\[ \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \]
Precision measures the proportion of true positive cases among those predicted as positive:
\[ \mathrm{Precision} = \frac{TP}{TP + FP} \]
Recall indicates the proportion of actual positive cases that were correctly identified:
\[ \mathrm{Recall} = \frac{TP}{TP + FN} \]
F1 Score is the harmonic mean of precision and recall:
\[ F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \]
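As a worked check of these four definitions, the sketch below computes them from confusion-matrix counts, where TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives. The counts are hypothetical: they were chosen here because they reproduce the XGBoost figures quoted in the Abstract (accuracy 91.62%, precision 95.00%, recall 87.36%), not because the confusion matrix is reported in this section.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one predicted positive
    and at least one actual positive).
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts consistent with the reported XGBoost percentages.
metrics = classification_metrics(tp=76, tn=88, fp=4, fn=11)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts the functions return 0.9162, 0.9500, 0.8736, and 0.9102, which illustrates how a high precision and a somewhat lower recall combine into the harmonic-mean F1 score.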
In medical settings, recall and F1 score are particularly important, as they reflect the model’s ability to correctly identify patients with disease without missing critical cases. Overall, the preprocessing methods, model selection strategy, and evaluation procedures used in this study contribute to the scientific robustness and clinical relevance of the results.
We used SHAP (SHapley Additive exPlanations) to obtain clinically interpretable explanations of the model’s diagnostic decisions. In this study, positive SHAP values indicate a feature’s contribution to increasing the predicted probability of prevalent CVD, while negative values indicate a contribution to decreasing it; the absolute value |SHAP| reflects the strength of influence. In the plots, color encodes the observed feature value (low → high).
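The sign convention and the additive property can be illustrated with exact Shapley values on a toy model. Everything specific in this sketch is an assumption: `risk_score` is a made-up linear score with invented coefficients (not the study's XGBoost model), "absent" features are replaced by a reference patient's values, and all coalitions are enumerated by brute force, which is only feasible for a handful of features. Practical SHAP implementations rely on faster model-specific or approximate algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of each feature for one instance.

    Features outside a coalition take their value from `baseline`.
    """
    n = len(instance)

    def value(coalition):
        mixed = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis


def risk_score(x):
    """Hypothetical linear risk score over (AST, glucose, creatinine)."""
    ast, glucose, creatinine = x
    return 0.1 + 0.004 * ast + 0.05 * glucose + 0.002 * creatinine


patient = [60.0, 7.5, 80.0]     # illustrative values only
reference = [25.0, 5.0, 90.0]   # illustrative reference patient
phis = shapley_values(risk_score, patient, reference)
print(phis)
```

Here the elevated AST and glucose receive positive contributions and the below-reference creatinine a negative one, and the three contributions sum to the difference between the patient's score and the reference score.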
In addition to discrimination metrics, we also assessed model calibration to ensure that predicted probabilities align with actual observed outcomes. Calibration plots were generated to visualize the agreement between predicted risk probabilities and true outcome frequencies using the predictions on the hold-out test set. A perfectly calibrated model would produce a calibration curve closely following the diagonal line, indicating that the predicted probability of cardiovascular disease corresponds accurately to the observed prevalence [43]. Deviations from this line highlight overestimation or underestimation of risk, which is critical in clinical applications where probability outputs may directly influence medical decision-making. By combining calibration analysis with SHAP-based interpretability, our evaluation provides a comprehensive view of both the accuracy and reliability of the developed models.
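The points of a calibration plot can be computed by grouping predictions into probability bins and comparing the mean predicted probability with the observed event rate in each bin. The sketch below assumes equal-width bins and uses made-up predictions; the binning strategy used for the study's plots is not stated, and the function name is hypothetical.

```python
def calibration_points(y_true, y_prob, n_bins=5):
    """Return (mean predicted probability, observed event rate) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for label, prob in zip(y_true, y_prob):
        index = min(int(prob * n_bins), n_bins - 1)  # prob == 1.0 goes in the last bin
        bins[index].append((label, prob))

    points = []
    for members in bins:
        if not members:
            continue
        mean_predicted = sum(p for _, p in members) / len(members)
        observed_rate = sum(y for y, _ in members) / len(members)
        points.append((mean_predicted, observed_rate))
    return points


# Made-up hold-out predictions: four low-risk and four high-risk patients.
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_prob = [0.2, 0.2, 0.2, 0.2, 0.8, 0.8, 0.8, 0.8]
points = calibration_points(y_true, y_prob, n_bins=2)
print(points)
```

In this toy case the low-risk bin predicts 0.2 against an observed rate of 0.25, and the high-risk bin predicts 0.8 against 0.75, so both points lie close to the diagonal.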
We preserve original measurement units and clinical reference ranges to facilitate clinical interpretation. We report global explanations (beeswarm and dependence plots) and patient-level explanations (force plots).

3. Results

This section presents a detailed comparative evaluation of machine learning models applied to two structurally distinct datasets designed for the diagnostic classification of cardiovascular disease (CVD). Dataset 1 includes only basic clinical parameters such as demographic features, vital signs, and anthropometric measurements. Dataset 2 extends this baseline with laboratory-derived blood biomarkers, including but not limited to glucose, aspartate aminotransferase (AST), creatinine, and platelet count—parameters that are routinely measured in clinical settings and considered physiologically relevant to cardiovascular risk assessment.
The primary objective of this comparative analysis was to determine whether the integration of blood-based biomarkers enhances the predictive capacity of machine learning models in detecting CVD. To this end, five well-established models were employed: LightGBM [44], CatBoost [45], Support Vector Machine (SVM) [46], XGBoost [47], and Decision Tree [48]. Each model was trained and evaluated on the same approximately balanced dataset comprising 896 patient records, with a near-equal distribution between CVD-positive and CVD-negative cases (459 and 437, respectively). This controlled experimental setup ensures that performance differences are attributable to the input features rather than to class imbalance.
To comprehensively assess model performance, four standard evaluation metrics were used: Accuracy, Precision, Recall, and F1 Score. These metrics collectively provide insight into overall correctness, sensitivity to positive cases, and robustness of diagnostic classification—particularly crucial in medical diagnostics where false negatives may result in significant clinical consequences. Among these, Recall holds specific importance, as it reflects the model’s ability to correctly identify true CVD cases, thus minimizing the risk of missed diagnoses.
The comparative results are summarized in Table 4, which contrasts model performance across the two datasets:
As shown in Table 4, the inclusion of blood test indicators led to a noticeable improvement in the predictive performance of the models for cardiovascular disease (CVD) classification. Across the models, the main evaluation metrics (accuracy, precision, recall, and F1 score) generally improved when blood parameters were added; the exception is the Decision Tree, whose accuracy decreased slightly. These enhancements are particularly significant from a clinical perspective, as reducing false negatives is critical in avoiding missed diagnoses and ensuring timely medical intervention. The improvement mirrors findings from other recent gradient-boosting studies on CVD risk [49].
Models based on gradient boosting methods, such as LightGBM, CatBoost, and XGBoost, performed particularly well when trained on the extended dataset. For instance, LightGBM improved its accuracy from 0.8715 to 0.8939 and its F1 score from 0.8700 to 0.8914 with the addition of blood features. CatBoost showed a similar trend, with accuracy rising from 0.8547 to 0.8994 and F1 score from 0.8506 to 0.8977. These results highlight the ability of boosting algorithms to capture complex patterns when enriched with additional medical variables [50].
XGBoost achieved the highest performance, with its accuracy increasing from 0.8715 to 0.9162 and its F1 score reaching 0.9102. This reflects the model’s strong adaptability to heterogeneous clinical data and its effectiveness in distinguishing between patients with and without CVD.
SVM showed moderate improvement, particularly in recall, which increased from 0.7609 to 0.8152. This indicates a reduced likelihood of missing patients with CVD when blood features are included, addressing one of the key concerns in medical risk diagnostic classification [51].
The Decision Tree model, on the other hand, showed limited benefit from the added features. Its accuracy slightly decreased after the inclusion of blood data, and gains in other metrics were minimal. This suggests that tree-based models without ensemble mechanisms may struggle to utilize complex biochemical information effectively.
Blood variables such as glucose, AST, and creatinine contributed strongly to the improved performance, indicating their relevance as predictive markers in the context of machine learning applications for cardiovascular diagnostics.
To further investigate feature contributions, SHAP (SHapley Additive exPlanations) analysis was conducted for the XGBoost model, which had among the highest performance metrics [52]. SHAP is a model-agnostic method widely used for interpreting machine learning decisions. It quantifies the individual contribution of each input variable to the final diagnostic classification, making it particularly valuable in medical applications where understanding the influence of clinical features is essential. The SHAP analysis was limited to XGBoost given its interpretability and superior performance.
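To make the idea of a per-feature contribution concrete, the following minimal sketch computes exact Shapley values for a toy model by enumerating feature coalitions. It is an illustration only: the scoring function, the feature values, and the reference (baseline) patient are all hypothetical, and absent features are simply replaced by their baseline value. The SHAP library uses far more efficient tree-specific algorithms that are not reproduced here.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for a single instance by enumerating coalitions.

    Features outside a coalition take their baseline value (a simplification).
    """
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight shared by every coalition of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                present = set(subset)
                without_f = {f: x[f] if f in present else baseline[f] for f in names}
                with_f = dict(without_f)
                with_f[name] = x[name]
                # Marginal contribution of `name` when added to this coalition.
                total += weight * (predict(with_f) - predict(without_f))
        phi[name] = total
    return phi


def toy_score(f):
    # Hypothetical risk score with one AST x glucose interaction (not the study's model).
    return (0.02 * f["ast"] + 0.30 * f["glucose"]
            + 0.005 * f["creatinine"] + 0.001 * f["ast"] * f["glucose"])


patient = {"ast": 45.0, "glucose": 7.0, "creatinine": 90.0}   # made-up values
baseline = {"ast": 25.0, "glucose": 5.0, "creatinine": 80.0}  # made-up reference patient

phi = shapley_values(toy_score, patient, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The contributions sum to the difference between the patient's score and the baseline score, which is the additivity property that makes SHAP values readable as a decomposition of a single prediction.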
As shown in the SHAP summary plot in Figure 4, the most impactful feature in Dataset 1 is Systolic_BP, which recorded the highest mean SHAP value, indicating its strong influence on the model’s diagnostic classification of cardiovascular disease. It is followed by Heart Rate, BMI, and Oxygen Saturation, all of which also demonstrated considerable predictive power despite the absence of blood-based biomarkers. Weight and Age showed moderate importance, contributing to the model but with less influence than the top-ranking features. In contrast, Height, Diastolic_BP, and Gender had relatively low SHAP values, suggesting minimal impact on the model’s decision-making process, with Gender contributing the least overall. The differences in SHAP values across features reflect the degree to which each input shaped the model’s classification results. This visualization provides a clear interpretation of the model’s internal logic, highlighting which basic clinical features played a critical role in distinguishing patients with and without CVD based solely on non-laboratory parameters.
The bar chart in Figure 5 presents the SHAP feature importance values derived from the XGBoost model trained on Dataset 2, which includes blood biomarkers. The visualization highlights the extent to which each laboratory variable contributed to the model’s diagnostic classification of cardiovascular disease (CVD). Among all features, AST (aspartate aminotransferase) had the highest SHAP value, indicating its dominant influence on the model’s decision-making process. Glucose and Creatinine followed closely, also showing substantial contributions. Other important features included ALT, Platelets, Hemoglobin, Total Bilirubin, WBC (white blood cells), RBC (red blood cells), and Hematocrit. These variables represent critical physiological functions related to metabolic, hepatic, renal, and hematologic systems, all of which play a key role in cardiovascular health. The inclusion of these markers provided the model with deeper insight into the underlying clinical patterns associated with CVD. Overall, the SHAP analysis offers a transparent way to interpret model behavior and validate the clinical relevance of selected features.
Following the initial findings, we further investigated the internal structure of the model. SHAP analysis not only quantifies the individual contribution of each feature but also visualizes how features interact—highlighting thresholds where risk escalates and identifying feature combinations that amplify or dampen each other’s effects. This enabled us to pinpoint how pressure, heart rate, BMI, and biochemical markers interact to create compounded cardiovascular risk. The visualizations below provide an interpretable window into these interactions and suggest actionable clinical insights.
Figure 6 aggregates the full dataset, showing one row per biomarker and one dot per patient. The position of each dot reflects its SHAP value—indicating whether the variable increased or decreased predicted risk. Color represents the raw feature value (blue = low, pink = high).
Systolic BP displays a strong rightward skew at values ≥ 140 mmHg, indicating sharp risk elevation. Similarly, heart rates above 90 bpm shift rightward, confirming tachycardia as an independent risk factor. BMI demonstrates a fan-shaped distribution: values above 25 are progressively more impactful. These patterns confirm that the model robustly captures the classical triad of hemodynamic stress—pressure, pulse, and adiposity.
In Figure 7a, SHAP values remain negative up to ~120 mmHg, suggesting a protective zone. At ~130 mmHg, SHAP values turn sharply positive, and older individuals (indicated by pink dots) show more pronounced risk at equivalent BP levels, highlighting age as a synergistic amplifier [22]. As Figure 7b shows, a heart rate between 60 and 80 bpm appears benign. Above 90 bpm, SHAP values rise steeply. High-BMI individuals (pink dots) experience an even greater impact, suggesting that tachycardia and obesity jointly magnify cardiovascular stress [53]. In Figure 7c, SHAP values are negative below BMI 25 and increase sharply above 30, indicating the transition to high-risk adiposity. Because nearly all points are pink (SpO2 ≥ 90%), oxygen saturation appears to play only a minor moderating role in this context.
For this patient in Figure 8, the model predicted a log-odds score of –0.50, indicating low overall risk. Red segments (Hematocrit = 39.3, AST = 20.2, Total Bilirubin = 3.0) decrease predicted risk, while blue segments (Leukocytes = 5.4, Hemoglobin = 123, Platelets = 316, Glucose = 7, RBC ≈ 4.5) marginally increase it. Importantly, the diagnostic classification remains on the protective side. This graphical summary highlights both favorable and borderline markers, allowing the clinician to focus on emerging issues—namely, mild hyperglycemia and elevated platelets in this case.
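As a rough illustration of how a force plot's numbers relate to a risk estimate, the sketch below adds per-feature contributions to a base value and converts the resulting log-odds to a probability with the logistic function. The base value and the individual contributions are hypothetical, chosen only so that they sum to the reported score of -0.50; the actual values in Figure 8 are not given in the text.

```python
import math


def sigmoid(log_odds):
    """Map a log-odds score to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical base value and per-feature SHAP contributions (log-odds units),
# chosen only so that they sum to the -0.50 reported for the patient in Figure 8.
base_value = -0.10
contributions = {
    "Hematocrit": -0.30,
    "AST": -0.25,
    "Total Bilirubin": -0.15,
    "Glucose": +0.20,
    "Platelets": +0.10,
}

# Force-plot additivity: model output = base value + sum of contributions.
log_odds = base_value + sum(contributions.values())
probability = sigmoid(log_odds)
print(f"log-odds = {log_odds:.2f}, probability = {probability:.3f}")
```

Under the assumption that the model output is a log-odds score, as the text states, a value of -0.50 corresponds to a predicted probability of roughly 0.38, that is, below the 0.5 decision threshold.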
The findings are in line with current medical understanding and suggest the model captures meaningful biological patterns. The integration of SHAP further supports its use as a transparent, interpretable tool in personalized risk stratification and clinical decision-making.
In addition, confusion matrices were generated for the XGBoost model trained on each dataset, allowing a visual inspection of classification errors. This visualization helps to quantify how well the model identifies true positives and true negatives, while also shedding light on common misclassifications such as false positives and false negatives [54].
As shown in the confusion matrix for Dataset 1 (Figure 9), the XGBoost model demonstrated balanced classification capabilities, with 78 true positives and 78 true negatives. However, its clinical applicability is limited by the presence of 14 false negatives, that is, instances where patients with cardiovascular disease were incorrectly labeled as healthy. Such errors are especially concerning in medical contexts, as they may result in missed diagnoses and delayed interventions. Additionally, 9 false positives were recorded, suggesting that the model occasionally overestimated risk in healthy individuals. These inaccuracies highlight the shortcomings of using only baseline clinical features such as blood pressure, heart rate, and body metrics. Without access to deeper physiological information such as blood biomarkers, the model struggles to capture the full complexity of cardiovascular pathology. While the model performs moderately well, the level of misclassification, especially in identifying true cases, signals that exclusive reliance on baseline data may not be sufficient for robust clinical diagnostic classification in real-world practice.
As shown in the confusion matrix for Dataset 2 (Figure 10), the inclusion of blood biomarkers significantly improved the model’s ability to classify patients with cardiovascular disease. The model achieved 76 true positives and 88 true negatives, reflecting a strong balance between sensitivity and specificity. Compared to the baseline-only model, the number of false negatives fell from 14 to 11, and false positives dropped from 9 to just 4. This improvement is clinically meaningful, as it reduces the likelihood of missed diagnoses while minimizing unnecessary alerts for healthy individuals. The enhanced performance can be attributed to the rich diagnostic information provided by blood parameters such as AST, Glucose, and Creatinine, which captured underlying metabolic and biochemical patterns not available in baseline features alone. These results are consistent with the observed increase in Recall and overall accuracy. The confusion matrix clearly illustrates that blood analysis contributes valuable predictive power, reinforcing its importance in developing reliable and clinically useful ML-based diagnostic tools.
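The reported metrics can be cross-checked directly from the confusion-matrix counts quoted above. The short sketch below does this with the standard definitions of accuracy, precision, recall, and F1; the function name is illustrative and not taken from the study's code.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Counts quoted in the text for the XGBoost model (Figures 9 and 10).
dataset1 = classification_metrics(tp=78, tn=78, fp=9, fn=14)
dataset2 = classification_metrics(tp=76, tn=88, fp=4, fn=11)

for label, metrics in (("Dataset 1", dataset1), ("Dataset 2", dataset2)):
    summary = ", ".join(f"{name} = {value:.4f}" for name, value in metrics.items())
    print(f"{label}: {summary}")
```

For Dataset 2 this reproduces the reported accuracy of 0.9162, precision of 0.9500, recall of 0.8736, and F1 score of 0.9102; for Dataset 1 it gives an accuracy of 0.8715.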
As a complementary evaluation step, we further examined the reliability of probability estimates using calibration analysis. Unlike accuracy-based metrics, calibration curves assess whether the predicted probabilities of cardiovascular disease correspond to the actual observed outcomes [55]. This is particularly important in clinical settings, where physicians may rely on the predicted risk score as an indicator of disease likelihood. A well-calibrated model ensures that, for example, patients predicted to have a 70% risk truly have disease prevalence close to 70%, making the predictions more clinically trustworthy.
In Figure 11, the solid green line represents the observed fraction of positives against the predicted probabilities, while the dashed diagonal line corresponds to a perfectly calibrated model. The close alignment between the two curves demonstrates that the model’s probability estimates closely match the true outcome distribution [43]. The closer the calibration curve lies to the diagonal, the better calibrated the model is; the agreement observed here on the hold-out test set indicates that the model is well calibrated and gives no sign of substantial overfitting in its probability estimates. The predicted risks are therefore not only accurate in classification but also reliable as quantitative measures of disease likelihood, which is essential for clinical decision-making and risk stratification.
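A calibration curve of this kind can be sketched in a few lines: predictions are grouped into probability bins, and the mean predicted probability in each bin is compared with the observed fraction of positives. The example below uses synthetic data rather than the study's predictions, and the number of bins and the expected calibration error (ECE) summary are assumptions added for illustration, not details reported in the paper.

```python
import random


def calibration_curve(y_true, y_prob, n_bins=5):
    """Return (mean predicted probability, observed positive fraction, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls into the last bin
        bins[index].append((y, p))
    curve = []
    for members in bins:
        if not members:
            continue
        mean_pred = sum(p for _, p in members) / len(members)
        frac_pos = sum(y for y, _ in members) / len(members)
        curve.append((mean_pred, frac_pos, len(members)))
    return curve


def expected_calibration_error(curve):
    """Count-weighted mean absolute gap between predicted and observed frequencies."""
    n = sum(count for _, _, count in curve)
    return sum(count / n * abs(mean_pred - frac_pos) for mean_pred, frac_pos, count in curve)


# Synthetic, perfectly calibrated predictions: each outcome is drawn with its own probability.
rng = random.Random(0)
y_prob = [rng.random() for _ in range(5000)]
y_true = [1 if rng.random() < p else 0 for p in y_prob]

curve = calibration_curve(y_true, y_prob)
ece = expected_calibration_error(curve)
for mean_pred, frac_pos, count in curve:
    print(f"predicted {mean_pred:.2f} -> observed {frac_pos:.2f} (n = {count})")
```

For calibrated predictions the points lie near the diagonal (predicted ≈ observed) and the ECE is close to zero; systematic gaps above or below the diagonal correspond to under- or overestimation of risk.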
The use of blood biomarkers improved the ability of machine learning models to correctly identify patients with cardiovascular disease, as seen in the higher Recall scores. Hematological and biochemical indicators provide clinically meaningful information that is not fully captured by baseline features such as blood pressure, heart rate, or body composition. Markers like AST, glucose, creatinine, and WBC reflect key physiological functions related to liver activity, metabolism, kidney function, and immune response—areas that are closely linked to cardiovascular health. These variables introduce an additional diagnostic layer that helps the model detect subtle but significant signals of disease. In contrast, models trained only on baseline features demonstrated limited sensitivity and produced more false negatives, indicating a lack of depth in capturing the full clinical picture. Incorporating blood test data allows machine learning algorithms to operate with greater precision and medical relevance, leading to stronger performance in identifying at-risk individuals based on both external and internal physiological cues.

4. Discussion

The outcomes of this study provide strong support for the hypothesis that incorporating blood-based biomarkers into clinical datasets significantly enhances the performance of machine learning models in predicting cardiovascular disease (CVD). All five algorithms—XGBoost, LightGBM, Support Vector Machine (SVM), CatBoost, and Decision Tree—exhibited improved predictive capability when trained on the enriched dataset that included laboratory features, compared to the baseline dataset containing only fundamental clinical attributes. Among these, the XGBoost model demonstrated the most substantial performance gains, achieving an increase in accuracy from 0.871 to 0.916 and a notable improvement in F1 score from 0.871 to 0.910, thereby highlighting the added value of haematological and biochemical variables in CVD diagnostic classification.
The reasons for XGBoost’s superiority can be traced to its architectural design. The algorithm combines second-order gradient optimization with regularized tree growth (level-wise by default, with leaf-wise growth available as an option), allowing it to capture complex nonlinear interactions, such as the compounded effects of elevated glucose, male gender, and high BMI, without requiring explicit feature engineering. Furthermore, its built-in regularization mechanisms (L1/L2 penalties, shrinkage, and column subsampling) control model variance and help prevent overfitting, which is particularly important given the moderate size and near-balanced class distribution of our dataset.
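The role of second-order information and regularization can be illustrated with the closed-form leaf weight and split gain from the XGBoost formulation. The sketch below is a simplified, stand-alone rendering under stated assumptions: it uses the logistic loss, applies the L1 penalty as soft-thresholding of the gradient sum in the leaf weight only, and borrows the parameter names `reg_lambda`, `reg_alpha`, `gamma`, and `learning_rate` purely as labels. It is not the library's implementation, and the four-sample example is made up.

```python
def grad_hess(y, p):
    """First and second derivatives of the logistic loss w.r.t. the raw score."""
    return p - y, p * (1.0 - p)


def leaf_weight(G, H, reg_lambda=1.0, reg_alpha=0.0, learning_rate=1.0):
    """Regularized leaf weight: soft-threshold G by the L1 penalty, divide by H + L2 penalty."""
    if G > reg_alpha:
        shrunk = G - reg_alpha
    elif G < -reg_alpha:
        shrunk = G + reg_alpha
    else:
        shrunk = 0.0
    # Shrinkage (learning rate) scales the step taken by each new tree.
    return -learning_rate * shrunk / (H + reg_lambda)


def split_gain(GL, HL, GR, HR, reg_lambda=1.0, gamma=0.0):
    """Loss reduction from splitting a node into left/right children (L2 penalty only)."""
    def score(G, H):
        return G * G / (H + reg_lambda)
    return 0.5 * (score(GL, HL) + score(GR, HR) - score(GL + GR, HL + HR)) - gamma


# Made-up node: four samples, all currently predicted at p = 0.5.
labels = [1, 1, 0, 0]
stats = [grad_hess(y, 0.5) for y in labels]
GL, HL = sum(g for g, _ in stats[:2]), sum(h for _, h in stats[:2])  # positives go left
GR, HR = sum(g for g, _ in stats[2:]), sum(h for _, h in stats[2:])  # negatives go right

gain = split_gain(GL, HL, GR, HR)
w_left = leaf_weight(GL, HL)
w_right = leaf_weight(GR, HR)
print(f"gain = {gain:.4f}, left weight = {w_left:+.4f}, right weight = {w_right:+.4f}")
```

Larger values of the L2 or L1 penalty shrink the leaf weights toward zero, and a positive `gamma` rejects splits whose gain is too small, which is how these terms limit model variance on a dataset of moderate size.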
The XGBoost model demonstrated the highest overall performance among all evaluated algorithms, making it the most suitable choice for clinically significant applications such as early cardiovascular screening and diagnostic triage. Its superior discrimination ability enables reliable identification of high-risk patients. Notably, the model’s performance improved substantially after the inclusion of blood test features. This suggests that laboratory biomarkers are not merely supplementary—they reveal hidden patterns and risk mechanisms that cannot be fully captured by vital signs alone. Integrating routine blood tests into machine learning models for cardiovascular disease diagnostic classification represents a critical advancement, both clinically and computationally [28].
The improvement in statistical indicators translates into clinically meaningful advantages. An increase in model accuracy from 0.871 to 0.916 means that, in practical terms, approximately 45 additional patients out of every 1000 screened would receive a correct diagnosis. This contributes to a reduction in false negatives, ensuring that high-risk patients are not overlooked, as well as a decrease in false positives, preventing healthy individuals from being subjected to unnecessary, costly, and psychologically burdensome additional examinations. In this regard, the XGBoost model may have several concrete applications in clinical practice. For example, at the level of primary healthcare, physicians can use the model alongside standard blood test results to detect at-risk patients who show no outward symptoms and promptly refer them to a cardiologist or for further examinations such as ECG or echocardiography. Furthermore, in healthcare systems with limited resources, this approach can help determine who should be prioritized for expensive diagnostic procedures. In emergency departments, the model may serve as a supportive tool in decision-making for classifying patients presenting with chest pain.
One of the strengths of this study is that several modern machine learning algorithms were compared, and among them, the XGBoost model consistently demonstrated superior performance. In addition, because the dataset included nearly balanced groups of patients with and without cardiovascular disease (459 with CVD vs. 437 without CVD), the likelihood of biased model evaluation was minimized. Finally, the study focused on widely available and low-cost laboratory biomarkers, which enhances the potential for real-world implementation of the model in clinical practice.
Although the overall accuracy of the top-performing model reached approximately 92%, this level of performance is considered appropriate given the intended application and inherent data limitations. The feature set was deliberately designed to be pragmatic, incorporating routinely collected vital signs and low-cost laboratory tests available at the initial point of care. Within this constrained yet clinically relevant signal space, the models consistently demonstrated performance gains when blood biomarkers were included alongside the baseline clinical variables. The gradient-boosting classifier in particular achieved balanced improvements across precision, recall, and F1 score. From a clinical standpoint, the models exhibited desirable behavior by reducing the number of false negatives compared with the baseline configuration, thereby minimizing the likelihood of missing patients who may require further diagnostic evaluation. Moreover, the interpretation of overall accuracy should be contextualized within considerations of feasibility and generalizability. The proposed approach relies solely on tests that are universally available in most healthcare settings, avoiding dependence on advanced imaging or costly biomarker assays. Consequently, achieving an accuracy of approximately 92% represents a competitive and clinically meaningful outcome, particularly when the primary goal is the practical deployment of the model across diverse healthcare environments rather than maximizing discrimination under idealized conditions.
The inclusion of the enhanced biomarker set is justified by the significant body of evidence linking each component to the underlying pathophysiology of CVD. These routinely collected markers provide a window into systemic inflammation, metabolic dysfunction, thrombotic risk, and end-organ stress, all of which are central to the development and progression of CVD. Fasting glucose is a primary indicator of metabolic health [56], as chronic hyperglycemia is a well-established driver of cardiovascular risk [57]. The ratio of the liver enzymes AST and ALT is indicative of broader metabolic issues that impact cardiovascular health; one study found that an increased AST/ALT ratio was predictive of all-cause and cardiovascular mortality among Chinese hypertensive patients [58]. Serum creatinine is a critical marker of kidney function [59]. Kidney dysfunction is a powerful and independent risk factor for CVD [60], as the kidneys play an essential role in blood pressure regulation and waste filtration [61]. The white blood cell (WBC) count is a reliable marker of systemic inflammation [62], and an elevated WBC count is an independent predictor of coronary heart disease mortality [63]. Platelets (PLT) are not merely clotting cells; they are active participants in atherosclerosis, contributing to vascular inflammation and acting as the primary drivers of thrombosis [64]. Parameters related to red blood cells, namely hemoglobin (HGB), hematocrit (HCT), and red blood cell count (RBC), are crucial indicators of the blood’s oxygen-carrying capacity [65]. Low levels (anemia) are an independent risk factor for CVD [66]: anemia forces the heart to work harder to supply the body with oxygen, which is a strong precursor to heart failure [67]. Finally, total bilirubin offers a unique perspective, as it acts as an endogenous antioxidant [68]. A strong inverse relationship exists between bilirubin and CVD, where higher levels (within the normal range) are associated with lower cardiovascular risk [69].
SHAP analysis of the two datasets revealed two major categories of risk signals. The first dataset included only baseline clinical parameters (demographics, vital signs, and anthropometric measurements), while the second expanded on this by incorporating laboratory blood markers. In our context, the term “black box” refers to conventional machine learning models that produce accurate predictions without revealing how specific features contribute to the output. Explainable AI (XAI) methods, such as SHAP, overcome this limitation by quantifying the individual contribution of each variable to the final prediction. In this study, SHAP values indicated how increases in biomarkers like AST, glucose, and creatinine raised the model’s predicted risk of CVD, while lower levels of bilirubin and hematocrit reduced it. Such visualization provides clinicians with interpretable evidence linking model behavior to physiological mechanisms, thereby transforming algorithmic output into clinically meaningful insight. This transparency bridges the gap between AI-based predictions and medical reasoning, supporting informed decision-making rather than replacing physician expertise, and it facilitates potential integration into clinical workflows.
In the first dataset, systolic blood pressure emerged as the most influential predictor. When arterial pressure is chronically elevated, the endothelial lining experiences persistent mechanical stress, creating microtears that are prone to plaque formation and rupture. A meta-analysis has shown that even a 5 mmHg reduction in systolic pressure can decrease the risk of cardiovascular events by 13% [70]. Resting heart rate ranked second—each 10 bpm increase adds mechanical load and accelerates cardiac wear, raising mortality risk by 8% [71]. The third most important factor was body mass index (BMI). Excess adiposity not only adds mechanical strain but also drives systemic inflammation; individuals in the highest BMI trajectory face a 42% higher cardiovascular risk compared to those maintaining normal weight [72].
When laboratory data were introduced in the second dataset, the picture expanded significantly. AST (aspartate aminotransferase) emerged as the leading contributor. This enzyme is released into the bloodstream when hepatic or myocardial cells are injured, acting as a biochemical “flare.” Elevated AST levels have been associated with cardiac dysfunction and increased mortality risk [73,74]. Glucose followed closely; hyperglycemia stiffens blood vessels, promotes plaque instability, and triples the risk of in-hospital death following myocardial infarction [75]. ALT, a liver-specific counterpart to AST, also ranked among the top contributors. Elevated ALT levels are frequently observed in patients with hepatic steatosis and are increasingly recognized as a marker of cardiometabolic distress and heightened cardiovascular risk [76].
Building on these interpretability insights, we further evaluated model performance through confusion matrix analysis to examine real-world diagnostic implications. This assessment confirmed that models incorporating blood parameters consistently produced lower false negative rates, thereby improving sensitivity in detecting CVD cases. Such an improvement is of particular clinical relevance, as minimizing missed diagnoses is crucial for initiating timely and potentially life-saving interventions. The higher recall scores observed in these enhanced models reflect an increased capacity to identify high-risk individuals who may otherwise remain undetected when relying solely on baseline clinical data.
The core of our study—integrating routine hematology and biochemistry markers with baseline clinical data for interpretable ML-based diagnostic classification of CVD—aligns directly with several recent high-impact works. In large-scale clinical populations, the combined use of routine blood counts and biochemistry (versus individual blocks) has been shown to substantially improve diagnostic performance: in pan-cardiovascular disease models, XGBoost achieved AUC ≈ 0.99, with SHAP confirming the contribution of features such as potassium, albumin, total protein, and bilirubin, thereby reinforcing the clinical relevance of our selected feature set [22].
In the context of obstructive CAD (ObCAD), ensemble learning of clinical + laboratory data (i.e., “baseline + biomarkers”) with first-line ECG signals was found to outperform classical pretest probability (PTP) models: the clinico-laboratory model achieved AUC = 0.747, the ECG-DL model AUC = 0.685, while the ensemble reached AUC = 0.767; this underscores that our “integrating accessible data” strategy is moving in the right direction [77]. Another closely related study demonstrated that a CatBoost model using only clinical and routine laboratory markers improved ObCAD prediction (AUROC = 0.796) and outperformed the updated CAD Consortium/DF PTP. SHAP analysis highlighted the importance of routine markers such as hs-cTnT, HbA1c, TG, and HDL, which closely mirrors the nature of the features employed in our approach [53].
Early-stage CAD detection (prior to invasive testing) using panels of routine clinical biomarkers has also been validated. A study published in European Heart Journal—Digital Health showed that combining clinical data with routine markers, augmented by synthetic data, enhanced performance and maintained robust external generalizability when transferred to the Young Finns Study. This directly strengthens the clinical feasibility of our “low-cost, widely accessible markers for early triage” concept [78].
The incremental value of moving from “baseline → baseline + biomarkers” has been systematically observed not only for diagnosis but also for long-term prognosis. A cohort analysis with external validation across two independent hospitals showed that adding hematological indices to the baseline model increased the C-index by up to 0.072 and improved calibration—methodologically validating our data architecture that incorporates hematology/biochemistry signals [24]. Moreover, a large-scale JAMA meta-analysis across 28 cohorts found that while the incremental gain of routine cardio-biomarkers over standard risk factors for ASCVD was modest, their added value was more pronounced for outcomes such as heart failure and mortality. The general conclusion—that biomarker utility depends on the target endpoint—provides a strong rationale that in our diagnostic classification setting, routine blood markers indeed carry a meaningful predictive signal [79].
These results align with current trends in biomedical data analysis, highlighting the value of integrating a broader range of physiological indicators to improve risk assessment models. Traditional methods have often focused on a narrow set of clinical features. By comparison, our findings show that incorporating routine laboratory data into clinical profiles adds useful predictive information, enhancing both model performance and reliability.
Several limitations warrant consideration. First, during data cleaning many records lacking complete laboratory information were removed, leaving the combined vital-plus-blood dataset markedly smaller than the vital-only cohort and reducing statistical power for some comparisons. Second, the high proportion of missing values in key blood markers further narrowed the analytic sample and may have weakened the influence of those variables in the final models. Third, all observations were drawn from City Clinical Hospital within a single metropolitan area, which limits the generalizability of our findings. However, this study was not aimed at developing a universally generalizable model, but rather at conducting a rigorous methodological comparison within a well-characterized, homogeneous cohort from a single medical center. This design choice helped minimize confounding factors and allowed for an accurate assessment of the added value of laboratory data. Further validation on larger, multi-center cohorts is the logical next step.
Fourth, the demographic imbalance observed between the Healthy Control and CVD cohorts is another constraint. This mismatch introduces a considerable risk of confounding bias, as both advanced age and gender are well-established, independent risk factors for CVD [80]. This confounding effect may inflate the performance metrics reported herein and, more importantly, diminish the generalizability or external validity of the model when applied to a clinical population with a more closely matched or different age/gender distribution. While this reflects a common challenge in real-world retrospective data analysis, future research must explicitly mitigate this limitation through methods such as propensity score matching or the use of age- and gender-standardized cohorts to more cleanly isolate the true predictive power of the routine blood panel features.
Fifth, CVD was modeled as a single, heterogeneous diagnostic category, which may have masked subtype-specific trends. We acknowledge that CVD encompasses a wide range of distinct pathologies (such as ischemic heart disease, heart failure, and cerebrovascular disease) and that our dataset contained the clinical information to potentially explore these subcategories. This binary (CVD vs. Non-CVD) classification was, however, a deliberate methodological choice. The formulation is widely used in the literature and underpins several established studies in which clinical and laboratory features are used to train machine learning classifiers for CVD detection [81,82,83], although our research direction and selection of features differ from those employed in previous studies. This limitation nonetheless highlights a valuable and logical direction for future work: a more detailed analysis that extends our approach to classify specific cardiovascular disease subtypes could reveal how the predictive importance of routine biomarkers varies across different conditions, providing a deeper and more individualized understanding of cardiovascular risk.

5. Conclusions

The primary objective of this study was to evaluate the impact of incorporating blood test biomarkers on the predictive performance of cardiovascular disease (CVD) models and to determine whether these indicators provide meaningful improvements beyond baseline clinical features. The results confirm that integrating routine blood tests significantly enhances diagnostic accuracy across multiple machine learning models, with XGBoost showing the best performance. A critical contribution of this work lies in its commitment to explainable AI (XAI). By leveraging SHAP, we moved beyond “black box” predictions to deliver clinically interpretable explanations of the diagnostic model. The analysis revealed that markers like AST, glucose, and creatinine are key drivers of the model’s decisions, aligning with known pathophysiological pathways of CVD. This transparency is essential for building trust and facilitating the adoption of AI tools in clinical practice. Furthermore, this study represents a meaningful step toward personalized medicine. The ability of the model to identify specific risk drivers for each individual patient—as demonstrated by SHAP force plots—allows for a more nuanced and personalized risk stratification. Instead of relying on generalized risk factors, clinicians can use these insights to tailor preventive strategies to the unique biochemical profile of each patient. While this study confirms the predictive value of blood biomarkers, future research should focus on external validation in diverse populations and prospective clinical trials to assess the real-world impact of this approach on patient outcomes. Ultimately, the integration of interpretable machine learning with routine clinical data holds significant promise for developing more precise, transparent, and personalized diagnostic tools in cardiovascular medicine.

Author Contributions

Conceptualization, N.T., B.A., Z.B., A.B. (Assiya Boltaboyeva) and B.I.; methodology, N.T., B.A., Z.B., S.Z., B.I. and N.M.-N.; software, N.T., B.A., Z.B., A.B. (Assiya Boltaboyeva) and S.Z.; validation, B.I., N.M.-N., S.Z. and A.B. (Aliya Baidauletova); formal analysis, B.I., N.M.-N., S.Z. and A.B. (Aliya Baidauletova); investigation, N.T., B.A., Z.B., A.B. (Assiya Boltaboyeva), B.I. and N.M.-N.; resources, N.T., B.A., Z.B., A.B. (Assiya Boltaboyeva) and B.I.; data curation, B.I., N.M.-N., S.Z. and A.B. (Aliya Baidauletova); writing—original draft preparation, N.T., B.A., Z.B., A.B. (Assiya Boltaboyeva) and B.I.; writing—review and editing, N.M.-N., S.Z. and A.B. (Aliya Baidauletova); visualization, N.T., B.A., Z.B. and A.B. (Assiya Boltaboyeva); supervision, N.M.-N. and S.Z.; project administration, B.I. and S.Z.; funding acquisition, B.I. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Committee of Science of the Ministry of Science and Higher Education of the Republic of Kazakhstan (Grant No. AP26103523).

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the Ethics Committee of Al-Farabi Kazakh National University (KazNU).

Informed Consent Statement

Informed consent for participation is not required as per local legislation [Protocol No. IRB-A862].

Data Availability Statement

The data supporting the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

Authors Zhanel Baigarayeva and Assiya Boltaboyeva were employed by the company LLP “Kazakhstan R&D Solutions”. Author Naoya Maeda-Nishino was employed by the company HAKUAI Medical Corporation. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ALT: Alanine Aminotransferase
AMI: Acute Myocardial Infarction
AST: Aspartate Aminotransferase
BMI: Body Mass Index
CBC: Complete Blood Count
CVD: Cardiovascular Disease
DALY: Disability-Adjusted Life Years
DBM: Deep Boltzmann Machine
ECG: Electrocardiogram
EHR: Electronic Health Record
ESC: European Society of Cardiology
FRS: Framingham Risk Score
HbA1c: Glycated Hemoglobin
HDL: High-Density Lipoprotein
HR: Heart Rate
HCT: Hematocrit
IHD: Ischemic Heart Disease
LDL: Low-Density Lipoprotein
LightGBM: Light Gradient Boosting Machine
LSTM: Long Short-Term Memory
ML: Machine Learning
NT-proBNP: N-terminal pro B-type Natriuretic Peptide
PCA: Principal Component Analysis
PLT: Platelet Count
PYLL: Potential Years of Life Lost
RBC: Red Blood Cell Count
RNN: Recurrent Neural Network
SHAP: SHapley Additive exPlanations
SCORE: Systematic Coronary Risk Evaluation
SpO2: Peripheral Oxygen Saturation
SVM: Support Vector Machine
WBC: White Blood Cell Count
XGBoost: Extreme Gradient Boosting

References

  1. World Health Organization. Cardiovascular Diseases (CVDs); World Health Organization: Geneva, Switzerland, 2025; Available online: https://www.who.int/news-room/fact-sheets/detail/cardiovascular-diseases-(cvds) (accessed on 1 August 2025).
  2. Feigin, V.L.; Brainin, M.; Norrving, B.; Martins, S.O.; Pandian, J.; Lindsay, P.; Grupper, M.F.; Rautalin, I. World Stroke Organization: Global Stroke Fact Sheet 2025. Int. J. Stroke 2025, 20, 132–144.
  3. Sun, J.; Qiao, Y.; Zhao, M.; Magnussen, C.G.; Xi, B. Global, Regional, and National Burden of Cardiovascular Diseases in Youths and Young Adults Aged 15–39 Years in 204 Countries/Territories, 1990–2019: A Systematic Analysis of Global Burden of Disease Study 2019. BMC Med. 2023, 21, 222.
  4. Tochieva, Z.U.; Iskakova, F.A.; Abzaliev, K. Dynamics of Deaths and Mortality Rate in Kazakhstan Population. Eurasian J. Appl. Biotechnol. 2023, 3, 12–21.
  5. Junusbekova, G.; Tundybayeva, M.; Akhtaeva, N.; Kosherbayeva, L. Recent Trends in Cardiovascular Disease Mortality in Kazakhstan. Vasc. Health Risk Manag. 2023, 19, 547–556.
  6. Glushkova, N.; Turdaliyeva, B.; Kulzhanov, M.; Karibayeva, I.K.; Kamaliev, M.; Smailova, D.; Zhamakurova, A.; Namazbayeva, Z.; Mukasheva, G.; Kuanyshkalieva, A.; et al. Examining Disparities in Cardiovascular Disease Prevention Strategies and Incidence Rates between Urban and Rural Populations: Insights from Kazakhstan. Sci. Rep. 2023, 13, 20917.
  7. Roth, G.; Mensah, G.; Fuster, V. The Global Burden of Cardiovascular Diseases and Risks: A Compass for Global Action. J. Am. Coll. Cardiol. 2020, 76, 2980–2981.
  8. Khan, S.S.; Ning, H.; Wilkins, J.T.; Allen, N.; Carnethon, M.; Berry, J.D.; Sweis, R.N.; Lloyd-Jones, D.M. Association of Body Mass Index with Lifetime Risk of Cardiovascular Disease and Compression of Morbidity. JAMA Cardiol. 2018, 3, 280–287.
  9. OECD; The King’s Fund. Is Cardiovascular Disease Slowing Improvements in Life Expectancy? In OECD and The King’s Fund Workshop Proceedings; OECD Publishing: Paris, France, 2020.
  10. Shu, S.; Ren, J.; Song, J. Clinical Application of Machine Learning-Based Artificial Intelligence in the Diagnosis, Prediction, and Classification of Cardiovascular Diseases. Circ. J. 2021, 85, 1416–1425.
  11. Kilic, A. Artificial Intelligence and Machine Learning in Cardiovascular Health Care. Ann. Thorac. Surg. 2019, 109, 1323–1329.
  12. Al’Aref, S.J.; Anchouche, K.; Singh, G.; Slomka, P.J.; Kolli, K.K.; Kumar, A.; Pandey, M.; Maliakal, G.; Van Rosendael, A.R.; Beecy, A.N.; et al. Clinical Applications of Machine Learning in Cardiovascular Disease and Its Relevance to Cardiac Imaging. Eur. Heart J. 2018, 40, 1975–1986.
  13. Di Costanzo, A.; Spaccarotella, C.A.M.; Esposito, G.; Indolfi, C. An Artificial Intelligence Analysis of Electrocardiograms for the Clinical Diagnosis of Cardiovascular Diseases: A Narrative Review. J. Clin. Med. 2024, 13, 1033.
  14. Ogunpola, A.; Saeed, F.; Basurra, S.; Albarrak, A.M.; Qasem, S.N. Machine Learning-Based Predictive Models for Detection of Cardiovascular Diseases. Diagnostics 2024, 14, 144.
  15. Bhatt, C.M.; Patel, P.; Ghetia, T.; Mazzeo, P.L. Effective Heart Disease Prediction Using Machine Learning Techniques. Algorithms 2023, 16, 88.
  16. Califf, R.M. Biomarker Definitions and Their Applications. Exp. Biol. Med. 2018, 243, 213–221.
  17. Myrou, A.; Barmpagiannos, K.; Ioakimidou, A.; Savopoulos, C. Molecular Biomarkers in Neurological Diseases: Advances in Diagnosis and Prognosis. Int. J. Mol. Sci. 2025, 26, 2231.
  18. Seyhan, A.A.; Carini, C. Are Innovation and New Technologies in Precision Medicine Paving a New Era in Patient-Centric Care? J. Transl. Med. 2019, 17, 114.
  19. Sobsey, C.A.; Ibrahim, S.; Richard, V.R.; Gaspar, V.; Mitsa, G.; Lacasse, V.; Zahedi, R.P.; Batist, G.; Borchers, C.H. Targeted and Untargeted Proteomics Approaches in Biomarker Development. Proteomics 2020, 20, e1900029.
  20. Qiu, S.; Cai, Y.; Yao, H.; Lin, C.; Xie, Y.; Tang, S.; Zhang, A. Small Molecule Metabolites: Discovery of Biomarkers and Therapeutic Targets. Signal Transduct. Target Ther. 2023, 8, 132.
  21. Lucijanić, M.; Krečak, I.; Šorić, E.; Sabljic, A.; Galušić, D.; Holik, H.; Perisa, V.; Morić Perić, M.; Žekanovic, I.; Budimir, J.; et al. Evaluation of Absolute Neutrophil, Lymphocyte and Platelet Count and Their Ratios as Predictors of Thrombotic Risk in Patients with Prefibrotic and Overt Myelofibrosis. Life 2024, 14, 523.
  22. Wang, Z.; Gu, Y.; Huang, L.; Liu, S.; Chen, Q.; Yang, Y.; Hong, G.; Ning, W. Construction of Machine Learning Diagnostic Models for Cardiovascular Pan-Disease Based on Blood Routine and Biochemical Detection Data. Cardiovasc. Diabetol. 2024, 23, 351.
  23. Pieszko, K.; Hiczkiewicz, J.; Budzianowski, P.; Rzeźniczak, J.; Budzianowski, J.; Błaszczyński, J.; Słowiński, R.; Burchardt, P. Machine-Learned Models Using Hematological Inflammation Markers in the Prediction of Short-Term Acute Coronary Syndrome Outcomes. J. Transl. Med. 2018, 16, 334.
  24. Truslow, J.G.; Goto, S.; Homilius, M.; Mow, C.; Higgins, J.M.; MacRae, C.A.; Deo, R.C. Cardiovascular Risk Assessment Using Artificial Intelligence-Enabled Event Adjudication and Hematologic Predictors. Circ. Cardiovasc. Qual. Outcomes 2022, 15, e008007.
  25. Thomas, M.R.; Lip, G.Y.H. Novel Risk Markers and Risk Assessments for Cardiovascular Disease. Circ. Res. 2017, 120, 133–149.
  26. MacNamara, J.; Eapen, D.J.; Quyyumi, A.; Sperling, L. Novel Biomarkers for Cardiovascular Risk Assessment: Current Status and Future Directions. Future Cardiol. 2015, 11, 597–613.
  27. Olivier, M.; Asmis, R.; Hawkins, G.A.; Howard, T.D.; Cox, L.A. The Need for Multi-Omics Biomarker Signatures in Precision Medicine. Int. J. Mol. Sci. 2019, 20, 4781.
  28. Dhingra, R.; Vasan, R.S. Biomarkers in Cardiovascular Disease: Statistical Assessment and Section on Key Novel Heart Failure Biomarkers. Trends Cardiovasc. Med. 2017, 27, 123–133.
  29. Dong, W.; Jiang, H.; Li, Y.; Lv, L.; Gong, Y.; Li, B.; Wang, H.; Zeng, H. Interpretable Machine Learning Analysis of Immunoinflammatory Biomarkers for Predicting CHD among NAFLD Patients. Cardiovasc. Diabetol. 2025, 24, 263.
  30. Moons, K.G.M.; Wolff, R.F.; Riley, R.D.; Whiting, P.F.; Westwood, M.; Collins, G.S.; Reitsma, J.B.; Kleijnen, J.; Mallett, S. PROBAST: A Tool to Assess Risk of Bias and Applicability of Prediction Model Studies: Explanation and Elaboration. Ann. Intern. Med. 2019, 170, W1–W33.
  31. Antman, E.M.; Cohen, M.; Bernink, P.J.L.M.; McCabe, C.H.; Horacek, T.; Papuchis, G.; Mautner, B.; Corbalan, R.; Radley, D.; Braunwald, E. The TIMI Risk Score for Unstable Angina/Non–ST Elevation MI: A Method for Prognostication and Therapeutic Decision Making. JAMA 2000, 284, 835–842.
  32. Bolikulov, F.; Nasimov, R.; Rashidov, A.; Akhmedov, F.; Cho, Y.-I. Effective Methods of Categorical Data Encoding for Artificial Intelligence Algorithms. Mathematics 2024, 12, 2553.
  33. Ur Rehman, M.; Naseem, S.; Butt, A.U.R.; Mahmood, T.; Khan, A.R.; Khan, I.; Khan, J.; Jung, Y. Predicting Coronary Heart Disease with Advanced Machine Learning Classifiers for Improved Cardiovascular Risk Assessment. Sci. Rep. 2025, 15, 13361.
  34. Vergara, J.R.; Estévez, P.A. A Review of Feature Selection Methods Based on Mutual Information. Neural Comput. Appl. 2014, 24, 175–186.
  35. Mohamed, N.; Almutairi, R.L.; Abdelrahim, S.; Alharbi, R.; Alhomayani, F.M.; Alsulami, A.; Alkhalaf, S. Deep Convolutional Fuzzy Neural Networks with Stork Optimisation on Chronic Cardiovascular Disease Monitoring for Pervasive Healthcare Services. Sci. Rep. 2025, 15, 19008.
  36. Wang, Y.; Cao, H. Heart-Failure Prediction Based on Bootstrap Sampling and Weighted-Fusion LightGBM Model. Appl. Sci. 2025, 15, 4360.
  37. Cao, K.; Liu, C.; Yang, S.; Zhang, Y.; Li, L.; Jung, H.; Zhang, S. Prediction of Cardiovascular Disease Based on Multiple Feature Selection and Improved PSO-XGBoost Model. Sci. Rep. 2025, 15, 12406.
  38. Alsabhan, W.; Alfadhly, A. Effectiveness of Machine Learning Models (Including SVM) in Heart-Disease Diagnosis: A Comparative Study. Sci. Rep. 2025, 15, 24568.
  39. Wei, X.; Rao, C.; Xiao, X.; Chen, L.; Goh, M. Risk Assessment of Cardiovascular Disease Based on SOLSSA-CatBoost Model. Expert Syst. Appl. 2023, 219, 119648.
  40. Asadi, F.; Homayounfar, R.; Mehrali, Y.; Masci, C.; Talebi, S.; Zayeri, F. Detection of Cardiovascular Disease Cases Using Advanced Tree-Based Machine Learning Algorithms. Sci. Rep. 2024, 14, 22230.
  41. Hidayaturrohman, Q.A.; Hanada, E. A Comparative Analysis of Grid-, Random-, and Bayesian-Search for Heart-Failure Outcome Prediction. Appl. Sci. 2025, 15, 3393.
  42. Minhas, A.; Pal, S.C.; Jain, K. Machine Learning Analysis of Integrated ABP and PPG Signals towards Early Detection of Coronary Artery Disease. Sci. Rep. 2025, 15, 8574.
  43. Riley, R.D.; Archer, L.; Snell, K.I.E.; Ensor, J.; Dhiman, P.; Martin, G.P.; Bonnett, L.J.; Collins, G.S. Evaluation of Clinical Prediction Models (Part 2): How to Undertake an External Validation Study. BMJ 2024, 384, e074820.
  44. Deng, L.; Lu, K.; Hu, H. An interpretable LightGBM model for predicting coronary heart disease: Enhancing clinical decision-making with machine learning. PLoS ONE 2025, 20, e0330377.
  45. Hamid, M.; Hajjej, F.; Alluhaidan, A.S.; Bin Mannie, N.W. Fine tuned CatBoost machine learning approach for early detection of cardiovascular disease through predictive modeling. Sci. Rep. 2025, 15, 31199.
  46. Elsedimy, E.I.; AboHashish, S.M.M.; Algarni, F. New cardiovascular disease prediction approach using support vector machine and quantum-behaved particle swarm optimization. Multimed. Tools Appl. 2024, 83, 23901–23928.
  47. Madhusai, B.; Aarthi, V.P.M.; Jenila, C.; Ajayreddy, B.; Reddy, R.M.; Mayuri, V. Explainable AI for Cardiovascular Health: A SHAP-Based Framework. In Proceedings of the 2025 International Conference on Pervasive Computational Technologies (ICPCT), Greater Noida, India, 8–9 February 2025; IEEE: New York, NY, USA, 2025; pp. 353–358.
  48. Abdulqader, H.A.; Abdulazeez, A.M. A Review on Decision Tree Algorithm in Healthcare Applications. Indones. J. Comput. Sci. 2024, 13, 3863–3864.
  49. Ganie, S.M.; Pramanik, P.K.D. A Comparative Analysis of Boosting Algorithms for Chronic Liver Disease Prediction. Healthc. Anal. 2024, 5, 100313.
  50. Zhang, M.; Shen, T.; Li, Y.; Li, Q.; Lou, Y. Exploring the Complex Associations between Community Public Spaces and Healthy Aging: An Explainable Analysis Using CatBoost and SHAP. BMC Public Health 2025, 25, 2200.
  51. Pathak, A.; Seyam, T.A.; Chakraborty, A.; Santa, N.K.; Uddin, E.; Mim, T.A. Enhancing Cardiovascular Risk Prediction Using Support Vector Machines and Advanced Machine Learning Algorithms. In Proceedings of the 2024 IEEE International Conference on Computing, Applications and Systems (COMPAS), Cox’s Bazar, Bangladesh, 25–26 September 2024; IEEE: New York, NY, USA, 2024.
  52. Lundberg, S.M.; Erion, G.G.; Lee, S.-I. From Local Explanations to Global Understanding with Explainable AI in Health. Nat. Mach. Intell. 2020, 2, 252–261.
  53. Lee, H.G.; Park, S.D.; Bae, J.W.; Moon, S.; Jung, C.Y.; Kim, M.-S.; Kim, T.-H.; Lee, W.K. Machine Learning Approaches That Use Clinical, Laboratory, and Electrocardiogram Data Enhance the Prediction of Obstructive Coronary Artery Disease. Sci. Rep. 2023, 13, 12635.
  54. Inoue, T.; Ichikawa, D.; Ueno, T.; Cheong, M.; Inoue, T.; Whetstone, W.D.; Endo, T.; Nizuma, K.; Tominaga, T. XGBoost, a Machine Learning Method, Predicts Neurological Recovery in Patients with Cervical Spinal Cord Injury. Neurotrauma Rep. 2020, 1, 8–16.
  55. Huang, Y.; Li, W.; Macheret, F.; Gabriel, R.A.; Ohno-Machado, L. A Tutorial on Calibration Measurements and Calibration Models for Clinical Prediction Models. J. Am. Med. Inform. Assoc. 2020, 27, 621–633.
  56. Palliyaguru, D.L.; Shiroma, E.J.; Nam, J.K.; Duregon, E.; Vieira Ligo Teixeira, C.; Price, N.L.; Bernier, M.; Camandola, S.; Vaughan, K.L.; Colman, R.J.; et al. Fasting Blood Glucose as a Predictor of Mortality: Lost in Translation. Cell Metab. 2021, 33, 2189–2200.e3.
  57. Kumar, A.; Khan, M.N.; Dubey, P.C. Hyperglycemia and Its Association with Cardio Vascular Disease (CVD) Post COVID-19 Era. Mathews J. Case Rep. 2024, 9, 167.
  58. Liu, H.; Ding, C.; Hu, L.; Li, M.; Zhou, W.; Wang, T.; Zhu, L.; Bao, H.; Cheng, X. The Association between AST/ALT Ratio and All-Cause and Cardiovascular Mortality in Patients with Hypertension. Medicine 2021, 100, e26693.
  59. Ávila, M.; Mora Sánchez, M.G.; Bernal Amador, A.S.; Paniagua, R. The Metabolism of Creatinine and Its Usefulness to Evaluate Kidney Function and Body Composition in Clinical Practice. Biomolecules 2025, 15, 41.
  60. Fellström, B.; Jardine, A.G.; Soveri, I.; Cole, E.; Neumayer, H.H.; Maes, B.; Gimpelewicz, C.; Holdaas, H.; ALERT Study Group. Renal Dysfunction Is a Strong and Independent Risk Factor for Mortality and Cardiovascular Complications in Renal Transplantation. Am. J. Transplant. 2005, 5, 1986–1991.
  61. Fularski, P.; Czarnik, W.; Frankenstein, H.; Gąsior, M.; Młynarska, E.; Rysz, J.; Franczyk, B. Unveiling Selected Influences on Chronic Kidney Disease Development and Progression. Cells 2024, 13, 751.
  62. Wirth, M.D.; Sevoyan, M.; Hofseth, L.; Shivappa, N.; Hurley, T.G.; Hébert, J.R. The Dietary Inflammatory Index Is Associated with Elevated White Blood Cell Counts in the National Health and Nutrition Examination Survey. Brain Behav. Immun. 2018, 69, 296–303.
  63. Brown, D.W.; Giles, W.H.; Croft, J.B. White Blood Cell Count: An Independent Predictor of Coronary Heart Disease Mortality among a National Cohort. J. Clin. Epidemiol. 2001, 54, 316–322.
  64. Nording, H.; Baron, L.; Langer, H.F. Platelets as Therapeutic Targets to Prevent Atherosclerosis. Atherosclerosis 2020, 307, 97–108.
  65. Gajewska, A.; Wysokiński, A.; Strzelecki, D.; Gawlik-Kotelnicka, O. Limited Changes in Red Blood Cell Parameters after Probiotic Supplementation in Depressive Individuals: Insights from a Secondary Analysis of the PRO-DEMET Randomized Controlled Trial. J. Clin. Med. 2025, 14, 265.
  66. Sarnak, M.J.; Tighiouart, H.; Manjunath, G.; MacLeod, B.; Griffith, J.; Salem, D.; Levey, A.S. Anemia as a Risk Factor for Cardiovascular Disease in the Atherosclerosis Risk in Communities (ARIC) Study. J. Am. Coll. Cardiol. 2002, 40, 27–33.
  67. Mozos, I. Mechanisms Linking Red Blood Cell Disorders and Cardiovascular Diseases. Biomed. Res. Int. 2015, 2015, 682054.
  68. Žiberna, L.; Jenko-Pražnikar, Z.; Petelin, A. Serum Bilirubin Levels in Overweight and Obese Individuals: The Importance of Anti-Inflammatory and Antioxidant Responses. Antioxidants 2021, 10, 1352.
  69. Suh, S.; Cho, Y.R.; Park, M.K.; Kim, D.K.; Cho, N.H.; Lee, M.K. Relationship between Serum Bilirubin Levels and Cardiovascular Disease. PLoS ONE 2018, 13, e0193041.
  70. Li, K.; Gao, L.; Jiang, Y.; Jia, J.; Li, J.; Fan, F.; Zhang, Y.; Huo, Y. Association of Cardiovascular Events with Central Systolic Blood Pressure: A Systematic Review and Meta-Analysis. J. Clin. Hypertens. 2024, 26, 747–756.
  71. Shen, A.; Liu, F.; Chen, S.; Huang, K.; Cao, J.; Shen, C.; Liu, X.; Yu, L.; Gu, S.; Zhao, L.; et al. Impact of Resting Heart Rate and Predicted Cardiovascular Risk on Mortality in Nearly 110,000 Chinese Adults. Heart Rhythm 2025, in press.
  72. Kibret, K.T.; Strugnell, C.; Backholer, K.; Peeters, A.; Tegegne, T.K.; Nichols, M. Life-Course Trajectories of Body Mass Index and Cardiovascular Disease Risks and Health Outcomes in Adulthood: Systematic Review and Meta-Analysis. Obes. Rev. 2024, 25, e13695.
  73. Bian, Y.; Kou, H.; Jia, Z.; Cui, Q.; Wu, P.; Ma, J.; Ma, X.; Jin, P. Association between Aspartate Aminotransferase to Alanine Aminotransferase Ratio and Mortality in Critically Ill Patients with Congestive Heart Failure. Sci. Rep. 2024, 14, 26317.
  74. Liu, X.; Zhang, H.-J.; Fang, C.-C.; Li, L.; Lai, Z.-Q.; Liang, N.-P.; Zhang, X.-T.; Wu, M.-B.; Yin, X.; Zhang, H.; et al. Association between Noninvasive Liver Fibrosis Scores and Heart Failure in a General Population. J. Am. Heart Assoc. 2024, 13, e035371.
  75. Mesri Alamdari, N.; Lotfi Yagin, N.; Ghaffari, S.; Roshanravan, N.; Zarrintan, A.; Mobbaseri, M. The Different Effects of Admission Blood Glucose Levels on the Outcomes of ST-Segment Elevation Myocardial Infarction Patients with and without Diabetes. Sci. Rep. 2025, 15, 27682.
  76. Wang, Z.; Gong, Z.; Wen, J.; Zhang, S.; Hu, X.; Guo, W.; Tian, Y.; Li, Q. Association between Liver Fibrosis and Risk of Incident Stroke and Mortality: A Large Prospective Cohort Study. J. Am. Heart Assoc. 2025, 14, e037081.
  77. Kim, J.; Lee, S.Y.; Cha, B.H.; Lee, W.; Ryu, J.; Chung, Y.H.; Kim, D.; Lim, S.-H.; Kang, T.S.; Park, B.-E.; et al. Machine Learning Models of Clinically Relevant Biomarkers for the Prediction of Stable Obstructive Coronary Artery Disease. Front. Cardiovasc. Med. 2022, 9, 933803.
  78. Koloi, A.; Loukas, V.S.; Hourican, C.; Sakellarios, A.I.; Quax, R.; Mishra, P.P.; Lehtimäki, T.; Raitakari, O.T.; Papaloukas, C.; Bosch, J.A.; et al. Predicting Early-Stage Coronary Artery Disease Using Machine Learning and Routine Clinical Biomarkers Improved by Augmented Virtual Data. Eur. Heart J.-Digit. Health 2024, 5, 542–550.
  79. Neumann, J.T.; Twerenbold, R.; Weimann, J.; Ballantyne, C.M.; Benjamin, E.J.; Costanzo, S.; de Lemos, J.A.; Defilippi, C.R.; Di Castelnuovo, A.; Donfrancesco, C.; et al. Prognostic Value of Cardiovascular Biomarkers in the Population. JAMA 2024, 331, 1898–1909.
  80. Rodgers, J.L.; Jones, J.; Bolleddu, S.I.; Vanthenapalli, S.; Rodgers, L.E.; Shah, K.; Karia, K.; Panguluri, S.K. Cardiovascular Risks Associated with Gender and Aging. J. Cardiovasc. Dev. Dis. 2019, 6, 19.
  81. Shah, P.; Shukla, M.; Dholakia, N.H.; Gupta, H. Predicting Cardiovascular Risk with Hybrid Ensemble Learning and Explainable AI. Sci. Rep. 2025, 15, 17927.
  82. Iacobescu, P.; Marina, V.; Anghel, C.; Anghele, A.D. Evaluating Binary Classifiers for Cardiovascular Disease Prediction: Enhancing Early Diagnostic Capabilities. J. Cardiovasc. Dev. Dis. 2024, 11, 396.
  83. Xia, B.; Innab, N.; Kandasamy, V.; Ahmadian, A.; Ferrara, M. Intelligent Cardiovascular Disease Diagnosis Using Deep Learning Enhanced Neural Network with Ant Colony Optimization. Sci. Rep. 2024, 14, 21777.
Figure 1. Data Acquisition and Formulation Pipeline.
Figure 2. Mutual Information heatmap for Dataset 1.
Figure 3. Mutual Information heatmap for Dataset 2 (blood test features only).
Figure 4. SHAP Feature Importance (Dataset 1: baseline features only).
Figure 5. SHAP Feature Importance (Dataset 2: with blood biomarkers).
Figure 6. Beeswarm Plot.
Figure 7. (a) Systolic BP Dependence Plot (colored by age); (b) Heart Rate Dependence Plot (colored by BMI); (c) BMI Dependence Plot (colored by SpO2).
Figure 8. Force Plot (individual patient).
Figure 9. Confusion Matrix (Dataset 1: baseline only).
Figure 10. Confusion Matrix (Dataset 2: with blood analysis).
Figure 11. Calibration plot for the XGBoost model on Dataset 2.
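Table 1. Sample from Dataset 1 (Baseline).

| ID | Age | Gender | Weight | Height | Systolic BP | Diastolic BP | BMI | HR | SpO2 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 18 | F | 50.0 | 160 | 110 | 70 | 19.5 | 72 | 97 |
| 2 | 19 | M | 60.0 | 152 | 110 | 70 | 26.0 | 72 | 96 |
| 3 | 62 | F | 74.0 | 160 | 120 | 80 | 28.9 | 70 | 99 |
| 4 | 22 | M | 80.0 | 170 | 120 | 80 | 27.7 | 72 | - |
| 5 | 59 | F | 78.0 | 169 | 120 | 80 | 27.3 | 84 | 99 |
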
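Table 2. Sample from Dataset 2 (Enhanced).

| ID | PLT | HGB | WBC | RBC | HCT | Creat | ALT | AST | Bili | Glucose |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 320 | 122 | 6.51 | 4.44 | 37.0 | 41.58 | 9.56 | 13.35 | 9.41 | 5.25 |
| 2 | 183 | 165 | 5.20 | 5.35 | - | 83.00 | 18 | 24 | 19.00 | 4.20 |
| 3 | 186 | 120 | 8.00 | 4.63 | 34.7 | 44.50 | 28.3 | 19.5 | 10.51 | 6.964 |
| 4 | 192 | 165 | 5.10 | 5.80 | - | 98.00 | 32 | 23 | 11.8 | 5.59 |
| 5 | 180 | 139 | 4.90 | 5.28 | 42.1 | 61.00 | 21.8 | 19.2 | 15.00 | 7.00 |
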
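Table 3. Example of the dataset after gender encoding.

| Age | Gender | Weight | Height | Systolic BP | Diastolic BP | BMI | HR | SpO2 | CVD |
|---|---|---|---|---|---|---|---|---|---|
| 18 | 0 | 50.0 | 160 | 110 | 70 | 19.5 | 72 | 97 | 0 |
| 19 | 1 | 60.0 | 152 | 110 | 70 | 26.0 | 72 | 96 | 0 |
| 62 | 0 | 74.0 | 160 | 120 | 80 | 28.9 | 70 | 99 | 0 |
| 63 | 1 | 72.0 | 170 | 140 | 90 | 24.9 | 85 | 97 | 1 |
| 75 | 1 | 78 | 179 | 160 | 80 | 24.3 | 80 | 95 | 1 |
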
Table 4. Comparative Model Results (with vs. without blood biomarkers).

| Model | Baseline: Accuracy | Baseline: Precision | Baseline: Recall | Baseline: F1-score | Enhanced: Accuracy | Enhanced: Precision | Enhanced: Recall | Enhanced: F1-score |
|---|---|---|---|---|---|---|---|---|
| LightGBM | 0.8715 | 0.9058 | 0.8369 | 0.8700 | 0.8939 | 0.9398 | 0.8478 | 0.8914 |
| CatBoost | 0.8547 | 0.9024 | 0.8043 | 0.8506 | 0.8994 | 0.9405 | 0.8587 | 0.8977 |
| SVM | 0.8492 | 0.9333 | 0.7609 | 0.8383 | 0.8603 | 0.9036 | 0.8152 | 0.8571 |
| XGBoost | 0.8715 | 0.8966 | 0.8478 | 0.8715 | 0.9162 | 0.9500 | 0.8736 | 0.9102 |
| Decision Tree | 0.8603 | 0.8764 | 0.8478 | 0.8619 | 0.8547 | 0.9067 | 0.7816 | 0.8395 |

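
As a consistency check on Table 4, the F1-score can be recomputed from the reported precision and recall using the standard harmonic-mean definition, F1 = 2PR / (P + R). The sketch below assumes only that definition. The values are transcribed from the enhanced-feature columns of Table 4, and the helper name `f1_score` is illustrative rather than taken from the study's code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for the enhanced feature set in Table 4
enhanced = {
    "LightGBM": (0.9398, 0.8478, 0.8914),
    "CatBoost": (0.9405, 0.8587, 0.8977),
    "SVM": (0.9036, 0.8152, 0.8571),
    "XGBoost": (0.9500, 0.8736, 0.9102),
    "Decision Tree": (0.9067, 0.7816, 0.8395),
}

# Recompute F1 to four decimals and compare with the tabulated value
recomputed = {
    name: round(f1_score(p, r), 4) for name, (p, r, _) in enhanced.items()
}

for name, (_, _, reported) in enhanced.items():
    print(f"{name}: recomputed={recomputed[name]:.4f}, reported={reported:.4f}")
```

The same check can be applied to the baseline columns. Agreement to four decimals indicates that the tabulated F1-scores are internally consistent with the precision and recall values, although it says nothing about how those values were obtained.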
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Tasmurzayev, N.; Amangeldy, B.; Baigarayeva, Z.; Boltaboyeva, A.; Imanbek, B.; Maeda-Nishino, N.; Zhussupbekov, S.; Baidauletova, A. Enhancing Cardiovascular Disease Classification with Routine Blood Tests Using an Explainable AI Approach. Algorithms 2025, 18, 708. https://doi.org/10.3390/a18110708

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
