Article

Machine Learning-Based Prediction of Early Left Ventricular Function After STEMI

by Shunjie-Fabian Zheng 1,†, Kathrin Diegruber 1,2,†, David Esser 1, Solveig Vieluf 1,2,‡ and Christopher Stremmel 1,2,*,‡
1 Department of Medicine I, LMU University Hospital, LMU Munich, 81377 Munich, Germany
2 DZHK (German Centre for Cardiovascular Research), Partner Site Munich Heart Alliance, LMU University Hospital, LMU Munich, 81377 Munich, Germany
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work as co-first authors.
‡ These authors contributed equally to this work as co-last authors.
J. Clin. Med. 2025, 14(23), 8563; https://doi.org/10.3390/jcm14238563 (registering DOI)
Submission received: 28 October 2025 / Revised: 24 November 2025 / Accepted: 29 November 2025 / Published: 3 December 2025
(This article belongs to the Section Cardiology)

Abstract

Background: Left ventricular (LV) function and lactate dynamics are major prognostic markers after ST-segment elevation myocardial infarction (STEMI). Early identification of patients at risk for impaired LV function or systemic hypoperfusion may improve outcomes. Machine learning (ML) can enhance predictive accuracy beyond traditional statistical methods, yet most prior studies were limited by small sample sizes and categorical outcomes. Methods: We retrospectively analyzed 2132 consecutive STEMI patients admitted to LMU Hospital (2014–2023). After preprocessing, 1608 patients with complete data were included. Thirty-eight demographic, clinical, procedural, and laboratory variables were used to train Decision Tree, Random Forest, and XGBoost regression models for predicting continuous left ventricular ejection fraction (LVEF) at discharge and lactate levels during hospitalization. Model performance was evaluated using mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), coefficient of determination (R2), and mean absolute percentage error (MAPE). Feature importance and Shapley additive explanations (SHAP) were applied for interpretability. Results: Ensemble models outperformed single trees. XGBoost achieved the best performance for LVEF prediction (MSE = 0.008, RMSE = 0.086, MAE = 0.068, R2 = 0.35). Lactate prediction showed moderate accuracy (R2 = 0.42 for admission and 0.47 for peak levels). Key predictors included cardiogenic shock, left anterior descending (LAD) culprit lesions, and peak lactate. Conclusions: ML enables individualized prediction of LV function and lactate dynamics after STEMI using routinely available clinical and laboratory data. Ensemble models, particularly XGBoost, demonstrated consistent and clinically meaningful predictive performance and generalizability, supporting their potential for early, data-driven risk stratification in acute cardiac care.

1. Introduction

Cardiovascular diseases remain the leading cause of morbidity and mortality worldwide, with acute myocardial infarction (MI)—particularly ST-elevation myocardial infarction (STEMI)—representing one of the most severe and life-threatening clinical manifestations [1]. Despite significant advances in reperfusion strategies and guideline-directed pharmacological therapy, STEMI continues to be associated with substantial short- and long-term complications, including heart failure, arrhythmias, and sudden cardiac death [2,3].
A central determinant of both early clinical course and long-term prognosis after STEMI is the left ventricular ejection fraction (LVEF). Reduced LVEF is a direct consequence of acute myocardial injury and remains one of the strongest predictors of morbidity, mortality, and arrhythmic events. It not only reflects the extent of ischemic damage but also guides treatment decisions, including the need for long-term heart failure therapy and prophylactic implantation of an implantable cardioverter-defibrillator (ICD), as outlined in international heart failure and post-MI management guidelines [2,3]. A machine learning (ML) study on one-year mortality risk prediction after STEMI identified LVEF as the strongest predictor of outcome (mean Shapley value (SHAP) of 0.978) in a cohort of 2887 patients [4].
Importantly, many patients who will eventually present with persistently reduced LV function show early signs already during the acute phase of STEMI—within the first 24 to 72 h. Hence, accurate early prediction of LVEF is clinically essential: it allows risk-adapted decisions regarding monitoring intensity (e.g., telemetry, intensive care unit (ICU) stay), timing of imaging studies, and early initiation of prognostically relevant treatments. Furthermore, in the face of growing pressure on healthcare resources, timely risk stratification also supports more efficient use of ICU capacity and helps avoid unnecessary prolongation of hospitalization.
Numerous clinical and biochemical markers have been investigated as predictors of post-infarct LVEF. These include patient-related factors (such as age, comorbidities, and Killip class), infarct characteristics (location, extent, and time to reperfusion), and laboratory values (e.g., creatine kinase (CK), CK-myocardial band (MB), and cardiac troponins) [5,6,7,8]. Several studies have shown that prolonged ischemia time and elevated peak biomarker levels, especially troponin, strongly correlate with infarct size and are independently associated with reduced LVEF [5,7,9,10,11]. Hemodynamic compromise on admission—manifested as hypotension, tachycardia, or cardiogenic shock—also reflects greater myocardial involvement and carries a high risk of LV dysfunction and adverse events [6].
Logistic regression models combining multiple parameters to predict LVEF showed promising results in previous analyses, but were limited by sample size and the number of assessed parameters [5,12,13]. Felbel and colleagues identified anterior wall myocardial infarction and pain-to-balloon time as the most decisive factors [5]. With respect to laboratory markers, Reinstadler et al. predicted LV remodeling with an area under the curve (AUC) of 0.85 using a combination of lactate dehydrogenase (LDH), troponin, aspartate transaminase (AST), alanine transaminase (ALT), and N-terminal pro-B-type natriuretic peptide (NT-proBNP) measurements. Among these parameters, troponin (AUC = 0.75) and LDH (AUC = 0.78) achieved the best individual results [13].
In contrast, while imaging-based predictors—such as wall motion abnormalities on echocardiography, strain analysis, or infarct size and microvascular obstruction on cardiac magnetic resonance imaging (CMR)—are known to be valuable, they require advanced diagnostics that are not always immediately available in the acute phase [14,15]. Likewise, advanced electrocardiographic parameters (e.g., ST-segment resolution, QRS fragmentation) and laboratory markers of wall stress (NT-proBNP) have shown prognostic value in selected cohorts, but were neither systematically assessed nor uniformly available in the present dataset [16,17,18]. Hence, these variables were excluded from the analysis in the current study. Similarly, we excluded inflammatory markers (e.g., leukocyte count, C-reactive protein (CRP)) due to their limited predictive value in relation to LV function and cardiovascular risk [19,20,21].
Despite the recognized importance of early LV function, there is still a lack of robust and clinically applicable prediction tools based solely on readily available parameters in the acute care setting. In recent years, artificial intelligence (AI) and ML approaches have shown great potential in the field of cardiovascular risk prediction. These methods can incorporate large numbers of variables and uncover complex, non-linear associations that may be missed by traditional statistical models. Initial studies applying ML to post-MI prognosis have reported promising predictive accuracy, suggesting that AI may augment clinical decision-making in this context. However, one of the largest studies on this topic included 1220 patients with myocardial infarction and focused on long-term prognosis. It achieved remarkable predictive power with the Extreme Gradient Boosting (XGBoost) model (AUC = 0.92) in relation to the development of heart failure. The leading predictive parameters based on SHAP analyses were LV function, LV end-systolic diameter, and LDH [22].
Short-term studies to optimize care capacities and to identify high-risk groups based on ML models are very limited. Jeon and colleagues used an AI-enabled ECG index and achieved promising results in predicting LV function with a cohort of 637 STEMI patients [23]. Similarly, a study by Xin and colleagues investigated 56 variables of 315 STEMI patients to predict infarction size. The Random Forest model achieved a coefficient of determination (R2) of 0.687 and the following ten factors were the most predictive parameters in feature importance analysis: leukocyte count, anterior wall MI, CK, cholesterol, troponin, platelet count, systolic blood pressure, blood urea nitrogen (BUN), creatinine, and NT-proBNP on admission [24]. Of note, LDH—a major predictive parameter in previous trials—was not included in this analysis and the overall sample size was relatively small [13,24].
The largest study to date on short-term risk evaluation included 1863 patients to predict in-hospital mortality (AUC = 0.79), ICU admission (AUC = 0.78), and LVEF < 40% (AUC = 0.74). However, the predictive capacity of the model was constrained by the small number of evaluated parameters, namely age, pre-hospital cardiac arrest, robust collateral recruitment, cardiovascular risk factors, blood pressure, heart rate, culprit lesion, and TIMI flow [25]. Although laboratory values are regarded as the best predictors according to the current literature, they were not included in this trial.
Notably, most previous ML approaches have framed post-STEMI LVEF as a categorical outcome (e.g., reduced vs. preserved function), rather than as a continuous variable. While classification facilitates risk grouping, it oversimplifies the spectrum of ventricular dysfunction and may obscure subtle prognostic differences. Continuous prediction models, in contrast, allow a more precise and individualized estimation of LV function, enhancing clinical interpretability and risk stratification.
Our present study aims to systematically identify early predictors of LV function following STEMI by analyzing one of the largest retrospective STEMI cohorts to date. We applied ML algorithms, including multivariate regression models, to determine which combinations of clinical, laboratory, and procedural variables best predict continuous LVEF at hospital discharge. While previous studies have demonstrated the prognostic importance of LVEF and explored ML-based prediction approaches, existing models often relied on small samples, categorical outcomes, or advanced imaging modalities that limit clinical applicability.
In this study, we sought to overcome these limitations by developing and validating ML models for the continuous prediction of LVEF and lactate dynamics after STEMI using only routinely available clinical and laboratory data. We hypothesized that ensemble models based on such parameters could accurately estimate early LV function and identify patients at risk. Beyond methodological development, this study aimed to evaluate the feasibility of integrating these ML-based tools into clinical workflows to enable early, data-driven risk stratification in acute cardiac care.

2. Materials and Methods

2.1. Study Population

This retrospective cohort study was conducted at Ludwig-Maximilians-University (LMU) Hospital, Munich, Germany. We screened 2553 consecutive patients admitted with a primary diagnosis of STEMI between 2014 and 2023. Exclusion criteria included incorrect STEMI admission diagnosis, myocardial infarction with non-obstructive coronary arteries (MINOCA), indication for emergency bypass surgery, and incomplete datasets. After applying these criteria, 2132 patients remained eligible for our analysis. To ensure a fair comparison among the different ensemble methods, we also dropped predictors with more than 200 missing values (Figure 1, Supplementary Table S1).
This study was carried out in accordance with the Declaration of Helsinki and the German Data Protection Act. It was approved by the institutional ethics committee of LMU Munich (#23-0609).

2.2. Data Collection

The final STEMI registry comprised 2132 patients with a total of 38 variables. Specifically, features comprised demographics, cardiovascular risk factors, clinical history and course, procedural data, and laboratory markers. Variables with >200 missing values were excluded in a preprocessing step, resulting in a total of 26 analyzed variables, coded into 42 covariates (33 when excluding baseline category) considering the categorical nature of some variables (Supplementary Figure S1, Supplementary Table S1).

2.3. Echocardiographic Assessment of Left Ventricular Function

LVEF at discharge was assessed using transthoracic echocardiography performed in accordance with current guideline recommendations [26,27]. Standard apical two- and four-chamber views were acquired, and LVEF was calculated using the biplane Simpson’s method whenever image quality permitted. All examinations were conducted by board-certified cardiologists or experienced sonographers as part of routine clinical care. LVEF values used for the analysis were extracted from the finalized echocardiography reports stored in the institutional imaging archive. Inter-observer variability is expected to be minimal due to standardized acquisition and reporting protocols applied within our department.

2.4. Data Processing

Continuous variables were log-transformed as log(x + 1) to reduce skewness, stabilize variance, and normalize their distributions. Categorical variables were one-hot encoded to ensure comparability across models. Although some algorithms can process categorical features directly, consistent preprocessing was chosen to standardize analyses across methods (Supplementary Table S1).
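The two preprocessing steps can be illustrated with a minimal sketch using only the Python standard library. The helper names (log1p_transform, one_hot) and the example values are hypothetical and are not taken from the study code or data; the authors' pipeline was built on Pandas and NumPy and may differ in detail.

```python
import math

def log1p_transform(values):
    """Apply log(x + 1) to non-negative continuous values."""
    return [math.log1p(v) for v in values]

def one_hot(values, categories=None):
    """One-hot encode a categorical column.

    Returns the category order and one 0/1 row per input value.
    """
    cats = sorted(set(values)) if categories is None else list(categories)
    rows = [[1 if v == c else 0 for c in cats] for v in values]
    return cats, rows

# Hypothetical example values (illustration only, not study data).
ck_admission = [120.0, 875.0, 4300.0]   # U/L
culprit = ["LAD", "RCA", "LAD"]

ck_log = log1p_transform(ck_admission)
cats, culprit_encoded = one_hot(culprit)

# Predictions made on the log(x + 1) scale can be mapped back with expm1.
ck_back = [math.expm1(v) for v in ck_log]
```

Because log(x + 1) is defined at x = 0, it can be applied to laboratory values that may be zero, which a plain logarithm could not handle.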
It is important to note that we did not perform feature elimination. Because the ensemble methods used in this study do not rely on matrix inversion for coefficient estimation, multicollinearity does not pose a methodological limitation. Tree-based models are inherently robust to correlated predictors, and their predictive performance is not adversely affected by multicollinearity. In addition, one-hot encoding introduces correlations among dummy variables by design. Arbitrarily removing such variables through recursive elimination would be counterproductive, as it may exclude clinically relevant information without offering methodological benefit. To preserve the full clinical context and ensure that the models had access to the complete set of available predictors, all variables were retained [28,29] (Supplementary Figure S2).

2.5. Outcomes

The primary outcome was LV function at hospital discharge, a clinically relevant predictor of cardiovascular risk. LV function was directly taken as a continuous outcome.
Secondary outcomes were lactate concentrations, analyzed both at admission and at peak levels during hospitalization. Lactate was modeled analogously to LV function and log(x + 1)-transformed to dampen the influence of large outliers.

2.6. Model Development

For the complete dataset, predictors and biomarkers with >200 missing entries were excluded. Further, patients with missing values in the remaining covariates were excluded, yielding 1608 patients for model development. Data were randomly split into training (80%) and test (20%) sets (seed = 42). Missing data handling and preprocessing were performed before splitting. Hyperparameters were optimized using grid search with leave-one-out cross-validation, (n-1)-fold, on the test set [30].
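The filtering and splitting steps described above can be sketched as follows. This is a simplified standard-library illustration, not the authors' implementation: the function names, the toy records, and the use of random.Random for shuffling are assumptions, and a split produced this way will not reproduce the partition obtained with other libraries even with the same seed.

```python
import random

MAX_MISSING = 200  # threshold stated in the text

def drop_sparse_columns(rows, max_missing=MAX_MISSING):
    """Remove variables whose number of missing (None) entries exceeds max_missing."""
    columns = list(rows[0].keys())
    keep = [c for c in columns if sum(r[c] is None for r in rows) <= max_missing]
    return [{c: r[c] for c in keep} for r in rows]

def complete_cases(rows):
    """Keep only patients without any missing value in the remaining variables."""
    return [r for r in rows if all(v is not None for v in r.values())]

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle reproducibly and split into training and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical toy records; a small threshold is used so the effect is visible.
rows = [
    {
        "age": 50 + i,
        "lactate": None if i == 0 else 1.0 + i,
        "nt_probnp": None if i < 5 else 100.0 * i,
    }
    for i in range(10)
]
filtered = drop_sparse_columns(rows, max_missing=2)  # drops nt_probnp (5 missing)
complete = complete_cases(filtered)                   # drops the patient without lactate
train, test = train_test_split(complete)
```

Dropping sparse variables before removing incomplete patients preserves more patients than the reverse order would.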
We implemented regression models using Decision Trees, Random Forest, and XGBoost. To provide meaningful baseline benchmarks and contextualize the performance of the ensemble models, we additionally incorporated three linear models: Ordinary Least Squares Regression, Lasso Regularization (L1), and Ridge Regularization (L2). These linear baselines allow for a clearer comparison between traditional linear approaches and the non-linear ensemble methods applied in this study.
  • Decision Trees: partition the feature space recursively to minimize variance within subgroups [31].
  • Random Forests: ensemble of Decision Trees built on bootstrapped samples with feature subsampling to reduce variance and overfitting [29].
  • XGBoost: sequentially adds trees to correct residual errors, with gradient-based optimization and regularization to control model complexity [28].
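The variance-minimizing partition that underlies a regression tree can be made concrete with a single split on one feature. The sketch below is an illustration only: the function names and the toy values are hypothetical, and library implementations such as scikit-learn search over all features and grow the tree recursively.

```python
def sse(values):
    """Sum of squared deviations from the mean (n times the variance)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(x, y):
    """Find the threshold t such that splitting on x <= t minimizes
    the summed within-group squared error of y.

    Returns (threshold, total_sse); threshold is None if no split helps.
    """
    pairs = sorted(zip(x, y))
    best = (None, sse(y))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [yy for _, yy in pairs[:i]]
        right = [yy for _, yy in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best

# Hypothetical toy data: peak lactate (mmol/L) and LVEF (as a fraction).
lactate = [1.0, 1.5, 2.0, 6.0, 7.0, 8.0]
lvef = [0.55, 0.60, 0.55, 0.35, 0.30, 0.35]
threshold, total_sse = best_split(lactate, lvef)
```

Random Forest averages many such trees grown on bootstrapped samples, whereas gradient boosting fits each new tree to the residuals of the current ensemble.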

2.7. Model Evaluation

Regression performance was assessed using mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), coefficient of determination (R2), explained variance score (EVS), and mean absolute percentage error (MAPE). The model was validated on a temporally distinct dataset from the same institution (20 consecutive STEMI patients in January 2024).
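For reference, the six metrics can be written out directly. The following standard-library sketch uses the usual textbook definitions; the function name and the toy values are hypothetical, and MAPE is returned as a fraction (multiply by 100 for the percentages reported here).

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE, R2, EVS, and MAPE for paired observations."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot
    # EVS uses the variance of the errors, so a constant bias is not penalized.
    mean_err = sum(errors) / n
    var_err = sum((e - mean_err) ** 2 for e in errors) / n
    evs = 1 - var_err / (ss_tot / n)
    mape = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": mae,
        "R2": r2,
        "EVS": evs,
        "MAPE": mape,
    }

# Hypothetical toy values: LVEF as a fraction.
y_true = [0.60, 0.50, 0.40, 0.30]
y_pred = [0.55, 0.50, 0.45, 0.35]
metrics = regression_metrics(y_true, y_pred)
```

EVS equals R2 when the mean prediction error is zero and exceeds it otherwise, so a gap between the two points to systematic bias in the predictions.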

2.8. Explainability

To ensure interpretability, we computed model-based feature importance scores and analyzed SHAP (Shapley additive explanations) values for global and local interpretability. Beeswarm plots were generated to visualize feature effects, illustrating both the magnitude and direction of influence at the single-patient and full-cohort levels.
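The idea behind Shapley values can be shown with an exact computation on a toy model: each feature's attribution is its marginal contribution averaged over all orderings in which features can be added. This sketch is an illustration under simplifying assumptions, not the procedure of the SHAP library: absent features are replaced by a single baseline value instead of a background distribution, and the toy model with its coefficients is entirely hypothetical.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values by enumerating all feature orderings.

    Features not yet added take their baseline value (a simplification).
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = model(current)
        for i in order:
            current[i] = x[i]
            updated = model(current)
            phi[i] += updated - previous
            previous = updated
    return [p / len(orderings) for p in phi]

def toy_lvef_model(features):
    """Hypothetical model: shock and LAD lesion lower LVEF, with an interaction."""
    shock, lad, lactate = features
    return 0.55 - 0.10 * shock - 0.05 * lad - 0.01 * lactate - 0.02 * shock * lad

patient = [1, 1, 5.0]    # shock present, LAD culprit, peak lactate 5 mmol/L
baseline = [0, 0, 2.0]   # reference patient
phi = shapley_values(toy_lvef_model, patient, baseline)
```

The attributions sum to the difference between the patient's prediction and the baseline prediction, and the interaction term is shared equally between the two features involved. Enumeration scales factorially with the number of features, which is why dedicated tree algorithms are used in practice.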

2.9. Statistical Analysis

To analyze differences between patients with and without prior PCI, we applied the following statistical tests. Continuous variables were summarized as mean (±SD). Binary variables were reported as proportions. Comparisons between groups (e.g., patients with vs. without prior intervention) were performed using the Mann–Whitney U test for continuous data and the χ2 or z-test for categorical variables.
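The categorical comparison can be illustrated with a pooled two-proportion z-test. The sketch below uses only the standard library; the function name and the counts are hypothetical and chosen for illustration, so the result is not expected to match any p value reported in this article.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ztest(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for equality of two proportions (pooled standard error)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts: 30 of 100 events in group A vs. 20 of 100 in group B.
z, p_value = two_proportion_ztest(30, 100, 20, 100)
```

For a 2 × 2 table, the square of this z statistic equals the χ2 statistic without continuity correction, so the two tests named above lead to the same p value in that setting.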

2.10. Software

All analyses were performed using Python 3 (Python Software Foundation). Data preprocessing and management were carried out with the Pandas library (version 1.5) and NumPy (version 1.23) for numerical operations [32,33].
Machine learning models were implemented using the following:
  • Scikit-learn (version 1.2) for Decision Tree and Random Forest regressors [34].
  • XGBoost (version 1.7) for gradient boosting regression [28].
Model evaluation metrics, including MSE, RMSE, MAE, R2, EVS, and MAPE, were calculated using scikit-learn.
Visualization of results was performed using Matplotlib library (version 3.6) [35]. For model interpretability, we employed the SHAP library (version 0.41) to generate SHAP values, including beeswarm and summary plots [36].
Statistical analyses were conducted with scipy.stats (version 1.9) for nonparametric tests (Mann–Whitney U) and statsmodels (version 0.13) for parametric testing (z-test for proportions) [37,38].

3. Results

3.1. Data Basis

Our study included a total of 2553 consecutive STEMI patients admitted to our hospital between 2014 and 2023. After exclusion of patients with an incorrect STEMI admission diagnosis, cases without acute coronary obstruction (MINOCA cohort), and patients with an indication for emergency bypass surgery, as well as missing datasets, 2132 patients with 38 different variables remained for further statistical evaluations.
Among these, variables with more than 200 missing values were excluded, resulting in 26 remaining variables. Further, we removed patients with missing values to allow a fair comparison, since XGBoost can handle missing values internally whereas the other ensemble methods cannot. This yielded a final complete dataset of 1608 patients (Figure 1, Supplementary Table S1, Supplementary Figure S1).

3.2. Baseline Characteristics

Our final study population included 1608 individuals with an average age of 65 years. Mean body mass index was 27.2 kg/m2 and one quarter of them were female (25.1%). About 20% (N = 325/1608) had a known coronary artery disease with a history of a prior percutaneous coronary intervention (PCI) (Table 1).
In the overall cohort, coronary artery disease (CAD) affected one vessel in 30.9% and two vessels in 27.8% of patients, while the remaining 41.3% presented with three-vessel disease. However, in patients with a prior PCI, the relative proportion of three-vessel CAD was almost twice as high as in those without a prior PCI (64.0% vs. 35.5%; p < 0.001) (Table 1).
The culprit lesion was the left anterior descending artery (LAD) in half of all cases, right coronary artery (RCA) in one third, and left circumflex artery (LCX) in about 11%. The left main coronary artery required only very few PCIs in our cohort, with about 1–2%. Interestingly, the culprit lesion distribution was independent of the status regarding a prior PCI (Table 1).
Procedural data indicated an average contrast agent volume (CAV) of 207 ± 98 mL with a radiation time of 14.1 ± 11.7 min and a dose area product (DAP) of 4306 ± 4111 cGy·cm2. Tirofiban, a potent intravenous antiplatelet agent, was used in 10% of cases due to very high thrombus load or lack of reperfusion. Of note, tirofiban was used more frequently in patients with prior PCI (16.6% vs. 8.3%; p < 0.001) (Table 1).

3.3. Hemodynamics and Shock

A total of 16% of all patients presented with cardiogenic shock on admission with a significantly higher proportion among those with prior PCI (23% vs. 15%; p = 0.001). Cardiopulmonary resuscitation (CPR) was necessary in 14% and in about 5% mechanical circulatory support (MCS) systems were implanted. These were clearly dominated by extracorporeal membrane oxygenation (ECMO) systems (4.7% of patients) and a minor proportion of Impella (0.9%), ECMO + Impella (0.3%), or intra-aortic balloon pump (IABP) devices (0.3%). Importantly, the LVEF upon discharge was only mildly impaired, at 49 ± 11% (Table 1).

3.4. Laboratory Indicators of Infarction Size

Key laboratory findings indicated high CK values on admission (875 ± 1502 U/L) with a peak of 2348 ± 3727 U/L, reflecting substantial myocardial infarction. This is in line with cardiac-specific troponin T values of 3.4 ng/mL on admission, which rose to 9.1 ng/mL in the further clinical course. LDH as a marker of general tissue damage was 421 ± 653 U/L on admission, and average renal function was largely normal (creatinine 1.13 ± 0.62 mg/dL) in our cohort. In line with only a minor proportion of STEMI patients presenting with cardiogenic shock, lactate values were only mildly elevated in our cohort on admission (2.48 ± 2.47 mmol/L), as were their maximum peaks (3.49 ± 3.80 mmol/L).
When stratified for the presence of a prior PCI, we identified slightly higher infarction sizes based on laboratory markers in patients with no prior PCI (CK max 2361 vs. 2296 U/L, p = 0.013; CK-MB max 241 vs. 207 U/L, p = 0.001), although maximum troponin T as the best investigated marker did not reach statistical significance (troponin T max 9.04 vs. 9.25 ng/mL; p = 0.059). Moreover, renal function based on creatinine measurements was slightly worse in patients with a prior PCI.

3.5. Mortality

Overall mortality for our final STEMI cohort was 6.6% (N = 106/1608). In line with previous studies, we set the LVEF threshold to 40% and divided our study cohort into two groups [25]. While STEMI patients with an LVEF ≥ 40% had a mortality rate of 2.6% (N = 36/1374), it was 29.9% (N = 70/234) in patients with an LVEF < 40% (p < 0.001).
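The reported rates follow directly from the stated counts, as the short arithmetic check below shows. The counts are taken from the text; the variable names and the unadjusted risk ratio are additions for illustration only and are not a result reported by the authors.

```python
# Counts as stated in the text (deaths, patients).
deaths_total, n_total = 106, 1608
deaths_preserved, n_preserved = 36, 1374   # LVEF of at least 40%
deaths_reduced, n_reduced = 70, 234        # LVEF below 40%

def percent(events, n):
    """Event rate in percent, rounded to one decimal place."""
    return round(100 * events / n, 1)

rate_total = percent(deaths_total, n_total)
rate_preserved = percent(deaths_preserved, n_preserved)
rate_reduced = percent(deaths_reduced, n_reduced)

# The two subgroups should add up to the full cohort.
groups_consistent = (
    n_preserved + n_reduced == n_total
    and deaths_preserved + deaths_reduced == deaths_total
)

# Crude (unadjusted) risk ratio, shown only as an illustration.
risk_ratio = (deaths_reduced / n_reduced) / (deaths_preserved / n_preserved)
```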

3.6. Model Performance in Prediction of LV Function

For the prediction of LVEF based on our dataset, the Decision Tree model showed the weakest performance, with an MAE of 0.075, and an R2 of only 0.17 (MSE = 0.010, RMSE = 0.097). In contrast, the ensemble approaches yielded markedly better results. Random Forest achieved an MAE of 0.067 and an R2 of 0.33 (MSE = 0.008, RMSE = 0.087). XGBoost provided the most favorable overall performance, with an MAE of 0.068 and an R2 of 0.35 (MSE = 0.008, RMSE = 0.086). Although the differences between Random Forest and XGBoost were small, both methods substantially outperformed the single Decision Tree, highlighting the advantage of ensemble learning for this prediction task.
Furthermore, the linear baseline models—Ordinary Least Squares Regression, Lasso Regularization (L1), and Ridge Regularization (L2)—demonstrated competitive error metrics but consistently explained less variance than the ensemble methods. This performance gap highlights the limitations of traditional linear approaches in modeling non-linear relationships within the dataset and further emphasizes the benefit of ensemble-based techniques for LVEF prediction (Table 2).
In the stratified analysis according to prior PCI, model performance differed substantially between subgroups. In patients with previous PCI, all models showed lower predictive accuracy, with XGBoost achieving the highest R2 of 0.24 but with markedly higher error metrics (RMSE = 0.107, MAPE = 22.9%) compared to the overall cohort. In contrast, in patients without prior PCI, predictive performance improved, with both Random Forest (R2 = 0.36, RMSE = 0.085, MAPE = 15.6%) and XGBoost (R2 = 0.35, RMSE = 0.086, MAPE = 15.8%) achieving substantially better results. These findings suggest that the presence of prior PCI may be associated with increased heterogeneity in LV function, rendering accurate prediction from clinical variables more challenging (Table 2).
When restricting the predictor set to laboratory parameters only, model performance was comparable, though with some notable differences, to that obtained from the full clinical dataset. The Decision Tree achieved limited accuracy (MSE = 0.010, RMSE = 0.100, MAE = 0.080, R2 = 0.25, MAPE = 19.7%), whereas both Random Forest (MSE = 0.008, RMSE = 0.092, MAE = 0.074, R2 = 0.36, MAPE = 18.0%) and XGBoost (MSE = 0.009, RMSE = 0.092, MAE = 0.074, R2 = 0.36, MAPE = 18.2%) demonstrated superior predictive performance. Interestingly, R2 in the laboratory-only setting was slightly higher than in the overall cohort, suggesting that laboratory markers capture a substantial proportion of the variability in LV function. However, error metrics such as RMSE and MAE were consistently higher, indicating lower precision of individual predictions when non-laboratory clinical variables were excluded. Taken together, these results highlight that while laboratory markers alone provide valuable predictive information, integration with broader clinical data improves the accuracy of LV function estimation (Figure 2, Table 2).

3.7. Categorical Analysis of LV Function

To facilitate comparability with previous work, we additionally performed a categorical analysis of the LV function and applied separators at LVEFs of 30%, 35%, 40%, and 50%, respectively (Table 3, Supplementary Figure S3). Overall, an LVEF threshold of 40% provided the best results with an AUC of 0.82 (with logistic regression (LR); XGBoost AUC = 0.80) in our model—a threshold that has also proven to be optimal in the largest previous studies on this topic [25]. In line with our data on continuous LVEF, the model performs even better in patients without a prior PCI (XGBoost AUC = 0.89) and still provides good results when exclusively based on laboratory values (XGBoost AUC = 0.80) (Table 3, Supplementary Figure S3).
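To illustrate how a continuous LVEF is turned into a binary target and scored with the AUC, the sketch below dichotomizes at 40% and computes the AUC as the probability that a randomly chosen patient with reduced LVEF receives a higher predicted risk than a randomly chosen patient without. The function name, LVEF values, and risk scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positives and negatives (ties count as 0.5)."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(
        (p > n) + 0.5 * (p == n)
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical toy values: LVEF as a fraction and predicted risk of reduced LVEF.
lvef = [0.25, 0.35, 0.38, 0.45, 0.50, 0.60]
labels = [1 if value < 0.40 else 0 for value in lvef]   # 1 = LVEF < 40%
risk = [0.9, 0.4, 0.7, 0.5, 0.2, 0.1]

auc = roc_auc(labels, risk)
```

Because the AUC depends only on the ranking of predicted risks, it is not directly comparable to R2 from the continuous models, which also reflects the magnitude of each error.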

3.8. Lactate Values as a Surrogate for LV Function

LV function is a major determinant of cardiac output and overall hemodynamic stability. Impaired LV function may lead to tissue hypoperfusion, which is often reflected by elevated lactate concentrations. Given this close pathophysiological relationship, we investigated whether ML models trained on clinical and laboratory parameters could predict lactate levels both at hospital admission and at their peak values during the clinical course. This concept is in line with several studies reporting a strong correlation between lactate levels and cardiovascular outcome [39,40,41]. We acknowledge that lactate prediction has no direct clinical application, because lactate can be measured quickly and easily at the point of care. Nevertheless, we used this important parameter to evaluate the overall performance of our model.
For the prediction of lactate levels at hospital admission, ensemble models outperformed a single Decision Tree. In the overall cohort, Random Forest (MSE = 0.138, RMSE = 0.372, MAE = 0.274, R2 = 0.41, EVS = 0.42, MAPE = 27.2%) and XGBoost (MSE = 0.137, RMSE = 0.371, MAE = 0.274, R2 = 0.42, EVS = 0.42, MAPE = 27.4%) demonstrated superior predictive accuracy compared to the Decision Tree (R2 = 0.30, MAPE = 29.4%). Subgroup analysis revealed considerable heterogeneity: In patients with prior PCI, prediction accuracy was low, with negative R2 values for Decision Tree and Random Forest and only marginal performance of XGBoost (R2 = 0.02, MAPE = 39.1%). In contrast, in patients without prior PCI, both Random Forest and XGBoost achieved consistent moderate predictive performance (R2 = 0.40 and 0.42, respectively; MAPE = 28%), comparable to or slightly better than the full cohort. These findings suggest that lactate levels on admission can be predicted with moderate accuracy, particularly in patients without prior PCI, whereas prediction is substantially less reliable in those with a history of PCI (Figure 3, Supplementary Table S2).
For the prediction of peak lactate values during the clinical course, ensemble models again outperformed a single Decision Tree. In the overall cohort, Random Forest yielded the best performance (MSE = 0.194, RMSE = 0.441, MAE = 0.333, R2 = 0.47, EVS = 0.51, MAPE = 29.6%), closely followed by XGBoost (R2 = 0.43, MAPE = 30.8%), whereas the Decision Tree performed worse (R2 = 0.37, MAPE = 28.9%). Subgroup analysis revealed that in patients without prior PCI, predictive performance improved substantially, with XGBoost achieving R2 = 0.57, RMSE = 0.368, and MAPE = 19.4%. By contrast, in patients with prior PCI, performance was reduced, mirroring the overall cohort results. These findings indicate that peak lactate levels can be predicted with moderate accuracy, particularly in patients without a history of PCI, while prediction remains less reliable in those with prior PCI (Figure 3, Supplementary Table S3).
A categorical analysis of admission lactate values with cutoffs of 2.5, 3.0, and 3.5 mmol/L provided the best values for the 3.5 mmol/L cutoff. For the full cohort, LR yielded an AUC of 0.88, compared with 0.76 for XGBoost. As in our LV function analysis, admission lactate values were predicted better in patients without a prior PCI (XGBoost AUC = 0.90). Results for peak lactate values were largely comparable to admission values for the full cohort (LR AUC = 0.84) as well as in relation to the presence or absence of a prior PCI (prior PCI Random Forest AUC = 0.93; no prior PCI Random Forest AUC = 0.84) (Supplementary Tables S4 and S5).

3.9. Feature Importance

The feature importance and SHAP beeswarm plots display the most influential covariates in the best performing models. We explicitly dropped features of very low importance. In our analysis, the presence of cardiogenic shock emerged as the strongest predictor of LV function. This finding is not surprising, as cardiogenic shock in the STEMI population typically reflects an acute impairment of LV function that often does not fully recover even after successful revascularization of the culprit vessel. In some cases within our cohort, shock severity necessitated the implantation of an MCS device, which also explains the close association with ECMO implantation (Figure 4 and Figure 5).
Among the coronary vessels, a culprit lesion in the LAD had the largest impact on subsequent LV function, which is consistent with its central role in myocardial perfusion. When focusing on laboratory parameters, peak lactate levels showed high feature importance, which is in line with their established role as a surrogate marker of cardiogenic shock and systemic hypoperfusion [39,42,43]. Interestingly, LDH at admission also demonstrated strong predictive value. This may be explained by two factors: first, LDH serves as a nonspecific marker of tissue injury, and second, it may reflect the time delay between ischemia onset and hospital admission. Higher LDH levels therefore indicate longer ischemia duration, correlating with irreversible myocardial damage and subsequently reduced LV function (Figure 4 and Figure 5).
Classical cardiac biomarkers such as troponin T, CK, and CK-MB all exhibited predictive value without one parameter clearly outperforming the others. Notably, serum creatinine levels were also associated with later LV function. This association may reflect a higher overall morbidity in patients with renal impairment, but it could also be a consequence of true renal hypoperfusion secondary to reduced LV function (Figure 4 and Figure 5).
An analogous feature importance analysis including SHAP beeswarm with respect to lactate levels at admission and their peak values during the clinical course yielded very similar results. However, CPR emerged as an additional important variable in this context. This is most likely explained by the fact that prior CPR episodes can substantially elevate lactate levels, while having only a limited direct impact on subsequent LV function (Supplementary Figure S4).
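To make the idea behind SHAP attributions concrete, the following sketch computes exact Shapley values for a single hypothetical patient. It is a simplification under stated assumptions: `toy_lvef` and its coefficients are invented for illustration and are not the fitted model, "absent" features are replaced by a single baseline value rather than averaged over background data, and the brute-force enumeration is only feasible for a handful of features (tree-based SHAP implementations use a much faster algorithm).

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample. Features outside the coalition
    are set to their baseline value (a simplifying assumption)."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(len(others) + 1):
            # Weight of a coalition of size k that does not contain feature i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_lvef(z):
    """Hypothetical toy model of LVEF (as a fraction); NOT the study model."""
    shock, lad_culprit, peak_lactate = z
    return (0.55 - 0.12 * shock - 0.05 * lad_culprit
            - 0.01 * peak_lactate - 0.02 * shock * peak_lactate)

x = [1, 1, 4.0]          # shock, LAD culprit lesion, peak lactate (mmol/L)
baseline = [0, 0, 1.5]   # reference patient (assumed values)

phi = shapley_values(toy_lvef, x, baseline)
for name, contribution in zip(["shock", "LAD culprit", "peak lactate"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the prediction for the patient and the prediction for the baseline, which is the additivity property that beeswarm plots rely on; the interaction term between shock and lactate is split between both features.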

3.10. Independent Validation

To further evaluate model robustness, we performed an additional validation in 20 consecutive STEMI patients admitted in January 2024, after register completion (Table 4). For the full cohort, XGBoost achieved comparable performance to the model development cohort (MSE = 0.0080, RMSE = 0.089, MAE = 0.079, R2 = 0.34, EVS = 0.37, MAPE = 18.5%), closely matching our overall results (R2 = 0.35, MAPE = 16.1%). In the subgroup of patients with prior PCI, predictive accuracy was substantially lower, with the validation yielding a negative explained variance (R2 = −0.56) despite moderate error metrics (RMSE = 0.099, MAPE = 22.6%), consistent with the lower performance already observed during model development (R2 = 0.24). In patients without prior PCI, Random Forest maintained good predictive accuracy (R2 = 0.29, MAPE = 18.6%) in subsequent validation, although slightly lower compared with model development results (R2 = 0.36, MAPE = 15.6%). Similarly, when restricting predictors to laboratory values only, Random Forest achieved moderate accuracy in validation (R2 = 0.25, MAPE = 20.2%), somewhat lower than in the model development setting (R2 = 0.36, MAPE = 18.0%) (Table 4).
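The reported error metrics, and the way a negative R2 can coexist with a moderate RMSE in a small subgroup, can be sketched as follows. The function name `regression_metrics` and the six example values are hypothetical and chosen only to show the effect; LVEF is assumed to be expressed as a fraction, which is consistent with the reported pairs of MSE and RMSE (for example, the square root of 0.0080 is approximately 0.089).

```python
import math

def regression_metrics(y_true, y_pred):
    """MSE, RMSE, MAE, R2, explained variance score (EVS), and MAPE in percent."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    var_true = sum((t - mean_true) ** 2 for t in y_true) / n
    mean_err = sum(errors) / n
    var_err = sum((e - mean_err) ** 2 for e in errors) / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": mae,
        "R2": 1.0 - mse / var_true,       # 1 - SS_res / SS_tot
        "EVS": 1.0 - var_err / var_true,  # ignores a constant bias in the errors
        "MAPE": mape,
    }

# Hypothetical subgroup with a narrow range of observed LVEF values:
# absolute errors are modest, but larger than the spread of the outcome.
y_true = [0.50, 0.52, 0.55, 0.48, 0.53, 0.51]
y_pred = [0.45, 0.60, 0.50, 0.55, 0.47, 0.58]

m = regression_metrics(y_true, y_pred)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

R2 compares the squared error with the variance of the observed outcome, so it becomes negative whenever predicting the subgroup mean would have been more accurate than the model, even if RMSE and MAPE look acceptable in absolute terms.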
For the prediction of lactate levels at admission, our validation cohort demonstrated consistent but heterogeneous results across subgroups. In the full cohort, XGBoost achieved an MSE of 0.0800, RMSE of 0.283, MAE of 0.239, R2 of 0.19, EVS of 0.27, and MAPE of 19.6%. While error metrics improved compared with the model development cohort (RMSE = 0.371, MAE = 0.274, MAPE = 27.4%), the explained variance was lower for the independent validation group (R2 = 0.42 in the model development cohort). In patients with prior PCI, predictive accuracy remained poor, with validation yielding a negative R2 (−0.15) and high error values (RMSE = 0.411, MAPE = 35.9%), consistent with the weak performance observed in the model development cohort (R2 = 0.02, MAPE = 39.1%). By contrast, in patients without prior PCI, validation demonstrated robust performance, with R2 = 0.43 and MAPE = 15.7%, closely mirroring or even improving upon the model development results (R2 = 0.42, MAPE = 27.6%) (Supplementary Table S6).
For the prediction of peak lactate levels, validation confirmed the overall trends observed during model development. In the full cohort, Random Forest achieved an MSE of 0.178, RMSE of 0.422, MAE of 0.312, R2 of 0.40, EVS of 0.41, and MAPE of 21.8%. This performance was comparable to the model development cohort (R2 = 0.47, MAPE = 29.6%), with slightly lower error metrics. In patients with prior PCI, however, predictive performance deteriorated markedly, with our validation cohort yielding an RMSE of 0.731, an MAE of 0.637, negative explained variance (R2 = −0.74, EVS = −0.62), and an MAPE of 46.3%, indicating poor model generalizability in this subgroup. By contrast, in patients without prior PCI, XGBoost maintained robust predictive accuracy, with validation yielding MSE = 0.116, RMSE = 0.341, MAE = 0.265, R2 = 0.60, EVS = 0.61, and MAPE = 17.7%, closely aligning with or slightly improving upon the results from model development (R2 = 0.57, MAPE = 19.4%) (Supplementary Table S7).
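Given a validation cohort of 20 patients, point estimates of R2 carry considerable uncertainty. One way to quantify this, shown here only as an illustrative sketch and not as the procedure used in the study, is a percentile bootstrap interval. The function names, the synthetic data, and the settings (`n_boot`, `alpha`, `seed`) are assumptions.

```python
import random

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def bootstrap_r2_interval(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for R2, resampling patients with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        if max(yt) == min(yt):
            continue  # R2 is undefined if all resampled outcomes are identical
        yp = [y_pred[i] for i in idx]
        stats.append(r2_score(yt, yp))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Synthetic cohort of 20 patients (LVEF as a fraction); not study data.
rng = random.Random(1)
y_true = [rng.gauss(0.48, 0.10) for _ in range(20)]
y_pred = [0.48 + 0.5 * (t - 0.48) + rng.gauss(0.0, 0.06) for t in y_true]

point = r2_score(y_true, y_pred)
lo, hi = bootstrap_r2_interval(y_true, y_pred)
print(f"R2 = {point:.2f}, 95% bootstrap interval [{lo:.2f}, {hi:.2f}]")
```

With so few patients the interval is typically wide, which is worth keeping in mind when subgroup results from the validation cohort are compared with those from model development.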

4. Discussion

This study demonstrates the feasibility of predicting LV function and lactate concentration after STEMI using routinely available clinical and laboratory parameters with ML approaches. In contrast to prior STEMI ML studies that typically enrolled a few hundred to about 1800 patients and often focused on classification rather than exact-value prediction, our single-center cohort (>2100 screened; 1608 for modeling) is among the largest to date for this purpose and enabled extensive subgroup analyses (prior PCI vs. no prior PCI; laboratory-only features) [20,21,22]. Beyond LV function, we extend prediction to admission and peak lactate, a marker closely linked to hemodynamic compromise and short-term prognosis [39,40,41,42,43].
Compared with earlier work emphasizing classical regression or restricted biomarker panels [5,12,13], we integrated a broad array of demographic, clinical, procedural, and laboratory variables and observed consistent gains of ensemble methods (Random Forest, XGBoost) over single trees [28,29,31], in line with method-agnostic benchmarks showing that tree-based models remain state-of-the-art for medium-sized tabular datasets [44,45].
In addition to these findings, recent advances in artificial intelligence highlight promising opportunities for future multimodal extensions of our approach. Several contemporary studies have demonstrated that combining ECG signals, echocardiographic imaging, and clinical data can substantially improve cardiac risk stratification and functional assessment. Parvathi et al. showed that integrating ECG images with clinical variables via a hybrid convolutional neural network-ensemble architecture improves myocardial infarction prediction, while Soto et al. demonstrated that a multimodal deep learning framework using 12-lead ECGs and echocardiogram videos enhances diagnostic precision for left ventricular hypertrophy [46,47]. Similarly, other investigations have applied deep learning algorithms to echocardiographic video sequences or 12-lead ECG recordings to quantify cardiac function, further illustrating the potential of imaging- and signal-based approaches [23,48,49]. Broader methodological reviews further emphasize that multimodal AI integrating imaging, physiologic signals, and structured clinical data is increasingly used across cardiovascular medicine [50,51].
Beyond imaging- and ECG-based prediction, artificial intelligence has also shown utility across other areas of cardiology. The recent perspective by Cersosimo et al. discusses emerging applications of AI and large language models in electrophysiology, including ECG interpretation, arrhythmia detection, ablation guidance, and sudden cardiac death risk stratification [52]. Incorporating these studies provides broader clinical context and situates our work within the rapidly expanding adoption of AI-driven tools in cardiovascular care. While our present models rely solely on routinely collected clinical and laboratory data, these multimodal approaches outline future directions for extending our methodology and potentially improving predictive performance even further.
Our focus on exact-value LVEF prediction complements prior classification-oriented models (e.g., LVEF < 40% or in-hospital endpoints) and AI-enabled ECG indices for LV dysfunction in STEMI, supporting the incremental value of routine, early-available signals [20,22]. We achieved an MAE of 7% for LVEF prediction, even when the model was based exclusively on laboratory values. Of note, routine quick LVEF estimation in the emergency department or intensive care unit by visual approximation reaches the same accuracy of 7% if performed by experienced examiners [53,54,55]. Integrating such predictive models into clinical information systems or mobile decision-support dashboards could improve real-time patient triage.
In a benchmark comparison with previous ML studies, our model demonstrates robust performance. In a classification-oriented model, i.e., predicting an LVEF < 40%, Sritharan and coworkers reached an AUC of 0.74 [25]. Our XGBoost-based model achieved an AUC of 0.80 for the overall cohort as well as when based solely on laboratory values, and an AUC of 0.89 in patients without a prior PCI.
To our knowledge, there is no directly comparable study of exact-value LVEF prediction. Xin and colleagues predicted infarct size based on MRI measurements in a cohort of 315 STEMI patients without prior PCI with an impressive overall model performance (R2 = 0.687) [24]. Although infarct size correlates with LVEF, that study did not predict LVEF directly, and its sample size is small, which limits comparability with our reported results (R2 = 0.346 for the overall cohort). Although lactate measurements are usually quickly available in the emergency department or ICU setting, our study demonstrates that the model is capable of reliably predicting highly important clinical parameters in STEMI patients [42,43].
Several strengths of our work should be emphasized. First, to our knowledge, this is one of the largest datasets of STEMI patients analyzed with ML for LV function prediction. The size and diversity of the cohort improve the generalizability of our findings and reduce the risk of overfitting. Second, we systematically evaluated different model architectures, demonstrating the superiority of ensemble methods and confirming that even restricted feature sets (e.g., laboratory parameters only) retained predictive value. Third, we performed independent validation in a temporally distinct patient cohort from the same institution, which, despite its small size, confirmed the robustness of our findings and strengthens the translational potential of our approach.

Limitations

Nonetheless, several limitations must be acknowledged. First, this was a single-center study, which may limit external validity given regional differences in patient populations, treatment strategies, and healthcare systems. Second, the retrospective design introduces potential bias, particularly due to the lack of standardized time points for assessing LV function and laboratory parameters. Both echocardiographic and laboratory assessments were performed during routine care, not in a protocol-driven manner, which may have introduced variability into the outcomes. Especially for lactate, a previous study has shown that lactate clearance in the first 8 h of cardiogenic shock is superior to a single admission value [42,43]. Third, although our registry initially included 2553 patients, only 1608 remained for the final analysis after applying exclusion criteria and handling missing values. This may have reduced statistical power and prevented inclusion of potentially important biomarkers such as NT-proBNP, which were not available in a sufficient number of patients [17]. Fourth, there were no standardized protocols for echocardiographic assessment of LVEF, and inter-operator variability is likely to have attenuated model performance [53,54,55].

5. Conclusions

ML models demonstrated that LV function at discharge and lactate dynamics after STEMI can be predicted with clinically meaningful accuracy using routinely available clinical and laboratory data. Ensemble methods, particularly Random Forest and XGBoost, consistently outperformed single algorithms and proved robust in validation. Although predictive performance was lower in patients with prior PCI, the overall findings underscore the potential of ML for individualized risk stratification after STEMI.
Importantly, an early, data-driven estimation of LVEF around the clinical cutoff of 40%, derived from routine clinical and laboratory variables, may allow rapid identification of high-risk patients even before final imaging results are available. The value of such early risk stratification is underscored by the tenfold increase in mortality associated with severely impaired LV function in the presented STEMI cohort. Our approach could thus complement echocardiography by providing an admission-based prediction of recovery potential and supporting early therapeutic and monitoring decisions.
Future research should validate these findings in larger, prospective multi-center cohorts using standardized diagnostic protocols and comprehensive biomarker assessment to establish clinical integration of ML-based prognostic tools in acute cardiac care.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/jcm14238563/s1: Figure S1: Data selection flow chart; Figure S2: Correlation matrix of predictor variables; Figure S3: Classification performance of predicting LVEF; Figure S4: Feature importance and SHAP beeswarm for lactate prediction; Table S1: List of variables; Table S2: Regression analysis of admission lactate prediction; Table S3: Regression analysis of maximum lactate prediction; Table S4: Categorical analysis of admission lactate prediction; Table S5: Categorical analysis of maximum lactate prediction; Table S6: Regression analysis of admission lactate prediction (validation cohort); Table S7: Regression analysis of maximum lactate prediction (validation cohort).

Author Contributions

Conceptualization: S.V. and C.S. Data curation: S.-F.Z., K.D., D.E., and C.S. Formal analysis: S.-F.Z., K.D., S.V., and C.S. Investigation: S.-F.Z., K.D., D.E., and C.S. Methodology: S.-F.Z., S.V., and C.S. Project administration: S.V. and C.S. Resources: S.V. and C.S. Software: S.-F.Z., S.V., and C.S. Supervision: S.V. and C.S. Validation: S.-F.Z., K.D., D.E., S.V., and C.S. Visualization: S.-F.Z., S.V., and C.S. Writing—original draft: S.-F.Z., S.V., and C.S. Writing—review and editing: S.-F.Z., K.D., D.E., S.V., and C.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

This study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Ethics Committee of LMU Munich, Germany (#23-0609; date of approval: 16 August 2023).

Informed Consent Statement

Patient consent was waived due to the retrospective nature of the study and anonymized data.

Data Availability Statement

Data are available on reasonable request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ACC, American College of Cardiology
AHA, American Heart Association
ALT, Alanine transaminase
AST, Aspartate transaminase
AUC, Area under the curve
BMI, Body mass index
CAD, Coronary artery disease
CAV, Contrast agent volume
CK, Creatine kinase
CK-MB, Creatine kinase-myocardial band
CMR, Cardiac magnetic resonance imaging
CPR, Cardiopulmonary resuscitation
CRP, C-reactive protein
DAP, Dose area product
DT, Decision Tree
ECG, Electrocardiogram
ECMO, Extracorporeal membrane oxygenation
EVS, Explained variance score
ICU, Intensive care unit
IABP, Intra-aortic balloon pump
ICD, Implantable cardioverter-defibrillator
LAD, Left anterior descending coronary artery
LCA, Left main coronary artery
LCX, Left circumflex coronary artery
LDH, Lactate dehydrogenase
LVEF, Left ventricular ejection fraction
LV, Left ventricle / left ventricular
MAE, Mean absolute error
MAPE, Mean absolute percentage error
MCS, Mechanical circulatory support
MI, Myocardial infarction
MINOCA, Myocardial infarction with non-obstructive coronary arteries
ML, Machine learning
MSE, Mean squared error
NT-proBNP, N-terminal pro-B-type natriuretic peptide
OLS, Ordinary Least Squares Regression
PCI, Percutaneous coronary intervention
PR-AUC, Precision–recall area under the curve
R2, Coefficient of determination
RCA, Right coronary artery
RF, Random Forest
RMSE, Root mean squared error
SD, Standard deviation
SHAP, Shapley additive explanations
STEMI, ST-segment elevation myocardial infarction
XG, Extreme Gradient Boosting (XGBoost)

References

  1. WHO. Global Health Estimates: Leading Causes of Death. Available online: https://www.who.int/data/gho/data/themes/mortality-and-global-health-estimates/ghe-leading-causes-of-death (accessed on 13 April 2025).
  2. Byrne, R.A.; Rossello, X.; Coughlan, J.J.; Barbato, E.; Berry, C.; Chieffo, A.; Claeys, M.J.; Dan, G.A.; Dweck, M.R.; Galbraith, M.; et al. 2023 ESC Guidelines for the management of acute coronary syndromes. Eur. Heart J. 2023, 44, 3720–3826.
  3. Rao, S.V.; O’Donoghue, M.L.; Ruel, M.; Rab, T.; Tamis-Holland, J.E.; Alexander, J.H.; Baber, U.; Baker, H.; Cohen, M.G.; Cruz-Ruiz, M.; et al. 2025 ACC/AHA/ACEP/NAEMSP/SCAI Guideline for the Management of Patients With Acute Coronary Syndromes: A Report of the American College of Cardiology/American Heart Association Joint Committee on Clinical Practice Guidelines. Circulation 2025, 151, e771–e862.
  4. Kodesh, A.; Loebl, N.; de Filippo, O.; Orvin, K.; Levi, A.; Bental, T.; Kornowski, R.; D’Ascenzo, F.; Perl, L. Machine-learning based calculator for personalized risk assessment following ST-elevation myocardial infarction. BMC Cardiovasc. Disord. 2025, 25, 639.
  5. Felbel, D.; Fackler, S.; Michalke, R.; Paukovitsch, M.; Groger, M.; Kessler, M.; Nita, N.; Teumer, Y.; Schneider, L.; Imhof, A.; et al. Prolonged pain-to-balloon time still impairs midterm left ventricular function following STEMI. BMC Cardiovasc. Disord. 2025, 25, 37.
  6. Hashmi, K.A.; Adnan, F.; Ahmed, O.; Yaqeen, S.R.; Ali, J.; Irfan, M.; Edhi, M.M.; Hashmi, A.A. Risk Assessment of Patients After ST-Segment Elevation Myocardial Infarction by Killip Classification: An Institutional Experience. Cureus 2020, 12, e12209.
  7. Hendriks, T.; Hartman, M.H.T.; Vlaar, P.J.J.; Prakken, N.H.J.; van der Ende, Y.M.Y.; Lexis, C.P.H.; van Veldhuisen, D.J.; van der Horst, I.C.C.; Lipsic, E.; Nijveldt, R.; et al. Predictors of left ventricular remodeling after ST-elevation myocardial infarction. Int. J. Cardiovasc. Imaging 2017, 33, 1415–1423.
  8. Wegiel, M.; Rakowski, T. Circulating biomarkers as predictors of left ventricular remodeling after myocardial infarction. Postep. Kardiol. Interwencyjnej 2021, 17, 21–32.
  9. Poyhonen, P.; Kylmala, M.; Vesterinen, P.; Kivisto, S.; Holmstrom, M.; Lauerma, K.; Vaananen, H.; Toivonen, L.; Hanninen, H. Peak CK-MB has a strong association with chronic scar size and wall motion abnormalities after revascularized non-transmural myocardial infarction: A prospective CMR study. BMC Cardiovasc. Disord. 2018, 18, 27.
  10. Babuin, L.; Jaffe, A.S. Troponin: The biomarker of choice for the detection of cardiac injury. CMAJ 2005, 173, 1191–1202.
  11. Younger, J.F.; Plein, S.; Barth, J.; Ridgway, J.P.; Ball, S.G.; Greenwood, J.P. Troponin-I concentration 72 h after myocardial infarction correlates with infarct size and presence of microvascular obstruction. Heart 2007, 93, 1547–1551.
  12. Liu, S.; Jiang, Z.; Zhang, Y.; Pang, S.; Hou, Y.; Liu, Y.; Huang, Y.; Peng, N.; Tang, Y. A nomogramic model for predicting the left ventricular ejection fraction of STEMI patients after thrombolysis-transfer PCI. Front. Cardiovasc. Med. 2023, 10, 1178417.
  13. Reinstadler, S.J.; Feistritzer, H.J.; Reindl, M.; Klug, G.; Mayr, A.; Mair, J.; Jaschke, W.; Metzler, B. Combined biomarker testing for the prediction of left ventricular remodelling in ST-elevation myocardial infarction. Open Heart 2016, 3, e000485.
  14. Eslami, V.; Bayat, F.; Asadzadeh, B.; Saffarian, E.; Gheymati, A.; Mahmoudi, E.; Movahed, M.R. Early prediction of ventricular functional recovery after myocardial infarction by longitudinal strain study. Am. J. Cardiovasc. Dis. 2021, 11, 471–477.
  15. Virbickiene, A.; Lapinskas, T.; Garlichs, C.D.; Mattecka, S.; Tanacli, R.; Ries, W.; Torzewski, J.; Heigl, F.; Pfluecke, C.; Darius, H.; et al. Imaging Predictors of Left Ventricular Functional Recovery after Reperfusion Therapy of ST-Elevation Myocardial Infarction Assessed by Cardiac Magnetic Resonance. J. Cardiovasc. Dev. Dis. 2023, 10, 294.
  16. Kiron, V.; George, P.V. Correlation of cumulative ST elevation with left ventricular ejection fraction and 30-day outcome in patients with ST elevation myocardial infarction. J. Postgrad. Med. 2019, 65, 146–151.
  17. Mathbout, M.; Asfour, A.; Leung, S.; Lolay, G.; Idris, A.; Abdel-Latif, A.; Ziada, K.M. NT-proBNP Level Predicts Extent of Myonecrosis and Clinical Adverse Outcomes in Patients with ST-Elevation Myocardial Infarction: A Pilot Study. Med. Res. Arch. 2020, 8, 10-18103.
  18. Jering, K.S.; Claggett, B.L.; Pfeffer, M.A.; Granger, C.B.; Kober, L.; Lewis, E.F.; Maggioni, A.P.; Mann, D.L.; McMurray, J.J.V.; Prescott, M.F.; et al. Prognostic Importance of NT-proBNP (N-Terminal Pro-B-Type Natriuretic Peptide) Following High-Risk Myocardial Infarction in the PARADISE-MI Trial. Circ. Heart Fail. 2023, 16, e010259.
  19. Smit, J.J.; Ottervanger, J.P.; Slingerland, R.J.; Kolkman, J.J.; Suryapranata, H.; Hoorntje, J.C.; Dambrink, J.H.; Gosselink, A.T.; de Boer, M.J.; Zijlstra, F.; et al. Comparison of usefulness of C-reactive protein versus white blood cell count to predict outcome after primary percutaneous coronary intervention for ST elevation myocardial infarction. Am. J. Cardiol. 2008, 101, 446–451.
  20. Vanhaverbeke, M.; Veltman, D.; Pattyn, N.; De Crem, N.; Gillijns, H.; Cornelissen, V.; Janssens, S.; Sinnaeve, P.R. C-reactive protein during and after myocardial infarction in relation to cardiac injury and left ventricular function at follow-up. Clin. Cardiol. 2018, 41, 1201–1206.
  21. Ferrari, J.P.; Lueneberg, M.E.; da Silva, R.L.; Fattah, T.; Gottschall, C.A.M.; Moreira, D.M. Correlation between leukocyte count and infarct size in ST segment elevation myocardial infarction. Arch. Med. Sci. Atheroscler. Dis. 2016, 1, e44–e48.
  22. Lin, Q.; Zhao, W.; Zhang, H.; Chen, W.; Lian, S.; Ruan, Q.; Qu, Z.; Lin, Y.; Chai, D.; Lin, X. Predicting the risk of heart failure after acute myocardial infarction using an interpretable machine learning model. Front. Cardiovasc. Med. 2025, 12, 1444323.
  23. Jeon, K.H.; Lee, H.S.; Kang, S.; Jang, J.H.; Jo, Y.Y.; Son, J.M.; Lee, M.S.; Kwon, J.M.; Kwun, J.S.; Cho, H.W.; et al. AI-enabled ECG index for predicting left ventricular dysfunction in patients with ST-segment elevation myocardial infarction. Sci. Rep. 2024, 14, 16575.
  24. Xin, A.; Li, K.; Yan, L.L.; Chandramouli, C.; Hu, R.; Jin, X.; Li, P.; Chen, M.; Qian, G.; Chen, Y. Machine learning-based prediction of infarct size in patients with ST-segment elevation myocardial infarction: A multi-center study. Int. J. Cardiol. 2023, 375, 131–141.
  25. Sritharan, H.P.; Nguyen, H.; Ciofani, J.; Bhindi, R.; Allahwala, U.K. Machine-learning based risk prediction of in-hospital outcomes following STEMI: The STEMI-ML score. Front. Cardiovasc. Med. 2024, 11, 1454321.
  26. Lang, R.M.; Badano, L.P.; Mor-Avi, V.; Afilalo, J.; Armstrong, A.; Ernande, L.; Flachskampf, F.A.; Foster, E.; Goldstein, S.A.; Kuznetsova, T.; et al. Recommendations for cardiac chamber quantification by echocardiography in adults: An update from the American Society of Echocardiography and the European Association of Cardiovascular Imaging. J. Am. Soc. Echocardiogr. 2015, 28, 1–39.e14.
  27. McDonagh, T.A.; Metra, M.; Adamo, M.; Gardner, R.S.; Baumbach, A.; Bohm, M.; Burri, H.; Butler, J.; Celutkiene, J.; Chioncel, O.; et al. 2023 Focused Update of the 2021 ESC Guidelines for the diagnosis and treatment of acute and chronic heart failure. Eur. Heart J. 2023, 44, 3627–3639.
  28. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
  29. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
  30. Stone, M. Cross-validatory choice and assessment of statistical predictions. J. R. Stat. Soc. Ser. B (Methodol.) 1974, 36, 111–133.
  31. Quinlan, J.R. Induction of decision trees. Mach. Learn. 1986, 1, 81–106.
  32. Harris, C.R.; Millman, K.J.; Van Der Walt, S.J.; Gommers, R.; Virtanen, P.; Cournapeau, D.; Wieser, E.; Taylor, J.; Berg, S.; Smith, N.J. Array programming with NumPy. Nature 2020, 585, 357–362.
  33. McKinney, W. pandas: A foundational Python library for data analysis and statistics. Python High Perform. Sci. Comput. 2011, 14, 1–9.
  34. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
  35. Hunter, J.D. Matplotlib: A 2D graphics environment. Comput. Sci. Eng. 2007, 9, 90–95.
  36. Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 2017, 30, 1–10.
  37. McKinney, W. Data structures for statistical computing in Python. In Proceedings of the 9th Python in Science Conference, Austin, TX, USA, 28 June–3 July 2010; Volume 445, pp. 51–56.
  38. Virtanen, P.; Gommers, R.; Oliphant, T.E.; Haberland, M.; Reddy, T.; Cournapeau, D.; Burovski, E.; Peterson, P.; Weckesser, W.; Bright, J. SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nat. Methods 2020, 17, 261–272.
  39. Lazzeri, C.; Valente, S.; Chiostri, M.; Gensini, G.F. Clinical significance of lactate in acute cardiac patients. World J. Cardiol. 2015, 7, 483–489.
  40. Park, I.H.; Cho, H.K.; Oh, J.H.; Chun, W.J.; Park, Y.H.; Lee, M.; Kim, M.S.; Choi, K.H.; Kim, J.; Song, Y.B.; et al. Clinical Significance of Serum Lactate in Acute Myocardial Infarction: A Cardiac Magnetic Resonance Imaging Study. J. Clin. Med. 2021, 10, 5278.
  41. Wu, Y.; Huang, N.; Sun, T.; Zhang, B.; Zhang, S.; Zhang, P.; Zhang, C. Association between normalized lactate load and in-hospital mortality in patients with acute myocardial infarction. Int. J. Cardiol. 2024, 399, 131658.
  42. Fuernau, G.; Desch, S.; de Waha-Thiele, S.; Eitel, I.; Neumann, F.J.; Hennersdorf, M.; Felix, S.B.; Fach, A.; Bohm, M.; Poss, J.; et al. Arterial Lactate in Cardiogenic Shock: Prognostic Value of Clearance Versus Single Values. JACC Cardiovasc. Interv. 2020, 13, 2208–2216.
  43. Marbach, J.A.; Di Santo, P.; Kapur, N.K.; Thayer, K.L.; Simard, T.; Jung, R.G.; Parlow, S.; Abdel-Razek, O.; Fernando, S.M.; Labinaz, M.; et al. Lactate Clearance as a Surrogate for Mortality in Cardiogenic Shock: Insights From the DOREMI Trial. J. Am. Heart Assoc. 2022, 11, e023322.
  44. Grinsztajn, L.; Oyallon, E.; Varoquaux, G. Why do tree-based models still outperform deep learning on typical tabular data? Adv. Neural Inf. Process. Syst. 2022, 35, 507–520.
  45. Shwartz-Ziv, R.; Armon, A. Tabular data: Deep learning is not all you need. Inf. Fusion 2022, 81, 84–90.
  46. Parvathi, R.; Pavithra, S.; Pattabiraman, V. Deep Learning Approach on Multimodal Data for Myocardial Infarction Prediction. In Proceedings of the 2024 International Conference on Computational Intelligence and Network Systems (CINS), Dubai, United Arab Emirates, 28–29 November 2024; pp. 1–8.
  47. Soto, J.T.; Weston Hughes, J.; Sanchez, P.A.; Perez, M.; Ouyang, D.; Ashley, E.A. Multimodal deep learning enhances diagnostic precision in left ventricular hypertrophy. Eur. Heart J. Digit. Health 2022, 3, 380–389.
  48. Tokodi, M.; Magyar, B.; Soos, A.; Takeuchi, M.; Tolvaj, M.; Lakatos, B.K.; Kitano, T.; Nabeshima, Y.; Fabian, A.; Szigeti, M.B.; et al. Deep Learning-Based Prediction of Right Ventricular Ejection Fraction Using 2D Echocardiograms. JACC Cardiovasc. Imaging 2023, 16, 1005–1018.
  49. Devkota, A.; Prajapati, R.; El-Wakeel, A.; Adjeroh, D.; Patel, B.; Gyawali, P. AI analysis for ejection fraction estimation from 12-lead ECG. Sci. Rep. 2025, 15, 13502.
  50. Yang, X.Y.; Li, Y.M.; Wang, J.Y.; Jia, Y.H.; Yi, Z.; Chen, M. Utilizing multimodal artificial intelligence to advance cardiovascular diseases. Precis. Clin. Med. 2025, 8, pbaf016.
  51. Khera, R.; Oikonomou, E.K.; Nadkarni, G.N.; Morley, J.R.; Wiens, J.; Butte, A.J.; Topol, E.J. Transforming Cardiovascular Care With Artificial Intelligence: From Discovery to Practice: JACC State-of-the-Art Review. J. Am. Coll. Cardiol. 2024, 84, 97–114.
  52. Cersosimo, A.; Zito, E.; Pierucci, N.; Matteucci, A.; La Fazia, V.M. A Talk with ChatGPT: The Role of Artificial Intelligence in Shaping the Future of Cardiology and Electrophysiology. J. Pers. Med. 2025, 15, 205.
  53. Cole, G.D.; Dhutia, N.M.; Shun-Shin, M.J.; Willson, K.; Harrison, J.; Raphael, C.E.; Zolgharni, M.; Mayet, J.; Francis, D.P. Defining the real-world reproducibility of visual grading of left ventricular function and visual estimation of left ventricular ejection fraction: Impact of image quality, experience and accreditation. Int. J. Cardiovasc. Imaging 2015, 31, 1303–1314.
  54. Akil, S.; Castaings, J.; Thind, P.; Ahlfeldt, T.; Akhtar, M.; Gonon, A.T.; Quintana, M.; Bouma, K. Impact of experience on visual and Simpson’s biplane echocardiographic assessment of left ventricular ejection fraction. Clin. Physiol. Funct. Imaging 2025, 45, e12918.
  55. Thavendiranathan, P.; Popovic, Z.B.; Flamm, S.D.; Dahiya, A.; Grimm, R.A.; Marwick, T.H. Improved interobserver variability and accuracy of echocardiographic visual left ventricular ejection fraction assessment through a self-directed learning program using cardiac magnetic resonance images. J. Am. Soc. Echocardiogr. 2013, 26, 1267–1273.
Figure 1. Patient and data selection flow chart. Depicted are patient cohorts and the number of variables on all selection levels. Major exclusion criteria for each step are highlighted in the boxes on the right. LMU, Ludwig-Maximilians-University; STEMI, ST-segment elevation myocardial infarction; PCI, percutaneous coronary intervention.
Figure 2. Model performance for LVEF prediction. Comparison of Decision Tree, Random Forest, and XGBoost models for the prediction of the exact LVEF value for the full cohort of patients as well as in relation to the presence or absence of a prior percutaneous coronary intervention (PCI) or based on laboratory values only. The models are evaluated by (A) the R2 (the higher the better) and (B) the MSE (the lower the better). The performance is shown with the confidence intervals in red.
Figure 2. Model performance for LVEF prediction. Comparison of Decision Tree, Random Forest, and XGBoost models for the prediction of the exact LVEF function for the full cohort of patients as well as in relation to the presence or absence of a prior percutaneous coronary intervention (PCI) or based on laboratory values only. The models are evaluated by (A) the R2 (the higher the better) and (B) the MSE (the lower the better). The performance is shown with the confidence intervals in red.
Jcm 14 08563 g002
Figure 3. Model performance for lactate prediction. Comparison of Decision Tree, Random Forest, and XGBoost models for the prediction of log lactate values across all patients (overall) and in relation to the presence or absence of a prior percutaneous coronary intervention (PCI). The first row contains the prediction performance for the log admission lactate with (A) R² and (B) MSE. The second row shows the prediction performance for the log peak lactate with (C) R² and (D) MSE.
Figure 4. Feature importance for LVEF prediction. Depicted are variables with the highest feature importance values for (A) the overall cohort, in relation to (B) the presence or (C) absence of a prior percutaneous coronary intervention (PCI) and (D) based on laboratory values only. CL, culprit lesion; LAD, left anterior descending coronary artery; ECMO, extracorporeal membrane oxygenation; LDH, lactate dehydrogenase; adm, admission; CK, creatine kinase; MB, myocardial band; max, maximum; Lac, lactate; Crea, creatinine; RCA, right coronary artery; TropT, troponin T; DAP, dose area product; BMI, body mass index.
Figure 5. SHAP beeswarm of XGBoost regression for LVEF prediction. Models were trained on (A) the full cohort as well as in relation to (B) the presence or (C) absence of a prior percutaneous coronary intervention (PCI) and (D) based on laboratory values only. CL, culprit lesion; LAD, left anterior descending; LDH, lactate dehydrogenase; adm, admission; max, maximum; CK, creatine kinase; MB, myocardial band; Crea, creatinine; Lac, lactate; ECMO, extracorporeal membrane oxygenation; RCA, right coronary artery; TropT, troponin T; DAP, dose area product; BMI, body mass index.
Table 1. Baseline characteristics. Values are shown for the full patient cohort as well as in relation to the presence or absence of a prior percutaneous coronary intervention (PCI). Data are presented as mean and standard deviation (SD) or N (%) as indicated. Bold p values with an asterisk indicate statistical significance. BMI, body mass index; LCA, left main coronary artery; LAD, left anterior descending coronary artery; LCX, left circumflex coronary artery; RCA, right coronary artery; CAV, contrast agent volume; DAP, dose area product; CPR, cardiopulmonary resuscitation; ECMO, extracorporeal membrane oxygenation; IABP, intra-aortic balloon pump; LVEF, left ventricular ejection fraction; CK, creatine kinase; Trop, troponin; Crea, creatinine; adm, admission; max, maximum; MB, myocardial band; LDH, lactate dehydrogenase.
| Variable | Total (N = 1608) | Prior PCI (N = 325) | No Prior PCI (N = 1283) | p Value | Corrected p Value |
|---|---|---|---|---|---|
| Demographics | | | | | |
| Age [years] | 64.8 (±13.5) | 68.8 (±12.9) | 63.8 (±13.4) | 0.0000 * | 0.0000 * |
| BMI [kg/m²] | 27.2 (±4.4) | 27.7 (±4.6) | 27.1 (±4.4) | 0.0123 * | 0.0267 * |
| Sex (female) [N (%)] | 404 (25.1%) | 78 (24.0%) | 326 (25.4%) | 0.6008 | 0.6759 |
| Coronary perfusion type | | | | | |
| Right [N (%)] | 1401 (87.1%) | 289 (88.9%) | 1112 (86.7%) | 0.2790 | 0.3720 |
| Left [N (%)] | 123 (7.7%) | 20 (6.2%) | 103 (8.0%) | 0.2562 | 0.3720 |
| Balanced [N (%)] | 84 (5.2%) | 16 (4.9%) | 68 (5.3%) | 0.7850 | 0.8312 |
| Coronary artery disease | | | | | |
| 1 vessel [N (%)] | 497 (30.9%) | 42 (12.9%) | 455 (35.5%) | 0.0000 * | 0.0000 * |
| 2 vessels [N (%)] | 447 (27.8%) | 75 (23.1%) | 372 (29.0%) | 0.0334 * | 0.0668 |
| 3 vessels [N (%)] | 664 (41.3%) | 208 (64.0%) | 456 (35.5%) | 0.0000 * | 0.0000 * |
| Culprit lesion | | | | | |
| LCA [N (%)] | 21 (1.3%) | 7 (2.2%) | 14 (1.1%) | 0.1317 | 0.2163 |
| LAD [N (%)] | 818 (50.9%) | 159 (48.9%) | 659 (51.4%) | 0.4317 | 0.4317 |
| LCX [N (%)] | 176 (11.0%) | 36 (11.1%) | 140 (10.9%) | 0.9322 | 0.9322 |
| RCA [N (%)] | 592 (36.8%) | 122 (37.5%) | 470 (36.6%) | 0.7624 | 0.8312 |
| Procedural data | | | | | |
| CAV [ml] | 206.97 (±98.37) | 188.46 (±94.29) | 211.66 (±98.86) | 0.0002 * | 0.0006 * |
| Radiation time [min] | 14.09 (±11.69) | 14.59 (±12.98) | 13.97 (±11.35) | 0.5405 | 0.6277 |
| DAP [cGy/cm²] | 4306 (±4111) | 4210 (±3621) | 4331 (±4227) | 0.3971 | 0.5106 |
| Tirofiban [N (%)] | 160 (10.0%) | 54 (16.6%) | 106 (8.3%) | 0.0000 * | 0.0000 * |
| Hemodynamics and shock | | | | | |
| Shock [N (%)] | 264 (16.4%) | 73 (22.5%) | 191 (14.9%) | 0.0010 * | 0.0026 * |
| CPR [N (%)] | 222 (13.8%) | 51 (15.7%) | 171 (13.3%) | 0.2698 | 0.3720 |
| ECMO [N (%)] | 76 (4.7%) | 27 (8.3%) | 49 (3.8%) | 0.0007 * | 0.0019 * |
| Impella [N (%)] | 15 (0.9%) | 1 (0.3%) | 14 (1.1%) | 0.1894 | 0.2965 |
| ECMO + Impella [N (%)] | 4 (0.3%) | 1 (0.3%) | 3 (0.2%) | 0.8113 | 0.8345 |
| IABP [N (%)] | 5 (0.3%) | 0 (0.0%) | 5 (0.4%) | 0.2597 | 0.3720 |
| LVEF discharge [%] | 49 (±11) | 47 (±12) | 50 (±10) | 0.0000 * | 0.0000 * |
| Exitus [N (%)] | 106 (6.59%) | 32 (13.68%) | 74 (5.39%) | 0.0000 * | 0.0000 * |
| Laboratory values | | | | | |
| CK adm [U/L] | 874.7 (±1501.6) | 818.2 (±1685.2) | 889.0 (±1451.8) | 0.0001 * | 0.0003 * |
| CK max [U/L] | 2348.2 (±3726.7) | 2296.4 (±4020.9) | 2361.3 (±3649.9) | 0.0126 * | 0.0267 * |
| CK-MB adm [U/L] | 102.1 (±162.9) | 86.1 (±172.2) | 106.2 (±160.3) | 0.0000 * | 0.0000 * |
| CK-MB max [U/L] | 234.3 (±251.9) | 206.9 (±238.6) | 241.2 (±254.8) | 0.0011 * | 0.0026 * |
| Trop T adm [ng/mL] | 3.41 (±10.35) | 3.08 (±10.64) | 3.50 (±10.27) | 0.0001 * | 0.0003 * |
| Trop T max [ng/mL] | 9.09 (±15.02) | 9.25 (±16.15) | 9.04 (±14.73) | 0.0589 | 0.1116 |
| Crea adm [mg/dL] | 1.13 (±0.62) | 1.36 (±1.00) | 1.07 (±0.46) | 0.0000 * | 0.0000 * |
| Crea max [mg/dL] | 1.35 (±0.90) | 1.65 (±1.32) | 1.27 (±0.73) | 0.0000 * | 0.0000 * |
| LDH adm [U/L] | 421.43 (±653.40) | 456.82 (±760.56) | 412.47 (±623.36) | 0.1127 | 0.2029 |
| Lactate adm [mmol/L] | 2.48 (±2.47) | 2.65 (±2.54) | 2.44 (±2.45) | 0.5278 | 0.6277 |
| Lactate max [mmol/L] | 3.49 (±3.80) | 3.96 (±4.10) | 3.37 (±3.71) | 0.1322 | 0.2163 |
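The caption of Table 1 does not say how the "Corrected p Value" column was obtained. As an illustration only, the sketch below applies a Benjamini-Hochberg false discovery rate adjustment to the 36 raw p values of Table 1 (entered in table order, with "0.0000" treated as 0.0). Most rows of the table are numerically consistent with this procedure, but that is an inference from the numbers, not a method stated here; the function name `benjamini_hochberg` is hypothetical.

```python
import numpy as np

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                       # indices that sort p ascending
    scaled = p[order] * m / np.arange(1, m + 1) # p_(k) * m / k
    # Enforce monotonicity, starting from the largest p value downwards.
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(adjusted_sorted, 0.0, 1.0)
    return adjusted

# Raw p values of Table 1 in row order (assumption: "0.0000" entered as 0.0).
raw_p = [
    0.0000, 0.0123, 0.6008,                  # demographics
    0.2790, 0.2562, 0.7850,                  # coronary perfusion type
    0.0000, 0.0334, 0.0000,                  # coronary artery disease
    0.1317, 0.4317, 0.9322, 0.7624,          # culprit lesion
    0.0002, 0.5405, 0.3971, 0.0000,          # procedural data
    0.0010, 0.2698, 0.0007, 0.1894,          # hemodynamics and shock
    0.8113, 0.2597, 0.0000, 0.0000,
    0.0001, 0.0126, 0.0000, 0.0011, 0.0001,  # laboratory values
    0.0589, 0.0000, 0.0000, 0.1127, 0.5278, 0.1322,
]

adjusted_p = benjamini_hochberg(raw_p)
bmi_adjusted = adjusted_p[1]         # BMI row
two_vessel_adjusted = adjusted_p[7]  # 2-vessel disease row
print(round(bmi_adjusted, 4), round(two_vessel_adjusted, 4))
```

Under this adjustment a raw p value below 0.05 can lose significance, as in the 2-vessel row of Table 1, because each p value is scaled by the number of tests divided by its rank.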
Table 2. Regression analysis for LVEF prediction. Results of the regression analysis for the prediction of the LVEF for the full cohort as well as stratified for a prior intervention and exclusively based on laboratory values as covariates. The results are compared based on mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), coefficient of determination (R²), explained variance score (EVS), and mean absolute percentage error (MAPE). The best results per subgroup are bolded. Lower values for MSE, RMSE, MAE, and MAPE indicate better performance. Vice versa, higher values for R² and EVS indicate better performance. PCI, percutaneous coronary intervention; OLS, Ordinary Least Squares Regression; L1, Lasso Regularization; L2, Ridge Regularization; DT, Decision Tree; RF, Random Forest; XG, XGBoost.
| Subgroup | Model | MSE | RMSE | MAE | R² | EVS | MAPE |
|---|---|---|---|---|---|---|---|
| Full Cohort | OLS | 0.0078 (0.0064, 0.0090) | 0.0881 (0.0803, 0.0950) | 0.0686 (0.0619, 0.0950) | 0.3202 (0.1821, 0.4354) | 0.3202 (0.1915, 0.4394) | 17.92% (15.89%, 20.02%) |
| Full Cohort | OLS + L1 | 0.0077 (0.0065, 0.0089) | 0.0877 (0.0808, 0.0945) | 0.0686 (0.0629, 0.0747) | 0.3263 (0.1977, 0.4226) | 0.3310 (0.2085, 0.4243) | 17.08% (14.99%, 19.48%) |
| Full Cohort | OLS + L2 | 0.0076 (0.0065, 0.0090) | 0.0873 (0.0809, 0.094) | 0.0686 (0.0633, 0.0751) | 0.3324 (0.2033, 0.4371) | 0.3379 (0.2145, 0.4406) | 17.02% (15.11%, 19.44%) |
| Full Cohort | DT | 0.0095 (0.0078, 0.0113) | 0.0972 (0.0882, 0.1064) | 0.0754 (0.0688, 0.0826) | 0.1721 (−0.0224, 0.3213) | 0.1837 (0.0063, 0.3289) | 17.72% (15.27%, 20.51%) |
| Full Cohort | RF | 0.0076 (0.0063, 0.0090) | 0.0873 (0.0795, 0.0950) | 0.0673 (0.0614, 0.0734) | 0.3326 (0.2025, 0.4286) | 0.3402 (0.2171, 0.4368) | 16.11% (13.64%, 18.96%) |
| Full Cohort | XG | 0.0075 (0.0063, 0.0089) | 0.0864 (0.0793, 0.0942) | 0.0677 (0.0620, 0.0740) | 0.3461 (0.2289, 0.4346) | 0.3542 (0.2427, 0.4414) | 16.11% (13.89%, 18.81%) |
| Prior PCI | OLS | 0.0144 (0.0096, 0.0173) | 0.1202 (0.0980, 0.1315) | 0.0953 (0.0779, 0.1057) | 0.1870 (−0.1179, 0.3925) | 0.1896 (−0.0894, 0.3989) | 24.45% (18.24%, 30.43%) |
| Prior PCI | OLS + L1 | 0.0121 (0.0081, 0.0161) | 0.1101 (0.0900, 0.1269) | 0.0875 (0.0716, 0.1008) | 0.2206 (−0.0787, 0.4049) | 0.2213 (−0.0702, 0.4175) | 26.09% (19.78%, 33.43%) |
| Prior PCI | OLS + L2 | 0.0121 (0.0081, 0.0160) | 0.1098 (0.0900, 0.1265) | 0.0878 (0.0719, 0.1011) | 0.2097 (−0.0839, 0.3823) | 0.2116 (−0.0612, 0.3988) | 26.70% (20.49%, 34.19%) |
| Prior PCI | DT | 0.0137 (0.0093, 0.0193) | 0.1169 (0.0963, 0.1390) | 0.0923 (0.0767, 0.1102) | 0.1028 (−0.3303, 0.3722) | 0.1028 (−0.2943, 0.3828) | 24.43% (18.69%, 30.68%) |
| Prior PCI | RF | 0.0122 (0.0081, 0.0165) | 0.1106 (0.0901, 0.1283) | 0.0882 (0.0723, 0.1043) | 0.1969 (−0.1805, 0.4322) | 0.1971 (−0.1578, 0.4387) | 22.81% (17.53%, 28.60%) |
| Prior PCI | XG | 0.0115 (0.0080, 0.0157) | 0.1073 (0.0896, 0.1253) | 0.0850 (0.0698, 0.1015) | 0.2442 (−0.0385, 0.4156) | 0.2443 (−0.0135, 0.4229) | 22.85% (16.82%, 29.78%) |
| No Prior PCI | OLS | 0.0073 (0.0058, 0.0087) | 0.0855 (0.0762, 0.0933) | 0.0675 (0.0605, 0.0741) | 0.3351 (0.2134, 0.4461) | 0.3399 (0.2189, 0.4502) | 16.27% (14.00%, 18.89%) |
| No Prior PCI | OLS + L1 | 0.0073 (0.0061, 0.0085) | 0.0854 (0.0781, 0.0922) | 0.0679 (0.0621, 0.07327) | 0.3390 (0.2203, 0.4391) | 0.3435 (0.2245, 0.4408) | 16.24% (14.23%, 17.42%) |
| No Prior PCI | OLS + L2 | 0.0072 (0.0058, 0.0086) | 0.0849 (0.0762, 0.0927) | 0.0673 (0.0604, 0.0735) | 0.3421 (0.2307, 0.4365) | 0.3456 (0.2379, 0.4365) | 16.46% (13.42%, 17.78%) |
| No Prior PCI | DT | 0.0092 (0.0074, 0.0114) | 0.0961 (0.0862, 0.1068) | 0.0756 (0.0684, 0.0832) | 0.1766 (0.0270, 0.2993) | 0.1795 (0.0286, 0.3001) | 18.17% (15.67%, 20.89%) |
| No Prior PCI | RF | 0.0072 (0.0058, 0.0086) | 0.0846 (0.0761, 0.0930) | 0.0664 (0.0604, 0.0727) | 0.3622 (0.2470, 0.4678) | 0.3627 (0.2498, 0.4707) | 15.64% (13.61%, 17.95%) |
| No Prior PCI | XG | 0.0073 (0.0060, 0.0089) | 0.0856 (0.0775, 0.0943) | 0.0667 (0.0606, 0.0730) | 0.3464 (0.2326, 0.4349) | 0.3474 (0.2349, 0.4404) | 15.79% (13.72%, 18.15%) |
| Laboratory Values Only | OLS | 0.0088 (0.0076, 0.0101) | 0.0938 (0.0873, 0.1003) | 0.0760 (0.0707, 0.0819) | 0.3328 (0.2322, 0.4105) | 0.03345 (0.2402, 0.4139) | 18.58% (16.34%, 20.98%) |
| Laboratory Values Only | OLS + L1 | 0.0088 (0.0076, 0.0102) | 0.0939 (0.0871, 0.1010) | 0.0762 (0.0705, 0.0825) | 0.3313 (0.2359, 0.4163) | 0.3330 (0.2398, 0.4168) | 18.65% (16.30%, 21.30%) |
| Laboratory Values Only | OLS + L2 | 0.0088 (0.0076, 0.0100) | 0.0938 (0.0869, 0.1001) | 0.0761 (0.0706, 0.0818) | 0.3328 (0.2312, 0.4135) | 0.3346 (0.2387, 0.4181) | 18.72% (16.21%, 21.09%) |
| Laboratory Values Only | DT | 0.0099 (0.0084, 0.0115) | 0.0997 (0.0919, 0.1074) | 0.0797 (0.0738, 0.0861) | 0.2464 (0.1127, 0.3464) | 0.2470 (0.1154, 0.3477) | 19.68% (17.37%, 22.23%) |
| Laboratory Values Only | RF | 0.0084 (0.0072, 0.0099) | 0.0918 (0.0850, 0.0994) | 0.0741 (0.0686, 0.0803) | 0.3604 (0.2564, 0.4492) | 0.3618 (0.2624, 0.4496) | 18.01% (15.83%, 20.43%) |
| Laboratory Values Only | XG | 0.0085 (0.0071, 0.0097) | 0.0919 (0.0844, 0.0987) | 0.0738 (0.0680, 0.0797) | 0.3588 (0.2559, 0.4362) | 0.3594 (0.2622, 0.4387) | 18.23% (15.74%, 20.51%) |
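For readers who want to reproduce the six metrics of Table 2 on their own predictions, the following is a minimal NumPy sketch. It assumes LVEF is expressed as a fraction between 0 and 1, which the error magnitudes in the table suggest (an RMSE of about 0.086 would then correspond to roughly 8.6 percentage points). How the intervals in parentheses were computed is not stated in this section; the percentile bootstrap shown here is one common choice and is an assumption. The function names `regression_metrics` and `bootstrap_ci` and the synthetic data are hypothetical.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MSE, RMSE, MAE, R2, EVS and MAPE (in %) for one set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(np.abs(err)),
        "R2": 1.0 - np.sum(err ** 2) / ss_tot,
        # EVS ignores a constant offset of the errors, R2 does not.
        "EVS": 1.0 - np.var(err) / np.var(y_true),
        "MAPE": 100.0 * np.mean(np.abs(err / y_true)),
    }

def bootstrap_ci(y_true, y_pred, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for one metric (assumed CI method)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rng = np.random.default_rng(seed)
    n = y_true.size
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample patients with replacement
        stats[b] = regression_metrics(y_true[idx], y_pred[idx])[metric]
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return lo, hi

# Small hand-checkable example (made-up LVEF fractions).
toy = regression_metrics([0.55, 0.40, 0.60, 0.30, 0.50],
                         [0.50, 0.45, 0.55, 0.40, 0.50])

# Synthetic data only; not the study cohort.
rng = np.random.default_rng(42)
lvef_true = rng.uniform(0.2, 0.7, size=200)
lvef_pred = lvef_true + rng.normal(0.0, 0.08, size=200)
point = regression_metrics(lvef_true, lvef_pred)
r2_lo, r2_hi = bootstrap_ci(lvef_true, lvef_pred, "R2")
print(point["R2"], (r2_lo, r2_hi))
```

Because the variance of the errors can never exceed their mean square, EVS is always at least as large as R², which matches the pattern in every row of Table 2 where both values are plausible.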
Table 3. Classification analysis for LVEF prediction. Classification analysis for LVEF prediction for the full cohort as well as in relation to the presence or absence of a prior percutaneous coronary intervention (PCI), as well as based on laboratory values only. The results are compared based on the area under the curve (AUC), precision–recall AUC (PR-AUC), and the binary F1 score. Further, we display the binary classification results for four different LVEF thresholds (τ) in %. The best results per subgroup are bolded. LR, Logistic Regression; DT, Decision Tree; RF, Random Forest; XG, XGBoost.
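| Subgroup | Model | AUC (τ = 30) | PR-AUC (τ = 30) | F1 (τ = 30) | AUC (τ = 35) | PR-AUC (τ = 35) | F1 (τ = 35) | AUC (τ = 40) | PR-AUC (τ = 40) | F1 (τ = 40) | AUC (τ = 50) | PR-AUC (τ = 50) | F1 (τ = 50) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Full Cohort | LR | 0.53 | 0.93 | 0.95 | 0.48 | 0.88 | 0.94 | 0.82 | 0.93 | 0.90 | 0.76 | 0.67 | 0.62 |
| Full Cohort | DT | 0.53 | 0.92 | 0.89 | 0.50 | 0.88 | 0.92 | 0.75 | 0.90 | 0.88 | 0.64 | 0.46 | 0.61 |
| Full Cohort | RF | 0.53 | 0.92 | 0.95 | 0.46 | 0.87 | 0.94 | 0.80 | 0.92 | 0.89 | 0.75 | 0.66 | 0.60 |
| Full Cohort | XG | 0.52 | 0.92 | 0.95 | 0.49 | 0.87 | 0.92 | 0.80 | 0.92 | 0.89 | 0.75 | 0.67 | 0.60 |
| Prior PCI | LR | 0.57 | 0.90 | 0.71 | 0.68 | 0.91 | 0.88 | 0.78 | 0.89 | 0.86 | 0.67 | 0.44 | 0.50 |
| Prior PCI | DT | 0.62 | 0.92 | 0.90 | 0.63 | 0.91 | 0.84 | 0.63 | 0.84 | 0.74 | 0.65 | 0.43 | 0.49 |
| Prior PCI | RF | 0.62 | 0.91 | 0.93 | 0.62 | 0.87 | 0.89 | 0.76 | 0.83 | 0.88 | 0.65 | 0.44 | 0.40 |
| Prior PCI | XG | 0.66 | 0.91 | 0.93 | 0.65 | 0.89 | 0.89 | 0.77 | 0.86 | 0.87 | 0.69 | 0.48 | 0.38 |
| No Prior PCI | LR | 0.44 | 0.92 | 0.96 | 0.37 | 0.87 | 0.94 | 0.88 | 0.96 | 0.91 | 0.75 | 0.69 | 0.65 |
| No Prior PCI | DT | 0.50 | 0.96 | 0.93 | 0.52 | 0.94 | 0.89 | 0.71 | 0.92 | 0.84 | 0.69 | 0.60 | 0.63 |
| No Prior PCI | RF | 0.52 | 0.92 | 0.96 | 0.54 | 0.90 | 0.94 | 0.84 | 0.94 | 0.91 | 0.76 | 0.69 | 0.66 |
| No Prior PCI | XG | 0.41 | 0.90 | 0.96 | 0.52 | 0.91 | 0.89 | 0.89 | 0.95 | 0.89 | 0.75 | 0.68 | 0.61 |
| Laboratory Values Only | LR | 0.77 | 0.96 | 0.95 | 0.38 | 0.82 | 0.93 | 0.78 | 0.88 | 0.89 | 0.67 | 0.53 | 0.53 |
| Laboratory Values Only | DT | 0.65 | 0.85 | 0.94 | 0.51 | 0.89 | 0.90 | 0.78 | 0.89 | 0.84 | 0.65 | 0.51 | 0.57 |
| Laboratory Values Only | RF | 0.76 | 0.97 | 0.95 | 0.50 | 0.86 | 0.86 | 0.80 | 0.91 | 0.89 | 0.66 | 0.54 | 0.55 |
| Laboratory Values Only | XG | 0.74 | 0.96 | 0.95 | 0.48 | 0.86 | 0.86 | 0.80 | 0.91 | 0.88 | 0.69 | 0.55 | 0.50 |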
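The thresholded evaluation in Table 3 can be illustrated with a short sketch: continuous LVEF values are turned into binary labels at a cut-off τ, and AUC and F1 are computed on those labels. Two points are assumptions made only for this example. First, the positive class is taken to be LVEF ≥ τ, since the caption does not say which side of the threshold is positive. Second, a predicted LVEF is used directly as the ranking score, whereas Table 3 reports dedicated classifiers. All numbers are made up, and the function names are hypothetical.

```python
import numpy as np

def binarize_lvef(lvef_percent, tau):
    """Label 1 if LVEF >= tau (assumed positive class), else 0."""
    return (np.asarray(lvef_percent, dtype=float) >= tau).astype(int)

def f1_binary(y_true, y_pred):
    """F1 score of the positive class from hard 0/1 predictions."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (pos.size * neg.size)

# Made-up observed and predicted LVEF values in percent.
lvef_observed = [25, 33, 38, 45, 52, 60]
lvef_predicted = [35, 30, 42, 41, 55, 58]

tau = 40
labels = binarize_lvef(lvef_observed, tau)
hard_predictions = binarize_lvef(lvef_predicted, tau)

f1_at_40 = f1_binary(labels, hard_predictions)
auc_at_40 = roc_auc(labels, lvef_predicted)
print(f1_at_40, auc_at_40)
```

With a low threshold such as τ = 30, almost all patients fall on one side of the cut-off. Under the assumed labelling this inflates F1 and PR-AUC even when the AUC is close to 0.5, a pattern that is visible in the τ = 30 and τ = 35 columns of Table 3.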
Table 4. Validation of LVEF prediction. Validation of the regression analysis for the prediction of LVEF for the full cohort, as well as in relation to the presence or absence of a prior PCI and based on laboratory values only as covariates. The results are compared based on mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), coefficient of determination (R²), explained variance score (EVS), and mean absolute percentage error (MAPE).
| Subgroup | Method | MSE | RMSE | MAE | R² | EVS | MAPE |
|---|---|---|---|---|---|---|---|
| Full Cohort | XGBoost (model development) | 0.0075 (0.0063, 0.0089) | 0.0864 (0.0793, 0.0942) | 0.0677 (0.0620, 0.0740) | 0.3461 (0.2289, 0.4346) | 0.3542 (0.2427, 0.4414) | 16.11% (13.89%, 18.81%) |
| Full Cohort | Validation cohort | 0.0080 | 0.0894 | 0.0791 | 0.3437 | 0.3660 | 18.48% |
| Prior PCI | XGBoost (model development) | 0.0115 (0.0080, 0.0157) | 0.1073 (0.0896, 0.1253) | 0.0850 (0.0698, 0.1015) | 0.2442 (−0.0385, 0.4156) | 0.2443 (−0.0135, 0.4229) | 22.85% (16.82%, 29.78%) |
| Prior PCI | Validation cohort | 0.0098 | 0.0988 | 0.0759 | −0.5626 | 0.3594 | 22.55% |
| No Prior PCI | Random Forest (model development) | 0.0072 (0.0058, 0.0086) | 0.0846 (0.0761, 0.0930) | 0.0664 (0.0604, 0.0727) | 0.3622 (0.2470, 0.4678) | 0.3627 (0.2498, 0.4707) | 15.64% (13.61%, 17.95%) |
| No Prior PCI | Validation cohort | 0.0090 | 0.0948 | 0.0757 | 0.2933 | 0.2956 | 18.62% |
| Laboratory Values Only | Random Forest (model development) | 0.0084 (0.0072, 0.0099) | 0.0918 (0.0850, 0.0994) | 0.0741 (0.0686, 0.0803) | 0.3604 (0.2564, 0.4492) | 0.3618 (0.2624, 0.4496) | 18.01% (15.83%, 20.43%) |
| Laboratory Values Only | Validation cohort | 0.0091 | 0.0955 | 0.0852 | 0.2522 | 0.2570 | 20.18% |
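In the Prior PCI validation row of Table 4, R² is negative (−0.5626) while EVS is positive (0.3594). The two scores differ only in how they treat a constant offset of the prediction errors: EVS removes it, R² penalizes it. The toy example below, with made-up LVEF fractions and hypothetical function names, shows how a systematic offset alone can produce a negative R² together with a high EVS. Whether such an offset explains the reported row is not stated in the source and is offered only as one possible reading.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def explained_variance(y_true, y_pred):
    """Explained variance score: 1 - Var(error) / Var(y_true)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 1.0 - np.var(y_true - y_pred) / np.var(y_true)

# Made-up LVEF fractions; every prediction is too high by the same amount.
observed = np.array([0.30, 0.35, 0.40, 0.45, 0.50])
offset_predictions = observed + 0.15

r2_offset = r2_score(observed, offset_predictions)
evs_offset = explained_variance(observed, offset_predictions)
print(r2_offset, evs_offset)
```

Here the predictions track the observed values perfectly apart from the offset, so EVS is essentially 1, while R² falls far below zero because the squared errors exceed the spread of the observed values.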
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zheng, S.-F.; Diegruber, K.; Esser, D.; Vieluf, S.; Stremmel, C. Machine Learning-Based Prediction of Early Left Ventricular Function After STEMI. J. Clin. Med. 2025, 14, 8563. https://doi.org/10.3390/jcm14238563