Natural Language Processing of Unstructured Healthcare Data for Predicting Heart Failure in Individuals with Type 2 Diabetes
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Design and Study Population
2.2. Study Variables and Data Extraction
2.3. Descriptive Analysis
2.4. Predictive Model Training and Validation
2.5. Ethical Considerations and Study Approval
3. Results
3.1. Study Population
3.2. Descriptive Analysis
3.3. Predictive Model Training and Validation
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References

| | Training Set | | | Validation Set | | |
|---|---|---|---|---|---|---|
| | Predictive iHF | Predictive Non-iHF | Overall | Predictive iHF | Predictive Non-iHF | Overall |
| | (N = 43,479) | (N = 277,029) | (N = 320,508) | (N = 3756) | (N = 29,107) | (N = 32,863) |
| Demographic characteristics | ||||||
| Age at index—years, median (Q1; Q3) | 68 (55; 80) | 54 (39; 68) | 56 (41; 70) | 76 (65; 84) | 67 (55; 79) | 68 (56; 79) |
| Female sex, n (%) | 21,638 (49.8) | 147,932 (53.4) | 169,570 (52.9) | 1670 (44.5) | 13,764 (47.3) | 15,434 (47.0) |
| Smoking (current/former), n (%) | 13,179 (30.3) | 61,295 (22.1) | 74,474 (23.2) | 977 (26.0) | 5837 (20.1) | 6814 (20.7) |
| Comorbidities and T2DM-related complications, n (%) 1 | ||||||
| Dyslipidemia | 24,267 (55.8) | 167,597 (60.5) | 191,864 (59.9) | 1964 (52.3) | 15,164 (52.1) | 17,128 (52.1) |
| Chronic kidney disease | 11,768 (27.1) | 34,779 (12.6) | 46,547 (14.5) | 1281 (34.1) | 6079 (20.9) | 7360 (22.4) |
| Ischemic heart disease | 10,384 (23.9) | 31,150 (11.2) | 41,534 (13.0) | 881 (23.5) | 3608 (12.4) | 4489 (13.7) |
| Peripheral vascular disease | 7463 (17.2) | 19,557 (7.1) | 27,020 (8.4) | 693 (18.5) | 3006 (10.3) | 3699 (11.3) |
| Atrial fibrillation | 6785 (15.6) | 15,111 (5.5) | 21,896 (6.8) | 741 (19.7) | 2766 (9.5) | 3507 (10.7) |
| Cerebrovascular disease | 4546 (10.5) | 10,572 (3.8) | 15,118 (4.7) | 341 (9.1) | 1755 (6.0) | 2096 (6.4) |
| Obesity | 4130 (9.5) | 16,557 (6.0) | 20,687 (6.5) | 487 (13.0) | 2892 (9.9) | 3379 (10.3) |
| Peripheral arterial disease | 1936 (4.5) | 4456 (1.6) | 6392 (2.0) | 226 (6.0) | 756 (2.6) | 982 (3.0) |
| Diabetic retinopathy | 1170 (2.7) | 5211 (1.9) | 6381 (2.0) | 203 (5.4) | 848 (2.9) | 1051 (3.2) |
| Diabetic neuropathy | 555 (1.3) | 2296 (0.8) | 2851 (0.9) | 43 (1.1) | 279 (1.0) | 322 (1.0) |
| Foot amputation | 143 (0.3) | 306 (0.1) | 449 (0.1) | 38 (1.0) | 136 (0.5) | 174 (0.5) |
| Pharmacological treatments, n (%) 2 | ||||||
| Any antihypertensive treatment | 20,807 (47.9) | 71,252 (25.7) | 92,059 (28.7) | 2122 (56.5) | 11,098 (38.1) | 13,220 (40.2) |
| ACE inhibitors | 8483 (19.5) | 31,419 (11.3) | 39,902 (12.4) | 730 (19.4) | 3942 (13.5) | 4672 (14.2) |
| Beta-blocking agents | 7816 (18.0) | 20,728 (7.5) | 28,544 (8.9) | 826 (22.0) | 3599 (12.4) | 4425 (13.5) |
| Angiotensin II receptor antagonists | 7219 (16.6) | 24,964 (9.0) | 32,183 (10.0) | 803 (21.4) | 4288 (14.7) | 5091 (15.5) |
| Loop diuretics | 5720 (13.2) | 9552 (3.4) | 15,272 (4.8) | 928 (24.7) | 3011 (10.3) | 3939 (12.0) |
| Calcium channel blockers | 5179 (11.9) | 14,681 (5.3) | 19,860 (6.2) | 415 (11.0) | 1810 (6.2) | 2225 (6.8) |
| Potassium-sparing agents | 1643 (3.8) | 3832 (1.4) | 5475 (1.7) | 208 (5.5) | 711 (2.4) | 919 (2.8) |
| Low-ceiling diuretics (thiazides) | 1634 (3.8) | 4395 (1.6) | 6029 (1.9) | 73 (1.9) | 429 (1.5) | 502 (1.5) |
| Low-ceiling diuretics (other) | 665 (1.5) | 3006 (1.1) | 3671 (1.1) | 85 (2.3) | 467 (1.6) | 552 (1.7) |
| Unspecified antihypertensives | 1918 (4.4) | 4663 (1.7) | 6581 (2.1) | 204 (5.4) | 772 (2.7) | 976 (3.0) |
| Statins | 12,079 (27.8) | 43,509 (15.7) | 55,588 (17.3) | 1076 (28.6) | 6329 (21.7) | 7405 (22.5) |
| Antiplatelet agents | 10,506 (24.2) | 30,990 (11.2) | 41,496 (12.9) | 995 (26.5) | 5452 (18.7) | 6447 (19.6) |
| Anticoagulants | 4307 (9.9) | 8686 (3.1) | 12,993 (4.1) | 623 (16.6) | 2040 (7.0) | 2663 (8.1) |
| Fibrates | 1141 (2.6) | 4973 (1.8) | 6114 (1.9) | 127 (3.4) | 743 (2.6) | 870 (2.6) |
| Ezetimibe | 622 (1.4) | 2408 (0.9) | 3030 (0.9) | 54 (1.4) | 328 (1.1) | 382 (1.2) |
| Bile-acid sequestrants | 94 (0.2) | 258 (0.1) | 352 (0.1) | 4 (0.1) | 43 (0.1) | 47 (0.1) |
| Clinical parameters 3 | ||||||
| Weight, kg | ||||||
| N (%) | 2939 (6.8) | 12,396 (4.5) | 15,335 (4.8) | 147 (3.9) | 896 (3.1) | 1043 (3.2) |
| Median (Q1; Q3) | 75 (62; 88) | 74 (63; 86.5) | 74 (62.8; 87.0) | 73.5 (61.9; 87.0) | 75 (63; 89) | 75 (63; 89) |
| Height, meters | ||||||
| N (%) | 2111 (4.9) | 8568 (3.1) | 10,679 (3.3) | 190 (5.1) | 1543 (5.3) | 1733 (5.3) |
| Median (Q1; Q3) | 1.6 (1.6; 1.7) | 1.6 (1.6; 1.7) | 1.6 (1.6; 1.7) | 1.6 (1.5; 1.7) | 1.6 (1.6; 1.7) | 1.6 (1.6; 1.7) |
| Body mass index, kg/m2 | ||||||
| N (%) | 1180 (2.7) | 4659 (1.7) | 5839 (1.8) | 121 (3.2) | 1074 (3.7) | 1195 (3.6) |
| Median (Q1; Q3) | 28.9 (24.0; 34.4) | 29.3 (24.9; 34.6) | 29.1 (24.8; 34.5) | 30.0 (26.0; 37.0) | 30.0 (26.0; 35.0) | 30.0 (26.0; 35.0) |
| HbA1c, % | ||||||
| N (%) | 2722 (6.3) | 9302 (3.4) | 12,024 (3.8) | 815 (21.7) | 4933 (16.9) | 5748 (17.5) |
| Median (Q1; Q3) | 6.2 (5.6; 7.2) | 6.1 (5.6; 7.0) | 6.1 (5.6; 7.1) | 6.8 (6.0; 8.0) | 6.7 (6.0; 7.9) | 6.7 (6.0; 7.9) |
| GFR, mL/min/1.73 m2 | ||||||
| N (%) | 19,304 (44.4) | 88,332 (31.9) | 107,636 (33.6) | 2259 (60.1) | 15,115 (51.9) | 17,374 (52.9) |
| Median (Q1; Q3) | 73.7 (52.1; 93.8) | 85.3 (68.6; 101.9) | 83.6 (65.6; 100.7) | 59.4 (38.7; 76.1) | 60 (55; 86.3) | 60 (52.8; 85) |
| Creatinine in blood, mg/dL | ||||||
| N (%) | 18,913 (43.5) | 87,913 (31.7) | 106,826 (33.3) | 2265 (60.3) | 15,158 (52.1) | 17,423 (53.0) |
| Median (Q1; Q3) | 0.9 (0.7; 1.2) | 0.8 (0.7; 1) | 0.8 (0.7; 1) | 1.1 (0.8; 1.6) | 0.9 (0.7; 1.1) | 0.9 (0.7; 1.2) |
| Proteins in urine, mg/dL | ||||||
| N (%) | 1851 (4.3) | 11,997 (4.3) | 13,848 (4.3) | 291 (7.7) | 2078 (7.1) | 2369 (7.2) |
| Median (Q1; Q3) | 5.8 (0.1; 75) | 0.2 (0.1; 2.5) | 0.2 (0.1; 10) | 25 (25; 75) | 25 (25; 75) | 25 (25; 75) |
| Albumin to creatinine ratio, mg/g | ||||||
| N (%) | 141 (0.3) | 531 (0.2) | 672 (0.2) | 87 (2.3) | 389 (1.3) | 476 (1.4) |
| Median (Q1; Q3) | 43.2 (14.9; 159) | 21.6 (7.4; 80.7) | 25.6 (8.7; 96.9) | 47.6 (8.3; 169.9) | 24.2 (4.9; 136.5) | 29.1 (5.4; 145.1) |
| Albumin in urine, mg/24 h | ||||||
| N (%) | 244 (0.6) | 932 (0.3) | 1176 (0.4) | 185 (4.9) | 748 (2.6) | 933 (2.8) |
| Median (Q1; Q3) | 3.8 (2.3; 26.8) | 3.5 (0.3; 4.5) | 3.6 (0.2; 5.0) | 7.3 (2.7; 51.3) | 3.7 (0.8; 17.9) | 4.2 (1.1; 24) |
| Total cholesterol, mg/dL | ||||||
| N (%) | 4800 (11.0) | 20,806 (7.5) | 25,606 (8.0) | 1312 (34.9) | 10,067 (34.6) | 11,379 (34.6) |
| Median (Q1; Q3) | 177 (145; 212) | 185 (154; 218) | 184 (153; 217) | 164 (134.8; 196) | 176 (146; 208) | 174 (145; 207) |
| HDL, mg/dL | ||||||
| N (%) | 2908 (6.7) | 12,970 (4.7) | 15,878 (5.0) | 933 (24.8) | 6660 (22.9) | 7593 (23.1) |
| Median (Q1; Q3) | 47 (37; 58) | 50 (40; 61) | 49 (39; 61) | 45 (36; 56) | 47 (37; 58) | 46 (37; 58) |
| LDL, mg/dL | ||||||
| N (%) | 3549 (8.2) | 16,518 (6.0) | 20,067 (6.3) | 962 (25.6) | 6788 (23.3) | 7750 (23.6) |
| Median (Q1; Q3) | 101 (78; 130) | 107 (85; 135) | 106 (84; 134) | 94 (71; 122) | 103 (79; 130.1) | 102 (78; 129.4) |
| Triglycerides, mg/dL | ||||||
| N (%) | 4847 (11.1) | 19,943 (7.2) | 24,790 (7.7) | 1289 (34.3) | 8875 (30.5) | 10,164 (30.9) |
| Median (Q1; Q3) | 118 (85; 170) | 113 (80; 165) | 114 (81; 166) | 115 (84; 163) | 111 (80; 158) | 112 (80; 158) |

| Model | AUC-ROC | Accuracy | Precision | Recall | F1-Score | F2-Score |
|---|---|---|---|---|---|---|
| Cross-validation metrics of the regularized models, mean (CI) | ||||||
| Full models | ||||||
| XGB Classifier | 0.74 (0.73–0.75) | 0.68 (0.63–0.74) | 0.26 (0.24–0.28) | 0.67 (0.61–0.73) | 0.37 (0.35–0.39) | 0.50 (0.48–0.52) |
| Logistic Regression | 0.73 (0.72–0.75) | 0.69 (0.64–0.74) | 0.26 (0.24–0.28) | 0.65 (0.59–0.71) | 0.37 (0.35–0.39) | 0.49 (0.47–0.51) |
| Random Forest Classifier | 0.57 (0.56–0.57) | 0.78 (0.76–0.80) | 0.22 (0.21–0.23) | 0.25 (0.22–0.27) | 0.23 (0.22–0.24) | 0.24 (0.23–0.25) |
| Decision Tree Classifier | 0.51 (0.51–0.52) | 0.74 (0.72–0.77) | 0.22 (0.21–0.23) | 0.34 (0.31–0.37) | 0.26 (0.26–0.27) | 0.30 (0.29–0.32) |
| Reduced and refined models | ||||||
| Logistic regression reduced | 0.73 (0.71–0.75) | 0.69 (0.64–0.74) | 0.26 (0.24–0.28) | 0.64 (0.57–0.71) | 0.37 (0.35–0.39) | 0.49 (0.47–0.51) |
| Logistic regression refined | 0.73 (0.71–0.75) | 0.69 (0.64–0.74) | 0.26 (0.24–0.28) | 0.65 (0.58–0.72) | 0.37 (0.35–0.39) | 0.49 (0.47–0.51) |
| Validation metrics of the logistic regression models, mean (95% CI) | ||||||
| Full model | 0.69 (0.68–0.70) | 0.51 (0.51–0.52) | 0.16 (0.16–0.17) | 0.79 (0.78–0.80) | 0.27 (0.26–0.27) | 0.44 (0.44–0.45) |
| Reduced model | 0.68 (0.68–0.69) | 0.51 (0.51–0.52) | 0.16 (0.16–0.16) | 0.78 (0.77–0.79) | 0.27 (0.26–0.27) | 0.44 (0.43–0.45) |
| Refined model | 0.68 (0.68–0.69) | 0.51 (0.51–0.52) | 0.16 (0.16–0.16) | 0.77 (0.76–0.79) | 0.26 (0.26–0.27) | 0.44 (0.43–0.45) |
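
The F1 and F2 columns above can be related to the precision and recall columns through the standard F-beta definition, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R). The following is a minimal sketch, not the authors' code: the helper name `f_beta` is hypothetical, and the inputs are the rounded validation values of the full logistic regression model as reported in the table, so agreement should only be expected to two decimals.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-beta score: weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily than precision (beta = 2 gives F2).
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # conventional value when both inputs are zero
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Rounded validation metrics of the full logistic regression model,
# copied from the table (assumption: rounding error is negligible here).
precision, recall = 0.16, 0.79

f1 = f_beta(precision, recall, beta=1.0)
f2 = f_beta(precision, recall, beta=2.0)
print(f"F1 = {f1:.2f}, F2 = {f2:.2f}")  # prints "F1 = 0.27, F2 = 0.44"
```

Because F2 weights recall four times as heavily as precision in the denominator, a model with high recall and low precision, as in the validation rows, scores noticeably higher on F2 than on F1.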
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Navarro-González, J.F.; Pérez de Isla, L.; Cánovas Molina, G.; Brito-Sanfiel, M.Á.; Barajas Galindo, D.E.; Cuellar Olmedo, L.Á.; Mauricio, D.; Tofé Povedano, S.; Balsa Barro, J.A.; Rubio Almanza, M.; et al. Natural Language Processing of Unstructured Healthcare Data for Predicting Heart Failure in Individuals with Type 2 Diabetes. J. Clin. Med. 2026, 15, 3287. https://doi.org/10.3390/jcm15093287
APA Style: Navarro-González, J. F., Pérez de Isla, L., Cánovas Molina, G., Brito-Sanfiel, M. Á., Barajas Galindo, D. E., Cuellar Olmedo, L. Á., Mauricio, D., Tofé Povedano, S., Balsa Barro, J. A., Rubio Almanza, M., Aparicio Sánchez, J. J., Sequera Mutiozabal, M., Pimentel, B., Pérez Domínguez, A., Latorre Garrido, V., Maté, C., Salvador, D., Merino-Torres, J. F., & Blanco-Carrasco, A. J. (2026). Natural Language Processing of Unstructured Healthcare Data for Predicting Heart Failure in Individuals with Type 2 Diabetes. Journal of Clinical Medicine, 15(9), 3287. https://doi.org/10.3390/jcm15093287

