Long COVID’s Hidden Complexity: Machine Learning Reveals Why Personalized Care Remains Essential
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Design
2.2. Data Source
2.3. Study Population
2.4. Data Analysis
3. Results
3.1. Patient
3.2. Unsupervised Machine Learning for the Identification of Patient Aggregation
4. Discussion
5. Conclusions
- Integration of biomarkers: future phenotyping studies could benefit from the inclusion of inflammatory, immunological, metabolic, and specific organ damage biomarkers, which might reveal underlying patterns not evident from symptom analysis alone.
- Multimodal imaging data: incorporating structural and functional imaging data could detect patterns of subclinical organ damage that might underlie reported symptoms.
- Digital health data: the use of remote monitoring technologies could facilitate the longitudinal collection of physiological data in real-world settings, revealing temporal patterns not easily captured during standard clinical visits.
- Integrated multi-omic approaches: high-resolution omic technologies could provide in-depth molecular characterization of PASC patients, potentially identifying distinctive molecular signatures.
- Extended longitudinal analyses: longer-term follow-up studies (1–5 years) are needed to understand the natural history of PASC and identify predictors of symptom persistence or recovery.
- Machine learning-based predictive models: developing models that integrate demographic, clinical, biological, and imaging data could help early identification of patients at risk for PASC.
- Adaptive clinical trials: designing trials that dynamically respond to emerging data could accelerate the development of effective therapeutic strategies for PASC.
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Characteristic | ICD9-CM Chapter | Overall (N = 382) | San Matteo Hosp (N = 242) | Cremona Hosp (N = 140) |
---|---|---|---|---|
Sex (F) | - | 102 (26.7%) | 72 (29.8%) | 30 (21.4%) |
Age > 65 | - | 136 (35.7%) | 84 (34.7%) | 52 (37.4%) |
Endotracheal intubation | - | 97 (25.4%) | 51 (21.1%) | 46 (32.9%) |
Multimorbidities | - | 324 (84.8%) | 200 (82.6%) | 124 (88.6%) |
Circulatory | 7 | 176 (46.1%) | 126 (52.1%) | 50 (35.7%) |
Endocrine | 3 | 76 (19.9%) | 66 (27.3%) | 10 (7.1%) |
Genitourinary | 10 | 34 (8.9%) | 24 (9.9%) | 10 (7.1%) |
Neurological | 6 | 25 (6.5%) | 21 (8.7%) | 4 (2.9%) |
Gastroenterological | 9 | 13 (3.4%) | 11 (4.5%) | 2 (1.4%) |
Cancer | 2 | 12 (3.1%) | 8 (3.3%) | 4 (2.9%) |
Hematological | 4 | 10 (2.6%) | 9 (3.7%) | 1 (0.7%) |
Dermatological | 12 | 8 (2.1%) | 5 (2.1%) | 3 (2.1%) |
Trauma | 17 | 6 (1.6%) | 4 (1.7%) | 2 (1.4%) |
Mental | 5 | 5 (1.3%) | 4 (1.7%) | 1 (0.7%) |
Musculoskeletal | 13 | 4 (1.0%) | 4 (1.7%) | 0 (0.0%) |
Other | 18 | 157 (41.1%) | 28 (11.6%) | 129 (92.1%) |
Symptoms | 16 | 113 (29.6%) | 7 (2.9%) | 106 (75.7%) |
Symptom | All (N = 382), N | All (N = 382), % (95% CI) | San Matteo Hosp (N = 242), N | San Matteo Hosp (N = 242), % (95% CI) | Cremona Hosp (N = 140), N | Cremona Hosp (N = 140), % (95% CI) |
---|---|---|---|---|---|---|
Residual symptoms | 253 | 67.8 (62.8, 72.5) | 148 | 63.5 (56.9, 69.6) | 105 | 75.0 (66.8, 81.8) |
Multiple symptoms | ||||||
1 | 107 | 28.0 (23.6, 32.9) | 71 | 29.3 (23.8, 35.6) | 36 | 25.7 (18.9, 33.9) |
2 | 77 | 20.2 (16.3, 24.6) | 46 | 19.0 (14.4, 24.6) | 31 | 22.1 (15.8, 30.1) |
3+ | 74 | 19.4 (15.6, 23.8) | 36 | 14.9 (10.8, 20.1) | 38 | 27.1 (20.1, 35.4) |
Dyspnea | 170 | 60.9 (54.9, 66.6) | 100 | 68.5 (60.2, 75.8) | 70 | 52.6 (43.8, 61.3) |
Fatigue | 109 | 39.8 (34.0, 45.9) | 64 | 45.7 (37.3, 54.3) | 45 | 33.6 (25.8, 42.3) |
Neuro-psychological symptoms | 69 | 30.4 (24.6, 36.9) | 33 | 35.9 (26.3, 46.6) | 36 | 26.7 (19.6, 35.1) |
Rheumatologic symptoms | 47 | 21.1 (16.0, 27.1) | 21 | 23.6 (15.5, 34.0) | 26 | 19.4 (13.3, 27.3) |
Cardiovascular symptoms | 47 | 17.2 (13.0, 22.3) | 28 | 20.3 (14.1, 28.2) | 19 | 14.1 (8.9, 21.4) |
Otorhinolaryngological symptoms | 28 | 10.3 (7.1, 14.7) | 20 | 14.5 (9.3, 21.7) | 8 | 6.0 (2.8, 11.8) |
Dermatologic symptoms | 22 | 9.8 (6.4, 14.6) | 6 | 6.7 (2.7, 14.5) | 16 | 11.9 (7.1, 18.8) |
Cough | 18 | 6.6 (4.1, 10.5) | 7 | 5.1 (2.3, 10.7) | 11 | 8.1 (4.3, 14.4) |
Gastrointestinal disorders | 19 | 6.9 (4.3, 10.8) | 16 | 11.4 (6.9, 18.2) | 3 | 2.2 (0.6, 6.9) |
Headache | 11 | 4.9 (2.6, 8.9) | 9 | 10.1 (5.0, 18.8) | 2 | 1.5 (0.3, 5.8) |
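
The symptom table above reports each prevalence as a percentage with a 95% confidence interval. This section does not state which interval method was used, so the following is only a minimal sketch of one common choice, the Wilson score interval for a binomial proportion. The function name `wilson_ci` and the example counts are hypothetical and are not taken from the study data.

```python
import math


def wilson_ci(successes: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    Assumption: the study's own interval method is not stated in this
    section; Wilson is used here purely as an illustration.
    z = 1.959964 corresponds to a two-sided 95% interval.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")

    p = successes / n
    z2 = z * z
    denom = 1 + z2 / n
    centre = (p + z2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z2 / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Hypothetical example: 60 of 100 evaluable patients report a symptom.
low, high = wilson_ci(60, 100)
print(f"60.0% (95% CI {100 * low:.1f}, {100 * high:.1f})")
```

Because the number of evaluable patients differs from symptom to symptom in the table, the denominator passed as `n` would be the count of patients actually assessed for that symptom rather than the full cohort size.
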
Method | Average Silhouette | Separation Index (SI) | Cophenetic Correlation Coefficient | Entropy |
---|---|---|---|---|
Agglomerative Clustering | 0.31 | 0.05 | 0.61 | 1.10 |
Divisive Clustering | 0.31 | 0.03 | 0.74 | 0.74 |
PAM Clustering | 0.18 | 0.01 | - | 1.27 |
K-Means | 0.19 | 0.26 | - | 0.69 |
DBSCAN | 0.47 | 0.55 | - | 1.01 |
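
The comparison table above ranks the clustering methods partly by average silhouette. As a reminder of how that index is obtained, the sketch below computes it from scratch for binary symptom vectors. It assumes a simple matching distance between patients; the distance actually used in the study is not shown in this section, and the toy patients and cluster labels are hypothetical.

```python
from collections import defaultdict


def matching_distance(x, y):
    """Share of positions where two binary symptom vectors disagree.

    Assumption: chosen for illustration only, not the study's metric.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have equal length")
    return sum(a != b for a, b in zip(x, y)) / len(x)


def average_silhouette(points, labels, dist=matching_distance):
    """Mean silhouette width over all points.

    For point i: a = mean distance to the other members of its cluster,
    b = smallest mean distance to the members of any other cluster,
    s = (b - a) / max(a, b). Points in singleton clusters score 0.
    """
    clusters = defaultdict(list)
    for idx, lab in enumerate(labels):
        clusters[lab].append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: four patients, four yes/no symptoms each.
patients = [(1, 1, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1), (0, 0, 1, 1)]
well_separated = average_silhouette(patients, [0, 0, 1, 1])
poorly_separated = average_silhouette(patients, [0, 1, 0, 1])
print(well_separated, poorly_separated)
```

Values near 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest points assigned to the wrong cluster. On that scale, the averages in the table (0.18 to 0.47) point to weak or at best moderate cluster structure.
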
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Fresi, E.; Pagani, E.; Pezzetti, F.; Montomoli, C.; Monti, C.; Betti, M.; De Silvestri, A.; Sagliocco, O.; Zuccaro, V.; Bruno, R.; et al. Long COVID’s Hidden Complexity: Machine Learning Reveals Why Personalized Care Remains Essential. J. Clin. Med. 2025, 14, 3670. https://doi.org/10.3390/jcm14113670