Identifying Long COVID Definitions, Predictors, and Risk Factors in the United States: A Scoping Review of Data Sources Utilizing Electronic Health Records
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Source
2.2. Search Strategy
2.3. Study Selection
2.4. Data Extraction
2.5. Narrative Synthesis
3. Results
Study | Year | Study Design | Total Sample Size | Total Participants (n = Study Population) | EHR Source | Main Methods | Method of Validation |
---|---|---|---|---|---|---|---|
Al-Aly et al. [30] | 2021 | Observational Study | 5,213,885 | 73,435 | US Department of Veterans Affairs electronic healthcare databases | Cox regression | N/A |
Baskett et al. [31] | 2022 | Retrospective Cohort Study | N/A | 17,487 | Cerner Real-World data set | Logistic regression; Propensity score matching | N/A |
Estiri et al. [29] | 2021 | Retrospective Cohort Study | 96,025 | 22,475 | Mass General Brigham Hospital | MLHO; Multivariate time series analysis | Clinical expert review |
Fritsche et al. [32] | 2023 | Case-Control | 63,675 | 1724 | Michigan Medicine EHR data | Logistic regression | Sensitivity analysis; Logistic regression; AAUC |
Haupert et al. [33] | 2022 | Case-Crossover Design | 204,597 | 44,198 | Michigan Medicine Health System | Logistic regression | N/A |
Jiang et al. [34] | 2022 | Retrospective Cohort Study | 85,196 | 28,558 | N3C | Deep neural networks; Extreme gradient boosting (decision tree); PCA | AUC; F1 score |
Khullar et al. [35] | 2023 | Retrospective Cohort Study | 310,220 | 62,339 | INSIGHT Network; New York City Health Systems | Logistic regression | Sensitivity analysis |
Lorman et al. [36] | 2023 | Observational Study | 14,399 | 1309 | RECOVER PEDSnet EHR | Propensity score matching; Decision trees | Sensitivity analysis |
Nasir et al. [37] | 2023 | Observational Study | 11,209 | 4091 | Health Choice Network | Bayesian structural time series modeling | N/A |
Pfaff et al. [38] | 2023 | Retrospective Cohort Study | 36,880 | 33,782 | N3C | Clustering; Network analysis | N/A |
Pfaff et al. [23] | 2022 | Retrospective Cohort Study | 1,793,604 | 73,972 | N3C | Extreme gradient boosting (decision tree) | Cross-validation; AUC; Precision; Recall; F-score |
Rao et al. [39] | 2022 | Exploratory, Retrospective Cohort | 659,286 | 59,893 | PEDSnet | Cox regression; Logistic regression | N/A |
Reese et al. [40] | 2023 | Retrospective Cohort | 5,434,528 | 20,532 | N3C | NLP; k-means clustering | N/A |
Sengupta et al. [41] | 2022 | Retrospective Cohort | 49,950 | 7511 | N3C | Convolutional and LSTM neural networks | AUC |
Wang et al. [42] | 2022 | Observational Study | 51,485 | 26,117 | Mass General Brigham Hospital | Rule-based NLP | Precision |
Zang et al. [43] | 2023 | Observational Study | 361,401 (INSIGHT); 199,351 (OneFlorida+) | 35,275 (INSIGHT); 22,341 (OneFlorida+) | INSIGHT CRN; OneFlorida+ CRN | Propensity score matching | Sensitivity analysis |
H Zhang et al. [44] | 2022 | Retrospective Cohort Study | 3460 | 20,881 | INSIGHT CRN; OneFlorida+ CRN | Clustering; NLP via topic modeling | Topic Coherence; Sensitivity analysis |
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Crook, H.; Raza, S.; Nowell, J.; Young, M.; Edison, P. Long COVID—Mechanisms, risk factors, and management. BMJ 2021, 374, n1648.
- Devi, K.P.; Pourkarim, M.R.; Thijssen, M.; Sureda, A.; Khayatkashani, M.; Cismaru, C.A.; Neagoe, I.B.; Habtemariam, S.; Razmjouei, S.; Khayat Kashani, H.R. A perspective on the applications of furin inhibitors for the treatment of SARS-CoV-2. Pharmacol. Rep. 2022, 74, 425–430.
- Garg, M.; Maralakunte, M.; Garg, S.; Dhooria, S.; Sehgal, I.; Bhalla, A.S.; Vijayvergiya, R.; Grover, S.; Bhatia, V.; Jagia, P. The conundrum of ‘long-COVID-19’: A narrative review. Int. J. Gen. Med. 2021, 14, 2491–2506.
- Makhoul, E.; Aklinski, J.L.; Miller, J.; Leonard, C.; Backer, S.; Kahar, P.; Parmar, M.S.; Khanna, D. A review of COVID-19 in relation to metabolic syndrome: Obesity, hypertension, diabetes, and dyslipidemia. Cureus 2022, 14, e27438.
- Cutler, D.M. The Economic Cost of Long COVID: An Update. Published online July 2022. Available online: https://scholar.harvard.edu/cutler/news/long-covid (accessed on 9 June 2024).
- Sagy, Y.W.; Feldhamer, I.; Brammli-Greenberg, S.; Lavie, G. Estimating the economic burden of long-COVID: The additive cost of healthcare utilisation among COVID-19 recoverees in Israel. BMJ Glob. Health 2023, 8, e012588.
- Gandjour, A. Long COVID: Costs for the German economy and health care and pension system. BMC Health Serv. Res. 2023, 23, 641.
- Sanchez-Ramirez, D.C.; Normand, K.; Zhaoyun, Y.; Torres-Castro, R. Long-term impact of COVID-19: A systematic review of the literature and meta-analysis. Biomedicines 2021, 9, 900.
- Ham, D.I. Long-Haulers and Labor Market Outcomes; Federal Reserve Bank of Minneapolis: Minneapolis, MN, USA, 2022; Available online: https://www.minneapolisfed.org/institute/working-papers-institute/iwp60.pdf (accessed on 9 June 2024).
- Amin-Chowdhury, Z.; Ladhani, S.N. Causation or confounding: Why controls are critical for characterizing long COVID. Nat. Med. 2021, 27, 1129–1130.
- Barizien, N.; Le Guen, M.; Russel, S.; Touche, P.; Huang, F.; Vallée, A. Clinical characterization of dysautonomia in long COVID-19 patients. Sci. Rep. 2021, 11, 14042.
- Davis, H.E.; Assaf, G.S.; McCorkell, L.; Wei, H.; Low, R.J.; Re’em, Y.; Redfield, S.; Austin, J.P.; Akrami, A. Characterizing long COVID in an international cohort: 7 months of symptoms and their impact. EClinicalMedicine 2021, 38, 101019.
- Deer, R.R.; Rock, M.A.; Vasilevsky, N.; Carmody, L.; Rando, H.; Anzalone, A.J.; Basson, M.D.; Bennett, T.D.; Bergquist, T.; Boudreau, E.A. Characterizing long COVID: Deep phenotype of a complex condition. EBioMedicine 2021, 74, 103722.
- Taquet, M.; Dercon, Q.; Luciano, S.; Geddes, J.R.; Husain, M.; Harrison, P.J. Incidence, co-occurrence, and evolution of long-COVID features: A 6-month retrospective cohort study of 273,618 survivors of COVID-19. PLoS Med. 2021, 18, e1003773.
- Thaweethai, T.; Jolley, S.E.; Karlson, E.W.; Levitan, E.B.; Levy, B.; McComsey, G.A.; McCorkell, L.; Nadkarni, G.N.; Parthasarathy, S.; Singh, U. Development of a definition of postacute sequelae of SARS-CoV-2 infection. JAMA 2023, 329, 1934–1946.
- Bonilla, H.; Peluso, M.J.; Rodgers, K.; Aberg, J.A.; Patterson, T.F.; Tamburro, R.; Baizer, L.; Goldman, J.D.; Rouphael, N.; Deitchman, A.; et al. Therapeutic trials for long COVID-19: A call to action from the interventions taskforce of the RECOVER initiative. Front. Immunol. 2023, 14, 1129459.
- Jones, R.; Davis, A.; Stanley, B.; Julious, S.; Ryan, D.; Jackson, D.J.; Halpin, D.M.; Hickman, K.; Pinnock, H.; Quint, J.K. Risk predictors and symptom features of long COVID within a broad primary care patient population including both tested and untested patients. Pragmatic Obs. Res. 2021, 12, 93–104.
- Knight, D.R.; Munipalli, B.; Logvinov, I.I.; Halkar, M.G.; Mitri, G.; Hines, S.L. Perception, prevalence, and prediction of severe infection and post-acute sequelae of COVID-19. Am. J. Med. Sci. 2022, 363, 295–304.
- Sudre, C.H.; Murray, B.; Varsavsky, T.; Graham, M.S.; Penfold, R.S.; Bowyer, R.C.; Pujol, J.C.; Klaser, K.; Antonelli, M.; Canas, L.S. Attributes and predictors of long COVID. Nat. Med. 2021, 27, 626–631.
- Mollalo, A.; Hamidi, B.; Lenert, L.; Alekseyenko, A.V. Characterizing Patient Phenotypes and Emerging Trends in Application of Spatial Analysis in Individual-Level Health Data. Res. Square 2023.
- NIH. N3C: Translating Health Data into Health Solutions. Available online: https://ncats.nih.gov/sites/default/files/NCATS-N3C-One-Pager-508.pdf (accessed on 1 December 2023).
- Kurbasic, I.; Pandza, H.; Masic, I.; Huseinagic, S.; Tandir, S.; Alicajic, F.; Toromanovic, S. The advantages and limitations of international classification of diseases, injuries and causes of death from aspect of existing health care system of Bosnia and Herzegovina. Acta Inform. Med. 2008, 16, 159.
- Pfaff, E.R.; Girvin, A.T.; Bennett, T.D.; Bhatia, A.; Brooks, I.M.; Deer, R.R.; Dekermanjian, J.P.; Jolley, S.E.; Kahn, M.G.; Kostka, K. Identifying who has long COVID in the USA: A machine learning approach using N3C data. Lancet Digit. Health 2022, 4, e532–e541.
- Michelen, M.; Manoharan, L.; Elkheir, N.; Cheng, V.; Dagens, A.; Hastie, C.; O’Hara, M.; Suett, J.; Dahmash, D.; Bugaeva, P. Characterising long COVID: A living systematic review. BMJ Glob. Health 2021, 6, e005427.
- Akbarialiabad, H.; Taghrir, M.H.; Abdollahi, A.; Ghahramani, N.; Kumar, M.; Paydar, S.; Razani, B.; Mwangi, J.; Asadi-Pooya, A.A.; Malekmakan, L. Long COVID, a comprehensive systematic scoping review. Infection 2021, 49, 1163–1186.
- Aiyegbusi, O.L.; Hughes, S.E.; Turner, G.; Rivera, S.C.; McMullan, C.; Chandan, J.S.; Haroon, S.; Price, G.; Davies, E.H.; Nirantharakumar, K. Symptoms, complications and management of long COVID: A review. J. R. Soc. Med. 2021, 114, 428–442.
- Kelly, J.D.; Curteis, T.; Rawal, A.; Murton, M.; Clark, L.J.; Jafry, Z.; Shah-Gupta, R.; Berry, M.; Espinueva, A.; Chen, L. SARS-CoV-2 post-acute sequelae in previously hospitalised patients: Systematic literature review and meta-analysis. Eur. Respir. Rev. 2023, 32, 220254.
- Iqbal, F.M.; Lam, K.; Sounderajah, V.; Clarke, J.M.; Ashrafian, H.; Darzi, A. Characteristics and predictors of acute and chronic post-COVID syndrome: A systematic review and meta-analysis. EClinicalMedicine 2021, 36, 100899.
- Estiri, H.; Strasser, Z.H.; Brat, G.A.; Semenov, Y.R.; Patel, C.J.; Murphy, S.N. Evolving phenotypes of non-hospitalized patients that indicate long COVID. BMC Med. 2021, 19, 249.
- Al-Aly, Z.; Xie, Y.; Bowe, B. High-dimensional characterization of post-acute sequelae of COVID-19. Nature 2021, 594, 259–264.
- Baskett, W.I.; Qureshi, A.I.; Shyu, D.; Armer, J.M.; Shyu, C.-R. COVID-specific long-term sequelae in comparison to common viral respiratory infections: An analysis of 17 487 infected adult patients. Open Forum Infect. Dis. 2023, 10, ofac683.
- Fritsche, L.G.; Jin, W.; Admon, A.J.; Mukherjee, B. Characterizing and predicting post-acute sequelae of SARS-CoV-2 infection (PASC) in a large academic medical center in the US. J. Clin. Med. 2023, 12, 1328.
- Haupert, S.R.; Shi, X.; Chen, C.; Fritsche, L.G.; Mukherjee, B. A Case-Crossover Phenome-wide association study (PheWAS) for understanding Post-COVID-19 diagnosis patterns. J. Biomed. Inform. 2022, 136, 104237.
- Jiang, S.; Loomba, J.; Sharma, S.; Brown, D. Vital Measurements of Hospitalized COVID-19 Patients as a Predictor of Long COVID: An EHR-based Cohort Study from the RECOVER Program in N3C. In Proceedings of the 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Las Vegas, NV, USA, 6–8 December 2022; pp. 3023–3030.
- Khullar, D.; Zhang, Y.; Zang, C.; Xu, Z.; Wang, F.; Weiner, M.G.; Carton, T.W.; Rothman, R.L.; Block, J.P.; Kaushal, R. Racial/ethnic disparities in post-acute sequelae of SARS-CoV-2 infection in New York: An EHR-based cohort study from the RECOVER program. J. Gen. Intern. Med. 2023, 38, 1127–1136.
- Lorman, V.; Rao, S.; Jhaveri, R.; Case, A.; Mejias, A.; Pajor, N.M.; Patel, P.; Thacker, D.; Bose-Brill, S.; Block, J. Understanding pediatric long COVID using a tree-based scan statistic approach: An EHR-based cohort study from the RECOVER Program. JAMIA Open 2023, 6, ooad016.
- Nasir, M.; Cook, N.; Parras, D.; Mukherjee, S.; Miller, G.; Ferres, J.L.; Chung-Bridges, K. Using Data Science and a Health Equity Lens to Identify Long-COVID Sequelae Among Medically Underserved Populations. J. Health Care Poor Underserved 2023, 34, 521–534.
- Pfaff, E.R.; Madlock-Brown, C.; Baratta, J.M.; Bhatia, A.; Davis, H.; Girvin, A.; Hill, E.; Kelly, E.; Kostka, K.; Loomba, J. Coding long COVID: Characterizing a new disease through an ICD-10 lens. BMC Med. 2023, 21, 58.
- Rao, S.; Lee, G.M.; Razzaghi, H.; Lorman, V.; Mejias, A.; Pajor, N.M.; Thacker, D.; Webb, R.; Dickinson, K.; Bailey, L.C. Clinical features and burden of postacute sequelae of SARS-CoV-2 infection in children and adolescents. JAMA Pediatr. 2022, 176, 1000–1009.
- Reese, J.T.; Blau, H.; Casiraghi, E.; Bergquist, T.; Loomba, J.J.; Callahan, T.J.; Laraway, B.; Antonescu, C.; Coleman, B.; Gargano, M. Generalisable long COVID subtypes: Findings from the NIH N3C and RECOVER programmes. EBioMedicine 2023, 87, 104413.
- Sengupta, S.; Loomba, J.; Sharma, S.; Brown, D.E.; Thorpe, L.; Haendel, M.A.; Chute, C.G.; Hong, S. Analyzing historical diagnosis code data from NIH N3C and RECOVER Programs using deep learning to determine risk factors for Long COVID. In Proceedings of the 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Las Vegas, NV, USA, 6–8 December 2022; pp. 2797–2802.
- Wang, L.; Foer, D.; MacPhaul, E.; Lo, Y.-C.; Bates, D.W.; Zhou, L. PASCLex: A comprehensive post-acute sequelae of COVID-19 (PASC) symptom lexicon derived from electronic health record clinical notes. J. Biomed. Inform. 2022, 125, 103951.
- Zang, C.; Zhang, Y.; Xu, J.; Bian, J.; Morozyuk, D.; Schenck, E.J.; Khullar, D.; Nordvig, A.S.; Shenkman, E.A.; Rothman, R.L. Data-driven analysis to understand long COVID using electronic health records from the RECOVER initiative. Nat. Commun. 2023, 14, 1948.
- Zhang, H.; Zang, C.; Xu, Z.; Zhang, Y.; Xu, J.; Bian, J.; Morozyuk, D.; Khullar, D.; Zhang, Y.; Nordvig, A.S. Data-driven identification of post-acute SARS-CoV-2 infection subphenotypes. Nat. Med. 2023, 29, 226–235.
- Walker, A.J.; MacKenna, B.; Inglesby, P.; Tomlinson, L.; Rentsch, C.T.; Curtis, H.J.; Morton, C.E.; Morley, J.; Mehrkar, A.; Bacon, S. Clinical coding of long COVID in English primary care: A federated analysis of 58 million patient records in situ using OpenSAFELY. Br. J. Gen. Pract. 2021, 71, e806–e814.
- Mayor, N.; Meza-Torres, B.; Okusi, C.; Delanerolle, G.; Chapman, M.; Wang, W.; Anand, S.; Feher, M.; Macartney, J.; Byford, R. Developing a long COVID phenotype for postacute COVID-19 in a national primary care sentinel cohort: Observational retrospective database analysis. JMIR Public Health Surveill. 2022, 8, e36989.
- Jeffrey, K.; Woolford, L.; Maini, R.; Basetti, S.; Batchelor, A.; Weatherill, D.; White, C.; Hammersley, V.; Millington, T.; Macdonald, C. Prevalence and risk factors for long COVID among adults in Scotland using electronic health records: A national, retrospective, observational cohort study. EClinicalMedicine 2024, 71, 102590.
- Kessler, R.; Philipp, J.; Wilfer, J.; Kostev, K. Predictive Attributes for Developing Long COVID—A Study Using Machine Learning and Real-World Data from Primary Care Physicians in Germany. J. Clin. Med. 2023, 12, 3511.
- Zhang, H.G.; Dagliati, A.; Shakeri Hossein Abad, Z.; Xiong, X.; Bonzel, C.-L.; Xia, Z.; Tan, B.W.; Avillach, P.; Brat, G.A.; Hong, C. International electronic health record-derived post-acute sequelae profiles of COVID-19 patients. NPJ Digit. Med. 2022, 5, 81.
- Dagliati, A.; Strasser, Z.H.; Abad, Z.S.H.; Klann, J.G.; Wagholikar, K.B.; Mesa, R.; Visweswaran, S.; Morris, M.; Luo, Y.; Henderson, D.W. Characterization of long COVID temporal sub-phenotypes by distributed representation learning from electronic health record data: A cohort study. EClinicalMedicine 2023, 64, 102210.
Theme | Key Terms |
---|---|
Long COVID | “long COVID*” OR “long-term COVID*” OR “post-acute sequelae” OR “late-stage COVID*” OR “SARS-CoV-2 post-recovery” OR “post-COVID*” OR “PASC” OR “long-haul COVID*” OR “Chronic COVID*” OR “persistent COVID*” OR “prolonged COVID*” OR “extended COVID*” OR “post-recovery COVID*” OR “Aftermath COVID*” OR “survivorship COVID*” OR “late effects COVID*” OR “long-term effects of COVID” OR “post-acute COVID-19” OR “post-acute sequelae of SARS-CoV-2 (PASC)” OR “ICD-10-CM” OR “ICD-10” |
Electronic health records | “Electronic Medical Record*” OR “Electronic Health Record*” OR “Electronic Patient Record*” OR “EHR” OR “EMR” OR “N3C” OR “All of US” |
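The two themes in the table above can be combined into a single Boolean search string. The sketch below is illustrative only: it uses a shortened subset of the listed terms, and it assumes the two themes were joined with AND, which the table itself does not state. The helper names `or_group` and `build_query` are hypothetical.

```python
# Abbreviated subsets of the key terms listed in the table above.
LONG_COVID_TERMS = [
    "long COVID*",
    "post-acute sequelae",
    "PASC",
    "post-COVID*",
]
EHR_TERMS = [
    "Electronic Health Record*",
    "Electronic Medical Record*",
    "EHR",
    "N3C",
]


def or_group(terms):
    """Quote each term and join one theme's terms with OR, in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*groups):
    """Join the per-theme OR-groups with AND.

    Assumption: themes are intersected with AND; the source table only
    shows the OR-joined terms within each theme.
    """
    return " AND ".join(or_group(group) for group in groups)


query = build_query(LONG_COVID_TERMS, EHR_TERMS)
print(query)
```

Keeping each theme as a plain list makes it straightforward to add or drop a term and regenerate the string for each database searched.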
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Luke, R.A.; Shaw, G., Jr.; Saarunya, G.; Mollalo, A. Identifying Long COVID Definitions, Predictors, and Risk Factors in the United States: A Scoping Review of Data Sources Utilizing Electronic Health Records. Informatics 2024, 11, 41. https://doi.org/10.3390/informatics11020041