1. Introduction
Many who have contracted coronavirus disease (COVID-19) caused by Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2) have not fully recovered or have persisting or new symptoms, in a condition termed Long COVID [1]. By February 2024, 17.1 to 18.2% of US adults reported having experienced Long COVID [2]. As of March 2024, about 7% of all US adults had Long COVID, which is roughly 17 million people [3]. Those with a more severe initial infection have a higher risk of developing Long COVID, but Long COVID can develop even after a mild SARS-CoV-2 infection [4]. Other terms for Long COVID include “post-COVID-19 condition” or “long haul COVID”. The CDC uses the term post-acute sequelae of COVID-19 (PASC). Stephenson et al. [5] found that those with Long COVID have more symptoms than those who tested negative for SARS-CoV-2; other studies have corroborated these findings [2,6].
There have been multiple efforts to define Long COVID. The World Health Organization’s [7] case definition of what they referred to as “post-COVID-19 condition” states that this condition “occurs in individuals with a history of probable or confirmed SARS-CoV-2 infection, usually 3 months from the onset of COVID-19 with symptoms that last for at least 2 months and cannot be explained by an alternative diagnosis”. Another more recent broad definition by the National Academies of Sciences, Engineering, and Medicine [8] involves having continuous, relapsing, and remitting symptoms affecting one or more organ systems for at least three months following acute SARS-CoV-2 infection.
Thaweethai et al. [9] proposed a narrower definition of Long COVID and evaluated it using a national data set of those infected versus those not infected with SARS-CoV-2. They identified 12 symptoms (i.e., loss of or change in smell and taste, post-exertional malaise, chronic cough, brain fog, thirst, heart palpitations, chest pain, fatigue, change in sexual desire or capacity, dizziness, gastrointestinal symptoms, and abnormal movements). Each symptom was assigned a point value (created from LASSO coefficient magnitudes), and individuals whose total score was 12 or greater were classified as having Long COVID. The occurrence of “loss of or change in smell or taste” was the best discriminator between those with and without Long COVID and was associated with 8 points. Therefore, if a respondent had this symptom, it counted for 8 of the needed 12 points, or about 67% of the required score to meet their Long COVID definition. They found that among those who were ultimately classified as having Long COVID, 41% had this symptom. Dorri and Jason [10] looked at the same item, “loss of or change in smell and/or taste,” in another Long COVID data set and found that just 12.6% of patients satisfied the requirements for this symptom, using criteria that require symptoms to occur at least half the time and have at least moderate severity. Thus, applying frequency and severity criteria was associated with a decrease in the prevalence of this symptom from 41% to 12.6% in patients with Long COVID. Other studies have also demonstrated a lower prevalence of this symptom among patients with COVID-19 across the US [11].
Occurrence measures alone (i.e., whether a symptom occurs or does not occur) may be too coarse to characterize Long COVID, and richer ratings may be needed. As an example, Oliveira et al. [12] used frequency and severity ratings with a machine learning algorithm and applied an adaptive LASSO approach for feature selection, choosing tuning parameters and penalization with cross-validation. They found that “unrefreshing sleep” and “flu-like symptoms” were the best discriminators of Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) versus Long COVID. In the current study, we applied a similar analysis to distinguish patients with Long COVID from recovered controls. We used a symptom measure that incorporated both frequency and severity with a sample of individuals with Long COVID and those without. LASSO procedures were used to first develop and then test models based on symptom presentation to best differentiate those with Long COVID from recovered controls. A scale was developed, and an ROC curve was used to determine the score that best distinguished patients with Long COVID from recovered controls. The scale and threshold score could then be used to define those with Long COVID.
2. Methods
2.1. Participants
COVID data were collected from 2023 to 2024. Questionnaires were first distributed to patients who self-reported having Long COVID on multiple online Long COVID support group forums and through recruitment efforts at local universities. Patients were then identified as having Long COVID using quantitative questionnaires, qualitative interviews, a medical examination, and laboratory screening to rule out other conditions (see below). Comprehensive clinical evaluations determined who had been infected with SARS-CoV-2, continued to experience significant symptoms for at least three months following infection, and did not have any other medical conditions that could explain their symptoms. Interviews and questionnaires were conducted so that the language used would not bias the patient’s answer. For example, we asked questions about their infection with SARS-CoV-2 and what symptoms persisted. Through various methods including the medical examination, surveys, and interviews, we were able to determine which participants had continuing symptoms since the SARS-CoV-2 infection. We included (a) any person with a positive Nucleic Acid Amplification Test (NAAT) OR positive SARS-CoV-2 antibody test (this definition modifies the WHO criterion to add detectable SARS-CoV-2 antibody as a qualifying test); (b) any person with a positive SARS-CoV-2 Antigen-rapid test (including home-administered rapid test) AND meeting either the probable or suspected case definition; (c) an asymptomatic person with a positive SARS-CoV-2 Antigen-rapid diagnostic test (RDT) (including home-administered RDT) who is a contact of a probable or confirmed case; (d) any person with a positive SARS-CoV-2 nucleocapsid protein antibody test OR a positive SARS-CoV-2 spike protein antibody test in a non-vaccinated individual. We attempted to identify relative weights that could be applied to symptoms through an algorithm to best distinguish cases of Long COVID from recovered controls.
Participants included 55 patients with Long COVID and 55 recovered controls, matched to patients with Long COVID by sex, age, race, and ethnicity.
2.2. Measures
Participants completed the DePaul Symptom Questionnaire (DSQ), a 54-item self-report survey that measures ME/CFS symptomatology [13], and the DSQ-COVID, a questionnaire developed to measure Long COVID symptoms [10,14]. The DSQ-COVID collects data on COVID-related symptoms; its items were created by identifying the most common symptoms across the COVID-19 research literature. A literature search used the following terms: “exploratory factor analysis of COVID”, “exploratory factor analysis of Long COVID”, “factor analysis of COVID”, and “factor analysis of Long COVID” across several databases (DePaul University Library system, PubMed, Google Search, and Google Scholar). In addition, possible symptoms were presented to patient communities for their feedback. The information patients provided was adjudicated among the research team, and then a revised list was created and shared with patients for additional feedback. This feedback was evaluated, and a final list of 38 symptoms was developed. All symptoms were measured on a 5-point Likert scale for frequency (0 = None of the time, 1 = A little of the time, 2 = About half the time, 3 = Most of the time, and 4 = All of the time) and severity (0 = Absent, 1 = Mild, 2 = Moderate, 3 = Severe, and 4 = Very severe). Composite scores were calculated for each symptom by averaging its frequency and severity ratings and then multiplying the result by 25 to create values ranging from 0 to 100, with higher scores indicating more frequent/severe symptoms. In the past, we have used a frequency score of 2 or higher (at least about half the time) and a severity score of 2 or higher (at least moderate) to indicate a significant symptom, but in this study, we allowed the scores to vary from 0 to 100 as we were attempting to develop an algorithm for defining Long COVID.
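The composite scoring rule described above can be illustrated with a minimal sketch. The function names are hypothetical and are not part of the DSQ materials; only the arithmetic (mean of the two 0 to 4 ratings, multiplied by 25) and the earlier dichotomous rule (both ratings at least 2) come from the text.

```python
def composite_score(frequency: int, severity: int) -> float:
    """Average the 0-4 frequency and severity ratings, then rescale to 0-100."""
    for name, value in (("frequency", frequency), ("severity", severity)):
        if not 0 <= value <= 4:
            raise ValueError(f"{name} must be between 0 and 4, got {value}")
    return (frequency + severity) / 2 * 25


def meets_prior_threshold(frequency: int, severity: int) -> bool:
    """Earlier dichotomous rule: at least 'about half the time' AND at least 'moderate'."""
    return frequency >= 2 and severity >= 2
```

For instance, ratings of (3, 2) and (4, 1) both give a composite of 62.5, although only the first pair would have counted as a significant symptom under the earlier dichotomous rule. This is one way to see how the continuous composite retains information that the cut-off discards.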
2.3. Preprocessing
Missing values were imputed according to which values were missing. For simplicity, the frequency and severity ratings for a given symptom are referred to as pair values. When one pair value (either frequency or severity) for a symptom was missing, we imputed it from the available information as follows. If the missing value was paired with a value of zero, the missing value was assumed to be zero. If the missing value had a non-zero pair value, we imputed the mode of the scores of other participants who had the same pair value. For instance, if the severity of a symptom was missing and its paired frequency score was 3, the modal severity among other participants with a frequency of 3 was imputed as the missing value. If both values for a symptom were missing, the median value for the symptom was imputed for both values.
Our choice of imputation method was guided by the structure of the symptom-rating system. In the DSQ framework, frequency and severity scores are conceptually linked paired dimensions of the same underlying symptom. In practice, these two values tend to cluster together because respondents who report a symptom as frequent generally also report higher severity, while those who report a symptom as infrequent typically report low severity. Imputing from participants with the same paired value preserves this empirical dependency more effectively than alternative simple imputations (e.g., mean substitution or global modal imputation), which would ignore the known correspondence between the two ratings. More complex approaches (e.g., multiple imputation) were not suitable for this dataset because of (a) the ordinal, highly skewed distribution of the items, and (b) the fact that symptom frequency and severity scores are not independent variables, but components of a scoring algorithm used to derive overall symptom burden. Multiple imputation algorithms treating these scores as independent predictors tend to produce unstable or clinically nonsensical values.
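The pairwise imputation rules described in this section can be sketched roughly as follows. This is an illustration under stated assumptions, not the study's code: the function name and data layout are hypothetical, the median for a fully missing pair is taken separately for each dimension using the lower median so that the value stays on the rating scale, and a value is left missing when no other participant shares the same pair value. The source does not specify these details.

```python
from statistics import median_low, mode
from typing import List, Optional, Tuple

Pair = Tuple[Optional[int], Optional[int]]  # (frequency, severity); None = missing


def impute_symptom(pairs: List[Pair]) -> List[Pair]:
    """Impute missing frequency/severity ratings for one symptom across participants."""
    complete = [(f, s) for f, s in pairs if f is not None and s is not None]
    observed_freq = [f for f, _ in pairs if f is not None]
    observed_sev = [s for _, s in pairs if s is not None]

    def partner_mode(known: int, known_index: int) -> Optional[int]:
        # Ratings on the missing dimension among participants whose known
        # dimension has the same value as this participant's.
        matches = [p[1 - known_index] for p in complete if p[known_index] == known]
        return mode(matches) if matches else None  # assumption: stay missing if no match

    result = []
    for f, s in pairs:
        if f is None and s is None:
            # Assumption: each dimension receives its own observed (lower) median.
            f = median_low(observed_freq) if observed_freq else None
            s = median_low(observed_sev) if observed_sev else None
        elif f is None:
            f = 0 if s == 0 else partner_mode(s, known_index=1)
        elif s is None:
            s = 0 if f == 0 else partner_mode(f, known_index=0)
        result.append((f, s))
    return result
```

Because the mode is taken only over participants who share the known pair value, the imputed rating follows the frequency-severity dependency that motivates the approach.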
2.4. Statistical Analysis
We used composite variables combining the frequency and severity of symptoms on a scale from 0 to 100, as that provided the most pertinent symptom information for developing a predictive model for a definition of Long COVID. Since our dataset was relatively small, the Least Absolute Shrinkage and Selection Operator (LASSO) analysis was trained using Leave One Out Cross-Validation (LOOCV), through which we generated 110 LASSO models. Because LASSO variable selection can be unstable in small samples, we averaged the coefficients for each symptom across these 110 models to create an average model with more stable estimates of predictor importance. This approach aligns with established methods in stability selection and model aggregation, which use repeated regularization fits to improve robustness in datasets with correlated predictors [15,16]. The LASSO coefficient is a regression coefficient that indicates the strength of a variable’s influence on the outcome. In the LASSO process, each symptom coefficient is penalized by a standardized value. Sometimes the penalty is larger than a particular coefficient’s value, in which case that coefficient is shrunk to zero and eliminated from the model. We also made a set of simplified integer scores for each symptom by multiplying their (small) LASSO coefficients by 1000, rounding the results to whole numbers, and then removing any symptoms whose integer scores were less than 1. We tested the model using a threshold based on the simplified integer scores.
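The aggregation and simplification steps described above (averaging coefficients over the leave-one-out fits, multiplying by 1000, rounding, and dropping scores below 1) can be sketched as follows. The LASSO fitting itself is omitted; the sketch assumes the per-fold coefficients are already available as dictionaries, treats a symptom absent from a fold as having a coefficient of zero, and uses hypothetical function names and made-up coefficient values.

```python
from typing import Dict, List


def average_coefficients(fold_coefs: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each symptom's coefficient over all folds (absent in a fold = 0.0)."""
    symptoms = sorted({name for coefs in fold_coefs for name in coefs})
    n = len(fold_coefs)
    return {s: sum(c.get(s, 0.0) for c in fold_coefs) / n for s in symptoms}


def integer_scores(mean_coefs: Dict[str, float], scale: int = 1000) -> Dict[str, int]:
    """Scale, round to whole numbers, and drop symptoms whose score is below 1."""
    scored = {s: round(c * scale) for s, c in mean_coefs.items()}
    return {s: v for s, v in scored.items() if v >= 1}


# Hypothetical coefficients from three folds (the study used 110 folds).
folds = [
    {"shortness_of_breath": 0.0061, "nose_congestion": 0.0018, "headache": 0.0003},
    {"shortness_of_breath": 0.0058, "nose_congestion": 0.0022},
    {"shortness_of_breath": 0.0060, "nose_congestion": 0.0020, "headache": 0.0003},
]
weights = integer_scores(average_coefficients(folds))
```

In this made-up example, the symptom whose averaged coefficient rounds to zero after scaling drops out, which mirrors how weakly contributing symptoms were removed from the simplified scale.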
Each LASSO coefficient score was multiplied by a participant’s composite score, which was calculated from the frequency and severity of their symptoms according to the DSQ. The resulting products were then summed for each participant, giving the participants a final score predicting their likelihood of having Long COVID. The participants’ final scores were put along an ROC curve to evaluate the best threshold for distinguishing participants with Long COVID from recovered controls. The threshold with the best accuracy was selected and used to formulate a definition of Long COVID.
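The scoring and threshold-selection steps above can be illustrated with a short sketch. It assumes the classification rule "total score at or above the threshold indicates Long COVID", considers only observed scores as candidate thresholds, and keeps the lowest threshold when several tie on accuracy. The function names and example scores are hypothetical.

```python
from typing import Dict, List, Tuple


def total_score(composites: Dict[str, float], weights: Dict[str, float]) -> float:
    """Sum of weight x composite score over the weighted symptoms."""
    return sum(w * composites.get(symptom, 0.0) for symptom, w in weights.items())


def best_accuracy_threshold(scores: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (threshold, accuracy) maximizing accuracy for the rule score >= threshold.

    labels: 1 = Long COVID, 0 = recovered control.
    """
    if not scores or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and of equal length")
    candidates = sorted(set(scores))
    best_threshold, best_accuracy = candidates[0], -1.0
    for threshold in candidates:
        correct = sum(int(s >= threshold) == y for s, y in zip(scores, labels))
        accuracy = correct / len(labels)
        if accuracy > best_accuracy:  # strict '>' keeps the lowest tied threshold
            best_threshold, best_accuracy = threshold, accuracy
    return best_threshold, best_accuracy
```

Sweeping candidate thresholds in this way corresponds to walking along the points of the ROC curve and selecting the one with the highest accuracy.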
3. Results
The participants, who averaged about 19 and a half years of age, were primarily white females who were not Latinx. There were no significant differences between the groups in age [Long COVID (M = 19.8, SD = 2.31) versus Recovered (M = 19.5, SD = 1.30), t(88.32) = −0.77, p = 0.45], race [Long COVID White (69.1%), Other (16.4%), Asian (7.3%), Black (7.3%) versus Recovered White (70.9%), Other (9.1%), Asian (12.7%), Black (7.3%), χ²(3, N = 110) = 1.97, p = 0.58], Latinx ethnicity [Long COVID (25.5%) versus Recovered (20%), χ²(1, N = 110) = 0.47, p = 0.50], or gender [Long COVID women (78.0%) versus Recovered women (64.2%), χ²(1, N = 110) = 2.39, p = 0.12].
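As a worked check of the group comparisons, the Pearson chi-square statistic can be recomputed from cell counts. The counts below are not reported in the text; they are inferred by applying the reported percentages to 55 participants per group, so they should be read as an assumption. The gender comparison is omitted because its counts cannot be recovered unambiguously from the percentages.

```python
from typing import List


def chi_square(table: List[List[int]]) -> float:
    """Pearson chi-square statistic (no continuity correction) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


# Assumed counts (rows: Long COVID, Recovered), inferred from percentages of 55.
latinx = [[14, 41], [11, 44]]          # columns: Latinx, not Latinx
race = [[38, 9, 4, 4], [39, 5, 7, 4]]  # columns: White, Other, Asian, Black

print(round(chi_square(latinx), 2))  # 0.47
print(round(chi_square(race), 2))    # 1.97
```

Under these inferred counts, both statistics agree with the values reported for the Latinx and race comparisons.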
Table 1 shows each of the symptoms with associated values created from their LASSO coefficients. Among the highest-scored symptoms (i.e., those that were most discriminatory between the two groups) were shortness of breath or trouble catching your breath, gastrointestinal symptoms, loss of/change in smell or taste, dizziness or fainting, heavy legs and/or swelling of legs, physically drained or sick after mild activity, nose congestion, muscle aches, vision problems, no appetite, and absent-mindedness or forgetfulness.
Figure 1 shows the means of composite scores for the key symptoms, on a 100-point scale, with higher scores indicating higher frequency/severity of symptoms.
Figure 2 provides the optimal threshold for identifying Long COVID: total scores of 530 or higher classified participants as having Long COVID with 90.91% accuracy, 89.09% sensitivity, and 92.73% specificity (see Table 2). This formula can thus be used as the basis for a case definition of Long COVID. The equation for the total score indicating the likelihood of having Long COVID is: Likelihood of Having Long COVID = (6) × (shortness of breath composite) + (5) × (gastrointestinal composite) + (3) × (loss smell and taste composite) + (2) × (dizziness composite) + (2) × (heavy legs composite) + (2) × (physically drained composite) + (2) × (nose congestion composite) + (1) × (muscle aches composite) + (1) × (vision problems composite) + (1) × (no appetite composite) + (1) × (absentmindedness composite).
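The formula above can be applied directly, as in the following sketch. The weights and the threshold of 530 are taken from the text; the dictionary keys, function names, and the choice to count an unreported symptom as 0 are illustrative assumptions.

```python
# Integer weights and threshold as reported in the text; key names are hypothetical.
LONG_COVID_WEIGHTS = {
    "shortness_of_breath": 6,
    "gastrointestinal": 5,
    "loss_smell_taste": 3,
    "dizziness": 2,
    "heavy_legs": 2,
    "physically_drained": 2,
    "nose_congestion": 2,
    "muscle_aches": 1,
    "vision_problems": 1,
    "no_appetite": 1,
    "absentmindedness": 1,
}
THRESHOLD = 530


def long_covid_score(composites: dict) -> float:
    """Weighted sum of 0-100 composite scores; symptoms not supplied count as 0."""
    return sum(w * composites.get(name, 0) for name, w in LONG_COVID_WEIGHTS.items())


def meets_definition(composites: dict) -> bool:
    """True when the total score reaches the reported threshold."""
    return long_covid_score(composites) >= THRESHOLD
```

Because the weights sum to 26 and each composite ranges from 0 to 100, the total score ranges from 0 to 2600. As an example, composites of 62.5 for shortness of breath and 50 for gastrointestinal symptoms give 6 × 62.5 + 5 × 50 = 625, which exceeds the threshold, whereas a maximal smell and taste composite on its own gives 300 and does not.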
4. Discussion
The current study found that by using data mining strategies, it is possible to achieve high accuracy in differentiating those with Long COVID from those who have recovered using precise measures of frequency and severity. In contrast to definitions of Long COVID that use occurrence measures, such as those of the National Academies of Sciences, Engineering, and Medicine [8], our study identifies key symptoms that best differentiate those who have recovered from SARS-CoV-2 infection from those who have not. When Thaweethai et al. [9] employed a narrower criterion to define Long COVID, they found that among their participants first infected on or after 1 December 2021, and enrolled within 30 days of infection, 10% were Long COVID positive at 6 months; however, among those not infected, 4.6% still met their case definition of Long COVID. Our study used more precise frequency/severity measures but was not able to determine what percentage of uninfected individuals would satisfy our formulaic definition of Long COVID, as we did not have an uninfected control group. Our study also utilized a high threshold for predicting Long COVID, at 530, which according to our ROC analysis was the threshold that yielded the highest accuracy; however, it did cause 6 of our 55 participants (10.9%) with Long COVID to be falsely classified as recovered controls.
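The reported accuracy figures can be related to classification counts with a short calculation. The 6 false negatives among 55 patients are stated in the text; the 51 true negatives and 4 false positives among the 55 controls are inferred from the reported 92.73% specificity and are therefore an assumption.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# tp and fn follow from the text (49 of 55 patients detected, 6 missed);
# tn and fp are inferred from the reported specificity (assumption).
metrics = classification_metrics(tp=49, fn=6, tn=51, fp=4)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.2f}%")
```

With these counts, 49/55, 51/55, and 100/110 correspond to the reported 89.09% sensitivity, 92.73% specificity, and 90.91% accuracy.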
Our study found that shortness of breath or trouble catching your breath, gastrointestinal symptoms, and loss of/change in smell or taste were the three highest-rated items for identifying Long COVID, with other high-scoring items consisting of autonomic domains (dizziness or fainting, heavy legs and/or swelling of legs, vision problems), post-exertional malaise (physically drained or sick after mild activity), a respiratory symptom (nose congestion), another gastrointestinal symptom (no appetite), muscle aches, and a cognitive item (absent-mindedness or forgetfulness). In contrast, Thaweethai et al. [9] found smell/taste and post-exertional malaise to be the highest-rated items, with chronic cough, brain fog, and thirst being the next highest-rated items. The importance of smell/taste was lessened in the current study, and gastrointestinal symptoms took on greater prominence, as was also seen in our previous analysis of patients with ME/CFS following infectious mononucleosis [17]. It appears that using more precise measures of frequency/severity rather than just using occurrence measures resulted in a different selection of symptoms as being the most important.
A key question involves how broad or narrow a case definition might be. Long COVID has more than 200 possible symptoms, and criteria that are very broad will have good sensitivity so that all those who have the condition will be identified. However, such an approach will have poor specificity, so many will be inaccurately diagnosed with Long COVID. If a person can meet Long COVID criteria by merely having a few minor symptoms for 3 months following COVID infection, the prevalence of Long COVID will be extremely high. For example, a large percentage of primary care patients with psychogenic causes have unexplained symptoms [18], and they might fit a broad case definition of Long COVID. Therefore, a broader case definition might also lead to the symptoms of those with Long COVID being incorrectly attributed to psychogenic causes.
Those with another post-viral illness, ME/CFS, have sometimes been re-traumatized by the reaction of healthcare workers, friends, and even family members to their disease [19]. This same type of stigma could occur for those with Long COVID. With ME/CFS, because about 20% of the general population experiences fatigue [20], it is not uncommon for people to feel their fatigue is comparable to ME/CFS, and if they can cope with their symptoms, they expect others to cope with what they believe to be similar symptoms. Yet these attitudes trivialize the experience of ME/CFS, because common fatigue is not the same as the debilitating fatigue (and other associated symptoms) of ME/CFS. The consequences are that 95% of individuals seeking medical treatment for ME/CFS report feelings of estrangement [21], 90% of patients with ME/CFS report delegitimizing experiences by physicians [22], and most cannot find a knowledgeable and sympathetic physician to care for them [23]. Avoiding similar trauma for patients with Long COVID should be a priority [24].
There are several methodological limitations to the current study. The sample size was small given the number of variables; however, we used leave-one-out procedures to help mitigate this issue. The highly specific demographic profile of our sample, predominantly young adult white females, limits the generalization of the findings to broader Long COVID populations. Finally, as mentioned above, we did not have an uninfected control group, and biological measures were not used in the current study. Hopefully, biomarkers will be identified in future studies to provide a better understanding of different presentation types of Long COVID and their relation to organ systems impacted by Long COVID.