Characterizing Risk Factors for Hospitalization and Clinical Characteristics in a Cohort of COVID-19 Patients Enrolled in the GENCOV Study

The GENCOV study aims to identify patient factors which affect COVID-19 severity and outcomes. Here, we aimed to evaluate patient characteristics, acute symptoms and their persistence, and associations with hospitalization. Participants were recruited at hospital sites across the Greater Toronto Area in Ontario, Canada. Patient-reported demographics, medical history, and COVID-19 symptoms and complications were collected through an intake survey. Regression analyses were performed to identify associations with outcomes including hospitalization and COVID-19 symptoms. In total, 966 responses were obtained from 1106 eligible participants (87% response rate) between November 2020 and May 2022. Increasing continuous age (aOR: 1.05 [95%CI: 1.01–1.08]) and BMI (aOR: 1.17 [95%CI: 1.10–1.24]), non-White/European ethnicity (aOR: 2.72 [95%CI: 1.22–6.05]), hypertension (aOR: 2.78 [95%CI: 1.22–6.34]), and infection by viral variants (aOR: 5.43 [95%CI: 1.45–20.34]) were identified as risk factors for hospitalization. Several symptoms including shortness of breath and fever were found to be more common among inpatients and tended to persist for longer durations following acute illness. Sex, age, ethnicity, BMI, vaccination status, viral strain, and underlying health conditions were associated with developing and having persistent symptoms. By improving our understanding of risk factors for severe COVID-19, our findings may guide COVID-19 patient management strategies by enabling more efficient clinical decision making.


Introduction
Since the beginning of the coronavirus disease 2019 (COVID-19) pandemic, a diverse spectrum of clinical presentations has been identified among individuals infected by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), ranging from asymptomatic infection to severe illness resulting in hospitalization or death [1]. Specific risk factors for COVID-19 severity have been described by numerous studies and include male sex, increasing age and body mass index (BMI), and underlying chronic health conditions [2][3][4][5][6][7][8].
Meta-analyses which have consolidated the findings of several studies have reported underlying health conditions such as hypertension, diabetes, respiratory disease, and cardiovascular disease to be associated with increased risk of severe illness or mortality [9,10]. However, there is considerable heterogeneity among risk factors identified between studies, which may be attributed in part to variability in study design and differences between populations. Moreover, many studies have only examined risk factors for severity within the context of hospitalized patients and have not compared these individuals to the majority who do not require treatment [2][3][4][7]. By characterizing individuals with milder illness, risk factors for severe illness leading to hospitalization can be identified more appropriately. COVID-19 symptom prevalence and persistence have also been characterized across many populations, and significant heterogeneity has been observed in both within- and between-country study populations [11][12][13]. Despite this heterogeneity, these independent studies have found that individuals with symptomatic illness most commonly report experiencing fever, cough, and fatigue. However, little work has been performed to assess whether patient factors contribute to the development or persistence of these symptoms. By identifying associations between patient factors and clinical manifestations of COVID-19, features of patients may be utilized to predict the progression of illness.
GENCOV is a prospective, observational cohort study of COVID-19-positive adults across the Greater Toronto Area in Ontario, Canada, which seeks to identify patient characteristics associated with differences in COVID-19 severity and patient outcomes. Here, our objective was to (1) identify patient characteristics associated with hospitalization from COVID-19 and (2) examine COVID-19 symptoms, their persistence, and associations with hospitalization and other patient factors. The limited generalizability of previous studies highlights the need for population-specific considerations when assessing COVID-19 patients and their clinical presentation and determining the likely course of their illness. By improving our understanding of risk factors for (1) hospitalization, (2) severe clinical presentations, and (3) persistent illness, clinical decision-making processes for COVID-19 patients can be refined to facilitate targeted and rapid patient care. Based on prior studies which have investigated risk factors for both symptomatic and more severe COVID-19 [2,3,14], we hypothesized that both intrinsic and extrinsic patient factors are associated with variable COVID-19 severity and that these same factors influence the presentation and persistence of specific clinical manifestations of acute COVID-19.

Participant Recruitment
Participants were recruited across the Greater Toronto Area in Ontario, Canada, at one of the following sites: Sinai Health System, University Health Network (including Toronto General Hospital and Toronto Western Hospital), William Osler Health System (including Brampton Civic Hospital and Etobicoke General Hospital), Mackenzie Health (including Cortellucci Vaughan Hospital and Mackenzie Richmond Hill Hospital), and Women's College Hospital. Hospitalized inpatients who were admitted with or from COVID-19 illness were prospectively recruited into the study. Outpatients were recruited from the emergency department or COVID-19 assessment centres at participating hospital sites. Study participants with a prior SARS-CoV-2 infection were retrospectively recruited using study fliers posted at participating hospital sites. Hospitalization status was confirmed at enrolment. Participants were enrolled between November 2020 and May 2022 based on the following inclusion criteria: (1) 18 years or older, (2) had a confirmed positive SARS-CoV-2 PCR test, and (3) provided informed consent to participate in the study.

Patient Data Collection by Intake Surveys
Upon enrolment, participants were provided with a link to an intake survey [15] hosted on NOVI Survey's web-based survey software (version 5.9.4510, 3rd Millennium Inc., Cambridge, MA, USA). The survey was developed and refined according to feedback provided by the GENCOV study team, consisting of 21 co-investigators and researchers across multiple healthcare disciplines. Participants completed the survey in English and were permitted to pause, resume, and modify responses prior to submission. Data were collected between 9 and 411 days (median: 86 days) post-COVID-19 diagnosis and stored on REDCap (version 11.4.2, Vanderbilt University, Nashville, TN, USA), a secure database hosted on Mount Sinai Hospital's server. Patient-reported information related to demographics, medical history, and COVID-19 symptoms (type and duration) was collected. Information related to COVID-19 vaccination was also obtained from participants, including the total number of doses received, the date of each dose, and dose manufacturers. With respect to patient medical histories, data related to 14 categories of health conditions (shown in Table A1) were collected. Participants who reported having any of the listed conditions were prompted to elaborate on their specific conditions in free-text format. A manual review of free-text responses was performed to correct entry errors or misclassification. Clinical conditions which could be categorized in more than one way (e.g., Hepatitis B infection as both a non-SARS-CoV-2 viral infection and a hepatic condition) were subcategorized (e.g., viral hepatitis) and were included under all relevant conditions to ensure responses were categorized consistently (Table A1). Participants were presented with a list of 23 COVID-19 symptoms and were asked whether they experienced each symptom, and if so, for how long. Symptom durations were reported to the nearest 1-week interval (e.g., <1 week, 1-2 weeks), with a maximum reportable duration of >8 weeks. 
Data from 47 participants who completed the survey within 8 weeks of COVID-19 positivity and reported ongoing symptoms were excluded from symptom duration analyses.
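As an illustration of how interval-reported durations of this kind might be prepared for analysis, the following Python sketch maps a response category to lower and upper bounds in weeks and flags responses that would be excluded from duration analyses. It is a minimal sketch rather than the study's own processing: the function names are hypothetical, the label format is assumed to follow the examples given above ("<1 week", "1-2 weeks", ">8 weeks"), and "within 8 weeks" is assumed to mean 56 days or fewer.

```python
from typing import Optional, Tuple


def duration_bounds(category: str) -> Tuple[float, Optional[float]]:
    """Map a duration label to (lower, upper) bounds in weeks.

    An upper bound of None marks a right-censored response (e.g. ">8 weeks").
    The label format is an assumption based on the examples in the text.
    """
    text = category.replace("\u2013", "-")  # tolerate an en dash in the label
    text = text.replace("weeks", "").replace("week", "").strip()
    if text.startswith("<"):
        return (0.0, float(text[1:]))
    if text.startswith(">"):
        return (float(text[1:]), None)
    low, high = text.split("-")
    return (float(low), float(high))


def excluded_from_duration_analysis(days_since_positive: int,
                                    symptoms_ongoing: bool) -> bool:
    """Flag surveys completed within 8 weeks of positivity with ongoing symptoms.

    Treating "within 8 weeks" as <= 56 days is an assumption of this sketch.
    """
    return days_since_positive <= 8 * 7 and symptoms_ongoing
```

Keeping the upper bound open for the ">8 weeks" category preserves the fact that the true duration is unknown beyond that point, which is the information a censored regression model needs.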

Collection of Viral Lineage Data
SARS-CoV-2 viral genomes were isolated from nasopharyngeal swabs collected at baseline and subsequently sequenced as per the study protocol [15]. Viral lineages which were incompletely sequenced were inferred by spike (S)-gene target failure (SGTF) and/or epidemiological data of circulating variants during the study period [16]. Deletion of amino acids 69 and 70 within the S gene of SARS-CoV-2, sometimes attributable to the N501Y mutation, can result in SGTF for some real-time reverse transcriptase polymerase chain reaction (RT-PCR) testing methods. A subset of study samples were tested using the TaqPath COVID-19 PCR (Thermo Fisher Scientific, Waltham, MA, USA), which demonstrates SGTF for SARS-CoV-2 viral types Alpha (lineage B.1.1.7) and Omicron (lineages B.1.1.529, BA.1, BA.2, XBB). SGTF was defined as non-detection of the S-gene target among samples that tested positive (cycle threshold < 37) for both the N-gene and ORF1ab-gene targets. In this study, SGTF was used as an indicator of presumptive Alpha or Omicron viral type in the absence of sequence data. This was accomplished by reviewing the epidemiological data for circulating variants at the time of sample collection using a publicly available Nextstrain database maintained by Public Health Ontario and populated using data generated by the Ontario COVID-19 Genomics Network [17,18]. Based on these data, the Alpha viral type was in circulation within the population between December 2020 and August 2021, while Omicron was in circulation beginning November 2021, based on the sample collection date. As there is no overlap between the circulation of these two viral types, collection date in conjunction with SGTF is a strong predictor of Alpha and Omicron viral types for their respective periods of circulation.
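The presumptive typing rule described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the study's laboratory pipeline: the function and constant names are hypothetical, a non-detected S-gene target is represented as `None`, and the exact day boundaries of each circulation window are assumed because the text reports them only to the month.

```python
from datetime import date
from typing import Optional

CT_CUTOFF = 37.0  # a target counts as positive when its cycle threshold is < 37

# Circulation windows; the day-level boundaries are assumptions of this sketch.
ALPHA_START = date(2020, 12, 1)
ALPHA_END = date(2021, 8, 31)
OMICRON_START = date(2021, 11, 1)


def target_positive(ct: Optional[float]) -> bool:
    """A target is positive when it amplified with Ct below the cutoff."""
    return ct is not None and ct < CT_CUTOFF


def is_sgtf(s_ct: Optional[float], n_ct: Optional[float],
            orf1ab_ct: Optional[float]) -> bool:
    """SGTF: S-gene target not detected while N and ORF1ab are both positive."""
    return s_ct is None and target_positive(n_ct) and target_positive(orf1ab_ct)


def presumptive_type(s_ct: Optional[float], n_ct: Optional[float],
                     orf1ab_ct: Optional[float],
                     collected: date) -> Optional[str]:
    """Return "Alpha" or "Omicron" for SGTF samples by collection date, else None."""
    if not is_sgtf(s_ct, n_ct, orf1ab_ct):
        return None
    if ALPHA_START <= collected <= ALPHA_END:
        return "Alpha"
    if collected >= OMICRON_START:
        return "Omicron"
    return None  # SGTF outside both circulation windows stays unassigned
```

Because the two windows do not overlap, the collection date alone decides between the two labels once SGTF is established.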
Further discrimination of SARS-CoV-2 variants was accomplished for a subset of samples using a multiplex real-time RT-PCR assay. The assay discriminates between Alpha and Beta/Gamma viral types by detecting site-specific mutations in the SARS-CoV-2 spike gene by identifying three targets: (1) N501 (wild-type) due to the presence of adenine at nucleic acid position 23063, (2) N501Y (mutation associated with Alpha, Beta, and Gamma viral types) due to presence of A23063T substitution, and (3) E484K (mutation associated with Beta and Gamma viral types) due to presence of G23012A substitution. Reactions demonstrating N501Y mutation in the absence of E484K mutation were designated Alpha, while those demonstrating N501Y and E484K were designated Beta/Gamma (further discrimination was not possible without sequencing).

Statistical Analyses
Pearson's chi-squared test or Fisher's exact test (for comparisons with fewer than 5 expected observations) was performed in R version 4.1.1 using the "chisq.test()" and "fisher.test()" functions to compare COVID-19 symptoms based on hospitalization status. The Mann-Whitney U test was performed in Stata 14 using the "ranksum" function to compare symptom durations between patient groups based on participants' hospitalization status. Symptoms were excluded from duration analysis if they were reported by fewer than 10 total respondents (e.g., seizures) or less than 10% of either patient group after stratification (e.g., seizures, hemoptysis, conjunctivitis, ear pain, skin rash).
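The choice between Pearson's chi-squared test and Fisher's exact test when comparing symptom prevalence by hospitalization status can be sketched for a single 2 × 2 table as follows. The study ran these tests in R; the Python functions below are hypothetical stand-ins written with the standard library only. The sketch omits the continuity correction that R's chisq.test() applies to 2 × 2 tables by default, so its chi-squared p-values may differ slightly from R's output.

```python
from math import comb, erfc, sqrt


def expected_counts(a, b, c, d):
    """Expected cell counts of the 2 x 2 table [[a, b], [c, d]] under independence."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    return [r * k / n for r in rows for k in cols]


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher p-value: sum of table probabilities no larger than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def pearson_chi2_p(a, b, c, d):
    """Pearson chi-squared p-value (1 degree of freedom, no continuity correction)."""
    observed = [a, b, c, d]
    expected = expected_counts(a, b, c, d)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return erfc(sqrt(chi2 / 2))  # upper tail of chi-squared with 1 df


def compare_prevalence(a, b, c, d):
    """Use Fisher's exact test when any expected count is below 5, else chi-squared."""
    if min(expected_counts(a, b, c, d)) < 5:
        return "fisher", fisher_exact_two_sided(a, b, c, d)
    return "chi2", pearson_chi2_p(a, b, c, d)
```

Here a and b would be the numbers of inpatients with and without a given symptom, and c and d the corresponding outpatient counts.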
When examining patient characteristics as risk factors for hospitalization, the following were controlled for as covariates based on previous evidence of being associated with COVID-19 severity and mortality [8][9][10]19]: sex (male vs. female), age, BMI, ethnicity (White/European vs. non-White/European), and underlying health conditions including hypertension, diabetes, and cardiovascular and pulmonary conditions. The following characteristics were analyzed as potential risk factors for developing and having persistent COVID-19 symptoms based on prior evidence of being predictive of symptomatic illness [20][21][22][23][24][25][26]: sex (male vs. female), age, BMI, ethnicity (White/European vs. non-White/European), viral strain (wild-type vs. variant), vaccination status (unvaccinated vs. vaccinated), and having any underlying health conditions.
Logistic regression was performed to model the relationship of several covariates with dichotomous outcomes including hospitalization and COVID-19 symptom prevalence. Logistic regression was performed in Stata 14 using the "logistic" function to fit maximum-likelihood dichotomous logistic models. Crude (OR) and adjusted (aOR) odds ratios with 95% confidence intervals were estimated for univariable and multivariable analyses, respectively.
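To make the relationship between logistic regression coefficients and the reported odds ratios concrete, the following Python sketch converts a coefficient and its standard error into an odds ratio with a Wald 95% confidence interval, rescales a per-unit odds ratio to a larger increment, and computes a crude odds ratio from a 2 × 2 table. The function names are hypothetical and the models in this study were fitted in Stata; the sketch only reproduces the arithmetic that links the quantities.

```python
from math import exp, log, sqrt

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_from_coefficient(beta, se):
    """Return (OR, lower, upper) from a logistic coefficient and its standard error."""
    return exp(beta), exp(beta - Z_95 * se), exp(beta + Z_95 * se)


def odds_ratio_for_increment(or_per_unit, units):
    """Odds ratio for a `units`-sized increase in a continuous covariate.

    With a linear term in the model, exp(units * beta) equals OR ** units.
    """
    return or_per_unit ** units


def crude_odds_ratio(exposed_cases, exposed_noncases,
                     unexposed_cases, unexposed_noncases):
    """Crude OR with a log-based (Woolf) 95% interval from a 2 x 2 table."""
    a, b, c, d = exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases
    estimate = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    return estimate, exp(log(estimate) - Z_95 * se), exp(log(estimate) + Z_95 * se)
```

For example, `odds_ratio_for_increment(1.05, 10)` gives roughly 1.63, so a per-year odds ratio of 1.05 corresponds to about 63% higher odds per decade of age. Because published odds ratios are rounded, such rescaled values are approximate.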
Interval regression analysis (a type of censored regression) was used to model the relationship between risk factors for symptomatic COVID-19 and the persistence of symptoms. Interval regression was performed in Stata 14 using the "intreg" function to fit linear models with an outcome measured as point data, interval data, left-censored data, or right-censored data. The duration of each symptom was analyzed as a separate outcome. The durations for seizures, conjunctivitis, and hemoptysis were excluded from multivariable regression analysis due to insufficient observations for each symptom. Regression coefficients (β) and 95% confidence intervals correspond to the estimated difference in symptom duration in weeks. For all multivariable regression analyses, only observations with complete data for all covariates were included in the final models. The statistical significance level was set at α = 0.05.
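For readers unfamiliar with interval regression, the following Python sketch shows the log-likelihood that such a model maximizes, assuming normally distributed errors and a single covariate. It is a conceptual illustration only: the function names and the one-covariate form are assumptions of this sketch, no optimizer is included, and the study's models were fitted with Stata's "intreg".

```python
from math import erf, exp, log, pi, sqrt


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def norm_pdf(z):
    """Standard normal density."""
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)


def loglik_contribution(lower, upper, mu, sigma):
    """Log-likelihood of one observation whose outcome lies in [lower, upper].

    lower=None means left-censored, upper=None means right-censored,
    and lower == upper means an exactly observed (point) value.
    """
    if lower is None and upper is None:
        raise ValueError("at least one bound is required")
    if lower is None:
        return log(norm_cdf((upper - mu) / sigma))
    if upper is None:
        return log(1.0 - norm_cdf((lower - mu) / sigma))
    if lower == upper:
        return log(norm_pdf((lower - mu) / sigma) / sigma)
    return log(norm_cdf((upper - mu) / sigma) - norm_cdf((lower - mu) / sigma))


def interval_loglik(observations, intercept, slope, sigma):
    """Total log-likelihood for (lower, upper, x) triples with mean intercept + slope * x."""
    return sum(
        loglik_contribution(lo, hi, intercept + slope * x, sigma)
        for lo, hi, x in observations
    )
```

In this formulation a symptom reported as lasting 1-2 weeks contributes the probability mass between 1 and 2, while a response of >8 weeks contributes the upper tail beyond 8. The slope then plays the role of the coefficient β, the estimated difference in duration (in weeks) per unit of the covariate.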

Participant Characteristics and Associations with Hospitalization
In total, 966 responses were obtained from 1106 eligible participants (87% response rate; Table 1). Most respondents were outpatients (94.4%), female (56.8%), of White or European ethnicity (50.1%), had at least one underlying medical condition (57.1%), and had no history of smoking (60.2%). The most prevalent non-White/European ethnicities represented among respondents included Middle Eastern (9.8%) and South Asian (7.8%) (Table A2). The median age of respondents was 43 years (IQR: 32-55 years), while the median body mass index (BMI) was determined to be 25.9 kg/m² (IQR: 22.9-29.1 kg/m²). Most respondents indicated that they were unvaccinated prior to having COVID-19 (69.1%), while only 25.3% of respondents were vaccinated. A summary of the number of vaccine doses, time of most recent vaccination, and specific vaccine combinations received by participants is presented in Table A3. The remainder of the participants (5.6%) did not disclose their vaccination status at enrolment. Approximately 28.7% of participants were determined to have been infected by wild-type SARS-CoV-2. The most prevalent viral variants detected among respondents were Alpha (20.8%), Omicron (7.1%), and Delta (6.6%) (Table A2). Viral lineages were undetermined for 336 (34.8%) participants due to the unavailability of viral swabs or complete sequencing data.
Older age, increasing BMI, non-White/European ethnicity, and infection by SARS-CoV-2 variants were associated with increased odds of hospitalization by univariable and multivariable analysis (controlling for covariates described above). Univariable analysis identified that vaccination prior to infection was associated with reduced odds of hospitalization (Table 1). This association could not be evaluated by multivariable analysis: vaccination status was not included in the multivariable model because all vaccinated participants with complete data were outpatients. Similarly, having any underlying health conditions was excluded from multivariable analysis given the collinearity with other predictors. Sex, having any non-specific underlying health conditions, and history of smoking were not associated with hospitalization risk.
The most prevalent health conditions among respondents included gastrointestinal disorders (14.7%), hypertension (12.7%), and endocrine disorders (12.1%) including diabetes (6.3%) (Table 2). Univariable analysis identified that hypertension, endocrine conditions including diabetes, cardiovascular conditions, cancer, lipid conditions, and hepatic conditions were significantly associated with increased risk of hospitalization. After controlling for covariates, hypertension was the only condition which remained significantly associated with hospitalization. Blood, gastrointestinal, pulmonary, autoimmune, neurologic/psychiatric, and renal conditions, as well as non-SARS-CoV-2 viral infections, were not associated with hospitalization. Hereditary genetic conditions were not assessed as a risk factor for hospitalization due to their low prevalence among the study cohort.

COVID-19 Symptom Prevalence, Duration, and Associations with Patient Characteristics
With respect to the 23 COVID-19 symptoms documented (Table 3), fatigue (80.0%), headache (64.8%), and muscle aches (63.7%) were the most frequently reported symptoms, while conjunctivitis (2.9%), cough with bloody sputum/phlegm (i.e., hemoptysis) (2.2%), and seizures (0.4%) were the most uncommon symptoms. The longest persisting symptoms included shortness of breath, loss of taste, and loss of smell, which lasted 3-4 weeks on average (Figure 1). Short-lived symptoms included fever, sore throat, diarrhea, and conjunctivitis, lasting <1 week on average. When stratified by hospitalization status, shortness of breath, fever, runny nose/nasal congestion, chest pain, wheezing, altered consciousness/confusion, abdominal pain, and vomiting/nausea were significantly associated with hospitalization (Table 3). Conversely, loss of smell was significantly more prevalent among outpatients. Furthermore, inpatients reported significantly longer-lasting symptoms compared to outpatients for fever, cough (productive and non-productive), sore throat, runny nose/nasal congestion, chest pain, muscle aches, joint pain, fatigue, shortness of breath, headache, and abdominal pain (Figure 2).
Viruses 2023, 15, 1764
Table 1. Summary of GENCOV participant characteristics, stratified by hospitalization status. p-values, crude odds ratios (ORs), adjusted odds ratios (aORs), and 95% confidence intervals were calculated from the logistic regression models. Variables which were controlled in the multiple logistic regression model included continuous age and BMI, sex, ethnicity (White/European vs. non-White/European), hypertension, diabetes, cardiovascular conditions, and pulmonary conditions. Percentages may not add up to 100% due to rounding.
Sex, BMI, age, ethnicity, viral strain, vaccination status, and having one or more underlying health conditions were independently associated with the development of one or more COVID-19 symptoms (Figure 3). Female sex was independently associated with greater odds of developing sore throat, runny nose/nasal congestion, wheezing, chest pain, headache, abdominal pain, and vomiting/nausea. Increasing BMI was independently associated with greater odds of developing productive and non-productive cough, wheezing, chest pain, shortness of breath, and diarrhea. Older age was independently associated with lower odds of developing productive and non-productive cough, sore throat, runny nose/nasal congestion, chest pain, headache, and loss of taste and smell. Non-White or non-European ethnicity was associated with greater odds of developing sore throat and conjunctivitis. Infection by SARS-CoV-2 viral variants was associated with lower odds of loss of taste and smell, and greater odds of fever, productive and non-productive cough, chest pain, vomiting/nausea, and skin rash. Vaccination prior to infection was associated with lower odds of developing productive cough, chest pain, shortness of breath, and diarrhea. However, vaccination was also associated with greater odds of experiencing runny nose/nasal congestion. Lastly, having one or more underlying health conditions was independently associated with greater odds of productive and non-productive cough, runny nose/nasal congestion, chest pain, fatigue, headache, altered consciousness/confusion, abdominal pain, and diarrhea. None of the covariates examined were associated with developing hemoptysis, ear pain, muscle aches, joint pain, or seizures.
Sex, BMI, age, ethnicity, viral strain, vaccination status, and having one or more underlying health conditions were similarly associated with differences in the duration of COVID-19 symptoms (Figure 4). Female sex was predictive of longer-lasting symptoms including runny nose/nasal congestion, chest pain, and muscle aches. Increasing BMI was also predictive of longer-lasting fever, muscle aches, joint pain, and fatigue, while older age was predictive of longer-lasting fever, productive and non-productive cough, fatigue, and abdominal pain. Non-White or non-European ethnicity was predictive of longer-lasting muscle aches, headache, and diarrhea. Infection by viral variants was predictive of shorter-lasting altered consciousness or confusion. Vaccination prior to infection was predictive of shorter-lasting fever and muscle aches and longer-lasting diarrhea. Lastly, having one or more comorbidities was associated with longer-lasting chest pain, muscle aches, joint pain, fatigue, and shortness of breath. None of the covariates were associated with differences in the duration of sore throat, ear pain, wheezing, vomiting/nausea, skin rash, or loss of taste or smell.
Figure 2. COVID-19 symptom duration (in weeks) stratified by hospitalization status. Comparisons of symptom duration between patient groups were performed using the Mann-Whitney U test. Symptoms reported by >10% of each patient group were included for analysis. n values and percentages indicated below each symptom reflect the total number and proportion of participants out of each patient group who experienced a given symptom. Percentages indicated on bars represent the proportion of symptomatic participants from each patient group who reported a given symptom duration.

… vaccinated), and underlying medical conditions (none (reference) vs. any). Adjusted odds ratios and 95% confidence intervals for each covariate were calculated using a multiple logistic regression model. Adjusted odds ratios correspond to the odds of experiencing a given symptom in the comparator group relative to the reference group (for categorical variables), or for a one-year increase in age or one-unit increase in BMI. Statistically significant associations are indicated by asterisks (*).

Discussion
Health service interruptions resulting from the COVID-19 pandemic have brought attention to the growing need for better strategies to prioritize and allocate resources effectively during periods of increased strain on healthcare systems. In the context of COVID-19, we propose that (1) identifying patient features which are predictive of illness requiring hospitalization and (2) characterizing clinical symptoms and complications associated with severe illness are critical to managing and improving patient care. Our study provides unique insights into improving the clinical decision-making process for COVID-19 patients by examining these measures collectively in a cohort of patients possessing diverse health histories, clinical characteristics, and SARS-CoV-2 viral variants.
We have identified several risk factors independently associated with hospitalization which corroborate the findings of previous studies. Specifically, patient factors including non-White/European ethnicity, increasing age, and higher BMI have similarly been shown to be associated with more severe illness and hospitalization. A systematic review and meta-analysis conducted by Sze et al. [27] found that ethnic minority groups (specifically individuals from Black and Asian communities) were at greater risk of COVID-19 infection compared to White individuals. Consequently, they also identified that Asian individuals were at greater risk of mortality, owing to increased transmission among members of this community due to various lifestyle and social factors. Similarly, Magesh et al. [28] found that members of racial and ethnic minority groups across 68 independent studies were at heightened risk of COVID-19 positivity and disease severity. In a review article published by Gao et al. [8], older age and BMI (as it relates to obesity) were highlighted as prominent risk factors for disease severity across a series of multivariable-adjusted analyses. Studies comparing illness severity resulting from infection by SARS-CoV-2 variants of concern have found variants including Alpha (B.1.1.7) and Delta (B.1.617.2) to be associated with a greater risk of poorer outcomes relative to the wild-type virus [29,30]. As the Alpha variant was the most highly represented SARS-CoV-2 variant in our cohort, the significant association observed with respect to hospitalization is congruent with other studies. Similar to our study, studies conducted by Buchan et al. [31] and Moghadas et al. [32] have shown that vaccination drastically reduces the risk of adverse COVID-19 outcomes including hospitalization, even in cases of infection by SARS-CoV-2 variants such as Delta and Omicron.
In contrast to other studies [9,10], associations identified between specific underlying health conditions (e.g., endocrine conditions including diabetes, cancer) and hospitalization were no longer significant after controlling for covariates, with the exception of hypertension. Our findings suggest that other patient characteristics, such as age and BMI, are the greatest risk factors for hospitalization among participants of our study. Moreover, unlike other studies [33], we did not observe any significant difference in the risk of hospitalization between sexes. The incongruence of these findings highlights how certain risk factors for COVID-19 severity may be population-specific and dependent on study design. For example, differences in health behaviours between sexes are suggested to play a role in the severity and outcomes of COVID-19. Handwashing, masking, and adherence to public health guidance are more likely to be followed by females, while males have been reported to be more likely to delay accessing healthcare resources [34,35]. Some studies have also hypothesized that social factors including gender-linked occupations and structural exposures (e.g., incarceration and homelessness) may contribute to sex disparities in COVID-19 outcomes [35]. As a result, whether these social determinants are present is likely to be population-specific and may contribute to differences in risk factors identified between study populations. Similarly, health conditions which are more prevalent among individuals of a certain population may play a greater role in patient outcomes and consequently may be identified as significant risk factors for COVID-19.
In addition to patient characteristics, COVID-19 symptoms are also suggested to have prognostic value [36]. Symptoms including shortness of breath and fever were shown to be associated with hospitalization, and they have similarly been found to be common clinical presentations among those with severe illness by other studies [12,37]. Conversely, loss of smell was identified as the only symptom more prevalent among outpatients of our study and is suggested to be an early predictor of mild COVID-19 given its potential link to rapid antiviral responses within the nasal epithelium [38].
The associations identified between participant characteristics and specific symptoms provide novel insights into risk factors for developing or having persistent COVID-19 symptoms. Female sex was associated with greater odds of developing many of the symptoms examined, which is supported by previous studies. For example, Lhendup et al. found that males were 64% less likely to be symptomatic compared to females [20]. Similarly, our findings provide supporting evidence that individuals with comorbidities, individuals with higher BMI (specifically those who are clinically overweight or obese), and those who are non-White or non-European are more likely to develop symptomatic illness and experience a greater number of symptoms over the course of their illness. Yu et al. [26] found that the presence of comorbidities such as hypertension was predictive of symptomatic progression among patients who were asymptomatic at admission. Similarly, Cheng et al. [22] found that individuals classified as overweight or obese tended to experience a greater number of severe symptoms including shortness of breath relative to those who were not classified as overweight or obese. In a study conducted by Patel and colleagues [39], non-Hispanic Black and Hispanic participants were found to be more frequently affected by symptoms compared to non-Hispanic White participants. The associations we identified between the development of specific symptoms and SARS-CoV-2 viral variants are consistent with differences in the clinical presentation of patients infected by SARS-CoV-2 viral variants characterized previously [24]. We also found that COVID-19 vaccination prior to infection was associated with lower odds of developing more severe symptoms (e.g., shortness of breath) and greater odds of developing mild symptoms (e.g., nasal congestion), which provides evidence in support of the protection conferred by vaccines against severe symptomatic infection [25].
As we found age to be a risk factor for hospitalization, the independent associations observed between increasing age and lower odds of developing specific COVID-19 symptoms were unexpected. Indeed, previous studies have reported that older individuals are at greater risk of developing symptomatic COVID-19 [21]. However, as the older individuals examined in this study only included those who survived their illness (most of whom were outpatients and never hospitalized), our observations are consistent with favourable illness outcomes. Our findings suggest that there may be other factors (e.g., serological or genetic differences) which we have not assessed among older participants that aid in protecting against developing specific symptoms relative to younger adults. Additionally, as we assessed age as a continuous variable, the association identified between increasing age and lower symptom prevalence is relevant to the entire age range of all participants and not necessarily limited to elderly patients.
With respect to COVID-19 symptom duration, studies of long COVID-19 have similarly found that the most persistent symptoms include fatigue, shortness of breath, and loss of smell [40,41]. These studies have also identified that female sex, higher BMI, belonging to an ethnic minority, and underlying comorbidities are risk factors for long COVID-19, while variability has been observed with respect to age. Taken together, our findings provide evidence of risk factors for symptomatic and persistent illness. The significant associations we have identified suggest that both patient characteristics and clinical presentation should be used to inform decisions for patient care.
As most study variables were patient-reported measures, our study has a few limitations. Since the survey was conducted online and only in English, many of the collected variables of interest were dependent on participants' ability and willingness to disclose all information to the study team (i.e., ability to access, comprehend, and complete the survey). Consequently, our data collection methods may have excluded or deterred non-English-speaking persons or non-native English speakers from participating in the survey. Moreover, as many of our outcomes of interest were patient-reported and obtained retrospectively, we recognize that this may be a potential source of bias in our findings. Data which may be affected by recall bias include outcomes related to medical histories and COVID-19 symptoms. As participant recruitment spanned approximately a year and a half, the availability of COVID-19 vaccines and therapeutics improved over the course of the study period [42]. Conversely, the general accessibility of healthcare resources may have fluctuated depending on the level of strain on healthcare systems throughout various waves of the pandemic. Each of these factors may have contributed to differences in patients' COVID-19 illness progression and outcomes. However, the impact of these factors is highly context- and patient-specific (e.g., the capacity of any given hospital, or the use of COVID-19 therapeutics), and as a result, we were unable to thoroughly assess their effect on our outcomes of interest. Similarly, we did not assess other non-physiological patient factors such as COVID-19 vaccine hesitancy attitudes and public health misinformation which may be related to COVID-19 severity and outcomes [43][44][45]. However, these factors were unlikely to have affected or biased recruitment, as most inpatients and outpatients were enrolled within the same period between January 2021 and July 2021 (Figure A1).
Due to the limited availability of data for deceased participants enrolled in the study, these individuals were not included in our analyses. Consequently, further stratification by clinical course (e.g., ICU admission) or outcomes (e.g., death) was not performed. Additionally, due to the complexity of COVID-19 vaccination among participants of our study, we were limited in our ability to examine associations between different vaccine combinations and the outcomes of interest. As different vaccine regimens and more recent vaccination have been shown to impact symptomatic COVID-19 [46], additional differences may exist in the clinical characteristics between vaccinated participants which we were unable to account for.

Conclusions
In summary, in a diverse cohort of COVID-19-positive adults residing in Ontario, Canada, we have identified (1) risk factors associated with hospitalization and (2) predictors of COVID-19 symptom development and persistence. Specifically, we have shown that intrinsic patient factors including age, ethnicity, BMI, and comorbidities such as hypertension and extrinsic factors including viral strain are risk factors for hospitalization. These findings are congruent with those of previous studies and lend support to the body of evidence surrounding risk factors for COVID-19 severity. More importantly, however, we have shown that these same patient factors, in addition to other factors including sex and vaccination status, are associated with the differences in the prevalence and persistence of specific clinical manifestations of COVID-19. These findings may be utilized to provide insight into the likely course of one's illness based on patient characteristics. Furthermore, the findings of this study may aid in improving COVID-19 patient management strategies which will enable more efficient clinical decision making.

Funding:
The authors disclose receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Canadian Institutes of Health Research (Funding Reference Number VR4-172753, VS1-175526, VS2-175572).

Institutional Review Board Statement:
The study protocol was approved by the Mount Sinai Hospital research ethics board (Study ID: 424901).

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study. Written informed consent has been obtained from the patients to publish this paper.

Data Availability Statement:
The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.

Figure A1. GENCOV participant recruitment period, stratified by patient hospitalization status. The proportion of participants recruited on any given date is expressed as a percentage (%).