Thrombophilia and Immune-Related Genetic Markers in Long COVID

To evaluate the role in long COVID of ten functional polymorphisms involved in major inflammatory, immune response and thrombophilia pathways, we investigated a cross-sectional sample of 199 long COVID (LC) patients and a cohort of 79 COVID-19 patients whose follow-up for over six months did not reveal any evidence of long COVID (NLC). The ten functional polymorphisms, located in thrombophilia-related and immune response genes, were genotyped by real-time PCR. In terms of clinical outcomes, LC patients presented a higher prevalence of heart disease as a preexisting comorbidity. In general, the proportions of symptoms in the acute phase of the disease were higher among LC patients. The genotype AA of the interferon gamma (IFNG) gene was observed at a higher frequency among LC patients (60%; p = 0.033). Moreover, the genotype CC of the methylenetetrahydrofolate reductase (MTHFR) gene was also more frequent among LC patients (49%; p = 0.045). Additionally, the frequencies of LC symptoms were higher among carriers of the IFNG genotype AA than among carriers of non-AA genotypes (Z = 5.08; p < 0.0001). Thus, two polymorphisms, one in an inflammatory pathway and one in a thrombophilia pathway, were associated with LC, reinforcing the role of both pathways in LC. The higher frequencies of acute phase symptoms among LC patients and the higher frequency of underlying comorbidities might suggest that acute disease severity and the triggering of preexisting conditions play a role in LC development.


Introduction
Since 2020, the coronavirus disease 2019 (COVID-19) has constituted one of the greatest challenges in global public health, reaching over 600 million confirmed cases and a death toll of more than 6 million by the end of 2022. The most relevant pathogenic pathways in the evolution of the disease are the inflammatory cytokine storm and thrombophilia events [1][2][3][4].

Study Design and Ethical Aspects
The present study comprised two sample groups: a cross-sectional group of 199 long COVID patients and a cohort of 79 COVID-19 patients whose follow-up for over six months did not reveal any evidence of long COVID. Both sample groups were selected from a larger sample according to rigorous inclusion/exclusion criteria, as described below.
All patients in both sample groups had their diagnosis of COVID-19 confirmed by RT-PCR, with clinical symptoms and recovery information obtained from medical records. The same clinical parameters and multiprofessional health approaches were used to evaluate and classify all patients. Sampling was performed between July 2020 and December 2021 and included residents of Belém (Pará, Brazil) of both sexes, over 18 years old and unvaccinated during the study period. Additionally, no reinfection was detected or reported in either sample group. The severity of acute COVID-19 was evaluated according to WHO criteria [13], based on information from medical records.
The non-long COVID sample (NLC) consisted of 76 patients who had mild acute COVID-19, with no need for hospitalization or supplemental oxygen, and three patients who were asymptomatic but had SARS-CoV-2 infection confirmed by real-time PCR. Patients with severe and moderate COVID-19 were excluded in order to match this sample more closely with the long COVID sample. Only patients whose medical records and multiprofessional follow-up allowed the exclusion of any new sign or symptom that could be attributed to long COVID were included.
The long COVID sample (LC) consisted of patients screened by the Comprehensive Health Care Program for Patients with long COVID of the State University of Pará. Among the patients attending this service, 199 were selected based on the following criteria: (i) reporting long COVID symptoms and sequelae for over three months after the acute infection, with symptoms and sequelae evaluated and confirmed by a multiprofessional team composed of physiotherapists and specialized physicians, and by imaging and laboratory exams; (ii) SARS-CoV-2 RNA detection by RT-PCR and complementary exams; (iii) mild acute COVID-19, in order to avoid confounding long COVID signs and symptoms with other sequelae, such as those from intensive care syndrome [9].
Personal, demographic and clinical data were collected using encrypted Google Forms™ and stored on computers with access controlled by individual passwords.
The present study was approved by the National Ethics Committee (CAEE: 33470020. 1001.0018; protocol number nº 2.190.330). All participants provided written informed consent. This study was conducted in strict accordance with the principles of the Declaration of Helsinki and followed the guidelines for reporting observational studies, specifically the STrengthening the REporting of Genetic Association studies (STREGA) recommendations [14].
Variables used for describing and subgrouping of the final sample were age, sex, main symptoms presented during the acute phase of COVID-19 and LC, duration of symptoms (LC) and severity of the acute phase of COVID-19.

Sample Processing and Genotyping
DNA was isolated from venous blood samples (4 mL) collected using EDTA as the anticoagulant. DNA isolation was performed using the ReliaPrep™ Blood gDNA Miniprep System kit (Promega), following the protocol recommended by the manufacturer.

Statistical Analyses
All SNPs were tested for Hardy-Weinberg equilibrium. The genotype and allele frequencies of each SNP were estimated by direct count. Allele and genotype frequencies were compared between the LC and NLC groups using the Fisher exact test. The frequencies of symptoms were compared between carriers of different genotypes using the paired Wilcoxon test.
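As an illustration of the first two tests named above, the following is a minimal Python sketch and not the software used in this study. The function names are hypothetical, the Hardy-Weinberg check is assumed to be a chi-square goodness-of-fit test with one degree of freedom, and the Fisher test is assumed to be two-sided on a 2x2 table (for example, one genotype versus all others, in LC versus NLC).

```python
from math import comb

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic (1 d.f.) for deviation from Hardy-Weinberg proportions.

    Assumes both alleles are present, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # allele frequency by direct count
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
```

The inputs would be the genotype counts per group (Supplementary Table S1); they are not reproduced here, so no numerical result from the study is implied by this sketch.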
Correction for multiple tests is usually applied to genes with several alleles, because a single hypothesis is then tested with many tests. In contrast, multiple genes represent one hypothesis per gene. However, some studies, such as GWAS, apply corrections across multiple genes. The crux of this issue is how conservative the authors want to be in their conclusions. Thus, we opted to present the raw test results without correction, allowing readers to draw their own conclusions.
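For readers who prefer a conservative reading, the sketch below applies a Bonferroni correction to the two raw Fisher p-values reported in this article. Treating the ten SNPs as ten tests is an assumption made here for illustration only, since no correction was applied in this study; the function name is hypothetical.

```python
def bonferroni(p_values, n_tests, alpha=0.05):
    """Return {name: (adjusted_p, significant)} under a Bonferroni correction."""
    result = {}
    for name, p in p_values.items():
        adjusted = min(1.0, p * n_tests)  # adjusted p-value, capped at 1
        result[name] = (adjusted, adjusted < alpha)
    return result

# Raw p-values as reported in the Results; n_tests = 10 is an assumption
reported = {"IFNG rs2430561": 0.033, "MTHFR rs1801133": 0.045}
adjusted = bonferroni(reported, n_tests=10)
```

Under this assumption the per-test threshold is 0.05/10 = 0.005, so neither Fisher result would remain below it, whereas a p-value below 0.0001 would.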

Results
Demographic and clinical characteristics of the samples are presented in Table 1. Females predominate in both samples, as do ages under 60 years, and the proportion of females is higher among LC patients than in the general population. In terms of clinical characteristics, LC patients presented a higher prevalence of heart disease as a preexisting comorbidity (Fisher exact test; p = 0.001), while the proportions of the remaining comorbidities were similar to those observed in the NLC sample. Moreover, the proportions of symptoms in the acute phase of the disease were in general higher among LC patients (Wilcoxon test; Z = 2.9; p = 0.0032), fatigue being by far the most frequent (observed in 53% of the patients), followed by anosmia/hyposmia/parosmia (29%). The exclusion of asymptomatic patients did not change the statistical significance observed.
Complete genotype and allele frequencies of all ten SNPs, in both LC and NLC groups, are presented in Supplementary Table S1. Two SNPs, rs2430561 (Interferon Gamma, IFNG) and rs1801133 (Methylenetetrahydrofolate Reductase, MTHFR), showed statistical differences between LC and NLC groups, as highlighted in Figure 1. The exclusion of the asymptomatic patients did not alter these results.
The genotype AA of the IFNG gene, associated with lower expression, was observed at a higher frequency among long COVID patients (60%), and this difference was statistically significant (Fisher exact test; p = 0.033).
Moreover, the genotype CC of the MTHFR gene, associated with higher expression, was also more frequent among long COVID patients (49%), and this difference was also significant (Fisher exact test; p = 0.045).
According to data from studies conducted prior to the COVID-19 pandemic, the genotypes AA of IFNG and CC of MTHFR showed frequencies of 56.8% and 42.1%, respectively, in the population of Belém (Figure 1).
Additionally, the frequencies of long COVID symptoms were compared between carriers of the IFNG genotype AA and carriers of non-AA genotypes across all symptoms by paired Wilcoxon test (Figure 2), showing significantly higher frequencies of symptoms among AA carriers (Z = 5.08; p < 0.0001). However, the frequencies of symptoms were not statistically different between carriers of the MTHFR CC genotype and carriers of non-CC genotypes (Figure 3).
Figure 2. Comparison of frequencies of the main symptoms (%) during long COVID among AA and AT+TT genotype carriers (Paired Wilcoxon test; Z = 5.08; p < 0.0001).
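The paired comparison across symptoms can be sketched as a Wilcoxon signed-rank test in which each symptom contributes one pair of frequencies (AA carriers versus non-AA carriers). The Python sketch below is an illustration, not the software used in this study: the function name is hypothetical, and it assumes that the reported Z is the normal approximation of the signed-rank statistic, without tie or continuity corrections.

```python
from math import sqrt, erfc

def wilcoxon_signed_rank_z(x, y):
    """Paired Wilcoxon signed-rank test; returns (z, two-sided p).

    Uses the normal approximation and requires at least one nonzero difference.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences dropped
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average rank shared by tied |differences|
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)  # sum of positive ranks
    mean = n * (n + 1) / 4
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p = erfc(abs(z) / sqrt(2))  # two-sided p from the standard normal
    return z, p
```

Here `x` and `y` would hold the per-symptom frequencies (%) in the two genotype groups, as plotted in Figure 2; a positive z indicates higher frequencies in the first group.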

Discussion
The female sex was more frequent among LC patients than among NLC patients. This result is in agreement with previous studies that suggested an association of long COVID with the female sex [25][26][27][28]. Additionally, our results showed that LC occurred predominantly among patients younger than 60 years, which was also reported in previous studies [29,30].
Cardiac disease was the preexisting comorbidity that was more prevalent among LC than among NLC patients. Some studies showed that preexisting comorbidities might potentiate or increase the risk of prolonged symptoms associated with long COVID [13,29]. In this context, there is evidence that COVID-19 patients show an increased risk of cardiovascular disorders, even in the absence of previous heart disease [31], providing a link between cardiovascular disease development and long COVID. The frequencies of symptoms during the acute phase of COVID-19 were clearly higher in the LC group. These results could represent a putative association with COVID-19 severity, in agreement with previous studies that associated the severity of the acute phase with persistent symptoms [2].
An additional point is that cardiovascular disease has been widely associated with the T allele of rs1801133 at the MTHFR locus. Thus, the higher frequency of cardiac disease among LC patients is unlikely to be a consequence of the higher frequency of the CC genotype in this group [32].
Polymorphisms of the MTHFR gene have been reported in association with several diseases, including cardiovascular diseases, thrombophilia predisposition, inflammatory disorders and even cancer [32]. Regarding infectious diseases, mutations in this gene could be associated with important protozoan infections such as malaria and leishmaniasis [33,34], as well as with viral diseases such as human papillomavirus [35], cytomegalovirus, HIV and Crimean-Congo hemorrhagic fever [36][37][38]. While most studies indicate predisposition due to the presence of the MTHFR *T allele, at least one study reported protective effects of this allele against persistent HBV infection in West Africa [39].
An association of this polymorphism with COVID-19 severity has been suggested by a meta-analysis based on the correlation of T allele frequencies with COVID-19 mortality [40]. However, to date, no studies have investigated the role of MTHFR mutations in long COVID. Our results are the first to provide initial clues on the relationship of impaired folate-mediated one-carbon metabolism with long COVID.
However, carriers of the MTHFR predisposing genotype did not differ from non-carriers in the frequencies of long COVID symptoms. This result, along with the marginal significance of the Fisher exact test (p = 0.045), may indicate a spurious association.
After SARS-CoV-2 infection, a persistent inflammatory response could be detected for about 40-60 days, even among patients with mild and asymptomatic COVID-19 [3]. In this context, long COVID is assumed to be related to residual inflammation and tissue damage in association with preexisting comorbidities [41]. Indeed, a previous study from our group suggested a molecular signature of Th17 inflammatory profile with a decrease in IL-4 and IL-10 anti-inflammatory cytokine levels [11].
Following a simple logic, the genotype associated with low expression of IFNG would lead to low plasma levels of IFN-γ. If this polymorphism is associated with long COVID, low plasma levels of IFN-γ would have been expected in our previous study [11], but this was not the case. However, the present study included only LC patients who had mild acute COVID-19, while the previous cytokine study also included patients who had severe acute COVID-19. Thus, the two studies are not directly comparable. Moreover, during the development of a Th17 immune response pattern, the dynamics of IFN-γ production are not linear. It is known that in some situations IFN-γ can negatively regulate Th17-mediated immunopathology [42]. Thus, low IFNG expression can be associated with initial Th17 profile development. Moreover, pathogen-induced Th17 cells are also able to produce IFN-γ afterwards [43]. In this scenario, during an infection followed by Th17 profile establishment, early and late IFN-γ plasma levels would not necessarily be similar.
In conclusion, the association of long COVID with the interferon gamma gene polymorphism seems to be a valuable clue for understanding the underlying mechanisms. Not only was the low expression AA genotype associated with long COVID, but long COVID symptoms were also more frequent among AA genotype carriers than among carriers of the remaining genotypes. Thus, given the high significance, which persists even after correction for multiple tests, we consider that rs2430561, in the IFNG gene, may be an important direct or indirect marker for long COVID.
Interferon gamma is a key factor in viral infections, involved in several immunological pathways, such as antigen processing and presentation, apoptosis, antiviral mediators, lysosome-mediated killing/phagosome maturation and the complement pathway, among others [44]. Thus, genotypes that modulate IFN-γ expression could influence the persistence of COVID-19 symptomatology.
Despite the scarcity of studies on long COVID host genetics, some aspects of IFNG reveal interesting links between inflammatory pathways related to long COVID. Indeed, IFN-γ was identified as an important mediator controlling the levels of sortilin-1 (Sort-1), a receptor of the VPS10p family associated with cardiovascular disease [45]; IFN-γ reduces Sort-1 levels through the JAK/STAT pathway.
Interestingly, Sort-1 is associated with several diseases, including inflammation syndromes [46]. Thus, the IFN-γ low expression genotype could result in weaker Sort-1 inhibition, predisposing to the inflammatory profiles underlying long COVID pathogenesis.
Moreover, by analogy with other viral infections that present long-lasting inflammatory/immune-based diseases, chronic retroviral infections showed higher IFN-γ levels associated with inflammatory symptoms [47,48]. However, in such diseases, the viral infection persists along with the inflammatory symptomatology. Alternatively, some viral infections, such as hemorrhagic fevers, can present immune-based disease after clearance of the viral infection. In this context, a gene expression study highlighted the role of IFN-γ in protection against hemorrhagic dengue fever [49], in agreement with our results, which suggest that low expression genotypes predispose to long COVID and that higher expression genotypes are protective.
Despite presenting promising genetic associations, our study has limitations as well as strengths. Future studies with larger samples and rigorous follow-up and monitoring of long COVID patients are needed to evaluate long COVID evolution patterns and their putative host genetic basis. The value of our results resides in the first multigenic approach, with a rational choice of candidate polymorphisms, in samples well delimited for absence or occurrence of long COVID and controlled for acute disease severity. Even with the limitation of sample size, it was possible to detect associations that will guide future studies.

Conclusions
The present study identified, among ten candidate genes, two polymorphisms associated with long COVID, one in each of the major inflammatory and thrombophilia pathways. Knowledge of the genetic basis of long COVID triggering is scarce, and our study is among the first to address the underlying host genetic factors of the disease. The results provide valuable clues for future studies, such as the evaluation of homocysteine plasma levels, and begin to reveal the extensive complexity of the long-lasting symptomatology of COVID-19.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/v15040885/s1, Table S1. Genotypic and allelic frequencies of genetic markers in relation to acute COVID-19 (NLC) patients, long COVID patients (LC) and the general population in Belém (BEL; prior to the pandemic).

Informed Consent Statement: All participants were informed about the study objectives and signed an informed consent form. The collected biological samples were stored in a biorepository until use.

Data Availability Statement:
The raw data supporting the conclusions of this article will be made available by the authors without undue reservation.