C9orf72 Intermediate Repeats Confer Genetic Risk for Severe COVID-19 Pneumonia Independently of Age

A cytokine storm, autoimmune features and dysfunctions of myeloid cells significantly contribute to severe coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. The genetic background of the host seems to be partly responsible for the severe phenotype, and genes related to the innate immune response seem to be critical host determinants. The C9orf72 gene has a role in vesicular trafficking, autophagy regulation and lysosome functions, is highly expressed in myeloid cells and is involved in immune functions, regulating the lysosomal degradation of mediators of innate immunity. A large non-coding hexanucleotide repeat expansion (HRE) in this gene is the main genetic cause of frontotemporal dementia (FTD) and amyotrophic lateral sclerosis (ALS), both characterized by neuroinflammation and high systemic levels of proinflammatory cytokines, while HREs of intermediate length, although rare, are more frequent in autoimmune disorders. C9orf72 full mutation results in haploinsufficiency, and intermediate HREs also seem to modulate gene expression and impair autophagy. Herein, we sought to explore whether intermediate HREs in C9orf72 may be a risk factor for severe COVID-19. Although we found intermediate HREs in only a small portion of 240 patients with severe COVID-19 pneumonia, the magnitude of risk for requiring non-invasive or mechanical ventilation conferred by harboring intermediate repeats >10 units in at least one C9orf72 allele was more than twice that associated with shorter expansions, when adjusted for age (odds ratio (OR) 2.36; 95% confidence interval (CI) 1.04–5.37, p = 0.040). The association between intermediate repeats >10 units and more severe clinical outcome (p = 0.025) was also validated in an independent cohort of 201 SARS-CoV-2 infected patients. These data suggest that C9orf72 HREs >10 units may influence the pathogenic process driving more severe COVID-19 phenotypes.


Introduction
A large non-coding hexanucleotide repeat expansion (HRE) in the C9orf72 gene (>30 up to 1000 units) is the main genetic cause of frontotemporal dementia (FTD) and amyotrophic lateral sclerosis (ALS) [1,2], both characterized by neuroinflammation and high systemic levels of interleukin-6, interleukin-1β and tumor necrosis factor-α [3]. Healthy people harbor alleles ranging from 2 to 30 repeat units, but a real cut-off has not been determined and HREs of intermediate length (9-30 units), although rare, seem to be more frequent in neurodegenerative, neuropsychiatric and autoimmune disorders [4][5][6][7][8][9][10][11][12]. Gain of functions linked to the presence of the large HRE, resulting in nuclear RNA foci and cytoplasmic aggregation of dipeptide repeat proteins, are the main pathogenic mechanisms of neurodegeneration in FTD and ALS, but C9orf72 haploinsufficiency is assumed to play a role in the underlying neuroinflammation [13].
C9orf72 is ubiquitously expressed in the body with highest levels in myeloid cells. This gene is also differentially expressed with regard to the type of transcript among the 3 described variants and the use of differential transcription start sites for each transcript variant in the brain and myeloid cells, suggesting cell and/or tissue specific functions [25]. Considering the crucial role of autophagy in inflammation and immunity [26], these observations opened the possibility that C9orf72 loss of function might affect not only neurons but also the innate immune system [25]. Complete loss of the gene in C9orf72 −/− knock-out mice results in the release of proinflammatory cytokines, splenomegaly, lymphadenopathy and production of autoantibodies, indicating the appearance of autoinflammation and autoimmunity [27][28][29]. Importantly, even hemizygous C9orf72 +/− mice show an altered inflammatory response, suggesting that haploinsufficiency could also lead to unbalanced immunity in mice. More recent work has corroborated these findings, showing that a defective C9orf72-SMCR8-WDR41 complex in murine myeloid cells causes prolonged Toll-like receptor (TLR) signaling and hyperactive type I interferon (IFN) response, due to the disrupted degradation of stimulator of IFN response cGAMP interactor 1 (STING) [30,31].
As stated above, C9orf72 full mutation results in haploinsufficiency, observed in blood cells and post-mortem brains and spinal cord of ALS/FTD patients [1,32]. C9orf72 HREs of intermediate length also seem to modulate gene expression. Expansions of more than 8 repeats mainly occur within a 110 kb FTD/ALS risk haplotype that is more common in individuals of Northern European ancestry [33]. This haplotype was found to be associated with slightly higher expression of C9orf72 transcript variants 1 and 3 (both having the HRE region within intron 1) and lower expression of the most abundant transcript variant 2 (with the HRE located in the promoter), with a more marked effect in the case of homozygosity for the risk haplotype [25]. Similar findings were described by Cali and colleagues [10], who found the risk alleles significantly associated with increased C9orf72 expression across several tissues, with the largest effect in neural tissues. The same authors also demonstrated the increase in transcript variant 3 and protein levels in induced pluripotent stem cells edited with intermediate HREs and differentiated into neural progenitor cells [10]. In contrast, Gijselinck and colleagues [34] found that repeat lengths from 7 to 24 units resulted in a slightly higher methylation degree in comparison with shorter repeats in humans, particularly in the homozygous state, and observed a decrease of C9orf72 promoter transcriptional activity with increasing number of repeats from 7 to 24 units in HEK293T and SH-SY5Y cells.
Both decrease and increase of C9orf72 expression have been found to impair autophagy [10,35,36]. We hypothesized that this effect may also reflect in host immune response to infections and that harboring C9orf72 HREs of intermediate length may then modulate this response. It has been recently found that gut microbiota may influence the autoinflammatory phenotype of C9orf72 −/− mice [37] and that Herpes Simplex Virus-2 (HSV-2) infection in spinal cord of mice results in the decrease of C9orf72 protein [38]. Apart from the above observations and the described role of C9orf72 in regulating TLR and type I IFN signaling in mice, there is no evidence of an involvement of the gene in host response to infectious diseases in humans.
Since 2020, we have faced the coronavirus disease 2019 (COVID-19) pandemic caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. Despite enormous efforts of the scientific community, there is still a lack of knowledge on the pathogenic mechanisms of this new virus. Excessive inflammation, autoimmune phenomena and defective antiviral type I IFN signaling are believed to significantly contribute to COVID-19 severity [39][40][41][42][43]. Genetic background contributes to susceptibility to autoimmune and infectious diseases in humans and genetic variants associated with those diseases are often found in genes involved in immune response and inflammation, including genes related to the autophagy pathways [44,45]. Furthermore, a multifactorial risk score for COVID-19 severity based on a polygenic model and including autophagy genes has been recently proposed [46].
In view of the above, in the present study we have explored the hypothesis that normal, but in the upper range, HREs in the C9orf72 gene could represent a risk factor for the development of more severe COVID-19 forms.

Results
In the present study, we initially included 240 patients with severe COVID-19, defined by a SARS-CoV-2 positive molecular test and pneumonia requiring hospitalization. During hospitalization, 92 out of 240 (38.3%) patients received mechanical ventilation (MV) or non-invasive ventilation (NIV). The need for MV or NIV was used to define the most severe degree of COVID-19 in further analyses.
In order to explore our hypothesis, we compared C9orf72 repeat size, allele distribution and frequency with those observed in a historical cohort of genetically characterized patients with ALS (n = 93), harboring no C9orf72 pathogenic large expansions, without clinically defined disorders related to immune dysfunctions, mostly Caucasian and from the same geographical region (Lombardy, Italy) of COVID-19 patients. Indeed, no significant differences in the distribution of repeat size, allele distribution and frequency have been observed between C9orf72 expansion-negative ALS cases and healthy controls in published studies [1,2,4,8,[47][48][49][50], while the length of the HRE may depend on the genetic ancestry, being expansions of more than 8 repeats linked to the chromosome 9 Finnish founder ALS risk haplotype that is common in individuals of Northern Europe ancestry [33]. Therefore, the cohort of C9orf72 expansion-negative ALS patients from the same geographical area of COVID-19 patients can be considered as representative of the population of that region regarding the genetic background relative to the C9orf72 gene and used to compare COVID-19 patients. Demographic data of all patients are described in Table 1. No differences in sex, age and ethnicity were found between the two sub-cohorts. Genetic analysis for C9orf72 HREs in the 240 COVID-19 patients did not reveal the presence of large (>30 repeats) expansions. Alleles with 2, 5, and 8 repeat units were the most frequent HREs in both sub-cohorts ( Figure 1), as previously reported in several populations of both ALS patients and healthy controls [1,2,[4][5][6][7][8][9][10][11][12][47][48][49][50].
Based on a preliminary size cut-off of >8 and ≤30 repeat units to define intermediate lengths, which was determined on the basis of previous studies [12] (see Materials and Methods section for more details), we found C9orf72 intermediate HREs in 39 COVID-19 patients, with a trend towards a higher prevalence of intermediate expansions in hospitalized COVID-19 vs. ALS patients, despite comparable average, median number and range of repeat units (Table 1). Intermediate HREs were present on both alleles in only one COVID-19 patient and only one ALS patient. Comparing the overall number of intermediate alleles, we then found 40 out of 480 (8.33%) intermediate alleles in hospitalized COVID-19 patients and 9 out of 186 (4.84%) in ALS patients (Figure 1). The ALS patient with both intermediate alleles is an Italian female (age 70 years) who presented with spastic dysarthria at onset and predominant involvement of the first motor neuron, and who progressed to anarthria as the main clinical characteristic. She currently has a percutaneous endoscopic gastrostomy (PEG) and uses night-time NIV. The COVID-19 patient with both intermediate alleles is a Caucasian male (age 36 years) with an unremarkable medical history who received MV and dialysis during hospitalization due to acute immune-mediated glomerular disease and tubular injury.
Considering data shown in Figure 1 and in order to find the number of C9orf72 hexanucleotide repeats that may better distinguish between COVID-19 and ALS patients, we conducted univariate logistic regression analysis at each repeat length >8 repeat units (Figure 2) and found that patients hospitalized for COVID-19 had an odds ratio (OR) of 2.82 (p = 0.06) of having more than 10 repeats, when compared to the ALS patients. Moreover, 27 out of 240 (11.25%) COVID-19 patients had at least one allele with more than 10 repeats compared to 4 out of 93 (4.30%) ALS patients (p = 0.050) (Table 1). Comparing the overall number of intermediate alleles with >10 repeats, we found 27 out of 480 alleles (5.63%) with more than 10 repeats in COVID-19 patients vs. 4 out of 186 (2.15%) in ALS patients (p = 0.056) (Table 1).
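As a hedged illustration of how the unadjusted odds ratio quoted above relates to the carrier counts, the following minimal Python sketch recomputes it from the 2x2 counts given in the text. The function name `odds_ratio` is an illustrative assumption and not part of the original analysis, which was carried out by logistic regression.

```python
def odds_ratio(carriers_a, total_a, carriers_b, total_b):
    """Unadjusted odds ratio of being a carrier in group A vs. group B."""
    odds_a = carriers_a / (total_a - carriers_a)
    odds_b = carriers_b / (total_b - carriers_b)
    return odds_a / odds_b

# Patients with at least one allele >10 repeats (counts from the text):
# 27 of 240 COVID-19 patients vs. 4 of 93 ALS patients.
or_patients = odds_ratio(27, 240, 4, 93)

# Alleles with >10 repeats: 27 of 480 (COVID-19) vs. 4 of 186 (ALS).
or_alleles = odds_ratio(27, 480, 4, 186)

print(f"Patient-level OR: {or_patients:.2f}")
print(f"Allele-level OR:  {or_alleles:.2f}")
```

For a single binary predictor, the odds ratio from a univariate logistic regression equals this cross-product ratio, so the patient-level value can be compared directly with the reported OR of 2.82.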
Univariate logistic regression analysis (Table 2) revealed that COVID-19 patients with more than 10 repeats in at least one allele were younger than patients with shorter expansions [mean age (±SD) 60.67 (±13.37) vs. 64.76 (±11.98), p = 0.090] and more frequently required MV or NIV (56% vs. 36%, p = 0.053), although differences did not reach statistical significance. We also analyzed routine laboratory parameters; however, no significant differences in terms of mean values were found, except for D-dimer levels (p = 0.02) (Table 2).
Multivariate regression analysis further suggested the presence of more than 10 repeats in at least one allele as a possible risk factor for NIV or MV requirements independently of age in patients with COVID-19 pneumonia (OR 2.36, 95% confidence interval (CI) 1.04-5.37, p = 0.040) (Table 3).
Finally, we replicated our analysis in an independent cohort of 201 SARS-CoV-2 infected individuals from the GEN-COVID Multicenter Study [51]. This replication cohort included 101 severely affected COVID-19 patients who received MV or NIV during hospitalization and 100 non-hospitalized subjects (asymptomatic or with very mild symptoms). Demographic data of patients are described in Table 4. As expected, severely affected COVID-19 patients were mostly males and older in comparison with non-hospitalized patients.
As in the first cohort of COVID-19 patients, we did not find large (>30 repeats) C9orf72 expansions. Based on the results obtained with the first cohort, we chose a cut-off value of more than 10 repeats and stratified patients by disease severity. We found 16 out of 101 (15.84%) subjects with at least one C9orf72 allele with more than 10 repeats in COVID-19 patients treated by either MV or NIV and 6 out of 100 (6%) in non-hospitalized SARS-CoV-2 infected subjects (p = 0.025). In this cohort we did not find subjects with more than 10 repeats in both C9orf72 alleles. When considering the overall number of alleles, we found 16 out of 202 alleles (7.92%) with more than 10 repeats in the first group (patients treated by MV or NIV) and 6 out of 200 (3%) in the second group (asymptomatic or with very mild symptoms) of SARS-CoV-2 infected patients (p = 0.030) (Table 4).
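The comparisons of carrier frequencies in the replication cohort can be sketched as a test on a 2x2 contingency table. The statistical test used is not named in this section, so the example below assumes a Pearson chi-square test without continuity correction. The function name `chi2_2x2` is hypothetical, and only the Python standard library is used.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]]; returns the statistic and its p-value (1 df)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Subjects with at least one allele >10 repeats (counts from the text):
# 16 of 101 ventilated patients vs. 6 of 100 non-hospitalized subjects.
chi2_subjects, p_subjects = chi2_2x2(16, 101 - 16, 6, 100 - 6)

# Alleles with >10 repeats: 16 of 202 vs. 6 of 200.
chi2_alleles, p_alleles = chi2_2x2(16, 202 - 16, 6, 200 - 6)

print(f"Subjects: chi2 = {chi2_subjects:.3f}, p = {p_subjects:.3f}")
print(f"Alleles:  chi2 = {chi2_alleles:.3f}, p = {p_alleles:.3f}")
```

Under this assumption the p-values round to the reported 0.025 and 0.030, which is consistent with an uncorrected chi-square test but does not establish that this was the test used. With only 6 carriers in one group, Fisher's exact test would be a reasonable alternative and would give somewhat different values.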

Discussion
With this work, we sought to explore whether intermediate HREs in C9orf72 may be a risk factor for severe COVID-19 pneumonia. Although we found intermediate repeats in only a small percentage of COVID-19 patients, the magnitude of risk for requiring MV or NIV conferred by harboring intermediate repeats >10 units in at least one allele was more than twice that associated with shorter expansions (≤10 units), when adjusted for age (OR 2.36; 95% CI 1.04-5.37, p = 0.040). The association between intermediate repeats >10 units and more severe clinical outcome (p = 0.025) was also validated in an independent cohort of 201 SARS-CoV-2 infected patients, comprising 101 severely affected COVID-19 patients who received MV or NIV during hospitalization and, as control group, 100 non-hospitalized subjects (asymptomatic or with very mild symptoms). These data suggest that C9orf72 HREs >10 units may not be a common cause of severe COVID-19 pneumonia but may influence the pathogenic process leading to severe phenotypes.
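As a rough consistency check on the age-adjusted estimate quoted above, the standard error and p-value can be back-calculated from the odds ratio and its confidence interval. This sketch assumes a Wald-type 95% interval that is symmetric on the log-odds scale, which is the usual output of logistic regression software but is not stated in the text. The function name `wald_from_ci` is illustrative.

```python
import math

def wald_from_ci(or_point, ci_low, ci_high, z_crit=1.959964):
    """Recover the log-OR standard error, Wald z and two-sided p-value
    from an odds ratio and its (assumed Wald) 95% confidence interval."""
    log_or = math.log(or_point)
    # A Wald CI spans log_or +/- z_crit * se on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

se, z, p = wald_from_ci(2.36, 1.04, 5.37)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

The recovered p-value is close to the reported 0.040, so the odds ratio, interval and p-value are mutually consistent under the Wald assumption. Small discrepancies are expected because the published figures are rounded.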
Although mainly implicated in neurodegenerative disorders, the human gene C9orf72 is highly expressed not only in microglia but also in myeloid cells, mainly monocytes and dendritic cells [25], is critical for their proper functions and is involved in autoimmunity and inflammation [27][28][29]. C9orf72 −/− knock-out mice indeed exhibit dysregulation of the immune system, age-dependent inflammation characterized by a cytokine storm, neuroinflammation and features of autoimmunity like systemic lymphadenopathy, splenomegaly, pseudothrombocytopenia, high levels of autoantibodies and membranoproliferative glomerulonephritis reminiscent of systemic lupus erythematosus (SLE). Even haploinsufficient hemizygous C9orf72 +/− mice exhibit enhanced cytokine production in response to several immune stimuli [28]. Interestingly, we found that one COVID-19 patient with intermediate HREs in both C9orf72 alleles received MV during hospitalization and experienced acute immune-mediated glomerular disease.
Accumulating evidence supports the role of C9orf72 in regulating vesicle trafficking [15,18,52,53] and lysosomal degradation of inflammatory mediators, including TLRs and STING, leading to their prolonged inflammatory signaling [30,31]. Interestingly, the environment, especially variation in gut microorganisms, seems to directly influence the pathological phenotype of C9orf72 −/− mice [37] and HSV-2 latent infection in the spinal cord of mice results in altered microglia and leucocyte infiltration accompanied by a decrease in C9orf72 protein levels [38]. C9orf72 interacts with different Rab GTPases and might affect autophagy at many steps and through the regulation of mammalian target of rapamycin Complex 1 (mTORC1) [13]. Of note, autophagy dysfunctions are often associated with inflammatory and autoimmune diseases [44] and innate immune responses and inflammation, crucial in anti-viral responses, are regulated by autophagy [54]. Several studies have shown that many viruses, like coronaviruses, have evolved strategies to evade the host response by directly hijacking the autophagy pathway in support of their life cycle and spread or by disrupting the host control on the production of anti-viral cytokines [54][55][56]. Host genetics also contributes to aberrant immunity in autoimmune diseases and susceptibility to infectious diseases in humans and such variants are often found in genes involved in the immune response and inflammation [44,45]. The current knowledge and our work confirm these findings in COVID-19 [41,43,45,57,58]. Autophagy genes have recently been proposed as susceptibility factors in COVID-19 [46]. Our results are the first report on the potential involvement of variants in an autophagy gene in determining susceptibility for severe COVID-19 phenotype. 
The recent observation that C9orf72 is involved in the lysosomal degradation of inflammatory mediators like TLRs and STING [30,31], that are crucial in anti-viral response, further corroborates our findings.
Large hexanucleotide expansions in C9orf72 lead to neurodegeneration in ALS/FTD through the cooperation between loss and gain of functions, derived from C9orf72 haploinsufficiency and accumulation in patients' brain and spinal cord of C9orf72 HRE bidirectional transcripts and cytoplasmic toxic aggregates of dipeptide repeat proteins (DPRs) [13]. C9orf72 intermediate expansions of 24-30 repeats have recently been found associated with ALS in a large meta-analysis on 5071 cases and 3747 controls [59], but characteristic nuclear RNA foci and DPR aggregates were absent in one ALS patient with an intermediate expansion of 16 repeats [60] and 9 cases with corticobasal degeneration and intermediate repeats ranging from 17 to 29 units [10]. Furthermore, while full expansion results in the decrease in C9orf72 mRNA and protein expression [1,61], due to premature abortion of transcription [62] and hypermethylation of the CpG-rich C9orf72 promoter region [34,63], for intermediate expansions current results are discordant. Some authors and our group found association of intermediate repeats with neurodegenerative disorders like corticobasal degeneration, Parkinson's Disease, atypical parkinsonisms, multiple sclerosis, psychiatric symptoms in ALS/FTD patients, neuropsychiatric disorders and autoimmune diseases [4][5][6][7][8][9][10][11][12]. Moreover, intermediate repeats from 7 to 24 units showed a slightly higher methylation degree, particularly in the homozygous state, in comparison with short repeats. A decrease of transcriptional activity with increasing number of repeats from 7 to 24 units compared with shorter repeats has been demonstrated in HEK293T and SH-SY5Y cells [34].
By contrast, the risk haplotype was found to be associated with slightly higher expression of C9orf72 transcript variants 1 and 3 and lower expression of transcript variant 2 [25] and induced pluripotent stem cells edited with intermediate HREs and differentiated into neural progenitor cells showed an increase in transcript variant 3 and protein levels [10]. As stated above, C9orf72 protein expression is down-modulated by HSV-2 infection [38] while a cell type-dependent regulation of its levels via the ubiquitin-proteasome system and autophagy has been recently suggested [35]. Given the role of C9orf72 in TLR and type I IFN pathways [30,31], it is tempting to speculate that intermediate repeats, likely through gene expression modulation, may influence host response to infection with SARS-CoV-2 and perhaps further viruses. This could explain our findings regarding the higher risk of having severe COVID-19 requiring NIV or MV independently of age. Indeed, hyperactivation of myeloid cells, aberrant release of pro-inflammatory cytokines, autoimmune features and defective innate immune responses, particularly in type I IFN signaling, are believed to significantly contribute to severe clinical course of COVID-19. Recent studies highlighted the role of host genetics in determining COVID-19 severity with the identification of inborn errors of TLR3, IFN regulator factor 7-dependent production of type I IFN and variants in further genes involved in IFN signaling, cytokine release and inflammation underlying life-threatening COVID-19 [41,43,58]. 
To date, there is limited direct experimental evidence on autophagy involvement in SARS-CoV-2 infection, either in an anti-viral or pro-viral manner, with the exception of recent studies demonstrating that the SARS-CoV-2 papain-like protease (PLpro) cleaves the serine/threonine kinase unc-51-like autophagy activating kinase 1, disrupting autophagy [64], and that SARS-CoV-2 ORF3a inhibits autophagy activity by blocking fusion of autophagosomes/amphisomes with lysosomes [65]. We can hypothesize that harboring intermediate HREs in C9orf72 could contribute to negatively balancing the host innate immune response to SARS-CoV-2 infection, leading to a more severe disease. A limitation of our study is that we did not measure C9orf72 mRNA expression in patients' peripheral blood cells; however, we reasoned that gene expression could be influenced not only by harboring intermediate repeats >10 units, as described above [10,25,34], but also by the clinical state, as suggested for HSV-2 infection [38], making it hard to discriminate between the two conditions. COVID-19 patients of the first cohort were enrolled after discharge; however, they had been hospitalized with severe COVID-19 (38.3% of them received MV or NIV) and most of them, at the time of recruitment into this study during follow-up, were still showing some signs of severe COVID-19. This could hamper a clean analysis of C9orf72 expression relative to the length of the intermediate expansion. Furthermore, some patients of the validation cohort were recruited during the pandemic and the ongoing inflammatory conditions could likely affect C9orf72 expression. Further studies are therefore needed to determine whether intermediate expansions may modulate C9orf72 in vivo and, more importantly, which immune cells are mainly affected, but also to verify whether SARS-CoV-2 may influence C9orf72 expression in particular subsets of myeloid cells.
Increased levels of pro-inflammatory cytokines have been observed in sera of C9orf72 −/− knockout mice [27][28][29]37], with a pattern that similarly defines the "cytokine storm" driving acute injuries during severe COVID-19 [66]. In our cohort of severe COVID-19 patients, we did not find any evident correlation between the presence of C9orf72 intermediate repeats and routine inflammatory laboratory parameters, except for D-dimers, and we did not measure levels of pro-inflammatory chemokines and cytokines. This is a limit of our study. Coagulation biomarkers, including D-dimers, are frequently altered during severe inflammation [67][68][69][70]. In patients with severe COVID-19, genetic variants studied here may be involved in more severe inflammatory conditions perhaps through STING signaling-mediated altered type I IFN production [31]. Indeed, very recently, inflammasome-dependent coagulation activation has been found to associate with excessive activation of the STING pathway [67], while beclin-1, a marker of autophagy, has been found to be increased in COVID-19 patients, particularly in severe patients, and its levels have been demonstrated to correlate with D-dimer levels [71].
The complex interactions between genetic background and the environment are poorly understood. The variable phenotype associated with C9orf72 large HREs in ALS/FTD has indicated that penetrance is incomplete [72], suggesting that either further genetic or environmental factors could modify the individual risk of disease. Microbiota seems to be a potent modifier of onset and progression of autoimmunity, inflammation and premature mortality in C9orf72 −/− knockout mice [37]. We cannot, then, exclude that environmental factors like microbiota may also influence the effect of intermediate C9orf72 repeats on COVID-19 clinical phenotype.
Further limitations of our study are, first, that the number of carriers of C9orf72 intermediate alleles in the 240 severe COVID-19 patient cohort, as well as in the validation cohort of 201 SARS-CoV-2 infected patients, is small, and one should be cautious with the interpretation of these results. Secondly, we considered a cohort of genetically characterized patients with ALS, harboring no C9orf72 pathogenic large expansions and without clinically defined disorders related to immune dysfunctions, as representative of the general population for the first part of this study rather than considering uninfected controls, likely resistant to SARS-CoV-2 infection, to make the first comparisons and find the number of repeats in C9orf72 HRE at which the difference between COVID-19 patients and ALS patients was significant. Nevertheless, at the time of patients' recruitment for this study we had no easy access to SARS-CoV-2 negative subjects, since in the midst of the pandemic molecular tests were performed mainly in symptomatic patients in Italy. Moreover, we could not have been sure whether SARS-CoV-2 negative subjects had remained uninfected because of their genetic background or simply because they had not come into close contact with infected people. Therefore, SARS-CoV-2 negative subjects could not represent the correct control population. Furthermore, as stated above, all published studies performed in ALS cases without pathological C9orf72 expansions and healthy controls found no significant differences in distribution, range, and median number of repeats [1,2,8,47,48].
For these reasons, and to avoid bias possibly deriving from genetic ancestry, being expansions of more than 8 repeats linked to the chromosome 9 Finnish founder ALS risk haplotype that is more common in individuals of Northern Europe ancestry [33], we decided to choose the ALS cohort (mostly Caucasian and from the same geographical region of COVID-19 patients), already used in a previous work [12] for the first comparison. Moreover, further analyses in this work compared severe COVID-19 patients of the first cohort considering MV and NIV requirement as a proxy of high severity of disease to find association with C9orf72 intermediate repeats >10 units. Furthermore, we validated our findings in an additional cohort. Since the aim of the second part of our work was not the comparison between COVID-19 patients and the general population but the confirmation of our hypothesis that harboring alleles with more than 10 repeats in the C9orf72 gene may be a risk to develop a more severe form of disease, we considered only COVID-19 patients stratified in severely affected ones that received MV or NIV during hospitalization and non-hospitalized subjects (asymptomatic or with very mild symptoms). The genetic analyses in this stratified cohort confirmed the association between intermediate repeats >10 units and more severe clinical outcome.
Finally, we cannot completely exclude that COVID-19 severity could be unrelated to C9orf72 HRE itself but rather associate with the genetic background defined by the chromosome 9 region in which C9orf72 is located, comprising the 110 kb risk Finnish haplotype, that is, as stated above, more frequent for alleles with more than 8 repeats within the C9orf72 HRE. Interestingly, genome-wide association studies (GWAS) identified single nucleotide polymorphisms (SNPs) in the region of chromosome 9 that contains the Mps One Binder Kinase Activator-Like 2B (MOBKL2B), C9orf72 and IFN-K loci as associated with the response to anti-tumor necrosis factor α therapy in rheumatoid arthritis (RA) [73], with a genetic predisposition to SLE [74] and recently as genetic loci shared between ALS and autoimmune diseases like SLE and RA [75]. IFN-K is expressed in oral epithelial cells, one of the first sites of host interaction with viruses that are spread via saliva and may be spread through the mouth [76]. Near this region are also clustered further genes of type I IFNs and we recently found a significantly higher frequency of C9orf72 intermediate repeats in patients with SLE and RA [12].
In conclusion, C9orf72 intermediate alleles >10 repeat units are over-represented in hospitalized COVID-19 patients with severe pneumonia and related to MV and NIV requirements independently of age, suggesting that they could represent a risk factor contributing to the occurrence of severe COVID-19 forms. Autophagy may be involved in the COVID-19 clinical phenotype and a polygenic model also related to genes involved in the autophagy machinery has been recently proposed to explain COVID-19 risk assessment and guide precision medical care [46]. This is the first report describing the association of severe forms of COVID-19 with variants in a gene involved in autophagy. Understanding how host genetic factors contribute to variation in disease susceptibility and severity may shed light on heterogeneity in the immune response and the host-pathogen interaction and facilitate the development of therapeutics and vaccines.

Patients
In the first cohort, we consecutively enrolled 240 adult patients (aged >18 years, mostly Caucasian and most of them from the same Italian region, Lombardy) with confirmed COVID-19 pneumonia (defined by a SARS-CoV-2-positive molecular test on nasopharyngeal swab and radiological features of pneumonia) who had required hospitalization at ASST-Spedali Civili di Brescia over the period March-December 2020. Recruitment was performed when discharged patients were referred to the University Department of Infectious and Tropical Diseases of our Hospital for clinical and virological control and follow-up. Hospitalization with COVID-19 pneumonia was used as a proxy of severity for patient inclusion. The requirement for NIV or MV was used to define the most severe degree of pneumonia in further analyses. Patients' clinical data and routine laboratory findings (white blood cell, lymphocyte and platelet counts, serum biochemical tests for liver and renal function, C-reactive protein, ferritin, D-dimer) were collected from clinical and electronic charts. The worst value for each biochemical parameter during hospitalization was used for analyses.
It has been thoroughly described that ALS cases without large C9orf72 HRE and healthy subjects do not differ significantly in the distribution, range and median of the repeat number [1,2,4,8,47–50]. Nonetheless, differences in the prevalence of large C9orf72 pathogenic expansions have been described between people from Southern and Northern Europe, and both large (>30 repeat units) and intermediate (>8 but ≤30 repeat units) expansions are linked to the chromosome 9 Finnish founder ALS risk haplotype, which is common in individuals of Northern European ancestry [33]. To avoid potential bias deriving from the genetic background, as the control group for the first cohort we included 93 ALS patients, mostly Caucasian and from the same geographical region as the COVID-19 patients, but without large C9orf72 pathogenic expansions.
The ALS patients included in this study had been referred to the Centre for Neuromuscular Diseases and Neuropathies ASST-Spedali Civili di Brescia and were recently admitted to the Cytogenetics and Molecular Genetics Section of our Hospital for routine genetic diagnosis (some of these patients were already described in reference [12]). In the replication study, 201 SARS-CoV-2 infected individuals (defined by a SARS-CoV-2-positive molecular test on nasopharyngeal swab as above) from the GEN-COVID Multicenter Study [51] were considered. Among them, 101 patients were severely affected and treated with either MV or NIV, while 100 were non-hospitalized subjects (asymptomatic or with very mild symptoms). Specimens were provided by the COVID-19 Biobank of Siena, which is part of the Genetic Biobank of Siena, a member of BBMRI-IT, Telethon Network of Genetic Biobanks (project no. GTB18001), EuroBioBank and RD-Connect.
All data were collected in anonymized form by study physicians. Written informed consent was obtained from all patients. The protocol for enrollment of COVID-19 patients of the first cohort was approved by the Ethics Committee of ASST-Spedali Civili di Brescia (GEVACOBA Study Project). The GEN-COVID study was approved by the University Hospital of Siena Ethics Review Board. Clinical research was conducted in accordance with the principles for medical research involving human subjects described in the Declaration of Helsinki.

C9orf72 Genotyping
Genomic DNA was extracted from peripheral blood samples using the Wizard Genomic DNA Purification kit (Promega Corporation, Madison, WI, USA). DNA samples were quantified using a Qubit 2.0 Fluorometer (Thermo Fisher Scientific, Waltham, MA, USA) with the Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific) and genotyped with a polymerase chain reaction (PCR)-based two-step C9orf72 analysis, essentially as previously described [77].
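Once both alleles of a subject have been sized, the repeat counts can be mapped onto the categories used in this study: large expansions (>30 repeat units), intermediate expansions (>8 but ≤30 repeat units), and the >10-unit cut-off applied to at least one allele. The following is only an illustrative sketch of that bookkeeping. The function names, the "short" label and the tuple representation of a genotype are assumptions made for this example and are not part of the genotyping protocol in [77].

```python
# Thresholds taken from the definitions in the Patients section.
INTERMEDIATE_ABOVE = 8   # intermediate: >8 repeat units ...
LARGE_ABOVE = 30         # ... but <=30; large: >30 repeat units


def classify_allele(repeats):
    """Return the size class of a single allele from its repeat count."""
    if repeats > LARGE_ABOVE:
        return "large"
    if repeats > INTERMEDIATE_ABOVE:
        return "intermediate"
    return "short"  # label chosen here for illustration only


def has_intermediate_above(genotype, cutoff=10):
    """True if at least one allele is longer than `cutoff` units
    while still within the intermediate range (<=30 units)."""
    return any(cutoff < repeats <= LARGE_ABOVE for repeats in genotype)


# Hypothetical genotypes, written as (allele 1, allele 2) repeat counts.
for genotype in [(2, 5), (2, 11), (8, 10), (5, 40)]:
    classes = [classify_allele(r) for r in genotype]
    print(genotype, classes, has_intermediate_above(genotype))
```

Keeping the cut-off as a parameter makes it easy to reuse the same helper when different maximum repeat levels are compared.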

Statistics
Categorical variables were reported as proportions and/or percentages, continuous variables as mean (±SD) values. Fisher's exact or Chi-square test for categorical variables and Student's t-test for continuous variables were applied as appropriate. To find the number of repeats in the C9orf72 HRE at which the difference between COVID-19 and control patients was most significant, we performed a logistic regression analysis, with the COVID-19 condition as the dependent variable and the number of patients at different maximum repeat levels as the independent variable. We then plotted the OR and p-value against the maximum number of repeats.
Logistic regression was used to analyze the association between COVID-19 severity (using NIV and MV requirements as a proxy) and the presence of C9orf72 HRE >10 units, adjusted for age. p values < 0.05 were considered significant. When significant, ORs with 95% CIs were reported.

Toscana" project to Azienda Ospedaliero-Universitaria Senese, charity fund 2020 from Intesa San Paolo dedicated to the project N. B/2020/0119 "Identificazione delle basi genetiche determinanti la variabilità clinica della risposta a COVID-19 nella popolazione italiana". These funding sources had no role in the design of this study, analyses, interpretation of the data or decision to submit results.
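One possible reading of the repeat-level scan is sketched below: for each candidate cut-off, subjects are dichotomized by whether their longest allele exceeds it, a logistic regression of case status on that indicator is fitted, and the OR, a Wald 95% CI and a Wald p-value are recorded. This is an assumption about the procedure, not a reproduction of the authors' analysis. The source does not name the statistical software, so a small pure-Python Newton-Raphson fit stands in for it here. All function names and the toy data are hypothetical.

```python
import math


def solve(a, b):
    """Solve the linear system a*x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def _score_info(X, y, beta):
    """Gradient of the log-likelihood and Fisher information at beta."""
    k = len(beta)
    p = [1.0 / (1.0 + math.exp(-sum(b * x for b, x in zip(beta, xi))))
         for xi in X]
    grad = [sum((yi - pi) * xi[j] for xi, yi, pi in zip(X, y, p))
            for j in range(k)]
    info = [[sum(pi * (1 - pi) * xi[r] * xi[c] for xi, pi in zip(X, p))
             for c in range(k)] for r in range(k)]
    return grad, info


def fit_logistic(rows, y, iters=50, tol=1e-10):
    """Fit a logistic regression by Newton-Raphson.

    rows: one list of covariates per subject (an intercept is added here).
    Returns (coefficients, standard errors); index 0 is the intercept.
    Fails on perfectly separated or collinear data.
    """
    X = [[1.0] + [float(v) for v in r] for r in rows]
    beta = [0.0] * len(X[0])
    for _ in range(iters):
        grad, info = _score_info(X, y, beta)
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    _, info = _score_info(X, y, beta)
    k = len(beta)
    # Standard errors: square roots of the diagonal of the inverse information.
    se = [math.sqrt(solve(info, [float(i == j) for i in range(k)])[j])
          for j in range(k)]
    return beta, se


def wald_summary(coef, se):
    """Return (OR, lower 95% CI, upper 95% CI, two-sided Wald p-value)."""
    p_value = math.erfc(abs(coef / se) / math.sqrt(2))
    return (math.exp(coef), math.exp(coef - 1.96 * se),
            math.exp(coef + 1.96 * se), p_value)


def scan_thresholds(max_repeats, is_case, thresholds):
    """For each cut-off, regress case status on 'longest allele > cut-off'.

    Every cut-off must leave carriers and non-carriers in both groups.
    """
    results = {}
    for t in thresholds:
        carrier = [[1 if r > t else 0] for r in max_repeats]
        beta, se = fit_logistic(carrier, is_case)
        results[t] = wald_summary(beta[1], se[1])
    return results


# Toy data (invented for illustration): longest allele per subject.
max_repeats = [12] * 6 + [9] * 4 + [2] * 10 + [12] * 2 + [9] * 3 + [2] * 15
is_case = [1] * 20 + [0] * 20

for t, (odds, lo, hi, p) in scan_thresholds(max_repeats, is_case, [8, 10]).items():
    print(f">{t} repeats: OR={odds:.2f} (95% CI {lo:.2f}-{hi:.2f}), p={p:.3f}")
```

With a single binary predictor the fitted OR equals the cross-product ratio of the corresponding 2×2 table, which gives a simple sanity check. Passing rows of the form `[carrier, age]` to the hypothetical `fit_logistic` would give an age-adjusted estimate for the carrier term in `beta[1]`.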

Institutional Review Board Statement:
The study was conducted according to the guidelines of the Declaration of Helsinki. Protocol for the enrollment of patients of the first cohort was approved by the Ethics Committee of ASST Spedali Civili di Brescia (protocol code NP4139, GEVACOBA Study Project, date of final approval 8 July 2020). The GEN-COVID study was approved by the University Hospital of Siena Ethics Review Board (Protocol n. 16917, date of final approval 16 March 2020).

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
All study data, including raw and analyzed data, and materials will be available from the corresponding author on reasonable request.