Investigation of genetic variations of IL6 and IL6R as potential prognostic and pharmacogenetics biomarkers: Implications for COVID-19 and neuroinflammatory disorders

In the present study, we investigated the distribution of genetic variations in the IL6 and IL6R genes, which may be employed as prognostic and pharmacogenetic biomarkers for COVID-19 and neurodegenerative diseases. The study was performed on 271 samples representative of the Italian general population and identified seven variants (rs140764737, rs142164099, rs2069849, rs142759801, rs190436077, rs148171375, rs13306435) in IL6 and five variants (rs2228144, rs2229237, rs2228145, rs28730735, rs143810642) in IL6R. These variants have been predicted to affect the expression and binding ability of IL6 and IL6R. Ingenuity Pathway Analysis (IPA) showed that IL6 and IL6R appear to be implicated in several pathogenetic mechanisms associated with COVID-19 severity and mortality, as well as with neurodegenerative diseases mediated by neuroinflammation. Thus, the availability of IL6-IL6R-related biomarkers for COVID-19 may help to counteract harmful complications and prevent multi-organ failure. At the same time, IL6-IL6R-related biomarkers could also be useful for assessing the susceptibility to and progression of neuroinflammatory disorders and for selecting the most suitable treatment strategies to improve patients' prognosis and quality of life. In conclusion, this study showed how the pleiotropic activity of IL6 could be exploited to meet different clinical needs and realize precision medicine protocols for chronic, age-related and modern public health emergencies. © 2020, MDPI. All rights reserved.


Introduction
The identification of biological markers (i.e., biomarkers) of disease that are detectable in several biological fluids and tissues is a key milestone for the implementation of precision medicine protocols in clinical practice [1][2][3][4][5]. The need for clinically useful biomarkers and precision medicine strategies has become even more pressing with the recent outbreak of the novel pathogenic Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) and the resulting Coronavirus Disease 2019 (COVID-19) pandemic [6][7]. However, COVID-19 is only the most recent challenge for the healthcare system, which must also cope with a growing aging population and the diverse pathological conditions affecting the elderly, such as neurodegenerative disorders. Interestingly, neurodegenerative disorders and viral infections share several molecular signatures, which together point to dysfunction of the local and peripheral immune response as a common etiopathogenetic mechanism [8][9][10]. In this context, the availability of specific biomarkers could be extremely helpful to identify at-risk individuals, select the most suitable therapy and counteract the progression and mortality of such public health emergencies. Two of the most promising candidate biomarkers for both COVID-19 and neurodegenerative disorders are Interleukin-6 (IL6) and its receptor (IL6R). IL6 has a four-helix bundle conformation and exerts its functions by binding IL6R [11]. IL6R can be either membrane-bound on the extracellular surface of immune, epithelial and liver cells (classical signaling) or circulating in soluble form (trans-signaling), in which case it acts as an IL6 agonist [12][13]. The IL6-IL6R complex interacts with the glycoprotein 130 (gp130) membrane receptor, which, in turn, triggers downstream intracellular signaling pathways (Figure 1) mainly involved in the immuno-inflammatory response [11,13].
Genetic variants in the IL6 (7p15.3) and IL6R (1q21.3) genes have been proposed to affect the binding ability, expression levels and biological functions of the IL6-IL6R complex, thereby contributing to the onset and progression of severe infectious, autoimmune and neuroinflammatory/neurodegenerative diseases, including COVID-19, Hepatitis B infection, Rheumatoid Arthritis (RA), Cardiovascular Disorders (CVD), Multiple Sclerosis (MS), Alzheimer Disease (AD) and Parkinson Disease (PD) [14][15][16][17]. Indeed, a number of studies have identified variants located in the regulatory regions of IL6 and IL6R as genetic determinants of high IL6 levels in serum and tissues, which have in turn been proposed to affect the risk and progression of many different disease states (especially COVID-19, CVD and AD) [16][17][18]. In the present study, we investigated the distribution of Copy Number Variations (CNVs) and genetic variants located within the coding sequences of the IL6 and IL6R genes, which may be employed as prognostic and drug response (pharmacogenetic) biomarkers for COVID-19 and neuroinflammatory diseases. We focused on these pathologies because both represent current public health issues and because the identification of biomarkers within IL6 and IL6R could inform therapeutic strategies relevant to both conditions.

Materials And Methods
The study was performed on a cohort of 271 DNA samples representative of the Italian general population. The cohort was composed of 100 samples analyzed by array Comparative Genomic Hybridization (aCGH) to assess the presence of structural genomic variations and 171 samples used to identify common and rare Single Nucleotide Variants (SNVs) located in the coding or splice site regions of the genome. Genetic data were partly obtained from aCGH and Whole Exome Sequencing (WES) data available at the Genomic Medicine Laboratory of IRCCS Santa Lucia Foundation and partly retrieved from Ensembl [19][20][21]. The use of laboratory data for research purposes was approved by the Ethics Committee of IRCCS Santa Lucia Foundation of Rome (CE/PROG.650 approved on 01/03/2018), and signed informed consent was provided by the individuals subjected to genetic testing at our laboratory.
The CNV analysis was performed with Chromosome Analysis Suite (ChAS) 3.1 (Affymetrix, Santa Clara, CA, USA) using the Cytoscan750k_Array Single Sample analysis: NA33_hg19 as reference file and an average resolution of 100 Kb. Concerning SNVs, the Ensembl [19], 1000 Genomes [20] and gnomAD [21] databases were used to extract the frequency data of the exonic variants of interest. In particular, 22 variants located within IL6 and 37 variants within IL6R were selected. Subsequently, the presence and frequency distribution of the selected variants were evaluated in the cohort of Italian samples. For WES results, a coverage of 20X was considered for the analysis of the IL6 and IL6R sequences. The Variant Call Format (VCF) files obtained by WES analysis were first scanned with vcfR and then subjected to "genomic variants filtering by deep learning models in NGS" (GARFIELD-NGS) analysis. In particular, vcfR is an R package for visualizing, manipulating and performing quality control of VCF data [22]. GARFIELD-NGS is an informatics tool that relies on deep learning models to distinguish false from true variants in exome sequencing experiments [23].
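As an illustration of the depth criterion described above, the following minimal Python sketch keeps only variant records whose site-level read depth reaches a threshold. It is not the pipeline used in the study, which relied on vcfR and GARFIELD-NGS. It assumes that the 20X coverage is applied as a minimum on the `DP` entry of the VCF INFO column. The names `Variant`, `parse_vcf_lines` and `filter_by_depth` are hypothetical, and the toy records are invented for illustration.

```python
from typing import Iterable, Iterator, List, NamedTuple, Optional


class Variant(NamedTuple):
    chrom: str
    pos: int
    var_id: str
    ref: str
    alt: str
    depth: Optional[int]  # site-level read depth from INFO/DP, if present


def parse_vcf_lines(lines: Iterable[str]) -> Iterator[Variant]:
    """Yield one Variant per data line, skipping headers and blank lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        info = fields[7] if len(fields) > 7 else ""
        depth = None
        for entry in info.split(";"):
            if entry.startswith("DP="):
                depth = int(entry[3:])
        yield Variant(fields[0], int(fields[1]), fields[2],
                      fields[3], fields[4], depth)


def filter_by_depth(variants: Iterable[Variant],
                    min_depth: int = 20) -> List[Variant]:
    """Keep variants whose depth is known and at least min_depth."""
    return [v for v in variants
            if v.depth is not None and v.depth >= min_depth]


# Toy records: chromosomes, positions, IDs and depths are invented.
TOY_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chrA\t100\tvar_a\tC\tT\t50\tPASS\tDP=35",
    "chrA\t200\tvar_b\tG\tA\t50\tPASS\tDP=12",
    "chrB\t300\tvar_c\tA\tC\t50\tPASS\tDP=20;AF=0.5",
    "chrB\t400\tvar_d\tT\tA\t50\tPASS\tAF=0.5",
]

kept = filter_by_depth(parse_vcf_lines(TOY_VCF))
print([v.var_id for v in kept])  # ['var_a', 'var_c']
```

Records without a `DP` entry are dropped rather than kept, which is a conservative choice made for this sketch. A real workflow might instead use per-sample depth from the genotype columns.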
Subsequently, the identified variants were subjected to bioinformatic predictive analysis in order to assess their potential impact on protein expression and function. In particular, VarSite and Human Splicing Finder (HSF) were interrogated. VarSite analyzes and predicts the effect of amino acid changes on the protein structure [24]. HSF evaluates the effects of variants on the splicing mechanisms [25]. Moreover, the UniProt annotation database was used to retrieve the topological and functional domain organization of the proteins [26]. Finally, the Ingenuity Pathway Analysis (IPA, Qiagen) software application was used to place IL6 and IL6R into their biological context and to postulate their possible association with COVID-19 severity and neuroinflammatory disorders, as well as their potential use as druggable targets. IPA is an all-in-one web-based software application that allows the analysis and integration of different kinds of genetic data, facilitating their interpretation and the identification of specific targets or candidate biomarkers, and placing them in the context of larger biological or chemical systems. The software is backed by the Ingenuity Knowledge Base, which consists of highly structured, detail-rich biological and chemical findings. The entire analytical workflow of the study is illustrated in Figure 2.

Results And Discussion
The present study investigated the distribution of CNVs and SNVs located within the coding sequences of the IL6 and IL6R genes, with the aim of identifying candidate prognostic and pharmacogenetic biomarkers for COVID-19 and neuroinflammatory diseases. The analysis of CNVs did not reveal any significant variation in our study cohort, ruling out the possibility that frequent copy number variations affect IL6 and IL6R expression. Concerning SNVs, 22 variants located within IL6 and 37 variants within IL6R were selected and investigated in the cohort of Italian samples. As a result, seven variants within IL6 and five variants within IL6R were identified (Table 1). Among the variants of IL6, three were synonymous (rs140764737, C/T; rs142164099, G/A; rs2069849, C/T) and four were missense (rs142759801, C/A; rs190436077, G/C; rs148171375, A/T and rs13306435, T/A). These variants appeared to be rare in the Italian cohort, with a Minor Allele Frequency (MAF) ranging from 0.003 to 0.009 (Table 1). [...] reported a potential alteration of an Exonic Splicing Enhancer (ESE) site. VarSite did not report any effect on protein structure. Furthermore, the bioinformatic predictive analysis suggested that rs190436077 and rs13306435 may affect protein structure because of their amino acid substitutions, whereas rs148171375 was not predicted to affect protein function, structure or splicing activity. In fact, rs190436077 causes a change from Glutamate (with a negatively charged side chain) to Glutamine (carrying a neutral side chain) at the 79th amino acid, and it has also been predicted to disrupt an ESE site. Moreover, it falls within the loop connecting the first two helical structures of the protein, which contributes to the formation of the site through which the IL6/IL6R complex binds gp130 (23).
The rs190436077 variant may therefore be investigated experimentally to verify its potential role in altering the binding ability of IL6. It could also be evaluated for potential effects on the affinity for IL6-targeting drugs, which may result in an altered drug response or effectiveness. Concerning the variants located within IL6R, two synonymous (rs2228144, G/A and rs2229237, C/T) and three missense variants (rs2228145, A/C; rs28730735, C/T and rs143810642, C/T) were detected in the Italian cohort. These variants showed variable frequency distributions, with rs2228144 (MAF: 0.178) and rs2228145 (MAF: 0.327) being the most frequently observed (Table 1). Concerning the synonymous variants, the bioinformatic analysis supported an effect on the splicing mechanisms for the rs2229237 variant, which was predicted to activate a cryptic acceptor site and alter the regulatory splicing sequences (Table 1). All the missense variants were predicted to alter Exonic Splicing Enhancer (ESE) sites and/or create new Exonic Splicing Silencer (ESS) sites, whereas rs2228145 was also predicted to affect protein function because of the amino acid substitution from Aspartate (Asp, with a negatively charged side chain) to Alanine (Ala, with an aliphatic side chain) at the 358th residue. Interestingly, this amino acid change is located within the extracellular domain of the receptor, which is fundamental for the interaction of IL6R with extracellular ligands. Therefore, the variant may alter the domain conformation, potentially interfering with IL6 recognition. Indeed, the C allele of rs2228145 is strongly associated with increased levels of soluble IL6R in blood, serum and Cerebrospinal Fluid (CSF) [11,27]. This finding may be explained by the fact that the Ala residue makes the conformation at this site more susceptible to cleavage, leading to increased levels of soluble IL6R [11].
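The MAF values quoted above can be related to allele counts with simple arithmetic. The sketch below assumes diploid autosomal genotypes (IL6 and IL6R map to chromosomes 7 and 1) and assumes that all 171 WES samples were genotyped at a given site, which the text does not state. The function name `minor_allele_frequency` is hypothetical, and the allele counts are illustrative, not values reported by the study.

```python
def minor_allele_frequency(allele_count: int, n_samples: int) -> float:
    """Return the minor allele frequency for a biallelic, diploid site.

    allele_count is the number of copies of one allele observed among
    the 2 * n_samples chromosomes; the smaller of the two allele
    frequencies is returned.
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    total_alleles = 2 * n_samples
    if not 0 <= allele_count <= total_alleles:
        raise ValueError("allele_count must lie between 0 and 2 * n_samples")
    freq = allele_count / total_alleles
    return min(freq, 1.0 - freq)


N_WES_SAMPLES = 171  # WES samples, as stated in Materials and Methods

# Illustrative allele counts (assumptions, not reported data).
for copies in (1, 3):
    print(copies, round(minor_allele_frequency(copies, N_WES_SAMPLES), 3))
# 1 0.003
# 3 0.009
```

Under these assumptions, one and three copies of an allele among 342 chromosomes round to 0.003 and 0.009, the two bounds of the MAF range quoted for the IL6 variants.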
In addition to investigating coding variants in the IL6 and IL6R genes, we performed a "Disease and Function" analysis in IPA to visualize the pathophysiological pathways in which IL6 and IL6R may be implicated, how they could affect the severity and progression of COVID-19 and neuroinflammatory diseases, and their possible use as druggable targets for these conditions. According to this analysis, IL6 and IL6R appeared to be implicated in several pathogenetic mechanisms associated with COVID-19 severity and mortality, especially those affecting the lungs, liver, heart and nervous system (Figure 3).
Notably, the lung is the organ most affected by SARS-CoV-2, whose infection triggers acute immuno-inflammatory responses culminating in decreased oxygen uptake, lung injury and severe pneumonia [28]. Moreover, acute cardiac injury (arrhythmias, myocardial infarction and heart failure) and abnormal blood clotting have been reported as complications of SARS-CoV-2 infection in approximately 20-30% and 38% of COVID-19 patients, respectively [28][29]. Cardiac and blood vessel involvement can result from direct and indirect mechanisms, including viral infiltration into myocardial tissue (causing cardiomyocyte death and inflammation), stress induced by respiratory failure and hypoxemia, and inflammation due to severe systemic hyperinflammation [29]. In 14-53% of cases, abnormal levels of alanine aminotransferase, aspartate aminotransferase and lactate dehydrogenase, as well as lymphopenia, have been associated with hepatic dysfunction and liver injury [30][31]. These alterations may be either a consequence of direct viral invasion or due to drug hepatotoxicity and immune system overdrive. In addition, 14-36% of severe COVID-19 patients reported neurological symptoms, such as taste and smell impairment, dizziness, seizures, impaired consciousness, encephalitis and stroke [32][33]. In this case too, neurological symptoms could depend on viral infection of the brain or on systemic hyperinflammation and abnormal blood clotting. In addition, the "Disease and Function" analysis reported that IL6 and IL6R were implicated in damage of synapses, microglia proliferation, astrocyte swelling and severe dementia in AD (namely, clinical dementia rating score 3 Alzheimer's disease) (Figure 3).
These data, together with the evidence of an association between high IL6 levels and neuroinflammation [11], advocate for a role of IL6 and IL6R as molecular contributors to AD progression and designate them as candidate druggable targets for AD and other neurodegenerative diseases mediated by neuroinflammation.

Conclusions
Considering the data presented above, we encourage similar studies in other populations to verify the existence of population-specific genomic profiles, which could contribute to the differential susceptibility to and progression of COVID-19 and/or neuroinflammatory diseases, as well as to variable drug response. It is important to note that IL6 and IL6R are excellent targets for immuno-modulatory therapies because of their pleiotropic effects in several tissues (liver, brain, bone, lung, skeletal muscle, heart) and biological fluids (blood, serum/plasma, urine). In fact, several drugs targeting IL6 (sirukumab, clazakizumab, siltuximab and olokizumab) have been designed and are currently approved or under investigation for treating RA, Crohn's disease, depression, lupus nephritis and Castleman disease. Concerning IL6R-targeting drugs, sarilumab is currently indicated for moderate to severe active RA, whereas tocilizumab is used in the treatment of moderate to severe RA, giant cell arteritis, polyarticular juvenile idiopathic arthritis, systemic juvenile idiopathic arthritis and cytokine release syndrome. In addition, tocilizumab is currently under investigation as a treatment option for patients affected by severe COVID-19 [34][35]. Given these data, the availability of IL6-IL6R-related biomarkers for COVID-19 may help to counteract or promptly treat harmful complications and prevent multi-organ failure. At the same time, IL6-IL6R-related biomarkers could also be useful for assessing the susceptibility to and progression of neuroinflammatory disorders and for selecting the most suitable treatment strategies in order to improve patients' prognosis and quality of life. In conclusion, this study showed how the pleiotropic activity of IL6 could be exploited to meet different clinical needs and realize precision medicine protocols for chronic, age-related and modern public health emergencies.

Figure 1
Canonical pathways involving IL6 and IL6R retrieved by IPA software. This figure has been created by "Path Designer" IPA tool.

Figure 3
Pathophysiological conditions in which IL6 and IL6R may be implicated following the "Disease & Functions" analysis performed on IPA tool. The figure has been created by "Path Designer" IPA tool.