Examination of Genetic Variants Revealed from a Rat Model of Brain Ischemia in Patients with Ischemic Stroke: A Pilot Study

Although there has been great progress in understanding the genetic bases of ischemic stroke (IS), many of its aspects remain underexplored. These include the genetics of outcomes, as well as problems with the identification of real causative loci and their functional annotations. Therefore, analysis of the results obtained from animal models of brain ischemia could be helpful. We have developed a bioinformatic approach exploring single nucleotide polymorphisms (SNPs) in human orthologues of rat genes expressed differentially under conditions of induced brain ischemia. Using this approach, we identified and analyzed nine SNPs in 553 Russian individuals (331 patients with IS and 222 controls). We explored the association of SNPs with both IS outcomes and with the risk of IS. SNP rs66782529 (LGALS3) was associated with negative IS outcomes (p = 0.048). SNPs rs62278647 and rs2316710 (PTX3) were associated significantly with IS (p = 0.000029 and p = 0.0025, respectively). These correlations for rs62278647 and rs2316710 were found only in women, which suggests a sex-specific association of the PTX3 polymorphism. Thus, this research not only reveals some new genetic associations with IS and its outcomes but also shows how exploring variations in genes from a rat model of brain ischemia can be of use in searching for human genetic markers of this disorder.


Introduction
Stroke, including ischemic stroke (IS), is a multifactorial disease and one of the main causes of morbidity and mortality worldwide [1]. Several factors are known to influence the risk of IS, and genetic factors can contribute substantially to an individual's susceptibility. Depending on the subtype of IS, the estimated heritability ranges from 16.1% for small vessel disease to 32.6% and 40.3% for cardioembolic and large vessel disease, respectively [2]. The most recent and largest genome-wide association (GWA) study revealed 22 new stroke risk loci, bringing the number of known stroke-associated loci to 32, comprising 149 genes [3]. The outcomes of IS are also influenced by different factors, including genetic ones [4]. However, in contrast to IS itself, our knowledge of the genetics of poststroke outcome is very limited. Only a few GWA studies of outcome have been performed, and several associated loci have been found [5][6][7]. Because these studies proved effective in revealing trait-associated loci, further such studies are expected. However, GWA studies have limitations. For example, they identify true causal variants or genes with poor precision because of the linkage disequilibrium-based architecture of the DNA arrays used; this has hindered the development of biological insights and the clinical translation of GWA findings, and it requires functional validation of each gene association discovered [8]. Ultimately, this deficiency highlights the need for new approaches that can delineate the causal genetic variants and biological mechanisms underlying the associations observed with IS. Such approaches can be realized through enhanced downstream characterization of identified loci, drawing on different 'omics' data and advanced analytic techniques, or through direct functional dissection of particular variants and genes [8][9][10].
In parallel, the genetics of IS has also been studied intensively in animal models [11,12]. Some of the results obtained in rodent models of stroke have been confirmed by, or correlated with, the results of association studies in humans, including GWA-assessed outcomes after IS [5], but their translation remains generally low. This is because of substantial physiological differences between animals and humans and the inadequacy of animal models of stroke [12]. Although the creation of better animal models of the disease has been proposed as one way of improving translation, relevant results can also be obtained through approaches based on careful selection and cross-species validation of multi-omic data [13,14]. Following similar principles, we have recently developed a bioinformatics protocol to search for genomic markers that could affect IS outcome [15]. The protocol bears certain similarities to the method of Pasterkamp et al. [16], which was used for the human validation of atherosclerosis-causing genes studied in knockout and transgenic mice. Our method differs in how it selects both the candidate genes, which are human orthologues of rat genes that have demonstrated the most significant changes in their expression in the brain in response to temporal artery occlusion, and the SNPs, which include functionally important SNPs. Here, we present the results of an association study of nine genetic variants identified with this protocol. Associations with both the outcomes and the risk of IS were tested.

Study Subjects
The association study was carried out on a subgroup of patients with IS admitted to the Neurologic Department of Moscow City Clinical Hospital No. 31 during 2009-2011. Detailed procedures of patient enrollment and data collection have been described previously [17]. Briefly, the study group comprised both men and women aged ≥55 years, diagnosed with their first IS, and admitted to hospital at the acute stage of the disease. Patients with transient ischemic attacks, recurrent stroke, hemorrhagic stroke, acute myocardial infarction, decompensated heart failure, or other severe accompanying conditions, including autoimmune and infectious diseases, were not recruited into the study. The stroke subtypes were defined according to the Trial of Org 10172 in Acute Stroke Treatment criteria [18]. Stroke severity was assessed using the National Institutes of Health stroke scale [19]. The functional outcome after stroke was graded using the modified Rankin scale (mRS) [20]. Severity and outcome were assessed twice, at days 1 and 14 after the stroke event. To obtain a group with homogeneous ancestry, only self-described Russian patients were included. In all, 331 patients (166 males and 165 females; mean age 71.68 ± 8.69 years) with IS were enrolled. The control group included 222 healthy individuals (109 males and 113 females; mean age 70.20 ± 9.63 years) with Russian ancestry from regions of Central European Russia. All subjects provided written informed consent for participation.

Selection and Genotyping of Markers
DNA was isolated from peripheral blood cells by proteinase K treatment followed by phenol-chloroform extraction [21]. The principles for selecting the polymorphic markers analyzed in the study were described in our recently developed protocol [15]. They comprised the following main steps: (1) selection of rat genes that showed the most significant changes in their expression in the brain in response to temporal artery occlusion; (2) identification of the human orthologues of these rat genes; (3) identification of SNPs within the human genes and selection among them of informative SNPs with a high degree of linkage disequilibrium (tagSNPs), which requires whole-genome sequence data from an appropriate population (in our study, the CEU population from the 1000 Genomes Project); and (4) annotation and identification of functionally important tagSNPs that affect the expression of the genes studied, namely expression quantitative trait loci (eQTLs); GTEx Analysis Release V8 was used to annotate the tagSNPs. The annotation sharply reduced the number of candidate genes by discarding those with low potential. In evaluating the tagSNPs, we first selected the SNPs associated with changes in the expression of the corresponding genes in human brain tissues. There were eight such SNPs in five genes (PTX3, RGS9, EMP1, CD14, and LGALS3) (Table 1). An additional SNP (rs1491961 in CCR1) was included in the analysis based on its prominent eQTL-related statistics: it had the most significant p-value and the largest absolute normalized effect size (10⁻⁴⁷ and -0.40, respectively), and its functional effect was associated with expression in whole blood.
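The final filtering step described above (keeping tagSNPs that act as eQTLs for a candidate gene in brain tissue) can be sketched as follows. This is only an illustration under stated assumptions: the record fields, the `p_cutoff` parameter, the tissue naming, and every record in the example are hypothetical and are not taken from the protocol [15] or from GTEx.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EqtlRecord:
    """One hypothetical eQTL annotation row (field names are assumptions)."""
    snp: str
    gene: str
    tissue: str
    p_value: float
    is_tag_snp: bool


def select_brain_eqtl_tag_snps(records, candidate_genes, p_cutoff):
    """Return sorted IDs of tagSNPs that are eQTLs for a candidate gene in brain tissue."""
    selected = {
        r.snp
        for r in records
        if r.is_tag_snp
        and r.gene in candidate_genes
        and r.tissue.lower().startswith("brain")
        and r.p_value < p_cutoff
    }
    return sorted(selected)


# Placeholder records: SNP IDs, genes, tissues, and p-values are invented.
records = [
    EqtlRecord("snp_1", "GENE_A", "Brain_Cortex", 1e-8, True),
    EqtlRecord("snp_2", "GENE_A", "Whole_Blood", 1e-20, True),       # not brain
    EqtlRecord("snp_3", "GENE_B", "Brain_Cerebellum", 1e-6, False),  # not a tagSNP
    EqtlRecord("snp_4", "GENE_C", "Brain_Cortex", 1e-9, True),       # not a candidate gene
    EqtlRecord("snp_5", "GENE_B", "Brain_Hippocampus", 0.2, True),   # weak eQTL signal
]

# The cutoff value is an assumption chosen only for this example.
selected = select_brain_eqtl_tag_snps(records, {"GENE_A", "GENE_B"}, p_cutoff=1e-5)
print(selected)
```

A whole-blood eQTL such as the additional CCR1 variant would be excluded by this brain-only filter, which is consistent with the text describing it as a separate, manually justified inclusion.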
The selected tagSNPs were genotyped in the current study using TaqMan real-time PCR assays. Primers and probes for allele discrimination were designed using Beacon Designer software v. 7.91 (PREMIER Biosoft International, San Francisco, CA, USA) and were synthesized by DNA-Synthesis Company (Moscow, Russia) (Table 1). All the genotypes obtained are presented in Supplementary Table S1.

Statistical Analysis
The aim of this study was to explore the associations between genetic variants and IS outcomes as well as the risk of developing the disease. The effects of SNPs on stroke outcome were assessed through their associations with mRS scores. Scores were analyzed as follows: (1) mRS scores of 0-2 vs. scores of 3-6 (Study 1); (2) mRS scores of 0-3 vs. scores of 4-6 (Study 2); and (3) changes in mRS scores from day 1 to day 14 following IS (Study 3). The first two studies were both based on the mRS scores assessed at day 14 and differed in the criterion used to group patients: Study 1 distinguished patients by the level of help they required in everyday life (mRS scores of 0-2 vs. scores of 3-6; functional recovery group 1 and nonfunctional recovery group 1), whereas Study 2 distinguished them by their ability to walk (mRS scores of 0-3 vs. scores of 4-6; functional recovery group 2 and nonfunctional recovery group 2). Study 3 was conducted to evaluate differences in the direction of poststroke outcomes. These were assessed by calculating individual changes (∆ values) in the mRS scores given to patients at days 1 and 14, ∆mRS = mRS1 - mRS14, which were defined as positive (∆mRS > 0), negative (∆mRS < 0), or stable (∆mRS = 0). To assess the relationships between the genotypes and these variables, the genotypic χ² test was used. When there was a need to clarify the contribution of particular genotypes, dominant, recessive, and overdominant genetic models were additionally applied. To identify independent predictors, multiple linear regression was used. The results were adjusted for age and sex [3,5,22]. The significance threshold was set at p < 0.0056, calculated as the nominal level (p < 0.05) divided by the total number of SNPs tested (n = 9), that is, a Bonferroni correction. Statistical analyses were performed using Statistica software v. 8.0 (StatSoft, Inc., Tulsa, OK, USA), the R package AssocTests [23], and Haploview software v. 4.2 [24].
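The three calculations described in this section (the ∆mRS categories, the corrected significance threshold, and the genotypic χ² test) can be illustrated with a short sketch. It is not the authors' code, which relied on Statistica, AssocTests, and Haploview; the function names and the genotype counts are hypothetical. The p-value uses the closed-form χ² survival function that holds for an even number of degrees of freedom, which covers two or three groups compared across three genotypes.

```python
import math


def delta_mrs_category(mrs_day1, mrs_day14):
    """Classify the direction of change using delta = mRS1 - mRS14."""
    delta = mrs_day1 - mrs_day14
    if delta > 0:
        return "positive"
    if delta < 0:
        return "negative"
    return "stable"


def bonferroni_threshold(alpha, n_tests):
    """Nominal significance level divided by the number of tests."""
    return alpha / n_tests


def genotypic_chi2(table):
    """Pearson chi-square for a groups x genotypes table of counts.

    Assumes every row and column total is non-zero.
    Returns (chi2, degrees_of_freedom).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


def chi2_sf_even_df(x, df):
    """P(X > x) for a chi-square variable with an even df (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


threshold = bonferroni_threshold(0.05, 9)  # 0.05 / 9, reported as 0.0056

# Hypothetical genotype counts (two groups x three genotypes), not study data.
chi2, df = genotypic_chi2([[30, 20, 10], [10, 20, 30]])
p_value = chi2_sf_even_df(chi2, df)
print(round(threshold, 4), chi2, df, p_value < threshold)
```

Dividing the nominal level by the number of SNPs is the simplest multiple-testing correction; it is conservative when SNPs are correlated, as the two PTX3 variants in strong LD are.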

Results
Nine SNPs were genotyped in the current study. The frequencies of genotypes according to the groups studied are listed in Tables 2 and 3. The distributions of all genotypes in control samples did not deviate from expected values under Hardy-Weinberg equilibrium.
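A Hardy-Weinberg check of the kind mentioned above can be sketched as follows. The text does not state which test was applied, so this sketch assumes a Pearson χ² goodness-of-fit test with one degree of freedom; the function name and the counts are hypothetical.

```python
import math


def hwe_chi2_test(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions (1 degree of freedom).

    Assumes both alleles are present, so all expected counts are non-zero.
    Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical control-group counts, chosen to match HWE proportions exactly.
chi2_hwe, p_hwe = hwe_chi2_test(25, 50, 25)
print(chi2_hwe, p_hwe)
```

For small samples or rare alleles an exact test is usually preferred over this approximation.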
Of the 331 IS patients, 330 were analyzed (one patient was excluded because he died on day 8 after IS). In Study 1, 116 patients were classified into the functional recovery group (mRS scores of 0-2) and 214 into the nonfunctional recovery group (mRS scores of 3-6). In Study 2, the functional recovery group included 175 patients (mRS scores of 0-3), while the nonfunctional recovery group comprised 155 patients (mRS scores of 4-6). In Study 3, 155 patients showed positive changes in their mRS scores (∆mRS > 0), 67 patients showed negative changes (∆mRS < 0), and 108 patients were stable (∆mRS = 0) (Table 2). No substantial differences in genotype distributions were found between groups in Studies 1 and 2. At the same time, analysis of the genotype distributions observed in Study 3 revealed one nominally significant SNP (p < 0.05), namely rs66782529 (Table 2). Additional pairwise comparisons suggested that this association was caused by a distinct genotype pattern in the group of patients with negative ∆mRS values, particularly by a higher frequency of genotype CC in this group than in the other two. The association was most pronounced in the comparison of the groups with negative and stable ∆mRS values, where rs66782529 was an independent predictor of IS outcome (p = 0.0047 compared to p = 0.9235 and p = 0.3071 for sex and age, respectively).
Because the genetics of IS is not yet clearly understood, and experiments in animal models cannot be fully relevant to humans, the SNPs were also tested for association with the risk of IS. Two SNPs showed significant associations with the disease: rs62278647 and rs2316710 (Table 3). Both SNPs belonged to the PTX3 gene and were in strong linkage disequilibrium (LD) in our sample (r² = 0.82). Analysis of the genotype distribution of rs62278647 under the dominant, recessive, and overdominant genetic models showed that the most pronounced association was with the heterozygous genotype AT in patients (p = 0.000027). For SNP rs2316710, a similar prevalence of heterozygotes was observed, but it was not significant (p = 0.0095); this agreed with the results of regression analysis, in which SNP rs62278647 was the only independent predictor of IS. Additional comparisons performed separately in the male and female cohorts showed that the associations found were significant only in females (p = 0.000057 and p = 0.000765 for rs62278647 and rs2316710, respectively). Both associations remained significant even after correction for the two additional comparisons performed (p < 0.0056/2). No substantial differences in genotype distributions between the male and female cohorts were observed for the other SNPs (Table S2, Figure S1).
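The genetic models referred to above each collapse the three genotype counts into two classes before cases and controls are compared. The sketch below illustrates this with a Pearson χ² test on the resulting 2 × 2 table, without continuity correction. The function names and counts are hypothetical, B denotes the allele of interest, and the software used in the study may compute these tests differently.

```python
import math


def collapse_genotypes(counts, model):
    """Collapse (AA, AB, BB) counts into (carrier class, other class)."""
    n_aa, n_ab, n_bb = counts
    if model == "dominant":      # AB + BB vs. AA
        return (n_ab + n_bb, n_aa)
    if model == "recessive":     # BB vs. AA + AB
        return (n_bb, n_aa + n_ab)
    if model == "overdominant":  # AB vs. AA + BB
        return (n_ab, n_aa + n_bb)
    raise ValueError(f"unknown model: {model}")


def chi2_2x2(cases, controls):
    """Pearson chi-square (1 df, no continuity correction) for a 2 x 2 table.

    Assumes all row and column totals are non-zero. Returns (chi2, p_value).
    """
    a, b = cases
    c, d = controls
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Threshold after the extra correction for the two sex-specific comparisons.
stratified_threshold = 0.05 / 9 / 2

# Hypothetical genotype counts (AA, AB, BB), not study data.
case_counts = (5, 30, 5)
control_counts = (20, 10, 10)
chi2_od, p_od = chi2_2x2(
    collapse_genotypes(case_counts, "overdominant"),
    collapse_genotypes(control_counts, "overdominant"),
)
print(chi2_od, p_od, p_od < stratified_threshold)
```

The overdominant model is the one that isolates an excess of heterozygotes, which is the pattern the text reports for genotype AT.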

Discussion
The outcomes of IS vary from complete recovery to persistent severe disability or death. Here, we assessed the associations of genetic variants in six genes with poststroke outcomes. Traditionally, outcomes are analyzed by dividing patients into functional and nonfunctional recovery groups based on individual mRS scores (i.e., the contrasted groups can include patients with mRS scores of 0-2 and scores of 3-6 [5] or with mRS scores of 0-3 and scores of 4-6 [25]). As the optimal grouping lacks consensus, we also subdivided patients based on the dynamics and direction of changes in their mRS scores from days 1 to 14. This grouping resulted in three different categories (positive, negative, and stable), which did not correlate directly with functional impairments after IS. The results of regression analysis showed less dependence of the proposed grouping on the age of patients and suggested that it could be an alternative to other variants in the division of patients when studying the outcomes of IS.
No associations were found between the studied SNPs and functional outcome as assessed by mRS scores. However, one SNP locus (rs66782529) was found to be nominally correlated (p < 0.05) with negative ∆mRS values. This SNP is 8 kb distant from the start of the LGALS3 gene, but following formal criteria, we annotated it with this gene.
LGALS3 encodes a beta-galactoside-binding protein, galectin-3, that is involved in the regulation or mediation of diverse intra- and extracellular processes, including cell adhesion, migration, differentiation, apoptosis, and inflammation [26]. Galectin-3 is ubiquitously expressed in adult humans under normal physiological conditions. Its expression was also found to be significantly and rapidly induced under disease conditions. Therefore, galectin-3 is regarded as an important biomarker, particularly of cardiovascular and cerebrovascular injuries and diseases [27,28]. This potential is associated with its pro-inflammatory properties. The role of galectin-3 in mediating inflammatory responses during subacute stroke in rats has been demonstrated directly through the reversal of neuroinflammation when LGALS3 expression was inhibited [29]. Considering the data of the Genotype-Tissue Expression (GTEx) project, which demonstrated that rs66782529 genotypes can influence LGALS3 expression in the brain, the results of our study agree with the literature. First, according to the GTEx study, the CT and TT genotypes decreased expression. Second, the maximum number of patients with the CC genotype was observed in the group with negative ∆mRS values. In that group, the decline of functional abilities was not stopped or reversed. Therefore, neuroinflammation, as a factor producing further brain damage, could have been involved in these patients. Additionally, when considering the functional features of SNP rs66782529 according to the results of HaploReg [30] and RegulomeDB [31], it is interesting to note that the SNP affects LGALS3 expression by influencing the local chromatin state.
Additional associations were found between the SNPs and IS. As stated above, they were evaluated because of the limitations of existing data on stroke genetics, particularly its potential specificity in populations with different ethnicities. Our analysis was performed in patients self-identifying as Russian. Both significant SNPs were located upstream of the PTX3 gene. This gene encodes pentraxin-3, a protein that is an essential mediator of innate resistance to pathogens of fungal, bacterial, and viral origin and is involved in the regulation of inflammation, angiogenesis, tissue remodeling, and cancer [32]. As with LGALS3, PTX3 expression was found to be increased under disease conditions, and pentraxin-3 has been defined as a novel and independent prognostic marker of mortality after acute myocardial infarction (AMI) and IS [33,34]. At the same time, experiments in animal models showed that PTX3 depletion resulted in behavioral deficits and progression of brain damage in mice at a chronic phase, suggesting that pentraxin-3 might act in a time-dependent manner, intensifying inflammation early after injury and promoting tissue repair processes later [35][36][37]. Here, we did not find any correlations between the eQTL tagSNPs from PTX3 and functional outcomes after IS, but two of them (rs62278647 and rs2316710) were related to the risk of IS. The risk of IS was generally correlated with the prevalence of heterozygous genotypes in our patients. A similar direction of association was demonstrated in the study of Melegy et al. [38], who tested the association of SNP rs2305619 with AMI (SNP rs2305619 was in strong linkage disequilibrium (LD; r² = 0.97) with our SNP rs2316710 in the CEU test population from the 1000 Genomes Project [15]). At the same time, no such correlations were observed in the larger study of Barbati et al. [39], who explored the influence of PTX3 polymorphisms on the risk of AMI and on pentraxin-3 plasma levels (the SNPs tested were in strong LD with our SNPs rs2316710 and rs7634847, but not with rs62278647 (r² = 0.47-0.69), in the CEU test population [15]). Given that PTX3 is a marker for prognosis after AMI and IS, and following Barbati et al. [39], we propose that the lack of correlations between PTX3 polymorphisms and IS outcome arises because the PTX3 level itself is not a causal factor for IS outcome. The results of additional functional annotations of SNPs rs62278647 and rs2316710 seem to support this idea, as both received lower ranks than SNP rs66782529 and their effects were limited to regulatory motif sequences. However, our data on the differences in genotypic associations with IS risk between male and female patients complicate the question, suggesting the possibility of sex-based specificity of PTX3 function, including its correlation with a higher rate of IS in women [40,41]. This presumption requires verification in further studies.

Conclusions
Summarizing the results of the work, we can conclude that our recently developed approach in searching for genetic markers of IS was clearly effective, as it has revealed several new associations with the outcomes and risk of IS. At the same time, the study had some limitations. First was the relatively small sample size that could reduce the statistical power of our study. Second, because the analysis was restricted to local eQTLs, the potential associations of distant SNPs (trans-eQTLs) with stroke and poststroke events were not considered. Taking these factors into account will increase the abilities and power of our approach.