Gene–Toxicant Interactions in Gulf War Illness: Differential Effects of the PON1 Genotype

About 25–35% of United States veterans who fought in the 1990–1991 Gulf War report several moderate or severe chronic systemic symptoms, defined as Gulf War illness (GWI). Thirty years later, there is little consensus on the causes or biological underpinnings of GWI. The Gulf War Era Cohort and Biorepository (GWECB) was designed to investigate genetic and environmental associations with GWI and consists of 1343 veterans. We investigate candidate gene–toxicant interactions that may be associated with GWI based on prior associations found in human and animal model studies, focusing on SNPs in or near ACHE, BCHE, and PON1 genes to replicate results from prior studies. SOD1 was also considered as a candidate gene. CDC Severe GWI, the primary outcome, was observed in 26% of the 810 deployed veterans included in this study. The interaction between the candidate SNP rs662 and pyridostigmine bromide (PB) pills was found to be associated with CDC Severe GWI. Interactions between PB pill exposure and rs3917545, rs3917550, and rs2299255, all in high linkage disequilibrium in PON1, were also associated with respiratory symptoms. These SNPs could point toward biological pathways through which GWI may develop, which could lead to biomarkers to detect GWI or to better treatment options for veterans with GWI.


Introduction
Gulf War veterans began reporting chronic systemic symptoms almost immediately after returning from deployment to the Gulf in support of the 1990-1991 Gulf War [1][2][3][4]. These symptoms have only increased, in both severity and frequency, over the past 30 years, and have been defined as Gulf War illness (GWI) [1, [5][6][7][8][9][10]. The 2016 Institute of Medicine (IOM) report on GWI indicated great heterogeneity in the expression of GWI across the veteran population, suggesting heterogeneity in genetic susceptibility, environmental exposures, and/or their interactions [11]. There are still important gaps in our understanding

Introduction to GWECB
The Gulf War Era Cohort and Biorepository (GWECB) consists of DNA, electronic medical records, and survey data from individuals who served in the United States Armed Forces during the 1990-1991 Gulf War [44]. Data collected includes self-reported exposures, exposure lengths, symptoms, symptom severity ratings, clinical conditions, and deployment locations. Women, veterans of color, and deployed veterans were oversampled, and veterans did not have to use the Veterans Affairs system for healthcare to be eligible for inclusion.

GWI Phenotypes
Veterans from the GWECB cohort have been classified as meeting or not meeting the case definition using the CDC, CDC Severe, Kansas full case definition, and Kansas symptom criteria [6]. Case status was coded using a deterministic SAS algorithm and basic survey data cleaning was also performed in SAS 9.4, as described in Vahey et al. and Gifford et al. [6,45].
Eleven outcomes were used for the regression analysis: the CDC Severe GWI Case Definition, the three CDC Severe symptom domains (CDC Severe: Musculoskeletal, CDC Severe: Mood/Cognition, and CDC Severe: Fatigue), Kansas Symptom Criteria, and the six Kansas symptom domains (Kansas: Pain, Kansas: Fatigue, Kansas: Respiratory, Kansas: Skin, Kansas: Mood/Neurological/Cognitive, and Kansas: Gastrointestinal). An additional variable derived from the GWI case algorithm is the Kansas exclusionary conditions indicator. This variable identifies veterans with at least one of 21 exclusionary health conditions, such as diabetes, heart disease, and cancer. Fulfilling the exclusionary criterion precludes an individual from being considered a GWI case. The CDC Severe and Kansas GWI case definitions are known to be heterogeneous; it is possible that multiple traits with different underlying etiologies make up GWI, complicating analysis. Thus, studying the symptom domains separately may isolate more homogeneous subphenotypes. Each symptom domain in the CDC Severe and Kansas case definitions were considered as individual outcomes, as in prior studies [13,14]. Previous analyses of deployment in GWECB demonstrated a strong and consistent association of deployment with the CDC Severe Case Definition and thus CDC Severe will serve as the primary outcome for this analysis [6].

Gulf War Exposures
Veterans deployed to the Gulf during the 1990-1991 Gulf War were asked to report on their exposures. The survey asks for both a yes/no response and a "number of days exposed" response for 11 exposure questions. Appendix A Figure A1 is a scan of the survey page for these exposure responses. Ambiguous values were coded based on previous work [6]: number of days exposed was set to zero for individuals who said they did not experience an exposure, regardless of their secondary answer. Exposures used were "Took pyridostigmine bromide (anti-nerve agent pills)", "Used pesticide cream or liquid on your skin", "Wore a uniform treated with pesticides", and "Used insect baits/no-pest strips in your living area". The PB exposure variable used was the number of days each individual reported being exposed to PB pills; the options were "No, none", "Yes, 1-6 days", "Yes, 7-10 days", and "Yes, 31 days or more". The pesticide exposure variable combined the number of days reported for all three pesticide variables listed above, based on the minimum number of days per option (i.e., "Yes, 1-6 days" was counted as 1 day). Less than 31 days combined exposure was a short exposure, 31-60 days combined exposure was a medium exposure, and more than 60 days combined exposure was a long exposure. Pesticide use and PB pill exposure were chosen as candidate exposures over "exposed to biological or chemical warfare agents", following conclusions of the 2014 Research Advisory Committee report [12].

Genotypes and Genetic Data Cleaning
Blood samples were collected for DNA extraction, as described by Khalil et al. [44]. DNA samples were whole-genome amplified, hybridized to microarrays, single-base extended and stained using the Automated Protocol for the Illumina Infinium HTS Assay in batches of two 96-sample plates per run. Genotyping was done on an Illumina Omni2.5-8v1.4 microarray at the Pharmacogenomics Analysis Laboratory (PAL) in the Central Arkansas Veterans Healthcare System. Analysis of agreement in genotypes between sample duplicates indicated excellent genotyping quality. Plink 1.9 was used for genetic data cleaning and analysis [46]. A total of 2,372,461 SNPs were genotyped; 82,369 SNPs were removed due to missingness greater than 1% and 512,116 SNPs with minor allele frequency (MAF) less than 1% were removed, including the two SNPs included by Steele et al. [13], leaving 1,777,976 SNPs. From those 1,777,976 SNPs, 3 SNPs were chosen for Tier 1 candidate SNP testing (i.e., BCHE, PON1, ACHE) and 371 SNPs were chosen for Tier 2 candidate SNP testing, representing all SNPs +/− 50 kb from the annotated gene. Individuals who returned surveys but who did not provide a specimen for genetic testing were excluded from analysis. Beginning with 1343 samples, replicates were removed, keeping the most informative sample for each individual. Two individuals were removed because of missing genotypes and 13 individuals were removed because of perceived genetic relatedness, leaving 1260 veterans for genetic analysis. As only veterans deployed to the Gulf were asked to report on their exposures, those who were deployed elsewhere and those who were not deployed were removed from the analysis. A total of 810 GW-deployed veterans were in the final analytic dataset after genotype and phenotype data cleaning, as seen in Figure 1.

Statistical Analysis
We fit a gene/environment interaction model to estimate the effect and statistical significance of specific candidate gene/toxicant interactions. We used logistic regression as implemented in PLINK 1.9 to test the effect of genetic variants and gene-by-exposure interaction [47]. As discussed above, the primary outcome was the CDC Severe phenotype. Analyses of other GWI case phenotypes and the individual symptom domain phenotypes were considered to be secondary. Given the complexity of the GWI phenotypes and the correlation among individual symptom domains, as well as their relationship to GWI outcomes, we did not adjust for multiple comparisons across the phenotypes but rather reported the p-values and effect sizes for each analysis.
Ten genetic principal components, representing population structure, along with age and sex were used as covariates for the basic model. The fully adjusted model included age, sex, education, income, and a binary indicator of the Kansas exclusionary criterion as an indicator of presence or absence of medical conditions. Because the Kansas exclusionary criterion is part of the full Kansas GWI case definition, the Kansas symptom criteria indicator was used as an outcome rather than the full Kansas GWI case definition in the fully adjusted model. Equation (1) outlines the basic model and Equation (2) outlines the fully adjusted model. We tested β 3 using the standard allele test in an additive genetic model as implemented in PLINK.
Interaction trends were tested for significance using the Cochran-Armitage test for trend, according to the prop_trend_test() function in the rstatix package in R version 3.61. LocusZoom plots were generated using legacy LocusZoom.org [48]. Tier 1 SNPs were tested at p = 0.05, while Tier 2 SNPs were tested at p = 0.00013 using a Bonferroni correction for the 371 SNPs in Tier 2. SNPs were prioritized based on p-value and representation across phenotypes.

SNP Selection
Both models were performed for each of the 11 outcomes, and both exposures. For inclusion in these models, we identified two kinds of a priori SNPs based on published evidence for gene-toxicant interactions in GWI. Tier 1 SNPs in Table 1 have been previously identified in genetic analysis of GWI. Tier 2 SNPs, enumerated in Table 1, were chosen to cover all common variation in the genes of interest.

Demographics and Outcomes
Veterans who fulfill the CDC Severe GWI criteria and those who do not are similarly distributed by sex, age group, and OEF/OIF deployment. Those who were Black or Hispanic, in the Army, have a lower household income, or were deployed active duty were more highly represented among those who fulfill the CDC Severe GWI criteria than those who do not fulfill the CDC Severe GWI criteria ( Table 2). Those who earned advanced degrees or who were in the reserves were underrepresented among those who fulfill the CDC Severe GWI criteria.  Across the 810 deployed veterans, 26% were categorized as fulfilling the CDC Severe GWI criteria. CDC Severe case status was associated with PB and pesticide exposure ( Table 3). Table 3 shows that 36% of those who report long PB exposure (n = 149) and 33% of those who report long pesticide exposure (n = 156) reported symptoms consistent with CDC Severe GWI compared to 18% and 19% of those who reported no PB or no pesticide exposure, respectively. Across all outcomes, a larger proportion of those who were exposed to either pesticides or PB pills fulfilled the symptom criteria compared to those in the overall deployed group (Appendix A Tables A1 and A2).

Association of Tier 1 SNP/Exposure Interactions with Primary and Secondary Outcomes
The results for association for interactions of PB and pesticide exposure with the Tier 1 candidate SNPs with the primary outcome (CDC Severe GWI) and the secondary outcomes are shown for the covariate adjusted model (Table 4). Only one SNP, rs662, showed a statistically significant interaction with PB exposure for the CDC Severe GWI outcome (p = 0.049). Among the secondary outcomes, the interaction between rs662 (PON1) and pesticides had a p-value of 0.029 in the fully adjusted model of the Kansas Gastrointestinal symptom domain. The other Tier 1 interaction below p = 0.05 was rs1799805 (ACHE) with PB pill exposure for the Kansas skin symptom domain. Rs1799807 (BCHE) and rs1799805 (ACHE) had low minor allele frequencies: 0.022 and 0.039, respectively, likely resulting in lower power to detect differences across groups for these two variants. LocusZoom plots for these results can be found in Appendix A Figures A2-A5.  Figure 2 illustrates the unadjusted CDC Severe GWI frequencies across rs662 genotype (AA-homozygote 192Q phenotype, AG-heterozygote and GG-homozygote 192R phenotype) by PB pill length of exposure group. Significant trends were observed for longer lengths of exposure with higher rates of GWI were seen in the PON1 rs662 GA and GG genotypes, with p-values for the simple test for trend of 0.0004 and 0.0084, respectively. This trend was not observed in the AA group (p = 0.179).

Association of Tier 2 SNP/Exposure interactions with CDC Severe GWI
Common SNPs in the four candidate genes were plotted by genomic location using LocusZoom, highlighting the SNP with the lowest exposure/SNP interaction p-value for each gene. The LocusZoom plots for association of the interaction between PB and variants in ACHE, BCHE, PON1, and SOD1 for the primary outcome, CDC Severe GWI, are shown in Figure 3. Similar plots for the interaction between pesticide exposure and variants in ACHE, BCHE, PON1, and SOD1 are included in Appendix A Figure A6. No results achieved Bonferroni-corrected statistical significance. The lowest p-value among the PB exposure/SNP interaction terms is in PON1 for the SNP rs2299260 (p = 0.005) ( Table 5). Likewise, the lowest p-value for the interaction term with pesticides is in PON1 for rs62467349 (p = 0.0086) for CDC Severe GWI (Table 5).

Association of Tier 2 SNP/Exposure Interactions with Secondary Outcomes
Across the secondary outcomes, three SNPs showed statistically significant interactions with PB exposure for the Kansas Respiratory Symptom Domain (rs3917545, rs3917550, and rs2299255). These three SNPs were in high linkage disequilibrium with one another (r 2 = 1), across all ancestry groups and are in the PON1 gene. Other gene/environment interaction associations with individual symptom domains were nominally significant (p < 0.05), although given the small sample size and lack of power, many are likely to be false positive signals or a result of correlation among the symptoms themselves. The linkage disequilibrium between the SNPs identified in PON1 is variable with moderate LD between rs662 and rs3917545, rs3917550, and rs2299255 in European ancestry and low LD between the same SNPs in African ancestry (r 2 = 0.036 and r 2 = 0.39 for YRI and CEU, respectively, Appendix A Figures A7 and A8 [52]).   The group of interactions that was significant in the Kansas Respiratory domain model remained significant according to both the basic and fully adjusted models, but only in the PB interaction model. Figure 4 shows the interaction p-values for the region plotted on the −log 10 scale. The PON1 candidate SNP rs662 is not in high LD (r 2 < 0.4) with rs3917545.

Discussion
We identified a statistically significant association between CDC Severe GWI and the rs662/PB pill interaction, as seen in Figure 2, representing a pattern consistent with a gene-environment interaction model. While this was not the lowest p-value interaction in the study, it is the only signal that appears for our primary outcome (i.e., CDC Severe GWI) out of the Tier 1 SNPs. Significantly higher rates of CDC Severe GWI with increasing exposure to PB pills were detected for GG and GA genotypes. This increase in risk does not appear in the AA genotype, indicating that exposure alone is not responsible for this effect and separating the risk genotypes from the non-risk genotype. A significant trend among those who reported over 31 days exposure to PB pills suggests an additive risk according to the count of G alleles. This is consistent with previous reports in the literature, that the R version of the paraoxonase enzyme is less effective in detoxifying most substrates, and therefore results in higher risk with more exposure [51,53]. Rs662 is a functional variant; the rs662 G alternative allele causes a change from a glutamine to an arginine at amino acid 192. As early as 1999, Haley et al. demonstrated a genotype-specific difference in paraoxonase (arylesterase) activity which was strongly associated with GWI [14]. The Q version of the protein was previously found to have higher catalytic activity across most substrates, including sarin, while the R version of the enzyme was found to more efficiently detoxify paraoxon. In genome-wide association analyses, two variants in PON1 were associated at genome-wide significance level with paraoxonase enzyme activity, rs854572 and rs2057681, with rs2057681 in complete linkage disequilibrium (r 2 = 1) with rs662 in European Americans [52,54]. Where prior studies have measured catalytic activity, we measured genotype using a genome-wide array. The multiple associated variants that we observed in PON1 were not in high LD with one another, reflecting the conclusions of Jarvik et al. that multiple independent variants are associated with arylesterase activity of PON1 [42]. Follow-up analysis to assess enzymatic activity has not been completed in GWECB and would be required to fully replicate this finding. This variant has also been implicated in several other diseases, including as a modifier of sporadic ALS [39], which affects Gulf War veterans at higher rates than other comparable military groups [32].
Analysis of common variants within the candidate genes also identified PON1 variants as interacting with GW exposures. Rs3917545 was statistically significant after Bonferroni correction interacting with PB exposure for association with the Kansas Respiratory symptom domain outcome. Two additional SNPs were also significant, are all in linkage disequilibrium with one another, and thus are likely a single signal. These SNPs are all non-coding variants that have not been identified by other GWI studies, although one of the SNPs, rs3917545 has been identified as associated with blood protein levels of erythroid membrane associated protein in a plasma proteome screen [55,56]. Additional PON1 SNP-exposure pairs had suggestive p-values (p < 0.05) but did not meet the threshold for statistical significance. Whether these represent independent signals or are merely correlated with other SNPs, exposures and/or outcomes will require a larger dataset to untangle. The other three candidate genes BCHE, ACHE, and SOD1 did not yield any statistically significant interactions for this candidate gene-environment interaction study. Larger datasets will be required for additional discovery of gene-environment interactions related to GWI.
There are several limitations to our study. While our sample size is large relative to other studies of GWI with genetic information, we had limited power to detect rare variants, particularly those in the BCHE gene. We were also unable to replicate the BCHE/PB interaction identified by Steele et al. [13]. This is partially due to the lower allele frequency of the BCHE variants; Steele et al. combined several BCHE variants together for analysis, but we did not test variants that appeared in our data at less than 1% minor allele frequency. We prioritized candidate SNPs into two tiers, which allowed us to identify several significant signals, one of which has biological meaning. Given our sample size, we cannot dismiss type I error as a possible explanation for our findings. Future work could examine data from the MVP study, including over 100,000 Gulf War Era veterans with genotypes [57], which would allow for greater statistical power to detect gene-by-environment interactions. Future work could also analyze catalytic activity of our candidate genes, specifically Pon1, which could increase confidence in these findings. An additional limitation is around the generalizability of our findings. The GWECB survey had a very low return rate and may not be representative of the general Gulf War veteran population [44]. An important assumption in our model is the independence of genotype and exposure. While we did not observe an association between genotype and exposures, we are unable to fully rule out a possible dependency. Genotype proportions by exposure time remained consistent across all exposure categories, indicating that there is not an association between genotype and exposure length, visualized in Appendix A Figure A9. One benefit of GWECB is that both women and racial minorities were oversampled, allowing some examination of racial differences despite our modest sample size. The rs662 risk genotype is more common in those who identify as Hispanic or Black, non-Hispanic than in those who identify as White, non-Hispanic. Among White veterans, risk was higher with increased exposure in the GG and AG groups, but not in the AA group (Appendix A, Figure A10). The same is true among the Black and Hispanic veterans in the dataset (Appendix A, Figure A11), despite the smaller sample size. Additionally, we were unable to replicate a statistically significant interaction involving pesticides in our analysis, although prior work indicates that pesticide exposure may also interact with PON1 rs662 genotypes [24,25,[58][59][60][61]. Qualitatively the pesticide results show similar relationships with the SNPs in PON1. This is potentially attributable to the heterogeneity in the composition of pesticides experienced. The metabolism of carbamates (some pesticides and PB pills) is notably different than the metabolism of organophosphates (other pesticides and sarin gas) contributing to additional heterogeneity in the effect of the exposures. We used a variable that combined exposures from different pesticide sources into a variable indicating total exposure. Different pesticides may affect different biological pathways, and this further biologic heterogeneity may be reducing power to detect gene-environment interactions with pesticides. In addition, the GWECB measure of PB pills is based on self-report of exposure and did not query reaction to pesticide or PB exposure [62]. Finally, due to data and sample size limitations, we did not consider all potentially important exposures, such as chemical weapon exposure or oil-well fire smoke, or alternative timing of exposure effects, such as acute or chronic effects of exposures. A larger dataset or meta-analysis across similar analyses is required to address these limitations.
In summary, two significant signals were identified, associating GWI with the interaction between PON1 and PB pill exposure. The primary result of the association between CDC Severe GWI and the rs662/PB pill exposure interaction corresponds with prior findings, as the rs662 locus is a functional variant that has been previously identified in smaller studies [14,30,38,40]. The secondary result of the association between respiratory symptoms and the interaction of PB with a trio of SNPs in high linkage disequilibrium (LD) with one another is a new finding and requires replication in additional, larger studies. The SNPs in these two signals do not appear to be in high LD with one another, although both are in PON1 and both interactions are with PB pill exposure. We did not observe statistically significant interactions or marginal effects for any SNPs in ACHE, BCHE or SOD1.

Conclusions
Aligning with prior work in the area, this study identified an interaction between PB pill exposure and the G allele of rs662. Because this G allele codes for a functional variant, this interaction replicated prior work implicating genetic susceptibility to PB and pesticide exposure in GW veterans and identifies a potential biomarker for GWI risk.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.