Genome-Wide Association Study of Fluorescent Oxidation Products Accounting for Tobacco Smoking Status in Adults from the French EGEA Study

Oxidative stress (OS) is the main pathophysiological mechanism involved in several chronic diseases, including asthma. Fluorescent oxidation products (FlOPs), a global biomarker of damage due to OS, are of growing interest in epidemiological studies. We conducted a genome-wide association study (GWAS) of the FlOPs level in 1216 adults from the case-control and family-based EGEA study (mean age 43 years, 51% women, and 23% current smokers) to identify genetic variants associated with FlOPs. The GWAS was first conducted in the whole sample and then stratified according to smoking status, the main exogenous source of reactive oxygen species. Among the top genetic variants identified by the three GWAS, those located in BMP6 (p = 3 × 10⁻⁶), near BMPER (p = 9 × 10⁻⁶), in GABRG3 (p = 4 × 10⁻⁷), and near ATG5 (p = 2 × 10⁻⁹) are the most relevant because of both their link to biological pathways related to OS and their association with several chronic diseases for which the role of OS in their pathophysiology has been pointed out. BMP6 and BMPER are of particular interest due to their involvement in the same biological pathways related to OS and their functional interaction. To conclude, this study, which is the first GWAS of FlOPs, provides new insights into the pathophysiology of chronic OS-related diseases.


Introduction
Oxidative stress (OS) was defined in 1985 as "a disturbance in the pro-oxidant/antioxidant balance in favour of the former" [1]. Beyond its essential role in life processes, OS is involved in the pathophysiology of several chronic diseases, including cardiovascular diseases, chronic kidney diseases, and asthma [2]. Sources of reactive oxygen species include the diseases themselves, through their intracellular metabolisms, and some exogenous sources, among which the most important is cigarette smoke [3].
Among the numerous biomarkers related to OS [1,4], Fluorescent Oxidation Products (FlOPs), which reflect a global measurement of oxidation of lipids, proteins, carbohydrates, fluorescence intensity (RFU)·mL⁻¹ of plasma. Each sample was replicated. The intra-assay coefficient of variation (CV) for FlOPs was less than 20%. Assays for which the CV was ≥20% or that were haemolysed were removed from the analysis (n = 11 and n = 8, respectively; see Figure S1 in Supplementary Materials).

Genotyping
The EGEA participants were genotyped using the Illumina 610 Quad array at the Centre National de Génotypage (CNG, Evry, France) as part of the European Gabriel consortium asthma GWAS [18]. As part of this consortium, principal component analysis was carried out for all participants with the EIGENSTRAT 2.0 software to control for population admixture. Putative non-European samples were flagged as outliers and eliminated from subsequent analyses [18]. Stringent quality control (QC) criteria were used to select both individuals and genotyped Single Nucleotide Polymorphisms (SNPs) for analysis [19]. Among participants with genotyped data (n = 1481), 44 with invalid genotyped data were excluded (see Supplementary Materials: Figure S1, flow chart, for details). The following SNP quality control criteria were applied: genotyping call rate ≥ 97%, no departure from Hardy-Weinberg equilibrium in the controls (p-value ≥ 1.0 × 10⁻⁴), and minor allele frequency (MAF) ≥ 5%. After this SNP QC, 66,422 SNPs were excluded, and a total of 501,167 SNPs were available for the analysis. To investigate regions of interest, including the top three SNPs, imputed SNPs from the 1000 Genomes Phase I reference panel were used [20]. The software IMPUTE2 was used for imputation [21]. Imputed SNPs were kept for analysis if their imputation information score was greater than or equal to 0.70 and if their MAF was greater than or equal to 0.05.
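The three SNP-level QC criteria above can be illustrated with a minimal sketch. The function names, the input layout (genotype counts per SNP), and the use of a 1-degree-of-freedom chi-square test for Hardy-Weinberg equilibrium are assumptions made for illustration; the source does not state which HWE test or software performed this step.

```python
import math

def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test of Hardy-Weinberg equilibrium.

    Assumption: a chi-square test is used here for simplicity; the study
    may have used a different (e.g. exact) test.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 df
    return math.erfc(math.sqrt(chi2 / 2))

def passes_qc(n_aa, n_ab, n_bb, n_missing, control_counts,
              min_call_rate=0.97, min_hwe_p=1e-4, min_maf=0.05):
    """Apply the call-rate, HWE (in controls) and MAF thresholds to one SNP."""
    called = n_aa + n_ab + n_bb
    call_rate = called / (called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    maf = min(freq_a, 1 - freq_a)
    return (call_rate >= min_call_rate
            and hwe_chi2_pvalue(*control_counts) >= min_hwe_p
            and maf >= min_maf)
```

For example, `passes_qc(360, 480, 160, 0, (180, 240, 80))` keeps a common SNP whose control genotypes match Hardy-Weinberg proportions, whereas a SNP with a minor allele frequency of 1% is dropped.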

Asthma
Ever-asthma status was generated as a dichotomous variable (never-asthmatic/ever-asthmatic). Ever-asthmatics were participants who answered positively to at least one of the two following questions: "Have you ever had attacks of breathlessness at rest with wheezing?" or "Have you ever had asthma attacks?", or those who were recruited as asthmatic cases at EGEA1. Never-asthmatics were those who answered negatively to the two questions above and who were not recruited as asthmatic cases at EGEA1.

Chronic Bronchitis
Chronic bronchitis was generated as a dichotomous variable (yes/no). Participants with chronic bronchitis were those who answered positively to at least one of the two following questions: "Do you usually cough during the day or at night in winter almost every day for three months of continued every year?" or "Do you usually spit during the day or at night in winter, almost every day for three months of continued every year?".

Lung Function
A lung function test with spirometry and methacholine challenge was performed using a standardised protocol with similar equipment across centres and according to the American Thoracic Society/European Respiratory Society guidelines [22]. The forced expiratory volume in one second (FEV1) percent predicted value was based on Quanjer et al. reference equations [23]. For participants with an FEV1 ≥ 80% of the predicted value, a methacholine bronchial challenge test was performed (maximum dose 4 mg) using a Biomedin spirometer (Biomedin Srl, Padua, Italy) in all centres, except in Lyon, where a Pneumotach Jaeger spirometer (Jaeger) was used. The following measures of lung function were used as continuous variables and expressed as %: FEV1 and Forced Vital Capacity (FVC). FEV1 was also generated as a dichotomous variable (FEV1 < 80%, FEV1 ≥ 80%).

Smoking Status
Tobacco consumption was defined by the answer to the question "Do you smoke or have you smoked previously one cigarette per day or more, for at least a year?". If so, participants were asked their age at the start of smoking and the age of quitting, if applicable. Participants were also asked to quantify the average daily consumption of cigarettes and the average weekly consumption of cigars, if applicable. Current smoking status was generated as a 3-class categorical variable: never-smoker, ex-smoker, or current smoker. Lifelong cumulative quantity of tobacco was generated as a continuous variable and also categorised using a 4-class variable, with cut-offs defined a priori: never-smokers; <10 pack-year; 10-20 pack-year; and >20 pack-year.
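As an illustration of the 4-class tobacco variable, the sketch below uses the conventional definition of a pack-year (20 cigarettes per day for one year). That definition, the function names, and the handling of the boundaries at 10 and 20 pack-years are assumptions; the source does not detail how cigar consumption was converted, so cigars are ignored here.

```python
def pack_years(cigarettes_per_day, years_smoked):
    """Conventional pack-years: packs of 20 cigarettes per day times years smoked."""
    return cigarettes_per_day / 20 * years_smoked

def tobacco_class(ever_smoker, cigarettes_per_day=0.0, years_smoked=0.0):
    """Map a participant to the 4-class lifelong tobacco variable.

    Assumption: the 10-20 class includes both bounds.
    """
    if not ever_smoker:
        return "never-smoker"
    py = pack_years(cigarettes_per_day, years_smoked)
    if py < 10:
        return "<10 pack-year"
    if py <= 20:
        return "10-20 pack-year"
    return ">20 pack-year"
```

A participant who smoked 20 cigarettes per day for 15 years accumulates 15 pack-years and falls in the 10-20 pack-year class.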

Body Mass Index (BMI)
BMI was generated as a continuous variable and was also expressed as a dichotomous variable (<30 kg/m²; ≥30 kg/m²).

Biological Parameters
Total serum Immunoglobulin E (IgE) was measured with the UniCAP system (Pharmacia®) from blood samples in a centralised laboratory and expressed in international units (IU) per millilitre. For the analysis, total IgE level was examined as a continuous variable.
Blood neutrophil and eosinophil counts were expressed in cells/mm³ and coded as continuous variables [24,25].

Characteristics of the Studied Population and Association with the FlOPs Level
First, characteristics of the studied population were described and summarised as n (%) or mean (m) ± standard deviation (sd), according to the type of variable (qualitative or quantitative, respectively). Due to its skewed distribution, the FlOPs level was log-transformed and expressed as geometric mean (GM) with first quartile (Q1) and third quartile (Q3) values.
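The link between the log transformation and the geometric mean can be sketched as follows. The function name is hypothetical, and the quartile method is Python's default, which may differ slightly from the one used by the statistical software in the study.

```python
import math
import statistics

def summarise_skewed(values):
    """Return (geometric mean, Q1, Q3) for a positive, right-skewed variable."""
    logs = [math.log(v) for v in values]
    # The geometric mean is the back-transformed mean of the log values
    gm = math.exp(statistics.fmean(logs))
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return gm, q1, q3
```

Because the mean is taken on the log scale, a few very high FlOPs values pull the geometric mean upwards much less than they would pull the arithmetic mean.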
In order to select potential confounding factors prior to GWAS, we estimated associations between log-FlOPs and several characteristics of the whole sample using Gaussian linear models, taking into account the EGEA family structure with Generalised Estimating Equations (GEE; SAS v9.4 (SAS Institute, Cary, NC, USA), proc genmod, option repeated). We previously identified the following characteristics as factors associated with FlOPs: age, sex, current smoking status, lifelong cumulative quantity of tobacco, blood neutrophil count, and FEV1. Age was entered either as a continuous variable or a categorical one, with cut-points defined a priori (<25 years; 25-34 years; 35-44 years; 45-54 years; and ≥55 years). Regarding tobacco smoking, models included either smoking status as a categorical variable (never-smokers; ex-smokers; and current smokers), or lifelong cumulative quantity of tobacco as a continuous variable or a 4-class categorical variable. The best model was selected based on the QIC, an analogue of Akaike's Information Criterion for GEE models [26]. Among all models tested, the best model included age (continuous), sex, and lifelong cumulative quantity of tobacco (never-smokers; <10 pack-year; 10-20 pack-year; and >20 pack-year).
Adjusted log-FlOPs were obtained as residuals of the best linear model identified in the previous step. Z-scores were then obtained by standardising the residuals, and goodness of fit to a Gaussian distribution was assessed using the Kolmogorov test. In order to exclude participants whose log-FlOPs were poorly predicted by the linear model, participants with extreme Z-scores (i.e., |Z-score| > 3, corresponding approximately to the 0.1th and 99.9th percentiles of a standard Gaussian distribution) were excluded. The process was repeated until no significant deviation from a Gaussian distribution was evidenced. This process excluded 20 participants (see Figure S1 in Supplementary Materials).
Adjusted log-FlOPs used for GWAS stratified on smoking status (see below) were generated by applying the same procedure as described above, except that lifelong cumulative quantity of tobacco was not entered in the model.
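The iterative exclusion of poorly predicted participants can be sketched as below. This is only an illustration under stated assumptions: the source does not give the significance level of the Kolmogorov test (0.05 is assumed), the order of the trimming and testing steps within each round is a guess, residuals are simply re-standardised rather than re-fitted, and the p-value uses a standard asymptotic approximation that ignores the fact that the mean and standard deviation are estimated from the data.

```python
import math
import statistics

def ks_normal_pvalue(z):
    """Asymptotic Kolmogorov p-value for z-scores against N(0, 1)."""
    n = len(z)
    cdf = [0.5 * (1 + math.erf(x / math.sqrt(2))) for x in sorted(z)]
    d = max(max((i + 1) / n - c, c - i / n) for i, c in enumerate(cdf))
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    p = 2 * sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                for k in range(1, 101))
    return min(max(p, 0.0), 1.0)

def standardise(values):
    mu = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mu) / sd for v in values]

def trim_outliers(residuals, z_cut=3.0, alpha=0.05):
    """Drop |z| > z_cut, then re-test normality; repeat until no deviation."""
    kept = list(residuals)
    while True:
        z = standardise(kept)
        trimmed = [r for r, zi in zip(kept, z) if abs(zi) <= z_cut]
        n_dropped = len(kept) - len(trimmed)
        kept = trimmed
        # Stop when nothing more can be removed or normality is not rejected
        if n_dropped == 0 or ks_normal_pvalue(standardise(kept)) >= alpha:
            return kept
```

In a real analysis the linear model would be re-fitted after each round of exclusions so that residuals are recomputed on the remaining participants; the sketch only re-standardises the residuals it is given.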

GWAS of the FlOPs Level
We first conducted a GWAS of the FlOPs level in the whole sample with genotyped data. Then, we conducted two supplementary GWAS separately in contrasted groups according to smoking status at the time of measurement: never-smokers and current smokers. Ex-smokers were excluded from this analysis.
The association between adjusted log-FlOPs (standardised residuals) and each SNP was tested using a Gaussian linear model, adjusted for principal components (PCs) to account for within-European diversity. The EGEA family structure was taken into account using a robust variance estimator for clustered data (STATA command: regress, option vce(cluster), within family). SNPs were coded under an additive genetic model.
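To make the per-SNP test concrete, here is a minimal sketch of a linear regression of the phenotype on an additively coded SNP (0, 1 or 2 copies of the coded allele) with a family-clustered sandwich variance. It is a simplification under stated assumptions: the PC covariates are omitted to keep the example short, the function and argument names are hypothetical, and the finite-sample correction is the one commonly documented for cluster-robust variance in regression software, not a detail taken from the source.

```python
import math
from collections import defaultdict

def snp_association(y, dosage, family):
    """Slope and cluster-robust standard error for y ~ intercept + dosage."""
    n = len(y)
    x_bar = sum(dosage) / n
    y_bar = sum(y) / n
    xc = [x - x_bar for x in dosage]  # dosage centred on its mean
    sxx = sum(v * v for v in xc)
    beta = sum(v * (yi - y_bar) for v, yi in zip(xc, y)) / sxx
    intercept = y_bar - beta * x_bar
    resid = [yi - intercept - beta * xi for yi, xi in zip(y, dosage)]

    # Sum the score contributions (centred dosage * residual) within each family
    score = defaultdict(float)
    for fam, v, e in zip(family, xc, resid):
        score[fam] += v * e

    g = len(score)  # number of families (clusters)
    k = 2           # estimated parameters: intercept and slope
    correction = g / (g - 1) * (n - 1) / (n - k)
    var_beta = correction * sum(s * s for s in score.values()) / sxx ** 2
    return beta, math.sqrt(var_beta)
```

Summing the scores within families before squaring is what allows relatives to be correlated: two siblings with residuals of the same sign add to the variance more than two unrelated participants would.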
For the top three SNPs obtained by each of the three GWAS (in the whole sample, in never-smokers and in current smokers), further analyses were conducted. First, we split the sample into two truly independent sub-samples according to the ascertainment mode (controls vs. cases/relatives) and checked the consistency of the results with a homogeneity test between the sub-samples. Second, because EGEA families were ascertained through asthmatic participants, we verified that the results were independent of asthma status using a homogeneity test according to ever-asthma status (never-asthmatics vs. ever-asthmatics). Each test for homogeneity was performed by fitting an interaction term between the SNP and the corresponding dummy variable (cases/relatives vs. controls, or ever-asthmatics vs. never-asthmatics) in the models. Third, we examined the regions around the top three SNPs of each of the three GWAS using imputed data from the 1000 Genomes Project Phase 1 reference panel [20] (CEU population), spanning 500 kb on each side of each top SNP. For each region of interest, association results were graphically represented using LocusZoom [27].
All analyses were performed using SAS v9.4 (SAS Institute, Cary, NC, USA) or STATA v14.1 (StataCorp. 2015. Stata Statistical Software: Release 14. College Station, TX, USA: StataCorp LP). All tests were two-sided. To account for multiple testing, a Bonferroni-corrected significance threshold was calculated from Meff, the effective number of independent tests after discarding dependence due to linkage disequilibrium (LD) between the SNPs. For a chip of 610K SNPs, the significance p-value threshold was estimated to be 1.3 × 10⁻⁷ [28].
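The arithmetic behind the significance threshold can be checked with a few lines. The family-wise error rate of 0.05 is an assumption (the source gives only the resulting threshold), so the effective number of tests printed here is an implied value, not a figure reported by the authors. The division by three corresponds to the stricter threshold of 4.3 × 10⁻⁸ that the Discussion applies when accounting for the three GWAS.

```python
def bonferroni_threshold(alpha, n_effective_tests):
    """Per-test significance threshold for a given family-wise error rate."""
    return alpha / n_effective_tests

alpha = 0.05                 # assumed family-wise error rate
reported_threshold = 1.3e-7  # threshold reported for the 610K chip

# Effective number of independent tests implied by the reported threshold
implied_meff = alpha / reported_threshold

# Stricter threshold when the three GWAS are counted as three families of tests
three_gwas_threshold = reported_threshold / 3

print(f"implied Meff: {implied_meff:.0f}")
print(f"threshold across three GWAS: {three_gwas_threshold:.1e}")
```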

Results
The present analysis was carried out among adult participants (≥16 years old) at EGEA2 with available data on the FlOPs level, valid genotyped data, and asthma and tobacco smoking status. A total of 1216 adult participants were included in the analyses (see Figure S1 in Supplementary Materials).
Compared to the 355 adult participants not included in the analyses, the 1216 participants did not differ in terms of age, sex, and asthma status (data not shown).

Characteristics of the Studied Population
The characteristics of the 1216 participants are presented in Table 1. The results are presented for the whole sample and according to asthma status (ever-asthmatics and never-asthmatics), to the current tobacco smoking status (never-smokers and current smokers), and to the study design (controls and cases/relatives). In the whole sample (mean age 43.3 years, 51% women), 44% had ever-asthma and 23% were current smokers. The geometric mean (GM) (Q1, Q3) of the FlOPs level was 92.3 (80, 105) RFU/mL.
Associations between the FlOPs level and the characteristics of the whole sample are presented in Table S1 (see Supplementary Materials). The FlOPs level was independently associated with age, sex, and smoking (all p-values < 5.0 × 10⁻³): it increased significantly with age, was significantly higher in women than in men and in ex-/current smokers (GM = 97.2 and 93.8 RFU/mL, respectively) than in never-smokers (GM = 89.1 RFU/mL), and increased significantly with the lifelong cumulative quantity of cigarettes smoked. The geometric mean (GM) (Q1, Q3) of the FlOPs level was 93.4 (81, 107), 98.7 (88, 108), and 100.3 (88, 114) RFU/mL in participants with a lifelong quantity of cigarettes smoked of <10, 10-20, and >20 pack-year, respectively (p < 1.0 × 10⁻⁴).
Table 2 reports the results of the associations in the whole sample for the 10 SNPs showing the strongest signals. The Manhattan plot is available in Supplementary Materials, Figure S2A, and the Q-Q plot in Figure S3A shows that there was no inflation of the test statistics, with the genomic inflation factor estimated at 1.002. The top three SNPs were rs270404, located in the BMP6 gene on chromosome 6p24.3 (p = 3.0 × 10⁻⁶); rs13223298, located 2 kb upstream of the BMPER gene on chromosome 7p14.3 (p = 8.7 × 10⁻⁶); and rs491274, located in an intergenic region on chromosome 15q21.1 whose nearest gene is SEMA6D (607 kb away) (p = 8.9 × 10⁻⁶). The association analysis for these top three SNPs in the two independent sub-samples (controls vs. cases/relatives) showed no indication of heterogeneity (all p-values > 0.6, see Table S2 in Supplementary Materials).
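The genomic inflation factor reported with the Q-Q plots is conventionally the ratio of the median observed 1-degree-of-freedom chi-square statistic to its expected median under the null (about 0.455). The sketch below assumes that definition and takes GWAS p-values as input; the function name is hypothetical and the source does not state how the factor was computed.

```python
import statistics

def genomic_inflation(p_values):
    """Median-based genomic inflation factor from two-sided GWAS p-values."""
    nd = statistics.NormalDist()
    # A two-sided p-value maps to a 1-df chi-square via the squared normal quantile
    chi2 = [nd.inv_cdf(p / 2) ** 2 for p in p_values]
    expected_median = nd.inv_cdf(0.75) ** 2  # median of chi-square with 1 df
    return statistics.median(chi2) / expected_median
```

Values close to 1, such as those reported for the three GWAS, indicate that the bulk of the test statistics follows the null distribution, which argues against residual population stratification or unmodelled relatedness.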
The associations stratified by asthma status (never-asthmatics, ever-asthmatics) for the top three SNPs associated with the FlOPs level in the whole sample are reported in Table 3. No indication of heterogeneity was observed (all p-values ≥ 0.25).
An analysis using imputed SNPs spanning 500 kb on each side of each top SNP in the regions of the top three SNPs located in BMP6, near BMPER and near SEMA6D confirmed the initial results (see Supplementary Materials: Figure S4A-C). Analyses of imputed SNPs in these regions found signals with similar or slightly improved significance levels at genotyped SNPs, and for at least two other imputed SNPs, which were close to and in strong linkage disequilibrium (LD, r² > 0.8) with the genotyped top SNP (Figure S4A-C).

In Never-Smokers and in Current Smokers
Table 4 presents the associations with FlOPs for the top 10 SNPs in never-smokers and in current smokers. Manhattan plots for the two GWAS are available in Supplementary Materials, Figure S2B,C, and the Q-Q plots in Figure S3B,C show that there was no inflation of the test statistics for the two GWAS, with genomic inflation factors estimated at 1.006 and 1.03, respectively. In never-smokers, the top three SNPs (all p-values < 5.0 × 10⁻⁷) were rs17823624, located in the COL21A1 gene on chromosome 6p12.1 (p = 2.3 × 10⁻⁷); rs6606856, located in the GABRG3 gene on chromosome 15q12 (p = 4.1 × 10⁻⁷); and rs2962642, located in an intergenic region 568 kb from NUDT12 on chromosome 5q21.2 (p = 4.4 × 10⁻⁷). For these top three SNPs, association analysis in the two independent sub-samples (controls vs. cases/relatives) showed no indication of heterogeneity (all p-values > 0.6, see Table S3 in Supplementary Materials). In addition, association analysis for these top SNPs yielded similar results in never-asthmatics and in ever-asthmatics, with no indication of heterogeneity (p ≥ 0.10, Table 5). Note that none of the top three SNPs found in never-smokers showed any indication of association in current smokers (all p-values > 0.10).
In current smokers, the top three SNPs were rs3851212, located in CRYBG1 on chromosome 6q21 (p = 2.4 × 10⁻⁹, reaching the significance threshold of 1.3 × 10⁻⁷); rs1793958, located in COL2A1 on chromosome 12q13.11 (p = 4.7 × 10⁻⁷); and rs17174795, located 1.4 kb upstream of PTPRO on chromosome 12p12.3 (p = 9.2 × 10⁻⁷). For these top three SNPs, association analysis in the two independent sub-samples (controls vs. cases/relatives) showed no indication of heterogeneity (all p-values ≥ 0.4, see Table S3 in Supplementary Materials). No heterogeneity of association signals was detected according to asthma status (all p-values ≥ 0.7, except for rs17174795 with p-value = 0.05, which was not significant after correction for multiple testing; see Table 5). The top three SNPs found in current smokers showed no indication of association in never-smokers (p > 0.20) or only a weak signal (p = 0.05). Analysis using imputed SNPs spanning 500 kb on each side of each top SNP in the regions of the top three SNPs in both never-smokers and current smokers confirmed the initial results, with significance levels similar to those observed with genotyped SNPs (Figures S5A-C and S6A-C in Supplementary Materials). Analyses of imputed data in the regions around two of the top six SNPs found additional signals at imputed SNPs close to and in strong LD (r² > 0.8) with the genotyped top SNPs: 12 for rs2962642, located near NUDT12, with a similar significance level (Figure S5C), and two for rs17174795, located near PTPRO, with an improved significance level (Figure S6C). These results also supported the initial findings for these two top SNPs.

eQTLs, meQTLs, and Functional Annotations
Using the browser Phenoscanner v2 (http://www.phenoscanner.medschl.cam.ac.uk/, accessed on 18 October 2021), we found that two of the top SNPs, one in never-smokers (rs17823624 in COL21A1) and one in current smokers (rs1793958 in COL2A1), were associated with gene expression in whole blood samples from subjects of European ancestry (Table S4 in Supplementary Materials). The SNP rs17823624 was associated with gene expression of DST (p-value = 2.0 × 10⁻¹⁵), while the SNP rs1793958 was associated with the expression of five genes belonging to the 12q13.11 region: OR10AD1, PFKM, SENP1, TMEM106C, and VDR (all p-values < 2.5 × 10⁻⁸). No eQTL was found for the other top SNPs. Furthermore, the top three SNPs of the GWAS in the whole sample, in never-smokers and in current smokers were associated with the methylation level of 1 to 26 CpG sites (see Table S5 in Supplementary Materials). Most CpG sites were located near or in the same gene as the associated SNP. The SNP rs13223298 near BMPER was additionally associated with the methylation level of one CpG site located on a different chromosome from BMPER, in the PTPN22 gene on chromosome 1p13.2 (p-value = 9.3 × 10⁻⁸, not significant after correction for multiple testing).
Note that rs79231630, a proxy of rs3851212 (the top SNP in current smokers, located in CRYBG1), had a CADD score equal to 14.5. Detailed results for CADD scores are presented in Table S6 (see Supplementary Materials).
Using the functional annotation tool HaploReg v4.1, we found that all top three SNPs (or their proxies) identified in the whole sample, in never-smokers and in current smokers, respectively, mapped to marks of active regulatory elements, including in cells from heart, lung, kidney, breast, and brain. Detailed results for functional annotations are presented in Table S6 (see Supplementary Materials).

Discussion
This first genetic study on the FlOPs level identified several variants, among which those located in BMP6, near BMPER and between SQOR and SEMA6D were the most strongly associated with this biomarker in the whole sample. Analyses stratified on tobacco smoking status identified other genetic variants: among them, the top three SNPs in never-smokers, located in COL21A1, in GABRG3, and near NUDT12, and the top three SNPs in current smokers, located in CRYBG1, in COL2A1, and near PTPRO.
Our study is based on the hypothesis that GWAS of FlOPs may provide new insights into the pathophysiology of chronic diseases related to the OS pathway. The GWAS analyses we performed were based on the EGEA study, whose participants had extensive clinical, genetic, biological, and environmental characterisation. To our knowledge, there was no other epidemiological study with such data that could serve as a replication sample. Interestingly, all our association results were supported by consistent results observed in the controls and cases/relatives sub-samples. Furthermore, analyses of imputed data in the region around each top SNP confirmed our initial association results obtained with genotyped data. However, all our findings should be replicated in other independent cohorts.
Due to the ascertainment mode of families in the EGEA study, i.e., through asthmatic participants, and the involvement of the OS pathway in the pathophysiology of asthma, we repeated our analyses in ever- and never-asthmatics in order to evaluate the associations independently of the disease. The results were consistent between never- and ever-asthmatics, which showed the independence of our results from the disease. Furthermore, we verified that none of our top SNPs was associated with lung function or adult-onset asthma in the EGEA sample [19]. All these results suggest that our main results are not driven by asthma.
In the whole population, the strongest association signals were observed for rs270404 located in BMP6 and rs13223298 located near BMPER (i.e., 2.2 kb from that gene). BMP6 and BMPER were reported to interact physically in a functional study [35]. In line with this result, we tested the effect of the statistical interaction between these SNPs on FlOPs and found a borderline significant interaction (p-value = 0.07). BMP6 (Bone Morphogenetic Protein 6) encodes a secreted ligand of the transforming growth factor-beta (TGF-beta) superfamily of proteins, and BMPER (BMP Binding Endothelial Regulator) encodes a secreted protein that interacts with and inhibits the bone morphogenetic protein (BMP) function. It is noteworthy that these two genes belong to the biological process "regulation of pathway restricted SMAD protein phosphorylation" (GO:0060393), which is involved in the TGF-beta receptor signalling pathways [36]. The role of TGF-beta has been discussed in chronic asthma, as a potent fibrogenic growth factor overexpressed in the asthmatic lung [37]. Moreover, BMPER belongs to the biological process "immune response" pathway (GO:0006954) [38]. From the GWAS Catalog [39], we found that BMP6 was associated with FVC [40][41][42]; FEV1 [42]; and, to a lesser extent, chronic obstructive pulmonary disease [43] and small cell lung carcinoma [44]. In previous GWAS, associations were reported between BMPER and FVC [40], FEV1 [45], and other chronic diseases such as Alzheimer's disease [46] and metastatic colorectal cancer [47]. Moreover, the top two SNPs in BMP6 and near BMPER map to marks of active regulatory elements in heart, lung, brain, breast, and kidney tissues, and the top SNP near BMPER was associated with the methylation level of one CpG site located in PTPN22 (Protein Tyrosine Phosphatase Non-Receptor Type 22), a gene involved in "NF-Kappa B signalling" and "immune response" pathways.
As cigarette smoke is the most important exogenous source of ROS, we performed GWAS separately in two contrasted sub-groups according to smoking status and identified specific association signals in each sub-group. In never-smokers, the top SNP was rs17823624, located in COL21A1 (Collagen Type XXI Alpha 1 Chain). In previous GWAS, COL21A1 showed associations with lung function [40] and, to a lesser extent, with allergic sensitisation [50] and small cell lung carcinoma [44]. We also found that rs17823624 was an eQTL for DST (Dystonin), a gene close to COL21A1, for which variants were found to be associated with lung function [40,42] and Alzheimer's disease [51]. The next top SNP, rs6606856, was located in GABRG3 (Gamma-Aminobutyric Acid Type A Receptor Subunit Gamma3), a gene involved in the "response to drug" biological pathway (GO:0042493). This gene was shown to be associated with gene methylation in the lung tissue of smokers, as reported in a previous GWAS [52]. An association between GABRG3 and several chronic diseases, including Alzheimer's disease [53], ovarian carcinoma [54,55], and non-melanoma carcinoma [56], was also reported in other previous GWAS. Finally, the third top SNP, rs2962642, was located 568 kb from NUDT12 (Nudix Hydrolase 12). This gene is involved in the "NADH metabolic process" biological pathway (GO:0006734), and was shown to be associated with longevity [57] and smoking behaviour [58] in previous GWAS. Furthermore, proxies of rs2962642 map onto regulatory motifs altered for histone deacetylase 2 (HDAC2), whose activity is regulated by oxidative stress.
In current smokers, the top SNP, rs3851212, was located in CRYBG1; its association reached the genome-wide significance threshold of 4.3 × 10⁻⁸ that accounts for the three GWAS. The role of CRYBG1 (crystallin beta-gamma domain containing 1), also called AIM1 (absent in melanoma 1 protein), in malignant melanoma is well known [59]. Of note, rs3851212 is located 50 kb from ATG5 (autophagy-related 5), which belongs to the biological pathway "immune system process" (GO:0002376) and is involved in mitochondrial quality control after oxidative damage, and in subsequent cellular longevity. In previous GWAS, ATG5 was found to be associated with allergic diseases [60], including asthma [61][62][63][64], and with several other chronic diseases such as systemic lupus erythematosus [65][66][67][68], rheumatoid arthritis [69,70], and multiple myeloma [71]. Furthermore, a proxy of rs3851212, rs79231630, has a CADD score close to 15, indicating a possible deleterious effect of the SNP. The next top SNP is rs1793958, located in COL2A1 (collagen type II alpha 1 chain), a gene involved in the "regulation of immune response" biological process (GO:0050776). COL2A1 was found to be associated with rheumatoid arthritis [72] and with prostate carcinoma [73][74][75] in previous GWAS. Finally, the third top SNP, rs17174795, is located 1.4 kb from PTPRO (Protein Tyrosine Phosphatase Receptor type O), which has been suggested as a candidate tumour suppressor acting via the NF-Kappa B signalling pathway [76,77]; the transcription factor NF-Kappa B plays a central role in inflammatory airway diseases such as asthma [78].
None of the top signals found in one sub-group of smoking status were found in the other sub-group, nor in the whole sample, showing that, as we hypothesised, accounting for smoking status may help one to identify loci not found in the whole sample. These results are likely explained by the existing interactions between the environment (here smoking) and genes that lead to "up" or "down" regulation of the pathways that influence the level of FlOPs.
None of the top signals found in the whole sample were found in the two contrasted sub-groups according to smoking status, which is an interesting result. Indeed, these analyses were carried out with the objective to identify genetic loci that could have been missed in the whole sample as smoking is the main environmental source of OS and is associated with FlOPs. The two GWAS in groups contrasted by smoking revealed additional genetic loci to those found in the whole sample.
Overall, many of the top SNPs identified by the three GWAS are located in regions comprising promising candidate genes. Among them, BMP6, BMPER, GABRG3, and ATG5 are the most relevant because of both their link to biological pathways related to OS and their association with several chronic diseases, for which the role of OS in their pathophysiology has been pointed out. BMP6 and BMPER are of particular interest due to their involvement in the same biological pathways related to OS and their functional interaction.

Conclusions
In conclusion, the present study identified, for the first time, new and promising candidate genes associated with the FlOPs level potentially involved in the pathophysiology of chronic diseases through their link with the oxidative stress pathway. Although further studies are needed to replicate these findings, this work highlights the interest in performing genome-wide analyses of biomarkers to identify new genes and potential mechanisms related to specific pathways common to chronic diseases.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/antiox11050802/s1. References [15,16,79] are cited in the supplementary materials. The description of the EGEA study; Figure S1: a flow chart of the studied population; Figure S2: a Manhattan plot of the GWAS results of the FlOPs levels (A) in the whole sample (n = 1216), (B) in never-smokers (n = 604), and (C) in current smokers (n = 275); Figure S3: a quantile-quantile (QQ) plot of the GWAS results of the FlOPs levels (A) in the whole sample (n = 1216), (B) in never-smokers (n = 604), and (C) in current smokers (n = 275); Figure S4: a regional plot of the association results using imputed genetic data for the top three SNPs in the whole sample (n = 1216), the region around rs270404 (A), the region around rs13223298 (B), and the region around rs491274 (C); Figure S5: a regional plot of the association results using imputed genetic data for the top three SNPs in never-smokers (n = 604), the region around rs17823624 (A), the region around rs6606856 (B), and the region around rs2962642 (C); Figure S6: a regional plot of the association results using imputed genetic data for the top three SNPs in current smokers (n = 275), the region around rs3851212 (A), the region around rs1793958 (B), and the region around rs17174795 (C); Table S1: the association between the FlOPs level and the characteristics of the studied population (n = 1216); Table S2: the top three SNPs in the whole sample: the consistency of the results in two independent sub-samples; Table S3: the top three SNPs in never-smokers and in current smokers: the consistency of the results in two independent sub-samples; Table S4: the results from the eQTL browser Phenoscanner v2; Table S5: the meQTLs for the top three SNPs identified in GWAS in the whole sample, in never-smokers and in current smokers; Table S6: regulatory elements for the top three SNPs (and proxies with r² > 0.80) identified in GWAS in the whole sample, in never-smokers and in current smokers.
Informed Consent Statement: Written informed consent was obtained from all subjects involved in the study.
Data Availability Statement: Due to third-party restrictions, EGEA data are not publicly available. Please see the following URL for more information: https://egeanet.vjf.inserm.fr/index.php/en/contacts-en, accessed on 17 February 2022. Interested researchers should contact egea.cohorte@inserm.fr with further questions regarding data access.