SNPs Sets in Codifying Genes for Xenobiotics-Processing Enzymes Are Associated with COPD Secondary to Biomass-Burning Smoke

Chronic obstructive pulmonary disease (COPD) is the third leading cause of death worldwide; the main risk factors associated with the disease are tobacco smoking (TS) and chronic exposure to biomass-burning smoke (BBS). Different biological pathways have been associated with COPD, especially xenobiotic or drug metabolism enzymes. This research aims to identify single-nucleotide polymorphism (SNP) profiles associated with COPD from two exposure sources: tobacco smoking and BBS. One thousand five hundred Mexican mestizo subjects were included in the study and divided into those exposed to biomass-burning smoke and smokers. Genome-wide exome genotyping was carried out using Infinium Exome-24 kit arrays v. 1.2. Data quality control was conducted using PLINK 1.07. For clinical and demographic data analysis, RStudio was used. Eight SNPs were found associated with COPD secondary to TS, and seven SNPs were conserved when data were analyzed by genotype. When haplotype analyses were carried out, five blocks were predicted. In COPD secondary to BBS, 24 SNPs in MGST3 and CYP family genes were associated. Seven blocks of haplotypes were associated with COPD-BBS. SNPs in the ARNT2 and CYP46A1 genes are associated with COPD secondary to TS, while in the BBS comparison, SNPs in CYP2C8, CYP2C9, MGST3, and MGST1 genes were associated with increased COPD risk.


Introduction
Chronic obstructive pulmonary disease (COPD) is a complex and multifactorial disease; preventable, treatable, and partially reversible, characterized by airflow limitation due to an airway inflammatory process in response to chronic exposure to noxious particles [1][2][3]. Worldwide, COPD is the third leading cause of death, with a prevalence of

Population Included
For this study, 1500 subjects were included and divided into 2 comparison groups; the first was composed of 900 smokers: 300 patients with COPD secondary to tobacco smoking (COPD-S) and 600 smokers without the disease (SWOC). The second comparison group included 600 subjects exposed to smoke from biomass burning, divided into 220 patients with COPD secondary to BBS (COPD-BBS) and 380 subjects exposed to BBS but without COPD (BBES).
Mexican mestizo subjects over 40 years of age and of either sex were included; for the comparison group of smokers, participants with a tobacco index (TI) > 5 packs/year and no history of exposure to BBS were included. In the BBS-exposed group, subjects with an exposure index to BBS (BEI) > 100 h/year who had never smoked were included. COPD was defined as a post-bronchodilator FEV1/FVC ratio < 70% (Supplementary Figure S1). Participants with other inflammatory, autoimmune, or respiratory diseases were excluded. The participants were recruited from the COPD Clinic, smoking cessation support groups of the Tobacco Smoking and COPD Research Department, from COPD early detection campaigns in rural Oaxaca [18], and from suburban areas of Mexico City. The Ethics in Research Committee from Instituto Nacional de Enfermedades Respiratorias Ismael Cosio Villegas approved the protocol under code numbers B14-17, B11-19, and C53-19.
The clinical evaluation of the patients was carried out by specialized chest physicians from the Instituto Nacional de Enfermedades Respiratorias Ismael Cosio Villegas using GOLD guidelines [19]. Demographic and ancestry data were obtained through a questionnaire. Before biological samples were taken, all patients signed an informed consent form approved by institutional ethical boards.

Biological Samples
A blood sample was taken from all participants by forearm puncture, and DNA and plasma were obtained using the previously described methodology [20]. The DNA samples were quantified by spectrophotometry using a NanoDrop device (Thermo Scientific, Wilmington, DE, USA), and samples with a 260/280 ratio between 1.8 and 2.2 were selected and adjusted to 60 ng/µL; their integrity was evaluated in 1.5% agarose gels.
We applied a functional candidate gene methodology to select only genes related to the metabolism of xenobiotics, cytotoxic products, and drugs. Through a literature search, we selected 38 genes. After applying the Hardy-Weinberg test (p > 1 × 10⁻⁹) and excluding SNPs with MAF < 10%, we chose all SNPs in the proposed genes. We worked with 748 SNPs in both comparison groups (Supplementary Figure S3).
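The SNP-level filters described above can be illustrated with a minimal sketch. It assumes genotype counts per SNP are already available and uses a 1-df chi-square goodness-of-fit test as a simple stand-in for the Hardy-Weinberg test (the exact test implemented in the genotyping software may differ); the function names and default thresholds are illustrative and mirror the values quoted in the text.

```python
import math

def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    return min(p, 1 - p)

def hwe_chi2_p(n_aa, n_ab, n_bb):
    """p-value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

def keep_snp(n_aa, n_ab, n_bb, hwe_min_p=1e-9, maf_min=0.10):
    """Apply both filters: HWE p above the threshold and MAF of at least 10%."""
    return (hwe_chi2_p(n_aa, n_ab, n_bb) > hwe_min_p
            and minor_allele_frequency(n_aa, n_ab, n_bb) >= maf_min)
```

For example, `keep_snp(25, 50, 25)` keeps a SNP in perfect equilibrium, whereas `keep_snp(50, 0, 50)` rejects one with no heterozygotes.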

Data Analysis
PLINK v. 1.07 [21] was used for data quality control (QC). We considered a genotype call rate > 95% and eliminated subjects with >0.05 of missing genotypes; sex discrepancies were considered by X chromosome homozygosity (men > 0.8, women < 0.2), while relatedness was assessed by identity by descent (IBD) considering pi-hat values < 0.25.
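As a rough illustration of the sample-level criteria above, the following sketch flags subjects by missingness and sex discrepancy and lists related pairs by pi-hat. It assumes the per-sample statistics (missing-genotype rate, X-chromosome homozygosity estimate, pairwise pi-hat) have already been computed elsewhere; the `Sample` class and function names are hypothetical and are not part of PLINK.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    sample_id: str
    reported_sex: str    # "M" or "F"
    missing_rate: float  # fraction of missing genotypes
    x_f: float           # X-chromosome homozygosity estimate (F)

def sex_discrepancy(s, male_min=0.8, female_max=0.2):
    """True when X homozygosity disagrees with the reported sex."""
    if s.reported_sex == "M":
        return s.x_f <= male_min
    return s.x_f >= female_max

def passes_sample_qc(s, max_missing=0.05):
    """Keep samples with at most 5% missing genotypes and concordant sex."""
    return s.missing_rate <= max_missing and not sex_discrepancy(s)

def related_pairs(pi_hat, threshold=0.25):
    """Return sample pairs whose pi-hat reaches the relatedness threshold."""
    return [pair for pair, value in pi_hat.items() if value >= threshold]
```

A pair returned by `related_pairs` would then be reviewed so that only one member is retained; that decision rule is not specified in the text and is left out of the sketch.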
Association analysis was carried out using PLINK v. 1.07, applying Fisher's exact test adjusted by covariates and the Bonferroni correction for multiple testing. In the smoker group comparison, we included sex, age, and cigarettes/day (TI) as covariates, while in the biomass group comparison, age and the biomass-burning smoke exposure index (BEI) were included.
The R language [22] and the RStudio interface [23] were employed for statistical analysis. Admixture and principal component analysis (PCA) were carried out using the packages SNPRelate and gdsfmt from Bioconductor. We included HapMap population data from Northern Europeans from Utah (CEU), Yoruba in Ibadan from Nigeria (YRI), and native Amerindian populations (AMR) described by Huerta-Chagoya, and we selected 32 ancestry informative markers (AIMs) and used k = 3 [24]. The distribution of demographic variables, exposure data, or lung function was analyzed to determine the type of statistical comparison being made.
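To make the two statistical steps concrete, here is a minimal, dependency-free sketch of a two-sided Fisher's exact test on a 2 × 2 allele-count table, followed by a Bonferroni adjustment. It does not reproduce the covariate adjustment described above, and the function names are illustrative rather than taken from PLINK.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)

def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

With a table of minor/major allele counts in cases and controls, `fisher_exact_two_sided` returns the unadjusted p-value; passing the p-values of all tested SNPs to `bonferroni` gives the corrected values that are compared against 0.05.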

Severity Analysis
Afterward, we stratified COPD patients based on the GOLD stages, comparing mild (GOLD 1 + 2) vs. severe forms (GOLD 3 + 4) of the illness to avoid bias by subgrouping. This analysis was carried out for COPD-S and COPD-BBS individually using PLINK v. 1.07, applying Fisher's exact test with correction by covariates and the Bonferroni multiple-testing correction.

Multiple Correspondence Analysis
We applied multiple correspondence analysis (MCA) to determine possible grouping among associated SNPs, exposure indexes, and/or FEV1 values. These analyses were carried out using RStudio [22] and the packages FactoMineR [25] and factoextra [26].

Calculation of Haplotype Blocks
We included 750 SNPs in the haplotype block analysis. This analysis was carried out using Haploview 4.2 software [27], applying the algorithm presented by Gabriel et al. [28]. We applied an inclusion window of 5000 Kb per pair of SNPs. Linkage disequilibrium (LD) was presented using the D' value. Haplotype association analysis was carried out in R through Fisher's exact test between cases and controls, with adjustment by logistic regression including covariates. Gene schemes and SNP positions are included in the Supplementary Material (Supplementary Figures S7 and S8).
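Because LD is reported as D', a short sketch of how that quantity is obtained for one SNP pair may help. It assumes the haplotype frequency and the two allele frequencies have already been estimated (for example, from phased data); it computes only the point estimate and does not implement the confidence-interval rule used by the Gabriel et al. block definition.

```python
def d_prime(p_ab, p_a, p_b):
    """Normalized LD coefficient |D'| for alleles A and B.

    p_ab: frequency of the A-B haplotype
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # a monomorphic site carries no LD information
    return abs(d) / d_max
```

A value of 1 indicates that at least one of the four possible haplotypes is absent, while 0 indicates independence between the two sites.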

Population Studied
After quality control, 745 subjects were included; 354 were in the group of smokers (COPD-S = 141, SWOC = 213) and 391 had been exposed to BBS (COPD-BBS = 98, BBES = 293). The variables were not normally distributed, so the demographic, clinical, and exposure variables are presented as medians with quartiles 1 and 3. Comparisons were made using the Mann-Whitney U test for quantitative variables and χ² for qualitative variables.
When comparing COPD-S vs. SWOC, significant differences (p < 0.05) were found in the male-female ratio, age, BMI, and TI; because of this, sex, age, and TI were selected for covariate correction. In the BBS comparison group (COPD-BBS vs. BBES), significant differences (p < 0.05) were found in age and BBS exposure index (BEI), so these were included as covariates in the association analysis of this group (Table 1). By ancestry analysis, we found different proportions for both groups of comparison. We found a highly conserved Amerindian composition in the biomass-burning comparison group (COPD_BBS, BBES), while in the smokers' comparison (COPD_S, SWOC), we found a heterogeneous composition, predominantly Amerindian and Caucasian (Figure 1).
Figure 1. (B) Admixture plot for ancestry composition in the subjects included. HapMap population data were included: Northern Europeans from Utah (CEU, in red), Yoruba in Ibadan from Nigeria (YRI, in green), and native Amerindian (AMR, in blue). COPD_BBS: COPD secondary to biomass-burning smoke exposure; COPD_S: COPD patients as smokers; SWOC: smokers without COPD; BBES: biomass-burning exposed subjects.

Association Analysis in the Group of Smokers
All the SNPs associated at this stage met the Hardy-Weinberg equilibrium and MAF > 10% (Supplementary Table S1). In the comparison of smokers (COPD-S vs. SWOC), after correction for covariates, 8 SNPs/alleles were found associated (p < 0.05) with COPD secondary to TS: 6 SNPs (rs11572191, rs8133, rs17497857, rs4964059, rs3901896, rs8041826) with increased risk (OR > 1.0) and 2 SNPs (rs4147611, rs3742377) with decreased risk (OR < 1.0). Of the SNPs associated with risk, rs11572191 in the CYP2J2 gene presented the highest OR value, with an almost three-fold increased risk of developing COPD secondary to smoking. On the other hand, the ARNT2 and ARNTL2 genes each presented two SNPs associated with increased risk, these being the genes with the highest number of associated SNPs in this comparison group. However, when we applied the Bonferroni correction, no significant associations were retained (Table 2).
Seven SNPs associated with an increased risk of COPD secondary to TS were found in the genotype analysis. Of these SNPs, rs11572191 in CYP2J2, rs17497857 in ARNTL2, and rs3901896 and rs8041826 in ARNT2 remained associated. On the other hand, rs1951576 and rs943881 in CYP46A1 and rs6488842 in MGST1 are new findings of this analysis model. Interestingly, rs11572191 and rs17497857 are associated with heterozygous genotypes (Table 3). We extracted the data of COPD-S and COPD-BBS, looking for possible differentiation patterns by MCA, including SNPs with MAF > 1%. Although differential grouping patterns were observed, the explained variance did not exceed 1% (Supplementary Figure S4A).
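The odds ratios quoted above summarize 2 × 2 allele-count tables. The following sketch shows one common way to obtain an OR and its 95% confidence interval (the log-scale Wald, or Woolf, method); this is an assumption for illustration, since the text does not state which interval method was used, and the function name is hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.959964):
    """Odds ratio and Wald 95% CI for the table [[a, b], [c, d]].

    a, b: minor and major allele counts in cases
    c, d: minor and major allele counts in controls
    All four counts must be greater than zero.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper
```

An interval that excludes 1 corresponds to a nominally significant association at the 0.05 level, before any multiple-testing correction.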
We included 336 SNPs for the MCA in smokers' comparison. By biplots, we did not find any cluster of SNPs that could explain variance >1% (Supplementary Figure S4C). Next, we included all the SNPs associated with the allele analysis but did not get any possible component (Supplementary Figure S5).
The possible participation of other SNPs in the genetic susceptibility was assessed through haplotype blocks, including all associated SNPs before correction for covariates to maximize the analysis screen. Five blocks of haplotypes were found to form in the ARNTL2, CYP19A1, ARNT2, CYP46A1, and MGST3 genes, all with LD > 85 (Figure 2).
When the association of haplotypes was carried out, we found nine different combinations of SNPs associated with COPD-S (Table 4). Of these combinations, six haplotypes were associated with a lower risk of COPD-S and three with a higher risk (OR > 1.5). We found five haplotypes containing SNPs previously associated in the allele or genotype analysis: rs3901896 in ARNT2, rs1951576 in CYP46A1, and rs17497857 in ARNTL2, as well as rs8133 and rs4147611 in MGST3.

Severity Analysis
We stratified COPD-S and COPD-BBS subjects according to the GOLD stages (mild stages: GOLD I + II; severe stages: GOLD III + IV). In COPD-S, four SNPs were found; of these, rs12435918 in CYP46A1, rs625456 in GSTM2, and rs1058930 in CYP2C8 were associated with severe forms of COPD secondary to tobacco smoking (Supplementary Table S8). For COPD-BBS, we found rs12300289 in ARNTL2, rs10847 in ARNT, and rs2234696 in GSTM3 associated with the severe form of COPD secondary to biomass-burning smoke exposure (Supplementary Table S9).

Association Analysis in the Group Exposed to BBS
In the BBS exposure comparison group (COPD-BBS vs. BBES), 24 SNPs were found to be significantly associated (p < 0.05), of which twenty were associated with a higher risk of COPD and four with a decreased risk of the disease. Interestingly, the associated polymorphisms are mainly distributed in the MGST3, MGST1, CYP2C8, and CYP2C9 genes (Table 5). After applying the Bonferroni correction, only three SNPs remained associated: rs11799886/MGST3 (p = 0.019), rs1856908/CYP2C9 (p = 0.003), and rs1934953/CYP2C8 (p = 0.021).
When performing the genotype analysis, 23 SNPs associated with the disease were found: three with reduced risk and twenty with a higher COPD risk. In six SNPs, no homozygotes were found for the minor allele, and the leading associations were with the heterozygous genotypes. It should be noted that the groups of SNPs in the MGST3, CYP2C8, CYP2C9, and MGST1 genes remained associated. MGST3 presented the highest number of associated SNPs and OR values, presenting a three-fold increased risk of developing the disease. In the case of CYP2C8, although only three SNPs were found to be associated with increased risk, their OR values indicated up to a four-fold higher risk of developing COPD secondary to BBS (Table 6). For the BBS comparison, we included 298 SNPs after filtering by MAF (> 1%). We did not find any clusters with more than 2% of the variance (Supplementary Figure S4B). Looking for other clustering patterns, we included only the SNPs associated with COPD-BBS, but no grouping patterns that could explain higher variability were found (< 1%) (Supplementary Figure S6).
We found seven blocks of haplotypes in high LD in the genes ARNTL, CYP2R1, MGST1, ARNTL2, GSTP1, CYP1A2, ARNT2, CYP2C18, CYP2C9, CYP2C8, GSTM5, GSTM3, and MGST3 (Figure 3). In block 4, we found rs1856908, reported in the allele and genotype analyses. A haplotype block (block 7) was found in MGST3; this block was also found in the smokers' comparison (rs8133-rs4147611). Eight combinations of SNPs were associated with a lower risk of COPD (OR < 1) and eighteen were associated with a higher risk (OR > 1.5). The largest SNP combination was composed of 15 variants ranging from CYP2C18 to CYP2C9, with the highest OR value, at almost eight times higher risk of COPD. MGST3 was the gene with the most blocks; we found three haplotypes, and the SNPs included in the haplotypes had been previously reported in the allele and genotype analyses (Table 7).
Table 6. Genotype association analysis in subjects exposed to biomass-burning smoke.
In the analysis by severity, we found four SNPs associated, three with the severe GOLD stages and one with mild COPD stages. When data were corrected by covariates, three out of four SNPs remained associated. However, no SNP retained its association after Bonferroni correction (Supplementary Table S8). We found five SNPs in the severity analysis of the COPD-BBS group, four associated with a severe form of COPD and one with a mild form of the illness. Although four remained associated after the correction by covariates, no SNP retained its association after Bonferroni adjustment (Supplementary Table S9).
Haplotype association analysis corrected by covariates (age, sex, and BEI). Data are presented as % frequency. The χ² test was used to calculate p-values, OR, and CI (95%); an association was considered significant when p < 0.05. Freq%: frequency in percentages; OR: odds ratio; CI: confidence interval.

Discussion
Although multiple GWAS have described associations with COPD, most studies focus on COPD secondary to tobacco smoking in Caucasian populations from Europe and the USA; we analyzed SNPs in exome regions across the whole human genome using array genotyping technology, looking for variants associated with COPD secondary to either tobacco smoking or biomass-burning smoke in the Mexican mestizo population. The participants were recruited from different COPD early detection campaigns in Mexico City and the highlands of Oaxaca.
In our group, Perez-Rubio et al. had previously described the genetic component of the population included in this study, demonstrating the contribution of the Amerindian/Caucasian genetic component [29]. All patients had at least three prior generations born in Mexico (parents and grandparents) and were considered Mexican mestizos. We have previously demonstrated that this criterion is a good proxy of Mexican ancestry evaluated by ancestry-informative markers [30].
We found differences in variables such as sex, age, BMI, and tobacco index in the smokers' comparison. Due to these differences, we included these covariates in the allele and haplotype association analyses to avoid false-positive findings. For the BBS group, we found differences in age and exposure data. We did not find differences in the men/women ratio, but women are predominantly represented in both groups. Low- to middle-income countries are the principal users of biomass, and each region worldwide reports the use of specific kinds of biomass; for example, in China, charcoal and coal predominate; in Nepal and Kenya, the use of manure from large ruminants is a common practice; and in many Latin American, African, and South Asian countries, firewood from a great variety of trees, and even agricultural waste, predominates [18]. The primary biomass fuel used in Mexico is firewood or mixtures of firewood, manure, and farming waste, especially in rural or suburban areas. The principal population exposed are women and children because women are the principal family members in charge of cooking [18,31].
Rehfuess and collaborators established that 52% of the world population uses either biomass or solid fuel. Stratifying six geographic areas, they determined that Africa, South Asia, and different areas of Latin America are the principal biomass users [32].
The World Health Organization reported that around 2.5 billion people use some form of biomass for cooking alone and that, especially in rural zones, combustion takes place indoors, in closed or poorly ventilated places using improvised stoves or pipes, resulting in so-called "indoor pollution", which mainly affects women and children and causes 1.3 million premature deaths associated with respiratory diseases and infections [31,33].
Candidate gene analysis methodologies are strategies for analyzing post-genotyping data in genome-wide association studies (GWAS) [34]. In this study, we used an exome genotyping array that includes up to 560 thousand sequence-specific probes capable of detecting SNPs in exome regions. We included genes whose biological function was related to xenobiotic and drug metabolism processing.
In the smokers' comparison, we found eight SNPs associated with COPD. Six SNPs were associated with a higher risk of COPD-S: rs11572191 in CYP2J2; rs8133 in MGST3; rs17497857 and rs4964059 in ARNTL2; and rs3901896 and rs8041826 in ARNT2, all with the minor allele. Only two SNPs were associated with lower risk: rs4147611 in MGST3 and rs3742377 in CYP46A1.
Ours is the first study reporting these sets of SNPs in COPD-S, particularly in a mestizo (admixed) population such as the Mexican one. Although none of the polymorphisms in our findings has been previously described, the associated genes are reported in different studies as factors associated with lung diseases. Four SNPs in CYP2J2 were found to be associated with COPD-S in the Chinese Han population [40], and in the Russian population, SNPs in CYP2J2 are associated with bronchitis secondary to smoking [41]. Other investigations have demonstrated that SNPs in CYP2J2 could be involved in lung ischemia and reperfusion injury, especially in smokers [42,43]. Although our investigation focuses on COPD, lung injury and hypertension are common in subjects with COPD. Additionally, CYP2J2 is related to asthma models and cancer [44][45][46]. CYP2C9 has been included in studies related to adenocarcinoma and other forms of lung cancer [45,47,48].
Even though there are few reports on MGST3 and COPD, some SNPs have been associated with attenuating smokers' accelerated decline in FEV1/FVC [49]. No other associations with lung diseases have been reported.
ARNT genes encode proteins capable of binding to aryl hydrocarbon receptors to translocate them to the cell nucleus as transcription factors related to the promoters of genes such as HIF1α. The principal studies on ARNT genes suggest a possible relation with small-cell cancer [50][51][52][53].
The protein encoded by MGST1 (microsomal glutathione S-transferase 1) is a membrane-associated protein with peroxidase activity, which prevents lipid damage caused by reactive oxygen species (ROS), cytotoxic agents, and drugs. The principal associations between MGST1 and lung diseases include different types of cancer, such as adenocarcinoma or non-small cell lung cancer [54,55]. Woldhuis et al. proposed that microsomal glutathione S-transferase 1 could be related to cell senescence and extracellular matrix reorganization [56]. Recently, ferroptosis has been described in other illnesses as a type of programmed cell death with a higher lipid peroxide concentration. MGST1 is differentially expressed in alveolar type 2 cells [57]. In the case of MGST3, sets of polymorphisms attenuated lung function decline in European-American smokers [49].
In the genotype analysis, only SNPs conferring higher risk were found associated with the illness; among these, four were previously described in the allele analysis: rs11572191, rs17497857, rs3901896, and rs8041826. Three more SNPs were found in the genotype analysis: the GG genotype of rs1951576 and the CC genotype of rs943881 in the CYP46A1 gene, and the TT genotype of rs6488842 in the MGST1 gene. The alleles associated with a low risk of COPD were possibly not found in the genotype phase due to the low frequency of minor alleles; not enough homozygous minor allele genotypes were found.
There is limited information regarding severity in relation to the SNPs and genes associated with xenobiotic metabolism. Studies in emphysema have demonstrated that the expression of GSTM3 was upregulated in mild illness [58], while other studies describe SNPs associated with a lower FEV1/FVC ratio [59]. GSTM3 is a gene whose protein product is related to eliminating electrophilic compounds and carcinogens. We found rs2234696 in GSTM3 to be associated with the severe form of COPD in smokers, and while there are no reports regarding the SNP, we hypothesize that it could affect the structure of the protein encoded by the gene, thus impairing its biological function.
Haplotype analysis is used to elucidate possible associations in groups of SNPs in different regions of genes [60]. For the smokers' comparison, we found five blocks with high LD (>85) in ARNTL1, CYP19A1, ARNT2, CYP46A1, and MGST3. Multiple genes have been associated with complex diseases like COPD but with moderate OR [61]. Analyses that combine multiple variants, such as polygenic risk scores, have shown that a combination of genetic variants could explain the multiple associations, an approach that extends to haplotype analysis [62]. In the haplotype blocks, we found five combinations containing SNPs associated in the allele analysis, suggesting a possibly critical role in COPD pathophysiology.
We found more SNPs associated with COPD-BBS than with COPD-S; at the allele level, the principal findings include SNPs in MGST3, CYP2C8, and MGST1. Few genetic studies have been conducted on COPD-BBS. Our current study is the first to use exome-wide genotyping.
The principal studies of COPD-BBS in Latin America emphasize clinical description; other studies include Chinese and Chilean populations but focus on genes such as PRDM15 and CXCL10, respectively [63,64]. Additionally, we found a greater number of SNPs in COPD-BBS than in COPD-S, which could suggest greater complexity in the development of COPD-BBS.
In the haplotype analysis, we found seven blocks in high LD; among these findings, the largest haplotype block was found in MGST1, and the leading role of the encoded protein is related to extracellular matrix reorganization. Clinical characteristics of COPD-BBS include anthracofibrosis, bronchial caliber diminution, and increased mucus production [65]. This bronchial remodeling could be related to genes such as MGST1, but we cannot demonstrate it due to the limited investigation of the cellular effects of BBS. Additionally, we found two different haplotypes in ARNT2 associated with a higher risk of COPD-BBS. Previous reports about COPD focus on tobacco smoking, and some of the most significant results involve AHR and ARNT family genes. The evidence demonstrates an important role of AHR in attenuating inflammation related to neutrophils [66] and in lung remodeling by different genes such as MMP9 [67]. Studies in animal models have demonstrated a possible relationship between the aryl hydrocarbon receptors and CYP genes, especially in asthma, which control inflammatory processes [68]. We found many haplotypes in CYP and ARNT genes, which could support the biological relation.
Even though we found SNPs associated specifically with COPD-S or COPD-BBS, we also found shared SNPs and haplotypes, such as those in ARNT2 and MGST3. This result could suggest the participation of a shared molecular component. Using in silico databases, such as GTEx, we found four SNPs (rs6681 and rs9333378 in MGST3, rs10789501 in CYP4A22, rs117987520 in CYP11A1) that affect the expression levels of the genes where they are located.
With the MCA, we included the SNPs associated with each subtype of COPD, but we did not find clear subgroups. Some studies have demonstrated that multivariate analyses such as MCA and polygenic risk score calculation could give more information regarding the effect of exposure/clinical variables and genetic variants such as SNPs [69].
Another phenomenon reported in our investigation is the association of SNPs with a lower risk of COPD. In previous investigations, we have described similar associations with other SNPs in different genes [70,71]. This effect, described in different illnesses, is called the "Hispanic Paradox", a theory that describes the role of the genetic background of Amerindians, which could lead to lower severity or better prognosis in illnesses, including COPD [72,73].
Ours is the first exome-wide association study in Mexican mestizos with COPD, classified by tobacco smoking and biomass-burning smoke exposure. We demonstrated the highly conserved composition of the Mexican Amerindian population. Although we found differences in demographics and exposure, we corrected the data by logistic regression. Nevertheless, our study is not exempt from limitations; first of all, we need more clinical data, such as the number of exacerbations or predominant phenotypes (bronchitis or emphysema). We also require other auxiliary tools, for instance, expression-related or immunohistochemical assays. Additionally, we need to include more COPD patients to strengthen the severity analysis.

Conclusions
Single-nucleotide variants in CYP2C8, CYP2C9, and MGST3 genes are associated with the risk of COPD secondary to biomass-burning smoke exposure. In addition, shared haplotype blocks in MGST3 and ARNT2 were found in both tobacco smokers and biomass-burning smoke-exposed subjects.