Genome-Wide Association Study of Airway Wall Thickening in a Korean Chronic Obstructive Pulmonary Disease Cohort

Airway wall thickening (AWT) plays an important pathophysiological role in airway diseases such as chronic obstructive pulmonary disease (COPD). There are only a few studies on the genetic components contributing to AWT in the Korean population. This study aimed to identify AWT-related single-nucleotide polymorphisms (SNPs) using a genome-wide association study (GWAS). We performed GWAS for AWT using the CODA and KUCOPD cohorts. Thereafter, a meta-analysis was performed. Airway wall thickness was measured using automatic segmentation software. The AWT at an internal perimeter of 10 mm (AWT-Pi10) was calculated by the square root of the theoretical airway wall area using the full-width-half-maximum method. We identified a significant SNP (rs11648772, p = 1.41 × 10−8) located in LINC02127, near SALL1. This gene is involved in the inhibition of epithelial–mesenchymal transition in glial cells, and it affects bronchial wall depression in COPD patients. Additionally, we identified other SNPs (rs11970854, p = 1.92 × 10−6; rs16920168, p = 5.29 × 10−6) involved in airway inflammation and proliferation and found that AWT is influenced by these genetic variants. Our study helps identify the genetic cause of COPD in an Asian population and provides a potential basis for treatment.


Introduction
Chronic obstructive pulmonary disease (COPD) is one of the most common lung diseases worldwide and is characterized by airflow obstruction that is not fully reversible with treatment [1]. Small airway diseases and parenchymal destruction play a role in the pathogenesis of COPD at different rates over time, resulting in chronic airflow limitation.
These pathologies do not always occur simultaneously, and their contribution to the development of COPD differs between individuals [2]. Airway remodeling in asthma and COPD results in airway wall thickening (AWT), which affects lung function. AWT is associated with chronic mucus hypersecretion in larger airways and airway obstruction in smaller airways [3]. The pathological process underlying AWT is chronic inflammation and remodeling of the airway wall by external factors, such as dust [3]. However, there are no certain genetic factors known to influence airway inflammation. Understanding the underlying genetic mechanisms will help develop novel diagnostic techniques and treatment strategies. Many improvements have been made in the quantification of the airway phenotype using computed tomography (CT) quantification methods. We attempted to measure airway wall thickness more objectively using image scanner technology and the associated software.
AWT plays an important pathophysiological role in airway disease [4]. COPD patients with chronic bronchitis have thicker airway walls than COPD patients without chronic bronchitis [5]; AWT plays a major role in COPD, and COPD-associated genes are also associated with AWT [6]. Moreover, two single-nucleotide polymorphisms (SNPs) associated with AWT were identified in male participants from Groningen and Utrecht [3].
However, few studies have explored the genetic mechanism underlying AWT using genome-wide association studies (GWAS) [7]. Therefore, we aimed to identify SNPs associated with AWT through GWAS. This study may enable the identification of genetic mechanisms that aid in understanding the development of AWT.

Study Population
Study participants were chosen from two cohorts: 500 subjects (324 COPD patients and 176 controls) from the COPD in dusty areas (CODA) cohort [8] and 474 subjects (235 COPD patients and 239 matched controls) from the KoGES-Ansan cohort [9,10]. Written informed consent was obtained from all the participants. This study was approved by the Kangwon National University Hospital IRB (KNUH 2012-06-007, clinical trial registration number KCT-0000552) and the Korea University Ansan Hospital IRB (2017AS0070).

Spirometry and Imaging Procedures
Lung function was measured before and after administering 400 µg of salbutamol using EasyOne (NDD, Zurich, Switzerland). The pulmonary function measures were calculated according to the American Thoracic Society and European Respiratory Society guidelines [11]. CT was performed at maximal inspiration and expiration in the supine position using a dual-source CT scanner (Somatom Definition, Siemens Healthcare, Forchheim, Germany for the CODA cohort; Brilliance 64, Philips Healthcare, Cleveland, OH, USA for the KUCOPD cohort). Airways were measured using automatic segmentation software (Aview, Coreline Soft, Seoul, Korea). The AWT at an internal perimeter of 10 mm (AWT-Pi10) was calculated by plotting the square root of the airway wall area against the internal perimeter of each measured airway using the full-width-half-maximum (FWHM) method.

Genotyping and Quality Controls
Genomic DNA was isolated from blood samples. Genotypes were created using the Axiom ™ Precision Medicine Research Array, which contains more than 860,000 SNPs. Quality control of the SNPs was performed using PLINK [12] and ONETOOL [13]. SNPs were excluded based on the following exclusion criteria: genotype call rate < 95% and the Hardy-Weinberg equilibrium (HWE) test [14], where p < 1 × 10 −5 . Subjects were excluded if 0.2 < X chromosome homozygosity < 0.8, genotype call rate < 95%, or heterozygosity rates of SNPs were outside the average heterozygosity rate ± 3 standard deviation (SD). Following the quality control processes, 433 and 387 subjects and 768,913 and 775,371 SNPs were included in the CODA and KUCOPD cohorts, respectively ( Figure 1).
Following the quality control processes, 433 and 387 subjects and 768,913 and 775,371 SNPs were included in the CODA and KUCOPD cohorts, respectively ( Figure 1). Figure 1. Workflow of quality control for CODA and KUCOPD data. Several standard quality control steps were produced for CODA and KUCOPD data to exclude outlier single-nucleotide polymorphisms (SNPs) and subjects.

Genotype Imputation of SNP Genotype Data
The imputation was conducted using the Michigan imputation server (https://imputationserver.sph.umich.edu, accessed on 20 May 2020) for the CODA and KUCOPD data. We used Haplotype Reference Consortium release v1.1 as a reference panel and considered only 'not European' or 'mixed' populations [15]. Pre-phasing and imputation were conducted using Eagle V 2.4 and Minimac4, respectively [16,17]. After the imputation processes, the following imputed SNPs were removed: SNPs with Rsq less than 0.3; with duplications; missing genotype rates larger than 0.05; p values for HWE less than 1 × 10 −5 ; and minor allele frequencies (MAFs) less than 0.05. Additionally, subjects with identity-by-descent > 0.9 and principal component score outside the 5 × IQRPC were excluded. A total of 433 and 387 subjects and 4,941,935 and 4,956,071 SNPs were used for our analyses (Figure 1). Figure 1. Workflow of quality control for CODA and KUCOPD data. Several standard quality control steps were produced for CODA and KUCOPD data to exclude outlier single-nucleotide polymorphisms (SNPs) and subjects.

Genotype Imputation of SNP Genotype Data
The imputation was conducted using the Michigan imputation server (https://im putationserver.sph.umich.edu, accessed on 20 May 2020) for the CODA and KUCOPD data. We used Haplotype Reference Consortium release v1.1 as a reference panel and considered only 'not European' or 'mixed' populations [15]. Pre-phasing and imputation were conducted using Eagle V 2.4 and Minimac4, respectively [16,17]. After the imputation processes, the following imputed SNPs were removed: SNPs with Rsq less than 0.3; with duplications; missing genotype rates larger than 0.05; p values for HWE less than 1 × 10 −5 ; and minor allele frequencies (MAFs) less than 0.05. Additionally, subjects with identity-by-descent > 0.9 and principal component score outside the 5 × IQR PC were excluded. A total of 433 and 387 subjects and 4,941,935 and 4,956,071 SNPs were used for our analyses (Figure 1).

Meta-Analysis of GWAS
GWAS for AWT was applied to CODA and KUCOPD data using linear regression. Airway wall thickness was used as a response variable, and ten PC scores, estimated using PLINK adjusted for population structure, age, and sex, were included as covariates. A Genes 2022, 13, 1258 4 of 10 meta-analysis of both datasets was performed using METAL program. The sample size of each dataset was used as the weight [18]. The genome-wide significance level was set at α = 5 × 10 −8 .

RNA Expression Association Test
GSE18965 from the GEO database was used to detect significant associations between AWT and the RNA expression of genes identified from the meta-analysis [19]. RNA expression in the airway epithelial cells of nine patients with asthma and seven healthy non-atopic controls was evaluated using the Affymetrix Human Genome U133A Array. The 'limma' and 'balli' tools of the R package [20,21] were used to detect the differentially expressed genes.

Subject Characteristics
The demographic properties of the CODA data of 433 subjects and the KUCOPD data of 387 subjects are presented in Table 1. There were significant differences in age, height, weight, and BMI between the CODA and KUCOPD data. Subjects from the CODA data group were much older, shorter, and heavier than those included in the KUCOPD data group. Lung function is indicated by the predicted forced expiratory volume in 1 s measured without a bronchodilator (FEV1.Pred.pre-BD, %); this was lower in the CODA data (83.9 ± 22.7) than in the KUCOPD data (101.7 ± 17.0). Additionally, forced expiratory volume in 1 s divided by forced vital capacity (FEV1/FVC) was lower in the CODA data (65.1 ± 11.5) than in the KUCOPD data (71.2 ± 8.7). There were no significant differences in the level of smoking. In conclusion, there were significant differences in age, physique, and pulmonary functions between both CODA data and the KUCOPD data.

GWAS of Airway Wall Thickness
We performed GWAS with linear regression for the CODA and KUCOPD data. A multi-dimensional scaling plot based on our dataset and the 1000 genome datasets was generated; no evidence of population stratification was found for both the CODA and KUCOPD data (Supplementary Figure S1). However, both CODA and KUCOPD data were case-control data, and the population stratification was expected to be the main confounder for genetic analyses. Ten PC scores were utilized as covariates to adjust this problem. Quantile-quantile plots for GWAS with CODA data, KUCOPD data, and their meta-analyses are shown in Supplementary Figure S2. There was no genomic inflation, as indicated by the genomic inflation factors (Λ CODA = 1.00, Λ KUCOPD = 0.99, Λ meta = 1.00). In Supplementary Figure S3, the top Manhattan plot indicated a genetically significant locus (rs4491106, β = 0.1503, p = 1.58 × 10 −8 ) in the linear regression of CODA data. However, there were no significant SNPs in the linear regression of KUCOPD data (bottom Manhattan plot). For meta-analysis, rs11648772, located on chromosome 16 in LINC02127 and Spalt-like transcription factor 1 (SALL1), met the genome-wide significance level (p = 1.41 × 10 −8 ) (Figure 2). Table 2 shows the results for the most significant ten SNPs by meta-analysis. The MAFs of the ten SNPs were compared with those of the SNPs in the Korean reference (Kref) dataset. When there were several genome-wide significant SNPs in the same LD block, only the most significant SNP was included. Figure 3 shows the forest plot for the top ten SNPs in Table 2. In particular, rs11648772, which is located near rs147153117, has been shown to be associated with post-bronchodilator and FEV1/FVC ratio [22]. In summary, we found rs11648772 located near the SALL1 gene is genetically correlated with airway wall thickness in a meta-analysis using CODA and KUCOPD data.

GWAS of Airway Wall Thickness
We performed GWAS with linear regression for the CODA and KUCOPD data. A multi-dimensional scaling plot based on our dataset and the 1000 genome datasets was generated; no evidence of population stratification was found for both the CODA and KUCOPD data (Supplementary Figure S1). However, both CODA and KUCOPD data were case-control data, and the population stratification was expected to be the main confounder for genetic analyses. Ten PC scores were utilized as covariates to adjust this problem. Quantile-quantile plots for GWAS with CODA data, KUCOPD data, and their meta-analyses are shown in Supplementary Figure S2. There was no genomic inflation, as indicated by the genomic inflation factors (ΛCODA = 1.00, ΛKUCOPD = 0.99, Λmeta = 1.00). In Supplementary Figure S3, the top Manhattan plot indicated a genetically significant locus (rs4491106, β = 0.1503, p = 1.58 × 10 −8 ) in the linear regression of CODA data. However, there were no significant SNPs in the linear regression of KUCOPD data (bottom Manhattan plot). For meta-analysis, rs11648772, located on chromosome 16 in LINC02127 and Spalt-like transcription factor 1 (SALL1), met the genome-wide significance level (p = 1.41 × 10 −8 ) (Figure 2). Table 2 shows the results for the most significant ten SNPs by metaanalysis. The MAFs of the ten SNPs were compared with those of the SNPs in the Korean reference (Kref) dataset. When there were several genome-wide significant SNPs in the same LD block, only the most significant SNP was included. Figure 3 shows the forest plot for the top ten SNPs in Table 2. In particular, rs11648772, which is located near rs147153117, has been shown to be associated with post-bronchodilator and FEV1/FVC ratio [22]. In summary, we found rs11648772 located near the SALL1 gene is genetically correlated with airway wall thickness in a meta-analysis using CODA and KUCOPD data.

Analysis of Differentially Expressed Genes
To assess the differentially expressed genes associated with AWT, we evaluated six genes (SALL1, CYLD, NOD2, BRD7, ADCY7, and HEATR3) located near rs11648772, using the GSE18965 dataset from the GEO database. SALL1 gene expression was upregulated in

Analysis of Differentially Expressed Genes
To assess the differentially expressed genes associated with AWT, we evaluated six genes (SALL1, CYLD, NOD2, BRD7, ADCY7, and HEATR3) located near rs11648772, using the GSE18965 dataset from the GEO database. SALL1 gene expression was upregulated in patients with asthma (β limma = 0.12, p limma = 0.0173, β balli = 0.12, p balli = 0.0092) ( Table 3). In short, we confirmed that the expression of the SALL1 gene, the closest gene to the rs11648772, had a positive association with airway wall thickness in asthmatic patients.

Discussion
In this study, we identified SNPs associated with AWT using genotype and CT data meta-analysis from two Korean cohorts. The most significantly associated SNP was rs11648772, located in LINC02127 and SALL1. SALL1 was the nearest gene to rs11648772, and its RNA expression was associated with airway wall thickness in asthmatic patients with atopic dermatitis.
Previous studies have assessed airway dimensions, such as lumen area or diameter, or Pi10 with different airway sampling methods, in particular chest CT, in relation to airflow limitation, respiratory symptoms, and emphysema [3]. The high resolution and marked natural contrast between air and normal lung parenchyma enable quantitative measurements using dedicated software. The combination of quantitative parameters offered by these tools enables better analysis of COPD phenotypes and prediction of outcomes. Therefore, in addition to diagnosis, these quantitative measurements allow for the staging of disease severity and phenotyping of patients [23]. In chest CT scan phenotypes, the estimated heritability of both FEV1 and FEV1/FVC is close to 25%, while the heritability of COPD status was estimated to be 37.7% in non-Hispanic whites and African Americans [24].
Identifying SNPs associated with the AWT phenotype through GWAS will enable the discovery of airway disease-related genes. Therefore, we used FWHM to detect SNPs associated with AWT and found that rs11648772 was the most significant in the meta-analysis. rs11648772 is located in the LINC02127 and SALL1 genes. SALL1, the nearest target gene to rs11648772, encodes a zinc finger transcriptional repressor, which functions in the nucleosome remodeling deacetylase histone deacetylase complex. Asthmatic epithelial cells in children secrete less fibronectin, an important contributor to the dysregulated airway epithelial cell repair. Fibronectin is an essential component of the provisional extracellular matrix because it provides a surface for epithelial migration and proliferation. Impaired fibronectin expression contributes to the abnormal epithelial repair seen in asthmatic airways [19]. SALL1 inhibits cell migration by preventing epithelial-mesenchymal transition (EMT) and downregulating the expression of stem cell markers [25]. Furthermore, it acts as a tumor suppressor by inhibiting Wnt/β-catenin signaling [25]. Recently, the role of EMT in airway remodeling was established [26]. In COPD, the airway epithelium may be damaged and/or activated by irritants, such as the constituents of cigarette smoke, stimulating the deposition of collagen from myofibroblasts in the lamina propria [27]. SALL1 influences EMT inhibition in glial cells, and it could affect bronchial wall depressions in asthmatic patients. Therefore, we hypothesize that SALL1 is associated with the EMT pathway and inhibition of cell proliferation in the bronchial airway. However, further studies on SALL1 expression in patients with bronchial damage are essential to elucidate its role in bronchial epithelial cells, and the genetic functions of rs11648772 need further analysis.
In addition to SALL1, rs11970854 and rs16920168 were involved in airway inflammation and proliferation; rs11970854 is located near KLRG2 (dist = 7,721) on chromosome 7, and rs16920168 is located in the LYN of chromosome 1. KLRG2 is associated with the inflammatory Innate counterpart of type 2 helper T cells (ILC2s); it expresses GATA3, a key transcription factor of ILC2 [28]. Increased ILC2 levels cause pathogenic chronic inflammation and/or alterations in the structure, repair, and developmental processes of the lung [29]. LYN downregulates allergen-induced airway inflammation, and its overexpression decreases mucus secretion and MUC5AC transcription in mice exposed to allergens [30].
In a cohort recruited from the Dutch NELSON, novel AWT-associated SNPs rs10794108 and rs7078439 on the C10ORF90 and DOCK1 loci were identified. These SNPs were not significant in the meta-analysis and KUCOPD data; however, in CODA data, rs10794108 was replicated with p = 0.04352 in the linear regression. We also found a significant SNP (rs4492106) located on chromosome 5 in the CODA data group.
This study has some limitations. We could not identify the SNPs reported in a previous study based on the European population. This could be attributed to ethnic differences. Although our analysis was conducted in two independent groups, the small sample size could decrease the statistical power. Therefore, further studies with larger population sizes are needed.
To conclude, we identified SNPs associated with the AWT phenotype, which could influence the pathogenesis of airway diseases and provide a potential basis for treatment in Asian populations.