GSDMB/ORMDL3 Rare/Common Variants Are Associated with Inhaled Corticosteroid Response among Children with Asthma

Inhaled corticosteroids (ICS) are efficacious in the treatment of asthma, which affects more than 300 million people in the world. While genome-wide association studies have identified genes involved in differential treatment responses to ICS in asthma, few studies have evaluated the effects of combined rare and common variants on ICS response among children with asthma. Among children with asthma who were treated with ICS and had whole exome sequencing (WES) data in the PrecisionLink Biobank (91 White and 20 Black children), we examined the association of rare and common variants with hospitalizations or emergency department visits. For 12 regions previously associated with asthma and ICS response (DPP10, FBXL7, NDFIP1, TBXT, GLCCI1, HDAC9, TBXAS1, STAT6, GSDMB/ORMDL3, CRHR1, GNGT2, FCER2), we used the combined sum test for the sequence kernel association test (SKAT), adjusting for age, sex, and BMI and stratifying by race. Validation was conducted in the Biorepository and Integrative Genomics (BIG) Initiative (83 White and 134 Black children). Using a Bonferroni threshold for the 12 regions tested (i.e., 0.05/12 ≈ 0.004), GSDMB/ORMDL3 was significantly associated with ICS response for the combined effect of rare and common variants (p-value = 0.003) among White children in the PrecisionLink Biobank, and the signal replicated in the BIG Initiative (p-value = 0.02). Using WES data, the combined effect of rare and common variants for GSDMB/ORMDL3 was associated with ICS response among children with asthma in the PrecisionLink Biobank and replicated in the BIG Initiative. This proof-of-concept study demonstrates the power of biobanks of pediatric real-life populations in asthma genomic investigations.


Introduction
Asthma is the most common chronic illness in children, and inhaled corticosteroids (ICS) are the most commonly used controller medication for asthma [1]. Individual response profiles to ICS show significant and repeatable heterogeneity. Among patients receiving ICS for 8-12 weeks, 35-40% do not show significant improvement in lung function [2,3]. Among asthmatic patients on maintenance ICS treatment, approximately 10% may remain symptomatic or at high risk of asthma attacks [4]. In addition to poor adherence to medication, important factors influencing treatment response in patients with asthma include environmental exposures, misdiagnosis, and genetic variation [5].
To date, most pharmacogenetic studies on children with asthma have been clinical trials. However, studies in real-life populations are complementary to clinical trials because real-life populations account for external patient-, provider-, and system-level factors that may moderate an intervention's effect. The objectives of this study were to (1) assess the potential effects of combined rare and common variants and (2) conduct a proof-of-concept pharmacogenetic study in two real-world pediatric biobanks through the Genomic Information Commons [17], a collaborative across nine children's hospitals. To this end, we examined rare and common variants associated with ICS response for 12 regions previously associated with asthma and ICS response (DPP10, FBXL7, NDFIP1, TBXT, GLCCI1, HDAC9, TBXAS1, STAT6, GSDMB/ORMDL3, CRHR1, GNGT2, FCER2) in the PrecisionLink Biobank, with replication in the Biorepository and Integrative Genomics (BIG) Initiative.

Discovery Population: PrecisionLink Biobank
The discovery population was drawn from the Boston Children's Hospital PrecisionLink Biobank for Health Discovery [18], which is a site in the Genomic Information Commons [17]. The PrecisionLink Biobank links residual specimens produced as by-products of routine care to phenotypic data derived from electronic health records (EHRs). Inclusion criteria were age ≥ 6 years, a diagnosis of asthma, and use of one of the following ICSs within the past year: beclomethasone, budesonide, ciclesonide, fluticasone, or mometasone. For this proof-of-concept study, 91 White and 20 Black children were identified through the PrecisionLink Portal as meeting inclusion criteria and were confirmed by manual review of the EHR [19]. The main outcome measure, ICS response, was defined as the number of emergency department visits or hospitalizations for asthma over a one-year period based on EHRs.

Validation Population: BIG Initiative
The validation population was the BIG Initiative, a collaborative effort between the University of Tennessee Health Science Center and Le Bonheur Children's Hospital [20]. Inclusion criteria for the validation population were the same as for the discovery population. In the BIG Initiative, 83 White and 134 Black children met inclusion criteria and had available WES data.

WES Processing for PrecisionLink Biobank Samples
DNA was obtained from the PrecisionLink Biobank. The PrecisionLink Biobank initiative is approved by the BCH Institutional Review Board (protocol number P00000159). Genomic DNA (gDNA) samples were extracted from whole blood specimens of 200 subjects using the Gentra Puregene Extraction Kit or Chemagic B5k Extraction Kits and were quality controlled using the Quant-It assay. Gentra Puregene is part of QIAGEN. The Boston Children's Biobank uses a standard operating procedure to calculate and perform normalization. Afterwards, gDNA samples were shipped to the Quest Diagnostics Sequencing Center for whole exome sequencing (WES). After assessment of DNA quantity, each qualified genomic DNA sample was enzymatically fragmented. Sequencing libraries were prepared by ligating sequencing adapters to both ends of the DNA fragments. The libraries were size-selected with a bead-based method to ensure optimal template size and were amplified by polymerase chain reaction (PCR). Regions of interest (exons and intronic targets) were targeted using hybridization-based target capture methods (Integrated DNA Technologies xGen Exome). The quality of each completed sequencing library was controlled by confirming the correct template size and quantity and the absence of leftover primers and adapter-adapter dimers. Libraries that passed quality control were sequenced with Illumina's sequencing-by-synthesis method using paired-end sequencing (150 by 150 bases). Sequenced reads were processed using the GPU-accelerated genomics software suite NVIDIA Clara™ Parabricks (v3.5.0, https://www.nvidia.com/en-us/clara/genomics/ (accessed on 10 November 2022)). We used the modules 'fq2bam', 'applybqsr', and 'haplotypecaller', which implement mapping reads using BWA-MEM, marking duplicated reads, recalibrating base quality scores using GATK BQSR (Base Quality Score Recalibration), and variant calling using GATK HaplotypeCaller. Parabricks v3.5.0 is based on BWA version 0.7.15 and GATK version 4.1.0.0. The outputs from the Parabricks suite were per-sample genomic VCF (gVCF) files, which were then used for joint genotyping using GATK4 HaplotypeCaller (version 4.2.0.0).

Statistical Approach
We used the combined sum test for the sequence kernel association test (SKAT) to examine the combined effect of rare and common variants on ICS response [21]. In addition, we considered rare and common variant associations separately. We defined rare vs. common variants using a MAF cutoff of 1/√(2 × sample size). This is the default value for the SKAT_CommonRare function that implements the combined sum test for SKAT in R. This resulted in MAF cutoffs of 0.07 and 0.16 for the White and Black children in the PrecisionLink Biobank, respectively, and 0.08 and 0.06 for the White and Black children in the BIG Initiative, respectively. The supplement contains the rs numbers for the SNPs included in the analyses, the classification of each SNP as a rare or common variant, and whether the SNP is an intronic variant. We defined the outcome of non-response based on hospitalization or ED visit. We adjusted for age, sex, and BMI and stratified by race. Using a Bonferroni correction for the 12 regions, we used a significance threshold of 0.05/12 ≈ 0.004. For LD pruning, we considered a shifting window of 50 SNPs, calculated the LD between each pair of SNPs in the window, and removed one SNP of a pair if the r² was greater than 0.5.
In the secondary analysis, for the region (GSDMB/ORMDL3) that was significantly associated with ICS response for the combined effect of both rare and common variants among White children in the PrecisionLink Biobank, we also analyzed the association of each SNP in the region with ICS response separately in order to examine the contribution of each SNP. We used logistic regression models adjusted for age, sex, and BMI. We restricted the analysis to SNPs with at least 4 variants present in order to ensure convergence of the logistic regression models.
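The numeric settings described above can be checked with a short sketch. It reads the rare vs. common cutoff as 1/√(2 × sample size), which reproduces the four reported cutoffs, and it illustrates the pruning rule with a simplified greedy pass. The function names, the toy genotype vectors, and the greedy left-to-right pruning order are assumptions made for illustration; this is not the SKAT package or the authors' pipeline, and the step size of the shifting window is not stated in the text.

```python
import math


def rare_common_maf_cutoff(n_samples):
    """MAF cutoff 1 / sqrt(2 * n) separating rare from common variants."""
    return 1.0 / math.sqrt(2 * n_samples)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 coding)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def ld_prune(genotypes, window=50, r2_max=0.5):
    """Greedy pruning (illustrative): drop the later SNP of any pair within
    `window` positions whose r^2 exceeds `r2_max`; return kept indices."""
    keep = [True] * len(genotypes)
    for i in range(len(genotypes)):
        if not keep[i]:
            continue
        for j in range(i + 1, min(i + window, len(genotypes))):
            if keep[j] and r_squared(genotypes[i], genotypes[j]) > r2_max:
                keep[j] = False
    return [i for i, kept in enumerate(keep) if kept]


# Sample sizes as reported for the two biobanks.
cohorts = {
    "PrecisionLink White": 91,
    "PrecisionLink Black": 20,
    "BIG White": 83,
    "BIG Black": 134,
}
cutoffs = {name: round(rare_common_maf_cutoff(n), 2) for name, n in cohorts.items()}
threshold = round(bonferroni_threshold(0.05, 12), 3)

# Toy genotypes (hypothetical): SNP 1 duplicates SNP 0, SNP 2 is uncorrelated.
toy = [[0, 1, 2, 0, 1, 2], [0, 1, 2, 0, 1, 2], [1, 0, 1, 1, 2, 1]]
kept = ld_prune(toy)
```

Because the cutoff depends only on sample size, the smallest stratum (20 Black children in the PrecisionLink Biobank) has the highest cutoff, so variants with a MAF of up to roughly 0.16 are treated as rare in that group.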

Results
For the PrecisionLink and BIG Initiative biobanks, baseline demographics are shown in Table 1 and results are shown in Table 2. Among White children in the PrecisionLink Biobank, GSDMB/ORMDL3 was significantly associated with ICS response for the combined effect of both rare and common variants (p-value = 0.003). This signal replicated in the BIG Initiative (p-value = 0.02). When we stratified by rare and common variants in the GSDMB/ORMDL3 region, both rare (p-value = 0.008) and common variants (p-value = 0.03) were marginally associated with ICS response among White children in the PrecisionLink Biobank. However, common variants (p-value = 0.03) and not rare variants (p-value = 0.30) were marginally associated with ICS response in the BIG Initiative. In addition, STAT6 was associated with ICS response for the combined effect of rare and common variants (p-value = 0.003) among Black children in the PrecisionLink Biobank. However, the sample included only 20 children and the signal did not replicate in the BIG Initiative. For the secondary analysis, Table 3 shows the association of ICS response with the individual SNPs in GSDMB/ORMDL3 among White children in the PrecisionLink Biobank for SNPs with at least four variants present. While the common variants (p-value = 0.03) and rare variants (p-value = 0.008) were each jointly associated with ICS response in Table 2, no single common variant is strongly associated with ICS response in Table 3. The joint association of the common variants in GSDMB/ORMDL3 with ICS response seems to be due to several SNPs in the region and not the effect of a single SNP. Also, note that all of the SNPs that were classified as common variants in GSDMB/ORMDL3 are also intronic variants.
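To make the two significance levels used in these results concrete, the sketch below labels the GSDMB/ORMDL3 p-values reported in this section against the Bonferroni threshold (0.05/12) and the nominal 0.05 level. The dictionary keys and the function name are illustrative assumptions; the p-values are those quoted in the text.

```python
BONFERRONI = 0.05 / 12  # threshold for the 12 regions tested
NOMINAL = 0.05          # level used for "marginal" associations


def classify(p_value):
    """Label a p-value using the two thresholds described in the text."""
    if p_value < BONFERRONI:
        return "Bonferroni significant"
    if p_value < NOMINAL:
        return "marginal"
    return "not significant"


# GSDMB/ORMDL3 p-values as reported in the Results.
reported = {
    "PrecisionLink White, combined": 0.003,
    "PrecisionLink White, rare": 0.008,
    "PrecisionLink White, common": 0.03,
    "BIG Initiative, combined": 0.02,
    "BIG Initiative, common": 0.03,
    "BIG Initiative, rare": 0.30,
}
labels = {name: classify(p) for name, p in reported.items()}

for name, label in labels.items():
    print(f"{name}: p = {reported[name]} -> {label}")
```

Under this labelling, only the combined test in the discovery cohort passes the Bonferroni threshold, while the replication p-value of 0.02 falls below the nominal 0.05 level.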

Discussion
Using WES data, GSDMB/ORMDL3 was significantly associated with ICS response for the combined effect of rare and common variants (p-value = 0.003) among White children in the PrecisionLink Biobank and replicated in the BIG Initiative (p-value = 0.02). GSDMB/ORMDL3 has previously been associated with ICS response and exacerbations for common variants [22], asthma for common variants [23], and asthma for the combined effect of rare and common variants [24]. However, GSDMB/ORMDL3 has not previously been associated with ICS response or exacerbations for rare variants. While the association of asthma exacerbations or ICS response with rare variants has been studied previously, no significant associations were found [15,16].
While the objectives of this study were to assess the association of combined rare and common variants with ICS response and to conduct a proof-of-concept pharmacogenetic study in two real-world pediatric biobanks, this study had limitations. The sample sizes were small, and evaluation in larger cohorts is needed. While STAT6 was significantly associated with ICS response among Black children in the PrecisionLink Biobank, caution is needed for this result since the sample size is extremely small and the signal did not replicate. Also, because the outcome was based on EHRs for Boston Children's Hospital, outpatient visits, ED visits, and hospitalizations that occurred at other sites may have been missed.
Future studies are needed due to the small sample size of this study. Given the complexity of asthma and its treatment responses, larger and more diverse samples could potentially provide a more robust understanding of the genetic determinants involved. Future studies incorporating preliminary evidence from functional or structural predictions regarding the identified variants could further strengthen this study's conclusions. Such evidence would enhance the clinical relevance of these findings by offering insights into the biological mechanisms through which these variants influence ICS response.
Large biobanks that link collections of biological specimens to clinical information in diverse, real-life populations have been fruitful in the discovery of pharmacogenetic markers. However, few such biobanks are dedicated to children, likely because of logistical and regulatory issues related to the recruitment of children. Nevertheless, including children is important to ensure that they benefit from genetic advancements. This proof-of-concept study demonstrates the potential power of biobanks of pediatric real-life populations in genomic investigations for asthma.

Table 1.
Characteristics of participants for the PrecisionLink Biobank and Biorepository and Integrative Genomics (BIG) Initiative.

Table 2.
Results of the combined common and rare variant analyses. Analyses are also stratified by rare and common variants, giving the p-value for the combined sum test and the number of SNPs included in the analysis. Bonferroni significant results are highlighted in green and marginal associations (p-value < 0.05) are highlighted in yellow. Note that the column "BP" (base pairs) refers to genome build GRCh38/hg38, and the column "# SNP" gives the number of SNPs.

Table 3.
Results of the single variant logistic regression analyses for GSDMB/ORMDL3 in White children in the PrecisionLink Biobank for SNPs with at least 4 variants present.