The Genomics and Metagenomics of Asthma Severity (GEMAS) Study: Rationale and Design

Asthma exacerbations are a major contributor to the global disease burden, but no significant predictive biomarkers are known. The Genomics and Metagenomics of Asthma Severity (GEMAS) study aims to assess the role of genomics and the microbiome in severe asthma exacerbations. Here, we present the design of GEMAS and the characteristics of patients recruited from March 2018 to March 2020. Different biological samples and demographic and clinical variables were collected from asthma patients recruited by allergy and pulmonary medicine units in several hospitals in Spain. Cases and controls were defined by the presence/absence of severe asthma exacerbations in the past year (oral corticosteroid use, emergency room visits, and/or asthma-related hospitalizations). A total of 137 cases and 120 controls were recruited. After stratifying by recruitment location (i.e., Canary Islands and Basque Country), cases and controls did not differ in most demographic and clinical variables (p > 0.05). However, cases showed a higher proportion of characteristics inherent to asthma exacerbations (impaired lung function, severe disease, uncontrolled asthma, gastroesophageal reflux, and use of asthma medications) compared to controls (p < 0.05). Similar results were found after stratification by recruitment unit. Thus, asthma patients enrolled in GEMAS are balanced for potential confounders and have clinical characteristics that support the phenotype definition. GEMAS will improve the knowledge of potential biomarkers of asthma exacerbations.


Introduction
Asthma is one of the most prevalent chronic respiratory diseases worldwide. Up to 330 million individuals have asthma and 346,000 deaths per year are related to this disease [1]. In Spain, the prevalence of asthma in children is about 10% [2], although there is a high variation among regions, reaching the highest value (17%) in the Canary Islands [3,4]. Asthma is defined as a heterogeneous respiratory disease characterized by chronic airway inflammation, variable expiratory airflow limitation, and airway hyperresponsiveness to different triggers [5]. Characteristic symptoms of asthma, including wheeze, dyspnea, chest tightness, and cough, may resolve spontaneously or may be controlled with adequate pharmacological therapies [2,5]. Nonetheless, patients with asthma can experience worsening episodes of the disease known as exacerbations, which impose a considerable burden on patients and health systems [5]. Asthma exacerbations limit the daily routine of patients, and they are related to hospitalizations, disability, and higher mortality [1,5]. Moreover, they are associated with high healthcare expenses derived from emergency room (ER) visits, hospitalizations, increased use of medication, and lost workdays [6]. Despite their clinical and economic relevance, no biomarkers are available to predict asthma exacerbations and the only good predictor is a history of at least one asthma exacerbation in the past year [7]. Therefore, the development of new strategies and biomarkers to identify patients at high risk of asthma exacerbations is needed [8].
Asthma is a multifactorial disease in which genetic variation has a key role. The heritability of asthma has been estimated at 65% in adults and even higher in childhood asthma [9]. As in most polygenic and complex diseases, single nucleotide polymorphisms (SNPs) are the most widely studied genetic variants. They consist of a variation in a single nucleotide with a population frequency higher than 1%. Hundreds of thousands of SNPs can be genotyped with microarrays and, because of linkage disequilibrium patterns, millions of SNPs can be inferred using imputation reference panels. Therefore, genome-wide association studies (GWAS) allow genetic variation to be assessed throughout the whole genome without any prior hypothesis about the processes involved, unlike candidate-gene studies, which focus on targeted genes with a previously reported relationship with the disease [10]. While the genetic variation of asthma and other related phenotypes, such as treatment response, has been explored by GWAS, the genetic basis of severe asthma exacerbations has been scarcely investigated through candidate-gene approaches analyzing genes involved in asthma susceptibility [9,11-13].
Similar to human genetic variation, the human microbiota has a role in the development of the immune system and allergic diseases such as asthma [14]. Advances in next-generation sequencing (NGS) techniques have allowed a more accurate microbial characterization based on its genetic make-up (known as the microbiome) regardless of whether the microorganisms are culturable or not. Targeted sequencing of the 16S ribosomal RNA (16S rRNA) gene, a specific bacterial taxonomic marker, has been extensively used to infer the bacterial abundance and diversity of the microbiota [15]. This approach has been applied to describe the microbial diversity in the airways and the continuity in bacterial communities through the oral cavity, upper airways, and lungs [16,17]. In relation to asthma, the alteration (dysbiosis) of the nasopharyngeal or the salivary microbiome is associated with asthma susceptibility and different phenotypes of severe asthma [18-20]. Moreover, respiratory pathogens constitute known triggers for asthma exacerbations and are involved in the development of airway inflammation [16,21].
Despite all this evidence, the role of the salivary and nasopharyngeal microbiome in the development of asthma exacerbations has been scarcely investigated [22-24]. Furthermore, despite the known interplay between host genetics and the microbiome [25], the integrated effect of both layers in the development of asthma exacerbations is still unknown. The integration of clinical data, human genetic variants, and the microbiome has been proposed as a novel approach to advance the knowledge of asthma pathogenesis and precision medicine in asthma management [8,26]. Moreover, well-conducted observational studies are needed to identify potential pharmacogenetic biomarkers before assessing their applicability in precision medicine through clinical trials [27]. In this context, the Genomics and Metagenomics of Asthma Severity (GEMAS) study has emerged to assess the role of genomics, the microbiome, and the interaction between them in the development of asthma exacerbations in Spanish patients with asthma. This manuscript aims to present the rationale, study design, and the demographic and clinical characteristics of the patients recruited in GEMAS between March 2018 and March 2020.

Study Design and Data Collection
The GEMAS study is an ongoing retrospective multicenter study coordinated by the Genomics and Health group from Universidad de La Laguna, Tenerife, Spain. Patients are recruited from the allergy and/or pulmonary medicine units of three hospitals located in the Canary Islands (Hospital Universitario de Canarias, Hospital Universitario Nuestra Señora de Candelaria, and Hospital General de La Palma), and from the division of pediatric respiratory medicine of the Hospital Universitario Donostia in the Basque Country. The study was conducted according to the code of ethics of the World Medical Association (Declaration of Helsinki) and approval was obtained from the ethics committees of the participating centers (approval 29/17 for the Canary Islands hospitals and PI2019077 for Hospital Universitario Donostia). This study is registered in the ClinicalTrials.gov database of the National Institutes of Health (NIH) (identifier: NCT04501926).
Patients were eligible according to the following inclusion criteria: (1) male or female aged ≥8 and ≤85 years, (2) physician diagnosis of asthma according to the Global Initiative for Asthma (GINA) guidelines [5], and (3) treatment at GINA steps 1-5. Patients were excluded according to the following exclusion criteria: (1) one or more grandparents of non-European origin, (2) pregnancy, (3) coexistence of other chronic pulmonary disorders including cystic fibrosis, chronic obstructive pulmonary disease (emphysema or chronic bronchitis), or congenital disorders of the lungs or airways, and (4) known family relatedness (first or second degree) with another participant already included in the study. Asthma patients from allergy and/or pulmonary medicine units were ≥14 years old and those recruited in the division of pediatric respiratory medicine were aged between 8 and 16 years. Patients under 18 years old provided their assent (with signature for those aged 12-17 years) and one of their legal guardians signed the informed consent. Participants aged 18 years or older provided their informed consent to participate in the study. Clinical and demographic variables were registered using a standardized questionnaire. Briefly, these variables include gender, age, height, weight, smoking exposure, educational level, home environment, household pets, lung function measurements, fraction of exhaled nitric oxide (FeNO), total serum immunoglobulin E (IgE) and blood eosinophil levels, comorbidities (otorhinolaryngology diseases, gastroesophageal reflux [GERD], and sleep apnea), atopy tests, the presence of other allergic phenotypes, family history of allergic diseases, age of asthma onset, asthma severity, asthma control, asthma exacerbations, and pharmacological treatment.
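As an illustration only, the inclusion and exclusion criteria listed above could be encoded as a simple screening check. The sketch below is not part of the GEMAS protocol; the `Candidate` fields and the `is_eligible` function are hypothetical names introduced here, and the unit-specific age bands (≥14 years in adult units, 8-16 years in the pediatric division) are deliberately not modeled.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical screening record; field names are illustrative assumptions."""
    age_years: float
    physician_diagnosed_asthma: bool      # diagnosis according to GINA guidelines
    gina_step: int                        # current treatment step
    non_european_grandparents: int        # number of grandparents of non-European origin
    pregnant: bool
    other_chronic_pulmonary_disorder: bool
    related_to_enrolled_participant: bool  # first- or second-degree relatedness


def is_eligible(c: Candidate) -> bool:
    """Return True when all inclusion criteria hold and no exclusion criterion applies."""
    meets_inclusion = (
        8 <= c.age_years <= 85
        and c.physician_diagnosed_asthma
        and 1 <= c.gina_step <= 5
    )
    any_exclusion = (
        c.non_european_grandparents >= 1
        or c.pregnant
        or c.other_chronic_pulmonary_disorder
        or c.related_to_enrolled_participant
    )
    return meets_inclusion and not any_exclusion
```

Keeping inclusion and exclusion as two separate expressions mirrors the way the criteria are stated in the text and makes each rule easy to audit.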
The main outcome assessed in this study was severe asthma exacerbations, defined by at least one of the following asthma-related events in the past year: (1) emergency room (ER) visits, (2) hospitalizations, and/or (3) oral corticosteroids (OCS) use. Case and control status were defined as the presence or absence of asthma exacerbations in the year before enrollment, respectively. Asthma exacerbations, as well as the use of asthma medication, were recorded for three periods prior to recruitment: the past week, the past six months, and the past 12 months. Medication adherence was evaluated by the Medication Adherence Report Scale (MARS-5). Asthma control was assessed by the asthma control questionnaire (ACQ) score and the severity of the disease in each patient was classified based on treatment step according to GINA 2020 [5]. In addition, the following variables that may influence the microbiome composition were recorded: (1) the use of antibiotics in the past two months, (2) health condition during the sampling (presence of cavities, gingivitis, mouth sores, cold, or flu, and pregnancy and lactation status when appropriate), (3) sampling conditions within 30 min and 2 h prior to sampling (inhaled corticosteroids use, any non-alcoholic or alcoholic drinking, eating, teeth brushing, smoking, and chewing gum), and (4) vaccination in the past month (i.e., influenza and pneumococcal vaccines).
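The case/control definition above reduces to a logical OR over three events in the 12 months before enrollment. A minimal sketch follows; the function and argument names are hypothetical and chosen here for illustration, not taken from the study database.

```python
def classify_participant(er_visit_12m: bool,
                         hospitalization_12m: bool,
                         ocs_use_12m: bool) -> str:
    """Label a participant as 'case' or 'control'.

    A case has had at least one asthma-related ER visit, hospitalization,
    or course of oral corticosteroids in the year before enrollment;
    a control has had none of them.
    """
    had_severe_exacerbation = er_visit_12m or hospitalization_12m or ocs_use_12m
    return "case" if had_severe_exacerbation else "control"
```

For example, a patient with OCS use but no ER visit or hospitalization in the past year would be labeled a case.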
Due to the multicenter nature of the study, quality control of the registered data was performed in order to ensure consistent results across recruiting centers. Physicians were instructed to select eligible patients to recruit and to collect the biological samples following common procedures, and standardized questionnaires were completed minimizing missing data. Additionally, the data recorded in the questionnaires were registered and double-checked in a single database by at least two independent researchers. Implausible data entries or inconsistencies between variables detected in the questionnaires were corrected by checking the medical records.

Clinical Assessment
Total serum IgE and blood eosinophil levels were assessed by blood tests. Atopy status was evaluated by skin prick tests and/or specific IgE levels. These tests assessed the most common aeroallergens of each recruiting center, including at least: mites (Dermatophagoides pteronyssinus, Dermatophagoides farinae, and Blomia tropicalis), fungi (Alternaria alternata), dog and cat epithelium, grass mix (Anthoxanthum odoratum, Cynodon dactylon, Dactylis glomerata, Festuca pratensis, Holcus lanatus, Lolium perenne, Phleum pratense, Poa pratensis, and/or Secale cereale), mugwort (Artemisia vulgaris), and Parietaria. The skin prick test included both positive (histamine) and negative controls. Atopy was defined by at least one positive result for any of the allergens evaluated, based on a wheal diameter ≥3 mm larger than the negative control in the skin prick test or specific IgE levels >0.35 IU/mL. Lung function was assessed by baseline and post-bronchodilator measurements of the forced expiratory volume in the first second (FEV1), the forced vital capacity (FVC), and the FEV1/FVC ratio. Spirometry and the bronchodilator test were performed according to the American Thoracic Society and the European Respiratory Society (ATS/ERS) international guidelines [28]. Subjects were instructed to withhold short-acting bronchodilators for 8 h and long-acting bronchodilators for 24 h prior to pulmonary function testing. Patients were also instructed to avoid smoking or vaping in the previous hour, not to consume intoxicants (including alcohol) in the eight hours before the test, not to perform vigorous exercise in the previous hour, and not to wear clothes that might restrict lung function. Moreover, based on recommendations from the Spanish Society of Pulmonology and Thoracic Surgery (SEPAR) [29], patients were advised to avoid the consumption of stimulant drinks (including coffee or tea) and large meals in the previous 2-3 h.
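The atopy definition given in this section (at least one positive skin prick test or specific IgE result) can be written as a small rule. The sketch below assumes, for illustration, that results are held in dictionaries keyed by allergen name; the function name, argument names, and data layout are hypothetical, and the specific IgE values are assumed to be in the units reported in the text.

```python
WHEAL_MARGIN_MM = 3.0          # wheal must exceed the negative control by at least 3 mm
SPECIFIC_IGE_THRESHOLD = 0.35  # specific IgE must be strictly above this value


def is_atopic(wheal_mm_by_allergen: dict,
              negative_control_wheal_mm: float,
              specific_ige_by_allergen: dict) -> bool:
    """Return True if any allergen is positive by skin prick test or specific IgE."""
    positive_prick = any(
        wheal - negative_control_wheal_mm >= WHEAL_MARGIN_MM
        for wheal in wheal_mm_by_allergen.values()
    )
    positive_ige = any(
        level > SPECIFIC_IGE_THRESHOLD
        for level in specific_ige_by_allergen.values()
    )
    return positive_prick or positive_ige
```

Either dictionary may be empty when only one type of test was performed, which matches the "and/or" wording of the definition.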
Predicted values and z-scores of lung function measurements were estimated using the Global Lung Function Initiative (GLI) 2012 equations [30]. Bronchodilator response (BDR) was calculated as (post-FEV1 − pre-FEV1)/pre-FEV1. FeNO levels were assessed at Hospital Universitario Donostia following ATS/ERS international guidelines [31].
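The bronchodilator response formula above is a relative change in FEV1. A minimal sketch, with a hypothetical function name and FEV1 values assumed to be in liters:

```python
def bronchodilator_response(pre_fev1_l: float, post_fev1_l: float) -> float:
    """Relative change in FEV1 after bronchodilator: (post - pre) / pre.

    Returns a fraction (multiply by 100 for a percentage).
    """
    if pre_fev1_l <= 0:
        raise ValueError("pre-bronchodilator FEV1 must be positive")
    return (post_fev1_l - pre_fev1_l) / pre_fev1_l
```

For instance, a pre-bronchodilator FEV1 of 2.0 L and a post-bronchodilator FEV1 of 2.3 L give a BDR of 0.15, that is, a 15% improvement.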
Body mass index (BMI) was calculated based on weight and height, and obesity was defined following the World Health Organization (WHO) criteria as BMI ≥ 30 kg/m² for participants older than 18 years. For those aged ≤18 years, BMI-for-age z-scores were calculated based on the WHO growth reference for children aged 5-19 years, and obesity was defined as a value greater than two standard deviations above the reference median [32].
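The age-dependent obesity definition above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, and the BMI-for-age z-score is taken as an input because deriving it requires the WHO reference tables, which are not reproduced here.

```python
from typing import Optional


def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI in kg/m^2."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def is_obese(age_years: float,
             weight_kg: float,
             height_m: float,
             bmi_for_age_z: Optional[float] = None) -> bool:
    """Apply the adult BMI cutoff or the pediatric z-score cutoff depending on age."""
    if age_years > 18:
        # Adults: BMI of 30 kg/m^2 or more.
        return body_mass_index(weight_kg, height_m) >= 30.0
    # Participants aged 18 or younger: z-score must be supplied by the caller
    # (assumed to come from the WHO 5-19 years growth reference).
    if bmi_for_age_z is None:
        raise ValueError("BMI-for-age z-score is required for participants aged 18 or younger")
    return bmi_for_age_z > 2.0
```

Passing the z-score explicitly keeps the sketch independent of any particular implementation of the WHO reference.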

Biological Sample Collection and Storage
From each patient, saliva, nasal, and pharyngeal samples were obtained for bacterial deoxyribonucleic acid (DNA) extraction. One milliliter of saliva was collected in Oragene OMNIgene® ORAL OM501 tubes (DNA Genotek, Inc., Ottawa, ON, Canada) to ensure the stability and integrity of bacterial communities. Following standardized procedures, nasal and pharyngeal samples were obtained with sterile swabs stored in AMIES transport medium (DeltaSwab Amies, Deltalab, Barcelona, Spain). Moreover, 4 mL of blood were drawn in tubes with ethylenediaminetetraacetic acid (EDTA) for human DNA extraction and 2.5 mL were collected with PAXgene® Blood RNA Tubes (PreAnalytiX, Hombrechtikon, Switzerland) to stabilize the intracellular human ribonucleic acid (RNA). EDTA tubes were drawn first to avoid changes in gene expression due to the phlebotomy procedure. Samples and data from patients from the Basque Country included in this study were provided by the Basque Biobank/Biodonostia Node and were processed following standard operating procedures with the appropriate approval of the Ethical and Scientific Committees. Samples were stored at −20 °C at each recruiting center, and they were transported and stored at the same temperature at Universidad de La Laguna until processing, except for the PAXgene tubes, which were stored at −80 °C.

Processing of Samples for Future Genomic and Metagenomic Studies
Bacterial DNA extraction was performed from saliva, nasal, and pharyngeal samples following a protocol that includes a mechanical pre-lysis of bacterial cells with the Pathogen Lysis Tubes S kit (Qiagen, Hilden, Germany) and the isolation and purification of nucleic acids with the spin column-based kit QIAamp UCP Pathogen Mini kit (Qiagen). Human DNA extraction from blood samples was performed using the Illustra blood genomicPrep Mini Spin Kit (GE Healthcare, Amersham, UK) following the manufacturer's instructions. RNA extraction was performed with the NucleoSpin® RNA Blood kit (MACHEREY-NAGEL, Düren, Germany). DNA and RNA samples are stored at −20 °C and −80 °C, respectively, until they are analyzed.
The human microbiome will be assessed by targeted sequencing of the V3-V4 region of the 16S ribosomal RNA (16S rRNA) gene. This region has been selected because it provides high coverage of the bacterial microbiome at the genus level and is the most widely used region in human microbiome studies [33]. Paired-end sequencing (2 × 300 nucleotides) will be performed using the NGS MiSeq platform and the MiSeq Reagent Kit v3 (Illumina, Inc., San Diego, CA, USA). In addition, human DNA samples will be genotyped using the Infinium Global Screening Array-24 kit v3.0 (Illumina, Inc.) to interrogate up to 642,824 single nucleotide polymorphisms (SNPs) across the whole genome.

Statistical Analysis for the Comparison of Demographic and Clinical Characteristics between Cases and Controls
A comparison of the demographic and clinical characteristics of cases and controls was performed separately for the participants recruited in the Canary Islands and the Basque Country. Additionally, a stratified analysis was performed for each of the recruitment units (i.e., allergy or pulmonary medicine). Continuous variables with a normal distribution were summarized using the mean and standard deviation, and those with a non-normal distribution using the median and interquartile range. Differences between groups were evaluated using the t-test and the Mann-Whitney U-test for continuous variables with normal and non-normal distributions, respectively. Categorical variables were summarized by counts and percentages in each category, and they were compared between cases and controls using Fisher's exact test. Statistical significance was declared at p-value < 0.05.
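To make the categorical comparison concrete, the sketch below implements a two-sided Fisher's exact test for a 2 × 2 table using only the Python standard library. It is not the software used in the study, which is not stated; the function name is hypothetical. The example counts (32 of 34 cases and 4 of 18 controls with severe asthma) are an assumption reconstructed from the percentages reported for the Basque Country children, supposing no missing values.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g., cases and controls); columns are the presence
    or absence of the characteristic being compared.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def table_probability(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = table_probability(a)
    # Sum the probabilities of every table that is no more likely than the observed
    # one; the small relative tolerance guards against floating-point ties.
    return sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-7)
    )


# Assumed counts: severe asthma in 32/34 cases vs. 4/18 controls.
p_value = fisher_exact_two_sided(32, 2, 4, 14)
print(f"Fisher's exact test p-value: {p_value:.2e}")
```

Under the stated assumption about the counts, the result should be well below the 0.05 significance threshold used in the study and of the same order of magnitude as the p-value reported for this comparison.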

Results
A total of 257 individuals with asthma (137 cases and 120 controls) have been recruited (Table 3, Box 1). Saliva, nasal, pharyngeal, and blood samples are available from all subjects, and blood samples in PAXgene tubes have been collected from a subset of 39 adults with asthma from the Canary Islands (21 cases and 18 controls).
The comparison of cases (n = 103) and controls (n = 102) recruited in the Canary Islands showed no differences between both groups for age, gender, tobacco smoking exposure, educational level, home environment, or household pets (p-value > 0.05, Table 3). Moreover, some clinical variables related to asthma, such as the presence of atopy and other allergic phenotypes, total IgE and eosinophil levels, family history of asthma and allergic diseases, age of asthma onset, and medication adherence were similar between both groups (p-value > 0.05).
Cases had a higher lung function impairment compared to controls (median predicted pre-FEV1: 83.4% vs. 90.3%, p-value = 0.003; median predicted pre-FVC: 87.7% vs. 94.2%, p-value = 0.020). When asthma-related comorbidities were assessed, only GERD was more frequent in patients who had exacerbations in the past year compared to those who did not have exacerbations (32.7% vs. 15.7%, p-value = 0.005). Additionally, severe asthma and poorly controlled asthma were more frequent in cases compared to controls (p-value = 0.005 and p-value = 1.2 × 10⁻⁵, respectively). Moreover, patients with asthma exacerbations required higher administration of both reliever and controller asthma medications in the past year, such as short-acting beta agonists (SABA), inhaled corticosteroids (ICS), long-acting beta agonists (LABA), or OCS (p-value < 0.05, Table 3), and were more frequently treated with antibiotics in the past two months compared to controls (44.1% vs. 17.8%, p-value = 6.8 × 10⁻⁵). Similar trends of lung function impairment, coexistence of GERD, presence of severe and poorly controlled asthma, and use of asthma medications were observed when cases and controls were stratified based on recruitment unit (allergy and pulmonary medicine units) (Table S1).

Box 1. Summary of the main characteristics of patients enrolled in the Genomics and Metagenomics of Asthma Severity (GEMAS) study.
• A total of 257 individuals with asthma have been recruited between March 2018 and March 2020 (137 cases and 120 controls).
• Cases and controls recruited in the Canary Islands and in the Basque Country did not differ for many potential confounders for the future genomic and microbiome analyses (e.g., age, gender, tobacco exposure, comorbidities, etc.).
• Cases from the Canary Islands have clinical characteristics that support the definition of asthma exacerbations (i.e., impaired lung function, severe and uncontrolled asthma, coexistence of GERD, and use of asthma medication).
• Children from the Basque Country who had asthma exacerbations in the past year have higher proportion of severe asthma and OCS use, and worse medication adherence than controls.

Table 3. Description of patients with asthma recruited in the GEMAS study.

From the 52 children with asthma recruited in the Basque country, 34 were classified as cases and 18 as controls (Table 3). There were no differences between both groups in any of the clinical and demographic variables except for severe asthma (94.1% vs. 22.2%, p-value = 1.7 × 10⁻⁷) and treatment with OCS (94.1% vs. 5.6%, p-value = 1.3 × 10⁻¹⁰), which were more frequent in exacerbators. Additionally, patients with exacerbations had worse self-reported medication adherence than controls (p-value = 0.033, Table 3). No differences either in lung function or FeNO were found between children who developed asthma exacerbations and those who did not (p-value > 0.05, Table 3).

Discussion
GEMAS is a study aimed at investigating the role of genomics and the human microbiome in asthma exacerbations that has included, so far, a total of 257 children and adults with asthma from the Canary Islands and the Basque Country. Patients were divided into cases and controls according to the presence/absence of asthma exacerbations, obtaining a well-balanced distribution of individuals in the two groups. There were no differences between groups in many potential confounders for the genomic and microbiome analyses, and the observed dissimilarities support the definition of cases and controls since they are based on characteristics inherent to the trait [34]. Among the patients recruited in the Canary Islands, those who developed asthma exacerbations showed lower lung function and had a higher proportion of poor asthma control, more severe disease, the coexistence of GERD, and the use of reliever and controller medications than controls. Moreover, the group of cases showed higher rates of prescription of antibiotics in the last two months, which was expected given the relationship between respiratory infections and asthma exacerbations [21]. Among children recruited in the Basque Country, cases and controls were also balanced for potential confounders except for the use of OCS and self-reported medication adherence. Unlike adult patients, lung function in children did not show greater deterioration in cases with asthma exacerbations than in controls, which was expected since the impairment of lung function because of asthma is a chronic process that worsens over time and with the frequency of exacerbations [35].
Across the literature, similarities and differences between adulthood and childhood asthma have been described [36,37]. Indeed, differences in some clinical variables between the Canary Islands and Basque Country patients might be related to the proportion of asthma phenotypes in these two groups. All patients from the Basque Country were children (aged between 8 and 16 years) with a median age of asthma onset of 2.3 years and a high frequency of clinical variables related to allergy. In addition, 80% of them have atopy, 83% have other allergic phenotypes (e.g., rhinitis and/or dermatitis), and 81% have a family history of other allergic diseases. All these features are characteristic of early-onset allergic TH2 asthma [5,37], so a higher incidence of this phenotype is expected in this group. Besides, TH2 inflammation has also been strongly associated with type I hypersensitivity allergy (which is mediated by IgE) and eosinophilic inflammation [5,37], which can explain the higher levels of total serum IgE (median 462.0 vs. 137.9 IU/mL, p-value = 0.013) and blood eosinophils (median 500.0 vs. 300.0 cells/mm³, p-value = 0.002) in this population when compared with the Canary Islands patients. In contrast, patients recruited in the Canary Islands are mostly adults. Therefore, not only childhood-onset but also adult-onset asthma phenotypes (e.g., neutrophilic, obesity-related, or smoking-associated asthma) are expected to coexist in this group. Indeed, the median age of asthma onset in the Canary Islands patients was 18 years (interquartile range: 6.0-40.0 years). As the age of asthma onset increases, a higher incidence of non-TH2 asthma is expected, and thus mechanisms other than the IgE-mediated response or eosinophilic inflammation may underlie adult-onset asthma [37].
Regarding the genetics of asthma, adulthood and childhood asthma phenotypes have a similar but not identical genetic basis [38]. Moreover, although several genetic variants associated with asthma-related traits, such as treatment response, have demonstrated similar effects both in adults and children, many of them are not shared and aging has a mitigating effect [12]. Nevertheless, the overlap between polymorphisms related to asthma exacerbations in adults and children has been scarcely investigated [11]. Besides, it is well known that age is closely related to the composition of the human microbiome [39], which can limit the comparison between studies in adults and children. Indeed, microbiome studies of asthma exacerbations conducted in children have shown consistent findings, such as the presence of Moraxella in the nasal microbiome as a risk factor for exacerbations [22,24], but these findings differ from those identified in adults [23]. Given the limited number of studies, definitive conclusions cannot be established, and GEMAS can provide independent insights into the comparison between pediatric and adult populations.
The GEMAS study has several strengths and limitations that must be considered. Regarding the strengths, to the best of our knowledge, GEMAS is the first study of asthma exacerbations where different non-invasive samples (saliva, nasal, and pharyngeal swabs) have been collected. Moreover, the use of standardized and common procedures makes GEMAS an open-collaboration study to participate in meta-analyses to disentangle the role of genomics not only in asthma exacerbations but also in other allergic and respiratory traits. Indeed, the characteristics of GEMAS will allow collaboration with other studies of Spanish asthma patients recruited in different regions of the country, such as the MEGA project [40], to improve the knowledge of asthma in this population. Finally, the fact that genotype and microbiome data will be available in GEMAS from both children and adults with asthma will allow investigation into the interaction of these two layers in two differentiated asthma phenotypes.
On the other hand, some limitations need to be acknowledged. First, although several variables that may disturb the microbiome were recorded, many other potential confounders of microbiome assessment may remain uncontrolled (e.g., use of probiotics, gastrointestinal diseases, evidence of oral candidiasis, etc.) [41]. Moreover, the use of antibiotics was widespread among patients who developed asthma exacerbations and their effects on the human microbiome have a high interindividual variation [42]. However, the fact that antibiotic use is recorded will allow us to take this variable into account in the statistical analyses. Second, the multicenter nature of the study may limit the power to detect differences in the human microbiome composition, since the microbiome is highly conditioned by the environment [43]. Third, the current sample size is limited, and healthy individuals have not been recruited as additional controls. However, this limitation will be overcome in future stages of the study, since patient enrollment is still ongoing.

Conclusions
GEMAS is a promising study with well-balanced groups of asthma patients with and without asthma exacerbations that will contribute to improving the knowledge of the scarcely investigated role of genomics and the human microbiome in the development of asthma exacerbations. The clinical characteristics of the enrolled patients support the definitions of cases and controls, and the use of common and standardized procedures will allow for the collaboration with other studies. Moreover, the collection of non-invasive samples to investigate the human microbiome will help to establish potential biomarkers with better clinical applicability that could be included in further clinical studies to move towards precision medicine.