Colonic Mucosal Microbiota and Association of Bacterial Taxa with the Expression of Host Antimicrobial Peptides in Pediatric Ulcerative Colitis

Inflammatory bowel diseases (IBD), ulcerative colitis (UC) and Crohn’s disease (CD), are chronic debilitating disorders of unknown etiology. Over 200 genetic risk loci are associated with IBD, highlighting a key role for immunological and epithelial barrier functions. Environmental factors account for the growing incidence of IBD, and the microbiota is considered an important contributor. Microbiota dysbiosis can lead to a loss of tolerogenic immune effects and initiate or exacerbate inflammation. We aimed to study colonic mucosal microbiota and the expression of selected host genes in pediatric UC. We used high-throughput 16S rDNA sequencing to profile microbiota in colonic biopsies of pediatric UC patients (n = 26) and non-IBD controls (n = 27). The expression of 13 genes, including five for antimicrobial peptides, in parallel biopsies was assessed with qRT-PCR. The composition of microbiota differed significantly between UC and non-IBD (PCoA, p = 0.001). UC children had a decrease in Bacteroidetes and an increase in several family-level taxa including Peptostreptococcaceae and Enterobacteriaceae, which correlated negatively with the expression of antimicrobial peptides REG3G and DEFB1, respectively. Enterobacteriaceae correlated positively with the expression of the siderophore-binding protein LCN2, and Betaproteobacteria correlated negatively with DEFB4A expression. The results indicate that reciprocal interaction of epithelial microbiota and defense mechanisms plays a role in UC.


Introduction
Inflammatory bowel diseases (IBD) affect up to seven million people globally and three million in Europe and their incidence and prevalence are increasing [1,2]. The two main conditions of IBD are ulcerative colitis (UC) and Crohn's disease (CD). Epidemiological patterns suggest that IBD will emerge as a major worldwide disease in the coming years [1,2]. The etiology of IBD is on the microbiota composition, although the subgroup analysis of UC and non-IBD did not separately show the effect of location (Supplementary Figure S1). Further, the patient age had a significant effect on the microbiota composition (Supplementary Figure S2). Based on these results, the sampling location and age were included as confounders in all subsequent analysis.

Mucosal Microbiota Associated with UC
In assessing the UC-associated microbiota, we first compared microbiota diversity and richness between active, inactive disease and non-IBD, and found no significant difference between the groups (Supplementary Figure S3).
Next, we compared the microbiota composition at the bacterial phylum level. Non-IBD subjects harbored a significantly lower abundance of Firmicutes (Figure 1A, p = 0.006) than UC patients, as well as more Bacteroidetes and fewer Proteobacteria, although the latter two differences were not statistically significant. Subsequent analysis with higher taxonomic resolution separated UC and non-IBD in PCoA at the bacterial family level (Figure 1B, p = 0.001), and disease activity accounted for 9% of the variation in the data (Figure 1C, p = 0.001). Concerning individual bacterial taxa at the family level, Sutterellaceae, Veillonellaceae and unclassified Erysipelotrichia were increased, and unclassified Bacteroidetes and Negativicutes were decreased in UC (Table 2). Further subgroup analysis revealed patterns of increasing and decreasing abundances of individual taxa from active to inactive UC and to non-IBD (Figure 2). Most notably, unclassified Bacteroidales and Porphyromonadaceae (also belonging to the Bacteroidales order) as well as unclassified Bacteroidia were decreased in UC as compared to non-IBD, whereas Coriobacteriaceae, Streptococcaceae, Peptostreptococcaceae, Veillonellaceae, Enterobacteriaceae and unclassified Gammaproteobacteria were increased in UC. Inactive UC differed from active disease in having a decreased abundance of Streptococcaceae and Peptostreptococcaceae, and an increased abundance of unclassified Saccharibacteria (formerly known as the TM7 phylum), which was equally low in abundance in active UC and non-IBD. Across all samples, Saccharibacteria correlated positively with Micrococcaceae (r = 0.35, p = 0.01) and Ruminococcaceae (r = 0.28, p = 0.04).

Figure 2. Family-level microbiota differences between active and inactive ulcerative colitis and non-IBD controls. Microbiota abundance is expressed as relative abundance. An asterisk indicates a statistically significant difference between active UC and non-IBD controls.

Mucosal Gene Expression and Correlations with Microbial Taxa
The relative gene expression of 6 out of the 13 studied genes differed between UC subjects and controls, and also between active and inactive disease (Table 3). The expression of interleukin 8 (IL-8), chemokine CXCL16, the calcium-binding proteins S100A8 and S100A9 and lipocalin 2 (LCN2) was significantly increased, and the expression of antimicrobial peptide DEFB1 was significantly decreased, in UC patients when compared to non-IBD. The expression of the other studied genes for antimicrobial peptides, DEFB103B, DEFB4A, RETNLB and REG3G, did not differ between the study groups, although REG3G showed a tendency for decreased expression in UC (p = 0.076). Further, the expression of trefoil factor 3 (TFF3), which is involved in the maintenance and repair of intestinal mucosa, and of the main colonic mucin MUC2 was at similar levels in both study groups. Correlation analysis of the microbial and qRT-PCR data revealed correlations between the abundance of family-level taxa and the expression of specific genes (Table 3). The abundance of Peptostreptococcaceae and Enterobacteriaceae correlated negatively with the expression of antimicrobial peptides REG3G and DEFB1, respectively. The abundance of Enterobacteriaceae correlated positively with the expression of the siderophore-binding protein LCN2, and Betaproteobacteria correlated negatively with the expression of DEFB4A. Sutterellaceae and Veillonellaceae had positive correlations with the expression of the CXCL16 chemokine and its receptor CXCR6 as well as IL-8. The abundance of Lactobacillaceae correlated negatively with the expression of CXCL16, S100A8 and S100A9.

Discussion
Previous studies have suggested that both microbiota and impaired function of the intestinal epithelium contribute to UC pathogenesis. Here, we carried out high-throughput sequencing of the microbiota and applied a targeted host gene expression analysis to study mucosal microbiota-host interactions in Finnish pediatric UC patients and controls. The main limitation of our study is that the cohort size is relatively small and hence some of the results, particularly those showing associations between microbial abundance and mucosal gene expression, should be considered preliminary and confirmed in a larger group of study subjects. Moreover, there was variation in the subjects' age and biopsy location, which was taken into account in the microbiota analyses. Unfortunately, we did not gather long-term information on the history of antibiotic usage before the sampling, and it was not possible to assess the impact of overall antibiotic use on microbiota in this cohort, although at the time of sampling the study subjects were not receiving antibiotics.
On the other hand, the homogenous ethnic background of the study population could be considered as a strength. Our cohort included children with varying ages and biopsy locations, and both of these factors were included as confounders in the microbiota analysis. Similar to our results, age has been found to have a significant impact on the composition of both mucosal and fecal microbiota in children [18,23]. Concerning biopsy location, our results were not fully conclusive that mucosal microbiota would significantly differ between different parts of the colon, but for prudence it was included as a confounder. Some previous studies have concluded that mucosal microbiota differs significantly between colon segments [4], whereas others have considered it to be fairly comparable [17].
Our results on the comparison of microbiota between patients having active and inactive UC and control subjects reassert the previously described UC-associated dysbiosis that has been characterized by the depletion of anaerobic commensals and increase in facultatively anaerobic taxa [10,18], while bacterial diversity or richness may not be affected [4]. We found depletion of Bacteroidetes and several family-level taxa in the colonic mucosa of pediatric UC patients, which has also been described for pediatric CD [17]. Bacteroides species, albeit also being opportunistic pathogens, are considered as health-promoting in the gut mucosa, as they are capable of reinforcing the epithelial barrier, exerting anti-inflammatory actions by releasing polysaccharide A (PSA), sphingolipids and outer membrane vesicles, and ameliorating experimental colitis [24][25][26].
We observed that UC patients had an increased amount of Enterobacteriaceae and other unclassified Gammaproteobacteria, Sutterellaceae, Veillonellaceae, Streptococcaceae and Peptostreptococcaceae, which was emphasized in the patients with active disease. This replicates previous studies showing these taxa to be increased in pediatric CD or UC [17,18], and particularly Gammaproteobacteria and Enterobacteriaceae are renowned for their proinflammatory properties due to the production of lipopolysaccharide (LPS) [9,10]. Veillonellaceae and specific species within Streptococcaceae have also been previously linked with more severe disease progression in new-onset pediatric UC [18]. Concerning Sutterellaceae, the association with IBD is unclear as the results vary between studies [17,18,27]. All these families, excluding Peptostreptococcaceae, are aerotolerant, and increased oxygen levels in the inflamed gut may promote their growth, as suggested by the oxygen hypothesis [28]. Peptostreptococcaceae are anaerobic commensals, whose increased abundance in UC may be linked to factors other than increased oxygen levels during inflammation, such as altered expression of antimicrobial peptides in the epithelium. This is supported by the result that Peptostreptococcaceae abundance correlated negatively with the expression of REG3G. The expansion of this bacterial taxon has been previously observed in adult UC patients in remission [14].
A novel finding in our study was that patients with inactive UC had an increased abundance of Saccharibacteria as compared to non-IBD and active UC patients, and that across all samples Saccharibacteria correlated positively with Micrococcaceae and Ruminococcaceae, the latter of which has been found to be reduced in treatment-naïve pediatric UC and CD [17,18]. Saccharibacteria (formerly known as the TM7 phylum) are ultrasmall bacteria that parasitize other bacteria and display highly dynamic interactions with their hosts, including virulent killing [29]. Thus, Saccharibacteria may affect the gut microbiota structure and functionality, and consequently mucosal homeostasis. For example, specific species of Saccharibacteria have been shown to silence the ability of their host bacterium to induce TNF-alpha expression in macrophages [30]. Our finding on the increased abundance of Saccharibacteria in inactive UC is therefore of particular interest, and the possible involvement of this bacterial group in fluctuating microbiota composition and remission-relapse cycling in UC should be studied further.
Overall, it seems that the shifts in the balance of the mucosal microbiota in UC towards increased proportions of pro-inflammatory bacteria, especially Enterobacteriaceae, and decreased proportions of anti-inflammatory bacteria, such as Bacteroides spp., may initiate and exacerbate inflammation. The origin of microbiota shifts is an intriguing question, and future studies should address microbiota changes in patients longitudinally across remission-relapse cycles and also investigate the possible role of less studied microbial groups in the shifts, including phages and the ultrasmall bacterial parasites Saccharibacteria that were found to be increased in UC in this study.
The quantitative expression analysis of 13 selected genes revealed differential expression of six genes in UC as compared to non-IBD. These included IL-8 and CXCL16, which were previously found to have increased expression in pediatric UC patients [5], as well as the calcium-binding proteins S100A8 and S100A9, whose complex is also known as calprotectin, an established biomarker of disease activity in UC and other chronic inflammatory diseases. We also confirmed, now in a pediatric cohort, the previous findings of decreased expression of antimicrobial defensin β 1 (DEFB1), a key effector of the innate immune system [31]. Moreover, we showed that the decrease in expression could be linked to an increased Enterobacteriaceae abundance. Defective production of DEFB1 could potentially lead to increased levels of proinflammatory bacteria and, therefore, to activation of the mucosal immune system and activity of the inflammatory disease.
In addition to these findings, we showed that Sutterellaceae and Veillonellaceae had a positive correlation with the expression of CXCL16. Although the causal link remains speculative, the possible regulatory effect of these taxa on mucosal CXCL16 expression is an intriguing question that could be addressed in future studies. Interestingly, the expression of CXCL16, S100A8 and S100A9 correlated negatively with Lactobacillaceae, which are proposed to exert anti-inflammatory action in the gut [32]. However, we did not find a correlation between the abundance of Lactobacillaceae and the expression of the selected antimicrobial peptides (AMPs), although recent animal model studies have shown that Lactobacillus spp. could stimulate AMP production [33][34][35].
The negative correlations of REG3G and DEFB1 expression with the abundance of Peptostreptococcaceae and Enterobacteriaceae, respectively, may partly explain the increase in abundance of these taxa in UC. On the other hand, Betaproteobacteria abundance correlated negatively with the expression of DEFB4A, and although DEFB4A expression did not differ between the study groups, the result supports the idea that a repertoire of antimicrobial peptides participates in maintaining microbiota eubiosis. However, the positive correlation between LCN2 expression and Enterobacteriaceae may suggest that these bacteria induce the expression of the siderophore-binding protein as a host defense mechanism to limit the availability of iron and subsequently bacterial growth.
In summary, our results reinforce the suggestion that reciprocal interaction of mucosal microbiota and epithelial defense mechanisms plays a role in UC. The depletion of Bacteroidetes and increase in facultative anaerobes including Enterobacteriaceae may negatively affect the immunological tolerance towards gut microbiota. The negative correlation between the expression of antimicrobial peptides and specific taxa suggests that impaired host defense mechanisms may allow the expansion of specific microbes able to exacerbate inflammation in UC. The results provide leads for further studies to investigate host-microbiota interactions and can help to develop strategies to restore mucosal microbiota and homeostasis in UC.

Patients
Patients were recruited at the Department of Pediatrics in Turku University Hospital, Turku, Finland and Tampere University Hospital, Tampere, Finland. Endoscopies were performed on clinical grounds due to a previous diagnosis of UC or symptoms suggestive of IBD. Endoscopic findings in colonoscopy were classified according to the Mayo endoscopic subscore as normal or inactive disease (score 0), mild (score 1), moderate (score 2) and severe disease (score 3) [36]. The division of the patients into the active and inactive UC groups was based on the histological scoring, which was in accordance with the endoscopic Mayo scoring. The following groups of patients were included in the study: children with endoscopically active UC (n = 18), children with clinically and endoscopically quiescent UC (n = 8) and 27 children with macroscopically and microscopically non-inflamed colon who underwent endoscopy for various reasons (Table 1), such as chronic diarrhea (n = 7), abdominal pain (n = 8), hematochezia (n = 9) or other gastrointestinal symptoms (n = 3). These were included in the study as non-IBD controls. In children with active UC, biopsies were taken in the involved area, i.e., where macroscopic inflammation was found during endoscopy, and these locations included cecum and ascending and descending colon. In inactive UC and non-IBD subjects, biopsies were collected from the same locations to allow reasonable comparison between the groups. The study cohort demographics are presented in Table 1. The study subjects were not receiving antibiotics at the time of sampling or for four weeks prior to the sampling. Written informed consent was obtained from all the study patients or their parents. The study was accepted by the ethical committee of the hospital district of Southwest Finland.

Isolation of Host RNA and Microbial DNA
Two biopsies, one for microbial DNA extraction and one for RNA isolation, were taken from each patient, in addition to the routine biopsies for histological examination. The biopsy was taken in the involved area if macroscopic inflammation was found during endoscopy. Otherwise, the biopsy was taken in a non-involved area. In any case, a sample for histological evaluation was taken in the same area as the biopsy for RNA isolation in order to confirm whether the area was inflamed or not. The collected biopsies represented different locations, i.e., cecum, ascending and descending colon (Table 1). Biopsy samples for microbial DNA isolation were frozen at −80 °C within 2 h of collection, and the microbial DNA was extracted as described previously [37]. Biopsy samples for RNA analysis were rinsed with RNase-free water and then immediately immersed in RNAlater RNA stabilization reagent (Qiagen, Hilden, Germany), then incubated at 4 °C for 1 day and stored at −20 °C until RNA isolation. After tissue homogenization, RNA was isolated with the RNeasy Plus Mini kit (Qiagen) according to the manufacturer's instructions [37]. The quality of the isolated RNA was analyzed with the Bio-Rad Experion System (Bio-Rad Laboratories, Hercules, CA, USA).

16S rDNA Amplicon Sequencing
Amplicons from the V1 to V2 region of 16S rRNA genes were generated by PCR using the degenerate primers 27F-DegL (5′-AGRGTTYGATYMTGGCTCAG-3′) and 338R (5′-TGCTGCCTCCCGTAGGAGT-3′), producing a ~311 bp amplicon [38]. To facilitate pyrosequencing using titanium chemistry, each forward primer was appended with the titanium sequencing adaptor A and an "NNNNNNNN" barcode sequence at the 5′ end, where NNNNNNNN is a sequence of eight nucleotides that was unique for each sample. The reverse primer carried the titanium adaptor B at the 5′ end.
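The primer and barcode layout described above can be illustrated with a minimal Python sketch. It assumes standard IUPAC degenerate-base codes and that the adaptor sequence has already been trimmed from the read; the barcode, sample name and read used in the usage note are hypothetical and are not taken from the study, and this is not the demultiplexing code used by the authors.

```python
import re

# IUPAC codes; 27F-DegL uses R (A/G), Y (C/T) and M (A/C).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "M": "[AC]", "K": "[GT]",
    "S": "[CG]", "W": "[AT]", "N": "[ACGT]",
}

FORWARD_PRIMER = "AGRGTTYGATYMTGGCTCAG"  # 27F-DegL, as given in the text
BARCODE_LENGTH = 8                       # eight-nucleotide sample barcode


def primer_to_regex(primer):
    """Expand a degenerate primer into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer))


def demultiplex(read, barcode_to_sample):
    """Return (sample, insert) for a read laid out as barcode + primer + insert.

    Returns None when the barcode is unknown or the forward primer does not
    directly follow the barcode. `barcode_to_sample` is a hypothetical mapping
    supplied by the caller.
    """
    sample = barcode_to_sample.get(read[:BARCODE_LENGTH])
    if sample is None:
        return None
    match = primer_to_regex(FORWARD_PRIMER).match(read, BARCODE_LENGTH)
    if match is None:
        return None
    return sample, read[match.end():]
```

For example, with the hypothetical mapping `{"ACGTACGT": "sample_1"}`, the read `"ACGTACGT" + "AGAGTTTGATCCTGGCTCAG" + "GATGAACGCT"` is assigned to `sample_1` and the barcode and primer are stripped from it.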
PCRs were performed using a Mx3005P thermocycler (Stratagene, La Jolla, CA, USA) in a total volume of 25 µL containing 1× PCR buffer, 1 µL PCR-grade nucleotide mix, 2.4 units of AmpliTaq Gold DNA polymerase (Applied Biosystems/Life Technologies, Waltham, MA, USA), 200 nM forward and reverse primers (Oligomer, Helsinki, Finland) and 100 to 300 ng of template DNA. The amplification program consisted of an initial denaturation step at 96 °C for 2 min; 35 cycles of denaturation at 96 °C for 30 s, annealing at 56 °C for 45 s and elongation at 72 °C for 60 s; and a final extension step at 72 °C for 10 min. Three parallel reactions per sample were prepared. The size of the PCR products was confirmed by gel electrophoresis using 1% (wt/vol) agarose gel and ethidium bromide staining. Control PCRs were performed alongside each separate amplification without addition of template, and consistently yielded no product. PCR products from 3 to 5 parallel reactions were pooled and purified with the QIAquick PCR purification kit (Qiagen, Hilden, Germany), followed by DNA yield quantification using a NanoDrop ND-1000 spectrophotometer. The pooled amplicons were pyrosequenced using 454-GS FLX titanium chemistry (Roche Diagnostics, Rotkreuz, Switzerland) in the sequencing core facility at the Institute of Biotechnology, University of Helsinki, using the manufacturer's protocols. The data are available in the European Nucleotide Archive (ENA) under accession number PRJEB38527.

16S rDNA Amplicon Data Analysis
Pyrosequences were sorted per barcode. We processed 876,571 raw reads using in-house R scripts and the Quantitative Insights Into Microbial Ecology (QIIME) software package version 1.9 [39].
Preprocessing in R included removal of chimeric reads by mapping to the ChimeraSlayer reference database (Broad Microbiome Utilities version microbiomeutil-r20110519) using the Usearch v. 8.0.1623 uchime_ref algorithm with default settings [40,41]. Furthermore, sequences shorter than 300 nt were excluded. In QIIME, the preprocessing included removing reads lacking a barcode or primer sequence and removing the forward and reverse primer sequences from the reads. The quality control steps in QIIME were done with default settings. Briefly, a maximum of 6 ambiguous bases per read was allowed, and sequences were discarded if the average quality score over a sliding window spanning 50 nucleotides dropped below 25 [39]. The final dataset included 393,237 reads, with a mean read count of 5869 per sample. The OTUs detected once across all samples were removed. OTUs for the filtered reads were defined at 97% sequence similarity using UCLUST in QIIME [40]. Representative sequences from each OTU were taxonomically assigned with the UCLUST method and the SILVA v.119 reference database in QIIME.
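The read-level filtering criteria stated above (minimum length of 300 nt, at most 6 ambiguous bases, and a mean quality of at least 25 in every 50-nucleotide window) can be sketched as follows. This is a simplified illustration under the assumption that a read is rejected outright when any window fails; it is not the QIIME implementation or the authors' in-house R code, and the function and constant names are hypothetical.

```python
MIN_LENGTH = 300          # reads shorter than this are excluded
MAX_AMBIGUOUS = 6         # maximum number of ambiguous (non-ACGT) bases
WINDOW = 50               # sliding-window width in nucleotides
MIN_WINDOW_QUALITY = 25   # minimum mean quality score within any window


def passes_quality_filter(sequence, qualities):
    """Return True if a read satisfies the three criteria described in the text.

    `sequence` is a nucleotide string and `qualities` is a list of per-base
    quality scores of the same length.
    """
    if len(sequence) != len(qualities):
        raise ValueError("sequence and qualities must have the same length")
    if len(sequence) < MIN_LENGTH:
        return False
    ambiguous = sum(1 for base in sequence.upper() if base not in "ACGT")
    if ambiguous > MAX_AMBIGUOUS:
        return False
    # Slide the window one base at a time, updating the sum incrementally.
    threshold = MIN_WINDOW_QUALITY * WINDOW
    window_sum = sum(qualities[:WINDOW])
    if window_sum < threshold:
        return False
    for i in range(WINDOW, len(qualities)):
        window_sum += qualities[i] - qualities[i - WINDOW]
        if window_sum < threshold:
            return False
    return True
```

Comparing the window sum against `MIN_WINDOW_QUALITY * WINDOW` avoids a division per window while testing the same condition as comparing the window mean against 25.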

Statistical Analysis
The data analysis was performed in R version 2.15.1 (R Development Core Team, 2012) using in-house scripts. Microbial richness, community diversity (Shannon diversity) and the proportion of variation explained by different parameters (MANOVA) were calculated using functions from the vegan package. Microbiota compositional analysis was conducted with MARE functions at the family taxonomic level [42]. Principal coordinate analysis (PCoA) based on Bray-Curtis dissimilarities was used to visualize the dissimilarities in the microbial community.
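The diversity and dissimilarity measures named above can be written out in a few lines. The following Python sketch is for illustration only and is not the vegan or MARE code used in the study; it assumes the natural logarithm for the Shannon index (the default in vegan), and the function names and the count vectors in the usage note are hypothetical.

```python
import math


def richness(counts):
    """Number of taxa observed (count > 0) in one sample."""
    return sum(1 for c in counts if c > 0)


def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must cover the same taxa")
    denominator = sum(sample_a) + sum(sample_b)
    if denominator <= 0:
        raise ValueError("both samples are empty")
    numerator = sum(abs(a - b) for a, b in zip(sample_a, sample_b))
    return numerator / denominator
```

As a usage example, two samples with no taxa in common, such as `[10, 0]` and `[0, 10]`, have a Bray-Curtis dissimilarity of 1, whereas identical samples have a dissimilarity of 0. A matrix of such pairwise values is the input that PCoA projects onto its axes.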
The statistical difference in family-level taxa abundances between UC and non-IBD, as well as between active UC, inactive UC and non-IBD, was tested with generalized linear mixed models using functions from the MARE R package. The read number for each sample was used as an offset to account for the varying sequencing depth, and biopsy location, sequencing batch and subjects' age were used as confounding factors. The obtained p-values were corrected for multiple testing with the false discovery rate (FDR) approach. Taxa with p-values below 0.01 and FDR-adjusted p-values (q-values) below 0.2 were considered significant. Associations between gene expression and bacterial abundances were estimated with generalized linear mixed models using functions from the MARE R package. The read number for each sample was used as an offset, and the biopsy location and subjects' age were used as confounding factors. FDR-corrected p-values below 0.05 were considered significant. The differences in phylum-level abundances and gene expression levels between groups were analyzed with ANOVA with Benjamini-Hochberg (BH) adjustment. The correlations between bacterial groups and gene expression were estimated with the Spearman coefficient, followed by FDR correction (BH) of the p-values.
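The multiple-testing correction mentioned above can be sketched in Python. This is a minimal illustration of the Benjamini-Hochberg procedure and of the stated significance rule for taxa (p < 0.01 and q < 0.2), not the MARE implementation; the function names are hypothetical, and the p-values in the usage note are invented for illustration only.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values (q-values) in the original input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of q-values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def significant_taxa(p_values, p_cutoff=0.01, q_cutoff=0.2):
    """Flag tests that pass both the raw p-value and the q-value cutoffs."""
    q_values = benjamini_hochberg(p_values)
    return [p < p_cutoff and q < q_cutoff for p, q in zip(p_values, q_values)]
```

For the invented p-values `[0.01, 0.04, 0.03, 0.005]`, the q-values are approximately `[0.02, 0.04, 0.04, 0.02]`, and only the last test passes the combined rule because 0.01 is not strictly below the p-value cutoff.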

Ethical Considerations
The study was accepted by the ethical committee of the Hospital District of Southwest Finland (Journal number 16.11.2004 § 344). Written informed consent was obtained from all of the study patients or their parents.

Acknowledgments:
The authors would like to thank all patients and their parents who participated in this study. Riikka Lankinen is acknowledged for excellent laboratory assistance. Lars Paulin and the Institute of Biotechnology, University of Helsinki are thanked for high quality sequencing services. Open access funding provided by University of Helsinki.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.