Inflammatory bowel disease (IBD) is a common chronic inflammatory gastrointestinal disorder whose major subtypes are Crohn’s disease (CD) and ulcerative colitis (UC) [1]. Ulcerative colitis is characterized by inflammation limited to the colon, while Crohn’s disease can involve any part of the gastrointestinal tract, most commonly the terminal ileum or the perianal region. The general belief is that these diseases develop due to a continuous and inappropriate inflammatory response to intestinal microbes and foreign antigens in genetically susceptible individuals [2]. However, the exact mechanisms by which they are triggered remain mostly unknown. Although anti-TNF agents have been widely used for the treatment of IBD, new treatment options are being actively explored, mainly due to the loss of response to anti-TNF therapy observed in around 30% of IBD patients [5].
Meta-analyses of multiple GWAS have implicated more than 160 genetic loci in IBD susceptibility, some common to both subtypes and some specific to one of them [6]. Functional analyses of the associated genes have revealed multiple pathophysiological mechanisms related to these variants. For instance, the CARD15/NOD2 region has been related to impaired antimicrobial defense, and the TNFSF15 gene has been implicated in the induction of the proinflammatory environment that is a characteristic feature of the disease [7]. Nevertheless, although some of the risk regions have been functionally characterized, the implication of the majority of these loci in the development of the disease is yet to be determined. In trying to elucidate the mechanisms involved in IBD development, some studies have shown that epigenetic changes, such as DNA methylation, could also be involved in susceptibility to this disease [8].
N6-methyladenosine (m6A) modification is the most abundant internal chemical modification in mRNAs. This dynamic process is implicated in multiple aspects of RNA metabolism, representing a novel component of genetic regulation termed epitranscriptomics. It is now known that m6A marks regulate the stability and structure of RNA molecules, which can, in turn, regulate gene expression [9]. SNPs located within or near m6A motifs (m6A-SNPs) have been suggested as potential contributors to disease pathogenesis, and several databases of genetic variants related to m6A modifications have been developed in the last few years (e.g., m6Avar and m6ASNP [11]).
Proteins that regulate m6A RNA modifications, such as METTL3, WTAP or FTO, have been implicated in the development of different diseases due to their ability to alter m6A RNA modification levels [13]. Indeed, it has been speculated that m6A RNA methylation can be involved in immune tolerance by regulating the activity of immune pathways and by controlling the development of T lymphocytes [15]. In addition, m6A-related changes have also been associated with alterations in gut microbiota and with the development of gastrointestinal cancers [17]. However, the implication of disease-associated SNPs in mRNA methylation and its relationship with IBD pathogenesis remain unexplored.
In this study, we systematically scrutinized publicly available GWAS, m6A and transcriptome data to find IBD-associated genes putatively regulated by m6A-related mechanisms. Moreover, we observed that IFNγ, which is a distinctive inflammatory feature of IBD, induces m6A methylation in intestinal cells. Further studies are needed to clarify the real impact of disease-associated variants on m6A RNA modifications and their implication in IBD pathogenesis.
m6A RNA modifications are gaining interest due to their implication in a wide range of biological processes. Recent studies have described the involvement of these modifications in immune and inflammatory responses [26], but their implication in the development of inflammatory bowel diseases has not been evaluated so far. As the functional implication of the vast majority of the IBD-associated SNPs is still unknown, we took advantage of publicly available GWAS, m6A and transcriptome data to select IBD candidate genes that could be regulated by m6A-related mechanisms. Additionally, we performed in vitro experiments in intestinal cell lines to determine the potential implication of m6A in IBD pathogenesis.
Although different databases have been developed to find SNPs that could affect m6A methylation levels [11], the different methods used to describe m6A motifs and the huge amount of available GWAS data make it difficult to obtain unified results. Since m6A marks are described to be tissue specific, we decided to select m6A peaks from two different epithelial cell types, as epithelial cells are the main cells of the IBD target tissue. Methylated sites were intersected with IBD-associated SNPs, and genes harboring m6A-SNPs that were differentially expressed in Crohn’s disease or ulcerative colitis were selected for further analysis. The list of m6A-SNPs obtained with this approach did not overlap with the output of any of the available databases, highlighting the importance of the data origin in this type of analysis. Although selecting differentially expressed genes reduced the list of candidate genes to a more manageable number, it is worth mentioning that we found more than 50 m6A-SNPs that were not located in differentially expressed genes but that could also be involved in disease pathogenesis through mechanisms independent of gene expression.
We analyzed whether the m6A-SNPs directly influenced gene expression using an eQTL approach, but the expression of only two of the five candidates was affected by the associated SNP genotype. Moreover, UBE2L3 was found to be overexpressed in CD while the risk allele was associated with lower expression of this gene, confirming that the relationship between m6A-SNPs and gene expression is not straightforward. It is worth mentioning that IBD treatments, such as anti-TNF drugs, could also influence mRNA methylation levels and the downstream expression of the candidate genes. Thus, eQTL analyses in samples from treated patients would give further information about the involvement of m6A-SNPs in the regulation of candidate gene expression. Additional analysis of the potential effect of disease-associated m6A-SNPs in the candidate genes revealed that these variants lie within RNA-binding protein binding sites, so the interaction of the mRNAs with their target proteins might be influenced by genotype, and their regulatory effect would therefore be allele specific. Indeed, we observed that specific genotypes at disease-associated variants led to secondary structure changes in UBE2L3 and SNAPC4 mRNA molecules. Hence, SNPs could indirectly affect gene expression via more complex mechanisms that an eQTL analysis would probably miss. These results underline the importance of considering different layers of gene expression regulation when analyzing the potential functional effects of disease-associated SNPs.
We used the TREW database to evaluate the m6A machinery proteins that have been found to bind our candidate genes. We observed that the YTHDF1 reader had the ability to interact with all candidate genes. Interestingly, YTHDF1 is implicated in the development of colorectal carcinomas, and its expression has been shown to be significantly augmented in this type of cancer [28]. Although the only changes in YTHDF1 gene expression that we observed point to a downregulation of this gene in UC patients, the induction of this protein together with the augmented overall m6A methylation found in response to IFNγ suggests that a continuous inflammatory environment could lead to an increase in m6A and m6A regulatory proteins. Indeed, we observed that WTAP writer expression is upregulated in UC patients. Increased WTAP has also been found in gastrointestinal tumors, and its expression has been related to T lymphocyte infiltration [29]. These results suggest that increased m6A methylation and the induction of m6A machinery proteins could predispose individuals with IBD to develop gastrointestinal malignancies. Interaction analysis also showed that SLC22A4 can bind different m6A writers, even if these two genes had contrary expression trends in the disease condition. The different locations of the m6A-SNPs in these two genes (intron vs. gene body) illustrate the diverse effects of the m6A motifs, depending on their location. Co-expression analysis of m6A machinery proteins with candidate genes did not give the same information as the TREW interaction data, stressing, once again, the complexity of m6A-mediated gene interactions.
The induction of METTL3, METTL14 after IFNγ stimulation confirmed the implication of the m6A pathway in the inflammatory response in intestinal cells. The pathophysiological role of IFNγ in IBD has been mainly attributed to its effects on the intestinal epithelium [25]. m6A RNA methylation has been shown to play important roles in the interferon response to viral infections [16] but, to our knowledge, this is the first study showing an increase in the overall levels of m6A in the presence of IFNγ in intestinal cells. Although the overexpression of METTL3 and the silencing of YTHDF1 did not give very striking results, we observed certain alterations in some downstream candidate genes. As previously mentioned, m6A-regulated pathways are generally complex, so further studies on the localization or stability of the mRNA, or even on the protein expression of the candidates, will provide more specific information on the influence that m6A exerts on these genes.
To summarize, our study shows that combining different datasets can be very helpful for describing novel candidate genes involved in disease pathogenesis and pinpoints m6A as a mechanism that could help to understand the etiology of complex diseases. Moreover, our data suggest that the m6A pathway may play a role in the development of IBD and, as previously proposed for DNA methylation [30], our work opens a new window to the development of novel therapeutic approaches based on the regulation of mRNA methylation.
4. Materials and Methods
4.1. Selection of Candidate Genes
SNPs associated with IBD, CD or UC and located within m6A peaks (m6A-SNPs) were selected. For this purpose, International IBD Genetics Consortium GWAS results [19] were retrieved, and variants with a p-value < 1 × 10⁻⁸ and an INFO score > 0.8 were kept. In parallel, m6A peaks found in the HeLa and HepG2 epithelial cell lines, available in MeT-DBv2 [31], were used. Overlapping positions of disease-associated SNPs and m6A peaks were calculated using the intersect method implemented in Bedtools (v2.29.2) [23]. The disease-associated SNPs located within m6A peaks present in both cell lines were considered hits of interest and were selected for further analyses.
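The selection logic described above can be sketched in a few lines of Python. This is only an illustration of the filtering and overlap steps, not the Bedtools implementation used in the study: the record layout, the function names and the example identifiers are hypothetical, and peaks are assumed to be BED-style half-open intervals with 0-based SNP positions.

```python
from collections import defaultdict

def filter_snps(snps, p_max=1e-8, info_min=0.8):
    """Keep SNPs that pass the p-value and INFO thresholds quoted in the text."""
    return [s for s in snps if s["p"] < p_max and s["info"] > info_min]

def index_peaks(peaks):
    """Group (chrom, start, end) peaks by chromosome for lookup."""
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))
    return by_chrom

def in_peak(snp, peak_index):
    """True if the SNP position falls in any peak (assumed half-open: start <= pos < end)."""
    return any(start <= snp["pos"] < end
               for start, end in peak_index.get(snp["chrom"], []))

def m6a_snps(snps, hela_peaks, hepg2_peaks):
    """Return IDs of filtered SNPs that lie within a peak in BOTH cell lines."""
    hela, hepg2 = index_peaks(hela_peaks), index_peaks(hepg2_peaks)
    return [s["id"] for s in filter_snps(snps)
            if in_peak(s, hela) and in_peak(s, hepg2)]

# Hypothetical toy data, for illustration only.
example_snps = [
    {"id": "snp_a", "chrom": "chr1", "pos": 150, "p": 1e-10, "info": 0.95},
    {"id": "snp_b", "chrom": "chr1", "pos": 150, "p": 1e-5, "info": 0.95},   # fails p-value
    {"id": "snp_c", "chrom": "chr2", "pos": 500, "p": 1e-12, "info": 0.90},  # HeLa peak only
    {"id": "snp_d", "chrom": "chr1", "pos": 160, "p": 1e-9, "info": 0.50},   # fails INFO
]
example_hela = [("chr1", 100, 200), ("chr2", 450, 550)]
example_hepg2 = [("chr1", 120, 180)]
print(m6a_snps(example_snps, example_hela, example_hepg2))
```

A linear scan over the peaks is sufficient for a sketch; for genome-scale data an interval tree or a sorted sweep, which is closer to what Bedtools does internally, would be preferable.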
Then, differential expression analyses of genes harboring m6A-SNPs were performed using expression data for both IBD subtypes available in the Gene Expression Omnibus (GEO). For Crohn’s disease, the available read counts of GSE85499 [32] were used; values were normalized with the RUVSeq [33] R package [34], using the 5000 least variable genes as reference genes, and differential gene expression analysis was carried out using the edgeR [24] R package. For ulcerative colitis, normalized values from GSE105074 [35] were used, and the Mann–Whitney test/GLM (Generalized Linear Model) were applied using the R language. Genes that were differentially expressed and harbored a disease subtype-associated m6A-SNP were selected for further studies.
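To make the rank-based comparison concrete, the following is a minimal Python sketch of a two-sided Mann–Whitney U test for one gene. It is not the R code used in the study: it assumes a normal approximation without tie or continuity correction, so p-values will differ slightly from those of R's `wilcox.test`, and the expression values are hypothetical.

```python
import math

def midranks(values):
    """Rank values 1..n, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Return (U, two-sided p) using a plain normal approximation (an assumption)."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u, p

# Hypothetical normalized expression values for one gene.
controls = [5.1, 5.4, 4.9, 5.0, 5.3]
patients = [6.2, 6.0, 5.9, 6.4, 6.1]
u_stat, p_value = mann_whitney_u(controls, patients)
print(u_stat, p_value)
```

For small groups an exact test is usually preferred over the normal approximation, and p-values across many genes would normally be corrected for multiple testing.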
4.2. Analysis of m6A-SNP Regulatory Capacity
eQTL analysis was carried out using the GTEx eQTL Dashboard tool [36]. Colon tissue was used as the target tissue, as it is affected in both CD and UC.
The POSTAR2 database [37] was used to search for RNA-binding proteins (RBPs) that may have their binding site at an m6A-SNP location. The “Variation” submodule in the “RNA” module was used to evaluate candidate genes and associated SNPs.
The RNAsnp Web Server of the Centre for non-coding RNA in Technology and Health (RTH) [38] was used to predict SNP effects on local RNA secondary structure. The sequences of the most common mRNA transcripts were extracted from the Ensembl Genome Browser [36]. The m6A-SNP position and the allele change were indicated in the input sequence, and a comparison based on global folding (mode 1) was selected.
4.3. Candidate Gene and m6A Machinery Protein Interaction Analysis
m6A machinery proteins described to interact with our candidate gene mRNAs were evaluated using the TREW database (Target of m6A Readers, Erasers and Writers database), available in metDB-V2.0. This database collects ParCLIP-seq and MeRIP-seq data from different studies for the m6A regulator and reader proteins, including FTO, KIAA1429, METTL14, METTL3, WTAP, HNRNPC, YTHDC1 and YTHDF1 [31]. The m6A-related proteins interacting with our targets in both HeLa and HepG2 epithelial cell lines were selected.
4.4. Candidate Gene and m6A Machinery Protein Co-expression Analysis
Whole genome expression data of controls and patients from the CD and UC datasets [32] were used for co-expression analyses between candidate genes and all the m6A machinery proteins that were found to interact with the candidates. Pearson correlation coefficients and p values were calculated using GraphPad. Correlation was considered significant when p
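As an illustration of the co-expression measure, the Pearson coefficient and its associated t statistic can be computed as in the Python sketch below. This is not the GraphPad procedure itself; the function names and expression values are hypothetical, and converting the t statistic to a p value requires a Student's t distribution with n − 2 degrees of freedom, which the standard library does not provide.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing r = 0; compare against Student's t with n - 2 df."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1 - r * r))

# Hypothetical expression values of a candidate gene and an m6A machinery gene.
candidate = [2.1, 2.8, 3.0, 3.9, 4.4, 5.2]
machinery = [1.0, 1.3, 1.2, 1.9, 2.1, 2.6]
r = pearson_r(candidate, machinery)
print(r, t_statistic(r, len(candidate)))
```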
4.5. Cell Lines and Stimulations
The intestinal HCT116 cell line (#91091005) was purchased from Sigma-Aldrich (Poole, UK). Cells were cultured in DMEM (Lonza, Basel, Switzerland, #12-604F) supplemented with 10% FBS (Millipore, Burlington, MA, USA, #S0115), 100 units/mL penicillin and 100 μg/mL streptomycin (Lonza, #17-602E). For IFNγ stimulation, HCT116 cells were treated with IFNγ (R&D Systems, Minneapolis, MN, USA, #285-IF-100/CF) at a final concentration of 200 U/mL for 4 h.
4.6. Overexpression and Silencing
For METTL3 overexpression, 250 ng of plasmid from Addgene (#53739) was transfected using X-tremeGENE HP DNA transfection reagent (Sigma-Aldrich, #6366546001). Cells were harvested 48 h post-transfection.
For YTHDF1 silencing, 30 nM of two different siRNAs against YTHDF1 (IDT, #hs.Ri.YTHDF1.13.1 and hs.Ri.YTHDF1.13.2) or negative control siRNA (IDT, #51-01-14-01) were transfected using Lipofectamine RNAiMAX reagent (Thermo Fisher Scientific, Waltham, MA, USA). Cells were harvested 48 h post-transfection.
4.7. RNA and Protein Extractions
RNA was extracted using the NucleoSpin RNA Kit (Macherey-Nagel, Düren, Germany, #740984.50), and proteins were extracted in RIPA buffer (150 mM NaCl, 1.0% NP-40, 0.5% sodium deoxycholate, 0.1% SDS, 50 mM Tris-HCl, 1 mM EDTA).
4.8. Dot Blot
A total of 300 ng of RNA was heated for 3 min and rapidly placed on ice. RNA was then UV-crosslinked onto a nitrocellulose membrane. The membrane was blocked using 5% non-fat milk in 0.1% PBST (0.1% Tween in PBS) and was incubated overnight at 4 °C with an m6A antibody (1:200) (Abcam, Cambridge, UK, #ab151230). After washing in 0.1% PBST, the membrane was incubated with an HRP-conjugated anti-rabbit secondary antibody (1:10,000) (Santa Cruz Biotechnology, Dallas, TX, USA, #sc-2357) and finally developed using Clarity Max ECL Substrate (BioRad, Hercules, CA, USA, #1705062).
4.9. Gene Expression Analysis
For the reverse transcription reaction, 500–1000 ng of RNA was used with the iScript cDNA Synthesis Kit (BioRad, #1708890). qPCR was performed using iTaq SYBR Green Supermix (BioRad, #1725124). Reactions were run in a BioRad CFX384, and melting curves were analyzed to ensure the amplification of a single product. All qPCR measurements were performed in duplicate, and expression levels were analyzed using the 2^(−ΔCt) method. Specific primer pairs used to determine expression levels are listed in Supplementary Table S2.
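The 2^(−ΔCt) calculation can be illustrated with a short Python sketch. The reference (housekeeping) gene is not named in this section, so the reference Ct values below, like the function name and all numbers, are hypothetical; duplicate wells are averaged before ΔCt is computed.

```python
from statistics import mean

def relative_expression(target_cts, reference_cts):
    """Return 2^(-dCt), where dCt = mean Ct(target) - mean Ct(reference)."""
    delta_ct = mean(target_cts) - mean(reference_cts)
    return 2 ** (-delta_ct)

# Hypothetical duplicate Ct values for one target gene and one reference gene.
untreated = relative_expression([24.0, 24.0], [20.0, 20.0])  # dCt = 4
treated = relative_expression([23.0, 23.0], [20.0, 20.0])    # dCt = 3
print(untreated, treated, treated / untreated)
```

Dividing the treated value by the untreated value gives the fold change between conditions, which is algebraically the same as 2^(−ΔΔCt).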
4.10. Western Blot
Laemmli buffer 6X (62 mM Tris-HCl, 100 mM dithiothreitol (DTT), 10% glycerol, 2% SDS, 0.2 mg/mL bromophenol blue, 2% 2-mercaptoethanol) was added to protein extracts, which were denatured at 95 °C for 10 min. Proteins were resolved on 10% SDS-PAGE gels. Following electrophoresis, proteins were transferred onto nitrocellulose membranes using a Transblot-Turbo Transfer System (BioRad) and blocked in 5% non-fat milk diluted in TBST (20 mM Tris, 150 mM NaCl and 0.1% Tween 20) at room temperature for 1 h. The membranes were incubated overnight at 4 °C with primary antibodies for METTL3, YTHDF1 and GAPDH at a 1:1000 dilution. Immunoreactive bands were revealed using the Femto ECL Substrate after incubation with a horseradish peroxidase-conjugated anti-rabbit or anti-mouse secondary antibody (1:10,000 dilution in 2.5% non-fat milk) for 1 h at room temperature. The immunoreactive bands were detected using a Bio-Rad Molecular Imager ChemiDoc XRS (BioRad).